An Empirical Analysis of Default Risk for Listed Companies in India: A Comparison of Two Prediction Models

The objective of this study is to examine the performance of two default prediction models: the Z-score model using discriminant analysis, and the logit model on a dataset of 60 defaulted and 60 solvent companies. Financial ratios obtained from corporate balance sheets are used as independent variables while solvent/defaulted company (ratings assigned) is the dependent variable. Furthermore, for logistic regressions, an attempt is made to combine macro variables and dummy industry variables along with accounting ratios. The predictive ability of the proposed Z score model is higher when compared to both the Altman original Z-score model and the Altman model for emerging markets. The research findings establish the superiority of logit model over discriminant analysis and demonstrate the significance of accounting ratios in predicting default.


I. Introduction
Every business entity undertakes a variety of operational activities to carry on business. By necessity, the outcome of at least some of these activities may be unpredictable. This introduces an element of risk for every organization. Among the different risks that an organization faces, credit risk is perhaps one of the oldest financial risks, though until recently there were few instruments to manage and hedge it. Earlier, the focus had been primarily on market risk, and the bulk of the academic research was devoted to that risk. However, there has been an upsurge in research on credit risk, with increasing emphasis being given to its modeling and evaluation.
Credit risk pervades virtually all financial transactions and covers a wide array of events, from agency downgrades to failure to service debt and liquidation. With innovations in financial instruments and risk management techniques, and with the global meltdown, credit risk has assumed prominence. At the centre of credit risk is the risk of default, that is, failure on the part of a corporate to service its debt obligation. In emerging markets like India, credit rating agencies (CRAs) have been the predominant source for assessing the credit quality of borrowers. Since upgrades and downgrades of ratings can impact the price of traded debt and equity, market participants are interested in developing good forecasting models. With the implementation of Basel II norms globally, banks are increasingly developing their own internal ratings-based models and internal scores. However, a credit rating or a credit score is not the same as directly estimating the probability of default. Thus, academicians and practitioners across the world advocate improvements in the methodology and applications of credit ratings and also propose several credit risk models to predict default on debt obligations.
With the movement of financial markets towards a more quantitative methodology and the constantly growing number of credit instruments, there is an increasing need for quantitative models to help analyze and mitigate this risk. Despite the plethora of mathematical models available, there has been little effort, specifically in an emerging market economy such as India, to develop a default prediction model. A default prediction model that quantifies credit risk by predicting the probability that a corporate defaults on its financial obligation can therefore be especially useful to lenders. Traditionally, the credit risk literature has taken two approaches to measure default on debt. One is the structural approach, which is based on market variables, and the second is the statistical or reduced-form approach, which factors in information from the financial statements.
Thus, this paper attempts to evaluate the predictive ability of two default prediction models for listed companies in India: a Z-score model using discriminant analysis and logistic regressions. Discriminant analysis and Logistic regressions are used for two reasons. Firstly, there is prior empirical evidence of the two models being used to forewarn against defaults in the developed countries. Secondly, through this study, we can judge to what extent accounting-based models can predict default risk from information available in the public domain. By using Z score, banks and financial institutions can assess the solvency status for companies while logistic regressions can directly predict the probability of default.
The remainder of this paper is organized as follows. Section 2 presents the literature review on accounting based credit risk models while section 3 presents the research design and describes the methodology. Section 4 presents the empirical findings and section 5 gives the conclusion.

Review of Literature
Research studies relevant to the present work are reviewed here under one broad category: studies on accounting models. Accounting-based models are developed from information contained in the financial statements of a company. The first set of accounting models was developed by Beaver (1966, 1968) and Altman (1968) to assess the distress risk for a corporate. Beaver (1966) applied a univariate statistical analysis for the prediction of corporate failure. Altman (1968) developed the z-score model using financial ratios to separate defaulting and surviving firms. Subsequent z-score models were developed by Altman et al. (1977), called ZETA, and Altman et al. (1995) in the context of corporations in emerging markets. Altman and Narayanan (1997) conducted studies in 22 countries, where the major conclusion was that models based on accounting ratios (MDA, logistic regression, and probit models) can effectively predict default risk.
Since the 1980s, logistic regression analysis has gradually replaced the traditional discriminant analysis. Ohlson (1980), in his O-Score model, selected nine ratios or terms that he considered useful in predicting bankruptcy. Earlier, Martin (1977) had applied a logistic regression model to a sample of 23 bankrupt banks during the period 1975-76. Other accounting-based models were developed by Taffler (1983, 1984) and Zmijewski (1984). Bhatia (1988) and Sahoo et al. (1996) applied the multiple discriminant analysis technique to a sample of sick and non-sick companies using accounting ratios. Several other studies used financial statement analysis for predicting default. Opler and Titman (1994) and Asquith et al. (1994) identified default risk to be a function of firm-specific idiosyncratic factors. Lennox (1999) concluded, from a sample of 90 bankrupt firms, that profitability, leverage, and cash flow all have a bearing on the probability of bankruptcy. Further studies were done by Shumway (2001), Altman (2002) and Wang (2004), and all these studies emphasized the significance of financial ratios for predicting corporate failure. Grunert et al. (2005), however, found empirical evidence that the combined use of financial and non-financial factors can provide greater accuracy in default prediction than a single factor. Jaydev (2006) emphasized the role of financial risk factors in predicting default, while Bandyopadhyay (2006) compared three z-score models. Bandyopadhyay (2007) developed a hybrid logistic model to predict corporate default, based on inputs obtained from the Black-Scholes-Merton (BSM) equity-based option model described in Part 1 of his paper. Agarwal and Taffler (2007) emphasized the predictive ability of Taffler's z-score model in the assessment of distress risk over a 25-year period.
Baninoe (2010) evaluated two types of bankruptcy models, a logistic model and an option pricing method, and concluded that distressed stocks generated high returns. Laitinen (2010) assessed the importance of interaction effects in predicting payment defaults using two different types of logistic regression models. Kumar and Kumar (2012) conducted an empirical analysis of three types of bankruptcy models for Texmo industry, (i) the Altman z-score, (ii) Ohlson's model, and (iii) Zmijewski's model, to predict the probability that a firm will go bankrupt in two years.
It is observed from the literature review above that several models have been developed based on accounting information (MDA, logit, probit). However, MDA which is applied to develop a z-score does not directly compute probabilities. Moreover, the model to be developed and the ratios may vary across regions. Thus, this paper examines the MDA to develop a Z-score and also logistic regressions to evaluate which is a better model in its predictive ability that can be used by lenders to forewarn against a corporate default.

Research Design
As the objective of the research is to develop a default prediction model, secondary data has been used to carry out the analysis. The relevant secondary data on the financial statements of the companies has been collected primarily from the Prowess database of the Centre for Monitoring Indian Economy (CMIE). A dataset of 120 companies is taken from the CRISIL database as the estimation sample, which consists of 60 companies rated "D" by CRISIL (defaulted) and 60 companies rated "AAA" and "AA" (indicating highest safety, thus "solvent"). The solvent companies are chosen on a stratified random basis to match the defaulted list. Table 1 provides the industry classification and the number of companies in each industry.
There are four major components in the analysis.
(i) The first component involves running discriminant analysis on the 120 companies in the estimation sample. Here the dependent variable is the status of the company, with solvent companies coded as "0" and defaulted companies coded as "1", and the financial ratios are taken as the independent variables. Three models are evaluated for their predictive ability using discriminant analysis. The first model is based on the five ratios included in the original Altman model. The second model is based on the ratios taken from the Altman model for emerging markets. The third model is developed in this study based on the ratios identified by the researcher as significant predictors.
(ii) Next, reduced form models are developed using logistic regression for all the companies in the dataset that are a part of the research study. Since binary logit regression requires a distinct classification, the firms have been coded as 1 if the firm defaults and 0 otherwise.
(iii) In the second stage of analysis for logistic regressions, dummy variables for industry type and macro-economic variables are added as predictors along with accounting ratios, and the model is tested for its predictive ability.
(iv) In order to validate the models, a holdout sample of 36 companies (18 defaulted and 18 solvent) is taken and tested one year forward, i.e. the financial information is taken one year prior to the ratings assigned and the predictions are tested for accuracy.

Scope of the Study
The scope of this study covers listed companies in India. All the companies from the financial services sector have been removed from the database. The rationale for removing the companies in the financial services sector is that their financial statements broadly differ from those of nonfinancial firms. For ratings the focus of the research is on long-term debt instruments and structured finance ratings and short-term ratings.

Selection of Variables
Since the focus of the present study is to measure default risk, it is imperative to choose a set of financial ratios that are relevant to the default risk of the company. In assessing creditworthiness, both business risks and financial risks have been factored in. The ratios chosen are those that: (i) have been theoretically identified as indicators for measuring default; (ii) have been used in predicting insolvency in earlier empirical work; and (iii) can be calculated conveniently from the databases used by the researcher. In all, 24 accounting ratios were identified as predictors of default risk, spread across four categories: liquidity, profitability, solvency, and productivity (activity) ratios. The Altman ratios are also factored in as predictors (Gupta et al., 2013).
International Journal of Business and Management, Vol. 9, No. 9; 2014
The four categories of ratios are as follows:

1). Profitability ratios:
High profitability margins reflect the company's ability to grow and also indirectly indicate the ability of the company to generate cash and thereby service its debt obligations. The ratios included under this classification are: (i) Profit after Tax/Capital Employed (PAT/CE); (ii) Profit after Tax/Sales (PAT/Sales); (iii) Profit before interest and tax/Sales (PBIT/Sales); (iv) Profits before depreciation, interest, tax and amortization/Total Income (PBDITA/TI).

2). Liquidity ratios:
The liquidity position of a company reflects its readily available cash and the assets that can be liquidated. Since the purpose of identifying ratios is to determine which ones impact the creditworthiness of a company, liquidity plays a very important role, as cash resources are necessary to service debt obligations. The liquidity ratios taken for this study as independent variables to measure default risk are: (i) Cash profits/Total Assets; (ii) Current ratio (CR); (iii) Quick ratio (QR); (iv) Cash flow from operations/Debt (CFO/Debt); (v) Cash/Current Liabilities (Cash/CL); (vi) Net working capital/Sales (NWC/Sales).

3). Solvency ratios:
These ratios assess the ability of a company to meet long-term debt obligations. These ratios are: (i) Interest coverage (INTCOV); (ii) Debt/Equity (D/E).

4). Productivity ratios:
Activity ratios measure the efficiency with which a company can utilize its resources. These ratios are: (i) Cash/Cost of sales (Cash/COS); (ii) Net working capital cycle (NWC cycle); (iii) Debtor days; (iv) Creditor days; (v) Raw material cycle (RM cycle); (vi) Work in progress cycle (WIP cycle); (vii) Finished goods cycle (FG cycle).

5). Altman Ratios:
The Altman z-score model is the pioneering work in predicting bankruptcy and distressed firms, and thus the original five ratios which constitute the Altman Z-score model are also included. These are: (i) Net working capital/Total Assets (NWC/TA); (ii) Retained Earnings/Total Assets (RE/TA); (iii) Profit before interest and tax/Total Assets (PBIT/TA); (iv) Sales/Total Assets (Sales/TA); (v) Market value of equity/Book value of debt (MVE/BVD).
Summary statistics on these variables are presented in Table 3. It is observed that the means of the explanatory variables show a poorer performance for the defaulted group than for the solvent group. The mean of the profitability ratios is negative for defaulted firms, whereas solvent firms show a higher average margin. Among the solvency ratios, the Debt/Equity ratio is less than 1 for solvent firms, indicating low leverage, whereas for defaulted firms the average is significantly higher than 1. The interest coverage ratio is negative for defaulted companies and greater than 1 for solvent companies.
The canonical correlation is the most useful measure in the table, and it indicates the degree of association between the dependent variable and the explanatory variables.
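For reference, the five Altman ratios listed in this section enter the original Altman (1968) model as a weighted sum. The sketch below is only illustrative: it assumes the coefficients and cut-off values published by Altman (1968), with ratios expressed as decimals, whereas the coefficients re-estimated on the Indian sample in this study are the ones reported in Table 4. The function names and the example inputs are hypothetical.

```python
def altman_z_1968(nwc_ta, re_ta, pbit_ta, mve_bvd, sales_ta):
    """Original Altman (1968) Z-score, ratios given as decimals (0.10 for 10%).

    Assumption: published 1968 coefficients, not those estimated in this study.
    The Sales/TA weight is 0.999 in the original and is commonly rounded to 1.0.
    """
    return (1.2 * nwc_ta + 1.4 * re_ta + 3.3 * pbit_ta
            + 0.6 * mve_bvd + 1.0 * sales_ta)


def altman_zone(z):
    """Classify a Z-score using Altman's published cut-offs (2.99 and 1.81)."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"


# Hypothetical firms, for illustration only.
z_strong = altman_z_1968(nwc_ta=0.20, re_ta=0.30, pbit_ta=0.15,
                         mve_bvd=1.50, sales_ta=1.10)
z_weak = altman_z_1968(nwc_ta=-0.10, re_ta=-0.20, pbit_ta=-0.05,
                       mve_bvd=0.20, sales_ta=0.60)
print(round(z_strong, 3), altman_zone(z_strong))
print(round(z_weak, 3), altman_zone(z_weak))
```

The discriminant score only ranks firms relative to the cut-offs; it does not by itself give a probability of default, which is one motivation for the logit model.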

Logistic Regression
Logistic regression is used for predicting the outcome of a categorical criterion variable (a variable that can take on a limited number of categories) based on one or more predictor variables. Logistic regression can be binomial or multinomial. In the case of binary logistic regression, the observed outcome can have only two possible types, 0 and 1. The outcome is coded as "0" and "1" in binary logistic regression as this leads to the most straightforward interpretation. The merits of the logistic regression model are that it is flexible and simple, and that the predicted probability of an outcome can be directly computed. Moreover, it does not assume a linear relationship between the dependent and the independent variables, the dependent variable need not be normally distributed, and there is no homogeneity of variance assumption.
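The direct computation of a predicted probability can be illustrated with the logistic function. The following is a minimal sketch, assuming hypothetical coefficient values (they are not estimates from this study) for two of the ratios used here, D/E and PAT/Sales, and a conventional 0.5 cut-off for classification.

```python
import math


def default_probability(intercept, coefs, ratios):
    """Probability of default from a fitted binary logit model.

    logit = intercept + sum(b_i * x_i);  P(default) = 1 / (1 + exp(-logit)).
    """
    logit = intercept + sum(b * x for b, x in zip(coefs, ratios))
    return 1.0 / (1.0 + math.exp(-logit))


def classify(probability, cutoff=0.5):
    """Code the firm as 1 (defaulted) or 0 (solvent) at the given cut-off."""
    return 1 if probability >= cutoff else 0


# Hypothetical coefficients for [D/E, PAT/Sales]; illustration only.
coefs = [0.8, -5.0]
firm = [2.5, -0.10]  # high leverage, negative profit margin
p = default_probability(intercept=-1.0, coefs=coefs, ratios=firm)
print(round(p, 4), classify(p))
```

With these assumed values the logit is 1.5, so the predicted probability of default is about 0.82 and the firm is coded as 1.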
In the present study, the second stage of analysis applies logistic regression. The joint significance of the regression coefficients (the model as a whole) is tested by the Omnibus Chi-square statistic, which refers to goodness of fit. The likelihood ratio measures the improvement in fit that the explanatory variables make compared to the null model. The fit of the logit model is further tested by the Hosmer and Lemeshow Chi-square test. The Pseudo R-square measures (Cox and Snell, and Nagelkerke R square) indicate whether the association between the dependent variable and the explanatory variables is strong. The classification accuracy is measured by the accuracy ratio given in the Classification Table. This table reflects the sensitivity of prediction.
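The likelihood ratio Chi-square and the two pseudo R-square measures named above can all be derived from the log-likelihoods of the null (constant-only) model and the fitted model. The sketch below uses the standard textbook definitions; the null log-likelihood follows from a balanced sample of 60 defaulted and 60 solvent firms, while the fitted log-likelihood of -20.0 is a hypothetical value chosen only for illustration.

```python
import math


def fit_statistics(ll_null, ll_model, n):
    """Return (LR chi-square, Cox & Snell R-square, Nagelkerke R-square)."""
    lr_chi2 = -2.0 * (ll_null - ll_model)           # improvement over null model
    cox_snell = 1.0 - math.exp(-lr_chi2 / n)        # 1 - (L0 / L1)^(2/n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    nagelkerke = cox_snell / max_cox_snell          # rescaled to a 0-1 range
    return lr_chi2, cox_snell, nagelkerke


n = 120
ll_null = n * math.log(0.5)   # constant-only model on a 60/60 sample
ll_model = -20.0              # hypothetical fitted log-likelihood
lr_chi2, cox_snell, nagelkerke = fit_statistics(ll_null, ll_model, n)
print(round(lr_chi2, 2), round(cox_snell, 3), round(nagelkerke, 3))
```

Because the Cox and Snell measure cannot reach 1 (its ceiling is 0.75 for a balanced sample), the Nagelkerke value is always the larger of the two.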
In the second stage of the logit analysis, industry group is added as a variable, with dummy variables taken for 9 different industries (Table 1). In recent years there has been a growing interest in exploring the predictive power of macroeconomic variables. Developing models based on both micro and macroeconomic aspects of a company and its environment is a logical future step in predicting financial distress. Thus, the following macro variables are also included: (a) GDP growth rate; (b) Manufacturing output (% change); (c) Industrial output (% change); and (d) Rate of inflation.
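How the three groups of predictors combine into one row of the regression data can be sketched as follows. The industry labels are hypothetical placeholders for the 9 groups in Table 1, the numeric values are invented, and dropping one industry as the reference category is an assumption made here to avoid perfect collinearity with the constant; the study does not state which coding was used.

```python
# Hypothetical labels standing in for the 9 industry groups of Table 1.
INDUSTRIES = [f"industry_{i}" for i in range(1, 10)]


def industry_dummies(industry, categories=INDUSTRIES, reference="industry_1"):
    """0/1 dummies for every category except the reference category."""
    if industry not in categories:
        raise ValueError(f"unknown industry: {industry}")
    return [1 if industry == c else 0 for c in categories if c != reference]


def build_predictors(ratios, industry, macro):
    """One row of predictors: accounting ratios + industry dummies + macro variables."""
    return list(ratios) + industry_dummies(industry) + list(macro)


# Invented values: [D/E, PAT/Sales] and
# [GDP growth, manufacturing output change, industrial output change, inflation].
row = build_predictors([1.8, -0.05], "industry_3", [5.0, 2.1, 1.4, 9.5])
print(len(row), row)
```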

Model Validation
To validate both models, they are tested on a sample that has not been used for estimation. A sample of 36 companies is taken as the hold-out sample for FY2013 and tested using data from one year prior to the ratings being assigned. For any model, performance is validated by the extent of Type I and Type II errors, based on the classification accuracy for the hold-out sample. Type I accuracy is the accuracy with which the model identifies the failed firms as weak. Type II accuracy is the accuracy with which the model identifies the healthy firms as such.
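The two accuracy measures defined above can be computed directly from actual and predicted codes (1 for defaulted, 0 for solvent). This is a minimal sketch; the predictions below are hypothetical and only mirror the 18/18 composition of the hold-out sample, not the results of the study.

```python
def type_accuracies(actual, predicted):
    """Return (Type I accuracy, Type II accuracy, overall accuracy).

    Type I accuracy: share of defaulted firms (1) classified as defaulted.
    Type II accuracy: share of solvent firms (0) classified as solvent.
    """
    pairs = list(zip(actual, predicted))
    defaulted = [p for a, p in pairs if a == 1]
    solvent = [p for a, p in pairs if a == 0]
    type1 = sum(1 for p in defaulted if p == 1) / len(defaulted)
    type2 = sum(1 for p in solvent if p == 0) / len(solvent)
    overall = sum(1 for a, p in pairs if a == p) / len(pairs)
    return type1, type2, overall


# Hypothetical outcome: 17 of 18 defaulted and 16 of 18 solvent firms
# are classified correctly.
actual = [1] * 18 + [0] * 18
predicted = [1] * 17 + [0] * 1 + [0] * 16 + [1] * 2
type1, type2, overall = type_accuracies(actual, predicted)
print(round(type1, 3), round(type2, 3), round(overall, 3))
```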

Results of Discriminant Analysis
By running discriminant analysis, three reduced form equations are obtained, based on the original Altman model, the Altman model for emerging markets and the model proposed by the author; these are presented below in Table 4. For Model 1, the five ratios taken are those of Altman's original z-score model, which are provided in Table 2. The empirical findings give the coefficients of these variables estimated on the above data (Table 4). For Model 2, the four variables from Altman's Emerging Market Score Model (1995) are used. The Altman model for emerging markets dropped the ratio Sales/Total Assets and retained the remaining four ratios of the original model. Model 3 of Table V is the model proposed and tested in this research study. This model is based on a set of ratios which reflect profitability, liquidity and solvency. Since the scope of the study is the manufacturing sector, productivity ratios are also significant. In addition to these four categories, the original Altman ratios are included. It is observed from Table 4 below that, although the classification accuracy of Model 1 and Model 2 is high, the predictive ability of Model 3 is significantly higher than that of the other two models, for both types of firms. The classification accuracy of the proposed model is around 97% for all the firms put together.
The output of discriminant analysis is further analyzed for the three models. The F-test and Wilks' Lambda are used for conducting the analysis. It is observed from Tables 5-7 that the means of the ratios for solvent and defaulted companies differ. The profitability ratios are negative for the defaulted firms but positive for the solvent firms. A high value of the F-statistic means a greater chance that the null hypothesis of equal means is rejected.
A small Lambda denotes that, of the total variance of the variables, only a small proportion is accounted for by the within-group dispersions (Bandyopadhyay, 2006). The canonical correlation, which explains the association between the dependent variable and the independent variables, is moderate for the first two models, at 78.6% and 76.2% respectively, but is significantly higher for the third model at 92%. Thus, the third model is robust in predicting default.
Table 7 shows that Model 3 can correctly classify 94 percent of both defaulted and solvent firms, while Models 1 and 2 can classify up to 77% and 85% respectively. Therefore, Model 3 has the best predictive accuracy when tested on the hold-out sample.
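The link between the canonical correlations quoted above and Wilks' Lambda can be made explicit. With two groups there is a single discriminant function, and in that case the standard identity is Lambda = 1 - (canonical correlation)^2. The sketch below applies this identity to the three canonical correlations given in the text; the resulting Lambda values are implied by the identity and are an illustration, not the figures reported in the tables of this study.

```python
def wilks_lambda_two_group(canonical_corr):
    """Wilks' Lambda for a single discriminant function (two-group case)."""
    return 1.0 - canonical_corr ** 2


# Canonical correlations as stated in the text for Models 1, 2 and 3.
canonical = {"Model 1": 0.786, "Model 2": 0.762, "Model 3": 0.92}
implied_lambda = {name: wilks_lambda_two_group(r) for name, r in canonical.items()}
for name, value in implied_lambda.items():
    print(name, round(value, 4))
```

A higher canonical correlation therefore corresponds to a smaller Lambda, that is, to a smaller share of the total variance left within the groups.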

Empirical Findings from the Logit Model
The results from running the logistic regressions for each of our models are presented below. The collinearity diagnostics are checked for all the variables. All the variables have a low variance inflation factor (VIF) and high tolerance. The model fits the data adequately, as is evident from the Hosmer and Lemeshow test (hereafter the HL test) for goodness-of-fit, whose significance value is > .05. Moreover, the Pseudo R square indicates a strong relationship between the dependent and the independent variables. The classification accuracy of the model with predictors shows a significant increase to 99%, from 50% in our base model with only the constant and no predictor variables.
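The collinearity diagnostics mentioned above can be illustrated for the simplest case of two predictors, where the R-square from regressing one predictor on the other equals their squared Pearson correlation, so that tolerance = 1 - r^2 and VIF = 1 / tolerance. This is a minimal sketch on invented data; with more than two predictors, each VIF requires a full auxiliary regression, which is not shown here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """Return (VIF, tolerance) for a model with exactly two predictors."""
    tolerance = 1.0 - pearson_r(x, y) ** 2
    return 1.0 / tolerance, tolerance


# Invented values for two ratios across five firms (correlation 0.8).
ratio_a = [1, 2, 3, 4, 5]
ratio_b = [2, 1, 4, 3, 5]
vif, tolerance = vif_two_predictors(ratio_a, ratio_b)
print(round(vif, 3), round(tolerance, 3))
```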

Adding Industry Effects
In this section, we follow the approach of Chava and Jarrow (2004) and test for industry effects by entering dummy variables for each industry grouping. It is observed that the classification accuracy is reduced only marginally, and that the accounting variables continue to reflect the liquidity, profitability and leverage of the firm. It is also observed from the research findings that the classification accuracy of the model with predictors shows a significant improvement, from 51% to 98%.

The Effects of Adding Macro-Economic Variables
When the four macro-economic variables are taken as the only independent variables and logistic regression is run, it is observed that the growth in manufacturing output and the GDP growth rate emerge as key predictors. However, the classification accuracy is only 70%. As a next step, accounting ratios are included along with industry dummy variables and macro-economic variables; Table 13 reports the results from these logit regressions. We can see that with the inclusion of economic variables, the predictive ability of the model is high at 98.4%, with the pseudo R square also indicating a strong association between the predictors and the dependent variable.
The model is tested on a hold-out sample of 36 companies and the classification accuracy is 89.4%, which is reasonably high.

Conclusion
This paper evaluates the predictive ability of the z-score model using discriminant analysis and of logistic regressions on a sample of 120 Indian listed companies. In the first model, discriminant analysis is applied to develop a z-score model by taking accounting information one year prior to the ratings assigned as defaulted/non-defaulted. The proposed model exhibits significantly higher predictive ability when compared with the two Altman models, the original Altman model and the Altman model for emerging markets, as evidenced by the classification accuracy. The z-score model developed can be used by financial institutions and banks in determining the solvency status of companies based on financial information available in the public domain.
In the second model, logistic regressions are used, which also exhibit high predictive ability. Moreover, when industry group and macro-economic variables are added to accounting ratios as predictors, the research findings indicate that industry affiliation, the GDP growth rate and the rate of growth of manufacturing output are also predictors of default, along with leverage and liquidity ratios.
The conclusion drawn from the research findings is that, though accounting-based models are not sufficient in themselves, they can identify financially distressed companies from the information disclosed in the financial statements. The logit model provides greater flexibility in that it accommodates both dichotomous and categorical variables, is simple and easy to interpret, can compute the probability of default directly, and can be used to forewarn lenders and investors of the magnitude of distress risk for a corporate.