Financial Distress Prediction Using GA-BP Neural Network Model

Financial distress prediction is a crucial link in enterprise risk management and the core of enterprise financial distress theory. Against the background of the current global economic recession and the gradual maturing of artificial intelligence technology, this paper first optimizes the back-propagation (BP) neural network model using the genetic algorithm (GA), which addresses the BP model's slow convergence and its tendency to become trapped in local optima. The optimized GA-BP neural network model is then trained and tested on financial distress data from Chinese listed enterprises. The experimental results show that the optimized GA-BP neural network model is significantly improved in both the accuracy and the stability of financial distress prediction. The study not only provides an effective model for the automatic recognition and early warning of enterprise financial distress, but also contributes new ideas and approaches for applying artificial intelligence in the field of financial accounting.


Introduction
Affected by the COVID-19 epidemic and oil prices, a real economic crisis caused by the global economic recession has become inevitable (Khan et al., 2020), and the probability that enterprises of all types fall into financial distress or bankruptcy keeps increasing. This has drawn close attention from market participants to the operating state of enterprises and has raised the requirements for enterprise financial distress prediction. Since Gordon first formally proposed the financial distress theory in 1971, research has continuously emerged on the periods before, during and after financial distress (Gordon, 1971; Opler & Titman, 1994; Jones & Hensher, 2004; Elkamhi et al., 2012). Among these, prediction before distress occurs is the most critical (Bellovary et al., 2007). Much evidence indicates that neglect of enterprise financial risks is an important cause of business failure (Tamari, 1966; Marin, 2013), and that financial distress prediction significantly influences corporate sustainability (Inam et al., 2019; Mehreen et al., 2020). Thus, for enterprises that carry significant financial risks but have not yet suffered significant losses (bankruptcy), it is of great importance to identify possible financial distress in time and to forewarn them of it. For enterprises and market participants alike, it is critical to obtain a financial distress prediction model with strong operability, high prediction accuracy and wide applicability.
Over the past 50 years, enterprise financial distress prediction has gradually evolved into an important part of enterprise risk management and has remained an important issue in accounting and finance research (Purnanandam, 2008). The earliest research on financial distress prediction can be traced back to Fitzpatrick's univariate analysis approach in 1932 (Fitzpatrick, 1932). Building on this idea, Beaver (1966) proposed a more complete univariate analysis model that used three main approaches to predict enterprise financial distress: mean comparison, the dichotomous classification test, and likelihood ratio analysis. Altman (1968) was the first to construct a financial distress prediction model using multivariate discriminant analysis (MDA), and the model was improved twice, in 1977 and 1995 (Altman et al., 1977; Altman et al., 1995). Based on linear multivariate discriminant analysis of financial ratios, Altman's (1968) Z index model still occupies an important position in the field of financial risk prediction. Martin (1977) was the first to apply the Logit model to bank bankruptcy prediction, and Ohlson (1980) extended this model to enterprises in general, achieving decent prediction results. Since the 1990s, with the continuous development of information technology, artificial intelligence has made rapid breakthroughs in financial distress prediction. Artificial intelligence approaches began to be tried in this field, including neural networks, genetic algorithms, rough set theory (RST), case-based reasoning, and support vector machines (SVM) (Odom & Sharda, 1990; Shin & Lee, 2002; Tay & Shen, 2002; Shin et al., 2005). Moreover, as artificial intelligence technology continues to improve, the joint use of multiple artificial intelligence approaches in financial distress prediction has become a new trend (Min et al., 2006; Ding et al., 2011; Chou et al., 2017).
To further improve the performance of financial distress prediction models, we can start from two aspects. First, an effective feature set is selected from the initial financial indexes using a suitable approach. Second, the correct classification algorithm is used to construct and optimize the prediction model (Lin et al., 2011; Lin et al., 2014). According to the literature, initial financial indexes are currently selected mainly through approaches such as principal component analysis (PCA), the independent sample T test, and discriminant analysis. Compared with index selection, studies on financial distress prediction models are much richer, and they have largely moved from prediction by early statistical approaches to prediction by artificial intelligence. Existing literature shows that, compared with statistical methods, the artificial neural network (ANN) has strong nonlinear mapping capabilities and flexible network structures, which give it an advantage in the accuracy of financial distress prediction (Lee et al., 1996). Abid and Zouari (2000) argue that a complex neural network architecture is not needed for predicting financial distress. Beyond shortening the forecast period and incorporating the latest variable information, the neural network model is also relatively capable of predicting financial distress. Yim and Mitchell (2005) compare hybrid neural network technology with traditional statistical techniques and the traditional ANN model; tested on Brazilian enterprise financial distress data, the hybrid neural network performs better than all other models in predicting enterprise financial distress. The financial distress prediction model constructed by Chen et al. (2006) for Chinese bankrupt companies shows that logit and neural network models are the optimal prediction models, with prediction accuracy ranging from 78% to 93%. Bose and Pal (2006) analyze the internet bubble from a financial perspective. They use discriminant analysis (DA), neural networks (NN) and support vector machines (SVM) to examine whether the survival or failure of click-and-mortar corporations can be predicted from financial ratios; on the whole, NN seems to perform better than the other methods. Constructing a financial distress prediction model from data on Taiwan listed companies, Chen and Du (2009) find that the ANN approach achieves higher prediction accuracy than the data mining (DM) clustering approach. Using data on public industrial firms from Taiwan, Lin (2009) compares the accuracy of financial distress prediction among multiple discriminant analysis (MDA), logit, probit, and ANN. The conclusion is that, when the data do not satisfy the assumptions of the statistical approaches, the ANN approach shows its advantage and achieves higher prediction accuracy. Geng et al. (2015) conduct empirical research on enterprises with special treatment (ST) in the Chinese capital market and find that neural networks are more accurate than other classifiers, such as decision trees and support vector machines, as well as an ensemble of multiple classifiers combined by majority voting. The neural network model developed by Iturriaga and Sanz (2015) to study the bankruptcy risk of American banks detects 96.15% of bankruptcies and is significantly superior to traditional bankruptcy prediction models. Salehi et al. (2016) predict the financial distress status of Iranian enterprises using support vector machines, artificial neural networks (ANN), k-nearest neighbor and the naive Bayesian classifier; the results show that the neural network method performs best in terms of prediction.
It can be seen that artificial neural networks have been widely applied in financial distress prediction, with relatively good prediction results. Among them, the BP neural network has a simple structure, many adjustable parameters, many training algorithms, and strong operability, so it is used relatively frequently in enterprise financial distress prediction. In essence, however, the BP algorithm is a gradient descent algorithm and a local search optimization approach, so it easily falls into local extrema when facing complex nonlinear functions, resulting in inefficient or even unsuccessful training (Ding et al., 2011). Therefore, this paper uses the genetic algorithm (GA) to optimize the back-propagation (BP) neural network model and explores the potential of the GA-BP neural network model in financial distress prediction. Because the genetic algorithm searches for global optimal solutions, it can effectively reduce the possibility that the BP neural network falls into a local optimum (Back et al., 1996). Combining the genetic algorithm with the BP neural network lets the two complement each other, maximizes their respective advantages, and further improves the stability and applicability of financial distress prediction (Ding et al., 2011). In summary, artificial intelligence approaches have contributed significantly to financial distress prediction, and will continue to lead the improvement and innovation of financial distress prediction approaches in the foreseeable future (Chen et al., 2016).
ijef.ccsenet.org International Journal of Economics and Finance Vol. 13, No. 3
The main contributions of this paper are as follows: (1) after distinguishing enterprises in financial distress from those not in financial distress, it selects the characteristic variables that differ significantly between the two groups and forms them into a feature set; (2) it constructs a BP neural network model optimized by the genetic algorithm and verifies the applicability of the model to financial distress prediction; (3) it compares the prediction accuracy of the GA-BP approach with that of traditional statistical prediction approaches; (4) it offers new insights into the application of artificial intelligence in the field of financial accounting.
The paper is organized as follows. Section 2 briefly introduces the advantages and disadvantages of the genetic algorithm and the BP neural network, and systematically analyzes the GA-BP neural network model constructed in this paper. Section 3 introduces the data and sample selection, and describes how the GA optimizes the weights and thresholds of the BP neural network. Section 4 presents the financial distress prediction results of the GA-BP neural network model. Section 5 discusses the empirical results and future research directions. Section 6 concludes the paper.

GA-BP Neural Network
The overall goal of the financial distress prediction model is to use existing characteristic variables to construct models or predictive variables. The model should be able to extract relevant knowledge of risk assessment from past observed values, and from a broader perspective, provide assessments for enterprise risk prediction in the future (Altman, 1968;Beaver, 1966;Zmijewski, 1984;Lin et al., 2014). Much literature has achieved relatively good prediction results after applying BP neural network to many areas, including financial distress, stock market, credit risk, commodity price prediction, etc. (Zhang & Wu, 2009;Wang et al., 2011;Wang et al., 2017). However, further verification is needed on whether the BP neural network can play a better role in financial distress prediction after it is optimized by genetic algorithm.

BP Neural Network
The BP neural network is derived from the error back-propagation algorithm proposed by Rumelhart and McClelland in 1986. The algorithm has the advantages of a simple structure, a stable working state and easy hardware implementation (Rumelhart & McClelland, 1986), which have made the BP neural network the most frequently used neural network model to date. In essence, the BP algorithm solves a minimization problem on the error function, and the result is used to adjust the weights of a multilayer feed-forward neural network. The algorithm consists of two processes: the forward propagation of information and the backward propagation of errors. The error between each training output and its expected value is analyzed, and the weights and thresholds are modified accordingly; learning and training are repeated until the network parameters (weights and thresholds) corresponding to the minimum error are determined, at which point training stops. A typical three-layer BP neural network is composed of an input layer, a hidden layer and an output layer, as illustrated in Figure 1. As a nonlinear mapping artificial neural network, the BP neural network is well suited to problems with complex internal mechanisms (Chung et al., 2008). With its self-learning, self-adaptive, generalization and fault-tolerant abilities, the BP neural network has considerable advantages in credit evaluation, risk prediction, performance evaluation and other areas (Lee et al., 1996; Fethi & Pasiouras, 2010; Dixon et al., 2017). Mathematically, however, the BP neural network is a local search optimization approach, so when solving complex nonlinear problems it can become trapped in a local extremum, resulting in training failure.
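The two processes described above, forward propagation of information and backward propagation of errors, can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the class name `ThreeLayerBP`, the sigmoid activation, the squared-error criterion, the learning rate and the toy OR-style data are all assumptions chosen for brevity. The "thresholds" of the text appear here as the bias terms `b1` and `b2`.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class ThreeLayerBP:
    """Hypothetical three-layer network: inputs -> one hidden layer -> one sigmoid output."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden          # hidden-layer thresholds
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0                       # output-layer threshold

    def forward(self, x):
        # Forward propagation of information.
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x, target, lr):
        # Backward propagation of the error E = 0.5 * (y - target)^2.
        h, y = self.forward(x)
        d_out = (y - target) * y * (1.0 - y)
        # Hidden deltas are computed with the output weights before they are updated.
        d_hid = [d_out * w * hj * (1.0 - hj) for w, hj in zip(self.w2, h)]
        for j in range(len(self.w2)):
            self.w2[j] -= lr * d_out * h[j]
        self.b2 -= lr * d_out
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * d_hid[j] * x[i]
            self.b1[j] -= lr * d_hid[j]

def mean_squared_error(net, data):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in data) / len(data)

# Toy data (assumed): label is 1 when at least one input is 1.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
net = ThreeLayerBP(n_in=2, n_hidden=3, rng=random.Random(0))
loss_before = mean_squared_error(net, data)
for _ in range(2000):
    for x, t in data:
        net.train_step(x, t, lr=0.5)
loss_after = mean_squared_error(net, data)
```

Because the updates follow the local gradient only, the result depends on the random initial weights and thresholds, which is the weakness that the genetic algorithm is meant to address.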

Genetic Algorithm
The genetic algorithm is a global optimization algorithm that has developed in recent years. In 1962, Professor Holland of the University of Michigan formed the original idea of the genetic algorithm (GA) by simulating natural genetic mechanisms and the theory of biological evolution. The algorithm brings the natural principle of "survival of the fittest" into optimization: according to a chosen fitness function, individuals are processed by the Selection Operator, Crossover Operator and Mutation Operator, so that those with high fitness are retained to form a new group. The new group not only retains the desirable traits inherited from the previous generation, but is also better than the previous generation. By repeating these operations cyclically, the population finally moves towards the optimal solution. Simple in principle and capable of parallel processing, the genetic algorithm helps to obtain the global optimal solution. Research indicates that, after proper improvement, genetic algorithms can converge to the global optimal solution with probability 1 for any optimization problem (Janson & Frenzel, 1993). The basic process of the genetic algorithm is illustrated in Figure 2.
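The cycle of selection, crossover and mutation can be sketched as below. This is an illustrative sketch under stated assumptions rather than the paper's code: chromosomes are real-valued lists, selection is fitness-proportional (the Modeling section later defines the selection probability as individual fitness divided by overall fitness), and `toy_fitness`, the gene range, the population size and the mutation rate are hypothetical choices. The fitness function must return positive values for the selection weights to be valid.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick two parents; P(individual i) = fitness_i / sum of all fitness values."""
    return rng.choices(population, weights=fitnesses, k=2)

def single_point_crossover(a, b, rng):
    """Swap the tails of two chromosomes at one random cut point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chromosome, rate, rng, low=-1.0, high=1.0):
    """With probability `rate`, overwrite one randomly chosen gene."""
    child = list(chromosome)
    if rng.random() < rate:
        child[rng.randrange(len(child))] = rng.uniform(low, high)
    return child

def run_ga(fitness, n_genes, pop_size=20, generations=50, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for generation in range(generations + 1):
        fitnesses = [fitness(c) for c in population]
        # Remember the best individual seen in any generation.
        for c, f in zip(population, fitnesses):
            if f > best_fit:
                best, best_fit = list(c), f
        if generation == generations:
            break  # maximum number of generations reached
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = roulette_select(population, fitnesses, rng)
            c1, c2 = single_point_crossover(p1, p2, rng)
            offspring.append(mutate(c1, mutation_rate, rng))
            offspring.append(mutate(c2, mutation_rate, rng))
        population = offspring[:pop_size]
    return best, best_fit

def toy_fitness(chromosome):
    """Hypothetical positive fitness, largest (1.0) when every gene is 0."""
    return 1.0 / (1.0 + sum(g * g for g in chromosome))

best, best_fit = run_ga(toy_fitness, n_genes=5)
```

In a GA-BP setting the toy fitness would be replaced by a function of the network's prediction error, so that chromosomes encoding better initial weights and thresholds are more likely to be selected.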

GA-BP Neural Network
In view of the deficiencies of the BP neural network model, we use the genetic algorithm's strength in finding global optimal solutions to optimize the BP neural network, so that the two complement each other. The genetic algorithm optimizes the BP neural network mainly through its initial weights and thresholds, which allows the optimized network to predict better. The GA-BP neural network can be divided into three parts: determination of the BP neural network structure, genetic algorithm optimization, and BP neural network prediction. First, the BP neural network structure is determined according to the number of input and output parameters of the fitting function, which in turn determines the length of the individuals in the genetic algorithm. Second, the genetic algorithm optimizes the weights and thresholds of the BP neural network. The elements of this optimization include population initialization, the fitness function, the Selection Operator, the Crossover Operator, and the Mutation Operator. Through cyclic iteration of selection, crossover and mutation, the optimal weights and thresholds are finally obtained. Third, in BP neural network prediction, the optimal individual obtained from the genetic algorithm is used to assign the initial weights and thresholds of the network, and the trained network then outputs the prediction. After these three parts have been carried out, the optimal solution of the model is obtained. The process by which the genetic algorithm optimizes the BP neural network is illustrated in Figure 3.
Figure 3. Optimization of BP neural network by genetic algorithm
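The link between the network structure and the length of a genetic algorithm individual can be made concrete with a short sketch. It assumes a three-layer network whose weights and thresholds are laid out one after another in a flat chromosome; the function names, the layout order and the hidden-layer size of 8 are hypothetical, while the 15 inputs and single output correspond to the 15 screened variables and the binary target used later in the paper.

```python
def chromosome_length(n_in, n_hidden, n_out):
    """Number of genes = all weights plus all thresholds of a three-layer network."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out

def decode(chromosome, n_in, n_hidden, n_out):
    """Split a flat chromosome into (w1, b1, w2, b2) for a three-layer network."""
    if len(chromosome) != chromosome_length(n_in, n_hidden, n_out):
        raise ValueError("chromosome length does not match the network structure")
    pos = 0
    # Input-to-hidden weights: one row of n_in weights per hidden neuron.
    w1 = [chromosome[pos + r * n_in: pos + (r + 1) * n_in] for r in range(n_hidden)]
    pos += n_in * n_hidden
    b1 = chromosome[pos: pos + n_hidden]      # hidden-layer thresholds
    pos += n_hidden
    # Hidden-to-output weights: one row of n_hidden weights per output neuron.
    w2 = [chromosome[pos + r * n_hidden: pos + (r + 1) * n_hidden] for r in range(n_out)]
    pos += n_hidden * n_out
    b2 = chromosome[pos: pos + n_out]         # output-layer thresholds
    return w1, b1, w2, b2

# 15 input variables, an assumed hidden layer of 8 neurons, 1 output.
n_genes = chromosome_length(15, 8, 1)
w1, b1, w2, b2 = decode(list(range(n_genes)), 15, 8, 1)
```

With these sizes an individual has 15 × 8 + 8 + 8 × 1 + 1 = 137 genes, and the decoded values would serve as the initial weights and thresholds before back-propagation training starts.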

Data Collection
To date, academic circles have not reached a consensus on the definition of financial distress. As the sign that an enterprise has fallen into financial distress, some studies use bankruptcy or business failure (Beaver, 1966; Lee et al., 1996), while others use the occurrence of continuous abnormal conditions in certain financial indexes (Campbell et al., 2011; Molina & Preve, 2012). The advantage of using bankruptcy as the sign is that the division criterion is simple and clear, and the research samples can be determined easily. The shortcoming is equally obvious: conceptually, bankruptcy means that the enterprise has fallen into an irreparable state whose immediate consequence is the termination of operation, whereas financial distress is only a crisis that still leaves room for rescue. The two are essentially different. It is also biased to use only one or several financial indexes as the sign of an enterprise falling into financial distress. In particular, there are currently few truly bankrupt companies in the Chinese capital market. This can be attributed both to the weak-form efficiency of the market caused by institutional deficiencies and to the imperfect market exit mechanism. As a result, listed companies on the verge of bankruptcy generally continue to exist as a unique kind of shell resource, and companies with financial problems are only identified as special treatment (ST). Therefore, considering the current situation of the Chinese capital market and drawing on the studies of Chen et al. (2006) and Geng et al. (2015), we regard the occurrence of special treatment (ST) as the sign that an enterprise has fallen into financial distress.

Financial Indicators
There is no clear theoretical support for the construction of a financial early-warning index system, and no single financial index can clearly predict or explain the cause and mechanism by which enterprises fall into financial distress. Therefore, after extensively referring to the relevant literature and drawing on the studies of Lee et al. (1996), Sun and Li (2012), Lin et al. (2014) and Geng et al. (2015), this paper selects 53 indexes as initial indexes to construct an early-warning index system for enterprise financial distress, including return on assets, accounts receivable turnover ratio, financial leverage, current ratio, asset preservation and appreciation ratio, and cash ratio. Taken together, these indexes can be roughly divided into seven groups: profitability, operating capacity, risk level, solvency, growth capacity, cash flow level and others. The specific financial early-warning indicator system is illustrated in Table 2.

Modeling
We construct the GA-BP model through the following four steps:
Step 1: Preprocess the data. The data are normalized to speed up subsequent network calculations. Since there are no categorical variables among the characteristic quantities selected in this paper, a unified min-max normalization of the numerical variables, x' = (x - min(x)) / (max(x) - min(x)), is adopted, ensuring that all variables lie in the range 0-1.
Step 2: Optimize weights and thresholds by the genetic algorithm.
(1) Initialize the population size and encode the individuals.
(2) Discretely encode the weights and thresholds, then randomly generate m weights and n thresholds, sampling randomly among the m weights and n thresholds each time (or generate m×n parameter lists), and determine the fitness function.
(3) Perform selection. Calculate the fitness of each individual and its probability of being selected, which equals individual fitness divided by overall fitness. The higher the individual fitness, the more likely the chromosome is to be selected.
(4) Perform crossover. Sample chromosomes according to the calculated probabilities, then randomly swap one or more points between two chromosomes to obtain two new chromosomes.
(5) Perform mutation. According to a given mutation probability, randomly change one point of randomly selected chromosomes. This operation effectively avoids the premature convergence that would leave the evolution process in a local optimum.
(6) Repeat steps (3), (4) and (5) until the termination conditions are satisfied. If the conditions are not met, the specified maximum number of generations is used as the termination criterion, after which the optimal weight and learning rate are obtained.
Step 3: Establish the BP network model. The best initial weight and best learning rate are used to construct a BP network. Theoretically, any nonlinear mapping can be achieved by a three-layer BP network. The required error precision can be reached by increasing the number of neurons in the hidden layer, and the training effect is easier to tune this way than by increasing the number of layers. Therefore, the GA-BP neural network in this paper is set to three layers, and the loss function is binary cross-entropy.
Step 4: Obtain the results through the BP network. To predict the output, the sample data are fed into the BP network model, and the output data are obtained once the result meets the final conditions.
The construction process of GA-BP neural network is illustrated in Figure 4.
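Two of the formulas named in the steps above, the min-max normalization of Step 1 and the binary cross-entropy loss of Step 3, are short enough to sketch directly. The function names are hypothetical, and two details are assumptions not stated in the paper: a constant column is mapped to zeros to avoid division by zero, and predicted probabilities are clipped by a small `eps` so that the logarithm stays finite.

```python
import math

def min_max_scale(column):
    """Rescale a list of numbers to the range 0-1: (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def binary_cross_entropy(targets, probs, eps=1e-12):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over all samples."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)   # keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(targets)

scaled = min_max_scale([2.0, 4.0, 6.0])
loss = binary_cross_entropy([1, 0], [0.9, 0.1])
```

Here `scaled` is `[0.0, 0.5, 1.0]`, and `loss` equals -ln(0.9), roughly 0.105, because both predictions are equally close to their targets.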

Data Preparation
The financial indexes used in this paper are all continuous variables, while the target variable is binary. To reduce the complexity of the neural network structure, lighten the model's computational burden and improve its running speed, the initial characteristic variables need to be screened. For finding the best index combination for enterprise financial distress prediction, principal component analysis (PCA) and the independent sample T test (Ding et al., 2008; Sun & Li, 2009) are two commonly used approaches. Based on the existing literature and an analysis of the correlation coefficient matrix, the indexes are further discriminated and screened through the independent sample T test. From the 53 initial financial variables, the 15 most relevant characteristic variables (5% significance level) are selected, including X3, X5, X9, and X10... which also form the basis of the GA-BP network analysis. The specific test results are illustrated in Table 3. The study takes 2017-2018 data as training samples and 2019 data as test samples.
Table 3 (partial; column headers and the variable name of the first row were not recovered)
(variable name not recovered)  -3.788×10^9  9.406×10^9  -1.486×10^9  3.601×10^9  -2.302×10^9  1.023×10^9  -2.251*
X51  -8.453×10^8  5.322×10^9  3.847×10^8  1.594×10^9  -1.230×10^9  5.641×10^8  -2.180*
*Significant at the level of 5%.
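The screening of indexes by the independent sample T test can be sketched as follows. This is a hedged illustration, not the authors' procedure: the paper does not say which variant of the test was used, so the sketch computes the unequal-variance (Welch) statistic, which coincides with the pooled-variance statistic when the two groups have the same size, as in the matched sample of this study. The critical value 1.96 is the large-sample two-sided 5% threshold and is an assumption; the function names and the toy data are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """t statistic for the difference in means of two independent samples."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

def screen_features(distress, healthy, critical=1.96):
    """Keep the variables whose |t| exceeds the critical value.

    `distress` and `healthy` map a variable name to its list of observations.
    """
    return [name for name in distress
            if abs(welch_t(distress[name], healthy[name])) > critical]

# Toy observations (assumed) for two hypothetical indexes.
distress = {"X3": [1.0, 2.0, 3.0, 4.0, 5.0], "X5": [10.0, 11.0, 12.0, 13.0, 14.0]}
healthy = {"X3": [2.0, 3.0, 4.0, 5.0, 6.0], "X5": [1.0, 2.0, 3.0, 4.0, 5.0]}
t_x3 = welch_t(distress["X3"], healthy["X3"])
selected = screen_features(distress, healthy)
```

In the toy data the group means of X3 differ by one standard error (t = -1), so X3 is dropped, while X5 differs by nine standard errors and is kept. With small samples an exact critical value from the t distribution should replace 1.96.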

GA-BP Neural Network Prediction Results
In financial distress prediction, two types of errors can generally occur. Type I: enterprises in financial distress are misclassified as healthy enterprises. Type II: conversely, healthy enterprises are misclassified as enterprises in financial distress. The model was implemented in MATLAB, and the results of the GA-BP neural network analysis are illustrated in Figure 5. As the figure shows, after being optimized by the GA, the BP neural network is significantly less likely to make either type of error. The prediction accuracy of the BP neural network and of the GA-BP neural network is summarized in Table 4. The table indicates that, after being optimized by the genetic algorithm, the BP neural network achieves a lower error probability in predicting financial distress, and its overall prediction accuracy rises from 77.27% to 81.82%.
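The two error types and the overall accuracy can be computed from true and predicted labels as in the sketch below. The coding of labels (1 for financial distress, 0 for healthy), the function name and the example labels are assumptions for illustration; they are not the paper's test data.

```python
def error_rates(y_true, y_pred):
    """Type I / Type II error rates and accuracy, with 1 = distress, 0 = healthy."""
    pairs = list(zip(y_true, y_pred))
    n_distress = sum(1 for t, _ in pairs if t == 1)
    n_healthy = len(pairs) - n_distress
    # Type I: distressed enterprise classified as healthy.
    type_1 = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Type II: healthy enterprise classified as distressed.
    type_2 = sum(1 for t, p in pairs if t == 0 and p == 1)
    correct = len(pairs) - type_1 - type_2
    return {
        "type_I": type_1 / n_distress if n_distress else 0.0,
        "type_II": type_2 / n_healthy if n_healthy else 0.0,
        "accuracy": correct / len(pairs),
    }

# Hypothetical labels: four distressed and four healthy enterprises.
rates = error_rates([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 1, 1])
```

For these labels one of four distressed enterprises is missed (Type I rate 0.25), two of four healthy enterprises are flagged (Type II rate 0.5), and five of eight predictions are correct (accuracy 0.625).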

Discussion
The main purpose of this study is to answer three questions: What are the most important factors in determining the characteristics of enterprise financial distress? Is the neural network model the optimal model for predicting enterprise financial distress? Does optimization by the genetic algorithm improve the accuracy of the BP neural network in financial distress prediction? First, the premise of constructing a financial distress prediction index system is the selection of significant characteristic variables. After summarizing and comparing existing financial ratio selection approaches, Lin et al. (2011) and Lin et al. (2014) found that principal component analysis (PCA) and the independent sample T test are the most commonly used approaches for selecting financial ratios. In this paper, 15 statistically significant indexes are selected from the 53 initial financial ratio indexes by the independent sample T test. They distinguish well between enterprises in financial distress and those not in financial distress, laying the foundation for the subsequent GA-BP neural network analysis. Second, much literature indicates that financial distress prediction using neural network models is superior to prediction using statistical approaches in accuracy and stability. For instance, Lin (2009), Geng et al. (2015) and Salehi et al. (2016) conclude that the artificial neural network (ANN) approach performs significantly better than traditional (statistical) financial distress prediction approaches. In view of this, this paper further investigates whether the neural network model is really superior to traditional statistical approaches on financial distress data from Chinese listed companies, adopting multiple discriminant analysis (MDA), probit and logit analysis. The specific analysis results are shown in Table 5. Compared with the results in Table 4, the prediction accuracy of the neural network approach is significantly higher. Finally, our research also shows the superiority of the GA-BP neural network model in enterprise financial distress prediction. We believe that the advantages of the GA-BP neural network are mainly reflected in three aspects. First, the model requires no assumptions about the statistical distribution or attributes of financial ratios, that is, the data need not meet strict statistical assumptions (Lin, 2009). Second, the model is very strong at fitting nonlinear data, so it can accurately simulate complex data patterns (Geng et al., 2015). Third, with GA optimization, the weights and thresholds of the BP neural network can be determined quickly to reach the global optimal solution. Scholars have applied the GA-BP neural network model to many fields, including wind speed forecasting (Wang et al., 2016), energy saving (Li et al., 2016), estimation of groundwater level (Hosseini & Nakhaie, 2015), gold price forecasting (Liu, 2009) and personal credit scoring (Wang et al., 2008), where it has achieved good prediction results; applying the GA-BP neural network to financial distress prediction in this paper is therefore a worthwhile attempt. In summary, we construct a GA-BP binary classification model to predict enterprise financial distress in advance, based on the advantages of the neural network in handling nonlinear fitting and multiple input parameters. The research conclusions of this paper can help business owners, managers, external audit institutions, governments and other stakeholders to effectively predict the possibility of an enterprise falling into financial distress.

Conclusion
In this paper, we constructed a financial distress prediction model using the GA-BP neural network and tested it empirically on financial distress data from Chinese listed enterprises. We selected a total of 53 indexes to construct the initial financial distress prediction index system, and then screened out 15 significant characteristic variables through the independent sample T test. The 97 enterprises in financial distress from 2017 to 2019 were matched with the same number of enterprises not in financial distress to form the data set studied in this paper. Our work can be divided into three aspects. First, we selected 15 characteristic variables that can distinguish between financial distress and non-financial distress. Second, we optimized the BP neural network using the genetic algorithm, and constructed a financial distress prediction model based on the GA-BP neural network. Third, we compared the results of traditional statistical approaches and neural network approaches in financial distress prediction.
As can be seen from the research conclusion of this paper, it is feasible to construct a financial distress prediction model using financial ratios. Among them, effective ones for distinguishing between enterprises of financial distress and those of non-financial distress include: the net interest rate on current assets, earnings per share, financial expense ratio, financial leverage, asset-liability ratio, total asset growth rate and corporate free cash flow, etc. Further, optimized by the GA algorithm, the BP neural network can be effectively advanced in terms of the accuracy of enterprise financial distress prediction. Research using financial distress data of Chinese listed enterprises shows that, after being optimized by the GA algorithm, the accuracy of financial distress prediction of the BP neural network rises from 77.27% to 81.82%. Meanwhile, the study conclusion of this paper has once again verified the superiority of the neural network approach compared with traditional statistical approaches in terms of financial distress prediction.