Evaluating company bankruptcies using causal forests

ABSTRACT This study sought to analyze the variables that can influence company bankruptcy. For several years, the main studies on bankruptcy relied on conventional methodologies with the aim of predicting it, and accounting variables were massively predominant in their analyses. However, these accounting variables were treated as homogeneous; that is, the traditional models assumed that the indicators behaved similarly in all companies, and the heterogeneity among them was ignored. The financial crisis that occurred at the end of 2007 is also relevant; it caused a major global financial collapse, which had different effects on a wide variety of sectors and companies. Within this context, research that aims to identify problems such as the heterogeneity among companies and to analyze the differences among them is gaining relevance, given that the sector-related characteristics of capital structure and size, among others, vary depending on the company. Based on this, new approaches applied to bankruptcy prediction modeling should consider the heterogeneity among companies, aiming to improve the models used even more. A causal tree and forest were used together with quarterly accounting and sector-related data on 1,247 companies, 66 of which were bankrupt, 44 going bankrupt after 2008 and 22 before. The results showed that there is unobserved heterogeneity when the company bankruptcy processes are analyzed, raising questions about traditional models such as discriminant analysis and logit, among others. Consequently, given the high dimensionality of the data, it was observed that there may be a functional form capable of explaining company bankruptcy, but this form is not linear. It is also highlighted that there are sectors that are more prone to financial crises, aggravating the bankruptcy process.


INTRODUCTION
Empirical studies often focus on the structure, causality, or treatment of a phenomenon of interest. In economics, for example, some studies seek to analyze the effects of an economic policy on economic development and employment, among others. However, there are unobservable conditions that make the strategy unviable, with it obtaining undesirable effects (Belloni, Chernozhukov, & Hansen, 2014a).
Within this setting, computational resources are gaining ground, and their application in contexts such as economics and finance is inevitable. Computer systems are helping in the analysis of large databases (big data) in which the conventional statistical tools, such as regression analysis, present results that fall short of those of other tools (Varian, 2014, 2016). With the traditional statistical tools (regressions), data manipulation and subsequent predictive potential are restricted, particularly to linear models, and they do not capture the relationships with other behaviors. Along this same line of thinking, empirical studies generally report their estimates based on a single model, leaving unexplained the part of the results that depends on the functional specification, since alternative specifications would normally lead to different point estimates (Athey & Imbens, 2015).
One solution for such estimation problems would be machine learning (ML) tools, for example techniques such as decision trees, support vector machines (SVMs), artificial neural networks (ANNs), and deep learning, among others, which present better results for more complex models, concentrating on high computational performance, as well as dealing with the presence of restrictions regarding linear or non-linear functional relationships (Varian, 2014).
Supervised learning techniques (ML) thus focus on guiding the models based on a dataset (Athey, 2015). They also extrapolate, presenting more reliable results when the data are heterogeneous and the functional form cannot be observed. Thus, the various ML methods are more effective for problems related to prediction (Athey & Imbens, 2016), in this case of company bankruptcy.
Non-linear relationships between the variables commonly used in bankruptcy prediction may be captured with greater accuracy by ML techniques (Tsai, Hsu, & Yen, 2014). These variables are treated as homogeneous even though they sometimes are not, which creates risks of misinterpretation, primarily imprecise estimates of the causal effects. Debt ratios, for example, present distinct characteristics when their components are analyzed individually, explaining the heterogeneity among companies, and bringing a new perspective to studies that use such variables (Boot & Thakor, 1997; DeMarzo & Fishman, 2007; Park, 2000). Based on this, it is suspected that these characteristics may be extended to the other indicators used in bankruptcy analysis.
The use of non-parametric approaches, such as the causal forest (CF), would facilitate an understanding of the heterogeneity, enabling a flexible model with high levels of interactions and dimensions (Athey & Imbens, 2016; Wager & Athey, 2018). This approach thus enables the construction of valid confidence intervals to analyze the treatment, even considering a high number of variables in relation to the sample size.
CFs are gaining more prominence, since techniques such as K-nearest neighbor (KNN) present limitations as the number of variables, and hence the number of dimensions, rises (Zhang & Zhou, 2007); that is, a greater quantity of variables would cause imprecision in the distance metric used, generating inaccurate estimates. Another option would be long short-term memory (LSTM); however, this methodology would be more suitable in cases of long time series, since it is based on the principle of the temporal evolution of the variables for classification (Hochreiter & Schmidhuber, 1997), and it would not provide relevant results in this research, since the longest series would be five years.
In general terms, maximizing the predictability of company bankruptcy, especially after periods of deterioration, such as in a financial crisis, is gaining greater relevance. In such periods, a government intervention, for example, helping companies that are more prone to bankruptcy, avoiding decreases in employment and income for the region, would be more beneficial, reducing the regional effects of the recession.
R. Cont. Fin. - USP, São Paulo
The CF proposed by Athey and Imbens (2016) and Wager and Athey (2018) would thus resolve this problem, facilitating the analyses. In this methodology, the tree looks for groups in which the average effects of the treatment differ most. The search would be for an individualized treatment, balancing both conditions. First, the tree seeks to find where the effects of the treatment differ most and then it estimates the effects of the treatment more accurately. Moreover, using computational methods, the honesty condition is inserted, in which the sample is subdivided to train the tree (training sample), followed by the application (validation sample). Finally, each one of the leaves is estimated, analyzing the difference between the means of the treatment and control, that is, the mean from observing a company with bankruptcy characteristics.
It is within this context that this study seeks to explore the CF methodology, aiming to identify a set of relevant variables relating to company bankruptcy and find behavioral patterns in the data on companies that presented bankruptcy. The most common models, discriminant analysis and logit, are the most widely used and, when treating bankruptcies, the use of CFs is still at an early stage, with few applications, and this research thus helps future studies on company bankruptcy.

THEORETICAL FRAMEWORK
The studies on bankruptcy generate numerous relevant results, especially regarding capital structure, indicators used, and market sensitivity. In regard to capital structure, debt concentration enables fewer transaction costs involving the renegotiation of values. When presenting a recovery plan to a lower volume of creditors, these are more likely to accept, as they run risks of greater losses if liquidation occurs. There is also the possibility of a change in ownership, resulting in a reduction in credibility and increasing the probability of liquidation (Ivashina, Iverson & Smith, 2016). Also in the context of leverage, riskier structures are more prone to resorting to a bankruptcy process. This probability is reduced when there is a considerable amount of debts with real guarantees (Jostarndt & Sautner, 2010).
Presenting real solid guarantees to creditors, such as fixed assets, can help reduce the bankruptcy process, since these guarantees would be enough to honor the debts. However, keeping a high volume of this type of asset would compromise the company's liquidity. There is thus a negative relationship between the firm's liquidity and bankruptcy risk, a relationship that does not appear to be linear (Brogaard, Li, & Xia, 2017). For the Italian context, in which the reorganization and liquidation process mirrors Chapters 7 and 11 of Title 11 of the regulations on bankruptcy and bankrupt companies in the United States Code (https://uscode.house.gov/browse/prelim@title11&edition=prelim), a company, when it falls into the reorganization process, produces an increase in interest on bank financing, directly reflected in its investments (Rodano, Serrano-Velarde, & Tarantino, 2016). Regarding the profitability indicators, such as return on equity (ROE) and return on assets (ROA), a rise in the latter of more than 15% can indicate a greater propensity for failure, being driven by the cash flow risk combined with internal and costly financing. Other results have shown that low leverage represents a higher probability of bankruptcy, possibly reflecting the low volume of credit (Giordani, Jacobson, Schedvin, & Villani, 2014).
The market thus understands that similar companies may be experiencing the same problems, this effect being known as contagion. On the other hand, a bankruptcy announcement conveys information about how good the remaining companies are, generating an expectation of wealth redistribution in the segment, this effect being known as competitive (Lang & Stulz, 1992). There is also the possibility of collateral effects, reducing the value of similar assets in the secondary market, generating a disequilibrium in supply and demand (Benmelech & Bergman, 2011).
There is also the expectation of market sensitivity, where the average price of stocks of companies in the same segment presents a negative reaction, that is, a drop, which may be a reflection of the contagion effect (Lang & Stulz, 1992).

Bankruptcy and ML
Given the importance of the bankruptcy issue, the studies that aim to predict it have grown, especially in recent years. Comparisons between ML methodologies (SVM, ANN, weighted least squares [WLS], and decision tree, among others) and the traditional methodologies (discriminant analysis and logit) are inevitable, with the results indicating the superiority of the computational techniques. Min and Lee (2005) used SVM for predicting bankruptcy and a promising response was identified when comparing it with the most widespread methodologies in the literature, such as discriminant analysis and logit, revealing SVM to be superior in terms of predictive capacity, once the parameters were estimated.
Regarding the selection of financial indicators for bankruptcy prediction, Yang, You, and Ji (2011) used partial least squares (PLS) and found it was better at predicting compared to the other traditional techniques, as well as observing a complex and non-linear relationship in the parameters. Tsai et al. (2014) compared various ML methodologies, such as the decision tree, ANN, and SVM, and found that the ML models are better at predicting than the traditional metrics. Among these, SVM presented the best results compared with the other models studied, presenting intermediate performance. Comparing the Gaussian model with SVM and the logit model, better predictions were found with the Gaussian process than with SVM and logit, as well as slightly higher accuracy of SVM compared to logit (Antunes, Ribeiro, & Pereira, 2017). Barboza, Kimura, and Altman (2017) compared various methodologies with ML and concluded that these present a substantial improvement in bankruptcy prediction, with around 10% more precision, especially when they include, besides the variables proposed by the Altman z score, some complementary financial indicators.
In general, when the traditional methodologies are compared with the ML ones, the latter are shown to be superior. However, when analyzing the results among the ML techniques, the conclusions are still contradictory, depending on the variables used.

METHODOLOGY
Various models have been employed in finance with the aim of identifying the next companies to fail. Within the context of conventional analyses, the models used, discriminant analysis (Altman, 1968) and logit (Ohlson, 1980), among others, primarily depend on a functional form pre-established by the researcher that is limited to the scope of the methodology. In machine learning, however, the models can extrapolate beyond such a pre-established form, achieving more satisfactory results. This requires the input or independent variables, x ∈ R (profitability, liquidity, leverage, and gross domestic product [GDP], among others), and the result or dependent variable, y ∈ R or y ∈ {0, 1} (bankrupt or not bankrupt), with the aim of learning how the inputs explain company bankruptcy. The results may be non-linear models (a relationship suggested in the studies of Giordani et al. [2014] and Brogaard et al. [2017]).
Other methodologies have been tested over the years; however, in many cases, the focus has only been on the use of the methodologies, and not on a robust analysis of the results found. A summary of these models can be observed in Table 1.

Table 1
... | ... | Lennox (1999), Min and Lee (2005), Cho, Kim, and Bae (2009), Lee and Choi (2013), Barboza et al. (2017), García, Marqués, Sánchez, and Ochoa-Domínguez (2017)
Logit | Basic | Ohlson (1980), Lennox (1999), Min and Lee (2005)

However, problems are directly encountered regarding (i) the high volume of dimensions and (ii) heterogeneity. Nonparametric approaches that seek to analyze heterogeneous effects perform well in applications with small quantities of variables (Wager & Athey, 2018). In the ML literature, there is a variety of effective methods, the most popular of which (regression tree, random forest, and SVM, among others) imply modeling relationships between the attributes and the results (Athey & Imbens, 2016).
Among the possibilities for analyzing the effect of the 2007 financial crisis, one solution would be to include an interaction dummy; however, the models became even more complex, resulting, in this research, in more than 80 variables. These variables could be chosen using the least absolute shrinkage and selection operator (Lasso) and the post-Lasso, as will be seen below. However, we would encounter linear models, since they would be estimated by ordinary least squares (OLS). Using SVM would also be an option, but it would be limited to the non-exploration of the unobserved characteristics (particularities) of the companies. The tree and CF proposal are more recommended in this context, since they would enable the conditions to observe the most latent bankruptcy characteristics, considering the particularities of each set of companies.

Conditional Treatment
In the literature on machine learning based on prediction, the regression tree presents characteristics that are slightly different from the other methods, producing partitions of the population based on the variables so that all the units of a partition receive the same prediction (Athey & Imbens, 2016).
The proposal of this study would thus be to apply an incipient methodology in the context of finance, especially regarding bankruptcy evaluation, analyzing its characteristics. Thus, the studies of Athey and Imbens (2016) and Wager and Athey (2018) were applied to CFs.
CFs have properties that provide unbiasedness and asymptotic normality, producing a partition of the population according to the variables in which all the units of a partition receive the same prediction. Formalizing the problem based on Athey and Imbens (2016), we have N units indexed by $i = 1, \ldots, N$, with a pair of potential outcomes $(Y_i(0), Y_i(1))$ for each unit and a causal effect given by $\tau_i = Y_i(1) - Y_i(0)$. We also denote a binary indicator $W_i \in \{0, 1\}$, with $W_i = 0$ indicating that the unit did not receive the treatment and $W_i = 1$ indicating that it did; we thus have:

$$Y_i^{obs} = Y_i(W_i) = \begin{cases} Y_i(0) & \text{if } W_i = 0, \\ Y_i(1) & \text{if } W_i = 1. \end{cases} \tag{1}$$

We also have $X_i$ as a vector composed of K variables not affected by this treatment, thus generating a set of observations composed of $(Y_i^{obs}, W_i, X_i)$ with $i = 1, \ldots, N$, this being an independent and identically distributed sample. It is also assumed that the observations are exchangeable and, in a randomized experiment with constant treatment assignment probabilities, $e(x) = p$ for all values of x, where the marginal treatment probability is given by $p = \Pr(W_i = 1)$ and the conditional treatment probability is given by $e(x) = \Pr(W_i = 1 \mid X_i = x)$. We thus arrive at:

$$W_i \perp \big(Y_i(0), Y_i(1)\big) \mid X_i \tag{2}$$

The conditional average treatment effect (CATE) is therefore:

$$\tau(x) \equiv \mathbb{E}\big[Y_i(1) - Y_i(0) \mid X_i = x\big] \tag{3}$$

With this, Athey and Imbens (2016) obtained more precise estimates for the conditional average treatment effect, that is, $\hat{\tau}(\cdot)$, in which $\tau(x)$ is based on a partitioning of the feature space, not varying within the partitions. The treatment is randomly assigned in the subpopulations indexed by $X_i = x$, indicating that, once all the observable characteristics of individual i are known, the status of the treatment does not generate extra information about its possible results.
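The identification step behind the CATE can be written out as a short derivation. This is a sketch in the notation of Athey and Imbens (2016), using the conditional random assignment of $W_i$ given $X_i$ described above; the overlap condition $0 < e(x) < 1$ is added here as an explicit assumption so that both conditional means are well defined.

```latex
\begin{aligned}
\tau(x) &= \mathbb{E}\big[Y_i(1) - Y_i(0) \mid X_i = x\big] \\
        &= \mathbb{E}\big[Y_i(1) \mid W_i = 1, X_i = x\big]
         - \mathbb{E}\big[Y_i(0) \mid W_i = 0, X_i = x\big] \\
        &= \mathbb{E}\big[Y_i^{obs} \mid W_i = 1, X_i = x\big]
         - \mathbb{E}\big[Y_i^{obs} \mid W_i = 0, X_i = x\big]
\end{aligned}
```

The second line uses the independence of the treatment indicator from the potential outcomes given $X_i$, and the third uses the definition of the observed outcome. The result is why a difference between treated and control means within a group of similar units can be read as a treatment effect for that group.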

Post-Lasso
One simple possibility for analyzing the conditional effect related to some treatment and the interactions of its effect can be carried out using the Lasso (a procedure adopted to choose the relevant variables in a regression model). We thus have the following model:

$$Y_i^{obs} = \beta_0 + \beta_w W_i + X_i^{\top}\beta_x + W_i X_i^{\top}\beta_{xw} + \varepsilon_i \tag{4}$$

So, if this linear model is the true model, the CATE can be written as follows:

$$\tau(x) = \beta_w + x^{\top}\beta_{xw} \tag{5}$$

Equation 5 implies that different subpopulations indexed by $X_i = x$ have different effects when $\beta_{xw} \neq 0$. This approach is very common when the dimension of the variables, $p = \dim(X_i)$, is small, using OLS. However, the problem increases as p grows and tends toward p > n, making the application of OLS unviable. The acceptable solution would thus be to apply the Lasso and subsequently the post-Lasso, choosing the variables that best explain the dependent variable and then estimating them using OLS. These procedures present advantageous properties when the regularization parameters are chosen appropriately (Belloni, Chernozhukov, & Hansen, 2014b; Belloni et al., 2014a), as well as presenting unbiasedness and asymptotic normality.
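The two-step logic of the post-Lasso (select variables with the Lasso, then re-estimate the selected ones by OLS) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy factorial data, the fixed penalty `lam = 0.1`, the centring of the treatment indicator, and the linear specification with main effects and treatment interactions. It uses a plain coordinate-descent Lasso, not the data-driven penalty of Belloni et al.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by coordinate descent.
    No intercept is fitted, so y is assumed to be centred."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X b
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def ols(X, y):
    """OLS via the normal equations, solved by Gaussian elimination."""
    n, p = len(X), len(X[0])
    M = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, p):
            f = M[r][k] / M[k][k]
            for c in range(k, p + 1):
                M[r][c] -= f * M[k][c]
    b = [0.0] * p
    for k in range(p - 1, -1, -1):
        b[k] = (M[k][p] - sum(M[k][c] * b[c] for c in range(k + 1, p))) / M[k][k]
    return b


def post_lasso(X, y, lam):
    """Step 1: Lasso selects columns. Step 2: OLS on the selected columns only."""
    beta = lasso_cd(X, y, lam)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    Xs = [[row[j] for j in selected] for row in X]
    return selected, (ols(Xs, y) if selected else [])


# Toy factorial data (hypothetical): the true effect is tau(x1) = 1 + 2*x1, x2 is irrelevant.
rows = [(w, x1, x2) for w in (0, 1) for x1 in (-1.0, 1.0) for x2 in (-1.0, 1.0)]
y_raw = [w * (1.0 + 2.0 * x1) for (w, x1, x2) in rows]
y_bar = sum(y_raw) / len(y_raw)
y = [v - y_bar for v in y_raw]
# Columns: centred W, x1, x2, W*x1, W*x2 (the last two are the interactions).
X = [[w - 0.5, x1, x2, (w - 0.5) * x1, (w - 0.5) * x2] for (w, x1, x2) in rows]

selected, coefs = post_lasso(X, y, lam=0.1)
beta_w = coefs[selected.index(0)]
beta_xw = coefs[selected.index(3)]


def tau(x1):
    """Estimated conditional effect implied by the interaction model."""
    return beta_w + beta_xw * x1
```

In this toy design the Lasso step shrinks the coefficients toward zero and drops the two `x2` columns, while the OLS step removes the shrinkage on the columns that remain. With more columns than observations, only the first step is feasible on the full design, which is the situation the text describes.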

CF
With the possibility of high dimensionality, one solution would be the CF. In a broad context, regression trees and forests can be considered nearest-neighbor methods that use an adaptive neighborhood metric. Classical methods of this type generally use the Euclidean distance to find the closest neighbors. Decision trees, in contrast, can present leaves that are narrower along the directions in which the signal changes quickly and wider along the other directions. Thus, a causal tree can be built that resembles the regression tree, finding a point at which the high dimensionality does not cause as much of a problem for the estimates (Wager & Athey, 2018).
For this construction, suppose that there are independent samples $(X_i, Y_i)$ with which to build a regression tree. The space is then divided recursively until it is partitioned into a set of leaves L, each containing only a few training samples. Given a point x, the prediction $\hat{\mu}(x)$ is evaluated by identifying the leaf L(x) that contains x and establishing:

$$\hat{\mu}(x) = \frac{1}{|\{i : X_i \in L(x)\}|} \sum_{\{i : X_i \in L(x)\}} Y_i \tag{6}$$

CFs are adaptive and flexible, making them efficient for estimating local parameters, such as the CATE (Athey, Tibshirani, & Wager, 2019). Locally weighted estimators are calculated; that is, the effects of the treatment at a specific target $X_i = x$ are estimated, giving greater weights to the most relevant observations. The main benefit would be the greater efficiency in choosing the most important dimensions, reducing the dimensionality problem. By incorporating the conditional treatment (CATE), we have:

$$\hat{\tau}(x) = \frac{1}{|\{i : W_i = 1, X_i \in L(x)\}|} \sum_{\{i : W_i = 1, X_i \in L(x)\}} Y_i - \frac{1}{|\{i : W_i = 0, X_i \in L(x)\}|} \sum_{\{i : W_i = 0, X_i \in L(x)\}} Y_i \tag{7}$$

The CF thus generates a set of B causal trees, in which each one produces an estimate $\hat{\tau}_b(x)$. The forest then aggregates their predictions by calculating the mean $\hat{\tau}(x) = B^{-1} \sum_{b=1}^{B} \hat{\tau}_b(x)$. By averaging the output of many trees, the conditional mean effect can also be calculated. The splits are placed without using the outcome information of the observations used for estimation; this sample division is called honesty, and it produces large leaves with asymptotic normality in each one. It warrants mentioning that no item of data is wasted, while the honesty properties are still satisfied.
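The leaf-level estimate, the honesty split, and the averaging over trees can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the causalTree implementation: each "tree" is a single split on one scalar covariate, the split is chosen by the squared gap between the two leaf effects (a stand-in for the splitting criteria of Athey and Imbens, 2016), and the data are noise-free toy values. All function names are hypothetical.

```python
import random
from statistics import mean


def leaf_effect(rows):
    """Mean outcome of treated units minus mean outcome of controls in one leaf.
    Each row is (x, w, y). Returns None if either group is empty."""
    treated = [y for (_, w, y) in rows if w == 1]
    control = [y for (_, w, y) in rows if w == 0]
    if not treated or not control:
        return None
    return mean(treated) - mean(control)


def fit_honest_stump(rows, rng):
    """One-split causal tree. Half of the rows choose the split point;
    the other half estimate the leaf effects (honesty)."""
    rows = rows[:]
    rng.shuffle(rows)
    half = len(rows) // 2
    split_half, est_half = rows[:half], rows[half:]
    best = None
    for c in sorted({x for (x, _, _) in split_half}):
        left = leaf_effect([r for r in split_half if r[0] <= c])
        right = leaf_effect([r for r in split_half if r[0] > c])
        if left is None or right is None:
            continue
        gap = (left - right) ** 2  # simplified heterogeneity criterion
        if best is None or gap > best[0]:
            best = (gap, c)
    if best is None:
        return None
    c = best[1]
    left = leaf_effect([r for r in est_half if r[0] <= c])
    right = leaf_effect([r for r in est_half if r[0] > c])
    if left is None or right is None:
        return None
    return c, left, right


def causal_forest(rows, n_trees, subsample, seed=0):
    """Grow honest stumps on random subsamples drawn without replacement."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        tree = fit_honest_stump(rng.sample(rows, subsample), rng)
        if tree is not None:
            trees.append(tree)
    return trees


def predict_forest(trees, x):
    """Forest estimate: average of the per-tree leaf effects at x."""
    return mean(left if x <= c else right for (c, left, right) in trees)


# Toy data (hypothetical): effect is -2 for x < 0.5 and +1 for x >= 0.5.
rows = [(i / 200, i % 2, (i % 2) * (-2.0 if i < 100 else 1.0)) for i in range(200)]
trees = causal_forest(rows, n_trees=20, subsample=100, seed=0)
effect_low = predict_forest(trees, 0.1)
effect_high = predict_forest(trees, 0.9)
```

Because every tree sees a different subsample and a different split/estimation division, each observation is used for splitting in some trees and for estimation in others, which is the sense in which no data are discarded.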
The sample divisions, also known as sample partitioning, generate an estimation sample and a test sample. After this procedure, the results are estimated and a cross-validation process is carried out in which it is possible to predict the point estimates of the effect of the treatment on the estimation sample. Also in this procedure, the tree is pruned based on its level of complexity (complexity parameter).
With this, it is assumed that the individual causal trees in the forest are grown on random subsamples of the training examples (Athey & Imbens, 2016). Various tuning parameters are also considered, such as the minimum node size for the trees and cross-validation, minimizing the losses and reducing the standard errors. The CF can be estimated using the causalTree package proposed by Athey (2019) for the R software. See also the link to the code on GitHub (https://github.com/susanathey/causalTree). Other procedures and complements can be observed in the manual. We also suggest reading Vapnik (2000) for more information on ML.

Data and Variables Used
For the market, it would be interesting to identify companies before they present bankruptcy characteristics, minimizing investment losses. Such models or methodologies make the evaluation impartial, exempt from subjective influences, enabling the analyst to classify the risks of the company regarding its future and capacity to generate results.
For this verification, the bankruptcy prediction techniques are divided into: qualitative analysis, with subjective models; univariate analysis, using rates based on accounting data or market indicators; multivariate analysis, including discriminant analysis, logit, probit, non-linear, neural network, Altman z score, Ohlson o score, and models based on market value, among others (Altman & Hotchkiss, 2007). Models such as those of Altman (1968) use discriminant analysis to classify companies as solvent and insolvent.
Limitations of these studies are found when non-linear relationships may be presented between the variables studied, such as bankruptcy and the main company indicators (leverage, profitability, liquidity) (Giordani et al., 2014). Other limitations are of a modeling nature, such as the normality of the data used for the discriminant analysis, as well as the linearity of the variables. One problem associated with neural networks relates to the understanding and resolutions of the patterns found.
Regarding the causes of bankruptcy, there is no predominant isolated factor of company bankruptcy. The first studies used only endogenous variables, related to profitability, liquidity, and leverage indicators (Altman, 1968;Deakin, 1972;Ohlson, 1980). Following the same line with internal variables, Giordani et al. (2014) adopted the augmented standard logit methodology, in which they sought to understand the non-linear relationships of the variables that influence bankruptcy, and found significant and robust results.
In addition, there are the arguments that company bankruptcy suffers from an external influence, that is, exogenous variables related to the country's economic situation or to government policies, since the internal indicators do not present sufficient information about the economic conditions faced by companies (Johnson, 1970). Giordani et al. (2014) also suggest the inclusion of variables external to the bankruptcy models and also warn of the need for non-linear approaches.
Regarding the exogenous variables, there are arguments showing that smaller companies are more likely to fail due to various factors, such as: (i) bigger companies appear to more easily take advantage of the effects of scale; (ii) bigger companies have more bargaining power with suppliers and financial institutions, among others; and (iii) bigger companies tend to benefit from greater experience or learning (Strömberg, 2000).
It also warrants mentioning that, in some situations, it is advisable to build specific models for the sector, where there is a distinction between the size of the companies (Mensah, 1984;Taffler, 1984). A summary of some studies and variables can be observed in Table 2.

Table 2
Exogenous variables
Size | Altman et al. (1977), Ohlson (1980), Cole and Gunther (1995), Strömberg (2000), DeYoung (
Source: Elaborated by the authors.

Giordani et al. (2014) emphasize that the internal indicators are often explored in insolvency analyses, reflecting the capital structure, profitability, and liquidity of companies. In regard to leverage, the authors argue that, in bankruptcy conditions, liabilities exceed assets. Regarding profit and liquidity, these provide relevant information about the scarcity of liquid assets to give continuity to the company's activities, with continuous expenses and debt payment.
Low net working capital is a frequent problem presented by companies in bankruptcy situations, since resources are constantly consumed by the operating losses, reducing the proportion of current assets, generally represented by the company's liquidity. In regard to retained earnings/total assets, they indicate that newer companies tend to have lower earnings than companies consolidated in the market. According to Altman (1968), this individually tested variable was the most relevant for dividing the groups into bankrupt and non-bankrupt companies.
Debt structure is also relevant for explaining company bankruptcy. Companies that are more indebted with banks are more likely to restructure due to the greater ease of renegotiating their debt (Jostarndt & Sautner, 2010). The insolvency risk of big companies is reduced due to the large volume of assets, that is, they are too big to fail (Acharya & Mora, 2015), giving greater relevance to the Size variable. The inclusion of sector variables would make up for the economic variations caused by market oscillations, especially due to some financial, sector-related, technological, or supply-related crisis, among others.

DATA ANALYSIS
In recent years, the literature on ML has worked hard to produce quality estimates, even for large volumes of data. The predictions can be used to guide small populations with specific characteristics, such as corporate bankruptcy. With the aim of analyzing the heterogeneity among companies in the market, various accounting and sector-related variables for 1,247 companies were listed.
One thousand two hundred forty-seven U.S. companies were chosen from 10 sectors classified according to the Thomson Reuters Business Classification. The balance sheets chosen involve the five years of the bankruptcy process, as there is proof of declines in the indicators (Kalay, Singhal, & Tashjian, 2007). Among these companies, 66 filed for bankruptcy, 22 of which went bankrupt before 2008 and 44 after (the treatment period). For a closer measurement, the balance sheets of the non-bankrupt companies were collected in the same year as the bankrupt ones, totaling 32,188 quarterly observations retrieved from the Thomson Reuters database.
A large sample imbalance is perceived, with 1,181 non-bankrupt companies and 66 bankrupt ones, characterizing the unequal proportion between the two classes (bankrupt and non-bankrupt). To resolve this problem, the synthetic minority oversampling technique (SMOTE) methodology was used. The SMOTE is an algorithm for generating artificial data to balance the minority class based on the closest neighbors. The majority class is also resampled, increasing the volume of data (Chawla, Bowyer, Hall, & Kegelmeyer, 2002).
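The core idea of SMOTE, creating a synthetic minority observation on the line segment between a minority observation and one of its nearest minority neighbors, can be sketched as follows. This is a minimal illustration of the interpolation step described by Chawla et al. (2002), not the implementation used in the study; the function name, the toy points, and the parameter values are hypothetical, and the resampling of the majority class is not shown.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a randomly
    chosen minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist2(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority class with two features (e.g. two scaled ratios).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote(minority, n_new=10, k=2, seed=0)
```

Since every synthetic point lies between two observed minority points, the new data stay inside the region already occupied by the bankrupt class instead of duplicating existing rows.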
In regard to the variables, when applying the tree and CF methodology, as well as the other ML techniques, a greater number of variables would be interesting, with the aim of capturing the company characteristics in detail. This process generates considerable difficulty, as there are missing data in many of the balance sheets, thus compromising a high number of observations. We thus list a set of equity and sector variables in order to apply the methodology. The descriptive statistics without the synthetic data can be observed in Table 3. As expected, there is great variety among the companies, especially in size. This variation contributes substantially to the heterogeneity of the companies. It is also observed that despite there being many accounting variables, there is a low correlation between them (Figure 1).
The X29 variable is the binary variable indicating bankrupt or non-bankrupt firms, and the TRA variable is the binary treatment variable (before and after the crisis). It warrants mentioning that we are not interested in causal effects estimated, especially, with parametric metrics, but in analyzing which variables may generate relevant partitions that indicate the soundness of a company. Within this context, the results of the CF cannot be interpreted as partial effects, keeping the other variables constant.

Post-Lasso Analysis
A simple way of analyzing the causal effects between the pre- and post-financial collapse variables would be via simple interactions with a linear model, as described in equation 4. Athey and Imbens (2016) warn that this methodology would be relevant in models with few variables, becoming a problem when there is a large volume. With high dimensionality, one solution would be to use the Lasso as a kind of operator for choosing the variables that are relevant to the model (Athey, Imbens, Pham, & Wager, 2017) and then to apply the OLS regression (Belloni et al., 2014b). Having carried out these procedures, the results can be observed in Table 4. With the interactions, the model would have 66 variables, of which 33 are the initial ones of the model (23 accounting and 10 sector-related) and 33 are interactions. It is observed that the volume of relevant interactions I(*W), especially in the internal company variables, is high, totaling 11. The sector indicators D were only relevant on four occasions, revealing that, before the financial crisis, the Basic materials (D_BM), Cyclical consumption (D_CC), Noncyclical consumption (D_CNC), and Telecommunications (D_TS) sectors were the most affected in the bankruptcy processes. After the crisis, the results would be broad, with no relevant interactions. However, there is a limitation regarding the interpretation of this model, as it concerns a linear regression.
These results are very generic in terms of possible predictability, since different effects are found in a wide variety of companies. Given the individual characteristics of each company, the possibility of renegotiating debts, for example, would cause distortions regarding the possibilities of intervention in the companies. Another relevant point would be the characteristics of current assets in terms of the quick ratio and burn rate. The operating and non-operating income, as well as the quality of the earnings involved, may be relevant determinants for a company going bankrupt or not. And with these results (Table 4), the variables are treated homogenously.

Conditional Treatment and Causal Tree Analysis
In this context, there is the need to know in which subpopulations the financial crisis had the greatest effect. Athey and Imbens (2016) state that in these cases a data-driven way of identifying the relevant heterogeneity may be convenient. Causal trees produce this indication based on the data in order to understand the heterogeneity and where it lies in the space of each variable, generating unbiased estimates of the treatment effect in each subgroup. The initial tree was generated with 294 leaves. The cross-validation error (x-val) does not always decrease when the tree becomes more complex (as an analogy to the regression model: including more variables in the model does not necessarily increase its predictive power). A good cut-off point would be where the points cross and lie below the horizontal line, opting for the point furthest to the left, generally the lowest xerror value. After all these analysis procedures, the regularization parameter converges at 156 divisions, where the xerror value ceases to decrease.
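The cut-off rule described above can be sketched as a small selection function. It assumes that the horizontal line is drawn one standard error above the minimum cross-validated error, as in the complexity plot of R's rpart package; that reading, the function name, and the table values below are illustrative assumptions and are not results from this study.

```python
def select_by_one_se(cp_table):
    """Pick the smallest tree whose cross-validated error lies on or below
    the line drawn at min(xerror) + its standard error.
    cp_table rows are (n_splits, xerror, xstd)."""
    best = min(cp_table, key=lambda row: row[1])
    threshold = best[1] + best[2]
    eligible = [row for row in cp_table if row[1] <= threshold]
    return min(eligible, key=lambda row: row[0])


# Hypothetical complexity table: (number of splits, xerror, xstd).
cp_table = [
    (0, 1.00, 0.05),
    (4, 0.80, 0.04),
    (9, 0.62, 0.03),
    (20, 0.60, 0.03),
    (35, 0.61, 0.03),
]

chosen = select_by_one_se(cp_table)                  # leftmost point below the line
lowest = min(cp_table, key=lambda row: row[1])       # point with the lowest xerror
```

In the hypothetical table the two rules disagree: the lowest xerror is at 20 splits, while the leftmost point below the line is at 9 splits. When the error curve flattens, as the text describes, the two choices tend to be close.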
The interaction coefficients generated are the mean treatment effects of each one of the leaves (Table 5). After the adjustments, the tree thus has 156 leaves, and the treatment is relevant in all of them. The analyses are similar to those of an OLS regression. It is observed that the data are in decreasing order and only from leaf 107 onward are the coefficients positive; thus, the crisis would have a negative effect on more than half of the leaves, showing its relevance for the accounting variables analyzed.
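To make the notion of a leaf-level mean treatment effect concrete, the sketch below computes, for each leaf, the difference between the mean outcome of treated and control observations that fall in it. It is a simplified illustration: the function name `leaf_effects` and the array names are hypothetical, and the leaf assignment is taken as given rather than estimated.

```python
import numpy as np

def leaf_effects(leaf_id, w, y):
    """Difference in mean outcome (treated minus control) within each leaf."""
    leaf_id, w, y = np.asarray(leaf_id), np.asarray(w), np.asarray(y, dtype=float)
    effects = {}
    for leaf in np.unique(leaf_id):
        in_leaf = leaf_id == leaf
        treated = y[in_leaf & (w == 1)]
        control = y[in_leaf & (w == 0)]
        if len(treated) == 0 or len(control) == 0:
            effects[int(leaf)] = float("nan")   # effect not identified in this leaf
        else:
            effects[int(leaf)] = float(treated.mean() - control.mean())
    return effects
```

Counting how many of the returned values are negative gives the kind of summary used above, where the sign of each leaf's coefficient indicates the direction of the crisis effect in that subgroup.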
Given the conditions and particularities of each company, the financial crisis affected the various companies differently, since the effect of the treatment, calculated using the F test, is different in each one of the leaves. It also warrants mentioning that if no division occurred on a specific variable, this does not mean that the variable is irrelevant. There are various ways to choose a subsample with a wide variety of treatment effects, which can be high or low.
The general mean effect (mean of the variables) can be observed in Table 6. The sector variables, as highlighted, were the ones that presented a mean treatment close to 0 for the various leaves of the tree, indicating lower heterogeneity. Basic materials (D_BM) and Cyclical consumption (D_CC) stand out as the most affected sectors, having the most relevance at times of crisis and being the most predominant sectors in terms of company bankruptcies after the crisis period. Companies that operate in sectors such as Utilities, Financial, Telecommunications, Energy, Health, Noncyclical consumption, and Technology are the least affected by the financial crisis, possibly due to the need for the items produced. Regarding the variables used, the most affected would be Net equity, EBITDA, EBIT, Operating income, Income after taxes, and Retained earnings. As expected, the Profit and Net equity variables had negative effects, with treatment means lower than 0, and Retained earnings stands out with the lowest coefficient.
Due to the size of the estimated tree, which would be illegible in this document, it was not possible to incorporate the figure, but the main segregation point is the sector the companies belong to. The first division is on the Basic materials (D_BM) sector and, for smaller companies in terms of assets (LN_TA < e^12.238), the next division is on Retained earnings. For companies that do not belong to the Basic materials sector (D_BM < 0.5), the next partition is on Total Assets (LN_TA), where, for those with LN_TA greater than e^12.238, the segregation is on the Cyclical consumption (D_CC) sector, highlighting that bigger companies tend to be less affected, presenting a high volume of subdivisions.
Characteristics such as Total liquid receivables (TRN_I) were shown to be relevant, given the need for an increase in company cash flows, especially at times of recession. Companies with TRN_I, for example, greater than 16% would tend to have bankruptcy points, depending on their size (LN_TA) and volume of debt (TL_I).
In line with what Giordani et al. (2014) presented, company size was relevant in the main partitions found, with the effect dampened for companies with a high volume of assets, since smaller companies tend to be more prone to bankruptcy. There is also the possibility of more benefits and government interventions, aiming to limit the unemployment generated by large company bankruptcies.
Another important variable would be Net sales, converging with one of the indicators proposed by Altman (1968), showing that companies with more capacity to generate revenues present fewer problems in crisis periods. The liquidity variables were also relevant, as well as the profitability indicators.

CFs
CFs are therefore an adaptive and efficient method for estimating parameters that can be defined by local conditions, such as the CATE. The predictions of the CF are mean causal tree estimates; that is, at least two causal trees are estimated and then the trees are combined, generating the CF estimates. The weights found in each one of the leaves of the causal trees provide greater reliability when there are many important dimensions and, being adaptive, make the estimates more robust in the face of company heterogeneity.
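The idea that forest predictions are averages of causal tree estimates can be illustrated with a deliberately crude sketch. Everything here is an assumption made for illustration: each "tree" is a single split on a randomly chosen variable at its median, the split point is chosen on one half of a subsample and the effects are estimated on the other half, and the forest simply averages the tree-level effects. It is not the splitting criterion or the weighting scheme used in the study, and the names `fit_honest_stump` and `forest_cate` are hypothetical.

```python
import numpy as np

def fit_honest_stump(X, w, y, rng):
    """One-split tree: the cut is chosen on one half, effects are estimated on the other."""
    n, p = X.shape
    idx = rng.permutation(n)
    split_half, est_half = idx[: n // 2], idx[n // 2:]
    j = int(rng.integers(p))                       # random candidate variable
    cut = np.median(X[split_half, j])              # cut point uses the splitting half only
    xe, we, ye = X[est_half, j], w[est_half], y[est_half]
    effects = []
    for side in (xe <= cut, xe > cut):
        treated, control = ye[side & (we == 1)], ye[side & (we == 0)]
        ok = len(treated) > 0 and len(control) > 0
        effects.append(treated.mean() - control.mean() if ok else np.nan)
    return j, cut, effects

def forest_cate(X, w, y, X_new, n_trees=200, frac=0.5, seed=0):
    """Average the leaf-level effects of many subsampled one-split trees."""
    rng = np.random.default_rng(seed)
    n = len(y)
    preds = np.full((n_trees, len(X_new)), np.nan)
    for t in range(n_trees):
        sub = rng.choice(n, size=int(frac * n), replace=False)   # subsample without replacement
        j, cut, (left, right) = fit_honest_stump(X[sub], w[sub], y[sub], rng)
        preds[t] = np.where(X_new[:, j] <= cut, left, right)
    return np.nanmean(preds, axis=0)               # forest prediction = mean over trees
```

Because the splitting variable is drawn at random here, trees that split on an irrelevant variable pull the averaged estimate toward the overall mean effect; a data-driven splitting rule, as used by causal trees, reduces that dilution.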
By predicting the CATE estimates and their variance for each observation, little variability is found, with a general mean close to 0 (Table 7) on the Predictions and Estimated variance lines. The term "Biased error" on the corresponding line indicates that the error is only due to the variability of the data sample; that is, it represents the error that would be expected from a forest containing an infinite number of trees. This shows the consistency of the estimates, with an error close to 0. Based on the predictions of the test set, we estimated the predictions for the validation sample in Table 7. As expected, the estimates presented very small variations, all close to 0, indicating that the model fits the parameters and the data well. Therefore, the results converge toward a greater predictability possibility, as well as treating the characteristics of the companies analyzed homogenously. A reduction in the maximum value of the estimated variance is also found, from the previous threshold of 2.98 to 0.64. The Biased error variable does not appear, since it was tested in the validation sample.
The most used variables in the partition of the tree can be seen in Table 8. However, a low frequency of use in the partitions should not be taken to mean that a variable is irrelevant. Observe that the frequency of the sector variable D_BM is 0.2%, but the main partition of the tree is found in that variable. In the subpartitions, the Gross profit (GP_I) variable was the one that presented the highest frequency when the tree was divided, with approximately 27% of the appearances. The Accounts payable (AP_I) variable is relevant in the process of determining the bankruptcy of the companies, as it directly affects their cash flows, as well as their credibility. It also warrants mentioning that if two variables are highly correlated, there may be partitioning in one of the variables, but not in the other. However, if one is removed, the subdivision can occur in the one that was left, keeping the definitions in each leaf unaltered.
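The contrast between how often a variable is used and where in the tree it is used can be made explicit by comparing a raw split frequency with a depth-discounted one. The sketch below is only an illustration of that point: the weighting `depth ** -decay`, the function name `split_importance`, and the example splits are hypothetical and are not the measure reported in Table 8.

```python
from collections import defaultdict

def split_importance(splits, decay=2.0):
    """Raw and depth-discounted split shares.

    splits: iterable of (variable_name, depth) pairs, with depth 1 for the root.
    """
    raw = defaultdict(float)
    weighted = defaultdict(float)
    for name, depth in splits:
        raw[name] += 1.0
        weighted[name] += depth ** -decay     # deeper splits count for less
    n_splits = sum(raw.values())
    total_weight = sum(weighted.values())
    raw_share = {k: v / n_splits for k, v in raw.items()}
    weighted_share = {k: v / total_weight for k, v in weighted.items()}
    return raw_share, weighted_share
```

For a toy tree with one root split on D_BM and three deeper splits on GP_I, the raw share favors GP_I while the depth-discounted share favors D_BM, which mirrors the observation that a rarely used variable can still define the main partition.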

CONCLUDING REMARKS
The results indicated that there are several relevant variables that are not normally included in bankruptcy analysis and prediction models. The Net sales (NS_I) variable, as in Altman (1968), continues to be relevant. It warrants mentioning the importance of including variables that indicate the operating sector. It could be speculated that there are sectors that are more prone to bankruptcy, especially at times of crisis. In this research, the most affected was Basic materials (D_BM), which includes chemical, mineral exploration, and environmental (paper, wood, and containers) companies. Among companies that do not belong to D_BM, another highly affected sector would be Cyclical consumption (D_CC) (automobiles, construction material, domestic utensils, hotels, production, and entertainment).
We also observed the presence of heterogeneity among the companies, which in many cases are treated as identical. The debt ratios, for example, are treated in linear models as similar among companies, but they are not, given differences in size and in bargaining capacity with suppliers and the government, among others.
Smaller companies can also have less capacity for obtaining credit, requiring managers to hold larger amounts of cash or equivalents to keep operating. As a result, they tend to present higher liquidity indicators. Depending on the segment, companies can hold greater amounts of fixed assets, which reduces liquidity ratios; on the other hand, they present larger volumes of depreciation. These characteristics should be taken into consideration in the treatment or intervention, especially in crisis periods, and it is up to the interventionists to adopt the best strategy for each company.
One limitation of this methodology is the need for a quasi-experimental approach, which requires data from before and after a specific phenomenon. An analysis that did not depend on such an event would provide a greater academic contribution. It is suggested that future studies explore the unobserved characteristics of companies using other methodologies, addressing, for example, the intertemporal impact on companies and on the variables, as the methodology proposed here does not address such effects and their magnitudes.