Ex-ante moral hazard: empirical evidence for private health insurance in Brazil

Abstract This paper explores the existence of ex-ante moral hazard in private health insurance in Brazil. Before the advent of illness, insured individuals have no incentives to seek preventive care if it is not previously contractible. The data set comprises longitudinal administrative records of health care utilization from a Brazilian employer-sponsored health insurance plan. The empirical strategy is based on an exogenous and anticipated shock in health insurance coverage not associated with health conditions. The results show an increase of up to 17% in medical visits and 22% in diagnostic tests due to the loss of health insurance. Medical visits start to increase five months before the individual leaves the health insurance pool, reaching their peak two months prior to exit. For diagnostic tests, the increase was observed only in the last two months before the loss of health insurance coverage.


Introduction
Moral hazard is a well-known fact associated with consumer behaviour in the context of health insurance coverage (Newhouse, 1993; Arrow, 1963; Pauly, 1968; Zeckhauser, 1970). Two distinct types of moral hazard are observed in individual behaviour: ex-ante and ex-post moral hazard (Zweifel & Manning, 2000). Ex-post moral hazard is related to the overuse of health care services observed in the presence of health insurance once an illness event has occurred. The reduction in the marginal cost perceived by insured individuals results in a higher level of health care use compared to the absence of health insurance.
In contrast, ex-ante moral hazard refers to situations before the advent of illness, when individuals have no incentives for preventive care. Having health insurance coverage can lead individuals to underinvest in preventive care if such care is not contractible. This underinvestment can be interpreted as ex-ante moral hazard (EAMH), as it usually affects the future costs of illness. Several factors may affect the individual choice of preventive care, for instance, opportunity costs, risk aversion, or the effect of ill health on the utility of such care. Ehrlich and Becker (1972) distinguish two sources of ex-ante moral hazard: self-protection and self-insurance. Self-protection refers to the ability of individuals to affect the probability of illness through the avoidance of risky behaviours (e.g., smoking, drinking) and the adoption of healthy ones (healthy diet, exercise), whereas self-insurance relates to an individual's ability to affect the future costs of disease through, for example, regular doctor visits, dental care, immunization, and check-ups. Some empirical evidence has already shown the presence of EAMH. Zweifel and Manning (2000) explore EAMH in contexts of health insurance, focusing on the opportunity costs of preventive care effort and individual risk aversion as the elements that can determine the amount of preventive care chosen by insured individuals. Opportunity costs are usually related to consumer time constraints, and therefore the wage rate is the shadow price of preventive care. The authors extend the notion of opportunity costs by including the generosity of coverage of preventive care. The less one pays in out-of-pocket expenditure, the higher the consumption of preventive care. Less generous insurance is associated with decreased use of preventive medical services (secondary prevention), which contradicts the pure intuition that less protection should cause an increase in prevention (Lilliard et al., 1986).
Dave and Kastner (2009), using the Health and Retirement Study, examined the effect of health insurance on the health behaviours of the elderly population. Specifically, the authors investigated EAMH by comparing changes in healthy activities among uninsured and insured individuals before and after Medicare enrolment. More precisely, they performed a difference-in-differences analysis comparing the utilization of preventive health care before and after enrolment in Medicare for two groups of individuals: those who were uninsured before age 65 and those who were already insured. The authors found evidence that Medicare enrolment increased unhealthy activities such as smoking and alcohol use for males, pointing to evidence of ex-ante moral hazard.
On the other hand, results for utilization of doctor visits and hospital stays showed that Medicare enrolment increased the probability of seeing a doctor. These results may seem to contradict the presence of EAMH related to self-insurance, but in fact this phenomenon results from an access effect: uninsured individuals had restricted access to health care services before Medicare enrolment. Card and Maestas (2008) documented the same findings.
De Preux (2011) compared changes in physical exercise, smoking, and drinking among uninsured and insured individuals before Medicare eligibility using the same database but incorporating the effect of the anticipation of Medicare benefits. The author extends the analysis of Dave and Kastner (2009), introducing the definition of anticipatory EAMH. According to de Preux (2011), as individuals anticipate the enrolment in Medicare, they tend to change their current preventive care choices just before age 65. Therefore, EAMH is related to two mechanisms. First, Medicare coverage reduces the cost of future illness care and consequently discourages healthy habits; and second, the anticipation of Medicare enrolment should reduce the future benefits of current prevention. The author found that before Medicare enrolment, uninsured individuals are less likely to exercise than are insured ones, pointing out the existence of anticipatory EAMH associated with self-protection. That is, uninsured individuals choose to reduce self-protection because they anticipate they will receive Medicare coverage.
Despite this evidence, there is little work related explicitly to self-insurance. This lack of evidence is due in part to endogeneity issues and to difficulties in empirically distinguishing the coverage effect on the use of preventive care from the access effect. Individuals who choose to buy health insurance may have specific attributes that affect health care utilization.
Theoretically, the presence of EAMH relies on lower incentives for preventive care due to the coverage of future illness events (coverage effect). In addition, for uninsured individuals, health insurance enrolment means a significant change in access to health care services (access effect). Thus, while insured individuals tend to use fewer health care services due to the coverage effect, the uninsured tend to use more health care services due to the access effect. Therefore, a comparison between uninsured and insured individuals may not be enough to distinguish the coverage effect from the access effect.
This study aims to estimate EAMH using new data from an employer-sponsored health insurance plan in Brazil provided by the Sewage Company of the State of São Paulo (Companhia de Saneamento Básico do Estado de São Paulo, Sabesp). Here, all Sabesp employees and their enrolled dependents are beneficiaries of a health insurance plan.
The data set comprises longitudinal administrative records of health care utilization during the 2004-2008 period. We take advantage of a specific form of EAMH in a patient's decision to seek preventive care, using an exogenous variation of health insurance coverage as our identification strategy. The expected loss of job due to retirement or dismissal is an exogenous variation of coverage status not associated with health conditions. The anticipation of health insurance loss can affect the consumption of preventive care (self-insurance). More services are chosen to compensate for the underutilization due to EAMH and consequently to reduce the future costs of medical care. In this context, the EAMH as measured in this study is the variation observed in the consumption of preventive care when individuals anticipate the loss of health insurance coverage.
In Brazil, since the regulation of the private health insurance market in 1998 (Law 9.656/1998, articles 30 and 31), in the case of dismissal or retirement of health insurance policyholders, all beneficiaries may remain insured for an additional period of a minimum of six months. In these circumstances, enrollees must formally opt to remain in or exit from the health insurance pool. Because individuals know that they will lose coverage, it is possible to estimate the amount of preventive care that is unused when they are insured. When individuals have health insurance they postpone or even avoid preventive care due to, for example, opportunity costs. Therefore, the increase in the demand for health care observed prior to their loss of health insurance is taken as the ex-ante moral hazard. Specifically, we are dealing with ex-ante moral hazard related to self-insurance, estimated through outpatient care, such as doctor visits and diagnostic tests.

Database
The database used in this study comprises administrative records regarding the utilization of health care services funded by the Sewage Company of São Paulo State (Sabesp) in Brazil. Sabesprev is a health insurance pool managed and provided by Sabesp that bears the major part of the cost of the health insurance premium for Sabesp employees, their dependents, and other relatives. Employee enrolment is mandatory, and the premium payment for employees and their dependents is a uniform wage tax of 2%. For other relatives and retired employees, Sabesp fixes the premium according to age groups. There are three types of contracts: a full plan that is mandatory for active employees; a basic plan that is available for retired employees and their relatives with only shared-room hospital accommodation; and a special plan that is also available for retired employees and their relatives with single-room accommodation. All types of contracts include the same network of providers and the same co-payments.
Originally, administrative records were organized in three separate data sets. The first set comprises monthly health care utilization records for each individual, including beneficiary ID, date of utilization, type of procedure, provider, and expenditures. The second data set contains detailed information about inpatient care, including hospitalization period, diagnosis, and provider information, which allows us to distinguish ambulatory from inpatient procedures. Finally, the last set has beneficiary information, such as birth date, gender, relationship to Sabesp employee, insurance enrolment, and exit date. This set also includes the reason for dismissal if the beneficiary is no longer enrolled.
In this study, we organized these administrative records in a monthly longitudinal data set for the 2004-2008 period with all beneficiaries of the Sabesprev health insurance plan. Because beneficiaries can enrol or leave the risk pool at any time, we built an unbalanced panel. New entries in the risk pool may occur because Sabesp can hire new employees, whereas exit may occur due to firing, retirement, or death. Our data set includes all beneficiaries who were enrolled prior to January 2004 and followed for at least 13 months. Therefore, our data had 49,106 individuals, of whom 46,929 remained enrolled in the health insurance pool during the entire period, and only 2,177 left the pool for any of the reasons mentioned above.
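The sample restriction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Beneficiary` record, its field names, and the month index (1 for January 2004 through 60 for December 2008, with values of zero or below meaning enrolment before January 2004) are hypothetical and are not taken from the Sabesprev files.

```python
from typing import NamedTuple, Optional


class Beneficiary(NamedTuple):
    """Hypothetical record layout, assumed for illustration only."""
    beneficiary_id: str
    enrol_month: int            # month index; <= 0 means enrolled before January 2004
    exit_month: Optional[int]   # last month observed in the pool; None if never exits


def months_followed(b: Beneficiary, last_month: int = 60) -> int:
    """Number of months the beneficiary is observed inside the 2004-2008 window."""
    end = last_month if b.exit_month is None else min(b.exit_month, last_month)
    start = max(b.enrol_month, 1)
    return max(end - start + 1, 0)


def in_sample(b: Beneficiary, min_months: int = 13) -> bool:
    """Keep beneficiaries enrolled before January 2004 and followed >= 13 months."""
    return b.enrol_month < 1 and months_followed(b) >= min_months


people = [
    Beneficiary("a", -5, None),  # enrolled early, never exits
    Beneficiary("b", 3, None),   # enrolled after January 2004
    Beneficiary("c", 0, 12),     # exits after only 12 observed months
    Beneficiary("d", 0, 13),     # exits after 13 observed months
]
sample_ids = [b.beneficiary_id for b in people if in_sample(b)]
```

Because exits can happen in any month, the resulting person-month panel is unbalanced, as noted in the text.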
To estimate ex-ante moral hazard, we considered two groups of beneficiaries: those who remained in the risk pool during the five years and those who left the risk pool due to dismissal or retirement.

Empirical strategy
Our empirical strategy explores the presence of an exogenous and anticipated variation in health insurance coverage to estimate ex-ante moral hazard. We argue that the beneficiary exit is exogenous to his/her current health status and may be anticipated by enrollees. As mentioned, beneficiary exit occurs due to the dismissal or retirement of the policyholder. We assume that dismissal is not caused by a shock in health status conditions because labour law in Brazil protects unhealthy workers from being dismissed (Decree-Law 3048/99).
According to the Brazilian Social Security Laws (Giambiagi & Afonso, 2015), employees are entitled to three types of pension benefits managed by the Social Security System (Instituto Nacional de Seguridade Social, INSS): 1) age retirement (60 years for urban females and 65 for urban males); 2) retirement for fulfilment of minimum time employed; and 3) retirement due to disability.
In the first two categories, the retirement benefits requirement is voluntary and, therefore, not usually correlated to contemporaneous changes in the health status of the policyholder. For retirement due to disability, it is clear that there is an association between risk pool exit and employee health status. Unfortunately, in the Sabesprev data set it is not possible to identify the reason for retirement. In the case of withdrawal due to disability, we assume that the health shock occurred before the retirement date. As the judicial process necessary to gain these benefits is very long, the enrollee has to be off work at least 24 months before the exit date. During the period that the enrollee is claiming his/her rights, he/she remains covered. Thus, the health shock should have occurred at least 24 months before the enrollee's exit. Additionally, only 35% of the beneficiaries who left the risk pool are policyholders, and it is reasonable to assume no correlation between the retirement decision of the policyholder and the dependent's health status.
It is worth mentioning that employees and their dependents in any case of risk pool exit can anticipate the loss of health insurance coverage. Retirement is a decision taken by employees and is usually not a fast process. Dismissed workers have to be notified at least 30 days before exit and can remain in the risk pool for at least six months after the dismissal.
The moral hazard estimation considers two health care indicators: the number of doctor visits and the number of diagnostic tests. To a great extent, primary care utilization is a patient's decision, being less associated with previous medical diagnosis or referral. We included diagnostic tests because they follow from medical visits and are part of a routine of preventive care.
To measure moral hazard, we built six dummy variables according to time before leaving the health insurance risk pool: 1 to 6 months. Each dummy variable captures the increase in health care observed for beneficiaries who left the risk pool in the respective month compared to the ones who remained. The comparison of these dummy coefficients allows for analysing how this effect varies over time.
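The construction of the six dummies can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the `d_1` to `d_6` labels, and the month index (1 for January 2004 through 60 for December 2008) are assumptions made for the example.

```python
from typing import Dict, Optional


def months_to_exit_dummies(
    month: int, exit_month: Optional[int], max_lead: int = 6
) -> Dict[str, int]:
    """Return the dummies d_1 .. d_6 for one person-month observation.

    month and exit_month are month indices (assumed: 1 = January 2004).
    exit_month is None for beneficiaries who never leave the risk pool,
    so all of their dummies are zero (they form the comparison group).
    """
    dummies = {f"d_{j}": 0 for j in range(1, max_lead + 1)}
    if exit_month is not None:
        lead = exit_month - month  # months remaining until exit
        if 1 <= lead <= max_lead:
            dummies[f"d_{lead}"] = 1
    return dummies


# A beneficiary who exits in month 12, observed in month 10: two months to exit.
row = months_to_exit_dummies(month=10, exit_month=12)
```

At most one dummy is switched on per person-month, so each coefficient compares utilization j months before exit with utilization in all other person-months, conditional on the remaining covariates.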
The estimated model is specified as follows: for each individual $i = 1, \dots, m$ and time $t = 1, \dots, n$, $y_{it}$ is the dependent variable regarding health care utilization for individual $i$ at time $t$, $a_i$ is a vector of individual attributes that do not vary over time (for example, sex), $b_t$ is the vector of time dummies, $x_{it}$ is a vector of individual attributes that vary over time (for example, age), $\alpha_i$ are the non-observable characteristics of individual $i$ (for example, health status), and $\mu_{it}$ is the idiosyncratic error. Chart 1 describes all independent variables included in the model.
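One additive form consistent with the definitions above is sketched here. The coefficient vectors $\beta$, $\gamma$, and $\delta$ are symbols assumed for this illustration; the source text does not name them, and in the count models estimated in the paper the covariates enter through a log link rather than linearly.

```latex
y_{it} = a_i \beta + b_t \gamma + x_{it} \delta + \alpha_i + \mu_{it},
\qquad i = 1, \dots, m, \quad t = 1, \dots, n
```

Under this reading, the six months-before-exit dummies are part of the time-varying vector $x_{it}$, and their coefficients are the parameters of interest.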

Variable: Description

Time before leaving the pool: Number of months until leaving the pool, specified by a set of six dummy variables. Each dummy variable $d_{itj}$ assumes a value equal to one if individual $i$ at time $t$ is $j$ months away from leaving the risk pool, and zero otherwise; $j$ varies from 1 to 6 while $t$ varies from 1 to 60. These are our variables of interest, which measure for each indicator the average increase observed in health care utilization $j$ months before exiting the risk pool.

Family size: Number of beneficiaries in each family enrolled in the Sabesprev pool. Family size can affect opportunity costs to use health care services.

Relationship with the policyholder: Set of three dummy variables $R_{itj}$ regarding the beneficiary's relationship with the policyholder. Three types of household membership are available: proper policyholder, aggregate, or dependent. Each dummy variable $R_{itj}$ assumes a value equal to one if individual $i$ at time $t$ has relationship $j$ with the policyholder, and zero otherwise. Reference category: policyholder.

Sex: Dummy variable $S_i$ that assumes a value equal to one if the beneficiary is a man, and zero otherwise. Sex is a proxy for individual risk.

Type of contract: Set of three dummy variables $C_{itj}$ regarding each type of contract: full, basic, and special plan. The reference category is full plan.

Time: Set of 60 dummy variables $T_t$, one for each month of the 2004-2008 period. This allows controlling for aggregate shocks that can affect health care utilization. The reference category is January 2004.

To correctly estimate model (1), it is worth taking into account two characteristics of our data. First, as clarified by Cameron and Trivedi (2013), we cannot assume a normal distribution of our outcome of interest as it is a limited-range dependent variable (number of doctor visits and diagnostic tests). A commonly applied count data model for such a situation is the negative binomial, which accommodates overdispersion. Second, in a longitudinal data set with count responses correlated within subjects over time, it is crucial to deal with the violation of the independence assumption. It is reasonable to assume that the utilization of health care has an observed individual pattern. As emphasized by Hilbe (2011), the generalized estimating equations (GEE) approach, or population-averaged model, adds at least one extra parameter to the linear predictor to inform how observations within panels are to be construed.
To consider these issues, in this study we estimate a negative binomial model by the GEE method, adopting three approaches to correct the coefficient variance based on different correlation structures: AR (autoregressive), exchangeable, and unstructured.
The GEE is a method initially proposed by Liang and Zeger (1986) and constitutes an extension of the Generalized Linear Model (GLM) applied to panel data. The method evaluates the relationship between the response and predictor variables in a population context, and it is known as the marginal model. The essential characteristic of its specification is factoring the variance function to include a parametric correlation structure in the panel.
The GLM of $y_{it}$ with covariates $x_{it}$ can be expressed as:
$$ g\{E(y_{it})\} = x_{it}\beta, \qquad y_{it} \sim F \qquad (2) $$
where $g$ is the link function and $F$ is the distribution of the dependent variable, called the family. Each pair of link and family functions defines a different model. In our case, we specify the family function as the negative binomial distribution and the link as the log function.
To better understand the GEE algorithm proposed by Liang and Zeger (1986), equation (3) specifies the variance function of the general model, estimated by GLM:
$$ V(\mu_i) = D\big(V(\mu_{it})\big)^{1/2} \, R \, D\big(V(\mu_{it})\big)^{1/2} \qquad (3) $$
where $V(\mu_{it})$ is the GLM variance function defined in terms of the mean. In the negative binomial, the variance function is $\mu + \alpha\mu^2$. $D$ is a diagonal matrix of the variance functions of $y_i$, and $R$ is the specified correlation matrix. If we assume a zero correlation between subsequent measures of a subject within panels, we have $R = I$. In this case, the correlation structure is independent.
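To make the role of the working correlation matrix concrete, the sketch below builds the within-subject covariance as $D^{1/2} R D^{1/2}$, the standard GEE factorization, using the negative binomial variance function $\mu + \alpha\mu^2$. The function names, the example means, and the values of $\alpha$ and $\rho$ are assumptions chosen for illustration; the unstructured case is omitted because its entries are estimated from the data rather than generated by a formula.

```python
import math
from typing import List

Matrix = List[List[float]]


def nb_variance(mu: float, alpha: float) -> float:
    """Negative binomial variance function: mu + alpha * mu^2."""
    return mu + alpha * mu ** 2


def independent(n: int) -> Matrix:
    """Identity working correlation: no within-subject correlation."""
    return [[1.0 if s == t else 0.0 for t in range(n)] for s in range(n)]


def exchangeable(n: int, rho: float) -> Matrix:
    """Same correlation rho between any two measures of a subject."""
    return [[1.0 if s == t else rho for t in range(n)] for s in range(n)]


def ar1(n: int, rho: float) -> Matrix:
    """First-order autoregressive: correlation rho^|s - t| decays with the lag."""
    return [[rho ** abs(s - t) for t in range(n)] for s in range(n)]


def working_covariance(mu: List[float], alpha: float, corr: Matrix) -> Matrix:
    """Within-subject covariance D^{1/2} R D^{1/2} for one subject."""
    sd = [math.sqrt(nb_variance(m, alpha)) for m in mu]
    n = len(mu)
    return [[sd[s] * corr[s][t] * sd[t] for t in range(n)] for s in range(n)]


# Hypothetical subject with three monthly observations and mean 2 visits.
mu_i = [2.0, 2.0, 2.0]
cov_ind = working_covariance(mu_i, alpha=0.5, corr=independent(3))
cov_ar1 = working_covariance(mu_i, alpha=0.5, corr=ar1(3, rho=0.5))
```

With the independent structure the off-diagonal entries vanish, while the AR(1) structure leaves a covariance between months that shrinks as the months grow further apart.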
In the exchangeable structure, it is assumed that the correlation between any two measures of a subject within panels is always the same. On the other hand, in the unstructured approach, all correlations are considered different and are estimated from the data. In the AR correlation structure, the correlation between two measures of a subject decays as the time between them increases.
Despite its contribution, the GEE model does not take into account the individual-specific unobserved effect. Since the seminal article by Hausman et al. (1984), there has been some application in the literature of fixed-effects models for the negative binomial. However, as Allison and Waterman (2002) discuss, this model does not have the usual property of the fixed-effects method of controlling for all stable covariates. According to the authors, this occurs due to the decomposition of the overdispersion parameter instead of the usual average decomposition (Allison & Waterman, 2002). Although they present alternatives such as the conditioning of the negative multinomial, these are not reliable (Hilbe, 2011).
In this study, as an additional step, we opted to estimate a random-effects model, a negative binomial with a beta-distributed random effect, following Hausman et al. (1984). In this model, the conditional expected value and the variance depend on the exogenous covariates $x_{it}$ at time $t$ and on a random beta-distributed variable. In this case, the estimated coefficients are consistent only if the random effect is not correlated with the exogenous variables. The random individual-specific effect can be interpreted as different attitudes regarding health protection, and the previous assumption can be violated. However, the variation of covariates among individuals is more relevant than the variation within individuals; in this scenario, the fixed effect can generate inconsistent estimators (Chamberlain, 1984).

Results
Table 1 presents the descriptive statistics of observable individual attributes considering the status of health insurance coverage. The first group regards individuals who remained in the risk pool during the 2004-2008 period, and the second one those who lost health insurance coverage in 2005 due to dismissal or retirement of the policyholder. As some attributes vary over time, it is only possible to compare individual characteristics between the two groups for each year separately. In 2005, both groups were quite similar regarding age, gender, relationship to policyholder, and type of contract: 50% of beneficiaries were men, on average they were 35 years old, almost 60% were dependent, and around 90% had full coverage. The only differences between active beneficiaries and the ones who exited the risk pool concern the family size and type of health insurance contract. It is expected that type of contract differs between active and non-active beneficiaries because the full plan is available only for Sabesp policyholders who are still working. Family size is slightly larger among beneficiaries who remained in the risk pool during the entire period.
Graphs 1 to 4 present the average number of doctor visits and diagnostic tests received by individuals according to the number of months before exiting the risk pool. For each graph, the X-axis shows the number of months before the departure, and zero means the exit date. The shaded area corresponds to the 95% confidence interval for health care indicators.
Note: Graphics estimated using a kernel-weighted local polynomial regression for each indicator by number of months before the risk pool exit.
Graphs 1 and 2 show the kernel-smoothed health care utilization considering only the 12 months before the exit, whereas, in Graphs 3 and 4, this period extends to 36 months. It is noteworthy that for both health care indicators, the number of procedures grew as the risk pool exit became closer. These results suggest that individuals tend to intensify preventive health care utilization as the loss of health insurance coverage approaches. Additionally, this increase starts 10 months before the risk pool exit.
Table 2 presents the results for regression models estimated for medical visits. The first three columns display the population-averaged model using the different correlation structures of the variance matrix (AR1, exchangeable, and unstructured), whereas the fourth corresponds to the random-effects model. Our outcomes of interest are the dummy variables regarding the number of months before the risk pool exit (time dummies). To better understand time dummy effects, semi-elasticities for each health care indicator are reported in Table 2. The semi-elasticity coefficient is the percentage increase observed in health care utilization for enrollees who lost health insurance coverage in comparison to beneficiaries who remain in the risk pool.
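With a log link, a dummy coefficient $\beta$ multiplies the expected count by $e^{\beta}$, so one common way to express it as a percentage change is $100\,(e^{\beta} - 1)$. The sketch below applies that conversion; the function name and the coefficient value are hypothetical, and the text does not state exactly how the reported semi-elasticities were computed.

```python
import math


def semi_elasticity(beta: float) -> float:
    """Percentage change in the expected count implied by a dummy coefficient
    in a log-link count model: 100 * (exp(beta) - 1)."""
    return 100.0 * (math.exp(beta) - 1.0)


# Hypothetical coefficient chosen so that the implied increase is 17%.
beta_two_months = math.log(1.17)
increase_pct = semi_elasticity(beta_two_months)
```

For small coefficients the conversion is close to $100\beta$, but the gap widens as $\beta$ grows, which is why the exponentiated form is usually preferred for dummy variables.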
The main findings confirm the presence of ex-ante moral hazard for both types of health care utilization. The analysis of time dummies shows that these effects start at different months before the risk pool exit, depending on the kind of care. These results are found independently of the estimation method, pointing to the robustness of the analysis.
The increase in medical visits starts four months before the risk pool exit, when the dummy coefficient turns statistically significant. The highest effect is observed two months before the departure, and the magnitude of ex-ante moral hazard varies from 14% to 17% depending on the model estimated. It is worth noting that one month before losing health insurance coverage, the number of medical visits proves similar among both groups (Table 2).
Compared to medical visits, the utilization of diagnostic tests increases closer to the exit date (Table 3). For these procedures, the effect is observed only in the last two months before the risk pool exit. The semi-elasticity coefficients are higher than those estimated for medical visits, varying from 17% to 22%. As can be seen, there is a time lag between the increase in the frequency of medical appointments and the increase in diagnostic tests. This behaviour reflects the need for a physician referral to perform diagnostic tests.
To compare the pattern of time coefficients among the several models estimated, smoothed curves are shown in Graphs 5 and 6. It is noteworthy that the same pattern is observed independently of the estimation method for both medical visits and diagnostic tests.
Age and gender dummies present the typical pattern observed along the life cycle (Yamamoto, 2013). The utilization of medical visits is higher for the first five years, decreasing until age 15, and then increasing smoothly until age 75. This pattern is observed for all estimation methods (Graph 7). Male enrollees receive around 35% fewer doctor visits and diagnostic tests than women do. The use of diagnostic tests increases smoothly with age, showing a small decrease only for individuals over 80 years old (Graph 8). The same behaviour is observed for all methods.

Discussion
To the best of our knowledge, this is the first study to explore the existence of ex-ante moral hazard as a result of a change in the health insurance coverage status in Brazil. Previous studies usually analyse EAMH in the context of obtaining health insurance coverage through Medicare enrolment (de Preux, 2011; Card & Maestas, 2008; Dave & Kastner, 2009).
This study implements an empirical strategy similar to de Preux (2011), with at least three main advantages. First, its database contains a population with different age groups, rather than only the elderly, as is the case with other studies. The change of health insurance coverage status occurs for all beneficiaries and not just for the policyholder as in the Medicare experiment.
Because Sabesp is an employment-based health benefit, all employees and their dependents are automatically included in the risk pool when they are hired. Therefore, our data avoids the self-selection bias of health insurance adhesion. Also, the longitudinal design allows robust estimation of EAMH because it takes into account unobserved heterogeneity and is a more powerful strategy to identify causality.
The second advantage refers to the mechanism that identifies EAMH. In both exercises, the assumption is that individuals anticipate the change in health insurance coverage. Differing from de Preux (2011), this study refers to a loss of health insurance coverage, and its time window occurs just before the change of insurance coverage. Therefore, it is possible to guarantee that all beneficiaries have equal access to health care services and providers during the entire period, which allows for controlling the access effect.
Finally, and importantly, in this study the anticipatory EAMH relates to self-insurance, not self-protection. As there is no effective change in the health insurance coverage and consequently no access effect, it is possible to measure EAMH based on health care utilization instead of lifestyle.
The main findings showed that individuals tend to use fewer preventive health services when they are insured by Sabesp. These results are in accordance with Ehrlich and Becker (1972), who showed that when prevention is not contractible, health insurance coverage reduces self-insurance due to indirect costs such as displacement and time costs. It applies directly to the case of Sabesp because there is no enforcement for the use of preventive care by the beneficiaries.
The results revealed an increase of up to 17% in medical visits and 22% in diagnostic tests. Medical visits begin to increase from the fifth month before leaving the risk pool, with a peak at two months before exit. For diagnostic tests, the higher increase was observed at one and two months before the loss of health insurance coverage. This result probably occurs because the use of diagnostic tests requires a referral by providers and is usually not a patient's decision.
These results are relevant because the presence of EAMH may affect future costs of illness. Thus, one policy recommendation is the introduction of regulatory mechanisms to induce mandatory preventive care. There is substantial evidence for the conclusion that preventive care saves future costs of illness (Cohen, Neumann, & Weinstein, 2008).