Socioeconomic Status, Birth Weight, Maternal Smoking during Pregnancy and Adiposity in Early Adult Life: an Analysis Using Structural Equation Modeling

We describe here an example of structural equation modeling in epidemiology. The association between birth weight and adiposity in early adult life, adjusted for the number of cigarettes smoked during pregnancy and socioeconomic status at birth, was evaluated. Data involving 2,063 adults from the 1978/1979 Ribeirão Preto cohort study were used. Adiposity was measured by body mass index (BMI), waist circumference (WC) and the sum of triceps and subscapular skinfolds (STSS). Models were submitted to maximum likelihood estimation, separately for men and women. Birth weight had a small and significant effect on adiposity in men (standardized coefficient, SC = 0.08) and women (SC = 0.09). Smoking during pregnancy did not influence adiposity in men (SC = 0.004), but its effect was marginally significant in women (SC = 0.07; p = 0.056). Socioeconomic status at birth had a small and positive effect on adiposity in men (SC = 0.08) and a moderate and negative effect in women (SC = -0.16). In this young adult population, BMI, WC and STSS used alone or in combination were valid estimators of body adiposity.

Introduction
Birth weight has been associated with the risk of obesity in adolescence and adult life. Most studies conducted in developed countries have reported a positive association between birth weight and body mass index (BMI) in adult life 1,2,3,4. An association between maternal smoking during pregnancy and increased BMI in adolescence and adult life has also been described in some studies 5,6, but reports are contradictory 7. Associations between birth weight, maternal smoking during pregnancy and BMI have been investigated very little in developing countries 8,9,10.
Although BMI has been used for the evaluation of obesity and its risk factors because it is simple and easy to apply, there is still some disagreement as to whether BMI is a good indicator of adiposity. Since BMI reflects total body mass (fat, lean and bone mass), it is possible that some individuals with BMI in the normal range have excess body fat 11. Other studies have suggested combining BMI with other variables that measure body fat, such as waist circumference and skinfold thicknesses, when evaluating the risks associated with cardiovascular diseases, in order to reduce the limitations of BMI 12,13,14.
The use of new statistical methods that may overcome the limitations of those currently employed for the analysis of data from observational studies may increase the validity of the findings.
Cad. Saúde Pública, Rio de Janeiro, 26(1):15-29, jan, 2010
Among these new methods, structural equation modeling has been recently introduced for the evaluation of causal associations in epidemiology 15,16,17. This method consists of the simultaneous estimation of a series of multiple linear regression equations and has some advantages over linear regression. First, a theoretical model of hypothetical relationships between variables is submitted to a test in which distal, intermediate and proximal variables are arranged hierarchically in a causal chain 18. Only if this theoretical model fits the data will the association studied be analyzed.
Structural equation modeling allows for measurement error, for modeling correlations between explanatory variables and for estimating indirect effects (effects of an explanatory variable on the outcome mediated by one or more intervening variables). If a variable is measured in an imperfect manner, instead of working with a single indicator variable it is possible to work simultaneously with more than one measure of the same construct, with the creation of a latent variable. A latent variable is a non-observable variable which is deduced from covariances between two or more indicator variables. Latent variables are free of measurement error (consisting of a random error plus singularity, which is the portion of variance present in the variable that measures something different from the dimension of interest) 19. Thus, only the common variance (variance related to the dimension of interest) shared by different indicators of a latent variable remains, a fact permitting the estimation of effects free of bias caused by measurement error. In contrast to multiple regression, adjustment for confounding factors is more complete because indirect effects are taken into account and it is also possible to control for common causes 20. The use of latent variables also allows the researcher to deal with the problem of collinearity between explanatory variables 20. The disadvantages of some estimation methods in structural equation modeling are the assumptions of multivariate normality, continuous variables and linear relationships 18,19.
The objective of the present study was to describe an example of the use of structural equation modeling in epidemiology and compare the results with the conventional linear regression model. This approach was used to determine the association between birth weight and adiposity in early adult life, adjusted for maternal smoking during pregnancy and socioeconomic status at birth. Adiposity, measured by BMI or waist circumference or the sum of triceps and subscapular skinfolds (STSS) as continuous indicators, was analyzed in separate models, and was also modeled as a latent variable in order to determine whether the inclusion of measures of adiposity other than BMI adds information for a more valid estimation of body adiposity.

Study design
Between June 1, 1978 and May 31, 1979, a total of 9,067 live births from eight maternity hospitals in Ribeirão Preto, São Paulo State, Brazil, corresponding to 98% of all births in the city, were examined and their mothers were interviewed. Immediately after birth and after verbal consent was obtained, the mothers responded to a standardized questionnaire from which the following variables were extracted: number of cigarettes smoked during pregnancy, father's occupation, family income (as a multiple of the minimum wage), and maternal schooling. The newborns were weighed immediately after birth using standardized techniques 21.
Among the births recorded, 2.5% were discharged before the interview and less than 1% of mothers refused to participate in the study, with 6,484 subjects being eligible for follow-up after exclusion of non-residents (n = 2,094), multiple births (n = 146) and deaths up to 20 years of age (n = 343).
A total of 5,665 participants (87.4% of the eligible subjects) were located for the fourth follow-up of this study when they were 23 to 25 years old. The city was divided into four geo-economic regions according to the income of the head of the household based on census data: poor, middle poor, middle rich and rich. Contact was established by telephone or letter with one-third of the eligible subjects of each geo-economic region. In the case of refusal (209 cases), no contact because the subject was in prison (34 cases) or failure to attend the scheduled interview (431 cases), the next subject on the list was contacted. Thus, 705 subjects were replaced and 2,063 adults participated in the fourth phase of the study, corresponding to 31.8% of the original sample. Lower follow-up rates were observed for men (p = 0.004), subjects whose parents had a less qualified occupation (p < 0.001) and mothers of low educational level (p < 0.001). There were no differences in follow-up rate according to birth weight (p = 0.618). Details of the methods have been published elsewhere 21,22.
The participants attended an interview at the Hemocentro of Ribeirão Preto and, on this occasion, weight, height, waist circumference and triceps and subscapular skinfold thickness were measured using standardized techniques 23 .
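The sample sizes and percentages reported in this section can be checked with simple arithmetic. The short Python sketch below is only an illustration: the variable names are hypothetical, and it assumes that the exclusions are subtracted from the 9,067 live births and that both percentages use the eligible subjects as the denominator.

```python
# Counts reported in the study design section.
live_births = 9067
non_residents = 2094
multiple_births = 146
deaths_before_20 = 343

# Assumption: eligible subjects = live births minus the three exclusions.
eligible = live_births - non_residents - multiple_births - deaths_before_20

located = 5665       # located for the fourth follow-up
participants = 2063  # adults who took part in the fourth phase


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


print("eligible:", eligible)
print("located (% of eligible):", pct(located, eligible))
print("participants (% of eligible):", pct(participants, eligible))
```

Under these assumptions the computed values agree with the 6,484 eligible subjects, 87.4% and 31.8% given in the text.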

Theoretical model
Figure 1 shows the theoretical model tested. The response variable is a construct, a latent variable not directly observed, i.e., adiposity in early adult life, represented by an ellipse. This variable is composed of three directly observed continuous variables: BMI, waist circumference and STSS. The explanatory variables are socioeconomic status, birth weight and number of cigarettes smoked during pregnancy. The variable 'socioeconomic status' is also a construct and is composed of three ordinal (directly observed) indicator variables: father's occupation at the time of birth, per capita family income at the time of birth and maternal schooling at the time of birth. The variables birth weight and number of cigarettes smoked during pregnancy are directly observed continuous variables represented by rectangles. In the model, direct and indirect effects are estimated. According to this hypothesis, socioeconomic status, birth weight and number of cigarettes smoked during pregnancy exert a direct effect on adiposity. Socioeconomic status directly influences the number of cigarettes smoked during pregnancy and birth weight. The number of cigarettes smoked during pregnancy has a direct effect on birth weight. In addition, indirect effects are estimated: socioeconomic status affects adiposity through effects mediated by birth weight and by the number of cigarettes smoked during pregnancy. The number of cigarettes smoked during pregnancy exerts an indirect effect on adiposity mediated by birth weight.

Variables
The structural equation model includes both directly observed variables and not directly observed variables, known as latent variables or constructs.
• Socioeconomic status at the time of birth
Socioeconomic status at birth was treated as a latent variable measured using the following indicators: father's occupation, family income and maternal schooling at the time of birth.
• Indicators of socioeconomic status at the time of birth
Father's occupation at the time of birth was classified as follows: 1 = manual unskilled work or unemployed; 2 = manual skilled and semi-skilled work; and 3 = non-manual work. Family income at the time of birth, reported as a multiple of the national minimum wage, was divided into ten categories: 1: ≤ 1; 2: 1.01 to 2;
Maternal schooling at the time of birth was classified as follows: 0 = illiterate; 1 = acquiring literacy by other means or incomplete primary school; 2 = complete primary school; 3 = incomplete secondary school; 4 = complete secondary school; 5 = incomplete high school; 6 = complete high school; 7 = incomplete higher education, and 8 = complete higher education.

• Adiposity of young adults
Adiposity was measured by BMI, waist circumference and STSS, each analyzed as a continuous outcome in a separate model, and was also treated in another model as a latent variable measured by these same three continuous indicators.

• Directly observed variables
Birth weight (in grams) and the number of cigarettes smoked during pregnancy were treated as continuous numerical variables.

• Missing values
Family income was the variable with the largest number of missing answers (n = 371). The other variables presented a small number of missing data: father's occupation (n = 60), cigarettes smoked during pregnancy (n = 52), and maternal schooling (n = 40). No data were missing for birth weight.

• Statistical analysis
Multiple linear regression and structural equation modeling were used for the data analysis. In the multiple regression model, BMI was regressed on maternal schooling, birth weight and the number of cigarettes smoked during pregnancy.
The structural equation model consists of two sub-models: the measurement model, which establishes how the constructs are measured, and the structural model, which analyzes the structural relationships, corresponding to associations between variables. Latent variables are represented by circles or ellipses and observed variables are represented by squares or rectangles. A latent variable is built in the measurement model, in which its indicators are specified. A good latent variable presents convergent validity, showing that its indicators measure the same construct, as indicated by high factor loadings (higher than 0.60). In addition, there should be discriminant validity, i.e., the correlations between indicators should not be excessively high (> 0.85), since each indicator should measure a distinct aspect of the construct 18. The models proposed were estimated using the AMOS 16.0 program (SPSS Inc., Chicago, United States). Instead of excluding cases without information, full information maximum likelihood (FIML) estimation was used since studies have shown this method to be the best alternative to deal with missing data 24. Separate models were fitted for men and women. Since ordinal indicators of socioeconomic status were treated as continuous variables, bootstrapping was used to determine the consistency of the results.
The standardized coefficients (SC) were interpreted according to Kline 18 , where an SC of about 0.10 indicates a small effect, an SC of about 0.30 indicates a medium effect, and SC > 0.50 indicates a strong effect.

• Measures of model fit
The following measures were analyzed to test the fit of the model: χ² (likelihood ratio chi-square statistic): a statistically significant value indicates discrepancy between the observed and estimated matrices, with consequent rejection of the theoretical model under analysis; χ²/d.f. (normed chi-square): there is no exact critical value to decide whether the model is adequate based on this index, with values of 5.00 or lower being accepted in practice and values lower than 1 indicating overfitted models; root mean square error of approximation (RMSEA), which is directly based on residuals: a value close to zero indicates that the theoretical model fits the data, while values less than 0.08 indicate a satisfactory fit; normed fit index (NFI) and Tucker-Lewis index (TLI): values higher than 0.90 indicate a good fit; Akaike information criterion (AIC), used for the comparison of models: lower values indicate a better fit; R² (coefficient of determination): indicates how much of the variability in the response variable is explained by the explanatory variables 18,25,26.
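The cut-offs listed above can be collected into a small helper. The following is a minimal Python sketch, not the AMOS implementation: the function names and the example inputs are hypothetical, and the RMSEA expression used, sqrt(max(χ² − d.f., 0) / (d.f. × (N − 1))), is the common single-group formula, assumed here rather than taken from the text.

```python
import math


def normed_chi_square(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """RMSEA for a single-group model with sample size n (assumed formula)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def assess_fit(chi2, df, n, nfi, tli):
    """Apply the rule-of-thumb cut-offs described in the text."""
    ratio = normed_chi_square(chi2, df)
    error = rmsea(chi2, df, n)
    return {
        "normed_chi_square": ratio,
        "normed_chi_square_acceptable": ratio <= 5.0,
        "possibly_overfitted": ratio < 1.0,
        "rmsea": error,
        "rmsea_satisfactory": error < 0.08,
        "nfi_good": nfi > 0.90,
        "tli_good": tli > 0.90,
    }


# Hypothetical inputs for illustration only; these are not study results.
example = assess_fit(chi2=30.0, df=15, n=1000, nfi=0.97, tli=0.98)
for name, value in example.items():
    print(name, value)
```

These thresholds are conventions rather than formal tests, so the flags are best read together rather than one at a time.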

Ethical aspects
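The research project was approved by the Ethics Committee of the Faculty of Medicine of Ribeirão Preto, University of São Paulo (USP), in accordance with Resolution no. 196/96. All participants signed a free informed consent form.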

Measurement model
The coefficients of the measurement models were high and statistically significant. The factor loadings of the indicators of the latent construct "socioeconomic status at the time of birth" were higher than 0.60 and were statistically significant (Table 1) for both men and women. The indicator with the highest load for this construct was family income, with the socioeconomic status construct explaining 66% of the variability in family income among women and 68% among men. This indicates that the latent variable adequately predicted the variability of the observed variable (Figure 2). Cronbach's alpha coefficient was close to 0.70 (0.69 for women and 0.68 for men).
With regard to the adiposity construct, elevated (generally higher than 0.80) and statistically significant factor loadings (Table 1) were observed for all indicators, as shown in Figure 2. Particularly for BMI, the factor loading was 0.98 in women, indicating that 97% of the variability in BMI was explained by the adiposity construct. A similar result was obtained for men (factor loading of 0.96). Cronbach's alpha coefficient for the latent adiposity variable was 0.87 for men and 0.88 for women. One potential drawback of this construct is its low discriminant validity, with all factor loadings being higher than 0.85.
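The percentages of explained variability quoted above follow from squaring a standardized factor loading. The Python sketch below illustrates that relationship under the assumption that each indicator loads on a single construct; the function names are hypothetical, and because the loadings reported in the text are rounded to two decimals, the squared values may differ slightly from the published percentages.

```python
import math


def explained_variance(loading):
    """Share of an indicator's variance explained by its construct."""
    return loading ** 2


def implied_loading(variance_share):
    """Absolute standardized loading implied by an explained-variance share."""
    return math.sqrt(variance_share)


# Standardized BMI loadings on the adiposity construct, as rounded in the text.
for group, loading in {"women": 0.98, "men": 0.96}.items():
    print(group, "BMI explained variance:", round(explained_variance(loading), 3))

# Explained variance of family income by the socioeconomic status construct.
for group, share in {"women": 0.66, "men": 0.68}.items():
    print(group, "implied income loading:", round(implied_loading(share), 2))
```

Read in the other direction, the 66% and 68% shares for family income imply loadings of roughly 0.8, above the 0.60 threshold used for convergent validity.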

Structural model
Figure 2 shows the standardized coefficients of the structural model obtained for women (Figure 2a) and men (Figure 2b). These coefficients indicate the impact, expressed as standard deviation units, on the response variable relative to the variation of one standard deviation unit in the explanatory variable. This coefficient is similar to the beta weight coefficient of regression models and allows for the evaluation of the relative importance of variables in the model.
The direct effect of birth weight on adiposity in adult life, adjusted for socioeconomic status at the time of birth and the number of cigarettes smoked by the mother during pregnancy, was of low magnitude (0.08 for men and 0.09 for women) but statistically significant (p < 0.05). This means that for each variation of one standard deviation in birth weight there was a significant increase of 0.09 standard deviation in adiposity for women and of 0.08 standard deviation for men, corresponding to a small effect. The number of cigarettes smoked by the mother during pregnancy had no effect on adiposity in adult life among men (SC = 0.004, p = 0.91), but was marginally significant among women (SC = 0.06, p = 0.056) (Table 1). The portion of the effect of the number of cigarettes smoked by the mother during pregnancy on adult adiposity that was mediated by birth weight was not relevant. This indirect effect was estimated by multiplying the coefficient of the effect of the number of cigarettes on birth weight by the coefficient of the effect of birth weight on adiposity, and was -0.014 (-0.16 × 0.09) for women and -0.013 (-0.16 × 0.08) for men (Table 2).
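The product-of-coefficients calculation described above can be written out in a few lines. This Python sketch is only an illustration using the rounded standardized coefficients quoted in the text; the function name is hypothetical, and Table 2 remains the authoritative source for the estimates.

```python
import math


def indirect_effect(*path_coefficients):
    """Indirect effect along one path: the product of its standardized coefficients."""
    return math.prod(path_coefficients)


# Path: cigarettes smoked during pregnancy -> birth weight -> adiposity.
cigarettes_to_birth_weight = -0.16
birth_weight_to_adiposity = {"women": 0.09, "men": 0.08}

indirect = {
    group: indirect_effect(cigarettes_to_birth_weight, coefficient)
    for group, coefficient in birth_weight_to_adiposity.items()
}

for group, value in indirect.items():
    print(group, round(value, 3))
```

The same function accepts longer paths, since an indirect effect through several mediators is the product of all coefficients along the path.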
The socioeconomic status of the family at the time of birth was the variable that most influenced adiposity in adult life, but its effect differed between men and women. For women, socioeconomic status had a negative effect on adiposity (-0.16), whereas for men the effect was positive (0.07), with both effects being significant (Figure 2). This finding indicates that adiposity was greater among men of better socioeconomic status and among women of poor socioeconomic status. The total effect (sum of the direct effect and of indirect effects) of socioeconomic status on adiposity presented a pattern similar to that observed for the direct effect, being positive for men (0.085) and negative for women (-0.149) (Table 2). Most of the effect of socioeconomic status at the time of birth on adiposity was direct, with a non-relevant portion being mediated by the variables included in the model. The models including adiposity as the response variable presented satisfactory fit indices for both men and women; for example, a nonsignificant χ² for both men (p = 0.076) and women (p = 0.107). The model for men presented an RMSEA of 0.023 (< 0.08 indicates a good fit) and TLI of 0.995 (> 0.90 suggests a good fit). According to most indices, the model fit was better for women than for men (Table 3).
In view of the elevated factor loadings of the BMI, waist circumference and STSS indicators on the latent adiposity construct, we tested a model in which the adiposity construct was replaced with the BMI indicator. Figure 3 shows the results obtained with this model for women (Figure 3a) and men (Figure 3b). The estimates of the structural relations in the model including BMI (Figure 3) did not change in a relevant manner compared to those obtained with the model including adiposity (Figure 2). Most indices indicated a better fit of the models including BMI as response variable. In the model for women, χ² was 3.785 and not significant (p = 0.706), and AIC was 45.785. For men, the estimated χ² value was 5.558 and not significant (p = 0.474), and AIC was 47.558. On the other hand, the models including BMI might be overfitted, as suggested by normed chi-square (χ²/d.f.) values lower than 1, thus reducing the possibility of generalizing the results to other populations. These models are more parsimonious, including a smaller number of parameters since parameters associated with the adiposity construct were excluded (Table 3).
Models that include only waist circumference or STSS were also tested. The results and fit indices obtained were similar to those observed for the model including only BMI (data not shown, available from the authors).
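The fit statistics reported for the BMI models can be cross-checked numerically. The Python sketch below is hedged on two assumptions that are not stated in the text but are inferred from the reported values: 6 degrees of freedom and 21 estimated parameters per model. It also assumes the AIC definition χ² + 2q (q = number of estimated parameters); the function names are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable, valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form implemented for positive even df only")
    half = x / 2.0
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


def aic(chi2, n_params):
    """AIC as chi-square plus twice the number of estimated parameters."""
    return chi2 + 2 * n_params


DF = 6          # assumption inferred from the reported p-values
N_PARAMS = 21   # assumption inferred from the reported AIC values

reported_chi2 = {"women": 3.785, "men": 5.558}

for group, chi2 in reported_chi2.items():
    print(
        group,
        "p =", round(chi2_sf_even_df(chi2, DF), 3),
        "chi2/df =", round(chi2 / DF, 2),
        "AIC =", round(aic(chi2, N_PARAMS), 3),
    )
```

Under these assumptions the computed p-values and AIC values agree with those reported, and both normed chi-square values fall below 1, in line with the remark about possible overfitting.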

Multiple linear regression model
Figure 4 shows the standardized coefficients of a multiple linear regression model obtained for women (Figure 4a) and men (Figure 4b). For women, direct effects of maternal schooling (SC = -0.11; p = 0.010), birth weight (SC = 0.08; p < 0.001) and the number of cigarettes smoked during pregnancy (SC = 0.07; p = 0.021) on BMI were significant. For men, only birth weight influenced BMI (SC = 0.06; p = 0.045).

Discussion
Birth weight influenced adiposity in both male and female young adults, with its effect being direct and small. Maternal smoking during pregnancy did not influence adiposity in either gender, although the model for women suggests a small effect of marginal significance. The variable that influenced adiposity of young adults the most was socioeconomic status of the family at the time of birth. Its adjusted effect was small and positive for men, indicating that men who had a better socioeconomic status at birth tended to present greater adiposity when adults. For women, the effect of socioeconomic status on adiposity was moderate and negative, demonstrating that women who had a better socioeconomic status at the time of their birth presented lower adiposity in early adult life. The most important effects were direct, with the effect of socioeconomic status on adiposity mediated by birth weight or maternal smoking during pregnancy being very small.
The addition of waist circumference and STSS to BMI to compose the adiposity construct resulted in models with good fit indices. However, the elevated indicator factor loadings pointed to low discriminant validity, suggesting that BMI alone is a good estimator of adiposity and that the addition of waist circumference and STSS seems to be unnecessary. However, both the model including only BMI and the model including adiposity presented good fit indices, with a better fit being observed for the more parsimonious model including only BMI. In addition, the fact that the estimates of structural relationships and fit indices obtained with models including either waist circumference or STSS were similar to those observed for the model including only BMI suggests that in this population of young adults these three measures alone or in combination are good estimators of body adiposity. Some studies have suggested that adding waist circumference to BMI provides important information for determining the effect of body adiposity on the risk of coronary artery disease 27 or metabolic syndrome 12. Another study reported that waist circumference in addition to BMI provides a moderate prediction of coronary risk in young populations only 28. Furthermore, waist circumference has been suggested to be a better predictor of the risk of cardiovascular disease than BMI when both variables are dichotomized 29, and a better estimator of adiposity than BMI when both are included in the model as continuous variables 14. Another study reported that waist circumference and BMI independently contribute to the prediction of non-abdominal and abdominal subcutaneous and visceral fat in both men and women 13. In the present study, both BMI and waist circumference were treated as continuous variables. However, we do not know if the result would be the same if these variables were analyzed as categorical variables.
Positive associations between birth weight and BMI in childhood 30,31,32, adolescence 33 and adult life 2,3,32 were the predominant findings in studies on this subject. However, a Finnish study only described this association for men 34, whereas an Indian study only observed this association in women 35. On the other hand, a recent English study found no association between birth weight and BMI 36. The interpretation of this association would be that birth weight influences BMI in adult life or that both are influenced by other variable(s) or common cause(s). In most studies, this association even persisted after adjustment for socioeconomic status 2,3,32,33, in agreement with the present results. However, although the relationship between higher birth weight and higher BMI in adult life seems to be consistent, in some studies adjustment for other confounding factors such as parental adiposity and gestational age resulted in the disappearance of this association 32. In the present investigation no data regarding parental adiposity were collected and the results were not adjusted for gestational age. Since the main objective of the present study was to describe a didactic example of the use of structural equation modeling in epidemiology, the number of variables to be included in the model was limited in order to facilitate their interpretation. The proposed model is a simplification of a complex reality and it is not intended to be a full causal model.
The observation of a positive association between socioeconomic status and adiposity among men and a negative association among women agrees with other Brazilian studies 37,38,39. This association seems to involve historicity and to depend on the social context. In developed countries, at the beginning of the nutritional transition, obesity predominated among both men and women of higher socioeconomic strata 40,41. With progression of the nutritional transition, an inversion of this association was observed and, today, obesity is more prevalent among men 3 and women of poor socioeconomic status in developed countries 41,42,43. Brazil is currently undergoing nutritional transition, with the observation of an already inverted relationship among women, whereas this association remains positive among men 37,39,44. Nutritional and sociocultural patterns and, in particular, greater physical activity related to the occupation of men of poor socioeconomic status may explain why the relationship between socioeconomic status and adiposity is positive among men 39. It is possible and expected that the inversion of this relationship will also be observed among men as the nutritional transition advances in Brazil 45.
In contrast to various studies, we observed no association between maternal smoking during pregnancy and adiposity in men 6,46,47,48. For women, the effect of the number of cigarettes smoked during pregnancy on adiposity was small and marginally significant. However, in all studies this association was evaluated by treating smoking and obesity as categorical variables, whereas in the present investigation these two parameters were analyzed as continuous variables.
The main estimates of direct effects in the structural equation model were similar to those derived from the multiple linear regression model. However, it is important to note that multiple linear regression is not conceptually appropriate to answer questions of temporality because it does not consider the temporal sequence among the factors. Moreover, had structural equation modeling not been performed, it would not have been possible to answer two questions: whether the inclusion of measures of body fat other than BMI contributes to a better measure of body adiposity, and whether the indirect effects of socioeconomic status and maternal smoking on body adiposity are important. Hierarchical modeling could have been used to study mediation including only directly observed variables 49. However, this approach does not produce fit indices for the whole model, nor does it estimate indirect effects with their standard errors, and it requires as many regressions as there are endogenous variables to study mediation.
The present study has some positive points. This was a cohort study conducted in a developing country and involving a large sample, facts that confer high power to detect associations. The use of structural equation modeling permitted a better adjustment for socioeconomic status, which is a common confounding factor in studies on the association of birth weight and maternal smoking during pregnancy with adiposity in early adult life. The use of latent variables also permitted the consideration of measurement error in the evaluation of socioeconomic status and adiposity, thus providing effect estimates that were less contaminated by measurement bias for these variables. In addition, structural equation modeling permitted the estimation of indirect effects which, in this case, were not important. One limitation of the present study was the use of maximum likelihood estimation using the AMOS program. This method requires multivariate normality, linear relationships and the measurement of all variables on a continuous scale. Since all variables were treated as continuous variables, with some being ordinal variables or showing elevated degrees of asymmetry and/or kurtosis, bootstrap estimation was performed 26 to test the robustness of the results, which were similar. Simulation studies have shown that the maximum likelihood method produces good estimates even in the presence of excessive kurtosis and when the number of categories of ordinal variables is at least four 19. Another limitation of this study was the high percentage of missing information about family income (16.1%). FIML was used to reduce this limitation since this method is more efficient in the treatment of missing data than deleting records with incomplete information for some variable (listwise deletion) 24. Finally, there was selective loss to follow-up since subjects of low socioeconomic status and children born to mothers who smoked during pregnancy presented the lowest follow-up rates. There was no difference in follow-up rate according to birth weight 21. However, although the differences in follow-up rates were statistically significant due to the large sample size, the percentage differences were small. As a consequence, the estimates of the association of socioeconomic status and birth weight with adiposity obtained here might have been underestimated. Selective loss probably did not yield a false-negative result in the analysis of the association between maternal smoking during pregnancy and adiposity among men because the estimated coefficient was close to zero.
Socioeconomic status of the family at the time of birth was the variable that most influenced adiposity among young adults. Birth weight influenced adiposity among both male and female young adults, but its effect was small. Maternal smoking during pregnancy did not influence adiposity in men, but exerted a small and marginally significant effect in women. In this young adult population, BMI, waist circumference and STSS used alone or in combination were valid estimators of body adiposity.

Figure 1. Theoretical model tested using structural equations.


Figure 3. Structural equation model with body mass index as observed response variable.

Figure 4. Multiple linear regression with only direct effects and directly observed variables.
research project was approved by the Ethics Committee of the Faculty of Medicine of Ribeirão Preto, São Paulo State University (USP) in accordance with Resolution n o .196/96.All participants signed a free informed consent form.

Table 1
Standardized and non-standardized coefficients of the structural equation models using adiposity as latent response variable.

Figure 2. Structural equation model with adiposity as latent response variable.

Table 2
Standardized direct, indirect and total effects of the structural equation models using adiposity as a latent response variable.

Table 3
Fit indicators of the structural equation models.