Estimation of fat mass in Southern Brazilian female adolescents: development and validation of mathematical models

ABSTRACT
This study aimed to develop and validate the first mathematical models, based on anthropometric properties, to estimate fat mass (FM) in a heterogeneous sample of female adolescents. A cross-sectional, quantitative study was conducted with 196 individuals aged 12 to 17 years from the metropolitan region of Curitiba, Paraná, Brazil. The participants were randomly divided into two groups: a regression sample (n = 169) and a validation sample (n = 27). Dual-energy X-ray absorptiometry (DXA) was used as the reference method to determine body fat in relative and absolute values. Stature, body mass, waist girth and triceps, subscapular, biceps, iliac crest, abdominal, front thigh and medial calf skinfold thicknesses were defined as independent variables and measured according to an international technical protocol. Statistical analyses used the Ordinary Least Squares (OLS) regression model, the paired t test and Pearson correlation. Four multivariate mathematical models with high determination coefficients (R² ≥ 90%) and low standard errors of estimate (SEE ≤ 2.02 kg) were developed. Model 4 stands out for its low number of independent variables and strong statistical performance (R² = 90%; SEE = 1.92 kg). It is concluded that the four mathematical models developed are valid for estimating FM in female adolescents in southern Brazil.


INTRODUCTION
An excess of subcutaneous adipose tissue in adolescence is considered an independent risk factor for metabolic syndrome 1 and psychosocial disorders 2, and leads to structural and physiological abnormalities 3. Body adiposity is the morphological parameter used in the diagnosis of obesity and, consequently, in the periodic monitoring of the effectiveness of multidisciplinary interventions 4,5.
Therefore, it is essential to critically understand the potential and limitations of the available assessment methods. The tetra-compartmental chemical-molecular model is considered an important laboratory reference method. However, its methodological operability is limited in clinical studies 4. Dual-energy X-ray absorptiometry (DXA) is the convergent method most accepted by the scientific community to quantify the fat component as an indicator of body adiposity in pediatric populations 5. Nevertheless, the high cost of acquisition and/or maintenance and the need for adequate infrastructure and technicians restrict the use of DXA in the academic research environment. It is noteworthy that doubly indirect methods, such as surface anthropometry, are satisfactorily valid for estimating certain chemical and/or anatomical components of total body mass using mathematical models in different contexts and operational settings of clinical practice 6. A regression model is specifically derived from the quantitative analysis between measurable anthropometric properties and the unknown component in a homogeneous or heterogeneous population sample 4. To choose the ideal mathematical model, the health professional must analyze the similarity in morphological and demographic characteristics between the study sample and the target individual 7.
In this context, the traditional mathematical models developed by Slaughter et al. 8 from a North American sample 35 years ago are conventionally used in Latin American countries. However, estimation errors are observed in pediatric samples with different genotypic and phenotypic profiles 9,10. It is noteworthy that excess body adiposity is associated with the interaction of sociocultural, behavioral and biological moderators; therefore, the obesogenic environment of exposure has a direct impact on the body composition of adolescents 11. Consequently, due to the lack of agreement and statistical sensitivity observed in cross-validation studies, bi- 10,12 and multicompartmental 13 mathematical models with better specificity have been proposed and disseminated to estimate body adiposity in healthy Brazilian schoolchildren.
However, there are no references from research conducted with southern female samples. This is an important gap in the literature, given that a recent systematic review with meta-analysis highlighted a high prevalence of overweight and obesity in this geographic region 11. Therefore, this study aimed to develop and validate the first mathematical models, based on anthropometric properties, to estimate fat mass (FM) in a heterogeneous sample of female adolescents in southern Brazil.

METHODS
Study design and population
A cross-sectional, quantitative study was conducted over the three-year period from 2014 to 2016 in the metropolitan region of Curitiba, Paraná, Brazil.
We investigated multiracial, clinically healthy female adolescents aged 12 to 17 years who were regularly enrolled in public and private educational institutions. The sample size was determined at a confidence level of 91% with a specified error of 4.5%, from a universe of 82,414 individuals obtained through the Brazilian national database system (Datasus) of 2011. Details are given in a previous study 14. Individuals with suspected pregnancy, those using diuretic agents and/or those who had undergone a radiographic examination in the seven days prior to the evaluation procedures were excluded.
The individuals' participation was voluntary and required that consent be given by the parents and/or legal guardians.

Independent variables
Chronological age was based on the year of birth and grouped in decimal values. The independent variables were collected in a private room by two experienced anthropometrists accredited by the International Society for the Advancement of Kinanthropometry (ISAK). The following anthropometric properties were measured in duplicate and according to the International Standards for Anthropometric Assessment protocol 15: body mass using a digital scale (Filizola®, Brazil) with a resolution of 0.05 kg; stature using a stadiometer (Cardiomed®, Brazil) with a resolution of 0.1 cm; waist girth using a flexible steel anthropometric tape (Cescorf®, Brazil) with a resolution of 0.1 cm; and triceps, subscapular, biceps, iliac crest, abdominal, front thigh and medial calf skinfold thicknesses using a skinfold caliper (Cescorf®, Brazil) with a resolution of 0.1 mm and a pressure of ± 10 g/mm². An anthropometric bench (Anthropos®, Brazil) and a dermographic pen (Viscot Medical®, USA) were used as complementary resources. All anthropometric instruments used are registered with the Agência Nacional de Vigilância Sanitária (ANVISA) [Brazilian National Health Surveillance Agency] and were previously calibrated. The reliability of the skinfold thicknesses was determined by estimating the intra-evaluator relative Technical Error of Measurement (%TEM). The values obtained were 3.2% and 3.5%, both considered adequate 16. Each participant was referred for the DXA evaluation immediately after the anthropometric measurements.
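The %TEM reported above can be illustrated with a short sketch. It assumes the conventional intra-evaluator formula for duplicate measurements, TEM = sqrt(Σd² / 2n), expressed as a percentage of the grand mean of all readings. The function name and the skinfold values are hypothetical and are not data from this study.

```python
import math

def relative_tem(first, second):
    """Return (TEM, %TEM) for duplicate measurements taken by one evaluator."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty series")
    n = len(first)
    # Sum of squared differences between the first and second trial.
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(first, second))
    tem = math.sqrt(sum_sq_diff / (2 * n))
    # The relative TEM scales the absolute error by the grand mean of both trials.
    grand_mean = (sum(first) + sum(second)) / (2 * n)
    return tem, 100.0 * tem / grand_mean

# Hypothetical triceps skinfold readings (mm), two trials per participant.
trial_1 = [12.0, 15.5, 20.0, 9.5]
trial_2 = [12.5, 15.0, 21.0, 9.0]
tem, pct_tem = relative_tem(trial_1, trial_2)
print(f"TEM = {tem:.2f} mm, %TEM = {pct_tem:.1f}%")
```

Because the error is expressed relative to the mean, values from skinfold sites of different thickness can be compared on the same scale.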

Dependent variable
The fat component was determined in relative and absolute values using the DXA Hologic Discovery Fan-Beam system (Hologic Discovery A, Bedford, USA). Measurement reliability was established through daily calibration of the equipment with a phantom. The coefficient of variation was lower than 2%. The evaluation procedure was performed according to the guidelines recommended by the International Society for Clinical Densitometry (ISCD) 17. Volunteers were not allowed to wear metal objects and/or clothing. The reference curves developed for Southern Brazilian adolescents 18 were applied to determine the phenotype of body composition. The 75th percentile was used as the discriminatory parameter for excess FM.

Statistical analysis
To begin the analysis, the distribution of each collected variable was checked for asymmetry and normality using the Kolmogorov-Smirnov test. Measures of central tendency (mean) and dispersion (minimum, maximum and standard deviation) were used. Pearson's correlation coefficient (r) was used to quantify the intensity and direction of the association between variables. This allowed the analysis of the linear association between anthropometric measurements and DXA values.
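As an illustration of the correlation step, the sketch below computes Pearson's r between one skinfold site and DXA fat mass using only the standard library. The function name and the five data pairs are hypothetical and serve only to show the calculation.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Cross-product and sums of squares of deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical subscapular skinfold (mm) and DXA fat mass (kg) pairs.
subscapular_mm = [8.0, 12.0, 15.0, 20.0, 26.0]
dxa_fm_kg = [10.1, 13.4, 14.2, 19.8, 22.5]
r = pearson_r(subscapular_mm, dxa_fm_kg)
print(f"r = {r:.3f}")
```

The sign of r gives the direction of the association and its magnitude gives the intensity, which is how the coefficients reported in the Results should be read.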
For the regression models, a multiple linear regression technique, Ordinary Least Squares (OLS), was employed to establish the prediction equation for FM 12. The independent variables were: age in years, body mass (kg), stature (m), waist girth (cm) and the seven skinfold thicknesses (mm): triceps, subscapular, biceps, iliac crest, abdominal, front thigh and medial calf. Heteroscedasticity can occur due to the presence of discrepant data, specification errors, omission of relevant variables or asymmetry in the distribution of one or more of the regressors. White's test was applied to check for it 19. Adjustment of skinfold thickness with a logarithmic scale was used to stabilize the variance and normalize the data distribution. The development of the mathematical models then followed methodological steps guided by the criteria of practicability, reduced number of independent variables, higher determination coefficient (R²) and lower standard error of estimate (SEE).
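The model structure described above can be sketched as an OLS fit of FM on body mass, waist girth and the logarithm of a skinfold sum, returning R² and SEE. This is a minimal illustration and not the authors' SPSS/GRETL procedure: it solves the normal equations directly, it assumes a natural logarithm (the base is not stated in the text), and all records are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... by OLS; return (coefficients, R2, SEE)."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    see = math.sqrt(ss_res / (len(y) - k))  # residual df = n - parameters
    return beta, r2, see

# Hypothetical records: body mass (kg), waist girth (cm),
# subscapular and front thigh skinfolds (mm), DXA fat mass (kg).
records = [
    (45.0, 62.0, 8.0, 14.0, 9.8), (52.0, 66.0, 11.0, 19.0, 13.9),
    (58.5, 70.5, 14.0, 24.0, 17.6), (61.0, 69.0, 12.5, 22.0, 17.1),
    (66.0, 76.0, 18.0, 28.0, 22.4), (72.5, 81.0, 22.0, 33.0, 27.0),
    (49.0, 67.5, 13.0, 18.0, 13.0), (80.0, 85.0, 26.0, 36.0, 32.1),
]
# Predictors: body mass, waist girth and log of the two-skinfold sum.
predictors = [(bm, wg, math.log(sub + thigh)) for bm, wg, sub, thigh, _ in records]
fat_mass = [rec[4] for rec in records]
beta, r2, see = fit_ols(predictors, fat_mass)
print("coefficients:", [round(b, 3) for b in beta])
print(f"R2 = {r2:.3f}, SEE = {see:.2f} kg")
```

Summing the skinfolds before the log transformation collapses several mutually correlated predictors into one, which is the collinearity control the study describes.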
To test the applicability of the mathematical models, an independent sample drawn from the initial population of this study, but not used for model development, was also used. The individuals were randomly selected by the software itself and represented approximately 15% of each age category. To quantify the agreement between DXA and the skinfold thickness models, we used the Bland-Altman procedure, which quantifies agreement between measurements by the bias (mean difference) and the limits of agreement (LOA). A significant bias (p < 0.05), verified with a paired t-test, would indicate disagreement between techniques. Furthermore, a deviation of up to 8% in the 95% LOA was considered clinically acceptable 20. Finally, the SEE was used as a measure of residual deviation. For all analyses, p < 0.05 was adopted as statistically significant. The analyses were performed using IBM SPSS Statistics for Windows, version 21.0 (IBM Corp., Armonk, N.Y., USA) and the GNU Regression, Econometrics and Time-series Library (GRETL), version 2016.
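The agreement analysis can be sketched as follows: the bias is the mean of the paired differences (model minus DXA), the 95% LOA are the bias ± 1.96 standard deviations of the differences, and the paired t statistic is the bias divided by its standard error. The function name and the values are hypothetical. The p-value is left out because the Student t distribution is not in the Python standard library; it would be obtained by comparing the statistic with n - 1 degrees of freedom.

```python
import math
import statistics

def bland_altman(reference, estimate):
    """Return (bias, (lower LOA, upper LOA), paired t statistic)."""
    if len(reference) != len(estimate) or len(reference) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [e - r for r, e in zip(reference, estimate)]
    n = len(diffs)
    bias = statistics.fmean(diffs)   # mean difference
    sd = statistics.stdev(diffs)     # sample SD of the differences
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    t_stat = bias / (sd / math.sqrt(n))  # paired t test of bias against zero
    return bias, loa, t_stat

# Hypothetical fat mass values (kg): DXA reference vs. model estimate.
dxa_fm = [12.0, 15.0, 18.0, 21.0, 24.0]
model_fm = [13.0, 14.0, 19.0, 20.0, 25.0]
bias, loa, t_stat = bland_altman(dxa_fm, model_fm)
print(f"bias = {bias:.2f} kg, LOA = {loa[0]:.2f} to {loa[1]:.2f} kg, t = {t_stat:.2f}")
```

A bias close to zero with narrow LOA indicates that the model and the reference method can be used interchangeably within the stated clinical tolerance.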

RESULTS
Data on 169 girls with an average age of 15.1 ± 1.87 years were included in the regression sample. Descriptive analyses of demographic and morphological characteristics are shown in Table 1. Data stratified by age are shown in Supplementary Material 1. Regarding the body composition classification by FM percentile curves, overweight or obesity was found in 32% (n = 55) of the girls. In the skinfold thickness analysis, subscapular (r = 0.793), iliac crest (r = 0.765), front thigh (r = 0.765) and triceps (r = 0.727) had the highest correlations with relative (%) and absolute (kg) FM, followed by abdominal (r = 0.705), biceps (r = 0.694) and medial calf (r = 0.690), all p = 0.001. The correlation coefficients of FM with body mass and waist girth were 0.927 and 0.902, respectively. These anthropometric properties were selected to develop the mathematical models. Correlation analyses are shown in Supplementary Material 2. The analysis of collinearity and heteroscedasticity among the independent variables is commonly disregarded or omitted in studies that develop mathematical models for FM prediction. In this study, we opted for the sum of skinfold measurements to control collinearity, followed by logarithmic transformation, to compose new independent variables. Table 2 shows the result of this step.
Figures 1, 2, 3 and 4 illustrate the results and the values obtained by DXA and by the developed mathematical models. The 95% CI data are presented, as well as the residual analysis of each equation. Although the models presented low SEE and no difference when compared to DXA, validation in an independent sample was necessary. Thus, data from 27 girls (15.1 ± 1.9 years) were collected. The validation sample represents 15% of the total group and was not part of the regression group. Similar to the regression sample, the body composition classification by FM percentile indicated overweight or obesity in 41% (n = 11). Descriptive characteristics and the concordance analysis of the validation group are presented in Supplementary Material 3. For the validation sample (Table 3), high correlations were verified between DXA and the skinfold thickness models (r = 0.956 to 0.967). Despite showing lower values of relative fat compared to DXA, no significant difference from the reference method was detected (p > 0.05 for all models), and Bland and Altman analysis confirmed the results. Finally, Pearson correlations between bias and the independent variables of all anthropometric models were tested to understand the errors of each mathematical model (Supplementary Material 4). This information could be helpful to identify intrinsic errors. The results indicated no correlation.

DISCUSSION
The uniqueness of this study lies in the development of mathematical models for the estimation of FM strictly in a sample of female adolescents from a southern Brazilian region. The selected independent variables are widely evidenced in the scientific literature 8-13 as important explanatory elements of body adiposity in children and/or adolescents. Regarding the selection of independent variables, the significance of the correlation coefficients conditioned the inclusion of body mass, waist girth and all skinfold thicknesses in the regression analyses. The adopted statistical methodology ensured collinearity control of the variables, normality of the residuals and homoscedasticity. Commonly, the absolute and/or squared sum of skinfold thicknesses is used; however, because the frequency distribution is skewed, non-rectilinear trends with FM are observed in pediatric samples 8,12.
Four multivariate mathematical models combining body mass, waist girth and the logarithmic conversion of the sum of seven, four, three or two skinfolds were developed (Table 2). Waist girth is an independent indicator of android obesity 21 and has satisfactory discriminatory ability for body adiposity in adolescents 22. Body mass directly reflects the sum of organic and inorganic matter and, in addition, variations are observed during the menstrual cycle due to the increase in total body water. Moreover, in very similar proportions, as adiposity increases or decreases, body mass will necessarily change 23. The percentage of fat estimated using mathematical models developed from densitometric methods, such as hydrodensitometry and air displacement plethysmography, is the result of the conversion of volumetric body density 6,24. However, imaging methods with two-dimensional projections, specifically DXA, provide values in units of mass, as the measurement of the area of body tissues more reliably translates the quantification of molecular components 25. Thus, the present study chose to develop mathematical models with estimates in absolute values (kg) (Table 2), given that they are directly actionable and convertible into kilocalories and, therefore, can be used as a decision parameter in the planning of clinical interventions based on an energy expenditure target.
In all mathematical models, determination coefficients with high explanatory power (R² ≥ 90%) for FM were obtained. The low estimation errors (SEE ≤ 2.02 kg) stand out and are in agreement with the criteria established in the reference literature 26. Additionally, none of the models showed a significant difference from the mean values obtained using DXA. Bland & Altman's analysis shows limits of agreement (LOA) with acceptable oscillations and no significant and/or systematic bias. The mathematical model structured with body mass, waist girth and the log-transformed sum of the subscapular and front thigh skinfolds (Model 4) shows strong statistical performance (R² = 90%; SEE = 1.92 kg; LOA = -3.59 to 3.43 kg). Considering the heterogeneity and variability of the subcutaneous distribution of adipose tissue, evidenced during the somatic and biological maturation stages 27, note that this model balances variables between central and appendicular anatomical sites, optimizing the topographic monitoring of adiposity. In addition, the compressibility of skinfold thickness is not constant 28; therefore, the low number of variables reduces the influence of random errors in FM estimation 8,12. An important consideration relates to the errors of all models. The application of our models, or of any other present in the literature, should consider this point to ensure the reproducibility of the data. The correlation analysis presented in Supplementary Material 4 showed no association between bias and the independent variables, favoring their use in longitudinal clinical practice.
The statistical performance of the mathematical models proposed in the present study (Table 2) is better than that identified in Mexican (R² = 85%) 9 and Chilean (R² = 89%) 29 children and adolescents with a mean age of 11.5 ± 2.9 and 12.0 ± 1.3 years, respectively. A statistical similarity was observed with the mathematical model developed with 84 children and adolescents with a mean age of 11.64 ± 2.42 years from Ribeirão Preto, São Paulo, Brazil (R² = 90%; SEE = 2.23 kg) 13. However, that sample showed an absolute mean FM (16.51 ± 7.08 kg) that was lower than that identified in the present study. This difference is partially explained by the phenotypic specificity inherent in the samples. Furthermore, systematic variations between devices have been identified and, therefore, the comparison of mathematical regression models developed with different software versions or scanning speeds is susceptible to estimation bias 4,5.
The validation of the mathematical models was performed on an independent and randomly selected sample (Table 3). The high coefficients were maintained, with a maximum LOA oscillation of -3.13 to 3.63 kg (Model 4) and a minimum of -2.74 to 3.82 kg (Model 2). It must be emphasized that all mathematical models developed meet the criteria of non-significant bias, LOA within clinical limits and high coefficient of determination (Table 3). Reproducibility is therefore suggested in female adolescents with equivalent morphological and demographic characteristics. The mathematical models proposed, as well as those developed for the male sample 12, are suggested to estimate body adiposity in adolescents aged 12 to 17 years in southern Brazil. Gender-specific percentile curves 18, developed from the samples of the aforementioned studies, can be applied for the normative classification of FM in clinical and epidemiological contexts. The World Health Organization (WHO) recommends the use of body mass index; however, its clinical applicability is limited by its low accuracy in stratifying tissue components 24. Thus, skinfold thickness is a valid tool to quantify, describe and assess body adiposity, since approximately 80% of the adipose tissue is deposited subcutaneously 28. Moreover, it is suggested as a screening resource to determine which adolescents are at greater risk of becoming obese adults 30.
The main limitation of this study is the lack of information on biological maturation, which makes it impossible to interpret the predictive accuracy of the mathematical models across the stages of puberty. Although the sample is ethnically heterogeneous, it is important to validate the proposed mathematical models in different population groups, due to the morphological variability mainly influenced by the miscegenation observed in the Brazilian regions. The main strengths are the use of DXA as a reference method and the internal validation of the mathematical models in an independent sample.

CONCLUSIONS
It is concluded that the four mathematical models developed are valid for estimating FM in female adolescents in southern Brazil. Therefore, they may be useful for health professionals as a tool for diagnosing and monitoring body adiposity. Furthermore, mathematical model 4 has better statistical performance and a low number of predictive variables.

COMPLIANCE WITH ETHICAL STANDARDS
Funding
This project received funding from Programa de Pesquisa para o Sistema Único de Saúde: Gestão Compartilhada em Saúde PPSUS - edition 04/2012, project number: 41614 - FA, agreement 982/2013 with Federal University of Technology-Paraná. The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript, and none of the authors received a salary.

Table 1. Demographic and morphological characteristics of the regression sample (n = 169).

Table 2. Mathematical models to estimate FM (kg) in female adolescents from southern Brazil.

Table 3. Cross-validation of mathematical models in an independent sample (n = 27).
SD: standard deviation; r: correlation coefficient; SEE: standard error of estimate; Bias: a systematic distortion of a statistical result due to a factor not allowed for in its derivation; LOA: limits of agreement. *Bland & Altman concordance analysis.