Is there any difference between “amateur” and “recreational” runners? A latent class analysis

Aims: to identify and describe the clustering of running-related characteristics amongst Brazilian runners using latent class analysis, and to verify whether any profile is associated with better performance. Material and Methods: a sample of 1149 Brazilian runners answered an online questionnaire that provided information on biological (sex, age, height, weight), training (running pace, frequency and volume/week, motivation), and socioeconomic factors, together with a multidimensional fear-of-failure questionnaire. Latent Class Analysis was used to identify subgroups of Brazilian runners based on BMI, training volume and frequency/week, motivation, socioeconomic factors, and fear of failure. A χ² test was then used to verify statistical differences in the frequency of the descriptive variables between classes. Finally, binary logistic regression analysis estimated factors associated with running performance, with running pace as the dependent variable. Results: Two classes were identified among Brazilian runners, labeled “amateur runners” and “recreational runners”. The variables that best distinguished the classes were training volume and frequency/week, motivation for the practice, and BMI. Regarding running performance, logistic regression analysis showed that men (OR=5.39; 95%CI=4.00-7.25), young runners (OR=0.38; 95%CI=0.28-0.51), and “amateur runners” (OR=4.19; 95%CI=2.95-5.94) were more likely to have higher performance. Conclusion: Two distinct classes linked to performance were found among Brazilian runners, showing that differences in classification can be observed even among non-professional runners. Hence, future studies should consider using these classes to properly stratify or identify non-professional runners.


Introduction
In recent years, the number of road runners has increased considerably around the world 1 , but this growth has not been accompanied by an improvement in performance, given that the mean time to complete these races has not decreased among non-professional runners 2 . Consequently, studies have focused on determinants of performance, such as physiological and psychological factors 3 , environmental perception 4 , and anthropometric and body composition characteristics 5 . In addition, since performance is a multifactorial trait and different characteristics tend to cluster to explain it, understanding the runner profile seems relevant.
It seems clear that runner profiles are heterogeneous 1 . For example, in any running race it is possible to observe participants who run under the motto "complete and not compete", those motivated by the idea of "health and quality of life", and runners who take part in these events focused on improving performance/results 6 . These differences may be reflected in their involvement in the modality, as seen in the number of weekly training sessions, time spent in training/week, and weekly mileage 7,8 , which can be related to their commitment to the practice and, consequently, to performance.
In Brazil, the observed scenario is similar to that seen in other countries, with hundreds of races occurring every year, involving about 4 million runners 9 . In addition, Brazil presents large socio-cultural, climatic, and economic differences between its regions 10 , which can lead to differences in the runners' profile. Previous studies highlighted some sociodemographic issues, such as the fact that most runners were male and had high socioeconomic and educational levels 11,12 . Besides, evidence suggested that most runners have more than one year of running experience and cover an average of about 30 km/week in at least three weekly training sessions 13-15 . However, these studies were conducted at the local level and described only a few characteristics related to running performance, which does not allow the results to be generalized 12,13,15 . Moreover, to the best of our knowledge, no study has established a national-level profile of Brazilian runners, nor examined whether different profiles are associated with better performance. Given that, the aim of this study is two-fold: a) to identify and describe the clustering of characteristics related to running amongst Brazilian runners using latent class analysis, and b) to verify if there is a profile associated with better performance.

Study Design and Sample
Data from the present study are part of the InTrack Project, a cross-sectional research project developed to identify the main factors associated with running performance. To take part in the study, runners had to answer the online questionnaire (described below). Those younger than 18 years of age, or with missing or nonsensical responses regarding body weight and/or height, training frequency and volume/week, motivation for the practice, economic status, or fear of failure, were excluded from the analysis. The final sample comprised 1149 runners (447 female, 702 male), sampled by convenience and distributed across all five Brazilian regions (Southeast, 415; North, 84; Northeast, 411; South, 137; Midwest, 99). All participants were fully informed about the purposes and perspectives of the study and gave their consent to participate. The research was conducted in accordance with the Declaration of Helsinki and was approved by the Federal University of Sergipe Ethics Committee (protocol nº 3.558.630).

Procedures and data collection
The questionnaire "Profile characterization and associated factors for runner's performance" 16 was made available to eligible subjects through an online tool (Google Forms), as in previous studies 17-19 . This strategy was used to increase the response rate among subjects invited to take part in the study. The invitation to eligible subjects was made through social media apps, where they were asked the following question: "What determines performance in road running?". The instrument was available between November 2019 and March 2020, and during this period runners who had answered the questionnaire were encouraged to publicize it and invite other runners to take part in the study. Based on the information provided by the questionnaire, the following variables were used:

Biological Variables
Age, sex, height, and weight were self-reported by participants. Body mass index (BMI) was computed using the standard formula [body mass (kg)/height (m)²], and participants were categorized as normal weight or overweight/obese according to the cut-off points suggested by the World Health Organization 20 .
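The BMI computation and two-level categorization described above can be sketched as follows. This is an illustration only, not the authors' code: the function names are hypothetical, the 25 kg/m² threshold is the WHO cut-off for overweight, and placing every value below 25 in the "normal weight" group is an assumption, since the text does not say how underweight runners were handled.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: body mass (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Two-level split; 25 kg/m^2 is the WHO overweight threshold.

    Assumption: everything below 25 is grouped as "normal weight".
    """
    return "overweight/obese" if value >= 25.0 else "normal weight"


# Hypothetical runner: 70 kg, 1.75 m.
example = bmi(70.0, 1.75)
print(round(example, 2), bmi_category(example))
```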

Training Variables
Participants provided information regarding their known running pace, training frequency, and volume per week, as well as their motivation for the practice.
Running pace. Subjects reported their running pace, and the information was checked, whenever possible, against the official pace recorded in races they reported having taken part in during the last 12 months. The group mean value was used as the cut-off point to stratify the data.
Training Frequency/Week. Weekly training frequency was reported as a count (from 1 to 7 training sessions), and the cut-off point of 3 sessions/week was used to categorize the subjects ("up to 3 training sessions/week", "more than 3 training sessions/week"), based on previous studies showing that most runners tend to train at least three times/week 13,14 .
Training Volume/Week. Runners reported the approximate total distance (in km) usually covered per week in training sessions. The group mean value was used as the cut-off point to stratify the data.
Motivation. Participants were asked about their motivation for running and, based on their answers, were split into two groups: "performance" or "health and quality of life".
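As a rough illustration of how the training variables above could be dichotomized, the sketch below uses the cut-offs reported later in the paper (pace ≤330 s/km, volume >35 km/week, frequency >3 sessions/week). The function and key names are hypothetical, and the "mm:ss" input format for pace is an assumption about how responses might be stored.

```python
PACE_CUTOFF_S = 330    # s/km, i.e. 5:30 min/km (group mean)
VOLUME_CUTOFF_KM = 35  # km/week (group mean)
FREQUENCY_CUTOFF = 3   # training sessions/week


def pace_to_seconds(pace: str) -> int:
    """Convert a pace written as 'mm:ss' per km into seconds per km."""
    minutes, seconds = pace.split(":")
    return int(minutes) * 60 + int(seconds)


def code_training(pace: str, sessions_per_week: int, km_per_week: float) -> dict:
    """Return the binary training indicators for one runner."""
    return {
        # A lower pace value means a faster runner.
        "better_performance": pace_to_seconds(pace) <= PACE_CUTOFF_S,
        "high_volume": km_per_week > VOLUME_CUTOFF_KM,
        "high_frequency": sessions_per_week > FREQUENCY_CUTOFF,
    }


# Hypothetical runner: 5:10 min/km, 4 sessions and 40 km per week.
print(code_training("5:10", 4, 40))
```

Values exactly at a cut-off (for example, 3 sessions/week or 35 km/week) fall in the lower group, matching the "≤" side of the categories used in the latent class analysis.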

Socioeconomic Status (SES)
Subjects indicated their monthly income and, based on the Brazilian minimum wage in 2019 21 , were categorized as those with income ≤R$2.994 (three minimum wages) or those with income >R$2.994. This cut-off point was used because, according to the Brazilian Institute of Geography and Statistics, in 2019 the majority of the Brazilian population had a monthly income that lay between 2 and 3 minimum wages 22 .
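The arithmetic behind the cut-off can be made explicit. The sketch below assumes a 2019 national minimum wage of R$998 per month, a value not stated in the text but consistent with the reported threshold; note that "R$2.994" uses the Brazilian thousands separator and means 2994 reais. The names are hypothetical.

```python
# Assumption: 2019 national minimum wage in reais per month (not given in the text).
MINIMUM_WAGE_2019 = 998
SES_CUTOFF = 3 * MINIMUM_WAGE_2019  # three minimum wages


def ses_group(monthly_income: float) -> str:
    """Classify a monthly income (in reais) against the three-minimum-wage cut-off."""
    if monthly_income > SES_CUTOFF:
        return "> 3 minimum wages"
    return "<= 3 minimum wages"


print(SES_CUTOFF, ses_group(2994), ses_group(3500))
```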

Fear of Failure
The multidimensional questionnaire of fear of failure (FoF) 23 was transcribed to an electronic platform and used in the study. Runners were invited to answer it at the end of the abovementioned questionnaire. The instrument assesses beliefs associated with FoF across five domains (shame and embarrassment; self-esteem depreciation; uncertainty about the future; loss of interest by others; and other people worried about you). The answers provide a general FoF score. Answers follow a Likert scale ranging from 1 (not at all) to 5 (very much), where the higher the score, the higher the FoF.

Statistical Analyses
Descriptive statistics were computed in SPSS 24.0, with values presented as mean and standard deviation, or as frequency. Using Mplus v.6, a Latent Class Analysis (LCA) identified subgroups of Brazilian runners. The main purpose of this analysis is to cluster subjects who share similar characteristics 24 . Classes were therefore built from the following variables: BMI (normal weight or overweight), training volume/week (>35 km or ≤35 km), training frequency/week (>3 training sessions/week or ≤3 training sessions/week), motivation for the practice ("performance" or "health and quality of life"), SES (>R$2.994/month or ≤R$2.994/month), and FoF (final result ≤3 or >3). For all variables, the first category listed was used as the reference.
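To make the idea of latent class analysis on dichotomous indicators concrete, here is a minimal expectation-maximization (EM) sketch in plain Python. It is not the Mplus estimator used in the study: the function `fit_lca`, the random starting values, the fixed iteration count, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def fit_lca(data, n_classes=2, n_iter=200, seed=0):
    """Fit a latent class model to binary indicators with plain EM.

    data: list of equal-length lists of 0/1 values (one row per runner).
    Returns (class_weights, item_probs, log_likelihood), where
    item_probs[c][j] estimates P(indicator j = 1 | class c).
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random start, kept away from 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    log_lik = float("-inf")
    for _ in range(n_iter):
        # E-step: posterior class membership for every row.
        posteriors = []
        log_lik = 0.0  # evaluated at the parameters entering this E-step
        for row in data:
            joint = []
            for c in range(n_classes):
                p = weights[c]
                for x, theta in zip(row, probs[c]):
                    p *= theta if x else 1.0 - theta
                joint.append(p)
            total = sum(joint)
            log_lik += math.log(total)
            posteriors.append([p / total for p in joint])
        # M-step: update class sizes and conditional item probabilities.
        for c in range(n_classes):
            mass = max(sum(post[c] for post in posteriors), 1e-12)
            weights[c] = mass / len(data)
            for j in range(n_items):
                hits = sum(post[c] * row[j]
                           for post, row in zip(posteriors, data))
                # Clamp so that no probability reaches exactly 0 or 1.
                probs[c][j] = min(max(hits / mass, 1e-6), 1.0 - 1e-6)
    return weights, probs, log_lik


# Synthetic data (not the study sample): two equally sized groups,
# six binary indicators, endorsement probability 0.9 versus 0.1.
sim = random.Random(42)
rows = [[1 if sim.random() < (0.9 if i % 2 == 0 else 0.1) else 0
         for _ in range(6)] for i in range(400)]
weights, probs, log_lik = fit_lca(rows, n_classes=2)
print([round(w, 2) for w in weights])
print([[round(p, 2) for p in cls] for cls in probs])
```

In practice, several random starts are usually run and the solution with the best log-likelihood is kept, which is what "replication of the best results" refers to in the model selection step.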
Model fit was assessed using the Pearson χ² statistic, the bootstrap likelihood ratio difference test (LRT), the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC) when comparing models with different numbers of latent classes. First, a two-class model was computed, followed by a three-class model. The best-fitting model was determined based on fit measures, the replication of the best results, and the substantive interpretation. After that, WinPepi software was used to verify statistical differences in the frequency of the descriptive variables between classes, using the χ² test. Binary logistic regression analysis estimated factors associated with runners being classified as having a better performance (pace ≤330 s/km). The models included both biological variables [sex (female as the reference (ref)), age ("≤37 years" and ">37 years" (ref))] and a behavioral variable [latent class ("recreational runner" (ref) and "amateur runner")]. The significance level was set at 5%.
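The information criteria mentioned above follow standard definitions, AIC = -2·logL + 2k and BIC = -2·logL + k·ln(n), where k is the number of free parameters and n the sample size. The sketch below assumes an unrestricted latent class model with six dichotomous indicators, for which k = (C - 1) + 6C with C classes. The helper names are hypothetical, and no log-likelihood values from Table 2 are used.

```python
import math


def n_free_parameters(n_classes: int, n_binary_items: int) -> int:
    """Class proportions (C - 1) plus one item probability per class and item."""
    return (n_classes - 1) + n_classes * n_binary_items


def aic(log_lik: float, k: int) -> float:
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik: float, k: int, n: int) -> float:
    return -2.0 * log_lik + k * math.log(n)


n = 1149  # sample size reported in the study
k2 = n_free_parameters(2, 6)
k3 = n_free_parameters(3, 6)
# Extra BIC penalty paid by the three-class model for its additional parameters.
extra_bic_penalty = (k3 - k2) * math.log(n)
print(k2, k3, round(extra_bic_penalty, 1))
```

Under these assumptions, a three-class model would need to lower -2·logL by more than this extra penalty to be preferred by BIC, which illustrates why the more parsimonious model can win even when the larger model fits slightly better.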

Results
Considering mean values, the sample comprised normal-weight runners in their late thirties from both sexes (447 women, 702 men), who covered nearly 35 km/week during training sessions at a running pace of 5:30 min/km. There was a slight difference in the distribution of subjects according to their SES, and the majority of the participants reported "health and quality of life" as their main motivation for running, trained more than three times/week, and had FoF values ≤3 (Table 1).
Results from the LCA are presented in Table 2. The model with two classes fitted statistically significantly better than that with one class. Furthermore, the model with three classes did not replicate the best log-likelihood values, so the more parsimonious two-class model was chosen (Table 2). Figure 1 illustrates the probabilities of runners being classified into the classes. Given the motivation for running and the training variables (volume and frequency) that could be associated with BMI, class 1 was labeled "recreational runners", while class 2 was labeled "amateur runners" (higher training volume and frequency, lower BMI, and "performance" as the motivation for running). Frequencies of the characteristics of the two classes are presented in Table 3. The "recreational runners" class has significantly more female and overweight runners; in addition, runners in this class more often have a monthly income higher than R$2.994, train up to three times/week, have "quality of life" as their motto for running, and run at a pace above 330 s/km. No statistically significant difference in distribution between classes was observed for FoF.
The results of the logistic regression analysis are presented in Table 4. Sex and age were significantly associated with performance (OR=5.39; 95%CI=4.00-7.25; and OR=0.38; 95%CI=0.28-0.51, respectively), so men and young runners are more likely to be classified in the "better performance group" (pace ≤330 s/km) than their peers (women and older runners, respectively). Further, runners classified as "amateur runners" had 4.19 times the odds of higher performance (p<0.001).
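The between-class comparisons reported above rest on a χ² test of a contingency table. A minimal sketch for a 2×2 table is given below, using the Pearson statistic without continuity correction and an exact p-value for one degree of freedom. The counts are hypothetical (they are not taken from Table 3), and WinPepi may apply different options by default.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, chi2 is a squared standard normal, so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows = latent class, columns = a binary characteristic.
stat, p = chi_square_2x2(20, 10, 10, 20)
print(round(stat, 3), round(p, 4))
```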
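A quick consistency check on the reported odds ratios: if the intervals are Wald intervals (symmetric on the log-odds scale, which is an assumption here, although typical of logistic regression output), the point estimate should equal the geometric mean of the confidence limits, and the standard error of the log-odds can be recovered from the interval width. The function name is hypothetical.

```python
import math


def summarize_or(lower: float, upper: float, z: float = 1.96):
    """Recover the OR and the log-scale standard error implied by a Wald 95% CI."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    point = math.exp((log_lower + log_upper) / 2.0)
    std_error = (log_upper - log_lower) / (2.0 * z)
    return point, std_error


# (reported OR, lower limit, upper limit) as given in the Results.
reported = {
    "sex (men)": (5.39, 4.00, 7.25),
    "age": (0.38, 0.28, 0.51),
    "latent class (amateur)": (4.19, 2.95, 5.94),
}
for name, (odds_ratio, lower, upper) in reported.items():
    implied, std_error = summarize_or(lower, upper)
    print(f"{name}: reported {odds_ratio}, implied {implied:.2f}, SE(log OR) {std_error:.3f}")
```

Under that assumption, all three implied estimates agree with the reported odds ratios to within rounding.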

Discussion
The main purpose of this study was to identify the profiles of Brazilian runners and to verify if these profiles were associated with better performance. The latent class analysis allowed the identification of two classes of runners, labeled "recreational runners" and "amateur runners". In the literature, there is no consensus about how to describe runners, given that authors tend to consider a range of different variables. In this context, it is possible to find classifications such as "novice runners", "recreational runners", "competitive runners", "amateur runners", "advanced runners" and "expert runners" 18,25-27 , which differentiate runners using single aspects of their practice. For example, the label "novice runners" can be used to differentiate runners based on their time of practice and running history 28 , but this classification does not take running pace into account, meaning that a "novice runner" could present a better performance than a more expert runner. Given that runners' performance and profile are based on a set of variables, building a classification that reflects all of them is not an easy task.
Based on the results of the present research, the labels "recreational" and "amateur" were used to differentiate runners' profiles based on a set of variables (namely, variables related to their training commitment/motivation, biological aspects, and socioeconomic factors).
Thus, it is possible to propose an explanatory model (Figure 2), where the motivation for the practice seems to act as the "flagship" for runners' practice/commitment, since those who run to improve "performance" are more likely to engage in higher training volume/frequency. This can be reflected in their body composition, leading to a better relationship between body weight and height, which, in turn, can translate into lower energy expenditure to cover the same distance than their heavier peers 29,30 .
Moreover, SES was higher among "recreational runners", reinforcing the idea that motivation can be one of the most important variables to describe runners' profiles, among those considered. Since the 1970s, the "running world" has observed a transition in runners' profile 2 : whereas until then the majority of athletes who competed in races were professionals or were seeking that "status", nowadays most participants in running races do not aim to become professional runners. Their engagement in competitions occurs because the practice has increasingly become a leisure/social activity, frequently performed in groups and focused on strengthening social relationships and health, without any competitive perspective 31,32 .
Because of this, subjects with higher SES tend to participate in running, most of the time, following the idea of "competing to complete", whereas those with lower SES may see running as a means of social ascension (understood not only as professionalization but also as "being known" and gaining social prestige) 33 . Although running has been thought of as a low-cost practice 32 , the "market" around it involves participation in running events and the acquisition of equipment and accessories. Hence, the practice currently does not seem to be accessible to everyone 34 ; although some race events can be called "low-cost", they occur much less frequently.
It is important to highlight that the variable FoF did not differ between the runners' classes. This variable has a strong relationship with negative stress, including worry, anxiety, psychological stress, and a reduced sense of accomplishment 35 , which is strongly observed among professional/elite athletes and has been associated with dropout in sport 36 . Although "amateur runners" were more prone to present higher values for this variable than the "recreational" ones, their involvement in the practice, as well as the demands for results, differ from those observed amongst professional athletes, which can explain the results observed.
The binary logistic regression presented significant values for sex and age, meaning that men and young runners were more likely to have better performance than their peers (women and older runners). Regarding sex, two aspects should be highlighted: 1) differences in performance between sexes are usually associated with physiological, anthropometric, thermoregulatory, and metabolic differences 37 , in association with social influence, given that men are usually more encouraged than women to take part in more intensive training focused on improving performance; and 2) the frequency of men involved in running events is considerably higher than that of women, meaning that there may not be enough data to properly describe/understand sex differences in performance in this modality, which may be related to factors other than physiological ones.
Regarding age, results showed that young runners (≤37 years) had better performance than older ones. Other studies have reported that, in general, running performance seems to improve with age until about 35-40 years for men and about 30-34 years for women 38 , the ages at which the best performances are observed. This is thought to be associated with expertise in the practice (older runners tend to have been engaged in the practice for longer than younger ones) and also with the peak of the physiological traits linked to running performance 38 . However, these results should be analyzed with caution, because most of the sample studied in the present research is concentrated in this age range, which can bias the interpretation of the results.
Regarding the runners' classes, amateur runners had about four times the odds of better performance compared with recreational runners. Most likely, this result is associated with the motivation for the practice, given that runners in this class are more likely to report "performance increases" as the main reason for being involved in the practice. Since motivation, as noted above, can be seen as the "flagship" of training commitment, runners from this class are more likely to present higher training volume (>35 km/week) and frequency (>3 training sessions/week) than the recreational ones, and these characteristics can lead to an improvement in physical capacities, physiological variables 39,40 , and body composition, and consequently in performance 29 . Better performance is certainly not exclusive to amateur runners, but this group is more likely to present this trait given its results for the variables presented above.
One limitation of the study is the use of self-reported anthropometric data, although this strategy has been used in previous research 18,19,41 . On the other hand, because the instrument is cheap and easy to access, it can be used in the daily routine of coaches, and researchers should consider the strategy adopted to spread the questionnaire an efficient approach. Another point to be mentioned is the unequal distribution of the sample across Brazilian states, which did not allow the statistical analyses to be performed by state; nevertheless, to the best of our knowledge, this is the first study to address a country-wide sample rather than a local one.
Two distinct classes of runners were found in Brazil, defined based on training, motivation, biological, and economic characteristics, and these classes are associated with performance. The classes presented can be used in future studies, allowing comparisons between them when discussing runners' classification. Further, the results provided a clear differentiation for the use of the classes "amateur" and "recreational", making clear that, even among non-professional runners, differences are observed in the way they can be classified. In a practical context, it is expected that coaches can differentiate the distinct classes of road runners, taking into account a set of different variables, and prescribe training based on the profile presented by these athletes. For runners aiming at performance, training-related aspects (volume and frequency) and biological characteristics play relevant roles. These variables should therefore be developed from a long-term perspective.

Perspectives
Results of the present study suggest that individual, training, and socioeconomic characteristics play a relevant role in differentiating non-elite runners into "amateur" and "recreational" ones, and this classification seems to be related to their performance. Taking this into account, an explanatory model was proposed to clarify the differences between classes, highlighting the conceptual idea behind the use of these classes in sports science, as well as by non-elite runners, their coaches, and even sports event organizers. Notwithstanding the implications/applications of the suggested nomenclature, future studies must be conducted to corroborate the classes found, or even to suggest new approaches.