Fatigue at Work: Scale Validation with Airline Pilots

In the organizational context, the study of occupational stress encompasses constructs of fatigue at work. Within the air transportation sector, fatigue at work is a potential issue influencing both safety and occupational stress. The objective of the present study was to perform a convergent-discriminant validity analysis of the Feeling of Fatigue scale in the area of Administration. Data from an observational cross-sectional study involving a sample of 1,066 airline pilots were analyzed using quantitative modeling. Confirmatory factor analysis with the structural equations model was performed to determine the validity of a Portuguese version of the Feeling of Fatigue scale in the organizational context of civil aviation. This study fills a gap in the literature on occupational stress in Administration, highlighting the relevance of research on fatigue at work. The results confirmed the validity of a Portuguese version of a mature scale for subjective assessment of fatigue in Administration, thereby contributing to fatigue management in organizational settings.


INTRODUCTION
Most research in Administration addresses stress and burnout, with the latter defined as a psychophysiological state of occupational exhaustion and incapacity to work (Monteiro, Pereira, Daniel, Silva, & Matos, 2017; Vasconcelos, Vasconcelos, & Crubellate, 2008). However, fatigue at work has received far less research attention in Administration journals, as evidenced by the dearth of studies found by the authors in a review of the relevant literature.
In the health sector, the impact of the recent pandemic on health workers has highlighted the need for further research investigating fatigue at work and burnout (Sasangohar, Jones, Masud, Vahidy, & Kash, 2020) to better identify and study these related (yet different) constructs. Another organizational area concerned with fatigue at work is transportation, particularly the air transport sector. In this sector, fatigue is a potential issue in terms of both safety and occupational stress, largely in relation to the inherent intense work schedules (Drongelen, Boot, Hlobil, Beek, & Smid, 2017).
A considerable proportion of workers (pilots) regularly report fatigue. This is partly the result of long irregular working days, crossing of time zones, and insufficient sleep opportunities (Drongelen et al., 2017). Research on the effects of shift work has focused mainly on physiological, psychosocial, and sleep health. However, few investigations have evaluated shift workers' personal experiences (Matheson, O'Brien, & Reid, 2014). There is a particular need for more studies measuring the phenomenon of feeling of fatigue at work.
Feeling of fatigue can be a direct result of overexertion to achieve task objectives and assure performance levels during periods of higher workload. The feeling of fatigue has properties resembling a generalized background emotion, incorporating characteristics of other basic emotions (Hockey, 2013). Fatigue at work has been assessed using a variety of instruments (Gawron, 2016; Sagherian & Brown, 2016; Winwood, Winefield, Dawson, & Lushington, 2005). Among these, the Feeling of Fatigue scale (Yoshitake, 1971) is widely used for subjectively measuring this emotion (Matthews, Desmond, Neubauer, & Hancock, 2018).
Against this background, the objective of the present work was to perform a convergent-discriminant validity analysis of the Portuguese version of the Feeling of Fatigue scale among airline pilots. This study contributes by filling a gap in the field of fatigue in the workplace, emphasizing its relevance for Administration research and demonstrating the validity of a Portuguese version of the Feeling of Fatigue scale. There are few subjective instruments available for assessing fatigue in professions where it poses a major challenge, both in terms of the health of these professionals and enhancement of operational safety (Gander, Mangie, Phillips, Santos-Fernandez, & Wu, 2018; Morris, Wiedbusch, & Gunzelmann, 2018; Zaslona, O'Keeffe, Signal, & Gander, 2018), aspects also addressed by the present study.

Fatigue at work
Societal transformations in the workplace have led to studies on pleasure and mental suffering, together with their causes and consequences for work performance (Silva et al., 2015). Among the approaches reported in the literature, occupational stress considers that people have an ability to confront stimuli in an intermediate state between health and disease (Silva et al., 2015), requiring actions for individual and collective mental health management. A study of the occupational stress resulting from effects of different organizational variables showed that support from managers and colleagues at work (Monteiro et al., 2017) was more important than human resources services or the organizational culture. The mainstream belief holds that occupational stress is manageable by the organization and adaptable to the environment in which it operates. Another study explored the relationship between organizational stress-inducing practices and employee responses/performance, concluding that "stress in organizations is as complex as the level of stress in society: it will depend on the control of stress levels coming from society" (Vasconcelos et al., 2008, p. 48). Instead of serving as a management tool to induce behaviors, occupational stress can result in unforeseen organizational consequences, including risk of fatigue at work.
In a review of a century of research on occupational stress, the authors anticipated a future trend in which theory and research continue to develop toward gathering evidence for causal inference, through greater integration of psychophysiological data and work-life models (Bliese, Edwards, & Sonnentag, 2017). Thus, this field of study should continue to seek theory and research that support applied knowledge in order to assist organizations in managing current and future stressors that may emerge in the next 100 years. Despite the importance of the theme, a bibliographical review of the occupational stress literature published from 2010 to 2014, which examined whether, how, and where the topic continues to be investigated, found few papers in major Brazilian Administration journals (Ferreira, Reis, Kilimnik, & Santos, 2016). Occupational stress is important in Administration given its impact on health and well-being at work, which, in turn, can negatively affect performance, increase costs, and reduce the effectiveness of organizations.
The term 'occupational stress' has been employed in the literature with various meanings (Hancock & Desmond, 2001; Paschoal & Tamayo, 2004). Within the broader concept of occupational stress, jobs in some sectors, such as transportation, medicine, and energy, still face the challenge of how to deal with occupational safety, particularly fatigue at work. The key issue tends to center on defining optimum conditions in which humans and technology can work together safely and sustainably (Nunes & Cabon, 2015). Another bibliometric study reviewed 100 years of research in occupational safety, showing how the field evolved from basic protections and job analysis to a systemic and multi-level view of safety and risk (Hofmann, Burke, & Zohar, 2017). The study concluded that, although much progress has been made, too many injuries, fatalities, and cases of occupational diseases still occur in the workplace. Thus, there is still much to be researched.
It is noteworthy that the concepts of fatigue and stress, due to a long history of use (in science, work, and by the general public), are often reported in the workplace as if their meaning is clear, overlooking the complexity involved (Sonnentag & Frese, 2003; Tepas & Price, 2001). Research has not only found differences, but also shown that fatigue and stress are multidimensional constructs that interact. Fatigue and stress states can occur simultaneously and are difficult to distinguish, but should not be considered synonymous (Gaillard, 2001; Glendon, Clarke, & McKenna, 2006). The ISO 10075-1 standard (Ergonomic principles related to mental workload: General terms and definitions) proposes the standardization of definitions related to occupational stress. For the purpose of this study, the definition for fatigue proposed below was adopted (ISO, 2017):
Fatigue (Mental): temporary impairment of mental and physical functional efficiency, depending on the intensity, duration, and temporal pattern of the preceding mental strain. Recovery from mental fatigue is achieved by rest rather than changes in activity. This reduced functional efficiency becomes apparent in feelings of tiredness, less favorable relationships between performance and effort, type and frequency of errors. The extent of this impairment is also determined by individual preconditions (online).
Fatigue should not be reduced to a single dimension, given that it entails aspects that are multidimensional, dynamically interdependent, and not fully correlated (Phillips, 2015). To study fatigue at work, from a systemic theoretical perspective, psychophysiological data must be collected to determine the boundary conditions for the lives of the individual and/or group, by modeling the complexity of relationships between constructs such as cognition, emotion, and action, which can be treated as subsystems. Thus, this system can be analyzed in terms of multiple physiological, neuropsychological, and socio-political aspects. Finally, the literature recommends that convergent-discriminant validation should be sought, based on models for analyzing the effect of fatigue precursors, such as stressors at work (Melan & Cascino, 2014).

Fatigue in the aviation work environment
Worker (pilot) fatigue is a significant problem in modern aviation operations, mainly due to work shifts, variable journeys, desynchronization of circadian rhythm, and insufficient sleep, factors that are prevalent in both civil and military flight operations. The negative effect of fatigue has proven a contributory factor for errors and accidents (Caldwell et al., 2009). Within aviation and other safety-critical fields, such as transportation, medicine, and energy, fatigue risk management systems (FRMS) represent a novel regulatory approach that combines advances in understanding of worker fatigue and factors that contribute to accidents, as well as advancements in safety management (Gander et al., 2011). FRMS work on the basis of data and the combination of scientific and operational knowledge, including processes for monitoring safety performance and for continuous improvement.
Prescriptive limits on working hours are familiar to shift workers, but these are more suited to circumstances of low-risk safety-related fatigue. However, economic needs have placed pressure on a society with 24/7 shift workers requiring more customized and flexible approaches to fatigue management, such as FRMS (Gander, 2015). Although the implementation of FRMS is growing in aviation, there is still little consensus on which constructs and associated safety performance indicators should be measured (Gander et al., 2014). Initiatives are scarce in both academia and industry, with insufficient results to draw any meaningful conclusions about a safe or unsafe condition from the indicators measured. Thus, a relative comparison of indicators, analyzed based on different operational contexts, is necessary to allow compilation of a database on psychobiological and operational factors and foster cooperation in the global effort to standardize acceptable indicators.
In order to manage fatigue responsibly, decisions cannot be based on a single measurement or sole technology to determine an absolute safety value. Human fatigue risk management systems should instead adopt a comprehensive approach (Mallis & James, 2012). Evidence-based nonprescriptive approaches to fatigue management are needed in aeronautical operations (Mallis, Banks, & Dinges, 2010). Therefore, an FRMS must be multi-layered and utilize multiple risk identification methods and risk reduction controls (Gander et al., 2011; Lerman et al., 2012). There is growing evidence that subjective assessments can serve as an effective, efficient, and cost-effective tool in managing fatigue-related risk. Such assessments, however, should be based on a validated instrument and always be used as part of a more comprehensive FRMS (Smith, Browne, Armstrong, & Ferguson, 2016).
The reliable use of subjective assessments in FRMS depends on a just culture, where individuals are encouraged and supported in reporting fatigue and elevated impairment (Darwent, Dawson, Paterson, Roach, & Ferguson, 2015). Within a system such as FRMS, all stakeholders should be made aware of contributing factors that might affect their performance, through a system design able to capture and utilize the information from these reports. A recent systematic literature review (Bendak & Rashid, 2020) concluded that risk associated with fatigue in aviation is diverse and ambiguous in nature. This study also revealed that many aspects related to this risk have not yet been fully investigated, and therefore further research identifying mitigation strategies for this risk is warranted.

Subjective measures of fatigue
The use of subjective measures of fatigue has been restricted mainly to laboratory-based methodologies (Smith et al., 2016), producing satisfactory results. A simulated field study showed that, at a group level, subjective assessments of fatigue correlated with objective performance, but that subjects' ability to predict performance varied significantly, both across conditions and between individuals (Smith et al., 2016). As expected, variation in fatigue tolerance was identified (Van Dongen, Maislin, & Dinges, 2004). In particular, individuals with higher objective performance were worse at predicting their performance than those with lower objective performance. Two possible explanations have been proposed (Smith et al., 2016), whereby either weak correlations between objective and subjective assessments occur due to the range of the objective performance measure (less variability due to fatigue) or some individuals have an optimism bias and underestimate their impairment. The fact that individuals with some degree of tolerance to fatigue may be more confident in their resilience has major implications for fatigue risk management. This should be expected because such individuals may be unaware of their performance decline and may be unwilling to admit any fallibility due to professional and social pressures (Smith et al., 2016).
Fatigue at work has been assessed using a range of instruments (Gawron, 2016; Sagherian & Brown, 2016; Winwood et al., 2005). Although internationally there are few validated scales available, "it is clear that there is no gold standard for fatigue assessment" (Aghdam, Alizadeh, Rasoulzadeh, & Safaiyan, 2019). Of the fatigue assessment instruments available, the Feeling of Fatigue scale (Yoshitake, 1971) stands out for the subjective measurement of this emotion. The scale was developed in 1969 by the Research Committee on Industrial Fatigue of the Japan Society of Occupational Health. It has since been applied to workers from a variety of sectors and countries (Chang, Sun, Chuang, & Hsu, 2009). Unfortunately, most publications on the Feeling of Fatigue scale, including its original validation (Saito, 1982), were published in Japanese only.
The Feeling of Fatigue scale (Yoshitake, 1971) consists of a checklist of 30 items that explore the presence of symptoms, classified into three groups of fatigue symptoms (Yoshitake, 1978): (a) drowsiness and dullness, (b) lack of ability to concentrate, and (c) projection of physical discomfort. Generally, the higher the number of symptoms, the greater the feeling of fatigue. Both A and C symptom sets are physical, with A 'general' and C 'specific (sensory and neuronal).' B symptoms are purely mental. Of the A, B, and C symptoms, the strongest correlation with feeling of fatigue is found for B. Because these symptoms do not exist independently and are mutually related, a multifactorial construct was originally proposed.
In the first study, involving 170 office workers (Yoshitake, 1971), each symptom was evaluated on a Likert scale for the presence or absence of the symptom, and not only with 'yes' or 'no' answers, as was implemented in a subsequent study. The latter study confirmed the three-factor Feeling of Fatigue scale through a comprehensive field study assessing subjective symptoms of fatigue at work in 17,789 workers on 250 occasions (Yoshitake, 1978). The labor activities evaluated included both physical (in several industries) and mental (pilots, train drivers, drivers, factory operators, at offices, researchers) work during different shifts (day, night, and shift work).
The internal structure of the three-factor Feeling of Fatigue scale was validated originally in Japan (Saito, 1982) among railway workers. The workers were assessed before and after work shifts for different schedules. Results showed that B symptoms were also associated with motivation. The content of the Feeling of Fatigue scale has been validated for use in Latin America (Almirall & Reyes, 1982), where it has been consistently applied (Barrientos-Gutierrez, Martinez-Alcantara, & Mendez-Ramirez, 2004; Parody, Viloria, Hernandez, Niño, & Cervera, 2020). In the 1990s, Prof. Dr. Frida Fischer translated the Feeling of Fatigue scale into Brazilian Portuguese as part of her habilitation thesis (Privatdozent German Degree) (Fischer, 1990). Although the version was not formally validated, it has since been used in Brazil for several studies on fatigue at work (Metzner & Fischer, 2001; Metzner, Fischer, & Nogueira, 2008; Vasconcelos, Fischer, Reis, & Moreno, 2011).

Instrument
The Feeling of Fatigue scale (Yoshitake, 1971) is composed of three constructs (latent variables), each with 10 items measuring the presence of fatigue symptoms: FFA01-10 (drowsiness and dullness), FFB11-20 (lack of ability to concentrate), and FFC21-30 (projection of physical discomfort). The scores of the three latent variables (A, B, and C) are referred to as feelings of fatigue A, B, and C, denoted FFA, FFB, and FFC, respectively. The overall score of the 30 items is denoted FFS. Figure 1 depicts the instrument structure. Table 1 shows the instrument together with a proposed symptoms checklist in English (Yoshitake, 1971) and the translated version in Portuguese (Fischer, 1990) used in Brazil since 1990 by several studies, as outlined in section 'Subjective measures of fatigue'. Each indicator is assessed on a Likert scale with values ranging from 1 to 5, where respondents answer the question 'how often do you present the following symptoms?' by choosing one of the following alternatives: 'never,' 'rarely,' 'sometimes,' 'many times,' or 'always.'
Note. Developed by the authors based on Yoshitake, H. (1971). Relations between the symptoms and the feeling of fatigue. Ergonomics, 14 (1)
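The scoring structure described above can be illustrated with a short Python sketch that derives FFA, FFB, FFC, and FFS from 30 coded responses. The text does not state whether scores are sums or means, so plain summation is assumed here; the function name `score_feeling_of_fatigue` and the example respondent are hypothetical.

```python
from typing import Dict, List

# Response alternatives from the instrument, coded 1 to 5.
RESPONSES = {"never": 1, "rarely": 2, "sometimes": 3, "many times": 4, "always": 5}


def score_feeling_of_fatigue(items: List[int]) -> Dict[str, int]:
    """Return FFA, FFB, FFC and FFS for one respondent.

    `items` holds the 30 coded answers in questionnaire order.
    Assumption (not stated in the text): each score is the sum of its items.
    """
    if len(items) != 30:
        raise ValueError("expected 30 item responses")
    if any(v not in range(1, 6) for v in items):
        raise ValueError("responses must be coded 1 to 5")
    ffa = sum(items[0:10])   # items 01-10: drowsiness and dullness
    ffb = sum(items[10:20])  # items 11-20: lack of ability to concentrate
    ffc = sum(items[20:30])  # items 21-30: projection of physical discomfort
    return {"FFA": ffa, "FFB": ffb, "FFC": ffc, "FFS": ffa + ffb + ffc}


# Hypothetical respondent: 'sometimes' on A items, 'rarely' on B, 'never' on C.
answers = (
    [RESPONSES["sometimes"]] * 10
    + [RESPONSES["rarely"]] * 10
    + [RESPONSES["never"]] * 10
)
scores = score_feeling_of_fatigue(answers)
print(scores)
```

If mean scores are preferred, each sum can simply be divided by the number of items in the corresponding set.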

Study design
Adopting a deductive epistemological approach drawing on the theoretical background presented, we conducted a quantitative modeling study (Cauchik-Miguel et al., 2018; Creswell, 2014) to test the convergent-discriminant validity of the Feeling of Fatigue scale. Data were collected in a cross-sectional observational study (Breakwell, Smith, & Wright, 2012; Fontelles, Simões, Farias, & Fontelles, 2009) carried out in a Brazilian sample as part of a larger study, Chronic fatigue, working conditions, and health of Brazilian pilots (Marqueze, Diniz, & Nicola, 2014).

Study population and sample
The target population of the larger study comprised 2,350 regular aviation pilots, members of the Brazilian Association of Civil Aviation Pilots (Abrapac). Of this total, 1,234 answered an online questionnaire, representing 52.5% of the study population. Initially, the sample size was calculated (G*Power) to meet the objectives of the larger study Chronic fatigue, working conditions, and health of Brazilian pilots (Marqueze et al., 2014), in which the primary outcome was fatigue and sample power was 99%. Of the overall total of 1,234 pilots, most participants (97.1%) were male and average age of the pilots was 39.1 years (SD = 9.8 years). Most of the respondents were captains (57.9%), and the others were co-pilots/first officers (42.1%). In terms of pilots' personal profile, 84.3% had a marital partner and 61.3% did not have children younger than 12 years. The average number of persons who contributed to family income was 1.6 (SD = 0.7). Most pilots (82.4%) were attending or had already completed college education. Of the sample, 53.7% did not reside near their primary work base, requiring long commutes between residence and base.
The professional environment reported indicated that mean time practicing as a pilot was 15.2 years (SD = 10.1 years) and mean time engaged with the current airline was 5.8 years (SD = 4.8 years). The type of time off varied among pilots, but 27.6% usually had a single day off per week. A high percentage of pilots reported frequent or constant delays due to operational, maintenance, and dispatch issues (40.7%). Most pilots (91.2%) were predominantly flying domestically with basic crews. Pilots flew for an average of 65 hours monthly. The work shifts of almost all pilots (94.1%) were irregular and involved night shifts (from 10 p.m. to 5 a.m.). Working hours were longest during the day shift (typically with early starts before 6 a.m.), followed by the afternoon shift (with late finishes after 10 p.m.) and night shifts (usually starting before 10 p.m.). Finally, regarding working conditions potentially associated with increased fatigue, main factors reported by pilots were long working hours, number of flying hours, short rest periods between work shifts, and working night shifts (Marqueze, Nicola, Diniz, & Fischer, 2017).
After the application of inclusion and exclusion criteria, 1,066 pilots remained in the present study sample, representing a large proportion of the overall pilot population in Brazil. The effort involved in achieving this sample size was considerable, as this population is usually averse to research surveys. Similar, more recent, attempts have failed to enroll more than a few dozen respondents. As operational conditions have not changed greatly since 2014, these data remain valid for the analysis performed.
Pilots actively working and flying with airlines at the time of the study, of both sexes, who were members of the Abrapac, were invited to participate in the study. Executive aviation, cargo, and air taxi pilots were excluded. Respondents with missing data on the Fatigue Scale were also excluded. A total of 168 cases with missing data (13.6%) were excluded.

Data collection
After confirmation of the adequacy of the questionnaire via pilot testing conducted with Abrapac's Board of Directors (Brazilian aviation captains or co-pilots), invitations were sent out for participation in the study. Data were collected using a free online questionnaire tool, from December 2013 to March 2014. To avoid duplicate responses, individual emails were sent out. Questionnaire completion time was around 40-60 minutes. The data collection instrument contained questions gathering information on sociodemographics, work, health, lifestyle, and sleep variables used in the present study. The study evaluated the Feeling of Fatigue scale (Yoshitake, 1971) and sociodemographic variables (age, sex, and job position) as multiple groups for cross-validation. Ethical aspects related to research involving humans were duly observed and all participants signed a Consent Form (Resolution 466/12 of the National Health Council). The study was supported by the Abrapac and approved by the Ethics Committee of the Federal Institute of Education, Science, and Technology of São Paulo (Opinion No. 625.158 / CET-IFSP).
Respondents whose standardized fatigue score lay more than three standard deviations from the mean (Z < -3 or Z > 3), i.e., outliers, were first identified (Cousineau & Chartier, 2010; Hair et al., 2009). The latent variables FFA, FFB, and FFC were incorporated in a recursive reflective measurement model (Bido, 2019; Gana & Broc, 2018; Hair et al., 2009) with multiple groups represented by the sociodemographic variables assessed (age group, sex, job position). The measurement model with the lavaan code (Bido, 2019; Gana & Broc, 2018; Rosseel, 2012) is shown in Table 2. The symbol "=~" denotes a reflective model, where the exogenous latent variables on the left explain the variances of the endogenous variables (indicators) on the right side of the measurement model equations (Bido, 2019).
Convergent validity is obtained when indicators of a construct converge and share a large proportion of common variance (Hair et al., 2009). The first indicator of convergent validity is the size of the standardized factor loadings (ideally > 0.5) obtained by confirmatory factor analysis (CFA) using structural equation modeling (SEM). Another metric is the average variance extracted (AVE; preferably > 0.5). Finally, reliability measures > 0.6 should be attained for convergent validity (Hair et al., 2009).
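The convergent validity checks listed above can be sketched in Python as follows. This is a minimal illustration rather than the procedure used in the study: AVE is taken as the mean squared standardized loading, and composite reliability uses the conventional loading-based formula, which is an assumption here (the study itself reports reliability based on coefficient alpha). The function name and the example loadings are hypothetical.

```python
from typing import Dict, List


def convergent_validity(loadings: List[float]) -> Dict[str, object]:
    """Summarize convergent validity for one construct.

    `loadings` are standardized factor loadings of the construct's indicators.
    """
    squared = [lam * lam for lam in loadings]
    ave = sum(squared) / len(loadings)           # average variance extracted
    sum_lam = sum(loadings)
    error_var = sum(1.0 - s for s in squared)    # standardized error variances
    # Conventional composite reliability (assumed formula, see lead-in).
    cr = sum_lam ** 2 / (sum_lam ** 2 + error_var)
    return {
        "AVE": ave,
        "CR": cr,
        # 1-based positions of indicators at or below the 0.5 loading guideline.
        "weak_items": [i + 1 for i, lam in enumerate(loadings) if lam <= 0.5],
        "ave_ok": ave > 0.5,
        "cr_ok": cr > 0.6,
    }


# Hypothetical loadings for a four-indicator construct (not study results).
example = [0.8, 0.7, 0.6, 0.4]
result = convergent_validity(example)
print(result)
```

In this made-up example the reliability guideline is met while AVE falls short of 0.5, which shows why the criteria are reported side by side rather than collapsed into a single verdict.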
When comparing different instruments for discriminant validity, each construct should be unique and capture phenomena not measured by the others (Hair et al., 2009). Discriminant validity can be assessed by two criteria: Fornell-Larcker (Fornell & Larcker, 1981) or Heterotrait-Monotrait (HTMT) ratio of correlation (Henseler, Ringle, & Sarstedt, 2015). The Fornell-Larcker criterion is based on the comparison of the square of the correlations between one construct and all others and the average variance extracted (AVE) by the construct. The HTMT criterion evaluates the ratio of the correlation between two constructs to the square root of the product of the reliability of the two latent variables. A cut-off of 0.85 is proposed in the literature for HTMT (Voorhees, Brady, Calantone, & Ramirez, 2016), below which discriminant validity is shown. For discriminant validity, indicators should be related to a single latent variable without cross-loadings (Hair et al., 2009). In the present study, discriminant validity was assessed based on these two criteria for the three latent variables comprising the multifactorial Feeling of Fatigue scale (FFA, FFB, and FFC). Correlations were measured and comparisons against AVE and reliability were analyzed for each pair. Given that all three latent variables measure aspects of the feeling of fatigue (Yoshitake, 1971), discriminant validity is expected to be rejected.
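As a rough sketch of the two criteria just described, the Python functions below apply the Fornell-Larcker comparison and the reliability-based ratio exactly as worded in this paragraph. The item-level HTMT computation of Henseler et al. (2015) may differ from this simplified form. Function names and all input values are hypothetical and are not results of the study.

```python
import math
from typing import Dict, Tuple

Pair = Tuple[str, str]


def fornell_larcker(ave: Dict[str, float], corr: Dict[Pair, float]) -> Dict[Pair, bool]:
    """True for a pair when its squared correlation is below both AVEs."""
    return {(a, b): r * r < min(ave[a], ave[b]) for (a, b), r in corr.items()}


def reliability_ratio(rel: Dict[str, float], corr: Dict[Pair, float]) -> Dict[Pair, float]:
    """Correlation divided by the square root of the product of reliabilities."""
    return {(a, b): abs(r) / math.sqrt(rel[a] * rel[b]) for (a, b), r in corr.items()}


# Hypothetical inputs for the three latent variables.
ave = {"FFA": 0.40, "FFB": 0.45, "FFC": 0.35}
rel = {"FFA": 0.85, "FFB": 0.88, "FFC": 0.82}
corr = {("FFA", "FFB"): 0.80, ("FFA", "FFC"): 0.75, ("FFB", "FFC"): 0.70}

fl = fornell_larcker(ave, corr)
ratios = reliability_ratio(rel, corr)
for pair in corr:
    # A pair supports discriminant validity only if both checks pass.
    print(pair, "Fornell-Larcker met:", fl[pair], "ratio below 0.85:", ratios[pair] < 0.85)
```

With strongly correlated constructs, as expected for three facets of the same feeling of fatigue, at least one of the checks fails for every pair in this example.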
In this study, group comparisons were also performed to show cross-validation. Loose cross-validation and loadings equivalence procedures (Hair et al., 2009) were applied. The former procedure verified that loadings, correlations between latent variables, and error variances (uniqueness) were similar for both groups. For the latter, non-standardized loadings were forced to be equal in both groups for model estimation and the difference in the chi-square statistic was evaluated. Other measures of fit were also determined.
In accordance with literature guidelines (Graham, 2006; Raykov, 1997a, 1997b), congeneric, tau-equivalent, and parallel measurement models were also considered to assess for adequate compromise among internal consistency, reliability, and parsimony. Of the models evaluated, this study sought to validate the first-order trifactorial model as that which best expressed the internal structure of the Feeling of Fatigue scale. To evaluate the fit quality of the measurement model, based on literature guidelines (Hooper, Coughlan, & Mullen, 2008), the chi-square statistics, degrees of freedom and p value, comparative fit index (CFI), root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), and the Akaike information criterion (AIC) were employed to assess parsimony between alternative models. These indices were selected over others because they have proven less sensitive to sample size, erroneous model specification, and parameter estimation (Hooper et al., 2008).
Recommended cut-off values for these quality indicators can be found in the literature. In this study, the recommendations proposed by Hair et al. (2009) were adopted. For very large samples and many variables, chi-square statistics are expected to yield significant values, rejecting the null hypothesis that the estimated model reproduces the observed covariance matrix (Hair et al., 2009). Therefore, the failure of the chi-square test to support model fit is of little relevance, especially when the analysis needs to consider a distribution that violates the normality assumption, which is usual and acceptable for a Likert scale (Norman, 2010). CFI values > 0.92 are sought, while RMSEA and SRMR should be < 0.08 (Hair et al., 2009).
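To make the cut-offs concrete, the following Python sketch computes RMSEA and CFI from chi-square values using common textbook formulas and compares the indices with the thresholds cited above. The formulas are an assumption about how the indices were obtained (for instance, some software uses N rather than N - 1 in the RMSEA denominator), and all numeric inputs are invented for illustration.

```python
import math
from typing import Dict


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation (N - 1 variant)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2: float, df: int, chi2_base: float, df_base: int) -> float:
    """Comparative fit index relative to the baseline (independence) model."""
    d_model = max(chi2 - df, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def meets_cutoffs(cfi_value: float, rmsea_value: float, srmr_value: float) -> Dict[str, bool]:
    """Compare fit indices with the thresholds adopted in the text."""
    return {
        "CFI": cfi_value > 0.92,
        "RMSEA": rmsea_value < 0.08,
        "SRMR": srmr_value < 0.08,
    }


# Invented values for illustration only (not the study's results).
r = rmsea(chi2=1200.0, df=400, n=1048)
c = cfi(chi2=1200.0, df=400, chi2_base=12000.0, df_base=435)
checks = meets_cutoffs(c, r, srmr_value=0.045)
print(round(r, 3), round(c, 3), checks)
```

The example also shows how a model can satisfy all three thresholds even when the chi-square value is several times its degrees of freedom, as is typical in large samples.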

Convergent validity of internal structure of constructs
The preliminary step was the removal of outliers. As the lowest standardized fatigue score was -2, only participants with Z > 3 were excluded. Although only six outlier scores (observations 347, 484, 681, 704, 890, 1057) existed for the aggregated Feeling of Fatigue score (FFS), a total of 18 observations were excluded, including additional outliers for FFA (547, 775), FFB (50, 306, 921), and FFC (192, 228, 301, 597, 694, 708, 905).
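The outlier rule can be sketched in Python as below. This is a minimal illustration assuming the sample standard deviation is used for standardization (the text does not specify); the function name `flag_outliers` and the example scores are hypothetical. The final lines check the exclusion counts reported above.

```python
from statistics import mean, stdev
from typing import List


def flag_outliers(scores: List[float], cutoff: float = 3.0) -> List[int]:
    """Return 1-based observation numbers whose |Z| exceeds the cutoff."""
    m = mean(scores)
    s = stdev(scores)  # sample standard deviation (assumption)
    return [i + 1 for i, x in enumerate(scores) if abs((x - m) / s) > cutoff]


# Hypothetical scores: 30 typical respondents plus one extreme value.
scores = [50 + (i % 5) for i in range(30)] + [120]
flagged = flag_outliers(scores)
print(flagged)

# Exclusions reported in the text: 6 (FFS) + 2 (FFA) + 3 (FFB) + 7 (FFC).
excluded = 6 + 2 + 3 + 7
remaining = 1066 - excluded
print(excluded, remaining)
```

In the study the same rule would be applied separately to FFS, FFA, FFB, and FFC, with the union of flagged observations removed.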
The relevant variables for the final sample of 1,048 pilots are described in Table 3, with the analysis of variance indicating significant mean differences between groups (p-value < 0.05).
The next stage entailed the assessment of the measurement model. As initial assessments showed inadequate fit measures, modification indices were introduced by adding correlations between indicators that were related only to the same latent variable in order to avoid cross-loadings. Although some indicators with poor loadings could have been eliminated in order to improve the model, this approach was rejected for two main reasons. First, the validated version of the scale contained all indicators. Second, the Bartlett sphericity test and Kaiser-Meyer-Olkin test for sample adequacy (Hair et al., 2009) showed that the three scales with all indicators had acceptable values. While modification indices should be avoided in general, they can be useful for representing known effects, such as the correlation between the errors in the measurement process or data collection (Hair et al., 2009), a frequent phenomenon in psychometric measures such as the Feeling of Fatigue scale. Thus, 40 correlations were added to the measurement model, based on modification indices above an improvement cut-off of 20 (0.5%) on the chi-square statistics exhibiting significance (p < 0.05) (Table 4; in lavaan, the operator "~~" denotes a correlation between the error terms of the measured indicators).
Table 5 compares fit measures and reliability before and after inclusion of modification indices. Adequate fit measures and reliability were obtained for the measurement model. Although the chi-square statistics remained non-significant, the other quality indices (CFI, RMSEA, SRMR) presented satisfactory results. The model incorporating modification indices had the best parsimony (AIC). In addition, the coefficient alpha values confirmed the reliability and internal consistency of the instrument. Therefore, these results confirm the convergent validation of the internal structure.
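For readers less familiar with coefficient alpha, the following Python sketch shows the standard computation from an item response matrix. It is an illustration only; the function name and the small data set are hypothetical and unrelated to the study data.

```python
from statistics import variance
from typing import List


def cronbach_alpha(data: List[List[float]]) -> float:
    """Coefficient alpha for a matrix with respondents in rows and items in columns."""
    k = len(data[0])  # number of items
    item_vars = [variance([row[j] for row in data]) for j in range(k)]
    total_var = variance([sum(row) for row in data])  # variance of sum scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five people to three items coded 1 to 5.
data = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]
alpha = cronbach_alpha(data)
print(round(alpha, 3))
```

Alpha is an appropriate reliability estimate when items are at least tau-equivalent, which is consistent with the measurement model assessed in this study.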
Finally, cross-validation between groups with the final measurement model was performed only for groups with significant differences on mean comparisons (Table 3). The following models were evaluated: Model 1 (all available data), Model 2 (age group comparison), and Model 3 (job group comparison). First, a loose cross-validation (Hair et al., 2009) was applied in order to check measurement invariance of latent variable correlations, loadings, and error variances. Table 6 shows similar correlations between latent variables for Models 1, 2, and 3. Tables 7, 8, and 9 show comparisons of loadings and error variances (uniqueness) for latent variables (FFA, FFB, and FFC), respectively. In Table 7, similar loadings result in similar average variance extracted (AVE), representing the sum of communalities divided by 10. Error variances (uniqueness) were also similar. In Table 8, similar loadings and error variances were also obtained. As results proved similar in Table 9, a loose cross-validation was shown.
For the loadings equivalence procedure, Models 2 and 3 were compared, with free estimation and non-standardized loadings set to be equal in the respective models. The results in Table 10 also confirm cross-validation by this alternative procedure. Therefore, convergent validity was confirmed for the first-order trifactorial tau-equivalent measurement model of the Feeling of Fatigue scale.

Discriminant validity of constructs
The final stage involved the assessment of discriminant validity among the three constructs (FFA, FFB, and FFC). Three quantities were measured: composite reliability (CR), derived from coefficient alpha; average variance extracted (AVE); and the correlations among the three constructs. These were used to evaluate the Fornell-Larcker criterion (Fornell & Larcker, 1981) and to compare the Heterotrait-Monotrait (HTMT) criterion (Voorhees et al., 2016) against the cut-off of 0.85 (Table 11). For discriminant validity according to the Fornell-Larcker criterion, the square root of AVE on the diagonal of the correlation matrix must be larger than the off-diagonal terms. Alternatively, using the HTMT criterion, the calculated ratios between each correlation and the square root of the product of the corresponding reliability terms must be less than 0.85. Neither criterion was met. Although it was ensured that indicators were related to a single latent variable and there were no cross-loadings, discriminant validity was rejected. This outcome confirms that the three constructs are interrelated for measuring feeling of fatigue (Yoshitake, 1971).
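
The two checks described above can be sketched as follows. The AVE, reliability, and correlation values are hypothetical and are not those of Table 11. The second function follows the ratio as worded in the text (correlation divided by the square root of the product of reliabilities); it is an assumption that this matches the authors' calculation, and it is not the item-level HTMT computation found in some software packages.

```python
import math


def fornell_larcker_holds(ave, corr):
    """True if sqrt(AVE) of both constructs exceeds every correlation between them."""
    for (a, b), r in corr.items():
        if math.sqrt(ave[a]) <= abs(r) or math.sqrt(ave[b]) <= abs(r):
            return False
    return True


def reliability_ratios(reliability, corr):
    """Ratio from the text: |correlation| / sqrt(reliability_a * reliability_b)."""
    return {
        (a, b): abs(r) / math.sqrt(reliability[a] * reliability[b])
        for (a, b), r in corr.items()
    }


# Hypothetical inputs for the three constructs (illustrative values only).
ave = {"FFA": 0.40, "FFB": 0.45, "FFC": 0.35}
reliability = {"FFA": 0.85, "FFB": 0.88, "FFC": 0.80}
corr = {("FFA", "FFB"): 0.80, ("FFA", "FFC"): 0.75, ("FFB", "FFC"): 0.70}

CUTOFF = 0.85  # cut-off stated in the text
ratios = reliability_ratios(reliability, corr)
ratio_criterion_holds = all(value < CUTOFF for value in ratios.values())

print("Fornell-Larcker criterion met:", fornell_larcker_holds(ave, corr))
print("Ratio criterion met:", ratio_criterion_holds)
```

With these illustrative inputs both checks fail, mirroring the pattern reported in the text: strongly correlated constructs can pass convergent validation while failing discriminant validation.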

DISCUSSION
The measurement of feeling of fatigue at work is particularly important in the case of shift workers, such as civil aviation pilots, the target population of the present study. As outlined earlier, fatigue is a major issue affecting the occupational health and safety of pilots, exacerbated by a number of factors inherent to the profession. In this study, a convergent-discriminant validation analysis of a Portuguese version of the Feeling of Fatigue scale was performed in the organizational context of civil aviation. As recommended in the literature (Hutz et al., 2015), construct validity of the Feeling of Fatigue scale was evaluated by confirmatory factor analysis and analysis of internal consistency, with the latter approach based on reliability measures.
Different procedures were applied to confirm convergent validity, including group comparisons for cross-validation. No comparison was performed for sex, because no significant gender differences in fatigue scores were found for the three latent variables. Discriminant validity was rejected by two alternative methods, confirming that the three latent variables indeed measure different interrelated aspects of fatigue (Yoshitake, 1971). The various fit quality indicators in the structural equations model (Hair et al., 2009) were compared. Reliability measured by coefficient alpha for tau-equivalent models was included as part of the overall convergent-discriminant validation analysis carried out. The results provided confirmation of psychometric validity of the first-order trifactorial tau-equivalent measurement model of the Feeling of Fatigue scale.
Confirmatory factor analysis is often used to assess scale reliability and construct validity through convergent and discriminant analyses in Administration research studies (Demo, Neiva, Nunes, & Rozzett, 2012; Santos & Brito, 2012). Sometimes, exploratory factor analyses are also applied (Neiva, Odelius, & Ramos, 2015; Wimalasiri, 1995), but only sample adequacy was verified in the present study, given that the scale was originally validated with the same first-order trifactorial model. Previous studies have shown that content validity of a scale, in a cultural adaptation to another language, can be inferred from the analysis of the internal structure and reliability of the instrument (Boada-Grau, Merino-Tejedor, Gil-Ripoll, Segarra-Perez, & Vigil-Colet, 2014; Gouveia et al., 2015; Hutz et al., 2015). Validation of a scale based on the measure of its reliability is a widely used technique (Hutz et al., 2015).
Although this study validated the first-order trifactorial model originally proposed, it is believed that a one-dimensional instrument could evaluate fatigue at work. In a previous study (De Vries, Michielsen, & Van Heck, 2003), failure to confirm multidimensionality might have been due to the fact that fatigue manifests itself as a one-dimensional construct for healthy individuals. However, fatigue can manifest multidimensionality owing to symptoms reported by patients.
Results of the Feeling of Fatigue scale are usually expressed as an aggregate score. However, it is also accepted that fatigue should not be reduced to a single dimension, because it encompasses multidimensional, dynamically interdependent, yet not fully correlated aspects. These aspects provide a description of how fatigue reflects psychophysiological states and performance, and should be considered from a systemic perspective (Phillips, 2015). Consequently, the analysis of multiple constructs in the Feeling of Fatigue scale proves important.
The absence of robust results on fatigue measurements precludes the ranking of measuring instruments by effectiveness (Phillips, Kecklund, Anund, & Sallinen, 2017). A lack of consistency in the use of these instruments hampers comparison and validation of different fatigue measurements. There is also no gold standard for establishing the existence of fatigue, and the validity of instruments measuring fatigue cannot be proven (Beurskens et al., 2000). In the absence of such a standard, convergent-discriminant validation is applied. The results of the present convergent-discriminant validation analysis serve to confirm the validity of the Feeling of Fatigue scale. Therefore, the present study helps further knowledge on the measurement of fatigue.

FINAL COMMENTS AND FURTHER RESEARCH
Discussions on fatigue risk management systems (FRMS) recommend that validated instruments measuring fatigue be applied to capture different related constructs. FRMS are a fast-growing regulatory trend set to further research on occupational stress in Administration.
This study fills a gap in the occupational stress literature in Administration by highlighting the relevance of research on fatigue at work and validating the Portuguese version of the Feeling of Fatigue scale in Brazil. The instrument is important for fatigue management in the workplace.
The scarcity of similar studies in Administration journals should be addressed, given the relevance of the subject in the international scientific literature.
Follow-up studies involving further analysis and collection of new data samples are underway. These investigations may help determine a more accurate prevalence of fatigue based on the Feeling of Fatigue scale in the organizational context, and support validation of other scales measuring fatigue and related constructs. The authors believe this research in the Administration area will likely increase, as more and more organizations from different sectors strive to implement and operationalize fatigue management systems.