Internal construct validity of the Brazilian version of a tool for assessing the population’s knowledge of human papillomavirus

REV BRAS EPIDEMIOL 2020; 23: E200054

ABSTRACT: Objective: To verify the internal construct validity of the Brazilian Portuguese version of a tool for measuring the general population’s knowledge of human papillomavirus (HPV). Materials and methods: A cross‐culturally adapted Brazilian Portuguese version of a measurement tool originally designed for English-speaking populations was administered to 330 adults in Tubarão, Santa Catarina, Southern Brazil. After examining the overall suitability of the method, we performed investigations based on item response theory and exploratory factor analysis. Results: Ten of the 29 items presented a low contribution to the construct and were excluded from subsequent analysis. The factor analysis yielded three factors, which explained approximately 51% of the variance. A different arrangement from the original measurement tool was found: general HPV knowledge, with six items; HPV vaccination knowledge, with five items; HPV transmission and testing knowledge, with eight items. Conclusion: The Brazilian Portuguese version under study presented a different behavior from the original measurement tool, but proved to be a reliable and valid instrument in assessing the Brazilian population’s knowledge about HPV.


INTRODUCTION
Persistent human papillomavirus (HPV) infection is the main cause of cervical cancer and is associated with neoplasms in other tissues. Cervical cancer is one of the most common causes of cancer-related deaths in women in low- and middle-income countries 1 .
According to recent estimates by the HPV Information Center 2 , cervical cancer is the third most frequent cancer among women aged 15-44 years in Brazil. Infection with HPV types 16 and 18, which are responsible for approximately 70% of cervical cancer cases, can be prevented through vaccination 3 . However, vaccine coverage is low among adolescents, the target population of HPV immunoprophylaxis in Brazil 4 . Some authors have argued that the low HPV vaccine uptake in Brazil may be due to fear of adverse reactions (following media reports of neurological symptoms in clusters of girls in Brazil), parental vaccine hesitancy, and/or logistical challenges to vaccinating adolescents at health care centers 5,6 .
Among the main factors restricting preventive practices regarding cervical cancer is the lack of knowledge about the disease and its prevention, especially in adolescent populations, considered the most vulnerable group for acquiring HPV infection 7 . Media advertisements, health professionals, and parents are the most frequent sources of information, but doubts about the accuracy of the message transmitted remain 8 . We believe that measuring knowledge about HPV will make it possible to identify the need to expand information on the virus and its diagnosis, treatment, and prevention.
Recently, we proposed and published a cross-culturally adapted Brazilian Portuguese version of a tool originally proposed to measure lay knowledge of HPV in English-speaking countries 14 . The details of the cross-cultural adaptation process can be found in Manoel et al. 15 . The preliminary model showed satisfactory reliability; however, the construct validity still needs to be established. Construct validity is the type of validation that verifies whether an instrument actually measures what it proposes to measure. It is given by the ability of a test to measure a theoretical trait or construct, thus validating a background theory 14 . In fact, it checks whether the observations that supported the arguments correspond to the theoretical parameters of the investigated subject. Construct validity is not limited to the measurement: its broader purpose is to validate the theory on which the construction of the instrument relied 16 .
In 2014, the Brazilian National Health System launched the HPV immunization campaign, but the goals were not reached 1 . Possibly, the lack of general public knowledge has raised social and ethical discussions that could explain it, at least partially. Public understanding of preventive measures and consequences of HPV infection can play an important role in contributing to the achievement of these goals. Thus, it is crucial to have a valid instrument for measuring HPV knowledge that can be administered to different segments of the population.
This study aimed to verify the internal construct validity of the Brazilian Portuguese version of a tool for measuring the general population's knowledge of HPV. We expect that a validated tool can help determine how education measures can be implemented.

STUDY DESIGN AND PARTICIPANTS
This is a cross-sectional study. The 29 items of the cross-culturally adapted Brazilian Portuguese version 15 of the instrument proposed to measure lay knowledge of HPV in English-speaking countries 14 were self-administered by 330 adults. The sample comprised parents of 9-15-year-old adolescents, recruited from public and private schools in the municipality of Tubarão, Santa Catarina, Southern Brazil, and community health agents from the same municipality who had not been previously trained in HPV screening. The sample was non-probabilistic, as previously described 15,17 .
An informed consent form, complying with standards of the Declaration of Helsinki, was delivered to the participants, along with the HPV knowledge survey instrument. The Human Research Ethics Committee of Universidade do Sul de Santa Catarina, Brazil, approved the research project, under protocol number 734,735.

DATA ANALYSIS
We assessed the psychometric properties of the Brazilian Portuguese version of the questionnaire at the item level based on item response theory (IRT). Items that did not fit the IRT model, as evaluated by the chi-square test, were excluded from the subsequent analysis. For the remaining items, we used classical reliability and exploratory factor analysis (EFA) with principal component analysis (PCA) to assess internal consistency and determine possible factors or subscales.
Questions 1 (Have you ever heard of HPV?), 2 (Have you ever heard of HPV vaccination?), and 3 (Have you ever heard of HPV testing?) were excluded from the analysis because their purpose was solely to enable the participant to continue responding to each of the three proposed sections.
IRT is a modern method recommended by the Patient-Reported Outcomes Measurement Information System (PROMIS) 18 group to test item loading. We analyzed the 29 items to identify the discriminatory power of the instrument and of each item alone. The analysis was performed in the open-source software R 3.30.
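As an illustration of what "discriminatory power" means in this setting, the sketch below evaluates the standard two-parameter logistic item characteristic curve, in which the probability of a correct answer depends on the respondent's ability (theta), the item discrimination (a), and the item difficulty (b). The function name and the two example items are hypothetical and are not estimates from this study.

```python
import math

def icc_2pl(theta, a, b):
    """Probability of a correct answer under the two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical items for illustration only (not estimates from this study).
steep = {"a": 3.0, "b": 0.0}   # strongly discriminating item
flat = {"a": 0.0, "b": 0.0}    # item with no discrimination

abilities = [-2.0, 0.0, 2.0]
steep_curve = [icc_2pl(t, **steep) for t in abilities]
flat_curve = [icc_2pl(t, **flat) for t in abilities]

for t, ps, pf in zip(abilities, steep_curve, flat_curve):
    print(f"theta={t:+.1f}  steep item: {ps:.3f}  flat item: {pf:.3f}")
```

An item with discrimination close to zero returns the same probability at every ability level, so it cannot separate respondents with more knowledge from those with less. This is the sense in which such items contribute little to the construct.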
Cronbach's alpha was used to assess internal consistency reliability after a previous analysis of the overall suitability of the dataset. A correlation matrix covering each pair of questions was examined using Pearson's linear correlation. We also carried out the Kaiser-Meyer-Olkin (KMO) and Bartlett's sphericity tests. All tests were performed in the software IBM SPSS version 18.0 (IBM Corp., Armonk, USA) for statistical analysis.
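For readers without access to SPSS, the two suitability checks can be sketched from a Pearson correlation matrix using their textbook definitions. This is a minimal sketch, not the SPSS implementation; the function names and the toy correlation matrix are assumptions made for illustration, and the sample size of 330 is taken from the study design.

```python
import numpy as np

def kmo_overall(R):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix R."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                   # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)    # mask selecting off-diagonal entries
    r2 = np.sum(R[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)

def bartlett_sphericity(R, n):
    """Bartlett's chi-square statistic and its degrees of freedom."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df

# Toy correlation matrix (hypothetical): three items, all pairs correlated at 0.5.
R_toy = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])

kmo = kmo_overall(R_toy)
chi2, df = bartlett_sphericity(R_toy, n=330)
print(f"KMO = {kmo:.3f}, Bartlett chi2 = {chi2:.1f} on {df} df")
```

A KMO value closer to one indicates that partial correlations are small relative to the raw correlations, and a large Bartlett statistic indicates that the correlation matrix differs from an identity matrix. Both conditions favor factor analysis.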
Defining the number of factors in the PCA involved two analyses: applying the Kaiser-Guttman criterion 19 , which considers factors with eigenvalues greater than or very close to one (λ ≥ 1), and inspecting a scree plot of the eigenvalues.
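The eigenvalue rule can be sketched as follows, assuming the eigenvalues are taken from the item correlation matrix. The function name and the toy matrix (two uncorrelated pairs of items) are hypothetical and serve only to show the mechanics.

```python
import numpy as np

def kaiser_guttman(R, threshold=1.0):
    """Return sorted eigenvalues, number retained, and cumulative variance explained."""
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest first
    n_factors = int(np.sum(eigvals >= threshold))
    cumulative = np.cumsum(eigvals / eigvals.sum())
    return eigvals, n_factors, cumulative

# Toy correlation matrix (hypothetical): items 1-2 and items 3-4 form two blocks.
R_toy = np.array([[1.0, 0.6, 0.0, 0.0],
                  [0.6, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.6],
                  [0.0, 0.0, 0.6, 1.0]])

eigvals, n_factors, cumulative = kaiser_guttman(R_toy)
print("eigenvalues:", np.round(eigvals, 2))
print("factors retained:", n_factors)
print("cumulative variance explained:", np.round(cumulative, 2))
```

Plotting the sorted eigenvalues against their rank gives the scree plot. The eigenvalue rule and the visual "elbow" do not always agree, which is why the two are usually read together.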
Based on the Kaiser-Guttman criterion, we assumed that the loadings between factors with eigenvalues smaller than one and the original variables should be low, given that they would have had higher correlations with previously extracted factors with higher eigenvalues 20,21 . Therefore, the original questions that shared small variance percentages with the other questions had their factor loadings raised in a single factor. The analysis of communalities allowed us to verify if any question did not share a significant percentage of variance with the defined factors.
We used the Varimax method to minimize the number of questions that presented high loadings in a given factor by redistributing the loadings and maximizing the shared variance in factors with smaller eigenvalues 22 . Finally, we carried out a theoretical evaluation according to the information required by the item and performed a regrouping based on the possibilities offered by the statistical model.
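The orthogonal rotation described above can be illustrated with a common SVD-based formulation of the varimax algorithm. This is a sketch under stated assumptions and is not the SPSS routine used in the study; the function name, the iteration settings, and the unrotated loading matrix are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix orthogonally to increase the varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target matrix of the varimax criterion.
        target = rotated ** 3 - rotated @ np.diag(np.sum(rotated ** 2, axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                 # nearest orthogonal matrix
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion < criterion * (1 + tol):
            break                         # no meaningful improvement
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings for four items on two components.
unrotated = np.array([[0.7, 0.3],
                      [0.8, 0.2],
                      [0.6, -0.5],
                      [0.5, -0.6]])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each item's communality (the sum of its squared loadings) is unchanged. Only the distribution of the loadings across factors changes, which is what makes the factors easier to interpret.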

RESULTS
The sample consisted mainly of women (87.6%), and the mean age was 41.3 ± 8.8 years.
More details about the sample are available in Manoel et al. 15 and Manoel et al. 17 .

ITEM RESPONSE THEORY
The one-dimensionality of the instrument allowed us to proceed with IRT. Three classic models were created:
• rasch;
• two-parameter logistic (ltm);
• three-parameter logistic (tpm).
They were analyzed through ANOVA. Compared to rasch, the ltm model presented better results [Akaike Information Criterion (AIC): ltm 9900.3 < rasch 10084.0; Bayesian Information Criterion (BIC): ltm 10120.6 < rasch 10194.2; p < 0.001]. The second analysis showed similar results between ltm and tpm (AIC: tpm 9881.3 < ltm 9900.3; BIC: tpm 10211.8 > ltm 10120.7; p < 0.001). Both models (ltm or tpm) could be valid, but we selected the two-parameter logistic, since it is the most classic model. Figure 1 shows the item trace lines, which were expressed numerically by the discrimination and difficulty parameters. IRT demonstrated that item 1e had the strongest discrimination (3.4), while item 1q (0.0) had the lowest contribution. In addition, item 3f presented the highest difficulty level (1.7), while the lowest was identified in item 1q (-175.9). We excluded ten items with poor discrimination power (1g, 1k, 1m, 1q, 2c, 2e, 3a, 3b, 3d, and 3f) and performed a new IRT analysis. All items with a p-value greater than or equal to 0.05 remained for the EFA.
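The two information criteria reported above penalize model complexity differently, which is why AIC favors tpm while BIC favors ltm. Under the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, their difference depends only on the number of estimated parameters k and the sample size n. The sketch below uses this to recover the implied k for each model. It assumes n = 330 was the sample size entering the BIC, and the variable and function names are illustrative.

```python
import math

n = 330  # assumed to be the sample size used for the BIC

# (AIC, BIC) pairs as reported in the text.
reported = {
    "rasch": (10084.0, 10194.2),
    "ltm": (9900.3, 10120.6),
    "tpm": (9881.3, 10211.8),
}

def implied_parameters(aic, bic, n):
    """Solve BIC - AIC = k * (ln(n) - 2) for the number of parameters k."""
    return (bic - aic) / (math.log(n) - 2.0)

implied = {m: implied_parameters(a, b, n) for m, (a, b) in reported.items()}
rounded = {m: round(k) for m, k in implied.items()}

for model, k in implied.items():
    print(f"{model}: implied k = {k:.2f} (rounded {rounded[model]})")
```

The rounded values are 29, 58, and 87, which would be consistent with one, two, and three parameters per item across 29 items. Each extra parameter costs about 5.8 BIC points at this sample size against 2 AIC points, so the more heavily parameterized tpm is preferred by AIC but not by BIC.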

RELIABILITY AND FACTOR ANALYSIS
The instrument demonstrated an overall Cronbach's alpha index of 0.80. The Cronbach's alpha index if each item was deleted from the subscale remained similar to the overall value.
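Both quantities in this paragraph follow from the usual formula for Cronbach's alpha, which compares the sum of the item variances with the variance of the total score. The sketch below is a minimal illustration. The function names and the toy response matrix (rows are respondents, columns are items scored 0 or 1) are hypothetical and are not data from this study.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances / total_variance)

def alpha_if_deleted(scores):
    """Alpha recomputed after dropping each item in turn."""
    scores = np.asarray(scores, dtype=float)
    return [cronbach_alpha(np.delete(scores, j, axis=1))
            for j in range(scores.shape[1])]

# Hypothetical responses: six respondents, four items.
toy = np.array([[1, 1, 1, 0],
                [1, 1, 0, 0],
                [0, 0, 0, 0],
                [1, 1, 1, 1],
                [0, 1, 0, 0],
                [1, 0, 1, 1]])

print("alpha:", round(cronbach_alpha(toy), 3))
print("alpha if item deleted:", [round(a, 3) for a in alpha_if_deleted(toy)])
```

When the alpha obtained after deleting an item stays close to the overall value, as reported here, no single item is driving or undermining the internal consistency of the scale.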
The assessment of the overall suitability of the dataset for EFA showed that the relationship between the number of subjects interviewed and the number of questions was 11.4, which represented a favorable condition. The correlation matrix of the variables presented Pearson correlation coefficient values above 0.2. The overall adequacy of the EFA for the data set showed a score of 0.82 in the KMO statistic, evidencing a correlation between the variables. Bartlett's test presented a significance level of p < 0.001.
The scree plot indicated the existence of two to five factors, depending on the slope of the curve points (Figure 2). According to the Kaiser-Guttman criterion 20 , the extraction of five factors was responsible for explaining 51.38% of the total variance of the instrument, with eigenvalues greater than or close to one (Table 1). PCA was performed with two, three, four, and five factors to identify which configuration could better explain the theoretical affinity between items. We considered that, usually, the indicated difference between dimensions should be greater than 2.5 to allow selecting the next factor.
The best configuration was obtained with three factors. The rotation of the factors by the Varimax method sought to minimize the number of variables with high factor loadings in a factor and maximize the variation between the weights of each main component. The component matrix, after orthogonal rotation, aimed to maximize the factor loadings, so that each variable was associated with only one factor, simplifying the interpretation of these factors. All values lower than 0.2 were eliminated due to a weak correlation. The rotated factor loading matrix showed that factor 1 included items related to "general information," factor 2 to "vaccination," and factor 3 to "transmission and diagnosis." Items 1c, 1e, and 1j demonstrated loadings for factors 1 and 3; although their highest loading belonged to factor 1, they were nearest to factor 3 according to the theoretical analysis. The same occurred with item 2b, which loaded on factors 2 and 3 but was closer to factor 2. Table 2 shows how the items were effectively related to the factors. The sequence order followed the factor and value of correlation (above 0.2) of each item.

DISCUSSION
The results pointed to keeping the latent construct, and the Brazilian Portuguese version might retain 19 of the 29 items, with a different arrangement from the originally proposed instrument: general HPV knowledge, with six items; HPV vaccination knowledge, with five items; HPV transmission and testing knowledge, with eight items. The first dimension comprised questions related to "general HPV knowledge" as proposed by Waller et al. 14 . The second also consisted of questions from the original "HPV testing knowledge." However, the third dimension of the Brazilian Portuguese instrument involved a combination of two questions from the original "HPV vaccination knowledge" and six questions from the original "general HPV knowledge." Nevertheless, we emphasize that ten items presented factor loadings in two dimensions and two other items in three dimensions, indicating a certain non-specificity. This forced the researchers to arbitrate their distribution among the dimensions based on the theoretical status, respecting the statistical results of each loading on the factor. The practical need for separating the items clearly by dimension remains. On the other hand, probably the most important practical result of this study was providing a shorter and valid version with good internal consistency, indicating good reliability.
It is noteworthy that the Brazilian public health system does not use the HPV test as a screening tool, which limits the knowledge about the diagnosis even for health professionals. In addition, Brazil recently modified the vaccination schedule to two anti-HPV doses, while other countries maintained the three-dose vaccination schedule 4 .
The low anti-HPV vaccination coverage observed in Brazil is mainly due to hesitation and resistance from many parents who are still deciding on the health care of their children. When considering the child's age and the presumed time frame until sexual exposure, parents usually underestimate the child's susceptibility to acquiring sexually transmitted infections or even to developing cancer in the future 6 . Another relevant factor would be the connection between HPV and sexual activity. This issue has been a major source of religious controversy surrounding the HPV vaccine. Several religious groups and parents have expressed concern that the vaccine would be a trigger for promiscuity and early sexual life among adolescents. Furthermore, it has been argued that religious norms that regulate the sexual activity of unmarried women make the HPV vaccine unnecessary 23 .
Once a valid instrument is available to gauge the general knowledge of the population about HPV and aspects related to vaccination, transmission, and diagnosis, it is possible to plan and develop more appropriate educational strategies. There are several differences between subgroups, as well as in sociodemographic status, including age, ethnicity, maternal schooling, healthcare coverage, and health providers' recommendations about HPV 22 . Indeed, one study demonstrated that young people were unaware that the HPV vaccine could be given to males. Authors have suggested improving the discussion with health care providers about this issue, as some social barriers still need to be overcome 24 . One of the principal concerns regarding the HPV diagnosis is the knowledge generated by the information that follows the diagnosis per se.
The search for the partner responsible for the transmission and the need to determine when the infection has occurred are questions that remain unanswered. The emotional stress generated by the diagnosis and non-scientific sources of information can contribute to negative results in the progress of the disease or even in the possibility of transmission. Therefore, an effective instrument able to measure HPV knowledge is crucial to focus on preventive programs and health policies.
The existence of an instrument to determine the knowledge about HPV allows safer choices and more reliable strategies for the dissemination of this knowledge in a specific population. It is important to remember that teachers and administrators will be on the front lines of these programs, coordinated with health providers and parents. Also, there will always be anti-campaigns to defeat 25,26 .
Among the limitations of this study are the predominantly female sample and the lack of information on HPV-related problems that participants might have experienced, which theoretically have the potential to influence the results. Such a hypothesis would imply the need to incorporate a generic question about the issue, even if the interviewee were male, since he could have experienced the problem, personally or in his family. The greater the heterogeneity of the sample, the more representative of the target population it would be.

Nonetheless, further studies should include a discriminatory analysis between groups with and without a history of personal or family problems related to HPV infection.
Lastly, we underline that at the time of data collection, the vaccination schedule included three doses. Currently, the country has a two-dose regimen. Thus, we suggest changing question 2.2 of the final version of the instrument in Portuguese based on the current vaccination schedule.

CONCLUSION
We can conclude that the Brazilian Portuguese version under study presented a different behavior from the original measurement tool. A shorter version with 19 questions proved to be a reliable and valid instrument in assessing the Brazilian population's knowledge about HPV. In this scenario, it is possible to infer that the chances of success of health promotion and disease prevention actions would improve.