Assessing knowledge: psychometric properties of the BAMS semantic memory battery

Background: Semantic memory is a culturally influenced cognitive domain responsible for our knowledge about words and the world. The Semantic Memory Battery (BAMS) is a new battery that evaluates semantic memory through a compendium of tasks, including verbal fluency, naming, conceptualization, categorization, general questions, and word definitions, and was designed to consider cultural aspects. Objectives: We aimed to evaluate the psychometric structure of the BAMS using classical and modern analyses, and also to evaluate a clinical subdivision of the battery. Methods: The BAMS performance of 114 cognitively healthy Brazilian older adults provided data for psychometric analysis using validity tests, item response theory analysis, and confirmatory factor analysis for goodness-of-fit measures. Results: The BAMS showed good validity and good fit measures for each subtest, for the total score (χ² = 20.684, p = 0.110), and for a hierarchical structure with a clinical subdivision of the battery (χ² = 20.089, p = 0.093). Discussion: The BAMS is a new compendium of tasks that evaluate distinct aspects of semantic memory and can clinically consider the impact of executive function. This battery evaluates verbal fluency, naming, conceptualization, categorization, general knowledge, and word definitions. The BAMS has clinical importance because semantic memory is strongly influenced by culture and language, and there is an absence of broad semantic memory tests in our scenario, especially for older adults, who can have a pathological aging condition that primarily or secondarily affects this domain.


Introduction
Semantic memory is a subcomponent of long-term declarative memory responsible for general information about the world, words, definitions, categories, and concepts, operating like a knowledge store. Semantic memory allows us to give meaning to the continuous stream of sensory information and provides a foundation for behavioral acts 1.
Semantic knowledge is distributed across the brain 1. This cognitive system has a semantic control network and a hub-and-spoke representational network, which interact to provide a generalization of concepts across contexts and to retrieve conceptual properties of stimuli, respectively 1. These two semantic networks interact on a neural basis that includes distributed temporoparietal areas related to conceptual properties, convergence zones (anterior temporal lobe and angular gyrus), and prefrontal cortex related to semantic control. The neural basis of the semantic control network in the prefrontal cortex and parts of the middle temporal gyrus suggests that these regions are also importantly active during executively demanding tasks 1.
Semantic memory can be divided into subcomponents to facilitate its assessment and comprehension, and to take into account the effects and use of the two networks described. Neuropsychological tests of categorical verbal fluency, naming, conceptualization, categorization, general knowledge questions, and word definitions are considered tasks that assess the semantic memory system 2,3. Semantic memory batteries often include a combination of tasks, allowing a more complex assessment of this cognitive domain than isolated tasks can offer 2,3.
Impairments in semantic memory are core characteristics of some clinical conditions. For example, in the semantic variant of primary progressive aphasia (svPPA), semantic memory is the most prominent cognitive deficit 4. Patients with this form of pre-senile dementia have notable anomia and loss of knowledge about things, more than an episodic memory deficit. Usually, during this dementia, patients lose acquired general information, knowledge about things and words, and the words themselves 4.
Other degenerative conditions may involve semantic memory deficits, such as some types of mild cognitive impairment (MCI) 5 and Alzheimer's disease (AD) 6. Findings on semantic memory evaluation in patients with MCI are inconsistent [7][8][9][10]. Traditional tasks such as naming show evidence of more preserved performance, whereas verbal fluency shows some degree of impairment 11. AD patients, in contrast, frequently present semantic deficits in tasks such as verbal fluency, naming, categorization, and general knowledge 12,13.
Among these clinical conditions, the semantic system may reveal a pattern of dissociations or profiles related to the use of abstract and concrete words, and also living and non-living items. svPPA patients may show deficits for concrete words but not for abstract concepts 14, although there is also conflicting evidence suggesting that this dissociation does not occur 15.
Category-specific impairments may indicate that knowledge about living and non-living items constitutes independent semantic information. svPPA patients, given the loss of general knowledge related to degeneration of a convergence zone, do not normally present category-specific deficits 16.
Considering that there is no Brazilian semantic battery, we propose the present instrument. Even though some individual tasks are available in our neuropsychological scenario, such as the Boston Naming Test 17 and semantic verbal fluency 18, we lack a culturally developed and broad semantic evaluation. This absence results in clinical difficulties with conditions that require a more precise semantic examination that also considers the cultural influence on this cognitive domain.
Aiming to perform a better semantic memory assessment in older adults, we developed a battery that considers the current theoretical literature on semantic memory as a cognitive construct and as a clinical marker of healthy and pathological aging. This new battery was designed to take into account distinct tasks that evaluate specific aspects of semantic memory, as well as the patterns of abstract/concrete and living/non-living items.

BAMS
The Semantic Memory Battery (Bateria de Avaliação da Memória Semântica, BAMS) is composed of seven tasks that assess different semantic memory subdomains. Initially, all tasks had 20 items each, except the naming test, with 65 items, and the verbal fluency task, with six categories. The first version of the BAMS was built with more items so that a first cognitively healthy sample could test and improve the selection of the definitive items according to classical and modern psychometric analyses, such as item response theory (IRT) and confirmatory factor analysis (CFA). The administration time is approximately 30 minutes. Box 1 shows a description of each task, its correction, and scoring system. The supplementary material contains detailed information about item selection and the remaining structure of the battery after psychometric analysis. Some primary criteria defined the choice of items for each task: the frequency of the word according to the Brazilian Portuguese Corpus 19, the expected school knowledge for the mean educational achievement of the Brazilian older adult population, and the nuisance variables available 20. Items were selected with high, medium, and low frequency to avoid ceiling or floor effects for illiterate and highly educated older adults.
The BAMS shares some tasks with other semantic memory batteries, including naming in response to verbal description, picture naming, semantic verbal fluency, and visual categorization based on semantic association 2,3. This instrument also includes tasks not present in other batteries, such as general knowledge questions and verbal similarities.
The assessment comprised three stages: 1) a clinical interview designed to exclude subjects with psychiatric, neurological, or other self-reported disease; 2) a brief cognitive screening to exclude subjects suffering from pathological cognitive decline; and 3) a comprehensive neuropsychological assessment to provide data for the assessment of BAMS psychometric properties.
Participants had to be 60 years or older, be cognitively intact on the cognitive screening tasks, and give written consent for participation. The exclusion criteria were: current or past reported history of neurological disease; current psychiatric symptoms; severe sensory or motor impairments; self-reported hormonal or vitamin dysfunctions; and dependence in daily activities.

Cognitive screening
The cognitive screening tasks included the Mattis Dementia Rating Scale (DRS) 21 and the Frontal Assessment Battery (FAB) 22, and participants had to score within or above the Brazilian normative sample mean for their age and educational achievement. The Research Ethics Committee of the Universidade Federal de Minas Gerais approved the present study (CAAE-26795714.4.0000.5149).

Neuropsychological protocol
All participants completed selected neuropsychological tasks. Some of the tasks were grouped into composite scores. The executive function score was composed of the Digit Span task 23 and parts three and four of the Five Digits Test 24. The episodic memory score was composed of the learning, retrieval, and recognition parts of the Rey Auditory Verbal Learning Test 25. Participants also performed the vocabulary subtest of the WAIS-III 26 and the identification of common objects task 27. This set of neuropsychological tasks was used to provide psychometric and validity information about the BAMS.

Data analysis
Statistical analyses were performed in the Statistical Package for the Social Sciences (SPSS) and Mplus v7, according to the objective. We chose to perform analyses from both classical and modern psychometric theory.
The psychometric analysis was chosen according to data type. For the Verbal Fluency tasks, we performed a CFA to verify whether all six categories could be grouped into a single measure that includes living and non-living items. For the remaining six subtests, we followed these steps: (Step 1) We first excluded items with no variability, since these items were very easy and may not be truly informative about average semantic memory performance. (Step 2) We used IRT analysis with a two-parameter logistic (2PL) model to evaluate the psychometric properties of the test and to improve the selection of items according to each item's difficulty and discrimination 28. We required a minimum discrimination parameter of 0.65, classified as moderate, and considered item difficulty only after the discrimination criterion was met. If more than ten items passed this first selection criterion, we kept only the best ten. The Naming subtest was an exception to this rule, and we kept more items to maintain the diversity of nouns and verbs, living and non-living.
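The Step 2 selection rule can be illustrated with a short Python sketch. It shows the standard 2PL item response function and a filter that keeps items with a discrimination of at least 0.65. The function names, the item parameters, and the reading of "best ten" as the ten most discriminating items are assumptions made for illustration only; the actual parameter estimates are not reproduced here.

```python
import math


def p_correct_2pl(theta, a, b):
    """2PL model: probability of a correct response for ability `theta`,
    item discrimination `a`, and item difficulty `b`."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def select_items(params, min_discrimination=0.65, max_items=10):
    """Keep items whose discrimination meets the threshold, then retain at most
    `max_items` of them, ranked by discrimination (an assumed reading of 'best').

    `params` maps an item name to a tuple (a, b).
    """
    eligible = [(name, ab) for name, ab in params.items() if ab[0] >= min_discrimination]
    eligible.sort(key=lambda pair: pair[1][0], reverse=True)
    return [name for name, _ in eligible[:max_items]]


# Hypothetical (a, b) estimates, not taken from the study.
example_params = {
    "item_01": (1.20, -0.50),
    "item_02": (0.40, 0.10),   # below the 0.65 threshold, so it is dropped
    "item_03": (0.65, 1.30),
    "item_04": (2.10, 0.00),
}
selected = select_items(example_params)
```

When ability equals item difficulty, the 2PL probability is 0.5 regardless of discrimination, which is why difficulty can be read as the ability level at which an item is answered correctly half of the time.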
We performed this procedure for each subtest that composes the BAMS. (Step 3) We submitted the selected items to a CFA to evaluate the manifestation of the constructs through a stronger analytic framework that accounts for measurement error, and also computed Cronbach's Alpha according to classical test theory. If an item showed a Heywood case or a poor fit to the model, it was excluded.
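As a minimal sketch of the classical internal consistency index mentioned in Step 3, Cronbach's Alpha can be computed from a respondents-by-items score matrix as follows. The function name and the toy response matrix are hypothetical; the study's values were obtained with statistical software and are reported in Table 2.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's Alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical dichotomous responses (5 respondents x 4 items).
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
```

The sketch uses sample variances (denominator n - 1) for both the items and the total score; the ratio, and therefore alpha, is the same if population variances are used for both.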

Sample
One hundred and fourteen older adults composed the cognitively healthy sample. Participants were recruited from the community, from physical exercise groups run by governmental programs, from retirement groups, and from among healthy older adults attending a public medical service.

Procedures
All participants underwent a clinical interview and a neuropsychological assessment conducted by a neuropsychologist. The assessment was composed of three stages. Summing the individual items into a composite score for each task and performing the CFA only with the seven task composite scores were done to avoid errors in fit measures given our sample size.
We performed the subtest CFA with a robust diagonally weighted least squares estimator (WLSMV), since the items are categorical and this estimator does not assume normally distributed variables 28. The WLSMV does not require the diagonal weight matrix to be positive definite, and requires a smaller sample size than weighted least squares (WLS). WLSMV analysis can produce accurate test statistics, parameter estimates, and standard errors with small sample sizes (100 or higher) 28. The WLSMV also performs accurately with variables showing floor or ceiling effects, although the IRT selection sought to avoid these effects 28. Correlation analyses were performed to provide information about the construct and criterion validity of the BAMS.

Additional BAMS scores
Among the seven tasks that compose the BAMS, three subtests share the influence of executive functions. The verbal fluency and categorization/similarities tasks involve the frontal lobe network [29][30][31], and performance on them is compromised in clinical groups with dysexecutive syndrome 30,32 even when semantic memory is preserved. We therefore tested sub-composite scores built from a division of the BAMS tasks: semantic (SEM) and semantic-executive (SEF). The Naming by Definition, Naming Test, General Knowledge, and Word Definitions tests formed the SEM score, and Verbal Fluency, Categorization, and Similarities formed the SEF score, to capture findings related to the interaction between executive and semantic processes. This clinical division was evaluated using correlations with the executive function composite score, the episodic memory composite score, the vocabulary subtest, and the identification of common objects task.
The episodic memory composite score and the WAIS-III Vocabulary subtest were chosen as convergent validity measures for the SEM score, considering that episodic and semantic memory share common long-term declarative memory characteristics and that the vocabulary measure is also used as a single assessment of semantic memory. The executive composite score and the identification of common objects task were chosen as convergent validity measures for the SEF score, since the composite score was built following the theoretical view of three core executive functions 33 and the identification of common objects task is an abstract categorization task.
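The grouping of subtests into SEM and SEF can be sketched as follows. This is an illustration only: it assumes that each sub-composite is a plain sum of the subtest scores, which the text does not state, and the dictionary keys and example scores are hypothetical.

```python
# Subtest grouping as described in the text; key names are illustrative.
SEM_TASKS = ("naming_by_definition", "naming_test", "general_knowledge", "word_definitions")
SEF_TASKS = ("verbal_fluency", "categorization", "similarities")


def sub_composites(subtest_scores):
    """Return SEM, SEF, and total scores, assuming simple summation of subtests."""
    sem = sum(subtest_scores[task] for task in SEM_TASKS)
    sef = sum(subtest_scores[task] for task in SEF_TASKS)
    return {"SEM": sem, "SEF": sef, "total": sem + sef}


# Hypothetical scores for one participant, not taken from the study.
participant = {
    "naming_by_definition": 8,
    "naming_test": 33,
    "general_knowledge": 7,
    "word_definitions": 6,
    "verbal_fluency": 52,
    "categorization": 9,
    "similarities": 5,
}
scores = sub_composites(participant)
```

If the subtests are on very different scales, as in this toy example, standardizing each subtest before summing may be preferable; that choice is outside what the text specifies.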

Results
Sample descriptive characteristics are presented in Table 1. The initial and final BAMS configurations are reported in the Supplementary material.
The CFA for the Verbal Fluency subtest indicated that the birds category was not a significant parameter; it was therefore excluded, leaving five categories (animals, fruits, household items, tools, and clothes). This five-category Verbal Fluency model showed a good fit (Table 2).
For the Naming Test, all items with moderate or higher discrimination were selected, leaving a final task of 38 items. The Chi-square indicated an almost good fit, but no modification indices were suggested (Table 2). The Root Mean Square Error of Approximation indicated a good fit (RMSEA = 0.030; CI: 0.008-0.043; p = 0.997), as did the CFI (0.992) and TLI (0.991), leading to our decision to keep the task without further modifications.
After the IRT analysis, the Naming by Definition, General Knowledge, Categorization, Similarities, and Word Definitions subtests retained ten items each, all showing a good model fit (Table 2). All subtests with the retained configuration of items also showed satisfactory internal consistency according to Cronbach's Alpha values (Table 2).
The final item selections for each subtest were summed into composite scores for each task (standardized estimates of the BAMS subtests are shown in Table 3). The EFA revealed a good fit for a unitary latent factor of the BAMS (χ² = 23.012, df = 14, p = 0.06), and a general CFA for the battery also revealed a good fit in all indices (RMSEA = 0.06, CFI = 0.981, TLI = 0.972; see Table 2 for the Chi-square value). Cronbach's Alpha for the BAMS also indicated good internal consistency (Table 2).
Regarding the subdivision of the BAMS, we tested the internal consistency of the two composite scores with Cronbach's Alpha: SEM (α = 0.822) and SEF (α = 0.755). Considering the possibility of a hierarchical composition, we tested a hierarchical CFA model built with two latent factors, semantic (SEM) and semantic-executive (SEF). This hierarchical model also indicated good fit: Chi-square = 20.089, df = 13, p = 0.093, RMSEA = 0.070, CFI = 0.980, TLI = 0.968.
Correlation results indicated convergent and divergent validity. The BAMS had a positive and stronger correlation with education (r = 0.647, p < 0.001) than with age (r = -0.422, p < 0.001), and also a positive correlation with the General Cognition measure (r = 0.778, p < 0.001). Criterion validity was demonstrated through the negative correlation between the BAMS score and the Functional Assessment Questionnaire (Pfeffer Index), indicating that higher semantic memory is related to lower functional impact (r = -0.333, p < 0.001).
The division of the BAMS into SEM and SEF scores also revealed convergent and divergent validity. The SEM score had higher and significantly distinct correlations with education and the vocabulary subtest, and was not correlated with the identification of common objects task (Table 4). The SEF score, in contrast, had higher and significantly distinct correlations with age and with the number of direct questions in the identification of common objects task (Table 4).
Considering the substantial correlations of BAMS scores with age and education, we chose to divide the sample into two age groups (60 to 75 years, and 76 years or older) and three educational groups (0-2 years, 3-8 years, and 9 years or more). The educational and age groups were combined, and each combination showed good homogeneity. The BAMS scores according to age and educational achievement are shown in Table 5.
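As a rough check on how the reported indices relate to one another, the RMSEA point estimate can be approximated from the Chi-square, its degrees of freedom, and the sample size. The sketch below uses a common textbook formula with n - 1 in the denominator; the function name is hypothetical, and software such as Mplus may use n instead of n - 1 or apply estimator-specific corrections, so small differences from the reported values are expected.

```python
import math


def rmsea(chi_square, df, n):
    """Approximate RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values reported for the hierarchical model (Chi-square = 20.089, df = 13, n = 114).
hierarchical_rmsea = rmsea(20.089, 13, 114)
```

For the hierarchical model this approximation gives a value close to 0.07, in line with the reported RMSEA. When the Chi-square is smaller than its degrees of freedom, the estimate is truncated at zero.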
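The age and education stratification described above can be expressed as a small lookup, sketched here in Python. The function names and group labels are illustrative; only the cut-offs (60-75 and 76 or older; 0-2, 3-8, and 9 or more years of education) come from the text.

```python
def age_group(age):
    """Age band used for the BAMS reference scores (participants are 60 or older)."""
    if age < 60:
        raise ValueError("the sample only includes adults aged 60 or older")
    return "60-75" if age <= 75 else "76+"


def education_group(years):
    """Education band in completed years of schooling."""
    if years < 0:
        raise ValueError("years of education cannot be negative")
    if years <= 2:
        return "0-2"
    if years <= 8:
        return "3-8"
    return "9+"


def reference_cell(age, years):
    """Combine both bands into the age-by-education cell used to look up scores."""
    return (age_group(age), education_group(years))


cell = reference_cell(68, 4)
```

The two age bands and three education bands yield six cells, which is the layout a table such as Table 5 would follow.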
The BAMS composition has tasks both similar to and distinct from those of the Cambridge Semantic Memory Test Battery (2) and the Nombela 2.0 semantic battery (3), which are well accepted in the field of neuropsychological evaluation. However, the BAMS does not use the same stimuli across tasks, is not equally divided into living and non-living items after the IRT selection, and not all nuisance variables could be controlled.
The BAMS analysis prioritized item performance in order to avoid ceiling and floor effects. This should provide more performance variability, making the battery a potential clinical instrument for conditions other than primary semantic memory decline.
The BAMS is also composed of tasks similar to standard measures of semantic memory, even though these tasks assess only a specific part of this domain, such as the Boston Naming Test 17 and semantic verbal fluency tasks 18. The use of distinct tasks that evaluate important aspects of semantic memory broadens the assessment of this cognitive domain and raises the possibility of better clinical diagnosis and intervention.
The BAMS also showed a good clinical structure when the tasks were divided according to the level of executive function influence. The two composite scores, SEF and SEM, also fit an overall semantic memory score. This division is relevant when assessing patients with executive function deficits, which could drive the BAMS total score and induce the perception of worse semantic memory 30,34. This hypothesis needs to be tested further by looking for differences between the SEF and SEM divisions of the battery in clinical groups.
As expected, age and education were related to the semantic measures. Education had a higher correlation with the total score and with the SEM score, indicating that the semantic battery is also influenced by schooling and behaves as crystallized cognition. The correlation with age was higher for the SEF score, indicating a more fluid influence on performance, compatible with the use of executive functions in categorization, similarities, and verbal fluency tasks.
The BAMS, as a total score and as the clinical scores SEM and SEF, showed good fit measures as well as construct and criterion validity, indicating that, even though this is the first semantic battery in the Brazilian neuropsychological scenario, it has good psychometric indicators.
Given the education and age correlations, the sample was split into two age groups and three education groups so that this influence could be taken into account when evaluating the semantic memory performance of older adults. The score difference between the less educated older adults and those with medium to high education is notable. We highlight that the BAMS is a battery that can be used with illiterate or semi-literate older adults because item selection took into account distinct educational backgrounds, allowing the assessment of a cognitive domain that is also influenced by cultural insertion.
Beyond the absence of a clinical group, this first study is also limited by its small sample size. Although the results show that the BAMS has good psychometric properties and will be of relevant use in our neuropsychological evaluation scenario, a larger sample will improve the psychometric analysis and also provide parameters for our population. Since our sample has sociocultural particularities and education is related to task achievement, these limitations encourage a normative study with a larger cognitively healthy sample and clinical groups with semantic deficits.
The availability of a better semantic memory assessment is even more important when working with older adults, who can have a particular pathological aging process that affects this domain 4,12,13. The present battery may be a promising instrument for cognitive assessment and clinical use with older adults.

Discussion
The objective of the present study was to analyze the properties of the Semantic Memory Battery (BAMS) using modern and classical psychometric analyses to better select items and verify the general quality of the proposed battery in a sample of older adults. The BAMS is composed of seven tasks that evaluate different aspects of semantic memory, and it showed good fit and validity both within tasks and for the overall battery. These results indicate that the selected items and task formats indeed compose a common semantic score and should therefore be considered a valid measure of this cognitive domain.
(Step 4) The remaining items were summed into composite scores for each subtest, and these values were submitted to an Exploratory Factor Analysis (EFA) and a new CFA to assess a general semantic memory construct (BAMS total), together with another Cronbach's Alpha.

Table 2.
CFA for each subtest and the battery (n = 114). * WLSMV chi-square cannot be used for difference testing in a regular way. BAMS: Semantic Memory Battery.

Table 5.
Sample scores of each subtest, SEM, SEF and total BAMS.