Genetic recurrence and molecular markers of dyslexia in the Brazilian population

ABSTRACT Purpose: to investigate genetic recurrence and molecular markers for dyslexia in two candidate genes in the Brazilian population. Methods: a cross-sectional, case-control, observational study in which five single nucleotide polymorphisms (SNPs) in the DYX1C1 and KIAA0319 genes were studied in 86 subjects with dyslexia and 66 controls, matched for gender and age. SNPs were genotyped using real-time polymerase chain reaction, and the distribution of genotypic and allelic frequencies between the groups was analyzed. Results: 68% of the subjects with dyslexia presented a family history of learning difficulties. The DYX1C1 gene did not demonstrate an association with dyslexia, whereas an association was found for the rs9461045 marker of the KIAA0319 gene. Conclusion: a family history of learning problems was present in more than two-thirds of the group with dyslexia, indicating that this is an important risk factor. An association with dyslexia was noted for the rs9461045 marker, making this the first study to show an association of the KIAA0319 gene with dyslexia in Latin America.


INTRODUCTION
Dyslexia is a heterogeneous, neurofunctional disorder that affects language, characterized chiefly by an unanticipated difficulty in learning to read and write, despite adequate intelligence, motivation, educational opportunity, and a suitable social environment, and in the absence of sensory or neurological deficits 1. Its prevalence is reported to be about 5-12% in school-aged children, but it varies as a result of differences in diagnostic criteria 2. Causal mechanisms have been investigated and attributed to genetic and environmental factors 3.
From a neuroanatomical point of view, dyslexia may be characterized by the presence of abnormalities in the normal pattern of neuron migration, which principally affect the perisylvian regions of the brain's left hemisphere. Neuroimaging studies have confirmed these structural abnormalities as well as abnormalities in the functional organization of these cortical areas. An abnormal pattern of neuronal migration in some cortical areas associated with dyslexia is related to the functional nature of genes whose mutation appears to be a causal factor. Such genes would be responsible for coding the regulation mechanisms of radial migration of neurons and the growth of axons 4. Research also indicates that susceptibility to dyslexia correlates with at least nine loci: DYX1 (15q21), DYX2 (6p21), DYX3 (2p16-p15), DYX4 (6p13-q16), DYX5 (3p12-q12), DYX6 (18p11), DYX7 (11p15), DYX8 (1p34-p36) and DYX9 (Xp27) 1,5,6. The primary candidate genes are DYX1C1 on chromosome 15 6-8, KIAA0319 on chromosome 6 9,10, DCDC2 on chromosome 6 2,11, KIAA0319L on chromosome 1 12, ROBO1 on chromosome 3 13, and MRPL19 and C2ORF3 on chromosome 2 14. In general, genetic factors appear to account for 30-70% of the variability in reading ability in a given population 15.
The inheritance pattern of dyslexia, that is, autosomal dominant, autosomal recessive, or polygenic, is not clearly established. The accuracy and replicability of this research are limited by inherent difficulties in the characterization and evaluation of the phenotype, the reduced size of the samples, genetic heterogeneity, and limitations of the statistical methods 16.
An alternative to performing a more refined genetic analysis is an association study in which specific polymorphic markers are used for the candidate gene, and an association is noted when a particular allele of a microsatellite or single nucleotide polymorphism (SNP) is present with increased or reduced frequency in affected subjects compared with controls.
SNPs are biallelic markers resulting from substitutions of nitrogenous bases during DNA replication as a result of spontaneous or induced mutational processes. Abundant in the human genome, these simple framework markers constitute a valuable tool to identify genes that may explain the variation of complex phenotypes 17.
It should be noted that genetic factors that could account for dyslexia have yet to be identified. The process for determining a potential genetic etiology for a given phenotype should follow a logical progression. The initial step would construct a descriptive epidemiology, in which variations in geographic origin and descent (commonly referenced as ethnicity or race), social class, age, and gender might indicate the involvement of genetic or environmental factors. The next would examine whether the phenotype presents familial aggregation, that is, a higher occurrence in certain families than would be expected by chance 18. The occurrence of complex inheritance phenotypes, such as dyslexia, results from a combination of genetic and environmental factors, in part predictable and in part accidental. Thus, one must first distinguish between accidental family aggregations, in which there is a systematic tendency for a phenotype to segregate across generations, and those involving a genetic component, in which, in many cases, the inheritance pattern does not follow such simple models. Segregation analysis assists in determining the presence of one or more major genes within families that may explain all or part of the family aggregation of the observed characteristic of interest. The fundamental importance of evaluating heritability is its application in identifying genes related to specific characteristics. The determination that a given characteristic is significantly correlated with heritability indicates a potentially favorable prognosis for investigating its genetic determinants 19. The results of past research can significantly reduce the number of candidate genes that need to be studied and facilitate the prioritization of genes and markers for the detection of mutations, thus reducing research costs 20.
Several studies have examined the association of dyslexia with markers, especially in the DYX1C1 and KIAA0319 candidate genes, primarily in European and Asian populations, but none has involved Latin American populations. Therefore, this study investigates the genetic recurrence and molecular markers for dyslexia in the Brazilian population, in the two previously reported candidate genes.

Inclusion and exclusion criteria
Experimental Group inclusion criteria: Consent forms signed by parents or guardians, authorizing participation in the study; no complaint of visual or auditory acuity; normal intellectual performance (IQ >80); diagnosis of dyslexia, according to DSM-5 21 criteria, performed by a multidisciplinary team. Exclusion: multidisciplinary diagnosis of specific language disorder, ADHD or other neurological or psychiatric disorders.
Control Group inclusion criteria: No complaint of visual or auditory acuity; school performance as expected for age and grade, according to the family and school report; satisfactory results in PROLEC standard reading test 22

Subjects
The recruitment of subjects with dyslexia was conducted in diagnostic research centers in 3 cities in the state of São Paulo, Brazil. An interdisciplinary team comprising speech therapists, neuropsychologists and neurologists was established to assure the diagnosis of dyslexia, conforming to the criteria of the Diagnostic and Statistical Manual of Mental Disorders: (1) persistent difficulties in learning and using one or more of the academic domains (i.e., reading fluency, reading comprehension, and/or written expression) for at least six months, although targeted skill interventions have been given; (2) academic skills are below what is expected at the individual's age, which impairs functioning in school, at work and in activities of daily living; (3) early signs of learning difficulties may appear in the preschool years (e.g., difficulty learning names of letters or counting objects), but they can only be diagnosed reliably after starting formal education; (4) those who have intellectual developmental disorders, global developmental delays, hearing or vision disorders, psychosocial difficulties, language differences and who lack proficiency in the language of academic instruction are excluded 21. Accordingly, 86 subjects, aged 7 to 17, who were diagnosed with dyslexia participated in the Experimental Group, while 66 subjects, aged 7 to 17, consisting of elementary and middle school students from two public schools, who presented no reading problems, constituted the Control Group. The Control Group is smaller than the Experimental Group due to the exclusion of subjects who did not engage in all research stages. This did not compromise the comparisons: statistical analysis demonstrated no significant difference between the groups in gender and mean age distribution (described in Table 1).

Genetic and molecular analyses
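Genetic analyses were conducted in the laboratory of pharmacology and genetics of a higher education institution in the state of São Paulo, Brazil. The organization chart (Fig 1) summarizes a sequence of investigations proposed by Burton 19 and used to identify and characterize genetic determinants of complex diseases. This study follows some of these steps to examine the molecular characteristics of dyslexia in Brazil, as no case-control findings have been reported for this population.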
Figure 1 (flow chart). Questions addressed in sequence: Is there evidence of phenotypic aggregation within families? (If not, consider non-genetic influence.) Is the pattern of aggregation consistent with an effect of genes? Is there evidence of a gene with a substantial enough effect to justify expensive studies to attempt to identify it? Where in the genome is the causative gene most likely to lie? Can we be more precise about its position? Is there a causative polymorphism? Is there an identifiable haplotype block? Does the polymorphism affect mRNA? In which tissues is mRNA expressed? Is there an effect on the protein product? Methods named in the chart: segregation analysis; linkage analysis; association analysis; linkage disequilibrium mapping; haplotype analysis.
For each allelic discrimination of the genes, Forward and Reverse Primer oligos were used; the normal sequence was labeled with the VIC fluorophore (with the exception of that of SNP rs11629841, which used the FAM fluorophore) and the mutated sequence with the FAM fluorophore. The tests present four oligonucleotides: the Forward Primer, which extends from the 5'-position to the 3'-position of the DNA; the Reverse Primer, from the 3'-position to the 5'-position; and two probes that go from the 5' to the 3'-position.
The Forward and Reverse Primers flank the DNA region that contains the polymorphism and is amplified by the PCR technique. The probes anneal precisely to the region of the polymorphism.
Thus, following the experiment, amplification of DNA fragments labeled with the FAM or VIC fluorophore could be observed, indicating whether the subject is a normal or polymorphic homozygote, depending on the assay for the given gene. Amplification of DNA fragments labeled with both probes characterized the subject as a heterozygote.
Primers are specific to the regions adjacent to the polymorphic site, and the probes hybridize to the DNA segment by complementarity, according to its alleles. During amplification of the segment delimited by the primers, the probe is degraded by the enzyme Taq DNA polymerase; this separates the fluorophore from its quencher, and the fluorescence intensity increases exponentially, as captured by the ViiA 7 in each PCR cycle.
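The calling rule described above (one probe signal for a homozygote, both for a heterozygote) can be sketched as follows. This is only an illustration and not the clustering algorithm of the ViiA 7 software: the function name `call_genotype`, the assumption that signals are normalized to the range 0 to 1, and the 0.5 threshold are all hypothetical.

```python
def call_genotype(vic_signal: float, fam_signal: float, threshold: float = 0.5) -> str:
    """Classify one sample from normalized end-point VIC and FAM signals.

    Hypothetical assumptions: both signals are scaled to [0, 1] and a single
    fixed threshold separates an amplified probe from a non-amplified one.
    """
    vic_on = vic_signal >= threshold
    fam_on = fam_signal >= threshold
    if vic_on and fam_on:
        return "heterozygote"              # both probes amplified
    if vic_on:
        return "homozygote (VIC allele)"   # only the VIC-labeled probe amplified
    if fam_on:
        return "homozygote (FAM allele)"   # only the FAM-labeled probe amplified
    return "undetermined"                  # e.g. no-template control or failed well
```

Which homozygote corresponds to the normal sequence depends on the assay; in the text, VIC marks the normal sequence for every SNP except rs11629841.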
The experiments were performed twice on 384-well plates (Applied Biosystems, catalog 4309849), using the final 5 μL mixture previously described for the reaction.
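As an arithmetic check on the reaction setup, the per-well volumes reported in the Methods (2 μl of DNA, 2.5 μl of master mix, 0.125 μl of assay, 0.375 μl of water) sum to the 5 μl final volume. The sketch below also scales the shared reagents to a plate; the function name and the 10% pipetting overage are assumptions for illustration, not values from the study.

```python
# Per-well volumes in microliters, as reported in the Methods.
PER_WELL_UL = {
    "DNA sample": 2.0,
    "GTXpress Master Mix": 2.5,
    "genotyping assay": 0.125,
    "water": 0.375,
}

# Final reaction volume per well; the components add up to 5.0 microliters.
total_per_well = sum(PER_WELL_UL.values())


def master_mix_volumes(n_wells: int, overage: float = 0.10) -> dict:
    """Scale the shared reagents (everything except DNA) to n_wells reactions.

    The default 10% overage is an assumed bench convention, not a study value.
    """
    factor = n_wells * (1.0 + overage)
    return {
        name: round(volume * factor, 3)
        for name, volume in PER_WELL_UL.items()
        if name != "DNA sample"  # DNA is added to each well individually
    }
```

For example, `master_mix_volumes(384, overage=0.0)` gives the reagent totals for a full 384-well plate with no overage.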
The cycling sequence for performing real-time PCR was as follows: initial temperature of 95 °C for 10 minutes, then 50 cycles of 95 °C for 3 seconds, followed by 60 °C for 20 seconds. Negative controls consisting of the reaction without DNA were used to assess any DNA contamination of the reagents.
Samples submitted to the real-time PCR reaction were subsequently analyzed with ViiA 7, which reported the genotype results.

Subjects who lived in the same city as the higher education institution were instructed to refrain from food, drink, or brushing their teeth for at least half an hour prior to the collection of their saliva. Participants were instructed to stimulate salivation, and expectoration was performed into sterile 15-ml graduated Falcon tubes until a volume of 5 ml was attained for each. The tubes, preserved in crushed ice, were transported to the laboratory of pharmacology and genetics. Following aliquoting in 1.5-ml tubes, samples were stored in a freezer at -20 °C until DNA extraction. From subjects who lived in the other cities, 8-15 ml of peripheral blood was collected. Participants were instructed to follow the manufacturer's instructions, which were identical to those for the collection in the higher education institution, save for the collection of a 2-ml sample, followed by homogenization with the kit's preservative liquid and shaking for 5 seconds. The samples were subsequently stored in a freezer at -20 °C following the manufacturer's instructions.
For subjects in the experimental and control groups from the same city as the higher education institution, the extraction was performed using the DNA Extract All Reagents kit (Thermo Fisher Scientific, catalog number 4403319). First, 2 μl of the previously homogenized saliva was transferred to a clean microtube; then 20 μl of the lysis solution was added prior to homogenization and centrifugation. Following this process, the reaction rested at room temperature for 3 minutes before 20 μl of the stabilizer solution was added. For those who lived in the other cities, genomic DNA was extracted from leukocytes, collected in Vacutainer tubes containing 10% EDTA, using the phenol-chloroform method. After extraction, the samples were quantified with a NanoDrop ND-1000 spectrophotometer.
The genotypes were analyzed using the real-time PCR technique. The reactions were carried out in a ViiA 7 thermocycler (Applied Biosystems), using pre-standardized and experimentally validated TaqMan SNP genotyping assays (Applied Biosystems), following the manufacturer's instructions. To prepare the reaction, 2 μl of the sample obtained in the DNA extraction was used with 2.5 μl of GTXpress Master Mix (Thermo Fisher, catalog number 4401892), 0.125 μl of each genotyping assay, and 0.375 μl of water, for a final volume of 5 μl.

Analysis of heredity
The software PELICAN 1.1.0 (Pedigree Editor for Linkage Computer Analysis) was used to analyze the recurrence of dyslexia or learning problems in the family history. The recurrence information was collected through an interview (anamnesis) with parents or guardians, who were asked about previous diagnoses of dyslexia (by an interdisciplinary team) or similar difficulties in family members. Individuals whose phenotype could not be proven were represented by a question mark in the pedigree. For this purpose, the number of cases in which both parents of individuals with dyslexia are affected, and those in which only one parent or none is affected, and their gender were estimated. The same assessment was also made regarding grandparents, uncles and cousins; it was therefore necessary to collect data from at least three generations, on the paternal and maternal sides.

Data analysis
Calculation of allele and genotype frequencies and the association analysis were performed using the SNPStats web tool 23. Hardy-Weinberg equilibrium for the SNPs studied was tested in the control group via an exact test rather than the chi-square approximation. In this test, p values less than 0.05 indicate deviation from equilibrium. Five logistic regression models were fitted, corresponding to the codominant, dominant, overdominant, recessive, and log-additive inheritance models. The effect of genetic association was expressed as the odds ratio (OR) with its 95% confidence interval (CI).
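To make the effect measure concrete, the sketch below computes an odds ratio and its Wald 95% confidence interval from a 2x2 table of counts. It is a minimal illustration and not the SNPStats implementation; for a single binary predictor with no covariates, an unadjusted logistic regression yields the same OR and Wald interval. The function name and cell labels are assumptions.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a = cases with the exposure (e.g. risk genotype), b = cases without,
    c = controls with the exposure, d = controls without.
    z = 1.96 corresponds to a 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the square root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that excludes 1 corresponds to an association that is significant at the matching level; the interval is symmetric around the OR on the log scale.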

Family history of learning difficulties
As Table 2 depicts, 68% of the subjects with dyslexia presented a family history of learning difficulties, while only 11% of the Control Group had this history. It was decided not to display the Control Group data in Tables 2 and 3, considering the reduced sample size compared to the other group. Of the dyslexic subjects, 19 (43%) have affected parents; 14 (32%) uncles or aunts; 10 (23%) siblings; 9 (20%) grandparents; 5 (11%) cousins; and 1 (2%) a niece. Fourteen (67%) of the female and sixteen (70%) of the male subjects had a family history of learning problems (Table 3). The gender difference was not statistically significant.

Genotyping of experimental and control groups
In comparing genotypes, an association with dyslexia was found for marker rs9461045 in the codominant and log-additive models. Although allele frequencies were higher in the subjects with dyslexia for every marker except rs4504469, no association was found between dyslexia and the other markers (Table 4).
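The inheritance models named in the analysis differ only in how a genotype is coded as a predictor in the logistic regression. The sketch below shows the conventional coding in terms of the number of minor alleles carried (0, 1 or 2); the function name and dictionary keys are illustrative assumptions.

```python
def encode_genotype(minor_allele_count: int) -> dict:
    """Return the predictor coding of one genotype under five inheritance models.

    minor_allele_count: 0 (major homozygote), 1 (heterozygote), 2 (minor homozygote).
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    n = minor_allele_count
    return {
        # Codominant: each genotype gets its own effect (two indicator variables).
        "codominant": (int(n == 1), int(n == 2)),
        # Dominant: one copy of the minor allele is enough.
        "dominant": int(n >= 1),
        # Recessive: two copies are required.
        "recessive": int(n == 2),
        # Overdominant: heterozygotes compared with both homozygotes pooled.
        "overdominant": int(n == 1),
        # Log-additive: each copy multiplies the odds by the same factor.
        "log_additive": n,
    }
```

Under this coding, an association seen in the log-additive model implies a per-allele trend, whereas the codominant model estimates separate odds ratios for heterozygotes and minor homozygotes.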
In the Hardy-Weinberg equilibrium test of the control group, a deviation from equilibrium was observed for SNP rs4504469, which reduces the reliability of the statistical comparison of genotypes between groups for this marker. For the other SNPs, no deviation was observed in the control group, supporting their reliability for statistical analysis (Table 5).
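For readers unfamiliar with the exact test of Hardy-Weinberg equilibrium, the sketch below enumerates every heterozygote count compatible with the observed allele counts and sums the probabilities of outcomes no more likely than the observed one. It is a minimal illustration under that standard definition and is not necessarily the routine used by SNPStats; the function name is an assumption, and the genotype counts in the usage note are made up.

```python
import math


def hwe_exact_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Two-sided exact p value for Hardy-Weinberg equilibrium at a biallelic SNP."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_prob(het: int) -> float:
        # Probability of observing `het` heterozygotes given n, n_a and n_b.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (
            het * math.log(2)
            + math.lgamma(n + 1)
            - math.lgamma(hom_a + 1) - math.lgamma(het + 1) - math.lgamma(hom_b + 1)
            + math.lgamma(n_a + 1) + math.lgamma(n_b + 1)
            - math.lgamma(2 * n + 1)
        )

    # Heterozygote counts must share the parity of the allele counts.
    het_counts = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {het: math.exp(log_prob(het)) for het in het_counts}
    observed = probs[n_ab]
    # Sum outcomes at most as probable as the observed one (small tolerance for ties).
    p_value = sum(p for p in probs.values() if p <= observed * (1 + 1e-7))
    return min(1.0, p_value)
```

With made-up counts, `hwe_exact_p(25, 50, 25)` is close to 1 (the sample sits at equilibrium), while `hwe_exact_p(50, 0, 50)` falls far below 0.05, which by the criterion in the text would be read as a deviation.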

DISCUSSION
Most subjects with dyslexia (68%) presented a history of familial recurrence of learning problems with no indication of a gender effect.The difficulty of collecting data on grandparents should be noted as some parents provided incomplete information since grandparents had dropped out of primary school, making it difficult to determine learning difficulties.
Previous research estimates the inheritance of dyslexia at about 80% 2,24, which is in line with the findings of this study.
As for affected relatives, most were parents (43%), followed by uncles and aunts (32%), siblings (23%), grandparents (20%) and cousins (11%). The lower rate for siblings could arise from the fact that not all subjects with dyslexia had siblings. Research indicates that some 35% to 40% of first-degree relatives (parents, siblings, or offspring) of persons with dyslexia are affected 25,26, which concurs with this study's findings.
Thus, family history is deemed one of the most significant risk factors for dyslexia, since there is evidence that families that have a member with dyslexia have at least one other who has similar difficulties 25. Some studies suggest that inheritance is greater in males 2,27. Other research, however, has found no statistically significant gender difference 28, which is consistent with the findings of this study.
Thus, the data found in this study and in previous research, as reported herein, reinforce the hypothesis of genetic cause.
The results of this study suggest that the genetic variant rs9461045 in KIAA0319 is significantly associated with dyslexia in Brazilians, with no association observed for the other markers. Although the gene's functions are not thoroughly understood 29, studies have indicated that KIAA0319 is related to decreased neuronal migration and intercellular adhesion 6,30. The markers rs4504469, rs761100, and rs9461045 were associated with dyslexia in European 10, Indian 31, and Chinese populations 32. An association with dyslexia is reported for rs4504469 in populations of the United Kingdom, Germany, and India 10,31.
The marker rs9461045 can alter KIAA0319 gene expression in neuronal and non-neuronal cells and create a binding site for the octamer-1 transcriptional repressor 3,33.
In a cohort study 34 with 141 children aged 3 to 12, significant associations were found with reading comprehension for KIAA0319, NRSN1, CNTNAP2, and CMIP, with KIAA0319 also associated with reading rate.
A case-control, meta-analysis study of German subjects 35 found that DYX1C1, KIAA0319, and DCDC2 were associated with dyslexia.Of the 16 SNPs in the 5 genes studied, including rs4504469 in KIAA0319 gene, the authors found greater allelic risks in subjects with dyslexia than in controls.Among the SNPs evaluated in KIAA0319, rs2038137 and rs6935076 were associated with dyslexia, and no relationship was found with rs4504469.
The latter finding is confirmed by the present study, which found no difference between the experimental and control groups for this marker. Some studies that examined linkage disequilibrium reported that the haplotype rs4504469-rs2038137-rs2143340 was associated with reading difficulties 1,33. In the present study, however, no associations of rs4504469 with dyslexia were observed.
Neuroimaging results in 332 European subjects aged 3 to 20 reported an association between DCDC2, KIAA0319, ACOT13, and FAM65B. They found that some markers, including rs9461045 in KIAA0319, demonstrated an association with decreased cortical thickness in the left orbitofrontal region 36.
A meta-analysis study 37 was conducted to assess the association of polymorphisms in KIAA0319 and the risk for dyslexia in Asia. The research was based on seven case-control studies involving a total of 2,711 cases and 2,991 controls, and five studies of linkage disequilibrium, involving 943 families. The results indicated that none of the six markers examined, including rs4504469 and rs761100, demonstrated an association. However, a stratified ethnic analysis found divergent associations regarding rs4504469 in KIAA0319 in European and Asian subjects, with a protective effect in the European population (OR = 0.90, 95% CI = 0.83-0.99, p = 0.028), but a risk effect in Asians (OR = 1.56, 95% CI = 1.28-1.90, p <0.001). This stratification also showed that the minor allele of the SNP rs9461045 (allele T), also examined in the present study, had a protective effect in Asians (OR = 0.82, 95% CI = 0.68-0.98, p = 0.026). The authors recommended further research involving different ethnicities to confirm their findings.
Another study evaluated 20 SNPs in the DYX1C1, KIAA0319, and DCDC2 genes in the same sample and found no association between the markers studied and dyslexia 38. They continued the investigation 31 of markers in DYX1C1 and KIAA0319. In the former case-control study, expanded by the addition of 210 Indian children with dyslexia and 256 without reading difficulties, they examined SNPs in KIAA0319 and DCDC2 and found an association with rs4504469 in KIAA0319 (OR = 2.53, 95% confidence interval = 1.36-4.71), a result which does not correspond with this study, which found no association for this marker. Analysis of dominant, recessive, and additive models demonstrated the same association in the dominant model, while no significant association with rs9461045 was observed in either group, an inverse result from the present study, which found an association for rs9461045, and none for rs4504469. In their prior study, involving a smaller sample, the researchers did not find an association between rs9461045 and dyslexia, suggesting that increasing the sample size may have led to its discovery. Using the same sample from their previous study, they examined SNPs in DYX1C1 and found an association with dyslexia in rs12899331, rs142084351, and rs77641439, and no associations with rs3743205, which corresponds with the results of the present study, which also found no association for rs11629841, although associations have been reported in a study with Canadians 39.
Few studies have investigated dyslexia in Latin America from a genetic point of view, and no case-control genetic studies of Brazilians with dyslexia were found in the literature, although one study of relatives was conducted. That study 40 examined 51 subjects with dyslexia; the researchers evaluated deletions and duplications in the DCDC2, KIAA0319, and ROBO1 genes and haplotypes in DCDC2 and KIAA0319, analyzing the same markers as the present study in the latter, among others. No deletions or duplications were found in the genes studied, and no association with DCDC2 or KIAA0319 was observed.
The largest case-control genome-wide association study (GWAS) conducted to date in European populations identified a suggestive association of rs6035856, at gene LOC388780, with reading disorders 41. GWASs have been reported as the gold standard method for identifying genetic factors associated with neurodevelopmental disorders, such as dyslexia. However, the sample size is relatively low compared to other studies, and progress in conducting GWAS research for dyslexia has not been made at the same pace as for other disorders 42.
The identification of mutations in candidate genes could inform early diagnosis of reading disorders through genetic evaluation in the context of a multifactorial framework, but several levels of analysis need to be completed before such data could prove clinically useful 3. Thus, in conjunction with previously cited research 40, the present study advances understanding of genetic risk for dyslexia in a Latin American population, in particular, Brazilians. To the best of our knowledge, these findings are pioneering.

CONCLUSION
A family history of learning problems was present in most dyslexic subjects, indicating that this is a significant risk factor. Regarding molecular aspects, an association with dyslexia was observed in marker rs9461045. While a study of 86 individuals with dyslexia may be deemed a modest first step, the present study is the first genetic research to establish KIAA0319 as a candidate gene for dyslexia in a Brazilian population. Further research with greater sample sizes and more subjects from diverse regions of Brazil should be conducted to replicate the findings reported herein, and further research to identify markers for dyslexia in other Latin American populations should be undertaken.
of pharmacology and genetics of a higher education institution in the state of São Paulo, Brazil. The organization chart (Fig 1) summarizes a sequence of investigations proposed by Burton 19 and used to identify and characterize genetic determinants of complex diseases.

Table 1. Gender and mean age distribution of the groups
a Experimental Group. b Control Group. c p<0.05 = statistically significant (chi-squared test). d ± standard deviation. % = percentage.
language or learning problems specific or secondary to any pathology.

Table 2. Familial recurrence for learning problems in dyslexic subjects

Table 4. Genes and alleles among dyslexic individuals and proficient readers
c Control Group.

Table 5. Exact test of Hardy-Weinberg equilibrium
a Single Nucleotide Polymorphism. b Experimental Group. c Control Group. d p<0.05 = statistically significant. * Statistically significant.