Allelic frequencies and statistical data obtained from 12 CODIS STR loci in an admixed population of the Brazilian Amazon

The allelic frequencies of 12 short tandem repeat (STR) loci were obtained from a sample of 307 unrelated individuals living in Macapá, a city in the northern Amazon region of Brazil. These loci are the most commonly used in forensics and paternity testing. Based on the allele frequencies obtained for the population of Macapá, we estimated the interethnic admixture from the three parental groups (European, Native American and African) at 46%, 35% and 19%, respectively. When these allele frequencies were compared with those of other Brazilian populations and of the Iberian Peninsula population, no significant distances were observed. The interpopulation genetic distances (FST coefficients) to the present database ranged from FST = 0.0016 between Macapá and Belém to FST = 0.0036 between Macapá and the Iberian Peninsula.

The population of the Amazon region originated from the admixture of three large ethnic groups: Native Americans, European colonizers and Africans. Archeological studies have estimated that the first humans who reached the Amazon region were Paleoindians coming from the north and west of the American continent about 12,000 years ago (Salzano and Bortolini, 2002).
The mixture between Europeans and Native Americans from the Amazon region started soon after the arrival of the first European colonizers. After settling in the new territory, the Europeans began using the indigenous labor force for the occupation and exploitation of the Amazon region. During the 17th century, the indigenous slave labor force decreased and, as of the mid-18th century, Africans were introduced as a slave labor force, representing the third migration wave into the region (Curtin, 1969; Cunha, 1995).
In the State of Amapá, the first documented records of contact between Europeans and Native Americans go back to 1499, when Americo Vespucci, who took part in the expedition of Alonso de Hojeda under the orders of the Catholic sovereigns of Spain, Fernando and Isabel (Castile and Aragon), traveled along its coast, passing the Cavian, the Pigs and the Pará islands, which face the current capital of the State of Amapá. An important record of the arrival of Africans in Amapá was made in 1764, during the construction of the Fortress of São José de Macapá, where Africans and Tucuju, Aruan, Aruaque and other natives living in the delta of the Amazon River and on the island of Marajó were used as a labor force (Morais and Morais, 2000).
Defining from the genetic point of view what it means to be Brazilian is a difficult task, especially considering that this is one of the most heterogeneous populations in the world. Countless scientific studies have attempted to quantify the contributions of Native Americans, Europeans and Africans to the shaping of the current Brazilian population, but the only consensus is that the admixture dynamics that occurred in Brazil are unique and highly complex.
The main purpose of this work was to estimate population parameters based on the allele frequencies obtained for 12 polymorphic autosomal STR loci investigated in a sample of the population of Macapá, and to compare the results with those of other Brazilian populations and of the Iberian Peninsula. This study was approved by the SEAMA College Research Ethics Committee (REC Resolution no. 023/2007).
After informed consent was obtained, 3 mL samples of peripheral blood were collected from 307 unrelated healthy individuals (185 women, 122 men; mean age 22.3 years; range 18 to 80 years) who live in the city of Macapá (0°02'20" N; 51°03'59" W), state of Amapá, northern Brazil, and who were recruited during routine examinations at the Macapá UNILAB Clinical Analysis Laboratory. Genomic DNA was extracted using the phenol-chloroform protocol described by Sambrook et al. (1989), and DNA quantification was done with a NANODROP 1000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA). Between 1 and 5 ng of target DNA was used to co-amplify the 12 short tandem repeats (D8S1179, D21S11, D7S820, CSF1PO, TH01, D13S317, D16S539, vWA, TPOX, D18S51, D5S818, FGA) investigated in this study. The PCR primer sequences and DNA amplification conditions were described previously (Ribeiro-Rodrigues et al., 2007). Electrophoresis and genotyping were performed in an ABI 3130 Avant Automated Sequencer (Applied Biosystems, Foster City, CA, USA). Data acquisition was performed with the ABI PRISM 3130-Avant Data Collection v2.0 software (Applied Biosystems), and profile analysis with the GeneMapper ID v3.2 software (Applied Biosystems). Genotyping quality and allele designations were assured by simultaneous electrophoretic analysis of a control sample of known size. Allele designations were made using the ABI GS ROX 500 reference ladder (Applied Biosystems) as size standard, according to published nomenclature and in concordance with the National Institute of Standards and Technology (NIST) guidelines for forensic STR analysis.
Allele frequencies, heterozygosity (H), polymorphism information content (PIC), power of discrimination (PD), power of exclusion (PE) and the probability of deviation from Hardy-Weinberg equilibrium (P) were obtained using the Arlequin Version 2.000 software (Schneider et al., 2000). Matching probability (MP) and typical paternity index (TPI) were calculated for each locus using the Powerstats V12 software (Tereba, 1999). Interethnic admixture was calculated using the ADMIX 95 software. Genetic distance FST coefficients were determined from the allelic frequencies using the DISPAN software (Ota, 1993) for the 12 loci analyzed. The FST matrix and UPGMA (unweighted pair group method with arithmetic mean) tree analysis were performed using the GDA program (Lewis and Zaykin, 2001). The tree was displayed by means of the TreeView software (Page, 1996).
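As an illustration of how the per-locus parameters listed above relate to the genotype data, the following minimal Python sketch computes them for a single locus. It assumes the formulas commonly used for these statistics (MP as the sum of squared genotype frequencies, PD = 1 - MP, PE = h²(1 - 2hH²) and TPI = 1/(2H), where h and H are the observed proportions of heterozygotes and homozygotes); the function name and the input format are hypothetical, and the sketch is not a reimplementation of Arlequin or Powerstats.

```python
from collections import Counter
from itertools import combinations


def locus_statistics(genotypes):
    """Per-locus forensic parameters from a list of (allele, allele) genotypes.

    Hypothetical helper: the formulas are the commonly used textbook ones and
    are assumed, not taken from the software cited in the text.
    """
    n = len(genotypes)
    # Genotypes are unordered, so (9, 10) and (10, 9) are counted together.
    genotype_counts = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)

    allele_freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    sum_p2 = sum(p * p for p in allele_freqs.values())

    # Observed proportions of heterozygotes (h) and homozygotes (H).
    h = sum(c for (a, b), c in genotype_counts.items() if a != b) / n
    H = 1.0 - h

    # Matching probability: chance that two random individuals share a genotype.
    mp = sum((c / n) ** 2 for c in genotype_counts.values())

    # Polymorphism information content: 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2.
    pic = 1.0 - sum_p2 - sum(
        2 * (p * q) ** 2 for p, q in combinations(allele_freqs.values(), 2)
    )

    return {
        "allele_freqs": allele_freqs,
        "Ho": h,                      # observed heterozygosity
        "He": 1.0 - sum_p2,           # expected heterozygosity (no bias correction)
        "PIC": pic,
        "MP": mp,
        "PD": 1.0 - mp,
        "PE": h * h * (1.0 - 2.0 * h * H * H),
        "TPI": float("inf") if H == 0 else 1.0 / (2.0 * H),
    }
```

For example, `locus_statistics([(9, 9), (9, 10), (10, 9), (10, 10)])` describes a toy locus with two equally frequent alleles and half of the individuals heterozygous.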
The forensic parameters investigated show high average values: polymorphism information content (PIC) = 77%; power of discrimination (PD) = 93%; power of exclusion (PE) = 61%; observed heterozygosity (Ho) = 80%. The cumulative matching probability (MP), i.e., the probability of finding another person with the same genetic profile using these 12 markers, was 0.000000000000095, and the cumulative typical paternity index (TPI) was 73,200.00. The TPI is an index based on Bayesian statistics that gives the ratio between the probability of the alleged father being the true parent and the probability of the alleged father not being the true parent, using the 12 analyzed markers.
The FGA marker showed the highest level of heterozygosity (89.5%), and the TPOX marker showed the lowest (71.3%). The power of discrimination and the power of exclusion for the 12 STRs studied were 99.9999999999992% and 99.9991%, respectively (Table 1).
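Cumulative values such as those reported above are usually obtained by combining the per-locus statistics under the assumption that the loci are independent. The sketch below shows this combination in Python; the function name is hypothetical and the input values in the usage note are made up for illustration, so it does not reproduce the figures of this study.

```python
from math import prod


def combine_loci(mp_values, pe_values, tpi_values):
    """Combine per-locus values, assuming the loci are independent.

    Hypothetical helper; inputs are lists with one value per locus.
    """
    # A full-profile match requires a match at every locus.
    cumulative_mp = prod(mp_values)
    return {
        "cumulative_MP": cumulative_mp,
        # Discrimination fails only if the profiles match at all loci.
        "combined_PD": 1.0 - cumulative_mp,
        # Exclusion fails only if no locus excludes the alleged father.
        "combined_PE": 1.0 - prod(1.0 - pe for pe in pe_values),
        "cumulative_TPI": prod(tpi_values),
    }
```

Calling `combine_loci([0.1, 0.2], [0.5, 0.6], [2.0, 3.0])` combines two illustrative loci; with 12 loci the lists would simply hold 12 values each.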
By comparing the allele frequencies obtained for the 12 autosomal STR systems investigated in the population of Macapá (Table 1) [...] However, when the ancestry percentages estimated for the population of Macapá are compared with the percentages described for other Brazilian populations, it becomes clear that the dynamics of admixture vary regionally in Brazil. Thus, when the results obtained for the population of Macapá were compared with those of populations from different geopolitical regions of Brazil (Grattapaglia et al., 2001; Ferreira da Silva et al., 2002; Dellalibera et al., 2004; Góes et al., 2004; Ribeiro-Rodrigues et al., 2007; São-Bento et al., 2008; Ocampos et al., 2009; Figure 1), the population of Macapá proved to be closest, in terms of genetic distance, to the population of Belém (FST = 0.0016), in strict accordance with their geographic location and history of colonization. The Iberian Peninsula (FST = 0.0036) is clearly the most distant population, followed by São Paulo (FST = 0.0029). These results agree with other population studies and historical data and are consistent with the anthropological origins (Caucasian, African and Native American) of the Brazilian populations tested.
Figure 1 shows a clear-cut grouping of the populations of cities or states that are geographically closer to each other, as in the case of Macapá and Belém, Santa Catarina and São Paulo, and Pernambuco and Alagoas. These results are in agreement with other studies (Handley et al., 2007) indicating that gene flow among population groups is inversely proportional to the geographic distance between them. They also agree with the fact that, owing to the particular occupation policies adopted for such a vast territory, the admixture process occurred in different ways in different geographic regions of the country.
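To make the FST values quoted above more concrete, the following Python sketch computes a simple pairwise FST from the allele frequencies of two populations as (HT - HS)/HT, where HS is the mean within-population heterozygosity and HT the heterozygosity of the pooled population, both summed over loci. This is an illustrative estimator without sample-size correction; the estimators implemented in DISPAN and GDA may differ, and the function name and input format are assumptions.

```python
def pairwise_fst(freqs_a, freqs_b):
    """Simple FST between two populations from per-locus allele frequencies.

    Hypothetical helper: each argument is a list with one dict per locus,
    mapping allele -> frequency. Uses FST = (HT - HS) / HT with the
    heterozygosities summed over loci and equal population weights.
    """
    hs_sum = 0.0
    ht_sum = 0.0
    for locus_a, locus_b in zip(freqs_a, freqs_b):
        alleles = set(locus_a) | set(locus_b)
        # Expected heterozygosity within each population.
        het_a = 1.0 - sum(locus_a.get(x, 0.0) ** 2 for x in alleles)
        het_b = 1.0 - sum(locus_b.get(x, 0.0) ** 2 for x in alleles)
        # Expected heterozygosity of the pooled population.
        het_t = 1.0 - sum(
            ((locus_a.get(x, 0.0) + locus_b.get(x, 0.0)) / 2.0) ** 2
            for x in alleles
        )
        hs_sum += (het_a + het_b) / 2.0
        ht_sum += het_t
    return (ht_sum - hs_sum) / ht_sum if ht_sum > 0 else 0.0
```

Two populations with identical allele frequencies give a value of zero, and the value grows as the frequencies diverge, which is why highly polymorphic loci with similar frequencies across populations yield small distances such as those reported here.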
The data in Table 2 are in agreement with those of Salzano and Bortolini (2002), indicating that in northeastern Brazil the African contribution is high and the Native American component is low; in the North, the contribution of Native Americans is pronounced, whereas in the South the Native American and African influence is reduced compared to all the other geographic regions.
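The idea behind such admixture estimates can be sketched as follows: the allele frequencies of the admixed population are modeled as a weighted mixture of the frequencies of the three parental populations, and the weights are chosen to fit the observed frequencies. The Python sketch below uses ordinary least squares with the weights constrained to sum to one; this is an assumed, simplified stand-in and not the method implemented in ADMIX 95, and the function name and parental frequencies in the usage note are hypothetical.

```python
def estimate_admixture(hybrid, parent_a, parent_b, parent_c):
    """Least-squares admixture proportions for three parental populations.

    Hypothetical helper: all arguments are equal-length lists of allele
    frequencies (same allele order, loci concatenated). Returns
    (m_a, m_b, m_c), which sum to 1 but are not forced into [0, 1].
    """
    # Substitute m_c = 1 - m_a - m_b, leaving two free parameters.
    x = [a - c for a, c in zip(parent_a, parent_c)]
    y = [b - c for b, c in zip(parent_b, parent_c)]
    z = [h - c for h, c in zip(hybrid, parent_c)]

    xx = sum(v * v for v in x)
    yy = sum(v * v for v in y)
    xy = sum(u * v for u, v in zip(x, y))
    xz = sum(u * v for u, v in zip(x, z))
    yz = sum(u * v for u, v in zip(y, z))

    # Solve the 2x2 normal equations by Cramer's rule.
    det = xx * yy - xy * xy
    if det == 0:
        raise ValueError("parental frequencies are not distinguishable")
    m_a = (xz * yy - yz * xy) / det
    m_b = (yz * xx - xz * xy) / det
    return m_a, m_b, 1.0 - m_a - m_b
```

As a check on synthetic data, a hybrid vector built as 0.46, 0.35 and 0.19 times three made-up parental vectors such as `[0.7, 0.2, 0.1]`, `[0.1, 0.6, 0.3]` and `[0.2, 0.2, 0.6]` is recovered with those same proportions. Real data would require constraining the estimates to the interval from 0 to 1 and assessing their sampling error.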
The absence of significant genetic distances between the population of Macapá and the other Brazilian populations observed in this study is due to the fact that the markers used have low FST values between different human ethnic groups. This characteristic was deliberately selected and is important, because these markers are used in human identification studies, including civil and criminal forensic investigations. It is therefore not desirable that they present significant differences in gene frequencies among different population groups, since this could increase the risk of statistical errors, such as overestimation of the paternity index arising from population substructure.
[...] (Dellalibera et al., 2004), São Paulo (São-Bento et al., 2008), Santa Catarina (Ocampos et al., 2009), Rio de Janeiro (Góes et al., 2004), Brazil (Grattapaglia et al., 2001), and the Iberian Peninsula (Ribeiro-Rodrigues, 2003, Master's Thesis).