Conservation strategies for Arapaima gigas ( Schinz , 1822 ) and the Amazonian várzea ecosystem

In the present study we report a spatial autocorrelation analysis of molecular data obtained for Arapaima gigas, and the implication of this study for conservation and management. Arapaima is an important, but critically over-exploited giant food fish of the Amazonian várzea. Analysis of 14 variable microsatellite loci and 2,347 bp of mtDNA from 126 individuals sampled in seven localities within the Amazon basin suggests that Arapaima forms a continuous population with extensive genetic exchange among localities. Weak effect of isolation-by-distance is observed in microsatellite data, but not in mtDNA data. Spatial autocorrelation analysis of genetic and geographic data suggests that genetic exchange is significantly restricted at distances greater than 2,500 km. We recommend implementing a source-sink metapopulation management and conservation model by proposing replicate high quality várzea reserves in the upper, central, and lower Amazon basin. This conservation strategy would: 1) preserve all of the current genetic diversity of Arapaima; 2) create a set of reserves to supply immigrants for locally depleted populations; 3) preserve core várzea areas in the Amazon basin on which many other species depend. We stress that conservation strategies should not only preserve current genetic diversity, but also the evolutionary processes which have generated the observed patterns.


Introduction
Arapaima gigas is one of the largest freshwater fishes of South America, attaining up to three meters in length and weighing over 200 kg (Saint-Paul, 1986;Nelson, 1994).Arapaima gigas is the only member of its genus, and is the sister species to the African Heterotis niloticus (Nelson, 1969;Lavoué and Sullivan, 2004).Arapaima is distributed predominantly in the Amazon basin, although it has also been recorded from the upper Essequibo (Rupununi) ba-sin of Guyana (Lüling, 1964).Fossil record also indicates that Arapaima was present in the Magdalena River basin (Lundberg and Chernoff, 1992).
Historically Arapaima appears to have formed an important portion of the diet of Amerindians living in the Amazon River floodplain as evidenced from middens in the Monte Alegre, PA region (Roosevelt et al., 1996).Significant commercialization of Arapaima began in the late 18 th century when salted and dried Arapaima began to be marketed as a substitute for bacalhau, the salted and dried cod (Gadus morhua).Arapaima was harvested, salted and dried, and then exported out of processing centers in Manaus, Santarém and Belém (Brazil).The commercial importance of Arapaima can be appreciated in the descriptions of Veríssimo (1895) who mentions that between the years 1885 and 1893, 11,540 tons of the dry and salted form of Arapaima were sold in Belém (1,282 t/year).Menezes (1951) cites that between 1918 and 1924, 6,775 tons of pirarucu (968 t/year) were landed in the State of Amazonas, andbetween 1919 and1921, 5,729 tons (1,920 t/year) were landed in Belém alone.In spite of an apparent extensive exploitation, Arapaima appears to have been very abundant near main Amazonian cities and centers of export of bacalhau until the early 1960's (Crossa and Petrere Jr., 1999).However, by the late 1970's Arapaima became commercially extinct near larger Amazonian cities (Goulding, 1979(Goulding, , 1980;;Bayley and Petrere Jr., 1989).All commercial fishing was banned in early 2001, but illegal fishery continues to this day.However, in 1999 limited exploitation of Arapaima was allowed within the Mamirauá Sustainable Biological Reserve, and shortly thereafter other regions in the Amazonas State were authorized for a limited and sustainable exploitation of their Arapaima stock.
While many of these reserves and community managed regions have been able to locally preserve Arapaima and some areas have seen increases in census numbers (e.g.Queiroz, 1999;Crossa, 2003;Crossa and Del Aguila Chaves, 2005), the larger problem of long-term preservation of a global scale still exists.To address the issue of long-term preservation, and specifically to answer how many reserves at what geographic scale are needed to maintain the current genetic diversity of Arapaima gigas within the Amazon basin, we use a spatial autocorrelation analysis.Spatial autocorrelation analysis can be applied to samples obtained from continuously or discretely distributed populations, describing both actual genetic variation and the dynamics of evolutionary processes.It summarizes the spatial distribution of genetic variation and is therefore useful for delimiting geographic regions at which similarity between samples is high or low.Distances at which samples become significantly dissimilar delimit genetically differentiated regions and biologically independent population even in continuously distributed species.
A total of 126 individuals sampled from seven populations were typed for 14 polymorphic microsatellite loci (Farias et al., 2003); the forward primers were labeled with the fluorescent dyes 6-FAM, HEX or TET.Polymerase Chain Reactions (PCRs) were performed in 10 µL reaction volumes containing 5.5 µL of ddH 2 O, 1.0 µL of 10x Buffer (100 mM Tris-HCl, 500 mM KCl, 15 mM MgCl 2 ), 2.0 µL of primer mix (0.2 M each forward and reverse primer), 0.8 µL of dNTP mix (200 M each dNTP), 0.2 U of Sigma RedTaq ® DNA Polymerase, and about 5 ng of DNA.PCRs were run on a Perkin Elmer GeneAmp PCR 9700 thermocycler.An initial denaturation step (94 ºC, 2 minutes) was followed by 35 cycles of 10 seconds at 94 ºC, 10 seconds at the locus specific annealing temperature (see Table 1 in Farias et al., 2003), 30 seconds at 72 ºC, and a final extension step for 60 minutes at 72 ºC.PCR products generated with labeled primers were visualized on a Perkin Elmer ABI 3100 Genetic Analyzer.Allele sizes were scored against an internal GeneScan-500 (ROX) size standard.Individuals were genotyped using the GeneScan Analysis 3.7 and Genotyper 3.7 software from Perkin Elmer.
Mitochondrial gene sequence data were taken from Hrbek et al. (2005).Briefly, the authors PCR amplified the complete NADH dehydrogenase subunit 1 gene and its 5' end flanking tRNA Leucine and the 3' end of the 16S rRNA, and an ATPase segment which included the 3' end of Cytochrome Oxidase subunit 2, tRNA Lysine, ATPase subunit 6, ATPase subunit 8, and the 5' end of the Cytochrome Oxidase subunit 3. PCR reactions were purified with Qiagen spin-columns, cycle sequencing reactions followed standard Perkin Elmer Big Dye sequencing protocol for double-stranded cycle sequencing reactions, and resulting cycle sequencing product was resolved on an ABI 3100 automatic DNA sequencer.Electrophoregrams were aligned and edited in the program BioEdit (Hall, 1999).

Data analysis
To investigate genetic connectivity among our sampling localities, we used a multivariate correlogram (Mantel, 1967) to test for an association between genetic and geographic distances of samples.The Mantel's Z statistic (Smouse et al., 1986) is given as where the element N ij represents a genetic distance between samples i and j, and W ij is the element of the connectivity matrix W; the Z statistic is the numerator  is within a given distance-class interval, otherwise it is coded as 1.Distance-class intervals were delimited at increments of 500 km.The significance of the Z statistic is tested via the non-parametric bootstrap by randomly permuting the rows and columns of one of the matrices and repeatedly calculating the Z values (Manly, 2006).The proportion of random values exceeding the observed Z value yields a Type I error -the P value -under the null hypothesis of no correlation between genetic and geographic distances (Manly, 2006).The Mantel test of correlation including significance testing was performed in the program Arlequin ver 3.1 (Excoffier et al., 2005).Correlation of within locality genetic diversity and among locality genetic differentiation and the significance of the correlation were also calculated in the program Arlequin ver 3.1 (Excoffier et al., 2005).All significance testing was done via 100,000 random permutations.
Geographic distances between localities were calculated in kilometers as shortest river-distance paths.Mitochondrial DNA genetic distances between sampling localities were calculated ST which are analogues of Cockerham and Weir's F ST F estimator (Cockerham and Weir, 1993).Two generalized genetic distance metrics have been developed specially for microsatellites, the R ST (Slatkin, 1995), and the classic F ST F (Wright, 1969).The R ST uses the variance in repeat numbers (VRN) and is in accordance with the stepwise mutation model.The F ST F distance is based on the variance in allele frequencies (VAF), and is compatible with the infinite alleles mutation model.In the present work, the population dif-f f ferentiation was analyzed using Wright's F ST F rather than Slatkins's R ST because F ST F -based estimates of differentia T tion are considered better than R ST -based estimates when T less than 20 loci are analyzed (Gaggiotti et al., 1999).Additionally, microsatellite distances based on VAF are consistently more highly correlated to mtDNA genetic distances (Richard and Thorpe, 2001), and thus these two types of analyses are more readily comparable.
Among-locality gene flow was estimated from the parameter Nm using the maximum-likelihood and Bayesian inference methods implemented in the program MIGRATE (Beerli and Felsenstein, 1999;Beerli, 2006).MIGRATE default search strategy values were used, except that we ran 20 short chains and five long chains with moderate levels of heating.

Results
Based on the analysis of 126 individuals from seven geographic localities spanning the Amazon basin, we find a significant association of geographic and microsatellite genetic distances (Mantel test r = 0.619, P = 0.012); however, there is no association of geographic and mtDNA genetic distances (Mantel test r = -0.042,P = 0.555).In the spatial autocorrelation analysis of microsatellite data where classes of connectivity were grouped at intervals of 500 km, i.e. 500, 1,000, 1,500, 2,000, 2,500, 3,000, and 3,500 km, one observes significant (P < 0.05) association between genetic and geographic distances at geographic distances of less than 2,500 km.At connectivity intervals of 3,000 and 3,500 km one no longer observes significant association between genetic and geographic distances (Figure 2).Within locality genetic diversity ( ) and among locality genetic differentiation is negatively correlated (r = -0.935,P < 0.002).Maximum-likelihood and Bayesian-inference estimates of gene flow (Beerli, 2006) among localities were very high (Table 1), while lack of differentiation among sampling localities could not be rejected (Raymond and Rousset, 1995;Goudet et al., 1996).For mitochondrial DNA data, within locality genetic diversity ( ) and among locality genetic differentiation is also negatively correlated (r = -0.796,P < 0.018).However, significant differentiation between certain pairs of sampling localities exists, but due to a lack of spatial and geographic pattern, Hrbek et al. (2005) viewed this result as largely an artifact of anthropogenic actions.

Discussion
Spatial autocorrelation analysis describes both actual genetic variation and the dynamics of evolutionary processes; it can be applied to samples obtained from continuously or discretely distributed populations, but is particularly suited for continuously distributed species.Because the spatial autocorrelogram is a summary of the genetic variation in space, it can be used for delimiting geographic regions with low and high genetic connectivity.The autocorrelogram profiles will also differ with accordance to variation in ongoing and historical processes involved in structuring of the species (e.g.Sokal and Jacquez, 1991;Epperson, 1993;Sokal et al., 1997).In all spatial autocorrelograms, however, the intercept is defined as the geographic distance at which the autocorrelation becomes non-significant.Samples separated by larger distance than the intercept are genetically independent.In recently established species, this distance has been interpreted as the average distance of gene flow (Sokal and Wartenberg, 1983); however, older species -in the intraspecific coalescence sense -often possess a historical pattern of local differentiation, which is statistically confounded with recent gene flow in spatial autocorrelation analyses (Diniz-Filho and Telles, 2002).The estimate of average distance of gene flow may therefore be an overestimate of actual gene flow, and therefore the geographic distance inferred from the autocorrelogram intercept will represent an upper bound limit for a genetically connected population.
Implicitly, the number and definition of the distance classes to be used in the autocorrelograms are arbitrary.However, the distance classes will always be defined by the spatial scale of the study, and a general methodological criterion for choosing distance classes is to try to maximize the similarity in the number of connections for the geographic distance classes.Thus the inherent arbitrariness is not important in most cases because the purpose of the analysis is to describe continuous spatial processes rather than specific instance of geographic distance classes.Our 500 km spatial scale is appropriate for the data analyses.

Inference of conservation units from spatial Autocorrelograms
Diniz-Filho and Telles (2002) provide an excellent overview for interpreting spatial autocorrelograms for the purpose of defining conservation units.The authors propose three basic patterns: 1) a long-distance cline where one observes a continuous decrease in autocorrelation coefficients with increasing geographic distances; 2) a stabilizing profile where in the first distance classes one observes a continuous decrease in autocorrelation coefficients while the second set of classes varies randomly around the mean autocorrelation coefficient of zero; and 3) an absence of spatial pattern where the mean autocorrelation coefficient varies randomly around zero.
In the first instance of a long-distance cline there are no discrete patches of genetic variability or these patches are strongly structured across space.The intercept of the autocorrelogram can be interpreted as the limit of the geographic range of independent samples.Conservation and management units should be established at distances equal to or larger than this intercept, since this strategy will have the potential to preserve most of the intra specific genetic diversity.Conservation units established at smaller distances than indicated by the intercept will be pseudoreplicates of each other, and their creation may not be necessary or justifiable.
In the second instance of a stabilizing profile, one clearly observes a spatial range of genetic similarity.Interpretation of the distance classes of continuous decrease in autocorrelation coefficients is similar to that for long-distance cline.The second class where the mean autocorrelation varies randomly around the coefficient of zero signifies lack of correlations between geographic and genetic distances.The pattern observed in the second class is due to lack of association between genetic divergence and geographic distance among localities at larger geographic distances most often attributed to historical divergence among phylogeographic units.Under this scenario, simply choosing conservation units separated by geographic distances larger than those indicated by the intercept will not be sufficient to present current genetic diversity.Other criteria such as the identification of phylogeographic units and the spatial distribution of genetic diversity will be necessary to the identification of the number and spatial distribution of conservation units.
In the third instance an absence of spatial pattern in the autocorrelogram, the most important criterion for conservation is maximizing preserving genetic diversity.In the case of a significant among-locality divergence, each sampled locality could be considered a distinct conservation unit.The absence of significant divergence in principle indicates that we are dealing with a panmictic population that can be viewed as a single conservation unit if no physical barriers to gene flow exist.

Definition of conservation units for Arapaima gigas
Arapaima gigas has been exploited as a source of food in the Amazon basin for centuries (Roosevelt et al., 1996).It seemed to have been able to sustain moderate levels of exploitation until the 1960's when the effects of overexploitation started to become evident (Crossa and Petrere Jr., 1999).By the late 1970's stocks near major Amazonian cities became commer-cially extinct (Goulding, 1979(Goulding, , 1980)), and fishermen had to move further and further from traditional grounds to catch Arapaima.A moratorium on Arapaima fishing spurred the recovery in certain protected areas such as the Mamirauá Sustainable Development Reserve, while illegal fishing continued in others.The question, however, remains at which scale do reserves need to be created or management programs need to be developed in order to preserve the current population dynamics and genetic diversity of Arapaima gigas.It should be noted that there are a number of reserves within the Amazon basin; however, in majority of areas there is a lack of political will and lack of resources to enforcement of existing laws to ensure viable conservation.Further, there is a lack of management programs which in most areas will need to be implemented to ensure the long term survival and recovery of economically valuable species such as the Arapaima gigas.Management programs will need to be applied even within reserves since these areas have permanent human settlements often whose only source of cash income is the exploitation of natural resources.
In our analyses of microsatellite data collected for individuals from the seven sampling localities we found a significant association of geographic and genetic distances supporting the hypothesis of isolation-by-distance.Spatial autocorrelation analysis suggests that sampling localities become genetically significantly differentiated only between 2,500 km and 3,000 km (Table 1, Figure 1).This distance represents nearly the complete length of the várzea system and the natural distribution of Arapaimaapproximately 4,000 km.Because the intercept only becomes non-significant at distance classes which approach the length of the complete distribution of Arapaima, and the autocorrelation line does not cross zero, it is not possible to determine if we are observing an autocorrelogram representative of a long-distance cline or a stabilizing profile.The phylogeographic study of Hrbek et al. (2005) suggests minimal historical divergence among localities, and therefore a long-distance cline is the more likely inference drawn from the autocorrelogram.The microsatellite data based autocorrelogram also suggests that other processes in addition to isolation-by-distance are responsible for genetic structuring of Arapaima at smaller geography distance.At these geographic distances the spatial pattern of genetic divergence among localities is better explained by short-distance dissimilarity rather than long-distance differentiation.Indeed, since Arapaima has a long history of exploitation, and in many areas is genetically severely depleted (Hrbek et al., 2005), there is a significant negative correlation between within-locality diversity and among-locality differentiation (r = -0.935,P 0.002).This anthropogenically induced genetic differentiation of the sampled localities via genetic drift accounts in part for the observed non-linearity of the spatial autocorrelogram.Removal of the most genetically depauperate and divergent locality, samples from Macapá, does not result in a qualitative change.We still observe a non-linearity of the autocorrelogram with an initial in-crease in correlation of geographic and genetic distances in the first set of distance classes, and the intercept remains around 2,500 km.
The spatial autocorrelation analysis of mitochondrial DNA data suggests an absence of spatial pattern where the mean autocorrelation coefficient varies randomly around zero (Figure 3).In this situation, Diniz-Filho and Teles (2002) point out that decision about how many reserves should be proposed will depend on whether or not differentiation among sampled localities exists.In the case of Arapaima, certain pairs of localities are differentiated, however, the pattern of differentiation appears largely non-sensical (Hrbek et al., 2005).For example, the two geographically most distance sampling localities -Iquito (Pacaya-Saimiria reserve) and from Marabá on the Tucurui dam (Araguaia-Tocantins system) -are not differentiated.However, they are differentiated from some but not all central Amazonian regions through which would be migrants must pass in order to maintain gene flow between these two localities.This is biologically not realistic.The observed pattern of distribution of genetic diversity most likely reflects a pattern of historical gene flow seen as lack of differentiation among genetically diverse sampling localities, and a pattern of recent anthropogenic differentiation driven by genetic drift among genetically depauperate sampling localities (Hrbek et al., 2005).The observed pair-wise differentiation would appear largely artifactual, and historically Arapaima has formed a nearly panmictic species distributed over a geographically very large area.Although non-significantly associated, Hrbek et al. (2005) noticed that one of the two most common haplotypes predominates in the upper Amazon while the other most common haplotype predominates in the lower Amazon localities.This pattern lends credence to the isolation-by-distance pattern observed in microsatellite data.
Interpreting both autocorrelograms would lead us to conclude that localities Arapaima gigas separated by between 2,500 and 3,000 km, the distance at which the correlation between genetic and geographic distance is no longer significant in the microsatellite data (the intercept), should be treated as genetically independent populations.Preserving localities in the extremes of the spatial range of the species ensures that most of the among-population genetic divergence will be sampled (Diniz-Filho and Telles, 2002), while preserving areas near the species' center ensures that most of the genetic diversity and thus the species' evolutionary potential is preserved (Crandall et al., 2000;Frankham et al., 2002).The initial increase in correlation (Figure 2) also suggests additional source(s) of population structuring unrelated to isolation-by-distance.
The distance of 2,500-3,000 km inferred should thus act as a metric around which reserves should be structured.The extent of the várzea ecosystem is approximately 4,000 km, thus at least two reserves need to be created to preserve majority of genetic diversity observed within Arapaima gigas.Since geographic centers of a species' distribution tend to be the oldest and most diverse areas, an additional reserve should be created near the geographic center of the distribution of Arapaima.The Araguaia-Tocantins River system essentially forms a discontinuous várzea system, thus at least another reserve should be created in this river system.In all cases the areas chosen should be those that have large areas of high quality várzea habitat that are well connected to other regions and areas which contain Arapaima populations with high genetic variability.Localities with highest genetic variability are those upstream of Tefé, Brazil, and those away from the central and lower Amazon River mainstream (Table 1).These four areas are the minimum number of regions within which reserves should be designated.This number does not provide for replication within each region.
Although we have not conducted systematic study of suitable habitats, several potentially excellent areas exist.In upper Amazon basin in the Ucayali region could be the Pacaya-Samiria reserve (20,800 km 2 ) and in the middle Solimões the Mamirauá RDS (11,240 km 2 ) and Amanã RDS (23,130 km 2 ), and in the middle Amazon basin the extensive várzea system like the Nhamundá APA reserve with 1,960 km 2 .Selecting an area in the lower Amazon basin is more difficult since várzea in this region is highly degraded, the várzea is narrow and the region is relatively densely settled.However, an area in this region should be designated, and may, perhaps in the lower Amazon the Tapará and Aritapera region (180 km 2 ), Mexiana Island (1,000 km 2 ) and Piratuba Rebio reserve (3,940 km 2 ) in the Amapá state.
We suggest also the development of Arapaima management program in some reserves like the Araguaia park and the Inàwébohona Indigenous Land (1,735 km 2 ) located in the Araguai-Tocantins River basin, and another area upstream of the Tucurui dam; the Resex of middle Juruá River (2,513 km 2 ); and in the Piagaçú RDS reserve (10,081 km 2 ) and the Abudafi Rebio reserve (2,248 km 2 ) both in the middle Purus River.Also the area of the Juami/Japurá Resex in the upper Japurá appears suitable.Further details on these regions are presented in Table 2.These are existing regions under formal federal or state protection that, based on our field experience, would appear amiable for long terms conservation and management of Arapaima.However, more informed choices should be made through a more sophisticated analysis such as through the use of a Geographic Information System (GIS).Such areas of management and conserva-  Independent of the final choice of areas of conservation, it is important that the selected areas are sufficiently large and ecologically complex to not be overly sensitive to environmental stochasticity.Furthermore, these areas must be allowed to produce an excess of individuals that can act as sources of migrants to other less favorable areas, and areas with intense fishing pressure.Genetic estimates of movement suggest that large numbers of individuals are exchanged among regions every generation (Table 1), and animals have been observed to move up to 35 km per month (Crossa, 2003).Thus we believe that with proper management, and because of limited resources, these key protected várzea areas could act as source areas in a source-sink metapopulational system, ensuring and maintaining the long-term health of Arapaima gigas in the whole Amazon River basin.Upon the verification that there are no barriers to gene flow that would prevent the functioning of the proposed source-sink metapopulational system, this strategy will serve to preserve not only the species and its genetic diversity, but also the evolutionary processes responsible for the generation of these patterns.Preserving patterns of genetic diversity as well as the processes generating these patterns is the only way that long-term conservation strategies can succeed.

Implications for other várzea organisms
Arapaima gigas is an economically important and a conspicuous member of the várzea, but not the only one.Other economically very important fish groups include the tambaqui (Colossoma macroponum Cuvier, 1816), the pirapitinga (Piaractus brachypomus (Cuvier, 1818)), the matrinxã (Brycon spp.Müller and Troschel 1844), the aruanã (Osteoglossum bicirrhosum (Cuvier, 1829)), the sardinha (Triportheus spp.Cope 1872), and various catfish and cichlid species for which very large commercial fisheries exist.The black caiman (Melanosuchus niger (Spix 1825)) and the spectacled caiman (Caiman crocodilus (Linnaeus 1758)), and the giant river turtles (Podocnemis spp.Wagler 1830) also form an important and conspicuous component of the várzea ecosystem as does the Amazonian manatee (Trichechus inunguis (Natterer, 1883)).As far as examined (Farias et al., 2004;Cantanhede et al., 2005;Pearse et al., 2006;Vasconcelos et al., 2006;Santos et al., 2007) all of these large and economically important species have a genetic structure similar to that of Arapaima gigas.Therefore the designation of várzea reserves on the scale suggested for Arapaima is likely to benefit other important members of the várzea ecosystem of the Amazon basin.

Figure 1 .
Figure 1.Distribution map and sampling localities of Arapaima gigas in the Amazon River basin.Area in grey represents the most heavily exploited region of the várzea and of Arapaima gigas stocks.Names in bold are major rivers.Table 1. MIGRATE analysis showing pair-wise estimates of gene flow.Populations can have different number of immigrating and emigrating individuals.(MCMC = Monte Carlo Markov Chain (a resampling strategy); Loci = number of loci analyzed (all = 14); Ln(L) = natural log of the likelihood of estimated parameters; Theta = population genetic parameter defined as N e (effective population size times mutation rate); Nm = effective population size times migration rate) MCMC estimates Population [x] Loci Ln (L) Theta Nm [x = receiving population]

Figure 2 .
Figure 2. Spatial autocorrelogram of microsatellite and geographic distances.The intercept, a distance at which the autocorrelation no longer is significant, is indicated by an arrow.

Table 2 .
Areas potentially suitable for long-term conservation and management of Arapaima gigas.Arapaima respects no political borders and large tracts of well preserved and ecologically suitable habitat exists in frontier regions of the western Amazon basin.