Using community phylogenetics to assess phylogenetic structure in the Fitzcarrald region of Western Amazonia

Here we explore the use of community phylogenetics as a tool to document patterns of biodiversity in the Fitzcarrald region, a remote area in Southwestern Amazonia. For these analyses, we subdivide the region into basin-wide assemblages encompassing the headwaters of four Amazonian tributaries (Urubamba, Yuruá, Purús and Las Piedras basins), and into three habitat types: river channels, terra firme (non-floodplain) streams, and floodplain lakes. We present a robust, well-documented collection of fishes from the region including 272 species collected from 132 field sites over 63 field days and four years, comprising the most extensive collection of fishes from this region to date. We conduct a preliminary community phylogenetic analysis based on this collection and recover results largely statistically indistinguishable from the random expectation, with only a few instances of phylogenetic structure. Based on these results, and on those published in other recent biogeographic studies, we conclude that the Fitzcarrald fish species pool accumulated over a period of several million years, plausibly as a result of dispersal from the larger species pool of Greater Amazonia.


INTRODUCTION
The global diversity of freshwater fishes is highest in the rivers of the Neotropics, from where 6,080 fish species are currently described (5,607 from South America alone), comprising about 18% of all known fish species, and about 9% of all vertebrate species combined (Reis et al., 2016; Van der Sleen, Albert, 2017; Dagosta, de Pinna, 2019). Including as-yet undescribed species, estimates of the total diversity of Neotropical freshwater fishes are in the range of 8,000-9,000 species (Reis et al., 2016). Within the Neotropics, the greatest diversity is concentrated in the Amazon basin, from which more than 2,300 species have been described to date (Albert et al., 2011b; Van der Sleen, Albert, 2017; Oberdorff et al., 2019).
Here we present a study exploring the phylogenetic structure of freshwater assemblages, taking as a representative fauna the Fitzcarrald region of southeastern Peru, a broad (~400,000 km²), low elevation (200-500 m) structural arch located in southwestern Amazonia (Fig. 1). The Fitzcarrald arch was uplifted in the Pliocene ~4 Ma as a result of subduction of the Nazca Ridge (Espurt et al., 2009). This uplift resulted in the partial hydrological separation of four large white-water, Andean tributaries of the Amazon River: the Urubamba (U), Yuruá (Y), and Purús (P) basins, which drain to the western Amazon basin, and the Las Piedras (LP) basin, which drains to the Madeira basin. The aquatic habitats of these basins can be divided into three major categories based on hydrological criteria: small terra firme (non-floodplain) streams, seasonally inundated floodplains, and large river channels characterized by sand banks and hard clay deposits, with practically no rocky substrates (Albert et al., 2011a; Crampton, 2011).
Given its mix of habitats, compelling and well-studied geographical history and robust but not overwhelmingly large species pool, the Fitzcarrald region is a promising system in which to explore phylogenetic structure of freshwater assemblages. Furthermore, because the Fitzcarrald region is characterized by many of the same conditions present throughout Neotropical freshwaters, including aspects of geological history such as the Andean uplift, habitat types such as river channels, terra firme streams, and floodplain lakes, as well as many common fish clades such as Ctenobrycon, Hoplias, and Prochilodus, the results of this pilot study are likely to be applicable to other Neotropical faunas.
To explore the phylogenetic structure of assemblages in the river basins and habitats of the Fitzcarrald region, we apply the emerging method of community phylogenetics (Losos, 1996; Webb et al., 2002, 2008; Strauss et al., 2006; Cavender-Bares et al., 2009). Community phylogenetic analyses can estimate two broad types of assemblage structure. First, assemblages in which the constituent species are phylogenetically more distant from one another than expected by chance are considered phylogenetically "overdispersed". Second, assemblages consisting of species more closely related than expected by chance are considered phylogenetically "clustered" (Cavender-Bares et al., 2009). These patterns have been used to support several ecological hypotheses. Patterns of overdispersion and clustering may support hypotheses of environmental filtering or competition having influenced local assemblage structure, depending on the spatial and temporal scales (Emerson, Gillespie, 2008). Phylogenetic clustering may indicate phylogenetic conservatism of traits exhibited by species within an assemblage, supporting a hypothesis of habitat filtering during assemblage formation. Phylogenetic clustering may also be consistent with assemblage formation by in-situ speciation. By contrast, phylogenetic overdispersion is consistent with hypotheses of assembly by dispersal or convergent evolution (Cavender-Bares et al., 2009).
The community phylogenetic approach is well suited to testing a number of evolutionary and ecological hypotheses. For example, ecophylogenetic studies can be carried out anywhere from the global scale (Pearse et al., 2019) to the local (Pearse et al., 2013), and can incorporate aspects of clade age based on the underlying phylogeny, adding a temporal dimension not present in some ecological work. Community phylogenetic methods can also be used to infer influences on the formation of a modern assemblage based on the phylogenetic structure of the species therein (Abreu et al., 2019; Zheng et al., 2019). The presence of phylogenetic structure, in the form of either clustering or overdispersion, may suggest non-neutral processes during assemblage formation. Invasive species are also the topic of several community phylogenetic studies, in which communities of low phylogenetic diversity are shown to be less resilient to invasion (Qin et al., 2020), especially by invaders that share few of their traits (Strauss et al., 2006). Invasions of an assemblage may also leave a signature of phylogenetic clustering (Lessard et al., 2009; Barfknecht et al., 2020), making them detectable by community phylogenetic methods.
Most studies using community phylogenetics to date have focused on terrestrial taxa, especially floras (Webb, Peart, 2000; Kembel, Hubbell, 2006; Sargent, Ackerly, 2008; Vamosi et al., 2009; Hawkins et al., 2014; Dexter et al., 2017; Mastrogianni et al., 2019; Zheng et al., 2019), with some work on terrestrial vertebrates (Amador et al., 2019; Caldas et al., 2019; García-Navas, 2019) and reef fishes (Muss et al., 2001). The present work, therefore, serves as a valuable early example of the community phylogenetic approach in Neotropical freshwater fishes. Further, the results of this study and of future community phylogenetic work in Neotropical freshwaters hold the potential to better assess the common assumptions that competition, habitat filtering, and habitat conservatism influence assemblage formation, the last of which has already been subjected to some scrutiny using community phylogenetic methods (Pearse et al., 2014).

Collections.
We made collections in the upper portions of each of the four major river basins in the Fitzcarrald region: Urubamba (U), Yuruá (Y), Purús (P), and Las Piedras (LP) rivers between 183-328 m elevation during the period of low water (Tab. 1). We collected fishes in three major habitat types: large rivers (>40 m wide) with deep channels (>5 m deep) and flooded beaches, small terra firme (non-floodplain) rainforest streams, and floodplain oxbow lakes (habitats described and quantified in Crampton (2011)). We used standard fish collecting equipment at each site including, where appropriate, seine nets (5 m and 10 m wide, with 5 mm between knots), dip nets, cast nets, gill nets, and hook-and-line. We located electrogenic gymnotiform species using a two-meter length of wire plugged into a RadioShack 9-volt portable audio amplifier via a 3.5 mm audio jack.
The authors identified all specimens to morphospecies where possible (we queried experienced taxonomists as needed), then photographed vouchers, individually labeled every specimen with a unique field number and preserved them as a reference collection. Catalogued voucher specimens are listed in Carvalho et al. (2009, 2011, 2012) and Albert et al. (2012). We excised tissue samples from selected vouchers of each species using a sterilized scalpel and stored them in 1.8 ml vials of 100% ethanol in a cool location in the field before transferring them to a freezer maintained at -80°C for long-term storage. We then fixed all voucher specimens in 10% formalin for >48 hours in a sealed container at base camp and shipped them to the laboratory with the tissue samples. All collected specimens were cataloged at the Museum of Natural History of the University of San Marcos (MUSM), Lima.
In total, 11,162 specimens from 77 field sites were sampled during 63 field days over the course of four project years (2008-2011), with one expedition per year to a different tributary basin. These materials represent 272 valid species in 157 genera, 35 families, and 11 taxonomic orders (S1), or about 9%, 28%, 56%, and 58%, respectively, of the total numbers of these ranked taxa among fishes in the whole of Greater Amazonia (totals from Van der Sleen, Albert (2018)). We report only one species as endemic to the Fitzcarrald region: Gymnotus chaviro Maxime, 2009.
Voucher sequences. We generated 16S and CO1 voucher sequences for each species present in each river basin when possible, given the limitations of remote field work. We extracted DNA from muscle, fin or liver tissue using the DNeasy extraction kit (Qiagen, Hilden, Germany). For each morphospecies, 2.0 μl of DNA in a total reaction volume of 25 μl was amplified by 35 cycles of PCR, each consisting of 30 s of denaturation at 94°C, 60 s of annealing at 56°C (16S) or 54-58°C (CO1), and 80 s of extension at 72°C, followed by a final 300 s extension. Temperatures and times varied depending on the gene being amplified. Primers for 16S were 16Sar-L, 16Sbr-H (Palumbi, 1996) and for CO1 were CO1-BOL-F1 5', BOL-R1 5' (Ward et al., 2005). PCR products were checked on a 1% agarose gel and sequenced at Beckman Coulter Genomics (Danvers, MA). Sequences were visualized in MEGA 7 (Kumar et al., 2016).
Phylogeny. Given that mitochondrial markers like the 16S and CO1 sequences collected for this study have difficulty accurately resolving deep phylogenetic nodes (Ortí, Meyer, 1997; Tagliacollo et al., 2016), we estimated rough phylogenetic relationships among fish species of the Fitzcarrald region using a synthetic approach relying on new data and published phylogenetic hypotheses, following the methods of Marki et al. (2015) and Kennedy et al. (2017). First, we designated the robust phylogeny reported in Betancur-R. et al. (2013), generated from 20 nuclear genes and one mtDNA gene (16S), as a backbone phylogeny. Second, we manually added species found in the Fitzcarrald region but absent from the backbone as polytomies with their closest present relatives, resolved to the genus level where possible. We used published phylogenies to establish these relationships as follows: Pugedo et al. To visually demonstrate the phylogenetic distribution of species among basins and habitat types, we displayed presence and absence data at the tips of the working phylogeny. The phylogeny was transformed to an ultrametric cladogram with ordered nodes in FigTree (Rambaut, Drummond, 2018) (Fig. 2).
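The step of adding species missing from the backbone as polytomies with their congeners can be sketched in a few lines of Python. This is only an illustration of the idea, not the manual procedure followed in this study: the nested-list tree representation, the function names `tips` and `graft`, the `Genus_species` naming scheme, and the tiny example backbone are all assumptions made for the sketch.

```python
def tips(node):
    """Return all tip labels below a node (a tip is a string, a clade is a list)."""
    if isinstance(node, str):
        return [node]
    return [t for child in node for t in tips(child)]


def graft(tree, new_species):
    """Return a copy of `tree` with `new_species` attached to its congeners.

    The species is appended to the first clade whose tips all share its genus
    (forming a polytomy), or paired with a lone congeneric tip. If the genus
    is absent from the tree, an unchanged copy is returned.
    """
    genus = new_species.split("_")[0]

    def is_congener(tip):
        return tip.split("_")[0] == genus

    def rec(node):
        if isinstance(node, str):
            if is_congener(node):
                return [node, new_species], True
            return node, False
        if all(is_congener(t) for t in tips(node)):
            # Attach directly to the genus-level node as an unresolved polytomy.
            return list(node) + [new_species], True
        out, done = [], False
        for child in node:
            if not done:
                child, done = rec(child)
            out.append(child)
        return out, done

    new_tree, _ = rec(tree)
    return new_tree


# Hypothetical backbone fragment, for illustration only.
backbone = [["Hoplias_malabaricus", "Prochilodus_nigricans"],
            ["Gymnotus_carapo", "Gymnotus_tigre"]]
working = graft(backbone, "Gymnotus_chaviro")
print(working)
```

A real workflow would also need branch lengths and a rule for genera that are not monophyletic in the backbone, both of which this sketch ignores.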
Community phylogenetics. We explored the phylogenetic structure of each assemblage (four river basins and three habitat types: rivers, lakes, and terra firme streams) using two indices calculated with the picante package in R (Kembel et al., 2010): Mean Phylogenetic Distance (MPD), which estimates the average phylogenetic relatedness between all possible pairs of taxa in a local assemblage, and Mean Nearest Taxon Distance (MNTD), which estimates the mean phylogenetic relatedness between each taxon in an assemblage and its nearest relative (Kembel et al., 2010). Thus, the MPD reports structure throughout the phylogeny, while the MNTD focuses on branch tips. The standardized effect size of each metric was calculated using the Net Relatedness Index (NRI) for MPD values and the Nearest Taxon Index (NTI) for MNTD values. To calculate NRI and NTI, we computed a z-score as the difference between the observed phylogenetic distance and the mean of 9,999 randomly generated null assemblages, standardized by the standard deviation of phylogenetic distances in the null assemblages (Webb et al., 2002, 2008). In both cases, negative values that are statistically significantly different from the random expectation indicate a pattern of phylogenetic clustering. By contrast, positive values that are statistically significantly different from the random expectation indicate a pattern of phylogenetic overdispersion.
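As a rough illustration of the standardized effect sizes described above, the following Python sketch computes MPD, MNTD and their z-scores from a pairwise phylogenetic distance matrix. It is not the picante code used in this study: the toy distance matrix, the function names (`mpd`, `mntd`, `ses`), the null model (random draws of equal richness from the species pool) and the reduced number of randomizations are all assumptions made for illustration. The sign convention follows the text, so negative z-scores point toward clustering.

```python
import numpy as np


def mpd(dist, idx):
    """Mean pairwise phylogenetic distance among the taxa in idx."""
    sub = dist[np.ix_(idx, idx)]
    upper = np.triu_indices(len(idx), k=1)  # each pair counted once
    return sub[upper].mean()


def mntd(dist, idx):
    """Mean distance from each taxon in idx to its nearest relative in idx."""
    sub = dist[np.ix_(idx, idx)].astype(float)  # copy, so dist is untouched
    np.fill_diagonal(sub, np.inf)               # ignore self-distances
    return sub.min(axis=1).mean()


def ses(metric, dist, idx, n_null=999, seed=0):
    """Observed value, z-score, and lower-tail rank p-value of a metric.

    Null assemblages are random samples (without replacement) of the same
    richness drawn from all taxa in the distance matrix (an assumed null model).
    """
    rng = np.random.default_rng(seed)
    obs = metric(dist, idx)
    null = np.empty(n_null)
    for i in range(n_null):
        rand_idx = rng.choice(dist.shape[0], size=len(idx), replace=False)
        null[i] = metric(dist, rand_idx)
    z = (obs - null.mean()) / null.std(ddof=1)
    p_low = (np.sum(null <= obs) + 1) / (n_null + 1)
    return obs, z, p_low


# Toy pool: two clades of three taxa; distance 2 within and 10 between clades.
dist = np.full((6, 6), 10.0)
dist[:3, :3] = 2.0
dist[3:, 3:] = 2.0
np.fill_diagonal(dist, 0.0)

clustered = [0, 1, 2]  # an assemblage drawn entirely from one clade
for name, metric in (("MPD", mpd), ("MNTD", mntd)):
    obs, z, p = ses(metric, dist, clustered)
    print(f"{name}: observed={obs:.2f}, z={z:.2f}, lower-tail p={p:.3f}")
```

With real data, the distance matrix would come from the cophenetic distances of the working phylogeny, and the choice of null model can change which assemblages appear structured.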

Community phylogenetics.
We report no cases of branch-tip phylogenetic structure and only two cases of whole-tree phylogenetic clustering in the assemblages of the Fitzcarrald region; thus, the broad patterns reported here are consistent with the random expectation. The two exceptions are the stream habitat and the Urubamba basin, each of which has a statistically significant negative NRI value (p<0.05) (Tab. 2).

DISCUSSION
Scale and novelty. This study is among the largest to date using community phylogenetic methods in fishes, exceeded in scope only by Leprieur et al. (2016), Floeter et al. (2018) and Bower, Winemiller (2019), and among the first applications of these methods to fishes of lowland Amazonia. Our community phylogenetic results suggest that the species present in the assemblages of the Fitzcarrald region were likely drawn from a regional species pool with no new lineages speciating in-situ. This broad pattern has been reported in several other biotas that have been examined by similar methods, including terrestrial (Caley, Schluter, 1997; Westoby, 1998), marine (Belmaker et al., 2008; Harrison, Cornell, 2008) and freshwater (Santorelli et al., 2014; Abreu et al., 2019) systems.
Influences on phylogenetic structure. This study predominantly recovers results statistically indistinguishable from random, revealing a lack of prevalent phylogenetic clustering or overdispersion in the assemblages of the Fitzcarrald region. This suggests a lack of phylogenetic or niche conservatism in the formation of these assemblages, as well as a lack of in-situ speciation or convergent evolution, all of which have been interpreted as driving phylogenetic structure in assemblages (Cavender-Bares et al., 2009;Pearse et al., 2014). Thus, the random nature of these assemblages may support a hypothesis of their formation by effectively neutral dispersal of each species with respect to one another.
In contrast to the broad trend of results that are statistically indistinguishable from random, we recover two instances of phylogenetic structure. First, we recover statistically significant whole-tree phylogenetic clustering (NRI) for species in the stream habitat assemblages (Tab. 2). We propose that this result may reflect a signature of small-bodied taxa specialized to inhabit upland streams (e.g. Stevardiinae, Hemibrycon, Farlowella) dispersing to lowland terra firme streams in the Fitzcarrald region, as is observed in other areas of Amazonia (Mendonça et al., 2005). Second, we recover NRI clustering in the Urubamba basin, which may reflect Andean taxa (e.g. Chaetostoma, Parodon pongoense) that are able to disperse to and from, and subsequently inhabit, this nearby basin, as discussed by Lundberg et al. (1998) and Schaefer (2011), who propose a bi-directional invasion of hill-stream habitats by specialized clades during and following the Andean uplift.
We also propose that any signals of clustering or overdispersion may be harder to recover in the data due to the low resolution among the shallower branches of the phylogeny, which contributes substantial noise to the analysis (Swenson, 2009).
Limitations of community phylogenetic analysis. Our interpretations of the community phylogenetic results have some important caveats. The lack of a highly resolved phylogeny, especially among the shallower branches, risks biasing our results toward previously documented patterns of biodiversity based on deeper relationships where the present phylogeny is better resolved, or among larger, more fecund, or better-studied species (Hoplias, Prochilodus). Importantly though, Swenson (2009) notes that community phylogenetic analyses are most sensitive to low resolution deeper in the phylogeny, where the present phylogeny is best resolved. Additionally, the community phylogenetic methods presented here investigate only the present assemblages and do not address any speciation that may have occurred previously, during dispersal, an issue that is compounded by the limitations of the working phylogeny and by present taxonomic knowledge.
Despite the limitations of the present study, there remain several compelling hypotheses within Neotropical ichthyology that might be explored in future community phylogenetic work. For example, the continental assessment of structure based on ecological variables presented in Hawkins et al. (2014) could be applied to variables like body size, vagility, and diet in Neotropical fishes to explore patterns of migration and dispersal, complementing Araujo-Lima, Goulding (1997) and Barthem, Goulding (1997). Additionally, investigations of invasive species like those of Lessard et al. (2009) and Qin et al. (2020) could be adapted to cases of marine incursion in Neotropical freshwaters (Albert et al., 2006; Bloom, Lovejoy, 2017), or to explore the dynamics of biological corridors like the Casiquiare canal connecting the Orinoco and Negro (Willis et al., 2010), or the Rupununi portal connecting the Branco and Essequibo (de Souza et al., 2012).
Comparison with recent work. Although our community phylogenetic analysis here may not have recovered a strong enough signal of phylogenetic structure to draw powerful conclusions about assemblage formation, our results are more compelling in light of other recent work. Numerous historical biogeographic studies have suggested an important role for assemblage formation by dispersal from regional species pools in the Amazon Basin (Albert, Carvalho, 2011; Albert et al., 2011b; Tagliacollo et al., 2015), and other Neotropical freshwater fish assemblages (Stewart et al., 2002; Arrington et al., 2005; Lujan et al., 2012, 2013; López-Fernández et al., 2013; Picq et al., 2014; Roxo et al., 2014; Roa-Fuentes et al., 2015; Tagliacollo et al., 2015; Thomaz et al., 2015; Melo et al., 2018). While it is not possible to draw macroevolutionary conclusions from community phylogenetics alone (Cavender-Bares et al., 2009; Ewers et al., 2013; Pearse et al., 2014), we do note that robust data on the geological and biogeographic history of the region support an important role for dispersal in assemblage formation, especially in the low-gradient, alluvial (rather than high-gradient, rocky) rivers tested here (Lundberg et al., 1998; Albert, Reis, 2011).
Additionally, such a conclusion would be consistent with a general hypothesized time-frame for the formation of Amazonian fish assemblages during the Neogene c.