Emerging hotspots of tree richness in Brazil

We present a summary of floristic variation and distribution of richness of tree and tree-like taxa (i.e., freestanding plants that reach at least 3 m in height) in Brazil. We investigated composition patterns throughout phytogeographic domains and vegetation types based on 698,490 occurrence records obtained from the NeoTropTree (NTT) database, and used rarefaction and extrapolation methods to compare species richness. We delimited areas of high taxa richness in Brazil by applying the Geographic Interpolation of Endemism method. There are 9,108 tree species catalogued in NTT for Brazil, but our extrapolations indicate that the total could reach 9,525 species. Predominantly forested domains showed the greatest richness of taxa, with the Amazon domain having the highest number of exclusive taxa. Fabaceae and Myrtaceae were the most represented families. The richest vegetation types were Rain Evergreen Forest and Seasonally Semideciduous Forest. Distribution patterns of richness in Brazil and among its domains were found to be controlled by different spatial scales for each taxon. Transition zones had high species richness. The patterns found here can help to identify priority areas for biodiversity conservation in Brazil.


Introduction
The great diversity of natural environments in Brazil comprises a monumental degree of biodiversity at the level of families, genera and species (BFG 2015). There are 60,065 registered tree species worldwide, with Brazil being the country with the most diverse known tree flora on Earth (Beech et al. 2017). A total of 32,086 native species of angiosperms have been recorded for Brazil, with trees representing a larger proportion in the Amazon and Atlantic Forest (BFG 2015). The territory occupied by Brazil encompasses most of the world's remaining areas of Tropical Rainforest, Tropical Seasonal Forest (including Seasonally Dry Tropical Forest) and Tropical Savanna (Pennington et al. 2006; Fiaschi & Pirani 2009), and comprises six major phytogeographic domains: Amazon, Atlantic Forest, Cerrado, Caatinga, Pampa and Chaco.
The Amazon, comprising, for the most part, the largest humid tropical forest in the world, but also seasonal forests and savannas (Ab'Sáber 2003), is the largest domain in Brazil, accounting for about 40 % of its territory (Coutinho 2016). An equatorial climate prevails in the Amazon, which is characterized by annual mean temperatures between 21.5 °C and 27.8 °C (when discounting ten sites located at extreme conditions, which include some cloud forests and inselbergs) and annual precipitation ranging from 1,027 to 3,731 mm (Oliveira-Filho 2017). The Amazon domain has suffered environmental deterioration for years, and is currently experiencing an alarming increase in degradation, with deforestation alerts in April 2019 for areas totaling 1,055. The Atlantic Forest, considered a hotspot for the conservation of biodiversity (Myers et al. 2000; Ribeiro et al. 2011), includes various types of rainforests and seasonal forests, as well as campos de altitude (mountain grasslands), restingas (sandy coastal ecosystems), mangroves and savanna encrustations (Oliveira-Filho & Fontes 2000; Eisenlohr et al. 2011; Neves et al. 2017). The Atlantic Forest is the second largest tropical forest in South America, covering about 15 % of the Brazilian territory. It possesses a diversity of climates (tropical humid, subtropical of altitude, subtropical dry winter and temperate) due to its latitudinal extension (Ab'Sáber 2003), with annual precipitation ranging from 643 to 3,525 mm and annual mean temperatures ranging from 11.3 °C to 27.9 °C (Oliveira-Filho 2017). Only 12.5 % of the original 1.3 million km² of this domain remains as native forest fragments (SOS Mata Atlântica 2019).
The Cerrado domain, another biodiversity hotspot, occupies about 25 % of the Brazilian territory and includes the largest tropical savanna on Earth. It represents a phytophysiognomic gradient with xeromorphic vegetation dominated by rocky outcrop grasslands but also with Cerradão (forested savanna), riverine forests (mainly gallery forests) and seasonal forests (Oliveira-Filho 2009; 2017). Located in central Brazil, the Cerrado possesses a tropical seasonal climate (Joly et al. 1999; Ab'Sáber 2003; Fiaschi & Pirani 2009; Coutinho 2016), with annual precipitation varying between 696 and 2,443 mm and annual mean temperatures ranging from 17 °C to 27.9 °C (Oliveira-Filho 2017). The Cerrado has already lost 50 % of its original area, with the main threats to its biodiversity being agriculture and livestock activities (WWF Brasil 2019).
The Caatinga, in which steppic-savanna under a semiarid hot climate stands out, holds the largest continuous area of Seasonally Dry Tropical Forest (SDTF) in South America and constitutes the only exclusively Brazilian domain (Joly et al. 1999; Queiroz 2006). This domain covers 11 % of the Brazilian territory (MMA 2019a) and consists mainly of thorny and deciduous vegetation ranging from shrubs to trees (Queiroz 2009). Annual mean temperatures range from 19.6 °C to 27.8 °C and annual precipitation from 372 to 1,664 mm (Oliveira-Filho 2017), with a concentration of 50 to 70 % of the rainfall in three consecutive months; however, in drier areas, precipitation can be less than 300 mm per year (Prado 2003; Araújo et al. 2007). Approximately 46 % of the original vegetation cover of the Caatinga domain has been removed by deforestation and burning (MMA 2019a).
The Pampa, in turn, possesses a warm, humid and temperate climate, with annual precipitation of 1,174 to 1,818 mm and annual mean temperatures ranging from 16.8 °C to 20.7 °C (Coutinho 2016; Oliveira-Filho 2017). This domain covers about 2 % of Brazilian territory and consists mainly of grassland vegetation in southern Brazil (IBGE 2004; Roesch et al. 2009). It is estimated that more than 50 % of the original Pampa vegetation has already been suppressed (MMA 2019b).
Finally, the Chaco corresponds to a humid-arid tropical region with rainy summers and comprises deciduous forests (Walter 1986). Annual mean temperatures range from 23.4 °C to 25.1 °C, with annual precipitation varying from 1,219 to 1,402 mm (Oliveira-Filho 2017). The Chaco is located between the Central Plateau of Brazil to the east, the mountains of the pre-Andean range to the west, and the Pampa domain to the south. It comprises a complex vegetation that, in Brazil, is restricted to the outskirts of Porto Murtinho, in the extreme southwest of Mato Grosso do Sul State (Prado 1993; Prado & Gibbs 1993). The Chaco is not officially recognized as a Brazilian phytogeographical domain and, therefore, there are no public policies for the conservation of its vegetation. Only 13 % of the original vegetation of this domain in Brazil remains (ECOA 2019).
These phytogeographic domains are structured by a diverse tree flora and varied vegetation types associated with the enormous climatic, edaphic and geomorphic diversity in Brazil (Rizzini 1997; Joly et al. 1999; Giulietti et al. 2009). The heterogeneity of environmental and ecological variables is reflected in plant community structure, with plants distributed along gradients and in mosaics of vegetation (Fávero et al. 2015). Habitat preferences and geographic distributions vary widely among species (Ricklefs 2002). Non-random distributions tend to concentrate biodiversity in specific areas, allowing the coexistence of a large number of species (Schoener 1974; Tilman 1982; McPeek & Miller 1996). Likewise, species differ in their frequencies of occurrence (Gaston 2000). The most common pattern is that of few frequent species, some common species and many rare species (Magurran 2004), with the numerical dominance of few species being a general law of ecology (McGill et al. 2007; Steege et al. 2013). Studies have shown that species richness is related to several environmental factors, such as temperature, precipitation and edaphic factors (Huston 1980; Gentry 1982; Huang 1994; Oliveira-Filho & Fontes 2000; Zhao et al. 2005; Eisenlohr et al. 2013; Brasil et al. 2019). The distributions of plant species can be strongly impacted by climate change and habitat loss, the two main threats to terrestrial biodiversity (Brooks et al. 2002; Bellard et al. 2012).
Investigating the distribution of tree diversity in a phytogeographic context allows the assessment of floristic patterns, which facilitates ecological studies (Brito et al. 2007; Freitas & Magalhães 2012), and contributes substantially to the adoption of strategies for biodiversity management and conservation (Peixoto & Gentry 1990; Lima et al. 2006; Oliveira-Filho et al. 2008). Such studies are necessary because of extensive environmental alteration and deterioration caused by unsustainable use of resources by humans (e.g., deforestation in the Amazon, desertification in the Caatinga, agriculture and livestock in the Cerrado, livestock in the Pampa and the Chaco, and urban expansion in the Atlantic Forest), besides being essential for updating phytogeographic divisions (Oliveira-Filho & Fontes 2000; Eisenlohr & Oliveira-Filho 2015a).
Floristic studies represent an important step in assessing natural environments because they provide basic data regarding species composition and distribution, and the identification of co-occurring and indicator species, and thus can contribute to the protection and/or recovery of various types of vegetation (Freitas & Magalhães 2012). The use of tree and tree-like taxa for phytogeographic investigations is compelling due to the large amount of data available in the literature and online repositories (Eisenlohr & Oliveira-Filho 2015b). One of the major challenges facing decision-makers in biodiversity conservation is the establishment of national, regional and local priorities, which is essential for determining whether policy decisions are translated into concrete actions (MMA 2007). Thus, quantifying and delimiting areas of high biodiversity becomes relevant for identifying possible sampling biases, recognizing distribution patterns and providing support for proposals of new conservation areas (Oliveira et al. 2019).
In this study, we took advantage of NeoTropTree, a rich floristic database that provides listings of tree and tree-like species with information on occurrence in geographic areas and vegetation types (Eisenlohr & Oliveira-Filho 2015b; Oliveira-Filho 2017). We examined floristic variation among the six Brazilian phytogeographic domains to: (i) investigate how the tree and tree-like vegetation varies in Brazil; (ii) identify areas of high richness for species, genera and families; and (iii) discuss floristic and biogeographic issues. We first analyzed Brazil as a whole, and then analyzed each phytogeographic domain separately.

Study area
Brazil encompasses a total area of 8,515,767 km², which represents most of South America in an area of tropical and subtropical zones (Coutinho 2016). The country possesses a great diversity of climates (equatorial hot-humid, tropical, semi-arid tropical and temperate hot-humid; Coutinho 2016), which contributes to the occurrence of various vegetation types, such as Rain Evergreen Forest, Seasonally Evergreen Forest, Seasonally Semideciduous Forest, Seasonally Deciduous Forest, Tropical Savanna and Semi-Arid Savanna, among others (IBGE 2004). We used the circumscription of phytogeographical domains proposed by Oliveira-Filho (2017): Amazon (Amz), Atlantic Forest (Atl), Cerrado (Cer), Caatinga (Caa), Pampa (Pam) and Chaco (Cha) (Fig. 1).

Database and occurrence records
We worked with tree and tree-like (TTL) taxa of Brazil, which we defined as freestanding plants that reach at least 3 m in height (Eisenlohr & Oliveira-Filho 2015b). The NeoTropTree database (NTT; Oliveira-Filho 2017) used here contained 3,959 sites, 9,108 TTL species and 698,490 occurrence records for Brazil. These were distributed among the six phytogeographic domains of the country (Tab. 1) and 21 vegetation types (Fig. 2), grouped according to climatic regime and leaf flush regime following the Neotropical Physiognomic-Ecological Classification System proposed by Oliveira-Filho (2009) (Tab. S1 in supplementary material). NeoTropTree is an online database with a wide geographical scope. It contains floristic information and a list of species based on publications and herbarium data. Before being incorporated and stored, these data undergo a rigorous "cleaning" process, which includes nomenclatural standardization by means of constant interaction with specialists, the elimination of dubious occurrence records, and the checking of locations with Google Earth, Google Maps and GIS tools (Eisenlohr & Oliveira-Filho 2015b; Neves et al. 2017). Previously called TreeAtlan, NTT has been continuously updated with new records, and remains a source of data for studies involving biogeography, biodiversity, endemism and ecological processes, among others (Eisenlohr & Oliveira-Filho 2015b; Oliveira-Filho 2017).
NeoTropTree contains occurrence records of TTL species organized into sample units (sites) (Tab. 1) with a 5-km radius. Each site corresponds to a single vegetation type, which is defined according to the phytogeographic classification system proposed by Oliveira-Filho (2009). The vegetation types of this classification system consist of a sequence of words containing a 'name' and five 'attributes': A1, A2, A3, A4 and A5. The name is derived from five large forest groups (forest, shrubby, savanna, prairie and desertic), while the attributes correspond to a phytophysiognomic characterization (A3 = foliar renewal regime) and four climatic-ecological attributes (A1 = thermal domain; A2 = climate regime; A4 = geomorphological altitudinal range; and A5 = azonal character of the substrate). This classification system is flexible and allows attributes of lesser relevance to the user to be discarded in the final nomenclature. Details about NTT can be found in Eisenlohr & Oliveira-Filho (2015b) and Oliveira-Filho (2017).

Data analysis
To examine floristic patterns, we prepared matrices of occurrence records for each domain and for each vegetation type. We then applied rarefaction and extrapolation methods based on Hill series to compare species richness among phytogeographic domains and vegetation types. For this we used the iNEXT package (Hsieh et al. 2019) in the R environment (R Development Core Team 2019), which computes 95 % confidence intervals for the diversity of rarefied/extrapolated samples using the bootstrap method (999 randomizations) to facilitate comparisons of diversity among multiple communities. Sample size-based rarefaction and extrapolation of species richness followed Colwell et al. (2012), while the corresponding methodologies based on coverage followed Chao & Jost (2012). Chao et al. (2014), in turn, expanded previous work for species richness based on Hill series.
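As an illustration of what a Hill series expresses, the following minimal Python sketch computes Hill numbers of orders 0, 1 and 2 for a single assemblage. It is not the iNEXT implementation and performs no rarefaction, extrapolation or bootstrapping; the function name `hill_number` and the counts in `toy_counts` are hypothetical and serve only to show how order 0 corresponds to species richness.

```python
import math


def hill_number(counts, q):
    """Hill number (effective number of species) of order q.

    counts: non-negative counts per species (hypothetical toy data,
    not taken from NeoTropTree).
    """
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit as q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


# Hypothetical assemblage with five species of unequal frequency.
toy_counts = [50, 30, 10, 5, 5]
richness = hill_number(toy_counts, 0)  # order 0: species richness
shannon = hill_number(toy_counts, 1)   # order 1: exponential of Shannon entropy
simpson = hill_number(toy_counts, 2)   # order 2: inverse Simpson concentration
```

Because the three orders weight rare species differently, the values decrease from order 0 to order 2 whenever frequencies are uneven; they coincide only when all species are equally frequent.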
To delimit areas of overlap of species distributions (hereafter, richness areas), we used the method of Geographic Interpolation of Endemism (GIE), which is based on quantifying the co-occurrence of species, weighted by geographical distance between occurrence records. This method estimates a central point (centroid) for the occurrence points of each species and defines a circular area that corresponds to the estimate of the distribution of each species. The circular area for a species has a radius equal to the distance between its centroid and its farthest point of occurrence. This radius allows the overlap of geographical distributions of species to be quantified. The degree of overlap is measured according to a Gaussian function around the centroid of each species. The density of species in each overlap area is converted into interpolated curves using the kernel interpolation function. Ultimately, the interpolated curves are rasterized and displayed on maps. We applied the GIE method to the entire NTT database (9,108 species, 1,011 genera, 143 families and 698,490 occurrence records). We evaluated Brazil as a whole and its phytogeographic domains separately. For analyses at the species level we defined distance categories in which the first group corresponded to a distribution range of 0 to 49 km (encompassing single occurrence species and species with restricted distribution in the NTT database). From the number of occurrence records of the constituent species of this first class, we established the next distance classes with approximately the same number of occurrences, because the number of distance classes does not influence the GIE method, but the number of occurrence records within each class can influence the resulting overlap.
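The geometric core of this description (a centroid per species, a radius to the farthest record, and a Gaussian weight around the centroid) can be sketched in Python as follows. This is only an illustrative sketch, not the GIE tool itself: the function names are hypothetical, the centroid is taken as a simple mean of latitude and longitude, distances use the haversine formula, and setting the Gaussian standard deviation to one third of the radius is our assumption rather than a documented parameter of the method.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def centroid(points):
    """Mean latitude and longitude of the records (a planar simplification)."""
    n = len(points)
    return (sum(lat for lat, _ in points) / n,
            sum(lon for _, lon in points) / n)


def influence_radius_km(points):
    """Distance from the centroid to the farthest occurrence record."""
    c = centroid(points)
    return max(haversine_km(c, p) for p in points)


def gaussian_weight(distance_km, radius_km):
    """Weight of one species at a given distance from its centroid.

    Assumption: sigma = radius / 3, so the weight is close to zero
    at the edge of the circular area.
    """
    if radius_km == 0:
        return 1.0 if distance_km == 0 else 0.0
    sigma = radius_km / 3.0
    return math.exp(-0.5 * (distance_km / sigma) ** 2)


# Hypothetical species with two records on the equator.
records = [(0.0, 0.0), (0.0, 2.0)]
center = centroid(records)
radius = influence_radius_km(records)
```

Summing such weights over all species at each cell of a grid would give a surface whose peaks mark areas of high co-occurrence, which is the quantity the maps display.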
For the taxonomic ranks of genera and families, cases of single records or restricted distributions were rare (resulting in few genera or families in the distribution class of 0 to 49 km, thus limiting possible overlaps). We therefore defined distance classes by dividing the total number of occurrence records of genera and families, separately, into four classes for Brazil and for each of the domains, maintaining, when possible, the same number of occurrence records of taxa in each class, thus standardizing the methodology. We used the default setting of the method to obtain a consensus for the distance range size categories, which adds the predefined classes and gives more weight to groups (distance classes) with the highest number of occurrence records. To perform these procedures, we calculated the area of influence of each species and plotted interpolated curves on maps using Quantum GIS (QGIS Development Team 2015; http://www.qgis.org/en/site/). The darkest colors (tending towards navy blue) in the maps correspond to areas of greater richness; that is, areas where there is greater co-occurrence of species, genera or families. Considering that the area of possible distribution of each taxon corresponds to a circular area, estimated by the radius around the centroid to the record of occurrence furthest from the distance class, the richness area of one domain may exceed the limit of another.

Results
We found 9,108 tree or tree-like (TTL) species, representing 1,011 genera and 143 families, distributed among the six phytogeographical domains of Brazil. The extrapolation analyses estimated 9,446 TTL species for Brazil, with a minimum value of 9,381 and a maximum value of 9,525 (Tab. 2). The numbers of genera and families were estimated at 1,116 and 157, respectively (Tabs. S2-S3 in supplementary material). Among the 9,108 TTL species observed in Brazil, 66.39 % (6,047) were exclusive to a single Brazilian domain (Fig. 3). The predominantly forested domains (Amazon and Atlantic Forest) had the greatest richness of species, genera and families, and the greatest number of exclusive taxa (Fig. 3). The richness curves obtained for species, genera and families reveal that the Amazon was the richest domain of Brazil, followed by the Atlantic Forest, Cerrado and Caatinga (Fig. 4A-C). The Chaco and Pampa domains had the lowest richness of species, genera and families (Fig. 4A-C).
As for the frequency and richness of the taxa (Fig. 5), Fabaceae was the family with the highest number of occurrence records (110,443 or 15.81 % of the 698,490 occurrence records; Fig. 5A) and the greatest richness of TTL species (1,312 or 14.40 % of 9,108; Fig. 5E) and genera (145 or 14.34 % of 1,011; Fig. 5F). Among genera, Eugenia (Myrtaceae) was the most frequent (19,085 or 2.73 % of all occurrences; Fig. 5B) and the richest in number of TTL species (325 or 3.56 % of the total; Fig. 5D). The most frequent species was Casearia sylvestris (Salicaceae) (Fig. 5C), with occurrence records in all of the phytogeographic domains of Brazil.
The ten most frequent genera of Brazil accounted for more than 15 % of the total for this taxonomic level (Fig. 5C). This pattern was also found for all of the domains (Figs. S1-S6 in supplementary material), most notably in the Caatinga (26.38 %, Fig. S4B in supplementary material) and Pampa (25.65 %, Fig. S5B in supplementary material), in which the ten most prevalent genera accounted for more than a quarter of the total for genera in these domains. The ten most frequent families accounted for more than 50 % of the total number of family records in the country and in each domain. This was most evident in the Caatinga, in which the ten most frequent families accounted for 70 % of the occurrence records (Fig. S4A in supplementary material), indicating the dominance of a few families in the TTL flora of this domain. In turn, the ten richest families accounted for 47.67 % of the genera and 55.54 % of the TTL species (Fig. 5E, F), which are percentages similar to those found for each domain separately (Fig. S1-S6E, F in supplementary material).
We identified more than 200 areas with the highest TTL species richness in Brazil (i.e., darkest color on the map; Fig. 6A). Most of these areas were concentrated in the Atlantic Forest and Amazon domains. Large regions of the Atlantic Forest with this level of species richness were in the states of São Paulo, Rio de Janeiro, Espírito Santo, Minas Gerais and Bahia, followed by the coastal region. On the other hand, we did not identify areas with such high TTL species richness in the coastal region between the states of Paraná and São Paulo and between Espírito Santo and Bahia. In the Amazon, areas of high co-occurrence of species were found in the states of Amazonas, Roraima, Acre, Amapá and Pará. The South, Center-West and Northeast regions of Brazil showed few species-rich areas (Fig. 6A). For genera, we identified large areas of floristic richness in Central Brazil, forming a corridor that extends from the state of Amazonas, around its capital Manaus, through the entire state of Goiás and reaching an area of genera richness in São Paulo (Fig. 7A). The areas richest in families were concentrated in a large block in the states of Goiás, Minas Gerais and São Paulo (Fig. 8A). The majority of families (125, 87.41 %) and genera (741, 73.29 %) had circular distributions of over 1,000 km between the centroid and furthermost point. For TTL species, 14.25 % (1,298) had circular distributions of up to 99 km, 47.57 % (4,333) between 100 and 999 km, and 38.17 % (3,477) above 1,000 km.

Amazon domain
The Amazon supported 60.18 % (5,482) of the TTL species richness of Brazil, 79.12 % (800) of the genera and 89.51 % (128) of the families (Fig. 3). The extrapolation analysis estimated a maximum of 5,835 TTL species, 871 genera and 132 families for this domain (Tab. 2; Tabs. S2, S3 in supplementary material). The Amazon region can be characterized as having high numbers of exclusive TTL species and genera, corresponding to 67.78 % (3,716) of the 5,482 TTL species and 30.37 % (243) of the 800 genera recorded in this domain (Fig. 3). The richest vegetation type in the Amazon was Rain Evergreen Forest (3,977 TTL species, with a maximum extrapolation of 4,251 TTL species), followed by Seasonally Semideciduous Forest (2,659 TTL species, with a maximum of 3,149 TTL species) (Tab. S4 and Fig. S7A in supplementary material).
The family, genus and species with the highest frequencies in the Amazon were Fabaceae (16.51 %), Inga (Fabaceae; 3.07 %) and Tapirira guianensis Aubl. (Anacardiaceae), respectively. Fabaceae was also the richest family in genera (113 or 14.12 % of the total number of genera in this domain) and species (791 or 14.42 % of the total number of species in this domain). Among genera, Miconia (Melastomataceae; 145 TTL species, or 2.64 % of the 5,482 TTL species found in this domain) had the greatest TTL species richness (Fig. S1D-F in supplementary material).
We observed approximately 150 areas of TTL species richness distributed throughout the Amazon domain (Fig. 6B). The largest identified areas were located in the states of Amazonas and Roraima, with a notably low number of areas with high overlap of species in the southern and northeastern regions of this domain. The areas with the greatest richness of genera (Fig. 7B) and families (Fig. 8B) were concentrated in the state of Amazonas, in a large region surrounding Manaus. Most of the Amazon TTL families and genera had distributions of over 1,000 km (centroid to furthermost point), with 112 (87.50 %) and 570 (71.25 %), respectively. We identified that 2,270 (41.57 %) TTL species had circular distributions of between 100 and 999 km and 2,402 (43.81 %) had distributions of over 1,000 km.
Acta Botanica Brasilica - 34(1): 117-134. January-March 2020

Atlantic Forest domain
The Atlantic Forest comprises 46.80 % (4,263) of the TTL species, 65.67 % (664) of the genera and 85.31 % (122) of the families that occur in Brazil. Among the taxa that are unique to this domain, we recorded 2,105 (49.37 %) TTL species, 65 (9.78 %) genera and five (4.09 %) families (Fig. 3). The extrapolation curves predicted a maximum of 4,488 TTL species for the Atlantic Forest (Tab. 2), with Rain Evergreen Forest (3,217 TTL species, with a maximum extrapolation of 3,499 TTL species), Seasonally Semideciduous Forest (3,080 TTL species, with a maximum extrapolation of 3,576 TTL species) and Cloud Evergreen Forest (1,888 TTL species, with a maximum extrapolation of 2,308 TTL species) being the richest vegetation types in this domain (Tab. S4 and Fig. S7B in supplementary material).
The most abundant family, genus and species for the Atlantic Forest domain were Myrtaceae (12.35 %), Eugenia (3.75 %) and Casearia sylvestris Sw., respectively (Fig. S2A-C in supplementary material). Fabaceae was the family with the highest number of genera (96 or 14.45 % of the total genera), while Myrtaceae (659 or 15.45 % of the 4,263 TTL species) and Eugenia (233 or 5.46 % of the total TTL species in this domain) were the family and genus with the greatest number of species in the Atlantic Forest (Fig. S2D-F in supplementary material).
The TTL species richness of the Atlantic Forest was distributed along the coastal region, covering almost the entire territory of the states of Rio de Janeiro, Espírito Santo, Santa Catarina, Paraná, São Paulo and Bahia. We did not find any areas of high richness along the coastal region of the state of São Paulo, in a small region between the states of Rio de Janeiro and Espírito Santo and between the states of Espírito Santo and Bahia (Fig. 6C). We observed a large block of genera richness extending from São Paulo to Bahia and away from the coastal region (Fig. 7C). The richness areas for families occupied mainly the states of São Paulo and Minas Gerais (Fig. 8C). In the Atlantic Forest, 2,480 (58.17 %) species had distributions between 100 and 999 km (circular distribution), and 1,031 (24.18 %) between 1,000 and 2,400 km, while 414 (62.34 %) genera and 105 (86.06 %) families had distributions of over 1,000 km.

Cerrado domain
The most abundant family, genus and species for the Cerrado domain were Fabaceae (18.18 %), Myrcia (Myrtaceae; 2.85 %) and Casearia sylvestris, respectively (Fig. S3A-C in supplementary material). Miconia (83 or 3.01 % of the 2,753 species) was the genus with the highest species richness, while Fabaceae was the family richest in species (451 or 16.38 % of the total TTL species) and genera (100 or 16.39 % of the 610) (Fig. S3D-F in supplementary material).
We identified more than 120 areas of high species overlap in the Cerrado (Fig. 6D). Many of the areas of high TTL species richness were located in transitional regions (Fig. 6D). The GIE analysis revealed large blocks of richness for genera and families concentrated mainly in the states of Goiás and Tocantins, but a larger area of overlap was delimited at the genus level (Figs. 7D, 8D).

Caatinga domain
The Caatinga was found to have 1,077 (11.82 %) TTL species, of which 66 are endemic. We also identified 349 (34.52 %) genera with 11 exclusive, and 75 (52.44 %) families, none of which were exclusive to this domain (Fig. 3). Extrapolation analysis indicated a maximum richness of 1,226 TTL species in the Caatinga (Tab. 2). The most diverse vegetation type was Seasonally Deciduous Forest (Arboreal Caatinga and Deciduous Forest; 949 TTL species with a maximum extrapolation of 1,103), followed by Semi-arid Deciduous Stiff-leaved Dwarf forest (Alkaline Caatinga, Caatinga Quartzosa, Sandy Caatinga; 613 TTL species with a maximum extrapolation of 814) and Semi-arid Deciduous Broadleaved Dwarf forest (Rocky Caatinga; 467 TTL species and a maximum extrapolation of 504 TTL species) (Tab. S4 and Fig. S7D in supplementary material).
The ten most frequent families were responsible for 70.67 % of all the family occurrences for the Caatinga. The most frequent genus and species were Senna (Fabaceae; 4.09 %) and Aspidosperma pyrifolium (Apocynaceae), respectively (Fig. S4A-C in supplementary material). The family Fabaceae had the highest number of occurrences in the Caatinga, and was also the richest family in genera (82 or 23.49 % of the total genera) and species (269 or 24.97 % of the total TTL species). The genus with the greatest richness was Eugenia (34 or 3.15 % of the total species) (Fig. S4D-F in supplementary material).
In the Caatinga, large areas of TTL species richness were identified encompassing the northern region of Minas Gerais and most of the state of Bahia, as well as Alagoas and Pernambuco, but smaller areas of richness were also identified in the states of Rio Grande do Norte and Paraíba (Fig. 6E). The northern part of the Caatinga had fewer areas of species richness. A large area of richness of genera was found mainly in the states of Bahia and Pernambuco (Fig. 7E). Areas of family richness were concentrated in the state of Bahia, with smaller areas distributed in the states of Piauí, Rio Grande do Norte, Alagoas, Pernambuco and Minas Gerais (Fig. 8E). Most of the TTL taxa in the Caatinga had circular distributions of between 100 and 999 km: 800 species or 75.28 % of all 1,077 species, 290 genera or 83.09 % of all 349 genera and 69 families or 92.00 % of all 75 families.

Pampa domain
The Pampa domain includes 322 TTL species (3.53 % of the Brazilian species; four of them exclusive), 176 genera (17.40 % of all genera) and 64 families (44.75 % of all families) (Fig. 3). The most diverse vegetation type was Riverine Seasonally Semideciduous Forest, which was found to potentially harbor up to 369 TTL species, as indicated by its ascending extrapolation line (Tab. S4 and Fig. S7E in supplementary material).
The ten most frequent genera corresponded to 25.65 % of all occurrence records of genera in the domain, while the ten most abundant families accounted for 53.33 % of the total occurrence records of families (

Chaco domain
The Chaco domain had 317 TTL species, corresponding to 3.48 % of the TTL species in Brazil, distributed among 194 genera (19.18 % of the total) and 56 families (39.16 % of the total) (Fig. 3). We identified eight unique TTL species for the Chaco. We also highlight the presence of the genus Stetsonia (Cactaceae), an endemic of this domain. The vegetation type with the greatest TTL species richness in this domain was Seasonally Deciduous Forest with 233 TTL species and a maximum extrapolation of 388 TTL species (Tab. S4 and Fig. S7F in supplementary material).
The most frequent taxa of this domain were the family Fabaceae (21.72 % of the total), the genus Aspidosperma (Apocynaceae; 3.77 % of the total) and the species Libidibia paraguariensis (Fabaceae). The ten most prevalent genera corresponded to 16.92 % of all genus records, while the ten most abundant families accounted for 55.79 % of all family records (Fig. S6A-C in supplementary material). The family Fabaceae (60 or 31.70 %) and the genus Aspidosperma (9 or 2.83 %) were the family and genus with the highest number of species; Fabaceae was also the richest in number of genera with 36 (18.55 %) of the 194 genera found for the Chaco (Fig. S6D-F in supplementary material).
The GIE analyses for the Chaco identified this domain as a region rich in species, genera and families (Figs. 6G, 7G, 8G). However, this result should be interpreted with caution because this domain covers a very restricted area in Brazil. All taxa in this domain had circular distributions of up to 52 km.

Predominantly forested domains had the greatest richness
The predominantly forested domains (Amazon and Atlantic Forest) had the highest observed, rarefied and extrapolated richness of tree or tree-like (TTL) taxa in Brazil. These were followed, in order, by Cerrado, Caatinga, Pampa and Chaco. This finding is consistent with the pattern found in the Flora of Brazil 2020 database (BFG 2015; Flora do Brazil 2020 under construction 2019). This result was also expected because it is known, for instance, that the Amazon shelters ~11 % of the tree species estimated to occur worldwide (Cardoso et al. 2017). In addition, the Amazon basin is the most diverse rainforest in the world, and no other region of tropical America surpasses it in terms of contribution to biodiversity (Antonelli et al. 2018). The Atlantic Forest, one of the top biodiversity hotspots in the world (Myers et al. 2000), also possesses exceptional species diversity, which can be higher than that of many forests in the Amazon (Morellato & Haddad 2000).
The proportion of taxa exclusive to domains in Brazil was significant, suggesting that environmental heterogeneity is one of the main factors that determine the floristic composition and richness of a site (Oliveira-Filho 1989; Stein & Kreft 2015). Besides being associated with the diversity of habitats, this high level of diversity may also be related to factors that promote tree growth, such as the availability of water and nutrients, and the ability of plants to use resources of forest environments (Murphy & Bowman 2012), resulting in a large number of species considered locally rare (Hubbell & Foster 1986; Kochummen et al. 1990; Lieberman & Lieberman 1994). However, it should be emphasized that our estimate of 9,525 species (from the 9,108 observed) does not mean that there are only 417 TTL species yet to be described/found in Brazil. This statistical estimate, yielded by the methods developed by Chao & Jost (2012) and Chao et al. (2014), allows extrapolation with a measure of confidence, and does not necessarily represent an expectation of species to be described/found in the future.
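To illustrate why an extrapolated total is a statistical quantity rather than a forecast of future discoveries, the following is a minimal sketch of the bias-corrected Chao1 asymptotic richness estimator, which adds to the observed richness a term derived from the number of taxa recorded once and twice. It is an illustration only, not the rarefaction and extrapolation procedure applied in this study; the function name `chao1` and the toy record counts are hypothetical.

```python
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1 estimate from per-species record counts.

    Illustrative sketch only; not the workflow used for the NTT data.
    """
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)            # observed richness
    n = sum(counts)                # total number of records
    if n == 0:
        return 0.0
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]      # species recorded exactly once / twice
    k = (n - 1) / n
    if f2 > 0:
        undetected = k * f1 ** 2 / (2 * f2)
    else:
        undetected = k * f1 * (f1 - 1) / 2
    return s_obs + undetected


# Hypothetical toy sample: record counts for eight species.
toy_counts = [10, 6, 4, 2, 2, 1, 1, 1]
estimate = chao1(toy_counts)
```

The estimate depends only on how many taxa are rare in the sample, so it is best read as a lower bound on the richness of the sampled assemblage, and it changes as sampling effort increases.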

Fabaceae and Myrtaceae emerged as the most representative families in Brazil and in each phytogeographic domain, confirming global patterns
The dominance of Fabaceae and Myrtaceae was expected since the former has the largest number of TTL species in the world, and the latter is the third most diverse (Beech et al. 2017). The high richness of Fabaceae seems to be a constant for Neotropical forests (Beech et al. 2017), while high Myrtaceae richness seems to be a characteristic of forests of eastern Brazil (Peixoto et al. 2008). In fact, the patterns found for the Atlantic Forest and Pampa domains support the suggestion of some studies that, overall, the forests of coastal Brazil represent an important center of diversity for the family Myrtaceae (Mori et al. 1983; Amorim et al. 2008; Murray-Smith et al. 2008). The high number of occurrences for the families Fabaceae and Myrtaceae is also reflected in high frequencies for the genera Eugenia and Myrcia in the Atlantic Forest, Pampa and Cerrado, and Inga and Senna in the Amazon and Caatinga, respectively.
The most frequent families in Brazil and its domains are usually also the richest. The pattern of richness for Brazilian domains reveals the dominance of a few families, among which the "top ten" account for at least half of the species and approximately half of the genera of each domain. The three largest and richest Brazilian domains (Amazon, Atlantic Forest and Cerrado) exhibited great differences in species richness. The Cerrado was found to have 2,753 species, while the Atlantic Forest, with 4,263, has about 1,500 more species than the Cerrado, and the Amazon, with 5,482, has approximately twice the number of species as the Cerrado. On the other hand, the Cerrado, Atlantic Forest and Amazon exhibited smaller differences in the number of genera (610, 664 and 800, respectively) and greater similarity in the number of families (117, 122 and 133, respectively). These results are suggestive of numerous speciation processes in predominantly forest areas (Amazon and Atlantic Forest).
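The contrast described above can be checked with simple arithmetic on the reported totals. The sketch below computes species-per-genus ratios and the increase in each taxonomic level relative to the Cerrado; the richness values are those given in the text, while the dictionary layout and variable names are assumptions made for illustration.

```python
# Richness totals as reported in the text for the three largest domains.
richness = {
    "Cerrado": {"species": 2753, "genera": 610, "families": 117},
    "Atlantic Forest": {"species": 4263, "genera": 664, "families": 122},
    "Amazon": {"species": 5482, "genera": 800, "families": 133},
}

# Mean number of species per genus in each domain.
per_genus = {
    name: round(levels["species"] / levels["genera"], 2)
    for name, levels in richness.items()
}

# Richness of each domain relative to the Cerrado, per taxonomic level.
baseline = richness["Cerrado"]
relative = {
    name: {level: round(value / baseline[level], 2) for level, value in levels.items()}
    for name, levels in richness.items()
}

for name in richness:
    print(name, per_genus[name], relative[name])
```

The species-per-genus ratio rises from the Cerrado to the two forested domains while the number of families changes little, which is the asymmetry that the text interprets as suggestive of speciation within genera.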
Among the phytogeographic domains of Brazil, the pattern of frequency for the ten most-frequent species reveals that the higher the species richness of a domain, the lower the frequencies. Competitive interactions could explain this result, but specific studies are needed to confirm this hypothesis. Competition among plants is an important determinant of community structure (Aschehoug et al. 2016). Coexistence implies the spatio-temporal overlap of the distribution of some species that compete for limited resources, which can drive the structuring of patterns of vegetation composition, richness and frequency (Craine & Dybzinski 2013).

The Rain Evergreen Forest had the highest number of tree and tree-like species in Brazil
The extraordinary diversity of Rain Evergreen Forest confirms the findings of Oliveira-Filho & Fontes (2000), who reported a high proportion of Atlantic Forest species to be concentrated in this vegetation type. This emphasizes that the TTL flora of the Rain Evergreen Forest is considerably richer and has more exclusive species than the Seasonally Semideciduous Forest. On the other hand, our results also confirm the findings of Eisenlohr & Oliveira-Filho (2015a), by showing that the flora of the Atlantic Seasonally Semideciduous Forest has high species richness and should not be considered an impoverished subset of rainforest flora, as suggested by Oliveira-Filho & Fontes (2000). Riverine Forest had the greatest richness for the TTL component of the Cerrado and Pampa. These forests pass through different vegetation types and possibly connect the floras of the main tropical forest domains (Oliveira-Filho & Fontes 2000; Oliveira-Filho & Ratter 2000). Caatinga and Chaco harbor most of their richness in Seasonally Deciduous Forest, which is a vegetation type that occurs predominantly in Northeast Brazil and in the Chaco (Veloso 1992; Prado & Gibbs 1993; Pennington et al. 2000).

Hotspots of tree and tree-like richness in Brazil
The non-random distribution of TTL species, genera and families allows the delimitation of areas of richness in the phytogeographical domains in Brazil. Areas with high species richness in Brazil were predominantly in the Amazon and Atlantic Forest, which are dominated by tropical forests. Tropical forests are usually very diverse (high richness) and have low species dominance (Pitman et al. 2001; McGill et al. 2007; Steege et al. 2013). Thus, given the large scale of this analysis of Brazil, these domains were expected to show the greatest overlap of TTL species and, consequently, more definable areas with greater richness.
The core region of the Brazilian territory, which harbors part of the Amazon, Cerrado and Atlantic Forest domains, stood out as the area with the greatest richness of genera and families. One fact that should be considered is the distribution distance pattern for genera and families, in which the majority had distributions with distances greater than 1,000 km. Such large distributions allow the central region of Brazil to be the main area of overlap of these taxa. In this way, we show that the areas richest in genera and families are in central Brazil and form a northwest to southeast richness corridor.
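The geometric reasoning behind this overlap can be sketched with a toy example. Each taxon is reduced to a circular range (a centroid and a radius, in km on a flat plane), and richness at a point is the number of circles that cover it. This is a simplified illustration of range overlap, not the GIE method itself; the coordinates, radii and function name are hypothetical.

```python
import math


def overlap_count(point, taxa):
    """Count taxa whose circular range (cx, cy, radius) covers `point`."""
    px, py = point
    return sum(1 for cx, cy, r in taxa if math.hypot(px - cx, py - cy) <= r)


# Hypothetical taxa on a 3,000 x 3,000 km square: (centroid_x, centroid_y, radius_km).
centroids = [(500, 500), (2500, 500), (500, 2500), (2500, 2500)]
wide_taxa = [(x, y, 1500) for x, y in centroids]    # ranges wider than 1,000 km
narrow_taxa = [(x, y, 300) for x, y in centroids]   # ranges of a few hundred km

centre, corner = (1500, 1500), (500, 500)

wide_at_centre = overlap_count(centre, wide_taxa)
wide_at_corner = overlap_count(corner, wide_taxa)
narrow_at_centre = overlap_count(centre, narrow_taxa)
narrow_at_corner = overlap_count(corner, narrow_taxa)
```

In this toy setting the widely distributed taxa all overlap at the centre of the square even though their centroids lie near its corners, whereas the narrowly distributed taxa contribute richness only close to their own centroids.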
Many areas of high species richness are distributed throughout the Amazon, with the greatest richness being in the region of Manaus, whose flora is relatively well studied, and lower richness in the southern and western Amazon. This distribution pattern of richness is associated with sites where collection effort has been intense, and thus has been considered a 'museum effect' (Nelson et al. 1990; Ponder et al. 2001; Werneck et al. 2011). Therefore, these results could be interpreted as a bias related to sampling effort, which has been evidenced for decades by Brazilian science (Nelson et al. 1990; Ponder et al. 2001; Moerman & Estabrook 2006; Werneck et al. 2011; Ribeiro et al. 2016).
Areas of high species richness in the Atlantic Forest were congruent with areas of endemism identified in other studies, such as those of mammals (Costa et al. 2000), endemic angiosperms (Werneck et al. 2011) and arthropods (Hoffmeister & Ferrari 2016). Although this pattern could also be associated with a 'museum effect', it has held up for various groups of organisms, including those of the present study, which used a large number of occurrence records (330,225 for the Atlantic Forest). Areas of species richness are distributed along coastal habitats of the Atlantic Forest, while areas of richness of genera and families are concentrated farther from the coast and form a band of richness on the continent between the southeastern and central corridors of the Atlantic Forest (see Werneck et al. 2011). These distribution patterns of richness can be strongly influenced by environmental filtering (coastal zone-continent) related to the climatic seasonality of the Atlantic Forest domain, which would act at the species level (Scarano 2002; Neves et al. 2017), resulting in high diversity in coastal habitats.
We found a large number of areas containing high species richness in the Cerrado domain, especially in Cerrado-Amazon and Cerrado-Atlantic Forest transitional areas. These zones of ecological transition between domains are commonly areas of high richness since they harbor species of both domains. These transition areas also have rare species (Pianka 1994; Araújo 2002) and are considered evolutionary cradles (Smith et al. 1997). For the Cerrado, we found that the most central region of Brazil, in the state of Goiás, has the greatest richness (overlap) of genera and families; however, caution is needed with this interpretation, and other appropriate analyses need to be conducted to confirm these patterns, such as those involving phylogenetic, molecular genetics and migratory approaches. The geographic position of the Cerrado in central Brazil seems to mediate most species migrations across Brazilian domains.
The Caatinga had areas of high richness that were congruent among the different taxonomic levels. Soil aridity has been proposed as a biogeographical filter for the woody flora of the Caatinga, acting by selecting species with particular ecological strategies (Silva & Souza 2018). In addition, the areas of high richness that we found indicate that this arid environment is also acting similarly at the level of genus and family. In turn, large areas of richness in the Pampa were close to the hydrographic basins, mainly in the southern proximity of the extensive Lagoa dos Patos, an area of transition with the Atlantic Forest, and in the central portion of the domain. This finding can be explained by the location of the Pampa vegetation, which is concentrated in riparian forests (Joly et al. 1999). The northern part of the Ibirapuitã Environmental Protection Area was also found to have areas rich in species, genera and families.
The distribution patterns for vegetation richness in Brazil were found to be controlled at spatial scales that acted in different ways for TTL species, genera and families. Establishing floristic groups is important for comparative ecological, evolutionary and biogeographical studies, and the recognition of areas of richness can serve as a starting point for efficiently documenting patterns of diversity for conservation purposes (Morrone & Escalante 2002). A recent effort in this regard was that of BFG (2015), which also demonstrated areas of high richness for seed plant species in Brazil. We highlighted such areas specifically for the TTL floristic component, with strong statistical support and details about the main vegetation types of Oliveira-Filho (2009;. The areas of richness for TTL species, genera and families identified in this study represent different sets of spatially clustered taxa. These areas indicate high biological diversity and, thus, are areas of biological importance for conservation (MMA 2007). Most of the areas of high species richness in the Cerrado are located at the boundaries of this domain, in transition areas. Such transition zones can constitute geographic areas of high species overlap, and consequently high richness (Pianka 1994; Araújo 2002), and thus deserve special conservation attention. The patterns presented by the present study can help to identify priority areas, which is of utmost relevance to formulating biodiversity conservation strategies in the most biodiverse country on Earth.