Brazilian herbaria: an overview

Herbaria represent irreplaceable repositories of biodiversity and are used to answer questions about conservation, ecology, systematics, and other sciences. Here, we characterize the infrastructure, human resources, and idiosyncrasies of Brazilian herbaria. To achieve this goal, we sent curators a structured and standardized questionnaire to gather information about their herbaria. The Brazilian Herbaria Network listed 216 active herbaria in the year 2018, of which 139 answered the questionnaire. These herbaria hold 6,741,469 samples in their collections and more than 39,000 type samples. The largest share of herbaria is in federal universities (40.28 %). Only 24 % of the curators considered that their herbarium is valued by their institution and 52 % indicated inadequate storage areas. Only nine collections have smoke sensors. Our analysis showed that if an herbarium has an institutional policy, the curator is 78 % more likely to consider the herbarium valued. Therefore, it is important for all herbaria to establish their own policy. These numbers reflect the difficulty in maintaining herbaria, which in many cases are cared for only by their curators, without institutional recognition and support. Despite recent losses in Brazilian natural history collections, herbaria are still threatened by a lack of basic infrastructure.


Introduction
Biological collections, such as herbaria, are the main sources of biodiversity data used to map the distribution of organisms in time and space and are critical in understanding the impact of climate change in the Anthropocene (Meineke et al. 2018). According to Wheeler et al. (2012), it is through the knowledge about biodiversity that we can understand the past and learn how to better manage the future.
Herbaria represent irreplaceable repositories of information about plants, fungi, algae, lichens, and the world they inhabit (Funk 2003). Likewise, they are a fundamental source of associated metadata (Heberling & Isaac 2017; Soltis 2017). Collections are the basis for evolutionary biology and biogeography (Dalton 2003; Funk 2003), ecological studies (Beauvais et al. 2017; Souza & Hawkins 2017), and conservation efforts (Iganci & Morim 2012; Nualart et al. 2017; Schindel & Cook 2018). Challenging questions on a large temporal or spatial scale, such as phenological changes as a result of climate change (Davis et al. 2015) or the control of zoonoses (Schindel & Cook 2018), can be answered with herbarium data.
In the last decade, there was a considerable increase in taxonomic research on plants, algae, and fungi in Brazil (along with the training of many taxonomists), leading to a better knowledge of our flora. This helped the government to achieve one of the goals of the Global Strategy for Plant Conservation/CBD (2001) through the publication of the List of Brazilian Flora Species (Forzza et al. 2010). Since then, Brazil has made relevant progress in systematizing and modernizing its collections (Egler & Santos 2006) and in making them available in digital format (Forzza et al. 2015; Maia et al. 2015; Silva et al. 2017), ensuring the advancement of e-taxonomy (Smith & Figueiredo 2009).
Brazil is a megadiverse country that encompasses the richest flora in the Americas and one of the richest on the planet (Forzza et al. 2012; Ulloa et al. 2017). However, the country is underrepresented in terms of specimens stored in herbaria. Adding all the samples recorded in the Brazilian herbaria (including duplicates), the collections reach approximately 8.4 million specimens (Thiers 2018), a value similar to the number of specimens deposited in the Muséum national d'Histoire naturelle in Paris (P). We can assess inventory gaps using the 206 datasets on the INCT Virtual Herbarium of Flora and Fungi (INCT-HVFF) platform. The INCT-HVFF reports that the number of records per km² is much lower in the North (0.2 records/km²), Midwest (0.45), and Northeast (0.88) regions than in the Southeast (1.85) and South (2.01) regions, where the most consolidated funding agencies and postgraduate programs are concentrated.
Our goal is therefore to describe and characterize the infrastructure, human resources, and specificities of Brazilian herbaria in order to stimulate discussion about botanical collections in the country and about whether Brazil's herbaria are prepared to face the challenges of properly documenting the country's biodiversity, joining the efforts of the international scientific community.

Data survey
During the scientific session entitled "Brazilian Herbaria from North to South: How to Overcome Regional Differences?", which took place at the 69th National Botany Congress, in Cuiabá, from July 8 to July 13, 2018, geopolitical data on Brazilian herbaria were presented. After the presentation, we developed a new, standardized questionnaire that was sent to the herbarium curators.
Thus, between August 30 and October 30 of 2018, each Brazilian herbarium registered at the Brazilian Herbaria Network (https://www.botanica.org.br/catalogo-da-rede-brasileira-deherbarios) received an email with 62 structured questions (List S1 in supplementary material) covering: (1) general data; (2) institutional data; (3) collection data; (4) herbarium physical structure; and (5) usage. The questionnaire also asked about the main goals and biggest challenges for the collection.

Data analysis
From question 62 ("What is the major challenge in the collection?") a word cloud was generated (https://www.jasondavies.com/wordcloud) using as input the number of words cited by the curators. These terms were standardized based on synonyms, removing prefixes, articles, and other grammatical items.
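The standardization and counting step described above can be sketched as follows. This is only an illustration in Python, not the procedure used in the study: the stop-word set, the synonym map, and the example answers are all hypothetical, since the actual lists are not given in the text.

```python
from collections import Counter
import re

# Hypothetical stop words and synonym map; the study's actual lists are not reported.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is"}
SYNONYMS = {"technicians": "technician", "staff": "technician", "room": "space"}


def term_counts(answers):
    """Count standardized terms across free-text answers."""
    counts = Counter()
    for answer in answers:
        # Lowercase the answer and keep alphabetic tokens only.
        for word in re.findall(r"[a-z]+", answer.lower()):
            if word in STOP_WORDS:
                continue  # drop articles and other grammatical items
            # Map synonyms onto a single standardized term.
            counts[SYNONYMS.get(word, word)] += 1
    return counts


# Hypothetical answers to question 62, for illustration only.
example_answers = [
    "Lack of technicians and space",
    "The storage room is too small",
    "Staff for the database",
]
counts = term_counts(example_answers)
```

The resulting term counts are the kind of input a word cloud generator scales font sizes by.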
Moreover, we wanted to understand the different answers to the following question: "11. Do you consider that the herbarium is valued in your institution?" (answer: yes, partially, or no). To do this, we constructed a multiple logistic regression model to associate variables that could be related to a valued herbarium. We started by removing the forms in which "partially" was chosen as the answer to the question. We did this because of the subjectivity of the word: a partial value could tend to "partially yes" or "partially no", so this answer might introduce noise in our model. After this, we removed nine forms that contained blank spaces (i.e., the curator did not answer all the questions). This left us with 67 fully answered questionnaires from 139 curators. For predictors, we used the answers to the following questions: 10, 13, 14, 15, 16, 18, 23, 33, 35, 42, 47, 51, 52, and 55 (hereafter β₁x₁ to β₁₄x₁₄, respectively; see Tab. 1).

Tab. 1. Questions used as predictors (question number | question | variable type | term).
[...]
18 | Does the herbarium [...] students for the herbarium provided by the institution? | categorical | β₆x₆
23 | The total estimated size of the collection | numerical | β₇x₇
33 | What percentage of the collection is digitized? | numerical | β₈x₈
35 | What percentage of the collection is available online? | numerical | β₉x₉
42 | Are the herbarium's dimensions adequate? | categorical | β₁₀x₁₀
47 | Is the herbarium used in the development of postgraduate course activities? | categorical | β₁₁x₁₁
51 | Does the herbarium receive specialists of plant identification from other institutions? | categorical | β₁₂x₁₂
52 | Approximately how many specialists visited the herbarium in the last year? | numerical | β₁₃x₁₃
55 | Does the herbarium promote specimens exchange with national institutions? | categorical | β₁₄x₁₄

The predictors consisted of nine categorical variables and five numerical variables. The binary yes-no (1:0) answer to question 11 (from now on Q11) was considered our response variable.
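The filtering of questionnaires described above can be sketched as below. This is a hedged illustration in Python rather than the R workflow used in the study; the function name `select_forms`, the dictionary layout, and the example forms are assumptions made here for illustration.

```python
def select_forms(forms, response_key="Q11"):
    """Keep fully answered forms whose response is 'yes' or 'no', coded as 1 or 0."""
    kept = []
    for form in forms:
        # Drop 'partially' (and anything that is not a clear yes/no).
        if form.get(response_key) not in ("yes", "no"):
            continue
        # Drop forms with any blank answer.
        if any(value is None or value == "" for value in form.values()):
            continue
        coded = dict(form)
        coded[response_key] = 1 if form[response_key] == "yes" else 0
        kept.append(coded)
    return kept


# Hypothetical forms: Q15 (own policy) and Q23 (collection size) as example predictors.
example_forms = [
    {"Q11": "yes", "Q15": "yes", "Q23": 65000},
    {"Q11": "partially", "Q15": "no", "Q23": 12000},
    {"Q11": "no", "Q15": "no", "Q23": None},
    {"Q11": "no", "Q15": "no", "Q23": 5000},
]
selected = select_forms(example_forms)
```

Only the first and last example forms survive: the second is excluded for the "partially" answer and the third for its blank field.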
The numerical variables were normalized into a 0-1 range using the function 'normalize' from the package BBmisc (Bischl et al. 2017). We were then able to construct a multiple logistic regression model (MLRM; McDonald 2009) to grasp what makes an herbarium valued in the view of the curators. Let Y be the binary response (yes = 1; no = 0) for Q11, p the probability of answering 'yes', β₀ the intercept, βₙxₙ the predictors, and logit the log(odds). The MLRM for p = P(Y = 1) has the following equation:

$$\operatorname{logit}(p) = \log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n \tag{1}$$

By exponentiating and simple algebra we obtain the odds of answering 'yes':

$$\frac{p}{1-p} = e^{\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n} \tag{2}$$

which can be used to obtain the probability of answering 'yes':

$$p = \frac{e^{\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n}}{1 + e^{\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n}} \tag{3}$$

Since we are trying to understand what makes an herbarium valued (answering 'yes' to Q11), we can extract the odds and probabilities associated with a one-unit increase (0 → 1; again, answering 'yes') in any predictor. For instance, a curator that answered 'yes' to βₙxₙ has probability p of also answering 'yes' to Q11.

To obtain the final model, odds, and probabilities, we first defined a null model (N_M) and a full model (F_M). Our N_M consisted of the regression of the answers to Q11 against β₀:

$$\operatorname{logit}(p) = \beta_0 \tag{4}$$

The F_M is the generalization of Equation (1) to our variables, that is:

$$\operatorname{logit}(p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{14} x_{14} \tag{5}$$

Then, we used the 'step' function to conduct a stepwise procedure from the lower (N_M) to the upper (F_M) bound and selected the best model using Akaike's Information Criterion (AIC; Akaike 1973). After this, we tested the significance of our final model (FINAL_M) through the Chi-square test by comparing it against the N_M. All analyses were carried out with R software version 3.6.1 (R Development Core Team 2019).
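As a minimal sketch of the two building blocks described above, the Python snippet below rescales a numerical predictor into the 0-1 range and evaluates the log-odds, odds, and probability of a logistic model. It is an illustration only: the study itself used R, the function names are chosen here, the treatment of constant columns is an assumption, and the coefficients and collection sizes are hypothetical rather than fitted values.

```python
import math


def min_max_normalize(values):
    """Rescale numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def linear_predictor(intercept, coefficients, predictors):
    """Log-odds: intercept plus the sum of coefficient * predictor terms."""
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))


def odds(intercept, coefficients, predictors):
    """Odds of answering 'yes': the exponentiated log-odds."""
    return math.exp(linear_predictor(intercept, coefficients, predictors))


def probability(intercept, coefficients, predictors):
    """Probability of answering 'yes': odds / (1 + odds)."""
    o = odds(intercept, coefficients, predictors)
    return o / (1.0 + o)


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Hypothetical collection sizes, rescaled to the 0-1 range.
scaled_sizes = min_max_normalize([5000, 20000, 65000])

# Hypothetical intercept and coefficient for a single yes/no predictor (x = 1).
p_yes = probability(-1.0, [1.3], [1])
```

Applying `logit` to `p_yes` returns the linear predictor again, which is a quick check that the three forms of the model are consistent with each other.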

Results
The Brazilian Herbaria Network listed 216 active herbaria in the year 2018 (Fig. 1), in addition to 24 inactive herbaria and 15 that have been transferred to or incorporated by other collections. Of the active herbaria, 162 (75 %) are registered in the Index Herbariorum. Altogether, the South has 57 herbaria, the Southeast 74, the Northeast 40, the Midwest 21, and the North 24. In Fig. 1 we summarize the number of Brazilian collections per geopolitical region, indicating the top ten herbaria in collection size. The oldest herbarium is the Herbarium of the National Museum, Rio de Janeiro (R), established in 1831. Since then, the number of herbaria has increased steadily (Fig. 2). Between 2000 and 2018, 91 herbaria were established in Brazil, most notably in the years 2000-2010, when seven new herbaria were created per year.
From 216 active herbaria in Brazil, 139 (64 %) filled in the standardized questionnaire (Tab. 2). The numbers represented in this work are based on the answers of these 139 herbaria. The region with the greatest response rate was the southern region, with 87 % of the herbaria answering the questionnaire, followed by the northern region, with 73 %. The 139 herbaria hold 6,741,469 samples in their collections and a little more than 39,000 type specimens.
Federal universities hold the largest share of the herbaria (40.28 %), followed by state universities (19.42 %) and community universities (7.9 %). Research institutes and botanical gardens add up to 10 %. Regarding the value of collections, 38 (27 %) curators replied that their herbarium is not valued in their institution, 67 (48 %) curators answered that their collection is partially valued, and only 34 (24 %) considered that the herbarium is valued. Only 40 collections (28.77 %) have their own website and only 62 (44.6 %) collections have their own policies. Despite this, 80 (57.5 %) collections have their own technician to assist the curator in day-to-day activities.
When asked about the presence of taxonomists in the collections, 30 herbaria (21.5 %) said they lack expert taxonomists, while the other herbaria accounted for 300 associated taxonomists (mean = 1.45 taxonomists per herbarium; sd = 2.29). Postgraduate studies are associated with 79.13 % of the collections and are responsible for supporting research in different areas of science. Researchers made more than 1,800 visits in the previous year to 73 % of the herbaria, most of them Brazilian taxonomists.
Resources from scientific research or extension projects funded 49 herbaria, while occasional supply from their institutions was the main source of funding for 38 collections. A single collection indicated support from the private sector as its main source of funding. Among the research project initiatives, 134 herbaria (96 %) said that they were part of national projects, such as INCT Virtual Herbarium of Flora and Fungi, Reflora and/or SiBBr. Regarding sample storage and collection protection, most herbaria have traditional two-door cupboards (47.48 %) or a mix of mobile storage (compactors) and traditional cupboards (16.54 %). Some collections use wooden or metal boxes on open shelves to hold their samples (5.7 %). The collections are in inadequate space according to 52 % of the curators, the size of the storage room being the most criticized item. The database management system used most often is Brahms, cited by 35 % of curators, followed by Excel spreadsheets (28 %) and Jabot (17 %). Only 32 % of the collections are 100 % digitized. This number is much lower when considering the database of photo images, where only nine herbaria (6.4 %) indicated that 100 % of their holdings are photographed, in contrast with the 65 (46.7 %) collections that do not have any image of their specimens.
Some collections are at risk because they lack any type of fire protection system (21.5 %), while others have only fire extinguishers (72 %); only nine collections have smoke sensors. Infrastructure is identified as one of the biggest constraints of collections, including physical space or basic types of equipment such as air conditioners and dehumidifiers (Fig. 3). Still, the word cloud (Fig. 3) showed that the absence of technicians to carry out routine activities such as drying, preparing the exsiccates, and updating the database represents the biggest obstacle.

Multiple Logistic Regression Model
The AIC value of FINAL_M was 85.72 (N_M AIC = 94.82). The FINAL_M consisted of three predictors: β₃x₃, β₄x₄, and β₁₀x₁₀. However, only β₄x₄ ("Does the collection have its own policies or any legal document recognized by the institution?") was significant (p-value of the z-statistic < 0.05) (Tab. 3). The curators that answered 'yes' to this question were 80 % more likely to answer 'yes' to Q11 ("Do you consider that the herbarium is valued in your institution?"). Converting the probability to an odds ratio, this result tells us that having a policy or any legal document recognized by the institution increases by 3.7 times the chance of answering 'yes' to Q11. Finally, the Chi-square test demonstrated that our FINAL_M differs from N_M (p-value < 0.001).
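The link between the reported odds and probability can be checked with a short calculation. The Python sketch below assumes the standard relations p = odds / (1 + odds) and odds = p / (1 - p), and assumes that the value 3.7 can be read as odds; the function names are chosen here for illustration and this is not the code used in the study.

```python
def odds_to_probability(odds):
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


def probability_to_odds(p):
    """Convert a probability (0 <= p < 1) to odds: p / (1 - p)."""
    return p / (1.0 - p)


# Values reported in the text.
reported_odds = 3.7
null_aic, final_aic = 94.82, 85.72

p_from_odds = odds_to_probability(reported_odds)  # roughly 0.79
odds_from_p = probability_to_odds(0.80)           # roughly 4
aic_improvement = null_aic - final_aic            # roughly 9.1; lower AIC is preferred
```

Under these assumptions, odds of 3.7 correspond to a probability just under 0.79, which sits between the 78 % and 80 % figures quoted in different parts of the text.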

Discussion
Over the past ten years, the botanical collections in Brazil advanced greatly through governmental programs such as the List of Species of the Brazilian Flora, INCT-Virtual Herbarium of Flora and Fungi, REFLORA (Flora of Brazil 2020 and Virtual Herbarium) projects and public funding such as Biological Collections (SiBBr MCTIC), National Forest Inventory (which strengthened and subsidized the digitization of collections), expansion of collections in neglected areas, and the training and qualification of Brazilian taxonomists (Forzza et al. 2015; Dias 2017; Maia et al. 2017; Silva et al. 2017; REFLORA 2019). Most herbaria have benefited from these golden times, and the expansion and online availability of collection data led the country to a new era of biodiversity studies.
Despite this progress, infrastructure is still precarious in many collections. Recent losses of major zoological collections such as those of the Butantan Institute and the National Museum (Kumar 2010; Kury et al. 2018) seem not to have affected the protection of collections, since 33 herbaria do not have fire extinguishers and only nine reported the presence of smoke detectors, even though herbaria are highly flammable because specimens are packed in paper materials. The presence of air conditioners in more than 100 collections (70 %) and annual fumigation in 50 % of them at least ensure proper conservation of the samples. However, conservation may be threatened by the lack of physical space, since 52 % of the collections indicated that the space is inadequate for the current size of the collection, which usually results in inadequate storage (see a proper way in Bridson & Forman 1992) and can damage the samples. This lack of infrastructure may be a reflection of the lack of institutional recognition. Indeed, curators that have their own herbarium policy, or any other internal instrument that recognizes the collection in their institution, are 3.7 times more likely (or ≈ 80 %) to answer 'yes' to Q11 than curators that do not (Tab. 3).
The Southeast and South regions have the largest number of botanical collections, accounting for 63 % of the country's herbaria, although they only represent 18 % of the national territory. This can be attributed to the historical location of research institutions and universities, mostly concentrated in southeastern and southern Brazil. The two largest Brazilian herbaria, the Rio de Janeiro Botanical Garden and the National Museum, were created in the 19th century and together hold more than 1.3 million samples (Thiers 2018). The Midwest region comprises 9 % of the country's active botanical collections (21 herbaria). This is the lowest number of collections per region, according to previous records (Egler & Santos 2006). Although the Midwest has the second largest area of the national territory, it is one of the regions with the lowest number of specimen records per km² (0.45). The number of active botanical collections and specimen records/km² in this region could be related to the few taxonomists that participate directly in the staff of the collections (Tab. 2) (Barbosa et al. 2005; Sartori & Pott 2018).
The North region has the third oldest herbarium in Brazil, the Museu Paraense Emílio Goeldi (MG), founded in 1895. Together with IAN and INPA, it holds 80 % of the region's samples. This region represents 45 % of the entire Brazilian territory. However, it ranks fourth in the number of samples in herbaria and has the lowest number of records/km² (0.2). One of the major impediments is the scarcity of taxonomists working in the northern region: just 30. Also, the large geographical area makes access to sampling areas difficult (Sobral & Stehmann 2009; Milliken et al. 2010). Despite all the adversities, the North region stands out with 91 % of its specimens digitized and 89 % available online thanks to data entry initiatives that began in the 1990s (Viana et al. 2015).
Two Northeast states (Bahia and Pernambuco) have the largest number of herbaria (ten and eight, respectively), as well as the largest number of consolidated botanical postgraduate programs and the largest number of records in the collections. Despite some important regional and local floras (Giulietti et al. 2006; Lyra-Lemos 2010; Prata et al. 2013), recent National Inventory studies in Rio Grande do Norte (Versieux et al. 2017) and Ceará revealed new species and a singular richness in floristic diversity. Among online species records in Brazil, 20 % are in the Northeast, and 84 % are already available through data platforms (speciesLink or Jabot). Among the Northeast herbaria, we highlight the largest collection of fungi in Latin America (URM) and the most important collections of the Caatinga and 'Hiléia Baiana' (CEPEC).
Most Brazilian herbaria (67 %) are in universities. They are a fundamental repository of biodiversity data, supporting scientific research in the environmental field, especially botany and ecology. According to Maia et al. (2017), of the 26 Postgraduate Programs in Botany in Brazil (included in the CAPES Biodiversity area), 25 have an associated herbarium and make data available online. However, given their importance in science, the limited institutional recognition perceived by their curators is surprising. Perhaps, in the academic view, biological collections remain institutionally associated with "old-fashioned" research and are not seen as a place where cutting-edge science can be done (Meineke et al. 2018). Therefore, long-term policies and institutional recognition are fundamental to consolidate the advances made in the areas of taxonomy and biological collections in Brazil (Marinoni & Peixoto 2010).
As a network, Brazilian herbaria are one of the largest repositories of data on flora and fungi, a fundamental component of the country's biodiversity (Forzza et al. 2012). The digitization and online availability of collection data that began in the last decade, supported by different initiatives, brought Brazilian biodiversity researchers into the "big data" era, i.e., the use of large datasets in the scientific, political, social, and commercial domains (Devictor & Vincent 2016). For instance, the data usage of the INCT Virtual Herbarium is impressive, since more than 400,000,000 records were used from 2014 to 2016 via the search interface (Maia et al. 2017). This gave competitiveness to Brazilian environmental sciences, which are mostly done in public institutions.
Brazil is known as the holder of one of the greatest biodiversities on the planet, with a large number of plant species and endemisms (Ulloa et al. 2017). Since the turn of the century, the number of publications in Brazil containing descriptions of new species of its flora has grown, mostly authored by researchers based in the country (Sobral & Stehmann 2009; Grieneisen et al. 2014). The growth of alpha taxonomy is certainly related to investments in infrastructure for the study of biodiversity, which enable collaborative programs between Brazilian and foreign herbaria.
However, continued budget cuts by the Ministry of Science, Technology, Innovations, and Communications and the Ministry of Education (Fernandes et al. 2017; Petherick 2017) are a serious threat to the sustainability of all the infrastructure that has been built in recent decades and endanger the open sharing of data from Brazilian biological collections. Also, the new Brazilian Biodiversity Law (Brasil 2015), which simplified some processes for bioprospecting, hindered the process of sending samples abroad (either as a donation or as a loan), since the new law requires the manual registration of each sample in a nonfunctional system (Alves et al. 2018). As a consequence, this new law harmed the collaboration between Brazilian and international researchers, restricting partnerships and the production of data for science (despite the proposals made by CGEN to solve part of the bureaucracy).
Our results suggest that the main challenges to the improvement and maintenance of the integrity of botanical collections in Brazil are related to a scarcity of institutional support (lack of space, lack of staff, lack of infrastructure and collection recognition) (Fig. 3). Thus, we recommend that the collections seek institutional recognition via publication of internal resolutions and policies, reinforcing that the herbarium is an institutional patrimony and represents part of the scientific sovereignty of the region and state.
Finally, we would like to stress that scientific, technological, and innovation research in botanical sciences depends on herbarium data. The country will not be able to face the challenges of the new century without a national policy for herbaria (and biological collections) that guarantees the strengthening of collaborative research networks, infrastructure, and the training of human resources for biodiversity research. We believe that the knowledge, correct use, and conservation of our rich biodiversity will bring sustainable economic growth and social welfare for the benefit of future generations facing the challenges of global climate change.