Is the genetic variability of elite rice in southern Brazil really disappearing?

There is worldwide concern about a possible narrowing of the genetic base of most crops, including rice (Oryza sativa L.), as a result of modern breeding practices. The purpose of this study was therefore to investigate this phenomenon in the germplasm of elite paddy rice in southern Brazil, including accessions frequently used in crosses. The panel consisted of 91 accessions. Data on morphological traits, SNP markers and mineral content of husked and polished grain were analyzed by hierarchical clustering and principal component analysis. SNP markers and hierarchical clustering proved most appropriate to assess the genetic variability. A narrowing of the genetic base of rice was confirmed, although a certain level of genetic variability was still found in the elite paddy rice germplasm of southern Brazil, particularly for grain mineral content.


INTRODUCTION
Rice (Oryza sativa L.) is one of the most important crops worldwide, including in Brazil, and is the primary food source of more than half of the world's population, contributing to global wealth and food security (Zhang et al. 2011). In view of the importance of this crop, new improved cultivars have to be constantly developed in breeding programs, which is only possible as long as genetic variability, with novel genes and alleles that control traits of economic importance, is available.
From the 1960s onwards, developing countries invested in the green revolution to meet their food demand, which resulted in high-yielding cultivars, especially of wheat and rice. The intense use of these genotypes on farms as well as in breeding programs has raised food production in these countries but has also caused losses of genetic variability (genetic erosion) (Govindaraj et al. 2015). In fact, the genetic progress in rice breeding has declined steadily over the last decades in several countries, due to the narrow genetic base of the accessions used for crosses (Aljumaili et al. 2018), even though some variability is still available within the rice gene pool. However, this diagnosis of the genetic variability has to be established for each specific germplasm, e.g., for South Brazilian paddy rice, mainly because the southern region produces more than 80% of all Brazilian rice (SOSBAI 2018). In fact, studies have reported evidence for the gradual narrowing of the genetic variability among the elite cultivars grown in southern Brazil (Rangel et al. 1996, Raimondi et al. 2014, Rabelo et al. 2015, Streck et al. 2018, Streck et al. 2019). However, it is important to emphasize that these studies may have applied datasets and methods of somewhat limited efficiency for evaluating genetic variability.
C Busanello et al.
Studies on genetic variability can improve the efficient use and conservation of genetic resources (Aljumaili et al. 2018). Genetic variability has frequently been measured based mainly on morphological traits. However, this approach has some disadvantages in terms of time, space, cost and labor requirements. Furthermore, it cannot define the precise level of genetic variability among accessions, due to the additive gene action for trait expression and the environmental effect on the phenotypic performance. Genetic variability can also be assessed by analyzing the grain mineral content (Roy and Sharma 2014). However, this is also a phenotype-based approach, possibly with the same limitations as morphological traits.
An alternative to the phenotypic traits are molecular markers, by which an organism can be directly characterized based on DNA features (Aljumaili et al. 2018). Molecular markers were developed in the 1980s and with the rapid increase of information regarding plant genomes, a revolution in molecular genetics has taken place, which provided a tremendous opportunity to investigate genetic variability contained in any given germplasm. These markers were successfully applied to determine and classify the genetic variability, among which the SNP (single-nucleotide polymorphism) markers are currently preferred, in view of their advantages (Nadeem et al. 2018).
Based on the above, this study used genotypic and phenotypic approaches to investigate whether the genetic variability among rice accessions cultivated in southern Brazil, and among accessions used in crosses in breeding programs, is really becoming exhausted.

Rice panel
A panel consisting of 91 rice accessions was selected for this study, to sample the genetic variability of the elite South Brazilian rice germplasm, including foreign cultivars widely used in crosses (Table 1). These are accessions of Brazilian cultivars released in the last four decades, introduced cultivars, a few traditional varieties and mutant lines, and one hybrid. Twenty-five conventional genotypes (i.e. not

Field conditions for morphological and grain mineral content analysis
For the morphological trait measurements and for grain production for mineral content analysis, the panel accessions were grown at the Terras Baixas Experimental Station of Embrapa Temperate Climate, in Capão do Leão, state of Rio Grande do Sul, southern Brazil, in the 2016/2017 growing season. The experiment was arranged in a randomized complete block design, with three replications in 1.0-m single-row plots, spaced 0.2 m apart. All crop management practices were applied according to the recommendations for rice cultivation in southern Brazil (SOSBAI 2018).

Morphological trait measurements
Five plants per row were evaluated for the following 10 traits: number of panicles per plant, panicle weight, number of grains and sterile grains per panicle, 1000-grain weight, lemma color, sterile glume color and caryopsis length, width and thickness.

Grain mineral content
For the analysis of grain mineral content, the accessions were harvested by hand at stage R8 [phenological scale of Counce et al. (2000)] and the grains were dried. The mineral content was measured in both husked and polished (white) rice. To this end, the samples for polished rice were husked and polished for 2 min in a test mill (Suzuki, model S21, MT). Next, all samples were ground (MARCONI, model MA020, Piracicaba-SP). The content of 11 minerals was determined (As, Ca, Co, Cu, Cr, Fe, Mg, Mn, P, Se and Zn). In general, the samples were prepared and analyzed as described by Paniz et al. (2018). The samples were analyzed by inductively coupled plasma mass spectrometry (ICP-MS Agilent 7900, Hachioji, Japan), at the Federal University of ABC, in Santo André, São Paulo.

Genotyping with SNP markers
The panel was genotyped with 7,098 SNP markers of the 7K Infinium SNP genotyping platform (Illumina®) at the laboratory of genotyping services of the International Rice Research Institute (IRRI), Philippines (updated version of the 6K Infinium array; Thomson et al. 2017). A filtering procedure was carried out using TASSEL v.5.2.41 (Bradbury et al. 2007), in which SNPs with > 20% missing data and SNPs with a minor allele frequency ≤ 5% were removed, leaving 4,973 high-quality SNPs for the analysis.
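The two filtering thresholds can be illustrated with a minimal Python sketch. This is not the TASSEL procedure itself. It assumes, purely for illustration, that genotypes are coded as counts of one allele (0, 1 or 2) with None for missing calls, and that the minor allele frequency is computed on the non-missing calls only. The function names and the toy data are hypothetical.

```python
def missing_rate(calls):
    """Share of accessions without a genotype call for one SNP."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls):
    """MAF from allele counts (0/1/2), ignoring missing calls (assumption)."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    allele_freq = sum(observed) / (2 * len(observed))
    return min(allele_freq, 1 - allele_freq)


def filter_snps(snps, max_missing=0.20, min_maf=0.05):
    """Drop SNPs with missing data > max_missing or MAF <= min_maf."""
    kept = {}
    for name, calls in snps.items():
        if missing_rate(calls) > max_missing:
            continue
        if minor_allele_frequency(calls) <= min_maf:
            continue
        kept[name] = calls
    return kept


# Hypothetical toy data: three SNPs scored on ten accessions.
snps = {
    "snp_a": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0],           # polymorphic, complete
    "snp_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],           # monomorphic
    "snp_c": [0, 2, None, None, None, 2, 0, 2, 0, 2],  # 30% missing
}
kept = filter_snps(snps)
print(sorted(kept))
```

In this toy example only the complete, polymorphic marker passes both thresholds.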

Statistical analysis
To make the results comparable, all four datasets were subjected to the same statistical analyses, i.e., distance-based heat map, principal component analysis and hierarchical clustering. Firstly, a genetic distance matrix was calculated, based on the Euclidean distance metric. This matrix was transformed into a cluster distance heat map, for a simple and direct visualization of the genetic distances among the accessions. For hierarchical clustering, the previously generated Euclidean distance matrix was used and the Average Linkage method was applied. A cut-off was assigned by the Mojena (1977) methodology, where cut-off = mean + k × SD, with k = 1.25. For principal component analysis (PCA), the K-means algorithm was applied to cluster the genotypes and the Elbow criterion to determine the number of groups. Finally, for each of the four datasets, the group containing most accessions in hierarchical clustering and in the K-means grouping was compared in Venn diagrams to identify the common and the exclusive accessions. Normalization was applied in all analyses to adjust for the scale of each variable. Orange v. 3.18 software was used for all analyses and graphical displays (Demsar et al. 2013).

RESULTS AND DISCUSSION
In this study, genetic variability was measured in a selected panel of rice accessions based on four distinct datasets, i.e., of morphological traits, genotyping with SNP markers and grain mineral content of husked and polished rice. Two statistical analyses were applied to each dataset, i.e., hierarchical clustering combined with genetic distance heat map and principal component analysis (PCA). The application of each dataset and analysis resulted in slightly distinct findings, but common features of this germplasm were also observed.
Based on the morphological traits, hierarchical clustering formed five groups, while five genotypes were not clustered in any group (Figure 1A). Most of the accessions (~75% of the genotypes) were grouped in one (yellowish) of the five clusters. Almost all current south Brazilian elite (CSBE) cultivars were assigned to this group, while all other types of accessions were also represented, e.g., foreign cultivars and traditional varieties. Among the accessions that were not clustered in this main group, only BRS Querência is a representative of the CSBE cultivars. The smallest group (orange) contained only two accessions, one of which was a traditional variety and the other a foreign cultivar. A visual grouping of the genotypes by the genetic distance heat map was not possible, since the range of genetic distances was narrow (mean genetic distance = 1.64). The use of morphological traits to assess genetic variability within a given germplasm is a widely applied strategy in this field of research (e.g. Cargnelutti Filho et al. 2010). Based on this approach, a recently published study focused on the South Brazilian rice germplasm showed that most of the genotypes were also clustered in a single large group, which led to the conclusion of a marked narrowing of the genetic variability of this germplasm, especially within the indica group (Streck et al. 2017, Streck et al. 2018). However, it is well known that results of studies based on phenotypic data alone may be imprecise, especially due to the confounding environmental effect. Moreover, phenotypic assessments are subject to human error, in particular visual evaluations, which are usually also rather time-consuming.
For the SNP dataset (Figure 1B), hierarchical clustering defined six groups, and four accessions were not grouped. As for the morphological traits, a very large cluster (red) contained most of the evaluated genotypes (~79%). All CSBE cultivars, with the exception of cvs. BRS Firmeza and SCS BRS Tio Taka, were clustered in this largest group. However, all other types of accessions were again also found in this cluster, as in the morphological trait analysis. For this dataset, the mean genetic distance was considerably higher (54.22). Consequently, the genetic distance heat map allowed a clear visual separation of the clusters, confirming the hierarchical clustering. In short, molecular markers circumvent the environmental effect in genetic studies. Moreover, it is a technology with an impressive scope, since thousands of loci in hundreds of genotypes can now be assessed within only a few days (Rasheed et al. 2017, Nadeem et al. 2018). Another key advantage of this approach is that SNP markers established by the current technologies are well-distributed across the genome (Thomson et al. 2017), which allows inferences on the genetic variability even of traits not measured in the study, due to genetic linkage. Studies using biochemical/molecular markers to assess the genetic variability of Brazilian rice are rare (Guidolin et al. 1994, Rangel et al. 1996, Malone et al. 2006, Branco et al. 2007, Raimondi et al. 2014), but also agree with this study, evidencing a certain level of narrowing of the genetic diversity of this germplasm. The scarcity of studies applying molecular markers is due to the still high cost of these technologies and the need for qualified staff to perform the analysis.
For mineral content of rice grain, the results show a less threatening scenario, as more groups were found by the hierarchical analysis and the size of these groups was smaller. Grain mineral content of husked rice formed seven groups, while four genotypes were not included in any cluster (Figure 1C). The largest group (pink) contained approximately half of the evaluated genotypes; however, only 14 of the 25 CSBE cultivars were part of this group. Although a range of genetic distances was calculated (mean = 1.72), the groups formed by hierarchical clustering could not be confirmed visually by the heat map. For grain mineral content of polished rice, a total of seven groups were also formed, and once again four accessions were not clustered in any group (Figure 1D). The largest group (purple) contained ~55% of the accessions. Of these 50 genotypes, 16 were CSBE cultivars. Of the remaining nine CSBE cultivars, six were clustered in the second largest group (gray). Once again, the genetic distance heat map could not distinguish the groups formed by hierarchical clustering (mean genetic distance = 1.86). In southern Brazil, mineral content of rice grain has not been a main target of breeding programs, but rather yield potential along with grain quality (industrial and cooking properties), followed by other field traits, such as adaptation to variable environments, earliness, tolerance to abiotic and resistance to biotic stresses, among others (Streck et al. 2018). In other words, virtually no direct selection for enhanced grain mineral storage has been applied to the studied germplasm. This explains why the clustering obtained with this dataset did not fully agree with the one based on morphological traits or even with the one based on molecular markers. The detected variability can be attributed to the origin of the genotypes (Pinson et al. 2015) and also to unintentional selection, as studies have found evidence that selection for improved yield may also change the grain content of several elements (Anandan et al. 2011). In fact, studies of worldwide panels found high genetic variability for rice mineral content (Anandan et al. 2011, Pinson et al. 2015), i.e., any future lack of genetic variability can be remedied by the exploitation of foreign accessions.
Across all datasets, a key result, especially of the analysis of morphological traits and SNP datasets, was the frequent clustering of most CSBE cultivars in a single group. When interpreting clustering analysis, evidence for a relatively narrow genetic variability within a group of evaluated genotypes is the formation of few groups and, more importantly, the clustering of most genotypes in a single large group. This aspect was well-documented in this study, mainly for the analysis of morphological traits and SNP markers by hierarchical clustering.
A surprising lack of the well-known phenomenon "breeder signature" (Ali et al. 2011, Xie et al. 2015) was observed in all datasets, since several cultivars developed in different breeding programs (Embrapa Temperate Climate, Irga, Basf, Epagri) and even foreign genotypes clustered together. One possible reason is that these programs develop cultivars with similar breeding targets, i.e. mainly yield potential, for a similar geographical region and mainly for paddy rice cultivation. Similar evidence was described by Reig-Valiente et al. (2016), who showed that breeding for a specific region even tends to genetically isolate the germplasm regarding adaptability. A second reason is probably more important: studies based on genealogy records have identified the causes of the narrowing of the genetic base of rice, showing that few ancestors represent the base of practically all breeding programs and that, in addition, few parents were used repeatedly in crosses over decades, affecting the resulting variability (Rangel et al. 1996, Raimondi et al. 2014, Rabelo et al. 2015).
Regarding the datasets used in this study, it is possible to assume that the results of SNP markers are the most refined. Important evidence in this sense is the high mean genetic distance (54.22) for this dataset (versus morphological data 1.64; mineral content of husked rice 1.72; mineral content of polished grain 1.86). Molecular markers are not only stable in the genome of the evaluated plants, but are also unaffected by trait expression and independent of interactions with other genomic sequences. The importance of identifying SNPs associated with desirable traits is reflected in their use in genome-wide association studies, which have been applied to rice accessions for agronomically important traits (Changrog et al. 2020). On the other hand, any phenotypic dataset is a result not only of environmental effects but also of specific genetic and genomic architectures, in which several phenomena, such as gene expression, the mode of gene action and even epigenetic controls, play a role.
To confirm the results of the hierarchical method, a PCA was carried out on the same datasets and groups were formed (Figure 2). Based on the morphological trait dataset (Figure 2A), four components explained 76% of the genetic variation and the applied method suggested three groups of genotypes. The largest group (blue) comprised ~47% of the accessions, the second group (red) ~38% and the smallest (yellow) only ~14% of the genotypes. The largest group contained 10 and the second largest cluster 15 CSBE accessions. The smallest group consisted basically of foreign japonica cultivars and traditional varieties. The PCA of the SNP dataset required six components to explain 62% of the genetic variability of the measured panel (Figure 2B). Four clusters were generated, of which the largest (red) comprised ~69% of the accessions, including 19 CSBE cultivars. The second largest group (blue) contained ~12% of the accessions and six CSBE cultivars.
In the PCA for mineral content of husked rice, four components explained 74% of the genetic variation and four groups were established (Figure 2C). The largest group (blue) contained ~31% of the genotypes and 11 CSBE cultivars. Finally, in the PCA of mineral content of polished rice, four components explained 75% of the genetic variation and three clusters were formed (Figure 2D). Approximately 38% (blue), 29% (red) and 33% (yellow) of the genotypes were assigned to each cluster. Surprisingly, CSBE cultivars were evenly distributed across all three groups, as also found for the traditional varieties and foreign genotypes. Importantly, for both datasets of husked and polished grain mineral content, there was no clear clustering of cultivars according to the breeding company that developed them. Once more, this indicates the presence of genetic variability within the measured panel for these traits.
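The grouping step behind the PCA panels, K-means with the number of groups chosen by the Elbow criterion, can be sketched as follows. This is a minimal illustration, not the Orange implementation. It assumes the input is a table of coordinates (for instance principal component scores) and uses a deterministic farthest-point initialization chosen here only to keep the example reproducible. The data and function names are hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the farthest one."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns labels and within-cluster sum of squares."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wcss


# Hypothetical scores of eight accessions on two components.
scores = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [4.0, 4.2],
          [4.3, 3.9], [3.8, 4.1], [8.0, 0.2], [8.2, 0.5]]
for k in range(1, 6):
    _, wcss = kmeans(scores, k)
    print(k, round(wcss, 3))
```

The Elbow criterion then selects the k after which the within-cluster sum of squares stops dropping sharply. Reading that point from the printed values or from a plot remains a judgment call.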
Venn diagrams were constructed to elucidate the agreement between groupings (Figure 3). For hierarchical clustering, a total of 25 accessions in common were present in the largest group resulting from each dataset, of which 12 are CSBE cultivars (Figure 3A). Regarding PCA, only two accessions in common were grouped in the largest cluster of each dataset (BRS Ligeirinho and IRGA 428, a CSBE cultivar) (Figure 3B).
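The comparison summarized by the Venn diagrams amounts to set operations on the largest group of each dataset. The following sketch shows the idea; the accession labels are placeholders, not the study's data, and the function name is hypothetical.

```python
def common_and_exclusive(groups):
    """groups maps a dataset name to the accessions in its largest group."""
    common = set.intersection(*groups.values())
    exclusive = {}
    for name, members in groups.items():
        others = [g for other, g in groups.items() if other != name]
        # accessions found in this dataset's largest group and in no other
        exclusive[name] = members - set().union(*others)
    return common, exclusive


# Placeholder membership lists for the four datasets.
largest_groups = {
    "morphological": {"acc_01", "acc_02", "acc_03", "acc_04"},
    "snp": {"acc_01", "acc_02", "acc_05"},
    "husked_minerals": {"acc_01", "acc_02", "acc_06"},
    "polished_minerals": {"acc_01", "acc_02", "acc_03"},
}
common, exclusive = common_and_exclusive(largest_groups)
print(sorted(common))
print({name: sorted(members) for name, members in exclusive.items()})
```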
A comparison of the clustering methods showed that hierarchical clustering allowed the formation of more groups than the PCA, for all datasets. It has been clearly demonstrated that different clustering analyses can lead to different groupings (Cargnelutti Filho et al. 2010). A classical requirement for a proper interpretation of PCA is that few components should explain a large amount of the variation in a dataset, usually > 80%. In this study, the first components did not explain more than 76% of the variation in any of the PCAs, and adding more components improved this result only minimally. Thus, taking this point into consideration, it is possible to suggest that hierarchical clustering is more reliable for these datasets.
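The 80% requirement can be checked directly from the eigenvalues of a PCA. The sketch below computes the cumulative share of variance and the number of components needed to reach a threshold. The eigenvalues are hypothetical and do not come from the study, and the function names are illustrative.

```python
def cumulative_explained(eigenvalues):
    """Cumulative share of total variance, largest component first."""
    total = sum(eigenvalues)
    running, shares = 0.0, []
    for ev in sorted(eigenvalues, reverse=True):
        running += ev
        shares.append(running / total)
    return shares


def components_needed(eigenvalues, threshold=0.80):
    """Smallest number of components whose cumulative share meets threshold."""
    for i, share in enumerate(cumulative_explained(eigenvalues), start=1):
        if share >= threshold:
            return i
    return len(eigenvalues)


# Hypothetical eigenvalues of a correlation matrix with ten variables.
eigenvalues = [3.1, 2.0, 1.4, 1.1, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1]
print([round(s, 2) for s in cumulative_explained(eigenvalues)])
print(components_needed(eigenvalues))
```

When many components are needed to reach the threshold, a two-dimensional PCA plot shows only part of the structure, which is the concern raised in the paragraph above.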
In crop breeding programs, the analysis of the genetic variability of the germplasm under study is of paramount importance, as this variability is simply the raw material, the basis, on which selection can be successfully carried out. There is a worldwide concern that the genetic variability of most crops will gradually become eroded, due to the current practices of breeding programs, in which few parents are used for crosses and a high selection pressure is applied to a very limited range of phenotypes, also called the funnel effect (Wouw et al. 2009). For bread wheat (Triticum aestivum L.), a crop with a naturally narrow genetic variability, this threat has been widely recognized among researchers (Reif et al. 2005). Although the rice gene pool is considerably richer in comparison with wheat, there are indications of a narrowing of the genetic variability of the crop, especially in the elite germplasm, which has been under selection. The international literature has described this phenomenon (Aljumaili et al. 2018), but important reports on a narrowing of the Brazilian rice germplasm have also been published (Rangel et al. 1996, Raimondi et al. 2014, Rabelo et al. 2015, Streck et al. 2018, Streck et al. 2019).
In general, a certain level of variability for immediate use in South Brazilian rice breeding programs was found to be still available, especially for breeding for grain mineral content, but also for morphological and other traits, as indicated by SNP analysis. Thus, genotypes of one given cluster can be a source of variability for accessions of another. This result was to some extent supported by a recent study that showed a linear increase in the genetic progress of South Brazilian rice over the last 45 years (Streck et al. 2018). However, the prevalent result of this study indicated a concerning erosion of the genetic variability of Brazilian rice, which will probably impair the genetic progress of the crop within a few years. Furthermore, the genetic vulnerability caused by this narrow variability also has to be taken into account, with regard to the threat of biotic epidemics and even abiotic factors triggered by the ongoing climate change (Raimondi et al. 2014). Breeders and researchers have to be concerned and respond with strategies to broaden the genetic variability of Brazilian paddy rice, be it by mutagenesis, by gene introgression from wild relatives and landraces or even by biotechnological approaches, such as the cutting-edge technology of genome editing.

CONCLUSIONS
For the evaluated germplasm, SNP markers and hierarchical clustering are the most appropriate tools to assess genetic variability. The narrowing of the genetic base of rice mentioned in the literature was confirmed here. Nevertheless, a certain level of genetic variability within the germplasm of South Brazilian elite paddy rice is still available, especially for grain mineral content.