Breeder-Learning: a new tool for teaching and learning plant breeding principles

The Be-Breeder application is an on-line tool built with the R software to assist with some of the main genetic and statistical analyses used in plant breeding. In addition, Be-Breeder provides a section called “Learning”, which uses a simple point-and-click interface to explain theories related to the effect of inbreeding, population structure, qualitative and quantitative traits, heterosis, population size, effect of selection, and composition of hybrids. Be-Breeder is available for network use on the website of the Allogamous Plant Breeding Laboratory (Laboratório de Melhoramento de Plantas Alógamas) of ESALQ-USP through the link: http://www.genetica.esalq.usp.br/alogamas/R.html.


INTRODUCTION
Most Agronomy courses have mandatory subjects related to plant breeding. This requirement, set by CONFEA (CONFEA 1973) and the Ministry of Education (MEC 2006) for the course curriculum, is based on the importance of this topic for the education and training of the agronomist and for agriculture. For example, it is estimated that half of the increase in yield of the main agricultural crops has come about through plant breeding processes (Kuhr et al. 1985, Byerlee and Moya 1993). Facts of this nature have also led to the creation of numerous graduate study programs in this line of research. The aim of these programs is to train human resources in understanding the mechanisms and concepts of this theme, as well as possible applications of the principles that guide plant breeding.
In this context, different tools, such as computer applications or software, can be used to assist teaching and learning by clarifying more complex theories and equations. For this purpose, the Be-Breeder application, a compilation of on-line routines visualized through the Shiny package of the R software (Chang et al. 2015), has a section called Learning, with interactive learning strategies. These strategies are an easily applied alternative for explaining theories related to the effect of inbreeding, population structure, qualitative and quantitative traits, heterosis, population size, effect of selection, and composition of hybrids.

BE-BREEDER APPLICATION
The Be-Breeder application is a teaching tool freely available for on-line use at http://www.genetica.esalq.usp.br/alogamas/R.html. In the Learning section, the user has interactive tabs that deal with themes related to quantitative genetics, population genetics, and breeding methods. The tabs available and suggestions on how to use them are provided below.

Inbreeding Effect
Inbreeding is the process of raising the frequency of gene loci in homozygosity by means of successive crosses between related individuals, with maximum expression through self-pollination in plants (Nass 2001, Bos and Caligari 2007). This allows the formation of pure lines that have various applications (Shull 1909, Johannsen 2014).
To illustrate the inbreeding process in plants, in which the percentage of heterozygous loci is reduced by half in each self-pollination cycle, the user can visualize histograms for a single locus with two alleles: "A" and "a". Upon choosing a generation of self-pollination from F1 to F9, or F.Inf (∞), it can be observed that at each cycle the percentage of the genotype "Aa" is cut in half, from 100% in F1 to 0% in F.Inf; in contrast, the percentage of each of the homozygous genotypes "AA" and "aa" increases from zero in generation F1 to 50% in generation F.Inf.
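The halving of heterozygosity described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code of the application; the function name `selfing_frequencies` is hypothetical, and the starting point is assumed to be an F1 that is 100% "Aa".

```python
def selfing_frequencies(generation):
    """Return (AA, Aa, aa) frequencies in generation F<generation>.

    Assumes a single locus with two alleles and an F1 that is 100% Aa.
    """
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or greater")
    het = 0.5 ** (generation - 1)  # Aa is cut in half in every selfing cycle
    hom = (1.0 - het) / 2.0        # the remainder is split equally between AA and aa
    return hom, het, hom


# Print the frequencies from F1 to F9, as in the histograms of the tab.
for g in range(1, 10):
    freq_AA, freq_Aa, freq_aa = selfing_frequencies(g)
    print(f"F{g}: AA={freq_AA:.4f}  Aa={freq_Aa:.4f}  aa={freq_aa:.4f}")
```

As the generation number grows, the "Aa" frequency approaches zero and each homozygous class approaches 0.5, which corresponds to the F.Inf case.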
In this tab, the user can also observe the fluctuation in additive ($\sigma^2_a$) and dominance ($\sigma^2_d$) variance between and within populations as a function of the inbreeding coefficient (F of Wright). Additive variance between is estimated by the expression σ 2 aA = 2Fσ

Qualitative vs. Quantitative
The number of genes and their different forms of allelic interaction influence the number of phenotypic classes and also the frequency of each class in the population. Consequently, the total number of genes that control the trait, together with the types of interaction, gives rise to the concepts of qualitative and quantitative traits and to the transition from one to the other (Ramalho et al. 2008, Borém and Miranda 2013). In this respect, a simple algorithm was developed in Be-Breeder in which the user can construct the genetic structure of a trait, choosing the number of genes with dominance effect, the number of genes with partial dominance effect, and the number of genes with additive effect. As a response, a histogram relating frequency and number of classes can be visualized in the output box. It should be emphasized that, just as in natural biological systems, a randomness factor was embedded in the algorithm, so that the same number of genes will not necessarily produce the same number of classes for polygenic effects, allowing inferences to be made regarding segregation.

Progeny Size
Observation of a given genotype depends on the number of genes acting on a trait, such that the greater the number of genes, the more individuals are necessary in population sampling to verify all the possible genotypes or the genotype desired. Inbreeding, promoted by self-pollination, has a direct influence on this observation because it increases the loci in homozygosity in the population, reducing genotypic variability. The size of the population to be evaluated is given by the expression $n^{o} = \frac{\log(1-p)}{\log(1-IH)}$, in which p represents the probability of observing a given genotype, and IH depends on m, the number of generations of self-pollination, and n, the number of genes that control the trait. This number of individuals can easily be obtained in this tab of Be-Breeder, in which the user can simulate numerous scenarios and observe fluctuation in the population size as a function of the generation of self-pollination, the number of genes, and the probability of observation.
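As a rough illustration of the population-size expression, the sketch below takes IH directly as an input, because its expression in terms of m and n is not reproduced here. The function name `progeny_size` is hypothetical, and rounding up to the next whole plant is an assumption of this sketch.

```python
import math


def progeny_size(p, ih):
    """Individuals needed to observe a genotype with probability p.

    p  : desired probability of observing the genotype (0 < p < 1)
    ih : the IH term of the expression, taken here as a given value (0 < ih < 1)
    """
    if not (0.0 < p < 1.0) or not (0.0 < ih < 1.0):
        raise ValueError("p and ih must lie strictly between 0 and 1")
    # n = log(1 - p) / log(1 - IH), rounded up to a whole individual
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - ih))


# Example with arbitrary values: IH = 1/4 and a 95% probability of observation.
print(progeny_size(0.95, 0.25))
```

Smaller values of IH, or a higher probability of observation, increase the number of individuals required.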

Effect of Selection (HWE)
In population genetics, it is relevant to check whether a given gene locus is in Hardy-Weinberg equilibrium (HWE) as a function of the frequencies of the alleles A (p) and a (q). From this information, it is possible to determine the number of individuals expected for each genotype using the expressions $n^{o}_{AA} = (p^2 + pqF)N$, $n^{o}_{Aa} = (2pq - 2pqF)N$, and $n^{o}_{aa} = (q^2 + pqF)N$, in which N is the total number of individuals of the population and F is the inbreeding coefficient of Wright. Upon comparing the number expected with the number observed for each genotype, it is possible to perform a chi-square test ($\chi^2$) to verify significance through the expression $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$ for i = (AA, Aa, aa), in which $O_i$ is the number of individuals observed for genotype i and $E_i$ is the number of individuals expected for genotype i.
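The expected numbers and the chi-square comparison can be sketched as follows. This is a minimal illustration rather than the routine used in the application; the function names and the observed counts are hypothetical, and the chi-square statistic is taken as the usual sum of squared deviations divided by the expected values.

```python
def expected_counts(p, f, n):
    """Expected numbers of AA, Aa, and aa for allele frequency p,
    inbreeding coefficient f, and population size n."""
    q = 1.0 - p
    return {
        "AA": (p * p + p * q * f) * n,
        "Aa": (2 * p * q - 2 * p * q * f) * n,
        "aa": (q * q + p * q * f) * n,
    }


def chi_square(observed, expected):
    """Sum over genotypes of (O - E)^2 / E."""
    return sum((observed[g] - expected[g]) ** 2 / expected[g] for g in expected)


# Hypothetical observed counts for one locus.
observed = {"AA": 30, "Aa": 40, "aa": 30}
n = sum(observed.values())
# Frequency of allele A estimated from the observed genotypes.
p = (2 * observed["AA"] + observed["Aa"]) / (2 * n)
expected = expected_counts(p, 0.0, n)  # F = 0 corresponds to HWE
print(expected)
print(chi_square(observed, expected))
```

Setting F to zero gives the expectations under Hardy-Weinberg equilibrium; a positive F shifts individuals from the heterozygous class to the two homozygous classes.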
Nevertheless, the effect of selection is the main piece of information the breeder uses to conduct a breeding program. Thus, upon applying a selection intensity in a population with an original mean value ($\mu_o$), a selected population is obtained with a mean value ($\mu_s$), in which the difference between these mean values is equivalent to the differential of selection (DS). By multiplying DS by the heritability of the trait ($h^2$), gain from selection is obtained ($GS = DS \times h^2$), which upon being added to $\mu_o$ will give rise to the predicted mean of the improved population in the next evaluation cycle (Falconer et al. 1996, Bernardo 2010).
In the application, the user can simulate numerous scenarios, modifying the allelic frequencies as a function of the phenotypic observations, thus allowing speculations in regard to the effect of the composition of the population, heritability of the trait, and the intensity of selection in a breeding population. The output is a table with the following information: allelic frequency, genotypic frequencies, the number of individuals observed, the number of individuals expected, and the $\chi^2$ test to verify Hardy-Weinberg equilibrium. In addition, the user can simulate selection by identifying the number of individuals selected in each genotypic class and the respective genetic values, obtaining the parameters $\mu_o$, $\mu_s$, DS, and GS as output.
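A short worked example may help fix these quantities. The sketch below is illustrative only; the function name `selection_gain` and the numeric values are hypothetical.

```python
def selection_gain(mu_o, mu_s, h2):
    """Return (DS, GS, predicted mean of the improved population).

    mu_o : mean of the original population
    mu_s : mean of the selected individuals
    h2   : heritability of the trait
    """
    ds = mu_s - mu_o  # differential of selection
    gs = ds * h2      # gain from selection
    return ds, gs, mu_o + gs


# Hypothetical trait: original mean 100, selected mean 120, heritability 0.5.
ds, gs, predicted = selection_gain(100.0, 120.0, 0.5)
print(ds, gs, predicted)
```

With a heritability of 0.5, only half of the superiority of the selected individuals is expected to be passed to the next cycle.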

Components of Genetic Variance
Genetic variance is composed of the variance of additive and non-additive effects, which fluctuate as a function of allelic frequencies (Bernardo 2010). To deal with these concepts, a function was developed that allows the user to indicate the frequency of the allele "A" (p), from which the frequency of allele "a" is obtained by difference (q = 1 - p). The user can also modify the additive effect (a) and dominance effect (d) and observe the effect of these factors (p, a, and d) on the magnitude of, and relationship between, total genetic variance and its additive and dominance components. The expressions used to estimate the variances are $\sigma^2_A = 2pq[a + d(q - p)]^2$ for the additive, $\sigma^2_D = (2pqd)^2$ for the dominance, and $\sigma^2_G = \sigma^2_A + \sigma^2_D$ for the total of a given gene locus (Falconer et al. 1996).
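The single-locus variance expressions of Falconer et al. (1996) can be evaluated with a small function such as the one below. It is a sketch for illustration, with a hypothetical function name, and it assumes q = 1 - p and the standard form a + d(q - p) for the average effect of allele substitution.

```python
def variance_components(p, a, d):
    """Additive, dominance, and total genetic variance at one locus.

    p : frequency of allele A (q = 1 - p)
    a : additive effect
    d : dominance effect
    """
    q = 1.0 - p
    alpha = a + d * (q - p)          # average effect of allele substitution
    var_a = 2 * p * q * alpha ** 2   # additive variance
    var_d = (2 * p * q * d) ** 2     # dominance variance
    return var_a, var_d, var_a + var_d


# Scan allele frequencies for a locus with complete dominance (a = d = 1).
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    var_a, var_d, var_g = variance_components(p, 1.0, 1.0)
    print(f"p={p:.1f}  additive={var_a:.4f}  dominance={var_d:.4f}  total={var_g:.4f}")
```

Running the scan shows that, under dominance, the additive component is largest when the favorable allele is rare and shrinks as p increases.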

Constructing Synthetic Populations
The number of alleles of a gene in a breeding population indicates variability, such that the greater the number of alleles present in the population, the greater the variability of the heterozygotes will be, according to the expression . However, this does not indicate that the number of heterozygotes in the population will increase indefinitely; that is, retention of heterozygosity (RH) reaches a plateau with a certain number of alleles. Although this value increases toward 1.00, the number of heterozygotes in the population will remain constant, as indicated by the expression $RH = 1 - \sum_i p_i^2$ (Nietlisbach et al. 2016). The objective is to maintain heterozygosity at high levels in an allogamous population, so as to exploit heterosis and avoid inbreeding depression (Falconer et al. 1996). Nevertheless, an excessive increase in the number of alleles can increase genetic variability to critical levels, which can impede selection and standardization of the population for important agronomic traits or descriptors.
In this function, the user provides the number of alleles of a gene (maximum of 10 alleles) and their respective initial frequencies in the composition of the population. The frequencies should sum to one. Retention of heterozygosity and the genetic variance of the heterozygotes can be observed in the output window as a result, showing that, although genetic variance increases the number of heterozygotes in the population, that number reaches a plateau and stabilizes. Thus, it is possible to identify the ideal composition of parents for formation of a population for numerous situations.
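The plateau in retention of heterozygosity can be seen by evaluating RH for an increasing number of alleles. The sketch below is illustrative; the function name is hypothetical, and equal allele frequencies are assumed only to keep the example simple.

```python
def retained_heterozygosity(freqs):
    """RH = 1 - sum of squared allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to one")
    return 1.0 - sum(p * p for p in freqs)


# From 2 to 10 alleles (the maximum accepted by the tab), all equally frequent.
for k in range(2, 11):
    rh = retained_heterozygosity([1.0 / k] * k)
    print(f"{k} alleles: RH = {rh:.3f}")
```

With equal frequencies RH equals 1 - 1/k, so each additional allele adds less than the previous one.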

a) Intrapopulational (IRS)
Intrapopulational recurrent selection is a breeding procedure that leads to an increase in the frequencies of alleles of interest in the population without, however, drastically reducing its variability, improving the performance per se of the population in each selective and recombination cycle (Bernardo 2010). In this context, Be-Breeder allows estimation of gain from selection (GS), effective size of the population (Ne), and inbreeding coefficient (F of Wright) for the IRS breeding arrangement in different selection scenarios, number of progenies evaluated, selection intensity, and heritability.
The response to intrapopulational recurrent selection, according to Falconer et al. (1996), is given by an expression for GS in which GS is gain from selection, i is the standardized selection differential, c and $D_1$ are values that depend on the selection arrangement by parental control (Table 1), $\sigma^2_a$ is additive genetic variance, $\sigma_p$ is the phenotypic standard deviation of the unit of selection, and ID refers to inbreeding depression given in percentage. The Ne parameter refers to the effective size of the population, given by the expression $Ne = Ne_{tab} \times N \times i$, in which $Ne_{tab}$ is the value dependent on the selection arrangement (Table 1), N is the total size of the population, and i is selection intensity.
The inbreeding coefficient F of Wright is estimated by $F = \frac{1}{2Ne}$.
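The two expressions for effective size and inbreeding can be chained as in the sketch below. The function names are hypothetical, and the value used for Ne tab is an arbitrary placeholder, since the tabulated values of Table 1 are not reproduced here.

```python
def effective_size(ne_tab, n, i):
    """Ne = Ne_tab * N * i, with Ne_tab taken from the selection arrangement."""
    return ne_tab * n * i


def inbreeding_coefficient(ne):
    """F of Wright estimated as 1 / (2 * Ne)."""
    return 1.0 / (2.0 * ne)


# Placeholder scenario: Ne_tab = 1 (not a tabulated value), 200 progenies, i = 0.10.
ne = effective_size(1.0, 200, 0.10)
print(ne, inbreeding_coefficient(ne))
```

Stronger selection (a smaller i in the Ne expression) reduces Ne and therefore raises the expected inbreeding per cycle.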

b) Reciprocal (RRS)
Reciprocal or interpopulational recurrent selection is a breeding arrangement that leads to an increase in complementarity between two heterotic groups or populations from crosses. The RRS brings about superior hybrids by crossing these groups in each selection cycle, in which intragroup selection and recombination of the most complementary parents increases the frequency of favorable alleles within each group (Bernardo 2010). Thus, as for IRS, in Be-Breeder it is also possible to simulate different selection arrangements, number of progenies evaluated, selection intensity, and heritability for RRS.
For that purpose, the response to reciprocal recurrent selection is estimated by an expression in which GS is gain from selection, i is the standardized selection differential for each group (1 and 2), c is a value that depends on the selective arrangement of parental control (Table 2), $\sigma^2_a$ is additive genetic variance, and $\sigma_p$ is the phenotypic standard deviation for each group (1 and 2) (Falconer et al. 1996). The effective size of the population (Ne) and inbreeding coefficient F of Wright are also provided for each heterotic group (1 and 2) in the output window; they are estimated in a manner similar to that described in the item IRS.

a) Prediction of Hybrids
Among the main products coming from plant breeding, hybrids stand out for the important role they play in the world economy (USDA 2016). Among them, single-cross hybrids (SH), three-way hybrids (TH), and double-cross hybrids (DH) (Shull 1910, Jones 1918) are the most representative products on the market. In this section, the user of Be-Breeder can find the predicted genotypic value of three-way hybrids (TH) and double-cross hybrids (DH) through the input of a .txt document containing the mean phenotypic dataset of the single-cross hybrids (SH) coming from the lines of interest.
For this purpose, the expressions of Jenkins (1934) are used as a basis, in which $HT_{(AB)C} = \frac{HS_{(AC)} + HS_{(BC)}}{2}$ and $HD_{(AB)(CD)} = \frac{HS_{(AC)} + HS_{(AD)} + HS_{(BC)} + HS_{(BD)}}{4}$, where HS, HT, and HD denote the single-cross, three-way, and double-cross hybrids, respectively. The sequence of the columns in the txt file must follow the example indicated in Table 3.
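The Jenkins (1934) predictions average the non-parental single crosses, which the sketch below makes explicit. The data structure, function names, and phenotypic means are hypothetical, and the input is a plain dictionary rather than the .txt layout expected by the application.

```python
def single_cross(sh_means, x, y):
    """Mean of the single cross between lines x and y, in either order."""
    return sh_means[frozenset((x, y))]


def three_way(sh_means, a, b, c):
    """Predicted value of (A x B) x C: mean of the crosses AC and BC."""
    return (single_cross(sh_means, a, c) + single_cross(sh_means, b, c)) / 2.0


def double_cross(sh_means, a, b, c, d):
    """Predicted value of (A x B) x (C x D): mean of AC, AD, BC, and BD."""
    crosses = [(a, c), (a, d), (b, c), (b, d)]
    return sum(single_cross(sh_means, x, y) for x, y in crosses) / 4.0


# Hypothetical single-cross means for four lines.
sh_means = {
    frozenset(("A", "B")): 8.0,
    frozenset(("A", "C")): 9.0,
    frozenset(("A", "D")): 7.0,
    frozenset(("B", "C")): 10.0,
    frozenset(("B", "D")): 8.0,
    frozenset(("C", "D")): 6.0,
}
print(three_way(sh_means, "A", "B", "C"))
print(double_cross(sh_means, "A", "B", "C", "D"))
```

Note that the single crosses AB and CD, which are the parents of the double cross, do not enter its prediction.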
Table 1. Selection arrangement in regard to intrapopulational recurrent selection for the population of evaluation, population of recombination (HS - half sibs, FS - full sibs, and S1 - self-pollination), c index, effective size ($Ne_{tab}$), and coefficient between additive and dominance effects of the homozygotes ($D_1$)
R Fritsche-Neto and FI Matias

b) Number of Hybrids
In this tab of the application, the user can obtain the possible number of single-cross hybrids (SH), three-way hybrids (TH), and double-cross hybrids (DH), given the number of lines (n) of a breeding population. This information can be obtained for a single population through the expressions of Vencovsky and Barriga (1992).

Genotype x Environment
In plant breeding, the statistical significance of the genotype × environment interaction component defines the selection strategy and commercial recommendation. The absence of interaction indicates that the environments of evaluation do not have a different influence on the behavior and on the ordering of the genotypes, and so it is sufficient to choose a single environment because the crop recommendation will be the same. Simple interaction indicates that the genotypes respond differently to environmental influences, but not enough for there to be a change in ordering, so the same commercial recommendation is maintained. In contrast, in complex interaction, there are changes in the ordering of genotypes among the environments, and a crop recommendation per location is necessary (Borém and Miranda 2013). In this context, Be-Breeder provides a tab for simulation of interactions among three genotypes in two environments, and it is possible to construct various scenarios, simulate recommendations, and make inferences regarding the implications of the G × E effect in the selection process and in data analysis. The user has columns that range from zero to five for each genotype/environment combination (genotype value of individual i in environment j), obtaining the mean values of genotypes and of environments separately as output, as well as graph visualization of each scenario.
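The three situations described above can be told apart with a simple rule of thumb, sketched below for a table of genotype values. This is an illustrative classification and not a statistical test of the interaction; the function names and example values are hypothetical, and ties between genotypes are not handled specially.

```python
def summarize(values):
    """values[g][e] is the value of genotype g in environment e.

    Returns the list of genotype means and the list of environment means.
    """
    n_env = len(values[0])
    genotype_means = [sum(row) / len(row) for row in values]
    environment_means = [sum(row[e] for row in values) / len(values) for e in range(n_env)]
    return genotype_means, environment_means


def ranking(values, env):
    """Genotype indices ordered from best to worst in one environment."""
    return sorted(range(len(values)), key=lambda g: values[g][env], reverse=True)


def interaction_type(values):
    """Classify the scenario as 'absent', 'simple', or 'complex'."""
    n_env = len(values[0])
    # Without interaction, every genotype shifts by the same amount between environments.
    shifts = [[row[e] - row[0] for row in values] for e in range(n_env)]
    if all(len(set(s)) == 1 for s in shifts):
        return "absent"
    # With interaction, check whether the ordering of genotypes changes.
    rankings = {tuple(ranking(values, e)) for e in range(n_env)}
    return "simple" if len(rankings) == 1 else "complex"


# Three genotypes in two environments, values on the 0 to 5 scale of the tab.
scenario = [[1, 5], [2, 3], [3, 1]]
print(summarize(scenario))
print(interaction_type(scenario))
```

In the example scenario the best genotype in the first environment is the worst in the second, so a recommendation per environment would be needed.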

Heterosis
According to the Dominance and Repulsion Hypothesis (Borém and Miranda 2013), the mean performance of hybrids (F1) in relation to the mean of the parents is related to allelic complementarity between the parents (P1 and P2), the genetic divergence between them, and the magnitude of the dominance deviations (Nass 2001). Because heterosis is observed mainly in quantitative traits, the number of genes that control them also has a certain influence on the magnitude of the heterosis observed. In this context, hybrid vigor in F1, also called biological heterosis (H), is estimated by the expression $H = F1 - \frac{P1 + P2}{2}$ (Falconer et al. 1996). Hybrid performance in relation to the superior parent (Best Parent - B.P), for its part, receives the name heterobeltiosis (Hb) or agronomic heterosis, and is estimated by $Hb = F1 - B.P$. In light of the foregoing, in this tab of the application, it is possible to simulate different scenarios as a function of the number of genes, of the dominance deviation, and of the divergence among the parents, making it possible to observe the fluctuation of heterosis and of heterobeltiosis for the simulated hybrids.
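Both measures of hybrid vigor reduce to simple differences, as the sketch below illustrates. It assumes that biological heterosis is the deviation of the F1 from the mid-parent value, F1 - (P1 + P2)/2, and that the best parent is the one with the higher value; the function names and the numbers are hypothetical.

```python
def heterosis(f1, p1, p2):
    """Biological heterosis: F1 minus the mean of the two parents (assumed form)."""
    return f1 - (p1 + p2) / 2.0


def heterobeltiosis(f1, p1, p2):
    """Agronomic heterosis: F1 minus the best parent (higher value assumed better)."""
    return f1 - max(p1, p2)


# Hypothetical hybrid and parental means.
f1, p1, p2 = 10.0, 6.0, 8.0
print(heterosis(f1, p1, p2))
print(heterobeltiosis(f1, p1, p2))
```

Heterobeltiosis is never larger than heterosis, because the best parent is at least as good as the mid-parent value.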

FINAL CONSIDERATION
The Learning section of the Be-Breeder application provides a teaching interface to the R software that is easy to handle and offers an alternative for understanding concepts and expressions of quantitative and population genetics applied in plant breeding. This tool can assist researchers, professors, and students in experiments, predictions, and academic studies, among other applications, both in learning and in teaching the principles that guide plant breeding.

For two groups (1 and 2), the number of SH, TH, and DH depends on the number of lines belonging to group 1 (a) and the number of lines belonging to group 2 (b), according to the expressions: $n^{o}_{HS} = a \times b$, $n^{o}_{HT} = a \times (a - 1) \times b + b \times (b - 1) \times a$, and $n^{o}_{HD} = a \times (a - 1) \times b \times (b - 1)$.

Table 2. Selective arrangement in reference to reciprocal recurrent selection for the population of evaluation, the population of recombination (TC - Testcross, HS - half sibs, FS - full sibs, and S1 - self-pollination), c index, tabulated effective size ($Ne_{tab}$)

Table 3. Example of .txt file for input in the Be-Breeder application in the Learning section, "Hybrid Effect" tab, so as to predict the phenotypic value expected of three-way and double-cross hybrids from the mean phenotypic value observed from the single-cross hybrids between the lines of interest