Detection and mapping of a lethal locus in a eucalyptus hybrid population

The objective of this work was to verify the existence of a lethal locus in a eucalyptus hybrid population, and to quantify the segregation distortion in linkage group 3 of the Eucalyptus genome. An E. grandis x E. urophylla hybrid population, which segregates for rust resistance, was genotyped with 19 microsatellite markers belonging to linkage group 3 of the Eucalyptus genome. To quantify the segregation distortion, maximum likelihood (ML) models specific to outbreeding populations were used. These models consider the observed marker genotypes and the lethal locus viability as parameters. The ML solutions were obtained using the expectation-maximization algorithm. A lethal locus in linkage group 3 was verified and mapped, with high confidence, between the microsatellites EMBRA 189 and EMBRA 122. This lethal locus causes intense gametic selection on the male side. Its map position is 25 cM from the locus that controls rust resistance in this population.


Introduction
Many mapping studies for different species have reported a large number of loci showing segregation distortion (SDL). These loci do not segregate according to Mendelian laws. Segregation distortion (SD) may be caused by a variety of factors, including hybrid sterility or incompatibility, and nuclear-cytoplasmic interaction. For simplicity, all of these factors are collectively called lethal factors (Song et al., 2006).
In most cases, SD is detected when a large number of molecular markers is analyzed. In these circumstances, marker segregation distortion results from the elimination of certain types of gametes or even of zygotes. Such elimination is controlled by a lethal factor located in the region neighboring the marker (Cheng et al., 1998). Segregation distortion may also arise from duplicated markers (Frisch et al., 2004), which may complicate or even prevent the correct identification of the alleles belonging to each locus, or from genotyping errors (Vogl & Xu, 2000).
The occurrence of SD in at least one locus of a linkage group leads to biased estimates of the distance between pairs of loci, decreasing the resolution of the linkage map, and thus complicating the identification and mapping of quantitative trait loci (QTL) (Song et al., 2006). In order to eliminate such bias, loci displaying SD are typically removed from the framework set of markers in genetic and QTL mapping. Unfortunately, one cannot rule out the possibility that some important QTL may reside near a distorted locus. Therefore, when the distorted markers are removed, linked QTL can be missed as well, and this can cause a relevant loss of information in QTL mapping (Vogl & Xu, 2000). Linkage group 3 of the Eucalyptus genome is an example. In this linkage group, a QTL related to rust resistance was recently mapped and validated (Mamani et al., 2010; Rosado et al., 2010; Alves et al., 2011). However, comparisons between this linkage group and the others, in many independent studies, showed that it has the largest number of distorted markers. Not surprisingly, it also holds a large mapping gap, which prevents the QTL position from being precisely determined. Collectively, these facts suggest the existence of a putative lethal locus in this group.
Consistent with this, the SDL of linkage group 3 were often discarded from mapping analyses (Byrne et al., 1995; Marques et al., 1998; Thamarus et al., 2002, 2004; Myburg et al., 2003). When they are retained, such loci are marked with an asterisk next to their names on the map, to indicate that they are distorted. Most importantly, the possible causes of the distortion are neglected in most cases (Grattapaglia & Sederoff, 1994; Brondani et al., 2002, 2006; Myburg et al., 2003). Young et al. (2000) and Missiaggia (2005) attributed the elevated SD of the microsatellite loci to a specific coincidence between deleted alleles and genes linked to the markers. However, these authors did not attempt to detect and position the locus with lethal effect.
Segregation distortion loci should be retained in the analysis, using methods that treat them as informative, in genomic studies of regions in which distorted markers occur. This applies especially to studies intended to detect and map QTL that may be influenced by partially lethal factors. Retaining these loci is necessary to obtain more reliable information for assisted-selection procedures (Rocha et al., 2010).
The objective of this work was to verify the existence of a lethal locus in a eucalyptus hybrid population, and to quantify the segregation distortion in the linkage group 3 of the Eucalyptus genome.

Materials and Methods
The analyses were based on 135 F1 individuals belonging to a single full-sib family (outbreeding population). These individuals were obtained from a controlled cross between the female genitor 7074 (E. grandis) and the male genitor 1213 (E. urophylla x E. grandis hybrid). Young, healthy leaves from the parent trees and from the progeny were collected in a greenhouse and transported to the laboratory of molecular genetics of microorganisms, Bioagro, of the Universidade Federal de Viçosa, MG, Brazil.
A genotypic segregation test was carried out in order to verify the segregation ratio expected for each locus. To verify which of the parents produced gametes with distorted frequencies, a gametic segregation test was performed with the GQMol software (Cruz, 2010), using the chi-square ($\chi^2$) test at 5% probability.
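The gametic test described above can be illustrated with a minimal sketch, assuming a 1:1 expected transmission of the two alleles of a heterozygous parent (one degree of freedom). This is not the GQMol implementation; the function name and the counts are hypothetical and serve only as an illustration.

```python
import math


def gametic_chi_square(o1, o2):
    """Chi-square test of a 1:1 gametic segregation (1 degree of freedom).

    o1, o2: numbers of progeny that received each allele of one parent.
    Returns the chi-square statistic and its p-value.
    """
    n = o1 + o2
    expected = n / 2.0
    chi2 = (o1 - expected) ** 2 / expected + (o2 - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (not taken from the tables of this study).
chi2, p_value = gametic_chi_square(95, 40)
is_distorted = p_value < 0.05  # 5% probability level, as in the text
print(f"chi2 = {chi2:.3f}, p = {p_value:.2e}, distorted = {is_distorted}")
```

Running the test separately on the alleles transmitted by each parent indicates which parent produced the distorted gametes.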
Detection of SDL depends on the use of appropriate ordinary genetic mapping methods. The distortion rate was estimated in an outbreeding population, considered initially as a cross between two heterozygous parents ($A_iA_j \times A_kA_l$). To estimate the segregation distortion in markers that segregated in the first parent, the gametic frequencies were taken as $f(A_i) = 0.5 + s$ and $f(A_j) = 0.5 - s$ for the first parent, and $f(A_k) = f(A_l) = 0.5$ for the second parent, in which $s$ is the segregation distortion rate.
Using these segregation distortion estimators, the expected genotypic frequencies are given by: $f(A_iA_k) = 0.5(0.5 + s)$, $f(A_iA_l) = 0.5(0.5 + s)$, $f(A_jA_k) = 0.5(0.5 - s)$, and $f(A_jA_l) = 0.5(0.5 - s)$. The gametic segregation distortion rate in this population was estimated as $s = [O_1/(O_1 + O_2)] - 0.5$, in which $O_1$ is the number of individuals carrying the allele $A_i$, and $O_2$ is the number of individuals carrying the allele $A_j$.
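The estimator and the expected genotypic frequencies above can be written as a short sketch. The function names and the counts are hypothetical; only the formulas come from the text.

```python
def distortion_rate(o1, o2):
    """Gametic distortion rate s = O1 / (O1 + O2) - 0.5."""
    return o1 / (o1 + o2) - 0.5


def expected_genotype_frequencies(s):
    """Expected progeny frequencies for AiAj x AkAl when only the first
    parent transmits its alleles with distortion s."""
    return {
        "AiAk": 0.5 * (0.5 + s),
        "AiAl": 0.5 * (0.5 + s),
        "AjAk": 0.5 * (0.5 - s),
        "AjAl": 0.5 * (0.5 - s),
    }


# Hypothetical counts: 90 progeny carry Ai and 45 carry Aj.
s = distortion_rate(90, 45)
freqs = expected_genotype_frequencies(s)
print(round(s, 4), {g: round(f, 4) for g, f in freqs.items()})
```

With no distortion (s = 0) every genotype has frequency 0.25, which recovers the Mendelian 1:1:1:1 ratio expected for this cross type.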
Maximum likelihood (ML) methods are quite useful for genetic mapping and QTL detection. The maximum likelihood estimate of an unknown parameter ($\Theta$) is the value of $\Theta$ which corresponds to the maximum of $L(\Theta; x)$, i.e. the value of $\Theta$ which is "most likely" to have produced the data $x$. According to the notation used by Lynch & Walsh (1998) and Schuster & Cruz (2004): $L(\Theta; x) = \prod_{i=1}^{n} f(x_i; \Theta)$, in which $\Theta$ is the vector of unknown parameters; $x$ is the observed data; and $n$ is the number of observations.
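As a worked illustration of the ML principle, consider the simplest case: a single marker whose alleles $A_i$ and $A_j$ are transmitted with frequencies $0.5 + s$ and $0.5 - s$, with $O_1$ and $O_2$ progeny observed in each class. Assuming independent binomial sampling (a simplification of the full model used in this work), maximizing the log-likelihood recovers the estimator of $s$ given above:

```latex
\begin{align*}
\ln L(s) &= O_1 \ln(0.5 + s) + O_2 \ln(0.5 - s) + \text{const} \\
\frac{d \ln L}{d s} &= \frac{O_1}{0.5 + s} - \frac{O_2}{0.5 - s} = 0 \\
O_1 (0.5 - s) &= O_2 (0.5 + s) \\
\hat{s} &= \frac{O_1 - O_2}{2\,(O_1 + O_2)} = \frac{O_1}{O_1 + O_2} - 0.5
\end{align*}
```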
In SDL detection, the likelihood function is the product of the individual probability density functions over the N observations, which are distributed among the genotypic classes of the segregating population. Liu (1998) demonstrated that marker segregation follows the multinomial distribution. In the notation used by Schuster & Cruz (2004), $x$ is a random variable and $n_i$ is the $i$th event.
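For reference, the standard multinomial probability function can be written as follows. The symbols $k$ (number of genotypic classes), $p_i$ (expected frequency of class $i$) and $n$ (total number of individuals) are assumptions introduced here, with $n_i$ read as the number of individuals observed in class $i$; the notation of the cited authors may differ.

```latex
\[
P(n_1, \ldots, n_k) = \frac{n!}{\prod_{i=1}^{k} n_i!} \prod_{i=1}^{k} p_i^{\,n_i},
\qquad \sum_{i=1}^{k} n_i = n, \quad \sum_{i=1}^{k} p_i = 1
\]
```

For the fully informative cross $A_iA_j \times A_kA_l$, $k = 4$ and the $p_i$ are the expected genotypic frequencies $0.5(0.5 \pm s)$.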
SDL detection was based on the use of likelihood functions to estimate the genetic distance between the markers and the SDL, based on the conditional probabilities of the marker genotypes: $P(Q \mid A_iB_jC_k) = P(Q \cap A_iB_jC_k) / P(A_iB_jC_k)$, in which $P(Q \mid A_iB_jC_k)$ is the conditional probability of manifesting the binary phenotype, given the occurrence of the $i$th genotype at locus A, the $j$th genotype at locus B, and the $k$th genotype at locus C; $P(Q \cap A_iB_jC_k)$ is the joint probability of manifesting the binary phenotype together with the $i$th genotype at locus A, the $j$th genotype at locus B, and the $k$th genotype at locus C; and $P(A_iB_jC_k)$ is the marginal probability of occurrence of the genotype $A_iB_jC_k$ in the population.
Since $\theta$ is a function of $s$ and $r$, the allelic frequencies at the marker locus may be obtained by $f(B) = 0.5 + s - 2rs$ and $f(b) = 0.5 - s + 2rs$, in which $\theta = s - 2rs$ is used to estimate the distance, in recombination frequency, between the marker locus and the lethal gene. This distance was converted to cM with Kosambi's map function.
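The relation $\theta = s - 2rs$ follows if the favoured allele at the lethal locus is transmitted with frequency $0.5 + s$ and the marker allele B is in coupling with it, since then $f(B) = (1 - r)(0.5 + s) + r(0.5 - s) = 0.5 + s - 2rs$. The minimal sketch below inverts this relation for $r$ and applies Kosambi's map function. It assumes that $s$ at the lethal locus is already known, whereas in this work it is estimated by ML; the function names and the numerical values are hypothetical.

```python
import math


def recombination_from_distortion(theta, s):
    """Solve theta = s - 2*r*s for r.

    theta: distortion rate observed at the marker.
    s: distortion rate at the lethal locus (assumed known here).
    """
    if s == 0:
        raise ValueError("s must be non-zero to locate the lethal locus")
    return 0.5 * (1.0 - theta / s)


def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical values: s = 0.30 at the lethal locus, theta = 0.24 at a marker.
r = recombination_from_distortion(0.24, 0.30)
print(f"r = {r:.3f}, distance = {kosambi_cm(r):.2f} cM")
```

A marker with $\theta = s$ gives $r = 0$ (complete linkage), and $\theta$ approaches zero as $r$ approaches 0.5, which is the distortion gradient that the mapping procedure exploits.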
The marker segregation ratio was verified using the chi-square test. The linkage groups were formed based on the maximum recombination frequency ($r_{max}$ = 30%) and on the minimum LOD ($LOD_{min}$ = 3). The best marker order was estimated by the sum of adjacent recombination fractions (SARF) method.
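The SARF criterion can be sketched as an exhaustive search: every candidate order is scored by the sum of the recombination fractions between neighbouring markers, and the order with the smallest sum is kept. The marker names and recombination fractions below are hypothetical, and the brute-force search is an illustrative choice that is only practical for a handful of markers; it is not necessarily the search strategy used in this work.

```python
from itertools import permutations


def sarf(order, rf):
    """Sum of adjacent recombination fractions for one marker order."""
    return sum(rf[frozenset(pair)] for pair in zip(order, order[1:]))


def best_order(markers, rf):
    """Return the order with minimal SARF (an order and its reverse tie)."""
    return min(permutations(markers), key=lambda order: sarf(order, rf))


# Hypothetical pairwise recombination fractions for three markers.
rf = {
    frozenset(("M1", "M2")): 0.10,
    frozenset(("M2", "M3")): 0.15,
    frozenset(("M1", "M3")): 0.22,
}
order = best_order(["M1", "M2", "M3"], rf)
print(order, round(sarf(order, rf), 2))
```

Keying the table by `frozenset` makes each pairwise value independent of the direction in which the pair is read.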

Results and Discussion
Out of the 19 microsatellite loci previously selected to screen linkage group 3, nine loci were polymorphic [...] (1998), this type of crossing is considered to be fully informative, and it is possible to identify the origin of all alleles in relation to both parents. The amplification pattern of the microsatellite marker EMBRA 189 exemplifies this crossing type (Figure 1). The markers EMBRA 125, EMBRA 181 and EMBRA 350, on the other hand, displayed a crossing type that is not fully informative ($A_iA_j \times A_iA_k$, type VI), even though both parents are heterozygous. In this case the progeny segregates for three alleles, and it is not possible to trace them all, as can be seen in Figure 1 for EMBRA 125. The marker EMBRA 115 was classified as type IV ($A_iA_j \times A_kA_k$). In this case, the population segregated for three allelic forms. The marker EMBRA 171 was classified as type III ($A_iA_i \times A_iA_j$); the population segregated for two allelic forms, and the alleles of the heterozygous parent (1213) could be distinguished in the progeny.
The individual chi-square test showed that eight of the microsatellite loci from linkage group 3 had distorted segregation, considering the expected Mendelian genotypic proportions for each crossing type. Only the microsatellite locus EMBRA 115 showed the expected Mendelian segregation pattern (Table 1). In order to identify the parent which produced the gametes with segregation distortion, the chi-square test was also used to check the segregation pattern for each genitor (Table 2). All marker loci, except EMBRA 350, showed gametic distortion for the male genitor 1213. The literature reports that the linkage group 3 of the [...] et al., 1998; Brondani et al., 2002, 2006; Myburg et al., 2003; Missiaggia, 2005). On the map constructed by Brondani et al. (2002) using 50 SSR markers, the number of markers per linkage group varied from two (group 3) to eight (group 10), and on the map later generated by the same authors (Brondani et al., 2006), using 230 SSR markers, the number of markers varied from 12 (group 3) to 25 (groups 5 and 8). Besides the low saturation of linkage group 3, it is interesting to note that it also shows a large gap in the region where Rosado et al. (2010) previously mapped a QTL for rust (Puccinia psidii) resistance. These authors suggested that the elevated segregation distortion of the microsatellite loci was due to the specific amplification of deleterious alleles linked to SSR markers.
Based on these facts, and in order to elucidate not only the consequences but also the origin of segregation distortion in loci of linkage group 3, it is possible to propose the hypothesis that a putative lethal gene acts in this specific group. This hypothesis was first supported by the genotypic segregation test (Table 1) and the gametic segregation tests (Table 2). The genotypic segregation test detected segregation distortion in 42% of the loci, and the gametic segregation test showed that the distortion occurs during formation of the male gamete (prezygotic selection) in all loci. Furthermore, the occurrence of a distortion gradient across all markers of linkage group 3 attests to the existence of a selective factor, and excludes the possibility that the observed distortion is due to genotyping errors or to the genotyping of duplicated markers, since in these cases the distortion would be displayed in only one or a few specific loci. The other loci belonging to the same linkage group, but far apart from the lethal locus, would retain the expected patterns of Mendelian segregation (Frisch et al., 2004). Moreover, according to Song et al. (2006), when a locus is under selection, markers linked to it will exhibit segregation distortion only and exclusively because of the indirect action of the linked loci, which was attested in the present work.
Considering that a gene with lethal effect causes a distortion gradient near its position in the genome, the distortion rates were used to estimate the position of this kind of gene within linkage group 3 (Table 3). The distance estimates confirmed the occurrence of a selective factor. The adjustment of the maximum likelihood functions allowed for the positioning of the lethal factor (Figure 2). The observed distortion gradient (Table 3) corroborates the results of Pereira et al. (1994) and Xu et al. (1997), who verified that marker loci located near loci under selection showed higher rates of SD, and that the SD rates decreased as the distance increased. Marker loci linked to genes under selection will exhibit an indirect effect according to their linkage with the locus under selection (Song et al., 2006), as attested in the present work by the positioning of the lethal-factor locus closer to the markers EMBRA 122 and EMBRA 189, the ones with the highest rates of SD. Therefore, the segregation distortion observed for loci of linkage group 3 is, in fact, caused by a lethal gene that acts through prezygotic selection.
The algorithm for estimating the distortion rates of outbreeding populations, developed in this work, was essential to detect the distortion gradient in the loci of linkage group 3. Marker loci in linkage group 3 were positioned according to the distance estimates shown in Table 3. The sum of adjacent recombination fractions was used to estimate the best marker order considering the lethal factor. The resulting linkage map shows the position of the lethal factor (Figure 2). The total length of linkage group 3 was 57.14 cM, and the longest distance between adjacent markers (EMBRA 122 and EMBRA 189) was 23.13 cM. The lethal gene was mapped within this interval, 10.49 cM from the EMBRA 189 locus and 12.64 cM from the EMBRA 122 locus. Based on the information provided by Rosado et al. (2010), who studied this same hybrid population, the lethal locus was mapped 25 cM from the locus which controls rust resistance in eucalyptus. The map constructed for linkage group 3 is consistent with the reference map of Eucalyptus (Brondani et al., 2006), showing that the proposed mapping procedure based on distortion rates is adequate. With the methodology proposed in the present work, markers with segregation distortion can be included in genomic analyses, providing more accurate maps and better positioning of QTL.
To our knowledge, this is the first work to detect and locate a lethal gene in an outbreeding population. The model considered the deviation from the expected allelic frequencies and estimated the position of the lethal locus based on an EM algorithm. It proved to be efficient and can be applied, with some modifications, to all types of populations. Lethal loci mapping, however, has been extensively explored in line-crossing experiments (Hedrick & Muona, 1990; Mitchell-Olds, 1995; Cheng et al., 1996, 1998; Vogl & Xu, 2000). Cheng et al. (1996) presented a method for estimating the recombination values between a partial lethal locus and linked molecular markers solely by using marker segregation data of an F2 population. Cheng et al. (1998) expanded this methodology, so that it could be applied to backcross and to doubled-haploid populations. Hedrick & Muona (1990) developed a flanking marker analysis to estimate the fitness parameters of a viability locus in a complete recessive model. Mitchell-Olds (1995) adopted the idea of interval mapping by examining one putative viability locus at a time and then scanning the entire genome for every putative position, in order to provide a visual presentation of the LOD test statistic profile for identification of the viability locus. More recently, Luo et al. (2005) combined the complete recessive model of Hedrick & Muona (1990) and the dominance model of Mitchell-Olds (1995) to formulate a consensus model which allows for simultaneous estimation and testing of the degree of dominance.
Considering that an important QTL for rust resistance was mapped in linkage group 3 (Mamani et al., 2010; Rosado et al., 2010), the lethal-factor locus characterized in the present study can be especially important for Eucalyptus breeding.

Conclusions
1. The segregation distortion observed in linkage group 3 of the Eucalyptus genome is a consequence of a lethal locus affecting gamete formation.
2. The lethal locus maps to linkage group 3, between the microsatellites EMBRA 189 and EMBRA 122, 25 cM from the locus which controls rust resistance in the evaluated cross.
3. Distortion rates, potentially caused by a lethal locus, can be used for genetic mapping, and provide more accurate maps and better positioning of QTL.

Figure 2. Linkage group 3 map based on distortion rates of the microsatellite marker loci. The values to the left correspond to the intervals between the markers, in cM (Kosambi's map function). The values in parentheses correspond to the distance of the marker loci in relation to the lethal gene (L/l).

Table 2. Gametic segregation test of eight microsatellite loci with significant segregation distortion, genotyped in the progeny of the cross between the Eucalyptus female genitor 7074 (♀) and the male genitor 1213 (♂).

Table 3. Distortion rates ($\theta$) and genetic distances, estimated for each marker locus, considering the number of observed individuals with allele type 1 ($A_i$) and type 2 ($A_j$), in a Eucalyptus outbreeding population.