Selective genotyping for discovery of QTL controlling flowering time in dolichos bean (Lablab purpureus L.)

Abstract Dolichos bean is grown in environments with short and long crop growth seasons. Developing and deploying cultivars whose flowering time (FT) matches the prevailing crop growth season helps maximize productivity in the environments to which they are targeted. Dependable knowledge of the genetic basis of FT enables the use of the most appropriate selection strategy to breed cultivars with the desired FT. We investigated this genetic basis by detecting QTL controlling FT with SSR markers, following a selective genotyping strategy (SGS) in an F2 mapping population (MP). We also evaluated the effectiveness of SGS relative to an entire-MP genotyping strategy (EGS) for detecting QTL controlling FT. Our results suggest that alleles at two SSR markers (LPD 25 and LPD 190) are linked to QTL controlling FT under both SGS and EGS. They also provide evidence that the statistical power of SGS is comparable to that of EGS for detecting QTL controlling FT.

B Gonal et al.

INTRODUCTION
The transition from the vegetative to the reproductive phase is widely known as flowering time (FT). It is a key factor determining optimal yield in crops (Mathan et al. 2016), including dolichos bean. Dolichos bean is one of the popular cool-season grain legumes extensively grown in southern India. It is grown predominantly for fresh pods, which are the harvestable and marketable economic product. Immature beans, after removal of the fresh pod coat, are used as a vegetable in various culinary preparations. Dry beans are also used in various culinary preparations, especially in dry seasons. Both fresh and dry beans contribute to the protein requirements of millions of people who depend on a vegetarian diet for their energy requirement (Ramesh and Byregowda 2016). The crop is grown in arid and semi-arid production environments with short and long crop growth seasons. It is therefore necessary to develop cultivars whose FT matches the prevailing crop growth season, so as to maximize productivity in the production environments to which they are targeted (Keerthi et al. 2014). Thorough and dependable knowledge of the genetic basis of FT enables the use of the most appropriate selection strategy to breed dolichos bean cultivars with the desired FT. To date, however, few attempts have been made to unravel the genetic basis of FT in dolichos bean. One way to do so is to detect and map QTL controlling FT using DNA-based markers. This approach enables simultaneous detection of QTL and identification of markers linked to them. Such linked markers facilitate marker-assisted selection (MAS), which enhances the pace and precision of breeding dolichos bean cultivars with the desired FT.
Mapping QTL requires phenotypic and genotypic data from mapping populations (MPs) derived from parents contrasting for the trait of interest, together with a large number of DNA markers. Both phenotyping and genotyping demand substantial resources, and most QTL detection experiments are performed with fixed resources. Genotyping MPs with DNA markers is often much more expensive than phenotyping most quantitative traits of interest (Ronin et al. 1998). A judicious allocation of fixed and limited resources between phenotyping and genotyping is therefore necessary to make QTL detection cost-effective (Lee et al. 2014). Against this backdrop, several approaches have been proposed to increase the statistical power to detect QTL per individual genotyped, at the expense of the power per individual phenotyped. These approaches include DNA sample pooling, selective genotyping (SG) and sequential sampling (Lebowitz et al. 1987, Lander and Botstein 1989, Darvasi and Soller 1992, Motro and Soller 1993, Darvasi and Soller 1994, Xu and Vogl 2000, Zou et al. 2016). Of these approaches, SG is widely used to detect QTL controlling target quantitative traits. SG tests the significance of differences in allele frequencies at DNA marker loci between groups of individuals drawn from the two extremes of the quantitative trait distribution in an MP (Lebowitz et al. 1987). A significant difference in marker allele frequency is considered evidence for linkage between the marker locus and a QTL controlling the target trait. This is because the SG approach is based on the hypothesis that the extreme phenotypes chosen for genotyping harbour a higher frequency of favourable alleles at multiple loci controlling the trait of interest (Sun et al. 2010, Bernardo 2020). The rationale of this hypothesis is that, in a QTL MP, some individuals contribute more marker-trait QTL linkage information than others.
The extreme phenotypes are the most informative for detecting marker-trait QTL linkage. To illustrate, individuals whose phenotypes lie at least one standard deviation from the trait mean of the MP account for 81% of the linkage information (Lebowitz et al. 1987, Lander and Botstein 1989, Darvasi and Soller 1992). Lebowitz et al. (1987) and Gallais et al. (2007) developed the theoretical basis and explained the experimental design and analytical procedure for testing the significance of differences in marker allele frequencies between extreme phenotypic classes defined on the basis of target quantitative trait values. As only individuals with extreme phenotypes are genotyped, the SG approach saves considerable resources. Hence, SG is regarded as a cost-effective alternative to genotyping the entire MP. When the entire MP is genotyped, significant differences in trait means among groups of individuals classified by marker genotype are considered evidence for marker-trait QTL linkage. Hence, the strategy of genotyping the entire MP is also called the marker-based approach (Lebowitz et al. 1987).
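The concentration of linkage information in the phenotypic tails can be illustrated with a short calculation. This is a minimal sketch under two assumptions that are ours rather than the cited authors': the trait is normally distributed, and the information an individual contributes is proportional to its squared standardized deviation from the mean. The function names are hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tail_shares(k):
    """Return (share of individuals, share of squared deviation) beyond +/- k SD.

    Assumes information is proportional to z**2 (an illustrative assumption).
    The integral of z**2 * pdf(z) from k to infinity equals
    k * pdf(k) + (1 - cdf(k)), and the total over the real line is 1.
    """
    upper_tail = 1.0 - normal_cdf(k)
    share_individuals = 2.0 * upper_tail
    share_information = 2.0 * (k * normal_pdf(k) + upper_tail)
    return share_individuals, share_information


share_individuals, share_information = tail_shares(1.0)
print(f"individuals beyond 1 SD: {share_individuals:.3f}")
print(f"squared deviation beyond 1 SD: {share_information:.3f}")
```

Under these assumptions, roughly one third of the individuals carry about four fifths of the total squared deviation, which is close to the figure cited above.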
However, the efficiency of SG-based QTL detection relative to the commonly used marker-based approach has not been verified empirically in dolichos bean. Moreover, most researchers attempt to detect QTL in MPs derived from crosses between elite parents and exotic trait-donor parents. QTL detected in such MPs are relevant only for introgression of the target trait from the donor parent, not for selection within the breeding population (BP) (Jannink et al. 2001). In contrast, QTL detected in an MP derived from elite parents can be used directly to select superior genotypes within the BP for use as cultivars (Wurschum 2012, Cui et al. 2015). The objective of our study was to assess the efficiency of detecting QTL controlling FT using SG, as compared with the marker-based approach, in an F2:3 breeding population derived from elite parents in dolichos bean.

MATERIALS AND METHODS

Development of F2 mapping population
Two elite genotypes, HA 4 and HA 5, differing in flowering time by 6-8 days (Ramesh et al. 2018), were used to develop the F2 MP. HA 4 is determinate, whereas HA 5 is indeterminate in growth habit. Indeterminacy is dominant over determinacy (Basanagouda et al. 2022). Seeds of the two genotypes were space-planted in a crossing block at the experimental plot of the Department of Genetics and Plant Breeding (GPB), College of Agriculture (CoA), University of Agricultural Sciences (UAS), Bangalore, India, during the 2020 rainy season. To effect the cross, flowers of HA 4 were emasculated by hand in the evening and pollinated the following morning with pollen collected from HA 5. A total of 15 well-filled F1 seeds were obtained from HA 4 × HA 5. The F1 seeds were space-planted during the 2020 post-rainy season. All the F1 seeds germinated, and the plants survived to maturity. The indeterminate growth habit of all the F1 plants indicated their true hybridity. The F1 plants were selfed, and the selfed seeds were planted to raise the F2 population. A total of 144 F2 plants survived to maturity. All 144 F2 plants were selfed in the 2021 summer season to obtain F2:3 families.

Genotyping F2 individual plants
The parents (HA 4 and HA 5) of the F2 population were initially screened with 634 simple sequence repeat (SSR) markers to identify those polymorphic between them. Of these, 86 SSR markers were polymorphic between the parents. Leaf samples from 10-day-old F2 plants were ground to a fine powder in liquid nitrogen. DNA was isolated from the powder using the CTAB method (Khairallah and Hoisington 1994), and the plants were genotyped with the 86 polymorphic SSR markers.

Phenotyping F2:3 families
Seeds of the 144 F2:3 families were sown in single rows of 3 m length, with 0.6 m between rows, in two replications following a simple lattice design during the 2021 rainy and post-rainy seasons. Ten days after sowing, the seedlings were thinned to maintain 0.3 m spacing between plants. About 10-12 plants survived to maturity in each of the 144 F2:3 families. The recommended package of practices was followed to establish healthy plants. Days to flowering (referred to as FT) was recorded on 5 randomly selected plants in each of the 144 F2:3 families in each replication, as the number of days from sowing to the day on which 50% of the plants had flowered.

Statistical analysis
The average FT of the 5 plants in each replication was used for statistical analysis. The extreme 15 of both late-flowering and early-flowering plants were selected for detecting QTL controlling FT using the SG approach. The SG approach is based on the principle that, if a marker is unlinked to a QTL controlling FT, the expected frequency of a marker allele in both the late- and early-flowering groups is 0.50. If a marker is linked to a QTL controlling FT, the frequencies of alleles at that marker in the early- and late-flowering groups will differ significantly from 0.50. The frequencies of alleles at each of the 86 polymorphic SSR markers were estimated in each of the extreme 15 late- and early-flowering groups of plants using MS Excel. Differences between the two groups in allele frequencies at each of the 86 polymorphic SSR markers were tested using a two-tailed two-sample t-test, following the formula of Lebowitz et al. (1987), under the null hypothesis that the marker allele frequency is 0.5 in either extreme group.
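The comparison of marker allele frequencies between the two extreme groups can be sketched as follows. This is an illustration only and does not reproduce the formula of Lebowitz et al. (1987): it assumes co-dominant F2 genotype codes ('A', 'H', 'B'), treats the 2n alleles in a group as independent draws with frequency 0.5 under the null hypothesis, and uses a normal approximation in place of the t-test. The genotype lists and the group size of 15 are hypothetical.

```python
import math


def allele_frequency(genotypes):
    """Frequency of the parent-A allele among F2 codes 'A', 'H', 'B'.

    Returns (frequency, number of scored plants); other codes are treated
    as missing data.
    """
    scored = [g for g in genotypes if g in ("A", "H", "B")]
    if not scored:
        raise ValueError("no scored genotypes")
    copies = sum(2 if g == "A" else 1 if g == "H" else 0 for g in scored)
    return copies / (2 * len(scored)), len(scored)


def sg_test(late, early):
    """Two-tailed test of the allele frequency difference between groups.

    Assumption: under H0 every allele is a Bernoulli(0.5) draw, so the
    variance of a group frequency is 0.25 / (2 * n).
    """
    p_late, n_late = allele_frequency(late)
    p_early, n_early = allele_frequency(early)
    se = math.sqrt(0.25 / (2 * n_late) + 0.25 / (2 * n_early))
    z = (p_late - p_early) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal p-value
    return p_late, p_early, z, p_value


# Hypothetical marker scores for 15 late- and 15 early-flowering plants.
late = ["B"] * 9 + ["H"] * 5 + ["A"] * 1
early = ["A"] * 8 + ["H"] * 6 + ["B"] * 1
p_late, p_early, z, p_value = sg_test(late, early)
print(f"late={p_late:.3f} early={p_early:.3f} z={z:.2f} p={p_value:.2g}")
```

With 86 markers tested, a correction for multiple testing would normally be applied to the per-marker p-values; that step is omitted here for brevity.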

Validation of detected QTL and its linked marker
The genotypic and phenotypic data of F2:3 progeny families evaluated in two seasons were integrated to detect QTL controlling FT, initially using single marker regression (SMR) analysis. Subsequently, simple interval mapping (SIM) and inclusive composite interval mapping (ICIM) were performed to detect QTL controlling FT and estimate their effect sizes separately in each season. The accuracy of QTL position and the significance of the QTL effects were determined using data-driven threshold LOD scores obtained from 1000 permutations (Churchill and Doerge 1994). All these statistical analyses were implemented using QTL IciMapping software version 4.0.
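The logic of single marker regression with a permutation-based LOD threshold can be sketched in a few lines. This is a simplified illustration, not the QTL IciMapping implementation: it assumes an additive marker code (-1, 0, 1), complete data, and unlinked markers, and all data below are simulated and hypothetical.

```python
import math
import random


def smr_lod(codes, phenotypes):
    """LOD score from regressing phenotype on an additive marker code."""
    n = len(phenotypes)
    mean_x = sum(codes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in codes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(codes, phenotypes))
    rss_null = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or rss_null == 0:
        return 0.0  # monomorphic marker or constant trait: no information
    rss_full = rss_null - sxy ** 2 / sxx
    if rss_full <= 0:
        return float("inf")  # perfect fit
    return (n / 2) * math.log10(rss_null / rss_full)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=1):
    """Empirical (1 - alpha) quantile of the genome-wide maximum LOD.

    Phenotypes are shuffled relative to genotypes, which breaks any
    marker-trait association while preserving both distributions.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(smr_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = max(0, round((1 - alpha) * n_perm) - 1)
    return max_lods[index]


# Simulated example: 144 families, 10 unlinked markers, QTL at marker 0.
rng = random.Random(42)
n_families, n_markers = 144, 10
markers = [rng.choices([-1, 0, 1], weights=[1, 2, 1], k=n_families)
           for _ in range(n_markers)]
phenotypes = [45.0 + 3.0 * g + rng.gauss(0.0, 3.0) for g in markers[0]]

lods = [smr_lod(m, phenotypes) for m in markers]
threshold = permutation_threshold(markers, phenotypes)
print(f"LOD at marker 0: {lods[0]:.2f}; 5% permutation threshold: {threshold:.2f}")
```

Taking the maximum LOD across all markers within each permutation is what makes the threshold genome-wide rather than per-marker.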

RESULTS AND DISCUSSION
Allele frequencies differed significantly between the late- and early-flowering groups of the F2 population at two SSR markers (LPD 25 and LPD 190) in the 2021 rainy season and at one SSR marker (LPD 25) in the 2021 post-rainy season (Table 1). The significant difference in allele frequencies at these two marker loci is attributed to hitchhiking between alleles at the QTL controlling FT and those at the linked SSR marker loci. This is because the frequency of QTL alleles that increase FT will increase in the late-flowering group, and the frequency of those that decrease FT will increase in the early-flowering group. Consequently, the frequencies of marker alleles coupled with QTL alleles controlling early and late flowering will differ significantly (Lebowitz et al. 1987, Lander and Botstein 1989, Darvasi and Soller 1992, Motro and Soller 1993, Darvasi and Soller 1994). Our results suggest that alleles at these two SSR markers (LPD 25 and LPD 190) are in linkage disequilibrium (LD) with those at the QTL controlling FT. This LD was confirmed by QTL mapping using the data generated from genotyping the entire F2 MP (Figure 1) and phenotyping the 144 F2:3 families. The QTL controlling FT was detected on linkage group 2 (Figure 2) with a LOD score of ~5.0 (Table 2). The phenotypic variance explained by this QTL was comparable and consistent across the two seasons, regardless of the analytical tool (SMR/SIM/ICIM) used to detect the QTL and estimate the variance (Table 3). This QTL was flanked, within an interval of 53.89 cM, by the same two linked markers (LPD 25 and LPD 190) detected by the SG approach. The forward and reverse sequences and the annealing temperatures of the two linked SSR markers are provided in Table 4. However, the additive effects of the detected QTL were rather small (Table 2).
The detection of a small-effect QTL in our study is not surprising, given that the F2 MP used in the investigation was derived from elite parents. In such material, arguably no major-effect QTL alleles remain segregating, owing to their fixation driven by domestication (Doebley 2006) coupled with a long history of selection (Bernardo 2020). Hence, an F2 population derived from elite × elite parents is likely to segregate for only a few minor-effect QTL alleles controlling FT (Wurschum 2012, Bernardo 2020). Our results are therefore justifiable, considering that the SG approach is particularly useful for detecting small-effect QTL (Sun et al. 2010). Thus, our results provide evidence that the statistical power of SG is equivalent to, if not better than, that of the marker-based approach for detecting QTL controlling FT in dolichos bean. Ayoub and Mather (2002) demonstrated that the SG approach was sufficient to detect all of the grain and malt quality QTL that were identified using the marker-based approach in barley. Sun et al. (2010) using simulation and empirical data, Lebowitz et al. (1987) and Lee et al. (2014) using simulation data, and Abdel-Haleem et al. (2011), Masojc et al. (2016), and Myskow and Stojalowski (2016) using empirical data demonstrated comparable statistical power of SG and marker-based approaches for detecting linkage between alleles at QTL and marker loci.

Implications in dolichos bean breeding
The SG approach is particularly recommended for detecting small-effect QTL controlling traits that are easy and inexpensive to phenotype, such as FT in our study. It has been shown that the SG approach never results in loss of accuracy compared with the marker-based approach. SG is, therefore, a cost-effective alternative to the marker-based approach for discovering QTL controlling quantitative traits such as FT. This is because the cost of SG is only 6% of that of the marker-based approach (Sun et al. 2010). The saved resources could be reallocated to detect useful QTL in a range of MPs using the SG approach. Further, with the cost of genotyping decreasing drastically owing to the availability of next-generation sequencing technologies, it is now possible to genotype a small number of individuals with extreme trait phenotypes, selected or retained from the large number of BPs routinely developed in practical crop breeding programmes, and to detect QTL using the SG approach (Navabi et al. 2009). This strategy accords well with the argument that QTL detected from unselected random individuals of MPs may not be directly relevant to plant breeding. This is because QTL detected in designed MPs most often may not segregate in, and so are not directly relevant to, the BPs used for selection (Wurschum 2012, Cui et al. 2015). To facilitate simultaneous QTL detection and implementation of MAS, QTL must be detected in the very BPs in which selection is practiced (Wurschum 2012, Cui et al. 2015, Korontzis et al. 2020, Li and Xu 2021).