Genetic variation in oriental tobacco (Nicotiana tabacum L.) by agro-morphological traits and simple sequence repeat markers

The objectives of this study were to assess genetic diversity and to determine differences among oriental tobacco genotypes by examining both agro-morphological traits and molecular markers. A simple lattice design with two replications was used to evaluate 100 oriental tobacco genotypes. Analysis of variance showed a high level of genetic diversity in oriental-type tobaccos for morphological traits including number of leaves, days to 50% flowering, leaf length, leaf width, leaf fresh weight, leaf dry weight, stem height and stem girth. Classification of genotypes from agro-morphological data by the un-weighted pair-group method using arithmetic average (UPGMA) algorithm, based on squared standardized Euclidean distances, produced four distinguishable groups that followed their geographical distribution. In the molecular marker analysis, a total of 13 simple sequence repeat (SSR) primer pairs were used to detect polymorphism in the test germplasm. Thirty-five alleles were scored at the 13 SSR loci. The average number of alleles per locus (na) and the effective allele number (Ae) were 2.69 and 2.34, respectively. Pairwise Jaccard's similarity coefficients were calculated from the SSR data. Grouping of genotypes by UPGMA clustering of Jaccard's similarity coefficients led to three groups that did not correspond to their origins. The results revealed that the classifications based on agro-morphological traits and on SSR loci do not completely agree in oriental-type tobaccos. Because molecular markers are not influenced by environmental effects, heterotic groups based on SSR markers could be closer to reality.


INTRODUCTION
Tobacco (Nicotiana spp.) is one of the most important non-food crops and is widely cultivated worldwide (MOON et al., 2009). It belongs to the family Solanaceae; the genus comprises more than 64 species, of which Nicotiana tabacum is one of the most widely cultivated (REN; TIMKO, 2001). N. tabacum is a natural amphidiploid (2n = 4x = 48) that arose by hybridization of wild progenitor species (N. sylvestris × N. tomentosiformis). Numerous types of tobacco are defined by different criteria such as region of production, intended use in cigar (i.e., filler, binder and wrapper) and cigarette manufacturing, method of curing (flue-, air-, sun- and fire-cured tobacco) as well as morphological and biochemical characteristics (i.e., aromatic fire-cured, bright leaf tobacco, Burley tobacco, Turkish or oriental tobacco) (REN; TIMKO, 2001). Turkish or oriental tobacco has a much milder flavor and contains less nicotine and fewer carcinogens than other varieties (DAVIS; NIELSEN, 1999). To obtain an American Blend type of cigarette, it is mixed with more robust tobaccos such as Virginia and Burley.
The study of genetic diversity of tobacco is of interest for the conservation of genetic resources, the broadening of the genetic base and practical applications in breeding programs. Several kinds of traits, such as agro-morphological (WENPING et al., 2009; ZHANG, 1994; ZEBA; ISBAT, 2011), chemical and cytological traits (DARVISHZADEH et al., 2011; EL-MORSY et al., 2009; OKUMUS; GULUMSER, 2001), have already been used to study the genetic variation of tobacco germplasm. However, agro-morphological traits usually vary with the environment, and the number of chromosomal characters is limited (LU, 1997). With the emergence of molecular markers such as amplified fragment length polymorphism (VOS et al., 1995), simple sequence repeat (THOMAS; SCOTT, 1993) and inter simple sequence repeat (PRADEEP REDDY et al., 2002), it is possible to evaluate the genetic divergence of plant germplasm in greater detail. Accordingly, several studies have used molecular markers (DAVALIEVA et al., 2010; JULIO et al., 2006; REN; TIMKO, 2001; YANG et al., 2007; YAO ZHANG et al., 2008) to reveal the genetic diversity of N. tabacum. Molecular markers are stable and detectable in all tissues, regardless of growth, differentiation, development or stage of the cell. They are not subject to environmental, pleiotropic or epistatic effects (AGARWAL et al., 2008; MOOSE; MUMM, 2008). With the advent of high-density SSR maps for tobacco, it is feasible to estimate genetic variation with a large number of markers that are well distributed across the tobacco genome (BINDLER et al., 2007). Recently, SSRs, which are reproducible, codominant and multi-allelic markers with wide genome coverage, have been successfully employed to reveal the genetic variation of chewing tobacco genotypes (SIVA RAJU, 2011). Davalieva et al. (2010) classified 10 tobacco genotypes into three groups using 24 microsatellite markers. The aim of this research was to employ the SSR technique and agro-morphological traits simultaneously to assess the genetic variation of different local and exotic oriental tobacco genotypes belonging to the Urmia Tobacco Research Center of Iran.

Field experiment
One hundred genotypes of tobacco (Nicotiana tabacum L.) with different growth types and origins were investigated under field conditions (Table 1). Tobacco seeds were sown in beds at a rate of approximately 5 g m⁻². After sowing, a fine layer of well-fermented and sieved sheep manure was spread on top of the beds. The seedlings were transplanted to plots when they averaged about 12 cm in height. The experiment was conducted in a simple square lattice design (10 × 10) with two replications. Each plot comprised three rows of 5 m, with a spacing of 65 × 20 cm. The plants were not topped, in contrast to common practice with most other tobacco types (such as Virginia and Burley). The agro-morphological traits were plant height (PH), stem girth (SG), leaf number (LN), leaf length (LL), leaf width (LW) and days to 50% flowering (DF), recorded on five randomly chosen plants under full competition per plot (KARA; ESENDAL, 1995). Dry leaf yield and fresh leaf yield were evaluated on all plants of each plot, excluding border plants (KARA; ESENDAL, 1995).

DNA extraction and polymerase chain reaction
Because leaf samples were available for only 70 of the 100 genotypes, total DNA was extracted from the leaves of these 70 genotypes following the method described by Doyle and Doyle (1987). The concentration of the DNA samples was determined spectrophotometrically at 260 nm (BioPhotometer 6131; Eppendorf, Hamburg, Germany). The quality of the DNA was checked by running 1 µl of DNA in 0.8% (w/v) agarose gels in 0.5X TBE buffer (45 mM Tris base, 45 mM boric acid, 1 mM EDTA, pH 8.0). DNA samples that gave a smear in the gel were rejected.
Thirteen SSR primer pairs out of 278 from the tobacco SSR database (BINDLER et al., 2007) were used for DNA fingerprinting (Table 2). The SSR markers were chosen for the clarity of the bands they produced. Polymerase chain reaction (PCR) was performed in a 20 µl volume using a 96-well Eppendorf Mastercycler Gradient (Type 5331, Eppendorf AG, Hamburg, Germany). The reaction mixture contained 2.5mM of each primer (Table 2), 0.4 U of Taq DNA polymerase (CinnaGen Inc., Tehran, Iran), 100 µM of each dNTP (BioFluxbiotech, http://biofluxbiotech.com), 2 µl of 10X PCR buffer, 2 mM MgCl2 (CinnaGen, Tehran, Iran), ddH2O and 25 ng of template DNA. Amplification was carried out for 35 cycles, each consisting of a denaturation step at 94 ºC for 1 min, annealing at 55 ºC for 1 min and an extension step at 72 ºC for 1.5 min. An initial denaturation step at 94 ºC for 4 min and a final extension step of 10 min at 72 ºC were also included. The reaction products were mixed with an equal volume of formamide dye (98% formamide, 10 mM EDTA, 0.05% bromophenol blue and 0.05% xylene cyanol), resolved in a 3% (w/v) agarose gel in 0.5X TBE buffer, stained with 1.0 µg ml⁻¹ ethidium bromide and photographed under UV light using a Gel-Doc image analysis system (Gel Logic 212 PRO, USA).
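As a quick arithmetic check on the cycling programme described above, the following minimal sketch adds up the programmed hold times and the final buffer concentration. It uses only the values stated in the text; ramp times between temperatures are ignored, and the helper names are illustrative rather than part of any instrument software.

```python
# Thermal profile as stated in the text: (step, temperature in deg C, hold time in min).
INITIAL_DENATURATION = ("initial denaturation", 94, 4.0)
CYCLE = [
    ("denaturation", 94, 1.0),
    ("annealing", 55, 1.0),
    ("extension", 72, 1.5),
]
N_CYCLES = 35
FINAL_EXTENSION = ("final extension", 72, 10.0)


def total_hold_minutes():
    """Sum of programmed hold times; ramping between steps is not counted."""
    per_cycle = sum(minutes for _, _, minutes in CYCLE)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


def final_concentration(stock, stock_volume_ul, reaction_volume_ul):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    return stock * stock_volume_ul / reaction_volume_ul


total = total_hold_minutes()               # 4 + 35 * 3.5 + 10 minutes
buffer_x = final_concentration(10, 2, 20)  # 2 ul of 10X buffer in a 20 ul reaction
print(f"hold time: {total} min, buffer: {buffer_x}X")
```

Under these assumptions the programme holds for a little over two and a quarter hours, and the 10X buffer ends up at 1X in the 20 µl reaction.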

Data analysis
Analysis of variance (ANOVA) and descriptive statistics were computed for all genotypes based on agro-morphological traits using the general linear model in SAS 9.13 software (SAS Institute, Cary, NC). To compare the classification obtained from agro-morphological data with that obtained from marker data, the 70 genotypes that also had marker data were subjected to cluster analysis. Genotypes were classified from agro-morphological data by the un-weighted pair-group method using arithmetic average (UPGMA) algorithm based on squared Euclidean distances. Before the squared Euclidean distances were calculated, the data were standardized to a mean of zero and a variance of one. Data processing was performed using SPSS 15.00 statistical software (SPSS/PC-15, SPSS Inc., Chicago, IL, USA; http://www.spss.com). The pseudo F statistic and the pseudo T² statistic (JOBSON, 1992) were examined using SAS 9.13 software to establish the optimum number of morphological clusters (data not shown).
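The clustering procedure for the morphological data (standardize each trait, compute squared Euclidean distances, then merge by average linkage) can be sketched in a few lines. This is an illustration only, not the SPSS procedure used in the study: the four genotypes and their trait values are invented, the population variance is assumed for standardization (SPSS may use the sample variance), and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def standardize(rows):
    """Z-score every trait (column) to mean 0 and variance 1 (population variance assumed)."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def squared_euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering.

    dist maps frozenset({a, b}) to the distance between two genotypes.
    Returns the merge history as (cluster_1, cluster_2, average_distance).
    """
    def avg(c1, c2):
        # UPGMA distance: mean over all pairs of members of the two clusters.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((c1, c2, avg(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical trait values (plant height in cm, leaf number); not data from the study.
traits = {
    "G1": [100.0, 20.0],
    "G2": [102.0, 21.0],
    "G3": [180.0, 40.0],
    "G4": [178.0, 42.0],
}
names = list(traits)
z = dict(zip(names, standardize([traits[n] for n in names])))
dist = {frozenset((a, b)): squared_euclidean(z[a], z[b]) for a, b in combinations(names, 2)}
merges = upgma(names, dist)
for c1, c2, d in merges:
    print(c1, c2, round(d, 3))
```

Standardizing first keeps a trait measured on a large scale (such as plant height in cm) from dominating the distances over a trait with a small numeric range (such as leaf number).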
For the marker data, the amplification products were scored for the presence (1) or absence (0) of bands across the 70 genotypes to construct a binary data matrix. Several indices, including the mean number of alleles per locus ($n_a$), effective allele number ($A_e$), allele frequency, gene flow ($N_m$), observed heterozygosity ($H_o$) and expected heterozygosity ($H_e$), were estimated using GenAlEx 6.41 software (PEAKALL; SMOUSE, 2006) according to the following equations:

$$n_a = \frac{1}{n}\sum_{i=1}^{n} n_{a_i} \qquad (1)$$

where $n_{a_i}$ is the number of alleles at the $i$th locus and $n$ is the number of loci;

$$A_e = \frac{1}{\sum_{i} P_i^{2}} \qquad (2)$$

where $A_e$ is the effective allelic number at a locus, and $P_i$ is the frequency of the $i$th allele at that locus (HARTL; CLARK, 1997);

$$\text{Allele frequency} = \frac{2N_{XX} + N_{XY}}{2N} \qquad (3)$$

which was calculated locus by locus, where $N_{XX}$ is the number of homozygotes for allele X (XX), $N_{XY}$ is the number of heterozygotes containing allele X (Y can be any other allele), and $N$ is the number of samples (HARTL; CLARK, 1997);

$$N_m = \frac{0.25\,(1 - F_{ST})}{F_{ST}} \qquad (4)$$

where $F_{ST}$ represents the degree of population genetic differentiation (FRANKHAM et al., 2004);

$$H_o = \frac{1}{n}\sum_{i=1}^{n} H_{oi} \qquad (5)$$

where $H_{oi}$ represents the observed heterozygosity of the $i$th locus (the proportion of heterozygous genotypes at that locus) (HARTL; CLARK, 1997);

$$H_e = \frac{1}{n}\sum_{i=1}^{n} H_i, \qquad H_i = 1 - \sum_{j} q_{ij}^{2} \qquad (6)$$

where $H_i$ is the expected heterozygosity of the $i$th locus, and $q_{ij}$ is the frequency of the $j$th allele at the $i$th locus (LYNCH; MILLIGAN, 1994).
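The diversity indices defined above are simple functions of allele counts and frequencies. The sketch below assumes the standard forms of these indices (effective allele number as the reciprocal of the sum of squared frequencies, expected heterozygosity as one minus that sum, and gene flow from F_ST); it is not GenAlEx code, and the function names are illustrative. The allele-count example uses only totals reported in the Results (35 alleles over 13 loci, each with 2 or 3 alleles), which implies nine loci with three alleles and four with two; which loci carry which count is not stated here. The genotype counts are invented.

```python
def mean_alleles_per_locus(allele_counts):
    """Mean number of alleles per locus (n_a)."""
    return sum(allele_counts) / len(allele_counts)


def allele_frequency(n_xx, n_xy, n_samples):
    """Frequency of allele X from homozygote and heterozygote counts."""
    return (2 * n_xx + n_xy) / (2 * n_samples)


def effective_allele_number(freqs):
    """A_e = 1 / sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 / sum(p * p for p in freqs)


def expected_heterozygosity(freqs):
    """H_i = 1 - sum(q_ij^2) at one locus, assuming Hardy-Weinberg equilibrium."""
    return 1.0 - sum(p * p for p in freqs)


def gene_flow(f_st):
    """N_m = 0.25 * (1 - F_ST) / F_ST."""
    return 0.25 * (1.0 - f_st) / f_st


# Totals from the Results: 13 loci, 2 or 3 alleles each, 35 alleles overall.
allele_counts = [3] * 9 + [2] * 4
mean_na = mean_alleles_per_locus(allele_counts)

# Hypothetical biallelic locus scored in 70 samples: 30 XX, 10 XY, 30 YY.
p_x = allele_frequency(n_xx=30, n_xy=10, n_samples=70)
freqs = [p_x, 1.0 - p_x]
print(round(mean_na, 2), p_x, effective_allele_number(freqs), expected_heterozygosity(freqs))
```

With two equally frequent alleles, the effective allele number equals the observed number (2) and the expected heterozygosity reaches its maximum for a biallelic locus (0.5); uneven frequencies pull both values down.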
It was assumed that allele frequencies within the population were in Hardy-Weinberg equilibrium. Genetic similarity among individuals was calculated using Jaccard's similarity coefficient (JACCARD, 1908). Dendrograms were constructed by the un-weighted pair group method using arithmetic average (UPGMA) algorithm. The efficiency of the clustering algorithms and their goodness of fit were determined from co-phenetic correlation coefficients. The significance of the co-phenetic correlation was tested using the Mantel matrix correspondence test (MANTEL, 1967). All of these analyses were performed using the NTSYS-pc version 2.11 software package (ROHLF, 1998).
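Jaccard's coefficient for two binary band profiles is the number of bands present in both genotypes divided by the number present in at least one of them; shared absences are ignored. The sketch below illustrates this on invented profiles. It is not NTSYS-pc code, the genotype labels are hypothetical, and returning 0.0 when neither profile has any band is an assumption made here to avoid division by zero.

```python
def jaccard(a, b):
    """Jaccard similarity between two 0/1 band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("band profiles must have the same length")
    shared = sum(1 for x, y in zip(a, b) if x and y)   # band present in both
    present = sum(1 for x, y in zip(a, b) if x or y)   # band present in either
    return shared / present if present else 0.0       # assumption: empty profiles -> 0.0


def similarity_matrix(profiles):
    """Pairwise Jaccard similarities keyed by (genotype, genotype)."""
    names = list(profiles)
    return {(p, q): jaccard(profiles[p], profiles[q]) for p in names for q in names}


# Hypothetical presence/absence scores for six bands; not data from the study.
profiles = {
    "G1": [1, 0, 1, 1, 0, 0],
    "G2": [1, 1, 1, 0, 0, 0],
    "G3": [0, 0, 0, 0, 1, 1],
}
sim = similarity_matrix(profiles)
print(sim[("G1", "G2")], sim[("G1", "G3")])
```

A similarity of 0 therefore means that two genotypes share no scored band at all, while 1 means identical band profiles.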

Agro-morphological traits
According to the univariate statistical analysis (Table 3), there is a wide range of genetic variation among oriental-type tobacco genotypes for all studied agro-morphological traits, which indicates that selection among genotypes is possible for tobacco improvement.
Leaf number varied from 8.7 to 52 and leaf length ranged from 19.1 to 52.5 cm. Leaf width, fresh leaf yield and dry leaf yield ranged from 10.7 to 33.3 cm, 1.6 to 26.4 kg and 0.4 to 5 kg, respectively. Stem height and stem girth varied from 70 to 198.7 cm and from 4.7 to 10.3, respectively. Days to 50% flowering ranged from 23 to 134 days.
The maximum (29.7) and minimum (0.73) standard deviations corresponded to stem height and dry leaf yield, respectively. Relatively large variation was detected for the studied traits (Table 3). The utility of univariate statistical techniques for identifying genetic diversity in tobacco has been reported by Wenping et al. (2009) and Zeba and Isbat (2011). An evaluation of fifteen diverse tobacco genotypes showed statistically significant differences among genotypes for all studied morphological traits, namely plant height, leaf number, leaf length, leaf width, stem girth and days to 50% flowering (ZEBA; ISBAT, 2011). Genetic variation for traits comprising leaf appearance, percentage of dry matter, leaf area index, leaf number and leaf length was also reported in an F2 population of Burley tobaccos (HONARNEJAD; SHOAIE-DEYLAMI, 2004).
There are also reports of genetic variation in tobacco for qualitative traits such as nicotine content (TSO et al., 1983) and sodium, potassium and chlorine concentration in the leaf (DARVISHZADEH et al., 2011; TSO et al., 1983), as well as for susceptibility to diseases such as stem rot (ELLIOT et al., 2007) and powdery mildew (DARVISHZADEH et al., 2010).
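Classification of genotypes based on agro-morphological traits using the UPGMA clustering algorithm separated them into four main groups (Figure 1). Cluster I included two tobacco genotypes originating from Yugoslavia that had a distinguishable morphological performance under field conditions. Genotypes C.H.T.209.12.e and C.H.T.269.12.e, which belong to Mazandaran province of Iran, formed cluster II. Breeding lines known as SPT, derived from landraces of northwestern Iran by the single seed descent method, were located in cluster III. The other studied genotypes, which had several sympatric origins, grouped into cluster IV. Wenping et al. (2009) and Zeba and Isbat (2011) used multivariate statistical analyses such as cluster analysis and principal component analysis to group tobacco genotypes and to identify important agro-morphological traits. Identifying groups of genotypes separated by large distances could help in recognizing parental lines that might produce hybrid vigor in breeding programs. Several reports (ALEKSOSKI, 2010; KARA; ESENDAL, 1995) showed heterosis for several agro-morphological traits in tobacco, such as stem height, leaf number and dry leaf yield.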
In this research, the clustering of oriental-type tobacco based on morphological traits agreed with the geographical distribution and growth characteristics of the genotypes. There is therefore acceptable genetic diversity within oriental tobacco genotypes, in agreement with the report of Darvishzadeh et al. (2011) based on chlorine concentration in leaves.

SSR markers
As with the agro-morphological traits, high molecular genetic variability was observed among the genotypes studied, in agreement with the findings of Ren and Timko (2001), Yang et al. (2007) and Davalieva et al. (2010) obtained with AFLP, ISSR and SSR markers, respectively. The number of alleles detected at each SSR locus varied from 2 to 3, and a total of 35 alleles were detected over all loci (Table 4). The mean number of alleles per locus was 2.7, which is comparable to the average of 3 alleles per locus reported by Davalieva et al. (2010) in Macedonian tobacco. No rare allele (an allele detected only once among the 70 genotypes) was found in the studied tobacco genotypes.
The effective number of alleles varied from 1.50 to 2.96 (Table 4). The small differences between the observed and effective numbers of alleles (Table 4) indicate that the standard deviation among allele frequencies at each SSR locus is low. The observed homozygosity values ranged from 0.48 at locus PT30008 to 1.00 at loci PT30021, PT30241, PT30250, PT30202, PT30172, PT30165, PT30126 and PT30034, with an average of 0.88 across all loci. The observed heterozygosity values ranged from 0.00 at loci PT30126, PT30034, PT30172, PT30165, PT30241, PT30250, PT30202 and PT30021 to 0.52 at locus PT30008, with an average of 0.11 across all loci (Table 4). The expected homozygosity values ranged from 0.33 at locus PT30014 to 0.67 at locus PT30165, with an average of 0.43 across all loci (Table 4). The expected heterozygosity values ranged from 0.33 at locus PT30165 to 0.67 at locus PT30014, with an average of 0.55 across all loci (Table 4).
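The reported ranges can be cross-checked against each other. Assuming the usual single-locus definitions (effective allele number as the reciprocal of the sum of squared allele frequencies, expected heterozygosity as one minus that sum), the two quantities are linked by H = 1 - 1/A_e, and homozygosity is simply one minus heterozygosity. The sketch below applies these identities to the extreme values quoted above; the function names are illustrative.

```python
def he_from_ae(a_e):
    """Expected heterozygosity implied by an effective allele number at one locus."""
    return 1.0 - 1.0 / a_e


def homozygosity(heterozygosity):
    """Homozygosity is the complement of heterozygosity."""
    return 1.0 - heterozygosity


he_low = he_from_ae(1.50)         # smallest reported effective allele number
he_high = he_from_ae(2.96)        # largest reported effective allele number
hom_pt30008 = homozygosity(0.52)  # observed heterozygosity at locus PT30008

print(round(he_low, 2), round(he_high, 2), round(hom_pt30008, 2))
```

Under these assumptions the effective allele numbers of 1.50 and 2.96 imply expected heterozygosities of about 0.33 and 0.66, close to the reported range of 0.33 to 0.67; the small difference at the upper end may simply reflect rounding of the tabulated values. The observed heterozygosity of 0.52 at PT30008 likewise matches the reported observed homozygosity of 0.48.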
The genetic similarity based on Jaccard's similarity coefficient varied from a maximum of 0.92 (between genotypes Pobeda1 and C.H.T.209.12e) to a minimum of 0.00 (between genotypes C.H.T.269-12e and SPT 405), with an average of 0.32. There is thus a wide range of genetic variation among the oriental-type genotypes. These diverse similarity values suggest that this collection is a valuable tobacco germplasm that has not yet been exposed to degradation. The lowest similarity (0.00) was found between C.H.T.269-12e and SPT 405, which originated from two different regions of Iran with very different geographical conditions. The SPT series genotypes are of dwarf type with a short flowering period, which distinguishes them markedly from others such as C.H.T.269-12e. Genotype C.H.T.209.12e, from the northern Iranian province of Mazandaran, had the highest similarity with genotype Pobeda1, which originated from Russia, close to the northern provinces of Iran. Classification of genotypes based on SSR data using the UPGMA clustering method separated them into three main clusters (Figure 2). The first, second and third clusters comprised 1.4%, 88.6% and 10% of the genotypes, respectively. In this study, the clustering of oriental tobacco genotypes based on molecular marker data did not follow the grouping based on agro-morphological traits. This is similar to the finding of Yao Zhang et al. (2008). Since 77% of the total genomic DNA in cultivated tobacco is composed of repetitive sequences (NARAYAN, 1987), only a small amount of non-repetitive DNA in the tobacco genome is responsible for variation in morphological and quality traits. It is therefore recommended to use functional markers such as EST-SSRs, as in barley (SALEM et al., 2010), to achieve a precise evaluation of tobacco germplasm. In contrast to the agro-morphological traits, the clusters established from SSR marker data did not completely agree with the geographical distribution of the oriental-type genotypes. Yao Zhang et al. (2008) also reported that a dendrogram constructed from RAPD and AFLP markers in flue-cured tobaccos did not show any clear pattern of geographical origin.
Considering both the agro-morphological and the SSR marker classifications, some genotypes tend to fall in the same group, which is not unexpected because all of the studied genotypes are considered oriental or semi-oriental tobacco genotypes.

CONCLUSIONS
Several studies in the genus Nicotiana have used growth characteristics and cytogenetic attributes to describe genetic diversity, but little information is available on the combined use of agro-morphological traits and molecular marker data to reveal genetic variation within N. tabacum. Both agro-morphological traits and SSR markers showed significant variation within oriental-type tobacco germplasm. The structure of genetic diversity of oriental-type tobacco genotypes, as revealed by SSR markers, does not follow their geographical origins. The classifications based on agro-morphological traits and on SSR markers do not completely agree. Because marker data are not affected by the environment, they could be used effectively in tobacco heterosis breeding programs.

Figure 1 - UPGMA clustering of oriental tobacco genotypes based on squared Euclidean distances using agro-morphological traits

Figure 2 - UPGMA clustering of oriental tobacco genotypes based on Jaccard's similarity coefficient using SSR data

Table 1 - Name and origin of tobacco genotypes

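Table 2 - Name, sequence, linkage group and position of 13 SSR primers applied to 70 oriental tobacco genotypes
a Genetic distance from the upper telomere, estimated according to the framework genetic linkage map of tobacco (BINDLER et al., 2007)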

Table 3 - Variation observed among the tobacco genotypes for the traits under study
** Significant at a level of 1%

Table 4 - Parameters across simple sequence repeat loci in oriental tobacco germplasm
na = observed number of alleles; Ae = effective number of alleles; Obs Het = observed heterozygosity; Exp Het = expected heterozygosity