The dynamics of genome size and GC contents evolution in genus Nicotiana

Abstract Hybridization and polyploidization are among the most common phenomena observed in plants, especially in the genus Nicotiana, and lead to genome duplication. Although the genomic changes associated with these events have been studied at various levels, variation in genome size and GC content is less well understood because sufficient genomic data are lacking. In this study, flow cytometry was used to estimate the genome size and GC content of 46 Nicotiana species, and the genomic changes associated with hybridization events were compared along an evolutionary time scale. Genome size among Nicotiana species varied between 3.28 pg and 11.88 pg, whereas GC content varied between 37.22% and 51.25%. The tetraploid species of the genus Nicotiana, including sections Polydicliae, Repandae, Nicotiana, Rustica and Suaveolentes, showed both up- and downsizing of their genomes when compared with the sum of the genomes of their ancestral species. The genome sizes of the three homoploid hybrids were close to those of their ancestral species. A greater loss of genomic sequence was observed in the evolutionarily older species (>10 Myr) than in the recently evolved ones (<0.2 Myr). GC content was homogeneous, with a mean difference of 2.46% among the Nicotiana species. We conclude that genome size changed in either direction, whereas GC content remained more homogeneous in the genus Nicotiana.

hybridization events respectively, involving different diploid ancestral species (Knapp et al., 2004). Nicotiana is a model genus for understanding polyploidization in plants and has long been used to explore many of the evolutionary processes involved in allopolyploidization (Clarkson et al., 2005; Leitch et al., 2008). So far, the parental lineages of almost all Nicotiana allotetraploids (including section Suaveolentes) and homoploid hybrid species (N. linearis, N. spegazzinii and N. glauca) have been documented on the basis of morphological (Goodspeed, 1954), cytological and plastid sequence data and, most recently, nuclear coding sequence data (Clarkson et al., 2010; Kelly et al., 2010; Kelly et al., 2013). The age of each tetraploid section in the genus Nicotiana has also been documented (Knapp et al., 2004; Clarkson et al., 2005; Clarkson et al., 2010). In addition, the genomic resources for Nicotiana species have grown over the last few years, with draft genome sequences available for the most important species (N. benthamiana, N. tabacum, N. tomentosiformis and N. sylvestris), along with increasing knowledge of the diversity of repetitive DNA elements (Koukalova et al., 2010), karyotypic studies (Marks et al., 2011) and genome size evolution (Leitch et al., 2008; Renny-Byfield et al., 2011; Renny-Byfield et al., 2013).
Given the importance of this genus, which offers opportunities not available in other angiosperm groups, this study was carried out to estimate the genome size and GC content of different Nicotiana species by flow cytometry and to examine the overall extent of genome size change (up- and downsizing) and GC content variation.

Plant material
The seeds of all Nicotiana species (Table 2) were provided by the germplasm bank of the Tobacco Research Institute of the Chinese Academy of Agricultural Sciences, Beijing, China. The seeds of the standard plants were obtained from Jaroslav Doležel, Institute of Experimental Botany, Czech Republic (Table 1). All Nicotiana species and standard plants were grown under controlled conditions in a glasshouse, and leaf samples were harvested for analysis.

Introduction
Polyploid and homoploid hybridization are two important evolutionary phenomena acting at the species level. These processes have regularly contributed to the diversification of plant species. The evolutionary consequences of hybridization events have been studied at various levels, such as chromosomal rearrangements, repetitive DNA sequence evolution, genome size change and diploidization (Hegarty and Hiscock, 2008; Baack et al., 2005; Leitch et al., 2008; Renny-Byfield et al., 2011; Renny-Byfield et al., 2013). The genome size changes associated with hybridization and polyploidization, and variation in genomic GC content, have been subjects of immense interest. Recently, numerous studies have provided novel insights into the potential basis of genome size evolution in plants (Bennett and Leitch, 2011; Veselý et al., 2012). Similarly, the range of GC content in the major plant species studied is narrow, except for grasses, which exhibit remarkable GC content heterogeneity (Barow and Meister, 2003; Šmarda et al., 2012). It has also been shown that knowledge of the dynamics and magnitude of GC base composition is persistently lacking in plants (Tatarinova et al., 2010; Serres-Giardi et al., 2012).
Polyploidization can induce rapid genomic changes, including the gain or loss of DNA, but the magnitude and timing of such changes are not well understood (Baack et al., 2005). In this regard, the genus Nicotiana is a particularly suitable candidate, as it consists of several sections of allotetraploids formed at different times from their diploid ancestors, and the age of each section has been well estimated (Clarkson et al., 2005; Leitch et al., 2008). The genus also includes section Suaveolentes, in which multiple chromosome fusions have resulted in chromosome number reduction (Chase et al., 2003; Leitch et al., 2008). Homoploid hybridization also plays a significant role in species diversity in plants. The genomic changes associated with homoploid hybrid speciation have been reported in Helianthus and Paeonia (Rieseberg and Willis, 2007; Paun et al., 2009) but not in the homoploid hybrid species of the genus Nicotiana (Clarkson et al., 2010; Kelly et al., 2010). Such a study in plants requires that the genome size, evolutionary origin and age of all polyploid groups be known (Lim et al., 2007). At the same time, a scattered and dichotomous view of the pattern of GC content evolution in plants has emerged from studies of a few representative species of monocots and dicots (Wong et al., 2002; Wang and Roossinck, 2006; Serres-Giardi et al., 2012). Moreover, GC content has so far been reported for a limited number of plant species (Veselý et al., 2012), especially at lower taxonomic levels (genus level). In this regard, flow cytometry offers a reliable method for estimating GC content (Šmarda et al., 2012).
The genus Nicotiana comprises 76 species, of which 35 are allotetraploid and the rest are diploid (including the three recently identified homoploid hybrids). The polyploid and homoploid hybrid species in the genus Nicotiana have evolved through different polyploidization and interspecific

Sample preparation
Fresh leaf tissue (50 mg) from both the standard and the sample plant was co-chopped in a plastic Petri dish with a sharp razor blade in 500 µl of ice-cold Otto I buffer supplemented with 2% mercaptoethanol. The suspension of nuclei was filtered through a 30 µm disposable filter (Partec) and stained with 2 ml of the respective fluorochrome buffer for 5 minutes in the dark. The staining buffer for genome size estimation consisted of 1 ml Otto II buffer supplemented with 50 µg propidium iodide and 50 µl RNase I, whereas for AT-specific staining 1 ml Otto II buffer was supplemented with 5 µl of DAPI.

Genome size and nucleotide contents estimation
The nuclear suspensions stained with propidium iodide were analyzed on a flow cytometer (Cube, Partec, Germany). The peaks were set to a proper position on the abscissa, and parameters such as threshold level and gain were adjusted; with a flow speed of 0.5 µl/s, approximately 10,000 nuclear particles were measured. The genome size was calculated by the method of Doležel et al. (2007). The GC contents were calculated using the most widely accepted equations (eqns 7, 8) described by Barow and Meister (2002). The calculations were performed with a DAPI binding length of 4, as recommended by Barow and Meister. The average DNA content per chromosome was calculated by dividing the genome size (2C value) by the total number of chromosomes. All samples were analyzed in three replications with a CV of less than 5%.
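The arithmetic behind these estimates can be sketched as follows. This is a minimal illustration, assuming the usual internal-standard relation (sample 2C = sample G1 peak mean / standard G1 peak mean × standard 2C value) is what the cited method amounts to, and assuming the CV is taken across replicate estimates. All peak positions, the 2C value of the standard and the chromosome number are hypothetical placeholders, not data from this study, and the Barow and Meister GC equations are not reproduced here.

```python
from statistics import mean, stdev


def genome_size_2c(sample_peak, standard_peak, standard_2c_pg):
    """Sample 2C value (pg) from the ratio of sample to standard G1 peak means."""
    return sample_peak / standard_peak * standard_2c_pg


def dna_per_chromosome(two_c_pg, chromosome_number):
    """Average DNA content per chromosome (pg): 2C value / total chromosome number."""
    return two_c_pg / chromosome_number


def cv_percent(values):
    """Coefficient of variation (%): sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100


# Hypothetical inputs (assumptions for illustration only)
SAMPLE_PEAKS = [152.0, 150.0, 154.0]    # sample G1 peak means, three replicates
STANDARD_PEAKS = [100.0, 100.0, 100.0]  # internal standard G1 peak means
STANDARD_2C_PG = 4.0                    # assumed 2C value of the standard
CHROMOSOME_NUMBER = 24                  # assumed chromosome number of the sample

replicates = [
    genome_size_2c(s, r, STANDARD_2C_PG)
    for s, r in zip(SAMPLE_PEAKS, STANDARD_PEAKS)
]
two_c = mean(replicates)

print(f"2C estimate: {two_c:.2f} pg")
print(f"DNA per chromosome: {dna_per_chromosome(two_c, CHROMOSOME_NUMBER):.3f} pg")
print(f"CV across replicates: {cv_percent(replicates):.2f}%")
```

With these placeholder inputs the replicate CV falls well under the 5% threshold mentioned above; real peak means would come from the cytometer histograms.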

Evaluation of genome size changes in tetraploid species
The expected genome size of each tetraploid species was calculated as the sum of the genome sizes (1C, by flow cytometry) of its two ancestral species, whereas the observed value for the same tetraploid species was that obtained directly by flow cytometry. In all polyploid cases, the extant diploid species are not exactly those that formed the tetraploid species, but they are the closest living relatives of the diploids that formed them. The genome size changes of all tetraploid sections were also analyzed on an evolutionary time scale, as the ages of all sections are known.
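The comparison described here reduces to a sum and a difference. The short sketch below illustrates it; the function names and the 1C values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def expected_1c(parent_a_1c_pg, parent_b_1c_pg):
    """Expected 1C value (pg) of an allotetraploid: sum of the two parental 1C values."""
    return parent_a_1c_pg + parent_b_1c_pg


def genome_size_change(observed_1c_pg, expected_1c_pg):
    """Return (difference in pg, change in % of expected, direction of change)."""
    diff = observed_1c_pg - expected_1c_pg
    percent = diff / expected_1c_pg * 100
    if diff > 0:
        direction = "upsizing"
    elif diff < 0:
        direction = "downsizing"
    else:
        direction = "no change"
    return diff, percent, direction


# Hypothetical values: two diploid relatives at 2.5 pg each, tetraploid observed at 4.5 pg
expected = expected_1c(2.5, 2.5)
diff, percent, direction = genome_size_change(4.5, expected)
print(f"expected {expected:.2f} pg, change {diff:+.2f} pg ({percent:+.1f}%), {direction}")
```

Expressing the change as a percentage of the expected value makes tetraploids with very different parental genome sizes comparable across sections.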

Statistical analysis
All samples were analyzed in three replications, and the mean values and standard errors were calculated. Boxplot analyses of genome size and GC content were performed on the different polyploid sections according to the evolutionary age of each section (Figures 2 and 5). A scatter plot with a fitted line was produced for genome size vs GC content (Figure 6). All statistical analyses were carried out with the MINITAB 16 statistical package. The graphs and figures were prepared with Origin 2015.
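For readers who wish to reproduce the summary statistics outside MINITAB, the mean and standard error of three replicates can be computed as in this minimal sketch. The replicate values are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of replicates.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(replicates):
    """Mean and standard error (sample SD / sqrt(n)) of replicate measurements."""
    n = len(replicates)
    if n < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    return mean(replicates), stdev(replicates) / sqrt(n)


# Hypothetical 1C estimates (pg) from three replicate runs of one species
values = [5.0, 5.2, 5.4]
m, se = mean_and_se(values)
print(f"mean = {m:.2f} pg, SE = {se:.3f} pg")
```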

Genome size data
The genome sizes, genomic base compositions (AT and GC) and average chromosome sizes of 46 diploid and allotetraploid species are listed in Table 2.

Genome size changes along the evolutionary time scale in genus Nicotiana
The observed and expected genome size values of the eight tetraploid species were compared, and both genome up- and downsizing were observed among them (Figure 1). N. tabacum, N. rustica, N. clevelandii and N. nudicaulis showed genome downsizing, whereas N. quadrivalvis, N. repanda, N. nesophila and N. stocktonii showed genome upsizing. The average 1C value of the 14 species in section Suaveolentes was compared with the sum of the 1C values of their two ancestral species (Figure 1). The observed vs expected genome size of this newly studied section reveals a large amount of genome downsizing. Section Suaveolentes originated through allopolyploidization involving an ancestral member of section Sylvestris as the paternal progenitor and, as the maternal progenitor, a member of either section Petunioides or section Noctiflorae, or a hypothetical hybrid species between these two sections (Kelly et al., 2013).
The three homoploid hybrid species (N. linearis, N. spegazzinii and N. glauca) and their possible ancestral species showed little difference in genome size, except for N. noctiflora (Figure 1). The genome size of these hybrid species ranged from 6.50 pg to 7.11 pg, whereas approximately the same range of 5.30 pg to 6.95 pg was observed in their ancestral species.
The loss in genome size in the five tetraploid sections increased with the age of each section. The more recently evolved tetraploid sections, i.e. Nicotiana and Rustica, showed a small amount of genome size loss, whereas section Suaveolentes showed a large amount of genome size loss (Figure 2).

GC content variation in genus Nicotiana
The average GC contents of the 46 species in the genus Nicotiana were estimated by flow cytometry (Table 2). An ascending pattern was observed in GC content from diploid to tetraploid species, with a mean difference of 2.46% (Figure 5). The boxplot analysis of GC contents revealed uniformity in the pattern of GC content distribution, but the more recently evolved species (N. tabacum and N. rustica) showed comparatively high GC contents (Figure 5). The mean difference in GC content observed among all Nicotiana sections was 2.46%, indicating a relatively homogeneous GC content within the genus Nicotiana.

Reliability of the genome size data
The genome size estimates of N. sylvestris, N. tomentosiformis and N. tabacum based on flow cytometry and on 17-mer depth distributions of sequence data from the Tobacco Research Institute (unpublished data) are shown in Figure 3. Our genome size estimates by flow cytometry were 6-15% higher than the 17-mer based sequencing results for the three species. For instance, the recently sequenced genomes of N. sylvestris and N. tomentosiformis were estimated at 2.41 Gb (2.63 pg) and 2.43 Gb (2.68 pg) respectively using a 17-mer distribution, smaller than the expected 1C values estimated by flow cytometry (Sierro et al., 2013). Similarly, the flow cytometry estimate of the Arabidopsis genome (157 Mb) was about 25% larger than the genome sequencing estimate of ~125 Mb. The discrepancy between genome size estimates might arise from unsequenced gaps in the heterochromatic, telomeric or nucleolar regions (Bennett et al., 2003). Furthermore, the study of repetitive content in the 727 Mb potato genome assembly revealed that much of the unassembled genome sequence was composed of repeats (Xu et al., 2011). Considerable benefits can nevertheless be achieved by bridging genome size and sequence data, as the two estimates are broadly consistent.
Our study generated genome size values for 46 species of the genus Nicotiana, among which the values of 14 species were in agreement with those of a previous study (Leitch et al., 2008), with small differences observed in N. tabacum, N. attenuata, N. quadrivalvis and N. repanda (Figure 4). Significant differences were observed between our genome size estimates and those of 23 species listed by Narayan et al., 1987. The cause of such large differences between the two studies might be methodological, as the previous study (Narayan, 1987) used Feulgen photometry for genome size estimation, whereas flow cytometry has emerged as the method of choice for genome size estimation in the last decade (Doležel and Bartoš, 2005). Furthermore, the genome size estimates for the standard plants used in the previous study (Narayan, 1987) were not accurate because sequenced genomes were not available at that time.
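The percentage gaps quoted between flow cytometry and sequence-based genome size estimates are relative differences. As a small worked example, the sketch below applies that calculation to the two Arabidopsis figures given in the text (157 Mb and ~125 Mb); the function name is hypothetical.

```python
def percent_larger(estimate, reference):
    """How much larger `estimate` is than `reference`, as a percentage of `reference`."""
    return (estimate - reference) / reference * 100


# Arabidopsis values quoted in the text: flow cytometry vs sequencing estimate (Mb)
arabidopsis_gap = percent_larger(157.0, 125.0)
print(f"Flow cytometry estimate is {arabidopsis_gap:.1f}% larger than the sequencing estimate")
```

The result, 25.6%, matches the roughly 25% difference stated for Arabidopsis.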

Genome size estimation and genomic changes along evolutionary time scale in the tetraploid species
Our study differed from that of Leitch et al. (2008) in the extent of genome up- and downsizing, but the direction of genome size change was similar except for N. clevelandii. Next-generation sequencing data for section Repandae also reveal both genome up- and downsizing (Renny-Byfield et al., 2013). Frequent loss of genomic sequences and genome contraction seem to be a general response to polyploidization (Leitch and Bennett, 2004; Renny-Byfield et al., 2011; Yang et al., 2011), while a few cases of genome size expansion have also been reported (Bennett and Leitch, 2005; Leitch et al., 2008). Nevertheless, some studies have reported no DNA loss (Ozkan et al., 2006; Mestiri et al., 2010).
The observed vs expected genome size of section Suaveolentes reveals a large amount of genome downsizing. The exact cause of such a large loss of genomic DNA is not yet clear, but the dysploid reduction in chromosome number, which occurs widely in section Suaveolentes through chromosome fusion, might be one possible reason (Clarkson et al., 2004). Section Suaveolentes showed the largest genome size reduction among the polyploid species because it is the oldest polyploid section in the genus Nicotiana, with an age of approximately 10 Myr. The evolutionary age of each polyploid section of the genus Nicotiana has been documented in previous studies (Clarkson et al., 2005; Leitch et al., 2008). The extent of DNA sequence divergence encountered in polyploids depends on the age of the species, with genome turnover more evident in older species (Lim et al., 2007).
The homoploid hybrid species (N. linearis, N. spegazzinii and N. glauca) were recently identified as having evolved by hybridization between members of sections Noctiflorae and Petunioides (Clarkson et al., 2010; Kelly et al., 2010). The genome sizes of N. linearis, N. spegazzinii and N. glauca were 6.50 pg, 7.11 pg and 6.85 pg respectively, whereas approximately the same range of 5.30 pg to 6.95 pg was observed in their ancestral species. In contrast to our findings, genome size expansion has been observed in three homoploid hybrid species of Helianthus, which have 50% more nuclear DNA than their parental species (Baack et al., 2005).

GC content variation in genus Nicotiana
Several studies in various organisms, including plants, have reported an increase in GC content from diploid to tetraploid species. More recent studies on seed plants reveal a shift from the GC-poor and homogeneous pattern of diploid species to more heterogeneous and GC-rich polyploid species (Serres-Giardi et al., 2012). The pattern of GC content was tested on a narrower range in the genus Nicotiana. Our study indicates a more homogeneous pattern of GC content among the diploid ancestors and the polyploid species, with median values of 39.97% and 41.28% respectively (Figure 5). The interquartile range of GC content across all species was 2.46%. The pattern of GC content in the genus Nicotiana was similar to that of the previous study (Serres-Giardi et al., 2012), but the magnitude of the difference was smaller because that study included a wider range of species, from eudicots to monocots. A positive correlation was found between genome size and GC content, with a Pearson correlation coefficient of 0.56 (Figure 6).
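The Pearson coefficient reported for genome size against GC content follows the standard definition: the covariance of the two variables divided by the product of their standard deviations. The sketch below implements that definition in plain Python; the paired values are hypothetical and are not the Table 2 data, so the printed coefficient will not equal 0.56.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations: genome size (pg) and GC content (%)
genome_size_pg = [3.0, 5.0, 6.0, 8.0, 10.0]
gc_percent = [38.0, 40.0, 39.0, 42.0, 41.0]

r = pearson_r(genome_size_pg, gc_percent)
print(f"Pearson r = {r:.2f}")
```

A value of r near 0.5 to 0.6, as reported here, indicates a moderate positive association: larger genomes tend to have higher GC content, but with considerable scatter.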

Conclusions
Our study provides a comprehensive and up-to-date set of genome size estimates for 46 species of Nicotiana, covering both diploids and tetraploids. Altogether, it reveals both genome up- and downsizing along the evolutionary time scale in the genus Nicotiana. Genome downsizing was observed in the large and newly studied section Suaveolentes, whereas the genome size estimates of the three homoploid hybrid species were in a similar range to those of their ancestral species. Genomic loss was related to the age of each section, i.e. evolutionarily older sections showed a greater loss of genomic sequence than the recently evolved sections of the genus Nicotiana. GC content was positively correlated with genome size, with a correlation coefficient of 0.56. GC content was relatively homogeneous in this genus, with a mean difference of 2.46%, and showed a moderate increase in the recently evolved species of sections Nicotiana and Rustica. The sub-genomic processes and specific sequences that generate variation in genome size can only be examined in detail through large-scale comparisons of DNA sequences. The study of total DNA content (C-value) together with individual sequences can open new perspectives in genome biology, with sequence data providing novel insights into genome size evolution and genome size data being of both practical and theoretical significance for large-scale sequence analysis.

Figure 1. Observed (black triangles) and expected (white triangles) genome size values of the tetraploid and homoploid hybrid species (N. linearis, N. spegazzinii and N. glauca) that evolved from their ancestral diploid species. The observed values are 1C genome sizes estimated by flow cytometry, and the expected values are the sums of the 1C parental genome sizes.

Figure 2. Boxplot distribution of genome size estimates (1C in pg) of the different tetraploid sections over the evolutionary time scale (the time scale runs from left to right on the x-axis, from the most recently evolved species to the oldest). Genome size estimates of each tetraploid section of the genus Nicotiana are shown together with the average genome size of the diploid progenitors.

Figure 3. Comparison of genome size estimates with sequencing results. Black bars represent the genome size estimates by flow cytometry, whereas the lined bars represent genome size estimates from 17-mer sequencing results (unpublished data).

Figure 4. Comparison of genome size estimates from our study (black triangles) with those of 15 species estimated by Leitch et al., 2008 (white triangles).

Figure 6. Scatter plot with fitted line of genome size vs GC content in the genus Nicotiana (correlation coefficient of 0.56).

Table 1. Plant standards for genome size estimation.

Table 2. Genome size, nucleotide composition and average DNA content per chromosome of 46 different Nicotiana species.