Genetic analysis of scattered populations of the Indian eri silkworm, Samia cynthia ricini Donovan: Differentiation of subpopulations

Deforestation and exploitation has led to the fragmentation of habitats and scattering of populations of the economically important eri silkworm, Samia cynthia ricini, in north-east India. Genetic analysis of 15 eri populations, using ISSR markers, showed 98% inter-population, and 23% to 58% intra-population polymorphism. Nei’s genetic distance between populations increased significantly with altitude (R2 = 0.71) and geographic distance (R2 = 0.78). On the dendrogram, the lower and upper Assam populations were clustered separately, with intermediate grouping of those from Barpathar and Chuchuyimlang, consistent with geographical distribution. The Nei’s gene diversity index was 0.350 in total populations and 0.121 in subpopulations. The genetic differentiation estimate (Gst) was 0.276 among scattered populations. Neutrality tests showed deviation of 118 loci from Hardy-Weinberg equilibrium. The number of loci that deviated from neutrality increased with altitude (R2 = 0.63). Test of linkage disequilibrium showed greater contribution of variance among eri subpopulations to total variance. D’2IS exceeded D’2ST, showed significant contribution of random genetic drift to the increase in variance of disequilibrium in subpopulations. In the Lakhimpur population, the peripheral part was separated from the core by a genetic distance of 0.260. Patchy habitats promoted low genetic variability, high linkage disequilibrium and colonization by new subpopulations. Increased gene flow and habitat-area expansion are required to maintain higher genetic variability and conservation of the original S. c. ricini gene pool.


Introduction
The Indian eri silkworm, Samia cynthia ricini (= S. ricini Donovan; Lepidoptera: Saturniidae) is a domesticated silkworm that feeds primarily on leaves of the castor plant, Ricinus communis L. This species, initially reported from the foot-hills of Himalaya, was later introduced into temperate countries. For more than a century, a few populations have been used for commercial silk production in northeast India, in such regions as the Brahmaputra river valley, east and west Garo Hills, east and west Khasi Hills, Nagaland and Manipur (Singh and Benchamin, 2002;Piegler and Naumann, 2003). Primary and secondary food plants occur in abundance in the surroundings characterized by temperatures of 7°C to 34°C and an average annual rainfall of 2800 mm. Fifteen populations of the eri silk-worm were recognized from localities at various altitudes ranging from 35 to 925 meters above sea level (MSL) (Table 1). Although morphological characterization indicated phenotypic variability, molecular data on genetic relations and structure are required to improve the population characterization. Recently, the species was identified as a potential source for exploring functional genomics (Arunkumar et al., 2008), although prior to their use in assessing genetic diversity, variations in EST and putative gene expression need to be ascertained. The lack of co-dominant markers, such as microsatellites from the eri silkworm itself, has added to the problem when analyzing population genetic status.
Inter simple-sequence repeat (ISSR) primers were designed from regions of microsatellite abundance in the genome, which exhibit high level of polymorphism (Schlotterer, 2000). As ISSR, the dominant marker system is simpler to use though not entailing prior information on target regions (Zietkiewicz et al., 1994;Nagaraju et al., 2002), it has been extensively employed for identifying inter-and intra-populational genetic variability in several insect species, including domestic and wild silkworms (Ehtesham et al., 1995;Pradeep et al., 2005Pradeep et al., , 2007Pradeep et al., , 2008Vijayan et al., 2005Vijayan et al., , 2006Santana et al., 2009;Saravanakumar et al., 2010). Hence the ISSR marker system has been selected for genetic analysis of the eri populations.
On route, many unexploited eri populations from remote areas of northeast India were identified. Habitats were isolated by high mountains, farm land and human dwellings, all leading to the formation of patchy inbred populations. Habitat fragmentation had deleteriously affected the genetic structure of natural populations (Hale et al., 2001), thus causing a loss in biological diversity (Davies et al., 2001). Furthermore, commercial exploitation gave rise to gene-pool mixing, thus contributing to the loss of population identity. Genetic analysis of these scattered populations using molecular markers becomes imperative, in order to develop a sustainable strategy for conserving the various genetic resources as yet available. The aim was to analyze the genetic structure of unexploited, scattered eri populations, as well as the genetic interaction with topographic features of collection sites. The data demonstrated a loss of genetic diversity in sub-populations, as well as the genetic reorganization of certain populations, which can be considered as markers of eco-genetic consequences on insects undergoing regional habitat fragmentation.
DNA extracted from individual moths by the phenol:chloroform method (Suzuki et al., 1972), was incubated with RNase A to remove RNA contamination and then re-extracted. The purified DNA was dissolved in Tris-EDTA (TE; pH 8.0) buffer and quantified on 0.8% agarose gels. PCR amplification was carried out on an MJ Research Thermal-Cycler, PTC 200, using 20 mL reaction mixture containing 10x PCR buffer, 2 mM dNTPs, 2.5 mM MgCl 2 , 0.10 mL of Taq DNA polymerase (recombinant; Fermentas) (5 U/mL), 2.0 mL of 1.5 mM ISSR primer and 40 ng of DNA. The ISSR primers (set # 9) were purchased from the Nucleic Acid & Protein Service of Biotechnology Laboratory (University of British Columbia, Vancouver, Canada). Of the 100 primers tested, only 20 showed reproducible amplification. Each reaction was replicated at least twice for reproducibility. The PCR cycle followed was 94°C for 2 min followed by 35 cycles of 94°C for 30 s, annealing at 50°C for 30 s and extension at 72°C for 2 min and a final extension at 72°C for 10 min. Following PCR, the reaction mixture was loaded on a 1.5% agarose gel in Tris-Boric acid-EDTA (TBE) buffer and resolved at 60 V in a submarine electrophoresis system. The gel was illuminated with UV and the ISSR profiles were photographed using a gel documentation system (Syngene). Pradeep et al. 503

Statistical analyses
Amplification products were scored in a binary mode (1, 0), where "1" represented the presence of a marker and "0" its absence. The data was analyzed by SPSS 11.5 software and POPGENE version 1.32 (Yeh, 1998). Polymorphism levels were expressed as the percentage of all the loci that proved to be polymorphic in the profile. The null hypothesis was tested with Chi-square (c 2 ) and likelihood ratio (G 2 ) analysis for each locus, which determined the probability of homozygosity under Hardy-Weinberg equilibrium. The Ewens-Watterson test for neutrality was applied for assessing variation from Hardy-Weinberg equilibrium (H-WE) in 1000 simulated samples. The Nei (1972) genetic distance, the Nei (1973) gene diversity index (h), the Shannon Information index (I) (Lewontin, 1972), observed number of alleles (na), effective number of alleles (ne) (Kimura and Crow, 1964) and coefficient of genetic differentiation (Gst), were estimated using the POPGENE program. According to Nei (1973) genetic diversity statistics, Gst, a measure of population differentiation, is defined as the proportion of genetic diversity that resides among populations. Gst values range from zero to one, with low values indicating little inter-population genetic variation (Culley et al., 2002). Pair-wise genetic similarity was calculated as 2Nij/(Ni + Nj), where Nij is the number of common bands in i and j populations, and Ni and Nj the total number of bands produced by the populations i and j (Nei and Li, 1979). A phylogeny tree was constructed based on the dissimilarity (1-similarity value) matrix, using cluster analysis by UPGMA (Unweighted Pair Group Method using an Arithmetic average), available in the clustering program, PHYLIP version 9.0. Bootstrap analysis was performed using 10000 replicates as provided in PAUP* software (Swofford, 1998).
Linkage disequilibrium among pairs of loci for each population was calculated using ISSR loci that deviated from HWE, according to Weir (1979). Non-random association between two loci was assessed by Chi-square testing. Variance components of linkage disequilibrium were calculated by two-locus analysis, as per D -Statistics of Ohta (1982). Total variance (D 2 IT) was partitioned into intra-(D 2 IS, D' 2 IS) and inter-(D 2 ST, D' 2 ST) population components, both calculated with POPGENE software.
Relation between genotypic parameters and geographical parameters of the site of collection was analyzed by linear regression analysis and correlation coefficient (R 2 ).
The distribution pattern of individuals within each population was analyzed by ALSCAL multidimensional scaling available in the SPSS program. In this analysis, the genetic distance was computed and a dissimilarity matrix created, using Euclidean distances acquired from the ISSR profile. Euclidean distance was used for data stimulus configuration, using the classical Young-Householder multidimensional scaling procedure (Young et al., 1984;Young and Harris, 1990). Canonical discriminant function analysis (DFA, as given in the SPSS program) was applied for analyzing the contribution of altitude to the genetic status of eri populations, by using altitude and ISSR loci for defining the percentage of variance of the dependent variable (altitude), as inferred from independent loci. Altitudes were categorized as class 1; 0-100 MSL, class 2; 101-200 and so on. The multivariate stepwise approach, with the same limit of F value for stepwise entry and removal of variables (ISSR loci in this case), was used for calculating Mahalanobis distance (D 2 ) statistics, and arriving at estimates of Chi-square and Wilks' Lambda (Anderson, 1984;Introna Jr et al., 1997). For DFA, squared Mahalanobis distance (D 2 ) was used for testing the distance from the group centroid in each case. The shorter the distance, the closer the case was to the group itself, and thus more likely for classification as belonging thereto (Mahalanobis, 1936).

Phenotypic variability
Of the 15 S.c.ricini populations, eight were from lower (35-150 MSL), two from mid-(~300 MSL) and five from higher (500-925 MSL) altitudes (Table 1). Eri larvae have a distinct blue or yellow body-color, with or without zebra markings, and spines. The color of the cocoons produced by the various populations was either white or brick-red, or both.

Inter population genetic status
Twenty ISSR primers amplified 194 products from genomic DNA of the 15 populations within a size range of 0.3 kb to 3.0 kb. The UBC842 primer produced 15 bands, the maximum, whereas UBC807 and UBC812 produced only five, the minimum. According to AMOVA, inter-population polymorphism was 98%. The Nei (1972) genetic distance was the highest between Dhemaji and Dhanubhanga (0.593), and the lowest between Dhakuakhana and Cachar (0.0686). The average genetic distance among the Lakhimpur, Dhemaji and Dhakuakhana populations located in close proximity (an average distance of 21 km) in the Brahmaputra river basin in upper Assam, was 0.109, and between these populations and those geographically farther placed was 0.442. The geographic distance between Mendipathar and Cachar was 915 km and the genetic distance 0.413. Regression of genetic distances between populations over geographical distances, revealed significant positive correlation (R 2 = 0.783; p < 0.0009; Y = 0.0009 X + 0.121).
According to the dendrogram plotted from the ISSR profile, the Lakhimpur, Dhemaji, Dhakuakhana and Cachar populations were distinctly clustered, whereas the Barpathar and Chuchuyimlang were grouped together. The Diphu and Dhansiripar populations were grouped together with commercial populations (E3 to E8). Tree-branching in the dendrogram was strongly supported by significantly high bootstrap values. At the node representing Barpathar and Chuchuyimlang clustering, bootstrap value was 87%. On plotting the phylogeny tree onto a geographical map of the collection sites, the Barpathar and Chuchuyimlang populations were found to be at an intermediate position in a north-south direction, both on the map itself and the phylogram ( Figure 1).
Analysis of genetic variability revealed low gene diversity throughout both the total eri population (Ht = 0.350 ± 0.019) and subpopulations (Hs = 0.121 ± 0.008). Among loci, UBC808-7 1200 bp and UBC885-11 550 bp presented the highest Ht of 0.42. The coefficient of genetic differentiation (Gst) was lower among the scattered non-commercial populations than among the commercial (Table 2). In the total eri population, gene flow was 0.263. The contribution of ISSR loci to population heterogeneity was assessed by Chi-square analysis. Out of the 194 loci, 10 showed significant probability ((c 2 )/g 2 < 0.05 or better) values indicative of a significant contribution (p < 0.01 or better).
Locus-variation from H-WE was examined by the Ewens-Watterson test for neutrality. Deviation occurred in 118 loci (Table 3), with 77 loci had the F-values above the upper limit of expectancy (95% confidence), and 41 below the lower limit. Linkage disequilibrium was calculated for the total population, in order to assess the contribution of non-selective forces on the fluctuation of loci from HWE. Single population linkage disequilibrium analysis (Weir 1979) showed significant (p < 0.05) disequilibrium in S.c.ricini populations. In populations from areas of low altitude (< 200 MSL), linkage disequilibrium was more pronounced when compared to those from higher altitudes. Total variance (D 2 IT) of the disequilibrium (Ohta, 1982) was 0.2601. D '2 IS (0.2416) was greater than D '2 ST (0.0185), an indication of close-to-zero variance in the total population, and consequently the more pronounced inde- Pradeep et al. 505   Table 1 pendence of loci. On the other hand, D 2 ST (0.258) was greater than D 2 IS (0.0022), thus indicating that a larger proportion of total variance in disequilibrium resulted from deviation "among the subpopulations" than that from "within populations".

Intra-population genetic status
Intra-population polymorphism was the highest in the Lakhimpur (58%) and the lowest in the Dhemaji and Chuchuyimlang (22% and 26%, respectively) ( Table 3). In the Lakhimpur, 113 loci were polymorphic, as against 62 in the Cachar. Nei's intra-population gene diversity (h) was the highest in the Barpathar (0.161) and Lakhimpur (0.155) and the lowest in the Dhemaji (0.088), Cachar (0.098) and Chuchuyimlang (0.092). Generally, gene diversity was somewhat higher in the populations of the lower Assam and Meghalaya regions (~0.12), when compared to the upper Assam and Nagaland regions (~0.11). The Shannon Information index (I) was the lowest in the Dhemaji population (0.12) and the highest (0.238) in the Lakhimpur. The observed number of alleles varied from 1.227 in the Dhemaji to 1.583 in the Lakhimpur, whereas the effective number of alleles varied from 1.155 in the Dhemaji to 1.286 in the Barpathar.
Genetic distances were calculated, so as to assess genetic relationships among individuals within each population. These ranged from 0.179 in the Chuchuyimlang to 0.291 in the Dhakuakhana population (Table 3). In Dhemaji and Nongpoh also, the genetic distance among individuals (~0.190) was low.
The distribution of individuals within each population was assessed by ALSCAL multidimensional scaling. Except for Lakhimpur, they were either uniformly distributed on the matrix or clustered in the center (data not shown). Although in the Lakhimpur population (E12), one group of individuals was clustered together (core population), another was separated, thus forming a peripheral group (Figure 2). The average genetic distance among core population individuals was 0.17, whereas that between clustered and peripheral was 0.26, thus significantly higher (p < 0.007; Student t-test). On the dendrogram, individuals of the peripheral population, besides being grouped separately, presented less genetic differentiation (Data not shown).

Relationship between geographic variation and genetic status
DFA of altitude-dependent genetic variance among the eri populations showed significant (p < 0.000) genetic variance, supported by significant (0.000) Wilk l and Chisquare estimates and large canonical-correlation coefficients (R 2 = 0.996). The mean genetic distance of high altitude populations, such as those of Diphu, Cachar, Imphal and Chuchuyimlang, from low altitude populations was 506 Genetic differentiation of S.c.ricini subpopulations 0.329. As a whole, genetic distances among eri populations increased with altitude, a clear indication of a significant relationship between the two parameters (R 2 = 0.71; p < 0.004). Notwithstanding, genetic distances among individuals within a specific population was not influenced by the prevailing altitude (data not shown). In the high altitude populations of Chuchuyimlang and Cachar, Nei's gene diversity was low (~0.09), whereas in the low altitude, such as Dhanubhanga and Dhakuakhana, it was high (~0.12).
The regression of Nei's gene diversity on altitude, revealed only a weak negative correlation between the two parameters (R 2 = 0.118; Y = -3E-05X + 0.1251). There was no correlation in populations from Imphal and Dhemaji, as shown by the R 2 value decreasing from 0.437 to 0.118. On assessing the deviation of loci from neutrality, the significant increase in the number of loci that deviated from H-WE coincided with the increase in altitude (R 2 = 0.627; Y = 0.0081X + 3.4111).

Discussion
Infiltration and the use of forest areas for farming and human dwellings have been the cause of forestfragmentation in northeast India. This, besides contributing to genetic erosion, has negatively affected the survival of many flora and fauna species, including the Samia (Piegler and Naumann, 2003). The fragmentation of habitats has induced scattered distribution, geographical isolation and localized inbreeding of eri silkworm populations. Those from the different altitudes have shown significant phenotypic variation due to varied gene expression. Moreover, 118 loci (60%) have given evidence of deviation from H-W E. Presumably, in different geographic locations, natural selection acts on different genomic regions (Weinig et al., 2003;Mitchell-Olds and Schmitt, 2006). Fragmented populations are highly susceptible to those environmental factors capa-ble of inducing genetic divergence (Antonovics, 1976;Willi et al., 2006). Herein, an attempt was made to employ molecular markers for assessing genetic hazard due to fragmentation of habitats, and the subsequent impact on eri-subpopulation differentiation. In these populations, the positive correlation between geographic and genetic distances clearly indicated the effect of geographical isolation on genetic structure. On the contrary, in Cachar (altitude: 680MSL) from lower Assam and Dhakuakhana (altitude: 68 MSL) from upper Assam, there was both phenotypical and genotypical proximity, as well as joint grouping on the phylogram. These populations presented low genetic polymorphism (28%), relatively lesser observed alleles (1.2 ± 0.4) and lower gene diversity (h = 0.09 and 0.11 respectively), apparently through deficient selection in patchy habitats. Nevertheless, in the commercial populations (E3 to E8), the grouping together in the phylogram indicated genetic similarity, due to the continuous multiplication by inbreeding and/or genetic mixing through human intervention.
The increased rate of inbreeding and genetic drift can be considered as possible causes of erosion in genetic diversity in small eri populations (Keyghobadi et al., 2005;Ouma et al., 2005). Wide-spread linkage disequilibrium, a further indication of recuperated allelic variability, gives added support to this hypothesis. The decrease in disequilibrium with the increase in altitude implies low genetic diversity in high-altitude populations. Tests showed D 2 ST > D 2 IS to be an indication that disequilibria was more pronounced among S.c.ricini subpopulations. In contrast, D '2 IS > D '2 ST implies a significant contribution of random genetic drift to the increase in disequilibrium variance (Ohta, 1982). Such random fluctuation is enhanced by the differentiation in those subpopulations (Ohta, 1982) that presents low genetic variation (Hedrick, 2000;Vuorinen and Eskelinen, 2005). Although less loci deviating from H-WE within the different subpopulations can be an indication of allele fixation by natural selection under geographical isolation, random drift may also have induced population differentiation under conditions of low gene migration.
A significant decrease in gene diversity coinciding with the increase in altitude (35 to 900 MSL) in eri populations places in evidence the influence of the heterogeneous topographical conditions of northeast India on population build-up, as reported for Drosophila melanogaster (Hoffmann et al., 2001;Anderson et al., 2005) and other animals (Pariset et al., 2009), under varying geographical conditions. Continuous mountain ranges, by restricting gene flow between scattered populations, have contributed significantly to the large-scale inbreeding, excess of homozygotes and stabilization of divergent sub-populations, which have ultimately led to the Wahlund effect (Wahlund, 1928), as noticed earlier in Anopheles arabiensis (Kamau et al., 2007), the Atlantic dipteran, Ochlerotatus taeniorhynchus (Bello and Becerra, 2009), and wild popu- Pradeep et al. 507 Figure 2 -Spatial distribution of individuals from the Lakhimpur population of Samia cynthia ricini on an ALSCAL matrix, based on Euclidean distance calculated from independent variables, ISSR loci. Individuals of the population were represented as e12.1 to e12.22. Those in a circle represent the core population. Arrows indicate the separation, at a significant genetic distance, of the peripheral population from the core.
lations of the muga silkworm, Antheraea assama (Arunkumar et al., 2009). The availability of local congenial survival conditions was evident by most of the eri populations having been uniformly distributed on the ALSCAL matrix. Despite low gene diversity, the high survival rate (~90.52%) indicated the voidance of threshold genediversity for survival. Notably, most of the loci that deviated from H-WE revealed high F-values and a tendency towards homozygosity, thus enhancing their dominance in eri subpopulations. High homozygosity, homogeneous distribution and the high survival rate exerted a positive influence on genes favorable for fitness, thus confirming the significant association between alleles and fitness traits that contribute to the survival (Schulze and McMahon, 2002;Lyons et al., 2009) of eri populations under environmental stress (e.g. high altitudes), instead of high heterozygosity, which tends to be non-heritable.

Differentiation of sub-populations
By plotting the phylogram onto a geographic map, the intermediate geo-genetic positions of the Barpathar and Chuchuyimlang populations in northeast India became apparent. But the absence of intermediate status, on comparing phenotypic traits in them with other eri populations, indicated the independency of genetic expression. Moreover, the low hatching rate (~70%) indicated instability in expressing an important fitness trait. Geographically, whereas Barpathar is situated at a low altitude (99 MSL), Chuchuyimlang is at higher altitude (925 MSL). Polymorphism was 43% in the Barpathar population and 27% in the Chuchuyimlang. Gene diversity in both was low (h = 0.161 and 0.092 respectively), whereas genetic distance to other populations was proportionately higher (mean = 0.349). DFA grouping of these populations with those at low altitudes revealed mutual genetic closeness. Such an unstable genetic structure is mostly due to the loss of rare alleles by a slower-rate drift (Cornuet and Luikart, 1996). The low hatching rate, coupled with low gene-diversity and intermediate geo-genetic positioning, imply their origin by chance hybridization, but with reduced hybrid fitness (Peterson et al., 2005). Gene diversity in these eri sub-populations may also be influenced by non-genetical parameters, preferably geographical mountain-barriers, variations in altitude, low flying capacity, etc., that are non-supportive to natural hybridization. Alternatively, genetic grouping of these geographically separated populations supposes genetic invasiveness and subsequent intra-habitat adaptation. How invasion occurred requires further investigation.
In the Lakhimpur population, situated at 102 MSL, both polymorphism (58%) and gene diversity (h = 0.155) were higher, when compared with the others. Individuals hereof were separated into core and peripheral groups, with an intermediate genetic distance of 0.263. Segregation of the peripheral population may be due to partial isolation of the population where non-adaptive inbreeding permits nat-ural selection to enter into play (Charlesworth and Wright, 2001), which possibly induced the deviation from H-WE. Most of the quantitative traits of the Lakhimpur population showed intermediate values, an indication that natural selection had played upon different traits. The shift can possibly be associated with changes in habitat also (Calsbeek et al., 2009). Such genetic reorganization within the Lakhimpur sub-population is an indication of the occurrence of ecological devastation with genetic consequences in the Brhmaputra river valley.
Genetic distances among eri populations increased with the increase in altitude and geographical distance. The low gene diversity in high-altitude populations could be attributed to geographic isolation by mountain-barriers, low gene flow due to long-distance flying incapacity of the moths, and the enhanced action of drift in patchy habitats (Barrett and Kohn, 1991;Leimar, 2009). Notably, patchy habitat induces low genetic variability, high linkage disequilibrium and colonization of new subpopulations. Such genetic pressure leads to discrete modes of selection, thus causing plasticity in phenotypes under varying environmental conditions. Thus, the revelation of the challenges faced by scattered subpopulations, through the interaction between genomic and ecological processes, will help both in evaluating the total eri population status, and conserving the original S. c. ricini gene pool. Both enlargement in the effective size of subpopulations and habitat protection are essential for maintaining genetic variability and increasing gene flow among eri populations. A strongly differentiated subpopulation, as that of Lakhimpur, is to be considered as an evolutionarily active habitat for conservation planning.