Genetic diversity and molecular characterization of Cucumber mosaic cucumovirus (CMV) subgroup II infecting Spinach (Spinacia oleracea) and Pea (Pisum sativum) in Pothwar region of Pakistan

Abstract
Cucumber mosaic virus (CMV) is a major threat to vegetable crops worldwide, including in Pakistan. The present work investigated the genetic variability of CMV isolates infecting pea and spinach in the Pothwar region of Pakistan. Serological surveys during 2016-2017 revealed an overall CMV disease incidence of 31.70% in pea and spinach crops. Triple-antibody sandwich enzyme-linked immunosorbent assay (TAS-ELISA) showed that all positive isolates belonged to CMV subgroup II. cDNA from two selected ELISA-positive samples, one each from pea and spinach, was PCR-amplified (ca. 1100 bp) and sequenced; the resulting CMV CP gene sequences shared 93.7% nucleotide identity with each other. The sequences of the CMV pea (AAHAP) and spinach (AARS) isolates from Pakistan were submitted to GenBank under accession nos. MH119071 and MH119073, respectively. BLAST analysis revealed 93.4% sequence identity of the AAHAP isolate with SpK (KC763473) from Iran, while the AARS isolate shared maximum identity (94.5%) with strain 241 (AJ585519) from Australia and clustered with reference isolates of CMV subgroup II from the UK (Z12818) and the USA (AF127976) in a Neighbour-joining phylogenetic reconstruction. A total of 59 polymorphic (segregating) sites (S) with a nucleotide diversity (π) of 0.06218 were evident, while no INDEL event was observed in the Pakistani isolates. The evolutionary distance of the Pakistani CMV isolates was 0.0657 with each other and 0.0574-0.2964 with CMV isolates reported elsewhere in the world. Frequent gene flow (Fst = 0.30478 < 0.33) was observed between Pakistani and previously reported CMV isolates. In the genetic differentiation analysis, the values of three permutation-based statistical tests, viz. Z (84.3011), Snn (0.82456), and Ks* (4.04042), were non-significant. Tajima's D, Fu and Li's F*, and Fu and Li's D* were 2.02535, 0.01468, and 0.71862, respectively, indicating that the CMV population is under balancing selection.


CMV holds a significant position among plant viruses infecting various crops in Pakistan and causes massive vegetable losses (Ashfaq et al., 2014; Mozamil et al., 2019). CMV has been reported to infect only a limited number of vegetable species in Pakistan, viz. chilli, tomato, melon and pea (Mughal, 1985; Akhtar et al., 2008; Malik et al., 2010; Iqbal et al., 2012; Ahsan and Ashfaq, 2018); however, other important vegetables and areas still need to be surveyed systematically for CMV infection. Moreover, data on the prevalence of different strains and on the genetic diversity of CMV infecting major vegetables are scarce. Considering these points, the present study was conducted to identify and characterize CMV infecting spinach and pea grown in the Pothwar region of Pakistan.

Field surveys and serodiagnosis of CMV in Pothwar region
Cucumber mosaic virus disease (CMVD) surveys were carried out in the Pothwar region (Attock, Chakwal, Jhelum, Rawalpindi and the capital territory of Islamabad) (Figure 1) during 2016-2017 following a stratified random design (Khan et al., 2015). In 2016 and 2017, 172 and 175 pea and spinach leaf samples, respectively, displaying suspected CMVD symptoms such as mosaic, interveinal chlorosis, chlorotic streaks, unusually thick lateral veins with malformed shoestring leaves, and misshapen fruits were collected, placed on ice in an ice bucket and brought to the Plant Virology Lab, Department of Plant Pathology, PMAS Arid Agriculture University Rawalpindi, Pakistan. All samples were subjected to serological assay, and positive samples were stored at -20 °C for identification and characterization of the viral agents.
Preliminary CMV detection was carried out with a DAS-ELISA kit (Cat. No. SRA 44501/0500, Agdia Inc., USA), with positive and negative controls, following the manufacturer's instructions and the method of Clark and Adams (1977). Absorbance values (405 nm) were measured with an Automatic ELISA Reader (HER-480 HT Company (Illford) Ltd., UK). Samples were considered positive for CMV infection when the ELISA absorbance value was at least twice the mean absorbance value of the healthy tissue and of the negative control (Ashfaq et al., 2014). Commercial positive and negative controls (Agdia) were included in the CMV ELISA kit. The relative disease incidence was determined with ELISA
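The positivity rule described above can be sketched in a few lines of Python. This is only an illustration: the function names, sample identifiers and absorbance readings are hypothetical, and the incidence formula (ELISA-positive samples divided by samples tested, times 100) is the usual definition and is assumed here, since the source text does not spell it out.

```python
from statistics import mean


def elisa_cutoff(healthy_absorbances, factor=2.0):
    """Cut-off = factor x mean A405 of the healthy/negative controls."""
    return factor * mean(healthy_absorbances)


def call_samples(sample_absorbances, cutoff):
    """Map each sample ID to True (positive) if its A405 reaches the cut-off."""
    return {sid: a405 >= cutoff for sid, a405 in sample_absorbances.items()}


def incidence_percent(calls):
    """Assumed definition: positives / samples tested x 100."""
    return 100.0 * sum(calls.values()) / len(calls)


# Hypothetical readings, for illustration only (not data from this study).
healthy = [0.10, 0.12, 0.11]
samples = {"pea_01": 0.45, "pea_02": 0.15, "spinach_01": 0.30, "spinach_02": 0.08}

cutoff = elisa_cutoff(healthy)
calls = call_samples(samples, cutoff)
print(f"cut-off A405 = {cutoff:.3f}")
print(f"incidence = {incidence_percent(calls):.2f}%")
```

Keeping the cut-off factor as a parameter makes it easy to see how sensitive the incidence estimate is to the chosen threshold.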

Introduction
The Pothwar region (longitude 71°10-73°55E; latitude 32°10-34°9N), covering 22,254 km² of northern Punjab province, lies in the north-eastern part of Pakistan and has diverse vegetation comprising fruits, ornamentals, vegetables, cereal crops, and small forests (Ashfaq et al., 2017; Riaz et al., 2022). The districts of Attock, Rawalpindi, Chakwal and Jhelum and the capital territory of Islamabad constitute this diversified region (Cheema et al., 2013). Spinach (Spinacia oleracea) and pea (Pisum sativum) are among the most important rabi vegetables, covering 0.035 million hectares (MH) with an average production of 1.57 MT (Pakistan, 2018). The average yield per acre of these vegetables in Pakistan is quite low owing to several biotic and abiotic factors. Among plant viruses, Cucumber mosaic virus (CMV) is known to cause devastating losses, reducing both crop quality and quantity, and is a serious threat to regional vegetable production (Rehman et al., 2015).
CMV is the type member of the genus Cucumovirus. It has a tripartite genome of three genomic RNAs, each of which is separately enclosed in an isometric particle of 28 nm diameter (Azizi and Shams-Bakhsh, 2014). The single-stranded, positive-sense RNA segments (RNA1, 2, and 3) are 3.4 kb, 3.0 kb, and 2.2 kb in size and carry five open reading frames (ORFs) (Mochizuki and Ohki, 2012). The genes perform multiple functions: host specificity, symptom induction, long-distance movement, interviral recombination and virulence determination are controlled by ORFs 1, 2a and 2b located on RNA1 and RNA2, while RNA3 encodes the movement protein (MP), responsible for cell-to-cell movement of virions, and the capsid protein (CP; translated from subgenomic RNA4), which is involved in aphid transmission of CMV from infected to healthy plants (Shi et al., 2008; Nouri et al., 2014; Ohshima et al., 2016).

Serotyping of CMV isolates using monoclonal antibodies
Plant sap was extracted from DAS-ELISA-positive samples and used to identify serogroups by Triple Antibody Sandwich (TAS) ELISA. For this purpose, commercially available TAS-ELISA kits (Cat No. PSA 44700/0480 and SRA 44800/0500, Agdia Inc., USA) containing positive and negative controls and poly/monoclonal antibodies were used. TAS-ELISA was performed as prescribed by the manufacturer and as described by Hosseinzadeh et al. (2012).

Total RNA extraction and RT-PCR
TRIzol® Reagent (Life Technologies, USA) was used to isolate total RNA from serologically positive samples. RNA quantity and quality were checked with a Nanodrop (Thermo Fisher Scientific Co. USA), followed by complementary DNA (cDNA) synthesis (Ahsan et al., 2020; Riaz et al., 2022).
The reaction mixture was incubated at 42 °C for 60 min and then at 70 °C for 10 min. PCR amplification was performed as a standard reaction according to Chen (2003). The primer pair CMVF-45 (5'-CCC CGG ATC CAC ATC AYA GTT TTR AGR TTC AAT TC-3') and CMVR-45 (5'-CCC CGG ATC CTG GTC TCC TT -3') was used, which amplifies a fragment of approximately 1100 bp comprising a 5' NTR and the complete CP gene. Amplified products were separated on a pre-stained agarose gel (1.0% w/v) and visualized under a UV transilluminator (Vilber Lourmat, Serial number 6532).
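The forward primer CMVF-45 contains the IUPAC degenerate codes Y (C or T) and R (A or G), so the synthesized oligonucleotide is in fact a mixture of sequences. The following minimal sketch, with a hypothetical helper name `expand_primer`, enumerates the concrete sequences implied by the primer as printed above; it illustrates the notation only and is not part of the published protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    choices = [IUPAC[base] for base in bases]
    return ["".join(combo) for combo in product(*choices)]


# Forward primer as given in the text (spaces are ignored).
CMVF_45 = "CCC CGG ATC CAC ATC AYA GTT TTR AGR TTC AAT TC"

variants = expand_primer(CMVF_45)
print(f"{len(variants)} concrete sequences of length {len(variants[0])} nt")
for seq in variants:
    print(seq)
```

Three two-fold degenerate positions give 2 × 2 × 2 = 8 variants, which is the degeneracy of the primer.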

Cloning and sequencing
The PCR amplicons were purified with the GeneJET PCR Purification Kit (Thermoscientific, USA; Cat No. K0702) and ligated into the pTZ57R/T vector (InsTAclone TM PCR cloning kit, Thermoscientific, USA; Cat No. K1214). Recombinant plasmids were transformed into chemically competent cells of the E. coli strain XL1-Blue by heat shock for 50 sec at 42 °C (Chan et al., 2013). The recombinant plasmid was extracted from transformed cells using the GeneJET Plasmid Miniprep Kit (Thermoscientific, USA) following the procedure of Ahsan et al. (2020). The recombinant plasmid was confirmed by EcoRI and HindIII restriction digestion analysis and sequenced in both orientations (Macrogen, South Korea).
Sequence homology was checked with the NCBI BLASTn tool (Zhang et al., 2000) against reported CMV isolates/strains of subgroups I (A and B) and II (Table 1). Pairwise sequence comparisons and nucleotide/amino acid identities of selected sequences were calculated with CLUSTAL W embedded in BioEdit v7.2.6.1 (Thompson et al., 1994; Kumar et al., 2016). The phylogeny was constructed by the Neighbour-joining method with 1000 bootstrap replicates in MEGA 6 software (Tamura et al., 2013).
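For readers unfamiliar with how a pairwise identity percentage is derived from an alignment, the calculation can be sketched as follows. This is an assumption-laden illustration: the function name is hypothetical, columns containing a gap in either sequence are skipped, and BioEdit or CLUSTAL W may treat gaps and ambiguous bases differently, so the figures need not match those tools exactly.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical positions between two aligned sequences.

    Columns with a gap in either sequence are ignored (an assumption;
    other tools may count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (gap-free) columns")
    return 100.0 * matches / compared


# Toy aligned fragments, for illustration only (not CMV data).
print(percent_identity("ATGCATGC", "ATGCATGA"))
print(percent_identity("ATG-AT", "ATGCAA"))
```

The uncorrected evolutionary distance (p-distance) over the same columns is simply 1 minus the identity expressed as a fraction.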

Recombination and selection pressure analysis
Nucleotide diversity (π), number of polymorphic (segregating) sites (S), insertions and deletions (INDELs), haplotype diversity (Hd), rates of synonymous (Ks) and non-synonymous (Ka) mutations, gene flow, genetic differentiation and neutrality within each group and defined region were calculated, together with the statistical tests Fst, Z, Ks*, Snn, Tajima's D and Fu and Li's D* and F*, using DnaSP v 6.12.03 (Tajima, 1989; Fu and Li, 1993; Li and Fu, 1994; Rozas et al., 2017). The aligned CMV sequences were screened for recombination events with all the standard methods, using the default settings, implemented in the RDP4 package (Martin et al., 2017). Recombination breakpoints were considered significant only if they were supported by four or more methods.
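To make the summary statistics above concrete, the sketch below computes S, the average number of pairwise differences (k), π and Tajima's D for a small gap-free alignment, using the standard formulas of Tajima (1989). It is a didactic illustration with hypothetical function names and toy sequences; it does not reproduce DnaSP's handling of gaps, missing data or multiple mutations per site, and it does not assess statistical significance.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """S: number of alignment columns showing more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """k: average number of nucleotide differences over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """pi: k divided by the number of aligned sites."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D = (k - S/a1) / sqrt(e1*S + e2*S*(S-1))."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    k = mean_pairwise_differences(seqs)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignment, for illustration only (not CMV data).
toy = ["ATGCATGCAT", "ATGCATGCAA", "ATGGATGCAT", "ATGGATGCAA"]
print("S  =", segregating_sites(toy))
print("k  =", round(mean_pairwise_differences(toy), 4))
print("pi =", round(nucleotide_diversity(toy), 4))
print("D  =", round(tajimas_d(toy), 4))
```

A positive D arises when k exceeds S/a1, that is, when variants sit at intermediate frequencies, which is the pattern usually read as balancing selection or population contraction.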

RT-PCR and sequence analysis
The CMVF-45/CMVR-45 primer pair (Chen, 2003) amplified the expected 1100 bp DNA fragment from ELISA-positive samples (Figure 2). Sequencing confirmed that CMV was present in the DAS-ELISA-positive samples. Phylogenetic analysis revealed that all the isolates grouped into three well-separated clusters, in accordance with the sequences of their close relatives in the same subgroups (IA, IB and II) (Figure 3). The two new isolates, AAHAP and AARS, formed distinct clades together with established reference isolates of subgroup II, i.e. SC-J1 from China, Kin from the UK, Q and strain 241 from Australia, Trk7 from Hungary, SpK from Iran, and LS from the USA, which is consistent with previous studies (Chaumpluk et al., 1996; Gal-On et al., 1996; Haq et al., 1996; Boccard and Baulcombe, 1993; Roossinck et al., 1999; Takanami et al., 1999). The phylogeny indicates that pea and spinach in the Pothwar region of Pakistan are infected solely by CMV subgroup II rather than subgroup I.

Selection pressure and recombination analysis
Sequence polymorphisms were examined in 660 bp of sequence, covering the complete CP cistron, across 19 CMV isolates. A total of 217 polymorphic (segregating) sites (S) were observed, of which 41 were singleton variable sites and 175 were parsimony-informative sites, with an Eta of 264. Overall nucleotide diversity (π) across the 19 isolates studied was 0.14073, and a total of 6 INDEL events were observed, all of them in non-Pakistani isolates. Haplotype diversity analysis revealed a total of 19 haplotypes, with a haplotype diversity (Hd) of 1.00 across the 19 isolates. Nucleotide diversity analysis among the populations revealed 59 segregating sites with a π of 0.06218 in the Pakistani isolates, compared with 208 (S) and 0.014063 (π) in the isolates reported from other parts of the world. The average number of nucleotide differences (k) was 40.667 and 92.035 in isolates reported from Pakistan and from other parts of the world, respectively.
The evolutionary distance of the Pakistani CMV isolates was 0.0657 with each other and 0.0574-0.2964 with isolates from other countries. The AAHAP isolate had its lowest evolutionary distance, 0.0589, from the Iranian SpK isolate, while the AARS had 0.0574 from Australian

Gene flow and genetic differentiation analysis
The coefficient of gene differentiation (Fst) between Pakistani and other reported CMV isolates was 0.30478 (below the standard value of 0.33), suggesting frequent gene flow. In the genetic differentiation analysis, the values of three permutation-based statistical tests, viz. Z (84.3011), Snn (0.82456) and Ks* (4.04042), were non-significant (Balasubramanian and Selvarajan, 2014). Tajima's D, Fu and Li's F* and Fu and Li's D*, tests commonly used to detect sequences that do not fit the neutral model of equilibrium between genetic drift and mutation, had values of 2.02535, 0.01468 and 0.71862, respectively, none of which was statistically significant (Tajima, 1989; Ramírez-Soriano et al., 2008), indicating that the CMV population is under balancing selection.
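The link between Fst and gene flow can be illustrated with a short sketch. It assumes a widely used sequence-based estimator, Fst = 1 - Hw/Hb (mean within-population over mean between-population pairwise differences), and the haploid island-model relation Nm = (1/Fst - 1)/2; whether these match the exact settings used in DnaSP for this study is an assumption, and the function names and toy sequences are hypothetical. Under that relation, Fst = 1/3 corresponds to Nm = 1, which gives one reading of the 0.33 threshold.

```python
from itertools import combinations, product


def differences(x, y):
    """Number of mismatching positions between two aligned sequences."""
    return sum(a != b for a, b in zip(x, y))


def mean_within(pop):
    """Mean pairwise differences among sequences of one population."""
    pairs = list(combinations(pop, 2))
    return sum(differences(x, y) for x, y in pairs) / len(pairs)


def sequence_fst(pop1, pop2):
    """Fst = 1 - Hw/Hb for two populations of aligned sequences."""
    hw = (mean_within(pop1) + mean_within(pop2)) / 2.0
    between = list(product(pop1, pop2))
    hb = sum(differences(x, y) for x, y in between) / len(between)
    return 1.0 - hw / hb


def nm_from_fst(fst):
    """Assumed haploid island-model relation: Nm = (1/Fst - 1) / 2."""
    return (1.0 / fst - 1.0) / 2.0


# Toy populations, for illustration only (not CMV data).
pop_a = ["AAAA", "AAAT"]
pop_b = ["TTTT", "TTTA"]
print("toy Fst =", round(sequence_fst(pop_a, pop_b), 4))

# Fst value reported in the text, converted under the assumed relation.
print("Nm at Fst = 0.30478:", round(nm_from_fst(0.30478), 4))
print("Nm at Fst = 1/3:", round(nm_from_fst(1.0 / 3.0), 4))
```

Because Nm falls as Fst rises, an Fst slightly below 0.33 corresponds to slightly more than one migrant per generation under these assumptions.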

Discussion
Cucumber mosaic virus is a notorious virus with a broad host range and is reported here infecting spinach for the first time in Pakistan. CMV has already been reported to cause losses in spinach crops grown in Brazil (Yuki et al., 2017), Greece (Fotopoulos et al., 2011), the USA (Yang et al., 1997), and the UK (Bailiss and Okonkwo, 1979). CMV is widely distributed in the country, infecting nearly all the vegetables grown, i.e. pea (Ahsan and Ashfaq, 2018), chilli (Iqbal et al., 2012), tomato (Akhtar et al., 2008) and melon (Malik et al., 2010). Extensive surveys revealed a CMV disease incidence of 38.2% in pea and 20.48% in spinach during 2016. In 2017, a relatively higher incidence was observed, i.e. 42.17% and 26.09% in pea and spinach, respectively. An increasing incidence of CMV in the succeeding survey year was also observed by Iqbal et al. (2012), Malik et al. (2010) and Akhtar et al. (2008). The broad host range, together with the frequent occurrence and prevalence of CMV in the Pothwar region of Pakistan, has had a significant impact on efficient vegetable production (Ashfaq et al., 2017). TAS-ELISA, as used for serotype prevalence studies of CMV by Hosseinzadeh et al. (2012), revealed the presence of subgroup II in the Pothwar region. Our serotyping results support the hypothesis presented by Hord et al. (2001) that subgroup I isolates dominate in tropical and subtropical zones, while temperate regions such as Pothwar are home to subgroup II.
CMV is an economically important constraint on the production of field crops, vegetables, and ornamentals in Pakistan (Ashfaq et al., 2017; Iqbal et al., 2012). However, little information is available on the molecular characteristics of natural populations of CMV infecting vegetables in Pakistan. The Pakistani CMV isolates were highly homologous to each other and to isolates within the same subgroup based on the CP gene sequence.
Published studies have indicated a high degree of coat protein gene sequence homology among CMV isolates of the same subgroup sampled from a single crop (Ahsan and Ashfaq, 2018), as well as among its global populations (Roossinck et al., 1999; Takanami et al., 1999). The Pakistani isolates reported in this study share substantial homology with each other, and their low π value agrees with the phylogeny, in which both isolates fall in subgroup II; despite this high homology, however, they cluster in distinct clades within the subgroup II clade.
The phylogenetic relationships confirm the presence of genetic variation in the Pakistani isolates, as they are related to isolates from different geographical locations, i.e. Australia, the UK, the USA, Japan and Hungary. Clustering of CMV isolates by subgroup, and the formation of clusters with representative isolates, has also been reported by others (Roossinck et al., 1999; Ohshima et al., 2016). Besides mechanical means such as routine handling, especially at transplanting, pruning and harvesting (Dragich et al., 2014), cucumoviruses are transmitted over long distances by infected plantlets and seeds (Yang et al., 1997; Ali and Kobayashi, 2010). Pakistan imports most of its vegetable seeds from other countries, e.g. the Netherlands, India and China, and there is a need to strengthen quarantine measures in Pakistan to check the movement of seed-borne plant pathogens.
As part of the evolutionary process, variation arises in the genetic makeup of organisms through the addition of new alleles by gene flow, mutation and other mechanisms. Gene flow between sub-populations may be estimated with Nm (the number of migrants) and Fst (the degree of genetic differentiation). An Fst value of 0.33 may be considered a threshold below which gene flow is frequent and above which it is infrequent (Zu et al., 2019). This study indicates frequent gene flow with moderate population structuring, as the coefficient of gene differentiation (Fst) was 0.30478, below the standard value of 0.33. Moreover, the positive values of the neutrality tests, i.e. Tajima's D, Fu and Li's F* and Fu and Li's D*, indicate that the CMV population is under balancing selection, with an excess of high-frequency variants and a contraction of the population (Tajima, 1989; Fu and Li, 1993; Li and Fu, 1994). Evolutionary distance describes the divergence of homologous sequences from their common ancestors (Rosenberg et al., 2005). In this study, the high sequence identity, close phylogenetic relationship and low evolutionary distance between the Pakistani isolates and the Iranian isolate are consistent with their geographical association, while the similarly close relationship with the Australian isolate indicates demographic expansion; these results are in agreement with Moury (2004) and Hasiów-Jaroszewska et al. (2017). The findings of the present study will help plant breeders to develop CMV-resilient varieties and to forecast the chance of resistance breakdown in pathogen-mediated resistant transgenic lines of vegetable crops. This research is the first evidence of genetic variability of CMV in spinach from Pakistan. Continued research is needed to identify the possible impact of this emerging class of threatening plant viruses, alone or in combination with other viruses, on the quality and quantity of crucifers and other vegetables.

Conclusion
CMV is a notorious pathogen with a broad host range that causes colossal losses to vegetable crops worldwide. In the present study, CMV was detected in both spinach and pea at all sampling sites of the Pothwar region, with an overall disease incidence of 30.71% during 2016-17. The sequences of the two newly identified CMV isolates comprised a complete CP gene of 657 bp along with some portion of the 5' and 3′UTR. Evolutionary distance and phylogenetic analysis revealed that both isolates are closely related to Iranian and Australian subgroup II isolates. Frequent gene flow with low nucleotide diversity was detected in the CMV population. Moreover, positive values of the statistical tests indicate balancing selection in the CMV population studied. The first detection of CMV infection in spinach, together with the relatively high disease incidence of subgroup II isolates, signals an alarming situation for the successful production of vegetable crops. Management approaches such as breeding programmes that identify sources of resistance on the basis of these genetic variation findings could help to manage this menace.