Analysis of Germin-like protein genes family in Vitis vinifera (VvGLPs) using various in silico approaches

Germin-like proteins (GLPs) play an important role against various stresses. The Vitis vinifera L. genome contains 7 GLPs, many of which are functionally unexplored; computational analysis may provide important new insight into their function. Here, the physicochemical properties, subcellular localization, domain architectures, 3D structures, N-glycosylation and phosphorylation sites, and phylogeny of the VvGLPs were investigated using the latest computational tools. Their functions were predicted using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and Blast2GO servers. Most of the VvGLPs were extracellular (43%) in nature but also showed periplasmic (29%), plasma membrane (14%), and mitochondrial- or chloroplast-specific (14%) expression. The functional analysis predicted unique enzymatic activities for these proteins, including terpene synthase, isoprenoid synthase, lipoxygenase, phosphate permease, receptor kinase, and hydrolase activities, generally mediated by the Mn2+ cation. The VvGLPs showed similarity in the overall structure, shape, and position of the cupin domain. Functionally, VvGLPs control and regulate the production of secondary metabolites to cope with various stresses. Phylogenetically, VvGLP1, -3, -4, -5, and -7 showed greater similarity due to duplication, while VvGLP2 and VvGLP6 revealed a distant relationship. Promoter analysis revealed the presence of diverse cis-regulatory elements, among which the CAAT box, MYB, MYC, and unnamed-4 were common to all promoters. The analysis will help to utilize VvGLPs and their promoters in future food programs by developing cultivars resistant to various biotic (Erysiphe necator, powdery mildew, etc.) and abiotic (salt, drought, heat, dehydration, etc.) stresses.

their promoters (Das et al., 2019; Ilyas et al., 2016a), which provided important new insights into their structure and function against diverse stresses. Recently, 258 GLP genes were identified in the wheat genome through various in silico approaches, and their role against Blumeria graminis f. sp. tritici (Bgt) was confirmed (Yuan et al., 2021).
Vitis vinifera contains seven GLP genes, among which the function of VvGLP3 against powdery mildew was previously predicted using various computational tools (Ahmad et al., 2019). In a separate study, they were also shown to be effective against Erysiphe necator and powdery mildew (Godfrey et al., 2007). Based on these studies, we deduced that these genes may have additional, still unexplored abilities against various stresses. Given the importance of VvGLPs in various stress-related processes, the present approach was adopted to investigate the structural and functional properties of the 7 VvGLP genes and their promoters using various computational tools. The analysis will help to utilize them in various crop improvement programs.

Database search & sequence retrieval
The nucleotide sequences of the 7 VvGLP genes were downloaded from the Ensembl Plants database (Release 52, Dec 2021, © EMBL-EBI) (Bolser et al., 2017) and translated into protein sequences with the EMBOSS Transeq tool, accessed via the EMBL (European Molecular Biology Laboratory) server. The protein domain of each selected gene was verified using the NCBI Conserved Domain Search (CD-Search) tool (Lu et al., 2020).
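The translation step can be illustrated with a minimal sketch. It assumes the standard genetic code and a coding sequence that starts in frame 1; it is not the EMBOSS Transeq implementation, and the function name `translate` and the toy sequence in the usage note are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). Assumption: nuclear standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1; '*' marks a stop codon.

    Codons containing ambiguous bases are reported as 'X', and trailing
    bases that do not form a complete codon are ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)
```

For example, `translate("ATGGCTTAA")` gives `"MA*"` (Met, Ala, stop).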

Multiple sequence alignment & motif analysis
Multiple sequence alignment of the selected protein sequences was carried out with the Clustal Omega program (Sievers and Higgins, 2018) to uncover their common features. The aligned sequences were searched for the germin motifs (GER box), the signal peptide, and the KGD motif. To gain further insight into motif structure and organization, the sequences were scanned for 5 possible motifs with the "Multiple Em for Motif Elicitation" (MEME) software (Version 5.4.1) (Bailey et al., 2015). The function of each motif was predicted with the Motif Scan and Motif Search tools.
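A simple pattern search for the germin box and the KGD-like tripeptide can be sketched as follows. The consensus GxxxxHxHPxAxEh is the one reported for GLPs (Dunwell et al., 2004); the set of residues treated as hydrophobic ("h"), the function name `find_motifs`, and the toy peptide are assumptions made only for illustration, and this exact-match scan is not a substitute for MEME.

```python
import re

# Assumption: residues counted as hydrophobic for the "h" position.
HYDROPHOBIC = "AVLIMFWYC"

# Germin box consensus GxxxxHxHPxAxEh, where x is any residue.
GERMIN_BOX = re.compile(rf"G.{{4}}H.HP.A.E[{HYDROPHOBIC}]")

# KGD/KGE or, less often, RGD tripeptide.
KGD_LIKE = re.compile(r"KG[DE]|RGD")


def find_motifs(protein: str, pattern: re.Pattern) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each hit."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein.upper())]


# Toy peptide (hypothetical, not a VvGLP sequence).
toy = "MKGPRGTHTHPRATEILVKGDAA"
print(find_motifs(toy, GERMIN_BOX))  # germin box hits
print(find_motifs(toy, KGD_LIKE))    # KGD/KGE/RGD hits
```

Because `finditer` reports non-overlapping matches only, overlapping occurrences of a pattern would need a lookahead-based variant.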
Modern computational tools such as Ensembl Plants (Bolser et al., 2017), BioSeq-Analysis (Kearse et al., 2012), Pse-in-One (Liu et al., 2015), Pse-Analysis (Liu et al., 2017), LibD3C (Lin et al., 2014), MRMD (Hollebeek et al., 1999; Nagy et al., 2014; Tong et al., 2012), RiceXPro (Sato et al., 2010), and PlantPAN 3.0 (Chow et al., 2019) provide an opportunity to explore various structural and functional properties of a DNA or peptide sequence. Such studies help to predict the structure, function, and enzymatic properties of a gene or promoter with considerable confidence. Functional genomics techniques coupled with modern computational tools are widely used for the functional analysis of genes in plant molecular biology. These studies are not only economical but may also provide significant new insight into the function and regulation of genes, which may then be used in transgenic approaches to cope with various stresses.
So far, numerous (almost 350) germin (GER) and germin-like protein (GLP) genes have been recognized in diverse plant genomes (Ilyas et al., 2016b). Many of them and their promoters have been subjected to computational studies to predict their function and regulatory mechanisms. Previously, several GLPs from Oryza sativa (12 genes) (Davidson et al., 2010), Glycine max (21 genes) (Lu et al., 2010), and Eucalyptus grandis (EgGLP) (Sassaki et al., 2015) were first characterized computationally by studying their domain architectures, enzymatic activities, and phylogenetic relationships, and then functionally confirmed through different experimental approaches. The same methodology was adopted for 43 OsGLP genes (Ilyas et al., 2020).
N-glycosylation and phosphorylation sites were predicted with the NetNGlyc 1.0 (Gupta and Brunak, 2002) and NetPhos 3.1 (Blom et al., 2004) servers, respectively. Similarly, signal peptides were predicted using the "SignalP 4.1" server (Armenteros et al., 2019). The SWISS-MODEL server (Waterhouse et al., 2018) was used to predict a structural (3D) model for each peptide. The quality of each model was confirmed with the Rampage (RPA) server (DasGupta et al., 2015). MEGAX (Molecular Evolutionary Genetics Analysis tool, Ver. 10) was used to investigate their phylogenetic relationships with the neighbor-joining tree-building method.

Functional analysis
Possible functional roles for each gene were deduced with the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, Ver. 11) server (Szklarczyk et al., 2019), which considers the homology and co-expression pattern of a gene with related, previously reported genes. In a similar approach, the genes were also scanned with the Blast2GO (Ver. 4.1) server to predict their possible functions (Conesa and Götz, 2008).

Promoter analysis
A 1000 base pair (1 kb) promoter region upstream of each VvGLP gene was downloaded from the Ensembl Plants server to gain more insight into its regulatory mechanism. The promoter sequences were scanned for putative regulatory elements with the PlantPAN 3.0 software (Chow et al., 2019). Abundant, common, and unique elements were identified. For each promoter, the copy number, sequence, position, and proposed function of each element were recorded, and the data were presented in graphical as well as tabulated form.
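The idea of scanning both strands of a promoter for a cis-element can be sketched with an exact-match search. This is a deliberate simplification: PlantPAN relies on curated binding-site models rather than plain string matching. The function names and the toy sequence are hypothetical; the motif strings are the CAAT box core and the STRE consensus AGGGG.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def positions(seq: str, motif: str) -> list[int]:
    """1-based start positions of all (possibly overlapping) exact matches."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = seq.find(motif, start + 1)
    return hits


def scan_promoter(promoter: str, motifs: dict[str, str]) -> dict[str, dict[str, list[int]]]:
    """Locate each motif on the sense strand and on the antisense strand.

    Antisense hits are found by searching for the motif's reverse
    complement and are reported in sense-strand coordinates.
    Note: a palindromic motif is reported on both strands.
    """
    seq = promoter.upper()
    return {
        name: {
            "sense": positions(seq, motif),
            "antisense": positions(seq, reverse_complement(motif)),
        }
        for name, motif in motifs.items()
    }


# Toy 20 bp fragment (hypothetical), not a real VvGLP promoter.
print(scan_promoter("TTCAATGGAGGGGCCCCTAA", {"CAAT box": "CAAT", "STRE": "AGGGG"}))
```

Counting the entries in each list gives the copy number per strand, which is the kind of per-promoter tally reported in the promoter tables.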

Gene sequence retrieval
The nucleotide sequences of the 7 monocupin VvGLP genes were retrieved from the Ensembl Plants database; their accession numbers, amino acid and nucleotide counts, and chromosomal coordinates are presented in Table S1 (Supplementary Material). GLPs play an important role in various plant processes, but their number varies greatly among plant species: for example, 48, 43, 32, 26, and 258 GLPs were found in barley, rice, Arabidopsis, maize, and wheat, respectively. Vitis vinifera possesses 7 GLPs, all of which are situated close to each other on chromosome 14 except VvGLP6, which is found on chromosome 11. This clustering may be due to duplication on this chromosome during evolution, which generated multiple copies of the same gene at a specific locus to cope with various stresses. Duplication is common among the GLPs of many crops, as previously reported in several studies (Ilyas et al., 2020; Zimmermann et al., 2006; Lu et al., 2010; Davidson et al., 2010). In wheat, the 258 GLPs form tandemly duplicated gene clusters on specific chromosomes (specifically on 4B) (Yuan et al., 2021). The genes with the highest and lowest numbers of nucleotides (nt) were VvGLP3 (1,023 nt) and VvGLP2 (217 nt), respectively. Similarly, the highest and lowest numbers of amino acids were noted for VvGLP3 (225) and VvGLP5 (207), respectively. This small difference may reflect a similar exon structure that arose through tandem duplication events, as already reported for GLPs in Arabidopsis (Membré et al., 2000), barley (Zimmermann et al., 2006), soybean (Lu et al., 2010), rice (Davidson et al., 2010; Ilyas et al., 2020), and wheat (Yuan et al., 2021).

Multiple sequence alignment & motif analysis
The peptide sequences of the VvGLP genes were aligned to uncover similarities and variations in their domain architecture and functional features (Figures 1 and 2). A signal peptide was located at the start of each sequence; it is essential for entry into the secretory pathway in both prokaryotes and eukaryotes (von Heijne, 1990). The basic cupin domain, or germin box, was represented by three germin motifs (A, B, and C), which are considered the fundamental structural feature of the GLPs. It has the consensus sequence GxxxxHxHPxAxEh, where "x" represents any amino acid (aa) and "h" a hydrophobic aa (Dunwell et al., 2004). Motif A was found at the beginning of the germin box (≈23-36 aa), while motifs B and C were found ≈105-120 and ≈152-167 aa from the amino-terminal end (N-end) of each protein, respectively. The putative KGE/KGD, or sometimes RGD, sequence was located upstream of GER motif C. It is one of the essential features of GLP structure, shared by almost half of the GLP sequences but not found in GER genes (Barman and Banerjee, 2015). It is not only involved in protein-protein interaction but also acts as a receptor in the extracellular matrix (ECM), helping plant proteins to interact with other proteins or exchange information in order to perform their function (Bernier and Berna, 2001). Such mechanisms play a fundamental role in protecting plants from pest and pathogen attack, acting through cell wall adhesion, plasma membrane strengthening, or prevention of fungal toxin penetration.
Its presence in the GLPs of different plant species, such as Medicago truncatula (Doll et al., 2003), sunflower (Helianthus annuus) (Beracochea et al., 2015), Prunus salicina (El-Sharkawy et al., 2010), soybean (Glycine max) (Langenbach et al., 2016), cotton (Gossypium hirsutum) (Kim and Triplett, 2004; Pei et al., 2019), Oryza sativa (Ilyas et al., 2020), and Craterostigma plantagineum (Giarola et al., 2020), is widely reported. All the VvGLP proteins have the KGD motif except VvGLP2, where it is replaced by KGE, possibly due to mutation. The presence of this motif in VvGLPs indicates their role as plant defense proteins and its evolutionary importance for the structure and function of these genes. It suggests not only the functional similarity of the VvGLPs but also their similar phylogenetic history, and it indicates a better ability to interact with other membrane proteins.
MEME (Bailey et al., 2015) was used for the analysis to predict 5 possible motifs in the VvGLPs, represented with colored blocks. The functional role of the conserved motifs was checked with the "Motif Scan" and "Motif Search" tools. Motif width and sequence are given in the figure. M: Motif, S4: Small RNA-binding protein domain, N/A: Not available.

MEME analysis revealed 5 motifs in the VvGLPs, of which motifs (M) 1, 2, and 4 represent the basic germin motifs that are the fundamental structural feature of GLPs. Motif length ranged from 21 to 50 amino acids, and the number of motifs per protein ranged from 3 to 5. Overall, VvGLP1, -3, -4, -5, and -7 showed a similar motif arrangement, while VvGLP2 lacks motif 1 (M1) and VvGLP6 lacks M1 and M5. Phenomena such as duplication, mutation, deletion, and insertion may be responsible for the variation in motif number and position. The cupin domain has a variety of functions against different biotic (fungal, bacterial, viral, etc.) and abiotic (salt, drought, heat, light, wounding, etc.) stresses (Ilyas et al., 2016b). Besides the basic cupin domain, additional functional domains were also found, among which M3 is common to all peptide sequences. It is involved in N-glycosylation, phosphorylation, casein kinase II-, and S4 domain (small RNA-binding protein domain)-related activities. Protein N-glycosylation, mediated by N-glycans, plays a major role in protein quality control and folding inside the lumen of the endoplasmic reticulum (ER). It is also important in plant development, cellulose biosynthesis, immune responses, and growth under various biotic or abiotic stresses (Nagashima et al., 2018). Similarly, protein phosphorylation is an important and well-studied post-translational modification that regulates almost all cellular processes through mechanisms such as altering protein interactions, localization, and conformation (Strumillo et al., 2019). In the same way, casein kinase II regulates circadian rhythm, hormone signaling, and various plastid-related functions in plants (Ogiso et al., 2010).
Similarly, the S4 domain is a multifunctional protein domain that plays an important role in transcription and translation. It is also part of an important transmembrane protein that regulates the movement of various molecules and ions across the cell membrane (Jegla et al., 2018). The presence of these motifs in VvGLPs suggests their importance and role in these processes.

Physicochemical properties
VvGLPs exhibited notable variation in various physicochemical properties (Table 1). Their molecular weight (M.wt) and size showed comparable variation along with the other studied properties. The observed M.wt values ranged from 21525.13 to 243226.95 Dalton (Da), noted for VvGLP3 and VvGLP6 respectively, showing variation in the structure and physicochemical properties of these proteins. Similarly, the lowest and highest recorded values of the isoelectric point (pI) were 4.90 (VvGLP3) and 8.53 (VvGLP2); the pI is important for estimating protein solubility and for electrophoretic separation (Audain et al., 2016). For the extinction coefficient (EC) at 280 nm, the lowest value was shown by VvGLP2 (1490 M-1 cm-1) and the highest by VvGLP3 (18450 M-1 cm-1). A high EC value is largely due to a large number of tryptophan (Trp), phenylalanine (Phe), and tyrosine (Tyr) residues in a protein (Adeloye and Ajibade, 2011). The instability index (II), an important indicator of protein stability, ranged from 17.51 (VvGLP4) to 29.21 (VvGLP6). Protein stability in the cellular environment is crucial to proper function (Gamage et al., 2019). All the VvGLPs are predicted to be stable at the cellular level owing to their low II values (below 40), which makes them suitable candidates for in vitro expression studies. The aliphatic index (AI) values ranged from 93.00 (VvGLP7) to 105.05 (VvGLP6). A high AI indicates higher thermal stability of the protein in the cellular environment and depends largely on the percentage (%) of aliphatic amino acids in each protein (Ikai, 1980). VvGLP1 (100), VvGLP3 (103), and VvGLP6 (105) showed the highest AI values, which indicates their stability over a wide range of temperatures. Proteins with lower AI values, by contrast, are expected to be structurally more flexible, since the index is largely determined by the proportion of amino acids with aliphatic side chains.
The highest and lowest values of negatively charged residues were noted for VvGLP6 (12) and VvGLP4 (19), respectively. Similarly, the total number of positively charged residues ranged from 11 (VvGLP5) to 19 (VvGLP4). Charged residues (positive or negative) affect multiple properties of natural proteins, such as stability, interaction with other molecules, conformation, and thermal adaptation, which is helpful in protein functional analysis and design (Berezovsky et al., 2007). All VvGLPs showed positive GRAVY values, revealing that all of them are hydrophobic in nature. The importance of hydrophobic proteins in driving various biological processes is widely accepted (Rego et al., 2021). However, this observation contrasts with the previous study of GLPs in rice, where all proteins were hydrophilic (Ilyas et al., 2020). The analysis revealed that all VvGLPs exhibit comparable variation in physicochemical properties except VvGLP6, which showed high M.wt, II, AI, and negatively charged residues; this may be due to its presence on a separate chromosome, which gave rise to its more distinct properties.
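Two of the indices discussed above can be computed directly from a protein sequence, which makes their interpretation concrete. The sketch below uses the aliphatic index of Ikai (1980), AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)] with X in mole percent, and the Kyte-Doolittle hydropathy scale for GRAVY. The tool actually used for Table 1 is not named in this section, so the function names and the toy peptide are assumptions for illustration only.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue.

    Positive values indicate an overall hydrophobic protein. Residues
    outside the 20 standard amino acids are skipped (an assumption).
    """
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids in sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def aliphatic_index(protein: str) -> float:
    """Aliphatic index (Ikai, 1980) from mole percentages of A, V, I, and L."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Toy peptide (hypothetical): entirely aliphatic, so both indices are high.
print(round(aliphatic_index("AVIL"), 2), round(gravy("AVIL"), 3))
```

A protein rich in Ala, Val, Ile, and Leu therefore scores a high AI, which is the basis for reading AI values of about 100 or above as an indication of thermal stability.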

Sub-cellular localizations
At the cellular level, VvGLPs are mainly expressed in the extracellular (43%) (VvGLP1, -3, -4, -5, -6) or periplasmic (29%) (VvGLP1, -4, -5, -7) regions. However, they may also be expressed in mitochondria (VvGLP2), chloroplast, or plasma membrane (VvGLP7). This indicates a possible role of these genes in these compartments, mainly related to photosynthesis, lipid or protein metabolism, and various other biochemical processes of the cell. Such diverse expression patterns may have evolved because of the specific functions of these genes in these cellular compartments. The finding is similar to the earlier study of OsGLPs, where most of the genes showed either extracellular (57%) or plasma membrane-specific (33%) expression. However, it contrasts with the study of 258 TaGLPs, which were mostly expressed in secretory pathways (Yuan et al., 2021). Such diverse expression patterns were also noted for the GLP families of rice, soybean, and barley (Dunwell et al., 2008; Lu et al., 2010; Zimmermann et al., 2006). Some of the genes showed organelle-specific expression, which is similar to previous results for Arachis hypogaea GLPs, where various members of the family showed cytoplasm- (AhGLP4), cell wall- (AhGLP2, AhGLP6), cell wall and cytoplasm- (AhGLP3, AhGLP5), or cytoplasm-, plasma membrane-, and cell wall-specific (AhGLP1 and AhGLP6) expression during transient expression studies in onion cells (Wang et al., 2013). Similarly, in another study, AtGLP4 showed the highest expression in the Golgi complex (Yin et al., 2009). Recently, a similar expression pattern was also detected in an in silico study of the OsGLPs, where OsGLP1-2 showed chloroplast-, OsGLP3-7 endoplasmic reticulum-, and OsGLP3-1 mitochondrial-specific expression (Ilyas et al., 2020).

N-glycosylation and phosphorylation sites
Variable numbers of N-sites (1-3) and P-sites (3-10) were observed in VvGLPs (Table 1). N-sites were mostly found 18-26 amino acids from the amino-terminus (N-terminal end) of the protein. The highest number of N-sites (3 sites) was shown by VvGLP4 and the lowest (1 site) by VvGLP3 and VvGLP6. Variation in N-sites indicates structural and functional diversity.
Glycosylation of the amino-terminus affects the conformation, stability, interactions, and various enzymatic functions of the peptide in various parts of the plant (Rayon et al., 1998). VvGLPs also showed variability in P-sites (3-10), with the highest number found in VvGLP4 (10) and the lowest in VvGLP6 (3). VvGLP2 and VvGLP3 have 7, while VvGLP1 and VvGLP5 contain 6 P-sites. Various plant processes depend on cell signaling, including cell propagation, metabolism, cell differentiation, and programmed cell death (apoptosis), which is controlled and regulated by phosphorylation at serine, tyrosine, or threonine residues of the protein (Blom et al., 2004). The presence of N-glycosylation and phosphorylation sites in VvGLPs suggests their role in several cellular, physiological, and metabolic processes. Recent studies have greatly emphasized the role of glycosylation and phosphorylation in the structure, properties, and overall importance of proteins in various plant processes (Wu et al., 2021).
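Candidate N-sites can be located with a simple search for the N-glycosylation sequon Asn-X-Ser/Thr, where X is any residue except proline. This is only a first-pass filter under that assumption: NetNGlyc scores each sequon with trained models, so not every sequon found this way would be reported as glycosylated. The function name and the toy peptide are hypothetical.

```python
import re

# N-X-[S/T] sequon with X != Pro. The lookahead keeps the match to the
# Asn itself, so overlapping sequons (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy peptide (hypothetical): N3 and N12 qualify, N7 is followed by Pro.
print(find_sequons("MANGSKNPTAANVT"))
```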

3D structural analysis
Protein structure prediction through alignment and BLAST searches against known protein structures in online databases is a common strategy in plant computational biology (McGuffin et al., 2013). The SWISS-MODEL server gave 3 models for each VvGLP protein (Figure S1 in supplementary data). GMQE (Global Model Quality Estimation) and QMEAN Z-scores were used as criteria for selecting the best model. Ramachandran plot analysis (RPA) showed that, on average, more than 90% of the amino acids of each protein lie in the favored region, 8.90% in the allowed region, and 1% in the outlier region, which supported the quality of each model. RPA is considered a standard procedure for validating protein models and is widely used in such studies (DasGupta et al., 2015). It can also be used to explore additional characteristics of protein models (Abel et al., 2020). Overall, VvGLPs are highly conserved in structure, showing great similarity in the overall position and shape of the cupin domains. Each protein comprises six germin monomers bound by six Mn2+ ions, forming a hexameric structure. The result is similar to previously studied GLPs in various crops (Breen and Bellgard, 2010; Ilyas et al., 2020). Their similar structure suggests a common evolutionary history. Many of these genes (VvGLP1, -2, -3, -4, -5, and -7) are located close to each other on chromosome 14, suggesting that they originated via tandem duplication during evolution. Such findings have been previously reported for GLPs in various plants, including rice, maize, barley, and soybean (Davidson et al., 2010; Ilyas et al., 2020; Lu et al., 2010; Zimmermann et al., 2006). Although VvGLP6 is located on chromosome 11, its structure is similar to that of the other family members, which may reflect a common origin followed by retrotransposon activity on this chromosome. However, it showed variation in other properties.

Phylogenetic analysis
The phylogenetic study revealed great similarity among the VvGLPs and a narrow genetic background, with a branch length scale of 0.05% (Figure 3), suggesting little variation in their protein sequences. This indicates that these genes might have evolved very recently on an evolutionary timescale, mainly through duplication. VvGLP1 and VvGLP5 showed the greatest similarity, while the most distant relationship was observed for VvGLP2 and VvGLP6. The distant relationship of VvGLP6 can be explained by its presence on a separate chromosome (Chr 11), which may be due to retrotransposon activity during evolution. The distant relationship of VvGLP2, however, may be due to a greater mutation rate, which gave rise to changes in its overall domain structure. A close relationship was observed for VvGLP1, -3, -4, -5, and -7, which may be due to multiple duplication events on chromosome 14. Comparative analysis of their domain architecture revealed similarity in their exon structure, which suggests evolutionary relatedness and explains their close phylogenetic relationships. All the genes showed similar domain architecture (except VvGLP2 and -6), which may be due to tandem duplication events from an ancestral DNA sequence. Similar results were previously reported in rice, barley, and maize, and recently in wheat (Yuan et al., 2021). GLPs often exist in clusters on specific chromosomes, which arose through multiple tandem duplication events. Sometimes such clustered loci offer significant resistance against diverse environmental constraints. A cluster of GLPs on chromosome 8 was found to be effective against B. graminis in barley (Zimmermann et al., 2006). Similarly, 12 OsGLPs provided resistance against Magnaporthe oryzae and Rhizoctonia solani in rice (Manosalva et al., 2009). Such phenomena were also reported in soybean (Lu et al., 2010), maize (Breen and Bellgard, 2010), and peanut (Wang et al., 2013).
Previously, VvGLP3 was shown to be effective against powdery mildew at both the computational (Ahmad et al., 2019) and experimental level (Godfrey et al., 2007). Its close relationship with VvGLP1, -4, -5, and -7 suggests that they have similar properties, which need to be examined against fungal pathogenicity.
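The distance measure behind the tree (p-distance with pairwise deletion, as stated in the Figure 3 legend) is simple enough to sketch. The code below computes the proportion of differing sites between two aligned sequences, skipping any column where either sequence has a gap or missing character. It is an illustration of the measure, not the MEGAX implementation; treating "-" and "?" as the characters to skip, the function names, and the toy alignment are assumptions.

```python
from itertools import combinations


def p_distance(a: str, b: str, skip: str = "-?") -> float:
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a column is ignored only if one of the two
    sequences being compared has a gap or missing character there.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in skip or y in skip:
            continue
        compared += 1
        differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """p-distance for every unordered pair of sequences in the alignment."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }


# Toy alignment (hypothetical sequences and names).
toy = {"seqA": "MKT-A", "seqB": "MRTLA", "seqC": "MKTLA"}
print(distance_matrix(toy))
```

A matrix of this kind is the input to the neighbor-joining method, which then joins the closest pairs step by step to build the tree.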

Functional analysis
Enzymatic activities and the predicted roles of VvGLPs in diverse plant processes are presented in Table 2. STRING predicted various possible roles for VvGLP1, -2, -3, and -6, but no function was predicted for VvGLP4, -5, and -7. VvGLP1 and VvGLP3 showed similar functions, mainly related to terpenoid biosynthesis. In plants, terpenoids play an important role in numerous cellular, physiological, and biochemical processes, for example the electron transport chain, photosynthesis, membrane architecture, and development (Pichersky and Raguso, 2018). They also showed lipoxygenase activity, which is important in lipid oxidation, is crucial for many developmentally regulated processes, and plays a vital role in numerous abiotic and biotic stresses (Andreou and Feussner, 2009 -García et al., 2009; Shin et al., 2008). VvGLP2 also showed a co-occurrence pattern with the thaumatin (THN) domain (InterPro ID: SM00205) family, which plays an important role in plant pathogenesis (Ruiz-Medrano et al., 1992). This observation is in accordance with the role of GLPs against diverse stresses, including protection against pests and pathogens through a variety of mechanisms (Ilyas et al., 2016b). Similarly, VvGLP6 showed enzymatic activities (receptor kinases) related to the protection of plants against pest and pathogen attack (Stegmann et al., 2017). It also has a prominent role in the response to various biotic and abiotic stresses [alpha/beta hydrolase (ABH)] (Mindrebo et al., 2016), as well as in plant immunity, growth and development, self-incompatibility, and disease resistance (receptor kinases) (Guo et al., 2011). VvGLP6 also showed a co-expression pattern with genes possessing the tetratricopeptide repeat (TPR) domain, which are strongly involved in protein-protein interaction (Perez-Riba and Itzhaki, 2019).
Engineering tolerance against diverse stresses in crops is currently highly desirable (Fahad et al., 2021b), and VvGLPs play an important role in defense against various biotic and abiotic stresses. At the cellular level, they also help other proteins to transport various biomolecules inside the plant body.
Previously, expression analysis via reverse transcriptase-polymerase chain reaction (RT-PCR) showed that VvGLPs provide significant resistance against Erysiphe necator, Plasmopara viticola, and Botrytis cinerea. The highest induction was noted for VvGLP3 against E. necator, with infection-site-specific expression. At the cellular level, the highest expression was noted in the cell wall when analyzed through transient expression studies in onion cells using the VvGLP3:GFP fusion construct. Protein isolated from transformed Arabidopsis thaliana showed strong SOD activity (Godfrey et al., 2007). Similarly, the role of VvGLP3 against powdery mildew disease was also predicted via in silico analysis (Ahmad et al., 2019). Although the current study reveals some novel enzymatic activities for these proteins, they need further experimental confirmation. The Blast2GO results, however, revealed their metal ion binding nature and non-cytoplasmic activities, which is similar to earlier findings for germin-like proteins in rice (Ilyas et al., 2020). Metal ion binding activity is mainly related to the antimicrobial nature of the protein, with its main role in plant defense against pests and pathogens (Dunwell et al., 2008). In a recent study, GLPs located on the chromosomes of the fourth homologous group in wheat were found to be effective against Blumeria graminis f. sp. tritici (Bgt) invasion (Yuan et al., 2021). Similar findings were also reported for some of the GLPs in cucumber against downy mildew (DM) disease caused by Pseudoperonospora cubensis infection (Liao et al., 2021). The analysis revealed that VvGLP1, -2, -3, and -6 are functionally more important owing to their diverse enzymatic activities, which makes them suitable candidates for further studies.
Figure 3. Phylogenetic analysis of the Vitis vinifera Germin-like protein (VvGLP) gene family. Groups are highlighted with distinct colors and parentheses. Genes located on chromosomes 14 and 11 are labeled with triangles and rectangles, respectively. The evolutionary history was inferred using the Neighbor-Joining method. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the p-distance method and are in units of the number of amino acid differences per site. This analysis involved 7 amino acid sequences. All ambiguous positions were removed for each sequence pair (pairwise deletion option). There were a total of 225 positions in the final dataset. Evolutionary analysis was conducted with the Molecular Evolutionary Genetics Analysis tool Ver. 10 (MEGAX).

Promoter analysis
Transcription factor binding site (TFBS) analysis of the VvGLP promoters revealed a total of 393 diverse regulatory elements distributed throughout the promoter regions on both the sense and antisense strands (Figure 4). Many of these elements are involved in the regulation of light response, anaerobic induction, hormonal (auxin, gibberellin, methyl jasmonate, and salicylic acid) stresses, the cell cycle, low temperature response, seed-specific activities, and circadian rhythm. Detailed information on the sequence, position, and function of these cis-regulatory elements in each promoter is presented in Table 3 as well as in the supplementary data file (Table VvGLP1 to VvGLP7). The highest number of elements was found in VvGLP6 (68 elements) and the lowest in VvGLP1 (44 elements), while VvGLP2, -3, -4, -5, and -7 contained 55, 64, 57, 53, and 52 TFBSs, respectively. Previously, similar in silico approaches were adopted for the promoter study of 5 different plant species (Mahmood et al., 2010), EgGLP (Sassaki et al., 2015), and OsGLPs (Das et al., 2019; Ilyas et al., 2016a) to gain insight into their function and regulatory mechanism. Recently, in silico promoter analysis of 3 pathogenesis-related (PR1, 2, 3) and OsRGLP1 gene promoters revealed several common cis-elements (Ohwofasa et al., 2021). The highest number of cis-elements in VvGLP6 points to its importance in various processes.
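As a quick consistency check on the counts above, the per-promoter numbers can be tallied; the sketch below uses only the values reported in this paragraph (the variable names are ours).

```python
# TFBS counts per promoter as reported in the text above.
tfbs_counts = {
    "VvGLP1": 44, "VvGLP2": 55, "VvGLP3": 64, "VvGLP4": 57,
    "VvGLP5": 53, "VvGLP6": 68, "VvGLP7": 52,
}

total = sum(tfbs_counts.values())
richest = max(tfbs_counts, key=tfbs_counts.get)
poorest = min(tfbs_counts, key=tfbs_counts.get)

print(total, richest, poorest)  # 393 VvGLP6 VvGLP1
```

The per-promoter counts sum to the reported total of 393 elements.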

Abundant elements
The CAAT box was the most abundant element in every promoter, showing the importance of this element in VvGLP regulation. Other abundant elements included the TATA box, STRE, unnamed-4, and Box4. This is similar to a previous study of the OsRGLP2 promoter from different rice accessions, where CAAT and TATA box elements were abundant (Mahmood et al., 2018), and partly comparable to the study of 43 OsGLP promoters, where Arabidopsis homeobox protein (AHBP) and TATA box were the most abundant elements (Ilyas et al., 2016a). Based on the TATA box, VvGLP promoters can be divided into TATA-containing (VvGLP1, -4, and -6) and TATA-less (VvGLP2, -3, -5, and -7) sequences. The former group is important in stress-related responses while the latter is important in cell growth or housekeeping gene functions (Bae et al., 2015). The highest number of these elements was found in the VvGLP4 and -6 promoters, which indicates their importance. Stress response elements (STRE; consensus sequence AGGGG) were found in VvGLP2, -5, -6, and -7. This element was important in heat shock induction of the AtHsp90-1 gene in transgenic Arabidopsis (Haralampidis et al., 2002) as well as in oxidative, heat, and osmotic stresses due to its ability to bind the Msn2 and Msn4 (multicopy suppressor of SNF1 mutation proteins 2 and 4) proteins (Gao et al., 2013). This shows the importance of these promoters in the development of agricultural strategies to increase crop production under heat and drought stresses (Fahad et al., 2017). Similarly, Box4 plays an important role in light response regulation (Kaur et al., 2017). It was found in all promoters except VvGLP3 and -6, with the highest number in VvGLP7 (3 copies), showing the importance of these genes in these processes.

Table 2. In silico functional analysis of the Vitis vinifera Germin-like protein (VvGLP) genes. The analysis was conducted using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (Szklarczyk et al., 2019) and Blast2Go servers (Conesa and Götz, 2008). Columns: Gene Name, STRING Analysis, Blast2Go.
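Because the elements are reported on both sense and antisense strands, a motif such as the STRE consensus (AGGGG) has to be searched together with its reverse complement. The following is a minimal sketch using exact string matching, which is a simplification and not the scoring method of the server used in this study. The function names and the toy sequence are hypothetical and are not taken from any VvGLP promoter.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str) -> list[tuple[int, str]]:
    """Exact-match scan; returns (0-based start on the sense strand, strand)."""
    seq, motif = seq.upper(), motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        # A palindromic motif is reported once, on the sense strand only.
        if window == rc and rc != motif:
            hits.append((i, "-"))
    return hits


# Hypothetical toy sequence for illustration only.
toy_promoter = "TTAGGGGACCAATCGCCCCTAA"

stre_hits = find_motif(toy_promoter, "AGGGG")  # STRE consensus from the text
caat_hits = find_motif(toy_promoter, "CAAT")   # core of the CAAT box

print("STRE:", stre_hits)
print("CAAT:", caat_hits)
```

Antisense hits are reported at the position where the reverse complement occurs on the sense strand, which keeps both strands in a single coordinate system.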

Common cis-regulatory elements
Common cis-elements included the CAAT box, MYB, MYC, and unnamed-4, showing their importance in various biological processes related to the overall function of these genes. Previously, AHBP, vertebrate TATA-box-binding protein (VTBP), and MYB-like proteins (MYBL) were also found to be common in OsGLP promoters (Ilyas et al., 2016a). Among these, OsRGLP1 and -2 were highly induced by salt, drought, ABA, wounding, and pathogenic [Fusarium solani (Mart.) Sacc. and Alternaria solani Sorauer] stresses (Ilyas et al., 2019; Munir et al., 2016). However, this contrasts with the study of Mahmood et al. (2010), where ARA, W-box, GT-element, and ACGT sequence were common to 10 GLP promoters of 5 plant species. The CAAT box is part of the core promoter, with a consensus sequence of GGCCAATCT, located upstream of the start codon of most eukaryotic genes and in the enhancer region, playing an important role in transcription (Etminan et al., 2018). Assembly of the CAAT box-binding factor is regulated by cytokinin, light, and stages of the plastids in the promoter of the spinach photosynthetic gene AtpC (Kusnetsov et al., 1999). The CAAT box is also important in various abiotic stress responses such as chilling stress (Zhang et al., 2016). Such observations are important in crop improvement against various abiotic stresses (Fahad et al., 2021a). In another study, the CAAT box was reported to control the expression level of GW6 (grain width 6) as well as the weight and width of the grain in rice (Shi et al., 2020). Similarly, the CAAT box and TATA box may also be important against drought and salt stresses, as previously reported for the banana MaTIP1;2 promoter in transgenic Arabidopsis (Song et al., 2018). The highest number of CAAT box elements was found in VvGLP3 (26), -6 (23), and -2 (21 copies), showing the importance of these promoters.
Similarly, MYB transcription factors belong to a diverse gene family playing important roles in development, plant growth, cell morphology, cellular and physiological processes, metabolism, primary and secondary metabolic reactions, as well as secondary metabolite biosynthesis (Cao et al., 2020), phenylpropanoid biosynthesis (Ma and Constabel, 2019), drought stress (Zhang et al., 2019), and other processes (Li et al., 2019a). Previously, MYB was also found to be frequent in OsGLP promoters (Das et al., 2019) and it is involved in the transactivation of OsRGLP2 gene expression (Deeba et al., 2017). Similarly, MYC cis-elements regulate diverse biological, cellular, and physiological processes, including flavonoid and secondary metabolite biosynthesis, thereby providing resistance against a wide range of biotic and abiotic (drought, salinity, cold, etc.) stresses (Xie et al., 2020). The highest number of these elements was present in the VvGLP2 and -3 promoters. In the same way, unnamed-4 is involved in the regulation of genes controlling various plant processes (Gupta and Ranjan, 2017). This element was most frequent in the VvGLP4 promoter. The presence of these elements in all promoters suggests their evolutionary significance.

Figure 4. Regulatory element analysis of the Vitis vinifera Germin-like protein (VvGLP) gene promoters. Details of the elements in each promoter are given in the graph labelled with the respective promoter name. The analysis was conducted with the PlantPAN (Ver. 3) server (Chow et al., 2019). The number of each element is given on each bar. Graphs were made using Excel 2016.

Table 3. Transcription factor binding site (TFBS) analysis of the Vitis vinifera Germin-like protein gene (VvGLP) promoters. The cis-regulatory elements were found using the PlantPAN (Version 3) software (Chow et al., 2019). The elements may exist on both strands (sense or antisense) of the promoters.
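Identifying the elements common to all promoters amounts to a set intersection over the per-promoter element lists. The sketch below illustrates the operation with a partial presence map assembled only from the statements in this section (the complete lists are in Table 3 and the supplementary tables); the variable names are illustrative.

```python
PROMOTERS = [f"VvGLP{i}" for i in range(1, 8)]
ALL = set(PROMOTERS)

# Element -> promoters in which this section reports it (partial map).
reported = {
    "CAAT box": ALL,
    "MYB": ALL,
    "MYC": ALL,
    "unnamed-4": ALL,
    "TATA box": {"VvGLP1", "VvGLP4", "VvGLP6"},
    "STRE": {"VvGLP2", "VvGLP5", "VvGLP6", "VvGLP7"},
    "Box4": ALL - {"VvGLP3", "VvGLP6"},
}

# Invert the map to promoter -> set of elements.
elements_by_promoter = {
    p: {elem for elem, found_in in reported.items() if p in found_in}
    for p in PROMOTERS
}

# Elements shared by every promoter.
common = set.intersection(*elements_by_promoter.values())
print(sorted(common))
```

With the full element lists in place of this partial map, the same intersection would recover the common set directly from the raw server output.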

Unique cis-regulatory elements
Some unique elements were also identified in the VvGLP promoters, including the TGA-element, RY-element, AuxRR-core (auxin-responsive element), WUN-motif (wound-responsive element), ERE (ethylene-responsive element), and MRE (MYB-responsive element). TGA-elements were specific to VvGLP7; they play an important role in various biological processes such as growth regulation, development, pathogen response, and hormonal and abiotic (salt and drought) stresses (Li et al., 2019b). In the same way, the RY-element is involved in seed-specific regulation, embryogenesis, and seed development (Reidt et al., 2000). One copy of this element was found in the VvGLP3 promoter. WUN elements play important roles in wounding and abiotic (salt, drought, etc.) stresses (Hayashi et al., 2003; Valifard et al., 2015), while the AuxRR-core is important in the regulation of the hormonal (auxin) response (Mironova et al., 2014). Previous studies have shown that phytohormones play an important role in acclimatizing plants to various environmental stresses, largely mediated by these elements in the promoter (Fahad et al., 2015). A single copy of these elements was found in the VvGLP1 and -5 promoters, respectively, suggesting their role in these processes. These promoters also contained a single copy of the ERE element, which plays an important role in abiotic stresses (drought, salinity, submergence), fruit ripening, and secondary metabolite production (Srivastava and Kumar, 2018). The MYB-responsive element (MRE), which is crucial for light responsiveness and auxin regulation (Hartmann et al., 2005), was found in VvGLP5. The presence of unique elements in these promoters suggests that these genes have adapted to multiple challenges of the natural environment by acquiring novel cis-elements in their promoters through the course of evolution.
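The promoter assignments described in this paragraph can be summarized by grouping the elements per promoter. The sketch below encodes one reading of the text, which is an assumption: "respectively" is taken to mean WUN-motif in VvGLP1 and AuxRR-core in VvGLP5, and ERE is taken to occur in both of these promoters. The variable names are illustrative.

```python
from collections import defaultdict

# Element -> promoters, as read from the paragraph above (assumed reading).
restricted = {
    "TGA-element": ["VvGLP7"],
    "RY-element": ["VvGLP3"],
    "WUN-motif": ["VvGLP1"],
    "AuxRR-core": ["VvGLP5"],
    "ERE": ["VvGLP1", "VvGLP5"],
    "MRE": ["VvGLP5"],
}

# Group the elements by promoter.
by_promoter = defaultdict(list)
for element, promoters in restricted.items():
    for promoter in promoters:
        by_promoter[promoter].append(element)

# Elements confined to exactly one promoter.
single_promoter = sorted(e for e, ps in restricted.items() if len(ps) == 1)

for promoter in sorted(by_promoter):
    print(promoter, "->", ", ".join(by_promoter[promoter]))
print("single-promoter elements:", single_promoter)
```

Under this reading, VvGLP5 carries the largest number of these restricted elements, while VvGLP2, -4, and -6 carry none of them.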

Conclusion
VvGLPs are similar in structure but showed significant variation in their functions. They showed similar physicochemical properties, domain architectures, subcellular expression patterns, 3D structures, and functional properties. The phylogenetic study further revealed a narrow genetic background, suggesting that they originated via recent duplication events that gave rise to the great resemblance among them. Through the course of evolution, these genes have adopted diverse enzymatic activities, mainly related to the production of secondary metabolites, to better cope with environmental stresses. VvGLPs play an important role in plant defense through increased immunity, better growth, and the regulation of diverse physiological and biochemical processes in both the cellular and extracellular environment, largely mediated by the presence of CAAT, MYB, MYC, and unnamed-4 cis-regulatory elements in their promoters. Among all, VvGLP2 and VvGLP6 showed more distinctive features, which make them more suitable candidates for future use. In the future, the data can be used to develop agronomically important cultivars resistant to a diverse range of stresses.

Human and animal rights
No humans/animals were used in the current research project.

Supplementary Material
Supplementary data are provided along with the article as Excel files (S1).
Supplementary material accompanies this paper.
Figure S1. 3D structure of the Vitis vinifera Germin-like proteins (VvGLPs). Different domains are represented with different colours. The data were obtained with the SWISS-MODEL server.
Table S1. Description of the Vitis vinifera Germin-like protein gene (VvGLP) sequences used in the analysis.
Table VvGLP1 to VvGLP7. Description of the cis-elements in the promoters of the Vitis vinifera Germin-like protein genes.
This material is available as part of the online article from https://www.scielo.br/j/bjb