CRISPR-transient expression in soybean for simplified gRNA screening in planta

Abstract The objective of this work was to develop a method to create and validate CRISPR-Cas systems and different gRNAs in soybean (Glycine max) embryos. Two model genes were used for simple mutation with one gRNA or partial gene deletion with two guides. The gRNAs were inserted into the CRISPR transformation vectors by a type IIS restriction enzyme or by subcloning and inserting the promoter + gRNA2 in the final transformation vector using the classic restriction enzyme cloning method. The vectors were successfully constructed for one and two gRNAs. Agrobacterium-mediated transient transformation in soybean was carried out to test the quality of gRNAs and of the system itself (expression cassette). Simple mutation and gene deletion were detected in the embryos transformed after DNA enrichment by enzyme digestion followed by polymerase chain reaction and sequencing, which indicates that the CRISPR-Cas system and guides were working. This protocol can be used to accelerate CRISPR-based genome editing strategies for genetic transformation in soybean.


Introduction
The clustered regularly interspaced short palindromic repeats-associated proteins (CRISPR-Cas) system is an RNA-guided mechanism in which an RNA molecule directs Cas nucleases to break a target DNA site (Doudna & Charpentier, 2014). This is possible, according to the same authors, because the guide RNA (gRNA) is part of the Cas-gRNA ribonucleoprotein complex that can recognize and bind to any DNA sequence that contains a protospacer adjacent motif (PAM), which differs for each nuclease. Once the Cas-gRNA complex identifies a PAM adjacent to the gRNA matching sequence, the protein anchors to the DNA strand and generates a DNA double-stranded break (DSB), after which the cell's innate DNA repair system fixes the damaged DNA (Ran et al., 2013; Doudna & Charpentier, 2014).
The two main DNA repair pathways are: nonhomologous end joining (NHEJ), in which the ends of the damaged strands are simply joined; and homology-directed repair (HDR), in which another strand is used as a template (Huang & Puchta, 2019). However, the error-prone NHEJ pathway prevails in the cell regardless of the cell cycle, and the creation of consecutive DSBs may result in repair system failure, possibly introducing small insertions or deletions (indels) of nucleotides at the DSB site (Ran et al., 2013; Doudna & Charpentier, 2014). In comparison, the HDR mechanism may repair DNA damage or trigger crossover during meiosis, being active only in the S/G2 phases of the cell division cycle, when the sister chromatid is available (Li et al., 2019); therefore, to induce HDR-derived editing, a template strand containing the desired sequence must be delivered to mimic the sister chromatid (Huang & Puchta, 2019).
Although a myriad of approaches and techniques have emerged since the discovery of CRISPR-Cas as an innate immune system in bacteria and its first application as a biotechnological tool, the gRNA still plays a critical role in the success of the whole approach (Doench et al., 2016; Horlbeck et al., 2016). Therefore, gRNA is the key component of the CRISPR-Cas editing toolbox, and several in silico tools have been developed to help guide gRNA design (Stemmer et al., 2015; Doench et al., 2016; Haeussler et al., 2016; Chari et al., 2017; Liu et al., 2017). However, these tools only predict the performance of gRNA as a guide, requiring in vivo assays to validate its real functionality or efficiency. Moreover, validating CRISPR-Cas systems with a transient assay is strongly recommended when the targeted species is hard to transform, as is the case for most crops, in order to optimize any subsequent steps in the research pipeline (Shan et al., 2020).
Genome editing by the CRISPR-Cas toolbox has been increasingly used to generate superior crop genotypes, including soybean [Glycine max (L.) Merr.] (Gao, 2021). However, generating edited soybean plants has been considered complex, time-consuming, and laborious since the efficiency of stable transformations commonly remains below 10% (Do et al., 2019). This shows the importance of validating the CRISPR-Cas system by applying transient assays to verify whether the gRNA has access to the targeted DNA and directs nucleases to the desired sequence (Shan et al., 2020). In soybean, some of the approaches used include leaf cell agroinfiltration, protoplast transfection (Kim & Choi, 2021), and hairy root transformation using Agrobacterium rhizogenes (Do et al., 2019). However, these methods are sometimes labor- and time-intensive and/or still present a relatively low transformation efficiency, confirming that gRNA validation continues to be a bottleneck in the soybean-editing pipeline.
Other approaches may facilitate the screening of CRISPR-Cas-mediated editing, such as restriction enzyme digestion-suppressed polymerase chain reaction (RE-PCR) (Liang et al., 2017; Do et al., 2019) and the multiplex approach for the induction of larger DNA deletions with two gRNAs directing Cas9 to the same targeted gene, which facilitates the detection of edited plants and the assessment of gRNA functionality using PCR (McCarty et al., 2020).
The objective of this work was to develop a method to create and validate CRISPR-Cas systems and different gRNAs in soybean embryos.

Materials and Methods
The Kunitz trypsin inhibitor (KTI3, gene model Glyma.08G341500) and Lectin 1 (LE1, gene model Glyma.02G012600) were targeted for NHEJ-induced mutation. For KTI3 and LE1, the gene sequences were retrieved from gene IDs 547831 and 100818710, respectively, from the National Center for Biotechnology Information (National Library of Medicine, Bethesda, MD, USA) and aligned through the BioEdit software (Hall, 1999) with a reference genome, the BRS 537 soybean cultivar available at GenBank GCA_012273815, perfectly matching the target DNA in this cultivar. The rational design of gRNAs was carried out in CRISPRDirect (Naito et al., 2015), by selecting the most commonly used nuclease, Cas9, and its required PAM, NGG (N, any nucleotide), using Glycine max v.2.0 for the specificity check in the software. Off-targets were double-checked by aligning the gRNA + PAM in the Wm82 soybean genome at SoyBase (Grant et al., 2010). The structures of the selected proteins were presented in the context of topology using the Protter web-based tool (Omasits et al., 2014) to visualize the intended mutation sites and possible results.
The most promising and highly specific guides, with no off-target hits in the seed region of the guide + PAM, were selected. For KTI3, a guide with a restriction enzyme (RE) site at the Cas9 cut site was sought: for the restriction site to be lost upon DNA editing, it should encompass the cut position 3-4 bp upstream of the PAM. Putative transformants can then be screened by restriction digestion of PCR products, with the lack of digestion indicating edited sequences. For LE1, a minimum interval of 300 bp was set between the two guides, both to leave sufficient distance for the two Cas9-gRNA complexes to bind at their respective sites and to induce a partial gene deletion that would be easily detected in agarose gel after PCR amplification. The final gRNAs were: CTGCAAATGAATCGAACTTA for KTI3 and ACTGGTGCTACTGACCAGCA and GTTTGTGGCTTAGTGTCAAT for LE1.
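The design rules above (an NGG PAM, a Cas9 cut about 3 bp upstream of the PAM, an RE site spanning that cut, and a minimum spacing between two guides) can be illustrated with a minimal Python sketch. It is not the CRISPRDirect algorithm. The toy sequence joins the KTI3 guide from the text to a hypothetical PAM (AGG) and hypothetical flanks, chosen only so that an AflII site (CTTAAG) spans the cut. The function names are likewise illustrative.

```python
import re

AFLII_SITE = "CTTAAG"  # AflII recognition sequence

def find_cas9_sites(seq, guide_len=20):
    """Return (protospacer, pam, cut_index) for each 5'-N20-NGG-3' on this strand."""
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping candidates are all reported.
    for m in re.finditer(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len, seq):
        # Blunt cut assumed 3 bp upstream of the PAM.
        cut_index = m.start() + guide_len - 3
        sites.append((m.group(1), m.group(2), cut_index))
    return sites

def site_spans_cut(seq, cut_index, re_site):
    """True if some occurrence of re_site has the cut strictly inside it."""
    seq = seq.upper()
    pos = seq.find(re_site)
    while pos != -1:
        if pos < cut_index < pos + len(re_site):
            return True
        pos = seq.find(re_site, pos + 1)
    return False

def cuts_far_enough(cut_a, cut_b, min_bp=300):
    """Check the minimum spacing between two cut sites (300 bp in the text)."""
    return abs(cut_a - cut_b) >= min_bp

# Toy target: hypothetical flanks and PAM around the KTI3 guide from the text.
KTI3_GUIDE = "CTGCAAATGAATCGAACTTA"
toy = "GATTACA" + KTI3_GUIDE + "AGG" + "TCCATGA"
sites = find_cas9_sites(toy)
```

In this toy case, `site_spans_cut(toy, sites[0][2], AFLII_SITE)` is true, so any indel at the cut would be expected to destroy the AflII site. A real design would also scan the reverse strand and check off-targets against the genome.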
Vector digestion and oligonucleotide (gRNA) phosphorylation, annealing, and ligation into the digested vector were performed according to the protocol proposed by Shalem et al. (2014), with minor adaptations. The final vector (ligation reaction) was cloned into thermocompetent Escherichia coli cells using the heat shock method (Froger & Hall, 2007). To verify whether the gRNA was inserted into the modular vector, PCR was carried out using the 5'-CCCTGGGAATCTGAAAGAAGAG-3' (U6At) primer, which anneals in the guide promoter region, as the forward primer and oligo 2 of the gRNA as the reverse primer (Figure 1). All PCRs were performed in a 25 µL reaction composed of 1X buffer, 2.0 mmol L⁻¹ MgCl₂, 400 µmol L⁻¹ dNTP, 0.2 µmol L⁻¹ of each primer, and 0.01 U Taq polymerase, and the products were visualized in 1.0% agarose gel. The cycling program was as follows: an initial cycle at 94°C for 5 min; 35 cycles at 94°C for 30 s, the annealing temperature specific to each primer pair for 30 s, and 72°C for 1 min per kilobase; and a final extension at 72°C for 7 min. Sanger sequencing was performed to confirm gRNA insertion and fidelity (Sanger et al., 1977).
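The reaction composition above can be turned into pipetting volumes with the dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the final concentrations and the 25 µL volume come from the text, whereas the stock concentrations (10X buffer, 50 mmol L⁻¹ MgCl₂, 10 mmol L⁻¹ dNTP, 10 µmol L⁻¹ primers) are assumptions and should be replaced by the stocks actually used.

```python
REACTION_UL = 25.0  # reaction volume from the text

def stock_volume_ul(final_conc, stock_conc, reaction_ul=REACTION_UL):
    """Volume of stock (µL) needed for final_conc, from C1 * V1 = C2 * V2."""
    return final_conc * reaction_ul / stock_conc

# (final concentration, ASSUMED stock concentration), same unit within each pair.
components = {
    "buffer (X)":        (1.0, 10.0),
    "MgCl2 (mmol/L)":    (2.0, 50.0),
    "dNTP (µmol/L)":     (400.0, 10000.0),
    "primer 1 (µmol/L)": (0.2, 10.0),
    "primer 2 (µmol/L)": (0.2, 10.0),
}

volumes = {name: stock_volume_ul(f, s) for name, (f, s) in components.items()}

# Volume left for water, template DNA, and polymerase.
remaining_ul = REACTION_UL - sum(volumes.values())
```

Under these assumed stocks, the listed components take 5.5 µL in total, leaving 19.5 µL for water, template, and Taq polymerase.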
To create a CRISPR-Cas vector bearing two gRNAs, gRNA1 and gRNA2 were inserted separately into the modular vector, following the protocol described previously. The U6 promoter + gRNA2 sequence (including the whole structural gRNA) was amplified with primers 5'-GTCGACGAATTCCTTCGTTGAACAACG-3' and 5'-GGTACCGACAAAAAAAGCACCGACTC-3', containing overhangs of the SalI and KpnI restriction enzymes, respectively. The amplified sequence was gel purified using the QIAquick Gel Extraction Kit (Qiagen, Germantown, MD, USA) and ligated into the pENTR entry vector (Thermo Fisher Scientific, Waltham, MA, USA). The entry vector bearing the U6 promoter + gRNA2 sequence was then cloned into E. coli using the heat shock method. PCR and Sanger sequencing were carried out to confirm sequence insertion and fidelity, respectively. Afterwards, the modular vector + gRNA1 and pENTR+U6::gRNA2 underwent double digestions with SalI and KpnI in individual reactions at 37°C for 2 hours in the Anza buffer (Thermo Fisher Scientific, Waltham, MA, USA). The digested vector and target fragment were gel purified and ligated using T4 ligase and, then, the final vector was cloned into E. coli by heat shock. PCR, using oligo 1 of gRNA1 and oligo 2 of gRNA2 as primers, was performed to check whether there was a successful insertion of the fragment, which was visualized in 1.0% agarose gel (Figure 2). The final vectors were then transformed into Agrobacterium tumefaciens by the heat shock method.
Pesq. agropec. bras., Brasília, v.58, e03000, 2023. DOI: 10.1590/S1678-3921.pab2023.v58.03000
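As a quick sanity check on the subcloning primers described above, the short Python sketch below confirms that each primer begins with the recognition sequence of the enzyme named in the text (SalI, GTCGAC; KpnI, GGTACC) and that both sites are palindromic. The labels `primer_1` and `primer_2` and the helper names are illustrative, not the authors' nomenclature.

```python
SITES = {"SalI": "GTCGAC", "KpnI": "GGTACC"}

# Primer sequences as given in the text (5'->3').
PRIMERS = {
    "primer_1": "GTCGACGAATTCCTTCGTTGAACAACG",
    "primer_2": "GGTACCGACAAAAAAAGCACCGACTC",
}

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))

def five_prime_site(primer, sites=SITES):
    """Name of the restriction site found at the 5' end of the primer, if any."""
    for name, site in sites.items():
        if primer.upper().startswith(site):
            return name
    return None

# Map each primer to the enzyme whose site it carries at the 5' end.
overhangs = {label: five_prime_site(seq) for label, seq in PRIMERS.items()}
```

Here `overhangs` maps `primer_1` to SalI and `primer_2` to KpnI, consistent with the double digestion used to move U6::gRNA2 into the modular vector.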
The transient expression assay for soybean embryos was performed with cultivars BRS 537, BRS 283, and BMX Potência RR based on the first step of the stable transformation protocol described by Paz et al. (2006), with modifications. First, soybean seeds were surface sterilized, being briefly washed with 70% ethanol, soaked in a 2.0% hypochlorite solution for 30 min, and then washed thoroughly with distilled water. After this, the seeds were placed in a germination medium (2.2 g L⁻¹ MS salts, 1.0 mL L⁻¹ B5 vitamins, 2.0% sucrose, and 0.9% agarose, with pH adjusted to 5.7) and kept overnight in a growth chamber at 25°C and 60% relative humidity, in the dark.
The Agrobacterium used for transient transformation was grown in 100 mL yeast extract peptone (0.5% yeast extract, 1.0% peptone, and 0.5% NaCl, adjusted to pH 7.0) until reaching an optical density at 600 nm of 0.7-0.8, centrifuged at 5,000 × g for 10 min, and resuspended in half the initial volume using a liquid co-cultivation medium (CCM), containing 0.44 g L⁻¹ MS salts, 1.0 mL L⁻¹ B5 vitamins, 4.62 g L⁻¹ 2-morpholinoethanesulfonic acid, 3.0% sucrose, 1.67 mg L⁻¹ 6-benzylaminopurine, 0.25 mg L⁻¹ gibberellic acid, 200 µL L⁻¹ of 1.0 mol L⁻¹ acetosyringone, 1.0 mL L⁻¹ of 1.0 mol L⁻¹ sodium thiosulphate, and 1.0 mL L⁻¹ of 1.0 mol L⁻¹ dithiothreitol, at pH 5.4. After seed imbibition, cotyledons and leaf primordia were removed, isolating the embryos, which were then fully and delicately injured using a steel microbrush smeared with the Agrobacterium solution to mimic the natural process of infection. The embryos were maintained in the CCM containing Agrobacterium for 30 min in the dark, blotted dry on a Whatman filter, kept on solid CCM (0.9% agarose) with a filter on top for five to six days in a growth chamber at 25°C and 60% relative humidity in the dark, washed three times with distilled water, and then blotted dry. Three pools consisting of 30 embryos each were collected and immediately frozen in liquid nitrogen until further use. The genomic DNA of the pooled embryos was extracted using the protocol of Doyle & Doyle (1990).
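The stock additions in the CCM recipe translate into final concentrations by simple dilution. The Python sketch below reads the acetosyringone volume as 200 µL per litre (an assumption, since the unit prefix is damaged in the source) and ignores the small increase in total volume. The function name is illustrative.

```python
def final_mol_per_l(stock_mol_per_l, added_ml_per_l):
    """Final concentration (mol/L) after adding added_ml_per_l of stock to 1 L."""
    return stock_mol_per_l * added_ml_per_l / 1000.0

# (stock in mol/L, volume added in mL per litre of medium)
additions = {
    "acetosyringone":      (1.0, 0.2),  # 200 µL per litre: ASSUMED reading
    "sodium thiosulphate": (1.0, 1.0),
    "dithiothreitol":      (1.0, 1.0),
}

final = {name: final_mol_per_l(s, v) for name, (s, v) in additions.items()}
```

Under that reading, acetosyringone ends up at about 200 µmol L⁻¹, and sodium thiosulphate and dithiothreitol at about 1 mmol L⁻¹ each.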
The genomic DNA from transformed soybean embryos was analyzed by the RE-PCR method (Shan et al., 2014). For KTI3, DNA was first digested with AflII overnight to enrich samples with edited sequences, which may have lost the corresponding RE site. Then, one PCR was performed to amplify the remaining sequences of KTI3 and another to increase the number of sequences. This final reaction was digested with AflII for 2 hours and run in 2.0% agarose gel, with undigested samples indicating sequence editing. Putative positive samples for gene editing were gel purified and sent for Sanger sequencing, and the results were analyzed using the Inference of CRISPR Edits (ICE) software, specifically the ICE CRISPR analysis tool, version 2 (Synthego, Redwood City, CA, USA). For LE1 editing, the DNA from transformed embryos was first digested overnight with PvuII, which cuts within the region of the partial gene deletion, to eliminate wild-type (WT) sequences. Then, PCR was carried out to amplify the full sequence of the gene, whether complete or partially deleted, and the products were visualized in 1.5% agarose gel.
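The enrichment logic of RE-PCR can be sketched in silico: alleles that still carry the restriction site are digested and drop out of the amplifiable pool, while alleles whose site was destroyed by an indel survive. The Python example below is a toy model under idealized assumptions (complete digestion, forward strand only). The sequence combines the KTI3 guide from the text with a hypothetical PAM and flanks, and the deletion helper is illustrative rather than a model of real NHEJ outcomes.

```python
AFLII_SITE = "CTTAAG"  # AflII recognition sequence

def delete_at(seq, cut_index, n):
    """Remove n bases around cut_index (n // 2 to the left, the rest to the right)."""
    left = cut_index - n // 2
    return seq[:left] + seq[left + n:]

def survives_digest(allele, site=AFLII_SITE):
    """An allele survives digestion only if it no longer contains the site."""
    return site not in allele.upper()

def re_pcr_enrich(pool, site=AFLII_SITE):
    """Idealized digestion: keep only alleles that lack the restriction site."""
    return [allele for allele in pool if survives_digest(allele, site)]

# Toy wild-type allele: hypothetical flanks and PAM around the KTI3 guide.
wt = "GATTACA" + "CTGCAAATGAATCGAACTTA" + "AGG" + "TCCATGA"
cut = 7 + 20 - 3  # assumed cut index, 3 bp upstream of the PAM

edited_4 = delete_at(wt, cut, 4)    # toy 4 bp deletion
edited_15 = delete_at(wt, cut, 15)  # toy 15 bp deletion

enriched = re_pcr_enrich([wt, wt, edited_4, wt, edited_15])
```

In this idealized pool, only the two edited alleles remain after digestion. In practice digestion is rarely complete, so some WT template is expected to carry through to the PCR.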

Results and Discussion
In the strategy with one gRNA in the RE-containing site, one biological sample from each genotype showed no digested bands, indicating that the RE site was lost due to DNA editing, with deletions of 4 and 15 bp (Figure 3 A), which was later confirmed by sequencing and analysis using the ICE software (Synthego, Redwood City, CA, USA). As observed in the PCR agarose gel, even after DNA enrichment for mutations, there were still undigested bands in the control samples, which may explain the remaining high percentage of WT sequences. Furthermore, since the DNA samples come from a pool of transformed and WT embryos, the transient expression of the system and rate of edited DNA tend to be low.
According to Atkins & Voytas (2020) and Xu et al. (2022), genetic transformation is genotype dependent, and therefore, a first screening may indicate which genotype is best to work with. In addition, the sequence of the target gene may vary among genotypes, meaning that the gRNA may not work since it is designed mainly based on one reference genome that does not represent the genomic variability of the species germplasm (López-Girona et al., 2020). In this case, the fast detection of this problem is key to optimize the whole process.
In the strategy of multiplexed gRNAs, another advantage of CRISPR systems described in the literature (Minkenberg et al., 2017), two gRNAs can be designed to induce partial gene deletion.
In the present study, the whole U6 promoter + gRNA2, including the CDS hairpin and other gRNA features, was cloned into a pENTR vector, after which the obtained sequence was transferred to the modular vector + gRNA1. One biological sample from the 'BMX Potência RR' genotype and two from 'BRS 537' showed amplification of the expected deleted gene (Figure 3 B).
Although both strategies have already been reported in the literature (Shan et al., 2014; Liang et al., 2017; Do et al., 2019; McCarty et al., 2020), they were applied here to amplify the editing signal from a transiently expressed CRISPR system for early and easy detection of editing, and both confirmed the functionality of the proposed vector in soybean and of the gRNAs designed for simple mutation and gene deletion.
The Agrobacterium-mediated transformation of soybean embryos to assess transiently expressed CRISPR systems reduces the time required to achieve the point of safe and non-destructive sampling to a week, when compared with the 127 days in stable transformation (Paz et al., 2006).Transient expression may take as little as two weeks from the transformation of embryos to the evaluation of the results of the CRISPR-Cas system.From this point on, non-functioning guides may be discarded, and only the best functioning gRNAs should be used in the stable transformation process.This represents an important optimization in terms of the time and money invested in soybean research and improvement programs considering soybean genetic transformation history, evolution, and current hurdles (Xu et al., 2022).
Although in silico and in vitro tools are used to verify gRNA quality, they do not always represent in vivo reality. Naim et al. (2020), for example, found a low consensus for predictive uniformity and performance among eight different online gRNA-site tools and no significant correlation with their in vivo effectiveness in Nicotiana benthamiana Domin. Moreover, Arndell et al. (2019), assessing the efficiency of seven gRNAs for wheat (Triticum aestivum L.) genome editing, did not observe a clear correlation between in silico prediction and in vivo guide activity, narrowing both of them down to one suitable gRNA for a CRISPR-based improvement of the species. Therefore, there are variations in the actual efficiency of CRISPR-derived mutations induced by gRNAs selected in silico, meaning that reliable methods for the validation of gRNAs are critical to improve the system.
In crops, such as soybean, labor-intensive and time-consuming in vitro processes are required for plant transformation. Even though different methods have been proposed and improved for the species, they are still laborious, take a long time, and present a low efficiency, continuing to be a bottleneck for genetic manipulation (Kereszt et al., 2007; Chen et al., 2018; Xu et al., 2022). Therefore, any optimized step is advantageous, although it is important to guarantee that the CRISPR-Cas system is working before starting genetic transformation to obtain edited plants due to the high investments involved in the stable transformation of soybean, including reagents, highly skilled personnel, time, and space.

Conclusions
1. CRISPR-Cas systems based on restriction enzyme sites and partial gene deletion are efficient in detecting genome editing in soybean (Glycine max).
2. The Agrobacterium-mediated transient transformation of soybean embryos can be performed to test the quality of CRISPR-Cas systems, including gRNAs, in a fast manner.
3. Simple mutation and gene deletion can be detected in transformed embryos after sample enrichment by enzyme digestion, followed by polymerase chain reaction and sequencing.

Figure 1. gRNA insertion into the C034p7ioR-35sCas CRISPR-Cas9 modular vector, following three steps: digestion of the modular vector with the BsmBI type IIS restriction nuclease (A), phosphorylation and annealing of gRNA oligonucleotides designed with borders complementary to the vector (B), and confirmation of gRNA insertion by polymerase chain reaction using a forward primer that anneals to the U6At promoter and the reverse gRNA oligonucleotide (C). The agarose gel presents expected bands of ~200 bp amplified from plasmids cloned into Escherichia coli. C-, negative control; and CDS, coding DNA sequence.

Figure 2. Steps of the genetic engineering approach to create a CRISPR-Cas vector bearing two gRNAs. In steps 1 and 2, gRNA1 and gRNA2, respectively, are inserted separately into the CRISPR-Cas modular vector as described in Figure 1. In step 3, the cloned U6 promoter + gRNA2 is ligated into an entry vector with restriction enzyme borders that correspond to the modular vector using SalI and KpnI, whereas, in step 4, the modular vector + gRNA1 and pENTR+U6::gRNA2 are digested separately with SalI and KpnI, ligating the purified open modular vector + gRNA1 and fragment (U6::gRNA2). Finally, in step 5, the insertion of gRNA2 into the modular vector + gRNA1 is confirmed by polymerase chain reaction using one oligonucleotide of each gRNA as primers. The agarose gel shows expected bands of ~600 bp amplified from plasmids cloned into Escherichia coli. C-, negative control.

Figure 3. Strategies for CRISPR-Cas-mediated gene editing and results of transient expression to test gRNA functionality in soybean (Glycine max), using gRNA designed in a restriction enzyme site that, once edited, may be lost (A) and two gRNAs to induce a partial gene deletion of 305 bp (B). In the first strategy, the DNA (target gene) from embryos of the BRS 283 and BRS 537 soybean cultivars transformed with Agrobacterium was previously digested with AflII, visualized in 2.0% agarose gel, and amplified in two polymerase chain reactions (PCRs). C1- and C2- are untransformed negative controls, positive samples were sequenced, and gene editing was confirmed using the ICE software (Synthego, Redwood City, CA, USA). In the second strategy, the whole gene from PvuII-digested embryos of the BMX Potência RR and BRS 537 soybean cultivars transformed with Agrobacterium was amplified by PCR and visualized in 1.5% agarose gel. C1- and C2- are untransformed and undigested negative controls, whereas C3- and C4- are untransformed and not PvuII-digested negative controls. In the protein representation from Protter, amino acids in red and green represent the protein N-terminal and N-glycan motifs, respectively.