Revealing the expression profile of genes that encode the Subcortical Maternal Complex in human reproductive failures

Abstract
The Subcortical Maternal Complex (SCMC) is composed of maternally encoded proteins required for the early stages of embryo development. Here we aimed to investigate the expression profile of the genes that encode the individual members of the SCMC in human reproductive failures. To accomplish that, we selected three datasets in the Gene Expression Omnibus repository for differential gene expression (DGE) analysis, comprising human endometrial and placental tissues of patients with recurrent implantation failure (RIF) or recurrent pregnancy loss (RPL). The SCMC genes KHDC3L, NLRP2, NLRP4, NLRP5, OOEP, PADI6, TLE6, and ZBED3 were included in the DGE analysis, as well as CFL1 and CFL2, which connect the SCMC with the actin cytoskeleton. Additionally, differential co-expression analysis and systems biology analysis of gene-gene co-expression were performed for KHDC3L, NLRP5, OOEP, and TLE6, revealing gene pairs differentially correlated under the two conditions, as well as co-expression with genes involved in immune response, cell cycle, DNA damage repair, embryo development, and male reproduction. Compared to control groups, NLRP5 was upregulated in the endometrium of RIF patients, and KHDC3L was upregulated in the fetal placental tissue of RPL patients, shedding light on the importance of considering SCMC genes in reproductive failures.


Introduction
Infertility is the failure to establish a clinical pregnancy after 12 months of regular and unprotected sexual intercourse, affecting 8-12% of reproductive-aged couples worldwide (Vander Borght and Wyns, 2018). Many factors may lead to infertility, which manifests in different ways depending on the reproductive processes affected, whether of maternal, paternal, and/or embryonic origin (Carson and Kallen, 2021). Fertilization failure, embryo arrest, and embryonic implantation failure are some of the reasons for the inability to initiate gestation. However, even once pregnancy is achieved, its maintenance depends on correct communication between maternal and embryonic tissues, and abnormalities during this period can lead to pregnancy loss (Ashary et al., 2018).
Recurrent implantation failure (RIF) is the lack of implantation after the transfer of several embryos through assisted reproductive technologies (Franasiak et al., 2021), whilst recurrent pregnancy loss (RPL) is the failure of two or more clinically recognized pregnancies before 20-24 weeks of gestation (Dimitriadis et al., 2020). However, there is no consensus on the definitions of RIF and RPL, which vary across published guidelines. Both conditions may be related to disturbances in the maternal immune system, genetics of the embryo, anatomic factors, hematologic factors, reproductive tract microbiome, and endocrine environment, as well as endometrial-embryo asynchrony (Dimitriadis et al., 2020; Franasiak et al., 2021).
The processes involved in early embryo development are regulated and coordinated simultaneously to ensure the generation of a competent embryo capable of sustaining the implantation process and the maintenance of pregnancy (Conti and Franciosi, 2018). During this critical period, specific patterns of gene expression are paramount for regulating cellular proliferation and differentiation, and are pivotal for correct embryo development (Shahbazi and Zernicka-Goetz, 2018). In addition, proper gene expression in the maternal tissues during the time of implantation and pregnancy is also necessary for the changes that this period requires in the maternal reproductive environment (Ashary et al., 2018).
Before embryo genome activation, initial development relies almost entirely upon maternal-effect genes, which have important roles during embryogenesis, such as the elimination of maternal mRNAs and proteins, epigenetic remodeling in oocytes and early embryos, and embryo genome activation itself (Conti and Franciosi, 2018). Recently, a Subcortical Maternal Complex (SCMC), comprising proteins encoded by maternal-effect genes, was identified in mice (Li et al., 2008a) and humans (Zhu et al., 2014), demonstrating fundamental roles in early embryogenesis. Four proteins compose the SCMC: KHDC3L, NLRP5, OOEP, and TLE6. However, the sum of the four proteins (~255 kDa) is smaller than the estimated molecular weight of the SCMC (~669-2000 kDa), suggesting that other proteins may be part of the complex, such as the candidates NLRP2, NLRP4F, PADI6, and ZBED3 (Bebbere et al., 2021). In addition, it was demonstrated that the SCMC interacts with the actin cytoskeleton through Cofilin (CFL), regulating symmetric cell divisions of mouse zygotes (Yu et al., 2014).
The SCMC appears to work as a maternal functional module regulating mammalian early embryogenesis (Lu et al., 2017); however, we hypothesized that the individual members of the SCMC could have other roles in human reproduction in addition to embryonic development. Although the presence of the SCMC has been confirmed only in oocytes and early embryos (Li et al., 2008b; Zhu et al., 2014), it is not settled whether the proteins of the SCMC could act as single molecules in other tissues, without being aggregated into the multiprotein complex. Literature reports based on the evaluation of conditions such as male fertility (Rockenbach et al., 2023) and imprinting disorders (Eggermann et al., 2021) have helped to motivate this hypothesis. Beyond tissue variability, we also ask whether the individual members of the SCMC could have roles in processes such as embryonic implantation and even in later steps, such as the maintenance of pregnancy.
It is well established that pregnancy initiation and continuation are regulated by different molecular mechanisms that must be correctly orchestrated between maternal and embryofetal tissues (Ashary et al., 2018). Since SCMC expression is pivotal for embryonic genome activation and other initial steps of pregnancy (Lu et al., 2017), it is coherent to hypothesize that its inactivation might result in implantation failure (IF). Nevertheless, literature is scarce regarding the effects of the SCMC in the later gestational period. If the SCMC proteins, acting as a complex or as single molecules, have a role in placentation and endometrial receptivity, it is also feasible to suggest they might be implicated in the etiology of RPL. Therefore, we analyzed the expression profile of the SCMC genes, as well as CFL1 and CFL2, in endometrial and placental tissues of patients with RIF or RPL through publicly available transcriptomes.

Gene expression analysis
The expression profile of the SCMC genes in RIF or RPL patients was evaluated through differential gene expression (DGE) and differential co-expression analyses of transcriptome data available in the Gene Expression Omnibus (GEO) repository (Edgar et al., 2002; Barrett et al., 2013). For each pathological condition (RIF or RPL), the comparisons were performed against a control group, considering for DGE analysis the gene expression of KHDC3L, NLRP5, OOEP, TLE6, CFL1, CFL2, NLRP2, NLRP4, PADI6, and ZBED3, and for differential co-expression analysis the expression of KHDC3L, NLRP5, OOEP, and TLE6.

Acquisition of transcriptome data
To search for datasets in GEO, the keywords "implantation failure", "pregnancy loss", "endometrium", "placenta", "chorionic villus", and "decidua" were used, filtering by Entry type (Series), Organism (Homo sapiens), and Study type (Expression Profile by Array or Expression Profile by Throughput Sequencing). Only studies performed on well-established platforms and providing the raw data, the experimental design, and well-described sample groups were included. Following these criteria, three studies were selected, covering endometrium samples of patients with RPL or RIF, as well as placental tissue (chorionic villus or decidua) of RPL patients: GSE26787 (Lédée et al., 2011), GSE121950 (Huang et al., 2018), and GSE113790 (Yu et al., 2018). In the selected studies, RPL was defined as at least three pregnancy losses between 6 and 12 weeks of gestation (GSE26787) or two or more consecutive pregnancy losses before 20 weeks of gestation (GSE121950 and GSE113790). RIF was defined as the absence of pregnancy despite the transfer of at least ten embryos over several assisted reproductive cycles (GSE26787). The control group for the endometrial samples of patients with RIF or RPL consisted of fertile patients (who delivered successfully after the first or second attempt of IUI or IVF/ICSI related to a male infertility diagnosis). The control group for the chorionic villus or decidua samples of RPL patients consisted of women who underwent legal termination of an apparently normal early pregnancy, without medical reasons, history of pregnancy loss, or any pregnancy complication. Additional information about the datasets is available in the supplementary material (Table S1).

Differential gene expression analysis
The DGE analysis was conducted in the R environment (v.3.6.3). For the studies comprising RNA-Seq data, sequence alignment was performed on the Galaxy Europe server (Jalili et al., 2020), using the HISAT2 alignment tool (Kim et al., 2019) against the human reference genome hg38, and transcript counting was performed with the featureCounts tool (Liao et al., 2014). Default parameters were used for RNA-Seq alignment and transcript counting, and the alignment rate was above 80% for all samples analyzed. DGE was calculated from the aligned transcriptomes using the edgeR package (v.3.28.1) (Robinson et al., 2010). For microarray data, the packages affy (v.1.64.0) (Gautier et al., 2004) and limma (v.3.42.2) (Ritchie et al., 2015) were used to evaluate DGE. RNA-Seq data were normalized by the trimmed mean of M values (TMM) and microarray data by robust multi-array average (RMA). The DGE results are reported as log2 fold-change (logFC) and P-value adjusted for false discovery rate (FDR); a gene was considered differentially expressed when it showed both |logFC| ≥ 1.0 and adjusted P-value ≤ 0.05. The heatmaps were generated in the R environment with the ggplot2 package (v.3.3.5).
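The significance rule described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study, which relied on edgeR and limma in R; the function names are hypothetical, and the Benjamini-Hochberg procedure is assumed here as the FDR adjustment because the text does not name the method.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order.

    Assumption: BH is used as the FDR adjustment; the text does not name it.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fold_change(log_fc):
    """Convert a log2 fold-change into a linear fold-change."""
    return 2 ** log_fc


def is_significant(log_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Apply the two thresholds: |logFC| >= 1.0 and adjusted P <= 0.05."""
    return abs(log_fc) >= fc_cutoff and adj_p <= p_cutoff


if __name__ == "__main__":
    # Hypothetical raw P-values, for illustration only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    # A gene with logFC = 3.0 and adjusted P = 0.01 passes both thresholds.
    print(is_significant(3.0, 0.01), fold_change(3.0))
```

Taking the absolute value of logFC makes the rule symmetric, so that strongly downregulated genes are retained as well as upregulated ones.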

Differential co-expression analysis
In addition to the DGE analysis, a differential co-expression analysis was performed considering the basal gene expression of KHDC3L, NLRP5, OOEP, and TLE6 in control and RIF or RPL patients. Gene-gene co-expression was evaluated using Pearson's correlation coefficient (Pearson's r) through the diffcoexp package (v.3.17) in the R environment.
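For readers unfamiliar with the statistic, Pearson's r for one gene pair can be computed as in the minimal Python sketch below. It does not reproduce the diffcoexp package; the function name and the expression values are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two expression vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical normalized expression of two genes across five samples.
    gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    gene_b = [2.0, 1.0, 4.0, 3.0, 5.0]
    print(pearson_r(gene_a, gene_b))
```

The coefficient is computed separately within each group (control or affected), so each gene pair yields one value of r per condition.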
A negative Pearson's r means that, across samples, the expression of one gene tends to decrease as the expression of the other increases; hence, the gene pair shows inverse expression. In contrast, a positive coefficient means that the expression of both genes tends to increase or decrease together. Gene pairs were considered moderately correlated when |r| ≥ 0.5 and highly correlated when |r| ≥ 0.8. In this study, Pearson's r was calculated first for the control samples and then for the samples from patients with reproductive failure (RIF or RPL). The differential correlation between the control and affected groups was calculated through Fisher's Z transformation method, and P-values < 0.05 were considered significant. Due to the small number of gene pairs evaluated, no adjustment of the P-values was performed, but q-values are presented in Table 1. Hence, gene pairs were considered differentially co-expressed when the correlation coefficients differed significantly between the two conditions. As in the DGE analysis, the heatmaps for the differential co-expression analyses were generated in the R environment with the ggplot2 package.
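The comparison of two correlation coefficients by Fisher's Z transformation can be sketched in Python as follows. This is the standard test for two independent correlations, not code taken from the diffcoexp package; the function names, the label "weak" for |r| < 0.5, and the sample sizes in the usage example are assumptions made for illustration.

```python
import math


def classify_correlation(r):
    """Label a coefficient with the thresholds used in the text.

    The label "weak" for |r| < 0.5 is an assumption; the text leaves
    that range unnamed.
    """
    strength = abs(r)
    if strength >= 0.8:
        return "high"
    if strength >= 0.5:
        return "moderate"
    return "weak"


def fisher_z_difference(r1, n1, r2, n2):
    """Two-sided test for a difference between two independent correlations.

    r1, r2 are Pearson coefficients; n1, n2 are the group sample sizes.
    Returns the Z statistic and its two-sided P-value.
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each group needs more than 3 samples")
    if abs(r1) >= 1 or abs(r2) >= 1:
        raise ValueError("coefficients must lie strictly between -1 and 1")
    z1 = math.atanh(r1)  # Fisher's Z transformation of r1
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided P-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical coefficients and group sizes, for illustration only.
    z, p = fisher_z_difference(-0.8, 10, 0.6, 10)
    print(classify_correlation(-0.8), classify_correlation(0.6), z, p)
```

Because the standard error depends only on the group sizes, small groups require a large change in correlation before the difference reaches significance.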

Systems biology analysis
To better elucidate the roles of the SCMC genes in multifactorial conditions such as RIF and RPL, a systems biology approach was applied to the four validated components of the SCMC: KHDC3L, NLRP5, OOEP, and TLE6. A co-expression network was assembled in Cytoscape (v.3.8) using the GeneMANIA application (Montojo et al., 2010), considering only the co-expressed genes filter.

Results
An upregulation of NLRP5 was observed in the endometrium of patients with RIF compared to the control group (logFC = 3.025; adjusted P-value = 0.014). Although the difference was not statistically significant in the other three analyses, NLRP5 was upregulated in all four scenarios evaluated. In the placental tissue, an upregulation of KHDC3L was demonstrated in the chorionic villus of RPL patients when compared to the control group (logFC = 3.008; adjusted P-value = 0.003) (Figure 1). Interestingly, no statistically significant changes were observed in the placental samples of maternal origin (decidua) (adjusted P-value > 0.05); however, some logFC values were increased, suggesting that differential expression may be present but that the analysis lacked the statistical power to confirm it. The values of logFC and adjusted P-values for all the SCMC genes analyzed are available in Table 2.
Pearson's r was calculated to evaluate the co-expression between gene pairs, considering the four genes of the SCMC. The gene expression correlation between KHDC3L, NLRP5, OOEP, and TLE6 was lost in the endometrium of RIF patients compared to the control group (Figure 2A), although, apart from the gene pair noted below, the difference was not statistically significant when applying Fisher's Z transformation. Except for TLE6 and NLRP5 (Pearson's r < 0.5), all the other gene pairs demonstrated a moderate or high correlation in the control group. However, in RIF patients these correlations were lost, except for OOEP and NLRP5, which significantly inverted the correlation pattern, from a high inverse correlation to a moderate positive correlation (Control = -0.84 vs. RIF = 0.75, P-value = 0.03). In the endometrium of RPL patients, no statistically significant differences in the co-expression of the SCMC genes were observed (Figure 2B).
Gene-gene co-expression for KHDC3L, NLRP5, OOEP, and TLE6 was also evaluated through a systems biology strategy. The SCMC genes were co-expressed with genes related to DNA damage response and repair, embryo development, immune response, cell division, chromosome segregation, and male reproduction (Figure 3). The four SCMC genes were co-expressed with genes related to male reproduction, especially OOEP, which was also co-expressed with genes located on the Y chromosome. Additionally, except for OOEP, all the other genes were co-expressed with genes associated with immune response, and only KHDC3L was co-expressed with a gene related to DNA damage repair.

Discussion
Embryo implantation and maintenance of pregnancy depend on a competent blastocyst, receptive endometrium, and successful cross-talk between the embryonic and maternal interfaces (Ashary et al., 2018). Here we demonstrated, through publicly available data, altered SCMC gene expression and co-expression patterns in RIF and RPL patients. Compared to the control groups, an upregulation of NLRP5 was demonstrated in the endometrium of patients with RIF, as well as an upregulation of KHDC3L in the chorionic villus of RPL patients. Additionally, we demonstrated that KHDC3L, NLRP5, OOEP, and TLE6 are co-expressed with genes involved in processes required for proper embryo development and gestational maintenance, such as immune response, cell proliferation, and DNA damage repair.
The SCMC exerts several functions during early embryo development, being required for embryo progression beyond the first cell divisions (Li et al., 2008a; Zhu et al., 2014). However, studies have demonstrated alterations in SCMC genes associated with later reproductive problems, such as recurrent hydatidiform mole (Ji et al., 2019), RPL (Zhang et al., 2019), and multilocus imprinting disorders (Docherty et al., 2015). Although the molecular mechanisms linking reproductive disorders and SCMC genes remain poorly understood, it is feasible to suggest that SCMC gene expression is important not only during early embryogenesis but also later in pregnancy.
During the implantation process, the receptive endometrium is modified by embryonic signals accompanied by substantial morphological, molecular, and immunological changes required for proper embryo implantation and further maintenance of pregnancy (Ashary et al., 2018). We demonstrated an upregulation of NLRP5, a member of the SCMC, in the endometrium of RIF patients in comparison to the control group, as well as NLRP5 co-expression with genes related to immunological processes. Early pregnancy modulates the expression of the NLR family in ovine lymph nodes (Zhao et al., 2022), evidencing a role for this protein family in maternal immune regulation during pregnancy. Considering that the NLR family of proteins has a role in the activation of pro-inflammatory cytokines (Platnich and Muruve, 2019) and that embryo implantation is considered a pro-inflammatory reaction characterized by increased endometrial vascular permeability and trophoblast invasion (Kim and Kim, 2017), we postulate that NLRP5 upregulation in endometrial cells may affect embryo implantation through altered immunological regulation.
Gene variants in NLRP5, as well as in other genes of the SCMC, are associated with embryo arrest (Mu et al., 2019; Xu et al., 2020). In addition to this phenotype, alterations in NLRP5 have been associated with multilocus imprinting disorders in humans, a disturbance of multiple imprinted loci across the genome affecting metabolism, growth, and behavior (Docherty et al., 2015; Sparago et al., 2019). Epigenetic regulation of gene expression has a role in embryo implantation and gestational maintenance by regulating both embryo development and the endometrial changes required for successful implantation (Munro et al., 2010; Xu et al., 2021). Although the mechanisms behind methylation defects associated with mutations in NLRP5 remain to be elucidated, this gene could be involved in the epigenetic regulation of endometrial gene expression during embryo implantation. Therefore, we hypothesized that upregulation of NLRP5 could be associated with epigenetic deregulation of genes important during the pro-inflammatory scenario necessary for trophoblast invasion and embryo implantation, thereby affecting the activation of pro-inflammatory cytokines.
However, after embryo implantation and establishment of pregnancy, its maintenance depends not only on proper embryo development but also on correct maternal-embryo communication (Ashary et al., 2018). In this context, the correct formation of the placenta, an extraembryonic organ crucial for normal development and long-term health, is pivotal for gestational maintenance (Knöfler et al., 2019). Around 5-6 days after fertilization, the blastocyst develops and segregates into two cellular subtypes: the trophectoderm, which will differentiate to form the embryonic placental tissue (the chorionic villus), and the inner cell mass, which gives rise to the embryo proper (Knöfler et al., 2019). After blastocyst implantation, placental development is initiated and trophectoderm-derived cells give rise to all trophoblast cell types of the future placenta (Woods et al., 2018). In addition to the fetal-derived cells, placental development also depends on the maternal uterine tissue into which the blastocyst is embedded after implantation (Woods et al., 2018). The cells of the endometrium undergo decidualization, which is pivotal for supporting normal placentation and providing the proper environment for embryonic growth and survival (Woods et al., 2018). Interestingly, we demonstrated an upregulation of KHDC3L in the chorionic villus of patients with RPL in comparison to the control group. Although KHDC3L mRNAs are rarely detected in human morulae, the transcript level increases dramatically in the blastocyst and, like the other members of the SCMC, the protein is located exclusively in the outer layer formed by the trophectoderm at the blastocyst stage (Li et al., 2008b; Zhu et al., 2014). The specific localization of the SCMC during early embryo development could be associated with a role in cell lineage decisions during development and, in this context, KHDC3L could be related to the trophoblast cell proliferation and differentiation involved in placental development.
In addition, variants in KHDC3L have been associated with hydatidiform moles, an abnormal pregnancy characterized by excessive trophectoderm proliferation and abnormal or absent embryo development (Ji et al., 2019; Demond et al., 2019). This evidence suggests a possible role of KHDC3L in trophectoderm proliferation and differentiation, which could in turn affect placental development. Indeed, failures in placental formation can compromise embryonic growth and development, and abnormal placentation is a feature of diverse pregnancy complications such as pregnancy loss, stillbirth, intrauterine growth restriction, and preeclampsia (Knöfler et al., 2019). KHDC3L variants have also been related to imprinting disturbance and genomic instability of early embryonic cells leading to reproductive failures, including RPL (Zhang et al., 2019). Moreover, KHDC3L has a role in safeguarding genome integrity through homologous DNA repair (Zhang et al., 2019) and stalled replication fork restart (Zhao et al., 2018). Therefore, we hypothesized that upregulation of KHDC3L in the chorionic villus could be associated with altered epigenetic regulation of genes related to DNA repair mechanisms, thereby disturbing trophoblast cell proliferation and differentiation and influencing proper placental development.
Interestingly, considering KHDC3L, NLRP5, OOEP, and TLE6, all gene pairs except TLE6 and NLRP5 demonstrated a moderate or high correlation in the endometrium of the control group. However, in RIF patients these correlations were lost. Furthermore, OOEP and NLRP5 significantly inverted the correlation pattern, from a high negative correlation to a moderate positive correlation, which could be related to the upregulation of NLRP5 demonstrated in the DGE results. Alterations in gene expression correlation were also observed between the gene pairs NLRP5 and KHDC3L, TLE6 and KHDC3L, OOEP and NLRP5, and TLE6 and OOEP in the chorionic villus of RPL patients. Considering the role of gene expression patterns during embryo development, the disrupted gene-gene co-expression demonstrated in RIF and RPL patients could influence proper embryo implantation and gestational maintenance. Although we cannot confirm a causal association between RIF or RPL and the altered co-expression patterns of the four validated SCMC genes, a transcriptional deregulation of these genes is present in these conditions; even if it is not directly associated with RIF and RPL, other factors could be influencing this deregulation.
In addition to the differential co-expression analysis performed for KHDC3L, NLRP5, OOEP, and TLE6, we evaluated the co-expression of the SCMC members with other genes. Interestingly, the co-expression network demonstrated that the four validated SCMC genes are co-expressed with genes related to male reproduction. A previous work of our group showed OOEP downregulation in patients with teratozoospermia or non-obstructive azoospermia (Rockenbach et al., 2023), suggesting possible new roles for the SCMC genes not only in embryonic development but also in female and male reproduction. Moreover, the results presented here shed light on a possible role of the SCMC gene expression profile in later reproductive conditions, such as post-implantation gestational events.
This study has some limitations, such as the lack of experimental validation of the data analyzed and the absence of single-cell transcriptome studies. Functional analyses are needed to demonstrate the mechanisms behind the altered expression of NLRP5 and KHDC3L in RIF and RPL patients, respectively, as well as behind the altered co-expression patterns observed in these conditions. It is also important to highlight the biases introduced by clinical differences between the datasets used in this study, such as the different definitions of RPL. However, the results presented here shed light on possible molecular mechanisms associated with reproductive failures and demonstrate the importance of considering the roles of the SCMC genes in different scenarios, as well as the role of gene expression profiles at the beginning of pregnancy. Besides, the co-expression network assembled for KHDC3L, NLRP5, OOEP, and TLE6 demonstrated their co-expression with genes related to different biological processes involved in human reproduction, such as DNA damage response and repair, embryo development, immune response, cell division, chromosome segregation, and male reproduction. Therefore, although the presence of the SCMC has been confirmed only in oocytes and early embryos, the components of this complex may exert different reproductive roles in different scenarios and should be considered in future studies aiming to understand reproductive failures of embryonic, maternal, and/or paternal origin.

Figure 2 - Differential co-expression analysis for the four validated Subcortical Maternal Complex genes in the endometrium of control and RIF patients (A), endometrium of control and RPL patients (B), chorionic villus of control and RPL patients (C), and decidua of control and RPL patients (D). Positive correlations are represented in green, negative correlations in pink, and absence of correlation in white. RIF (recurrent implantation failure); RPL (recurrent pregnancy loss).

Figure 3 - Co-expression network for the four validated Subcortical Maternal Complex genes. A) KHDC3L co-expression network; B) NLRP5 co-expression network; C) OOEP co-expression network; D) TLE6 co-expression network. Nodes of different colors represent distinct biological functions, and edges connecting nodes represent genes that are co-expressed. Pink: genes related to immune response; yellow: male reproduction; green: embryonic development; purple: cytoskeleton organization; orange: DNA damage response; brown: cell cycle and chromosome segregation; red: meiosis; blue: located on the Y chromosome; grey: other functions.