Use of mass spectrometry as a tool for the search or identification of flavonoids in Urticaceae

Abstract The Urticaceae family, circumscribed within the Rosales, is examined in this study through an overview of the current literature on phytochemical studies using Liquid Chromatography coupled to Mass Spectrometry (LC-MS). The aim of this study was to review the secondary metabolites identified in the Urticaceae using LC-MS analysis. A systematic review was performed using the Scifinder and ScienceDirect databases. Phenolic substances are the most abundant in the Urticaceae family, especially flavones, phenolic acids, and flavonols. We show that flavonoids are important chemotaxonomic markers of the chemical composition of the Urticaceae. In terms of chemical attributes, the C-glycosylated and O-glycosylated flavones stand out as the main skeletons. Our results reveal the chemical profile and structural variability of micromolecules from each genus of Urticaceae studied. The survey shows a predominant use of reversed-phase liquid chromatography coupled to a mass spectrometer with an electrospray ionization (ESI) source operating in negative mode. In addition, the mobile phase is usually a binary system eluted in gradient mode. Finally, this paper presents the molecular ions and fragmentation patterns of chemical markers in Urticaceae identified and isolated using LC-MS, which has proven to be a valuable tool in several areas, such as phytochemistry, chemosystematics, and chemophenetics. In conclusion, this review is expected to help in the identification and separation of phenolic compounds from the Urticaceae family.


Introduction
High-resolution mass spectrometry (MS) has become a powerful, highly sensitive structural tool with excellent analytical power. It can be particularly valuable and versatile in determining the structural information required for the characterization of secondary metabolites very quickly (Alvarez-Rivera et al. 2019).
Different ionization techniques yield molecular ions and their respective molecular fragments. The fragmentations generated provide relevant data for structural elucidation, such as molecular weight, empirical formula, detection of functional groups and substituents, stereochemical features and isotope ratios (Syage et al. 2008; Patel et al. 2010; Alvarez-Rivera et al. 2019).
Technological advances in recent decades have allowed Mass Spectrometry to be coupled with chromatographic methods, such as Liquid Chromatography (LC), playing an important role in the separation and identification of secondary metabolites (Fraige et al. 2018).
Coupling mass spectrometers to Liquid Chromatography (LC-MS) provides many advantages for the identification of organic/phenolic substances, including high specificity, short response time and fast analysis. In particular, collision-induced dissociation (CID) influences the separation and efficiency and, thus, the capacity for structural elucidation in LC-MS (Seraglio et al. 2016; Fraige et al. 2018).
LC-MS is a versatile analytical tool with great potential in qualitative and quantitative studies of a wide variety of metabolites with different physicochemical properties (polarity, volatility and molecular weight) (Lanças 2009; Cajka & Fiehn 2016; Kruve 2020; Dührkop et al. 2021).
The technique quickly established itself as a tool for the determination of non-volatile micromolecules, especially phenolic micromolecules. Flavonoids are phenolic substances that are easily separated and identified by LC-MS. For glycosylated flavonoids, this technique provides detailed information on the structure of the aglycone, the type of sugar and the position of its substitutions, as well as the types of interglycosidic bonds or acyl substituents (Cuyckens & Claeys 2004).
Flavonoids are abundant in angiosperm plants, but their structural variability is unexplored in the Urticaceae family from a chemotaxonomic point of view. The Urticaceae family comprises approximately 1,200 species distributed in 54 genera (Treiber et al. 2016). The micromolecular profile of this family is characterized by the presence of flavonoids, phenolic acids and triterpenic acids. It should be emphasized, however, that flavonoids, especially O-glycosylated and C-glycosylated flavonoids, are the main chemical markers of this family.
To examine the use of the LC-MS technique in the separation and identification of the main classes of metabolites present in the Urticaceae family, a bibliographic survey of the chemical substances identified and isolated in this family was carried out. This review is therefore designed to present the structural variability of the main classes identified, which are differentiated by their fragmentation data.

Material and Methods
The systematic review was performed on the Scifinder and ScienceDirect databases. Articles from January 2006 to April 2022 were selected using the following keywords and their combinations as search terms: Urticaceae, HPLC-MS (High-Performance Liquid Chromatography coupled to Mass Spectrometry), UHPLC-MS (Ultra High-Performance Liquid Chromatography coupled to Mass Spectrometry) and LC-MS. The literature searches encompassed all genera of the family, as classified by Treiber et al. (2016).

Analysis parameters for liquid chromatography coupled to mass spectrometry
Forty-two studies were identified using the search criteria. The characteristics of the LC-MS analyses are shown in Table S1 (available on supplementary material <https://doi.org/10.6084/m9.figshare.23907960.v1>), with details about the part of the plant analyzed, the column type, the stationary and mobile phases chosen and the respective authors.
Liquid chromatography coupled to mass spectrometry with an electrospray ionization (ESI) source was used in most studies. However, two studies used Atmospheric Pressure Photoionization (APPI) (Pinelli et al. 2008; Rivera-Mondragón et al. 2019) and only one paper used Atmospheric Pressure Chemical Ionization (APCI) (Shrestha et al. 2020). The majority of papers operated in negative mode. The LC-MS parameters are detailed in Table S1 (available on supplementary material <https://doi.org/10.6084/m9.figshare.23907960.v1>).
As for the stationary phase, most columns were octadecyl-silica C-18 reversed-phase, with particle size ranging from 2.6 to 5 μm for HPLC and 1.7 to 1.9 μm for UHPLC. In contrast, the study developed by Pinelli et al. (2008) used a C-12 stationary phase specifically for the identification of anthocyanins.
A binary mobile phase solvent system was employed in most studies, consisting of acidified water and an organic phase, also acidified, containing methanol or acetonitrile. The most common elution method was gradient elution, with different elution times and compositions. Separation of chemicals with similar structures is usually performed by polarity gradient elution, which provides better resolution between peaks and shorter analysis time (Costa et al. 2000).
In many studies, acetic acid or formic acid has been used in the mobile phase because acidification helps reduce the ionization of phenolic substances, improving retention and separation. It also contributes to obtaining chromatograms with sharper peaks (Cuyckens & Claeys 2004). The concentration of the acids used varies in most works from 0.05 to 2% (v/v). However, in the identification of anthocyanins a high acidity (4.5% v/v) was observed, presumably due to the instability of these substances, as low pH prevents the degradation of non-acylated anthocyanin pigments (Costa et al. 2000).
In studies performed with LC-MS, most chemical substances were identified by interpretation of characteristic fragmentation patterns and/or comparison with existing database spectra. Furthermore, some studies performed identification by means of analytical standards and also with the aid of other techniques, such as NMR (Nuclear Magnetic Resonance).
In this context, it is worth considering that the structural information provided by Mass Spectrometry, combined with the separation efficiency of Liquid Chromatography, makes LC-MS a versatile analytical tool with great potential in qualitative and quantitative studies (Lanças 2009).

Chemical profile of Urticaceae
Of the 54 genera of the Urticaceae family, only 11 have been studied. Of these studied genera, seven are found in Brazilian territory (Boehmeria, Cecropia, Coussapoa, Pourouma, Urera, Urtica and Pilea). In Brazil, the Urticaceae family is represented by 13 genera [Flora do Brasil 2020 (continuously updated)].

Boehmeria genus.
For the genus Boehmeria, studies were found referring to nine species (Tab. S1, available on supplementary material <https://doi.org/10.6084/m9.figshare.23907960.v1>), eight of them performed with the leaves and only one with the root. Boehmeria nivea is the most studied species. In this genus, twelve phenolic acids, four flavonols, one sterol, one alkaloid and one monoterpene were identified (Tab. S2, available on supplementary material <https://doi.org/10.6084/m9.figshare.23907960.v1>). Phenolic acids were the most frequently found class (64%), followed by flavonols (21%) (Fig. 2), which were mainly present in the leaves. To a lesser extent, a steroid (β-sitosterol), an alkaloid [(-)-cryptopleurine] and a monoterpene [(-)-loliolide] were also identified in the leaves. In a study carried out on the roots of B. nivea, only four phenolic acids were identified. Aqueous methanol and ethanol were used for extraction from the leaves, and ethyl acetate was used for extraction from the root.

Cecropia genus.
The genus Cecropia, the most studied to date, presents a diversity of types of secondary metabolites, such as flavonols, flavones, flavan-3-ols, condensed tannins, phenolic acids, flavonolignans, terpenoids (pentacyclic triterpenoids, iridoids, and triterpenoid saponins), and steroids (Tab. S2, available on supplementary material <https://doi.org/10.6084/m9.figshare.23907960.v1>). All chemical metabolites were identified in the leaves of the ten species of the genus Cecropia. Phenolic substances are the most frequently found in this genus (Fig. 2), flavones being the most abundant (39%). The aqueous and hydroethanolic extracts of the leaves were the most analyzed.

Girardinia genus.
To date, only one study has been conducted on the genus Girardinia. In the methanolic extract made from the tips of the shoots of the species Girardinia diversifolia, the presence of flavonols, flavones, anthocyanins, phenolic acids, terpenoids (triterpenes, carotenoids, triterpenoid saponins, seco-iridoid glycosides), and steroids was evidenced. Of the nine genera found, the genus Girardinia was the one that presented the greatest abundance of lipophilic substances. The most abundant secondary metabolite types were terpenoids (37%), steroids (23%), phenolic acids (20%), and flavones (14%), respectively (Fig. 2).

Urtica genus.
The genus Urtica has been the subject of several studies, mainly concentrated on Urtica dioica. Secondary metabolites have been identified in different parts of species of this genus, such as flowers, stems and roots, and especially in the leaves. A large number of types of secondary metabolites have been identified in the genus Urtica, such as flavones, flavan-3-ols, anthocyanins, coumarins, lignans and phenolic acids (Tab. S2, available on supplementary material <https://doi.org/10.6084/m9.figshare.23907960.v1>). Phenolic substances such as phenolic acids (38%) and flavonols (30%) are predominant in the genus (Fig. 2). The hydroethanolic and hydromethanolic extracts were the most analyzed.

Secondary metabolites identified by LC-MS
Flavonoids and phenolic acids were the most abundant compounds (52% and 21%, respectively). Therefore, this study discusses the fragmentation of the structurally variable flavonoids and of the phenolic acid micromolecules identified by LC-MS.

Fragmentation pattern of flavones and flavonols
The main classes of flavonoids in the Urticaceae family are flavones and flavonols, attached to one or more sugar units as O- or C-glycosides. The flavone aglycones are derived from apigenin, diosmetin and luteolin, while the flavonol aglycones are derived from isorhamnetin, kaempferol, quercetin and myricetin (Fig. 3).
Flavonoid O-glycosides have sugar substituents attached to an aglycone hydroxyl group, usually at positions C-3 or C-7, while in flavonoid C-glycosides the sugars are linked to the aglycone by a carbon-carbon bond at the C-6 or C-8 position (Waksmundzka-Hajnos & Sherma 2011).
In flavonoid O-glycosides, loss of the entire sugar unit is usually observed, because the energy required to break the hemiacetal C-O bonds is low (Vukics & Guttman 2010) (Fig. 4).
This type of fragmentation can be observed in data from some of the flavonoid O-glycosides identified in the Urticaceae family. Quercetin 3-O-hexoside (m/z 463) and luteolin-7-O-hexoside (m/z 447) generate the fragments m/z 301 and m/z 285, respectively, by the loss of a hexose unit (162 u), as shown in Figure 5.
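The neutral-loss arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration using nominal masses, as in the text; the function names, the tolerance value, and the deoxyhexose and pentose entries are assumptions added for illustration and are not taken from the reviewed studies.

```python
# Nominal neutral-loss masses (u) of common sugar residues.
# Only the hexose value (162 u) is quoted in the text; the other two
# entries are included as illustrative assumptions.
SUGAR_LOSSES = {"hexose": 162, "deoxyhexose": 146, "pentose": 132}


def aglycone_ion(precursor_mz, sugar="hexose"):
    """Return the m/z expected after loss of one whole sugar residue."""
    return precursor_mz - SUGAR_LOSSES[sugar]


def match_sugar_loss(precursor_mz, fragment_mz, tol=0.5):
    """Name the sugar whose loss explains precursor -> fragment, or None."""
    diff = precursor_mz - fragment_mz
    for name, mass in SUGAR_LOSSES.items():
        if abs(diff - mass) <= tol:
            return name
    return None


# Quercetin 3-O-hexoside and luteolin-7-O-hexoside, [M-H]- ions from the text.
print(aglycone_ion(463))           # quercetin aglycone ion
print(aglycone_ion(447))           # luteolin aglycone ion
print(match_sugar_loss(463, 301))  # which sugar loss explains 463 -> 301
```

With accurate-mass data, the nominal values would be replaced by exact masses and a much tighter tolerance.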
In the Urticaceae family, intraglycosidic cleavage commonly occurs in flavonoid C-glycosides, requiring high energy to break the C-C bond. Figure 6 shows the main fragmentations that occur in these cases. It is important to point out that this type of fragmentation can also be observed in O-glycosides, but less frequently (Vukics & Guttman 2010).
The flavone diosmetin-C-hexoside (m/z 461), identified in the Cecropia genus, serves as an example of this kind of fragmentation. Its most abundant fragments were generated by intraglycosidic cleavage: the base peak m/z 341 (⁰,²X⁻) and m/z 371 (⁰,³X⁻), as illustrated in Figure 7.
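The diosmetin-C-hexoside example can be reproduced with simple arithmetic: the two cross-ring fragments correspond to nominal losses of 120 u and 90 u from the precursor ion. The sketch below encodes that check; the function names and the tolerance are hypothetical, and the loss values are inferred from the m/z values quoted in the text (461 to 341 and 461 to 371).

```python
# Nominal neutral losses (u) for cross-ring cleavage of a C-linked hexose,
# inferred from the diosmetin-C-hexoside example (461 -> 341 and 461 -> 371).
CROSS_RING_LOSSES = {"0,2X": 120, "0,3X": 90}


def c_hexoside_fragments(precursor_mz):
    """Return the expected cross-ring fragment m/z values for a C-hexoside."""
    return {label: precursor_mz - loss
            for label, loss in CROSS_RING_LOSSES.items()}


def looks_like_c_hexoside(precursor_mz, fragment_mzs, tol=0.5):
    """True if every expected cross-ring fragment is present in the spectrum."""
    expected = c_hexoside_fragments(precursor_mz).values()
    return all(any(abs(f - e) <= tol for f in fragment_mzs) for e in expected)


print(c_hexoside_fragments(461))
print(looks_like_c_hexoside(461, [341, 371]))
```

This is only a screening heuristic: as noted above, O-glycosides can occasionally show the same cleavages, so a positive match would still need to be confirmed against the rest of the spectrum.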
The flavonoid aglycone type is identified through the fragmentation spectrum, but this requires further data analysis. One approach is to examine the cleavage of the C-C bonds of the aglycone C-ring by the retro-Diels-Alder (RDA) mechanism. Losses of water, methyl, CO2 and CO, and successive losses of these species, are also observed and are important in identifying specific functional groups (Fig. 9) (Fabre et al. 2001; Benayad et al. 2014; Treiber et al. 2016; Villiers et al. 2016).

Fragmentation patterns of phenolic acids
Phenolic acids represent the second most recorded class of micromolecules in Urticaceae in the period analyzed. Cinnamic acid derivatives stand out in particular. Several types of chlorogenic acids have been identified, that is, acids formed by the esterification of quinic acid with the following trans-arylpropionic acids: caffeic, ferulic, and p-coumaric acids.
The 5-O-caffeoylquinic, 4-O-caffeoylquinic and 3-O-caffeoylquinic acids present m/z 353, which corresponds to the molecular ion of caffeoylquinic acid in the negative mode (Fig. 13). The differentiation between these isomers is based on the MS/MS fragmentation profile. The isomers at the C-3 and C-5 positions are characterized by the base peak m/z 191 ([quinic acid-H]-), relative to quinic acid, whereas the C-3 isomer has other fragments relative to quinic acid and the corresponding cinnamic acid. The C-4 isomer gives m/z 173 ([quinic acid-H2O-H]-) as the base peak (Clifford et al. 2003).
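The base-peak rule described above lends itself to a small decision function. The following sketch uses only the m/z values quoted in the text; the function name, return strings, and tolerance are hypothetical, and the rule deliberately does not attempt to separate the 3-O and 5-O isomers, which requires inspecting the additional fragments mentioned above.

```python
def classify_cqa(precursor_mz, base_peak_mz, tol=0.5):
    """Assign a caffeoylquinic acid isomer group from the MS/MS base peak.

    Rule taken from the text: precursor [M-H]- at m/z 353;
    base peak m/z 173 -> 4-O isomer, base peak m/z 191 -> 3-O or 5-O isomer.
    """
    if abs(precursor_mz - 353) > tol:
        return "not a caffeoylquinic acid [M-H]-"
    if abs(base_peak_mz - 173) <= tol:
        return "4-O-caffeoylquinic acid"
    if abs(base_peak_mz - 191) <= tol:
        return "3-O- or 5-O-caffeoylquinic acid"
    return "undetermined"


print(classify_cqa(353, 173))
print(classify_cqa(353, 191))
```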
Arylpropionic acids have also been identified. While the main fragment of ferulic acid is the ion m/z 134, formed by the elimination of CO2 and of the methyl group (by radical fragmentation), for caffeic acid the ion m/z 135 corresponds only to the loss of CO2, resulting from decarboxylation. These fragmentation steps are illustrated in Figure 15.
p-Coumaric acid (m/z 163) gives the fragment m/z 119, formed by the loss of CO2 ([p-coumaric acid-CO2-H]-). On the other hand, p-coumaroyl hexoside (m/z 325) shows the fragments m/z 163, formed by the loss of the sugar, and m/z 119 ([p-coumaric acid-H-CO2]-), originating from the loss of CO2 (Fig. 16). Sinapic acid (m/z 223) has as its main fragment the ion m/z 193, resulting from the successive loss of two methyl groups, whereas sinapic acid hexoside (m/z 385) gives the fragment m/z 223 ([sinapic acid-H]-), formed by the loss of hexose (Fig. 16).
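The fragment ions quoted for the arylpropionic acids follow from summing a few nominal neutral losses (CO2 = 44 u, CH3 = 15 u, hexose = 162 u). The sketch below recomputes them as a consistency check. The table layout and function names are hypothetical, and the [M-H]- values for ferulic acid (m/z 193) and caffeic acid (m/z 179) are not stated in the text; they are assumed here from the nominal molecular masses of these acids.

```python
# Nominal neutral-loss masses (u).
LOSS = {"CO2": 44, "CH3": 15, "hexose": 162}

# Precursor [M-H]- and the sequence of losses described in the text.
# The ferulic (193) and caffeic (179) precursor values are assumptions
# based on nominal molecular masses; the rest are quoted in the text.
PATHWAYS = {
    "ferulic acid": (193, ("CO2", "CH3")),
    "caffeic acid": (179, ("CO2",)),
    "p-coumaric acid": (163, ("CO2",)),
    "p-coumaroyl hexoside": (325, ("hexose",)),
    "sinapic acid": (223, ("CH3", "CH3")),
    "sinapic acid hexoside": (385, ("hexose",)),
}


def apply_losses(precursor_mz, losses):
    """Subtract each named neutral loss in turn and return the fragment m/z."""
    mz = precursor_mz
    for name in losses:
        mz -= LOSS[name]
    return mz


def predicted_fragments():
    """Map each compound to the fragment m/z implied by its loss sequence."""
    return {name: apply_losses(mz, losses)
            for name, (mz, losses) in PATHWAYS.items()}


for compound, fragment in predicted_fragments().items():
    print(f"{compound}: m/z {fragment}")
```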
Arylpropionic acids, such as p-coumaric, caffeic and ferulic acids, were also found esterified with malic acid. Benzoic acid derivatives have also been identified in the Urticaceae family, such as gallic, protocatechuic, gentisic, vanillic and syringic acids (Fig. 18). The main fragmentation of gallic (m/z 169), gentisic (m/z 153) and protocatechuic (m/z 153) acids is generated by the

Based on the survey, it was found that of the 54 genera of the Urticaceae family, only 11 genera had phytochemical studies using LC-MS, and seven of these genera were found in Brazil. Most of the studies have focused on the Cecropia and Urtica genera. Therefore, although the family has been little explored from a phytochemical point of view, it can be concluded from the articles found that the studies are concentrated on the leaves of the species and that phenolic substances predominate in the Urticaceae family, mainly belonging to the flavone, phenolic acid, and flavonol classes. Detailed fragmentation analysis of some phenolic metabolites allows chemical identification without the use of standards and can even differentiate certain isomers. Therefore, LC-MS provides a faster and more efficient characterization of the chemical profile of plant species. Finally, with its little explored biodiversity, promising perspectives

Figure 1 - Classes of secondary metabolites identified in the Urticaceae family by LC-MS.

Figure 2 - Classes of secondary metabolites identified by LC-MS in genera of the Urticaceae family.

Figure 3 - Types of aglycone flavones and flavonols present in the Urticaceae family.