Strategies towards expansion of chemical space of natural product-based compounds to enable drug discovery

Natural products (NPs) are an excellent source of biologically active molecules that provide many biologically biased features that enable innovative designing of synthetic compounds. NPs are characterized by high content of sp3-hybridized carbon atoms; oxygen; spiro, bridged, and linked systems; and stereogenic centers, with high structural diversity. To date, several approaches have been implemented for mapping and navigating into the chemical space of NPs to explore the different aspects of chemical space. The approaches providing novel opportunities to synthesize NP-inspired compound libraries involve NP-based fragments and ring distortion strategies. These methodologies allow access to areas of chemical space that are less explored, and consequently help to overcome the limitations in the use of NPs in drug discovery, such as lack of accessibility and synthetic intractability. In this review, we describe how NPs have recently been used as a platform for the development of diverse compounds with high structural and stereochemical complexity. In addition, we show developed strategies aiming to reengineer NPs toward the expansion of NP-based chemical space by fragment-based approaches and chemical degradation to yield novel compounds to enable drug discovery.


INTRODUCTION
Natural products (NPs) are considered an excellent source for the discovery and development of new drug molecules (Newman, Cragg, 2012) owing to their diversity in both structural features and molecular targets. NPs unveil opportunities for synthetic organic chemists to develop innovative synthetic strategies (Northrup, MacMillan, 2002; Sharpless et al., 1983) while navigating the natural chemical space, which is characterized by diversity and is clearly separated from synthetic libraries (Koch et al., 2005; Wetzel et al., 2007).
NPs have also received considerable attention owing to their wide presence in drugs (Newman, Cragg, 2012). The natural drug space includes pure unaltered NPs, semisynthetic compounds, and NP mimics. The reduction in the number of NP-based compounds observed in recent years has led to the search for alternative methods to explore NPs for accelerating the innovation process in the search for new drug scaffolds, which could be potential drug candidates.
A significant number of natural drugs are available in the market, and the number of newer representatives is increasing, although at a slower pace. Approximately 28% of all approved drugs are either NPs (15%) or NP derivatives (NPD, 13%) (Dias, Urban, Roessner, 2012). However, the process of moving from target identification to lead generation is laborious, especially because chemical space is enormously large and cannot be exploited conclusively by means of synthetic efforts (Wetzel et al., 2011).
NP-based drugs need to meet their endpoints and demonstrate solid safety profiles. Despite all the endeavors of the pharmaceutical industry, around nine out of every ten drug candidates fail to obtain approval, resulting in a high cost of drug development ("CSDD-Tufts Center for the Study of Drug Development," 2018).
A new trend is the application of high-throughput screening (HTS) of large compound libraries, which allows identification of multiple lead compounds to increase the chance of discovering a lead compound that can successfully reach the market. This method provides a picture of the contribution of phenotypic screening for the discovery of first-in-class small-molecule drugs (Swinney, Anthony, 2011).
NPD have the potential to yield novel drugs having improved pharmacokinetic properties and biological activities, which provide therapeutic benefits in treating diseases.
Some examples of NPs as sources of drugs are shown in Figure 1. Acetylsalicylic acid (Aspirin): an anti-inflammatory agent derived from salicin, which is isolated from the bark of the willow tree Salix alba L. (Dias, Urban, Roessner, 2012); dopamine: a neurotransmitter obtained from phenylethylisoquinoline alkaloid by hydroxylation of tyrosine to L-3,4-dihydroxyphenylalanine (DOPA) followed by decarboxylation; solifenacin: a drug used to treat overactive bladder, derived from tetrahydroisoquinoline with additional functional groups on this ring; and topotecan: an anticancer drug derived from camptothecin with additional hydroxyl and 1-N,N-dimethylmethanamine groups (Ehrenworth, Peralta-Yahya, 2017). Pure unaltered NPs (PUNP) include Yohimbine (an indole alkaloid), quinine (antimalarial therapeutic), the diterpene gibberellic acid (a plant hormone), and adrenosterone (steroid hormone) (Wetzel et al., 2011).
With the exponential increase in the number of viable novel drug targets, the quality and availability of a specific library for biological campaigns is a major issue. Therefore, populating the chemical space with innovative chemotypes besides HTS is a challenging process to rationally find hits and leads.
As an alternative, cheminformatic methods are useful to filter and select compounds. With the upcoming focus on virtual high throughput screening (vHTS), computational methods are being increasingly applied to accelerate the drug discovery process (Wetzel et al., 2007; Subramaniam, Mehrotra, Gupta, 2008). Considering the uniqueness of structural patterns available from NP libraries, correlation of innovative scaffolds with the evolutive ability to bind biological targets is still an inspiring library design to populate new areas of chemical space, with higher chances of achieving increased hit rates in biochemical and biological screening.
The two types of NP-based drugs (NPD and PUNP) given in Figure 1 provide new opportunities to enhance the application of alternative strategies for drug discovery.
NP-based drugs can be classified into different areas. For this review, we can classify NP-based drugs into fragment-like compounds with low molecular weight and complex scaffolds with high three-dimensionality and complexity.
Recent approaches use NP-derived fragments or fragment-based drug discovery (FBDD) to explore a large fraction of chemical space and to suggest simple metrics for assessing synthetic compounds for NP likeness (Over et al., 2013; Genis, Kirpichenok, Kombarov, 2012).
Alternative and elegant synthesis of NP-inspired scaffolds is another area of increasing interest in recent years (Paciaroni et al., 2017; Balthaser et al., 2011; Mcleod et al., 2014), and it applies well-known strategies such as biology-oriented synthesis (BIOS) (Antonchick et al., 2013; Van Hattum, Waldmann, 2014; Svenda et al., 2015) and diversity-oriented synthesis (DOS) (Schreiber, 2009; Schreiber, 2000; Grossmann et al., 2014) to facilitate drug discovery. BIOS involves a hierarchical classification of bioactive compounds according to structural relationships and type of bioactivity, and selection of scaffolds of bioactive molecule classes as starting points for the synthesis of compound collections, with focus on diversity. DOS is considered an efficient method to achieve structural diversity along with structural complexity (Wetzel et al., 2011).
DOS includes the "ring distortion" approach to chemically explore the available NPs containing a complex and fused ring system to expand focused libraries. Ring distortion involves the altering of complex ring systems through various chemical reactions that enable ring cleavage, expansion, rearrangement, and fusion. This method has started to gain attention recently (Huigens et al., 2013; Rafferty et al., 2014).
NP-derived fragments and ring-distortion strategies have the potential to identify drug candidates by accessing the unexplored chemical and biological spaces of NPs (Figure 2) (Crane et al., 2016).
In this review, the strategies to reduce structural complexity or to introduce drastic structural modifications of NPs to obtain the desired biological parameters, such as potency or selectivity, are shown.
Over the years, FBDD has evolved to allow characterization of an ideal fragment from a set of criteria defined as the "Rule of 3" (Congreve et al., 2003): molecular mass <300 Da; up to three hydrogen bond donors and acceptors; and logP ≤ 3. FBDD allows exploiting larger areas of chemical space with compounds having low molecular weight during the process of drug discovery. However, this approach usually does not represent the real bioactivity of the tested compounds owing to low affinity for the target (Scott et al., 2012). The possibility of adding functional groups increases the chances of improving bioactivity, making FBDD a process with a high success rate depending on the number of compounds tested initially (Figure 3).
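The "Rule of 3" thresholds quoted above can be read as a simple filter over precomputed descriptors. The following is a minimal sketch under that assumption; the `Descriptors` container, the function name, and the example values are hypothetical and are not taken from the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    name: str
    mol_weight: float  # molecular mass in Da
    h_donors: int      # number of hydrogen bond donors
    h_acceptors: int   # number of hydrogen bond acceptors
    log_p: float       # calculated logP


def passes_rule_of_three(d: Descriptors) -> bool:
    """Return True if all Rule-of-3 thresholds quoted in the text are met."""
    return (
        d.mol_weight < 300
        and d.h_donors <= 3
        and d.h_acceptors <= 3
        and d.log_p <= 3
    )


# Placeholder values for illustration only; they are not measured data.
library = [
    Descriptors("fragment_a", 182.2, 2, 3, 1.4),
    Descriptors("lead_like_b", 412.5, 3, 6, 3.8),
]
fragments = [d.name for d in library if passes_rule_of_three(d)]
print(fragments)
```

In practice the descriptor values would come from a cheminformatics toolkit; here they are supplied by hand so that the filter logic stays self-contained.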
Combining NPs and fragment approaches for drug discovery is a complex process when trying to fit the properties described above. NP-derived fragments can be highly different from synthetic compounds when structural features, design, and procurement of a library component are compared (Genis, Kirpichenok, Kombarov, 2012; Austin et al., 2016; Over et al., 2012; Pascolutti et al., 2015; Prescher et al., 2017; Rodrigues et al., 2016).
A previous study examined the Dictionary of NPs (DNP) (Buckingham, 1995) and confirmed that the most frequent structural elements of NPs are characterized by a high content of sp3-hybridized carbon atoms; oxygen; spiro, bridged, and linked systems; and stereogenic centers, with high structural diversity (Genis, Kirpichenok, Kombarov, 2012; Lee, Schneider, 2001). Furthermore, NPs populate areas of chemical space not occupied by average synthetic molecules (Over et al., 2013). In contrast, synthetic libraries are mainly characterized by sp2-rich compounds, with high content of nitrogen, covering well-explored regions of chemical space with less diversity in structure and molecular properties (Genis, Kirpichenok, Kombarov, 2012; Thomas, Johannes, 2011). This could lead to the selection of more structures belonging to the same chemical space (Cheshire, 2011).
Another aspect to be considered is the source of fragment components. While synthetic fragments are usually obtained by feasible synthesis in few steps or from available commercial sources (Murray, Rees, 2016), NP-derived fragments are obtained in diverse ways. Producing pure compounds in sufficient amounts is not straightforward and depends on the strategy used (Pascolutti et al., 2015; Over et al., 2013; Rodrigues et al., 2016).
It is possible to use NPs in FBDD in three main ways as shown in Figure 4. Fragment-derived NPs can be divided into two main groups: 1) Fragment-sized NPs, which are usually selected from available NPs based on fragment-like properties; and 2) NP-based fragments, which are designed or chemically derived from natural complex structures.
FIGURE 3 - Evolutive process of vemurafenib discovery from a fragment starting point.
Different strategies have been described in the literature for using fragments derived from NPs to develop bioactive compounds successfully.
1. Fragment-sized NPs: it is possible to filter fragment-sized NPs from an NP library, and with this collection of compounds, chemical modifications or designing of fully synthetic new compounds can be initiated;
2. In silico fragmentation: fragments are generated in silico from a defined collection of compounds and are then obtained synthetically or from a commercial library;
3. High-molecular weight degradation: chemical modifications of NPs that generally cleave the structure to generate a diverse library of compounds.
Each strategy is described below with successful examples in literature.

a) Fragment-sized natural products
The use of fragment-sized NPs is the most classic strategy. It involves isolation, synthesis, or acquisition of known compounds for screening or medicinal chemistry campaigns. In addition, this strategy can be used to design derivatives, which could be obtained by semi-synthesis or total synthesis. Artemisinin is a classic example (Figure 5-a) (Haynes, Vonwiller, 1994; Silva et al., 2015). For fragment-sized NPs, several examples can be found in the literature. Approximately 23% of all known NPs fit fragment-like properties. However, rational approaches involving the construction of a fragment library, followed by specific FBDD are not widespread.
The main characteristic of this library is fitting of fragment-like properties, while maintaining biological relevance and high degree of three-dimensionality, incorporating chirality and molecular flexibility to the compound collection, and populating diverse fractions of the chemical space, as expected for NPs.
A recent study described the use of fragment-sized NPs for target identification and hit discovery against Plasmodium falciparum (Vu et al., 2018). The group from SGC Toronto elegantly reported a strategy that involved screening of an in-house library containing 643 NPs, each complying with at least five of the established criteria for fragment-like compounds (MW ≤ 250 Da, ALogP < 4, HBD ≤ 4, HBA ≤ 5, RB ≤ 6, %PSA < 45), against 62 proteins using a native mass spectrometry (MS) approach that allowed detection of protein-ligand interactions. From this screening, it was observed that 32 proteins formed complexes with 96 hits.
The next step involved testing the 96 compounds in vitro against asexual intraerythrocytic blood stage P. falciparum 3D7 (100 mM). From the cell-based assay, it was possible to calculate the IC50 for 24 compounds, with 14 compounds showing IC50 values lower than 45 μM (Figure 5-b).
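The library selection rule described above (compliance with at least five of six fragment-likeness criteria) can be sketched as a counting filter. This is only an illustration under stated assumptions: the dictionary keys, function names, and example descriptor values are hypothetical, and the cited study may have computed or applied the criteria differently.

```python
from typing import Callable, Dict, Mapping

# Thresholds as quoted in the text; the keys are hypothetical names.
CRITERIA: Dict[str, Callable[[float], bool]] = {
    "mw": lambda v: v <= 250,          # molecular weight in Da
    "alogp": lambda v: v < 4,          # ALogP
    "hbd": lambda v: v <= 4,           # hydrogen bond donors
    "hba": lambda v: v <= 5,           # hydrogen bond acceptors
    "rb": lambda v: v <= 6,            # rotatable bonds
    "psa_percent": lambda v: v < 45,   # percent polar surface area
}


def criteria_met(desc: Mapping[str, float]) -> int:
    """Count how many of the six criteria a compound satisfies."""
    return sum(1 for key, ok in CRITERIA.items() if ok(desc[key]))


def is_fragment_like(desc: Mapping[str, float], min_met: int = 5) -> bool:
    """Accept a compound that satisfies at least `min_met` criteria."""
    return criteria_met(desc) >= min_met


# Placeholder descriptor sets, not data from the cited study.
slightly_heavy = {"mw": 262, "alogp": 2.1, "hbd": 1, "hba": 4, "rb": 2, "psa_percent": 30}
too_polar_and_heavy = {"mw": 340, "alogp": 1.0, "hbd": 5, "hba": 5, "rb": 3, "psa_percent": 40}

print(criteria_met(slightly_heavy), is_fragment_like(slightly_heavy))
print(criteria_met(too_polar_and_heavy), is_fragment_like(too_polar_and_heavy))
```

Allowing one violation keeps compounds that narrowly miss a single threshold, which is one plausible reason for a "five of six" rule rather than a strict conjunction.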
In a similar approach, Ronald et al. developed their own fragment collection by screening an in-house drug-like NP library using established fragment-like properties (Figure 5-c). From the collection of 371 compounds, binding interactions of compounds with P. falciparum deoxyuridine 5′-triphosphate nucleotidohydrolase (PfdUTPase) were analyzed using electrospray ionization Fourier-transform ion cyclotron resonance mass spectrometry (Vu et al., 2013). After selecting the effective binders and testing the ability to reduce viability of P. falciparum stage V gametocytes, it was possible to establish a clear correlation between biochemical and phenotypic assays and to highlight the potential antimalarial activity of the fragment-sized NP securinine.
Another useful application for fragment-sized NPs is biology-driven functionalization for fragment growth, usually to enhance molecular properties and improve pharmacokinetics, solubility, and bioavailability of the parent natural fragment. For fragment-sized NPs, library diversity is also achievable and can enhance the quality of the library for biological purposes. Lizos and collaborators at Novartis showed library development from a low-molecular weight NP for synthesizing analogs without the risky Michael acceptor portion as well as analogs with high three-dimensionality (Figure 6) (Prescher et al., 2017).

b) Natural product-based fragments
The fragmentation of high-molecular weight NPs can be performed manually or by in silico approaches, and is characterized by dissecting complex structures to find scaffold patterns that can be obtained from in-house or commercial libraries or even by synthetic campaigns for biology studies (Pascolutti et al., 2015; Prescher et al., 2017; Crane, Gademann, 2016). This approach tends to generate simpler compounds without, however, escaping from the properties of NP scaffolds, making structural diversity one of the most important features of FBDD (Over et al., 2012). The development of a fragment library based on NP structures can be oriented according to research interest and library features. In this regard, it is possible to describe two main strategies for working on high-molecular weight NPs to produce compounds fitting fragment-like properties: 1) virtual structure fragmentation followed by synthesis/acquisition (Koch et al., 2005; Over et al., 2012), and 2) semisynthetic degradation of NPs (Prescher et al., 2017).
Examples of fragmentation of high-molecular weight NPs can be found in the literature, ranging from development of a large library of fragment derivatives by synthesizing or extracting simpler scaffolds to medicinal chemistry approaches.
Using simpler NP-based scaffolds to develop bioactive compounds is a classic strategy that can be classified as an FBDD approach. Recently, the Searcey group developed a hybrid of the high-molecular weight simocyclinone D8 with ciprofloxacin, using a spacer to link the antibiotic to the coumarin core inspired by the NP structure (Figure 7) (Austin et al., 2016).
Virtual fragmentation can be assessed by dissecting the structures of parent compounds (Figure 9), usually available in in-house libraries, commercial sources, or NP databases, using a "pseudo-retrosynthetic" approach (RECAP) (Rodrigues et al., 2016; Lewell et al., 1998) or a structural classification of NPs (SCONP) approach (Koch et al., 2005). Other approaches with similar algorithms have different methods to dissect high-molecular weight NPs to develop a fragment library (Figure 9-d) (Over et al., 2012). Furthermore, it is possible to generate new fragments by in silico degradation. In this case, in silico reactions are applied to specific structures to generate a virtual products library (Figure 10) (Prescher et al., 2017). In these two approaches, target fragments are generated from an NP-guided compound library development approach and can be selected for chemical elaboration according to biology results.
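The idea behind pseudo-retrosynthetic fragmentation can be illustrated with a toy graph model: bonds belonging to classes regarded as synthetically accessible are cut, and the remaining connected pieces are collected as fragments. The sketch below is a deliberately simplified assumption and not the published RECAP rule set; the bond classes, the `fragment` function, and the six-atom example are hypothetical.

```python
from collections import defaultdict

# Bond classes treated as cleavable in this toy example (an assumption;
# the published RECAP rules are more numerous and chemically specific).
CLEAVABLE = {"amide", "ester"}


def fragment(atoms, bonds):
    """Split a molecular graph at cleavable bonds.

    atoms: iterable of atom ids
    bonds: iterable of (atom_a, atom_b, bond_class) tuples
    Returns a list of fragments, each a sorted list of atom ids.
    """
    neighbours = defaultdict(set)
    for a, b, bond_class in bonds:
        if bond_class not in CLEAVABLE:  # keep only bonds that survive cleavage
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen, fragments = set(), []
    for start in sorted(atoms):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:  # depth-first walk over the remaining bonds
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        fragments.append(sorted(component))
    return fragments


# Hypothetical six-atom chain: 0-1 (single), 1-2 (amide), 2-3 (single),
# 3-4 (ester), 4-5 (single).
atoms = range(6)
bonds = [(0, 1, "single"), (1, 2, "amide"), (2, 3, "single"),
         (3, 4, "ester"), (4, 5, "single")]
print(fragment(atoms, bonds))  # [[0, 1], [2, 3], [4, 5]]
```

Real applications would operate on chemical structures with a cheminformatics toolkit rather than on hand-labelled bonds; the graph version only shows the cut-and-collect logic.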
The chemical degradation of NPs is a challenging process as it depends on the number of available compounds for semisynthetic exploration, number of steps required for generating fragments, elucidation of complex structure, and reactivity of highly functionalized NPs. Furthermore, this strategy leads to a highly diverse library of derivatives of NPs and is gaining special attention from the research community because of the opportunities it opens for drug discovery. This topic is explained in the following section.

Ring-distortion strategy
Substituted heterocycles are critical scaffolds for drug development and have received considerable attention owing to their wide existence in a myriad of NPs (Zhang, Song, Qin, 2011). When applying BIOS and DOS with focus on fused ring systems from NPs, several notable biological discoveries have been made in diverse disease areas (Svenda et al., 2015; Basu et al., 2011; Collins, ones, 2014).
In BIOS, biological relevance is the prime criterion for the selection of compound classes and scaffolds that inspire the synthesis of compound collections with high bioactivity. BIOS connects chemical and biological spaces, that is, protein structure similarity clusters and small-molecule compound collections through biological prevalidation. This extends well beyond NPs and includes all compounds with known biological relevance (Wetzel et al., 2011). Although BIOS offers relevant compounds, it is more demanding chemically and may require elaborate chemistry methods and multistep sequences (Dandapani, Marcaurelle, 2010).
Alternatively, DOS can provide access to complex and diverse small molecules, thereby demonstrating great promise in modulating the activity of many targets that have largely been outside the purview of traditional compound collections (Dandapani, Marcaurelle, 2010).In addition, DOS is of great significance in both chemical and pharmaceutical fields (Wetzel et al., 2011).
A recent study (Liu et al., 2018) described a synthetic methodology for C-N bond formation based on Al(OTf)3-mediated cascade deprotection, cyclization, and ionic hydrogenation pathway. This method facilitates the synthesis of diverse heterocycles, including hexahydrophenoxazines, tetrahydroquinolines, indoline, hexahydrocarbazoles, and pyrrolidones. This strategy can be employed as a key step towards the formation of piracetam (an approved medication for the treatment of mental diseases), thus offering an alternative and non-toxic route to the traditional methods.
Another heterocycle present in many drugs is indole (Zhang, Song, Qin, 2011; Paciaroni et al., 2017; Osman, El-Samahy, 2000; Bromidge et al., 1998). This fused ring demonstrates diverse biological activities owing to the numerous biological targets that bind various indoles, making the indole nucleus a privileged scaffold that occupies a notable and biologically relevant chemical space.
An example of an indole-containing NP is Yohimbine (Figure 11), which has a highly complex ring system fused to an indole nucleus (Paciaroni et al., 2017). However, the development of novel DOS and BIOS approaches remains a challenging task, demanding more efficient protocols for the production of heterocycles (Liu et al., 2018).
The ring-distortion strategies offer an alternative to the classic means of drug discovery, including new compound isolation and semisynthetic NP modifications. They allow access to unexplored chemical space, which in turn provides the potential for new compound targets to be discovered (Rossiter et al., 2017). This methodology is useful to construct stereochemically complex and structurally diverse compounds from NPs. Ring-cleavage, ring-expansion, ring-fusion, and ring-rearrangement reactions can enable dramatic structural changes, as shown in Table I.
To demonstrate the ring-distortion strategy (Figures 11-14), four of the most readily available and well-studied NPs were selected, including the alkaloids Yohimbine and quinine, the diterpene gibberellic acid, and the steroid adrenosterone.

Diversifying Yohimbine
To develop an innovative strategy, the ring-distortion methodology involving Yohimbine as a platform was used to generate complex and diverse small molecules with unique and diverse molecular architectures (Paciaroni et al., 2017). Figure 11 shows a tryptoline ring-distortion strategy that enabled the rapid synthesis of four complex and diverse compounds (A-D) from Yohimbine (Paciaroni et al., 2017).
For the rapid synthesis of sequences (diverse ring cleavage, ring fusion, and ring rearrangement) from Yohimbine, two known transformations were employed: cyanogen bromide ring cleavage with alcohols and transformation of Yohimbine into a spirooxindole-containing scaffold (Paciaroni et al., 2017). A four-step oxidative rearrangement of Yohimbine afforded the ring-rearranged spirooxindole form A. Owing to the synthetic utility of cyanogen bromide in ring cleavage reactions, the centrally positioned basic nitrogen was connected through the tryptoline sub-structure of Yohimbine to yield B-D.
By altering molecular topology and reorganizing the core structure via ring-rearrangement of the Yohimbine-based scaffold, a new ring-distorted compound A was obtained, which showed anti-inflammatory activity. This clearly demonstrated the potential of ring-distortion strategies for the discovery of new biologically active small molecules.

Diversifying quinine
The stereochemical complexity, diverse functionality (a tertiary amine, a secondary alcohol, an olefin, and a quinoline), and two discrete ring systems of quinine (Figure 12) make it amenable to selective ring-system distortion to produce diverse molecular scaffolds (Huigens et al., 2013).
The application of ring-distortion strategies in the synthesis of complex and diverse small molecules from quinine has followed sequences of one- to five-step reactions, which allow the conversion of quinine to the four structures E-H. The cleavage and (Huigens et al., 2013; Hergenrother et al., 2013).
In this review, one chemical step starting from quinine provided new functional groups that can be further diversified (compounds E and H) to enhance the inherent biological activity or to improve drug-like properties.
Quinine shows ring-cleavage and fusion strategies as the most utilized approaches. While the ring-cleavage approach increases flexibility and conformational freedom (compound G), the ring-fusion approach results in the opposite effect by the simple addition of a new constrained ring to the ring system (compounds E, F, and H). Depending on the objectives, both ring strategies can improve the profile of small molecules as potential drugs.

Diversifying the diterpene gibberellic acid
The diterpene gibberellic acid enables the selective and independent functionalization (Figure 13) of each ring of the core structure via various reactions that can distort the tetracyclic diterpene core with a fused lactone (Huigens et al., 2013; Hergenrother et al., 2013).
Employing three- to five-step reactions can convert gibberellic acid into the four structures I-L. Oxidative cleavage conditions using pyridinium chlorochromate were used to obtain I and J. A ring opening reaction produced K by intramolecular [4 + 2] cycloaddition. Subsequently, base-catalyzed lactone rearrangement produced L.
An intermediate of L showed anticancer activity against lymphoma, cervical, melanoma, lung, and breast cell lines. The dramatic reorganization of the core structure by ring rearrangement provided a new derivative with biological properties.

Diversifying adrenosterone
Adrenosterone comprises five contiguous stereogenic centers and four individual carbocyclic rings. In addition, this structurally complex steroidal framework is functionalized with an enone or ketone and an exocyclic double bond. These functional groups can be manipulated strategically to synthesize novel, diverse, and complex chemical scaffolds (Huigens et al., 2013; Hergenrother et al., 2013).
FIGURE 13 - Ring-distortion strategies to access gibberellic-based scaffolds.
Using the known chemical reactivity of these functional groups, two- or three-step reactions can be employed to convert adrenosterone into four structures (Figure 14, M-P). Applying a novel Schmidt reaction and synthetic elaboration (catalyzed by DMAP) yielded M and N. Oxidative cleavage of the adrenosterone ring using NaIO4 and KMnO4 and further elaboration using Baeyer-Villiger rearrangement or Schmidt reaction yielded O and P (Huigens et al., 2013).
Ring expansion is an efficient method to form novel ring skeletons or to serve as a prelude to ring-cleavage reactions, as shown in Figure 14. Changing the molecular volume and shape can be useful in controlling the lipophilicity of small molecules for use as drugs, and as an initial point for the design of compounds inherently biased by biological success.

Concluding remarks
The novel approaches shown in this review demonstrate the main characteristics of NPs and provide an opportunity to explore compounds by different approaches. Besides molecular biology, chemical ecology, and biosynthesis-driven strategies, we described classic NP chemistry based on isolation-testing sequence for identifying good leads, as well as diversity-oriented approaches based on fragments and synthetic methodologies to expand the NP-based chemical space.
Overall, the main advantage of fragments and ring distortion over the other available strategies is the ability to rapidly generate novel chemotypes that enhance the quality of chemical libraries for biological screening and drug discovery from virtual technologies and synthetic campaigns. Taken together, these strategies have led to the identification of multiple biologically active NP-derived compounds, thereby demonstrating the potential of the ring-distortion approach to discover new biologically active compounds (Paciaroni et al., 2017) and providing proof-of-concept for these methods (Rossiter et al., 2017).
It is also possible to address diversity problems in screening libraries of small molecules that occupy biologically relevant chemical space in areas critical to human health. For medicinal chemists involved in the design of compound libraries, it is highly advisable to diversify the range of appropriate fragments derived from NPs, and these fragments can be incorporated in the molecular framework of the synthetic molecule.
Only a few studies have successfully explored these methods to date (Rossiter et al., 2017; Huigens et al., 2013; Ciardiello et al., 2017; Garcia, Drown, Hergenrother, 2016). These studies provide an open opportunity for NP and medicinal chemists to populate different areas of chemical space and inspire them to move beyond common scaffolds and identify more innovative structures for probing biological activity (Rossiter et al., 2017).

FIGURE 1 - Examples of natural products (NP) as sources of drugs (blue rectangle) and NP-inspiring compounds (green rectangle).

FIGURE 2 - Approaches providing novel opportunities to synthesize natural product-inspired compound libraries.

FIGURE 4 - Classification and definitions of natural and non-natural compounds according to structural features and fragment-like properties.

FIGURE 5 - Different approaches for developing bioactive compounds from fragment-sized natural products.

FIGURE 7 - Fragment selection from natural products for developing new bioactive compounds.

FIGURE 8 - Selection and synthesis of antitumoral fragments derived from high-molecular weight natural products.

FIGURE 9 - Virtual fragmentation by RECAP and SCONP approach (a and c); RECAP construction of a fragment library (b); and SCONP approach toward bioactive compound development (d).

FIGURE 10 - Virtual degradation toward fragment library development.