Evaluation of drug susceptibility profile of Mycobacterium tuberculosis Lineage 1 from Brazil based on whole genome sequencing and phenotypic methods

BACKGROUND Given the growing number of tools proposed for predicting drug susceptibility of Mycobacterium tuberculosis from whole-genome sequences (WGS), these genome-based procedures need to be evaluated against the conventional culture-based reference method. OBJECTIVES This study aimed to evaluate drug susceptibility testing (DST) outcomes obtained with WGS-based tools and with phenotypic methods on isolates of M. tuberculosis Lineage 1 from the state of Pará, Brazil, a lineage generally associated with low levels of drug resistance. METHODOLOGY Culture-based DST was performed using the Proportion Method in Löwenstein-Jensen medium on 71 isolates that had been submitted to WGS. We analysed the seven main genome sequence-based tools for resistance and lineage prediction applied to M. tuberculosis and used the Kappa concordance test for the comparison. FINDINGS When comparing the WGS-based tools against the DST, we observed the highest level of agreement using TB-profiler. Among the tools, TB-profiler, KvarQ and Mykrobe identified the largest number of TB-MDR cases. Comparing the four most sensitive tools regarding resistance prediction, agreement was observed for 43 genomes. MAIN CONCLUSIONS Drug resistance profiling using next-generation sequencing offers rapid assessment of resistance-associated mutations, thereby facilitating rapid access to effective treatment.

Tuberculosis (TB) is a millenary infectious disease caused by bacteria belonging to the classical taxonomy group Mycobacterium tuberculosis complex (MTBC). Within this group, M. tuberculosis and M. africanum are the main causative agents of the disease in humans and have been classified into eight phylogenetic lineages (L1-L8), presenting different patterns of geographical distribution and associated drug resistance profiles. Among these, L1 is mostly restricted to Eastern Africa and the South of India. (1) Lineage 1 is infrequently reported in South America except for one study that described 98 such strains in Northern Brazil, representing 10% of the study population. (2) Through single nucleotide polymorphism (SNP) typing, we hypothesised that these strains were introduced during the trans-Atlantic slave trade. (3) L1 has not been associated with drug resistance (DR), (4) unlike L2 and L4, which are strongly associated with DR. (4) We therefore determined the DR profile using conventional phenotypic antimicrobial drug susceptibility tests (DST) and compared it with in silico DR prediction from whole-genome sequencing (WGS) of 71 isolates of M. tuberculosis L1 from Pará generated presently, together with three isolates with recently published genomes. (5) The in silico procedures were: TB-profiler, (6) KvarQ, (7) PhyResSe, (8) Mykrobe, (9) MTBseq, (10) CASTB (11) and Resistance Sniffer (RSniffer). (12)

MATERIALS AND METHODS
Sampling - Out of 980 M. tuberculosis isolates from the state of Pará, Brazil, 97 were classified as East-African-Indian (EAI) by spoligotyping; (2) of these, 71 were recovered, classified as Lineage 1 (3) and used in the present study for DR evaluation.
Drug susceptibility testing - DST for isoniazid (INH), rifampicin (RIF), ethambutol (EMB) and streptomycin (SM) was performed using the Proportion Method in Löwenstein-Jensen (LJ) medium at the recommended critical concentrations, with the H37Rv strain as a control. The DST was performed according to the national manual for laboratory surveillance of TB and other mycobacteria (13) without modifications. This test detects the proportion of resistant bacilli present in a sample of M. tuberculosis at a given drug concentration, the "critical concentration", which is capable of inhibiting the growth of sensitive cells but not that of resistant cells.
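The logic of the proportion method described above can be illustrated with a minimal sketch. It assumes that colonies are counted on drug-containing and drug-free media inoculated at known fold dilutions, and it uses a critical proportion of 1% as the default cut-off; that threshold is an assumption based on the value commonly cited for first-line drugs on LJ medium and is not stated in this text. The function names and example counts are hypothetical.

```python
def resistant_proportion(drug_colonies, drug_fold_dilution,
                         control_colonies, control_fold_dilution):
    """Estimate the fraction of bacilli that grow at the critical concentration.

    Colony counts are scaled by their fold dilution so that both media are
    compared on the same (undiluted inoculum) basis.
    """
    if control_colonies <= 0:
        raise ValueError("the drug-free control must show growth")
    drug_cfu = drug_colonies * drug_fold_dilution
    control_cfu = control_colonies * control_fold_dilution
    return drug_cfu / control_cfu


def classify(proportion, critical_proportion=0.01):
    """Call the isolate resistant if the proportion reaches the cut-off.

    The 1% default is an assumption, not a value taken from the study.
    """
    return "resistant" if proportion >= critical_proportion else "susceptible"


# Hypothetical counts: 20 colonies on drug medium (10^-3 dilution),
# 100 colonies on the drug-free control (10^-5 dilution).
p = resistant_proportion(20, 10**3, 100, 10**5)
print(p, classify(p))
```

In this hypothetical reading, 0.2% of the bacilli grow in the presence of the drug, which is below the assumed cut-off, so the isolate would be reported as susceptible.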
Whole-genome sequencing - After DNA extraction by the phenol-chloroform protocol (5) and library preparation using the Nextera XT DNA Library Prep Kit (Illumina, San Diego, USA), the isolates were sequenced on the HiSeq 2500 platform (Illumina, San Diego, USA) with a coverage of 250x. The raw reads were deposited at NCBI under the accession numbers PRJNA494931 (5) and PRJNA630228.
Statistics - To compare the sensitivity, specificity and accuracy of DST as determined phenotypically or in silico, the Kappa Concordance Analysis was applied using the BioEstat 5.5 software. (15) This test is a measure of interobserver agreement and quantifies the degree of agreement beyond what would be expected by chance alone. Kappa has a maximum value of 1, which represents total agreement, while values close to 0 indicate that the agreement was no greater than that expected by chance. A negative Kappa value suggests that the agreement found was lower than that expected by chance; it therefore indicates disagreement, but its magnitude has no interpretation as a degree of disagreement. The p-value was considered significant when less than or equal to 5% (p ≤ 0.05).
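To make the Kappa statistic concrete, the following is a minimal sketch of Cohen's kappa for two sets of calls on the same isolates (for example phenotypic DST versus one in silico tool). It is not the BioEstat implementation used in the study; the function name, the "R"/"S" coding and the example data are illustrative assumptions, and the p-value calculation is not covered.

```python
def cohen_kappa(reference, predicted):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reference)
    labels = set(reference) | set(predicted)
    # Observed agreement: fraction of isolates with identical calls.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement: product of the marginal frequencies, summed over labels.
    expected = sum((reference.count(label) / n) * (predicted.count(label) / n)
                   for label in labels)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1 - expected)


# Hypothetical calls for four isolates (R = resistant, S = susceptible).
dst = ["R", "R", "S", "S"]
tool = ["R", "S", "S", "S"]
print(cohen_kappa(dst, tool))

# A tool that calls every isolate susceptible agrees only at chance level.
print(cohen_kappa(["R", "S", "S", "S"], ["S", "S", "S", "S"]))
```

The second call shows why raw percentage agreement can mislead: a predictor that labels everything susceptible matches three of four reference results, yet its kappa is 0 because that level of agreement is exactly what chance alone would produce.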

RESULTS
Based on the DST, among the 71 isolates, 38 (53.5%) were drug susceptible, 17 (23.9%) were resistant to at least one of the drugs and 16 (22.5%) were TB-MDR. The summary of the results obtained with each of the WGS-based tools for TB resistance prediction is described in Table I.
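As a minimal sketch of how the three categories reported above could be assigned, the snippet below labels a per-isolate DST profile and recomputes the percentages from the reported counts. It assumes the usual definition of MDR (resistance to at least INH and RIF) and reads "resistant to at least one of the drugs" as excluding MDR isolates, since 38 + 17 + 16 = 71. The function name and the "R"/"S" coding are illustrative and not taken from the study.

```python
def dst_category(profile):
    """Assign a DST profile (drug -> 'R' or 'S') to one of three categories."""
    resistant = {drug for drug, result in profile.items() if result == "R"}
    if {"INH", "RIF"} <= resistant:
        return "MDR"  # assumed definition: resistant to at least INH and RIF
    if resistant:
        return "any resistance"
    return "susceptible"


# Hypothetical profiles for three isolates.
print(dst_category({"INH": "R", "RIF": "R", "EMB": "S", "SM": "S"}))
print(dst_category({"INH": "S", "RIF": "S", "EMB": "S", "SM": "R"}))
print(dst_category({"INH": "S", "RIF": "S", "EMB": "S", "SM": "S"}))

# Percentages recomputed from the counts reported in the text.
counts = {"susceptible": 38, "any resistance": 17, "MDR": 16}
total = sum(counts.values())
for category, count in counts.items():
    print(category, round(100 * count / total, 1))
```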
Among the tools, TB-profiler, KvarQ and Mykrobe identified the largest number of TB-MDR cases, while PhyResSe showed a low capacity to detect mutations in the rpoB gene (k = 0.08). Because PhyResSe and CASTB failed to generate output for some genomes owing to technical issues, results were obtained for only 69 and 60 of the submitted genomes, respectively. All samples submitted to RSniffer were determined as being drug susceptible.
When comparing the WGS-based tools to DST (Table II), we observed the highest level of agreement on all drugs in the case of TB-profiler (Table III). The program that showed the least agreement (16) with DST for all antibiotics was RSniffer. The conclusion for each tool is described in Supplementary data (Table II).
Considering the performance of the in silico DST for each antibiotic separately, Mykrobe showed the highest agreement for INH (k = 0.855 and p < 0.0001). Owing to the design of the algorithm or to technical runtime issues, CASTB returned a greater number of positive results for SM, which affected the agreement for this drug (k = 0.04 and p < 0.2025), as was also the case for EMB (k = 0.166 and p < 0.095); these results did not reach statistical significance. For pyrazinamide (PZA), in silico analysis demonstrated a low agreement rate, with Kappa coefficients ranging from 0.114 to 0.508 [Supplementary data (Table III)].

DISCUSSION
One of the objectives of the Genomic Era is to replace, by in silico analysis of WGS data, both the classic genotyping techniques used to detect and identify MTBC species for diagnostic purposes and the phenotypic methods used for DST. (17) During the last decades, genotyping tools have been developed that identify both lineage and drug resistance, and their validation is of major importance to evaluate their impact as a possible substitute for traditional methodologies.
This present study compared widely used WGS-based tools for predicting the antimicrobial resistance profile in 71 genomes from isolates of M. tuberculosis Lineage 1 from Pará, Brazil, using DST as the reference method. DST based on the proportion method is the antimicrobial susceptibility test (AST) mostly used for mycobacteria in Brazil, but it is laborious and time-consuming, requiring four to six weeks to obtain results. (18) On the other hand, DR prediction from WGS data can be performed from early positive MGIT cultures after an average of 14 days, or even directly on sputum samples, generating results within five days. (19) Lineage 1 (EAI) is not usually associated with DR and also has a low correlation with transmissibility and virulence, presenting a restricted geographical distribution. (4,20) In this study, however, the resistance profile by DST demonstrated that 17 (23.9%) isolates were resistant to at least one drug and 16 (22.5%) were MDR. This high frequency of MDR isolates might be related to the fact that the TB cases were from the reference hospital for MDR-TB, Hospital Universitário João de Barros Barreto (HU-JBB), and included TB contacts (without a previous history of TB) and patients suspected of treatment failure or TB relapse. Comparing the DR of Lineage 1 with that of other lineages from the same region, the most predominant was Lineage 4, among which the T and X lineages were associated with MDR-TB, while Lineage 1 was the highest in the 'any resistance' group. (2) Among all in silico tools tested presently, we encountered difficulties in predicting resistance to PZA, which can be partly due to alternative mechanisms of resistance to this drug (non-pncA related) (21) and to reports of low-frequency SNPs that may be associated with PZA resistance. (22) In the present study, however, we did not include PZA in the conventional DST, a major limitation of the study.
Sensitivity is the ability to correctly identify that a mutation is present in the sample, and specificity is the ability to correctly identify that the sample does not have the mutation. When analysing these results it is also important to report the level of accuracy, which makes it easier to assess whether the results obtained were compared correctly.
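The definitions of sensitivity, specificity and accuracy given above can be written out as a short sketch that takes phenotypic DST as the reference and an in silico prediction as the test. The function name, the "R"/"S" coding and the example data are illustrative assumptions and do not reproduce any result of this study.

```python
def diagnostic_metrics(reference, predicted, positive="R"):
    """Sensitivity, specificity and accuracy of predictions against a reference."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref == positive and pred == positive:
            tp += 1  # resistant by both methods
        elif ref == positive:
            fn += 1  # resistance missed by the prediction
        elif pred == positive:
            fp += 1  # resistance predicted but not confirmed by the reference
        else:
            tn += 1  # susceptible by both methods
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / len(reference),
    }


# Hypothetical results for eight isolates and one drug.
dst = ["R", "R", "R", "S", "S", "S", "S", "S"]
tool = ["R", "R", "S", "S", "S", "S", "S", "R"]
print(diagnostic_metrics(dst, tool))
```

Reporting all three values together is useful because accuracy alone can look high when most isolates are susceptible, even if few resistant isolates are detected.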
TB-profiler showed, in addition to good sensitivity and specificity, (23) a good statistical correlation with conventional DST, indicating that it is a good resistance predictor tool. A recent study on isolates from patients from the state of São Paulo in Brazil and from the province of Sofala in Mozambique compared DST performed in liquid medium with the MGIT-960 SIRE kit against TB-profiler prediction and the LPA tests GenoType MTBDRplus 2.0 and MTBDRsl 2.0. TB-profiler had the best performance among the genotypic DST methods when compared to the phenotypic test, with good concordance with phenotypic DST for RIF and SM (89.6%), INH (96.5%) and EMB (82.7%). WGS sensitivity and specificity for detecting resistance were respectively 87.5 and 92.3% for RIF; 95.6 and 100% for INH; 85.7 and 93.3% for SM; and 100 and 77.2% for EMB. (24) Moreover, our data are also in agreement with other studies (10,25) suggesting that the use of TB-profiler together with Mykrobe, MTBseq and KvarQ may increase the chances of fully elucidating the mutations of the genomes under analysis.
Regarding RIF prediction, PhyResSe detected only one mutation (rpoB_His445Arg) in the rpoB gene, in a strain that was MDR by DST (G04893); this tool therefore presented a low sensitivity and specificity for this drug. In general, we observed a lower performance of PhyResSe compared to other pipelines, much more pronounced than that described in other studies. (23) This might be a characteristic of the performance of this pipeline particularly in genomes of Lineage 1 and needs further investigation.
Even though CASTB has demonstrated a good performance in finding resistance-associated variants in some studies, (11,21) we observed a high number of false positives for resistance to SM, in addition to the tool not generating outputs for some samples, which led to inconclusive results.
Regarding RSniffer-based genome analysis, all isolates were considered susceptible to all drugs. This seems to be because, by default, this tool assumes Lineage 1 to be pan-susceptible, (12) limiting its applicability in this strain population.
Since WGS is mainly done on a DNA pool from a culture, discrepancies between the phenotypic and genotypic tests for the same sample are possible: the tests can be influenced by mixed infections or by mixtures of drug-susceptible and drug-resistant populations, a phenomenon known as heteroresistance. (26) The tools evaluated in this study are based on the technique of Direct Association (DA), which relies on the established correlation between the various resistance-conferring mutations and their presence or absence in the MTB isolate under study. These pre-documented correlations are utilised by the tools to ascertain the drug resistance profile of the sample. (27,28) Studies that validate genomic analysis protocols for the detection of MTBC species and their genetic characterisation, especially for resistance analysis, are important for the progress of translational research in TB, with the goal of replacing phenotypic tests by WGS.
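The Direct Association principle described above amounts to a lookup of observed variants in a catalogue of documented resistance mutations. The sketch below illustrates the idea under stated assumptions: the catalogue is a toy example containing rpoB_His445Arg (the variant mentioned in this study) and a few widely reported resistance mutations, and it does not represent the actual database, variant notation or decision rules of any of the evaluated tools. The function name and the input format are hypothetical.

```python
# Toy catalogue (assumption): variant -> drug to which it is taken to confer resistance.
TOY_CATALOGUE = {
    "rpoB_His445Arg": "RIF",
    "rpoB_Ser450Leu": "RIF",
    "katG_Ser315Thr": "INH",
    "embB_Met306Val": "EMB",
    "rpsL_Lys43Arg": "SM",
}


def predict_resistance(variants, catalogue=TOY_CATALOGUE,
                       drugs=("INH", "RIF", "EMB", "SM")):
    """Return drug -> 'R'/'S' based only on catalogued variants found in the isolate."""
    profile = {drug: "S" for drug in drugs}
    for variant in variants:
        drug = catalogue.get(variant)  # None if the variant is not catalogued
        if drug in profile:
            profile[drug] = "R"
    return profile


# Hypothetical isolate carrying one catalogued and one uncatalogued variant.
print(predict_resistance(["rpoB_His445Arg", "gyrA_Glu21Gln"]))
```

A consequence of this design is that any resistance mechanism missing from the catalogue is reported as susceptible, which is one way genotypic and phenotypic results for the same isolate can diverge.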
Recently, WGS performed directly on clinical specimens has been proposed for even more rapid TB surveillance, allowing researchers to do real-time genomic epidemiology and drug resistance surveillance in settings where culture and DST are not available. (29,30) However, this is still technically challenging and under active study.