Accuracy of mucocutaneous leishmaniasis diagnosis using polymerase chain reaction: systematic literature review and meta-analysis

The diagnosis of mucocutaneous leishmaniasis (MCL) is hampered by the absence of a gold standard. An accurate diagnosis is essential because of the high toxicity of the medications for the disease. This study aimed to assess the ability of polymerase chain reaction (PCR) to identify MCL and to compare these results with clinical research recently published by the authors. A systematic literature review based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses: the PRISMA Statement was performed using comprehensive search criteria and communication with the authors. A meta-analysis considering the estimates of the univariate and bivariate models was performed. Specificity near 100% was common among the papers. The primary reason for accuracy differences was sensitivity. The meta-analysis, which was only possible for PCR samples of lesion fragments, revealed a sensitivity of 71% [95% confidence interval (CI) = 0.59; 0.81] and a specificity of 93% (95% CI = 0.83; 0.98) in the bivariate model. The search for measures that could increase the sensitivity of PCR should be encouraged. The quality of the collected material and the optimisation of the amplification of genetic material should be prioritised.

American tegumentary leishmaniasis (ATL) is caused by parasites of the genus Leishmania. The clinical manifestations of ATL are predominantly classified as cutaneous (CL) or mucocutaneous leishmaniasis (MCL), depending on the absence of mucous lesions or on the involvement of mucous membranes, respectively (Gomes et al. 2014b).
The etiologic diagnosis of ATL in all its forms is a difficult task. This difficulty is increased in cases of MCL, in which sampling frequently requires invasive techniques because of the potential presence of deep nasopharyngeal lesions (Maretti-Mira et al. 2011, Weinkopff et al. 2013). The absence of a gold standard diagnostic test for MCL is a major obstacle. The existing tests cannot be independently used for the diagnosis of MCL because the accuracy of the tests is unsatisfactory. The classical diagnostic methods for MCL include immunological and parasitological tests. The immunological tests include the intradermal leishmanin test (the Montenegro skin test) and serological tests, which are known for their high sensitivity; however, these immunological tests exhibit low diagnostic specificity (Ferreira et al. 2006, Gomes et al. 2014a). The parasitological tests aim to directly identify the parasite and primarily consist of direct tests, culturing and histopathological examinations. In contrast to the immunological tests, the parasitological tests have good specificity; however, they exhibit poor sensitivity (Gomes et al. 2014a).
Studies have indicated that molecular biology techniques, particularly polymerase chain reaction (PCR), offer good sensitivity and specificity and, ultimately, increased accuracy compared with immunological and parasitological tests (Medeiros et al. 2002, Garcia et al. 2005, Silva et al. 2012). However, several methodologies are used, making it difficult to draw simple comparisons among the papers. We performed a systematic literature review of studies of the diagnostic accuracy of PCR in detecting MCL as part of our group's effort to develop non-invasive collection methods for the diagnosis of MCL. The primary objective of this study was to assess the status of various molecular biology techniques in recognising MCL and to compare these data with the recent clinical research published by our group (Gomes et al. 2014a).

MATERIALS AND METHODS
This systematic literature review was performed based on recommendations by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses: the PRISMA Statement, with adjustments for evaluating the diagnostic accuracy of the studies (Devillé et al. 2002, Leeflang et al. 2008, Moher et al. 2009, Leeflang 2014). This review is registered with PROSPERO, the International Prospective Register of Systematic Reviews (University of York, UK), with the identifier CRD42014007038.
Paper selection -In January 2013, the following search terms were generated: "((((((mucosal) OR mucocutaneous) OR mucous)) AND leishmaniasis) AND diagnosis) AND Polymerase Chain Reaction", using the advanced search system on the PubMed site (PubMed Advanced Search Builder) following the recommendations for searches with comprehensive criteria. Because MCL is a highly endemic disease in South America, predominantly in Brazil, we included the non-MeSH terms "mucous" and "mucosal", which are frequently translated from the Portuguese words leishmaniose mucosa.
Two researchers were recruited to search the related references in the following databases or virtual libraries: PubMed, EMBASE, Web of Knowledge, SciELO and LILACS, using the same terms and connectors through each specific advanced search tool (Table I). No specific date was defined for applying the search criteria by each examiner. The paper search began in January 2013 and was concluded in December 2013. The inclusion and exclusion criteria for the papers are detailed below.
Inclusion criteria -Papers that fulfilled all of the following criteria were included: addressed the use of PCR techniques for the diagnosis of MCL; were conducted in humans; included cases of mucosal leishmaniasis, with or without associated skin lesions; included more than seven cases of mucosal leishmaniasis; included any number of controls without a diagnosis of mucosal leishmaniasis, with no restriction on how the controls were defined (healthy, or with mucosal or skin lesions); and were written in English, French, Portuguese or Spanish.
Exclusion criteria -Papers were excluded if they had fewer than eight cases of mucosal leishmaniasis, lacked controls, used known cured cases of ATL as controls, included patients with CL and mucosal leishmaniasis without explicitly separating them, performed PCR on cultured patient samples, included only CL forms of ATL or were written in languages other than English, French, Portuguese or Spanish. No restrictions were imposed on the paper selection process regarding the year of publication and no additional limits were set. Additionally, no restrictions were imposed regarding the composite reference standard used as the diagnostic criterion for MCL. This characteristic was considered for the classification of selected papers and for the inclusion in the final meta-analysis model.
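The inclusion and exclusion criteria above can be read as a simple screening rule applied to each candidate paper. The following is a minimal sketch of such a rule, not the procedure actually used by the reviewers; the dictionary keys (for example `mucosal_cases` and `controls_are_cured_atl`) are hypothetical names introduced only for illustration.

```python
ALLOWED_LANGUAGES = {"English", "French", "Portuguese", "Spanish"}
MIN_MUCOSAL_CASES = 8  # "more than seven cases" in the inclusion criteria


def is_eligible(study):
    """Return (eligible, reasons) for a dict describing one candidate paper.

    The keys of `study` are hypothetical field names for this sketch.
    """
    reasons = []
    if not study["human_subjects"]:
        reasons.append("not a study in humans")
    if study["mucosal_cases"] < MIN_MUCOSAL_CASES:
        reasons.append("fewer than eight mucosal cases")
    if not study["has_controls"]:
        reasons.append("no controls")
    if study["controls_are_cured_atl"]:
        reasons.append("cured ATL cases used as controls")
    if not study["cl_and_mcl_separated"]:
        reasons.append("CL and mucosal cases not reported separately")
    if study["pcr_on_cultured_samples"]:
        reasons.append("PCR performed on cultured samples")
    if study["language"] not in ALLOWED_LANGUAGES:
        reasons.append("language not covered")
    return (not reasons, reasons)


# Hypothetical candidate paper, for illustration only.
candidate = {
    "human_subjects": True,
    "mucosal_cases": 6,
    "has_controls": True,
    "controls_are_cured_atl": False,
    "cl_and_mcl_separated": True,
    "pcr_on_cultured_samples": False,
    "language": "Spanish",
}
eligible, reasons = is_eligible(candidate)
```

Returning the list of reasons rather than a bare boolean keeps a record of why each paper was excluded, which is the information a selection flow diagram needs.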
The papers were accessed using the periodicals portal available through the Brazilian Federal Agency for Support and Evaluation of Graduate Education on the internet-connected network of the University of Brasília and through direct contact with the authors via electronic correspondence.
After the search and review by the researchers, any disagreements were resolved by a third evaluator, who was blind to the identity of the researchers who conducted the previous selection. Electronic communications were sent to all of the corresponding authors of the selected papers with an inquiry regarding whether they knew of any published or unpublished study that addressed the criteria of interest. Additionally, the references cited in the pre-selected papers were examined for additional relevant studies. Thesis databases at several internationally recognised universities were searched as well (Supplementary data I).
The researchers constructed a 2 x 2 table based on the information contained in the papers with the aim of calculating the sensitivity, specificity, predictive values and accuracy using OpenEpi® v.3.01 (Emory University, Rollins School of Public Health, USA). After this procedure, the corresponding authors of the selected papers were again contacted by email to retrieve the data of papers for which it was impossible to construct the 2 x 2 table and to confirm the precision of the accuracy data obtained after reading the papers.
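The measures derived from the 2 x 2 table follow directly from its four cells. The sketch below shows the standard definitions in Python; it is not the OpenEpi implementation, and the counts in the example are hypothetical and are not taken from any of the reviewed papers.

```python
def _ratio(numerator, denominator):
    # Return None instead of raising when a margin of the table is empty.
    return numerator / denominator if denominator else None


def accuracy_measures(tp, fp, fn, tn):
    """Diagnostic accuracy measures from a 2 x 2 table.

    tp: index test (PCR) positive, reference standard positive
    fp: index test positive, reference standard negative
    fn: index test negative, reference standard positive
    tn: index test negative, reference standard negative
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),  # positive predictive value
        "npv": _ratio(tn, tn + fn),  # negative predictive value
        "accuracy": _ratio(tp + tn, total),
    }


# Hypothetical counts, for illustration only.
measures = accuracy_measures(tp=15, fp=1, fn=5, tn=19)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on the proportion of cases among the patients studied, so values computed from a case-control design do not transfer directly to clinical practice.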
Analysis of the selected papers -Two reviewers conducted the qualitative analysis and paper critiques. The qualitative assessment was based on the completion of the tables, which were constructed during the abovementioned clinical study previously published by this group (Table II) (Gomes et al. 2014a). Any discrepancies in completing the qualitative table were resolved using an evaluation by a third examiner.
The critical analysis was completed using the "QUADAS tool: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews" (Whiting et al. 2006), which consists of 14 items rated as "yes", "no" or "unclear". Discrepancies between the data entered by the two evaluators were recorded under the "unclear" classification.
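The rule for handling disagreements between the two evaluators can be stated compactly. The following sketch, with hypothetical ratings for a single paper, keeps a rating when both evaluators agree and records "unclear" otherwise; the function name `reconcile` is an assumption made for this illustration.

```python
VALID_RATINGS = {"yes", "no", "unclear"}
N_ITEMS = 14  # QUADAS consists of 14 items


def reconcile(rater_a, rater_b):
    """Merge two evaluators' QUADAS ratings for one paper.

    Agreement keeps the shared rating; any disagreement is recorded
    as "unclear", following the rule described in the text.
    """
    if len(rater_a) != N_ITEMS or len(rater_b) != N_ITEMS:
        raise ValueError("each evaluator must rate all 14 items")
    merged = []
    for a, b in zip(rater_a, rater_b):
        if a not in VALID_RATINGS or b not in VALID_RATINGS:
            raise ValueError(f"unexpected rating: {a!r} / {b!r}")
        merged.append(a if a == b else "unclear")
    return merged


# Hypothetical ratings for one paper, for illustration only.
evaluator_1 = ["yes"] * 9 + ["unclear", "yes", "no", "no", "no"]
evaluator_2 = ["yes"] * 9 + ["no", "unclear", "no", "yes", "no"]
final = reconcile(evaluator_1, evaluator_2)
```

A consequence of this rule is that the count of "unclear" ratings reflects both unclear reporting and evaluator disagreement, which should be kept in mind when reading the summary figure.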
The decision regarding which results would be included in the meta-analysis was primarily resolved using criteria related to the test methodology. In this step, the study previously published by our group, which was the basis for performing this systematic review, was included (Gomes et al. 2014a).
Analysis of the heterogeneity of the papers -In the first step, the papers were separated according to the composite reference standard used to define the cases and controls. The papers were classified according to the use of clinical, immunological and/or parasitological criteria (Table III).
In the second step, the following characteristics were considered: the sample analysed, molecular weight and the PCR primer target. Human tissue samples were considered similar whether they were stored in filter paper, paraffinised or frozen (Marques et al. 2001). The DNA extraction methods that used commercial kits or phenol-extraction techniques were considered similar (Marques et al. 2001). Additionally, the studies that used primers that targeted the kDNA mini-circle of Leishmania spp and whose amplicons were less than 150 bp were considered similar. Statistical tests to evaluate the heterogeneity among the selected studies were not performed because of the limited number of papers with comparable methodologies.
Creating pooled forest plots -A sensitivity analysis considering the estimates of the univariate and bivariate models was performed (Deppen et al. 2014). The univariate model was estimated with Meta-DiSc® software (Ramón y Cajal University Hospital, Spain) and the bivariate model was fitted in R® software, v.3.1.2, using the mada package (Institute for Statistics and Mathematics of Wirtschaftsuniversität, Austria). This package estimates the parameters by the restricted maximum likelihood method. All of the tests were two-sided and applied a 5% type I error rate.
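To make the idea of a univariate pooled estimate concrete, the sketch below pools study-level sensitivities on the logit scale with a DerSimonian-Laird random-effects model. This is one common univariate approach and is shown only as an assumption-laden illustration: it is not the Meta-DiSc algorithm, it does not reproduce the bivariate model fitted with mada, and the study counts are hypothetical.

```python
import math


def pool_logit_proportion(events, totals, z=1.959964):
    """Random-effects (DerSimonian-Laird) pooling of proportions on the logit scale.

    events[i] / totals[i] is the per-study proportion, e.g. TP / (TP + FN)
    for sensitivity. A 0.5 continuity correction is added to every study
    so that proportions of 0 or 1 have a finite logit.
    Returns (pooled, lower, upper) back-transformed to the 0-1 scale.
    """
    logits, variances = [], []
    for x, n in zip(events, totals):
        a, b = x + 0.5, n - x + 0.5
        logits.append(math.log(a / b))
        variances.append(1.0 / a + 1.0 / b)

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))

    # DerSimonian-Laird between-study variance, truncated at zero.
    k = len(logits)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0

    # Random-effects weights, pooled logit and its standard error.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def expit(t):
        return 1.0 / (1.0 + math.exp(-t))

    return expit(pooled), expit(pooled - z * se), expit(pooled + z * se)


# Hypothetical true-positive counts and numbers of cases for four studies.
tp = [18, 12, 25, 9]
cases = [24, 20, 30, 15]
sens, lower, upper = pool_logit_proportion(tp, cases)
print(f"pooled sensitivity = {sens:.2f} (95% CI = {lower:.2f}; {upper:.2f})")
```

The same function can be applied to specificity by passing true-negative counts and numbers of controls. Unlike the bivariate model, this approach pools sensitivity and specificity separately and ignores any correlation between them.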

RESULTS
At the end of the selection process, 14 papers were included based on the above-mentioned criteria (Fig. 1). One of the main reasons for heterogeneity among the selected papers was the composite reference standard used to define cases of MCL and controls. The patients were selected based on the compilation of clinical, epidemiological and parasitological criteria in only three studies. A common error observed in six papers was the inclusion of a PCR test among the inclusion and allocation criteria. Of the remaining studies, three used parasitological criteria, one used clinical criteria and one did not report the diagnostic criteria used. One paper clearly stated the calculation of a sample size considering only the expected sensitivity of the tested diagnostic methods. Seven papers used a case-control design, whereas the other half of the papers used a cross-sectional/cohort design for the recruitment and allocation (Table III).
With respect to the type of samples analysed, three used blood samples, one used urine samples, one used lesion scrapings, one used lesion aspirates, one used nasal swabs and biopsies and seven used tissue samples collected by biopsy. The data on the positivity and characteristics of the tests are detailed in Tables II, IV. The studies that exhibited greater accuracy for MCL using PCR were performed using tissue samples (Deborggraeve et al. 2008).
With respect to the properties of the diagnostic tests, specificity tended to be the highest. Ten papers had a specificity of 100% and two reported a specificity of less than 90%. The level of sensitivity reported in most of the papers was between 60% and 90%. Two papers reported sensitivities below 50% and one paper reported a sensitivity of 100%.
Critical analysis of the selected papers -The analysis was performed using the QUADAS tool (Fig. 2) and indicated that the data most often not reported in the papers (a "no" response) were related to the last two questions, which inquired about inconclusive results and records for the loss of patients in the studies. The cases in which the response was "unclear" were most frequently observed for questions 10, 11 and 12, thus indicating that the selected papers did not clearly report whether the examiner was blind to the reference standard during the application of the evaluated test and vice versa. This result additionally indicated that most of the papers did not explicitly detail whether the data evaluated in the present study were consistent with the data used in clinical practice.
Papers published during or after 2010 were of higher quality according to the QUADAS tool than the papers published prior to this date (Fig. 2).
Creating pooled forest plots -After the methodological evaluation, only three papers were considered sufficiently similar and were selected: Disch et al. (2005), Bracho et al. (2007) and Thomaz-Soccol et al. (2011). It was possible to compare the results in these studies to clinical research previously performed by this group; however, this comparison was restricted to tissue samples (Gomes et al. 2014a).
The joint estimates of sensitivity and specificity in the univariate and bivariate models are presented in Table V and Fig. 3. The area under the receiver operating characteristic (ROC) curve, considering the bivariate model, was estimated at 0.94 (Fig. 4).

DISCUSSION
Some authors have reported that systematic reviews on diagnostic accuracy testing tend to include studies of differing quality compared with reviews that examine clinical trials (Devillé et al. 2002, Leeflang et al. 2008, Leeflang 2014). Additionally, papers regarding diagnostic accuracy have considerable methodological heterogeneity compared with that of intervention studies. For this reason, it is not always possible to complete a meta-analysis (Devillé et al. 2002). The full methodological procedures are frequently not described within the papers and, to understand these procedures, one must conduct a detailed reading of one or more papers cited by the authors.
Definition of cases and controls (the use of a composite reference standard) -The definition of cases and controls exerts a fundamental influence on the results of diagnostic accuracy studies. Variations in these criteria were one of the most frequent points of heterogeneity among the papers (Table III). Recent systematic reviews of the literature focused on the diagnosis of kala-azar have reported the same problems regarding the classification of cases and controls, which appears to be a major concern for all of the clinical manifestations of leishmaniasis, including MCL (de Ruiter et al. 2014).
Defining cases using only parasitological tests substantially increased the sensitivity of PCR because parasites were abundant in these cases. However, these case definitions did not correspond to clinical practice. Infection by L. (V.) braziliensis is known for its low density of parasites and severe local inflammation (Brelaz-de-Castro et al. 2012). This species is the primary cause of MCL and a considerable percentage of these patients do not exhibit positivity using parasitological techniques.
Additionally, defining controls is crucial. The possible inclusion of patients with previously cured MCL or those undergoing treatment might reduce the specificity of the tests. These patients might continue to harbour Leishmania cell debris or even the latent form of the parasite without manifesting the clinical disease (Mendonça et al. 2004). We evaluated the procedures used to define cases of MCL and controls (Tables II, III) to define which papers were comparable for inclusion in the meta-analysis model.

Properties of the diagnostic tests -The results revealed that the major factor responsible for the variations in diagnostic accuracy among the studies was the sensitivity value. It is difficult to assign a specific reason for increases or decreases in sensitivity. However, an unbiased analysis of the results supports the conclusion that specificity remains relatively stable even with extreme increases in sensitivity. Factors such as reductions in the molecular weight of the fragments amplified by primers and the use of commercial kits to extract DNA did not increase the proportion of false positive tests. This information was confirmed using an analysis of the generated summary ROC curve (Fig. 4).
Random effects bivariate logit-normal sensitivity and specificity estimates are recommended for meta-analyses of diagnostic studies (Simel & Bossuyt 2009). However, these estimates are considered more complex methodologies (Simel & Bossuyt 2009). Some authors suggest that the use of univariate methods is not likely to result in significant changes. An analysis of the pooled results in the univariate and bivariate models confirmed similarities between the methods (Table V).
These results confirmed the high specificity of the PCR techniques for the diagnosis of MCL, which could be considered an important advantage compared with the non-specific immunological tests. In addition, the sensitivity was greater than the values described for the parasitological tests (Gomes et al. 2014b). These results confirm previous reports that consider PCR to be the most accurate method for the diagnosis of MCL (Gomes et al. 2014a).
Limitations -One of the most important limitations of this study is that we excluded studies with fewer than eight cases of MCL. This decision aimed to define a minimal relevance of studies because of an initial intention to include non-controlled studies and to use only simpler univariate methods of aggregation. This procedure was based on existing recommendations for sample size calculation in diagnostic test studies. As an example, a sample size of eight patients would only be sufficient if a low lower limit of the confidence interval were accepted in association with an expected sensitivity greater than 95%, which is not realistic for the diagnosis of MCL (Flahault et al. 2005, Bailly et al. 2014). We hypothesise that it is unlikely that a considerable number of references with fewer than eight cases fulfil all of the other inclusion criteria. In addition, studies with very small sample sizes would have a lower influence on the final result of the meta-analysis. QUADAS was used instead of its more recent version, QUADAS-2. Although the methods measure identical characteristics, the newer version allows a written description of four key domains (patient selection, index test, reference standard and flow/timing) (Whiting et al. 2011). Because we constructed a simplified tool that covered these domains and the domains related to molecular biology procedures (Table II), we decided to use QUADAS for simplicity and to avoid the duplication of data collection. The absence of a third reviewer for the analysis of the QUADAS classification (Fig. 2) could have reduced the precision of this procedure. We considered that any mistake occurring during the classification would most likely result from an unclear description of the methodologies. For this reason, the discrepancies were classified as "unclear".
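As an illustration of the sample size argument made above for the eight-case threshold, consider the most favourable outcome, in which the test detects every case. For that outcome the exact (Clopper-Pearson) one-sided lower confidence limit for sensitivity has a closed form, alpha^(1/n). The sketch below uses this simplified best-case calculation; it is not claimed to be the procedure of the cited sample size recommendations, and the function names are hypothetical.

```python
def lower_limit_all_detected(n_cases, alpha=0.05):
    """Exact one-sided lower confidence limit for sensitivity when the
    test detects every one of n_cases (x = n).

    At the limit p, the probability of n detections out of n equals
    alpha, so p ** n = alpha and p = alpha ** (1 / n).
    """
    if n_cases < 1:
        raise ValueError("n_cases must be at least 1")
    return alpha ** (1.0 / n_cases)


def cases_needed(target_lower_limit, alpha=0.05):
    """Smallest n for which a perfect result (n of n detected) gives a
    one-sided lower limit at or above target_lower_limit."""
    if not 0.0 < target_lower_limit < 1.0:
        raise ValueError("target_lower_limit must be strictly between 0 and 1")
    n = 1
    while lower_limit_all_detected(n, alpha) < target_lower_limit:
        n += 1
    return n


limit_for_eight = lower_limit_all_detected(8)
print(f"lower limit with 8 of 8 detected: {limit_for_eight:.3f}")
print(f"cases needed for a lower limit of 0.90: {cases_needed(0.90)}")
```

Even with all eight cases detected, the one-sided 95% lower limit is only about 0.69, which shows why eight cases can support conclusions only when a low lower limit is acceptable.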
The heterogeneous application of the composite reference standard in the selected papers makes it difficult to separate the patients with CL from those with MCL. The simultaneous inclusion of these two forms of ATL was identified in nine studies, which may have increased the risk of bias during the extraction of the data from the studies. After constructing the 2 x 2 table and calculating the sensitivity, specificity, predictive values and accuracy, the authors of previously selected papers were contacted by email to retrieve the incomplete data and confirm the accuracy of the data. This approach aimed to reduce possible discrepancies between what was actually done in each study and the possible interpretations of the scientific paper.
Recommendations -This systematic literature review analyses the available data on the PCR techniques used in the diagnosis of MCL. The significant methodological heterogeneity makes comparisons between studies more difficult.
Based on these results, it is necessary to generate a consensus and protocols that recommend optimised practices for PCR in MCL (da Graça et al. 2012). It is possible to infer that techniques aimed at increasing the sensitivity of the tests should be pursued, particularly because the specificity is generally satisfactory. The collection of tissue samples directly from the lesion and the use of high-sensitivity extraction methods should be prioritised when preparing for PCR processing. The targeted DNA sequence should also be considered because the primers that amplify the kDNA sequences of Leishmania exhibited better sensitivity than primers for other sequences (van den Bogaart et al. 2013). Finally, related techniques, such as real-time quantitative PCR, might be used to improve sensitivity (van den Bogaart et al. 2013).