Methodological quality of network meta-analysis in dentistry: a meta-research

Abstract This meta-research aimed to provide an overview of the methodological quality and risk of bias of network meta-analyses (NMA) in dentistry. Searches for NMA of randomized clinical trials with clinical outcomes in dentistry were performed in databases up to January 2022. Two reviewers independently screened titles/abstracts, selected full texts, and extracted the data. Adherence to the PRISMA-NMA reporting guideline, methodological quality (AMSTAR-2 tool), and risk of bias (ROBIS tool) were assessed in the studies. The correlation between PRISMA-NMA adherence and the AMSTAR-2 and ROBIS results was also investigated. Sixty-two NMA studies were included and presented varied methodological quality. According to AMSTAR-2, half of the NMA presented moderate quality (n = 32; 51.6%). Adherence to PRISMA-NMA also varied. Only 36 studies (58.1%) prospectively registered the protocol. Other poorly reported items were data related to the NMA geometry, the assessment of consistency of results, and the evaluation of risk of bias across studies. ROBIS assessment showed a high risk of bias mainly for domains 1 (study eligibility criteria) and 2 (identification and selection of studies). Correlation coefficients between PRISMA-NMA adherence and the AMSTAR-2 and ROBIS results showed moderate correlation (rho < 0.6). Overall, NMA studies in dentistry were of moderate quality and at high risk of bias in several domains, especially study selection. Future reviews should be better planned and conducted and should comply more closely with reporting and quality assessment tools.


Introduction
Systematic reviews of health interventions aim to identify, evaluate, and synthesize high-quality data to answer a specific clinical question. 1 They are considered the most robust evidence to assess the benefits and harms of health interventions and to develop clinical practice guidelines. 2 Randomized controlled trials (RCTs) are routinely conducted to provide high-quality, evidence-based data to guide clinical decisions. However, single RCTs are rarely powerful and robust enough to provide conclusive answers to clinical questions, so meta-analysis methods have been used to combine data from similar trials to achieve enough power.
Standard direct pairwise (head-to-head) comparisons of RCTs are performed in 'traditional' meta-analyses. They typically compare two treatment options directly, either two active treatments or an active treatment versus a placebo. The simplicity of this approach contrasts with the complex nature of treatments in healthcare. 3 Indeed, for many clinical conditions, there are more than two potential interventions. Also, conducting multi-arm RCTs with more than three comparisons is often considered too complex and expensive. Therefore, not all potential interventions are directly compared, which restricts the ability to compare them in practice for a particular condition. In addition, other limitations may occur, such as when few RCTs are available and they test different interventions, increasing the chance of inconsistency in the results. Also, meta-analyses that include only a few RCTs may not have enough statistical power to detect a true difference between treatments, leading to inconclusive results and hindering the decision-making process. 4 For these reasons, approaches that allow comprehensive comparisons across multiple treatment options and rank these interventions are encouraged.
Nevertheless, the scientific community has seen considerable advances in standards, methods, and systems for planning, conducting, and reporting systematic reviews and trustworthy clinical practice guidelines. 5,6 One of these new statistical methodologies is network meta-analysis (NMA), which allows comparing multiple treatments simultaneously, 7,8 overcoming some limitations of the traditional pairwise meta-analysis. 9 The NMA takes into account a larger body of evidence by analyzing the results of primary studies through direct head-to-head comparisons and indirect comparisons made through one or more common comparators to estimate the relative effectiveness of all interventions and their ordering. 10 From this perspective, the NMA has proven to be an interesting tool to assess the efficacy of different interventions for a health condition that have not been directly compared in primary studies.
However, there are challenges and concerns associated with conducting an NMA, such as complex statistics, the assumption that all interventions included in the "network" are equally applicable to all populations and contexts of the studies included, and the fact that ranking may be misleading as it does not highlight the absolute effects of the interventions. The reliability of the rankings requires consideration of the geometry and strength of the network. 11 Despite some concerns, the NMA is undeniably attractive because it can answer the primary concern of researchers and clinicians, "what is the best available intervention?", since it allows estimating a hierarchy of interventions. 12 Not surprisingly, NMA publications have shown a marked increase in recent years 13 and have become increasingly popular in dentistry. But what is the methodological quality of these studies? With their potential and limitations in mind, this investigation should shed light on the concept of NMA and its potential use to promote evidence-based dentistry. This meta-research study aimed to assess and provide an overview of the methodological quality and risk of bias of network meta-analyses in dentistry.

Methodology
This report followed the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The study protocol was registered on the OSF Register Website (doi 10.17605/OSF.IO/4VX3Y), and post hoc changes were made: publication was no longer limited to recent years, and additional statistical analyses were performed on the correlation between PRISMA-NMA adherence and AMSTAR-2 and ROBIS results.
The research question was: "What is the methodological quality and risk of bias of network meta-analyses in clinical dentistry?" This study evaluated the adherence of NMA to reporting guidelines by applying the PRISMA-NMA checklist, 14 assessed the methodological quality of the NMA reviews with the A MeaSurement Tool to Assess Systematic Reviews (AMSTAR-2) tool, 15 and assessed the risk of bias of the studies with the Risk Of Bias In Systematic reviews (ROBIS) tool. 16 As a secondary aim, the study investigated the correlation between the results of these different tools (AMSTAR-2 and ROBIS) and adherence to the PRISMA-NMA guideline.

Search strategy
A systematic electronic search was performed on January 28th, 2022, with no restriction on language, date, or publication status in the following databases: PubMed/MEDLINE, EMBASE, Science Direct, Scopus, and Web of Science. The search strategy considered a combination of medical subject headings (MeSH) and free-text terms. The search strategies for all databases are shown in Table 1.
Additionally, the grey literature was searched on OpenGrey (opengrey.eu), and a hand search of the reference lists of potentially included NMA systematic reviews on clinical outcomes in dentistry was performed to identify any additional studies.

Eligibility criteria
The titles/abstracts of potentially eligible studies were initially evaluated. Systematic reviews with network meta-analysis (NMAs) that compared three or more interventions in randomized clinical trials (RCTs), with clinical objectives and measurable outcomes in dental sciences, were considered eligible studies.
Manuscripts that met the inclusion criteria were fully assessed. NMA studies with only one author, studies with subjective or non-clinical outcomes (e.g., patient-reported pain, measurements from biopsies, etc.), NMA studies that included both RCT and nonrandomized or quasi-randomized trials, NMA studies published in predatory journals (based on the list published at the website https://predatoryjournals.com/journals/), and/or NMA studies that did not provide a complete description of the NMA review process were excluded.

Study selection procedure
Duplicates were independently identified by two reviewers (AFM and PDMA) and removed from the records using the Rayyan web application. 17 Using the same software, these two reviewers independently, in duplicate and blinded, rated each citation as "included" or "excluded" according to the eligibility criteria. Citations could also be classified as "maybe" when data were insufficient to make a decision and a full-text analysis was required.
When the abstract provided unclear information, the study was selected for full-text assessment to avoid excluding potentially eligible articles. Then, the two reviewers independently assessed, in duplicate, the full-text articles that met the inclusion criteria. Discrepancies in title/abstract screening or in full-text assessment were resolved through discussion and consensus between the two reviewers with the help of a third reviewer (TKT). When the same NMA was identified in more than one study, only the most recent study was included.

Table 1. Search strategies for all databases.
Database | Search strategy | Number of citations
PubMed/MEDLINE | ((Teeth OR Tooth OR Dental) OR Dentistry OR Root OR Periodont*) AND ("Network meta-analysis" OR "Network Meta-Analyses" OR "Multiple Treatment Comparison Meta-Analysis" OR "Multiple Treatment Comparison Meta Analysis" OR "Mixed Treatment Meta-Analysis" OR "Meta-Analyses, Mixed Treatment" OR "Meta-Analysis, Mixed Treatment" OR "Mixed Treatment Meta Analysis" OR "Mixed Treatment Meta-Analyses" OR "Bayesian meta-analysis" OR "Bayesian meta-analyses") | 464
EMBASE | ('teeth'/exp OR teeth OR 'tooth'/exp OR tooth OR 'dental'/exp OR dental OR 'dentistry'/exp OR dentistry OR 'root'/exp OR root OR periodont*) AND ('network meta-analysis'/exp OR 'network meta-analysis' OR 'network meta-analyses'/exp OR 'network meta-analyses' OR 'multiple treatment comparison meta-analysis' OR 'multiple treatment comparison meta analysis' OR 'mixed treatment meta-analysis' OR 'meta-analyses, mixed treatment' OR 'meta-analysis, mixed treatment' OR 'mixed treatment meta analysis' OR 'mixed treatment meta-analyses' OR 'bayesian meta-analysis' OR 'bayesian meta-analyses') | 257
Science Direct | (teeth OR tooth OR dental OR dentistry) AND ("Network meta-analysis" OR "Multiple Treatment Comparison Meta-Analysis" OR "Mixed Treatment Meta-Analysis" OR "Bayesian meta-analysis") | 373
Scopus | (teeth OR tooth OR dental OR dentistry) AND ("Network meta-analysis" OR "Multiple Treatment Comparison Meta-Analysis" OR "Mixed Treatment Meta-Analysis" OR "Bayesian meta-analysis") | 112
Web of Science | ((teeth OR tooth OR dental) OR (dentistry) OR (root) OR (periodont)) AND ("Network meta-analysis" OR "Multiple Treatment Comparison Meta-Analysis" OR "Mixed Treatment Meta-Analysis" OR "Bayesian meta-analysis") | 96

Training
The two reviewers (AFM and PDMA) were trained in applying the tools (PRISMA, AMSTAR-2, and ROBIS) by studying and discussing the tools' explanatory documents and by applying them in practice to three of the included NMAs.

Assessment of reporting adherence
The PRISMA Extension Statement for reporting of systematic reviews incorporating NMA of health care interventions (PRISMA-NMA) 14 was applied to assess the reporting of general components and key methodological components of the included studies. It comprises a flow diagram and a 32-item checklist: 27 general items and five new NMA-specific items (S1-S5) that authors should consider when reporting an NMA: geometry of the network (S1), assessment of inconsistency (S2), presentation of network structure (S3), summary of network geometry (S4), and exploration for inconsistency (S5). Using this tool, it was assessed whether key methodological and general components were reported (yes) or not (no).

Methodological quality assessment
The AMSTAR-2 tool checklist 15 was used to assess the methodological quality of the included systematic reviews with NMA. For the 16 domains, the questions are worded so that a "Yes" answer denotes a positive result. If no information was available, the item was rated as "No", denoting a negative result. Additionally, a "partial Yes" answer was used to identify partial adherence to the standard.
Rating of overall confidence in the results of the review was: a) High: no weakness or one noncritical weakness; b) Moderate: more than one noncritical weakness; c) Low: one critical flaw with or without non-critical weaknesses; and d) Critically low: more than one critical flaw with or without non-critical weaknesses.
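The rating rules above can be expressed as a small decision function. The following is a minimal sketch, not the tool's official implementation: the function name and its two inputs (counts of critical flaws and of noncritical weaknesses, as judged by the reviewers) are assumptions made for illustration.

```python
def amstar2_overall_rating(critical_flaws: int, noncritical_weaknesses: int) -> str:
    """Map counts of flaws to the overall confidence rating described in the text.

    Both counts are assumed to have been judged by the reviewers beforehand;
    this function only applies the four rating rules.
    """
    if critical_flaws < 0 or noncritical_weaknesses < 0:
        raise ValueError("counts must be non-negative")
    if critical_flaws > 1:
        # More than one critical flaw, with or without noncritical weaknesses
        return "critically low"
    if critical_flaws == 1:
        # One critical flaw, with or without noncritical weaknesses
        return "low"
    if noncritical_weaknesses > 1:
        # No critical flaw, more than one noncritical weakness
        return "moderate"
    # No critical flaw and at most one noncritical weakness
    return "high"
```

Because critical flaws are checked first, a review with many "Yes" answers can still be rated low if one of its failed items is a critical one.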

Risk of bias assessment
The risk of bias of the included NMAs was assessed using the ROBIS tool. 16 The tool is completed in 3 phases: a) assess relevance (optional; not applied in this study), b) identify concerns with the review process, and c) rate the risk of bias in the review. The signaling questions for phases 2 and 3 were answered as "Yes", "Probably Yes", "Probably No", "No", and "No Information". The subsequent level of concern about each domain's bias was rated as "low", "high", or "unclear". If the answers to all signaling questions for a domain were "Yes" or "Probably Yes", the level of concern was rated as low. A potential concern was considered present if any signaling question was answered "No" or "Probably No".
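The domain-level rule described above can be sketched as follows. This is an illustrative simplification under stated assumptions: the function name is hypothetical, any "No" or "Probably No" answer is mapped directly to "high" concern, and a domain with "No Information" answers and no negative answers is mapped to "unclear". In practice the final level of concern remains a reviewer judgement.

```python
POSITIVE = {"Yes", "Probably Yes"}
NEGATIVE = {"No", "Probably No"}
ALLOWED = POSITIVE | NEGATIVE | {"No Information"}


def robis_domain_concern(answers: list[str]) -> str:
    """Return 'low', 'high', or 'unclear' for one ROBIS domain.

    Assumption for illustration: a negative answer always yields 'high';
    the text only says that a potential concern is considered in that case.
    """
    if not answers:
        raise ValueError("at least one signaling question is required")
    unknown = [a for a in answers if a not in ALLOWED]
    if unknown:
        raise ValueError(f"unexpected answers: {unknown}")
    if any(a in NEGATIVE for a in answers):
        return "high"
    if all(a in POSITIVE for a in answers):
        return "low"
    # Only positive and 'No Information' answers remain
    return "unclear"
```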

Synthesis of results
Initially, the inter-reviewer agreement for study selection was calculated (Kappa coefficient). Afterwards, a descriptive synthesis of the data was performed considering the adherence to the PRISMA-NMA checklist, the assessment of the methodological quality by the AMSTAR-2 tool, and the risk of bias by the ROBIS tool. Finally, the correlation coefficient between those different reporting guidelines and tools was investigated. The analyses were performed with the Jamovi program (The jamovi project, 2021, version 2.3, Computer software, https://www.jamovi.org). For these analyses, the tools were adapted as follows:
a. PRISMA-NMA: each item on the checklist received a score of 1 if the item was reported and a score of 0 if the item was not reported in the study; thus, the maximum total score was 32.
b. AMSTAR-2: each tool item received a score of 1 if the item provided a yes result, a score of 0.5 if the item provided a partial yes result, and a score of 0 if the item provided a no result; thus, the maximum total score was 16.
c. ROBIS: each item of the tool received a score of 1 if the domain was rated as low, a score of 0.5 if the domain was rated as unclear, and a score of 0 if the domain was rated as high; thus, the maximum total score was 5.
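The scoring adaptation described above amounts to summing fixed points per item. A minimal sketch follows; the dictionary and function names are hypothetical, and the ratings in the usage note are invented for illustration, not data from the included studies.

```python
# Points per rating, taken from the adaptation described in the text.
PRISMA_NMA_POINTS = {"yes": 1.0, "no": 0.0}
AMSTAR2_POINTS = {"yes": 1.0, "partial yes": 0.5, "no": 0.0}
ROBIS_POINTS = {"low": 1.0, "unclear": 0.5, "high": 0.0}


def total_score(ratings: list[str], points: dict[str, float], expected_items: int) -> float:
    """Sum the points of all item ratings for one study.

    expected_items is 32 for PRISMA-NMA, 16 for AMSTAR-2, and 5 for ROBIS,
    which are also the maximum total scores.
    """
    if len(ratings) != expected_items:
        raise ValueError(f"expected {expected_items} ratings, got {len(ratings)}")
    return sum(points[rating.lower()] for rating in ratings)


# Hypothetical AMSTAR-2 ratings for a single study (illustration only).
example_amstar2 = ["yes"] * 10 + ["partial yes"] * 3 + ["no"] * 3
example_score = total_score(example_amstar2, AMSTAR2_POINTS, 16)
```

Note that such a summed score is only used for the correlation analyses; it does not replace the categorical rating of each tool.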

Studies selection
In total, 1,971 publications were retrieved. After eliminating duplicates, 1,409 studies remained, of which 1,260 were excluded after screening the title and abstract. Thus, 149 articles were reviewed for eligibility by assessing the full text, and 87 were excluded (Table 2). Finally, 62 NMA studies were included. Figure 1 summarizes the study selection process. The kappa coefficient for the inter-reviewer agreement was 0.87 for title/abstracts and 0.83 for full text.
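For readers unfamiliar with the agreement statistic reported above, the sketch below computes a kappa coefficient for two raters. It assumes Cohen's kappa (the text only says "Kappa coefficient"), and the screening decisions shown are hypothetical, not the reviewers' actual data.

```python
from collections import Counter


def cohen_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa: agreement between two raters corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening decisions for ten citations (illustration only).
reviewer_1 = ["include"] * 4 + ["exclude"] * 6
reviewer_2 = ["include"] * 3 + ["exclude"] * 7
kappa = cohen_kappa(reviewer_1, reviewer_2)
```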
The five items added to the PRISMA-NMA extension did not have full adherence.

Correlation between PRISMA-NMA adherence and results from AMSTAR-2 and ROBIS tools
The mean scores of the PRISMA-NMA checklist, AMSTAR-2 tool, and ROBIS tool for the 62 NMA studies were, respectively, 26.5 ± 3.25 (range 17-32), 12.0 ± 2.02 (range 8.5-16), and 3.42 ± 1.14 (range 1-5). The Shapiro-Wilk test showed that the scores of the three instruments were not normally distributed. Considering the mean scores, two additional analyses were conducted. First, the studies were stratified by year of publication: up to 2019 (n = 27) and from 2020 onward (n = 35). The stratification was designed to have a similar number of publications in the two strata and to have a reasonable minimum time span (in this case, 4 years) from the launch of the PRISMA-NMA guideline in 2015. The Mann-Whitney U test was used. The mean scores were 25.8 ± 3.02 vs. 27.0 ± 3.37 for PRISMA-NMA (p = 0.100), 11.1 ± 1.76 vs. 12.6 ± 2.00 for AMSTAR-2 (p = 0.008), and 3.26 ± 1.02 vs. 3.54 ± 1.22 for ROBIS (p = 0.282). Second, the adherence to the PRISMA-NMA guideline was evaluated among the studies that: 0 - did not report a reporting guideline (n = 9), 1 - followed the PRISMA checklist (n = 25), and 2 - followed the PRISMA-NMA guideline (n = 28). The Kruskal-Wallis test was used. The mean scores were 25.9 ± 2.85, 26.1 ± 3.60, and 27.0 ± 3.07, respectively (p = 0.435). Finally, the following Spearman's correlation coefficients were found between the guideline and the two tools: PRISMA-NMA vs. AMSTAR-2 rho: 0.586, p < 0.001; PRISMA-NMA vs. ROBIS rho: 0.547, p < 0.001; AMSTAR-2 vs. ROBIS rho: 0.671, p < 0.001, showing a moderate correlation between PRISMA-NMA adherence and methodological quality and risk of bias, as well as a moderate to strong correlation between methodological quality and risk of bias.

AMSTAR-2 items: 1. Did the research questions and inclusion criteria for the review include the components of PICO?; 2. Did the report of the review contain an explicit statement that the review methods were established prior to the conduct of the review and did the report justify any significant deviations from the protocol?; 3. Did the review authors explain their selection of the study designs for inclusion in the review?; 4. Did the review authors use a comprehensive literature search strategy?; 5. Did the review authors perform study selection in duplicate?; 6. Did the review authors perform data extraction in duplicate?; 7. Did the review authors provide a list of excluded studies and justify the exclusions?; 8. Did the review authors describe the included studies in adequate detail?; 9. Did the review authors use a satisfactory technique for assessing the risk of bias (RoB) in individual studies that were included in the review?; 10. Did the review authors report on the sources of funding for the studies included in the review?; 11. If meta-analysis was performed, did the review authors use appropriate methods for statistical combination of results?; 12. If meta-analysis was performed, did the review authors assess the potential impact of RoB in individual studies on the results of the meta-analysis or other evidence synthesis?; 13. Did the review authors account for RoB in individual studies when interpreting/discussing the results of the review?; 14. Did the review authors provide a satisfactory explanation for, and discussion of, any heterogeneity observed in the results of the review?; 15. If they performed quantitative synthesis, did the review authors carry out an adequate investigation of publication bias (small study bias) and discuss its likely impact on the results of the review?; 16. Did the review authors report any potential sources of conflict of interest, including any funding they received for conducting the review? N = No, Y = Yes, PY = Partially Yes.
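The Spearman coefficients reported in this section were computed in Jamovi. As a hedged illustration of what the statistic measures, the sketch below computes Spearman's rho in plain Python as the Pearson correlation of average ranks, which handles tied scores. The function names are hypothetical and the scores in the usage note are invented, not the study data.

```python
from math import sqrt


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-study total scores (illustration only).
prisma_scores = [26.0, 30.0, 22.0, 28.0, 25.0, 31.0]
amstar_scores = [11.5, 14.0, 10.0, 12.0, 12.5, 15.0]
rho = spearman_rho(prisma_scores, amstar_scores)
```

A rank-based coefficient is a natural choice here because the Shapiro-Wilk test indicated that the scores were not normally distributed.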

Discussion
This is the first study to evaluate the methodological quality and the risk of bias of NMA in dentistry. The methodological quality assessed by AMSTAR-2 varied among the 62 included studies. Although about half were of moderate quality (n = 32, 51.6%), a rating that assumes the systematic review provides an accurate summary of the results of the available studies included in the review, a considerable portion of the studies still had methodological flaws according to the critical appraisal tool used (low quality n = 17, 27.4%; critically low quality n = 3, 4.8%). A low rating indicates that the review has a critical flaw and may not provide an accurate and comprehensive summary of the available studies that address the question of interest. 15 Moreover, the risk of bias assessment of NMA with the ROBIS tool showed a high risk of bias or concerns mainly for domains 1 and 2, which are related to methodological aspects. This finding is in line with the methodological quality observed with the AMSTAR-2 instrument. Indeed, in this study, a correlation coefficient of 0.671 was found between AMSTAR-2 and ROBIS, a value that can be considered a moderate to strong correlation. This finding is corroborated by a previous study that also found a strong correlation between AMSTAR-2 scores and overall domain rating in ROBIS. 80 The present investigation has shown how the NMA approach can be used to improve clinical data interpretation in dentistry. The NMA is a promising approach to provide a comprehensive and up-to-date presentation of the evidence on all available management options for a health condition. Currently, more and more NMA studies are being conducted and, as with any new instrument, there is a learning curve in using and interpreting the data. Adherence to reporting guidelines is one of the aspects to be learned.
To better ensure transparency and integrity in research, there is a clear recommendation that all protocols for systematic reviews should be submitted to registry platforms before the reviews begin, with essential information about their design and conduct. 81 Nonetheless, protocols were registered in only 58.1% of the included studies. Indeed, the lack of proper registration of NMAs was one of the main reasons for the high risk of bias observed in domain 1 of the ROBIS tool. A moderate correlation (rho < 0.6) was found between the AMSTAR-2 and ROBIS tools and PRISMA-NMA adherence. It has been shown previously that registration of the systematic review protocol or working from a previously established protocol improves the final study report. However, the authors did not observe an association between protocol registration and reduction in outcome reporting bias. 82 The importance of a comprehensive and sensitive literature search is also well established in systematic reviews, and the adequacy of the literature search is a critical domain in AMSTAR-2. 15,83 A considerable number of the NMA studies included in this review limited the language for study selection and did not perform the gray literature assessment and the hand search for potentially relevant studies, resulting in a selection bias of eligible studies.
Additionally, the studies reviewed here had different methodological and publication characteristics. Besides, 45.2% of the studies did not show aspects aligned with current research integrity practices, such as the use of the most appropriate reporting guideline for systematic reviews with network meta-analyses, the PRISMA extension for NMA. 14 This finding is in accordance with a previous study that examined whether published NMA papers follow reporting recommendations and found that key reporting components of the systematic review process were missing in most of the NMA evaluated. 84 Here, no differences in the mean scores of PRISMA-NMA adherence were found between studies that reported having followed or not followed this guideline. It can be argued that this guideline is relatively new, having been launched in 2015. Nonetheless, only 6 of the included studies were published in 2015 or earlier, 24,33,36,46,59,74 which does not justify the non-adherence to this extension by the other 56 included studies. This finding also raises an alert: reviews should not only report that a guideline was applied, but also pay attention to better adherence to each item.
On the other hand, the similarity among the mean scores for PRISMA-NMA adherence in those 3 categories is most likely due to the non-adherence in reporting the 5 new items of the PRISMA-NMA extension. 14 As some items of the PRISMA-NMA could not be assessed in some studies included here, it is believed that the results might have been influenced. Overall, the geometry of the network, the assessment of inconsistency in the method section, the presentation of the network structure, a summary of network geometry, and the exploration for inconsistency in the result section were not clearly examined. The network geometry of an NMA study is crucial as it provides clarity in the presentation of the data. This makes it easier for readers to understand and interpret the quality and integrity of the review. For example, direct and indirect comparisons for a given comparison are shown in a graph when closed connections within the network diagram result in a new geometric figure, such as a triangle. On the other hand, open connections may represent less reliable networks since the results of the comparisons only come from indirect comparisons. 76 In this sense, aspects related to the reporting of statistical analysis are in great need of improvement, especially the description of the methods used to explore the geometry of the treatment network and the potential biases related to it, as well as the methods used to assess inconsistency of direct and indirect evidence. A better description of the NMA statistical methods could be achieved by including a statistician on the author team. Another PRISMA-NMA item with low adherence was the assessment of the risk of bias across studies.
Interestingly, the correlation coefficients between adherence to the PRISMA-NMA checklist and results from AMSTAR-2 and ROBIS tools indicated just a moderate correlation. This can occur because the AMSTAR-2 and ROBIS tools are qualitative in nature and have fewer items/domains, where the absence of a characteristic in the NMA studies can already determine a negative score in that item/domain. This is particularly evident when considering that the cutoff points for the classification of quality/risk of bias in these tools are not linear, but rather are determined by the presence of items/domains considered essential. For example, in the adaptation performed here, a study with a score of 13.5 on the AMSTAR-2 tool was classified as being of low quality, whereas a study with a score of 8.5 was classified as being of moderate quality (data not shown). From these observations, it can once again be assumed that better adherence to the PRISMA-NMA should be sought when writing NMA manuscripts to improve the reporting quality of these studies. Also, knowledge of the available tools to assess methodological quality and risk of bias, as well as their characteristics, may help authors in this task.
The use of GRADE combined with information synthesis by the NMA facilitates structured evidence summaries. Thus, it is a tool that helps physicians and patients make decisions about health interventions and provides them with the certainty of the evidence. In this investigation, only one third of the NMA studies rated the certainty of generated evidence using the GRADE approach. This finding is understandable given the complex application of GRADE for NMA. In this case, the GRADE must consider the certainty of the direct and indirect evidence and their contribution to the network estimate, including local incoherence and imprecision. 85,86 This study has limitations, including a lack of consideration of the scope of the systematic reviews. In this study, only NMA reporting clinical outcomes that are of great importance to researchers and clinicians were included. Furthermore, clinical outcomes impact evidence-based dentistry, and systematic reviews of studies on healthcare interventions are used extensively for clinical and health policy decisions. Therefore, it is important for users to be able to distinguish between high-quality and low-quality reviews, since the increase in publications of systematic reviews was accompanied by an increase in poorly conducted, poorly reported, and/or unnecessary studies. 87 In this sense, the results of this investigation can contribute to a critical appraisal of available NMAs and help readers better understand the evidence.
This review showed that the number of NMA publications has increased since 2010, with a marked increase in recent years. NMA is a promising approach to provide a broad, complete, and updated presentation of the evidence regarding all the available intervention options in dentistry. This might well represent a paradigm shift for systematic reviews. 88 However, as with any new knowledge, NMA in dentistry will require overcoming several challenges, such as improving document reporting and methodological quality to reduce the risk of bias, indexing of the study type in databases, and a consensus and discussion on terminology and standards for conducting and reporting. 13 On the other hand, despite their great relevance, the conduct of NMAs and the extrapolation of their results to support clinical practice strongly depend on well-conducted primary studies 87 and a clear and objective prior review protocol. Corroborating these findings, the NMA studies that applied the GRADE assessment had, in general, low to very low certainty of evidence. Therefore, NMAs should only be planned when there is a sufficient body of evidence and sound studies. If these premises cannot be met, it is suggested that appropriate primary studies be conducted.

Conclusion
Considering methodological aspects, NMA studies in dentistry had moderate quality and a high risk of bias in several domains, especially those related to study selection. The adherence to reporting guidelines is also questionable regarding the analysis of the available data. These findings raise doubts about whether systematic reviews with NMA should be conducted without sufficient and adequate evidence and without proper compliance with the criteria proposed in the reporting and quality assessment tools.

Figure 1. PRISMA flow diagram for the identification of relevant studies.
*Country of the corresponding author; # One reviewer extracted the data and the second reviewer confirmed all extracted data; SR: Systematic review; NMA: network meta-analyses; NR: not reported; RoB: Cochrane risk of bias tool; OR: Odds ratios; RR: relative risks; WMD: Weighted mean difference; CI: confidence intervals; CrI: credibility interval; MD: Mean difference; SD: Standard deviation.