Grading the quality of evidence

Grading the quality of evidence: GRADE workshop, Julius Centrum, UMC Utrecht. Yngve Falck-Ytter, Regina Kunz, Holger Schünemann. Utrecht, September 18, 2008

Content • Why the usual hierarchies to grade quality of evidence are problematic • How GRADE does it differently • Why judgments are still required • Why GRADEing quality of evidence is easier than you might think

Before GRADE
Level of evidence, source of evidence, grade of recommendation:
• I: SR, RCTs (grade A)
• II: Cohort studies (grade B)
• III: Case-control studies
• IV: Case series (grade C)
• V: Expert opinion (grade D)
Oxford Centre of Evidence Based Medicine; http://www.cebm.net

Committee of Ministers of the Council of Europe. Oct 2001.

Grading used in GI CPGs AGA AASLD ACG ASGE I Syst. review of RCTs I RCTs A. Prospect. controlled trials I RCTs, well designed, n↑ for suff. stat. power II-1 Controlled trials (no randomization) II 1+ properly desig. RCT, n↑, clinical setting B. Observational studies II 1 large well-designed clinical trial (+/- rand.), cohort or case-control studies or well designed meta-analysis II-2 Cohort or case-control analytical studies III Publ., well-desig. trials, pre-post, cohort, time series, case-control studies II-3 Multiple time series, dramatic uncontr. experiments IV Non-exp. studies >1 center/group, opinion respected authorities, clinical evidence, descr. studies, expert consensus comm. C. Expert opinion III Clinical experience, descr. studies, expert comm. III Opinion of respected authorities, descrip. epidemiology IV Not rated

Levels of evidence (Oxford Centre of EBM)
• Ia: Systematic reviews (meta-analyses) of RCTs
• Ib: Randomized controlled trials
• II: Cohort studies (bias)
• III: Case-control studies (bias)
• IV: Case series (bias)
• V: Expert opinion (bias)
Adapted from: Oxford Centre of Evidence Based Medicine; http://www.cebm.net

GRADE: Quality of evidence. The extent to which one can be confident that an estimate of effect or association is correct. Although the degree of confidence is a continuum, we suggest using four categories: • High • Moderate • Low • Very low

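Quality of evidence across studies [Figure: study-level grades from earlier systems (1B, I, B, IV, II, I, A, V, III) set against GRADE, which rates quality per outcome across studies: Outcome #1, Outcome #2, Outcome #3, labelled Quality: High, Quality: Low, Quality: Moderate]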

Determinants of quality • RCTs start high • Observational studies start low • What lowers quality of evidence? 5 factors: • Detailed study design and execution • Inconsistency • Indirectness • Publication bias • Imprecision

What is the study design?

Design and Execution • Limitations • Lack of allocation concealment • No true intention to treat principle • Inadequate blinding • Loss to follow-up • Early stopping for benefit

Allocation concealment
250 RCTs out of 33 meta-analyses
Allocation concealment: effect (ratio of OR)
• adequate: 1.00 (Ref.)
• unclear: 0.67 [0.60 – 0.75]
• not adequate: 0.59 [0.48 – 0.73] *
* significant
Schulz KF et al. JAMA 1995

Bias
Fields et al. 1970: RCT
• 167 pts with bilateral stenosis of the carotids + TIA
• Surgical vs medical management
• 151 pts analysed per protocol: RRR (TIA, CVA, death): 26% [6%, 42%], p = 0.01
• Outcome: pt had to be discharged alive and without TIA/CVA
• 15 in surg. group / 1 in med. management excluded; with these included (ITT): RRR (TIA, CVA, death): 17% [-3%, 32%], p = 0.09
Sackett, Gent. NEJM 1979

Another bias
MS: Plasmapheresis, Cyclophosphamide, Prednisone vs. Placebo
Follow-up: 6 months • 12 months • 18 months • 24 months
Neurologist, p values: < 0.05 • < 0.005 • NS • < 0.05
Outcome assessment blinded, p values: NS • NS • NS • NS
Noseworthy et al. Neurology 1994

What is double blind?
• Participants: Bias through other effective interventions, differential reporting of symptoms, dropping out
• Health care providers: Differentially prescribing effective co-interventions, influence compliance with follow-up, influence patient reports
• Data collectors: Differential encouragement, timing/frequency of outcomes assessment, differential recording of outcomes
• Judicial assessors of outcomes: Differential assessment of outcome
• Data analyst: Differential decisions on patient withdrawal, post hoc selection of outcomes or analytic approaches, selection of time points
• Data safety and monitoring committee: Differential decisions to continue or stop the trial
• Manuscript writers: May reduce biases in the presentation and interpretation of results

Quality issues [Diagram: trial flow from baseline through allocation (A: intervention, B: no intervention) and follow-up to outcome]
Question and method:
• Random? Sequence generation
• Selection bias? Allocation concealment
• Performance bias? Blinding/masking
• Attrition bias? Intention-to-treat analysis
• Detection bias? Blinding/masking

5 vs 4 chemo-Rx cycles for AML

Studies stopped early because of benefit

Consistency of results • If inconsistency, look for explanation • patients, intervention, outcome, methods • How to analyze • Differences in effect size • Overlap of confidence intervals • Chi-square test of homogeneity • I-squared • Unexplained inconsistency: downgrade quality
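The chi-square test of homogeneity and I-squared named on this slide can be sketched as follows. This is a minimal illustration using fixed-effect inverse-variance weights; the function name and the two-study input are hypothetical and not taken from the workshop.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (Q, degrees of freedom, I-squared in percent).

    effects:    per-study effect estimates on an additive scale (e.g. log odds ratios)
    std_errors: their standard errors
    """
    # Inverse-variance weights: more precise studies count more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    # Under homogeneity it follows a chi-square distribution with k - 1 df.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1

    # I-squared: share of total variation beyond what chance would explain.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical example: two equally precise studies with different effects.
q, df, i2 = cochran_q_and_i2([0.0, 1.0], [0.5, 0.5])
print(f"Q = {q:.2f} on {df} df, I-squared = {i2:.0f}%")
```

The statistics only flag inconsistency; whether it is explained by patients, intervention, outcome or methods remains a judgment.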

Heterogeneity Pagliaro L et al. Ann Intern Med 1992;117:59-70

Directness of Evidence • Indirect comparisons • Interested in head-to-head comparison • Drug A versus drug B • Infliximab versus adalimumab in Crohn’s disease • Differences in • patients (early cirrhosis vs end-stage cirrhosis) • interventions (CRC screening: flex. sig. vs colonoscopy) • outcomes (non-steroidal safety: ulcer on endoscopy vs symptomatic ulcer complications)

Publication bias • I.V. Mg in acute myocardial infarction • Meta-analysis: Yusuf S. Circulation 1993 • ISIS-4: Lancet 1995 • Egger M, Smith DS. BMJ 1995;310:752-54

Funnel plot • Symmetrical: No reporting bias [Figure: standard error (y-axis, 0 to 3) against odds ratio (x-axis, log scale, 0.1 to 10)] Egger M, Cochrane Colloquium Lyon 2001

Funnel plot • Asymmetrical: Reporting bias? [Figure: standard error (y-axis, 0 to 3) against odds ratio (x-axis, log scale, 0.1 to 10)] Egger M, Cochrane Colloquium Lyon 2001
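Funnel plot asymmetry can also be checked numerically. The sketch below uses a regression-based approach in the style commonly attributed to Egger: the standardized effect is regressed on precision, and an intercept far from zero suggests asymmetry. The slides only show the plots, so this is an assumption about how one might quantify them; the function name and the data are hypothetical, and no significance test is included.

```python
def funnel_asymmetry(effects, std_errors):
    """Regress effect/SE on 1/SE; return (intercept, slope).

    effects:    per-study log odds ratios
    std_errors: their standard errors
    """
    x = [1.0 / se for se in std_errors]                 # precision
    y = [e / se for e, se in zip(effects, std_errors)]  # standardized effect
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Ordinary least squares for a single predictor.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


std_errors = [0.1, 0.2, 0.4, 0.8]

# Hypothetical symmetrical case: every study estimates the same effect.
sym_intercept, sym_slope = funnel_asymmetry([-0.3, -0.3, -0.3, -0.3], std_errors)

# Hypothetical asymmetrical case: smaller studies report larger effects.
asym_intercept, asym_slope = funnel_asymmetry([-0.1, -0.3, -0.6, -1.0], std_errors)

print(f"symmetrical:  intercept = {sym_intercept:.2f}")
print(f"asymmetrical: intercept = {asym_intercept:.2f}")
```

Asymmetry has causes other than reporting bias, which is why the slide title ends in a question mark.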

Reporting bias • I.V. Mg in acute myocardial infarction • Meta-analysis: Yusuf S. Circulation 1993 • ISIS-4: Lancet 1995 • Egger M, Smith DS. BMJ 1995;310:752-54

Imprecision • Small sample size • Small number of events • Wide confidence intervals • Uncertainty about magnitude of effect • Is this an example of imprecision? • RCT: clopidogrel vs aspirin • 19,185 patients at risk of vascular events • Clopidogrel: 939 (5.32%) had major vascular event • Aspirin: 1,021 (5.83%) • RR of 0.91 (95% CI 0.83 – 0.99).
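One way to approach the question on this slide is to translate the relative effect into absolute terms. The sketch below uses only the numbers quoted above and assumes that the reported percentages can be treated as the risks in each arm and that the confidence limits of the RR can be applied to the aspirin risk; the variable names are illustrative.

```python
# Figures quoted on the slide.
risk_clopidogrel = 0.0532   # 5.32% with a major vascular event
risk_aspirin = 0.0583       # 5.83% with a major vascular event
rr_ci = (0.83, 0.99)        # 95% CI of the relative risk

# Point estimates.
rr = risk_clopidogrel / risk_aspirin
arr = risk_aspirin - risk_clopidogrel          # absolute risk reduction

# Assumption: apply each confidence limit of the RR to the aspirin risk.
arr_best = risk_aspirin * (1 - rr_ci[0])       # largest plausible benefit
arr_worst = risk_aspirin * (1 - rr_ci[1])      # smallest plausible benefit

print(f"RR  = {rr:.2f}")
print(f"ARR = {arr * 100:.2f} percentage points")
print(f"ARR range = {arr_worst * 100:.2f} to {arr_best * 100:.2f} percentage points")
```

Despite more than 19,000 patients, the plausible absolute benefit runs from about 1 percentage point down to well under 0.1. Whether that range crosses a threshold that would change a decision is the judgment the slide asks for.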

Quality assessment criteria
Quality of evidence: High (4) • Moderate (3) • Low (2) • Very low (1)
Study design: Randomized trial starts at High (4) • Observational study starts at Low (2)
Lower if… • Study limitations (design and execution) • Inconsistency • Indirectness • Imprecision • Publication bias
Higher if… • What can raise the quality of evidence?

BMJ 2003;327:1459–61

Quality assessment criteria
Quality of evidence: High (4) • Moderate (3) • Low (2) • Very low (1)
Study design: Randomized trial starts at High (4) • Observational study starts at Low (2)
Lower if… • Study limitations • Inconsistency • Indirectness • Imprecision • Publication bias
Higher if… • Large effect (e.g., RR 0.5) • Very large effect (e.g., RR 0.2) • Evidence of dose-response gradient • All plausible confounding would reduce a demonstrated effect
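The criteria on this slide can be read as simple arithmetic on the 1 to 4 scale. The sketch below is one possible encoding, not an official GRADE implementation. The function name and dictionary keys are hypothetical, and two points are assumptions of this sketch rather than statements from the slide: the number of levels moved per factor is supplied by the rater, and the result is clamped to the range 1 to 4.

```python
LEVELS = {4: "High", 3: "Moderate", 2: "Low", 1: "Very low"}
START = {"randomized trial": 4, "observational study": 2}


def grade_quality(study_design, downgrades=None, upgrades=None):
    """Return the quality category for one outcome.

    downgrades: factor name -> levels to lower (rater's judgment)
    upgrades:   factor name -> levels to raise (rater's judgment)
    """
    score = START[study_design]
    score -= sum((downgrades or {}).values())
    score += sum((upgrades or {}).values())
    # Assumption of this sketch: stay within Very low (1) to High (4).
    score = max(1, min(4, score))
    return LEVELS[score]


# Hypothetical ratings.
print(grade_quality("randomized trial"))
print(grade_quality("randomized trial", downgrades={"inconsistency": 1}))
print(grade_quality("observational study", upgrades={"large effect": 1}))
```

The arithmetic is the easy part; deciding whether a limitation is serious enough to move a level is where judgment is still required.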

Categories of quality
• High: Further research is very unlikely to change our confidence in the estimate of effect
• Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
• Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
• Very low: Any estimate of effect is very uncertain

Judgements about the overall quality of evidence • Most systems not explicit • Options: • Benefits • Primary outcome • Highest • Lowest • Beyond the scope of a systematic review • GRADE: Based on lowest of all the critical outcomes
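The GRADE rule on this slide, taking the lowest quality among the critical outcomes, can be sketched as below. The outcome names and ratings are hypothetical, as is the function name.

```python
ORDER = ["Very low", "Low", "Moderate", "High"]  # lowest to highest


def overall_quality(outcomes):
    """outcomes: outcome name -> (quality category, is_critical)."""
    critical = [quality for quality, is_critical in outcomes.values() if is_critical]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    # The overall rating is the lowest rating among critical outcomes;
    # outcomes not judged critical do not pull the rating down.
    return min(critical, key=ORDER.index)


# Hypothetical evidence profile.
profile = {
    "mortality": ("Moderate", True),
    "stroke": ("Low", True),
    "nausea": ("Very low", False),
}
overall = overall_quality(profile)
print(overall)
```

In this example the very low rating for nausea does not count because the outcome is not critical.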

Practical points: Cochrane risk of bias tool • Adequate sequence generation? • Adequate allocation concealment? • Adequate blinding of participants, personnel, and outcome assessors? (assess each main outcome) • Incomplete outcomes data adequately addressed? (assess each main outcome) • Free of selective outcome reporting? • Free of other sources of bias? Judgment: Yes (low risk of bias), No (high risk of bias), Unclear

Risk of bias graph in RevMan 5

From risk of bias to quality of evidence for main outcomes
Low risk of bias
• Across studies: Most information is from studies at low risk of bias
• Interpretation: Plausible bias unlikely to seriously alter the results
• Considerations: No apparent limitations
• GRADE: No serious limitations, do not downgrade
Unclear risk of bias
• Across studies: Most information is from studies at low or unclear risk of bias
• Interpretation: Plausible bias that raises some doubt about the results
• Considerations: Potential limitations are unlikely to lower confidence in the estimate of effect. GRADE: No serious limitations, do not downgrade
• Considerations: Potential limitations are likely to lower confidence in the estimate of effect. GRADE: Serious limitations, downgrade 1 level
High risk of bias
• Across studies: The proportion of information from studies at high risk of bias is sufficient to affect the interpretation of results
• Interpretation: Plausible bias that seriously weakens confidence in the results
• Considerations: Crucial limitation for one criterion, or some limitations for multiple criteria, sufficient to lower confidence in the estimate of effect. GRADE: Serious limitations, downgrade 1 level
• Considerations: Crucial limitation for one or more criteria sufficient to substantially lower confidence in the estimate of effect. GRADE: Very serious limitations, downgrade 2 levels
