Correspondence

Meta-Analyses and Large Randomized, Controlled Trials

N Engl J Med 1998; 338:59-62 (January 1, 1998)

To the Editor:

We were pleased to see that, using an independent protocol, LeLorier et al. (Aug. 21 issue)1 confirmed both our2 previous estimates of the frequency of discrepancies between large trials and meta-analyses and those of Villar et al.3 Their selection of 12 large trials from four influential journals may have inflated the frequency of apparent discrepancies. Such journals may tend to publish trials that are likely to change practice, whose results disagree with prior evidence.4 Still, the estimates of LeLorier et al. are largely similar to prior estimates. However, we are concerned that several of their premises propagate outdated myths.

First, why is the latest single large trial always the gold standard against which all prior evidence (often including several large trials) must be measured? In 6 of the 12 cases discussed, the meta-analysis had more patients than the subsequent gold standard. Second, decision making based solely on which side of 0.05 the P value lies is potentially misleading; an odds ratio of 0.7 (95 percent confidence interval, 0.5 to 0.9; P = 0.01), although different in precision, is hardly discrepant with an odds ratio of 0.7 (95 percent confidence interval, 0.3 to 1.8; P = 0.4). The measure that LeLorier et al. use may misrepresent the true frequency of disagreement.
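A minimal sketch of the second point, assuming that both quoted intervals are symmetric 95 percent Wald intervals on the log odds scale (the letter does not say how they were constructed, and the quoted limits are rounded). The helper name `log_or_and_se` is illustrative. It recovers a standard error from each interval and shows that the two results share the same point estimate and differ only in precision.

```python
import math

def log_or_and_se(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Return (log odds ratio, standard error).

    Assumption: the interval is a symmetric 95% Wald interval on the
    log scale, so its width equals 2 * z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or, se

# The two hypothetical results quoted in the letter.
precise = log_or_and_se(0.7, 0.5, 0.9)    # P = 0.01 in the letter
imprecise = log_or_and_se(0.7, 0.3, 1.8)  # P = 0.4 in the letter

difference = precise[0] - imprecise[0]  # difference in log odds ratios
se_ratio = imprecise[1] / precise[1]    # relative width of the second interval

print(f"difference in log odds ratio: {difference:.3f}")
print(f"standard errors: {precise[1]:.3f} vs {imprecise[1]:.3f}")
print(f"second result is about {se_ratio:.1f} times less precise")
```

Under these assumptions the estimated effects are identical; only the standard errors differ, which is why classifying one result as "positive" and the other as "negative" can overstate the disagreement.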

Third, even with appropriate measures, discrepancies between meta-analyses and large trials should be expected, given the variable characteristics and treatment responses in different persons, protocols, and populations. Not only are trials in meta-analyses frequently heterogeneous, but also the idea of the homogeneous single trial is often a myth. Discrepancies occur even within trials 5 and between large trials themselves, as studies of magnesium in myocardial infarction exemplify.6 Meta-analysis has recently been evolving toward evaluating this heterogeneity. It is more constructive to quantify reasons for discrepancies 2 rather than wait for the latest larger and better trial that may nullify past experience. Unfortunately, LeLorier et al. did not explore such reasons systematically.

Fourth, potential biases exist in both meta-analyses and clinical trials. If nothing else, meta-analysis sensitizes us to several of these biases regarding the conduct and reporting of trials.4 LeLorier and colleagues made use of such scientific advances to make their points. Meta-analysis is not statistical alchemy that makes life easier by distilling one magic number from confounded data; it is a scientific discipline that aims to quantify evidence and to explore bias and diversity in research systematically. We should keep trying to improve clinical trials and meta-analyses, not undermine them.

John P.A. Ioannidis, M.D.
National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892

Joseph C. Cappelleri, Ph.D., M.P.H.
Pfizer Central Research, Groton, CT 06340

Joseph Lau, M.D.
New England Medical Center Hospitals, Boston, MA 02111

References
  1. LeLorier J, Gregoire G, Benhaddad A, Lapierre J, Derderian F. Discrepancies between meta-analyses and subsequent large randomized, controlled trials. N Engl J Med 1997;337:536-542
  2. Cappelleri JC, Ioannidis JP, Schmid CH, et al. Large trials vs meta-analysis of smaller trials: how do their results compare? JAMA 1996;276:1332-1338
  3. Villar J, Carroli G, Belizan JM. Predictive ability of meta-analyses of randomised controlled trials. Lancet 1995;345:772-776
  4. Ioannidis JP, Cappelleri JC, Sacks HS, Lau J. The relationship between study design, results, and reporting of randomized clinical trials of HIV infection. Control Clin Trials 1997;18:431-444
  5. Horwitz RI, Singer BH, Makuch RW, Viscoli CM. Can treatment that is helpful on average be harmful to some patients? A study of the conflicting information needs of clinical inquiry and drug regulation. J Clin Epidemiol 1996;49:395-400
  6. Woods KL. Mega-trials and management of acute myocardial infarction. Lancet 1995;346:611-614

To the Editor:

LeLorier et al. assume that the results of randomized, controlled trials correctly represent the true effect of an intervention and that the results of meta-analyses must be judged against this gold standard. This comparison, however, is not valid when there are major methodologic differences between the trials included in the meta-analysis and the subsequent randomized, controlled trial.

For example, the authors compare the results of a meta-analysis and a randomized, controlled trial that examined the efficacy of nitrates in patients with acute myocardial infarction. The meta-analysis, published in 1988,1 found a benefit in terms of mortality from the use of nitrates (odds ratio, 0.65; 95 percent confidence interval, 0.51 to 0.82), but the randomized, controlled trial, published in 1994,2 found no benefit (odds ratio, 0.94; 95 percent confidence interval, 0.84 to 1.05). However, both the interventions and the patient populations were markedly different. Patients in the meta-analysis were not treated with thrombolytic agents and were rarely treated with beta-blockers, and the control group had a high mortality rate (20.5 percent).1 In contrast, patients in the randomized, controlled trial were intensively treated with multiple therapies (72 percent received thrombolytic agents and 31 percent received beta-blockers), and the mortality rate (6.9 percent) in the control group was much lower.2 Rather than indicating that the meta-analysis is wrong, the findings suggest that nitrates decrease mortality only in patients who are not treated acutely with other therapies.
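As a rough check on how far apart the two quoted estimates are, the following sketch compares them with a simple z test on the log odds scale. This test does not appear in the letter; it assumes that the two estimates are independent and that both intervals are 95 percent Wald intervals on the log scale. The function name `se_from_ci` is illustrative.

```python
import math

def se_from_ci(ci_low, ci_high, z_crit=1.96):
    """Standard error of a log odds ratio, assuming a 95% Wald interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)

# Odds ratios and intervals as quoted in the letter.
meta_or, meta_se = 0.65, se_from_ci(0.51, 0.82)    # 1988 meta-analysis
trial_or, trial_se = 0.94, se_from_ci(0.84, 1.05)  # 1994 randomized trial

# z statistic for the difference between two independent log odds ratios.
z = (math.log(meta_or) - math.log(trial_or)) / math.sqrt(meta_se**2 + trial_se**2)

# Two-sided P value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, two-sided P = {p_two_sided:.4f}")
```

On these assumptions the two estimates differ by more than chance alone would readily explain. The test cannot say why they differ, and differences in treatments and populations of the kind described above are one candidate explanation.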

Stephen Bent, M.D.
Karla Kerlikowske, M.D.
Deborah Grady, M.D., M.P.H.
University of California, San Francisco, San Francisco, CA 94143

References
  1. Yusuf S, Collins R, MacMahon S, Peto R. Effect of intravenous nitrates on mortality in acute myocardial infarction: an overview of the randomised trials. Lancet 1988;1:1088-1092
  2. Gruppo Italiano per lo Studio della Sopravvivenza nell'Infarto Miocardico. GISSI-3: effects of lisinopril and transdermal glyceryl trinitrate singly and together on 6-week mortality and ventricular function after acute myocardial infarction. Lancet 1994;343:1115-1122

To the Editor:

. . . An overall estimate from a meta-analysis can be misleading if there is considerable heterogeneity among the included trials that has not been fully investigated. Similarly, it is misleading to compare the results of a single study with those of a meta-analysis without a careful examination of important characteristics of the patients and interventions included in these trials. Unfortunately, the study by LeLorier and colleagues, by giving the impression that the meta-analyses and the large trials were measuring the same thing, applies a simplistic analysis to a complex issue. These potentially misleading comparisons were seized on in the accompanying editorial (Aug. 21 issue)1 to assert that a conventional narrative review is more reliable than a well-conducted meta-analysis, without providing any objective evidence to demonstrate the predictive accuracy of such narrative reviews. The reliability of large randomized, controlled trials, systematic reviews, meta-analyses, and narrative or ad hoc reviews and their respective roles in the field of clinical evaluation should be decided on the basis of careful scientific inquiry rather than prejudice.

Fu-Jian Song, B.Med., M.Med., Ph.D.
Trevor A. Sheldon, M.Sc.
University of York, York YO1 5DD, United Kingdom

References
  1. Bailar JC III. The promise and problems of meta-analysis. N Engl J Med 1997;337:559-561

To the Editor:

The study by LeLorier et al. comparing the results of meta-analyses and subsequent large randomized, controlled trials illustrates the importance of exploring the heterogeneity of research evidence, a point noticeably missing from the editorial by Bailar. It would surely have been informative for LeLorier et al. to have explored the heterogeneity evident in Figure 1 of their article, particularly with respect to the methodologic quality, numbers of patients, and the length of follow-up. Instead, the authors chose to summarize the results in terms of predictive ability, a simplistic approach, particularly when correlated outcomes from within the same studies were included.

Both the article and the editorial highlight pitfalls that are only too well known to reviewers in the Cochrane Collaboration.1 However, LeLorier et al. failed to provide information about how closely the meta-analyses followed Cochrane Collaboration guidelines,1 among which are identifying unpublished studies, specifying whether data on individual patients or aggregate data were used, and revealing the way in which the quality of the original trial design was evaluated and whether heterogeneity between trial results was investigated. None of these points were mentioned by Bailar. Similarly, the only indication of the rigor of the large, randomized trials selected in the study by LeLorier et al. is provided by the journal in which they were published and the number of patients randomized, rather than by the mention of any previously published standards,2 despite the description and use of such trials as the gold standard in evaluations of the efficacy of clinical interventions. Although recognizing the key role of rigorous, large, randomized, controlled clinical trials, we must not throw out the baby with the bath water, or fall prey to the biases inherent in conventional narrative review,3 by dismissing systematic reviews and, when appropriate, meta-analysis.

Saboor Khan, F.R.C.S.
Paula Williamson, Ph.D.
Robert Sutton, F.R.C.S.(Gen.)
University of Liverpool, Liverpool L69 3BX, United Kingdom

References
  1. Oxman A. Preparing and maintaining systematic reviews. Cochrane Collaboration handbook. Oxford, England: Cochrane Collaboration, 1996:Section VI.
  2. DerSimonian R, Charette LJ, McPeek B, Mosteller F. Reporting on methods in clinical trials. N Engl J Med 1982;306:1332-1337
  3. Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. JAMA 1992;268:240-248

To the Editor:

LeLorier et al. restrictively searched for trials in four high-profile journals that can be very selective about publication. It is possible that the trials identified were submitted or published for the very reason that their effect sizes differed from those in previous meta-analyses, whereas trials closely confirming meta-analyses may have appeared in less prestigious journals. A less biased approach might have been to have conducted a similar analysis in which primary selection was applied to the meta-analyses and all journals were searched for subsequent trials. . . .

In our opinion, the editorial presents the biased viewpoint of a single person (much like a conventional review), illustrated by the statement that “when both the trial and the meta-analysis seem to be of good quality, . . . I tend to believe the results of the trial.” On what basis? Support of narrative over systematic reviews is worrying. The problems of traditional review are numerous and have been well documented.1 The expression of such an opinion in a Journal editorial is a step back in this era of evidence-based medicine.

Lesley A. Stewart, Ph.D.
Mahesh K.B. Parmar, D.Phil.
Jayne F. Tierney, Ph.D.
British Medical Research Council Cancer Trials Office, Cambridge CB2 2BW, United Kingdom

References
  1. Mulrow CD. The medical review article: state of the science. Ann Intern Med 1987;106:485-488

To the Editor:

LeLorier et al. make a key assumption that the meta-analyses and the randomized trials were both estimating the same underlying effect. They attempted to adjust for any error in this assumption by performing a sensitivity analysis on the determination of similarity by the reviewers.

We believe that more advanced techniques of meta-analysis that explore specific sources of heterogeneity would provide additional insight into why the meta-analyses and their corresponding large trials did not observe the same outcomes. For example, techniques such as hierarchical Bayes'1 and regression methods could be used to identify specific points on which the large trials and the individual trials in the meta-analyses differ, and to quantify the associations of these sources of heterogeneity with the observed outcomes. These analyses might therefore generate fruitful new directions for research.

When we acknowledge that meta-analysis is a method for studying studies rather than a shortcut for conducting large, randomized trials, we will begin to find the proper place for meta-analysis in our biostatistical toolbag.

Ida Sim, M.D.
Philip Lavori, Ph.D.
Palo Alto Veterans Health Care System, Palo Alto, CA 94304

References
  1. DuMouchel WH, Harris JE. Bayes methods for combining the results of cancer studies in humans and other species. J Am Stat Assoc 1983;78:293-308

To the Editor:

Meta-analysis provides an opportunity to look for reasons for inconsistent results among studies, but LeLorier et al. mention only some hypothetical, generic reasons and overlook clinical information that might have explained the discrepant findings.

The discrepancies may be explained more by clinical heterogeneity and details of the study protocols and less by publication bias and analysis of random as opposed to fixed effects. For example, there was a statistically significant discrepancy between the meta-analysis1 and the large randomized, controlled trial — the Collaborative Low-Dose Aspirin Study in Pregnancy (CLASP)2 — involving low-dose aspirin for the prevention of intrauterine growth retardation. Eligibility criteria for the study (women at 12 to 32 weeks of gestation with a sufficient risk of preeclampsia or intrauterine growth retardation according to the responsible clinician) were vastly different from those of the meta-analysis (women with prior preeclampsia, intrauterine growth retardation, or placental infarction; primiparas with either increased blood pressure in response to angiotensin II or abnormal uteroplacental blood flow). This difference is reflected in the base-line risks of intrauterine growth retardation in the control groups: 6.6 percent (95 percent confidence interval, 6.2 to 7.0 percent) for women enrolled in the CLASP trial and 28 percent (range, 18 to 63 percent) for the study groups in the meta-analysis. This difference in risk by more than a factor of 4 exists despite the use of a less stringent definition of intrauterine growth retardation in the CLASP trial. Differences in the base-line risk of preeclampsia further highlight the heterogeneity: 7.6 percent (95 percent confidence interval, 6.8 to 8.1 percent) in the CLASP trial and 33 percent (range, 17 to 52 percent) in the meta-analysis.
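The "factor of 4" can be checked directly from the base-line risks quoted above. This is a minimal sketch that uses only the point estimates and ignores the confidence intervals and ranges; the variable names are illustrative.

```python
# Base-line (control group) risks in percent, as quoted in the letter.
clasp_risk = {"intrauterine growth retardation": 6.6, "preeclampsia": 7.6}
meta_risk = {"intrauterine growth retardation": 28.0, "preeclampsia": 33.0}

# Ratio of the meta-analysis base-line risk to the CLASP base-line risk.
risk_ratio = {outcome: meta_risk[outcome] / clasp_risk[outcome]
              for outcome in clasp_risk}

for outcome, ratio in risk_ratio.items():
    print(f"{outcome}: base-line risk {ratio:.1f} times as high in the meta-analysis")
```

Both ratios exceed 4, in line with the statement that the women in the meta-analysis were at much higher risk than those enrolled in the CLASP trial.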

The difference in the base-line risks of intrauterine growth retardation and preeclampsia, despite an offsetting difference in the criteria for intrauterine growth retardation, is a plausible explanation for the discrepant results. In reality, the CLASP trial and meta-analysis results are not necessarily discrepant, but they may reflect a variation in the effect of treatment with low-dose aspirin as a function of the risk of intrauterine growth retardation.

Without a careful consideration of clinical homogeneity, the work of LeLorier et al. has the same limitations as meta-analyses that do not carefully consider the clinical aspects of data synthesis.

Thomas F. Imperiale, M.D.
Indiana University School of Medicine, Indianapolis, IN 46202

References
  1. Imperiale TF, Petrulis AS. A meta-analysis of low-dose aspirin for the prevention of pregnancy-induced hypertensive disease. JAMA 1991;266:260-264
  2. CLASP (Collaborative Low-Dose Aspirin Study in Pregnancy) Collaborative Group. CLASP: a randomised trial of low-dose aspirin for the prevention and treatment of pre-eclampsia among 9364 pregnant women. Lancet 1994;343:619-629

Author/Editor Response

The authors reply:

To the Editor: Ioannidis et al., as well as Khan et al. and Stewart et al., suggest that the editors of the influential journals that published the trials we chose may tend to favor the publication of large trials whose results disagree with prior evidence. This is a new variation on publication bias that, unfortunately, cannot be proved. Ioannidis et al. mention that our work confirms their own results1 and those of Villar et al.,2 but we want to respond to their comments.

First, we do not agree with the view that the six meta-analyses with more patients than the large randomized, controlled trials are more credible. Although the inclusion of more patients gives more statistical power, it cannot compensate for methodologic flaws. Second, we still think that the precision of an odds ratio is important, since it determines whether a therapy is adopted or rejected. An odds ratio whose confidence interval overlaps 1 will be considered, at best, to represent a tendency, and the null hypothesis will still stand. Third, we fully agree that the problems of heterogeneity are extremely important, and they are the object of our present work. Fourth, we are certainly in favor of having meta-analysis emphasize the systematic exploration of bias and diversity in research rather than the distillation of a magic odds ratio.

According to Bent et al., the higher base-line mortality rates in the meta-analysis3 of the efficacy of nitrates in patients with myocardial infarction could explain the discrepancy between its results and those of the subsequent large randomized, controlled trial — the third study of the Gruppo Italiano per lo Studio della Sopravvivenza nell'Infarto Miocardico (GISSI-3).4 The question is whether these differences alone could move the odds ratio from 0.5 to nearly 1, given that the meta-analysis includes studies with base-line mortality rates that are lower than the one in the GISSI-3 trial. The large randomized, controlled trial would thus have met the homogeneity criteria of the meta-analysis. It is fortunate that the investigators decided to examine the role of nitrates in acute myocardial infarction in the era of thrombolytic agents and beta-blockers by conducting a trial rather than a sequential meta-analysis.

Imperiale proposes that the differences in base-line rates of preeclampsia and intrauterine growth retardation can be used to explain why the positive results of his meta-analysis5 on the effects of aspirin were not confirmed by the large randomized, controlled trial6 (the CLASP trial). An alternative explanation would be that among the six studies in the meta-analysis, one was a nonrandomized trial7 and two were not placebo-controlled.7,8

We agree with the proposal of Sim and Lavori for the development of statistical techniques to explore specific sources of heterogeneity and assist in the selection of studies. The choice of the data to be included constitutes the first and most fundamental step in a review and is, in our opinion, much more important than its eventual shape or form.

Jacques LeLorier, M.D., Ph.D.
Geneviève Grégoire, M.D.
Hôtel-Dieu de Montréal, Montreal, QC H2W 1T8, Canada

References
  1. Cappelleri JC, Ioannidis JP, Schmid CH, et al. Large trials vs meta-analysis of smaller trials: how do their results compare? JAMA 1996;276:1332-1338
  2. Villar J, Carroli G, Belizan JM. Predictive ability of meta-analyses of randomised controlled trials. Lancet 1995;345:772-776
  3. Yusuf S, Collins R, MacMahon S, Peto R. Effect of intravenous nitrates on mortality in acute myocardial infarction: an overview of the randomised trials. Lancet 1988;1:1088-1092
  4. Gruppo Italiano per lo Studio della Sopravvivenza nell'Infarto Miocardico. GISSI-3: effects of lisinopril and transdermal glyceryl trinitrate singly and together on 6-week mortality and ventricular function after acute myocardial infarction. Lancet 1994;343:1115-1122
  5. Imperiale TF, Petrulis AS. A meta-analysis of low-dose aspirin for the prevention of pregnancy-induced hypertensive disease. JAMA 1991;266:260-264
  6. CLASP (Collaborative Low-Dose Aspirin Study in Pregnancy) Collaborative Group. CLASP: a randomised trial of low-dose aspirin for the prevention and treatment of pre-eclampsia among 9364 pregnant women. Lancet 1994;343:619-629
  7. Schiff E, Peleg E, Goldenberg M, et al. The use of aspirin to prevent pregnancy-induced hypertension and lower the ratio of thromboxane A2 to prostacyclin in relatively high risk pregnancies. N Engl J Med 1989;321:351-356
  8. Wallenburg HC, Rotmans N. Prevention of recurrent idiopathic fetal growth retardation by low-dose aspirin and dipyridamole. Am J Obstet Gynecol 1987;157:1230-1235

Author/Editor Response

My objections to meta-analysis are purely pragmatic. It does not work nearly as well as we might want it to work. The problems are so deep and so numerous that the results are simply not reliable. My editorial cites a few relevant references, and I could have cited many more. The work of LeLorier et al. adds to the evidence that meta-analysis simply does not work very well in practice.

Khan et al. seem concerned that neither the meta-analyses nor the randomized, controlled trials were performed to their own standard of excellence. But that is just the point. As it is practiced and as it is reported in our leading journals, meta-analysis is often deeply flawed. Many people cite high-sounding guidelines, and I am sure that all truly want to do a superior analysis, but meta-analysis often fails in ways that seem to be invisible to the analyst. We cannot know whether improved implementation would alter the findings.

Stewart et al. suggest that leading journals may deliberately select and publish randomized, controlled trials that disagree with previously published meta-analyses, and they propose that all journals be searched for randomized, controlled trials. That could be useful, but it would pose a much bigger task than the work of LeLorier et al. and might miss the main point: the results of meta-analyses are often at variance with those of randomized, controlled trials. Certainly, randomized, controlled trials can be done as poorly as meta-analyses, and the analysis conducted by LeLorier et al. is also less than perfect. What we need is a guide through the imperfect world of science.

The advocates of meta-analysis and evidence-based medicine should undertake research that might demonstrate that meta-analyses in the real world — not just in theory — improve health outcomes in patients. Review of the long history of randomized, controlled trials, individually weak for this specific purpose, has led to overwhelming evidence of efficacy. Examples include the development of better vaccines, more effective screening for diseases, and improved treatments for childhood cancer, infections, mental illness, cardiovascular disease, and many others. I am not willing to abandon that history to join those now promoting meta-analysis as the answer, no matter how pretty the underlying theory, until its defects are honestly exposed and corrected. The knowledgeable, thoughtful, traditional review of the original literature remains the closest thing we have to a gold standard for summarizing disparate evidence in medicine.

John C. Bailar, III, M.D., Ph.D.
University of Chicago, Chicago, IL 60637-1470
