Correspondence

Observational Studies and Randomized Trials

N Engl J Med 2000; 343:1194-1197 October 19, 2000

To the Editor:

In the June 22 issue, Concato et al.1 compared 5 systematic reviews of randomized, controlled trials and observational studies on the same topic, and Benson and Hartz2 evaluated 18 case studies of randomized, controlled trials and observational studies of different interventions and 3 reviews of randomized, controlled trials and observational studies of the same intervention. Do these data constitute a representative sample of all available comparisons in the medical literature? We think not. We know of at least two systematic reviews3,4 in which observational studies detected effects that were not supported by randomized, controlled trials. It would be counterintuitive if randomization, the most important way to produce groups that are truly comparable with respect to known and unknown prognostic factors at base line, were superfluous for generating valid estimates of effect. Even in trials purported to be randomized, if the randomization is inadequately implemented, higher estimates of effect are produced.5-7

Furthermore, the conclusion that the estimates of effect generated by the studies of different design are similar is valid only if the two groups of studies are similar in all respects other than the design itself. Concato et al. state that patients in the observational studies were different from those in randomized, controlled trials for a number of reasons: broader inclusion criteria, a wider spectrum of coexisting illnesses, a wider spectrum of disease severity, and dissimilar concomitant treatments. Therefore, the detection of similar magnitudes of effect in randomized, controlled trials and observational studies could simply be due to differences in base-line prognostic factors and concomitant interventions. Without adjustment for such differences, it is not possible to assess whether the observed lack of difference in effect between the two groups of studies is likely to be due to a lack of benefit from randomization.

Although anecdotal evidence of case studies such as those of Benson and Hartz and Concato et al. makes us think about this topic, their conclusions have limited value for generalization. Only a systematic review that includes all the available evidence, with explicit comparison among individual studies with regard to patients, interventions, and follow-up, can provide the needed answers.

Regina Kunz, M.D.
Universitätsklinikum Charité, 10117 Berlin, Germany

Khalid S. Khan, M.B., B.S.
University of York, York YO10 5DD, United Kingdom

Hans-Helmut Neumayer, M.D.
Universitätsklinikum Charité, 10117 Berlin, Germany

References

  1. Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med 2000;342:1887-1892.

  2. Benson K, Hartz AJ. A comparison of observational studies and randomized, controlled trials. N Engl J Med 2000;342:1878-1886.

  3. Lonn EM, Yusuf S. Is there a role for antioxidant vitamins in the prevention of cardiovascular disease? An update on epidemiological and clinical trials data. Can J Cardiol 1997;13:957-965.

  4. Patterson RE, White E, Kristal AR, Neuhouser ML, Potter JD. Vitamin supplement and cancer risk: the epidemiological evidence. Cancer Causes Control 1997;8:786-802.

  5. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273:408-412.

  6. Linde K, Scholz M, Ramirez G, Clausius N, Melchart D, Jonas WB. Impact of study quality on outcome in placebo-controlled trials of homeopathy. J Clin Epidemiol 1999;52:631-636.

  7. Kunz R, Oxman AD. The unpredictability paradox: review of empirical comparisons of randomised and non-randomised clinical trials. BMJ 1998;317:1185-1190.

To the Editor:

I write as one of the authors of a report in which randomized, controlled trials were compared with those using historical controls.1 Concato et al. seem to have set up, and knocked down, a straw man. As far as I know, no one has claimed that “all observational studies are misleading.” Indeed, if all observational studies were misleading, there would be no problem, since we could always disregard their conclusions. The problem, of course, is that only some observational studies are misleading (just as some randomized, controlled trials are misleading), but that no one has devised a foolproof method for distinguishing those that are useful from those that are misleading. The demonstration that the results of observational studies are sometimes in agreement with those of randomized trials is not a great surprise. When there is such agreement, treatment decisions may be straightforward. Providers and their patients are faced with difficult choices when large numbers of observational studies suggest, for example, that hormone-replacement therapy reduces the risk of coronary artery disease in postmenopausal women but randomized trials fail to confirm the benefit. There have been many other therapies that looked promising in observational studies but that were later discredited and abandoned.

I agree with Concato et al. that observational studies may often yield estimates of treatment effects that are similar to those provided by randomized trials. However, their study does not appear to have answered the following questions: What should we do when randomized, controlled trials and observational studies disagree, and which type of study design is more likely to give the truth?

Henry S. Sacks, Ph.D., M.D.
Mount Sinai School of Medicine, New York, NY 10029

References

  1. Sacks HS, Chalmers TC, Smith H Jr. Randomized versus historical controls for clinical trials. Am J Med 1982;72:233-240.

To the Editor:

Observational comparisons are always subject to bias by an unknowable amount; the examples cited by Benson and Hartz and by Concato et al. provide no assurance that the next observational study will not be misleading.

As just one example, during the 1980s several nonrandomized studies investigated combinations of new-generation chemotherapy drugs with highly promising results for treating advanced non-Hodgkin's lymphoma. These regimens included combinations of methotrexate, bleomycin, doxorubicin, cyclophosphamide, vincristine, and dexamethasone (m-BACOD); prednisone, doxorubicin, cyclophosphamide, and etoposide, followed by cytarabine, bleomycin, vincristine, and methotrexate (ProMACE-CytaBOM); and methotrexate, doxorubicin, cyclophosphamide, vincristine, prednisone, and bleomycin (MACOP-B). However, a large randomized trial in which a standard treatment, the combination of cyclophosphamide, doxorubicin, vincristine, and prednisone (CHOP), was compared with these new regimens found no differences among the four regimens in patients' responses or survival.1 In addition, m-BACOD and MACOP-B were associated with significantly more life-threatening and fatal toxic effects. This trial was critical in the retention of CHOP as the standard of care for this group of patients.

Autologous marrow or stem-cell transplantation for breast cancer provides another lesson on the hazards of nonrandomized comparisons. Years of single-group transplantation studies, with results purported to be superior to those of historical studies,2 led many transplantation practitioners to believe strongly in the superiority of these regimens. Consequently, the completion of randomized clinical trials was difficult and prolonged. The results of trials are now finally available and fail to confirm the expectations for transplantation.3 These results could have been obtained years ago.

The fact that these important inconsistencies exist and that they are unpredictable and not especially uncommon demonstrates the necessity of randomized trials. Readers of the Journal should not now think that randomized trials are unnecessary because observational studies give the same answers as the trials most of the time.

Observational studies identify and set priorities for further investigations. However, as the only approach that guarantees unbiased treatment assignment, randomized trials, whenever feasible, must be the standard of evaluation. As Fredrickson said more than 30 years ago, “If, in major medical dilemmas, the alternative is to pay the cost of perpetual uncertainty, have we really any choice?” 4

Ping-Yu Liu, Ph.D.
Garnet Anderson, Ph.D.
John J. Crowley, Ph.D.
Fred Hutchinson Cancer Research Center, Seattle, WA 98109-1024

References

  1. Fisher RI, Gaynor ER, Dahlberg S, et al. Comparison of a standard regimen (CHOP) with three intensive chemotherapy regimens for advanced non-Hodgkin's lymphoma. N Engl J Med 1993;328:1002-1006.

  2. Ho AD, Gluck S, Germond C, Sinoff C, Corringham RET. Long-term outcome after high-dose chemotherapy with stem cell support for metastatic breast cancer. Proc Am Soc Clin Oncol 1994;13:96. abstract.

  3. Peters W, Rosner G, Vredenburgh J, et al. A prospective, randomized comparison of two doses of combination alkylating agents (AA) as consolidation after CAF in high-risk primary breast cancer involving ten or more axillary lymph nodes (LN): preliminary results of CALGB9082/SWOG 9114/NCIC MA-13. Proc Am Soc Clin Oncol 1999;18:1a. abstract.

  4. Fredrickson DS. The field trial: some thoughts on the indispensable ordeal. Bull N Y Acad Med 1968;44:985-993.

To the Editor:

The debate over the relative merits of controlled, observational studies as opposed to randomized, controlled trials raises Socratic as well as scientific issues of concern. How do we in fact determine the truth in clinical medicine? The articles and the accompanying editorial1 accept the sacrosanct view that randomized, controlled trials are the gold standard. All too often, however, the conclusions of randomized clinical trials are not replicable when the outcomes are examined in everyday practice. Examples are the failures to demonstrate a benefit from tissue plasminogen activator in patients with ischemic stroke,2 from endarterectomy in patients with symptomatic carotid artery disease,3 or from primary percutaneous transluminal coronary angioplasty in patients with acute myocardial infarction 4 in community studies. The limitations of randomized, controlled trials must therefore also be considered. Ethical standards and patient-selection criteria often create study groups that differ from the general population (in what is known as a filtering effect). Often, randomized, controlled trials are specific with respect to sex, age, or race, necessitating extrapolations that may not always be valid. Randomized, controlled trials are generally performed at clinical centers, where highly skilled practitioners perform procedures. The level of skill and quality control evident in a large teaching hospital may not be replicable in a small, community hospital. Close scrutiny of patients is likely to reduce the adverse effects of treatment, even with double blinding. Randomized, controlled trials with negative results are less likely to be seen in print.5

The reproducibility of findings, the demonstration of meaningful benefits, and the presence of only minimal risks to patients should determine the value of treatments. Indeed, ascertaining the truth in clinical science may depend more on the precision, objectivity, and integrity of medical scientists than on the designs of their studies.

Howard S. Friedman, M.D.
New York University Medical School, New York, NY 10016

References

  1. Pocock SJ, Elbourne DR. Randomized trials or observational tribulations? N Engl J Med 2000;342:1907-1909.

  2. Katzan IL, Furlan AJ, Lloyd LE, et al. Use of tissue-type plasminogen activator for acute ischemic stroke: the Cleveland area experience. JAMA 2000;283:1151-1158.

  3. Wennberg DE, Lucas FL, Birkmeyer JD, Bredenberg CE, Fisher ES. Variation in carotid endarterectomy mortality in the Medicare population: trial hospitals, volume, and patient characteristics. JAMA 1998;279:1278-1281.

  4. Canto JG, Every NR, Magid DJ, et al. The volume of primary angioplasty procedures and survival after acute myocardial infarction. N Engl J Med 2000;342:1573-1580.

  5. Ioannidis JPA. Effect of the statistical significance of results on the time to completion and publication of randomized efficacy trials. JAMA 1998;279:281-286.

To the Editor:

It seems odd that Benson and Hartz included studies of beta-blockers in their attempt to argue that observational studies, as compared with randomized studies, do not produce overestimates of treatment effects. The largest nonrandomized study of beta-blockers and mortality of which we are aware was published in the Journal in 1998.1 Its publication thus falls within the period Benson and Hartz chose to focus on, yet it is not listed among their references — an omission that seems hard to justify. This nonrandomized study, which included data on more than 200,000 patients, estimated that beta-blockers conferred a 40 percent reduction in mortality.1 The randomized trials estimated that beta-blockers provided a reduction in mortality on the order of 20 percent. The bias in the estimate of the treatment effect from the nonrandomized methods is the same size as the estimate of the treatment effect in the randomized trials — a fact that appears to indicate a need for the control of bias that randomization provides.
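The arithmetic behind this comparison can be written out as a minimal sketch. It uses only the two approximate figures quoted in the letter (a 40 percent reduction in the nonrandomized study and roughly 20 percent in the randomized trials) and assumes, as the letter does, that the randomized estimate is the reference value. The function name `apparent_bias` is illustrative and not taken from any cited study.

```python
# Approximate figures as quoted in the letter above.
observational_reduction = 0.40  # nonrandomized study of more than 200,000 patients
randomized_reduction = 0.20     # "on the order of 20 percent" in randomized trials


def apparent_bias(observational: float, randomized: float) -> float:
    """Gap between the two estimates, treating the randomized one as the reference."""
    return observational - randomized


bias = apparent_bias(observational_reduction, randomized_reduction)
# How large the gap is relative to the randomized estimate itself.
ratio = bias / randomized_reduction

print(f"apparent bias: {bias:.0%} (absolute difference in estimated reduction)")
print(f"bias as a multiple of the randomized estimate: {ratio:.1f}")
```

On these figures the gap between the two designs is as large as the randomized estimate, which is the point the letter makes; the calculation says nothing about why the estimates differ.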

Rebecca P. Smith, M.D.
New York Presbyterian Hospital, New York, NY 10021

Paul Meier, Ph.D.
Columbia University, New York, NY 10027

References

  1. Gottlieb SS, McCarter RJ, Vogel RA. Effect of beta-blockade on mortality among high-risk and low-risk patients after myocardial infarction. N Engl J Med 1998;339:489-497.

Author/Editor Response

The authors reply:

Kunz et al. question whether the studies chosen for our article were representative. Since we included all the studies found in a systematic search with a priori criteria, they should be representative. The two studies cited by Kunz et al. about the effectiveness of vitamins did not meet our criteria. We included only studies of physician-assigned treatments, because the risk factors that influence this approach to the choice of treatments can be better measured and accounted for than those that influence patients' choice of treatments. Therefore, observational studies of physician-assigned treatments may be more valid than observational studies of treatments chosen by patients. Other studies cited by Kunz et al. show that errors in the design or implementation of randomized, controlled trials may bias the results. These studies do not necessarily imply that observational studies are invalid, however, because the types of systematic errors that may occur in controlled trials differ from those that may occur in observational studies.

Liu et al. cite case series on treatments for lymphoma and breast cancer that yielded results that differed from the results of comparable randomized, controlled trials. As indicated in our article, we made the a priori decision not to include case series. These types of studies are subject to many biases that can be avoided in well-done comparative observational studies.

We agree with Friedman about the limitations of randomized, controlled trials. As discussed in our article, these limitations may account for some of the differences between the results of observational studies and those of randomized, controlled trials. Even if the randomized, controlled trial is not a perfect gold standard, however, the frequent agreement between the results of observational studies and those of randomized, controlled trials supports the potential value of observational studies.

Smith and Meier cite an observational study of beta-blockers1 in which the size of the effect was greater than that in the study cited in our article.2 In both of these observational studies the size of the effect varied greatly according to the characteristics of the patients. In the study by Horwitz et al.,2 the sizes of the effects in the observational study and those in the randomized, controlled trial were similar when the patients were similar. Since the patients in the study by Gottlieb et al.1 differed substantially from those in the randomized, controlled trial,3 the sizes of the effects in the two studies are not comparable.

The medical community places great faith in the value of randomized, controlled trials.4 Well-done randomized, controlled trials do provide the best medical evidence. However, blind faith in the exclusive role of randomized, controlled trials has led to the acceptance of poorly conceived reviews and inappropriate discrediting of observational studies. It is time to move beyond such views, toward scientific investigations of the appropriate place for observational studies in evidence-based medicine.

Kjell Benson, B.A.
Arthur J. Hartz, M.D., Ph.D.
University of Iowa College of Medicine, Iowa City, IA 52242-1097

References

  1. Gottlieb SS, McCarter RJ, Vogel RA. Effect of beta-blockade on mortality among high-risk and low-risk patients after myocardial infarction. N Engl J Med 1998;339:489-497.

  2. Horwitz RI, Viscoli CM, Clemens JD, Sadock RT. Developing improved observational methods for evaluating therapeutic effectiveness. Am J Med 1990;89:630-638.

  3. A randomized trial of propranolol in patients with acute myocardial infarction. I. Mortality results. JAMA 1982;247:1707-1714.

  4. Rimm AA, Bortin M. Clinical trials as religion. Biomedicine 1978;28:S60-S63.

Author/Editor Response

Kunz et al. are concerned that we used a selected sample of articles to compare randomized, controlled trials and observational studies. As we reported, our strategy included a comprehensive search of prominent journals for meta-analyses of clinical topics involving both types of research design. The five topics (and corresponding 99 articles) may not be exhaustive but should be representative of the available literature in those journals. The study reported in the companion article by Benson and Hartz and other studies1 — performed with different strategies and different articles but reaching the same conclusion — also add to the credibility of our findings.

Friedman emphasizes the fallibility of individual randomized, controlled trials, just as other correspondents remind us of the fallibility of individual observational studies. We concur but point out that we did not base our analyses on the results of a single randomized, controlled trial or a single observational study, but rather on the averaged result of summary estimates from all the randomized, controlled trials and observational studies on each topic.

Sacks comments that some observational studies are misleading (as are some randomized, controlled trials) but acknowledges that no one has a foolproof method for distinguishing useful studies from misleading ones. His question — “Which type of study design is more likely to give the truth?” — strikes at the core of this debate. Our response is “Either design” when all randomized, controlled trials and observational studies on a particular topic are examined, but “Neither design” when an individual randomized, controlled trial or observational study is examined. On this issue, we agree with Sacks and with Liu et al. that the next individual study (whether an observational study or a randomized, controlled trial) may be misleading.

We sympathize with all who find intellectual security in randomization as a method of ensuring the validity of study results. Surely, however, other methods (matching, stratification, adjustment, and restriction) are available to ensure validity when randomization is absent. One of the implications of our research and the research by Benson and Hartz is that randomized trials have taught investigators how to design and analyze better observational studies (better, for instance, than the single-group trials and case series mentioned by Liu et al.). We should celebrate this enhanced quality of observational studies and the opportunity it provides for evaluating therapies in clinical medicine.
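Of the methods named above, stratification is perhaps the simplest to illustrate. The sketch below is not drawn from our analyses or from any cited study: the counts are hypothetical, invented so that treated patients are mostly low-risk, and the names `Stratum`, `crude_risk_ratio`, and `mantel_haenszel_risk_ratio` are illustrative. It compares a crude risk ratio with a Mantel-Haenszel risk ratio pooled across risk strata.

```python
from typing import NamedTuple, Sequence


class Stratum(NamedTuple):
    treated_events: int
    treated_total: int
    control_events: int
    control_total: int


def crude_risk_ratio(strata: Sequence[Stratum]) -> float:
    """Risk ratio after collapsing all strata into a single 2x2 table."""
    treated_risk = sum(s.treated_events for s in strata) / sum(s.treated_total for s in strata)
    control_risk = sum(s.control_events for s in strata) / sum(s.control_total for s in strata)
    return treated_risk / control_risk


def mantel_haenszel_risk_ratio(strata: Sequence[Stratum]) -> float:
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = sum(
        s.treated_events * s.control_total / (s.treated_total + s.control_total)
        for s in strata
    )
    denominator = sum(
        s.control_events * s.treated_total / (s.treated_total + s.control_total)
        for s in strata
    )
    return numerator / denominator


# Hypothetical counts: within each stratum the treated and control risks are equal
# (5% in the low-risk stratum, 20% in the high-risk stratum), but treated patients
# are concentrated in the low-risk stratum.
strata = [
    Stratum(treated_events=40, treated_total=800, control_events=10, control_total=200),
    Stratum(treated_events=40, treated_total=200, control_events=160, control_total=800),
]

crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude risk ratio: {crude:.2f}")
print(f"stratified (Mantel-Haenszel) risk ratio: {adjusted:.2f}")
```

In this constructed example the crude comparison suggests a benefit that disappears once patients are compared within risk strata. Stratification can remove only the imbalance in factors that were measured and used to define the strata.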

John Concato, M.D., M.P.H.
Nirav Shah, M.D., M.P.H.
Ralph I. Horwitz, M.D.
Yale University School of Medicine, New Haven, CT 06510

References

  1. McKee M, Britton A, Black N, McPherson K, Sanderson C, Bain C. Methods in health services research: interpreting the evidence: choosing between randomised and non-randomised studies. BMJ 1999;319:312-315.
