Reduced Mortality with Hospital Pay for Performance in England
Matt Sutton, Ph.D.,
Silviya Nikolova, Ph.D.,
Ruth Boaden, Ph.D.,
Helen Lester, M.D.,
Ruth McDonald, Ph.D.,
and Martin Roland, D.M.
Abstract
Background
Pay-for-performance programs are being adopted internationally despite little evidence that they improve patient outcomes. In 2008, a program called Advancing Quality, based on the Hospital Quality Incentive Demonstration in the United States, was introduced in all National Health Service (NHS) hospitals in the northwest region of England (population, 6.8 million).
Methods
We analyzed 30-day in-hospital mortality among 134,435 patients admitted for pneumonia, heart failure, or acute myocardial infarction to 24 hospitals covered by the pay-for-performance program. We used difference-in-differences regression analysis to compare mortality 18 months before and 18 months after the introduction of the program with mortality in two comparators: 722,139 patients admitted for the same three conditions to the 132 other hospitals in England and 241,009 patients admitted for six other conditions to both groups of hospitals.
Results
Risk-adjusted, absolute mortality for the conditions included in the pay-for-performance program decreased significantly, with an absolute reduction of 1.3 percentage points (95% confidence interval [CI], 0.4 to 2.1; P=0.006) and a relative reduction of 6%, equivalent to 890 fewer deaths (95% CI, 260 to 1500) during the 18-month period. The largest reduction, for pneumonia, was significant (1.9 percentage points; 95% CI, 0.9 to 3.0; P<0.001), with nonsignificant reductions for acute myocardial infarction (0.6 percentage points; 95% CI, −0.4 to 1.7; P=0.23) and heart failure (0.6 percentage points; 95% CI, −0.6 to 1.8; P=0.30).
Conclusions
The introduction of pay for performance in all NHS hospitals in one region of England was associated with a clinically significant reduction in mortality. As compared with a similar U.S. program, the U.K. program had larger bonuses and a greater investment by hospitals in quality-improvement activities. Further research is needed on how implementation of pay-for-performance programs influences their effects. (Funded by the NHS National Institute for Health Research.)
Introduction
A wide variety of pay-for-performance programs have been developed for health care providers, and such programs are being increasingly adopted internationally with the aim of improving the quality of care.1 Medicare is scheduled to introduce pay for performance in hospitals across the United States in 2013 under its Value-Based Purchasing Program.2 Increased adoption of pay for performance is occurring despite a scant evidence base. According to a review3 published in 2009, only three hospital pay-for-performance programs had been evaluated, and good evidence was available for only one, the Hospital Quality Incentive Demonstration (HQID) adopted by the Centers for Medicare and Medicaid Services in 2003 and supported by Premier. These evaluations4-6 and later articles7-9 show at best modest and short-term effects on hospital processes of care. Evidence of an effect on patient outcomes is even weaker: the HQID has been shown to have no effect on patient mortality,10,11 and a 2011 Cochrane review found no evidence that financial incentives improve patient outcomes.12
Design choices for pay-for-performance programs encompass goals, measures, incentives, and implementation as well as the context in which they are introduced. These may have an important bearing on the effects they have.1 It is rare for similar programs to be introduced in substantially different contexts, but in October 2008, Advancing Quality, a program very similar to the HQID, was introduced in all 24 National Health Service (NHS) hospitals in the northwest region of England (population, 6.8 million) that provided emergency care. Like the HQID, this was a “tournament” system in which only the top performers received a bonus. The program was designed and supported by Premier and included the same indicators and conditions as the HQID. Using patient-level data from all hospitals across England for three conditions included in the program and six conditions not included in the program for 18 months before and 18 months after the introduction of the program, we analyzed the association of this program with patient mortality.
Methods
The Incentive Program
The Advancing Quality program was the first hospital-based pay-for-performance program to be introduced in England. Hospitals were required to collect and submit data on 28 quality measures covering five clinical areas: acute myocardial infarction, coronary-artery bypass grafting, heart failure, hip and knee surgery, and pneumonia.
Like the HQID, Advancing Quality began as a pure tournament system. At the end of the first year, hospitals that reported quality scores in the top quartile received a bonus payment equal to 4% of the revenue that they received under the national tariff for the associated activity. For hospitals in the second quartile, the bonus was 2%. For the next 6 months, the reward system changed so that bonuses could be earned on the basis of three criteria. Hospitals were awarded an “attainment” bonus if their achievement in the second year exceeded the median achievement level from the first year, an “improvement” bonus if their increase in achievement from the first year was in the top quartile of increases in achievement from the first year, and an “achievement” bonus if their level of achievement in the second year was in the top or second quartile of achievement levels in the second year. Hospitals could earn all three bonuses and had to achieve the “attainment” bonus to be eligible for the “improvement” and “achievement” bonuses. There were no penalties for poor performers at any stage.
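The first-year tournament rule described above can be sketched in a few lines. This is a minimal illustration, assuming hospitals are ranked on a single composite quality score and that quartiles are cut by rank; the function name, hospital labels, and scores are hypothetical and are not taken from the program's documentation.

```python
def first_year_bonus_rates(scores):
    """Map each hospital to its first-year bonus rate (fraction of tariff revenue).

    Assumption: quartiles are cut by rank on one composite score, ties ignored.
    Top quartile earns 4%, second quartile earns 2%, all others earn nothing
    (there were no penalties for poor performers).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    rates = {}
    for position, hospital in enumerate(ranked):
        if position < n / 4:
            rates[hospital] = 0.04
        elif position < n / 2:
            rates[hospital] = 0.02
        else:
            rates[hospital] = 0.0
    return rates


# Hypothetical composite scores for 8 hospitals (not real data).
example_scores = {"A": 96.0, "B": 91.5, "C": 89.0, "D": 88.2,
                  "E": 85.0, "F": 83.1, "G": 80.4, "H": 78.9}
example_rates = first_year_bonus_rates(example_scores)
```

With eight hypothetical hospitals, the two highest scorers fall in the 4% band, the next two in the 2% band, and the remaining four receive no bonus.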
Bonuses totaling $5 million (£3.2 million) were paid to hospitals at the end of the first year. Bonuses totaling $2.5 million (£1.6 million) were paid 6 months later. Thereafter, the program was absorbed into a new pay-for-performance program that applied across the whole of England. This was not organized as a tournament, and the new program involved withholding of payments rather than bonuses. We therefore focus in this article on the first 18 months of the program, before these changes were implemented.
At the outset of the program, the chief executive officers of the 24 hospitals collectively agreed that bonuses would be allocated internally to clinical teams whose performance had earned the bonus. This could not be taken as personal income but would be invested in improved clinical care. Quality improvement was supported by other mechanisms, including feedback of data from Premier on performance, centralized support to ensure standardization of data collection, and a range of quality-improvement activities within hospitals. In addition, despite the competitive nature of the program, there were regular shared-learning events for hospitals involved in the program. Composite results were publicly reported on a dedicated website.13
Data
We obtained patient-level data from national Hospital Episode Statistics14 from the NHS Information Centre for Health and Social Care for all patients in England treated for one of three conditions included in the program: acute myocardial infarction, heart failure, and pneumonia. We did not include hip and knee surgery because mortality after elective joint replacement is less than 1%. We also did not consider coronary-artery bypass grafting because this procedure was performed in only 4 of the 24 hospitals in the northwest region of England.
Hospital Episode Statistics in England include deaths that occur in any hospital. We focused on all deaths that occurred within 30 days after admission. Published national statistics15 show that more than 90% of deaths within 30 days after admission for one of the conditions included in the program occur in a hospital. To check that there were no changes in discharge policies that might have led to more deaths outside of hospitals, we also analyzed changes in the proportions of patients discharged to care institutions rather than their own homes.
We obtained equivalent data for patients admitted for six primary diagnoses that were not included in the program. These conditions were chosen by the first, fourth, and last authors on the basis of published statistics at a national level15 to meet the following criteria: no clinical linkage to any condition included in the program, sufficient volume (more than 9000 admissions in England per year), 30-day mortality of more than 6%, and more than 80% of deaths within 30 days after admission occurring in a hospital.
Six diagnoses met these four criteria and were treated as reference conditions: acute renal failure (International Classification of Diseases, 10th Revision [ICD-10] codes beginning with N17), alcoholic liver disease (K70), intracranial injury (S06), paralytic ileus and intestinal obstruction without hernia (K56), pulmonary embolism (I26), and duodenal ulcer (K26). We excluded from the reference group all patients who had a condition included in the program at the time of any of their admissions during the 3-year study period. Our comparators included two mutually exclusive sets of patients — one set with a diagnosis covered by the program who were admitted to hospitals not included in the program and one set with an admission for a reference condition and no diagnosis covered by the program on any admission during the 3-year period.
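The selection of reference patients can be sketched as a prefix match on the primary ICD-10 code followed by the exclusion rule. This is an illustrative sketch only: the function names and the boolean flag `ever_had_program_condition` are hypothetical, and the code normalization (upper-casing, dropping the dot) is an assumption about how codes might be stored.

```python
# ICD-10 prefixes for the six reference conditions, as listed in the text.
REFERENCE_PREFIXES = {
    "N17": "acute renal failure",
    "K70": "alcoholic liver disease",
    "S06": "intracranial injury",
    "K56": "paralytic ileus and intestinal obstruction without hernia",
    "I26": "pulmonary embolism",
    "K26": "duodenal ulcer",
}


def reference_condition(primary_code):
    """Return the reference condition for a primary ICD-10 code, or None."""
    normalized = primary_code.strip().upper().replace(".", "")
    return REFERENCE_PREFIXES.get(normalized[:3])


def in_reference_group(primary_code, ever_had_program_condition):
    """Apply the exclusion rule: drop any patient who had a program condition
    on any admission during the study period (hypothetical precomputed flag)."""
    if ever_had_program_condition:
        return False
    return reference_condition(primary_code) is not None
```

For example, `reference_condition("I26.9")` resolves to pulmonary embolism, whereas a pneumonia code such as `J18.1` matches none of the six prefixes.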
Data were obtained for patients admitted during a 3-year period: April 1, 2007, through March 31, 2010. This period includes 18 months before the introduction of the program and the first 18 months of its operation. The data set included patients treated at the 24 NHS hospitals in the northwest region and the 132 NHS hospitals in all other regions of England. For each condition, the analysis was restricted to hospitals that admitted more than 100 patients for the condition during the 3-year period. The final sample included 410,384 patients with pneumonia (admitted to 154 hospitals; mean number of patients per hospital, 2665 [interquartile range, 1734 to 3353]), 201,003 patients with heart failure (154 hospitals; mean number of patients per hospital, 1305 [interquartile range, 839 to 1680]), 245,187 patients with acute myocardial infarction (154 hospitals; mean number of patients per hospital, 1592 [interquartile range, 951 to 2146]), and 241,009 patients with conditions not included in the program (153 hospitals; mean number of patients per hospital, 1575 [interquartile range, 1035 to 1896]). Hospital characteristics were obtained from the websites of national regulators16,17 and the NHS Information Centre.18
Statistical Analysis
We calculated expected risks of death, using a logistic-regression model at the patient level that included sex and age; the primary ICD-10 diagnosis code; 31 coexisting conditions included in the Elixhauser algorithm, with data derived from secondary ICD-10 diagnosis codes19; the type of admission (emergency or transfer from another hospital); and the location from which the patient was admitted (own home or institution). The analysis of risk-adjusted mortality was performed on data aggregated by the quarter of the year and by admitting hospital.
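The aggregation step can be illustrated as follows. This sketch assumes that patient-level expected risks have already been produced by a fitted logistic-regression model (the fitting itself is not shown), and it uses indirect standardization to turn observed and expected rates into a risk-adjusted rate; that formula is a common convention and an assumption here, not something the text confirms. All names and records are hypothetical.

```python
from collections import defaultdict


def aggregate_by_hospital_quarter(records):
    """Aggregate patient records to (hospital, quarter) cells.

    Each record is (hospital, quarter, died, expected_risk), where died is
    0 or 1 and expected_risk is the modelled probability of death.
    Returns {cell: (n, observed_rate, expected_rate)}.
    """
    sums = defaultdict(lambda: [0, 0.0, 0.0])
    for hospital, quarter, died, expected_risk in records:
        cell = sums[(hospital, quarter)]
        cell[0] += 1
        cell[1] += died
        cell[2] += expected_risk
    return {key: (n, deaths / n, expected / n)
            for key, (n, deaths, expected) in sums.items()}


def risk_adjusted_rate(observed_rate, expected_rate, reference_rate):
    """Indirect standardization (assumed convention): scale a reference rate
    by the observed-to-expected ratio."""
    return observed_rate / expected_rate * reference_rate


# Hypothetical records (not real data).
records = [
    ("H1", "2008Q1", 1, 0.30), ("H1", "2008Q1", 0, 0.10),
    ("H1", "2008Q1", 0, 0.20), ("H1", "2008Q1", 0, 0.20),
    ("H2", "2008Q1", 1, 0.25), ("H2", "2008Q1", 1, 0.25),
]
cells = aggregate_by_hospital_quarter(records)
```

In this toy data, hospital H1 has an observed rate above its expected rate, so its risk-adjusted rate would sit above whatever reference rate is chosen.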
We tested whether the incentives had an effect on mortality in three ways: a between-region difference-in-differences analysis that compared the changes in mortality over time between the northwest region and the rest of England for conditions included in the program, a within-region difference-in-differences analysis that compared the changes in mortality over time between the conditions included in the program and those not included in the program in the northwest region of England, and a triple-difference analysis that compared the changes over time in mortality between the conditions included in the program in the northwest region and those in the rest of England and between the conditions included in the program and those not included in the program. The triple-difference analysis captured the effect of the program on mortality for the conditions included in the program in the northwest region, controlling for the effects of changes over time in mortality for the conditions included in the program owing to factors other than the initiative itself, in addition to changes over time in overall mortality in the northwest region and differences in mortality between the conditions included in the program and those not included in the program between the northwest region and the rest of England.
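The logic of the three comparisons can be shown with simple before-and-after rates. This is an unadjusted illustration only: the published estimates come from regression models, and every rate below is a hypothetical value chosen for easy arithmetic, not a figure from the study. The function and variable names are likewise illustrative.

```python
def change(before, after):
    """Change over time in a mortality rate, in percentage points."""
    return after - before


def difference_in_differences(treated, control):
    """Treated change minus control change for two (before, after) pairs."""
    return change(*treated) - change(*control)


def triple_difference(nw_included, rest_included, nw_excluded, rest_excluded):
    """Between-region difference-in-differences for included conditions minus
    the same quantity for conditions not included in the program."""
    return (difference_in_differences(nw_included, rest_included)
            - difference_in_differences(nw_excluded, rest_excluded))


# Hypothetical (before, after) mortality rates in percent.
nw_included = (22.0, 20.0)     # northwest, program conditions
rest_included = (20.0, 19.0)   # rest of England, program conditions
nw_excluded = (13.0, 12.5)     # northwest, reference conditions
rest_excluded = (12.0, 11.75)  # rest of England, reference conditions

between_region = difference_in_differences(nw_included, rest_included)  # -1.0
within_region = difference_in_differences(nw_included, nw_excluded)     # -1.5
triple = triple_difference(nw_included, rest_included,
                           nw_excluded, rest_excluded)                  # -0.75
```

A negative value means that mortality fell by more in the intervention group than in its comparator.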
We estimated the effect of the program for all three included conditions combined and then for each condition separately. Each analysis allowed flexibly for time trends by including a binary variable for each of the 12 quarter years and allowed for differences among hospitals by including a binary variable for each hospital. Each analysis also included an interaction term between the intervention group and the postimplementation period; the coefficient on this term is the difference-in-differences estimate.
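One way to write the between-region specification described above is given below. The notation is ours and is an assumption for illustration, not the authors' published equation.

```latex
m_{hq} = \alpha_h + \tau_q + \delta \,(\mathrm{NW}_h \times \mathrm{Post}_q) + \varepsilon_{hq}
```

Here $m_{hq}$ is risk-adjusted mortality for hospital $h$ in quarter $q$, $\alpha_h$ is the coefficient on the binary variable for hospital $h$, $\tau_q$ is the coefficient on the binary variable for quarter $q$ (12 quarters), $\mathrm{NW}_h$ equals 1 for hospitals in the northwest region, $\mathrm{Post}_q$ equals 1 for quarters after the program began, and $\delta$ is the coefficient on the interaction term.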
Results
Table 1. Characteristics of Patients before and after Introduction of Pay for Performance in the Northwest Region of England (Intervention Region), as Compared with Patients in the Rest of England (Control Region).
Table 2. Characteristics of Hospitals in the Intervention and Control Regions.
The characteristics of the patient populations in the northwest region and the rest of England before and after the introduction of the program are shown in Table 1. For all conditions, patients in the northwest region were slightly younger but had more coexisting conditions. Similar changes over time in patient volumes and patient characteristics were observed in both areas. The profile of hospitals in the northwest region was similar to that in the rest of England (Table 2), with a slight tendency for a smaller percentage of hospitals in the northwest region to have received the lowest ratings by the national regulators for overall care quality and financial management in 2007.
Table 3. Risk-Adjusted Mortality for the Conditions Included in the Pay-for-Performance Program and Those Not Included in the Program, before and after Introduction of the Program in the Northwest Region of England.
Risk-adjusted mortality for all the conditions that we studied decreased during the study period in both the northwest region and the rest of England. The reduction in mortality for conditions included in the program was greater in the northwest region than in the rest of England, decreasing from 21.9% to 20.1% in the northwest region and from 20.2% to 19.3% in the rest of England (Table 3). As compared with overall mortality for conditions not included in the program within the northwest region (within-region difference-in-differences analysis) (Table 3), there was a significantly greater reduction in overall mortality for conditions included in the program of 0.9 percentage points (95% confidence interval [CI], 0.1 to 1.7), with a significant reduction for pneumonia and a nonsignificant reduction for the other two conditions. In a comparison of mortality for the conditions included in the program in the northwest region with mortality for the same conditions in other regions (between-region difference-in-differences analysis) (Table 3), there was again a significantly greater reduction in overall mortality in the northwest region of 0.9 percentage points (95% CI, 0.4 to 1.4), again with individually significant reductions for pneumonia and nonsignificant reductions for the other two conditions.
Combining these two methods (triple-difference analysis) (Table 3) suggested a greater overall reduction in mortality of 1.3 percentage points in the northwest region (95% CI, 0.4 to 2.1; P=0.006). This represents a substantial relative rate reduction of 6% and, during the 18-month period that we studied, equates to a reduction of 890 deaths (95% CI, 260 to 1500) in the total population of 70,644 patients with these conditions in the northwest region of England. There was a significant reduction in mortality for pneumonia (P<0.001), and there were nonsignificant reductions for acute myocardial infarction (P=0.23) and heart failure (P=0.30). The reduction in mortality for conditions not included in the program during the period studied was not significantly different between the northwest region and the rest of England (P=0.36).
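The scale of these figures can be checked with back-of-the-envelope arithmetic. The sketch below uses the rounded values quoted in the text, so its outputs differ slightly from the reported 890 deaths (260 to 1500), which presumably rest on unrounded estimates. Using the preprogram northwest rate of 21.9% as the baseline for the relative reduction is our assumption, and the function names are illustrative.

```python
def deaths_averted(absolute_reduction_pp, patients):
    """Convert a reduction in percentage points into a number of deaths."""
    return absolute_reduction_pp / 100 * patients


def relative_reduction(absolute_reduction_pp, baseline_rate_pct):
    """Absolute reduction as a fraction of the baseline mortality rate."""
    return absolute_reduction_pp / baseline_rate_pct


patients_nw = 70_644  # patients with program conditions in the northwest region

point = deaths_averted(1.3, patients_nw)  # roughly 918 with the rounded estimate
low = deaths_averted(0.4, patients_nw)    # lower confidence limit, roughly 283
high = deaths_averted(2.1, patients_nw)   # upper confidence limit, roughly 1484
rel = relative_reduction(1.3, 21.9)       # roughly 0.059, i.e. about 6%
```

The rounded inputs reproduce the order of magnitude of the reported deaths and the relative reduction of about 6%.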
Our finding that risk-adjusted mortality for the conditions not included in the program decreased by similar amounts in the northwest region and the rest of England suggests that our findings are not explained by higher preintervention mortality or by a general improvement in the quality of care or a reduction in case-mix complexity in the study region. Nonetheless, we performed a wide range of further analyses to test the robustness of our findings (see the Supplementary Appendix, available with the full text of this article at NEJM.org). There were no significant changes in the proportion of patients discharged to care institutions, and all differences were smaller than 0.3 percentage points. We verified that the trends in mortality were similar in the two areas before the introduction of the program. We also checked that our findings were unaffected when we controlled for changes in patient volumes and baseline mortality and when we compared the northwest region with a subset of similar English regions.
Further examination of the additional mortality reductions in the northwest region showed few differences according to hospital type (see the Supplementary Appendix). Small hospitals and hospitals rated as having “excellent” or “good” quality services by the national regulator before the program showed the largest mortality reductions. Hospitals in the northwest region that were rated as having “weak” or “fair” quality services before the program did not reduce mortality more than did similar hospitals in other regions.
Discussion
Currently, there is little evidence that pay for performance has an effect on patient outcomes,12 but reviews of published studies stress the importance of the design of the measures and incentives, approaches to implementation, and the context in which they are introduced.1 We took advantage of a unique initiative in which a hospital quality-improvement program that was developed in the United States (the HQID) was introduced in England. We used as a natural experiment the fact that this program was introduced in only one region and found that the introduction of pay for performance was associated with a reduction of 1.3 percentage points in the combined mortality for the three conditions studied.
Performance reported by the participating hospitals improved on all the quality measures — particularly heart failure and pneumonia — during the first 18 months of the program (see the Supplementary Appendix). However, previous studies21-24 have shown weak links between these process measures and mortality. No data are available regarding the performance of hospitals on these measures before the introduction of the program or on the performance of hospitals outside the study region. However, we think that it is very unlikely that improved performance on the process measures alone could explain the reduced mortality that we observed.
Key questions are how and why this program was associated with reduced mortality when previous studies have found little evidence of an effect of pay for performance on outcomes,12 including studies of the HQID in the United States.10,11 The quantitative analysis reported here was part of a mixed-methods evaluation in which we observed meetings and interviewed more than 250 clinicians and managers over a period of 18 months, and we draw on this work to interpret our findings. Participating hospitals adopted a range of quality-improvement strategies in response to the program, including the use of specialist nurses and the development of new or improved data-collection systems linked to regular feedback about performance to clinical teams. Despite the “tournament” style of the program, staff from all participating hospitals met face to face at regular intervals to share problems and learning, particularly in relation to pneumonia, for which compliance with clinical pathways presented particular challenges and for which we found the largest reduction in mortality. Face-to-face communication, pan-regional participation, and the smaller size of the program in England may have made interaction at these events more productive than interaction at the similar shared-learning events that were run as “webinars” in the HQID. Other design differences may also be important. In particular, the larger size of the bonuses and the greater probability of earning bonuses in this program as compared with the HQID may explain why hospitals made substantial investments in quality improvement. The largest bonuses were 4%, as compared with 2% in the HQID, and the proportion of hospitals that earned the highest bonuses was 25%, as compared with 10% in the HQID.
In addition, the participation process may be important. To participate in the HQID, hospitals had to be subscribers to the Premier quality-benchmarking database and agree to participate and not withdraw from the program within 30 days after the results were announced. The 255 hospitals that participated represented just 5% of the 4691 acute care hospitals across the United States.5 In contrast, the English program was a geographically defined initiative with participation of all NHS hospitals in the region. This eliminated the possibility of participation by a self-selected group that might already be high performers or whose staff might be more motivated to improve. Further research would be required to identify whether pay-for-performance programs are more effective when participation is universal.
Our finding that a program that appeared similar to a U.S. initiative was associated with different results in England reinforces the message from previous research1 that details of the implementation of incentive programs and the context in which they are introduced may have an important bearing on their outcome. We cannot be certain from these results what caused the reduced mortality associated with the introduction of financial incentives for hospitals in England, but the possibility of a substantial effect of the incentives on mortality cannot be excluded.
Funding and Disclosures
The views and opinions expressed in this article are those of the authors and do not necessarily reflect those of the National Institute for Health Research or the National Health Service.
Supported by the National Health Service National Institute for Health Research.
Disclosure forms provided by the authors are available with the full text of this article at NEJM.org.
Author Affiliations
From the Centre for Health Economics, Institute of Population Health (M.S., S.N.), and Manchester Business School (R.B.), University of Manchester, Manchester, Primary Care Clinical Sciences, University of Birmingham, Birmingham (H.L.), the Business School, University of Nottingham, Nottingham (R.M.), and Cambridge Centre for Health Services Research, University of Cambridge, Cambridge (M.R.) — all in the United Kingdom.
Address reprint requests to Dr. Sutton at the Centre for Health Economics, Institute of Population Health, University of Manchester, Rm. 1.304, Jean McFarlane Bldg., Oxford Rd., Manchester M13 9PL, United Kingdom, or at [email protected].
References (24)
1. Van Herck P, de Smedt D, Annemans L, Remmen R, Rosenthal MB, Sermeus W. Systematic review: effects, design choices, and context of pay-for-performance in health care. BMC Health Serv Res 2010;10:247
3. Mehrotra A, Damberg CL, Sorbero MES, Teleki SS. Pay for performance in the hospital setting: what is the state of the evidence? Am J Med Qual 2009;24:19-28
4. Grossbart SR. What's the return? Assessing the effect of “pay-for-performance” initiatives on the quality of care delivery. Med Care Res Rev 2006;63:Suppl:29S-48S
7. Damberg CL, Raube K, Teleki SS, de la Cruz E. Taking stock of pay for performance: a candid assessment from the front lines. Health Aff (Millwood) 2009;28:517-525
9. Werner RM, Kolstad JT, Stuart EA, Polsky D. The effect of pay-for-performance in hospitals: lessons for quality improvement. Health Aff (Millwood) 2011;30:690-698
12. Flodgren G, Eccles MP, Shepperd S, Scott A, Parmelli E, Beyer FR. An overview of reviews evaluating the effectiveness of financial incentives in changing healthcare professional behaviours and patient outcomes. Cochrane Database Syst Rev 2011;7:CD009255
19. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care 2005;43:1130-1139
20. Marini G, Miraldo M, Jacobs R, Goddard M. Giving greater financial independence to hospitals -- does it make a difference? The case of English NHS trusts. Health Econ 2008;17:751-775
22. Jha AK, Orav J, Li Z, Epstein AM. The inverse relationship between mortality rates and performance in the Hospital Quality Alliance measures. Health Aff (Millwood) 2007;26:1104-1110
23. Ryan AM, Burgess JF Jr, Tompkins CP, Wallack SS. The relationship between Medicare's process of care quality measures and mortality. Inquiry 2009;46:274-290
24. Bhattacharyya T, Freiberg AA, Mehta P, Katz JN, Ferris T. Measuring the report card: the validity of pay-for-performance metrics in orthopedic surgery. Health Aff (Millwood) 2009;28:526-532