https://clinton.presidentiallibraries.us/files/original/1405ac43fe78b12fd46a6a86febd230d.pdf

FOIA Number: 2006-0810-F

FOIA MARKER

This is not a textual record. This is used as an administrative marker by the William J. Clinton Presidential Library Staff.

Collection/Record Group: Clinton Presidential Records
Subgroup/Office of Origin: Health Care Task Force
Series/Staff Member:
Subseries:
OA/ID Number: 1230
FolderID:
Folder Title: Quality Briefing Book [8]
Stack: S
Row: 51
Section: 6
Shelf: 5
Position: 2

Withdrawal/Redaction Sheet
Clinton Library

DOCUMENT NO. AND TYPE: 001. list
SUBJECT/TITLE: Health Care Task Force Working Group 9 [partial] (2 pages)
DATE: n.d.
RESTRICTION: P6/b(6)

COLLECTION: Clinton Presidential Records, Health Care Task Force
OA/Box Number: OA/ID 1230
FOLDER TITLE: Quality Briefing Book [8]
2006-0810-F
ke217

RESTRICTION CODES

Presidential Records Act - [44 U.S.C. 2204(a)]
P1 National Security Classified Information [(a)(1) of the PRA]
P2 Relating to the appointment to Federal office [(a)(2) of the PRA]
P3 Release would violate a Federal statute [(a)(3) of the PRA]
P4 Release would disclose trade secrets or confidential commercial or financial information [(a)(4) of the PRA]
P5 Release would disclose confidential advice between the President and his advisors, or between such advisors [(a)(5) of the PRA]
P6 Release would constitute a clearly unwarranted invasion of personal privacy [(a)(6) of the PRA]
C. Closed in accordance with restrictions contained in donor's deed of gift.
PRM. Personal record misfile defined in accordance with 44 U.S.C. 2201(3).
RR. Document will be reviewed upon request.

Freedom of Information Act - [5 U.S.C. 552(b)]
b(1) National security classified information [(b)(1) of the FOIA]
b(2) Release would disclose internal personnel rules and practices of an agency [(b)(2) of the FOIA]
b(3) Release would violate a Federal statute [(b)(3) of the FOIA]
b(4) Release would disclose trade secrets or confidential or financial information [(b)(4) of the FOIA]
b(6) Release would constitute a clearly unwarranted invasion of personal privacy [(b)(6) of the FOIA]
b(7) Release would disclose information compiled for law enforcement purposes [(b)(7) of the FOIA]
b(8) Release would disclose information concerning the regulation of financial institutions [(b)(8) of the FOIA]
b(9) Release would disclose geological or geophysical information concerning wells [(b)(9) of the FOIA]

Clinton Presidential Records
Digital Records Marker

This is not a presidential record. This is used as an administrative marker by the William J. Clinton Presidential Library Staff.

This marker identifies the place of a tabbed divider. Given our digitization capabilities, we are sometimes unable to adequately scan such dividers. The title from the original document is indicated below.

Divider Title: 9

Tab I
Research and Technology Assessment

1. Article: "The Appropriateness of Use of Coronary Artery Bypass Graft Surgery in New York State," by Lucian Leape, Lee Hilborne, Rolla Park, Steven Bernstein, Caren Kamberg, Marjorie Sherwood, and Bob Brook, Journal of the American Medical Association, February 10, 1993.
2. Article: "The Appropriateness of Use of Percutaneous Transluminal Coronary Angioplasty in New York State," by Lee Hilborne, Lucian Leape, Steven Bernstein, Rolla Park, Mary Fiske, Caren Kamberg, Carol Roth, and Robert Brook, Journal of the American Medical Association, February 10, 1993.
3. Article: "The Appropriateness of Use of Coronary Angiography in New York State," by Steven Bernstein, Lee Hilborne, Lucian Leape, Mary Fiske, Rolla Park, Caren Kamberg, and Robert Brook, Journal of the American Medical Association, February 10, 1993.
4. Report excerpt: Healthy People 2000. Department of Health and Human Services.
5. Article: "Small Area Analysis and the Medical Care Outcome Problem," by John Wennberg, AHCPR Conference Proceedings: Research Methodology: Strengthening Causal Interpretations of Nonexperimental Data, May 1990.
6. Article: "The Appropriateness of Hysterectomy: A Comparison of Care in Seven Health Plans," by Steven Bernstein, et al, Journal of the American Medical Association, May 12, 1993.
7. Article: "Effects of the National Institutes of Health Consensus Development Program on Physician Practice," by Jacqueline Kosecoff, et al, Journal of the American Medical Association, Nov. 20, 1987.

For Official Use Only
5/5/93

Titles:
(1) The Appropriateness of Use of Coronary Artery Bypass Graft Surgery in New York State
(2) The Appropriateness of Percutaneous Transluminal Coronary Angioplasty in NY State
(3) The Appropriateness of Use of Coronary Angiography in NY State

These articles should be considered together.

Implication for Health Care Reform: In a state where performance of cardiovascular procedures is heavily regulated, great uncertainty still is identified in practice patterns.

                          % Crucial   % Appropriate   % Uncertain   % Inappropriate
Coronary Bypass Surgery   82          8               7             2
Coronary Angioplasty      35          23              38            4
Coronary Angiography      64          11              20            4

Full disclosure of all material information to consumers (including information based on the best-practice guidelines), including meaningful education, could reduce the frequency of inappropriate, uncertain, and possibly even the appropriate-but-not-crucial procedures. Full disclosure would empower consumers and increase their satisfaction. Providers as well as consumers need sound practice guidelines to serve as the basis for such dialogue.

Preliminary Staff Working Paper for Illustrative Purposes Only

Original Contributions

The Appropriateness of Use of Coronary Artery Bypass Graft Surgery in New York State

Lucian L. Leape, MD; Lee H. Hilborne, MD, MPH; Rolla Edward Park, PhD; Steven J. Bernstein, MD, MPH; Caren J. Kamberg, MSPH; Marjorie Sherwood, MD; Robert H. Brook, MD, ScD

Objective: To determine the appropriateness of use of coronary artery bypass graft surgery in New York State.

Design: Retrospective randomized medical record review.

Setting: Fifteen randomly selected hospitals in New York State that provide coronary artery bypass graft surgery.

Patients: Random sample of 1338 patients undergoing isolated coronary artery bypass graft surgery in New York State in 1990.

Main Outcome Measures: Percentage of patients who had bypass surgery for appropriate, inappropriate, or uncertain indications; operative (30-day) mortality; and complications.

Results: Nearly 91% of the bypass operations were rated appropriate; 7%, uncertain; and 2.4%, inappropriate. This low inappropriate rate differs substantially from the 14% rate found in a previous study of patients operated on in 1979, 1980, and 1982. The difference in rates was not due to more lenient criteria but to changes in practice, the most important being that the fraction of patients receiving coronary artery bypass grafts for one- and two-vessel disease fell from 51% to 24%. Individual hospital rates of inappropriateness (0% to 5%) did not vary significantly. Rates of appropriateness also did not vary by hospital location, volume, or teaching status. Operative mortality was 2.0%; 17% of patients suffered a complication. Complication rates varied significantly among hospitals (P<.01) and were higher in downstate hospitals.

Conclusions: The rates of inappropriate and uncertain use of coronary artery bypass graft surgery in New York State were very low. Rates of inappropriate use did not vary significantly among hospitals, or according to region, volume of bypass operations performed, or teaching status.

(JAMA. 1993;269:753-760)

From RAND (Drs Leape, Hilborne, Park, Bernstein, and Brook, and Ms Kamberg) and Value Health Sciences Inc (Dr Sherwood), Santa Monica, Calif; Harvard School of Public Health, Boston, Mass (Dr Leape); the Departments of Medicine (Drs Hilborne and Brook) and Pathology and Laboratory Medicine (Dr Hilborne), the School of Medicine, and the School of Public Health (Dr Brook), UCLA, Los Angeles, Calif; and the Schools of Medicine and Public Health, University of Michigan, Ann Arbor (Dr Bernstein).

Reprint requests to RAND, 1700 Main St, Santa Monica, CA 90406-2398 (Dr Brook).

See also pp 761, 766, and 794.

AT THE REQUEST of the Cardiac Advisory Committee of the state of New York, we conducted a study of the appropriateness of use of coronary artery bypass graft (CABG) surgery, percutaneous transluminal coronary angioplasty (PTCA), and coronary angiography in New York State in 1990. New York differs from most other states in that the Department of Health has limited the number of centers where cardiac procedures are performed. Before expanding the number of centers that perform these procedures, the state wished to know how appropriately they were being used. In this article, we present a detailed description of our methods and the results for CABG. In subsequent articles in this series we present results for PTCA and coronary angiography.

Coronary artery bypass graft surgery is one of the most commonly performed operations. For some patients with coronary atherosclerosis, it has been shown to be lifesaving, and in many others it relieves angina. However, a previous study of patients operated on in three randomly selected hospitals in a western state in 1979, 1980, and 1982 showed a significant fraction (14%) of inappropriate use. Inappropriate use was defined as performing the procedure under circumstances where the medical risks exceeded the medical benefits.

Since that time, the practice of coronary revascularization has changed remarkably. Bypass surgery has become safer and medical therapy has improved.
Most importantly, PTCA has emerged as an alternative method of revascularization. For all of these reasons it is appropriate to reassess whether there is still substantial overuse of coronary artery bypass surgery.

JAMA, February 10, 1993, Vol 269, No. 6. Coronary Artery Bypass Graft, Leape et al. 753

Assessing the appropriateness of use of any procedure for a particular clinical indication (scenario) depends on evaluating at a point in time what is known about the probabilities and values (utilities) of the possible outcomes that will occur if the procedure is or is not used. If the value of the benefits (prolonged life, relief of pain, and cure of disease) outweighs the value of the risks (operative mortality, complications, pain, and anxiety), then performing the procedure is appropriate.

There are two sources of information for assessing appropriateness: outcome data and judgments of experts. Each has its strengths and weaknesses. Assessing appropriateness by analysis of outcomes is the ideal. If complete, consistent, and generalizable evidence about the risks and benefits of applying a procedure were available for each of its possible indications, the assessment of appropriateness could be made solely on the basis of those data.

There are several reasons why this is virtually never possible. First, for any procedure, there are literally hundreds of substantially different clinical scenarios for which it might be beneficial. Outcome information is never available for all of these uses or, for that matter, for more than a small fraction of the most common indications.

Second, outcome studies are often outdated by the time the results are available. The pace of technologic change is such that by the time information from even well-designed and well-executed randomized trials is available the nature of the treatment may have changed so significantly that the conclusions are not sufficient for making a decision in a specific patient.
Thus, while more and better outcome data are desperately needed, it is likely that the data will always lag clinical advance and be incomplete.

Third, even current outcome data can seldom be used as is. Data from similar studies may conflict, the conditions under which studies are carried out and the selection of patients vary, and findings in one population may not be generalizable to another. Like all scientific data, outcome data must be evaluated and interpreted before they can be applied.

Even though outcome data are often inadequate, decisions must nevertheless be made every day by myriads of patients and physicians about whether procedures should or should not be used. The RAND/UCLA appropriateness method deals with the deficiencies of outcome data by asking experts to provide an assessment of appropriateness after they have reviewed the available information. It recognizes that physicians have a wealth of knowledge from their education and experience that enables them to make sound judgments about the validity of the outcome data as well as for situations where data are absent. It also recognizes that for a great many clinical situations a consensus does, in fact, exist.

The strengths of the appropriateness method are that it evaluates all available outcome information, it is efficient and comprehensive, and the recommendations are applicable at the time they are rendered. The weaknesses of the method are that it is limited by the available outcome data and group judgments are subjective. Because of the latter, rigorous methods must be used to structure the way in which the decisions are framed and the manner in which the judgments are rendered. Studies of appropriateness are therefore not a substitute for outcome studies but a way to define at a point in time, using the best available data and expert methods, which services are and which are not appropriate for individual patients.
METHODS

Development of Indications and Appropriateness Criteria

Criteria for measuring appropriateness were developed by a previously described method. First, the relevant literature published from 1971 to 1990 concerning effectiveness and risks of CABG was reviewed. A total of 670 articles were abstracted. The results of these studies were synthesized into an annotated summary of the evidence for effectiveness and risks for each of the indications for CABG.

Next, based on the literature review and consultation with experts in cardiology and cardiac surgery, a set of clinical scenarios, which we call indications, was derived that encompassed all possible reasons (both appropriate and inappropriate) for performing CABG that might arise in clinical practice. Each indication consisted of a unique combination of clinical information and other factors that are considered in recommending surgery. Each indication is specified in sufficient detail that patients within a given indication would be reasonably homogeneous, and performing bypass surgery for the indication would be equally appropriate or inappropriate for all patients with that indication. An example of a typical indication is CABG indicated within 21 days of an acute myocardial infarction in a patient who has continuing pain, a low operative risk, an ejection fraction of 15% to 35%, and in whom coronary angiography has demonstrated significant triple-vessel disease. Each term in the indication is defined in a glossary that accompanies the indications.

There were a total of 996 indications for CABG, organized into eight groups, called "chapters," according to presenting symptoms: chronic stable angina, unstable angina, during an acute myocardial infarction, within 21 days following a myocardial infarction, asymptomatic, near sudden death, complication of coronary angioplasty or angiography, and CABG performed with valve surgery.
Indications were arranged within each chapter according to the extent of significant anatomic disease as revealed by coronary angiography (eg, left main, three vessel), level of operative risk, the results of an exercise stress test or thallium scan, ejection fraction, anginal class, adequacy of medical therapy, and the patient's comorbidity as assessed by our modified Parsonnet score. The definitions for the specific factors were developed and agreed on by the expert panel that later rated the indications for appropriateness. For example, significant arterial disease was defined by the panel as (1) a reduction in the luminal diameter of 50% or more, and (2) for all but the left main coronary artery, a reduction of at least 70% in the lumen of at least one vessel.9,10

Panel Selection and Appropriateness Ratings

Nine expert clinicians were selected from nominations provided by the relevant specialty societies: the American College of Cardiology, American Heart Association, Society of Thoracic Surgeons, American Association for Thoracic Surgery, American College of Physicians, and American College of Surgeons. Panelists were all highly respected specialists chosen for their expertise and national influence. They represented all geographic regions of the country and both academic and private practice. They were asked to provide their personal judgments, not positions of the societies that nominated them. The panel included three cardiac surgeons, three cardiologists who performed angioplasty, one noninterventional cardiologist, and two internists.

The panel was convened in November 1990. Panelists were provided with the literature review and, after reading it, were asked to rate each indication for the appropriateness of performing CABG using their best clinical judgment and considering an average patient presenting to an average surgeon performing CABG surgery in 1990. Appropriateness was defined to mean that the expected health benefit (quality of life and/or longevity) exceeded the expected negative consequences (pain, disability, and risk of death) by a sufficient margin that the procedure was worth performing. Cost of the procedure was not considered in the appropriateness rating. Extremely appropriate indications were rated as 9, extremely inappropriate indications as 1, and those neither appropriate nor inappropriate as 5.

The ratings were confidential and took place in two rounds, using a modified Delphi process. The first round of ratings was performed at home. These results were then collated and presented to the panelists at a second round during a 2-day meeting attended by all panelists. Each panelist received the anonymous ratings of all the other panelists as well as a reminder of his own ratings. The panel reconsidered and refined the definitions of some of the factors. In addition, the panel provided ratings for additional chapters: ventricular arrhythmias, congestive heart failure, and postmyocardial infarction after 21 days.

Because CABG and PTCA are often alternative treatments, each indication was rated three ways: appropriateness of CABG in a patient who is not also a candidate for PTCA, appropriateness of CABG in a patient who is a candidate for both PTCA and CABG, and appropriateness of PTCA compared with medical therapy. This required each panelist to provide nearly 3000 appropriateness ratings.

Appropriateness Scores

The final appropriateness rating was the median of the nine panelists' ratings after the second round of ratings. An indication was considered appropriate if the median rating was 7 to 9, inappropriate if the median rating was 1 to 3, and uncertain if the median rating was 4 to 6. In addition, an indication was considered uncertain if there was disagreement, regardless of the median rating.
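The scoring rule in this subsection can be sketched in a few lines of Python. This is an illustration only, not the authors' software: the function name `classify_indication` is hypothetical, and the disagreement test follows the definition given in the text (more than two panelists rating in the 1 to 3 range and more than two in the 7 to 9 range), read here as a condition on each range separately, which is an assumption.

```python
from statistics import median


def classify_indication(ratings):
    """Classify one indication from the nine panelists' 1-9 ratings.

    Hypothetical helper illustrating the rule described in the text.
    """
    if len(ratings) != 9 or any(r < 1 or r > 9 for r in ratings):
        raise ValueError("expected nine ratings on the 1-9 scale")

    # Count panelists in each extreme range.
    low = sum(1 for r in ratings if r <= 3)    # inappropriate range, 1 to 3
    high = sum(1 for r in ratings if r >= 7)   # appropriate range, 7 to 9

    # Assumed reading of "disagreement": more than two panelists in EACH range.
    if low > 2 and high > 2:
        return "uncertain"

    # With nine integer ratings the median is always one of the ratings.
    m = median(ratings)
    if m >= 7:
        return "appropriate"
    if m <= 3:
        return "inappropriate"
    return "uncertain"
```

A panel split such as `[1, 2, 3, 7, 7, 8, 8, 9, 9]` has a median of 7 but is classified as uncertain under this reading, because three panelists rated in the 1 to 3 range and six in the 7 to 9 range.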
Disagreement was defined as more than two panelists' assigning a rating in both the inappropriate range (1 to 3) and the appropriate range (7 to 9). Four percent of ratings were with disagreement.

After computation of the appropriateness scores, ratings that were appropriate for either bypass or angioplasty were returned to the panelists, who in a third round rated these appropriate indications for necessity, ie, was the procedure of crucial importance. An indication was defined as crucial or necessary if a panelist believed that a physician has an obligation to recommend CABG or PTCA because it is clearly the best option available to the patient. A procedure was considered crucial to the extent that all four of the following criteria were met: (1) the procedure was appropriate without disagreement, (2) it would be improper care not to provide this service for most patients, (3) the likelihood of benefit was significant, and (4) the extent of the benefit was not small. An indication was most likely to be rated as crucial when there were outcome data confirming the effectiveness of bypass surgery (such as in the treatment of left main coronary artery disease). An indication could be appropriate, ie, of benefit and preferable to the alternatives, without being crucial.

The literature review, listing of all 2990 appropriateness ratings, definitions of terms, and the final panel ratings of appropriateness and necessity have been published as a monograph available from RAND, Santa Monica, Calif.10

Data Collection and Sample

Using the indications, definitions, and ratings provided by the expert panel, a medical record abstraction form was created to capture the data needed to determine the appropriateness of performing CABG in the sample patients. Under the supervision of the Island Peer Review Organization, medical records were abstracted by experienced nurses trained in the use of the form.
All abstracted records were reviewed by an Island Peer Review Organization nurse supervisor for completeness, accuracy, and consistency. Photocopies of the admission note, the discharge summary, and reports of stress tests, echocardiograms and other noninvasive tests, coronary angiograms, and operative notes were provided for interpretation by the physician overreader. Each abstract was then reviewed by a RAND physician who coded the results of the key tests and the angiogram. Each patient was then assigned to a specific clinical chapter (eg, chronic stable angina or unstable angina).

To ensure confidentiality of information, we assigned coded identifiers to patients, hospitals, and physicians. Once the data collection process was completed, the files linking these identifiers were destroyed.

We obtained a sample of patients who had CABG surgery in 1990 in nonfederal hospitals in New York State by means of a two-step sampling process. First, we randomly selected a sample of hospitals stratified according to two characteristics, upstate or downstate location and volume of CABG operations performed in 1989. Downstate location included New York City, Long Island, and Westchester County, and upstate was the remainder. Low-volume hospitals were those that performed fewer than 325 operations. (Twenty percent of patients receiving CABGs in 1989 in New York were operated on in hospitals performing fewer than 325 CABG operations that year.) Four hospitals were excluded from the sample because the programs were new (one), temporarily suspended (one), or the volume of cases was insufficient to provide 90 cases for study (two). In each of the four strata we randomly sampled approximately equal numbers of hospitals performing CABGs. Fifteen of 30 hospitals performing CABGs were selected.

To obtain our desired sample of 90 patients per hospital, we reviewed a random selection of 1426 medical records.
Fifty-five records were excluded because another major procedure was performed in conjunction with CABG or because the procedure was miscoded as CABG. Twenty records (1.4%) were not located. Of 1351 records in the final sample, 13 records (1.0%) were excluded because critical data were missing and could not be obtained from the referring physician. A total of 1338 records were abstracted for analysis.

The results of the exercise stress test were frequently not in the record. For patients in whom the results of the test would affect the rating of appropriateness (predominantly patients undergoing elective CABG for single- and two-vessel disease) we requested a report from the referring physician. We obtained all but 10 of these missing reports; for these 10 patients, we assumed that the stress test had not been done or that it was not strongly positive.

Analysis

We assigned an indication to each patient based on the information abstracted from the record. In cases where more than one indication applied to a patient, we assigned the one that had the higher appropriateness score. Patients who were candidates for both CABG and PTCA and for whom the panel rated PTCA more appropriate (ie, the rating of CABG for a patient who is a PTCA candidate equals 1 to 3 without disagreement) constituted a special group. In accordance with the panel's decision, these patients were given a rating one category lower than the rating that would have applied for CABG if they had not been PTCA candidates (eg, rating of "uncertain" if the CABG rating is "appropriate" when the patient is not a candidate for PTCA).

In addition to appropriateness, we analyzed surgical mortality, which we defined as in-hospital death occurring within 30 days following operation, and major complications by hospital.

All results were population weighted according to the number of cases performed in each institution. All SEs were inflated as necessary to compensate for the design effects of the two-stage sample. Most results are presented as a mean rate and a 95% confidence interval (CI). The CIs for rates were calculated using the normal approximation, and truncated at zero if the approximation extended below zero. Comparisons between two categories are presented as relative risks (RRs) with 95% CIs; these were calculated from bivariate logistic regression results. Differences in distribution across multiple categories were tested using the χ² statistic for the unweighted contingency table.

For comparisons across hospitals or between groups of hospitals, complication rates were standardized for case mix. We used indirect standardization, with the predicted hospital complication rates calculated from logistic regressions with age category, risk category (modified Parsonnet score), angiographic disease category, indications chapter, and emergency status category as independent variables. Differences in standardized complication rates across hospitals were tested by comparing two logistic regressions, one with hospital indicator variables, one without; both regressions included the standardizing variables. Under the hypothesis of no difference among hospitals, twice the difference in log likelihood between the two equations was distributed as χ². Differences in standardized complication rates between groups of hospitals are presented as RRs with 95% CIs; these were calculated from the estimated coefficients of the group indicator variable in logistic regressions that also included the standardizing variables.

RESULTS

Seventy-six percent of the patients were men and 69% were less than 70 years of age. Fourteen percent of patients were 75 years of age or older, and 3% were 80 years of age or older. Three quarters of the bypass operations were performed for either left main (21%) or three-vessel (55%) disease. Four percent of patients had single-vessel disease (Table 1). Ninety-three percent of operations were for one of three clinical chapter categories: chronic stable angina (43%), post-myocardial infarction (28%), or unstable angina (22%) (Table 2). Of the 2990 scenarios rated by the panel, 315 were actually used in this sample of 1338 patients. One patient (0.02%) failed to meet the criteria for significant disease (70% stenosis for at least one vessel [except for left main disease] and 50% stenosis for all other affected arteries).

Table 1. Appropriateness of Coronary Artery Bypass Graft According to Anatomical Disease in 1338 Patients in New York State in 1990

                                                     Appropriateness, %
Location of Disease by Angiography*   No.    %       Appropriate and Crucial   Appropriate   Uncertain   Inappropriate
Left main                             280    21      95                        3             2           0
Three vessels                         735    55      94                        3             3           0
Two vessels, with PLAD†               144    11      58                        25            16          1
Two vessels, other                    125    9       36                        24            29          11
Single vessel, with PLAD              23     2       22                        39            30          9
Single vessel, other                  30     2       17                        21            31          31
Insignificant disease‡                1      0.02    0                         0             0           100

*Minimum of 50% narrowing in all affected vessels, with 70% narrowing in at least one artery (except for left main).
†PLAD indicates proximal left anterior descending artery.
‡Angiographic findings did not meet the minimum criteria.

Overall, 59% of patients were in the low-risk group as judged by our modified Parsonnet score, 31% were in the moderate-risk group, and 10% were in the high-risk group. For the most common categories (chronic stable angina, unstable angina, post-myocardial infarction, and asymptomatic), the percentage of high-risk patients did not vary substantially. Six percent of CABG operations were performed as emergencies.

Nearly 91% of the bypass operations performed in these patients were rated appropriate, 7% uncertain, and 2.4% inappropriate (Table 3). Most of the appropriate cases (82% of all procedures) were also rated as crucial.
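The interval construction described under Analysis (normal approximation, truncated at zero) can be illustrated with a minimal Python sketch. The function name `rate_ci` and the `design_effect` parameter are hypothetical; the paper states that SEs were inflated for the two-stage design and that results were population weighted, but gives no inflation factor, so the default of 1.0 is an assumption and the sketch will not reproduce the published intervals. Capping the upper bound at 1 is also an assumption.

```python
import math


def rate_ci(p, n, design_effect=1.0, z=1.96):
    """Approximate 95% CI for a proportion p observed in n cases.

    Uses the normal approximation and truncates the lower bound at zero,
    as the text describes. `design_effect` (hypothetical parameter) inflates
    the variance to mimic a clustered, two-stage sample.
    """
    if not 0.0 <= p <= 1.0 or n <= 0:
        raise ValueError("p must be in [0, 1] and n must be positive")

    # Simple binomial SE, inflated by the square root of the design effect.
    se = math.sqrt(p * (1.0 - p) / n) * math.sqrt(design_effect)

    lower = max(0.0, p - z * se)   # truncated at zero, per the text
    upper = min(1.0, p + z * se)   # capped at 1 (assumption, not stated in the text)
    return lower, upper
```

For example, `rate_ci(0.823, 1338)` gives an unadjusted interval that is somewhat narrower than the 79.6% to 85.1% reported in Table 3 for appropriate and crucial cases, which is consistent with the published SEs having been inflated for design effects.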
The major reason for an inappropriate rating was use of CABG when, in the panel's judgment, PTCA would have been preferable. More than half of the 28 inappropriate bypass operations (61%) would have been rated as uncertain or appropriate if PTCA had not been an available option for these patients. However, candidacy for PTCA was not a major cause of uncertain ratings. Only 3% of uncertain cases were so rated because PTCA was preferred to CABG. Uncertain ratings also seldom (4%) resulted from polar disagreement of panelists. Instead, uncertain ratings almost always (93%) reflected panel consensus that the benefits and risks were about equal.

Table 2. Selected Data on the Use of Coronary Artery Bypass Graft in 1338 Patients in New York State in 1990 by Clinical Indications Chapter

                                                                                    Appropriateness, %
Indications                                    No.           Patients, %        High Risk, %   Appropriate and Crucial   Appropriate   Uncertain     Inappropriate   Mortality, %
Chronic stable angina                          545           43 (39-48)         6 (4-9)        86 (82-89)                7 (5-9)       5 (3-7)       3 (1-4)         0.5 (0-1)
Unstable angina                                309           22 (19-25)         12 (8-16)      88 (84-91)                6 (3-8)       6 (3-8)       1 (0-2)         [illegible]
Post-myocardial infarction, 6 h-21 d           254           18 (15-21)         10 (6-15)      81 (75-88)                5 (2-9)       11 (5-16)     2 (0-5)         1 (0-3)
Post-myocardial infarction, 22-91 d            141           10 (8-12)          12 (7-17)      77 (70-84)                9 (5-14)      10 (5-15)     4 (1-7)         1 (0-3)
Asymptomatic                                   39            3 (2-5)            16 (4-27)      31 (17-46)                48 (30-65)    16 (4-28)     5 (0-15)        0
Congestive heart failure                       28            2 (2-3)            24 (4-44)      88 (74-100)               10 (0-22)     [illegible]   [illegible]     11 (0-23)
Complication of PTCA or coronary angiography   10            0.9 (0.4-1.5)      72 (42-100)    100                       0             0             0               15 (0-38)
Cardiogenic shock/acute myocardial infarction  9             0.5 (0.1-1.0)      44 (9-78)      [illegible]               [illegible]   [illegible]   [illegible]     30 (0-62)
Near sudden death                              [illegible]   0.1 (0.0-0.3)      [illegible]    [illegible]               [illegible]   [illegible]   [illegible]     [illegible]
Insignificant disease                          1             0.02 (0.00-0.09)   [illegible]    0                         0             0             100             [illegible]
Totals                                         1338                             10 (8-13)      82 (80-85)                8 (7-10)      7 (5-9)       2 (2-3)         2 (1-3)

Values in parentheses are 95% confidence intervals. Percentages do not add up to 100 due to rounding. High risk is risk on admission as assessed by modified Parsonnet score. Mortality is 30-day in-hospital mortality. When the reason for evaluation and subsequent bypass surgery was a myocardial infarction within the 22- to 91-d period, the expert panel rated these patients according to symptom (eg, chronic stable angina, unstable angina, or asymptomatic). PTCA indicates percutaneous transluminal coronary angioplasty.

Table 3. Appropriateness of Coronary Artery Bypass Graft in 1338 Patients in New York State in 1990

Category                  No. (%)       95% Confidence Interval
Appropriate and crucial   1096 (82.3)   79.6-85.1
Appropriate               114 (8.3)     6.6-9.9
Uncertain                 100 (7.0)     5.4-8.6
Inappropriate             28 (2.4)      1.5-3.2

Crucial ratings also reflected a high degree of panel consensus. Of crucial cases, all nine panelists' ratings were in the 7 to 9 range for 80%, and eight of nine ratings were in the 7 to 9 range for an additional 10%. In no crucial case was there more than a single dissenting inappropriate rating.

The distribution of appropriateness did not vary substantially among the major clinical categories (Table 2) but did vary according to extent of anatomic disease (Table 1). Only 2% of operations in patients with left main disease and 3% of those in patients with three-vessel disease were rated less than appropriate, but 28% of operations in patients with two-vessel disease and 52% of operations in patients with one-vessel disease were rated less than appropriate.
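The 28% and 52% figures can be cross-checked against Table 1 by pooling the two-vessel and single-vessel rows. The sketch below is a rough check under stated assumptions: it weights each row by its sample count, whereas the published percentages are population weighted, so agreement is only approximate. The dictionary layout and the function name `less_than_appropriate` are hypothetical.

```python
# Rows from Table 1: (number of cases, % uncertain, % inappropriate).
TABLE_1 = {
    "two vessels, with PLAD": (144, 16, 1),
    "two vessels, other": (125, 29, 11),
    "single vessel, with PLAD": (23, 30, 9),
    "single vessel, other": (30, 31, 31),
}


def less_than_appropriate(rows):
    """Pooled % of cases rated uncertain or inappropriate.

    Weights each row by its sample count (an approximation; the paper
    reports population-weighted percentages).
    """
    total_cases = sum(n for n, _, _ in rows)
    weighted = sum(n * (uncertain + inappropriate)
                   for n, uncertain, inappropriate in rows)
    return weighted / total_cases


two_vessel = less_than_appropriate(
    [TABLE_1["two vessels, with PLAD"], TABLE_1["two vessels, other"]]
)
one_vessel = less_than_appropriate(
    [TABLE_1["single vessel, with PLAD"], TABLE_1["single vessel, other"]]
)
```

Under these assumptions the pooled values round to 28 and 52, matching the percentages quoted in the text.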
Patients with left main and three-vessel disease represented 76% of cases but accounted for 82% of all appropriate cases and 87% of appropriate cases that were also rated crucial. Patients with single-vessel disease comprised 4% of the sample but accounted for 39% of the inappropriate cases and 16% of the uncertain cases. The majority of uncertain cases (59%) were patients with two-vessel disease.

Inappropriate cases had several important characteristics. First, they had less severe disease. One patient did not meet the requirement of 70% stenosis in at least one vessel, and all of the remainder had either one- or two-vessel disease. None of the inappropriate cases had a stress test classified as very positive, and 11 of 28 were asymptomatic at the time of surgery. On the basis of the modified Parsonnet score, inappropriate cases were more likely to be in the high-risk category (19% vs 10% for all patients). Finally, the majority of inappropriate cases were also potential candidates for PTCA. Examples of appropriate, uncertain, and inappropriate cases are presented in Table 4.

Table 4. The Most Frequently Used Indications by Appropriateness Category*
Indications | No. of Cases | Appropriateness Rating
Appropriate†
Chronic stable angina, class I/II, treated with maximal medical therapy, three-vessel disease, ejection fraction >35%, candidate for PTCA, low risk | 61 | 9
Post-myocardial infarction angina, 6 h-21 d, three-vessel disease, ejection fraction >35%, candidate for PTCA, low risk | 60 | 9
Uncertain‡
Post-myocardial infarction, 43-91 d, asymptomatic, with less than strongly positive exercise ECG, three-vessel disease, ejection fraction ≥50%, not candidate for PTCA, low risk | 7 | 6
Post-myocardial infarction, non-Q-wave, asymptomatic, with less than strongly positive exercise ECG, three-vessel disease, ejection fraction >35%, not candidate for PTCA, moderately high risk | 5 | 6
Inappropriate§
Chronic stable angina, class I/II, treated with maximal medical therapy, with less than strongly positive exercise ECG, two-vessel disease without proximal left anterior descending involvement, ejection fraction >35%, candidate for PTCA, low risk | 4 | 3
Asymptomatic, with less than strongly positive exercise ECG, two-vessel disease without proximal left anterior descending involvement, ejection fraction <50%, not candidate for PTCA, low risk | 2 | 3
*PTCA indicates percutaneous transluminal coronary angioplasty; and ECG, electrocardiogram.
†1210 cases were rated appropriate, of which 1096 were also rated crucial.
‡100 cases.
§28 cases.

Appropriateness did not differ significantly across age categories (P=.09), but appropriateness did vary by presenting symptoms (P=.0001), eg, operations in asymptomatic patients were more likely to be rated as uncertain or inappropriate (21%) than were those in all patients (9%) (RR, 2.8; 95% CI, 1.7 to 4.3) (Table 2).

Mortality and Complications

Operative mortality, defined as in-hospital death within 30 days of surgery, was 2.0% overall (Table 2). Operative mortality was significantly higher for patients 75 years of age and older (5.7%) compared with 1.4% in patients less than 75 years of age (RR, 4.2; 95% CI, 1.1 to 13.6). Mortality was significantly higher in patients with cardiogenic shock (30%), PTCA complications (15%), and congestive heart failure (11%) (RR for all three, 10.1; 95% CI, 3.8 to 22.9, compared with all other patients). Mortality was significantly lower for patients with chronic stable angina (0.5%) or those who were asymptomatic (0.0%) (RR for both, 0.1; 95% CI, 0.0 to 0.5).

Complications occurred in approximately 17% of patients (Table 5). Many patients with complications suffered more than one. Nearly 8% required reoperation in the immediate postoperative period, 3% because of continued bleeding or tamponade.
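For readers less familiar with the relative risk (RR) figures quoted above, the following sketch shows one conventional way to compute an RR and its 95% confidence interval from a two-by-two table, using the log-scale (Katz) approximation. This is an assumption about method: the article does not state how its intervals were derived, and its estimates were weighted. The counts below are hypothetical and chosen only to resemble the asymptomatic comparison, so the output is not expected to match the published RR of 2.8.

```python
import math

def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Relative risk of group A vs group B with a log-scale (Katz) interval."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Hypothetical counts: 8 of 39 asymptomatic patients vs 120 of 1299 other
# patients with an uncertain or inappropriate rating.
rr, lower, upper = relative_risk(8, 39, 120, 1299)
print(f"RR {rr:.1f} (95% CI, {lower:.1f} to {upper:.1f})")
```

An interval that excludes 1.0, as here, corresponds to a difference that would be called significant at the conventional level.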
Seven percent of patients had at least one major cardiac complication (perioperative myocardial infarction [2.3%], cardiac arrest [2.9%], arrhythmia requiring defibrillation [1.7%], or insertion of a permanent pacemaker [1.0%]). Nearly 6% of patients required prolonged ventilatory assistance, and 2% of patients suffered a cerebrovascular accident. The use of blood transfusions varied substantially among patients and among hospitals. While 33% of patients received no transfusion, 24% required transfusion of more than 3 U.

Mortality rates and the incidence of all types of complications were closely related to the operative risk as predicted by the modified Parsonnet score. Patients in the high-risk category were much more likely to die (4.5% vs 0.2%) (RR, 21; 95% CI, 5 to 74) or to have complications (32% vs 11%) (OR, 3.0; 95% CI, 2.6 to 3.5) than were patients in the low-risk category.

Interhospital Comparisons

Among hospitals the inappropriateness rate varied from 0% to 5%, and the uncertain rate varied from 3% to 15%, but neither these differences nor their combination were significant (Table 6). However, the variation in the fraction of patients rated appropriate and crucial (71% to 89%) was significant (P=.02). After adjustment for Parsonnet score, severity, age, clinical indication chapter, and emergency operation, differences in operative mortality also were not significant (P=.43) (Table 7). However, risk-adjusted complication rates varied significantly from 9% to 26% (P=.009). The number of patients requiring more than 3 U of blood also varied markedly among hospitals: 5% to 57% (P=.008). The correlations of hospital inappropriateness rates with mortality and with complication rates were small and nonsignificant (r=.01 and -.03, respectively).
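The hospital-level correlations quoted above are ordinary correlation coefficients computed across the 15 hospitals. As a minimal sketch, assuming a Pearson coefficient (the article does not name the statistic), the calculation looks like this. The two lists are hypothetical placeholder values, not the study's data, and `pearson_r` is a helper name introduced for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-hospital rates (%), one entry per hospital.
inappropriate_rate = [2, 4, 1, 0, 2, 2, 3, 2, 5, 2, 0, 5, 1, 0, 2]
complication_rate = [9, 26, 13, 20, 22, 16, 19, 26, 21, 16, 26, 18, 21, 12, 9]
print(round(pearson_r(inappropriate_rate, complication_rate), 2))
```

A coefficient near zero, as the authors report, means that a hospital's inappropriateness rate tells one essentially nothing about its complication or mortality rate.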
Table 5. Complications Following Bypass Surgery
Complication | No. of Patients (%) | 95% Confidence Interval
Any* | 249 (17.1) | 13.8-20.3
Reoperation | 111 (7.7) | 5.7-9.7
Ventilatory assistance for >3 d | 101 (5.9) | 3.6-8.2
Bleeding requiring reoperation | 43 (3.0) | 2.1-3.9
Cardiac arrest | 42 (2.9) | 1.7-4.2
Acute myocardial infarction | 32 (2.3) | 1.5-3.1
Sternal wound infection | 29 (2.1) | 1.1-3.2
Cerebrovascular accident | 27 (1.8) | 1.0-2.5
Groin-wound infection | 25 (1.7) | 1.0-2.4
Arrhythmia requiring defibrillation | 22 (1.7) | 1.0-2.4
Insertion of permanent pacemaker | 16 (1.0) | 0.4-1.6
Acute renal failure | 15 (1.0) | 0.2-1.8
*Some patients had more than one complication.

Table 6. Appropriateness of Performing Coronary Artery Bypass Graft Surgery by Hospital in 1338 Patients in 1990
Appropriateness, %*
Hospital | Appropriate and Crucial† | Appropriate | Uncertain‡ | Inappropriate§
A | 86 | ... | ... | ...
B | 77 | ... | ... | ...
C | 81 | 8 | ... | ...
D | 88 | ... | ... | ...
E | 71 | 12 | 15 | 2
F | 85 | 9 | 3 | 2
G | 89 | 2 | 6 | 3
H | 79 | 12 | 7 | 2
I | 73 | 10 | 12 | 5
J | 79 | 13 | 6 | 2
K | 86 | 7 | 8 | 0
L | 78 | 6 | 11 | 5
M | 88 | 4 | 7 | 1
N | 81 | 11 | 8 | 0
O | 88 | 7 | 3 | 2
*Percentages may not add up to 100 due to rounding.
†P=.02.
‡P=.09 for uncertain and inappropriate combined.
§P=.67.

Table 7. Adjusted Mortality and Complication Rates of Bypass Surgery by Hospital*
Hospital | High Risk, %† | Mortality, %‡ | Complications, Any, %§ | Complications, Cardiac, %‖ | Transfusion >3 U, %¶
A | 7 | 2 | 9 | 4 | 5
B | 9 | 5 | 26 | 13 | 39
C | 6 | 2 | 13 | 2 | 15
D | 19 | 1 | 20 | 8 | 22
E | 13 | 1 | 22 | 9 | 45
F | 4 | 4 | 16 | 5 | ...
G | 10 | 3 | 19 | 7 | 19
H | 7 | 2 | 26 | 7 | 25
I | 14 | 2 | 21 | 5 | 30
J | 7 | 3 | 16 | 10 | 10
K | 18 | 2 | 26 | ... | 57
L | 16 | 2 | 18 | 6 | 39
M | 16 | 0 | 21 | 13 | 29
N | 12 | 2 | 12 | 6 | 17
O | 4 | 0 | 9 | 4 | 15
*Indirectly standardized for Parsonnet score, disease severity, age, indication chapter, and emergency status.
†As judged by modified Parsonnet score on admission.
‡P=.43.
§P=.009.
‖P=.19.
¶P=.008.

Appropriateness did not vary significantly according to hospital CABG volume, location, or teaching status (Table 8). While the fraction of patients with high operative risk did not vary significantly between high- and low-volume hospitals, or between teaching and nonteaching hospitals, patients in upstate hospitals were less likely to be in the high-risk category than those in downstate hospitals (7% vs 12%) (RR, 0.6; 95% CI, 0.4 to 1.0). Complications were also less common in upstate hospitals (13% vs 19%) (RR, 0.7; 95% CI, 0.5 to 0.9), and patients in upstate hospitals were less likely to receive transfusion of more than 3 U of blood (14% vs 29%) (RR, 0.4; 95% CI, 0.2 to 0.6). There were no significant differences in complication rates according to volume or teaching status (Table 8).

COMMENT

This study found that in New York State in 1990 fewer than 3% of CABG operations were performed for inappropriate reasons and 7% for uncertain reasons. These results differ considerably from those reported earlier in which the inappropriate rate was 14% and the uncertain rate was 30%. There are at least four possible explanations for these differences: First, the previous study may not have been representative, ie, the 14% inappropriateness rate might have been higher than the overall rate in the United States as a whole. Second, the appropriateness ratings of the 1990 panel may have changed so that cases previously rated inappropriate would now be rated as appropriate or uncertain. Third, overall practice patterns may have changed so that fewer patients are now being operated on for inappropriate reasons. Fourth, New York State may be atypical; rates of inappropriate and uncertain use may be significantly higher in other regions of the country.

All four reasons probably contributed to the differences. It is possible that the earlier study was not representative of bypass surgery in 1979, 1980, and 1982. It was an analysis of patients treated in three hospitals in one geographic region and may not,
therefore, have been generalizable to the entire country. However, it was a randomized sample of both hospitals and patients, and the fraction of inappropriate use was similar in magnitude to those found for other major procedures.

Could the differences we found in rates of inappropriate and uncertain care merely reflect differences in panel ratings, not differences in practice? There are at least three reasons that panel ratings might differ: First, ratings would (and should) change in response to new information from outcome studies that alter the benefit-risk ratio for certain clinical scenarios. Second, scenarios could be defined differently. Third, the 1990 expert panel might have been more lenient.

The first of these did occur: outcome data published between the studies demonstrated increased benefit of CABG for a wider range of indications, such as patients with three-vessel disease without reduced left ventricular function and patients with two-vessel disease with a strongly positive stress test. The mortality of elective CABG also decreased during this period, overall and for patients with poor ventricular function, shifting the benefit-risk ratio for some patients.

Table 8. Appropriateness, Percentage of High-Risk Patients, and Adjusted Complication Rates by Selected Hospital Characteristics*
Characteristics | Volume†: Low, % | Volume†: High, % | Location‡: Upstate, % | Location‡: Downstate, % | Teaching Hospital§: Yes, % | Teaching Hospital§: No, %
Appropriateness: Appropriate and crucial | ... | ... | ... | ... | ... | ...
Appropriateness: Appropriate | ... | ... | ... | ... | ... | ...
Appropriateness: Uncertain | ... | ... | ... | ... | ... | ...
Appropriateness: Inappropriate | ... | ... | ... | ... | ... | ...
High-risk patients | 13 (9-16) | 10 (7-13) | 7 (5-10) | 12 (9-15) | 13 (9-18) | 9 (6-11)
Complications¶ | 22 (17-28) | 16 (13-20) | 13 (9-18) | 19 (16-22) | 20 (16-24) | 15 (11-19)
*Numbers in parentheses are 95% confidence intervals.
†Low-volume hospitals performed fewer than 325 coronary artery bypass grafts in 1989.
‡Downstate hospitals include those from New York City, Long Island, and Westchester County. Upstate hospitals are in the remaining regions of the state.
§Teaching hospitals are the primary university hospitals.
‖P<.05.
¶Indirectly standardized for Parsonnet score, disease severity, age, indication chapter, and emergency status.

Definitions also changed, in part to reflect changes in practice. Clinical scenarios were defined for the 1990 panel in more detail than for the previous panel, most significantly to take account of the great importance of surgical risk. The effect of these changes was to make the 1990 panel ratings more stringent: the definition of significant disease for one- and two-vessel disease required at least one vessel to be narrowed by 70% instead of 50% as in the earlier ratings; all clinical scenarios were rated at three levels of risk instead of one; the results of the stress test were more frequently included in the definition of the scenarios. Finally, the panel explicitly considered the appropriateness of CABG in the context of the availability of PTCA, which resulted in CABG being rated uncertain or inappropriate for some clinical scenarios that were previously rated appropriate. All of these changes should have made it more likely that a given case would be rated inappropriate, not less.

To test the hypothesis that the difference in results was due to changes in the panel ratings, we examined the inappropriate cases from the earlier study using the 1990 panel ratings.
For the 55 inappropriate cases from the earlier study, changes in the definition of the clinical scenarios made rerating impossible for nine cases. Of the remaining 46 cases, 45 (98%) were still inappropriate when rated with 1990 ratings (modified to accept 50% narrowing as the definition of significant disease and to consider all cases as low risk). One case was rated uncertain.

A third possible explanation for the differences we found could be changes in practice. Indeed, how bypass surgery is performed has changed markedly in the decade between these two studies. But it is the development of PTCA that has had the greatest effect on patient referrals for surgery. While 7442 patients underwent CABG in our study hospitals in 1990, 6391 patients underwent PTCA in those same hospitals that year. Whereas in 1979, 1980, and 1982 we found 51% of study patients underwent CABG for one- or two-vessel disease, in New York State in 1990, it was 24%. Because virtually all of the inappropriate use in both studies was in operations performed for one- or two-vessel disease, the decrease in the number of these patients coming to surgery alone could account for half of the reduction in the rate of inappropriate CABG.

Practice patterns could also have been affected by precertification requirements of the peer review organizations. In New York State in 1990, the Island Peer Review Organization required all candidates for CABG to meet one of the following screening criteria prior to admission or undergo physician review: left main or three-vessel disease, prior myocardial infarction, an abnormal electrocardiogram, an abnormal stress test, or angina that is not well controlled by medication. Not surprisingly, all of the 28 patients in our sample who received an inappropriate rating easily met these broad and inclusive screening criteria. In fact, 91% of the patients with inappropriate ratings from the prior study also met them.
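The claim that the shift in case mix alone could account for about half of the reduction can be checked with a rough calculation. The sketch below is a back-of-envelope reading, not the authors' method, which the article does not describe. It assumes that all inappropriate operations fall within the one- or two-vessel group and that the inappropriateness rate within that group stayed constant between studies; the function name is introduced here for illustration.

```python
def fraction_explained_by_case_mix(p_old, p_new, share_old, share_new):
    """Share of the drop in inappropriate use attributable to case mix alone."""
    within_group_rate = p_old / share_old        # rate among one/two-vessel cases
    expected_new = within_group_rate * share_new # rate if only the share changed
    return (p_old - expected_new) / (p_old - p_new)

fraction = fraction_explained_by_case_mix(
    p_old=0.14,       # inappropriate rate in the earlier study
    p_new=0.024,      # inappropriate rate in New York State, 1990
    share_old=0.51,   # one- or two-vessel share of CABG, earlier study
    share_new=0.24,   # one- or two-vessel share of CABG, New York State, 1990
)
print(f"{fraction:.2f}")
```

Under these simplifying assumptions the share comes out somewhat above one half, which is of the same order as the figure stated in the text.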
It is unlikely that precertification requirements have had much effect on practice.

A fourth explanation for the difference between the two studies could be that the selection of patients for CABG in New York State is different from that in other states. There are important reasons why this could be so. For nearly 40 years the New York Cardiac Advisory Committee has exercised an oversight function that includes reviews of institutional performance of cardiac surgery, investigation of centers with suboptimal results, and periodic site visits of all centers. Under a certificate of need statute, the state Department of Health has strictly limited the number of cardiac surgical centers and has set high standards for credentialing surgeons, training of staff, necessary equipment, and minimum annual volume of open heart operations per hospital. Angioplasty is only authorized in hospitals with CABG capability. Finally, surgeons are required to file detailed reports of all cardiac surgical procedures with the department, which annually reports risk-adjusted mortality data by hospital and, recently, also by surgeon. In addition to providing comparative information for statewide assessment, the detailed reporting procedures afford a strong incentive for each hospital to monitor its own performance. Perhaps as a result of these restrictions, the total number of CABG procedures performed (alone and with other procedures) in New York State in 1989 was 13 715, or 74 per 100 000 patients, half the national rate for CABG of 148 per 100 000.

Our examination of surgical complications confirms the work of others that the rate of complications varies remarkably by hospital. We did not evaluate the appropriateness of blood transfusions, but the extreme variation that we found among institutions in the use of blood transfusions mirrors the findings reported by Goodnough et al.
The correlation of appropriateness with the complication rate at the individual hospital level was -.03 and with operative mortality was .01, confirming earlier observations that hospitals and physicians who have the ability to achieve excellent technical results do not necessarily select their patients more appropriately. Similarly, we found no significant correlations of rates of appropriateness with location, volume, or teaching status of hospitals. While the overall 2.4% rate of inappropriateness encompasses hospitals with individual rates that vary from 0% to 5%, these differences are not statistically significant and may well represent annual variations.

The high level of appropriate and crucial use, 82% of the bypass operations performed, while exemplary, raises a concern that some patients might have been denied needed surgery. It is time to look for underuse, especially among the uninsured and in the minority communities.

The low rate of inappropriate use of CABG in New York State reflects high standards of performance by cardiac surgeons and cardiologists. These findings should reassure both patients and payers that there is very little inappropriate use of bypass operations in New York State. While these exemplary outcomes result from multiple factors, including changes in the practice of surgery that have made bypass surgery safer and more successful and the diversion of patients with less severe disease to medical treatment or to PTCA, it seems inescapable that the oversight and feedback provided by the Cardiac Advisory Committee and the Department of Health in New York State have played a major role. For this reason, our findings may not be generalizable to the country as a whole. However, they do provide evidence that physicians and regulators can work together to achieve high standards of care.

This work was supported by grants from the Commonwealth Fund, the John A.
Hartford Foundation, the Morgan Guaranty Trust, the New York Community Trust, and the New York State Health Department. The development of the appropriateness ratings was carried out as part of the Appropriateness Initiative, a joint effort with the Academic Medical Center Consortium and the American Medical Association, Chicago, Ill.

We are grateful to the member institutions of the Academic Medical Center Consortium for their participation and assistance in the project. The high level of data collection was achieved in large measure because of the support and assistance of Frederick Parker, MD, who chaired the subcommittee of the New York State Cardiac Advisory Committee under whose auspices the study was performed. We thank John Kirklin, MD, and the members of the New York Cardiac Advisory Committee, Barbara Genovese, and Joan Keesey at RAND, Santa Monica, Calif, and Harry Feder, MPA, and Dorothy Knowlton, RN, at Island Peer Review Organization for their support and assistance during this project; and Carol Roth, RN, MPH, at Value Health Sciences Inc, Santa Monica, Calif, for invaluable assistance in abstractor training, data collection, and analysis.

We are also indebted to the members of the Coronary Artery Bypass Graft and Percutaneous Transluminal Coronary Angioplasty Panel who gave generously of their time, their knowledge, and their wisdom: Robert S. Dittus, MD, MPH; David P. Faxon, MD; Mark A. Hlatky, MD; J. Ward Kennedy, MD; Nicholas T. Kouchoukos, MD; Floyd D. Loop, MD; Alvin I. Mushlin, MD, ScM; Richard O. Russell, Jr, MD; and William S. Stoney, Jr, MD.

References
1. Hilborne LH, Leape LL, Bernstein SJ, et al. The appropriateness of use of percutaneous transluminal coronary angioplasty in New York State. JAMA. 1993;269:761-765.
2. Bernstein SJ, Hilborne LH, Leape LL, et al. The appropriateness of use of coronary angiography in New York State. JAMA. 1993;269:766-769.
3. Alderman E, Bourassa M, Cohen L, et al.
Ten-year follow-up of survival and myocardial infarction in the randomized Coronary Artery Surgery Study. Circulation. 1990;82:1629-1646.
4. European Coronary Surgery Study Group. Prospective randomized study of coronary artery bypass surgery in stable angina pectoris: a progress report on survival. Circulation. 1982;65(suppl II):67-71.
5. European Coronary Surgery Study Group. Long-term results of prospective randomized study of coronary artery bypass surgery in stable angina pectoris. Lancet. 1982;2:1173-1180.
6. Winslow CM, Kosecoff JB, Chassin M, Kanouse DE, Brook RH. The appropriateness of performing coronary artery bypass surgery. JAMA. 1988;260:505-509.
7. Park RE, Fink A, Brook RH, et al. Physician ratings of appropriate indications for six medical and surgical procedures. Am J Public Health. 1986;76:766-772.
8. Chassin MR, Park RE, Fink A, Rauchman S, Keesey J, Brook RH. Indications for Selected Medical and Surgical Procedures: A Literature Review and Ratings of Appropriateness: Coronary Artery Bypass Surgery. Santa Monica, Calif: RAND; 1986. Publication R-3204/2-CWF-HF-HCFA-PMT-RWJ.
9. Parsonnet V, Dean D, Bernstein AD. A method of uniform stratification of risk for evaluating the results of surgery in acquired adult heart disease. Circulation. 1989;79(suppl I):3-12.
10. Leape LL, Hilborne LH, Kahan JP, et al. Coronary Artery Bypass Graft: A Literature Review and Ratings of Appropriateness and Necessity. Santa Monica, Calif: RAND; 1991. Publication JRA-02.
11. Kish L. Survey Sampling. New York, NY: John Wiley & Sons Inc; 1965.
12. Cochran WG. Sampling Techniques. 3rd ed. New York, NY: John Wiley & Sons Inc; 1977.
13. Chassin MR, Kosecoff J, Park RE. Does inappropriate use explain geographic variations in the use of health care services? a study of three procedures. JAMA. 1987;258:2533-2537.
14. Office of Health Systems Management, New York State Dept of Health.
Annual Report of Cardiac Diagnostic and Cardiac Surgical Centers: 1990 Summary Report. Albany: New York State Dept of Health; 1991.
15. National Center for Health Statistics. Detailed diagnoses and procedures: National Hospital Discharge Survey, 1989. Vital Health Stat 13. 1991; No. 108.
16. Goodnough LT, Johnston MFM, Toy PTCY, Transfusion Medicine Academic Award Group. The variability of transfusion practice in coronary artery bypass surgery. JAMA. 1991;265:86-90.

The Appropriateness of Use of Percutaneous Transluminal Coronary Angioplasty in New York State

Lee H. Hilborne, MD, MPH; Lucian L. Leape, MD; Steven J. Bernstein, MD, MPH; Rolla Edward Park, PhD; Mary E. Fiske, MD; Caren J. Kamberg, MSPH; Carol Pindar Roth, RN, MPH; Robert H. Brook, MD, ScD

Objective: To determine the appropriateness of use of percutaneous transluminal coronary angioplasty (PTCA) in New York State.
Design: Retrospective randomized medical record review.
Setting: Fifteen randomly selected hospitals in New York State that provide PTCA.
Patients: Random sample of 1306 patients undergoing PTCA in New York State in 1990.
Main Outcome Measures: Percentage of patients who underwent PTCA for indications rated appropriate, uncertain, and inappropriate.
Results: The majority of patients received PTCA for chronic stable angina, unstable angina, and in the post-myocardial infarction period (up to 3 weeks). Fifty-eight percent of PTCAs were rated appropriate; 38%, uncertain; and 4%, inappropriate. The inappropriate rate varied by hospital from 1% to 9% (P=.12); the uncertain rate, from 26% to 50% (P=.02); and the combined inappropriate and uncertain rate, from 29% to 57% (P<.001). There was no difference in appropriateness when the institutions were grouped by volume (fewer than 300 procedures annually or at least 300 procedures annually), location (upstate vs downstate), or by teaching status.
Conclusions: Few PTCAs were performed for inappropriate indications in New York State. However, the large number of procedures performed for indications that were rated uncertain as to their net benefit requires further study and justification at both clinical and policy levels.
(JAMA. 1993;269:761-765)

From RAND (Drs Hilborne, Leape, Bernstein, Park, Fiske, and Brook, and Ms Kamberg), Santa Monica, Calif; the Departments of Medicine (Drs Hilborne and Brook) and Pathology and Laboratory Medicine (Dr Hilborne), School of Medicine, and the School of Public Health (Dr Brook), UCLA, Los Angeles, Calif; Harvard School of Public Health, Boston, Mass (Dr Leape); the Schools of Medicine and Public Health, University of Michigan, Ann Arbor (Dr Bernstein); and Value Health Sciences Inc, Santa Monica, Calif (Ms Roth).
Reprint requests to RAND, 1700 Main St, Mail Stop 3F, Santa Monica, CA 90406-2398 (Dr Hilborne).

See also pp 753, 766, and 794.

FOLLOWING the performance of the first percutaneous transluminal coronary angioplasty (PTCA) in 1977, its use has become increasingly more common, and it is now advocated as the procedure of choice for many patients with symptomatic single- and two-vessel coronary artery disease. However, the use of PTCA has been subject to less evaluation than the procedure it can replace, coronary artery bypass graft (CABG) surgery. No formal assessment of the appropriateness of use of PTCA has been performed, and randomized controlled trials comparing the efficacy of PTCA with CABG and medical therapy are still under way. Nevertheless, the extent of use of PTCA in the United States approximates that of CABG.
Paralleling the increase in PTCA use nationally, the number of PTCA cases performed in New York State increased 105% from 1986 to 1990. At the request of the New York Cardiac Advisory Committee, we performed a study that assessed the appropriateness of PTCA in New York State in 1990.

METHODS

The development of appropriateness and necessity ratings is detailed in the article on CABG surgery by Leape et al in this issue of JAMA. The literature review for PTCA, including the panel ratings of appropriateness and necessity, is available from RAND, Santa Monica, Calif.

Sample

We obtained a random sample of patients who underwent PTCA in 1990 from non-federal hospitals in New York State by means of a two-step sampling process. First, we selected a sample of hospitals stratified according to two characteristics: volume and geographic location (ie, upstate or downstate). The volume stratification was performed based on the annual number of CABG surgeries performed at each location; this resulted in two groups of PTCA patients, those undergoing PTCA in hospitals in which either fewer than 300 PTCAs or at least 300 PTCAs were performed per year. We randomly sampled approximately equal numbers of hospitals performing PTCA in each stratum, for a total of 15 hospitals. Second, within each hospital, we requested an average of 98 medical records (total, 1467) from a random sample of patients who received PTCA in 1990. Seventy-five records could not be located and 60 were excluded because the procedure did not meet inclusion criteria (eg, the study was not performed during 1990 or the patient did not receive PTCA). In addition, we excluded 26 patients for whom we were unable to locate an exercise stress-test report. The final sample comprised 1306 PTCA cases. Appropriateness and complication results were weighted to reflect the population of patients who underwent coronary angioplasty in New York State during 1990 and SEs were adjusted to correct for the design effects of the two-stage sampling process.

Analysis

We assigned each patient to a unique indication (clinical scenario) based on the methods described by Leape et al. We also analyzed the special situation of a "culprit" lesion for a subset of PTCA patients. In the setting of an urgent or emergent admission (eg, unstable angina or acute myocardial infarction), some patients with multivessel disease may undergo PTCA only of the culprit lesion, ie, the lesion thought to be responsible for the acute change. Because, in the absence of an emergency, CABG is often preferred for patients with multivessel disease and because the panel did not address this issue, we performed a sensitivity analysis to investigate the possible effects of culprit-lesion PTCA. First, results were calculated without consideration of the culprit lesion. Second, appropriateness was assessed by considering patients who underwent urgent PTCA of the culprit lesion as if they had only single-vessel disease. For example, a post-myocardial infarction patient with triple-vessel coronary artery disease who received a single-vessel PTCA was analyzed after first placing the patient into the single-vessel disease category (other clinical factors, such as ejection fraction and risk, were left unchanged).

Table 1. Demographic and Clinical Characteristics of Patients Undergoing Percutaneous Transluminal Coronary Angioplasty (PTCA) in New York State in 1990
Characteristics | % (95% Confidence Interval)
Age, y: 19-49 | 17 (14-20)
Age, y: 50-59 | 26 (23-29)
Age, y: 60-64 | 18 (16-19)
Age, y: 65-69 | 16 (14-17)
Age, y: 70-74 | 13 (11-16)
Age, y: 75-79 | 7 (5-9)
Age, y: ≥80 | 3 (2-5)
Female | 31 (27-35)
Race: White | 91 (86-96)
Race: Black | 3 (1-5)
Race: Hispanic | 4 (1-6)
Race: Other | 2 (1-4)
Coronary artery disease risk factors: Hypertension | 51 (49-52)
Coronary artery disease risk factors: Family history | 50 (46-54)
Coronary artery disease risk factors: Hypercholesterolemia | 45 (39-50)
Coronary artery disease risk factors: Smoking | 28 (22-33)
Coronary artery disease risk factors: Diabetes mellitus | 23 (21-25)
Anatomic disease: Left main* | 0.4 (0-0.9)
Anatomic disease: Three vessels | 16 (12-20)
Anatomic disease: Two vessels, with PLAD artery† | 8 (6-11)
Anatomic disease: Two vessels, no PLAD artery | 29 (25-32)
Anatomic disease: One vessel, with PLAD artery | 12 (8-17)
Anatomic disease: One vessel, not PLAD artery | 34 (30-38)
Anatomic disease: Insignificant disease‡ | 0.2 (0-0.4)
Cardiac history§: No myocardial infarction, no previous revascularization | 32 (28-35)
Cardiac history§: No myocardial infarction, previous PTCA | 12 (8-15)
Cardiac history§: No myocardial infarction, previous CABG | 4 (3-5)
Cardiac history§: Myocardial infarction, no previous revascularization | 38 (33-43)
Cardiac history§: Myocardial infarction, previous PTCA | 10 (9-12)
Cardiac history§: Myocardial infarction, previous CABG | 6 (4-7)
*All patients had protected left main disease (a patent bypass graft around the left main obstruction).
†PLAD (proximal left anterior descending) artery is defined as an obstruction before the first septal perforator.
‡Angiographic findings did not meet the minimum criteria established by an expert panel: a minimum of 50% narrowing in all affected vessels, with 70% narrowing in at least one artery (except for left main disease).
§CABG indicates coronary artery bypass graft; 2% (23 patients) had both a previous PTCA and a previous CABG. These categories, therefore, are not mutually exclusive.

Table 2. Appropriateness of PTCA in 1306 Patients in New York State in 1990 by Clinical Indications Chapter*
Appropriateness, % (95% Confidence Interval)
Indication | Total No. of Patients | Crucial | Appropriate | Uncertain | Inappropriate
Chronic stable angina | 519 | 34 (30-37) | 23 (18-28) | 42 (37-48) | 1 (0-2)
Unstable angina† | 356 | 52 (47-56) | 13 (10-16) | 34 (30-38) | 2 (0-3)
Acute myocardial infarction‡ | 32 | ... | ... | ... | ...
Post-myocardial infarction§ | 308 | 32 (23-...) | 30 (25-34) | 35 (27-44) | 3 (0-7)
Asymptomatic | 76 | ... | ... | ... | ...
Flash pulmonary edema | 6 | ... | ... | ... | ...
Palliative procedure | 1 | ... | ... | ... | ...
Near sudden death | 1 | 100 | 0 | 0 | 0
Ventricular arrhythmias | 2 | 0 | 0 | 100 | 0
Insignificant disease‖ | 5 | 0 | 0 | 0 | 100
Total | 1306 | 35 (31-39) | 23 (20-25) | 38 (35-41) | 4 (2-6)
*PTCA indicates percutaneous transluminal coronary angioplasty. Numbers in parentheses are 95% confidence intervals. Percentage totals may not add up to 100 due to rounding.
†Chest pain thought to be due to myocardial ischemia requiring hospitalization (and infarction is ruled out).
‡Within 6 h of an acute myocardial infarction with or without shock.
§From 6 h to 21 days following an acute myocardial infarction.
‖Angiographic findings did not meet the minimum criteria established by an expert panel: a minimum of 50% narrowing in all affected vessels with 70% narrowing in at least one artery (except for left main disease).

RESULTS

Table 1 shows the demographic and clinical characteristics of the study patients. Sixty-nine percent of patients were men. Ninety-one percent were white, 3% were black, and 4% were Hispanic. The median age was 50 years, and 77% were less than 70 years old. Using our modified Parsonnet score, 70% of patients were in the low-risk category, 22% were in the moderately high-risk category, and 8% were in the very high-risk category.

The majority (91%) of procedures were performed for indications falling into three clinical chapters: chronic stable angina (40%), unstable angina (27%), and post-myocardial infarction (24%). An additional 6% of procedures were performed on asymptomatic patients (Table 2). Eighty-three percent of PTCA procedures were performed on patients with either single-vessel or two-vessel disease. Regardless of the extent of disease, the vast majority of patients (84%) received only single-vessel PTCA. Of these, 49% were performed on the left anterior descending artery, 22% on the left circumflex artery, and 29% on the right coronary artery. Most patients (67%) had a single-lesion angioplasty. Twenty-five percent had double-lesion angioplasty and 8% received angioplasty of three or more lesions.
There were 284 patients (22%) who met our criteria for culprit-lesion angioplasty (Table 3).

Angioplasty was completely successful in 88% of procedures. In accordance with conventional criteria, we defined complete success as residual luminal stenosis less than 50% for all lesions attempted. An additional 5% of patients had procedures that were partially successful (ie, less than 50% luminal stenosis for at least some of the lesions attempted). Patients with diffuse disease, long lesions, total occlusions lasting for more than 3 months, and lesions at major bifurcations (ie, type C lesions) have been shown to have a lower success rate. In our study the complete success rate for patients with any of these low success rate characteristics was 5% lower than for patients with lesions lacking any of them ([values illegible]).

Percutaneous Transluminal Coronary Angioplasty, Hilborne et al

Table 3. Effect of Adjusting for Culprit Lesion for the 282 Unstable Angina, Acute Myocardial Infarction, or Post-Myocardial Infarction Patients With a Culprit Lesion* (Appropriateness, %)

Appropriateness

Thirty-five percent of procedures were performed for indications rated appropriate and crucial by our expert panel. An additional 23% of procedures were performed for appropriate indications; 38% were uncertain and 4% were performed for inappropriate indications (Table 2). Among the 496 cases rated uncertain, 58% received a median rating of uncertain, 27% received a median rating of appropriate yet CABG was preferred, and 15% were uncertain because of panelist disagreement. Similarly, for the 61 cases rated inappropriate, 92% were explicitly rated as such and 8% were rated uncertain yet CABG was preferred. There were no clinically important differences in the appropriateness rates for patients in the three major clinical chapters: chronic stable angina (57%), unstable angina (65%), and post-myocardial infarction (62%).
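The way a case ends up in each category above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the panel's actual procedure. The text confirms only that a median rating of 4 to 6 is uncertain, that panelist disagreement yields an uncertain rating, and that a preference for CABG moves a case down one category. The 1 to 3 and 7 to 9 bands, the disagreement rule (at least three ratings in each extreme band), and the function name `classify` are assumptions made for the example. The separate "crucial" (necessity) rating is not modeled.

```python
from statistics import median


def classify(ratings, cabg_preferred=False):
    """Map nine panelist ratings (1-9 scale) to an appropriateness category.

    Assumed bands: median 1-3 inappropriate, 4-6 uncertain, 7-9 appropriate.
    Only the 4-6 uncertain band is stated in the article text.
    """
    if len(ratings) != 9 or any(not 1 <= r <= 9 for r in ratings):
        raise ValueError("expected nine ratings on the 1-9 scale")

    # Assumed disagreement rule: three or more panelists at each extreme.
    low_votes = sum(r <= 3 for r in ratings)
    high_votes = sum(r >= 7 for r in ratings)
    if low_votes >= 3 and high_votes >= 3:
        return "uncertain (panel disagreement)"

    m = median(ratings)  # with nine ratings this is the fifth-ranked value
    if m >= 7:
        # Appropriate by median, but downgraded when CABG was preferred.
        return "uncertain (CABG preferred)" if cabg_preferred else "appropriate"
    if m >= 4:
        # Uncertain by median, but downgraded when CABG was preferred.
        return "inappropriate (CABG preferred)" if cabg_preferred else "uncertain"
    return "inappropriate"
```

Under these assumptions, the three routes to an uncertain rating reported for the 496 uncertain cases (median in the uncertain range, appropriate median with CABG preferred, and panelist disagreement) correspond to the three "uncertain" return values.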
For patients with a culprit lesion, the sensitivity analysis shows that when a culprit lesion is considered, the percentage of these cases that are rated uncertain declines significantly (from 43% to 22%) (Table 3). Almost all of these cases became either crucial or appropriate. The percentage of PTCAs rated inappropriate did not change. The effect of this analysis on the entire sample, however, was much less: 38% of angioplasties were rated crucial; 24%, appropriate; and 34%, uncertain. The percentage of inappropriate angioplasties remained unchanged at 4%. Examples of the most frequently occurring appropriate, uncertain, and inappropriate indications are shown in Table 4.

Uncertain Appropriate Crucial Inappropriate Culprit Lesion Cases Unstable angina† (n=145) Before adjustment 47 (46-53) Acute myocardial infarction‡ (n=12) Before adjustment 7 (1-13) 45 (40-50) 1 (0-3) 44 (35-53) After adjustment 27 (21-33) 27 (21-33) 2 (0-4) 84 (57-100) 16 (0-43) After adjustment 0 Post-myocardial infarction§ (n=125) Before adjustment 0 0 0 100 0 16 (3-28) 0 18 (12-24) 1 (0-3) 24 (16-33) 43 (38-48) 0.6 (0-2) 46 (40-52) After adjustment 44 (34-54) 27 (23-31) 31 (22-41) All three categories (n=282) Before adjustment 40 (29-57) 54 (47-61) After adjustment 30 (26-34) 22 (17-26) 2 (0-3) All Cases in Each Clinical Category (Culprit and Nonculprit Lesions) Unstable angina, acute myocardial infarction, and post-myocardial infarction (n=696) Before adjustment 40 (34-47) 24 (20-28) 33 (29-37) After adjustment 2 (0-5) 46 (40-43) 3 (0-5) 23 (20-25) 38 (35-41) 4 (2-6) 38 (35-42) After adjustment 24 (19-28) 35 (31-39) Entire sample (n=1306) Before adjustment 27 (23-30) 24 (21-27) 34 (30-37) 4 (3-6)

*Culprit lesion is defined in the "Analysis" section of the text. One patient with flash pulmonary edema and one patient with near sudden death with culprit lesions are not listed. Numbers in parentheses are 95% confidence intervals.
†Chest pain thought to be due to myocardial ischemia requiring hospitalization (and infarction is ruled out).
‡Within 6 h of an acute myocardial infarction with or without shock.
§From 6 h to 21 d following an acute myocardial infarction.

Table 4. The Most Frequently Used Indications by Appropriateness Category

Appropriate
  Severe chronic stable angina (class III/IV) treated with maximum medical therapy, single-vessel nonproximal left anterior descending obstruction in a patient with low risk and an ejection fraction >35%. No. of cases: 59
  Post-myocardial infarction, within 21 d of an acute myocardial infarction, with continuing chest pain (postinfarction angina), single-vessel nonproximal left anterior descending obstruction in a patient with low risk and an ejection fraction >35%. No. of cases: 56
Uncertain
  Severe chronic stable angina (class III/IV) with pain on using maximum medical therapy, three-vessel disease in a patient with low risk and an ejection fraction >35%. No. of cases: 32. Appropriateness rating: 7*
  Mild to moderate chronic stable angina (class I/II) for a patient treated with less than maximal medical therapy, with single-vessel nonproximal left anterior descending obstruction, low risk, and an ejection fraction >35%. No. of cases: 30. Appropriateness rating: 7†
Inappropriate
  Asymptomatic patient without a very positive exercise stress test, single-vessel nonproximal left anterior descending obstruction in a patient with low risk and an ejection fraction of £50%‡. No. of cases: 14. Appropriateness rating: 3

*Uncertain because coronary artery bypass graft was preferred.
†Uncertain because of panel disagreement.
‡All other inappropriate indications had less than 10 occurrences each.

Mortality and Complications

The PTCA procedural mortality of 1.4% was directly related to risk as determined by the modified Parsonnet score. Among low-risk patients, 0.2% died compared with 2.3% of high-risk patients and 9.5% of those who were at very high risk (P<.001). Mortality was also related to patient age. Patients less than 60 years of age had a mortality rate of 0.2%; corresponding rates in older groups were 1.2% of patients aged 60 to 74 years, 4.4% of patients aged 75 to 79 years, and 14.3% of patients aged 80 years and older (P<.001).

Forty-six patients (3.5%) required emergency CABG surgery because of a PTCA complication. Repeat PTCA during the hospitalization secondary to vessel closure was required in an additional 2.5% of patients, and 1.9% sustained an acute myocardial infarction following PTCA but before discharge. Fifty-five patients (4.2%) required transfusion, including 20 (43%) of the 46 emergency CABG patients and 35 (2.8%) of the 1260 patients who did not require CABG. Among the non-CABG patients receiving transfusion, 26% received 1 U, 43% received 2 U, and 31% received more than 2 U of blood. One patient had a cerebrovascular accident and 22 had periprocedural cardiac arrest. Three patients were returned to the catheterization laboratory because of bleeding. The PTCA complication rate was independent of whether the patient had a prior CABG.
More women (13%) than men (8%) experienced a complication (P=.004).

Interhospital Differences

Individual hospital inappropriateness rates for PTCA ranged from 1% to 9% (P=.12). Institutional appropriate and crucial rates varied from 24% to 43%, appropriateness rates from 13% to 36%, and uncertain rates from 26% to 50% (Table 5). When crucial and appropriate cases were grouped and compared with the group of cases rated either uncertain or inappropriate, combined uncertain and inappropriate rates by hospital varied from 43% to 71% (P<.001).

Severity-adjusted hospital-specific mortality varied from 0% to 5% and overall complication rates (complications include a coronary vascular event requiring CABG or repeat PTCA, acute myocardial infarction, blood loss sufficient to warrant transfusion or a return to the catheterization laboratory, cardiac arrest, wound infection, or death) ranged from 4% to 17%. These differences in mortality and complication rates were not statistically significant.

There were no significant appropriateness differences among hospitals when grouped by volume of procedures performed, location (upstate or downstate), or teaching status (Table 6).

Table 5. Appropriateness of Performing PTCA* by Hospital (Not Adjusted for the Performance of a Culprit-Lesion PTCA)

            Appropriateness, %
Hospital    Appropriate and Crucial    Appropriate    Uncertain    Inappropriate
A           36                         35             26           3
B           33                         23             38           6
C           33                         20             44           2
D           42                         20             34           3
E           31                         36             29           3
F           40                         22             36           2
G           24                         24             43           9
H           30                         13             48           9
I           31                         23             37           9
J           45                         18             33           3
K           37                         28             31           3
L           28                         16             50           6
M           43                         22             34           1
N           34                         20             39           7
O           33                         20             46           1

*PTCA indicates percutaneous transluminal coronary angioplasty. Percentages may not add up to 100 due to rounding. P=.12 for inappropriate vs crucial/appropriate/uncertain. P<.001 for inappropriate/uncertain vs crucial/appropriate.

Table 6. Appropriateness, Percentage of Very High-Risk Patients, and Adjusted Complication Rates by Hospital Characteristics*

                          Teaching Hospital, %§      Volume, %†                 Location, %‡
                          Yes          No            Low          High          Upstate      Downstate
Appropriateness
  Appropriate and crucial 35 (31-39)   35 (31-40)    37 (31-42)   34 (29-40)    39 (33-45)   34 (29-38)
  Appropriate             23 (18-28)   22 (20-25)    20 (18-23)   24 (21-27)    20 (17-23)   23 (21-26)
  Uncertain               38 (31-44)   38 (35-42)    40 (35-46)   37 (34-40)    37 (32-42)   39 (35-42)
  Inappropriate           5 (2-7)      4 (3-5)       4 (2-6)      4 (3-6)       4 (2-6)      3 (2-4)
Very high-risk patients   8 (6-11)     7 (5-10)      7 (4-10)     8 (5-11)      9 (7-11)     7 (5-9)
Complications‖            11 (8-14)    10 (8-12)     10 (8-13)    11 (8-14)     11 (9-14)    10 (8-12)
Mortality                 2 (1-2)      [illegible] (1-3)  1 (0-2) 3 (1-4)       1 (1-2)      1 (1-2)

*Numbers in parentheses are 95% confidence intervals.
†Each low-volume hospital performed fewer than 300 percutaneous transluminal coronary angioplasties in 1990.
‡Downstate hospitals include those from New York City, Long Island, and Westchester County.
§Teaching hospitals are the primary acute care facility associated with a medical school.
‖Complications are indirectly standardized for the modified Parsonnet score, ejection fraction, disease severity, age, indication chapter, and emergency status. Complications include a coronary vascular event requiring coronary artery bypass graft or repeat percutaneous transluminal coronary angioplasty, acute myocardial infarction, blood loss sufficient to warrant transfusion or a return to the catheterization laboratory, cardiac arrest, wound infection, or death.

COMMENT

This study found the rate of inappropriate use of PTCA in New York State in 1990 to be 4%. This inappropriateness rate is very close to that of inappropriate use of CABG surgery in New York State and is considerably lower than rates of inappropriate use reported in previous studies of other procedures. However, the fraction of patients in whom the procedure was performed for uncertain indications was 38%. Most of these indications were rated uncertain because the median panel rating was within the uncertain range (ie, between 4 and 6), reflecting the panel's judgment that the benefits and risks of the procedures for these indications were about equal. The uncertain rating rarely was assigned because the expert panel was widely divided with respect to its final appropriateness ratings. Adjusting the patient classification for the presence of a culprit lesion decreased the uncertain rate slightly (from 38% to 34%).

There are a number of explanations for the high uncertain rate. The most important is the shortage of outcomes data. Because appropriateness determinations are outcomes-driven, our expert panelists frequently did not have sufficient information to make a definitive appropriateness assessment. Second, even when PTCA is successful, long-term results, particularly the high restenosis rate, have led some to question the long-term benefit of PTCA. Third, the coronary revascularization field is rapidly changing. New catheter designs and the introduction of alternatives such as coronary atherectomy and coronary stenting alter the feasibility and outcomes of nonsurgical coronary revascularization, continually changing the benefits and risks. Our finding of a high success rate in patients receiving PTCA for lesions that were previously considered to have a low success rate illustrates this point. While the immediate success rate in these patients was lower than in those without these lesion characteristics, it is much higher than in previous reports. The increased success rate probably results from both increased experience and advancing technology. This demonstrates how important it is for appropriateness ratings to represent current, state-of-the-art practice, particularly for an evolving technology. Because the ratings used in this study are evidence-based, results from randomized controlled trials currently under way might change the appropriateness ratings of some of the clinical scenarios (indications).
It is essential that these results be incorporated into updated ratings promptly.

This study had three limitations. First, our findings suggest that PTCA, like coronary angiography and CABG, is rarely used inappropriately in New York State. These findings, however, may not be generalizable to other states or to the United States as a whole because New York State limits the number of facilities and physicians performing PTCA to 31 centers. Second, we have no information concerning the validity of coronary angiogram interpretations. Because the angiographic extent of disease is essential for determining appropriateness, if there is systematic overreading of angiograms, the extent of inappropriate use could be substantially higher. We have no evidence that overinterpretation is prevalent or reason to suspect that it occurs; however, we are investigating the validity of angiographic interpretation. Finally, our panel did not expressly address use of culprit-lesion angioplasty. Considering culprit-lesion PTCA as if it were performed for single-vessel disease reduces the uncertainty rate and increases the appropriateness rate. If the panel had considered culprit-lesion angioplasty explicitly, their ratings may have been different. Our sensitivity analysis, however, suggests that separate culprit-lesion ratings would have had a minimal effect on our conclusions.

How should these ratings of appropriateness be applied? One logical application is as a source document for the development of clinical practice guidelines to assist clinicians and patients with difficult clinical decisions. These appropriateness criteria are developed by an expert panel considering the average patient presenting to the average physician performing PTCA in the average hospital. In individual patients, extenuating clinical circumstances may necessitate special interpretations of appropriateness ratings. Nevertheless, these ratings can be used as a place to begin a discussion with a patient. Quality assurance and utilization review programs should only use these ratings as a screen to identify cases for individualized professional review. Irrespective of their use, to be of value these ratings must be regularly updated as new information becomes available. Updating should occur at least every 2 years and whenever data from randomized trials are released.

The high rate of use of this procedure for uncertain indications (38%) and the variation by hospital also should be addressed. At the very least, patients considering undergoing PTCA for clinical scenarios rated uncertain should be fully informed that with the current state of scientific knowledge the benefits of the procedure when used for these indications are about equal to its risks. For uncertain scenarios it would be reasonable to require practitioners to report outcomes data as a condition of reimbursement so that ultimately the value of the procedure for these clinical scenarios could be established. Policy analysts and the public as a whole may also wish to consider whether it is in the interest of society to use limited public funds to pay for procedures rated uncertain before making procedures that are crucial and/or appropriate more available to patients who are uninsured or underinsured.

This work was supported by grants from the Commonwealth Fund, Morgan Guaranty Trust, and the New York Community Trust.

We thank Frederick Parker, MD, and the members of the New York Cardiac Advisory Committee for their support and advice during this project. We are indebted to Joan Keesey, Barbara Genovese, Marjorie Sherwood, MD, Amar Iqbal, MD, and Jacqueline Kosecoff, PhD, for their assistance in abstractor training, data analysis, and project coordination. We also thank Harry Feder, MPA, and Dorothy Knowlton, RN, of Island Peer Review Organization, without whom this work could not have been completed.

We also express our deepest appreciation to the members of the Coronary Revascularization Appropriateness Panel for the time they devoted to reviewing the literature and providing the appropriateness and necessity ratings used in this project. Members of the panel included the following: Robert S. Dittus, MD, MPH; David P. Faxon, MD; Mark A. Hlatky, MD; J. Ward Kennedy, MD; Nicholas T. Kouchoukos, MD; Floyd D. Loop, MD; Alvin I. Mushlin, MD, ScM; Richard O. Russell, Jr, MD; and William S. Stoney, Jr, MD.

References

1. King SB III. Percutaneous transluminal coronary angioplasty: the second decade. Am J Cardiol. 1988;62(suppl K):2K-6K.
2. BARI, CABRI, EAST, GABI, and RITA: coronary angioplasty on trial. Lancet. 1990;335:1315-1316.
3. National Center for Health Statistics. Detailed diagnoses and procedures, National Hospital Discharge Survey, 1989. Vital Health Stat 13. 1991;No. 109.
4. Office of Health Systems Management, New York State Dept of Health. Annual Report of Cardiac Diagnostic and Cardiac Surgical Centers: 1990 Summary Report. Albany: New York Dept of Health; 1991.
5. Leape LL, Hilborne LH, Park RE, et al. The appropriateness of use of coronary artery bypass graft surgery in New York State. JAMA. 1993;269:753-760.
6. Hilborne LH, Leape LL, Kahan JP, Park RE, Kamberg CJ, Brook RH. Percutaneous Transluminal Coronary Angioplasty: A Literature Review and Ratings of Appropriateness and Necessity. Santa Monica, Calif: RAND; 1991. Publication JRA-01.
7. Vacek JL, Rosamond TL, Robuck W, Kramer PH, Beauchamp GD. Prognosis of culprit lesion PTCA in acute myocardial infarction for multi versus single vessel disease. Cathet Cardiovasc Diagn. 1991;[illegible]:161-[illegible].
8. Kish L. Survey Sampling. New York, NY: John Wiley & Sons Inc; 1965.
9. Cochran WG. Sampling Techniques. 3rd ed. New York, NY: John Wiley & Sons Inc; 1977.
10. Ryan TJ, Faxon DP, Gunnar RM, et al. Guidelines for percutaneous transluminal coronary angioplasty: a report of the American College of Cardiology/American Heart Association Task Force on Assessment of Diagnostic and Therapeutic Cardiovascular Procedures (Subcommittee on Percutaneous Transluminal Angioplasty). Circulation. 1988;78:486-502.
11. Chassin MR, Kosecoff J, Solomon DH, Brook RH. How coronary angiography is used: clinical determinants of appropriateness. JAMA. 1987;258:2543-2547.
12. Winslow CM, Kosecoff JB, Chassin M, Kanouse DE, Brook RH. The appropriateness of performing coronary artery bypass surgery. JAMA. 1988;260:505-509.
13. Shapiro TA, Herrmann HC. Coronary angiography and interventional cardiology. Curr Opin Radiol. 1992;4:55-64.
14. Vlietstra RE. Advances in coronary interventional techniques. Int J Cardiol. 1991;31:175-181.
15. Anderson HV, Roubin GS, Leimgruber PP, Douglas JS Jr, King SB III, Gruentzig AR. Primary angiographic success rates of percutaneous transluminal coronary angioplasty. Am J Cardiol. 1985;56:712-717.
16. Tuzcu EM, Simpfendorfer C, Dorosti K, et al. Changing patterns in percutaneous transluminal coronary angioplasty. Am Heart J. 1989;117:1374-1377.
17. Bernstein SJ, Hilborne LH, Leape LL, et al. The appropriateness of use of coronary angiography in New York State. JAMA. 1993;269:766-769.

The Appropriateness of Use of Coronary Angiography in New York State

Steven J. Bernstein, MD, MPH; Lee H. Hilborne, MD, MPH; Lucian L. Leape, MD; Mary E. Fiske, MD; Rolla Edward Park, PhD; Caren J. Kamberg, MSPH; Robert H. Brook, MD, ScD

Objective. To determine the appropriateness of use of coronary angiography in New York State.
Design. Retrospective randomized medical record review.
Setting. Fifteen randomly selected hospitals in New York State that provide coronary angiography.
Patients. Random sample of 1335 patients undergoing coronary angiography in New York State in 1990.
Main Outcome Measures. Percentage of patients who underwent coronary angiography for appropriate, uncertain, or inappropriate indications.
Results. Approximately 76% of coronary angiographies were rated appropriate; 20%, uncertain; and 4%, inappropriate. Inappropriate use did not vary significantly between the elderly (ie, patients aged 65 years and older) and nonelderly, 4.7% and 3.9%, respectively. Although the rate of inappropriate use varied from 0% to 9% among hospitals, the difference was not significant. Rates of appropriateness did not vary by hospital location (upstate vs downstate), volume (fewer than 750 procedures annually or at least 750 procedures annually), teaching status, or whether revascularization was available at the hospital where angiography was performed.
Conclusions. Although coronary angiography was used for few inappropriate indications in New York State, many procedures were performed for uncertain indications in which the benefit and risk were approximately equal or unknown.
(JAMA. 1993;269:766-769)

IN 1989, more than 1 million Americans underwent coronary angiography, a sevenfold increase from a decade earlier. Wide variations in the population-based use rate of this procedure within the United States and between the United States and other countries have led some to question how appropriately it is being used. In two previous studies, inappropriate-use rates of 17% in the elderly in the United States and 17% in adults in the United Kingdom have been reported. The current study evaluated the appropriateness with which coronary angiography was performed in New York State in 1990. In the two related articles in this series, we have reported on the appropriateness of coronary artery revascularization.

From RAND, Santa Monica, Calif (Drs Bernstein, Hilborne, Leape, Fiske, Park, and Brook, and Ms Kamberg); the Schools of Medicine and Public Health, University of Michigan, Ann Arbor (Dr Bernstein); the Departments of Medicine (Drs Hilborne and Brook) and Pathology and Laboratory Medicine (Dr Hilborne), the School of Medicine (Drs Hilborne and Brook), and the School of Public Health (Dr Brook), UCLA, Los Angeles; and [illegible], Boston (Dr Leape). Reprint requests to RAND, [illegible] Main St, Santa Monica, CA (Dr Bernstein).

See also pp 753, 761, and 794.

METHODS

Overview

We have previously described the methods by which we developed appropriateness ratings of possible indications (clinical scenarios) for the use of coronary angiography. Based on a review of the medical literature, we developed a mutually exclusive and comprehensive set of 2111 possible indications for which coronary angiography might be used in 1990. The indications were grouped into 10 clinical categories corresponding to the patient's primary symptom or reason for having the procedure, such as chronic stable angina, unstable angina, or acute myocardial infarction. Using a modified Delphi technique, a nine-member expert physician panel composed of three interventional cardiologists, two noninterventional cardiologists, two cardiothoracic surgeons, one internist, and one family physician rated all possible indications. The definitions and methods that the panel used are previously described. The literature review and final ratings have been published as a monograph available from RAND, Santa Monica, Calif.

Sample

We obtained a random sample of patients who underwent coronary angiography in 1990 from nonfederal hospitals in New York State by means of a two-step sampling process.
The hospitals were stratified based on three characteristics: (1) geographic location (upstate vs downstate); (2) number of coronary angiographies performed in 1989 (fewer than 750 procedures or at least 750 procedures); and (3) whether the hospital in which coronary angiography was performed was authorized to perform coronary artery bypass graft (CABG) surgery. We selected approximately equal numbers of hospitals from each stratum to yield a final sample consisting of 15 of the 56 hospitals in which coronary angiography was performed. Within each hospital, we randomly selected the medical records of 99 patients who underwent coronary angiography in 1990. Of the 1479 records selected, we located 94% (n=1387) and excluded 52 because 49 did not contain a coronary angiography (coding error), and three were incomplete.

Analytic Approach

We assigned each patient to a specific indication based on the abstracted information. All results were weighted to reflect the population of patients who underwent coronary angiography in New York State during 1990. Most results are presented as a mean rate and a 95% confidence interval (CI). Confidence intervals for rates were calculated using the normal approximation and truncated at zero if the approximation extended below zero. Logistic regression was used to compare between two categories (eg, elderly and nonelderly). Differences in distribution across multiple categories were tested using the χ² statistic for unweighted contingency tables.

Table 1. Demographic and Clinical Characteristics of 1335 Patients Undergoing Coronary Angiography in New York State in 1990

Characteristics                  % (95% Confidence Interval)
Age, y
  19-49                          17 (16-19)
  50-59                          25 (22-27)
  60-64                          19 (17-21)
  65-74                          27 (25-30)
  ≥75                            12 (10-13)
  Median                         61
Women                            35 (31-40)
Race
  White                          78 (67-89)
  Black                          10 (4-15)
  Hispanic                       9 (2-15)
  Other                          4 (2-6)
Cardiac risk factors
  Hypertension                   53 (50-57)
  Family history                 12 (8-15)
  Hypercholesterolemia           41 (34-48)
  Smoking                        28 (25-31)
  Diabetes mellitus              25 (22-27)
Cardiac history*
  Myocardial infarction          48 (43-54)
  PTCA                           7 (5-10)
  CABG                           8 (7-11)
Anatomic disease†
  Left main                      8 (6-10)
  Three vessels                  25 (22-29)
  Two vessels, with PLAD‡        6 (4-7)
  Two vessels, other             13 (9-17)
  Single vessel, with PLAD‡      3 (2-3)
  Single vessel, other           12 (9-15)
  Insignificant disease          33 (31-35)

*PTCA indicates percutaneous transluminal coronary angioplasty, and CABG, coronary artery bypass graft.
†Minimum of 50% narrowing in all affected vessels, with 70% narrowing in at least one artery for non-left main disease; for left main disease a minimum of 50% narrowing. Data are from the coronary angiogram.
‡PLAD (proximal left anterior descending) artery stenosis is defined as an obstruction before the first septal perforator.

Table 2. Appropriateness of Use of Coronary Angiography in New York State in 1990 by Clinical Indications Chapter (Appropriateness, % [95% Confidence Interval])

RESULTS

Demographic and Clinical Characteristics

The median age was 61 years and 12% were aged 75 years and older. Sixty-three percent were men and 71% were white. Almost half of the patients had a previous myocardial infarction while fewer than 10% had a prior percutaneous transluminal coronary angioplasty (PTCA) or CABG (Table 1). Left main coronary artery disease was found in 8% of the patients at angiography while three-vessel disease was discovered in 25%. In one third of patients, no significant coronary artery disease was found (Table 1). Almost half of all angiographies were performed in patients either with unstable angina or during an acute myocardial infarction (Table 2).
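The interval rule stated under Analytic Approach (normal approximation, truncated at zero) can be sketched as follows. This is a minimal illustration that assumes simple random sampling. The study weighted its results and drew a two-step sample, so this unweighted version is not expected to reproduce the published intervals exactly. The function name `rate_ci` and the multiplier 1.96 for a 95% interval are choices made for the example, not details taken from the article.

```python
import math


def rate_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a rate.

    p is the observed proportion (0-1), n the number of patients.
    The lower bound is truncated at zero, as the Analytic Approach describes.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")

    se = math.sqrt(p * (1.0 - p) / n)  # standard error under simple random sampling
    lower = max(0.0, p - z * se)       # truncate at zero when the bound goes negative
    upper = p + z * se
    return lower, upper
```

For a rare category in a small subgroup, `p - z * se` falls below zero and the function returns 0 as the lower bound, which is how intervals such as "1 (0-3)" arise.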
Appropriateness

Approximately 76% of coronary angiographies were considered either crucial or appropriate, 20% uncertain, and 4% inappropriate (Table 2). The rate of inappropriate use of coronary angiography was similar for elderly (ie, aged 65 years and older) and nonelderly patients, 4.7% and 3.9%, respectively. There was a significantly greater chance of patients' undergoing coronary angiography for inappropriate indications if they were asymptomatic (27% vs 1.2%; RR, 22; 95% CI, 9 to 42), if their presenting symptom was chest pain of uncertain origin (40% vs 1.2%; RR, 32; 95% CI, 19 to 47), or following a recent myocardial infarction (not performed during the myocardial infarction admission; 13% vs 1.2%; RR, 10; 95% CI, 4 to 23) compared with the overall inappropriate rate excluding these three clinical indications groups, called "chapters."

Forty percent of coronary angiographies performed in patients experiencing an acute myocardial infarction, 28% in asymptomatic patients, and 24% in patients with chronic stable angina were rated as uncertain. Twenty percent of the coronary angiographies (n=55) that were rated uncertain were so rated because of disagreement among the panelists; the other 213 angiographies had a median appropriateness rating ranging from 4 to 6 without disagreement. The most common appropriate, uncertain, and inappropriate cases are displayed in Table 3.

Patients, % Indication Appropriate and Crucial Uncertain Inappropriate 0 28 (20-36) 24 (15-33) 3 (0-5) 51 (34-68) 7 (3-11) 40 (28-52) 91 (85-98) 8 (1-15) 1 (0-2) 28 88 (85-92) Chronic stable angina 22 45 (36-54) During an acute MI‡ 18 7 Following unstable angina§ Appropriate 12 (8-15) Unstable angina† 0 3 (0-5) 0 Following MI‖ 7 64 (43-85) 13 (5-21) 10 (0-21) 13 (1-25) Asymptomatic 4 44 (19-70) 1 (0-3) 28 (9-47) 27 (14-41) 40 (24-56) Chest pain of unknown origin 3 42 (31-52) 10 (0-26) 8 (0-16) Following CABG 3 65 (40-90) 24 (6-42) 11 (0-28) Miscellaneous¶ 9 55 (45-64) 21 (15-28) 24 (17-31) Total 64.1 (59-69) 11.5 (8-15) 20.2 (19-22) 0 0 4.2 (3-5)

*MI indicates myocardial infarction; and CABG, coronary artery bypass graft surgery. Percentages may not add up to 100 due to rounding.
†Coronary angiography performed during an admission for unstable angina.
‡This is defined as within 10 d of the onset of MI.
§This is defined as within 3 mo of hospital discharge following an episode of unstable angina.
‖This is defined as within 3 mo of an MI (but more than 10 d after its onset).
¶This category includes patients with a variety of conditions including congestive heart failure, ventricular arrhythmias, near sudden death, and valvular heart disease.

Interhospital Comparisons

Although the rate of inappropriate use varied from 0% to 9%, uncertain use from 13% to 31%, and crucial use from 49% to 71% among hospitals, the differences among hospitals were significant (P=.04) only between crucial and less than crucial (Table 4). We also examined whether differences in inappropriate use might exist between upstate and downstate hospitals, high- and low-volume hospitals, teaching and nonteaching institutions, and by whether PTCA and CABG were performed at the hospital where the coronary angiography was performed. There was more uncertain use in teaching hospitals, those located downstate, and those performing fewer than 750 coronary angiographies per year, but all other differences were not significant (Table 5).

COMMENT

This study evaluated the appropriateness of use of coronary angiography in the state of New York.
The 4.2% inappropriate rate of coronary angiography that we found in this study is significantly less than the 17% rate reported for 1981 for a national sample of patients aged 65 years and older (P<.0001).[14] This difference was also present for New York State patients aged 65 years and older. However, the proportion of elderly patients who received angiographies for uncertain indications was 21%, more than twice the 9% uncertain use rate previously reported for 1981 (P<.0001). The proportion of patients who underwent angiography for appropriate indications remained unchanged at 74%.

This change in the distribution of inappropriate and uncertain use of angiographies in patients aged 65 years and older from 1981 to 1990 may be due to any or all of the following: First, the reasons patients undergo coronary angiography have changed substantially during the past decade. In 1990, almost half of the coronary angiographies were performed for two conditions: unstable angina (28%) and acute myocardial infarction (18%). In 1981, the figures were 20% and 2%, respectively. Second, some panel ratings changed over time. For example, the use of coronary angiography in 33 patients with unexplained cardiomegaly or congestive heart failure (2.5%) was rated as uncertain in 1990. In 1981, based on the available literature, performing coronary angiography was considered inappropriate for patients with congestive heart failure who did not also meet criteria based on angina.[5] Third, the regulatory environment in New York State may contribute to the lower rate of inappropriate use.

Table 3. The Most Frequently Used Indications by Appropriateness Category*

Appropriate†
  Unstable angina (not following an MI), in patients aged <75 y, during the admission for unstable angina but after the first 24 h, and pain resolves or is controlled by inpatient medical treatment
    No. (%) of angiographies: 180 (13.5). Appropriateness rating: 7
  Unstable angina (not following an MI), in patients aged <75 y, during the admission for unstable angina but after the first 24 h, and pain persists or recurs after admission
    No. (%) of angiographies: 109 (8.2). Appropriateness rating: 9

Uncertain‡
  Acute MI in patients aged <75 y, between 12 h after symptom onset and discharge, if they have no strong contraindications to either thrombolytic therapy or CABG/PTCA, did not receive thrombolytic therapy, and MI is uncomplicated non-Q-wave infarction with no submaximal exercise stress test
    No. (%) of angiographies: 65 (4.9). Appropriateness rating: 6
  Unstable angina (not following an MI) in patients aged <75 y, within 24 h of admission for unstable angina, and pain resolves or is controlled by inpatient medical treatment
    No. (%) of angiographies: 27 (2.02). Appropriateness rating: 4

Inappropriate§
  Within 12 wk of an acute MI, in patients aged <75 y, with non-Q-wave infarction who have been discharged from initial hospitalization, have experienced no chest pain, did not undergo an exercise stress test or a stress imaging study, and either did not undergo ambulatory electrocardiographic monitoring or showed no evidence of silent ischemia on such monitoring
    No. (%) of angiographies: 10 (0.75). Appropriateness rating: 3
  Acute uncomplicated non-Q-wave MI in patients aged >75 y, between 12 h after symptom onset and discharge, if they have no strong contraindications to either thrombolytic therapy or CABG/PTCA, did not receive thrombolytic therapy, and did not undergo a submaximal exercise stress test
    No. (%) of angiographies: 6 (0.45). Appropriateness rating: 3

*The total number of angiographies was 1335. MI indicates myocardial infarction; CABG, coronary artery bypass graft; and PTCA, percutaneous transluminal coronary angioplasty.
†1017 cases were rated appropriate.
‡268 cases were rated uncertain.
§50 cases were rated inappropriate.

Coronary Angiography - Bernstein et al

Table 4. Appropriateness of Use of Coronary Angiography in New York State in 1990 by Hospital*
Appropriateness, % (95% Confidence Interval)

Hospital   No. of Patients   Appropriate and Crucial   Appropriate    Uncertain      Inappropriate
A          90                59 (49-69)                18 (10-26)     20 (12-28)     3 (0-7)
B          90                69 (59-79)                13 (6-20)      16 (8-23)      2 (0-5)
C          89                71 (61-80)                6 (1-10)       19 (11-27)     4 (0-9)
D          90                67 (57-76)                16 (8-23)      16 (8-23)      2 (0-5)
E          88                66 (56-76)                10 (4-17)      23 (14-32)     1 (0-3)
F          89                66 (44-65)                10 (4-16)      19 (11-27)     4 (0-9)
G          90                54 (44-65)                19 (11-27)     18 (10-26)     9 (3-15)
H          88                49 (38-59)                18 (10-26)     27 (18-37)     6 (1-11)
I          90                71 (62-81)                8 (2-13)       17 (9-24)      4 (0-9)
J          90                70 (60-80)                12 (5-19)      13 (6-20)      4 (0-9)
K          89                57 (47-68)                16 (8-23)      26 (17-35)     1 (0-3)
L          88                58 (48-68)                17 (9-25)      19 (11-28)     6 (1-11)
M          86                58 (48-69)                8 (2-14)       31 (22-41)     2 (0-6)
N          89                66 (56-76)                13 (6-21)      20 (12-29)     0
O          89                66 (56-76)                10 (4-16)      18 (10-26)     6 (1-10)

*There is a significant difference among hospitals for those procedures judged crucial vs less than crucial (ie, appropriate, uncertain, or inappropriate) (P=.04). There was no significant difference for inappropriate vs other categories (P=.19) or inappropriate and uncertain vs appropriate and crucial (P=.31).

The inappropriate rate of use of coronary angiography described in this study also differs significantly from the results recently reported by Graboys et al,[15] who concluded that 50% of coronary angiographies are not indicated. Their conclusion was based on the evaluation of 168 self-selected patients who sought second opinions during a 7-year period beginning in 1981. The two studies are not comparable. First, ours was a population-based study, which used a randomized sample that was representative of all patients who underwent coronary angiography in New York State in 1990. Second, patients in the study by Graboys et al were healthier and were referred for elective angiography; 89% were either asymptomatic or had mild angina (class I or II). Fewer than one third of patients in our study were in these categories; 53% of our patients underwent coronary angiography for an acute myocardial infarction or unstable angina.

It is likely that some of the patients in the study by Graboys et al would have been classified as inappropriate by our criteria. In particular, 27% of our asymptomatic patients were judged inappropriate, but they represented only 4% of the patients undergoing coronary angiography. However, since the clinical reasons used to judge a case as inappropriate are not described in sufficient detail in the article by Graboys et al, it is impossible to tell whether patients with similar clinical characteristics would be judged the same with regard to appropriateness. In contrast, our criteria, as previously mentioned, are explicit and in the public domain so that clinicians can assess their face and content validity.

It is also important to consider the results of this article in relationship to the results reported in the other two articles[7,8] in this series. The appropriateness of use of these diagnostic and therapeutic cardiovascular procedures within a single state varied significantly by procedure (Table 6). For example, the crucial rate was 82% for CABG, 64% for coronary angiography, and 35% for PTCA. Conversely, the uncertain rate was 7% for CABG, 20% for coronary angiography, and 38% for PTCA. This variation was not explained by hospital location, volume of cardiovascular procedures, teaching status, or whether PTCA and CABG were performed at the hospital where the coronary angiography was performed. Thus, even within a single specialty, appropriate use of one procedure does not necessarily lead to appropriate use of another. Identifying inappropriate use requires directly assessing the appropriate use of each procedure independently, since extrapolation of data from one procedure to another may lead to erroneous conclusions. This process may be justified for all expensive, frequently used, or high-risk procedures.

Table 5. Appropriateness of Use of Coronary Angiography by Hospital Characteristics*

                          Volume†                    Location‡                    Teaching§                  Attached‖
Appropriateness           Low, %       High, %       Upstate, %    Downstate, %   Yes, %       No, %         Yes, %       No, %
Appropriate and crucial   58 (50-66)   65 (61-70)    64 (59-69)    64 (57-71)     64 (57-61)   64 (59-69)    65 (60-70)   62 (55-69)
Appropriate               11 (9-18)    11 (7-15)     13 (9-16)     11 (6-15)      10 (6-15)    13 (10-17)    11 (6-15)    14 (11-17)
Uncertain                 25 (22-28)   19¶ (18-21)   18 (17-20)    21# (19-24)    22 (19-24)   18¶ (17-20)   20 (18-22)   20 (10-24)
Inappropriate             4 (3-5)      3 (1-5)       5 (4-6)       4 (3-5)        4 (3-5)      4 (4-5)       4 (3-5)      4 (3-6)

*Numbers in parentheses are 95% confidence intervals.
†Low-volume hospitals performed fewer than 750 coronary angiographies in 1989.
‡Downstate hospitals include those from New York City, Long Island, and Westchester County.
§Teaching hospitals are the primary acute care facility associated with a medical school.
‖Attached hospitals have the ability to perform revascularization procedures (eg, percutaneous transluminal coronary angioplasty or coronary artery bypass graft).
¶P<.01.
#P<.05.

Table 6. Appropriateness of Use of Coronary Angiography, Percutaneous Transluminal Coronary Angioplasty (PTCA), and Coronary Artery Bypass Graft (CABG) in New York State in 1990*
Appropriateness, % (95% Confidence Interval)

Procedure              No. of Patients   Appropriate and Crucial   Appropriate    Uncertain      Inappropriate
Coronary angiography   1335              64 (59-69)                12 (8-15)      20 (19-22)     4 (3-5)
PTCA                   1306              35 (31-39)                23 (20-25)     38 (35-41)     4 (2-6)
CABG                   1338              82 (80-85)                8 (7-10)       7 (5-9)        2 (2-3)

*Percentages may not add up to 100 due to rounding.

How might this information on the appropriateness of these three procedures be used? The answer will vary depending on who is viewing the data. Government officials under strong political pressure to reduce health care costs might authorize public funds to pay for only those procedures rated crucial, because these services must be made available to everyone enrolled in the public programs. Cardiologists, on the other hand, would feel obligated to offer their patients every possible chance to improve their health. While cardiologists might agree that none of these procedures should be offered for inappropriate indications (ie, 4% of angiographies and PTCAs and 2% of CABGs), they might feel strongly that these procedures should be made available for all other indications, if desired by the patient, and that the procedures should be paid for by the government or insurance companies.

The most important player in making this decision is, of course, the patient. Unfortunately, the patient's attitudes toward this decision are unknown. The patient may trust the physician's recommendations and want the physician to have the freedom to recommend the best treatment and have the government or insurance company pay for it. Then only inappropriate care would not be available. However, regardless of their individual preferences, patients (ie, the public) may agree that only the procedures judged appropriate (ie, 90% of CABGs, 58% of PTCAs, and 76% of coronary angiographies) would be paid for as part of a basic benefits package or subsidized by public money.

In summary, the current study described in this series of three articles was designed to examine overuse of three cardiovascular procedures. We found little evidence of inappropriate use of any of these procedures in New York State; however, a significant proportion of two of the three procedures are being performed for uncertain indications for which benefit and risk are thought to be about equal. Additional clinical research will help to define more precisely how much benefit or risk is associated with use for these indications.
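The three payment scenarios discussed above (paying only for crucial care, paying for all care judged appropriate, or paying for everything except inappropriate care) can be tallied from the Table 6 point estimates. The following is a minimal Python sketch rather than anything taken from the article: the policy names and the `covered_share` helper are hypothetical labels introduced for illustration, and the confidence intervals are ignored.

```python
# Table 6 point estimates: percent of procedures in each appropriateness category.
TABLE_6 = {
    "coronary angiography": {"crucial": 64, "appropriate": 12, "uncertain": 20, "inappropriate": 4},
    "PTCA": {"crucial": 35, "appropriate": 23, "uncertain": 38, "inappropriate": 4},
    "CABG": {"crucial": 82, "appropriate": 8, "uncertain": 7, "inappropriate": 2},
}

# Hypothetical policy labels: each maps to the categories a payer would cover.
POLICIES = {
    "crucial_only": ("crucial",),
    "crucial_and_appropriate": ("crucial", "appropriate"),
    "all_but_inappropriate": ("crucial", "appropriate", "uncertain"),
}


def covered_share(procedure, policy):
    """Return the percent of a procedure's cases covered under a policy."""
    categories = POLICIES[policy]
    return sum(TABLE_6[procedure][category] for category in categories)


if __name__ == "__main__":
    for procedure in TABLE_6:
        for policy in POLICIES:
            print(f"{procedure:22s} {policy:26s} {covered_share(procedure, policy)}%")
```

Under these assumptions, the "crucial_and_appropriate" policy reproduces the 90%, 58%, and 76% figures quoted in the text for CABG, PTCA, and coronary angiography. Because the published percentages are rounded, a procedure's shares need not sum to exactly 100.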
What remains unanswered is whether patients who could benefit from coronary angiography, PTCA, and CABG are not receiving the procedure. Are the procedures being underused, especially in underserved or minority populations? The same ratings developed in this study should be applied to patients who could benefit from these procedures but who may not be receiving them. This study of underuse will take on added importance as cost-containment pressures increase and reimbursement for physicians changes. Improving the health of the American public will require simultaneous elimination of both underuse and overuse.

This work was supported by grants from the Commonwealth Fund, Morgan Guaranty Trust, the New York Community Trust, and the New York State Department of Health.

We thank Frederick Parker, MD, and the members of the New York Cardiac Advisory Committee for their support and advice during this project. We also wish to thank Laurie McDonald for computer programming; Barbara Genovese, David Hadorn, MD, Carol Roth, RN, MPH, Margorie Sherwood, MD, Amar Iqbal, MD, and Jacqueline Kosecoff, PhD, for their assistance in abstractor training, data analysis, and project coordination. We are also grateful to Harry Feder, MPA, and Dorothy Knowlton, RN, of the Island Peer Review Organization for their tireless efforts in data collection. Finally, we express our deepest appreciation to the members of the Coronary Angiography Appropriateness Panel (Gottlieb Friesinger, MD; Sidney Goldstein, MD; David Hickam, MD; Robert Jones, MD; George Kaiser, MD; Spencer King, MD; Patrick Scanlon, MD; Joseph Scherger, MD, MPH; and William Sheldon, MD), who contributed their time, scholarship, and insight.

References

1. Gillum RF. Coronary artery bypass surgery and angiography in the United States, 1979-1983. Am Heart J. 1987;113:1255-1260.
2. National Center for Health Statistics. Detailed diagnoses and procedures: National Hospital Discharge Survey, 1989. Vital Health Stat 13. 1991;108:110.
3. Wenneker MB, Epstein AM. Racial inequalities in the use of procedures for patients with ischemic heart disease in Massachusetts. JAMA. 1989;261:253-257.
4. Gray D, Hampton JR, Bernstein SJ, Kosecoff JB, Brook RH. Audit of coronary angiography and bypass surgery. Lancet. 1990;335:1317-1320.
5. Chassin MR, Kosecoff J, Solomon DH, Brook RH. How coronary angiography is used: clinical determinants of appropriateness. JAMA. 1987;258:2543-2547.
6. Bernstein SJ, Kosecoff J, Gray D, Hampton JR, Brook RH. The appropriateness of the use of cardiovascular procedures: British versus US perspectives. Int J Technol Assess Health Care. In press.
7. Hilborne LH, Leape LL, Bernstein SJ, et al. The appropriateness of use of percutaneous transluminal coronary angioplasty in New York State. JAMA. 1993;269:761-765.
8. Leape LL, Hilborne LH, Bernstein SJ, et al. The appropriateness of use of coronary artery bypass graft surgery in New York State. JAMA. 1993;269:753-760.
9. Park RE, Fink A, Brook RH, et al. Physician ratings of appropriate indications for six medical and surgical procedures. Am J Public Health. 1986;76:766-772.
10. Bernstein SJ, Laouri M, Hilborne LH, et al. Coronary Angiography: A Literature Review and Ratings of Appropriateness and Necessity. Santa Monica, Calif: RAND; 1992. Publication JRA-03.
11. Kish L. Survey Sampling. New York, NY: John Wiley & Sons Inc; 1965.
12. Cochran WG. Sampling Techniques. 3rd ed. New York, NY: John Wiley & Sons Inc; 1977.
13. Huber PJ. The behavior of maximum likelihood estimates under non-standard conditions. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley: University of California Press; 1967;1:221-233.
14. Chassin MR, Kosecoff J, Park RE, et al. Does inappropriate use explain geographic variations in the use of health care services? a study of three procedures. JAMA. 1987;258:2533-2537.
15. Graboys TB, Biegelsen B, Lampert S, Blatt CM, Lown B. Results of a second-opinion trial among patients recommended for coronary angiography. JAMA. 1992;268:2537-2540.

HEALTHY PEOPLE 2000
National Health Promotion and Disease Prevention Objectives
U.S. Department of Health and Human Services
Public Health Service

Healthy People 2000 is a statement of national opportunities. Although the Federal Government facilitated its development, it is not intended as a statement of Federal standards or requirements. It is the product of a national effort, involving 22 expert working groups, a consortium that has grown to include almost 300 national organizations and all the State health departments, and the Institute of Medicine of the National Academy of Sciences, which helped the U.S. Public Health Service to manage the consortium, convene regional and national hearings, and receive testimony from more than 750 individuals and organizations. After extensive public review and comment, involving more than 10,000 people, the objectives were revised and refined to produce this report.

1. Introduction

Healthy People: The Economics of Prevention

Despite the overall health improvements achieved as a result of preventive interventions, the Nation continues to be burdened by preventable illness, injury, and disability. In 1960, the share of the Gross National Product (GNP) going to medical services was 5 percent. It is estimated to reach nearly 12 percent in 1990. Lost economic productivity attendant to illness and early death compounds the impact of this problem, so that in 1980 the total costs of illness equalled nearly 18 percent of GNP.
Injury alone now costs the Nation well over $100 billion annually, cancer over $70 billion, and cardiovascular disease $135 billion. Sophisticated technology for the diagnosis and treatment of disease conditions has outstripped society's ability to pay for it. But many of these expenses are avoidable (Fig. 1.3). Coronary artery disease affects approximately 7 million Americans and causes about 1.5 million heart attacks and 500,000 deaths a year. The number of coronary bypass procedures performed each year is approaching 300,000, each one of these procedures at a cost of approximately $30,000. A representative cost for treating a single case of lung cancer is $29,000 and $28,000 for invasive cervical cancer. A liver transplant for alcoholic cirrhosis can cost $250,000 or more. The lifetime treatment costs per patient are $570,000 for quadriplegia from a spinal cord injury, $354,000 for congenital rubella syndrome, and $75,000 for Acquired Immunodeficiency Syndrome (AIDS). Yet virtually all of these conditions are preventable. Mobilizing the considerable energies and creativity of the Nation in the interest of disease prevention and health promotion is an economic imperative.

Fig. 1.3 Costs of treatment for selected preventable conditions

Heart disease
  Overall magnitude: 7 million with coronary artery disease; 500,000 deaths/yr; 284,000 bypass procedures/yr
  Avoidable intervention and cost per patient*: Coronary bypass surgery, $30,000
Cancer
  Overall magnitude: 1 million new cases/yr; 510,000 deaths/yr
  Avoidable intervention and cost per patient*: Lung cancer treatment, $29,000; Cervical cancer treatment, $28,000
Stroke
  Overall magnitude: 600,000 strokes/yr; 150,000 deaths/yr
  Avoidable intervention and cost per patient*: Hemiplegia treatment and rehabilitation, $22,000
Injuries
  Overall magnitude: 2.3 million hospitalizations/yr; 142,500 deaths/yr; 177,000 persons with spinal cord injuries in the United States
  Avoidable intervention and cost per patient*: Quadriplegia treatment and rehabilitation, $570,000 (lifetime); Hip fracture treatment and rehabilitation, $40,000; Severe head injury treatment and rehabilitation, $310,000
HIV infection
  Overall magnitude: 1-1.5 million infected; 118,000 AIDS cases (as of Jan 1990)
  Avoidable intervention and cost per patient*: AIDS treatment, $75,000 (lifetime)
Alcoholism
  Overall magnitude: 18.5 million abuse alcohol; 105,000 alcohol-related deaths/yr
  Avoidable intervention and cost per patient*: Liver transplant, $250,000
Drug abuse
  Overall magnitude: Regular users: 1-3 million, cocaine; 900,000, IV drugs; 500,000, heroin. Drug-exposed babies: 375,000
  Avoidable intervention and cost per patient*: Treatment of drug-affected baby, $63,000 (5 years)
Low birth weight baby (LBWB)
  Overall magnitude: 260,000 LBWB born/yr; 23,000 deaths/yr
  Avoidable intervention and cost per patient*: Neonatal intensive care for LBWB, $10,000
Inadequate immunization
  Overall magnitude: Lacking basic immunization series: 20-30%, aged 2 and younger; 3%, aged 6 and older
  Avoidable intervention and cost per patient*: Congenital rubella syndrome treatment, $354,000 (lifetime)

*Representative first-year costs, except as noted. Not indicated are nonmedical costs, such as lost productivity to society.
Source: Data compiled from various sources by the Office of Disease Prevention and Health Promotion

Healthy People 2000: The Challenge and Goals

The Nation has within its power the ability to save many lives lost prematurely and needlessly. Implementation of what is already known about promoting health and preventing disease is the central challenge of Healthy People 2000. But Healthy People 2000 also challenges the Nation to move beyond merely saving lives. The health of a people is measured by more than death rates. Good health comes from reducing unnecessary suffering, illness, and disability. It comes as well from an improved quality of life. Health is thus best measured by citizens' sense of well-being. The health of a Nation is measured by the extent to which the gains are accomplished for all the people.

The challenge of Healthy People 2000 is to use the combined strength of scientific knowledge, professional skill, individual commitment, community support, and political will to enable people to achieve their potential to live full, active lives.
It means preventing premature death and preventing disability, preserving a physical environment that supports human life, cultivating family and community support, enhancing each individual's inherent abilities to respond and to act, and assuring that all Americans achieve and maintain a maximum level of functioning.

The purpose of Healthy People 2000 is to commit the Nation to the attainment of three broad goals that will help bring us to our full potential (Fig. 1.4). We have a broad array of opportunities to achieve our goals. This report presents many of these opportunities in the form of measurable targets, or objectives, to be achieved by the year 2000, organized into 22 priority areas. The first 21 of these areas are grouped into three broad categories: health promotion; health protection; and preventive services (Fig. 1.5).

Fig. 1.4 Healthy People 2000 Goals
• Increase the span of healthy life for Americans
• Reduce health disparities among Americans
• Achieve access to preventive services for all Americans

Health promotion strategies are those related to individual lifestyle (personal choices made in a social context) that can have a powerful influence over one's health prospects. These priorities include physical activity and fitness, nutrition, tobacco, alcohol and other drugs, family planning, mental health and mental disorders, and violent and abusive behavior. Educational and community-based programs can address lifestyle in a crosscutting fashion.

Health protection strategies are those related to environmental or regulatory measures that confer protection on large population groups. These strategies address issues such as unintentional injuries, occupational safety and health, environmental health, food and drug safety, and oral health. Interventions applied to address these issues are generally not exclusively protective in nature (there may be a substantial health promotion element as well), but the principal approaches involve a communitywide rather than individual focus.

Fig. 1.5 Healthy People 2000 Priority Areas

Health Promotion
1. Physical Activity and Fitness
2. Nutrition
3. Tobacco
4. Alcohol and Other Drugs
5. Family Planning
6. Mental Health and Mental Disorders
7. Violent and Abusive Behavior
8. Educational and Community-Based Programs

Health Protection
9. Unintentional Injuries
10. Occupational Safety and Health
11. Environmental Health
12. Food and Drug Safety
13. Oral Health

Preventive Services
14. Maternal and Infant Health
15. Heart Disease and Stroke
16. Cancer
17. Diabetes and Chronic Disabling Conditions
18. HIV Infection
19. Sexually Transmitted Diseases
20. Immunization and Infectious Diseases
21. Clinical Preventive Services

Surveillance and Data Systems
22. Surveillance and Data Systems

Age-Related Objectives
Children
Adolescents and Young Adults
Adults
Older Adults

Preventive services include counseling, screening, immunization, or chemoprophylactic interventions for individuals in clinical settings. Priority areas for these strategies include maternal and infant health, heart disease and stroke, cancer, diabetes and chronic disabling conditions, HIV infection, sexually transmitted diseases, and infectious diseases. Crosscutting professional and access considerations in the delivery of clinical preventive services are also addressed.

A special category has been established for surveillance and data systems. Given the centrality of monitoring progress toward the stated targets in the overall approach of Healthy People 2000, the integrity of our data collection efforts at every level is critical. Objectives have therefore been established to improve those efforts.

Finally, because issues and approaches vary by age, chapters are included for each of four age groups: children, adolescents and young adults, adults, and older adults. Objectives related to each of these age groups are found throughout the priority areas.
To give them special emphasis, some of the key targets have been collected and presented according to these four ages.

The full set of objectives with commentary is presented as Part II of Healthy People 2000. The material presented here in Part I defines the overall national agenda and outlines goals, objectives, and strategies for change. Chapter 2 of Part I reviews the challenges for people in various age groups. Chapter 3 addresses high risk populations. Chapter 4 presents the broad goals. Chapter 5 gives synopses of each of the priority areas with selected examples of the objectives addressed. Chapter 6 reviews the challenge for implementation for various groups throughout the Nation.

The last chapter deserves special comment. Healthy People 2000 uses the three approaches of health promotion, health protection, and preventive services as organizing categories, but running through the priority areas and the objectives is a common theme of shared responsibility for carrying out this national agenda. Achievement of the agenda depends heavily on changes in individual behaviors. It requires use of legislation, regulation, and social sanctions to make the social and physical environment a healthier place to live. It calls on medical and health professionals to prevent, not just to treat, the diseases and conditions that result in premature death and chronic disability. All are necessary. None is sufficient alone to achieve Healthy People 2000's goals and objectives.

The challenge spelled out in Healthy People 2000 calls upon communities to translate national objectives into State and local action. To accomplish this, a new edition of Model Standards (Healthy Communities 2000: Model Standards, Guidelines for Attainment of Year 2000 Objectives for the Nation) provides a flexible planning tool to enable communities to share in the various efforts necessary to attain these objectives.
The volume covers the priority areas of Healthy People 2000 and includes all of the national objectives that call for action at the community level. It offers community implementation strategies for putting the objectives of Healthy People 2000 into practice and encourages communities to establish achievable community health targets.

References

Bureau of the Census. Projections of the Numbers of Households and Families: 1986 to 2000. Washington, DC: U.S. Department of Commerce, 1986.
Public Health Service. Promoting Health/Preventing Disease: Objectives for the Nation. Washington, DC: U.S. Department of Health and Human Services, 1980.
Health Care Financing Administration, Office of the Actuary. Expenditures and percent of gross national product for national health expenditures, by private and public funds, hospital care, and physician services; calendar years 1960-87. Health Care Financing Review 10:2, Winter 1988.
Rice, D.P.; MacKenzie, E.J.; Jones, A.S.; Kaufman, S.R.; deLissovoy, G.V.; Max, W.; McLoughlin, E.; Miller, T.R.; Robertson, L.S.; Salkever, D.S.; and Smith, G.S. Cost of Injury in the United States: A Report to Congress, 1989. San Francisco, CA: Institute for Health and Aging, University of California and Injury Prevention Center, The Johns Hopkins University, 1989.
Hodgson, T.A., and Rice, D.P. Economic impact of cancer in the United States. In: Schottenfeld, D., ed. Cancer Epidemiology and Prevention. Chapter 13, in press.
Kutscher, R.E. Projections 2000: Overview and implications of the projections to 2000. Monthly Labor Review September, 1987.
National Center for Health Statistics. Health, United States, 1989 and Prevention Profile. DHHS Pub. No. (PHS)90-1232. Hyattsville, MD: U.S. Department of Health and Human Services, 1990.
Passel, J.E., and Woodrow, K.A. "Immigration to the United States." Paper presented to the Census Table, August 1986.
Public Health Service. Healthy People: Surgeon General's Report on Health Promotion and Disease Prevention.
Washington, DC: U.S. Department of Health and Human Services, 1979.
Shapiro, S.; Venet, W.; Strax, L.; and Roeser, R. Selection, Followup, and Analysis in the Health Insurance Plan Study: A Randomized Trial With Breast Cancer Screening. National Cancer Institute Monographs 67:65-74, 1985.
Spencer, G. Projections of the Hispanic Population: 1983-2080. Current Population Reports. Population Estimates and Projections. Series P-25, No. 995. Washington, DC: U.S. Department of Commerce, Bureau of the Census, 1986.
Spencer, G. Projections of the population of the United States, by age, sex, and race: 1988 to 2080. Current Population Reports. Population Estimates and Projections. Series P-25, No. 1018. Washington, DC: U.S. Department of Commerce, Bureau of the Census, 1989.

For Official Use Only 5/11/93

Title: "Small Area Analysis and the Medical Care Outcome Problem" by John Wennberg, AHCPR Conference Proceedings: Research Methodology: Strengthening Causal Interpretations of Nonexperimental Data. (May 1990)

This article reviews the literature showing that medical practice varies widely among regions and localities in the United States, without an accompanying variation in the severity of patients' illnesses or in the quality of health outcomes achieved. Dr. Wennberg notes that this variation is costly, and that eliminating it could also have great benefits for patients' well-being.

Feature Article

Small Area Analysis and the Medical Care Outcome Problem

John E. Wennberg, M.D.

Dr. Wennberg is Professor of Epidemiology in the Department of Community and Family Medicine, Dartmouth Medical School.

AHCPR Conference Proceedings: Research Methodology: Strengthening Causal Interpretations of Nonexperimental Data. (May 1990) DHHS Pub. No. (PHS) 90-3454.

Introduction

For years, the per capita costs of hospitalization for the residents of Boston have been about twice as great as the costs for the residents of New Haven. In the early 1970s, the chances that a child would reach age 15 with tonsils in place were over 90 percent in Middlebury, Vermont, but less than 40 percent in Morrisville. Once informed of the high rate of tonsillectomy, the physicians of Morrisville reviewed their practice patterns, and the incidence rate of tonsillectomy dropped to become one of the lowest in Vermont, a rate that has persisted until the present day.

These statistics are examples of the information provided by small area analysis (SAA), a technique that uses large administrative data bases to obtain population-based measures of utilization and resource allocation. As such, SAA is part of a broader inquiry that targets the health system itself for epidemiologic investigation. This inquiry, which could be called "medical care epidemiology," has proven useful in a spectrum of studies ranging from policy analyses of fee-for-service medicine to studies of the outcomes of care. The data bases involved and the methods of analysis have been described elsewhere (Wennberg and Gittelsohn, 1980); an overview of the basic methods of SAA is provided in the Appendix to this article.

This article describes how SAA can be used to interpret the influence of physician "practice style" on the level of health care resources utilization and expenditures among defined populations. The term "practice style" denotes clinical decision rules held idiosyncratically by individual physicians or by members of "schools of thought" that, when analyzed, can be shown not to be based on reasonably well-tested hypotheses concerning outcomes of care or on accurate assessments of the utility of care to patients. The evidence from SAA reflects the dominating influence of practice style in the decision to hospitalize or perform major surgery.

The ubiquitous influence of practice style can be traced to defects in the scientific and ethical bases of medical practice. These defects include (1) the failure to give priority to evaluation of the outcomes of care and (2) the failure to place appropriate emphasis on the patient's preferences when making value judgments that affect clinical choices. Thus, there is a direct and important connection between small area variations and uncertainty about the efficacy and effectiveness of medical care. The need for information, from the perspective of patient welfare and the potential for reducing the costs of unnecessary care, is great. The challenge is to improve the scientific basis of clinical practice by a program of research to reduce uncertainties about the basic probabilities for outcomes associated with alternative treatments and to structure medical decisions so the preferences of patients matter. Another challenge is to learn how to improve the quality and effectiveness of care in local settings. The final sections of this article emphasize the importance of nonexperimental techniques in medical care outcome studies.

The Small Area Variation Phenomenon

The classic description of the small area variation phenomenon is Glover's (1938) account of differences in the tonsillectomy rate among British school children:

    Comparison of some of the rates in different areas in 1931 ... revealed striking contrasts in areas apparently somewhat similarly circumstanced. Thus, in that year the operation rate in Margate was eight times that in Ramsgate; that of Enfield was six times that of Wood Green and four times that of Finchley; that of Bath five times that of Bristol; that of Guilford four times that of Reigate; that of Salisbury three times that of Winchester.

Since Glover's time, extensive variations for total surgery and for specific procedures such as hysterectomy, prostatectomy, back surgery, and tonsillectomy have been documented among nations and large regions, neighboring communities within the United States (Figure 1), the United Kingdom, Norway, Denmark, Sweden, Switzerland, Australia, and Canada (see Barnes, 1982; Bloor, 1976; Bunker, 1970; McPherson, Wennberg, Hovind, and Clifford, 1982; L.L. Roos, 1979; N.P. Roos, 1984; Wennberg and Gittelsohn, 1973); between geographically separated but apparently homogeneous members of insurance plans (Bloor, 1976; Lewis, 1969); and between enrollees in prepaid group practices (Luft and Hunt, 1986).

Figure 1. 1980 surgery rates in 23 Iowa hospital service areas with populations < 20,000
[Dot plot of rates for four operations: T&A, hysterectomy, prostatectomy, and hernia.]
Source: McCracken, Laiessa, and Wennberg (1982)
NOTE: Each dot represents the rate for an operation in 1 of the 23 most populated hospital market areas in Iowa. Inguinal hernia operations show relatively little variation compared to the other operations. The pattern of variation is stable from year to year and from region to region. The data are for inpatient operations only; at the time of this study, these operations were inpatient procedures.

The variations sometimes imply extraordinary differences in the lifetime probabilities of having an operation. In the early 1970s in the Morrisville area of Vermont, 65 percent of children were estimated to have had tonsillectomies by age 15. In Middlebury, only 7 percent underwent the operation. For prostatectomy, the rates in some communities predict that 15 percent of males will undergo the operation by age 85, while in others more than half of the male population can expect to undergo the operation by the same age. Rates for hysterectomy predict that less than 20 percent of women will have the operation by age 70 in some communities, while in others the rate is over 70 percent.

Recent studies show that the degree of variation in medical and pediatric admissions is even more extensive than that for surgical conditions (Wennberg, McPherson, and Caper, 1984).
In Maine, the rate of hospital admission for pneumonia varies by a factor of 5 for adults and by a factor of more than 10 for children. Admissions for back injuries vary more than ten-fold among Maine, Massachusetts, Iowa, and California.

There are also extensive variations in per capita expenditures and resource allocations among communities. In the United States, total expenditures for hospital care typically vary more than two-fold among the communities of a State, as do the numbers of hospital beds and personnel invested on a per capita basis in health care. These differences are seen among rural communities and among urban areas with highly sophisticated health care systems. Throughout the 1970s, per capita costs for hospital care in Rumford, Maine, were twice those of neighboring Farmington, and in Boston they were twice as high as in New Haven, as shown in Figure 2 (Wennberg and Gittelsohn, 1980).

The comparisons between Boston and New Haven are of special interest because residents in those communities are hospitalized most often in teaching hospitals staffed by physicians affiliated with Harvard, Boston University, Tufts, or Yale medical schools. Some 92 percent of hospitalizations for the residents of New Haven and 85 percent for Boston occur in teaching hospitals. The residents thus have access to state-of-the-art academic medicine, yet hospital costs per capita were $889 for Boston and $451 for New Haven residents in 1982. A number of hospital markets with high-quality care (judged on the basis of the percentage of residents who are hospitalized in teaching hospitals) have relatively low costs, showing that teaching hospitals, when viewed from the "bottom line" indicator of cost per capita, need not be expensive. Reimbursements reported for Medicare Part A (which covers hospital costs) were 80 percent higher in 1982 for enrollees living in Suffolk County (Boston) than in New Haven County ($1,894 versus $1,088).
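
The per-person figures above can be scaled to program-level dollars by simple multiplication. The sketch below is only an illustration of that arithmetic: it uses the 1982 per-person figures quoted in the article together with the Boston enrollee and resident counts the article gives, and the function and variable names are hypothetical.

```python
# Per-person dollar figures for 1982, as quoted in the article.
PART_A_PER_ENROLLEE = {"Boston": 1_894, "New Haven": 1_088}   # Medicare Part A
HOSPITAL_COST_PER_CAPITA = {"Boston": 889, "New Haven": 451}  # all residents

# Population counts the article gives for Boston.
BOSTON_MEDICARE_ENROLLEES = 78_000
BOSTON_RESIDENTS = 685_000


def counterfactual_savings(population, actual_rate, reference_rate):
    """Dollars saved if `population` were served at `reference_rate`
    per person instead of `actual_rate` per person."""
    return population * (actual_rate - reference_rate)


medicare_savings = counterfactual_savings(
    BOSTON_MEDICARE_ENROLLEES,
    PART_A_PER_ENROLLEE["Boston"],
    PART_A_PER_ENROLLEE["New Haven"],
)
hospital_savings = counterfactual_savings(
    BOSTON_RESIDENTS,
    HOSPITAL_COST_PER_CAPITA["Boston"],
    HOSPITAL_COST_PER_CAPITA["New Haven"],
)

print(f"Medicare Part A: ${medicare_savings / 1e6:.0f} million")   # $63 million
print(f"All hospital care: ${hospital_savings / 1e6:.0f} million")  # $300 million
```

Both results agree with the rounded totals reported in the text that follows.
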
If New Haven reimbursements had applied to the 78,000 enrollees living in Boston, the outlays of the Medicare program for hospitals would have been $63 million less, $85 million rather than the actual $148 million. The higher costs for hospitalization for Boston enrollees were not offset by lower physician costs: Medicare Part B reimbursements were 59 percent higher in Boston ($753 versus $473) (Health Care Financing Administration, 1983).

The increased costs in Boston were associated with the use of greater numbers of beds and hospital personnel and an increased incidence of hospitalization. If the expenditure cost per capita for hospital care observed in New Haven had applied to the residents of Boston in 1982, the expenditures would have been $300 million less than they were. If the utilization of hospitals by Bostonians were the same as for residents of New Haven (on an age- and sex-standardized basis), 739 of the beds now used to treat the 685,000 Boston residents could be closed.

¹ For the methods used for estimating organ removal, see Gittelsohn and Wennberg (1976).

Figure 2: Hospital expenditures in Connecticut and Massachusetts (1975)
[Dot plot of per capita hospital expenditures by hospital market, Connecticut and Massachusetts; axis from $100 to $300, with labeled values of $153 and $324.]
NOTE: Each dot represents one of the most populated market areas in Connecticut or Massachusetts. Per capita expenditures for hospitals are generally lower in Connecticut, but there is a two-fold range of variation. The circled dots represent the New Haven and Boston markets, where the majority of hospitalizations occur in teaching hospitals.

Importance of the Practice Style Factor, Illustrated by Patterns of Admission to Hospitals

Small area analysis is particularly well suited to studying the effects of differing practice styles on health care utilization rates.
The numbers of physicians whose clinical decisions contribute to the overall rate for a specific treatment often are relatively small, so variations that may arise from individual physician differences in diagnostic style or therapeutic choice are not masked by averaging as they are when larger geographic areas are compared. When rates of utilization among neighboring communities are compared, variation not related to demand (and/or errors in the data) must have its immediate origin in differences in the way physicians make diagnoses or recommend treatments, or differences in the way the agency role is executed by different groups of physicians.

The simple model given in Figure 3 may help the reader to see how four factors combine to produce the overall rates of hospitalization or surgery for a population: illness rates (Factor 1), decisions of individual patients to contact physicians (Factor 2), the diagnostic decisions of physicians (Factor 3), and the treatment or prescription decisions of physicians (Factor 4). To understand the variations in treatment rates, it is necessary to understand the relative contribution of patient demands (Factors 1 and 2) and physician decision rules (Factors 3 and 4) for individual conditions or illnesses.

The importance of practice style can be illustrated by contrasting the amount of variation in the hospitalization rates for specific diseases or conditions among small areas. A few "anchor" or "low-variation" conditions have been found for which the contribution of physician decisionmaking to the rates of hospitalization is constrained by a demonstrable professional consensus. The classic example is hospitalization for fracture of the hip: all physicians agree on the criteria for diagnosis, and the implementation of these criteria is objective and reproducible by virtually all physicians. Physicians also agree on the necessity of hospitalization.
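
The way the four factors multiply can be sketched in a few lines. This is a deliberately collapsed illustration, not the full Figure 3 model: it assumes a single condition and community-wide average probabilities instead of sums over individuals, physicians, and diagnostic labels. The function name and every number in it are hypothetical.

```python
def expected_admissions(population, illness_rate, contact_prob,
                        diagnosis_prob, treatment_prob):
    """Expected admissions for one condition in one community.

    Community-average version of the four-factor product:
    Factor 1 (illness) x Factor 2 (contact) x Factor 3 (diagnosis)
    x Factor 4 (treatment decision), scaled by population size.
    """
    return (population * illness_rate * contact_prob
            * diagnosis_prob * treatment_prob)


# Hypothetical "anchor" condition: Factors 2-4 are effectively 1 in both
# communities, so admissions track the illness rate alone.
anchor_a = expected_admissions(10_000, 0.002, 1.0, 1.0, 1.0)
anchor_b = expected_admissions(10_000, 0.002, 1.0, 1.0, 1.0)

# Hypothetical discretionary condition: identical illness rates, but
# physicians in community B admit three times as often (Factor 4).
discretionary_a = expected_admissions(10_000, 0.010, 1.0, 1.0, 0.3)
discretionary_b = expected_admissions(10_000, 0.010, 1.0, 1.0, 0.9)

print(anchor_b / anchor_a)                 # no variation between communities
print(discretionary_b / discretionary_a)   # roughly three-fold variation
```

With illness and contact rates held equal, the only way the second ratio can depart from 1 is through the physician-controlled factors, which is the logic behind using low-variation conditions as a baseline.
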
Factors 3 and 4 do not vary from physician to physician, and any systematic variation in the hospital admission rates must relate to differences in illness rates (Factor 1), differences in access (Factor 2), and/or errors in the data, a source of variation not summarized in the model. By comparing variations in the incidence of hospitalization for the low-variation conditions to other causes of hospitalization for which professional behavior is less constrained, it is possible to obtain an indication of the impact on total variation that likely derives from the practice style factor.

Orthopedic injuries and discretionary hospitalization. The pattern of variation in hospitalization rates for specific orthopedic injuries provides a good example of the power of SAA to uncover and characterize the importance of discretionary clinical decisionmaking in determining the rate of utilization. Specific types of injuries appear to have a characteristic pattern of variation, depending on the nature of the injury (Fig. 4). It is unlikely that this is due to the failure of injured people to seek medical attention, since it can be asserted with some confidence that virtually all patients who break their hips, ankles, or forearms will seek the attention of physicians (Factor 2). Nor is it likely that the reasons reflect differences in diagnostic styles or skills among physicians, particularly for fractures (Factor 3). Except in rare circumstances, fractures are easily diagnosed by physical examination or x-ray. The two factors in the model presented in Figure 3 that are likely to contribute to the variations are differences between communities in the incidence of fracture or differences in the decisions physicians make concerning the use of hospitals.

It is this author's contention that the low-variation pattern seen for hip fracture in Figure 4 is related to Factor 1 (illness or fracture rate) and to errors in the data. The decision to hospitalize is narrowly constrained because fractures of the hip are very serious injuries, with high death rates. The professional standard of practice is firm in its requirement that all patients be hospitalized. Thus the incidence of hip fracture itself is the only relevant factor that can vary.

In contrast to hip fracture, the implicit standards of practice for other, less serious injuries are not as constraining with regard to the decision to hospitalize. There is discretion for the physician, and some cases can be and often are treated in the outpatient department. Compare, therefore, the pattern of variation of hip fracture, for which discretion in professional decisionmaking on the need to hospitalize plays no role, to that of ankle fracture, for which physician discretion does play a role. It can be inferred that the striking variation in the hospitalization rates for ankle fractures, as compared with hip fractures, occurs because the clinical decision to hospitalize varies among communities. In some communities the standard of care allows a greater proportion of ankle fractures to be treated on an outpatient basis than in other communities (Fig. 4).²

Figure 3. A simple model of the effects of illness rates and patient and physician decisions on diagnoses, treatment decisions, and the surgery rate

Variations in the incidence of surgery may occur because of four sets of factors that are intrinsic to the patients or to the physician. Illness rates may vary (Factor 1); patients may vary in their proclivity to seek care and whom they seek it from (Factor 2). Once patients seek the care of a physician, physicians may vary in their diagnoses (Factor 3) and/or in the treatments they prescribe (Factor 4). The probability for operation $S_k$ for a particular disease $g$ can be expressed as a function of the conditional probabilities of these four factors:

Factor 1 (illness rates): $P_{gh}$ = probability of disease $g$ in individual $h$
Factor 2 (physician contact rate, given illness): $Q_{i|g,h}$ = probability of care from physician $i$ given $g$ and $h$
Factor 3 (a physician's diagnosis, given contact and illness): $R_{j|g,h,i}$ = probability of condition label $j$ given $g$, $h$, and $i$
Factor 4 (a physician's decision on need for a specific procedure, given the diagnosis, contact, and illness): $S_{k|g,h,i,j}$ = probability of operation $k$, given $g$, $h$, $i$, and $j$

The expected number of operations $k$ may be represented as the sum over all disease states $g$ of individuals $h$, condition label $j$, and physicians $i$:

\[ E(S_k) = \sum_{g}\sum_{h}\sum_{i}\sum_{j} P_{gh}\, Q_{i|g,h}\, R_{j|g,h,i}\, S_{k|g,h,i,j} = \text{expected number of operations } k \text{ in population} \]

Finally, the overall rate of surgery $SR$ in a population of size $N$ is a function of the expected number of procedures for each constituent operation that is part of current practice of medicine:

\[ E(S_{\cdot}) = \sum_{k} E(S_k) = \text{total expected number of operations in population} \]

\[ SR = \frac{E(S_{\cdot})}{N} \]

Most clinicians, and probably most patients, would agree that fractures of the forearm are, on average, less severe than ankle fractures and do not necessarily need hospitalization. The greater "zone of discretion" for forearm fractures leads to greater variation in hospitalization rates for fracture of the forearm than for fracture of the ankle. It is not surprising that rates of hospitalization for fractures of the forearm are considerably more variable than those for ankle fractures. Indeed, there is an eight-fold range in variation from a low in one market area that is only 30 percent of the State average to more than twice the State rate in another. Hospitalization rates for knee injuries and low back injuries are even more variable, showing a 12- to 15-fold variation.

It is possible to estimate the percentage of the variation in the incidence of hospitalization for the injuries in Figure 4 that might reasonably be attributed to differences in discretionary clinical decisionmaking between communities. McPherson and colleagues (1982) have developed a statistic (the systematic component of variation, or SCV) for estimating the magnitude of variation that improves upon traditional measures by removing the component of variation attributable to sample size (see Appendix). Under the assumption that variations in the incidence rates for common fractures are about the same, the magnitude of variation in the hospitalization rates for hip fractures can be used to estimate the proportion of variation in hospitalization rates attributable to illness rate and data errors (Table 1). The residual variation according to the model is likely due to practice style.

Figure 4. Distribution of variations in hospital admission rates for common orthopedic problems
[Dot plot; x-axis: standardized admissions ratio (log scale), 0.25 to 2; rows include fracture of hip, fracture of ankle, knee injury, and lower back injury.]
NOTE: Data are age adjusted and expressed as the ratio to the State average. Each dot is the rate in 1 of the 15 most populated hospital markets in Maine, 1980-1982. The person years of experience range from 35,019 in Houlton to 581,543 in Portland. The expected number of hospitalizations for the least frequent injury in Houlton is 26.7.

The pattern of variation in use of major surgery. Variations in the incidence of major surgery (surgery that physicians agree must be performed in the hospital) suggest that practice style plays an important role in the use of some but not all major surgical procedures (Fig. 1). Studies of inguinal hernia repair (an operation that until recently was not considered safe for ambulatory surgery) show the low variation pattern typical for hip fracture, with an SCV usually less than 10.³

In the United States, inguinal hernias are uniformly treated by surgical repair, and the diagnosis is relatively straightforward and often made by the patient or the patient's parent; variations are likely if the incidence of hernia varies or if patients have different access to hospital care because they are uninsured. Like hip fracture, the uniformity in the standards of care and the absence of opportunity for misdiagnosis minimize the potential for variations arising from the way physicians exercise the agency role. Appendectomy is universally prescribed for appendicitis, but the mimetic effect of other nonsurgical sources of abdominal pain sometimes leads to unnecessary surgery; thus misdiagnosis is an additional source of variation for this procedure (Factor 3). Within a region, appendectomy typically shows greater variation than inguinal hernia (the SCV is typically about 25), but it is less than for most other operations.⁴

² While it is logically possible that the incidence of ankle fracture is much more variable from community to community than the incidence of hip fracture, there are neither sound theoretical reasons nor any epidemiological data suggesting this is so.

³ In recent years, however, some inguinal hernia repairs have been performed on outpatients. The SCV for this operation is now rising.

⁴ There are, however, interesting variations in the rates of appendectomy by region, with rates uniformly higher in California and Iowa compared to the East Coast. Regional differences also have been observed in gall bladder surgery that McPherson and others (1985) suggest may be due to regional differences in the incidence of gallstones.

In contrast, hysterectomy, prostatectomy, and, particularly, tonsillectomy show large variations; in addition to variation arising because of possible differences in patient demand and professional diagnostic acumen, the decision to use these operations is often discretionary in the sense that legitimate alternatives exist and are often used by ethical, well-trained physicians. Hysterectomy typically has an SCV of about 60. The medical literature contains many articles that demonstrate the lack of professional consensus on the value of this operation for treating a number of gynecologic conditions.⁵ Prostatectomy is similarly variable. Although there is considerably less overt controversy about this operation (and some health services researchers have classified the operation as "necessary"⁶), an examination of the literature reveals many examples of uncertainty about the probabilities of outcomes for treating patients with urinary tract disease. Tonsillectomy (with an SCV generally greater than 200) is an operation that historically has engendered considerable debate about efficacy, including one article that labeled the procedure "ritualistic surgery" (Bolande, 1969).

More recent studies show that the pattern of variation for these operations is similar in other States, including Massachusetts, Iowa, and California. It also appears to be typical for regions and health districts in countries with very different methods of organizing and financing health care, providing additional evidence that the pattern of variation for a specific operation is intrinsic to the operation, rather than to social organization or financial incentives. McPherson and colleagues (1982) have shown that the patterns of variation for nine operations were similar among North America, Norway, and the United Kingdom. The exception was for inguinal hernia repair in the United Kingdom, which was significantly more variable (SCV 44) than in Norway (SCV 2) and New England (SCV 6).
In contrast to Norway and the United States, clinicians in the United Kingdom use the truss as an option for inguinal hernia; the researchers suggest that the increased variation for this operation is related to the differences in the way this choice is exercised among regions.

Discretionary professional decisionmaking appears to play a significant role in the use of most major surgery. Under the working hypothesis that the degree of variation in the surgery rate measures the relative importance of professional discretion in affecting the decision to use the operation, each operation can be ranked on the SCV to identify highly variable operations for which professional consensus on outcome-based standards is effectively absent. This method reveals that most surgical operations exhibit a variation profile greater than that typically seen for hysterectomy. For example, a recent unpublished study compared the rates of hospitalization in 16 larger communities in California, Iowa, New York, and Massachusetts. Several of these communities contain well-known academic medical centers, including Stanford, the University of Iowa, the University of Rochester, the University of Massachusetts (Worcester), the University of California (Sacramento-Davis), Yale, and the three medical schools located in Boston. Table 2 lists the SCV and other measures of variability for operations with more than 400 cases. Most are more variable than hysterectomy. For the majority of operations studied, there is a good deal of consistency in the SCV ranking from one region to another. The high SCV for appendectomy observed in this study derives from the systematically higher rates for this operation in Iowa and California compared with New England and New York.

The pattern of variation of medical admissions and minor surgery. As a group, medical and minor surgical hospitalizations are considerably more variable than major surgical operations.
For example, in the 16-area study referred to above, adult medical conditions showed a weighted average SCV of 144, more than twice the variation seen for hysterectomy. Nonsurgical pediatric admissions at 292 and hospitalizations for minor surgery at 272 were considerably more variable, about the same as for back injury. Major surgery, with a weighted SCV of 82, was the least variable.

Adult medical conditions. Among adults, only three medical conditions consistently demonstrate the low to moderate variation pattern: strokes, gastrointestinal hemorrhages, and heart attacks. All are conditions for which patients can be expected to seek care, they are reasonably well diagnosed, and physicians usually agree on the need for hospitalization.⁷ All other adult medical conditions, including many other manifestations of chronic cardiovascular disease, and all pediatric hospitalizations and minor surgical admissions show substantially more variation. For these, the practice style factor should be assumed to exercise a substantial influence on the pattern of utilization. Table 3 lists the SCV and other statistical measures of variation for the 40 discrete causes of admission obtained in our 16-area study that constitute about 71 percent of adult medical admissions.

⁵ See Wennberg, Bunker, and Barnes, 1980, for a summary of controversies concerning the use of the common operations discussed here.

⁶ The definition was based on the responses of a panel of physicians who were asked to rate prostatectomy along with other operations on a scale of necessity. Prostatectomy was ranked as necessary because of fear that the underlying condition posed a threat to life that could be reduced by the operation (Bombardier, Fuchs, Lillard, and Warner, 1977). A recent assessment of prostatectomy outcomes (discussed here) suggests this consensus is incorrect.

⁷ There are, of course, errors in the diagnosis of myocardial infarction. Goldberg and colleagues (1986) reviewed a sample of hospital records for patients whose hospital discharge abstracts contained the diagnosis of acute myocardial infarction. Of those records, 11 percent failed to meet the strict criteria for presence of a myocardial infarction, suggesting that variation due to Factor 3 does influence hospitalization rates for this condition. In addition, interesting clusters of high and low community rates for myocardial infarction have been found in several States. Note in Table 3 that 10 of the 16 areas have chi-square tests for rates with p values less than .01, and the range in rates is two-fold. It is not that acute myocardial infarction rates do not vary; they simply vary much less than most conditions.

Pediatric admissions and minor surgery. None of the pediatric causes of admission demonstrate the low- or moderate-variation "anchor" pattern seen for a few adult medical and surgical conditions (Table 4). In the 16-area study the lowest SCV was about 50, nearly as great as that seen for hysterectomy. For minor surgery, the lowest SCV was for foot operations (SCV 80). These uniformly high SCVs suggest that variability in decisionmaking concerning inpatient versus outpatient treatment is a very significant factor in the relative costliness for pediatric hospitalization and minor surgery. The differences among the academic communities are particularly impressive because they show that wide differences in utilization are compatible with academic standards for treatment.

Table 1. Variation in hospitalization rates for various injuries and practice style

                        Amount of variation in       Estimate of percentage of
Type of injury          hospitalization rate (SCV)   variation related to practice style
Fracture of hip                    7                              0
Fracture of ankle                 47                             85
Fracture of forearm              138                             95
Knee injury                     11 6                             95
Back injury                      296                             98
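
The percentages in Table 1 are consistent with treating the hip fracture SCV as a baseline for illness-rate and data-error variation and attributing the excess to practice style. The sketch below assumes that reading; the formula is inferred from the table's numbers rather than stated in the article, and the function name is hypothetical. The knee injury row is left out because its SCV is not legible in this copy.

```python
def practice_style_share(scv, baseline_scv):
    """Percentage of a condition's SCV in excess of a low-variation baseline.

    Assumed reading of Table 1: the baseline (hip fracture) SCV stands in for
    variation due to illness rates and data errors; the rest is attributed
    to practice style.
    """
    if scv <= 0:
        raise ValueError("SCV must be positive")
    return max(0.0, 100.0 * (scv - baseline_scv) / scv)


# SCV values as printed in Table 1.
TABLE_1_SCV = {
    "Fracture of hip": 7,
    "Fracture of ankle": 47,
    "Fracture of forearm": 138,
    "Back injury": 296,
}

baseline = TABLE_1_SCV["Fracture of hip"]
shares = {
    injury: round(practice_style_share(scv, baseline))
    for injury, scv in TABLE_1_SCV.items()
}

for injury, share in shares.items():
    print(f"{injury}: {share} percent")
```

Under this assumption the computed values (0, 85, 95, and 98 percent) match the legible rows of the table.
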
Table S compares the age-adjusted rates for admissions, length of stay, and patient day rates in several of these communities for all pediatric nonsurgical cases. The data are presented as ratios of the rates in each area to that of Rochester, the lowest rate area. Practice Patterns Can Change With Feedback of Information on Utilization and Outcomes gram (MMAP). Originally funded as a demonstration project by the Commonwealth Fund and the Robert Wood Johnson Foundation, the MMAP, which now receives its core funding from Maine Blue Cross, has organized a series of study groups made up of practicing physicians interested in specific common treatments. Topics selected for study groupreviewhave included back surgery, pediatric admissions, tonsillectomy, prostatectomy, and hysterectomy. For each of these, the process has led to significant reductions in utilization rates in certain high-rate areas (American Medical Association, 1986), sometimes by simply making physicians aware of their differences. In other cases, reductions were achieved because of direct, welldocumented peer pressure brought by the study group on the physicians whose clinical strategies were judged to be unreasonable. Figure 5 depicts the declining number of hysterectomies performed onresidentsof area II in Maine over a 13 year period. Beginning in 1979, members of the State association of obstetrics and gynecology met with the physicians practicing in area II to discuss the indications for hysterectomy and present the data for their area. Subsequently, the hysterectomy rates dropped close to the State average where they have remained.Through their actions, the MMAP study groups provide a mechanism for assuring that the extreme examples of variation are brought under scrutiny, in effect imposing some implicit regional standards of care for the operations they have elected to study. However, such activities, while useful inreducingvariations, do not address the fundamental question of efficacy. 
Moreover, the feasibility of using SAA for the systematic detection of outlier hospitals or physicians is highly dependent on geographic circumstance. A more important outcome of the feedback process is the interest it evokes among practicing physicians to address underlying uncertainties about outcomes. Discussions among practicing physicians about the rates of service in their own and neighboring areas have led to questions about the outcome significance of different practice styles. The activities of the study group concerned with variations in prostatectomy provide an example. Prostatectomy rates showed a three-fold variation among Maine communities in the 1970s (Wennberg and Gittelsohn, 1980). The prostatectomy study group, composed of practicing urologists broadly representative of Maine practitioners, met to consider the variations and discuss how their different approaches to practice might be contributing to the variations. Information from Medicare claims (discussed below) was developed within the context of the study group's efforts to come to grips with the variations. The magnitude of the death rate in the 3-month period following the operation surprised the urologists and confirmed their concern about the need to document the benefits of the operation. This, in turn, led to the active participation of Maine urologists in the design and execution of a prospective interview study of outcomes of their own prostatectomy patients. It is of interest that the cost of the assessment was considerably less than the savings that could be derived from the reduction in hospitalization.

Another argument for the importance of practice style is the documented change that can accompany the feedback of information on utilization and outcomes. This evidence is important, not only because the causal relationships between professional decisions and variations in utilization in these examples are direct, but also because the constructive response of the medical profession to feedback holds significant promise for reforms that could substantially improve clinical decisionmaking. The literature contains several reports of changes in practice patterns that have occurred following the feedback of information to physicians on rates of utilization in their own and in comparative practices (Eisenberg, 1986). The most extensive effort to combine the continuous monitoring of small area utilization rates and the feedback of data on variations with programs of professional education is the Maine Medical Assessment Program.

8 There must be relatively few surgeons who contribute to the rate for an area before the overall rate in an area may be attributed to individuals. The "true" population served by individual hospitals or physicians within an area is not known. It should also be remembered that the rate for an area is the average for all relevant populations and physicians (Fig. 3); an area with an average rate may be average because of a mixture of physicians with low and high thresholds for performing a particular operation.

Table 2. Measures of variation in the incidence of inpatient surgery among 16 university hospital or large community hospital market areas

Surgical procedure | Number of cases | SCV | Coefficient of variation | Ratio H/L | Outliers(a): count | Outliers(a): percentage
Colectomy | 3,910 | 6.8 | .116 | 1.47 | 0 | .0
Resection of small intestine | 1,017 | 8.9 | .142 | 1.75 | 1 | 6.3
Pneumonectomy | 505 | 27.9 | .213 | 2.72 | 1 | 6.3
Inguinal hernia repair | 9,795 | 28.2 | .152 | 2.01 | 7 | 43.8
Simple mastectomy | 359 | 34.0 | .266 | 2.71 | 0 | .0
Open heart surgery | 1,439 | 37.6 | .232 | 2.29 | 2 | 12.5
Extended simple radical mastectomy | 2,012 | 39.7 | .214 | 2.21 | 3 | 18.8
Hysterectomy | 10,055 | 66.0 | .275 | 2.60 | 9 | 56.3
Cholecystectomy | [8,558?] | 66.0 | .231 | 2.22 | 13 | 81.3
Embolectomy, lower limb artery | 529 | 73.4 | .364 | 4.10 | 2 | 12.5
Proctectomy | 927 | 74.4 | .272 | 3.01 | 2 | 12.5
Pacemaker insertion | 3,430 | 80.2 | .281 | 2.63 | 9 | 56.3
Thyroidectomy | 949 | 90.9 | .342 | 3.35 | 6 | 37.5
Appendectomy | 5,381 | 93.3 | .305 | 2.86 | 10 | 62.5
Total hip replacement | 1,717 | 95.0 | .353 | 2.99 | 8 | 50.0
Repair of retina | 1,134 | 96.6 | .274 | 3.12 | 3 | 18.8
Prostatectomy | 6,379 | 100.1 | .327 | 3.12 | 14 | 87.5
Coronary bypass surgery | 3,744 | 116.9 | .383 | 3.62 | 7 | 43.8
Mastoidectomy | 569 | 120.3 | .461 | 4.03 | 5 | 31.3
Aorto-iliac-femoral bypass | 551 | 122.8 | .384 | 4.07 | 2 | 12.5
Diaphragmatic hernia | 2,178 | 135.6 | .369 | 3.45 | 10 | 62.5
Stapes mobilization | 606 | 135.8 | .483 | 4.28 | 4 | 25.0
Spinal fusion with or without disc excision | 1,234 | 151.9 | .520 | 5.20 | 5 | 31.3
Peripheral artery bypass | 1,455 | 154.6 | .359 | 4.36 | 7 | 43.8
Cardiac catheterization | 9,952 | 156.2 | .443 | 4.48 | 11 | 68.8
Excision of intervertebral disc | 4,240 | 161.6 | .433 | 5.09 | 9 | 56.3
Graft replacement of aortic aneurysm | 491 | 162.6 | .402 | 6.26 | 6 | 37.5
Laparotomy | 4,126 | 227.1 | .471 | 5.60 | 11 | 68.8
Total knee replacement | 998 | 261.7 | .525 | 7.42 | 10 | 62.5
Carotid endarterectomy | 1,471 | 412.0 | .825 | 19.39 | 9 | 56.3

(a) Outlier defined by one degree of freedom chi-square test, p value .01 or less.
NOTE: Causes of admission ranked by systematic component of variation (SCV).

Table 3. Measures of variation in the hospitalization rate among 16 university or large community hospital market areas

Causes of hospitalization | Number of cases | Percentage of observations | SCV | Coefficient of variation | Ratio H/L | Outliers(a): count | Outliers(a): percentage

Moderate-variation conditions
specific cerebrovascular disorders | 7,001 | 2.56 | 13.7 | .161 | 1.72 | 3 | 18.8
gastrointestinal hemorrhage | 5,110 | 1.87 | 20.3 | .220 | 1.85 | 6 | 37.5
acute myocardial infarction | 11,282 | 4.12 | 35.2 | .221 | 2.03 | 10 | 62.5

High-variation conditions
heart failure and shock | 9,874 | 3.61 | 51.3 | .322 | 1.91 | 10 | 62.5
G.I. obstruction | 2,006 | .73 | 61.2 | .297 | 2.45 | 6 | 37.5
cardiac arrhythmias | 5,723 | 2.09 | 72.7 | .295 | 2.56 | 9 | 56.3
chest pain | 4,289 | 1.57 | 80.3 | .296 | 2.45 | 9 | 56.3
respiratory neoplasms | 4,370 | 1.60 | 83.9 | .263 | 2.72 | 9 | 56.3
nutritional and metabolic diseases | 3,725 | 1.36 | 91.6 | .356 | 3.56 | 10 | 62.5
transient ischemic attacks | 3,201 | 1.17 | 99.5 | .321 | 3.32 | 11 | 68.8
infectious disease diagnoses | 4,321 | 1.58 | 100.7 | .354 | 2.97 | 10 | 62.5
urinary tract stones | 3,398 | 1.24 | 101.3 | .296 | 3.26 | 12 | 75.0
peripheral vascular disorders | 3,475 | 1.27 | 102.5 | .336 | 3.25 | 10 | 62.5
syncope and collapse | 2,625 | .96 | [illegible] | .374 | 3.50 | 8 | 50.0
adult simple pneumonias | 8,581 | 3.13 | 116.[?] | .370 | 3.37 | 13 | 81.3
angina pectoris | 7,381 | 2.70 | 120.6 | .345 | 3.56 | 14 | 87.5
miscellaneous injuries to extremities | 2,784 | 1.02 | 122.0 | .404 | 3.46 | 8 | 50.0
disorders of the biliary tract | 1,966 | .72 | 132.3 | .408 | 3.75 | 9 | 56.3
red blood cell disorders | 2,903 | 1.06 | 136.2 | .376 | 4.15 | 9 | 56.3
circulatory disorders, except AMI, with cardiac catheterization | 6,892 | 2.52 | 140.3 | .449 | 3.80 | 13 | 81.3
respiratory signs & symptoms | 1,713 | .63 | 141.8 | .403 | 4.12 | 9 | 56.3

Very high-variation conditions
adult bronchitis and asthma | 7,916 | 2.89 | 157.9 | .372 | 3.44 | 15 | 93.8
adult gastroenteritis | 17,497 | 6.39 | 168.5 | .428 | 4.40 | 13 | 81.3
seizures and headaches | 4,567 | 1.67 | 173.9 | .410 | 4.10 | 11 | 68.8
kidney and urinary tract infections | 4,324 | 1.58 | 176.9 | .452 | 4.43 | 14 | 87.5
trauma to skin, subcutaneous tissue, and breast | 2,674 | .98 | 179.2 | .455 | 5.77 | 10 | 62.5
female reproductive system diagnoses | 4,566 | 1.67 | 180.2 | .425 | 3.81 | 12 | 75.0
chronic obstructive lung disease | 5,782 | 2.11 | 181.6 | .481 | 4.52 | 13 | 81.3
digestive malignancy | 2,003 | .73 | 201.0 | .452 | 3.76 | 10 | 62.5
chemotherapy | 2,622 | .96 | 222.9 | .535 | 8.25 | 7 | 43.8
deep vein thrombophlebitis | 1,368 | .50 | 224.7 | .457 | 5.43 | 10 | 62.5
toxic effects of drugs | 5,130 | 1.87 | 242.3 | .695 | 9.22 | 7 | 43.8
cellulitis | 4,468 | 1.63 | 277.8 | .551 | 6.74 | 9 | 56.3
adult diabetes | 4,959 | 1.81 | 298.1 | .489 | 4.14 | 14 | 87.5
atherosclerosis | 3,570 | 1.30 | 311.2 | .670 | 8.44 | 12 | 75.0
medical back problems | 12,943 | 4.73 | 312.0 | .483 | 11.30 | 13 | 81.3
peptic ulcer | 1,667 | .61 | 325.0 | .529 | 5.48 | 13 | 81.3
hypertension | 2,439 | .89 | 399.4 | .628 | 8.38 | 14 | 87.5
minor skin disorders | 1,346 | .49 | 426.9 | .657 | 7.73 | 13 | 81.3
adult otitis media and upper respiratory infection | 1,643 | .60 | 466.8 | .841 | 15.91 | 9 | 56.3

All medical DRG groups | 273,869 | 63.44 | 151.7 | .386 | | |

(a) Outlier defined by one degree of freedom chi-square test, p value .01 or less.
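The column headings of Tables 2 and 3 name several summary measures without giving formulas. The following is a minimal sketch of three of them, under stated assumptions: the coefficient of variation is taken as the unweighted standard deviation of the area rates divided by their mean, the chi-square statistic is taken as (observed - expected)^2 / expected for each area, and the function names and sample numbers are hypothetical. The systematic component of variation (SCV) is not attempted here because the text does not define it.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(rates):
    """Standard deviation of the area rates divided by their mean (unweighted; an assumption)."""
    return stdev(rates) / mean(rates)


def high_low_ratio(rates):
    """Ratio of the highest area rate to the lowest (the 'Ratio H/L' column)."""
    return max(rates) / min(rates)


def chi_square_1df_p_value(observed, expected):
    """p value for a one-degree-of-freedom chi-square statistic (O - E)^2 / E."""
    statistic = (observed - expected) ** 2 / expected
    # For one degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


def outlier_summary(observed_counts, expected_counts, alpha=0.01):
    """Return (number of outlier areas, percentage of areas that are outliers)."""
    flags = [
        chi_square_1df_p_value(o, e) <= alpha
        for o, e in zip(observed_counts, expected_counts)
    ]
    return sum(flags), 100.0 * sum(flags) / len(flags)


# Hypothetical rates and counts, for illustration only (not taken from the tables).
example_rates = [2.0, 3.0, 4.0]
example_observed = [130, 110, 70, 100]
example_expected = [100, 100, 100, 100]

if __name__ == "__main__":
    print(coefficient_of_variation(example_rates))
    print(high_low_ratio(example_rates))
    print(outlier_summary(example_observed, example_expected))
```

With 16 market areas, the outlier percentage is simply the outlier count divided by 16; for instance, 9 flagged areas give 56.25 percent, which the tables report as 56.3.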
NOTE (Table 3): Cases classified by modified DRG for medical causes of admission for patients 15 years of age or older. Causes of admission ranked by systematic component of variation (SCV).

Table 4. Measures of variation in the incidence of hospitalization for pediatric nonsurgical conditions and minor surgery among 16 university or large community market areas

Cause of hospitalization or procedure | Number of cases | Percentage of observations | SCV | Coefficient of variation | Ratio H/L | Outliers(a): count | Outliers(a): percentage

Pediatric nonsurgical conditions
endocrine diagnoses | 1,246 | 3.53 | 49.6 | .253 | 2.11 | 7 | 43.8
traumatic stupor and coma | 707 | 2.00 | 57.4 | .293 | 2.91 | 3 | 18.8
circulatory diagnoses | 771 | 2.18 | 127.7 | .358 | 5.68 | 4 | 25.0
seizure and headache | 1,445 | 4.09 | 152.4 | .395 | 4.22 | 9 | 56.3
fractures, sprains, strains, dislocated extremities | 1,007 | 2.85 | 245.5 | .460 | 4.60 | 9 | 56.3
dental diseases | 369 | 1.05 | 250.5 | .534 | 7.46 | 4 | 25.0
toxic effects of drugs | 837 | 2.37 | 254.9 | .484 | 5.64 | 7 | 43.8
bronchitis and asthma | 4,814 | 13.63 | 255.7 | .461 | 4.63 | 10 | 62.5
skin, subcutaneous tissue diagnoses | 1,344 | 3.81 | 262.4 | .490 | 6.10 | 8 | 50.0
viral illness and fevers of unknown origin | 1,377 | 3.90 | 264.0 | .546 | 5.40 | 13 | 81.3
concussion | 821 | 2.33 | 287.6 | .495 | 5.50 | 9 | 56.3
gastroenteritis | 4,388 | 12.43 | 393.6 | .570 | 6.25 | 15 | 93.8
urinary tract infections | 517 | 1.46 | 449.1 | .626 | 8.58 | 11 | 68.8
mental diagnoses | 890 | 2.52 | 452.0 | .736 | 8.85 | 9 | 56.3
simple pneumonia and pleurisy | 2,828 | 8.01 | 514.6 | .805 | 13.46 | 12 | 75.0
laryngotracheitis | 1,094 | 3.10 | 532.7 | .572 | 17.33 | 11 | 68.8
otitis media and upper respiratory infection | 1,940 | 5.49 | 899.3 | .764 | 22.82 | 10 | 62.5
all pediatric cases | 35,308 | * | 291.8 | .500 | | | 49.7

Minor surgical cases
foot operations | 3,731 | 1.72 | 81.1 | .314 | 2.92 | 6 | 35.3
lens operations | 11,738 | 5.40 | 81.2 | .355 | 3.33 | 11 | 64.7
hand operations except ganglion | 3,121 | 1.43 | 111.6 | .317 | 2.96 | 12 | 70.6
pediatric hernia operations | 1,344 | .62 | 126.2 | .360 | 4.67 | 7 | 41.2
minor genito-urinary tract operations | 5,506 | 2.53 | 142.0 | .424 | 5.46 | 11 | 64.7
minor knee operations | 6,251 | 2.87 | 151.4 | .363 | 4.83 | 16 | 94.1
miscellaneous ear, nose and throat operations | 4,699 | 2.16 | 206.3 | .457 | 4.92 | 14 | 82.4
other female laparoscopic operations | 2,453 | 1.13 | 363.9 | .542 | 5.19 | 14 | 82.4
breast biopsy and local excision for nonmalignancy | 1,637 | .75 | 427.9 | .589 | 11.08 | 11 | 64.7
adenoidectomy and other T&A operations | 2,378 | 1.09 | 473.3 | 1.083 | 12.53 | 13 | 76.5
D&C, conization except for malignancy | 5,045 | 2.32 | 494.8 | .623 | 7.01 | 14 | 82.4
tubal interruption for nonmalignancy | 1,305 | .60 | 803.5 | .904 | 14.28 | 11 | 64.7
dental extractions and restorations | 1,519 | .70 | 969.7 | 1.039 | 25.54 | 15 | 88.2
laparoscopic tubal interruptions | 1,436 | .66 | 1565.1 | 1.025 | 39.83 | 14 | 82.4
all minor surgery | 217,535 | 23.98 | 277.1 | .344 | | | 55.8

(a) Outlier defined by one degree of freedom chi-square test, p value of .01 or less.
NOTE: Causes of admission ranked by SCV.

Table 5. Hospital utilization in different market areas for pediatric nonsurgical admissions

Market area | Admissions per 1,000 | Average length of stay | Patient days per 1,000
Boston | 2.67 | 1.14 | 3.04
Worcester | 2.71 | 0.83 | 2.25
Palo Alto | 1.22 | 1.21 | 1.48
New Haven | 1.79 | 1.02 | 1.83
Iowa City | 1.44 | 0.92 | 1.33
Rochester | 1.00 | 1.00 | 1.00

The Practice Style Factor and Availability of Hospital Beds

Small area studies are well suited for studying the associations between geographically fixed resources such as the quantity of hospital beds and the utilization of services. Quite apart from the question of how some communities come to have many more beds per capita than others, it is useful to inquire about the effect of varying the numbers of beds per capita on the rates of utilization of hospitals. Shain and Roemer (1959) have suggested that more beds mean more utilization, but it is, of course, the physicians who admit the patients. When one population has more hospital beds in comparison with another, how do physicians use the additional beds? What services are deployed in areas with high bed availability that are not deployed in areas with low bed availability? Is there evidence of scarcity in low-rate areas? Can evidence for rationing be found in low-rate areas?

Small area studies document a strong statistical association between allocated beds per capita and admission rates for medical conditions. The rates for major surgery are less dependent on the supply of beds (McCracken, Latessa, and Wennberg, 1982). When the number of available hospital beds is greater, the beds tend to be used for medical and minor surgical cases as proportionately more highly variable medical conditions (conditions for which hospitalization rates are substantially uncorrelated with morbidity) are admitted. The situation in Boston (4.5 allocated beds per 1,000) compared with New Haven (2.9 beds per 1,000) illustrates this phenomenon as shown in Table 6 (Wennberg, Freeman, and Culp, 1987). For major surgery, the admission rates for Boston and New Haven residents are virtually the same. The admission rates for stroke, heart attack, and bleeding from the gastrointestinal tract (moderate-variation medical conditions) are 6 percent greater for Boston, suggesting that morbidity levels are not very different in the two communities. By contrast, admission rates for high-variation medical conditions are 56 percent greater in Boston. For pediatrics and minor surgery, they are 47 and 38 percent higher, respectively. For the 684,000 residents of Boston, the higher patient day rates require 625 more beds for these conditions than would be required if the New Haven hospitalization rate were applicable (Wennberg and others, 1987).

9 The importance of the practice style factor in determining the use of hospitals as well as the lack of correlation between indicators of population need and the utilization of hospitals (see the section on demand-related factors) suggest that the causal relationship is not in the direction of need creating the stimulus to build more beds. Case studies by Altman, Green, and Sapolsky (1981) capture the importance of non-health-related factors in determining the numbers of beds available in a community.

[Figure 5. Changes in practice patterns 1973-85 among gynecologists in Maine (Area II) following feedback and review. The plot shows observed and expected values by year, 1973-85.]
NOTE: The differences shown here were explained largely by the clinical practices of two of the area's five gynecologists.

The high-variation medical conditions account for 501 of the excess beds used. Seven conditions (medical back problems, gastroenteritis, heart failure, simple pneumonia, diabetes, respiratory neoplasms, and bronchitis and asthma) account for more than 29 percent of the excess beds (Table 7). High admission rates rather than excessive lengths of stay contribute more to the high patient day rates.

The lower rate of use of medical services among residents of New Haven has been accomplished without producing a scarcity of beds as measured by an increased bed occupancy rate. The occupancy rates of the two hospitals in New Haven are about 85 percent, indicating that additional resources are available if needed. The weighted occupancy rate for Boston hospitals is also about 85 percent. Indeed, the independence of the per capita quantity of beds and the occupancy rate (the latter a measure of the actual scarcity of beds experienced by local clinicians) is a general phenomenon: the weighted occupancy rates of local hospitals show virtually no correlation with the admission or the patient day rates among small areas (McCracken and others, 1982; Wennberg, Gittelsohn, and Shapiro, 1975).

Corroborating evidence for a lack of conscious rationing caused by relatively low bed supply comes from the clinicians themselves.
Boston and New Haven clinicians, a number of whom have practiced in both cities, were asked to estimate their own rates (high or low) and to guess which city has the higher rate and the magnitude of the difference. Until they were directly informed, most did not have a perception of the rates in their own areas, much less any appreciation of the magnitude of the difference between New Haven and Boston. Moreover, in discussions held with chiefs of services at the Yale-New Haven Hospital about their low utilization rates, there was no recognition that the low rate of bed use and per capita cost in New Haven implied rationing.

The strong correlation between per capita beds and admission rates for high-variation medical conditions, the lack of correlation between occupancy rate and per capita beds, and the failure of clinicians in low-rate areas to perceive that rationing is occurring in their own market areas suggest the following interpretation. The pattern of variation revealed by SAA suggests that only a small fraction of admissions are for conditions where a clear consensus exists on the need for hospitalization. Demand for these services, which is exogenous, places no great strain on available resources. Most of the available resources are deployed in the pursuit of health benefits for patients with high-variation conditions where the rates of hospitalization or surgery are only loosely correlated with illness because admission decision thresholds are variably set by the physicians themselves. The clinical hypotheses that govern the deployment of hospital resources for many patients with high-variation conditions are weak, implicit, individualistic, and untested. Given the strong consensual hypotheses that govern the decision to hospitalize any patient who has a low-variation condition such as hip fracture or myocardial infarction and any needing colectomy for cancer of the colon, there appears to be a strong ethic to reserve enough beds for patients with consensual high need for hospitalization. Thus within the range of variation in bed supply, the threshold for admitting patients with high-variation conditions appears to be adjusted to assure the availability of marginal resources for the low-variation conditions.

Table 6. Comparative rates of hospital utilization for Boston and New Haven

Case mix by type of admission | Ratio of rates, Boston to New Haven: admissions per 1,000 | Ratio of rates, Boston to New Haven: average length of stay | Excess beds used for this case mix
Adult medical cases, low variation | 1.06 | 1.11 | 26
Adult medical cases, high variation | 1.56 | 1.11 | 501
Pediatric medical cases | 1.47 | 1.16 | 35
Surgical cases, minor surgery | 1.38 | 1.17 | 89
Surgical cases, major surgery | 1.00 | 1.13 | [illegible]

Table 7. Conditions contributing to excess hospitalization in Boston

Condition | Boston to New Haven utilization ratio: discharges per 1,000 | Boston to New Haven utilization ratio: average length of stay | Number of excess Boston beds used for these conditions
Medical back problems | 3.75 | 1.05 | 33.8
Adult gastroenteritis | 1.81 | 1.14 | 26.7
Heart failure and shock | 1.40 | 1.17 | 26.5
Adult simple pneumonia | 1.33 | 1.10 | 16.8
Adult diabetes | 2.35 | 0.85 | 16.5
Respiratory neoplasms | 1.53 | 1.44 | 16.4
Adult bronchitis and asthma | 2.06 | 0.95 | 16.1

The Practice Style Factor and the Supply of Surgeons

The supply of surgeons is of considerable importance to the per capita rate of surgery as shown by the correlations between surgeons and the overall rate of surgery (Lewis, 1969; Mitchell and Cromwell, 1982; Wennberg and Gittelsohn, 1973). For medical conditions, rates tend to be uniformly high or low for all causes of admission, depending on the numbers of beds. In contrast, correlations between surgeons per capita and the rates of individual procedures are not always strong.
This is well illustrated by the surgical signature phenomenon, in which areas with equal rates of total surgery are found to have strikingly different patterns of use of individual procedures. In Figure 6, taken from studies in Maine (Wennberg and Gittelsohn, 1982), the numbers of gynecologists in the low and high hysterectomy areas were equal, but the relative proportions of various gynecologic procedures were quite different. It appears that variations in physician supply represent a less important contribution to small area variations in rates of high-variation surgical procedures than differences in the opinions about the proper indications for surgery. Left undisturbed by feedback and review or by migration of physicians in and out of an area (Roos, Flowerdew, Wajda, and Tate, 1986), the surgical signatures of a community tend to remain quite constant from year to year (Wennberg and Gittelsohn, 1980).

[Figure 6. The surgical signature. Top panel: rates of tonsillectomy, hysterectomy, varicose vein surgery, prostatectomy, hemorrhoidectomy, and all procedures in Areas I through V. Bottom panel: the same procedures in Area I by year, 1973-77.]
NOTE: The numbers on the vertical axis are the ratio of the State average rate to the area rate. The top figure gives data for the five most populous hospital areas in Maine. It shows that the rates at which specific procedures are performed within an area vary markedly and to a large degree are independent of the total operation rate. Areas II and III have the same total operation rate, but Area II exceeds in hysterectomies, with 56 percent more than the State average, while Area III exceeds in varicose veins. In each of the five areas a different procedure is performed most often; in four of the five areas, the least performed procedure is different. The numbers of surgeons and their specialty distribution do not vary to the same degree. The trend lines in the bottom figure give the rates of Area I for a 5-year period.

The comparison in rates of surgery between Boston and New Haven serves as an example. Although the overall rate for major surgery is similar in the two communities, the rates for many individual operations vary substantially (Table 8). For example, the rate of carotid endarterectomy is more than twice as high for residents of Boston, but the rate of coronary bypass operations is twice as high for New Haven residents, even though the rates of hospitalization for strokes and heart attacks suggest that the incidence rates for the relevant underlying illnesses are similar. Rates of hysterectomy are substantially higher in New Haven, while knee and hip operations are performed more frequently on residents of Boston (Wennberg and others, 1987).

Table 8. Ratios of Boston to New Haven utilization rates by selected types of major surgery

Type of major surgery | Discharges per 1,000 | Average length of stay | Days per 1,000
Carotid endarterectomy | 2.33 | 1.30 | 3.03
Total knee replacement | 1.75 | 1.22 | 2.14
Total hip replacement | 1.48 | 1.10 | 1.63
Peripheral artery bypass | 1.18 | 1.26 | 1.48
Major bowel surgery | 1.02 | 1.15 | 1.17
Inguinal hernia repair | 1.00 | 1.15 | 1.16
Pacemaker insertion | 0.96 | 1.41 | 1.35
Prostatectomy | 0.95 | 1.12 | 1.06
Appendectomy | 0.90 | 1.04 | 0.94
Open heart surgery | 0.84 | 1.20 | 1.01
Proctectomy | 0.84 | 1.11 | 0.94
Plastic repair of cystocele and rectocele | 0.83 | 0.99 | 0.82
Thyroidectomy | 0.75 | 1.14 | 0.85
Excision of intervertebral disc | 0.70 | 1.35 | 0.94
Cholecystectomy | 0.68 | 1.17 | 0.79
Hysterectomy | 0.65 | 1.12 | 0.73
Extended simple radical mastectomy | 0.63 | 1.42 | 0.90
Coronary bypass surgery | 0.49 | 1.28 | 0.63
Splenectomy | 0.49 | 1.11 | 0.54

NOTE: Table includes types of major surgery with at least 100 discharges in the combined Boston and New Haven areas.

The impact of surgeon migration on small area rates illustrates the influence of the number of available surgeons on the rates of surgery for those procedures where the decision to operate is not closely controlled by professional consensus. Area II in Maine had a low rate for laminectomy in the early 1980s. A rapid increase in back surgery followed the market entry of two neurosurgeons who invested most of their surgical work loads in this operation. Figure 7 shows the number of laminectomies performed on local residents and on residents of three adjacent areas in relation to this change in human resources. The increase in expected numbers of cases after 1982 is due almost entirely to the increases brought about by the two surgeons. Although the population served by the hospitals in Area II is less than 20 percent of the State population (including the three adjacent referral areas), the numbers of laminectomies performed by these two new surgeons resulted in nearly a doubling of the State rate. In late 1984, the Maine Medical Assessment Program study group met with the surgeons to discuss the indications for laminectomy. The rates dropped precipitously and continued to fall in 1986 (American Medical Association, 1986). Figure 7 also illustrates the counterinfluence on rates associated with the MMAP's imposition of implicit regional standards.

[Figure 7. The effect on surgery rates of physician migration in Maine, 1980-85. First panel: workers' compensation laminectomies in an urban area, observed and expected, by year.]

These patterns of allocation of surgical technology serve as examples of the disjunction between the theory of the physician as guarantor of the rational allocation of resources among competing clinical priorities and the realities of everyday surgical practice.

Demand-Related Factors Do Not Predict Differences in Population Utilization

Small area analysis provides a framework for investigating the role of Factors 1 and 2, illness rates and patient demand, in determining the population-based utilization or expenditure rates in health care markets. Some experts favor theories that patients and populations are the main sources of the differences in utilization rates among small areas. For such large differences in utilization to exist, illness rates must differ widely; others speculate that people in low-rate areas lack the necessary insurance coverage or that low-rate areas have ethnic or racial characteristics that lean toward stoicism. These and other consumer-related theories about the sources of variation have been shown to have virtually no relevance in explaining small area variations (Wennberg, 1987).

Unease about the theoretical as well as the practical implications of the variations in tonsillectomy rates in Vermont led to a formal testing of the hypothesis that patient or population factors could account for the variations. By the early 1970s, sociologists working at the University of Chicago had worked out a model of consumer determinants of demand and developed empirical tests of the relative importance of illness and economic and sociological factors in determining utilization. The goal here was to determine whether the variables that Andersen and Newman (1973) found useful for predicting individual patient demand for care were distributed differently among Vermont communities. It was already known from small area studies (Wennberg and Gittelsohn, 1973) that these communities differed as much as two-fold in overall expenditures and utilization of hospitals and up to ten-fold in the use of tonsillectomy and other elective surgery. Members of approximately 300 households in each of six different hospital market areas were interviewed to ascertain their status on a series of factors, including ethnic background, educational level, insurance coverage, and reported illness rates (Wennberg and Fowler, 1977).

Very few differences were found, and none related in a systematic way to differences in utilization or per capita costs. Indeed, the populations of the six communities were found to be remarkably similar in their actual behavior in contacting their physicians for an episode of illness or for preventive care. The survey results implied rather clearly that the differences in utilization and costs resulted from decisions their physicians made after the patients contacted them, as a result of differences in the way the agency role was exercised by the Vermont physicians. An example of the results for the two communities with the greatest differences in per capita costs and utilization of hospital care is given in Table 9. Roos and Roos (1981) report a similar lack of correlation between demand factors and use of surgery in Manitoba.

Although surveys of patient behavior are not usually undertaken in small area studies, demographic factors relating to illness and the need for care account for very little of the observed variation in use of hospital care. In a typical small area study comparing hospital utilization among the communities of a State or region, the age structure of the populations shows very weak and sometimes paradoxical correlations with hospitalization rates (Wennberg and others, 1975). The same lack of relationship holds for morbidity indices such as bed disability days. Data presented by Blumberg (1987) giving variations in morbidity and patient day rates for 18 cities in the United States show that only 5 percent of the variation in population-based hospital rate is associated with variation in population bed disability days, even though more than 15 percent of bed disability days occur in the hospital.
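As a rough reading of the Blumberg figure, and assuming that "percent of the variation ... associated with" refers to the coefficient of determination from a simple linear relationship between the two city-level rates (an interpretation, not something the text states), 5 percent of shared variation corresponds to a weak correlation:

```latex
R^2 = 0.05 \quad\Longrightarrow\quad |r| = \sqrt{0.05} \approx 0.22
```

Under that assumption, about 95 percent of the city-to-city variation in hospital use is left unaccounted for by bed disability days.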
An interesting example of the dissociation of demographic factors and utilization and costs is provided by New Haven and Boston. These communities have similar age structures, racial composition, and income profiles, even though hospitalization rates in Boston are substantially higher (Table 10).

Table 9. Association between consumer factors relevant to demand and rate of use of care in two Vermont hospital areas, 1973

Consumer factors | Randolph | Middlebury
Need for care
percentage with chronic condition | 23 | 23
percentage with restricted activity within 2 weeks of interview | 4 | 5
percentage who spent more than 2 weeks in bed within last 12 months | 5 | 5
Ability to use care
percentage below poverty | 23 | 20
percentage with health insurance | 84 | 84
percentage with regular source of physician care | 99 | 97
Access to care
percentage of patients who contacted physician within last year | 73 | 73
Use of care
hospitalization per 1,000 | 220 | 132
inpatient surgery per 1,000 | 80 | 49
Medicare Part A reimbursements ($) | 441 | 92

NOTE: Small area utilization rates have been monitored in Vermont since 1963. The residents of Middlebury consistently receive fewer services than residents of any other Vermont market area. In this study, 245 households were interviewed and health care information was obtained for 765 residents of Middlebury. No consumer-related factors helped explain the substantial differences in utilization between Middlebury and other Vermont communities. The data in this table compare Middlebury residents to those living in the Randolph area, a contiguous hospital market area located in the center of Vermont. A total of 280 households were interviewed in Randolph to obtain data on 858 residents. None of the consumer-related variables listed are significantly different.

The Rational Agency Model, Professional Uncertainty Hypothesis, and Medical Care Outcome Problem

Over the years, this author has become increasingly aware of the depth of the challenge that small area variations raise to the conventional wisdom that demand in medical markets is controlled by a consensus among practicing physicians on the appropriateness of care. This consensus, which practicing physicians are thought to adopt during their arduous years of scientific education in accredited medical schools and residency training and through lifetime postgraduate education from medical journals, textbooks, and formal courses, is assumed to be based on valid scientific information on the relative advantages of alternative treatments producing desirable outcomes.

Arrow (1963), in an influential discussion of uncertainty and the welfare economics of health care, gives expression to these assumptions in an elaboration of the rational agency hypothesis. Patients cannot themselves exercise consumer sovereignty in purchasing health care or buy the products they really want because the products are too complex for them to understand. Physicians hold the knowledge about the probabilities for outcomes, and their clinical experience allows them to make vicarious evaluations of the preferences of patients, choosing the treatment patients would choose if they had the facts. Arrow thus argues that it is rational for patients to delegate decisionmaking to physicians. While nonhealth-related factors such as physicians' need for income, their ignorance about the facts based on their failure to keep up with medical progress, or their own personal preferences (rather than those of their patients) for health outcomes could influence clinical decisionmaking and flaw the agency role, the opportunity for the physician as a seller of goods to distort the transaction is tightly constrained by a professional consensus based on the objective function of the health care system: the improvement in health status of individual patients. Moreover, departures from professional standards due to greed or ignorance about the facts on the part of individual physicians would be discovered and corrected by professional peer review. Arrow, along with many other policymakers, characterizes the physician's opportunity to influence utilization as an example of moral hazard or deviant behavior. In Arrow's conceptualization of health care markets, utilization follows the algorithm physicians apply to hip fracture: Factors 3 and 4 are tightly constrained.

The assumption that demand in health care markets is limited by professional rules and treatment paradigms based on information about outcomes and patient utilities implies that the supply of resources should not systematically affect utilization. In areas where there is a supply shortage in relation to the need for care, unmet need exists, and health care services are being rationed. In the case of hospitals, there would be a scarcity of available beds as evidenced by very high occupancy rates. When supply is in excess of need, care that has an actual negative utility to the patient might be prescribed, but such transgressions would be rare. While some physicians might knowingly break the rules to remain fully employed, the tendency for perversion inherent in the delegation of decisionmaking to the seller is controlled by professional consensus on outcomes embodied in professional ethics and enforced in practice by utilization review programs.

[Figure 7, continued. Second panel: laminectomies among residents of three adjacent areas, observed and expected, by year, 1980-86.]
NOTE (Figure 7): The dotted line represents the expected number of laminectomies based on the State average rate.
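The expected counts plotted in Figure 7 are described only as being "based on the State average rate." A minimal sketch of that calculation follows, assuming a crude (not age- or sex-adjusted) State rate applied to an area's population; the function names and every number are hypothetical and chosen only to show why expected counts in neighboring areas rise when a few surgeons raise the State total.

```python
def expected_cases(state_cases, state_population, area_population):
    """Cases an area would record if its residents experienced the State average rate."""
    # Multiply before dividing so that round illustrative numbers stay exact.
    return state_cases * area_population / state_population


def observed_to_expected(observed, expected):
    """Ratio of observed cases to the number expected at the State average rate."""
    return observed / expected


# Hypothetical figures, for illustration only (not taken from the article).
STATE_POPULATION = 1_000_000
ADJACENT_POPULATION = 120_000

# Expected cases in the adjacent areas before the new surgeons arrive.
before = expected_cases(200, STATE_POPULATION, ADJACENT_POPULATION)

# If added operations elsewhere double the State total, the State rate doubles,
# and so does the expected count for areas whose own practice has not changed.
after = expected_cases(400, STATE_POPULATION, ADJACENT_POPULATION)

if __name__ == "__main__":
    print(before, after, observed_to_expected(24, after))
```

In this sketch an adjacent area that keeps performing 24 operations moves from matching its expected count to half of it, purely because the State benchmark shifted.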
This interpretation of the relationship between supply and udlization has colored the public debate on two closely related topics-rationing and unnecessary medical care. When the rates of surgery in an area violate statistical norms, when correlations are observed between the supply of beds and the rate of hospitalization or between the supply of surgeons and the incidence of surgery, or when anecdotal reports of excess uses of surgery come to light, they tend to be viewed as examples of deviant behavior and interpreted by policymakers as evidence of the need to strengthen regulatory programs to ensure better quality of care. But this view would correspond to reality only if medical practice were, in fact, substantially guided by a professional con- 192 sensus. What if the physicians themselves do not know the facts because of weaknesses in the underlying scientific basis of medical practice, or they cannot fully serve as rational agents because of flaws in the methods and assumptions governing the physician's role as vicarious judge of patient utilities? If physicians as well as patients face considerable uncertainty about the value of medical care, then demand could not be tightly constrained by outcome-based standards of appropriate care. Direct evidence against the notion that clinical decisionmaking is based on adequate information on outcomes and utilities comes from a critical appraisal of the scientific literature and from theresultsof contemporary assessments of well-established practice patterns thatrevealinconsistencies, ambiguities, controversies, and inaccuracies in the information base that supports everyday medical decisionmaking. Cochrane's (1972) analysis of the technical, organizational, and behavioral factors that limit the broad application of scientific principles to an evaluation of the endresultsof health care is a classic description of the medical care outcome problem. �Table 10. 
Dissociation of sociodemographic variables and hospital utilization in Boston and New Haven

                                               Boston   New Haven   Ratio of Boston/New Haven
Demographic factors
  percentage of population over 65              13.5      12.9        1.05
  percentage Black                              18.7      21.4        0.87
  median family income ($)                    16,934    17,216        0.98
  poverty status                                15.4      14.3        1.08
Consumption of care (1982)
  beds per 1,000                                 4.5       2.9        1.55
  hospital expenditures per capita ($)           889       451        1.97
  Medicare reimbursements per enrollee ($)     2,647     1,561        1.69

NOTE: Boston includes Chelsea and Brookline; New Haven includes West Haven and East Haven (see Appendix). Source: Demographic data are from the 1980 census.

Even a casual critical appraisal of the scientific literature uncovers a host of controversies and uncertainties concerning the use of most common operations (Wennberg, Bunker, and Barnes, 1980). In addition, there is ample evidence for the lack of reproducibility or uniformity of clinical decisions. Second-opinion programs are built on the evidence that different physicians examining the same patient recommend different treatments (Finkel, McCarthy, and Ruchlin, 1982). The lack of conformity among physicians and the lack of reproducibility by the same physician in the interpretation of diagnostic tests have been well documented (Koran, 1975a, 1975b). Moreover, when presented with the same set of facts in the form of standardized hypothetical cases, equally well-qualified surgeons commonly diverge in their opinions on the need for surgery (Rutkow, 1982). Efforts to obtain consensus on the appropriate indications for surgery or diagnostic tests among experts also demonstrate fundamental disagreements on the interpretation of the facts or utilities of outcomes. A series of consensus conferences by the RAND Corporation to develop appropriateness criteria for the use of several common procedures, including cholecystectomy, carotid endarterectomy, and coronary artery bypass graft surgery, uncovered a substantial lack of agreement among the panelists.
After reviewing an extensive summary of the literature and openly debating their initial conclusions concerning appropriateness, the nine members of the panel reached agreement less than 50 percent of the time (Park, Fink, Brook, and others, 1986). When agreement among experts on the appropriateness of a specific intervention exists, this does not assure that the agreement is based on facts or even a similar perception among the experts on what the facts might be. Eddy (1984) asked a panel of experts on colorectal cancer, all of whom had recommended an annual examination for blood in the stool, to answer the following question: "What is the overall reduction in colorectal cancer mortality that could be expected if men and women over age 50 were tested with fecal occult blood tests and 60 cm flexible sigmoidoscopy every year?" The answers ranged from very near 0 to very near 100 percent. Uncertainty can also arise, even when the experts agree on the facts, if the sources of data they are relying on present a distorted picture of reality. Much of the conventional practice of medicine has not been subjected to randomized clinical trials. Thus, the patterns of practice have been based on extrapolations from biological models and empirical studies that by the nature of their design are subject to a number of biases. The next section of this article will describe the application of modern nonexperimental techniques to the assessment of one highly variable operation (prostatectomy) to uncover inaccuracies in the data base that are important to clinical decisionmaking. The literature also documents the divergence between patient utilities and those of their physicians and shows that physicians' preferences for outcomes can dominate the decisionmaking process. McNeil, Weichselbaum, and Pauker (1978) showed that patients can be more averse to risk than their physicians, with patients preferring treatments that optimize shorter- rather than longer-term chances for survival.
These researchers assessed the utilities among a group of patients who had already received surgical treatment for lung cancer and found that a significant proportion actually would have preferred treatment with x-ray therapy because of the reduced short-term risk of death. There is also evidence that the way options are presented to patients can have a decisive influence on the choices patients make (Wennberg, 1982). Taken together, the evidence from small area studies and a critical appraisal of the strengths and weaknesses of the scientific basis of medicine builds a consistent and seemingly strong case against the rational agency hypothesis and related assumptions about the determinants of utilization. In their place, the evidence suggests the professional uncertainty hypothesis as an explanation of why nonhealth factors can broadly influence the utilization of care. For the majority of interventions, there is no professional consensus on treatment standards that is based on reasonably firm evidence of the effect of alternative treatments on patient outcomes. As a result, the standards of practice do not narrowly constrain the treatment choices ethical physicians make. The reasons individual physicians adopt specific practice styles remain largely unexplained but must relate to a variety of factors such as clinical experience, habits of work, economic expectations, disposition with regard to risk taking, and other contingencies of personal preference (Eisenberg, 1986, has discussed these possibilities in detail). Practice style is clearly influenced by a physician's colleagues, as the changes in practice patterns following feedback and review illustrate. The underlying possibility, however, emerges from the failure of clinical science to provide adequate information on the outcomes and the utilities of one treatment relative to another or to no treatment at all.
The professional uncertainty hypothesis thus implies a fundamental dissociation between need and the efficacy of care and the utilization of services. It permits the possibility of fundamental dissociations between population welfare and the availability of health care and level of expenditures invested in such care. Some who interpret geographic variations assume that the variations are occurring on the region very near the "flat of the curve" where services still result in a positive, albeit diminishing, marginal return. This assumption may be too generous. Within the spectrum of current levels of investment there can be no certainty that the marginal returns in high-rate areas are positive, regardless of their costs. The comparative utilization experiences of residents of Boston and New Haven provide an example. If there is marginal utility to individual patients associated with higher rates of hospitalization for high-variation medical conditions, this utility is probably not recognized by clinicians practicing in low-utilization communities. The uncertainty about marginal utility revealed by the contrasting patterns of high-variation medical admissions in New Haven and Boston is represented in Figure 8. The zone of uncertainty is the area beneath the implicit marginal utility curves of the two cities. The challenge to the health services research community is to ascertain the outcome significance of these differences in practice styles and resource consumption. If the Boston curve is correct, then the problem is that marginally useful health care is being withheld in New Haven, and the issue of rationing must be faced directly. On the other hand, it may be that the extra beds of Boston are allocated to patients for whom the disutility of hospitalization (its discomfort, cost, and risk of iatrogenesis) outweighs health benefits. If this is the case, hospitalization rates and costs can be substantially reduced while utility is actually gained.

Figure 8. Net effect of an increasing rate of hospitalization
[Figure: labels New Haven and Boston; horizontal axis: increasing rate of hospitalization.]
NOTE: As hospitalizations are increased, inevitably a point will be reached where the net effect is harmful to patients. This figure represents alternative interpretations of where that point may be. The first curve shows that the marginal benefits plateau at point A and decline into a zone where the iatrogenic effects of hospital care exceed the benefits. The second curve shows a much broader zone of decreasing but still positive utility extending past point A to point B. The present state of clinical knowledge as exemplified by the practice patterns of New Haven and Boston does not distinguish between these two possibilities.

In the case of major surgery, the situation is somewhat different. Here, the variations imply fundamentally different choices among treatments, often the choice between surgery and medical treatment that may include watchful waiting. For many high-variation operations, the decision to operate probably hinges on the evaluation of benefits that involve improvement in the quality rather than the length of life. Indeed, because of the mortality attributable to the operation itself, the decision to undergo surgery may actually shorten life. For such operations, the decision involves the weighing of "soft" outcomes (anticipated improvements in functional status and symptom reduction) against the risks of operation-induced death and morbidity. Only patients are in a position to make such judgments about utility; while "objective" evidence obtained by the physician may help suggest the probabilities for outcomes, the interpretation of the value of the various outcomes to patients can come only from the patients themselves. However, the sharp differences in probabilities for major surgery among areas displayed by the surgical signature phenomenon suggest that the preferences of the physician rather than those of the patient dominate clinical choice. It seems quite unlikely that the marginal utilities of patients now closely influence the shape of the surgical signature. The next section illustrates the importance of accurate assessment of patient preferences as a cornerstone for rational decisionmaking in clinical medicine.

[10] Similarly, the decisions of hospital administrators, community leaders, health maintenance organization managers, and others who determine the size of the hospital industry are not constrained by population needs or health care outcomes. Nor, at least in fee-for-service medicine, are the decisions that affect the relative size of local health care industries narrowly regulated by economic feedback. The manner in which the actuarial bases for most health insurance programs are organized guarantees that high per capita cost market areas receive subsidies from residents living in low-cost markets (Wennberg, 1982). In centrally organized national health care systems (e.g., the United Kingdom), per capita expenditures among regions can be adjusted for population illness indices, but the equity interpretation of equalizing per capita expenditures is moot due to uncertainty about the value of many purchased treatments. Moreover, the empirical investigation of allocation patterns (e.g., the surgical signature phenomenon) shows that the equalized allocation of resources does not lead to the rational distribution of services across competing health care priorities.

The Medical Care Outcome Problem Illustrated by Prostatectomy

Prostatectomy for benign hypertrophy of the prostate exemplifies both the efficacy and quality of care aspects of the medical care outcome problem. The efficacy of this operation has not previously been thoroughly evaluated, and physicians and patients face considerable uncertainty about the probabilities and utilities of the various outcomes associated with the operation.
The quality of care also is uncertain: there are no outcome-based standards on the appropriate use of the operation, and there is evidence for considerable variation in mortality and morbidity associated with the technical skill of the surgeon or hospital. There is considerable variation in the chance of undergoing this operation from one country to another and from one region to another within a country, and whatever the costs, risks, and benefits of this operation may be, they are being distributed very unevenly among potential patients. On the basis of its frequency of use, this is a big problem. It is the most common major operation performed on males over 65 years of age. Some 354,000 prostatectomies were performed in the United States in 1984. The operation has not always been so popular. It was first introduced as an emergency treatment for patients with complete blockage of urination who, in the judgment of their physicians, would die without the operation. As surgical and anesthesia techniques improved, the indications for prostatectomy were widened to emphasize early intervention to prevent the severe complications of obstruction, which include renal failure and death. The operation is now used often to treat patients with moderate and even minimal symptoms of urinary obstruction, where the therapeutic objective is improvement in the quality of life through the lessening of discomfort or disability due to urinary tract symptoms. However, the relative importance of preventive hypotheses (avoidance of future renal disease or death) versus quality of life hypotheses in influencing the decision to operate is not clearly explicated in the literature. Historically, several surgical approaches to the prostate developed.
One is the so-called open prostatectomy, which requires an incision through the perirectal region or the bladder and involves the direct visualization and removal of prostatic tissue. The other, the transurethral prostatectomy (TURP), is accomplished by passing an instrument through the urethra to excise the prostate at the base of the bladder, without direct incision through the skin. In recent years, the TURP has virtually replaced the open operation. The scientific evidence supporting hypotheses of the efficacy of prostatectomy is limited to case series reports of outcomes that record the experiences of this operation in teaching hospitals and a few large community hospitals. No controlled clinical trials have been performed to test the advantages of transurethral (or open) prostatectomies compared to watchful waiting (i.e., no immediate treatment) or to test preventive theories about the extension of life or avoidance of serious morbidity such as renal failure. The relative advantages and disadvantages of the open operation versus the TURP also have not been studied in a randomized clinical trial. Although the operation is commonly done to reduce symptoms and improve the quality of life, evaluations of the probabilities or assessments of the value (utility) of such outcomes to the patient are noticeably absent in the scientific literature. The literature contains isolated reports indicating variations in short-term outcomes (death and morbidity) related to the quality of care. In seeking to identify the clinical reasons for variation in prostatectomy rates, as well as a fresh approach to the problem of assessing the outcomes of treatments that are already part of everyday practice, a comprehensive nonexperimental Phase I and Phase II assessment of safety and efficacy was conducted, similar to studies that the Food and Drug Administration would probably require before agreeing that a randomized clinical trial was ethically and scientifically warranted. The principal purposes of this assessment were to (a) identify the critical hypotheses, if any, that should be tested by a randomized clinical trial, and (b) learn whether or not the safety of the operation appeared to vary enough from hospital to hospital to affect an assessment of the value of the operation. The inquiry addressed two major clinical questions:

1. What are the advantages and disadvantages of watchful waiting versus immediate operation for patients with symptoms of prostatism?
2. What are the advantages and disadvantages of transurethral prostatectomy compared with open prostatectomy?

The assessment was based on several techniques that, taken together, may provide a useful model for the nonexperimental assessment of operations. Following a critical review of the literature, cohort studies of outcomes were performed using claims data to estimate the probabilities for "hard outcomes" such as death, urethral strictures, and reoperation. Claims data appear to be a better source of information than case series reports for estimating the probabilities for outcome because the number of cases is much larger than that available from any single institution; moreover, the data are population based, so the effect of reporting bias is removed, and the period of follow-up can be extensive. In this assessment, patients were followed for up to 8 years after surgery. Additional information on outcomes was obtained by directly interviewing patients before the surgery and at 3, 6, and 12 months postsurgery. This information was used to estimate the probabilities for "soft outcomes" such as relief of symptoms (incontinence and impotence), information that could not be drawn from the claims data but could be ascertained only by asking patients directly. To confront the utility question, the patient interviews were useful for studying patients' perceptions about the reasons for surgery and the degree to which they were bothered by their symptoms before surgery. The information obtained from these sources was then integrated and synthesized into a decision analysis to identify the key probabilities and utilities on which the decision to undergo the operation depended.

[11] The team included the members of the Maine Medical Assessment Program's Urology Study Group, led by Dr. Robert Timothy; Dr. Dan Hanley, principal investigator of the MMAP; and David Soule, director of data services for MMAP. The university-based members of the research team included Michael Barry and Al Mulley, who were responsible for the literature review and the decision analysis; Jack Fowler, who was responsible for the patient interview study; and Noralou Roos and John Wennberg, who were responsible for claims data analysis.

The probabilities for outcomes following transurethral prostatectomy. The information obtained from analyzing claims data and interviewing cohorts of patients undergoing prostatectomy reveals a more pessimistic estimate for outcomes than that conveyed in the literature or in the press. The following compares the results of this study with the estimates and evaluations reached by Grayhack and Sadlowski (1975) on the basis of their extensive review of the literature. Grayhack and Sadlowski state, on postoperative death rates, "The current mortality rate for prostatectomy is less than 1 percent even though poor risk patients are rarely denied the operation." The findings from the study described here (conducted on a cohort of patients from the mid-1970s) show that within 3 months, 4.7 percent of patients 65 years of age and older in one region were dead, while in the other region, the mortality rate was 3.0 percent (Wennberg and others, 1987). In a related study based on a 20 percent sample of national Medicare claims data, Lubitz, Riley, and Newton (1985) found a 2.2 percent death rate at 6 weeks postsurgery during 1980-1981. The differences in mortality figures extracted from claims data (compared with reports in the literature) are due to their completeness and length of follow-up and to the fact that reporting bias does not affect the estimates. In the current study, all hospitals in both regions (Maine and Manitoba) were included, and in the national Medicare sample, all hospitals in the United States were included. This study also shows marked differences in death rates by hospital (see below). Grayhack and Sadlowski (1975) state that "long-term morbidity is limited. The procedure provides correction of urinary stasis in approximately 90-98 percent of patients operated upon. The need for further operative treatment is uncommon." The estimates for postoperative complications arising out of the current study give considerably higher 4-year cumulative probabilities for continuing urinary tract problems than suggested by most articles in the literature:

• urethral stricture, 13.3 percent
• indwelling catheter, 2.9 percent
• cystoscopy or other test, 20.4 percent
• subsequent prostatectomy, 10.2 percent
• alive and without the above, 52.0 percent.

These data are for patients 65 years of age and older. Most patients who undergo prostatectomy are in this age group. At 8 years the cumulative probability of a second prostatectomy reached 20 percent for those with benign hypertrophy of the prostate and 30 percent for those with cancer of the prostate. The reasons for the discrepancies between these data and the literature appear to be loss to follow-up, failure to remove dead patients from the denominator, and reports on relatively small numbers of patients in most case series studies.
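The effect of leaving dead or lost patients in the denominator can be illustrated with a small calculation. The sketch below is not the method used in the study; it assumes a simple product-limit (Kaplan-Meier type) estimate, treats death and loss to follow-up as censoring, and uses an invented cohort of ten patients. The function names and all of the data are hypothetical.

```python
from collections import Counter

# Hypothetical cohort (not data from the study): one tuple per patient,
# (years of follow-up, True if follow-up ended with a second prostatectomy).
# False means the patient died, was lost, or reached the end of observation.
cohort = [
    (1, False), (1, False),
    (2, True),
    (4, False), (4, False),
    (5, True),
    (8, False), (8, False), (8, False), (8, False),
]


def naive_proportion(follow_up):
    """Reoperations divided by everyone who ever entered the cohort."""
    return sum(1 for _, event in follow_up if event) / len(follow_up)


def cumulative_probability(follow_up):
    """Product-limit estimate of the cumulative probability of reoperation.

    Patients leave the denominator once their follow-up ends, so each
    reoperation is weighed against those still at risk at that time.
    """
    events = Counter(t for t, event in follow_up if event)
    exits = Counter(t for t, _ in follow_up)
    at_risk = len(follow_up)
    free_of_event = 1.0
    curve = {}
    for t in sorted(exits):
        if events[t]:
            free_of_event *= 1.0 - events[t] / at_risk
        at_risk -= exits[t]
        curve[t] = 1.0 - free_of_event
    return curve


curve = cumulative_probability(cohort)
print(f"naive proportion:     {naive_proportion(cohort):.2f}")
print(f"8-year product-limit: {curve[8]:.2f}")
```

For this invented cohort the naive proportion is 0.20, while the product-limit estimate at 8 years is 0.30, which shows the direction of the bias described above. Treating death as ordinary censoring is itself a simplification; a fuller analysis would handle it as a competing risk.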
Estimates for complications and morbidities that could not be measured by claims data were obtained through the interview study. Some 5 percent of sexually active males reported continuing impotence, and 3 percent of those with the operation reported continuing problems with incontinence throughout the year after surgery (Fowler, Wennberg, Timothy, and others, 1988).

Outcomes following transurethral prostatectomy and open prostatectomy. The lack of an orderly process for evaluating the efficacy of common operations carries the risk that new procedures may replace more effective older technologies. Beginning in the 1960s, the transurethral prostatectomy became increasingly popular, until today it has virtually replaced the older open operative technique. This replacement, occurring without the benefit of randomized clinical trials or careful nonexperimental assessment, reflects the belief that the TURP is a safer, less invasive operation that is effective in long-term results. Because the data reported here extend backward to the mid-1970s, it is possible to compare the outcomes of these two operations as they were used at that time. The results indicate that the open procedure, at least in some respects, may be the more effective operation. Patients with open operations had significantly fewer complicated urethral strictures, presumably because use of the instrument through the urethra in the TURP procedure leads to this complication. Figure 9 shows that patients who had an open procedure also had a lower incidence of subsequent cystoscopies and recurrent prostatectomies, suggesting that the more complete removal of prostatic tissue associated with the open operation results in better long-term reduction in urinary tract symptoms. By the time the interview study was conducted, the open operation had become so uncommon that it was not possible to estimate the frequency of relief of symptoms or the incidence of incontinence and impotence following open prostatectomy.
Impact of the quality of care. Not surprisingly, the decision analysis showed that the decision to undergo the operation should be sensitive to the death rate in the local setting. To ascertain the importance of this, the mortality rates for each of 15 community hospitals were compared with the mortality rates at two university hospitals that were used as the empirical standard for high quality. Overall, after adjustment for age and illness, the death rate in five of the community hospitals within 3 months of transurethral prostatectomy was 3.3 times higher: 9.0 percent versus 2.8 percent for the university hospitals. Most of these deaths occurred after postsurgical discharge. The hospitals with high death rates tended to be small hospitals where fewer operations were performed, but not all small hospitals had high death rates (Wennberg and others, 1987). To determine the safety of the operation in the individual setting, direct monitoring and feedback on performance are required.

[12] There is growing evidence that the probabilities for death and morbidity vary from one setting to another. On average, larger hospitals and physicians performing more operations have lower complication rates (Luft and others, 1979). These differences are rarely taken into account in evaluating the prospects for individual patients in the clinical setting, where it is usually assumed that outcome statistics quoted in the literature apply. The Medicare claims data base provides a means for establishing the mortality and morbidity rates for individual hospitals.

The value of outcomes for patients. This assessment of prostatectomy reveals that rational decisionmaking depends on how patients assess the utility of competing outcomes.
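As a rough illustration of that dependence, the sketch below compares the expected utility of an "operate now" strategy with that of watchful waiting for two invented patients. It is not the decision analysis carried out in the study. The single-period model, the utility scale, the class and function names, and most of the probabilities are assumptions made for the example; only the impotence and incontinence figures echo the interview study reported above.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical utilities on a 0-to-1 scale (1.0 = full health, 0.0 = death)."""
    label: str
    u_current_symptoms: float  # value of living with present symptoms
    loss_impotence: float      # utility lost if impotence follows surgery
    loss_incontinence: float   # utility lost if incontinence persists


# Illustrative probabilities. The impotence and incontinence figures echo the
# interview study (5 and 3 percent); the others are assumptions for the sketch.
P_OPERATIVE_DEATH = 0.03
P_SYMPTOM_RELIEF = 0.90
P_IMPOTENCE = 0.05
P_INCONTINENCE = 0.03


def utility_watchful_waiting(p: Patient) -> float:
    """Waiting is modeled as simply continuing with present symptoms."""
    return p.u_current_symptoms


def utility_operate_now(p: Patient) -> float:
    """Expected utility of surgery; death is scored as zero."""
    if_survive = (
        P_SYMPTOM_RELIEF * 1.0
        + (1.0 - P_SYMPTOM_RELIEF) * p.u_current_symptoms
        - P_IMPOTENCE * p.loss_impotence
        - P_INCONTINENCE * p.loss_incontinence
    )
    return (1.0 - P_OPERATIVE_DEATH) * if_survive


def preferred_strategy(p: Patient) -> str:
    if utility_operate_now(p) > utility_watchful_waiting(p):
        return "operate now"
    return "watchful waiting"


bothered = Patient("bothered a lot", 0.80, 0.20, 0.30)
unconcerned = Patient("hardly bothered", 0.98, 0.20, 0.30)

for patient in (bothered, unconcerned):
    print(patient.label, "->", preferred_strategy(patient))
```

With identical probabilities, the patient who values life with present symptoms at 0.80 comes out in favor of operating, while the patient who values it at 0.98 comes out in favor of waiting. The point of the sketch is only that the choice can turn on the patient's own utilities rather than on the probabilities.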
The decision analysis, used to integrate information on the nonsurgical as well as the surgical arm of the decision tree, indicates a slightly negative effect on longevity for those who chose the operation, even in hospitals with low mortality and complication rates. The surgical decision appears to hinge on how patients value the probability for relief of symptoms versus the chance for operative death (mortality rates have improved in recent years), impotence, and persistent incontinence (Barry, Mulley, Fowler, and Wennberg, 1988). The patient interview study shows that some patients were unhappy about their surgery and wished they had not had it. One subgroup with a high incidence of disappointment was made up of moderately symptomatic but sexually active men who became impotent after TURP. The study also shows that the intensity of feeling about symptoms varied substantially among individuals. Patients were asked how they felt about their symptoms: how much the symptoms bothered or concerned them. For all levels of severity, some patients stated they were bothered a lot by their symptoms while others professed to be hardly bothered at all. It seems reasonable that patients who feel relatively unconcerned about their symptoms would be less willing to risk an adverse outcome than those who are greatly concerned. Although this study was undertaken with the expectation that the results would point out uncertainties about the probabilities for hard outcomes that would lead to randomized clinical trials between watchful waiting and operation, it was found that the relevant issue was the patients' assessments of the utility of soft outcomes. The clinical remedy for unwanted, practice-style-driven small area variations thus appears to be the development of means for assuring informed patient decisionmaking, that is, assuring that choices are based on patient assessments of their attitudes toward risk and the strength of their feelings about the various expected outcomes from the watchful waiting and from the "operate now" strategies.

[13] The outcomes on the nonsurgical arm were estimated using the literature review. The sources of data include follow-ups of patients with prostatism and randomized clinical trials between drugs and placebo (Barry, Mulley, Fowler, and Wennberg, 1988).

Figure 9. Eight-year cumulative probability for recurrent prostatectomy by diagnosis and type of prostatectomy (p < .001)
[Figure: four curves. Ca prostate, TURP (N = 329); Ca prostate, open operation (N = 124); Benign disease, TURP (N = 1522); Benign disease, open operation (N = 945). Vertical axis scaled from 0.05 to 0.35; horizontal axis: years after initial prostatectomy, 1 to 8.]
NOTE: Patients who received an open operation had a lower probability for a second operation. For those with no evidence of malignancy during the first operation, the relative risk of recurrence was 2.0. In the Cox regression model, the only significant covariable was cancer of the prostate. Age, size, and teaching status of hospital and all patient illness covariables were not significantly associated with the probability of undergoing a second operation.

Meeting the Challenge of the Medical Care Outcome Problem [14]

Many treatments must be assessed if outcome-based guidelines for decisionmaking are to be developed. There is a need to increase clinical knowledge about the efficacy of care (the outcome probabilities and the assessments patients make of the value of care) and to improve ways of dealing with the quality of care where the concern is the appropriateness of clinical decisionmaking and the technical safety of care in the individual setting. This discussion is focused on the efficacy of care, with particular attention to the priorities for assessment suggested by SAA.

Getting the probabilities straight. It is suggested that systematic nonexperimental Phase I and Phase II evaluations (along the lines the FDA would require before agreeing to a randomized clinical trial for a new drug) are needed to document the risks of treatment and obtain evidence of efficacy for the alternative treatments now used in the management of high-variation medical and surgical conditions. Until these studies are done, it is not clear for which hypotheses a randomized clinical trial is scientifically or ethically justified. The study of prostatectomy reviewed above suggests the advantage of the coordinated use of several techniques in the nonexperimental evaluation of efficacy, as follows.

1. Comprehensive literature reviews should be undertaken to evaluate the range of estimates on probabilities for outcome among treated and untreated or alternatively treated patients. Meta-analysis and other synthetic methods of integrating existing information should be applied in this effort. [15]
2. Large claims data bases should be analyzed to obtain information on survival and complications from unselected cohorts of patients undergoing the treatments in question. For "hard" outcomes, these data sources should provide superior information on the probabilities for outcomes (as compared with other nonexperimental sources) because they can be based on very large numbers of cases and, at least in the case of the Medicare claims data, are virtually free of selection bias and loss to follow-up.
3. Functional status measures should be developed to measure "soft" outcomes as well as symptoms that are specific to the conditions and outcomes under investigation. These measures should then be used to assess probabilities for gain (or loss) in functional status and quality of life in relation to the alternative treatments by direct patient interviews prior to and after the treatments under consideration. Since the instruments used for this purpose will have direct clinical applicability, they should be developed with great care, and every effort should be made to obtain the necessary standardization so that results can be compared among studies. [16]
4. Decision analysis should be used to (a) integrate information on outcome probabilities and utilities gained from the various sources to point out the critical uncertainties that may need resolution through clinical trials and (b) define and elaborate the importance of utilities in the choice to undergo specific treatments.

The magnitude of the outcome problem also suggests that priorities should be carefully selected to assure that the major problems are addressed first. Small area analysis suggests two priority areas: (1) high-variation medical and minor surgical conditions and (2) high-variation treatments or diagnostic procedures.

[14] To deal with the practice variation phenomenon, informed patient decisionmaking is needed at the level of the primary referring physician as well as at the point of contacting the urologist. It is easy to imagine that many patients who might want the operation are now dissuaded from seeking urologists' opinions because of their primary physicians' preference for the watchful waiting strategy.

[15] Considerable confusion results when the issues of efficacy and quality of care are not distinguished. Although these issues overlap, the remedy for a quality problem can and should be pursued independently of the efficacy issue. Monitoring and feedback systems are needed to ensure that agreed-upon standards of care are met (i.e., physicians know the standards and do not violate them). The possible use of small area data for feedback to physicians has been addressed elsewhere (Wennberg, 19*4). Recent research shows that within the statistical limits imposed by small numbers and variations in case mix, it is possible to use hospital discharge and claims data to monitor mortality and morbidity rates for individual hospitals. It remains to be seen whether this information can be used in a systematic way to improve the quality of care.

High-variation medical and minor surgical conditions.
Much of the difference in expenditures per capita for hospital care between high- and low-rate areas is accounted for by the hospitalization of patients with high-variation medical conditions and minor surgery; rates of admission for these conditions are closely associated with per capita bed capacity. Great savings could be realized if the practice styles evident in low-rate areas were adopted by clinicians currently practicing in high-rate areas. (For example, an estimated 625 beds in the Boston area could be closed and upwards of $200 million in 1982 dollars could be saved if the clinical strategies used to treat New Haven residents were used to treat residents of Boston.) The similarity in demography between these regions and the fact that New Haven clinicians believe their (academic) standards of care do not imply rationing suggest that reductions in expenditures of this magnitude might indeed be possible in Boston. The challenge is to find out if this is the case.

To accomplish this, the decisionmaking strategies for hospitalizing high-variation patients must be examined, and the underlying specific therapeutic hypotheses must be explicated. Since most hypotheses uncovered by this inquiry will presumably involve issues of relative safety of outpatient versus inpatient care (theories about escape from cure), the relevant outcomes will probably occur over a short period of time. Therefore, the studies necessary to understand the relevant outcomes presumably can be completed, and conclusions important for cost containment can be reached relatively quickly.[17] The research must involve physicians and hospitals located in low- and high-rate areas. Medical record review will no doubt be needed. Differences in the organization of ambulatory care and social services as possible contributors to differences in hospitalization rates need to be explored.
The overall objective is to reach consensus on whether low-rate patterns of care represent prudent practice from the patient's point of view and, if so, to identify any social impediments to achieving low-rate practice patterns.

High-variation treatments or diagnostic procedures. A longer-range focus of the agenda must be the systematic evaluation of specific alternative treatments or diagnostic strategies used in the pursuit of specific therapeutic goals. An examination of the clinical hypotheses underlying the various conditions for which the 23 high-variation operations listed in Table 2 are performed would cover approximately 63 percent of major surgical admissions. Most of the operations listed in Table 2 have not been subjected to randomized clinical trials, and those that have are often used for conditions or patient subgroups where information from the trials cannot be directly extrapolated to predict outcomes. The investigation should begin with a careful Phase I and Phase II evaluation.

[16] Cross-sectional as well as longitudinal designs should be pursued. Patients who had "quality of life" operations (e.g., hip replacement, prostatectomy) as long ago as the early 1970s can be located in the Medicare claims data base, and survivors can be interviewed to build a picture of the long-range outcomes of these operations. It is not necessary to wait for prospective studies to begin to understand the long-term implications of these operations.

[17] The number of conditions that must be examined to make a major impact is quite small: 40 more or less specific causes of admission constitute more than 70 percent of adult medical admissions. Seven conditions (adult gastroenteritis, medical back problems, heart failure, pneumonia, bronchitis, angina pectoris, and cardiac catheterization) make up 26 percent of hospitalized cases. The number of pediatric conditions is considerably smaller: 17 more or less specific pediatric conditions constitute 75 percent of nonsurgical admissions. Four conditions (bronchitis and asthma, gastroenteritis, simple pneumonia, and otitis media) make up nearly 40 percent of pediatric admissions (see Tables 3 and 4).

Helping patients understand the outcomes they desire.

The flaws in the rational agency model for clinical decisionmaking pointed out by small area studies and by the critical appraisal of the scientific basis of clinical practice are created as much by confusion about the value or utility of outcomes to patients as they are by uncertainty about the basic probabilities. While gaps in knowledge about outcomes of care can be narrowed by better science, better information per se is neither a sufficient basis for rational decisionmaking nor a sufficient means for reducing unwanted small area variations. In addition to better information, active, informed patient decisionmaking is also needed.

The goal of medical care is often an improvement in some specific aspect of the quality of life that can be obtained only by accepting a relatively small but very real risk of a reduction in quality of life or death. In the case of prostatectomy, the "average" 75-year-old sexually active male with moderate symptoms of prostatism has about a 90 percent chance for improvement; for this he risks about a 2.6 percent chance of death in 3 months, a 6 percent chance of impotence, and a 3 percent chance of persistent incontinence. Whether the patient wants to take that risk is a personal decision that is entirely subjective; it cannot be decided for the patient by the doctor.

Procedures for achieving a uniform and unbiased method of conveying information to patients are a central part of the strategy for dealing with the practice variation phenomenon.
How risks and benefits are described, how their probabilities are conveyed, and how patients' preferences are solicited and evaluated are crucial, suggesting that the procedure used to help patients make informed decisions should be viewed as a major diagnostic intervention. If correctly done, this intervention will suggest the best decision for an individual patient that available information allows; if done poorly, it can lead to the wrong prescription.

If done in a uniform and reproducible way, the procedure for achieving informed patient decisionmaking would also open up new methodological possibilities for the active, ongoing assessment of health care outcomes. Patients who choose operations can be compared with those who choose watchful waiting, using standardized methods for obtaining data on patients at the time of the informed decisionmaking procedures as well as at relevant postdecision intervals. The results could then be used to improve the data base, correcting for errors in the estimates of probabilities and updating information as technology changes.

The development of procedures to convey information to patients to help them assess the value of treatments in their own individualized circumstances, clinical trials to evaluate the efficacy of the decisionmaking procedure, and longitudinal follow-up of patients to assess outcomes according to patient choice of treatment are important and virtually unexplored frontiers for health services research. It is a research agenda that offers an important opportunity for improving the rationality of the clinical decision process.

References

Altman, D., R.J. Greene, and H.M. Sapolsky. (1981). Health planning and regulation: The decision-making process. Ann Arbor, MI: AUPHA Press.

American Medical Association. (1986). Confronting regional variations: The Maine approach (Pub. No. OP-007). Chicago: Author.

Andersen, R. and J. Newman. (1973). Societal and individual determinants of medical care utilization. Milbank Memorial Fund Quarterly, 51, 95-124.

Arrow, K. (1963). Uncertainty and the welfare economics of medical care. American Economic Review, 53, 941-973.

Barnes, B.A. (1982). Population-based small-unit analysis of health care. In D.L. Rothberg (Ed.), Regional variations in hospital use: Geographic and temporal patterns of care in the United States. Lexington, MA: D.C. Heath.

Barry, M.J., A.G. Mulley, F.J. Fowler, and J.E. Wennberg. (1988). Watchful waiting versus immediate transurethral resection for symptomatic prostatism. Journal of the American Medical Association, 259(20), 3010-3017.

Bloor, M. (1976). Bishop Berkeley and the adenotonsillectomy enigma: An exploration of variation in the social construction of medical disposals. Sociology, 10, 44.

Blumberg, M.S. (1987). Inter-area variations in age-adjusted health status. Medical Care, 25(4), 340-353.

Bolande, R.P. (1969). Ritualistic surgery: Circumcision and tonsillectomy. New England Journal of Medicine, 280, 591.

Bombardier, C., V.R. Fuchs, L.A. Lillard, and K.E. Warner. (1977). Socioeconomic factors affecting the utilization of surgical operations. New England Journal of Medicine, 297, 699-705.

Bunker, J.P. (1970). Surgical manpower: A comparison of operations and surgeons in the United States and in England and Wales. New England Journal of Medicine, 282, 1102-1108.

Cochrane, A.L. (1972). Effectiveness and efficiency. London: Nuffield Provincial Hospitals Trust.

Eddy, D.M. (1984). Variations in physician practice: The role of uncertainty. Health Affairs, 3(2), 74-89.

Eisenberg, J.M. (1986). Doctors' decisions and the cost of medical care. Ann Arbor, MI: Health Administration Press Perspective.

Finkel, M.L., E.G. McCarthy, and H.S. Ruchlin. (1982). The current status of surgical second opinion programs. In I.M. Rutkow (Ed.), The surgical clinics of North America. Philadelphia: Saunders.

Fowler, F.J., J.E. Wennberg, R.P. Timothy, and others. (1988). Symptom status and quality of life following prostatectomy. Journal of the American Medical Association, 259(20), 3018-3022.

Gittelsohn, A.M. and J.E. Wennberg. (1976). On the risk of organ loss. Journal of Chronic Diseases, 29, 527-535.

Glover, J.A. (1938). The incidence of tonsillectomy in school children. Proceedings of the Royal Society of Medicine, 31, 1219-1236.

Goldberg, R.J., J.M. Gore, J.S. Alpert, and J.E. Dalen. (1986). Recent changes in attack and survival rates of acute myocardial infarction: The Worcester heart attack study. Journal of the American Medical Association, 255(20), 2774-2779.

Grayhack, J.T. and R.W. Sadlowski. (1975). Results of surgical treatment of benign prostatic hyperplasia. In J.T. Grayhack, J.D. Wilson, and M.J. Scherbenske (Eds.), Benign prostatic hyperplasia (pp. 125-134) (DHEW Pub. No. [NIH] 76-1113). Washington, DC: Government Printing Office.

Health Care Financing Administration. (1983). [Medicare reimbursement data by State and county, 1982]. Unpublished data, Bureau of Data Management and Strategy.

Koran, L.M. (1975a). The reliability of clinical methods, data, and judgments: Part I. New England Journal of Medicine, 293, 642-646.

Koran, L.M. (1975b). The reliability of clinical methods, data, and judgments: Part II. New England Journal of Medicine, 293, 695-701.

Lembke, P. (1959). [Article]. Hospitals, 33, 65.

Lewis, C.E. (1969). Variations in the incidence of surgery. New England Journal of Medicine, 281, 880-884.

Lubitz, J., G. Riley, and M. Newton. (1985). Outcomes of surgery among the Medicare aged: Mortality after surgery. Health Care Financing Review, 6, 103-115.

Luft, H.S., J.P. Bunker, and A.C. Enthoven. (1979). Should operations be regionalized? The empirical relation between surgical volume and mortality. New England Journal of Medicine, 301, 1364-1369.

Luft, H.S. and S.S. Hunt. (1986). Evaluating individual hospital quality through outcome statistics. Journal of the American Medical Association, 255, 2780-2784.

McCracken, S., P. Latessa, and J.E. Wennberg. (1982). A study of hospital utilization in Iowa in 1980. Des Moines: Servi-Share of Iowa.

McNeil, B.J., R. Weichselbaum, and S.G. Pauker. (1978). Fallacy of the five-year survival in lung cancer. New England Journal of Medicine, 299, 1397-1401.

McPherson, K., P.M. Strong, L. Jones, and B.J. Britton. (1985). Do cholecystectomy rates correlate with geographic variations in the prevalence of gallstones? Journal of Epidemiology and Community Health, 39(2), 179-182.

McPherson, K., J.E. Wennberg, O.B. Hovind, and P. Clifford. (1982). Small-area variations in the use of common surgical procedures: An international comparison of New England, England, and Norway. New England Journal of Medicine, 307, 1310-1314.

Mitchell, J.B. and J. Cromwell. (1982). Variations in surgery rates and the supply of surgeons. In D.L. Rothberg (Ed.), Regional variations in hospital use. Lexington, MA: D.C. Heath.

Park, R.E., A. Fink, R.H. Brook, and others. (1986). Physician ratings of appropriate indications for six medical and surgical procedures (No. R-3280). Santa Monica, CA: RAND Corporation.

Roos, L.L. (1979). Alternative designs to study outcomes: The tonsillectomy case. Medical Care, 17, 1069-1087.

Roos, N.P. (1984). Hysterectomy: Variations in rates across small areas and across physicians' practices. American Journal of Public Health, 74, 327-335.

Roos, N.P., G. Flowerdew, A. Wajda, and R.B. Tate. (1986). Variations in physicians' hospitalization practices: A population-based study in Manitoba, Canada. American Journal of Public Health, 76, 45-51.

Roos, N.P. and L.L. Roos. (1981). High and low surgical rates: Risk factors for area residents. American Journal of Public Health, 71, 591-600.

Rutkow, I.M. (1982). The reliability and reproducibility of the surgical decision-making process. In I.M. Rutkow (Ed.), The surgical clinics of North America. Philadelphia: Saunders.

Shain, M. and M.I. Roemer. (1959). Hospital costs related to the supply of beds. Modern Hospital, 92(4).

Wennberg, J.E. (1982). Should the cost of insurance reflect the cost of use in local hospital markets? New England Journal of Medicine, 307, 1374-1381.

Wennberg, J.E. (1984). Dealing with medical practice variations: A proposal for action. Health Affairs, 3, 6-31.

Wennberg, J.E. (1987). Population illness rates do not explain population hospitalization rates. Medical Care, 25(4), 354-359.

Wennberg, J.E., J.P. Bunker, and B. Barnes. (1980). The need for assessing the outcome of common medical practices. Annual Review of Public Health, 1, 277-295.

Wennberg, J.E. and F.J. Fowler. (1977). A test of consumer contributions to small area variations in health care delivery. Journal of the Maine Medical Association, 68, 275-279.

Wennberg, J.E., J.L. Freeman, and W.J. Culp. (1987). Are hospital services rationed in New Haven or over-utilized in Boston? Lancet, 1, 1185-1188.

Wennberg, J.E. and A. Gittelsohn. (1973). Small area variations in health care delivery. Science, 182, 1102-1108.

Wennberg, J.E. and A. Gittelsohn. (1980). A small area approach to the analysis of health system performance (DHHS Pub. No. [HRA] 80-14012). Washington, DC: Government Printing Office.

Wennberg, J.E. and A. Gittelsohn. (1982). Variations in medical care among small areas. Scientific American, 246(4), 120-134.

Wennberg, J.E., A. Gittelsohn, and N. Shapiro. (1975). Health care delivery in Maine III: Evaluating the level of hospital performance. Journal of the Maine Medical Association, 66(11), 298-306.

Wennberg, J.E., K. McPherson, and P. Caper. (1984). Will payment based on diagnosis-related groups control hospital costs? New England Journal of Medicine, 311, 295-300.

Appendix: Methodological Notes

The general methods of small area analysis (SAA) have been described elsewhere. The following outlines the major sources of data and general analytic strategy for SAA and addresses a few specific issues concerning utilization rates and the measurement of variation.
Data Bases

The large administrative data bases useful for evaluation of health care utilization and outcomes are of two types: hospital discharge abstracts and health insurance claims.

Hospital discharge abstracts. These data, collected by private organizations and State agencies, usually lack the personal identifiers to link records together. While of limited value for outcome studies, hospital discharge data sets covering the populations of States are extremely useful for SAA. The data routinely collected include the following: admission date; length of stay; operations performed; age, sex, and race of patient; discharge status (e.g., alive, dead); discharge date; admission diagnoses; patient geographic residence code; and physician codes. Increasingly, States are passing statutes that mandate the collection of these data. At this writing, the list includes California, Massachusetts, New York, New Hampshire, Maryland, Maine, North Carolina, Rhode Island, Ohio, West Virginia, Washington, Oregon, and Vermont.

Health insurance claims. These data, collected to substantiate payments for fee-for-service medicine, contain detailed information on medical care transactions that can be used to study the process and outcomes of care. When a physician bills an insurance company for a patient visit, a diagnostic examination, or a therapeutic procedure, a record is created containing a code for the service, its date of delivery, the amount paid to the physician by the insurance company, and personal identifiers for the patient and the physician. Hospitalizations result in computerized records containing at least the following: codes for the primary and secondary diagnoses, codes identifying the principal procedures performed, the amount paid to the hospital, and the patient's personal identifier, age, sex, and postal code.

In the United States, the Medicare program (which pays for services for most of the U.S. population 65 years of age and older) maintains a computerized record of each individual who is or was at one time eligible for service; the record contains the patient's personal identifier, last or current address, age, sex, and date of death (if applicable). In Manitoba, the Health Services Commission maintains similar files for the entire resident population. Thus the claim for a specific service such as a prostatectomy can be identified and linked through the personal identifier to all previous or subsequent services the patient received and to the file identifying survival status. The records for the cohort of patients undergoing a specific service or with specific diagnoses can be assembled for statistical analysis to document costs, resource allocation, and utilization rates; measure the frequency of complications; test hypotheses about the relationship between outcomes; and evaluate the quality of care. Since insurance company records contain medical record identifiers and current patient addresses, they also have great potential for follow-up or follow-back studies if adequate procedures to assure patient confidentiality and informed consent are established. Moreover, the records giving age, sex, and location for all individuals eligible for services provide an ongoing census for constructing denominators for population-based studies. The various potential uses for these data are summarized in Table A.1.

Small Area Techniques

Small area analysis can be viewed in three steps: (1) defining the areas for comparative study, (2) estimating resource allocation to populations living in the areas, and (3) measuring utilization.

Defining geographic boundaries. If the interest is in measuring the delivery of health care services within the boundaries of a specific city or group of neighborhoods, then the boundary question is already decided.
For the Boston-New Haven comparisons, Chelsea, Revere, and Brookline were included within the definition of Boston, and the New Haven area comprised the towns of New Haven, West Haven, and East Haven. This inner-city and suburban neighborhood mix yielded two populations of very similar demographic characteristics (Table 10).

Table A.1. Uses of claims data for epidemiologic research

(1) Studying utilization, expenditures, and allocation of health resources
(2) Measuring the frequency of outcomes
    (a) death
    (b) acute morbidity
    (c) recurrence and other long-term morbidity
    (d) intervention-free survival
(3) Testing medical care hypotheses
(4) Monitoring the quality of care
(5) Studying the incidence of some illnesses
(6) Organizing cohort studies or controlled clinical trials
(7) Obtaining representative samples of patients with specific interventions

If the objective is to study the distribution of care throughout a region and to relate populations as closely as possible to their major source of care, the more empirical "patient origin" approach is called for. The method, first suggested by Lembke (1959), has been used with minor changes. It involves measuring the frequency of hospitalization in each of the smallest available geographic units (e.g., a zip code) and grouping these small units into larger market areas based on a plurality rule. In practice, this means that hospitals in the same or closely related communities are grouped together. In New Haven, for example, there are two hospitals: the Yale-New Haven Hospital, a university teaching hospital of some 800 beds associated with Yale Medical School, and St. Raphael, a community hospital with about 500 beds and a medical school teaching program. In defining the New Haven market area, these two hospitals were grouped, and the proportion of admissions in each community in the State to these and all other hospitals was calculated, as shown in Table A.2. For 12 communities, a plurality of hospital admissions was to these two hospitals. Accordingly, these 12 communities constitute the New Haven market area. For the 12 communities, 54.8 percent of admissions were to the Yale-New Haven Hospital, 38.1 percent to St. Raphael, and 7.1 percent to other Connecticut hospitals located outside the New Haven market area.

Table A.2. Hospital admissions in the New Haven area

Town              Total number     Percentage of all admissions to
                  of admissions    Yale-New Haven or St. Raphael hospitals
Bethany                 325             75.4
Branford              1,767             97.7
East New Haven        2,338             98.0
Guilford              1,089             83.1
Hamden                4,516             97.2
Madison                 897             70.1
New Haven            16,540             98.2
North Branford          743             96.6
North Hamden          1,968             92.8
Orange                1,157             68.8
West Haven            5,589             90.1
Woodbridge              649             90.9

Resource allocation. Once the market areas have been defined, estimates are then made of the amount of resources allocated to the resident population, such as the number of hospital beds, employees, and expenditures per capita. Table A.3 shows how an estimate was made of the number of hospital beds allocated to the population of the New Haven hospital market area. Eighteen hospitals experienced one or more admissions from the population of New Haven. (The experience of each of these hospitals in serving the New Haven population was taken into account in the estimates, but to make the table manageable, only those hospitals that account for more than 1 percent of the admissions for New Haven residents are listed.) Per capita expenditures and number of hospital employees were estimated in an analogous fashion. Comparisons of resource allocations among the 193 hospital market areas of New England revealed sometimes large and surprising differences (Figs. 4, 8). The results are not sensitive to whether allocations are done on a per admission or per patient day basis (Wennberg and Gittelsohn, 1980).

Table A.3. Allocation of hospital beds to New Haven market area residents

Hospital (location)           Number    Number of    Percentage   Allocated   Market
                              of beds   admissions   in area      beds        share
Yale-New Haven (New Haven)      793       30,557       68.30        541.6      54.8
St. Raphael's (New Haven)       482       16,775       86.43        416.6      38.1
Milford (Milford)               149        6,684       14.30         21.3       2.5
Middlesex (Middlesex)           323       14,435        4.04         13.1       1.5
All others                        -            -           -         30.7       3.4
Total                                                             1,023.2     100.0

NOTE: Admission data, population estimates, and estimates of the number of beds were provided by the State planning agency. The towns making up the New Haven market were so defined because a plurality of their admissions to hospitals were to the two hospitals located in New Haven. For each hospital in the State, the computer was programmed to count the total number of admissions and to calculate the percentage of its admissions that resided in the New Haven area. The estimate of the bed resources used by the New Haven population from each hospital is obtained by multiplying the decimal fraction of admissions by the total number of beds for each hospital. The total in the allocated beds column is the number of beds allocated to the population from all (in-State) sources. The per capita rate is obtained by dividing the sum of that column by the population of New Haven. In 1982, the number of beds was 2.9 per 1,000.
Source: Wennberg, 1984. Used with permission.

Utilization Rates

Utilization rates are calculated for market areas on a crude and age-adjusted basis, usually using the indirect method of standardization. For studies based on hospital discharge data, the denominator is obtained from census data, corrected for intercensal changes, usually using data provided by State planning agencies. The rates represent events, not persons, as patients receiving the same service more than once are counted each time. This can sometimes lead to confusion. For example, in one small area with unexpectedly high hip fracture rates, it was found that a large proportion of patients were admitted first to a local hospital and then transferred to another facility. These cases could be identified by record linkage based on demographic characteristics.

In cross-sectional studies, the rates for nonrepetitive events (such as hysterectomy) represent underestimates of the prevailing rates because previously hysterectomized women remain in the denominator. With the accumulation of many years of data, the eligible populations could be corrected for prior removals, but migration across boundaries confounds these corrections.

The denominators for claims data are the insured population. Because claims data contain information on ambulatory services as well as inpatient care, they can be used to monitor the total use of care, and utilization can be expressed as a rate per event or per person. While claims data could theoretically be used to trace the lifetime utilization of fee-for-service users, this is not a realistic expectation, so claims data provide no systematic solution to correcting the denominators for prior removal.

Table A.4 uses Medicare program data to illustrate differences in rates of use of cystoscopies among Maine enrollees living in different hospital markets. The number of enrollees comes from the enrollment file; the count of cystoscopic examinations comes from the patient claims files. The percentage of patients with one or more cystoscopies is determined by counting enrollees with one or more cystoscopic examinations, rather than the number of services.

For surgery that results in nonrepetitive organ removal, the cross-sectional rates can be used in a method analogous to life table analysis to predict the cumulative probabilities of organ removal, given the assumption of stable rates over the lifetime of the population (Wennberg and Gittelsohn, 1982).
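The life-table analogy can be illustrated with a minimal sketch. It rests on assumptions the text does not spell out: annual age-specific removal rates are expressed per person at risk, each rate applies uniformly within its age band, rates stay stable over the lifetime, and deaths from other causes are ignored. The function name and the example rates are hypothetical and are not taken from the study.

```python
from typing import Iterable, Tuple


def cumulative_removal_probability(bands: Iterable[Tuple[int, float]]) -> float:
    """Probability of organ removal by the end of the last age band.

    Each band is (width_in_years, annual_rate_per_person). Assumes stable
    rates and ignores competing mortality (an illustrative simplification).
    """
    still_intact = 1.0
    for years, annual_rate in bands:
        if years < 0 or not 0.0 <= annual_rate <= 1.0:
            raise ValueError("years must be >= 0 and rate within [0, 1]")
        # Chance of passing through every year of the band without removal.
        still_intact *= (1.0 - annual_rate) ** years
    return 1.0 - still_intact


# Hypothetical age-specific rates, for illustration only.
example_bands = [(20, 0.001), (20, 0.006), (20, 0.004)]
print(round(cumulative_removal_probability(example_bands), 3))
```

The product form is chosen because, for a nonrepetitive operation, a person contributes to later age bands only if the organ was not removed earlier.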
This has been found to be a useful way of combining age-specific surgery rates into a single index of risk that is easily grasped by persons unfamiliar with standardized incidence rates.

Measuring Variation

In an SAA, neither the size of the denominator nor the numerator is under the control of the user of the data. The populations, which are defined by patient origin studies or by geopolitical boundaries, typically range from about 10,000 to well over a half million (in the case of Boston). The frequency of hospitalizations per year typically ranges from less than 1 per 1,000, as seen for deep vein thrombosis, to more than 10 per 1,000 for adult gastroenteritis. While the stability of the rates can be improved by longer periods of observation, the intrinsic differences in mean rates and population sizes are such that the usual measures of variation (the coefficient of variation and the range and number of statistical outliers as ordinarily defined) cannot be used to compare directly the pattern of variation between different causes of hospitalizations or operations within the same set of geographic regions or to assess differences in variation for the same operation compared between two different sets of small areas (e.g., between regions in the National Health Service in the United Kingdom and hospital market areas in New England).

McPherson and colleagues (1982) have developed one solution to this problem, the SCV (systematic component of variation), which is a measure of variation based on the proportionate hazard model that removes from total variance the amount attributable to stochastic variation. In several (unpublished) studies, it was found that the SCV, unlike the other measures of variation, is substantially uncorrelated with the mean rate.

Estimated rates of surgery vary from area to area because of sampling error and systematic area-dependent factors.
Since all surgical rates vary with age and sex, the systematic component depends on the age and sex composition of each area. Adjustments to observed rates are commonly made by means of indirect standardization.

Table A.4. Rate for cystoscopies among Maine Medicare enrollees by urology market area of residence, 1976-77
(Entries in square brackets are shown as they appear in the scan; a decimal point or digit is not legible.)

Urology          Enrollees(a)   Number of      Rate(b)   Ratio to   Percentage of enrollees
market area                     examinations             State      with one or more
                                                         average    examinations
Portland            43,192         1,641        3.8       1.33*        [18]
Bangor              29,814           857        2.9       1.00         1.8
Lewiston            16,397           328        2.0        .70*        [13]
Augusta              9,920           235       [14]        .83*        1.7
Waterville          12,886           201        1.5        .54*        1.2
Biddeford            8,212           315        3.8       1.34*        [16]
Rumford              3,895           232        5.9       2.08*        3.9
Presque Isle         6,361           143        2.9        .78*        1.6
Skowhegan            4,203            95       [13]        .79         1.6
Ellsworth            2,805            68        2.4        .85         [13]
Caribou              5,757           125        2.2        .76*        1.8
Calais               1,969            23        1.2        .41*        1.0
State              156,325         4,478      [186]       1.00         [10]

(a) Enrollee person-years.
(b) Per 1,000 enrollees.
* Significant (p < .01).
NOTE: The count of the number of cystoscopic examinations is made from the claims history files of the Medicare program obtained from the carrier, using the appropriate procedure codes to select relevant records. Reimbursements (not shown) are also tabulated from the claims records. The population counts are for all Medicare enrollees who were in the Part B program in 1977. The percentage with one or more cystoscopy is determined by counting enrollees with cystoscopic examinations, rather than number of services.
Source: Wennberg (1984). Used with permission.

The basic assumption underlying this method of standardization is that each age and sex grouping changes the risk of surgery by a fixed multiplicative factor. To allow for the possibility that, even after this adjustment is made, area differences remain, an additional multiplicative factor was introduced that varied from area to area. If the factor is 1 for all areas, then obviously it is assumed that age and sex adjustments are sufficient. If the factor varies with a nonzero variance $\sigma^2$, it is concluded that there are unexplained area differences.

In each geographic region, let there be $k$ hospital service areas. The age- and sex-specific rates for all the areas combined are known, and the numbers of people at risk in each age and sex group for each area are known. It is routine to note the observed number of operations in each area ($O_i$) for a particular period and to calculate the expected number of operations ($E_i$), given the regional age- and sex-specific rates. Let $\gamma_i$ be the multiplicative factor associated with the $i$th area. Since surgery is a relatively rare event, it is concluded that the distribution of $O_i$ is approximately Poisson with mean $\gamma_i E_i$. If $\gamma_i$ is considered as a random variable with an expected value of 1 and variance $\sigma^2$, then

$$\operatorname{variance}(O_i) = E_i^2 \sigma^2 + E_i,$$

and if $Y_i$ is defined as the logarithm of the ratio of $O_i$ to $E_i$,

$$Y_i = \log\left(\frac{O_i}{E_i}\right),$$

it follows that the expected value of $Y_i$ is approximately zero and $E(Y_i^2)$, the expected value of $Y_i^2$, is approximated by

$$E(Y_i^2) \approx \frac{\operatorname{variance}(O_i - E_i)}{E_i^2} = \frac{E_i^2 \sigma^2 + E_i}{E_i^2},$$

so that

$$E\left(\frac{\sum_i Y_i^2}{k}\right) = \sigma^2 + \frac{1}{k}\sum_i \frac{1}{E_i},$$

and therefore $\sigma^2$ can be estimated by

$$\hat{\sigma}^2 = \frac{\sum_i Y_i^2}{k} - \frac{1}{k}\sum_i \frac{1}{E_i}.$$

Thus the area-dependent component of variance in rates standardized for age and sex can be estimated by subtracting the random component from the observed variance of the logarithm of the observed over expected ratios. In this way, relative variation around a regional norm is compared; therefore, the estimate should not be affected by differences in prevailing operation rates. Also, because the contribution of random variation to the total observed variation in the logarithm of observed over expected ratios varies according to the total number of operations in each area, this method adjusts for unequal contributions to the variance estimate that are introduced by differences in prevailing rates and population denominators between areas and between regions.
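The estimate described above can be sketched in a few lines of Python. This is an illustration only: it subtracts the Poisson (random) component, the mean of 1/E over areas, from the mean squared logarithm of the observed over expected ratios. The function name and the sample counts are hypothetical, and the sketch assumes every area has at least one observed operation so that the logarithm is defined.

```python
import math
from typing import Sequence


def systematic_variance(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Estimate the area-dependent variance from observed and expected counts.

    observed[i] and expected[i] are the observed and indirectly standardized
    expected numbers of operations in area i (all must be positive).
    """
    if len(observed) != len(expected) or len(observed) == 0:
        raise ValueError("observed and expected must be non-empty and equal in length")
    if any(o <= 0 for o in observed) or any(e <= 0 for e in expected):
        raise ValueError("all counts must be positive for the log ratio to exist")
    k = len(observed)
    # Mean squared log ratio: the total observed variation around the norm.
    mean_sq_log = sum(math.log(o / e) ** 2 for o, e in zip(observed, expected)) / k
    # Random (Poisson) component: mean of 1/E across the k areas.
    random_part = sum(1.0 / e for e in expected) / k
    return mean_sq_log - random_part


# Hypothetical counts for four areas, for illustration only.
print(systematic_variance([120, 80, 45, 150], [100, 100, 50, 110]))
```

Because the random component is subtracted rather than modeled jointly, the estimate can come out slightly negative when areas differ by no more than chance would produce; in that case it is usually read as no detectable systematic variation.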
Each estimate has approximately $k-1$ degrees of freedom. A plot of the $\log(O_i/E_i)$ in each region for each operation does not show any systematic departure from a symmetric Gaussian distribution. Therefore, a test of the null hypothesis (no difference in systematic variance for each operation across regions) can be performed using an F-test for independent samples. Moreover, the estimated values of $\sigma^2$ can be compared for pairs of operations in a single region.

For Official Use Only 5/11/93

Title: "The Appropriateness of Hysterectomy: A Comparison of Care in Seven Health Plans" by Steven Bernstein, et al., The Journal of the American Medical Association, May 12, 1993.

The authors found that 16% of all hysterectomies performed in the seven managed care plans they evaluated were performed for inappropriate reasons, and that another 25% were performed for uncertain indications.

Implications for Health Care Reform: This article shows that a significant number of costly procedures have been found to be either inappropriate or of uncertain value, in managed care settings as well as in fee-for-service medicine.

The Appropriateness of Hysterectomy
A Comparison of Care in Seven Health Plans

Steven J. Bernstein, MD, MPH; Elizabeth A. McGlynn, PhD; Albert L. Siu, MD, MSPH; Carol P. Roth, RN, MPH; Marjorie J. Sherwood, MD; Joan W. Keesey; Jacqueline Kosecoff, PhD; Nicholas R. Hicks, MRCP (UK), MFPHM; Robert H. Brook, MD, ScD; for the Health Maintenance Organization Quality of Care Consortium

Objective: To develop and test a method for comparing the appropriateness of hysterectomy use in different health plans.
Design: Retrospective cohort study.
Setting: Seven managed care organizations.
Patients: Random sample of all nonemergency, nononcological hysterectomies performed in the seven managed care organizations over a 1-year period. Patients who were not continuously enrolled in a plan for 2 years prior to their hysterectomy were excluded.
Main Outcome Measures: Proportion of women undergoing hysterectomy in each plan for inappropriate clinical reasons according to ratings derived from a panel of managed care physicians.
Results: Overall, about 16% of women underwent hysterectomy for reasons judged to be clinically inappropriate. Only one plan had significantly more hysterectomies rated inappropriate compared with the group mean (27%, unadjusted). Adjusting for age and race did not affect the rankings of the plans and had little effect on the numeric results.
Conclusion: The rates of inappropriate use of hysterectomies are similar to those for other procedures and vary to a small degree among health plans. This information may be useful to purchasers when they consider which health plans to offer their employees.
(JAMA. 1993;269:2398-2402)

assessments of both the overuse and underuse of services; and (3) evaluate care for acute and chronic conditions for all age groups in the population. As part of that system, the study reported herein developed explicit criteria for assessing whether hysterectomies were being performed for clinically appropriate reasons and applied those criteria to procedures performed in 1989 or 1990 in seven managed care organizations of varying size, structure, and geographic location.

Over 500 000 hysterectomies are performed each year in the United States at an estimated cost of over $2 billion. Since Doyle first reported that many hysterectomies may be unnecessary, numerous articles in both the lay and medical press have reported the overuse of hysterectomy. The American College of Obstetricians and Gynecologists has recognized the potential for overuse and issued guidelines regarding the appropriate indications for hysterectomy. Perhaps because of this increased attention the number of hysterectomies performed annually in the United States has declined by more than 100 000 since 1978.
Although overuse (eg, inappropriate use of hysterectomy) may be considered a less serious problem in managed care compared with the fee-for-service sector, the extent of overuse is unknown. Further, we are unaware of any concerns about underuse of this procedure (eg, women with invasive cervical cancer being unable to obtain a hysterectomy) in either managed care or fee-for-service practice. For these reasons, the consortium decided to examine the appropriateness of hysterectomy [...]

From RAND, Santa Monica, Calif (Drs Bernstein, McGlynn, Siu, and Brook, and Ms Keesey); the Schools of Medicine and Public Health, University of Michigan, Ann Arbor (Dr Bernstein); the Schools of Medicine (Drs Siu, Kosecoff, and Brook) and Public Health (Dr Brook), UCLA, Los Angeles, Calif; Value Health Sciences, Santa Monica, Calif (Ms Roth and Drs Sherwood and Kosecoff); and the Department of Public Health Medicine, Oxfordshire Health Authority, Oxford, England (Dr Hicks). A complete list of the members of the Health Maintenance Organization Quality of Care Consortium appears at the end of this article. Reprint requests to RAND, 1700 Main St, Santa Monica, CA 90407-2138 (Dr McGlynn).

In 1987, leaders from the managed care industry joined with health services researchers to form the Health Maintenance Organization Quality of Care Consortium. The purpose of the consortium was to design a method for systematically assessing the quality of care provided in health plans with different organizational and financial characteristics and to make information from those assessments publicly available. The consortium was in part motivated by a concern that if decisions about purchasing coverage for health services were made entirely on the basis of cost, serious unintended consequences for the health of the American public could result.
The consortium proposed development of a system that would (1) allow for fair comparisons among both managed care and fee-for-service health plans; (2) include

[text illegible in source] in health plans.

[...] judgment in considering an average patient presenting to an average US physician who performed hysterectomy in 1989. Using a modified Delphi technique, they rated each of the 2115 indications on a nine-point scale of appropriateness, in which 9 was extremely appropriate, 5 was uncertain, and 1 was extremely inappropriate. A procedure was considered appropriate if its expected health benefits (eg, increased life expectancy and relief of pain) exceeded the expected negative consequences (eg, mortality, morbidity, and time lost from work) by a sufficiently wide margin so that the procedure was worth doing. We considered indications appropriate if they had a median panel rating of 7 to 9, without disagreement. Inappropriate indications had a median rating of 1 to 3, without disagreement. We classified as uncertain all indications with a median rating of 4 to 6 and all other indications for which there was disagreement. We defined agreement as occurring when no more than two of the ratings were outside the three-point region (ie, 1 to 3, 4 to 6, or 7 to 9) that contained the median. We defined disagreement as occurring when three or more ratings were in the 1-to-3 region and three or more ratings were in the 7-to-9 region. Details of the panel process have been published.

METHODS

To develop measures of the appropriateness of hysterectomy, we first reviewed the literature to identify the indications for which hysterectomy is used, its efficacy, and risks.
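The panel rating rules stated above (median rating plus the disagreement test) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_indication` is hypothetical, and the example ratings are invented rather than drawn from the panel's data.

```python
from statistics import median


def classify_indication(ratings):
    """Classify one indication from panel ratings on the 1-to-9 scale.

    Disagreement: three or more ratings in the 1-to-3 region AND three or
    more ratings in the 7-to-9 region. Any indication with disagreement is
    uncertain; otherwise the median decides the category.
    """
    low = sum(1 for r in ratings if 1 <= r <= 3)
    high = sum(1 for r in ratings if 7 <= r <= 9)
    disagreement = low >= 3 and high >= 3

    if disagreement:
        return "uncertain"
    med = median(ratings)
    if med >= 7:
        return "appropriate"
    if med <= 3:
        return "inappropriate"
    return "uncertain"


# Hypothetical ratings from a nine-member panel.
print(classify_indication([9, 9, 8, 8, 8, 7, 7, 7, 6]))
print(classify_indication([1, 2, 3, 7, 7, 7, 8, 9, 9]))
```

The second example has a median of 7 but is still classified as uncertain, because three panelists rated it in the 1-to-3 region while others rated it 7 to 9.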
From this review and discussions with clinicians we developed a list of indications, or specific clinical scenarios, in which a nonemergency, nononcological hysterectomy might be considered. The appropriateness of each indication was rated by a panel of managed care physicians. These panel ratings serve as the basis of our appropriateness method.

Indications

The indications were organized according to 13 different clinical categories: recurrent uterine bleeding (anovulatory, ovulatory, and postmenopausal); leiomyomas; endometrial hyperplasia; endometrial polyps; cervical dysplasia; cervical polyps; pelvic relaxation; endometriosis; chronic pelvic inflammatory disease; chronic pelvic pain; asymptomatic; dysmenorrhea; and miscellaneous. Within each clinical category, specific factors were used to define the indications. These factors included age, menopausal status, desire for future pregnancy, severity of bleeding, presence of pelvic pain, pelvic examination findings (eg, leiomyoma size, ovarian mass, or degree of pelvic relaxation), intensity of medical therapy, previous investigations (eg, ultrasound, laparoscopy, or laparotomy), previous surgical treatment, cytology results, and past medical history. One example of an indication is as follows: a patient with leiomyomas who states that she does not want additional children, is 30 to 34 years old with one or more children, has a 12- to 15-week gestational uterine size, has mild uterine bleeding, and experiences pelvic pain or discomfort.

Panel Ratings

We convened a panel of nine physicians nominated by the members of the Health Maintenance Organization Quality of Care Consortium. Six physicians specialized in obstetrics and gynecology, two specialized in obstetrics and gynecology with an emphasis in reproductive endocrinology, and one specialized in internal medicine.
The panelists were provided with a detailed literature review and asked to rate the appropriateness of hysterectomy for each of the indications using their own best clinical judgment [...]

Sample

The panel's ratings were used to assess, among women in seven of our managed care organizations who had already undergone hysterectomies, the proportion who had the surgery for inappropriate reasons. Three different types of health maintenance organizations participated in the consortium. All are prepaid health care systems that are distinguished by the relationship between the health plan and the physicians who provide the services. An independent practice association (IPA) contracts directly with solo or group-practice physicians to provide services. In the staff model, the physicians are salaried employees of the health plan. In the group model, the health plan contracts with a single physician group to provide the services. These managed care organizations have more than 3 million members, represent four geographic regions of the United States, and vary in size and organizational characteristics. They include AV-MED (statewide IPA; headquarters in Gainesville, Fla); Group Health Cooperative of Puget Sound, Seattle, Wash (staff model); The Health Care Plan, Buffalo, NY (staff model); The Health Plan of America (statewide IPA; headquarters in Orange, Calif); Kaiser Permanente, Colorado Region, Denver (group model); Kaiser Permanente, Southern California Region, Pasadena (group model); and United HealthCare Corporation, Minneapolis, Minn (IPA). All women who underwent hysterectomy between August 1, 1989, and July 31, 1990, and who were continuously enrolled in a plan for a minimum of 2 consecutive years prior to the surgery were eligible for inclusion. The inclusion criteria were selected because detailed clinical information over the 2 years prior to the procedure was required to determine the appropriateness of hysterectomy.
The hysterectomies were identified by reviewing International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) procedure codes (ICD-9-CM codes 68.3 through 68.8) and Current Procedural Terminology (CPT-4) codes (CPT-4 codes 51925 and 58150 through 58285) from the administrative databases (eg, claims or discharges) of the seven organizations. A random sample of 100 hysterectomies in each plan was drawn and we made three attempts to locate the medical records. Two of the smaller plans had fewer cases than requested and one provided more than requested. Patients who underwent hysterectomy as an emergency procedure or who had a confirmed oncological diagnosis as the reason for the procedure were excluded.

Data Collection

We developed a medical-records abstraction form, guidelines, and training material, which are available elsewhere. The data collectors returned the completed abstraction forms to RAND, Santa Monica, Calif, along with photocopies of the admitting history and physical, discharge summary, operative report, reoperation report (if any), and postoperative pathology report. All abstraction forms were reviewed by a nurse at RAND for consistency with the photocopied portions of the record and by a study physician who made specific clinical determinations such as whether a trial of hormonal therapy had been adequate.

During the study period, a total of 5126 patients underwent hysterectomy at the seven plans. The records of 712 of these patients were selected for abstraction. We excluded 60 records (8%) because they were for patients who had undergone either emergency hysterectomy or hysterectomy for a gynecological malignancy, and an additional 10 records (1%) could not be found. In one plan, patient permission had to be obtained to access the medical records for review. Consent was obtained from 73% of its patients.
Table 1. Demographic and Clinical Characteristics of 642 Patients Undergoing Hysterectomy in 1990 in Seven Health Plans

Characteristic                                  No. of Cases (%)
Age, y
  20-39                                         195 (30)
  40-49                                         292 (46)
  50-59                                         89 (14)
  ≥60                                           66 (10)
Marital status
  Married                                       481 (75)
  Single                                        153 (24)
  No data                                       8 (1)
Race
  White                                         473 (74)
  Black                                         85 (13)
  Other                                         84 (13)
Children
  None                                          76 (12)
  One                                           98 (15)
  Two or more                                   466 (73)
Diagnostic procedures*
  Laparotomy                                    6 (1)
  Laparoscopy                                   53 (8)
  Diagnostic endometrial evaluation
    One                                         183 (29)
    Two or more                                 66 (10)
  Ultrasound                                    345 (54)
History of gynecological surgery
  Myomectomy                                    4 (1)
  Bladder suspension                            5 (1)
  Unilateral oophorectomy                       37 (6)
  Cesarean section                              60 (9)
Obesity†                                        126 (20)
Hypertension                                    110 (17)
Chronic obstructive pulmonary disease           53 (8)
Diabetes mellitus                               14 (2)

*Within 2 years prior to hysterectomy.
†Defined using Quetelet's index.

Table 2. Examples of Appropriate, Uncertain, and Inappropriate Indications in Patients Undergoing Hysterectomy

Indication                                                                                         Median Appropriateness Rating   No. of Cases (%)
Appropriate: symptomatic second-degree uterine prolapse without cystocele or rectocele,            9                               20 (3.1)
  aged [illegible] y with children and no desire for more children, no prior conservative therapy*
Uncertain: mild abnormal ovulatory uterine bleeding, aged ≥40 y, currently bleeding                [illegible]                     14 (2.2)
  persistently, one course of hormonal therapy and one diagnostic evaluation of the endometrium
Inappropriate: leiomyomas <12-wk size with mild bleeding and without pain or discomfort,           3                               23 (3.6)
  patient aged ≥40 y†

*Degree of uterine prolapse was defined as follows: (1) first degree (cervix descends into the vagina but not beyond the introitus); (2) second degree (cervix protrudes at the introitus); (3) third degree (cervix protrudes beyond the introitus); and (4) fourth degree (both the cervix and vagina are beyond the introitus and the vagina is inverted). Conservative therapy consisted of Kegel exercises, topical estrogen cream, or vaginal pessary. Patients were considered to have desired more children if there was a note documenting that the patient wanted more children or was unsure if she wanted future pregnancy.
†Leiomyomas were classified based on uterine size: <12-wk gestational size, 12-15-wk gestational size, and >15-wk gestational size. Bleeding was either mild (hematocrit >0.36), moderate (hematocrit 0.30 to 0.36), or severe (hematocrit <0.30). A complete list of definitions is available elsewhere.

Analytical Approach

Each patient was assigned to an indication and, for some, more than one indication applied (eg, a patient with dysfunctional uterine bleeding could also have leiomyomas). For these patients, the final indication applied was the one with the highest appropriateness score. We then used logistic regression to test whether there were differences in the appropriateness of hysterectomy among the plans. The regression equations were estimated both with and without adjustment for age and race. The proportion of inappropriate hysterectomies performed in each plan was compared with the average proportion of hysterectomies performed in all of the participating plans combined. Thus, each plan had a score representing its difference from the group mean and the statistical significance of this difference. This comparison was selected because there are no national standards for the correct rate of inappropriate hysterectomies.

RESULTS

The patients' median age was 44 years and 10% were older than the age of 60 years. Seventy-five percent were married and 12% had no children. The majority of patients were white and one fifth were obese (as defined by Quetelet's index).
In the 2 years prior to hysterectomy, six patients had undergone an exploratory laparotomy; 53 underwent laparoscopy; 183 had one diagnostic evaluation of the endometrium (eg, dilatation and curettage, laparoscopy, and endometrial biopsy); 66 had two or more diagnostic evaluations of the endometrium; and 345 had undergone ultrasound. Previous gynecological surgery included myomectomy (1%), bladder suspension (1%), unilateral oophorectomy (6%), and cesarean section (9%) (Table 1).

Examples of appropriate, uncertain, and inappropriate indications for hysterectomy are shown in Table 2. We used 199 (9%) of the possible 2115 indications to categorize the 642 patients. Three indications were needed to describe 12%, six indications were needed to describe 21%, and 24 indications were needed to describe 50% of our sample. The five most frequent appropriate indications describe 30% of appropriate hysterectomies; the seven most common inappropriate indications describe 40% of inappropriate hysterectomies; and the four most common uncertain indications represent 33% of the uncertain hysterectomies.

Overall, 58% of the patients underwent hysterectomy for appropriate reasons, 25% for uncertain reasons, and 16% for inappropriate reasons. Older women were more likely than younger women to have received a hysterectomy for appropriate reasons (P<.0001). The proportion performed for appropriate indications was 44% in the 21- to 40-year-old age group compared with 83% in patients aged 60 years and older. Similarly, 28% of surgeries were inappropriate in the younger age group compared with 5% among the older age group. Although nonwhite women had fewer inappropriate hysterectomies, the difference was not statistically significant (11% vs 19% inappropriate hysterectomies; P<.09).

Among the seven plans, the proportion of procedures performed for inappropriate reasons ranged from 10% to 27% and the proportion performed for appropriate reasons ranged from 42%
to 69%. Patients undergoing hysterectomy at health maintenance organization F were more likely to have undergone a hysterectomy for inappropriate indications, with an adjusted inappropriate rate of 29.4% (13 percentage points higher than the group mean; 95% confidence interval, 3.8 to 21.2) (Table 3). Finally, after controlling for age and race, there was a difference of 6 percentage points between the two organizational models (favoring the group and staff model over the IPA). This difference did not reach statistical significance (P<.07).

Table 3. Percentage of Inappropriate Hysterectomy by Managed Care Organization

                       Inappropriate*
Managed Care           Unadjusted                     Adjusted
Organization           (95% Confidence Interval)      (95% Confidence Interval)†
HMO A                  -2.2 (-7.8 to 5.3)             -2.0 (-7.5 to 5.4)
HMO B                  5.3 (-1.4 to 13.7)             1.7 (-4.2 to 9.0)
HMO C                  0.7 (-5.3 to 8.5)              0.3 (-5.5 to 7.8)
HMO D                  -2.4 (-7.9 to 5.1)             -0.8 (-6.7 to 6.9)
HMO E                  -2.8 (-9.1 to 6.2)             -2.3 (-8.6 to 6.7)
HMO F                  10.8 (2.2 to 21.2)§            13.0 (3.8 to 23.7)§
HMO G                  -6.1 (-10.8 to 0.5)            -5.6 (-10.4 to 1.0)

*The percentages for each health maintenance organization (HMO) are differences from the overall mean. Both the unadjusted and adjusted overall means are 16.4.
†Adjusted for age and race.
‡For example, HMO A has an unadjusted proportion of inappropriate surgeries that is 2.2 percentage points lower than the average, or an overall proportion of 14.2%; the adjusted proportion is 2.0 percentage points lower than the average, or 14.4% inappropriate surgeries.
§P<.01.

COMMENT

In this study we developed criteria to assess the appropriateness of use of hysterectomy. We applied these criteria to women in seven managed care organizations and found that 16% were performed for inappropriate reasons, and this rate varied from 10% to 27% by plan. One plan had a significantly higher proportion of hysterectomies rated as inappropriate compared with the overall average. These results can be compared with reports in the literature, some of which are quite old. For example, Gambone et al found that 5% of hysterectomies performed at the Naval Hospital in San Diego, Calif, were inappropriate while Dyck et al reported that 24% of hysterectomies performed in Saskatchewan in the early 1970s were inappropriate.

Some physicians may be concerned about the criteria that were developed to assess appropriateness. A group of expert physicians from the participating organizations reviewed and rated the indications for hysterectomy based on their clinical judgment and the literature review. Although a different group of clinicians may have selected different criteria, we believe that the judgments are reasonable. A review of the indications in Table 2 shows them to be consistent with the clinical literature. In addition, the final ratings of indications for leiomyomas, recurrent uterine bleeding, and chronic pelvic pain are similar to the American College of Obstetricians and Gynecologists criteria sets. The ratings are in the public domain and available for examination.

In general, we do not believe that cases were rated as inappropriate because of poor documentation. We restricted this study to patients who were enrolled in the same plan for at least 2 years prior to hysterectomy in order to collect all relevant outpatient data. The abstractors collected data from their own plan and therefore knew its specific charting system. They had access to and abstracted data from all physician office records to ascertain severity and duration of symptoms, intensity of medical therapy, and previous diagnostic tests. In addition, to examine whether missing data might affect our results we performed several sensitivity analyses.
For some patients, the data to determine the duration of hormone therapy for recurrent uterine bleeding were incomplete, which made the adequacy of that trial difficult to determine. We compared the difference in overall rates of appropriateness if such patients were given a code for receiving an adequate trial vs an inadequate trial and found that at most there was a 1-percentage-point difference in appropriateness ratings. We ran a similar test for the treatment of data regarding the persistence of bleeding and found that coding such missing data as persistent vs not persistent did not significantly affect the overall ratings.

With one exception, individual patient preferences for one set of benefits vs one set of risks were not explicitly accounted for in defining appropriateness. The one major exception was women who desired future pregnancy, for whom additional efforts to preserve the uterus were required before judging hysterectomy to be appropriate. Data on a desire for future fertility were missing from the record in 131 women. If all missing data were scored as equivalent to a woman desiring future fertility, 124 (95%) of the 131 hysterectomies would be rated inappropriate. If missing data were scored as equivalent to a woman not desiring future fertility, 78 cases would be scored as appropriate, 34 as uncertain, and 19 as inappropriate. We elected to treat missing data on future fertility as equivalent to the woman not wanting future fertility because, among those women for whom we had data and who were premenopausal, about 3% indicated a desire for more children.

The data presented herein can be used both for external (consumers and purchasers) and internal (quality improvement) purposes. The flexibility of this approach to respond to a variety of demands for information is one of the strengths of this method.
Because our primary purpose was to develop information for public release, we begin by illustrating its uses in that arena and follow with a brief discussion of the information that might be provided to health plans. Assume for the moment that the data presented herein were collected on competing health plans within the same geographic area by an independent organization (private or public). Using an external organization will to some extent mitigate opportunities for gaming results. In considering which plan to choose, consumers could receive information on the rate of hysterectomies by plan and the percentage in each that were clinically inappropriate. The information presented to consumers would be based on results adjusted for age, and only numbers for the group average and plans with clinically and statistically significantly different results would be presented (along with information on the magnitude of the difference). We envision that the information on hysterectomy could be presented along with results from the entire quality assessment system, which might show plans performing better in one area and worse in another.

Purchasers could be provided with all of the information presented in Table [number illegible in source], and they might limit the health plans offered to employees to only those that performed at the average or better in their area. Accreditation organizations such as the National Committee for Quality Assurance could require plans seeking its accreditation to demonstrate the extent to which they perform hysterectomies for appropriate reasons.

Health plans could use data with considerably more clinical detail for their internal quality improvement efforts and, in addition, they could receive their individual clinical profiles. This would allow a plan to identify the areas in which quality-improvement efforts might be targeted.
Because the information was collected to represent the plan as a whole, it is unlikely that it could be used to determine whether there were differences among individual providers or individual health centers within the plan. The plan might collect additional data as part of its continuous quality-improvement system to answer questions regarding differences among its sites or providers.

In conclusion, we have shown that competing health care systems can work effectively together to examine the appropriateness of the care they provide. This information will be valuable to them and could be used in the future by regulatory agencies, corporate purchasers, or the public. During these times of increasing financial constraints, it is important to seek the most cost-effective quality care that is available. It may be unreasonable to continue paying, out of public funds, for health care that is judged on average to be inappropriate (ie, the risks outweigh the benefits) or even uncertain (ie, risks and benefits are about equal). This might be the policy even for patients whose preference structure differed from that of the average patient such that performing the procedure would be slightly beneficial. Such patients might be offered the opportunity to pay for the procedure themselves or to use after-tax health insurance dollars for such a purpose. Regardless of decisions about the use of such information for reimbursement purposes, purchasers ought to have this type of information available to facilitate their own decisions about health care coverage.

Finally, it is important to note that this is not simply an issue for managed care: the inappropriate provision of surgical and medical procedures occurs throughout the health care delivery system. As we begin moving toward the systematic provision of public information on quality, the full range of health plans must be included in the effort.
This work was supported by the John A. Hartford Foundation, New York, NY, the National Institute on Aging, Bethesda, Md (Dr Siu), and the members of the Health Maintenance Organization Quality of Care Consortium. Dr Hicks' work was supported by a Harkness Fellowship, funded by the Commonwealth Fund of New York, NY.

The Health Maintenance Organization Quality of Care Consortium consists of 11 managed care plans seeking to establish a quality-of-care research agenda leading to the development of valid outcome- and process-related measures that can be used by differing health care delivery systems to provide information on quality for public release. Consortium members and their representatives during the course of this project included the following: AV-MED, Gainesville, Fla: Jerome Beloff, MD, Bernard Mansheim, MD; CIGNA Healthplans, Bloomfield, Conn: Norbert Goldfield, MD, Francis Lieb, MD, David Ferriss, MD; Group Health Cooperative of Puget Sound, Seattle, Wash: Bruce Perry, MD; Harvard Community Health Plan, Brookline, Mass: [name illegible], MD, [given name illegible] Leaning, MD; The Health Care Plan, Buffalo (NY): Edward [surname illegible], MD, MHA; The Health Plan of America, Orange, Calif: [name illegible], John Austin, MD; Kaiser Permanente, Colorado Region, Denver: [name illegible], MD; Kaiser Permanente, Northwest Region, Portland, Ore: Terry Carr, RN; Kaiser Permanente, Southern California Region, Pasadena, Calif: Samuel Sapin, MD; MedCenters Health Plan, Minneapolis, Minn: Iris Johnson, RN, Gloria Swanson, RN; and United HealthCare Corporation, Minneapolis, Minn: Sheila Leatherman, MSW.

This study could not have been completed without the support of the physicians, health care plans, and hospitals, who permitted us to review their records.
The study also could not have been accomplished without the assistance of the many data collectors from each of the institutions and the medical records staffs of the associated hospitals, who provided the information used in this study. We thank Mark Chassin, MD, and [name illegible], MD, for their earlier work on indications for hysterectomy, on which this study was built. Our RAND colleague, Robert Bell, PhD, provided valuable statistical consulting. We also thank Ahmar Iqbal, [degree illegible]; Barbara [surname illegible], RN; Ed Park, PhD; [name illegible], PhD; and David Hadorn, MD, for assistance in abstractor training, data collection, and processing. Tamara Majeski, Thoa Nguyen, and [name illegible] deserve special thanks for assistance with manuscript preparation. Finally, we express our deepest appreciation to the members of the Health Maintenance Organization Quality of Care Consortium Hysterectomy Appropriateness Panel for the time they took to rate the appropriateness criteria used in this project. The panel's members included Edward Blumenstock, MD; David Coyer, MD; [given name illegible] Emont, MD; Sheila Gately, MD; John Hachiya, MD; Ruth Krauss, MD; Walter Schwimmer, MD; [given name illegible] Sheth, MD; and Sanford Yankow, MD.

References

1. Graves EJ. Detailed diagnoses and procedures, National Hospital Discharge Survey: United States, 1990. Vital Health Stat 13. 1991;113:118.
2. Easterday CL, Grimes DA, Riggs JA. Hysterectomy in the United States. Obstet Gynecol. 1983;62:203-212.
3. Doyle JC. Unnecessary hysterectomies: study of 6248 operations in thirty-five hospitals during 1948. JAMA. 1953;151:360-365.
4. DeFriese GH, Evans AT, Ricketts TC, Cromartie EP. North Carolina Medical Society Practice Variation Study of Hysterectomy. Chapel Hill: University of North Carolina; 1989.
5. Dyck FJ, Murphy FA, Murphy JK. Effect of surveillance on the number of hysterectomies in the province of Saskatchewan. N Engl J Med. 1977;296:1326-1328.
6. Jenkins VR. Unnecessary-elective-indicated? Audit criteria of the American College of Obstetricians and Gynecologists to assess abdominal hysterectomy for uterine leiomyoma. Qual Rev Bull. 1977;3:7-12, 21. 7. Dranov P. Do you need these operations? Health. June 1986:24-27. 8. Miller NF. Hysterectomy, therapeutic necessity or surgical racket? Am J Obstet Gynecol. 1946;51:804-810. 9. American College of Obstetricians and Gynecologists Task Force on Quality Assurance. Quality Assurance in Obstetrics and Gynecology. Washington, DC: American College of Obstetricians and Gynecologists; 1989. 10. Pokras R, Hufnagel V. Hysterectomy in the United States, 1965-1984. Am J Public Health. 1988;78:852-853. 11. Greenfield S, Nelson EC, Zubkoff M, et al. Variations in resource utilization among medical specialties and systems of care: results from the Medical Outcomes Study. JAMA. 1992;267:1624-1630. 12. Bernstein SJ, McGlynn EA, Kamberg C, et al. Hysterectomy: A Literature Review and Ratings of Appropriateness. Santa Monica, Calif: RAND; 1992. 13. Park RE, Fink A, Brook RH, et al. Physician ratings of appropriate indications for six medical and surgical procedures. Am J Public Health. 1986;76:766-772. 14. Division of Quality Control Management, American Hospital Association. International Classification of Diseases, Ninth Revision, Clinical Modification. Chicago, Ill: American Hospital Publishing Inc; 1989. 15. Department of Coding and Nomenclature, American Medical Association. Physicians' Current Procedural Terminology, 1991. 4th ed. Chicago, Ill: American Medical Association; 1990. 16. Sherwood MJ, Roth CP, Bernstein SJ, et al. Medical Record Abstraction Form and Guidelines for Assessing the Appropriateness of Use of Hysterectomy. Santa Monica, Calif: RAND; 1991. Publication N-3435-HF. 17. Khosla P, Lowe CR. Indices of obesity derived from body weight and height. Br J Prev Soc Med. 1967;21:122-128. 18. Gambone JC, Lench JB, Slesinski MJ, Moore JG.
Validation of hysterectomy indications and the quality assurance process. Obstet Gynecol. 1989;73:1045-1049.

2402 JAMA, May 12, 1993, Vol 269, No. 18. Appropriateness of Hysterectomy, Bernstein et al

For Official Use Only 5/11/93

SUMMARY

Title: "Effects of the National Institutes of Health Consensus Development Program on Physician Practice," Jacqueline Kosecoff, et al, JAMA. 258:19, pp. 2708-2713

The authors studied the impact of guidelines developed by National Institutes of Health consensus conferences on treatment of breast cancer, Caesarian sections, and coronary artery bypass surgery. The study showed that compliance increased significantly with only 5 of the 11 recommendations studied, and that 6 of the 11 recommendations were complied with less than half of the time following dissemination of the recommendations.

Implications for Health Care Reform: This article shows that new scientific knowledge, expert consensus, and practice guideline development often fail to produce changes in the practices of physicians and hospitals across the country.

Effects of the National Institutes of Health Consensus Development Program on Physician Practice

Jacqueline Kosecoff, PhD; David E. Kanouse, PhD; William H. Rogers, PhD; Lois McCloskey, MPH; Constance Monroe Winslow, MD, MBA; Robert H. Brook, MD, ScD

The effects of the National Institutes of Health Consensus Development Program on physician behavior were investigated. The medical records of 2770 patients treated in ten hospitals throughout the state of Washington were reviewed to determine if quality of care improved with respect to 12 recommendations put forth by four consensus panels concerning surgical management of primary breast cancer, the use of steroid receptors in breast cancer, cesarean childbirth, and coronary artery bypass surgery. Care was studied during 24 months before and 13 to 24 months after each consensus conference.
Results showed that the conferences mostly failed to stimulate change in physician practice, despite moderate success in reaching the appropriate target audience. It was concluded that the consensus development conference is an important educational tool whose effects might be enhanced by focusing on areas of practice that need improvement and by encouraging follow-up programs at the state and local level.

(JAMA 1987;258:2708-2713)

See also pp 2727, 2738, and 2739.

From the Department of Medicine, UCLA School of Medicine (Dr Winslow); UCLA School of Public Health (Drs Kosecoff and Brook and Ms McCloskey); Fink and Kosecoff Inc, Santa Monica, Calif (Dr Kosecoff); and The Rand Corp, Santa Monica, Calif (Drs Kanouse, Rogers, and Brook).

The views expressed in this article are the authors' and are not necessarily shared by The Rand Corp, UCLA, or the National Institutes of Health.

Reprint requests to The Rand Corp, 1700 Main St, Santa Monica, CA 90406 (Dr Kanouse).

JAMA, Nov 20, 1987, Vol 258, No. 19

THE EXPLOSIVE growth in medical knowledge is a well-known fact. A large part of that growth has been financed by support for biomedical research provided by the National Institutes of Health (NIH). With this explosive growth has come an ever more pressing need to educate the physician community so that new knowledge can be translated into medical practice. For that reason, in 1977, the NIH created a program to facilitate the dissemination of changes in the state of science to the health care profession and the public.

The NIH Consensus Development Program, administered by the NIH Office of Medical Applications of Research, brings together scientists, medical practitioners, and informed laypeople to conduct public evaluation of scientific information about biomedical technologies. Each panel meets for about 2½ days, first reviewing scientific evidence and then meeting in executive session to seek consensus on key questions posed in advance of the conference.
The panel's findings are presented in the form of a consensus statement that contains recommendations for medical practice. The statements are disseminated through reports in medical journals and directly to practicing physicians and other health professionals, the biomedical research community, and the public.

Through 1986, the NIH had conducted 60 consensus development conferences covering drugs, devices, techniques, and procedures used for diagnosis, treatment, prevention, and public health purposes. Recent topics have included the impact of screening blood and plasma products for human immunodeficiency virus antibodies, infantile apnea and home monitoring, platelet transfusion therapy, and diet and exercise in noninsulin-dependent diabetes mellitus. The NIH program has served as a model for similar activities in Canada and Europe.

To assess the effectiveness of the Consensus Development Program, the NIH funded a study of how consensus conferences have affected the knowledge, attitudes, and practices of physicians. The study drew on data from several sources, including a survey of physicians' knowledge, attitudes, and practices and a review of hospital medical records to determine changes in actual practice. This article reports results of the medical review component of the study. It specifically examines three questions: (1) Did the conferences' recommendations address areas in need of change or was preconference "compliance" with the recommendation already high? (2) Did the recommendations stimulate change where none was occurring? (3) Did they accelerate change that was already under way?

Physician Practice, Kosecoff et al

Table 1. Sampling Frame for Evaluating Effects of NIH Conferences*
NIH Conference 1. The treatment of primary breast cancer: management of local disease 2. Steroid receptors in breast cancer 3. Cesarean childbirth Date imvy 8/79 9/80 Time Period Range, Sampling Frame Patients with primary breast cancer receiving mastectomy Cesarean delivery this pregnancy No. No. of Hospitals Time 1: 7/77-6/78 Time 2: 7/78-6/79 Time 3: 7/80-6/81 102 222 10 Time 1: 1/79-12/79 Time 2: 1/80-9/80 Time 3: 7/81-6/82 75 140 162 10 6, 7, 8 of Recommendations Addressed 1, 2, 3 249 Delivery with history of a cesarean section 4. Coronary artery bypass surgery: scientific and clinical aspects 12/80 10 4 Same 16 39 48 10 5 Random sample of deliveries 56 113 126 Breech presentation at time of current delivery Same Same 25 57 73 88 160 216 10 4, 5, 6, 7, 8 10 10 Patients admitted for unstable angina Time 1: 1/79-12/79 Time 2: 1/80-12/80 Time 3: 1/82-12/82 Patients receiving coronary angiography during current admission Same 64 161 178 9, 11, 12 Patients receiving bypass surgery during current admission Same 72 154 175 9, 11, 12
*NIH indicates National Institutes of Health.

Table 2. Consensus Conference Recommendations for Four Conferences*

The treatment of primary breast cancer: management of local disease
1. Total mastectomy with axillary dissection in women with stage I or early stage II disease is the treatment standard.
2. A 2-step procedure should be performed.

Steroid receptors in breast cancer
3. An estrogen receptor assay should be performed on each primary tumor.

Cesarean childbirth
4. Low-risk women who had a previous low-segment cesarean birth should be given a trial of labor for potential vaginal delivery.
5. Vaginal delivery of a frank breech baby weighing less than 3.6 kg (8 lb) is acceptable provided the mother has normal pelvic architecture.
6. The choice of anesthesia should be discussed with the patient; regional anesthesia is an appropriate option.
7. If the need for cesarean delivery occurs during labor, a discussion between the patient and physician should take place.
8. The need for a cesarean delivery should be based on sound clinical judgment.

Coronary artery bypass surgery: scientific and clinical aspects
9. The workup should be efficient.
10. Patients with unstable angina should receive coronary angiography during the initial phase of hospitalization.
11. Surgery is indicated only in patients with critical stenosis of any major coronary branch.
12. High-risk patients should undergo coronary angiography and, if justified by the patient's symptoms, lack of response to medical management, or coronary anatomy, coronary artery bypass surgery should be performed.

*The recommendations are listed in a paraphrased summary form. The full text of the recommendations is available elsewhere.

METHODS

The medical record review was designed to measure the impact of four conferences: (1) the treatment of primary breast cancer: management of local disease; (2) the use of steroid receptors in breast cancer; (3) cesarean childbirth; and (4) coronary artery bypass surgery. These four conferences were selected from among 30 held through March 1981, based on the following criteria: (1) the conference must have produced readily apparent recommendations; (2) compliance with these must be ascertainable by reviewing a hospital medical record; (3) patients to whom the recommendation applied should be easy to identify; and (4) recommendations should affect enough patients to permit the development of a feasible sampling frame.

Study Design

We first identified all acute, nonspecialty, nonfederal hospitals in the state of Washington with more than 150 beds and divided them into four strata based on size and teaching status. We then selected ten hospitals (likelihood of being sampled proportional to number in strata). For each conference, we studied medical records drawn from two periods before and one period after the conference, with each period lasting nine to 12 months. Generally, time period 1 was 13 to 24 months and time period 2 was zero to 12 months before the conference.
Time period 3 began nine to 12 months after the conference to allow for dissemination of the conference results. (See Table 1 for specific dates of the conferences and the three time periods.) Medical records of specified patients were sampled from hospital logs or the hospital-abstracting servicels reports using a two-stage procedure in which we first sampled physicians within hospitals and then patients within physicians. Within a hospital, each patient had an equal Physician Practice—Kosecoft et al 2709 �Table 3—Compliance Wltti Consensus Conference Recommendations by Time Penod Compliance, % Before Periods Atter Cnangermo, All Periods TlmeS (%)' Hatkm Definition of Compliance No. Time 1 Time 2 1 Stage I and a any stage II breast cancer patients receiving total mastectomy wtth axillary dlsaectlon. % (N - 73. 158. 170) 74 79 84 0.23 (0.14) After ve Adjusted (%)t -1.5 (4.6) 2 Breast cancer palianta receiving 2-step procedure. % (N - 1 0 1 . 219. 248) 38 39 46 0.30t(0.14) 3 Estrogen receptors performed on primary breast tumor, % ( N - 1 0 1 , 219, 248) 54 78 86 0.675(0.12) 4 Wai ol labor occurred tn women with prwrioua low transverse cesarean section. % ( N - 3 5 , 84. 70) 6 11 29 0.905(0-22) 2.4 (5.8) Vaginal delivery ooourred m women with previous kwr transverse cesarean section. X ( N - 3 5 . 84.70)0 6 6 16 0.4U(0.17) 2.1 (4.S) THal of labor lor eligible fran* breech babiea and mothers. % ( N - 1 2 . 28. 35) 56 46 37 -0.48 (.48) 2.1 (13.1) Vaginal delivery tor eligible (rank breech babies and mothers. % ( N - 1 2 . 26. 35)11 33 23 28 -0.05 (.42) 1.1 (11.4) Discussion with patients of options tor anesthesls In random sample ol dedveriea. % ( N - 2 5 . 57. 73) 40 51 33 - 0 . 5 0 (0.33) -12.7 (8.9) Discussion with psttents ol options tor anesthesia In sample ol women delivering by cesarean. % (N - 75.140.182)1 31 41 36 0.03 (0.21) -7.4 (5.3) -5.4 (5.0) 5 6 0.8 (4.8) -13.18(3.9) Cesarean dellvenes with regional anesthesia. 
% (N-307)( 69 81 84 0.39t(0.19) 7 Discussion with patient about surgery In esses of unplanned cesarean aectlon, % ( N - 3 8 . 72, 85) 34 32 22 - 0 . 4 0 (0.22) - 2 . 5 (6.6) S Women not receiving cesarean delivery In sample ol random deliveries, % ( N - 2 5 , 57. 73) 64 84 85 0.01 (0.25) 1.3 (8.8) 9 Patients receiving s 1 noninvasive test before coronary angiography, % ( N - 5 9 . 153, 164) 90 97 93 0.01 (0.09) -6.4t(3.0) Patients receiving s i noninvasive lest before coronary artery bypass surgery, % ( N - 6 8 . 151. 157)|| 90 92 95 0.13 (0.09) -1.9 (3.2) 10 Patients with unstable angina receiving coronary angiography on day 1 or 2 of hoapitallzatlon. % (N-86.173. 209) 14 19 24 0.23 (0.13) -2.0 (4.3) 11 Patients receiving coronary artery bypass surgery who had mors than 1 diseased vessel or a diseased left anterior descending or left main vessel ( N - 6 8 . 151. 187) 65 87 68 0.07 (0.12) -1.7 (4.0) Coronary angiography patients who had 1 vessel disease and who did not receive or were not recommended to undergo bypass surgery, % (N - 1 0 , 42. 44)|( 70 80 55 - 0 . 3 3 (0.38) 100 100 97 - 0 . 1 0 (0.09) Coronary angiography patients who had no vessel disease and did nof receive or were not recommended to undergo bypesa surgery % (N - 1 3 . 2 7 , 3 4 ) | 12 6.9 (13.3) -1.0 (3.2) ... Average appropnateness of pertorming coronary angiography (defined on 9polnt scale where 9 la most appropriate) ( N - 5 9 . 1 5 3 . 1 6 4 ) «-5.7 t-5JS ft-Si Average appropriateness of pertorming coronary artery bypass surgery (defined on same 9-point scale) (N-88,151,167)1 *-6.5 ft-6.6 »-8.7 •Change per month represents s lines/ trend. Forrecommendation1, It means that compliance Increased 0.23%/mo. The SE was 0.14% and tha change was not slgniflcant tAlter vs belore lor recommendation 1 means thai the expected compliance in alter period 3 ia 1.5 less than what would have been predicted by the linear trend established In periods 1 and 2. 
The SE of prediction was 4.6% and comparison is not significant. ‡P<.05. §P<.001. ||Alternative definitions of compliance.

chance of being selected. Two of the four consensus conferences made recommendations that covered different subpopulations. For instance, the cesarean conference made recommendations about breech deliveries on the one hand and delivery of women who had a previous cesarean section on the other. For these conferences, we sampled records for each group of patients to whom recommendations applied (Table 1).

Selecting Recommendations

We defined as a recommendation any statement that directs physicians or other health care personnel to provide medical care in a specified way. Three physicians reviewed each consensus statement and independently identified recommendations. We selected for study those that (1) were so identified by at least two of the three physicians and (2) could be assessed through a review of hospital medical records, ie, the recommended practice (if performed) had a high likelihood of being recorded in a hospital medical record. For several recommendations on which we collected data, this proved not to be the case. Here we report results for all recommendations that could be accurately assessed through medical record review (see Table 2 for abbreviated and paraphrased descriptions of these recommendations and Table 3 for definitions used to assess compliance). For certain vague recommendations, such as Nos. 8 and 9 in Table 2, we based our definition of compliance on a careful reading of the entire conference proceedings, supplemented when necessary by communication with the conference chairman.
Data Collection

We developed separate versions of medical record abstraction forms for cesarean childbirth, treatment of primary breast cancer, and use of steroid receptors in patients with breast cancer, coronary artery bypass surgery, coronary angiography, and management of patients with unstable angina. Each abstraction form was used to collect detailed clinical data in both precoded and descriptive form and took about 30 to 60 minutes to complete. In addition, we obtained photocopies of selected test reports, such as coronary angiograms.

Table 4. Hospital Differences in Compliance With Two Consensus Conference Recommendations Concerning Breast Cancer*

Hospital                                                        1    2    3    4    5    6    7    8    9   10
No.                                                            51   32   49   29   31   36   43   33   63   34
Recommendation 2: Women Having 2-Step Procedure, %†            65   64   40   66   18   39   16   72   23   32
Recommendation 3: Women Having Estrogen Receptor Assay, %‡     66   68   64   44   83   76   73   69   73   97

*Compliance based on 401 patients with stage I or early stage II breast cancer. These data are aggregated across all three time periods and are adjusted for differences in patient age among hospitals.
†Differences among hospitals, P<.001.
‡Differences among hospitals, P<.05.

Table 5. Percentage of Breast Cancer Surgeries Complying With Consensus Conference Recommendation 1, by Hospital and Time Period*

Hospital†        1    2    3    4    5    6    7    8    9   10
Time 1          85   85  100    4   40   83   83   99   60  100
Time 2          33   90   87   85   86   95   89   60   75   69
Time 3          43   90   83   99   75   79   91   89   89   69
All Times‡      53   68   90   62   67   86   88   89   75   94

*Based on hospital records for 401 women with stage I or early stage II breast cancer. Cell sizes range from 3 to 26. Compliance is defined as total mastectomy with axillary dissection. Differences are age adjusted.
†Hospital effect significant at P<.001 (after controlling for age).
‡Hospital by time effect significant at P<.001 (after controlling for age).
Records were abstracted by 28 data collectors who had previous experience with medical record reviews, passed a test of their abstraction skills, received four days of intensive training, and successfully completed a further test at the end of training. Work was supervised by a chief data collector who visited each hospital on a regular basis (four to six times) and reabstracted records to maintain quality control. Completed forms were reviewed by both a physician and a nonphysician, who assessed internal consistency and made sure that the coding decisions were consistent with supporting clinical data that were copied by data collectors from the medical record. Discrepancies that could not be resolved during the review process were returned for reabstraction. Test reports returned were interpreted by the physician, based on photocopies of the test report.

To ensure the confidentiality of information, we assigned coded identifiers to patients, hospitals, and physicians. Once the data collection process had been completed, all files linking these identifiers to physicians, patients, or institutions were destroyed.

Data Analysis

We examined compliance with 12 recommendations (two for breast cancer treatment, one for use of estrogen receptors, five for cesarean section, and four for heart disease). For six of the recommendations, we examined alternative definitions of compliance. For 11 of the 12 recommendations, compliance was scored as 1 if the recommendation was followed, and as 0 if otherwise. For these recommendations, the extent of compliance across physicians and patients could be specified in percentage terms.

For one recommendation (No. 12 in Table 2), we were unable to define compliance in either/or terms because the recommendation was imprecise. Instead, we substituted a measure of appropriateness, which we believe matches the intent of the NIH panel.
The appropriateness approach was developed in another research study described in detail elsewhere. In brief, patients were divided into clinically homogeneous groups (300 for coronary angiography and 480 for coronary artery bypass surgery) that represented all possible indications for performing each procedure. The appropriateness of each indication was rated by a national panel of experts (three cardiologists, two cardiac surgeons, one radiologist, two internists, and one family practitioner) who represented both private and academic practice. Using a modified Delphi technique, indications were rated on a nine-point scale from 1 being "very inappropriate" to 9 being "very appropriate," where appropriate was defined to mean that the expected health benefit (ie, increased life expectancy, relief of pain, reduction in anxiety, or improved functional capacity) exceeded the expected negative consequences (ie, mortality, morbidity, anxiety anticipating the procedure, pain produced by the procedure, or time lost from work) by a sufficiently wide margin that the procedure was worth doing. The panelists' median rating was assigned as the appropriateness score for each indication. For each patient in the coronary angiography and coronary artery bypass surgery samples of the study reported here, the most appropriate indication was identified along with its associated appropriateness score. The arithmetic means of these appropriateness scores for each time period were used to assess compliance with recommendation 12.

To evaluate possible conference effects, we carried out ordinary least squares (OLS) regressions on the compliance measures.
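The appropriateness scoring described above for recommendation 12 can be sketched roughly as follows. This is a minimal illustration, not the study's own procedure: the indication names, panel ratings, and patients are hypothetical, and the function names are invented for this example. It takes the panel's median rating as each indication's score, gives each patient the score of the most appropriate indication that applies, and averages patient scores within one time period.

```python
from statistics import mean, median


def indication_score(panel_ratings):
    """Median of the panelists' 1-9 ratings for one indication."""
    return median(panel_ratings)


def patient_score(applicable, scores):
    """Score of the most appropriate indication that applies to a patient."""
    return max(scores[name] for name in applicable)


def period_mean(patients, scores):
    """Arithmetic mean of patient scores within one time period."""
    return mean(patient_score(applicable, scores) for applicable in patients)


# Hypothetical ratings from a nine-member panel
# (1 = very inappropriate, 9 = very appropriate).
ratings = {
    "indication_a": [8, 9, 7, 8, 9, 8, 7, 9, 8],
    "indication_b": [2, 3, 1, 4, 2, 3, 2, 5, 3],
    "indication_c": [5, 6, 4, 5, 7, 5, 6, 4, 5],
}
scores = {name: indication_score(r) for name, r in ratings.items()}

# Hypothetical patients in one time period, each listed with the
# indications that apply to that patient.
period_patients = [["indication_a", "indication_b"], ["indication_c"], ["indication_b"]]
period_average = period_mean(period_patients, scores)
```

With an odd number of panelists the median is always one of the ratings actually given, so each indication's score stays on the same 1-to-9 scale.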
(We chose OLS rather than logistic regression because the OLS method yields coefficients that are more readily interpretable and because when certain normality assumptions are met, the maximum likelihood estimate of the regression coefficient for polytomous logit is equivalent to the discriminant function estimate.) For each compliance measure for recommendations 1 through 11, we regressed compliance on a "time period" variable that was calibrated to reflect elapsed time in months. This allowed us to estimate the average percent change in compliance per month across all three time periods (linear trend). We then carried out additional regressions to estimate the deviation (acceleration or deceleration) from this linear trend that occurred after the conference. Regression coefficients were tested for significance by means of t tests. To test the significance of trends across recommendations, we summed the values of separate t tests and divided by the square root of the number of recommendations. (On the conservative assumption that the separate tests are independent, the SE of the sum of the t tests is approximately the square root of the number of tests, since each separate test has an SE of approximately 1.)

RESULTS

Completion Rate and Reliability

Of the ten hospitals asked to participate, eight agreed and two were replaced with hospitals in the same strata. The final sample included one major teaching facility, two other medical school-affiliated hospitals, one nonaffiliated hospital with a residency training program and more than 200 beds, and six community hospitals with more than 150 beds. The hospitals were located in the Seattle, Tacoma, Spokane, or Yakima areas in the state of Washington; no small rural hospitals were in the sample. Of the 2770 patient records sampled, only 22, or less than 1%, could not be located; these were replaced.
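The linear-trend regression and the combined test described under Data Analysis can be sketched as below. This is an illustrative sketch under stated assumptions, not the study's code: the records and t values are hypothetical, the function names are invented for this example, and the additional regression for the postconference deviation from trend is not shown. The first function fits compliance (scored 1 or 0) on elapsed months by simple least squares; the second sums separate t statistics and divides by the square root of their number.

```python
import math


def ols_slope(x, y):
    """Least-squares slope of y on x, its standard error, and the t statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used for the slope's standard error.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se


def combined_z(t_values):
    """Sum of separate t statistics divided by the square root of their count."""
    return sum(t_values) / math.sqrt(len(t_values))


# Hypothetical records: months elapsed since the start of time period 1,
# and whether the recommendation was followed (1) or not (0).
months = [0, 3, 6, 9, 12, 15, 18, 21, 36, 39, 42, 45]
complied = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1]

slope, se, t = ols_slope(months, complied)
percent_change_per_month = 100 * slope

# Hypothetical t statistics from four separate recommendations.
z = combined_z([2.0, 1.0, -0.5, 1.5])
```

Because compliance is scored 0 or 1, the slope multiplied by 100 reads directly as percentage points of compliance gained or lost per month, which is the interpretability argument the text gives for OLS.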
Fourteen breast cancer, 40 obstetric, and 25 heart records were randomly reabstracted by a different medical record abstractor. The κ value was used to assess reliability for 84 critical variables that help to define compliance with a recommendation. Across the 89 records, the mean κ value was 0.82. Fourteen records were also abstracted by a physician who was blind to the previous abstraction; the mean physician agreement with the original abstractor was 90%.

Compliance With Recommendations

Table 3 shows percent compliance with the 11 recommendations for which this could be calculated, including compliance under alternative definitions. Across the 11 recommendations (using only the main definition), compliance averaged 52% in time period 1, 57% in time period 2, and 57% in time period 3. In addition, the average appropriateness score went down from time periods 1 to 3 from 5.7 to 5.2 for coronary angiography and up from 6.5 to 6.7 for coronary artery bypass surgery.

For three of the 11 recommendations, compliance with primary recommendations increased significantly over the three time periods (see the next to last column in Table 3). The combined test for linear trend across the 11 recommendations also showed a statistically significant increase (z = 3.50, P<.001). The test for significant conference effects, ie, for an acceleration in linear trend in time period 3, failed to show positive effects for any of the recommendations. Instead, across the 11 recommendations, there was a significant deceleration in the rate of change during the postconference period (z = 2.39, P<.05). This deceleration was statistically significant for two of the 11 individual recommendations (Table 3).

Based on these results, we can conclude that taken as a whole, these four consensus conferences had no effect on physicians' hospital practice. This conclusion, however, hides important differences by recommendation, hospital, and conference. For instance, we carried out further analyses for the breast cancer recommendations using analysis of covariance (with the patient's age as the covariate) to examine whether there were differences in the pattern of results by hospital. We found significant age and hospital effects for all three recommendations and a significant hospital by time interaction for recommendation 1 (Tables 4 and 5). In addition, we found that in time period 2, just 7% of women undergoing surgery for breast cancer had a radical mastectomy, so that there was little room for conference recommendation 1 to curtail further the use of this procedure.

Evidence on the effects of the conference on childbirth by cesarean section is mixed. On the one hand, it did not result in more women with infants in the breech position being given a trial of labor, more discussion between the woman and her physician about anesthesia options or surgical options before an unplanned cesarean section, or a decreased cesarean rate in general. On the other hand, it may have resulted in an increased trial of labor and vaginal delivery rates in women who had had a previous transverse cesarean section. In these women, the postconference trial-of-labor rate was 29% vs 11% before the conference (P<.001), and the actual vaginal delivery rate was 16% vs 6% before the conference (P<.05). Judged by the more stringent standard of whether this represents a significant acceleration from the preconference rate of change, results are not significant for either measure, although the small sample sizes in time period 1 make it difficult to detect anything short of a very large effect.

Finally, we found no evidence that the consensus conference on coronary artery bypass surgery affected cardiovascular surgery practice for any of the three recommendations. Compliance with two of the three recommendations was so high before the conference that improvement would be difficult to detect. Compliance with the other recommendation (recommendation 10) depended on the type of hospital. If angiographic facilities were available, then compliance was substantially higher both before and after the conference (Table 6). Examination of the mean appropriateness score, which was not affected by the conference, shows that improvement in the use of both coronary angiography and coronary artery bypass surgery was possible, but many of the inappropriate uses of these two procedures were not addressed by the NIH consensus conference. For example, the conference did not address how patients with chest pain of uncertain origin or those with two-vessel disease should be diagnosed or treated.

Table 6. Compliance With Consensus Conference Recommendation Regarding Urgent Performance of Coronary Angiography in Patients With Unstable Angina, by Hospital and Time Period
Subpopulation and Outcome* Time 1 Time 2 Time 3 All Times
68 96 197 No. of patients admitted 33 to hospitals with angiography facilities
64 69 Received angiography, % 45 62
41 47 Received angiography on an emergency basis, % 42 30
74 67 Received angiography on an emergency basis among those who ever received angiography, % 66 60
53 105 113 271 No. of patients admitted to hospitals without angiography facilities
Might have had angiography after discharge, %† 36 33‡ 42‡ 38‡
Transferred for angiography, %
*Based on the analysis of medical records of 468 patients admitted to the hospitals with unstable angina.
†Includes patients specifically transferred for angiography.
‡χ² is significant at P<.005 for the contrast between this value and the value in row 4 in the same column.

COMMENT

Our results at first glance are disappointing. The dedicated efforts of a national agency to affect clinical practice through a consensus conference approach mostly failed to produce change at the bottom line, in the care provided by practitioners to their patients. Indeed, for six of the 11 recommendations analyzed in terms of compliance, the level remained at less than 50% even after the conference. Practice failed to change even though the recommendations appeared to reflect the state of science and sound practice at the time and even though efforts to disseminate the recommendations were at least moderately successful at reaching an appropriate target audience of physicians.

Compliance with the recommendations we studied was, in general, increasing during the year or two immediately preceding the conference. Persuasive evidence that a conference has influenced compliance requires more than a continuation of such a preexisting trend; it requires an acceleration in the rate of change. Our results show instead a deceleration, suggesting first that the conferences probably did not have the intended effect on practice, and second that other sources of influence on practice may have peaked some time before the conferences were held.

Compliance with the recommendations varied by hospital. Certain structural characteristics, such as having a catheterization laboratory, were associated with higher levels of compliance. The small number of teaching hospitals in our sample prevents us from making meaningful comparisons of their responses to those of nonteaching institutions.

One possible explanation of our findings might be that our "after" time period occurred too soon for changes to have taken place.
Although we have no additional evidence to refute this assertion, other investigators have demonstrated for at least one recommendation, trial of labor and vaginal delivery following a previous cesarean birth, that the conference had little effect even four years later. Even if a later effect did occur for another conference, the passage of time makes it increasingly difficult to link such changes to the consensus conference.

Changes in practice by no means follow inevitably from the dissemination of technology assessment findings. To change the state of practice, a dissemination program must offer a timely, scientifically grounded, and clinically relevant message, and it must succeed in getting that message across to the appropriate professional audience, who must be willing and able to act on it. The results reported herein demonstrate that not all of these requirements were met.

The problem of transmitting information about improvements in the state of science to practicing physicians, and thereby changing their practice, will not go away. The NIH model is an appealing one that has been applied in Canada and Europe. The NIH has already improved the dissemination of its consensus conference recommendations by publishing some of them in JAMA, the most widely read physician journal. But based on our results, other changes seem advisable as well.

First, the consensus conferences should concentrate on areas of practice that need improvement. The data from this study show that preconference compliance with recommendations varied markedly both within and across conferences; not all the areas of practice addressed by consensus panels' recommendations needed change. If conferences are to concentrate on the areas where change is most needed, topics must be selected partly on the basis of community practice. This requires that data about actual community practice be systematically examined and considered before a conference's final focus is chosen and key questions are formulated. Ideally, such data would be obtained using sound epidemiologic principles (appropriate sample frames), but the use of less representative data from existing data sources should also be considered. Data on the current state of practice should be made available to the conference participants so that their deliberations and recommendations can take into account both the state of science and the state of practice.

Second, compliance with some recommendations may require changes not under physician control, such as acquisition of new resources or facilities or the presence of a 24-hour anesthesiologist to permit emergency cesarean sections. Recommendations in those areas that are directed to physicians are of limited value unless suggestions for providing such resources are also dealt with in the conference report.

Changing behavior is difficult. Medical education programs seem to have little effect unless they are directed to an individual physician's or institution's experience and are accompanied by feedback or by face-to-face endorsement by respected others. The situation here seems to be similar. The consensus conference is an educational tool; unless it is coupled with follow-up programs that help translate the message into local or individual action and with monitoring to determine that appropriate change is occurring, its impact will be limited. Of course, changes in the dissemination, implementation, and monitoring of the consensus conference do not guarantee its effectiveness, but they could put in place the essential building blocks of a system that has a higher likelihood of affecting physician behavior.

This research was supported in part by contract N01-OD-2-2128 from the Office of Medical Applications of Research,
NIH, Bethesda, Md. We are grateful to Itzhak Jacoby, PhD, of the Office of Medical Applications of Research, NIH, for his cooperation and constructive advice, and to Mark Chassin, MD, and George Goldberg, MD, for their clinical insight.

References

1. Consensus Development Statement: Total Hip Joint Replacement. Stockholm, Swedish Planning and Rationalization Institute, 1982.
2. Consensus Report: Early Detection of Breast Cancer. Copenhagen, Danish Medical Research Council, 1983.
3. Stocking B, Jennett B: Consensus Development Conference: Coronary artery bypass surgery in Britain. Br Med J Clin Res 1984;288:1712.
4. Panel and Planning Committee of the National Consensus Conference on Aspects of Cesarean Birth: Indications for cesarean section: Final statement of the National Consensus Conference on Aspects of Cesarean Birth. Can Med Assoc J 1986;134:1348-1352.
5. Chassin MR, Kosecoff J, Park RE, et al: Indications for Selected Medical and Surgical Procedures: A Literature Review and Ratings of Appropriateness: Coronary Angiography, publication (Rand) R-3204/1-CWF/HF/HCFA/PMT/RWJ. Santa Monica, Calif, The Rand Corp, 1986.
6. Chassin MR, Park RE, Fink A, et al: Indications for Selected Medical and Surgical Procedures: A Literature Review and Ratings of Appropriateness: Coronary Artery Bypass Surgery, publication (Rand) R-3204/2-CWF/HF/HCFA/PMT/RWJ. Santa Monica, Calif, The Rand Corp, 1986.
7. Park RE, Fink A, Brook RH, et al: Physician ratings of appropriate indications for six medical and surgical procedures. Am J Public Health 1986;76:766-772.
8. Haggstrom GW: Logistic regression and discriminant analysis by ordinary least squares. J Bus Econ Stat 1983;1:229-238.
9. Kanouse DE, Brook RH, Winkler JD, et al: Changing Medical Practice Through Technology Assessment: An Evaluation of the NIH Consensus Development Program, publication (Rand) R-3462-NIH. Santa Monica, Calif, The Rand Corp, 1987.
10. Rosen MG: Premature concerns for cesarean sections? JAMA 1984;252:3296.
11. Shiono PH, Fielden JG, McNellis D, et al: Recent trends in cesarean birth and trials of labor in the United States. JAMA 1987;257:494-501.
12. Shiono PH, McNellis D, Rhoads GC: Reasons for the rising cesarean delivery rates: 1978-1984. Obstet Gynecol 1987;69:696-700.
13. Eisenberg JM: Physician utilization: The state of research about physicians' practice patterns. Med Care 1985;23:461-483.
14. Lloyd JS, Abramson S: Effectiveness of continuing medical education. Eval Health Prof 1979;2:251-280.
15. Stein LS: The effectiveness of continuing medical education: Eight research reports. J Med Educ 1981;56:103-110.
16. Pinkerton RE, Tinanoff N, Williams JL, et al: Resident physician performance in a continuing education format. JAMA 1980;244:2183-2185.
17. Avorn J, Soumerai SB: Improving drug-therapy decisions through educational outreach: A randomized controlled trial of academically-based 'detailing.' N Engl J Med 1983;308:1457-1463.

Clinton Presidential Records Digital Records Marker

This is not a presidential record. This is used as an administrative marker by the William J. Clinton Presidential Library Staff. This marker identifies the place of a tabbed divider. Given our digitization capabilities, we are sometimes unable to adequately scan such dividers. The title from the original document is indicated below.

Divider Title:

Tab J

Examples of Clinical Guidelines

1. "Preventing Pressure Ulcers in Adults: Prediction, and Prevention," AHCPR
2. "Depression in Primary Care: Detection, Diagnosis and Treatment," AHCPR

For Official Use Only

5/11/93

Title: "Preventing Pressure Ulcers in Adults: Prediction, and Prevention," and "Depression in Primary Care: Detection, Diagnosis and Treatment."

These are examples of clinical practice guidelines.
As part of its congressional mandate, the Agency for Health Care Policy and Research (AHCPR) facilitates the development of clinical practice guidelines by commissioning expert panels to address selected clinical conditions. The expert panels are multi-disciplinary and include consumers. The guidelines are based on a comprehensive review of the scientific literature, on valid evidence presented at open meetings, and on the professional judgments of panel members and other experts in the fields. Guidelines are developed in several formats: a long, technical version, called the Guideline Report; a shorter version, the clinical practice guideline; an abbreviated Quick Reference Guide; and a Patient's Guide (in English and in Spanish).

Implication for Health Care Reform: Research in the past two decades has identified major variations in the way physicians care for a specific health problem. Researchers believe that practice variations occur in part because there is no strong consensus among physicians about what works best and for whom. Evidence-based clinical practice guidelines can assist the clinical decisionmaking of practitioners and consumers.

Clinton Presidential Records Digital Records Marker

This is not a presidential record. This is used as an administrative marker by the William J. Clinton Presidential Library Staff. This marker identifies the place of a tabbed divider. Given our digitization capabilities, we are sometimes unable to adequately scan such dividers. The title from the original document is indicated below.

Divider Title:

Tab K

List of Consultants

This section includes a list of consultants invited to provide information to the Quality Work Group.

Consultant List

Aetna Health Plans
W. Allen Schaffer, M.D., F.A.C.P.
Vice President
Professional Affairs, MC14
151 Farmington Avenue
Hartford, Connecticut 06156

American Accreditation Program Inc.
Mr. Brant P. Kelch
President
2270 Cedar Cove Court
Reston, Virginia 22091

American Association of Preferred Provider Organizations
Mr. Douglas L. Elden
General Counsel
150 North Michigan Avenue
Suite 3000
Chicago, Illinois 60601-7567

American Association of Retired Persons
Ms. Mary Jo Gibson
1909 K Street, N.W.
Washington, D.C. 20039

American Hospital Association
Mr. Thomas A. Granatir
Senior Associate Director
Division of Health Policy
840 North Lake Shore Drive
Chicago, Illinois 60611

American Managed Care and Review Association
Mr. Charles Stellar
President
1227 25th Street, Suite 610
Washington, D.C. 20037-1156

American Medical Association
John T. Kelly, M.D., Ph.D.
Director
Office of Quality Assurance and Medical Review
515 North State Street
Chicago, Illinois 60610

Mr. and Mrs. Jerry Apodaca
1155 Connecticut Avenue, N.W.
Suite 500
Washington, D.C. 20036

Bay Area Business Group on Health
Ms. Pat Powers
Executive Director
90 Montgomery Street, Suite 410
San Francisco, California 94105

Bray, Dan M.D.
P.O. Box 596
Algona, Iowa 50511

Commission on Professional and Hospital Activities
William F. Jessee, M.D.
Chairman and Interim CEO
2929 Plymouth Road, Suite 208
P.O. Box 304
Ann Arbor, Michigan 48106-0304

Department of Health
Mark R. Chassin, M.D.
Commissioner
State of New York
Empire State Plaza
Albany, New York 12237

Department of Veterans Affairs
Ms. Betty Bishop
Secretary
806 15th Street, N.E.
Room 729
Washington, D.C. 20006

Department of Veterans Affairs
Mr. John Dietrich
806 15th Street, N.E.
Room 729
Washington, D.C. 20006

Department of Veterans Affairs (12B)
Shirley Meehan, M.B.A., Ph.D.
Deputy Director
Health Services Research and Development
810 Vermont Avenue, N.W.
Washington, D.C. 20420

Suzanne Eickhorn, Ph.D.
3041 Sedwick, N.W., Apt. #104
Washington, D.C. 20008

Families USA
Ms. Judith G. Waxman
Director
Government Affairs
1334 G Street, N.W.
Washington, D.C. 20005

FDA Office of Planning and Evaluation
Ms. Maureen Holohan
Program Analyst
Parklawn Building
5600 Fishers Lane, Room 1074
Rockville, Maryland 20857

Gerontological Society of America
Mr. Paul Kerschner
Executive Director
1275 K Street, N.W.
Suite 350
Washington, D.C. 20005-4006

Group Health Association of America
Ms. Judy Cahill
Vice President
Member Services and Operations
1129 20th Street, N.W.
Washington, D.C. 20036

Harvard Community Health Plan
John Ludden, M.D.
Medical Director
10 Brookline Place West
Brookline, Massachusetts 02146

Harvard School of Public Health
R. Heather Palmer, M.B., B.Ch., S.M.
Director
Center for Quality of Care Research and Education
677 Huntington Avenue
Boston, Massachusetts 02115

Health Action Council of Northeast Ohio
Mr. Pat Casey
Executive Director
P.O. Box 39008
Solon, Ohio 44139

Health Outcomes Institute
Mr. Michael Huber
2001 Killebrew Drive
Suite 122
Bloomington, Minnesota 55428

Health Policy Corporation of Iowa
Mr. Paul Pietzsch
President
Two Ruan Center, Suite 330
601 Locust Street
Des Moines, Iowa 50309

IAMETER
William C. Mohlenbrock, M.D.
Medical Director
901 Mariner's Island Boulevard
Suite 565
San Mateo, California 94404

Institute for Health Care Improvement
Don Berwick, M.D.
President and CEO
One Exeter Plaza
9th Floor
Boston, Massachusetts 02116

Intermountain Health Care
Brent C. James, M.D., M.Stat.
Assistant Vice President of Medical Research & Continuing Medical Education
36 South State, 22nd Floor
Salt Lake City, Utah 84111

Joint Commission on Accreditation of Healthcare Organizations
Dennis S. O'Leary, M.D.
President
One Renaissance Boulevard
Oakbrook, Illinois 60181

Kaiser Permanente
Don Nielsen, M.D.
Quality Consultant
Permanente Medical Groups Interregional Services
One Kaiser Plaza
Oakland, California 94612

Managed Health Care Association
Ms. Carol A. Cronin
Executive Director
1225 I Street, N.W.
Suite 300
Washington, D.C. 20005

Maryland Hospital Association
Vahe Kazandjian, Ph.D.
Director of Research
Heaver Plaza
1301 York Road
Lutherville, Maryland 21093-6087

Massachusetts General Hospital
David Blumenthal, M.D., M.P.P.
Chief, Health Policy Research and Development Unit
Medical Practice Evaluation Center
50 Staniford Street, 9th Floor
Boston, Massachusetts 02114

National Association of Protection & Advocacy Systems, Inc.
Mr. Curtis L. Decker
Executive Director
900 Second Street, N.E.
Suite 211
Washington, D.C. 20002

The National Citizens' Coalition for Nursing Home Reform
Ms. Elma Holder
Executive Director
1224 M Street, N.W.
Suite 301
Washington, D.C. 20005

National Committee to Preserve Social Security and Medicare
Bente Ewaldsen Cooney, M.S.W.
Senior Policy Analyst
2000 K Street, N.W.
Suite 800
Washington, D.C. 20006

National Committee for Quality Assurance
Janet Corrigan, Ph.D.
Vice President, Planning and Development
1350 New York Avenue, N.W.
Suite 700
Washington, D.C. 20005

National Committee for Quality Assurance
Ms. Margaret O'Kane
President
1350 New York Avenue, N.W.
Suite 700
Washington, D.C. 20005

National Senior Citizens Law Center
Mr. Alfred J. Chiplin, Jr.
Staff Attorney
Suite 700
1815 H Street, N.W.
Washington, D.C. 20006

New England Medical Center
Harris Allen, Jr., Ph.D.
The Health Institute
750 Washington Street, #345
Boston, Massachusetts 02111

New England Medical Center
John E. Ware, Jr., Ph.D.
Senior Scientist
750 Washington Street
NEMC #345
Boston, Massachusetts 02111

Office of Coordinated Care Policy and Planning
Melvin Silverman, D.D.S.
Division of Planning and Promotion
Health Care Financing Administration
330 Independence Avenue, S.W.
Cohen Building, Room 4355
Washington, D.C. 20201

Office of Disease Prevention and Health Promotion
Steven H. Woolf, M.D., M.P.H.
Switzer Building, Room 2132
330 C Street, S.W.
Washington, D.C. 20201

Office of the Inspector General, DHHS
K. Michael Nelson, M.D.
Office of Investigations
330 Independence Avenue, S.W.
Washington, D.C. 20201

Oregon Health Resources Commission
Mr. Dan Harris
Executive Director
Suite 640
800 N.E. Oregon Street #21
Portland, Oregon 97232

The Prudential Insurance Company of America
I. Steven Udvarhelyi, M.D., S.M.
Vice President, Medical Services
Health Care Operations & Research Division
Group Department
56 North Livingston Avenue
Roseland, New Jersey 07068

RAND Corporation
Robert H. Brook, M.D.
Director
RAND Health Services Program
P.O. Box 2138
Santa Monica, California 90407-2138

Spectrum Management, Inc.
Mr. William F. Benson
Vice President
1133 20th Street, N.W.
Suite 321
Washington, D.C. 20036

Thomas Jefferson University Hospital and Medical College
David B. Nash, M.D., M.B.A.
Director
Health Policy and Clinical Outcomes
1015 Walnut Street
Curtis Building, Room 621
Philadelphia, Pennsylvania 19107

United HealthCare Corporation
Ms. Sheila Leatherman
Vice President
9900 Bren Road East
P.O. Box 1459
Minneapolis, Minnesota 55440-8001

University of Pennsylvania
Mark V. Pauly, Ph.D.
Professor
Leonard Davis Institute of Health Economics
3641 Locust Walk
Colonial Penn Center
Philadelphia, Pennsylvania 19104

VA Office of Quality Management
Ms. Jackie McEwan
Program Manager
810 Vermont Avenue, N.W.
Washington, D.C. 20420

Vermont Employers Health Alliance
Ms. Jeanne Keller
Executive Director
104 Church Street
Burlington, Vermont 05401

Xerox Corporation
Ms. Patricia M. Nazemetz
Director, Benefits
P.O. Box 1600
Stamford, Connecticut 06904

Clinton Presidential Records Digital Records Marker

This is not a presidential record. This is used as an administrative marker by the William J. Clinton Presidential Library Staff. This marker identifies the place of a tabbed divider.
Given our digitization capabilities, we are sometimes unable to adequately scan such dividers. The title from the original document is indicated below.

Divider Title:

Tab L

Members of the Quality Work Group (Work Group 9)

Withdrawal/Redaction Marker
Clinton Library

DOCUMENT NO. AND TYPE: 001. list
SUBJECT/TITLE: Health Care Task Force Working Group 9 [partial] (2 pages)
DATE: n.d.
RESTRICTION: P6/b(6)

COLLECTION: Clinton Presidential Records
Health Care Task Force
OA/Box Number: OA/ID 1230
FOLDER TITLE: Quality Briefing Book [8]
2006-0810-F
ke217

RESTRICTION CODES

Presidential Records Act - [44 U.S.C. 2204(a)]
P1 National Security Classified Information [(a)(1) of the PRA]
P2 Relating to the appointment to Federal office [(a)(2) of the PRA]
P3 Release would violate a Federal statute [(a)(3) of the PRA]
P4 Release would disclose trade secrets or confidential commercial or financial information [(a)(4) of the PRA]
P5 Release would disclose confidential advice between the President and his advisors, or between such advisors [(a)(5) of the PRA]
P6 Release would constitute a clearly unwarranted invasion of personal privacy [(a)(6) of the PRA]
C. Closed in accordance with restrictions contained in donor's deed of gift.
PRM. Personal record misfile defined in accordance with 44 U.S.C. 2201(3).
RR. Document will be reviewed upon request.

Freedom of Information Act - [5 U.S.C. 552(b)]
b(1) National security classified information [(b)(1) of the FOIA]
b(2) Release would disclose internal personnel rules and practices of an agency [(b)(2) of the FOIA]
b(3) Release would violate a Federal statute [(b)(3) of the FOIA]
b(4) Release would disclose trade secrets or confidential or financial information [(b)(4) of the FOIA]
b(6) Release would constitute a clearly unwarranted invasion of personal privacy [(b)(6) of the FOIA]
b(7) Release would disclose information compiled for law enforcement purposes [(b)(7) of the FOIA]
b(8) Release would disclose information concerning the regulation of financial institutions [(b)(8) of the FOIA]
b(9) Release would disclose geological or geophysical information concerning wells [(b)(9) of the FOIA]

HEALTH TASK FORCE WORKING GROUP 9

Name; Agency; Phone #; FAX #

Galen Barbour, MD; ASCMD/VA; (W) (202) 535-7259; (202) 535-7541
Linda Demlo; AHCPR; (W) (301) 227-8453; (301) 227-8157
David Eddy, MD; (301) 718-2682
Arnold Epstein, MD, Co-Chair; RWJ/Harvard Med./Senate Labor; (H) (b)(6)
Barbara Gagel; HCFA; (W) (410) 966-6842, (H) (b)(6); (410) 966-6857
Sylvia Gaudette; Rep. Olver (MA); (W) (202) 225-5335; (202) 226-1224
David Jackson, MD; Steve Jencks, MD; (202) 456-7739; HCFA; (H); Henry Kraukauer, MD; Risa Lavizzo-Mourey, MD, Co-Chair; USUHS; (W) (410) 966-6508; (410) 966-6857; (410) 966-6730; (W) (301) 295-3831; AHCPR; (W) (301) 227-6662; (b)(6); (301) 295-3891; (301) 227-8168 (also fax)
Tim McKee; OASD (HA); (W) (703) 756-7896, (H) (b)(6); (703) 756-7887 (also fax)
Sandy Robinson; AHCPR; (W) (301) 227-8455, (H) (b)(6); (301) 227-8157
David Schulke; Rep. Wyden (OR); (W) (202) 225-?811, (H) (b)(6); (202) 225-8941
Nicole Simmons; HCFA; (W) (410) 966-6752, (H) (b)(6); (410) 966-6857
Paul Tibbits, MD; OASD (HA); (W) (703) 756-9081, (H) (b)(6); (703) 756-0985
Tim Ward; OASD (HA); (W) (703) 756-7856; (703) 756-7887
John W. Williamson, MD; VA; (W) (202) 376-6481, (H); (202) 376-6488

Dublin Core

The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see http://dublincore.org/documents/dces/.

Title: Health Care Reform
Identifier: 2006-0810-F
Description: This collection consists of records related to Hillary Rodham Clinton's Health Care Reform Files, 1993-1996. First Lady Hillary Rodham Clinton served as the Chair of the President's Task Force on National Health Care Reform.
The files contain reports, memoranda, correspondence, schedules, and news clippings. These materials discuss topics such as the proposed health care plan, the need for health care reform, benefits packages, Medicare, Medicaid, events in support of the Administration's plan, and other health care reform proposals. Furthermore, this material includes draft reports from the White House Health Care Interdepartmental Working Group, formed to advise the Health Care Task Force on the reform plan. This collection is divided into two separate segments. Click here for records from: <a href="http://clinton.presidentiallibraries.us/items/browse?advanced%5B0%5D%5Belement_id%5D=43&advanced%5B0%5D%5Btype%5D=is+exactly&advanced%5B0%5D%5Bterms%5D=2006-0810-F+Segment+1">Segment One</a> <a href="http://clinton.presidentiallibraries.us/items/browse?advanced%5B0%5D%5Belement_id%5D=43&advanced%5B0%5D%5Btype%5D=is+exactly&advanced%5B0%5D%5Bterms%5D=2006-0810-F+Segment+2">Segment Two</a>
Provenance: Clinton Presidential Records
Publisher: William J. Clinton Presidential Library & Museum
Text
Original Format: Paper

Dublin Core
Title: Quality Briefing Book [8]
Creator: Health Care Task Force General Files
Identifier: 2006-0810-F Segment 1
Is Part Of: Box 55 <a href="http://clinton.presidentiallibraries.us/items/show/36144" target="_blank">Collection Finding Aid</a> <a href="https://catalog.archives.gov/id/12090749" target="_blank">National Archives Catalog Description</a>
Provenance: Clinton Presidential Records: White House Staff and Office Files
Publisher: William J. Clinton Presidential Library & Museum
Format: Adobe Acrobat Document
Medium: Preservation-Reproduction-Reference
Date Created: 5/5/2015
Source: 42-t-2194630-20060810F-Seg1-055-008-2015
12090749