Taking Critical Appraisal to Extremes
In January 2000 an article in The Lancet drew attention when it questioned the supporting evidence for screening mammography.1 Danish investigators Peter Gøtzsche and Ole Olsen presented a series of apparent flaws in the 8 randomized trials of mammography, ultimately concluding that screening is unjustified. Their cogent arguments and the press coverage they received left many physicians wondering whether they should continue to order mammograms. The story led the CBS Evening News2 and was featured in the Washington Post,3 Time,4 and Reuters.5
A Patient-Oriented Evidence that Matters (POEM) review in the April 2000 issue of The Journal of Family Practice6 that addressed The Lancet study lent apparent support to these concerns. The POEM related the arguments in The Lancet article without challenging them and concluded that “mammography screening has never been shown to help women to live longer.” The authors of this POEM suggested that the only reasons for screening to continue are “politics, patients’ preconceptions, and the fear of litigation.” Unlike most POEMs, this one included no critical appraisal of the methods or assumptions of the reviewed study. This lack of comment, combined with the authors’ negative remarks about mammography, may have convinced family physicians that the criticisms of Gøtzsche and Olsen were beyond dispute.
However, controversy does surround their arguments, as the many letters to the editor published in The Lancet attest.7 For example, The Lancet critique made much of inconsistent sample sizes and baseline dissimilarities between screened and unscreened women. The authors asserted that such age and socioeconomic differences were “incompatible with adequate randomization.” That premise is contestable. It is normal and predictable that a proportion of population variables will differ between groups for statistical reasons, no matter how perfect the randomization. Also, the observed age difference (1 to 6 months) would not explain the 21% reduction in mortality observed in the trials.8
For Gøtzsche and Olsen the discrepant age patterns and sample sizes were less a cause of the results than a warning sign that randomization had been subverted (because of failure to conceal allocation). Since mortality in the screened and unscreened groups differed by only a relatively small number of deaths, they reasoned that very little bias would be necessary to tip the scales in favor of mammography.
Several arguments weaken their case, however. First, they offered no evidence that subversion or unconcealed allocation actually occurred. They equated inexplicit documentation of procedures (and dissimilar group characteristics) with improper randomization. Second, even if unconcealed allocation occurred, it does not in itself thwart randomization. Investigators who know to which group a patient will be assigned can still follow the rules and make the correct assignment. Anecdotal reports of subversion (by deciphering assignment sequences to divert or target patients for allocation) do not offer denominator data to assess how often this occurs.9 It would have had to occur in every trial that favored mammography to uphold the authors’ allegations. Third, even if the trials were subverted there is no indication that case mix differed enough to skew outcomes. Age differences were minor; the authors speculated that sizable imbalances in unmeasured factors could have altered results, but they gave no evidence. They cited reports that poorly concealed allocation is associated with a 37% to 41% exaggeration in odds ratios,10,11 but these reports concerned other trials and made arguable assumptions. Finally, their confirmatory finding—that only the 6 “flawed” trials reported a benefit for mammography and that the 2 acceptable trials showed no effect—was based on recalculated relative risk rates. The original trial data show no such pattern.8
This is not to suggest that weaknesses in the mammography trials do not merit scrutiny. Others have also voiced criticisms.12 But the alarm raised by Gøtzsche and Olsen goes further, compelling us to rethink the purpose of critical appraisal and the extremes at which it might cause more harm than good.
Excessive critical appraisal
We seek perfection in evidence to safeguard patients. Prematurely adopting (or abandoning) interventions through uncritical acceptance of findings risks overlooking potential harms or more effective alternatives. But critical appraisal can do harm if valid evidence is rejected. Deciding whether to accept evidence counterbalances the risks of acceptance against the risks of rejection, which are inversely related. At one extreme of the spectrum, where data are accepted on face value (no appraisal), the risk of a type I error (accepting evidence of efficacy when the intervention does not work or causes harm) is high, and that of a type II error (discarding evidence when the intervention actually works) is low. At the other extreme (excessive scrutiny) the risk of a type II error is great; such errors harm patients because knowledge is rejected that can save (or improve) lives. Obviously, patients are best served somewhere in the middle, striking an optimal balance between the risks of type I and type II errors.
Enthusiasts for critical appraisal sometimes forget this, assuming that more is better. Gøtzsche and Olsen, fearing that the medical community had committed a type I error (promoting an ineffective or harmful screening test), set a high standard for judging validity (eg, dismissing trials with any baseline differences). But when data from 8 trials demonstrate a large effect size (21% reduction in mortality) with narrow confidence intervals (13%-29%),8 the risk of a type II error overwhelms that of a type I error. The implications of the error are stark for a disease that claims 40,000 lives per year.13 Rejecting evidence under these conditions is more likely to cause death than accepting it.
The pivotal question in rejecting evidence should not be whether there is a design flaw but something more precise: the probability that the observed outcomes are due to factors other than the intervention under consideration (eg, chance, confounding). This can be just as likely in the absence of design flaws (eg, a perfectly conducted uncontrolled case series) as when a study is performed poorly, and it can be low even when studies are imperfect. It is a mistake to reject evidence reflexively because of a design flaw without studying these probabilities.
For example, what worried Gøtzsche and Olsen about flawed randomization was that factors other than mammography might account for lower mortality. But how likely was that (compared with the probability that early detection was efficacious)? Suppose a trial with imbalanced randomization carries a 50% probability of producing spurious results due to chance or confounding. The probability that the same phenomenon would occur in 8 trials (conducted independently in 4 countries in different decades, using different technologies, views, and screening intervals14) would be in the neighborhood of (0.50)^8, or 0.39%. The probability would be higher if one believed that subversion is systematic among researchers, but without data such speculation is an exercise in cynicism rather than science.
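The arithmetic behind this estimate can be sketched in a few lines of Python. This is only an illustration under the paragraph's own assumptions: the trials are treated as fully independent, and the 50% per-trial figure is a supposition rather than a measured value. The function name `prob_all_spurious` and the alternative per-trial probabilities (0.75 and 0.90) are hypothetical, added to show how quickly the joint probability rises if one assumes spurious results are more common.

```python
def prob_all_spurious(p_single: float, n_trials: int) -> float:
    """Joint probability that every one of n_trials independent trials
    produced a spurious result, given a per-trial probability p_single.

    Independence is an assumption; correlated bias (for example,
    systematic subversion) would make the true figure higher.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_trials < 0:
        raise ValueError("n_trials must be non-negative")
    return p_single ** n_trials


if __name__ == "__main__":
    # 0.50 is the value supposed in the text; 0.75 and 0.90 are
    # hypothetical, more pessimistic assumptions for comparison.
    for p in (0.50, 0.75, 0.90):
        print(f"per-trial {p:.2f} -> all 8 trials: {prob_all_spurious(p, 8):.2%}")
    # The first line printed is: per-trial 0.50 -> all 8 trials: 0.39%
```

The figure depends entirely on the independence assumption. If the same bias operated in every trial, the joint probability would approach the per-trial probability rather than its eighth power.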
Suppose that the criticisms of Gøtzsche and Olsen justify the termination of mammography. Are we applying the same standard for other screening and clinical practices, or do we move the goal posts? Physicians screen for prostate cancer without one well-designed controlled trial showing that it lowers mortality.15 We say this is not evidence-based,16 but what kind of study would change our minds? The conventional answer is that a randomized controlled trial showing a reduction in prostate cancer mortality is required,17 but The Lancet article finds that even 8 such trials are unconvincing. Indeed, flawless trials lose influence if the end points or setting lack generalizability.18 If the consequence of such high standards is that 30 years of trials involving a total of 482,000 women (and untold cost) cannot establish efficacy, the prospects for making the rest of medicine evidence-based are slim indeed.
The search for perfect data also has epistemologic flaws. No study can provide the absolute certainty that extremism in critical appraisal seeks. The willingness to reject studies based on improbable theories of confounding or research misbehavior may have less to do with good science than with discomfort with uncertainty, that is, unease with any possibility that the inferences of the investigators are wrong. The wait for better evidence is in vain, however, because science can only guess about reality. Even in a flawless trial, a significance threshold of P = .05 means that when an intervention truly has no effect, a claim of significance will still arise by chance about 5% of the time. Good studies have better odds in predicting reality, but they do not define it. It is legitimate to reject poorly designed studies because the probability of being wrong is too high. But to reject studies because there is any probability of being wrong is to wait futilely for a class of evidence that does not exist.
Inadequate critical appraisal
The POEM about The Lancet article highlights the other extreme of critical appraisal, accepting studies at face value. The review mentioned none of the limitations in The Lancet analysis, thus giving readers little reason to doubt the conclusion that mammography lacks scientific support and potentially convincing them to stop screening. Physicians should decide whether this is the right choice only after having heard all the issues. That this study reported a null effect and used meta-analysis does not lessen the need for critical appraisal. Like removing a counterweight from a scale, the omission of critical appraisal unduly elevates study findings (positive or negative), thus fomenting overreaction by not putting the information in context.
Several new resources, POEMs among them, have become available to alert physicians to important evidence. Some features (eg, the “Abstracts” section in the Journal of the American Medical Association) simply reprint abstracts. Others associated with the evidence-based medicine (EBM) movement offer critical appraisals. In family practice, these include POEMs and Evidence-Based Practice.19 In other specialties, they include the American College of Physicians’ ACP Journal Club and the EBM journals from the BMJ Publishing Group (eg, Evidence-Based Medicine, Evidence-Based Nursing). These efforts try to approach critical appraisal systematically. ACP Journal Club and the BMJ journals apply a quality filter (excluding studies failing certain criteria20) and append a commentary that mentions design limitations. POEMs go further, devoting a section to study design and validity and giving the authors explicit criteria for assessing quality.21
But a closer look reveals inconsistencies in how these criteria are applied. The heterogeneity is apparent in any random set of POEMs: Some authors list strengths and weaknesses by name with no elaboration, some give more details, some address only one criterion (eg, allocation concealment), while others state simply that the study was well designed. Some, like The Lancet review, say nothing about quality, reporting only the design and results. Eight (17%) of the 48 POEMs published in the first 6 months of 2000 included no critical appraisal (or a vague remark).
Similar omissions plague ACP Journal Club and the BMJ journals. Although the reviews describe concealment of allocation and blinding, and the commentary sections sometimes address design flaws at length, the degree to which this occurs, if at all, is variable. An ACP Journal Club review22 of the United Kingdom Prospective Diabetes Study, the landmark trial of intensive glycemic control, mentioned no concerns about its external validity (for contrast see the report by the American Academy of Family Physicians and American Diabetes Association23). Calling such synopses critical appraisals obfuscates the meaning of the term.
Some say that even brief remarks are critical appraisals. But good appraisals consistently and objectively rate studies using uniform criteria.24 POEMs and the EBM journals do not currently meet this standard; more systematic procedures are needed to ensure that narratives routinely discuss core elements of internal and external validity and that manuscripts lacking these elements are returned. The current conditions for preparing reviews make this difficult, however. With their modest budgets, journals rely on hundreds of volunteer contributors, each with a different writing style and level of expertise. Onerous procedures for critical appraisal might discourage participation. Because journal space is limited, narratives are kept short (700 words for POEMs,25 425 words for the BMJ journals and ACP Journal Club20) to review more studies. Each issue of JFP contains 8 POEMs, and Evidence-Based Medicine has 24 reviews. Longer appraisals would reduce the number and currency of reviews and would be less concise for busy physicians.
The disadvantage of short reviews and rapid turnover is a greater risk of inaccuracies and imbalance. Careful analysis of a study requires more time to do research and more space to explain the results than journals currently provide. Authors have only weeks to prepare manuscripts, which leaves little time to verify that descriptions are accurate and give proportionate emphasis to the issues that matter most. For many studies a few hundred words provide inadequate room to fully explain the design, results, and limitations. The risk of mistakes is heightened with less time, analysis, expert review, and space.
This tradeoff between quantity and quality raises the question of what is more important to readers and to patient care: the number of studies that physicians know about or the accuracy with which they are described. POEMs, which are designed to change practice,26 can do harm if physicians acting on inaccurate or incomplete information make choices that compromise outcomes. The Lancet review is a shot across the bow. The 40,000 annual deaths from breast cancer13 remind us that for certain topics a mistaken inference can cost thousands of lives. If inconsistencies in appraisal make this happen often enough, efforts to synopsize evidence can do more harm than good. Also, it is incongruent for programs espousing EBM, a discipline that discourages accepting evidence on face value, to report studies with little or no discussion of validity.
Setting policy in critical appraisals
It is also antithetical for EBM to support evidence-based practice guidelines27 and to publish clinical advice that is not derived from these methods. Critical appraisals that conclude by suggesting how physicians should modify patient care cross the line from science to policy. In the first half of 2000, 73% of the POEMs advised physicians (with varied explicitness) to use tests (6), drugs (14), or other treatments (5) and to withhold others (10). Such advice is common fare in the medical literature, but EBM subscribes to a higher standard. Because imprudent practice policies can do harm or compromise effectiveness, EBM holds that guidelines should be drafted with care using evidence-based methods.28 This entails reviewing not one study but all relevant evidence, with systematic grading of studies and explicit linkage between the recommendations and the quality of the data.27-30 This process typically involves an expert panel and months or years of deliberations.
In contrast, practice recommendations in EBM journals and POEMs reflect what individual authors think of a study. They lack the time, funding, and journal space for a systematic literature review. Thus, the authors and their readers cannot be sure that the conclusions reflect the evidence as a whole without undue influence from the reviewed study. Rules of evidence and grades for recommendations are rarely provided. Unlike guideline panels, authors seldom vet their recommendations with experts, societies, and agencies, which often uncover flawed inferences. The thinking process behind recommendations is necessarily telescoped. While the United States Preventive Services Task Force spent 2 years deciding whether pregnant women should be screened for bacterial vaginosis, a POEM31 produced its recommendations within weeks.
To some, such pronouncements are not guidelines, only the bottom line of a review. But in policy terms it matters little whether physicians prescribe a drug because of a guideline or because of the advice they read in ACP Journal Club or a POEM. The outcome for the patient is the same. Reviews need a bottom line, but summarizing the results of a study (eg, drug A worked better than drug B) differs from advising physicians what to do (eg, prescribe drug A). The latter is a statement of policy rather than science and should be based on broader considerations than one study.28
EBM faults guidelines that omit evidence-based methods, such as those issued by advocacy groups that reflect personal opinions and selective use of studies more than systematic reviews.32,33 Yet the recommendations in EBM journals and POEMs differ little in appearance: They provide little documentation of how conclusions were reached, feature select evidence (the study under review), rely on authors’ opinions, and provide few details on rationale. EBM journals should extract themselves from this inconsistency by sharpening the distinction between summarizing evidence and setting policy and eschewing the latter unless it emanates from evidence-based methods.
How this is handled in POEMs will reflect on family medicine. The prominence the specialty has given POEMs (promotion in JFP, family practice literature,34-35 the Internet,36-37 and newsletters19) signals the way family physicians think studies should be reviewed. It is important to get this right. If POEMs are meant to be critical appraisals and 17% contain no critique, calling them critical appraisals casts doubts on the specialty’s understanding of the term and perpetuates confusion about definitions. Conversely, by instituting greater scrutiny—defining the core criteria that must be discussed to qualify a study commentary as a critical appraisal and systematizing their use in POEMs—the specialty would set a new standard for EBM. If POEMs are not meant to be critical appraisals, it is important to clarify the distinction in terms, especially for family physicians who have grown accustomed to POEMs, know little about alternatives, and have come to believe that POEMs, critical appraisals, and EBM are essentially the same.
Conclusions
Advocates of EBM should be systematic in their application of critical appraisal. Critical appraisals do not deserve the name if they accept studies on face value. The criteria for determining which studies are rated good or bad should be explicit and consistent. But the scrutiny of evidence should not be taken to extremes, to the point that studies are rejected for being imperfect when there is little likelihood that the findings are wrong. By making the perfect the enemy of the good, excesses in critical appraisal do injustice to the goal of helping patients and imply the existence of a level of certainty that science cannot provide.
References
1. Gøtzsche PC, Olsen O. Is screening for breast cancer with mammography justifiable? Lancet 2000;355:129-33.
2. Health watch: mammography controversy. CBS Evening News January 7, 2000. Vanderbilt University Television News Archive, available at tvnews.vanderbilt.edu.
3. Mammography assessed. Washington Post, January 7, 2000, A14.
4. Reaves J. Here’s why your oncologist is angry. Time January 13, 2000. Available at www.time.com/time/daily/0,2960,37449,00.html.
5. Mammography screening deemed unjustifiable. Reuters Medical News January 7, 2000. Available at www.medscape.com/reuters/prof/test/2000/o1/01.07/pbo1070c.html.
6. Wilkerson BF, Schooff M. Screening mammography may not be effective at any age. J Fam Pract 2000;49:302-371.
7. Screening mammography re-evaluated. Lancet 2000;355:747-52.
8. Kerlikowske K, Grady D, Rubin SM, Sandrock C, Ernster VL. Efficacy of screening mammography: a meta-analysis. JAMA 1995;273:149-54.
9. Schulz KF. Subverting randomization in controlled trials. JAMA 1995;274:1456-58.
10. Moher D, Pham B, Jones A, et al. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet 1998;352:609-13.
11. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273:408-12.
12. Berry DA. Benefits and risks of screening mammography for women in their forties: a statistical appraisal. J Natl Cancer Inst 1998;90:1431-39.
13. American Cancer Society. Cancer facts & figures 2000. Atlanta, Ga: American Cancer Society; 2000.
14. Fletcher SW, Black W, Harris R, Rimer BK, Shapiro S. Report of the International Workshop on Screening for Breast Cancer. J Natl Cancer Inst 1993;85:1644-56.
15. Collins MM, Stafford RS, Barry MJ. Age-specific patterns of prostate-specific antigen testing among primary care physician visits. J Fam Pract 2000;49:169-72.
16. Lefevre ML. Prostate cancer screening: more harm than good? Am Fam Phys 1998;58:432-38.
17. Woolf SH, Rothemich SF. Screening for prostate cancer: the role of science, policy, and opinion in determining what is best for patients. Ann Rev Med 1999;50:207-21.
18. Bucher HC, Guyatt GH, Cook DJ, Holbrook A, McAlister FA. Users’ guides to the medical literature: XIX. Applying clinical trial results. A. How to use an article measuring the effect of an intervention on surrogate end points. Evidence-Based Medicine Working Group. JAMA 1999;282:771-78.
19. Advertisement. Evidence-Based Practice. Montvale, NJ: Quadrant HealthCom Inc.; 2000.
20. Purpose and procedure. Evidence-Based Medicine. Available at www.bmjpg.com/data/ebmpp.htm.
21. Assessing validity and relevance. Available at www.infopoems.com/EBP_Validity.htm.
22. Gerstein HC. Commentary on “Intensive blood glucose control reduced type 2 diabetes mellitus-related end points.” ACP J Club 1999; 2-3. Comment on: UK Prospective Diabetes Study Group. Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications. Lancet 1998;352:837-53.
23. Woolf SH, Davidson MB, Greenfield S, et al. Controlling blood glucose levels in patients with type 2 diabetes mellitus: an evidence-based policy statement by the American Academy of Family Physicians and American Diabetes Association. J Fam Pract 2000;49:453-60.
24. Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: synthesis of best evidence for clinical decisions. Ann Intern Med 1997;126:376-80.
25. Instructions for POEMs authors. Available at www.infopoems.com/authors.htm.
26. Slawson DC, Shaughnessy AF. Becoming an information master: using POEMs to change practice with confidence. J Fam Pract 2000;49:63-67.
27. Hayward RS, Wilson MC, Tunis SR, Bass EB, Guyatt G. Users’ guides to the medical literature. VII. How to use clinical practice guidelines. A. Are the recommendations valid? The Evidence-Based Medicine Working Group. JAMA 1995;274:570-74.
28. Woolf SH, George JN. Evidence-based medicine: interpreting studies and setting policy. Hem Oncol Clin N Amer 2000;14:761-84.
29. Shekelle PG, Woolf SH, Eccles M, Grimshaw J. Developing guidelines. BMJ 1999;318:593-96.
30. Woolf SH. Practice guidelines: what the family physician should know. Am Fam Phys 1995;51:1455-63.
31. Lazar PA. Does oral metronidazole prevent preterm delivery in normal-risk pregnant women with asymptomatic bacterial vaginosis (BV)? J Fam Pract 2000;49:495-96.
32. Cook D, Giacomini M. The trials and tribulations of clinical practice guidelines. JAMA 1999;281:1950-51.
33. Grilli R. Practice guidelines developed by specialty societies: the need for a critical appraisal. Lancet 2000;355:103-06.
34. Geyman JP. POEMs as a paradigm shift in teaching, learning, and clinical practice. J Fam Pract 1999;48:343-44.
35. Dickinson WP, Stange KC, Ebell MH, Ewigman BG, Green LA. Involving all family physicians and family medicine faculty members in the use and generation of new knowledge. Fam Med 2000;32:480-90.
36. JFP online. Available at www.jfponline.com.
37. POEMs for primary care. Available at www.infopoems.com.
In January 2000 an article in The Lancet drew attention when it questioned the supporting evidence for screening mammography.1 Danish investigators Peter Gøtzsche and Ole Olsen presented a series of apparent flaws in the 8 randomized trials of mammography, ultimately concluding that screening is unjustified. Their cogent arguments and the press coverage they received left many physicians wondering whether they should continue to order mammograms. The story led the CBS Evening News2 and was featured in the Washington Post,3 Time,4 and Reuters.5
A Patient-Oriented Evidence that Matters (POEM) review in the April 2000 issue of The Journal of Family Practice6 that addressed The Lancet study lent apparent support to these concerns. The POEM related the arguments in The Lancet article without challenging them and concluded that “mammography screening has never been shown to help women to live longer.” The authors of this POEM suggested that the only reasons for screening to continue are “politics, patients’ preconceptions, and the fear of litigation.” Unlike most POEMs, this one included no critical appraisal of the methods or assumptions of the reviewed study. This lack of comment, combined with the authors’ negative remarks about mammography, may have convinced family physicians that the criticisms of Gøtzsche and Olsen were beyond dispute.
However, controversy does surround their arguments, as the many letters to the editor published in The Lancet attest.7 For example, The Lancet critique made much of inconsistent sample sizes and baseline dissimilarities between screened and unscreened women. The authors asserted that such age and socioeconomic differences were “incompatible with adequate randomization.” That premise is contestable. It is normal and predictable that a proportion of population variables will differ between groups for statistical reasons, no matter how perfect the randomization. Also, the observed age difference (1 to 6 months) would not explain the 21% reduction in mortality observed in the trials.8
For Gøtzsche and Olsen the discrepant age patterns and sample sizes were less a cause of the results than a warning sign that randomization had been subverted (because of failure to conceal allocation). Since mortality in the screened and unscreened groups differed by only a relatively small number of deaths, they reasoned that very little bias would be necessary to tip the scales in favor of mammography.
Several arguments weaken their case, however. First, they offered no evidence that subversion or unconcealed allocation actually occurred. They equated inexplicit documentation of procedures (and dissimilar group characteristics) with improper randomization. Second, even if unconcealed allocation occurred, it does not in itself thwart randomization. Investigators who know to which group a patient will be assigned can still follow the rules and make the correct assignment. Anecdotal reports of subversion (by deciphering assignment sequences to divert or target patients for allocation) do not offer denominator data to assess how often this occurs.9 It would have had to occur in every trial that favored mammography to uphold the authors’ allegations. Third, even if the trials were subverted there is no indication that case mix differed enough to skew outcomes. Age differences were minor; the authors speculated that sizable imbalances in unmeasured factors could have altered results, but they gave no evidence. They cited reports that poorly concealed allocation is associated with a 37% to 41% exaggeration in odds ratios,10,11 but these reports concerned other trials and made arguable assumptions. Finally, their confirmatory finding—that only the 6 “flawed” trials reported a benefit for mammography and that the 2 acceptable trials showed no effect—was based on recalculated relative risk rates. The original trial data show no such pattern.8
This is not to suggest that weaknesses in the mammography trials do not merit scrutiny. Others have also voiced criticisms.12 But the alarm raised by Gøtzsche and Olsen goes further, compelling us to rethink the purpose of critical appraisal and the extremes at which it might cause more harm than good.
Excessive critical appraisal
We seek perfection in evidence to safeguard patients. Prematurely adopting (or abandoning) interventions through uncritical acceptance of findings risks overlooking potential harms or more effective alternatives. But critical appraisal can do harm if valid evidence is rejected. Deciding whether to accept evidence counterbalances the risks of acceptance against the risks of rejection, which are inversely related. At one extreme of the spectrum, where data are accepted on face value (no appraisal), the risk of a type I error (accepting evidence of efficacy when the intervention does not work or causes harm) is high, and that of a type II error (discarding evidence when the intervention actually works) is low. At the other extreme (excessive scrutiny) the risk of a type II error is great; such errors harm patients because knowledge is rejected that can save (or improve) lives. Obviously, patients are best served somewhere in the middle, striking an optimal balance between the risks of type I and type II errors.
Enthusiasts for critical appraisal sometimes forget this, assuming that more is better. Gøtzsche and Olsen, fearing that the medical community had committed a type I error (promoting an ineffective or harmful screening test), set a high standard for judging validity (eg, dismissing trials with any baseline differences). But when data from 8 trials demonstrate a large effect size (21% reduction in mortality) with narrow confidence intervals (13%-29%),8 the risk of a type II error overwhelms that of a type I error. The implications of the error are stark for a disease that claims 40,000 lives per year.13 Rejecting evidence under these conditions is more likely to cause death than accepting it.
The pivotal question in rejecting evidence should not be whether there is a design flaw but something more precise: the probability that the observed outcomes are due to factors other than the intervention under consideration (eg, chance, confounding). This can be just as likely in the absence of design flaws (eg, a perfectly conducted uncontrolled case series) as when a study is performed poorly, and it can be low even when studies are imperfect. It is a mistake to reject evidence reflexively because of a design flaw without studying these probabilities.
For example, what worried Gøtzsche and Olsen about flawed randomization was that factors other than mammography might account for lower mortality. But how likely was that (compared with the probability that early detection was efficacious)? Suppose a trial with imbalanced randomization carries a 50% probability of producing spurious results due to chance or confounding. The probability that the same phenomenon would occur in 8 trials (conducted independently in 4 countries in different decades, using different technologies, views, and screening intervals14) would be in the neighborhood of (0.50)8 or 0.39%. The probability would be higher if one believed that subversion is systematic among researchers, but without data such speculation is an exercise in cynicism rather than science.
Suppose that the criticisms of Gøtzsche and Olsen justify the termination of mammography. Are we applying the same standard for other screening and clinical practices, or do we move the goal posts? Physicians screen for prostate cancer without one well-designed controlled trial showing that it lowers mortality.15 We say this is not evidence-based,16 but what kind of study would change our minds? The conventional answer is a randomized controlled trial showing a reduction in prostate cancer mortality is required,17 but The Lancet article finds that even 8 such trials are unconvincing. Indeed, flawless trials lose influence if the end points or setting lack generalizability.18 If the consequence of such high standards is that 30 years of trials involving a total of 482,000 women (and untold cost) cannot establish efficacy, the prospects for making the rest of medicine evidence-based are slim indeed.
The search for perfect data also has epistemologic flaws. No study can provide the absolute certainty that extremism in critical appraisal seeks. The willingness to reject studies based on improbable theories of confounding or research misbehavior may have less to do with good science than with discomfort with uncertainty, that is, unease with any possibility that the inferences of the investigators are wrong. The wait for better evidence is in vain, however, because science can only guess about reality. Even in a flawless trial, a significance threshold of P = .05 means that when no true effect exists, a claim of significance will still be made 5% of the time. Good studies have better odds in predicting reality, but they do not define it. It is legitimate to reject poorly designed studies because the probability of being wrong is too high. But to reject studies because there is any probability of being wrong is to wait futilely for a class of evidence that does not exist.
Inadequate critical appraisal
The POEM about The Lancet article highlights the other extreme of critical appraisal: accepting studies at face value. The review mentioned none of the limitations in The Lancet analysis, thus giving readers little reason to doubt the conclusion that mammography lacks scientific support and potentially convincing them to stop screening. Physicians should decide whether this is the right choice only after having heard all the issues. That this study reported a null effect and used meta-analysis does not lessen the need for critical appraisal. Like removing a counterweight from a scale, the omission of critical appraisal unduly elevates study findings (positive or negative), thus fomenting overreaction by not putting the information in context.
Several new resources, POEMs among them, have become available to alert physicians to important evidence. Some features (eg, the “Abstracts” section in the Journal of the American Medical Association) simply reprint abstracts. Others associated with the evidence-based medicine (EBM) movement offer critical appraisals. In family practice, these include POEMs and Evidence-Based Practice.19 In other specialties, they include the American College of Physicians’ ACP Journal Club and the EBM journals from the BMJ Publishing Group (eg, Evidence-Based Medicine, Evidence-Based Nursing). These efforts try to approach critical appraisal systematically. ACP Journal Club and the BMJ journals apply a quality filter (excluding studies failing certain criteria20) and append a commentary that mentions design limitations. POEMs go further, devoting a section to study design and validity and giving the authors explicit criteria for assessing quality.21
But a closer look reveals inconsistencies in how these criteria are applied. The heterogeneity is apparent in any random set of POEMs: Some authors list strengths and weaknesses by name with no elaboration, some give more details, some address only one criterion (eg, allocation concealment), while others state simply that the study was well designed. Some, like the POEM on The Lancet article, say nothing about quality, reporting only the design and results. Eight (17%) of the 48 POEMs published in the first 6 months of 2000 included no critical appraisal (or only a vague remark).
Similar omissions plague ACP Journal Club and the BMJ journals. Although the reviews describe concealment of allocation and blinding, and the commentary sections sometimes address design flaws at length, the degree to which this occurs, if at all, is variable. An ACP Journal Club review22 of the United Kingdom Prospective Diabetes Study, the landmark trial of intensive glycemic control, mentioned no concerns about its external validity (for contrast see the report by the American Academy of Family Physicians and American Diabetes Association23). Calling such synopses critical appraisals obfuscates the meaning of the term.
Some say that even brief remarks are critical appraisals. But good appraisals consistently and objectively rate studies using uniform criteria.24 POEMs and the EBM journals do not currently meet this standard; more systematic procedures are needed to ensure that narratives routinely discuss core elements of internal and external validity and that manuscripts lacking these elements are returned. The current conditions for preparing reviews make this difficult, however. With their modest budgets, journals rely on hundreds of volunteer contributors, each with a different writing style and level of expertise. Onerous procedures for critical appraisal might discourage participation. Because journal space is limited, narratives are kept short (700 words for POEMs,25 425 words for the BMJ journals and ACP Journal Club20) to review more studies. Each issue of JFP contains 8 POEMs, and Evidence-Based Medicine has 24 reviews. Longer appraisals would reduce the number and currency of reviews and would be less concise for busy physicians.
The disadvantage of short reviews and rapid turnover is a greater risk of inaccuracies and imbalance. Careful analysis of a study requires more time to do research and more space to explain the results than journals currently provide. Authors have only weeks to prepare manuscripts, which leaves little time to verify that descriptions are accurate and give proportionate emphasis to the issues that matter most. For many studies a few hundred words provide inadequate room to fully explain the design, results, and limitations. The risk of mistakes is heightened with less time, analysis, expert review, and space.
This tradeoff between quantity and quality raises the question of what is more important to readers and to patient care: the number of studies that physicians know about or the accuracy with which they are described. POEMs, which are designed to change practice,26 can do harm if physicians acting on inaccurate or incomplete information make choices that compromise outcomes. The Lancet review is a shot across the bow. The 40,000 annual deaths from breast cancer13 remind us that for certain topics a mistaken inference can cost thousands of lives. If inconsistencies in appraisal make this happen often enough, efforts to synopsize evidence can do more harm than good. Also, it is incongruent for programs espousing EBM, a discipline that discourages accepting evidence on face value, to report studies with little or no discussion of validity.
Setting policy in critical appraisals
It is also inconsistent for EBM to support evidence-based practice guidelines27 and yet to publish clinical advice that is not derived from these methods. Critical appraisals that conclude by suggesting how physicians should modify patient care cross the line from science to policy. In the first half of 2000, 73% of the POEMs advised physicians (with varied explicitness) to use tests (6), drugs (14), or other treatments (5) and to withhold others (10). Such advice is common fare in the medical literature, but EBM subscribes to a higher standard. Because imprudent practice policies can do harm or compromise effectiveness, EBM holds that guidelines should be drafted with care using evidence-based methods.28 This entails reviewing not one study but all relevant evidence, with systematic grading of studies and explicit linkage between the recommendations and the quality of the data.27-30 This process typically involves an expert panel and months or years of deliberations.
In contrast, practice recommendations in EBM journals and POEMs reflect what individual authors think of a study. They lack the time, funding, and journal space for a systematic literature review. Thus, the authors and their readers cannot be sure that the conclusions reflect the evidence as a whole without undue influence from the reviewed study. Rules of evidence and grades for recommendations are rarely provided. Unlike guideline panels, authors seldom vet their recommendations with experts, societies, and agencies, which often uncover flawed inferences. The thinking process behind recommendations is necessarily telescoped. While the United States Preventive Services Task Force spent 2 years deciding whether pregnant women should be screened for bacterial vaginosis, a POEM31 produced its recommendations within weeks.
To some, such pronouncements are not guidelines, only the bottom line of a review. But in policy terms it matters little whether physicians prescribe a drug because of a guideline or because of the advice they read in ACP Journal Club or a POEM. The outcome for the patient is the same. Reviews need a bottom line, but summarizing the results of a study (eg, drug A worked better than drug B) differs from advising physicians what to do (eg, prescribe drug A). The latter is a statement of policy rather than science and should be based on broader considerations than one study.28
EBM faults guidelines that omit evidence-based methods, such as those issued by advocacy groups that reflect personal opinions and selective use of studies more than systematic reviews.32,33 Yet the recommendations in EBM journals and POEMs differ little in appearance: They provide little documentation of how conclusions were reached, feature select evidence (the study under review), rely on authors’ opinions, and provide few details on rationale. EBM journals should extract themselves from this inconsistency by sharpening the distinction between summarizing evidence and setting policy and eschewing the latter unless it emanates from evidence-based methods.
How this is handled in POEMs will reflect on family medicine. The prominence the specialty has given POEMs (promotion in JFP, family practice literature,34-35 the Internet,36-37 and newsletters19) signals the way family physicians think studies should be reviewed. It is important to get this right. If POEMs are meant to be critical appraisals and 17% contain no critique, calling them critical appraisals casts doubt on the specialty’s understanding of the term and perpetuates confusion about definitions. Conversely, by instituting greater scrutiny (defining the core criteria that must be discussed to qualify a study commentary as a critical appraisal and systematizing their use in POEMs) the specialty would set a new standard for EBM. If POEMs are not meant to be critical appraisals, it is important to clarify the distinction in terms, especially for family physicians who have grown accustomed to POEMs, know little about alternatives, and have come to believe that POEMs, critical appraisals, and EBM are essentially the same.
Conclusions
Advocates of EBM should be systematic in their application of critical appraisal. Critical appraisals do not deserve the name if they accept studies on face value. The criteria for determining which studies are rated good or bad should be explicit and consistent. But the scrutiny of evidence should not be taken to extremes, to the point that studies are rejected for being imperfect when there is little likelihood that the findings are wrong. By making the perfect the enemy of the good, excesses in critical appraisal do injustice to the goal of helping patients and imply the existence of a level of certainty that science cannot provide.
1. Gøtzsche PC, Olsen O. Is screening for breast cancer with mammography justifiable? Lancet 2000;355:129-33.
2. Health watch: mammography controversy. CBS Evening News January 7, 2000. Vanderbilt University Television News Archive, available at tvnews.vanderbilt.edu.
3. Mammography assessed. Washington Post, January 7, 2000, A14.
4. Reaves J. Here’s why your oncologist is angry. Time January 13, 2000. Available at www.time.com/time/daily/0,2960,37449,00.html.
5. Mammography screening deemed unjustifiable. Reuters Medical News January 7, 2000. Available at www.medscape.com/reuters/prof/test/2000/o1/01.07/pbo1070c.html.
6. Wilkerson BF, Schooff M. Screening mammography may not be effective at any age. J Fam Pract 2000;49:302-371.
7. Screening mammography re-evaluated. Lancet 2000;355:747-52.
8. Kerlikowske K, Grady D, Rubin SM, Sandrock C, Ernster VL. Efficacy of screening mammography: a meta-analysis. JAMA 1995;273:149-54.
9. Schulz KF. Subverting randomization in controlled trials. JAMA 1995;274:1456-58.
10. Moher D, Pham B, Jones A, et al. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet 1998;352:609-13.
11. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273:408-12.
12. Berry DA. Benefits and risks of screening mammography for women in their forties: a statistical appraisal. J Natl Cancer Inst 1998;90:1431-39.
13. American Cancer Society. Cancer facts & figures 2000. Atlanta, Ga: American Cancer Society; 2000.
14. Fletcher SW, Black W, Harris R, Rimer BK, Shapiro S. Report of the International Workshop on Screening for Breast Cancer. J Natl Cancer Inst 1993;85:1644-56.
15. Collins MM, Stafford RS, Barry MJ. Age-specific patterns of prostate-specific antigen testing among primary care physician visits. J Fam Pract 2000;49:169-72.
16. Lefevre ML. Prostate cancer screening: more harm than good? Am Fam Phys 1998;58:432-38.
17. Woolf SH, Rothemich SF. Screening for prostate cancer: the role of science, policy, and opinion in determining what is best for patients. Ann Rev Med 1999;50:207-21.
18. Bucher HC, Guyatt GH, Cook DJ, Holbrook A, McAlister FA. Users’ guides to the medical literature: XIX. Applying clinical trial results. A. How to use an article measuring the effect of an intervention on surrogate end points. Evidence-Based Medicine Working Group. JAMA 1999;282:771-78.
19. Advertisement. Evidence-Based Practice. Montvale, NJ: Quadrant HealthCom Inc.; 2000.
20. Purpose and procedure. Evidence-Based Medicine. Available at www.bmjpg.com/data/ebmpp.htm.
21. Assessing validity and relevance. Available at www.infopoems.com/EBP_Validity.htm.
22. Gerstein HC. Commentary on “Intensive blood glucose control reduced type 2 diabetes mellitus-related end points.” ACP J Club 1999; 2-3. Comment on: UK Prospective Diabetes Study Group. Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications. Lancet 1998;352:837-53.
23. Woolf SH, Davidson MB, Greenfield S, et al. Controlling blood glucose levels in patients with type 2 diabetes mellitus: an evidence-based policy statement by the American Academy of Family Physicians and American Diabetes Association. J Fam Pract 2000;49:453-60.
24. Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: synthesis of best evidence for clinical decisions. Ann Intern Med 1997;126:376-80.
25. Instructions for POEMs authors. Available at www.infopoems.com/authors.htm.
26. Slawson DC, Shaughnessy AF. Becoming an information master: using POEMs to change practice with confidence. J Fam Pract 2000;49:63-67.
27. Hayward RS, Wilson MC, Tunis SR, Bass EB, Guyatt G. Users’ guides to the medical literature. VII. How to use clinical practice guidelines. A. Are the recommendations valid? The Evidence-Based Medicine Working Group. JAMA 1995;274:570-74.
28. Woolf SH, George JN. Evidence-based medicine: interpreting studies and setting policy. Hem Oncol Clin N Amer 2000;14:761-84.
29. Shekelle PG, Woolf SH, Eccles M, Grimshaw J. Developing guidelines. BMJ 1999;318:593-96.
30. Woolf SH. Practice guidelines: what the family physician should know. Am Fam Phys 1995;51:1455-63.
31. Lazar PA. Does oral metronidazole prevent preterm delivery in normal-risk pregnant women with asymptomatic bacterial vaginosis (BV)? J Fam Pract 2000;49:495-96.
32. Cook D, Giacomini M. The trials and tribulations of clinical practice guidelines. JAMA 1999;281:1950-51.
33. Grilli R. Practice guidelines developed by specialty societies: the need for a critical appraisal. Lancet 2000;355:103-06.
34. Geyman JP. POEMs as a paradigm shift in teaching, learning, and clinical practice. J Fam Pract 1999;48:343-44.
35. Dickinson WP, Stange KC, Ebell MH, Ewigman BG, Green LA. Involving all family physicians and family medicine faculty members in the use and generation of new knowledge. Fam Med 2000;32:480-90.
38. JFP online. Available at www.jfponline.com.
39. POEMs for primary care. Available at www.infopoems.com.
Controlling Blood Glucose Levels in Patients with Type 2 Diabetes Mellitus: An Evidence-Based Policy Statement by the American Academy of Family Physicians and American Diabetes Association
PARTICIPANTS: A 9-member panel composed of family physicians, general internists, endocrinologists, and a practice guidelines methodologist was assembled by the American Academy of Family Physicians, the American Diabetes Association, and the American College of Physicians.
EVIDENCE: Admissible evidence included published randomized controlled trials and observational studies regarding the effects of glycemic control on microvascular and macrovascular complications and on adverse effects. We followed systematic search and data abstraction procedures. Greater weight was given to clinical trials and to evidence about health outcomes.
CONSENSUS PROCESS: Interpretations of evidence and approval of documents were finalized by unanimous vote, with recommendations linked to evidence and not expert opinion. The full report was prepared by the chair and 2 panel members, representing each of the 3 organizations. The initial draft underwent external review by 14 diabetologists and family physicians and changes consistent with the evidence were incorporated.
CONCLUSIONS: The evidence demonstrates that the risk of microvascular and neuropathic complications is reduced by lowering glucose concentrations. Whether glycemic control affects macrovascular outcomes is less clear. The potential benefits of glycemic control must be balanced against factors that either preempt benefits (eg, limited life expectancy, comorbid disease) or increase risk (eg, severe hypoglycemia). The magnitude of benefit is a function of individual clinical variables (eg, baseline glycated hemoglobin level, presence of preexisting microvascular disease). Appropriate targets for treatment should be determined by considering these factors, patients’ risk profiles, and personal preferences.
What are the benefits and risks of glycemic control in type 2 diabetes, and what are the implications for clinical practice?
An estimated 16 million people in the United States have diabetes.1 Its microvascular complications (retinopathy, nephropathy, neuropathy) are a leading cause of blindness among adults,2 end-stage renal disease,3 and lower-extremity amputations.4 Its macrovascular complications pose an even greater public health burden, increasing the risk of coronary artery disease, stroke, and peripheral vascular disease.5 Each year diabetes costs the country an estimated $90 billion to $99 billion.6,7
Type 2 diabetes, which accounts for 90% to 95% of diabetes cases,8 differs from type 1 disease in average age of onset and etiology. In both forms, however, the underlying cause of microvascular (and possibly macrovascular) complications appears to be chronic elevations in blood glucose concentrations. The overriding factors in predicting microvascular pathogenesis have less to do with the type of diabetes than with the number of years the patient has had hyperglycemia and the magnitude of the glucose elevation.
In recent years, 2 major randomized controlled trials (RCTs), the Diabetes Control and Complications Trial9 (DCCT) in patients with type 1 disease and the United Kingdom Prospective Diabetes Study10 (UKPDS) in patients with type 2 diabetes, have shown that microvascular complications can be reduced significantly in patients who achieve normal or near-normal blood glucose levels. There is now agreement in the medical community about the importance of lowering markedly elevated blood glucose levels.
The incremental benefit of tight glycemic control (as opposed to less intensive therapy) varies across patient groups, however. Years are required for microvascular complications to progress to symptomatic disease. Patients with type 1 diabetes, who are generally younger, are more likely to live long enough to benefit from tight glycemic control than patients with type 2 disease, who face a shorter life expectancy because of their age and risk of cardiovascular disease. For patients with coexistent diseases, the delayed benefits of glycemic control may be offset by the more immediate inconvenience, complications, and costs of intensive treatment and by the health effects of comorbid conditions.
These generalizations do not apply to all patients. The older age of onset of type 2 diabetes is a population average with a wide distribution. Many patients with type 2 diabetes live long enough to experience significant microvascular disease and may benefit from glycemic control. Also, interventions that extend life expectancy (eg, smoking cessation, blood pressure control, and lipid control) give patients more time to encounter advanced microvascular disease and an opportunity to benefit from treatment.
In 1996, the American Academy of Family Physicians convened a panel to conduct a systematic review of the evidence of the benefits and harms of glycemic control in type 2 diabetes and to develop evidence-based recommendations. The 9-member panel included family physicians, general internists, endocrinologists, and a practice guidelines methodologist. Four members were appointed by the American Diabetes Association and the American College of Physicians. We summarize the panel’s findings, which are available in a full report.*
Methods
Systematic Review
The review methods are provided in detail in the full report. The literature search retrieved published evidence on the effects of glycemic control on microvascular and macrovascular complications in type 1 and type 2 diabetes and on adverse effects. RCT evidence for type 1 diabetes was considered relevant in evaluating the effects of glycemic control in type 2 disease. A total of 798 citations met initial inclusion criteria. All articles underwent structured abstraction. We closed the search with the publication of the UKPDS results in September 1998.
In reviewing the evidence, the panel gave greater weight to RCTs than to observational studies and emphasized data on health outcomes perceptible to patients (eg, visual acuity) over those for intermediate or surrogate end points (eg, retinopathy) that precede or are associated with such outcomes. Most trials did not designate health outcomes as primary end points and therefore lacked the statistical power and duration to prove an effect. Panel recommendations were evidence-based and did not reflect expert opinion. Fourteen outside diabetologists and family physicians externally reviewed the full report, and revisions consistent with the evidence were adopted. The American Academy of Family Physicians and the American Diabetes Association endorsed the full report in March 1999 and this policy statement in October 1999.
We focused the review on the benefits of glycemic control in general and not on specific agents (eg, sulfonylureas, metformin). Interventions not associated with glycemic control (eg, laser phototherapy, angiotensin-converting enzyme inhibitors), which also mitigate the effects of microvascular disease, were not examined in our review.
Results
Microvascular Outcomes
Evidence from Observational Studies. Many cross-sectional studies indicate that people with type 2 diabetes who have higher plasma glucose or glycated hemoglobin (eg, hemoglobin A1c) levels are more likely to have evidence of retinopathy, neuropathy, or albuminuria.11 Numerous prospective longitudinal studies also show that an elevated fasting plasma glucose (FPG) concentration or glycated hemoglobin level at baseline or over time increases the chances that type 2 patients will develop new or worsened retinopathy, abnormal electrophysiologic findings, or renal dysfunction.12-15 However, observational data, unlike evidence from RCTs, do not prove that lowering blood glucose levels reduces the incidence of these complications.
Evidence from RCTs. Ten RCTs do provide this evidence.9,10,16-24 Three trials involved patients with type 2 diabetes: the very large UKPDS10 (approximately 4200 patients) and 2 small Japanese studies.16,19 The largest of the 7 trials of patients with type 1 diabetes was the DCCT (1441 patients).9 In most trials, patients were randomly allocated to an intensive treatment group that received multiple or continuous insulin administrations or to a control group that received conventional insulin therapy. Most studies confirmed (through mean glycated hemoglobin levels) better glycemic control with intensive treatment. For example, mean hemoglobin A1c levels in the intensive/conservative treatment groups of the DCCT and UKPDS were 7.2%/9.1% and 7.0%/7.9%, respectively.9,10 An old RCT (University Group Diabetes Program) that did not produce significant differences in glucose control in some treatment arms and lacked statistical power was excluded from our review.25 Average lengths of follow-up in the RCTs ranged from 2 to 12.5 years (6.5 and 10 years, respectively, in the DCCT and UKPDS).
Retinopathy in Patients with Type 2 Diabetes. RCTs provide good evidence that glycemic control reduces the incidence of retinopathy. In a Japanese trial, among patients with no retinopathy at baseline, the 6-year incidence of new disease (progression ≥2 steps on the Early Treatment Diabetic Retinopathy Study [ETDRS] scale26) in the intensive and conservative treatment groups was 6% and 36%, respectively, a relative reduction of 83%. In patients with retinopathy at baseline (secondary prevention group), the incidence rates for intensive and conservative treatment groups were 17% and 44%, respectively.16
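The relative reduction quoted for the Japanese trial can be checked directly from the two cumulative incidences. The following is a minimal sketch, not part of the panel's methods; the function name is illustrative, and the inputs are the rounded percentages given in the text.

```python
def relative_risk_reduction(control_risk: float, treated_risk: float) -> float:
    """Relative risk reduction, expressed as a fraction of the control-group risk."""
    if control_risk <= 0:
        raise ValueError("control_risk must be positive")
    return (control_risk - treated_risk) / control_risk


# 6-year incidence of retinopathy (conservative vs intensive treatment),
# taken from the rounded figures in the text.
primary = relative_risk_reduction(control_risk=0.36, treated_risk=0.06)
secondary = relative_risk_reduction(control_risk=0.44, treated_risk=0.17)

print(f"Primary prevention: {primary:.0%} relative reduction")
print(f"Secondary prevention: {secondary:.0%} relative reduction")
```

With these inputs the primary prevention figure rounds to 83%, matching the text. The text gives no relative reduction for the secondary prevention group; the same arithmetic yields about 61%.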
The authors of the UKPDS results, who reported a 25% relative reduction in the incidence of microvascular complications (11.4 vs 8.6 events/1000 patient-years), attributed much of the benefit to reduced retinopathy.10 The need for laser therapy was lowered from 11.0 to 7.9 events/1000 patient-years, a 29% relative reduction. Within 6 years, the incidence of a 2-step progression on the ETDRS scale was lowered from 28% to 23%. The relative reduction in cataract extraction was 24%. The incidence of decreased visual acuity, blindness, and vitreous hemorrhage was not lowered significantly. The extent to which the latter resulted from treatment for early complications is not known.
Retinopathy in Patients with Type 1 Diabetes. Among patients with no baseline retinopathy in the DCCT (primary prevention group), the 6.5-year incidence of a sustained 3-step change on the ETDRS scale was reduced by 76% (from 4.7 to 1.2/100 patient-years).9 In the secondary prevention group, the rate of progression was lowered by 54% (from 7.8 to 3.7/100 patient-years). In this group intensive treatment was also associated with a lower incidence of severe retinopathy, need for laser treatment, and sustained progression worsening for at least 6 months (adjusted relative risk reduction=47%, 56%, and 65%, respectively).9,27 A Swedish trial reported improved ETDRS scales and a lower prevalence of visual impairment (14% vs 35%) at 7.8 years median follow-up.17
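Several results above are reported as event rates per unit of person-time rather than as cumulative incidences. As a rough sketch (the function names are illustrative), the crude relative reduction and the rate difference can be computed from the rounded DCCT rates. The crude values need not match the published percentages exactly; this sketch assumes the published figures were adjusted or derived from unrounded data.

```python
def crude_rate_reduction(control_rate: float, treated_rate: float) -> float:
    """Crude relative reduction implied by two rates on the same person-time scale."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return 1 - treated_rate / control_rate


def rate_difference(control_rate: float, treated_rate: float) -> float:
    """Events averted per unit of person-time (same unit as the inputs)."""
    return control_rate - treated_rate


# DCCT retinopathy progression, events per 100 patient-years (rounded, from the text).
cohorts = {
    "primary prevention": (4.7, 1.2),
    "secondary prevention": (7.8, 3.7),
}

for name, (control, treated) in cohorts.items():
    reduction = crude_rate_reduction(control, treated)
    averted = rate_difference(control, treated)
    print(f"{name}: crude reduction {reduction:.1%}, "
          f"{averted:.1f} events averted per 100 patient-years")
```

The crude reductions come out near 74% and 53%, close to the reported 76% and 54%.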
Peripheral Neuropathy in Patients with Type 2 Diabetes. Some trials suggest that lowering blood glucose improves isolated electrophysiologic measures.10,16,19 A Japanese RCT16 reported an increase in median nerve conduction velocity and a reduction in arm vibration threshold, but other physiologic measures were unaffected. Neurologic symptoms were not measured. The UKPDS10 showed no effect on the incidence of absent ankle and knee reflexes, but abnormal biothesiometry data (for toes) occurred less frequently with intensive treatment. Impotence and heart rate responses to deep breathing and standing occurred with equal frequency.
Peripheral Neuropathy in Patients with Type 1 Diabetes. The DCCT showed a 69% reduction (9.8 vs 3.1/100 patient-years) in newly “confirmed clinical neuropathy” (abnormal neurologic history or physical examination combined with abnormal nerve conduction or autonomic nervous system studies), but the incidence of neurologic symptoms was not reported.9,28 Progression of clinical neuropathy in patients with preexisting disease was reduced by 57%, from 16.1 to 7.0/100 patient-years. A Swedish RCT reported a lower incidence of neuropathic symptoms at 10-year follow-up (14% vs 32%) and higher pin-prick sensitivity.29
Nephropathy in Patients with Type 2 Diabetes. Intensive insulin treatment does appear to reduce the incidence of albuminuria.10,16,19 The UKPDS observed a lower incidence of microalbuminuria within 3 years and a lower incidence of gross proteinuria and increased plasma creatinine within 9 years of follow-up (relative risk reduction=17%, 33%, and 60%, respectively).10 The incidence rates of renal failure and death from renal disease did not differ significantly between the groups, but the absolute number of cases was small.
Nephropathy in Patients with Type 1 Diabetes. In the DCCT, among patients who lacked microalbuminuria at baseline, the incidence of new cases over 6.5 years was reduced by 34% (3.4 vs 2.2/100,000 patient-years), but the incidence of sustained microalbuminuria, macroalbuminuria, or abnormal creatinine clearance did not differ. In the secondary prevention group, the incidence of microalbuminuria was reduced from 5.7 to 3.6/100,000 patient-years, a 43% relative reduction, and the incidence of sustained microalbuminuria and of macroalbuminuria was also reduced.9,30
Macrovascular Outcomes
Observational Studies. Some cross-sectional studies in type 2 diabetes report that elevated FPG concentrations or glycated hemoglobin levels are more common in people with coronary artery disease, an abnormal electrocardiogram, or cardiovascular disease.31,32 Longitudinal studies show that patients with elevated blood glucose, glycated hemoglobin, or postprandial glucose levels at baseline are more likely to develop coronary artery disease or an abnormal electrocardiogram or to die of coronary artery or cardiovascular disease.33-36 An association between blood glucose concentration and stroke or peripheral vascular disease (eg, incidence of amputation and foot ulcers) has also been demonstrated in such studies31,33,34,37,38 but less consistently.
Clinical Trials of Patients with Type 2 Diabetes. The UKPDS showed a 16% reduction in the 10-year incidence of myocardial infarction with intensive treatment, a difference of borderline statistical significance (P=.05, 95% confidence interval for relative risk=0.71-1.00).10 Statistically significant differences were noted in certain subgroups.39 Sudden death was less common (relative risk reduction=46%, P=.05), but the incidence of fatal myocardial infarction, heart failure, angina, stroke, amputation, and death from peripheral vascular disease was unchanged.10 The authors noted that the study lacked statistical power to exclude an effect on fatal outcomes.
Another RCT reported no significant effect on cardiovascular events or mortality with intensive treatment, but the mean follow-up period was only 27 months.40 A British trial involving patients with moderate hyperglycemia reported that cardiovascular events occurred less frequently in a group given high-dose tolbutamide and a recommended diet than in the control group, but the patient population, type of diabetes, and outcome measures were defined imprecisely.41
Clinical Trials of Patients with Type 1 Diabetes. The incidence of major cardiovascular and peripheral vascular events in most trials did not differ significantly with intensive treatment,9,42 but the number of cases and length of follow-up were generally too small to detect a difference.
All-Cause Mortality
Observational Studies. Some observational studies report an association between poor glycemic control and all-cause mortality or overall survival rates in type 2 diabetes.35,36,43-45 However, other cohort studies report that death rates are not reliably predicted by FPG concentrations or glycated hemoglobin levels.46,47
Clinical Trials of Patients with Type 2 Diabetes. A Swedish RCT found that patients with diabetes admitted to coronary care units for recent myocardial infarction achieved better glycemic control and experienced significantly lower all-cause mortality (33% vs 44%) if they received intensive insulin therapy (insulin-glucose infusion for the first 24 hours and subcutaneous insulin 4 times daily for 3 months).48 The UKPDS showed no significant effect on either all-cause or diabetes-related mortality but lacked statistical power to exclude an effect.10
Clinical Trials of Patients with Type 1 Diabetes. There are few data on all-cause mortality in patients with type 1 diabetes because of the low event rate.9,42
Discussion
Potential Harms of Intensive Glycemic Control
Specific complications can occur with each of the agents used to treat type 2 diabetes: Insulin has potential adverse effects, and oral glucose-lowering drugs carry some risk of undesirable side effects and uncommon but serious complications (eg, lactic acidosis, hepatotoxicity).
Attempts to achieve euglycemia can increase the risk of hypoglycemia, and some medications are associated with weight gain. The risk of severe hypoglycemia is greatest for patients with type 1 diabetes. In the subjects with type 2 diabetes in the UKPDS, the incidence of major hypoglycemic episodes was higher among the intensively treated than conventionally treated patients, but the rate was low (1% to 2%).10 More typically in type 2 disease, a more substantial risk exists for minor hypoglycemic episodes, which are usually inconsequential.10,40 An association between intensive treatment and weight gain has been reported (mean=3.1 kg and 4.6 kg in the UKPDS and DCCT, respectively),9,10 but there is no evidence that this amount of weight gain affects outcomes.
Intensive treatment requires that patients perform home glucose monitoring; follow diet and physical activity regimens; tolerate minor side effects and the risk of more serious complications from medications; regularly attend physician visits for testing and examinations; and absorb costs not covered by insurance for physicians and medical supplies, lost work (or school), and transportation. In many cases these inconveniences, discomforts, and costs are borne over a number of years, often a lifetime. RCTs have shown no adverse association between these efforts and quality of life,49 however, and one study suggested that glycemic control might improve quality of life and work productivity.50
Modeling Estimates
Mathematical models, largely based on the DCCT, have attempted to estimate the magnitude of benefits and harms from glycemic control. One model estimated that patients with type 2 diabetes who maintained a glycated hemoglobin level of 7.2% would reduce their cumulative lifetime risk of blindness, end-stage renal disease, and lower-extremity amputation by 72% (from 19% to 5%), 87% (from 17% to 2%), and 67% (from 15% to 5%), respectively.51 Life expectancy would increase by 1.39 years. A Markov model estimated that reducing glycated hemoglobin from 9% to 7% in a patient in whom diabetes developed at age 45 years would lower the lifetime risk of blindness from 2.6% to 0.3%.52 The same change in a patient who developed diabetes at age 65 years would decrease the risk of blindness from only 0.5% to <0.1%.
In theory, such projections could help clinicians and patients estimate the benefits and harms of different levels of glycemic control in individual situations. Because their designs and assumptions differ, however, the available models offer discrepant predictions about the same types of patients. For example, the lifetime risk of blindness in a white patient aged 55 years who lowers his glycated hemoglobin level from 9% to 7% would, according to one model, be reduced by 5.6 percentage points (from 9% to 3.4%)51 and, according to another model, by 1.1 percentage points (from 1.2% to 0.1%).52 These discrepancies must be reconciled before reliable outcome estimates can be introduced with confidence in practice.
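The size of the disagreement between the two models is easier to see when the absolute and relative reductions are set side by side. The sketch below uses only the lifetime risks of blindness quoted above for a 55-year-old patient; the class and field names are illustrative and do not come from either model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlindnessEstimate:
    """Lifetime risk of blindness predicted by one model at two HbA1c levels."""
    label: str
    risk_at_9: float  # predicted risk at a glycated hemoglobin level of 9%
    risk_at_7: float  # predicted risk at a glycated hemoglobin level of 7%

    @property
    def absolute_reduction(self) -> float:
        return self.risk_at_9 - self.risk_at_7

    @property
    def relative_reduction(self) -> float:
        return self.absolute_reduction / self.risk_at_9


estimates = [
    BlindnessEstimate("model in reference 51", risk_at_9=0.09, risk_at_7=0.034),
    BlindnessEstimate("model in reference 52", risk_at_9=0.012, risk_at_7=0.001),
]

for e in estimates:
    print(f"{e.label}: absolute {e.absolute_reduction * 100:.1f} points, "
          f"relative {e.relative_reduction:.0%}")

ratio = estimates[0].absolute_reduction / estimates[1].absolute_reduction
print(f"Absolute benefit differs by a factor of about {ratio:.1f}")
```

On these figures the first model predicts roughly five times the absolute benefit of the second, although the second implies the larger relative reduction, because the two models start from very different baseline risks.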
Weighing the Magnitude of Benefit
The evidence demonstrates a continuous and curvilinear relationship between hyperglycemia and the microvascular and neuropathic complications of diabetes, with risk rising progressively as mean blood glucose concentrations increase. RCTs confirm that for both type 1 and type 2 diabetes glycemic control significantly reduces the incidence of microvascular complications. The following points should be considered when applying this evidence to routine practice:
- The intensity of treatment in RCTs may be difficult to replicate to the same degree in community practice. In the DCCT, for example, patients received insulin by injection 3 times daily or by external pump, self-monitored blood glucose at least 4 times per day, underwent weekly nocturnal blood glucose measurements, visited their study center monthly, and received frequent telephone calls. The target glycated hemoglobin value was less than 6.1%.9 More typical treatment practices were followed in the UKPDS. Although some practices and health care systems have successfully achieved satisfactory blood glucose levels through aggressive programs that assist clinicians and patients, other constraints and the inability or reluctance of patients to adhere to treatment protocols remain problems in other settings.53
- The microvascular end points in most RCTs were primarily intermediate (eg, ETDRS scales, nerve conduction velocity, urinary albumin excretion) or surrogate (eg, laser phototherapy) outcomes rather than health outcomes. Few trials were designed to measure health outcomes, providing limited data on the extent to which the symptoms that patients experience (eg, visual impairment, paresthesias, complications of renal failure) are reduced by intensive treatment. Such complaints generally do not occur until the patient has end-stage disease and are often forestalled by early treatment (eg, laser phototherapy). It is reasonable to infer that long-term benefits result from glycemic control-the intermediate end points affected are known risk factors for clinical disease-but one cannot assume that the observed magnitude of risk reduction for intermediate outcomes applies also to symptomatic disease.
- Relative risk reductions are greater than absolute risk reductions. The 25% relative reduction in microvascular complications reported by the UKPDS represents an absolute reduction of only 2%: 8%, rather than 10%, of patients had complications during 10 years of treatment.10 Also, relative risk reductions generally refer to intermediate outcomes. The 76% reduction in the risk of retinopathy reported by the DCCT refers to a 3-step change on the ETDRS scale, not to improved vision.9 The number needed to treat to affect outcomes perceptible to patients is necessarily higher than that for retinopathy, delayed nerve conduction, or elevated urinary albumin excretion, because only a subset of patients with these intermediate outcomes go on to develop symptomatic disease.15 According to the UKPDS, 37 patients would require intensive treatment for 10 years to prevent one patient from undergoing laser treatment; 208 would require treatment to prevent one case of blindness.10 The observed 16% difference in the incidence of blindness in the UKPDS was not statistically significant.On the other hand, when examined at the population level, even modest absolute risk reductions can translate into large numbers of persons in society for whom clinical benefit is achievable. Given the millions of people in the United States with type 2 diabetes, even a 2% absolute reduction in the risk of microvascular complications represents many thousands of people who would benefit from glycemic control.
- Because of the average time required for glycemic control to affect outcomes, some patients with diabetes may not live long enough to benefit, because of the competing risks of death from macrovascular complications and other comorbid diseases. Although elevated blood glucose levels are a likely risk factor for cardiovascular disease, glycemic control has not been shown to enhance life expectancy or prevent heart disease. The 16% reduction in myocardial infarction reported by the UKPDS was of borderline statistical significance.10 Another RCT did find that improved glycemic control reduced the incidence of ischemic cardiac events, stroke, and cardiovascular deaths in patients with acute myocardial infarction.48 Two other trials that failed to show a benefit may have lacked adequate duration and sample size.16,40
- For any given patient, the absolute magnitude of risk reduction is a continuous variable that is a function of the patient’s current glycated hemoglobin level, the duration and magnitude of previous hyperglycemia, and the extent of preexisting microvascular complications. The probability that the patient will live long enough to experience the benefits of reduced complications depends on cardiovascular risk factors other than blood glucose (eg, smoking, hypertension, lipid levels, physical inactivity, obesity, preexisting coronary artery disease) and other determinants of life expectancy (eg, age, coexisting diseases, health status). Of these, the most critical variable is the patient’s current glycated hemoglobin level. Because of their increased risk of complications, individuals with marked elevations generally benefit more (in absolute terms) from the same absolute reduction in glycated hemoglobin levels than do individuals with mild to moderate elevations.53 Although it is obviously important for clinicians to keep patients from progressing from mild (eg, hemoglobin A1c levels of 6% to 8%) to marked hyperglycemia (eg, hemoglobin A1c levels >9.5%), in those patients who have already developed marked hyperglycemia, efforts directed at achieving even moderate control (eg, hemoglobin A1c levels of 8% to 9.5%) will yield greater health benefits than pursuing euglycemia in patients with moderate elevations.
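The arithmetic linking absolute risk reduction, number needed to treat, and population-level benefit in the list above can be sketched in a few lines. This is a minimal illustration rather than part of the panel's analysis: the function names are ours, the 10% and 8% figures are the rounded 10-year UKPDS risks quoted above, and the population size (16 million people with diabetes, 90% of them with type 2 disease) is an assumption used only to show scale. Exact fractions are used so that floating-point rounding does not distort the result.

```python
from fractions import Fraction
from math import ceil


def absolute_risk_reduction(control_risk, treated_risk):
    """Difference in cumulative risk between control and treated groups."""
    return control_risk - treated_risk


def number_needed_to_treat(arr):
    """Patients who must be treated for one to benefit: 1 / ARR, rounded up."""
    if arr <= 0:
        raise ValueError("NNT is defined only for a positive absolute risk reduction")
    return ceil(1 / arr)


# Rounded 10-year risks of microvascular complications quoted for the UKPDS.
control_risk = Fraction(10, 100)
treated_risk = Fraction(8, 100)

arr = absolute_risk_reduction(control_risk, treated_risk)  # 2 percentage points
nnt = number_needed_to_treat(arr)

# The reported NNTs imply the absolute reductions for harder outcomes (ARR = 1 / NNT).
implied_arr_laser = Fraction(1, 37)       # about 2.7 percentage points
implied_arr_blindness = Fraction(1, 208)  # about 0.5 percentage points

# Illustrative assumption: 16 million people with diabetes, 90% with type 2 disease.
assumed_type2_population = 16_000_000 * Fraction(90, 100)
people_benefiting = int(assumed_type2_population * arr)

print(f"ARR = {float(arr):.1%}, NNT = {nnt}")
print(f"People who might avoid a complication: {people_benefiting:,}")
```

Applying a trial's 10-year absolute reduction to an entire population is a crude extrapolation, but it shows why a 2% absolute difference can still correspond to a large number of people.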
Recommendations for clinical practice
For any patient with type 2 diabetes, the better the glycemic control, the lower the probability of chronic microvascular, neuropathic, and possibly cardiovascular complications. However, because of differences in patients’ life expectancies, comorbidities, and preferences, it is inappropriate to set a uniform target glycated hemoglobin level for all patients. Individuals with long life expectancies and few comorbidities may wish to pursue euglycemia, but less vigorous goals may be appropriate for others, such as patients with multiple comorbid conditions or with limited life expectancies.
Whether the magnitude of benefit of a given treatment goal justifies the potential inconvenience, harms, and costs involves value judgments that must be tailored to the individual patient. Patients’ personal risk profiles and capabilities and the relative importance they assign to the potential outcomes and supporting evidence are integral in determining how intensively to treat.
Cardiovascular disease is the most likely cause of death in patients with type 2 diabetes, and attention to glycemic control should not distract clinicians and patients from other interventions that may be more effective in preventing coronary artery disease and stroke. These include smoking cessation, serum lipid management, control of blood pressure, diet, physical activity, and weight management. Guidelines for the control of these risk factors appear elsewhere.54-56 Clinicians should also pursue treatments other than glycemic control for preventing microvascular complications (eg, blood pressure control, angiotensin-converting enzyme inhibitors for diabetic nephropathy, laser treatment for diabetic retinopathy).
Whatever the desired goals and intensity of treatment, patients face considerable barriers in implementing recommendations. Modifying diet and other personal habits; complying with self-monitoring, medication, and home care; and returning for follow-up visits are difficult. Physicians should work with patients to overcome remediable barriers and should use recommended techniques for patient education and counseling to offer the necessary information and motivation for meaningful change.57
Acknowledgments
The systematic review on which this guideline is based was supported in part by funding from the Health Care Financing Administration. We thank Richard D. Kahn, PhD, (American Diabetes Association), and Herbert F. Young, MD, and Bellinda Schoof (American Academy of Family Physicians) for their assistance, as well as the expert panel that externally reviewed the full report: Eugene Barrett, MD; John A. Colwell, MD; Richard C. Eastman, MD; Saul Genuth, MD; Ronald Klein, MD, MPH; Martin Mahoney, MD; James W. Mold, MD; David M. Nathan, MD; Jonathan E. Rodnick, MD; Jeffrey L. Susman, MD; Sandeep Vijan, MD, MS; and Bruce Zimmerman, MD. Their participation in the review process does not necessarily imply endorsement of the report or its recommendations.
1. Harris MI. Diabetes in America: epidemiology and scope of the problem. Diabetes Care 1998;21(suppl 3):C11-14.
2. Klein R, Klein BEK. Vision disorders in diabetes. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;293-338.
3. Nelson RG, Knowler WC, Pettitt DJ, Bennett PH. Kidney diseases in diabetes. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;349-400.
4. Reiber GE, Boyko EJ, Smith DG. Lower extremity foot ulcers and amputations in diabetes. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;409-27.
5. Wilson PW, Cupples LA, Kannel WB. Is hyperglycemia associated with cardiovascular disease? The Framingham Study. Am Heart J 1991;121:586-90.
6. Javitt JC, Chiang YP. Economic impact of diabetes. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;601-11.
7. American Diabetes Association. Economic consequences of diabetes mellitus in the US in 1997. Diabetes Care 1998;21:296-309.
8. Harris MI. Summary. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;1-13.
9. Diabetes Control and Complications Trial Research Group. The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. N Engl J Med 1993;329:977-86.
10. UKPDS Group. Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33). Lancet 1998;352:837-53.
11. Knuiman MW, Welborn TA, McCann VJ, Stanton KG, Constable IJ. Prevalence of diabetic complications in relation to risk factors. Diabetes 1986;35:1332-9.
12. Klein R, Klein BEK, Moss SE, Davis MD, DeMets DL. Glycosylated hemoglobin predicts the incidence and progression of diabetic retinopathy. JAMA 1988;260:2864-71.
13. Klein R, Klein BEK, Moss SE. Relation of glycemic control to diabetic microvascular complications in diabetes mellitus. Ann Intern Med 1996;124:90-6.
14. Klein R, Klein BEK, Moss SE, Cruickshanks KJ. Ten-year incidence of gross proteinuria in people with diabetes. Diabetes 1995;44:916-23.
15. Humphrey LL, Ballard DJ, Frohnert P, Chu CP, O’Fallon WM, Palumbo PJ. Chronic renal failure in non-insulin-dependent diabetes mellitus: a population-based study in Rochester, Minnesota. Ann Intern Med 1989;111:788-96.
16. Ohkubo Y, Kishikawa H, Araki E, et al. Intensive insulin therapy prevents the progression of diabetic microvascular complications in Japanese patients with non-insulin-dependent diabetes mellitus: a randomized prospective 6-year study. Diabetes Res Clin Pract 1995;28:103-17.
17. Reichard P, Nilsson BY, Rosenqvist U. The effect of long-term intensified insulin treatment on the development of microvascular complications of diabetes mellitus. N Engl J Med 1993;329:304-9.
18. Kroc Collaborative Study Group. Diabetic retinopathy after two years of intensified insulin treatment: follow-up of the Kroc Collaborative Study. JAMA 1988;260:37-41.
19. Kawamori R, Kamada T. Determination of the glycemic threshold for the regression or prevention of diabetic microangiopathies, and the insulin injection regimen to establish strict glycemic control in NIDDM. Jpn J Med 1991;30:618-21.
20. Eschwege E, Job D, Guyot-Argenton C, Aubry JP, Tchobroutsky G. Delayed progression of diabetic retinopathy by divided insulin administration: a further follow-up. Diabetologia 1979;16:13-5.
21. Lauritzen T, Frost-Larsen K, Larsen HW, Deckert T, et al. Two-year experience with continuous subcutaneous insulin infusion in relation to retinopathy and neuropathy. Diabetes 1986;34(Suppl 3):74-9.
22. Feldt-Rasmussen B, Mathiesen ER, Jensen T, Lauritzen T, Deckert T. Effect of improved metabolic control on loss of kidney function in type 1 (insulin-dependent) diabetic patients: an update of the Steno studies. Diabetologia 1991;34:164-70.
23. Dahl-Jørgensen K, Brinchmann-Hansen O, Hanssen KF, et al. Effect of near normoglycaemia for two years on progression of early diabetic retinopathy, nephropathy, and neuropathy: the Oslo study. Br Med J 1986;293:1195-9.
24. Beck-Nielsen H, Olesen T, Mogensen CE, et al. Effect of near normoglycemia for 5 years on progression of early diabetic retinopathy and renal involvement. Diabetes Res 1990;15:185-90.
25. University Group Diabetes Program. Effects of hypoglycemic agents on vascular complications in patients with adult-onset diabetes. VIII. Evaluation of insulin therapy: final report. Diabetes 1982;31(Suppl 5):1-31.
26. Early Treatment Diabetic Retinopathy Study Research Group. Grading diabetic retinopathy from stereoscopic color fundus photographs-an extension of the modified Airlie House classification: ETDRS report number 10. Ophthalmology 1991;98:786-806.
27. Diabetes Control and Complications Trial Research Group. The relationship of glycemic exposure to the risk of development and progression of retinopathy in the Diabetes Control and Complications Trial. Diabetes 1995;44:968-83.
28. Diabetes Control and Complications Trial Research Group. The effect of intensive diabetes therapy on the development and progression of neuropathy. Ann Intern Med 1995;122:561-8.
29. Reichard P, Pihl M, Rosenqvist U, Sule J. Complications in IDDM are caused by elevated blood glucose level: the Stockholm Diabetes Intervention Study (SDIS) at 10-year follow-up. Diabetologia 1996;39:1483-8.
30. Diabetes Control and Complications Trial Research Group. Effect of intensive therapy on the development and progression of diabetic nephropathy in the Diabetes Control and Complications Trial. Kidney Int 1995;47:1703-20.
31. Welborn TA, Knuiman M, McCann V, Stanton K, Constable IJ. Clinical macrovascular disease in Caucasoid diabetic subjects: logistic regression analysis of risk variables. Diabetologia 1984;27:568-73.
32. Hillson RM, Hockaday TDR, Mann JI, Newton DJ. Hyperinsulinaemia is associated with development of electrocardiographic abnormalities in diabetics. Diabetes Res 1984;1:143-9.
33. Fu CC, Chang CJ, Tseng CH, et al. Development of macrovascular diseases in NIDDM patients in northern Taiwan. Diabetes Care 1993;16:137-43.
34. Fuller JH, Shipley MJ, Rose G, Jarrett RJ, Keen H. Mortality from coronary heart disease and stroke in relation to degree of glycaemia: the Whitehall study. Br Med J 1983;287:867-70.
35. Moss SE, Klein R, Klein BEK, Meuer SM. The association of glycemia and cause-specific mortality in a diabetic population. Arch Intern Med 1994;154:2473-9.
36. Andersson DKG, Svärdsudd K. Long-term glycemic control relates to mortality in type II diabetes. Diabetes Care 1995;18:1534-42.
37. Kuusisto J, Mykkänen L, Pyörälä K, Laakso M. Non-insulin-dependent diabetes and its metabolic control are important predictors of stroke in elderly subjects. Stroke 1994;25:1157-64.
38. Lee JS, Lu M, Lee VS, Russell D, Bahr C, Lee ET. Lower-extremity amputation: incidence, risk factors, and mortality in the Oklahoma Indian Diabetes Study. Diabetes 1993;42:876-82.
39. UKPDS Group. Effect of intensive blood-glucose control with metformin on complications in overweight patients with type 2 diabetes (UKPDS 34). Lancet 1998;352:854-65.
40. Abraira C, Colwell J, Nuttall F, et al. Cardiovascular events and correlates in the Veterans Affairs Diabetes Feasibility Trial: Veterans Affairs Cooperative Study on Glycemic Control and Complications in Type II Diabetes. Arch Intern Med 1997;157:181-8.
41. Keen H, Chlouverakis C, Jarrett RJ, Boyns DR. The effect of treatment of moderate hyperglycaemia on the incidence of arterial disease. Postgrad Med J 1968;Suppl:960-5.
42. Reichard P, Pihl M. Mortality and treatment side-effects during long-term intensified conventional insulin treatment in the Stockholm Diabetes Intervention Study. Diabetes 1994;43:313-7.
43. Muggeo M, Verlato G, Bonora E, et al. Long-term instability of fasting plasma glucose predicts mortality in elderly NIDDM patients: the Verona Diabetes Study. Diabetologia 1995;38:672-9.
44. Sasaki A, Uehara M, Horiuchi N, Hasegawa K, Shimizu T. A 15-year follow-up study of patients with non-insulin-dependent diabetes mellitus (NIDDM) in Osaka, Japan. Factors predictive of the prognosis of diabetic patients. Diab Res Clin Pract 1997;36:41-7.
45. Gall MA, Borch-Johnsen K, Hougaard P, Nielsen FS, Parving HH. Albuminuria and poor glycemic control predict mortality in NIDDM. Diabetes 1995;44:1303-9.
46. Hadden DR, Blair ALT, Wilson EA, et al. Natural history of diabetes presenting at 40-69 years: a prospective study of the influence of intensive dietary therapy. Q J Med 1986;230:579-98.
47. Davis WK, Hess GE, Hiss RG. Psychological correlates of survival in diabetes. Diabetes Care 1988;11:538-45.
48. Malmberg K. Diabetes Mellitus, Insulin Glucose Infusion in Acute Myocardial Infarction (DIGAMI) Study Group. Prospective randomised study of intensive insulin treatment on long-term survival after acute myocardial infarction in patients with diabetes mellitus. Br Med J 1997;314:1512-5.
49. Diabetes Control and Complications Trial Research Group. Influence of intensive diabetes treatment on quality of life outcomes in the Diabetes Control and Complications Trial. Diabetes Care 1996;19:195-203.
50. Testa MA, Simonson DC. Health economic benefits and quality of life during improved glycemic control in patients with type 2 diabetes mellitus: a randomized, controlled, double-blind trial. JAMA 1998;280:1490-6.
51. Eastman RC, Javitt JC, Herman WH, et al. Model of complications of NIDDM. II. Analysis of the health benefits and cost-effectiveness of treating NIDDM with the goal of normoglycemia. Diabetes Care 1997;20:735-44.
52. Vijan S, Hofer TP, Hayward RA. Estimated benefits of glycemic control in microvascular complications in type 2 diabetes. Ann Intern Med 1997;127:788-95.
53. Hayward RA, Manning WG, Kaplan SH, Wagner EH, Greenfield S. Starting insulin therapy in patients with type 2 diabetes: effectiveness, complications, and resource utilization. JAMA 1997;278:1663-9.
54. Fiore MC, Bailey WC, Cohen SJ, et al. Smoking cessation: clinical practice guideline no. 18. Rockville, Md: US Department of Health and Human Services, Agency for Health Care Policy and Research; 1996. AHCPR Publication No. 96-0692.
55. Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults. Summary of the second report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel II). JAMA 1993;269:3015-23.
56. Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. The Sixth Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. Arch Intern Med 1997;157:2413-46.
57. Roter DL, Hall JA, Merisca R, Nordstrom B, Cretin D, Svarstad B. Effectiveness of interventions to improve patient compliance: a meta-analysis. Med Care 1998;36:1138-61.
PARTICIPANTS: A 9-member panel composed of family physicians, general internists, endocrinologists, and a practice guidelines methodologist was assembled by the American Academy of Family Physicians, the American Diabetes Association, and the American College of Physicians.
EVIDENCE: Admissible evidence included published randomized controlled trials and observational studies regarding the effects of glycemic control on microvascular and macrovascular complications and on adverse effects. We followed systematic search and data abstraction procedures. Greater weight was given to clinical trials and to evidence about health outcomes.
CONSENSUS PROCESS: Interpretations of evidence and approval of documents were finalized by unanimous vote, with recommendations linked to evidence and not expert opinion. The full report was prepared by the chair and 2 panel members, representing each of the 3 organizations. The initial draft underwent external review by 14 diabetologists and family physicians and changes consistent with the evidence were incorporated.
CONCLUSIONS: The evidence demonstrates that the risk of microvascular and neuropathic complications is reduced by lowering glucose concentrations. Whether glycemic control affects macrovascular outcomes is less clear. The potential benefits of glycemic control must be balanced against factors that either preempt benefits (eg, limited life expectancy, comorbid disease) or increase risk (eg, severe hypoglycemia). The magnitude of benefit is a function of individual clinical variables (eg, baseline glycated hemoglobin level, presence of preexisting microvascular disease). Appropriate targets for treatment should be determined by considering these factors, patients’ risk profiles, and personal preferences.
What are the benefits and risks of glycemic control in type 2 diabetes, and what are the implications for clinical practice?
An estimated 16 million people in the United States have diabetes.1 Its microvascular complications (retinopathy, nephropathy, neuropathy) are a leading cause of blindness among adults,2 end-stage renal disease,3 and lower-extremity amputations.4 Its macrovascular complications pose an even greater public health burden, increasing the risk of coronary artery disease, stroke, and peripheral vascular disease.5 Each year diabetes costs the country an estimated $90 billion to $99 billion.6,7
Type 2 diabetes, which accounts for 90% to 95% of diabetes cases,8 differs from type 1 disease in average age of onset and etiology. In both forms, however, the underlying cause of microvascular (and possibly macrovascular) complications appears to be chronic elevations in blood glucose concentrations. The overriding factors in predicting microvascular pathogenesis have less to do with the type of diabetes than with the number of years the patient has had hyperglycemia and the magnitude of the glucose elevation.
In recent years, 2 major randomized controlled trials (RCTs), the Diabetes Control and Complications Trial9 (DCCT) in patients with type 1 disease and the United Kingdom Prospective Diabetes Study10 (UKPDS) in patients with type 2 diabetes, have shown that microvascular complications can be reduced significantly in patients who achieve normal or near-normal blood glucose levels. There is now agreement in the medical community about the importance of lowering markedly elevated blood glucose levels.
The incremental benefit of tight glycemic control (as opposed to less intensive therapy) varies across patient groups, however. Years are required for microvascular complications to progress to symptomatic disease. Patients with type 1 diabetes, who are generally younger, are more likely to live long enough to benefit from tight glycemic control than patients with type 2 disease, who face a shorter life expectancy because of their age and risk of cardiovascular disease. For patients with coexistent diseases, the delayed benefits of glycemic control may be offset by the more immediate inconvenience, complications, and costs of intensive treatment and by the health effects of comorbid conditions.
These generalizations do not apply to all patients. The older age of onset of type 2 diabetes is a population average with a wide distribution. Many patients with type 2 diabetes live long enough to experience significant microvascular disease and may benefit from glycemic control. Also, interventions that extend life expectancy (eg, smoking cessation, blood pressure control, and lipid control) give patients more time to encounter advanced microvascular disease and an opportunity to benefit from treatment.
In 1996, the American Academy of Family Physicians convened a panel to conduct a systematic review of the evidence of the benefits and harms of glycemic control in type 2 diabetes and to develop evidence-based recommendations. The 9-member panel included family physicians, general internists, endocrinologists, and a practice guidelines methodologist. Four members were appointed by the American Diabetes Association and the American College of Physicians. We summarize the panel’s findings, which are available in a full report.*
Methods
Systematic Review
The review methods are provided in detail in the full report. The literature search retrieved published evidence on the effects of glycemic control on microvascular and macrovascular complications in type 1 and type 2 diabetes and on adverse effects. RCT evidence for type 1 diabetes was considered relevant in evaluating the effects of glycemic control in type 2 disease. A total of 798 citations met initial inclusion criteria. All articles underwent structured abstraction. We closed the search with the publication of the UKPDS results in September 1998.
In reviewing the evidence, the panel gave greater weight to RCTs than to observational studies and emphasized data on health outcomes perceptible to patients (eg, visual acuity) over those for intermediate or surrogate end points (eg, retinopathy) that precede or are associated with such outcomes. Most trials did not designate health outcomes as primary end points and therefore lacked the statistical power and duration to prove an effect. Panel recommendations were evidence-based and did not reflect expert opinion. Fourteen outside diabetologists and family physicians externally reviewed the full report, and revisions consistent with the evidence were adopted. The American Academy of Family Physicians and the American Diabetes Association endorsed the full report in March 1999 and this policy statement in October 1999.
We focused the review on the benefits of glycemic control in general and not on specific agents (eg, sulfonylureas, metformin). Interventions not associated with glycemic control (eg, laser phototherapy, angiotensin-converting enzyme inhibitors), which also mitigate the effects of microvascular disease, were not examined in our review.
Results
Microvascular Outcomes
Evidence from Observational Studies. Many cross-sectional studies indicate that people with type 2 diabetes who have higher plasma glucose or glycated hemoglobin (eg, hemoglobin A1c) levels are more likely to have evidence of retinopathy, neuropathy, or albuminuria.11 Numerous prospective longitudinal studies also show that an elevated fasting plasma glucose (FPG) concentration or glycated hemoglobin level at baseline or over time increases the chances that type 2 patients will develop new or worsened retinopathy, abnormal electrophysiologic findings, or renal dysfunction.12-15 However, observational data, unlike evidence from RCTs, do not prove that lowering blood glucose levels reduces the incidence of these complications.
Evidence from RCTs. Ten RCTs do provide this evidence.9,10,16-24 Three trials involved patients with type 2 diabetes: the very large UKPDS10 (approximately 4200 patients) and 2 small Japanese studies.16,19 The largest of the 7 trials of patients with type 1 diabetes was the DCCT (1441 patients).9 In most trials, patients were randomly allocated to an intensive treatment group that received multiple or continuous insulin administrations or to a control group that received conventional insulin therapy. Most studies confirmed (through mean glycated hemoglobin levels) better glycemic control with intensive treatment. For example, mean hemoglobin A1c levels in the intensive/conservative treatment groups of the DCCT and UKPDS were 7.2%/9.1% and 7.0%/7.9%, respectively.9,10 An old RCT (University Group Diabetes Program) that did not produce significant differences in glucose control in some treatment arms and lacked statistical power was excluded from our review.25 Average lengths of follow-up in the RCTs ranged from 2 to 12.5 years (6.5 and 10 years, respectively, in the DCCT and UKPDS).
Retinopathy in Patients with Type 2 Diabetes. RCTs provide good evidence that glycemic control reduces the incidence of retinopathy. In a Japanese trial, among patients with no retinopathy at baseline, the 6-year incidence of new disease (progression ≥2 steps on the Early Treatment Diabetic Retinopathy Study [ETDRS] scale26) in the intensive and conservative treatment groups was 6% and 36%, respectively, a relative reduction of 83%. In patients with retinopathy at baseline (secondary prevention group), the incidence rates for intensive and conservative treatment groups were 17% and 44%, respectively.16
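The relative reduction quoted for the Japanese trial follows directly from the two incidence figures. The sketch below (the function name is ours, for illustration) reproduces the 83% figure and applies the same formula to the secondary prevention group; that second value is our arithmetic from the reported incidences, not a figure reported by the trial.

```python
def relative_risk_reduction(control_incidence, treated_incidence):
    """Fraction of the control-group risk that is removed by treatment."""
    return (control_incidence - treated_incidence) / control_incidence


# 6-year incidence of retinopathy progression, conservative vs intensive treatment.
primary = relative_risk_reduction(0.36, 0.06)    # no retinopathy at baseline
secondary = relative_risk_reduction(0.44, 0.17)  # retinopathy at baseline

print(f"Primary prevention:   {primary:.0%}")
print(f"Secondary prevention: {secondary:.0%}")
```

The absolute reductions (30 and 27 percentage points) are similar in the two groups even though the relative reductions differ, because the conservative-treatment incidence is higher in the secondary prevention group.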
The authors of the UKPDS results, who reported a 25% relative reduction in the incidence of microvascular complications (11.4 vs 8.6 events/1000 patient-years), attributed much of the benefit to reduced retinopathy.10 The need for laser therapy was lowered from 11.0 to 7.9 events/1000 patient-years, a 29% relative reduction. Within 6 years, the incidence of a 2-step progression on the ETDRS scale was lowered from 28% to 23%. The relative reduction in cataract extraction was 24%. The incidence of decreased visual acuity, blindness, and vitreous hemorrhage was not lowered significantly. The extent to which the latter resulted from treatment for early complications is not known.
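Event rates expressed per 1000 patient-years can be converted to an approximate cumulative risk, which makes them easier to compare with absolute risk figures. The sketch below assumes a constant event rate over time (an exponential model); this is a simplifying assumption of ours, not the method used by the UKPDS, and the function name is illustrative.

```python
from math import exp


def cumulative_risk(events_per_1000_patient_years, years):
    """Approximate cumulative risk over `years`, assuming a constant event rate."""
    rate_per_patient_year = events_per_1000_patient_years / 1000
    return 1 - exp(-rate_per_patient_year * years)


# UKPDS microvascular event rates (conventional vs intensive treatment).
conventional = cumulative_risk(11.4, 10)
intensive = cumulative_risk(8.6, 10)

# Relative reduction computed directly from the two rates.
relative_reduction = (11.4 - 8.6) / 11.4

print(f"10-year risk, conventional: {conventional:.1%}")
print(f"10-year risk, intensive:    {intensive:.1%}")
print(f"Relative reduction in rate: {relative_reduction:.0%}")
```

Under this assumption the 10-year risks come out at roughly 11% and 8%, an absolute difference of 2 to 3 percentage points set against a relative reduction of about 25%.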
Retinopathy in Patients with Type 1 Diabetes. Among patients with no baseline retinopathy in the DCCT (primary prevention group), the 6.5-year incidence of a sustained 3-step change on the ETDRS scale was reduced by 76% (from 4.7 to 1.2/100 patient-years).9 In the secondary prevention group, the rate of progression was lowered by 54% (from 7.8 to 3.7/100 patient-years). In this group intensive treatment was also associated with a lower incidence of severe retinopathy, need for laser treatment, and sustained progression (worsening for at least 6 months) (adjusted relative risk reduction=47%, 56%, and 65%, respectively).9,27 A Swedish trial reported improved ETDRS scales and a lower prevalence of visual impairment (14% vs 35%) at 7.8 years median follow-up.17
Peripheral Neuropathy in Patients with Type 2 Diabetes. Some trials suggest that lowering blood glucose improves isolated electrophysiologic measures.10,16,19 A Japanese RCT16 reported an increase in median nerve conduction velocity and a reduction in arm vibration threshold, but other physiologic measures were unaffected. Neurologic symptoms were not measured. The UKPDS10 showed no effect on the incidence of absent ankle and knee reflexes, but abnormal biothesiometry data (for toes) occurred less frequently with intensive treatment. Impotence and heart rate responses to deep breathing and standing occurred with equal frequency.
Peripheral Neuropathy in Patients with Type 1 Diabetes. The DCCT showed a 69% reduction (9.8 vs 3.1/100 patient-years) in newly “confirmed clinical neuropathy” (abnormal neurologic history or physical examination combined with abnormal nerve conduction or autonomic nervous system studies), but the incidence of neurologic symptoms was not reported.9,28 Progression of clinical neuropathy in patients with preexisting disease was reduced by 57%, from 16.1 to 7.0/100 patient-years. A Swedish RCT reported a lower incidence of neuropathic symptoms at 10-year follow-up (14% vs 32%) and higher pin-prick sensitivity.29
Nephropathy in Patients with Type 2 Diabetes. Intensive insulin treatment does appear to reduce the incidence of albuminuria.10,16,19 The UKPDS observed a lower incidence of microalbuminuria within 3 years and a lower incidence of gross proteinuria and increased plasma creatinine within 9 years of follow-up (relative risk reduction=17%, 33%, and 60%, respectively).10 The incidence rates of renal failure and death from renal disease did not differ significantly between the groups, but the absolute number of cases was small.
Nephropathy in Patients with Type 1 Diabetes. In the DCCT, among patients who lacked microalbuminuria at baseline, the incidence of new cases over 6.5 years was reduced by 34% (3.4 vs 2.2/100 patient-years), but the incidence of sustained microalbuminuria, macroalbuminuria, or abnormal creatinine clearance did not differ. In the secondary prevention group, the incidence of microalbuminuria was reduced from 5.7 to 3.6/100 patient-years, a 43% relative reduction, and the incidence of sustained microalbuminuria and of macroalbuminuria was also reduced.9,30
Macrovascular Outcomes
Observational Studies. Some cross-sectional studies in type 2 diabetes report that elevated FPG concentrations or glycated hemoglobin levels are more common in people with coronary artery disease, an abnormal electrocardiogram, or cardiovascular disease.31,32 Longitudinal studies show that patients with elevated blood glucose, glycated hemoglobin, or postprandial glucose levels at baseline are more likely to develop coronary artery disease or an abnormal electrocardiogram or to die of coronary artery or cardiovascular disease.33-36 An association between blood glucose concentration and stroke or peripheral vascular disease (eg, incidence of amputation and foot ulcers) has also been demonstrated in such studies31,33,34,37,38 but less consistently.
Clinical Trials of Patients with Type 2 Diabetes. The UKPDS showed a 16% reduction in the 10-year incidence of myocardial infarction with intensive treatment, a difference of borderline statistical significance (P=.05, 95% confidence interval for relative risk=0.71-1.00).10 Statistically significant differences were noted in certain subgroups.39 Sudden death was less common (relative risk reduction=46%, P=.05), but the incidence of fatal myocardial infarction, heart failure, angina, stroke, amputation, and death from peripheral vascular disease was unchanged.10 The authors noted that the study lacked statistical power to exclude an effect on fatal outcomes.
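The "borderline" label for the myocardial infarction result can be read directly from the confidence interval. In the sketch below, the point estimate of 0.84 is inferred from the reported 16% reduction, and the helper names are illustrative. An interval for relative risk that includes 1.00 is compatible with no effect.

```python
def risk_reduction_percent(relative_risk):
    """Convert a relative risk into a whole-number percentage risk reduction."""
    return round((1 - relative_risk) * 100)


def excludes_no_effect(ci_low, ci_high, null_value=1.0):
    """True when the confidence interval lies entirely on one side of the null."""
    return ci_high < null_value or ci_low > null_value


point_estimate = 0.84         # inferred from the reported 16% reduction
ci_low, ci_high = 0.71, 1.00  # 95% confidence interval reported for relative risk

best_estimate = risk_reduction_percent(point_estimate)  # 16
largest_reduction = risk_reduction_percent(ci_low)      # 29
smallest_reduction = risk_reduction_percent(ci_high)    # 0

print(f"Reduction: {best_estimate}% (range {smallest_reduction}% to {largest_reduction}%)")
print("Interval excludes no effect:", excludes_no_effect(ci_low, ci_high))
```

The interval runs from a 29% reduction to no reduction at all, so its upper bound just touches the no-effect value, which is consistent with the reported P=.05.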
Another RCT reported no significant effect on cardiovascular events or mortality with intensive treatment, but the mean follow-up period was only 27 months.40 A British trial involving patients with moderate hyperglycemia reported that cardiovascular events occurred less frequently in a group given high-dose tolbutamide and a recommended diet than in the control group, but the patient population, type of diabetes, and outcome measures were defined imprecisely.41
Clinical Trials of Patients with Type 1 Diabetes. The incidence of major cardiovascular and peripheral vascular events in most trials did not differ significantly with intensive treatment,9,42 but the number of cases and length of follow-up were generally too small to detect a difference.
All-Cause Mortality
Observational Studies. Some observational studies report an association between poor glycemic control and all-cause mortality or overall survival rates in type 2 diabetes.35,36,43-45 However, other cohort studies report that death rates are not reliably predicted by FPG concentrations or glycated hemoglobin levels.46,47
Clinical Trials of Patients with Type 2 Diabetes. A Swedish RCT found that patients with diabetes admitted to coronary care units for recent myocardial infarction achieved better glycemic control and experienced significantly lower all-cause mortality (33% vs 44%) if they received intensive insulin therapy (insulin-glucose infusion for the first 24 hours and subcutaneous insulin 4 times daily for 3 months).48 The UKPDS showed no significant effect on either all-cause or diabetes-related mortality but lacked statistical power to exclude an effect.10
Clinical Trials of Patients with Type 1 Diabetes. There are few data on all-cause mortality in patients with type 1 diabetes because of the low event rate.9,42
Discussion
Potential Harms of Intensive Glycemic Control
Specific complications can occur with each of the agents used to treat type 2 diabetes: Insulin has potential adverse effects, and oral glucose-lowering drugs carry some risk of undesirable side effects and uncommon but serious complications (eg, lactic acidosis, hepatotoxicity).
Attempts to achieve euglycemia can increase the risk of hypoglycemia, and some medications are associated with weight gain. The risk of severe hypoglycemia is greatest for patients with type 1 diabetes. In the subjects with type 2 diabetes in the UKPDS, the incidence of major hypoglycemic episodes was higher among the intensively treated than among the conventionally treated patients, but the rate was low (1% to 2%).10 In type 2 disease, the more substantial risk is of minor hypoglycemic episodes, which are usually inconsequential.10,40 An association between intensive treatment and weight gain has been reported (mean=3.1 kg and 4.6 kg in the UKPDS and DCCT, respectively),9,10 but there is no evidence that this amount of weight gain affects outcomes.
Intensive treatment requires that patients perform home glucose monitoring; follow diet and physical activity regimens; tolerate minor side effects and the risk of more serious complications from medications; regularly attend physician visits for testing and examinations; and absorb costs not covered by insurance for physicians and medical supplies, lost work (or school), and transportation. In many cases these inconveniences, discomforts, and costs are borne over a number of years, often a lifetime. RCTs have shown no adverse association between these efforts and quality of life,49 however, and one study suggested that glycemic control might improve quality of life and work productivity.50
Modeling Estimates
Mathematical models, largely based on the DCCT, have attempted to estimate the magnitude of benefits and harms from glycemic control. One model estimated that patients with type 2 diabetes who maintained a glycated hemoglobin level of 7.2% would reduce their cumulative lifetime risk of blindness, end-stage renal disease, and lower-extremity amputation by 72% (from 19% to 5%), 87% (from 17% to 2%), and 67% (from 15% to 5%), respectively.51 Life expectancy would increase by 1.39 years. A Markov model estimated that reducing glycated hemoglobin from 9% to 7% in a patient in whom diabetes developed at age 45 years would lower the lifetime risk of blindness from 2.6% to 0.3%.52 The same change in a patient who developed diabetes at age 65 years would decrease the risk of blindness from only 0.5% to <0.1%.
In theory, such projections could be useful to clinicians and patients to estimate the benefits and harms of different levels of glycemic control in individual situations. Because their designs and assumptions differ, however, available models offer discrepant predictions about the same types of patients. For example, the lifetime risk of blindness in a white patient aged 55 years who lowers his glycated hemoglobin level from 9% to 7% would be reduced by 5.6 percentage points (from 9% to 3.4%) according to one model51 and by 1.1 percentage points (from 1.2% to 0.1%) according to another.52 These discrepancies must be reconciled before reliable outcome estimates can be introduced with confidence in practice.
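The gap between the two models is easier to see when absolute and relative reductions are computed side by side. The short Python sketch below uses only the lifetime blindness risks quoted above for a 55-year-old patient; the function name and the dictionary labels are illustrative assumptions, not part of either published model.

```python
def risk_change(baseline_pct, treated_pct):
    """Return (absolute reduction in percentage points, relative reduction as a fraction)."""
    absolute = baseline_pct - treated_pct
    relative = absolute / baseline_pct
    return absolute, relative


# Lifetime risk of blindness (%) at glycated hemoglobin 9% vs 7%, as quoted in the text.
# The labels are illustrative; the numbers are the only inputs taken from the source.
projections = {
    "model of reference 51": (9.0, 3.4),
    "model of reference 52": (1.2, 0.1),
}

for label, (baseline, treated) in projections.items():
    absolute, relative = risk_change(baseline, treated)
    print(f"{label}: {absolute:.1f} points absolute, {relative:.0%} relative")
```

On these inputs the two models differ about fivefold in absolute terms (5.6 vs 1.1 percentage points), even though the second model implies the larger relative reduction (about 92% vs about 62%). This is one reason the choice of model matters when counseling an individual patient.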
Weighing the Magnitude of Benefit
The evidence demonstrates a continuous and curvilinear relationship between hyperglycemia and the microvascular and neuropathic complications of diabetes, with risk rising progressively as mean blood glucose concentrations increase. RCTs confirm that for both type 1 and type 2 diabetes glycemic control significantly reduces the incidence of microvascular complications. The following points should be considered when applying this evidence to routine practice:
- The intensity of treatment in RCTs may be difficult to replicate to the same degree in community practice. In the DCCT, for example, patients received insulin by injection 3 times daily or by external pump, self-monitored blood glucose at least 4 times per day, underwent weekly nocturnal blood glucose measurements, visited their study center monthly, and received frequent telephone calls. The target glycated hemoglobin value was less than 6.1%.9 More typical treatment practices were followed in the UKPDS. Although some practices and health care systems have successfully achieved satisfactory blood glucose levels through aggressive programs that assist clinicians and patients, other constraints and the inability or reluctance of patients to adhere to treatment protocols remain problems in other settings.53
- The microvascular end points in most RCTs were primarily intermediate (eg, ETDRS scales, nerve conduction velocity, urinary albumin excretion) or surrogate (eg, laser phototherapy) outcomes rather than health outcomes. Few trials were designed to measure health outcomes, providing limited data on the extent to which the symptoms that patients experience (eg, visual impairment, paresthesias, complications of renal failure) are reduced by intensive treatment. Such complaints generally do not occur until the patient has end-stage disease and are often forestalled by early treatment (eg, laser phototherapy). It is reasonable to infer that long-term benefits result from glycemic control, because the intermediate end points affected are known risk factors for clinical disease, but one cannot assume that the observed magnitude of risk reduction for intermediate outcomes applies also to symptomatic disease.
- Relative risk reductions are greater than absolute risk reductions. The 25% relative reduction in microvascular complications reported by the UKPDS represents an absolute reduction of only 2%: 8%, rather than 10%, of patients had complications during 10 years of treatment.10 Also, relative risk reductions generally refer to intermediate outcomes. The 76% reduction in the risk of retinopathy reported by the DCCT refers to a 3-step change on the ETDRS scale, not to improved vision.9 The number needed to treat to affect outcomes perceptible to patients is necessarily higher than that for retinopathy, delayed nerve conduction, or elevated urinary albumin excretion, because only a subset of patients with these intermediate outcomes go on to develop symptomatic disease.15 According to the UKPDS, 37 patients would require intensive treatment for 10 years to prevent one patient from undergoing laser treatment; 208 would require treatment to prevent one case of blindness.10 The observed 16% difference in the incidence of blindness in the UKPDS was not statistically significant. On the other hand, when examined at the population level, even modest absolute risk reductions can translate into large numbers of persons in society for whom clinical benefit is achievable. Given the millions of people in the United States with type 2 diabetes, even a 2% absolute reduction in the risk of microvascular complications represents many thousands of people who would benefit from glycemic control.
- Because glycemic control takes years, on average, to affect outcomes, some patients with diabetes may not live long enough to benefit, given the competing risks of death from macrovascular complications and other comorbid diseases. Although elevated blood glucose levels are a likely risk factor for cardiovascular disease, glycemic control has not been shown to enhance life expectancy or prevent heart disease. The 16% reduction in myocardial infarction reported by the UKPDS was of borderline statistical significance.10 Another RCT did find that improved glycemic control reduced the incidence of ischemic cardiac events, stroke, and cardiovascular deaths in patients with acute myocardial infarction.48 Two other trials that failed to show a benefit may have lacked adequate duration and sample size.16,40
- For any given patient, the absolute magnitude of risk reduction is a continuous variable that is a function of the patient’s current glycated hemoglobin level, the duration and magnitude of previous hyperglycemia, and the extent of preexisting microvascular complications. The probability that the patient will live long enough to experience the benefits of reduced complications depends on cardiovascular risk factors other than blood glucose (eg, smoking, hypertension, lipid levels, physical inactivity, obesity, preexisting coronary artery disease) and other determinants of life expectancy (eg, age, coexisting diseases, health status). Of these, the most critical variable is the patient’s current glycated hemoglobin level. Because of their increased risk of complications, individuals with marked elevations generally benefit more (in absolute terms) from the same absolute reduction in glycated hemoglobin levels than do individuals with mild to moderate elevations.53 Although it is obviously important for clinicians to keep patients from progressing from mild (eg, hemoglobin A1c levels of 6% to 8%) to marked hyperglycemia (eg, hemoglobin A1c levels >9.5%), in those patients who have already developed marked hyperglycemia, efforts directed at achieving even moderate control (eg, hemoglobin A1c levels of 8% to 9.5%) will yield greater health benefits than pursuing euglycemia in patients with moderate elevations.
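The distinction between relative and absolute risk reduction drawn in the points above, and its link to the number needed to treat (NNT), can be checked with simple arithmetic. The Python sketch below is illustrative only: the function names are assumptions, the 10% and 8% inputs are the rounded 10-year UKPDS figures quoted above, and the population of 1 million treated patients is a hypothetical round number rather than a figure from the review.

```python
def risk_reductions(control_risk, treated_risk):
    """Return (absolute risk reduction, relative risk reduction, number needed to treat)."""
    arr = control_risk - treated_risk   # absolute risk reduction
    rrr = arr / control_risk            # relative risk reduction
    nnt = 1.0 / arr                     # number needed to treat
    return arr, rrr, nnt


def arr_from_nnt(nnt):
    """Absolute risk reduction implied by a reported number needed to treat."""
    return 1.0 / nnt


# Rounded 10-year incidence of microvascular complications quoted in the text.
arr, rrr, nnt = risk_reductions(control_risk=0.10, treated_risk=0.08)
print(f"ARR {arr:.1%}, RRR {rrr:.0%}, NNT about {nnt:.0f}")

# NNTs quoted in the text for 10 years of intensive treatment.
for outcome, reported_nnt in (("laser treatment", 37), ("blindness", 208)):
    print(f"{outcome}: NNT {reported_nnt} implies ARR of {arr_from_nnt(reported_nnt):.2%}")

# Hypothetical population size, chosen only to illustrate the population-level argument.
HYPOTHETICAL_POPULATION = 1_000_000
print(f"Patients spared a complication per million treated: {HYPOTHETICAL_POPULATION * arr:,.0f}")
```

With the rounded inputs the relative reduction works out to 20% rather than the reported 25%; the reported figure is consistent with the unrounded UKPDS event rates (11.4 vs 8.6 events/1000 patient-years, a reduction of about 25%). The same arithmetic shows why an NNT of 208 for blindness corresponds to an absolute reduction of less than half a percentage point.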
Recommendations for Clinical Practice
For any patient with type 2 diabetes, the better the glycemic control, the lower the probability of chronic microvascular, neuropathic, and possibly cardiovascular complications. However, because of differences in patients’ life expectancies, comorbidities, and preferences, it is inappropriate to set a uniform target glycated hemoglobin level for all patients. Individuals with long life expectancies and few comorbidities may wish to pursue euglycemia, but less vigorous goals may be appropriate for others, such as patients with multiple comorbid conditions or with limited life expectancies.
Whether the magnitude of benefit of a given treatment goal justifies the potential inconvenience, harms, and costs involves value judgments that must be tailored to the individual patient. Patients’ personal risk profiles and capabilities and the relative importance they assign to the potential outcomes and supporting evidence are integral in determining how intensively to treat.
Cardiovascular disease is the most likely cause of death in patients with type 2 diabetes, and attention to glycemic control should not distract clinicians and patients from other interventions that may be more effective in preventing coronary artery disease and stroke. These include smoking cessation, serum lipid management, control of blood pressure, diet, physical activity, and weight management. Guidelines for the control of these risk factors appear elsewhere.54-56 Clinicians should also pursue treatments other than glycemic control for preventing microvascular complications (eg, blood pressure control, angiotensin-converting enzyme inhibitors for diabetic nephropathy, laser treatment for diabetic retinopathy).
Whatever the desired goals and intensity of treatment, patients face considerable barriers in implementing recommendations. Modifying diet and other personal habits; complying with self-monitoring, medication, and home care; and returning for follow-up visits are difficult. Physicians should work with patients to overcome remediable barriers and should use recommended techniques for patient education and counseling to offer the necessary information and motivation for meaningful change.57
Acknowledgments
The systematic review on which this guideline is based was supported in part by funding from the Health Care Financing Administration. We thank Richard D. Kahn, PhD (American Diabetes Association), and Herbert F. Young, MD, and Bellinda Schoof (American Academy of Family Physicians) for their assistance, as well as the expert panel that externally reviewed the full report: Eugene Barrett, MD; John A. Colwell, MD; Richard C. Eastman, MD; Saul Genuth, MD; Ronald Klein, MD, MPH; Martin Mahoney, MD; James W. Mold, MD; David M. Nathan, MD; Jonathan E. Rodnick, MD; Jeffrey L. Susman, MD; Sandeep Vijan, MD, MS; and Bruce Zimmerman, MD. Their participation in the review process does not necessarily imply endorsement of the report or its recommendations.
PARTICIPANTS: A 9-member panel composed of family physicians, general internists, endocrinologists, and a practice guidelines methodologist was assembled by the American Academy of Family Physicians, the American Diabetes Association, and the American College of Physicians.
EVIDENCE: Admissible evidence included published randomized controlled trials and observational studies regarding the effects of glycemic control on microvascular and macrovascular complications and on adverse effects. We followed systematic search and data abstraction procedures. Greater weight was given to clinical trials and to evidence about health outcomes.
CONSENSUS PROCESS: Interpretations of evidence and approval of documents were finalized by unanimous vote, with recommendations linked to evidence and not expert opinion. The full report was prepared by the chair and 2 panel members, representing each of the 3 organizations. The initial draft underwent external review by 14 diabetologists and family physicians, and changes consistent with the evidence were incorporated.
CONCLUSIONS: The evidence demonstrates that the risk of microvascular and neuropathic complications is reduced by lowering glucose concentrations. Whether glycemic control affects macrovascular outcomes is less clear. The potential benefits of glycemic control must be balanced against factors that either preempt benefits (eg, limited life expectancy, comorbid disease) or increase risk (eg, severe hypoglycemia). The magnitude of benefit is a function of individual clinical variables (eg, baseline glycated hemoglobin level, presence of preexisting microvascular disease). Appropriate targets for treatment should be determined by considering these factors, patients’ risk profiles, and personal preferences.
What are the benefits and risks of glycemic control in type 2 diabetes, and what are the implications for clinical practice?
An estimated 16 million people in the United States have diabetes.1 Its microvascular complications (retinopathy, nephropathy, neuropathy) are a leading cause of blindness among adults,2 end-stage renal disease,3 and lower-extremity amputations.4 Its macrovascular complications pose an even greater public health burden, increasing the risk of coronary artery disease, stroke, and peripheral vascular disease.5 Each year diabetes costs the country an estimated $90 billion to $99 billion.6,7
Type 2 diabetes, which accounts for 90% to 95% of diabetes cases,8 differs from type 1 disease in average age of onset and etiology. In both forms, however, the underlying cause of microvascular (and possibly macrovascular) complications appears to be chronic elevations in blood glucose concentrations. The overriding factors in predicting microvascular pathogenesis have less to do with the type of diabetes than with the number of years the patient has had hyperglycemia and the magnitude of the glucose elevation.
In recent years, 2 major randomized controlled trials (RCTs), the Diabetes Control and Complications Trial9 (DCCT) in patients with type 1 disease and the United Kingdom Prospective Diabetes Study10 (UKPDS) in patients with type 2 diabetes, have shown that microvascular complications can be reduced significantly in patients who achieve normal or near-normal blood glucose levels. There is now agreement in the medical community about the importance of lowering markedly elevated blood glucose levels.
The incremental benefit of tight glycemic control (as opposed to less intensive therapy) varies across patient groups, however. Years are required for microvascular complications to progress to symptomatic disease. Patients with type 1 diabetes, who are generally younger, are more likely to live long enough to benefit from tight glycemic control than patients with type 2 disease, who face a shorter life expectancy because of their age and risk of cardiovascular disease. For patients with coexistent diseases, the delayed benefits of glycemic control may be offset by the more immediate inconvenience, complications, and costs of intensive treatment and by the health effects of comorbid conditions.
These generalizations do not apply to all patients. The older age of onset of type 2 diabetes is a population average with a wide distribution. Many patients with type 2 diabetes live long enough to experience significant microvascular disease and may benefit from glycemic control. Also, interventions that extend life expectancy (eg, smoking cessation, blood pressure control, and lipid control) give patients more time to encounter advanced microvascular disease and an opportunity to benefit from treatment.
In 1996, the American Academy of Family Physicians convened a panel to conduct a systematic review of the evidence of the benefits and harms of glycemic control in type 2 diabetes and to develop evidence-based recommendations. The 9-member panel included family physicians, general internists, endocrinologists, and a practice guidelines methodologist. Four members were appointed by the American Diabetes Association and the American College of Physicians. We summarize the panel’s findings, which are available in a full report.*
Methods
Systematic Review
The review methods are provided in detail in the full report. The literature search retrieved published evidence on the effects of glycemic control on microvascular and macrovascular complications in type 1 and type 2 diabetes and on adverse effects. RCT evidence for type 1 diabetes was considered relevant in evaluating the effects of glycemic control in type 2 disease. A total of 798 citations met initial inclusion criteria. All articles underwent structured abstraction. We closed the search with the publication of the UKPDS results in September 1998.
In reviewing the evidence, the panel gave greater weight to RCTs than to observational studies and emphasized data on health outcomes perceptible to patients (eg, visual acuity) over those for intermediate or surrogate end points (eg, retinopathy) that precede or are associated with such outcomes. Most trials did not designate health outcomes as primary end points and therefore lacked the statistical power and duration to prove an effect. Panel recommendations were evidence-based and did not reflect expert opinion. Fourteen outside diabetologists and family physicians externally reviewed the full report, and revisions consistent with the evidence were adopted. The American Academy of Family Physicians and the American Diabetes Association endorsed the full report in March 1999 and this policy statement in October 1999.
We focused the review on the benefits of glycemic control in general and not on specific agents (eg, sulfonylureas, metformin). Interventions not associated with glycemic control (eg, laser phototherapy, angiotensin-converting enzyme inhibitors), which also mitigate the effects of microvascular disease, were not examined in our review.
Results
Microvascular Outcomes
Evidence from Observational Studies. Many cross-sectional studies indicate that people with type 2 diabetes who have higher plasma glucose or glycated hemoglobin (eg, hemoglobin A1c) levels are more likely to have evidence of retinopathy, neuropathy, or albuminuria.11 Numerous prospective longitudinal studies also show that an elevated fasting plasma glucose (FPG) concentration or glycated hemoglobin level at baseline or over time increases the chances that type 2 patients will develop new or worsened retinopathy, abnormal electrophysiologic findings, or renal dysfunction.12-15 However, observational data, unlike evidence from RCTs, do not prove that lowering blood glucose levels reduces the incidence of these complications.
Evidence from RCTs. Ten RCTs do provide this evidence.9,10,16-24 Three trials involved patients with type 2 diabetes: the very large UKPDS10 (approximately 4200 patients) and 2 small Japanese studies.16,19 The largest of the 7 trials of patients with type 1 diabetes was the DCCT (1441 patients).9 In most trials, patients were randomly allocated to an intensive treatment group that received multiple or continuous insulin administrations or to a control group that received conventional insulin therapy. Most studies confirmed (through mean glycated hemoglobin levels) better glycemic control with intensive treatment. For example, mean hemoglobin A1c levels in the intensive/conservative treatment groups of the DCCT and UKPDS were 7.2%/9.1% and 7.0%/7.9%, respectively.9,10 An old RCT (University Group Diabetes Program) that did not produce significant differences in glucose control in some treatment arms and lacked statistical power was excluded from our review.25 Average lengths of follow-up in the RCTs ranged from 2 to 12.5 years (6.5 and 10 years, respectively, in the DCCT and UKPDS).
Retinopathy in Patients with Type 2 Diabetes. RCTs provide good evidence that glycemic control reduces the incidence of retinopathy. In a Japanese trial, among patients with no retinopathy at baseline, the 6-year incidence of new disease (progression ≥2 steps on the Early Treatment Diabetic Retinopathy Study [ETDRS] scale26) in the intensive and conservative treatment groups was 6% and 36%, respectively, a relative reduction of 83%. In patients with retinopathy at baseline (secondary prevention group), the incidence rates for intensive and conservative treatment groups were 17% and 44%, respectively.16
The authors of the UKPDS results, who reported a 25% relative reduction in the incidence of microvascular complications (11.4 vs 8.6 events/1000 patient-years), attributed much of the benefit to reduced retinopathy.10 The need for laser therapy was lowered from 11.0 to 7.9 events/1000 patient-years, a 29% relative reduction. Within 6 years, the incidence of a 2-step progression on the ETDRS scale was lowered from 28% to 23%. The relative reduction in cataract extraction was 24%. The incidence of decreased visual acuity, blindness, and vitreous hemorrhage was not lowered significantly. The extent to which the latter resulted from treatment for early complications is not known.
Retinopathy in Patients with Type 1 Diabetes. Among patients with no baseline retinopathy in the DCCT (primary prevention group), the 6.5-year incidence of a sustained 3-step change on the ETDRS scale was reduced by 76% (from 4.7 to 1.2/100 patient-years).9 In the secondary prevention group, the rate of progression was lowered by 54% (from 7.8 to 3.7/100 patient-years). In this group intensive treatment was also associated with a lower incidence of severe retinopathy, need for laser treatment, and sustained progression (worsening for at least 6 months) (adjusted relative risk reduction=47%, 56%, and 65%, respectively).9,27 A Swedish trial reported improved ETDRS scales and a lower prevalence of visual impairment (14% vs 35%) at 7.8 years median follow-up.17
Peripheral Neuropathy in Patients with Type 2 Diabetes. Some trials suggest that lowering blood glucose improves isolated electrophysiologic measures.10,16,19 A Japanese RCT16 reported an increase in median nerve conduction velocity and a reduction in arm vibration threshold, but other physiologic measures were unaffected. Neurologic symptoms were not measured. The UKPDS10 showed no effect on the incidence of absent ankle and knee reflexes, but abnormal biothesiometry data (for toes) occurred less frequently with intensive treatment. Impotence and heart rate responses to deep breathing and standing occurred with equal frequency.
Peripheral Neuropathy in Patients with Type 1 Diabetes. The DCCT showed a 69% reduction (9.8 vs 3.1/100 patient-years) in newly “confirmed clinical neuropathy” (abnormal neurologic history or physical examination combined with abnormal nerve conduction or autonomic nervous system studies), but the incidence of neurologic symptoms was not reported.9,28 Progression of clinical neuropathy in patients with preexisting disease was reduced by 57%, from 16.1 to 7.0/100 patient-years. A Swedish RCT reported a lower incidence of neuropathic symptoms at 10-year follow-up (14% vs 32%) and higher pin-prick sensitivity.29
Nephropathy in Patients with Type 2 Diabetes. Intensive insulin treatment does appear to reduce the incidence of albuminuria.10,16,19 The UKPDS observed a lower incidence of microalbuminuria within 3 years and a lower incidence of gross proteinuria and increased plasma creatinine within 9 years of follow-up (relative risk reduction=17%, 33%, and 60%, respectively).10 The incidence rates of renal failure and death from renal disease did not differ significantly between the groups, but the absolute number of cases was small.
Nephropathy in Patients with Type 1 Diabetes. In the DCCT, among patients who lacked microalbuminuria at baseline, the incidence of new cases over 6.5 years was reduced by 34% (3.4 vs 2.2/100,000 patient-years), but the incidence of sustained microalbuminuria, macroalbuminuria, or abnormal creatinine clearance did not differ. In the secondary prevention group, the incidence of microalbuminuria was reduced from 5.7 to 3.6/100,000 patient-years, a 43% relative reduction, and the incidence of sustained microalbuminuria and of macroalbuminuria was also reduced.9,30
Macrovascular Outcomes
Observational Studies. Some cross-sectional studies in type 2 diabetes report that elevated FPG concentrations or glycated hemoglobin levels are more common in people with coronary artery disease, an abnormal electrocardiogram, or cardiovascular disease.31,32 Longitudinal studies show that patients with elevated blood glucose, glycated hemoglobin, or postprandial glucose levels at baseline are more likely to develop coronary artery disease or an abnormal electrocardiogram or to die of coronary artery or cardiovascular disease.33-36 An association between blood glucose concentration and stroke or peripheral vascular disease (eg, incidence of amputation and foot ulcers) has also been demonstrated in such studies31,33,34,37,38 but less consistently.
Clinical Trials of Patients with Type 2 Diabetes. The UKPDS showed a 16% reduction in the 10-year incidence of myocardial infarction with intensive treatment, a difference of borderline statistical significance (P=.05, 95% confidence interval for relative risk=0.71-1.00).10 Statistically significant differences were noted in certain subgroups.39 Sudden death was less common (relative risk reduction=46%, P=.05), but the incidence of fatal myocardial infarction, heart failure, angina, stroke, amputation, and death from peripheral vascular disease was unchanged.10 The authors noted that the study lacked statistical power to exclude an effect on fatal outcomes.
Another RCT reported no significant effect on cardiovascular events or mortality with intensive treatment, but the mean follow-up period was only 27 months.40 A British trial involving patients with moderate hyperglycemia reported that cardiovascular events occurred less frequently in a group given high-dose tolbutamide and a recommended diet than in the control group, but the patient population, type of diabetes, and outcome measures were defined imprecisely.41
Clinical Trials of Patients with Type 1 Diabetes. The incidence of major cardiovascular and peripheral vascular events in most trials did not differ significantly with intensive treatment,9,42 but the number of cases and length of follow-up were generally too small to detect a difference.
All-Cause Mortality
Observational Studies. Some observational studies report an association between poor glycemic control and all-cause mortality or overall survival rates in type 2 diabetes.35,36,43-45 However, other cohort studies report that death rates are not reliably predicted by FPG concentrations or glycated hemoglobin levels.46,47
Clinical Trials of Patients with Type 2 Diabetes. A Swedish RCT found that patients with diabetes admitted to coronary care units for recent myocardial infarction achieved better glycemic control and experienced significantly lower all-cause mortality (33% vs 44%) if they received intensive insulin therapy (insulin-glucose infusion for the first 24 hours and subcutaneous insulin 4 times daily for 3 months).48 The UKPDS showed no significant effect on either all-cause or diabetes-related mortality but lacked statistical power to exclude an effect.10
Clinical Trials of Patients with Type 1 Diabetes. There are few data on all-cause mortality in patients with type 1 diabetes because of the low event rate.9,42
Discussion
Potential Harms of Intensive Glycemic Control
Specific complications can occur with each of the agents used to treat type 2 diabetes: Insulin has potential adverse effects, and oral glucose-lowering drugs carry some risk of undesirable side effects and uncommon but serious complications (eg, lactic acidosis, hepatotoxicity).
Attempts to achieve euglycemia can increase the risk of hypoglycemia, and some medications are associated with weight gain. The risk of severe hypoglycemia is greatest for patients with type 1 diabetes. In the subjects with type 2 diabetes in the UKPDS, the incidence of major hypoglycemic episodes was higher among the intensively treated than conventionally treated patients, but the rate was low (1% to 2%).10 More typically in type 2 disease, a more substantial risk exists for minor hypoglycemic episodes, which are usually inconsequential.10,40 An association between intensive treatment and weight gain has been reported (mean=3.1 kg and 4.6 kg in the UKPDS and DCCT, respectively),9,10 but there is no evidence that this amount of weight gain affects outcomes.
Intensive treatment requires that patients perform home glucose monitoring; follow diet and physical activity regimens; tolerate minor side effects and the risk of more serious complications from medications; regularly attend physician visits for testing and examinations; and absorb costs not covered by insurance for physicians and medical supplies, lost work (or school), and transportation. In many cases these inconveniences, discomforts, and costs are borne over a number of years, often a lifetime. RCTs have shown no adverse association between these efforts and quality of life,49 however, and one study suggested that glycemic control might improve quality of life and work productivity.50
Modeling Estimates
Mathematical models, largely based on the DCCT, have attempted to estimate the magnitude of benefits and harms from glycemic control. One model estimated that patients with type 2 diabetes who maintained a glycated hemoglobin level of 7.2% would reduce their cumulative lifetime risk of blindness, end-stage renal disease, and lower-extremity amputation by 72% (from 19% to 5%), 87% (from 17% to 2%), and 67% (from 15% to 5%), respectively.51 Life expectancy would increase by 1.39 years. A Markov model estimated that reducing glycated hemoglobin from 9% to 7% in a patient in whom diabetes developed at age 45 years would lower the lifetime risk of blindness from 2.6% to 0.3%.52 The same change in a patient who developed diabetes at age 65 years would decrease the risk of blindness from only 0.5% to <0.1%.
In theory, such projections could help clinicians and patients estimate the benefits and harms of different levels of glycemic control in individual situations. Because the available models differ in design and assumptions, however, they offer discrepant predictions about the same types of patients. For example, the lifetime risk of blindness in a white patient aged 55 years who lowers his glycated hemoglobin level from 9% to 7% would, according to one model, be reduced by 5.6 percentage points (from 9% to 3.4%)51 and, according to another model, by 1.1 percentage points (from 1.2% to 0.1%).52 These discrepancies must be reconciled before reliable outcome estimates can be introduced with confidence in practice.
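The gap between the two projections quoted above is easier to see when the absolute and relative reductions are computed side by side. The following is only an illustrative sketch: the before and after risks are the figures quoted in the text, the relative reductions are derived here rather than reported by either model, and the names `projections` and `summarize` are hypothetical.

```python
# Lifetime risk of blindness (percent) before and after lowering glycated
# hemoglobin from 9% to 7% in a 55-year-old patient, as quoted in the text.
projections = {
    "model in reference 51": (9.0, 3.4),
    "model in reference 52": (1.2, 0.1),
}


def summarize(before, after):
    """Return (absolute reduction in percentage points, relative reduction in percent)."""
    absolute = before - after
    relative = 100 * absolute / before
    return round(absolute, 1), round(relative)


for name, (before, after) in projections.items():
    absolute, relative = summarize(before, after)
    print(f"{name}: {absolute} points absolute, {relative}% relative")
```

Under these inputs the two models disagree about fivefold on the absolute benefit, largely because they start from very different baseline risks, which is the discrepancy the text describes.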
Weighing the Magnitude of Benefit
The evidence demonstrates a continuous and curvilinear relationship between hyperglycemia and the microvascular and neuropathic complications of diabetes, with risk rising progressively as mean blood glucose concentrations increase. RCTs confirm that for both type 1 and type 2 diabetes glycemic control significantly reduces the incidence of microvascular complications. The following points should be considered when applying this evidence to routine practice:
- The intensity of treatment in RCTs may be difficult to replicate to the same degree in community practice. In the DCCT, for example, patients received insulin by injection 3 times daily or by external pump, self-monitored blood glucose at least 4 times per day, underwent weekly nocturnal blood glucose measurements, visited their study center monthly, and received frequent telephone calls. The target glycated hemoglobin value was less than 6.1%.9 More typical treatment practices were followed in the UKPDS. Although some practices and health care systems have successfully achieved satisfactory blood glucose levels through aggressive programs that assist clinicians and patients, other constraints and the inability or reluctance of patients to adhere to treatment protocols remain problems in other settings.53
- The microvascular end points in most RCTs were primarily intermediate (eg, ETDRS scales, nerve conduction velocity, urinary albumin excretion) or surrogate (eg, laser phototherapy) outcomes rather than health outcomes. Few trials were designed to measure health outcomes, providing limited data on the extent to which the symptoms that patients experience (eg, visual impairment, paresthesias, complications of renal failure) are reduced by intensive treatment. Such complaints generally do not occur until the patient has end-stage disease and are often forestalled by early treatment (eg, laser phototherapy). It is reasonable to infer that long-term benefits result from glycemic control (the intermediate end points affected are known risk factors for clinical disease), but one cannot assume that the observed magnitude of risk reduction for intermediate outcomes applies also to symptomatic disease.
- Relative risk reductions are greater than absolute risk reductions. The 25% relative reduction in microvascular complications reported by the UKPDS represents an absolute reduction of only 2%: 8%, rather than 10%, of patients had complications during 10 years of treatment.10 Also, relative risk reductions generally refer to intermediate outcomes. The 76% reduction in the risk of retinopathy reported by the DCCT refers to a 3-step change on the ETDRS scale, not to improved vision.9 The number needed to treat to affect outcomes perceptible to patients is necessarily higher than that for retinopathy, delayed nerve conduction, or elevated urinary albumin excretion, because only a subset of patients with these intermediate outcomes go on to develop symptomatic disease.15 According to the UKPDS, 37 patients would require intensive treatment for 10 years to prevent one patient from undergoing laser treatment; 208 would require treatment to prevent one case of blindness.10 The observed 16% difference in the incidence of blindness in the UKPDS was not statistically significant. On the other hand, when examined at the population level, even modest absolute risk reductions can translate into large numbers of persons in society for whom clinical benefit is achievable. Given the millions of people in the United States with type 2 diabetes, even a 2% absolute reduction in the risk of microvascular complications represents many thousands of people who would benefit from glycemic control.
- Because of the average time required for glycemic control to affect outcomes, some patients with diabetes may not live long enough to benefit, given the competing risks of death from macrovascular complications and other comorbid diseases. Although elevated blood glucose levels are a likely risk factor for cardiovascular disease, glycemic control has not been shown to enhance life expectancy or prevent heart disease. The 16% reduction in myocardial infarction reported by the UKPDS was of borderline statistical significance.10 Another RCT did find that improved glycemic control reduced the incidence of ischemic cardiac events, stroke, and cardiovascular deaths in patients with acute myocardial infarction.48 Two other trials that failed to show a benefit may have lacked adequate duration and sample size.16,40
- For any given patient, the absolute magnitude of risk reduction is a continuous variable that is a function of the patient’s current glycated hemoglobin level, the duration and magnitude of previous hyperglycemia, and the extent of preexisting microvascular complications. The probability that the patient will live long enough to experience the benefits of reduced complications depends on cardiovascular risk factors other than blood glucose (eg, smoking, hypertension, lipid levels, physical inactivity, obesity, preexisting coronary artery disease) and other determinants of life expectancy (eg, age, coexisting diseases, health status). Of these, the most critical variable is the patient’s current glycated hemoglobin level. Because of their increased risk of complications, individuals with marked elevations generally benefit more (in absolute terms) from the same absolute reduction in glycated hemoglobin levels than do individuals with mild to moderate elevations.53 Although it is obviously important for clinicians to keep patients from progressing from mild (eg, hemoglobin A1c levels of 6% to 8%) to marked hyperglycemia (eg, hemoglobin A1c levels >9.5%), in those patients who have already developed marked hyperglycemia, efforts directed at achieving even moderate control (eg, hemoglobin A1c levels of 8% to 9.5%) will yield greater health benefits than pursuing euglycemia in patients with moderate elevations.
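The distinction drawn in the list above between relative and absolute risk reduction, and its link to the number needed to treat, can be made concrete with a few lines of arithmetic. This is a minimal sketch using the rounded figures quoted in the text (10% versus 8% of patients with microvascular complications over 10 years); the function names are hypothetical. Note that the rounded percentages yield a relative reduction of about 20%, so the 25% figure reported by the trial presumably rests on unrounded event rates, which is an assumption here.

```python
import math


def absolute_risk_reduction(control_risk, treated_risk):
    """Difference in event risk between the control and treated groups."""
    return control_risk - treated_risk


def relative_risk_reduction(control_risk, treated_risk):
    """Absolute risk reduction expressed as a fraction of the control risk."""
    return absolute_risk_reduction(control_risk, treated_risk) / control_risk


def number_needed_to_treat(control_risk, treated_risk):
    """Patients who must be treated to prevent one event (rounded up)."""
    arr = absolute_risk_reduction(control_risk, treated_risk)
    if arr <= 0:
        raise ValueError("no risk reduction; NNT is undefined")
    # Round before taking the ceiling to absorb floating-point noise.
    return math.ceil(round(1 / arr, 9))


def implied_absolute_reduction(nnt):
    """Absolute risk reduction (as a fraction) implied by a reported NNT."""
    return 1 / nnt


if __name__ == "__main__":
    control, treated = 0.10, 0.08  # rounded figures quoted in the text
    print("ARR:", round(absolute_risk_reduction(control, treated), 3))
    print("RRR:", round(relative_risk_reduction(control, treated), 3))
    print("NNT:", number_needed_to_treat(control, treated))
    # NNTs quoted in the text for laser treatment and blindness
    for nnt in (37, 208):
        print(nnt, "->", round(100 * implied_absolute_reduction(nnt), 1), "points")
```

Run in reverse, the quoted numbers needed to treat of 37 and 208 correspond to absolute reductions of roughly 2.7 and 0.5 percentage points over 10 years, which illustrates why outcomes perceptible to patients require treating many more people than intermediate outcomes do.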
Recommendations for Clinical Practice
For any patient with type 2 diabetes, the better the glycemic control, the lower the probability of chronic microvascular, neuropathic, and possibly cardiovascular complications. However, because of differences in patients’ life expectancies, comorbidities, and preferences, it is inappropriate to set a uniform target glycated hemoglobin level for all patients. Individuals with long life expectancies and few comorbidities may wish to pursue euglycemia, but less vigorous goals may be appropriate for others, such as patients with multiple comorbid conditions or with limited life expectancies.
Whether the magnitude of benefit of a given treatment goal justifies the potential inconvenience, harms, and costs involves value judgments that must be tailored to the individual patient. Patients’ personal risk profiles and capabilities and the relative importance they assign to the potential outcomes and supporting evidence are integral in determining how intensively to treat.
Cardiovascular disease is the most likely cause of death in patients with type 2 diabetes, and attention to glycemic control should not distract clinicians and patients from other interventions that may be more effective in preventing coronary artery disease and stroke. These include smoking cessation, serum lipid management, control of blood pressure, diet, physical activity, and weight management. Guidelines for the control of these risk factors appear elsewhere.54-56 Clinicians should also pursue treatments other than glycemic control for preventing microvascular complications (eg, blood pressure control, angiotensin-converting enzyme inhibitors for diabetic nephropathy, laser treatment for diabetic retinopathy).
Whatever the desired goals and intensity of treatment, patients face considerable barriers in implementing recommendations. Modifying diet and other personal habits; complying with self-monitoring, medication, and home care; and returning for follow-up visits are difficult. Physicians should work with patients to overcome remediable barriers and should use recommended techniques for patient education and counseling to offer the necessary information and motivation for meaningful change.57
Acknowledgments
The systematic review on which this guideline is based was supported in part by funding from the Health Care Financing Administration. We thank Richard D. Kahn, PhD (American Diabetes Association), and Herbert F. Young, MD, and Bellinda Schoof (American Academy of Family Physicians) for their assistance, as well as the expert panel that externally reviewed the full report: Eugene Barrett, MD; John A. Colwell, MD; Richard C. Eastman, MD; Saul Genuth, MD; Ronald Klein, MD, MPH; Martin Mahoney, MD; James W. Mold, MD; David M. Nathan, MD; Jonathan E. Rodnick, MD; Jeffrey L. Susman, MD; Sandeep Vijan, MD, MS; and Bruce Zimmerman, MD. Their participation in the review process does not necessarily imply endorsement of the report or its recommendations.
1. Harris MI. Diabetes in America: epidemiology and scope of the problem. Diabetes Care 1998;21(suppl 3):C11-14.
2. Klein R, Klein BEK. Vision disorders in diabetes. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;293-338.
3. Nelson RG, Knowler WC, Pettitt DJ, Bennett PH. Kidney diseases in diabetes. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;349-400.
4. Reiber GE, Boyko EJ, Smith DG. Lower extremity foot ulcers and amputations in diabetes. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;409-27.
5. Wilson PW, Cupples LA, Kannel WB. Is hyperglycemia associated with cardiovascular disease? The Framingham Study. Am Heart J 1991;121:586-90.
6. Javitt JC, Chiang YP. Economic impact of diabetes. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;601-11.
7. American Diabetes Association. Economic consequences of diabetes mellitus in the US in 1997. Diabetes Care 1998;21:296-309.
8. Harris MI. Summary. In: Harris MI, Cowie CC, Stern MP, Boyko EJ, Reiber GE, Bennett PH, eds. Diabetes in America. 2nd ed. NIH Publication No. 95-1468. Rockville, Md: National Institutes of Health; 1995;1-13.
9. Diabetes Control and Complications Trial Research Group. The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. N Engl J Med 1993;329:977-86.
10. UKPDS Group. Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33). Lancet 1998;352:837-53.
11. Knuiman MW, Welborn TA, McCann VJ, Stanton KG, Constable IJ. Prevalence of diabetic complications in relation to risk factors. Diabetes 1986;35:1332-9.
12. Klein R, Klein BEK, Moss SE, Davis MD, DeMets DL. Glycosylated hemoglobin predicts the incidence and progression of diabetic retinopathy. JAMA 1988;260:2864-71.
13. Klein R, Klein BEK, Moss SE. Relation of glycemic control to diabetic microvascular complications in diabetes mellitus. Ann Intern Med 1996;124:90-6.
14. Klein R, Klein BEK, Moss SE, Cruickshanks KJ. Ten-year incidence of gross proteinuria in people with diabetes. Diabetes 1995;44:916-23.
15. Humphrey LL, Ballard DJ, Frohnert P, Chu CP, O’Fallon WM, Palumbo PJ. Chronic renal failure in non-insulin-dependent diabetes mellitus: a population-based study in Rochester, Minnesota. Ann Intern Med 1989;111:788-96.
16. Ohkubo Y, Kishikawa H, Araki E, et al. Intensive insulin therapy prevents the progression of diabetic microvascular complications in Japanese patients with non-insulin-dependent diabetes mellitus: a randomized prospective 6-year study. Diabetes Res Clin Pract 1995;28:103-17.
17. Reichard P, Nilsson BY, Rosenqvist U. The effect of long-term intensified insulin treatment on the development of microvascular complications of diabetes mellitus. N Engl J Med 1993;329:304-9.
18. Kroc Collaborative Study Group. Diabetic retinopathy after two years of intensified insulin treatment: follow-up of the Kroc Collaborative Study. JAMA 1988;260:37-41.
19. Kawamori R, Kamada T. Determination of the glycemic threshold for the regression or prevention of diabetic microangiopathies, and the insulin injection regimen to establish strict glycemic control in NIDDM. Jpn J Med 1991;30:618-21.
20. Eschwege E, Job D, Guyot-Argenton C, Aubry JP, Tchobroutsky G. Delayed progression of diabetic retinopathy by divided insulin administration: a further follow-up. Diabetologia 1979;16:13-5.
21. Lauritzen T, Frost-Larsen K, Larsen HW, Deckert T, et al. Two-year experience with continuous subcutaneous insulin infusion in relation to retinopathy and neuropathy. Diabetes 1986;34(Suppl 3):74-9.
22. Feldt-Rasmussen B, Mathiesen ER, Jensen T, Lauritzen T, Deckert T. Effect of improved metabolic control on loss of kidney function in type 1 (insulin-dependent) diabetic patients: an update of the Steno studies. Diabetologia 1991;34:164-70.
23. Dahl-Jørgensen K, Brinchmann-Hansen O, Hanssen KF, et al. Effect of near normoglycaemia for two years on progression of early diabetic retinopathy, nephropathy, and neuropathy: the Oslo study. Br Med J 1986;293:1195-9.
24. Beck-Nielsen H, Olesen T, Mogensen CE, et al. Effect of near normoglycemia for 5 years on progression of early diabetic retinopathy and renal involvement. Diabetes Res 1990;15:185-90.
25. University Group Diabetes Program. Effects of hypoglycemic agents on vascular complications in patients with adult-onset diabetes. VIII. Evaluation of insulin therapy: final report. Diabetes 1982;31(Suppl 5):1-31.
26. Early Treatment Diabetic Retinopathy Study Research Group. Grading diabetic retinopathy from stereoscopic color fundus photographs-an extension of the modified Airlie House classification: ETDRS report number 10. Ophthalmology 1991;98:786-806.
27. Diabetes Control and Complications Trial Research Group. The relationship of glycemic exposure to the risk of development and progression of retinopathy in the Diabetes Control and Complications Trial. Diabetes 1995;44:968-83.
28. Diabetes Control and Complications Trial Research Group. The effect of intensive diabetes therapy on the development and progression of neuropathy. Ann Intern Med 1995;122:561-8.
29. Reichard P, Pihl M, Rosenqvist U, Sule J. Complications in IDDM are caused by elevated blood glucose level: the Stockholm Diabetes Intervention Study (SDIS) at 10-year follow-up. Diabetologia 1996;39:1483-8.
30. Diabetes Control and Complications Trial Research Group. Effect of intensive therapy on the development and progression of diabetic nephropathy in the Diabetes Control and Complications Trial. Kidney Int 1995;47:1703-20.
31. Welborn TA, Knuiman M, McCann V, Stanton K, Constable IJ. Clinical macrovascular disease in Caucasoid diabetic subjects: logistic regression analysis of risk variables. Diabetologia 1984;27:568-73.
32. Hillson RM, Hockaday TDR, Mann JI, Newton DJ. Hyperinsulinaemia is associated with development of electrocardiographic abnormalities in diabetics. Diabetes Res 1984;1:143-9.
33. Fu CC, Chang CJ, Tseng CH, et al. Development of macrovascular diseases in NIDDM patients in northern Taiwan. Diabetes Care 1993;16:137-43.
34. Fuller JH, Shipley MJ, Rose G, Jarrett RJ, Keen H. Mortality from coronary heart disease and stroke in relation to degree of glycaemia: the Whitehall study. Lancet 1983;287:867-70.
35. Moss SE, Klein R, Klein BEK, Meuer SM. The association of glycemia and cause-specific mortality in a diabetic population. Arch Intern Med 1994;154:2473-9.
36. Andersson DKG, Svärdsudd K. Long-term glycemic control relates to mortality in type II diabetes. Diabetes Care 1995;18:1534-42.
37. Kuusisto J, Mykkänen L, Pyörälä K, Laakso M. Non-insulin-dependent diabetes and its metabolic control are important predictors of stroke in elderly subjects. Stroke 1994;25:1157-64.
38. Lee JS, Lu M, Lee VS, Russell D, Bahr C, Lee ET. Lower-extremity amputation: incidence, risk factors, and mortality in the Oklahoma Indian Diabetes Study. Diabetes 1993;42:876-82.
39. UKPDS Group. Effect of intensive blood-glucose control with metformin on complications in overweight patients with type 2 diabetes (UKPDS 34). Lancet 1998;352:854-65.
40. Abraira C, Colwell J, Nuttall F, et al. Cardiovascular events and correlates in the Veterans Affairs Diabetes Feasibility Trial: Veterans Affairs Cooperative Study on Glycemic Control and Complications in Type II Diabetes. Arch Intern Med 1997;157:181-8.
41. Keen H, Chlouverakis C, Jarrett RJ, Boyns DR. The effect of treatment of moderate hyperglycaemia on the incidence of arterial disease. Postgrad Med J 1968;Suppl:960-5.
42. Reichard P, Pihl M. Mortality and treatment side-effects during long-term intensified conventional insulin treatment in the Stockholm Diabetes Intervention Study. Diabetes 1994;43:313-7.
43. Muggeo M, Verlato G, Bonora E, et al. Long-term instability of fasting plasma glucose predicts mortality in elderly NIDDM patients: the Verona Diabetes Study. Diabetologia 1995;38:672-9.
44. Sasaki A, Uehara M, Horiuchi N, Hasegawa K, Shimizu T. A 15-year follow-up study of patients with non-insulin-dependent diabetes mellitus (NIDDM) in Osaka, Japan. Factors predictive of the prognosis of diabetic patients. Diab Res Clin Pract 1997;36:41-7.
45. Gall MA, Borch-Johnsen K, Hougaard P, Nielsen FS, Parving HH. Albuminuria and poor glycemic control predict mortality in NIDDM. Diabetes 1995;44:1303-9.
46. Hadden DR, Blair ALT, Wilson EA, et al. Natural history of diabetes presenting at 40-69 years: a prospective study of the influence of intensive dietary therapy. Q J Med 1986;230:579-98.
47. Davis WK, Hess GE, Hiss RG. Psychological correlates of survival in diabetes. Diabetes Care 1988;11:538-45.
48. Malmberg K. Diabetes Mellitus, Insulin Glucose Infusion in Acute Myocardial Infarction (DIGAMI) Study Group. Prospective randomised study of intensive insulin treatment on long-term survival after acute myocardial infarction in patients with diabetes mellitus. Br Med J 1997;314:1512-5.
49. Diabetes Control and Complications Trial Research Group. Influence of intensive diabetes treatment on quality of life outcomes in the Diabetes Control and Complications Trial. Diabetes Care 1996;19:195-203.
50. Testa MA, Simonson DC. Health economic benefits and quality of life during improved glycemic control in patients with type 2 diabetes mellitus: a randomized, controlled, double-blind trial. JAMA 1998;280:1490-6.
51. Eastman RC, Javitt JC, Herman WH, et al. Model of complications of NIDDM. II. Analysis of the health benefits and cost-effectiveness of treating NIDDM with the goal of normoglycemia. Diabetes Care 1997;20:735-44.
52. Vijan S, Hofer TP, Hayward RA. Estimated benefits of glycemic control in microvascular complications in type 2 diabetes. Ann Intern Med 1997;127:788-95.
53. Hayward RA, Manning WG, Kaplan SH, Wagner EH, Greenfield S. Starting insulin therapy in patients with type 2 diabetes: effectiveness, complications, and resource utilization. JAMA 1997;278:1663-9.
54. Fiore MC, Bailey WC, Cohen SJ, et al. Smoking cessation: clinical practice guideline no. 18. Rockville, Md: US Department of Health and Human Services, Agency for Health Care Policy and Research; 1996. AHCPR Publication No. 96-0692.
55. Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults. Summary of the second report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel II). JAMA 1993;269:3015-23.
56. Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. The Sixth Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. Arch Intern Med 1997;157:2413-46.
57. Roter DL, Hall JA, Merisca R, Nordstrom B, Cretin D, Svarstad B. Effectiveness of interventions to improve patient compliance: a meta-analysis. Med Care 1998;36:1138-61.
Effects of Influenza Vaccination of Health Care Workers on Mortality of Elderly People in Long-Term Care: A Randomized Controlled Trial
CLINICAL QUESTION: Does vaccination of health care providers working in long-term care facilities lower mortality and rates of influenza infection in patients?
BACKGROUND: The Centers for Disease Control and Prevention (CDC) recommend influenza vaccination of all patients in long-term care facilities and of health care workers employed there. Several studies have demonstrated the effectiveness of vaccinating elderly patients, and other studies have shown decreased infection rates in vaccinated health care workers.1,2 The effectiveness of vaccinating health care workers for preventing the spread of infection from worker to patient is not as well documented. The authors of this study evaluated whether vaccinating the health care workers at long-term care facilities reduced the nosocomial infection rate and the mortality of the patients in the facilities.
POPULATION STUDIED: A total of 1217 health care workers from 20 long-term care geriatric facilities in Scotland and the 1437 patients for whom they cared during a 6-month period participated in the study. Patients’ age, sex, and degree of disability based on a modified Barthel index were recorded.
STUDY DESIGN AND VALIDITY: Long-term care facilities were matched according to the number of beds and the vaccination policy. Employees randomly selected from half of these facilities were offered an influenza vaccination. Approximately half of the health care workers who were offered a vaccination received it, compared with less than 5% of workers in the control group. A random sample of 50% of the patients at each facility underwent prospective influenza monitoring by nasal and throat swab. Because patient demographics were not well defined, it is difficult to determine if the patients and long-term care facilities in the study are similar to those in other countries. Vaccinations are not routine for the elderly population of the United Kingdom. Consequently, vaccinating a transmission source such as health care workers could be more beneficial in the United Kingdom than in the United States. Also, all-cause mortality rates were very high (13.6%-22.4%) during the 6 months of the study, denoting a higher-risk population than that encountered in many other facilities.
OUTCOMES MEASURED: The outcomes measured included the mortality rate of patients during the winter months and the number of confirmed cases of influenza A and B.
RESULTS: Influenza rates were similar (5.4% vs 6.7%). Overall, the vaccination program was associated with lower mortality (13.6% vs 22.4%, P=.014) among residents. This benefit remained even after adjusting for the higher vaccination rate of residents in the facilities in which the health care workers were not vaccinated. However, after accounting for differences in age, sex, vaccination rate, and disability between the 2 groups, the reduction in the adjusted mortality rate was not statistically significant (adjusted odds ratio=0.6; 95% confidence interval, 0.36-1.04; P=.09).
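As a rough arithmetic check on the figures above, the crude (unadjusted) effect measures can be derived from the two mortality percentages alone. The Python sketch below is illustrative and is not part of the trial's own analysis. The function and variable names are hypothetical, and the calculation ignores both the clustering of patients within facilities and the covariate adjustment that the authors applied.

```python
# Crude effect measures from the two mortality percentages quoted in RESULTS.
# Assumption: simple proportions only; no adjustment for clustering or covariates.

vaccinated_risk = 0.136  # mortality where staff were offered vaccination
control_risk = 0.224     # mortality where staff were not offered vaccination


def effect_measures(treated, control):
    """Return crude effect measures for two event proportions."""
    arr = control - treated                      # absolute risk reduction
    rr = treated / control                       # relative risk
    odds_ratio = (treated / (1 - treated)) / (control / (1 - control))
    return {
        "arr": arr,
        "rr": rr,
        "rrr": 1 - rr,                           # relative risk reduction
        "or": odds_ratio,
        "nnt": 1 / arr,                          # crude number needed to treat
    }


def ci_excludes_null(lower, upper, null_value=1.0):
    """True if a confidence interval for a ratio excludes the null value."""
    return not (lower <= null_value <= upper)


measures = effect_measures(vaccinated_risk, control_risk)

# Adjusted odds ratio interval reported in the text: 0.36 to 1.04.
adjusted_significant = ci_excludes_null(0.36, 1.04)

for name, value in measures.items():
    print(f"{name}: {value:.3f}")
print("adjusted result significant:", adjusted_significant)
```

On these inputs the crude absolute risk reduction is about 8.8 percentage points and the crude odds ratio is about 0.55, close to but not the same as the adjusted value of 0.6. The adjusted interval includes 1, which is why the adjusted reduction is reported as not statistically significant.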
Vaccination of health care providers working in geriatric inpatient facilities was associated with decreased mortality among residents, despite similar rates of influenza infection. However, after adjusting for the baseline characteristics of the patients (age, sex, vaccination rate, and disability), this benefit was no longer statistically significant. Practitioners should continue to strive to meet CDC guidelines for vaccination of elderly adults and health care workers, but this study provides only a small impetus to do so.
Changing Physician Practice Behavior
Growing evidence that certain interventions can significantly lower morbidity and mortality (eg, mammography, immunizations, β-blockers for acute myocardial infarction) has focused attention on the challenges of implementation: translating this evidence into practice. Clinicians do not always offer recommended services to patients, and patients do not always readily accept them. Approximately 50% of smokers report that their physician has never advised them to quit.1 One out of 4 patients with an acute myocardial infarction is discharged without a prescription for a β-blocker.2 Only 40% of patients with atrial fibrillation receive warfarin.3 Conversely, tests that are known to be ineffective, such as routine chest radiographs, urinalyses, and preoperative blood work, are ordered routinely.4
Tools for behavior change
Various programs have been developed to bridge the gap between what should be practiced and what is actually done, but few have been uniformly successful. Passive education, such as conferences or the publication of clinical practice guidelines, has been shown consistently to be ineffective.5 More active strategies to implement guidelines, such as educational outreach, feedback, reminder systems, and continuous quality improvement, offer greater promise and have captured the interest of physicians, health systems, hospitals, managed care plans, and quality improvement organizations.6 To date, however, research on whether these methods produce meaningful change in practice patterns or patient outcomes has yielded mixed results.
One such study appears in this issue of the Journal. McBride and colleagues7 compared 4 strategies for improving preventive cardiology services at 45 Midwestern primary care practices. The control practices attended an educational conference and received a kit of materials. The other 3 groups attended a similar conference but also received a practice consultation, an on-site prevention coordinator, or both. The study used surrogate measures (provider behaviors rather than health outcomes such as lipid levels and blood pressure) to gauge effectiveness, and the results were positive. Patient history questionnaires, problem lists, and flow sheets were used more often by the combined intervention group than by the conference-only control group. Other behaviors, such as documentation of risk factor screening and management in the medical record, improved across all intervention groups. The authors apparently did not examine whether patients in the intervention groups had improved outcomes, such as better control of risk factors or a lower incidence of heart disease. With only 10 or 11 practices per group, the statistical power and duration of follow-up needed to make such comparisons were probably lacking.
The need for discretion in quality improvement
How should we apply these results? When there is evidence that a particular strategy improves outcomes—in this case, practice consultations and on-site coordinators—should physicians immediately adopt that approach in their own practices? Although McBride and colleagues found full participation in the project, it is doubtful that practices nationwide would have the necessary resources. The practice consultation included 3 meetings and 2 follow-up visits, and the on-site coordinator devoted 4.5 hours per week per physician. Moreover, this is only one of many studies showcasing a promising success in quality improvement. No clinician could adopt the full range of strategies that have been advocated by researchers and health systems. Even if that were possible for one disease, it would be impossible for all of the conditions encountered in primary care for which quality improvement is needed, such as preventive care, heart disease, diabetes, asthma, and depression.
Practices face trade-offs when considering quality improvement.8 Although there are exceptions (improved systems of care for one disease can have spillover benefits for other conditions), quality initiatives in one area tend to draw time, resources, and motivation away from others. Before reconfiguring practice operations, the astute clinician must judge not only whether available resources can support the effort but also whether the strategy offers the best use of those resources. In the case of the study by McBride and colleagues, physicians might ask whether the proven benefit, improved chart records, justifies a change when data on patient outcomes are lacking. Even if improved health outcomes are likely, they should judge whether applying the same effort to another aspect of care, perhaps for another disease, would help patients even more.
In weighing these choices, physicians should not rely on the results of a single study. It is best to step back and examine the evidence as a whole, reviewing the results of multiple studies of the same strategy. For example, a Cochrane group analyzed 18 systematic reviews of various methods for disseminating and implementing evidence in practice. Although some interventions were consistently effective (eg, educational outreach, reminders, multifaceted interventions, interactive education), others were rarely or never effective (eg, educational materials, didactic teaching) or inconsistently effective (eg, audit, feedback, local opinion leaders, local adaptation, patient-mediated interventions).9 Similarly, a recent review of 58 studies of strategies for improving preventive care found that most interventions were effective in some studies but not others.10
An analytic framework for behavior change
It makes sense that a particular strategy would not succeed in all cases. The reason that clinicians do not adopt new behaviors, or abandon old ones, is often specific to the disease or procedure in question, local practice conditions, and the personal barriers that each physician faces.11 A common solution would not be expected to work. Most physicians undergo stages of change in adopting new behaviors:
- They must have knowledge (information). They must know about the new data or new practice guidelines that advocate a change in practice behavior. Keeping abreast of this knowledge, with its exponential expansion, is medicine’s great challenge. However, as so many studies have shown, information by itself is not enough.12
- Knowledge must foster a change in attitudes. Clinicians must accept the validity of the evidence and its applicability to their practices and their patients. There must be “buy in” for new practice guidelines and acceptance that the recommendations represent good medical care; have been embraced by peers, local consultants, opinion leaders, or one’s specialty; and are acceptable to patients.
- Even if physicians know about and accept the behavior, they must have the ability to implement it. Enthusiasm by itself is insufficient if there is a lack of time, resources, staff, training, or equipment. Physicians must have access to eligible patients, and those patients must be able and willing to do their part. (Like clinicians, patients may not comply because of barriers to knowledge, attitudes, ability, and reinforcement.) Finally, constraints imposed by office or clinic operations, practice leadership, information systems, regulations, and insurance coverage can impede change.
- Like all people, physicians need reinforcement to maintain behaviors. It is human nature to forget, overlook, or lose interest over time. That 36% of physicians do not notify patients of abnormal test results is not because they doubt the importance of that type of communication.13 The most committed physician needs reminder systems to remember when to implement guidelines, tracking systems to identify patients who need follow-up, and encouragement from practice leaders, systems of care, and patients that their efforts are appreciated.
Putting the framework to use
This 4-part framework helps to organize the menu of implementation tools that are available to physicians. Some tools focus on providing knowledge, such as conferences, journal articles, practice guidelines, the Cochrane database,14 and information mastery programs to help clinicians access useful data.15 Some focus on attitudes, such as local adaptation of guidelines,16 academic detailing,17 endorsements by opinion leaders and specialty societies,18 and feedback from colleagues and patients.19 Some address ability, such as scheduling and staff changes,20 revised delivery systems,21 skill building, teamwork,22 information technology,23 comprehensive disease24 or total quality25 management, and community support. Some provide reinforcement, such as computerized or manual reminder systems, flow sheets, standing orders, provider incentives, and feedback reports.26
Knowing that, in general, the 4 steps occur in sequence helps clarify why so many methods of changing behavior appear successful in some settings but not in others. An intervention that delivers information is not helpful if clinicians already know the facts but lack ability. If a family practice does poorly in administering polio vaccine, the problem is less likely to be solved by circulating a photocopy of the immunization schedule of the Advisory Committee on Immunization Practices (the physicians already know the guidelines and the importance of vaccination) than by implementing a tracking and reminder system to flag eligible patients, the most effective way to boost immunization rates.27 Conversely, reinforcement tools, such as reminder systems and standing orders, are unlikely to succeed if clinicians are at an earlier stage of change (eg, they are unaware of or question the data). Adding a space for exercise counseling on flow sheets or preventive care forms accomplishes little if physicians are fundamentally resistant because they doubt this type of counseling helps patients. A knowledge intervention is precisely what is needed when physicians withhold warfarin for atrial fibrillation because of the mistaken belief that bleeding complications outweigh benefits.28
Citing an earlier description of this model,29 Cabana and colleagues30 expanded its structure to better organize published evidence on barriers to physician adherence to practice guidelines. In their model, barriers to knowledge include lack of awareness of and familiarity with guidelines. Barriers to attitudes include lack of agreement with guidelines, lack of outcome expectancy, lack of self-efficacy, and lack of motivation. Barriers to behavior include factors related to patients (eg, patient expectations), the practice environment (eg, lack of time, resources), and the guidelines themselves (eg, conflicting recommendations). Of the 120 studies on barriers covered in their review, 58% examined only one type of barrier.
A diagnostic approach
Too many advocates of quality improvement champion their method as the only way to improve care. Hospitals, practices, and health systems often seize on a particular approach for improving quality, perhaps because it is easier to organize programs around a single theme. But there are no “magic bullets.”31 Seasoned clinicians know this; they understand that proper treatment begins with a good diagnosis. The first step is to “find the lesion,” to determine precisely why the guideline is not followed. Knowing whether the barriers involve knowledge, attitude, ability, or reinforcement is the starting point for designing a targeted solution. The alternative is quality improvement by reflex. No physician can improve everything at once, and this is especially true for primary care physicians because of the spectrum of diseases for which they care. Their special need to set priorities makes a rational diagnostic approach to quality improvement essential.
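The diagnostic approach described above can be pictured as a simple lookup: identify the earliest stage at which a barrier exists, then choose tools aimed at that stage. The Python sketch below is only an illustration of that reasoning, not a validated instrument. The names `Stage`, `TOOLS`, `first_barrier`, and `suggest_tools` are hypothetical, the tool lists are abbreviated from the menu given earlier in this article, and the assumption that stages are assessed as simple yes/no values is a simplification.

```python
from enum import IntEnum
from typing import Dict, List, Optional


class Stage(IntEnum):
    """The 4 stages of change, in the order in which they generally occur."""
    KNOWLEDGE = 1
    ATTITUDES = 2
    ABILITY = 3
    REINFORCEMENT = 4


# Abbreviated examples of tools for each stage, taken from the text.
TOOLS: Dict[Stage, List[str]] = {
    Stage.KNOWLEDGE: ["conferences", "journal articles", "practice guidelines"],
    Stage.ATTITUDES: ["local adaptation of guidelines", "academic detailing",
                      "endorsements by opinion leaders"],
    Stage.ABILITY: ["scheduling and staff changes", "revised delivery systems",
                    "information technology"],
    Stage.REINFORCEMENT: ["computerized or manual reminder systems",
                          "flow sheets", "standing orders"],
}


def first_barrier(stages_met: Dict[Stage, bool]) -> Optional[Stage]:
    """Return the earliest stage that is not yet met, or None if all are met."""
    for stage in Stage:  # IntEnum members iterate in definition order
        if not stages_met.get(stage, False):
            return stage
    return None


def suggest_tools(stages_met: Dict[Stage, bool]) -> List[str]:
    """Suggest tools that target the earliest unmet stage."""
    stage = first_barrier(stages_met)
    return [] if stage is None else TOOLS[stage]


# Polio vaccine example: guidelines are known and accepted; ability is
# assumed to be adequate (an assumption), so the gap is reinforcement.
polio = {Stage.KNOWLEDGE: True, Stage.ATTITUDES: True,
         Stage.ABILITY: True, Stage.REINFORCEMENT: False}

# Warfarin example: the barrier is a mistaken belief about the evidence.
warfarin = {Stage.KNOWLEDGE: False}

print(first_barrier(polio).name, suggest_tools(polio))
print(first_barrier(warfarin).name, suggest_tools(warfarin))
```

The sketch returns only the earliest unmet stage because the stages generally occur in sequence. In practice a clinician may face barriers at several stages at once, and judging which ones apply is the diagnostic work that no lookup table can replace.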
Seeking outcomes that matter
The study by McBride and colleagues reminds us that the utility of outcomes research often depends on the outcome measures. An effect on surrogate or intermediate end points, such as better use of medical records, does not prove beneficial for patients unless data suggest that a change in such measures improves health.32 Smoking illustrates the ideal surrogate measure because of the strong evidence linking it to disease. Too many researchers rely on less-validated surrogate measures, either because more distal health outcomes are hard to quantify or because statistical power concerns demand too large or lengthy a study. Using surrogates is easier, but it has yielded a profusion of outcome studies that fail to tell us whether patients benefit in ways that matter. It would be better to do fewer studies and conserve the resources for a definitive investigation that gives patient-centered outcomes the attention they deserve.
1. Frank E, Winkleby MA, Altman DG, Rockhill B, Fortmann SP. Predictors of physician’s smoking cessation advice. JAMA 1991;266:3139-44.
2. Krumholz HM, Radford MJ, Wang Y, et al. National use and effectiveness of β-blockers for the treatment of elderly patients after acute myocardial infarction: National Cooperative Cardiovascular Project. JAMA 1998;280:623-9.
3. Stafford RS, Singer DE. Recent national patterns of warfarin use in atrial fibrillation. Circulation 1998;97:1231-3.
4. Allison JG, Bromley HR. Unnecessary preoperative investigations: evaluation and cost analysis. Am Surg 1996;62:686-9.
5. Davis DA, Thomson MA, Oxman AD, Haynes RB. Changing physician performance: a systematic review of the effect of continuing medical education strategies. JAMA 1995;274:700-5.
6. Chassin MR, Galvin RW, and the National Roundtable on Health Care Quality. The urgent need to improve health care quality: Institute of Medicine National Roundtable on Health Care Quality. JAMA 1998;280:1000-5.
7. McBride P, Underbakke G, Plane MB, et al. Improving practice prevention systems in primary care: the Health Education and Research Trial (HEART). J Fam Pract 2000;49:115-25.
8. Casalino LP. The unintended consequences of measuring quality on the quality of medical care. N Engl J Med 1999;341:1147-50.
9. Bero LA, Grilli R, Grimshaw JM, et al. Closing the gap between research and practice: an overview of systematic reviews of interventions to promote the implementation of research findings. BMJ 1998;317:465-8.
10. MEJL, Wensing M, Grol RPTM, van der Weijden T, van Weel C. Interventions to improve the delivery of preventive services in primary care. Am J Public Health 1999;89:737-46.
11. R. Beliefs and evidence in changing clinical practice. BMJ 1997;315:418-21.
12. RM, Cebul RD, Wigton RS. You can lead a horse to water: improving physicians’ knowledge of probabilities may not affect their decisions. Med Decis Making 1995;15:65-75.
13. EA, Ward RE, Uman JE, McCarthy BD. Patient notification and follow-up of abnormal test results: a physician survey. Arch Intern Med 1996;156:327-31.
14. L, Rennie D. The Cochrane Collaboration: preparing, maintaining, and disseminating systematic reviews of the effects of health care. JAMA 1995;274:1935-8.
15. DC, Shaughnessy AF. Teaching information mastery: creating informed consumers of medical information. J Am Board Fam Pract 1999;12:444-9.
16. JB, Shye D, McFarland B. The paradox of guideline implementation: how AHCPR’s depression guideline was adapted at Kaiser Permanente Northwest Region. J Qual Improv 1995;21:5-21.
17. SB, Avorn J. Principles of educational outreach (‘academic detailing’) to improve clinical decision making. JAMA 1990;263:549-56.
18. SB, McLaughlin TJ, Gurwitz JH, et al. Effect of local medical opinion leaders on quality of care for acute myocardial infarction: a randomized controlled trial. JAMA 1998;279:1358-63.
19. BS, Tonesk X, Jacobson PD. Implementing clinical practice guidelines: social influence strategies and behavior change. Qual Rev Bull 1992;18:413-22.
20. AJ, Woodruff CB, Carney PA. Changing office routines to enhance preventive care: the preventive GAPS approach. Arch Fam Med 1994;3:176-83.
21. DM. A primer on leading the improvement of systems. BMJ 1996;312:619-22.
22. LL, Gemson DH, Carney P. Office system intervention supporting primary care-based health behavior change counseling. Am J Prev Med 1999;17:299-308.
23. RE, McKay G, Boles SM, Vogt TM. Interactive computer technology, behavioral science, and family practice. J Fam Pract 1999;48:464-70.
24. Ellrodt G, Cook DJ, Lee J, Cho M, Hunt D, Weingarten S. Evidence-based disease management. JAMA 1997;278:1687-92.
25. LI, Kottke TE, Brekke ML. Will primary care clinics organize themselves to improve the delivery of preventive services? A randomized controlled trial. Prev Med 1998;27:623-31.
26. Corporation Interventions that increase the utilization of Medicare-funded preventive services for persons age 65 and older. Pub. No. HCFA-02151. Baltimore, Md: Health Care Financing Administration; 1999.
27. Task Force on Community Preventive Services. Vaccine-preventable diseases: improving vaccination coverage in children, adolescents, and adults. A report on recommendations from the Task Force on Community Preventive Services. MMWR 1999;48(RR-8):1-15.
28. J, Gurwitz JH, Rochon PA, Avorn J. Physician attitudes concerning warfarin for stroke prevention in atrial fibrillation: results of a survey of long-term care practitioners. J Am Geriatr Soc 1997;45:1060-5.
29. SH. Practice guidelines: a new reality in medicine. III: impact on patient care. Arch Intern Med 1993;153:2646-55.
30. Cabana MD, Rand CS, Powe NR, et al. Why don’t physicians follow clinical practice guidelines? A framework for improvement. JAMA 1999;282:1458-65.
31. Oxman AD, Thomson MA, Davis DA, Haynes RB. No magic bullets: a systematic review of 102 trials of interventions to improve professional practice. Can Med Assoc J 1995;153:1423-31.
32. RA. Using outcomes to improve quality of research and quality of care. J Am Board Fam Pract 1998;11:465-72.
Growing evidence that certain interventions can significantly lower morbidity and mortality (eg, mammography, immunizations, b-blockers for acute myocardial infarction) has focused attention on the challenges of implementation: translating this evidence into practice. Clinicians do not always offer recommended services to patients, and patients do not always readily accept them. Approximately 50% of smokers report that their physician has never advised them to quit.1 One out of 4 patients with an acute myocardial infarction is discharged without a prescription for a b-blocker.2 Only 40% of patients with atrial fibrillation receive warfarin.3 Conversely, tests that are known to be ineffective, such as routine chest radiographs, urinalyses, and preoperative blood work, are ordered routinely.4
Tools for behavior change
Various programs have been developed to bridge the gap between what should be practiced and what is actually done, but few have been uniformly successful. Passive education, such as conferences or the publication of clinical practice guidelines, has been shown consistently to be ineffective.5 More active strategies to implement guidelines, such as educational outreach, feedback, reminder systems, and continuous quality improvement, offer greater promise and have captured the interest of physicians, health systems, hospitals, managed care plans, and quality improvement organizations.6 To date, however, research on whether these methods produce meaningful change in practice patterns or patient outcomes has yielded mixed results.
One such study appears in this issue of the Journal. McBride and colleagues7 compared 4 strategies for improving preventive cardiology services at 45 Midwestern primary care practices. The control practices attended an educational conference and received a kit of materials. The other 3 groups attended a similar conference but also received a practice consultation, an on-site prevention coordinator, or both. The study used surrogate measures—provider behaviors rather than health outcomes such as lipid levels and blood pressure—to gauge effectiveness, and the results were positive. Patient history questionnaires, problem lists, and flow sheets were used more often by the combined intervention group than by the conference-only control group. Other behaviors, such as documentation of risk factor screening and management in the medical record, improved across all intervention groups. The authors apparently did not examine whether patients in the intervention groups had improved outcomes, such as better control of risk factors or a lower incidence of heart disease. With only 10 or 11 practices per group, the statistical power and duration of follow-up to make such comparisons was probably lacking.
The need for discretion in quality improvement
How should we apply these results? When there is evidence that a particular strategy improves outcomes—in this case, practice consultations and on-site coordinators—should physicians immediately adopt that approach in their own practices? Although McBride and colleagues found full participation in the project, it is doubtful that practices nationwide would have the necessary resources. The practice consultation included 3 meetings and 2 follow-up visits, and the on-site coordinator devoted 4.5 hours per week per physician. Moreover, this is only one of many studies showcasing a promising success in quality improvement. No clinician could adopt the full range of strategies that have been advocated by researchers and health systems. Even if that were possible for one disease, it would be impossible for all of the conditions encountered in primary care for which quality improvement is needed, such as preventive care, heart disease, diabetes, asthma, and depression.
Practices face trade-offs when considering quality improvement.8 Although there are exceptions (improved systems of care for one disease can have spillover benefits for other conditions), quality initiatives in one area tend to draw time, resources, and motivation away from others. Before reconfiguring practice operations, the astute clinician must judge not only whether available resources can support the effort, but also whether the strategy offers the best use of those resources. In the case of the study by McBride and colleagues, physicians might ask whether the proven benefit (improved chart records) justifies a change when data on patient outcomes are lacking. Even if improved health outcomes are likely, they should judge whether applying the same effort to another aspect of care, perhaps for another disease, would help patients even more.
In weighing these choices, physicians should not rely on the results of a single study. It is best to step back and examine the evidence as a whole, reviewing the results of multiple studies of the same strategy. For example, a Cochrane group analyzed 18 systematic reviews of various methods for disseminating and implementing evidence in practice. Although some interventions were consistently effective (eg, educational outreach, reminders, multifaceted interventions, interactive education), others were rarely or never effective (eg, educational materials, didactic teaching) or inconsistently effective (eg, audit, feedback, local opinion leaders, local adaptation, patient-mediated interventions).9 Similarly, a recent review of 58 studies of strategies for improving preventive care found that most interventions were effective in some studies but not others.10
An analytic framework for behavior change
It makes sense that a particular strategy would not succeed in all cases. The reason that clinicians do not adopt new behaviors, or abandon old ones, is often specific to the disease or procedure in question, local practice conditions, and the personal barriers that each physician faces.11 A common solution would not be expected to work. Most physicians undergo stages of change in adopting new behaviors:
- They must have knowledge (information). They must know about the new data or new practice guidelines that advocate a change in practice behavior. Keeping abreast of this knowledge, with its exponential expansion, is medicine’s great challenge. However, as so many studies have shown, information by itself is not enough.12
- Knowledge must foster a change in attitudes. Clinicians must accept the validity of the evidence and its applicability to their practices and their patients. There must be “buy in” for new practice guidelines and acceptance that the recommendations represent good medical care; have been embraced by peers, local consultants, opinion leaders, or one’s specialty; and are acceptable to patients.
- Even if physicians know about and accept the behavior, they must have the ability to implement it. Enthusiasm by itself is insufficient if there is a lack of time, resources, staff, training, or equipment. Physicians must have access to eligible patients, and those patients must be able and willing to do their part. (Like clinicians, patients may not comply because of barriers to knowledge, attitudes, ability, and reinforcement.) Finally, constraints imposed by office or clinic operations, practice leadership, information systems, regulations, and insurance coverage can impede change.
- Like all people, physicians need reinforcement to maintain behaviors. It is human nature to forget, overlook, or lose interest over time. That 36% of physicians do not notify patients of abnormal test results is not because they doubt the importance of that type of communication.13 The most committed physician needs reminder systems to remember when to implement guidelines, tracking systems to identify patients who need follow-up, and encouragement from practice leaders, systems of care, and patients that their efforts are appreciated.
Putting the framework to use
This 4-part framework helps to organize the menu of implementation tools that are available to physicians. Some tools focus on providing knowledge, such as conferences, journal articles, practice guidelines, the Cochrane database,14 and information mastery programs to help clinicians access useful data.15 Some focus on attitudes, such as local adaptation of guidelines,16 academic detailing,17 endorsements by opinion leaders and specialty societies,18 and feedback from colleagues and patients.19 Some address ability, such as scheduling and staff changes,20 revised delivery systems,21 skill building, teamwork,22 information technology,23 comprehensive disease24 or total quality25 management, and community support. Some provide reinforcement, such as computerized or manual reminder systems, flow sheets, standing orders, provider incentives, and feedback reports.26
Knowing that, in general, the 4 steps occur in sequence helps clarify why so many methods of changing behavior appear successful in some settings but not in others. An intervention that delivers information is not helpful if clinicians already know the facts but lack ability. If a family practice does poorly in administering polio vaccine, the problem is less likely to be solved by circulating a photocopy of the Advisory Committee on Immunization Practices’ immunization schedule (the physicians already know the guidelines and the importance of vaccination) than by implementing a tracking and reminder system to flag eligible patients, the most effective way to boost immunization rates.27 Conversely, reinforcement tools, such as reminder systems and standing orders, are unlikely to succeed if clinicians are at an earlier stage of change (eg, they are unaware of or question the data). Adding a space for exercise counseling on flow sheets or preventive care forms accomplishes little if physicians are fundamentally resistant because they doubt this type of counseling helps patients. A knowledge intervention is precisely what is needed when physicians withhold warfarin for atrial fibrillation because of the mistaken belief that bleeding complications outweigh benefits.28
Citing an earlier description of this model,29 Cabana and colleagues30 expanded its structure to better organize published evidence on barriers to physician adherence to practice guidelines. In their model, barriers to knowledge include lack of awareness and familiarity with guidelines. Barriers to attitudes include lack of agreement with guidelines, lack of outcome expectancy, lack of self-efficacy, and lack of motivation. Barriers to behavior include factors related to patients (eg, patient expectations), the practice environment (eg, lack of time, resources), and the guidelines themselves (eg, conflicting recommendations). Of the 120 studies on barriers covered in their review, 58% examined only one type of barrier.
A diagnostic approach
Too many advocates of quality improvement champion their method as the only way to improve care. Hospitals, practices, and health systems often seize on a particular approach for improving quality, perhaps because it is easier to organize programs around a single theme. But there are no “magic bullets.”31 Seasoned clinicians know this; they understand that proper treatment begins with a good diagnosis. The first step is to “find the lesion,” to determine precisely why the guideline is not followed. Knowing whether the barriers involve knowledge, attitude, ability, or reinforcement is the starting point for designing a targeted solution. The alternative is quality improvement by reflex. No physician can improve everything at once, and this is especially true for primary care physicians because of the spectrum of diseases for which they care. Their special need to set priorities makes a rational diagnostic approach to quality improvement essential.
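The diagnostic logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (first_barrier, suggest_tools) and the example practice are hypothetical, the tool lists are an abbreviated selection from the menu given earlier, and the strict stage ordering is the simplification "in general, the 4 steps occur in sequence," not a validated decision rule.

```python
from typing import Dict, List, Optional

# The 4 stages, in the order the framework describes them.
STAGES: List[str] = ["knowledge", "attitudes", "ability", "reinforcement"]

# Abbreviated examples of tools per stage, drawn from the text (not exhaustive).
TOOLS: Dict[str, List[str]] = {
    "knowledge": ["conferences", "journal articles", "practice guidelines"],
    "attitudes": ["local adaptation of guidelines", "academic detailing"],
    "ability": ["scheduling and staff changes", "information technology"],
    "reinforcement": ["reminder systems", "flow sheets", "standing orders"],
}


def first_barrier(stage_met: Dict[str, bool]) -> Optional[str]:
    """Return the earliest stage that is not yet satisfied, or None if all are met."""
    for stage in STAGES:
        if not stage_met.get(stage, False):
            return stage
    return None


def suggest_tools(stage_met: Dict[str, bool]) -> List[str]:
    """Suggest tools that target the earliest unmet stage (hypothetical helper)."""
    barrier = first_barrier(stage_met)
    return [] if barrier is None else TOOLS[barrier]


# Hypothetical practice: physicians know the exercise counseling guideline
# but doubt that counseling helps patients, so the barrier is attitudes.
exercise_counseling = {
    "knowledge": True,
    "attitudes": False,
    "ability": True,
    "reinforcement": False,
}
print(first_barrier(exercise_counseling))
print(suggest_tools(exercise_counseling))
```

In this sketch, adding a flow sheet (a reinforcement tool) would not be suggested for the exercise counseling example, because the earlier attitudes barrier is addressed first.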
Seeking outcomes that matter
The study by McBride and colleagues reminds us that the utility of outcomes research often depends on the outcome measures. An effect on surrogate or intermediate end points, such as better use of medical records, does not establish a benefit for patients unless data suggest that a change in such measures improves health.32 Smoking illustrates the ideal surrogate measure because of the strong evidence linking it to disease. Too many researchers rely on less-validated surrogate measures, either because more distal health outcomes are hard to quantify or because adequate statistical power would demand too large or too lengthy a study. Using surrogates is easier, but it has yielded a profusion of outcome studies that fail to tell us whether patients benefit in ways that matter. It would be better to do fewer studies and conserve the resources for a definitive investigation that gives patient-centered outcomes the attention they deserve.
Growing evidence that certain interventions can significantly lower morbidity and mortality (eg, mammography, immunizations, β-blockers for acute myocardial infarction) has focused attention on the challenges of implementation: translating this evidence into practice. Clinicians do not always offer recommended services to patients, and patients do not always readily accept them. Approximately 50% of smokers report that their physician has never advised them to quit.1 One out of 4 patients with an acute myocardial infarction is discharged without a prescription for a β-blocker.2 Only 40% of patients with atrial fibrillation receive warfarin.3 Conversely, tests that are known to be ineffective, such as routine chest radiographs, urinalyses, and preoperative blood work, are ordered routinely.4
Tools for behavior change
Various programs have been developed to bridge the gap between what should be practiced and what is actually done, but few have been uniformly successful. Passive education, such as conferences or the publication of clinical practice guidelines, has been shown consistently to be ineffective.5 More active strategies to implement guidelines, such as educational outreach, feedback, reminder systems, and continuous quality improvement, offer greater promise and have captured the interest of physicians, health systems, hospitals, managed care plans, and quality improvement organizations.6 To date, however, research on whether these methods produce meaningful change in practice patterns or patient outcomes has yielded mixed results.
1. Frank E, Winkleby MA, Altman DG, Rockhill B, Fortmann SP. Predictors of physician’s smoking cessation advice. JAMA 1991;266:3139-44.
2. Krumholz HM, Radford MJ, Wang Y, et al. National use and effectiveness of β-blockers for the treatment of elderly patients after acute myocardial infarction: National Cooperative Cardiovascular Project. JAMA 1998;280:623-9.
3. Stafford RS, Singer DE. Recent national patterns of warfarin use in atrial fibrillation. Circulation 1998;97:1231-3.
4. Allison JG, Bromley HR. Unnecessary preoperative investigations: evaluation and cost analysis. Am Surg 1996;62:686-9.
5. Davis DA, Thomson MA, Oxman AD, Haynes RB. Changing physician performance: a systematic review of the effect of continuing medical education strategies. JAMA 1995;274:700-5.
6. Chassin MR, Galvin RW, and the National Roundtable on Health Care Quality. The urgent need to improve health care quality. Institute of Medicine National Roundtable on Health Care Quality. JAMA 1998;280:1000-5.
7. McBride P, Underbakke G, Plane MB, et al. Improving practice prevention systems in primary care: the Health Education and Research Trial (HEART). J Fam Pract 2000;49:115-125.
8. Casalino LP. The unintended consequences of measuring quality on the quality of medical care. N Engl J Med 1999;341:1147-50.
9. Bero LA, Grilli R, Grimshaw JM, et al. Closing the gap between research and practice: an overview of systematic reviews of interventions to promote the implementation of research findings. BMJ 1998;317:465-8.
10. MEJL, Wensing M, Grol RPTM, van der Weijden T, van Weel C. Interventions to improve the delivery of preventive services in primary care. Am J Public Health 1999;89:737-46.
11. R. Beliefs and evidence in changing clinical practice. BMJ 1997;315:418-21.
12. RM, Cebul RD, Wigton RS. You can lead a horse to water: improving physicians’ knowledge of probabilities may not affect their decisions. Med Decis Making 1995;15:65-75.
13. EA, Ward RE, Uman JE, McCarthy BD. Patient notification and follow-up of abnormal test results: a physician survey. Arch Intern Med 1996;156:327-31.
14. L, Rennie D. The Cochrane Collaboration: preparing, maintaining, and disseminating systematic reviews of the effects of health care. JAMA 1995;274:1935-8.
15. DC, Shaughnessy AF. Teaching information mastery: creating informed consumers of medical information. J Am Board Fam Pract 1999;12:444-9.
16. JB, Shye D, McFarland B. The paradox of guideline implementation: how AHCPR’s depression guideline was adapted at Kaiser Permanente Northwest Region. J Qual Improv 1995;21:5-21.
17. SB, Avorn J. Principles of educational outreach (‘academic detailing’) to improve clinical decision making. JAMA 1990;263:549-56.
18. SB, McLaughlin TJ, Gurwitz JH, et al. Effect of local medical opinion leaders on quality of care for acute myocardial infarction: a randomized controlled trial. JAMA 1998;279:1358-63.
19. BS, Tonesk X, Jacobson PD. Implementing clinical practice guidelines: social influence strategies and behavior change. Qual Rev Bull 1992;18:413-22.
20. AJ, Woodruff CB, Carney PA. Changing office routines to enhance preventive care: the preventive GAPS approach. Arch Fam Med 1994;3:176-83.
21. DM. A primer on leading the improvement of systems. BMJ 1996;312:619-22.
22. LL, Gemson DH, Carney P. Office system intervention supporting primary care-based health behavior change counseling. Am J Prev Med 1999;17:299-308.
23. RE, McKay G, Boles SM, Vogt TM. Interactive computer technology, behavioral science, and family practice. J Fam Pract 1999;48:464-70.
24. Ellrodt G, Cook DJ, Lee J, Cho M, Hunt D, Weingarten S. Evidence-based disease management. JAMA 1997;278:1687-92.
25. LI, Kottke TE, Brekke ML. Will primary care clinics organize themselves to improve the delivery of preventive services? A randomized controlled trial. Prev Med 1998;27:623-31.
26. Corporation Interventions that increase the utilization of Medicare-funded preventive services for persons age 65 and older. Pub. No. HCFA-02151. Baltimore, Md: Health Care Financing Administration; 1999.
27. Task Force on Community Preventive Services. Vaccine-preventable diseases: improving vaccination coverage in children, adolescents, and adults. A report on recommendations from the Task Force on Community Preventive Services. MMWR 1999;48:(RR-8)1-15.
28. J, Gurwitz JH, Rochon PA, Avorn J. Physician attitudes concerning warfarin for stroke prevention in atrial fibrillation: results of a survey of long-term care practitioners. J Am Geriatr Soc 1997;45:1060-5.
29. SH. Practice guidelines: a new reality in medicine. III: impact on patient care. Arch Intern Med 1993;153:2646-55.
30. Cabana MD, Rand CS, Powe NR, et al. Why don’t physicians follow clinical practice guidelines? A framework for improvement. JAMA 1999;282:1458-65.
31. AD, Thomson MA, Davis DA, Haynes RB. No magic bullets: a systematic review of 102 trials of interventions to improve professional practice. Can Med Assoc J 1995;153:1423-31.
32. RA. Using outcomes to improve quality of research and quality of care. J Am Board Fam Pract 1998;11:465-72.