The HEDIS (Health Plan Employer Data and Information Set) quality measurements are now widely used by managed care organizations, as pointed out by Mainous and Talbert in this issue.1 These quality measurements will be even more widely used as an increasing number of states require HEDIS measurements from managed care organizations participating in Medicaid. The federal Health Care Financing Administration (HCFA) is even contemplating requiring HEDIS measurements in fee-for-service Medicare organizations.
Previous versions of HEDIS did not contain outcomes measures. The newest version of HEDIS for Medicare includes, at HCFA's insistence, HEDIS' first true outcomes measure, the Health of Seniors measure. (Previous measures such as immunization and screening rates are process measures.) The HCFA's insistence on using this outcomes measure was probably encouraged by a study that showed that elderly patients in health maintenance organization settings had worse outcomes than elderly patients in fee-for-service plans.2 The Health of Seniors measure will give HCFA a way to monitor and compare health care among different managed care organizations.
The Health of Seniors measure is an attempt to measure actual health outcomes. From the patient's and purchaser's perspectives, measuring the actual effects (outcomes) of health care is more important than measuring processes of health care, so the Health of Seniors measure is an advance. But the Health of Seniors measure, virtuous as it may be as a true outcomes measure, is not without limitations.
The Health of Seniors measure is based on the Medical Outcomes Study 36-Item Short Form Health Survey (SF-36), a survey that will be administered by an independent vendor to plan enrollees 65 years and older. The SF-36 produces 2 summary measures. One summary measure is physical health, including the person's perception of his or her own physical function, bodily pain, and general health. The other summary measure is mental health, including the person's perception of his or her own vitality, social functioning, and emotions.
The survey will be repeated on the same population after 2 years. It is expected that most older people will measure about the same in physical and mental health or will have gone down in score a little. The change in score is the basis for the measure. Three rates will be produced: the proportion of people in the plan whose scores improve more than expected ("better"); the proportion of people in the plan whose score change is not larger than expected ("same"); and the proportion of people in the plan whose scores drop more than expected ("worse").
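The three rates described above can be illustrated with a minimal sketch. It is not the official HCFA calculation: the function names, the expected 2-year change of -2 points, and the 5-point tolerance are all hypothetical values chosen for illustration, and no risk adjustment is applied.

```python
from collections import Counter


def classify_change(baseline, follow_up, expected_change, tolerance):
    """Label one enrollee's 2-year score change relative to the expected change.

    expected_change and tolerance are illustrative assumptions, not
    published Health of Seniors parameters.
    """
    deviation = (follow_up - baseline) - expected_change
    if deviation > tolerance:
        return "better"
    if deviation < -tolerance:
        return "worse"
    return "same"


def plan_rates(enrollees, expected_change=-2.0, tolerance=5.0):
    """Return the proportion of enrollees labeled better, same, and worse.

    enrollees is a list of (baseline_score, follow_up_score) pairs.
    """
    if not enrollees:
        raise ValueError("at least one enrollee is required")
    labels = [
        classify_change(baseline, follow_up, expected_change, tolerance)
        for baseline, follow_up in enrollees
    ]
    counts = Counter(labels)
    return {label: counts[label] / len(labels) for label in ("better", "same", "worse")}


# Hypothetical summary scores for four enrollees: (baseline, 2 years later).
example_enrollees = [(50, 49), (45, 52), (40, 30), (55, 53)]
rates = plan_rates(example_enrollees)
```

With these made-up scores, one enrollee improves more than expected, one declines more than expected, and two change about as expected, so the plan's rates are 25% "better," 50% "same," and 25% "worse."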
One issue with the Health of Seniors measure is attribution. How can such global concepts as a person's perception that they have "accomplished less than [they] would like" (one of the SF-36 questions) be attributed to medical care? Doesn't a large part of the responsibility for health status fall on the individual? We have all seen patients who, notwithstanding our best advice and treatment, simply do not take care of themselves. However, the measure is based on change in status, not actual health status. Presumably good care will stabilize even the most recalcitrant lifestyle abuser, or such unfortunates will be equally distributed among plans, so no plan is unjustly criticized. There is also concern that the SF-36, being a general measure, will not reflect important changes attributable to specific diseases. In spite of these valid concerns, though, the SF-36 has been used to evaluate treatment benefits, and has been used to demonstrate differential outcomes for a variety of treatment alternatives.3
Another issue is risk adjustment. Patients with heart failure, for example, are more likely to have a worse status after 2 years than patients without heart failure. Calculations attempt to adjust for clinical characteristics by using a weighting scheme, but a criticism may be that the weighting scheme has not been adequately validated and may not include all important variables. If the weighting scheme does not work, plans that care for the most ill people will have a higher proportion in the "worse" category after 2 years, in spite of good treatment. This could create an incentive for some plans to avoid sicker patients.
Response rate bias is also a concern. Managed care plans whose vendors produce lower survey response rates may have better ratings because healthy people tend to respond, while the more ill people do not respond to surveys.4 Population health status will seem to fall as response rate increases. Because healthy people tend to have the "same" or "better" health after 2 years, a low response rate improves apparent plan performance. This effect has not been adequately quantified, so comparing plans with different response rates can be misleading. The HCFA's goal is to ensure that all plans have similarly high response rates.
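The direction of this bias can be shown with simple arithmetic. The sketch below uses entirely hypothetical numbers (a plan of 700 healthier and 300 sicker enrollees, with assumed probabilities of scoring "same" or "better") and a hypothetical function name; it illustrates the mechanism only and does not estimate the real size of the effect, which the text notes has not been adequately quantified.

```python
def observed_same_or_better(n_healthy, n_ill, response_healthy, response_ill,
                            p_good_healthy, p_good_ill):
    """Proportion of survey respondents rated "same" or "better".

    All inputs are illustrative assumptions: group sizes, per-group
    response rates, and per-group probabilities of a "same" or "better"
    result after 2 years.
    """
    respondents_healthy = n_healthy * response_healthy
    respondents_ill = n_ill * response_ill
    good = respondents_healthy * p_good_healthy + respondents_ill * p_good_ill
    return good / (respondents_healthy + respondents_ill)


# Everyone responds: the observed rate equals the plan's true rate.
full_response = observed_same_or_better(700, 300, 1.0, 1.0, 0.9, 0.5)

# Sicker enrollees respond half as often as healthier ones.
low_response = observed_same_or_better(700, 300, 0.8, 0.4, 0.9, 0.5)
```

Under these assumptions the plan's true "same or better" rate is 78%, but the survey with uneven response reports about 83%, even though the care delivered is identical.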
Another issue is gaming. When plans learn what diagnoses are associated with an expectation for "worse" health status after 2 years, unscrupulous plans may attempt to inflate the apparent number of patients with those diagnoses. This is akin to hospitals labeling patients with the worst possible diagnosis to increase diagnosis-related group reimbursement. Gaming can be controlled by audit, but so far there are no plans for auditing the Health of Seniors measure.
Finally, there is concern about the SF-36 itself. It has not been extensively used in elderly populations. It is not known, for example, how valid it is when its questions are answered by a spouse or caregiver acting for the patient, as might be the case for very old and frail people.
Because it is an outcomes measure, the Health of Seniors measure is a dramatic step forward in evaluating health care for older Americans. But it should be recognized as having potential weaknesses. The HCFA should keep an open mind about other potential outcomes measures, and should encourage research into all the issues surrounding the Health of Seniors measure.