Comparative Effectiveness Topics from a Large, Integrated Delivery System

Kim N Danforth, ScD; Carrie D Patnode, PhD; Tanya J Kapka, MD; Melissa G Butler, PharmD, PhD; Bernadette Collins, PhD; Amy Compton-Phillips, MD; Raymond J Baxter, PhD; Jed Weissberg, MD, FACP; Elizabeth A McGlynn, PhD; Evelyn P Whitlock, MD

Perm J 2013 Summer; 17(3):80-86


Objective: To identify high-priority comparative effectiveness questions directly relevant to care delivery in a large, US integrated health care system.
Methods: In 2010, a total of 792 clinical and operational leaders in Kaiser Permanente were sent an electronic survey requesting nominations of comparative effectiveness research questions; most recipients (83%) had direct clinical roles. Nominated questions were divided into 18 surveys of related topics, each of which included 9 to 23 questions for prioritization. The next year, 648 recipients were electronically sent 1 of the 18 surveys to prioritize nominated questions. Surveys were assigned to recipients on the basis of their nominations or specialty. High-priority questions were identified by comparing how often a question was selected with an "expected" frequency, calculated to account for the varying number of questions and respondents across prioritization surveys. High-priority questions were those selected more frequently than expected.
Results: More than 320 research questions were nominated by 181 individuals. Questions most frequently addressed cardiovascular and peripheral vascular disease; obesity, diabetes, endocrinology, and metabolic disorders; or service delivery and systems-level questions. Ninety-five high-priority research questions were identified, spanning topics from prevention and screening to treatment and quality of life. Many were complex questions from a systems perspective regarding how to deliver the best care.
Conclusions: The 95 questions identified and prioritized by leaders on the front lines of health care delivery may inform the national discussion regarding comparative effectiveness research. Additionally, our experience provides insight into engaging real-world stakeholders in setting a health care research agenda.


Comparative effectiveness research has been proposed as a way to address the health care questions that are most relevant to patients, clinicians, and policymakers. Comparative effectiveness research is commonly defined as research designed to inform health care decision making through comparing the effectiveness, benefits, and harms of alternative strategies to diagnose, treat, or manage a clinical condition.1 Currently, limited information exists regarding the effectiveness, benefits, and harms associated with many clinical practices. Furthermore, available research may not address the questions most relevant to practicing clinicians because studies may have included nonrepresentative patient groups in nonrepresentative settings (eg, academic medical centers), or have made comparisons to a placebo or untreated group. Comparative effectiveness research, in contrast, compares different strategies for preventing, diagnosing, treating, or managing a clinical condition in real-world settings with respect to their effectiveness, benefits, or harms. This type of research further seeks to determine what works best for whom, recognizing potential treatment response heterogeneity among populations. Thus, as the number of treatment and prevention options increases, and as appreciation of potential differences among individuals and populations grows, comparative effectiveness research has emerged as one way to improve the quality, efficiency, and value in health care.2

For comparative effectiveness research to reach its potential in improving and transforming health care, efforts will need to focus on the questions of greatest relevance to patients, clinicians, administrators, and policymakers. Integrated health care delivery organizations are well situated to identify important research questions whose answers could improve the everyday delivery of health care.3 These settings include large groups of nonresearch and research clinicians who care for patients in a population-based model of care, health system administrators who manage the health care systems in which these patients are seen, and the members or patients themselves. Given the breadth of questions that comparative effectiveness research can address, identifying and prioritizing questions with the greatest clinical significance is essential and should include the perspective of practicing leaders and clinicians.

In 2009, Congress directed the Institute of Medicine (IOM) to identify national priorities for comparative effectiveness research to inform funding decisions by government agencies awarding grants under the American Recovery and Reinvestment Act.4 When putting together its list of the top 100 questions in comparative effectiveness, the IOM obtained input from a diverse group of stakeholders via interactive, mailed, and online mechanisms.1 Nominated topics were reviewed and prioritized by the IOM committee, with additional topics added by the committee to diversify the portfolio.1,5 Many questions on the IOM's list involved issues concerning health care delivery systems, racial and ethnic disparities, functional limitations and disabilities, and cardiovascular and peripheral vascular disease.

Several other efforts have engaged practicing clinicians or patients in identifying and prioritizing health care research questions, but most did not publish the actual questions prioritized. Instead, these efforts focused on describing the methods used to generate and prioritize research questions.6 As a result, it is unclear how well the questions that have been published or otherwise made widely available reflect the views of those on the front lines of health care delivery, who are key stakeholders and anticipated consumers of comparative effectiveness research.

In its report to Congress, the IOM recommended a "continuous evaluation of research topic priorities."1 We conducted a survey of clinical and operational leaders within Kaiser Permanente (KP) to obtain their input on the comparative effectiveness research questions of particular importance to them. KP serves approximately 9 million patients across the country and has been cited as one example of a large, preventive health care delivery system in national health care discussions. Thus, questions of high priority to KP leaders on the front lines of care delivery and health care decision making may be relevant to others. Additionally, it has been advocated that the questions generated by these types of surveys be published so that they are available to others.6

The aim of this article is to report the high-priority comparative effectiveness research questions identified and prioritized by practicing clinical and operational leaders in a large, diverse, integrated delivery system—along with the process used to engage them—to inform the national discussions on comparative effectiveness research.


Methods

Study Setting

The KP Center for Effectiveness and Safety Research was established to promote and facilitate interregional research on effectiveness and safety involving the 8 KP Regions: Colorado, Georgia, Hawaii, Mid-Atlantic States (District of Columbia, Maryland, Virginia), Northern California, Northwest (Oregon and Washington), Ohio, and Southern California. KP is an integrated health care organization that provides comprehensive services to its members, including preventive, primary care, specialty, emergency, and hospital services. More than 15,000 physicians are employed by KP, and together the 8 Regions serve about 9 million members with diverse geographic, racial/ethnic, and socioeconomic characteristics. The work presented here was conducted as part of KP operational activities and was determined not to be research by the institutional review board.

Surveys of Clinical and Operational Leaders

To elicit comparative effectiveness research questions and subsequently prioritize them, we sent 2 surveys approximately 10 months apart to approximately 800 clinical and operational leaders in KP who were identified through input from national and regional executive leadership. Figure 1 displays the flow of the identification and prioritization process.

Nomination of Questions

In fall 2010, we e-mailed a link to the KP Survey of Critical Topics in Comparative Effectiveness to 792 clinical and operational leaders, asking them to nominate up to 5 comparative effectiveness research questions within their areas of expertise. Recipients were invited to provide the specific comparative effectiveness research question and any relevant background information, including specific populations, interventions, outcomes, and comparators of interest. Survey recipients were identified in multiple ways, including using existing distribution lists of clinical leaders in and across KP Regions for specific clinical specialties (eg, breast cancer, urology, cardiovascular disease, behavioral health), as well as lists of those involved in developing national KP clinical practice guidelines, working in quality improvement, or working in technology and products. Most survey recipients had direct clinical roles (83%); the remainder held nonclinical roles, such as executive leaders, experts in medical technology, and those in fields such as laboratory medicine.

Each nominated question was reviewed by the research team, and questions that were clearly not comparative effectiveness were excluded. For example, questions focused on establishing disease registries or clinical guidelines without mention of a specific comparative effectiveness research question were excluded. In making this determination, we used a broad definition guided by the IOM's definition of comparative effectiveness research.1 Questions with multiple but distinct parts were separated (eg, if one part focused on prevention and another on treatment of a disease, they were separated into two questions). Likewise, nearly identical questions posed by different nominators were combined into a single research question.

Each question was reviewed by 2 team members to classify the question according to its content area from a listing of 43 possible codes. We assigned up to 3 clinical and 4 cross-cutting nonclinical (eg, pharmacology, service delivery) categories to each question. Clinical categories were adapted from the list used by the IOM7 and were modified after pilot testing. Cross-cutting themes largely reflected overarching interests of the delivery organization, concerns among health care reformers, and clinical issues that did not fit into the more focused clinical conditions. Because the research team was particularly interested in questions related to cost, cost-effectiveness, and resource allocation, any question that contained this domain, either specifically in the question or in the background information provided by the nominator, was coded in this category. Additionally, a "main" classification was selected for each question from one of the clinical or cross-cutting classifications, favoring clinical areas unless the question primarily focused on a cross-cutting issue. Differences in classifications were resolved through informal discussion or team meetings. A final review of all questions was done by 1 team member (TJK) to ensure consistency of coding decisions across questions.

Prioritization of Questions

In the second phase of the project, we took the comparative effectiveness research questions generated by the KP comparative effectiveness research survey and further engaged KP stakeholders to prioritize among questions in broad clinical and systems-level categories. After omitting the research questions that were not clearly comparative effectiveness research and combining and splitting the questions as appropriate, a total of 288 questions remained. To facilitate prioritization, we divided the 288 questions into 18 groups of related topics (eg, obesity and diabetes). We believed that splitting the questions into smaller, more manageable lists of related topics would facilitate prioritization better than asking respondents to prioritize across the full list of 288 questions. On the basis of this process, 18 electronic prioritization surveys were developed, each including 9 to 23 nominated research questions (Table 1). The cardiovascular disease questions (n = 48) were separated into 3 surveys to make them more manageable for prioritization. In contrast, certain content areas received few nominations, and we elected to create more heterogeneous prioritization surveys containing these questions.

Because 10 months elapsed between nomination and prioritization, we updated the respondent list with input from national and regional KP leadership, including adding researchers with relevant expertise to the survey recipients. All of the original nominators and a random sample of the remaining group of original recipients were included in the updated list. The resulting 648 individuals were assigned to receive a particular prioritization survey based on their specialty area or whether they had nominated a question on that survey. Generally, recipients were sent only 1 prioritization survey, but there were a few exceptions (eg, someone nominated multiple questions that ended up on different prioritization surveys).

The number of individuals sent a particular prioritization survey ranged from 26 to 46.

Within each survey, we asked participants to choose the 5 research questions that they believed should have the highest priority for comparative effectiveness research in their set of grouped topics (without ranking them). Because the number of nominated questions and the number of survey respondents varied substantially across surveys, we sought a metric that would standardize the meaning of "highly prioritized" across content areas. Therefore, to identify highly prioritized questions, we calculated an "expected" number of times a question would be selected or "voted for" if all the questions were judged to be of equal importance and each respondent selected 5 questions: the number of survey respondents multiplied by 5 potential votes, divided by the total number of questions on the survey. For example, the pediatrics survey had 19 survey questions and 16 respondents, so the expected number of votes per question, if all questions were judged to be of equal importance, was (16 × 5)/19 = 4.2. We then compared the number of times a question was actually selected (or "voted for") with the expected number of votes to identify high-priority topics.
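The expected-vote calculation described above can be illustrated with a short sketch. This is only an illustration of the arithmetic, not the tooling used in the project; the function name and parameter names are hypothetical.

```python
def expected_votes(num_respondents: int, num_questions: int,
                   votes_per_respondent: int = 5) -> float:
    """Expected votes per question if every question were judged equally
    important and every respondent cast all of their votes.

    Hypothetical helper for illustration; names are assumptions.
    """
    if num_questions <= 0:
        raise ValueError("num_questions must be positive")
    return num_respondents * votes_per_respondent / num_questions


# Pediatrics survey from the text: 16 respondents, 19 questions.
pediatrics_expected = expected_votes(num_respondents=16, num_questions=19)
print(round(pediatrics_expected, 1))  # 4.2
```

Dividing by the number of questions is what allows surveys with different numbers of questions and respondents to be compared on the same scale.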

The a priori goal, as stated at the outset by the leaders initiating the survey, was to generate a list of approximately 100 highly prioritized questions, which equated to approximately one-third of the questions from each of the 18 surveys. We compared the observed number of votes with the expected number for each survey and categorized the questions into 3 tiers on the basis of the ratio of observed-to-expected votes. Questions receiving more than the expected number of votes were classified as high priority because this cut-point yielded about 33% of nominated questions. The first tier consisted of questions receiving 2 or more times as many votes as expected; the second tier, those receiving at least 1.5 but fewer than 2 times the number of expected votes; and the third, those receiving more than the expected number but fewer than 1.5 times the expected number of votes.
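As a rough sketch of the tiering rule, the observed-to-expected ratio can be mapped to a tier as follows. The function name is hypothetical, the vote counts in the example are invented for illustration (they are not survey results), and the handling of ratios exactly equal to 1.5 or 2 is an assumption about how the boundaries were applied.

```python
from typing import Optional


def priority_tier(observed_votes: int, expected: float) -> Optional[int]:
    """Return tier 1, 2, or 3 for a high-priority question, else None.

    Assumed boundaries: tier 1 if ratio >= 2, tier 2 if 1.5 <= ratio < 2,
    tier 3 if 1 < ratio < 1.5; a ratio of 1 or less is not high priority.
    """
    ratio = observed_votes / expected
    if ratio >= 2.0:
        return 1
    if ratio >= 1.5:
        return 2
    if ratio > 1.0:
        return 3
    return None


# Expected votes for the pediatrics survey in the text: (16 x 5) / 19.
expected = 16 * 5 / 19

# Hypothetical observed vote counts, for illustration only.
for votes in (9, 7, 5, 4):
    print(votes, priority_tier(votes, expected))
```

With these made-up counts, 9 votes falls in the first tier, 7 in the second, 5 in the third, and 4 is not high priority.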




Results

Nomination of Questions

Of the 792 individuals invited to nominate topics, 181 responded with at least 1 topic nomination (23% response rate). Most individuals (56%) nominated 1 topic, but 21% nominated 2 topics, and 23% nominated 3 or more. The nominators represented 50 distinct clinical specialties or areas of health system leadership.

A total of 326 research questions were received; 16 were dropped from prioritization because they were not comparative effectiveness research questions. After separating out distinct questions in a multiple-question nomination or combining nearly identical topics into a single question, there were 288 research questions for prioritization (Table 1). Questions on cardiovascular and peripheral vascular disease were the most frequent (n = 48 questions), followed by health systems (n = 23); chronic diseases and chronic disease management (n = 22); obesity, diabetes, endocrinology, and metabolic disorders (n = 21); and surgery, procedures, anesthesia, and imaging (n = 21). When question content was examined using the "main" topic classification according to our team's rating, independent of prioritization survey, similar results were observed. However, prevention, health promotion, and screening (n = 19) questions also were identified as a common focus of questions.

Prioritization: 95 High-Priority Questions

The overall prioritization survey response rate was 31%, ranging from 11% for the geriatrics survey to 53% for the oncology and hematology survey. Ninety-five questions were identified as high-priority questions on the basis of a comparison of the observed-vs-expected number of votes (Tables 2 to 4). There were 12 questions in the top tier, 46 in the second tier, and 37 in the third tier.

Of the 12 research questions in the top priority tier, 9 were questions from a systems perspective about the way in which care is delivered. For instance, questions focused on the comparative effectiveness of face-to-face vs remote management of patients (including different types of remote management); care provided in specialty clinics vs primary care; and the use of different staffing models (eg, linking nurses to a specific physician or emphasizing the role of nurses in care provision). In contrast, the other 3 top-tier questions focused on what specific care was best in particular clinical instances: 2 questions focused on treatment and management of prostate cancer, and 1 on treatment of autism. The 2 prostate cancer questions both addressed localized prostate cancer and were related but not identical. One question proposed comparing a wide range of management and treatment methods, whereas the other focused on comparing radical retropubic with robotic prostatectomy. The high-priority research questions in the second- and third-priority tiers represented a mix of broad systems-level and specific clinical questions (eg, comparisons of 2 drugs for a particular clinical condition).

The most common clinical categories among the high-priority questions were cardiovascular and peripheral vascular disease (19%); obesity, diabetes, endocrinology, and metabolic disorders (14%); and oncology and hematology (14%). Frequent cross-cutting, nonclinical areas were service delivery and systems-level issues (40%); pharmacology/pharmacy (34%); and prevention, health promotion, and screening (22%). Health information technology, which tended to include questions related to the electronic medical record, was also mentioned somewhat frequently (14%). Additionally, issues related to cost or cost-effectiveness were coded as occurring in most (62%) of the 95 high-priority research question nominations.


Discussion

The nominated and high-priority questions identified in this study ranged from prevention and screening to treatment and quality of life, reflecting the broad spectrum of issues encountered by practicing clinicians and administrators in a large health system. Questions addressed common health conditions facing our nation, including cardiovascular disease, obesity, and cancer, as well as topics related to health disparities, such as health literacy. Many of the high-priority topics raised complex, systems-level questions about how to deliver the best care.

Half of the 12 top-priority topics identified by our survey were the same or largely similar to questions on the IOM list. Overall, results from the 95 high-priority questions identified in our survey echoed some common themes from the IOM report, including health systems, chronic disease management, behavioral health integration into primary care, optimal cardiovascular disease management strategies, and concerns about better management of patients with chronic pain. In fact, despite favoring clinical areas in our determination of the "main" focus of a question, service delivery and systems-level questions were still the second most common main topic area.

The IOM prioritized system-level questions highly as well, with topics about health care delivery systems being the most common primary or secondary topic among its top 100 comparative effectiveness research questions.1 However, the systems-level questions identified by KP leaders tended to be somewhat broader than those raised in the IOM's report. For instance, the high-priority questions in KP raised questions about staffing models (eg, primary care vs specialty care) or how care is delivered (eg, remote medicine vs in-person visit). In contrast, questions in the top quartile of the IOM's list focused more on comparisons of specific strategies for particular conditions. These comparisons encompassed wide-ranging options (eg, primary prevention vs clinical interventions), but tended to focus more on the content of care rather than broad approaches to delivering care.

These differences between the IOM and KP lists may reflect differences in who was involved in nominating and prioritizing questions. Nearly all our respondents had direct patient care roles and practiced in an integrated delivery system, in contrast to the IOM respondents. Additionally, the processes themselves were different. For instance, the IOM solicited nominations from a wide group of stakeholders, started with many more nominated questions, and then prioritized them through several rounds of voting,7 whereas we used a single round of prioritization among clinical and operational leaders in KP. The IOM also deliberately included questions of key significance to vulnerable subpopulations as part of its prioritization process.1 Although we did not specifically seek out those types of questions, several nominated questions fell into that category, such as questions on health literacy and intimate partner violence, one of which was included among the 95 high-priority questions. Thus, the questions generated here are complementary to those of the IOM and reinforce the importance of certain questions and areas, such as care coordination, delivery, and management.

We designed the survey process around practical considerations, including respondent burden, which limited the information we collected. Although a large number of clinicians were invited to participate in the study, we did not randomly sample all staff in direct care-delivery roles. Instead, those invited to participate were intended to represent clinical and administrative leadership roles. Thus, the sample may not be fully reflective of all caregivers in our system or in general. Additionally, survey respondents were not evenly distributed across specialties, and some specialties (eg, dentistry) were not represented.

To decrease respondent burden, we also grouped questions into 18 separate prioritization surveys instead of asking recipients to review all nominated questions. Even within a prioritization survey, we did not ask recipients to rank all questions but rather to select their top 5 questions. Despite these efforts to minimize the time required to complete the surveys, response rates were relatively low, and thus the nominated and prioritized topics may not be representative of KP clinicians and leaders as a whole. Low response rates were also a problem in the high-profile IOM process, in which about 9% of those e-mailed responded with nominations.1 Overall, the limitations due to how we grouped the questions and conducted the study, along with the low response rates, are important to consider when assessing the internal and external generalizability of the results. Issues in particular disease areas may be underrepresented or overrepresented given the number of individuals invited to participate and the number of responders in that area. However, we surveyed a wide range of clinical and operational leaders, whose responses correspondingly reflected the wide spectrum of issues faced by leaders in an integrated delivery system. Although different prioritization methods or a higher response rate (and therefore a different group of responders) may have yielded different high-priority questions, we believe that the questions identified here are still likely to represent questions of practical, clinical importance given that they were first nominated and then prioritized by a diverse group of clinicians and administrators.

Given the number of nominations we received, it was out of scope for this project to include more objective information regarding the underlying disease burden or to qualify nominated questions as clearly unanswered (by searching for in-process or published research). Thus, we cannot confirm whether the nominators' questions represented needed research or instead indicated lack of dissemination of existing research findings or recommendations for practice. In other work, we have found that about 25% of the time, publicly nominated research questions are already addressed through recent systematic reviews (Michelle Eder, PhD, oral communication, 2012 May 14).a

Another limitation is that we were not able to determine why highly prioritized questions were selected. We collected ratings from the nominators and prioritizers regarding the potential impact of the question on health care quality, efficiency, or equity. However, questions were generally rated highly on all these domains, which did not enable us to discriminate the reason for the priority. Additionally, we did not include patients in our surveys. However, their perspective is being obtained by the Patient-Centered Outcomes Research Institute, which is currently asking patients to nominate research questions. This research institute also is encouraging studies to include patients and other stakeholders in the research process by making their involvement part of the criteria for funding decisions, as well as including patients in the review of submitted proposals.8

A range of quantitative and qualitative methods have been used to obtain stakeholders' input on research needs and priorities. These methods have included semistructured, rating, and forced-ranking surveys, iterative Delphi or modified Delphi techniques, focus groups, citizens' juries and consumer panels (in the case of engaging the public at-large), and deliberative democracy and other facilitated consensus-building methods.9 Each of these methods has different strengths and weaknesses, and their most appropriate application depends on context (eg, the intended focus of the research, its intended uses, and the available resources to support the engagement). Our method was a broad-based initial effort to engage real-world health care leaders in a large population-based system that has an integrated research and quality-improvement capability, with the potential for further development. As other organizations deliberate processes for eliciting research needs and priorities, an essential step is for them to think about the ultimate implications and uses of the results. They may wish to draw on existing work, such as the Agency for Healthcare Research and Quality's prioritization of topics by reviewing existing evidence (eg, prevalence, mortality, variations in treatment, existing studies).7,10





In conclusion, by seeking input from practicing clinicians and operational leaders in a large health system providing comprehensive care, we obtained a wide range of questions reflecting the diverse health issues facing patients, clinicians, and health care systems. The 95 high-priority questions presented here represent issues of importance to those on the front lines of health care delivery, who are key stakeholders and anticipated consumers of comparative effectiveness research. Thus, these questions may help inform the national discussion regarding comparative effectiveness research and health care.

a Research Associate, Kaiser Permanente Center for Health Research, Portland, OR

Disclosure Statement

This study was funded through Kaiser Permanente internal operating funds provided by the Kaiser Permanente Center for Effectiveness and Safety Research. The authors have no other conflicts of interest to disclose.


Acknowledgments

We gratefully acknowledge the contributions and support provided by: Joe Selby, MD, Executive Director, Patient-Centered Outcomes Research Institute, Washington, DC; the Kaiser Permanente National Research Council; and the Kaiser Foundation Research Institute, Oakland, CA.

Kathleen Louden, ELS, of Louden Health Communications provided editorial assistance.

References

1. Committee on Comparative Effectiveness Research Prioritization, Institute of Medicine of the National Academies. Initial national priorities for comparative effectiveness research: report brief [monograph on the Internet]. Washington, DC: Institute of Medicine of the National Academies; 2009 Jun 30 [cited 2013 Aug 14]. Available from:
2. Federal Coordinating Council for Comparative Effectiveness Research. Report to the President and the Congress. Washington, DC: US Department of Health and Human Services; 2009 Jun 30.
3. Dubois RW, Graff JS. Setting priorities for comparative effectiveness research: from assessing public health benefits to being open with the public. Health Aff (Millwood) 2011 Dec;30(12):2235-42. DOI:
4. American Recovery and Reinvestment Act of 2009, Pub L No. 111-5 (Feb 17, 2009).
5. Iglehart JK. Prioritizing comparative-effectiveness research—IOM recommendations. N Engl J Med 2009 Jul 23;361(4):325-8. DOI:
6. Stewart RJ, Caird J, Oliver K, Oliver S. Patients' and clinicians' research priorities. Health Expect 2011 Dec;14(4):439-48. DOI:
7. Committee on Comparative Effectiveness Research Prioritization, Institute of Medicine of the National Academies. Initial national priorities for comparative effectiveness research. Washington, DC: Institute of Medicine of the National Academies; 2009.
8. Patient-Centered Outcomes Research Institute [homepage on the Internet]. Washington, DC: Patient-Centered Outcomes Research Institute; c2013 [cited 2013 Aug 14]. Available from:
9. O'Haire C, McPheeters M, Nakamoto E, et al; Oregon Evidence-based Practice Center; Vanderbilt Evidence-based Practice Center. Engaging stakeholders to identify and prioritize future research needs. Methods future research needs report number 4. Publication No. 11-EHC044-EF [monograph on the Internet]. Rockville, MD: Agency for Healthcare Research and Quality; 2011 Jun [cited 2013 Aug 14]. Available from:
10. Whitlock EP, Lopez SA, Chang S, Helfand M, Eder M, Floyd N. AHRQ series paper 3: identifying, selecting, and refining topics for comparative effectiveness systematic reviews: AHRQ and the effective health-care program. J Clin Epidemiol 2010 May;63(5):491-501. DOI:





