How to interpret surveys in medical research: A practical approach
Surveys are common in medical research. Although survey research may be subject to inherent self-report bias, surveys have a great impact on policies and practices in medicine, often forming the basis for recommendations or new guidelines.1,2 To interpret and use survey research results, clinicians should be familiar with key elements involved in the creation and validation of surveys.
The purpose of this article is to provide readers with a basic framework for evaluating surveys to allow them to be more informed as consumers of survey research.
IMPORTANT TOOLS IN MEDICAL RESEARCH
Surveys are important tools for answering questions on topics that are difficult to assess using other methods.3 They allow us to gather data systematically from subjects by asking questions, in order to make inferences about a larger population.3,4 Clinicians use surveys to explore the opinions, beliefs, and perceptions of a group, or to investigate physician practice patterns and adherence to clinical guidelines. They may also use surveys to better understand why patients are not engaging in recommended behavioral or lifestyle changes.
Survey methods include interviews (in person, by phone) and questionnaires (paper-and-pencil, e-mailed, online).4
A well-constructed, validated survey can provide powerful data that may influence clinical practice, guide future research development, or drive the development and provision of needed programs and services. Surveys have the potential to transform the ways in which we think about and practice medicine.
READER BEWARE
While survey research in health care appears to have grown exponentially, the quality of reported survey research has not necessarily increased over time.
For consumers of survey research, the adage “reader beware” is apt. Although a considerable number of studies have examined the effects of survey methodology on the validity, reliability, and generalizability of the results,4 medical journals differ in their requirements for reporting survey methods.
In an analysis of 117 articles, Bennett et al3 found that more than 80% did not fully describe the survey development process or pretesting methods. They also found limited guidance and lack of consensus about the best way to report survey research. Of 95 surveys requiring scoring, 66% did not report scoring practices.
Duffett et al5 noted that of 127 critical care medicine surveys, only 36% had been pretested or pilot-tested, and half of all surveys reviewed did not include participant demographics or included only minimal information.
Because journal reporting practices differ, physicians may be unaware of the steps involved in survey construction and validation. Knowledge of these steps is helpful not only in constructing surveys but also in assessing published articles that used survey research.
LIMITATIONS OF SURVEY RESEARCH
Indirect measures of attitudes and behaviors
Surveys that rely on participants’ self-reports of behaviors, attitudes, beliefs, or actions are indirect measures and are susceptible to self-report and social-desirability biases. Participants may overestimate their own expertise or knowledge in self-report surveys. They may wish to reduce embarrassment6 or answer in ways that would make them “look better,”7 resulting in social-desirability bias. These issues need to be mentioned in the limitations section in papers reporting survey research.
Questions and response choices
The data derived from surveys are only as good as the questions that are asked.8 Stone9 noted that questions should be intelligible, unambiguous, and unbiased. If respondents do not comprehend questions as researchers intended, if questionnaire response choices are inadequate, or if questions trigger unintended emotional responses,10–14 researchers may unwittingly introduce error, which will affect the validity of results. Even seemingly objective questions, such as those related to clinical algorithm use, practice patterns, or equipment available to hospital staff, may be interpreted differently by different respondents.
In their eagerness to launch a survey, clinician researchers may not realize that it must be carefully constructed. A focus on question development and validation is critical, as the questions determine the quality of the data derived from the survey.8 Even the position of the question or answer in the survey can affect how participants respond,15 as they may be guided to a response choice by preceding questions.16
WHAT DO YOU NEED TO KNOW ABOUT ASSESSING SURVEY RESEARCH?
What follows are questions and a basic framework that can be used to evaluate published survey research. Recommendations are based on the work of survey scientists,4,7,10,14,15,17,18 survey researchers in medicine and the social sciences, and national standards for test and questionnaire construction and validation (Table 1).4,19,20
Who created the survey? How did they do it?
How the survey was created should be sufficiently described to allow readers to judge the adequacy of instrument development.3–5 It is generally recommended that feedback from multiple sources be solicited during survey creation. Both questionnaire-design experts and subject-matter experts are considered critical in the process.4
What question was the survey designed to answer?
Is the objective of the study articulated in the paper?3,20 To judge survey research, readers need to know whether the survey, given the methods used, appears to adequately address the research question or questions and the objectives of the study.4
Was evidence on validity gathered?
Instrument pretesting and field testing are considered best practices by the American Association for Public Opinion Research, a professional organization for US survey scientists.4
Pretesting can include cognitive interviewing, the use of questionnaire appraisal tools, and hybrid methods, all of which are aimed at addressing validity issues.21 Pretesting with a group of participants similar to the target population allows for assessment of item ambiguity, instrument ease of use, adequacy of response categories (response choices), and time to completion.4,12
Cognitive interviewing is designed to explore respondents’ comprehension of questions, response processes, and decision processes governing how they answer questions.4,7,10,11 In cognitive interviewing, respondents are generally interviewed one on one. Techniques vary, but typically include “think alouds” (in which a respondent is asked to verbalize thoughts while responding to questions) and “verbal probing” (in which the respondent answers a question, then is asked follow-up questions as the interviewer probes for information related to the response choice or question itself).7 These techniques can provide evidence that researchers are actually measuring what they set out to measure and not an unrelated construct.4,19
Field testing of a survey under realistic conditions can help to uncover problems in administration, such as issues in standardization of key procedures, and to ensure that the survey was administered as the researchers intended.21,22 Field testing is vital before phone or in-person interviews to ensure standardization of any critical procedures. Pilot testing in a sample similar to the intended population allows for further refinement, with deletion of problem items, before the survey is launched.15
Because even “objective” questions can be somewhat subjective, all research surveys should go through some type of pretesting.4,21 Based on the results of pretesting and field testing, surveys should then be revised before launch.4,21 If an article on a self-report survey makes no mention of survey validation steps, readers may well question the validity of the results.
Are the survey questions and response choices understandable?
Is the meaning of each question unambiguous? Is the reading level appropriate for the sample population (a critical consideration in patient surveys)? Do any of the items actually ask two different questions?13 An example would be “Was the representative courteous and prompt?”: it is possible to be courteous but not prompt, and vice versa, so respondents may be confused or frustrated in attempting to answer such an item. If a rating scale is used throughout the questionnaire, are the anchors appropriate? For example, a question may be written in such a way that respondents want to answer “yes/no” or “agree/disagree,” but the scale used may include response options such as “poor,” “marginal,” “good,” and “excellent.” Items with Likert-response formats are commonly used in self-report surveys and allow participants to respond to a statement by choosing from a range of responses (eg, strongly disagree to strongly agree), often spaced horizontally under a line.
It is recommended that surveys also include options for answers beyond the response choices provided,20 such as comment boxes or fill-in-the-blank items. Surveys with a closed-response format may constrain the quality of data collected because investigators may not foresee all possible answers. Surveys should be available for review within the article itself, in an appendix, or as supplementary material available elsewhere.
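As a concrete illustration of these points, the following minimal Python sketch shows one way to represent closed-response items alongside an open “Other (please specify)” option and to flag an item whose anchors do not fit its stem. All item wording, anchor sets, and the heuristic check are invented for illustration and are not drawn from any published instrument.

```python
# Hypothetical sketch: representing survey items and flagging anchor mismatches.
# Item wording, anchor sets, and the heuristic check are invented examples.

from dataclasses import dataclass
from typing import List

LIKERT_AGREEMENT = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
QUALITY_ANCHORS = ["Poor", "Marginal", "Good", "Excellent"]

@dataclass
class SurveyItem:
    stem: str                   # the question or statement shown to respondents
    anchors: List[str]          # closed response choices
    allow_other: bool = False   # adds an open "Other (please specify)" box

    def anchors_fit_stem(self) -> bool:
        """Crude heuristic: agree/disagree-style stems should get agreement anchors."""
        wants_agreement = self.stem.lower().startswith(("i agree", "i believe", "i feel"))
        offers_agreement = any("agree" in a.lower() for a in self.anchors)
        return offers_agreement if wants_agreement else True

items = [
    SurveyItem("I feel confident interpreting survey research.", LIKERT_AGREEMENT),
    # Mismatch: an agree/disagree statement paired with quality anchors.
    SurveyItem("I believe our discharge process is safe.", QUALITY_ANCHORS),
    SurveyItem("Which barriers limit guideline adherence in your unit?",
               ["Time", "Staffing", "Equipment"], allow_other=True),
]

for item in items:
    note = "" if item.anchors_fit_stem() else "  <-- anchors may not fit the stem"
    print(f"{item.stem}{note}")
```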
Does the sample appear to be appropriate?
Articles that report the results of surveys should describe the target population, the sample design, and, in a demographic table, respondents and nonrespondents. To judge appropriateness, several questions can be asked regarding sampling:
Target population. Is the population of interest (ie, the target population) described, including regional demographics, if applicable? The relationship between the sample and the target population is important, as a nonrepresentative sample may result in misleading conclusions about the population of interest.
Sampling frame. Who had an opportunity to participate in the survey? At its simplest, the sampling frame establishes who (or what, in the case of institutions) should be included within the sample. This is typically a list of elements (Groves et al4) that acts to “frame” or define the sample to be selected. Whereas the target population may be all academic internal medicine physicians in the United States, the sampling frame may be all US physicians who are members of particular internal medicine professional organizations, identified by their directory e-mail addresses.
Sample design. How was the sample actually selected?4 For example, did investigators use a convenience sample of colleagues at other institutions or use a stratified random sample, ensuring adequate representation of respondents with certain characteristics?
Description of respondents. How is the sample of respondents described? Are demographic features reported, including statistics on regional or national representativeness?5 Does the sample of survey respondents appear to be representative of the researcher’s population of interest (ie, the target population)?3,23 If not, is this adequately described in the limitations section? Although outcomes will not be available on nonrespondents, demographic and baseline data often are available and should be reported. Are there systematic differences between respondents and nonrespondents?
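To make these sampling questions more tangible, the brief Python sketch below uses an entirely invented sampling frame and response pattern: it draws a stratified random sample by region (rather than a convenience sample) and then compares respondents with nonrespondents on a simple demographic. A large gap between the two groups would point to systematic nonresponse worth reporting.

```python
# Invented illustration: stratified random sampling from a hypothetical frame,
# followed by a crude respondent-vs-nonrespondent comparison. All data are simulated.

import random
from collections import Counter

random.seed(42)

# Sampling frame: one record per eligible physician (simulated).
frame = [{"id": i,
          "region": random.choice(["Northeast", "South", "Midwest", "West"]),
          "practice": random.choice(["academic", "community"])}
         for i in range(5000)]

def stratified_sample(records, stratum_key, total_n):
    """Draw a random sample proportionally within each stratum."""
    strata = {}
    for rec in records:
        strata.setdefault(rec[stratum_key], []).append(rec)
    sample = []
    for members in strata.values():
        k = round(total_n * len(members) / len(records))
        sample.extend(random.sample(members, k))
    return sample

sample = stratified_sample(frame, "region", total_n=400)

# Simulated response status: suppose community physicians respond less often.
for rec in sample:
    rec["responded"] = random.random() < (0.65 if rec["practice"] == "academic" else 0.40)

respondents = [r for r in sample if r["responded"]]
nonrespondents = [r for r in sample if not r["responded"]]

def percent_by(records, key):
    counts = Counter(r[key] for r in records)
    return {k: round(100 * v / len(records), 1) for k, v in counts.items()}

print(f"Response rate: {len(respondents) / len(sample):.1%}")
print("Respondents by practice type:   ", percent_by(respondents, "practice"))
print("Nonrespondents by practice type:", percent_by(nonrespondents, "practice"))
# A large gap between these two rows suggests systematic nonresponse that
# should be acknowledged in the limitations section.
```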
Was the response rate adequate?
Was the response rate adequate, given the number of participants initially recruited? If the response rate was not adequate, did the researchers discuss this limitation?
Maximum response rate, defined as the total number of surveys returned divided by the total number of surveys sent,18 may be difficult to calculate with electronic or Web-based survey platforms. When the maximum response rate cannot be calculated, this issue needs to be addressed in the article’s limitations section.
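As a simple worked example of this definition, the short Python snippet below uses hypothetical counts; the closing comment notes the open-Web-link situation in which the denominator, and therefore the rate, cannot be calculated.

```python
# Hypothetical counts illustrating the maximum response rate defined above.
surveys_sent = 400       # questionnaires sent to members of the sampling frame
surveys_returned = 244   # completed questionnaires received

max_response_rate = surveys_returned / surveys_sent
print(f"Maximum response rate: {max_response_rate:.0%}")   # prints "61%"

# With an open Web link, the number of clinicians who actually received the
# invitation may be unknown; the denominator (and thus the rate) cannot be
# calculated, and this should be discussed in the limitations section.
```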
The number of surveys has increased across fields over the past few decades, but survey response rates in general have decreased.17,21,24,25 In fields outside of clinical medicine, response rates in the 40% range are common.17 In the 1990s, the mean response rate for surveys published in medical journals (mailed surveys) was approximately 60%.26 A 2001 review of physician questionnaire studies found a similar average response rate (61%), with a 52% response rate for large-sample surveys.27 In 2002, Field et al28 examined the impact of incentives in physician survey studies and found response rates ranging from 8.5% to 80%.
Importantly, electronically delivered surveys (e-mail, Web-based) often have lower response rates than mailed surveys.24,29 Nominal financial incentives have been associated with enhanced response rates.28
A relatively low response rate does not necessarily mean you cannot trust the data. Survey scientists note that the representativeness of the sample may be more critical than the response rate alone.17 A study with a small but representative sample may yield more valid findings than one with a large sample that does not represent the target population.17
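To illustrate why representativeness can outweigh sample size, the short simulation below (all numbers invented) compares the estimate from a small simple random sample with the estimate from a much larger sample drawn almost entirely from one practice setting.

```python
# Invented simulation: a small representative sample vs a large nonrepresentative one.
import random
random.seed(7)

# Target population: 10,000 physicians; guideline adherence differs by setting.
population = ([{"setting": "academic",  "adherent": random.random() < 0.80} for _ in range(3000)] +
              [{"setting": "community", "adherent": random.random() < 0.50} for _ in range(7000)])

def adherence_rate(sample):
    return sum(p["adherent"] for p in sample) / len(sample)

true_rate = adherence_rate(population)

small_random = random.sample(population, 200)               # small but representative
academic_only = [p for p in population if p["setting"] == "academic"]
large_biased = random.sample(academic_only, 2000)           # large but nonrepresentative

print(f"True adherence in target population:     {true_rate:.0%}")
print(f"Small random sample (n=200):              {adherence_rate(small_random):.0%}")
print(f"Large nonrepresentative sample (n=2000):  {adherence_rate(large_biased):.0%}")
# The large sample's estimate reflects academic practice only and overstates
# adherence in the target population; the small random sample lands near the truth.
```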
Do the conclusions go beyond the data?
Are the inferences overreaching, in view of the survey design? In studies with low response rates and nonrepresentative samples, researchers must be careful in interpreting the results. If the results cannot be generalized beyond the research sample, is this clear from the limitations, discussion, and conclusion sections?
In this review, we have summarized the findings of three published surveys1,2,30 and commented on how they appear to meet—or don’t quite meet—recommendations for survey development, validation, and use. The papers chosen were deemed strong examples in particular categories, such as description of survey authorship,1 instrument validation,30 sampling methodology,2 and response rate.1
It should be noted that even when surveys are conducted with the utmost rigor, survey reporting may leave out critical details. Survey methodology may not be adequately described for a variety of reasons, including gaps in researchers’ training in survey design and methodology, a lack of universally accepted journal-reporting guidelines,3 and journals’ space limitations. At times, journals may excise descriptions of survey development and validation, deeming these sections superfluous. Limitations sections can be critical to interpreting the results of survey research and evaluating the scope of conclusions.
REFERENCES
1. Jha AK, DesRoches CM, Campbell EG, et al. Use of electronic health records in US hospitals. N Engl J Med 2009; 360:1628–1638.
2. Angus DC, Shorr AF, White A, Dremsizov TT, Schmitz RJ, Kelley MA; Committee on Manpower for Pulmonary and Critical Care Societies (COMPACCS). Critical care delivery in the United States: distribution of services and compliance with Leapfrog recommendations. Crit Care Med 2006; 34:1016–1024.
3. Bennett C, Khangura S, Brehaut JC, et al. Reporting guidelines for survey research: an analysis of published guidance and reporting practices. PLoS Med 2010; 8:e1001069.
4. Groves RM, Fowler FJ, Couper MP, Lepkowski JM, Singer E, Tourangeau R. Survey Methodology. 2nd ed. Hoboken, NJ: John Wiley and Sons, Inc; 2009.
5. Duffett M, Burns KE, Adhikari NK, et al. Quality of reporting of surveys in critical care journals: a methodologic review. Crit Care Med 2012; 40:441–449.
6. Mattell MS, Jacoby J. Is there an optimal number of alternatives for Likert-scale items? Effects of testing time and scale properties. J Appl Psychol 1972; 56:506–509.
7. Willis GB. Cognitive Interviewing. A “How To” Guide. Research Triangle Institute. Presented at the meeting of the American Statistical Association; 1999. http://fog.its.uiowa.edu/~c07b209/interview.pdf. Accessed June 3, 2013.
8. Schwarz N. Self-reports. How the questions shape the answers. Am Psychol 1999; 54:93–105.
9. Stone DH. Design a questionnaire. BMJ 1993; 307:1264–1266.
10. Willis GB, Royston P, Bercini D. The use of verbal report methods in the development and testing of survey questionnaires. Appl Cogn Psychol 1991; 5:251–267.
11. Desimone LM, LeFloch KC. Are we asking the right questions? Using cognitive interviews to improve surveys in education research. Educ Eval Policy Anal 2004; 26:1–22.
12. Presser S, Couper MP, Lessler JT, et al. Methods for testing and evaluating survey questions. Public Opin Q 2004; 68:109–130.
13. Rogers G; Accreditation Board for Engineering and Technology (ABET), Inc. Sample Protocol for Pilot Testing Survey Items. www.abet.org/WorkArea/DownloadAsset.aspx?id=1299. Accessed January 22, 2013.
14. Schwarz N, Oyserman D. Asking questions about behavior: cognition, communication, and questionnaire construction. Am J Eval 2001; 22:127–160.
15. Bradburn N, Sudman S, Wansink B. Asking Questions. The Definitive Guide to Questionnaire Design—For Market Research, Political Polls, and Social and Health Questionnaires. San Francisco, CA: Jossey-Bass; 2004.
16. Stone AA, Broderick JE, Schwartz JE, Schwarz N. Context effects in survey ratings of health, symptoms, and satisfaction. Med Care 2008; 46:662–667.
17. Cook C, Heath F, Thompson RL. A meta-analysis of response rates in Web or internet-based surveys. Educ Psychol Meas 2000; 60:821–836.
18. Kaplowitz MD, Hadlock TD, Levine R. A comparison of Web and mail survey response rates. Public Opin Q 2004; 68:94–101.
19. American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association; 1999.
20. Burns KE, Duffett M, Kho ME, et al; ACCADEMY Group. A guide for the design and conduct of self-administered surveys of clinicians. CMAJ 2008; 179:245–252.
21. American Association for Public Opinion Research (AAPOR). http://www.aapor.org/Home.htm. Accessed June 3, 2013.
22. National Center for Education Statistics. Planning and Design of Surveys. http://nces.ed.gov/statprog/2002/std2_1.asp. Accessed January 22, 2013.
23. Bordens KS, Abbott BB. Research Design and Methods. A Process Approach. 6th ed. New York, NY: McGraw-Hill; 2004.
24. Sheehan K. E-mail survey response rates: a review. J Comput Mediat Commun 2001; 6(2). http://jcmc.indiana.edu/vol6/issue2/sheehan.html. Accessed January 22, 2013.
25. Baruch Y, Holtom BC. Survey response rate levels and trends in organizational research. Hum Relat 2008; 61:1139–1160.
26. Asch DA, Jedrziewski MK, Christakis NA. Response rates to mail surveys published in medical journals. J Clin Epidemiol 1997; 50:1129–1136.
27. Cummings SM, Savitz LA, Konrad TR. Reported response rates to mailed physician questionnaires. Health Serv Res 2001; 35:1347–1355.
28. Field TS, Cadoret CA, Brown ML, et al. Surveying physicians. Do components of the “Total Design Approach” to optimizing survey response rates apply to physicians? Med Care 2002; 40:596–606.
29. Converse PD, Wolfe EW, Huang X, Oswald FL. Response rates for mixed-mode surveys using mail and e-mail/Web. Am J Eval 2008; 29:99–107.
30. Hirshberg E, Lacroix J, Sward K, Willson D, Morris AH. Blood glucose control in critically ill adults and children: a survey on stated practice. Chest 2008; 133:1328–1335.
KEY POINTS
- Most survey reports do not adequately describe their methods.
- Surveys that rely on participants’ self-reports of behaviors, attitudes, beliefs, or actions are indirect measures and are susceptible to self-report and social-desirability biases.
- Informed readers need to consider a survey’s authorship, objective, validation, items, response choices, sampling representativeness, response rate, generalizability, and scope of the conclusions.