Survey methodology for the uninitiated
Research using self-developed questionnaires is a popular study design in family practice and is frequently used for gathering data on knowledge, beliefs, attitudes, and behaviors. A Medline literature search from 1966 to 2000 identified 53,101 articles related to questionnaires, of which 2088 were directly related to family practice. Despite the large number of questionnaire-related articles, however, only 2 in the general medical literature1,2 and 1 in the family practice literature3 were directly related to research methodology.
To obtain guidance on survey research methodology, novice family practice researchers often must sift through volumes of information written by specialists in other disciplines. For example, a search of a psychology database (PsycINFO)4 from 1966 to 2000 produced 45 articles about questionnaire methodology. The goal of this article is to synthesize pertinent survey research methodology tenets, drawn from other disciplines as well as from family practice, in a manner that is meaningful to novice family practice researchers as well as to research consumers. This article is not aimed at answering all questions, but rather is meant to serve as a general guideline for those with little formal research training who seek guidance in developing and administering questionnaires.
Avoiding common pitfalls in survey research
Although constructing a questionnaire is not exceedingly complex, simple mistakes can be avoided by following some basic rules and guidelines. The Figure is a checklist for conducting a survey research project that combines guidelines and suggestions from published survey research literature5-9 and the cumulative experience of the authors. Two of the authors (M.J.D. and K.C.O.) are experienced survey researchers who have published, in peer-reviewed journals, numerous studies that used questionnaires.10-19 One of the authors (M.J.D.) has been teaching research to residents and junior faculty for over a decade, and has been an advisor on scores of resident, student, and faculty research projects. The perspective of the novice researcher is represented by 1 author (C.R.W.).
Getting started
The “quick and dirty” approach is perhaps the most common pitfall in survey research. Because of the ease of administration and the relatively low cost of survey research, questionnaires can be developed and administered quickly. The researcher, however, should be sure to consider whether or not a survey is the most appropriate method to answer a research question. Adequate time must be given to thoroughly searching the relevant literature, developing and focusing on an appropriate research question, and defining the target population for the study (see Figure A, Getting Started). Large, multisite surveys are more likely to be generalizable and to be published in peer-reviewed journals.
One way to avoid undertaking a project too rapidly and giving inadequate attention to the survey research process is for novice researchers to avoid independent research. Those with little or no experience must realize that researchers in both family practice and other fields perform research in teams, with the various participants bringing specific skills to the process.20 Oversights, mistakes, and biases in the design of questionnaires can always occur, whether a researcher is working independently or as a member of a team. It seems reasonable to assume, however, that significant problems are much less likely to occur when a multidisciplinary team approach is involved rather than an individual researcher undertaking a study independently.
Ideally, a research team should include a statistician, a professional with experience in the content areas of the study, and a senior investigator.21 These areas of expertise, however, are often not readily available to family physicians, especially those in community-based settings. Individuals with some training in research who are interested in being involved can usually be found in colleges and universities, in hospitals, and at the local public health department. Psychologists, sociologists, health services researchers, public health epidemiologists, and nursing educators are all potential resources and possible collaborators. Establishing the necessary relationships to form an ad hoc research team is certainly more time and labor intensive than undertaking research independently, but it generally results in the collection of more useful information.
Novices should consult survey methodology books before and during the study.5-9 Excellent resources are available that provide a comprehensive overview of survey methods,22 means for improving response rates,23 and methods for constructing relatively brief but thorough survey questions.5 Academic family practice fellowships often provide training in survey methodology. In addition, many family practice researchers respond favorably to requests for information or advice made by telephone or email. The novice author of this article reports excellent success in contacting experts in this manner. With the advent of the Internet, a “cyberspace” team composed of experts in the topic and the methodology is a reasonable and helpful option for the novice.
Survey content and structure
Novice researchers often assume that developing a questionnaire is an intuitive process attainable by virtually anyone, regardless of their level of research training. While it is true that questionnaires are relatively simple to construct, developing an instrument that is valid and reliable is not intuitive. An instrument is valid if it actually measures what we think it is measuring, and it is reliable if it measures the phenomenon consistently in repeated applications.24 By following a few basic guidelines, those with limited research training can develop survey instruments capable of producing valid and reliable information. The 3 primary concerns for developing appropriate questions (items) are: (1) response format; (2) content; and (3) wording and placement (see Figure B, Survey Questions; and Figure C, Designing and Formatting the Survey).
Format
Questionnaires generally use a closed-ended format rather than an open-ended format. Closed formats spell out response options instead of asking study subjects to respond in their own words. Although there are many reasons for using closed formats, their primary advantages over open formats are that they are more specific, they provide the same frame of reference to all respondents, and they allow quantitative analysis. A disadvantage is that they limit the possible responses to those envisioned by the investigators. Questionnaires with closed formats are therefore not as helpful as qualitative methods in the early, exploratory phases of a research project.
Closed-ended items can be formatted into several different categories (classes) of measurement, based on the relationship of the response categories to one another. Nominal measurements are responses that are sorted into unordered categories, such as demographic variables (eg, sex, ethnicity). Ordinal measurements are similar to nominal, except that there is a definite order to the categories. For example, ordinal items may ask respondents to rank their preferences among a list of options from the least desirable to the most desirable.
Survey items that ask respondents to rank order preferences are often more useful than items that state, “check all that apply.” While checking all relevant responses may be necessary for certain items, such questions often lose valuable information because they supply only raw percentages, without any comparison between responses. If a survey uses a rank order response, the relative importance of the different categories can be determined during data analysis (Table 1).
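To make this concrete, the following is a minimal sketch, in Python with entirely hypothetical category names and responses, of how rank-order data support a comparison of relative importance (mean ranks), while “check all that apply” data yield only raw percentages.

```python
# Minimal sketch (hypothetical data): what rank-order items recover that
# "check all that apply" items cannot.

# Three respondents rank 3 barriers to research (1 = most important).
rank_responses = [
    {"time": 1, "funding": 2, "training": 3},
    {"time": 1, "funding": 3, "training": 2},
    {"time": 2, "funding": 1, "training": 3},
]

categories = rank_responses[0].keys()

# Rank-order data permit a comparison: a mean rank per category.
mean_rank = {c: sum(r[c] for r in rank_responses) / len(rank_responses) for c in categories}
print("Mean ranks (lower = more important):", mean_rank)

# The same content gathered as "check all that apply" yields only percentages,
# with no information about relative importance among the checked items.
checkall_responses = [
    {"time", "funding"},
    {"time", "training"},
    {"time", "funding", "training"},
]
pct_checked = {c: 100 * sum(c in r for r in checkall_responses) / len(checkall_responses) for c in categories}
print("Percent checked:", pct_checked)
```

In this illustration, “funding” and “training” tie on percentage checked even though the rank data show they differ in importance.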
Two additional tools used on questionnaires are continuous variables and scales. Continuous variables can be simple counts (eg, the number of times something occurred) or physical attributes (eg, age or weight). A general rule when collecting information on continuous variables is to avoid obtaining the information in ranges of categories unless absolutely necessary. Response categories that reflect ranges of responses can always be constructed after the information is gathered, but if the information is gathered in ranges from the start, it cannot later be expanded to reflect specific values.
Scales are used by survey researchers to assess the intensity of respondents’ attitudes about a specific issue or issues. Likert scales are probably the best known and most widely used for measuring attitudes. These scales typically present respondents with a statement and ask them to indicate whether they “strongly agree,” “agree,” “neither agree nor disagree,” “disagree,” or “strongly disagree.” The wording of the response categories can be changed to reflect other concepts (eg, approval or disapproval), and the standard 5-response format can be expanded or abbreviated if necessary.
There are no hard and fast rules for determining the number of response categories to use for scaled items, or whether to use a neutral category or one that reflects uncertainty. Research indicates that the reliability of respondents’ ratings declines when more than 9 rating scale points are used.25 However, the reliability of a scale increases when the number of rating scale points is increased, with maximum benefit achieved with 5 or 7 scale points.25,26 Since the objective of using scales is to gauge respondents’ preferences, it is sometimes argued that a midpoint or uncertainty category should not be used. Odd-numbered rating scales, however, conform better with the underlying tenets of many statistical tests, suggesting the need for including this category.29 As the number of rating scale points increases, respondents’ use of the midpoint category decreases substantially.30 Thus, based on the available literature, it is generally advisable to use between 5 and 7 response categories and an uncertainty category, unless there is a compelling reason to force respondents to choose between 2 competing perspectives or alternatives.
Content
Items should not be included on questionnaires when the only justification for inclusion is that the investigator feels the information “would be really interesting to know.” Rather, for each item, you should ask yourself how it addresses the study’s research question and how it will be used in the data analysis stage of the study. Researchers should develop a data analysis plan in advance of administering a questionnaire to determine exactly how each question will be used in the analysis. When the relationship between a particular item and the study’s research question is unclear, or it is not known how an item will be used in the analysis, the item should be removed from the questionnaire.
Wording and placement
The wording of questions should be kept simple, regardless of the education level of the respondents. Questions should be kept as short and direct as possible, since shorter surveys tend to have higher response rates.31,32 Each question should be scrutinized to ensure it is appropriate for the respondents and does not require or assume an inappropriate level of knowledge about a topic. Since first impressions are important for setting the tone of a questionnaire, never begin with sensitive or threatening questions.33 Questionnaires should begin with simple, introductory (“warm-up”) questions to help establish trust and an appropriate frame of mind for respondents.34 Other successful strategies are: (1) when addressing multiple topics, insert an introductory statement immediately preceding each topic (eg, “In the next section we would like to ask you about …”); (2) request demographic information at the end of the questionnaire; and (3) always provide explicit instructions to avoid any confusion on the part of respondents.35
Additional clear information on survey content and structure is available in 2 books from Sage Publications.5,36 By following simple guidelines and common sense, most family practice researchers can construct valid and reliable questionnaires. As a final safeguard, once a final draft of the questionnaire is completed, the researcher should always be the first respondent. By placing yourself in the respondent’s role and taking the time to think about and respond to each question, you can sometimes identify problems with the instrument that were previously overlooked.
Analyzing surveys
It is not within the scope of this article to address statistical analysis of survey data. Before attempting data analysis, investigators should receive appropriate training or consult with a qualified professional. There are 3 topics related to data analysis, however, that novice researchers can and should understand (Figure D, Developing a Framework for Analysis).
Coding
Before analyzing survey data it is necessary to assign numbers (codes) to the responses obtained. Since the computer program that is used for analyzing data does not know what the numbers mean, the researcher assigns meaning to the codes so that the results can be interpreted correctly. Coding refers to the process of developing the codes, assigning them to responses, and documenting the decision rules used for assigning specific codes to specific response categories. For example, almost all questionnaires contain missing values when respondents elect to not answer an item. Unique codes need to be assigned to distinguish between an item’s missing values, items that may not be applicable to a particular respondent, and responses that have a “none” or “no opinion” category.
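As a minimal sketch of how such a coding scheme might be documented and applied, the fragment below (in Python, with illustrative rather than standard code values) assigns distinct codes to substantive responses, “no opinion,” “not applicable,” and missing answers.

```python
# Minimal sketch (illustrative codes, not a standard): one codebook entry for a
# Likert item, with distinct codes for "no opinion", "not applicable", and missing.
CODEBOOK = {
    "q12_satisfaction": {
        "strongly disagree": 1,
        "disagree": 2,
        "neither agree nor disagree": 3,
        "agree": 4,
        "strongly agree": 5,
        "no opinion": 7,       # respondent answered, but expressed no opinion
        "not applicable": 8,   # item did not apply to this respondent
        "missing": 9,          # respondent skipped the item
    }
}

def code_response(item, raw_answer):
    """Apply the codebook; blank or absent answers are coded as missing."""
    codes = CODEBOOK[item]
    if raw_answer is None or raw_answer.strip() == "":
        return codes["missing"]
    return codes[raw_answer.strip().lower()]

print(code_response("q12_satisfaction", "Agree"))   # -> 4
print(code_response("q12_satisfaction", ""))        # -> 9 (missing)
```

Documenting decision rules of this kind alongside the codes is what allows the analysis to be interpreted, and repeated, correctly.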
Data can be entered into appropriate data files once codes have been assigned to responses and a codebook compiled that defines the codes and their corresponding response categories. It is important to ensure that the data are free of errors (are clean) prior to performing data analysis. Although many methods can be used for data cleaning (eg, data can be entered twice and the results compared for consistency), at a minimum all of the codes should be checked to ensure that only legitimate codes appear.
Frequency distributions are tables produced by statistical software that display the number of respondents in each response category for each item (variable) used in the analysis. By carefully examining frequency tables, the researcher can check for illegitimate codes. Frequency tables also display the relative distribution of responses and allow identification of items that do not conform to expectations given what is known about the study population.
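The following is a minimal sketch, assuming the pandas library and hypothetical coded responses, of producing a frequency distribution for one item and using it to flag illegitimate codes.

```python
# Minimal sketch (hypothetical data): a frequency distribution for one coded item,
# used to spot illegitimate codes during data cleaning.
import pandas as pd

data = pd.DataFrame({"q12_satisfaction": [1, 2, 2, 4, 5, 9, 4, 4, 3, 12]})  # 12 is a data-entry error

valid_codes = {1, 2, 3, 4, 5, 7, 8, 9}  # from the codebook

freq = data["q12_satisfaction"].value_counts().sort_index()
print(freq)  # the count for code 12 stands out against the codebook

illegitimate = set(freq.index) - valid_codes
if illegitimate:
    print("Illegitimate codes found:", illegitimate)
```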
Sample size
Since it is usually not possible to study all of the members of the group (population) of interest in a study, a subset (sample) of the population is generally selected for study from the sampling frame. Sampling is the process by which study subjects are selected from the target population, while the sampling frame comprises the members of the population who have a chance of being included in the survey. In probability samples, each member of the sampling frame has a known probability of being selected for the study, whereas in nonprobability samples, the probability of selection is unknown. When a high degree of precision in sampling is needed to unambiguously identify the magnitude of a problem in a population or the factors that cause the problem, probability sampling techniques must be used.
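A minimal sketch of the simplest probability sample, a simple random sample drawn from a hypothetical sampling frame in which every member has a known and equal probability of selection, follows.

```python
# Minimal sketch (hypothetical sampling frame): a simple random sample in which
# each member of the frame has a known, equal chance of selection.
import random

sampling_frame = [f"physician_{i:04d}" for i in range(1, 2001)]  # 2000 eligible physicians
sample_size = 500

random.seed(42)  # fixed seed so the draw can be reproduced and documented
sample = random.sample(sampling_frame, sample_size)

selection_probability = sample_size / len(sampling_frame)
print(f"Each frame member had a {selection_probability:.0%} chance of selection.")
```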
When conducting an analytical study that examines precisely whether statistically significant differences exist between groups in a population, power analysis is used to determine what size sample is needed to detect the differences. Estimates of sample size based on power are inversely related to the expected size of the differences (effect size); that is, detecting smaller differences requires a larger sample. If an analytical study is undertaken to determine the magnitude of the differences between 2 groups, it is necessary to work with a statistician or other methodology expert to perform the appropriate power analysis. For a basic but valuable description of sample size estimation, see chapter 13 of Hulley and Cummings.21
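As a rough illustration of why smaller expected differences demand larger samples, the sketch below applies the standard approximate formula for comparing two independent proportions; the proportions, alpha, and power are assumptions chosen purely for illustration, and a statistician should still be consulted for an actual study.

```python
# Minimal sketch (standard two-proportion approximation; the specific proportions
# are illustrative assumptions): per-group sample size needed to detect a
# difference between two proportions at given alpha and power.
from statistics import NormalDist
from math import ceil

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group n for comparing two independent proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Detecting a small difference (60% vs 50%) requires many more respondents
# per group than detecting a large one (60% vs 30%).
print(n_per_group(0.60, 0.50))  # ~385 per group
print(n_per_group(0.60, 0.30))  # ~40 per group
```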
In contrast to analytical studies, exploratory and descriptive studies can frequently be conducted without the need for a power analysis. While some descriptive studies may require the use of probability techniques and precise sample estimates, this often is not the case for studies that establish the existence of a problem or estimating its dimensions. When conducting an exploratory or descriptive study using a survey design and a nonprobability sampling technique, considerations other than effect size or precision are used to determine sample size. For example, the availability of eligible respondents, limitations of time and resources, and the need for pilot study data can all contribute to selecting a nonprobability sample. When these types of sampling techniques are used, however, it is important to remember that the validity and reliability of the findings are not assured, and the findings cannot be used to demonstrate the existence of differences between groups. The findings of these types of studies are only suggestive and have limited application beyond the specific study setting.
Response rate
The response rate is a measure indicating the percentage of the identified sample that completed and returned the questionnaire. It is calculated by dividing the number of completed questionnaires by the total sample size identified for the study. For example, if questionnaires are mailed to 500 physicians and 100 return a completed questionnaire, the response rate is 20% (100/500).
The response rate for mailed questionnaires is extremely variable. Charities are generally content with a 1% to 3% response rate, the US Census Bureau expects to achieve a 99% rate, and among the general population, a 10% response rate is not uncommon. Although an 80% response rate is possible from an extremely motivated population, a rate of 70% is generally considered excellent.34
The effect of nonresponse on the results of a survey depends on the degree to which those not responding are systematically different from the population from which they are drawn.24 When the response rate is high (eg, 95%), the results obtained from the sample will likely provide accurate information about the target population (sampling frame) even if the nonrespondents are distinctly different. However, if nonrespondents differ in a systematic way from the target population and the response rate is low, bias in how accurately the survey results reflect the true characteristics of the target population is likely.
When calculating the response rate, participants who have died or retired can be removed from the denominator as appropriate. Nonrespondents who refuse to participate, do not return the survey, or have moved, however, should be included. Nonresponse bias tends to be more problematic in “sensitive” areas of research37 than in studies of common, nonthreatening topics.38 Imputing values for missing data from nonrespondents is complex and generally should not be undertaken.39
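The arithmetic is straightforward; the sketch below, using hypothetical counts, contrasts the raw response rate with a rate whose denominator has been adjusted for ineligible (deceased or retired) addressees.

```python
# Minimal sketch (hypothetical numbers): raw vs adjusted response rates.
# Deceased or retired addressees may be removed from the denominator; refusals,
# non-returns, and those who moved remain in it.
mailed = 500
completed = 100
deceased_or_retired = 15   # ineligible: removed from the denominator
moved_or_refused = 40      # nonrespondents: stay in the denominator

raw_rate = completed / mailed
adjusted_rate = completed / (mailed - deceased_or_retired)

print(f"Raw response rate:      {raw_rate:.1%}")       # 20.0%
print(f"Adjusted response rate: {adjusted_rate:.1%}")  # 20.6%
```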
Given the importance of response rate, every effort must be made to obtain as many completed questionnaires as possible and strategies to maximize the response rate should be integrated into the study design (see Dillman23 for a useful discussion of successful strategies). Some simple means for improving response rates include constructing a short questionnaire, sending a well-written and personalized cover letter containing your signature, and emphasizing the importance of the study and the confidentiality of responses. It is also advisable to include a self-addressed, stamped envelope for return responses, and sometimes a small incentive is worthwhile. The National Center for Education Statistics notes that all surveys require some follow-up to achieve desirable response rates.40 Survey researchers, therefore, should develop procedures for monitoring responses and implement follow-up plans shortly after the survey begins.
Generally, 2 or 3 mailings are used to maximize response rates. Use of postcard reminders is an inexpensive, though untested, method of increasing response. Several randomized studies have reported an increase in response rates from physicians in private practice with the use of monetary incentives, although the optimum amount is debated. Everett et al41 compared the use of a $1 incentive vs no monetary incentive and found a significant increase in the incentive group (response rates: 63% in the $1 group; 45% in the no-incentive group; P < .0001). Other studies have compared $2, $5, $10, $20, and $25 incentives and found that $2 or $5 incentives are most cost effective.42-45 Similar findings have been reported for physician surveys in other countries.31,46 In an assessment of incentives for enrollees in a health plan, a $2 incentive was more cost effective than a $5 incentive.47 A $1 incentive was as effective as $2 in significantly increasing the response rate in a low-income population.48 The quality of responses has not varied with the use of incentives, and there does not appear to be an incentive bias.
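For readers who wish to see how such a two-group comparison of response rates is typically tested, the sketch below applies a chi-square test to hypothetical group sizes; the cited studies’ actual counts are not reproduced here.

```python
# Minimal sketch (hypothetical group sizes, not the published counts): testing
# whether response rates differ between an incentive and a no-incentive group.
from scipy.stats import chi2_contingency

# Rows: $1 incentive group, no-incentive group; columns: responded, did not respond.
table = [[315, 185],   # 63% of a hypothetical 500 mailed with a $1 incentive
         [225, 275]]   # 45% of a hypothetical 500 mailed without an incentive

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.1f}, P = {p_value:.2g}")
```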
Use of a lottery also appears to increase response rates in both physicians and the lay public, although there are no studies comparing a lottery with a monetary incentive enclosed for all participants.31,49 Use of either certified or priority return mail appears to increase response rates, and may be more cost effective when used for the second mailing.45,48
Pilot testing
Though pilot testing is generally included in the development of a survey, it is often inadequately conducted (Figure F, Final Preparation). Frequently, investigators are eager to answer their research question, and pilot testing becomes synonymous with letting a few colleagues take a quick look and make a few comments. Table 2 illustrates a problem that could have been avoided with proper pilot testing.10 One of the questions in the survey asked about how time is allotted for faculty to pursue scholarly activities and research (Format A). Unfortunately, the question mixes 2 types of time in 1 question: extended time away from the institution (sabbatical and mini-sabbatical) and time in the routine schedule. This was confusing to respondents and could have been avoided by separating the content into 2 separate questions (Format B).
Investigators should consider carefully whom to include in the pilot testing. Not only should this include the project team and survey “experts”, but it should also include a sample of the target audience. Pilot testing among multiple groups provides feedback about the wording and clarity of questions, appropriateness of the questions for the target population, and the presence of redundant or unnecessary items.
Conclusions
One of the authors (C.R.W.) recently worked on her first questionnaire project. Among the many lessons she learned were the value of a team in providing assistance, the importance of considering whether the time spent on a particular activity makes it cost effective, and the need to be flexible depending on circumstances. She found that establishing good communication with the team cuts down on errors and wasted effort. Rewarding the team for all of their hard work improves morale and provides a positive model for future projects.
The mailed self-administered questionnaire is an important tool in primary care research. For family practice to continue its maturation as a research discipline, family practitioners need to be conversant in survey methodology and familiar with its pitfalls. We hope this primer, designed specifically for use in the family practice setting, will provide not only basic guidelines for novices but will also inspire further investigation.
Acknowledgments
The authors thank Laura Snell, MPH, for her thoughtful review of the manuscript. We also thank Olive Chen, PhD, for research assistance and Janice Rookstool for manuscript preparation.
1. Siebert C, Lipsett LF, Greenblatt J, Silverman RE. Survey of physician practice behaviors related to diabetes mellitus in the U.S. I. Design and methods. Diabetes Care 1993;16:759-64.
2. Weller AC. Editorial peer review: methodology and data collection. Bull Med Libr Assoc 1990;78:258-70.
3. Myerson S. Improving the response rates in primary care research. Some methods used in a survey on stress in general practice since the new contract (1990). Fam Pract 1993;10:342-6.
4. PsycINFO: your source for psychological abstracts. PsycINFO Web site. Available at: http://www.apa.org/psycinfo. Accessed April 11, 2002.
5. Converse JM, Presser S. Survey Questions: Handcrafting The Standardized Questionnaire. Quantitative Applications in the Social Sciences. Newbury Park, CA: Sage Publications; 1986.
6. Cox J. Your Opinion, Please!: How to Build the Best Questionnaires in the Field of Education. Thousand Oaks, CA: Corwin Press; 1996.
7. Fink A, ed. The Survey Kit. Thousand Oaks, CA: Sage Publications; 1995.
8. Fowler F. Survey Research Methods. Applied Social Research Methods Series. Newbury Park, CA: Sage Publications; 1991.
9. Fowler F. Improving Survey Questions. Applied Social Research Methods Series. Newbury Park, CA: Sage Publications; 1995.
10. Oeffinger KC, Roaten SP Jr, Ader DN, Buchanan RJ. Support and rewards for scholarly activity in family medicine: a national survey. Fam Med 1997;29:508-12.
11. Oeffinger KC, Snell LM, Foster BM, Panico KG, Archer RK. Diagnosis of acute bronchitis in adults: a national survey of family physicians. J Fam Pract 1997;45:402-9.
12. Oeffinger KC, Snell LM, Foster BM, Panico KG, Archer RK. Treatment of acute bronchitis in adults. A national survey of family physicians. J Fam Pract 1998;46:469-75.
13. Oeffinger KC, Eshelman DA, Tomlinson GE, Buchanan GR. Programs for adult survivors of childhood cancer. J Clin Oncol 1998;16:2864-7.
14. Robinson MK, DeHaven MJ, Koch KA. The effects of the patient self-determination act on patient knowledge and behavior. J Fam Pract 1993;37:363-8.
15. Murphee DD, DeHaven MJ. Does grandma need condoms: condom use among women in a family practice setting. Arch Fam Med 1995;4:233-8.
16. DeHaven MJ, Wilson GR, Murphee DD, Grundig JP. An examination of family medicine residency program director’s views on research. Fam Med 1997;29:33-8.
17. Smith GE, DeHaven MJ, Grundig JP, Wilson GR. African-American males and prostate cancer: assessing knowledge levels in the community. J Natl Med Assoc 1997;89:387-91.
18. DeHaven MJ, Wilson GR, O’Connor PO. Creating a research culture: what we can learn from residencies that are successful in research. Fam Med 1998;30:501-7.
19. Koch KA, DeHaven MJ, Robinson MK. Futility: it’s magic. Clinical Pulmonary Medicine 1998;5:358-63.
20. Rogers J. Family medicine research: a matter of values and vision. Fam Med 1995;27:180-1.
21. Hulley SB, Cummings S, eds. Designing Clinical Research: An Epidemiological Approach. Baltimore, MD: Williams & Wilkins; 1988.
22. Babbie E. Survey Research Methods. Belmont, CA: Wadsworth Publishing; 1973.
23. Dillman DA. Mail and Telephone Surveys: The Total Design Method. New York: John Wiley & Sons; 1978.
24. Carmines EG, Zeller R. Reliability and Validity Assessment. Quantitative Applications in the Social Sciences, 17. Newbury Park, CA: Sage Publications; 1979.
25. Preston CC, Colman AM. Optimal number of response categories in rating scales: reliability, validity, discriminating power, and respondent preferences. Acta Psychol (Amst) 2000;104:1-15.
26. Bandalos DL, Enders CK. The effects of non-normality and number of response categories on reliability. Appl Meas Ed 1996;9:151-60.
27. Cicchetti DV, Showalter D, Tyrer PJ. The effect of number of rating scale categories on levels of interrater reliability: a Monte Carlo investigation. Appl Psychol Meas 1985;9:31-6.
28. Nunnally JC. Psychometric Theory. New York: McGraw-Hill; 1967.
29. Likert R. A technique for the measurement of attitudes. Arch Psychol 1932;140:1-55.
30. Matell MS, Jacoby J. Is there an optimal number of alternatives for Likert scale items? Effects of testing time and scale properties. J Appl Psychol 1972;56:506-9.
31. Kalantar JS, Talley NJ. The effects of lottery incentive and length of questionnaire on health survey response rates: a randomized study. J Clin Epidemiol 1999;52:1117-22.
32. Yammarino FJ, Skinner SJ, Childers TL. Understanding mail survey response behavior: a meta-analysis. Public Opin Q 1991;55:613-39.
33. Bailey KD. Methods of Social Research. New York: The Free Press; 1994.
34. Backstrom CH, Hursh-Cesar G. Survey Research. 2nd ed. New York: John Wiley & Sons; 1981.
35. Babbie E. The Practice of Social Research. Belmont, CA: Wadsworth Publishing; 1989.
36. Fowler FJ. Survey Research Methods. Applied Social Research Methods, Volume 1. Newbury Park, CA: Sage Publications; 1988.
37. Hill A, Roberts J, Ewings P, Gunnell D. Non-response bias in a lifestyle survey. J Public Health Med 1997;19:203-7.
38. O’Neill TW, Marsden D, Silman AJ. Differences in the characteristics of responders and non-responders in a prevalence survey of vertebral osteoporosis. European Vertebral Osteoporosis Study Group. Osteoporos Int 1995;5:327-34.
39. Jones J. The effects of non-response on statistical inference. J Health Soc Policy 1996;8:49-62.
40. National Center for Education Statistics. Standard for achieving acceptable survey response rates, NCES Standard: II-04-92. 2001. Available at: http://www.nces.ed.gov/statprog/Stand11_04.asp. Last accessed April 11, 2002.
41. Everett SA, Price JH, Bedell AW, Telljohann SK. The effect of a monetary incentive in increasing the return rate of a survey to family physicians. Eval Health Prof 1997;20:207-14.
42. Asch DA, Christakis NA, Ubel PA. Conducting physician mail surveys on a limited budget. A randomized trial comparing $2 bill versus $5 bill incentives. Med Care 1998;36:95-9.
43. VanGeest JB, Wynia MK, Cummins DS, Wilson IB. Effects of different monetary incentives on the return rate of a national mail survey of physicians. Med Care 2001;39:197-201.
44. Tambor ES, Chase GA, Faden RR, Geller G, Hofman KJ, Holtzman NA. Improving response rates through incentive and follow-up: the effect on a survey of physicians’ knowledge of genetics. Am J Public Health 1993;83:1599-603.
45. Kasprzyk D, Montano DE, St Lawrence JS, Phillips WR. The effects of variations in mode of delivery and monetary incentive on physicians’ responses to a mailed survey assessing STD practice patterns. Eval Health Prof 2001;24:3-17.
46. Deehan A, Templeton L, Taylor C, Drummond C, Strang J. The effect of cash and other financial inducements on the response rate of general practitioners in a national postal study. Br J Gen Pract 1997;47(415):87-90.
47. Shaw MJ, Beebe TJ, Jensen HL, Adlis SA. The use of monetary incentives in a community survey: impact on response rates, data quality, and cost. Health Serv Res 2001;35:1339-46.
48. Gibson PJ, Koepsell TD, Diehr P, Hale C. Increasing response rates for mailed surveys of Medicaid clients and other low-income populations. Am J Epidemiol 1999;149:1057-62.
49. Baron G, De Wals P, Milord F. Cost-effectiveness of a lottery for increasing physicians’ responses to a mail survey. Eval Health Prof 2001;24:47-52.
Address correspondence to Cristen R. Wall, MD, The University of Texas Southwestern Medical Center, Department of Family Practice and Community Medicine, 6263 Harry Hines Boulevard, Dallas, TX 75390-9067. E-mail: Cristen.Wall@UTSouthwestern.edu.
Research using self-developed questionnaires is a popular study design in family practice and is frequently used for gathering data on knowledge, beliefs, attitudes, and behaviors. A Medline literature search from 1966 to 2000 identified 53,101 articles related to questionnaires, of which 2088 were directly related to family practice. Despite the large number of questionnaire-related articles, however, only 2 in the general medical literature1,2 and 1 in the family practice literature3 were directly related to research methodology.
To obtain guidance on survey research methodology, novice family practice researchers often must go through volumes of information by specialists in other disciplines. For example, a search of a psychology database (PsychInfo)4 from 1966 to 2000 produced 45 articles about questionnaire methodology. The goal of this article is to synthesize pertinent survey research methodology tenets-from other disciplines as well as from family practice-in a manner that is meaningful to novice family practice researchers as well as to research consumers. This article is not aimed at answering all questions, but rather is meant to serve as a general guideline for those with little formal research training who seek guidance in developing and administering questionnaires.
Avoiding common pitfalls in survey research
Although constructing a questionnaire is not exceedingly complex, simple mistakes can be avoided by following some basic rules and guidelines. The Figure is a checklist for conducting a survey research project that combines guidelines and suggestions from published survey research literature,5-9 and the cumulative experience of the authors. Two of the authors (M.J.D. and K.C.O.) are experienced survey researchers who have published, in peer-reviewed journals, numerous studies that used questionnaires.10-19 One of the authors (MJD) has been teaching research to residents and junior faculty for over a decade, and has been an advisor on scores of resident, student, and faculty research projects. The perspective of the novice researcher is represented by 1 author (C.R.W.).
Getting started
The “quick and dirty” approach is perhaps the most common pitfall in survey research. Because of the ease of administration and the relatively low cost of survey research, questionnaires can be developed and administered quickly. The researcher, however, should be sure to consider whether or not a survey is the most appropriate method to answer a research question. Adequate time must be given to thoroughly searching the relevant literature, developing and focusing on an appropriate research question, and defining the target population for the study (see Figure A, Getting Started). Large, multisite surveys are more likely to be generalizeable and to be published in peer-reviewed journals.
One way to avoid undertaking a project too rapidly and giving inadequate attention to the survey research process is for novice researchers to avoid independent research. Those with little or no experience must realize that researchers in both family practice and other fields perform research in teams, with the various participants bringing specific skills to the process.20 Oversights, mistakes, and biases in the design of questionnaires can always occur, whether a researcher is working independently or as a member of a team. It seems reasonable to assume, however, that significant problems are much less likely to occur when a multidisciplinary team approach is involved rather than an individual researcher undertaking a study independently.
Ideally, a research team should include a statistician, a professional with experience in the content areas of the study, and a senior investigator.21 The desirable area of expertise, however, is often not readily available to family physicians, especially those in community-based settings. Individuals with some training in research who are interested in being involved can usually be found in colleges and universities, hospitals, and at the local public health department. Psychologists, sociologists, health services researchers, public health epidemiologists, and nursing educators are all potential resources and possible collaborators. Establishing the necessary relationships to form an ad hoc research team is certainly more time and labor intensive than undertaking research independently, but generally results in the collection of more useful information.
Novices should consult survey methodology books before and during the study.5-9 Excellent resources are available that provide a comprehensive overview of survey methods,22 means for improving response rates,23 and methods for constructing relatively brief but thorough survey questions.5 Academic family practice fellowships often provide training in survey methodology. In addition, many family practice researchers respond favorably to requests for information or advice requested by telephone or email contact. The novice author of this article reports excellent success in contacting experts in this manner. With the advent of the Internet, a “cyberspace” team comprised of experts in the topic and the methodology is a reasonable and helpful option for the novice.
Survey content and structure
Novice researchers often assume that developing a questionnaire is an intuitive process attainable by virtually anyone, regardless of their level of research training. While it is true that questionnaires are relatively simple to construct, developing an instrument that is valid and reliable is not intuitive. An instrument is valid if it actually measures what we think it is measuring, and it is reliable if it measures the phenomenon consistently in repeated applications.24 By following a few basic guidelines, those with limited research training can develop survey instruments capable of producing valid and reliable information. The 3 primary concerns for developing appropriate questions (items) are: (1) response format; (2) content; and (3) wording and placement (see Figure B, Survey Questions; and Figure C, Designing and Formatting the Survey).
Format
Questionnaires generally use a closed-ended format rather than an open-ended format. Closed formats spell out response options instead of asking study subjects to respond in their own words. Although there are many reasons for using closed formats, their primary advantages over open formats is that they are more specific and provide the same frame of reference to all respondents, and they allow quantitative analysis. A disadvantage is that they limit the possible range of responses envisioned by the investigators. Questionnaires with closed formats are therefore not as helpful as qualitative methods in the early, exploratory phases of a research project.
Closed-ended items can be formatted into several different categories (classes) of measurement, based on the relationship of the response categories to one another. Nominal measurements are responses that are sorted into unordered categories, such as demographic variables (ie, sex, ethnicity). Ordinal measurements are similar to nominal, except that there is a definite order to the categories. For example, ordinal items may ask respondents to rank their preferences among a list of options from the least desirable to the most desirable.
Survey items that ask for respondents(delete apostrophe) to rank order preferences are often a more useful than items that state, “check all that apply.” While checking all relevant responses may be necessary for certain items, such questions often lose valuable information as they can only supply raw percentages without supplying any comparison between responses. If a survey uses a rank order response, it enables determining the relative importance of the different categories during data analysis Table 1.
Two additional tools used on questionnaires are continuous variables and scales. Continuous variables can be simple counts (eg, the number of times something occurred) or physical attributes (eg, age or weight). A general rule when collecting information on continuous variables is to avoid obtaining the information in ranges of categories unless absolutely necessary. Response categories that reflect ranges of responses can always be constructed after the information is gathered, but if the information is gathered in ranges from the start, it cannot later be expanded to reflect specific values.
Scales are used by survey researchers to assess the intensity of respondents’ attitudes about a specific issue or issues. Likert scales are probably the best known and most widely used for measuring attitudes. These scales typically present respondents with a statement and ask them to indicate whether they “strongly agree,” “agree,” “neither agree nor disagree,” “disagree,” or “strongly disagree.” The wording of the response categories can be changed to reflect other concepts (eg, approval or disapproval), and the standard 5-response format can be expanded or abbreviated if necessary.
There are no hard and fast rules for determining the number of response categories to use for scaled items, or whether to use a neutral category or one that reflects uncertainty. Research indicates that the reliability of respondents’ ratings declines when using more than 9 rating scale points.25 However, the reliability of a scale increases when the number of rating scale points is increased, with maximum benefit achieved with 5 or 7 scale points.25,26 Since the objective of using scales is to gauge respondent’s preferences, it is sometimes argued that a middle point or category of uncertainty category should not be used. Odd-numbered rating scales, however, conform better with the underlying tenets of many statistical tests, suggesting the need for including this category.29 As the number of rating scale points increases, respondents’ use of the midpoint category decreases substantially. 30 Thus, based on the available literature, it is generally advisable to use between 5 and 7 response categories and an uncertainty category, unless there is a compelling reason to force respondents to choose between 2 competing perspectives or alternatives.
Content
Items should not be included on questionnaires when the only justification for inclusion is that the investigator feels the information “would be really interesting to know.” Rather, for each item, you should ask yourself how it addresses the study’s research question and how it will be used in the data analysis stage of the study. Researchers should develop a data analysis plan in advance of administering a questionnaire to determine exactly how each question will be used in the analysis. When the relationship between a particular item and the study’s research question is unclear, or it is not known how an item will be used in the analysis, the item should be removed from the questionnaire.
Wording and placement
The wording of questions should be kept simple, regardless of the education level of the respondents. Questions should be kept as short and direct as possible since shorter surveys tend to have higher response rates.31,32 Each question should be scrutinized to ensure it is appropriate for the respondents and does not require or assume an inappropriate level of knowledge about a topic. Since first impressions are important for setting the tone of a questionnaire, never begin with sensitive or threatening questions.33 Questionnaires should begin with simple, introductory (“warm-up”)“questions to help establish trust and an appropriate frame of mind for respondents.34 Other successful strategies are: (1) when addressing multiple topics, insert an introductory statement immediately preceding each topic (eg, “In the next section we would like to ask you about …”); (2) request demographic information at the end of the questionnaire; and (3) always provide explicit instructions to avoid any confusion on the part of respondents.35
Additional, clear information on survey content and structure is available in 2 books from Sage Publications.5,36 By following simple guidelines and common sense, most family practice researchers can construct valid and reliable questionnaires. As a final safeguard, once a final draft of the questionnaire is completed, the researcher should always be the first respondent. By placing yourself in the respondent’s role and taking the time to think about and respond to each question, problems with the instrument that were overlooked are sometimes identified.
Analyzing surveys
It is not within the scope of this project to address statistical analysis of survey data. Before attempting data analysis, investigators should receive appropriate training or consult with a qualified professional. There are 3 topics that can and should be understood by novice researchers related to data analysis (Figure D, Developing a Framework for Analysis).
Coding
Before analyzing survey data it is necessary to assign numbers (codes) to the responses obtained. Since the computer program that is used for analyzing data does not know what the numbers mean, the researcher assigns meaning to the codes so that the results can be interpreted correctly. Coding refers to the process of developing the codes, assigning them to responses, and documenting the decision rules used for assigning specific codes to specific response categories. For example, almost all questionnaires contain missing values when respondents elect to not answer an item. Unique codes need to be assigned to distinguish between an item’s missing values, items that may not be applicable to a particular respondent, and responses that have a “none” or “no opinion” category.
Data can be entered into appropriate data files once codes have been assigned to responses and a codebook compiled that defines the codes and their corresponding response categories. It is important to ensure that the data are free of errors (are clean) prior to performing data analysis. Although many methods can be used for data cleaning (ie, data can be entered twice and results compared consistency), at a minimum all of the codes should be checked to ensure only legitimate codes appear.
Frequency distributions are tables produced by statistical software that display the number of respondents in each response category for each item (variable) used in the analysis. By carefully examining frequency tables, the researcher can check for illegitimate codes. Frequency tables also display the relative distribution of responses and allow identification of items that do not conform to expectations given what is known about the study population.
Sample size
Since it is usually not possible to study all of the members of the group (population) of interest in a study, a subset (sample) of the population is generally selected for study from the sampling frame. Sampling is the process by which study subjects are selected from the target population, while the sample frame is the members of a population who have a chance of being included in the survey. In probability samples, each member of the sampling frame has a known probability of being selected for the study, whereas in nonprobability samples, the probability of selection is unknown. When a high degree of precision in sampling is needed to unambiguously identify the magnitude of a problem in a population or the factors that cause the problem, then probability sampling techniques must be used.
When conducting an analytical study that examines precisely whether statistically significant differences exist between groups in a population, power analysis is used to determine what size sample is needed to detect the differences. Estimates of sample size based on power are inversely related to the expected size of the differences “(effect size)”-that is, detecting smaller differences requires a larger sample. If an analytical study is undertaken to determine the magnitude of the differences between 2 groups, it is necessary to work with a statistician or other methodology expert to perform the appropriate power analysis. For a basic but valuable description of sample size estimation, see chapter 13 of Hulley and Cummings.21
In contrast to analytical studies, exploratory and descriptive studies can frequently be conducted without the need for a power analysis. While some descriptive studies may require the use of probability techniques and precise sample estimates, this often is not the case for studies that establish the existence of a problem or estimating its dimensions. When conducting an exploratory or descriptive study using a survey design and a nonprobability sampling technique, considerations other than effect size or precision are used to determine sample size. For example, the availability of eligible respondents, limitations of time and resources, and the need for pilot study data can all contribute to selecting a nonprobability sample. When these types of sampling techniques are used, however, it is important to remember that the validity and reliability of the findings are not assured, and the findings cannot be used to demonstrate the existence of differences between groups. The findings of these types of studies are only suggestive and have limited application beyond the specific study setting.
Response rate
The response rate is a measure indicating the percentage of the identified sample that completed and returned the questionnaire. It is calculated by dividing the number of completed questionnaires by the total sample size identified for the study. For example, if a study is mailed to 500 physicians questionnaires and 100 returned a completed questionnaire, the response rate would be 20% (100/500).
The response rate for mailed questionnaires is extremely variable. Charities are generally content with a 1% to 3% response rate, the US Census Bureau expects to achieve a 99% rate, and among the general population, a 10% response rate is not uncommon. Although an 80% response rate is possible from an extremely motivated population, a rate of 70% is generally considered excellent.34
The effect of nonresponse on the results of a survey depend on the degree to which those not responding are systematically different from the population from which they are drawn.24 When the response rate is high (ie, 95%), the results obtained from the sample will likely provide accurate information about the target population (sampling frame) even if the nonrespondents are distinctly different. However, if nonrespondents differ in a systematic way from the target population and the response rate is low, bias in how much the survey results accurately reflect the true characteristics of the target population is likely.
When calculating the response rate, participants who have died or retired can be removed from the denominator as appropriate. Nonrespondents, however, who refuse to participate, do not return the survey, or have moved should be included. This bias tends to be more problematic in “sensitive” areas of research37 than in studies of common, nonthreatening topics.38 Imputing values for missing data from nonrespondents is complex and generally should not be undertaken.39
Given the importance of response rate, every effort must be made to obtain as many completed questionnaires as possible and strategies to maximize the response rate should be integrated into the study design (see Dillman23 for a useful discussion of successful strategies). Some simple means for improving response rates include constructing a short questionnaire, sending a well-written and personalized cover letter containing your signature, and emphasizing the importance of the study and the confidentiality of responses. It is also advisable to include a self-addressed, stamped envelope for return responses, and sometimes a small incentive is worthwhile. The National Center for Education Statistics notes that all surveys require some follow-up to achieve desirable response rates.40 Survey researchers, therefore, should develop procedures for monitoring responses and implement follow-up plans shortly after the survey begins.
Generally, 2 or 3 mailings are used to maximize response rates. Use of post card reminders is an inexpensive, though untested, method to increase response. Several randomized studies have reported an increase in response rate from physicians in private practice with the use of monetary incentives, although the optimum amount is debated. Everett et al40 compared the use of a $1 incentive vs no monetary incentive and found a significant increase with the incentive group (response rates: 63% in the $1 group; 45% in the no incentive group; P < .0001).41 Other studies have compared $2, $5, $10, $20, and $25 incentives and found that $2 or $5 incentives are most cost effective.4245 Similar findings have been reported for physician surveys in other countries.31,46 In an assessment of incentive for enrollees in a health plan, a $2 incentive was more cost effective than a $5 incentive.47 A $1 incentive was as effective as $2 in significantly increasing response rate in a low-income population.48 Quality of responses have not varied by use of incentives and there does not appear to be an incentive-bias.
Use of lottery appears to also increase response rate in both physicians and the lay public, although there are no studies comparing lottery to a monetary incentive enclosed for all participants.31,49 Use of either certified or priority return mail appears to increase response rates, and may be more cost effective when used for the second mailing.45,48
Pilot testing
Ideally, a research team should include a statistician, a professional with experience in the content areas of the study, and a senior investigator.21 This expertise, however, is often not readily available to family physicians, especially those in community-based settings. Individuals with some research training who are interested in being involved can usually be found in colleges and universities, hospitals, and local public health departments. Psychologists, sociologists, health services researchers, public health epidemiologists, and nursing educators are all potential resources and possible collaborators. Establishing the relationships needed to form an ad hoc research team is certainly more time- and labor-intensive than undertaking research independently, but it generally results in the collection of more useful information.
Novices should consult survey methodology books before and during the study.5-9 Excellent resources are available that provide a comprehensive overview of survey methods,22 means for improving response rates,23 and methods for constructing relatively brief but thorough survey questions.5 Academic family practice fellowships often provide training in survey methodology. In addition, many family practice researchers respond favorably to requests for information or advice made by telephone or e-mail. The novice author of this article reports excellent success in contacting experts in this manner. With the advent of the Internet, a “cyberspace” team composed of experts in the topic and the methodology is a reasonable and helpful option for the novice.
Survey content and structure
Novice researchers often assume that developing a questionnaire is an intuitive process attainable by virtually anyone, regardless of their level of research training. While it is true that questionnaires are relatively simple to construct, developing an instrument that is valid and reliable is not intuitive. An instrument is valid if it actually measures what we think it is measuring, and it is reliable if it measures the phenomenon consistently in repeated applications.24 By following a few basic guidelines, those with limited research training can develop survey instruments capable of producing valid and reliable information. The 3 primary concerns for developing appropriate questions (items) are: (1) response format; (2) content; and (3) wording and placement (see Figure B, Survey Questions; and Figure C, Designing and Formatting the Survey).
Format
Questionnaires generally use a closed-ended format rather than an open-ended format. Closed formats spell out response options instead of asking study subjects to respond in their own words. Although there are many reasons for using closed formats, their primary advantages over open formats are that they are more specific, provide the same frame of reference to all respondents, and allow quantitative analysis. A disadvantage is that they limit the possible range of responses to those envisioned by the investigators. Questionnaires with closed formats are therefore not as helpful as qualitative methods in the early, exploratory phases of a research project.
Closed-ended items can be formatted into several different categories (classes) of measurement, based on the relationship of the response categories to one another. Nominal measurements are responses that are sorted into unordered categories, such as demographic variables (eg, sex, ethnicity). Ordinal measurements are similar to nominal, except that there is a definite order to the categories. For example, ordinal items may ask respondents to rank their preferences among a list of options from least desirable to most desirable.
Survey items that ask respondents to rank order their preferences are often more useful than items that state, “check all that apply.” While checking all relevant responses may be necessary for certain items, such questions often lose valuable information because they supply only raw percentages without permitting any comparison between responses. A rank order response, in contrast, allows the relative importance of the different categories to be determined during data analysis (Table 1).
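As a concrete illustration of the difference, the short sketch below (hypothetical data and item names, written in Python with the pandas library) contrasts what can be learned from a rank-order item with what a “check all that apply” item yields.

```python
# Illustrative sketch (not from the article): rank-order item vs "check all
# that apply" item, using hypothetical responses about barriers to research.
import pandas as pd

# Each row is one respondent's ranking of 3 barriers (1 = most important).
ranks = pd.DataFrame({
    "lack_of_time":    [1, 1, 2, 1, 3],
    "lack_of_funding": [2, 3, 1, 2, 1],
    "lack_of_mentors": [3, 2, 3, 3, 2],
})

# Mean rank preserves the relative importance of each option.
print(ranks.mean().sort_values())

# A check-all-that-apply item yields only raw endorsement percentages, with no
# information about which barrier respondents consider most important.
checked = ranks.le(2)          # suppose respondents "check" their top 2 choices
print(checked.mean() * 100)    # percentage endorsing each option
```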
Two additional tools used on questionnaires are continuous variables and scales. Continuous variables can be simple counts (eg, the number of times something occurred) or physical attributes (eg, age or weight). A general rule when collecting information on continuous variables is to avoid obtaining the information in ranges of categories unless absolutely necessary. Response categories that reflect ranges of responses can always be constructed after the information is gathered, but if the information is gathered in ranges from the start, it cannot later be expanded to reflect specific values.
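The following minimal sketch (hypothetical ages, using the pandas library) shows why collecting exact values is preferable: ranges can always be derived afterward, but data collected only as ranges cannot later be expanded into exact values.

```python
# Minimal sketch: collect age as an exact value, then derive ranges later.
import pandas as pd

age = pd.Series([29, 34, 41, 47, 52, 58, 63])          # exact values as collected
age_group = pd.cut(age, bins=[0, 39, 49, 59, 120],
                   labels=["<40", "40-49", "50-59", "60+"])
print(age_group.value_counts().sort_index())
```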
Scales are used by survey researchers to assess the intensity of respondents’ attitudes about a specific issue or issues. Likert scales are probably the best known and most widely used for measuring attitudes. These scales typically present respondents with a statement and ask them to indicate whether they “strongly agree,” “agree,” “neither agree nor disagree,” “disagree,” or “strongly disagree.” The wording of the response categories can be changed to reflect other concepts (eg, approval or disapproval), and the standard 5-response format can be expanded or abbreviated if necessary.
There are no hard and fast rules for determining the number of response categories to use for scaled items, or whether to include a neutral category or one that reflects uncertainty. Research indicates that the reliability of respondents’ ratings declines when more than 9 rating scale points are used.25 However, the reliability of a scale increases as the number of rating scale points increases, with maximum benefit achieved with 5 or 7 scale points.25,26 Since the objective of using scales is to gauge respondents’ preferences, it is sometimes argued that a midpoint or uncertainty category should not be used. Odd-numbered rating scales, however, conform better with the underlying tenets of many statistical tests, suggesting the need to include this category.29 As the number of rating scale points increases, respondents’ use of the midpoint category decreases substantially.30 Thus, based on the available literature, it is generally advisable to use between 5 and 7 response categories and to include an uncertainty category, unless there is a compelling reason to force respondents to choose between 2 competing perspectives or alternatives.
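A conventional way to code a 5-point Likert item for analysis is sketched below with hypothetical items and responses; the specific numeric values are arbitrary so long as they preserve the order of the categories.

```python
# Sketch of conventional 5-point Likert coding (hypothetical items and data).
import pandas as pd

likert = {"strongly disagree": 1, "disagree": 2, "neither agree nor disagree": 3,
          "agree": 4, "strongly agree": 5}

responses = pd.DataFrame({
    "item1": ["agree", "strongly agree", "neither agree nor disagree"],
    "item2": ["disagree", "agree", "agree"],
})

coded = responses.apply(lambda col: col.map(likert))
# A simple summated scale score: the mean of the coded items for each respondent.
print(coded.mean(axis=1))
```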
Content
Items should not be included on questionnaires when the only justification for inclusion is that the investigator feels the information “would be really interesting to know.” Rather, for each item, you should ask yourself how it addresses the study’s research question and how it will be used in the data analysis stage of the study. Researchers should develop a data analysis plan in advance of administering a questionnaire to determine exactly how each question will be used in the analysis. When the relationship between a particular item and the study’s research question is unclear, or it is not known how an item will be used in the analysis, the item should be removed from the questionnaire.
Wording and placement
The wording of questions should be kept simple, regardless of the education level of the respondents. Questions should be kept as short and direct as possible, since shorter surveys tend to have higher response rates.31,32 Each question should be scrutinized to ensure it is appropriate for the respondents and does not require or assume an inappropriate level of knowledge about a topic. Since first impressions are important for setting the tone of a questionnaire, never begin with sensitive or threatening questions.33 Questionnaires should begin with simple, introductory (“warm-up”) questions to help establish trust and an appropriate frame of mind for respondents.34 Other successful strategies are: (1) when addressing multiple topics, insert an introductory statement immediately preceding each topic (eg, “In the next section we would like to ask you about …”); (2) request demographic information at the end of the questionnaire; and (3) always provide explicit instructions to avoid any confusion on the part of respondents.35
Additional clear information on survey content and structure is available in 2 books from Sage Publications.5,36 By following simple guidelines and common sense, most family practice researchers can construct valid and reliable questionnaires. As a final safeguard, once a final draft of the questionnaire is completed, the researcher should always be the first respondent. By placing yourself in the respondent’s role and taking the time to think about and respond to each question, you can sometimes identify problems with the instrument that were previously overlooked.
Analyzing surveys
It is not within the scope of this article to address the statistical analysis of survey data. Before attempting data analysis, investigators should receive appropriate training or consult with a qualified professional. There are, however, 3 topics related to data analysis that novice researchers can and should understand (Figure D, Developing a Framework for Analysis).
Coding
Before analyzing survey data it is necessary to assign numbers (codes) to the responses obtained. Since the computer program that is used for analyzing data does not know what the numbers mean, the researcher assigns meaning to the codes so that the results can be interpreted correctly. Coding refers to the process of developing the codes, assigning them to responses, and documenting the decision rules used for assigning specific codes to specific response categories. For example, almost all questionnaires contain missing values when respondents elect to not answer an item. Unique codes need to be assigned to distinguish between an item’s missing values, items that may not be applicable to a particular respondent, and responses that have a “none” or “no opinion” category.
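The sketch below illustrates the idea with a hypothetical item and arbitrary code values; the essential point is that legitimate answers, “none/no opinion,” “not applicable,” and truly missing responses each receive a distinct, documented code.

```python
# Sketch of a simple codebook (hypothetical item and code values).
CODEBOOK = {
    "protected_time": {          # hypothetical item name
        1: "Yes, weekly protected time",
        2: "No protected time",
        7: "None / no opinion",
        8: "Not applicable to respondent",
        9: "Missing (item left blank)",
    }
}

def code_response(raw):
    """Map a raw questionnaire answer to its numeric code."""
    if raw is None or raw == "":
        return 9                      # missing value
    mapping = {"yes": 1, "no": 2, "none": 7, "n/a": 8}
    return mapping[raw.strip().lower()]

print([code_response(r) for r in ["Yes", "no", "", "N/A", "none"]])
```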
Data can be entered into appropriate data files once codes have been assigned to responses and a codebook has been compiled that defines the codes and their corresponding response categories. It is important to ensure that the data are free of errors (are clean) prior to performing data analysis. Although many methods can be used for data cleaning (eg, data can be entered twice and the results compared for consistency), at a minimum all of the codes should be checked to ensure that only legitimate codes appear.
Frequency distributions are tables produced by statistical software that display the number of respondents in each response category for each item (variable) used in the analysis. By carefully examining frequency tables, the researcher can check for illegitimate codes. Frequency tables also display the relative distribution of responses and allow identification of items that do not conform to expectations given what is known about the study population.
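The following sketch (hypothetical data and codes, using pandas) combines the two checks described above: flagging any value that is not a legitimate code and examining the frequency distribution of each item.

```python
# Sketch of two basic cleaning checks on hypothetical coded data.
import pandas as pd

data = pd.DataFrame({"protected_time": [1, 2, 9, 2, 3, 1, 8]})   # 3 is not a valid code
valid_codes = {"protected_time": {1, 2, 7, 8, 9}}

for item, codes in valid_codes.items():
    bad = data.loc[~data[item].isin(codes), item]
    if not bad.empty:
        print(f"Illegitimate codes in {item}: {sorted(bad.unique())}")
    # Frequency distribution: number of respondents in each response category.
    print(data[item].value_counts(dropna=False).sort_index())
```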
Sample size
Since it is usually not possible to study all of the members of the group (population) of interest, a subset (sample) of the population is generally selected for study from the sampling frame. Sampling is the process by which study subjects are selected from the target population, while the sampling frame comprises the members of the population who have a chance of being included in the survey. In probability samples, each member of the sampling frame has a known probability of being selected for the study, whereas in nonprobability samples, the probability of selection is unknown. When a high degree of precision is needed to unambiguously identify the magnitude of a problem in a population, or the factors that cause the problem, probability sampling techniques must be used.
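As a simple illustration, the sketch below draws a simple random sample, one common probability sampling technique, from a hypothetical sampling frame; every member has the same, known probability of selection.

```python
# Sketch: simple random sample from a hypothetical sampling frame of 1000 physicians.
import random

random.seed(42)                                        # reproducible selection
frame = [f"physician_{i:04d}" for i in range(1000)]    # hypothetical sampling frame
sample = random.sample(frame, k=100)                   # sampled without replacement
print(len(sample), "selected; probability of selection =", 100 / len(frame))
```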
When conducting an analytical study that examines whether statistically significant differences exist between groups in a population, power analysis is used to determine what size sample is needed to detect the differences. Estimates of sample size based on power are inversely related to the expected size of the differences (effect size)-that is, detecting smaller differences requires a larger sample. If an analytical study is undertaken to determine the magnitude of the differences between 2 groups, it is necessary to work with a statistician or other methodology expert to perform the appropriate power analysis. For a basic but valuable description of sample size estimation, see chapter 13 of Hulley and Cummings.21
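For readers who want a feel for the arithmetic, the sketch below applies one standard textbook formula for comparing two proportions, with hypothetical inputs; it is not a substitute for consulting a statistician, and other study designs require different calculations.

```python
# Sketch: sample size per group for comparing two proportions (hypothetical inputs:
# expected prevalences of 40% vs 25%, two-sided alpha = .05, power = .80).
from math import ceil, sqrt
from scipy.stats import norm

p1, p2 = 0.40, 0.25
alpha, power = 0.05, 0.80
z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
p_bar = (p1 + p2) / 2

n_per_group = ((z_a * sqrt(2 * p_bar * (1 - p_bar))
                + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2) / (p1 - p2) ** 2
print(ceil(n_per_group))   # smaller expected differences drive this number up
```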
In contrast to analytical studies, exploratory and descriptive studies can frequently be conducted without a power analysis. While some descriptive studies may require probability techniques and precise sample estimates, this often is not the case for studies that establish the existence of a problem or estimate its dimensions. When conducting an exploratory or descriptive study using a survey design and a nonprobability sampling technique, considerations other than effect size or precision are used to determine sample size. For example, the availability of eligible respondents, limitations of time and resources, and the need for pilot study data can all contribute to selecting a nonprobability sample. When these types of sampling techniques are used, however, it is important to remember that the validity and reliability of the findings are not assured, and the findings cannot be used to demonstrate the existence of differences between groups. The findings of these types of studies are only suggestive and have limited application beyond the specific study setting.
Response rate
The response rate is the percentage of the identified sample that completed and returned the questionnaire. It is calculated by dividing the number of completed questionnaires by the total sample size identified for the study. For example, if questionnaires are mailed to 500 physicians and 100 return a completed questionnaire, the response rate is 20% (100/500).
The response rate for mailed questionnaires is extremely variable. Charities are generally content with a 1% to 3% response rate, the US Census Bureau expects to achieve a 99% rate, and among the general population, a 10% response rate is not uncommon. Although an 80% response rate is possible from an extremely motivated population, a rate of 70% is generally considered excellent.34
The effect of nonresponse on the results of a survey depends on the degree to which those not responding are systematically different from the population from which they are drawn.24 When the response rate is high (eg, 95%), the results obtained from the sample will likely provide accurate information about the target population (sampling frame) even if the nonrespondents are distinctly different. However, if nonrespondents differ in a systematic way from the target population and the response rate is low, the survey results are likely to be biased and may not accurately reflect the true characteristics of the target population.
When calculating the response rate, participants who have died or retired can be removed from the denominator as appropriate. However, nonrespondents who refuse to participate, do not return the survey, or have moved should be included. Nonresponse bias tends to be more problematic in “sensitive” areas of research37 than in studies of common, nonthreatening topics.38 Imputing values for missing data from nonrespondents is complex and generally should not be undertaken.39
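The arithmetic described above can be summarized in a few lines; the counts below are hypothetical.

```python
# Sketch of the response-rate calculation: deceased or retired subjects are removed
# from the denominator; refusals, non-returns, and those who moved are not.
def response_rate(completed, mailed, deceased_or_retired=0):
    eligible = mailed - deceased_or_retired
    return 100 * completed / eligible

print(response_rate(100, 500))                          # 20.0, as in the example above
print(response_rate(100, 500, deceased_or_retired=20))  # ~20.8 after adjusting denominator
```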
Given the importance of response rate, every effort must be made to obtain as many completed questionnaires as possible and strategies to maximize the response rate should be integrated into the study design (see Dillman23 for a useful discussion of successful strategies). Some simple means for improving response rates include constructing a short questionnaire, sending a well-written and personalized cover letter containing your signature, and emphasizing the importance of the study and the confidentiality of responses. It is also advisable to include a self-addressed, stamped envelope for return responses, and sometimes a small incentive is worthwhile. The National Center for Education Statistics notes that all surveys require some follow-up to achieve desirable response rates.40 Survey researchers, therefore, should develop procedures for monitoring responses and implement follow-up plans shortly after the survey begins.
Generally, 2 or 3 mailings are used to maximize response rates. Use of postcard reminders is an inexpensive, though untested, method of increasing response. Several randomized studies have reported an increase in response rate from physicians in private practice with the use of monetary incentives, although the optimum amount is debated. Everett et al41 compared a $1 incentive with no monetary incentive and found a significantly higher response rate in the incentive group (63% vs 45%; P < .0001). Other studies have compared $2, $5, $10, $20, and $25 incentives and found that $2 or $5 incentives are the most cost effective.42-45 Similar findings have been reported for physician surveys in other countries.31,46 In an assessment of incentives for enrollees in a health plan, a $2 incentive was more cost effective than a $5 incentive.47 A $1 incentive was as effective as $2 in significantly increasing the response rate in a low-income population.48 The quality of responses has not varied with the use of incentives, and there does not appear to be an incentive bias.
Use of a lottery also appears to increase response rates among both physicians and the lay public, although there are no studies comparing a lottery with a monetary incentive enclosed for all participants.31,49 Use of either certified or priority return mail appears to increase response rates and may be more cost effective when used for the second mailing.45,48
Pilot testing
Though pilot testing is generally included in the development of a survey, it is often inadequately conducted (Figure F, Final Preparation). Frequently, investigators are eager to answer their research question, and pilot testing becomes synonymous with letting a few colleagues take a quick look and make a few comments. Table 2 illustrates a problem that could have been avoided with proper pilot testing.10 One of the questions in the survey asked how time is allotted for faculty to pursue scholarly activities and research (Format A). Unfortunately, the question mixes 2 types of time in 1 question: extended time away from the institution (sabbatical and mini-sabbatical) and time in the routine schedule. This was confusing to respondents and could have been avoided by separating the content into 2 separate questions (Format B).
Investigators should consider carefully whom to include in the pilot testing. It should include not only the project team and survey “experts,” but also a sample of the target audience. Pilot testing among multiple groups provides feedback about the wording and clarity of questions, the appropriateness of the questions for the target population, and the presence of redundant or unnecessary items.
Conclusions
One of the authors (C.R.W.) recently worked on her first questionnaire project. Among the many lessons she learned were the value of a team in providing assistance, the importance of considering whether the time spent on a particular activity is cost effective, and the need to be flexible depending on circumstances. She found that establishing good communication within the team cuts down on errors and wasted effort. Rewarding the team for all of their hard work improves morale and provides a positive model for future projects.
The mailed self-administered questionnaire is an important tool in primary care research. For family practice to continue its maturation as a research discipline, family practitioners need to be conversant in survey methodology and familiar with its pitfalls. We hope this primer-designed specifically for use in the family practice setting-will not only provide basic guidelines for novices but also inspire further investigation.
Acknowledgments
The authors thank Laura Snell, MPH, for her thoughtful review of the manuscript. We also thank Olive Chen, PhD, for research assistance and Janice Rookstool for manuscript preparation.
1. Siebert C, Lipsett LF, Greenblatt J, Silverman RE. Survey of physician practice behaviors related to diabetes mellitus in the U.S. I. Design and methods. Diabetes Care 1993;16:759-64.
2. Weller AC. Editorial peer review: methodology and data collection. Bull Med Libr Assoc 1990;78:258-70.
3. Myerson S. Improving the response rates in primary care research. Some methods used in a survey on stress in general practice since the new contract (1990). Fam Pract 1993;10:342-6.
4. PsycINFO: your source for psychological abstracts. PsycINFO Web site. Available at: http://www.apa.org/psycinfo. Accessed April 11, 2002.
5. Converse JM, Presser S. Survey Questions: Handcrafting The Standardized Questionnaire. Quantitative Applications in the Social Sciences. Newbury Park, CA: Sage Publications; 1986.
6. Cox J. Your Opinion, Please!: How to Build the Best Questionnaires in the Field of Education. Thousand Oaks, CA: Corwin Press; 1996.
7. Fink A, ed. The Survey Kit. Thousand Oaks, CA: Sage Publications; 1995.
8. Fowler F. Survey Research Methods. Applied Social Research Methods Series. Newbury Park, CA: Sage Publications; 1991.
9. Fowler F. Improving Survey Questions. Applied Social Research Methods Series. Newbury Park, CA: Sage Publications; 1995.
10. Oeffinger KC, Roaten SP Jr, Ader DN, Buchanan RJ. Support and rewards for scholarly activity in family medicine: a national survey. Fam Med 1997;29:508-12.
11. Oeffinger KC, Snell LM, Foster BM, Panico KG, Archer RK. Diagnosis of acute bronchitis in adults: a national survey of family physicians. J Fam Pract 1997;45:402-9.
12. Oeffinger KC, Snell LM, Foster BM, Panico KG, Archer RK. Treatment of acute bronchitis in adults. A national survey of family physicians. J Fam Pract 1998;46:469-75.
13. Oeffinger KC, Eshelman DA, Tomlinson GE, Buchanan GR. Programs for adult survivors of childhood cancer. J Clin Oncol 1998;16:2864-7.
14. Robinson MK, DeHaven MJ, Koch KA. The effects of the patient self-determination act on patient knowledge and behavior. J Fam Pract 1993;37:363-8.
15. Murphee DD, DeHaven MJ. Does grandma need condoms: condom use among women in a family practice setting. Arch Fam Med 1995;4:233-8.
16. DeHaven MJ, Wilson GR, Murphee DD, Grundig JP. An examination of family medicine residency program director’s views on research. Fam Med 1997;29:33-8.
17. Smith GE, DeHaven MJ, Grundig JP, Wilson GR. African-American males and prostate cancer: assessing knowledge levels in the community. J Natl Med Assoc 1997;89:387-91.
18. DeHaven MJ, Wilson GR, O’Connor PO. Creating a research culture: what we can learn from residencies that are successful in research. Fam Med 1998;30:501-7.
19. Koch KA, DeHaven MJ, Robinson MK. Futility: it’s magic. Clinical Pulmonary Medicine 1998;5:358-63.
20. Rogers J. Family medicine research: a matter of values and vision. Fam Med 1995;27:180-1.
21. Hulley SB, Cummings S, eds. Designing Clinical Research: An Epidemiological Approach. Baltimore, MD: Williams & Wilkins; 1988.
22. Babbie E. Survey research methods. Belmont, CA: Wadsworth Publishing; 1973.
23. Dillman DA. Mail and Telephone Surveys: The Total Design Method. New York: John Wiley & Sons; 1978.
24. Carmines EG, Zeller R. Reliability and Validity Assessment. Quantitative Applications in the Social Sciences, 17. Newbury Park, CA: Sage Publications; 1979.
25. Preston CC, Colman AM. Optimal number of response categories in rating scales: reliability, validity, discriminating power, and respondent preferences. Acta Psychol (Amst) 2000;104:1-15.
26. Bandalos DL, Enders CK. The effects of non-normality and number of response categories on reliability. Appl Meas Ed 1996;9:151-60.
27. Cicchetti DV, Showalter D, Tyrer PJ. The effect of number of rating scale categories on levels of interrater reliability: a Monte Carlo investigation. Appl Psychol Meas 1985;9:31-6.
28. Nunnally JC. Psychometric Theory. New York: McGraw-Hill; 1967.
29. Likert R. A technique for the measurement of attitudes. Arch Psychol 1932;140:55.
30. Matell MS, Jacoby J. Is there an optimal number of alternatives for Likert scale items? Effects of testing time and scale properties. J Appl Psychol 1972;56:506-9.
31. Kalantar JS, Talley NJ. The effects of lottery incentive and length of questionnaire on health survey response rates: a randomized study. J Clin Epidemiol 1999;52:1117-22.
32. Yammarino FJ, Skinner SJ, Childers TL. Understanding mail survey response behavior: a meta-analysis. Public Opin Q 1991;55:613-39.
33. Bailey KD. Methods of Social Research. New York: The Free Press; 1994.
34. Backstrom CH, Hursh-Cesar G. Survey Research. 2nd ed. New York: John Wiley & Sons; 1981.
35. Babbie E. The Practice of Social Research. Belmont, CA: Wadsworth Publishing; 1989.
36. Fowler FJ. Survey Research Methods. Applied Social Research Methods, Volume 1. Newbury Park, CA: Sage Publications; 1988.
37. Hill A, Roberts J, Ewings P, Gunnell D. Non-response bias in a lifestyle survey. J Public Health Med 1997;19:203-7.
38. O’Neill TW, Marsden D, Silman AJ. Differences in the characteristics of responders and non-responders in a prevalence survey of vertebral osteoporosis. European Vertebral Osteoporosis Study Group. Osteoporos Int 1995;5:327-34.
39. Jones J. The effects of non-response on statistical inference. J Health Soc Policy 1996;8:49-62.
40. National Center for Education Statistics. Standard for achieving acceptable survey response rates, NCES Standard: II-04-92. 2001. Available at: http://www.nces.ed.gov/statprog/Stand11_04.asp. Last accessed April 11, 2002.
41. Everett SA, Price JH, Bedell AW, Telljohann SK. The effect of a monetary incentive in increasing the return rate of a survey to family physicians. Eval Health Prof 1997;20:207-14.
42. Asch DA, Christakis NA, Ubel PA. Conducting physician mail surveys on a limited budget. A randomized trial comparing $2 bill versus $5 bill incentives. Med Care 1998;36:95-9.
43. VanGeest JB, Wynia MK, Cummins DS, Wilson IB. Effects of different monetary incentives on the return rate of a national mail survey of physicians. Med Care 2001;39:197-201.
44. Tambor ES, Chase GA, Faden RR, Geller G, Hofman KJ, Holtzman NA. Improving response rates through incentive and follow-up: the effect on a survey of physicians’ knowledge of genetics. Am J Public Health 1993;83:1599-603.
45. Kasprzyk D, Montano DE, St Lawrence JS, Phillips WR. The effects of variations in mode of delivery and monetary incentive on physicians’ responses to a mailed survey assessing STD practice patterns. Eval Health Prof 2001;24:3-17.
46. Deehan A, Templeton L, Taylor C, Drummond C, Strang J. The effect of cash and other financial inducements on the response rate of general practitioners in a national postal study. Br J Gen Pract 1997;47(415):87-90.
47. Shaw MJ, Beebe TJ, Jensen HL, Adlis SA. The use of monetary incentives in a community survey: impact on response rates, data quality, and cost. Health Serv Res 2001;35:1339-46.
48. Gibson PJ, Koepsell TD, Diehr P, Hale C. Increasing response rates for mailed surveys of Medicaid clients and other low-income populations. Am J Epidemiol 1999;149:1057-62.
49. Baron G, De Wals P, Milord F. Cost-effectiveness of a lottery for increasing physicians’ responses to a mail survey. Eval Health Prof 2001;24:47-52.
Address correspondence to Cristen R. Wall, MD, The University of Texas Southwestern Medical Center, Department of Family Practice and Community Medicine, 6263 Harry Hines Boulevard, Dallas, TX 75390-9067. E-mail: Cristen.Wall@UTSouthwestern.edu.
To submit a letter to the editor on this topic, e-mail jfp@fammed.uc.edu.