Verbal Labels
Verbal labels are words used as a reference to clarify the meaning of the different scale points. The design choices are: fully labelled, end-points and more points labelled, end- and midpoints labelled, end-points only labelled, and not labelled.
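The five design choices can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and does not come from the cited literature: the 7-point agreement wording, the design identifiers, the helper names `labelled_positions` and `render_scale`, and the reading of "end-points and more points labelled" as the end-points plus every second point in between.

```python
# Hypothetical wording for a 7-point agreement scale (illustrative only).
FULL_LABELS = [
    "Strongly disagree", "Disagree", "Somewhat disagree",
    "Neither agree nor disagree",
    "Somewhat agree", "Agree", "Strongly agree",
]


def labelled_positions(design, n_points):
    """Return the 0-based scale positions that carry a verbal label."""
    last = n_points - 1
    mid = n_points // 2  # a single midpoint exists only for odd n_points
    if design == "fully_labelled":
        return set(range(n_points))
    if design == "end_and_more_points":
        # Assumption: end-points plus every second point in between.
        return set(range(0, n_points, 2)) | {last}
    if design == "end_and_midpoints":
        return {0, mid, last}
    if design == "end_points_only":
        return {0, last}
    if design == "not_labelled":
        return set()
    raise ValueError(f"unknown design: {design}")


def render_scale(design, labels=FULL_LABELS):
    """Show each scale point as 'number: label', or the bare number if unlabelled."""
    keep = labelled_positions(design, len(labels))
    return [
        f"{i + 1}: {labels[i]}" if i in keep else str(i + 1)
        for i in range(len(labels))
    ]


if __name__ == "__main__":
    for design in ("fully_labelled", "end_and_more_points",
                   "end_and_midpoints", "end_points_only", "not_labelled"):
        print(design, render_scale(design))
```

With an even number of scale points there is no single midpoint, so the "end_and_midpoints" branch of this sketch is only meaningful for odd-length scales.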
Theoretical arguments
- Labels reduce ambiguity in translating subjective responses to scales’ options (Alwin 2007).*
- Verbal labels suffer from language ambiguity and are more complex to hold in memory, so labelling only the endpoints is less cognitively demanding than labelling every point. On the other hand, verbal labels are a more natural form of expression than numbers, and labelling all points can help to clarify the meaning of the numbers (Krosnick and Fabrigar 1997).*
- Verbal labels are advantageous because they clarify the meanings of the scale points while reducing the respondent burden (Krosnick and Presser 2010).*
- Labelling may increase the cognitive effort required to read and process all options, while clarifying their meaning (Kunz 2015).*
Empirical evidence on data quality
*DeCastellarnau, A. Qual Quant (2018) 52: 1523. doi: 10.1007/s11135-017-0533-4
- Full labelling significantly increases reliability compared with labelling only the endpoints. [Wiley-Wiley reliability] (Alwin 2007) → YES*
- Full labelling increases reliability [Proportion of variance attributed to true attitudes] (Alwin and Krosnick 1991) → YES*
- Data quality is below average with all categories labelled [MTMM validity, method effect and residual error] (Andrews 1984) → YES*
- Full labelling produces less extreme responses [Extreme response bias through distribution comparison] (Eutsler and Lang 2015) → YES*
- Full verbal labelling improves reliability [Item reliability] (Krosnick and Berent 1993) → YES*
- Fully labelled scales have higher reliabilities than when only the endpoints are labelled [Guttman’s lambda] (Menold et al. 2014) → YES*
- End labelling evokes more extreme responses [Extreme response bias through latent class factor] (Moors et al. 2014) → YES*
- Non-verbal alternatives have lower random error [MTMM construct validity] (Rodgers et al. 1992) → YES*
- The use of labels increases reliability significantly [True-score MTMM reliability and validity] (Saris and Gallhofer 2007) → YES*
- There are higher acquiescence and lower extreme scores when all categories are labelled [Acquiescence and Extreme response bias through log odds] (Weijters et al. 2010) → YES*
References
Alwin, D.F. (2007). Margins of Error: A Study of Reliability in Survey Measurement. Wiley, Hoboken
Alwin, D.F., Krosnick, J.A. (1991). The reliability of survey attitude measurement: the influence of question and respondent attributes. Sociol. Methods Res. 20, 139–181. doi: 10.1177/0049124191020001005
Andrews, F.M. (1984). Construct validity and error components of survey measures: a structural modelling approach. Public Opin. Q. 48, 409–442. doi: 10.1086/268840
Eutsler, J., Lang, B. (2015). Rating scales in accounting research: the impact of scale points and labels. Behav. Res. Acc. 27, 35–51. doi: 10.2308/bria-51219
Krosnick, J.A., Berent, M.K. (1993). Comparisons of party identifications and policy preferences: the impact of survey question format. Am. J. Pol. Sci. 37, 941–964. doi: 10.2307/2111580
Krosnick, J.A., Fabrigar, L.R. (1997). Designing rating scales for effective measurement in surveys. In: Lyberg, L.E., Biemer, P.P., Collins, M., De Leeuw, E.D., Dippo, C., Schwarz, N., Trewin, D. (eds.) Survey Measurement and Process Quality, pp. 141–164. Wiley, Hoboken.
Krosnick, J.A., Presser, S. (2010). Question and Questionnaire Design. In: Marsden, P.V., Wright, J.D. (eds.) Handbook of Survey Research, pp. 263–313. Emerald Group Publishing Limited, Bingley.
Kunz, T. (2015). Rating scales in Web surveys. A test of new drag-and-drop rating procedures. Technische Universität, Darmstadt [Ph.D. Thesis]
Menold, N., Kaczmirek, L., Lenzner, T., Neusar, A. (2014). How do respondents attend to verbal labels in rating scales? Field Methods 26, 21–39. doi: 10.1177/1525822X13508270
Moors, G., Kieruj, N.D., Vermunt, J.K. (2014). The effect of labeling and numbering of response scales on the likelihood of response bias. Sociol. Methodol. 44, 369–399. doi: 10.1177/0081175013516114
Rodgers, W.L., Andrews, F.M., Herzog, A.R. (1992). Quality of survey measures: a structural modeling approach. J. Off. Stat. 8, 251–275.
Saris, W.E., Gallhofer, I.N. (2007). Design, Evaluation, and Analysis of Questionnaires for Survey Research. Wiley, Hoboken
Weijters, B., Cabooter, E., Schillewaert, N. (2010). The effect of rating scale format on response styles: the number of response categories and response category labels. Int. J. Res. Mark. 27, 236–247. doi: 10.1016/j.ijresmar.2010.02.004