## Response Scales' Length

**The minimum and maximum possible values** are used to evaluate the length of continuous scales.

**The number of categories** is used to evaluate the length of categorical scales.

**Theoretical arguments**

- The optimal number of points in a scale should be considered in relation to the polarity of the scale (Alwin 2007).
- There is no single number of response alternatives for a scale which is appropriate under all circumstances (Cox III 1980).
- Choosing the optimal length is a complex decision: too few categories may compromise the information gathered, while too many may compromise the clarity of meaning (Krosnick and Fabrigar 1997).
- The optimal length of continuous scales depends on the size of the device screen (Reips and Funke 2008).
- More categories compromise discrimination and limit the capacity of respondents to make finer distinctions between the options (Schaeffer and Presser 2003).

**Empirical evidence on data quality**

- Reliabilities remained constant despite changing the number of categories [Internal consistency reliability] (Aiken 1983) → NO
- 11p scales are more reliable than 7p scales [True-score MTMM reliability] (Alwin 1997) → YES
- The use of 4p scales improves reliability in unipolar scales, while the reliability in bipolar scales is higher for 2, 3 and 5p and lowest for 7p [Wiley-Wiley reliability] (Alwin 2007) → YES
- There are no differences between AD scales with 2 and 5p; IS reliability increases from 3 to 9p, but there are no differences between 7 and 9p [Proportion of variance attributed to true attitudes] (Alwin and Krosnick 1991) → YES
- The number of categories has the biggest effect on data quality; more categories are better, but 3p is worse than 2p [MTMM validity, method effect and residual error] (Andrews 1984) → YES
- Reliability is independent of the number of scale categories [Test reliability] (Bendig 1954) → NO
- Reliability and validity are independent of the number of points [Test-retest reliability, concurrent validity and predictive validity] (Jacoby and Matell 1971) → NO
- Reliability increases with the number of points up to 6p [Cronbach alpha] (Komorita and Graham 1965) → YES
- Validity is higher for 7p and 11p scales than for 2p scales [Concurrent validity] (Lundmark et al. 2016) → YES
- Reliability is independent of the number of points [Internal consistency and test-retest reliability] (Matell and Jacoby 1971) → NO
- Validity is slightly better with 7p than with 11p; reliability is unaffected [Test-retest reliability and test validity] (McKelvie 1978) → NO
- Reliability is lower for 2, 3 and 4p and higher for 7, 8, 9 and 10p; it decreases with more than 10p [Test-retest reliability] (Preston and Colman 2000) → YES
- Using 11p positively affects the quality of IS scales [True-score MTMM reliability and validity] (Revilla and Ochoa 2015) → YES
- Quality does not improve with more than 5p for AD scales [True-score MTMM reliability and validity] (Revilla et al. 2014) → YES
- The number of points has the biggest effect on validity; using at least 5 to 7p yields better quality [MTMM construct validity] (Rodgers et al. 1992) → YES
- Reliability can be improved by using more categories (11p) without decreasing validity [True-score MTMM reliability and validity] (Saris and Gallhofer 2007) → YES
- The maximum value of a continuous scale has a significant effect on reliability or validity [True-score MTMM reliability and validity] (Saris and Gallhofer 2007) → YES
- Validity is highest with 4, 5 or 7p [True-score MTMM validity] (Scherpenzeel and Saris 1997) → YES
- 5 AD points reduce extreme response style [Extreme response style through log odds] (Weijters et al. 2010) → YES
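
Several of the studies above report internal consistency as Cronbach's alpha. As a rough illustration only, and not the procedure of any cited study, the sketch below computes alpha from a respondents-by-items table of scores. The function name `cronbach_alpha` and the sample responses are hypothetical, made up for this example.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item scores per respondent).

    Hypothetical helper for illustration; assumes complete data and at least two items.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")

    # Variance of each item (column) across respondents.
    item_variances = [pvariance(column) for column in zip(*responses)]

    # Variance of each respondent's summed score.
    total_variance = pvariance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up answers from five respondents to three items on a 5-point scale.
five_point = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 5, 4],
    [5, 4, 5],
]
alpha = cronbach_alpha(five_point)
print(round(alpha, 3))
```

Running the same function on responses collected with different numbers of scale points is one simple way to compare internal consistency across scale lengths; the MTMM and test-retest measures listed above require different designs and are not covered by this sketch.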



*DeCastellarnau, A. Qual Quant (2018) 52: 1523. doi: 10.1007/s11135-017-0533-4*

