Spearman–Brown prediction formula

The Spearman–Brown prediction formula, also known as the Spearman–Brown prophecy formula, relates psychometric reliability to test length. Psychometricians use it to predict the reliability of a test after its length has been changed.^[1] The method was published independently by Spearman (1910) and Brown (1910).^[2]^[3]

Calculation

Predicted reliability, ${\rho }_{xx'}^{*}$, is estimated as:

${\rho }_{xx'}^{*}={\frac {n{\rho }_{xx'}}{1+(n-1){\rho }_{xx'}}}$

where n is the number of "tests" combined (see below) and ${\rho }_{xx'}$ is the reliability of the current "test". The formula predicts the reliability of a new test composed by replicating the current test n times (or, equivalently, creating a test with n parallel forms of the current exam). Thus n = 2 implies doubling the exam length by adding items with the same properties as those in the current exam. Values of n less than one may be used to predict the effect of shortening a test.
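The prediction can be sketched in a few lines of Python. The function name `spearman_brown` and the starting reliability of 0.70 are illustrative assumptions, not part of the original formula's presentation.

```python
def spearman_brown(reliability: float, n: float) -> float:
    """Predicted reliability after changing test length by a factor of n.

    reliability: reliability of the current test.
    n: length factor (2 doubles the test, 0.5 halves it).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return n * reliability / (1 + (n - 1) * reliability)


# Hypothetical current test with reliability 0.70.
doubled = spearman_brown(0.70, 2)    # add parallel items to double the length
halved = spearman_brown(0.70, 0.5)   # drop half of the items

print(f"doubled: {doubled:.3f}, halved: {halved:.3f}")
```

Doubling raises the predicted reliability and halving lowers it, but neither change is proportional to the change in length.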

Forecasting test length

The formula can also be rearranged to predict the number of replications required to achieve a degree of reliability:

$n={\frac {{\rho }_{xx'}^{*}(1-{\rho }_{xx'})}{{\rho }_{xx'}(1-{\rho }_{xx'}^{*})}}$
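As a rough illustration, the rearranged formula can be turned into an item count. The function name, the 20-item test, and the reliabilities 0.70 and 0.90 are hypothetical values chosen for this sketch.

```python
import math


def required_length_factor(current: float, target: float) -> float:
    """Length factor n needed to move from `current` to `target` reliability."""
    if not (0 < current < 1 and 0 < target < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return target * (1 - current) / (current * (1 - target))


# Hypothetical 20-item test with reliability 0.70; desired reliability 0.90.
current_items = 20
n = required_length_factor(0.70, 0.90)

# Round up, because a fractional item cannot be administered.
items_needed = math.ceil(current_items * n)

print(f"length factor: {n:.3f}, items needed: {items_needed}")
```

The result assumes that every added item is parallel to the existing ones. With weaker items, more would be needed.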

Split-half reliability

Until the development of tau-equivalent reliability, split-half reliability using the Spearman-Brown formula was the only way to obtain inter-item reliability.^[4]^[5] After the items of a test are split into two arbitrary halves, the correlation between the split-halves can be converted into a reliability estimate by applying the Spearman-Brown formula with n = 2. That is,

${\rho }_{xx'}={\frac {2{\rho }_{12}}{1+{\rho }_{12}}}$

where ${\rho }_{12}$ is the Pearson correlation between the split-halves. Although the Spearman-Brown formula has rarely been used as a split-half reliability coefficient since the development of tau-equivalent reliability, it remains useful for two-item scales.^[6]
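A minimal sketch of the split-half procedure follows. The half-test scores are invented for illustration, and the helper `pearson` is written out by hand so that the example depends only on the standard library.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical half-test totals for six examinees (illustrative data only).
half1 = [10, 12, 9, 15, 11, 14]
half2 = [11, 13, 8, 14, 12, 15]

r12 = pearson(half1, half2)

# Step the half-test correlation up to full-test reliability (n = 2).
split_half_reliability = 2 * r12 / (1 + r12)

print(f"r12 = {r12:.3f}, reliability = {split_half_reliability:.3f}")
```

Because the correlation describes a test of half the length, the stepped-up value exceeds it whenever the correlation lies strictly between 0 and 1.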

Its relation to other split-half reliability coefficients

Split-half parallel reliability

Cho (2016)^[7] suggests using systematic nomenclature and formula expressions, arguing that reliability coefficients have been presented in a disorganized and inconsistent manner, under historically inaccurate and uninformative names. The Spearman-Brown formula assumes that the split-halves are parallel, which implies that the variances of the split-halves are equal. The systematic name proposed for the Spearman-Brown formula is split-half parallel reliability. In addition, the following equivalent systematic formula has been proposed.

${\rho }_{SP}={\frac {4{\rho }_{12}}{4{\rho }_{12}+2(1-{\rho }_{12})}}$
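The equivalence with the usual split-half form can be checked by expanding the denominator. The numerical value 0.6 is an arbitrary illustration.

```latex
\rho_{SP}
  = \frac{4\rho_{12}}{4\rho_{12} + 2(1-\rho_{12})}
  = \frac{4\rho_{12}}{2 + 2\rho_{12}}
  = \frac{2\rho_{12}}{1+\rho_{12}},
\qquad\text{e.g. } \rho_{12}=0.6:\;
\frac{2.4}{2.4+0.8} = \frac{1.2}{1.6} = 0.75 .
```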

Split-half tau-equivalent reliability

Split-half tau-equivalent reliability is a reliability coefficient that can be used when the variances of split-halves are not equal. Flanagan and Rulon^[8] ( ${\rho }_{FR1}$ , ${\rho }_{FR2}$ ) and Guttman^[9] ( ${\lambda }_{4}$ ) suggested the following expressions: ${\rho }_{FR1}={\frac {4{\rho }_{12}{\sigma }_{1}{\sigma }_{2}}{{\sigma }_{1}^{2}+{\sigma }_{2}^{2}+2{\rho }_{12}{\sigma }_{1}{\sigma }_{2}}}$, ${\rho }_{FR2}=1-{\frac {{\sigma }_{D}^{2}}{{\sigma }_{X}^{2}}}$, and ${\lambda }_{4}=2\left(1-{\frac {{\sigma }_{1}^{2}+{\sigma }_{2}^{2}}{{\sigma }_{X}^{2}}}\right)$.

Here ${\sigma }_{1}^{2}$, ${\sigma }_{2}^{2}$, ${\sigma }_{X}^{2}$, and ${\sigma }_{D}^{2}$ are the variances of the first split-half, the second split-half, the sum of the two split-halves, and the difference of the two split-halves, respectively.
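A short derivation makes the relationship between the three expressions explicit. The symbol $\sigma_{12}=\rho_{12}\sigma_{1}\sigma_{2}$ for the covariance between the halves is introduced here for brevity and is an assumption of this sketch. The denominator of $\rho_{FR1}$ is taken to be the variance of the sum.

```latex
\begin{aligned}
\sigma_X^2 &= \sigma_1^2 + \sigma_2^2 + 2\sigma_{12},
\qquad
\sigma_D^2 = \sigma_1^2 + \sigma_2^2 - 2\sigma_{12}, \\[4pt]
\rho_{FR1} &= \frac{4\rho_{12}\sigma_1\sigma_2}
                   {\sigma_1^2 + \sigma_2^2 + 2\rho_{12}\sigma_1\sigma_2}
            = \frac{4\sigma_{12}}{\sigma_X^2}, \\[4pt]
\rho_{FR2} &= 1 - \frac{\sigma_D^2}{\sigma_X^2}
            = \frac{\sigma_X^2 - \sigma_D^2}{\sigma_X^2}
            = \frac{4\sigma_{12}}{\sigma_X^2}, \\[4pt]
\lambda_4  &= 2\left(1 - \frac{\sigma_1^2 + \sigma_2^2}{\sigma_X^2}\right)
            = \frac{2\,(2\sigma_{12})}{\sigma_X^2}
            = \frac{4\sigma_{12}}{\sigma_X^2}.
\end{aligned}
```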

These formulas are all algebraically equivalent. The systematic formula ^[7] is as follows.

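${\rho }_{ST}={\frac {4{\sigma }_{12}}{{\sigma }_{X}^{2}}}$, where ${\sigma }_{12}={\rho }_{12}{\sigma }_{1}{\sigma }_{2}$ is the covariance between the split-halves.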
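The coefficients can also be compared numerically. The sketch below uses invented half-test scores with clearly unequal variances. It takes the numerator of the systematic formula to be the covariance between the halves ($\sigma_{12}=\rho_{12}\sigma_{1}\sigma_{2}$), which is the form that agrees with the other three expressions. All variable names are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance

# Hypothetical half-test totals for six examinees (illustrative data only).
half1 = [10, 12, 9, 15, 11, 14]
half2 = [21, 27, 15, 29, 23, 30]

total = [a + b for a, b in zip(half1, half2)]
diff = [a - b for a, b in zip(half1, half2)]

# Population variances of each half, of the sum, and of the difference.
var1, var2 = pvariance(half1), pvariance(half2)
var_x, var_d = pvariance(total), pvariance(diff)

# Population covariance and Pearson correlation between the halves.
m1, m2 = fmean(half1), fmean(half2)
cov12 = fmean([(a - m1) * (b - m2) for a, b in zip(half1, half2)])
sd1, sd2 = sqrt(var1), sqrt(var2)
rho12 = cov12 / (sd1 * sd2)

fr1 = 4 * rho12 * sd1 * sd2 / (var1 + var2 + 2 * rho12 * sd1 * sd2)
fr2 = 1 - var_d / var_x
lambda4 = 2 * (1 - (var1 + var2) / var_x)
rho_st = 4 * cov12 / var_x

# Spearman-Brown (split-half parallel) value for comparison.
rho_sp = 2 * rho12 / (1 + rho12)

print(f"FR1={fr1:.4f} FR2={fr2:.4f} lambda4={lambda4:.4f} ST={rho_st:.4f}")
print(f"Spearman-Brown={rho_sp:.4f}")
```

The four tau-equivalent expressions agree up to floating-point error. When the half variances differ and the correlation is positive, the Spearman-Brown value is at least as large as the tau-equivalent one.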

Split-half congeneric reliability

Split-half parallel reliability and split-half tau-equivalent reliability both assume that the split-halves have the same length. Split-half congeneric reliability relaxes this assumption. However, because there are more parameters to estimate than there are available pieces of information, an additional assumption is needed. Raju (1970)^[10] examined the split-half congeneric reliability coefficient when the relative length of each split-half is known. Angoff (1953)^[11] and Feldt (1975)^[12] published a split-half congeneric reliability coefficient that assumes the length of each split-half is proportional to the sum of the variances and covariances.^[7]

History

The name Spearman-Brown seems to imply a partnership, but the two authors were rivals. The formula originates from two papers published simultaneously by Brown (1910) and Spearman (1910) in the British Journal of Psychology. Charles Spearman had a hostile relationship with Karl Pearson, his colleague at University College London, and the two exchanged papers that criticized and ridiculed each other.^[13] William Brown received his Ph.D. under Pearson's guidance. An important part of Brown's doctoral dissertation^[14] was devoted to criticizing Spearman's work on rank correlation.^[15] Spearman's name precedes Brown's in the name of the formula because he was the more prestigious scholar.^[16] For example, Spearman established the first theory of reliability^[15] and is called "the father of classical reliability theory."^[17] This is an example of the Matthew effect or of Stigler's law of eponymy.

Cho and Chun (2018) argue that the formula should be called the Brown-Spearman formula, for the following reasons:^[16] First, the formula in use today is Brown's (1910) version, not Spearman's (1910). Brown (1910) explicitly presented the formula as a split-half reliability coefficient, whereas Spearman (1910) did not. Second, Brown's (1910) formal derivation is more concise and elegant than Spearman's (1910).^[18] Third, Brown (1910) was probably written before Spearman (1910): it is based on Brown's doctoral dissertation, which was already available at the time of publication, and Spearman (1910) criticized Brown (1910), whereas Brown (1910) criticized only Spearman (1904). Fourth, APA style lists authors in alphabetical order.

Use and related topics

This formula is commonly used by psychometricians to predict the reliability of a test after changing the test length. The relationship is particularly important to the split-half and related methods of estimating reliability, where the formula is sometimes known as the "Step Up" formula.^[2]

The formula is also helpful in understanding the nonlinear relationship between test reliability and test length. As the desired reliability approaches 1.0, test length must grow by increasingly large amounts.
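The nonlinearity can be illustrated by tabulating the length factor needed for a series of target reliabilities. The starting reliability of 0.50 and the list of targets are assumptions chosen to keep the arithmetic simple.

```python
def required_length_factor(current: float, target: float) -> float:
    """Length factor n needed to move from `current` to `target` reliability."""
    return target * (1 - current) / (current * (1 - target))


# Hypothetical starting reliability and a series of increasingly strict targets.
current = 0.50
targets = [0.60, 0.70, 0.80, 0.90, 0.95, 0.99]

factors = [required_length_factor(current, t) for t in targets]

for target, factor in zip(targets, factors):
    print(f"target {target:.2f}: lengthen the test by a factor of {factor:.1f}")
```

With a starting reliability of 0.50, the required factor reduces to target / (1 - target). Reaching 0.80 therefore takes a test 4 times as long, 0.90 takes 9 times, and 0.99 takes 99 times.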

If the longer or shorter test is not parallel to the current test, the prediction will not be strictly accurate. For example, if a highly reliable test is lengthened by adding many poor items, the achieved reliability will probably be much lower than the formula predicts.

For the reliability of a two-item test, the formula is more appropriate than Cronbach's alpha. Used in this way, the Spearman-Brown formula is also called "standardized Cronbach's alpha", because it equals Cronbach's alpha computed from the average item intercorrelation and unit item variances rather than from the average item covariance and average item variance.^[6]
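The connection can be made explicit with the usual expression for standardized alpha. The symbols $k$ (number of items) and $\bar{r}$ (average inter-item correlation) are introduced here for illustration. For two items there is only one inter-item correlation, so $\bar{r}={\rho }_{12}$.

```latex
\alpha_{\text{std}} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
\quad\xrightarrow{\;k=2,\ \bar{r}=\rho_{12}\;}\quad
\alpha_{\text{std}} = \frac{2\rho_{12}}{1+\rho_{12}} .
```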

Citations

[1] Allen, M.; Yen, W. (1979). Introduction to Measurement Theory. Monterey, CA: Brooks/Cole. ISBN 0-8185-0283-5.
[2] Stanley, J. (1971). Reliability. In R. L. Thorndike (Ed.), Educational Measurement. Second edition. Washington, DC: American Council on Education.
[3] Wainer, H., & Thissen, D. (2001). True score theory: The traditional method. In H. Wainer and D. Thissen (Eds.), Test Scoring. Mahwah, NJ: Lawrence Erlbaum.
[4] Kelley, T. L. (1924). Note on the reliability of a test: A reply to Dr. Crum's criticism. Journal of Educational Psychology, 15, 193–204. doi:10.1037/h0072471.
[5] Kuder, G. F., & Richardson, M. W. (1937). The theory of the estimation of test reliability. Psychometrika, 2, 151–160. doi:10.1007/BF02288391.
[6] Eisinga, Rob; Grotenhuis, Manfred te; Pelzer, Ben (August 2013). "The reliability of a two-item scale: Pearson, Cronbach, or Spearman-Brown?". International Journal of Public Health. 58 (4): 637–642. doi:10.1007/S00038-012-0416-3. ISSN 1661-8556. PMID 23089674.
[7] Cho, E. (2016). Making reliability reliable: A systematic approach to reliability coefficients. Organizational Research Methods, 19, 651–682. doi:10.1177/1094428116656239.
[8] Flanagan, J. C. (1937). A proposed procedure for increasing the efficiency of objective tests. Journal of Educational Psychology, 28, 17–21. doi:10.1037/h0057430. Rulon, P. J. (1939). A simplified procedure for determining the reliability of a test by split-halves. Harvard Educational Review, 9, 99–103.
[9] Guttman, Louis (December 1945). "A basis for analyzing test-retest reliability". Psychometrika. 10 (4): 255–282. doi:10.1007/BF02288892. ISSN 0033-3123. PMID 21007983. Zbl 0060.30902.
[10] Raju, N. S. (1970). New formula for estimating total test reliability from parts of unequal length. Proceedings of the 78th Annual Convention of APA, 5, 143–144.
[11] Angoff, W. H. (1953). Test reliability and effective test length. Psychometrika, 18(1), 1–14.
[12] Feldt, L. S. (1975). Estimation of the reliability of a test divided into two parts of unequal length. Psychometrika, 40(4), 557–561.
[13] Cowles, M. (2005). Statistics in psychology: An historical perspective. New York: Psychology Press.
[14] Later published as a book: Brown, W. (1911). The essentials of mental measurement. London: Cambridge University Press.
[15] Spearman, C. (January 1904). "The Proof and Measurement of Association between Two Things". American Journal of Psychology. 15 (1): 72. doi:10.2307/1412159. ISSN 0002-9556. JSTOR 1412159.
[16] Cho, E. & Chun, S. (2018). Fixing a broken clock: A historical review of the originators of reliability coefficients including Cronbach's alpha. Survey Research, 19 (2), 23–54.
[17] Cronbach, L. J., Rajaratnam, N., & Gleser, G. C. (1963). Theory of generalizability: A liberalization of reliability theory. British Journal of Statistical Psychology, 16, 137–163. doi:10.1111/j.2044-8317.1963.tb00206.x.
[18] Traub, R. E. (1997). Classical test theory in historical perspective. Educational Measurement: Issues and Practice, 16, 8–14. doi:10.1111/j.1745-3992.1997.tb00603.x.

References

Spearman, C. (1910). Correlation calculated from faulty data. British Journal of Psychology, 3, 271–295.
Brown, W. (1910). Some experimental results in the correlation of mental abilities. British Journal of Psychology, 3, 296–322.
