29 Jul 2018

Investigating the Validity of Two Widely Used Quantitative Text Tools

James W. Cunningham, Elfrieda H. Hiebert, and Heidi Anne Mesmer

Journal Article
Submitted for Publication

Cunningham, J.W., Hiebert, E.H., & Mesmer, H.A. (2018). Investigating the validity of two widely used quantitative text tools. Reading and Writing, 31(4), 813-833.


In recent years, readability formulas have gained new prominence as a basis for selecting texts for learning and assessment. Variables that quantitative tools count (e.g., word frequency, sentence length) provide valid measures of text complexity insofar as they accurately predict representative and high-quality criteria. The longstanding consensus of text researchers has been that such criteria will measure readers’ comprehension of sample texts. This study used Bormuth’s (1969) rigorously developed criterion measure to investigate two of today’s most widely used quantitative text tools: the Lexile Framework and the Flesch-Kincaid Grade-Level formula. Correlations between the two tools’ complexity scores and Bormuth’s measured difficulties of criterion passages were only moderately high in light of the literature and new high-stakes uses for such tools. These correlations declined slightly when passages from the University grade band of use were removed. The ability of these tools to predict measured text difficulties within any single grade band below University was low. Analyses showed that word complexity made a larger contribution relative to sentence complexity when each tool’s predictors were regressed on the Bormuth criterion rather than on their original criteria. When the criterion was texts’ grade band of use instead of mean cloze scores, neither tool classified texts well, and errors disproportionately placed texts from higher grade bands into lower ones. Results suggest these two text tools may lack adequate validity for their current uses in educational settings.
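
For readers unfamiliar with how such a tool turns counts into a score, the following is a minimal sketch of the commonly published Flesch-Kincaid Grade-Level formula. The coefficients are the standard ones from the general readability literature, not values reported in this abstract. The function name and its parameters are hypothetical, and the word, sentence, and syllable counts are assumed to be supplied by the caller, since counting syllables is a separate problem not shown here.

```python
def flesch_kincaid_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Return a Flesch-Kincaid grade-level estimate from raw counts.

    Assumption: counts are computed elsewhere; this sketch only applies
    the commonly published formula to them.
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("total_words and total_sentences must be positive")

    # Sentence-level predictor: average words per sentence.
    words_per_sentence = total_words / total_sentences
    # Word-level predictor: average syllables per word.
    syllables_per_word = total_syllables / total_words

    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    # Hypothetical passage: 100 words, 5 sentences, 150 syllables.
    print(round(flesch_kincaid_grade(100, 5, 150), 2))
```

The two terms appear to correspond to the sentence-level and word-level predictors whose relative contributions the study examined. The Lexile Framework uses a different, proprietary computation and is not sketched here.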