Person and Item Validity and Reliability in Essay Writing Using Rasch Model

  • Yenni Arif Rahman, Bina Sarana Informatika University
Keywords: validity; reliability; essay writing; Rasch model


This study examines the reliability and validity of items and persons using the Rasch model. Students' writing skills were assessed against four constructs, elaborated into twenty-four item statements in the form of a rubric. The participants were 40 EFL learners who had taken an essay writing course in a TOEFL iBT class; the writing samples were collected in collaboration with the language center where the class was held. The Rasch model served as the quantitative analysis approach, and three Ministep software outputs were used for data analysis: the “statistical summary output” to obtain general figures and data, item statistics to assess item validity, and person statistics to assess person validity. The four assessment constructs (content, structure, diction, and mechanics) meet the item fit criteria as measured by OUTFIT MNSQ and OUTFIT ZSTD, although the mechanics construct passes item fit with a note. The person fit order, identified by INFIT MNSQ, shows that seven students are misfits; these cases need further assessment to find the source of the misfit.
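As a rough illustration of what the mean-square fit statistics named above measure, the following sketch computes infit and outfit MNSQ for a single item under the dichotomous Rasch model. The study itself analyzed rubric ratings in Ministep, so the dichotomous simplification, the function names (`rasch_prob`, `item_fit`), and the sample numbers are all assumptions made for illustration, not the author's procedure.

```python
import math


def rasch_prob(theta, b):
    """Probability of a correct response under the dichotomous Rasch model.

    theta: person measure in logits; b: item difficulty in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit_mnsq, outfit_mnsq) for one item.

    responses: 0/1 score per person (hypothetical data)
    thetas:    person measures in logits, assumed already estimated
    b:         item difficulty in logits, assumed already estimated
    """
    sq_resid_sum = 0.0  # sum of squared raw residuals
    var_sum = 0.0       # sum of model variances
    z2_sum = 0.0        # sum of squared standardized residuals
    for x, theta in zip(responses, thetas):
        p = rasch_prob(theta, b)
        var = p * (1.0 - p)
        resid = x - p
        sq_resid_sum += resid ** 2
        var_sum += var
        z2_sum += resid ** 2 / var
    infit = sq_resid_sum / var_sum    # information-weighted mean square
    outfit = z2_sum / len(responses)  # unweighted mean square
    return infit, outfit


# Hypothetical example: an able person fails and a weak person succeeds,
# so both mean squares rise well above the expected value of 1.
infit, outfit = item_fit([0, 1], [2.0, -2.0], 0.0)
```

Values near 1 indicate responses about as noisy as the model expects; values well above 1 point to unexpected responses of the kind the abstract describes as misfit. The same residual logic applies to persons by summing across items instead of across persons.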






How to Cite
Rahman, Y. A. (2023). Person and Item Validity and Reliability in Essay Writing Using Rasch Model. Konstruktivisme : Jurnal Pendidikan Dan Pembelajaran, 15(1), 41-55.