Person and Item Validity and Reliability in Essay Writing Using Rasch Model
Abstract
This study examines the reliability and validity of items and persons in essay writing assessment using the Rasch model. Students' writing skills were assessed through four constructs elaborated into twenty-four item statements in the form of a rubric. The participants were 40 EFL learners who had taken an essay writing course in a TOEFL iBT class; the writing samples were collected in collaboration with the language center where the TOEFL iBT class was held. The Rasch model was employed as a quantitative analysis approach, using three Ministep software outputs for data analysis: the statistical summary output to obtain general figures and data, the item statistics output to obtain item validity, and the person statistics output to obtain person validity. The four assessment constructs (content, structure, diction, and mechanics) satisfy item fit as measured by OUTFIT MNSQ and OUTFIT ZSTD, although the mechanics construct passes item fit only with a caveat. The person fit order identified by INFIT MNSQ shows that seven students are misfits, which requires further assessment to find the source of the misfit.
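The fit statistics named in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical example (not the study's actual analysis, which used Ministep) of how infit and outfit mean-squares are computed for a dichotomous Rasch model, given already-estimated person abilities and item difficulties; the function name and the toy data are illustrative assumptions.

```python
import numpy as np

def rasch_fit(X, theta, b):
    """Infit/outfit mean-squares for a dichotomous Rasch model.

    X     : observed responses, shape (persons, items), coded 0/1
    theta : estimated person abilities, shape (persons,)
    b     : estimated item difficulties, shape (items,)
    """
    # Model-expected probability of success for each person-item pair
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    # Model variance of each response under the Rasch model
    w = p * (1.0 - p)
    # Squared standardized residuals
    z2 = (X - p) ** 2 / w
    # Outfit: unweighted mean of squared residuals (outlier-sensitive)
    item_outfit = z2.mean(axis=0)
    person_outfit = z2.mean(axis=1)
    # Infit: information-weighted mean (sensitive to on-target misfit)
    item_infit = (w * z2).sum(axis=0) / w.sum(axis=0)
    person_infit = (w * z2).sum(axis=1) / w.sum(axis=1)
    return item_infit, item_outfit, person_infit, person_outfit

# Toy data: 3 persons, 4 items (illustrative values only)
theta = np.array([-0.5, 0.0, 1.2])
b = np.array([-1.0, -0.2, 0.4, 1.1])
X = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 1, 1, 1]])
item_infit, item_outfit, person_infit, person_outfit = rasch_fit(X, theta, b)
```

A common rule of thumb is to flag an item or person as misfitting when MNSQ falls outside roughly 0.5 to 1.5 (values near 1.0 indicate good model-data fit), which is the kind of criterion the abstract's item and person fit checks rely on.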
Copyright (c) 2023 Konstruktivisme : Jurnal Pendidikan dan Pembelajaran
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.