Does Repetition of the Same Test Questions in Consecutive Years Affect their Psychometric Indicators? – Five-year Analysis of In-house Exams at Medical University of Warsaw
Division of Teaching and Outcomes of Education, Faculty of Health Science, Medical University of Warsaw, Warsaw, POLAND
University Exams Office of Medical University of Warsaw, Warsaw, POLAND
Online publication date: 2018-05-16
Publication date: 2018-05-16
EURASIA Journal of Mathematics, Science and Technology Education 2018;14(7):3301–3309
Aim of study:
Evaluation of the impact of re-using test questions on the psychometric indicators of test items in cardiology examinations in the Physiotherapy programme at the Medical University of Warsaw (MUW).

Materials and Methods:
A case study based on an analysis of 132 five-option multiple-choice questions (MCQs; 528 distractors in total) developed at MUW and included in five in-house exams. Questions were repeated at least twice during the period considered and constituted 42.4% of all MCQs. Each MCQ was assessed on the basis of three indicators: the difficulty index (DI), the discrimination power (DP), and the number of non-functioning distractors (N-FD). The change in the psychometric indicators of test items was assessed using Krippendorff's alpha coefficient (αk).
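The three indicators can be sketched in code under standard classical-test-theory definitions. This is a minimal illustration, not the study's actual procedure: the upper/lower 27% split for DP and the 5% threshold for a non-functioning distractor are common conventions (the latter following Tarrant et al., 2009), and the function name and inputs are assumptions for the example.

```python
import numpy as np

def item_stats(choices, key, total_scores, n_options=5):
    """Classical item analysis for one five-option MCQ.

    choices      -- option index chosen by each examinee
    key          -- index of the correct option
    total_scores -- each examinee's total exam score
    Returns (DI, DP, N-FD).
    """
    choices = np.asarray(choices)
    total_scores = np.asarray(total_scores)
    n = len(choices)
    correct = (choices == key).astype(float)

    # Difficulty index (DI): proportion of examinees answering correctly.
    di = correct.mean()

    # Discrimination power (DP), upper/lower 27% method: proportion correct
    # in the top-scoring group minus proportion correct in the bottom group.
    k = max(1, int(round(0.27 * n)))
    order = np.argsort(total_scores)
    dp = correct[order[-k:]].mean() - correct[order[:k]].mean()

    # Non-functioning distractors (N-FD): incorrect options chosen by
    # fewer than 5% of examinees (a commonly used threshold).
    nfd = sum(
        1 for opt in range(n_options)
        if opt != key and (choices == opt).mean() < 0.05
    )
    return di, dp, nfd
```

For example, an item answered correctly by 6 of 10 examinees, with the correct answers concentrated among the high scorers and one option never chosen, yields DI = 0.6, a high DP, and N-FD = 1.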

Results:
With each repetition of an MCQ, the number of questions that maintained a DI value analogous to the initial level of easiness decreased. Nevertheless, the level of DI agreement remained high even when the same question was used five times in a row (αk for the consecutive repetitions was 0.90, 0.85, 0.78 and 0.75). The N-FD count in consecutive repetitions remained at a satisfactory level (good and very good agreement), although it dropped significantly at three or more repetitions (αk was 0.80, 0.69, 0.66 and 0.65, respectively). In contrast, the level of agreement for DP across consecutive repetitions was markedly lower than that noted for DI and N-FD (DP αk was 0.28, 0.23, 0.25 and 0.10, respectively).

Conclusions:
The observed drift of the initial psychometric indicator values with consecutive use of the same MCQs confirms examiners' concerns about the progressive wear of the test-question bank. However, the loss of psychometric quality, especially with respect to easiness and the number of non-functioning distractors, was not drastic. This suggests that the spread of MCQ content among students of consecutive years is limited, at least across two consecutive years.
