Určovanie hraničných skóre pre kriteriálne testy

Mentel, Andrej

ORBIS SCHOLAE

ORBIS SCHOLAE, Vol 9 No 1 (2015), 139–155

[The Standard (Cut Score) Setting for Criterion-referenced Educational Tests]

DOI: https://doi.org/10.14712/23363177.2015.76
published online: 01. 11. 2015

abstract

This study examines the process of setting standards (cut scores) for criterion-referenced educational tests. Its main goal is to provide a comprehensive framework for standard setting, based mainly on Anglo-American research literature and on the Standards for Educational and Psychological Testing (AERA, APA, & NCME, 1999). The emphasis is on the process of creating performance level descriptions and on the methods of cut score setting. Performance level descriptions are illustrated with examples from reading comprehension and are based mainly on MCAS testing. Among the test-item-centered methods, the Angoff, Nedelsky, and bookmark methods are described. The person-centered methods are represented by the contrasting groups approach. Measurement decision theory classification is described as a representative of methods that combine both approaches. The validity issues of these methods are briefly discussed.

keywords: criterion-referenced tests; methods of standard (cut score) setting; reading comprehension; cultural capital

references (43)

1. AERA, APA, & NCME (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association (AERA).

2. Allensworth, E. M. (2005). Dropout rates after high-stakes testing in elementary school: A study of the contradictory effects of Chicago's efforts to end social promotion. Educational Evaluation and Policy Analysis, 27(4), 341–364.

3. Amrein, A. L., & Berliner, D. C. (2002). An analysis of some unintended and negative consequences of high-stakes testing. Available from http://greatlakescenter.org/docs/early_research/pdf/H-S%20Analysis%20final.pdf.

4. Angoff, W. H. (1971). Scales, norms and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508–597). Washington, DC: American Council on Education.

5. Au, W. W. (2008). Devising inequality: a Bernsteinian analysis of high-stakes testing and social reproduction in education. British Journal of Sociology of Education, 29(6), 639–661.

6. Bloom, B. S. et al. (1956). Taxonomy of educational objectives: Handbook 1, the cognitive domain. New York: McKay.

7. Bourdieu, P. (1986). The forms of capital. In J. G. Richardson (Ed.), Handbook of theory and research for the sociology of education (pp. 241–258). Westport, CT: Greenwood Press.

8. Bourdieu, P., & Passeron, J.-C. (1990). Reproduction in education, society and culture (2nd ed.). (R. Nice, Trans.). London, UK, Thousand Oaks, CA, New Delhi, I: SAGE. [Original work published 1970.]

9. Buckendahl, C. W., Smith, R. W., Impara, J. C., & Plake, B. S. (2002). A comparison of Angoff and bookmark standard setting methods. Journal of Educational Measurement, 39, 253–263.

10. Çetin, S., & Gelbal, S. (2013). A comparison of bookmark and Angoff standard setting methods. Educational Sciences: Theory & Practice, 13(4), 2169–2175.

11. Hambleton, R. K. (1980). Test score validity and standard-setting methods. In R. A. Berk (Ed.), Criterion-referenced measurement: The state of the art (pp. 80–123). Baltimore and London: The Johns Hopkins University Press.

12. Hambleton, R. K., & Pitoniak, M. (2006). Setting performance standards. In R. L. Brennan (Ed.), Educational measurement (pp. 433–470). Westport, CT: Praeger Publishers.

13. Chráska, M. (2009). Testování výkonů ve vzdělávání. In J. Průcha (Ed.), Pedagogická encyklopedie (pp. 594–598). Praha: Portál.

14. Impara, J. C., & Plake, B. S. (1998). Teachers' ability to estimate item difficulty: A test of the assumptions in the Angoff standard setting method. Journal of Educational Measurement, 35(1), 69–81.

15. Jacob, B. A. (2005). Accountability, incentives and behavior: the impact of high-stakes testing in the Chicago Public Schools. Journal of Public Economics, 89(5–6), 761–796.

16. Kaščák, O., & Pupala, B. (2012). Škola zlatých golierov. Vzdelávanie v ére neoliberalizmu. Praha: Sociologické nakladatelství (SLON).

17. Livingston, S. A., & Zieky, M. J. (1982). Passing scores: A manual for setting standards of performance on educational and occupational tests. Princeton, NJ: Educational Testing Service.

18. Livingston, S. A., & Zieky, M. J. (1989). A comparative study of standard-setting methods. Applied Measurement in Education, 2, 121–141.

19. MCAS (2013a). Massachusetts Comprehensive Assessment System. MCAS achievement level definitions. Available from http://www.doe.mass.edu/mcas/tdd/pld/.

20. MCAS (2013b). English Language Arts. General performance level definitions. Available from http://www.doe.mass.edu/mcas/tdd/pld/ela410.pdf.

21. Mills, C. N., & Jaeger, R. M. (1998). Creating descriptions of desired student achievement when setting performance standards. In L. Hansche (Ed.), Handbook for the development of performance standards: Meeting the requirements of Title I (pp. 73–85). Washington, DC: Council of Chief State School Officers.

22. Minarechová, M. (2012). Negative impacts of high-stakes testing. Journal of Pedagogy / Pedagogický časopis, 3(1), 82–100.

23. Mitzel, H. C., Lewis, D. M., Patz, R. J., & Green, D. R. (2001). The bookmark procedure. Psychological perspectives. In G. J. Cizek (Ed.), Standard setting: Concepts, methods, and perspectives (pp. 249–281). Mahwah, NJ: Erlbaum.

24. Nedelsky, L. (1954). Absolute grading standards for objective tests. Educational and Psychological Measurement, 14, 3–19.

25. Näsström, G., & Nyström, P. (2008). A comparison of two different methods for setting performance standards for a test with constructed-response items. Practical Assessment, Research & Evaluation, 13(9). Available from http://pareonline.net/pdf/v13n9.pdf.

26. OECD (2009). PISA 2006 technical report. Paris: OECD.

27. Olsen, J. B., & Smith, R. (2008). Cross validating modified Angoff and bookmark standard setting for a home inspection certification. Paper presented at the meeting of the National Council on Measurement in Education, New York. Available from http://siterepository.s3.amazonaws.com/00373201006251026068636.pdf.

28. Rasch, G. (1980). Probabilistic Models for Some Intelligence and Attainment Tests (Exp. ed.). Chicago and London: The University of Chicago Press.

29. Raymond, M. R., & Reid, J. B. (2001). Who made thee a judge? Selecting and training participants for standard setting. In G. J. Cizek (Ed.), Standard setting: Concepts, methods, and perspectives (pp. 119–157). Mahwah, NJ: Erlbaum.

30. Reckase, M. D., & Bay, L. (1999). Comparing two methods for collecting test-based judgements. Paper presented at the meeting of the National Council on Measurement in Education, Montréal, QC. Available from https://www.measuredprogress.org/documents/10157/19213/ComparingTwoMethods.pdf.

31. Rosenkvist, M. A. (2010). Using student test results for accountability and improvement: A literature review. OECD Education Working Paper No. 54, EDU/WKP(2010)17, 22. Nov. 2010. Paris: OECD, Directorate for Education.

32. Rudner, L. M. (2009). Scoring and classifying examinees using measurement decision theory. Practical Assessment, Research & Evaluation, 14(8). Available from http://pareonline.net/getvn.asp?v=14&n=8.

33. SCIO (n.d.). Scate. Souhrnná zpráva z testování 2013/2014. Available from https://www.scio.cz/download/skoly/SCATE/SZ_AJ_final.pdf.

34. Shepard, L. A. (1995). Implications for standard setting of the National Academy of Education evaluation of the National Assessment of Educational Progress achievement levels. In Proceedings of the joint conference on standard setting for large scale assessments of the National Assessment Governing Board (NAGB) and the National Centre for Educational Statistics (NCES), Vol. 2 (pp. 143–160). Washington, DC: Government Printing Office.

35. Slavík, J. (1999). Hodnocení v současné škole. Východiska a nové metody pro praxi. Praha: Portál.

36. ŠPÚ (2009). Vzdelávacie štandardy zo slovenského jazyka a literatúry pre 2. stupeň základných škôl a 1.–4. ročník gymnázií s osemročným štúdiom. Štátny vzdelávací program Slovenský jazyk a literatúra (vzdelávacia oblasť Jazyk a komunikácia). Príloha ISCED 2. Bratislava: Štátny pedagogický ústav. Available from http://www.statpedu.sk/files/documents/svp/2stzs/isced2/vzdelavacie_oblasti/slovensky_jazyk_a_literatura_isced2.pdf.

37. Trna, J. (1996). Vzdělávací standardy pro základní a střední školy. Pedagogika, 46, 349–353.

38. Urbánek, T., Denglerová, D., & Širůček, J. (2011). Psychometrika: Měření v psychologii. Praha: Portál.

39. West, A. (2010). High stakes testing, accountability, incentives and consequences in English schools. Policy & Politics, 38(1), 23–39.

40. Wiersma, W., & Jurs, S. G. (1990). Educational measurement and testing (2nd ed.). Boston: Allyn and Bacon.

41. Wilcox, R. R. (2012). Introduction to robust estimation and hypothesis testing (3rd ed.). Waltham, MA: Academic Press.

42. Williams, N. J., & Schulz, E. M. (2005). An investigation of response probability (RP) values used in standard setting. Paper presented at the meeting of the National Council on Measurement in Education, Montréal, QC.

43. Zákon č. 245/2008 Z. z. o výchove a vzdelávaní (školský zákon) a o zmene a doplnení niektorých zákonov. Available from http://www.uips.sk/sub/uips.sk/images/PKvs/z245_2008.pdf.

Určovanie hraničných skóre pre kriteriálne testy is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 1802-4637
E-ISSN: 2336-3177