On the complementarity of classical test theory and item response models: item difficulty estimates and computerized adaptive testing

Costa, Patrícia; Ferrão, Maria Eugénia

doi:10.1590/S0104-40362015000300003

Ensaio: Avaliação e Políticas Públicas em Educação

Articles • Ensaio: aval. pol. públ. educ. 23 (88) • July-Sept 2015 • https://doi.org/10.1590/S0104-40362015000300003


This study aims to provide statistical evidence of the complementarity between classical test theory and item response models for certain educational assessment purposes. Such complementarity might support, at reduced cost, the future development of innovative procedures for item calibration in adaptive testing. Classical test theory and the generalized partial credit model are applied to tests comprising multiple-choice, short-answer, completion, and partially scored open-response items. The datasets come from tests administered to the Portuguese population of students enrolled in the 4th and 6th grades. The results show a very strong association between the difficulty estimates obtained from classical test theory and those obtained from item response models, corroborating the statistical theory of mental testing.
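For readers unfamiliar with the generalized partial credit model named in the abstract, the following is a minimal sketch of its category probabilities. It assumes the standard parameterization with one discrimination parameter `a` per item and step parameters `b_1..b_m`, without a scaling constant. The function name `gpcm_probabilities` and the example values are hypothetical illustrations, not the authors' implementation or estimates, and this page does not state how the single difficulty estimate (b) reported per item was derived for polytomous items.

```python
import math


def gpcm_probabilities(theta, a, thresholds):
    """Return P(X = k | theta) for k = 0..m under the GPCM.

    theta: examinee ability
    a: item discrimination
    thresholds: step parameters b_1..b_m, where m is the maximum item score
    """
    # Cumulative logits: 0 for category 0, then running sums of a * (theta - b_v).
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + a * (theta - b))
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only (hypothetical, not estimates from the study).
probs = gpcm_probabilities(theta=0.0, a=1.0, thresholds=[-0.5, 0.5])
print([round(p, 3) for p in probs])
```

With a single step parameter the model reduces to the two-parameter logistic model for a dichotomous item, so the same function covers both the multiple-choice and the partially scored items.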

Keywords: Generalized partial credit; Item response model; Classical test theory; Educational assessment

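Math4 (Mathematics, 4th grade): item statistics from classical test theory (CTT) and the item response model (IRM)
Item	CTT Discrimination Index (c)	CTT Difficulty Index (p)	CTT Biserial Correlation (r)	IRM Discrimination Estimate (a)	IRM Difficulty Estimate (b)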
1	0.650	0.647	0.532	0.785	-0.611
2	0.610	0.630	0.507	0.714	-0.569
3	0.569	0.303	0.458	0.340	0.215
4	0.126	0.913	0.172	0.278	-5.159
5	0.670	0.667	0.594	0.604	-0.670
6	0.414	0.837	0.505	0.981	-1.412
7	0.330	0.817	0.381	0.291	-1.890
8	0.742	0.366	0.551	0.393	0.241
9	0.242	0.874	0.318	0.219	-2.835
10	0.610	0.542	0.461	0.302	-0.210
11	0.412	0.787	0.430	0.633	-1.478
12	0.215	0.917	0.364	0.821	-2.220
13	0.618	0.703	0.545	0.860	-0.817
14	0.107	0.947	0.179	0.283	-4.001
15	0.373	0.800	0.370	0.535	-1.767
16	0.558	0.743	0.544	0.492	-0.951
17	0.747	0.582	0.588	0.421	-0.467
18	0.500	0.757	0.472	0.530	-1.984
19	0.448	0.838	0.506	1.033	-1.385
20	0.674	0.475	0.547	0.502	-0.544
21	0.585	0.670	0.489	0.364	-0.786
22	0.556	0.472	0.409	0.482	0.153
23	0.625	0.674	0.512	0.744	-0.751
24	0.359	0.833	0.410	0.355	-1.847
25	0.340	0.828	0.363	0.588	-1.858
26	0.627	0.428	0.497	0.442	-0.274
27	0.639	0.546	0.475	0.615	-0.219

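Port4 (Portuguese, 4th grade): item statistics from classical test theory (CTT) and the item response model (IRM)
Item	CTT Discrimination Index (c)	CTT Difficulty Index (p)	CTT Biserial Correlation (r)	IRM Discrimination Estimate (a)	IRM Difficulty Estimate (b)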
1	0.380	0.744	0.376	0.270	-2.415
2	0.228	0.870	0.305	0.329	-3.569
3	0.490	0.246	0.461	0.324	0.116
4	0.312	0.771	0.334	0.297	-2.507
5	0.335	0.806	0.380	0.343	-3.188
6	0.196	0.162	0.229	0.286	0.640
7	0.438	0.668	0.412	0.299	-1.435
8	0.272	0.897	0.417	0.450	-3.130
9	0.493	0.461	0.424	0.375	-2.328
10	0.565	0.549	0.534	0.239	-1.530
11	0.388	0.266	0.352	0.363	-1.106
12	0.243	0.879	0.393	0.579	-2.343
13	0.486	0.723	0.519	0.535	-1.209
14	0.398	0.350	0.331	0.318	-0.985
15	0.457	0.701	0.461	0.475	-2.603
16	0.360	0.829	0.476	0.313	-2.141
17	0.321	0.881	0.534	0.764	-1.952
18	0.364	0.333	0.357	0.268	-0.739
19	0.419	0.675	0.408	0.303	-1.726
20	0.102	0.955	0.283	0.474	-3.221
21	0.673	0.307	0.650	0.895	-0.427
22	0.433	0.157	0.480	1.286	-0.219
23	0.447	0.169	0.494	2.032	-0.421
24	0.476	0.187	0.500	1.955	-0.549
25	0.438	0.163	0.480	1.554	-0.768
26	0.524	0.228	0.507	1.091	-0.517
27	0.557	0.283	0.480	0.598	-1.055

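Math6 (Mathematics, 6th grade): item statistics from classical test theory (CTT) and the item response model (IRM)
Item	CTT Discrimination Index (c)	CTT Difficulty Index (p)	CTT Biserial Correlation (r)	IRM Discrimination Estimate (a)	IRM Difficulty Estimate (b)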
1	0.348	0.841	0.381	0.402	-2.034
2	0.647	0.527	0.579	0.531	-0.476
3	0.280	0.864	0.342	0.544	-2.296
4	0.546	0.683	0.534	0.625	-0.888
5	0.352	0.805	0.379	0.508	-1.872
6	0.738	0.427	0.721	0.347	0.062
7	0.348	0.841	0.412	0.431	-1.664
8	0.381	0.703	0.353	0.197	-1.541
9	0.728	0.350	0.762	0.431	0.221
10	0.122	0.181	0.169	0.219	4.170
11	0.569	0.437	0.496	0.300	-1.102
12	0.126	0.037	0.319	0.536	2.200
13	0.559	0.362	0.519	0.734	0.573
14	0.632	0.406	0.607	0.307	0.206
15	0.699	0.262	0.818	0.612	0.564
16	0.728	0.360	0.788	0.882	-0.138
17	0.544	0.247	0.596	0.364	0.447
18	0.527	0.765	0.552	0.485	-1.248
19	0.530	0.348	0.531	0.749	0.629
20	0.175	0.049	0.402	0.462	2.019
21	0.476	0.260	0.486	0.696	1.110
22	0.593	0.298	0.591	0.440	0.017
23	0.461	0.208	0.501	0.336	0.599
24	0.491	0.490	0.439	0.431	0.056
25	0.627	0.402	0.599	0.370	0.134
26	0.472	0.585	0.406	0.228	-0.654
27	0.530	0.449	0.505	0.290	0.093

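Port6 (Portuguese, 6th grade): item statistics from classical test theory (CTT) and the item response model (IRM)
Item	CTT Discrimination Index (c)	CTT Difficulty Index (p)	CTT Biserial Correlation (r)	IRM Discrimination Estimate (a)	IRM Difficulty Estimate (b)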
1	0.484	0.607	0.416	0.404	-0.699
2	0.512	0.618	0.471	0.354	-2.931
3	0.258	0.320	0.231	0.175	2.594
4	0.028	0.991	0.141	0.470	-6.256
5	0.052	0.978	0.166	0.414	-5.447
6	0.249	0.442	0.202	0.207	-3.025
7	0.223	0.892	0.342	0.485	-2.872
8	0.417	0.743	0.407	0.269	-2.272
9	0.161	0.921	0.269	0.487	-3.279
10	0.452	0.439	0.379	0.211	-0.715
11	0.479	0.686	0.439	0.376	-1.328
12	0.451	0.529	0.386	0.159	-1.844
13	0.299	0.668	0.263	0.259	-1.659
14	0.207	0.192	0.207	0.109	3.398
15	0.289	0.825	0.330	0.415	-2.405
16	0.513	0.558	0.470	0.506	-0.312
17	0.118	0.059	0.212	0.423	2.028
18	0.142	0.040	0.348	0.407	1.471
19	0.283	0.111	0.402	0.416	-0.560
20	0.217	0.092	0.327	0.617	0.977
21	0.391	0.207	0.463	0.343	0.988
22	0.475	0.506	0.422	0.419	-2.155
23	0.389	0.667	0.371	0.428	-2.911
24	0.405	0.279	0.380	0.460	-0.605
25	0.508	0.579	0.464	0.410	-1.261
26	0.052	0.978	0.200	0.428	-3.202
27	0.603	0.469	0.517	0.559	-1.223
28	0.509	0.228	0.524	0.813	-0.768
29	0.420	0.175	0.496	1.159	-0.427
30	0.415	0.168	0.493	1.401	-0.610
31	0.391	0.141	0.497	1.295	-0.761
32	0.354	0.128	0.447	1.160	-0.339
33	0.585	0.279	0.567	0.514	-0.603

Subject/Grade	Correlation	95% CI Lower	95% CI Upper
Mathematics / 4th grade	-0.826	-0.927	-0.766
Portuguese / 4th grade	-0.883	-0.938	-0.809
Mathematics / 6th grade	-0.879	-0.977	-0.799
Portuguese / 6th grade	-0.805	-0.885	-0.712
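As a rough illustration of how an association of this kind can be computed, the sketch below correlates the CTT difficulty index (p) with the IRM difficulty estimate (b) for the 27 Math4 items, using the values listed in the Math4 table. It assumes a Pearson correlation with a Fisher z-transformation confidence interval. This page does not state which correlation coefficient or interval method the authors used, so the output is not expected to reproduce the reported figures exactly. The helper names `pearson` and `fisher_ci` are hypothetical.

```python
import math

# CTT difficulty index (p) and IRM difficulty estimate (b), Math4 items 1..27,
# copied from the Math4 table.
p = [0.647, 0.630, 0.303, 0.913, 0.667, 0.837, 0.817, 0.366, 0.874,
     0.542, 0.787, 0.917, 0.703, 0.947, 0.800, 0.743, 0.582, 0.757,
     0.838, 0.475, 0.670, 0.472, 0.674, 0.833, 0.828, 0.428, 0.546]
b = [-0.611, -0.569, 0.215, -5.159, -0.670, -1.412, -1.890, 0.241, -2.835,
     -0.210, -1.478, -2.220, -0.817, -4.001, -1.767, -0.951, -0.467, -1.984,
     -1.385, -0.544, -0.786, 0.153, -0.751, -1.847, -1.858, -0.274, -0.219]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% confidence interval for r via the Fisher z-transformation."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


r = pearson(p, b)
lower, upper = fisher_ci(r, len(p))
print(f"r = {r:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The negative sign is expected under this reading: a higher p means a larger share of correct responses (an easier item), whereas a higher b means a harder item.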

Fundação CESGRANRIO, Revista Ensaio, Rua Santa Alexandrina 1011, Rio Comprido, 20261-903, Rio de Janeiro - RJ - Brazil, Tel.: +55 21 2103 9600
E-mail: ensaio@cesgranrio.org.br

[1] Author information

Patrícia Costa: PhD in Industrial and Systems Engineering, Universidade do Minho, Portugal. Researcher at the Econometrics and Applied Statistics Unit of the Joint Research Center, European Commission. Contact: patricia.costa@jrc.ec.europa.eu

Maria Eugénia Ferrão: PhD in Engineering (concentration area: Statistics and Control Theory), Pontifícia Universidade Católica do Rio de Janeiro - PUC-Rio. Agregação (habilitation) in Quantitative Methods, Instituto Universitário de Lisboa - ISCTE. Assistant Professor at the Universidade da Beira Interior and integrated researcher at the Centro de Matemática Aplicada à Previsão e Decisão Económica - CEMAPRE, Universidade de Lisboa. Visiting Fellow at the Graduate School of Education, University of Bristol, United Kingdom. Contact: meferrao@ubi.pt