Correcting for Person Misfit in Aggregated Score Reporting

Bibliographic Details
Published in International journal of testing, Vol. 7, No. 1, pp. 1-25
Main Authors Brown, Richard S, Villarreal, Julio C
Format Journal Article
Language English
Published Philadelphia: Lawrence Erlbaum Associates, Inc; Taylor & Francis Ltd, 2007
ISSN 1530-5058
1532-7574
DOI 10.1207/s15327574ijt0701_1


Abstract There has been considerable research regarding the extent to which psychometrically sound assessments sometimes yield individual score estimates that are inconsistent with the response patterns of the individual. It has been suggested that individual response patterns may differ from expectations for a number of reasons, including subject motivation, attention, test dimensionality, test bias, or even cheating. Whatever the reason, otherwise sound assessments may yield individual score estimates that are not truly reflective of a subject's underlying trait level. In large-scale testing situations, the extent of individual misfit may materially affect aggregated score reports for relevant subgroups of students. This article investigates the impact of using individual person-fit measures from item response theory (IRT) scored achievement examinations as weighting factors in providing aggregated score reports. About 160,000 students from more than 300 schools completed a mathematics or reading test in a computer adaptive environment. For each subject, a standardized person-fit statistic (Drasgow, Levine, & Williams, 1985) was used to estimate the degree of misfit. A brief simulation study was conducted, and its results provided cut points for determining misfit in a computer adaptive environment. Step and logistic functions were applied to these person-fit statistics to determine credibility weights for each respondent. Data were then aggregated using the credibility weights to produce school-level estimates for all students and for each student subgroup. The weighted aggregated estimates were then compared with unweighted estimates to identify the impact of using person-fit measures as weighting factors for aggregated score reporting. As expected, weighted group estimates generally produced significantly different group scores than unweighted group estimates, but did so differentially across student subgroups (e.g., ethnicity, grade). It is argued that the use of person-fit measures may provide a useful correction factor for model misfit, generating better aggregated score estimates in large-scale testing contexts without jeopardizing relative school standing.
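The credibility-weighting idea described in the abstract can be sketched briefly. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each student already has an IRT ability estimate and a standardized person-fit statistic (lz; Drasgow, Levine, & Williams, 1985), converts lz into a credibility weight with either a step function or a logistic function (the cut point and slope shown are placeholders, not values from the article), and then forms a credibility-weighted group mean alongside the unweighted mean.

```python
import numpy as np

def step_weight(lz, cut=-2.0):
    """Step credibility weight: full weight for fitting respondents,
    zero weight when lz falls below the misfit cut point.
    The cut point of -2.0 is a placeholder, not the article's value."""
    lz = np.asarray(lz, dtype=float)
    return np.where(lz >= cut, 1.0, 0.0)

def logistic_weight(lz, cut=-2.0, slope=4.0):
    """Logistic credibility weight: smoothly down-weights respondents
    as lz drops below the cut point. Slope is a placeholder value."""
    lz = np.asarray(lz, dtype=float)
    return 1.0 / (1.0 + np.exp(-slope * (lz - cut)))

def weighted_group_mean(theta, weights):
    """Credibility-weighted aggregate (e.g., a school-level mean theta)."""
    theta = np.asarray(theta, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.sum(weights * theta) / np.sum(weights)

# Example: compare unweighted and weighted means for one simulated school.
rng = np.random.default_rng(0)
theta = rng.normal(0.0, 1.0, size=500)   # IRT ability estimates
lz = rng.normal(0.0, 1.0, size=500)      # standardized person-fit statistics
lz[:25] -= 4.0                           # a handful of misfitting students

unweighted = theta.mean()
weighted = weighted_group_mean(theta, logistic_weight(lz))
print(f"unweighted mean: {unweighted:.3f}, weighted mean: {weighted:.3f}")
```

In the article's design, the weighted and unweighted aggregates would then be compared by school and by student subgroup; the placeholder parameters above simply make the mechanics of the weighting concrete.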
Copyright Copyright (c) 2007, Lawrence Erlbaum Associates, Inc.
Discipline Education
Psychology
Mathematics
EISSN 1532-7574
ERIC EJ754606
Genre Feature
ISSN 1530-5058
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
PageCount 25
PublicationDate 2007
PublicationPlace Philadelphia
PublicationTitle International journal of testing
PublicationYear 2007
Publisher Lawrence Erlbaum Associates, Inc
Taylor & Francis Ltd
StartPage 1
SubjectTerms Achievement Tests
Cheating
Comparative Analysis
Computer Assisted Testing
Computers
Credibility
Educational evaluation
Estimates
Estimating techniques
Ethnicity
Goodness of Fit
Item Response Theory
Mathematics
Mathematics Tests
Measures
Motivation
Psychometrics
Reading Tests
Research methodology
Scores
Simulation
Standardized tests
Statistical analysis
Students
Test Bias
Testing
Tests
Weighted Scores
Weighting
Title Correcting for Person Misfit in Aggregated Score Reporting
URI http://eric.ed.gov/ERICWebPortal/detail?accno=EJ754606
https://www.proquest.com/docview/210898469
Volume 7