Correcting for Person Misfit in Aggregated Score Reporting
| Published in | International Journal of Testing, Vol. 7, No. 1, pp. 1–25 |
|---|---|
| Main Authors | Brown, Richard S.; Villarreal, Julio C. |
| Format | Journal Article |
| Language | English |
| Published | Philadelphia: Lawrence Erlbaum Associates, Inc. / Taylor & Francis Ltd, 2007 |
| Subjects | Achievement Tests; Cheating; Comparative Analysis; Computer Assisted Testing; Computers; Credibility; Educational Evaluation; Estimates; Estimating Techniques; Ethnicity; Goodness of Fit; Item Response Theory; Mathematics; Mathematics Tests; Measures; Motivation; Psychometrics; Reading Tests; Research Methodology; Scores; Simulation; Standardized Tests; Statistical Analysis; Students; Test Bias; Testing; Tests; Weighted Scores; Weighting |
| Online Access | http://eric.ed.gov/ERICWebPortal/detail?accno=EJ754606 |
| ISSN | 1530-5058 (print); 1532-7574 (online) |
| DOI | 10.1207/s15327574ijt0701_1 |
| ERIC | EJ754606 |
Abstract

There has been considerable research regarding the extent to which psychometrically sound assessments sometimes yield individual score estimates that are inconsistent with the response patterns of the individual. It has been suggested that individual response patterns may differ from expectations for a number of reasons, including subject motivation, attention, test dimensionality, test bias, or even cheating. Whatever the reason, otherwise sound assessments may yield individual score estimates that are not truly reflective of a subject's underlying trait level. In large-scale testing situations, the extent of individual misfit may materially affect aggregated score reports for relevant subgroups of students. This article investigates the impact of using individual person-fit measures from item response theory (IRT) scored achievement examinations as weighting factors in producing aggregated score reports. About 160,000 students from more than 300 schools completed a mathematics or reading test in a computer adaptive environment. A standardized person-fit statistic (Drasgow, Levine, & Williams, 1985) was used to estimate the degree of misfit for each subject. A brief simulation study provided cut points for determining misfit in a computer adaptive environment. Step and logistic functions were applied to these person-fit statistics to determine credibility weights for each respondent. Data were then aggregated using the credibility weights to produce school-level estimates for all students and for each student subgroup. The weighted aggregated estimates were then compared with unweighted estimates to identify the impact of using person-fit measures as weighting factors for aggregated score reporting. As expected, weighted group estimates generally produced significantly different group scores than unweighted estimates, but did so differentially across student subgroups (e.g., ethnicity, grade). It is argued that person-fit measures may provide a useful correction factor for model misfit, yielding better aggregated score estimates in large-scale testing contexts without jeopardizing relative school standing.
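The person-fit index the abstract cites is the standardized log-likelihood statistic lz of Drasgow, Levine, and Williams (1985). For dichotomously scored IRT items, lz standardizes the log-likelihood of a subject's observed response pattern by its model-implied mean and variance, so large negative values flag improbable (misfitting) patterns. A minimal sketch in Python; the function name and inputs are illustrative, not taken from the article:

```python
import numpy as np

def lz_statistic(responses, p_correct):
    """Standardized log-likelihood person-fit statistic (lz) of
    Drasgow, Levine, and Williams (1985) for dichotomous IRT items.

    responses : 0/1 array of a subject's scored item responses
    p_correct : model-implied probabilities of a correct response,
                evaluated at the subject's estimated trait level
    """
    u = np.asarray(responses, dtype=float)
    p = np.asarray(p_correct, dtype=float)
    q = 1.0 - p

    # Observed log-likelihood of the response pattern.
    l0 = np.sum(u * np.log(p) + (1.0 - u) * np.log(q))
    # Model-implied expectation and variance of l0.
    expected = np.sum(p * np.log(p) + q * np.log(q))
    variance = np.sum(p * q * np.log(p / q) ** 2)
    return (l0 - expected) / np.sqrt(variance)
```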
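The weighting step the abstract describes applies a step or logistic function to each subject's lz value to obtain a credibility weight, then aggregates scores with those weights. A sketch under assumed parameter values; the article's actual cut points came from its simulation study and are not reproduced here, so the cutoff, center, and scale below are hypothetical placeholders:

```python
import numpy as np

def step_weight(lz, cutoff=-1.645):
    """Step credibility weight: exclude flagged misfits entirely.
    The cutoff is a placeholder, not the article's simulation-derived value."""
    return 1.0 if lz >= cutoff else 0.0

def logistic_weight(lz, center=-1.645, scale=0.5):
    """Logistic credibility weight: smoothly down-weight misfitting
    respondents. center and scale are hypothetical tuning parameters."""
    return 1.0 / (1.0 + np.exp(-(lz - center) / scale))

def weighted_group_estimate(scores, weights):
    """Credibility-weighted mean score for a school or student subgroup."""
    s = np.asarray(scores, dtype=float)
    w = np.asarray(weights, dtype=float)
    return np.sum(w * s) / np.sum(w)
```

A school-level estimate for a subgroup would then be `weighted_group_estimate(scores, [logistic_weight(z) for z in lz_values])`, with misfitting respondents contributing proportionally less than well-fitting ones.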