Correcting for Person Misfit in Aggregated Score Reporting
| Published in | International Journal of Testing, Vol. 7, No. 1, pp. 1–25 |
|---|---|
| Main Authors | Brown, Richard S.; Villarreal, Julio C. |
| Format | Journal Article |
| Language | English |
| Published | Philadelphia: Lawrence Erlbaum Associates, Inc. / Taylor & Francis Ltd, 2007 |
| Subjects | Achievement Tests; Cheating; Comparative Analysis; Computer Assisted Testing; Computers; Credibility; Educational Evaluation; Estimates; Estimating Techniques; Ethnicity; Goodness of Fit; Item Response Theory; Mathematics; Mathematics Tests; Measures; Motivation; Psychometrics; Reading Tests; Research Methodology; Scores; Simulation; Standardized Tests; Statistical Analysis; Students; Test Bias; Testing; Tests; Weighted Scores; Weighting |
| Online Access | http://eric.ed.gov/ERICWebPortal/detail?accno=EJ754606 |
| ISSN | 1530-5058 (print); 1532-7574 (online) |
| DOI | 10.1207/s15327574ijt0701_1 |
| ERIC | EJ754606 |
Abstract

There has been considerable research regarding the extent to which psychometrically sound assessments sometimes yield individual score estimates that are inconsistent with the response patterns of the individual. It has been suggested that individual response patterns may differ from expectations for a number of reasons, including subject motivation, attention, test dimensionality, test bias, or even cheating. Whatever the reason, otherwise sound assessments may yield individual score estimates that are not truly reflective of a subject's underlying trait level. In large-scale testing situations, the extent of individual misfit may materially affect aggregated score reports for relevant subgroups of students. This article investigates the impact of using individual person-fit measures from item response theory (IRT) scored achievement examinations as weighting factors in producing aggregated score reports. About 160,000 students from more than 300 schools completed a mathematics or reading test in a computer adaptive environment. A standardized person-fit statistic (Drasgow, Levine, & Williams, 1985) was used to estimate the degree of misfit for each subject. A brief simulation study provided cut points for determining misfit in a computer adaptive environment. Step and logistic functions were applied to these person-fit statistics to determine credibility weights for each respondent. Data were then aggregated using the credibility weights to produce school-level estimates for all students and for each student subgroup. The weighted aggregated estimates were then compared with unweighted estimates to identify the impact of using person-fit measures as weighting factors for aggregated score reporting. As expected, weighted group estimates generally produced significantly different group scores than unweighted estimates, but did so differentially across student subgroups (e.g., ethnicity, grade). It is argued that person-fit measures may provide a useful correction factor for model misfit, yielding better aggregated score estimates in large-scale testing contexts without jeopardizing relative school standing.
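The person-fit index the abstract cites is the standardized log-likelihood statistic lz of Drasgow, Levine, and Williams (1985). For dichotomously scored IRT items, lz standardizes the log-likelihood of a subject's observed response pattern by its model-implied mean and variance, so large negative values flag improbable (misfitting) patterns. A minimal sketch in Python; the function name and inputs are illustrative, not taken from the article:

```python
import numpy as np

def lz_statistic(responses, p_correct):
    """Standardized log-likelihood person-fit statistic (lz) of
    Drasgow, Levine, and Williams (1985) for dichotomous IRT items.

    responses : 0/1 array of a subject's scored item responses
    p_correct : model-implied probabilities of a correct response,
                evaluated at the subject's estimated trait level
    """
    u = np.asarray(responses, dtype=float)
    p = np.asarray(p_correct, dtype=float)
    q = 1.0 - p

    # Observed log-likelihood of the response pattern.
    l0 = np.sum(u * np.log(p) + (1.0 - u) * np.log(q))
    # Model-implied expectation and variance of l0.
    expected = np.sum(p * np.log(p) + q * np.log(q))
    variance = np.sum(p * q * np.log(p / q) ** 2)
    return (l0 - expected) / np.sqrt(variance)
```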
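The weighting step the abstract describes applies a step or logistic function to each subject's lz value to obtain a credibility weight, then aggregates scores with those weights. A sketch under assumed parameter values; the article's actual cut points came from its simulation study and are not reproduced here, so the cutoff, center, and scale below are hypothetical placeholders:

```python
import numpy as np

def step_weight(lz, cutoff=-1.645):
    """Step credibility weight: exclude flagged misfits entirely.
    The cutoff is a placeholder, not the article's simulation-derived value."""
    return 1.0 if lz >= cutoff else 0.0

def logistic_weight(lz, center=-1.645, scale=0.5):
    """Logistic credibility weight: smoothly down-weight misfitting
    respondents. center and scale are hypothetical tuning parameters."""
    return 1.0 / (1.0 + np.exp(-(lz - center) / scale))

def weighted_group_estimate(scores, weights):
    """Credibility-weighted mean score for a school or student subgroup."""
    s = np.asarray(scores, dtype=float)
    w = np.asarray(weights, dtype=float)
    return np.sum(w * s) / np.sum(w)
```

A school-level estimate for a subgroup would then be `weighted_group_estimate(scores, [logistic_weight(z) for z in lz_values])`, with misfitting respondents contributing proportionally less than well-fitting ones.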