Performance Task and Constructed-response Scoring Clause Samples
Performance Task and Constructed-response Scoring. ETS will score Performance Task and constructed-response student responses (including mathematics responses in Spanish) to maximize validity and reliability while incorporating efficiencies wherever possible. Table 16 shows the division of labor between ETS and MI.

Grade 3: ETS, MI, -, ETS, ▇▇▇▇▇▇▇
Grade 4: ETS, MI, -, ETS, ▇▇▇▇▇▇▇
Grade 5: ETS, MI, ETS, ETS, ▇▇▇▇▇▇▇
Grade 6: MI, ETS, -, ETS, ▇▇▇▇▇▇▇
Grade 7: MI, ETS, -, ETS, ▇▇▇▇▇▇▇
Grade 8: MI, ETS, ETS, ETS, ▇▇▇▇▇▇▇
Grade 9: -, -, -, ETS, ▇▇▇▇▇▇▇
Grade 10: -, -, -, ETS, ▇▇▇▇▇▇▇
Grade 11: ETS, ETS, ETS, ETS, ▇▇▇▇▇▇▇

The procedures ETS proposes for California include:
- careful recruiting of raters using the ETS best-practice hiring process
- extensive training of all levels of scoring leadership, not only on the prompts, rubrics, and related scoring materials but also on how best to monitor the quality of the scoring
- rigorous training of the raters in appropriately applying the rubric for each prompt type, following the generic sample responses that exemplify the quality required for each score point, so that every prompt is scored on the same general criteria
- requiring new raters to demonstrate their accuracy by passing a “certification” test before being assigned to score a specific assessment and then by passing a shorter, more focused “calibration” test before each new prompt type
- using scoring leaders to read behind and monitor raters; scoring leaders have the option of evaluating responses a rater previously scored, with or without knowledge of the score he or she gave (“informed” versus “blind” back rating)
- using the scoring system’s live operational data to identify (and, for scoring leaders, then counsel) raters who are reading at unusually slow (or overly fast) rates
- using content scoring leaders to monitor the scoring leaders and their virtual teams
- including pre-scored validity responses (sometimes called monitor papers) within each rater’s set of assigned responses in order to evaluate ongoing accuracy while scoring
- regularly analyzing inter-rater reliability (IRR) statistics to verify that raters are scoring consistently; the scoring system produces real-time IRR and validity response scoring statistics (an illustrative sketch of such statistics follows this list)
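The last two items refer to validity-paper and IRR monitoring. As an illustration only, and not contract language, the short Python sketch below shows one common way such statistics can be computed: exact and adjacent agreement and quadratic weighted kappa on double-scored responses, plus a rater’s accuracy on pre-scored validity responses. All function names and data are hypothetical; the actual scoring system’s calculations are not specified in this clause.

```python
# Illustrative sketch only: statistics a scoring system might report for
# rater monitoring. Not the contract's specified method.
from collections import Counter

def exact_and_adjacent_agreement(first_scores, second_scores):
    """Share of double-scored responses where two raters agree exactly,
    and where their scores differ by no more than one point."""
    pairs = list(zip(first_scores, second_scores))
    n = len(pairs)
    exact = sum(1 for a, b in pairs if a == b)
    adjacent = sum(1 for a, b in pairs if abs(a - b) <= 1)
    return exact / n, adjacent / n

def quadratic_weighted_kappa(first_scores, second_scores, min_score, max_score):
    """Quadratic weighted kappa, a widely used IRR statistic for rubric scores."""
    categories = list(range(min_score, max_score + 1))
    k = len(categories)
    n = len(first_scores)
    # Observed joint distribution of the two ratings.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(first_scores, second_scores):
        observed[a - min_score][b - min_score] += 1 / n
    # Expected distribution if the two raters scored independently.
    c_first, c_second = Counter(first_scores), Counter(second_scores)
    expected = [[(c_first[categories[i]] / n) * (c_second[categories[j]] / n)
                 for j in range(k)] for i in range(k)]
    # Quadratic disagreement weights: larger score gaps are penalized more.
    weight = [[((i - j) ** 2) / ((k - 1) ** 2) for j in range(k)] for i in range(k)]
    num = sum(weight[i][j] * observed[i][j] for i in range(k) for j in range(k))
    den = sum(weight[i][j] * expected[i][j] for i in range(k) for j in range(k))
    return 1.0 - num / den

def validity_paper_accuracy(rater_scores, true_scores):
    """Share of pre-scored validity responses (monitor papers) the rater
    scored exactly as the pre-assigned scores."""
    hits = sum(1 for r, t in zip(rater_scores, true_scores) if r == t)
    return hits / len(rater_scores)

if __name__ == "__main__":
    # Hypothetical double-scored responses on a 0-4 rubric.
    rater_a = [3, 2, 4, 1, 3, 2, 0, 4, 3, 2]
    rater_b = [3, 2, 3, 1, 3, 3, 0, 4, 2, 2]
    exact, adjacent = exact_and_adjacent_agreement(rater_a, rater_b)
    qwk = quadratic_weighted_kappa(rater_a, rater_b, 0, 4)
    print(f"exact={exact:.2f} adjacent={adjacent:.2f} qwk={qwk:.2f}")
```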
