Alison Yin for EdSource Today

As millions of students prepare, for the first time, to take a battery of assessments aligned with the Common Core using computers, at least portions of the tests will have to be scored the old-fashioned way: by humans.

That's because the so-called Smarter Balanced tests, aligned with the Common Core State Standards, include essay questions designed to measure critical thinking skills. Even the math tests require students to explain how they reach their answers.

And unlike the former multiple-choice California Standards Tests that students took every year until the spring of 2013, those more complex portions of the Smarter Balanced tests can't be easily scored by machine.

To score them, the Educational Testing Service,* which will administer the tests under contract with the California Department of Education, is in the process of hiring 6,500 scorers in California. It has almost reached its goal. As of Feb. 19 it has recruited 6,294 people to work as hand scorers, pending their passing certification. Of those, 3,777 have passed certification, according to the California Department of Education.

The use of people rather than computers to rate essay questions has been routine for portions of tests like the Graduate Record Exams and the Graduate Management Admission Test. But they will be used to score many more answers than on previous tests taken by K-12 students in California.

While human scoring has advantages over automated computer scoring on essay-type questions, it also has its disadvantages. How well the testing service recruits and trains scorers could have an impact on individual students' scores.

A 2022 paper published by the Educational Testing Service noted that "humans can make mistakes due to cognitive limitations that can be difficult or even impossible to quantify, which in turn can add systematic biases to the final scores." That's on top of the logistics of managing and training – and paying – thousands of scorers, a process that the testing service paper described as "labor intensive, time consuming and expensive."

According to an Educational Testing Service recruiting flyer, to become a test "rater" – the term used in testing parlance – a bachelor's degree in any field is required, although teaching experience is "strongly preferred."

Among those who have been hired so far, only 241 are current California teachers.

One of them is Christopher Vue, a math teacher at Washington Union High School in Easton, near Fresno. To get certified as a rater, he recently sat down in front of his home computer to figure out the best way to grade the critical thinking skills of students he's never met. The test results he was asked to score were those of students who took the Smarter Balanced field tests administered last spring.

On one middle school math problem on "proportional relationships" that Vue was asked to score, a student's response could earn up to three points. A clear set of guidelines provided by the testing service helped him figure out how to score five possible responses to the same problem, he said.

It took Vue two hours to read and review all of the material for the training and certification. When he begins scoring students this spring, he said, he will be able to ask a team leader if he is unsure about how to score a particular answer.

While Vue will earn $13 an hour for working as a scorer, his reason for signing up was not the extra cash. "Because the test is so new, I wanted to see exactly what they're looking for when they're assessing students," Vue said.

Natalie Albrizzio, a math specialist in the Ventura Unified School District, had a similar motivation for becoming a test scorer. But she has already had experience with the process.

When California adopted the Common Core standards in 2010, her school district created its own math exams that required students to explain the reasoning behind their answers. To determine how to score those exams on a scale of 0 to 3, she said, teachers discussed all the possible responses to the questions. For example, she said, they pondered whether to give a student whose answer was nonsensical a 1 for effort.

The funds to pay for hand scorers will come out of a $24 million budget that's set aside for test processing, scoring and analysis of the Smarter Balanced assessments, according to officials at the California Department of Education.

California may have used hand scorers in a more limited fashion on statewide K-12 assessments, but their use has been widespread for decades in other states, including Washington and Connecticut, according to Shelbi Cole, the deputy director of content for the Smarter Balanced Assessment Consortium.

She said that the math and English Language Arts scoring guidelines that Smarter Balanced has distributed to California and other states using its assessments were developed by educators at meetings where they considered numerous answers students might provide to the test questions.

What kinds of responses earn a high score depends on the complexity or difficulty of a test item. For example, earning a top score of four on an essay in which a student argues that the British Museum should return the Rosetta Stone to Egypt requires clear sourcing and citations, the use of expert opinions to rebut opposing views, and the appropriate use of vocabulary.

The Educational Testing Service is looking into ways more of the Common Core tests can be scored without human intervention, and experts in the testing field believe that more machine testing is inevitable. The 2022 testing service report concluded that "advances in artificial intelligence technologies have made machine scoring of essays a realistic option… and that it will be used more widely in educational assessments in the near future."

But for now, scorers like Vue and Albrizzio will be essential to the process.

Administration of the Smarter Balanced assessments will begin in some districts at the end of March, and testing will run through June, depending on the district. Vue is now waiting for instructions about what to do next, including being told which grade levels he'll be expected to score. "I feel like I'm really in the dark about what's going to happen next," he said.

Like Vue, Albrizzio is looking forward to getting a closer look at the tests themselves. "Especially for math, I think it's important for teachers to participate," she said.

*Correction: An earlier version of this article incorrectly stated the name of the Educational Testing Service. The story was also updated to reflect the extent to which California is using hand scorers for K-12 assessments.
