Writing assessment refers to an area of study that contains theories and practices that guide the evaluation of a writer's performance or potential through a writing task. Writing assessment can be considered a combination of scholarship from composition studies and measurement theory within educational assessment. Writing assessment can also refer to the technologies and practices used to evaluate student writing and learning.
Writing assessment began as a classroom practice during the first two decades of the 20th century, though high-stakes and standardized tests also emerged during this time. During the 1930s, College Board shifted from using direct writing assessment to indirect assessment because these tests were more cost-effective and were believed to be more reliable. Starting in the 1950s, more students from diverse backgrounds were attending colleges and universities, so administrators made use of standardized testing to decide where these students should be placed, what and how to teach them, and how to measure that they learned what they needed to learn. The large-scale statewide writing assessments that developed during this time combined direct writing assessment with multiple-choice items, a practice that remains dominant today across U.S. large scale testing programs, such as the SAT and GRE. These assessments usually take place outside of the classroom, at the state and national level. However, as more and more students were placed into courses based on their standardized testing scores, writing teachers began to notice a conflict between what students were being tested on—grammar, usage, and vocabulary—and what the teachers were actually teaching—writing process and revision. Because of this divide, educators began pushing for writing assessments that were designed and implemented at the local, programmatic and classroom levels. As writing teachers began designing local assessments, the methods of assessment began to diversify, resulting in timed essay tests, locally designed rubrics, and portfolios. In addition to the classroom and programmatic levels, writing assessment is also hugely influential on writing centers for writing center assessment, and similar academic support centers.
Because writing assessment is used in multiple contexts, the history of writing assessment can be traced through examining specific concepts and situations that prompt major shifts in theories and practices. Writing assessment scholars do not always agree about the origin of writing assessment.
The history of writing assessment has been described as consisting of three major shifts in methods used in assessing writing. The first wave of writing assessment (1950-1970) sought objective tests with indirect measures of assessment. The second wave (1970-1986) focused on holistically scored tests where the students' actual writing began to be assessed. And the third wave (since 1986) shifted toward assessing a collection of student work (i.e. portfolio assessment) and programmatic assessment.
The 1961 publication of Factors in Judgments of Writing Ability in 1961 by Diederich, French, and Carlton has also been characterized as marking the birth of modern writing assessment. Diederich et al. based much of their book on research conducted through the Educational Testing Service (ETS) for the previous decade. This book is an attempt to standardize the assessment of writing and is responsible for establishing a base of research in writing assessment.
The concepts of validity and reliability have been offered as a kind of heuristic for understanding shifts in priorities in writing assessment as well interpreting what is understood as best practices in writing assessment.
In the first wave of writing assessment, the emphasis is on reliability: reliability confronts questions over the consistency of a test. In this wave, the central concern was to assess writing with the best predictability with the least amount of cost and work.
The shift toward the second wave marked a move toward considering principles of validity. Validity confronts questions over a test's appropriateness and effectiveness for the given purpose. Methods in this wave were more concerned with a test's construct validity: whether the material prompted from a test is an appropriate measure of what the test purports to measure. Teachers began to see an incongruence between the material being prompted to measure writing and the material teachers were asking students to write. Holistic scoring, championed by Edward M. White, emerged in this wave. It is one method of assessment where students' writing is prompted to measure their writing ability.
The third wave of writing assessment emerges with continued interest in the validity of assessment methods. This wave began to consider an expanded definition of validity that includes how portfolio assessment contributes to learning and teaching. In this wave, portfolio assessment emerges to emphasize theories and practices in Composition and Writing Studies such as revision, drafting, and process.
Indirect writing assessments typically consist of multiple choice tests on grammar, usage, and vocabulary. Examples include high-stakes standardized tests such as the ACT, SAT, and GRE, which are most often used by colleges and universities for admissions purposes. Other indirect assessments, such as Compass, are used to place students into remedial or mainstream writing courses. Direct writing assessments, like Writeplacer ESL (part of Accuplacer) or a timed essay test, require at least one sample of student writing and are viewed by many writing assessment scholars as more valid than indirect tests because they are assessing actual samples of writing. Portfolio assessment, which generally consists of several pieces of student writing written over the course of a semester, began to replace timed essays during the late 1980s and early 1990s. Portfolio assessment is viewed as being even more valid than timed essay tests because it focuses on multiple samples of student writing that have been composed in the authentic context of the classroom. Portfolios enable assessors to examine multiple samples of student writing and multiple drafts of a single essay.
This section needs expansion. You can help by adding to it. (October 2016)
Methods of writing assessment vary depending on the context and type of assessment. The following is an incomplete list of writing assessments frequently administered:
Portfolio assessment is typically used to assess what students have learned at the end of a course or over a period of several years. Course portfolios consist of multiple samples of student writing and a reflective letter or essay in which students describe their writing and work for the course. "Showcase portfolios" contain final drafts of student writing, and "process portfolios" contain multiple drafts of each piece of writing. Both print and electronic portfolios can be either showcase or process portfolios, though electronic portfolios typically contain hyperlinks from the reflective essay or letter to samples of student work and, sometimes, outside sources.
Timed essay tests were developed as an alternative to multiple choice, indirect writing assessments. Timed essay tests are often used to place students into writing courses appropriate for their skill level. These tests are usually proctored, meaning that testing takes place in a specific location in which students are given a prompt to write in response to within a set time limit. The SAT and GRE both contain timed essay portions.
A rubric is a tool used in writing assessment that can be used in several writing contexts. A rubric consists of a set of criteria or descriptions that guides a rater to score or grade a writer. The origins of rubrics can be traced to early attempts in education to standardize and scale writing in the early 20th century. Ernest C Noyes argues in November 1912 for a shift toward assessment practices that were more science-based. One of the original scales used in education was developed by Milo B. Hillegas in A Scale for the Measurement of Quality in English Composition by Young People. This scale is commonly referred to as the Hillegas Scale. The Hillegas Scale and other scales used in education were used by administrators to compare the progress of schools.
In 1961, Diederich, French, and Carlton from the Educational Testing Service (ETS) publish Factors in Judgments for Writing Ability a rubric compiled from a series of raters whose comments were categorized and condensed into a five-factor rubric:
As rubrics began to be used in the classroom, teachers began to advocate for criteria to be negotiated with students to have students stake a claim in the how they would be assessed. Scholars such as Chris Gallagher and Eric Turley, Bob Broad, and Asao Inoue (among many) have advocated that effective use of rubrics comes from local, contextual, and negotiated criteria.
Multiple-choice tests contain questions about usage, grammar, and vocabulary. Standardized tests like the SAT, ACT, and GRE are typically used for college or graduate school admission. Other tests, such as Compass and Accuplacer, are typically used to place students into remedial or mainstream writing courses.
Automated essay scoring (AES) is the use of non-human, computer-assisted assessment practices to rate, score, or grade writing tasks.
Some scholars in writing assessment focus their research on the influence of race on the performance on writing assessments. Scholarship in race and writing assessment seek to study how categories of race and perceptions of race continues to shape writing assessment outcomes. However, some scholars in writing assessment recognize that racism in the 21st century is no longer explicit, but argue for a 'silent' racism in writing assessment practices in which racial inequalities in writing assessment are typically justified with non-racial reasons. These scholars advocate for new developments in writing assessment, in which the intersections of race and writing assessment are brought to the forefront of assessment practices.