Assessing Learning

Defining assessment

When students and educators hear the word “assessment,” they tend to think of tests, portfolios, and final papers. While these are valid and common ways of measuring student performance, it can be beneficial to consider assessment more broadly as “an ongoing process aimed at understanding and improving student learning” (Angelo, 1995). In this view, assessment is any evidence, graded or ungraded, that instructors collect on students’ progress towards the stated learning outcomes.

Defining assessment broadly invites students to develop cognitive skills over time, and to demonstrate these skills at various points in the learning experience. Likewise, assessment gives instructors opportunities to collect varied evidence of student learning; based on this evidence, instructors can adjust their educational strategies to meet students where they are in the learning process. In this way, assessment positions instructors and students as collaborators in the learning process, thus potentially enhancing student motivation.

Taking a general view of assessment as measuring student learning, we can further differentiate it into two types: formative and summative. The two types serve different purposes, yet both are necessary for creating an effective educational environment.

Formative assessment refers to assessment for learning: gathering feedback that helps both students and the instructor improve teaching and learning. Typical formative assessments include practice problems, drafts or outlines, and other activities that are generally low-stakes (meaning ungraded or worth few points). In general, formative assessments are intended to provide students with opportunities for practice and feedback before they complete a high-stakes assessment.

By contrast, summative assessment is a measurement of, rather than for, learning. Summative assessments, like mid-semester and final exams, papers and portfolios, or even a recital or art exhibition, are designed to measure student proficiency with respect to specific course content or objectives. They are almost always graded, as they “audit” student performance against standards. In a sense, we might think of formative assessment as forward-looking (designed for future student improvement and including timely and supportive instructor feedback), while summative assessment is backward-looking (designed to measure what a student has accomplished at the end of a course or unit, and including limited feedback for future performance) (Fink, 2003). Importantly, these two goals can overlap. For example, a homework assignment can test a student’s proficiency (summative) while also giving both students and the instructor feedback, thus informing future behavior and classroom design (formative).

Impact on student learning outcomes: the testing and spacing effects

Considerable research in the social sciences indicates that frequent, low-stakes assessment has a positive impact on students’ understanding and retention of content. One well-studied phenomenon, known as the testing effect, suggests that testing is not just a way of assessing knowledge, but also a way of applying it. In other words, testing (a form of summative assessment) can also function as practice (formative assessment). In one influential study, Roediger and Karpicke (2006) examined the impact of assessment on short- and long-term knowledge retention. Researchers prompted 180 undergraduate students to read a short written passage. Then, students either re-read the passage 3 times or were tested repeatedly on their knowledge of the passage. Next, both groups completed a retention test that measured their short-term mastery of the material. While the first group (the “readers”) performed better in the short term, when both groups were tested again one week later, the second group outperformed the first. These findings suggest that while cramming material leads to short-term retention, it is repeated testing on course content – a form of practice, or formative assessment – that ultimately contributes to long-term retention. Relatedly, Trumbo et al. (2016) found that students who completed frequent, low-stakes graded quizzes saw the value in this method and even preferred it to merely restudying the material without testing.

A related phenomenon, known as the spacing effect, concerns the timing of, or gaps between, 1) students’ first exposure to the material, 2) their restudying of the material, and 3) their being tested on the material. The optimal gap between these stages, Cepeda et al. (2006) observe, depends on how long students need to retain the information. A study by Rohrer and Taylor (2006) reveals that when practice problems are distributed, or “spaced,” across multiple sessions – as opposed to condensed, or “massed,” into a single session – students perform better on a summative retention test 4 weeks later (the “retrieval” test). These findings corroborate what we know about how the brain retains information, in that the process of knowledge retrieval (recalling and exercising prior knowledge) helps to move information from working memory to long-term memory. When considered alongside the testing effect, the spacing effect suggests that an optimal course design incorporates frequent, low-stakes practice and feedback, as well as periodic reinforcement of key concepts and skills over time.

Impact on student motivation

Beyond helping students retain and retrieve content, interspersing formative and summative assessment can significantly impact student motivation and metacognition. As Trumbo et al. (2016) propose, low-stakes formative evaluation allows for the “reduction of test anxiety,” as well as for the “identification of knowledge gaps.” A famous study by Kruger and Dunning (1999) found that novice learners tended to grossly overestimate their abilities (the Dunning-Kruger effect). Providing learners with opportunities for practice and feedback can allow them to assess their performance more accurately, which is a prerequisite for improvement.

Furthermore, the feedback that accompanies assessment can have a profound impact on learners’ motivation. Research by Blackwell, Trzesniewski, and Dweck (2007) reveals that feedback – when provided in a targeted, timely, and supportive manner – can help students move from a “fixed” mindset (the belief that intelligence is a fixed trait, e.g. “I’m just bad at math”) to a “growth” mindset (the belief that intelligence is malleable, e.g. “I need to work more on my topic sentences in future essays”). Feedback must therefore be both supportive and specific in order to encourage students to persevere on difficult tasks. Feedback plays an especially important role in combating stereotype threat (a phenomenon in which members of minority groups, due to cognitive stress about a stereotype linked to their personal identity, risk underperforming and thus inadvertently confirming the stereotype) (Steele and Aronson, 1995; Aronson et al., 2002). Specific feedback on formative assessments, when delivered encouragingly, can empower and motivate minority students to persevere despite negative stereotypes.

Conclusion

Both formative and summative assessments are necessary for helping students become self-directed learners. When designing assignments, then, instructors should consider:

  • what cognitive skills, or learning objectives, the assignment is designed to measure
  • whether the purpose of the assignment is largely formative or summative
  • what kind of feedback students will receive, and when (in relation to future assessments)

As research on the testing and spacing effects implies, knowledge is most easily retained when students encounter frequent, low-stakes practice, with opportunities for short-term and long-term knowledge retrieval. This sort of intentional course design facilitates the development of metacognitive skills (such as students’ awareness of their own proficiency) and equips students with the feedback they need to improve their performance. Over time, this feedback can encourage students to cultivate a growth mindset and persist in the face of “desirable difficulties.”

References

Angelo, T. (1995). Reassessing (and Defining) Assessment. American Association for Higher Education (AAHE) Bulletin 48(3), 7.

Aronson, J.; Fried, C. B.; Good, C. (2002). Reducing the effects of stereotype threat on African American college students by shaping theories of intelligence. Journal of Experimental Social Psychology 38, 113-125. doi:10.1006/jesp.2001.1491

Blackwell, L. S.; Trzesniewski, K. H.; Dweck, C. S. (2007). Implicit theories of intelligence predict achievement across an adolescent transition: A longitudinal study and an intervention. Child Development 78, 246–263. doi:10.1111/j.1467-8624.2007.00995.x

Cepeda, N. J.; Pashler, H.; Vul, E.; Wixted, J. T.; Rohrer, D. (2006). Distributed practice in verbal recall tasks: A review and quantitative synthesis. Psychological Bulletin 132(3), 354-380. doi:10.1037/0033-2909.132.3.354

Fink, D. (2003). Creating Significant Learning Experiences: An Integrated Approach to Designing College Courses. San Francisco, CA: Jossey-Bass.

Kruger, J.; Dunning, D. (1999). Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology 77(6), 1121-1134. http://dx.doi.org/10.1037/0022-3514.77.6.1121

Roediger, H. L.; Karpicke, J. D. (2006). The power of testing memory: Basic research and implications for educational practice. Perspectives on Psychological Science 1, 181-210. https://doi.org/10.1111/j.1745-6916.2006.00012.x

Rohrer, D.; Taylor, K. (2006). The effects of overlearning and distributed practice on the retention of mathematics knowledge. Applied Cognitive Psychology 20, 1209–1224. doi:10.1002/acp.1266

Steele, C. M.; Aronson, J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology 69 (5), 797–811. doi:10.1037/0022-3514.69.5.797

Trumbo, M.; Leiting, K. A.; McDaniel, M. A.; Hodge, G. K. (2016). Effects of reinforcement on test-enhanced learning in a large, diverse introductory college psychology course. Journal of Experimental Psychology: Applied 22(2), 148-160. http://dx.doi.org/10.1037/xap0000082