A decade after their release, A Framework for K–12 Science Education (2012) and the Next Generation Science Standards (NGSS) (2013), as well as revised state science standards, continue to transform science instruction in the US. As instruction changes, so too must assessments to meet the expectations of three-dimensional standards and learning. A strong science assessment serves as an important tool for educators to gauge students’ knowledge building and for students to see how their knowledge can be applied in different contexts.
If we truly want to ensure that students are building enduring science knowledge and developing sensemaking skills, science assessments must move beyond repetition of memorized information and ask students to apply their learning in novel contexts. How must assessments change to rise to this challenge?
In this research brief, we’ll look at the evolution of three-dimensional assessments alongside the creation of new science standards, how practitioners translated goals and expectations into actionable frameworks, and the resources currently available to support the development of assessment tasks. This brief explores assessment design only as it relates to classroom-based assessments, not state tests.
The Goals of Three-Dimensional Science Assessments
The release of A Framework for K–12 Science Education by the National Research Council (NRC) brought a paradigm shift in science education—moving away from curricula and instruction that present processes and content separately toward an integrated, three-dimensional approach in which students use Science and Engineering Practices (SEPs) and the lens of Crosscutting Concepts (CCs) to develop Disciplinary Core Ideas (DCIs).
The Framework established goals for the future of science assessment, including more robust assessments that allow students to apply their knowledge. Next generation science assessments should test students’ understanding of both “science as a content domain” and “science as an approach” (NRC 2012, 263). Assessments should give students opportunities to apply their current knowledge to figure out new phenomena and problems and to demonstrate that they are “building on their existing knowledge and skills in ways that lead to deeper understanding” of science (NRC 2012, 263).
After the development of the Framework and the release of the NGSS, the National Research Council established the Committee on Developing Assessment of Science Proficiency in K–12 to develop recommendations for creating assessments that measure student proficiency as defined by the Framework. The committee’s findings culminated in the 2014 report, Developing Assessments for the Next Generation Science Standards. The committee identified three specific challenges in designing and developing assessments aligned to the goals of the Framework and the NGSS: how to assess three-dimensional learning, how to assess the development of three-dimensional learning over time, and how to cover the breadth and depth of content over time.
In response to these three challenges, the committee reached the following four conclusions. (Note: the term assessment tasks includes any learning activity that is designed to elicit evidence of student knowledge and achievement of intended learning outcomes.)
- To meet the demands of three-dimensional learning, assessment tasks will need multiple, interrelated components and questions. Each component may not cover all three dimensions, but the task in totality needs to ask students to engage in three-dimensional learning.
- Assessment tasks must provide students the opportunity to demonstrate how their knowledge and understanding grow and deepen over time.
- Assessing all of the expected science learning at each grade level will require multiple, and varied, assessment opportunities.
- Assessment tasks should not be directly mapped to one NGSS performance expectation. Instead, multiple assessments across the grade level or grade band may be needed to truly assess student mastery of an NGSS performance expectation. One assessment task may, however, address multiple related NGSS performance expectations. Further, assessment tasks may probe students’ application of a practice in more than one disciplinary context.
While developing assessment tasks that meet the expectations defined by the committee may seem daunting, the committee offers encouragement that good three-dimensional science instruction will provide educators with many opportunities for informal assessment. Since an assessment task is any learning activity that is designed to elicit evidence of student knowledge and achievement of intended learning outcomes, opportunities for assessment occur in nearly every learning sequence. The committee notes opportunities for educators to observe evidence of student thinking “such as when students develop and refine models; generate, discuss, and analyze data; engage in both spoken and written explanations and argumentation; and reflect on their own understanding of the core idea and the subtopic at hand (possibly a personal science journal)” (NRC 2014, 87).
With goals from the Framework, guidance from the NGSS, and the findings of the committee, the science education community had to take the next steps of operationalizing these goals and guidelines into classroom-level assessment tasks.
Developing Early Assessment Task Frameworks
The Committee on Developing Assessment of Science Proficiency in K–12 developed and commissioned the creation of early prototype assessment items. As a result, science educators and science education researchers have had the benefit of seeing the explicit thinking and rationale of assessment task creators. These early assessment tasks helped make the abstract notion of three-dimensional assessment more concrete and provided a starting point for others in the field to conceptualize how to develop their own assessment tasks. As the committee members noted in their report, there are many possible task designs and the ones presented are only a sampling, but in them, some common attributes began to emerge.
To help further define these attributes and build early assessment tasks that align to the expectations of the Framework, Achieve, an education nonprofit that served as the project manager and writing facilitator for the NGSS, launched the Task Annotation Project in Science (TAPS) to answer the questions “what does it look like to ask students to demonstrate progress toward three-dimensional standards?” and “what are the most important features of high-quality science tasks?” (Achieve n.d.). This project found that there are “many different and appropriate ways to assess three-dimensional standards,” but it identified five “must-haves” that assessment tasks must incorporate to be considered designed for three-dimensional standards (Achieve n.d., 1). To meet that bar, a task must do the following.
- Be focused on a phenomenon or problem. Assessment tasks must focus on making sense of a phenomenon or addressing a problem so students can demonstrate their understanding and use of the three dimensions to respond to the task.
- Require students to engage in sense-making. The primary goal of assessment tasks is to actively engage students in sense-making. Assessment tasks should emphasize reasoning so students can demonstrate their understanding of ideas and practices instead of demonstrating the ability to follow procedures or share back information without application.
- Require students to use both science ideas and practices. All forms of assessment should require students to use at least one SEP and one core idea together as they demonstrate their sensemaking process. One SEP and one core idea is the floor for assessment tasks, not the ceiling.
- Make sense to students. Assessment tasks need to be both understandable and accessible to all students. To ensure that assessment tasks make sense to students and are equitable, the assessment tasks must be developed with readability in mind, in addition to providing sufficient information and logical scaffolding from the student perspective so that all students see how each part of the assessment task helps address the phenomenon or answer the question posed in the task.
- Support the intended purpose and use. Not all assessment tasks serve the same purpose in a curriculum. Each assessment task needs to be clear about the purpose it serves and must produce evidence that meets that stated purpose. There should be clarity about what is being assessed through a task and what is not.
With the guidance provided by the Framework, the NGSS, the NRC assessment committee, and the work done by Achieve, a strong foundation was set for practitioners and researchers to begin developing their own assessment tasks to measure three-dimensional learning. Developing good assessment tasks can be challenging and time consuming, but fortunately, resources exist to help educators create their own or leverage existing ones.
Current Resources for Evaluating Assessment Tasks
As more assessment tasks and items are developed by the science education community, educators need resources to evaluate the quality of those assessment items. These three tools developed by Achieve for the NGSS were designed to inform educators’ reviews of assessment items.
- The Science Task Prescreen (2018) was designed as a first step to help educators decide if a task is aligned to the intent of the Framework and NGSS and worthy of further exploration. There are eight “red flag” questions reviewers answer to determine if the task is meeting the minimum requirements of a well-designed three-dimensional assessment task. If there are no red flags, reviewers are encouraged to proceed to the Science Task Screener.
- The Science Task Screener (2018) was designed to “determine whether classroom assessment tasks are high quality, designed to elicit evidence of three-dimensional performances, and designed to support the purposes for which they will be used.” The screener is organized around four criteria: whether tasks 1) are driven by “high-quality scenarios that focus on phenomena or problems”; 2) “require sense-making using the three dimensions”; 3) are “fair and equitable”; and 4) support their intended targets and purpose (Achieve 2018, 1).
- A Framework to Evaluate Cognitive Complexity in Science Assessments (2019) was designed to help educators understand “the degree to which the three dimensions contribute to sense-making in an assessment task” (Achieve 2019, 2). The framework has two essential questions: 1) To what degree does the assessment task ask students to engage in sense-making? and 2) In what ways does the task ask students to use each of the dimensions in the process of sense-making?
Creating three-dimensional science assessment tasks is no easy feat. Thoughtful assessment tasks carefully consider the intended learning outcomes and how they can be assessed in an authentic context. These tasks help students demonstrate what they know as it is applied in a new setting while helping educators understand what students know and where any misconceptions may still exist.
Three-Dimensional Assessment Tasks for All Students with PhD Science®
Schools and districts seeking to adopt high-quality instructional materials in science should consider the assessment types and tasks embedded in those materials. When assessment tasks are built into a curriculum, educators can reallocate the time they might have spent developing tasks to preparing their instruction to ensure all students have opportunities to build enduring science knowledge and demonstrate that knowledge through high-quality assessment tasks. PhD Science assessment items were developed to meet the goals and expectations of three-dimensional instruction and assessment. The curriculum writers thoughtfully designed assessment tasks that engage students with authentic phenomena and problems and that give them opportunities to apply their knowledge. PhD Science includes both formative and summative assessments. In every module, educators will find these four types of assessment of student learning.
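- End-of-Module Assessment: one per module
- Engineering or Science Challenge: one per module
- Conceptual Checkpoint: one per concept
- Check for Understanding: at least one per lesson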
1. Checks for Understanding formatively assess students in the moment at essential instructional points as they are making sense of a phenomenon or designing a solution to a problem. Checks for Understanding provide teachers an opportunity to observe student thinking. There is at least one Check for Understanding per lesson that is related to the phenomenon or problem under investigation. Some Checks for Understanding are multi-dimensional while others intentionally target one dimension.
Each Check for Understanding may not cover all three dimensions, but through the course of a lesson set and over a module, all dimensions are assessed. For example, in Level 2 Module 3 Lesson 2, the Check for Understanding looks for evidence of students using an element of an SEP and a DCI, but not a CC. However, in Lesson 6 of that same module, students are assessed on a CC. Since Checks for Understanding occur at the lesson level, they provide students with the opportunity to demonstrate how they are building knowledge as they make sense of phenomena.
Throughout this assessment task structure, students practice their knowledge and skills within the current phenomenon or problem, which over the course of the module prepares them to later apply their knowledge and skills to make sense of a novel phenomenon or problem.
2. Conceptual Checkpoints are designed to measure deep conceptual understanding of a module’s DCIs and the skills and knowledge associated with SEPs and CCs. These Checkpoints can be used as formative or summative assessments and are found at the end of each concept of the module. There are two to four Conceptual Checkpoints per module. For example, the first Conceptual Checkpoint in Level 2 Module 3 occurs in Lesson 7. In this example, the assessment includes all three dimensions, and students must combine content knowledge with skills and processes to succeed on the Checkpoint.
Conceptual Checkpoints can be related to the phenomenon or problem students are already making sense of in the module or a new phenomenon or problem. Focusing an assessment task on a new or related phenomenon or problem allows students the opportunity to apply their learning in a new context. Checkpoints may only address part of the knowledge that has been learned in the concept that is needed to make sense of the phenomenon or problem. In this Conceptual Checkpoint, students are focused on the question “Do different amounts of natural resources change how well a certain type of plant grows?” Students need to understand the answer to this question as part of a larger body of knowledge as they work to investigate the Essential Question “How did local plants recover after the eruption of Mount St. Helens?”
3. Engineering or Science Challenges allow students to apply their conceptual knowledge to solve real-world problems or address real-world questions. These challenges are found in every module and can be used as formative or summative assessments. Engineering or Science Challenges are often completed over the course of several lessons as students apply their module knowledge to answer the question or solve the problem posed by the challenge. For example, in Level 2 Module 3, students are presented with an engineering challenge in Lessons 14–18, in which they develop pollination tools that can help humans pollinate plants in the absence of pollinators. This challenge unfolds as follows.
- In Lesson 14, students define the problem.
- In Lesson 15, students imagine solutions to the problem and test pollen-collecting materials to determine which materials best transfer pollen. Then they collect data that guide their decision-making when they plan their pollination tools.
- In Lesson 16, students plan, create, and test their initial pollination tools. This lesson provides students the opportunity to demonstrate their learning in a three-dimensional manner as they draw and label plans that show which materials they will use and how they will combine the materials to make their tools (DCI). Then they test their tools and make claims (SEP) about how well their tools’ structure supported their function (CC).
- In Lesson 17, students reflect on their test results before improving and retesting their pollination tools.
- In Lesson 18, students present their tools to the class and evaluate each other’s pollination tool designs.
In Engineering or Science Challenges, students take the knowledge and skills they developed while sensemaking in the module and apply them in a new context. When students are asked to apply information in a new context, they develop a deeper understanding of the content and processes. This in turn prepares them to apply their learning in future novel contexts.
4. End-of-Module Assessments give students the opportunity to demonstrate and transfer the knowledge and skills they acquired throughout the module in the context of one or more new phenomena. These are intended to be summative assessments. End-of-Module Assessments require students to combine the DCI, SEP, and CC knowledge they developed throughout the module and apply it to a new context. The End-of-Module Assessment in Level 2 Module 3 revisits the island of Surtsey, which was the anchor phenomenon of Level 2 Module 2. Whereas students previously focused on how the island formed, they now consider the plant life on the island and how it arrived there. Students are asked a series of questions about the island, which require them to apply the knowledge they developed as they made sense of the module anchor phenomenon, the eruption of Mount St. Helens. The assessment includes three item clusters, each composed of multiple components. The rubric shows how each item cluster addresses all three dimensions, defines what student success on each assessment task looks like, and provides an example of what evidence of students engaging in each of the targeted dimensions of the task looks like.
Meaningful science assessments help educators see what students know and help students build enduring knowledge. They do so by engaging students in sense-making that uses the three dimensions in novel contexts. When science assessment is done well, educators gain key insights about their students’ learning. In addition, educators can have more confidence in both what their students learned about the phenomenon and how their students will apply that knowledge in the future. With high-quality science instruction and assessment, students begin to see science as a connected body of knowledge where skills and processes cross over content areas and where knowledge is intertwined. When students see science as a body of knowledge rather than as discrete concepts that they only need to know for the purposes of assessment and can then quickly forget, their future learning will be connected to their current learning. We want all students to build enduring knowledge. With knowledge, students’ potential is limitless.
Jenny has over a decade of experience in education policy and research. She has worked with states and districts on the development and implementation of college and career readiness policies, especially around the implementation of rigorous standards and high-quality instructional materials. She has extensive knowledge about K–12 standards, graduation requirements, assessments, and accountability systems nationwide. Additionally, she has conducted research for school districts to address pressing needs in those districts. Jenny received her B.A. in English and education from Bucknell University and her M.Ed. in education policy from the University of Pennsylvania Graduate School of Education.