Data Have No Meaning Apart from Their Context
Note: This is the second of a five-part series on understanding the concept of variation. Knowledge about variation is one of the components of W. Edwards Deming’s System of Profound Knowledge and is fundamental to correctly interpreting data.
In the K-12 education sector, one of the primary uses of data is in state accountability systems. Many states now issue district and school report cards typically based on various performance metrics such as proficiency rates on standardized tests, absenteeism rates, and college and career readiness indicators. Unfortunately though, as James Leonard stated so eloquently in The New Philosophy for K-12 Education:
Absent an understanding of the type of variation present, any discussion of accountability is a burlesque!
Data in Context
My first post in this series, Numerical Naiveté, ended with a quote from Dr. Donald Wheeler, who may be the foremost expert on understanding variation in the United States. Dr. Wheeler’s first principle for understanding data is, “No data have meaning apart from their context.” This principle serves as a summary of two rules that Dr. Walter Shewhart, the father of statistical quality control, gave for the presentation of data. In Understanding Variation, Wheeler paraphrased Shewhart’s rules, and it will be helpful to review both here. To bring these rules to life, I’ll first describe them, and then I’ll outline a real-life example of how we break the rules in practice in the education sector.
Shewhart’s rule one for the presentation of data is as follows:
Data should always be presented in such a way that preserves the evidence in the data for all of the predictions that might be made from these data.
The implications of this rule are significant. The context in which the data were collected should not be divorced from the data themselves. Even in presentation form, anyone looking at the data should be able to answer some basic questions such as: Who collected the data? How were the data collected? When were the data collected? Where were the data collected? What do these values represent? What is the operational definition of the concept being measured? How were the values of any computed data derived from the raw inputs? Have there been any changes made over time that impact the data set (e.g., a change in the operational definition of the concept being measured or a change in the formula used to compute the data)?
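One way to keep these questions attached to a data set is to store the answers alongside the values. The sketch below is only an illustration: the `DataContext` class, its field names, and the placeholder entries are all assumptions made for this example and do not come from Shewhart, Wheeler, or any real reporting system.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DataContext:
    """Hypothetical record holding answers to the context questions above."""
    collected_by: Optional[str] = None
    collection_method: Optional[str] = None
    collected_when: Optional[str] = None
    collected_where: Optional[str] = None
    what_values_represent: Optional[str] = None
    operational_definition: Optional[str] = None
    computation_formula: Optional[str] = None
    changes_over_time: Optional[str] = None


def unanswered_questions(context: DataContext) -> list[str]:
    """Return the names of the context fields that are still blank."""
    return [f.name for f in fields(context) if getattr(context, f.name) is None]


# Made-up placeholder entry: only two of the eight questions are answered.
example = DataContext(
    collected_by="(hypothetical) state education agency",
    collected_when="(hypothetical) spring testing window",
)
print(unanswered_questions(example))
```

Any field that remains blank marks a question that a reader of the presentation could not answer.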
As a starting point, we typically organize data sets into a table of values like those in Figure 1. However, tables are necessary but insufficient in the analysis of our data. It is very difficult for people to see trends or patterns when data are only displayed in table format. These displays become even more problematic when color-coding or other similar graphics are added to the table for the purpose of comparing the data to each other or to a target. Because people are visually oriented, a time series graph with annotation should accompany the table.
This brings us to Shewhart’s rule two for the presentation of data:
Whenever an average, range, or histogram is used to summarize data, the summary should not mislead the user into taking any action that the user would not take if the data were presented in a time series.
The main reason Shewhart stresses this point is that almost all data occur across time, and in many cases this time order is the point. In other words, it’s the pattern that emerges from viewing the data in time order that gives us the most insight into what is happening with the data we are analyzing. Wheeler takes Shewhart’s two rules and summarizes them in one succinct first principle, the aforementioned: “No data have meaning apart from their context.”
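Shewhart’s second rule can be illustrated with a small sketch. The two series below use invented values chosen only for illustration, and the function names are likewise assumptions. Both series share the same average and range, yet their time order tells very different stories.

```python
from statistics import mean

# Invented values for illustration only; they are not from any real report.
steady_rise = [60, 62, 64, 66, 68]
no_pattern = [68, 60, 66, 62, 64]


def summarize(values):
    """Return the (average, range) pair that a summary table would report."""
    return mean(values), max(values) - min(values)


def period_to_period_changes(values):
    """Return each value minus the one before it, preserving time order."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


for name, series in [("steady_rise", steady_rise), ("no_pattern", no_pattern)]:
    print(name, summarize(series), period_to_period_changes(series))
```

The summaries are identical, so a reader who sees only the average and range cannot tell a steady climb from values that bounce up and down. Only the time-ordered view separates the two.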
An Education Example
It will be helpful here to turn to an example to bring this principle to life. Each year in the Buckeye State, the Ohio Department of Education releases a summary of state testing results in a five- to six-page document. A snapshot from the 2017-2018 results is shown in the figure below.
Figure 1: Ohio School & District Results 2017-2018, Page 2
The headline for the full document (not included) is stated in red in all caps on page 1 of the report: “STUDENT ACHIEVEMENT INCREASES SEEN STATEWIDE IN 2018 OHIO SCHOOL REPORT CARDS.” This is followed by the headline on page 2 that accompanies the data table above: “Ohio students continue to show improved achievement in academic content areas.” The data table shows all of the grades and subject areas where state tests are given each year in grades 3-8 and in high school in the first two columns. In the next three columns, proficiency rates for the 2015-2016, 2016-2017, and 2017-2018 school years are listed. In the column on the far right, there is a green arrow pointing up to show that test scores increased from 2016-2017 to 2017-2018 or there is a red arrow pointing down to show that test scores decreased from 2016-2017 to 2017-2018. There is no doubt that the average proficiency rates with green arrows had a year-to-year increase. For example, the percentage of proficient students in 4th grade math went from 72.4% in 2016-2017 to 72.5% in 2017-2018. On the flip side, there is no doubt that the average proficiency rates with red arrows had a year-to-year decrease. For example, the percentage of proficient students in 3rd grade reading went from 63.8% in 2016-2017 to 61.2% in 2017-2018.
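The arithmetic behind the arrows can be sketched in a few lines. The two rows below use only the proficiency rates quoted above. The function names and the `arrow` logic are assumptions meant to mimic the report’s color coding, not the state’s actual method, and the "no change" branch is included only for completeness.

```python
# Proficiency rates (percent) quoted in the text above.
rates = {
    "4th grade math": {"2016-2017": 72.4, "2017-2018": 72.5},
    "3rd grade reading": {"2016-2017": 63.8, "2017-2018": 61.2},
}


def change_in_points(row):
    """Return the 2017-2018 rate minus the 2016-2017 rate, in percentage points."""
    return round(row["2017-2018"] - row["2016-2017"], 1)


def arrow(change):
    """Mimic the report's color coding: direction only, no magnitude."""
    if change > 0:
        return "green up"
    if change < 0:
        return "red down"
    return "no change"


for label, row in rates.items():
    change = change_in_points(row)
    print(f"{label}: {change:+.1f} points -> {arrow(change)}")
```

A rise of 0.1 percentage points and a fall of 2.6 percentage points each receive an arrow of equal visual weight. The arrow records direction but says nothing about size or about the history that came before.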
Defining Improvement
The key question, though, is not whether there was an increase or decrease in our data, but rather whether that change was meaningful. Does an increase in proficiency rates in the state testing data represent improvement? Does a decrease in proficiency rates in the state testing data represent a lack of improvement? In order to answer these questions, we need to do two things. First, we need a clear definition of improvement, which I will outline in a moment. Second, we need to understand the two types of variation and the two types of mistakes we can make in interpreting variation in our data. The latter explanation will come in Part III of this series.
The best definition of improvement that I’ve come across has three dimensions and fittingly comes from a seminal book on the topic entitled The Improvement Guide: A Practical Approach to Enhancing Organizational Performance (2nd Edition):
Improvement results from fundamental changes that do the following: (1) alter how work or activity is done or the makeup of a tool; (2) produce visible, positive differences in results relative to historical norms; (3) and have a lasting impact.
When the idea of improvement is viewed through this lens, Leonard’s idea of “accountability as burlesque” comes much more into focus. There is no way to claim improvement or the lack thereof without first having considerably more information about the historical context from which the state testing data are derived. Two or even three years’ worth of data do not provide enough context to validate the state’s improvement claims. In subsequent posts in this series, we’ll define two types of variation and two types of mistakes we can make in interpreting this variation before returning to Figure 1 to learn why the narrative that accompanies the table is akin to writing fiction.
***
John A. Dues is the Chief Learning Officer for United Schools Network, a nonprofit charter-management organization that supports four public charter schools in Columbus, Ohio. Send feedback to jdues@unitedschoolsnetwork.org.