NO exams but for education
When I set out to craft thirty-one straight posts on the blog Testing: A Personal History, my outline of topics designed to effect the exorcism of my educational measurement experiences seemed to fit within that January timeframe. I was wrong. Again. This erroneous estimation is of a piece with 28 others in which every day I would underestimate the amount of time required for me to write that day’s post. Such inaccuracy matches my lifelong record of always thinking that my writing will get done in a shorter amount of time than it ever takes. Ever. Ever. Ever.
The upshot is that several topics will remain, for the moment, uncovered, or rather unexplored, by end of day Monday, day 31. I promise to return to them in the coming weeks, but not with this circadian requirement. The postponed posts include looking specifically at testing within the university environment, which is a subject that several of my friends who come from that milieu have agreed to discuss with me. Another topic that I think will be of great interest is Myers Briggs, its connection to ETS, and a story that I was told by the very person who made sure that its makers and the test itself, because of its lack of validity and reliability, were banished from that citadel of psychometrics. And there are a few other tales of failed innovation and management in the world of testing that merit recounting and that will also have to wait.
Those delays are necessary because, having strung together my thoughts about testing each day this month into what I hope is a compelling argument, I must now deliver its peroration to catch the attention of as many readers as possible. All the posts so far serve as background to this plea:
NO tests but for learning;
NO exams but for education.
What I believe must be done to save testing from both its often ill-informed or hopelessly biased critics and its all too insular, risk-averse industry leaders and proponents is for citizens to insist on different tests than those normally administered in our educational systems. The call to arms is to require no tests but for learning and no exams but for education. (Spoiler alert: An end of term exam is not for the purposes of education. Most exams now given exist to tell, however imperfectly, what someone knows at the moment; they are summative. They rarely have any formative uses that, employing all that we know about learning theory and cognitive science, help the student to get better.)
We must shift the enormous resources that go towards these ubiquitous summative tests entirely over to formative tests, to the kind of test that my friends and former colleagues Caroline Wylie and Christine Lyon described as an assessment that “takes place before or during the instruction with the explicit purpose of eliciting evidence that can be used to improve the current learning. One widely accepted definition of formative assessment describes it as a classroom-based process in which students and teachers collect evidence of learning in order to understand current learning progress and to make adjustments to learning or to teaching as necessary.”
Formative testing lets someone know how they can learn whatever should be next for them. Formative is also what my friend and former colleague Alina VonDavier over at Duolingo is attempting in asking whether tests can be delightful in that they help us to learn in a way that we find engaging.
Formative testing is NOT what the College Board has in mind when, as a friend noted to me, they said in their latest press release regarding the 2024 SATs that it would be “…digital and more relevant”. I suspect “more relevant” refers to rearranging the deckchairs on the things examined, but not in a way to help people learn.
We know the problem of insufficient learning in so many areas, from 3rd grade to takers of the Chartered Financial Analyst (CFA) exam: The pass rate for the Level 1 exam on the latter fell to just 25% in the latest round of results. Average reading and mathematics scores for the nation’s 9- and 13-year-olds have remained flat or declined since 2012. That’s ten years. As they used to say on Saturday Night Live, “Francisco Franco is still dead” and we still lack the skills in our kids that we AND they desire.
We know the problem, so why are we still testing to find out if there is a problem? Let’s declare instead that we need to try something very different with our national and local testing regimens. Shift the dollars completely to formative testing and the structures necessary to make that work.
Navigating or, better yet, bypassing the bureaucratic quicksand that keeps K-12 education in the United States from change of any sort, except perhaps banning books, will prove difficult and time-consuming. The nation’s educational leaders still have not rebounded from what we did (and spent) with Smarter Balanced and PARCC, which was a sh**show, as this fellow points out.
We keep looking in K-12 where it is established how to look and fail to devote sufficient resources to different areas that might make a larger difference in the development of individuals. There are some exceptions such as creativity, but they are rare.
Want to see where testing should go? Try this video from The Next Generation Science Standards (NGSS) whose leaders “envision student multi-dimensional learning. Our project is a proof-of-concept study to develop and test an innovative toolkit that would automate diagnosis and feedback on student science learning. The project plans to make critical improvements to understanding student reasoning patterns associated with multi-dimensional learning of ecosystems, creating automated classification of those patterns, and providing immediate feedback for teachers and students.”
Immediate feedback indicates how the student is actually reasoning so that the teacher and the student can either reinforce what’s going on or seek to remedy where there may be faults. This kind of formative testing is what is needed: a type that “utilized natural language processing and machine learning techniques to automate [in this case] the reasoning pattern diagnosis.”
My experience at ETS is that the money and attention to make this kind of innovation happen were not forthcoming from state education departments. The research is there, but the willingness to make changes is not, for the most part. So where should we start this shift to saying “NO tests but for learning”?
A Case for Reskilling and Upskilling America’s Middle-Skill Workers in the 21st Century
That K-12 alteration has to be done, but perhaps starting with adults, particularly younger entrants to the workforce, makes more sense, especially as employers who have felt the pinch of insufficient labor supply at all levels of the economy may prove to be allies in this transformation. Guidance specific to this path exists right now in papers like this one:
“A powerful example of this finding comes from an examination of data from a large-scale assessment of adult skills, which indicates that over half (53 percent) of young adults ages 16–34 with a high school degree and some postsecondary education, typical of middle-skills workers, lack the skills that many experts believe are required to meet the challenges of today’s technological workplace where middle-skill occupations are increasingly demanding higher levels of cognitive skills.”
The full report is available here. Someone needs to put it into practice.
The authors, Irwin Kirsch, Anita Sands, Steve Robbins, and Madeline Goodman, are all former colleagues and current friends who in this work connect to a very specific theory of action as to what needs to be done.
But has anybody outside of ‘the academy’ read this paper? How was it promoted or enacted by ETS or other entities? A search of several newspaper indexes returned zero hits. The New York Times was also a strikeout on the title “Policy Report Buttressing the Middle: A Case for Reskilling and Upskilling America’s Middle-Skill Workers in the 21st Century.” This is not the first time or even the hundredth time that important, even groundbreaking, ETS research has gone unheard and unread. The failure to properly advocate for and disseminate its work, by an organization that as a non-profit is a public trust in that regard, hurts all of us.
Reality Check
It’s not easy to effect what the authors suggest in this paper. Their theory of action requires a formative type of learning system, something I actually have experience in creating in my past life. Such a structure is not principally about technology, but more about anthropology: how people and institutions currently operate and how we would get our fellow citizens to change so that we could draw upon the innovations available in learning.
More on this tomorrow. T-shirts with our slogan NO tests but for learning will be available!
Interesting position and argument. Always great to read your reflections, T.J. While I agree shifting the amounts of formative and summative testing might be helpful, you undercut your argument for “no summative” by citing summative NAEP scores to support it when you say “Average reading and mathematics scores for the nation’s 9- and 13-year-olds have remained flat or declined since 2012.” Summative assessment has many purposes including informing policy decisions and knowing if major educational changes are having positive or negative effects at district, state, and national levels. I wonder if your argument is intended to focus on the questionable value of summative assessments when used at the individual level? And to reduce summative assessment to places where there is a strong value proposition (like the NAEP scores you use in your argument) with few negative consequences?
As ever, plenty of food for thought (especially the link to the excellent ‘Policy Report Buttressing the Middle…’) but to follow on from Malcolm’s comment, whilst I am wholeheartedly supportive of more* and better* use* of formative* assessment*, we can never do away with summative* assessment* – it is relied upon to measure* progress*, attainment*, achievement* – the list goes on. I fully accept that my asterisked words all need defining, and perhaps to some extent defy definition in this context! But I don’t think that’s what you were saying, T.J., but I think/hope we can agree that the use/deployment of summative assessment could be more nuanced/judicious, better understood, and less knee-jerk-implemented-because-we-don’t-know-what-else-to-do?