Summa-wha? Defining our assessment buzzwords

I have a problem: I’m fresh out of my B.Ed program, new to teaching, and I’m easily hooked on expanding my vocabulary.

If you aren’t a teacher, you may wonder how these things add up to a problem.  If you are a teacher, you probably already have buzzword-proximity sirens going off.  You may experience slight itching or allergic reactions as I nitpick on definitions for things like “summative” and “formative”.  Warning: may contain uses of “flow”, “pedagogy”, or “reflect”.  (Okay, maybe that last one is just an SFU thing.)

When my B.Ed program (called PDP there) first brought up assessment, the guest speaker ran an activity where words were passed around on index cards.  I don’t even remember what we did with them, but the important part was that I stumbled across the cards that said “summative” and “formative”.  I asked the speaker what these meant, and she pulled a classic PDP technique – she asked me what I thought they meant.  I suggested something totally wrong, about summative being a summary and formative being … um, I had no idea.  She nodded, didn’t disagree with me, offered no better definition, and suggested I was on the right track.  We then proceeded to talk about assessment some more, which gave me enough context to see that I was pretty much wrong, but left me just uncertain enough that it continued to haunt me.

And thus, my journey began.

At the time, this kind of annoyed me.  In retrospect, I am so glad that I had this experience rather than someone trying to lay out the facts for me.  This got me hooked on figuring out just what this nonsense was supposed to mean, which helped me dodge a lot of conceptual landmines that I watched other teachers, new and otherwise, get hit by over and over.

There are a number of misconceptions (or things I *think* are misconceptions; still learning here) that I’ll quickly address, and then I’ll try to round out my own definition at the end that hopefully clears up these myths.

Myth #1: Summative = traditional assessment (exams, tests);  formative = progressive assessment (SBG, informal assessment, etc)

Myth #2: Summative is bad, formative is good.

Myth #3: An assessment is either summative or formative, not both.

I’m hoping that for those of you who have looked into assessment practices, these myths are familiar enough that I don’t need to expand on them.  I could dredge up plenty of examples from my own experience, but let’s cut to the good stuff.

My current understanding:

An assessment is summative when it reports information on a student’s past learning to the outside community. This corresponds to the buzz-phrase “assessment of learning” that’s also passed around in educational circles.

An assessment is formative when it reports information on a student’s current learning to the student and/or the instructor. The matching buzz-phrases here are “assessment for learning” and “assessment as learning”.

Probably the biggest shift here away from the myths listed above is to start using these words to describe functionality, rather than as mutually exclusive categories.  There are many types of assessment, both traditional and progressive, that function as both summative and formative assessment.

Examples:

  • A traditional unit test is summative in that it is included in the grade that ends up on the report card.  It is formative in that it informs students on what they may need to study for a final exam.
  • A Standards-Based Grading assessment is formative in that students are informed of the results in time to adjust their understanding, study, practice, etc and be reassessed later.  SBG assessments are also summative in that they form the final grade which goes on the report card at the end of the course.

One reason I think we really need this clarification in how we use these words is that new assessment strategies such as SBG are not only better at functioning as formative assessments, but are also potentially much better summative assessments.  eg. SBG does a great job of gathering information on what a student has learned, and that information could (in theory) be passed along to the outside community, or to the teachers the student will see next year.

Another related reason I think we need to be clearer in how we use these terms is that if we aren’t, we risk falling hard into Myth #2, which ends up ostracizing those peers who primarily use traditional assessments.

On the flip side, if we focus on using “formative” and “summative” to describe functionality, we can start having useful discussions about how good a job a given assessment does at being formative or summative.  In my perfect world, we should be looking for overall assessment strategies that are fantastic from both a summative and a formative standpoint.  (In my actual world, we’re usually stuck within a very rigid system of how summative results are reported, so there’s only so much you can do. But, still.)

There’s still (at least) one huge understanding lurking in those definitions that I haven’t expanded on, but this is already rather long and I think I’ll need a diagram to explain what I’ve got in mind next.

Does this help anyone?

Summer Writing List

Things I would like to blog during the oh-so-near summer break:

  • deconstructing classroom logistics in my night-school math 12 class (how group work did or didn’t work; WCYDWT with grown-ups, how I structured note-taking, etc.)
  • deconstructing what I actually taught:
    • that senior-level trig stuff: unit circle vs ASTC vs ???
    • some fun stuff that actually worked that I haven’t talked about
  • deconstructing (is he still using that word? sheesh) a summer math-ed class I took last year during my B.Ed that was amazing, and which I’d still like to remodel my own classroom after. I need to think through why it worked, how it worked, and how to steal it and make it my own.  (And I want to share the good stuff so that I’m not the only one trying it.)  There’s some overlap here with the previously mentioned topic of co-operative / collaborative group work, which was a key feature of that class.
  • Maybe something about Civ-games and colonialism; may review that Canadian History mod I stumbled across this week as a case study.
  • Someone challenged me to make a math lesson out of some crazy web-game about fish?  I may have to take them up on that challenge.

Back to that whole game design / teaching thing

So, more on that post I made a little while back and failed to follow up on.

The key points of the game design analysis that came up (for that case) were:

  1. Keep it small.
  2. Keep the action constant.
  3. Reward success.
  4. Extend the challenge as people master the basics.

How many of these are practical tips for your classes?  Here are my first thoughts.

1. Keep it small.

This is something I’ve incorporated into my assessments.  I try to keep them as short as they can be.  What does this mean?  If I am assessing a given standard, two good questions are just as meaningful as ten mediocre ones.  What is the point of asking twenty questions on the same topic?  Are we testing comprehension, or mental endurance (ie. tolerance levels for boredom and redundancy)?  At best, this is just wasting everyone’s time; at worst, it’s disadvantaging students who do get it but have attention problems.  (Yes, it’d be good for those students to learn how to cope, and college exam prep etc etc, but I still want them to know they actually do get this stuff.)

Plus, um, it’s more work for me.  Why would I do that to myself?

There’s more grey-area stuff to explore on this point, but let’s move on.

2. Keep the action constant.

If there is one thing I’ve learned as a teacher-on-call, it’s that boredom leads to scary things being thrown at me – er, trouble.  But this doesn’t mean busy work.  If my design goal is to get students thinking, then busy work is nearly as bad as doing nothing at all.  The parallel to teaching is to keep them thinking.

This relates directly to the last point: 4. Extend the challenge. Students who get it should still be kept thinking.  Got a handful of students who finish the assigned problems in half the time of the rest of the class?  Have a few tougher problems in your back pocket on the same topic.  (How well prepared I am for this has varied greatly.)

The flip side is to do what you can to keep students from shutting their brains off and giving up.  This means support mechanisms.  At this point, though, it’s probably obvious I’m talking about all the “differentiated instruction” tricks that get praised in Education circles but which are sometimes really hard to make actually work.  All I can say is, take it with a grain of salt as needed, but don’t give up on it.  The plan I think can work is simply emphasizing group work – peers are an instant support structure.  But my adventures with structured group work in a classroom are another post.

3. Reward success.

For the design of Super Meat Boy, this meant a unique replay system that showed all your failed attempts at a level simultaneously while replaying your success.  The message?  “Look at this crazy hard level that beat you up so many times and now you beat it! Therefore you are awesome.”

The goal for their game was to make something crazy-hard, but keep building up skill and confidence in the player so that they persist through the challenge and get to enjoy the success at the end.  They did this by deliberately not dwelling on failure, giving as many chances to succeed as the player needs, and highlighting what the player accomplished at the end.

Do you hear that? That’s the sound of an entire industry mastering the art of creating self-efficacy in people.  (Sorry, I know it’s a ten-dollar word, but ever since I found a word that describes exactly what is most needed for students to succeed in math, I can’t let it go.)

The thing is, when teachers talk about crazy things like standards-based grading, replacing poor marks with good ones whenever students demonstrate mastery, etc, it’s almost guaranteed that someone will come out of the woodwork and complain, “Oh great, more dumbing down the math class. Hope I’m not stuck with your students next year!”  But even the video game design example we’re looking at here is all about making things hard in a way that people won’t give up on.  Get that?  This is not about dumbing down – this is about training students not to give up.

Does everyone need “Math”?

A couple of days ago, Rhett Allain of Dot Physics suggested that not everyone needs a “functional understanding of math” to get by.

What percent of people in this world have a functional understanding of math? (let me just say functional understanding means they can do basic word problems and understand what is going on) If I estimate this percent of people at 50%, you might argue that this is too high, but that is my estimate. …

Now compare this to the ability to read and write. I think that in this society you really need to know how to read and write to get along

He then goes on to say, thankfully, that yes, the world would be a better place if more people understood math, and yes, you should study it.

However, I think it’s still worth critiquing what’s going on here.  Rhett’s definition of functional math competency boils down to understanding word problems.  Later on, he continues in the comments and generalizes to “anything above counting”, although he seems to mean anything above basic arithmetic.

There are some weird problems going on here, and they’re not uncommon.  First of all, “word problems” are given as a starting point – probably because they symbolize a roughly late-elementary or middle-school level of mathematics.  And yet, when is the last time you saw a word problem outside of a classroom?  Of course word problems are irrelevant to real-life competency – they’re a construct used solely in schools.  Ironically, these problems test reading and writing skill as well as mathematics, and often it’s the reading comprehension that makes students struggle with them.

But looking beyond that, there’s a bigger problem of apples-to-oranges.  Anything beyond arithmetic is considered “math”, and arithmetic is something lesser.  And yet, isn’t arithmetic the same kind of base-level functional skill that reading and writing are?  To put it another way, when is the last time you heard someone say, “Oh, I don’t know how to write” because they aren’t any good at composing sonnets, or writing essays?  And yet “I don’t know how to do math” is the message we hand to students who have trouble with factoring, solving equations or deciphering clumsy word problems.

So do we need “math”?  If we’re talking basic numeracy, yes.  Beyond that, we may be able to get by, but we’ll be a lot more competent and much less taken advantage of if we expand our numeracy to include basic probability, statistics, and core algebra skills.  I file this directly next to skills like: reading and critiquing print media and rhetoric, critical media literacy (ie. video, etc), basic political and economic knowledge.  I view these as near-mandatory as a public school educator because I don’t want my students to grow up and get ripped off, taken advantage of, and used for other people’s political or economic gain without understanding what just happened.  They may be able to get by without it, but they won’t be as free as they could be.

Standards-based grading made easy (and less effective)

I’m using standards-based grading (SBG) in my night school Math 12 class right now.  Kind of.  What it amounts to at the moment is that I take the end-of-unit assessment and split it apart into one mark per standard (ie. per skill) rather than a lump-sum grade.

However, SBG is often hailed as a formative assessment tool.  I am doing a lousy job of formative assessment – not completely absent, but not great.  This approach does give students better feedback to use on rewrites, but I don’t particularly have time to adjust my lessons by the time I mark these end-of-unit tests.

The reason I think this is worth sharing is twofold: a) to demonstrate that using SBG is no guarantee that you’re doing things “formatively”; b) to show that SBG has advantages as a form of summative assessment as well as formative.

So what does SBG get me?  For starters, it gives me more fine-grained control over how I weight specific knowledge in the final grade.  If I think that graphing sinusoids should be worth 1/3 of the mark for this unit, I don’t have to pad the assessment with extra questions until I can make the “points” total up to 1/3 of the test.  I can just set the weight on the standard scorecard, and then all I need to care about when writing the test is ensuring that I ask questions that completely demonstrate all of the required skills.
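
To make that weighting idea concrete, here’s a minimal sketch of how a per-standard scorecard with explicit weights rolls up into a unit grade.  The standard names, scores, and weights below are invented for illustration – this isn’t my actual scorecard, just the arithmetic behind it.

```python
# Minimal sketch of a weighted standards scorecard.
# Standard names, scores, and weights are invented for illustration.
# Each standard gets a score out of 4 and a weight; the unit grade is the
# weighted average of the score fractions.

scorecard = {
    # standard: (score out of 4, weight within the unit)
    "graphing sinusoids":     (3, 1/3),
    "solving trig equations": (2, 1/3),
    "modelling with trig":    (4, 1/3),
}

def unit_grade(scorecard):
    total_weight = sum(weight for _, weight in scorecard.values())
    weighted_sum = sum((score / 4) * weight for score, weight in scorecard.values())
    return 100 * weighted_sum / total_weight

print(f"Unit grade: {unit_grade(scorecard):.0f}%")  # -> Unit grade: 75%
```

If I decide graphing sinusoids should be worth a third of the unit, I change one weight instead of padding the test with extra questions.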

It also makes the grade handed back more meaningful to students.  If they see a 72%, they have to go unpack what they got wrong, and they may or may not isolate that down to the specific skills they misunderstood.  With the scorecard approach, they can see at a glance which material they didn’t get.

It also helps remedy the weird logistics of teaching a night-school class when it comes to test rewrites.  I don’t have office hours or a classroom that you can find me at during lunchtime.  We don’t have time for a full test rewrite during class time.  Breaking the content up into distinct standards means that I can have them re-demonstrate mastery of smaller parts during the 20-30 min we have available at the end of a typical class.

But perhaps the biggest difference it’s made for me is that I’m more confident that the scores I’m assigning to students are grounded in reality.  This is related to having better control over distribution of weights, but it’s more than that.  The SBG approach pushes me to write up a descriptive rubric for what level of ability I would rate as a 2/4, and what’s required beyond that to get a 4/4.  So if I look at that scorecard, I know that someone’s barely-passing grade is aligned with, on average, a barely-good-enough set of understandings and skills, and I can specify exactly why.  In short, I can justify that grade at a glance, to others and (more importantly) to myself.

WCYDWT: Steam user stats, brainstorming and trigonometric modeling

Part of my mandate as a teacher of Principles of Mathematics 12 (ie. roughly Precalc, for you Americans) includes teaching mathematical modeling.  My textbook is filled with little subsections that boldly proclaim, “MODELING!”.  It’s one of the mathematical processes that are supposed to be woven together across all of the curriculum content I’m covering.

I am wrapping up a unit on trig functions and it’s time to hook into the “MODELING REAL-WORLD SITUATIONS” content.  Now, I personally have a serious love-hate relationship with trig functions.  Being a graduate of a computer engineering program means I’ve seen them a LOT in my formal education.  Trig integrals have nearly killed me on multiple occasions.  On the other hand, playing with trig functions in an electronics lab is awesome, and being able to visually comprehend trig graphs is probably the only reason I managed to pass a course on Communication Systems.  So part of me really, really wants them to get this.

But there’s one big problem: when I look through the textbook for real-world examples, all I see is textbook perfection.  The opener they use is tide-level data from Nova Scotia – except that they’ve stripped the real data down to this:

[Image: completely faked tidal wave graph]

This is a complete and utter fabrication.  That sine wave is friggin’ perfect.

Here’s the reality.

[Image: actual tide-level data from a real source]

Messy peaks that don’t always line up.  There’s also some kind of weird alternating pattern hiding in them, which totally makes sense if you stop for a second and think about how far the Earth has turned in 12 hours.

You know what?  It’s not perfect, it’s reality.  And our model, based on a single sinusoid, is never ever going to match that reality perfectly.  And that’s just fine, but for some reason the textbook seems deathly afraid of letting students realize this.  The really ironic bit is that the real data is already incredibly close to the model, and yet they still couldn’t bring themselves to let students deal with even a tiny bit of messy reality.

So that’s what led me to this.  I’ll just start off by saying this image probably only scores a C on the WCYDWT rubric, but somehow this kind of worked anyway.

[Image: Steam users graph w/ spike in users]

(source: Steam Game and Player statistics)

Opening question was an obvious one: “Which part of the graph do you notice first?”

After students pointed out the weird downward spike in the middle, I moused over that part and talked a bit about the numbers and what this was graphing.  (The site this image comes from has that graph in a Flash applet that gives exact values when you mouseover the graph.)  The story went something like this:

This graph shows the number of users connected to the online PC gaming service Steam over the past 48 hours.  That downward spike probably represents about 500,000 really ticked-off customers who can’t get at their online games.

We can see they got people back online pretty quickly.  Which is good, because you don’t want to give 500,000 ticked-off customers time to start posting on forums on a Sunday afternoon.  YOU DON’T WANT TO ANGER THE INTERWEBS.

So imagine you’re working at Steam.  It’d be really nice to have some kind of system that alerts you automatically when something like this happens, because you don’t want to be at the office all weekend watching this.

So, ignoring the programming for now and just thinking about the math … how can we come up with a system that catches this?

Brainstorming session ensued!  Ideas – great ideas! – came from the room and hit the whiteboard.  We started off with four big ideas that were just point-form statements:

  • goes down too quickly
  • drops below a threshold
  • below the avg for that time of day
  • doesn’t fit the pattern

This made me so happy.  From there we turned some of these into something we could calculate; we talked about turning these ideas into “math”.  One thing I loved about this brainstorm is that the third item on the list was outside the scope of the class, but at least as good a solution as the one I was guiding them toward.

Then we unpacked the big one: “doesn’t fit the pattern”.  What pattern? Can we model it?  Cue discussion / lecture on trig graphing where I showed them how to construct a sinusoidal model, and afterward we checked how accurate it was.  (Not very, but probably enough to fit our task of catching a large drop in user connectivity.)
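
For anyone curious what that last step can look like off the whiteboard, here’s a rough sketch in code.  We did this by hand in class; the sinusoid parameters below are made up for illustration, not Steam’s real numbers.

```python
import math

# Rough sketch of "doesn't fit the pattern": compare observed users online
# against a hand-fitted sinusoidal model and flag readings that fall far
# below what the daily cycle predicts.  All parameters are illustrative.

MEAN      = 2_500_000   # average concurrent users over a day
AMPLITUDE = 1_000_000   # swing above/below that average
PERIOD    = 24.0        # hours in one daily cycle
PEAK_AT   = 20.0        # hour of day when users peak

def predicted_users(hour):
    """Sinusoidal model of how many users should be online at a given hour."""
    return MEAN + AMPLITUDE * math.cos(2 * math.pi * (hour - PEAK_AT) / PERIOD)

def looks_like_outage(hour, observed, tolerance=500_000):
    """Flag a reading that drops well below the modelled value."""
    return predicted_users(hour) - observed > tolerance

print(looks_like_outage(14.0, 1_200_000))  # far below the pattern -> True
print(looks_like_outage(14.0, 2_100_000))  # normal wobble         -> False
```

The model doesn’t have to be very accurate for this to work, which is exactly the point.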

My own evaluation?  This lesson isn’t a great WCYDWT – it required a lot of me talking, and I had to do some storytelling.  The question wasn’t short.  It got students participating who aren’t normally confident in their skills, but it didn’t get everyone involved.

But I had students asking, nearly begging me at the start of class to wrap up before 7pm to catch the Canucks game.  This lesson went straight through to 7:15 and they didn’t even notice until they were a couple minutes into individual work afterward.  Something must have gone right.

Level design, learning, and assessment

[Image: Remember that time you failed?]

Normally when my brain cross-links game design and education, I try to temper my enthusiasm by remembering that I like to relate nearly everything to games at some point, somehow, and not everyone else has this disease.  But I can’t let this one go.

I’m going to attempt to write about difficulty in game design then talk a bit about the Super Meat Boy design process, namely when it comes to how we approached dealing with difficulty. …

Dealing with difficulty is one of the key challenges I face every time I bring a math lesson into my classroom.  This kind of design analysis gets my attention.

I’ll skip over drawing comparisons to the history of platformer design and the history of mathematics education, but the parallels are there, at least in the caricatured form you hear when teachers gripe.  (“…back before the make-it-fun-and-easy crowd got a hold of the curriculum”, etc)

How could we make a seemingly aggravatingly difficult game into something fun that the player could get lost in?

This is what I can’t let go. This is the question I stare down when I start to question how I’m presenting that next lesson.  This is the question that makes me rethink what I’m doing when I’m writing the next big unit test.

Go, read the article if you haven’t yet.  Then come back.

It’s when I start to look at the solution to the design problem that I suspect Edmund McMillen has it easier than we do in the classroom.

To summarize, here are the key points:

  1. Keep it small.
  2. Keep the action constant.
  3. Reward success.
  4. Extend the challenge as people master the basics.

How many of these could be applied to the classroom to improve things?  Where does it break down?  (I’ve got some ideas but let’s get some discussion going in the comments first.)

Everything is research

From William Gibson’s blog:

Q: You write from 10AM til whenever. Is research a separate activity?
A: I don’t regard research as a separate activity. From anything. Everything is research. Relatively little great stuff turns up for me as a result of deliberately looking. Life is crowd-sourcing. In a good way.
Q: The reason I ask is that research tends to wander off into the weeds so easily, especially on the internets.
A: But they hide the good stuff *in the weeds*!

This is so many kinds of good.  Professional development shouldn’t be limited to seminars and workshops.

I’d rather drive standard

I can’t completely explain it – either it was channeled procrastination, or my top-down brain demanding more structure. Either way, I found I couldn’t get this first unit test written until I committed myself to a standards-based assessment strategy and wrote up a full rubric for that unit’s learning standards.

I’m glad I went through with it. I was kind of wussing out and just going to put together “the usual” – write tests, add up scores, call that a grade. Now I feel much more in control of how this is going to come together. I’m still unsure of whether I want to expose the system to the students; we’ll see. But at least when students come to me after bombing a test, I’ll have a framework for getting them to demonstrate future mastery of those skills. Even better, it means I can split apart the re-demonstrations into subparts, rather than having them rewrite an entire test at once (which, being a night school course, we don’t really have time for).

What was interesting about the write-up experience is that our provincial standards are *so close* to being directly usable for a solid standards-based (concept-based? whatever) system. They’re already written up in nice bite-sized chunks of understanding. The problem is, they’re totally imbalanced. One unit that takes about 1/7th of your class time over the year contains 1/4 of the total number of standards. Worse, in other grades there are individual “standards” that have subpoints that encompass an entire unit on their own.

Now that I type this up, I suppose I’m arguably making the same classic mistake – assuming that an accurate summative grade should come from simply adding up all of the individual standard marks. But with a nice, manageable number of concepts listed, keeping them weighted equally makes the entire system more accessible for students as well as for the teacher. Students can glance at their scorecard and know immediately how they’re doing and what to focus on mastering. (In theory.)
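
As a quick sketch of that last claim (the standard names and marks are invented for the example), “glance at the scorecard” with equally weighted standards amounts to something like this:

```python
# Illustrative equal-weight scorecard: every standard counts the same,
# and anything below the mastery bar gets flagged for reassessment.
# Standard names and marks are invented for the example.

MASTERY = 3  # minimum mark (out of 4) I'd call "got it"

marks = {
    "unit circle":            4,
    "graphing sinusoids":     2,
    "solving trig equations": 3,
    "modelling with trig":    1,
}

grade = 100 * sum(marks.values()) / (4 * len(marks))
needs_work = [standard for standard, mark in marks.items() if mark < MASTERY]

print(f"Current grade: {grade:.0f}%")      # -> Current grade: 62%
print("Focus on:", ", ".join(needs_work))  # -> Focus on: graphing sinusoids, modelling with trig
```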

After the break I’ll c&p the provincial standards, along with the reworking I came up with, including marking rubrics. I’d love to hear critique in the comments! (I pulled a sneaky trick with my third ‘concept’; I can’t decide yet if that was evil or not.)