How I Teach Dataviz to High School Students
I have been teaching a data science class for high school students in the summers since 2016, and have wanted to catalog some of the work I have been doing with them for a while, so here is a post in that vein. For context, the class is an enrichment class for talented rising seniors in the New Hampshire public schools. It is a fun, gradeless, intensive experience where they basically take only this one class for 5 weeks, and is super focused on data analysis for performance as opposed to for theory. We use R and Tableau, though really any platform would work! Here are some things about dataviz that I have used:
The Power of Dataviz: W.E.B. Dubois, Visualizing Black America
First, what’s the point of Data Visualization? Well, it is to tell the story of some data, probably without people having to read all that much. For inspiration, we look to W.E.B. Dubois and the 1900 Paris Exhibition, where he presented a series of graphs in “The Exhibit of American Negroes” that challenged narratives of the conditions of Black America at the time (h/t to my colleague Blake for suggesting it). But unlike in his essays, W.E.B. Dubois had to express these ideas in a form where people visiting an exhibition could understand important and nuanced ideas without reading a full essay! Enter… data visualization. Here is an article series about the exhibition and here is a video that gives some good context for the exhibition (from about 4:50-9:00).
But the main work we did with this came just from marveling at his graphs, which are both powerful and beautiful, as presented in this book below (you should be able to get a bunch online too):
With this book we looked at a bunch of graphs and used them to think about the following questions:
- What message was Dubois trying to express with his graphs? How does each fit in with the wider context and purpose of his exhibition?
- Some of these graphs are creatively presented – what messages are strengthened by the surprising visual choices?
- Which are your favorites? Which do you find the most powerful? Which are the most effective?
Data Visualization as Storytelling: The 7 Types of Data Stories
With some motivation for the power of data storytelling in hand, I next help students understand the different types of stories they could tell using visualization. I use a framework from Tableau catalogued here. I find it helpful because sometimes students don’t know where to go with a dataset, but if we focus their eye on storytelling and give them some concrete frameworks to work in, they find direction more often rather than just making sick looking graphs.
With these, then we ask… which types of stories was W.E.B. Dubois telling? Can we find an example of each? Then we look at some other examples… here was a short homework assignment.
- Pick one of the data visualizations below and explore it. They are beautiful data visualizations. Which types of the 7 different Data Stories do you see in here (could be multiple)?
- US Land Use [Bloomberg]
- Gender Pay Gap [The Guardian]
- Colorism in Vogue [Pudding]
- Office Dialogue in Five Charts [Pudding]
Recreating Graphs: Running Data
Next, it’s time to learn the actual tools needed to create graphs. My class uses Tableau, which I would highly recommend if dataviz is something you will do all year long. I have my running data from the past couple years (I log every run!) and give them a bunch of graphs to recreate. You could do this with any dataset and any platform, but it’s a good way to get them going with the technical skills.
As they are creating the graphs, I have them write down one piece of insight they gained from each graph as a way to keep focusing their eye on a graph only being useful if it says something interesting.
Telling a Simple Story: Which was the Greatest NBA Team?
To then get them arguing with each other using graphs as evidence, we do a quick exercise with the stats of the 1996 Bulls and 2017 Warriors. Which was the best team? Which had the best typical player? What graphs (and descriptive statistics at this point too) can you use to argue your case?
A Performance Assessment: Exploratory Data Analysis
Next, it’s time to release them to work on their own. I have them do an exploratory data analysis on a data set. With raw data, how can you use the storytelling framework to tell the story of the dataset? Put together a series of graphs and measurements that give insight about the data. Here is a rubric for a similar assignment that I did during the year in a graded class.
I have used a million different data sets for this (I get a lot of them from a newsletter called Data Is Plural). Some that I have liked in the past have been (these are all links to .csv files, sorry I don’t have them well documented with their sources!):
- Movie gross vs. rating, length etc
- School shootings since Columbine
- Gun laws across the states
- Driving deaths across the states vs. various laws
And lastly, I wanted to outline two creative projects that my classes have done in the past to really explore dataviz in depth. First, each year my class has created a wall-sized Data Visualization based on a dataset. We used a mobile projector to project the graphs/titles on the wall and then covered them in painter’s tape. These were much easier to do in the summer than during the year, but we still did a cool one during the year too! Here are two examples, one about school shootings, and one about Wordle.
And last, this is something I’ve used in the past and have loved – there are two data scientists who spent a year sending each other postcards of data visualizations based on a topic that they collected data on throughout the course of the week. They are beautiful!! I have students pick a topic, collect data on it about themselves for a few days, then creatively visualize it. Fun little exercise.
I wrote about that (and a fun data speed dating exercise) here, but check out their website for some amazing creative dataviz.
Phew!!! There are a million other things that pop up along the way, like chart junk and annotating a graph, and how pie charts suck, but these are the main pillars of my dataviz instruction.
Words Speak Louder than Comments: A Statistical Analysis of the Words a School Uses in Narrative Comments
In all honesty, writing end-of-term comments for students is just about my least favorite part of teaching. Hours and hours spent writing narrative comments that try to serve too many purposes – evaluative feedback for parents, information for college counselors, but also coaching feedback for students to try to improve. Last year, at a school I have since left, I was curious about what insight we could gain from the word choice faculty use in these comments. What do these words say about priorities as teachers? What advice do we give to students that are struggling versus students that are excelling? Does the math department talk about different measures of success than the English department? Do we use different words for boys and girls? What about for white students and students of color?
To study this, I got a hold of a full set of comments from one semester at my school (every teacher, every student in every class – 1,711 comments in total). I am thankful that my administration trusted the statistics teacher with this information! My goal was to study word choice blind of context – instead of worrying about how a word was used, I just looked at which words were used. Using a coding language called R, I stripped the comments of words like “the,” “and,” “been,” or “have” (called stop words) to try and look only at words that have real meaning. Below are the words with 400 instances or more:
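The stop-word step is easy to picture with a little code. My analysis was in R, so what follows is only a rough Python sketch of the same idea, not the code I actually ran. The stop-word list is deliberately tiny, the two sample comments are made up, and the function names are just for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real analysis would use a much longer one.
STOP_WORDS = {"the", "and", "been", "have", "a", "to", "of", "in",
              "is", "on", "has", "this", "her", "his"}

def count_meaningful_words(comments, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all comments."""
    counts = Counter()
    for comment in comments:
        # Lowercase the text and treat runs of letters/apostrophes as words.
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts

def frequent_words(counts, minimum):
    """Return (word, count) pairs with at least `minimum` uses, most common first."""
    return [(w, n) for w, n in counts.most_common() if n >= minimum]

# Made-up comments, only to show the shape of the computation.
sample_comments = [
    "Steve has been strong in discussions and on the exam.",
    "Strong work on the final exam this semester.",
]
counts = count_meaningful_words(sample_comments)
print(frequent_words(counts, minimum=2))
```

With the real comments, the threshold would be 400 instead of 2.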
The words almost read like a comment: “As we wrap up the class for this semester, next quarter, I’m looking forward to Steve’s understanding developing so that he can answer difficult questions on the final exam.” Some minor insights emerge from this list – “strong” and “excellent” show up as the most positive words, but no negative words appear, and discussions, papers, essays and written work seem to be the meat of what we talk about, though the exam is talked about frequently.
More interesting than this list of very comment-y words is what happens when we see what words are disproportionately used by certain groups or about certain groups. The inspiration for this analysis was taken from Ben Blatt’s book “Nabokov’s Favorite Word Is Mauve,” a fantastic statistical exploration of the world of literature. In it, he describes the idea of a “cinnamon word,” which is a word used much more frequently by an author than you would expect in normal literary writing. For example, as the title of the book suggests, the word “mauve” (a purplish color) is 44x more likely to show up in Nabokov’s writing than in English writing in general. It’s still not a common word in his books – it appears only 5 times in his most famous book, “Lolita” – and common words like “the” are far more prevalent in his book, but whereas his usage of “the” is similar to other authors, he uses “mauve” far more often than you would expect. So this word is characteristic of his writing. (The term “cinnamon word” comes from Ray Bradbury, who named “cinnamon” as one of his favorite words.)
What was my “cinnamon word” for my comments? Objectives. The word objectives was 1000x more likely to appear in the text of my comments than in comments as a whole. This made sense to me – my class, including the grading, was organized around Learning Objectives, which I referenced constantly in my comments. At our final faculty meeting last year, I gave each faculty member their top 5 cinnamon words, and left them to interpret. For a word to qualify, it needed to show up at least 10 times in their comments, and show up more frequently in theirs than other teachers’ comments. A history colleague’s was “scholarly,” a Latin teacher used “annotations” frequently, and an English teacher had “quotations” show up 60x more often in her comments than others, all things that made perfect sense! Others had odd words like “crisis” or “discovered” or “70” appear, which some could explain by an assignment that they were talking about, but others truly revealed an interesting piece of insight for each person. I must admit, it was quite fun for teachers to see some quirks of their own writing!
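For the curious, here is roughly what the cinnamon-word calculation looks like. This is a Python sketch rather than my original R, and it makes one simplifying assumption: it compares a teacher's usage rate to the rate in all comments as a whole (including their own), which keeps the denominator from ever being zero. The function name and the toy word lists are made up for illustration.

```python
from collections import Counter

def cinnamon_words(own_words, all_words, min_count=10, top_n=5):
    """Rank words by how much more often one writer uses them than the
    corpus as a whole. `all_words` must include `own_words`."""
    own = Counter(own_words)
    everyone = Counter(all_words)
    own_total = sum(own.values())
    all_total = sum(everyone.values())
    scored = []
    for word, n in own.items():
        if n < min_count:
            continue  # too rare in this writer's comments to qualify
        own_rate = n / own_total
        overall_rate = everyone[word] / all_total
        ratio = own_rate / overall_rate
        if ratio > 1:  # used more often by this writer than overall
            scored.append((word, ratio))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Toy data: one teacher's words, plus everyone else's.
mine = ["objectives"] * 12 + ["exam"] * 20
others = ["exam"] * 200 + ["essay"] * 68
result = cinnamon_words(mine, mine + others)
print(result)
```

In the toy data, "objectives" qualifies because only one teacher uses it, while "exam" drops out because everyone uses it at a similar rate.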
Any individual’s comments are prone to the quirks of their own writing or the specific assignments in their class that quarter, so I decided to zoom out and ask questions of word choice by groups of people. For example, what words are characteristic of various departments (math – reassessment, English – evolving, science – lab)? I found the most interesting analysis though to be what words are used ABOUT certain groups of students. For example, here are the words that are more likely to be used about boys, and the words more likely to be used in a girl’s comments:
There are some really interesting patterns that are easy to explain, and some that are not. Some words seem to show up because of curricular choices of the genders (the first couple words seem to point to computer science for boys, and art words like “conte,” a type of charcoal, and “figure,” for figure drawing, seem to show up for girls). But others provide more questions than answers. Why do we use the word “humor” more in a boy’s comment? Is this something we inherently value more about our boys than our girls? Also, why is “beautiful” used more in a girl’s comment? Interestingly, “beautiful” does not make the top 10 cinnamon words for the arts department, so its presence can’t be explained by girls taking the arts alone. Are we more comfortable describing the writing of a girl as beautiful? A beautiful solution to a math problem? Other words that jump out to me are “creativity” and “genuine” on the boys’ side, and “openness” and “persistence” on the girls. I’m not sure what they say, but they intrigue me nonetheless.
Another analysis I did investigated the words we use for the top 10% of students and the bottom 10% of students, as defined by their raw GPA:
I think the biggest difference between these two lists is that a lot of quality words show up on the left (elegant, exceptional, amazing, refined, stars), and a lot of process words show up for the struggling students (studentship, distracted, stress, strategies, missing). With much talk about the growth mindset in the education community lately, I wondered if we had absorbed the message for our struggling students but not our stars. It seems like we were good at talking about specific behaviors and parts of the process for the struggling students, but perhaps aren’t talking as much about the work that our exceptional students put into producing assignments that are elegant and refined (something that the mindset people say can be just as damaging). Also, it tickles me a bit that “violin” shows up for the successful students – I have to imagine that our population that plays the violin happens to be a set of students that skews stronger.
Presented for you to make your own conclusions are the words that show up in the comments of students from different racial groups:
No data is perfect, and an analysis from a different year’s comments would turn out differently. And, I would still characterize myself as a budding statistician and coder, so there could be some fatal flaws in this analysis. But this was interesting nonetheless, and I think a great learning experience for our faculty, the administration and me. I am thankful that I was at a school that was willing to be vulnerable and discuss these issues (and willing to let me share my work here)!
Similar Triangles and a Self-Checking Physical Challenge: Mirror, Mirror on the Floor
Yesterday in Geo, I took some advice from Dan Meyer “You Don’t Have To Be The Answer Key” and set up a fun self-checking activity based on this blog post, Eye to Eye. The premise: place a sticky note on the wall, and then place a tiny mirror on the floor between you and the wall so that you can glance into the mirror and see the sticky note. The catch is that you can’t just stand and move around until you can see it, you need to place yourself, open your eyes and look, and see if you see it! The mirrors I found in the physics department were probably 3 inches in diameter, which was perfect for a little bit of precision, but enough wiggle room that this worked. It was fun because students would be really excited that it worked! And if it didn’t, they would just go back and check their calculations without needing direction from me.
Here were the three situations (I gave them the text and then they needed to show me their diagram kind of like the ones I drew below, and calculations before they were allowed to try it physically):
1. The sticky note is 7 feet up on the wall, and the mirror is 3 feet from the wall. Where should you place yourself so you can see the sticky note in the mirror?
2. Now place yourself 5 feet from the wall, and place the mirror 4 feet from you. Where should you place the sticky note so that you can see it in the mirror?
3. Now place yourself 5 feet from the wall, and place the sticky note 3 feet up. Where should you place the mirror so you can see the sticky note in the mirror?
These got more difficult as they went along, and kids did a great job with the last one, solving it in a ton of different ways (most using some sort of x and 60-x on the bottom). My favorite was a boy who measured his eyes to be 63 inches off the ground.
“Well, I’m 63 inches tall, so the ratio of my height to the sticky note is 63:36, which simplifies to 21:12, but 21+12 = 33, so if I break the 60 inches on the floor into 33 pieces and then multiply that by 12, that’s how far I should place the mirror from the wall.”
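If you want to check answers before students try the physical test, here is a small Python sketch of the similar-triangles relationship behind all three situations. Because the mirror reflects at equal angles, eye height divided by your distance to the mirror equals note height divided by the mirror's distance to the wall. The function names are mine, everything is in inches, and the 63-inch eye height is just the one from the student above.

```python
def mirror_distance_from_wall(eye_height, note_height, person_to_wall):
    """Situation 3: split the floor distance in the ratio note_height : eye_height."""
    return person_to_wall * note_height / (eye_height + note_height)

def person_distance_from_mirror(eye_height, note_height, mirror_to_wall):
    """Situation 1: how far from the mirror to stand."""
    return mirror_to_wall * eye_height / note_height

def note_height_on_wall(eye_height, person_to_mirror, mirror_to_wall):
    """Situation 2: how high up the wall to place the sticky note."""
    return eye_height * mirror_to_wall / person_to_mirror

EYE = 63  # inches, taken from the student's measurement

# Situation 1: note 7 ft (84 in) up, mirror 3 ft (36 in) from the wall.
print(person_distance_from_mirror(EYE, 84, 36))

# Situation 2: person 5 ft from the wall, mirror 4 ft (48 in) from the person,
# so the mirror is 12 in from the wall.
print(note_height_on_wall(EYE, 48, 12))

# Situation 3: person 5 ft (60 in) from the wall, note 3 ft (36 in) up.
print(mirror_distance_from_wall(EYE, 36, 60))
```

The last call agrees with the student's method: 60 inches split into 33 pieces, times 12, is about 21.8 inches from the wall.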
Lesson Outline: Origami Construction of Octagon
I did this kinda fun hands-on intro to the angles in regular polygons earlier this year and I wanted to share. It was inspired/thieveried by an Illustrative Mathematics lesson that I can’t find on the internet now (I think their curriculum is about to come out, which I’m excited about, but maybe some of their stuff online is going away) and this blog post from Jennifer Wilson, so nothing new, but I figured I’d amplify and give my own thoughts.
- Fold the Octagon:
- Take a square piece of paper, and halve it by folding one edge to the other edge across the way, unfold, and do it the other way too. Then halve it along the two diagonals too.
- Then fold all the halfway lines between those lines. Easiest way to do this is fold along the current lines until you get one of those eighth triangles, and then fold the folded edges to each other (see picture).
- Unfold. Then find the points indicated in the diagram and make a fold on the line between those two points (my diagram is a bad construction of this, they shouldn’t be exactly quartering the top).
- Any conjectures about the shape? We talked through some of these and talked through why they might be true. Good reasoning involved the fact that when you fold something onto something else, it makes it congruent.
- It’s an octagon.
- It’s a REGULAR octagon.
- There are four kites in the figure (can you see them?)
- There are 8 isosceles triangles.
- The last fold we did made a 90 degree angle with the other fold.
- Now physically label the measure of all of the angles. Like, with a pencil and not your brain. Justify your thoughts.
- This was interesting because there seemed to be two ways to go:
- Some started with the corners. Since that angle is folded in half, those are two 45 degree angles. And the last fold makes two 90 degree angles because the angles are folded on top of each other (congruent) but also along a line (supplementary) so have to be 90. Then the other angle in those corner triangles is 45 degrees…. and go from there.
- Some started in the middle. Since we folded all 16 of those central angles on top of each other, they have to be congruent, and since they add up to 360, they need to be 22.5 degrees. Then lots of places to go from there either with isosceles triangles or right triangles.
- After kids spent 10 minutes or so labeling on their own or with a partner, we took turns sharing some ideas. Some kids talked through their thinking and we all gave critiques. It was a good switch between individual-full class modality too.
- Now, draw a STAR where there is an exterior angle and a HEART where there is an interior angle.
- We had just learned what these were and how to calculate them, so I wanted to see if they could find one on this complicated diagram. This was harder than I thought it would be, which meant it was a good use of time!
- Do the measures of these angles match with the equations we figured out to calculate them? (Yes! 45=360/8 and 135=180-45=180(8-2)/8).
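Here is a quick Python check of those angle formulas, in case it is useful for other polygons. It is just the two equations above wrapped in functions (the names are mine), plus the 22.5 degree central angle that comes from folding the center into 16 congruent pieces.

```python
def exterior_angle(n):
    """Exterior angle of a regular n-gon, in degrees."""
    return 360 / n

def interior_angle(n):
    """Interior angle of a regular n-gon, in degrees."""
    return 180 * (n - 2) / n

n = 8
print(exterior_angle(n), interior_angle(n))

# Interior and exterior angles at a vertex are supplementary.
print(interior_angle(n) + exterior_angle(n))

# The folds cut the full turn at the center into 16 congruent angles.
print(360 / 16)
```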
This took maybe 40 minutes (I can’t tell from my lesson plan if that’s right), but was great! It was physical and exploratory and was full of fun geometrical arguments that are based in transformations, but also had a nice, concrete, angle-labeling component for kids who prefer numerical lessons. This gets my 😀 of approval!!!
Coding in Geo: Snap! Regular Polygon Art
One of our department’s curriculum redesign goals is to incorporate a bit of coding into our curriculum, and the place they decided to place that was Geometry. We have been coding in Snap!, a block based coding platform really similar to Scratch. Block based means that students aren’t typing commands, but rather dragging and dropping them into lists to make programs. The advantage: no syntax errors, or spelling errors that are the bane of every beginning coder (wHy WoN’t It RuN?!? Well, because you have “Power” written here and “power” written here and the computer doesn’t know that you think those are the same thing). The disadvantage: it’s a bit clunky, in particular the saving and sharing system.
After an initial day where the kids explored by trying to get the program to write out an English letter, we then had them work to code in a regular polygon, something that would teach them both about loops and variables, and practice calculations of interior and exterior angles etc. Here is the packet of instructions we used, with much inspiration/petty theft from Dan Anderson (@dandersod, his conference materials).
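Snap! programs are blocks rather than text, so I can't paste one here, but the logic of the polygon program can be sketched in Python. This is my own stand-in, not the packet's code: it imitates a sprite that moves forward one side and then turns by the exterior angle, repeated once per side, and records the corners it visits.

```python
import math

def regular_polygon_path(sides, side_length):
    """Walk a regular polygon the way a drawing sprite would:
    move forward, turn by the exterior angle, repeat."""
    x, y, heading = 0.0, 0.0, 0.0   # start at the origin, facing right
    turn = 360 / sides              # exterior angle in degrees
    points = [(x, y)]
    for _ in range(sides):          # the loop: one pass per side
        x += side_length * math.cos(math.radians(heading))
        y += side_length * math.sin(math.radians(heading))
        points.append((x, y))
        heading += turn             # turn before drawing the next side
    return points

path = regular_polygon_path(8, 50)
end_x, end_y = path[-1]
print(len(path) - 1, "sides drawn; back at start:",
      abs(end_x) < 1e-9 and abs(end_y) < 1e-9)
```

The two variables, number of sides and side length, are exactly what students end up playing with in their art.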
Then, the instructions I gave them were to make a beautiful piece of art that shows off their understanding of regular polygons, coding loops and variables. Their code had to run in one click. The results were SUPER cool, and the kids loved it! Here are some below:
Sorry they are so small, but there are so many cool ones, this isn’t even all of them! Can’t wait to hang these up in the classroom.
Along the way, without me showing them or them really needing to, kids figured out how to: incorporate sounds, incorporate input from the user, use randomness, and one kid figured out his own version of the sine function. I also had them write a written description of how their code works and what their artistic inspiration was, and they were adorable. I could tell how proud some kids were of their work! <3.
Paper Folding Video Explanations in Geometry
I love giving students genuinely different ways to show their understanding. In Geometry this year, I have been having students record screencasts to explain paper folding phenomena. Basically, I walk them through a paper folding exercise (details below on the two I have done so far) that has a surprising or interesting result. Then we talk about as a class why it’s happening – they try to figure it out together, and I help them figure it out through a full class discussion. Then, they go home and record a video of them explaining the idea, showing me physically on the paper what is happening and why. I give them feedback and they record again! I have found it a great way to engage them in geometrical argument without the annoying technicalities of written proofs.
(Here, a student is using the physicality of the paper to show why when you fold a point onto another point, all the points on the fold are equidistant from the two points)
For video collection, I use Flipgrid which makes things SO EASY. They all go in one place and no one has to worry about saving or uploading files. I limit them to 2 or 3 minutes so that they have to be efficient and I can view them easily.
PAPER FOLDING CONJECTURE 1:
1. Fold up one corner of the paper in any direction so long as the crease goes between two adjacent sides.
2. Then, fold an adjacent corner up so that it meets the side of the fold already there.
Any conjectures? Students will come up with lots of things, but the fun ones to argue are: Why is that bottom angle a right angle? Why are the two triangles that you made from the folds similar?
PAPER FOLDING CONJECTURE 2:
(from an Illustrative Mathematics Task that I CAN’T FIND right now, halp!)
1. Draw two points on a piece of paper. Fold the paper so that one point lands on top of the other.
Any conjectures? We had been talking about perpendicular bisectors, so most students immediately saw that this was a perpendicular bisector. Can you argue that all the points on this line are equidistant from the original two points?
2. Now draw a third point.
3. Fold the other two combos of points onto each other (so if the first fold was from A to B, then fold B to C and A to C).
4. Locate the point that they all meet.
Wait why do they all meet at one point?
5. Now draw a circle with the center at that point and a radius that reaches one of the original points.
My circle goes through all 3 points! Why did that happen?
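If you want to convince yourself numerically before folding, here is a Python sketch. It finds the point that is equidistant from three points, which is where the three fold lines (the perpendicular bisectors) meet, using the standard coordinate formula for a circumcenter. The function name and the sample points are mine.

```python
import math

def circumcenter(a, b, c):
    """Point equidistant from a, b and c: the meeting point of the
    perpendicular bisectors of the three segments."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("the three points are on one line, so no circle exists")
    a2, b2, c2 = ax**2 + ay**2, bx**2 + by**2, cx**2 + cy**2
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy

A, B, C = (0, 0), (4, 0), (0, 3)
center = circumcenter(A, B, C)
radii = [math.dist(center, p) for p in (A, B, C)]
print(center, radii)
```

All three distances come out the same, which is the whole argument: a point on the A-B fold is equidistant from A and B, a point on the B-C fold is equidistant from B and C, so the point on both folds is the same distance from all three.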
Homework Response Randomization
In my precal (pre-cal, pre-calc, precalculus, Precalc, p-Rec-aLk) class, I have multi-day homework assignments that I collect infrequently, a structure that works great for older kids who can plan their own time out well. But I was struggling to figure out how to deal with homework in my geometry class, as I think freshmen need the daily *umph* to keep them going. I wanted a structure that allows for:
- Accountability to work hard on it for both completion and understanding
- Feedback on their work
- A workload that I can handle
So I adapted a model from colleagues at my last school who would roll a die to see what happens. The system incorporates a little bit of randomness and has been kind of fun. Students, as a class, pick a number from 1-6 and behind that black box is the option for what is going to happen for that day:
If 3 is picked day 1, then that is used up, and we pick the other numbers on successive days until we get through the cycle. Then I shuffle the options behind them and we start the cycle anew. My options right now are:
- Homework Quiz (no notes)
- Homework Quiz (notes)
- Sight Check (x2)
- No Check – everyone gets full credit
I give them 10 minutes in the beginning of class to check homework, and the homework quizzes are literally just a problem directly from the homework, so the idea is, if they worked hard on the homework and fixed any small issues they had with them in the first 10 minutes, they should have no problem on the quizzes (spoiler alert: the kids who don’t do homework well have a big problem here, but at least they are realizing it?). The sight checks are just for completion, and the collected homework is graded on completion PLUS the corrections that they did in the first 10 minutes of class – I’m trying to encourage them to use that time really well and then give them feedback on how to do that…
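I do the boxes by hand, but the cycle is simple enough to sketch in Python if you would rather have a computer do it. This is only an illustration with made-up function names. Note one assumption: I listed five options above for six boxes, so the sketch fills the sixth with "Collect Homework", since collecting is one of the things that can happen.

```python
import random

OPTIONS = [
    "Homework Quiz (no notes)",
    "Homework Quiz (notes)",
    "Sight Check",
    "Sight Check",        # listed as x2
    "No Check",
    "Collect Homework",   # assumption: the sixth box
]

def new_cycle(options, rng=random):
    """Shuffle the options; position i holds what is behind box i + 1."""
    hidden = list(options)
    rng.shuffle(hidden)
    return hidden

def run_cycle(options, picks, rng=random):
    """Reveal the option behind each box in the order the class picks them.
    Each box number must be picked exactly once per cycle."""
    if sorted(picks) != list(range(1, len(options) + 1)):
        raise ValueError("each box must be picked exactly once per cycle")
    hidden = new_cycle(options, rng)
    return [hidden[p - 1] for p in picks]

# One cycle: the class picks box 3 on day 1, box 1 on day 2, and so on.
schedule = run_cycle(OPTIONS, [3, 1, 6, 2, 5, 4], random.Random(0))
for day, option in enumerate(schedule, start=1):
    print("Day", day, "->", option)
```

Unlike rolling a die every day, every option comes up exactly once (or twice for the sight check) per cycle.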
Things that I have liked about this:
- It has been fun! The reveal every day is actually hilarious, though I have gotten in trouble with the teachers next door for noise a few times :).
- It has reduced my workload without really reducing what they get out of homework. Sometimes, I tip the scale by hiding what I want to happen behind all the boxes (i.e. if it’s a good day for me to collect).
- The homework quizzes are good for both me and them for REAL feedback opportunities (no one looks at what you write on their homework…) and as a low-stakes way for them to assess their own knowledge
- I also thought the homework quizzes would take forever – they take about 7 minutes or so, but it’s 7 minutes where they are rehashing their thinking about an important problem, so I’ve found it useful and a good tradeoff for instructional time.
- Going through the cycle as opposed to rolling evens out the workload a lot and ensures that there are quizzes at regular intervals.
Assessments: OPEN YOUR MIND!
One of the things that I changed most about my teaching approach at my previous school (I’m joining a new one this fall) is an expansion of my approach to summative assessment. Our curriculum there was a collaborative problem solving curriculum based on the Exeter materials. Students would work together in class discussing and debating about ideas, and exploring the problems presented to them, so by giving traditional, sit-down, silent, period-long individual tests, we found that we weren’t really assessing them in the way we were teaching. Due to a scheduling snafu one winter, we were left without a midterm during our exam period (lollll), but we were all like… oh well, who cares? Let’s do creative things! We all tried oral exams that winter in class, and then our minds were open from there…
Here are some of the different things that I have in my toolbox now:
The Oral Assessment
Good for: Areas of math with layered conceptual ideas underpinning them
How it could work: This was a staple for our math department. We would assign a set of problems (maybe 6-8 meaty problems) and students would have the couple of days beforehand to complete them (either with or without each other, I preferred with!). They could use whatever materials they wanted, but knew that they not only needed to solve a problem, but needed to understand it deeply, so there in essence was no way to cheat (just having the correct work on the paper didn’t really help!). Then students would come in one at a time for a 5-10 minute oral where we would roll a die and randomly pick a couple of the problems to talk through. I preferred students just to show me work they had already completed to save time, but some colleagues had students re-do the problems for them on the board. Then I would ask questions like, “Hmm, how do you know that?” and “Does that work all the time?” or “Was that the only way to do that?” For each problem we discussed, I would give them a grade on the accuracy of their work and on their mathematical discourse with me, and then a completion/accuracy grade for the rest of the assessment that we didn’t get to talk about.
I loved this because it’s pretty immediately clear who knows what they’re talking about and who doesn’t. How often do we try to assess this on written tests and get stuff like this that seem to go viral all the time (usually attacking Common Core, which is not why I’m sharing this):
Instead, you can ask specific questions, and just rephrase a bit until students understand what you are asking. I also loved when a student would have incorrect work, and I would ask a question, and they would figure out a mistake and correct their work – learning happens DURING the assessment!
If you’re wondering, “when do you have time to do something like this?” I would often do it DURING a normal written test, or give them problems to work on and try to power through my whole class in one period. I trust my students, and I know that is a luxury, but that works for me.
The Group Assessment
Good for: When you want to assess something that takes slow thinking, or is too complicated for a written test, or when you just want to get your students sharing knowledge and ideas with each other
How it could work: There are some things that students just need to know how to do individually (how to factor, how to take a derivative etc), but problem solving skills are amplified by others, and *real* mathematicians work together. I would have the students work on a problem together on a big whiteboard and then whenever they felt ready, they would erase their work, and then silently and individually solve the problem on their own, and this is what their grade would be based on. Some colleagues just had them do the problem together and grade that product, and my way takes longer, but I preferred not having students’ work determine each other’s grades. This was often great evidence to help people see that their way of learning and collaborating wasn’t working (if their group mates got a problem that they totally didn’t).
A fun modification is to give them the problem with the values blacked out, like this example:
They can then discuss HOW they would go about the problem, focusing on ideas, instead of specific numbers. And then, when they felt like they were done with the group, I would give them the problem with the numbers included to complete individually.
I usually would combine this with a standard test (maybe the first question) and found that it really cut the tension in the room around an assessment (it was exciting!).
Good for: Presenting challenge problems, assessing understanding with homework, doing oral tests without taking up class-time, assessing understanding with coding
How it could work: This is an idea taken from Andy Rundquist (@arundquist) and many other science teachers that do this – students would take a picture of their work, upload it and record a brief 2-3 minute explanation of their work. They talked through the WHAT of their work, but also the WHY. Similar to the oral assessment, it was always super easy to tell who really understood what was going on and who was faking it. I would watch the videos and give them feedback, sometimes even requesting another video or a written response to my feedback. I especially loved this as a way to change up homework, and as a way to assess students’ understanding of the really tricky problems that we went over in class, or solved collaboratively. It was also great for assessing code because, again, it didn’t matter if everyone had the same code, I was assessing their understanding of it.
For this to work, you really need an LMS, and you need to insist that students submit in whatever way makes it easy for you to grade; otherwise it’s a technological hassle. I found students figured out the easiest way to do this, but would suggest Screencast-O-Matic if they needed a suggestion.
The Toolbox Assessment
Good for: Reviewing many learning objectives, forcing students to find their own examples of things and do the art of “problem-finding”
How it could work: When I have taught Statistics, I have used coding, which makes a lot of things quicker, and generally “got through” most of the material with 3-4 weeks to go at the end of the year. I found that students struggled with APPLYING the ideas though, so I came up with the toolbox assessment. I had a list of standards and skills that covered the whole year (“Running a t-Test”, “Using a Boxplot” etc.), and they needed to design small, quick statistical studies that showed proficiency on these ideas. There were 13 of these objectives, but they could check off multiple with a single study. They would complete a study and then “present” to me and have a conversation to show me their work and understanding. I told them not to worry too much about what their product looked like (i.e. no need for tri-fold poster presentations) and would sometimes send them back to fix or re-do something. Their grade at the end was about how many of the objectives they checked off. I loved this because it put the onus on them to really figure out where something applied, instead of just regurgitating problems I made up. Though it’s not as applicable to other math classes, I could totally see this working as a review activity before a final exam.
Can Writing Styles Be Boiled Down to Statistics?
Twas the week before Christmas Break, when all through the school, students were complaining, and trying to do new content would make me a fool… so we did something kind of interesting in my statistics class that was a cool application and reviews a lot of great stuff with inference testing. It was inspired by this book, Nabokov’s Favorite Word Is Mauve by Ben Blatt, a super cool statistics-based analysis of literature, and by this post on the Stats Medic, “Does Beyonce Write Her Own Lyrics?” The basic questions are “How can we use basic statistics to examine and tell apart writing styles? What do statistics about your own writing say about your style?”
STEP 1: WHICH OF THESE AUTHORS ARE THE SAME?
To start, I gave them a page (available here, spoilers down below) from three different books (page 154 from each book, thanks Siri for the random number!). I told them two of these were written by the same author and one was written by a different author. How could we tell who wrote what? I told them the story of Hamilton, Madison and the disputed Federalist Papers to whet their appetite as a “real-world” example of this. To be honest, they didn’t care about that, but they were VERY intrigued just by the puzzle of figuring out which authors were the same.
And the statistics began, but not from a canned dataset that they ran pre-prescribed tests on – in fact, I was scrambling that week and hadn’t tried anything myself. I had no idea this was going to work! What things could we measure about the text to tell the difference between them? Some suggestions were too difficult to measure (i.e. tone), some had nothing to do with the writer (i.e. how many lines there were on the page), but others seemed easy to measure and perhaps distinctive of a writer (frequency of commas, length of words etc.). The students were skeptical that those things could distinguish authors, but we went after it anyway! We spent about a half-hour counting various things about the text, collected them on a document and then highlighted which two of the three were roughly more alike on each measure:
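We counted everything by hand, but the same kinds of measures are easy to sketch in code. The snippet below is a rough Python illustration (the class itself used R); the function name `style_features`, the particular measures, and the sample sentence are all my own placeholders, not the list we actually built in class.

```python
import re

def style_features(text):
    """Return a few easy-to-count style measures for a passage of text."""
    words = re.findall(r"[A-Za-z']+", text)
    # Split on sentence-ending punctuation; crude, but close to counting by hand.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words) or 1          # avoid dividing by zero on empty text
    n_sentences = len(sentences) or 1
    return {
        "words": len(words),
        "sentences": len(sentences),
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / n_sentences,
        "commas_per_100_words": 100 * text.count(",") / n_words,
        "ly_words": sum(1 for w in words if w.lower().endswith("ly")),
    }

# Placeholder passage, not taken from any of the three books.
sample = "She walked slowly, quietly, and carefully. Then she stopped. Why?"
print(style_features(sample))
```

Running this on each of the three pages would give one row per page, which is roughly the table we ended up highlighting.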
Of the 16 things we measured, 8 were the same between writers G and U (and 2 others were pretty much the same between all 3). Here come some interesting statistical questions… Why might one random page be off (one sample could be skewed for no reason other than randomness)? What’s the advantage and disadvantage of measuring a bunch of things (more things = more opportunity for random associations, but also more opportunity to see a pattern)? Which of these differences are “significant”?
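To put a rough number on the "more things = more random associations" worry, here is a tiny Python calculation. It assumes each measure is tested on its own at a 5% significance level and that the measures are independent of each other; neither assumption is exactly true for text features, so treat it as a ballpark only.

```python
# Assumption: 16 independent tests, each at a 5% significance level.
alpha = 0.05
measures = 16  # the number of things the class counted

# Chance that at least one measure looks "significant" purely by luck.
p_any_false_alarm = 1 - (1 - alpha) ** measures
print(round(p_any_false_alarm, 2))  # 0.56
```

So under those assumptions there is better than a coin-flip chance that at least one difference would look significant even if two pages came from the same writer.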
We then spent about a class on that last question. Given that we know a chi-squared test and a t-test, how could you use those on these things we measured? We did this in R, and I can give some details about that for anyone interested, but the interesting part here is getting kids to imagine how you format data so that you could use a statistical test. What do you stick in about the sentence length in a t-test? How could -ly adverbs be a chi-squared test? Is either even appropriate here? (Meh, mostly… )
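One way to picture the data formatting: sentence length becomes a list of numbers per page (one number per sentence) for a two-sample t-test, and -ly adverbs become a 2x2 table of counts (-ly words vs. everything else, for each page) for a chi-squared test. The sketch below is in Python and computes only the test statistics by hand; in R the corresponding calls would be `t.test` and `chisq.test`. The function names are mine, and every number is made up for illustration rather than taken from our class data.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two lists of numbers (e.g. words per sentence)."""
    se = sqrt(variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

def chi_squared_2x2(hits_a, total_a, hits_b, total_b):
    """Chi-squared statistic for a 2x2 table of counts (no continuity correction)."""
    observed = [[hits_a, total_a - hits_a], [hits_b, total_b - hits_b]]
    grand = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [hits_a + hits_b, grand - hits_a - hits_b]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat

# Made-up numbers for illustration only, not the class's actual counts.
page_g_sentence_lengths = [12, 18, 9, 22, 15, 11]
page_s_sentence_lengths = [7, 10, 8, 13, 6, 9]
print(welch_t(page_g_sentence_lengths, page_s_sentence_lengths))

# Hypothetical: 9 of 300 words on one page end in -ly, 21 of 320 on another.
print(chi_squared_2x2(9, 300, 21, 320))
```

Note that `chi_squared_2x2` will divide by zero if neither page has any -ly words (or if every word is one), and with a single page per book the samples are small, which is part of why the honest answer to "is this appropriate?" is "meh, mostly."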
Page G is from The Casual Vacancy by J.K. Rowling.
Page U is from The Cuckoo’s Calling by Robert Galbraith.
Page S is from Big Little Lies by Liane Moriarty.
Wait… what? Those are three different authors. NOT SO FAST! Robert Galbraith is actually a pseudonym for… J.K. Rowling! (I wish I had played that up a bit more.) So our statistics worked in a way – there were more similarities between G and U than the other combinations. So even when J.K. Rowling was writing under a pseudonym, her writing style was similar. Cool!!!!!!
STEP 2: WHAT DOES YOUR WRITING STYLE LOOK LIKE?
Now, I wanted them to do something similar with their own writing. They had just written a joint paper with a partner, and I wanted them to see if their joint paper more closely resembled their own writing or their partner’s HAHAHAHAHAHA. They were hilariously sheepish about this idea, which told me immediately who had done what 🙂 (but it was all in good fun).
Enter a new tool, Count Wordsworth, an online tool that automatically measures a WHOLE BUNCH of statistics about any text that you paste in there (at which point they got mad at me because they had done so much by hand for the pages of the books, but they’re always mad at me for stuff like that). For example, here is just part of the output when I put in my teaching philosophy from my teaching portfolio:
I had them all put in a recent English paper and then find the THREE biggest differences between their paper and their partner’s. Again, a bunch of fun data questions – do the quotes in the paper mess things up? How about the number of words? What about the topic (English vs. a lab report)?
Then, once they had discovered the three biggest differences, I had them put in their joint paper and try to figure out whose writing style it more closely resembled. This class was a blast, and once they finished this, they were so curious that they just kept exploring… Some kids put in their freshman year papers, some put in the headmaster’s emails etc. Lots of fun curiosity!
STEP 3: HOW DOES A PROFESSIONAL STATISTICIAN DO THE SAME SORT OF ANALYSIS?
Lastly, we read a short 10-page segment of the book I mentioned in the beginning, Ben Blatt’s Nabokov’s Favorite Word Is Mauve, specifically a chapter called “Searching for Fingerprints.” It was fun to see what a professional statistician does, and we talked about how the computing power we have nowadays makes it possible for him to measure some of the things that he did.
Good stuff! Happy Holidays everyone!
Rock, Paper, Triggers
I played a quick but fun and mathematically rich game in precalculus the other day that I thought I’d share. Let’s call it Rock, Paper, Triggers for now (it’s kinda like Rock, Paper, Scissors but with Trig functions), but if you have a better name, let me know.
Each person secretly picks a trig function (SINE, COSINE or TANGENT) for themselves, and an angle to send to the other person. Then, once ready, both reveal and each person thinks about…
Whoever’s value is higher wins. No need for exact values, just figure out which one is bigger (and DNE automatically loses). So for example:
Person 1 has sin(190°) and person 2 has cos(269°). Well, both are negative, but 269° is so close to 270° that cos(269°) is a little less negative. So person 2 wins!
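The whole point in class is to reason this out without a calculator, but if you want an answer key for settling disputes, here is a small Python sketch. The function names are mine, and the tie handling (equal values, or both players getting DNE) is my own assumption, since the rules above only say that DNE loses.

```python
import math

def trig_value(function, degrees):
    """Return sin, cos, or tan of an angle in degrees, or None when it is DNE."""
    radians = math.radians(degrees)
    if function == "sin":
        return math.sin(radians)
    if function == "cos":
        return math.cos(radians)
    if function == "tan":
        # Tangent does not exist at odd multiples of 90 degrees.
        if degrees % 180 == 90:
            return None
        return math.tan(radians)
    raise ValueError("function must be 'sin', 'cos', or 'tan'")

def winner(play_1, play_2):
    """Each play is (own function, angle received). Returns 1, 2, or 0 for a tie."""
    v1, v2 = trig_value(*play_1), trig_value(*play_2)
    if v1 is None and v2 is None:
        return 0  # assumption: two DNEs is a tie
    if v1 is None:
        return 2
    if v2 is None:
        return 1
    if math.isclose(v1, v2, abs_tol=1e-12):
        return 0  # assumption: equal values is a tie
    return 1 if v1 > v2 else 2

print(winner(("sin", 190), ("cos", 269)))  # 2
```

This agrees with the reasoning in the example: sin(190°) is about -0.17, while cos(269°) is only about -0.02.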
This was really good for number sense (no calculators), for thinking about what values of the different functions are possible, and where those values are on the unit circle.