Daily Archives: September 4, 2022
I have been teaching a data science class for high school students in the summers since 2016, and for a while I have wanted to catalog some of the work I have been doing with them, so here is a post in that vein. For context, the class is an enrichment class for talented rising seniors in the New Hampshire public schools. It is a fun, gradeless, intensive experience where they basically take only this one class for 5 weeks, and it is super focused on data analysis for performance as opposed to for theory. We use R and Tableau, though really any platform would work! Here are some things about dataviz that I have used:
The Power of Dataviz: W.E.B. Du Bois, Visualizing Black America
First, what’s the point of Data Visualization? Well, it is to tell the story of some data, probably without people having to read all that much. For inspiration, we look to W.E.B. Du Bois and the 1900 Paris Exhibition, where he presented a series of graphs in “The Exhibit of American Negroes” that challenged narratives of the conditions of Black America at the time (h/t to my colleague Blake for suggesting it). Unlike in his essays, Du Bois had to express important and nuanced ideas in a form that people visiting an exhibition could understand without reading a full essay! Enter… data visualization. Here is an article series about the exhibition and here is a video that gives some good context for the exhibition (from about 4:50-9:00).
But the main work we did with this came just from marveling at his graphs, which are both powerful and beautiful, as presented in this book below (you should be able to get a bunch online too):
With this book we looked at a bunch of graphs and used them to think about the following questions:
- What message was Du Bois trying to express with his graphs? How does each fit in with the wider context and purpose of his exhibition?
- Some of these graphs are creatively presented – what messages are strengthened by the surprising visual choices?
- Which are your favorites? Which do you find the most powerful? Which are the most effective?
Data Visualization as Storytelling: The 7 Types of Data Stories
With some motivation for the power of data storytelling in hand, I next help students understand the different types of stories they could tell using visualization. I use a framework from Tableau catalogued here. I find it helpful because sometimes students don’t know where to go with a dataset, but if we focus their eye on storytelling and give them some concrete frameworks to work in, they find direction more often, rather than just making sick-looking graphs.
With these types in mind, we then ask… which types of stories was W.E.B. Du Bois telling? Can we find an example of each? Then we look at some other examples… here was a short homework assignment.
- Pick one of the data visualizations below and explore it. They are beautiful data visualizations. Which types of the 7 different Data Stories do you see in here (could be multiple)?
Recreating Graphs: Running Data
Next, it’s time to learn the actual tools needed to create graphs. My class uses Tableau, which I would highly recommend if dataviz is something you will do all year long. I have my running data from the past couple years (I log every run!) and give them a bunch of graphs to recreate. You could do this with any dataset and any platform, but it’s a good way to get them going with the technical skills.
As they are creating the graphs, I have them write down one piece of insight they gained from each graph as a way to keep focusing their eye on a graph only being useful if it says something interesting.
Telling a Simple Story: Which was the Greatest NBA Team?
To then get them arguing with each other using graphs as evidence, we do a quick exercise with the stats of the 1996 Bulls and 2017 Warriors. Which was the best team? Which had the best typical player? What graphs (and descriptive statistics at this point too) can you use to argue your case?
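As a rough sketch of the descriptive-statistics side of that argument, here is a small Python example (the class itself uses R and Tableau, so this is just one possible translation). The two rosters and their points-per-game values are made up for illustration and are not real Bulls or Warriors stats; `summarize` is a hypothetical helper. Comparing mean and median is one way to separate "best team" from "best typical player."

```python
from statistics import mean, median

# Made-up points-per-game values for two hypothetical rosters.
# These are placeholders, NOT real 1996 Bulls or 2017 Warriors stats.
team_a = [30.0, 19.0, 13.0, 7.0, 6.0, 5.0, 4.0, 4.0]
team_b = [25.0, 25.0, 22.0, 10.0, 7.0, 6.0, 5.0, 4.0]


def summarize(points):
    """Return a few descriptive statistics for one roster (hypothetical helper)."""
    return {
        "mean": round(mean(points), 2),   # pulled upward by star players
        "median": median(points),         # closer to the "typical" player
        "top_scorer": max(points),
    }


summary_a = summarize(team_a)
summary_b = summarize(team_b)

for name, summary in [("Team A", summary_a), ("Team B", summary_b)]:
    print(name, summary)
```

With these invented numbers, Team A has the single highest scorer while Team B has the higher mean and median, which is exactly the kind of disagreement students can argue about.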
A Performance Assessment: Exploratory Data Analysis
Next, it’s time to release them to work on their own. I have them do an exploratory data analysis on a dataset. Starting from raw data, how can you use the storytelling framework to tell the story of the dataset? Put together a series of graphs and measurements that give insight about the data. Here is a rubric for a similar assignment that I did during the year in a graded class.
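For a sense of what the "measurements" half of an exploratory analysis might look like, here is a minimal Python sketch using only the standard library. The rows are invented and stand in for a real .csv file; the column names (`genre`, `rating`, `gross_millions`) are assumptions and are not taken from any of the datasets in this post.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented rows standing in for a real .csv file (assumed column names).
RAW = """title,genre,rating,gross_millions
Film A,Comedy,6.0,40
Film B,Comedy,7.0,60
Film C,Drama,8.0,30
Film D,Drama,7.0,50
Film E,Action,6.5,120
"""

# Parse the text as CSV; each row becomes a dict keyed by column name.
rows = list(csv.DictReader(io.StringIO(RAW)))

# Group the gross values by genre.
gross_by_genre = defaultdict(list)
for row in rows:
    gross_by_genre[row["genre"]].append(float(row["gross_millions"]))

# Average gross per genre, and the genre with the highest average.
genre_means = {genre: mean(values) for genre, values in gross_by_genre.items()}
top_genre = max(genre_means, key=genre_means.get)

print(len(rows), "rows")
print(genre_means)
print("Highest average gross:", top_genre)
```

A grouped summary like this is only a starting point: the next step would be asking which type of data story it supports and which graph would show it best.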
I have used a million different datasets for this (I get a lot of them from a newsletter called Data Is Plural). Some that I have liked in the past are below (these are all links to .csv files; sorry, I don’t have them well documented with their sources!):
- Movie gross vs. rating, length etc
- School shootings since Columbine
- Gun laws across the states
- Driving deaths across the states vs. various laws
Finally, I wanted to outline two creative projects that my classes have done in the past to really explore dataviz in depth. First, each year my class has created a wall-sized data visualization based on a dataset. We used a mobile projector to project the graphs and titles on the wall and then covered them in painter’s tape. These were much easier to do in the summer than during the year, but we still did a cool one during the year too! Here are two examples, one about school shootings, and one about Wordle.
The second is something I’ve used in the past and have loved: there are two data scientists who spent a year sending each other postcards of data visualizations, each based on a topic that they collected data on over the course of a week. They are beautiful!! I have students pick a topic, collect data on it about themselves for a few days, then creatively visualize it. Fun little exercise.
Phew!!! There are a million other things that pop up along the way, like chart junk and annotating a graph, and how pie charts suck, but these are the main pillars of my dataviz instruction.