Monthly Archives: June 2019

Words Speak Louder than Comments: A Statistical Analysis of the Words a School Uses in Narrative Comments

In all honesty, writing end-of-term comments for students is just about my least favorite part of teaching. Hours and hours spent writing narrative comments that try to serve too many purposes – evaluative feedback for parents, information for college counselors, and coaching feedback to help students improve. Last year, at a school I have since left, I was curious about what insight we could gain from the words faculty choose in these comments. What do these words say about our priorities as teachers? What advice do we give to students who are struggling versus students who are excelling? Does the math department talk about different measures of success than the English department? Do we use different words for boys and girls? What about for white students and students of color?

To study this, I got hold of a full set of comments from one semester at my school (every teacher, every student in every class – 1,711 comments in total). I am thankful that my administration trusted the statistics teacher with this information! My goal was to study word choice blind to context – instead of worrying about how a word was used, I just looked at which words were used. Using a programming language called R, I stripped the comments of words like “the,” “and,” “been,” or “have” (called stop words) to try to look only at words that carry real meaning. Below are the words with 400 instances or more:


The words almost read like a comment: “As we wrap up the class for this semester, next quarter, I’m looking forward to Steve’s understanding developing so that he can answer difficult questions on the final exam.” Some minor insights emerge from this list – “strong” and “excellent” show up as the most positive words, but no negative words appear, and discussions, papers, essays and written work seem to be the meat of what we talk about, though the exam is talked about frequently.
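
For readers who want to try something similar, the stop-word filtering and counting described above can be sketched roughly as follows. This is a minimal Python sketch, not the R code used for the original analysis (which is not shown in this post). The tiny stop-word list, the function names, and the sample comments are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list. A real analysis would use
# a much fuller list from a text-mining library.
STOP_WORDS = {
    "the", "and", "been", "have", "has", "a", "an", "of", "to", "in",
    "is", "his", "her", "he", "she", "on", "for", "that", "this", "with",
}


def tokenize(comment):
    """Lowercase a comment and split it into words (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", comment.lower())


def word_counts(comments, stop_words=STOP_WORDS):
    """Count every non-stop word across a collection of comments."""
    counts = Counter()
    for comment in comments:
        counts.update(w for w in tokenize(comment) if w not in stop_words)
    return counts


def frequent_words(counts, min_count):
    """Return (word, count) pairs with at least min_count uses, most common first."""
    return [(w, n) for w, n in counts.most_common() if n >= min_count]


# Hypothetical comments, invented for this example.
sample_comments = [
    "Steve has been a strong student and his essays have been excellent.",
    "The final exam is the next step for Steve.",
]
sample_counts = word_counts(sample_comments)
sample_frequent = frequent_words(sample_counts, min_count=2)
```

With the full set of comments, calling `frequent_words(counts, min_count=400)` would correspond to the 400-instance cutoff used for the list above.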

More interesting than this list of very comment-y words is what happens when we look at which words are disproportionately used by certain groups or about certain groups. The inspiration for this analysis came from Ben Blatt’s book “Nabokov’s Favorite Word Is Mauve,” a fantastic statistical exploration of the world of literature. In it, he describes the idea of a “cinnamon word,” which is a word used much more frequently by an author than you would expect in normal literary writing. For example, as the title of the book suggests, the word “mauve” (a purplish color) is 44x more likely to show up in Nabokov’s writing than in English writing in general. It’s still not a common word in his books – it appears only 5 times in his most famous book, “Lolita” – and common words like “the” are far more prevalent in his books. But whereas his usage of “the” is similar to other authors’, he uses “mauve” far more often than you would expect, so this word is characteristic of his writing. (The term “cinnamon word” is named after Toni Morrison’s most characteristic word, which is “cinnamon.”)

What was the “cinnamon word” for my comments? Objectives. The word “objectives” was 1000x more likely to appear in the text of my comments than in comments as a whole. This made sense to me – my class, including the grading, was organized around Learning Objectives, which I referenced constantly in my comments. At our final faculty meeting last year, I gave each faculty member their top 5 cinnamon words and left them to interpret. For a word to qualify, it needed to show up at least 10 times in their comments, and show up more frequently in theirs than in other teachers’ comments. A history colleague’s was “scholarly,” a Latin teacher used “annotations” frequently, and an English teacher had “quotations” show up 60x more often in her comments than in others’, all things that made perfect sense! Others had odd words like “crisis” or “discovered” or “70” appear. Some of these could be explained by an assignment the teacher was discussing, but others revealed a genuinely interesting insight about that person. I must admit that it was quite fun for teachers to see some quirks of their own writing!

Any individual’s comments are prone to the quirks of their own writing or the specific assignments in their class that quarter, so I decided to zoom out and ask questions about word choice by groups of people. For example, what words are characteristic of various departments (math – reassessment, English – evolving, science – lab)? The most interesting analysis, though, turned out to be which words are used ABOUT certain groups of students. For example, here are the words that are more likely to be used about boys, and the words more likely to be used in a girl’s comments:


There are some really interesting patterns that are easy to explain, and some that are not. Some words seem to show up because boys and girls choose different courses (the first couple of words seem to point to computer science for boys, and art words like “conte,” a type of charcoal, and “figure,” for figure drawing, seem to show up for girls). But others provide more questions than answers. Why do we use the word “humor” more in a boy’s comment? Is this something we inherently value more about our boys than our girls? Also, why is “beautiful” used more in a girl’s comment? Interestingly, “beautiful” does not make the top 10 cinnamon words for the arts department, so its presence can’t be explained by girls taking the arts alone. Are we more comfortable describing the writing of a girl as beautiful? A beautiful solution to a math problem? Other words that jump out to me are “creativity” and “genuine” on the boys’ side, and “openness” and “persistence” on the girls’. I’m not sure what they say, but they intrigue me nonetheless.

Another analysis I did investigated the words we use for the top 10% of students and the bottom 10% of students, as defined by their raw GPA:


I think the biggest difference between these two lists is that a lot of quality words show up on the left (elegant, exceptional, amazing, refined, stars), and a lot of process words show up for the struggling students (studentship, distracted, stress, strategies, missing). With much talk about the growth mindset in the education community lately, I wondered if we had absorbed the message for our struggling students but not our stars. It seems like we were good at talking about specific behaviors and parts of the process for the struggling students, but perhaps aren’t talking as much about the work that our exceptional students put into producing assignments that are elegant and refined (an omission that growth-mindset advocates say can be just as damaging). Also, it tickles me a bit that “violin” shows up for the successful students – I have to imagine that our violin players happen to be a group that skews stronger academically.

Presented for you to draw your own conclusions are the words that show up in the comments of students from different racial groups:


No data is perfect, and an analysis of a different year’s comments would turn out differently. I would also still characterize myself as a budding statistician and coder, so there could be some fatal flaws in this analysis. But this was interesting nonetheless, and I think a great learning experience for our faculty, the administration and me. I am thankful that I was at a school that was willing to be vulnerable and discuss these issues (and willing to let me share my work here)!