"Radical changes are occurring in what democratic societies teach the young, and these changes have not been well thought through. Thirsty for national profit, nations, and their systems of education, are heedlessly discarding skills that are needed to keep democracies alive. If this trend continues, nations all over the world will soon be producing generations of useful machines, rather than complete citizens who can think for themselves, criticise tradition, and understand the significance of another person's sufferings and achievements. The future of the world's democracies hangs in the balance." Martha Nussbaum, Not For Profit: Why Democracy Needs the Humanities
A few weeks ago, I was listening to an episode of In Our Time which focused on the physicist Lise Meitner. I'd never come across her before, but this is one of the joys of the programme: it constantly tilts my understanding of the subjects I teach so that I view the world anew. Apart from her brilliance and the challenges she faced, it was Meitner's ability to see beyond the silos of academic disciplines that struck me. Combining chemistry and physics led her, with Otto Hahn, to discover a new element, protactinium. That was no small feat, and it reinforces the idea that when we go beyond the confines of the familiar and work across disciplines, we are able to do more than we set out to do. Meitner's approach embodied the ability to examine a problem from multiple perspectives.
I'm recalling this story because of a recent non-peer-reviewed paper called 'Your brain on ChatGPT', which states that "the use of LLM had a measurable impact on participants, and while the benefits were initially apparent, as we demonstrated over the course of 4 months, the LLM group's participants performed worse than their counterparts" (p.2). The paper has made an impact on social media and in news outlets because of the current fascination with AI. I was intrigued too, but for a slightly more selfish reason, as I spend the majority of my professional life marking and writing essays. This passage was of particular interest:
"The potential of LLMs to support students extends beyond basic writing tasks. ChatGPT-4 outperforms human students in various aspects of essay quality, namely across most linguistic characteristics. The largest effects are seen in language mastery, where ChatGPT demonstrated exceptional facility compared to human writers." (p.19)
I thought I should read the paper they cited. Which I did. Titled 'A Large Scale Comparison of human-written versus ChatGPT-generated essays', its discussion section states:
"There are still certain issues that may affect our conclusions. Most importantly, neither the writers of the essays, nor their raters, were English native speakers... our reliance on essays written by non-native speakers affects the external validity and the generalizability of our results. It is certainly possible that native speaking students would perform better in the criteria related to language skills, though it is unclear by how much."(p.8)
In the conclusion, the authors write:
"For non-native speakers, our results show that when students want to maximise their essay grades, they could easily do so by relying on results from AI models like ChatGPT... However, this is not and cannot be the goal of education. Consequently, educators need to change how they approach homework. Instead of just assigning and grading essays, we need to reflect more on the output of AI tools regarding their reasoning and correctness." (p.9)
This more nuanced and carefully qualified conclusion is not exactly what the authors of 'Your brain on ChatGPT' conveyed. The "exceptional facility" cited in the 'Your brain on ChatGPT' paper was based on essays of around 200-300 words, a length that, for me, does not really amount to an essay. As I continued to read the paper, I was a little more excited when I read that they had used topics from the SAT test. The SAT essays I had come across were based on a piece of text, like a source in history, that the test taker had to read and respond to.
I was confused when I saw that, rather than an SAT essay question, the participants were given a general topic detached from any stimulus text, to complete in 20 minutes using either an LLM, internet searches, or their own brain power:
"Many people believe that loyalty whether to an individual, an organisation, or a nation means unconditional and unquestioning support no matter what. To these people, the withdrawal of support is by definition a betrayal of loyalty. But doesn't true loyalty sometimes require us to be critical of those we are loyal to? If we see that they are doing something that we believe is wrong, doesn't true loyalty require us to speak up, even if we must be critical? Assignment: Does true loyalty require unconditional support?" (p.25)
This question/topic is revealing in what it lacks. Where Nussbaum would argue that an education must cultivate the ability to 'criticise tradition' and engage with complex moral questions alongside others, the task removes the person behind the thesis and demands that an answer be delivered in 20 minutes. There is no historical context, no competing perspective, and no opportunity for deep reflection (the SAT gave 50 minutes to its essay writing section).
I freely admit that I could not understand all the brain data, but I did understand the inference: brain activity equalled learning. From what I understand about the brain, we can’t map writing to a particular part of it (I’m very open to learning otherwise). Moreover, just because you can measure something happening in the brain doesn't mean it's educationally useful. On the other hand, the claim that using an LLM incurs a cognitive debt (I do like the term) is seductive, because the participants who used one were less able to recall and recount what they had written. However, as some of my teaching colleagues would point out, if your mate or your parents do your homework, it's not very likely that you're going to remember what it was.
After reading the paper, sitting with it for a while, and reading more commentary, I began to wonder what might have come about if the MIT Media Lab faculty and computer scientists who publish these papers actually talked to the education departments at their institutions or engaged with teachers. What amazing conclusions could they reach if they recognised the achievements of educators and broke free from the traditions that seem to bind them? The problem is that the space to ask these questions is becoming smaller. The paper has been wielded as a weapon in online debates about AI, to the point that it seems to be a repackaged form of populism in which we trade what it means to learn (or not learn) and be human on metrics of neural efficiency. This is not neutral discourse; it is political. As Nussbaum warns us, the future of democracy depends on nurturing citizens, not "useful machines". When we reduce thinking (or not thinking) to connectivity and writing to extractable patterns, we strip education of its moral and civic purpose.
In the complex world of learning, we should be cautious of discussions that are simplified and framed in a binary way that maps easily onto current political discourse: the battle between the people and the corrupt elites. This is not just a battle for the classroom; it is also a battle for the kind of society we want to sustain, one rooted in reason, reflection, and civic responsibility. The stakes are high.