Is AI Writing Still Nonsense?
December 4, 2025
Or, what exactly are we doing when we read LLM-generated text?

Recently, a nonsense LLM-generated paper kicked up some outrage in the AI community after receiving several positive reviews at ICLR, a large deep learning conference. Although it was flagged by a third reviewer and later rejected for violating conference guidelines, I found it interesting that the fake paper had made it as far as it did. I was particularly moved by this reviewer’s comments:

The frustration here resonated with me a lot—probably because I’d been in a very similar situation before.
In 2020, I close-read approximately 118 AI-generated documents on cannabis legalization (and an equal number of human-written ones). It was a painstaking task: I needed to classify the lexical aspectual class of every clause, mark coherence relations between clauses, and rate the argumentation quality of each document. But there were two things that made this difficult. First, I didn’t know which articles were human-written and which were AI-generated. Second, the AI-generated documents were extremely uncanny. Take this sentence, for example:
If weed’s not really a public health issue and you’re really happy about it, get an understanding about the ways in which it will be able to influence your behaviour.
Is this someone’s Reddit comment, posted without a second thought? Or is it a semi-competent language model’s attempt to imitate the surface form of an argument? I was never really sure. But the quality of my annotations depended on actually understanding what was being said here, so I spent a lot of time re-reading these kinds of sentences over and over, trying to grasp some kind of meaning from them. It felt like I was having a stroke, or like I was being gaslit by the text. But then, I’d read something like
BART police already have a “marijuana alley” where potential customers could find sprayers and pagers ready to use and find it where they’re supposed to.
and breathe a sigh of relief. I’d know that the document was nonsense—that it had been generated by GPT-2, a disembodied probabilistic model that cannot smoke weed and has never been to the Bay Area. Thus, I felt safe in assuming that there was “nothing there” for me to annotate.
The following summer (2021), I wrote an essay about this experience for a class. It includes some cool examples of AI-generated nonsense that has the shape of a sensible argument, without the content. In that essay, I argued that the text generated by LLMs is odd in that it is not language yet—not until a human is able to read and extract meaning from that text. In this way, LLM-generated text is strangely beautiful.
From Nonsense Sentences to Nonsense Papers
LLMs are a lot better now than they were in 2020: almost every sentence that comes out of a frontier model is not only structurally coherent, but sensical in the context of the sentences that came before it. You can almost always follow the flow of an LLM-generated article from beginning to end without much difficulty. So, does that mean that these documents are just… meaningful now?
Maybe not. There’s no fundamental difference between what GPT-2 does and what GPT-5 does (to our knowledge). If GPT-2 gave us sentences that looked correct but did not actually say anything, then maybe GPT-5 gives us entire papers that look coherent but don’t actually make any sense. Here’s a screenshot I took of the culprit paper’s introduction as an example.

I know nothing about this area, so I’ll leave critique of the “substance” of this paper to others (if that substance exists). But what struck me was that this introduction has all of the surface-level indicators of being cogent and well-written: plain language, italicized key terms, and even a bolded paragraph header that points you to the Key Idea. Nonetheless, when I read it, I have no idea what problem the “authors” were trying to solve, or how their “key idea” actually helps to solve it.
The way I feel reading this introduction is how I used to feel reading technical papers in undergrad, when I first started trying to get into ML research. Scanning over the text, it looks like something that should make a lot of sense to an expert somewhere, but no matter how many times I re-read any sentence, I am no closer to understanding what’s going on. It could be that I simply don’t know enough about ZKML, but I don’t think that’s the whole story. When I read the introduction of this real paper on zero-knowledge proofs for LLMs, it’s like a breath of fresh air: I actually understand what ZKML is and why it’s potentially interesting, and I even get a rough sense of what the authors did (though it would take a lot of work for me to truly understand it).
Unfortunately, the folks who had to review this paper were subjected to worse LLM-gaslighting than I ever had to experience in my annotator days. In 2020, when I re-re-read a clause like “potential customers could find sprayers and pagers ready to use and find it where they’re supposed to,” the escape hatch was always right there (I knew it could be AI). But here, not only is the nonsensicality of the text much more subtle, but the possibility that the paper was AI-generated might not have even crossed the reviewers’ minds. If I were in this situation, I might have felt that old insecurity from undergrad creeping in, an uncomfortable feeling that the problem is me, maybe even that I should keep my head down and not object. I can see that being a factor in why this paper got past so many reviewers.1
What Even Is Reading?
This specific frustration of re-re-reading sentences to no avail isn’t new; I would guess that most people have experienced it with human-written text. I know that I can sometimes get “stuck” on sentences, reading them over and over like a broken record. For me, this mostly happens when reading technical writing, but I’ve also experienced it reading fiction, and even forum posts.
Reading is usually very easy for us, and it happens almost involuntarily. Think of the Stroop effect, where it’s hard not to read the content of a word when it’s flashed in front of you. When we read text written by others, we understand their thoughts and intentions quite quickly, almost as if there were a wire conducting their thoughts straight into our brains. This might be why, at least in western English-speaking cultures, we conceptualize language as a conduit for meaning. This is known as the Conduit Metaphor, first described in 1979 by linguist Michael J. Reddy, who argued that it forms the basis of how English speakers conceptualize communication and meaning.2
Under the Conduit Metaphor, we think of thoughts/ideas as objects in our minds that can be “transferred” to other people. There are two ideas going on here: first, that communication is the process of transferring thoughts to other people, and second, that language is the container in which we put those thoughts (“put those ideas in some other paragraph,” “her words were filled with emotion”). If you study language, you’ve probably come across this assumption stated explicitly; for example, information-theoretic modeling of language conceptualizes communication as a noisy channel, where we encode meaning into messages that are then sent along that channel.
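To make that assumption concrete, here is a rough sketch of the noisy-channel picture (the symbols m and s are just illustrative labels I’m using here, not notation from any particular paper): a speaker encodes an intended meaning m into a signal s, the signal passes through a noisy channel, and the listener recovers the meaning by inverting the channel:

$$\hat{m} \;=\; \underset{m}{\arg\max}\; P(m \mid s) \;=\; \underset{m}{\arg\max}\; P(s \mid m)\,P(m)$$

The metaphor is baked right into the formalism: meaning is treated as a discrete object that gets packed into a message, corrupted in transit, and unpacked on the other side.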
However, as Reddy points out in his original essay, this metaphor is misleading. We can never actually access or experience the mind of another person, and ideas are not objects that we can take from our minds and wire directly into other people’s brains. If we could actually do that, we wouldn’t need language at all.3 Instead, communication is more like sending someone a recipe for a specific dish. Recipes do not inherently “contain” any food in them. Every person making a particular recipe will interpret the instructions differently, depending on their kitchen and the ingredients that they have access to in their own home; this can sometimes lead to very different outcomes. It would be nice if we could directly send our friends food (i.e., thoughts), but because we don’t have a way to establish direct brain-to-brain contact, we have to make do with recipes (i.e., language).
Under this “recipe” metaphor, meaning is not inherently contained within a text, but comes about as a result of a reader interpreting that text.4 I like this idea because it seems to resonate with other ways in which we use the word “read.” When we read tea leaves, we are generating meaning from a random blob based on our own personal symbols and conceptual structures. You can read someone’s face, or “read into” their actions, but those interpretations will always be in terms of your own personality, hopes, and anxieties. The act of reading is always interpretation, rather than extraction: when someone says, “that’s my reading of it,” they are being totally precise, because every person will read a given text slightly differently.
So what is happening when we find ourselves stuck re-re-reading the same sentence? If we accept that there is never any meaning “contained” inside a text, then these moments are not failures to extract some true meaning lurking inside a work, but simply moments where some symbols on a page are not triggering any ideas in our minds. This could happen for any number of reasons: reader fatigue, poor writing, intentional obfuscation, or simply a lack of relevant conceptual structure on the part of the reader (like when you try to read technical writing on an unfamiliar topic). And of course, it could happen when trying to read an artefact generated by a language model that is imitating the form of a well-written argument, without any of the content. Any of these things could be responsible for making a piece of writing hard to interpret and forcing us to re-re-read.
What Makes Text Meaningful?
Under this framing, if a piece of text is meaningful to a reader, it doesn’t really matter how that text came into being: whether it was human-written, AI-generated, or formed by some ants in the dirt. This is starting to get into philosophical territory; since I don’t have a background in philosophy, I’ll just plainly state the issue that arises for me here.5
This “ants in the dirt” example that I linked to is from Hilary Putnam’s 1981 book, Reason, Truth, and History. I learned about it from Matthew Mandelkern and Tal Linzen’s 2024 article, “Do Language Models’ Words Refer?”, which explains it as follows:
Suppose that, at a picnic, you observe ants wending through the sand in a surprising pattern, which closely resembles the English sentence “Peano proved that arithmetic is incomplete”. At the same time, you get a text message from Luke, who is taking a logic class. He writes, “Peano proved that arithmetic is incomplete”.
Intuitively, the two cases are very different, despite involving physically similar patterns. The ants’ patterns do not say anything; they just happen to have formed patterns which resemble meaningful words. Of course, you can interpret the pattern, just as you can interpret an eagle’s flight as an auspicious augur; but these are interpretations you overlay on a natural pattern, not meanings intrinsic to the patterns themselves. By contrast, Luke’s words mean something definite on their own (regardless of whether you or anyone else interprets them): namely, that Peano proved that arithmetic is incomplete. What Luke said is false: it was Gödel who proved incompleteness. But Luke said something, whereas the ants didn’t say anything at all.
Because of everything we just talked about, I disagree with Mandelkern and Linzen’s argument: I don’t think that there is any real difference between the ant-spelled and human-written text, at least in terms of the meaningfulness of that text. If we apply our “recipe metaphor” to this example, then the meaning that arises in our minds when we read “Peano proved that arithmetic is incomplete” is the same for Luke’s message and for the ants in the dirt. In both scenarios, the string “Peano proved that arithmetic is incomplete” induces the exact same thought in our head (apart from non-linguistic concerns, like “why is Luke texting me this?” or “how the hell did these ants happen to spell out an entire sentence?”). But crucially, this meaning is not “intrinsic to the patterns themselves”—it arises only when we see and interpret those patterns. And if someone else were to see the exact same patterns, the meaning that they would derive in their heads would always be slightly different.
This is what confuses me about debates on whether LLM-generated text is “truly meaningful.” If I read a sentence that causes me to have a particular thought, then I have made meaning from that sentence, and so the sentence is meaningful (to me). It used to be that LLM-generated text was hard to interpret, which meant that most of the time, it was not meaningful—unless you were a poor annotator like me, whose job it was to try very hard to interpret horrible things like “It’s all just the right amount of subtlety in male porn, and the amount of subtlety you can detect is simply astounding.” But now, LLM-generated text is so good that the majority of it is meaningful to most people almost all of the time.
To me, the difference between Luke’s text and the ants’ sentence is in the non-linguistic considerations that arise when reading both messages. If you really saw something written in the sand by ants, you would know that this message was not “intentional,” and would therefore put less stock in whatever idea it triggers in your mind. (Unless you were superstitious and took the message as a sign, which maybe you should; seeing that for real would be crazy.) But if you knew that a piece of text came from a person, you would probably try harder to understand it, even if it seemed strange and meaningless at first. This reminds me of a recent study on human ratings of AI poetry: Colin Fraser on Twitter made a fascinating visualization of their data showing that ratings of poem quality for human-written poems are lower than ratings for AI-generated poetry—unless people are told that a poem was written by a human, which gives a massive jump in perceived quality (the blue arrows below).

It seems that participants in this study were rating poem quality based in large part on the meaningfulness of the text. For example, one subject rated an AI-generated poem as human-written because it “contained a lot of human experiences.” Another subject, when asked to explain why they judged “Promised Years” by Dorothea Lasky as AI-generated, said that “the poem seems like it doesn’t have a clear idea.” If this respondent had been told that the poem was written by a human, I wonder whether they would have been willing to spend a bit more time and effort trying to interpret what Lasky was trying to say, rather than dismissing it as AI-generated.
I think that this poetry example serves as a nice foil to the ants-in-the-sand argument, because it highlights that the perceived meaningfulness of a piece of text has a lot to do with our perceptions of where the text came from. These perceptions determine how willing we are to spend time trying to understand a work: if we believe that a piece of text is human-written, we may spend a lot more time trying to interpret and understand it.6 So maybe this is why “Peano proved that arithmetic is incomplete” feels so different coming from ants in the sand, rather than coming from a trusted friend: if we know (or think we know) that something was written by a human, it changes the way that we approach interpretation of that text.
What If This AI Paper Was Actually Good?
There have been plenty of infamous examples over the decades of people writing hoax papers that made it past peer review. What’s interesting about these cases is that, theoretically, a hoaxer could write a paper that they believe to be nonsense but that, when interpreted by another person in the field, actually says something profound or interesting. Now that we are apparently letting LLMs write entire papers for us, this seems even more likely to happen. Unlike the hoaxers, the AI is at least “trying” to write something good. So, what if this particular AI-written paper had actually offered a truly profound breakthrough or insight?
It would be interesting if, one day, an LLM actually did write a paper that says something new and profound. Maybe it would be just as opaquely written as the fake ZKML paper above, requiring many human augurs to spend countless hours interpreting and deriving insights from the work. Hopefully it doesn’t play out that way: if such a profound but opaque AI paper did exist, I doubt that anyone would bother taking the time to understand it; it would probably get lost in a sea of nonsense papers. But if we did take the time to read such a paper, then those novel insights wouldn’t have come from an LLM—they would have come from us, the readers.
Discussions with David Atkinson and Andy Arditi back in November inspired a lot of the ideas that ended up here. Thanks to Adrian Chang for continual feedback on drafts of this post. Plus, thank you to Si Wu for recommending the book “Metaphors We Live By” to me, and to all of the above for extra comments at the end.
Of course, the right thing to do in this situation would be to ask for the paper to be reassigned, or to review it to the best of your ability and flag your low confidence; if a paper is that hard to understand, it probably shouldn’t get published anyway. But a willingness to call BS seems to come down to a combination of seniority and personality. ↩︎
I learned about this idea from George Lakoff and Mark Johnson’s book, Metaphors We Live By, published in 1980. Thank you to Si Wu for recommending this book to me! ↩︎
Maybe the world would look something like the Human Instrumentality Project from Evangelion. ↩︎
Reddy’s name for this is actually the “toolmakers paradigm,” but since the way I think about it might not match his original essay exactly, I won’t use his term in this post. ↩︎
If you have a philosophy background and are interested in talking about this, please reach out! This has been bothering me a lot. ↩︎
If it were written by someone well-respected, like Derrida, we might spend even more time. ↩︎