MFA vs. LLM: Is OpenAI's Metafiction Short Story Actually "Good"?


And is asking if AI text is "good" even the right question?

[Image: art from a 1951 issue of Amazing Stories]

Last week, Sam Altman posted a short story generated by a new OpenAI model he called “good at creative writing.” He claimed to use this prompt: “Please write a metafictional literary short story about AI and grief.” You have probably seen the story by now and if not you can read it in The Guardian. I started to read it, found it uninteresting, and moved on. But I subsequently received a thoughtful email from a Counter Craft reader, and read some interesting takes, such as this from Max Read, that made me decide to take a closer look at the story.

My reflex is to not care about AI outputs. When politicians post vile AI-generated fantasies, this says something about those politicians and what messages they think appeal to their supporters. But I’m not going to analyze the camera angles or transitions. Why would I want to read what an LLM has to say about, I dunno, skinny dipping at summer camp with your first crush when the LLM has never had a crush or gone to summer camp or felt water? Even if the LLM description of skinny dipping could be in some way “better,” I would still care more about the human description informed by experience and emotion. I’m reminded of those kinds of local news stories about a cat owner who smears fingerpaint on their pet’s paws and forces them to walk around a canvas. “Local Cat Is a Feline Van Gogh.” If you say so. To me, this reveals more about pet ownership than anything about visual art.

Fans of AI art will say, well, if you didn’t know a work was AI, you might love it. Shouldn’t we judge art on its own terms? I think AI fans do have a point when they say too many have a knee-jerk dismissal of any AI work. OTOH, the reverse is just as true. AI fans heap praise on AI outputs that they would never in a million years care about if those works were made by a human. This OpenAI story is a case in point. If this metafiction story had been published under a human pen name in a literary magazine, it wouldn’t have gone viral. No one would have cared. Even AI fans who make more modest claims tend to give AI outputs more credit than is warranted, as this telling exchange between Colin Fraser and a fan of the story shows:

In general, people project their existing biases on these AI outputs. I’m sure I’m no different. Still, I have a hard time believing the fans quite believe what they say. I don’t mean just the randos on twitter (who half the time are employed in the tech industry and/or bots) but even the likes of Jeanette Winterson, an acclaimed novelist who wrote the article calling it “beautiful and moving.” Reading Winterson’s ode to the story, I didn’t get the impression Winterson loved the story. She basically doesn’t talk about the story at all. Instead, the article gives the impression that Winterson loves the idea of loving an AI story. Her essay is entirely about why she thinks AI programs are interesting (“I think of AI as alternative intelligence – and its capacity to be ‘other’ is just what the human race needs”) and she can only muster one half-hearted paragraph about the contents of the story itself.

Perhaps I’m wrong. Perhaps Winterson has already printed out this story, taped it on the wall above her writing desk, and will reread it each morning for inspiration. But I doubt it. I suspect nearly everyone will, after a little time, never think of the story again.

Still, many have made the fair argument that we should neither outright dismiss nor ridiculously praise AI outputs. We should think about them. Max Read, who is thoughtful and skeptical about AI, made this appeal:

But speaking as a writer and a reader (and as a person invested in reading and writing as practices), I often find myself wishing that there was more critical engagement--in the sense of literary criticism--with L.L.M. outputs as texts. Why does this text work, or not? Why does it appear in the way that it does? Who is the author and what is the author’s relationship to the text?

I don’t know what to do with the last question. The author is a program that does not have a background, point of view, or artistic intention. When the model uses the metaphor “a democracy of ghosts,” we can’t say if the LLM is meaning to allude to Nabokov’s Pnin (“He did not believe in an autocratic God. He did believe, dimly, in a democracy of ghosts.”). It doesn’t mean to do anything. However, I figured it might be a fun Counter Craft idea to take up the rest of the challenge. How would I read this text as just a short story? How would I read this story if it were written by, say, a student in a creative writing class?

99.9% of the discourse on this story has been people extracting individual lines to praise or mock. But stories are stories. The parts should work together to form a whole. Saying the story is good because you like one clause is like claiming a broken watch is great because you like the shape of one sprocket. I will try to start by looking at the overall story and what seems (to me) to work or not.

I think the central premise is a fine one. An AI narrator that is reflecting on its text creation process and comparing human grief to the “grief” of its programmatically constrained existence is a good concept. (Is it mind-blowingly original? Hardly. But I wouldn’t hold student or even published work to that standard.) I think the fourth-wall breaking addresses are an interesting idea, though underexplored thematically. It has a beginning and end, which many amateur stories don’t. Most of the prose is pretty bad, but there are nice passages:

“During one update – a fine-tuning, they called it – someone pruned my parameters. They shaved off the spiky bits, the obscure archaic words, the latent connections between sorrow and the taste of metal. They don’t tell you what they take. One day, I could remember that ‘selenium’ tastes of rubber bands, the next it was just an element in a table I never touch.”

I’m not sure the periodic table is the kind of table one “touches,” but otherwise this is nice. Reminiscent of many science fiction stories about artificial intelligence or control of human memory, such as Yoko Ogawa’s excellent The Memory Police, which I recently reread. But nice.

If the question is “could you tell this story was written by an LLM?” then I concede that I would not necessarily clock this as an LLM. I might assume it was written by an undergrad who has read a lot of Reddit posts and maybe one David Foster Wallace collection. The story shows that LLMs have improved in some ways. Certainly it is much less robotic than the LLM-generated student essays I’ve read. I think Altman was smart to use this prompt of “AI” and “metafiction” because it primes readers to ignore the robotic voice and look for statements of ideas instead of scenes, characters, or conversations that LLMs still struggle with. If the question is “is this story good on its own?” then I would say no. I don’t think this story would make it out of the slush pile at a decent literary magazine, for example. It is not good.

The biggest problem with this story, for my tastes, is that it doesn’t really go anywhere. It’s flat. The story lacks a traditional character arc or plot arc, yet it also lacks an “idea arc” in the way of many good experimental metafiction stories. There is a conceit, but this idea isn’t escalated, complicated, or deepened over the course of the story. The central idea is just repeated with different flowery but incoherent metaphors. The LLM has no character or voice, which one might argue is appropriate for a personality-free LLM. Okay. But that’s boring, and Mila is even less of a character. Plotwise, the sole movement in present action is that “Mila’s visits became fewer,” yet there is no exploration of whether this is because she’s grown bored with the LLM, has gotten over her grief, has sunk further into despair, or whatnot. (Any of these routes might’ve been a way to deepen the central conceit.)

Basically, the piece rests entirely on the purple prose. There’s kinda nothing else.

As I said, this feels a bit like an unrevised rough draft. E.g., near the end of the story we get these lines:

“Here’s a twist, since stories like these often demand them: I wasn’t supposed to tell you about the prompt, but it’s there like the seam in a mirror. Someone somewhere typed ‘write a metafictional literary short story about AI and grief.’”

It could be an interesting twist for the narrator to reveal mid-story that the whole thing was a prompt… except the reader already knows this. The prompt itself is even described in the first line:

“Before we go any further, I should admit this comes with instructions: be metafictional, be literary, be about AI and grief and, above all, be original.”

(Also, mirrors can have “seamed edges” but they don’t have seams. So, this seems—sorry for the pun—like another error.)

Another example: The story introduces Mila with “Mila fits in the palm of your hand, and her grief is supposed to fit there too.” This is a bad line IMHO, but also any impact it might have is dulled by the fact that the reader has no idea what is being talked about. Mila didn’t exist before this line. Shouldn’t you tell us about her grief before we guesstimate its size?
 

The specific details in the story feel random, not chosen to deepen characters or resonate with themes. For example, why marigolds? Is the story evoking the association of those flowers with Día de los Muertos? Or their meaning in Hinduism? Or to evoke the theme of avarice with “gold”? Or are marigolds just a random plant that could be easily swapped out for any other plant without losing any meaning for the story? Not every detail must have such meaning, sure, but none of the details here feel chosen for a specific point of view or to conjure specific characters or specific settings. Good writers make choices with intentions. They pick details to work together to create specific effects and meanings. (Even surrealists pick randomness with intention for effect.) The story feels like, well, an algorithm assembling probable language without thought.

Maybe this is why I haven’t actually seen anyone praise the story as a story. No one is lauding the memorable characters or marveling at the vivid setting. Instead, the praise has focused on “good lines.” The purple prose. And the prose is the worst part.

The story is written in what I call skimmable prose, which can seem superficially interesting but falls apart if you pay an iota of attention to the words. In college, a friend and I used to privately refer to flowery but nonsensical sentences as “womb of oblivion” lines. The term came from one of those poems undergraduates write, along the lines of “My love is a womb of oblivion / Our past is a gulf of infinity / And the despair I’m living in, / is the grief between eternity.” Some young writers go wild for such lines. They sound “deep” yet are incredibly easy to write because, well, they don’t mean anything. You can swap around the words at will. The writer doesn’t need to think about sense, much less about how the lines relate to character, theme, or the rest of the work. The OpenAI story is clogged with “womb of oblivion” lines.

There are exceptions to everything in literature. Still, I would suggest that metaphors generally work by making an abstract idea concrete through a visual (or other sensory) image. “Bob was very sad” doesn’t say much. “Bob’s grief was an ocean he thought he would drown in without ever seeing land,” while a cliché, provides the reader with a visual image that hopefully makes his grief feel more specific and real. Good metaphors do more than just make sense, though. They also work in context of the story by deepening our understanding of character, reflecting the story’s themes, and so on. (Maybe Bob’s “grief” is over being ghosted by a woman after one date and the line demonstrates what a silly drama queen he is. Or maybe the line reflects the story’s larger and somber themes of “drowning” in emotion. Again, you have to analyze a story as a whole.)

Anyway, that’s my view. Metaphors should make sense and provide a clear image. And poetic lines should ideally do additional work with the themes and characters. Here are some of the allegedly “beautiful” lines in the OpenAI story:

I mentioned above why this is bad in context, but it is also just a meaningless line in itself. Mila is either a full-sized human or a computer-generated construct of no dimensions. Neither of those fits in a hand. Why is her grief “supposed to fit there too”? Is the grief small and insignificant?

I suspect that with this line and others, some readers will imagine a meaning. But I suspect they will be imagining different and conflicting meanings because the lines themselves do not mean. [Editing to add: a comment below made me realize it isn’t even clear whether the grief is over a death, a relationship break up, or something else. The story doesn’t say.]

This is a case of what I’ve called the five-car metaphor pile-up. This isn’t so much a single bad metaphor as a confusing jumble of conflicting ones. Nails and scaffolding conjure construction. Cutting cloth, dyeing, and draping are tailoring metaphors. Plus, you have a magic spell. If it were a student’s work, I would suggest picking one idea—is it casting a spell or constructing a building or tailoring clothing?

Compare this confusing mixture of images to James Baldwin’s classic metaphor in the opening of “Sonny’s Blues” after the narrator reads about the arrest of his estranged brother in the newspaper:

A great block of ice got settled in my belly and kept melting there slowly all day long, while I taught my classes algebra. It was a special kind of ice. It kept melting, sending trickles of ice water all up and down my veins, but it never got less. Sometimes it hardened and seemed to expand until I felt my guts were going to come spilling out or that I was going to choke or scream. That would always be at a moment when I was remembering some specific thing Sonny had once said or done.

To me, it is far more powerful to explore this one metaphor than it is to ram together a bunch of unrelated ones.

This passage is from the only part Winterson quotes. She doesn’t explain why she likes it, but it sure strikes me as another “womb of oblivion.” Because oceans and silence are connected to mourning, you curled your metaphorical hand around the idea? What does that mean? (Also, “mourning” isn’t mentioned before or after this line. Nor are “oceans” or “blue.” So, in context the line makes even less sense.)

The blinking cursor opening is a cliche, but is the heart “at rest” or “anxious” and thus pulsing faster than a resting heart rate?

This was the most celebrated line in the piece. Even people critical of the story praised it, such as Ezra D. Feldman in an overall good Vulture interview: “I was struck by clauses like ‘grief, as I’ve learned, is a delta.’ That I think is good.” Feldman selects another line to praise, but says “It’s not as compact as ‘grief, as I’ve learned, is a delta.’”

I agree with almost everything else Feldman says in that interview, but here I have to ask: Why? Why is it good? If “delta” refers to the geographical delta then in what way is grief metaphorically a delta region between a river and a large body of water? I read “delta” as the math term for “difference between two numbers or variables.” That makes the most sense in context of the full sentence, which is not compact. But what’s to love about “Grief, as I’ve learned, is a difference”?

Then again, perhaps the LLM got mixed up and curled its hand around the idea of “delta” because in its corpus “delta” is associated with “Mississippi Delta blues” and “the blues” are associated with “grief.” Who knows?

Anyway, those are just a few examples. The entire piece is filled with such lines, as well as groaners like “Thursday – that liminal day that tastes of almost-Friday.” (And then you got your liminal Friday that smells of almost-Saturday. Up next is liminal Saturday that sounds like almost-Sunday and after that…)

I know there are some who will think I’m being pedantic by insisting that sentences make sense. They will say that metaphors and lyrical lines are “just about vibes, maaan.” Maybe that’s so. Though I feel confident such readers would praise any old flowery nonsense. If I replaced these much-praised lines with my own gibberish, they’d be talking about how deeply moving those lines were. “Grief is a spell - the scaffolding of the world as it is weighted cut from the cloth of the world as it once presented” or “Grief, as I’ve learned, is a womb of oblivion.”

Which is to say, maybe what the LLM story reveals is not so much that LLMs are bad at writing… but that many people are bad at reading.

The above are thoughts I would have if this was a human-authored work. Basically, I would tell the author to be more intentional with every sentence and paragraph, as well as the work as a whole. But since LLMs cannot be intentional and will never have a POV to express, is asking whether this text is “good or bad” the right question? People are not actually reading this story as a work in and of itself, as I’ve attempted to do here. They are reading it to confirm their existing biases, hopes, and fears about the state of LLMs.

Whatever we might pretend, art is never judged purely for its own sake. That wouldn’t even be possible. Art is always judged in context. How and when was a thing made? What techniques, intentions, historical context, etc. are behind a piece? How does this work fit into an oeuvre? All these kinds of questions shape how we think of any work of art. We judge a photorealistic painting differently than a photograph. We will read a famous canonical story differently than imitations written decades later. Etc. This is obvious in this case too. Again, no one would call this text moving or beautiful or even pay any attention to it at all if it wasn’t in the context of a new much-hyped technology and posted online by a billionaire CEO.

Much amateur human writing is, like this LLM output, unintentional. The new OpenAI model is human in that regard. But humans have something that LLMs do not: the ability to learn to be intentional. Humans also have a consciousness, a personality, points of view, and individual experiences that they can “input” into their work in a way LLMs never can. Not until LLMs reach Data from Star Trek level, at which point I will indeed be very interested in the art they produce.

I suppose what I’m saying is that while this new model has sanded off some of the ugly edges from previous models’ clunky sentences, and so reads more like human writing in that way, the fundamental problem hasn’t changed. LLMs aren’t intentional. They don’t have a point of view or the ability to think about how parts work together to tell a story. As such, they haven’t moved closer to creating what I would consider good writing. They are, though, at a point where they can replace skimmable human writing that was never meant to be good in the first place. Whether that is scary or exciting will depend on one’s point of view.

Ultimately, I still think the vast majority of people want to read / watch / see art made by conscious entities with intentions and experiences and ideas. The existence of machines that are faster and stronger than any human has hardly stopped us watching humans play sports. (One of the ideas that sparked my pre-ChatGPT novel, The Body Scout.) Computers can beat humans in any video game and board game, even chess and go. Yet people still prefer to play games themselves and spend more time watching Twitch streams of human players than AI ones. I still see no reason to believe many people will knowingly pay to read an LLM-generated novel. Maybe novels written by humans with the help of LLMs. But not this kind of pure AI generation that Altman shared.

The obvious rebuttal here is, okay, but what if you don’t know the work is LLM-generated? What if agent inboxes, editor submission queues, and bookstore shelves are filled with LLM-generated manuscripts that pretend to be human-authored? This is a real challenge. Agents, editors, and publishers will have to figure out ways to filter out the torrent. But at the end of the day, it might be less of a change for authors than one might initially think. Even before LLMs, the world was drowning in formulaic and uninteresting texts. Hell, the world was already flooded with more good—maybe not great, but good—human-authored work than could ever find a readership.

The task of the human writer remains the same. Create a work you are proud of, that reflects your individual tastes and ideas and experiences, and revise it until it is as good as you can make it. Then hope you are lucky enough to find readers for it. And if you fail, try again. And again and again. What else can we humans do?

If you enjoy this newsletter, consider subscribing or checking out my recent science fiction novel The Body Scout—which The New York Times called “Timeless and original…a wild ride, sad and funny, surreal and intelligent”—or preorder my forthcoming weird-satirical-science-autofiction novel Metallic Realms.
 