Chat-based programming, which I will preemptively label Chat-Oriented Programming (CHOP), or just "chop", because I suspect I'll be saying it a lot, is a brand-spanking-new phenomenon. Like, seriously, just over a month old. I first noticed it when GPT-4o came out, which was mid-May.
You remember that big manufactured drama, right, about the new OpenAI "4o" GPT model supposedly having Scarlett Johansson's voice? That's the one.
That model changed everything about programming overnight.
When GPT-4o dropped, it could suddenly edit 1,000-line source files (a figure that encompasses 95+% of the source files in most repos worldwide) with tremendous precision and faithfulness to the original, leaving the untouched parts of the file 99.9% diff-perfect. Every once in a very rare while I see it drop a space character, or slightly mis-indent a line. But that's it.
From what I've seen, Google's Gemini and Anthropic's Claude 3 Opus have also cleared this hurdle, which makes multi-model prompting a good bulwark against hallucinations, and a great way to select the best candidate for a design. Just feed your query to all of them and make 'em fight it out. It's a common CHOP technique.
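Here's roughly what that looks like in code. This is just a minimal sketch, assuming the official OpenAI and Anthropic Python SDKs; the prompt is a placeholder, and nothing here is a prescribed workflow:

```python
# Multi-model prompting sketch: send the same query to two models and
# compare the candidates by hand. Assumes OPENAI_API_KEY and
# ANTHROPIC_API_KEY are set, and the openai and anthropic SDKs installed.
from openai import OpenAI
from anthropic import Anthropic

PROMPT = "Design a rate limiter for our API gateway. List the trade-offs."

def ask_gpt(prompt: str) -> str:
    resp = OpenAI().chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    resp = Anthropic().messages.create(
        model="claude-3-opus-20240229",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

# Let 'em fight: print both answers side by side and judge the winner.
for name, answer in (("GPT-4o", ask_gpt(PROMPT)), ("Claude 3 Opus", ask_claude(PROMPT))):
    print(f"=== {name} ===\n{answer}\n")
```

Where the two models agree, you can relax a little; where they disagree, that's exactly where you dig in.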
Briefly, CHOP is coding via iterative prompt refinement. Attempts to get an LLM to do anything complex used to peter out after four or five iterations, with the model unable to make further progress. But now your iterations usually converge, which means you're reviewing and ultimately approving more and more LLM code, and writing less and less yourself.
So programming has leveled up into a problem of explaining your situation to the LLM (that is, slinging context, since you'll want to include a lot of information from different sources) and then merging the output back into your code base.
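To make "slinging context" concrete, here's one chop iteration boiled down to a sketch. Everything in it is hypothetical: the goal, the file list, and the feedback loop are stand-ins for whatever your project needs, and the human review at the end is the point, not an afterthought:

```python
# One chop iteration: assemble context, ask the model, review, merge, repeat.
# Illustrative only; assumes the OpenAI Python SDK and OPENAI_API_KEY.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

def build_context(goal: str, files: list[str]) -> str:
    """Sling context: state the goal, then inline every relevant source file."""
    parts = [f"Goal: {goal}"]
    for f in files:
        parts.append(f"--- {f} ---\n{Path(f).read_text()}")
    return "\n\n".join(parts)

def chop_iteration(goal: str, files: list[str], feedback: str = "") -> str:
    """Run one prompt-refinement pass; the caller reviews before merging."""
    prompt = build_context(goal, files)
    if feedback:
        prompt += f"\n\nYour last attempt had this problem:\n{feedback}\nPlease revise."
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Review the proposal, merge what you approve, then iterate with feedback
# until it converges (which, post-GPT-4o, it usually does).
proposal = chop_iteration("Fix the flaky retry logic", ["retry.py", "client.py"])
```

The tedious parts, building that prompt and merging the answer back, are exactly the parts that want tooling.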
We'd better get a tool for this. Chop chop!
The big model upgrade was life-changing for people who were already coding in raw ChatGPT, which is still the world's most-used AI coding assistant. ChatGPT coders have been using chop style for a year or more, but until now it was only for the bravest, most patient, and dumbest of early adopters. Now it's ready for everyone!
Of course, raw chat involves a lot of manually shoveling context in and responses out, and it can get tedious, so many programmers use coding assistants instead. (I mean, I guess business is good if you're the Ctrl, C, or V key on a keyboard. Or ⌘ if you're into that sort of thing.)
All AI coding assistants benefit from this upgrade, too. It has certainly been huge for our coding assistant Cody, which in my opinion has the best chat, thanks to our automated context-assembly engine that saves you from having to explain your code base every time. Plus Cody Pro lets you use both GPT-4o and Claude (and others), so you can spot-check all your work with another LLM.
The model upgrade is arguably even more effective with our inline edits feature, which handles dropping the response directly into your code. No more shoveling code out of the chat window. Cody just changes your code in place, undoably and retryably. The models are getting smarter, but the integration with your workflow is getting tighter as well.
What's happening is a big deal. Programming this way is arguably on its way to being an order-of-magnitude speedup over completions-based programming. A 10x improvement might sound like an exaggeration. But we just saw examples from legal practice, publishing, and data science in the same ballpark, with 5-30x speedups for certain kinds of tasks, and estimates of at least a 2-3x overall boost to productivity.
You can quibble over the numbers, but it's clear that programmers using CHOP with these new models are getting a turbo boost.
Why aren't you upgraded yet?
We're going to do story #5 now, the very last story, and it's mine, and it's about programming, and the plot is different from the others.
This is where we learn the hard way why chat programming is for senior devs. Please keep your hands and limbs inside the carriage during the ride.
The other day GPT-4o shocked me with one of its coding proposals, while I was chopping away with it at a project. It was such a shock that I laughed for minutes, practically unhinged. I am not often shocked, least of all by looking at the code for a working program.
I had presented a small design problem to GPT, asking for an evaluation of my options. It's great for that. Chat-oriented programming (CHOP) isn't just about coding. There are a lot of design discussions too. You do everything, even writing sometimes, with the LLM as a pair partner.
For some reason, this time ChatGPT launched full-bore into redesigning an entire subsystem of the framework I was stuck in, which had an issue I was trying to work around. I'm sure you have seen this wack behavior from LLMs before. The difference is, now they are very good at rewriting things, and sometimes they'll swing for the fences, consequences schmonsequences.
GPT-4o breezed quickly past most of the sane options before plowing straight into "let's rewrite it all" scorched-earth mode, clearly not having read Joel Spolsky. It spewed out an unusually long answer: hundreds of lines of code and instructions, presenting thorough and persuasive reasoning supporting this approach.
And the rewrite would have worked! It was a very direct solution to my problem. So it wasn't "wrong" per se. There are rarely things in design that are truly wrong (like, wrong wrong), as there are exceptions to most rules.
There was only one teeny tiny issue with it, which is that it would have killed someone.
Not on my project, hopefully! But if the model is currently spitting out similar suggestions to people working on software that controls giant saws in machine shops, someone's gonna die.
GPT's redesign was awful, but not when you looked directly at it. Up close it was downright cozy.
You had to pull back and look at the design in its broader context: its interactions with other subsystems, but also support, maintenance, upgrades, add-ons, and a bunch of other dimensions that are critically important to software design and engineering. With that wider lens, it was an Unbreakable-level train wreck waiting to happen.
But it was wearing a superb disguise. A less experienced dev would surely have fallen for it. I could feel the power and the attraction of the approach it was espousing, the force of the arguments backing it. The whole thing was appealingly packaged, technically correct, and laid out for you to just take it and run! (My pal Chris Smith says the word I'm looking for is "specious". So yeah, that.)
And it just hit me all at once how hilariously inappropriate and potentially dangerous this answer was. It was so impressive and so spooky. I laughed and laughed, like a little kid watching bubble bath plant monsters replacing all the people.
My senior colleagues have recently recounted similar chat scenarios in which a more junior dev would have been completely taken in, potentially losing days to weeks of work going the wrong direction.
Or worse.
Chat, it seems, is safer for senior programmers than it is for junior ones. And a lot of companies are going to interpret "safer" to mean "better."
Buckle up. Topsy-turvy times ahead.
A lot of companies ask me, how can we tell which parts were written by the AI and which parts by the programmer? Well, if chat-oriented programming becomes the dominant modality for programmers, then LLMs will be writing the vast majority of all source code worldwide. That's a gargantuan shift, and it might even shake up the traditional software engineering roles.
We've seen how this shift affected a law firm, a publisher, and engineering's sibling discipline of data science. Stock in junior contributors is down, and there is concern there could be a market crash.
That presents a very serious problem for new people in those fields. What do you do? How do you learn the ropes, not to mention find gainful employment? What are the ropes now? And what do the companies do when their senior people retire?
They could wind up like the COBOL world, which is in a worldwide crunch because there are no junior COBOL devs to replace the ones who retired. Ironically, these huge legacy companies will likely be rooting hardest for the LLMs to write all their code.
Things are changing fast. I'm an optimist, and I generally think, or at least hope, that as companies become more productive in the coming months and years, they will simply get correspondingly more ambitious.
Everyone will need to get a lot more serious about testing and reviewing code. Senior devs could become full-time reviewers, with junior devs shepherding the LLMs, maybe?
It doesn't necessarily have to end with a post-apocalyptic wasteland for recent CS grads.