Why ChatGPT isn’t the End of Programming

Dec 26, 2022

Much is being written about ChatGPT that speculates on the end of programming. There was a similar round of articles when Copilot was introduced. These pieces generally come with a few examples of the tools writing a bunch of code from a much shorter prompt or function declaration.

All these claims are fundamentally broken, and the reason is well summed up in the following webcomic that significantly predates any of these tools.

https://www.commitstrip.com/en/2016/08/25/a-very-comprehensive-and-precise-spec/

This isn’t the first time people have run around claiming that the end of programming is nigh, as the comic from 2016 illustrates. The argument for why ChatGPT and similar tools can’t get rid of programmers is straightforward. A complete function holds a certain amount of information. If it is well written and doesn’t have extraneous stuff, it is a minimal specification of everything it needs to do. When an AI system writes code from a short prompt or function declaration, it is guessing what is desired. There will be many versions of the code that could satisfy the original prompt simply because the prompt isn’t fully specified. The AI is picking one of those many possible solutions. It might pick precisely the right one, but unless it is boilerplate code or something that many people have solved in precisely the same way before, odds are that something is wrong with what it picks. This isn’t disparaging the AI systems. It is just simple math. Until you provide enough information that the set of valid minimal solutions gets down to a size of one, the algorithm is guessing at the information you haven’t provided.
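To make the guessing concrete, here is a minimal sketch in Python. The prompt and all three function names are hypothetical, made up for illustration. Each function is a reasonable reading of the same one-line request, yet they return different results for the same input.

```python
# Hypothetical prompt: "write a function that removes duplicates from a list"
# All three functions below satisfy that prompt, but they disagree on details
# the prompt never specified (ordering, which occurrence survives).

def dedupe_keep_first(items):
    """Keep the first occurrence of each item, preserving original order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def dedupe_keep_last(items):
    """Keep the last occurrence of each item, preserving original order."""
    return list(reversed(dedupe_keep_first(list(reversed(items)))))


def dedupe_sorted(items):
    """Return the distinct items in sorted order."""
    return sorted(set(items))


data = [3, 1, 3, 2, 1]
print(dedupe_keep_first(data))  # [3, 1, 2]
print(dedupe_keep_last(data))   # [3, 2, 1]
print(dedupe_sorted(data))      # [1, 2, 3]
```

Only the person asking knows which of these is “right”, and saying so means adding information the prompt did not contain.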

Amazon has an internal tool similar to Copilot that I’ve been using. In one case, I wrote a function declaration with arguments, and it provided a function that was 90% correct. But it wasn’t 100%. It was missing a try block for an error I needed to handle. The AI had no way of knowing that. I hadn’t provided enough information for it to know that particular exception should be caught and handled locally instead of escaping. Putting that information into an English prompt would have taken a lot more keystrokes than writing the code to do it, and the language I was using doesn’t provide the ability to add that information to the signature.
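The following Python sketch shows the shape of that situation. It is not the actual function or language from the anecdote; the names `load_settings_generated` and `load_settings_intended`, and the choice of JSON parsing, are assumptions made purely for illustration.

```python
import json


def load_settings_generated(text: str) -> dict:
    """What an assistant might plausibly produce from the signature alone."""
    return json.loads(text)


def load_settings_intended(text: str) -> dict:
    """Same signature, plus the unstated requirement (assumed here):
    malformed input is handled locally by falling back to empty settings."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {}


print(load_settings_intended('{"debug": true}'))  # {'debug': True}
print(load_settings_intended("{not json"))        # {}
```

Both versions match the signature `(text: str) -> dict`. Nothing in that signature says whether a parse failure should escape or be swallowed, so the generated version is a fair guess that happens to be wrong.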

An interesting question is whether this assistant sped me up or not. I’m not sure of the answer to that. Anyone who has tried to debug someone else’s code to figure out what they are missing knows that this is time-consuming. First, you have to read the code for sufficient understanding. Then you have to spend time figuring out where it is wrong. I’m not convinced I couldn’t write it faster myself from scratch. Some might argue that it would be better when producing larger chunks of code, but I think the opposite is true. The larger the chunk of code, the less of it the prompt is likely to specify and the more alternatives the AI must pick from. Also, the more code it produces, the more code you, as a programmer, have to read, understand, and potentially debug.

How Could They Do Better?

Tools that generate code from specifications have been around for a long time. Tools like Rational Rose have done this for nearly 30 years. Some predicted they would take over software development. That didn’t happen. Why not? I expect there are many factors, but one that isn’t subject to the whims of humans is that well-written code is close to a minimal specification. Every other approach is at least as much work as writing the code.

I’m not saying that these tools are useless. I’m just saying that they don’t replace actual programmers, unless you believe that a non-programmer will write a fully specified prompt in prose. That is as true for ChatGPT as it was for Rational Rose. But some things could make these tools work better and increase the likelihood that they will speed up software development.

First, for code generation from a function signature, the more information you can provide in the signature, the more likely the AI will get it right. One immediate implication is that languages/code with types in the function signature will tend to enable an AI to produce better results. If the syntax/type system provides information about error handling, that will help even more.
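As a rough illustration in Python, compare an untyped signature with one whose return type says how failure is reported. The `ParseError` type, both function names, and the port-parsing task are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ParseError:
    """Hypothetical error value used to make failure part of the signature."""
    message: str


# Untyped: nothing states what comes in, what goes out, or what happens on bad input.
def parse_port(value):
    return int(value)


# Typed: the return type says failure comes back as a value rather than as an exception.
def parse_port_typed(value: str) -> Union[int, ParseError]:
    try:
        port = int(value)
    except ValueError:
        return ParseError(f"not an integer: {value!r}")
    # Range check is an assumption for this example: accept 1 through 65535.
    if not 0 < port < 65536:
        return ParseError(f"out of range: {port}")
    return port


print(parse_port_typed("8080"))  # 8080
print(parse_port_typed("http"))  # ParseError(message="not an integer: 'http'")
```

The second signature rules out an implementation that lets `ValueError` escape, which is exactly the kind of detail a code generator otherwise has to guess.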

A number of years ago, I saw a fun demonstration of generating code from function signatures using Idris. The Idris programming language has a very powerful type system that includes the ability to do dependent typing in a direct manner. Dependent typing provides a lot of information so that even without fancy AI, it is possible to iterate through the options that satisfy the types. Throwing some decent AI on top of such a type system could produce correct code and speed up development. In such a system, the developer still puts significant thought into the types.

Testing?

One of my co-workers at Amazon suggested another interesting alternative. Generate the code from tests. One might describe his idea as AI-enhanced TDD. As with normal TDD, you write tests. But then you have an AI generate the code to satisfy those tests. Given informative names, the AI might even generate some test cases after a developer has written the original ones.
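A small Python sketch of why tests work as a specification: each added test removes candidate implementations that the earlier tests allowed. The slug-making task, the two candidates, and the `passes` helper are all hypothetical; no real AI tool is involved here.

```python
# Two candidate implementations a generator might propose for "make a URL slug".
def candidate_a(title: str) -> str:
    return title.lower().replace(" ", "-")


def candidate_b(title: str) -> str:
    return "-".join(title.lower().split())


# The developer's tests, written first, as (input, expected output) pairs.
TESTS = [
    ("Hello World", "hello-world"),
    ("  extra   spaces ", "extra-spaces"),
    ("", ""),
]


def passes(candidate, tests) -> bool:
    """Return True if the candidate produces the expected output for every test."""
    return all(candidate(arg) == expected for arg, expected in tests)


# With only the first test, both candidates look correct.
print(passes(candidate_a, TESTS[:1]))  # True
print(passes(candidate_b, TESTS[:1]))  # True

# The whitespace test carries information that eliminates candidate_a.
print(passes(candidate_a, TESTS))      # False
print(passes(candidate_b, TESTS))      # True
```

The work of deciding what should happen with stray whitespace still fell to the developer; the test is simply where that decision got written down.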

The key to both type systems and AI-enhanced TDD is that the developer provides a more complete specification. Again, that is the key to why no AI system will completely automate software development. Someone has to provide that information. Let’s be honest, most of a developer’s time isn’t spent banging out code. Their time is spent doing various tasks associated with solving problems and creating software, but that covers much more than writing code. The code is the final product in this creation process, but a lot of work has to be done before you even get to the point where you can start writing code.

AI for Everything?

One of the recent articles arguing that ChatGPT is “The End of Programming” managed to make it into the Communications of the ACM. As with so many others, this article seems to ignore the fundamental requirement that a complete specification with no ambiguity would have to be provided for software to be generated. Unlike many articles, it takes an interesting twist and argues that what we will see is the end of using programming languages to create programs. Instead, the author argues that all software will be AIs, and a “programmer” will be someone specialized in training these AIs instead of someone specialized in writing logic to solve problems. I see several problems with this argument.

To support this idea, the author refers to the things that programmers used to need to know back when everything was written in machine language or assembly language, and a deep understanding of the hardware was a significant requirement. He then argues that programmers don’t need to know these things anymore. What he ignores is that many programmers today do need to know these things. There are a large number of software developers employed in fields where knowledge of the hardware is important. The first one that comes to my mind is “Embedded Software Developer” because it is an area I have worked in during the last decade, and I know quite a few people, including many in my division at Amazon, who work in that area. I wrote “bit banging” code for a low-level bus a few years ago. I wrote code that converted strings into packed bytes as one of my first tasks at Amazon. To argue that this level of knowledge isn’t important just because front-end devs don’t use it daily shows me the author isn’t thinking about the full scope of what software development entails.

The argument that all apps will just be trained AIs bothers me even more and seems even less likely. There are two reasons I say this. First, many of the applications we use daily need to be some combination of exact, predictable, and/or explainable. Deep-learning models lack all three of these qualities. Anything that deals with money fits into all three of these categories. I doubt you want your bank or your tax software to be trained instead of having the rules for how it works explicitly written out. Even for commerce software, being “close enough” probably won’t make most customers happy. My current job deals with networking, and I promise you that the security team won’t be happy if the rules for allowed IP addresses lack exactness.
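For contrast, here is what an explicitly written rule looks like, sketched with Python’s standard `ipaddress` module. The network ranges and the `is_allowed` function are made up for illustration and do not describe any real system.

```python
import ipaddress

# Hypothetical allow list; these ranges are placeholders for illustration.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.0/24"),
]


def is_allowed(address: str) -> bool:
    """Return True only if the address falls inside one of the allowed networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)


print(is_allowed("10.1.2.3"))     # True
print(is_allowed("192.168.1.7"))  # True
print(is_allowed("192.168.2.1"))  # False
```

The rule is exact, gives the same answer every time, and can be explained by pointing at two lines of configuration. A trained model offers none of those guarantees for the address that sits one bit outside the boundary.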

Second, I think the article’s author again ignores just how extensive the breadth of software development truly is. Arguing that all software will be done with giant, trained models ignores many counter-examples where software is developed using the minimum required tools because they work and going beyond them would be wastefully complex. For example, GPUs these days are much faster than CPUs for highly parallel workloads. We haven’t moved all software development to GPUs, though, because most software doesn’t benefit from it, and it is much harder to do correctly. Indeed, a lot of software doesn’t even bother using the multiple cores present on the CPU because it doesn’t need them. Similarly, quantum computers won’t be used for all software when they become more available. They will only be used for the problems that are well suited for them.

I expect the author is correct that a fair bit of software will be developed using deep learning models, but only when it makes sense. Over time, the set of applications that benefit from this approach will grow as people figure out new things to do with it, but software that doesn’t need it will not use it.

What Will Really Happen (IMHO)

So, where are we really going? What is software development going to look like in the future? My guess is that these software writing aids will improve and will be more broadly used over time, but for the reasons expressed above, they won’t replace the programmer. One possibility is that they suggest multiple possibilities and allow the developer to iterate to the version they want.

We will also see a whole new class of applications come into existence that are created using these deep learning models and that take advantage of their capabilities. But that doesn’t mean all software development is going that way. Basic algorithms will still be written in languages that are optimal for expressing them. Those languages aren’t natural languages, and nobody will train a large neural network to do a task that can be done in a much more straightforward way by writing out the algorithm.

Instead, these new tools and approaches will become yet another tool in the software developer’s toolbox. And they will open up whole new areas of applications that can be developed. I say that because one of the fundamental rules I’ve learned about Computer Science and software development is that nothing ever disappears completely. There is no silver bullet. No new approach completely takes over and makes everything before it obsolete. Instead, we keep adding new stuff: new approaches, new tools, new techniques. But the old doesn’t go away. Generally, the old way is still required at some level and is very likely optimal for working at that level. When new stuff comes out, it is common for people to talk it up and say how it will replace all the old ways, but that never actually happens.

That is not to say that these new tools won’t profoundly impact software, software development, and society in general. It is to say that they aren’t going to replace everything that came before them.

One area where I think these tools will have an especially profound impact is teaching computer science and software development. Because nothing ever goes away, CS curricula are already jammed full of material, and adding a new “required topic/approach” isn’t going to help. Moreover, these AI tools are quite good at producing exactly the type of functions/programs we ask novice programmers to write. How do you convince students to learn to do something themselves when an AI can solve that specific problem? How do you convince students that the AI won’t be able to solve all the problems, and if the student can’t solve the easy ones without an AI, they have no chance of solving harder ones? Similarly, how will people trying to teach Computer Science assess if their students learned things when an AI can solve the problem that was given in the assessment?


Written by Mark Lewis
