Introductory Programming — Avoid Letting Students Pick up Bad Habits

Mark Lewis
Oct 29, 2022


One of my strongly held beliefs about teaching people to program is that you should not only strive to instill good habits but also try to avoid having students pick up bad ones. This rule isn’t just significant for programming. The general rule that unlearning things is hard applies to pretty much everything. A few experiences have led me to hold this belief so strongly. I wanted to write this quick post describing them.

The first one had nothing to do with Computer Science. My High School Physics teacher, James (Jim) Chapman, gave all the students in Physics 1 a multiple-choice pre-test of basic physics during the first week of class. As the test was multiple-choice, random guessing would have produced a score of something like 25%. What Jim found year after year was that students consistently scored below that. The test had wrong answers that were specifically put there based on common misconceptions people have about Physics, and most students held those misconceptions. Jim said his biggest challenge in Physics 1 wasn’t teaching the Physics concepts. It was getting students to unlearn the various misconceptions they held about Physics.

That was Physics, not Computer Science or programming, but an experience I had early in my teaching career convinced me that it also applies to programming in the form of bad habits. Unlike Physics, students don’t pick these things up from their normal daily experience. Instead, they pick them up during their first programming experiences, whether in classes or on their own. Especially in the classroom, we should do what we can to help prevent this.

In my first few years of teaching, I had a very bright student in my CS1 course who was a capable programmer. The catch is that all of his programming experience was with line-numbered BASIC. He had done it for years and was good at it, which was the problem.

For those who have never seen or used line-numbered BASIC, it had a simple syntax and was popular for teaching/learning back in the 1980s. As the name implies, every line was numbered, and by default, they were executed in order. The main control structure was a GOTO, which would cause execution to jump to another line. (There was also a GOSUB that would keep track of where the call had come from so you could return to it.) There were no functions. There were no blocks of code defining scope. Everything was global, and you controlled execution by conditionally or unconditionally jumping to various points in the program. It is worth noting that the first examples in the Wikipedia entry for Spaghetti Code are in line-numbered BASIC. There is a good reason for that. Large programs written in line-numbered BASIC inevitably became spaghetti.
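For readers who have never seen the style, here is a rough sketch of what that control flow looks like. It is not real BASIC: the six-line program in the comments is a made-up illustration, and the Python function below only emulates it, using a hypothetical `run_numbered_program` helper that keeps every variable in one shared table and moves between lines by changing a line counter.

```python
# A made-up line-numbered program, shown here only for illustration:
#   10 LET I = 1
#   20 LET TOTAL = 0
#   30 LET TOTAL = TOTAL + I
#   40 LET I = I + 1
#   50 IF I <= 5 THEN GOTO 30
#   60 END

def run_numbered_program():
    """Emulate the program above: one shared variable table, jumps by line number."""
    state = {}  # every variable is global to the whole program
    line = 10   # execution starts at the lowest line number
    while line != 60:  # line 60 is END
        if line == 10:
            state["I"] = 1
            line = 20
        elif line == 20:
            state["TOTAL"] = 0
            line = 30
        elif line == 30:
            state["TOTAL"] += state["I"]
            line = 40
        elif line == 40:
            state["I"] += 1
            line = 50
        elif line == 50:
            # The conditional GOTO is the only loop construct available.
            line = 30 if state["I"] <= 5 else 60
    return state


if __name__ == "__main__":
    print(run_numbered_program())
```

Even at six lines, the loop only exists as a jump back to line 30, and nothing stops any other line from reading or changing `I` or `TOTAL`. Scale that to thousands of lines and the spaghetti follows.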

So how did being good at line-numbered BASIC impact my student? He had a lot of experience writing code in a single mass. He saw no need for functions. He saw no need for limiting scope or breaking problems down. I could tell him that he should break his code into functions, limit scope, make constants actually be constants, make things private, etc. But he didn’t see a reason for it because he could solve nearly every problem he encountered in his first 2–3 semesters with hacky, spaghetti code. It wasn’t possible to get him to unlearn his bad habits until the projects got big enough that he could see how his normal style wasn’t going to work. He missed a lot of learning during that time. He could have been building good habits, but instead, he clung to the bad ones.

Note I’m not condemning this student. When I reflect, I was the same way. I, too, learned to program using line-numbered BASIC (and a little LOGO). I, too, got good at it. I wrote programs with thousands of lines of spaghetti that were impossible to maintain. When I moved on to Pascal and C, I maintained that style. When people told me I should use a different style, I also resisted it because I could solve things in the style I started with. That is what was natural to me.

What I am condemning is having people learn to code in line-numbered BASIC. Sure, it was “simple” and easy to pick up. But simplicity of syntax does not mean simplicity of logic, and learning to code is more about learning to structure logic than it is about syntax. Languages that let or even encourage students to do things in bad ways should be avoided so that students don’t get those bad habits early and have to unlearn them later. Similarly, a good introductory language prevents bad practices. I would argue that it should also do this in a way that provides quick feedback. The earlier the student is told they are doing something wrong, the better.

Note that the idea of using tools that enforce good practices also applies to tutorials and other places outside standard classrooms where people learn to code. People just need to write more tutorials for beginners that use the “good” languages and fewer that use the “bad” ones.

So what languages allow students to develop bad habits? What are the modern equivalents of line-numbered BASIC? My personal beliefs fall in line with this post on LinkedIn. I believe that scripting languages generally fall into this category, but Python is particularly bad. This is because Python has no block-scope mechanism and no form of const declaration, on top of the general trait of scripting languages that they allow students to do pretty much whatever they want with types.

What languages do the most to prevent bad habits? The answer to that aligns with the set of languages with one of my favorite properties. My ideal languages are ones where when a program compiles, odds are good that it is correct. The more typos and silly mistakes the compiler catches, the better. The same rules that will catch typos and silly mistakes are also likely to catch and disallow bad habits or at least make it a lot harder to write code that includes those bad habits. Statically typed functional languages (Scala, Haskell, F#, etc.) generally seem to fit into this category. Rust also does a great job in that regard. (Rust is the only language I’ve seen provide an error because a variable was declared in a larger-than-minimal scope.)

Of course, there are many other factors to consider for the novice, but the perceived complexity of the language as a whole probably shouldn’t be right at the top of the list. Preventing bad habits should probably come above it. Even when considering complexity, what matters is the complexity of the subset of the language that the novice is going to need early on, not the complexity of the language as a whole. Additionally, being simple doesn’t mean a language is good for learning to program. Line-numbered BASIC was a prime example of that. The best research I am aware of on how much novices struggle across languages found that students learning Python struggle as much as, or more than, those using C++ and Java (2018 paper, 2022 paper). When picking a language for someone to learn to code, you are laying a foundation. Try to pick one that helps to make that foundation a strong one.


Mark Lewis

Computer Science Professor, Planetary Rings Simulator, Scala Zealot