Specific Issues with Python for Teaching Introductory Programming

Mark Lewis
Oct 1, 2023

This blog continues my goal of documenting issues that I have with Python and other dynamically typed languages as the starting point for learning to program, especially for CS majors. This post focuses primarily on factors that are specific to Python. I have links to my other related posts at the bottom.

I am a firm believer in the idea that languages are tools. In the case of students learning to program, the goal is to teach students to break down problems and express algorithmic solutions in the formal syntax of a programming language. For someone who plans to make a career of writing applications, I also want the language to help teach key concepts that help them develop good coding practices.

Constants and Immutability

Python is surprisingly bad for teaching about constants and immutability. There is no const declaration in Python. That means that the following is perfectly valid.

>>> import math
>>> math.pi
3.141592653589793
>>> math.pi = 99
>>> math.pi
99

This should be an error. The fact that it isn’t an error sends the signal to students that there could be okay reasons to do this. Getting the language/compiler to enforce when values shouldn’t change is helpful for code quality. A change like the one above can create very hard-to-find bugs. Indeed, books like “Effective C++” and “Effective Java” strongly encourage programmers to make everything constant by default. JavaScript recognized this as an oversight and has since fixed it by adding const. Python hasn’t fixed it yet. Newer languages often push this further. Scala and Kotlin make it as easy to declare a constant with val as it is to declare a non-constant with var. Rust goes further still: everything you declare is constant unless you add mut.
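To make the “hard-to-find bugs” point concrete, here is a minimal sketch. The circle_area function is a hypothetical helper invented for this illustration; it silently starts returning wrong answers once some other piece of code rebinds math.pi.

```python
import math


def circle_area(radius: float) -> float:
    """Hypothetical helper that trusts math.pi to be the real constant."""
    return math.pi * radius ** 2


original_pi = math.pi          # saved so the sketch can clean up after itself
area_before = circle_area(1.0)

math.pi = 99                   # no error, no warning
area_after = circle_area(1.0)  # same call, different answer

math.pi = original_pi          # restore the real value
```

Nothing at the call site of circle_area hints that its result changed, which is what makes this kind of bug so hard to track down.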

Reduce Scope of Variables

Another standard recommendation that can be followed in most languages to improve code quality is to reduce scope as much as possible, especially for variables. This is somewhat related to having const. The advice for const is that if something shouldn’t change, the programmer should be able to say that, and the language should enforce it. The rule here is that if a value needs to change, it should be limited to the smallest part of the program doing so. You can’t teach this effectively in Python because Python lacks block scope. You have global scope and function scope, which isn’t sufficient granularity: a name first assigned inside the body of a loop or an if stays visible for the rest of the enclosing function. This is another item where the standards body for JavaScript realized this was a problem and fixed it, in that case by adding the block-scoped let and const declarations.

The general recommendation to reduce scope prevents bugs in examples like the following.

int localVar = 0;
while (condNotUsingLocalVar) {
    // Do stuff with localVar
}
// Not using localVar

In this situation, localVar should be moved inside the loop, as that is the only place it is used. Code like this often becomes a bug when the name localVar gets used later in the code and isn’t properly re-initialized, so the value left over from this loop impacts the behavior of later code, even when it shouldn’t.
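The same hazard is easy to reproduce in Python, where names cannot be confined to a loop at all. The sketch below uses two hypothetical functions written for this illustration: leaked_names shows that names bound in a loop body outlive the loop, and row_sums_buggy shows an accumulator declared too broadly carrying its value from one row into the next.

```python
def leaked_names():
    for i in range(3):
        square = i * i
    # The loop is over, but both names are still bound in the function scope.
    return i, square


def row_sums_buggy(rows):
    """Meant to return the sum of each row; total is never reset per row."""
    total = 0
    sums = []
    for row in rows:
        for value in row:
            total += value
        sums.append(total)
    return sums


print(leaked_names())                 # (2, 4)
print(row_sums_buggy([[1, 2], [3]]))  # [3, 6], although the row sums are 3 and 3
```

In a language with block scope, declaring total inside the outer loop body would make the stale value impossible to reach. In Python, moving the assignment into the loop fixes this particular bug, but the name still remains visible after the loop ends.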

Parsing Expressions

Python has special forms that some might find are more like math or natural language, but unfortunately, they are less like other programming languages. So while they might be good for teaching coding to people who aren’t going on in CS, they do a poor job for CS majors.

Consider an example like 2 < x < 7. Some might like these examples because they show how Python is more like English. The problem is that a programming class isn’t a class in English or any other natural language. Part of the goal of an introductory programming course is to help develop students’ ability to think in the formal grammar of programming languages. One aspect of understanding programming languages is understanding how complex expressions are built up from simpler ones. Below is an illustration of how the Python shortcut breaks this.

>>> x = 5
>>> 2 < x < 7
True
>>> (2 < x) < 7
True
>>> 2 < (x < 7)
False
>>> 2 < x
True
>>> x < 7
True

In most programming languages, you would have to write something like 2 < x && x < 7. This has the advantage that we can show it is built from separate expressions and talk about the associated types, (2 < x) && (x < 7). The && operator works on booleans. The < and > operators compare ordered values. In Java and many (if not most) of the languages created after it, 2 < x < 7 is a type error. Because < is left-associative, it gets parsed as (2 < x) < 7, which checks if a boolean is less than an integer. That’s not a meaningful comparison, so the language doesn’t allow it. You can’t teach this aspect of types and how things get parsed in Python because its designers decided to include a special form to make it more English-like.
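One way to see the special form directly is to ask Python how it parses the expression. This is a small sketch using the standard ast module (on Python 3.8 or later); the names chained and grouped are only for illustration.

```python
import ast

chained = ast.parse("2 < x < 7", mode="eval").body
grouped = ast.parse("(2 < x) < 7", mode="eval").body

# The chained form is ONE Compare node holding two operators,
# not a comparison nested inside another comparison.
print(type(chained).__name__, len(chained.ops))   # Compare 2
print(type(chained.left).__name__)                # Constant

# Adding parentheses produces the nested tree other languages would build.
print(type(grouped).__name__, len(grouped.ops))   # Compare 1
print(type(grouped.left).__name__)                # Compare
```

The chained expression therefore has no smaller comparison inside it for a student to pull out and examine on its own.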

There are other examples. The following examples are allowed in basic Python and enforced in linters like flake8.

x not in y
a is not b

Trying to move the not to the front where you have clear subexpressions is flagged as an error by flake8. Similar examples would look like the following in Java and other C-family languages.

!y.contains(x)
!a.equals(b)

The key is that the ! operator is operating on some subexpression. In Java and most other newer languages, the subexpression is also required to have a boolean type. You can show students this and even highlight it by putting parentheses around the subexpression. Where do you put the parentheses to show the subexpressions in the Python form? That doesn’t work. Those expressions only work in Python because its designers created unique forms that won’t work for students in pretty much any other language.
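The two spellings do compute the same values, which a short check can confirm. This is a sketch with made-up sample data; as far as I know, flake8 reports the explicit spellings through the pycodestyle checks E713 and E714.

```python
import ast

y = [1, 2, 3]
x = 5
a, b = [], []

# The special forms and the explicit "not (subexpression)" forms agree.
print((x not in y) == (not (x in y)))   # True
print((a is not b) == (not (a is b)))   # True

# The parser treats "not in" as a single operator rather than "not" applied to "in".
special = ast.parse("x not in y", mode="eval").body
explicit = ast.parse("not (x in y)", mode="eval").body
print(type(special).__name__, type(special.ops[0]).__name__)  # Compare NotIn
print(type(explicit).__name__)                                # UnaryOp
```

Only the explicit spelling has a boolean subexpression, x in y, that a student could evaluate separately before applying not.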

break and continue are Poor Form

As a general rule, I’m not a fan of break and continue. They are special cases of goto and I view them as being poor for maintenance. My experience is that they seem to get used a lot in languages that list being “simple” as one of their key goals, like Python and Go. For full disclosure, my favorite day-to-day language, Scala, completely got rid of them, but I tended to avoid them even when I was writing Java. (I will note that I have no data on how commonly break is used with loops in different languages. A quick internet search didn’t help. If anyone knows of a source for that information, please link it in the comments.)

There are two main reasons I consider these constructs bad for maintenance. One doesn’t apply to continue, but both apply to break. The one that applies only to break is that it makes code harder to reason about. Consider the following loop.

while i < 10:
    ...
# code after the loop

The condition of a while loop should express an invariant. When you read this code, it seems like it should be safe to assume that after the loop, the value of i will be greater than or equal to 10. A break statement in the loop breaks that invariant. If the maintainer doesn’t notice it and starts adding code after the loop with the assumption i >= 10, they will probably get bugs in the situations where the break is executed. There are similar arguments that apply to for loops. A break makes the logic more complex and requires the maintainer to look more closely at more code to ensure they understand it and are doing the right thing.
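A minimal sketch of the broken invariant, using a hypothetical search function invented for this illustration:

```python
def search(start):
    """Hypothetical loop: stop early at the first multiple of 7."""
    i = start
    while i < 10:
        if i % 7 == 0:
            break
        i += 1
    # A maintainer reading only the loop header would assume i >= 10 here.
    return i


print(search(8))  # 10, the loop condition became false
print(search(5))  # 7, the break left the loop with i < 10
```

Any code added after the loop that relies on i >= 10 works for the first call and fails for the second.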

Now, I know that some people will argue that if the loop is small, this doesn’t matter as the break will stand out. That leads to the second reason, which applies to both break and continue. As code ages, sections tend to change in size. What started as a little loop with just a few lines can grow to include many lines. The proper response to this is to refactor and split the loop body into separate functions that are called from inside the loop. Unfortunately, break and continue are bad for refactoring. When you use them, you can’t just highlight code and copy it out to some other function because then the break and continue no longer work. So, the task of refactoring the code and making sure it still works is much harder in a code base that makes extensive use of break and continue.
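The refactoring problem can be shown in a few lines. In this sketch, a loop body containing break is copied as-is into its own function (held as source text so the failure can be caught), and Python refuses to compile it. The rewrite that follows is only one possible approach, and the names should_continue and take_non_negative are hypothetical.

```python
# A loop body copied verbatim into its own function (held as source text here).
extracted = (
    "def handle(item):\n"
    "    if item < 0:\n"
    "        break\n"
)

try:
    compile(extracted, "<extracted>", "exec")
    error_message = None
except SyntaxError as err:
    error_message = err.msg

print(error_message)  # 'break' outside loop


# One possible rewrite: the helper reports whether the loop should go on,
# and the loop condition carries the exit logic.
def should_continue(item):
    return item >= 0


def take_non_negative(items):
    taken = []
    index = 0
    while index < len(items) and should_continue(items[index]):
        taken.append(items[index])
        index += 1
    return taken


print(take_non_negative([3, 1, -2, 5]))  # [3, 1]
```

With the exit logic in the loop condition, the body can be moved into other functions freely, and the condition once again states what is true when the loop ends.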

Students who learn to program in Python will inevitably see lots of code examples that make use of break and continue and pick up the habit of using them. This ties into one of my other rules of thumb for introductory programming: avoid letting students pick up bad habits.

Loop else

The special forms mentioned above aren’t the only odd constructs in Python. Another fun example is the else clause on loops. This might seem like an interesting construct, and it might even appear useful. If the else on loops did what it reads like it should do, that might be true. Students might learn about it and try it later in other languages, then quickly learn that it isn’t an option there. Alas, the else on loops doesn’t do what one might think. Based on what happens with an else associated with an if, one might expect that the else only runs if the body of the loop isn’t executed. Unfortunately, that isn’t the behavior. The else on a loop runs whenever the loop finishes normally; it is skipped only when the loop is exited early, typically with a break statement. So not only is the behavior inconsistent with the use of else on an if, using it encourages the use of break.
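A short sketch of the actual behavior, using a hypothetical loop_else_ran function that reports whether the else clause executed:

```python
def loop_else_ran(values, stop_at=None):
    """Return True if the loop's else clause executed."""
    ran_else = False
    for value in values:
        if value == stop_at:
            break
    else:
        ran_else = True
    return ran_else


print(loop_else_ran([]))                    # True: the body never ran
print(loop_else_ran([1, 2, 3]))             # True: the body ran three times
print(loop_else_ran([1, 2, 3], stop_at=2))  # False: the loop ended with break
```

The second call is the surprising one for anyone reasoning by analogy with if and else: the body executed on every iteration, and the else clause still ran.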

Types are Design

This one applies to all dynamically typed scripting languages and isn’t specific to Python, but I don’t want to write a separate blog post for it, so I am putting it here. One of the things that I try to emphasize in introductory programming courses is that students need to think about their code before they write it. I try to get them to do some level of design work, even on small programs.

What I have come to realize is that types are part of design. Thinking about the types for the inputs and outputs of functions or just for variables in code is an aspect of the design work. After all, part of the definition of a type is the things you can do with values of that type. Students should be thinking, what will I do with this value? What type would be able to do those things? Dynamically typed languages don’t push the student to do that up-front thinking about design. Even if the instructor does get students to think about the types up front, types that aren’t documented in the code add to the cognitive load of the student, who must keep track of that information in their head.
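As a rough illustration, Python’s optional annotations can record that design decision in the code, but the interpreter does not enforce them; a separate static checker such as mypy has to be run for that. The average function below is a hypothetical example.

```python
from typing import Sequence


def average(scores: Sequence[float]) -> float:
    """The annotations record the design decision: numbers in, a number out."""
    return sum(scores) / len(scores)


print(average([90.0, 80.0, 70.0]))  # 80.0

# Python itself does not check the annotations. A call with the wrong type
# is accepted and only fails once sum() tries to add an int and a str.
try:
    average("ab")
    failed_at_runtime = False
except TypeError:
    failed_at_runtime = True

print(failed_at_runtime)  # True
```

The mistake surfaces only when the offending line actually runs, and the error points inside the function rather than at the call that passed the wrong kind of value.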

There is no such thing as dynamic types. Types are either:
— Enforced by the compiler
— Enforced by the app
— in some developer’s head
— to figure out again and again.
Now think what is the impact of changing it and refactor code.

Paul Snively

The need to think about types as part of design grows as students begin thinking about data structures and the right way to store information for efficient access.

Older Blogs

Here are links to my other recent blog posts related to this topic.

The Struggle of Dynamically Typed Languages

Error Comparison (Python 3 vs. Scala 3)

Autocomplete in Dynamically Typed Languages

Syntactic Consistency/Uniformity

Introductory Programming — Avoid Letting Students Pick up Bad Habits

Mark Lewis

Computer Science Professor, Planetary Rings Simulator, Scala Zealot