Mike Levin SEO

Future-proof your technology-skills with Linux, Python, vim & git... and me!

The Python Nuances That Keep Code Clean

by Mike Levin SEO & Datamaster, 01/25/2011

I am a casual programmer, having spent my entire professional career in jobs that don’t have programming in the job description. True, I started with BASIC on the TRS-80 in 1982, but the computer science flame was never sparked. So I went through life and career gravitating instead towards design and business, programming only to DO my job, not AS my job. This non-programmer programmer approach gave me certain freedoms in language and environment choices, at the cost of formal training. Specifically, I was corralled into Microsoft Active Server Pages and VBScript, like so many others in those days, by the force of Microsoft servers and IIS in the workplace. Unfortunately, it took me a long time to understand just how much of my capabilities and professional potential was defined by this choice. Not all languages are equal (they’re designed for different things), but I underestimated the importance of choosing a language the same way you would choose the chair or bed you will be spending so much time in. A bad choice can subtly make you miserable over long periods of time without your realizing it. And conversely, the right choice (for you) can make everything almost magically easy.

Python is one such magical language, with deeply ingrained philosophies that fly in the face of traditional compiled-code programmers who live and die by their statically typed variables. It’s exactly that magic that they hate, because in stable, optimized code, nothing should be magical. In Python, however, functions don’t care about the datatypes being fed in. If the object being fed in supports the operations the function performs on it, the function will work. Feed two integers to something designed for two strings, and it will still probably work. And if it doesn’t, Python will raise an error at runtime. This, according to the Python philosophy, has a two-fold advantage. First, your code gets cleaner without all those type declarations and extra error checking. You get error checking anyway, just without it bloating your code. Second, your function is able to work in a greater variety of situations. Of course, the reverse arguments exist: a function that works on unintended datatypes can cause unexpected bugs, and the code will generally not run as fast as the statically typed, compiled equivalent. But the final measure is the net gain experienced by you for your purposes. And for my purposes, the stripping out of all that extra code is welcome.
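A minimal sketch of the point above, assuming a made-up function called join_pair (the name and the sample values are mine, purely for illustration). The same function body works on strings, integers and lists, and a mismatch still gets caught, only at runtime:

```python
def join_pair(a, b):
    """Combine two values with +. No type declarations anywhere."""
    return a + b


strings = join_pair("spam", "eggs")  # written with strings in mind
numbers = join_pair(2, 3)            # integers support + too
lists = join_pair([1], [2])          # and so do lists

# Incompatible types are still caught, just at runtime instead of compile time.
try:
    join_pair("spam", 3)
    mismatch_error = None
except TypeError as err:
    mismatch_error = err
```

The flip side mentioned above shows up here too: join_pair(2, 3) quietly returns 5 even if the caller meant to build a string, so the convenience comes with some risk.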

Python has a number of nuances beyond the obvious ones of dynamic variables and indents that matter. I’ve been programming in Python for almost a year now, and these odd rules are taking me a while to internalize, but they should be taken advantage of for cleaner code. The first is that everything has a true or false status that can be checked directly, eliminating tons of if x == '' style checking in favor of just if x. If it’s an integer, then the check that’s eliminated is if x == 0. Any container, such as a list [] or a dictionary {}, also evaluates as false when empty, as does the special None value (the more aptly named equivalent of Nil or Null in other environments). So in short, if an object holds any non-zero or non-empty value, it evaluates as true in a check. The upshot is that every object, in addition to its actual value, also has a second value, true or false, built in, which greatly reduces the need for extra Boolean variables.
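Here is a small sketch of that truth testing, assuming a hypothetical helper named describe (not part of Python, just something to hang the example on). It uses a bare "if x" instead of comparing against '', 0 or None:

```python
def describe(x):
    """Classify a value using a bare truth test instead of == comparisons."""
    if x:
        return "truthy"
    return "falsy"


# Empty or zero values, plus None, all fail the bare test.
falsy_values = ["", 0, 0.0, [], {}, (), None]

# Anything non-zero or non-empty passes, even a list holding a single 0.
truthy_values = ["a", 1, -1, [0], {"k": None}, (None,)]

all_falsy = all(describe(v) == "falsy" for v in falsy_values)
all_truthy = all(describe(v) == "truthy" for v in truthy_values)
```

Note the [0] case: the list itself is non-empty, so it counts as true regardless of what it contains.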

The next thing is how Python tries to eliminate the need for increment counters, truth toggles and other “extra” housekeeping variables. The above Boolean-related point about how every variable also evaluates as true if populated is integral to this. For example, sometimes you have to decide whether a function should return a Boolean regarding its successful execution or the actual value that resulted from executing the function. In Python, they can be the same. You can test the function call directly for truth, or you can assign a variable to it, which will contain the entire return value. This is so common in Python because truth-testing works the same way throughout the language. Therefore, the best practice is to always return the function’s results if you can, confident you can also check their truth. Speaking of which, when you exit a loop, the variables you used in that loop keep their last value (so long as they are still in scope), and that value evaluates to true or false when checked, so bye bye truth toggles in loops. Just declare a new local variable as empty outside the loop, and then set it inside the loop. When the loop breaks, if the variable was set during iteration, you still have its value, plus you can check it like a Boolean for truth, eliminating dedicated truth toggles.
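A rough sketch of both ideas together, assuming an invented function find_first_long_word and made-up sample data. The variable match is declared empty before the loop, set inside it, and then doubles as both the result and the success flag:

```python
def find_first_long_word(words, min_length):
    """Return the first word at least min_length long, or None if there is none."""
    match = None  # declared empty outside the loop, no separate "found" flag
    for word in words:
        if len(word) >= min_length:
            match = word  # keeps this value after the loop ends
            break
    return match  # the result itself doubles as the success indicator


found = find_first_long_word(["ham", "spam", "eggs"], 4)
missing = find_first_long_word(["ham", "spam", "eggs"], 10)

# Test the return value directly for truth, then use it as the value.
if found:
    message = "found " + found
else:
    message = "nothing found"
```

One caveat worth keeping in mind: if a legitimate result could itself be falsy (0 or an empty string, say), the bare truth test would misreport it, and an explicit "is None" check is the safer choice.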

Increment counters for loops generally go away as well, because of built-in iteration. At its simplest, this means you should use the format “for item in list” to grab each item in a list for iteration. If you really need the index value, you can use the “for x in range(len(list))” format to generate an index externally that won’t need manual incrementing. If you want an index and item simultaneously, you can use the “for (x, item) in enumerate(list)” format. For any iterable object, enumerate(list) yields a tuple for each item in the form (offset, item).
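The three loop formats above, side by side, in a minimal sketch. The list colors and the result names are made up for the example; none of the loops needs a hand-maintained counter:

```python
colors = ["red", "green", "blue"]

# 1. Items only: no index variable at all.
upper = []
for color in colors:
    upper.append(color.upper())

# 2. Index generated externally with range(len(...)).
by_index = []
for i in range(len(colors)):
    by_index.append((i, colors[i]))

# 3. Index and item together: enumerate yields (offset, item) tuples.
by_enumerate = []
for i, color in enumerate(colors):
    by_enumerate.append((i, color))
```

Formats 2 and 3 produce the same pairs here. When both the index and the item are needed, the enumerate form is usually considered the more idiomatic of the two.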

Oh, did I mention tuples? It’s a datatype you sometimes hear of in other languages but rarely encounter, because of how special-case it is. But tuples are everywhere in Python, often without you knowing it, because they underpin the variable assignment interface. Tuples are basically just lists, but immutable, meaning they can’t be changed with methods like .append(). And as such, Python leverages the positional awareness of tuples on assignment. So, if you wanted to assign a LIST of values to an equally sized LIST of variable names, you could write [a, b, c] = [1, 2, 3]. But you don’t really need the overkill power of the list datatype for such synchronized assignment, so you can use tuples instead, like (a, b, c) = (1, 2, 3). And since Python has this habit of eliminating extra characters if there can be no ambiguity, this can also be expressed as a, b, c = 1, 2, 3 and it is still a tuple! Eventually you begin to forget about the datatype, and just start to love this positional awareness stuff. Tuple-powered positional awareness is found all throughout Python, in parameter passing and printing. You can take two positionally paired lists and zip them together into a sequence of tuples with lot = zip(list1, list2) and then step through them with “for (key, val) in lot”. (In Python 3, zip returns an iterator rather than a list; wrap it in list() if you need an actual list.) This easy syncing and passing of bundled values with tuples is just another example of how Python’s “personality” keeps you programming cleanly with fewer housekeeping variables.
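A short sketch of the tuple behavior described above. The names point, keys, vals and pairs are placeholders of mine, and the list() around zip is there on the assumption that the code runs on Python 3:

```python
# Positional assignment: the right-hand side is a tuple even without parentheses.
a, b, c = 1, 2, 3

# Tuples are immutable, so list-style methods such as .append() do not exist.
point = (4, 5)
try:
    point.append(6)
    tuple_error = None
except AttributeError as err:
    tuple_error = err

# zip pairs two lists up by position; list() makes the result a real list.
keys = ["foo", "spam"]
vals = ["bar", "eggs"]
lot = list(zip(keys, vals))

# Each tuple is unpacked into key and val on every pass through the loop.
pairs = []
for key, val in lot:
    pairs.append(key + "=" + val)
```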

Once you dump such housekeeping variables, particularly on loops, how can you know whether a loop ran all the way through, one of the worst offenders for necessitating such checks? The answer is that for-loops and even while-loops have an “else”. Think about that: for x in [1, 2, 3]… else! You have a ready-made location to put anything that should occur only once the loop has finished iterating without a break. It’s a great place for success checks, because encountering a break will skip the else code, which is not true for code merely following the loop, which will execute in either case. Since the concept is so unfamiliar, I find myself going back and cleaning up my code, switching over to a final else where it would clearly have been a better choice.
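A minimal sketch of the loop else, assuming a hypothetical function check_all_positive and made-up inputs. The else branch runs only when the loop finishes without hitting break, so no success flag is needed:

```python
def check_all_positive(numbers):
    """Report whether every number is positive, using for/else instead of a flag."""
    for n in numbers:
        if n <= 0:
            result = "found non-positive: %d" % n
            break  # leaving via break skips the else block below
    else:
        # Reached only when the loop ran to completion without a break.
        result = "all positive"
    return result


clean_run = check_all_positive([1, 2, 3])
broken_run = check_all_positive([1, -2, 3])
empty_run = check_all_positive([])
```

An empty list also lands in the else branch, since a loop that never iterates never breaks either.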

I could probably go on, and am tempted to discuss how Python uses its built-in, framework-like container datatypes of list, dictionary and tuple to replace all sorts of extra code required in other languages. One example would be the common practice of using a quickly built dictionary to replace case/switch statements. For example, if you ran switch = {'foo': 'bar', 'spam': 'eggs'}, then you could just use the value switch['foo'] to produce 'bar' or switch['spam'] to produce 'eggs'. But then this article would go on forever, and I would lose my reader and run out of things to write about later, like how Python arguably IS a framework, even without the popular Django that’s built on it. But I’ll try to take my own advice on less code… and stop here.
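As a postscript, the dictionary-as-switch idea can be sketched in a few lines. The lookup helper and its "unknown" default are my own additions for illustration, playing the role of a switch statement's default case:

```python
# The dictionary stands in for a case/switch statement.
switch = {"foo": "bar", "spam": "eggs"}


def lookup(key, default="unknown"):
    """Return the value for key, or a fallback, like a switch default case."""
    return switch.get(key, default)


hit = lookup("foo")
miss = lookup("ham")
```

Using .get() rather than switch[key] avoids a KeyError when a key is missing, which is roughly what a default branch does in other languages.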