Thursday, January 2, 2014

The Quality of Code

One of the trickier issues in programming is judging whether or not a program is well-written. Personally, I believe that the overall quality of software is heavily affected by its internal quality. This is because most actively used software is in continuous development: there is always more that can be done to improve it, so the project never really stops. To keep this momentum on a reasonable track, the underlying code needs to be both readable and extendable. These form the two foundations of code quality.

Readability is simple in its essence, but notoriously difficult to achieve in practice. Mostly this is because programming languages support a huge variance in style. Two different programmers can use the same language in very different ways and still get good results. This capacity opens the door to each programmer coding in their own unique style, an indirect way of signing their own work. Unfortunately, a code base made up of four different styles is, in effect, about four times harder to read. You have to keep adjusting your understanding when switching between sections written in different styles. Getting multiple programmers to align on nearly identical styles is incredibly hard: they don't like having any constraints, there are always deadlines, and most programmers won't read other programmers' code. Style issues should really be settled before any development begins, and any new programmers should learn and follow the stylistic rules already laid down. When that happens well enough, the code quality increases.

To get around reading others' code, many programmers will attempt to extend existing code by doing what Tracy Kidder described in his book "The Soul of a New Machine" as just attaching a bag on the side. Essentially, instead of extending, refactoring or integrating, they just write some external clump of code and try to glue it to the side of the existing system. This results in there effectively being two different ways of handling the same underlying mechanics, again doubling the work needed for any new extension. Done enough, this degenerates the architecture into a hopeless 'ball of mud', eventually killing any ability to extend the system further. Many programmers justify this by stating that it is faster, but that speed comes at the cost of gradually stopping any further extensions.

Both multiple styles and bad extensions are very obvious if you read through the code. In this way, if you read a lot of code, it is fairly obvious whether the system is well-written or not. If it's fairly consistent and the mechanics of the system are all encapsulated together, it's probably not going to be hard to read it and then extend its functionality. If, on the other hand, it looks like it was tossed together by a bunch of competing programmers with little structure or organization, then making any changes will probably be long and painful, and will require boatloads of testing to validate them. Given lots of experience with different systems, experienced programmers can often just loosely rank a code base on a scale of 1 to 10, with the obvious caveat that any ranking from a programmer who hates reading others' code will be erratic.

An important side effect of achieving good quality is that although the project starts slower, it maintains a consistent pace of development throughout its lifetime, instead of slowing down over time. This opens the door to keeping a metric on long-term development that mirrors the underlying quality. If the amount of code reaching the final production state is rapidly decreasing, one of the possible causes is declining quality (there are several other causes to consider as well).
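
As a rough illustration of that kind of metric, here is a minimal Python sketch. It assumes you already record, for each release period, how much code actually reached production (lines, commits, or whatever unit you trust). The function names, the window size and the 0.75 threshold are all hypothetical choices for the example, not an established standard.

```python
from statistics import mean


def throughput_trend(shipped_per_period, window=3):
    """Ratio of recent output to early output.

    shipped_per_period: amounts of code that reached production,
    one number per period, oldest first (hypothetical input format).
    Returns mean(last `window` periods) / mean(first `window` periods).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(shipped_per_period) < 2 * window:
        raise ValueError("need at least two full windows of data")
    baseline = mean(shipped_per_period[:window])
    recent = mean(shipped_per_period[-window:])
    if baseline == 0:
        raise ValueError("baseline output is zero, ratio is undefined")
    return recent / baseline


def is_slowing(shipped_per_period, window=3, threshold=0.75):
    """True if recent output fell below `threshold` times the early output.

    The threshold is an arbitrary assumption; tune it to the project.
    """
    return throughput_trend(shipped_per_period, window) < threshold


if __name__ == "__main__":
    steady = [1000, 1100, 950, 1050, 1000, 980]
    declining = [1200, 1100, 1000, 700, 500, 400]
    print("steady project slowing?", is_slowing(steady))
    print("declining project slowing?", is_slowing(declining))
```

A falling ratio only says that less code is getting out the door. It does not say why, so it is a prompt to go and read the code, not a verdict on its quality.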