Saturday, January 19, 2013

Quality and Scale

[LACK OF EDITOR'S NOTE: I wrote and posted this entirely on my iPad, so there are bound to be spelling and formatting problems. Please feel free to point them out, I'll fix them as I go.]

I use a couple of metrics to size development work. On one axis I always consider the underlying quality of the expected work. It matters because quality lives in at least a logarithmic space; that is, small improvements in quality get considerably more expensive as we aim for better systems. On the other axis I use scale. Scale is also at least logarithmic in effort.
My four basic categories for both are:

Software Quality
- Prototype
- Demo
- In-house
- Commercial

Project Scale
- Small
- Medium
- Large
- Huge

Software Quality

The first level of quality is the prototype. Prototypes are generally very specific and crude programs that show a proof of concept. There is almost no error handling, and no packaging. Sometimes they are built to test out new technologies or algorithms; sometimes they are just people playing with the coding environment. Prototypes are important for reducing risk: they allow experience to be gained before committing to a serious development effort.

Demos usually bring a number of different things together into one package. They are not full solutions; rather they focus on 'talking points'. Demos also lack reasonable error handling and packaging, but they usually show the essence of how the software will be used. Occasionally you see someone release one as a commercial product, but this type of premature exposure comes with a strong likelihood of turning off potential users.

My suspicion is that in-house software accounts for most of modern software development. What really identifies a system as in-house is that it is quirky. The interface is a dumping ground for random, disconnected functionality; the system looks ugly and is hard to navigate. Often, but not always, the internal code is as quirky as the external appearance. There is little architecture, plenty of duplication and usually some very strange solutions to very common problems. Often these systems have haphazardly evolved into their present state. They get used, but more often because the audience is captive; given a choice, the users would prefer something more usable. Many enterprise commercial systems are really in-house quality. Sometimes they started out higher, but gradually degenerated to this level after years of people just dumping code into them. Pretty much any unfocused development project is constrained to this level. It takes considerable experience, talent, time and focus to lift the bar.

What really defines commercial quality is that the users couldn't imagine life without the software. Not only does it look good, it's stunningly reliable, simple and intuitive to use. Internally the code is also clean and well organized. It handles all errors correctly, is well packaged and requires minimal or no support. Graphic designers and UX experts have heavily contributed to give the solution both a clear narrative and a simple, easily understood philosophy. A really great example solves all of the problems that it claims to. Even the smallest detail has been thought out with great care. The true mastery of programming comes from making a hard problem look simple; commercial systems require this both to maintain their quality and to support future extensions. Lots of programmers claim commercial-quality programming abilities, but judging from our industry very few can actually operate at this level. As users' expectations for scale have skyrocketed and development times have shortened, the overall quality of software has been declining. Bugs that in the past would have caused a revolt or a lawsuit are now conveniently overlooked. This may mean more tools are available, but it also means more time wasted and more confusion.

Project Scale

It is impossible to discuss project scale without resorting to construction analogies. I know programmers hate this, but without some tangible basis, people easily focus on the wrong aspects or oversimplify their analysis. Grounding the discussion in physical reality really helps to visualize the work involved and the types of experience necessary to do it correctly.
A small project is akin to building a shed out back. It takes some skill, but it is easily learned, and if there are the inevitable problems, they are relatively well contained. A slightly wonky shed still works as a shed; it may not look pretty, but hey, it's only a shed. Small projects generally come in at under 20,000 lines of code. It's the type of work that can be completed in days, weeks or a few months by one or two programmers.

A medium project is essentially a house. There is a great deal of skill in building a house; it's not an easy job to complete. A well-built house is impressive, but somewhere in the middle of the scale it's hard for someone living in the house to really get a sense of the quality. If it works well enough, then it works. Medium projects vary somewhat, falling in at around 50,000 lines of code and generally staying under 100,000. They usually require a team, or they are spread across several years.

You can't just pile a bunch of houses on top of each other to get an apartment building. Houses are made of smaller, lighter materials and are really scaled towards a family. Building an apartment building, on the other hand, requires a whole new set of skills. Suddenly things like steel frames become necessary to get the size beyond a few floors. Plumbing and electricity are different. Elevators become important. The game changes, and the attention to detail changes as well. A small flaw in a house might be a serious problem in an apartment building. As such, more planning is required, there are fewer options, and bigger groups need to be involved, often with specializations. For software, large generally starts somewhere after 100,000 lines, but can also get triggered by difficult performance constraints or more than a trivial number of users. In a sense it's going from a single-family dwelling to a system that can accommodate significantly larger groups. That leap upwards in complexity is dangerous. Crossing the line may not look like that much more work, but underneath the rules have changed. It's easy to miss, sending the whole project into what is basically a brick wall.

Skyscrapers are marvels of modern engineering. They seem to go up quickly, so it's easy to miss their stunning degree of sophistication, but they are complex beasts. It's impressive how huge teams come together and manage to stay organized enough to achieve these monuments. These days, there are many similar examples within the software world: systems that run across tens of thousands of machines or cope with millions of users. There isn't that much in common between an apartment building and a skyscraper, although the lines may be somewhat blurred. It's another step in sophistication.

Besides visualization, I like these categories because it's easy to see that they are somewhat independent. That is, just because someone can build a house doesn't mean they can build a skyscraper. Each step upwards requires new sets of skills and more attention to detail. Each step requires more organization and more manpower. It wouldn't make sense to hire an expert from one category and expect them to do a good job in another. An expert house-builder can't necessarily build a skyscraper, and a skyscraper engineer may tragically over-engineer a simple shed. People can move of course, but only if they put aside their hubris and accept that they are entering a new area. This -- I've often seen -- is true for software as well. Each bump up in scale has its own challenges and its own skills.

Finally

A software project has an inherent scale and a set of quality goals. Combining these gives a very reliable way of sizing the work. Factors like the environment and the experience of the developers also come into play, but scale and quality dominate. If you want a commercial-grade skyscraper, for instance, it is going to be hundreds of man-years of effort. It's too easy to dream big, but there are always physical constraints at work, and as Brooks pointed out so very long ago, there are no silver bullets.

Sunday, January 6, 2013

Potential

Computers are incredibly powerful. Sure, they are just stupid machines, but they come with infinite patience and unbelievable precision. So far, though, we’ve barely tapped their potential; we’re still mired in building up semi-intelligent instruction sets by brute force. Someday, however, we’ll get beyond that and finally be able to utilize these machines to improve both our lives and our understanding of the universe.

What we are fighting with now is our inability to bring together massive sets of intelligent instructions. We certainly build larger software systems now than in the past, but we still do this by crudely mashing together individual efforts into loosely related collections of ‘functionality’. We are still extremely dependent on keeping the work separated, e.g. apps, modules, libraries, etc. These are all the work of small groups or individuals. We have no reliable way of combining the effort of thousands or millions of people into focused, coherent works. There are some ‘close but no cigar’ examples, such as the Internet or sites like Wikipedia, which are collections drawn from a large number of people, but these have relied heavily on being loosely organized, and as such they fall short of the full potential of what could be achieved.

If we take the perspective of software being a form of ‘encoded’ intelligence, then it’s not hard to imagine what could be created if we could merge the collective knowledge of thousands of people together into a single source. In a sense, we know that individual intelligence ranges; that is some people operate really smartly, some do not. But even the brightest of our species isn’t consistently intelligent about all aspects of their life. We all have our stupid moments where we’re not doing things to our best advantage. Instead we’re stumbling around, often just one small step ahead of calamity. In that sense ‘intelligence’ isn’t really about what we are thinking internally, but rather about how we are applying our internal models to the world around us. If you really understood the full consequences of your own actions for instance, then you would probably alter them to make your life better...

If we could combine most of what we collectively know as a species, we’d come to a radically different perspective of our societies. And if we used this ‘greater truth’ constructively we’d be able to fix problems that have long plagued our organizations. So it’s the potential to utilize this superior collective intelligence that I see when I play with computers. We take what we know, what we think we know, and what we assume for as many people as possible, then compile this together into massive unified models of our world. With this information -- a degree of encoded intelligence that far exceeds our own individual intelligence -- we apply it back, making changes that we know for sure will improve our world, not just ones based on wild guesses, hunches or egos.

Keep in mind that this isn’t artificial intelligence in the classic sense. Rather, it is a knowledge base built around our shared understandings. It isn’t sentient or moody, or even interested in plotting our destruction; instead it is just a massive resource that simplifies our ability to comprehend huge multi-dimensional problems that exceed the physical limitations of our own biology. We can still choose to apply this higher intelligence at our own discretion. The only difference is that we’ve finally given our species the ability to understand things beyond its individual capabilities. We’ve transcended our biological limitations.

Friday, December 28, 2012

State of the New Machines

I'm typing this post from a cottage in northern Ontario. It is my first post from my iPad, connected via 3G to the world. I couldn't have imagined 20 years ago, while lugging home my 30 pound Compaq "portable", that someday I'd be miles off the grid, yet connected to the world via a wafer-thin screen resting on my lap. Hardware, it seems, has made amazing progress.

The irony is probably that my tablet is packed with retro apps, most of which are throwbacks to the early days of PC computing. While hardware has screamed ahead, software has, well, lagged. Still, there are a few apps that impress me, mostly the stuff coming from Autodesk. The rest, however...

Sunday, November 25, 2012

Theory and Practice

Nearly three decades ago, when I started university, all I really wanted to learn was the magic of programming. But my course load included plenty of mathematics and computer theory courses, as well as crazy electives. “What does all this have to do with programming?” I often complained. At first I just wished they’d drop the courses from the curriculum and give me more intensive programming assignments. That’s what I thought I needed to know. In time I realized that most of it was quite useful.

Theory is the backbone of software development work. For a lot of programming tasks you can ignore the theory and just scratch out your own eclectic way of handling the problem, but a strong theoretical background not only makes the work easier, it also makes the result more likely to withstand the rigors of the real world. Too often I’ve seen programmers roll their own dysfunctional solution to a theoretical problem without first getting a true appreciation of the underlying knowledge. What most often happens is that they flail away at the code, unable to get it stable enough to work. If they understood the theory, not only would the code be shorter, but they’d spend way less time banging at it. It makes the work easier. Thus for some types of programming, understanding the underlying theory is mandatory. Yes, it’s a small minority of the time, but it’s often the core of the system, where even the littlest of problems can be hugely time-intensive.

The best-known theoretical problem is the ‘halting problem’. Loosely stated, it is impossible to write code that can determine whether some other arbitrary code will converge on an answer or run forever (although one can write an estimation that works for a restricted, finite subset of programs, and that seems doable).

In its native form the halting problem isn’t encountered often in practice, but we do see it in other ways. First is that an unbounded loop could run forever; an unbounded recursion can run forever as well. Thus in practice we really don’t want code that is ever unbounded -- infinite loops annoy users and waste resources -- at some point the code has to process a finite set of discrete objects and then terminate. If that isn’t possible, then some protective form of constraint is necessary (and the size of that constraint should be easily configurable at operational time).
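
As a concrete illustration, here is a minimal sketch of that kind of protective constraint; the class and the maxItems limit are my own invention, and in a real system the limit would be read from configuration rather than hard-coded.

```java
import java.util.Queue;

// Sketch of a loop with a protective bound: instead of trusting the input
// to be finite, the code refuses to run past a configured limit.
public class BoundedWorker {
    // maxItems is hypothetical and would normally come from configuration.
    public static int drain(Queue<String> work, int maxItems) {
        int processed = 0;
        while (!work.isEmpty()) {
            if (processed >= maxItems) {
                // Surface the problem rather than looping forever.
                throw new IllegalStateException(
                        "exceeded configured limit of " + maxItems + " items");
            }
            String item = work.poll();
            // ... process the item ...
            processed++;
        }
        return processed;
    }
}
```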

The second way we see it is that we can’t always write code to understand what other code is trying to do. In an offbeat way, that limits the types of tools we can build for automation. It would be nice, for instance, if we could write something that would list out the stack for every possible exception in the code with respect to its input, but that would require the lister to ‘intelligently’ understand the code well enough to know its behavior. We could approximate that, but the lack of accuracy might negate the value of the tool.

Another interesting theoretical problem is the Two Generals Problem. This is really just a coordination issue between any two independent entities (computers, threads, processes, etc.). There is no known way to guarantee 100% reliable communication if the entities are independent. You can reduce the window of problems down to a tiny number of instructions, but you can never remove it entirely. With modern computers we can do billions of things within fractions of a second, so even a tiny 2 ms window could result in bugs occurring monthly in a system with a massive number of transactions. Thus what seems like an unlikely occurrence can often turn into a recurring nuisance that irritates everyone.

Locking is closely related to the Two Generals Problem. I’ve seen more bugs in locking than in any other area of modern programming (dangling pointers in C were extremely common in the mid 90s, but modern languages mitigated that). It’s not hard to write code to lock resources, but it is very easy to get it wrong. At its heart, it really falls back to a simple principle: to get reliable locking you need a ‘test-and-set’ primitive. That is, in one single uninterrupted, single-threaded, protected operation, you need to test a variable and either set it to ‘taken’ or return that it is unavailable. Once you have that primitive, you can build all other locking mechanisms on top of it. If it’s not atomic, however, there will always be a window of failure. That links back to the Two Generals Problem quite nicely, since it becomes an issue precisely when you don’t have access to an atomic ‘test-and-set’ primitive (and thus there will always be problems).
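
To make that principle concrete, here is a minimal sketch of a lock built directly on an atomic test-and-set, using Java’s AtomicBoolean.compareAndSet as the primitive; the SpinLock class itself is just an illustration, not production code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// A crude spinlock built on a single test-and-set primitive.
// compareAndSet(false, true) tests the flag and takes it in one
// uninterruptible step, which is exactly the property described above.
public class SpinLock {
    private final AtomicBoolean taken = new AtomicBoolean(false);

    public void lock() {
        // Keep trying until the atomic test-and-set succeeds.
        while (!taken.compareAndSet(false, true)) {
            Thread.onSpinWait(); // hint to the runtime that we are busy-waiting
        }
    }

    public void unlock() {
        taken.set(false);
    }
}
```

Every higher-level construct (mutexes, semaphores, read-write locks) can be layered on top of something like this; without the atomic primitive there is always a gap between the test and the set where another thread can sneak in.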

Parsing is one of those areas where people often tread carelessly without a theoretical background, and it always ends badly. If you understand the theory and have read works like The Red Dragon Book, then belting out a parser is basically a time problem. You decide what the ‘language’ requires, such as LR(1), and how big the language is, and then you do the appropriate work, which more often than not is either a recursive descent parser or a table-driven one (using tools like lex/yacc or antlr). There are messy bits of course, particularly if you are trying to draft your own new language, but the space is well explored and well documented. In practice, however, what you see is a lot of crude split/join-based top-down disasters, with the occasional regular expression disaster thrown in for fun. Both of those techniques can work with really simple grammars, but they fail miserably when applied to more complex ones. Thus being able to parse a CSV file does not mean you know how to parse something more complex. Bad parsing is usually a huge time sink, and if it’s way off then the only reasonable option is to rewrite it properly. Sometimes it’s just not fixable.
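
For a sense of what the recursive descent option looks like, here is a small sketch for a toy arithmetic grammar; the grammar and the class are my own invention, chosen only to show the shape of the technique (one method per non-terminal), not any real language.

```java
// A sketch of a recursive descent parser for a toy grammar:
//   expr   -> term (('+' | '-') term)*
//   term   -> factor (('*' | '/') factor)*
//   factor -> NUMBER | '(' expr ')'
// Each non-terminal becomes one method; whitespace handling is omitted.
public class ExprParser {
    private final String src;
    private int pos = 0;

    public ExprParser(String src) { this.src = src; }

    public double parse() {
        double value = expr();
        if (pos != src.length())
            throw new IllegalArgumentException("trailing input at " + pos);
        return value;
    }

    private double expr() {
        double value = term();
        while (peek() == '+' || peek() == '-') {
            char op = next();
            value = (op == '+') ? value + term() : value - term();
        }
        return value;
    }

    private double term() {
        double value = factor();
        while (peek() == '*' || peek() == '/') {
            char op = next();
            value = (op == '*') ? value * factor() : value / factor();
        }
        return value;
    }

    private double factor() {
        if (peek() == '(') {
            next();                                   // consume '('
            double value = expr();
            if (next() != ')')
                throw new IllegalArgumentException("missing ')' at " + pos);
            return value;
        }
        int start = pos;
        while (Character.isDigit(peek()) || peek() == '.') pos++;
        if (start == pos)
            throw new IllegalArgumentException("expected a number at " + pos);
        return Double.parseDouble(src.substring(start, pos));
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }
    private char next() { return src.charAt(pos++); }
}
```

Feeding it "2*(3+4)" returns 14.0; the point is simply that the code mirrors the grammar line for line.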

One of my favorite theoretical problems is the rather well-known P vs NP problem. While the verdict is still outstanding on the relationship, it has huge implications for code optimization. For people unfamiliar with ‘complexity’, it is really a question of growth. If you have an algorithm that takes 3 seconds to run with 200 inputs, what happens when you give it 400 inputs? A simple linear algorithm takes 6 seconds to run. Some algorithms perform worse: a quadratic one would take four times as long (12 seconds), a cubic one eight times as long (24 seconds), and an exponential one would blow up far beyond that. We can take any algorithm and calculate its ‘computational complexity’, which tells us exactly how the time grows with respect to the size of the input. We usually categorize this by the dominant operators, so O(1) is constant, O(n) grows linearly with the size of the input, O(n^c) grows by a constant exponent (polynomial time) and O(c^n) has the size of the input as the exponent (exponential time). The P in the question refers to polynomial time, while NP rather loosely covers any growth, such as exponential, that is larger (I know, that is a gross oversimplification of NP, but it serves well enough to indicate that it references problems that are larger, without getting into what constrains NP itself).
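
A tiny sketch makes those growth rates tangible by counting abstract steps instead of seconds; the functions and sizes below are my own, purely for illustration.

```java
// Doubling the input roughly doubles the work for a linear algorithm,
// quadruples it for a quadratic one, and squares it for an exponential one.
public class Growth {
    static long linear(int n)      { return n; }           // O(n)
    static long quadratic(int n)   { return (long) n * n; } // O(n^2)
    static long exponential(int n) { return 1L << n; }      // O(2^n), n kept small

    public static void main(String[] args) {
        for (int n : new int[]{10, 20, 40}) {
            System.out.printf("n=%d linear=%d quadratic=%d exponential=%d%n",
                    n, linear(n), quadratic(n), exponential(n));
        }
    }
}
```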

Growth is a really important factor when it comes to designing systems that run efficiently. Ultimately what we’d like to build is a well-behaved system that runs in testing on a subset of the data, with the confidence that when it goes into production the performance characteristics will not change. The system shouldn’t suddenly grind to a halt when it is being accessed by a real number of users, with a real amount of data. What we’ve learned over the years is that it is really easy to write code where this will happen, so often, to get the big industrial stuff working, we have to spend a significant amount of time optimizing the code to perform properly. The work a system has to do is fixed, so the best we can do is find approaches that preserve and reuse that work (memoization) as much as possible. Optimizing code, after it’s been shown to work, is often crucial to meeting the requirements.
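
The classic toy illustration of preserving and reusing work is a memoized function; here is a minimal sketch (the choice of Fibonacci is mine, only because it shows the effect dramatically).

```java
import java.util.HashMap;
import java.util.Map;

// Naive recursive Fibonacci repeats the same subproblems exponentially many
// times; remembering each answer in a map turns it into linear work.
public class Memo {
    private static final Map<Integer, Long> cache = new HashMap<>();

    static long fib(int n) {
        if (n < 2) return n;
        Long known = cache.get(n);          // reuse work already done
        if (known != null) return known;
        long result = fib(n - 1) + fib(n - 2);
        cache.put(n, result);               // preserve it for next time
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fib(80));        // instant; the unmemoized version is not
    }
}
```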

What P != NP is really saying in practice is that there is a very strong bound on just how optimized the code can really be. If it is true, then there is no possible way you could take an exponential problem and find clever tricks to get it to run in polynomial time. You can always optimize code, but there may be a hard bound on exactly how fast you can get it. A lot of this work was best explored with respect to sorting and searching, but for large systems it is essential to really understand it if you are going to get good results.

If it turned out that P does equal NP, however, amongst many other implications, that would mean we are able to calculate some pretty incredible stuff. Moore’s law has always been giving us more hardware to play with, but users have kept pace and are continually asking for processing beyond our current limits. Without that fixed boundary as a limitation, we could write systems that make our modern behemoths look crude and flaky, and it would require a tiny fraction of the huge effort we put in right now to build them (it would also take a lot of the fun out of mathematics, according to Gödel).

Memoization as a technique is best known from ‘caching’. Somewhere along the way, caching became the over-popular silver bullet for all performance problems. Caching in essence is simple, but there is significantly more depth there than most people realize, and as such it is not uncommon to see systems deploying erratic caching to harmful effect. Instead of magically fixing the performance problems, they manage to make them worse and introduce a slew of inconsistencies in the results. So you get really stale data, or a collection of data with parts out of sync, slower performance, rampant memory leaks, or just sudden scary freezes in the code that seem inexplicable. Caching, like memory management, threads and pointers, is one of those places where ignoring the underlying known concepts is most likely to result in pain, rather than a successful piece of code.
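
The rampant-memory-leak failure mode, in particular, comes from caching without any eviction policy. Here is a minimal sketch of a bounded LRU cache built on Java’s LinkedHashMap in access-order mode; the capacity is an assumption that would need real tuning, and this says nothing about staleness, which needs its own policy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A small bounded cache: once it holds more than 'capacity' entries, the
// least recently used one is evicted, so it cannot grow without limit.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public BoundedCache(int capacity) {
        super(16, 0.75f, true);       // accessOrder = true gives LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;     // evict when over the configured bound
    }
}
```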

I’m sure there are plenty of other examples. Often when I split programming between ‘systems programming’ and ‘applications programming’, what I am really referring to is that the systems variety requires a decent understanding of the underlying theories. Applications programming needs an understanding of the domain problems, but those can often be documented and passed on to the programmer. For the systems work, the programmer has to really understand what they are writing, for if they don’t, the chances of just randomly striking it lucky and getting the code to work are nearly infinitesimal. Thus, as I found out over the years, all of those early theory courses that they made me take are actually crucial to being able to build big, industrial-strength systems. You can always build on someone else’s knowledge, which is fine, but if you dare tread into any deep work, then you need to take it very seriously and do the appropriate homework. I’ve seen a lot of programmers fail to grok that and suffer horribly for their hubris.

Sunday, November 18, 2012

Best Practices

One significant problem in software development is not being able to end an argument by pointing to an official reference. Veteran developers acquire considerable knowledge about ‘best practices’ in their careers, but there is no authoritative source for all of this learning. There is no way to know whether a style, technique, approach, algorithm, etc. is well-known, or just a quirk of a very small number of programmers.

I have heard a wide range of different things referred to as best practices, so it’s not unusual to have someone claim that their eclectic practice is more widely adopted than it really is. In a sense there is no ‘normal’ in programming; there is such a wide diversity of knowledge and approaches. But there are clearly ways of working that consistently produce better results. Over time we should be converging on a stronger understanding, rather than just continually retrying every possible permutation.

Our not having a standard base of knowledge makes it easier for people from outside the industry to make “claims” of understanding how to develop software. If, for instance, you can’t point to a reference that says there should be separate development, test and production environments, then it is really hard to talk people out of just using one environment and hacking at it directly. A newbie manager can easily dismiss three environments as being too costly, and there is no way to convince them otherwise. No doubt it is possible to do everything on the same machine; it’s just that the chaos is going to extract a serious toll in time and quality. But to people unfamiliar with software development, issues like ‘quality’ are not easily digestible.

Another example is that I’ve seen essentials like source code control set up in all manner of weird arrangements, yet most of these variations provide ‘less’ support than the technology can really offer. A well-organized repository not only helps synchronise multiple people, but it also provides insurance for existing releases. Replicating a bug in development is a huge step in being able to fix it, and basing that work on the certainty that the source code is identical between the different environments is crucial.

Schemas in relational databases are another classic area where people easily and often deviate from reasonable usage, and then either defend their missteps as accepted practice or dismiss the idea that there is only a small window of reasonable ways to set up databases. If you use an RDBMS correctly, it is a strong, stable technology. If you don’t, then it becomes a black hole of problems. A normalized schema is easily sharable between different systems, while a quirky one is implicitly tied to a very specific code base. It makes little sense to utilize a sharable resource in a way that isn’t sharable.

Documentation and design are two other areas where people often have very eclectic practices. Given the increasing time pressures of the industry, there is a wide range of approaches out there that swing from ‘none’ to ‘way over the top’, with a lot of developers believing that one extreme or the other is best. Neither too much nor too little documentation serves the development, and often the documentation isn’t really the end product, just a necessary step in a long chain of work that eventually culminates in a version of the system. A complete lack of design is a reliable way to create a ball of mud, but overdoing it can burn resources and lead to serious over-engineering.

Extreme positions are common elsewhere in software as well. I’ve always figured that in their zeal to over-simplify, many people have settled on their own unique minimal subset of black-and-white rules, when often the underlying problems are really trade-offs that require subtle balancing instead. I’ll often see people crediting K.I.S.S. (keep it simple, stupid) as the basis for some over-the-top complexity that is clearly counter-productive. They become so focused on simplifying some small aspect of the problem that they lose sight of the fact that they’ve made everything else worse.

Since I’ve moved around a lot I’ve encountered a great variety of good and insane opinions about software development. I think it would be helpful if we could consolidate the best of the good ones into some single point of reference. A book would be best, but a wiki might serve better. One single point of reference that can be quoted as needed. No doubt there will be some contradictions, but we should be able to categorize the different practices by family and history.

We do have to be concerned that software development is often hostage to what amounts to pop culture these days. New “trendy” ideas get injected, and it often takes time before people realize that they are essentially defective. My favorite example is Hungarian notation, which has hopefully vanished from most work by now. We need to distinguish between established best practices and upcoming ‘popular’ practices. The former have been around for a long time and have earned their respect. The latter may make it to ‘best’ someday, but they’re still so young that it is really hard to tell yet (and I think more of these new practices end up deemed ineffective than promoted to ‘best’ status).

What would definitely help in software development is to be able to sit down with management or rogue programmers and be able to stop a wayward discussion early with a statement like “storing all of the fields in the database as text blobs is not considered by X to be a best practice..., so we’re not going to continue doing it that way”. With that ability, we’d at least be able to look at a code base or an existing project and get some idea of conformity. I would not expect everyone to build things the same way, but rather this would show up projects that deviated way too far to the extremes (and because of that are very likely to fail). After decades, I think it’s time to bring more of what we know together into a usable reference.