"The jury is in. The controversy is over. The debate has ended, and the conclusion is: TDD works. Sorry."
-- Robert C. Martin (Uncle Bob)
A quote from the upcoming work "97 Things Every Programmer Should Know":
http://commons.oreilly.com/wiki/index.php/Professionalism_and_Test-Driven_Development
Honestly, I think that this quote, by this very well-known and outspoken consultant in the computer software industry, says far more about the current state of our industry than the words themselves.
Test Driven Development (TDD), for those who are unfamiliar with it, is more or less a "game" to be played while coding, with a set of three "laws" orienting a programmer's efforts. The programmer essentially builds up the program by first writing simple tests and then making the code pass them. In this way, step by step, the tests and the code gradually become more sophisticated.
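For anyone who hasn't seen the mechanics, a minimal sketch of a single TDD "turn" might look like this in Python (the add function and its test are purely illustrative, my own toy example rather than anything from the TDD literature):

import unittest

# Step one: write a small failing test that states the next bit of behaviour.
class TestAdd(unittest.TestCase):
    def test_adds_two_numbers(self):
        self.assertEqual(add(2, 3), 5)

# Step two: write just enough code to make that test pass.
def add(a, b):
    return a + b

# Step three: clean up, then repeat, growing the tests and the code in tiny steps.
if __name__ == "__main__":
    unittest.main()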
It's an interesting process offshoot from the vocal, but not entirely popular, family of software methodologies collectively called Agile.
The overall essence of Agile is to shift the focus of coding to be more customer-oriented and, in the process, hopefully increase the speed of development and the quality of the output. It's a mixed bag of things, where some of the Agile ideas are really good, and in general, most of the newer approaches are definitely a step better than the old report-oriented waterfall methods of the past.
However, even from its earliest days, Agile has been bent by its cheerleaders towards ideas that resonate well with young and inexperienced programmers but are not necessarily practical. Ideas meant to be popular. In short, it focuses far harder on trying to make programming fun and exciting, and far less on trying to actually solve any of the real underlying causes of our inability to consistently deliver workable products. The tendency is to address morale issues, rather than development ones. It's far more driven by what actually sells than by what would fix things.
Consequently, there are always great streams of feedback coming from practitioners about how things did not work out as planned. The ideas sound good, but the projects are still failing.
And, as is normal for any consulting industry trying to protect itself with spin, there are always experts coming forward to claim that all of this backlash is false, misleading and not applicable. Somebody always seems to know of a successful project, somewhere, run by someone else. Any "discussions" are often long on personal attacks and short on tangible details. The term Agile more aptly describes its defenders than the actual ideas.
It's not that I don't want my job to be fun, far from it. It's just that I don't think fun comes from playing silly little games while working. I don't think fun comes from everybody jumping in with their two cents. I don't think work should be 100% fun all of the time. All laughs and chuckles.
I think fun comes from getting together with a group of skilled people and doing something well. I think it comes from being successful, time and time again, in building things that last and will get used for a long time. I think that it comes from not having too much stress, and not embarking on death marches or long drawn out programming disasters. In short, I think it comes from actually producing things, on time and on budget. I think success is fun.
A successful project is always more satisfying and way less stressful than a failure. But, and this is a huge but, to get a successful project, at times the individual parts of it are not necessarily intensely creative, they are not necessarily interesting, and they are not, in themselves, necessarily fun.
Work, as it always has been, is work, not play. It calls for getting our noses down to the keyboard and powering through the sometimes painful, monotonous bits of coding that so often surround the more entertaining stuff. Coding can and always will be tiring at times; that's just the way it is. Good programmers know this, and the rest should just learn to accept it.
If someone comes around selling ways to make it "fun" and "easier", then we ought to be very wary. Distractions from work may improve morale in the short term, but they are generally tickets to losing. And losing just builds long-term stress.
Getting back to TDD, I'm sure for some younger programmers there is some value in using this approach to help them better learn to structure their code. When you're first learning how to program, there is too much emphasis on the lines of code themselves and not enough on the other attributes. Looking at the code simultaneously from both the testing and coding perspectives is bound to strengthen one's overall perspective. Nice. Probably very applicable as a technique for teaching programming, and to help juniors along in their early years.
But it is expensive. Very expensive. Obviously, you're doing more work. And obviously, you're creating far more code, far more dependencies, and far more downstream work.
These things, some people try to point out, are offset by some magical property of this new code being just so much better than what might have been written beforehand. To steal a phrase from Saturday Night Live: "Really?!".
Because honestly, if you were bound to write sloppy, bug-infested code in the first place, having a better overall structure doesn't change the fact that the things you "missed" the first time around will still be missed the second time around. That is, just because you have a "test" doesn't mean that the test completely covers all of the inputs and outputs of the code (infinity, after all, is hard to achieve).
And if you did write a suite of tests so large and extensive that they do in fact cover most of the expected possible inputs and outputs, the code may well be perfect, but you've invested a huge (huge) amount of time and effort. Too much time.
Consider, for example, testing some trivial string concatenation code. We don't need to test every permutation to know that the code works. We as humans are quite capable of essentially doing a proof by induction on the algorithm, so all we need to test are nulls, empty strings and N character strings for each input.
Still, if we've been coding long enough, the tests for null are obvious (and we'd quickly notice if they didn't work), so we merely need to ascertain that the mainline concatenation works, and then we can move on. It should be a non-event for a senior programmer.
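To make that concrete, the handful of checks I'd consider sufficient might look something like this; the concat helper is my own trivial stand-in, not code from any real system:

import unittest

def concat(a, b):
    # Treat None as an empty string; otherwise just join the two inputs.
    return (a or "") + (b or "")

class TestConcat(unittest.TestCase):
    def test_nulls_and_empty_strings(self):
        self.assertEqual(concat(None, None), "")
        self.assertEqual(concat("", "abc"), "abc")

    def test_mainline_concatenation(self):
        # One representative N-character case; induction covers the rest.
        self.assertEqual(concat("foo", "bar"), "foobar")

if __name__ == "__main__":
    unittest.main()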
But, if you were applying TDD, you would have to test for each and every permutation of the input strings, since you would have to ratchet up the testing to match the functionality of the code. Start small and grow. So I believe you'd start with the null inputs first and enhance the code to really work with all combinations of strings.
Well, maybe. Or maybe, you'd just add one test to ascertain that the strings are actually merged. Oops.
Of course, you can see the problem, since either way is no good. You've either spent way too long writing trivial tests or you've not really written the tests that cover the routine (and now foolishly believe that you have). Both are recipes for failure. You've wasted time or you're wrong (or both).
More importantly, the functionality of merging two strings is trivial. An experienced programmer shouldn't even give it a second look, let alone its own test. If the problem exists with something this simple, it will only get magnified as the complexity grows. It only gets worse. Either the tests aren't complete, or the effort sunk into the work is massive (and redundant).
Worse still is that the suite of tests has ended up multiplying the overall code base by huge proportions. I don't care if the code is core or scaffolding (like tests), any and all code in a system must be kept up to date as the system grows. Complexity in code certainly grows far faster than linearly, probably closer to exponentially. If you double the code, you have more than double the effort to update and fix it. If you're short on time now, adding more work to the mix isn't going to make that better.
Also of note: in coding, there are always parts of the program that have real complexity issues, so it is not a great idea to get hung up on the parts that are nearly trivial. The bulk of most systems should be simple; that's still an artifact of our technologies. It's all the code that doesn't need much testing, generally works and is really boring to work on. The biggest problems with this type of code are the growing number of small inconsistencies across redundant code, not algorithmic issues. Unit testing doesn't detect inconsistent code.
In programming, the clock is always ticking, and there is never enough time to get the work finished. Programming is still too slow and tedious to be priced correctly, so nobody wants to pay fair value, and everybody, all the time, is skipping steps.
But again, that in itself is not a reason to push in extra low-level testing. It is actually a reason to be very very stingy with one's time. To "work smarter, not harder". To protect our time carefully, and to always spend it where it has the most impact.
Inevitably the code will go out before it is ready, so do you want to protect it from little failures or do you want to protect it from big ones?
And it is big failures that are clearly the most significant issues around. It's not a widget that slightly misbehaves that really matters, it's usually when the system grinds to a complete and utter halt. That type of event causes big problems which are really noticeable. Little bugs are annoying, but big ones grind the whole show to a screeching halt. A bad screeching halt. An embarrassing screeching halt.
The only real way to ensure that all components of a system are humming along together in perfect unison is to verify that all components of the system are humming along in perfect unison. You have to test it.
And you have to test it in the way that it is being used, in the environment that it is being used in, and with the most common things that it is being used by. The only real tests that prove this are the top-down system tests on the final system (in an integrated environment). Unless you really know what the system is doing, you are just guessing at how it will behave. The more you guess, the more likely you will be wrong.
Not that having something like a suite of regression tests wouldn't be hugely useful and automate a significant amount of the overall effort; it's just that these tests are best and most effective if they are done at the highest level in the code, not the lowest one.
You don't want a lot of automated unit tests, you want a lot of automated system tests. In the end, that's where the real problems lie, and that's what really works in practice. But it is a lot of work, and it's not the type of effort that is really practical for most typical systems. There are often better ways to spend the effort.
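For contrast, an automated system test drives the whole running stack the way a user would. A rough sketch, using only Python's standard library; the URL, endpoint and response shape here are entirely made up for illustration:

import json
import unittest
from urllib.request import urlopen

BASE_URL = "http://localhost:8080"  # hypothetical deployed test instance

class TestDailyReportEndToEnd(unittest.TestCase):
    def test_daily_report_round_trip(self):
        # Exercises the real stack: web server, application code and database.
        with urlopen(BASE_URL + "/reports/daily") as response:
            self.assertEqual(response.status, 200)
            body = json.loads(response.read())
        # Check the behaviour users actually depend on, not internal details.
        self.assertIn("rows", body)

if __name__ == "__main__":
    unittest.main()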
Coming right back to my initial starting place, my real concern about the initial quote isn't TDD itself. History may ultimately prove it to be more useful than I believe, although I doubt it. It's an interesting exercise that has its time and place but is not applicable at a global level. It is not a cure for what plagues most development projects.
My real concern is the "Mission Accomplished" tone of the statement, and how we're seeing that over and over again in our industry.
We've spent decades now claiming success on each and every new approach or technology that has come along, without ever having really verified the results. Without having any real proof. Without having many actual successes.
Meanwhile, we keep ignoring the glaring, and horrible fact that the really big companies keep all their really big and important data on forty-year-old "mainframe" technologies, precisely because all of this new stuff doesn't work. We've had a long string of broken technologies.
Actually, they sort of work, but never well enough. For most organizations, it has been one expensive dud after another.
So, now, instead of taking a hard and fast look at the real sources of our problems: flaky technology, poor analysis, and bad design, some of our industry has moved on and started declaring victories with this new version of game playing. Last week it was extreme, this week it's testing, and next it will be lean. We'd certainly rather claim victory than actually be victorious.
Personally, I think it's good to get new ideas into the mix. Obviously, we haven't found what works, so we shouldn't just plop down and start building trenches. We're probably a long way from real answers, but that doesn't mean we should stop trying. Nor should we declare it "over", until it really is over. New ideas are good, but we shouldn't overstate them.
As long as our industry distracts itself with low-hanging fruit, purely for the purpose of convincing huge organizations to spend more money on more probable project failures, we are not going to put in the time or effort to fix any real problems. We are not going to make any significant progress.
Sure, we've had some success; after all, we've managed to take a huge, potentially useful technology like the Internet and allowed it to decay into something that now has the same intellectual capabilities as bad 70's TV. And, instead of just mumbling to ourselves in public, we can now mumble to our friends and family through our fancy phones, or just play video games on them. Yay us. We've clearly done well. We've learned to utilize our wonderful machines for churning out masses of noise and misinformation.
I guess in the end I shouldn't be too surprised or frustrated, we've clearly entered a period in history where truth and values are meaningless; all that matters is making profits and spinning reality. The software industry follows the larger overall trend towards style and away from substance. It's not whether it is right or wrong, but how it's played that is important. Truth is just another casualty of our era. If you can't win, at least you can declare that you did, that's close enough.
There are two points I don't agree with:
* Trivial things don't deserve tests: the problem I see is that you seem to forget that those who code are humans, who are well known for being unreliable. A senior may fail on very trivial things if he had a bad night, for instance. I wanted to make a similar comment on your previous post "programming is simple!": programming is not as simple as it should be; we make mistakes.
* You ask the rhetorical question of whether to let pass little bugs or major failures. The problem in real life is that you cannot predict what will be a "big" or a "little" bug, in particular when you deliver feature-rich software to many customers with many use cases. A little bug that cripples what you think is a second-class functionality may be a major failure when that functionality is precisely why a very important customer bought your product.
Hi Astrobe,
Thanks for the comments. They're both excellent points.
You are absolutely right about our intrinsic unreliability. Even when I know better, I've written out horribly ridiculous bits of code and watched them fail miserably. What changed as I got older was that I became much faster at recognizing my low-quality work and removing it (instead of just trying to patch it). Mostly, for me, these sorts of problems have shown themselves very early on, so I generally leave some slack in the schedule to correct them long before the code gets frozen. Long before system testing. Long before the problems get worse.
I actually have a rule of thumb whereby if I am forced to return to the same piece of code three times for little bugs, I go back and refactor at least some of it. If I visit any code and find possible errors (not necessarily known ones), and it's not the tail end of the cycle, I fix them right away. In that way, although I always have lots of little trivial errors with all new code, I'm very fast at plugging up the leaks or reworking the base to make it more stable.
Someone once told me to expect 1 bug for each 200 lines of code. If you accept that there are that many bugs then you tend to change your expectations about finding and fixing them.
In "programming is simple!" I did say that the biggest reason programming wasn't simple was ourselves. Over-complexity was my reason, but lack of self-discipline and just human imprecision are also big causes. It's near impossible for a coder to be perfect, so it's not worth the effort to try. The best thing we can do is find other people to offset our work and help in finding the errors through things like code reviews and testing. We always need good and honest feedback, even if it is painful.
Whether a bug is big or little must absolutely depend on its impact in an operational environment. It's judged from the user's perspective, not the coding one. It really doesn't matter if it's a one-liner or a whole bad module; it's how the system is affected that counts.
Systems are collective suites of functionality that all work on some finite set of underlying data. In that sense, of the various "paths" through the system, some are used more often and are more vulnerable, so their impact will be larger. For instance, if the core of a system failed, it would be big, but a failure in some rarely used admin tool might not be noticed for years. Any and all code belongs to some finite set of functionality, so we can know (with a bit of work) the specific impact on the system of a failure at any line of code. Pick a line: if that is wrong, what will happen?
In a "big ball of mud" architecture, there is no easy way to determine the impact of any bug. That's why we try to avoid spaghetti code. In a highly componentized and organized architecture, the impact is refined and narrowed. We know, for instance, that a typo in the parser code will only affect the parser. A problem with the report module will only affect reports.
In most of the systems I've worked on, I can identify the code that "has to run perfectly" from the rest. I also know which paths through the system are popular and how frequently they are used. I also know the battle-tested code from the newer stuff. These are important bits of understanding that come from having to effectively deal with bugs and support. They are also very important for keeping the development on the right path. If you don't know the state of the code and how your code is being used, then it is unlikely that you'll be doing a good job extending it. You're missing the necessary feedback coming from your users (in big development groups, at the very least the overall architects and senior developers should know this, but not necessarily the juniors).
Paul.