Frans Bouma has an interesting post about the importance of documentation. I mostly agree with him. We developers generally hate to write documentation, and Agile processes lead us to believe that we don’t need to. Our code is the documentation.
Well, I’ve used enough frameworks and open source libraries, and worked on enough unrelated large, pre-existing products, that I can say with certainty that this is simply not true. Code is not documentation, nor is it enough "information" to get you by. The amount of time required to become familiar enough with a large code base to be able to just find your way around, let alone to correctly implement something inside of it, is simply huge. I think most of us have been there in our careers. You’re on a new project or using some new library, and the source code is all you have. You get assigned your first defect to work on so that you can "become familiar with the code". You have absolutely no idea where to begin to look for the cause of the defect, and you have a directory full of thousands of files with millions of lines of code.
Since we’ve all been there, we all fall back to the same patterns to "solve" our dilemma. Out comes grep or some equivalent utility and we search for keywords we *think* will help us to find the area of the code in which to start looking. If you’re monumentally lucky, this will turn up only a handful of hits and you can start trying to decipher the code. Usually, though, you have to refine your search numerous times, and sometimes you’ll even have to resort to asking someone else where to look, because you simply can’t find anything unique to search on, and reading a million lines of code is obviously out of the question.
So, you eventually find where to look. If the code is well designed and maintained, you can probably figure things out with little effort at this point. But we all know the reality is that the code has likely grown organically and is now "spaghetti" that you have to pick apart. Always a difficult task, but worse when you’re new to the code base.
Well, you pick it apart and fix your first defect. Congratulations, that just took you ten times as long as it should have. But that’s to be expected, because you’re new to the code. So, on to the next defect. Rinse and repeat for a month or two.
Whew. Now you’re "acclimated" to the code base. You can find your way around fine, so everything’s OK now, right? Nope. Now you’re given more difficult defects to work on, and since you’ve got nothing but the code to rely on for "documentation" you have to take what you see in the code literally. The problem is, the code is specific, but not necessarily literal. There are all kinds of information "between the lines" of the code: the "why" that Frans talked about. So, you change an algorithm in ignorance and cause a meltdown in the code. Or you "copy" a pattern from one place in the code (and no, I do not mean you copy and paste code blindly and/or fail to refactor to properly reuse code), just to discover that there are details you don’t understand that applied only to that specific place in the original code.
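To make the "specific, but not necessarily literal" point concrete, here is a small sketch in Python. Everything in it is hypothetical: the `invoice_total` function, the downstream ledger and the reason given for the rounding mode are made up for illustration. The code says exactly what it does. Only the comment says why, and without it the rounding mode looks like an arbitrary choice that a newcomer could safely "fix".

```python
from decimal import Decimal, ROUND_HALF_EVEN


def invoice_total(line_amounts):
    """Sum the line amounts and round the total to whole cents."""
    total = sum(line_amounts, Decimal("0"))
    # The "why" the code cannot state on its own (hypothetical rationale):
    # the rounding mode has to match the one used by a downstream ledger.
    # Switching to ROUND_HALF_UP would look harmless here, but totals
    # would then disagree with the ledger by a cent on exact half-cents.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
```

Delete the comment and the function behaves identically, but the constraint it records is gone. Someone would then have to rediscover it the hard way.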
No, code is simply a set of instructions. It’s not documentation. Documentation is designed to be consumed by humans; code is designed to be consumed by a computer. Documentation has things code does not and cannot have, like a table of contents, a foreword, an overview, an index, and figures and diagrams that convey a lot of conceptual information at a glance. Documentation is understandable by team members who aren’t developers, such as QA, sales and business analysts.
I can consume a large technical manual in a matter of hours. Personally, I read Petzold’s latest book on WPF in roughly 2 hours, cover to cover, and retained most of the information contained in the book. What’s not retained is readily re-discoverable through the table of contents, index and other structural aids that documentation has and code does not. In contrast, I’m lucky if I can read through a dozen source files in an entire work day with any real "depth of knowledge" of what I read. That’s not because I’m not experienced in reading code, though I will admit there are people who seem to be better at "spelunking code" than I am. It’s because code doesn’t convey concepts; it conveys procedures. Code isn’t structured for human consumption; it’s structured for machine execution.