The unsung hero - documentation

Why keeping good, up-to-date documentation matters and what the risks of not doing so are.

Marcin Gorczyński

24 Feb 2023 • 7 min read

One of the most overlooked aspects of running a project or a company is documentation. Creating and maintaining it is usually treated as some kind of punishment that pushes us away from the interesting stuff like engineering the solution or coming up with flashy ideas on how to improve the product. It's "yucky".

In this piece I would like to argue that the above approach to documentation has a lot of negative effects, both short and long term, and that whether you're in charge of a small piece of code, a big project consisting of many intertwined pieces or the company as a whole, you should re-evaluate how much emphasis you put on documenting things.

Why it matters

Companies and management often reiterate how important "knowledge sharing" is to them and that they do everything in their power to promote it. In general that seems like a great idea - people sharing their knowledge, getting more skillful and competent as a result and decreasing the chance of a single point of failure, such as the deployment of a critical feature being halted because person X is on holiday.

The issue here is that the above, even when it is true (many times it is just a "kumbaya" line), is mainly focused on interpersonal sharing of said knowledge - kitchen chat, Slack message, sometimes an email or online call. Many would treat this as the "agile" approach that removes barriers and makes people work better together, but in my opinion that is myopic and has a series of problems compared to more "traditional" ways of documentation via centralized, text-first knowledge bases (wikis, Confluence, etc.). From now on I'll refer to the approach that focuses solely on people "figuring it out" by asking around as Folklore Based Sharing (FBS for short).

FBS has a series of problems attached to it (no particular order):

Scalability
Inconsistency of the knowledge ("cache miss")
Overhead and time wasted by people
Discoverability

Does it scale?

The first one is about how well our approach scales - as the application, project and company get bigger, how does our way of doing things hold up (or not)? I would argue that FBS falls apart very quickly here. As a simple example for this, and for the next points as well, we'll take a small startup and a big corporation. In each of them works Bob - the guy who is responsible for maintaining a pretty important piece of software (part of their core product) and takes the helm of technical leadership around it.

In the startup Bob leads the development of Core.App (TM) and has been with it from the beginning until the MVP that happened recently - he knows every nook and cranny, where everything is located within the code and who-is-who (duh, there are only 5 people, him included). Sometimes the founder/"CEO" will ask Bob how something works and how viable a change is, or maybe Jane, the other engineer working with him, will ask how to approach fixing a particular bug. The project is in an early phase so things change constantly, and even a big rewrite is possible, which makes any potential documents quickly outdated.

So in the above scenario FBS works - there are only two people who would approach Bob for answers, process and non-technical requirements are almost non-existent, and the requirements and code are volatile, so there's a high chance that the documentation would get outdated quickly.

But what if Bob is the Staff Engineer of Core.App (TM) Enterprise Edition at MegaCorp Inc.? There a few things almost certainly change - many more people, a much bigger scope and many more non-technical requirements towards the solution from e.g. legal, support, internal audit and enterprise architecture. Now it is not only Bob and Jane but 20 other people split into two projects, and instead of just the founder we get a queue of non-technical people and technical leadership asking questions.

And so Bob's life gets much more problematic - he is constantly bombarded with questions from either management or the technical staff about a solution that is possibly so complicated that he cannot answer all of them without some research. Eventually there will be so many questions that he will not manage to keep answering them, and he will also not manage to keep up with what is happening in the code - his knowledge getting outdated as a result.

Another great example is basic setup/onboarding in large companies - this should be a very standardized, reproducible process and a natural target for well-defined documentation. If you have hundreds or thousands of people, many new to the company, do you think it is a good idea to tell them to "ask around" how to do it? Who should they ask about it? Does that person have time to help them, or is his knowledge perhaps outdated because the last time he did the onboarding was years ago?

FBS quickly falls apart as the complexity of the knowledge and the number of parties/people interested in it increase.

KNOWLEDGE_NOT_FOUND_ERR

The second issue is one of inconsistency between what people hold in their heads and reality. It is like caching - whenever a change occurs we usually want to update the cache as well, so that the results we get from it stay consistent with the source of truth. The thing is that this is impossible with people - there will be gaps in what they know, what they know might very well be outdated without them even being aware of it, and all of that will propagate further to those who get their info from them.
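To make the caching analogy a bit more concrete, here is a minimal sketch. Every name in it (the deploy command, Bob's and Jane's "memory") is made up for illustration; it only shows how a copy that is never invalidated goes stale and how that staleness spreads to whoever copies from it.

```python
# Illustrative only: all names and values here are hypothetical.

# The "source of truth": how the system actually works right now.
source_of_truth = {"deploy_command": "make deploy"}

# Bob learns the answer once and keeps it in his head,
# which behaves like a cache with no invalidation.
bobs_memory = dict(source_of_truth)

# The system changes, but nobody tells Bob.
source_of_truth["deploy_command"] = "./scripts/deploy.sh"

# Jane asks Bob and copies his answer, so the stale value propagates.
janes_memory = dict(bobs_memory)


def is_stale(memory, key):
    """Return True when a remembered value no longer matches the source of truth."""
    return memory.get(key) != source_of_truth.get(key)


print(is_stale(bobs_memory, "deploy_command"))   # True
print(is_stale(janes_memory, "deploy_command"))  # True
```

Neither Bob nor Jane can tell that their answer is stale without checking the source of truth, and that check is exactly what FBS tends to skip.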

A lot of this stems from accountability - with FBS it is most often hard to even say who should know something and what exactly is expected of that person when it comes to keeping their knowledge up-to-date. One can argue that a similar problem might arise with a centralized knowledge base - it gets updated a few times at the beginning of development and is then forgotten about. But it is much easier to track where the knowledge is lacking or inconsistent and then set expectations towards people and teams to keep it updated. Processes have a predictability that ad-hoc approaches lack.

"Do you maybe have 5 minutes for a call?"

As the number of parties interested in the knowledge increases, so does the overhead put onto a person (or a group) - he gets constantly bombarded with the same question over and over again from different people, where at most his responsibility should be giving them a link to the latest documentation that contains the answer. And if it doesn't contain it, he can answer the question and then either update the docs himself or report the gap to the person responsible for maintaining them.

Frustration builds up because of this, as people don't really like answering the same question time and time again while being pulled away from actually productive work.

Hmm, what did he say about X then?

One of the major issues around knowledge in general is how to discover it and search it for the things we are actually interested in - with many very big businesses built around trying to solve that problem.

But how do you know exactly which person knows the thing you're interested in? In a smaller space like a single project or a startup this might be part of the "common knowledge" of all the members, but with increased size it falls apart. This results in playing detective - asking people about something and each of them redirecting you to someone else until you eventually find the guy who has the answer. There is no consistent way to find the answers you're interested in.

The other thing is recalling what you've learned - people have the nasty habit of forgetting stuff, and the information we encode in our heads gets lost as time passes. But in FBS there is no effective way to recall that knowledge - sometimes that might not be a problem if it is a well written e-mail (chain), but piecing together what you've heard on a call a month ago with direct messages on Slack that intermix with talk about the last football game makes it hard, sometimes even impossible. It's usually unstructured and chaotic.

A well thought-out knowledge base lets us more easily discover the knowledge we're interested in and come back to it without much trouble when it falls out of our heads. Often all of the historical versions are kept as well, which can make some things much easier (e.g. tracking how requirements changed at given points in time).

How can we do better?

The first and most important step is understanding your own situation - do the benefits outweigh the costs? As mentioned, for some scenarios keeping a fresh centralized store of knowledge might not be needed, or might even be a net negative - small projects and companies with highly volatile knowledge might actually lose more time than it's worth, with little benefit to show for it. But as we get bigger in scope (big projects and programs, large companies) the benefits almost always show themselves, eventually.

But the critical thing for a centralized knowledge base to work out is treating it as a full-blown product, or at least close to that. There should be accountability to keep the stuff there updated, fill in the blanks and make improvements based on customer (anyone using it) input. This is probably the hardest part - as mentioned in the beginning, most people don't want to do it and there is little glory in being the guy responsible for the company's wiki.

Considering the above, it is on management to create effective processes and paths for the knowledge to flow inwards and outwards while keeping the people responsible for it happy. Their work should be adequately recognized by tying it to KPIs/OKRs, which will translate to raises, bonuses and promotions. It is on you to make documentation sexy and keep morale high for the people responsible for it.

We'll end the post with a quote commonly attributed to Copernicus:

To know that we know what we know, and to know that we do not know what we do not know, that is true knowledge.

Thank you for reading!