Ask HN: What is key to good technical documentation?

165 points| jnxx | 6 years ago | reply

Imagine you are a junior software developer who is entering a relatively large and complex project with 150,000 lines of code. It is not some web software, but something way more complex, say an embedded system, or a navigation system, or a digital camera, or maybe a complex device for medical imaging. It is a lot of code in different languages. What you know is that the major aspects are working but there still might be some problems.

And of course, management is eager for you to get stuff up and running, as the project is already severely delayed, needs to be shipped as soon as possible, and the customers and investors are getting impatient.

There is only one problem - all three senior engineers who were working on the codebase have left suddenly, only their manager is still there, and you have to start with reading and understanding the code, and what needs to be done. It is somewhat commented, but, of course, it is not easy to understand. And as it looks, the previous developers did not have any time to leave you proper technical documentation! What are you going to tell the manager?

But then, suddenly, a fairy godmother appears, who has a magic wand in her hand. It is a kind and witty documentation fairy, and she says: "I will grant you a single wish. By my magical powers, I will give you exactly the documentation which you think is most important. It can be anything you want. Anything! The only condition is, it must not be more than a good software developer can produce in a month."

What do you wish for?

88 comments

[+] pytester|6 years ago|reply

* Examples. Realistic, complete, useful examples that demonstrate the software being used in the manner it is intended.

* Glossary for any special terminology that can't just be googled - especially project specific terminology that may be confusing, and double especially terms or phrases which are used slightly differently to the way everybody else uses them (e.g. 'user' means something slightly different almost everywhere).

* Lots of "why" describing why the software behaves in ways that seem surprising at first glance.

[+] trynewideas|6 years ago|reply

Seconded on examples, but example code needs to be integrated into software testing for projects that are frequently updated. Broken examples are arguably worse than none.

Even less arguable is when examples that worked are broken by an update and remain unchanged--they'll work for people on outdated versions but break when they upgrade, and also break for new users, and nothing will sufficiently explain why.

Please add your documentation examples to your testing pipeline (or write tests clearly enough that they can be good examples, and just use those).
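A minimal sketch of what this can look like, assuming Python and its standard `doctest` module. The function `parse_version` and the helper `count_failing_examples` are hypothetical names made up for illustration. The example in the docstring is what a reader sees in the docs, and the helper lets a test pipeline fail when that example stops being true.

```python
import doctest


def parse_version(text):
    """Split a dotted version string into a tuple of integers.

    The example below is documentation and a test at the same time:

    >>> parse_version("1.4.2")
    (1, 4, 2)
    """
    return tuple(int(part) for part in text.split("."))


def count_failing_examples(func):
    """Run the docstring examples of `func` and return how many failed."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = 0
    # module=False: do not try to locate the defining module; the globals
    # the examples need are passed in explicitly instead.
    tests = finder.find(func, func.__name__, module=False,
                        globs={func.__name__: func})
    for test in tests:
        failed += runner.run(test).failed
    return failed
```

A CI job could call `count_failing_examples(parse_version)` and fail the build on any non-zero result, so a broken example is caught in the same run as a broken unit test.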

[+] petepete|6 years ago|reply

This is bang on - to get up and running people want examples and some context. For me it's the first thing I look for.

On my most recent project I split the docs in two; a 'guide' with example code and output - it has more of an 'overview' feel. The other half is the technical (API) docs built from comments in the source code.

It's not the most exciting project in the world, but it's a Rails form builder that meets GOV.UK requirements. Here's a draft of the guide if anyone's interested.

https://govuk-form-builder.netlify.com/

It's generated via Nanoc which loads the form builder and uses it to generate the example input, rendered output and HTML - in theory it can't be out of date or broken.

[+] teeray|6 years ago|reply

The best technical documentation I’ve seen at a company was written by a technical writer. Most companies don’t like hearing this because it costs more than adding an additional drag on your development teams to also become solid writers (good writing is real, time-consuming work). Having someone whose job it was to start from the perspective of your users, ask questions of the dev team along the way, and document the journey yielded enormously useful docs.

[+] jnxx|6 years ago|reply

As a scientist by education, I have learned to write on scientific technical stuff and it was a long, time-consuming and sometimes painful process - but ultimately, I like to write, and I like to produce clear written information. What was most difficult for me was to overcome perfectionism, which totally blocked me when I tried to write my first peer-reviewed paper. I learned a lot from a few very good books on writing. The very best I found so far is this one:

Effective Writing for Engineers, Managers, Scientists, by H.J. Tichy.

https://www.amazon.com/Effective-Writing-Engineers-Managers-...

It was fantastic for me because it describes writing as an iterative process with subsequent rounds of structuring and refinement, and that helped me to overcome blocks. I learned also a lot on how to make my writing more concise and to the point.

For a scientist and programmer, it is really essential that one does not simply assume that one already knows how to do it! For me, it is way more difficult than it looks - but also way more satisfying than I thought, when it is finished.

I agree that this is not for everyone, because it needs to be learned, is hard to learn, and is a very different set of skills to coding. And there is clearly a field of work for specialists.

One lesson learned that I took away from this is that writing is a separate task, and it is difficult to do it at the same time as producing code.

But back to the main topic. As a documentation writer under time constraints, what could I do to find out what the not-yet-present reader of the documentation might need to know?

[+] Ididntdothis|6 years ago|reply

I work in medical devices so we have to produce a ton of documentation in a very specific format. I never understood why our devs have to spend a ton of time on producing not very good documents. We have a few tech writers who produce much better work but they are completely overloaded. I don't understand why management doesn't hire more writers to free up the devs and also to produce better docs faster.

[+] mooreds|6 years ago|reply

A couple of things:

* High level architecture (what pieces talk to each other, how they communicate, what they do)

* A list of 'here be dragons' areas, places where the code is gnarly

* A bug database, which will give you history of decisions

Then I'd ask the manager/product owner for a requirements document (or documents), as that should be something a non technical person can/has put together.

Frankly, though outside the scope of your question, I would also look at hiring one of the departed senior devs on a contract basis, just to ask questions of. I have offered this for the past few jobs I have left and there are times when 15 minutes of my time could save a few hours of investigation.

[+] ka0lin|6 years ago|reply

Is this an improvement over for example the Eclipse foundation's pile of high level architecture, bug database and here be dragons-areas?

[+] jnxx|6 years ago|reply

Good points, thank you!

[+] tpmx|6 years ago|reply

I wish for documentation of

1. Key data structures

2. Code layout/architecture (which modules/layers call which modules/layers, etc)

I like Blender's example for how to do this: https://www.blender.org/bf/codelayout.jpg

These also tend to be relatively static so they don't need to be updated every month.

[+] jnxx|6 years ago|reply

Oh this, yes!

I remember reading "Practice of Programming" by Brian Kernighan and Rob Pike (which is a really, really fantastic book!), and it talks about the importance of data structures.

https://en.wikipedia.org/wiki/The_Practice_of_Programming

This comes up again and again, and it is so true! The core data structures are one of the most important aspects of a program. In fact, when I start to write a program, I think about the central data structures first. It is clear that, to a programmer who takes over a project, they need to be explained as one of the first and most important things.
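As a rough illustration of documenting a key data structure, here is a sketch in Python. The `Waypoint` type, its fields, and the chosen units are assumptions invented for this example (the thread only mentions a navigation system in passing). The point is that units, ranges, and meaning are written down where the structure is defined, since none of them can be read off the field types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Waypoint:
    """One point on a planned route (hypothetical type, for illustration).

    Instances are immutable so they can be shared between modules
    without anyone having to reason about who may modify them.
    """

    latitude_deg: float   # WGS84 latitude, -90.0 .. 90.0
    longitude_deg: float  # WGS84 longitude, -180.0 .. 180.0
    altitude_m: float     # metres above mean sea level

    def __post_init__(self):
        # The documented ranges are enforced, so the comments cannot
        # silently drift away from what the code accepts.
        if not -90.0 <= self.latitude_deg <= 90.0:
            raise ValueError("latitude_deg out of range")
        if not -180.0 <= self.longitude_deg <= 180.0:
            raise ValueError("longitude_deg out of range")
```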

[+] tetha|6 years ago|reply

I think good documentation needs to be written with a few decisions in mind: What role, what kind of person is this section written for, and what kind of task does the reader have at the moment?

There should be sections on familiarizing yourself with the system - which persistence APIs are used, what are the broad dataflows, where are things dispatched into greater detail, what are active parts in the system. And I did say "sections", because a developer, a system operator, a customer supporter and a consultant implementing the system all need different kinds of information and levels of detail.

There should also be sections on handling known task structures, known troubleshooting guidelines, again, with a focus on the role of the current reader. If content has to be searchable via elasticsearch, there should be focused documentation: How do I get an entity into elasticsearch? How do I connect an actuator setting on a pin with a setting on the UI page? These sections should be mostly like a todo-list.

In order to write something like that, imagine just explaining the system to someone and write most of that down. I've found that this results in very effective documentation. It's usually simple to read, and it's also easy to understand if you should read this block or not. For example, if something says "This is scoped for developers", I might skip it.

[+] MichaelMoser123|6 years ago|reply

I would ask the documentation fairy to come up with sequence diagrams for the most common flows/use cases. That used to be the winning argument whenever I had to come up with some docs. I think that sequence diagrams are the killer feature of UML: everyone gets the message.

[+] bob1029|6 years ago|reply

I share this sentiment regarding diagrams. Any time I sense that a design or requirements discussion is starting to lose track of all the various edge cases of a particular business process, I will draw out a state diagram of that process and put it into an email or message to the team. 9/10 times this either immediately ends the back-and-forth, or quickly narrows down the actual concerns to a few particular items.

I am consistently amazed by how quickly these diagrams can convey complex business requirements to non-technical people. I could write 30 pages of email, or draw 1 state machine on my whiteboard and snap a photo of it. Someone should try to run the math on information density of a state diagram vs a verbal or written communication of the same requirements. I would bet there are a few orders of magnitude of delta in there based on my experiences.
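A state diagram like the ones described here can also be kept next to the code as a transition table. The following is only a sketch in Python: the order-handling process, its states, and its events are all invented for illustration and do not come from any real system.

```python
# Hypothetical order-handling process; each entry is one arrow in the
# state diagram: (current state, event) -> next state.
TRANSITIONS = {
    ("draft", "submit"): "pending",
    ("pending", "approve"): "approved",
    ("pending", "reject"): "draft",
    ("approved", "ship"): "shipped",
}


def next_state(state, event):
    """Follow one arrow of the diagram, or raise if no such arrow exists."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def run(events, start="draft"):
    """Apply a sequence of events and return the final state."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state
```

Because every allowed edge is listed in one place, an edge case that nobody thought about shows up as a missing table entry rather than as a long email thread.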

[+] jnxx|6 years ago|reply

What is it that makes these sequence diagrams especially important in your experience?

[+] ChrisSD|6 years ago|reply

I know I'm going against the spirit of this hypothetical but the trouble with good technical documentation is it can quickly become bad technical documentation. It needs constant maintenance over the long term. Every patch that's committed to the code needs to be evaluated for how it impacts the docs. Preferably changes to the docs should be done via a collaboration with the patch author(s) and a documentation expert (aka someone who is good at writing technical documentation and is familiar with the conventions of this specific document).

[+] jnxx|6 years ago|reply

That is clearly something that could (and perhaps should) have been done before. But let's stick to the situation described and accept that it is way too late to do that any more.

[+] mlthoughts2018|6 years ago|reply

Well-documented tests. Why are you testing something? Why do you expect the output to be something? What is the high level purpose of the test?
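One possible shape for such a test, sketched in Python with the standard `unittest` module. The function `apply_discount` and the business rule in the docstring are hypothetical, chosen only to show where the "why" can live.

```python
import unittest


def apply_discount(price_cents, percent):
    """Hypothetical function under test: the discount amount is rounded down."""
    return price_cents - (price_cents * percent) // 100


class DiscountTest(unittest.TestCase):
    def test_discount_amount_is_rounded_down(self):
        """Purpose: pin down how fractional cents are handled.

        Why we test it: prices are integers in cents, so a percentage
        discount rarely divides evenly and some rounding rule is needed.
        Why this output: 15% of 999 is 149.85 cents; the (assumed) rule
        is to round the discount down to 149, leaving 850.
        """
        self.assertEqual(apply_discount(999, 15), 850)
```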

High level breakdown of the code structure. What is each module / package / namespace intended for? Why is it laid out in the certain hierarchy it’s in? If I have a question about some functionality, how do I map it to a subset of the code?

Who are the stakeholders? Who will get mad if this breaks and why? How many systems does this depend on / integrate with? How do I contact those people?

What are the data flow or job flows? Does a build get triggered in CI? Is the code published as a library? Does a web service get deployed? Does a human have to compile something and then rsync it to a production server every Tuesday or else everything crashes? What are all the regular tasks? Why is each one needed, how can a test version be run?

Aside from that, then just the basics:

- all functions (or all exportable functions) have consistently formatted docstrings

- docs are concise most of the time, but overly verbose in any “danger” sections of code that rely on unusual, specific or brittle logic.

- well documented description of project version control. What are all the usual code management tasks someone needs to do, and why? What is your strategy for hotfixes, releases, rollbacks, sharing branches, etc.

[+] slics|6 years ago|reply

As a developer myself for many years, I take the following steps for a new project:

1- Familiarize yourself with the application, by using / testing all its functions

2- Understand the workflows within the user interface and then map each one of the workflows / business processes to the logical code and the logical code to the database calls / tables.

3- Once you have built that mapping (tree structure) then you can tackle each one of those functions without feeling lost in the entire code baseline

4- As you work on each of the workflows / business processes you can start commenting the code in your terms

5- IntelliJ provides functions to conduct dependency mapping of the code, and database tables to make sure you don’t have circular dependencies.

6- Once you have completed the walk through and identified the workflows / functions / business processes you can then put a weight on each of those. Large, Medium, Small size.

7- After completing that sizing, sit down with your manager and help him understand what each one of those means in terms of cost, schedule and performance.

8- If none of the above is done, you will have a really hard time coming up with a way to explain to the manager what it will take, when it will be done, and how many people it will take if they give you a date to finish. Just because someone wants something done in their time, doesn’t mean it can be done in that time.

Last thought: please don’t get discouraged, if you feel stressed and pressured already, you have lost the battle. Do the best you can with the tools you have, even if it’s not someone else’s best.

[+] jnxx|6 years ago|reply

Interesting. That sounds like what is most helpful for a new developer is any information which would help him to do the mapping, to "connect the dots", is this correct?

I am wondering if it would help to have a document which takes the whole list of requirements from the requirements specification, and explains how each requirement is realized in the code. (Or of course, if it is still missing, in which way and where it is missing, and what would need to be done to add it).
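One way such a requirement-to-code mapping could be kept close to the code is sketched below in Python. Everything here is an assumption made for illustration: the `implements` decorator, the `REQ-...` identifiers, and the function names are invented and do not refer to any existing tool.

```python
# Maps a requirement ID to the names of the functions that realize it.
REQUIREMENT_INDEX = {}


def implements(*requirement_ids):
    """Decorator that records which requirements a function realizes."""
    def decorator(func):
        for requirement_id in requirement_ids:
            REQUIREMENT_INDEX.setdefault(requirement_id, []).append(func.__qualname__)
        return func
    return decorator


@implements("REQ-001")
def start_scan():
    """Hypothetical entry point for requirement REQ-001."""


@implements("REQ-001", "REQ-002")
def abort_scan():
    """Hypothetical function that contributes to two requirements."""


def unimplemented(all_requirements):
    """Return the requirements that no function claims to realize yet."""
    return sorted(r for r in all_requirements if r not in REQUIREMENT_INDEX)
```

Calling `unimplemented` with the full list from the requirements specification would then give the "still missing" part of the document, while `REQUIREMENT_INDEX` gives the "where is it realized" part.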

[+] WheelsAtLarge|6 years ago|reply

People hate to update docs, so the majority of the time they're out of date even if they were the best ever at one time. I was never told this, but reading a ball of undocumented code is part of the job, so don't ever expect documentation when you start a job, especially at a startup.

Here are a few things that would make life easier.

- reduce the code's complexity. Some old-time programmers want to show off their skills at the cost of simplicity. Ya, ya you're god's gift to programming but the real skill is to program in a way that's easy to understand. And also saving memory is not your primary goal anymore.

- easy to update

- Summary of the software's function

- Major sections of the code should be titled and explained as briefly as possible in the code.

- Docs should be part of the code review. I had a software package that would fail if the code was updated and the comments had not or the other way around. It was easy to disable but at least it made you think.

- Buy in from the management on keeping docs up to date. Very hard, managers want results not docs

- Regular review of the docs. Again, hard to achieve

- Make sure you don't abuse the manager's trust. If you say you are updating docs, you better be doing that.

[+] piinbinary|6 years ago|reply

Documentation should explain why things are the way they are. With sufficient effort, you can look at a piece of code to figure out what algorithm it implements, but there is no amount of reading that will tell you why that algorithm is the right one to use. There should be a way to trace this "why" all the way up to the business reason.

Documentation should also provide mental shortcuts. Rather than forcing you to figure out what algorithm the code implements, it could just tell you.

It should also tell you things that are surprising, and that you might accidentally miss or misinterpret.

Documentation should tell you information that you can't just as easily glean from the code. A `doTransmorgification()` method doesn't need the comment `// does transmorgification`.
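To make the contrast concrete, here is a small Python sketch. The function `fetch_with_retry` and the behaviour of the "upstream service" mentioned in the comment are hypothetical. The first comment only repeats the code; the second records a reason that no amount of reading the code would reveal.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay_seconds=0.0):
    """Call `fetch` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    # Redundant comment (restates the code): "loop over the attempts".
    #
    # Useful comment (explains why): the upstream service (hypothetical)
    # drops the first request after an idle period, so a single failure
    # is expected and must not be reported to the caller as an error.
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except ConnectionError as error:
            last_error = error
            time.sleep(delay_seconds)
    raise last_error
```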

Documentation shouldn't attempt to tell you everything with one form of documentation. Documentation on a line or method or class is good for one sort of information, but it doesn't cover the overall architecture of the program. That is often best left as a separate piece of documentation.

It should consider the reader. Is the reader looking for an introduction to how to use this library, or are they looking for what property the method has in some edge case?

Documentation should use the easiest to follow, least technical language it can without being inaccurate. It's not an academic paper; it's an explanation to your fellow developer.

Documentation should tell you how the responsibilities are divided up in the code. If you implement an interface, will you need to record metrics or will that be handled for you?

Documentation should cover any processes you need to follow when making certain kinds of changes. If you need to do something in two separate deploys, those steps should be written down.

[+] rglover|6 years ago|reply

1. Where is the data? How does it get from the user to the database (e.g., REST API, GraphQL, SOAP)?

2. What specific technologies are in use (languages, frameworks, platforms, third-party services, etc.)?

3. How does configuration/security work? In other words, where are keys/tokens floating around that I need to make sure don't get into the wrong hands or misplaced?

4. What are the best resources/references for the technologies in use?

In essence, what's the landscape and where are the "breakpoints" that could wreak havoc for customers and the business?

[+] cjfd|6 years ago|reply

A high level overview.

First separate it out in what the different processes are that are running in the working system. How are they communicating? What is their function? Why was it separated in this particular way?

For the more difficult of these programs/processes: what is their rough structure and what function does each part have? Why are they difficult? What strategies were used to conquer this difficulty? What are the most common ways in which they get extended?

How does the build work?

[+] AgentOrange1234|6 years ago|reply

I would wish the documentation fairy would give me a file of asserts that 1. explain (in comments) the invariants the system is supposed to maintain and check them, and 2. explain the protocols that modules communicate with and check that they are being obeyed. And a fast test suite where these asserts are checked.
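A small sketch of what such commented, checked invariants might look like, written in Python. The `RingBuffer` class is a hypothetical example, not taken from any project in this thread. Each invariant is stated in a comment and checked by an assert, and the calling protocol (no push when full, no pop when empty) is checked at the module boundary.

```python
class RingBuffer:
    """Hypothetical fixed-size FIFO used to illustrate documented invariants."""

    def __init__(self, capacity):
        assert capacity >= 1, "capacity must be positive"
        self.capacity = capacity
        self.items = [None] * capacity
        self.head = 0    # index of the oldest stored element
        self.count = 0   # number of stored elements

    def check_invariants(self):
        # Invariant 1: the buffer never holds more than `capacity` elements.
        assert 0 <= self.count <= self.capacity
        # Invariant 2: `head` always points inside the storage array.
        assert 0 <= self.head < self.capacity

    def push(self, item):
        # Protocol: callers must not push into a full buffer.
        assert self.count < self.capacity, "push on full buffer"
        self.items[(self.head + self.count) % self.capacity] = item
        self.count += 1
        self.check_invariants()

    def pop(self):
        # Protocol: callers must not pop from an empty buffer.
        assert self.count > 0, "pop on empty buffer"
        item = self.items[self.head]
        self.head = (self.head + 1) % self.capacity
        self.count -= 1
        self.check_invariants()
        return item
```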

[+] jnxx|6 years ago|reply

This is a very good point, too. Knowing the invariants is extremely important for keeping the code correct when changing it, but, similar to locking and concurrency patterns, they are often implicit and sometimes need to be implemented in many places in the code.

[+] jackcodes|6 years ago|reply

Brevity. Assume your users are intermediate developers and only tell them what they need to know.

The worst documentation I had to parse recently was react-beautiful-dnd, and I’m saying this somewhat conservatively as it’s evident a lot of effort has gone into producing something comprehensive. But I was paralysed by the volume of it, reams and reams of methodology and design decisions, how to contribute, and the history of the project. In the end I had to use one of the community codepen examples to get where I needed. Without meaning to stereotype too heavily I see this mostly in junior friendly React projects where you have more people contributing emojis to markdown than you have architecting your APIs. In case anyone from react-beautiful-dnd/Atlassian reads this, you have all the documentation you need, but make it less like a textbook and more like a two page CV.

The best documentation I’ve seen is spider-gazelle (crystal). It’s the perfect balance of plain-English explanation and example code. It probably helps that it was architected and written by the same person, and they were able to really hone down on exactly the right information to transfer.

Edit; I’ve just realised that your specific example is for an internal enterprise project, rather than a public facing library so the variables you optimise change slightly. I’d be maintaining and developing this software as a contributor, and the documentation is specifically for on-boarding me. I’d want to see more of the architecture rationale, point-in-time thought process, and explanations of business constraints that led to certain decisions being made (e.g. what prevented optimal architecture first time around)

[+] jnxx|6 years ago|reply

> Brevity. Assume your users are intermediate developers and only tell them what they need to know.

Sorry for the interjection. The documentation I have in mind is technical documentation for future developers of a code base (its implementation). Not users of the API of a code base.

Also, it would be possible to document the wrong stuff or stuff that people really already know, but it is basically not possible to document too much, because of the time constraint.

[+] baitman|6 years ago|reply

For internal docs, just like external (user-facing) ones, the key is for them to be focused around what the user needs. In this case, that means things like:

- overall architecture of the system, what lives where

- how it's meant to work, what the main interfaces to users are, and what users expect from them

- common tasks and troubleshooting, eg stuff that often goes wrong

One of the major problems with internal docs is keeping them up to date. So that brings the second key point which is minimalism: don't document everything. Stick to what is genuinely relevant and useful to developers on your project. It's a hard balance to find, but you have to be practical and recognise that if you write something you can't commit to maintaining, it may not be worth writing at all. (it's still hard with user-facing docs, but at least you have releases which you can use as a drive/reminder to update things)

Thirdly, structure really matters. If people can't find things in your docs then they become much less useful. Good information architecture is hard, though. I'm a tech writer and I honestly believe that most engineers can write decent docs, given a bit of training (eg I've given something like this and people have found it useful https://youtu.be/8TD-20Mb_7M) - but working out a clear structure is something that it helps to talk to someone with practice about. If that's not possible, look at products similar to yours and see what their high-level structure is - if it looks good, do something similar, and if it doesn't, that gives you something to avoid.

[+] arkadiytehgraet|6 years ago|reply

Discoverability is at least as important as the documentation content itself.

A lot of really good advice here is given with regards to the content. Obviously, this is very important; however, it is not enough, when there are multiple places of documentation. Keeping at most 2 (code + wiki or something) sources is essential to ensuring that docs are actually read and can be found by others.

In my current company (~3k developers) there are more than 10 different places for documentation. This both makes it difficult to find anything AND discourages developers from writing any docs in the first place.

[+] ka0lin|6 years ago|reply

- Which problem will be solved by the software

- What is the system's context

- How to Build

- How to Run

The rest is in the code. And 80% of all »projects« lack at least one of the points above.

[+] jnxx|6 years ago|reply

> The rest is in the code.

I know that philosophy well, but I do not really agree with it - not at all. For example in concurrent code written in C, the way things are synchronized is normally implicit. You can understand it if you read all of the code, but not by looking at single functions. If you add accesses without proper synchronization, you will quickly get undefined behaviour which can cause extremely nasty concurrency bugs.
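The comment is about C, but the idea of writing the synchronization rules down can be sketched in any language. The following Python sketch uses the standard `threading` module; the `Counter` class and `run_workers` helper are hypothetical. The docstring states which lock guards which data, which is exactly the information that otherwise has to be reconstructed by reading all of the code.

```python
import threading


class Counter:
    """Hypothetical shared counter.

    Synchronization contract (the part that is usually left implicit):
    - `_value` is guarded by `_lock`; never touch it without holding the lock.
    - `increment` and `value` may be called from any thread.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0  # guarded by self._lock

    def increment(self):
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def run_workers(counter, workers=4, increments=1000):
    """Increment `counter` from several threads and return the final value."""
    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value()
```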

I also agree with the several comments made that it is often more helpful to explain "why" something is done than "how". This is, of course, valid for in-code comments as well, but IMO equally important in overview documentation.

> And 80% of all »projects« lack at least one of the points above.

Yeah, I know. In fact, I have rarely seen good technical documentation.

[+] uberswe|6 years ago|reply

Start with any inventions or algorithms that make the software unique, make sure these are documented. After that I would focus on the highest level, document the input handlers, views or maybe controllers. This will give a good overview of the application and cover the most complicated things. If the developer has more time left to document they simply try to go deeper and deeper (more detailed).

[+] jnxx|6 years ago|reply

Good points. For really complex algorithms, I have a high opinion of literate programming.

For more contextual stuff, like, for example, build procedures, I am wondering whether using a Wiki would be better. It is not only easier to keep it up to date, but it also allows cross-linking topics, and this in turn makes it easier to split up information into smaller, easier to find and more digestible units.

Of course, normally a Wiki must be maintained and organized, otherwise it becomes a steaming mess, but if it is a focused effort to write a dozen pages which document a specific area, this is not hard to organize reasonably.

[+] wirthjason|6 years ago|reply

I work at a bank writing financial software: a trade capture system. It’s basically everything that happens after the client agrees on the price.

Knowing the business and thus rules is the most important thing.

We support a global business and Asia regulations are different from Europe, products are managed differently by different traders, etc.

Knowing the business gives context to what the software is doing and why because this kind of software is written by people who didn’t understand the rules in the first place. It isn’t abstracted correctly and there is a ton of if-else branching for edge cases, variable/function/class naming that is inconsistent and often incorrect, etc.

I would also +1 a known bug list. You can spend a lot of time tracking down a “bug” that seems new because it’s an edge case that rarely appears. Each time it comes up the users think it’s new and complain.