Topic-based authoring

November 3, 2011 by

Topic-based Authoring is an approach to developing end-user documentation that explicitly identifies and uses topics as a fundamental organising principle.

It is a set of practices, processes, tools, and an organising conceptual framework.

Topics are present in all human communication, but they are often implicit and not utilized as a formal organisational principle.

Contrast “writing to make the sentences and paragraphs clearer”, on one hand, with “writing to make the thinking clearer”, on the other.

The first – writing to make the sentences and paragraphs clearer – involves interaction with words and sentences, which can be explicitly identified and whose relationships can be formally described.

The second – writing to make the thinking clearer – involves interaction with topics.

When topics are implicit there are no terms to describe topics, and no vocabulary to describe their interaction and relationship. Authors can intuitively deal with and structure information without an explicit vocabulary for topics, but capturing and communicating practices and standards, and creating processes around those, is impossible.

Topics become explicit first when they are a participant in the universe of discourse of authors. This can be thought of as weakly topic-based authoring. In this stage authors know what topics are, can see them in existing documentation, and may even use them as an organising principle while planning and writing. However, there is no specialised tooling support for topics, and the topic does not have an existence beyond an organising principle in the mind of the author.

Topics become completely explicit when they become first-class citizens, with a formal existence and participation in documentation workflow from planning to implementation as an organising principle with support from tooling. This is strongly topic-based authoring. In this stage topics are both used as an organising principle, and also have a physical existence in the workflow and toolchain.

Topic-based authoring (either weak or strong) produces the traditional output media expected by users, such as books, articles and help files. However, it can also produce enhanced output such as dynamic websites.

Topic-based authoring is used by companies to aid in information planning and project management, increase automation, decrease maintenance costs, and increase reuse (including multiple output structures from the same content).
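As a rough illustration of what "strongly topic-based" can look like, the sketch below models topics as standalone objects and builds two output structures from the same pool. The `Topic` class, the topic IDs, the placeholder content, and the `assemble` helper are all hypothetical names invented for this example; they do not come from any particular toolchain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """A hypothetical first-class topic: one self-contained unit of content."""
    topic_id: str
    title: str
    body: str


def assemble(topics, order):
    """Build one output structure by listing topic IDs in the desired order."""
    by_id = {t.topic_id: t for t in topics}
    return "\n\n".join(f"{by_id[i].title}\n{by_id[i].body}" for i in order)


# A small pool of topics (contents are placeholders).
pool = [
    Topic("install", "Installing the widget", "Run the installer."),
    Topic("configure", "Configuring the widget", "Edit the settings file."),
    Topic("overview", "What the widget does", "It frobnicates data."),
]

# Two output structures reuse the same topics without copying their content.
user_guide = assemble(pool, ["overview", "install", "configure"])
quick_start = assemble(pool, ["install"])
```

Because each output is only a list of topic IDs, a correction to one topic shows up in every structure that includes it, which is where the reuse and maintenance savings come from.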


The Tao of Topics Part 1

October 29, 2011 by

It’s the 21st century.

In contrast to the vast majority of history, today a significant percentage of humans can read and write. In fact, statistically, the odds of you being one of those humans are 99.9%.

Everyone who reads this blog can read. I’d wager that everyone who can read can also write, but not everyone who can write can necessarily write well.

What does it mean to write well? From a technical writing perspective, it means primarily to write content that is clear, unambiguous, comprehensive, and coherent.

It must be clear – necessary information should be discoverable and not embedded in an “unlikely place”; it must be unambiguous – worded in a way that reduces uncertainty to the greatest extent possible; it must be comprehensive – it contains all the information the user needs to build the complete solution that the technology provides for their problem; and it must be coherent – users should be able to apply the same patterns consistently when using the documentation.

Well-written content might also be engaging, inspiring, or entertaining, but for technical writing these are secondary. The goal of technical writing is to reduce uncertainty in users of a technology. If your documentation is entertaining, but ambiguous, or it is inspiring but has massive gaps in it that leave the user confused, then it’s not well-written from a technical writing perspective. Better flat, dry prose that gets the job done, than inspiring, entertaining prose that doesn’t!


Handwriting is so 19th century, but most of this blog’s readers still know how to do it (you are totally next-gen if you skipped it to go straight to texting). To have a conversation about “writing clearly” from the perspective of handwriting we need to talk about letters. If you don’t know what a letter is, then we have to start with understanding that before we can move on to how to write a letter clearly.

Incidentally, having people write their email address on a survey form or email sign-up list has been a bane of my existence for years. People don’t seem to realise that if I can’t read it, it’s useless. The scribbles that they put down look more like mnemonic aids than actual communication. They certainly don’t reduce my uncertainty!

When it comes to writing clear prose using a keyboard, you don’t need to focus on the formation of the letters, because the machine takes care of that for you. Then come words, where you have to spell them in such a way that others can understand them.

Text speak

Once you get the words right, we’re looking at sentences.

To write well-formed, clear, and unambiguous sentences, you need to understand what nouns, verbs, adjectives, adverbs, and prepositions are. You can write without knowing what these are, and many people do. Some people even manage a passable level of written communication without knowing this explicitly, but to have a conversation about “how to write clearly”, you need to be able to talk about what it is you are writing.

If I say “adverbs generally weaken a sentence” and you know what an adverb is, you’ll see how “adverbs weaken a sentence” is both a test of the idea and a demonstration of it. You’ll also be able to take the idea and apply it to your own writing, eliminating unnecessary adverbs to tighten it up and increase clarity.

My argument is that you can write without explicitly understanding the elements of writing, but writing well, and participating in and benefiting from a conversation about improving writing, requires an explicit understanding of the elements of writing.

Beyond letters with their shape and spatial relationships in writing, we have words, which aggregate these inferior units to form higher-order units of meaning. Beyond words we have sentences with their parts of speech, syntax, and grammar. At all of these stages we want to make sure that everything that needs to be there is, whatever doesn’t need to be there isn’t there, and whatever is there is clear.


Beyond letters, beyond words, beyond sentences in technical communication are topics. Technical communication is the art and science of communicating useful information about systems. Words are the atoms of communication, with sentences as the molecules. The atoms of “useful information” are topics.

Topics are written representations of elements of the mental models that we use to interact with the real world. An expert user of a system has an internal model of the system that she uses to make predictions about how a system will act, and how a system will react. An accurate, complete mental model allows her to accurately make predictions and influence the system to achieve her desired outcomes (or know if this is not in fact possible).

An inaccurate or incomplete model is the cause of uncertainty. Technical communication aims to reduce uncertainty. As users consume technical communication they enhance and refine their internal model. The role of the technical communicator is in many cases to explicate the internal model of a subject matter expert, convert it into transmissible “chunks”, and deliver it to a user, who can then internalise the chunks, reconstruct the model, and rock out like an expert!

Topics are atomic chunks of mental models.

Just as all food can be analyzed in terms of its protein, fat, and carbohydrate content, all technical information can be divided into similar “macronutrient” groups. Both our digestive system and our brain are systems designed to interact with the world, so they have systems that reflect broad categories in the environment.

Just as the food we digest can be divided into macronutrient groups, the information our brain digests can also be divided into macronutrient groups.

Topics are exactly that – ontological categories that exist in the environment, which lead to an organ that processes these categories differently (the brain). Different topic types correspond to different physical mechanisms in the brain.

When we want to have a conversation about writing clearly we can then tackle it at three different levels: writing the glyphs clearly (taken care of by the machines now), writing sentences that are clear and unambiguous, and providing all necessary information in digestible chunks.

Just as some mixtures of food can give you indigestion, some mixtures of topics are indigestible. Think about this: it’s counter-productive to give someone elements of a mental model that rely on other underpinnings before they have those underpinnings. As an example, it’s pointless to talk about writing clear sentences to someone who doesn’t know how to write!

Let’s just skip back to those three levels of writing well:

  • writing the glyphs clearly
  • writing sentences that are clear and unambiguous
  • and providing all necessary information in digestible chunks

We have a progression here from letters / words, to sentences, to topics. Sentences are aggregations of words according to a defined structure that gives rise to sense. Topics are aggregations of sentences according to a defined structure that gives rise to sense.

Let’s look at the last point in a little more detail:

Having a conversation about topics allows us to talk about how we “provide all the necessary information in digestible chunks”.

What is the necessary information? If you’re familiar with technical writing then you know that this depends on the audience. Think about topics as pieces of Lego. The mental model in the mind of the expert user is a fully assembled construction. To communicate this you can deconstruct it into its constituent pieces, and then deliver those pieces to the user, along with the assembly instructions.

Lego kit

A well-executed structured approach like this enables a user with a partially constructed internal model to quickly identify, locate, and consume the missing pieces, which are available as atomic units.

Rather than having to wade through monolithic blocks of interleaved topics, a user can quickly identify and locate the piece that they are missing.

Organising the information in topics makes it possible to provide multiple methods of locating specific information, which I’ll discuss in more depth in a subsequent post dedicated to the topic. For now, suffice it to say that as an atomic unit each topic has a surface area, and surface area is discoverable. Think of the difference between sifting through a box of individual Lego pieces for the 3×2 green block, versus sifting through a collection of randomly joined pieces looking for it.

That’s not to say that approaching information as topics is a reductionist approach that somehow does away with top-down views, progressive disclosure, or overarching narratives – any more than considering sentences in terms of the parts of speech does away with prose.

What an understanding of topics gives us, among other things, is the ability to have a conversation about improving the information content: its coverage, clarity, and coherency for users. And that’s always a good thing.

Stay tuned for part two, where we’ll look at the “macronutrient groups” of topics – Topic Types.

Docbook 5.1 and topic-based authoring

June 4, 2011 by

Docbook 5.1 adds support for topic-based authoring. It’s studiedly neutral in its approach to topic-based authoring, in contrast with DITA’s almost religious zeal for the historical inevitability of the Topicalypse.

Without naming any names, the Docbook Definitive Guide V1.3 states:

One modern school of thought on technical documentation stresses the development of independent units of documentation, often called topics, rather than a single narrative.

It’s clear from this that the Docbook Technical Committee did not arrive at the topic epiphany by contemplating their navel in isolation, or by a blinding flash of light on the road to Damascus. And while no names are mentioned in this introduction to a pragmatic concession to topic-based authoring, Docbook 5.1 incorporates most, if not all of DITA’s features, while maintaining Docbook’s core strength of static linear narrative targeting a book-like output medium.

One thing that really screams: “Come to the mountain!” is the ability to build Docbook 5.1 topic-based output incorporating DITA topics. Just-in-time xsl processing (<transform>) is used to incorporate existing DITA topics in Docbook 5.1 topic assemblies, allowing you to come back to the fold without an expensive conversion of existing content.

Docbook 5.1 lacks the topic-based purity of DITA, implementing a dual (but isolated) static linear narrative / modular topic model, in contrast to DITA’s focus on topics uber alles, and adopting a pragmatic, descriptive approach (“hey, looks like people want to do this”), rather than a religious, prescriptive approach (“this is the one true way!”).

However, it looks to have learned from DITA, including DITA’s shortcomings, and brings its rich semantic model to the table, as well as its installed base of expertise and tooling.

A pragmatic approach is the approach that organisations need to take when moving from a pure static linear narrative approach to a modular topic-based one, so Docbook’s pragmatic compromise, along with its tooling maturity, may sit well with many organisations moving in that direction.

The Language of Marketing

February 9, 2011 by

I was absently staring at a new tube of toothpaste this morning as I washed my hair. You have to look at something, right? This one declared “healthy, whiter teeth for longer”. An image of extremely long (but healthy and white) teeth filled my mind, and was immediately pushed out by the technical writer in me asking “whiter and longer than what, exactly?”

Most marketing slogans give technical writers the screaming heebie-jeebies. Not only do they make spurious and vague claims like ‘more fibre’, ‘less fat’, and ‘20% bigger’ with alarming regularity, but the adjectives! I have no doubt that it is actually possible to sell things with sentences that contain only one adjective. And if they do need more than one, I’m sure a comma wouldn’t kill them. I could rant on the folly of adverbs, too, but that is a whole different article.

Why are marketers such terrible writers?

Because customers expect spin, and spin is easy to write. All you need is a handful of adjectives and a call to action: “The new fruity refreshing Globswoddle Fizz is now available. Experience the heady taste of summer today!”

While yelling at the toothpaste tube in the morning might make us all feel better, it is not likely to turn us into marketers just to help an obviously flailing industry. I finished my marketing degree about three weeks before I decided that the marketing industry was the last place in the world I wanted to work. Eventually, I became a technical writer instead, and discovered that I had inadvertently ended up working in marketing after all. Every word we set to paper is marketing in one way or another. If it is going to be read by a customer, then it needs to sell the product. But the last thing we want to write is spin.

Why are writers such terrible marketers?

Because customers want anything but spin, and while spin is easy to write, spinless marketing is not so easy.

Spin is wanted and welcomed in places where it is expected, like product packaging and on the airwaves. When our customers read technical manuals or help text, they are looking for a solution to a problem. If they were suddenly faced with the empty promises of spin, they would lose faith in the documentation, and possibly the product.

However, brutal honesty is not required, either. Product documentation should not tell customers that the product cannot fulfill their expectations. Every question needs to be anticipated and answered. The documentation must give the customer hope that their problem can be resolved, their task completed, and their sanity retained in the process.

Effective documentation never tells the customer that a product is terrible (even if it is), and it never tells a customer that they are stupid (even if they are). It never makes over-inflated claims of software brilliance, and it never assumes greater-than-average user intelligence.

Somewhere nestled in there is product documentation that shows the product in a positive light, without the hard sell. Sound easy? Like most technical writing, it sounds easy until you actually try to do it. Some tips for getting started with spinless writing:

Kick adverbs, take names.
Adverbs are a big red flag for spin. Be ruthless and cut them all out. If your sentence requires a modifier, consider what you are really trying to say. If it forms part of an instruction or description (‘The widget can be fully removed by …’), reword it to remove the adverb (‘Remove the widget by …’).

Never call anything ‘simple’.
If you tell your users that something is ‘simple’, ‘quick’, or ‘easy’, and the customer struggles with it (for whatever reason), you are essentially telling them that they fail at life. Try not to insult your users.

Mind your adjectives.
Adjectives are fine in their place. Use them only where necessary, though, and try not to use more than one at a time. (‘Locate the red button’ is fine, but avoid ‘Locate the large, shiny, red button’ that is next to the ‘tiny, silver, shiny lever’).

Know your stuff.
If you can’t describe your topic in a single short sentence, you don’t understand it well enough, and it becomes too easy to succumb to spin statements. You need to be able to give succinct and accurate descriptions for each and every component part, as well as the product as a whole. If you are not able to do this, continue to research your product until you can.

Understand the enemy.
As modern humans, we are largely desensitised to advertising, simply because we are so totally immersed in it. Start noticing it. Analyse what language is used, the sentence structure they’ve employed. Work out how you would re-write it to send the same message, but without the spin.

Edit with a knife.
Never say more than you need to.
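
Some of these tips lend themselves to a crude automated check. The sketch below flags words ending in "ly" as possible adverbs, along with 'simple', 'quick', and 'easy' from the second tip. It is a rough heuristic, not a grammar checker: it will flag non-adverbs such as "only" and "supply", and it will miss adverbs such as "very". The function name `flag_spin` and the word list are assumptions made for illustration.

```python
import re

# Words the second tip warns against.
DISCOURAGED = {"simple", "quick", "easy"}


def flag_spin(text):
    """Return (word, reason) pairs for words worth a second look."""
    flags = []
    for word in re.findall(r"[A-Za-z']+", text):
        lower = word.lower()
        if lower in DISCOURAGED:
            flags.append((word, "tells the user the task is easy"))
        elif len(lower) > 3 and lower.endswith("ly"):
            flags.append((word, "possible adverb"))
    return flags


for word, reason in flag_spin("The widget can be fully removed in one easy step."):
    print(f"{word}: {reason}")
```

A check like this only points at candidates; deciding whether a flagged word stays is still the writer's call.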


This article was originally published in Words: A Quarterly Bulletin for Technical Writers and Communicators. Volume 3, Issue 1: February 2011, with the following bio:

Lana Brindley has been playing with technology since that summer in the 80’s when she spent the whole time trying not to be eaten by a grue. She has been writing since she could hold a pencil, and is currently writing technical documentation for Red Hat. Lana holds business degrees in marketing and information systems, and with any luck will have a technical communicators degree by the end of the year. She works from her home in Canberra, Australia, and occasionally leaves the house in order to berate university students and conference goers about passive sentence construction.

This post has been cross-posted to On Writing, Tech, and Other Loquacities

Keeping It Stupidly Simple

July 13, 2010 by

Everyone has heard the old adage about the “KISS Principle: Keep It Simple, Stupid”. Easy to say, easy to remember, but often hard to do. At least, hard to do well. When we simplify our language, it often comes across as patronising, dumbed-down, or just plain rude. So how should Stupid keep it simple, without making it stupidly simple?

Consider the sentence:

“Insert the writable media into the optical disk drive.”

It’s not horribly bad as it stands, but it could be made simpler. Here’s one version:

“Open the disk drawer by pressing the button on the front of the drawer. Place the CD into the tray with the label facing upwards. Close the drawer by pressing the button again. Do not force the drawer closed.”

Well, it’s simpler. We’ve lost some of the more easily-confused terms such as “writable media” and “optical disk drive”, replacing them with more common and regular words. We’ve given more specific instructions about the actual process of performing the task, which can help with understanding, and also give users more information about troubleshooting. This would be great for a manual that is introducing people to computers for the first time.

But what if I were to tell you that this instruction is to go into a Developer’s Guide, that is, a book read and used by software developers? All of a sudden, the new version of this sentence has become horribly patronising. It is safe to assume that a software developer has opened a disk drawer once or twice before, and probably doesn’t need to be given explicit instructions about where to find the button. They probably also understand the terms “writable media” and “optical disk drive”. So we’re back to where we started from. How do we simplify the sentence for this audience without speaking down to the audience?

Think about what the sentence is trying to convey. How would you explain this to someone who is sitting across the table from you? Imagine you have a friend who is a software developer. You go around to their house, and they ask you a question about this product you’re working on the manual for. How would you explain it to them? If they said “what do I do now?” would you respond by handing them a CD and saying “Insert the writable media into the optical disk drive”? Probably not. I can just about guarantee that you would say something more like this:

Put the CD into the disk drive

So there’s your answer. It’s not patronising, it’s not too complicated. It uses terms that everyone is familiar with, and isn’t couched in lengthy words and stuffy language. It gives all the information the user needs, and isn’t drowning in information we can safely assume they already know.

The problem, of course, is that keeping it simple is not always simple. Corporate language is increasingly creeping into the everyday. Keeping it out of technical documentation is becoming increasingly difficult. Of course, if the product you are documenting is called a “Synergy Manipulation Process Leveraging Suite” there’s not much you can do about that. You can, however, ensure that you give information about the product in plain language. Explain what it does (other than leverage synergies!), explain how to use it. Try standing up and reading your text out loud. Try explaining the processes and concepts to a friend and take note of the language you use. Look at each individual word and think “is there a simpler word that I can use here?”. Keep your sentences short and to the point. Avoid repetition unless it is absolutely necessary.

Just yesterday, to give a real-world example, I saw a blog-post titled “Marketing Leaders Should Help Create the Next Generation of Australian Multi-Channel Retail”. Now, I don’t even know what that means (and surely it needs another noun on the end … “retail what”?). I clicked on the link, and read the first sentence, trying to work out if it was something I might be interested in, and saw whole sentences full of nothing but corporate-speak. Needless to say, I didn’t read any more. And therein lies a valuable lesson – write for your audience, but never write for the sake of putting words on paper. Even if your audience is a group of corporate-types in suits, who live and breathe corporate-speak, don’t write an empty document, filled with empty words. Make sure you have something to say, and then say it as simply and as accurately as possible.

The pictured quotes on this page have come courtesy of Andrew Davidson’s wonderful Corporate Gibberish Generator

This blog post has been cross-posted to On Writing, Tech, and Other Loquacities

Syntext Serna 4.2 and DITA support

May 26, 2010 by

Oh, one more thing while I’m at it:

There seems to be a problem with the DITA 1.1 templates included with Syntext Serna 4.2 (the Open Source version, of course). The xml generated by the template contains both xsd and dtd declarations.

For example:

<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd" []>
<concept xmlns:xsi="" id="concept-1" xsi:noNamespaceSchemaLocation="urn:oasis:names:tc:dita:xsd:concept.xsd:1.1">

You can see there a Doctype declaration, which invokes the DTD for validation, followed by an xsd declaration in the xmlns:xsi and xsi:noNamespaceSchemaLocation attributes of the concept element.

The problem with this is that either xsd or dtd should be used as the xml schema, not both. If both are present Xerces will attempt to do both types of validation. When it does DTD validation it will fail, because the attributes that make up the xsd declaration are not declared in the DTD.

Putting both types of validation into the template looks like a bug. If you create a new DITA 1.1 Concept from the template, and then Publish > HTML, it produces the error:

[pipeline] Using XERCES.
[pipeline] [Error] :4:155: Attribute "xmlns:xsi" must be declared for element type "concept".
[pipeline] [Error] :4:155: Attribute "xsi:noNamespaceSchemaLocation" must be declared for element type "concept".

My workaround has been to edit the template files in


and remove the xsd declaration. DITA topics created with the templates then validate fine using DTD validation.
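
The same workaround could be scripted. The sketch below is one possible approach, assuming the templates are plain-text XML and that the only XSD-related attributes are `xmlns:xsi` and `xsi:noNamespaceSchemaLocation`, as in the generated example earlier in this post. The helper name `strip_xsd_declaration` is hypothetical and nothing here is part of Serna itself.

```python
import re

# Matches the two XSD-related attributes seen in the generated template,
# together with the whitespace in front of them.
XSD_ATTRS = re.compile(
    r'\s+(?:xmlns:xsi|xsi:noNamespaceSchemaLocation)\s*=\s*"[^"]*"'
)


def strip_xsd_declaration(xml_text):
    """Return xml_text with the XSD attributes removed; the DOCTYPE is kept."""
    return XSD_ATTRS.sub("", xml_text)


sample = (
    '<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd" []>\n'
    '<concept xmlns:xsi="" id="concept-1" '
    'xsi:noNamespaceSchemaLocation="urn:oasis:names:tc:dita:xsd:concept.xsd:1.1">'
)
cleaned = strip_xsd_declaration(sample)
print(cleaned)
```

It would be wise to back up the template files before rewriting them, since a regular expression knows nothing about XML structure.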

DITA-OT 1.5 and Java version

May 26, 2010 by

I’ve been doing some work with the DITA Open Toolkit 1.5 lately, and got no love from Google on this issue, so here’s my contribution to anyone else who finds it:

If you are trying to build and get an error like:

DITA-OT1.5/build_preprocess.xml:80: java.lang.StringIndexOutOfBoundsException

Then check the version of your Java. I got this error with Java 1.5, and resolved it by switching to Java 1.6.
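
A build wrapper could make this check before invoking the toolkit. The sketch below assumes old-style version strings such as `1.5.0_22`, and treats 1.6 as the minimum only because that is the version that worked in this case. The function names are hypothetical.

```python
import re


def parse_java_version(version_string):
    """Extract (major, minor) from an old-style string such as '1.5.0_22'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised Java version: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def is_new_enough(version_string, minimum=(1, 6)):
    """True if the version is at least the minimum (1.6 by assumption)."""
    return parse_java_version(version_string) >= minimum


print(is_new_enough("1.5.0_22"))  # False
print(is_new_enough("1.6.0_20"))  # True
```

The version string itself would come from `java -version`, which prints to standard error rather than standard output.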

I’ll have something more to post about DITA soon, but I just thought I’d get that out there for anyone else who is scouring Google in vain.

FOSS Training

April 6, 2010 by

I was privileged enough to be able to attend in Wellington in January. While there, I caught Bob Edwards’ and Andrew Tridgell’s talk on “Teaching FOSS at Universities” (video of which can be found here). It intrigued me.

Open source software development is very different to developing software in a more traditional, closed source environment. The aim of the course is to teach students how to go about working within the open source community. It covers the practical aspects of checking out code from a repository, submitting patches, and undergoing code approvals and reviews. It also looks at some of the less tangible aspects, like what’s accepted and expected within the community, the motivation behind project development, and governance. The course also goes into some detail about documentation.

Documentation for open source projects is not quite the known quantity that it can be in many proprietary software environments. I once had a developer I was working with describe it as “we live in the Wild West out here”, and – at least to an extent – he makes a good point. While writing for an open source project may not be as wild and exciting as that sentence makes it sound, it can sometimes be unpredictable and, at times, incredibly frustrating. Frequently, a book has been written and reviewed in preparation for a release, only to find at the last minute that a feature has been pulled from the version, a component has suddenly been renamed, or the graphical interface has had some kind of redesign. All of these things happen to open source writers on a regular basis, and frequently the only solution is to pull an all-nighter, get the changes in, and have the document released on schedule. And that’s only if you were lucky enough to find out about the change with enough time to spare before release!

So how does a writer plan for and write a documentation suite when there’s so much unknown in a project? The answer is – perhaps ironically – to plan ahead. You can’t plan for every contingency, nor should you. But if you have a plan of any description, you’re going to be better off when things start to go wrong. Pin down the details as best you can as far ahead as possible. But don’t leave it there: continue to review and adapt your plan. Keep your ear to the ground, and constantly tweak your schedule and your book to suit. If something comes up in a mailing list about a feature you’ve never heard of, don’t be afraid to ask the question – “Does this need to be documented? Will it be in the next version? Where can I get more source information?”. Another trick is to make sure you build in ‘wiggle room’ to your schedule, in case you suddenly discover a new chapter that needs adding, or a whole section that needs to be changed. If you’re consistently a few days or a week ahead of schedule, then even a substantial change should not throw you too far off balance.

Just like ballet dancers, technical writers need to be disciplined, structured, and organised. But you also need to have grace, poise, tact, and – most importantly – flexibility.

Thanks to Bob and Tridge, I’ll be lecturing the 2010 FOSS course students at the Australian National University later this week. I’ll also be contributing to the textbook that is being developed for the course. True to form, it is being built by and for the open source community, using open source tools (including Publican which has been developed in-house by some of my esteemed colleagues). Watch this space for more information.

Cross-posted to On Writing, Tech, and Other Loquacities

Publican: Not just for beer

March 10, 2010 by


The Publican User Guide is the best source of information on all things Publican.

Publican is a free and open source documentation toolchain that uses DocBook XML.

Wanted: Schedule Monkey

February 24, 2010 by

If you have any open source writing or writing-related opportunities, let us know and we’ll pimp them here.

Love technology? Love a challenge? We have the job for you.

Red Hat, the defining technology company of the 21st century, continues to expand its team to create compelling open source products.

We need someone to monkey around with our schedules. We have a bunch of schedules for our writers, and we need to merge them into one uber-schedule. We’re working with open source tools, so we need someone who isn’t scared of TaskJuggler.

You don’t need millions of years in the industry – you just need to be able to learn new tricks.

This part-time job is based in Brisbane, Australia. Check it out on