The Crash Course to DocBook

3.2. The Structure of a DocBook File

The tags covered in this section are listed below.

book - Book
article - Article
refentry - Equivalent of a man page
chapter - Chapter of a book
sect1 ... sect5 - Sections and subsections of a chapter
title - Text of a heading or the title of a block-oriented element
para - Paragraph

Example 3-1. Chapters and sections


<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
 
<book id="hello-world" lang="en">
 
<bookinfo>
<title>Hello, world</title>
</bookinfo>
 
<chapter id="introduction">
<title>Introduction</title>
 
<para>This is the introduction. It has two sections.</para>
 
<sect1 id="about-this-book">
<title>About this book</title>

<para>This is my first DocBook file.</para>

</sect1>

<sect1 id="work-in-progress">
<title>Warning</title>

<para>This is still under construction.</para>

</sect1>

</chapter>
</book>

The example above shows a skeleton of the structural tags. The first line is the XML declaration. The second is the document type declaration, which names the DTD used to process the document (here, DocBook XML version 4.4). This is described in more detail in the Document Type Declaration section.

Next comes the root element, which is <book> here. You can also use <article>, which is more lightweight than <book>, or <refentry>, which is the equivalent of a UNIX man page.
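
For comparison, here is a minimal sketch of a document that uses <refentry> as its root. The command name, ids and text are hypothetical, and the <refnamediv>, <refname>, <refpurpose> and <refsect1> tags are not covered in this section; they appear only because a reference page needs a name and at least one section.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">

<!-- Hypothetical reference page for an imaginary "hello" command -->
<refentry id="hello" lang="en">

<refnamediv>
<refname>hello</refname>
<refpurpose>print a friendly greeting</refpurpose>
</refnamediv>

<refsect1 id="hello-description">
<title>Description</title>

<para>hello prints a greeting and exits.</para>

</refsect1>

</refentry>
```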

Note the lang attribute on the <book> tag. Always set it, so that the language in which the document is written is easy to determine.

After the <book> tag comes the meta information for the document which is encapsulated within the <bookinfo> tag. This information will be described in more detail in the Meta Information section.

Then come the chapters of your book, which may contain one or more section tags (<sect1> - <sect5>). Human-readable (not numerical) ID attributes for <chapter> and <sect> tags are required for two reasons:

Chapters and sections must contain at least a <title> and some content, in the simplest case a single (possibly empty) <para> tag. Where elements can, cannot or must occur is defined by the DocBook DTD and is covered in detail by the Reference Guide.
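
The following fragment sketches these rules for a chapter with nested sections. The ids, titles and text are hypothetical. Note that the paragraphs of a section come before its subsections, and that a <sect2> only appears inside a <sect1>.

```xml
<!-- Fragment: belongs inside a <book>; ids and titles are hypothetical -->
<chapter id="getting-started">
<title>Getting started</title>

<para>A chapter needs a title and some content.</para>

<sect1 id="installation">
<title>Installation</title>

<para>Paragraphs come before any nested subsections.</para>

<sect2 id="installation-from-source">
<title>Installing from source</title>

<para>A sect2 can in turn hold sect3 elements, down to sect5.</para>

</sect2>

</sect1>

</chapter>
```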

Text content in DocBook goes inside <para> tags. The <para> tag is very similar to the <p> tag in HTML and LinuxDoc, except that it must always be closed with </para>. Wherever text can be broken into several paragraphs (in a list item, for example), the text has to be enclosed in <para> tags.
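
As an illustration, the fragment below shows list items whose text is wrapped in <para> tags. The <itemizedlist> and <listitem> tags are not covered in this section and are used here only as an example of a place where paragraphs are needed; the text is made up.

```xml
<!-- Fragment: belongs inside a chapter or section -->
<itemizedlist>
<listitem>
<para>First paragraph of the list item.</para>
<para>Second paragraph of the same list item.</para>
</listitem>
<listitem>
<para>Even a one-line item is wrapped in para tags.</para>
</listitem>
</itemizedlist>
```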

Let's summarise and extend what we have seen so far. A book will be structured in the following way:


    book
      meta information
      chapter
        sect1
          sect2
        sect1
      chapter
        sect1
      appendix
        sect1
      appendix
        sect1
        ...
      glossary
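
The outline above could be written as the following skeleton. This is only a sketch: the ids, titles and text are hypothetical, and the tags inside <glossary> (<glossentry>, <glossterm>, <glossdef>) are not covered in this section.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">

<book id="sample-book" lang="en">

<!-- Meta information -->
<bookinfo>
<title>Sample book</title>
</bookinfo>

<chapter id="first-chapter">
<title>First chapter</title>
<sect1 id="first-section">
<title>First section</title>
<para>Text of the first section.</para>
</sect1>
</chapter>

<appendix id="first-appendix">
<title>First appendix</title>
<sect1 id="appendix-section">
<title>Appendix section</title>
<para>Text of the appendix section.</para>
</sect1>
</appendix>

<glossary id="glossary">
<title>Glossary</title>
<glossentry id="gloss-dtd">
<glossterm>DTD</glossterm>
<glossdef>
<para>Document Type Definition.</para>
</glossdef>
</glossentry>
</glossary>

</book>
```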

An article will be structured in the following way:


    article
      meta information
      sect1
      sect1
        sect2
      sect1
      ...
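
The article outline could be sketched in the same way. The ids, titles and text are hypothetical; <articleinfo> plays the role that <bookinfo> plays in a book.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">

<article id="sample-article" lang="en">

<!-- Meta information -->
<articleinfo>
<title>Sample article</title>
</articleinfo>

<sect1 id="overview">
<title>Overview</title>
<para>An article holds sections directly, without chapters.</para>
</sect1>

<sect1 id="details">
<title>Details</title>
<para>Paragraphs come before nested subsections.</para>

<sect2 id="details-first-topic">
<title>First topic</title>
<para>Text of the subsection.</para>
</sect2>

</sect1>

</article>
```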