ConTeXt is a system for creating high-quality print documents, such as PDF files. It is similar to LaTeX but uses a different syntax. Normally, one creates a ConTeXt document in a text editor and then uses the ConTeXt software to convert it to PDF. But ConTeXt can also be used as a means to convert XML to PDF. Using XSLT, an author can convert an XML document to a ConTeXt document, and then convert that document to a print document.
Using ConTeXt in this way eliminates the need for FO. While FO, an open standard, provides an excellent vocabulary to describe page layout, its open source implementations are lacking. The most complete open source software to convert FO to PDF is the Java utility called FOP, and this program cannot do basic things like center tables or control widow paragraphs. The developers of ConTeXt are developing their own utility to convert FO to print documents, but since their implementation is not complete, it makes sense to use plain ConTeXt to format XML documents. With ConTeXt, you do not face any of the limitations you encounter with FOP. You can create beautiful print documents right now.
I assume you already have ConTeXt installed. If not, download and install the version found at http://www.pragma-ade.com/. You might have to join the mailing list to get everything working right. Once you make it that far, you probably know that you format ConTeXt by issuing commands that begin with a backslash.
\starttext
A one line document.
\stoptext
As mentioned above, we will not create such documents in a text editor, but through an XSLT conversion. Being text-based, ConTeXt does not always lend itself well to a direct conversion from an XML tree. If an innocent blank line from an XML document finds its way into a ConTeXt document, the resulting document will wrongly contain a paragraph division. In addition, certain characters in your XML document have special meaning in ConTeXt, resulting in either a wrong result or a failed run. If your XML document contained an “{”, and you didn’t escape it by putting a backslash in front of it, your resulting ConTeXt document would be invalid.
To get around these problems, I advocate using TeXML. TeXML is a Python utility that converts its own special form of XML into ConTeXt. That means you can use XSLT to convert from one XML tree to another and then let the Python utility do the dirty work of handling white space and escaping special characters.
TeXML uses a very simple XML language. Basically, it represents ConTeXt commands in XML and does little more. One could look at a TeXML document and immediately know what the author meant to express in ConTeXt. In converting an XML document such as TEI to TeXML, one is coming as close as possible to actually converting to ConTeXt itself, without having to worry about white space, and while having the comfort of working with an XML tree. If you use TeXML to convert, you really won’t have to learn a new XML language, since TeXML consists of very few elements. Instead, you will still think in terms of ConTeXt.
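To make the two problems concrete, here is a minimal Python sketch of the clean-up a hand-written conversion would need. The function name escape_for_context and the character list are assumptions made for illustration, and the list is deliberately incomplete: it covers only characters that can be escaped with a backslash.

```python
import re

# Illustrative subset of characters that can be escaped with a backslash.
# Backslash, "^" and "~" need named commands and are deliberately left out.
BACKSLASH_ESCAPABLE = "{}$&#%_"


def escape_for_context(text):
    """Return text that is safer to place in a ConTeXt source file."""
    # Collapse every whitespace run (including blank lines) to one space,
    # so a stray blank line cannot turn into a paragraph break.
    text = re.sub(r"\s+", " ", text)
    # Put a backslash in front of each listed special character.
    return "".join("\\" + ch if ch in BACKSLASH_ESCAPABLE else ch for ch in text)
```

Doing this by hand for every text node is tedious and easy to get wrong, which is why a dedicated tool is attractive.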
If you choose not to use TeXML, you should still find this document valuable, since I always include examples of how to format in both raw ConTeXt and in TeXML.
Here is the above document in TeXML format.
<?xml version="1.0"?>
<TeXML xmlns="http://getfo.sourceforge.net/texml/ns1">
  <env name="text">
    A one line document.
  </env>
</TeXML>
To convert this document to a print format, first convert it to a ConTeXt document.
texml.py -e utf -c infile.xml outfile.tex
Next, convert the ConTeXt file to a print document with your normal ConTeXt command.
texexec outfile.tex
You can also convert your XML file to a print document in one step with the script texml_con.
texml_con infile.xml
This script converts infile.xml to infile.tex and then executes the texexec command on that file. You can use any option with this script that you would use with texexec.
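If you want to chain the two steps yourself, the following Python sketch shows one way to do it. It is not the texml_con script; build_commands and convert are hypothetical helper names, and the sketch assumes texml.py and texexec are both on your PATH and accept the arguments shown earlier.

```python
import subprocess
from pathlib import Path


def build_commands(xml_path):
    """Return the two command lines: TeXML to ConTeXt, then ConTeXt to print."""
    # Derive infile.tex from infile.xml, as the one-step script is described to do.
    tex_path = str(Path(xml_path).with_suffix(".tex"))
    return [
        ["texml.py", "-e", "utf", "-c", xml_path, tex_path],
        ["texexec", tex_path],
    ]


def convert(xml_path):
    """Run both steps in order, stopping at the first failure."""
    for command in build_commands(xml_path):
        # check=True raises CalledProcessError if a step exits with an error.
        subprocess.run(command, check=True)
```

Calling convert("infile.xml") would run both commands in sequence. Keeping the command construction separate from the execution makes it easy to inspect what will be run.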
Elements in the TeXML document are bound to a namespace. Since TeXML does not allow mixing of elements from other namespaces, I will present elements in this document without prefixes, assuming the root element contains the namespace.
Here are the basic concepts you need to know to understand the rest of the document. Don’t worry if you can’t understand the individual ConTeXt code. Just make sure you understand how to issue commands.
Each ConTeXt document must begin with a starttext command and end with a stoptext command. starttext and stoptext are an example of an environment. TeXML codes any start/stop pair with the env element, short for environment. The env element must have the name attribute, which consists of the command name minus the start or stop. If the commands in ConTeXt are starttext and stoptext, the name of the environment is text. If the ConTeXt commands are startnarrower and stopnarrower, the environment’s name is narrower.
Commands start with a backslash and can be followed by setups, which are placed in square brackets, and by the “scope or range of the command,” which is placed in curly brackets. This example creates a simple box with the words “that’s it” inside.
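The naming rule can be written down as a short Python sketch. The helper names env_name and make_env are made up for illustration and are not part of TeXML. As elsewhere in this document, the element is built without a namespace, on the assumption that the root element supplies it.

```python
import xml.etree.ElementTree as ET


def env_name(start_command, stop_command):
    """Derive the env name from a start.../stop... command pair."""
    if not start_command.startswith("start") or not stop_command.startswith("stop"):
        raise ValueError("expected a start.../stop... pair")
    name = start_command[len("start"):]
    if name != stop_command[len("stop"):]:
        raise ValueError("start and stop commands do not match")
    return name


def make_env(name, text):
    """Build an <env name="..."> element holding the given text."""
    env = ET.Element("env", {"name": name})
    env.text = text
    return env
```

For example, env_name("startnarrower", "stopnarrower") gives narrower, and ET.tostring(make_env("text", "A one line document."), encoding="unicode") serializes the body of the one-line document.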
\framed[width=2cm,height=1cm]{that's it}
I will refer to the text in square brackets as an option. The text inside the curly brackets I will call a parameter. In TeXML, this simple fragment looks like this:
<cmd name="framed">
  <opt>width=2cm, height=1cm</opt>
  <parm>that's it</parm>
</cmd>
A single opt element can contain several properties. The opt element above defines both the width and the height. Each opt element produces its own pair of square brackets, so it is easy to slip and write a separate opt element for each property, but don’t, because you will end up with extra square brackets and invalid ConTeXt.
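When generating TeXML from a program, one way to avoid the extra brackets is to join all the properties first and write them into one opt element. The Python sketch below does that; make_cmd is a hypothetical helper, not part of TeXML, and it handles only the simple case of one option group and one parameter.

```python
import xml.etree.ElementTree as ET


def make_cmd(name, properties, parm_text):
    """Build a <cmd> with all properties in ONE <opt> and one <parm>."""
    cmd = ET.Element("cmd", {"name": name})
    # Join every key=value pair into a single opt element,
    # which corresponds to a single pair of square brackets.
    opt = ET.SubElement(cmd, "opt")
    opt.text = ",".join(f"{key}={value}" for key, value in properties.items())
    parm = ET.SubElement(cmd, "parm")
    parm.text = parm_text
    return cmd
```

Calling make_cmd("framed", {"width": "2cm", "height": "1cm"}, "that's it") builds the same structure as the fragment above, with exactly one opt child.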
By default, TeXML puts curly brackets after each command:
<cmd name="par"/>
\par{}
In order to suppress the curly brackets, set the “gr” attribute to “0”.
<cmd name="par" gr="0"/>
\par
By default, TeXML does not insert newlines after or before commands.
That means that this code:
<cmd name="hairline" gr="0"/>
<cmd name="par" gr="0"/>
<cmd name="hairline" gr="0"/>
<cmd name="par" gr="0"/>
Will appear as:
\hairline \par \hairline \par
I don’t think this will ever change the print document. However, if you want commands to appear on their own lines in your ConTeXt document, use the “nl1” and “nl2” attributes. “nl1” forces a newline before the command; “nl2” forces a newline after it.
<!--a newline after-->
<cmd name="par" gr="0" nl2="1"/>
<!--no newlines-->
<cmd name="hairline" gr="0"/>
<!--a newline before and a newline after-->
<cmd name="par" gr="0" nl1="1" nl2="1"/>
<!--no newlines-->
<cmd name="hairline" gr="0"/>
\par
\hairline
\par
\hairline
Often we need to create a definition for some command and then recall that definition. In such a case, make up a name for the definition, and use this same name when recalling the definition. In the example below, myCustomLayout is the arbitrary word that identifies the definition.
\definepapersize[myCustomLayout][width=8.5in, height=11in]
\setuppapersize[myCustomLayout]
<cmd name="definepapersize">
  <opt>myCustomLayout</opt>
  <opt>width=8.5in, height=11in</opt>
</cmd>
<cmd name="setuppapersize">
  <opt>myCustomLayout</opt>
</cmd>
ConTeXt allows many different ways to do the same thing. Much of its code facilitates automatic generation. Page numbers are automatically placed; sections are given numbers; tables of contents and indices can be created automatically; front and back matter are formatted in special ways.
The XML author will need very little of this code. The author uses XSLT to do most of the numbering and to generate such things as tables of contents, and controls the default font size and spacing through XML.
This document will therefore use the ConTeXt code that is simplest and most consistent and will ignore the code meant for ConTeXt authors. When the rules of simplicity and consistency conflict, I will choose consistency. It is better to remember or look up one rule even if this rule requires a few more lines. Extra lines won’t bother us, because we will not have to do much (if any) editing of the ConTeXt document.
The one-line document illustrates the bare minimum code to create a valid ConTeXt document. You need to know a few more commands to understand the next section.
The enableregime command tells ConTeXt what input encoding to expect. Set the option to utf so that ConTeXt can handle any UTF text in your XML document, including utf-8 and utf-16.
\enableregime[utf]
<cmd name="enableregime"> <opt>utf</opt> </cmd>
By default, ConTeXt puts page numbers at the top of each page. In addition, it restarts numbering after each part command, as explained in the next section of this document. To turn off this automatic numbering, set the state option to stop and the way option to bytext.
\setuppagenumbering[state=stop, way=bytext]
<cmd name="setuppagenumbering"> <opt>state=stop, way=bytext</opt> </cmd>
By default, ConTeXt puts part numbers for cross references. We can generate our own section numbers, so we want to turn this off.
\setupreferencing[partnumber=no]
<cmd name="setupreferencing"> <opt>partnumber=no</opt> </cmd>
The input command inputs an external document. It takes no options or parameters. Instead, you put the path to the external file on the same line as the input command. The input command is useful for documenting ConTeXt because it allows for sample documents that just display the relevant commands, uncluttered by pages of text.
Here are two sample documents illustrating some of the examples discussed on this page.
copyright 2005 Paul Henry Tremblay
License: GPL
last updated: 2005-03-23