NAME
stx2any - converter from structured text to multiple formats
SYNOPSIS
stx2any [ -T format ] [ stx and m4 options ] [ file file ... ]
DESCRIPTION
stx2any converts files in structured text (Stx) format into other
formats. Formats currently implemented are HTML, man, raw text,
PostScript, LaTeX, XHTML and DocBook XML.
The source format, structured text, is a kind of plain text format with
standard markup for representing headings, lists, emphasis etc. The
markup is both quicker to write and easier to remember than
conventional tag-based markup languages, and is beautifully legible
also in source form. Stx markup is explained in more detail in the Stx
quickie guide, which is available in the examples directory.
Most of the conversion happens in m4, and you can define your own
macros and other stuff for giving structure to your documents. stx2any
provides a LaTeX-like extensible environment system and a diversion
system for rearranging input. (Tårta på tårta, “cake on cake”, as they
say in Swedish.)
Because stx2any doesn’t perform any kind of quoting on the input,
markup that isn’t available can be written directly in the destination
language (losing convertibility to multiple languages). This way, if
you are only interested in one output format (e.g. LaTeX), you can use
Stx as an abbreviation format for the most common constructs.
Some formatting is not available as abbreviations, only through m4
macro calls. You need macros relatively rarely: for example, floats
(material that can “float” around in the document) are created by
macros.
OPTIONS
stx2any accepts all command line options of m4, passing them directly
on. Of these, the -D argument is important enough to mention here
separately.
-DNAME=VALUE
Define macro NAME to have the expansion VALUE. This allows
you to pass information into the document from the command
line.
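As a rough sketch of how -D might be used (the macro name DOCVERSION and
the file names are hypothetical, chosen only for illustration), a value
given on the command line can stand in for a word in the source text:

```shell
#!/bin/sh
# Hypothetical input file: the word DOCVERSION is meant to be replaced by m4.
printf 'These are the release notes for version DOCVERSION.\n' > notes.stx

if command -v stx2any >/dev/null 2>&1; then
    # -D is passed on to m4, which defines DOCVERSION with the expansion 1.4.
    stx2any -T html -DDOCVERSION=1.4 notes.stx > notes.html
else
    echo "stx2any is not installed; skipping the conversion" >&2
fi
```

Keeping the macro name free of underscores avoids any interaction with
the --quote-me-harder option described below.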
-T format
Set the output format. The default format is html. format
should be one of:
html produces basic HTML (hypertext markup language) output.
man produces man macro output. This output is usable as a man
page directly (although see WRITING MAN PAGES below), or
can be fed to troff / groff for formatting to e.g.
PostScript.
latex produces LaTeX document preparation language output. You
can run latex on the result to produce e.g. high quality
PDFs.
text produces raw text output by postprocessing HTML output
with w3m. The result is very basic, much as if most Stx
markup had simply been stripped away; if you want more
formatted output, consider piping man output to nroff -man.
ps produces simple PostScript output by postprocessing man
output with groff. If you want to do real publishing,
consider the LaTeX format instead.
xhtml produces XHTML output by postprocessing HTML output with
W3C tidy. By the way, check
http://hixie.ch/advocacy/xhtml for discussion about HTML
and XHTML.
docbook-xml
produces rudimentary DocBook XML output. See BUGS below
for more discussion about this.
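The following sketch shows two of the conversions suggested in the
format list: man output piped through nroff -man, and LaTeX output
written to a file. The file names are hypothetical, and the commands are
only run if stx2any is found on the system.

```shell
#!/bin/sh
# Hypothetical input file with a single line of plain text.
printf 'Just one paragraph of plain text.\n' > doc.stx

if command -v stx2any >/dev/null 2>&1; then
    # Formatted plain text: man output piped through nroff -man.
    stx2any -T man doc.stx | nroff -man > doc.txt
    # LaTeX source, which can then be run through latex.
    stx2any -T latex doc.stx > doc.tex
else
    echo "stx2any is not installed; skipping the conversions" >&2
fi
```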
--link-abbrevs
Enable link abbreviation syntax. Note that because
link abbreviation processing occurs in two phases, it does not
work fully when the input comes from standard input (for
example, if you use stx2any as a middle part of a pipeline).
--quote
Request quoting of characters (other than underscores and
dollar signs) that are somehow magical in the requested
output format. This will make it quite difficult to put
markup in the output format directly in your document, but
will greatly increase the likelihood that your document will
be correct (i.e. free of syntax errors) in the output
format.
--quote-me-harder
Request quoting of underscores and dollar signs. This might
make some LaTeX documents work but might break some documents
where underscores are used in macro names or dollar signs in
macro definitions.
--numbering { on | off }
Request numbering of section headings. The default varies by
output format: section numbering is by default off for HTML,
DocBook XML and man, on for LaTeX.
--table-of-contents { on | off }
Request producing a table of contents from the headings. The
default is to produce a TOC when numbering is on. Not
implemented for DocBook XML.
--make-title { on | off }
Request a “title page”. The default is “on”. This setting
does not have any effect in some formats. In HTML, it
produces a big heading at the beginning of the document. In
LaTeX, it produces the canonical maketitle.
--no-template
Do not produce a document template at all, only the formatted
input text. You probably need this if your document will be
included as a part of a bigger document. If that bigger
document is written totally in Stx, however, it will be
cleaner to give all the source files directly as arguments to
stx2any rather than combine the results afterwards.
--symmetric-crossrefs
In document formats that support linking (HTML, DocBook),
produce reverse links from labels to referrers as well as
links from referrers to labels.
--latex-params params
Set the document class parameters for LaTeX documents. The
default is affected by system paper size; for example, on a
European system it is typically a4paper,notitlepage. (See
“ENVIRONMENT” below.)
--html-params params
Set the body tag parameters for HTML documents. The default
is no parameters.
--picture-suffix suffix
Inline images will refer to files with suffix suffix. The
default is png for HTML and DocBook, eps for LaTeX and man.
--no-emdash-separate
In the output, don’t separate em dashes from adjacent text
with spaces. This is in accordance with traditional English
typography (if I understand correctly), but is not standard
in many other languages, including Finnish, my mother
tongue.
--more-secure
Disable some insecure features of m4 and check some command
line arguments that are passed to shell for problematic
characters. This might be desirable if you’ve received the
document from somewhere else and want to make sure it won’t
do anything malicious when converted. Currently this denies
execution of shell escapes.
Note that no implementation of m4 appears to have been designed
with security in mind. As a consequence, this option cannot
prevent every potentially harmful thing. Things I am aware of
that it does not prevent are including the contents of
arbitrary files in the output and writing busy loops (so that
the conversion will use all processor time it can get, until
terminated).
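Since busy loops are not prevented, one possible precaution is to put a
time limit on the conversion. The sketch below assumes that the timeout
utility from GNU coreutils is available; the file names and the
30-second limit are arbitrary choices for illustration.

```shell
#!/bin/sh
# Hypothetical file standing in for a document received from someone else.
printf 'Text from an untrusted source.\n' > untrusted.stx

if command -v stx2any >/dev/null 2>&1 && command -v timeout >/dev/null 2>&1; then
    # --more-secure denies shell escapes; timeout (GNU coreutils) ends the
    # conversion after 30 seconds in case the document contains a busy loop.
    timeout 30 stx2any --more-secure -T html untrusted.stx > untrusted.html
else
    echo "stx2any or timeout is not installed; skipping the conversion" >&2
fi
```

This does nothing about the inclusion of arbitrary files in the output,
so the result should still be inspected before it is published.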
--sed-preprocessor scriptname
Run the sed script scriptname on all input. This allows you
to add custom abbreviation markups. It is almost the same as
preprocessing input with sed, then piping it into stx2any,
but interacts better with --link-abbrevs (see its explanation
for details).
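A minimal sketch of a custom abbreviation follows. The abbreviation
%%NOTE%% and all file names are made up for this example, and because
the replacement is written directly in HTML, the script only makes sense
together with -T html.

```shell
#!/bin/sh
# Hypothetical sed script: expand the made-up abbreviation %%NOTE%% into
# HTML markup.
cat > abbrevs.sed <<'EOF'
s|%%NOTE%%|<strong>Note:</strong>|g
EOF

# The script can be tried on its own before handing it to stx2any.
echo '%%NOTE%% back up your files first.' | sed -f abbrevs.sed

# Hypothetical input document that uses the abbreviation.
printf '%s\n' '%%NOTE%% back up your files first.' > howto.stx

if command -v stx2any >/dev/null 2>&1; then
    stx2any -T html --sed-preprocessor abbrevs.sed howto.stx > howto.html
else
    echo "stx2any is not installed; skipping the conversion" >&2
fi
```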
--version, -V
Just show version information and exit.
--help, -?
Just show a short help message and exit.
WRITING MAN PAGES
Basically, man pages are simply files in the man macro format.
However, there are some programs (first and foremost mandb) that
require parts of man pages to be in a specific format, and man pages
should generally adhere to the standard sectioning and form (see man
(1) and lexgrog (1) for details).
When writing a man page, the title (w_title) of the page should be the
program/file/format/utility name, and you should define the section
(w_section). To make the page suitable for mandb parsing, you should
start the page with one or more calls to w_man_desc. This will create
a proper “NAME” section for you. (Although you could write one by
yourself.)
DIAGNOSTICS
stx2any may give any error message that m4 may give, e.g. on
malformed input (a macro call with a missing closing parenthesis etc).
In addition, it has the following error messages of its own:
unknown output format: “X”
You requested an unsupported output format X with the -T option.
unknown macro “X” called
stx2any encountered a macro beginning with w_, but knows no
definition for it. This is a warning, not an error — the
offending macro and its arguments are stripped from the
output.
environment “X” closed by “Y” in layer N
Environments in stx2any must be properly nested. stx2any
encountered w_end(Y) when it was expecting w_end(X). Often
this is a sign of a forgotten w_end(X).
If N (the layer) is something other than 0, then the problem
is probably in your environment definitions, not at the point
that stx2any was processing when it encountered the error.
unknown environment “X”
There was an attempt to begin an environment whose name is
unknown to stx2any, i.e. no such environment has been
defined.
diversion “X” closed by “Y”
unknown diversion “X”
Same as above, but for diversions (w_begdiv and w_enddiv).
attempt to use “X” in secure environment
You requested secure processing with --more-secure and the
document contained an “insecure” macro. This is a warning
message, not an error — the causing macro is left in the text
verbatim.
unknown cross link to “X”
There was a cross link to document X, but stx2any does not
know about such a document. Probably you didn’t gather X’s
data with gather_stx_titles or you misspelled the document
reference. This is a warning, not an error: the reference
is left in the output verbatim, without any kind of link.
The exit status of stx2any is zero on success and one if there was a
problem.
ENVIRONMENT
PAPERCONF
PAPERSIZE
used for determining the default paper size for LaTeX
documents.
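As an illustration, the paper size might be influenced either through
the environment or explicitly with --latex-params. The sketch assumes
that PAPERSIZE holds a paper name such as letter; the file names are
hypothetical.

```shell
#!/bin/sh
# Hypothetical input file.
printf 'A short document.\n' > paper.stx

if command -v stx2any >/dev/null 2>&1; then
    # Assumption: PAPERSIZE holds a paper name such as "letter" or "a4".
    PAPERSIZE=letter stx2any -T latex paper.stx > paper-letter.tex
    # Alternatively, set the document class parameters explicitly.
    stx2any -T latex --latex-params a4paper,notitlepage paper.stx > paper-a4.tex
else
    echo "stx2any is not installed; skipping the conversions" >&2
fi
```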
FILES
/etc/papersize
used for determining the default paper size for LaTeX
documents.
/usr/share/stx2any/common
directory for the definitions shared by all formats
/usr/share/stx2any/{html,man,latex,docbook-xml}
directories for output format specific definitions
SEE ALSO
m4 (1), latex (1), groff (1), lexgrog (1), w3m (1), strip_stx (1),
gather_stx_titles (1), html2stx (1), extract_usage_from_stx (1)
Stx quickie guide (/usr/share/doc/stx2any/Stx-doc.txt)
Stx markup reference (/usr/share/doc/stx2any/Stx-ref.txt)
BUGS
The structured text format is not yet fully standardised. There are
some corner cases where it is unclear what the result of the formatting
should be. In these cases, the output of stx2any is authoritative, so
it cannot have bugs :)
Some old versions of GNU libc seem to be abysmally slow on some
instances of the emphasis regexps. It would be possible to make the
regexps faster and less correct, but as newer versions of GNU libc and
BSD libc seem to work OK in these cases, I guess it’s not worth it.
The --more-secure switch is not really very secure for reasons
explained above.
The support for DocBook XML sucks. It is only included because someone
will show up anyway and ask, “hey, does it support DocBook XML?”
Partly this sucking is due to my laziness, but partly it is because of
the nature of DocBook. For instance, stx2any will transform literal
formatting into DocBook Literal elements, but the point of using
DocBook is to convey more information than that — whether it is some
ComputerOutput, UserInput, EnVar, or Application, or... and the result
is still very abstract, not actually meant for humans to read but
rather for computers to process into something readable. Now the truth
is that I doubt you will ever come up with a DSSSL stylesheet whose
output outperforms LaTeX (for publishing on paper) or direct conversion
to HTML (for publishing on the web).
The only sensible reasons I can think of for using Stx as a DocBook
frontend are:
1. the ability to use both DocBook constructs and Stx
abbreviations
2. if you have to write DocBook for some interesting reason (your
boss told you so) but don’t want to learn it
3. you happen to already have infrastructure for processing
DocBook documents, and you want to take advantage of it
AUTHOR
This page was written by Panu A. Kalliokoski.