NAME

       Locale::Po4a::Sgml - Convert sgml documents from/to PO files

DESCRIPTION

       The goal of the po4a (PO for anything) project is to ease translation
       (and, more interestingly, the maintenance of translations) by using
       gettext tools in areas where they were not expected, such as
       documentation.

       Locale::Po4a::Sgml is a module to help the translation of documentation
       in the SGML format into other [human] languages.

       This module uses nsgmls to parse the SGML files. Make sure it is
       installed.  Also make sure that the DTDs of the SGML files are
       installed on the system.
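
       As a minimal sketch of the installation check mentioned above, the
       following Python snippet looks for the parser on the PATH. The
       function name find_nsgmls is hypothetical and is not part of po4a;
       the snippet does not check that the DTDs are installed.

```python
import shutil


def find_nsgmls(program="nsgmls"):
    """Return the full path of the parser if it is on the PATH, else None.

    Hypothetical helper for illustration; po4a does not provide it.
    """
    return shutil.which(program)


if __name__ == "__main__":
    path = find_nsgmls()
    if path is None:
        print("nsgmls was not found on the PATH; install it first")
    else:
        print("nsgmls found at", path)
```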

OPTIONS ACCEPTED BY THIS MODULE

       debug
           Space separated list of keywords indicating which part you want to
           debug. Possible values are: tag, generic, entities and refs.

       verbose
           Give more information about what's going on.

       translate
           Space separated list of extra tags (besides those provided by the
           DTD) whose content should form an extra msgid.

       section
           Space separated list of extra tags (besides those provided by the
           DTD) containing other tags, some of them being of category
           'translate'.

       indent
           Space separated list of tags which increase the indentation level.

       verbatim
           The layout within those tags should not be changed. The paragraph
           won't get wrapped, and no extra indentation space or new line will
           be added for cosmetic purposes.

       empty
           Tags not needing to be closed.

       ignore
           Tags ignored and considered as plain character data by po4a. That
           is to say that they can be part of an msgid. For example, <b> is a
           good candidate for this category since putting it in the translate
           section would create msgids that are not whole sentences, which is
           bad.

       attributes
           A space separated list of attributes that need to be translated.
           You can specify the attributes by their name (for example, "lang"),
           but you can also prefix the name with a tag hierarchy, to specify
           that the attribute will only be translated when it is inside the
           specified tag. For example: <bbb><aaa>lang specifies that the lang
           attribute will only be translated if it is in an <aaa> tag, which
           is in a <bbb> tag.  The tag names are actually regular expressions
           so you can also write things like <aaa|bbb>lang to only translate
           lang attributes that are in an <aaa> or a <bbb> tag.

       qualify
           A space separated list of attributes for which the translation must
           be qualified by the attribute name. Note that this setting
           automatically adds the given attribute into the 'attributes' list
           too.

       force
           Proceed even if the DTD is unknown or if nsgmls finds errors in the
           input file.

       include-all
           By default, msgids containing only one entity (like '&version;')
           are skipped for the translator's comfort. Activating this option
           prevents this optimisation. It can be useful if the document
           contains a construction like "<title>&Aacute;</title>", although
           I doubt such things ever happen.

       ignore-inclusion
           Space separated list of entities that won't be inlined.  Use this
           option with caution: it may cause nsgmls (used internally) to add
           tags and render the output document invalid.
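
       To illustrate the tag hierarchy syntax of the attributes option
       described above, here is a minimal Python sketch of how a
       specification such as <bbb><aaa>lang could be matched against the
       tags enclosing an attribute. This is not the module's code. The
       function names are hypothetical, and two points the description
       leaves open are assumptions here: the hierarchy is compared with
       the innermost enclosing tags, and each regular expression must
       match the whole tag name.

```python
import re


def parse_attribute_spec(spec):
    """Split '<bbb><aaa>lang' into (['bbb', 'aaa'], 'lang')."""
    tag_patterns = re.findall(r"<([^>]+)>", spec)
    attribute_name = re.sub(r"<[^>]+>", "", spec)
    return tag_patterns, attribute_name


def attribute_is_translated(spec, element_path, attribute):
    """Tell whether `attribute` is selected by `spec`.

    `element_path` lists the enclosing tags, outermost first, ending with
    the tag that carries the attribute.
    """
    tag_patterns, attribute_name = parse_attribute_spec(spec)
    if attribute != attribute_name:
        return False
    if len(tag_patterns) > len(element_path):
        return False
    # Assumption: the hierarchy must match the innermost enclosing tags.
    start = len(element_path) - len(tag_patterns)
    enclosing = element_path[start:]
    # Assumption: each pattern must match the whole tag name.
    return all(
        re.fullmatch(pattern, tag) is not None
        for pattern, tag in zip(tag_patterns, enclosing)
    )


if __name__ == "__main__":
    print(attribute_is_translated("<bbb><aaa>lang", ["doc", "bbb", "aaa"], "lang"))
    print(attribute_is_translated("<aaa|bbb>lang", ["doc", "ccc"], "lang"))
```

       With these assumptions, a bare name such as "lang" selects the
       attribute in any tag, while <aaa|bbb>lang selects it only in an
       <aaa> or a <bbb> tag.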

STATUS OF THIS MODULE

       The result is perfect. I.e., the generated documents are exactly the
       same. But there are still some problems:

       o The error output of nsgmls is redirected to /dev/null, which is
         clearly bad. I don't know how to avoid that.

         The problem is that I have to "protect" the conditional inclusions
         (i.e., the "<! [ %foo [" and "]]>" stuff) from nsgmls. Otherwise
         nsgmls eats them, and I don't know how to restore them in the final
         document. To prevent that, I rewrite them to "{PO4A-beg-foo}" and
         "{PO4A-end}".

         The problem with this is that the "{PO4A-end}" and such I add are
         invalid in the document (not in a <p> tag or so).

         Everything works well with nsgmls's output redirected that way, but
         it will prevent us from detecting that the document is badly
         formatted.

       o It works only with the DebianDoc and DocBook DTDs. Adding support
         for a new DTD should be very easy. The mechanism is the same for
         every DTD; you just have to give a list of the existing tags and
         some of their characteristics.

         I agree, this needs some more documentation, but the module is
         still considered beta, and I hate to document stuff which may/will
         change.

       o Warning, support for DTDs is quite experimental. I did not read any
         reference manual to find the definition of every tag. I added tag
         definitions to the module until it worked for some documents I
         found on the net. If your document uses more tags than mine, it
         won't work. But as I said above, fixing that should be quite easy.

         I tested DocBook against the SAG (System Administrator Guide) only,
         but this document is quite big, and should use most of the DocBook
         features.

         For DebianDoc, I tested some of the manuals from the DDP, but not
         all of them yet.

       o In case of file inclusion, the source references of messages in PO
         files (i.e., lines like "#: en/titletoc.sgml:9460") will be wrong.

         This is because I preprocess the file to protect the conditional
         inclusions (i.e., the "<! [ %foo [" and "]]>" stuff) and some
         entities (like &version;) from nsgmls, because I want them to
         appear verbatim in the generated document. To do that, I make a
         temporary copy of the input file and make all the changes I want
         to this copy before passing it to nsgmls for parsing.

         For this to work, I replace the entities asking for a file
         inclusion with the content of the given file (so that I can also
         protect what needs to be protected in the subfile). But nothing is
         done so far to correct the references (i.e., filename and line
         number) afterward. I'm not sure what the best thing to do is.
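
       The protection of conditional inclusions described in the list above
       can be sketched as follows. This is an illustration in Python, not
       the module's code: only the "{PO4A-beg-foo}" and "{PO4A-end}"
       placeholders come from the text above, while the function names and
       the exact regular expression are assumptions. The restore step
       writes the opening delimiter in one canonical spelling, so the
       original spacing is not preserved by this sketch.

```python
import re

# Opening of a conditional inclusion, such as "<![ %foo; [" or "<! [ %foo [".
OPEN_INCLUSION = re.compile(r"<!\s*\[\s*%([\w.-]+);?\s*\[")
PLACEHOLDER = re.compile(r"\{PO4A-beg-([\w.-]+)\}")


def protect(text):
    """Hide conditional inclusions behind placeholders before parsing."""
    text = OPEN_INCLUSION.sub(lambda m: "{PO4A-beg-%s}" % m.group(1), text)
    # Simplification: every "]]>" is rewritten, including any that would
    # close a CDATA section.
    return text.replace("]]>", "{PO4A-end}")


def restore(text):
    """Put the conditional inclusions back, in canonical spelling."""
    text = PLACEHOLDER.sub(lambda m: "<![ %%%s; [" % m.group(1), text)
    return text.replace("{PO4A-end}", "]]>")


if __name__ == "__main__":
    source = "<![ %foo; [\n<p>Optional text</p>\n]]>"
    hidden = protect(source)
    print(hidden)
    print(restore(hidden) == source)
```

       Because the placeholders are plain text outside of any tag, a
       validating parser reports them as errors, which is the reason given
       above for discarding the error output of nsgmls.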

AUTHORS

       This module is an adapted version of sgmlspl (SGML postprocessor for
       the SGMLS and NSGMLS parsers) which was:

        Copyright (c) 1995 by David Megginson <dmeggins@aix1.uottawa.ca>

       The adaptation for po4a was done by:

        Denis Barbier <barbier@linuxfr.org>
        Martin Quinson (mquinson#debian.org)

COPYRIGHT AND LICENSE

        Copyright (c) 1995 by David Megginson <dmeggins@aix1.uottawa.ca>
        Copyright 2002, 2003, 2004, 2005 by SPI, inc.

       This program is free software; you may redistribute it and/or modify it
       under the terms of GPL (see the COPYING file).