NAME
Locale::Po4a::Po - po file manipulation module
SYNOPSIS
use Locale::Po4a::Po;
my $pofile=Locale::Po4a::Po->new();
# Read po file
$pofile->read('file.po');
# Add an entry
$pofile->push('msgid' => 'Hello', 'msgstr' => 'bonjour',
'flags' => "wrap", 'reference'=>'file.c:46');
# Extract a translation
$pofile->gettext("Hello"); # returns 'bonjour'
# Write back to a file
$pofile->write('otherfile.po');
DESCRIPTION
Locale::Po4a::Po is a module that allows you to manipulate message
catalogs. You can load a catalog from a file and write it back to one
(the file extension is often po), build new entries on the fly, or
request the translation of a string.
For a more complete description of message catalogs in the po format
and their use, please refer to the documentation of the gettext
program.
This module is part of the PO4A project, whose objective is to use po
files (originally designed to ease the translation of program messages)
to translate everything, including documentation (man pages, info
manuals), package descriptions, debconf templates, and anything else
that may benefit from this.
OPTIONS ACCEPTED BY THIS MODULE
porefs
This specifies the reference format. It can be one of 'none' to not
produce any reference, 'noline' to not specify the line number, and
'full' to include complete references.
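As a rough illustration of the three values, the sketch below renders a
single reference under each format. The helper name format_reference is
hypothetical and is not part of this module; in particular, how the
module itself drops the line number for 'noline' is an assumption here,
and its actual output may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Locale::Po4a::Po): render one
# reference such as 'file.c:46' according to a porefs-style format.
sub format_reference {
    my ($ref, $format) = @_;
    return ''   if $format eq 'none';   # no reference at all
    return $ref if $format eq 'full';   # file name and line number
    if ($format eq 'noline') {
        # Assumption: "no line number" means removing the trailing
        # ':<digits>' part; the real module may do this differently.
        (my $file = $ref) =~ s/:\d+$//;
        return $file;
    }
    die "unknown reference format '$format'\n";
}

# Show the same reference under each of the three formats.
for my $format (qw(full noline none)) {
    printf "%-6s -> '%s'\n", $format, format_reference('file.c:46', $format);
}
```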
Functions about whole message catalogs
new()
Creates a new message catalog. If an argument is provided, it's the
name of a po file we should load.
read($)
Reads a po file (whose name is given as argument). Previously
existing entries in self are not removed; the new ones are added to
the end of the catalog.
write($)
Writes the current catalog to the given file.
write_if_needed($$)
Like write, but if the PO or POT file already exists, the object
will be written to a temporary file, which is then compared with the
existing file to check whether the update is needed (this avoids
changing a POT file just to update a line reference or the
POT-Creation-Date field).
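The comparison can be pictured as follows. This is only a sketch of the
idea, not the module's code: the helper names normalize_po_text and
update_needed are hypothetical, and the sketch assumes that the creation
date and the line numbers in '#:' reference comments are the only
differences to ignore.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the parts of a PO/POT text that may change
# even though the catalog content itself did not.
sub normalize_po_text {
    my ($text) = @_;
    my @kept;
    for my $line (split /\n/, $text) {
        next if $line =~ /^"POT-Creation-Date:/;   # ignore creation date
        $line =~ s/:\d+//g if $line =~ /^#:/;      # ignore line numbers
        push @kept, $line;
    }
    return join "\n", @kept;
}

# Hypothetical helper: true if the new text differs in a meaningful way.
sub update_needed {
    my ($old_text, $new_text) = @_;
    return normalize_po_text($old_text) ne normalize_po_text($new_text);
}

my $old = <<'EOT';
"POT-Creation-Date: 2004-01-01 10:00+0100\n"
#: file.c:46
msgid "Hello"
msgstr ""
EOT

# Same content; only the date and the line reference moved.
(my $moved = $old) =~ s/file\.c:46/file.c:48/;
$moved =~ s/2004-01-01/2004-02-01/;

# Real change: the msgid itself is different.
(my $changed = $old) =~ s/Hello/Hello world/;

print "moved:   ", (update_needed($old, $moved)   ? "rewrite" : "keep"), "\n";
print "changed: ", (update_needed($old, $changed) ? "rewrite" : "keep"), "\n";
```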
gettextize($$)
This function produces one translated message catalog from two
catalogs, an original and a translation. This process is described
in po4a(7), section Gettextization: how does it work?.
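The core idea, as outlined under the type argument of push(), is to pair
the entries of both catalogs and to refuse to continue when their
structures differ. The sketch below assumes that each entry is a plain
hash with 'msgid' and 'type' keys and that entries are paired by
position; pair_entries is a hypothetical name, and the real function
does considerably more.

```perl
use strict;
use warnings;

# Hypothetical sketch (not the module's code): build translated entries
# by using the original's msgid as msgid and the translation's msgid as
# msgstr, checking that the types match at every position.
sub pair_entries {
    my ($orig, $trans) = @_;
    die "catalogs differ in length\n" unless @$orig == @$trans;
    my @result;
    for my $i (0 .. $#$orig) {
        my ($o, $t) = ($orig->[$i], $trans->[$i]);
        die "structure mismatch at entry $i: $o->{type} vs $t->{type}\n"
            unless $o->{type} eq $t->{type};
        push @result, { msgid => $o->{msgid}, msgstr => $t->{msgid} };
    }
    return \@result;
}

my $orig  = [ { msgid => 'Introduction', type => 'sect1' },
              { msgid => 'Hello',        type => 'p'     } ];
my $trans = [ { msgid => 'Introduction', type => 'sect1' },
              { msgid => 'bonjour',      type => 'p'     } ];

my $merged = pair_entries($orig, $trans);
print "$_->{msgid} => $_->{msgstr}\n" for @$merged;
```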
filter($)
This function extracts a catalog from an existing one. Only the
entries having a reference in the given file will be placed in the
resulting catalog.
This function parses its argument, converts it to a Perl function
definition, evals this definition, and filters the fields for which
this function returns true.
I love perl sometimes ;)
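To make the "reference in the given file" criterion concrete, here is a
minimal sketch on plain hashes. It does not reproduce the parsing and
eval step described above; entries_for_file is a hypothetical helper,
and the entry layout (a 'reference' key holding a space-separated list)
is assumed from the description of push().

```perl
use strict;
use warnings;

# Hypothetical sketch: keep only the entries that have at least one
# reference pointing into $file. A reference looks like 'file.c:46'.
sub entries_for_file {
    my ($entries, $file) = @_;
    my @kept;
    for my $entry (@$entries) {
        my @refs = split ' ', ($entry->{reference} // '');
        # Match the file name exactly, with an optional ':<line>' part.
        my $hits = grep { /^\Q$file\E(?::\d+)?$/ } @refs;
        push @kept, $entry if $hits;
    }
    return \@kept;
}

my $catalog = [
    { msgid => 'Hello', reference => 'file.c:46' },
    { msgid => 'Bye',   reference => 'other.c:3 file.c:90' },
    { msgid => 'Help',  reference => 'other.c:12' },
];

my $subset = entries_for_file($catalog, 'file.c');
print join(', ', map { $_->{msgid} } @$subset), "\n";
```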
to_utf8()
Recodes the po's msgstrs to utf-8. Does nothing if the charset is
not specified in the po file ("CHARSET" value), or if it is already
utf-8 or ascii.
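The rule can be sketched with the core Encode module. The helper
recode_to_utf8 is hypothetical and works on a single byte string; the
exact charset spellings the module recognizes are an assumption.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sketch: recode a byte string from $charset to UTF-8,
# returning it unchanged in the cases where nothing should be done.
sub recode_to_utf8 {
    my ($bytes, $charset) = @_;
    return $bytes if $charset eq 'CHARSET';    # charset never specified
    # Assumption: these spellings cover "already utf-8 or ascii".
    return $bytes if $charset =~ /^(?:utf-?8|(?:us-)?ascii)$/i;
    return encode('UTF-8', decode($charset, $bytes));
}

my $latin1 = "fran\xE7ais";    # 'francais' with a c-cedilla, ISO-8859-1
my $utf8   = recode_to_utf8($latin1, 'ISO-8859-1');

printf "%d bytes before, %d bytes after\n", length($latin1), length($utf8);
```

The c-cedilla takes one byte in ISO-8859-1 and two in UTF-8, which is
why the byte count grows by one.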
Functions to use a message catalog for translations
gettext($%)
Requests the translation of the string given as argument in the
current catalog. The function returns the original (untranslated)
string if the string was not found.
After the string to translate, you can pass a hash of extra
arguments. Here are the valid entries:
wrap
Boolean indicating whether whitespace in the string can be
considered unimportant. If so, the function canonizes the string
before looking for a translation, and wraps the result.
wrapcol
The column at which we should wrap (default: 76).
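What canonizing and wrapping amount to can be sketched with the core
Text::Wrap module. Both helpers below are hypothetical; the module's own
canonization rules may be more elaborate than collapsing runs of
whitespace, and Text::Wrap's notion of the wrap column may differ
slightly from the module's.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Hypothetical helper: treat any run of whitespace as a single space and
# drop leading and trailing whitespace.
sub canonize {
    my ($text) = @_;
    $text =~ s/\s+/ /g;
    $text =~ s/^ | $//g;
    return $text;
}

# Hypothetical helper: canonize, then re-wrap at $wrapcol (default 76).
sub rewrap {
    my ($text, $wrapcol) = @_;
    local $Text::Wrap::columns = $wrapcol // 76;
    return wrap('', '', canonize($text));
}

# Two strings that differ only in whitespace canonize to the same key,
# so they would find the same translation.
my $first  = "Hello,\n   world";
my $second = "Hello, world";
print "same key\n" if canonize($first) eq canonize($second);

my $long = "This sentence is long enough that it has to be broken "
         . "into several lines when the wrap column is small.";
print rewrap($long, 30), "\n";
```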
stats_get()
Returns statistics about the hit ratio of gettext since the last
time that stats_clear() was called. Please note that these are not
the same statistics as those printed by msgfmt --statistics. Here,
the statistics are about recent usage of the po file, while msgfmt
reports the status of the file. Example of use:
# ... some use of the po file to translate stuff ...
my ($percent, $hit, $queries) = $pofile->stats_get();
print "So far, we found translations for $percent\% ($hit of $queries) of strings.\n";
stats_clear()
Clears the statistics about gettext hits.
Functions to build a message catalog
push(%)
Pushes a new entry onto the end of the current catalog. The arguments
should form a hash table. The valid keys are:
msgid
the string in the original language.
msgstr
the translation.
reference
an indication of where this string was found. Example:
file.c:46 (meaning in 'file.c' at line 46). It can be a space-
separated list in case of multiple occurrences.
comment
a comment added here manually (by the translators). The format
here is free.
automatic
a comment which was automatically added by the string
extraction program. See the --add-comments option of the
xgettext program for more information.
flags
space-separated list of all defined flags for this entry.
Valid flags are: c-text, python-text, lisp-text, elisp-text,
librep-text, smalltalk-text, java-text, awk-text, object-
pascal-text, ycp-text, tcl-text, wrap, no-wrap and fuzzy.
See the gettext documentation for their meaning.
type
This is mostly an internal argument: it is used while
gettextizing documents. The idea here is to parse both the
original and the translation into a po object, and merge them,
using one's msgid as msgid and the other's msgid as msgstr. To
make sure that this goes well, each msgid in the po objects is
given a type, based on its structure (like "chapt", "sect1",
"p" and so on in docbook). If the types of the strings are not
the same, the two files do not share the same structure, and
the process reports an error.
This information is written as an automatic comment in the po
file, since it gives translators some context about the strings
to translate.
wrap
Boolean indicating whether whitespace can be mangled in
cosmetic reformatting. If true, the string is canonized before
use.
This information is written to the po file using the 'wrap' or
'no-wrap' flag.
wrapcol
The column at which we should wrap (default: 76).
This information is not written to the po file.
Miscellaneous functions
count_entries()
Returns the number of entries in the catalog (without the header).
count_entries_doc()
Returns the number of entries in the document. If a string appears
multiple times in the document, it will be counted multiple times.
msgid($)
Returns the msgid of the entry with the given number.
msgid_doc($)
Returns the msgid with the given position in the document.
get_charset()
Returns the character set specified in the po header. If it hasn't
been set, it will return "CHARSET".
set_charset($)
This sets the character set of the po header to the value specified
in its first argument. If you never call this function (and no file
with a specified character set is read), the value is left at its
default, "CHARSET". This value doesn't change the behavior of this
module; it is only used to fill that field in the header, and to
return it in get_charset().
AUTHORS
Denis Barbier <barbier@linuxfr.org>
Martin Quinson (mquinson#debian.org)