668 lines
26 KiB
HTML
668 lines
26 KiB
HTML
|
<HTML>
|
||
|
<HEAD>
|
||
|
<!-- This HTML file has been created by texi2html 1.54
|
||
|
from gettext.texi on 25 January 1999 -->
|
||
|
|
||
|
<TITLE>GNU gettext utilities - PO Files and PO Mode Basics</TITLE>
|
||
|
<link href="gettext_3.html" rel=Next>
|
||
|
<link href="gettext_1.html" rel=Previous>
|
||
|
<link href="gettext_toc.html" rel=ToC>
|
||
|
|
||
|
</HEAD>
|
||
|
<BODY>
|
||
|
<p>Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_1.html">previous</A>, <A HREF="gettext_3.html">next</A>, <A HREF="gettext_12.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
|
||
|
<P><HR><P>
|
||
|
|
||
|
|
||
|
<H1><A NAME="SEC7" HREF="gettext_toc.html#TOC7">PO Files and PO Mode Basics</A></H1>
|
||
|
|
||
|
<P>
|
||
|
The GNU <CODE>gettext</CODE> toolset helps programmers and translators
|
||
|
at producing, updating and using translation files, mainly those
|
||
|
PO files which are textual, editable files. This chapter stresses
|
||
|
the format of PO files, and contains a PO mode starter. PO mode
|
||
|
description is spread throughout this manual instead of being concentrated
|
||
|
in one place. Here we present only the basics of PO mode.
|
||
|
|
||
|
</P>
|
||
|
|
||
|
|
||
|
|
||
|
<H2><A NAME="SEC8" HREF="gettext_toc.html#TOC8">Completing GNU <CODE>gettext</CODE> Installation</A></H2>
|
||
|
|
||
|
<P>
|
||
|
Once you have received, unpacked, configured and compiled the GNU
|
||
|
<CODE>gettext</CODE> distribution, the <SAMP>`make install'</SAMP> command puts in
|
||
|
place the programs <CODE>xgettext</CODE>, <CODE>msgfmt</CODE>, <CODE>gettext</CODE>, and
|
||
|
<CODE>msgmerge</CODE>, as well as their available message catalogs. To
|
||
|
top off a comfortable installation, you might also want to make the
|
||
|
PO mode available to your GNU Emacs users.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
During the installation of the PO mode, you might want modify your
|
||
|
file <TT>`.emacs'</TT>, once and for all, so it contains a few lines looking
|
||
|
like:
|
||
|
|
||
|
</P>
|
||
|
|
||
|
<PRE>
|
||
|
(setq auto-mode-alist
|
||
|
(cons '("\\.po[tx]?\\'\\|\\.po\\." . po-mode) auto-mode-alist))
|
||
|
(autoload 'po-mode "po-mode")
|
||
|
</PRE>
|
||
|
|
||
|
<P>
|
||
|
Later, whenever you edit some <TT>`.po'</TT>, <TT>`.pot'</TT> or <TT>`.pox'</TT>
|
||
|
file, or any file having the string <SAMP>`.po.'</SAMP> within its name,
|
||
|
Emacs loads <TT>`po-mode.elc'</TT> (or <TT>`po-mode.el'</TT>) as needed, and
|
||
|
automatically activates PO mode commands for the associated buffer.
|
||
|
The string <EM>PO</EM> appears in the mode line for any buffer for
|
||
|
which PO mode is active. Many PO files may be active at once in a
|
||
|
single Emacs session.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
If you are using Emacs version 20 or better, and have already installed
|
||
|
the appropriate international fonts on your system, you may also manage
|
||
|
for the these fonts to be automatically loaded and used for displaying
|
||
|
the translations on your Emacs screen, whenever necessary. For this to
|
||
|
happen, you might want to add the lines:
|
||
|
|
||
|
</P>
|
||
|
|
||
|
<PRE>
|
||
|
(autoload 'po-find-file-coding-system "po-mode")
|
||
|
(modify-coding-system-alist 'file "\\.po[tx]?\\'\\|\\.po\\."
|
||
|
'po-find-file-coding-system)
|
||
|
</PRE>
|
||
|
|
||
|
<P>
|
||
|
to your <TT>`.emacs'</TT> file.
|
||
|
|
||
|
</P>
|
||
|
|
||
|
|
||
|
<H2><A NAME="SEC9" HREF="gettext_toc.html#TOC9">The Format of PO Files</A></H2>
|
||
|
|
||
|
<P>
|
||
|
A PO file is made up of many entries, each entry holding the relation
|
||
|
between an original untranslated string and its corresponding
|
||
|
translation. All entries in a given PO file usually pertain
|
||
|
to a single project, and all translations are expressed in a single
|
||
|
target language. One PO file <STRONG>entry</STRONG> has the following schematic
|
||
|
structure:
|
||
|
|
||
|
</P>
|
||
|
|
||
|
<PRE>
|
||
|
<VAR>white-space</VAR>
|
||
|
# <VAR>translator-comments</VAR>
|
||
|
#. <VAR>automatic-comments</VAR>
|
||
|
#: <VAR>reference</VAR>...
|
||
|
#, <VAR>flag</VAR>...
|
||
|
msgid <VAR>untranslated-string</VAR>
|
||
|
msgstr <VAR>translated-string</VAR>
|
||
|
</PRE>
|
||
|
|
||
|
<P>
|
||
|
The general structure of a PO file should be well understood by
|
||
|
the translator. When using PO mode, very little has to be known
|
||
|
about the format details, as PO mode takes care of them for her.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
Entries begin with some optional white space. Usually, when generated
|
||
|
through GNU <CODE>gettext</CODE> tools, there is exactly one blank line
|
||
|
between entries. Then comments follow, on lines all starting with the
|
||
|
character <KBD>#</KBD>. There are two kinds of comments: those which have
|
||
|
some white space immediately following the <KBD>#</KBD>, which comments are
|
||
|
created and maintained exclusively by the translator, and those which
|
||
|
have some non-white character just after the <KBD>#</KBD>, which comments
|
||
|
are created and maintained automatically by GNU <CODE>gettext</CODE> tools.
|
||
|
All comments, of either kind, are optional.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
After white space and comments, entries show two strings, giving
|
||
|
first the untranslated string as it appears in the original program
|
||
|
sources, and then, the translation of this string. The original
|
||
|
string is introduced by the keyword <CODE>msgid</CODE>, and the translation,
|
||
|
by <CODE>msgstr</CODE>. The two strings, untranslated and translated,
|
||
|
are quoted in various ways in the PO file, using <KBD>"</KBD>
|
||
|
delimiters and <KBD>\</KBD> escapes, but the translator does not really
|
||
|
have to pay attention to the precise quoting format, as PO mode fully
|
||
|
intend to take care of quoting for her.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The <CODE>msgid</CODE> strings, as well as automatic comments, are produced
|
||
|
and managed by other GNU <CODE>gettext</CODE> tools, and PO mode does not
|
||
|
provide means for the translator to alter these. The most she can
|
||
|
do is merely deleting them, and only by deleting the whole entry.
|
||
|
On the other hand, the <CODE>msgstr</CODE> string, as well as translator
|
||
|
comments, are really meant for the translator, and PO mode gives her
|
||
|
the full control she needs.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The comment lines beginning with <KBD>#,</KBD> are special because they are
|
||
|
not completely ignored by the programs as comments generally are. The
|
||
|
comma separated list of <VAR>flag</VAR>s is used by the <CODE>msgfmt</CODE>
|
||
|
program to give the user some better disgnostic messages. Currently
|
||
|
there are two forms of flags defined:
|
||
|
|
||
|
</P>
|
||
|
<DL COMPACT>
|
||
|
|
||
|
<DT><KBD>fuzzy</KBD>
|
||
|
<DD>
|
||
|
This flag can be generated by the <CODE>msgmerge</CODE> program or it can be
|
||
|
inserted by the translator herself. It shows that the <CODE>msgstr</CODE>
|
||
|
string might not be a correct translation (anymore). Only the translator
|
||
|
can judge if the translation requires further modification, or is
|
||
|
acceptable as is. Once satisfied with the translation, she then removes
|
||
|
this <KBD>fuzzy</KBD> attribute. The <CODE>msgmerge</CODE> programs inserts this
|
||
|
when it combined the <CODE>msgid</CODE> and <CODE>msgstr</CODE> entries after fuzzy
|
||
|
search only. See section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>.
|
||
|
|
||
|
<DT><KBD>c-format</KBD>
|
||
|
<DD>
|
||
|
<DT><KBD>no-c-format</KBD>
|
||
|
<DD>
|
||
|
These flags should not be added by a human. Instead only the
|
||
|
<CODE>xgettext</CODE> program adds them. In an automatized PO file processing
|
||
|
system as proposed here the user changes would be thrown away again as
|
||
|
soon as the <CODE>xgettext</CODE> program generates a new template file.
|
||
|
|
||
|
In case the <KBD>c-format</KBD> flag is given for a string the <CODE>msgfmt</CODE>
|
||
|
does some more tests to check to validity of the translation.
|
||
|
See section <A HREF="gettext_6.html#SEC33">Invoking the <CODE>msgfmt</CODE> Program</A>.
|
||
|
|
||
|
</DL>
|
||
|
|
||
|
<P>
|
||
|
It happens that some lines, usually whitespace or comments, follow the
|
||
|
very last entry of a PO file. Such lines are not part of any entry,
|
||
|
and PO mode is unable to take action on those lines. By using the
|
||
|
PO mode function <KBD>M-x po-normalize</KBD>, the translator may get
|
||
|
rid of those spurious lines. See section <A HREF="gettext_2.html#SEC12">Normalizing Strings in Entries</A>.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The remainder of this section may be safely skipped by those using
|
||
|
PO mode, yet it may be interesting for everybody to have a better
|
||
|
idea of the precise format of a PO file. On the other hand, those
|
||
|
not having GNU Emacs handy should carefully continue reading on.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
Each of <VAR>untranslated-string</VAR> and <VAR>translated-string</VAR> respects
|
||
|
the C syntax for a character string, including the surrounding quotes
|
||
|
and imbedded backslashed escape sequences. When the time comes
|
||
|
to write multi-line strings, one should not use escaped newlines.
|
||
|
Instead, a closing quote should follow the last character on the
|
||
|
line to be continued, and an opening quote should resume the string
|
||
|
at the beginning of the following PO file line. For example:
|
||
|
|
||
|
</P>
|
||
|
|
||
|
<PRE>
|
||
|
msgid ""
|
||
|
"Here is an example of how one might continue a very long string\n"
|
||
|
"for the common case the string represents multi-line output.\n"
|
||
|
</PRE>
|
||
|
|
||
|
<P>
|
||
|
In this example, the empty string is used on the first line, to
|
||
|
allow better alignment of the <KBD>H</KBD> from the word <SAMP>`Here'</SAMP>
|
||
|
over the <KBD>f</KBD> from the word <SAMP>`for'</SAMP>. In this example, the
|
||
|
<CODE>msgid</CODE> keyword is followed by three strings, which are meant
|
||
|
to be concatenated. Concatenating the empty string does not change
|
||
|
the resulting overall string, but it is a way for us to comply with
|
||
|
the necessity of <CODE>msgid</CODE> to be followed by a string on the same
|
||
|
line, while keeping the multi-line presentation left-justified, as
|
||
|
we find this to be a cleaner disposition. The empty string could have
|
||
|
been omitted, but only if the string starting with <SAMP>`Here'</SAMP> was
|
||
|
promoted on the first line, right after <CODE>msgid</CODE>.<A NAME="DOCF1" HREF="gettext_foot.html#FOOT1">(1)</A> It was not really necessary
|
||
|
either to switch between the two last quoted strings immediately after
|
||
|
the newline <SAMP>`\n'</SAMP>, the switch could have occurred after <EM>any</EM>
|
||
|
other character, we just did it this way because it is neater.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
One should carefully distinguish between end of lines marked as
|
||
|
<SAMP>`\n'</SAMP> <EM>inside</EM> quotes, which are part of the represented
|
||
|
string, and end of lines in the PO file itself, outside string quotes,
|
||
|
which have no incidence on the represented string.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
Outside strings, white lines and comments may be used freely.
|
||
|
Comments start at the beginning of a line with <SAMP>`#'</SAMP> and extend
|
||
|
until the end of the PO file line. Comments written by translators
|
||
|
should have the initial <SAMP>`#'</SAMP> immediately followed by some white
|
||
|
space. If the <SAMP>`#'</SAMP> is not immediately followed by white space,
|
||
|
this comment is most likely generated and managed by specialized GNU
|
||
|
tools, and might disappear or be replaced unexpectedly when the PO
|
||
|
file is given to <CODE>msgmerge</CODE>.
|
||
|
|
||
|
</P>
|
||
|
|
||
|
|
||
|
<H2><A NAME="SEC10" HREF="gettext_toc.html#TOC10">Main PO mode Commands</A></H2>
|
||
|
|
||
|
<P>
|
||
|
After setting up Emacs with something similar to the lines in
|
||
|
section <A HREF="gettext_2.html#SEC8">Completing GNU <CODE>gettext</CODE> Installation</A>, PO mode is activated for a window when Emacs finds a
|
||
|
PO file in that window. This puts the window read-only and establishes a
|
||
|
po-mode-map, which is a genuine Emacs mode, in a way that is not derived
|
||
|
from text mode in any way. Functions found on <CODE>po-mode-hook</CODE>,
|
||
|
if any, will be executed.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
When PO mode is active in a window, the letters <SAMP>`PO'</SAMP> appear
|
||
|
in the mode line for that window. The mode line also displays how
|
||
|
many entries of each kind are held in the PO file. For example,
|
||
|
the string <SAMP>`132t+3f+10u+2o'</SAMP> would tell the translator that the
|
||
|
PO mode contains 132 translated entries (see section <A HREF="gettext_5.html#SEC25">Translated Entries</A>,
|
||
|
3 fuzzy entries (see section <A HREF="gettext_5.html#SEC26">Fuzzy Entries</A>), 10 untranslated entries
|
||
|
(see section <A HREF="gettext_5.html#SEC27">Untranslated Entries</A>) and 2 obsolete entries (see section <A HREF="gettext_5.html#SEC28">Obsolete Entries</A>). Zero-coefficients items are not shown. So, in this example, if
|
||
|
the fuzzy entries were unfuzzied, the untranslated entries were translated
|
||
|
and the obsolete entries were deleted, the mode line would merely display
|
||
|
<SAMP>`145t'</SAMP> for the counters.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The main PO commands are those which do not fit into the other categories of
|
||
|
subsequent sections. These allow for quitting PO mode or for managing windows
|
||
|
in special ways.
|
||
|
|
||
|
</P>
|
||
|
<DL COMPACT>
|
||
|
|
||
|
<DT><KBD>U</KBD>
|
||
|
<DD>
|
||
|
Undo last modification to the PO file.
|
||
|
|
||
|
<DT><KBD>Q</KBD>
|
||
|
<DD>
|
||
|
Quit processing and save the PO file.
|
||
|
|
||
|
<DT><KBD>q</KBD>
|
||
|
<DD>
|
||
|
Quit processing, possibly after confirmation.
|
||
|
|
||
|
<DT><KBD>O</KBD>
|
||
|
<DD>
|
||
|
Temporary leave the PO file window.
|
||
|
|
||
|
<DT><KBD>?</KBD>
|
||
|
<DD>
|
||
|
<DT><KBD>h</KBD>
|
||
|
<DD>
|
||
|
Show help about PO mode.
|
||
|
|
||
|
<DT><KBD>=</KBD>
|
||
|
<DD>
|
||
|
Give some PO file statistics.
|
||
|
|
||
|
<DT><KBD>V</KBD>
|
||
|
<DD>
|
||
|
Batch validate the format of the whole PO file.
|
||
|
|
||
|
</DL>
|
||
|
|
||
|
<P>
|
||
|
The command <KBD>U</KBD> (<CODE>po-undo</CODE>) interfaces to the GNU Emacs
|
||
|
<EM>undo</EM> facility. See section `Undoing Changes' in <CITE>The Emacs Editor</CITE>. Each time <KBD>U</KBD> is typed, modifications which the translator
|
||
|
did to the PO file are undone a little more. For the purpose of
|
||
|
undoing, each PO mode command is atomic. This is especially true for
|
||
|
the <KBD><KBD>RET</KBD></KBD> command: the whole edition made by using a single
|
||
|
use of this command is undone at once, even if the edition itself
|
||
|
implied several actions. However, while in the editing window, one
|
||
|
can undo the edition work quite parsimoniously.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The commands <KBD>Q</KBD> (<CODE>po-quit</CODE>) and <KBD>q</KBD>
|
||
|
(<CODE>po-confirm-and-quit</CODE>) are used when the translator is done with the
|
||
|
PO file. The former is a bit less verbose than the latter. If the file
|
||
|
has been modified, it is saved to disk first. In both cases, and prior to
|
||
|
all this, the commands check if some untranslated message remains in the
|
||
|
PO file and, if yes, the translator is asked if she really wants to leave
|
||
|
off working with this PO file. This is the preferred way of getting rid
|
||
|
of an Emacs PO file buffer. Merely killing it through the usual command
|
||
|
<KBD>C-x k</KBD> (<CODE>kill-buffer</CODE>) is not the tidiest way to proceed.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The command <KBD>O</KBD> (<CODE>po-other-window</CODE>) is another, softer way,
|
||
|
to leave PO mode, temporarily. It just moves the cursor to some other
|
||
|
Emacs window, and pops one if necessary. For example, if the translator
|
||
|
just got PO mode to show some source context in some other, she might
|
||
|
discover some apparent bug in the program source that needs correction.
|
||
|
This command allows the translator to change sex, become a programmer,
|
||
|
and have the cursor right into the window containing the program she
|
||
|
(or rather <EM>he</EM>) wants to modify. By later getting the cursor back
|
||
|
in the PO file window, or by asking Emacs to edit this file once again,
|
||
|
PO mode is then recovered.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The command <KBD>h</KBD> (<CODE>po-help</CODE>) displays a summary of all available PO
|
||
|
mode commands. The translator should then type any character to resume
|
||
|
normal PO mode operations. The command <KBD>?</KBD> has the same effect
|
||
|
as <KBD>h</KBD>.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The command <KBD>=</KBD> (<CODE>po-statistics</CODE>) computes the total number of
|
||
|
entries in the PO file, the ordinal of the current entry (counted from
|
||
|
1), the number of untranslated entries, the number of obsolete entries,
|
||
|
and displays all these numbers.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The command <KBD>V</KBD> (<CODE>po-validate</CODE>) launches <CODE>msgfmt</CODE> in verbose
|
||
|
mode over the current PO file. This command first offers to save the
|
||
|
current PO file on disk. The <CODE>msgfmt</CODE> tool, from GNU <CODE>gettext</CODE>,
|
||
|
has the purpose of creating a MO file out of a PO file, and PO mode uses
|
||
|
the features of this program for checking the overall format of a PO file,
|
||
|
as well as all individual entries.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The program <CODE>msgfmt</CODE> runs asynchronously with Emacs, so the
|
||
|
translator regains control immediately while her PO file is being studied.
|
||
|
Error output is collected in the GNU Emacs <SAMP>`*compilation*'</SAMP> buffer,
|
||
|
displayed in another window. The regular GNU Emacs command <KBD>C-x`</KBD>
|
||
|
(<CODE>next-error</CODE>), as well as other usual compile commands, allow the
|
||
|
translator to reposition quickly to the offending parts of the PO file.
|
||
|
Once the cursor is on the line in error, the translator may decide on
|
||
|
any PO mode action which would help correcting the error.
|
||
|
|
||
|
</P>
|
||
|
|
||
|
|
||
|
<H2><A NAME="SEC11" HREF="gettext_toc.html#TOC11">Entry Positioning</A></H2>
|
||
|
|
||
|
<P>
|
||
|
The cursor in a PO file window is almost always part of
|
||
|
an entry. The only exceptions are the special case when the cursor
|
||
|
is after the last entry in the file, or when the PO file is
|
||
|
empty. The entry where the cursor is found to be is said to be the
|
||
|
current entry. Many PO mode commands operate on the current entry,
|
||
|
so moving the cursor does more than allowing the translator to browse
|
||
|
the PO file, this also selects on which entry commands operate.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
Some PO mode commands alter the position of the cursor in a specialized
|
||
|
way. A few of those special purpose positioning are described here,
|
||
|
the others are described in following sections.
|
||
|
|
||
|
</P>
|
||
|
<DL COMPACT>
|
||
|
|
||
|
<DT><KBD>.</KBD>
|
||
|
<DD>
|
||
|
Redisplay the current entry.
|
||
|
|
||
|
<DT><KBD>n</KBD>
|
||
|
<DD>
|
||
|
<DT><KBD>n</KBD>
|
||
|
<DD>
|
||
|
Select the entry after the current one.
|
||
|
|
||
|
<DT><KBD>p</KBD>
|
||
|
<DD>
|
||
|
<DT><KBD>p</KBD>
|
||
|
<DD>
|
||
|
Select the entry before the current one.
|
||
|
|
||
|
<DT><KBD><</KBD>
|
||
|
<DD>
|
||
|
Select the first entry in the PO file.
|
||
|
|
||
|
<DT><KBD>></KBD>
|
||
|
<DD>
|
||
|
Select the last entry in the PO file.
|
||
|
|
||
|
<DT><KBD>m</KBD>
|
||
|
<DD>
|
||
|
Record the location of the current entry for later use.
|
||
|
|
||
|
<DT><KBD>l</KBD>
|
||
|
<DD>
|
||
|
Return to a previously saved entry location.
|
||
|
|
||
|
<DT><KBD>x</KBD>
|
||
|
<DD>
|
||
|
Exchange the current entry location with the previously saved one.
|
||
|
|
||
|
</DL>
|
||
|
|
||
|
<P>
|
||
|
Any GNU Emacs command able to reposition the cursor may be used
|
||
|
to select the current entry in PO mode, including commands which
|
||
|
move by characters, lines, paragraphs, screens or pages, and search
|
||
|
commands. However, there is a kind of standard way to display the
|
||
|
current entry in PO mode, which usual GNU Emacs commands moving
|
||
|
the cursor do not especially try to enforce. The command <KBD>.</KBD>
|
||
|
(<CODE>po-current-entry</CODE>) has the sole purpose of redisplaying the
|
||
|
current entry properly, after the current entry has been changed by
|
||
|
means external to PO mode, or the Emacs screen otherwise altered.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
It is yet to be decided if PO mode helps the translator, or otherwise
|
||
|
irritates her, by forcing a rigid window disposition while she
|
||
|
is doing her work. We originally had quite precise ideas about
|
||
|
how windows should behave, but on the other hand, anyone used to
|
||
|
GNU Emacs is often happy to keep full control. Maybe a fixed window
|
||
|
disposition might be offered as a PO mode option that the translator
|
||
|
might activate or deactivate at will, so it could be offered on an
|
||
|
experimental basis. If nobody feels a real need for using it, or
|
||
|
a compulsion for writing it, we should drop this whole idea.
|
||
|
The incentive for doing it should come from translators rather than
|
||
|
programmers, as opinions from an experienced translator are surely
|
||
|
more worth to me than opinions from programmers <EM>thinking</EM> about
|
||
|
how <EM>others</EM> should do translation.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The commands <KBD>n</KBD> (<CODE>po-next-entry</CODE>) and <KBD>p</KBD>
|
||
|
(<CODE>po-previous-entry</CODE>) move the cursor the entry following,
|
||
|
or preceding, the current one. If <KBD>n</KBD> is given while the
|
||
|
cursor is on the last entry of the PO file, or if <KBD>p</KBD>
|
||
|
is given while the cursor is on the first entry, no move is done.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The commands <KBD><</KBD> (<CODE>po-first-entry</CODE>) and <KBD>></KBD>
|
||
|
(<CODE>po-last-entry</CODE>) move the cursor to the first entry, or last
|
||
|
entry, of the PO file. When the cursor is located past the last
|
||
|
entry in a PO file, most PO mode commands will return an error saying
|
||
|
<SAMP>`After last entry'</SAMP>. Moreover, the commands <KBD><</KBD> and <KBD>></KBD>
|
||
|
have the special property of being able to work even when the cursor
|
||
|
is not into some PO file entry, and one may use them for nicely
|
||
|
correcting this situation. But even these commands will fail on a
|
||
|
truly empty PO file. There are development plans for the PO mode for it
|
||
|
to interactively fill an empty PO file from sources. See section <A HREF="gettext_3.html#SEC16">Marking Translatable Strings</A>.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The translator may decide, before working at the translation of
|
||
|
a particular entry, that she needs to browse the remainder of the
|
||
|
PO file, maybe for finding the terminology or phraseology used
|
||
|
in related entries. She can of course use the standard Emacs idioms
|
||
|
for saving the current cursor location in some register, and use that
|
||
|
register for getting back, or else, use the location ring.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
PO mode offers another approach, by which cursor locations may be saved
|
||
|
onto a special stack. The command <KBD>m</KBD> (<CODE>po-push-location</CODE>)
|
||
|
merely adds the location of current entry to the stack, pushing
|
||
|
the already saved locations under the new one. The command
|
||
|
<KBD>r</KBD> (<CODE>po-pop-location</CODE>) consumes the top stack element and
|
||
|
reposition the cursor to the entry associated with that top element.
|
||
|
This position is then lost, for the next <KBD>r</KBD> will move the cursor
|
||
|
to the previously saved location, and so on until no locations remain
|
||
|
on the stack.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
If the translator wants the position to be kept on the location stack,
|
||
|
maybe for taking a look at the entry associated with the top
|
||
|
element, then go elsewhere with the intent of getting back later, she
|
||
|
ought to use <KBD>m</KBD> immediately after <KBD>r</KBD>.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
The command <KBD>x</KBD> (<CODE>po-exchange-location</CODE>) simultaneously
|
||
|
reposition the cursor to the entry associated with the top element of
|
||
|
the stack of saved locations, and replace that top element with the
|
||
|
location of the current entry before the move. Consequently, repeating
|
||
|
the <KBD>x</KBD> command toggles alternatively between two entries.
|
||
|
For achieving this, the translator will position the cursor on the
|
||
|
first entry, use <KBD>m</KBD>, then position to the second entry, and
|
||
|
merely use <KBD>x</KBD> for making the switch.
|
||
|
|
||
|
</P>
|
||
|
|
||
|
|
||
|
<H2><A NAME="SEC12" HREF="gettext_toc.html#TOC12">Normalizing Strings in Entries</A></H2>
|
||
|
|
||
|
<P>
|
||
|
There are many different ways for encoding a particular string into a
|
||
|
PO file entry, because there are so many different ways to split and
|
||
|
quote multi-line strings, and even, to represent special characters
|
||
|
by backslahsed escaped sequences. Some features of PO mode rely on
|
||
|
the ability for PO mode to scan an already existing PO file for a
|
||
|
particular string encoded into the <CODE>msgid</CODE> field of some entry.
|
||
|
Even if PO mode has internally all the built-in machinery for
|
||
|
implementing this recognition easily, doing it fast is technically
|
||
|
difficult. To facilitate a solution to this efficiency problem,
|
||
|
we decided on a canonical representation for strings.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
A conventional representation of strings in a PO file is currently
|
||
|
under discussion, and PO mode experiments with a canonical representation.
|
||
|
Having both <CODE>xgettext</CODE> and PO mode converging towards a uniform
|
||
|
way of representing equivalent strings would be useful, as the internal
|
||
|
normalization needed by PO mode could be automatically satisfied
|
||
|
when using <CODE>xgettext</CODE> from GNU <CODE>gettext</CODE>. An explicit
|
||
|
PO mode normalization should then be only necessary for PO files
|
||
|
imported from elsewhere, or for when the convention itself evolves.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
So, for achieving normalization of at least the strings of a given
|
||
|
PO file needing a canonical representation, the following PO mode
|
||
|
command is available:
|
||
|
|
||
|
</P>
|
||
|
<DL COMPACT>
|
||
|
|
||
|
<DT><KBD>M-x po-normalize</KBD>
|
||
|
<DD>
|
||
|
Tidy the whole PO file by making entries more uniform.
|
||
|
|
||
|
</DL>
|
||
|
|
||
|
<P>
|
||
|
The special command <KBD>M-x po-normalize</KBD>, which has no associate
|
||
|
keys, revises all entries, ensuring that strings of both original
|
||
|
and translated entries use uniform internal quoting in the PO file.
|
||
|
It also removes any crumb after the last entry. This command may be
|
||
|
useful for PO files freshly imported from elsewhere, or if we ever
|
||
|
improve on the canonical quoting format we use. This canonical format
|
||
|
is not only meant for getting cleaner PO files, but also for greatly
|
||
|
speeding up <CODE>msgid</CODE> string lookup for some other PO mode commands.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
<KBD>M-x po-normalize</KBD> presently makes three passes over the entries.
|
||
|
The first implements heuristics for converting PO files for GNU
|
||
|
<CODE>gettext</CODE> 0.6 and earlier, in which <CODE>msgid</CODE> and <CODE>msgstr</CODE>
|
||
|
fields were using K&R style C string syntax for multi-line strings.
|
||
|
These heuristics may fail for comments not related to obsolete
|
||
|
entries and ending with a backslash; they also depend on subsequent
|
||
|
passes for finalizing the proper commenting of continued lines for
|
||
|
obsolete entries. This first pass might disappear once all oldish PO
|
||
|
files would have been adjusted. The second and third pass normalize
|
||
|
all <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings respectively. They also
|
||
|
clean out those trailing backslashes used by XView's <CODE>msgfmt</CODE>
|
||
|
for continued lines.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
Having such an explicit normalizing command allows for importing PO
|
||
|
files from other sources, but also eases the evolution of the current
|
||
|
convention, evolution driven mostly by aesthetic concerns, as of now.
|
||
|
It is easy to make suggested adjustments at a later time, as the
|
||
|
normalizing command and eventually, other GNU <CODE>gettext</CODE> tools
|
||
|
should greatly automate conformance. A description of the canonical
|
||
|
string format is given below, for the particular benefit of those not
|
||
|
having GNU Emacs handy, and who would nevertheless want to handcraft
|
||
|
their PO files in nice ways.
|
||
|
|
||
|
</P>
|
||
|
<P>
|
||
|
Right now, in PO mode, strings are single line or multi-line. A string
|
||
|
goes multi-line if and only if it has <EM>embedded</EM> newlines, that
|
||
|
is, if it matches <SAMP>`[^\n]\n+[^\n]'</SAMP>. So, we would have:
|
||
|
|
||
|
</P>
|
||
|
|
||
|
<PRE>
|
||
|
msgstr "\n\nHello, world!\n\n\n"
|
||
|
</PRE>
|
||
|
|
||
|
<P>
|
||
|
but, replacing the space by a newline, this becomes:
|
||
|
|
||
|
</P>
|
||
|
|
||
|
<PRE>
|
||
|
msgstr ""
|
||
|
"\n"
|
||
|
"\n"
|
||
|
"Hello,\n"
|
||
|
"world!\n"
|
||
|
"\n"
|
||
|
"\n"
|
||
|
</PRE>
|
||
|
|
||
|
<P>
|
||
|
We are deliberately using a caricatural example, here, to make the
|
||
|
point clearer. Usually, multi-lines are not that bad looking.
|
||
|
It is probable that we will implement the following suggestion.
|
||
|
We might lump together all initial newlines into the empty string,
|
||
|
and also all newlines introducing empty lines (that is, for <VAR>n</VAR>
|
||
|
> 1, the <VAR>n</VAR>-1'th last newlines would go together on a separate
|
||
|
string), so making the previous example appear:
|
||
|
|
||
|
</P>
|
||
|
|
||
|
<PRE>
|
||
|
msgstr "\n\n"
|
||
|
"Hello,\n"
|
||
|
"world!\n"
|
||
|
"\n\n"
|
||
|
</PRE>
|
||
|
|
||
|
<P>
|
||
|
There are a few yet undecided little points about string normalization,
|
||
|
to be documented in this manual, once these questions settle.
|
||
|
|
||
|
</P>
|
||
|
<P><HR><P>
|
||
|
<p>Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_1.html">previous</A>, <A HREF="gettext_3.html">next</A>, <A HREF="gettext_12.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
|
||
|
</BODY>
|
||
|
</HTML>
|