Differences

This shows you the differences between two versions of the page.

howtos:misc:internationalization_and_localization_of_shell_scripts [2013/05/21 11:16 (UTC)] – [Sources] didierspaier
howtos:misc:internationalization_and_localization_of_shell_scripts [2015/12/22 22:31 (UTC)] (current) – [Processes diagrams] Typo fix didierspaier
Line 7: Line 7:
 This document is intended to help developers, maintainers and translators to write/maintain/translate internationalized then localized shell scripts, using the tools provided by GNU gettext. This document is intended to help developers, maintainers and translators to write/maintain/translate internationalized then localized shell scripts, using the tools provided by GNU gettext.
  
-The reference document is the manual entitled [[http://www.gnu.org/software/gettext/manual/gettext.html|GNU `gettext' utilities]].+The reference document is the manual entitled [[http://www.gnu.org/software/gettext/manual/gettext.html|GNU 'gettext' utilities]].
  
 The manual encompasses all programming languages usable with gettext, with a special focus on the C language. The manual encompasses all programming languages usable with gettext, with a special focus on the C language.
Line 55: Line 55:
  
 Among these actions are execution of some programs of the gettext suite: Among these actions are execution of some programs of the gettext suite:
-  * gettetxt: marks strings to be insternationalized, then displays localized messages during scripts' execution+  * gettext: marks strings to be internationalized, then displays localized messages during scripts' execution
   * xgettext: extracts marked strings from a set of shell scripts to build a POT or a PO file   * xgettext: extracts marked strings from a set of shell scripts to build a POT or a PO file
   * msgcmp:   checks a PO file against another PO or a POT file for consistency   * msgcmp:   checks a PO file against another PO or a POT file for consistency
Line 64: Line 64:
 In the diagrams below, gettext programs are surrounded by square brackets. In the diagrams below, gettext programs are surrounded by square brackets.
  
-=== Internationalization === +=== (1) Internationalization === 
- +
  
    Set of shell scripts ───> Preparation ───> Marked shell scripts ───╮    Set of shell scripts ───> Preparation ───> Marked shell scripts ───╮
Line 76: Line 76:
  
  
-=== Localization (example for French and Dutch languages). ===+=== (2) Localization (example for French and Dutch languages). ===
  
  
Line 100: Line 100:
  
  
-=== Usage ===+=== (3) Usage ===
  
 Let's assume that one of the scripts, "myscript.sh" includes following command: Let's assume that one of the scripts, "myscript.sh" includes following command:
Line 119: Line 119:
  
  
-=== Maintenance ===+=== (4) Maintenance ===
  
 Maintenance process can be triggered by a script's creation, modification or deletion. Maintenance process can be triggered by a script's creation, modification or deletion.
Line 138: Line 138:
                                            
                                                          
-Maintenance process can be triggered as well by a modification of a language catalog for a specific language (to correct an error for instance).+Maintenance process can be triggered as well by a modification of a messages catalog for a specific language (to correct an error for instance).
  
 This variant of the process is shorter: This variant of the process is shorter:
-   +
   ╭─ [msgcmp] <── <locale>.po <── PO editor <── <locale>.po <── Update needed   ╭─ [msgcmp] <── <locale>.po <── PO editor <── <locale>.po <── Update needed
   │   │
Line 182: Line 182:
   * If the text string includes a parameter expansion, eval_gettext is used instead of gettext.   * If the text string includes a parameter expansion, eval_gettext is used instead of gettext.
   * "The variable names must consist solely of alphanumeric or underscore  ASCII characters, not start with a digit and be nonempty; otherwise such a variable reference is ignored." (gettext manual)   * "The variable names must consist solely of alphanumeric or underscore  ASCII characters, not start with a digit and be nonempty; otherwise such a variable reference is ignored." (gettext manual)
-  * Parameter expansions are escaped with a single backslash like this:\\ \$parameter or \${parameter}unless the eval_gettext command be inside a  command substitution like this:\\ "`eval_gettext "..."`" or   "$(eval_gettext "...")".\\ In the latter case, three backslashes is used like this:\\ \\\$parameter or \\\${parameter}.+  * Parameter expansions are escaped with a single backslash like this:\\ \$parameter or \${parameter}\\ unless the eval_gettext command be inside a  command substitution like this:\\ "`eval_gettext "..."`" or   "$(eval_gettext "...")"\\ In the latter case, three backslashes are needed like this:\\ \\\$parameter or \\\${parameter}.
   * Only the forms $parameter and ${parameter} of parameter expansion are used inside an eval_gettext's argument (all other ones are forbidden).   * Only the forms $parameter and ${parameter} of parameter expansion are used inside an eval_gettext's argument (all other ones are forbidden).
   * Positional parameters, special parameters and command substitutions are *not* used inside a gettext's or eval_gettext's argument.   * Positional parameters, special parameters and command substitutions are *not* used inside a gettext's or eval_gettext's argument.
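The escaping rules above can be tried out with a minimal sketch like the one below. The text domain 'myscript-demo', the variable 'pkgname' and the message are hypothetical, and no catalog is assumed to exist, so the message comes out untranslated. The fallback definition of eval_gettext is only a stand-in of mine for machines without gettext; it is not part of gettext.

```shell
#!/bin/sh
# Hypothetical text domain: no catalog is installed for it, so the
# original (untranslated) message is printed.
TEXTDOMAIN=myscript-demo
export TEXTDOMAIN

if command -v gettext.sh >/dev/null 2>&1; then
    . gettext.sh    # provides eval_gettext
else
    # Stand-in (assumption, not gettext code): expands the variables
    # but never looks up a translation.
    eval_gettext() { eval "printf '%s' \"$1\""; }
fi

pkgname=foo-1.0

# Direct call: a single backslash stops the shell from expanding
# $pkgname before eval_gettext sees it.
eval_gettext "Installing package \$pkgname"; echo

# Inside backquotes: three backslashes, because the backquoted form
# removes one level of backslashes first.
msg=`eval_gettext "Installing package \\\$pkgname"`
echo "$msg"
```

Both calls should print the same sentence with the value of 'pkgname' filled in. As far as I understand shell quoting, only the backquoted form strips a level of backslashes, so it is worth testing the "$(...)" form on your own shell before relying on three backslashes there.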
Line 208: Line 208:
 === Use 'xgettext' to produce a template catalog of messages === === Use 'xgettext' to produce a template catalog of messages ===
  
-The choice to produce only one POT file for the software as a whole or to make one POT files per set of scripts have to be made, considering for instance which choice will minimize maintenance work, how localization'work can be organized, relative frequency of updates for the different sets of scripts which comprise the software, and the relevance of distinguishing groups of features like setup vs configuration vs package management.+A choice has to be made between producing a single POT file for the software as a whole and producing one POT file per set of scripts, considering for instance which option will minimize maintenance work, how the localization work can be organized, the relative frequency of updates for the different sets of scripts which comprise the software, and the relevance of distinguishing groups of features like setup vs configuration vs package management.
  
 I'm inclined to produce only one POT file, but the choice is yours. I'm inclined to produce only one POT file, but the choice is yours.
Line 216: Line 216:
 The POT file will be generated using the 'xgettext' command (see the manual or 'xgettext --help' for details). The POT file will be generated using the 'xgettext' command (see the manual or 'xgettext --help' for details).
  
-I suggest to include following options in the command:<code>+Include the following options in the command:<code>
 -L Shell (of course!) -L Shell (of course!)
 --strict (to facilitate checks and management of the messages catalogs) --strict (to facilitate checks and management of the messages catalogs)
Line 253: Line 253:
 I suggest using a UTF-8 locale, as for reading this document. I suggest using a UTF-8 locale, as for reading this document.
  
-If the user is polyglot, another option is to set gettext's specific LANGUAGE environment variable to specify a priority list of languages.+If the user is polyglot, another option is to set gettext's specific LANGUAGE environment variable to specify a prioritized list of languages.
  
 For instance, if LANGUAGE is set to 'de:fr' then a German translation will be used if available, else a French translation will be used if available, else messages will be displayed in the original language, usually English. See gettext's manual for details. For instance, if LANGUAGE is set to 'de:fr' then a German translation will be used if available, else a French translation will be used if available, else messages will be displayed in the original language, usually English. See gettext's manual for details.
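The fallback order described above can be illustrated with a small sketch. This is not how gettext is implemented (its real lookup handles more cases, such as territory variants); the function name 'pick_language' and the list of available translations are hypothetical.

```shell
#!/bin/sh
# pick_language LIST AVAILABLE
#   LIST:      colon-separated priority list, as in LANGUAGE (e.g. "de:fr")
#   AVAILABLE: space-separated languages for which a translation exists
# Prints the first language of LIST found in AVAILABLE, or "original"
# when none matches (messages then stay in the original language).
pick_language() {
    old_ifs=$IFS
    IFS=:
    for lang in $1; do
        case " $2 " in
            *" $lang "*)
                IFS=$old_ifs
                echo "$lang"
                return 0
                ;;
        esac
    done
    IFS=$old_ifs
    echo "original"
}

pick_language "de:fr" "fr nl"   # no German catalog, so French is chosen
```

With only French and Dutch translations available, the call above prints 'fr'; with only Dutch available it would print 'original'.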
Line 265: Line 265:
 The translators will use the new POT file to update their respective (saved) PO files with the 'msgmerge --update' command. The translators will use the new POT file to update their respective (saved) PO files with the 'msgmerge --update' command.
  
-Then they will edit/complete the translations, focusing on the messages marked as "fuzzy" in the PO files, using a PO editor.+Then they will edit/complete the translations, focusing on the not yet translated messages and on those marked as "fuzzy" in the PO files, using a PO editor.
  
 After that the PO file will be checked against the POT file with 'msgcmp', carefully saved, handed over to the maintainer who will generate the new MO file with 'msgfmt' and install it as in the initial localization process. After that the PO file will be checked against the POT file with 'msgcmp', carefully saved, handed over to the maintainer who will generate the new MO file with 'msgfmt' and install it as in the initial localization process.
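The sequence of commands above could be wrapped as in the following sketch. The file names 'messages.pot' and 'fr.po', the helper names and the DRY_RUN switch are assumptions of mine; only the msgmerge, msgcmp and msgfmt invocations come from the gettext suite. Installation of the MO file is left out.

```shell
#!/bin/sh
# run CMD...: execute the command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Translator, step 1: bring the saved PO file in line with the new POT file.
refresh_po() {
    pot=$1
    po=$2
    run msgmerge --update "$po" "$pot"
}

# Step 2, after editing with a PO editor: check the PO file against
# the POT file, then compile it into an MO file next to it.
check_and_build() {
    pot=$1
    po=$2
    run msgcmp "$po" "$pot" &&
    run msgfmt -c -o "${po%.po}.mo" "$po"
}
```

For instance, 'DRY_RUN=1 refresh_po messages.pot fr.po' only prints the msgmerge command, which makes it possible to review what would be run before touching a saved PO file.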
Line 273: Line 273:
 ==== Practical recommendations for developers and maintainers ==== ==== Practical recommendations for developers and maintainers ====
  
-Many English words are polysemous: their meaning can only be determined from the context of their usage. +Many English words are polysemous: their meaning can only be determined from the context of their usage. 
 As a practical consequence, the more context you provide, the more accurate the translation can be. As a practical consequence, the more context you provide, the more accurate the translation can be.
  
 Example: recently, while downloading some software I saw something like this:\\ 31min gauche\\ Go figure? After a while I realized that "left" had been translated "gauche" (as in "left hand"). Example: recently, while downloading some software I saw something like this:\\ 31min gauche\\ Go figure? After a while I realized that "left" had been translated "gauche" (as in "left hand").
  
-Also, the order of words in a sentence vary upon the languages, furthermore not all languages are written left to right. +Also, the order of words in a sentence varies between languages; furthermore, not all languages are written left to right. Thus, mark entire paragraphs, or at least entire sentences, rather than lines, let alone isolated words, except in special cases.
- +
-So I suggest to mark entire paragraphs, or at least entire sentences, not lines, let alone isolated words but in special cases.+
  
 For instance, if text paragraphs were split in lines displayed by 'echo' commands, replace all consecutive 'echo' commands by a single 'gettext' or 'eval_gettext' command. For instance, if text paragraphs were split in lines displayed by 'echo' commands, replace all consecutive 'echo' commands by a single 'gettext' or 'eval_gettext' command.
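A minimal before/after sketch of that replacement follows. The message text and the function names are made up for illustration, and the fallback definition of gettext is only a stand-in for machines where the program is not installed.

```shell
#!/bin/sh
# Stand-in used only when the gettext program is missing:
# prints the message untranslated.
command -v gettext >/dev/null 2>&1 || gettext() { printf '%s' "$1"; }

# Before: one 'echo' per line, so each line would end up as a
# separate message for the translator.
before() {
    echo "The package list is empty."
    echo "Run the update command first."
}

# After: the whole paragraph is a single message. gettext prints no
# final newline, hence the trailing 'echo'.
after() {
    gettext "The package list is empty.
Run the update command first."
    echo
}
```

Without a translation installed, both functions print the same two lines; the difference is that the translator now sees the paragraph as one unit and can reorder words across lines.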
  
-Do not fear to include the variable substitutions in the sentences, the PO editor will check that they be present as is in the translations. +Do not be afraid to include variable substitutions in the sentences: the PO editor will check that they are present as is in the translations.
  
 === Recommendations for 'dialog' program. === === Recommendations for 'dialog' program. ===
Line 307: Line 303:
 In particular, I recommend favoring options which take a text string rather than a file as first argument, to allow line wrapping. It is still possible to preserve the intended layout using white spaces for indentation. In particular, I recommend favoring options which take a text string rather than a file as first argument, to allow line wrapping. It is still possible to preserve the intended layout using white spaces for indentation.
  
-For instance,\\ dialog <common-options> --textbox <file> <height> <width>\\ can be replaced with\\ dialog --no-collapse <common-options> --msgbox "`cat <file>`" <height> <width>+For instance, 
 +  dialog <common-options> --textbox <file> <height> <width> 
 +can be replaced with 
 +  dialog --no-collapse <common-options> --msgbox "`cat <file>`" <height> <width>
  
 ==== Practical recommendations for translators ==== ==== Practical recommendations for translators ====
  
-Depending on amount of work needed and available resources, there can be one translator or a team of translators per target language.+Depending on the amount of work needed and the available resources, there can be one translator or a team of translators per target language. In all cases, I recommend that at least one person be responsible for organizing the team's work, checking the translations and transmitting the checked PO file to the maintainer(s). Let's call this person the team coordinator.
  
-In all cases, I recommend that one person be responsible for organizing the team's work, checking the translations and transmitting the checked PO file to the maintainer(s). Let's call this person the team leader.+Don't feel obliged to translate verbatim. Not only is this rarely the best way to convey the meaning, but in addition this often leads to sentences too long to fit in allowed space.
  
-To qualify as translator, a person should in my opinion have enough ability of reading shell scripts to understand the meaning and layout of the messages to be translated. Also he or she should not feel obliged to translate word for word.+Use a specialized PO editor, 'not' a general text editor. This will not only prevent inadvertent edits to 'msgid' strings but also make the work easier and automate additional checks, such as verifying that a variable appears in the translation with the same spelling as in the original.
  
-Not only is this rarely the best way to convey the meaning, but in addition this often leads to sentences too long to fit in allowed space.+While translating, choose a serif fixed width (or "monospaced") font, like Courier. That allows you to visually distinguish characters that would otherwise look the same, and to check a line's length when that matters.
  
-It is highly advisable that translators use a specialized PO editor or the PO mode of an editor like VIM or Emacs.+If possible, check the layout of the messages. You can do that by looking at the context in the relevant source file. Even better, simply run the translated script.
  
-This will not only prevent inadvertently editing 'msgid' strings but also facilitate their work and automatize additional checks, as the presence of a variable in the translation with the same spelling as in the original.+This is especially important if you are translating dialog boxes. In particular, take care not to write overly long sentences on a single line if it appears that the text can't flow onto the next one.
  
-For instance 'lokalize' is included with Slackware Linux and 'poedit' is available @ http://slackbuilds.org.+Bear in mind that in VGA mode (used in text installers, in particular), line width is theoretically limited to 80 characters, but in practice often to 74.
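A rough way to spot lines that exceed such a width is sketched below. The function name 'long_lines' is hypothetical and the default of 74 is simply the figure quoted above. Note that "${#line}" counts characters in some shells and bytes in others, so results for non-ASCII text may differ.

```shell
#!/bin/sh
# long_lines [MAX]: read text on standard input and report every line
# longer than MAX characters (74 when MAX is omitted).
long_lines() {
    max=${1:-74}
    n=0
    # The '|| [ -n "$line" ]' part also handles a last line that has
    # no terminating newline.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        if [ "${#line}" -gt "$max" ]; then
            echo "line $n: ${#line} characters"
        fi
    done
}
```

A translated message saved in a hypothetical file 'message.txt' could then be checked with 'long_lines 74 < message.txt'; no output means every line fits.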
  
-If possible, I suggest that you check the layout of the messages.+Do not add question marks that are not present in the original message.
  
-You could do that looking at the context in the relevant source file.+If the message refers to tags (text on the buttons) of dialog boxes, like "OK", "Yes", "NO", "Continue", "Cancel", check how these tags are translated in your language in dialog's interface and use the same words.
  
-This is especially important if you are translating dialog boxes.+Avoid colloquialisms and technical slang.
  
-In particular, take care not to write too long sentences on one single line if it appears that the text can't flow on next one.+To "cut" (or end) a line inside a "dialog" box you should type \n: pressing [Enter] will 'not' insert a "new line" character in the text viewed by the user.
  
-Bear in mind that in VGA mode (used in text installers, in particular), the width of line is limited theoretically to 80 characters, but practically often to 74.+In addition, you will have to comply with gettext's requirements for it to work: 
 +  * If a word beginning with a dollar sign is included in the original text it should be present in the translation with exactly the same spelling (case matters). 
 +  * The translation text should include a "new line" character (or line feed, represented by "\n") at the beginning or at the end, exactly as the original text does. Conversely, if the original text doesn’t have the character, then the translation shouldn’t have it. 
 +  * A single backslash character "\" is not allowed in the translation.
  
-Do not add question marks that are not present in the original message.+To check your translation against gettext's requirements you could run the following command: 
 +  msgfmt -c <name of the PO file>
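Before running msgfmt, the first requirement (same variables, same spelling) can be pre-checked by hand with a sketch like this one. It is only a rough illustration: the function names are hypothetical, it relies on the '-o' option of grep (widely available but not in POSIX), and 'msgfmt -c' remains the check to trust.

```shell
#!/bin/sh
# vars_of TEXT: list the $name and ${name} references found in TEXT,
# one per line, sorted and without duplicates.
vars_of() {
    printf '%s\n' "$1" |
        grep -oE '\$(\{[A-Za-z_][A-Za-z0-9_]*\}|[A-Za-z_][A-Za-z0-9_]*)' |
        sort -u
}

# same_variables ORIGINAL TRANSLATION: succeed only when both texts
# refer to exactly the same variables, spelled the same way.
same_variables() {
    [ "$(vars_of "$1")" = "$(vars_of "$2")" ]
}

if same_variables 'Installing $pkgname' 'Installation de $pkgname'; then
    echo "variables match"
fi
```

Because case matters, a translation that spells the variable '$Pkgname' would make 'same_variables' fail.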
  
-If the message refers to tags (text on the buttons) of dialog boxes, like "OK", "Yes", "NO", "Continue", "Cancel", check how theses tags are translated in your language in dialog's interface and use the same words.+==== Warning about translation of man pages ====
  
-I suggest to avoid colloquialisms and technical slang.+Carefully preserve the man page syntax found in the English markup. For instance, don't replace:
  
-In addition, you will have to comply to gettext's requirements fot it to work: +  * 'B<' with 'B <' (don't insert a space) 
-  * If a word beginning with a dollar sign is included in the original text it should be present in the translation with exactly the same spelling. +  * 'B<' with 'b<' (keep the B as a capital letter - and don't replace it by the Greek capital letter BETA that looks the same on the screen) 
-  * The translation should or not begin or end with a "carriage return" character, represented by "\n", exactly as the original text does. +  * 'I<' with '|<' (don't replace the capital letter I with a pipe symbol)
-  * a single backslash character "\" is not allowed in the translation.+
  
-To check your translation against gettext's requirements you could run +When translating shell commands, preserve the English names of paths when needed. However, you may and should translate arguments that are to be replaced by a value, like 'packagename'.
-following command: + 
-  msgfmt -c <name of the PO file>+Didier Spaier
  
 ====== Sources ====== ====== Sources ======
 * Originally written by [[wiki:user:didierspaier | Didier Spaier]] * Originally written by [[wiki:user:didierspaier | Didier Spaier]]
 {{tag>howtos gettext shell scripts internationalization localization i18n l10n}} {{tag>howtos gettext shell scripts internationalization localization i18n l10n}}
 +