kallithea Changeset - 5553ecc962e0

Branch: default

Mads Kiilerich (mads) - 6 years ago 2019-09-04 22:54:49
mads@kiilerich.com

Grafted from: 4bd3514004ab

scripts/i18n: introduce --merge-pot-file to control normalization

There are actually *two* kinds of normalization:

- in main branches, where we just want the translations - not any trivially
derived information or temporary or unstructured data.
- in i18n branches, where we want the trivially derived information, and also
want to preserve any other information there might be in the .po files.

If no pot file is specified, do it as on the main branches and strip everything
but actual translations. This mode will primarily be used when grafting or
rebasing changes from i18n branches.

When a pot file is specified, run GNU msgmerge with it on the po files. The pot
file should ideally be fully updated (as done by extract_messages). That will
establish a common baseline, leaving only the essential changes as needing merge.
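
The two modes can be sketched as a small planning function. This is only an illustration that mirrors the `_normalize_po_file` signature in this changeset: `plan_normalization` and `msgmerge_command` are hypothetical names, `da.po` and `kallithea.pot` are placeholder paths, and nothing is executed. Note that in the diff `strip` defaults to False, so stripping has to be requested explicitly.

```python
def msgmerge_command(po_file, pot_file):
    """Return the msgmerge argv that the changeset runs for one .po file."""
    return ['msgmerge', '--width=76', '--backup=none', '--previous',
            '--update', po_file, '-q', pot_file]


def plan_normalization(po_file, merge_pot_file=None, strip=False):
    """List the steps that would run, in order, without executing anything.

    Hypothetical helper for illustration; not part of scripts/i18n_utils.py.
    """
    steps = []
    if merge_pot_file:
        # i18n-branch mode: re-sync the .po file against the given .pot file
        steps.append(msgmerge_command(po_file, merge_pot_file))
    if strip:
        # main-branch mode: keep only translations and essential headers
        steps.append(['strip', po_file])
    return steps


if __name__ == '__main__':
    # Placeholder file names, chosen only for the example
    for step in plan_normalization('da.po', merge_pot_file='kallithea.pot'):
        print(' '.join(step))
```

The two steps are independent: passing both a pot file and `strip=True` would first run msgmerge and then strip the result.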

If merging from default branches to i18n, it is better to skip .po and .pot in
the first 'hg merge' pass, while resolving everything else. Then, with the
uncommitted merge, run 'extract_messages', and then merge the .po files using
--merge-pot-file kallithea/i18n/kallithea.pot.

(Actually, these two different modes could perhaps be auto detected ...)

2 files changed with 41 insertions and 19 deletions:

scripts/i18n

scripts/i18n_utils.py

scripts/i18n

 #!/usr/bin/env python3
 # -*- coding: utf-8 -*-
 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
 # the Free Software Foundation, either version 3 of the License, or
 # (at your option) any later version.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 # GNU General Public License for more details.
 #
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 import os
 import shutil
 import sys
 import click
 import i18n_utils
 """
 Tool for maintenance of .po and .pot files
 Normally, the i18n-related files contain for each translatable string a
 reference to all the source code locations where this string is found. This
 meta data is useful for translators to assess how strings are used, but is not
 relevant for normal development nor for running Kallithea. Such meta data, or
 derived data like kallithea.pot, will inherently be outdated, and create
 unnecessary churn and repository growth, making it harder to spot actual and
 important changes.
 """
 @click.group()
 @click.option('--debug/--no-debug', default=False)
 def cli(debug):
     if (debug):
         i18n_utils.do_debug = True
     pass
 @cli.command()
 @click.argument('po_files', nargs=-1)
-def normalize_po_files(po_files):
+@click.option('--merge-pot-file', default=None)
+@click.option('--strip/--no-strip', default=False)
+def normalize_po_files(po_files, merge_pot_file, strip):
     """Normalize the specified .po and .pot files.
-    Only actual translations and essential headers will be preserved.
+    By default, only actual translations and essential headers will be
+    preserved, just as we want it in the main branches with minimal noise.
+    If a .pot file is specified, the po files will instead be updated by
+    running GNU msgmerge with this .pot file, thus updating source code
+    references and preserving comments and outdated translations.
     """
     for po_file in po_files:
-        i18n_utils._normalize_po_file(po_file, strip=True)
+        i18n_utils._normalize_po_file(po_file, merge_pot_file=merge_pot_file, strip=strip)
 @cli.command()
 @click.argument('local')
 @click.argument('base')
 @click.argument('other')
 @click.argument('output')
-def normalized_merge(local, base, other, output):
+@click.option('--merge-pot-file', default=None)
+@click.option('--strip/--no-strip', default=False)
+def normalized_merge(local, base, other, output, merge_pot_file, strip):
     """Merge tool for use with 'hg merge/rebase/graft --tool'
-    Merging i18n files with a standard merge tool could yield merge conflicts
-    when one side is normalized and the other is not. In such cases, it may be
-    better to first normalize all sides, then proceed with a standard merge.
-    This command does exactly that, and can be used as 'merge-tool' in
-    Mercurial commands like merge, rebase and graft.
+    i18n files are partially manually edited original source of content, and
+    partially automatically generated and updated. That creates a lot of churn
+    and often causes a lot of merge conflicts.
+    To avoid that, this merge tool wrapper will normalize .po content before
+    running the merge tool.
+    By default, only actual translations and essential headers will be
+    preserved, just as we want it in the main branches with minimal noise.
+    If a .pot file is specified, the po files will instead be updated by
+    running GNU msgmerge with this .pot file, thus updating source code
+    references and preserving comments and outdated translations.
     Add the following to your user or repository-specific .hgrc file to use it:
         [merge-tools]
         i18n.executable = /path/to/scripts/i18n
         i18n.args = normalized-merge $local $base $other $output
     and then invoke merge/rebase/graft with the additional argument '--tool i18n'.
     """
     from mercurial import (
         context,
         simplemerge,
         ui as uimod,
     )
     print('i18n normalized-merge: merging file %s' % output)
-    i18n_utils._normalize_po_file(local, strip=True)
-    i18n_utils._normalize_po_file(base, strip=True)
-    i18n_utils._normalize_po_file(other, strip=True)
-    i18n_utils._normalize_po_file(output, strip=True)
+    i18n_utils._normalize_po_file(local, merge_pot_file=merge_pot_file, strip=strip)
+    i18n_utils._normalize_po_file(base, merge_pot_file=merge_pot_file, strip=strip)
+    i18n_utils._normalize_po_file(other, merge_pot_file=merge_pot_file, strip=strip)
+    i18n_utils._normalize_po_file(output, merge_pot_file=merge_pot_file, strip=strip)
     # simplemerge will write markers to 'local' if it fails, keep a copy without markers
     localkeep = local + '.keep'
     shutil.copyfile(local, localkeep)
     ret = simplemerge.simplemerge(uimod.ui.load(),
          context.arbitraryfilectx(local.encode('utf-8')),
          context.arbitraryfilectx(base.encode('utf-8')),
          context.arbitraryfilectx(other.encode('utf-8'))
     )
     shutil.copyfile(local, output)  # simplemerge wrote to local
     if ret:
         basekeep = base + '.keep'
         otherkeep = other + '.keep'
         shutil.copyfile(base, basekeep)
         shutil.copyfile(other, otherkeep)
         sys.stderr.write("Error: simple merge failed. Run a merge tool manually to resolve conflicts, then use 'hg resolve -m'.\n")
         sys.stderr.write('Resolve with e.g.: kdiff3 %s %s %s -o %s\n' % (basekeep, localkeep, otherkeep, output))
         sys.exit(ret)
     os.remove(localkeep)
 @cli.command()
 @click.argument('file1')
 @click.argument('file2')
-def normalized_diff(file1, file2):
+@click.option('--merge-pot-file', default=None)
+@click.option('--strip/--no-strip', default=False)
+def normalized_diff(file1, file2, merge_pot_file, strip):
     """Compare two files while transparently normalizing them."""
-    sys.exit(i18n_utils._normalized_diff(file1, file2, strip=True))
+    sys.exit(i18n_utils._normalized_diff(file1, file2, merge_pot_file=merge_pot_file, strip=strip))
 if __name__ == '__main__':
     cli()

scripts/i18n_utils.py

@@ -114,72 +114,75 @@ def _normalize_po(raw_content):
     msgstr "Ingen"
     <BLANKLINE>
     line 2
     <BLANKLINE>
     msgid "Specialist"
     msgstr ""
     "Expert"
     <BLANKLINE>
     msgid "%d minute"
     msgid_plural "%d minutes"
     msgstr[0] "minut"
     msgstr[1] "minutter"
     msgstr[2] ""
     ^^^
     """
     header_start = raw_content.find('\nmsgid ""\n') + 1
     header_end = raw_content.find('\n\n', header_start) + 1 or len(raw_content)
     chunks = [
         header_comment_strip_re.sub('', raw_content[0:header_start])
             .strip(),
         '',
         header_normalize_re.sub('', raw_content[header_start:header_end])
             .strip(),
         '']  # preserve normalized header
     # all chunks are separated by empty line
     for raw_chunk in raw_content[header_end:].split('\n\n'):
         if '\n#, fuzzy' in raw_chunk:  # might be like "#, fuzzy, python-format"
             continue  # drop crazy auto translation that is worse than useless
         # strip all comment lines from chunk
         chunk_lines = [
             line
             for line in raw_chunk.splitlines()
             if line
             and not line.startswith('#')
         ]
         if not chunk_lines:
             continue
         # check lines starting from first msgstr, skip chunk if no translation lines
         msgstr_i = [i for i, line in enumerate(chunk_lines) if line.startswith('msgstr')]
         if (
             chunk_lines[0].startswith('msgid') and
             msgstr_i and
             all(line.endswith(' ""') for line in chunk_lines[msgstr_i[0]:])
         ):  # skip translation chunks that don't have any actual translations
             continue
         chunks.append('\n'.join(chunk_lines) + '\n')
     return '\n'.join(chunks)
-def _normalize_po_file(po_file, strip=False):
+def _normalize_po_file(po_file, merge_pot_file=None, strip=False):
+    if merge_pot_file:
+        runcmd(['msgmerge', '--width=76', '--backup=none', '--previous',
+                '--update', po_file, '-q', merge_pot_file])
     if strip:
         po_tmp = po_file + '.tmp'
         with open(po_file, 'r') as src, open(po_tmp, 'w') as dest:
             raw_content = src.read()
             normalized_content = _normalize_po(raw_content)
             dest.write(normalized_content)
         os.rename(po_tmp, po_file)
-def _normalized_diff(file1, file2, strip=False):
+def _normalized_diff(file1, file2, merge_pot_file=None, strip=False):
     # Create temporary copies of both files
     temp1 = tempfile.NamedTemporaryFile(prefix=os.path.basename(file1))
     temp2 = tempfile.NamedTemporaryFile(prefix=os.path.basename(file2))
     debug('normalized_diff: %s -> %s / %s -> %s' % (file1, temp1.name, file2, temp2.name))
     shutil.copyfile(file1, temp1.name)
     shutil.copyfile(file2, temp2.name)
     # Normalize them in place
-    _normalize_po_file(temp1.name, strip=strip)
-    _normalize_po_file(temp2.name, strip=strip)
+    _normalize_po_file(temp1.name, merge_pot_file=merge_pot_file, strip=strip)
+    _normalize_po_file(temp2.name, merge_pot_file=merge_pot_file, strip=strip)
     # Now compare
     try:
         runcmd(['diff', '-u', temp1.name, temp2.name])
     except subprocess.CalledProcessError as e:
         return e.returncode
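
To illustrate what the stripping mode keeps and drops, here is a simplified sketch modelled on the chunk loop of `_normalize_po` shown in the diff. It is an assumption-laden reduction, not the real function: `strip_po_chunks` is a hypothetical name, header handling is left out entirely, and the sample entries are made up for the example.

```python
def strip_po_chunks(body):
    """Keep only .po entries that carry an actual translation.

    `body` is the part of a .po file after the header, with entries
    separated by blank lines. Hypothetical helper, not part of i18n_utils.
    """
    kept = []
    for raw_chunk in body.split('\n\n'):
        raw_lines = raw_chunk.splitlines()
        if any(line.startswith('#, fuzzy') for line in raw_lines):
            continue  # fuzzy entries are dropped entirely
        # drop comment lines such as source code references
        lines = [line for line in raw_lines if line and not line.startswith('#')]
        if not lines:
            continue
        msgstr_at = [i for i, line in enumerate(lines) if line.startswith('msgstr')]
        if (lines[0].startswith('msgid') and msgstr_at and
                all(line.endswith(' ""') for line in lines[msgstr_at[0]:])):
            continue  # every msgstr line is empty, so nothing is translated
        kept.append('\n'.join(lines))
    return '\n\n'.join(kept)


if __name__ == '__main__':
    sample = (
        '#: some/source/file.py:12\n'
        'msgid "None"\n'
        'msgstr "Ingen"\n'
        '\n'
        '#, fuzzy\n'
        'msgid "Expert"\n'
        'msgstr "Ekspert"\n'
        '\n'
        'msgid "Untranslated"\n'
        'msgstr ""\n'
    )
    print(strip_po_chunks(sample))
```

Of the three sample entries, only the first survives: its source reference comment is removed, the fuzzy entry is dropped, and the entry with an empty msgstr is skipped.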
