kallithea Files · scripts/docs-headings.py

Files @ 53142fd5af4e

53142fd5af4e 2.6 KiB text/x-python

Thomas De Schampheleire

lib/diffs: make sure that trailing tabs are indicated

Between the initial submission and final version of commit f79c40759d6f,
changes were made that turn out to be incorrect. The changes assume that the
later match on trailing tabs will take precedence over the plain 'tab' match.
However, the Python 're' documentation says:

    As the target string is scanned, REs separated by '|' are tried from
    left to right. When one pattern completely matches, that branch is
    accepted. This means that once A matches, B will not be tested further,
    even if it would produce a longer overall match. In other words, the '|'
    operator is never greedy.
    https://docs.python.org/3.8/library/re.html

As a result, a trailing tab is seen as a plain tab and not highlighted in a
special way.
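
The left-to-right behaviour of '|' can be reproduced with a minimal sketch.
The two patterns and the group names 'tab' and 'trail' below are hypothetical
and chosen only for illustration; they are not the expressions used in
lib/diffs, and reordering the branches is not necessarily the fix applied by
this commit.

```python
import re

# Hypothetical patterns for illustration only (not the ones in lib/diffs).
# Plain-tab branch listed first: it always wins, even for a trailing tab.
PLAIN_FIRST = re.compile(r'(?P<tab>\t)|(?P<trail>\t+$)')
# Trailing-tab branch listed first: it gets a chance to match before plain tab.
TRAIL_FIRST = re.compile(r'(?P<trail>\t+$)|(?P<tab>\t)')


def classify_tabs(pattern, text):
    """Return the name of the branch that matched for each tab run found."""
    return [m.lastgroup for m in pattern.finditer(text)]


line = 'a\tb\t'  # one tab in the middle, one trailing tab
print(classify_tabs(PLAIN_FIRST, line))
print(classify_tabs(TRAIL_FIRST, line))
```

With the plain-tab branch first, the trailing tab is reported as an ordinary
tab, which is the symptom described above.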

Unify the tab handling to make it unambiguous how they should be parsed.

The change diff mainly shows re group numbers shifting.

#!/usr/bin/env python3

"""
Consistent formatting of rst section titles
"""

import re
import subprocess


spaces = [
    (0, 1),  # we assume this is an over-and-underlined header
    (2, 1),
    (1, 1),
    (1, 0),
    (1, 0),
    ]

# http://sphinx-doc.org/rest.html :
#   for the Python documentation, this convention is used which you may follow:
#   # with overline, for parts
#   * with overline, for chapters
#   =, for sections
#   -, for subsections
#   ^, for subsubsections
#   ", for paragraphs
pystyles = ['#', '*', '=', '-', '^', '"']

# match on a header line underlined with one of the valid characters
headermatch = re.compile(r'''\n*(.+)\n([][!"#$%&'()*+,./:;<=>?@\\^_`{|}~-])\2{2,}\n+''', flags=re.MULTILINE)


def main():
    filenames = subprocess.check_output(['hg', 'loc', 'set:**.rst+kallithea/i18n/how_to']).splitlines()
    for fn in filenames:
        fn = fn.decode()
        print('processing %s' % fn)
        with open(fn) as f:
            s = f.read()

        # find levels and their styles
        lastpos = 0
        styles = []
        for markup in headermatch.findall(s):
            style = markup[1]
            if style in styles:
                stylepos = styles.index(style)
                if stylepos > lastpos + 1:
                    print('bad style %r with level %s - was at %s' % (style, stylepos, lastpos))
            else:
                stylepos = len(styles)
                if stylepos > lastpos + 1:
                    print('bad new style %r - expected %r' % (style, styles[lastpos + 1]))
                else:
                    styles.append(style)
            lastpos = stylepos

        # remove superfluous spacing (may however be restored by header spacing)
        s = re.sub(r'''(\n\n)\n*''', r'\1', s, flags=re.MULTILINE)

        if styles:
            newstyles = pystyles[pystyles.index(styles[0]):]

            def subf(m):
                title, style = m.groups()
                level = styles.index(style)
                before, after = spaces[level]
                newstyle = newstyles[level]
                return '\n' * (before + 1) + title + '\n' + newstyle * len(title) + '\n' * (after + 1)
            s = headermatch.sub(subf, s)

        # remove superfluous spacing when headers are adjacent
        s = re.sub(r'''(\n.+\n([][!"#$%&'()*+,./:;<=>?@\\^_`{|}~-])\2{2,}\n\n\n)\n*''', r'\1', s, flags=re.MULTILINE)
        # fix trailing space and spacing before link sections
        s = s.strip() + '\n'
        s = re.sub(r'''\n+((?:\.\. _[^\n]*\n)+)$''', r'\n\n\n\1', s)

        with open(fn, 'w') as f:
            f.write(s)

    # check_output returns bytes; decode so the diff prints as readable text
    print(subprocess.check_output(['hg', 'diff'] + filenames).decode())

if __name__ == '__main__':
    main()
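
As a rough illustration of what the header pattern in the script picks up,
the sketch below runs the same regular expression over a small reStructuredText
sample. The sample text is invented for this example; only the pattern is
taken from the script.

```python
import re

# Same pattern as headermatch in the script: a title line followed by an
# underline made of at least three identical punctuation characters.
headermatch = re.compile(r'''\n*(.+)\n([][!"#$%&'()*+,./:;<=>?@\\^_`{|}~-])\2{2,}\n+''', flags=re.MULTILINE)

# Invented sample document with two underlined headers.
sample = "Title\n=====\n\nIntro text.\n\nSection\n-------\n\nBody.\n"

# Each result is a (title, underline character) pair, in document order.
found = headermatch.findall(sample)
print(found)
```

The order in which underline characters first appear is what the script uses
to infer heading levels before rewriting them in the Python documentation
style.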