Files @ aa51aca7fd1a
Branch filter:

Location: kallithea/scripts/shortlog.py

Valentin Kleibel
controller: Handle UnicodeDecodeError from webob decoding invalid URLs

webob will try to utf-8 decode all %-encoded bytes in URL-parameters, but will
not handle Unicode erors ... and neither did Kallithea. Visiting a URL like
http://localhost:5000/?%AD would thus give an unhandled exception showing
"Internal Server Error" to the user, and logging the full traceback and:

WebApp Error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 0: invalid start byte

This has been seen a lot recently from attackers probing for a php
vulnerability
https://devco.re/blog/2024/06/06/security-alert-cve-2024-4577-php-cgi-argument-injection-vulnerability-en/ .

Now handle these exceptions more nicely and reject with "400 Bad Request".
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
Kallithea script for generating a quick overview of contributors and their
commit counts in a given revision set.
"""
import argparse
import os
from collections import Counter

import contributor_data


def main():

    parser = argparse.ArgumentParser(description='Generate a list of committers and commit counts.')
    parser.add_argument('revset',
                        help='revision set specifying the commits to count')
    args = parser.parse_args()

    repo_entries = [
        (contributor_data.name_fixes.get(name) or contributor_data.name_fixes.get(name.rsplit('<', 1)[0].strip()) or name).rsplit('<', 1)[0].strip()
        for name in (line.strip()
         for line in os.popen("""hg log -r '%s' -T '{author}\n'""" % args.revset).readlines())
        ]

    counter = Counter(repo_entries)
    for name, count in counter.most_common():
        if name == '':
            continue
        print('%4s %s' % (count, name))


if __name__ == '__main__':
    main()