@ 7b0aafc6b7ca
Location: kallithea/setup.py - annotation
mysql: create database with explicit UTF-8 character set and collation
A spin-off from Issue #378.
In MySQL, the character sets for server, database, tables, and connection are
set independently. Ideally, they should all use UTF-8, but systems tend to use
latin1 as default encoding, for example:
    character_set_server = latin1
    collation_server = latin1_swedish_ci
Databases would thus by default be created as:
    character_set_database = latin1
    collation_database = latin1_swedish_ci
To make things work consistently anyway, we have so far specified the utf8mb4
charset explicitly when creating tables, but there is no corresponding simple
option for specifying the collation for tables. We need a better solution.
If necessary and possible, the system charset and collation should be set to
UTF-8. Some systems already use these as defaults - see
https://mariadb.com/kb/en/differences-in-mariadb-in-debian-and-ubuntu/ .
The defaults can be changed as described on
https://mariadb.com/kb/en/setting-character-sets-and-collations/#example-changing-the-default-character-set-to-utf-8
to give something like:
    character_set_server = utf8mb4
    collation_server = utf8mb4_unicode_ci
Databases will then by default be created as:
    character_set_database = utf8mb4
    collation_database = utf8mb4_unicode_ci
and there is thus no longer any need for specifying the charset when creating
tables.
To be reasonably resilient across all systems without relying on system
defaults, we will now start specifying the charset and collation when creating
the database, but drop the specification of charset when creating tables.
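As a minimal sketch of the idea above, the database-level charset and collation could be spelled out when the CREATE DATABASE statement is built. The helper name `create_database_sql` and the database name `kallithea` are assumptions made for illustration; this is not Kallithea's actual implementation.

```python
def create_database_sql(db_name, charset="utf8mb4", collation="utf8mb4_unicode_ci"):
    """Build a CREATE DATABASE statement with explicit charset and collation.

    Hypothetical helper for illustration only.
    """
    # Backticks quote the identifier; an embedded backtick is escaped by doubling it.
    quoted = "`%s`" % db_name.replace("`", "``")
    return "CREATE DATABASE %s CHARACTER SET %s COLLATE %s" % (quoted, charset, collation)


# "kallithea" is an assumed example name.
print(create_database_sql("kallithea"))
```

Tables created in such a database inherit its charset and collation, which is why the per-table charset specification can be dropped.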
For existing databases, it is recommended to change encoding (and collation) by
altering the database and each of the tables inside it as described on
https://stackoverflow.com/questions/6115612/how-to-convert-an-entire-mysql-database-characterset-and-collation-to-utf-8 .
Note the use of utf8mb4_unicode_ci instead of utf8mb4_general_ci - see
https://stackoverflow.com/questions/766809/whats-the-difference-between-utf8-general-ci-and-utf8-unicode-ci .
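The conversion of an existing database could be scripted roughly as follows. This is a sketch under assumptions: the helper `conversion_statements` and the table names are hypothetical, all names are assumed to be plain identifiers, and the generated statements should be reviewed (and the database backed up) before they are run.

```python
CHARSET = "utf8mb4"
COLLATION = "utf8mb4_unicode_ci"


def conversion_statements(db_name, table_names):
    """Return ALTER statements converting a database and each of its tables.

    Hypothetical helper for illustration; names must not contain backticks.
    """
    # Change the database defaults, which apply to tables created later.
    statements = [
        "ALTER DATABASE `%s` CHARACTER SET %s COLLATE %s" % (db_name, CHARSET, COLLATION)
    ]
    # Existing tables keep their old charset until converted one by one.
    for table in table_names:
        statements.append(
            "ALTER TABLE `%s`.`%s` CONVERT TO CHARACTER SET %s COLLATE %s"
            % (db_name, table, CHARSET, COLLATION)
        )
    return statements


# The database and table names here are made up for the example.
for statement in conversion_statements("kallithea", ["users", "repositories"]):
    print(statement + ";")
```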
For investigation of these issues, consider the output from:
    show variables like '%char%';
    show variables like '%collation%';
    show create database `KALLITHEA_DB_NAME`;
    SELECT * FROM information_schema.SCHEMATA WHERE schema_name = "KALLITHEA_DB_NAME";
    SELECT * FROM information_schema.TABLES T, information_schema.COLLATION_CHARACTER_SET_APPLICABILITY CCSA WHERE CCSA.collation_name = T.table_collation AND T.table_schema = "KALLITHEA_DB_NAME";
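The output of the `show variables` queries can also be checked mechanically. The sketch below assumes the rows have already been fetched as (name, value) pairs by some database driver; the helper `find_mismatches`, the set of variables checked, and the sample rows are all assumptions for illustration. Other variables, such as character_set_system, are deliberately ignored.

```python
# Settings the commit message aims for; only these four are checked.
EXPECTED = {
    "character_set_server": "utf8mb4",
    "character_set_database": "utf8mb4",
    "collation_server": "utf8mb4_unicode_ci",
    "collation_database": "utf8mb4_unicode_ci",
}


def find_mismatches(rows):
    """Return {variable: actual value} for checked variables that differ.

    rows is an iterable of (Variable_name, Value) pairs, as returned by
    the `show variables` queries. Hypothetical helper for illustration.
    """
    actual = dict(rows)
    return {
        name: actual[name]
        for name, wanted in EXPECTED.items()
        if name in actual and actual[name] != wanted
    }


# Made-up sample: a latin1 server hosting a database created as utf8mb4.
sample_rows = [
    ("character_set_server", "latin1"),
    ("collation_server", "latin1_swedish_ci"),
    ("character_set_database", "utf8mb4"),
    ("collation_database", "utf8mb4_unicode_ci"),
    ("character_set_system", "utf8"),
]
print(find_mismatches(sample_rows))
```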
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
import platform
import sys
import setuptools
# monkey patch setuptools to use distutils owner/group functionality
from setuptools.command import sdist
if sys.version_info < (3, 6):
    raise Exception('Kallithea requires Python 3.6 or later')
here = os.path.abspath(os.path.dirname(__file__))
def _get_meta_var(name, data, callback_handler=None):
    # find a `name = <expression>` assignment in data and evaluate its right-hand side
    import re
    matches = re.compile(r'(?:%s)\s*=\s*(.*)' % name).search(data)
    if matches:
        s = eval(matches.groups()[0])
        if callable(callback_handler):
            return callback_handler(s)
        return s
_meta = open(os.path.join(here, 'kallithea', '__init__.py'), 'r')
_metadata = _meta.read()
_meta.close()
def callback(V):
    return '.'.join(map(str, V[:3])) + '.'.join(V[3:])
__version__ = _get_meta_var('VERSION', _metadata, callback)
__license__ = _get_meta_var('__license__', _metadata)
__author__ = _get_meta_var('__author__', _metadata)
__url__ = _get_meta_var('__url__', _metadata)
# defines current platform
__platform__ = platform.system()
is_windows = __platform__ in ['Windows']
requirements = [
    "alembic >= 1.0.10, < 1.5",
    "gearbox >= 0.1.0, < 1",
    "waitress >= 0.8.8, < 1.5",
    "WebOb >= 1.8, < 1.9",
    "backlash >= 0.1.2, < 1",
    "TurboGears2 >= 2.4, < 2.5",
    "tgext.routes >= 0.2.0, < 1",
    "Beaker >= 1.10.1, < 2",
    "WebHelpers2 >= 2.0, < 2.1",
    "FormEncode >= 1.3.1, < 1.4",
    "SQLAlchemy >= 1.2.9, < 1.4",
    "Mako >= 0.9.1, < 1.2",
    "Pygments >= 2.2.0, < 2.7",
    "Whoosh >= 2.7.1, < 2.8",
    "celery >= 4.3, < 4.5, != 4.4.4", # 4.4.4 is broken due to unexpressed dependency on 'future', see https://github.com/celery/celery/pull/6146
    "Babel >= 1.3, < 2.9",
    "python-dateutil >= 2.1.0, < 2.9",
    "Markdown >= 2.2.1, < 3.2",
    "docutils >= 0.11, < 0.17",
    "URLObject >= 2.3.4, < 2.5",
    "Routes >= 2.0, < 2.5",
    "dulwich >= 0.19.0, < 0.20",
    "mercurial >= 5.2, < 5.5",
    "decorator >= 4.2.1, < 4.5",
    "Paste >= 2.0.3, < 3.5",
    "bleach >= 3.0, < 3.1.4",
    "Click >= 7.0, < 8",
    "ipaddr >= 2.2.0, < 2.3",
    "paginate >= 0.5, < 0.6",
    "paginate_sqlalchemy >= 0.3.0, < 0.4",
    "bcrypt >= 3.1.0, < 3.2",
    "pip >= 20.0, < 999",
]
dependency_links = [
]
classifiers = [
    'Development Status :: 4 - Beta',
    'Environment :: Web Environment',
    'Framework :: Pylons',
    'Intended Audience :: Developers',
    'License :: OSI Approved :: GNU General Public License (GPL)',
    'Operating System :: OS Independent',
    'Programming Language :: Python :: 3.6',
    'Programming Language :: Python :: 3.7',
    'Programming Language :: Python :: 3.8',
    'Topic :: Software Development :: Version Control',
]
# additional files from the project that go somewhere in the filesystem,
# relative to sys.prefix
data_files = []
description = ('Kallithea is a fast and powerful management tool '
               'for Mercurial and Git with a built in push/pull server, '
               'full text search and code-review.')
keywords = ' '.join([
    'kallithea', 'mercurial', 'git', 'code review',
    'repo groups', 'ldap', 'repository management', 'hgweb replacement',
    'hgwebdir', 'gitweb replacement', 'serving hgweb',
])
# long description
README_FILE = 'README.rst'
try:
    long_description = open(README_FILE).read()
except IOError as err:
    sys.stderr.write(
        "[WARNING] Cannot find file specified as long_description (%s): %s\n"
        % (README_FILE, err)
    )
    long_description = description
sdist_org = sdist.sdist
class sdist_new(sdist_org):
    def initialize_options(self):
        sdist_org.initialize_options(self)
        self.owner = self.group = 'root'
sdist.sdist = sdist_new
packages = setuptools.find_packages(exclude=['ez_setup'])
setuptools.setup(
    name='Kallithea',
    version=__version__,
    description=description,
    long_description=long_description,
    keywords=keywords,
    license=__license__,
    author=__author__,
    author_email='kallithea@sfconservancy.org',
    dependency_links=dependency_links,
    url=__url__,
    install_requires=requirements,
    classifiers=classifiers,
    data_files=data_files,
    packages=packages,
    include_package_data=True,
    message_extractors={'kallithea': [
            ('**.py', 'python', None),
            ('templates/**.mako', 'mako', {'input_encoding': 'utf-8'}),
            ('templates/**.html', 'mako', {'input_encoding': 'utf-8'}),
            ('public/**', 'ignore', None)]},
    zip_safe=False,
    entry_points="""
    [console_scripts]
    kallithea-api = kallithea.bin.kallithea_api:main
    kallithea-gist = kallithea.bin.kallithea_gist:main
    kallithea-cli = kallithea.bin.kallithea_cli:cli
    [paste.app_factory]
    main = kallithea.config.application:make_app
    """,
)