templates: properly escape inline JavaScript values
TLDR: Kallithea has issues with escaping values for use in inline JS.
Despite judicious poking of the code, no actual security vulnerabilities
have been found, just lots of corner-case bugs. This patch fixes those,
and hardens the code against actual security issues.
The long version:
To embed a Python value (typically a 'unicode' plain-text value) in a
larger file, it must be escaped in a context-specific manner. Example:
>>> s = u'<script>alert("It\'s a trap!");</script>'
1) Escaped for insertion into HTML element context
>>> print cgi.escape(s)
&lt;script&gt;alert("It's a trap!");&lt;/script&gt;
2) Escaped for insertion into HTML element or attribute context
>>> print h.escape(s)
&lt;script&gt;alert(&#34;It&#39;s a trap!&#34;);&lt;/script&gt;
This is the default Mako escaping, as usually used by Kallithea.
3) Encoded as JSON
>>> print json.dumps(s)
"<script>alert(\"It's a trap!\");</script>"
4) Escaped for insertion into a JavaScript file
>>> print '(' + json.dumps(s) + ')'
("<script>alert(\"It's a trap!\");</script>")
The parentheses are not actually required for strings, but may be needed
to avoid syntax errors if the value is a number or dict (object).
5) Escaped for insertion into an HTML inline <script> element
>>> print h.js(s)
("\x3cscript\x3ealert(\"It's a trap!\");\x3c/script\x3e")
Here, we need to combine JS and HTML escaping, further complicated by
the fact that "<script>" tag contents can either be parsed in XHTML mode
(in which case '<', '>' and '&' must additionally be XML escaped) or
HTML mode (in which case '</script>' must be escaped, but not using HTML
escaping, which is not available in HTML "<script>" tags). Therefore,
the XML special characters (which can only occur in string literals) are
escaped using JavaScript string literal escape sequences.
(This, incidentally, is why modern web security best practices ban all
use of inline JavaScript...)
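
As a rough illustration of case (5), the following Python 3 sketch reproduces
the output shown above using only the standard library. The name 'js_literal'
is hypothetical and is not claimed to be the actual 'h.js' implementation; the
sketch relies on the observation above that '<', '>' and '&' can only occur
inside string literals in JSON output.

```python
import json


def js_literal(value):
    """Encode a Python value as a JavaScript expression for an inline <script>.

    Hypothetical helper for illustration only (not the real h.js).
    """
    # Step 1: JSON-encode and wrap in parentheses, as in case (4).
    encoded = "(" + json.dumps(value) + ")"
    # Step 2: replace the XML special characters with JavaScript string
    # escape sequences, so neither an HTML nor an XHTML parser reacts to them.
    return (encoded.replace("&", r"\x26")
                   .replace("<", r"\x3c")
                   .replace(">", r"\x3e"))


if __name__ == "__main__":
    print(js_literal('<script>alert("It\'s a trap!");</script>'))
```

Because the replacement happens after JSON encoding, numbers and dicts pass
through unchanged apart from the surrounding parentheses.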
Unsurprisingly, Kallithea does not do (5) correctly. In most cases,
Kallithea simply slaps a pair of single quotes around the HTML-escaped
Python value. A typical benign example:
$('#child_link').html('${_('No revisions')}');
This works in English, but if a localized version of the string contains
an apostrophe, the result will be broken JavaScript. In the more severe
cases, where the text is user controllable, it leaves the door open to
injections. In this example, the script inserts the string as HTML, so
Mako's implicit HTML escaping makes sense; but in many other cases, HTML
escaping is actually an error, because the value is not used by the
script in an HTML context.
The good news is that the HTML escaping thwarts attempts at XSS, since
it's impossible to inject syntactically valid JavaScript of any useful
complexity. It does allow JavaScript errors and gibberish to appear on
the page, though.
In these cases, the escaping has been fixed to use either the new 'h.js'
helper, which does JavaScript escaping (but not HTML escaping), OR the
new 'h.jshtml' helper (which does both), in those cases where it was
unclear if the value might be used (by the script) in an HTML context.
Some of these can probably be "relaxed" from h.jshtml to h.js later, but
for now, using h.jshtml fixes escaping and doesn't introduce new errors.
In a few places, Kallithea JSON encodes values in the controller, then
inserts the JSON (without any further escaping) into <script> tags. This
is also wrong, and carries actual risk of XSS vulnerabilities. However,
in all cases, security vulnerabilities were narrowly avoided due to other
filtering in Kallithea. (E.g. many special characters are banned from
appearing in usernames.) In these cases, the escaping has been fixed
and moved to the template, making it immediately visible that proper
escaping has been performed.
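
To see why raw JSON inside a <script> tag is risky, consider this small
Python 3 sketch. It uses the standard library's 'html.parser' as a stand-in
for a browser's HTML parser, which is an assumption made purely for
illustration; the payload and the 'ScriptCounter' class are hypothetical.

```python
import json
from html.parser import HTMLParser


class ScriptCounter(HTMLParser):
    """Count the <script> start tags seen while parsing a page."""

    def __init__(self):
        super().__init__()
        self.scripts = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.scripts += 1


def count_scripts(page):
    parser = ScriptCounter()
    parser.feed(page)
    parser.close()
    return parser.scripts


# A value that a user might control, JSON-encoded in the controller.
payload = json.dumps({"name": "</script><script>alert(1)</script>"})

# Inserted as-is, the "</script>" inside the string ends the script element.
raw_page = "<script>var user = %s;</script>" % payload

# With '<' written as a JavaScript escape sequence, the element stays intact.
safe_page = "<script>var user = %s;</script>" % payload.replace("<", r"\x3c")
```

Here 'count_scripts(raw_page)' reports two script elements, the second one
being attacker-supplied, while 'count_scripts(safe_page)' reports one.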
Mini-FAQ (frequently anticipated questions):
Q: Why do everything in one big, hard to review patch?
Q: Why add escaping in specific case FOO, it doesn't seem needed?
Because the goal here is to have "escape everywhere" as the default
policy, rather than identifying individual bugs and fixing them one
by one by adding escaping where needed. As such, this patch surely
introduces a lot of needless escaping. This is no different from
how Mako/Pylons HTML escape everything by default, even when not
needed: it errs on the side of needless work, to prevent erring
on the side of skipping required (and security critical) work.
As for reviewability, the most important thing to notice is not where
escaping has been introduced, but any places where it might have been
missed (or where h.jshtml is needed, but h.js is used).
Q: The added escaping is kinda verbose/ugly.
That is not a question, but yes, I agree. Hopefully it'll encourage us
to move away from inline JavaScript altogether. That's a significantly
larger job, though; with luck this patch will keep us safe and secure
until such a time as we can implement the real fix.
Q: Why not use Mako filter syntax ("${val|h.js}")?
Because of long-standing Mako bug #140, preventing use of 'h' in
filters.
Q: Why not work around bug #140, or even use straight "${val|js}"?
Because Mako still applies the default h.escape filter before the
explicitly specified filters.
Q: Where do we go from here?
Longer term, we should stop doing variable expansions in script blocks,
and instead pass data to JS via e.g. data attributes, or asynchronously
using AJAX calls. Once we've done that, we can remove inline JavaScript
altogether in favor of separate script files, and set a strict Content
Security Policy explicitly blocking inline scripting, and thus also the
most common kind of cross-site scripting attack.
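
A minimal sketch of the data attribute approach, in Python 3 with only the
standard library. The helper name 'data_attribute' and the attribute name
'payload' are hypothetical; the point is merely that a value can be JSON
encoded and then HTML escaped for attribute context, with no script block
involved.

```python
import html
import json


def data_attribute(name, value):
    """Render data-NAME="..." carrying VALUE as HTML-escaped JSON.

    Hypothetical helper; NAME is assumed to be a trusted, valid attribute
    name suffix (it is not escaped here).
    """
    encoded = html.escape(json.dumps(value), quote=True)
    return 'data-%s="%s"' % (name, encoded)


# A separate script file could then read the value with something like:
#   JSON.parse(element.dataset.payload)
attr = data_attribute("payload", {"title": 'It\'s a "trap" </script>'})
```

The browser undoes the HTML escaping when it parses the attribute, so the
script receives the original JSON text.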
.. _overview:

=====================
Installation overview
=====================

This page gives an overview, with some details, to help you understand the
options when installing Kallithea.

Python environment
------------------

**Kallithea** is written entirely in Python_ and requires Python version
2.6 or higher. Python 3.x is currently not supported.

Given a Python installation, there are different ways of providing the
environment for running Python applications. Each of them pretty much
corresponds to a ``site-packages`` directory somewhere where packages can be
installed.

Kallithea itself can be run from source or be installed, but even when running
from source, there are some dependencies that must be installed in the Python
environment used for running Kallithea.

- Packages *could* be installed in Python's ``site-packages`` directory, but
  that would require running pip_ as root, would make them hard to uninstall
  or upgrade, and is probably not a good idea unless using a package manager.

- Packages could also be installed in ``~/.local``, but that is probably
  only a good idea if using a dedicated user per application or instance.

- Finally, packages can be installed in a virtualenv_. That is a very
  lightweight "container" where each Kallithea instance can get its own
  dedicated and self-contained virtual environment.

We recommend using virtualenv for installing Kallithea.

Installation methods
--------------------

Kallithea must be installed on a server. Kallithea is installed in a Python
environment so it can use packages that are installed there and make itself
available for other packages.

Two different cases will pretty much cover the options for how it can be
installed.

- The Kallithea source repository can be cloned and used -- it is kept stable and
  can be used in production. The Kallithea maintainers use the development
  branch in production. The advantage of installing from source and regularly
  updating it is that you benefit from the most recent improvements. Using
  it directly from a DVCS also means that it is easy to track local customizations.

  Running ``pip install -e .`` in the source will use pip to install the
  necessary dependencies in the Python environment and create a
  ``.../site-packages/Kallithea.egg-link`` file there that points at the Kallithea
  source.

- Kallithea can also be installed from ready-made packages using a package manager.
  The official released versions are available on PyPI_ and can be downloaded and
  installed with all dependencies using ``pip install kallithea``.

  With this method, Kallithea is installed in the Python environment as any
  other package, usually as a ``.../site-packages/Kallithea-X-py2.7.egg/``
  directory with Python files and everything else that is needed.

  (``pip install kallithea`` from a source tree will do pretty much the same
  but build the Kallithea package itself locally instead of downloading it.)

Web server
----------

Kallithea is (primarily) a WSGI_ application that must be run from a web
server that serves WSGI applications over HTTP.

Kallithea itself does not serve HTTP (or HTTPS); that is the web server's
responsibility. Kallithea does however need to know its own user facing URL
(protocol, address, port and path) for each HTTP request. Kallithea will
usually use its own HTML/cookie based authentication but can also be configured
to use web server authentication.

There are several web server options:

- Kallithea uses the Paste_ tool as command line interface. Paste provides
  ``paster serve`` as a convenient way to launch a Python WSGI / web server
  from the command line. That is perfect for development and evaluation.
  Actual use in production might have different requirements and need extra
  work to make it manageable as a scalable system service.

  Paste comes with its own built-in web server but Kallithea defaults to use
  Waitress_. Gunicorn_ is also an option. Each of these web servers has a
  different, limited feature set.

  The web server used by ``paster`` is configured in the ``.ini`` file passed
  to it. The entry point for the WSGI application is configured
  in ``setup.py`` as ``kallithea.config.middleware:make_app``.

- `Apache httpd`_ can serve WSGI applications directly using mod_wsgi_ and a
  simple Python file with the necessary configuration. This is a good option if
  Apache is available.

- uWSGI_ is also a full web server with built-in WSGI module.

- IIS_ can also serve WSGI applications directly using isapi-wsgi_.

- A `reverse HTTP proxy <https://en.wikipedia.org/wiki/Reverse_proxy>`_
  can be put in front of another web server which has WSGI support.
  Such a layered setup can be complex but might in some cases be the right
  option, for example to standardize on one internet-facing web server, to add
  encryption or special authentication or for other security reasons, to
  provide caching of static files, or to provide load balancing or fail-over.
  Nginx_, Varnish_ and HAProxy_ are often used for this purpose, often in front
  of a ``paster`` server that somehow is wrapped as a service.

The best option depends on what you are familiar with and the requirements for
performance and stability. Also, keep in mind that Kallithea mainly serves
dynamically generated pages from a relatively slow Python process. Kallithea is
also often used inside organizations with a limited number of users and thus no
continuous hammering from the internet.

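
To make the division of labour between web server and WSGI application more
concrete, here is a minimal sketch using only Python's standard ``wsgiref``
module (written for Python 3, unlike Kallithea itself). The ``app`` and
``call_app`` names and the example host are hypothetical and have nothing to
do with Kallithea's own code. The sketch shows how an application derives its
user facing base URL from the ``environ`` dictionary that the server passes in
with each request.

```python
from wsgiref.util import application_uri, setup_testing_defaults


def app(environ, start_response):
    """Toy WSGI application: report the base URL it is being served under."""
    # Scheme, host, port and mount path all come from the server's environ.
    base_url = application_uri(environ)
    body = ("Served at %s\n" % base_url).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(environ_overrides):
    """Call the app once, the way a WSGI server would, without any network."""
    environ = dict(environ_overrides)
    setup_testing_defaults(environ)  # fills in only the keys that are missing
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")
```

For example, calling ``call_app`` with ``wsgi.url_scheme`` set to ``https``,
``HTTP_HOST`` set to ``code.example.com`` and ``SCRIPT_NAME`` set to
``/kallithea`` makes the application report
``https://code.example.com/kallithea``. If a reverse proxy sits in front of
the WSGI server, these values describe the inner connection unless the setup
passes the outer URL along, which is one reason layered setups need care.
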
.. _Python: http://www.python.org/
.. _Gunicorn: http://gunicorn.org/
.. _Waitress: http://waitress.readthedocs.org/en/latest/
.. _virtualenv: http://pypi.python.org/pypi/virtualenv
.. _Paste: http://pythonpaste.org/
.. _PyPI: https://pypi.python.org/pypi
.. _Apache httpd: http://httpd.apache.org/
.. _mod_wsgi: https://code.google.com/p/modwsgi/
.. _isapi-wsgi: https://github.com/hexdump42/isapi-wsgi
.. _uWSGI: https://uwsgi-docs.readthedocs.org/en/latest/
.. _nginx: http://nginx.org/en/
.. _iis: http://en.wikipedia.org/wiki/Internet_Information_Services
.. _pip: http://en.wikipedia.org/wiki/Pip_%28package_manager%29
.. _WSGI: http://en.wikipedia.org/wiki/Web_Server_Gateway_Interface
.. _HAProxy: http://www.haproxy.org/
.. _Varnish: https://www.varnish-cache.org/