A Python package to enable Unicode input and display when running Python from the Windows console.
When running Python in the standard console on Windows, there are several problems when one tries to enter or display Unicode characters. The relevant issue is http://bugs.python.org/issue1602. This package solves some of them.
First, to display Unicode characters in the Windows console, you have to select a font able to display them. This has nothing to do with Python, but is included here for completeness.
The standard stream objects (sys.stdin, sys.stdout, sys.stderr) are not capable of reading and displaying Unicode characters in the Windows console. This has nothing to do with encoding, since even sys.stdin.buffer.raw.readline() returns b"?\n" when entering α, and there is no encoding under which sys.stdout.buffer.raw.write displays α.
The streams module provides alternative stream objects, which call the ReadConsoleW and WriteConsoleW functions to interact with the Windows console. The function streams.enable installs these streams in place of the original ones, and streams.disable restores the original ones. After replacing the stream objects, using print with a string containing Unicode characters and displaying Unicode characters in the interactive loop both work. For the input function, see below.
The Python interactive loop doesn't use sys.stdin to read input, so fixing it doesn't help. Also, the input function may or may not use sys.stdin, depending on whether sys.stdin and sys.stdout have the standard filenos. See http://bugs.python.org/issue17620 for more information.
One way to solve this problem is to provide a custom REPL which uses the streams. Such a REPL is implemented in the console module and is based on the stdlib module code. The functions console.enable and console.disable handle (de)activation of our loop.
Since there is no hook to run our interactive loop instead of the standard one, we have to wrap the execution of any Python script so that our loop is run at the right place. The logic for this is contained in the runner module and a helper script run.py, which is located outside of our package for practical reasons.
Another, more practical solution is to install a custom readline hook. A readline hook is a function used by the Python REPL to read a single line interactively. It may also be used by the input function under certain conditions (see above). On Linux, this hook is usually set to the GNU readline function, which provides features like autocompletion, history, …
The module readline_hook provides our custom readline hook, which uses sys.stdin to get the input and is (de)activated by the functions readline_hook.enable and readline_hook.disable. There is also the package pyreadline (https://github.com/pyreadline/pyreadline), which implements GNU readline features on Windows. It provides its own readline hook, which actually supports Unicode input. The problem is that the input is then encoded using sys.stdout.encoding, which may not be capable of encoding all the characters. Our custom stream objects solve this problem, so the readline hook of pyreadline can be used as well, and readline_hook.enable tries to use it by default when possible, to preserve the input features of pyreadline.
A readline hook can be called from two places: from the REPL and from the input function. In the first case the prompt is encoded using sys.stdin.encoding, but in the second case sys.stdout.encoding is used. So we need these two encodings to be equal.
The Python tokenizer, which is used when parsing the input from the REPL, cannot handle UTF-16 or, more generally, any encoding containing null bytes. Because UTF-16-LE is the encoding of Unicode used by Windows, we have to additionally wrap our text stream objects (io.TextIOWrapper with encoding UTF-16-LE over our raw console stream objects) with helper text io objects. This is done automatically by streams.enable when needed and can be configured.
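
The remark about null bytes can be checked directly. The following is a minimal sketch, not code from the package: it encodes one short source line in UTF-16-LE and in UTF-8 and reports whether the result contains null bytes. The helper name contains_null_bytes is made up for this illustration.

```python
def contains_null_bytes(text, encoding):
    """Return True if encoding `text` with `encoding` yields any null byte."""
    return b"\x00" in text.encode(encoding)


line = "x = 'α'\n"

# Every ASCII character becomes two bytes in UTF-16-LE, the second being null.
utf16_has_nulls = contains_null_bytes(line, "utf-16-le")
# UTF-8 produces no null byte for text that does not contain U+0000.
utf8_has_nulls = contains_null_bytes(line, "utf-8")

print(utf16_has_nulls, utf8_has_nulls)  # True False
```

This is why a text layer that hands UTF-16-LE bytes to the tokenizer is a problem, while an encoding without embedded nulls is not.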
The win_unicode_console package was tested on Python 3.4 and interacts well with the pyreadline, IPython, and colorama packages.
Install the package from PyPI via pip install win-unicode-console (recommended), or download the archive and install it from there (e.g. pip install win_unicode_console-0.3.zip), or install the package manually by placing the directory win_unicode_console and the module run.py from the archive in the site-packages directory of your Python installation.
The recommended usage is just calling win_unicode_console.enable() whenever the fixes should be applied and win_unicode_console.disable() to revert all the changes. By default, custom stream objects are installed as well as a custom readline hook. If pyreadline is available, its readline hook is reused. For customization, see the sources. The logic should be clear.
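
As an illustration of pairing the two calls, here is a minimal sketch that wraps enable() and disable() in a context manager. The helper name unicode_console is hypothetical and not part of the package; the platform check and the ImportError guard are assumptions added so the snippet does nothing on other systems or when the package is not installed.

```python
import contextlib
import sys


@contextlib.contextmanager
def unicode_console():
    """Apply the win_unicode_console fixes for the duration of a block.

    Yields True if the fixes were applied, False otherwise (not running on
    Windows, or the package is not installed).
    """
    module = None
    if sys.platform == "win32":
        try:
            import win_unicode_console as module
        except ImportError:
            module = None
    if module is not None:
        module.enable()
    try:
        yield module is not None
    finally:
        # Revert the changes even if the block raised an exception.
        if module is not None:
            module.disable()


with unicode_console() as applied:
    print("fixes applied:", applied)
    if applied:
        print("α β γ")
```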
Calling win_unicode_console.enable() may be done automatically on Python startup by putting the call in your sitecustomize or usercustomize script. See https://docs.python.org/3/tutorial/interpreter.html#the-customization-modules for more information.
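
A sitecustomize.py along these lines might look as follows. This is only a sketch: the platform check, the ImportError guard, and the unicode_console_enabled flag are assumptions added for illustration, so that a missing package does not interfere with interpreter startup.

```python
# sitecustomize.py (or usercustomize.py)
import sys

unicode_console_enabled = False  # hypothetical flag, handy when debugging startup

if sys.platform == "win32":
    try:
        import win_unicode_console
    except ImportError:
        pass  # package not installed: start Python as usual
    else:
        win_unicode_console.enable()
        unicode_console_enabled = True
```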
To run a Python script with our custom REPL (which is not needed with the approach above), type py -i -m run script.py instead of py -i script.py. You can also put "C:\Windows\py.exe" -i -m run "%1" %* in the registry in order to run .py files interactively using the custom REPL. To run the custom REPL when the plain interactive console is started (just py), add the environment variable PYTHONSTARTUP pointing to site-packages\run.py.
From version 0.3, the custom stream objects have the standard filenos, so calling input doesn't handle Unicode without the custom readline hook.
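
Whether the current streams report the standard filenos can be inspected with a few lines. This is a rough sketch, assuming the standard filenos are 0 for stdin and 1 for stdout; the helper name has_standard_filenos is hypothetical and not part of the package.

```python
import io
import sys


def has_standard_filenos(stdin=None, stdout=None):
    """Return True if both streams report the standard filenos 0 and 1."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    try:
        return stdin.fileno() == 0 and stdout.fileno() == 1
    except (AttributeError, OSError, ValueError):
        # Replaced, missing, or closed streams may not expose a usable fileno().
        return False


print(has_standard_filenos())
# In-memory streams have no file descriptor at all.
print(has_standard_filenos(io.StringIO(), io.StringIO()))  # False
```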
- The code of the streams module is based on the code submitted to http://bugs.python.org/issue1602.
- The idea of providing a custom readline hook and the code of the readline_hook module are based on https://github.com/pyreadline/pyreadline.