Localizing Notebooks - continued #16

Closed
wants to merge 10 commits into from
118 changes: 118 additions & 0 deletions jupyter-notebook-gui-translation/jupyter-notebook-gui-translation.md
@@ -0,0 +1,118 @@
# Jupyter Notebook Translation and Localization

## Problem

There is currently no standard approach for translating the GUI of Jupyter notebook.
This has driven some people to do a
[single language translation for Jupyter 4.1](https://twitter.com/Mbussonn/status/685870031247400960).

Review comment (Member): I may be wrong, but I think @Carreau's tweet was pointing to a translation of the release announcement, not the UI.

Review comment (Member): Indeed. I often translate at least the release announcement into French.


For reference, here are previous attempts and related issues:

- https://github.com/ipython/ipython/issues/6718
- https://github.com/ipython/ipython/pull/5922
- https://github.com/jupyter/notebook/issues/870

## Proposed Enhancement

For Python and Jinja2: use [Jinja2 with Babel](http://jinja.pocoo.org/docs/dev/extensions/#i18n-extension)
to create a `.pot` template -> translators translate it into per-language `.po` files -> compile each `.po` into a `.mo` (probably at install time).
Python and Jinja2 consume the `.mo` directly.

For JavaScript (client side), use Babel ([pybabel extract](http://babel.pocoo.org/en/latest/cmdline.html?highlight=extract)) to create
a `.pot` template -> translators translate it into `.po` files -> generate JSON text from each `.po` by
[iterating over the catalog](http://babel.pocoo.org/en/latest/api/messages/catalog.html#catalogs)
and [exporting it to JSON](https://docs.python.org/2/library/json.html).
From there, [jQuery Globalize](https://github.com/jquery/globalize/blob/master/doc/api/message/load-messages.md)
can read and process the message catalog.
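
The `.po` to JSON step could look roughly like the sketch below. It uses a deliberately small hand-written `.po` reader as a stand-in for Babel's catalog API so that it runs with the standard library alone; `parse_po`, `to_messages_json`, and the sample strings are hypothetical names and data. It also assumes that Globalize expects messages keyed by locale, as the linked load-messages documentation describes.

```python
import json

# Sample catalog text; the French strings are illustrative only.
PO_TEXT = r'''
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "New Notebook"
msgstr "Nouveau notebook"

msgid "You are using Jupyter notebook."
msgstr "Vous utilisez Jupyter notebook."
'''


def unquote(token):
    # .po strings are C-style quoted; json.loads covers the common escapes.
    return json.loads(token)


def parse_po(text):
    """Read msgid/msgstr pairs. Plural forms and msgctxt are not handled."""
    entries = []  # list of [msgid, msgstr]
    field = None  # 0 while continuing a msgid, 1 while continuing a msgstr
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("msgid "):
            entries.append([unquote(line[6:]), ""])
            field = 0
        elif line.startswith("msgstr "):
            entries[-1][1] = unquote(line[7:])
            field = 1
        elif line.startswith('"') and field is not None:
            entries[-1][field] += unquote(line)
    # Drop the metadata header (empty msgid) and untranslated entries.
    return {msgid: msgstr for msgid, msgstr in entries if msgid and msgstr}


def to_messages_json(catalog, locale):
    # One top-level key per locale, mapping each msgid to its translation.
    return json.dumps({locale: catalog}, ensure_ascii=False,
                      indent=2, sort_keys=True)


payload = to_messages_json(parse_po(PO_TEXT), "fr")
print(payload)
```

Using the English source string as the message key keeps the JavaScript side consistent with the gettext convention used for Python and the templates.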

## Detailed Explanation

The GUI text is mostly hard-coded in [HTML template files](https://github.com/jupyter/notebook/tree/master/notebook/templates), with some strings written in [JavaScript files](https://github.com/jupyter/notebook/blob/master/notebook/static/notebook/js/about.js#L12) and even a few words in [Python code](https://github.com/jupyter/notebook/blob/4578c34b0f999735ee49e1492be3dd5941951551/notebook/base/handlers.py#L332).

### HTML Templates

Tornado [exposes](http://www.tornadoweb.org/en/stable/guide/templates.html#template-syntax) its `translate()` function to template rendering under the shorthand `_` (a common convention in other web frameworks such as Django). This is an example of how to translate the [menu of a notebook](https://github.com/jupyter/notebook/blob/4.x/notebook/templates/notebook.html#L80):

```HTML
<a href="#">New Notebook</a>
```

With the translation call added, it becomes:

```HTML
<a href="#">{{ _("New Notebook") }}</a>
```

### JavaScript files

Review comment (Member): I don't think this section is accurate - it looks like it's left over from when the proposal was to turn all JS files into templates, which we agreed we don't want to do.


For JavaScript we will use the same approach as for HTML, with a few extra changes to make sure JavaScript files are rendered, and therefore translated, before they are sent to the browser. The approach is as follows:

1. Subclass `web.StaticFileHandler` and call it `JupyterStaticFileHandler`.
2. Override its `get()` method so that static files ending in `.js` are rendered as templates.
3. Use `JupyterStaticFileHandler` instead of `web.StaticFileHandler` in the request handler for static files.

To demonstrate translation in a [JavaScript file](https://github.com/jupyter/notebook/blob/4.x/notebook/static/notebook/js/about.js#L12), take the following line:

```javascript
var text = 'You are using Jupyter notebook.<br/><br/>';
```

With the translation call added, it becomes:

```javascript
var text = "{{ _('You are using Jupyter notebook.<br/><br/>') }}";
```

### Python files

We can use `translate(message, plural_message=None, count=None)` (or its shorthand `_`) in a Tornado `RequestHandler`, or anywhere else that text is sent to the GUI.

To demonstrate this, here is [existing text in Python](https://github.com/jupyter/notebook/blob/4578c34b0f999735ee49e1492be3dd5941951551/notebook/base/handlers.py#L332) that needs translation:

```python
raise web.HTTPError(400, u'Invalid JSON in body of request')
```

With the translation call added, it becomes:

```python
raise web.HTTPError(400, _(u'Invalid JSON in body of request'))
```
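
For readers who want to see where `_` could come from outside a Tornado handler, here is a minimal sketch using the standard `gettext` module. The domain name `notebook`, the `locale` directory, and the `parse_json_body` helper are assumptions made for illustration, and `ValueError` stands in for `web.HTTPError` so the example has no Tornado dependency.

```python
import gettext
import json

# Hypothetical domain and directory. With fallback=True, a missing catalog
# yields a NullTranslations object that returns each message unchanged.
translations = gettext.translation(
    "notebook", localedir="locale", languages=["fr"], fallback=True
)
_ = translations.gettext


def parse_json_body(body):
    """Decode a request body, reporting failures with a translatable message."""
    try:
        return json.loads(body)
    except ValueError:
        # Stand-in for: raise web.HTTPError(400, _(u'Invalid JSON in body of request'))
        raise ValueError(_("Invalid JSON in body of request"))
```

Because the lookup falls back to the original string, code marked up this way keeps working in English before any catalog has been installed.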

## Translation Files

All languages, including English, will be treated as translations. All translation files will live inside the extensions folder and be treated as extensions. This allows Jupyter to ship with a single translation (English) and lets people install other translations as nb-extensions.

Each language will have a `.po` (Portable Object) file, which is compiled to a `.mo` (Machine Object) file for use with gettext, which is supported by the `translate()` function in Tornado.

The original PO file can be created using [xgettext](http://www.gnu.org/software/gettext/manual/gettext.html#xgettext-Invocation):

```bash
xgettext [option] [inputfile]
```

For the translation itself, we can edit the PO files in a text editor, but I would recommend a crowd-sourced solution where people can translate words or sentences in a web application like [POEditor](https://poeditor.com/features/).

## Which translation to use?

The default configuration file can be used to add a new configuration variable for the default language.

c.gui_language = 'en_US'

We can also set it to `"auto"` to have Tornado detect the end-user's language from the `Accept-Language` header. Tornado finds the best match for the end-user's language, or returns the default language if no matching translation is available.
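
The selection logic described here can be approximated with the standard library alone, as in the sketch below. Tornado has its own implementation (`RequestHandler.get_browser_locale`), so this is only an illustration of the idea; `pick_language`, `parse_accept_language`, and the set of available catalogs are hypothetical.

```python
def parse_accept_language(header):
    """Return the language codes from an Accept-Language header, best first."""
    scored = []
    for part in header.split(","):
        pieces = part.strip().split(";")
        code = pieces[0].strip()
        if not code or code == "*":
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in pieces[1:]:
            name, _sep, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            scored.append((quality, code))
    # The sort is stable, so codes with equal quality keep their header order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [code for _quality, code in scored]


def normalize(code):
    """Turn 'fr-ca' into 'fr_CA' so it can be compared with catalog names."""
    lang, _sep, region = code.replace("-", "_").partition("_")
    return lang.lower() + ("_" + region.upper() if region else "")


def pick_language(configured, accept_language, available, default="en_US"):
    """Resolve the GUI language from the config value and the request header."""
    if configured != "auto":
        return configured if configured in available else default
    for code in parse_accept_language(accept_language):
        wanted = normalize(code)
        if wanted in available:
            return wanted
        # Otherwise accept any catalog that shares the base language.
        base = wanted.split("_")[0]
        for candidate in sorted(available):
            if candidate.split("_")[0] == base:
                return candidate
    return default


available = {"en_US", "fr_FR"}
print(pick_language("auto", "fr-CA,fr;q=0.8,en;q=0.5", available))
```

An explicit setting such as `c.gui_language = 'fr_FR'` bypasses the header entirely, while `"auto"` falls through to the default when nothing in the header matches an installed catalog.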

## Pros and Cons

Pros associated with this implementation include:
* No extra dependencies
* Uses a well-known standard that can be extended to any number of languages
* Can be used later with JupyterHub to set multiple languages for multilingual teams

Cons associated with this implementation include:
* JavaScript strings and HTML files will have `{{ _(XXX) }}` in the source code.
* The development guidelines will need to change to require translation calls.
* Rendering JavaScript files means you cannot use `{{XXX}}` or `{% X %}` inside any JavaScript file. This rules out [mustache](https://mustache.github.io/) (it is not used now, but it could not be adopted in the future).

## Interested Contributors
@twistedhardware @rgbkrk @captainsafia @JCEmmons @srl295