Support translations for 3rd party mods #505
This makes both libintl and cata_libintl try to use MO files where dialect is specified and matches language (e.g. fr_FR instead of just fr) before falling back to using dialect-agnostic version.
Cherry-picked from commit CleverRaven/Cataclysm-DDA@35a2d9e Co-authored-by: Zhilkin Serg <ZhilkinSerg@users.noreply.github.com>
@@ -1,14 +1,14 @@
 [
   {
     "//": "See language.h for documentation",
-    "id": "en",
+    "id": "en_US",
I'll probably have to adjust the Transifex settings regarding language names after this.
From what I can tell, it won't allow en_US to count as en, even if there is no other en.
When dialect is specified in language id, the game picks up both dialect-specific and generic MO files (both old and new implementations), so it should be fine as is
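The lookup order described in this thread (dialect-specific first, then dialect-agnostic) can be sketched roughly as follows. This is only an illustration in Python, not the game's C++ code; the function name `language_candidates` is hypothetical.

```python
def language_candidates(lang_id: str) -> list[str]:
    """Return language codes to try, most specific first.

    Hypothetical helper: "fr_FR" yields ["fr_FR", "fr"], while a bare
    "fr" yields just ["fr"].
    """
    candidates = [lang_id]
    base, sep, _dialect = lang_id.partition("_")
    # Only add the generic code when a dialect suffix was actually present.
    if sep and base:
        candidates.append(base)
    return candidates


if __name__ == "__main__":
    for code in ("en_US", "fr"):
        print(code, "->", language_candidates(code))
```

Under this scheme a language id of en_US still finds a generic en catalog, which is the behaviour the reply above relies on.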
//
// This test reaffirms the assumption that both Transifex's and GNU's plf expressions
// produce same values for integer numbers.
TEST_CASE( "gnu_transifex_rules_equal", "[libintl][i18n][.]" )
Dot because slow?
Yeah, 9-10 seconds.
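For readers unfamiliar with this kind of test, the idea of checking two plural-form expressions against each other by brute force can be sketched in Python. The two rules below are illustrative Polish-style formulations written for this example; they are assumptions, not the expressions used in the PR, and `first_mismatch` is a hypothetical helper.

```python
def plural_gnu_style(n: int) -> int:
    # Illustrative rule: excludes 12..14 via (n%100 < 10 or n%100 >= 20).
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


def plural_alt_style(n: int) -> int:
    # Same rule written differently: excludes 12..14 via (n%100 < 12 or n%100 > 14).
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 12 or n % 100 > 14):
        return 1
    return 2


def first_mismatch(rule_a, rule_b, limit: int):
    """Return the first n in [0, limit) where the rules disagree, else None."""
    for n in range(limit):
        if rule_a(n) != rule_b(n):
            return n
    return None


if __name__ == "__main__":
    print(first_mismatch(plural_gnu_style, plural_alt_style, 100_000))
```

Running such a comparison over a large range for every supported language is what makes an exhaustive check of this kind slow.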
@@ -0,0 +1 @@
(binary file content, not displayable)
Not exactly empty. Is it supposed to be "empty MO" with magic numbers at start, truly empty file that ended up not-empty, or something else?
The logic here is that if the file is empty, or is smaller than 4 bytes, it's guaranteed to not be able to hold a magic number, and therefore cannot be a MO file.
I should've renamed it to not_a_mo_file.mo or something similar.
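The size argument above can be sketched as a small check. This is a Python illustration only, not the PR's C++ code; `could_be_mo` is a hypothetical name, while 0x950412de is the magic number defined by the gettext MO format, stored in either byte order.

```python
import struct

MO_MAGIC = 0x950412DE  # gettext MO magic number


def could_be_mo(data: bytes) -> bool:
    """Cheap rejection test: does the buffer start with a MO magic number?"""
    if len(data) < 4:
        # Too short to even hold the 4-byte magic number.
        return False
    little = struct.unpack_from("<I", data, 0)[0]
    big = struct.unpack_from(">I", data, 0)[0]
    return MO_MAGIC in (little, big)


if __name__ == "__main__":
    print(could_be_mo(b""))                          # empty file
    print(could_be_mo(struct.pack("<I", MO_MAGIC)))  # magic only
```

Passing this check does not prove the file is a valid MO file; it only rules out files that cannot be one.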
Got it working with the docs, but it may be useful to have a "tl;dr" variant of the docs somewhere.
For translation string collisions, maybe the current extraction script needs to be able to add some context, like the mod id.
I thought about that; in this case, the game would also have to know under which context to look for translations. The main roadblock is how the game handles data loaded from JSON. There is no such "context" right now that would make it possible to tell whether an item has been loaded from one mod or another, or whether half of its strings come from a completely different mod (e.g. the item's name has been modified via […]
Someone would have to go over every type of data loadable from JSON (items, monsters, mutations...) and make sure that every translatable string receives the proper context. This is a lot of work, and is bound to produce various obscure bugs no one would notice until much later.
The ideal solution would be to move away from gettext's "string is a translation lookup key and is also an English translation" schema to the "JSON files contain lookup keys, localization files contain all translations including English" schema used by many games and software projects, so that mod authors could append the mod id to all lookup keys in both JSON and localization data. But with 30k+ strings in JSON, that would take a lot of work for us and break mod compatibility for everyone else.
@Coolthulhu I've tried to do a little translation with the current docs, and everything went fine! The most "complex" part is running the scripts properly. Here are simplified steps that most modders should be able to follow; this set is for Windows:
The .bat file could be distributed along with the scripts too, if applicable; in that case step 4 should include that file as well, and step 5 becomes obsolete.
Yeah, that should make it easier.
Apparently full paths on Windows freak out Poedit (as […] Same with […]
So, seeing as there is no difference either way, I think we can use relative paths:
python extract_json_strings.py -i .\ -o lang\extracted_strings.pot --project Mod_Translation
python dedup_pot_file.py lang\extracted_strings.pot
Then tweak […]
@echo off
if not exist lang md lang
python extract_json_strings.py -i .\ -o lang\extracted_strings.pot
python dedup_pot_file.py lang\extracted_strings.pot
echo Done!
pause
@echo on
Alternatively, we could incorporate […]
Summary
SUMMARY: Infrastructure "New translation system with support for 3rd party mods"
Purpose of change
Allow 3rd party mods to ship their own MO files (CleverRaven/Cataclysm-DDA#25566).
Fix #495.
Fix plural forms for some languages on Android.
Describe the solution
Roll out custom runtime localization system (dubbed cata_libintl) that is compatible with gettext MO files.
Key differences from currently used GNU libintl:
BN only uses "default" translation domain, and rewriting toolchain to support multiple domains and rewriting the code to actually use them would be a huge and bug-prone undertaking.
BN only uses UTF-8, so that's fine.
Dependence on locale / environment variables caused numerous problems with libintl in the past, as Cata aims to support multiple platforms and compilers. The most recent issue is "Scaling factor x2 and x4 resets language to English" (#495), which seems to be caused by SDL fiddling with locale, and is absent when using cata_libintl.
Or from memory. This is currently used only in tests.
To enable multi-MO support without the ability to use domains, the new system has to be able to merge MO files "on the fly". It is currently possible to do so manually via gettext utilities (use msgunfmt to decompile base game MO back into PO, then msgcat to concatenate with PO from a mod, then msgfmt to compile back into MO), but the process takes some time (20 seconds on my laptop) and has to be repeated each time the game updates or the mod list changes.
Also add documentation on how to translate 3rd party mods.
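As a rough illustration of what merging MO files "on the fly" involves, the following Python sketch parses MO data from memory and merges the resulting catalogs. It is not the cata_libintl implementation (which is C++). The function names are hypothetical, plural forms and the hash table are ignored, and the "first catalog wins" rule for duplicate strings is an assumption made for this example.

```python
import struct

MO_MAGIC = 0x950412DE  # gettext MO magic number


def build_mo(pairs: dict[str, str]) -> bytes:
    """Build a minimal little-endian MO image (no hash table); for demos only."""
    items = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in pairs.items())
    n = len(items)
    strings_start = 28 + 16 * n          # header + two tables of (length, offset)
    tables, blob = b"", b""
    for column in (0, 1):                # originals first, then translations
        for item in items:
            text = item[column]
            tables += struct.pack("<2I", len(text), strings_start + len(blob))
            blob += text + b"\0"         # strings are NUL-terminated
    header = struct.pack("<7I", MO_MAGIC, 0, n, 28, 28 + 8 * n, 0, strings_start)
    return header + tables + blob


def parse_mo(data: bytes) -> dict[str, str]:
    """Parse MO data held in memory into a msgid -> msgstr dictionary."""
    if len(data) < 20:
        raise ValueError("too short to be a MO file")
    if struct.unpack_from("<I", data, 0)[0] == MO_MAGIC:
        order = "<"
    elif struct.unpack_from(">I", data, 0)[0] == MO_MAGIC:
        order = ">"
    else:
        raise ValueError("bad magic number")
    _revision, count, orig_off, trans_off = struct.unpack_from(order + "4I", data, 4)
    catalog = {}
    for i in range(count):
        o_len, o_pos = struct.unpack_from(order + "2I", data, orig_off + 8 * i)
        t_len, t_pos = struct.unpack_from(order + "2I", data, trans_off + 8 * i)
        msgid = data[o_pos:o_pos + o_len].decode("utf-8")
        catalog[msgid] = data[t_pos:t_pos + t_len].decode("utf-8")
    return catalog


def merge_catalogs(catalogs: list[dict[str, str]]) -> dict[str, str]:
    """Merge catalogs; on duplicate msgids the earliest catalog wins (assumption)."""
    merged: dict[str, str] = {}
    for catalog in catalogs:
        for msgid, msgstr in catalog.items():
            merged.setdefault(msgid, msgstr)
    return merged


if __name__ == "__main__":
    base = parse_mo(build_mo({"cat": "chat", "dog": "chien"}))
    mod = parse_mo(build_mo({"dog": "toutou", "laser": "laser"}))
    print(merge_catalogs([base, mod]))
```

Merging in memory like this avoids the msgunfmt / msgcat / msgfmt round trip, at the cost of having to decide how duplicate strings are resolved.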
Describe alternatives you've considered
Call gettext utilities from within the game to merge MO files and use some caching algorithm to cut down on re-compilations.
This solution has multiple potential problems:
[…] en and en_US are treated differently by libintl, and the same probably applies to msgcat).
[…] libintl […] Cata uses custom fork of libintl-lite, which would be its own source of fun little inconsistencies someone would have to sort through.
Drop gettext completely and use some alternative localization system (e.g. tables of string id : translation as many games and software projects do), possibly with built-in merge function.
We have over 10 thousand strings in source code and over 30 thousand in JSONs. Moving from gettext would be a pain.
Use some existing lightweight library for working with gettext files.
The only one I've found that wouldn't be a pain to use is spirit-po, but it requires Boost.
Testing
All tests pass.
Benchmarks (Intel Core i5-3230M)
Testing / benchmarking "in the wild" didn't indicate any significant changes in cpu/memory usage, and MO parsing + string sorting always took less than 50 ms. Reading from disk took 0..500 ms depending on os / hard drive / filesystem cache.
Additional context
Low-priority stuff for future PRs:
[…] modinfo.json)
[…] libintl code and remove it from build dependencies