Minor updates.
gdiaz384 committed Mar 26, 2024
1 parent 849e780 commit 1727cf4
Showing 5 changed files with 23 additions and 14 deletions.
11 changes: 5 additions & 6 deletions README.md
@@ -69,8 +69,7 @@ Undetermined if:

## Installation guide

`Current version: 0.1 - 2024Mar20 pre-alpha`
`Current version 2024.03.20 pre-alpha`
`Current version: 2024.03.20 pre-alpha`

Warning: py3TranslateLLM is currently undergoing active development, but the project is still in the alpha stages. Alpha means core functionality is currently under development.

@@ -434,9 +433,9 @@ Libraries can also require other libraries.
- 'Sugoi NMT' is a wrapper for fairseq which, along with the pretrained model, does the heavy lifting for 'Sugoi NMT'.
- Sugoi NMT is one part of the 'Sugoi Translator Toolkit' which is itself part of the free-as-in-free-beer distributed 'Sugoi Toolkit' which contains other projects like manga translation and upscaling.
- The use of GitHub to post source code for Sugoi Toolkit suggests intent to keep the wrapper code under a permissive license. A more concrete license may be available on Discord.
- py3TranslateLLM.py and the associated libraries under `resources/` are [GNU Affero GPL v3](//www.gnu.org/licenses/agpl-3.0.html). Summary:
- Feel free to use it, modify it, and distribute it to an unlimited extent, but if you distribute binary files of this program outside of your organization, then please make the source code for those binaries available.
- The imperative to make source code available also applies if using this program as part of a server if that server is publicly accessible.
- py3TranslateLLM.py and the associated libraries under `resources/` are [GNU Affero GPL v3](//www.gnu.org/licenses/agpl-3.0.html).
- Summary: You are free to use the software as long as you do not infringe on the [freedoms](https://www.gnu.org/philosophy/free-sw.en.html#four-freedoms) of other people.
- Details: Feel free to use it, modify it, and distribute it to an unlimited extent, but *if you distribute binary files of this program outside of your organization*, then please make the source code for those binaries available.
- The imperative to make source code available also applies if using this program as part of a server *if that server can be accessed by people outside of your organization*.
- Binaries for py3TranslateLLM.py made with pyinstaller, or another program that can make binaries, also fall under GNU Affero GPL v3.
- This assumes the licenses for libraries used in the binary are compatible with one another. If the licenses used for a particular binary are not compatible with one another, then the resulting binary is not considered redistributable. Only lawyers can determine that, and also only lawyers need to worry about it.

4 changes: 2 additions & 2 deletions resources/chocolate.py
@@ -436,8 +436,7 @@ def importFromCSV(self, fileNameWithPath,myFileNameEncoding=defaultTextFileEncod


def exportToCSV(self, fileNameWithPath, fileEncoding=defaultTextFileEncoding):
#print('Hello World'.encode(consoleEncoding))
with open(fileNameWithPath, 'w', newline='', encoding=fileEncoding,errors=outputErrors) as myOutputFileHandle:
with open(fileNameWithPath, 'w', newline='', encoding=fileEncoding, errors=outputErrors) as myOutputFileHandle:
myCsvHandle = csv.writer(myOutputFileHandle)

# Get every row for current spreadsheet.
@@ -448,6 +447,7 @@ def exportToCSV(self, fileNameWithPath, fileEncoding=defaultTextFileEncoding):
for cell in row:
tempList.append( str(cell) )
myCsvHandle.writerow(tempList)

print( ('Wrote: '+fileNameWithPath).encode(consoleEncoding) )
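
For readers without the rest of `chocolate.py` at hand, the export pattern in this hunk can be sketched as a standalone function. This is only an illustration, not the project's code: the function name `export_rows_to_csv`, the default `backslashreplace` error handler, and the sample rows are assumptions, since the diff does not show how `outputErrors` or the spreadsheet object are defined.

```python
import csv
import os
import tempfile


def export_rows_to_csv(rows, file_name_with_path, file_encoding='utf-8',
                       output_errors='backslashreplace'):
    """Write an iterable of rows to a CSV file, converting every cell to str.

    Hypothetical stand-in for exportToCSV(); the error handler default is an
    assumption, not the value py3TranslateLLM actually uses.
    """
    # newline='' lets the csv module control line endings itself.
    with open(file_name_with_path, 'w', newline='', encoding=file_encoding,
              errors=output_errors) as output_handle:
        csv_handle = csv.writer(output_handle)
        for row in rows:
            csv_handle.writerow([str(cell) for cell in row])


# Sample data made up for this example.
rows = [['rawText', 'translation'], ['こんにちは', 'Hello'], [1, None]]
out_path = os.path.join(tempfile.mkdtemp(), 'example.csv')
export_rows_to_csv(rows, out_path)

# Read the file back to confirm the round trip.
with open(out_path, newline='', encoding='utf-8') as input_handle:
    round_trip = list(csv.reader(input_handle))
print('Wrote: ' + out_path)
```

Note that every cell comes back as a string, so `None` is stored as the text `None` rather than as an empty cell.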


4 changes: 3 additions & 1 deletion wiki/Home.markdown
@@ -38,4 +38,6 @@
- [Translation Technologies - NMT, Language Models, LLMs, and AI](//github.com/gdiaz384/py3TranslateLLM/wiki/Translation-Technologies-%E2%80%90-NMT,-Language-Models,-LLMs,-and-AI)
- [Command Line Interfaces (CLI) Resources](//github.com/gdiaz384/py3TranslateLLM/wiki/Command-Line-Interfaces-(CLI)-Resources)
- [Text Encoding](//github.com/gdiaz384/py3TranslateLLM/wiki/Text-Encoding)
- [Licensing]
- Summary of Software [Licensing](//docs.codeberg.org/getting-started/licensing).
- [What is Free Software?](https://www.gnu.org/philosophy/free-sw.en.html)
- [License List](https://www.gnu.org/licenses/license-list.html) and [GPL FAQ](https://www.gnu.org/licenses/gpl-faq.html).
13 changes: 10 additions & 3 deletions wiki/pages/Text-Encoding.markdown
@@ -9,16 +9,16 @@

### Common and Standard Encodings

- For all Python supported encodings see: [standard-encodings](//docs.python.org/3.7/library/codecs.html#standard-encodings). Common encodings:
- `utf-8` - If at all possible, please only use `utf-8`, and use it for absolutely everything.
- For all Python supported encodings see: [standard-encodings](//docs.python.org/3.7/library/codecs.html#standard-encodings). Common encodings:
- [`utf-8`](https://www.ietf.org/rfc/rfc3629.txt) - If at all possible, please only use `utf-8`, and use it for absolutely everything.
- py3TranslateLLM uses `utf-8` as the default encoding for everything except kirikiri.
- `shift-jis` - Required by the kirikiri game engine and many Japanese visual novels, games, programs, media, and text files in general.
- `utf-16-le` - a.k.a. `ucs2-bom-le`. Alternative encoding used by the kirikiri game engine. TODO: Double check this.
- `cp437` - This is the old IBM/DOS code page for English that Windows with an English locale often uses by default.
- `cp1252` - This is the code page for Western European languages that Windows with an English locale often uses by default.
- [Error handlers](//docs.python.org/3.7/library/codecs.html#error-handlers) can be used to handle conversion errors from one type of encoding to another.
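
As a rough illustration of the error handlers mentioned above, the following sketch encodes the same string with several of the listed encodings. The sample text and variable names are made up for the example; only standard `str.encode()` and `bytes.decode()` calls are used.

```python
# Sample text containing characters that cp437 cannot represent.
text = 'こんにちは world'

# shift-jis can represent the Japanese characters, so this round-trips cleanly.
sjis_bytes = text.encode('shift-jis')
round_trip = sjis_bytes.decode('shift-jis')

# The default handler is 'strict', which raises on unencodable characters.
strict_failed = False
try:
    text.encode('cp437')
except UnicodeEncodeError:
    strict_failed = True

# 'replace' substitutes '?' for each unencodable character (lossy).
replaced = text.encode('cp437', errors='replace')

# 'backslashreplace' keeps the code points as escape sequences instead.
escaped = text.encode('cp437', errors='backslashreplace')

print(strict_failed)
print(replaced)
print(escaped)
```

Which handler is appropriate depends on whether losing characters is acceptable; `backslashreplace` at least keeps enough information to recover the original text later.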

### Windows specific notes
### Windows Specific Notes

- Due to English locales being very common on Windows, both `cp437` and `cp1252` are very often the encoding used by `cmd.exe`.
- On newer versions of Windows (~Win 10 1809+), consider changing the console encoding to native `utf-8`.
@@ -27,3 +27,10 @@
- Historically, setting the Windows command prompt to ~utf-8 will reliably make it crash which makes having to deal with `cp437` and `cp1252` inevitable.
- To print the currently active code page on Windows, open a command prompt and type `chcp`.
- To change the code page for that session type `chcp <codepage #>` as in: `chcp 1252`
- [Windows CLI](https://devblogs.microsoft.com/commandline/windows-command-line-unicode-and-utf-8-output-text-buffer/) update to utf-8 in Windows 10 1809.
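
From inside Python, a quick way to see whether the active console encoding will cause trouble is to test-encode a sample string. This is a minimal sketch, not something py3TranslateLLM is known to do; the helper name `can_encode` and the fallback to `utf-8` when `sys.stdout` reports no encoding are assumptions for this example.

```python
import sys


def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec without errors."""
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        return False
    return True


# sys.stdout.encoding reflects the console code page on Windows (e.g. cp437).
# It can be None when output is redirected to some objects, hence the fallback.
console_encoding = getattr(sys.stdout, 'encoding', None) or 'utf-8'

sample = 'こんにちは'
report = {name: can_encode(sample, name)
          for name in ('utf-8', 'shift-jis', 'cp437', 'cp1252')}

# Only ASCII is printed here so the report itself cannot fail on a legacy console.
print('Console encoding: ' + console_encoding)
print('Console can print the sample: ' + str(can_encode(sample, console_encoding)))
print(report)
```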

### More Information For Software Developers

- [The Absolute Minimum Every Software Developer Must Know about Unicode and Character Sets](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/).
- https://docs.python.org/3/howto/unicode.html
- https://docs.python.org/3/library/codecs.html#encodings-and-unicode
5 changes: 3 additions & 2 deletions wiki/pages/fairseq-installation-guide.markdown
@@ -63,7 +63,7 @@ Hardware (CPU/GPU)
- Download the latest version from [PyPI](//pypi.org/project/fairseq) using pip:
- `pip install fairseq`
- Requires:
- Python 3.6+
- Python 3.8+
- PyTorch 1.5.0+
- `pip` will install various dependencies automatically if using the latest stable version.
- More information: https://pypi.org/project/fairseq/
@@ -76,12 +76,13 @@ Hardware (CPU/GPU)
- (Optional) Download and install `git`: https://git-scm.com/download/
- It is possible to download fairseq as a release, a main repository archive, or the latest stable version using pip.
- `git` is not needed but still nice to have.
- Alternatively, use Chocolatey: `choco install git.install --params "/NoShellIntegration"`
- [Chocolatey](//chocolatey.org) is a package manager for Windows. It tends to be very good for programs that do not need any special options set during installation, like Ninja, and unlike Python and `git`. Programs like `git` require special handling to look up which installation parameters are necessary to make them behave. This was provided in the example above for convenience.
- Download the [Ninja](//ninja-build.org) build system and put the binary somewhere in %path%: [github.com/ninja-build/ninja/releases](//github.com/ninja-build/ninja/releases)
- To check for locations to place the Ninja binary file, open a command prompt (`cmd.exe`) or terminal and type the following:
- Windows: `echo %path%`
- Linux: `echo $PATH`
- Alternatively, `choco install ninja`
- [Chocolatey](//chocolatey.org) is a package manager for Windows. It tends to be very good for programs that do not need any special options set during installation, like Ninja, and unlike Python and Git.
- On Windows, building from source requires [Visual Studio C++ 2015 build tools](//stackoverflow.com/questions/40504552/how-to-install-visual-c-build-tools), [Visual Studio Build Tools 2015-2017](//aka.ms/vs/15/release/vs_buildtools.exe) in addition to the requirements below.
- Microsoft bundles the installer for the 2015 Build Tools in with their 2017 Visual Studio Installer.
- **Important**: fairseq needs the "Visual C++ 2015 Build Tools" to compile. These are not selected by default.
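
Before attempting a source build, it may help to confirm that the tools above are actually reachable. The sketch below uses `shutil.which()`, which searches the same `PATH` that `echo %path%` or `echo $PATH` prints. The helper name `find_tools` is an assumption for this example, and the `(3, 8)` check simply mirrors the Python requirement listed earlier in this guide.

```python
import shutil
import sys


def find_tools(names):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# The Python requirement listed in this guide is 3.8+.
python_ok = sys.version_info >= (3, 8)
print('Python 3.8+: ' + str(python_ok))

found = find_tools(['git', 'ninja'])
for name, path in found.items():
    # A value of None means the binary was not found in any PATH directory.
    print(name + ': ' + (path if path else 'not found on PATH'))
```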