Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unicode fixes #246

Merged
merged 5 commits into from
Apr 30, 2019
Merged

Conversation

Aurel300
Copy link
Member

@Aurel300 Aurel300 commented Apr 25, 2019

  • convert to UTF-16 before printing in sys_print (similar to the fix in hxcpp)
  • 0xD7C0 was used instead of 0xD800 for high surrogates
  • handle surrogates when calculating string length
  • handle surrogates when converting to UTF-8
  • Sys.putEnv("TEST_VAR", "😉"); trace(Sys.getEnv("TEST_VAR")); did not produce the correct result (UTF-16 to UTF-8 conversion was wrong)
  • Unicode out of bmp char haxe#8204 ? cannot reproduce yet

@ncannasse
Copy link
Member

This should not be needed. Uprintf implementation is responsible for doing that.

@Aurel300
Copy link
Member Author

What platform are you testing on? Does trace("😉"); work for you? On Ubuntu 16.04 without this fix it prints the surrogate bytes (so it just shows ?????? in the terminal).

@ncannasse
Copy link
Member

What I meant is that it should be fixed by modifying uprintf implementation instead of changing only hl_sys_print.

I'm testing on Windows. And yes it's using wprintf which does not seem to correctly convert to utf8. But this seems quite tricky because Windows terminal is not utf8 so trace("é") works with wprintf in cmd but does not work when converting to utf8, it shows é in terminal.

@ncannasse
Copy link
Member

I fixed it for windows in 3c6164d (+ just added stderr as well)
@hughsando you might want to use that as well, it sets stdout to utf8 so wprintf actually works well when printing both to terminal and redirecting to a file

@Aurel300 on other platforms, look at uprintf implementation in ucs2.c

@Aurel300
Copy link
Member Author

Aurel300 commented Apr 25, 2019

Yes, you're right – I thought uprintf was a built-in. But uprintf (utostr) doesn't actually handle surrogate conversion, so I'll fix that.

@Aurel300 Aurel300 marked this pull request as ready for review April 26, 2019 10:27
@ncannasse ncannasse merged commit 260b628 into HaxeFoundation:master Apr 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants