Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unicode sys tests and fixes #8135

Merged
merged 77 commits into from
Jun 11, 2019

Conversation

Aurel300
Copy link
Member

@Aurel300 Aurel300 commented Apr 9, 2019

Closes #8094.

General

  • define target.unicode - true by default, false on Neko or with (cpp && !cppia && !hxcpp_smart_strings) (extracted into a separate PR target.unicode define #8244)

Test setup

  • create a script to generate a test suite with Unicode-named files and directories
    • no NUL bytes
    • no /:*"?<>|
    • UTF-8 boundary conditions
    • non-BMP characters
    • NFC/NFD ambiguous characters
    • compatibility equivalent characters? (out of scope)

Filesystem

  • Sys
    • setCwd, getCwd (cd into Uni directory, read back path)
    • programPath (symlink in a Uni directory, read back path) this does not work over symlinks – we could copy binaries into Uni directories and run them there, but this is problematic to cover for all targets; getCwd should mostly cover this anyway
  • sys.FileSystem
    • absolutePath (after setCwd, nested)
    • exists
    • fullPath (with symlinks in Uni directories)
    • isDirectory
    • readDirectory (test dir + nested Uni directories)
    • stat
    • rename
    • deleteFile
    • deleteDirectory
    • createDirectory
  • sys.io.File
    • copy (this should probably be in sys.FileSystem)
  • haxe.io.Path
    • construct + check directory components, name components, extension components

I/O API

Read from a known source. Write to a temporary file and compare on Bytes against a known correct output.

  • trace
  • sys.io.File
    • getBytes
    • getContent
    • read (+ haxe.io.Input test)
    • saveContent
    • update, write, append (+ haxe.io.Output test)
    • Unicode paths for the above
  • haxe.io.Input
    • readLine
    • readString
    • readUntil
  • haxe.io.Output
    • writeString
  • Sys
    • stdin, stderr, stdout
    • print, println
    • getChar (not Unicode related)

Environment-related API

  • Sys
    • args
    • environment
    • getEnv, putEnv

Process API

  • Sys
    • command - Uni executable, Uni arguments
  • sys.io.Process
    • new
    • stdin, stderr, stdout

Miscellaneous

  • move sys Unicode tests into TestUnicode
  • genTestRes.sh Windows port commit test suite to git just use Python
  • some common scaffolding for running process of the same platform (currently in TestUnicode as well as TestCommandBase)
  • replace ExitCode with UtilityProcess

Check targets

  • Travis
    • cpp
    • cs
    • eval
    • hl
    • lua
    • java
    • jvm
    • php
    • python
  • AppVeyor
    • cpp
    • cs
    • eval
    • hl
    • java
    • jvm
    • php
    • python

Issues / PRs spawned

@RealyUniqueName
Copy link
Member

sys.io.File is missing on your list.

@RealyUniqueName
Copy link
Member

Also, a windows script for unicode filenames generation is needed.

@Aurel300
Copy link
Member Author

sys.io.File is missing on your list.

Yup, I only listed the methods for FileSystem, didn't go through File in detail yet.

Also, a windows script for unicode filenames generation is needed.

I might port it yet, although the idea was partly to generate the test suite and commit the resulting files and directories to git.

@skial skial mentioned this pull request Apr 10, 2019
1 task
@Aurel300 Aurel300 requested review from Simn and ncannasse April 23, 2019 19:04
@Aurel300 Aurel300 marked this pull request as ready for review April 23, 2019 19:13
@ncannasse
Copy link
Member

Given there's not much fixes to do I'm putting that in Release 4.0 to monitor the changes. In case we have some very target specific issue yet unresolved we will still not block final release because of this.

@Aurel300 Aurel300 force-pushed the issue/8094-sys-unicode branch 2 times, most recently from c9e1679 to aa2fe60 Compare April 24, 2019 21:58
@Aurel300
Copy link
Member Author

So the non-Windows part of this is basically done. The various target failures should be covered by the "Issues / PRs spawned" list. I am not sure how to proceed with Windows support here (cc @andyli):

The last commit attempted to simply commit the test suite directory with its Unicode-named files and sub-directories. But AppVeyor fails to even git clone the repository with this. Apparently chcp 65001 should be run beforehand, but git clone is not even in the CI file, it is run automatically.

I also tried to put the test-res directory into a ZIP archive. Unfortunately this seems to fail on OS X already (it tries to override existing files, because \x01 is completely ignored, I think?) … Apparently this is best "solved" by using a different program, e.g. 7z.

Porting genTestRes.sh would be best, but outputting the correct Unicode filenames seems to be rather painful in Windows batch. I found workarounds to simply put binary data into a file (e.g. https://stackoverflow.com/questions/47750732/write-hex-values-to-file-in-windows-batch), but nothing yet to actually create files and directories with the correct names.

@andyli
Copy link
Member

andyli commented Apr 27, 2019

7z is pre-installed in AppVeyor: https://www.appveyor.com/docs/windows-images-software/#tools

You can also use whatever program that can correctly handle Unicode file names correctly (e.g. Python?).

One tip about working with AppVeyor: You can remote desktop into the build and do whatever testing/debugging there. Just make sure you're not doing it in the HF repo, but use a fork of your own, to prevent other people connect into it and print out our secret variables (those wouldn't be available in fork builds).

std/haxe/io/Path.hx Outdated Show resolved Hide resolved
@ncannasse
Copy link
Member

I have merged the HL PR and restarted the build to update results.

@ncannasse
Copy link
Member

ncannasse commented Apr 30, 2019 via email

@Aurel300
Copy link
Member Author

Uhm, something is weird there then, how and why were the tests were broken?

Here are the failing tests: https://travis-ci.org/HaxeFoundation/haxe/jobs/524618714#L3553

I didn't investigate why they failed, but you are right, it really should work on Java. Interestingly enough, the failures are e.g. expected 0x1F619, but got 0xF619 (forgetting to add the 0x10000), and this is through the Unicode-safe string iterator, I think.

I'll investigate if there is a problem with the iterator on Java tomorrow. I'm also wondering why it even works as-is on Java.

@Aurel300
Copy link
Member Author

Aurel300 commented May 1, 2019

Ok, fixed that with #8239.

@Aurel300 Aurel300 force-pushed the issue/8094-sys-unicode branch 2 times, most recently from 9b12833 to 7913b5f Compare May 2, 2019 17:33
@Aurel300 Aurel300 force-pushed the issue/8094-sys-unicode branch from 7913b5f to d98e310 Compare May 10, 2019 19:36
@Aurel300 Aurel300 mentioned this pull request May 27, 2019
@Aurel300 Aurel300 added this to the Release 4.0 milestone May 28, 2019
@Aurel300
Copy link
Member Author

Aurel300 commented May 30, 2019

It would be best if somebody with a working Haxe setup on Windows looked into this. It is difficult for me to debug the targets from AppVeyor results.


Update: I got a Vagrant set up with Windows 10 that mostly works, so hopefully I can progress on this again.

@Aurel300 Aurel300 force-pushed the issue/8094-sys-unicode branch from 78c692e to f68fc31 Compare June 3, 2019 11:33
@Aurel300 Aurel300 mentioned this pull request Jun 4, 2019
@Aurel300 Aurel300 self-assigned this Jun 5, 2019
@Aurel300 Aurel300 force-pushed the issue/8094-sys-unicode branch from f68fc31 to 2cdedee Compare June 5, 2019 10:20
@Aurel300 Aurel300 changed the title Unicode sys tests Unicode sys tests and fixes Jun 6, 2019
@Aurel300 Aurel300 force-pushed the issue/8094-sys-unicode branch from d40dae2 to 34d4435 Compare June 10, 2019 12:05
@Aurel300
Copy link
Member Author

Travis failure is unrelated (brew problem when installing Flash player).

@Aurel300 Aurel300 merged commit 51b10b5 into HaxeFoundation:development Jun 11, 2019
@Aurel300 Aurel300 deleted the issue/8094-sys-unicode branch June 11, 2019 15:54
This was referenced Jul 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sys unicode unit tests
4 participants