Fails on flow sequences #114
Comments
It looks like there are two issues going on here.
First, "mini2.yml" starts with a UTF-8 BOM (inserted by some Windows utilities like notepad.exe). This was fixed in PR #107 when support for alternate encodings was added. It loads fine with the current GitHub version. (To use the latest version in your project, use either pkg add YAML#master or pkg develop YAML.)
Second, "new.yml" contains a trailing comma on line 62:
outcomes: [3, 7,]
Looking at the YAML spec, this is explicitly allowed in both flow sequences and mappings. I'll open a PR shortly to fix this behavior.
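For anyone stuck on a released version that predates PR #107, one possible workaround for the BOM problem is to strip it from the text before parsing. This is only a minimal sketch: strip_bom is a hypothetical helper, not part of YAML.jl, and it assumes the file is UTF-8 encoded.

```julia
# Hypothetical helper (NOT part of YAML.jl): drop a leading UTF-8 BOM if present.
# Assumption: the input text was read from a UTF-8 encoded file.
function strip_bom(text::AbstractString)
    # A BOM, when present, is always the very first character (U+FEFF).
    if startswith(text, "\ufeff")
        return String(chop(text; head=1, tail=0))
    end
    return String(text)
end

# Small demonstration on an in-memory string that mimics a BOM-prefixed file.
sample = "\ufeff1:\n  outcomes: [3, 7]\n"
cleaned = strip_bom(sample)
println(repr(first(cleaned)))

# Usage with YAML.jl (assumes the package is installed and mini2.yml exists):
#   using YAML
#   data = YAML.load(strip_bom(read("mini2.yml", String)))
```

Without the BOM, the first key should parse as the integer 1 rather than a string that begins with "\ufeff".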
Thank you.
Interesting. The file was originally created on a Mac but round-tripped through one of the YAML verification websites. The one I used must have Windows-ized the file.
I’ll try again with both fixes in the file and report back.
- Lewis
On May 21, 2021, at 6:22 PM, Colin Gilgenbach ***@***.***> wrote:
It looks like there's two issues going on here.
First, "mini2.yml" starts with a UTF8-BOM (inserted by some windows utilities like notepad.exe). This was fixed in PR #107 when support for alternate encodings was added. It loads fine with the current github version. (To use the latest version in your project, use either pkg add YAML#master or pkg develop YAML)
Second, "new.yml" contains a trailing comma on line 62:
outcomes: [3, 7,]
Looking at the YAML spec, this is explicitly allowed (https://yaml.org/spec/1.2/spec.html#id2790506) in both flow sequences and mappings. I'll open a PR shortly to fix this behavior.
So, even without using the master or develop version, the file loads, but the BOM is read and shows up as the first character of a string key. I got rid of it.
I deleted the trailing comma. It should be allowed, since it is a sort of convention that makes it easier to add a new last item, but I don’t typically do this.
I made the sample file manually and the comma crept in. You can fix as you feel needed and when it’s efficient to bundle up some minor fixes and/or features, etc.
Thanks for pointing these things out.
I am going to go with the flow format when I change the file format my code uses.
Assuming you’ve got what you need in your todo list, feel free to close.
Thanks for your responsiveness on this.
I'm glad you got your code working. To match the YAML spec, there are two components:
Because the trailing-comma bug still exists (and the resulting error message is nigh-unreadable), I think we should keep this issue open. When PR #116 is merged we can close the issue.
I have a longish yml file. With block sequences, it works. I converted the blocks to flow because there were very few and very short entries in each sequence. All of the popular YAML parse/validate/prettify tools accept the entire file; it validates as correct; and it can successfully be converted to JSON, TOML,
YAML.jl fails with an angry sequence of error messages that I can't make out (YAML_error_msgs.txt).
I broke the file down to many fewer nodes. The shortest fragment works:
YAML.load_file("mini1.yml") converts to a Julia Dict I called mini. Using PrettyPrint to output it shows it is correct (effectively an ASCII "repr" of JSON):
here is a slightly longer form:
This stops at the 5: to 25: to 8: to outcomes node. It effectively aborts after the first leaf node with no error message of any kind. Further, it won't generate the Julia data structure at all.
There are 4 more top level nodes like node 5 above. This leads to massive failure.
Finally, it almost works when converting the little arrays (sequences) into block sequences:
Except, bizarrely, the leading integer 1 at the top of the file is converted to a string representation of a UTF-8 symbol: "\ufeff1", which should be '\ufeff1' in Julia, but this could be the way Jupyter wants to output it. The character is this thingee:
You can see that the flow approach is much more readable and shows the matchup of the probability and the outcome arrays.
Visual Studio Code is pretty happy that the integer 1 is actually a 1. So, the conversion is causing the problem.
Maybe this is failing on numeric scalar keys? But, it works on the small fragment and it mostly works on the longer example. Some of the convert to json tools sort of insist that the yml BE valid json, but converting to a dict shouldn't have that restriction.
Perhaps I can make a little gist, preferably a jupyter notebook, for you of several examples if this isn't enough to see the problem.
For now, I've attached the files in a zip archive, ymlfiles.zip, which contains:
new.yml => shows the flow version
new_block.yml => shows the block version
mini2.yml => shows the flow version for a single top level node