Skip more exceptions for un-parsable files #60

tanasegabriel
Contributor

A common pattern is to keep Terraform code in a mono-repo. hcl2tojson is great for converting a batch of .tf files to JSON, which makes them easy to parse on the fly. It can, however, be tripped up by non-HCL files such as READMEs or git internals:

./<redacted>/<redacted>/<redacted>/README.md
Traceback (most recent call last):
  File "/usr/local/bin/hcl2tojson", line 71, in <module>
    parsed_data = load(in_file)
  File "/usr/local/lib/python3.9/site-packages/hcl2/api.py", line 9, in load
    return loads(file.read())
  File "/usr/local/lib/python3.9/site-packages/hcl2/api.py", line 18, in loads
    return hcl2.parse(text + "\n")
  File "/usr/local/lib/python3.9/site-packages/lark/lark.py", line 464, in parse
    return self.parser.parse(text, start=start)
  File "/usr/local/lib/python3.9/site-packages/lark/parser_frontends.py", line 115, in parse
    return self._parse(token_stream, start)
  File "/usr/local/lib/python3.9/site-packages/lark/parser_frontends.py", line 63, in _parse
    return self.parser.parse(input, start, *args)
  File "/usr/local/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 35, in parse
    return self.parser.parse(*args)
  File "/usr/local/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 86, in parse
    for token in stream:
  File "/usr/local/lib/python3.9/site-packages/lark/lexer.py", line 200, in lex
    raise UnexpectedCharacters(stream, line_ctr.char_pos, line_ctr.line, line_ctr.column, allowed=allowed, state=self.state, token_history=last_token and [last_token])
lark.exceptions.UnexpectedCharacters: No terminal defined for '|' at line 6 col 1

| Name | Description | Type | Default |
^

Expecting: {'SLASH', '__ANON_4', 'COLON', '__ANON_10', 'COMMA', '__ANON_9', 'LPAR', 'LSQB', '__ANON_7', '__ANON_8', 'BANG', 'STRING_LIT', 'DECIMAL', 'QMARK', 'RSQB', 'DOT', 'MINUS', '__ANON_11', '__ANON_3', 'LESSTHAN', 'RPAR', 'LBRACE', '__ANON_13', 'PERCENT', 'RBRACE', '__ANON_1', 'STAR', '__ANON_12', 'EQUAL', '__ANON_5', 'PLUS', '__ANON_0', 'MORETHAN', 'heredoc_trim', '__ANON_14', '__ANON_2', '__ANON_6', 'heredoc', 'EXP_MARK'}

Previous tokens: Token('__ANON_0', '\n')
./.git/index
Traceback (most recent call last):
  File "/usr/local/bin/hcl2tojson", line 71, in <module>
    parsed_data = load(in_file)
  File "/usr/local/lib/python3.9/site-packages/hcl2/api.py", line 9, in load
    return loads(file.read())
  File "/usr/local/Cellar/python@3.9/3.9.0_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfe in position 14: invalid start byte

Catching more of these commonly encountered exceptions under the -s (skip) argument should enable more use cases.
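
For illustration only, here is a minimal sketch of the skip behaviour described above. It is not the code from this PR: `load_or_skip` and `convert_tree` are hypothetical helpers, and the parser and the set of skippable exceptions are passed in as arguments so the sketch runs with the standard library alone. With python-hcl2 one would presumably pass `hcl2.load` as the parser and a tuple such as `(lark.exceptions.UnexpectedToken, lark.exceptions.UnexpectedCharacters, UnicodeDecodeError)`, the last two being the exceptions shown in the tracebacks above.

```python
import json
import os
import tempfile


def load_or_skip(path, parse, skippable, skip=True):
    """Parse one file with `parse`; return None for un-parsable files when skipping.

    `parse` takes an open text file (hcl2.load has this shape). `skippable` is a
    tuple of exception classes that mark a file as "not parsable".
    """
    try:
        # Decoding happens when the parser reads the file, so a binary file
        # such as .git/index raises UnicodeDecodeError inside this block.
        with open(path, "r", encoding="utf-8") as in_file:
            return parse(in_file)
    except skippable as exc:
        if not skip:
            raise
        print(f"Skipping {path}: {type(exc).__name__}")
        return None


def convert_tree(root, parse, skippable, skip=True):
    """Walk `root` and return {path: parsed_data} for every file that parsed."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            parsed = load_or_skip(path, parse, skippable, skip)
            if parsed is not None:
                results[path] = parsed
    return results


# Assumed usage with python-hcl2 (not executed here):
#   import hcl2
#   from lark.exceptions import UnexpectedCharacters, UnexpectedToken
#   convert_tree(".", hcl2.load,
#                (UnexpectedToken, UnexpectedCharacters, UnicodeDecodeError))
```

Passing the exception tuple explicitly keeps the skip list visible at the call site. Without the skip flag, the first un-parsable file still raises, which matches the strict behaviour shown in the tracebacks.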

Member

aoskotsky-amplify left a comment

Nice! Thanks for this PR.

aoskotsky-amplify merged commit 86c1244 into amplify-education:master on Mar 4, 2021
tanasegabriel deleted the skip_more_exceptions_for_unparsable_files branch on July 16, 2021 09:52