-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Symbol Token Write Support #165
Symbol Token Write Support #165
Conversation
Codecov Report
@@ Coverage Diff @@
## master #165 +/- ##
==========================================
- Coverage 74.90% 74.78% -0.13%
==========================================
Files 25 25
Lines 4857 4893 +36
==========================================
+ Hits 3638 3659 +21
- Misses 724 734 +10
- Partials 495 500 +5
Continue to review full report at Codecov.
|
cmd/ion-go/main.go
Outdated
return err | ||
} | ||
if err := w.WriteString(internal.GitCommit); err != nil { | ||
return err | ||
} | ||
|
||
if err := w.FieldName(ion.NewSimpleSymbolToken("build_time")); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't w.FieldName()
happens only if ion.NewTimestampFromStr()
returns no error? Condition line 93 new
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look at the old code. w.FieldName("build_time")
is in both the nil err case and the non-nil err case.
ion/skipper.go
Outdated
@@ -546,44 +546,42 @@ func (t *tokenizer) skipLongStringHelper(handler commentHandler) error { | |||
// false indicating this is not the end of the long string, and true for consumed ' | |||
// as we have read the closing ''' of the first long string. | |||
func (t *tokenizer) skipEndOfLongString(handler commentHandler) (bool, bool, error) { | |||
isConsumed := false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think this can be helpful. reading return false, isConsumed, err
is easier to understand compared to return false, false, err
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted back my change.
@@ -202,7 +214,7 @@ func (s *sst) String() string { | |||
buf := strings.Builder{} | |||
|
|||
w := NewTextWriter(&buf) | |||
s.WriteTo(w) | |||
_ = s.WriteTo(w) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think WriteTo()
returns error
.
Why not checking the error whether it's nil or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this were inside a function that returns an error then I would have added the error checking. But it doesn't and I'm not gonna add an error to the return type because that would block sst from satisfying the Stringer
interface.
I added this explicit _ =
so that Goland would no longer flag this as a warning.
@@ -281,19 +293,22 @@ func (s *bogusSST) String() string { | |||
|
|||
buf := strings.Builder{} | |||
w := NewTextWriter(&buf) | |||
w.Annotations(SymbolToken{Text: &ionSharedSymbolTableText, LocalSID: 9}, SymbolToken{Text: &bogusText, LocalSID: SymbolIDUnknown}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Warnings, ha!? Yikes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes all of this was just to make Goland stop flagging it.
ion/writer.go
Outdated
|
||
// FieldNameSymbol sets the field name symbol for the next value written. | ||
// It may only be called while writing a struct. | ||
func (w *writer) FieldNameSymbol(val SymbolToken) error { | ||
if !w.IsInStruct() { | ||
w.err = &UsageError{"Writer.FieldNameSymbol", "called when not writing a struct"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Writer.FieldName" instead of "Writer.FieldNameSymbol"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
return *val, err | ||
case Int64: | ||
val, err := d.r.Int64Value() | ||
if err != nil { | ||
return nil, err | ||
} | ||
return *val, err | ||
default: | ||
return d.r.BigIntValue() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BigIntValue
also returns value, error.
We can the same check as above cases
es for default
too
ion/textwriter.go
Outdated
} | ||
|
||
if err := writeSymbol(text, w.out); err != nil { | ||
if err := writeSymbol(a.toText(), w.out); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I read the code correctly, when we have a.Text
is nil and a.LocalSID
is Unknown, then toText()
will return an empty string. And this empty text is sent to writeSymbol
.
I think if a.Text
is nil and a.LocalSID
is Unknown, error should be returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
ion/symboltoken.go
Outdated
return SymbolToken{Text: &text, LocalSID: SymbolIDUnknown} | ||
} | ||
|
||
func newSimpleSymbolTokenPtr(text string) *SymbolToken { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I think we should not have this. we have NewSimpleSymbolToken()
and if its pointer is needed that should be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to not have this either but without it I would need two lines to get the address.
ie.
st := NewSimpleSymbolToken(text)
&st
instead of
newSimpleSymbolTokenPtr(text)
@@ -274,10 +272,9 @@ func (d *Decimal) trunc() (int64, error) { | |||
if err != nil { | |||
return 0, err | |||
} | |||
d = ud |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need this.
ud
is different from the original d
. I think when we change the scale, we need to keep d
with its updated version in case another operation is called on d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider d.checkToUpscale()
.
In the old code, it reassigns d with d = d.upscale(0)
. However I discovered that this change of state is not reflected in the calling function d.trunc()
which must be why d.checkToUpscale()
returns the new decimal. (Yes I'm confused by this as well since I expect a pointer receiver to change state.)
Codecov Report
@@ Coverage Diff @@
## master #165 +/- ##
==========================================
- Coverage 74.94% 74.64% -0.31%
==========================================
Files 25 25
Lines 4865 4906 +41
==========================================
+ Hits 3646 3662 +16
- Misses 724 739 +15
- Partials 495 505 +10
Continue to review full report at Codecov.
|
ion/binarywriter.go
Outdated
return w.err | ||
} | ||
} else { | ||
return &UsageError{"Writer.WriteSymbol", "invalid symbol token"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some detail to the error message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
ion/binarywriter.go
Outdated
return w.writeSymbolFromID("Writer.WriteSymbol", id) | ||
} | ||
|
||
// WriteSymbolFromString writes a symbol value given a string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should be more explicit about how the provided string is going to be used; it expects the string to already be in the symbol table and will return an error if it isn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
ion/binarywriter.go
Outdated
// Wrap text value in single quotes if the symbol's text is a symbol identifier | ||
// (ie. of form $n for some integer n) | ||
// This is done to distinguish from actual symbol table mappings. | ||
text = fmt.Sprintf("'%v'", text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the behavior that's intended here. Why are we wrapping $37
in single quotes?
If I pass a SymbolToken
with LocalSID=SymbolIDUnknown
and Text="$37"
, I would expect WriteSymbol
to try to resolve the text literal $37
(unmodified) in the symbol table. If I'd wanted to write SID 37, I would have set LocalSID=37
in the SymbolToken
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have roundtrip tests (such as ion-tests/iontestdata/good/symbols.ion
with the test case '$99'
) that fail without any internal logic wrapping $99
in quotes.
When the text reader reads '$99'
, it produces the string "$99"
(note the lack of single quotes) and then creates a SymbolToken using that string value. The absence of single quotes creates confusion further downstream about whether we are dealing with an actual symbol table mapping vs a non-mapping SymbolToken with this text value.
I moved the single quote wrapping logic over to textReader::nextBeforeTypeAnnotations
instead of binaryWriter::WriteSymbol
and textWriter::WriteSymbol
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
such as ion-tests/iontestdata/good/symbols.ion with the test case '$99'
Here's a link to this test case for anyone who's interested.
When the text reader reads '$99', it produces the string "$99" (note the lack of single quotes) and then creates a SymbolToken using that string value.
This sounds like the correct behavior to me.
The absence of single quotes creates confusion further downstream about whether we are dealing with an actual symbol table mapping vs a non-mapping SymbolToken with this text value.
This is something we need to sort out; I don't think there's any writer API where the golang string $99
should be reinterpreted as anything other than a string literal.
Can you link to the function(s) that are causing the problems? Why is the reinterpretation happening at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember off the top of my head where specifically to go looking, but the root cause on this one is likely me. :)
At some point back before the introduction of SymbolToken
, if the binary reader hit a symbol ID it couldn't resolve (or the text reader hit an unquoted ${n}), they'd return the string ${id}
instead of an error. The writers correspondingly mapped WriteSymbol("$123")
to being a symbol with that ID, bypassing the symbol table. That came at the expense of being able to read/write an actual symbol whose text is ${n}
, but that wasn't a thing I thought I would ever need to do (😬😅).
Now that everything at the reader/writer level is converted to SymbolTokens, any remnants of that questionable decision can and should be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes total sense. Thanks for the added context!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright I reworked this so the Symbol Token's text value is the unquoted $99
.
However when the text writer is writing out the symbol in textutils.go:writeSymbol()
, we still need to wrap this value in single quotes.
ion/binarywriter.go
Outdated
return err | ||
} | ||
} else { | ||
return &UsageError{api, "invalid field name symbol token"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message should communicate why the symbol token is invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if err == nil { | ||
return id, nil | ||
} | ||
if id, ok := symbolIdentifier(sym); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to understand the symbol ID interpretation behavior that we're implementing here. I'm having difficulty imagining a scenario where I'm using a binary writer and I pass it a string in the $id
format ($12
, $47
, etc).
ion/textwriter.go
Outdated
// (ie. of form $n for some integer n) | ||
// This is done to distinguish from actual symbol table mappings. | ||
if _, ok := symbolIdentifier(text); ok { | ||
text = fmt.Sprintf("'%v'", text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here RE: quote wrapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
ion/symboltoken.go
Outdated
@@ -103,6 +112,15 @@ func symbolIdentifier(symbolText string) (int64, bool) { | |||
return SymbolIDUnknown, false | |||
} | |||
|
|||
// NewSimpleSymbolToken returns a Symbol Token with the given text value and undefined SID and Source. | |||
func NewSimpleSymbolToken(text string) SymbolToken { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is subjective, but to me the phrase Simple
doesn't communicate what this function will do. How about NewSymbolTokenFromString
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
return *val, err | ||
case Int64: | ||
val, err := d.r.Int64Value() | ||
if err != nil { | ||
return nil, err | ||
} | ||
return *val, err | ||
default: | ||
return d.r.BigIntValue() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update Unmarshal API to support shared symbol tables.
I was playing around marshalling/ unmarshalling roundtrip with sst and it was missing.
Update L95-97 to:
func Unmarshal(data []byte, v interface{}, ssts ...SharedSymbolTable ) error {
catalog := NewCatalog(ssts...)
return NewDecoder(NewReaderCat(bytes.NewReader(data), catalog)).DecodeTo(v)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.