Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sorted JSON from MarshalJSON #171

Closed
rigelrozanski opened this issue Jun 10, 2018 · 18 comments
Closed

Sorted JSON from MarshalJSON #171

rigelrozanski opened this issue Jun 10, 2018 · 18 comments

Comments

@rigelrozanski
Copy link

rigelrozanski commented Jun 10, 2018

Can we enforce that the json created is ordered in the same order which the fields are defined in the struct?

edit by @liamsi: this is already the case. The discussion below diverted to if amino should sort keys alphabetically.

See: cosmos/cosmos-sdk#983 (comment)

@cwgoes
Copy link

cwgoes commented Jun 10, 2018

Necessary for deterministic transaction signing/verification, also cleaner UX in general.

@ValarDragon
Copy link
Contributor

Do we want it sorted by the order fields appear, or sorted by key name? I think keys will eventually be reordered for cosmos/cosmos-sdk#1257, so I think we should sort by key name.

@rigelrozanski
Copy link
Author

Interesting point - I like the idea of sorting by field order because it allows the developer to define the natural importance of information being show - but it also sucks if reordering the fields in a struct becomes a breaking change

@cwgoes
Copy link

cwgoes commented Jul 3, 2018

Can we prioritize this? Would like to validate sorted JSON Ledger-side.

cc @liamsi Thoughts?

@ValarDragon
Copy link
Contributor

I agree re sorting by field order. Also #1257 is too small of a memory improvement to worry about.

@liamsi
Copy link
Contributor

liamsi commented Jul 3, 2018

Hey, I'm surprised, that this is already the case (sorted by field-order). Yeah, let's prioritize this.

@liamsi
Copy link
Contributor

liamsi commented Jul 3, 2018

Can you point me to the types from which this is generated? cosmos/cosmos-sdk#983 (comment) Never mind, I think I found it.

@liamsi
Copy link
Contributor

liamsi commented Jul 4, 2018

I think, the culprit isn't amino here (or at least not only). Fields get encoded as they are ordered in the struct:

go-amino/json-encode.go

Lines 270 to 291 in 2106ca6

func (cdc *Codec) encodeReflectJSONStruct(w io.Writer, info *TypeInfo, rv reflect.Value, _ FieldOptions) (err error) {
if printLog {
fmt.Println("(e) encodeReflectJSONStruct")
defer func() {
fmt.Printf("(e) -> err: %v\n", err)
}()
}
// Part 1.
err = writeStr(w, `{`)
if err != nil {
return
}
// Part 2.
defer func() {
if err == nil {
err = writeStr(w, `}`)
}
}()
var writeComma = false
for _, field := range info.Fields {

Ok, looking at writeGenesisFile in init.go (cosmos-sdk), the GenDoc is encoded using amino which should create the json-keys in the same order as the field order (and exactly the order @zramsay was expecting in cosmos/cosmos-sdk#983 (comment)). Then the file gets saved (should still be OK).
But then, the file gets immediately read again (which is a bit weird on it's own), this time we read the contents in a good old map though: https://github.com/cosmos/cosmos-sdk/blob/270e2162622446994fb3ef17a5d141e7d4aa31df/server/util.go#L127-L129
I think this last step desrtoys the ordering.

@liamsi
Copy link
Contributor

liamsi commented Jul 4, 2018

Opened cosmos/cosmos-sdk#1531 in cosmos-sdk.

To solve your problem (non-amino related): You can have the desired order (by field order), without any amino change as far as I can tell: just do not read into a golang map or do not use AppendJSON from util.go.
Instead, encode via amino and manually append the appState key and the raw json you have at hand.

@cwgoes
Copy link

cwgoes commented Jul 4, 2018

If we order by field order, we have to order our structs alphabetically - the Ledger application (and other such third-party applications) can't store field orders for all the Cosmos SDK structs; it needs to be able to check alphabetical ordering.

Would it be possible to pass an option to Amino to order alphabetically by field name instead?

@liamsi
Copy link
Contributor

liamsi commented Jul 4, 2018

Would it be possible to pass an option to Amino to order alphabetically by field name instead?

Possible, but I'm not sure this is desirable in general (and differently from what was mentioned in the issue description), or as part of the amino API. If we decided this is generally useful and should go into amino, I would suggest, that this is only for json-encoding maps and by introducing a tag called json:map_sorted or sth like this. Not sure about side-effects decoding-wise (this is not a trivial change). If this is only for this one particular case and not necessary for several other use-cases, it might be easier to 1) read into a map as you do, 2) sort the keys, 3) manually append "key" and it's corresponding json.RawMessage into the JSON. Shouldn't be more than very few lines of code (~10) ... What do you think?

@jleni
Copy link

jleni commented Jul 4, 2018

Field order will definitely not work in the ledger nano S. It needs to be a very very simple rule. Ideally sorting key names alphabetically.

@liamsi
Copy link
Contributor

liamsi commented Jul 4, 2018

Updated/edited the issue description above to better reflect the discussion. Sorting by keys shouldn't be the default behaviour. I feel people would be very surprised about this change (e.g. json-amino is used for rpc). But I think adding a flag a la json:sorted_keys might work (as decoding from json also uses a map and doesn't seem to rely on field-order in JSON-land).

I'm still wondering if this (providing a deterministic normalization for cryptographic signing purposes for a format, namely JSON, which isn't know to provide the best guarantees with this regard) should be part of an encoding library? I feel like this is a separate concern. Other projects seem to try to solve this (for instance objecthash which is about hashing JSON). Tagging @tarcieri to join the discussion (he spent more time on thinking about howto deterministically hash JSON/other encodings than any of us). Also @jaekwon as this might have been discussed before?

@tarcieri
Copy link

tarcieri commented Jul 4, 2018

I really like objecthash for a few reasons, most notably that it sidesteps the problem of canonicalization, which is particularly problematic with JSON because no one can agree on what canonical JSON actually looks like. It's also friendlier to things like embedded devices because they don't have to do a bunch of string manipulation to produce a canonical JSON document to hash, but can instead just operate on parsed JSON in-place.

I'd consider the current Go library as something of a science experiment, but worst case you can make your own similar construction.

Re: field ordering, objecthash sorts by the objecthash of the key, which in JSON since those are always strings is digest('s' || key_name). There's really no reason you couldn't just use the key name though, if you wanted to do a simpler objecthash-alike.

@liamsi
Copy link
Contributor

liamsi commented Jul 4, 2018

Thanks a lot for your input @tarcieri!

I'd consider the current Go library as something of a science experiment, but worst case you can make your own similar construction.

Yeah, I agree. I think there are still some problems with UTF8 normalization. But for our purposes objecthash might be mature enough. objecthash-proto seems even more experimental for now. Are there any less experimental alternatives?

It also might be the case that this is overkill as what we need for ledger nano (see discussion above), are very specific JSON messages (not sure), where sorting the top-level keys is enough to be deterministic (can easily be done)? The more I think about this, the more I'm convinced this is not (only) a amino debate.

@tarcieri
Copy link

tarcieri commented Jul 4, 2018

I don't know of any more mature alternatives, and it's really a simple enough idea that designing your own for a particular purpose isn't too bad. Done correctly I think you could have something extremely simple that covers your immediate needs which is designed in a way to be extensible to hashing more complex structured documents in the future.

@cwgoes
Copy link

cwgoes commented Jul 4, 2018

It also might be the case that this is overkill as what we need for ledger nano (see discussion above), are very specific JSON messages (not sure), where sorting the top-level keys is enough to be deterministic (can easily be done)? The more I think about this, the more I'm convinced this is not (only) a amino debate.

That's fair - perhaps the best solution for now is for msg.GetSignBytes() in the SDK to remarshal JSON-to-be-signed as a map and sort the keys.

Per discussion with @liamsi @jleni, sorting in the SDK for now - further discussion about different serialization options (e.g. Amino / objecthash) tabled for now.

@jaekwon
Copy link
Contributor

jaekwon commented Jul 4, 2018

I think we should use another tool to re-order the JSON bytes produced by Amino for now...

It's what we would do in Amino anyways, and we can decide to add it as an option to Amino later down the road. This only needs to happen on the signer so it shouldn't be a big overhead for anyone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants