Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protoc doesn't allow some field names due to bad json_name logic #7192

Closed
jhump opened this issue Feb 9, 2020 · 9 comments
Closed

protoc doesn't allow some field names due to bad json_name logic #7192

jhump opened this issue Feb 9, 2020 · 9 comments
Assignees
Labels
inactive Denotes the issue/PR has not seen activity in the last 90 days. json protoc

Comments

@jhump
Copy link
Contributor

jhump commented Feb 9, 2020

What version of protobuf and what language are you using?
Version: 3.11.3
Language: none (using --descriptorset_out)

What operating system (Linux, Windows, ...) and version?
Mac OS X

What runtime / compiler are you using (e.g., python version or gcc version)
No runtime; just using the protoc binary

What did you do?
I tried to compile the following proto file:

// test.proto
syntax = "proto3";
message Foo {
  string abc = 1;
  string abc_def = 2;
  string __a__b__c = 3;
  string _abc_def = 4;
}

I used the following command:

protoc test.proto -otest.protoset

What did you expect to see
I expected this to work and to produce a test.protoset file.

If I remove the conflicting fields and generate this in isolation, the resulting json_name fields in the output descriptors are all different -- so there's no actual conflict. For reference:

  • abc becomes JSON name of abc
  • abc_def becomes JSON name of abcDef
  • __a__b__c becomes JSON name of ABC
  • _abc_def becomes JSON name of AbcDef

What did you see instead?
The command fails with the following error:

test.proto:6:10: The JSON camel-case name of field "__a__b__c" conflicts with field "abc". This is not allowed in proto3.
test.proto:7:10: The JSON camel-case name of field "_abc_def" conflicts with field "abc_def". This is not allowed in proto3.

It appears that it is using a case-insensitive comparison to determine if there is a conflict. But I believe that is wrong for a couple of reasons:

  1. Key names in JSON are case-sensitive. So it is fine to have both "abc" and "ABC" in the same output, as well as both "abcDef" and "AbcDef".
  2. The error message suggests the conflict is on the camel-case JSON name and doesn't say anything about being case insensitive.
@jhump
Copy link
Contributor Author

jhump commented Feb 9, 2020

I think this is related to #5063, at least in the fact that they are both related to how protoc tries to check for uniqueness of JSON names.

In fact, adding a json_name option does not prevent the above errors. So if we add json_name = "FOO" to the __a__b__c field in the example above, it still complains with the same message:

test.proto:7:10: The JSON camel-case name of field "__a__b__c" conflicts with field "abc". This is not allowed in proto3.

@jhump
Copy link
Contributor Author

jhump commented Sep 18, 2020

FYI, @dsnet, it looks like protodesc.NewFile in the new Go runtime has this same issue: it complains about a conflict between two fields with same camel-case, even if one has custom JSON name.

@acozzette
Copy link
Member

@jhump I think there may be good reason for enforcing that the names are unique without considering custom JSON names. I haven't thought about this in a while but I believe there's a technical reason for why FieldMasks can't use custom JSON names and therefore need to use either the original field name or the default JSON name.

@jhump
Copy link
Contributor Author

jhump commented Sep 21, 2020

If that's really the case then the error message should be improved (since it is talking about JSON names and makes no caveat about why the option is ignored). Further this constraint should be clearly documented. I don't see any documentation for it, but it deserves to be called out in the main proto3 doc page.

@jhump
Copy link
Contributor Author

jhump commented Feb 16, 2021

@acozzette, you commented above that protoc may need to require that the camel-case version of each field is unique (not just JSON name). But what about the case described in the issue description? It complains even if the camel-case for one field differs from another by only case. Is that also related to field masks? Or is that possibly a bug?

@fowles
Copy link
Contributor

fowles commented Sep 1, 2022

We are going to shortly release updated json support, that we ought to be able to work in this fix as well

Copy link

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.

This issue is labeled inactive because the last activity was over 90 days ago.

@github-actions github-actions bot added the inactive Denotes the issue/PR has not seen activity in the last 90 days. label Mar 27, 2024
Copy link

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please reopen it.

This issue was closed and archived because there has been no new activity in the 14 days since the inactive label was added.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Apr 11, 2024
@silverhammermba
Copy link

This is still a problem. As far as I can tell from the proto3 language guide, ab and a_b are different field names. I can trivially convert them to distinct lowerCamelCase (ab vs AB) or UpperCamelCase (Ab vs AB) names. All of these would be distinct string keys in JSON objects. These are all distinct identifiers in Go.

And yet you get this nonsensical error

The JSON camel-case name of field "ab" conflicts with field "a_b". This is not allowed in proto3.

This would only be a valid error for fields like a1 and a_1. These do conflict because "1" is always the same in upper- and lower-case. It seems that it's not actually checking for real camel-case conflicts, rather it is checking for conflicts when all underscores are removed and letter case is ignored, which is a much stronger requirement. I'd expect something in the language guide to point out that while you can put underscores in field names, they are effectively ignored as far as field uniqueness is concerned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inactive Denotes the issue/PR has not seen activity in the last 90 days. json protoc
Projects
None yet
Development

No branches or pull requests

7 participants