proposal: spec: export uncased identifiers like 日本語 #5763
Comments
IMHO this is working as intended. Also note that such a change would not be backward compatible, i.e. it could suddenly export things which were previously package private (safe from being accessed from other packages). PS: My native language is also not English, but I, for one, would never ever use a non-English identifier in my code, except perhaps for some occasional Greek letters in math stuff. I suggest tag #Unfortunate |
I guess there won't be much old code which uses CJK identifiers, if we support it early. I myself rarely use non-English identifiers either. However, I know that CJK identifiers are often used in some business domains (where no proper English translation exists), in C# and Java. Unicode identifier support is good, but it may be more useful for CJK languages if we change it a bit. |
In languages that have no upper/lower case distinction, we need a special case for exported symbols or a special case for non-exported symbols. The current language has a special case for exported symbols. I really don't know what is better. As far as I know there has never been a clear consensus either way among people who, unlike me, speak those languages, so we've just muddled along with the current approach. As you say, this cannot change until Go 2 anyhow. But if there is a clear consensus for Go 2, then it could certainly change then. |
There is some discussion of this topic: https://groups.google.com/forum/#!topic/golang-china/h_vxbPHaIvw/discussion I can accept the current exported identifiers rule. |
Issue #6745 has been merged into this issue. |
A solution that's been kicking around for a while: For Go 2 (can't do it before then): Change the definition to "lower case letters and _ are package-local; all else is exported". Then with non-cased languages, such as Japanese, we can write 日本語 for an exported name and _日本語 for a local name. This rule has no effect, relative to the Go 1 rule, with cased languages. They behave exactly the same. |
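A minimal sketch of the two rules side by side may help here. It assumes the Go 1 rule is "first character is in Unicode category Lu" (which is what `unicode.IsUpper` tests) and reads the proposed rule literally as "first character is a lower-case letter or `_` means package-local". The function names `exportedGo1` and `exportedProposed` are made up for illustration and are not part of any Go API; both assume a valid, non-empty identifier.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// exportedGo1 mirrors the Go 1 rule: a name is exported only if its
// first character is an upper-case letter (Unicode category Lu).
// Assumes name is a valid, non-empty identifier.
func exportedGo1(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return unicode.IsUpper(r)
}

// exportedProposed is a literal reading of the rule suggested above
// (hypothetical, not implemented in Go): names starting with a
// lower-case letter or '_' are package-local; everything else is exported.
func exportedProposed(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return !(unicode.IsLower(r) || r == '_')
}

func main() {
	names := []string{"Data", "data", "_data", "日本語", "_日本語", "Z成本"}
	for _, name := range names {
		fmt.Printf("%-8s go1=%-5v proposed=%v\n",
			name, exportedGo1(name), exportedProposed(name))
	}
}
```

In this sketch the two functions agree on `Data`, `data`, and `_data`, and differ only on names such as 日本語 whose first character has no case.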
@ works as a notification; for example, on Weibo you can @ someone to let them know. |
Many say that non-English words are rarely used in practical coding, even among people whose native language is not English. That's mainly right. However, please let me give a special "rare" case, for your information. I work in the online game industry in China, and just as in other industries our code is mainly written by programmers. But a whole product contains resources besides the code. One important portion of online game resources consists of numerical and string values, which play a vital role in the whole gameplay experience. One product can contain thousands of such values, and they are provided, with their STRUCTURE, by game designers, not programmers. The programmers must follow the provided structure to use the values. Once this data is loaded from a file or database, the code can access a given value in a static (Avatar.HP) or dynamic (Avatar["HP"]) manner. For performance and static checking, the static way is often preferred, and here comes the problem: the type names and field names, as part of the data structure, are created by the game designers, who are often not systematically trained in or used to programmers' conventions. They just compose the values in a spreadsheet editor or similar tools, define type names and field names through sheet names and table headers, and of course they prefer native words to describe the logic clearly, especially for invented concepts that are often difficult to translate, which are not rare at all in games. Their work is then converted by scripts into a form the program can load, but the identifiers they defined must be preserved anyway; that is, when the static manner is adopted, it must involve code generation, and nobody wants to involve manual translation in this step, or to keep a translation dictionary up to date with revisions of the designers' work. And ... now you will understand what I want to express. 
With initial characters of Unicode category "Lo" not treated as exported, the Go language makes this workflow impossible, and forces us either to sacrifice performance and type safety by using the dynamic manner, or to force the designers to use English that they are not accustomed to, or to lose the clarity of logic encapsulation provided by the package system. There's no such obstacle in other programming languages. |
@lych77 Thank you very much for your thoughtful and helpful message. We appreciate getting a more authoritative contribution to this discussion. Unfortunately the Go 1 guarantee prevents us from changing this rule now, but if there ever is a Go 2, there could be a change as I described above: "Change the definition to "lower case letters and _ are package-local; all else is exported"". This is a fairly minor change to the implementation but could have a major effect for Chinese programmers. Please let us know what you think about this idea. |
@robpike Thanks for your reply. Yes, the underscore way is reasonable. The Go language is already quite popular in China and I would certainly be glad to see it become more popular :) |
A dot (.) preceding a filename is used to hide the file in Unix. |
@xHacking A dot "." is not a valid character in an (unqualified) identifier, so if the dot is part of the identifier the answer is no. I suppose one could use a dot to mark an identifier as non-exported, but not have the dot be part of the name (I haven't looked into whether this might cause syntactic problems elsewhere, but it might not). But this would be a different approach to naming: As is, in Go, by looking at an identifier we can tell right away if it is exported or not. With an identifier-external marking scheme that would not be true anymore. Also, by default (no dot) identifiers would be exported, which is probably not what we want. |
I don't feel benefit from this. And I guess this won't be useful if Go will handle unicode case folding for exporting because we don't like to use properly for |
@mattn The problem is not a and A. Some professional terms cannot be described in English, such as objects in a game or an industry. Take 五行: "fiveline"? "the five elements"? No. A literal translation is wrong and cannot express the original meaning. |
@chencun do you mean |
@mattn No, I'm just saying that 五行 cannot be expressed correctly in English. This is a simple example: 五行 is unique to Chinese culture, and there are many other industry terms like it. My English is not good and this reply is translated from Chinese; I'm sorry about that. |
Korean and Japanese also have many proper nouns that cannot be fully expressed in English. |
@chencun Sorry, I don't understand what you mean. Don't you like |
@mattn Sure. We want to be able to write a CJK name directly as a public variable rather than a private one, and _ does not belong to CJK either. If 五行 can be used as a public name, then _五行 can be used as a private one. When we write code, public variables like that are friendlier in an IDE, and as a public API they read better. |
@chencun Ah, sorry. I was confused. I thought the current implementation made it public. |
Yes, so here we recommend supporting it as public! Under the current 1.0 standard, the resulting code is ugly. We don't want variable names that mix English and Chinese. |
Hey, I am the organizer of GopherChina. I created the biggest Gopher community in China, gocn.io. @mpvl reached out to me today and mentioned this issue. I ran a poll in our Gopher WeChat groups. Title: Do you want to use Chinese names for variables or functions?
Here is the result:
I hope this poll will help you make a decision. The poll has only been open for 2 hours, but the results are already very clear. |
Today identifiers are exported if they begin with an upper-case letter. This issue proposes to change the rule to be unexported if they begin with a lower-case letter. The effect would be that identifiers beginning with uncased identifiers, such as The advantage of changing the rule is that exported identifiers need not all begin with some throwaway cased letter. As also noted in the original report, “It is very strange to use, say Z成本 or Jぶつける as identifiers.” It seems that there are two main disadvantages of changing the rule. The first disadvantage is that it will have the effect of retroactively exporting many identifiers, which we might finesse as a not quite breaking change but is at least an unexpected change that would likely require changing essentially all code written using uncased identifiers for top-level consts, funcs, types, and vars, as well as fields of exported types, and expecting those identifiers to be unexported. The second disadvantage is that it makes the "default export behavior" of an identifier essentially language-dependent in the following way. When I program using English, I and probably most other programmers write "data" by default but must give an explicit signal - capitalizing the d to get Data - in order to export. (As evidence of this, consider function argument names or local variables, where the choice doesn't matter: essentially everyone defaults to lower case.) When we chose the export rules, we decided intentionally that exporting requires an explicit signal. In fact, the original proposal that was made to us was to export everything by default and use a leading underscore to mean unexported (following a convention from Python). We used upper-case for export instead of underscore for unexported specifically to make exporting something “opt-in” instead of “opt-out,” so that programmers (in this case, using English) would not export fields without making an explicit decision to do so. 
If we make uncased identifiers exported by default, the effect will be that programmers writing programs in uncased languages will export by default and be required to reach for an explicit signal to unexport (that is, exporting will be “opt-out”), which is different from cased languages and exactly what we rejected way back in January 2009 for English. While being sensitive to the fact that I am not a native speaker of an uncased language, it nonetheless seems wrong to me for Go to adopt for uncased languages the exact behavior we rejected for English. To summarize, the two disadvantages to changing the casing rules are (1) it will break a lot of things, and (2) it's probably wrong from a large-scale software engineering point of view, because it makes exporting “opt-out” for some languages (and not others). I've been willing to try to work around (1), but I only recently realized the full import of (2). The combination of these suggests to me that we should not take this approach, and that 成本, ぶつける, 数据 should remain unexported just like "data". Even if we decide not to change the uncased export default, though, it may be that we should still address the original objection that “It is very strange to use, say Z成本 or Jぶつける as identifiers,” and we should make sure we have the ability to do so. There are other explicit signals we could adopt. I'm going to enumerate a few below, but this is not intended as a complete list. The point here is that there are things we can do other than changing the default exportedness of uncased identifiers. A special symbol for marking an uncased identifier as exported could be introduced. 
For example, 数据 is unexported but maybe An extension of that, suggested by @robpike, would be to require the exporting symbol only at the declaration site, so that A further extension, suggested by @griesemer, would be to make the exporting-at-declaration-time symbol a period, as in I'd like to explicitly not discuss these alternatives further yet but instead loop back to whether we should change the default to make uncased identifiers be opt-out. I propose that we agree not to change the default rules for uncased identifiers for Go 2 and instead agree to consider only non-breaking changes, based on (1) and (2) above and also the new point (3) there appear to be decent alternatives that avoid those two problems. The reason I want to reach this partial agreement on the (non-)solution space is that another thing we are considering is to expand the identifier set to allow combining characters (#20706), and the main effect would be to introduce many more identifiers in uncased languages. If we are going to make a breaking change to the exporting of uncased languages, we should do it before expanding the identifier set, to limit the breakage. If we agree not to make a breaking change to the exporting of uncased languages, then the resolution of a new export signal for uncased languages and the expansion of the identifier set can proceed essentially completely independently. |
In short, a proposal for how to proceed here: Let's leave uncased identifiers unexported and find non-breaking ways to address the "Z成本 or Jぶつける are strange identifiers" problem. Please thumbs up/thumb down/respond to that specific steering suggestion, but let's defer discussion of details of specific alternatives for the moment. Thanks. |
I've done some work regarding interoperability between code written in Latin and non-Latin scripts. As we design an exporting scheme for uncased languages, one thing we want to keep in mind is that this process requires round-trippable translations of identifiers, given a valid dictionary. For example, given a stable dictionary mapping between ( (There are some edge cases involving identifiers with mixed alphabets, but that's a rather thorny and niche edge case for automatic dictionary translations to begin with, even without bringing case-based exports into the mix). |
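One possible reading of the round-trip requirement above, as a sketch only: the dictionary contents and every function name here are assumptions, not the commenter's actual tooling. With a one-to-one dictionary, a lower-case Latin name survives translation to an uncased script and back, but an exported Latin name loses its capital on the way because the uncased form has nowhere to record it.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// latinToHan is a hypothetical one-to-one dictionary of identifier names.
var latinToHan = map[string]string{"data": "数据", "cost": "成本"}

// hanToLatin is the reverse dictionary, derived from latinToHan.
var hanToLatin = func() map[string]string {
	inv := make(map[string]string, len(latinToHan))
	for latin, han := range latinToHan {
		inv[han] = latin
	}
	return inv
}()

// lowerFirst lower-cases the first character of s, so that "Data"
// and "data" share one dictionary entry.
func lowerFirst(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	return string(unicode.ToLower(r)) + s[size:]
}

// roundTrips reports whether a Latin name comes back unchanged after
// being translated to the uncased script and back again.
func roundTrips(name string) bool {
	han, ok := latinToHan[lowerFirst(name)]
	if !ok {
		return false
	}
	back, ok := hanToLatin[han]
	return ok && back == name
}

func main() {
	fmt.Println("data:", roundTrips("data"))
	fmt.Println("Data:", roundTrips("Data"))
}
```

Under this reading, any export marker chosen for uncased identifiers would need to be carried through the translation for `Data` to round-trip as well.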
OK, closing this issue in favor of #22188, which is explicitly about not changing the existing rules. |
Maybe this can be solved by a project's own coding rules // if you don't understand the requirement and abstract the concepts very well and can't come up with good names // E for Export // or use a Getter. Sometimes a coder can't come up with a good name in English; this has nothing to do with the programming language. Variable or function names in Chinese are not used much in source code. Other parts of the source code are still not Chinese: if, for, func, return. You don't use Chinese variable or function names in C or C++ either. |