Issue with parsing Vietnamese names #17
Comments
Hey @yellow1912. Only seeing this now. Thanks for raising the issue.
Hi @wyrfel, let me check the UTF-8 issue. Regarding the ordering, perhaps we could have a setting to reverse the order? If not, I will just go ahead and do the if/else outside the parser, no problem at all.
Hi @yellow1912, I have tried to reproduce the normalisation (capitalisation) issue but couldn't. I have added cases for it to the unit test suite in PR #20. As for the reversed parsing order, I'm afraid there is no way this name parser can detect whether a name is Vietnamese, and hence it can't reverse the order on its own. I could add a setting for this, but I think that may as well be done outside of the parser.
@yellow1912 It bugs me that this issue is still open. 😄 I've been thinking about this more. It would be possible to include automatic reversal of name parts, or maybe even a sort of name part templating, via the language files. But that means we would have to introduce a form of language detection in those files.
@yellow1912 I have added a section to the README about possible ways to detect the name's language outside of the name parser. At this stage I would like to treat this as an edge case and as something that can be handled outside the core name parser (by reversing the word order in the string or by overloading name parser components). Should you come up with an implementation, I'd be very interested to hear about it and possibly integrate it, or parts of it, into this package.
Sorry for disappearing for a while. I don't think the parser should detect the language, but perhaps there could be an option to reverse things when you return the result? Right now I manually detect and reverse outside of the parser.
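For anyone wanting to do the same detect-and-reverse step outside the parser, here is a rough sketch in Python. It is not the code used in this thread and is not part of the name parser's API. The function names and the short family-name list are made up for illustration, and the detection is only a heuristic: it will miss family names that are not on the list and will misfire on names that merely start with one of these words.

```python
import unicodedata

# Illustrative, deliberately incomplete list of common Vietnamese family names.
_FAMILY_NAMES = frozenset(
    unicodedata.normalize("NFC", name)
    for name in (
        "nguyễn", "trần", "lê", "phạm", "hoàng", "huỳnh", "phan",
        "vũ", "võ", "đặng", "bùi", "đỗ", "hồ", "ngô", "dương",
    )
)


def looks_family_name_first(full_name: str) -> bool:
    """Heuristic: the first word is a known family name and more words follow."""
    words = unicodedata.normalize("NFC", full_name).split()
    return len(words) > 1 and words[0].casefold() in _FAMILY_NAMES


def to_given_name_first(full_name: str) -> str:
    """Reverse the word order only when the heuristic matches."""
    words = unicodedata.normalize("NFC", full_name).split()
    if looks_family_name_first(full_name):
        words.reverse()
    return " ".join(words)


print(to_given_name_first("Nguyễn Quốc Thái"))
print(to_given_name_first("John Ronald Smith"))
```

The reordered string can then be handed to a parser that expects the given name first. Normalising to NFC first keeps the lookup from failing on input that uses decomposed accents.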
Hello there,
I have a name like this:
Nguyễn Quốc Thái
After parsing, the full name I get back is "NguyễN QuốC TháI". Notice how the letter casing is messed up: the last letter of each word is capitalised.
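One plausible cause of output like this, not confirmed anywhere in this thread, is capitalisation code that computes word boundaries with ASCII-only rules: every accented letter then counts as a word break, so the letter after it is treated as the start of a new word. The Python sketch below reproduces the symptom that way and contrasts it with a Unicode-aware version. Both function names are hypothetical and neither is taken from the parser.

```python
import re
import unicodedata


def capitalise_ascii_boundaries(name: str) -> str:
    """Buggy on purpose: with re.ASCII, \\b treats accented letters as non-word characters."""
    lowered = unicodedata.normalize("NFC", name).lower()
    return re.sub(r"\b[a-z]", lambda m: m.group().upper(), lowered, flags=re.ASCII)


def capitalise_unicode(name: str) -> str:
    """Upper-case only the first character of each whitespace-separated word."""
    words = unicodedata.normalize("NFC", name).split()
    return " ".join(word[:1].upper() + word[1:].lower() for word in words)


print(capitalise_ascii_boundaries("Nguyễn Quốc Thái"))
print(capitalise_unicode("nguyễn quốc thái"))
```

The first call prints the mangled casing reported above and the second prints the name correctly capitalised. If the real cause is similar, it would depend on how the environment handles multibyte strings, which might explain why the issue is hard to reproduce elsewhere.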
One more thing: Vietnamese names are written in the order Lastname MiddleName Firstname. How should this be handled?
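To make the ordering concrete, here is a minimal Python sketch of splitting a name that is already known to be written family name first. `NameParts` and `split_family_first` are hypothetical names, not part of the parser, and treating a single word as a given name is an assumption made here.

```python
from typing import NamedTuple


class NameParts(NamedTuple):
    family: str
    middle: str
    given: str


def split_family_first(full_name: str) -> NameParts:
    """Split a 'Lastname MiddleName Firstname' string into its parts."""
    words = full_name.split()
    if not words:
        return NameParts("", "", "")
    if len(words) == 1:
        # Assumption: a lone word is treated as the given name.
        return NameParts("", "", words[0])
    # First word is the family name, last word the given name,
    # and everything in between is the middle name.
    return NameParts(words[0], " ".join(words[1:-1]), words[-1])


parts = split_family_first("Nguyễn Quốc Thái")
print(parts.family, "|", parts.middle, "|", parts.given)
```

For the example name this yields Nguyễn as the family name, Quốc as the middle name and Thái as the given name.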