Getting hexa code values instead of characters for Unicode code characters. #483
Comments
With your version of cx_Oracle try something like:
@cjbj Thanks for the reply. I am already passing those arguments when creating the connection, but I am still getting hex values:
'â\x80\x99'
Corrupt data? Some other layer (Windows command shell?) not understanding the encoding? What happens if you insert new data and select it?
Okay, I tried storing the character " ’ " in one of the columns, and when I fetched the record it came back as " ¿ ".
Show us a complete runnable example Python script.
I am actually using the shell, so this is how it looks now.
I've also tried the latest Oracle client library (18.5 Basic Light package) and cx_Oracle 8.0.01, but still no luck.
You've retrieved the client character set. But it would be important to also get the database character set as shown here. The other thing you can do is use SQL*Plus to perform the same query, remembering to set NLS_LANG before running it as you did for Python (if you use the parameter in cx_Oracle, though, you don't need to set NLS_LANG, and in cx_Oracle 8, the default is UTF-8 anyway).
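The page linked above did not survive in this copy of the thread, so the following is only a sketch of one common way to read the database character set. It assumes a DB-API style connection object (such as one returned by `cx_Oracle.connect`) and queries the `nls_database_parameters` view; the helper name `get_database_charset` is hypothetical.

```python
def get_database_charset(connection):
    """Return the database character set name, or None if not found.

    `connection` is assumed to be an open DB-API connection to Oracle.
    The helper name is made up for this illustration.
    """
    cursor = connection.cursor()
    cursor.execute(
        "select value from nls_database_parameters "
        "where parameter = 'NLS_CHARACTERSET'"
    )
    row = cursor.fetchone()
    return row[0] if row else None
```

Comparing this value with the client encoding shows whether a character set conversion takes place between the database and Python.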
@anthony-tuininga Thank you so much for the information. I think it might be a database character set problem, because NLS_CHARACTERSET is set to WE8ISO8859P15 and I've also read here https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch2charset.htm#i1007228 that
So we could be experiencing data loss because WE8ISO8859P15 does not contain the characters I am using in my test. I tried passing byte strings as the value, but in the database it was stored as the hex value "E28099".
So I think we have to solve this problem internally.
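A quick way to check the suspicion above from Python, as a rough sketch: Python's `iso8859-15` codec is used here as a stand-in for Oracle's WE8ISO8859P15 (an assumption, the two are not guaranteed to be identical in every detail), and `fits_charset` is a hypothetical helper name.

```python
RIGHT_QUOTE = "\u2019"  # the ’ character from the test
EURO_SIGN = "\u20ac"    # a character that ISO-8859-15 does contain


def fits_charset(text, charset="iso8859-15"):
    """Return True if every character of `text` exists in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


print(fits_charset(RIGHT_QUOTE))  # False
print(fits_charset(EURO_SIGN))    # True

# With a lossy conversion the character is swapped for a placeholder.
print(RIGHT_QUOTE.encode("iso8859-15", errors="replace"))  # b'?'
```

Python substitutes `?` here; the `¿` seen earlier in the thread would be consistent with the same kind of replacement happening on the Oracle side.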
With Python 2, VARCHAR data is transferred to/from the database as bytes. With Python 3, however, the data is decoded into a string using the client encoding (in this case UTF-8), so the data must be stored in the database properly (with the advertised database character set) in order for the conversion to the client character set (UTF-8) to perform properly. With Python 2 (and other tools) it is possible to store data improperly in the database -- and once that is done you'll need to correct that corrupt data. There is an enhancement request (#385) that would allow you to do the same with Python 3 by simply returning/accepting bytes. This is intended solely to fix the corrupted data, not to maintain the corruption, of course! 👍

As for using byte strings, the problem there is that Oracle sees the bytes, treats them as RAW and converts them to a hex string in order to store them in the database! You can see that by doing the following:
which will return the string
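The statement referred to above is missing from this copy of the thread. As a database-free illustration only, the hex string "E28099" reported earlier matches the UTF-8 bytes of ’ written out as hexadecimal digits, which is what one would expect if the bytes were treated as RAW:

```python
# UTF-8 encoding of the right single quotation mark (U+2019).
utf8_bytes = "\u2019".encode("utf-8")

# A RAW-to-string conversion shows each byte as two hex digits.
hex_string = utf8_bytes.hex().upper()
print(hex_string)  # E28099

# The original character can be recovered from that hex string.
recovered = bytes.fromhex(hex_string).decode("utf-8")
print(recovered == "\u2019")  # True
```

This only mimics the conversion in plain Python; it does not run anything against Oracle.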
One of the hacks we tried was to encode the return value by
So if I set
Yes, you definitely need to change your database encoding to AL32UTF8 -- or at least one that can store all of the characters that you want to store! Glad you have a workaround for now -- not a very pleasant one, but as you noted, it works! :-)
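The exact workaround code is not preserved in this thread. A minimal sketch of what such a re-encoding hack could look like, assuming the fetched string is UTF-8 data that was decoded as ISO-8859-15 (the function name `repair_mojibake` and the choice of Python's `iso8859-15` codec are assumptions):

```python
def repair_mojibake(value, db_charset="iso8859-15"):
    """Undo a wrong decode: turn the string back into its raw bytes
    using the database charset, then decode those bytes as UTF-8.

    Values that do not round-trip are returned unchanged.
    """
    try:
        return value.encode(db_charset).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return value


fixed = repair_mojibake("\u00e2\x80\x99")  # the 'â\x80\x99' value from the issue
print(fixed == "\u2019")  # True
```

As noted in the reply above, this only papers over improperly stored data; it is not a substitute for a database character set that can hold the characters.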
What versions are you using?
platform.platform: Linux-5.4.0-48-generic-x86_64-with-debian-buster-sid
sys.maxsize > 2**32: True
platform.python_version: 3.6.5
cx_Oracle.version: 6.3
cx_Oracle.clientversion: (12, 2, 0, 1, 0)
Describe the problem
When I fetch data from a database (NLS_CHARACTERSET is set to "WE8ISO8859P15"), Unicode text comes back as "??", which is expected because cx_Oracle uses ASCII by default. But when I run this line
export NLS_LANG=AMERICAN_AMERICA.AL32UTF8
and fetch the same data, Unicode text comes back as hex values such as "\x80\x99". For example,
’
--> 'â\x80\x99'
I was wondering if someone can help me in this situation.
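The value reported above can be reproduced without a database, which hints at where it comes from. The sketch below assumes the UTF-8 bytes of ’ were stored as-is in the WE8ISO8859P15 column and were later interpreted byte by byte in that character set; Python's `iso8859-15` codec stands in for the Oracle character set.

```python
original = "\u2019"  # ’ RIGHT SINGLE QUOTATION MARK

# The three bytes that UTF-8 uses for this character.
stored_bytes = original.encode("utf-8")
print(stored_bytes)  # b'\xe2\x80\x99'

# Reading each byte as one ISO-8859-15 character gives three characters:
# 0xE2 is "â", while 0x80 and 0x99 are unprintable control codes.
misread = stored_bytes.decode("iso8859-15")
print(repr(misread))  # 'â\x80\x99'
```

So the "hex values" are not hex digits in the data: they are Python's escaped display of two control characters that have no printable form.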