allow RAW encoding #385
Comments
An interesting concept, and I understand the purpose behind it. The question is what sort of interface would make sense. Any thoughts on that?
I think there are still a lot of patch-up Python scripts out there that look like the one below. If 'raw' were simply recognized as a dummy charset, this Python 2.7 code could run on Python 3.x without if/else blocks to distinguish between Python versions. The dummy 'raw' encoding would just bypass cx_Oracle's internal decode() function. I suspect this is the easiest way to implement it.

cursor.execute('select somestring, somencoding from sometable')

Alternatively, a sort of 'outputtypehandle' could be implemented where the user feeds a converter function into the query or settings, but at a lower level than is currently possible. That is more 'elegant' in the sense that cx_Oracle would still produce only unicode output, while giving control to the programmer. It would break the Python code above, though. A side requirement is that this converter function needs to receive the entire row object somehow, not just the field in question: in the example above it would need to read the 'somencoding' field, which might itself be character based.

Given the above, I would opt for the 'quick and dirty' solution over the 'elegant' one. I know pyODBC has this mechanism; I use it and it works well like that.
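The per-row decoding that such scripts perform on the Python side can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes the driver hands back both columns as undecoded bytes, and the helper name `decode_rows` and the sample rows are hypothetical.

```python
def decode_rows(rows, fallback="utf-8"):
    """Decode (raw_bytes, encoding_name) pairs using each row's own encoding.

    Assumption: both columns arrive as bytes, as they would if the driver
    skipped its internal decode step.
    """
    decoded = []
    for raw, encoding in rows:
        # The encoding column itself may arrive as bytes; charset names are ASCII.
        if isinstance(encoding, bytes):
            encoding = encoding.decode("ascii")
        # Undecodable bytes are replaced rather than raising, so the corrupt
        # strings can still be inspected and repaired afterwards.
        decoded.append(raw.decode(encoding or fallback, errors="replace"))
    return decoded


# Hypothetical rows, shaped like the result of
# "select somestring, somencoding from sometable".
rows = [
    (b"ab\xc3\xab", b"utf-8"),
    (b"ab\xeb", b"windows-1252"),
]
texts = decode_rows(rows)
```

Because the encoding is read from a second column of the same row, a converter that only sees one field at a time could not do this, which is the side requirement mentioned above.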
Signed-off-by: Darko Djolovic <ddjolovic@outlook.com>
* Implemented #385 enhancement and updated documentation. Signed-off-by: Darko Djolovic <ddjolovic@outlook.com>
* Created flag to Cursor.var(). Signed-off-by: Darko Djolovic <ddjolovic@outlook.com>
* Removed first commit changes, updated documentation. Signed-off-by: Darko Djolovic <ddjolovic@outlook.com>
* Added testing sample 'QueringRawData.py' and renamed attribute 'bypassstringencoding' to 'bypassencoding' with updated documentation. Signed-off-by: Darko Djolovic <ddjolovic@outlook.com>
Take a look at the implementation, which is demonstrated in the new sample. This should address this enhancement, but let me know if you agree! Feedback is always appreciated!
cx_Oracle 8.2 has just been released, which includes this enhancement.
cx_Oracle could accept an encoding 'raw' that returns bytes instead of unicode strings, without any conversion. That way, conversion and repair of corrupt strings can be done at the Python level instead of inside cx_Oracle.
Legacy database content with mixed encodings could then be supported as well. It would work like the utl_raw.cast_to_raw function but without the length limitation of 4000 bytes. In effect, it would behave the way cx_Oracle does under Python 2.7 now.
For testing, there should also be a way to write data. For example, this table could be supported:
create table translations (encoding varchar2(20), content varchar2(1000))
insert into translations (encoding, content) values ('utf-8', 'abë'.encode('utf-8'))
insert into translations (encoding, content) values ('windows-1252', 'abë'.encode('windows-1252'))
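To see why such a table needs raw access, it helps to look at what the two inserts above would store. The following is a small sketch in plain Python with no database involved; the variable names are illustrative only.

```python
text = "abë"

# The same text produces different byte sequences in each encoding.
utf8_bytes = text.encode("utf-8")           # two bytes for "ë"
cp1252_bytes = text.encode("windows-1252")  # one byte for "ë"

# Decoding with the wrong charset either garbles the text or fails outright.
mojibake = utf8_bytes.decode("windows-1252")
try:
    cp1252_bytes.decode("utf-8")
    wrong_decode_failed = False
except UnicodeDecodeError:
    wrong_decode_failed = True
```

A driver that decodes every column with a single connection-wide charset will hit one of these two outcomes for at least one of the rows, whereas returning the bytes untouched lets the caller pick the charset per row.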
An additional advantage is that legacy Python 2.7 code may already have encoding and decoding in place. If that code still runs, moving it to Python 3.8 becomes easier because no changes at the Python level are needed.
This change would of course apply only to the Python 3 version of cx_Oracle, since this is already how it works under Python 2.