Implemented oracle#385 enhancement and updated documentation
Signed-off-by: Darko Djolovic <ddjolovic@outlook.com>
Draco94 committed Mar 25, 2021
1 parent 8f901ab commit 00fc04d
Showing 6 changed files with 111 additions and 7 deletions.
14 changes: 14 additions & 0 deletions doc/src/api_manual/connection.rst
@@ -252,6 +252,20 @@ Connection Object
This attribute is an extension to the DB API definition.


.. attribute:: Connection.bypassstringencoding

This read-only attribute indicates whether bypassstringencoding mode is
on or off. When the mode is on, values of the database types CHAR, NCHAR,
LONG_STRING, NSTRING and STRING are returned as raw bytes; cx_Oracle does
not decode them.

See :ref:`Querying raw data <querying-raw-data>` for more information.

.. note::

This attribute is an extension to the DB API definition.
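
Code that has to work with connections created either way can branch on this
attribute. The following is a minimal sketch, assuming the attribute behaves as
described above. The helper ``as_text`` is hypothetical, and the stand-in
connection objects exist only so that the example runs without a database.

```python
from types import SimpleNamespace


def as_text(connection, value, encoding="utf-8"):
    """Return value as str whether or not the connection bypasses decoding.

    Hypothetical helper, not part of cx_Oracle.
    """
    bypass = getattr(connection, "bypassstringencoding", 0)
    if bypass and isinstance(value, bytes):
        # Raw mode: the driver handed back bytes, so decode them here.
        return value.decode(encoding)
    return value


# Stand-ins for real connections (assumption: only the attribute matters here).
raw_conn = SimpleNamespace(bypassstringencoding=1)
normal_conn = SimpleNamespace(bypassstringencoding=0)

print(as_text(raw_conn, b"Fianc\xc3\xa9"))
print(as_text(normal_conn, "Fianc\u00e9"))
```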


.. method:: Connection.enq(name, options, msgproperties, payload)

Returns a message id after successfully enqueuing a message. The options
10 changes: 8 additions & 2 deletions doc/src/api_manual/module.rst
@@ -41,13 +41,13 @@ Module Interface
events=False, cclass=None, purity=cx_Oracle.ATTR_PURITY_DEFAULT, \
newpassword=None, encoding=None, nencoding=None, edition=None, \
appcontext=[], tag=None, matchanytag=None, shardingkey=[], \
supershardingkey=[])
supershardingkey=[], bypassstringencoding=False)
Connection(user=None, password=None, dsn=None, \
mode=cx_Oracle.DEFAULT_AUTH, handle=0, pool=None, threaded=False, \
events=False, cclass=None, purity=cx_Oracle.ATTR_PURITY_DEFAULT, \
newpassword=None, encoding=None, nencoding=None, edition=None, \
appcontext=[], tag=None, matchanytag=False, shardingkey=[], \
supershardingkey=[])
supershardingkey=[], bypassstringencoding=False)

Constructor for creating a connection to the database. Return a
:ref:`connection object <connobj>`. All parameters are optional and can be
@@ -125,6 +125,12 @@ Module Interface
The shardingkey and supershardingkey parameters, if specified, are expected
to be a sequence of values which will be used to identify the database
shard to connect to. The key values can be strings, numbers, bytes or dates.

The bypassstringencoding parameter, if specified, should be a boolean. When
it is true, values of the database types CHAR, NCHAR, LONG_STRING, NSTRING
and STRING are returned as raw bytes; cx_Oracle does not decode them. See
:ref:`Querying raw data <querying-raw-data>` for more information.


.. function:: Cursor(connection)
77 changes: 77 additions & 0 deletions doc/src/user_guide/sql_execution.rst
@@ -690,6 +690,83 @@ columns:
Other codec behaviors can be chosen for ``encodingErrors``, see `Error Handlers
<https://docs.python.org/3/library/codecs.html#error-handlers>`__.


.. _querying-raw-data:

Querying Raw Data
---------------------

Sometimes cx_Oracle may be unable to convert data to Unicode, and you may
want to inspect the problem more closely rather than auto-fix it using the
``encodingErrors`` parameter. This can be useful when a database contains
records or fields that are stored in the wrong encoding altogether.

It is not recommended to use mixed encodings in databases.
This functionality is aimed at troubleshooting databases
that have inconsistent encodings for external reasons.
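
To see why the raw bytes can be more informative than an error handler,
consider the following minimal sketch. It does not touch the database; the
byte string and the choice of Latin-1 as the stray encoding are assumptions
made purely for illustration.

```python
# Illustrative only: a value stored as Latin-1 in a database that is
# otherwise assumed to hold UTF-8 data.
raw = "Fianc\u00e9".encode("latin-1")

# Strict decoding fails, which is the kind of error being investigated.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# An error handler hides the failure but loses the original character.
replaced = raw.decode("utf-8", errors="replace")

# With the raw bytes in hand, the suspected encoding can be tried instead.
recovered = raw.decode("latin-1")

print(strict_ok, repr(replaced), repr(recovered))
```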

For these cases, you can pass the additional keyword argument
``bypassstringencoding=True`` to :meth:`cx_Oracle.connect()` or
:meth:`cx_Oracle.Connection()` when creating the connection.

.. code-block:: python

    connection = cx_Oracle.connect("hr", userpwd, "dbhost.example.com/orclpdb1",
            bypassstringencoding=True)



This will allow you to receive data as raw bytes.

.. code-block:: python

    cursor = connection.cursor()
    # execute() returns the cursor itself for queries
    statement = cursor.execute("select content, charset from SomeTable")
    data = statement.fetchall()


The contents of ``data`` will look like this:

.. code-block:: python

    [(b'Fianc\xc3\xa9', b'UTF-8')]


Note that the trailing ``\xc3\xa9`` is the UTF-8 encoding of "é". You can then
decode the value yourself:


.. code-block:: python

    # data = [(b'Fianc\xc3\xa9', b'UTF-8')]
    content, charset = data[0]
    # The charset column is itself returned as bytes, so decode it first.
    # This assumes the stored charset name (here UTF-8) is correct.
    unicodecontent = content.decode(charset.decode())


This converts the bytes back to the string "Fiancé".
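
When the charset column may itself be missing or wrong, the decoding step can
be wrapped in a small helper that reports failures instead of raising. This is
a sketch, not part of cx_Oracle: the name ``decode_row``, the UTF-8 fallback
and the second sample row are assumptions made for illustration.

```python
def decode_row(row, fallback="utf-8"):
    """Decode a (content, charset) row of raw bytes.

    Hypothetical helper: returns (text, None) on success or (None, error)
    on failure, so that problem values can be inspected by the caller.
    """
    content, charset = row
    try:
        # The charset name is assumed to be plain ASCII, e.g. b'UTF-8'.
        codec = charset.decode("ascii") if charset else fallback
        return content.decode(codec), None
    except (UnicodeDecodeError, LookupError) as exc:
        # LookupError covers charset names Python does not recognise.
        return None, exc


data = [(b'Fianc\xc3\xa9', b'UTF-8'), (b'Fianc\xe9', b'UTF-8')]
results = [decode_row(row) for row in data]

for text, error in results:
    print(repr(text), type(error).__name__)
```

Rows whose error is not ``None`` are the ones worth examining by hand.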

If you want to save ``b'Fianc\xc3\xa9'`` to the database, you will need to
create a variable with :meth:`Cursor.var()` that tells cx_Oracle the value is
intended as a string:


.. code-block:: python

    connection = cx_Oracle.connect("hr", userpwd, "dbhost.example.com/orclpdb1",
            bypassstringencoding=True)
    cursor = connection.cursor()
    cursorvariable = cursor.var(cx_Oracle.STRING)
    # "SomeValuě".encode("UTF-8") is b'SomeValu\xc4\x9b'
    cursorvariable.setvalue(0, "SomeValuě".encode("UTF-8"))
    cursor.execute("update SomeTable set SomeColumn = :param where id = 1",
            param=cursorvariable)


At that point, the bytes will be assumed to be in the correct encoding and should insert as you expect.

.. warning::
    This functionality is provided "as-is". When strings are saved this way,
    the bytes are assumed to be in the correct encoding and are inserted
    unchanged. Proper encoding is the responsibility of the user, and the
    correctness of data already in the database cannot be assumed.
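
Because nothing checks the bytes on the way in, it can be worth validating
them in application code before binding. The following is a minimal sketch,
assuming the database character set is UTF-8; the helper name
``ensure_encoded`` is hypothetical and not part of cx_Oracle.

```python
def ensure_encoded(value, encoding="utf-8"):
    """Return value as bytes that are valid in the given encoding.

    Raises ValueError if bytes are passed that do not decode cleanly.
    """
    if isinstance(value, str):
        return value.encode(encoding)
    try:
        value.decode(encoding)  # validation only; the result is discarded
    except UnicodeDecodeError as exc:
        raise ValueError(f"bytes are not valid {encoding}: {exc}") from exc
    return value


good = ensure_encoded("SomeValu\u011b")

try:
    ensure_encoded(b"Fianc\xe9")  # Latin-1 bytes, not valid UTF-8
    rejected = False
except ValueError:
    rejected = True

print(good, rejected)
```

The returned bytes could then be passed to ``setvalue()`` on the cursor
variable as in the update example above.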

.. _dml:


14 changes: 9 additions & 5 deletions src/cxoConnection.c
@@ -470,7 +470,7 @@ static int cxoConnection_init(cxoConnection *conn, PyObject *args,
PyObject *tagObj, *matchAnyTagObj, *threadedObj, *eventsObj, *contextObj;
PyObject *usernameObj, *passwordObj, *dsnObj, *cclassObj, *editionObj;
PyObject *shardingKeyObj, *superShardingKeyObj, *tempObj;
int status, temp, invokeSessionCallback;
int status, temp, invokeSessionCallback, bypassStringEncoding;
PyObject *beforePartObj, *afterPartObj;
dpiCommonCreateParams dpiCommonParams;
dpiConnCreateParams dpiCreateParams;
@@ -483,12 +483,12 @@ static int cxoConnection_init(cxoConnection *conn, PyObject *args,
static char *keywordList[] = { "user", "password", "dsn", "mode",
"handle", "pool", "threaded", "events", "cclass", "purity",
"newpassword", "encoding", "nencoding", "edition", "appcontext",
"tag", "matchanytag", "shardingkey", "supershardingkey", NULL };
"tag", "matchanytag", "shardingkey", "supershardingkey", "bypassstringencoding", NULL };

// parse arguments
pool = NULL;
tagObj = Py_None;
externalHandle = 0;
externalHandle = bypassStringEncoding = 0;
passwordObj = dsnObj = cclassObj = editionObj = NULL;
threadedObj = eventsObj = newPasswordObj = usernameObj = NULL;
matchAnyTagObj = contextObj = shardingKeyObj = superShardingKeyObj = NULL;
@@ -499,13 +499,13 @@ static int cxoConnection_init(cxoConnection *conn, PyObject *args,
if (dpiContext_initConnCreateParams(cxoDpiContext, &dpiCreateParams) < 0)
return cxoError_raiseAndReturnInt();
if (!PyArg_ParseTupleAndKeywords(args, keywordArgs,
"|OOOiKO!OOOiOssOOOOOO", keywordList, &usernameObj, &passwordObj,
"|OOOiKO!OOOiOssOOOOOOp", keywordList, &usernameObj, &passwordObj,
&dsnObj, &dpiCreateParams.authMode, &externalHandle,
&cxoPyTypeSessionPool, &pool, &threadedObj, &eventsObj, &cclassObj,
&dpiCreateParams.purity, &newPasswordObj,
&dpiCommonParams.encoding, &dpiCommonParams.nencoding, &editionObj,
&contextObj, &tagObj, &matchAnyTagObj, &shardingKeyObj,
&superShardingKeyObj))
&superShardingKeyObj, &bypassStringEncoding))
return -1;
dpiCreateParams.externalHandle = (void*) externalHandle;
if (cxoUtils_getBooleanValue(threadedObj, 0, &temp) < 0)
@@ -666,6 +666,9 @@ static int cxoConnection_init(cxoConnection *conn, PyObject *args,
Py_DECREF(tempObj);
}

// record whether string decoding should be bypassed (raw bytes returned)
conn->bypassStringEncoding = bypassStringEncoding;

return 0;
}

@@ -1947,6 +1950,7 @@ static PyMemberDef cxoMembers[] = {
{ "username", T_OBJECT, offsetof(cxoConnection, username), READONLY },
{ "dsn", T_OBJECT, offsetof(cxoConnection, dsn), READONLY },
{ "tnsentry", T_OBJECT, offsetof(cxoConnection, dsn), READONLY },
{ "bypassstringencoding", T_INT, offsetof(cxoConnection, bypassStringEncoding), READONLY },
{ "tag", T_OBJECT, offsetof(cxoConnection, tag), 0 },
{ "autocommit", T_INT, offsetof(cxoConnection, autocommit), 0 },
{ "inputtypehandler", T_OBJECT,
1 change: 1 addition & 0 deletions src/cxoModule.h
@@ -238,6 +238,7 @@ struct cxoConnection {
PyObject *tag;
dpiEncodingInfo encodingInfo;
int autocommit;
int bypassStringEncoding;
};

struct cxoCursor {
2 changes: 2 additions & 0 deletions src/cxoTransform.c
@@ -849,6 +849,8 @@ PyObject *cxoTransform_toPython(cxoTransformNum transformNum,
case CXO_TRANSFORM_NSTRING:
case CXO_TRANSFORM_STRING:
bytes = &dbValue->asBytes;
if (connection->bypassStringEncoding)
return PyBytes_FromStringAndSize(bytes->ptr, bytes->length);
return PyUnicode_Decode(bytes->ptr, bytes->length, bytes->encoding,
encodingErrors);
case CXO_TRANSFORM_NATIVE_DOUBLE:
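
The effect of the change to cxoTransform_toPython can be modelled in a few
lines of Python. This is only a sketch of the behaviour shown in the hunk
above, not code from cx_Oracle; the function name `to_python` and its
parameters are hypothetical.

```python
def to_python(raw, encoding, errors="strict", bypass=False):
    """Model of the string branch: bytes in, bytes or str out."""
    if bypass:
        # Bypass mode: hand the bytes back untouched.
        return bytes(raw)
    # Behaviour before the change: decode using the connection's encoding.
    return raw.decode(encoding, errors)


print(to_python(b"Fianc\xc3\xa9", "utf-8"))
print(to_python(b"Fianc\xc3\xa9", "utf-8", bypass=True))
```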
