Custom fetch all handler for vertica to not miss errors #34041

Merged on Sep 6, 2023 (11 commits)
54 changes: 53 additions & 1 deletion airflow/providers/vertica/hooks/vertica.py
@@ -17,9 +17,24 @@
# under the License.
from __future__ import annotations

from typing import Any, Callable, Iterable, Mapping, overload

from vertica_python import connect

from airflow.providers.common.sql.hooks.sql import DbApiHook
from airflow.providers.common.sql.hooks.sql import DbApiHook, fetch_all_handler


def vertica_fetch_all_handler(cursor) -> list[tuple] | None:
    """Replace the default DbApiHook fetch_all_handler."""
    to_return = fetch_all_handler(cursor)
Member

Could we rename to_return? Would rows make sense as the name of this value? Or result might be a bit better than to_return.

Contributor Author

OK, I'll rename it to result.

    # loop on all statement result sets to get errors
    if cursor.description is not None:
        while cursor.nextset():
            if cursor.description is not None:
                row = cursor.fetchone()
                while row:
                    row = cursor.fetchone()
    return to_return
Member

Will this value be changed during the execution of lines 31-36? It seems this variable is not touched after it is assigned. But I guess this is something related to the cursor?

Contributor Author

The returned value will not change between lines 31 and 37; all this code is only there to make the Vertica client raise errors.
If you run the following SQL:
INSERT INTO MyTable (Key, Label) values (1, 'test 1');
INSERT INTO MyTable (Key, Label) values (1, 'test 2');
INSERT INTO MyTable (Key, Label) values (3, 'test 3');

Each INSERT has its own result set, and if you don't try to fetch data from those result sets, you won't detect the error on the second INSERT.
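
The behaviour described in this comment can be illustrated with a small self-contained sketch. It does not use vertica_python: FakeCursor is a hypothetical stand-in that assumes the client only reports a statement's error once its result set is reached, and the row values and the ValueError are made up for illustration.

```python
class FakeCursor:
    """Hypothetical cursor with one result set per statement (not vertica_python)."""

    def __init__(self, result_sets):
        # Each entry is a list of rows, or an Exception reported for that statement.
        # Assumption for this sketch: the first entry is always a list of rows.
        self._sets = list(result_sets)
        self._pos = 0
        self._rows = iter(self._sets[0])

    def fetchall(self):
        return list(self._rows)

    def fetchone(self):
        return next(self._rows, None)

    def nextset(self):
        self._pos += 1
        if self._pos >= len(self._sets):
            return None
        current = self._sets[self._pos]
        if isinstance(current, Exception):
            # The error only surfaces when this result set is reached.
            raise current
        self._rows = iter(current)
        return True


def fetch_first_only(cursor):
    # Reads the first result set and ignores the rest.
    return cursor.fetchall()


def fetch_and_drain(cursor):
    # Reads the first result set, then walks through the remaining ones.
    result = cursor.fetchall()
    while cursor.nextset():
        row = cursor.fetchone()
        while row:
            row = cursor.fetchone()
    return result


def make_cursor():
    # Three statements; the second one failed on the (imaginary) server.
    return FakeCursor([[(1,)], ValueError("second insert failed"), [(1,)]])


rows = fetch_first_only(make_cursor())  # completes without any error

try:
    fetch_and_drain(make_cursor())
    error_seen = False
except ValueError:
    error_seen = True
```

With fetch_first_only the failure of the second statement goes unnoticed, while fetch_and_drain raises as soon as it advances to the second result set.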

Member

Very nasty behaviour of Vertica

Contributor Author

Yes. Fortunately, since I was just starting with Airflow, I tested whether it worked as I thought. If I had already been used to other databases, I might have ended up with a really bad surprise.



class VerticaHook(DbApiHook):
@@ -99,3 +114,40 @@ def get_conn(self) -> connect:

        conn = connect(**conn_config)
        return conn

    @overload
    def run(
        self,
        sql: str | Iterable[str],
        autocommit: bool = ...,
        parameters: Iterable | Mapping[str, Any] | None = ...,
        handler: None = ...,
        split_statements: bool = ...,
        return_last: bool = ...,
    ) -> None:
        ...

    @overload
    def run(
        self,
        sql: str | Iterable[str],
        autocommit: bool = ...,
        parameters: Iterable | Mapping[str, Any] | None = ...,
        handler: Callable[[Any], Any] = ...,
        split_statements: bool = ...,
        return_last: bool = ...,
    ) -> Any | list[Any]:
        ...

    def run(
        self,
        sql: str | Iterable[str],
        autocommit: bool = False,
        parameters: Iterable | Mapping | None = None,
        handler: Callable[[Any], Any] | None = None,
        split_statements: bool = False,
        return_last: bool = True,
    ) -> Any | list[Any] | None:
        if handler == fetch_all_handler:
            handler = vertica_fetch_all_handler
        return DbApiHook.run(self, sql, autocommit, parameters, handler, split_statements, return_last)
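
One consequence of the equality check in run is worth spelling out: only the default handler object itself is swapped. The sketch below isolates that check; the two handler functions are simplified stand-ins rather than the Airflow implementations, and resolve_handler is a hypothetical helper introduced only for illustration.

```python
def fetch_all_handler(cursor):
    # Stand-in for the default handler, not the Airflow implementation.
    return cursor.fetchall()


def vertica_fetch_all_handler(cursor):
    # Stand-in for the Vertica-specific handler.
    return cursor.fetchall()


def resolve_handler(handler):
    # Hypothetical helper mirroring the check at the top of run().
    if handler == fetch_all_handler:
        return vertica_fetch_all_handler
    return handler


def wrapper(cursor):
    # A user-defined handler that merely delegates to the default one.
    return fetch_all_handler(cursor)


swapped = resolve_handler(fetch_all_handler)
untouched_none = resolve_handler(None)
untouched_wrapper = resolve_handler(wrapper)
```

Under these assumptions, a handler that wraps the default one (like wrapper here) is passed through unchanged, so it would not get the draining behaviour.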
1 change: 1 addition & 0 deletions tests/providers/vertica/hooks/test_vertica.py
@@ -127,6 +127,7 @@ def test_get_conn_extra_parameters_cast(self, mock_connect):
class TestVerticaHook:
    def setup_method(self):
        self.cur = mock.MagicMock(rowcount=0)
        self.cur.nextset.side_effect = [None]
        self.conn = mock.MagicMock()
        self.conn.cursor.return_value = self.cur
        conn = self.conn
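
A likely reason for the added nextset.side_effect line, sketched with the standard library only: an unconfigured MagicMock method returns another MagicMock, which is truthy, so a loop of the form while cursor.nextset() would never end against the mocked cursor. This is an interpretation of the test change, not something stated in the PR.

```python
from unittest import mock

cur = mock.MagicMock(rowcount=0)

# By default, nextset() returns a new MagicMock, and MagicMock objects are truthy.
default_is_truthy = bool(cur.nextset())

# With a one-element side_effect list, the next call returns None,
# which is falsy and therefore ends a "while cur.nextset()" loop.
cur.nextset.side_effect = [None]
first_call = cur.nextset()

# Once the list is exhausted, any further call raises StopIteration.
try:
    cur.nextset()
    exhausted = False
except StopIteration:
    exhausted = True
```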