Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Oracle] Oracle Hook - make it possible to define a schema in the connection parameters #18664

Closed
1 of 2 tasks
mehmax opened this issue Oct 1, 2021 · 8 comments · Fixed by #19084
Closed
1 of 2 tasks

Comments

@mehmax
Copy link
Contributor

mehmax commented Oct 1, 2021

Description

Currently the oracle hook it not setting a CURRENT_SCHEMA after connecting with the Database.
In a lot of use cases we have production and test databases with seperate connections and database schemas e.g. TEST.Table1, PROD.Table1
"Hard-coding" the database schema in SQL Scripts is not elegant due to having different Airflow Instances for developing and Production.

An Option would be to store the database schema in a airflow Variable and getting it into the sql script with JINJA.
In Large SQL files with several tables it is not elegant either, because for every table a query to the metadatabase is made.

Why not using the Schema parameter in the Airflow Connections and executing
ALTER SESSION SET CURRENT_SCHEMA='SCHEMA'
right after successfully connecting to the database?

An alternative would be to use option Connection.current_schema of Library cx_Oracle.
https://cx-oracle.readthedocs.io/en/6.4.1/connection.html#Connection.current_schema

Use case/motivation

It makes Query development much easier by storing environment attributes directly in the Airflow Connection.
You have full flexibility without touching your SQL Script.
It makes separation of Test and Production environments and connections possible.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@mehmax mehmax added the kind:feature Feature Requests label Oct 1, 2021
@mehmax mehmax changed the title Oracle Hook - make it possible to define a schema in the connection parameters [Oracle] Oracle Hook - make it possible to define a schema in the connection parameters Oct 1, 2021
@nikochiko
Copy link

nikochiko commented Oct 2, 2021

Can I pick up this issue?

Perhaps using variables as suggested might be a good idea, with a default value for backward compatibility Jinja templates might be a good idea

@nikochiko
Copy link

@uranusjr
Copy link
Member

uranusjr commented Oct 2, 2021

If I understand correctly, you can already access connections in the template. But I think the original request is the hook should automatically set that from the connection (from an agreed-upon extra key).

@mehmax
Copy link
Contributor Author

mehmax commented Oct 4, 2021

The Idea was, to set the schema name in the corresponding field from UI.
image

When Initializing Oracle Hook, the current schema should be automatically set.

For example by adding a
conn.execute("ALTER SESSION SET CURRENT_SCHEMA = MY_SCHEMA") here:

conn = cx_Oracle.connect(**conn_config)

@nikochiko
Copy link

nikochiko commented Oct 5, 2021

@mehmax i don't know what a "schema" is in oracle terminology, but in airflow's case this parameter is used to refer to the conntype://login:password@host:port/<schemaname> schema name in the DSN.

If you look at these lines of code,

elif conn.host and service_name and not sid:
conn_config['dsn'] = cx_Oracle.makedsn(conn.host, port, service_name=service_name)
else:
dsn = conn.extra_dejson.get('dsn')
if dsn is None:
dsn = conn.host
if conn.port is not None:
dsn += ":" + str(conn.port)
if service_name or conn.schema:
dsn += "/" + (service_name or conn.schema)
the schema name is overwritten by service_name extra arg.

maybe what you're looking for can be achieved by removing the "service_name" extra arg?

@mehmax
Copy link
Contributor Author

mehmax commented Oct 6, 2021

@nikochiko
In oracle, schema is something different to service_name. I dont know why is was coded like this.
service name is needed to connect to the database

I am no oracle expert, but let me explain it in an example.
Tables in oracle are created in a schema. To query a table you have to use the following syntax:
SELECT * FROM SCHEMA.TABLE .
If you set a specific schema as a current_schema with
ALTER SESSION SET CURRENT_SCHEMA = SCHEMA, you do not have to specifiy it in the SQL Syntax anymore:
SELECT * FROM TABLE .

Please refer to also to this links
https://docs.oracle.com/cd/B28359_01/server.111/b28310/general009.htm#ADMIN02101
how it is working in cx_Oracle
https://cx-oracle.readthedocs.io/en/6.4.1/connection.html#Connection.current_schema

In my opinion it can be solved by adding the following lines

schema = conn.schema
...
conn = cx_Oracle.connect(**conn_config)
conn.current_schema = schema

or (equivalent)

schema = conn.schema
...
conn = cx_Oracle.connect(**conn_config)
conn.execute(f"ALTER SESSION SET CURRENT_SCHEMA={ schema }")

@eladkal
Copy link
Contributor

eladkal commented Oct 6, 2021

We probably also need to customize the connection fields in the UI and set Service Name with explanation to avoid confusion. Similar to what was done for Salesforce #17162

@mehmax do you want to work on this?

@mehmax
Copy link
Contributor Author

mehmax commented Oct 19, 2021

@eladkal @uranusjr @nikochiko
I created PR #19084. Could you please have a look at this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants