-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Respect quoting config in dbt-py models #6620
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good over all, thanks for fixing this! Just want to confirm the quoting "
is not tied only to snowflake.
# quoating seems like something very specific to sql so far | ||
# for all python implementations we are seeing there's no quating. | ||
# TODO try to find better way to do this, given that | ||
original_quoting = self.config.quoting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am so glad these lines are removed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any day we kill a todo is a good day.
@@ -3,7 +3,7 @@ | |||
{%- set ref_dict = {} -%} | |||
{%- for _ref in model.refs -%} | |||
{%- set resolved = ref(*_ref) -%} | |||
{%- do ref_dict.update({_ref | join("."): resolved.quote(database=False, schema=False, identifier=False) | string}) -%} | |||
{%- do ref_dict.update({_ref | join("."): resolved | string | replace('"', '\"')}) -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the "
here tied to specific quoting for snowflake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it would be relevant to any data platform that uses "
as its quoting character (it's our default), or any identifier that contains a quote character ("
) for any reason
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm reading the principle change here as quoting all necessary "
s very \
escapes within the Python environment.
Will any downstream adapters need to override this or can we trust this behavior holds for all? I'm concerned about the quote rules being different in, idk, Bigquery from the simple transform applied here. Ergo, this PR could introduce a bug for Bigquery (or whatever db has its own quoting rules).
Or is this quoting restricted to the Python environment? [And as a corollary, this is just making sure Python doesn't alter the string before handing it to the underlying db, which will figure out if the relation name string makes sense or not]
Correct! This isn't specific to Snowflake, it would also be relevant on any other data platform that uses Big idea: in Python models, session.table("database.schema.table") Where the table identifier is a Python string, i.e. quoted. If there are any instances of quotation marks ( >>> string = "hello, \"world\"!"
>>> print(string)
hello, "world"! The automated test confirming that this change works is really in dbt-labs/dbt-snowflake#399, which also resolves a |
database = "{{ this.database }}" | ||
schema = "{{ this.schema }}" | ||
identifier = "{{ this.identifier }}" | ||
{% set this_relation_name = this | string | replace('"', '\\"') %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One potential option is to make this a macro and have adapters overwrite it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is related to the change of respecting quoting schema would break bigquery dataproc python model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just chatted with @colin-rogers-dbt about this. Simplest solution is to turn this processing into a dispatched macro that can be overridden on dbt-bigquery
.
Something like:
{% macro render_relation_for_python_model(relation) %}
{{ adapter.dispatch('render_relation_for_python_model', 'dbt')(relation)
{% endmacro %}
{% macro default__render_relation_for_python_model(relation) %}
{{ return(relation | string | replace('"', \\"')) }}
{% endmacro %}
{% macro bigquery__render_relation_for_python_model(relation) %}
{{ return(relation.quote(database=False, schema=False, identifier=False) | string) }}
{% endmacro %}
And use that macro in all three spots that I changed here (ref
, source
, this
)
resolves #6103
resolves #6619
Description
From our code comment:
Initial user feedback has proven otherwise!
dbt.ref()
,dbt.source()
,dbt.this()
)"
). This is principally relevant to Snowflake today, but several other data platforms also use"
as their quoting character.Checklist
changie new
to create a changelog entry