airflow jinja template render error #33694
Comments
@jaegwonseo Thanks for logging this and I have been able to replicate it. Basically this means that we cannot extract a BigQuery table to JSON delimited files with the extension |
This is expected behaviour - the This is defined here: airflow/airflow/providers/google/cloud/operators/bigquery.py Lines 2697 to 2706 in 46fa5a2
I'm not sure if there's a preferred way around this, but I would just subclass the operator and remove the extension:
class BigQueryInsertJobOperatorNoTemplateExt(BigQueryInsertJobOperator):
    template_ext = [] |
@SamWheating I wonder if it might be possible to prevent the renderer from attempting to load a file for named keys in |
How about adding a no_template_fields parameter to the BaseOperator constructor? In this case, no_template_fields = (configuration.extract.destinationUris):
class BaseOperator(AbstractOperator, metaclass=BaseOperatorMeta):
    def __init__(
        self,
        xxx,
        yyy,
        no_template_fields: Sequence[str] = (),
    ): ...
and then skip rendering for those fields in https://github.com/apache/airflow/blob/main/airflow/template/templater.py#L152 |
@jaegwonseo Why not give it a try? I think we would still want the filenames in |
I think that this would get pretty complicated pretty fast, as a dict argument like Additionally, we'd have to be really specific to only disable this on fields in which we know a user would never ever want to apply this sort of formatting. I think in the case of Thoughts? |
I think the simplest case would be to add a feature to disable the templating engine - either for all fields, for specific fields, or maybe just for specific extensions. I think we could couple this with something that we should have had for quite some time - i.e. the ability to dynamically change the list of templated fields and template extensions by overriding them in the instance of the task. There is fundamentally no problem with disabling or enabling it selectively "per task". It could even be as simple as "template_fields" and "template_ext" specified in the constructor of BaseOperator. It's a little bit more complex than just overriding it in classes, since it would also have to be overridable in the serialized form of the DAG - i.e. when template_fields are rendered in the UI. But I do not think there is anything that would prevent us from doing it. |
Yeah. Interesting approach. I can't think of a bad side effect of it - this should not only work for parsing + execution of tasks, but it should also serialize well for the UI, and it appears to have no unintended side effects. It appears to be a solution that we never thought of, but that was always possible. And it's super-appealing to have a problem that can be solved by documenting it. Let me summon a few people to see what they think: @dstandish @uranusjr @hussein-awala @eladkal - WDYT? I find it deceptively simple, but also - quite surprisingly - pretty workable. We had a number of discussions on whether we should make all fields templateable by default etc., but it seems that this simple trick might do the job and we could make it "official". See the #33694 (comment) comment by @SamWheating |
BTW. No wonder @SamWheating you have not thought about it. It seems no one did for the last few years I was around. I kept explaining to users how easy it is to just extend existing operators and modify template_fields and template_ext, but it never occurred to me that we could modify them in the tasks during DAG construction. Apparently - we can. |
Any objection to just adding this to the documentation as a footnote under the Template Rendering section? |
I have absolutely no objections. Even more, I would love that as an "official" way of modifying templated/template_ext fields on the fly - as long as we get a few more people to say "well, indeed that does not seem to have any side effects" :D |
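A minimal sketch of the per-task override discussed above. It uses a stand-in class, not a real Airflow operator, so it runs without Airflow installed. FakeInsertJobOperator and is_template_file are hypothetical names, and the (".json", ".sql") default is an assumption about the operator in question. The point is only that assigning template_ext on one task instance shadows the class attribute for that task and leaves every other task untouched.

```python
class FakeInsertJobOperator:
    """Stand-in for an operator class (hypothetical, not Airflow's API)."""

    # Class-level default shared by every task (assumed values).
    template_ext = (".json", ".sql")

    def __init__(self, task_id, configuration):
        self.task_id = task_id
        self.configuration = configuration

    def is_template_file(self, value):
        # Mirrors the "ends with a template extension" check in spirit.
        return value.endswith(tuple(self.template_ext))


uri = "gs://xxx/yyy/*.json"

# Task that keeps the class default: the URI looks like a template file.
default_task = FakeInsertJobOperator("extract_default", {"extract": {"destinationUris": [uri]}})

# Task that opts out during DAG construction by overriding the instance attribute.
literal_task = FakeInsertJobOperator("extract_literal", {"extract": {"destinationUris": [uri]}})
literal_task.template_ext = ()
```

With a real operator, the equivalent would be assigning template_ext on the task object right after constructing it; whether that serializes cleanly for the UI is the open question raised above.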
Didn't we have the exact same bug when we added json as templated ext for |
Maybe the easiest way out would be to add a
BigQueryInsertJobOperator(
    ...,
    configuration={
        "extract": {
            "destinationUris": [Literally("gs://xxx/yyy/*.json")],
            ...,
        },
    },
)
Please do bikeshed on the name. I’m intentionally avoiding |
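To make the wrapper idea concrete, here is a rough sketch under stated assumptions. Literally is the placeholder name from the comment above, and the render function is a simplified stand-in written for this example, not Airflow's templater. It only substitutes simple {{ name }} placeholders and pretends that every file lookup fails.

```python
import re


class Literally:
    """Hypothetical wrapper marking a value that must not be templated."""

    def __init__(self, value):
        self.value = value


def render(value, context, template_ext=(".json", ".sql")):
    """Simplified stand-in renderer (not Airflow's implementation)."""
    if isinstance(value, Literally):
        # Wrapped values are returned untouched: no file lookup, no rendering.
        return value.value
    if isinstance(value, dict):
        return {k: render(v, context, template_ext) for k, v in value.items()}
    if isinstance(value, list):
        return [render(v, context, template_ext) for v in value]
    if isinstance(value, str):
        if value.endswith(tuple(template_ext)):
            # A real templater would try to load this as a template file.
            raise FileNotFoundError(value)
        # Substitute {{ name }} placeholders from the context.
        return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(context[m.group(1)]), value)
    return value


configuration = {
    "extract": {
        "destinationUris": [Literally("gs://xxx/yyy/*.json")],
        "note": "run {{ ds }}",
    },
}
rendered = render(configuration, {"ds": "2023-08-24"})
```

The wrapper keeps the opt-out next to the one value that needs it, so other strings in the same dict are still templated as usual.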
I like the idea. |
How about |
Apache Airflow version
Other Airflow 2 version (please specify below)
What happened
version 2.6.2
An error occurs when *.json is included in the parameters of BigQueryInsertJobOperator.
error log
What you think should happen instead
According to the airflow.template.templater
source : https://github.com/apache/airflow/blob/main/airflow/template/templater.py#L152
In the Jinja template source, if the value ends with .json or .sql, an attempt is made to read the resource file by calling jinja_env.get_template.
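The behaviour described above can be sketched roughly as follows. This is a simplified stand-in, not the code in templater.py: render_template and TemplateNotFound here are local definitions, the (".json", ".sql") default is an assumption, and the file lookup is replaced by an unconditional error because no template files exist in this sketch.

```python
import re


class TemplateNotFound(Exception):
    """Local stand-in for the error raised when a template file is missing."""


def render_template(value, context, template_ext=(".json", ".sql")):
    """Simplified sketch of the string branch of template rendering."""
    if not isinstance(value, str):
        return value
    if value.endswith(tuple(template_ext)):
        # The value is treated as a file path; the lookup fails for a GCS URI.
        raise TemplateNotFound(value)
    # Otherwise the string itself is the template; substitute {{ name }}.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(context[m.group(1)]), value)
```

A destination URI such as gs://xxx/yyy/*.json takes the file-path branch only because of its suffix, which is why the operator fails before any query is sent. Passing an empty template_ext makes the same value go through unchanged.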
How to reproduce
Just call BigQueryInsertJobOperator with the configuration I added.
Operating System
m2 mac
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct