[CT-1203] [Bug] multiple references in a model cause duplicate entries in graph.node.depends_on
#5877
Comments
@dave-connors-3 Thanks for opening this, and sorry it's taken me forever to respond! This one is totally in our power to change. So really, the question is: What is the expected behavior? What ought it be? Here's a halfway position that makes sense to me, but I wouldn't be surprised if someone else found it more confusing:
The deduplication would be consistent with three other things already in the
example
-- models/model_a.sql
select 1 as id
-- models/model_b.sql
select 1 as id
-- models/model_c.sql
select * from {{ ref('model_a') }}
union all
select * from {{ ref('model_b') }}
union all
select * from {{ ref('model_a') }}
current
"model.testy.model_c": {
...
"raw_code": "select * from {{ ref('model_a') }}\nunion all\nselect * from {{ ref('model_b') }}\nunion all\nselect * from {{ ref('model_a') }}",
"language": "sql",
"refs": [
[
"model_a"
],
[
"model_b"
],
[
"model_a"
]
],
"sources": [],
"metrics": [],
"depends_on": {
"macros": [],
"nodes": [
"model.testy.model_a",
"model.testy.model_b",
"model.testy.model_a"
]
},
"compiled_path": null
},
proposed
"model.testy.model_c": {
...
"raw_code": "select * from {{ ref('model_a') }}\nunion all\nselect * from {{ ref('model_b') }}\nunion all\nselect * from {{ ref('model_a') }}",
"language": "sql",
"refs": [
[
"model_a"
],
[
"model_b"
],
[
"model_a"
]
],
"sources": [],
"metrics": [],
"depends_on": {
"macros": [],
"nodes": [
"model.testy.model_b",
"model.testy.model_a"
]
},
"compiled_path": null
},
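The deduplication proposed above can be sketched in a few lines of Python. This is only an illustration, not dbt-core's actual code: `dedupe_nodes` is a hypothetical helper name, and keeping first-seen order is an assumption (the proposed output above lists model_b before model_a, and the thread does not say whether order matters).

```python
def dedupe_nodes(nodes):
    """Return the unique_ids in `nodes` with repeats removed.

    Hypothetical helper for illustration; not part of dbt-core.
    """
    # dict.fromkeys preserves insertion order (Python 3.7+), so each
    # unique_id stays at the position of its first occurrence.
    return list(dict.fromkeys(nodes))


# depends_on.nodes as shown in the "current" manifest entry above
current = [
    "model.testy.model_a",
    "model.testy.model_b",
    "model.testy.model_a",
]

deduped = dedupe_nodes(current)
print(deduped)
```

Note that only depends_on.nodes is deduplicated here; the refs list is left untouched, matching the proposed entry above, where refs still records all three ref() calls.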
Gave this a quick spin, but ran into some nasty
Resolved by #7455
Is this a new bug in dbt-core?
Current Behavior
For a model that contains two relation references (is that the right term? I'm trying not to say "references", to avoid conflating that with the ref function), the corresponding entry in the manifest.json/graph will have the same node in the depends_on list multiple times:
Model with same source
Corresponding JSON
Model with same {{ ref() }}
Corresponding JSON
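To check whether a project is affected, a small script can scan a manifest for repeated entries in depends_on.nodes. This is a rough sketch, not a dbt command: the function names are hypothetical, and the default path `target/manifest.json` is an assumption about where the manifest was written.

```python
import json
from collections import Counter


def find_duplicate_dependencies(manifest):
    """Map each node's unique_id to the depends_on.nodes entries listed more than once."""
    duplicates = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        counts = Counter(node.get("depends_on", {}).get("nodes", []))
        repeated = [dep for dep, n in counts.items() if n > 1]
        if repeated:
            duplicates[unique_id] = repeated
    return duplicates


def load_manifest(path="target/manifest.json"):
    """Load a manifest from disk; the default path is an assumption."""
    with open(path) as f:
        return json.load(f)


# In-memory stand-in for a manifest, using hypothetical node names.
sample = {
    "nodes": {
        "model.testy.model_c": {
            "depends_on": {
                "macros": [],
                "nodes": [
                    "model.testy.model_a",
                    "model.testy.model_b",
                    "model.testy.model_a",
                ],
            }
        },
        "model.testy.model_a": {"depends_on": {"macros": [], "nodes": []}},
    }
}

print(find_duplicate_dependencies(sample))
```

Against a real project, `find_duplicate_dependencies(load_manifest())` would report every node whose dependency list repeats a parent.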
Expected Behavior
This may in fact be expected! This came up in the dbt_project_evaluator project -- multiple sources joined together is a violation of the recommended best practices (stage those sources!) and the same source called twice with the star macro as above caused an unintentional error.
The fix on the package side is really straightforward, but got us wondering what the intended behavior here is!
Steps To Reproduce
manifest.json entries for these nodes
Relevant log output
No response
Environment
Which database adapter are you using with dbt?
postgres, redshift, snowflake, bigquery, spark
Additional Context
No response