Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAP-961] [Bug] Materialized views that depend on other materialized views will fail to be refreshed due to drop not cascading #642

Closed
2 tasks done
jmeiring-sc opened this issue Oct 25, 2023 · 8 comments · Fixed by Shiphero/dbt-redshift#2 or #904
Assignees
Labels

Comments

@jmeiring-sc
Copy link

Is this a new bug in dbt-redshift?

  • I believe this is a new bug in dbt-redshift
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

If a materialized view (parent_mv) is based on another materialized view, a refresh of the the parent view that requires a full rebuild will fail as the drop materialized view if exists parent_mv does not cascade

Expected Behavior

Drop all dependent materialized views and recreate them

Steps To Reproduce

--parent_mv.sql
{{ config(
    materialized='materialized_view',
    sort=['cust_id'],
    on_configuration_change = 'apply',
)
}}

SELECT
    log_date,
    cust_id,
    cust_country,
    count(request_id) as total_requests
FROM api_logs
GROUP BY 1,2,3

--child_mv.sql
{{ config(
    materialized='materialized_view',
    sort=['cust_id'],
    on_configuration_change = 'apply',
)
}}

SELECT
    log_date,
    cust_id,
    count(request_id) as total_requests
FROM {{ ref('parent_mv') }}
GROUP BY 1,2

dbt run --full-refresh

Relevant log output

cannot drop materialized view parent_mv because other objects depend on it

Environment

- OS:
- Python:
- dbt-core: 1.6.2
- dbt-redshift: 1.6.2

Additional Context

No response

@jmeiring-sc jmeiring-sc added type:bug Something isn't working triage:product labels Oct 25, 2023
@github-actions github-actions bot changed the title [Bug] Materialized views that depend on other materialized views will fail to be refreshed due to drop not cascading [ADAP-961] [Bug] Materialized views that depend on other materialized views will fail to be refreshed due to drop not cascading Oct 25, 2023
@dbeatty10
Copy link
Contributor

Thanks for trying out materialized views and sharing your experiences @jmeiring-sc !

I wasn't able to reproduce this upon my initial attempt. Could you try out the files that I have below and help me see what I need to change to get the same error as you?

models/my_model.sql

{{ config(
    materialized='table',
)
}}

select 1 as id

models/parent_mv.sql

{{ config(
    materialized='materialized_view',
    on_configuration_change='apply',
)
}}

select * from {{ ref('my_model') }}

models/child_mv.sql

{{ config(
    materialized='materialized_view',
    on_configuration_change='apply',
)
}}

select * from {{ ref('parent_mv') }}
$ dbt run --full-refresh
01:43:17  Running with dbt=1.6.0
01:43:17  Registered adapter: redshift=1.6.0
01:43:17  Found 3 models, 0 sources, 0 exposures, 0 metrics, 397 macros, 0 groups, 0 semantic models
01:43:17  
01:43:20  Concurrency: 5 threads (target='redshift')
01:43:20  
01:43:20  1 of 3 START sql table model dbt_dbeatty.my_model .............................. [RUN]
01:43:22  1 of 3 OK created sql table model dbt_dbeatty.my_model ......................... [SUCCESS in 1.89s]
01:43:22  2 of 3 START sql materialized_view model dbt_dbeatty.parent_mv ................. [RUN]
01:43:24  2 of 3 OK created sql materialized_view model dbt_dbeatty.parent_mv ............ [SUCCESS in 1.95s]
01:43:24  3 of 3 START sql materialized_view model dbt_dbeatty.child_mv .................. [RUN]
01:43:26  3 of 3 OK created sql materialized_view model dbt_dbeatty.child_mv ............. [SUCCESS in 1.93s]
01:43:26  
01:43:26  Finished running 1 table model, 2 materialized_view models in 0 hours 0 minutes and 9.25 seconds (9.25s).
01:43:27  
01:43:27  Completed successfully

@jmeiring-sc
Copy link
Author

I believe that the configuration you have above will work fine as the base table my_model will get dropped which will cascade to the materialized view. Here is a setup that will fail however:

CREATE TABLE test.test_table (
    id int
);
INSERT INTO test.test_table VALUES (1);
INSERT INTO test.test_table VALUES (2);
{{ config(
    materialized='materialized_view',
    on_configuration_change='apply',
)
}}

select * from test.test_table
{{ config(
    materialized='materialized_view',
    on_configuration_change='apply',
)
}}

select * from {{ ref('parent') }}

The first run will succeed, as there are no drops that need to happen. Subsequent runs will fail however:

�[0m15:29:45 Database Error in model parent (models/test/parent.sql)
cannot drop materialized view test.parent because other objects depend on it

@dbeatty10
Copy link
Contributor

Thanks @jmeiring-sc, that was exactly the insight we needed!

Reprex

seeds/my_seed.sql

id
1

models/parent_mv.sql

{{ config(
    materialized='materialized_view',
    on_configuration_change='apply',
)
}}

select * from {{ ref('my_seed') }}

models/child_mv.sql

{{ config(
    materialized='materialized_view',
    on_configuration_change='apply',
)
}}

select * from {{ ref('parent_mv') }}

This will work:

dbt seed
dbt run

But then run this to get an error:

dbt run --full-refresh

or this:

dbt run -s parent_mv+ --full-refresh

Error output:

15:55:45  1 of 2 START sql materialized_view model dbt_dbeatty.parent_mv ................. [RUN]
15:55:46  1 of 2 ERROR creating sql materialized_view model dbt_dbeatty.parent_mv ........ [ERROR in 0.89s]
15:55:46  2 of 2 SKIP relation dbt_dbeatty.child_mv ...................................... [SKIP]
15:55:46  
15:55:46  Finished running 2 materialized_view models in 0 hours 0 minutes and 5.27 seconds (5.27s).
15:55:46  
15:55:46  Completed with 1 error and 0 warnings:
15:55:46  
15:55:46  Database Error in model parent_mv (models/parent_mv.sql)
15:55:46    cannot drop materialized view dbt_dbeatty.parent_mv because other objects depend on it
15:55:46    compiled Code at target/run/foo/models/parent_mv.sql
15:55:46  
15:55:46  Done. PASS=0 WARN=0 ERROR=1 SKIP=1 TOTAL=2

@davidfromaplazo
Copy link

davidfromaplazo commented Dec 13, 2023

Also facing this issue. Attempted to solve with an on-run-start hook that looks for the --full-refresh flag, but I'm not super familiar with using these and it only seems to kinda work.

  project:
    model:
      +on-run-start: |
        {% if flags.FULL_REFRESH and this is not none %} DROP MATERIALIZED VIEW {{ this }} CASCADE
        {% endif %}

@nathaniel-may
Copy link
Contributor

This is noted as a limitation in our documentation. This is because Redshift does not support drop cascade on materialized views which is a prerequisite for lifting this limitation.

@colin-rogers-dbt
Copy link
Contributor

Re-opening as offline conversation with the Redshift team indicates we may be able to support this use case

@urkle
Copy link

urkle commented Feb 12, 2024

@colin-rogers-dbt any ETA on being able to get this supported?

@amychen1776
Copy link

amychen1776 commented May 15, 2024

In conversations with the Redshift team - it looks like the pathway is to update the materialization to use CASCADE which is now supported. We will take a look and update this thread!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
8 participants