Compiled sql not being written in target path of dbt project directory #851

Closed
EugenioG2021 opened this issue Feb 19, 2024 · 3 comments · Fixed by #1389


EugenioG2021 commented Feb 19, 2024

I have run a model in Airflow, and it says the compiled SQL can be found at some target/{some_subdirectory} path. However, I am not seeing any "target" directory created in any of these places:

1. In the project directory (the dbt_project_path argument of ProjectConfig in my DAG Python file)
2. In Airflow's home directory, or inside the dags directory
3. In the directory I specified in the ExecutionConfig via dbt_executable_path

This is the DbtTaskGroup I use in the Airflow DAG:

from cosmos import (
    DbtTaskGroup,
    ExecutionConfig,
    ProfileConfig,
    ProjectConfig,
    RenderConfig,
)
from cosmos.constants import LoadMode, TestBehavior
from cosmos.profiles import SnowflakeUserPasswordProfileMapping

SNOWFLAKE_CONN_ID = 'some_connection_to_snowflake'
DBT_SNOWFLAKE_SCHEMA = 'some_schema'

exclude_list = []  # defined elsewhere in my DAG file

profile = ProfileConfig(
    profile_name="default",
    target_name="dev",
    profile_mapping=SnowflakeUserPasswordProfileMapping(
        conn_id=SNOWFLAKE_CONN_ID, profile_args={"schema": DBT_SNOWFLAKE_SCHEMA}
    ),
)

dbt_tg = DbtTaskGroup(
    group_id='whatever',
    project_config=ProjectConfig(
        dbt_project_path="/usr/local/airflow/dags/dbt/data_eng_dbt",
        seeds_relative_path="seeds/",
    ),
    execution_config=ExecutionConfig(
        dbt_executable_path="/usr/local/airflow/dbt_venv/bin/dbt",
    ),
    render_config=RenderConfig(
        load_method=LoadMode.DBT_LS,
        select=[],
        exclude=exclude_list,
        test_behavior=TestBehavior.NONE,
        emit_datasets=False,
        dbt_deps=False,
    ),
    profile_config=profile,
)

On the other hand, my dbt_project.yml is as follows:

# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'data_eng_dbt'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'default'

# These configurations specify where dbt should look for different types of files.
# The `source-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

target-path: "target" 

My project scaffolding, starting at Airflow's home directory, is:

dags/
└── dbt/
    └── data_eng_dbt/
        ├── dbt_project.yml
        ├── macros/
        ├── seeds/
        ├── dbt_packages/
        ├── profiles.yml
        ├── packages.yml
        ├── tests/
        └── models/
            └── intermediate/
                └── int_unite_factevents_pstage.sql

When the Airflow DAG ran and a model called intermediate.int_unite_factevents_pstage failed with a SQL error, the log said my compiled code is at target/run/data_eng_dbt/models/intermediate/placement_id_match/int_unite_factevents_pstage.sql

However, I cannot find that target/run directory anywhere. Where should it be? And does dbt_project.yml come into play here? I did specify the target-path there.

I also have no system environment variables set for dbt_target, as mentioned here, and I run that DbtTaskGroup instance by simply placing it inside a standard Airflow DAG:

from datetime import datetime

from airflow.decorators import dag

@dag(
    start_date=datetime(2022, 2, 10),
    schedule_interval=None,
    catchup=False,
    default_view="graph",
    doc_md=__doc__,
    dag_id='dag_test',
    default_args=default_args,  # defined elsewhere in the DAG file
    render_template_as_native_obj=True,
)
def dag_test():
    # ...code shown above that generates the dbt_tg instance of DbtTaskGroup...
    dbt_tg

dag = dag_test()

dosubot bot commented Feb 19, 2024


Based on the information you've provided and the context from the Cosmos repository, the target/run directory, which contains the compiled SQL, should typically be located within your dbt project directory. In your case, it should be within the data_eng_dbt directory, as specified in your ProjectConfig.

However, if the target/run directory is not found, it could be due to a few reasons:

  1. The dbt project has not been run yet. The target/run directory is created when a dbt project is run. If the project has not been run, the directory will not exist.

  2. The dbt project directory is not correctly specified. In your case, the dbt project directory is specified as /usr/local/airflow/dags/dbt/data_eng_dbt. Make sure that this path correctly points to your dbt project directory.

  3. The dbt project directory does not have the correct permissions. Make sure that the user running the Airflow DAG has read and write permissions to the dbt project directory.

If none of these solutions work, you may need to manually create the target/run directory within your dbt project directory.
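
To make checks 2 and 3 concrete, here is a minimal, hypothetical sanity check (using the paths from this issue) that could be run inside the Airflow container or a debug task:

import os

project_dir = "/usr/local/airflow/dags/dbt/data_eng_dbt"

print(os.path.isdir(project_dir))            # does the project path exist?
print(os.access(project_dir, os.W_OK))       # can the Airflow user write to it?
print(os.path.isdir(os.path.join(project_dir, "target")))  # was a target/ ever created here?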

Regarding your question about how Cosmos handles the target-path configuration from the dbt_project.yml file, I wasn't able to find a specific answer within the repository. It's possible that this information is available elsewhere or I may have missed it.

If you have any additional information or context that could help me better understand your issue, please provide it. I'm here to help!


@dosubot dosubot bot added area:config Related to configuration, like YAML files, environment variables, or executer configuration dbt:compile Primarily related to dbt compile command or functionality execution:local Related to Local execution environment profile:snowflake Related to Snowflake ProfileConfig labels Feb 19, 2024
tatiana (Collaborator) commented May 17, 2024

@EugenioG2021 Cosmos currently creates temporary directories and exports the target directory there. This was done intentionally, since other users faced issues with read-only file systems. That said, I agree that we should allow users to specify the target directory if they want to control this.
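
In other words, a conceptual sketch of what happens per task, assuming nothing about Cosmos internals beyond the paragraph above (the paths and dbt invocation are illustrative, not Cosmos's actual code):

import shutil
import subprocess
import tempfile

# The project is copied into a throwaway directory, dbt writes target/ there,
# and everything is removed when the context manager exits -- which is why no
# target/ directory ever shows up under the source project.
with tempfile.TemporaryDirectory() as tmp_project_dir:
    shutil.copytree(
        "/usr/local/airflow/dags/dbt/data_eng_dbt", tmp_project_dir, dirs_exist_ok=True
    )
    subprocess.run(["dbt", "run", "--project-dir", tmp_project_dir], check=True)
    # compiled SQL now exists under <tmp_project_dir>/target ...
# ... and is gone here, deleted together with the temporary directory.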

Would you be interested in contributing this feature?

@tatiana tatiana added this to the 1.5.0 milestone May 17, 2024
@tatiana tatiana added triage-needed Items need to be reviewed / assigned to milestone and removed triage-needed Items need to be reviewed / assigned to milestone labels May 17, 2024
@tatiana tatiana self-assigned this May 17, 2024
@tatiana tatiana mentioned this issue May 17, 2024
@tatiana tatiana modified the milestones: Cosmos 1.5.0, Cosmos 1.6.0 Jun 6, 2024
@tatiana tatiana modified the milestones: Cosmos 1.6.0, Cosmos 1.7.0 Jul 5, 2024
@tatiana tatiana modified the milestones: Cosmos 1.7.0, Triage Sep 20, 2024
@tatiana tatiana modified the milestones: Triage, Cosmos 1.8.0 Oct 30, 2024
@tatiana tatiana added the execution:callback Tasks related to callback when executing tasks label Nov 29, 2024
@pankajkoti pankajkoti assigned pankajkoti and unassigned tatiana Dec 11, 2024
pankajkoti (Contributor) commented:

Hi @EugenioG2021, we recently merged PR #1389, which introduces minor changes to the existing callback functionality and will be included in the upcoming Cosmos 1.8.0 release.

To allow users to try out these changes ahead of the official release, we have prepared an alpha release. You can install it using the following link: astronomer-cosmos 1.8.0a3. PR #1389 also provides examples showcasing how to use this callback functionality.

For additional guidance, refer to the documentation on leveraging callbacks: Callback Configuration. The helper functions demonstrated in the examples can be found here: cosmos/io.py. However, you are not limited to these; you can create your own custom callback functions using these examples as a reference and pass them via the callback argument.

We would greatly appreciate any feedback you have after testing this alpha release!
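
For illustration, a sketch of a custom callback in the shape the PR #1389 examples use — assuming, as in those examples, that the callback receives the path of the (temporary) dbt project directory as its first argument and is passed via operator_args; the destination directory below is hypothetical:

import shutil

from cosmos import DbtTaskGroup


def copy_dbt_artifacts(project_dir: str, **kwargs) -> None:
    # Copy dbt's target/ out of the ephemeral project dir before it is discarded.
    # /usr/local/airflow/dbt_artifacts is a hypothetical persistent location.
    shutil.copytree(
        f"{project_dir}/target", "/usr/local/airflow/dbt_artifacts", dirs_exist_ok=True
    )


dbt_tg = DbtTaskGroup(
    group_id="whatever",
    project_config=project_config,  # the ProjectConfig shown earlier in this issue
    profile_config=profile,         # the ProfileConfig shown earlier in this issue
    operator_args={"callback": copy_dbt_artifacts},
)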
