
[CT-446] Docs generations doesn't work with multiple schemas #312

Closed
barloc opened this issue Mar 31, 2022 · 3 comments
Labels
Stale, type:bug

Comments


barloc commented Mar 31, 2022

Describe the bug

We have a lot of schemas in our dbt project, and when I try to generate docs I get this error:
Expected only one database in get_catalog, found [<InformationSchema INFORMATION_SCHEMA>, <InformationSchema INFORMATION_SCHEMA>]

Steps To Reproduce

1. Have more than one schema in the project.
2. Run dbt docs generate.
3. It raises the error above and fails.

Expected behavior

Docs are generated successfully for all schemas.

Screenshots and log output

(screenshot of the error output)

System information

The output of dbt --version:

installed version: 1.0.0
   latest version: 1.0.0

Up to date!

Plugins:
  - spark: 1.0.0

The operating system you're using:
Debian 11
The output of python --version:
Python 3.8.13


barloc added the type:bug and triage:product labels on Mar 31, 2022
github-actions bot changed the title from "Docs generations doesn't work with multiple schemas" to "[CT-446] Docs generations doesn't work with multiple schemas" on Mar 31, 2022

jtcohen6 commented Mar 31, 2022

Thanks @barloc! This sounds like the same issue as databricks/dbt-databricks#52, which ended up being an issue with quoting. Do you have any custom quoting rules set, in your project or for specific sources?

There are two proposed resolutions in that issue, specifically databricks/dbt-databricks#52 (comment):

  1. Fix the logic within SchemaSearchMap (called by _get_catalog_schemas), which currently views a quoted "None" database and an unquoted None database as distinct entries in the set of databases. This makes sense in the general case (they have different quote_policy), but it doesn't make sense when the value is None. (A minimal sketch of this option follows at the end of this comment.)
  2. Defining this config doesn't (shouldn't) have any actual effect. Given that we don't allow database to be defined elsewhere in dbt-spark (+ dbt-databricks, at least until How to support SQL Standard Catalogs? #281), it really feels like our answer here should be to raise an explicit error any time the quoting config is defined at the database level. That could look like adding a __post_init__ hook to the SparkQuotePolicy:
    def __post_init__(self):
        # Fail loudly: database-level quoting has no meaning in dbt-spark.
        if self.database:
            raise RuntimeException('Cannot set database-level quoting!')

Leaving this issue open in this repo, since we should aim to pursue one of them.
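For concreteness, here is a minimal, self-contained sketch of what option 1 could look like. InformationSchemaKey and SchemaSearchMapSketch are hypothetical stand-ins rather than the real dbt-core classes; the only point is that a None database should collapse to a single entry in the schema search map regardless of its quote policy:

    from dataclasses import dataclass
    from typing import Dict, Optional, Set, Tuple


    @dataclass(frozen=True)
    class InformationSchemaKey:
        """Hypothetical stand-in for dbt's InformationSchema relation."""
        database: Optional[str]
        quoted: bool

        def dedupe_key(self) -> Tuple[Optional[str], bool]:
            # Quoting cannot matter when there is no database value at all.
            return (self.database, self.quoted if self.database is not None else False)


    class SchemaSearchMapSketch:
        """Hypothetical stand-in for dbt's SchemaSearchMap."""

        def __init__(self) -> None:
            self._map: Dict[Tuple[Optional[str], bool], Set[str]] = {}

        def add(self, key: InformationSchemaKey, schema: str) -> None:
            self._map.setdefault(key.dedupe_key(), set()).add(schema)

        def __len__(self) -> int:
            return len(self._map)


    search_map = SchemaSearchMapSketch()
    search_map.add(InformationSchemaKey(database=None, quoted=True), "analytics")
    search_map.add(InformationSchemaKey(database=None, quoted=False), "staging")
    assert len(search_map) == 1  # one (null) database, two schemas: no error raised

With that collapsing in place, dbt-spark would only ever see a single (null) database, and the single-database check in get_catalog would no longer fire.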


barloc commented Mar 31, 2022

Hi @jtcohen6,
I checked the code and found the check in the get_catalog function that generates this error:

        if len(schema_map) > 1:
            dbt.exceptions.raise_compiler_error(
                f'Expected only one database in get_catalog, found '
                f'{list(schema_map)}'
            )

I don't understand why we check the length of schema_map when the dict with all of its keys is handled later anyway:

        with executor(self.config) as tpe:
            futures: List[Future[agate.Table]] = []
            for info, schemas in schema_map.items():
                for schema in schemas:
                    futures.append(tpe.submit_connected(
                        self, schema,
                        self._get_one_catalog, info, [schema], manifest
                    ))
            catalogs, exceptions = catch_as_completed(futures)

So after deleting this check, everything works fine.
(screenshot of the output after the change)
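For illustration only (this is not the dbt-spark source, and check_single_database is a hypothetical helper): rather than deleting the guard outright, it could be relaxed so that it only raises when the schema map spans genuinely distinct database names, while duplicate entries for the same (or null) database fall through to the per-schema loop above:

    from typing import Any, Dict, Optional, Set


    def check_single_database(schema_map: Dict[Any, Set[str]]) -> None:
        # Collect the distinct database names across the InformationSchema-like keys.
        distinct: Set[Optional[str]] = {getattr(info, "database", None) for info in schema_map}
        if len(distinct) > 1:
            raise RuntimeError(
                f"Expected only one database in get_catalog, found {list(schema_map)}"
            )

Either way, the per-schema loop already handles a map with several entries, so the strict length check looks stricter than it needs to be.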

barloc mentioned this issue Mar 31, 2022
cadl added a commit to xiachufang/dbt-spark that referenced this issue Aug 11, 2022
github-actions bot commented

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.
