
DO NOT MERGE PoC(tableau): codegen PoC #12491

Draft
sgomezvillamor wants to merge 2 commits into master

Conversation

sgomezvillamor
Contributor

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added, a Usage Guide has been added for it.
  • For any breaking change, potential downtime, deprecation, or other big change, an entry has been made in Updating DataHub

github-actions bot added the ingestion label (PR or Issue related to the ingestion of metadata) on Jan 29, 2025

codecov bot commented on Jan 29, 2025

❌ 21 Tests Failed:

Tests completed: 440 | Failed: 21 | Passed: 419 | Skipped: 33
View the top 3 failed tests by shortest run time
::tests.integration.tableau.test_tableau_ingest
Stack Traces | 0s run time
ImportError while importing test module '.../integration/tableau/test_tableau_ingest.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
venv/lib/python3.8.../site-packages/_pytest/python.py:493: in importtestmodule
    mod = import_path(
venv/lib/python3.8.../site-packages/_pytest/pathlib.py:587: in import_path
    importlib.import_module(module_name)
.../hostedtoolcache/Python/3.8.18....../x64/lib/python3.8/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1014: in _gcd_import
    ???
<frozen importlib._bootstrap>:991: in _find_and_load
    ???
<frozen importlib._bootstrap>:975: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:671: in _load_unlocked
    ???
venv/lib/python3.8.../_pytest/assertion/rewrite.py:184: in exec_module
    exec(co, module.__dict__)
.../integration/tableau/test_tableau_ingest.py:31: in <module>
    from datahub.ingestion.source.tableau.tableau import (
.../source/tableau/tableau.py:93: in <module>
    from datahub.ingestion.source.tableau.codegen_turms.schema import (
.../tableau/codegen_turms/schema.py:3: in <module>
    from typing import Annotated, List, Literal, Optional, Union
E   ImportError: cannot import name 'Annotated' from 'typing' (.../hostedtoolcache/Python/3.8.18....../x64/lib/python3.8/typing.py)
::tests.unit.test_tableau_source
Stack Traces | 0s run time
ImportError while importing test module '.../tests/unit/test_tableau_source.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
venv/lib/python3.8.../site-packages/_pytest/python.py:493: in importtestmodule
    mod = import_path(
venv/lib/python3.8.../site-packages/_pytest/pathlib.py:587: in import_path
    importlib.import_module(module_name)
.../hostedtoolcache/Python/3.8.18....../x64/lib/python3.8/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1014: in _gcd_import
    ???
<frozen importlib._bootstrap>:991: in _find_and_load
    ???
<frozen importlib._bootstrap>:975: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:671: in _load_unlocked
    ???
venv/lib/python3.8.../_pytest/assertion/rewrite.py:184: in exec_module
    exec(co, module.__dict__)
tests/unit/test_tableau_source.py:6: in <module>
    from datahub.ingestion.source.tableau.tableau import (
.../source/tableau/tableau.py:93: in <module>
    from datahub.ingestion.source.tableau.codegen_turms.schema import (
.../tableau/codegen_turms/schema.py:3: in <module>
    from typing import Annotated, List, Literal, Optional, Union
E   ImportError: cannot import name 'Annotated' from 'typing' (.../hostedtoolcache/Python/3.8.18....../x64/lib/python3.8/typing.py)
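
Both collection failures above trace back to the same line of generated code: typing.Annotated only exists in the standard library from Python 3.9 onward, and this CI job runs Python 3.8, so codegen_turms/schema.py fails at import time before any test can run. Below is a minimal sketch of a version-guarded import, assuming typing_extensions is available in the environment (it usually is, as a dependency of pydantic); the module and symbol names mirror the traceback, but this is only an illustration, not necessarily the fix the PR will adopt:

    # Hypothetical compatibility shim for codegen_turms/schema.py.
    # typing.Annotated was added in Python 3.9; on 3.8 it has to come
    # from typing_extensions instead.
    import sys
    from typing import List, Literal, Optional, Union

    if sys.version_info >= (3, 9):
        from typing import Annotated
    else:
        from typing_extensions import Annotated  # Python 3.8 fallback
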
tests.integration.tableau.test_tableau_ingest::test_project_path_pattern_allow
Stack Traces | 0.118s run time
pytestconfig = <_pytest.config.Config object at 0x7f8b09625410>
tmp_path = PosixPath('.../pytest-of-runner/pytest-0/test_project_path_pattern_allo0')
mock_datahub_graph = <MagicMock spec='DataHubGraph' id='140232476200560'>

    def test_project_path_pattern_allow(pytestconfig, tmp_path, mock_datahub_graph):
        output_file_name: str = "tableau_project_path_pattern_allow_mces.json"
        golden_file_name: str = "tableau_project_path_pattern_allow_mces_golden.json"
    
        new_config = config_source_default.copy()
        del new_config["projects"]
        new_config["project_path_pattern"] = {"allow": ["default/DenyProject"]}
    
>       tableau_ingest_common(
            pytestconfig,
            tmp_path,
            mock_data(),
            golden_file_name,
            output_file_name,
            mock_datahub_graph,
            pipeline_config=new_config,
        )

.../integration/tableau/test_tableau_ingest.py:602: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.../integration/tableau/test_tableau_ingest.py:361: in tableau_ingest_common
    pipeline.raise_from_status()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <datahub.ingestion.run.pipeline.Pipeline object at 0x7f8a90313e90>
raise_warnings = False

    def raise_from_status(self, raise_warnings: bool = False) -> None:
        if self.source.get_report().failures:
>           raise PipelineExecutionError(
                "Source reported errors", self.source.get_report()
            )
E           datahub.configuration.common.PipelineExecutionError: ('Source reported errors', TableauSourceReport(ingestion_stage_durations={}, event_not_produced_warn=True, events_produced=0, events_produced_per_sec=0, _urns_seen=set(), entities=defaultdict(<class 'datahub.utilities.lossy_collections.LossyList'>, {}), aspects=defaultdict(<function SourceReport.<lambda>.<locals>.<lambda> at 0x7f8a6a002780>, {}), aspect_urn_samples=defaultdict(<function SourceReport.<lambda>.<locals>.<lambda> at 0x7f8a6a002830>, {}), _structured_logs=StructuredLogs(_entries={<StructuredLogLevel.ERROR: 40>: {'Pipeline Error-Ingestion pipeline raised an unexpected exception!': StructuredLogEntry(title='Pipeline Error', message='Ingestion pipeline raised an unexpected exception!', context=["<class 'pydantic.error_wrappers.ValidationError'>: 4 validation errors for GetItems_workbooksConnection\nworkbooksConnection -> nodes -> 0 -> tags\n  field required (type=value_error.missing)\nworkbooksConnection -> nodes -> 2 -> tags\n  field required (type=value_error.missing)\nworkbooksConnection -> nodes -> 3 -> tags\n  field required (type=value_error.missing)\nworkbooksConnection -> nodes -> 4 -> tags\n  field required (type=value_error.missing)"])}, <StructuredLogLevel.WARN: 30>: {'Insufficient Permissions-The user must have the `Site Administrator Explorer` role to perform metadata ingestion.': StructuredLogEntry(title='Insufficient Permissions', message='The user must have the `Site Administrator Explorer` role to perform metadata ingestion.', context=["user-name=<Mock name='Server().users.get_by_id().name' id='140231680631008'>, role=<Mock name='Server().users.get_by_id().site_role' id='140231680632368'>, site_id=190a6a5c-63ed-4de1-8045-site1"])}, <StructuredLogLevel.INFO: 20>: {}}), soft_deleted_stale_entities=[], last_state_non_deletable_entities=[], get_all_datasources_query_failed=False, num_get_datasource_query_failures=0, num_datasource_field_skipped_no_name=0, num_csql_field_skipped_no_name=0, num_table_field_skipped_no_name=0, extract_usage_stats_timer={}, fetch_groups_timer={}, populate_database_server_hostname_map_timer={}, populate_projects_registry_timer={}, emit_workbooks_timer={}, emit_sheets_timer={}, emit_dashboards_timer={}, emit_embedded_datasources_timer={}, emit_published_datasources_timer={}, emit_custom_sql_datasources_timer={}, emit_upstream_tables_timer={}, num_tables_with_upstream_lineage=0, num_upstream_table_lineage=0, num_upstream_fine_grained_lineage=0, num_upstream_table_skipped_no_name=0, num_upstream_table_skipped_no_columns=0, num_upstream_table_failed_generate_reference=0, num_upstream_table_lineage_failed_parse_sql=0, num_upstream_fine_grained_lineage_failed_parse_sql=0, num_hidden_assets_skipped=0, logged_in_user=[UserInfo(user_name=<Mock name='Server().users.get_by_id().name' id='140231680631008'>, site_role=<Mock name='Server().users.get_by_id().site_role' id='140231680632368'>, site_id='190a6a5c-63ed-4de1-8045-site1')], last_authenticated_at=datetime.datetime(2025, 1, 30, 13, 11, 5, 450273, tzinfo=datetime.timezone.utc), num_expected_tableau_metadata_queries=0, num_actual_tableau_metadata_queries=0, tableau_server_error_stats=defaultdict(<class 'int'>, {}), num_queries_by_connection_type=defaultdict(<class 'int'>, {}), num_filter_queries_by_connection_type=defaultdict(<class 'int'>, {}), num_paginated_queries_by_connection_type=defaultdict(<class 'int'>, {})))

.../ingestion/run/pipeline.py:599: PipelineExecutionError
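
The third failure is different in kind: the import succeeds, but the generated pydantic model behind GetItems_workbooksConnection declares tags as a required field, while four of the mocked workbook nodes in the test fixtures omit that key, so validation fails before any events are produced. Below is a minimal sketch of a model shape that would tolerate the missing key, assuming pydantic v1 as shown in the traceback; apart from the names taken from the error output (GetItems_workbooksConnection, workbooksConnection, nodes, tags), the classes and fields are hypothetical simplifications:

    # Hypothetical, simplified version of the generated model. Declaring
    # tags as Optional with a default means nodes that omit the key still
    # validate instead of raising "field required".
    from typing import List, Optional

    from pydantic import BaseModel


    class Workbook(BaseModel):
        id: Optional[str] = None           # hypothetical field
        tags: Optional[List[dict]] = None  # optional, so a missing "tags" key is accepted


    class WorkbooksConnection(BaseModel):
        nodes: List[Workbook]


    class GetItems_workbooksConnection(BaseModel):
        workbooksConnection: WorkbooksConnection

Whether the right fix is to relax the generated model, adjust the GraphQL query, or extend the mocked fixtures so they include tags is a design decision for the PR; the sketch only illustrates why validation currently fails.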
