Auto-generate Views on Prod Tables #291

Closed
fbertsch opened this issue Aug 8, 2019 · 7 comments

@fbertsch
Contributor

fbertsch commented Aug 8, 2019

Currently we have a view defined for every table in prod. We're proposing auto-generating these views instead. The code to auto-generate the views should live alongside the table deploys, which happen daily (once the generated-schemas branch is pushed to MPS).

If a view is defined here instead, that view will override the default one. This will allow us, for example, to selectively update these views to handle new columns, data changes, or unions of versions.

The default view will automatically point to the newest version of its table. If a union with a previous version is needed, that will have to be done manually.
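
For illustration, here is a minimal sketch of how a default view could be pointed at the newest table version. It assumes stable tables are named `<name>_v<N>` and live in a `<namespace>_stable` dataset; the function names and the exact SQL layout are hypothetical and are not taken from the actual script.

```python
import re

# Assumption: versioned stable tables are named <name>_v<N>, e.g. baseline_v1.
VERSIONED_TABLE = re.compile(r"^(?P<name>.+)_v(?P<version>\d+)$")


def latest_versions(table_ids):
    """Map each unversioned table name to its highest version number."""
    latest = {}
    for table_id in table_ids:
        match = VERSIONED_TABLE.match(table_id)
        if match is None:
            continue  # not a versioned table, so no default view is derived
        name, version = match["name"], int(match["version"])
        # Compare numerically so that v10 sorts above v9.
        latest[name] = max(version, latest.get(name, 0))
    return latest


def default_view_sql(project, namespace, name, version):
    """Render a pass-through view over one version of a stable table."""
    return (
        "CREATE OR REPLACE VIEW\n"
        f"  `{project}.{namespace}.{name}`\n"
        "AS SELECT\n"
        "  *\n"
        "FROM\n"
        f"  `{project}.{namespace}_stable.{name}_v{version}`\n"
    )
```

A union across versions cannot be expressed by this pass-through form, which is why that case would stay manual.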

cc @jklukas

@fbertsch
Contributor Author

cc @whd for a possible step after table creation

@jklukas
Contributor

jklukas commented Aug 22, 2019

We have a script for this now via #301.

It would be ideal for us to add this to the schema deploy pipeline such that view schemas are updated and new views are created immediately after we update the underlying table schemas in prod. Is running the bigquery-etl docker image something we can do fairly easily on Jenkins? cc @whd @fbertsch @acmiyaguchi.

If that's non-trivial, we can schedule this in Airflow to run once per night instead, at least as a shorter-term solution.

@whd
Member

whd commented Aug 22, 2019

> It would be ideal for us to add this to the schema deploy pipeline such that view schemas are updated and new views are created immediately after we update the underlying table schemas in prod. Is running the bigquery-etl docker image something we can do fairly easily on Jenkins? cc @whd @fbertsch @acmiyaguchi.

It is. I can look at setting this up imminently. I might re-implement the publish_views portion at some point to keep consistency with the ops tooling, but not for the first pass.

@whd
Member

whd commented Aug 24, 2019

I see the following on the latest BQ table deploy, which has the logic to implement view publishing.

+ cd /app
+ cp -R sql /tmp/sql
+ ./script/generate_views moz-fx-data-shared-prod:*_stable.* --sql-dir /tmp/sql

Creating /tmp/sql/org_mozilla_fenix_nightly/activation.sql
Creating /tmp/sql/org_mozilla_fenix_nightly/baseline.sql
Creating /tmp/sql/org_mozilla_fenix_nightly/bookmarks_sync.sql
Creating /tmp/sql/org_mozilla_fenix_nightly/events.sql
Creating /tmp/sql/org_mozilla_fenix_nightly/history_sync.sql
Creating /tmp/sql/org_mozilla_fenix_nightly/metrics.sql
+ ./script/publish_views --target-project moz-fx-data-shared-prod /tmp/sql

Published view `moz-fx-data-shared-prod.edge_validator.error_report`
Published view `moz-fx-data-shared-prod.mozdata.event`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix.baseline`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix.metrics`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix.activation`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix.bookmarks_sync`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix.history_sync`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix.events`
Published view `moz-fx-data-shared-prod.org_mozilla_tv_firefox.baseline`
Published view `moz-fx-data-shared-prod.org_mozilla_tv_firefox.metrics`
Published view `moz-fx-data-shared-prod.org_mozilla_tv_firefox.events`
Published view `moz-fx-data-shared-prod.firefox_launcher_process.launcher_process_failure`
Published view `moz-fx-data-shared-prod.coverage.coverage`
Published view `moz-fx-data-shared-prod.org_mozilla_reference_browser.baseline`
Published view `moz-fx-data-shared-prod.org_mozilla_reference_browser.metrics`
Published view `moz-fx-data-shared-prod.org_mozilla_reference_browser.events`
Published view `moz-fx-data-shared-prod.firefox_installer.install`
Published view `moz-fx-data-shared-prod.mozza.event`
Published view `moz-fx-data-shared-prod.eng_workflow.build`
Published view `moz-fx-data-shared-prod.eng_workflow.hgpush`
Published view `moz-fx-data-shared-prod.eng_workflow.bmobugs`
Published view `moz-fx-data-shared-prod.pocket.fire_tv_events`
Published view `moz-fx-data-shared-prod.mobile.activation`
Published view `moz-fx-data-shared-prod.webpagetest.webpagetest_run`
Published view `moz-fx-data-shared-prod.telemetry.lockwise_mobile_events_v1`
Published view `moz-fx-data-shared-prod.telemetry.active_profiles`
Published view `moz-fx-data-shared-prod.telemetry.testpilot`
Published view `moz-fx-data-shared-prod.telemetry.addons_aggregates_v2`
Published view `moz-fx-data-shared-prod.telemetry.crash_summary`
Published view `moz-fx-data-shared-prod.telemetry.tls_13_study_v3`
Published view `moz-fx-data-shared-prod.telemetry.tls13_middlebox_alt_server_hello_1`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_anonymous_parquet`
Published view `moz-fx-data-shared-prod.telemetry.testpilottest`
Published view `moz-fx-data-shared-prod.telemetry.system_addon_deployment_diagnostics`
Published view `moz-fx-data-shared-prod.telemetry.addons_v2`
Published view `moz-fx-data-shared-prod.telemetry.firefox_desktop_exact_mau28_v1`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_shield_study_parquet`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_heartbeat_parquet`
Published view `moz-fx-data-shared-prod.telemetry.pioneer_study`
Published view `moz-fx-data-shared-prod.telemetry.anonymous`
Published view `moz-fx-data-shared-prod.telemetry.eng_workflow_hgpush_parquet_v1`
Published view `moz-fx-data-shared-prod.telemetry.event`
Published view `moz-fx-data-shared-prod.telemetry.main`
Published view `moz-fx-data-shared-prod.telemetry.focus_event`
Published view `moz-fx-data-shared-prod.telemetry.malware_addon_states`
Published view `moz-fx-data-shared-prod.telemetry.sync`
Published view `moz-fx-data-shared-prod.telemetry.clients_daily`
Published view `moz-fx-data-shared-prod.telemetry.firefox_nondesktop_exact_mau28_v1`
Published view `moz-fx-data-shared-prod.telemetry.android_anr_report`
Published view `moz-fx-data-shared-prod.telemetry.certificate_checker`
Published view `moz-fx-data-shared-prod.telemetry.update`
Published view `moz-fx-data-shared-prod.telemetry.health`
Published view `moz-fx-data-shared-prod.telemetry.x_contextual_feature_recommendation`
Published view `moz-fx-data-shared-prod.telemetry.experiments_aggregates`
Published view `moz-fx-data-shared-prod.telemetry.ftu`
Published view `moz-fx-data-shared-prod.telemetry.experiment_error_aggregates_v1`
Published view `moz-fx-data-shared-prod.telemetry.crash_aggregates`
Published view `moz-fx-data-shared-prod.telemetry.tls_13_study_v4`
Published view `moz-fx-data-shared-prod.telemetry.mobile_metrics`
Published view `moz-fx-data-shared-prod.telemetry.ssl_ratios_v1`
Published view `moz-fx-data-shared-prod.telemetry.shield_study_error`
Published view `moz-fx-data-shared-prod.telemetry.smoot_usage_fxa_v1`
Published view `moz-fx-data-shared-prod.telemetry.sync_events`
Published view `moz-fx-data-shared-prod.telemetry.deployment_checker`
Published view `moz-fx-data-shared-prod.telemetry.block_autoplay`
Published view `moz-fx-data-shared-prod.telemetry.clients_last_seen_v1`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_downgrade_parquet`
Published view `moz-fx-data-shared-prod.telemetry.mobile_event`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_anonymous_parquet_v1`
Published view `moz-fx-data-shared-prod.telemetry.tls_13_study`
Published view `moz-fx-data-shared-prod.telemetry.firefox_nondesktop_exact_mau28_by_product_v1`
Published view `moz-fx-data-shared-prod.telemetry.tls13_middlebox_ghack`
Published view `moz-fx-data-shared-prod.telemetry.ssl_ratios`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_new_profile_parquet`
Published view `moz-fx-data-shared-prod.telemetry.modules`
Published view `moz-fx-data-shared-prod.telemetry.tls13_middlebox_repetition`
Published view `moz-fx-data-shared-prod.telemetry.optout`
Published view `moz-fx-data-shared-prod.telemetry.experiments`
Published view `moz-fx-data-shared-prod.telemetry.fenix_events_v1`
Published view `moz-fx-data-shared-prod.telemetry.shield_study`
Published view `moz-fx-data-shared-prod.telemetry.core_clients_last_seen_v1`
Published view `moz-fx-data-shared-prod.telemetry.addon_install_blocked`
Published view `moz-fx-data-shared-prod.telemetry.fxa_users_last_seen_v1`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_downgrade_parquet_v1`
Published view `moz-fx-data-shared-prod.telemetry.downgrade`
Published view `moz-fx-data-shared-prod.telemetry.tls13_middlebox_beta`
Published view `moz-fx-data-shared-prod.telemetry.fxa_all_events_v1`
Published view `moz-fx-data-shared-prod.telemetry.nondesktop_clients_last_seen_v1`
Published view `moz-fx-data-shared-prod.telemetry.tls13_middlebox_testing`
Published view `moz-fx-data-shared-prod.telemetry.clients_last_seen`
Published view `moz-fx-data-shared-prod.telemetry.tls_13_study_v2`
Published view `moz-fx-data-shared-prod.telemetry.first_shutdown`
Published view `moz-fx-data-shared-prod.telemetry.disable_sha1rollout`
Published view `moz-fx-data-shared-prod.telemetry.addons_aggregates`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_ip_privacy_parquet_v1`
Published view `moz-fx-data-shared-prod.telemetry.pre_account`
Published view `moz-fx-data-shared-prod.telemetry.untrusted_modules`
Published view `moz-fx-data-shared-prod.telemetry.fxa_content_auth_events_v1`
Published view `moz-fx-data-shared-prod.telemetry.tls13_middlebox_draft22`
Published view `moz-fx-data-shared-prod.telemetry.crash`
Published view `moz-fx-data-shared-prod.telemetry.experiment_error_aggregates`
Published view `moz-fx-data-shared-prod.telemetry.sync_summary`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_mobile_event_parquet`
Published view `moz-fx-data-shared-prod.telemetry.smoot_usage_desktop_v1`
Published view `moz-fx-data-shared-prod.telemetry.smoot_usage_nondesktop_v1`
Published view `moz-fx-data-shared-prod.telemetry.prio`
Published view `moz-fx-data-shared-prod.telemetry.addons`
Published view `moz-fx-data-shared-prod.telemetry.firefox_accounts_exact_mau28_by_dimensions_v1`
Published view `moz-fx-data-shared-prod.telemetry.advancedtelemetry`
Published view `moz-fx-data-shared-prod.telemetry.core`
Published view `moz-fx-data-shared-prod.telemetry.new_profile`
Published view `moz-fx-data-shared-prod.telemetry.eng_workflow_build_parquet`
Published view `moz-fx-data-shared-prod.telemetry.main_summary`
Published view `moz-fx-data-shared-prod.telemetry.bhr`
Published view `moz-fx-data-shared-prod.telemetry.searchvolextra`
Published view `moz-fx-data-shared-prod.telemetry.saved_session`
Published view `moz-fx-data-shared-prod.telemetry.outofdate_notifications_system_addon`
Published view `moz-fx-data-shared-prod.telemetry.firefox_accounts_exact_mau28_v1`
Published view `moz-fx-data-shared-prod.telemetry.tls_13_study_v1`
Published view `moz-fx-data-shared-prod.telemetry.heartbeat`
Published view `moz-fx-data-shared-prod.telemetry.shield_icq_v1`
Published view `moz-fx-data-shared-prod.telemetry.events`
Published view `moz-fx-data-shared-prod.telemetry.shield_study_addon`
Published view `moz-fx-data-shared-prod.telemetry.flash_shield_study`
Published view `moz-fx-data-shared-prod.telemetry.searchvol`
Published view `moz-fx-data-shared-prod.telemetry.sync_flat_summary`
Published view `moz-fx-data-shared-prod.telemetry.deletion`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_focus_event_parquet`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_core_parquet`
Published view `moz-fx-data-shared-prod.telemetry.firefox_nondesktop_exact_mau28_by_dimensions_v1`
Published view `moz-fx-data-shared-prod.telemetry.first_shutdown_summary`
Published view `moz-fx-data-shared-prod.telemetry.glean_clients_last_seen_v1`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_heartbeat_parquet_v1`
Published view `moz-fx-data-shared-prod.telemetry.frecency_update`
Published view `moz-fx-data-shared-prod.telemetry.uitour_tag`
Published view `moz-fx-data-shared-prod.telemetry.eng_workflow_hgpush_parquet`
Published view `moz-fx-data-shared-prod.telemetry.socorro_crash`
Published view `moz-fx-data-shared-prod.telemetry.telemetry_ip_privacy_parquet`
Published view `moz-fx-data-shared-prod.telemetry.smoot_usage_all_v1`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix_nightly.baseline`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix_nightly.metrics`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix_nightly.activation`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix_nightly.bookmarks_sync`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix_nightly.history_sync`
Published view `moz-fx-data-shared-prod.org_mozilla_fenix_nightly.events`
Published view `moz-fx-data-shared-prod.activity_stream.spoc_fills`
Published view `moz-fx-data-shared-prod.activity_stream.impression_stats`
+ ./script/publish_views --target-project moz-fx-data-derived-datasets /tmp/sql

Published view `moz-fx-data-derived-datasets.edge_validator.error_report`
Published view `moz-fx-data-derived-datasets.mozdata.event`
Published view `moz-fx-data-derived-datasets.org_mozilla_fenix.baseline`
Published view `moz-fx-data-derived-datasets.org_mozilla_fenix.metrics`
Published view `moz-fx-data-derived-datasets.org_mozilla_fenix.activation`
Published view `moz-fx-data-derived-datasets.org_mozilla_fenix.bookmarks_sync`
Published view `moz-fx-data-derived-datasets.org_mozilla_fenix.history_sync`
Published view `moz-fx-data-derived-datasets.org_mozilla_fenix.events`
Published view `moz-fx-data-derived-datasets.org_mozilla_tv_firefox.baseline`
Published view `moz-fx-data-derived-datasets.org_mozilla_tv_firefox.metrics`
Published view `moz-fx-data-derived-datasets.org_mozilla_tv_firefox.events`
Published view `moz-fx-data-derived-datasets.firefox_launcher_process.launcher_process_failure`
Published view `moz-fx-data-derived-datasets.coverage.coverage`
Published view `moz-fx-data-derived-datasets.org_mozilla_reference_browser.baseline`
Published view `moz-fx-data-derived-datasets.org_mozilla_reference_browser.metrics`
Published view `moz-fx-data-derived-datasets.org_mozilla_reference_browser.events`
Published view `moz-fx-data-derived-datasets.firefox_installer.install`
Published view `moz-fx-data-derived-datasets.mozza.event`
Published view `moz-fx-data-derived-datasets.eng_workflow.build`
Published view `moz-fx-data-derived-datasets.eng_workflow.hgpush`
Published view `moz-fx-data-derived-datasets.eng_workflow.bmobugs`
Published view `moz-fx-data-derived-datasets.pocket.fire_tv_events`
Published view `moz-fx-data-derived-datasets.mobile.activation`
Published view `moz-fx-data-derived-datasets.webpagetest.webpagetest_run`
Published view `moz-fx-data-derived-datasets.telemetry.lockwise_mobile_events_v1`
Published view `moz-fx-data-derived-datasets.telemetry.active_profiles`
Published view `moz-fx-data-derived-datasets.telemetry.testpilot`
Published view `moz-fx-data-derived-datasets.telemetry.addons_aggregates_v2`
Published view `moz-fx-data-derived-datasets.telemetry.crash_summary`
Published view `moz-fx-data-derived-datasets.telemetry.tls_13_study_v3`
Published view `moz-fx-data-derived-datasets.telemetry.tls13_middlebox_alt_server_hello_1`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_anonymous_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.testpilottest`
Published view `moz-fx-data-derived-datasets.telemetry.system_addon_deployment_diagnostics`
Published view `moz-fx-data-derived-datasets.telemetry.addons_v2`
Published view `moz-fx-data-derived-datasets.telemetry.firefox_desktop_exact_mau28_v1`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_shield_study_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_heartbeat_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.pioneer_study`
Published view `moz-fx-data-derived-datasets.telemetry.anonymous`
Published view `moz-fx-data-derived-datasets.telemetry.eng_workflow_hgpush_parquet_v1`
Published view `moz-fx-data-derived-datasets.telemetry.event`
Published view `moz-fx-data-derived-datasets.telemetry.main`
Published view `moz-fx-data-derived-datasets.telemetry.focus_event`
Published view `moz-fx-data-derived-datasets.telemetry.malware_addon_states`
Published view `moz-fx-data-derived-datasets.telemetry.sync`
Published view `moz-fx-data-derived-datasets.telemetry.clients_daily`
Published view `moz-fx-data-derived-datasets.telemetry.firefox_nondesktop_exact_mau28_v1`
Published view `moz-fx-data-derived-datasets.telemetry.android_anr_report`
Published view `moz-fx-data-derived-datasets.telemetry.certificate_checker`
Published view `moz-fx-data-derived-datasets.telemetry.update`
Published view `moz-fx-data-derived-datasets.telemetry.health`
Published view `moz-fx-data-derived-datasets.telemetry.x_contextual_feature_recommendation`
Published view `moz-fx-data-derived-datasets.telemetry.experiments_aggregates`
Published view `moz-fx-data-derived-datasets.telemetry.ftu`
Published view `moz-fx-data-derived-datasets.telemetry.experiment_error_aggregates_v1`
Published view `moz-fx-data-derived-datasets.telemetry.crash_aggregates`
Published view `moz-fx-data-derived-datasets.telemetry.tls_13_study_v4`
Published view `moz-fx-data-derived-datasets.telemetry.mobile_metrics`
Published view `moz-fx-data-derived-datasets.telemetry.ssl_ratios_v1`
Published view `moz-fx-data-derived-datasets.telemetry.shield_study_error`
Published view `moz-fx-data-derived-datasets.telemetry.smoot_usage_fxa_v1`
Published view `moz-fx-data-derived-datasets.telemetry.sync_events`
Published view `moz-fx-data-derived-datasets.telemetry.deployment_checker`
Published view `moz-fx-data-derived-datasets.telemetry.block_autoplay`
Published view `moz-fx-data-derived-datasets.telemetry.clients_last_seen_v1`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_downgrade_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.mobile_event`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_anonymous_parquet_v1`
Published view `moz-fx-data-derived-datasets.telemetry.tls_13_study`
Published view `moz-fx-data-derived-datasets.telemetry.firefox_nondesktop_exact_mau28_by_product_v1`
Published view `moz-fx-data-derived-datasets.telemetry.tls13_middlebox_ghack`
Published view `moz-fx-data-derived-datasets.telemetry.ssl_ratios`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_new_profile_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.modules`
Published view `moz-fx-data-derived-datasets.telemetry.tls13_middlebox_repetition`
Published view `moz-fx-data-derived-datasets.telemetry.optout`
Published view `moz-fx-data-derived-datasets.telemetry.experiments`
Published view `moz-fx-data-derived-datasets.telemetry.fenix_events_v1`
Published view `moz-fx-data-derived-datasets.telemetry.shield_study`
Published view `moz-fx-data-derived-datasets.telemetry.core_clients_last_seen_v1`
Published view `moz-fx-data-derived-datasets.telemetry.addon_install_blocked`
Published view `moz-fx-data-derived-datasets.telemetry.fxa_users_last_seen_v1`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_downgrade_parquet_v1`
Published view `moz-fx-data-derived-datasets.telemetry.downgrade`
Published view `moz-fx-data-derived-datasets.telemetry.tls13_middlebox_beta`
Published view `moz-fx-data-derived-datasets.telemetry.fxa_all_events_v1`
Published view `moz-fx-data-derived-datasets.telemetry.nondesktop_clients_last_seen_v1`
Published view `moz-fx-data-derived-datasets.telemetry.tls13_middlebox_testing`
Published view `moz-fx-data-derived-datasets.telemetry.clients_last_seen`
Published view `moz-fx-data-derived-datasets.telemetry.tls_13_study_v2`
Published view `moz-fx-data-derived-datasets.telemetry.first_shutdown`
Published view `moz-fx-data-derived-datasets.telemetry.disable_sha1rollout`
Published view `moz-fx-data-derived-datasets.telemetry.addons_aggregates`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_ip_privacy_parquet_v1`
Published view `moz-fx-data-derived-datasets.telemetry.pre_account`
Published view `moz-fx-data-derived-datasets.telemetry.untrusted_modules`
Published view `moz-fx-data-derived-datasets.telemetry.fxa_content_auth_events_v1`
Published view `moz-fx-data-derived-datasets.telemetry.tls13_middlebox_draft22`
Published view `moz-fx-data-derived-datasets.telemetry.crash`
Published view `moz-fx-data-derived-datasets.telemetry.experiment_error_aggregates`
Published view `moz-fx-data-derived-datasets.telemetry.sync_summary`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_mobile_event_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.smoot_usage_desktop_v1`
Published view `moz-fx-data-derived-datasets.telemetry.smoot_usage_nondesktop_v1`
Published view `moz-fx-data-derived-datasets.telemetry.prio`
Published view `moz-fx-data-derived-datasets.telemetry.addons`
Published view `moz-fx-data-derived-datasets.telemetry.firefox_accounts_exact_mau28_by_dimensions_v1`
Published view `moz-fx-data-derived-datasets.telemetry.advancedtelemetry`
Published view `moz-fx-data-derived-datasets.telemetry.core`
Published view `moz-fx-data-derived-datasets.telemetry.new_profile`
Published view `moz-fx-data-derived-datasets.telemetry.eng_workflow_build_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.main_summary`
Published view `moz-fx-data-derived-datasets.telemetry.bhr`
Published view `moz-fx-data-derived-datasets.telemetry.searchvolextra`
Published view `moz-fx-data-derived-datasets.telemetry.saved_session`
Published view `moz-fx-data-derived-datasets.telemetry.outofdate_notifications_system_addon`
Published view `moz-fx-data-derived-datasets.telemetry.firefox_accounts_exact_mau28_v1`
Published view `moz-fx-data-derived-datasets.telemetry.tls_13_study_v1`
Published view `moz-fx-data-derived-datasets.telemetry.heartbeat`
Published view `moz-fx-data-derived-datasets.telemetry.shield_icq_v1`
Published view `moz-fx-data-derived-datasets.telemetry.events`
Published view `moz-fx-data-derived-datasets.telemetry.shield_study_addon`
Published view `moz-fx-data-derived-datasets.telemetry.flash_shield_study`
Published view `moz-fx-data-derived-datasets.telemetry.searchvol`
Published view `moz-fx-data-derived-datasets.telemetry.sync_flat_summary`
Published view `moz-fx-data-derived-datasets.telemetry.deletion`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_focus_event_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_core_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.firefox_nondesktop_exact_mau28_by_dimensions_v1`
Published view `moz-fx-data-derived-datasets.telemetry.first_shutdown_summary`
Published view `moz-fx-data-derived-datasets.telemetry.glean_clients_last_seen_v1`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_heartbeat_parquet_v1`
Published view `moz-fx-data-derived-datasets.telemetry.frecency_update`
Published view `moz-fx-data-derived-datasets.telemetry.uitour_tag`
Published view `moz-fx-data-derived-datasets.telemetry.eng_workflow_hgpush_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.socorro_crash`
Published view `moz-fx-data-derived-datasets.telemetry.telemetry_ip_privacy_parquet`
Published view `moz-fx-data-derived-datasets.telemetry.smoot_usage_all_v1`
Traceback (most recent call last):
  File "./script/publish_views", line 79, in <module>
    main()
  File "./script/publish_views", line 73, in main
    process_file(client, args, os.path.join(root, sql_file))
  File "./script/publish_views", line 33, in process_file
    query_job.result()
  File "/usr/local/lib/python3.7/site-packages/google/cloud/bigquery/job.py", line 2877, in result
    super(QueryJob, self).result(timeout=timeout)
  File "/usr/local/lib/python3.7/site-packages/google/cloud/bigquery/job.py", line 733, in result
    return super(_AsyncJob, self).result(timeout=timeout)
  File "/usr/local/lib/python3.7/site-packages/google/api_core/future/polling.py", line 127, in result
    raise self._exception
google.api_core.exceptions.NotFound: 404 Not found: Dataset moz-fx-data-derived-datasets:org_mozilla_fenix_nightly was not found in location US

I'm guessing the issue is that there was a manual copy of datasets from shared into derived-datasets that predates mozilla/mozilla-schema-generator#55. As far as I can tell fenix_nightly should be the first set of auto-generated and published views.

If the solution here is to manually create this dataset (and keep derived-datasets and shared in sync, but hopefully we'll migrate fully to shared before any new namespaces are created) then once we do that and re-run the BQ table updates we can probably call this complete.

EDIT: as a side note, performance is somewhat poor serially publishing hundreds of views, so re-implementing publish_views via terraform has some performance benefits in addition to tooling consistency ones.
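
As a point of comparison that is separate from the terraform idea, the serial loop could also be parallelized in Python. The sketch below is only an illustration: `publish_one` is a hypothetical callable standing in for whatever publishes a single view file, and the worker count is an arbitrary assumption that would need to respect API quotas.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def publish_all(sql_files, publish_one, max_workers=8):
    """Publish views concurrently and return (published, failures).

    publish_one is a hypothetical callable that publishes one view file
    and raises on failure. Threads fit here because the work is mostly
    waiting on remote API calls rather than local computation.
    """
    published, failures = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(publish_one, path): path for path in sql_files}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()
            except Exception as exc:
                # Record the error and keep going, so one missing dataset
                # does not hide the outcome for the remaining views.
                failures[path] = exc
            else:
                published.append(path)
    return sorted(published), failures
```

Collecting failures rather than raising on the first one would also have surfaced every missing dataset in a single run.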

@jklukas
Contributor

jklukas commented Aug 26, 2019

> If the solution here is to manually create this dataset (and keep derived-datasets and shared in sync, but hopefully we'll migrate fully to shared before any new namespaces are created) then once we do that and re-run the BQ table updates we can probably call this complete.

Yes, I think the way to go for the interim is to manually create the dataset. I just did so:

› bq mk moz-fx-data-derived-datasets:org_mozilla_fenix_nightly
Dataset 'moz-fx-data-derived-datasets:org_mozilla_fenix_nightly' successfully created.

We should be good to try again.
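
If we ever wanted to avoid the manual step, one option would be to ensure the target datasets exist before publishing. This is a rough sketch only, assuming the `<sql_dir>/<dataset>/<view>.sql` layout seen in the deploy log; `ensure_datasets` is a hypothetical helper, and `client` is expected to behave like `google.cloud.bigquery.Client`. Whether the deploy job should have permission to create datasets is a separate question.

```python
import os


def dataset_names(sql_dir):
    """Return the sorted dataset names implied by subdirectories of sql_dir."""
    return sorted(
        entry
        for entry in os.listdir(sql_dir)
        if os.path.isdir(os.path.join(sql_dir, entry))
    )


def ensure_datasets(client, target_project, sql_dir):
    """Make sure every dataset under sql_dir exists in target_project.

    client is assumed to provide create_dataset(dataset, exists_ok=...),
    as google.cloud.bigquery.Client does; with exists_ok=True an already
    existing dataset is left untouched instead of raising a conflict.
    """
    ensured = []
    for name in dataset_names(sql_dir):
        dataset_id = f"{target_project}.{name}"
        client.create_dataset(dataset_id, exists_ok=True)
        ensured.append(dataset_id)
    return ensured
```

Passing a plain string ID means the dataset is created with the client's default location, so this would need adjusting if the location has to be set explicitly.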

> EDIT: as a side note, performance is somewhat poor serially publishing hundreds of views, so re-implementing publish_views via terraform has some performance benefits in addition to tooling consistency ones.

There may be ways of expressing this in terraform that I'm unaware of, but I worry about the terraform approach being inflexible. We want to be able to pick up concrete view definitions from bigquery-etl's sql directory where definitions exist and create a default view otherwise. We also want the script to publish views for derived datasets that may or may not match one to one with the underlying derived table; in the derived case, I don't want us to auto-create views but rather always expect that we codify them as queries checked in to the repo.
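
To make the precedence concrete, here is a small sketch of the lookup described above. `resolve_view_sql` is a hypothetical helper rather than code from the repo, and it assumes the `<sql_dir>/<dataset>/<view>.sql` layout.

```python
import os


def resolve_view_sql(sql_dir, dataset, view, default_sql=None):
    """Return (source, sql) for one view, or None if nothing should be published.

    A checked-in <sql_dir>/<dataset>/<view>.sql always wins. default_sql
    is only a fallback; pass None for derived datasets so that no view
    is auto-created for them.
    """
    path = os.path.join(sql_dir, dataset, f"{view}.sql")
    if os.path.exists(path):
        with open(path) as f:
            return "checked-in", f.read()
    if default_sql is not None:
        return "generated", default_sql
    return None
```

The `None` result is the derived-table case: with no checked-in query, the script would skip the view rather than invent one.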

@whd
Member

whd commented Aug 28, 2019

> We should be good to try again.

I hit an unrelated issue in mozilla/mozilla-schema-generator#69 but I'm confident this has been fixed.

> There may be ways of expressing this in terraform that I'm unaware of, but I worry about the terraform approach being inflexible. We want to be able to pick up concrete view definitions from bigquery-etl's sql directory where definitions exist and create a default view otherwise. We also want the script to publish views for derived datasets that may or may not match one to one with the underlying derived table; in the derived case, I don't want us to auto-create views but rather always expect that we codify them as queries checked in to the repo.

I'm specifically referring to re-implementing the "publish" portion of this. There would be custom logic on top of terraform for creating the definitions, just as there is with ingestion tables. In the current case for defining ingestion datasets and tables the custom logic is a combination of the generated-schemas branch (input: file paths and schema definitions) and a wrapper script in cloudops-infra for manifesting that as terraform (output: tf json). I am noting that a similar approach could be used for the output (or absence of output) from this script and would have performance and perhaps tooling consistency benefits.
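
To sketch what I mean by the output side: a wrapper could take resolved view queries and emit terraform JSON, roughly as below. This is an illustration under assumptions, not the cloudops-infra wrapper; the function names and the resource naming scheme are hypothetical. It relies only on the `google_bigquery_table` resource accepting a `view` block with `query` and `use_legacy_sql`.

```python
import json


def view_resource(project, dataset, view, query):
    """Build one google_bigquery_table resource that defines a view.

    query must be the bare SELECT statement, not a CREATE VIEW statement,
    because the resource itself is what creates the view.
    """
    resource_name = f"{dataset}__{view}"  # hypothetical naming scheme
    return resource_name, {
        "project": project,
        "dataset_id": dataset,
        "table_id": view,
        "view": {"query": query, "use_legacy_sql": False},
    }


def render_tf_json(project, views):
    """Render terraform JSON for an iterable of (dataset, view, query)."""
    resources = dict(view_resource(project, d, v, q) for d, v, q in views)
    return json.dumps(
        {"resource": {"google_bigquery_table": resources}},
        indent=2,
        sort_keys=True,
    )
```

The flexibility concern is then handled before this step: whatever decides between a checked-in query and a default one runs first, and terraform only applies the result.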

@whd
Member

whd commented Sep 24, 2019

This was rolled out in https://github.com/mozilla-services/cloudops-infra/pull/1294. We might revisit whether we want to generate views in stage as well, but for now we're creating them there on a "best effort" basis.

@whd closed this as completed Sep 24, 2019