Merge branch 'feature/package-revamp' into feature/revamp/enhance-end-models
fivetran-catfritz authored Jan 28, 2023
2 parents 5f62414 + 3a1faef commit 8db5df6
Showing 24 changed files with 584 additions and 22 deletions.
2 changes: 1 addition & 1 deletion .buildkite/scripts/run_models.sh
@@ -19,5 +19,5 @@ dbt deps
dbt seed --target "$db" --full-refresh
dbt run --target "$db" --full-refresh
dbt test --target "$db"
dbt run --vars '{shopify_timezone: "America/New_York"}' --target "$db" --full-refresh
dbt run --vars '{shopify_timezone: "America/New_York", shopify_using_fulfillment_event: true, shopify_using_all_metafields: true}' --target "$db" --full-refresh
dbt test --target "$db"
2 changes: 1 addition & 1 deletion .gitignore
@@ -2,5 +2,5 @@
target/
dbt_modules/
logs/

env/
dbt_packages/
7 changes: 6 additions & 1 deletion CHANGELOG.md
@@ -1,4 +1,4 @@
# dbt_shopify v0.7.0
# dbt_shopify v0.8.0
## 🎉 Documentation and Feature Updates
- Updated README documentation for easier navigation and setup of the dbt package
- Included `shopify_[source_table_name]_identifier` variable within the Shopify source package for additional flexibility within the package when source tables are named differently.
@@ -8,7 +8,12 @@
- Intermediate models that roll customer_ids up to emails:
- `shopify__customer_email_rollup`
- `shopify__emails__order_aggregates`
- Metafield support! This package now supports metafields for the collection, customer, order, product_image, product, product_variant, and shop objects. If enabled (see the [README](https://github.com/fivetran/dbt_shopify#adding-metafields) for more details), respective `shopify__[object]_metafields` models will materialize with **all** metafields defined within the `metafield` source table appended to the object. ([#50](https://github.com/fivetran/dbt_shopify/pull/50))

## Under the Hood
- Addition of the calogica/dbt_expectations package for more robust testing.

## dbt_shopify v0.7.0
## 🚨 Breaking Changes 🚨:
[PR #40](https://github.com/fivetran/dbt_shopify/pull/40) includes the following breaking changes:
- Dispatch update for dbt-utils to dbt-core cross-db macros migration. Specifically `{{ dbt_utils.<macro> }}` have been updated to `{{ dbt.<macro> }}` for the below macros:
4 changes: 4 additions & 0 deletions DECISIONLOG.md
@@ -8,6 +8,10 @@ In validating metrics with the Sales over Time reports in the Shopify UI, you ma

We felt that reporting on the order date made more sense in reality, but, if you feel differently, please reach out and create a Feature Request. To align with the Shopify method yourself, this would most likely involve aggregating `transactions` data (relying on the `kind` column to determine sales vs returns) instead of `orders`.

## Using an Order's `created_timestamp` Instead of `processed_timestamp`

In a similar vein to the above, in the customer cohort and daily shop models, we aggregate orders on a daily grain. To do so, we truncate the timestamp at which the order was _created_. In contrast, Shopify in-app reports truncate the timestamp at which the order was _processed_. This may also contribute to discrepancies when comparing the package models to in-app reports. We felt that the creation timestamp makes more sense to use in reality, but please reach out if you have other thoughts by opening an [issue](https://github.com/fivetran/dbt_shopify/issues/new?assignees=&labels=enhancement&template=feature-request.yml&title=%5BFeature%5D+%3Ctitle%3E).
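
The effect of the two truncation choices can be illustrated with a minimal Python sketch. The order rows below are hypothetical and only show how an order created just before midnight but processed just after it lands on different days under each approach.

```python
from collections import Counter
from datetime import datetime

# Hypothetical orders: (order_id, created_timestamp, processed_timestamp).
orders = [
    (1, datetime(2023, 1, 27, 23, 58), datetime(2023, 1, 28, 0, 3)),
    (2, datetime(2023, 1, 28, 9, 15), datetime(2023, 1, 28, 9, 15)),
]

def daily_counts(rows, ts_index):
    """Count orders per calendar day, truncating the chosen timestamp to a date."""
    return Counter(row[ts_index].date() for row in rows)

by_created = daily_counts(orders, 1)    # grain described for the package models
by_processed = daily_counts(orders, 2)  # grain described for Shopify in-app reports
```

Here `by_created` puts one order on each of January 27 and 28, while `by_processed` puts both on January 28, which is the kind of discrepancy described above.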

## Creating Empty Tables for Refunds, Order Line Refunds, Order Adjustments, and Discount Codes

Source tables related to `refunds`, `order_line_refunds`, `order_adjustments`, and `discount_codes` are created in the Shopify schema dynamically. For example, if your shop has not incurred any refunds, you will not have a `refund` table yet until you do refund an order.
28 changes: 22 additions & 6 deletions README.md
@@ -21,12 +21,13 @@ The following table provides a detailed list of all models materialized within t
| **model** | **description** |
| ------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| [shopify__customer_cohorts](https://github.com/fivetran/dbt_shopify/blob/main/models/shopify__customer_cohorts.sql) | Each record represents the monthly performance of a customer, including fields for the month of their 'cohort'. |
| [shopify__customers](https://github.com/fivetran/dbt_shopify/blob/main/models/shopify__customers.sql) | Each record represents a customer, with additional dimensions like lifetime value and number of orders. |
| [shopify__orders](https://github.com/fivetran/dbt_shopify/blob/main/models/shopify__orders.sql) | Each record represents an order, with additional dimensions like whether it is a new or repeat purchase. |
| [shopify__order_lines](https://github.com/fivetran/dbt_shopify/blob/main/models/shopify__order_lines.sql) | Each record represents an order line item, with additional dimensions like how many items were refunded. |
| [shopify__products](https://github.com/fivetran/dbt_shopify/blob/main/models/shopify__products.sql) | Each record represents a product, with additional dimensions like most recent order date and order volume. |
| [shopify__transactions](https://github.com/fivetran/dbt_shopify/blob/main/models/shopify__transactions.sql) | Each record represents a transaction with additional calculations to handle exchange rates. |
| [shopify__customer_cohorts](https://fivetran.github.io/dbt_shopify/#!/model/model.shopify.shopify__customer_cohorts.sql) | Each record represents the monthly performance of a customer, including fields for the month of their 'cohort'. |
| [shopify__customers](https://fivetran.github.io/dbt_shopify/#!/model/model.shopify.shopify__customers.sql) | Each record represents a customer, with additional dimensions like lifetime value and number of orders. |
| [shopify__orders](https://fivetran.github.io/dbt_shopify/#!/model/model.shopify.shopify__orders.sql) | Each record represents an order, with additional dimensions like whether it is a new or repeat purchase. |
| [shopify__order_lines](https://fivetran.github.io/dbt_shopify/#!/model/model.shopify.shopify__order_lines.sql) | Each record represents an order line item, with additional dimensions like how many items were refunded. |
| [shopify__products](https://fivetran.github.io/dbt_shopify/#!/model/model.shopify.shopify__products.sql) | Each record represents a product, with additional dimensions like most recent order date and order volume. |
| [shopify__transactions](https://fivetran.github.io/dbt_shopify/#!/model/model.shopify.shopify__transactions) | Each record represents a transaction with additional calculations to handle exchange rates. |
| [shopify__daily_shop](https://fivetran.github.io/dbt_shopify/#!/model/model.shopify.shopify__daily_shop.sql) | Each record represents a day of activity for each of your shops, conveyed by a suite of daily metrics. |

# 🎯 How do I use the dbt package?

@@ -87,6 +88,21 @@ vars:
product_variant_pass_through_columns: []
```

### Adding Metafields
In [May 2021](https://fivetran.com/docs/applications/shopify/changelog#may2021), the Shopify connector added support for the [metafield resource](https://shopify.dev/api/admin-rest/2023-01/resources/metafield). To take advantage of these metafields, this package offers mapping models that append them to the respective source object for the following tables: collection, customer, order, product_image, product, product_variant, and shop. If enabled, these models materialize as `shopify__[object]_metafields` for each supported object. To enable them, use the following configurations in your `dbt_project.yml`.
>**Note**: These metafield models contain the same records as the corresponding staging models, with the metafield columns added. To guard against fanout, this package uses the `dbt_expectations.expect_table_row_count_to_equal_other_table` test to check that each metafield model has the same row count as its staging model.

```yml
vars:
  shopify_using_all_metafields: True ## False by default. Will enable ALL metafield models. FYI - This will override all other metafield variables.
  shopify_using_collection_metafields: True ## False by default. Will enable ONLY the collection metafield model.
  shopify_using_customer_metafields: True ## False by default. Will enable ONLY the customer metafield model.
  shopify_using_order_metafields: True ## False by default. Will enable ONLY the order metafield model.
  shopify_using_product_metafields: True ## False by default. Will enable ONLY the product metafield model.
  shopify_using_product_image_metafields: True ## False by default. Will enable ONLY the product image metafield model.
  shopify_using_product_variant_metafields: True ## False by default. Will enable ONLY the product variant metafield model.
```
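
As a rough illustration of how these variables interact, the Python sketch below mirrors the `enabled` expression used in the collection and customer metafield models in this commit (`var('shopify_using_all_metafields', False) or var('shopify_using_<object>_metafields', False)`). The function name `metafield_model_enabled` is hypothetical, and it is an assumption that the remaining metafield models follow the same pattern.

```python
def metafield_model_enabled(project_vars, object_name):
    """Return True if the all-metafields flag or the object-specific flag is set.

    Both flags default to False when absent, as in the dbt `var(name, False)` calls.
    """
    return bool(
        project_vars.get("shopify_using_all_metafields", False)
        or project_vars.get(f"shopify_using_{object_name}_metafields", False)
    )

# Only the customer flag is set: the customer model builds, the collection model does not.
only_customer = {"shopify_using_customer_metafields": True}

# The all-metafields flag wins even when an object-specific flag is explicitly False.
all_on = {"shopify_using_all_metafields": True, "shopify_using_order_metafields": False}
```

Because the two flags are combined with `or`, setting an object-specific variable to `False` cannot switch a model off once `shopify_using_all_metafields` is `True`.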

### Changing the Build Schema
By default this package will build the Shopify staging models within a schema titled (<target_schema> + `_stg_shopify`) and the Shopify final models within a schema titled (<target_schema> + `_shopify`) in your target database. If this is not where you would like your modeled Shopify data to be written to, add the following configuration to your `dbt_project.yml` file:

3 changes: 2 additions & 1 deletion dbt_project.yml
@@ -46,4 +46,5 @@ vars:
shopify_tender_transaction: "{{ ref('stg_shopify__tender_transaction') }}"
shopify_abandoned_checkout_discount_code: "{{ ref('stg_shopify__abandoned_checkout_discount_code') }}"
shopify_order_discount_code: "{{ ref('stg_shopify__order_discount_code') }}"
shopify_abandoned_checkout_shipping_line: "{{ ref('stg_shopify__abandoned_checkout_shipping_line') }}"
shopify_abandoned_checkout_shipping_line: "{{ ref('stg_shopify__abandoned_checkout_shipping_line') }}"
shopify_fulfillment_event: "{{ ref('stg_shopify__fulfillment_event') }}"
3 changes: 2 additions & 1 deletion integration_tests/dbt_project.yml
@@ -22,6 +22,7 @@ vars:
shopify_customer_tag_identifier: "shopify_customer_tag_data"
shopify_discount_code_identifier: "shopify_discount_code_data"
shopify_fulfillment_identifier: "shopify_fulfillment_data"
shopify_fulfillment_event_identifier: "shopify_fulfillment_event_data"
shopify_inventory_item_identifier: "shopify_inventory_item_data"
shopify_inventory_level_identifier: "shopify_inventory_level_data"
shopify_location_identifier: "shopify_location_data"
@@ -40,7 +41,7 @@ vars:
shopify_abandoned_checkout_discount_code_identifier: "shopify_abandoned_checkout_discount_code_data"
shopify_order_discount_code_identifier: "shopify_order_discount_code_data"
shopify_abandoned_checkout_shipping_line_identifier: "shopify_abandoned_checkout_shipping_line_data"

dispatch:
- macro_namespace: dbt_utils
search_order: ['spark_utils', 'dbt_utils']
6 changes: 6 additions & 0 deletions integration_tests/seeds/shopify_fulfillment_event_data.csv
@@ -0,0 +1,6 @@
id,_fivetran_synced,address_1,city,country,created_at,estimated_delivery_at,fulfillment_id,happened_at,latitude,longitude,message,order_id,province,shop_id,status,updated_at,zip,_fivetran_deleted
451435,2022-11-18 04:39:07.945000,,,,2022-08-29 20:52:39.000000,,40495,2022-08-29 20:52:39.000000,,,,4502987,,89440612,delivered,2022-08-29 20:52:39.000000,,false
48779,2022-11-18 05:48:01.773000,,LONDON,GB,2022-09-13 08:07:57.000000,,4064737,2022-08-15 12:41:00.000000,101.349998474121094,-14.0333000011742115,Delay,4588203,,320612,out_for_delivery,2022-09-13 08:07:57.000000,CR0,false
1481515,2022-11-18 05:41:00.745000,,ECHO PARK,AU,2022-09-14 14:16:52.000000,2022-09-14 08:00:00.000000,4019339,2022-09-14 01:26:00.000000,-3.797698974609375,190.783958203125,Delay,451915,,89320612,out_for_delivery,2022-09-14 14:16:52.000000,2759,false
558955,2022-11-18 10:51:24.286000,,LAZYTOWN,US,2022-08-13 12:40:26.000000,,402947,2022-03-01 10:36:39.000000,22.337699890136719,-71.731002807617188,Delay,429188587,MA,89420612,in_transit,2022-08-13 12:40:26.000000,01505,false
6904235,2022-11-18 08:58:00.458000,,LA,US,2022-08-24 06:29:21.000000,2022-08-24 23:59:59.000000,4060491,2022-08-24 05:30:57.000000,12.287498474121094,-21.3573989868164,Delay,4242667,MA,89420612,in_transit,2022-08-24 06:29:21.000000,01760,false
52 changes: 52 additions & 0 deletions macros/get_metafields.sql
@@ -0,0 +1,52 @@
{% macro get_metafields(source_object, reference_value, lookup_object="stg_shopify__metafield", key_field="metafield_reference", key_value="value", reference_field="owner_resource") %}

{% set pivot_fields = dbt_utils.get_column_values(table=ref(lookup_object), column=key_field, where="lower(" ~ reference_field ~ ") = lower('" ~ reference_value ~ "')") %}

{% set source_columns = adapter.get_columns_in_relation(ref(source_object)) %}
{% set source_column_count = source_columns | length %}

with source_table as (
select *
from {{ ref(source_object) }}
)

{% if pivot_fields is not none %},
lookup_object as (
select
*,
{{ dbt_utils.pivot(
column=key_field,
values=pivot_fields,
agg='',
then_value=key_value,
else_value="null",
quote_identifiers=false
)
}}
from {{ ref(lookup_object) }}
where is_most_recent_record
),

final as (
select
{% for column in source_columns %}
source_table.{{ column.name }}{% if not loop.last %},{% endif %}
{% endfor %}
{% for fields in pivot_fields %}
, max(lookup_object.{{ dbt_utils.slugify(fields) }}) as metafield_{{ dbt_utils.slugify(fields) }}
{% endfor %}
from source_table
left join lookup_object
on lookup_object.{{ reference_field }}_id = source_table.{{ reference_value }}_id
and lookup_object.{{ reference_field }} = '{{ reference_value }}'
{{ dbt_utils.group_by(source_column_count) }}
)

select *
from final
{% else %}

select *
from source_table
{% endif %}
{% endmacro %}
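
The shape of the query this macro generates can be approximated outside dbt. The sketch below uses Python's built-in `sqlite3` as a stand-in warehouse with hypothetical `customer` and `metafield` tables; it collapses the macro's two CTEs into a single `max(case when ...)` query, so it illustrates the pivot-then-join idea and is not the compiled SQL. Keys are assumed to already be valid column names (the macro applies `dbt_utils.slugify` for that).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customer (customer_id integer, email text);
    create table metafield (
        owner_resource text, owner_resource_id integer,
        metafield_reference text, value text
    );
    insert into customer values (1, 'a@example.com'), (2, 'b@example.com');
    insert into metafield values
        ('customer', 1, 'loyalty_tier', 'gold'),
        ('customer', 1, 'birthday', '1990-01-01'),
        ('product', 1, 'material', 'cotton');
""")

# Step 1: discover the pivot keys for this owner type
# (the role played by dbt_utils.get_column_values in the macro).
keys = [row[0] for row in conn.execute(
    "select distinct metafield_reference from metafield "
    "where lower(owner_resource) = lower(?) order by 1", ("customer",)
)]

# Step 2: build one max(case when ...) column per key and group by the
# source columns, so each source row appears exactly once.
pivot_sql = ", ".join(
    f"max(case when m.metafield_reference = '{k}' then m.value end) as metafield_{k}"
    for k in keys
)
rows = conn.execute(f"""
    select c.customer_id, c.email, {pivot_sql}
    from customer c
    left join metafield m
        on m.owner_resource_id = c.customer_id
        and m.owner_resource = 'customer'
    group by c.customer_id, c.email
    order by c.customer_id
""").fetchall()
```

Customer 1 has two metafields but still yields a single row, and customer 2, who has none, is kept with null metafield columns. The product metafield is ignored because of the `owner_resource` filter in the join.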
25 changes: 25 additions & 0 deletions models/intermediate/int_shopify__daily_abandoned_checkouts.sql
@@ -0,0 +1,25 @@
with abandoned_checkout as (

select *
from {{ var('shopify_abandoned_checkout') }}

-- "deleted" abandoned checkouts do not appear to have any data tying them to customers,
-- discounts, or products (and should therefore not get joined in) but let's filter them out here
where not coalesce(is_deleted, false)
),

abandoned_checkout_aggregates as (

select
source_relation,
cast({{ dbt.date_trunc('day','created_at') }} as date) as date_day,
count(distinct checkout_id) as count_abandoned_checkouts,
count(distinct customer_id) as count_customers_abandoned_checkout,
count(distinct email) as count_customer_emails_abandoned_checkout

from abandoned_checkout
group by 1,2
)

select *
from abandoned_checkout_aggregates
25 changes: 25 additions & 0 deletions models/intermediate/int_shopify__daily_fulfillment.sql
@@ -0,0 +1,25 @@
{{ config(enabled=var('shopify_using_fulfillment_event', false)) }}

with fulfillment_event as (

select *
from {{ var('shopify_fulfillment_event') }}
),

fulfillment_aggregates as (

select
source_relation,
cast({{ dbt.date_trunc('day','happened_at') }} as date) as date_day

{% for status in ['attempted_delivery', 'delivered', 'failure', 'in_transit', 'out_for_delivery', 'ready_for_pickup', 'label_printed', 'label_purchased', 'confirmed']%}
, sum(case when lower(status) = '{{ status }}' then 1 else 0 end) as count_fulfillment_{{ status }}
{% endfor %}

from fulfillment_event
group by 1,2

)

select *
from fulfillment_aggregates
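
For readers less familiar with the Jinja loop above, the following Python sketch reproduces the same per-day, per-status counting on hypothetical events. It omits `source_relation` for brevity, and the function name `daily_status_counts` is illustrative only.

```python
from collections import defaultdict
from datetime import datetime

# Status values copied from the model's Jinja loop.
STATUSES = [
    "attempted_delivery", "delivered", "failure", "in_transit",
    "out_for_delivery", "ready_for_pickup", "label_printed",
    "label_purchased", "confirmed",
]

# Hypothetical fulfillment events: (happened_at, status).
events = [
    (datetime(2022, 8, 24, 5, 30), "in_transit"),
    (datetime(2022, 8, 24, 9, 0), "IN_TRANSIT"),
    (datetime(2022, 8, 24, 17, 45), "delivered"),
    (datetime(2022, 8, 29, 20, 52), "delivered"),
]

def daily_status_counts(rows):
    """Return {date_day: {count_fulfillment_<status>: n}} with a zero-filled column per status."""
    counts = defaultdict(lambda: {f"count_fulfillment_{s}": 0 for s in STATUSES})
    for happened_at, status in rows:
        day = counts[happened_at.date()]  # every event creates its day, as the group by would
        column = f"count_fulfillment_{(status or '').lower()}"
        if column in day:  # unknown or null statuses hit the implicit "else 0"
            day[column] += 1
    return dict(counts)

daily = daily_status_counts(events)
```

As in the SQL, the status comparison is case-insensitive and each day always carries all nine `count_fulfillment_*` columns, most of them zero.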
96 changes: 96 additions & 0 deletions models/intermediate/int_shopify__daily_orders.sql
@@ -0,0 +1,96 @@
with orders as (

select *
from {{ ref('shopify__orders') }}

where not coalesce(is_deleted, false)
),

order_lines as(

select *
from {{ ref('shopify__order_lines') }}
),

order_aggregates as (

select
source_relation,
cast({{ dbt.date_trunc('day','created_timestamp') }} as date) as date_day,
count(distinct order_id) as count_orders,
sum(line_item_count) as count_line_items,
avg(line_item_count) as avg_line_item_count,
count(distinct customer_id) as count_customers,
count(distinct email) as count_customer_emails,
sum(order_adjusted_total) as order_adjusted_total,
avg(order_adjusted_total) as avg_order_value,
sum(shipping_cost) as shipping_cost,
sum(order_adjustment_amount) as order_adjustment_amount,
sum(order_adjustment_tax_amount) as order_adjustment_tax_amount,
sum(refund_subtotal) as refund_subtotal,
sum(refund_total_tax) as refund_total_tax,
sum(total_discounts) as total_discounts,
avg(total_discounts) as avg_discount,
sum(shipping_discount_amount) as shipping_discount_amount,
avg(shipping_discount_amount) as avg_shipping_discount_amount,
sum(percentage_calc_discount_amount) as percentage_calc_discount_amount,
avg(percentage_calc_discount_amount) as avg_percentage_calc_discount_amount,
sum(fixed_amount_discount_amount) as fixed_amount_discount_amount,
avg(fixed_amount_discount_amount) as avg_fixed_amount_discount_amount,
sum(count_discount_codes_applied) as count_discount_codes_applied,
count(distinct location_id) as count_locations_ordered_from,
sum(case when count_discount_codes_applied > 0 then 1 else 0 end) as count_orders_with_discounts,
sum(case when refund_subtotal > 0 then 1 else 0 end) as count_orders_with_refunds,
min(created_timestamp) as first_order_timestamp,
max(created_timestamp) as last_order_timestamp

from orders
group by 1,2

),

order_line_aggregates as (

select
order_lines.source_relation,
cast({{ dbt.date_trunc('day','orders.created_timestamp') }} as date) as date_day,
sum(order_lines.quantity) as quantity_sold,
sum(order_lines.refunded_quantity) as quantity_refunded,
sum(order_lines.quantity_net_refunds) as quantity_net,
sum(order_lines.quantity) / count(distinct order_lines.order_id) as avg_quantity_sold,
sum(order_lines.quantity_net_refunds) / count(distinct order_lines.order_id) as avg_quantity_net,
count(distinct order_lines.variant_id) as count_variants_sold,
count(distinct order_lines.product_id) as count_products_sold,
sum(case when order_lines.is_gift_card then order_lines.quantity_net_refunds else 0 end) as quantity_gift_cards_sold,
sum(case when order_lines.is_shipping_required then order_lines.quantity_net_refunds else 0 end) as quantity_requiring_shipping

from order_lines
left join orders -- just joining with order to get the created_timestamp
on order_lines.order_id = orders.order_id
and order_lines.source_relation = orders.source_relation

group by 1,2
),

final as (

select
order_aggregates.*,
order_line_aggregates.quantity_sold,
order_line_aggregates.quantity_refunded,
order_line_aggregates.quantity_net,
order_line_aggregates.count_variants_sold,
order_line_aggregates.count_products_sold,
order_line_aggregates.quantity_gift_cards_sold,
order_line_aggregates.quantity_requiring_shipping,
order_line_aggregates.avg_quantity_sold,
order_line_aggregates.avg_quantity_net

from order_aggregates
left join order_line_aggregates
on order_aggregates.date_day = order_line_aggregates.date_day
and order_aggregates.source_relation = order_line_aggregates.source_relation
)

select *
from final
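
One consequence of the join structure above is worth sketching: `order_lines` is left-joined to an `orders` CTE that already excludes deleted orders, so lines belonging to a deleted order get a null `date_day` and never match a row in `order_aggregates`. The example below uses Python's built-in `sqlite3` with hypothetical tables and a reduced column set (no `source_relation`); it is a simplified sketch of that behavior, not the model itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table orders (order_id integer, created_timestamp text, is_deleted integer);
    create table order_lines (order_id integer, quantity integer);
    insert into orders values
        (1, '2023-01-27 10:00:00', 0),
        (2, '2023-01-27 18:30:00', 0),
        (3, '2023-01-27 20:00:00', 1);  -- deleted order
    insert into order_lines values (1, 2), (1, 1), (2, 4), (3, 5);
""")

rows = conn.execute("""
    with live_orders as (
        select * from orders where not coalesce(is_deleted, 0)
    ),
    order_aggregates as (
        select
            date(created_timestamp) as date_day,
            count(distinct order_id) as count_orders
        from live_orders
        group by date(created_timestamp)
    ),
    order_line_aggregates as (
        select
            date(live_orders.created_timestamp) as date_day,
            sum(order_lines.quantity) as quantity_sold
        from order_lines
        left join live_orders
            on order_lines.order_id = live_orders.order_id
        group by date(live_orders.created_timestamp)
    )
    select
        order_aggregates.date_day,
        order_aggregates.count_orders,
        order_line_aggregates.quantity_sold
    from order_aggregates
    left join order_line_aggregates
        on order_aggregates.date_day = order_line_aggregates.date_day
""").fetchall()
```

The single result row reports two orders and seven units for January 27. The five units on the deleted order are grouped under a null day and dropped by the final join.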
6 changes: 6 additions & 0 deletions models/metafields/shopify__collection_metafields.sql
@@ -0,0 +1,6 @@
{{ config(enabled=var('shopify_using_all_metafields', False) or var('shopify_using_collection_metafields', False)) }}

{{ get_metafields(
source_object = "stg_shopify__collection",
reference_value = 'collection')
}}
6 changes: 6 additions & 0 deletions models/metafields/shopify__customer_metafields.sql
@@ -0,0 +1,6 @@
{{ config(enabled=var('shopify_using_all_metafields', False) or var('shopify_using_customer_metafields', False)) }}

{{ get_metafields(
source_object = "stg_shopify__customer",
reference_value = 'customer')
}}