discounts! #47

Merged
merged 11 commits into from
Jan 11, 2023
17 changes: 13 additions & 4 deletions DECISIONLOG.md
@@ -8,14 +8,23 @@ In validating metrics with the Sales over Time reports in the Shopify UI, you ma

We felt that reporting on the order date made more sense in reality, but, if you feel differently, please reach out and create a Feature Request. To align with the Shopify method yourself, this would most likely involve aggregating `transactions` data (relying on the `kind` column to determine sales vs returns) instead of `orders`.
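As a rough sketch of that alternative, the snippet below nets sales against refunds by transaction date, using SQLite from Python. The table layout, the `amount` and `created_at` columns, and the `sale` / `refund` values of `kind` are assumptions for illustration, not the package's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table transactions (order_id integer, kind text, amount real, created_at text);
    insert into transactions values
        (1, 'sale',   100.0, '2023-01-01'),
        (2, 'sale',    50.0, '2023-01-01'),
        (1, 'refund',  30.0, '2023-01-03');
""")

# Sales count positively and refunds negatively, on the day each transaction occurred.
net_sales_by_day = conn.execute("""
    select
        created_at as transaction_date,
        sum(case when kind = 'sale' then amount
                 when kind = 'refund' then -amount
                 else 0 end) as net_sales
    from transactions
    group by 1
    order by 1
""").fetchall()

print(net_sales_by_day)
```

Here the refund reduces sales on the day it was processed rather than on the original order date, which is the difference described above.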

## Creating Empty Tables for Refunds, Order Line Refunds, and Order Adjustments
## Creating Empty Tables for Refunds, Order Line Refunds, Order Adjustments, and Discount Codes

Source tables related to `refunds`, `order_line_refunds`, and `order_adjustments` are created in the Shopify schema dynamically. For example, if your shop has not incurred any refunds, you will not have a `refund` table until you refund an order.
Source tables related to `refunds`, `order_line_refunds`, `order_adjustments`, and `discount_codes` are created in the Shopify schema dynamically. For example, if your shop has not incurred any refunds, you will not have a `refund` table until you refund an order.

Thus, the source package will create empty (1 row of all `NULL` fields) staging models if these source tables do not exist in your Shopify schema yet, and the transform package will work seamlessly with these empty models. Once `refund`, `order_line_refund`, or `order_adjustment` exists in your schema, the source and transform packages will automatically reference the new populated table(s) ([example](https://github.com/fivetran/dbt_shopify_source/blob/main/models/tmp/stg_shopify__refund_tmp.sql)).
Thus, the source package will create empty (1 row of all `NULL` fields) staging models if these source tables do not exist in your Shopify schema yet, and the transform package will work seamlessly with these empty models. Once `refund`, `order_line_refund`, `order_adjustment`, or `discount_code` exists in your schema, the source and transform packages will automatically reference the new populated table(s) ([example](https://github.com/fivetran/dbt_shopify_source/blob/main/models/tmp/stg_shopify__refund_tmp.sql)).

> In previous versions of the package, you had to manually enable or disable transforms of `refund`, `order_line_refund`, or `order_adjustment` through variables. Because this required you to monitor your Shopify account/schema and update the variable(s) accordingly, we decided to pursue a more automated solution.
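
The fallback idea can be illustrated outside dbt with a small SQLite sketch. The helper `relation_or_empty` and the column names are hypothetical; the package's real logic lives in the linked model and is not reproduced here.

```python
import sqlite3

def relation_or_empty(conn, table, columns):
    """Return SQL that reads `table` if it exists, else a single all-NULL row."""
    exists = conn.execute(
        "select 1 from sqlite_master where type = 'table' and name = ?", (table,)
    ).fetchone()
    if exists:
        return f"select {', '.join(columns)} from {table}"
    null_columns = ", ".join(f"null as {col}" for col in columns)
    return f"select {null_columns}"

conn = sqlite3.connect(":memory:")
columns = ["refund_id", "order_id"]

# No refund table yet: downstream SQL still runs against one row of NULLs.
refund_sql = relation_or_empty(conn, "refund", columns)
rows_before = conn.execute(refund_sql).fetchall()
refunded_order_count = conn.execute(
    f"select count(order_id) from ({refund_sql})"
).fetchone()[0]

# Once the table appears, the same helper points at the populated table.
conn.execute("create table refund (refund_id integer, order_id integer)")
conn.execute("insert into refund values (10, 1)")
rows_after = conn.execute(relation_or_empty(conn, "refund", columns)).fetchall()
```

Because aggregates such as `count(order_id)` ignore `NULL`, the placeholder row does not add phantom refunds downstream.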

## Keeping Deleted Entities

todo - not filtering out _fivetran_deleted in staging models. when joining these tables together in the transform package, bring in _fivetran_deleted as is_<foreign key table>_deleted
todo - not filtering out _fivetran_deleted in staging models. when joining these tables together in the transform package, bring in _fivetran_deleted as is_<foreign key table>_deleted

## Accepted Value Test Severity

We test the following columns for accepted values because their values are hard-coded to be pivoted out into columns and/or used as `JOIN` conditions in downstream models.
- `stg_shopify__price_rule.target_type`: accepted values are `line_item`, `shipping_line`
- `stg_shopify__price_rule.value_type`: accepted values are `percentage`, `fixed_amount`
- `stg_shopify__fulfillment.status`: accepted values are `pending`, `open`, `success`, `cancelled`, `error`, `failure`

We have chosen to make the severity of these tests `warn`, as non-accepted values will be filtered out in the transformation models. They will not introduce erroneous data.
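
A minimal sketch of the warn-then-filter behavior described above, again in SQLite from Python. The `price_rule` table and its rows are made up for illustration; only the accepted values for `value_type` come from the list above.

```python
import logging
import sqlite3

ACCEPTED_VALUE_TYPES = ("percentage", "fixed_amount")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table price_rule (price_rule_id integer, value_type text);
    insert into price_rule values (1, 'percentage'), (2, 'fixed_amount'), (3, 'mystery');
""")

placeholders = ", ".join("?" for _ in ACCEPTED_VALUE_TYPES)

# "Test" step: surface unexpected values as a warning instead of failing.
unexpected = conn.execute(
    f"select distinct value_type from price_rule where value_type not in ({placeholders})",
    ACCEPTED_VALUE_TYPES,
).fetchall()
if unexpected:
    logging.warning("unexpected value_type values: %s", unexpected)

# "Transform" step: only accepted values flow downstream.
kept_ids = [
    row[0]
    for row in conn.execute(
        f"select price_rule_id from price_rule where value_type in ({placeholders}) order by 1",
        ACCEPTED_VALUE_TYPES,
    )
]
```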
3 changes: 2 additions & 1 deletion dbt_project.yml
@@ -44,4 +44,5 @@ vars:
shopify_shop: "{{ ref('stg_shopify__shop') }}"
shopify_tender_transaction: "{{ ref('stg_shopify__tender_transaction') }}"
shopify_abandoned_checkout_discount_code: "{{ ref('stg_shopify__abandoned_checkout_discount_code') }}"
shopify_order_discount_code: "{{ ref('stg_shopify__order_discount_code') }}"
shopify_order_discount_code: "{{ ref('stg_shopify__order_discount_code') }}"
shopify_abandoned_checkout_shipping_line: "{{ ref('stg_shopify__abandoned_checkout_shipping_line') }}"
58 changes: 32 additions & 26 deletions integration_tests/dbt_project.yml
@@ -16,31 +16,30 @@ vars:
shopify_product_variant_identifier: "shopify_product_variant_data"
shopify_refund_identifier: "shopify_refund_data"
shopify_transaction_identifier: "shopify_transaction_data"

# will change these to identifiers later
abandoned_checkout_source: "{{ ref('shopify_abandoned_checkout_data') }}"
collection_product_source: "{{ ref('shopify_collection_product_data') }}"
collection_source: "{{ ref('shopify_collection_data') }}"
customer_tag_source: "{{ ref('shopify_customer_tag_data') }}"
discount_code_source: "{{ ref('shopify_discount_code_data') }}"
fulfillment_source: "{{ ref('shopify_fulfillment_data') }}"
inventory_item_source: "{{ ref('shopify_inventory_item_data') }}"
inventory_level_source: "{{ ref('shopify_inventory_level_data') }}"
location_source: "{{ ref('shopify_location_data') }}"
metafield_source: "{{ ref('shopify_metafield_data') }}"
order_note_attribute_source: "{{ ref('shopify_order_note_attribute_data') }}"
order_shipping_line_source: "{{ ref('shopify_order_shipping_line_data') }}"
order_shipping_tax_line_source: "{{ ref('shopify_order_shipping_tax_line_data') }}"
order_tag_source: "{{ ref('shopify_order_tag_data') }}"
order_url_tag_source: "{{ ref('shopify_order_url_tag_data') }}"
price_rule_source: "{{ ref('shopify_price_rule_data') }}"
product_image_source: "{{ ref('shopify_product_image_data') }}"
product_tag_source: "{{ ref('shopify_product_tag_data') }}"
shop_source: "{{ ref('shopify_shop_data') }}"
tender_transaction_source: "{{ ref('shopify_tender_transaction_data') }}"
abandoned_checkout_discount_code_source: "{{ ref('shopify_abandoned_checkout_discount_code_data') }}"
order_discount_code_source: "{{ ref('shopify_order_discount_code_data') }}"

shopify_abandoned_checkout_identifier: "shopify_abandoned_checkout_data"
shopify_collection_product_identifier: "shopify_collection_product_data"
shopify_collection_identifier: "shopify_collection_data"
shopify_customer_tag_identifier: "shopify_customer_tag_data"
shopify_discount_code_identifier: "shopify_discount_code_data"
shopify_fulfillment_identifier: "shopify_fulfillment_data"
shopify_inventory_item_identifier: "shopify_inventory_item_data"
shopify_inventory_level_identifier: "shopify_inventory_level_data"
shopify_location_identifier: "shopify_location_data"
shopify_metafield_identifier: "shopify_metafield_data"
shopify_order_note_attribute_identifier: "shopify_order_note_attribute_data"
shopify_order_shipping_line_identifier: "shopify_order_shipping_line_data"
shopify_order_shipping_tax_line_identifier: "shopify_order_shipping_tax_line_data"
shopify_order_tag_identifier: "shopify_order_tag_data"
shopify_order_url_tag_identifier: "shopify_order_url_tag_data"
shopify_price_rule_identifier: "shopify_price_rule_data"
shopify_product_image_identifier: "shopify_product_image_data"
shopify_product_tag_identifier: "shopify_product_tag_data"
shopify_shop_identifier: "shopify_shop_data"
shopify_tender_transaction_identifier: "shopify_tender_transaction_data"
shopify_abandoned_checkout_discount_code_identifier: "shopify_abandoned_checkout_discount_code_data"
shopify_order_discount_code_identifier: "shopify_order_discount_code_data"
shopify_abandoned_checkout_shipping_line_identifier: "shopify_abandoned_checkout_shipping_line_data"

dispatch:
- macro_namespace: dbt_utils
search_order: ['spark_utils', 'dbt_utils']
@@ -121,6 +120,7 @@ seeds:
closed_at: timestamp
created_at: timestamp
updated_at: timestamp
_fivetran_deleted: boolean
shopify_discount_code_data:
+column_types:
usage_count: float
@@ -159,4 +159,10 @@ seeds:
shopify_inventory_item_data:
+column_types:
updated_at: timestamp
created_at: timestamp
created_at: timestamp
shopify_abandoned_checkout_shipping_line_data:
+column_types:
markup: "{{ 'string' if target.type in ('bigquery', 'spark', 'databricks') else 'varchar' }}"
price: float
original_shop_markup: "{{ 'string' if target.type in ('bigquery', 'spark', 'databricks') else 'varchar' }}"
original_shop_price: "{{ 'string' if target.type in ('bigquery', 'spark', 'databricks') else 'varchar' }}"
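
The Jinja conditional used for these seed column types can be read as the following plain-Python sketch. The function name `seed_string_type` is hypothetical; the adapter names are the ones in the expression above.

```python
STRING_TYPE_TARGETS = ("bigquery", "spark", "databricks")

def seed_string_type(target_type: str) -> str:
    """Pick the text column type name for a given dbt target type."""
    return "string" if target_type in STRING_TYPE_TARGETS else "varchar"

print(seed_string_type("bigquery"))
print(seed_string_type("postgres"))
```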
@@ -0,0 +1,6 @@
checkout_id,index,_fivetran_synced,api_client_id,carrier_identifier,carrier_service_id,code,delivery_category,discounted_price,id,markup,phone,price,requested_fulfillment_service_id,source,title,validation_context,delivery_expectation_range,delivery_expectation_type,original_shop_markup,original_shop_price,presentment_title,delivery_expectation_range_min,delivery_expectation_range_max
653675,1,2023-01-09 06:48:18.093000,,,,Standard,,,c3ce0972c2e30eaf7001bea,0.0,,0.0,,shopify,Standard,,,,0.0,0.0,Standard,,
379,1,2023-01-09 06:48:23.540000,,,,Standard,,,bf7c90953344902c13,0.0,,0.0,,shopify,Standard,,,,0.0,0.0,Standard,,
635,1,2023-01-09 06:48:24.243000,,,,Standard,,,519ff4275cd972e282db,0.0,,0.0,,shopify,Standard,,,,0.0,0.0,Standard,,
3211,1,2023-01-09 06:48:18.068000,,,,Standard,,,8d18671d481ad46a,0.0,,0.0,,shopify,Standard,,,,0.0,0.0,Standard,,
381227,1,2023-01-09 06:48:16.985000,,,,Standard,,,8f2fab1b455ec9e597,0.0,,0.0,,shopify,Standard,,,,0.0,0.0,Standard,,
5 changes: 2 additions & 3 deletions models/intermediate/int_shopify__customer_email_rollup.sql
@@ -22,20 +22,19 @@ with customers as (
min(created_timestamp) as first_account_created_at,
max(created_timestamp) as last_account_created_at,
max(updated_timestamp) as last_updated_at,
max(accepts_marketing_updated_at) as accepts_marketing_last_updated_at,
max(marketing_consent_updated_at) as marketing_consent_updated_at,
max(_fivetran_synced) as last_fivetran_synced,
sum(orders_count) as orders_count,
sum(total_spent) as total_spent,

-- take true if ever given for boolean fields
{{ fivetran_utils.max_bool("has_accepted_marketing") }} as has_accepted_marketing,
{{ fivetran_utils.max_bool("case when customer_index = 1 then is_tax_exempt else null end") }} as is_tax_exempt, -- since this changes every year
{{ fivetran_utils.max_bool("is_verified_email") }} as is_verified_email

-- for all other fields, just take the latest value
{% set cols = adapter.get_columns_in_relation(ref('stg_shopify__customer')) %}
{% set except_cols = ['_fivetran_synced', 'email', 'source_relation', 'customer_id', 'phone', 'created_at',
'updated_at', 'has_accepted_marketing', 'accepts_marketing_updated_at', 'orders_count', 'total_spent',
'updated_at', 'marketing_consent_updated_at', 'orders_count', 'total_spent',
'is_tax_exempt', 'is_verified_email'] %}
{% for col in cols %}
{% if col.column|lower not in except_cols %}
@@ -0,0 +1,49 @@
with abandoned_checkout as (

select *
from {{ var('shopify_abandoned_checkout') }}

-- "deleted" abandoned checkouts do not appear to have any data tying them to customers,
-- discounts, or products (and should therefore not get joined in) but let's filter them out here
where not coalesce(is_deleted, false)
),

abandoned_checkout_discount_code as (

select *
from {{ var('shopify_abandoned_checkout_discount_code') }}

-- we need the TYPE of discount (shipping, percentage, fixed_amount) to avoid fanning out of joins
-- so filter out records that do not have this
where coalesce(type, '') != ''
),

abandoned_checkout_shipping_line as (

select *
from {{ var('shopify_abandoned_checkout_shipping_line') }}
),

abandoned_checkouts_aggregated as (

select
abandoned_checkout_discount_code.code,
abandoned_checkout_discount_code.type,
abandoned_checkout_discount_code.source_relation,
sum(abandoned_checkout_discount_code.amount) as total_abandoned_checkout_discount_amount,
sum(coalesce(abandoned_checkout.total_line_items_price, 0)) as total_abandoned_checkout_line_items_price,
sum(coalesce(abandoned_checkout_shipping_line.price, 0)) as total_abandoned_checkout_shipping_price

from abandoned_checkout_discount_code
left join abandoned_checkout
on abandoned_checkout_discount_code.checkout_id = abandoned_checkout.checkout_id
and abandoned_checkout_discount_code.source_relation = abandoned_checkout.source_relation
left join abandoned_checkout_shipping_line
on abandoned_checkout_shipping_line.checkout_id = abandoned_checkout_discount_code.checkout_id
and abandoned_checkout_shipping_line.source_relation = abandoned_checkout_discount_code.source_relation

group by 1,2,3
)

select *
from abandoned_checkouts_aggregated
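
To see what this model's aggregation produces, here is a small SQLite sketch run from Python. The three tables are simplified stand-ins for the staging models, and every row is invented. `is_deleted` is stored as 0/1 because SQLite has no boolean type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table abandoned_checkout (
        checkout_id integer, source_relation text, total_line_items_price real, is_deleted integer);
    create table abandoned_checkout_discount_code (
        checkout_id integer, source_relation text, code text, type text, amount real);
    create table abandoned_checkout_shipping_line (
        checkout_id integer, source_relation text, price real);

    insert into abandoned_checkout values
        (1, 'shop_a', 100.0, 0), (2, 'shop_a', 50.0, 0), (4, 'shop_a', 20.0, 1);
    insert into abandoned_checkout_discount_code values
        (1, 'shop_a', 'SAVE10', 'percentage', 10.0),
        (2, 'shop_a', 'SAVE10', 'percentage', 5.0),
        (3, 'shop_a', 'LEGACY', '', 1.0);  -- no type, so it is filtered out
    insert into abandoned_checkout_shipping_line values (1, 'shop_a', 5.0);
""")

aggregated = conn.execute("""
    with checkouts as (
        select * from abandoned_checkout where not coalesce(is_deleted, 0)
    ),
    discount_codes as (
        select * from abandoned_checkout_discount_code where coalesce(type, '') != ''
    )
    select
        discount_codes.code,
        discount_codes.type,
        discount_codes.source_relation,
        sum(discount_codes.amount) as total_abandoned_checkout_discount_amount,
        sum(coalesce(checkouts.total_line_items_price, 0)) as total_abandoned_checkout_line_items_price,
        sum(coalesce(shipping.price, 0)) as total_abandoned_checkout_shipping_price
    from discount_codes
    left join checkouts
        on discount_codes.checkout_id = checkouts.checkout_id
        and discount_codes.source_relation = checkouts.source_relation
    left join abandoned_checkout_shipping_line as shipping
        on shipping.checkout_id = discount_codes.checkout_id
        and shipping.source_relation = discount_codes.source_relation
    group by 1, 2, 3
""").fetchall()

print(aggregated)
```

One thing worth checking against real data: in this join shape, a checkout with more than one shipping line would repeat its discount row, so the summed discount amount assumes at most one shipping line per checkout.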
36 changes: 36 additions & 0 deletions models/intermediate/int_shopify__discounts__order_aggregates.sql
@@ -0,0 +1,36 @@
with order_discount_code as (

select *
from {{ var('shopify_order_discount_code') }}
),

orders as (

select *
from {{ ref('shopify__orders') }}
),

orders_aggregated as (

select
order_discount_code.code,
order_discount_code.type,
order_discount_code.source_relation,
avg(order_discount_code.amount) as avg_order_discount_amount,
sum(order_discount_code.amount) as total_order_discount_amount,
sum(orders.total_line_items_price) as total_order_line_items_price,
sum(orders.shipping_cost) as total_order_shipping_cost,
sum(orders.refund_subtotal + orders.refund_total_tax) as total_order_refund_amount,
Comment on lines +20 to +23
Contributor

I could also see avg being a relevant metric when assessing discounts. Especially when it comes to percent values, I would be really interested to understand the average discount dollar amount. Who knows, maybe a 10% discount code is being used on large orders and is coming out to a super high average. I could see this being a useful metric.

What are your thoughts?

Contributor Author

oh yeah that's a good idea and will be very simple to add!

count(distinct customer_id) as count_distinct_customers,
count(distinct email) as count_distinct_customer_emails

from order_discount_code
join orders
on order_discount_code.order_id = orders.order_id
and order_discount_code.source_relation = orders.source_relation

group by 1,2,3
)

select *
from orders_aggregated
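
The value of the average metric raised in the review comment can be shown with a small SQLite sketch. The tables are trimmed-down stand-ins for `order_discount_code` and `shopify__orders`, and the rows are invented: one 10% code used on orders of very different sizes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table order_discount_code (
        order_id integer, source_relation text, code text, type text, amount real);
    create table orders (
        order_id integer, source_relation text, customer_id integer, total_line_items_price real);

    insert into orders values
        (1, 'shop_a', 100, 900.0),
        (2, 'shop_a', 100, 200.0),
        (3, 'shop_a', 200, 100.0);
    insert into order_discount_code values
        (1, 'shop_a', 'TENOFF', 'percentage', 90.0),
        (2, 'shop_a', 'TENOFF', 'percentage', 20.0),
        (3, 'shop_a', 'TENOFF', 'percentage', 10.0);
""")

row = conn.execute("""
    select
        order_discount_code.code,
        avg(order_discount_code.amount) as avg_order_discount_amount,
        sum(order_discount_code.amount) as total_order_discount_amount,
        sum(orders.total_line_items_price) as total_order_line_items_price,
        count(distinct orders.customer_id) as count_distinct_customers
    from order_discount_code
    join orders
        on order_discount_code.order_id = orders.order_id
        and order_discount_code.source_relation = orders.source_relation
    group by 1
""").fetchone()

print(row)
```

The total alone hides that a single large order contributes most of the discounted dollars; the average makes the typical per-order cost of the code visible.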
@@ -58,7 +58,6 @@ joined as (
join fulfillment
on orders.order_id = fulfillment.order_id
and orders.source_relation = fulfillment.source_relation

left join refunds_aggregated
on refunds_aggregated.order_line_id = order_lines.order_line_id
and refunds_aggregated.source_relation = order_lines.source_relation
16 changes: 15 additions & 1 deletion models/intermediate/intermediate.yml
@@ -26,10 +26,24 @@ models:
combination_of_columns:
- email
- source_relation
- name: int_hopify__inventory_level__aggregates
- name: int_shopify__inventory_level__aggregates
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- variant_id
- location_id
- source_relation
- name: int_shopify__discounts__order_aggregates
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- code
- type
- source_relation
- name: int_shopify__discounts__abandoned_checkouts
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- code
- type
- source_relation
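
Conceptually, a uniqueness test on a combination of columns looks for any group of those columns that occurs more than once. The sketch below shows that idea in SQLite from Python. The helper `duplicate_combinations` and the `discounts` table are hypothetical, and this is only an approximation of what `dbt_utils.unique_combination_of_columns` checks, not its implementation.

```python
import sqlite3

def duplicate_combinations(conn, table, columns):
    """Return every column combination that appears more than once, with its count."""
    cols = ", ".join(columns)
    return conn.execute(
        f"select {cols}, count(*) as n from {table} group by {cols} having count(*) > 1"
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table discounts (code text, type text, source_relation text);
    insert into discounts values
        ('SAVE10', 'percentage',   'shop_a'),
        ('SAVE10', 'fixed_amount', 'shop_a'),
        ('SAVE10', 'percentage',   'shop_a');
""")

grain = ["code", "type", "source_relation"]
duplicates = duplicate_combinations(conn, "discounts", grain)

# Removing the repeated row restores the expected grain.
conn.execute("delete from discounts where rowid = 3")
duplicates_after_fix = duplicate_combinations(conn, "discounts", grain)
```

The same code with a different `type` is allowed, which is why `type` is part of the tested combination.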