This repository has been archived by the owner on Dec 4, 2024. It is now read-only.

Package tidy up #53

Merged
merged 5 commits into from
Jun 16, 2020
2 changes: 1 addition & 1 deletion .circleci/config.yml
@@ -4,7 +4,7 @@ version: 2
 jobs:
   build:
     docker:
-      - image: circleci/python:3.6.2-stretch
+      - image: circleci/python:3.6.3-stretch
       - image: circleci/postgres:9.6.5-alpine-ram

     steps:
7 changes: 6 additions & 1 deletion .github/issue_template/bug_report.md
@@ -54,5 +54,10 @@ If applicable, add screenshots or log output to help explain your problem.

 ### Additional context
 <!---
-Add any other context about the problem here.
+Add any other context about the problem here. For example, if you think you know which line of code is causing the issue.
 --->
+
+### Are you interested in contributing the fix?
+<!---
+Let us know if you want to contribute the fix, and whether you would need a hand getting started.
+--->
5 changes: 5 additions & 0 deletions .github/issue_template/feature_request.md
@@ -18,3 +18,8 @@ Is this feature database-specific? Which database(s) is/are relevant? Please inc

 ### Who will this benefit?
 What kind of use case will this feature be useful for? Please be specific and provide examples, this will help us prioritize properly.
+
+### Are you interested in contributing this feature?
+<!---
+Let us know if you want to contribute the feature, and whether you would need a hand getting started.
+--->
26 changes: 13 additions & 13 deletions README.md
@@ -5,21 +5,20 @@ This [dbt package](https://docs.getdbt.com/docs/package-management):


 ## Installation instructions
-
-1. Include this package in your `packages.yml` -- check [here](https://hub.getdbt.com/fishtown-analytics/segment/latest/)
+New to dbt packages? Read more about them [here](https://docs.getdbt.com/docs/building-a-dbt-project/package-management/).
+1. Include this package in your `packages.yml`; check [here](https://hub.getdbt.com/fishtown-analytics/segment/latest/)
 for installation instructions.
 2. Run `dbt deps`
-3. Include the following in your `dbt_project.yml` directly within your
-`models:` block (making sure to handle indenting appropriately). **Update the value to point to your segment page views table**.
+3. Include the following in your `dbt_project.yml` directly within your `vars:` block (making sure to handle indenting appropriately). **Update the value to point to your segment page views table**.

 ```YAML
 # dbt_project.yml
+config-version: 2
 ...

-models:
+vars:
   segment:
-    vars:
-      segment_page_views_table: "{{ source('segment', 'pages') }}"
+    segment_page_views_table: "{{ source('segment', 'pages') }}"

 ```
 This package assumes that your data is in a structure similar to the test
@@ -32,15 +31,16 @@ out bad records, do this in an upstream model.
 for more details:
 ```yaml
 # dbt_project.yml
+config-version: 2
+
 ...

-models:
+vars:
   segment:
-    vars:
-      segment_page_views_table: "{{ source('segment', 'pages') }}"
-      segment_sessionization_trailing_window: 3
-      segment_inactivity_cutoff: 30 * 60
-      segment_pass_through_columns: []
+    segment_page_views_table: "{{ source('segment', 'pages') }}"
+    segment_sessionization_trailing_window: 3
+    segment_inactivity_cutoff: 30 * 60
+    segment_pass_through_columns: []

 ```
 5. Execute `dbt seed` -- this project includes a CSV that must be seeded for it
1 change: 0 additions & 1 deletion data/referrer_mapping.csv
@@ -55,7 +55,6 @@ social,Hocam.com,hocam.com
 social,Hyves,hyves.nl
 social,Taringa!,taringa.net
 social,Classmates,classmates.com
-social,Pinterest,pinterest.com
 social,Paper.li,paper.li
 social,Twitter,twitter.com
 social,Twitter,t.co
18 changes: 18 additions & 0 deletions data/seeds.yml
@@ -0,0 +1,18 @@
+version: 2
+
+seeds:
+  - name: referrer_mapping
+    description: "This is a CSV version of Snowplow's [referer parser database](https://github.com/snowplow-referer-parser/referer-parser)"
+    columns:
+      - name: medium
+        tests:
+          - not_null
+
+      - name: source
+        tests:
+          - not_null
+
+      - name: host
+        tests:
+          - unique
+          - not_null
43 changes: 21 additions & 22 deletions dbt_project.yml
@@ -1,5 +1,7 @@
 name: 'segment'
 version: '1.0'
+require-dbt-version: ">=0.17.0"
+config-version: 2

 source-paths: ["models"]
 analysis-paths: ["analysis"]
@@ -9,29 +11,26 @@ macro-paths: ["macros"]

 target-path: "target"
 clean-targets:
-    - "target"
-    - "dbt_modules"
+  - "target"
+  - "dbt_modules"

-require-dbt-version: ">=0.14.0"
+vars:
+  # location of raw data table
+  segment_page_views_table:

-models:
-  vars:
-    # location of raw data table
-    segment_page_views_table:
+  # number of trailing hours to re-sessionize for.
+  # events can come in late and we want to still be able to incorporate
+  # them into the definition of a session without needing a full refresh.
+  segment_sessionization_trailing_window: 3

-    # number of trailing hours to re-sessionize for.
-    # events can come in late and we want to still be able to incorporate
-    # them into the definition of a session without needing a full refresh.
-    segment_sessionization_trailing_window: 3
-
-    # sessionization inactivity cutoff: if there is a gap in page view times
-    # that exceeds this number of seconds, the subsequent page view will
-    # start a new session.
-    segment_inactivity_cutoff: 30 * 60
+  # sessionization inactivity cutoff: if there is a gap in page view times
+  # that exceeds this number of seconds, the subsequent page view will
+  # start a new session.
+  segment_inactivity_cutoff: 30 * 60

-    # If there are extra columns you wish to pass through this package,
-    # define them here. Columns will be included in the `segment_web_sessions`
-    # model as `first_<column>` and `last_<column>`. Extremely useful when
-    # using this package on top of unioned Segment sources, as you can then
-    # pass through a column indicating which source the data is from.
-    segment_pass_through_columns: []
+  # If there are extra columns you wish to pass through this package,
+  # define them here. Columns will be included in the `segment_web_sessions`
+  # model as `first_<column>` and `last_<column>`. Extremely useful when
+  # using this package on top of unioned Segment sources, as you can then
+  # pass through a column indicating which source the data is from.
+  segment_pass_through_columns: []
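
Reviewer note on `segment_inactivity_cutoff`: the rule described in the comments above can be sketched in Python for readers who want to see it applied to a handful of timestamps. This is an illustration only and not the package's SQL. The function name `assign_session_numbers` and the 1-based numbering are assumptions made for the example; the only details taken from the diff are the default of `30 * 60` seconds and the rule that a gap exceeding the cutoff starts a new session.

```python
from datetime import datetime, timedelta


def assign_session_numbers(page_view_times, inactivity_cutoff=30 * 60):
    """Return a session number for each page view timestamp.

    `page_view_times` must already be sorted ascending for one visitor.
    Numbering starts at 1 (an assumption made for this sketch).
    """
    session_numbers = []
    current_session = 0
    previous = None
    for tstamp in page_view_times:
        # The first page view, or a gap strictly longer than the cutoff,
        # opens a new session.
        if previous is None or (tstamp - previous).total_seconds() > inactivity_cutoff:
            current_session += 1
        session_numbers.append(current_session)
        previous = tstamp
    return session_numbers


# Hypothetical page views: gaps of 10, 35 and 5 minutes.
start = datetime(2020, 6, 16, 12, 0, 0)
views = [
    start,
    start + timedelta(minutes=10),
    start + timedelta(minutes=45),
    start + timedelta(minutes=50),
]
print(assign_session_numbers(views))  # [1, 1, 2, 2]
```

Under this reading, a gap of exactly 30 minutes does not exceed the cutoff and so stays in the same session.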
10 changes: 6 additions & 4 deletions integration_tests/dbt_project.yml
@@ -1,10 +1,12 @@
-
 name: 'segment_integration_tests'
 version: '1.0'
+config-version: 2

 profile: 'integration_tests'

-models:
+vars:
   segment:
-    vars:
-      segment_page_views_table: "{{ ref('example_segment_pages') }}"
+    segment_page_views_table: "{{ ref('example_segment_pages') }}"
+
+seeds:
+  +quote_columns: false
Original file line number Diff line number Diff line change
@@ -130,7 +130,7 @@ session_ids as (

     {{dbt_utils.star(ref('segment_web_page_views'))}},
     page_view_number,
-    {{dbt_utils.surrogate_key('anonymous_id', 'session_number')}} as session_id
+    {{dbt_utils.surrogate_key(['anonymous_id', 'session_number'])}} as session_id

     from session_numbers

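
Reviewer note: the one-line change above passes the two column names to `dbt_utils.surrogate_key` as a single list instead of as separate positional arguments. Conceptually, a surrogate key is a hash computed over the listed fields. A rough Python analogue follows; the MD5 hash, the `-` separator, and the treatment of missing values as empty strings are assumptions for illustration and may not match the SQL the macro actually generates. The Python function `surrogate_key` and the sample values are hypothetical.

```python
import hashlib


def surrogate_key(field_values):
    """Hash a list of field values into one hex string (illustrative only)."""
    # Treat missing values as empty strings so a key is always produced.
    parts = ["" if value is None else str(value) for value in field_values]
    return hashlib.md5("-".join(parts).encode("utf-8")).hexdigest()


# Hypothetical values for anonymous_id and session_number.
session_id = surrogate_key(["anon-123", 2])
print(session_id)
```

The same inputs always produce the same key, which is what lets `session_id` be recomputed consistently across runs.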
2 changes: 1 addition & 1 deletion packages.yml
@@ -1,3 +1,3 @@
 packages:
   - package: fishtown-analytics/dbt_utils
-    version: '>=0.1.20'
+    version: [">=0.3.0", "<0.5.0"]
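
Reviewer note: the new constraint is a list of two bounds that must both hold, so it admits `dbt_utils` versions from 0.3.0 up to, but not including, 0.5.0. A minimal Python sketch of that reading follows. It is not dbt's resolver, it assumes plain numeric `MAJOR.MINOR.PATCH` versions, and `parse_version` and `satisfies` are hypothetical helper names.

```python
import operator

# Longer symbols come first so ">=" is not mistaken for ">".
OPERATORS = [
    (">=", operator.ge),
    ("<=", operator.le),
    (">", operator.gt),
    ("<", operator.lt),
    ("=", operator.eq),
]


def parse_version(text):
    """Turn '0.3.0' into (0, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, specs):
    """Return True only if `version` meets every bound in `specs`."""
    candidate = parse_version(version)
    for spec in specs:
        for symbol, compare in OPERATORS:
            if spec.startswith(symbol):
                if not compare(candidate, parse_version(spec[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unrecognised version spec: {spec!r}")
    return True


DBT_UTILS_RANGE = [">=0.3.0", "<0.5.0"]
print(satisfies("0.4.0", DBT_UTILS_RANGE))  # True
print(satisfies("0.5.0", DBT_UTILS_RANGE))  # False
```

Comparing parsed tuples rather than raw strings matters here: as strings, `'0.1.20'` would sort after `'0.1.3'`, whereas the tuple `(0, 1, 20)` compares correctly.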