Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix/first reply time update #59

Merged
108 changes: 106 additions & 2 deletions DECISIONLOG.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,114 @@
### Zendesk Backlog Tickets
# Decision Log


## Zendesk Backlog Tickets
- You may find some discrepancies between what Zendesk reports and our model the total number of backlog tickets on a given day. After investigating this we have realized this is due to Zendesk taking a snapshot of each day sometime in the 23rd hour as stated in their [article.](https://support.zendesk.com/hc/en-us/articles/4408819342490-Why-does-the-Backlog-dataset-only-show-the-Backlog-recorded-Hour-as-23-).

```
Because backlog data is captured on a per-day basis, it cannot be segmented hourly. The Backlog recorded - Hour is listed as 23 because data is captured daily between 11 pm, 12 am, or 1 am depending on factors like Daylight Saving Time (DST).
For more information, see the article: Analyzing your ticket backlog history with Explore.
```

- While Zendesk doesn't segment their backlog data per hour, on the other hand we always try to model our data starting at a greater granularity. This means we start by taking the _hour_ from the timestamp field from the Zendesk source tables then bringing it to _day_. Therefore there will be edge cases where tickets updated near the end of day may fall into different statuses, depending on whether you're looking at the Zendesk Backlog dashboard or our model outputs.

## Business Time Metrics
When developing this package we noticed Zendesk reported ticket response times in business minutes based on the last schedule which is applied to the ticket. However, we felt this is not an accurate representation of the true ticket elapsed time in business minutes. Therefore, we took the opinionated decision to apply logic within our transformations to calculate the cumulative elapsed time in business minutes of a ticket across **all** schedules which the ticket was assigned during it's lifetime.

Below is a quick explanation of how this is calculated within the dbt package for **first_reply_time_business_minutes** as well as how this differs from Zendesk's logic:
> Note: While this is an example of `first_reply_time_business_minutes`, the logic is the same for other business minute metrics.

- A ticket (`941606`) is created on `2020-09-29 17:01:38 UTC` and first solved at `2020-10-01 15:03:44 UTC`.
- When the ticket was created it was assigned the schedule `Level 1 Chicago`
- The schedule intervals are expressed as the number of minutes since the start of the week.
- Sunday is considered the start of the week.
- The `Level 1 Chicago` schedule can be interpreted as the following:

| **start_time_utc** | **end_time_utc** |
| ------------------ | ----------------- |
| 720 | 1560 |
| 2160 | 3000 |
| 3600 | 4440 |
| 5040 | 5880 |
| 6480 | 7320 |
| 7920 | 8760 |
| 9360 | 10200 |

- Looking closer into the ticket, we also see another schedule `Level 2 San Francisco` was assigned to the ticket on `2020-09-30 19:01:25 UTC`
- The `Level 2 San Francisco` schedule can be interpreted as the following:

| **start_time_utc** | **end_time_utc** |
| ------------------ | ----------------- |
| 2340 | 2910 |
| 3780 | 4350 |
| 5220 | 5790 |
| 6660 | 7230 |
| 8100 | 8670 |

- Now that we know the ticket had two schedules, let's see the comments exchanged within this ticket to capture when the `first_reply_time` was recorded.

| **ticket_id** | **field_name** | **is_public** | **commenter_role** | **valid_starting_at** |
| ------------- | -------------- | ------------- | ------------------ | --------------------- |
| 941606 | comment | TRUE | external_comment | 2020-09-29 17:01:38 UTC |
| 941606 | comment | FALSE | internal_comment | 2020-09-30 19:01:25 UTC |
| 941606 | comment | TRUE | internal_comment | 2020-09-30 19:01:46 UTC |
| 941606 | comment | TRUE | internal_comment | 2020-10-01 15:03:44 UTC |

- Seeing the comments made to the ticket, we understand that the customer commented on the ticket at `2020-09-29 17:01:38 UTC` and the first **public** internal comment was made at `2020-09-30 19:01:46 UTC`.
- In comparison of the two schedules associated with this ticket, we can see that the `Level 1 Chicago` schedule was set for almost the entire duration of the ticket before the first reply. Whereas, the `Level 2 San Francisco` schedule was only set for 21 seconds.
- Regardless, we will be using both schedules in the calculation of the `first_reply_time_business_minutes`.
- Now that we have the schedules, the schedule intervals, and the first_reply_time we can calculate the total elapsed `first_reply_time_business_minutes`. But, let's first convert the UTC timestamps to the Zendesk-esque intervals expressed within the schedules:
> The `Interval Results` are calculate via: `(Full Days From Sunday * 24 * 60) + (Hours * 60) + Minutes`

| **Action** | **Timestamp** | **Full Days from Sunday** | **Hours** | **Minutes** | **Interval Result** |
| ---------- | ------------- | ------------------------- | --------- | ----------- | ------------------- |
| Ticket Created and Schedule set to Level 1 Chicago | `Tuesday, September 29, 2020 at 5:01:38 PM` | 2 | 17 | 2 | 3902 |
| Schedule changed to Level 2 San Francisco | `Wednesday, September 30, 2020 at 7:01:25 PM` | 3 | 19 | 1.25 | 5461.25 |
| First Public Internal Comment | `Wednesday, September 30, 2020 at 7:01:46 PM` | 3 | 19 | 1.46 | 5461.46 |

- With the Interval Results obtained above, we can see where these overlap within the schedules.

**Level 1 Chicago**
> Overlap was from 3902 to 5461.25 and falls within two intervals

| **start_time_utc** | **end_time_utc** |
| ------------------ | ----------------- |
| 720 | 1560 |
| 2160 | 3000 |
| >**3600**< | >**4440**< |
| >**5040**< | >**5880**< |
| 6480 | 7320 |
| 7920 | 8760 |
| 9360 | 10200 |

**Level 2 San Francisco**
> Only overlap was from 5461.25 to 5461.46 and falls within one interval

| **start_time_utc** | **end_time_utc** |
| ------------------ | ----------------- |
| 2340 | 2910 |
| 3780 | 4350 |
| >**5220**< | >**5790**< |
| 6660 | 7230 |
| 8100 | 8670 |

- Now let's figure out the overlapping duration

| **Schedule** | **Schedule start_time_utc** | **Schedule end_time_utc** | **Ticket Start** | **Ticket End** | **Difference** |
|----| ------------------ | -----------------| ------------------ | -----------------| ------------------ |
| `Level 1 Chicago` | 3600 | >**4440**< | >**3902**< | 5461.25 | 538 |
| `Level 1 Chicago` | >**5040**< | 5880 | 3902 | >**5461.25**< | 421.25 |
| `Level 2 San Francisco` | >**5220**< (We use **5461.25** to account for overlap) | 5790 | 5462 | >**5461.46**< | .21 |

- Adding the differences above we arrive at a total `first_reply_time_business_minutes` of 959.46 minutes.

- So how does Zendesk calculate this?
- Instead of taking into account the various schedules used by the ticket, Zendesk will instead use the **last** schedule applied to the ticket to record the duration in business minutes.
- Therefore, in the example above Zendesk will **only** use the `Level 2 San Francisco` schedule when calculating the `first_reply_time_business_minutes` for ticket `941606`.
- Below is an example of how Zendesk calculates this:

| **Schedule** | **Schedule start_time_utc** | **Schedule end_time_utc** | **Ticket Start** | **Ticket End** | **Difference** |
|----| ------------------ | -----------------| ------------------ | -----------------| ------------------ |
| `Level 2 San Francisco` | 3780 | >**4350**< | 3902 | 5461.46 | 448 |
| `Level 2 San Francisco` | >**5220**< | 5790 | 3902 | 5461.46 | 241.46 |

- Adding the differences above we arrive at a total `first_reply_time_business_minutes` of 689.46 minutes.
4 changes: 3 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -137,7 +137,9 @@ vars:
zendesk:
ticket_field_history_timeframe_years: integer_number_of_years # default = 50 (everything)
```

## Opinionated Modelling Decisions
### Changing the Business Time Metrics Logic
This dbt package takes an opinionated stance on how business time metrics are calculated. The dbt package takes **all** schedules into account when calculating the business time duration. Whereas, the Zendesk UI logic takes into account **only** the latest schedule assigned to the ticket. If you would like a deeper explanation of the logic used by default in the dbt package you may reference the [DECISIONLOG](/DECISIONLOG.md).
## Database support
This package is compatible with BigQuery, Snowflake, Redshift and Postgres.

Expand Down
16 changes: 13 additions & 3 deletions models/intermediate/int_zendesk__ticket_schedules.sql
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,16 @@ with ticket as (
from schedule_events

)

select *
from ticket_schedules
select
--The below conditional is only used for testing purposes.
{% if var('using_zendesk_business_metrics_logic',false) %}
ticket_id,
max(schedule_id) as schedule_id,
min(schedule_created_at) as schedule_created_at,
max(schedule_invalidated_at) as schedule_invalidated_at
from ticket_schedules
group by 1
{% else %}
*
from ticket_schedules
{% endif %}
3 changes: 2 additions & 1 deletion models/reply_times/int_zendesk__comments_enriched.sql
Original file line number Diff line number Diff line change
Expand Up @@ -48,5 +48,6 @@ with ticket_comment as (

select
*,
first_value(valid_starting_at) over (partition by ticket_id order by valid_starting_at desc, ticket_id rows unbounded preceding) as last_comment_added_at
first_value(valid_starting_at) over (partition by ticket_id order by valid_starting_at desc, ticket_id rows unbounded preceding) as last_comment_added_at,
sum(case when not is_public then 1 else 0 end) over (partition by ticket_id order by valid_starting_at rows between unbounded preceding and current row) as previous_internal_comment_count
from add_previous_commenter_role
14 changes: 8 additions & 6 deletions models/reply_times/int_zendesk__ticket_reply_times.sql
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ with ticket_public_comments as (
valid_starting_at as end_user_comment_created_at,
ticket_created_date,
commenter_role,
previous_internal_comment_count,
previous_commenter_role = 'first_comment' as is_first_comment
from ticket_public_comments
where (commenter_role = 'external_comment'
Expand All @@ -21,14 +22,15 @@ with ticket_public_comments as (

select
end_user_comments.ticket_id,
-- If the commentor was internal and a first comment then we want the ticket created date to be the end user comment created date
-- If the commentor was internal, a first comment, and had previous non public internal comments then we want the ticket created date to be the end user comment created date
-- Otherwise we will want to end user comment created date
case when end_user_comments.is_first_comment
then end_user_comments.ticket_created_date
else end_user_comments.end_user_comment_created_at
end as end_user_comment_created_at,
case when is_first_comment then end_user_comments.ticket_created_date else end_user_comments.end_user_comment_created_at end as end_user_comment_created_at,
end_user_comments.is_first_comment,
min(agent_comments.valid_starting_at) as agent_responded_at
min(case when is_first_comment
and end_user_comments.commenter_role != 'external_comment'
and (end_user_comments.previous_internal_comment_count > 0)
then end_user_comments.end_user_comment_created_at
else agent_comments.valid_starting_at end) as agent_responded_at
from end_user_comments
left join ticket_public_comments as agent_comments
on agent_comments.ticket_id = end_user_comments.ticket_id
Expand Down