Audiobook time tracking #1288
Codecov Report
@@            Coverage Diff             @@
##             main    #1288      +/-   ##
==========================================
+ Coverage   89.82%   89.85%   +0.03%
==========================================
  Files         208      210       +2
  Lines       28549    28655     +106
  Branches     6545     6556      +11
==========================================
+ Hits        25644    25749     +105
  Misses       1893     1893
- Partials     1012     1013       +1
I've taken a first pass through this one, but have NOT really looked at the tests yet. More detail below, but here are a few high-level thoughts:
- We need to be able to report at the collection and library level for the CSV report, so we need that information in the database tables.
- Since the apps will use the presence of an item entry-level link with the time tracking rel to decide whether to perform/report time tracking, we want that link to be present only for books that need tracking. So we need some collection-level or library/collection-level config to indicate whether time tracking should be used.
- So that we can more easily tune them over time, it would be useful to be able to set (or override) the following with script options:
  - How old a time entry timestamp should be before it is processed into the summary table.
  - How old a processed==True entry should be (i.e., how long we want to actively avoid duplicates) before it is removed from the entries table.
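The two tunables in the last bullet could be exposed as script options along these lines. This is only a sketch: the option names, the default values, and the helper functions are hypothetical and are not taken from the PR.

```python
import argparse
from datetime import datetime, timedelta, timezone

# Hypothetical defaults; the thread does not state the real values.
DEFAULT_SUMMARIZE_AFTER_DAYS = 1
DEFAULT_PURGE_AFTER_DAYS = 120


def parse_args(argv=None):
    """Parse the two (hypothetical) age thresholds from the command line."""
    parser = argparse.ArgumentParser(description="Summarize playtime entries.")
    parser.add_argument(
        "--summarize-after-days",
        type=int,
        default=DEFAULT_SUMMARIZE_AFTER_DAYS,
        help="Minimum age of an entry before it is rolled into the summary table.",
    )
    parser.add_argument(
        "--purge-after-days",
        type=int,
        default=DEFAULT_PURGE_AFTER_DAYS,
        help="Minimum age of a processed entry before it is deleted.",
    )
    return parser.parse_args(argv)


def cutoffs(args, now=None):
    """Return (summarize_before, purge_before) as timezone-aware UTC datetimes."""
    now = now or datetime.now(timezone.utc)
    return (
        now - timedelta(days=args.summarize_after_days),
        now - timedelta(days=args.purge_after_days),
    )
```

The summation script would then select unprocessed entries older than the first cutoff and delete processed entries older than the second.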
Just a note to remind us: since #1281 went in and also had a DB migration, we will need to fix the DB migration here before merging.
The api takes bulk playtime entries to insert into the DB
The cron job is slated to run every 12 hours, shifted by 8 to avoid clutter
To run once a month and send a quarterly report
7fbb202 to edf3756
I resolved some conversations and added a few more comments.
Configuration.REPORTING_EMAIL_ENVIRONMENT_VARIABLE: "reporting@test.email"
},
),
# Horrible unbracketted syntax for python 3.8
So glad this is fixed in more recent Python versions! Absolutely hideous and confusing.
core/query/playtime_entries.py
Outdated
if (
    today - entry.during_minute.date()
).days > cls.OLDEST_ACCEPTABLE_ENTRY_DAYS:
    # This will count as a success, since we don't want to repeat the entry
This is reported as a failure, but one that means the client should discard the corresponding entry.
Changed this to count as a failure
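A minimal sketch of the behavior discussed here, where an entry that is too old is reported as a failure the client should not retry. The threshold value, the `EntryResult` shape, the status codes, and the messages are all assumptions made for illustration, not the PR's actual code.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone

# Hypothetical threshold; the real OLDEST_ACCEPTABLE_ENTRY_DAYS is not shown in this thread.
OLDEST_ACCEPTABLE_ENTRY_DAYS = 120


@dataclass
class EntryResult:
    entry_id: str
    status: int  # per-entry HTTP-style status (assumed codes)
    message: str


def check_entry_age(entry_id: str, during_minute: datetime, today: date) -> EntryResult:
    """Reject entries older than the threshold instead of silently accepting them."""
    age_days = (today - during_minute.date()).days
    if age_days > OLDEST_ACCEPTABLE_ENTRY_DAYS:
        # Counted as a failure. The assumed 410 tells the client to discard
        # the entry rather than resend it.
        return EntryResult(entry_id, 410, "Time entry too old to be processed")
    return EntryResult(entry_id, 201, "Created")
```

Using a distinct status for "too old" lets the client tell a permanent rejection apart from a transient error that is worth retrying.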
I did originally use collection id, but found that every other collection
based route on the CM, which are a lot, uses collection name. To keep it
consistent I also used the name.
I'm not privy as to why that's the standard though.
…On Thu, Aug 3, 2023, 10:03 Tim DiLauro ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In api/opds.py
<#1288 (comment)>
:
> + href=self.url_for(
+ "track_playtime_events",
+ identifier_type=identifier.type,
+ identifier=identifier.identifier,
+ library_short_name=self.library.short_name,
+ _external=True,
The route we're getting the URL for here is
"/playtimes/<collection_name>/<identifier_type>/<path:identifier>", so
the href should be something like:
href=self.url_for(
"track_playtime_events",
collection_name=active_license_pool.collection.name,
identifier_type=identifier.type,
identifier=identifier.identifier,
_external=True,
),
But I think it might be better -- for a couple of reasons -- if we used
the id of the collection, rather than the name:
- A collection name can have spaces, which add to the messiness of the
URLs (though, this is also the case with identifiers, especially the id
type, as well).
- A collection name can be changed at any time (for example, when we
notice a typo or extra spaces in the name).
------------------------------
In api/routes.py
<#1288 (comment)>
:
> @@ -672,6 +672,18 @@ def track_analytics_event(identifier_type, identifier, event_type):
)
***@***.***_route(
+ "/playtimes/<collection_name>/<identifier_type>/<path:identifier>", methods=["POST"]
It might be better to use collection_id, rather than collection_name, for
the first component of the path, since it is less likely to change.
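To make the name-versus-id trade-off concrete, here is a small standard-library sketch that builds both path variants. The helper names are hypothetical, and the real application would build these URLs with url_for rather than by hand.

```python
from urllib.parse import quote


def playtime_path_by_name(collection_name: str, identifier_type: str, identifier: str) -> str:
    """Build the path keyed by collection name; spaces must be percent-encoded."""
    return "/playtimes/{}/{}/{}".format(
        quote(collection_name, safe=""),
        quote(identifier_type, safe=""),
        quote(identifier, safe="/"),  # the route uses <path:identifier>, so keep slashes
    )


def playtime_path_by_id(collection_id: int, identifier_type: str, identifier: str) -> str:
    """Build the path keyed by collection id; stable even if the name is edited."""
    return "/playtimes/{}/{}/{}".format(
        int(collection_id),
        quote(identifier_type, safe=""),
        quote(identifier, safe="/"),
    )
```

A renamed collection changes every URL produced by the first helper, while URLs from the second stay valid for as long as the row exists.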
I'd still suggest using the |
I think this is looking pretty good. I’m going to do some more testing once I wake up. And, of course, we’ll need to resolve the collection name vs. id issue.
One more fairly minor comment below.
logging.getLogger("TimeTracking").error(
    f"An incorrect timezone was received for a playtime ({value.tzname()})."
)
raise ValueError("Timezone MUST be UTC always")
Raising ValueError here causes the entire request to fail, even if only one timeEntry of many has an error. Ideally, we'd return this as a 400 response for just this entry.
I think it's unlikely for this to happen to one among many, so I think we can address this later.
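One way the per-entry handling could look later, as a sketch only: validate each entry separately and record a 400 for the bad one while the rest proceed. The tuple input, the response dictionaries, and the use of utcoffset() (the PR's validator inspects tzname()) are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def validate_entries(entries):
    """Split entries into accepted ones and a per-entry response list.

    `entries` is a list of (entry_id, during_minute) tuples (assumed shape).
    """
    accepted, responses = [], []
    for entry_id, during_minute in entries:
        offset = during_minute.utcoffset()
        if offset is None or offset != timedelta(0):
            # Only this entry fails; the rest of the request is still processed.
            responses.append(
                {"id": entry_id, "status": 400, "message": "Timezone MUST be UTC always"}
            )
            continue
        accepted.append((entry_id, during_minute))
        responses.append({"id": entry_id, "status": 201, "message": "Created"})
    return accepted, responses
```

The per-entry response list is the kind of body that a multi-status reply for the bulk request would carry.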
api/model/time_tracking.py
Outdated
class PlaytimeEntriesPost(CustomBaseModel):
    time_entries: List[PlaytimeTimeEntry] = Field(description="A List of time entries")
The book_id and library_id from the spec are missing here.
I pushed a commit to fix this.
This looks good to go! 🎈🎉
It'll be good to start testing against this with the apps!
@RishiDiwanTT I'm going to go ahead and merge this, so that we can get it deployed out for testing. Thanks for all your work on this!
Description
New tables IdentifierPlaytimeEntries and IdentifierPlaytimes have been added.
The time tracking API has been added as POST /playtimes/<type>/<identifier>/.
A new environment variable SIMPLIFIED_REPORTING_EMAIL has been added for the temporary requirement of emailing reports via a cron job; this will require a deployment change.
Playtime aggregations run every 12 hours.
Playtime reporting occurs on the 2nd of every month.
Cannot add the API spec for now since HTTP 207 is not supported by flask_pydantic_spec. Bug ticket here.
Motivation and Context
We need to track the total amount of time an audiobook is playing on a user’s device. The apps should send this information to a remote server every 1 minute.
JIRA
How Has This Been Tested?
Manually ran the APIs and summation jobs.
Emailed a local smtp server with the CSV report.
Checklist