Server - Event Logging - Database - Generate Database #780

Open
data-sync-user opened this issue Oct 18, 2024 · 1 comment

@data-sync-user
Collaborator

Generate the Message Tracking Event database.

Per https://mozilla-hub.atlassian.net/browse/SYNC-4325?focusedCommentId=946578, there are two data storage systems in play: a Redis-like storage system (probably GCP Memorystore), and a modification to the existing Bigtable schema to add a new “reliability” column family (set to maxage=60 or maxversions=1). The new column family would contain the milestone log messages. The 60-day expiration is twice the maximum message age, which should be more than long enough for any sort of large observation.

┆Issue is synchronized with this Jira Task

@data-sync-user
Collaborator Author

➤ JR Conlin commented:

Going to use this ticket to collect various comments and actions around the creation of the Reliability Data store system. (cc: Eric Maydeck, Rachael Crook, Philip Jenvey, Taddes Korris)

https://mozilla-hub.atlassian.net/browse/SYNC-4325?focusedCommentId=946578 contains the discussion of the schema and system design.

Expected Load

We are only tracking the subset of messages that have a known Public Key (FxA tab operations). It’s worth noting that the percentage of total messages that meet this criterion is unknown, but it is less than 100%.

Operations

Each tracked message will create a Redis pipeline command set containing the following:

  • Increment the new state: hincrby "counts" $milestone 1
  • Decrement the previous state (unless at the initial state): hincrby "counts" $milestone -1, where $milestone is the previous milestone
  • Remove the prior expiration pointer (indicating when and at what milestone a given reliability_id is): zrem "expiry" "$milestone#$reliability_id", again using the previous milestone
  • Add an expiration pointer for the new milestone: zadd "expiry" $expiration "$milestone#$reliability_id" (Redis expects the score before the member)
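The command set above can be sketched as a small Python helper. This is only an illustration under stated assumptions, not the Autopush implementation: the function names `transition_commands` and `queue_on` and the `prev_milestone` parameter are hypothetical, and the commands use the real Redis spellings (`HINCRBY`, and `ZADD` with the score before the member). The `pipe` argument is assumed to be any object with an `execute_command(*args)` method, such as a redis-py pipeline.

```python
from typing import List, Optional, Tuple

Command = Tuple[str, ...]


def transition_commands(
    reliability_id: str,
    new_milestone: str,
    expiration: int,
    prev_milestone: Optional[str] = None,  # hypothetical: None means initial state
) -> List[Command]:
    """Build the Redis commands for moving one message to a new milestone."""
    # Increment the count for the new state.
    commands: List[Command] = [("HINCRBY", "counts", new_milestone, "1")]
    if prev_milestone is not None:
        # Decrement the previous state and drop its expiration pointer.
        commands.append(("HINCRBY", "counts", prev_milestone, "-1"))
        commands.append(("ZREM", "expiry", f"{prev_milestone}#{reliability_id}"))
    # Add the expiration pointer for the new milestone (score, then member).
    commands.append(
        ("ZADD", "expiry", str(expiration), f"{new_milestone}#{reliability_id}")
    )
    return commands


def queue_on(pipe, commands: List[Command]) -> None:
    """Queue each command on a pipeline-like object (assumed interface)."""
    for command in commands:
        pipe.execute_command(*command)
```

Building the commands as plain tuples keeps the milestone logic testable without a running Redis server; the caller decides when to execute the pipeline.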

In addition, the app will record each milestone to Bigtable. In the row identified by the reliability_id, it writes a cell whose qualifier is the milestone and whose value is the timestamp. If a message fails after it has been accepted, the app also writes a cell marked error describing the failure, which indicates that the message was lost.

An example of a JSON-formatted Bigtable entry for a message that expired while in storage might look like:

{"DEADBEEF...":{"received":17285143030001,"stored":17285143030010,"expired":1728589102, "error":"expired"}}

where a successfully transmitted message may look like:

{"DEADBEEF...":{"received":17285143030001,"transmitted":17285143030010,"accepted":1728589102}}
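As a rough sketch of how the JSON views above relate to the cells described earlier, the following Python models a row as a mapping from the reliability_id to its milestone cells, plus an optional error cell. It does not talk to Bigtable; the function name `reliability_row` is hypothetical, and the timestamps are copied from the examples above without interpreting their units.

```python
import json
from typing import Dict, Optional, Union

Cells = Dict[str, Union[int, str]]


def reliability_row(
    reliability_id: str,
    milestones: Dict[str, int],
    error: Optional[str] = None,
) -> Dict[str, Cells]:
    """Model one row: milestone qualifiers map to timestamps, plus an optional error cell."""
    cells: Cells = dict(milestones)
    if error is not None:
        # Only failed messages carry an "error" cell.
        cells["error"] = error
    return {reliability_id: cells}


# The expired-in-storage example from the text.
row = reliability_row(
    "DEADBEEF...",
    {"received": 17285143030001, "stored": 17285143030010, "expired": 1728589102},
    error="expired",
)
print(json.dumps(row))
```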

(Question: I am debating whether we should add cells for the message TTL as well as the total time a message spent in transit. I’m not sure of the general utility of those, though: a message with a “too long” TTL isn’t particularly interesting, a message with a “too short” TTL would just expire in transit, and the total time a message spent in the system could be determined roughly by looking at the timestamps.)

Because Redis has no way to automatically decrement counters based on some expiration criteria, we need a “reaper” process. The reaper looks in the "expiry" sorted set for entries added by zadd whose expiration is less than the current timestamp, decrements the "counts" for that $milestone, and records the fate of the message in Bigtable’s log.

Reaper

While it’s possible to create reaper processes within the Autopush applications that use a complex set of lock deciders, I believe it’s simpler to have the reaper be a single, reasonably simple, external application that regularly checks the Redis storage for expired records, adjusts the counts, and logs the fate of the messages to Bigtable. (No cleanup for Bigtable is required, since records will automatically age out.)
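One pass of such a reaper might look like the following sketch. It is an in-memory model only, under several assumptions: a plain dict stands in for the "expiry" sorted set (member to expiration score), another dict stands in for the "counts" hash, and the function name `reap` is hypothetical. Removing the expired pointer is also an assumption; the description above mentions only adjusting counts and logging. A real reaper would presumably issue the equivalent Redis commands and write the returned fates to Bigtable.

```python
from typing import Dict, List, Tuple


def reap(
    expiry: Dict[str, int],  # models the sorted set: "$milestone#$reliability_id" -> expiration
    counts: Dict[str, int],  # models the "counts" hash: milestone -> live message count
    now: int,
) -> List[Tuple[str, str]]:
    """Expire every pointer older than `now`; return (reliability_id, milestone) fates."""
    # Collect expired members first (oldest first) so the dict is not mutated mid-iteration.
    expired = sorted(
        (member for member, score in expiry.items() if score < now),
        key=lambda member: expiry[member],
    )
    fates: List[Tuple[str, str]] = []
    for member in expired:
        milestone, reliability_id = member.split("#", 1)
        del expiry[member]  # assumed cleanup, analogous to zrem on "expiry"
        counts[milestone] = counts.get(milestone, 0) - 1  # analogous to a -1 increment
        fates.append((reliability_id, milestone))  # caller would log these to Bigtable
    return fates
```

Running this as a single external process, as proposed, avoids the need for any locking between reapers, since only one caller ever decrements counts for expired entries.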
