-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ref(releasehealth): Implement get_release_sessions_time_bounds in metrics backend [INGEST-249] #28824
Conversation
…-time-bounds-metrics
src/sentry/releasehealth/metrics.py
Outdated
) | ||
rows.extend( | ||
raw_snql_query( | ||
terminal_sessions_query, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this function is too long, can we maybe break it down to smaller components.. feel like it there are atleast a couple of independant logical blocks here that can be broken down
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just now tried to do this locally and end up with basically this, which I dislike:
def _get_time_bounds_for_initialized_sessions(where: List[Union[BooleanCondition, ...]], org_id: int) -> List[Dict[str, str]]:
def _get_time_bounds_for_terminal_sessions(where: List[Union[BooleanCondition, ...]], org_id: int) -> List[Dict[str, str]]:
I feel like splitting out those separate logical blocks only makes sense if you're going to reuse them, but those queries make no sense outside of this context. I really hope we don't reuse them and instead manage to do one query that is as simple as your original one, sometime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ahmedetefy is there anything specific that's maybe hard to understand that we should split out and document as a sep. function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, so to me the initial sessions query and the terminal sessions query logic seems close enough to me that I would just extract the logic that generates the query for either initial or sessions into its own function and just call that function twice here once for the initial and one for the terminal to return the actual query (could potentially be simple enough that it takes a flag param initial or terminal, and return the appropriate query)
It will make the function much easier to read and follow
Or even one level deeper make the function just generate the query based on inputs like EntityKey, column, etc ...
but the implementation is up to you.. just throwing ideas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm already deduplicating most of the logic by pre-constructing a where
variable that I use in both queries so I'll skip this for now. I think with those kinds of refactors you'll just have to try it out and see if you like the end result, and I couldn't come up with a version I liked.
src/sentry/releasehealth/base.py
Outdated
@@ -23,13 +23,27 @@ class CurrentAndPreviousCrashFreeRate(TypedDict): | |||
CurrentAndPreviousCrashFreeRates = Mapping[ProjectId, CurrentAndPreviousCrashFreeRate] | |||
|
|||
|
|||
class _TimeBounds(TypedDict): | |||
sessions_lower_bound: FormattedIsoTime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why doesn't this class look like this instead
class _TimeBounds(TypedDict):
sessions_lower_bound: Optional[FormattedIsoTime]
sessions_upper_bound: Optional[FormattedIsoTime]
instead of having the class _NoTimeBounds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are two different types. In my type, either all fields are non-None or all fields are None. In your type lower_bound
can be None
while upper_bound
can be non-None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I see although imo the usefulness of this is debatable but its up to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course it's a pain to define, but it's the correct type to have, in that it simply describes more accurately what can be returned. In practice this means you can write code like:
if data['sessions_lower_bound'] is not None:
# can assume upper bound is not None in here
instead of having to assert both fields for None/not-None all the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh I still wouldn't do the check like this because it might be confusing to read for whoever reads the code after me
I would probably just check like
if set(data.values()) == {None}:
# do whatever
src/sentry/releasehealth/metrics.py
Outdated
Condition(Column("timestamp"), Op.GTE, datetime.min), | ||
Condition(Column("timestamp"), Op.LT, datetime.now(pytz.utc)), | ||
] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels to me like the try/catch is wrapping much more logic than it actually should
I feel like lines 382 - 391 should be at the top of the function and if no env_filter
exists just return {"sessions_lower_bound": None, "sessions_upper_bound": None}
early on without raising an error and then catching it unless you want the error to be captured by sentry
to which I would also suggest that if that is your goal with this to maybe use logging.exception
or metrics.incr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of the exception mgmt is that I can use tag_key
and tag_value
multiple times without having to check their return-value for None
each. In previous PRs I would have to do that explicitly.
Instead of raising MetricIndexNotFound()
on line 391 I can also return the same value, but I feel that it would be simply duplication of code.
src/sentry/releasehealth/sessions.py
Outdated
org_id: OrganizationId, | ||
environments: Optional[Sequence[EnvironmentName]] = None, | ||
) -> ReleaseSessionsTimeBounds: | ||
return _get_release_sessions_time_bounds( # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like typing _get_release_sessions_time_bounds
would eliminate the need for us to use # type: ignore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah but I really don't want to get into the business of refactoring adjacent modules
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but its not adjacent modules.. you are refactoring the very specific module you are calling
Surely if its something big I understand but if its a low hanging fruit like adding something like
def _get_release_sessions_time_bounds(project_id: int, release: str, org_id: int, environments: Optional[List[str]]=None) -> _TimeBounds :
shouldn't be that big of a deal, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't just put the type signature on there, you need to convert the entire file and add it to mypy.ini for mypy to even consider it for typing.
…-time-bounds-metrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine.
If you could, please double check the comment about the return value, thinking about it a bit more it makes sense what you are doing there.
On a second thought, I think I wouldn't have bothered with the end time at all and just considered the start_time, if a release is still being used that start_time would probably be good enough, if it is not used anymore that a long running
session might change the end time a bit.
src/sentry/releasehealth/metrics.py
Outdated
for row in rows: | ||
if set(row.values()) == {formatted_unix_start_time}: | ||
continue | ||
if lower_bound is None or row["min"] < lower_bound: | ||
lower_bound = row["min"] | ||
if upper_bound is None or row["max"] > upper_bound: | ||
upper_bound = row["max"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit confused by this.
I am not sure what the function really returns since we appear to mix the session start time and the session end time, so (If I get it ) we return something like: min and max times when a session for a particular release was seen ?
That will most likely be something like min(session_start), max(session_end) except for the sessions that are still running and we don't have a session end and for those we would have also a max(session_start).
If that's what the intention is than everything is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are supposed to do min(session_start), max(session_start)
. where do you see us using session_end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My impression is that the counter session
with the "init" tag is pushed at the beginning of the session and the distribution session.duration
is pushed at the end of the session so the timestamps for the two queries would represent a combination of the session starts and session ends... I might be missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clarified in slack, all metric timestamps are derived from session.started.
…-time-bounds-metrics
src/sentry/releasehealth/metrics.py
Outdated
# equivalent to the sessions-table based backend. Example: | ||
# | ||
# 1. Session sid=x is started with timestamp started=n | ||
# 2. Same sid=x is updated with new payload with timestamp started=n - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this would be a bug, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, yeah
…rics backend [INGEST-249] (#28824) Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
Implement another function in terms of the metrics backend.