-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ref(releasehealth): Implement check_has_health_data on metrics backend [INGEST-249] #28729
Conversation
10f6d0a
to
9ae6dee
Compare
src/sentry/releasehealth/base.py
Outdated
@@ -100,3 +108,16 @@ def get_release_adoption( | |||
""" | |||
|
|||
raise NotImplementedError() | |||
|
|||
def check_has_health_data( | |||
self, projects_list: Sequence[Union[ProjectId, ProjectRelease]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two problems with this type:
- There's no relation between input and output.. type signature implies that one can pass in releases, and get out just projects. Or that one passes in projects and gets out releases (sometimes?)
- Union at this position implies that you can freely mix
ProjectId
andProjectRelease
within the same sequence but the implementation doesn't handle that
You can make this generic over a T
that is bounded by Union[ProjectId, ProjectRelease]
. Then you have the guarantee that the same union variant is returned as is put into it.
T = TypeVar("ProjectOrRelease", bounds=Union[...])
def check_has_health_data(self, projects_list: Sequence[T]) -> Set[T]: ...
This will make the function generic over T: Union[...]
. You should probably pick a better name for T
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the input, will do.
From what I understand making T=TypeVar("T", bounds=Union[ProjectId,ProjectRelease])
would not improve the semantics in any way, but making the definition T=TypeVar("T", ProjectId, ProjectRelease)
would do exactly what you suggest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that may be better. However in any case it will be better, because when you call check_has_health_data
you can declare T
to be specifically ProjectId
either way
src/sentry/releasehealth/metrics.py
Outdated
if len(projects_list) == 0: | ||
return set() | ||
|
||
includes_releases = type(projects_list[0]) == tuple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
includes_releases = type(projects_list[0]) == tuple | |
includes_releases = isinstance(projects_list[0], tuple) |
I believe this may make some type: ignore obsolete
src/sentry/releasehealth/metrics.py
Outdated
def extract_raw_info( | ||
row: Mapping[str, Union[int, str]] | ||
) -> Union[ProjectId, ProjectRelease]: | ||
return row.get("project_id") # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return row.get("project_id") # type: ignore | |
return row["project_id"] # type: ignore |
either we're sure of our types or we should use Optional
... there's no point in type-safety if everything is strictly typed and nulls still roam around freely IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
src/sentry/releasehealth/metrics.py
Outdated
if includes_releases: | ||
project_ids = [x[0] for x in projects_list] # type: ignore | ||
else: | ||
project_ids = [x for x in projects_list] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
project_ids = [x for x in projects_list] | |
project_ids = projects_list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of course :-), thanks
src/sentry/releasehealth/sessions.py
Outdated
from sentry.snuba.sessions import _get_release_adoption, get_current_and_previous_crash_free_rates | ||
from sentry.snuba.sessions import ( | ||
_get_release_adoption, | ||
check_has_health_data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest renaming this function to lead with underscore, you will find some more places where it's being called. One is inside of sentry.snuba.sessions
itself.
You could argue it doesn't matter as sentry.snuba.sessions already implies we're using teh sessions backend, but I think we should call through the service as much as possible in case we really want to implement dual-read-and-compare.
where_clause.append(Condition(Column(release_column_name), Op.IN, releases_ids)) | ||
column_names = ["project_id", release_column_name] | ||
|
||
# def extract_raw_info(row: Mapping[str, Union[int, str]]) -> ProjectOrRelease: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there commented out code in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in next PR: #28823
tbh peppering the PR with |
extract_raw_info = extract_raw_info_func(includes_releases) | ||
|
||
query_cols = [Column(column_name) for column_name in column_names] | ||
group_by_clause = query_cols |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the point of defining group_by_clause
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some people do it to signal intent (Query(select=query_cols, groupby=query_cols)
looks like a copypaste error) but I wouldn't have done it either
Tbh, it feels strange to me seeing |
@ahmedetefy let's resolve those questions outside of this PR |
…d [INGEST-249] (#28729) check_has_health_data implemented for metrics
Like in #28598, port
check_has_health_data
from sentry.snuba.sessions to metrics.INGEST-249