Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log when no relevant samples are found for resampling #923

Conversation

daniel-zullo-frequenz
Copy link
Contributor

These additional logs will help in understanding why certain microgrid components, such as meters, fail to provide relevant samples for resampling.

@github-actions github-actions bot added part:tests Affects the unit, integration and performance (benchmarks) tests part:data-pipeline Affects the data pipeline labels Apr 10, 2024
@daniel-zullo-frequenz daniel-zullo-frequenz self-assigned this Apr 10, 2024
@daniel-zullo-frequenz daniel-zullo-frequenz added the cmd:skip-release-notes It is not necessary to update release notes for this PR label Apr 10, 2024
@daniel-zullo-frequenz daniel-zullo-frequenz force-pushed the feature/add-debugging-logs-resampling branch from 5a29f4d to 29c1732 Compare April 10, 2024 14:33
@daniel-zullo-frequenz daniel-zullo-frequenz marked this pull request as ready for review April 10, 2024 14:33
@daniel-zullo-frequenz daniel-zullo-frequenz requested a review from a team as a code owner April 10, 2024 14:33
@shsms
Copy link
Contributor

shsms commented Apr 10, 2024

This is a very useful debugging metric and would have pointed us to the underlying issue in the location with None values much earlier.

_logger.warning(
"No relevant samples found for component: %s",
component_name,
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shsms should I keep the log as a warning or this should be just a debugging log?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think warning makes sense, because somebody is expecting data that's not showing up. But it would also lead to a lot of noise when a component is under maintenance, etc. I wonder if we can just log once when an outage starts and once when it ends

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a much more general issue, this applies to a lot of issues. Eventually I think we really need an alerting mechanism, not sure if the notifications service will be used for that. Looking at logs should be a very low level thing, users should just get notified when something went wrong.

Services like sentry (and probably any observability service out there) can set alarms when a particular type of warning (or log message) if firing for the first time, or if it is firing a number of times per second, or per hour, etc. and you can silence alerts also for some amount of times (when stuff is in maintenance for example).

At some point we'll need to have this kind of flexibility. For now I think warning every time is a considerable improvement. If it is trivial to do the log when it starts and when it ends, we could add it too, but I wouldn't do it if it makes the code much more complicated.

@daniel-zullo-frequenz daniel-zullo-frequenz force-pushed the feature/add-debugging-logs-resampling branch from 29c1732 to 6b774d9 Compare April 10, 2024 15:22
Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to backport this to v0.25.x too?

@@ -745,6 +745,8 @@ def resample(self, timestamp: datetime) -> Sample[Quantity]:
# So if we need more performance beyond this point, we probably need to
# resort to some C (or similar) implementation.
relevant_samples = list(itertools.islice(self._buffer, min_index, max_index))
if not relevant_samples:
_logger.warning("No relevant samples found for component: %s", self._name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it could be interesting to log the contents of the self._buffer here too? Just to know which samples where the last received and when?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think logging self._buffer could indeed be useful for debugging the resampler itself. However, for operational purposes, only the component name is necessary, as information can be obtained from the component logs and querying the current state of the component is enough.
In our specific operational scenario, we didn't find a need for accessing self._buffer. Should I still add a _logger.debug() for self._buffer?

@llucax llucax added this to the v1.0.0-rc6 milestone Apr 10, 2024
@llucax
Copy link
Contributor

llucax commented Apr 10, 2024

I added this to rc6, as I guess we want to ship it ASAP.

Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, since it is a notable change in terms of debugability, I would add a release note in the Enhancements section (to be available when #924 is merge) mentioning this.

@daniel-zullo-frequenz
Copy link
Contributor Author

BTW, since it is a notable change in terms of debugability, I would add a release note in the Enhancements section (to be available when #924 is merge) mentioning this.

I'll add it once #924 gets merged

@daniel-zullo-frequenz
Copy link
Contributor Author

Do you want to backport this to v0.25.x too?

Yes, please.

@llucax
Copy link
Contributor

llucax commented Apr 11, 2024

Yes, please.

Haha, do it then! You can either change the target branch of this PR to v0.25.x and we then merge v0.25.3 to into v1.x.x, or you create a separate PR for the v0.25.x branch.

@daniel-zullo-frequenz
Copy link
Contributor Author

Haha, do it then! You can either change the target branch of this PR to v0.25.x and we then merge v0.25.3 to into v1.x.x, or you create a separate PR for the v0.25.x branch.

I'll create a separate PR for the v0.25.x branch once this is approved/merged

These additional logs will help in understanding
why certain microgrid components, such as meters,
fail to provide relevant samples for resampling.

Signed-off-by: Daniel Zullo <daniel.zullo@frequenz.com>
@daniel-zullo-frequenz daniel-zullo-frequenz force-pushed the feature/add-debugging-logs-resampling branch from 6b774d9 to 463efec Compare April 11, 2024 19:15
@github-actions github-actions bot added the part:docs Affects the documentation label Apr 11, 2024
To add an entry in the `Enhancements` session related to the
new warning message that is logged when no relevant samples
are found in a component for resampling.

Signed-off-by: Daniel Zullo <daniel.zullo@frequenz.com>
@daniel-zullo-frequenz daniel-zullo-frequenz force-pushed the feature/add-debugging-logs-resampling branch from 463efec to 922cb31 Compare April 11, 2024 19:19
@daniel-zullo-frequenz
Copy link
Contributor Author

Only updated the release-notes.

@daniel-zullo-frequenz daniel-zullo-frequenz added this pull request to the merge queue Apr 12, 2024
Merged via the queue into frequenz-floss:v1.x.x with commit d5cb853 Apr 12, 2024
14 checks passed
@daniel-zullo-frequenz daniel-zullo-frequenz deleted the feature/add-debugging-logs-resampling branch April 12, 2024 12:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cmd:skip-release-notes It is not necessary to update release notes for this PR part:data-pipeline Affects the data pipeline part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests
Projects
Development

Successfully merging this pull request may close these issues.

4 participants