/// The task name is a `&'static str` as opposed to a `String`. The reason for that is that
/// in order to avoid memory consumption issues with the Prometheus metrics, the set of
/// possible task names has to be bounded.
This might be overly restrictive, but I think that if we don't enforce this, it's only a matter of time before someone creates a task with a dynamic name.
It is possible to create a `&'static str` with `Box::leak(format!("foo").into_boxed_str())`, but I assume that if someone does that, it's not by accident.
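To make the escape hatch mentioned above concrete, here is a minimal sketch using only the standard library. The helper name `leak_name` is hypothetical and not part of the PR. The leaked allocation is never freed, which is exactly why doing this in a loop would bring back the unbounded label set the doc comment warns about.

```rust
/// Hypothetical helper: turns a runtime-built `String` into a `&'static str`
/// by leaking its heap allocation. The memory is never reclaimed, so this is
/// only reasonable when called a small, bounded number of times.
fn leak_name(name: String) -> &'static str {
    Box::leak(name.into_boxed_str())
}

fn main() {
    // A dynamic name has to go through an explicit, visible leak to fit the
    // `&'static str` parameter, so it cannot happen by accident.
    let leaked: &'static str = leak_name(format!("worker-{}", 3));
    println!("{}", leaked);
}
```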
Very much in favor of this 👍
@@ -312,8 +318,8 @@ impl<TBl, TCl, TSc, TNetStatus, TNet, TTxPool, TOc> Spawn for
 &self,
 future: FutureObj<'static, ()>
 ) -> Result<(), SpawnError> {
-self.task_manager.scheduler().unbounded_send((Box::pin(future), From::from("unnamed")))
-.map_err(|_| SpawnError::shutdown())
+self.task_manager.spawn_handle().spawn("unnamed", future);
Not about the PR changes, but this is going to be used more in the future. Can we do a better job of identifying what is spawned here, and from where?
We can add an object that contains the task name in advance and implements `Spawn`, and pass it around.
For example, you'd do something like:
`GrandPa::start(tasks_manager.build_spawner("grandpa"));`
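A rough sketch of what such a pre-named spawner could look like, using only the standard library. Everything here is hypothetical: `TaskManager` and `NamedSpawner` are simplified stand-ins, the queue merely records tasks instead of running them, and the real object would implement the `futures` crate's `Spawn` trait rather than the plain `spawn` method shown.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;

type BoxedTask = Pin<Box<dyn Future<Output = ()>>>;
type Queue = Rc<RefCell<Vec<(&'static str, BoxedTask)>>>;

/// Hypothetical stand-in for the task manager: it only records named tasks.
#[derive(Default)]
struct TaskManager {
    queue: Queue,
}

/// Hypothetical spawner that carries its task name, fixed at construction.
#[derive(Clone)]
struct NamedSpawner {
    name: &'static str,
    queue: Queue,
}

impl TaskManager {
    /// The name is chosen once here, so callers of `spawn` never supply it.
    fn build_spawner(&self, name: &'static str) -> NamedSpawner {
        NamedSpawner { name, queue: Rc::clone(&self.queue) }
    }

    /// Names of all tasks queued so far, in spawn order.
    fn queued_names(&self) -> Vec<&'static str> {
        self.queue.borrow().iter().map(|(name, _)| *name).collect()
    }
}

impl NamedSpawner {
    /// Every task spawned through this handle is tagged with the same name.
    fn spawn(&self, task: impl Future<Output = ()> + 'static) {
        self.queue.borrow_mut().push((self.name, Box::pin(task)));
    }
}

fn main() {
    let manager = TaskManager::default();
    let grandpa = manager.build_spawner("grandpa");
    grandpa.spawn(std::future::ready(()));
    grandpa.spawn(std::future::ready(()));
    println!("{:?}", manager.queued_names());
}
```

Because the name is attached when the spawner is built, a subsystem that receives the handle cannot end up under `"unnamed"`, and the set of names stays as bounded as the set of `build_spawner` call sites.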
lgtm
Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>
Polkadot master seems broken. The error is unrelated to my changes, and the companion build was passing earlier.
client/service/src/task_manager.rs
Outdated
)?, registry)?,
tasks_started: register(CounterVec::new(
Opts::new(
"tasks_started_total",
Suggested change:
- "tasks_started_total",
+ "tasks_spawned_total",
Would make it easier to differentiate from the metric above.
let before = Instant::now();
let outcome = Future::poll(this.inner, cx);
let after = Instant::now();

this.poll_duration.observe((after - before).as_secs_f64());
Suggested change:
- let before = Instant::now();
- let outcome = Future::poll(this.inner, cx);
- let after = Instant::now();
- this.poll_duration.observe((after - before).as_secs_f64());
+ let _timer = this.poll_duration.start_timer();
+ let outcome = Future::poll(this.inner, cx);
+ // _timer is dropped and thus automatically recorded in the Histogram.
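For readers unfamiliar with the drop-guard pattern this suggestion relies on, here is a standard-library-only sketch. `Recorder` and `DropTimer` are hypothetical stand-ins written for illustration; they are not the Prometheus crate's types, and only mimic the idea of a timer that records its elapsed time when it goes out of scope.

```rust
use std::cell::RefCell;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a histogram: it just stores raw samples.
#[derive(Default)]
struct Recorder {
    samples: RefCell<Vec<Duration>>,
}

/// Guard that records the elapsed time into its `Recorder` when dropped.
struct DropTimer<'a> {
    start: Instant,
    recorder: &'a Recorder,
}

impl Recorder {
    fn start_timer<'a>(&'a self) -> DropTimer<'a> {
        DropTimer { start: Instant::now(), recorder: self }
    }

    /// Number of samples recorded so far.
    fn count(&self) -> usize {
        self.samples.borrow().len()
    }
}

impl<'a> Drop for DropTimer<'a> {
    fn drop(&mut self) {
        self.recorder.samples.borrow_mut().push(self.start.elapsed());
    }
}

/// Stand-in for the timed call: the guard lives until the function returns.
fn timed_work(recorder: &Recorder) -> u32 {
    // Binding to `_timer` keeps the guard alive until the end of the scope.
    // Binding to plain `_` would drop it immediately and time nothing.
    let _timer = recorder.start_timer();
    let outcome: u32 = (1..=10).sum();
    outcome
}

fn main() {
    let recorder = Recorder::default();
    let outcome = timed_work(&recorder);
    println!("outcome = {}, samples = {}", outcome, recorder.count());
}
```

The guard also records on early returns and panics that unwind through the scope, which the explicit `before`/`after` version does not.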
…iendly-tasks-manager
unused imports let the CI fail.
icing while releasing.
…iendly-tasks-manager
This PR adds Prometheus metrics to the `sc_service::task_manager` module.

There are four new metrics, each accepting a task name as label:

- … `started - stopped` to know what's running.
- … `Future::poll`. This can be used to figure out the CPU usage of the node.
- … `poll_duration_count`, the idea is to do `poll_started - poll_duration_count` to detect tasks that are stuck.

I have consequently removed the `futures-diagnose-exec` tool, which was supposed to serve the same purpose but was totally undocumented and way more annoying to use on real nodes compared to Prometheus/Grafana.
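The two subtractions mentioned in the description can be sketched as follows. This is an illustration only: `TaskCounters` and `stuck_candidates` are hypothetical names, and the values are assumed to come from a scrape of the per-task counters rather than from any API in this PR.

```rust
use std::collections::HashMap;

/// Hypothetical snapshot of the counters scraped for one task name.
#[derive(Default, Clone, Copy)]
struct TaskCounters {
    started: u64,
    stopped: u64,
    poll_started: u64,
    poll_duration_count: u64,
}

impl TaskCounters {
    /// `started - stopped`: tasks with this name that are still alive.
    fn running(&self) -> u64 {
        self.started.saturating_sub(self.stopped)
    }

    /// `poll_started - poll_duration_count`: polls that began but have not
    /// returned yet at the time of the snapshot.
    fn polls_in_flight(&self) -> u64 {
        self.poll_started.saturating_sub(self.poll_duration_count)
    }
}

/// Task names with at least one poll in flight, sorted for stable output.
fn stuck_candidates(snapshot: &HashMap<&'static str, TaskCounters>) -> Vec<&'static str> {
    let mut names: Vec<&'static str> = snapshot
        .iter()
        .filter(|(_, counters)| counters.polls_in_flight() > 0)
        .map(|(name, _)| *name)
        .collect();
    names.sort();
    names
}

fn main() {
    let mut snapshot = HashMap::new();
    snapshot.insert(
        "grandpa",
        TaskCounters { started: 3, stopped: 1, poll_started: 10, poll_duration_count: 9 },
    );
    snapshot.insert(
        "unnamed",
        TaskCounters { started: 2, stopped: 2, poll_started: 5, poll_duration_count: 5 },
    );
    println!("grandpa running: {}", snapshot["grandpa"].running());
    println!("stuck candidates: {:?}", stuck_candidates(&snapshot));
}
```

A single non-zero reading only means a poll was in progress at scrape time; a task is a plausible "stuck" candidate when the difference stays non-zero across several consecutive scrapes.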