Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a non-blocking best-effort buffered broadcaster #2819

Closed
faec opened this issue Jun 8, 2023 · 5 comments · Fixed by #2849
Closed

Implement a non-blocking best-effort buffered broadcaster #2819

faec opened this issue Jun 8, 2023 · 5 comments · Fixed by #2849
Assignees
Labels
enhancement New feature or request Team:Elastic-Agent Label for the Agent team

Comments

@faec
Copy link
Contributor

faec commented Jun 8, 2023

Following up on discussion in #2736, in an Agent team meeting it was decided to implement a best-effort buffered broadcast helper, and that CoordinatorState's subscription api should be the first client of this new component.

  • The broadcaster tracks the current value of some variable or data structure. When a subscriber listens to the broadcaster, it should receive updates over a synchronous channel. Subscribers can choose whether to receive only the most recent value, or to buffer up to n values if there are multiple changes in between reads. If many changes occur without a read, a subscriber with a buffer length of n will receive the n-th oldest state on its next read, then the (n-1)th, and so on.
  • The maximum buffer size is set during broadcaster creation, and subscriber buffer requests that exceed this will be capped at that maximum.
  • When a new subscriber joins, it receives the current value and any subsequent changes (it does not receive values in the past even if it requests a buffer).
  • If a subscriber has already received the most recent value, reads on its channel will block until the value changes again.
  • If there is a new value since the last one read by a subscriber, a read will always return it. If the subscriber's buffer length is zero, that value is guaranteed to be the most recent one at the time of the read.
  • Changes to the current value are sent by the component that owns the broadcaster (in the initial example probably Coordinator or CoordinatorState, depending on implementation details), and those sends should never(*) block. Values sent to the broadcaster become owned by the broadcaster (thus callers should make a deep copy before forwarding them if needed).

(*) Full deterministic blocking prevention isn't possible, because if the provider starts sending new values in a spin lock it can potentially overwhelm the receiving goroutine even if its input channel is buffered. However, the receiver should never block for longer than the time to read the values from its input channel; in particular it should always prioritize receiving new values over sending old ones to subscribers.)

Given the nature of this component as potentially supporting different types (not just Coordinator state information), and the technical dependence on golang channels and reflect.Select, it probably makes sense to implement the helper with a generic parameter. Generics exist but are not yet common in the Agent codebase, but a single-type container object like this is probably a reasonable place to deploy them. (Otherwise all applications of the component would probably have to operate on chan interface{} and do explicit conversion at every endpoint, which is possible but would be nice to avoid.)

@faec faec added enhancement New feature or request Team:Elastic-Agent Label for the Agent team labels Jun 8, 2023
@faec faec self-assigned this Jun 8, 2023
@cmacknz
Copy link
Member

cmacknz commented Jun 8, 2023

+1 to using generics here, we just haven't had a reason to use them more widely yet.

@faec should we bring this into your current sprint or leave it for a future iteration?

@faec
Copy link
Contributor Author

faec commented Jun 8, 2023

should we bring this into your current sprint or leave it for a future iteration?

@cmacknz I think it makes sense to do in this sprint, or at least a solid first draft, since it's so tied in with concurrency issues in the Coordinator. I wouldn't want it to completely gate followup on other components, but any progress towards a lock-free Coordinator will probably need to account for this change.

@cmacknz
Copy link
Member

cmacknz commented Jun 8, 2023

Added to sprint 13 as part of the watcher shift, CC @pierrehilbert

@faec
Copy link
Contributor Author

faec commented Jun 9, 2023

Followup based on preliminary prototyping: by far the largest part of this work will be integrating the new component with the Coordinator. The component itself will be a moderate change to the previous PR draft to provide a nicer API and make a few functional changes. But using it from Coordinator means updating all of the current CoordinatorState accessors to perform reliable synchronized updates before handing them off to the broadcaster, and that's going to involve some heavy changes.

I think it's still feasible and worthwhile, I just want to flag that this is necessarily going to be a significant change for the Coordinator itself, not a drop-in helper. There are some big payoffs -- Coordinator will no longer need to lock three mutexes just to check one of its own internal-only variables -- but as the draft converges it might be worth meeting directly with potential reviewers to talk through the code and get on the same page about the key changes instead of just dumping a monolithic review on them when it's done 😄

@cmacknz
Copy link
Member

cmacknz commented Jun 12, 2023

There are some big payoffs -- Coordinator will no longer need to lock three mutexes just to check one of its own internal-only variables -- but as the draft converges it might be worth meeting directly with potential reviewers to talk through the code and get on the same page about the key changes instead of just dumping a monolithic review on them when it's done

This definitely sounds valuable to me, +1 to getting early feedback before creating a giant review. If you can figure out how to split the review up that would probably also be valuable, maybe one PR for the new component and one for the uses of it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Team:Elastic-Agent Label for the Agent team
Projects
None yet
2 participants