Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safe(r) minting of CapabilityRefs #429

Merged
merged 1 commit into from
Sep 7, 2022

Conversation

petrosagg
Copy link
Contributor

@petrosagg petrosagg commented Nov 2, 2021

Background

As part of progress tracking timely keeps track of three kinds of timestamp frequencies for each operator. Timestamps at the inputs (the consumeds part of SharedProgress), internal timestamps (the internals part of SharedProgress) and timestamps at the output (the produceds part of SharedProgress). A timely operator must carefully manipulate those frequencies in order to let timely know at what times data might be produced from the operator. An operator can misuse these frequencies if for example it clears a consumed timestamp t1 without adding it to its internal timestamps and later decides that it wants to produce data at t1.

With the exception of the raw operator builder timely provides a safe API that ensures timestamp frequency counts are correct via the CapabilityRef and Capability types. These act as a witness of the existence of a consumed or internal timestamp respectively.

CapabilityRef misuse

The Capability type which represents an internal timestamp contains within it a shared reference to the internal timestamp frequency counts and it automatically subtracts its timestamp from the counts on Drop. This allows users to keep a Capability around in whatever way they wish and have the internal operator timestamp counts follow accordingly.

However, the same is not true for the CapabilityRef type. When a CapabilityRef is minted its timestamp is immediately subtracted from consumed and provided to the user. This is a problem because if the operator logic held onto the CapabilityRef then it could later produce data that is behind the frontier.

In practice, holding onto a CapabilityRef across operator invocations is quite difficult as the user will be tasked with storing an input handle and its produced CapabilityRef (which is lifetimed) at the same time. Self-referencial structs are notoriously difficult (but not impossible nor unsound) to do in Rust and so synchronous operators don't usually face this problem.

This situation becomes trivially possible with an async operator. In async rust, where the compiler is tasked with generating a struct that captures the stack of a future at any given yield point, making this self-referencial is easy. A user simply needs to obtain a CapabilityRef from an input handle and then await on something. When this yield point is reached the current stack will be preserved, keeping the CapabilityRef alive across timely invocations.

Solution in this PR

The solution this PR implements is to make CapabilityRef behave in the same way as a Capability. Instead of relying on the difficulty of self-referencial structs in Rust this PR adds an extra guard field in CapabilityRef that when dropped it will update the consumed timestamps of the operator accordingly. This allows users to keep CapabilityRefs around for as long as they wish.

@petrosagg
Copy link
Contributor Author

I just built materialize with this change to get further confidence that it is correct. Tests passed :) MaterializeInc/materialize#9023

CapabilityRefs are valid to exist as long as the data in the input are
not marked as consumed. This change makes sure that this is the case by
including an extra drop guard in the capability ref.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
@frankmcsherry
Copy link
Member

Looks good; thanks!

@frankmcsherry frankmcsherry merged commit 9548bf9 into TimelyDataflow:master Sep 7, 2022
petrosagg added a commit to petrosagg/materialize that referenced this pull request Sep 12, 2022
After TimelyDataflow/timely-dataflow#429 holding
onto CapabilityRefs across await points is safe

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
petrosagg added a commit to petrosagg/materialize that referenced this pull request Sep 13, 2022
After TimelyDataflow/timely-dataflow#429 holding
onto CapabilityRefs across await points is safe

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
petrosagg added a commit to petrosagg/timely-dataflow that referenced this pull request Dec 22, 2022
Since the merge of
TimelyDataflow#429,
`CapabilityRef`s have been made safe to hold onto across operator
invocations because that PR made sure that they only decremented their
progress counts on `Drop`. While this allowed `async`/`await` based
operators to freely hold on to them, it was still very difficult for
synchronous based operators to do the same thing, due to the lifetime
attached to the `CapabilityRef`.

Since `CapabilityRef`s can now be held arbitrarily long, the lifetime
constraint on them can be lifted and therefore made into a `'static`
value. No extra clones of the timestamps were needed for this change.

After making this change, the name `CapabilityRef` felt wrong, since
there is no reference to anything anymore. Instead, the main distinction
between `CapabilityRef`s and `Capabilities` are that the former is
associated with an input port and the latter is associated with an
output port.

As such, I have renamed `CapabilityRef` to `InputCapability` to signal
to users that holding onto one of them represents holding onto a
timestamp at the input for which we have not yet determined the output
port that it should flow to. This nicely ties up the semantics of the
`InputCapability::retain_for_output` and
`InputCapability::delayed_for_output` methods, which make it clear by
their name and signature that this is what "transfers" the capability
from input ports to output ports.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
petrosagg added a commit to petrosagg/timely-dataflow that referenced this pull request Dec 22, 2022
Since the merge of
TimelyDataflow#429,
`CapabilityRef`s have been made safe to hold onto across operator
invocations because that PR made sure that they only decremented their
progress counts on `Drop`. While this allowed `async`/`await` based
operators to freely hold on to them, it was still very difficult for
synchronous based operators to do the same thing, due to the lifetime
attached to the `CapabilityRef`.

We can observe that the lifetime no longer provides any benefits, which
means it can be removed and turn `CapabilityRef`s into fully owned
values. This allows any style of operator to easily hold on to them. The
benefit of that isn't just performance (by avoiding the `retain()`
dance), but also about deferring the decision of the output port a given
input should flow to to a later time.

After making this change, the name `CapabilityRef` felt wrong, since
there is no reference to anything anymore. Instead, the main distinction
between `CapabilityRef`s and `Capabilities` are that the former is
associated with an input port and the latter is associated with an
output port.

As such, I have renamed `CapabilityRef` to `InputCapability` to signal
to users that holding onto one of them represents holding onto a
timestamp at the input for which we have not yet determined the output
port that it should flow to. This nicely ties up the semantics of the
`InputCapability::retain_for_output` and
`InputCapability::delayed_for_output` methods, which make it clear by
their name and signature that this is what "transfers" the capability
from input ports to output ports.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
petrosagg added a commit to petrosagg/timely-dataflow that referenced this pull request Jan 11, 2023
Since the merge of
TimelyDataflow#429,
`CapabilityRef`s have been made safe to hold onto across operator
invocations because that PR made sure that they only decremented their
progress counts on `Drop`. While this allowed `async`/`await` based
operators to freely hold on to them, it was still very difficult for
synchronous based operators to do the same thing, due to the lifetime
attached to the `CapabilityRef`.

We can observe that the lifetime no longer provides any benefits, which
means it can be removed and turn `CapabilityRef`s into fully owned
values. This allows any style of operator to easily hold on to them. The
benefit of that isn't just performance (by avoiding the `retain()`
dance), but also about deferring the decision of the output port a given
input should flow to to a later time.

After making this change, the name `CapabilityRef` felt wrong, since
there is no reference to anything anymore. Instead, the main distinction
between `CapabilityRef`s and `Capabilities` are that the former is
associated with an input port and the latter is associated with an
output port.

As such, I have renamed `CapabilityRef` to `InputCapability` to signal
to users that holding onto one of them represents holding onto a
timestamp at the input for which we have not yet determined the output
port that it should flow to. This nicely ties up the semantics of the
`InputCapability::retain_for_output` and
`InputCapability::delayed_for_output` methods, which make it clear by
their name and signature that this is what "transfers" the capability
from input ports to output ports.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
frankmcsherry pushed a commit that referenced this pull request Jan 29, 2023
Since the merge of
#429,
`CapabilityRef`s have been made safe to hold onto across operator
invocations because that PR made sure that they only decremented their
progress counts on `Drop`. While this allowed `async`/`await` based
operators to freely hold on to them, it was still very difficult for
synchronous based operators to do the same thing, due to the lifetime
attached to the `CapabilityRef`.

We can observe that the lifetime no longer provides any benefits, which
means it can be removed and turn `CapabilityRef`s into fully owned
values. This allows any style of operator to easily hold on to them. The
benefit of that isn't just performance (by avoiding the `retain()`
dance), but also about deferring the decision of the output port a given
input should flow to to a later time.

After making this change, the name `CapabilityRef` felt wrong, since
there is no reference to anything anymore. Instead, the main distinction
between `CapabilityRef`s and `Capabilities` are that the former is
associated with an input port and the latter is associated with an
output port.

As such, I have renamed `CapabilityRef` to `InputCapability` to signal
to users that holding onto one of them represents holding onto a
timestamp at the input for which we have not yet determined the output
port that it should flow to. This nicely ties up the semantics of the
`InputCapability::retain_for_output` and
`InputCapability::delayed_for_output` methods, which make it clear by
their name and signature that this is what "transfers" the capability
from input ports to output ports.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
@github-actions github-actions bot mentioned this pull request Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants