execution errors with attached engine.Batch (for EndTransaction) #1989
Comments
There's no reason that all the methods in
but
// Execute the command.
reply, intents, rErr := r.executeCmd(batch, ms, args)
if proto.IsWrite(args) {
	if rErr == nil {
		// ...
	} else {
		batch.Close()
		if bErr, ok := rErr.(*errWithBatch); ok {
			batch = // ...
			// call MergeMVCCStats, set ms = ...
		} else {
			batch = r.rm.Engine().NewBatch()
		}
	}
	// ...
}
I'm unclear on the specific use cases where this is necessary. Could you elaborate a little?
That's not a use case which justifies this kind of a change. That's a sanity check--not something that is expected to ever occur (certainly not in the normal course of events).
Ok, but then we should panic there. Another case is
I'm not sure it's panic-worthy. It's not going to affect the correctness of the system or lead to corruption. For the second case, doesn't seem worthwhile to introduce the complexity of returning a batch just so we end up cleaning up local intents synchronously.
Going to close for now and address some of the above issues in upcoming work towards completing #1821.
(I was curious if we can fix #3037 (and #1989). The change introduces some complexity and I'm not totally sure we should fix these issues.) This commit changes EndTransaction to return a nil error on txn abort so that we can commit a batch and update the transaction record. It also changes txnSender.Send to convert that nil error into a TransactionAbortedError so that txn.Exec will initiate a restart. This change is hacky, but I couldn't come up with a better approach. We still return TransactionAbortedError in other places and keep the code path for handling TransactionAbortedError.
All replica commands (EndTransaction, Put, ...) operate on an engine.Batch which is discarded by the caller if an error is returned.
That worked well so far, but some commands actually fail and need something written to the engine: In #1916, EndTransaction resolves local intents in the same batch, but on an error, there's little it can do locally: it needs to return them as skipped intents so that they can be resolved indirectly.
Even worse is when the error is related to the Txn itself: in case of an epoch or timestamp regression, you'd reasonably expect to have the transaction aborted - but you can't, because it's returning an error. This means that any open intents anywhere in the system cannot resolve until they abort the transaction or it times out.
A carefully set up system to return something like a
could be the best way to deal with the above. In case of errors, we create a new batch anyways to update the response cache: it's possible that the change will be relatively easy.
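To make the idea concrete, here is a minimal, self-contained sketch of an error that carries its own batch. The name errWithBatch is taken from the snippet in the comments; its fields, the toy Batch type, and the endTransaction and execute helpers are assumptions made for illustration and do not reflect the actual engine.Batch API.

```go
package main

import (
	"errors"
	"fmt"
)

// Batch is a hypothetical stand-in for engine.Batch: it only records
// the writes that would be applied on commit.
type Batch struct {
	writes []string
}

func (b *Batch) Put(key string) { b.writes = append(b.writes, key) }

// errWithBatch is an assumed error type carrying a batch of writes that
// should be committed even though the command itself failed.
type errWithBatch struct {
	err   error
	batch *Batch
}

func (e *errWithBatch) Error() string { return e.err.Error() }

// endTransaction is a toy command. On a simulated epoch regression it
// fails, but attaches a batch that marks the transaction as aborted.
func endTransaction(batch *Batch, regression bool) error {
	batch.Put("resolve local intent")
	if regression {
		cleanup := &Batch{}
		cleanup.Put("txn record: ABORTED")
		return &errWithBatch{err: errors.New("epoch regression"), batch: cleanup}
	}
	return nil
}

// execute mirrors the caller-side handling: on error the original batch
// is discarded and replaced by the error's batch if one is attached,
// or by a fresh empty batch otherwise.
func execute(regression bool) (*Batch, error) {
	batch := &Batch{}
	err := endTransaction(batch, regression)
	if err != nil {
		var bErr *errWithBatch
		if errors.As(err, &bErr) {
			batch = bErr.batch
		} else {
			batch = &Batch{}
		}
	}
	return batch, err
}

func main() {
	batch, err := execute(true)
	fmt.Println(err, batch.writes)
}
```

In this sketch the failed command's partial writes are still thrown away; only the writes the command explicitly attached to the error survive.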
Once that (or a related mechanism - after all, the above is only my initial suggestion) is in place, we can clean up in EndTransaction on error.