Add a `data_version` to the `db_write` hook and implement optimistic locking #3358
Still checking rebaseability, will un-draft asap :-)
When considering as well the possibility of chaining
We may not have hook chaining yet, but it seems good to prepare the interface at this point as well, so that plugin authors for remote backup in particular can start adapting to the new interface, then later we can support hook chaining and thereby allow multiple backup plugins to hook the |
Totally with you on the chainability of hooks, standardizing on I'm not convinced that a |
704f99f to f4a84e2
FWIW we could send a notification at |
But the plugin doesn't know as a matter of fact, since we might terminate in between calling the hook and issuing the notification, so the plugin must handle the case where it doesn't get a notification but the DB transaction is still committed. Once you already have to handle that, why add yet another possible entrypoint for the happy case, and not just always assume the bad case? No matter how many notifications we add, the plugin will have to be able to handle the case that a notification never comes :-) I guess I'm worried about the overhead of notifications (it's small but not free) and about plugin devs relying too heavily on the helpful notification and not bothering with the bad case, which would then result in loads of subtly broken plugins.
nice, seems like a good thing to have
/* The current DB version we expect to update if changes are
 * committed. */
u32 data_version;
u64 not u32?
Yes, `intval` in the `vars` table is defined as an `INTEGER`, which gets translated into a `u32` in postgres, and the size has to match in non-sqlite3 databases. So I thought I shouldn't be too picky about this value.
Alternatives are 1) creation of a new table, or 2) adding a new field `bigintval` to the `vars` table. If this is a problem I'd prefer the second. But really all we're guaranteeing is that there is enough variety to identify if we skipped writes or have lost some. In particular we can't do much if we're off by more than 2: if the plugin missed an update we must take a snapshot, if the DB moved back to an earlier point in time we must abort. We just need to have the probability of accidentally using the same version again small enough (1/2**32 seems good enough for me 😉).
sweet, ok. that's a good thing to know about int -> u32 in postgres.
We were passing them in separately, while we could just retrieve them from the db instance instead.
We are about to do some more operations before committing, so moving this up allows us to reuse the same transaction.
This counter is incremented on each dirty transaction.
This increments the `data_version` upon committing dirty transactions, reads the last `data_version` upon startup, and tracks the number in memory in parallel to the DB (see next commit for rationale).
Changelog-Changed: JSON-RPC: Added a `data_version` field to the `db_write` hook which returns a numeric transaction counter.
The optimistic lock prevents multiple instances of c-lightning making concurrent modifications to the database. That would be unsafe as it messes up the state in the DB. The optimistic lock is implemented by checking whether a gated update on the previous value of the `data_version` actually results in an update. If that's not the case the DB has been changed under our feet.
The lock provides linearizability of DB modifications: if a database is changed under the feet of a running process that process will `abort()`, which from a global point of view is as if it had crashed right after the last successful commit. Any process that also changed the DB must've started between the last successful commit and the unsuccessful one since otherwise its counters would not have matched (which would also have aborted that transaction). So this reduces all the possible timelines to an equivalent where the first process died, and the second process recovered from the DB.
This is not that interesting for `sqlite3` where we are also protected via the PID file, but when running on multiple hosts against the same DB, e.g., with `postgres`, this protection becomes important.
Changelog-Added: DB: Optimistic locking prevents instances from running concurrently against the same database, providing linear consistency to changes.
73c1928 to 8076248
ACK 8076248
Currently the `db_write` hook is a simple sink to which we write, but a plugin has no way of telling whether there were any gaps or whether the `lightningd` database lost some updates. This PR introduces a numeric `data_version` that is incremented on any dirty transaction. A dirty transaction is a DB transaction that performed some changes on the underlying database.
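The dirty-transaction rule can be illustrated with a minimal in-memory sketch. This is not the code from this PR: `struct sketch_db` and the `sketch_*` functions are hypothetical names invented for illustration, and only the `data_version` field name and its 32-bit width come from the discussion here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for the database handle. */
struct sketch_db {
	uint32_t data_version; /* counter, 32 bits wide, may wrap */
	bool in_transaction;
	bool dirty;            /* did this transaction change anything? */
};

void sketch_begin(struct sketch_db *db)
{
	db->in_transaction = true;
	db->dirty = false;
}

/* Record one statement; only modifying statements mark the
 * transaction as dirty. */
void sketch_exec(struct sketch_db *db, bool modifies)
{
	if (db->in_transaction && modifies)
		db->dirty = true;
}

/* Commit: bump the counter only for dirty transactions. Unsigned
 * arithmetic wraps at 2**32, which is well defined in C.
 * Returns true if the version changed. */
bool sketch_commit(struct sketch_db *db)
{
	bool bumped = db->in_transaction && db->dirty;

	if (bumped)
		db->data_version++;
	db->in_transaction = false;
	db->dirty = false;
	return bumped;
}
```

In this sketch a read-only transaction leaves `data_version` untouched, while any transaction that executed at least one modifying statement advances it by exactly one on commit.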
The `data_version` is guaranteed to change between successive changes, however it does not guarantee monotonicity (i.e., it can wrap around at 2**32). Its goal is to assert to a plugin that no changes to the DB have been performed by c-lightning between successive calls to `db_write`. The `data_version` persists across restarts so a plugin can check whether it matches a backup after startup.
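On the plugin side, the check might look roughly like the sketch below. It assumes that the plugin stores the last version it applied alongside its backup and that each `db_write` call reports the version the DB will have once the transaction is applied. The enum and function names are hypothetical and not part of any plugin API.

```c
#include <stdint.h>

enum sketch_sync {
	SKETCH_IN_SYNC,     /* exactly one step ahead: apply the writes */
	SKETCH_OUT_OF_SYNC, /* gap or rollback: do not apply blindly */
};

/* Compare the version stored with the backup against the version
 * reported by a db_write call. Assumption: the reported value is the
 * version after the transaction has been applied. */
enum sketch_sync sketch_check_version(uint32_t backup_version,
				      uint32_t reported_version)
{
	/* The cast keeps the addition in 32 bits, so it wraps at 2**32
	 * just like the counter itself. */
	if ((uint32_t)(backup_version + 1) == reported_version)
		return SKETCH_IN_SYNC;
	return SKETCH_OUT_OF_SYNC;
}
```

Because the counter can wrap, this sketch only distinguishes "exactly one step ahead" from "anything else". How a plugin reacts to the second case (for example taking a fresh snapshot after a missed update, or aborting if the DB moved back in time, as discussed in the review thread) is left to the plugin.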
In addition this PR implements an optimistic locking strategy. The `data_version` is incremented only if it matches what we expect from startup. If it doesn't match we `abort()` since there might be another instance writing to the database. This is not really important for `sqlite3`, which is already protected through the use of the pid file, but in `postgres` we may not have such an alternative protection mechanism against concurrent instances.
It is not meant as a graceful shutdown, but rather a last resort protection. I expect that some other locking mechanism is implemented in a wrapper for c-lightning that'd start the instance only after acquiring ownership. As a standalone though it should be sufficient to protect against concurrent instances by serializing all write operations, and `abort()`ing the loser of the write race.
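The write race can be modelled with a small in-memory sketch. All names below are hypothetical, the shared row is a plain struct instead of a real database, and the SQL in the comment is only an illustration of what a gated update could look like (the `name = 'data_version'` condition is an assumption). Where the real daemon would `abort()`, the sketch returns `false` so the behaviour can be observed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared database row holding the
 * counter. */
struct sketch_shared_db {
	uint32_t data_version;
};

/* Each running instance remembers the version it expects to find. */
struct sketch_instance {
	struct sketch_shared_db *db;
	uint32_t expected_version;
};

/* Models a gated update along the lines of
 *   UPDATE vars SET intval = intval + 1
 *   WHERE name = 'data_version' AND intval = <expected>;
 * and returns the number of rows changed (0 or 1). */
int sketch_gated_increment(struct sketch_shared_db *db, uint32_t expected)
{
	if (db->data_version != expected)
		return 0;
	db->data_version++;
	return 1;
}

/* Startup: read the last data_version and track it in memory. */
void sketch_start(struct sketch_instance *inst, struct sketch_shared_db *db)
{
	inst->db = db;
	inst->expected_version = db->data_version;
}

/* Commit a dirty transaction. Returns false when the DB changed under
 * our feet; the real daemon would abort() at this point. */
bool sketch_commit_dirty(struct sketch_instance *inst)
{
	if (sketch_gated_increment(inst->db, inst->expected_version) != 1)
		return false;
	inst->expected_version++;
	return true;
}
```

If a second instance starts and commits while the first is still running, the first instance's next `sketch_commit_dirty()` finds a mismatching counter and fails, while the second instance keeps committing normally.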
As a minor detail, given the order in which the increment and the reporting are currently done, the plugin gets the version number as it will be after the transaction is applied, but that doesn't really matter as long as the values differ between calls.
The lock provides linearizability of DB modifications: if a database is changed under the feet of a running process that process will `abort()`, which from a global point of view is as if it had crashed right after the last successful commit. Any process that also changed the DB must've started between the last successful commit and the unsuccessful one since otherwise its counters would not have matched (which would also have aborted that transaction). So this reduces all the possible timelines to an equivalent where the first process died, and the second process recovered from the DB.