Move settlement queue to the driver #3129

Merged
merged 26 commits into from
Dec 23, 2024

Conversation

squadgazzz
Contributor

@squadgazzz squadgazzz commented Nov 15, 2024

Description

This is a follow-up to #3116 that implements the suggestion to move the settlement queue to the driver. This should avoid the need to increase the block deadline, since the deadline can still be calculated starting from the simulation block.

Changes

  • Move the settlement queue to the driver. Each solver has its own driver endpoint and, now, its own settlement queue.
  • Calculate the deadline starting from the simulation block (reverting to the previous behavior).
  • Do not send a settle request to the solver once the deadline is reached.
  • Add a new driver config option to set the maximum queue size.

https://github.com/cowprotocol/infrastructure/pull/2241 should be reverted.
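
The changes listed above can be illustrated with a minimal sketch. It is not the driver's actual code: the snippets later in this thread use async channels (`oneshot::channel()`, `.await`), while this version uses only `std` threads and `std::sync::mpsc` so that it stays self-contained. `SettleRequest`, `SettleError`, `spawn_settle_queue`, and the `current_block` callback are hypothetical names chosen for illustration, and the `>=` deadline comparison is an assumption.

```rust
use std::sync::mpsc::{self, Receiver, SyncSender};
use std::thread;

/// Hypothetical request type holding only the fields this sketch needs.
struct SettleRequest {
    solution_id: u64,
    /// Block number at which the settlement is considered too late.
    submission_deadline: u64,
    /// Channel used to report the outcome back to the caller.
    response: mpsc::Sender<Result<u64, SettleError>>,
}

#[derive(Debug, PartialEq)]
enum SettleError {
    DeadlineExceeded,
}

/// Spawns one worker (one per driver in this sketch) that handles queued
/// requests strictly in order. `current_block` stands in for however the
/// driver learns the latest block number.
fn spawn_settle_queue(
    max_queue_size: usize,
    current_block: fn() -> u64,
) -> SyncSender<SettleRequest> {
    let (tx, rx): (SyncSender<SettleRequest>, Receiver<SettleRequest>) =
        mpsc::sync_channel(max_queue_size);
    thread::spawn(move || {
        for request in rx {
            if current_block() >= request.submission_deadline {
                // The deadline passed while the request waited in the queue,
                // so no settle request is sent to the solver.
                let _ = request.response.send(Err(SettleError::DeadlineExceeded));
                continue;
            }
            // Placeholder for the real settlement: echo the solution id.
            let _ = request.response.send(Ok(request.solution_id));
        }
    });
    tx
}
```

Because a single worker drains the queue, a solver behind this driver never sees two settlements in flight at once, which is the property the queue is meant to provide.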

How to test

Existing tests. More driver tests will be implemented once the approach is approved.

@squadgazzz squadgazzz marked this pull request as ready for review November 18, 2024 09:32
@squadgazzz squadgazzz requested a review from a team as a code owner November 18, 2024 09:32
@squadgazzz squadgazzz marked this pull request as draft November 18, 2024 10:07
@squadgazzz squadgazzz marked this pull request as ready for review November 18, 2024 10:58
    /// Archive node URL used to index CoW AMM
    archive_node_url: Option<Url>,
}

fn default_settle_queue_size() -> usize {
    3
Contributor Author

Not sure if it makes sense to keep more than 3 pending settlements, at least for mainnet.
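
To make the effect of a small limit concrete, here is a hedged sketch of a bounded queue that rejects new entries once it is full. Only `default_settle_queue_size` and its value come from the diff above. `EnqueueError`, `settle_queue`, and `try_enqueue` are hypothetical, and `std::sync::mpsc::sync_channel` is a stand-in for whatever channel the driver actually uses.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

#[derive(Debug, PartialEq)]
enum EnqueueError {
    /// The queue already holds the maximum number of pending settlements.
    QueueFull,
    /// The worker side of the queue has shut down.
    Closed,
}

/// Default taken from the diff above.
fn default_settle_queue_size() -> usize {
    3
}

/// Creates a queue that holds at most `size` pending solution ids.
fn settle_queue(size: usize) -> (SyncSender<u64>, Receiver<u64>) {
    sync_channel(size)
}

/// Tries to enqueue without blocking and rejects the request when full.
fn try_enqueue(queue: &SyncSender<u64>, solution_id: u64) -> Result<(), EnqueueError> {
    queue.try_send(solution_id).map_err(|err| match err {
        TrySendError::Full(_) => EnqueueError::QueueFull,
        TrySendError::Disconnected(_) => EnqueueError::Closed,
    })
}
```

With the default of 3, a fourth settlement is rejected immediately instead of waiting behind three others that may already consume the whole submission window.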

Contributor

@m-lord-renkse m-lord-renkse left a comment

I still need to review the thread-spawning logic more closely, but it would be really cool if we could have e2e tests in which we can hit the limit 🙏

@squadgazzz
Contributor Author

I still need to review the thread-spawning logic more closely, but it would be really cool if we could have e2e tests in which we can hit the limit 🙏

Yes, before implementing it, I just wanted to finalize the approach itself to avoid back-and-forth changes.

@squadgazzz squadgazzz mentioned this pull request Nov 22, 2024
4 tasks

This pull request has been marked as stale because it has been inactive for a while. Please update this pull request or it will be automatically closed.

@m-lord-renkse
Contributor

@squadgazzz should I review this or aren't we going to have the queue in the end?

@squadgazzz
Contributor Author

@squadgazzz should I review this or aren't we going to have the queue in the end?

@m-lord-renkse, we still need the queue for solvers that cannot handle multiple settlements in the same block.

"settle deadline exceeded. unable to return a response"
);
}
continue;
Contributor

If we reach this point, it is because the deadline was hit before we could even call /settle. It could be the first call or the 5th one. Shouldn't we log here for debugging purposes? 🤔

Contributor Author

If we reach here, response_sender has already sent the error successfully, and the receiver logs it:

    response_rx.await.map_err(|err| {
        tracing::error!(?err, "Failed to dequeue /settle response");
        Error::SubmissionError
    })?
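
The receiving side of that exchange can be sketched with the standard library alone. This is an illustration under assumptions, not the driver's code: `std::sync::mpsc` replaces the async oneshot channel, `eprintln!` replaces `tracing::error!`, and `await_response` and this two-variant `Error` enum are hypothetical.

```rust
use std::sync::mpsc;

#[derive(Debug, PartialEq)]
enum Error {
    DeadlineExceeded,
    SubmissionError,
}

/// Waits for the queue worker's answer. A sender dropped without replying is
/// logged and mapped to `SubmissionError`; a result that the worker sent
/// explicitly, including an error, is passed through unchanged.
fn await_response(response_rx: mpsc::Receiver<Result<u64, Error>>) -> Result<u64, Error> {
    response_rx.recv().map_err(|err| {
        eprintln!("Failed to dequeue /settle response: {err}");
        Error::SubmissionError
    })?
}
```

In this sketch the log line inside `map_err` runs only when the channel itself fails. An error the worker sends through the channel reaches the caller as the inner `Result`.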

.await
.unwrap_or_else(|_| Err(competition::Error::SubmissionError))?)
let sender = state.settle_queue_sender();
let (response_tx, response_rx) = oneshot::channel();
Contributor

Shouldn't this be created within settle_queue_sender? As it stands, we create sender and then response_tx only to pass it to sender in the send() function. The queue sender should create its own resources.

Contributor Author

I assume this is addressed now.
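
One possible shape for the suggestion above, where the queue handle creates the response channel itself, is sketched below. The thread does not show the final implementation, so every name here (`SettleQueueSender`, `QueuedSettle`, `enqueue`, the unit struct `UnableToEnqueue`) is hypothetical, and `std::sync::mpsc` again substitutes for the async channels.

```rust
use std::sync::mpsc::{self, Receiver, SyncSender};

type Response = Result<u64, String>;

/// Hypothetical queue entry: a solution id plus the channel for its result.
struct QueuedSettle {
    solution_id: u64,
    response: mpsc::Sender<Response>,
}

#[derive(Debug, PartialEq)]
struct UnableToEnqueue;

/// Hypothetical handle wrapping the sending half of the settlement queue.
struct SettleQueueSender {
    inner: SyncSender<QueuedSettle>,
}

impl SettleQueueSender {
    /// Creates the handle together with the receiving half for the worker.
    fn new(max_queue_size: usize) -> (Self, Receiver<QueuedSettle>) {
        let (inner, queue_rx) = mpsc::sync_channel(max_queue_size);
        (Self { inner }, queue_rx)
    }

    /// Enqueues a request and returns the receiver for its response, so the
    /// caller never creates or passes a response channel itself.
    fn enqueue(&self, solution_id: u64) -> Result<Receiver<Response>, UnableToEnqueue> {
        let (response_tx, response_rx) = mpsc::channel();
        self.inner
            .try_send(QueuedSettle { solution_id, response: response_tx })
            .map_err(|_| UnableToEnqueue)?;
        Ok(response_rx)
    }
}
```

The route handler then only calls `enqueue` and waits on the returned receiver.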

competition::Error::UnableToEnqueue
})?;

Ok(response_rx.await.map_err(|err| {
Contributor

Shouldn't we have a timeout here?

Contributor Author

No, since the mempool internally checks for the submission deadline and the task will be dropped once it is reached:

    if block.number >= submission_deadline {
        tracing::info!(
            ?hash,
            deadline = submission_deadline,
            current_block = block.number,
            "tx not confirmed in time, cancelling",
        );
        self.cancel(mempool, settlement.gas.price, solver).await?;
        return Err(Error::Expired);
    }
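
The argument that no extra timeout is needed rests on this loop always terminating at the deadline. A simplified, synchronous sketch of that control flow follows. `wait_for_confirmation` and `confirmed_at` are hypothetical stand-ins for the block stream and receipt lookup, and the order of the two checks is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    Expired,
}

/// Walks incoming block numbers until the transaction confirms or the
/// submission deadline is reached. `confirmed_at` is the block in which the
/// transaction gets mined, if it is mined at all.
fn wait_for_confirmation(
    blocks: impl IntoIterator<Item = u64>,
    submission_deadline: u64,
    confirmed_at: Option<u64>,
) -> Result<u64, Error> {
    for block in blocks {
        if matches!(confirmed_at, Some(mined) if mined <= block) {
            return Ok(block);
        }
        if block >= submission_deadline {
            // The real code cancels the pending transaction before returning.
            return Err(Error::Expired);
        }
    }
    // The block stream ended without a confirmation.
    Err(Error::Expired)
}
```

As long as new blocks keep arriving, the function returns no later than the deadline block, which bounds how long the caller waits for a response.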

Contributor

@MartinquaXD MartinquaXD left a comment

The idea behind this change makes sense to me, but the implementation seems awkward and in the wrong place.

crates/driver/src/domain/competition/mod.rs (outdated, resolved)
crates/driver/src/infra/config/file/mod.rs (outdated, resolved)
crates/driver/src/infra/api/routes/settle/mod.rs (outdated, resolved)
crates/driver/src/infra/api/mod.rs (outdated, resolved)
@squadgazzz squadgazzz marked this pull request as ready for review November 29, 2024 19:25
Contributor

@MartinquaXD MartinquaXD left a comment

LGTM; just some small suggestions.

crates/driver/src/domain/competition/mod.rs (outdated, resolved)
crates/driver/src/domain/competition/mod.rs (outdated, resolved)
crates/driver/src/domain/competition/mod.rs (outdated, resolved)
crates/driver/src/domain/competition/mod.rs (outdated, resolved)
crates/driver/src/infra/config/file/mod.rs (outdated, resolved)
Contributor

@m-lord-renkse m-lord-renkse left a comment

Approved, assuming comments above will be addressed.

@squadgazzz squadgazzz added the blocked This issue is blocked by some other work label Dec 2, 2024
@MartinquaXD MartinquaXD marked this pull request as draft December 3, 2024 07:49
@squadgazzz squadgazzz removed the blocked This issue is blocked by some other work label Dec 16, 2024
@squadgazzz squadgazzz marked this pull request as ready for review December 16, 2024 12:30
@squadgazzz
Contributor Author

Must be merged together with #3133

Contributor

@sunce86 sunce86 left a comment

Nothing major, logic looks good.

crates/driver/src/domain/competition/mod.rs (outdated, resolved)
crates/driver/src/domain/competition/mod.rs (outdated, resolved)
crates/driver/src/domain/competition/mod.rs (outdated, resolved)
crates/driver/src/domain/competition/mod.rs (resolved)
crates/driver/src/domain/competition/mod.rs (resolved)
crates/driver/src/infra/config/file/mod.rs (resolved)
@sunce86
Contributor

sunce86 commented Dec 20, 2024

BTW, what happens to colocated drivers running the reference driver implementation? Could they experience issues if they don't pull the latest changes that contain this PR and we apply the release?

Is it a lot of work to add a configuration parameter to autopilot that will switch from queued to non-queued submission? This way we could:

  1. Apply the release with this PR but keep the option off for colocated drivers.
  2. Make sure all of them pulled the newest changes
  3. Enable for all drivers.

OR

  1. We can merge the PR
  2. Make sure all colocated drivers pulled the changes
  3. Apply the release

@squadgazzz
Copy link
Contributor Author

BTW, what happens to colocated drivers running the reference driver implementation? Could they experience issues if they don't pull the latest changes that contain this PR and we apply the release?

The only thing I can think of is that we return to the state where no settlement queue exists for colocated solvers, so some of them can again suffer from receiving settlement requests after the deadline is exceeded, which happens rarely.

Is it a lot of work to add a configuration parameter to autopilot that will switch from queued to non-queued submission?

Not sure I am following why we need this.

  1. We can merge the PR
  2. Make sure all colocated drivers pulled the changes
  3. Apply the release

That probably makes sense. I'll merge this after the next weekly release so we have some time to run it in staging, and solvers will also have time to update.

# Conflicts:
#	crates/driver/src/domain/competition/mod.rs
#	crates/driver/src/infra/api/mod.rs
#	crates/driver/src/infra/api/routes/solve/mod.rs
#	crates/driver/src/infra/config/file/load.rs
#	crates/driver/src/infra/config/file/mod.rs
#	crates/driver/src/infra/solver/mod.rs
# Conflicts:
#	crates/driver/src/infra/config/file/mod.rs
@squadgazzz squadgazzz enabled auto-merge (squash) December 23, 2024 13:14
@squadgazzz squadgazzz merged commit 59091c1 into main Dec 23, 2024
10 of 11 checks passed
@squadgazzz squadgazzz deleted the settle-queue-to-driver branch December 23, 2024 13:17
@github-actions github-actions bot locked and limited conversation to collaborators Dec 23, 2024
5 participants