GetBestMiningCandidate & F3 #730

Closed
Stebalien opened this issue Oct 29, 2024 · 8 comments

@Stebalien
Member

Stebalien commented Oct 29, 2024

When we pick a base on which to mine, we compare:

  1. The current chain head.
  2. The last tipset we mined on.

If the last tipset we mined on is heavier, we'll stick with that. This leads to a problematic situation:

  1. We try mining on X.
  2. F3 finalizes Y at the same height, Y is lighter than X.
  3. We will keep trying to mine on X until someone else mines a block on Y.
  • One option here is to remove the lastWork field; I'm not 100% sure why we have it.
  • Another is to change ChainTipSetWeight to take checkpoints into account and set the weight to 0 for any tipset that is incompatible with the current checkpoint. We need to be careful here in case the tipset weight is used anywhere else, but I don't think it is?
  • The best option is probably just to ask lotus about the current checkpoint.
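
To make the failure mode concrete, here is a minimal sketch of the "stick with the heavier base" rule next to a checkpoint-aware variant. Everything in it is an assumption for illustration: the `TipSet` struct, the `OnFinalizedChain` flag, and both function names are hypothetical and are not the actual miner or lotus code.

```go
package main

// TipSet is a simplified stand-in for a real tipset. Only the fields
// needed to illustrate base selection are modelled.
type TipSet struct {
	Key    string
	Height int64
	Weight int64
	// OnFinalizedChain is a hypothetical flag: true when this tipset is
	// compatible with the current F3 checkpoint.
	OnFinalizedChain bool
}

// pickHeaviest mirrors the rule described above: keep the last base we
// mined on if it is heavier than the current chain head.
func pickHeaviest(head, lastBase *TipSet) *TipSet {
	if lastBase != nil && lastBase.Weight > head.Weight {
		return lastBase
	}
	return head
}

// pickCheckpointAware sketches the "ask about the checkpoint" option:
// a remembered base is only eligible if it agrees with the checkpoint.
func pickCheckpointAware(head, lastBase *TipSet) *TipSet {
	if lastBase == nil || !lastBase.OnFinalizedChain {
		return head
	}
	return pickHeaviest(head, lastBase)
}
```

With X heavier than Y but Y finalized, `pickHeaviest(y, x)` keeps returning X, while `pickCheckpointAware(y, x)` returns Y.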
@Stebalien
Member Author

Stebalien commented Oct 29, 2024

Note: I have no idea if this is actually the issue, but it's very suspicious. We need to audit all the miner code for unexpected gotchas.

@Kubuxu
Contributor

Kubuxu commented Oct 29, 2024

I'm pretty certain the reason this exists is catch-up mining, where we loop quickly and lotus takes too long to update the head. Also, this would almost certainly be an issue. It might not be the actual cause of what we observe, but I can see it cropping up.

@Stebalien
Member Author

Ok, I'm pretty sure there's no point behind lastWork. It's only useful if lotus returns a block that weighs less than the last block it returned. I guess maybe it's to solve the issue of lotus crashing, restarting, and... coming up with a worse block? I'm not convinced.

To be clear, lastWork literally just returns the best block we've seen from ChainHead. We don't update it anywhere except in GetBestMiningCandidate.

@Stebalien
Member Author

Oh! Maybe the concern is multiple lotus nodes where one is behind? I mean, this isn't really the right way to deal with that especially because we completely ignore anything we mined.

@Stebalien
Member Author

It looks like we keep it for side effects in some cases... E.g., we shove the null rounds in it.

@Stebalien
Member Author

Yeah, the issue is catch-up mining, but it's not about being "fast", it's about null rounds. When catching up, we keep trying to mine on every round after our last tipset until we either see a better head or win a block. Of course... we don't wait after winning which means it's kind of broken.

Unfortunately, I'm not sure if I can trivially remove the "take heaviest" rule because, due to F3, we can flip-flop. I.e.:

  1. Take base A.
  2. Try to mine 10 null blocks on A.
  3. See a better base B, mine 10 null blocks on B.
  4. F3 finalizes A, so we switch back to it.

At that point, we'd want to pick up at A+11, not A+1.
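
As a rough illustration of that flip-flop, the sketch below models a miner that forgets its null-round count whenever the base changes. The `base` and `naiveMiner` types and the `nextRound` method are hypothetical names invented for this example; they are not taken from the real miner code.

```go
package main

// base is a simplified mining base: a name and the epoch it sits at.
type base struct {
	name   string
	height int64
}

// naiveMiner forgets how many null rounds it has tried as soon as the
// base changes. This is the behaviour we would get from simply dropping
// the "take heaviest" rule without tracking anything else.
type naiveMiner struct {
	cur      string
	attempts int64
}

// nextRound returns the epoch the miner would try to mine at next on b.
func (m *naiveMiner) nextRound(b base) int64 {
	if m.cur != b.name {
		// Base switched: the attempt counter starts over.
		m.cur, m.attempts = b.name, 0
	}
	m.attempts++
	return b.height + m.attempts
}
```

Calling `nextRound` ten times on A, ten times on B, and then once more on A returns A+1 for the last call, which is the restart the comment above wants to avoid.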

@Stebalien
Member Author

Ok, there's a simple fix: we can't mine at the same height twice anyways. So we'll just track the last height at which we've determined we didn't win and use that to calculate nulls.

@Kubuxu
Contributor

Kubuxu commented Nov 18, 2024

Fixed in filecoin-project/lotus#12690

@Kubuxu Kubuxu closed this as completed Nov 18, 2024
@github-project-automation github-project-automation bot moved this from In review to Done in F3 Nov 18, 2024