Potential DoS vector: blobs that are too big to be included in a square #3072

Closed
rootulp opened this issue Jan 31, 2024 · 14 comments
Labels
bug Something isn't working WS: Maintenance 🔧 includes bugs, refactors, flakes, and tech debt etc

Comments

@rootulp
Collaborator

rootulp commented Jan 31, 2024

Problem

Originally reported by @tuxcanfly. See https://gist.github.com/tuxcanfly/207ab3dc1fac4bb9c38bedad2a053f18

Proposal

Is it possible for CheckTx to reject this type of transaction prior to admittance into the mempool?

  • A stateful check could be based on the GovSquareSize of 64, which would reject transactions larger than ~2 MiB
  • A stateless check could be based on the SquareSizeUpperBound of 128, which would reject transactions larger than ~8 MiB (see the sketch below for the arithmetic)
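For context, here is a rough sketch of the arithmetic behind those two bounds, assuming 512-byte shares and ignoring the shares consumed by the PFB tx itself and by padding. The constant names and values below are illustrative stand-ins, not the real appconsts identifiers:

```go
package main

import "fmt"

// Illustrative constants; the real values live in celestia-app's appconsts.
const (
	shareSize            = 512 // assumed bytes per share
	govSquareSize        = 64  // assumed governance-controlled square size
	squareSizeUpperBound = 128 // assumed hard upper bound on square size
)

// maxBlobBytes approximates the byte capacity of a squareSize x squareSize
// data square, ignoring PFB tx shares and padding.
func maxBlobBytes(squareSize int) int {
	return squareSize * squareSize * shareSize
}

func main() {
	fmt.Println(maxBlobBytes(govSquareSize))        // 2097152 bytes ≈ 2 MiB (stateful bound)
	fmt.Println(maxBlobBytes(squareSizeUpperBound)) // 8388608 bytes ≈ 8 MiB (stateless bound)
}
```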
@rootulp
Collaborator Author

rootulp commented Feb 2, 2024

Note: we already have an ante handler that rejects txs where the total blob size is too large. It's possible that ante handler doesn't catch edge cases near the total blob size limit because it derives an upper bound based on a number of parameters. See

@cmwaters
Contributor

cmwaters commented Feb 6, 2024

I think we calculate the space taken by the blobs incorrectly:

func getTotal(sizes []uint32) (sum int) {
	for _, size := range sizes {
		sum += int(size)
	}
	return sum
}

It should be based on the number of shares those blobs occupy, multiplied by 512 bytes per share, i.e.

for _, size := range sizes {
	sum += int(shares.SparseSharesNeeded(size)) * appconsts.ShareSize
}

Also, I would consider adding an extra share or two as a buffer for padding.
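To illustrate why share-based accounting matters, here is a minimal sketch. The sparseSharesNeeded helper below is a simplified stand-in for shares.SparseSharesNeeded that ignores per-share namespace and metadata overhead; the point is that many small blobs occupy far more square space than their raw byte total suggests:

```go
package main

import "fmt"

const shareSize = 512 // assumed bytes per share

// sparseSharesNeeded is a simplified stand-in for shares.SparseSharesNeeded:
// it rounds up to whole shares and ignores per-share overhead.
func sparseSharesNeeded(blobSize uint32) int {
	return int((blobSize + shareSize - 1) / shareSize)
}

func main() {
	sizes := make([]uint32, 100) // 100 blobs of 1 byte each
	for i := range sizes {
		sizes[i] = 1
	}

	rawBytes, shareBytes := 0, 0
	for _, size := range sizes {
		rawBytes += int(size)
		shareBytes += sparseSharesNeeded(size) * shareSize
	}
	fmt.Println(rawBytes)   // 100
	fmt.Println(shareBytes) // 51200: each 1-byte blob still occupies a full 512-byte share
}
```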

@rootulp
Collaborator Author

rootulp commented Feb 6, 2024

getTotal is only used here, where we want the total number of bytes across the blob sizes.

Also, I would consider adding an extra share or two as a buffer for padding.

Agreed. That ante handler doesn't account for padding shares, so this is related to #3081.

@cmwaters
Contributor

cmwaters commented Feb 6, 2024

getTotal is only used here, where we want the total number of bytes across the blob sizes.

Yes, but we want the total bytes based on the number of shares those blobs will occupy, not the raw bytes of the blobs.

@rootulp
Collaborator Author

rootulp commented Feb 6, 2024

🤦‍♂️ you're right, good catch!

rootulp added a commit that referenced this issue Feb 15, 2024
@cmwaters noticed that the ante handler rejects txs but doesn't account
for the shares used by each blob size. See
#3072 (comment)

This means that a PFB with many small blobs would not be rejected by the
ante handler even though it can't possibly fit. This PR modifies the
ante handler to reject based on the # of shares occupied by each
`blobSize` instead of the # of total bytes. Also adds a test case that
would fail with the previous implementation but passes with this
implementation.

--- 

Note to future selves: we should add to the v2 release notes:

- breaking!: the blob ante handler became more restrictive in that it
rejects transactions that would have previously entered the mempool but
failed to be included in a block. Transactions that previously failed
with `ErrTotalBlobSizeTooLarge` should now fail with `ErrBlobTooLarge`.

---------

Co-authored-by: Sanaz Taheri <35961250+staheri14@users.noreply.github.com>
@rootulp
Collaborator Author

rootulp commented Feb 15, 2024

This issue may be partially mitigated by the fix in #3082 which will be released in the next major release.

@evan-forbes
Member

@rootulp since this is a bug, is it okay if we treat it as such? And is this still an open issue given #3082?

@rootulp rootulp added the bug Something isn't working label Mar 4, 2024
@rootulp
Collaborator Author

rootulp commented Mar 4, 2024

TBH I still think this is open. #3082 improves the ante handler to reject PFBs that are guaranteed to not fit in a data square. But I think the DoS vector still exists and a malicious actor could craft PFBs that pass the ante handler but fail to fit in a data square.

Agreed it is a bug. What do you mean by "treat it as such"?

@evan-forbes
Member

evan-forbes commented Mar 4, 2024

Agreed it is a bug. What do you mean by "treat it as such"?

Mainly that we just add a label lol and discuss what is needed to fix it, ofc. That was weird wording tho, apologies.

Do we need more tests than those added in #3082? What do we need to do to be confident that this isn't occurring? I could be missing something, but we should be able to deterministically test for this, correct?

@rootulp
Collaborator Author

rootulp commented Mar 5, 2024

what do we need to do to be confident that this isn't occurring?

One more idea to mitigate the DoS: modify BroadcastTx from blocking to non-blocking

do we need more tests

We could use a fuzz test that submits blobs to the ante handler and verifies that if a valid PFB tx passes the ante handler then it also must be able to be included in an empty block (i.e. a block where it is the only tx).
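A minimal sketch of what such a fuzz test could look like. Everything below (constants, passesAnteHandler, fitsInEmptySquare) is a simplified, assumed stand-in; the real test would wire in the blob share decorator and the square builder, whose two computations can disagree, which is exactly what the fuzzer would catch:

```go
package ante_test

import "testing"

// Assumed constants; the real values come from appconsts.
const (
	shareSize            = 512
	squareSizeUpperBound = 128
)

// sharesNeeded rounds a blob size up to whole shares, ignoring per-share overhead.
func sharesNeeded(size uint32) int {
	return int((size + shareSize - 1) / shareSize)
}

func totalShares(sizes []uint32) int {
	total := 0
	for _, s := range sizes {
		total += sharesNeeded(s)
	}
	return total
}

// passesAnteHandler stands in for the blob share decorator's bound.
func passesAnteHandler(sizes []uint32) bool {
	return totalShares(sizes) <= squareSizeUpperBound*squareSizeUpperBound
}

// fitsInEmptySquare stands in for building a square that contains only this
// PFB; the real version also pays for the PFB tx shares and padding.
func fitsInEmptySquare(sizes []uint32) bool {
	return totalShares(sizes) <= squareSizeUpperBound*squareSizeUpperBound
}

// FuzzBlobAnteVsSquare checks the invariant: any set of blob sizes that
// passes the ante handler must also fit in an otherwise empty square.
func FuzzBlobAnteVsSquare(f *testing.F) {
	f.Add(uint32(512), uint8(4)) // seed corpus: blob size, blob count
	f.Fuzz(func(t *testing.T, blobSize uint32, blobCount uint8) {
		sizes := make([]uint32, blobCount)
		for i := range sizes {
			sizes[i] = blobSize
		}
		if passesAnteHandler(sizes) && !fitsInEmptySquare(sizes) {
			t.Fatalf("blob sizes %v passed the ante handler but don't fit in an empty square", sizes)
		}
	})
}
```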

@evan-forbes
Member

One more idea to mitigate the DoS: modify BroadcastTx from blocking to non-blocking

Does the signer in the user package using BroadcastMode Sync count as non-blocking? In general I agree that we should get rid of broadcast mode block. Perhaps a stupid question, but how does this reduce DoS? Just that it reduces the resources required per connection? If so, do the configurable limits on how long each node will wait and how many active connections it maintains help?

We could use a fuzz test that submits blobs to the ante handler and verifies that if a valid PFB tx passes the ante handler then it also must be able to be included in an empty block (i.e. a block where it is the only tx).

Adding a fuzzer is a good idea! For this specific issue, perhaps we could limit the scope to just the size of the PFB and the problematic ante handler?

@evan-forbes evan-forbes added the needs:discussion item needs to be discussed as a group in the next sync. if marking an item, pls be prepped to talk label Mar 11, 2024
@evan-forbes
Member

To close this issue, we need to add a test case for the boundary of the max blob size in this test:

func TestBlobShareDecorator(t *testing.T) {
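A hedged sketch of what that boundary case could check, written as a standalone test rather than an entry in the real TestBlobShareDecorator table; the constants and sharesNeeded helper are assumptions that ignore per-share overhead:

```go
package ante_test

import "testing"

// Assumed constants; the real ones come from appconsts.
const (
	shareSize            = 512
	squareSizeUpperBound = 128
)

// sharesNeeded rounds a blob size up to whole shares, ignoring per-share overhead.
func sharesNeeded(size int) int {
	return (size + shareSize - 1) / shareSize
}

// TestMaxBlobSizeBoundary sketches the boundary: a blob that exactly fills the
// upper-bound square fits, while one extra byte spills into an additional share.
func TestMaxBlobSizeBoundary(t *testing.T) {
	maxShares := squareSizeUpperBound * squareSizeUpperBound // 16384 shares
	maxBytes := maxShares * shareSize                        // 8388608 bytes

	if got := sharesNeeded(maxBytes); got > maxShares {
		t.Fatalf("blob of %d bytes needs %d shares, want <= %d", maxBytes, got, maxShares)
	}
	if got := sharesNeeded(maxBytes + 1); got <= maxShares {
		t.Fatalf("blob of %d bytes needs %d shares, expected it to exceed %d", maxBytes+1, got, maxShares)
	}
}
```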

@evan-forbes evan-forbes removed the needs:discussion item needs to be discussed as a group in the next sync. if marking an item, pls be prepped to talk label Mar 11, 2024
ninabarbakadze pushed a commit to ninabarbakadze/celestia-app that referenced this issue Apr 2, 2024
@evan-forbes evan-forbes added WS: Maintenance 🔧 includes bugs, refactors, flakes, and tech debt etc needs:triage and removed needs:triage labels May 17, 2024
0xchainlover pushed a commit to celestia-org/celestia-app that referenced this issue Aug 1, 2024
@evan-forbes
Member

I think we can close this as completed, @rootulp, correct?

@rootulp
Collaborator Author

rootulp commented Jan 27, 2025

We can close this and open a new issue for:

We could use a fuzz test that submits blobs to the ante handler and verifies that if a valid PFB tx passes the ante handler then it also must be able to be included in an empty block (i.e. a block where it is the only tx).

Created #4275

@rootulp rootulp closed this as not planned Jan 27, 2025