Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize the Disperser perf #930

Merged
merged 2 commits into from
Nov 22, 2024
Merged

Conversation

jianoaix
Copy link
Contributor

@jianoaix jianoaix commented Nov 22, 2024

Why are these changes needed?

The Disperser in particular the retrieval latency was 60+ seconds, when there was high traffic and/or large blobs.

There was due to inefficient memory management when dealing with blob buffer from S3, making it spend a lot of CPU/time on memory allocation and GC.

Tested:
From local retrieval query for a large blob,

Before:

time grpcurl -max-msg-sz 15000000 -proto ./api/proto/disperser/disperser.proto -import-path ./api/proto -d '{"batch_header_hash": "G9ntJR15dlOlljWQnx2avoT39XqLD5STAo91B8OMBqw=", "blob_index": 613 }' disperser-holesky.eigenda.xyz:443 disperser.Disperser/RetrieveBlob > /dev/null 2>&1
real	0m7.335s
user	0m0.357s
sys	0m0.157s

After:

time grpcurl -max-msg-sz 15000000 -proto ./api/proto/disperser/disperser.proto -import-path ./api/proto -d '{"batch_header_hash": "G9ntJR15dlOlljWQnx2avoT39XqLD5STAo91B8OMBqw=", "blob_index": 613 }' disperser-holesky.eigenda.xyz:443 disperser.Disperser/RetrieveBlob > /dev/null 2>&1

real	0m0.847s
user	0m0.382s
sys	0m0.152s

From pprof:

Before:
mem

After:

Screenshot 2024-11-22 at 2 45 13 PM

Checks

  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@jianoaix jianoaix requested review from bxue-l2 and dmanc November 22, 2024 22:54
Copy link
Contributor

@dmanc dmanc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@jianoaix jianoaix merged commit 0a4e852 into Layr-Labs:master Nov 22, 2024
7 checks passed
@@ -105,15 +106,32 @@ func NewClient(ctx context.Context, cfg commonaws.ClientConfig, logger logging.L
return ref, err
}

func PeekObjectSize(ctx context.Context, s3Client *s3.Client, bucket, key string) (int64, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have this method in this file HeadObject. Could we have used that one?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants