Disperser auth #984
Signed-off-by: Cody Littley <cody@eigenlabs.org>
lgtm, few comments
uint32 disperserID = 2;

// Signature using the disperser's ECDSA key over keccak hash of the batch. The purpose of this signature
// is to prevent hooligans from tricking DA nodes into storing data that they shouldn't be storing.
🤣
	return nil
}

key, err := a.getDisperserKey(ctx, now, request.DisperserID)
This would result in an extra ETH call for any new DisperserID it sees.
Since we're already hardcoding the only allowed disperser, does it make sense if we consider a request invalid if DisperserID is unknown (not EigenLabsDisperserID) in isAuthenticationStillValid?
var defaultAddress gethcommon.Address
if err != nil {
	return defaultAddress, fmt.Errorf("failed to get disperser address: %w", err)
}
if address == defaultAddress {
	return defaultAddress, fmt.Errorf("disperser with id %d not found", disperserID)
}
I think this reply is for another comment in reader.
Oops, you're correct @ian-shim, wires got crossed here. 🙃
The response I intended to type here was that the code does cache this value inside this cache: keyCache *lru.Cache[uint32, *keyWithTimeout].
The case I was worried about is an exploit with sending requests with increasing disperser IDs
Ah, I see what you mean. You don't need to send increasing disperser IDs, since nothing is cached when a non-existent disperser ID is queried.
The core purpose of this caching was to reduce the latency of this RPC. The latency of an invalid request is not really that important. Even if reading the chain state is not particularly expensive, I agree that it's generally good practice to avoid doing as much as possible for invalid RPCs.
The solution in the immediate term is to simply reject any disperser ID other than 0. I've made that change.
When we eventually have multiple dispersers, we will need to add a function to the smart contract that returns a list of all currently valid disperser IDs. We can cache this list and refresh it every X seconds, and always reject any request for a disperser that is unknown. Unfortunately this change will not be backwards compatible, meaning that node operators will have a required upgrade when we enable multiple dispersers. Not the end of the world, but not much we can do to avoid it now I think (short of updating contracts again).
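For illustration only, the multi-disperser plan described above (cache the list of valid IDs, refresh it periodically, reject anything unknown) might be sketched as follows. This is not code from the PR: idAllowList, fetchValidIDs, and isValid are hypothetical names, and fetchValidIDs stands in for a contract function that does not exist yet.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// idAllowList caches the set of valid disperser IDs and refreshes it
// at most once per refreshPeriod. All names here are hypothetical.
type idAllowList struct {
	mu            sync.Mutex
	fetchValidIDs func() ([]uint32, error) // stand-in for a future contract call
	refreshPeriod time.Duration
	lastRefresh   time.Time
	valid         map[uint32]struct{}
}

func newIDAllowList(fetch func() ([]uint32, error), period time.Duration) *idAllowList {
	return &idAllowList{
		fetchValidIDs: fetch,
		refreshPeriod: period,
		valid:         map[uint32]struct{}{},
	}
}

// isValid reports whether id is in the cached list, refreshing the list
// first if it has never been loaded or is older than refreshPeriod.
func (l *idAllowList) isValid(now time.Time, id uint32) (bool, error) {
	l.mu.Lock()
	defer l.mu.Unlock()

	if l.lastRefresh.IsZero() || now.Sub(l.lastRefresh) >= l.refreshPeriod {
		ids, err := l.fetchValidIDs()
		if err != nil {
			return false, fmt.Errorf("failed to refresh disperser IDs: %w", err)
		}
		fresh := make(map[uint32]struct{}, len(ids))
		for _, v := range ids {
			fresh[v] = struct{}{}
		}
		l.valid = fresh
		l.lastRefresh = now
	}
	_, ok := l.valid[id]
	return ok, nil
}

// demo checks three IDs within one refresh period and reports how many
// times the (simulated) chain was read.
func demo() (results []bool, fetches int) {
	list := newIDAllowList(func() ([]uint32, error) {
		fetches++
		return []uint32{0}, nil
	}, time.Minute)

	now := time.Now()
	for _, id := range []uint32{0, 1, 2} {
		ok, _ := list.isValid(now, id)
		results = append(results, ok)
	}
	return results, fetches
}

func main() {
	results, fetches := demo()
	fmt.Println(results, fetches)
}
```

With this shape, a flood of requests carrying unknown IDs costs at most one chain read per refresh period, regardless of how many distinct IDs are tried.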
In my fix, I add an additional parameter to the authentication: disperserIDFilter func(uint32) bool
If a disperser ID causes this function to return false, then the authentication attempt immediately fails.
The purpose of this extra complexity (as opposed to hard coding it) is to avoid having to delete unit tests. The authenticator is more or less capable of handling multiple dispersers (when we eventually have multiple dispersers), and I wrote unit tests to cover those scenarios. Although I could delete those for slightly simpler code, I'd like to avoid having to rewrite them when we get around to decentralized dispersers.
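A minimal sketch of how such a filter could short-circuit authentication, for readers following along. Only the disperserIDFilter func(uint32) bool signature comes from the comment above; the authenticator struct, the lookups counter, and errUnknownDisperser are simplified, hypothetical stand-ins for the real code.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnknownDisperser is a hypothetical sentinel error for rejected IDs.
var errUnknownDisperser = errors.New("unknown disperser ID")

// authenticator is a stripped-down stand-in for the real authenticator.
type authenticator struct {
	// disperserIDFilter returns true for disperser IDs that may be authenticated.
	disperserIDFilter func(uint32) bool
	// lookups counts how often the expensive path (key lookup, signature check) ran.
	lookups int
}

// authenticate rejects filtered-out IDs before doing any expensive work.
func (a *authenticator) authenticate(disperserID uint32) error {
	if !a.disperserIDFilter(disperserID) {
		return fmt.Errorf("%w: %d", errUnknownDisperser, disperserID)
	}
	a.lookups++ // the key lookup and signature verification would happen here
	return nil
}

func main() {
	// Production-style filter: only disperser ID 0 is accepted.
	a := &authenticator{disperserIDFilter: func(id uint32) bool { return id == 0 }}

	fmt.Println(a.authenticate(0))
	fmt.Println(a.authenticate(7))
	fmt.Println("expensive lookups:", a.lookups)
}
```

A unit test that wants to exercise several dispersers can pass a permissive filter such as func(uint32) bool { return true } without touching the authentication logic.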
disperser/cmd/controller/main.go
Outdated
nodeClientManager, err := controller.NewNodeClientManager(config.NodeClientCacheSize, logger)

var requestSigner clients.RequestSigner
if !config.DisperserStoreChunksSigningDisabled {
if the store chunks signing is disabled, we should probably throw a warning on the start up logs.
var requestSigner clients.DispersalRequestSigner
if config.DisperserStoreChunksSigningDisabled {
	logger.Warn("StoreChunks() signing is disabled")
} else {
	requestSigner, err = clients.NewDispersalRequestSigner(
		context.Background(),
		config.AwsClientConfig.Region,
		config.AwsClientConfig.EndpointURL,
		config.DisperserKMSKeyID)
	if err != nil {
		return fmt.Errorf("failed to create request signer: %v", err)
	}
}
// authenticationTimeoutDuration is the duration for which an auth is valid.
// If this is zero, then auth saving is disabled, and each request will be authenticated independently.
authenticationTimeoutDuration time.Duration
Let's keep this value low for the beginning.
I've set this to be one minute by default. Is that sufficient, or should we lower it further?
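To make the timeout semantics under discussion concrete, here is a rough sketch of a per-origin authentication cache where a zero timeout disables caching. It is an assumption-laden illustration, not the PR's code: authCache, isStillValid, and cacheResult are hypothetical names, and a plain map is used with no eviction or locking.

```go
package main

import (
	"fmt"
	"time"
)

// authCache remembers, per origin, until when a previous successful
// authentication may be reused. Hypothetical sketch; not concurrency-safe.
type authCache struct {
	timeout time.Duration
	expiry  map[string]time.Time // origin -> time at which the cached auth expires
}

func newAuthCache(timeout time.Duration) *authCache {
	return &authCache{timeout: timeout, expiry: map[string]time.Time{}}
}

// isStillValid reports whether origin has an unexpired cached authentication.
// With a zero timeout nothing is ever considered valid.
func (c *authCache) isStillValid(now time.Time, origin string) bool {
	if c.timeout == 0 {
		return false
	}
	exp, ok := c.expiry[origin]
	return ok && now.Before(exp)
}

// cacheResult records a successful authentication for origin.
func (c *authCache) cacheResult(now time.Time, origin string) {
	if c.timeout == 0 {
		return
	}
	c.expiry[origin] = now.Add(c.timeout)
}

// demo authenticates once and then checks validity 30s and 61s later.
func demo(timeoutSeconds int) (validAt30s, validAt61s bool) {
	cache := newAuthCache(time.Duration(timeoutSeconds) * time.Second)
	start := time.Now()
	cache.cacheResult(start, "10.0.0.1")
	return cache.isStillValid(start.Add(30*time.Second), "10.0.0.1"),
		cache.isStillValid(start.Add(61*time.Second), "10.0.0.1")
}

func main() {
	fmt.Println(demo(60)) // one-minute timeout, as in the default mentioned above
	fmt.Println(demo(0))  // caching disabled
}
```

The trade-off is the usual one: a longer timeout saves signature checks per origin, while a shorter one narrows the window in which a cached result can be reused.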
config *NodeClientConfig
initOnce sync.Once
conn *grpc.ClientConn
requestSigner DispersalRequestSigner
A slightly different structuring of it: shall we pass in an interface, not a concrete implementation?
Like we have RequestSigners interface (and passed around), but with a KMS based implementation. This leaves space for potential non-KMS (i.e. non AWS specific) options (in a decentralized scenario it may be needed).
I'm a little confused about this comment. DispersalRequestSigner is already an interface. Is the interface ok in its current form?
// DispersalRequestSigner encapsulates the logic for signing StoreChunks requests.
type DispersalRequestSigner interface {
	// SignStoreChunksRequest signs a StoreChunksRequest. Does not modify the request
	// (i.e. it does not insert the signature).
	SignStoreChunksRequest(ctx context.Context, request *grpc.StoreChunksRequest) ([]byte, error)
}
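As a sketch of the point being discussed (an interface leaves room for non-KMS signers), the following shows a local-key implementation behind an interface of the same shape. Everything here is assumed for illustration: storeChunksRequest is a hypothetical stand-in for the generated gRPC type, and the standard library's P-256 and SHA-256 are used in place of the keccak hash and ECDSA key the PR describes, purely so the example runs without external dependencies.

```go
package main

import (
	"context"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// storeChunksRequest is a hypothetical stand-in for the generated gRPC type.
type storeChunksRequest struct {
	DisperserID uint32
	Batch       []byte
}

// requestSigner mirrors the shape of DispersalRequestSigner from the thread.
type requestSigner interface {
	SignStoreChunksRequest(ctx context.Context, request *storeChunksRequest) ([]byte, error)
}

// localSigner signs with an in-memory key instead of a KMS-held key.
type localSigner struct {
	key *ecdsa.PrivateKey
}

// hashRequest hashes the request fields. SHA-256 is an assumption made
// for this sketch; the PR hashes with keccak.
func hashRequest(r *storeChunksRequest) []byte {
	h := sha256.New()
	var id [4]byte
	binary.BigEndian.PutUint32(id[:], r.DisperserID)
	h.Write(id[:])
	h.Write(r.Batch)
	return h.Sum(nil)
}

// SignStoreChunksRequest returns a signature and leaves the request unmodified.
func (s *localSigner) SignStoreChunksRequest(_ context.Context, r *storeChunksRequest) ([]byte, error) {
	return ecdsa.SignASN1(rand.Reader, s.key, hashRequest(r))
}

// verify checks a signature against the request and a public key.
func verify(pub *ecdsa.PublicKey, r *storeChunksRequest, sig []byte) bool {
	return ecdsa.VerifyASN1(pub, hashRequest(r), sig)
}

// roundTrip signs a request, verifies it, then verifies a tampered copy.
func roundTrip() (okOriginal, okTampered bool, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return false, false, err
	}
	var signer requestSigner = &localSigner{key: key}

	req := &storeChunksRequest{DisperserID: 0, Batch: []byte("batch")}
	sig, err := signer.SignStoreChunksRequest(context.Background(), req)
	if err != nil {
		return false, false, err
	}
	okOriginal = verify(&key.PublicKey, req, sig)

	req.Batch = []byte("tampered")
	okTampered = verify(&key.PublicKey, req, sig)
	return okOriginal, okTampered, nil
}

func main() {
	fmt.Println(roundTrip())
}
```

Callers hold only the interface, so swapping the KMS-backed signer for a local or otherwise non-AWS one would not change the client code.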
&bind.CallOpts{
	Context: ctx,
},
disperserID)
Can we settle on either ID or Key? Using both is unnecessary and adds a bit of confusion.
(if this is meant to be an int, ID seems fitting)
I'm happy with calling this an "ID". These are serial numbers. Eventually when we allow the community to register dispersers, we will allocate these in monotonic order via a smart contract.
Where do you see the terminology "key" being used in this context?
Ah, I think I see what you are getting at. There is a method in the chain reader that references the disperser key:
getDisperserKey(
	ctx context.Context,
	now time.Time,
	disperserID uint32) (*gethcommon.Address, error)
The disperser key and the disperser ID are actually distinct things.
- Disperser ID: a unique identifier for the disperser, a 32 bit serial number
- Disperser (public) key: the public key used to verify StoreChunks() requests from the disperser, is an eth address
The reason why the disperser's public key is not a good identifier for the disperser is that we want the capability to re-key the disperser (e.g. if we lose the key, it gets compromised, or we just want to rotate it).
Ok, seems like the "operator address" vs. "operator ID" case.
Then why is DisperserKeyToAddress accepting a disperserID as a param?
There is a map from disperser IDs (serial numbers) to disperser public keys (eth addresses). This function looks up the address given the disperser's ID. The disperser's public key needs to be registered on-chain prior to this call, since the public key is not derived from the ID. Until we decentralize dispersers, there will always be exactly one disperser ID: 0.
The reason why we decided to add this complexity now is that we wanted the ability to change the public key in case we need to rotate the disperser's keys. The disperser ID part was super simple to have, and will mean that we don't need to change this contract in the future when we move to decentralized dispersers.
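The ID-versus-key distinction can be shown with a toy in-memory registry. This is only a sketch under assumptions: disperserRegistry, setAddress, and addressOf are hypothetical names that model the on-chain map described above, and the 20-byte address type stands in for gethcommon.Address.

```go
package main

import "fmt"

// address is a 20-byte stand-in for an eth address.
type address [20]byte

// disperserRegistry models the on-chain map from disperser ID to address.
// Hypothetical; the real map lives in a smart contract.
type disperserRegistry struct {
	addresses map[uint32]address
}

// setAddress registers or rotates the address for a disperser ID.
func (r *disperserRegistry) setAddress(id uint32, addr address) {
	r.addresses[id] = addr
}

// addressOf looks up the address for an ID. A zero address means the
// ID was never registered, mirroring the check quoted earlier in the thread.
func (r *disperserRegistry) addressOf(id uint32) (address, error) {
	var defaultAddress address
	addr := r.addresses[id]
	if addr == defaultAddress {
		return defaultAddress, fmt.Errorf("disperser with id %d not found", id)
	}
	return addr, nil
}

// demo registers disperser 0, rotates its key, and queries an unknown ID.
func demo() (before, after address, unknownErr error) {
	r := &disperserRegistry{addresses: map[uint32]address{}}

	r.setAddress(0, address{0xaa})
	before, _ = r.addressOf(0)

	r.setAddress(0, address{0xbb}) // key rotation: same ID, new address
	after, _ = r.addressOf(0)

	_, unknownErr = r.addressOf(1)
	return before, after, unknownErr
}

func main() {
	before, after, err := demo()
	fmt.Printf("%x -> %x, unknown ID: %v\n", before[:1], after[:1], err)
}
```

Because requests carry the stable ID rather than the key, rotating a lost or compromised key only requires updating the map entry.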
Yea the context makes sense. What I meant was why DisperserKeyToAddress is not called DisperserIDToAddress, since it's accepting an ID and mapping it to an address.
api/clients/v2/request_signer.go
Outdated
@@ -0,0 +1,62 @@
package clients
nit: we can update the file name as well?
Oops, forgot to rename the file. 😅 Fixed.
node/auth/authenticator.go
Outdated
// chainReader is used to read the chain state.
chainReader core.Reader

// keyCache is used to cache the public keys of dispersers.
what public key has only 32 bits?
The uint32 is the disperser ID. These are serial numbers, not eth keys. Currently we have exactly one disperser, and its ID is hard coded to be uint32(0).
Can you fix the comment then?
Updated comments to clarify:
// keyCache is used to cache the public keys of dispersers. The uint32 map keys are disperser IDs. Disperser
// IDs are serial numbers, with the original EigenDA disperser assigned ID 0. The map values contain
// the public key of the disperser and the time when the local cache of the key will expire.
keyCache *lru.Cache[uint32 /* disperser ID */, *keyWithTimeout]
node/auth/authenticator.go
Outdated
	return fmt.Errorf("failed to verify request: %w", err)
}

a.saveAuthenticationResult(now, origin)
nit: cacheAuthenticationResult, "save" feels like it's persisting the data which it's not
renamed
	}
}

address, err := a.chainReader.GetDisperserAddress(ctx, disperserID)
I'd cache this ID to address mapping to save RPCs
This is cached (see keyCache *lru.Cache[uint32, *keyWithTimeout]). By design, we re-fetch the same key again after a timeout just in case the key was changed. But not for each RPC.
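The caching behaviour described here (serve from cache until a timeout, then re-fetch in case the key was rotated) can be sketched as below. Assumptions are flagged in the code: a plain map replaces the LRU cache used in the PR, the key is a string instead of an eth address, and the keyCache struct with its fetch callback is a hypothetical stand-in for the chain reader.

```go
package main

import (
	"fmt"
	"time"
)

// keyWithTimeout pairs a cached key with the time its cache entry expires.
type keyWithTimeout struct {
	key        string // stand-in for the disperser's eth address
	expiration time.Time
}

// keyCache is a simplified sketch: a plain map instead of an LRU cache,
// and not safe for concurrent use.
type keyCache struct {
	ttl     time.Duration
	entries map[uint32]*keyWithTimeout
	fetch   func(id uint32) (string, error) // stand-in for the chain read
}

// get returns the cached key if it has not expired; otherwise it re-fetches.
// Nothing is cached when the fetch fails.
func (c *keyCache) get(now time.Time, id uint32) (string, error) {
	if e, ok := c.entries[id]; ok && now.Before(e.expiration) {
		return e.key, nil
	}
	key, err := c.fetch(id)
	if err != nil {
		return "", fmt.Errorf("failed to fetch key for disperser %d: %w", id, err)
	}
	c.entries[id] = &keyWithTimeout{key: key, expiration: now.Add(c.ttl)}
	return key, nil
}

// demo performs three lookups (fresh, cached, expired) and counts chain reads.
func demo() int {
	fetches := 0
	c := &keyCache{
		ttl:     time.Hour,
		entries: map[uint32]*keyWithTimeout{},
		fetch: func(uint32) (string, error) {
			fetches++
			return "0xabc", nil
		},
	}

	start := time.Now()
	c.get(start, 0)                  // miss: reads the chain
	c.get(start.Add(time.Minute), 0) // hit: served from cache
	c.get(start.Add(2*time.Hour), 0) // expired: reads the chain again
	return fetches
}

func main() {
	fmt.Println("chain reads:", demo())
}
```

Since failed lookups leave no entry behind, repeated requests for a non-existent ID would each trigger a chain read, which is the case the ID filter discussed earlier in the thread is meant to cut off.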
@@ -238,6 +238,33 @@ var (
	EnvVar: common.PrefixEnvVar(EnvVarPrefix, "CHUNK_DOWNLOAD_TIMEOUT"),
	Value: 20 * time.Second,
}
DisableDispersalAuthenticationFlag = cli.BoolFlag{
Why is disabling an option? Shouldn't it be non-optional?
It's enabled by default. Necessary to get things working in inabox. Inabox integration is kind of complex, so I'd like to merge this code with it disabled in the e2e test, and to enable it in a follow up PR.
node/auth/request_signing.go
Outdated
This file might be a better fit in api/clients
I've moved all GRPC hashing code to a new package api/hashing. There is now no longer a dependency (outside of test code) between the client and this file.
api/clients/v2/node_client.go
Outdated
@@ -3,6 +3,7 @@ package clients
import (
	"context"
	"fmt"
	"github.com/Layr-Labs/eigenda/node/auth"
It doesn't seem to me this is a good dependency direction for api/ to depend on node/
this dependency is no longer present
lgtm
grpc "github.com/Layr-Labs/eigenda/api/grpc/node/v2"
"github.com/Layr-Labs/eigenda/api/hashing"
"github.com/Layr-Labs/eigenda/common"
"github.com/aws/aws-sdk-go-v2/aws"
nit: I'd prefer to have a thin kms wrapper client and use it to interface with kms instead of directly importing aws libraries. But feel free to ignore
This class is already just a thin wrapper, although it's use-case specific and not general purpose. Let's discuss this and make any changes in a follow up PR.
common/kms.go
Outdated
@@ -0,0 +1,167 @@
package common
aws related utils are under common/aws/
Moved to common/aws
Why are these changes needed?
Authenticate StoreChunks() requests.
Checks