-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify client codec #1105
Simplify client codec #1105
Conversation
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall big big fan of these changes. Much needed. I personally think we can lean in even higher on the type system and define types for EncodedPayload and Blob. I really don't like looking at a bunch of functions that take a []byte and return another []byte. Wdyt?
api/clients/codecs/blob_codec.go
Outdated
// The returned bytes may be interpreted as a polynomial in Eval form, where each contained field element of | ||
// length 32 represents the evaluation of the polynomial at an expanded root of unity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They could also be interpreted in coeff form right? Like we had in the optimal disperse route at the top of
Also is it really at expanded roots of unity? This part I'm not sure. cc @bxue-l2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They could also be interpreted in coeff form right
This is a tricky question, let me explain my reasoning for wording it this way.
From the perspective of the client, blob polynomials may either be dispersed in Eval
form, or Coeff
form. Regardless of the form the client chooses, however, the disperser chooses to interpret what it receives as a polynomial in Coeff
form.
client: "EncodePayload
returns a polynomial in Eval
form. Depending on my configuration, I may choose to directly disperse the blob in Eval
form, or I may convert the polynomial to Coeff
form" first.
disperser: "No matter what the client sends me, I will interpret it as Coeff
form"
One argument for this set of definitions is that it yields meaningful names for the enum which determines pre-dispersal processing: Eval
means don't IFFT, and Coeff
means do IFFT. If we were to define EncodePayload
, such that what it returns is considered to be Coeff
form in the case of "optimal dispersal", that means the polynomial sent from the client to the disperser is always in Coeff
form. In this case, we would need to think of alternate names for the enum values.
Essentially, the question boils down to this: in the alternate "optimal" dispersal pathway, at what point in time do we declare that the non-iffted bytes are in Coeff
form? Does the client make this declaration, or does the disperser? I chose for the disperser to make this declaration in my descriptions, but I think it's a pretty arbitrary decision. We just need to come to consensus, and stick to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also is it really at expanded roots of unity?
I think so, but confidence is lacking. Waiting for clarification from @bxue-l2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome summary, let's definitely chat about this over standup with everyone, there's a lot going on which I need to think about more deeply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the roots_of_unity, having fought in our rust PRs over the meaning of primitive_roots_of_unity
, roots_of_unity
, expanded_roots_of_unity
, I think the semantics is something like:
- We need roots of unity that form a subgroup of the order equal to the number of FEs in our polynomial/blob
- We store the 29 primitive_roots_of_unity as const in the code, which, when
expanded
, each generate a group of order 2^index of that PROU
(2^0, 2^1, 2^2, etc.)
So if my current understand is correct (cc @bxue-l2 @anupsv), then we should not refer to primitive/expanded in these kinds of docs, just roots_of_unity
. expanded
just refers to the procedure we have to take because of the form we decided to cache the roots in. We could also just have stored every single root of unity for every power directly in the binary and used those directly without needing to expand
a primitive root.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes that is right.
Having roots_of_unity and expanded_roots_of_unity are sufficient.
See wiki.
which are 1, 2^(2^1), 2^(2^2), 2^(2^3)....2^(2^28). The exponent is power of 2 because it needs to work with FFT
Expanded ROU based on 2^(2^1) is [1, w, w^2, w^3]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have a slight different idea on the following statement
EncodePayload returns a polynomial in Eval form
The client can decide which form it wants the EncodedPayload to be interpreted. If CoeffForm, then the EncodedPayload will go to the optimal path, otherwise in the EvalForm, the EncodedPayload needs to undergo IFFT
api/clients/codecs/blob_codec.go
Outdated
// The returned bytes may or may not represent a blob. If the system is configured to distribute blobs in Coeff form, | ||
// then the data returned from this function must be IFFTed to produce the final blob. If the system is configured to | ||
// distribute blobs in Eval form, then the data returned from this function is the final blob representation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to be using blob
here, then we have to be 100% precise and link to its definition somewhere in the codebase.
Also what does "system is configured mean"? Isn't the system eigenda? And eigenda always interprets blobs as being in coeff form?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then we have to be 100% precise and link to its definition somewhere in the codebase
Do we have any precise definitions of a blob in the codebase yet?
system is configured mean
I intended this to mean "however the various clients are configured to disperse and retrieve blobs". Do you think it is better if I say:
"If clients are configured to distribute blobs in Coeff form,..."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think just using "clients are configured" is cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might not have precise definition of blob in the codebase. Ideally we'll have a spec soon enough that we can point to though perhaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My definition of a blob is the non blobHeader part that is received by the disperser. If the length of the received part is not power of 2, we artificially pad it so. So blob is always power of 2
api/clients/codecs/fft_test.go
Outdated
// Number of test iterations | ||
iterations := testRandom.Intn(100) + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a bit weird. Why not just always test the same number of iterations?
api/clients/codecs/fft_test.go
Outdated
// TestFFT checks that data can be IFFTed and FFTed repeatedly, always getting back the original data | ||
func TestFFT(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TestFFT checks that data can be IFFTed and FFTed repeatedly, always getting back the original data | |
func TestFFT(t *testing.T) { | |
// TestFFT checks that data can be IFFTed and FFTed repeatedly, always getting back the original data | |
// TODO: we should probably be using fuzzing instead of this kind of ad-hoc random search testing | |
func TestFFT(t *testing.T) { |
} | ||
|
||
var codec codecs.BlobCodec | ||
var polynomialForm codecs.PolynomialForm | ||
if config.DisablePointVerificationMode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a good time to update this weirdly named flag to something more explicit like the polynomial form being passed directly?
Feel like the fact that coeff form has point verification feature is VERY context dependent on understanding the way the eigenda backend is doing things etc. This should prob just be documented in the ENUM and flag, and then users can choose which form they want to use accordingly?
api/clients/eigenda_client.go
Outdated
encodedPayload, err := m.blobToEncodedPayload(data) | ||
if err != nil { | ||
return nil, fmt.Errorf("error decoding blob: %w", err) | ||
return nil, fmt.Errorf("blob to encoded payload: %w", err) | ||
} | ||
|
||
return decodedData, nil | ||
decodedPayload, err := codecs.DecodePayload(encodedPayload) | ||
if err != nil { | ||
return nil, fmt.Errorf("error decoding payload: %w", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super clean! We might even want a single function like blobToPayload
which does both operations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also might want to use blob
instead of data
for the thing received by RetrieveBlob?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a blob always power of 2?
// If the system is configured to distribute blobs in Eval form, the returned bytes exactly match the blob bytes. | ||
// If the system is configured to distribute blobs in Coeff form, the blob is FFTed before being returned |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// If the system is configured to distribute blobs in Eval form, the returned bytes exactly match the blob bytes. | |
// If the system is configured to distribute blobs in Coeff form, the blob is FFTed before being returned | |
// If the EigenDAClient is configured to distribute blobs in Eval form, the returned bytes exactly match the blob bytes. | |
// If the EigenDAClient is configured to distribute blobs in Coeff form, the blob is FFTed before being returned |
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Signed-off-by: litt3 <102969658+litt3@users.noreply.github.com>
Why are these changes needed?
Checks