
Distributed PoSt support #854

Closed
steven004 opened this issue Aug 30, 2019 · 6 comments
Labels: enhancement (New feature or request), P1 (Medium priority)

Comments

@steven004
Contributor

Description

This is a reminder to enable PoSt to be distributed across servers. This could be done by: 1) supporting multiple miner workers for one miner, with each worker independently calling submitPoSt; 2) allowing multiple submitPoSt messages in one proving period, each with its own provingSet; 3) allowing one submitPoSt message to contain multiple proofs of spacetime, each with its own specified provingSet.
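For illustration only, option 3 might correspond to a message shape along these lines. This is a hypothetical Go sketch, not the actual Filecoin message format; all field names are stand-ins:

```go
package main

import "fmt"

// PoStProof is a hypothetical pairing of one proof with the sectors it covers.
type PoStProof struct {
	ProvingSet []uint64 // sector IDs this proof covers
	Proof      []byte   // opaque proof bytes
}

// SubmitPoStParams is a hypothetical shape for option 3: one message
// carrying several proofs, each over its own proving set.
type SubmitPoStParams struct {
	Proofs []PoStProof
}

// coveredSectors collects every sector covered by the message, so the
// chain could check that the union matches the miner's full proving set.
func coveredSectors(p SubmitPoStParams) []uint64 {
	var all []uint64
	for _, pr := range p.Proofs {
		all = append(all, pr.ProvingSet...)
	}
	return all
}

func main() {
	msg := SubmitPoStParams{Proofs: []PoStProof{
		{ProvingSet: []uint64{1, 2}, Proof: []byte{0x01}},
		{ProvingSet: []uint64{3}, Proof: []byte{0x02}},
	}}
	fmt.Println(coveredSectors(msg)) // prints [1 2 3]
}
```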

This has been discussed with @anorth via email, and with @icorderi / @michellebrous and others in Berlin.

Acceptance criteria

In a distributed pool, PoSt can be computed on the server node that actually stores the data; no data movement or sharing is required.

Risks + pitfalls

Implementing multiple workers requires a new layer, which adds complexity.
Allowing multiple PoSt messages in one period means more messages and more pressure on the chain.

Where to begin

Consider this when implementing rational PoSt.

@dignifiedquire added the enhancement (New feature or request) label Mar 2, 2020
@porcuquine
Collaborator

@cryptonemo In case something here is relevant to your current design work.

@cryptonemo
Collaborator

@magik6k We could really use some focused information on what this would look like.

I've been looking at this for some time, but have been quite sidetracked as other priorities have popped up. But it's getting down to the wire now, and we believe that this is needed. Help me get us there! Tag in anyone else that you think may be able to help. I'm tagging @porcuquine for tracking/input as well.

You had proposed a sketch API of something like this:

type Challenge struct{SectorID, stuff..}
GenerateChallenges() []Challenge
ReadChallenges([]Challenge) []PoStin // Can work on subset of challenges generated
GenerateWindowPoSt(.., challenges []Challenge)

While I understand this at some level, better requirements would be very useful (particularly including the flow, as that's not clear in my head). For reference, I have not run a multi-node mining setup, but am familiar with single node mining setups. I don't quite understand the separations across machines at the lotus process level, so bear that in mind.

Some questions I'd like clarified:

Of the machines involved, where is the (storage) data located, and where are the GPUs located?

How can we design this to minimize data transfer, but optimize parallelization/independence?

What is the primary use-case for this? (e.g. is it making sure WindowPoSt can be offloaded so as not to interfere with WinningPoSt, or just a general performance improvement by distributing the load)?

Regarding the proposed API:

My current understanding is that we can generate challenges and combine all requested inputs for post in some Go struct, and that thing can be shipped (w/o requiring storage/persistence), so long as it's not bigger than hundreds of MiBs (i.e. trees are not involved), correct?

It's not clear to me what ReadChallenges does, although I would think a remote receiver calls that to receive challenges to work on, which does whatever it needs to do to call GenerateWindowPoSt, correct?

And lastly, GenerateWindowPoSt is the most self-explanatory, once all of the inputs are retrieved and it has all challenges across the sector set.

@porcuquine
Collaborator

porcuquine commented Aug 24, 2020

I don't know exactly what they have in mind, but my guess is along these lines:

  • For Winning PoSt: a prover needs to know which sector will be challenged (since not all proven sectors are challenged).
  • For Window PoSt: all provided sectors will be challenged once, but it would be useful to know the actual challenge.
  • In either case, it needs to be possible to distribute vanilla proving.

So the most obvious separation would be something like this (for each kind of PoSt — window and winning):

  • Generate challenges:
    • input: same as generate PoSt, but without the raw data paths (we will remove everything non-essential from the signature)
    • output: a sequence of actually challenged sectors and the leaf challenge(s) for that sector.
  • Generate vanilla proof:
    • input: all the necessary metadata and raw data paths for a single sector and its challenges.
    • output: serialized representation of one vanilla proof.
  • Generate PoSt:
    • input: all the serialized vanilla proofs, in order of the challenges obtained initially;
    • output: same as currently: the binary blob representing the Groth proof(s).

If this sounds generally right, it should not be too difficult to split. If you have no input on how vanilla proof serialization/deserialization happens, and the vanilla proofs can be opaque blobs you plumb correctly (tracking any necessary metadata yourself), that would make it as easy as possible for us to design something. Alternately, we can propose a design which surfaces metadata for your tracking, but this might add complexity, so absent further information, I propose we don't do this. I think there is already a serialization mechanism, so it may be that the simplest version of this feature can be implemented fairly quickly.

The most important first question is whether I have mischaracterized the problem and potential solution. Therefore, @magik6k @whyrusleeping please confirm that something generally like the above would work — or else let us know what would be more suitable. If the general idea above is correct but I have overlooked important details, please supply what you can (cc: @cryptonemo Did I oversimplify something?)

@magik6k

magik6k commented Aug 26, 2020

Yep, that sounds about right. Vanilla proofs being opaque blobs is fine.

One thing that would be really nice would be to have the Generate vanilla proof step sanity check the data, and fail early, so we don't try to generate an invalid proof (this may already be happening internally, except it currently fails the whole PoSt call).
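What that fail-early behavior might look like at the caller's level, as a purely hypothetical sketch (the real check would verify challenged leaves against the sector's commitment; here `checkSector` is a toy stand-in that marks sector 7 as corrupt):

```go
package main

import (
	"errors"
	"fmt"
)

// checkSector stands in for reading the challenged leaves and verifying
// them against the sector's commitment before any proving happens.
// Toy behavior: sector 7 is "corrupt".
func checkSector(sector uint64) error {
	if sector == 7 {
		return errors.New("leaf data does not match commitment")
	}
	return nil
}

// SanityCheck partitions the proving set into healthy and faulty sectors,
// so one bad sector is reported early instead of failing the whole PoSt.
func SanityCheck(sectors []uint64) (ok []uint64, faulty []uint64) {
	for _, s := range sectors {
		if err := checkSector(s); err != nil {
			faulty = append(faulty, s)
			continue
		}
		ok = append(ok, s)
	}
	return ok, faulty
}

func main() {
	ok, faulty := SanityCheck([]uint64{1, 7, 9})
	fmt.Println("ok:", ok, "faulty:", faulty) // prints ok: [1 9] faulty: [7]
}
```

The faulty list could then be declared as faults, while proving proceeds for the healthy sectors.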

@porcuquine
Collaborator

porcuquine commented Aug 26, 2020

Absolutely. We will never knowingly return a bad proof. The only reason we have all-or-nothing failures now is because this decomposition hasn't been exposed. So doing this should give us the tool you need to spot check sectors for free. (In other words, if you want to be more pro-active than your required posts, you can just randomly challenge your sectors without generating SNARKs.)

@cryptonemo
Collaborator

This was resolved by #1278

API and FFI PRs coming soon.
