docs: storage/deliver and storage/confirm capabilities rfc #10

vasco-santos · 2024-02-09T13:10:17Z

This RFC proposes extending the current store/* behaviour with storage/deliver and storage/confirm (or similar names). It is motivated by the move towards a proof of delivery: the client invokes storage/deliver, the service acknowledges the request and, in its own time, invokes storage/confirm, signing that the content is written on the write target together with the response to the challenge.

Note that this RFC excludes anything related to the delivery proof except for the need for it. Given that need, we propose to rely on this as part of the dataAnywhere project to decouple read services from an assumed bucket. We should consider updating our store flow to better accomplish this in the future, once we can fully rely on the new UCAN spec and Invocation spec - see the proposal by @Gozala in #10 (comment).

rfc/store-deliver.md

olizilla

The idea here is good!

The text needs cleaning up but I'd rather go through it together in sync when you have time.

rfc/store-deliver.md

alanshaw · 2024-02-20T11:48:47Z

rfc/store-deliver.md

+}
+```
+
+5. service verifies if content was delivered by user, and queues a self invocation of `store/deliver` that is also provided as an effect.


I'd like this to be less complicated - in the new invocation spec, effects are CIDs of tasks, not invocations, so the server could issue this next task in the receipt, the client can invoke it to complete the task, and then we're done.

I agree with this. That said, there still needs to be a pending task awaiting the store/deliver receipt, which will issue a receipt from us stating that the workflow was complete. I tried to show this in the flowchart in https://github.com/web3-storage/RFC/pull/10/files#r1514851900

vasco-santos · 2024-02-21T11:37:02Z

rfc/store-deliver.md

+  },
+  "fx": {
+    "join": { "/": "bafy...storeConfirm" },
+    "fork": [{ "/": "bafy...storeDeliver" }]


Note that we won't be able to implement this until we implement the invocation spec, so the implementation would miss the fork to start with.

Gozala · 2024-03-06T17:01:10Z

rfc/store-deliver.md

+
+### `store/deliver` capability
+
+After the agent handled the asynchronous task of posting the bytes, the agent can invoke `store/deliver` to notify the service that the bytes were delivered. This invocation in the future MAY include a challenge for the service based on the posted data.


Ideally we would not have this capability, but would instead dispatch a blob/write invocation (via effect). A client with the content (it could be a different one) can perform the invocation by doing an HTTP PUT of those bytes and sending a signed receipt to the service. That receipt would effectively be what this capability is trying to do: a claim from the client that the content has been delivered. When the service is made aware of this it can then perform blob/close or something along those lines, which is more or less what store/confirm is. Here is the rough workflow diagram:

flowchart LR
  Add[👩‍💻 store/add 🤖]
  Allocate[🤖 blob/allocate 🤖]
  Receive[🤖 blob/receive 🤖]
  Write[🤖 blob/write 👩‍💻]
  Close[🤖 blob/close 🤖]
  Claim[🤖 assert/location 🌐]
  Add --> Allocate
  Add --> Receive
  Allocate --> Receive
  Receive --> Write
  Receive --> Close
  Write --> Close
  Close --> Claim

Emoji on the left denotes agent issuing invocation.

Emoji on the right denotes agent performing invocation.

👩‍💻 Denotes user agent

🤖 Denotes service agent

🌐 Denotes global network

Overview

1. Alice invokes the store/add task.
2. The service schedules blob/allocate and blob/receive tasks, where the latter depends on the former (using promise/await from the invocation spec).
3. blob/receive schedules blob/write and blob/close tasks, where the latter awaits the former.
4. blob/write is executed by the user agent, implying that on completion a receipt will be issued, unblocking blob/close.
5. blob/close verifies that the written blob corresponds to the blob address (hash).
6. blob/close on success schedules the assert/location claim.

Note that errors in dependencies propagate, meaning that if a task depends on a failed task, it will never run and is itself considered failed.

The point I'm trying to make is that the invocation spec has promise pipelines as a means of forming dependencies between tasks. Ideally we would embrace those so that task coordination is captured, as opposed to being described only in a spec.
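The dependency and error propagation rules in the overview can be sketched as a small task graph. This is only an illustration under stated assumptions, not the invocation spec's promise pipelines: the `Task` and `Outcome` types and `runWorkflow` are hypothetical, the sketch flattens "schedules" and "awaits" into a single dependency edge, and only the task names come from the diagram.

```typescript
// Hypothetical model: each task lists the tasks whose results it awaits.
type Outcome = { ok: true } | { ok: false; error: string };

interface Task {
  name: string;
  awaits: string[];
  run: () => Outcome;
}

// Runs tasks once all their dependencies have settled. A task with a failed
// dependency never runs and is itself recorded as failed.
function runWorkflow(tasks: Task[]): Map<string, Outcome> {
  const results = new Map<string, Outcome>();
  const pending = [...tasks];
  while (pending.length > 0) {
    // Pick any task whose dependencies have all settled.
    const index = pending.findIndex((task) =>
      task.awaits.every((dep) => results.has(dep))
    );
    if (index === -1) throw new Error("unresolvable dependency");
    const task = pending.splice(index, 1)[0]!;
    const failed = task.awaits.find((dep) => results.get(dep)?.ok !== true);
    results.set(
      task.name,
      failed === undefined
        ? task.run()
        : { ok: false, error: `dependency ${failed} failed` }
    );
  }
  return results;
}

const succeed = (): Outcome => ({ ok: true });

// Simulate the user agent failing the HTTP PUT behind blob/write.
const results = runWorkflow([
  { name: "blob/allocate", awaits: [], run: succeed },
  { name: "blob/receive", awaits: ["blob/allocate"], run: succeed },
  { name: "blob/write", awaits: ["blob/receive"], run: () => ({ ok: false, error: "PUT failed" }) },
  { name: "blob/close", awaits: ["blob/receive", "blob/write"], run: succeed },
  { name: "assert/location", awaits: ["blob/close"], run: succeed },
]);
```

In this run blob/write fails, so blob/close and assert/location are recorded as failed without ever executing.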

@Gozala I definitely see this as a promising alternative. There is one critical point, though, that seems a hard requirement and that this won't cover. We would like the blob/close + assert/location claim to happen right away. Until they happen, the data is not available. This is a requirement we have continuously failed to meet, and we have always had reports of delays in getting data ready to read (today in w3up this is due to the delay of the replicator to CF).

The problem is:

> When the service is made aware of this it can then perform blob/close or something along those lines, which is more or less what store/confirm is. Here is the rough workflow diagram
>
> 5. blob/close verifies that the written blob corresponds to the blob address (hash).
> 6. blob/close on success schedules the assert/location claim.

Having the client tell us immediately seems more efficient than the service having a huge list of tasks to handle in a queue. It also ends up being less prone to errors and to maintenance on failures, whereas the queue is obviously way more expensive to run.

We could potentially consider some event subscription and associated runs with the receipt reporting API, even though this will also require us to think about how we want to manage receipts being sent by the client.

I would like to get closer to that, but would like to figure out how to not compromise better service for users and service cost.

> Having the client tell us immediately seems more efficient than the service having a huge list of tasks to handle in a queue. It also ends up being less prone to errors and to maintenance on failures, whereas the queue is obviously way more expensive to run.

I think we need to do both (because we have no guarantee the client will ever notify us):

1. Have a run scheduled on pre-signed URL expiry, in case we have not received any update from the client.
2. Have a handler on the receipt that will perform the closure.
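A minimal sketch of doing both, assuming an in-memory tracker: the `PendingUploads` class, its method names and the `isStored` check are hypothetical, and a real service would persist this state and use an actual scheduler for the expiry run. The point it illustrates is that both paths funnel into one idempotent close step, so whichever arrives first settles the upload.

```typescript
type UploadState = "pending" | "closed" | "expired";

class PendingUploads {
  private states = new Map<string, UploadState>();
  private isStored: (link: string) => boolean;

  // `isStored` stands in for checking the write target for the bytes.
  constructor(isStored: (link: string) => boolean) {
    this.isStored = isStored;
  }

  // Called when the pre-signed URL is handed out.
  allocate(link: string): void {
    this.states.set(link, "pending");
  }

  // Path 1: the client sent a receipt, so try to close right away.
  onReceipt(link: string): UploadState {
    return this.settle(link, "pending");
  }

  // Path 2: run scheduled at URL expiry, in case the client never reported.
  onExpiry(link: string): UploadState {
    return this.settle(link, "expired");
  }

  // Idempotent: once an upload is settled, later calls return the same state.
  private settle(link: string, ifMissing: UploadState): UploadState {
    const current = this.states.get(link);
    if (current === undefined) throw new Error(`unknown upload ${link}`);
    if (current !== "pending") return current;
    const next: UploadState = this.isStored(link) ? "closed" : ifMissing;
    this.states.set(link, next);
    return next;
  }
}

const stored = new Set<string>(["bafy...a"]);
const uploads = new PendingUploads((link) => stored.has(link));
uploads.allocate("bafy...a");
uploads.allocate("bafy...b");

const viaReceipt = uploads.onReceipt("bafy...a"); // bytes present, closed immediately
const afterExpiry = uploads.onExpiry("bafy...a"); // already settled, unchanged
const neverDelivered = uploads.onExpiry("bafy...b"); // no bytes by expiry
```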

> We could potentially consider some event subscription and associated runs with the receipt reporting API, even though this will also require us to think about how we want to manage receipts being sent by the client.

ucanto agent messages are sent both ways and they can hold both invocations and receipts, which is to say that a client can send a receipt to a server after it has completed an upload.

You are correct that right now our service router will not react to received receipts in any way. However, we could hook the server up to do things with incoming.report, which is currently ignored.

I think the trickiest bit here would be tracking dependencies, that is, blob/close would have to execute when we get the receipt for blob/write. However, we can even punt on that and simply add metadata in the receipt that will signal us to poll blob/write. That is not perfect by any means, but IMO better than having no trace of the workflow captured in the receipts.

Alternatively we could just have an endpoint or even a capability to poll a task, where the handler will attempt to check dependencies and, if they are available, run the task.

Both approaches would be general to all of the cases where we have workflows with non-linear dependency graphs, as opposed to being one-off workarounds for not having support for this yet.

> I would like to get closer to that, but would like to figure out how to not compromise better service for users and service cost.

That is fair! I think there is a possible compromise here where we can do just a bit more, not much, and get workflows with traceable receipts for it. If we really want to aggressively cut scope we could even devise something like

{ can: "system/poll", with: "did:key:zAlice", nb: { task: { '/': 'bafy...pending' }, receipts: { 'bafy...task': { '/': 'bafy..receipt' } } } }

where the handler will effectively just try to run the task we already created and put in the receipt, and the invocation provides all the dependencies for it.
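A rough sketch of what such a poll handler could look like. Everything here beyond the `system/poll` shape (`task` plus `receipts`) is an assumption made for illustration: the `PendingTask` and `PollResult` types and the `poll` function are hypothetical, and receipts are inlined as plain objects rather than resolved from links.

```typescript
interface Receipt {
  ok: boolean;
}

interface PendingTask {
  awaits: string[]; // CIDs of the tasks this task depends on
  run: () => string;
}

type PollResult =
  | { status: "ran"; out: string }
  | { status: "waiting"; missing: string[] }
  | { status: "failed"; cause: string };

// Runs the pending task only if every dependency has a successful receipt
// among the ones supplied in the invocation.
function poll(
  pending: Map<string, PendingTask>,
  nb: { task: string; receipts: Record<string, Receipt> }
): PollResult {
  const task = pending.get(nb.task);
  if (task === undefined) return { status: "failed", cause: "unknown task" };
  const missing = task.awaits.filter((cid) => !(cid in nb.receipts));
  if (missing.length > 0) return { status: "waiting", missing };
  const failed = task.awaits.find((cid) => nb.receipts[cid]?.ok !== true);
  if (failed !== undefined) return { status: "failed", cause: failed };
  pending.delete(nb.task); // the task runs at most once
  return { status: "ran", out: task.run() };
}

const pending = new Map<string, PendingTask>([
  ["bafy...close", { awaits: ["bafy...write"], run: () => "closed" }],
]);

// Polling before the blob/write receipt exists leaves the task pending.
const early = poll(pending, { task: "bafy...close", receipts: {} });
// Polling with the receipt attached runs the task.
const done = poll(pending, {
  task: "bafy...close",
  receipts: { "bafy...write": { ok: true } },
});
```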

While we decided to go with the simplest solution for now, I think there are a lot of things here that could indeed be the future. We should save this info in an issue, maybe within the context of the blob/* proposal you put together?

For instance, the job on expiration also seems like something we should do, but I would like to not increase the scope of dataAnywhere here, which by itself is already quite big.

@Gozala I think we can frame this differently to be better aligned with implementing what you suggest later on, once we can. If we frame this in the context of the proof of delivery, where the client will send a challenge that the service needs to fulfil, this interaction needs to happen anyway as is. We can leverage it for the time being for submitting the index, until it is easy for us to implement this, once the UCAN updates land and we bubble them up.

rfc/store-deliver.md

Gozala · 2024-03-06T17:07:29Z

rfc/store-deliver.md

+}
+```
+
+### `store/confirm` capability


We really should have this:

1. In a different namespace.
2. With the resource being the service DID, not the user DID.

This is because it is the service that can invoke this, not the user, and the user has ownership of their space key, not us.

Yeah, I think the resource here should probably be the DID of the write target?

Gozala · 2024-03-06T17:19:46Z

rfc/store-deliver.md

+
+To offer verifiability, `store/add` invocation should be associated with `store/deliver` and `store/confirm` via [effect]s. `store/deliver` would be invoked by the client to notify the service that content was stored, and the the service should also sign a confirmation that it is true via `store/confirm`.
+
+### `store/deliver` of not previously delivered content


It may be a good idea to put this into a different namespace so that `with` is our DID, not the user DID, and store/add could delegate the capability to invoke it. That would also enable us to perform the task ourselves, e.g. on some time schedule.

The general point is that the state change is not in the user space but in our system, so the resource should be the system DID, not the user space DID. And it is system state as opposed to user space state because the state of the CAR is shared across all spaces.
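To make the resource argument concrete, here is a small sketch under stated assumptions: the `Capability` shape mirrors the `can` / `with` / `nb` fields used in this thread, while `SERVICE_DID`, its value, the `delivered` set and `handle` are all hypothetical. Delivery state is keyed by content link only, so it is shared across spaces, and the handler only accepts the service DID as the resource.

```typescript
interface Capability {
  can: string;
  with: string;
  nb: { link: string };
}

// Hypothetical service DID; the real value is not given in this thread.
const SERVICE_DID = "did:web:service.example";

// System state: keyed by content link only, so it is shared by every space.
const delivered = new Set<string>();

function handle(
  capability: Capability
): { ok: true } | { ok: false; error: string } {
  // The state being changed belongs to the system, so the resource must be
  // the service DID rather than a user space DID.
  if (capability.with !== SERVICE_DID) {
    return { ok: false, error: "resource must be the service DID" };
  }
  delivered.add(capability.nb.link);
  return { ok: true };
}

const onService = handle({
  can: "storage/deliver",
  with: SERVICE_DID,
  nb: { link: "bafy...car" },
});
const onSpace = handle({
  can: "storage/deliver",
  with: "did:key:zAlice",
  nb: { link: "bafy...other" },
});
```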

Gozala · 2024-03-06T18:09:45Z

rfc/store-deliver.md

+  "with": "did:key:abc...",
+  "nb": {
+    "link": "bag...",
+    "url": "https://..."


What is the URL here, the pre-signed PUT URL we handed out earlier?

Yeah, I think it makes sense for it to be the same, rather than the user deriving something. What do you think?

Are we using that URL for anything? Otherwise I would just leave it out.

My main concern is a transition period where we change buckets while a few operations are ongoing, and this invocation is received after the switch. This would mean the store/add handler returned a presigned URL for bucket A, then we swap our backend to bucket B, then store/deliver comes in and we do not know where to claim things. I would therefore like to keep this, unless there are other suggestions.
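The bucket switch concern can be illustrated with a short sketch. The bucket registry, host names, `hostOf` and `resolveWriteTarget` are all hypothetical; the only thing taken from the RFC is that `nb` carries a `link` and a `url`. With the URL present, the handler can still find the bucket the bytes were written to after the backend has moved on.

```typescript
// Hypothetical registry of write targets, keyed by pre-signed URL host.
const buckets: Record<string, string> = {
  "bucket-a.example": "bucket A",
  "bucket-b.example": "bucket B",
};

// The backend has already been switched from bucket A to bucket B.
const currentHost = "bucket-b.example";

// Extracts the host from an absolute URL such as "https://host/key".
function hostOf(url: string): string {
  const host = url.split("/")[2];
  if (host === undefined || host === "") throw new Error(`cannot parse ${url}`);
  return host;
}

// Prefers the URL from the invocation; falls back to the current backend.
function resolveWriteTarget(nb: { link: string; url?: string }): string {
  const host = nb.url === undefined ? currentHost : hostOf(nb.url);
  const target = buckets[host];
  if (target === undefined) throw new Error(`unknown write target ${host}`);
  return target;
}

// The URL was issued for bucket A before the switch.
const withUrl = resolveWriteTarget({
  link: "bag...",
  url: "https://bucket-a.example/bag...",
});
// Without the URL the handler can only guess the current backend.
const withoutUrl = resolveWriteTarget({ link: "bag..." });
```

Here `withUrl` resolves to bucket A, where the bytes actually are, while `withoutUrl` resolves to bucket B.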

vasco-santos · 2024-03-12T14:53:14Z

rfc/storage-deliver.md

+
+## Abstract
+
+`store/*` protocol is missing verifiability for content actually uploaded to the provided presigned URL. This factor together with intention to create a proof of delivery makes critical the introduction of a new capabilities in the `storage/*` namespace. In this RFC, we propose `storage/deliver` and `storage/confirm` capabilities.


better names are welcome :)

hannahhoward

Just one question -- otherwise LGTM

hannahhoward · 2024-03-12T23:34:22Z

rfc/storage-deliver.md

+}
+```
+
+5. service verifies if content was delivered by user, and queues a self invocation of `storage/deliver` that is also provided as an effect.


What is the self-invocation of storage/deliver here? Or do you mean storage/confirm?

Could the call for storage/confirm be issued to an independent node if the write target were somewhere else?

docs: store deliver capability rfc

d329967

vasco-santos force-pushed the docs/store-deliver-capability-rfc branch from 1b1fec9 to d329967 Compare February 9, 2024 13:27

vasco-santos requested review from Gozala, olizilla and alanshaw February 9, 2024 13:28

olizilla reviewed Feb 12, 2024

olizilla suggested changes Feb 12, 2024

vasco-santos added 2 commits February 12, 2024 13:28

fix: use numbers in diagrams

f395bbf

fix: restructure to diagram after flow walkthrough

1576a8c

vasco-santos mentioned this pull request Feb 19, 2024

feat: store deliver and confirm on upload-api storacha/w3up#1310

Closed

alanshaw requested changes Feb 20, 2024

chore: add store/confirm

b17825b

vasco-santos changed the title ~~docs: store deliver capability rfc~~ docs: store/deliver and store/confirm capabilities rfc Feb 21, 2024

vasco-santos requested review from olizilla and alanshaw February 21, 2024 11:34

vasco-santos commented Feb 21, 2024

Gozala reviewed Mar 6, 2024

Gozala reviewed Mar 6, 2024

vasco-santos mentioned this pull request Mar 7, 2024

docs: datawherehouse location claim + store/publish #13

Merged

chore: address review comments

b9501ad

vasco-santos changed the title ~~docs: store/deliver and store/confirm capabilities rfc~~ docs: storage/deliver and storage/confirm capabilities rfc Mar 12, 2024

fix: rename store to storage for new capabilities with service resource

c38dc6c

vasco-santos requested review from Gozala and alanshaw and removed request for olizilla and alanshaw March 12, 2024 14:51

vasco-santos commented Mar 12, 2024

This was referenced Mar 12, 2024

Complete write-anywhere work storacha/project-tracking#11

Closed

Develop MVP design for proof of delivery storacha/project-tracking#15

Closed

hannahhoward approved these changes Mar 13, 2024
