Skip to content
This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

feat: add ucanto aggregation api service #2

Merged
merged 2 commits into from
Jul 11, 2023
Merged

Conversation

vasco-santos
Copy link
Contributor

@vasco-santos vasco-santos commented May 17, 2023

This PR adds ucanto aggregation API service to POST / route.

It includes the implementation of:

  • offerStore - bucket keeping track aggregate offers received, so that they can be pulled by Spade
  • arrangedOfferStore - dynamoDB keeping track of arranged offers received, so that we can kick in cron jobs to poll Spade DB and invoke offer/arrange with status accepted or failed according to what happened. Note that we keep an index with stat so that we can query by stat in queue to then poll Spade on those and see if there are updates for that aggregate
  • Note that aggregate-store will be a wrapper for Spade DB to query aggregate infos. Therefore, it is not implemented in this PR and will come next.

Decisions:

  • how often should we be pulled. It influences the way we compute filenames. For ease I just did a 15 minutes multiple in this PR, we can come in follow up with a change once we align on what this number will be.

Note that:

  • package-lock was wrongly included before, so getting rid of it here
  • next will come CRON jobs to merge offers into single one (if needed) and

Closes storacha/w3infra#11

@seed-deploy
Copy link

seed-deploy bot commented May 17, 2023

View stack outputs

@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 7524d63 to 8d6e768 Compare May 19, 2023 14:04
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 8d6e768 to 5d8fd55 Compare May 19, 2023 14:14
@seed-deploy seed-deploy bot temporarily deployed to pr2 May 19, 2023 14:14 Inactive
@seed-deploy
Copy link

seed-deploy bot commented May 19, 2023

Stack outputs updated

@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 5d8fd55 to 0262c9f Compare May 25, 2023 12:57
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 0262c9f to 3b16221 Compare May 25, 2023 15:11
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 3b16221 to bdef525 Compare May 26, 2023 13:49
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from bdef525 to 68592ba Compare May 26, 2023 14:46
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 68592ba to 2ff25c8 Compare June 1, 2023 13:29
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 2ff25c8 to 3537c54 Compare June 6, 2023 07:26
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 3537c54 to 9e65abb Compare June 6, 2023 19:27
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 9e65abb to 7378ea2 Compare June 8, 2023 07:56
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 7378ea2 to 3d19dd3 Compare June 8, 2023 08:00
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 751a115 to 9b006b6 Compare June 8, 2023 11:55
@seed-deploy seed-deploy bot temporarily deployed to pr2 June 8, 2023 11:55 Inactive
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 9b006b6 to e650e45 Compare June 8, 2023 12:06
@seed-deploy seed-deploy bot temporarily deployed to pr2 June 8, 2023 12:06 Inactive
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from e650e45 to 701981f Compare June 8, 2023 12:37
@seed-deploy seed-deploy bot temporarily deployed to pr2 June 8, 2023 12:37 Inactive
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from 701981f to b5822dc Compare June 8, 2023 12:39
@seed-deploy seed-deploy bot temporarily deployed to pr2 June 8, 2023 12:39 Inactive
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from b5822dc to d495ec7 Compare June 8, 2023 12:42
@seed-deploy seed-deploy bot temporarily deployed to pr2 June 8, 2023 12:42 Inactive
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from d495ec7 to f8d9329 Compare June 8, 2023 12:47
@seed-deploy seed-deploy bot temporarily deployed to pr2 June 8, 2023 12:47 Inactive
@vasco-santos vasco-santos force-pushed the feat/ucanto-service branch from f8d9329 to 47e6ced Compare June 8, 2023 13:02
@seed-deploy seed-deploy bot temporarily deployed to pr2 June 8, 2023 13:02 Inactive
@vasco-santos vasco-santos marked this pull request as ready for review June 8, 2023 13:04
@vasco-santos vasco-santos requested review from Gozala and alanshaw June 8, 2023 13:04
@vasco-santos vasco-santos mentioned this pull request Jun 8, 2023
2 tasks
Copy link

@Gozala Gozala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some suggestions, but non I'd consider a blocker.


_Example:_ `MgCZG7EvaA...1pX9as=`

#### `UCAN_LOG_BASIC_AUTH`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: Do we really need an auth for the UCAN_LOG, they all come from API endpoint and anyone just as well could get messages here by sending there. That is to say maybe we don't need to gate it since we check incoming data is ucanto message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the point, but we do have things setup to count things like metrics that would be messed up if someone posted directly here.

...but if anyone sent messages directly to the log it wouldn't be a true representation of what we received and acted on

package.json Outdated
"deploy": "sst deploy",
"remove": "sst remove",
"console": "sst console",
"typecheck": "tsc --noEmit"
"typecheck": "tsc --noEmit && npm run typecheck -w packages/core",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest using tsc --build instead. The --noEmit runs TS in legacy mode that can cause incompatibilities for the code that uses modern TS toolchains.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout.

History here is this was generated with sst project generator https://docs.sst.dev/start/standalone which already had this. I also tried to get it out in favour of --build but than given this is a TS project (which now is the only available by sst project generator) the JS files from build are created inline, which this avoids.

I changed to --build but needed to add noEmit true to tsconfig to not generate the js files


export function useAggregateStore() {
return {
get: async (commitmentProof) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would highly encourage use of Result types for any IO code se that error types can be captured in the interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree. We need to refactor this at the API code level and not here (and in reality for access-api and upload-api). Would be actually good to do it across the board to be consistent rather than here separately. I created issue for it storacha/w3up#818

const putCmd = new PutObjectCommand({
Bucket: bucketName,
ContentType: 'application/json',
Key: `${getNextUtcDateName()} ${commitmentProof.toString()}`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this I am not entirely sure why this prefixing scheme is used, I think it would really help to have comment explaining it or a pointer to document explaining it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a comment above. It is also part of what was decided with Riba and documented in https://www.notion.so/w3-spade-deal-flow-bd479485bbf64e85aa0dee0ad6fc3019

I can port this to a markdown file if that is helpful

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue created #10

})
})

await pRetry(() => s3client.send(putCmd))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure what is the intention here, as in what kind of error it will attempt to recover from. I think it would be a good idea to provide a code comment to make it more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for historical reasons where sometimes write fail in these buckets. So we usually wrap with retires across the stack, so that spontaneous issues do not break calls

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can create a separate put utility function that has the retries and comments about potential write failures it is aiming to recover. I just worry not having that context captured anywhere.

})
})

await pRetry(() => s3client.send(putCmd))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure what is the intention here, as in what kind of error it will attempt to recover from. I think it would be a good idea to provide a code comment to make it more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for historical reasons where sometimes write fail in these buckets. So we usually wrap with retires across the stack, so that spontaneous issues do not break calls

})
})

await pRetry(() => s3client.send(putCmd))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what is the intention here, as in what kind of errors it attempts to recover from. I think it would be a good idea to provide a code comment to make it more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for historical reasons where sometimes write fail in these buckets. So we usually wrap with retires across the stack, so that spontaneous issues do not break calls


// add commitmentProof for polling
// once an aggregate is fulfilled (accepted or rejected) a receipt will be generated.
await arrangedOfferStore.set(commitmentProof, 'queued')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Wonder if we can avoid separate store for this, e.g. if we knew invocation CID we could simply check if we have a receipt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a lot of time trying to figure it out best solution and so far this is what feels nicer. The issue is that receipt needs to be generated by spade-proxy, so it must keep some state so that a cron wakes up, checks Spade DB and issue receipts for the ones that were already success/failure.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, maybe just write the comment that says that spade-proxy need to keep track of state across cron activations and db is used to ....

@seed-deploy seed-deploy bot temporarily deployed to pr2 June 13, 2023 09:19 Inactive
const putCmd = new PutObjectCommand({
Bucket: bucketName,
ContentType: 'application/json',
Key: `${getNextUtcDateName()} ${commitmentProof.toString()}`,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we will still align with Spade about this part. Either we have a cron that merged this offers into a new file with format YYYY-MM-DD HH:MM:00, or Spade will be required to perform a list on the bucket with such prefix.

import type { Link } from '@ucanto/interface'
import * as AggregateAPI from '@web3-storage/aggregate-api'

export function createArrangedOfferStore(region: string, tableName: string, options?: ArrangedOfferStoreDbOptions) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not totally happy with this name. Anyone have suggestions?

I named it after what we use this for, to call offer/arrange once we know if aggregate made it to SPs or not

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading more on the filecoin side of things, it seems to me that this really should be called deal/arrange or deal/create.


_Example:_ `MgCZG7EvaA...1pX9as=`

#### `UCAN_LOG_BASIC_AUTH`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

package.json Outdated
"deploy": "sst deploy",
"remove": "sst remove",
"console": "sst console",
"typecheck": "tsc --noEmit"
"typecheck": "tsc --noEmit && npm run typecheck -w packages/core",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout.

History here is this was generated with sst project generator https://docs.sst.dev/start/standalone which already had this. I also tried to get it out in favour of --build but than given this is a TS project (which now is the only available by sst project generator) the JS files from build are created inline, which this avoids.

I changed to --build but needed to add noEmit true to tsconfig to not generate the js files

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement store/* providers
3 participants