This repository has been archived by the owner on Aug 11, 2021. It is now read-only.

Enhancement: Remove dependency on IPFS Blockservice #260

Closed
chafey opened this issue Feb 28, 2020 · 7 comments

Comments

@chafey
Contributor

chafey commented Feb 28, 2020

  • Version: 0.25.3
  • Platform: All
  • Subsystem:

Type: Enhancement

Severity: Very Low

Description: IPLD requires a Blockservice but defines no interface for it in either the specs or the code. Its example usage and tests use ipfs-block-service, which creates a "soft" dependency on IPFS. This can cause confusion because ipfs-block-service provides both a block repository (which IPLD needs) and bitswap (which IPLD does not use). IPLD would be easier to understand if it were fully self-contained, with no dependency on IPFS. To accomplish this, the interface should be defined in the IPLD project along with local implementations. Since IPLD only needs a repo, it would make sense to name this interface something like ipld-block-repository, to avoid confusion with ipfs-block-service and its added bitswap functionality. Examples showing how to use IPLD with ipfs-block-service could still be provided, but would not be the default.
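
A minimal sketch of what such an interface could look like, assuming a plain key/value contract. The class name `MemoryBlockRepository` and the method names `put`, `get`, `has`, and `delete` are hypothetical and are not taken from js-ipld or ipfs-block-service; CIDs are modelled as plain strings to keep the example self-contained.

```javascript
// Hypothetical sketch: an in-memory block repository keyed by CID string.
// None of these names come from js-ipld or ipfs-block-service.
class MemoryBlockRepository {
  constructor () {
    this.blocks = new Map(); // CID string -> Uint8Array
  }

  // Store the raw bytes of a block under its CID.
  async put (cid, bytes) {
    this.blocks.set(String(cid), bytes);
  }

  // Return the bytes for a CID, or throw if the block is unknown.
  async get (cid) {
    const bytes = this.blocks.get(String(cid));
    if (bytes === undefined) throw new Error(`Block not found: ${cid}`);
    return bytes;
  }

  // Report whether a block is stored, without returning its bytes.
  async has (cid) {
    return this.blocks.has(String(cid));
  }

  // Remove a block; removing an unknown CID is a no-op.
  async delete (cid) {
    this.blocks.delete(String(cid));
  }
}
```

An implementation backed by ipfs-block-service could presumably satisfy the same contract, which would make it one option among several rather than the default.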

Steps to reproduce the error: N/A

@chafey
Contributor Author

chafey commented Feb 29, 2020

Created a simple ipld-blockstore here: https://github.com/chafey/js-ipld-blockstore

@rvagg
Member

rvagg commented Mar 2, 2020

@chafey have a look at https://github.com/ipld/js-block, most of the newer work we're doing on top of IPLD in JS is building on this rather than js-ipld. We're still clarifying how the entire stack should fit together but js-block is worth a look if you're wanting to properly decouple your IPLD usage from the IPFS stack.

@chafey
Contributor Author

chafey commented Mar 2, 2020

OK - this is helpful to know! A few questions:

  1. Can you confirm that IPLD should not have any dependencies on IPFS?
  2. Is the plan to deprecate js-ipld some day?
  3. What is the current thinking on IPLD and persistence (ipld.get, ipld.put, ipld.remove)? Is defining a persistence interface still part of IPLD's scope, or will it no longer be a concern of IPLD?
  4. What is the current thinking on IPLD path resolution (ipld.resolve)? This currently depends on persistence (getting blocks), but could theoretically be changed to just work on deserialized objects.

@mikeal

mikeal commented Mar 4, 2020

> Can you confirm that IPLD should not have any dependencies on IPFS?

I can confirm this, but with an emphasis on “should” 😁

Specifically, on the JS side the older IPLD libraries depended on a few libraries from IPFS, mostly for the storage and “repo” APIs. The new stuff stays away from requiring a specific storage layer so this is no longer an issue, but those old libraries are still actively used by js-ipfs.

> Is the plan to deprecate js-ipld some day?

It is already “soft deprecated.” We don’t recommend that new applications use it, and many new applications don’t (js-ipfs-lite for instance).

> What is the current thinking on IPLD and persistence (ipld.get, ipld.put, ipld.remove)? Is defining a persistence interface still part of IPLD's scope, or will it no longer be a concern of IPLD?

Yes and no. Many IPLD libraries need a way to ask for blocks by cid. For those APIs I’ve just been having the user pass in a single get function.

I’ve tried to stay away from a put API in all the new IPLD primitives. Instead, I just produce Block instances and hand them to the user who can store them however they please.
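
As a rough illustration of that pattern (not code from js-block): the library side produces a block and needs only a `get` function, while the caller decides where the bytes live. The helpers `createBlock` and `readValue` are hypothetical, and JSON plus a sha256 hex digest are used here as stand-ins for a real IPLD codec and a real CID.

```javascript
const { createHash } = require('crypto');

// Hypothetical stand-in for a real Block: JSON bytes plus a sha256 hex key
// (a real implementation would use an IPLD codec and a proper CID).
function createBlock (value) {
  const bytes = Buffer.from(JSON.stringify(value));
  const cid = createHash('sha256').update(bytes).digest('hex');
  return { cid, bytes };
}

// A reader that only needs a user-supplied `get(cid) -> Promise<bytes>`.
async function readValue (cid, get) {
  const bytes = await get(cid);
  return JSON.parse(Buffer.from(bytes).toString());
}

// The caller owns storage: here a plain Map, but it could be anything.
const store = new Map();
const block = createBlock({ hello: 'world' });
store.set(block.cid, block.bytes);
const get = async (cid) => store.get(cid);
```

Calling `readValue(block.cid, get)` then resolves to the original object, and nothing in `createBlock` or `readValue` knows how the bytes were stored.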

As we continue working our way towards higher order abstractions this lack of storage interface won’t hold. I’ve been adamant that these low level pieces remain simple and storage neutral and that has gone pretty well, but there’s a point at which we’ll need to define a new interface, and there are a lot of requirements we now understand that even the old interfaces didn’t meet. For instance, once you start replicating large graphs it’s important to be able to ask the persistence layer if it already has the entire graph for a given CID. If the persistence layer doesn’t index this, you always need to do a full parse of the entire graph, which isn’t practical.
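
One way such an index might work, sketched under heavy assumptions: the store records each block's links at write time and caches which roots are known to be complete, so a "do you have the whole graph?" query walks the link index instead of re-parsing block bytes. `GraphAwareStore` and `hasGraph` are hypothetical names, CIDs are plain strings, and the caller is assumed to pass in the links it already decoded.

```javascript
// Hypothetical store that indexes links at write time so that a
// "do you have the whole graph?" query never has to re-parse block bytes.
class GraphAwareStore {
  constructor () {
    this.blocks = new Map();   // cid -> bytes
    this.links = new Map();    // cid -> array of child cids
    this.complete = new Set(); // cids whose full graph is known to be present
  }

  // `links` is supplied by the caller, who already decoded the block.
  put (cid, bytes, links = []) {
    this.blocks.set(cid, bytes);
    this.links.set(cid, links);
  }

  // True if `cid` and everything reachable from it is stored.
  // Assumes an acyclic graph, which hash-linked data guarantees.
  hasGraph (cid) {
    if (this.complete.has(cid)) return true;
    if (!this.blocks.has(cid)) return false;
    const ok = this.links.get(cid).every((child) => this.hasGraph(child));
    if (ok) this.complete.add(cid); // cache so later queries are O(1)
    return ok;
  }
}
```

The sketch has no delete operation; a real store would have to invalidate the cached `complete` entries whenever a block is removed.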

@mikeal

mikeal commented Mar 4, 2020

Whoops, that posted a little early.

Anyway, we’ve done a few side experiments on what higher order applications might look like or need. One I’m in the middle of is a database with replication capabilities that look like a git repo (https://github.com/mikeal/dagdb). For that I’ve started with a simple storage interface that is just a get and a put method. We’ll see where it goes and what I feel has to be added to that interface over time.

> What is the current thinking on IPLD path resolution (ipld.resolve)? This currently depends on persistence (getting blocks), but could theoretically be changed to just work on deserialized objects.

The new stuff doesn’t really have this beyond the single block level. There’s a resolve() method in the new codec interface https://github.com/ipld/js-codec-interface but it only resolves within the block. We designed it this way because we expect to see codecs in the future that can do fast seeks into the block data and we wanted to make sure that our interfaces were able to leverage those features.
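
To make "only resolves within the block" concrete, here is a hedged sketch of a block-local resolver. It is not the js-codec-interface API: the function name `resolveInBlock`, the `{ value, remaining }` return shape, and the use of a `{ '/': '...' }` object as a link marker (borrowed from the DAG-JSON convention) are all assumptions made for illustration.

```javascript
// Hypothetical link marker; real IPLD data would use CID instances.
const isLink = (v) => v !== null && typeof v === 'object' && typeof v['/'] === 'string';

// Walk `path` inside one decoded block. Stops at the first link and reports
// the unresolved remainder instead of fetching another block.
function resolveInBlock (decoded, path) {
  const segments = path.split('/').filter(Boolean);
  let value = decoded;
  for (let i = 0; i < segments.length; i++) {
    if (isLink(value)) {
      return { value, remaining: segments.slice(i).join('/') };
    }
    if (value === null || typeof value !== 'object' || !(segments[i] in value)) {
      throw new Error(`Path not found: ${segments.slice(0, i + 1).join('/')}`);
    }
    value = value[segments[i]];
  }
  return { value, remaining: '' };
}
```

A caller that wants to cross block boundaries would fetch the linked block itself and call the resolver again with the `remaining` path, which keeps the resolver free of any storage dependency.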

For multi-block lookups, we’re investing heavily in selectors. We’re actively working on those specs and you should see more selector engines implemented in the coming months. The selector engine we have for Go is already in heavy/active use as it’s what graphsync uses, and that’s the primary replication protocol for Filecoin.

@warpfork

warpfork commented Mar 5, 2020

Regarding "current thinking on ipld.resolve":

For what it's worth, in the new generation of golang libraries, this is done with the traversal package (docs: https://godoc.org/github.com/ipld/go-ipld-prime/traversal ). These traversals are a functional construction over the Node interface.

The Node interfaces -- the most core thing that everything revolves around -- have Lookup and LookupIndex functions on them which do single steps of resolution... but anything that crosses more than one Node at a time goes through the traversal package, which just calls these.

The new generation of golang libraries don't really have a 'Block' interface at all. The closest thing is the Loader and Storer function interfaces. These use streaming interfaces and never require bringing the whole binary blob into memory at once. These take a couple of interesting parameters ("LinkContext") that are also meant to enable segmentation of storage on more than just the pure hash, which could be useful for making storage logic that admits some application-specific concepts (we'll see; I haven't actually used it yet, either).

If anything in the future wants to do fast seeks to mid-"block" data without fully parsing the whole thing, I'm expecting to do that by making an implementation of Node that lazily loads things. It's a speculative feature to date, though.


Thumbs-up'd to pretty much everything else Mikeal said. :) Just thought it might be also useful to offer some context on how the go libs feel about some of the resolve stuff.

@chafey
Contributor Author

chafey commented Mar 5, 2020

Thank you for answering my questions, this is very helpful! I am going to submit a PR for this repo with an update to the README that reflects it being "soft deprecated". Closing this issue.

@chafey chafey closed this as completed Mar 5, 2020