feat: add support for sync car reading #121

Merged
merged 8 commits into from
Jan 27, 2023
Changes from 4 commits
122 changes: 122 additions & 0 deletions README.md
@@ -99,6 +99,7 @@ The basic `CarReader` class is consumed via:

```js
import { CarReader } from '@ipld/car/reader'
import { CarBufferReader } from '@ipld/car/buffer-reader'
```

Or alternatively: `import { CarReader } from '@ipld/car'`. CommonJS `require`
@@ -116,6 +117,8 @@ methods as well as iterators for [`blocks()`](#CarReader_blocks) and
[an `AsyncIterable`](#CarReader__fromIterable) of `Uint8Array`s (note that
Node.js streams are `AsyncIterable`s and can be consumed in this way).

`CarBufferReader` works the same way as `CarReader`, but all of its methods are synchronous.
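
As a rough illustration of the difference, the sketch below uses two hypothetical stand-in objects (not the real `@ipld/car` classes) that expose the same `getRoots()` method name. The only difference is whether the result arrives wrapped in a `Promise`:

```js
// Stand-in readers (hypothetical, NOT the real @ipld/car classes). They share
// a method name and differ only in whether the result is wrapped in a Promise.
const roots = ['root-cid']

const asyncReader = {
  async getRoots () { return roots }
}

const syncReader = {
  getRoots () { return roots }
}

// Synchronous style: the value is available immediately.
const syncRoots = syncReader.getRoots()

// Asynchronous style: the call returns a Promise that still has to be awaited.
const pendingRoots = asyncReader.getRoots()
```

The synchronous form can be called from code that cannot `await`, at the cost of requiring the whole CAR to be in memory already.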

### [`CarIndexedReader`](#CarIndexedReader)

The `CarIndexedReader` class is a special form of `CarReader` and can be
@@ -218,6 +221,7 @@ be directly fed to a
* [`async CarReader#get(key)`](#CarReader_get)
* [`async * CarReader#blocks()`](#CarReader_blocks)
* [`async * CarReader#cids()`](#CarReader_cids)
* [`CarReader.fromBytesSync(bytes)`](#CarReader__fromBytesSync)
* [`async CarReader.fromBytes(bytes)`](#CarReader__fromBytes)
* [`async CarReader.fromIterable(asyncIterable)`](#CarReader__fromIterable)
* [`async CarReader.readRaw(fd, blockIndex)`](#CarReader__readRaw)
@@ -263,6 +267,14 @@ be directly fed to a
* [`decoder.bytesReader(bytes)`](#decoder__bytesReader__bytes__)
* [`decoder.asyncIterableReader(asyncIterable)`](#decoder__asyncIterableReader__asyncIterable__)
* [`decoder.limitReader(reader, byteLimit)`](#decoder__limitReader__reader____byteLimit__)
* [`class CarBufferReader`](#CarBufferReader)
* [`CarBufferReader#getRoots()`](#CarBufferReader_getRoots)
* [`CarBufferReader#has(key)`](#CarBufferReader_has)
* [`CarBufferReader#get(key)`](#CarBufferReader_get)
* [`* CarBufferReader#blocks()`](#CarBufferReader_blocks)
* [`* CarBufferReader#cids()`](#CarBufferReader_cids)
* [`CarBufferReader.fromBytes(bytes)`](#CarBufferReader__fromBytes)
* [`CarBufferReader.readRaw(fd, blockIndex)`](#CarBufferReader__readRaw)

<a name="CarReader"></a>
### `class CarReader`
@@ -333,6 +345,17 @@ the CAR referenced by this reader.
Returns a `CIDIterator` (`AsyncIterable<CID>`) that iterates over all of
the `CID`s contained within the CAR referenced by this reader.

<a name="CarReader__fromBytesSync"></a>
### `CarReader.fromBytesSync(bytes)`

* `bytes` `(Uint8Array)`

* Returns: `CarReader`

Instantiate a [`CarReader`](#CarReader) from a `Uint8Array` blob. This performs a
decode fully in memory and maintains the decoded state in memory for full
access to the data via the `CarReader` API.

<a name="CarReader__fromBytes"></a>
### `async CarReader.fromBytes(bytes)`

@@ -948,6 +971,105 @@ Wraps a `BytesReader` in a limiting `BytesReader` which limits maximum read
to `byteLimit` bytes. It _does not_ update `pos` of the original
`BytesReader`.

<a name="CarBufferReader"></a>
### `class CarBufferReader`

Properties:

* `version` `(number)`: The version number of the CAR referenced by this
reader (should be `1` or `2`).

Provides blockstore-like access to a CAR.

Implements the `SyncRootsReader` interface:
[`getRoots()`](#CarBufferReader_getRoots), and the `SyncBlockReader` interface:
[`get()`](#CarBufferReader_get), [`has()`](#CarBufferReader_has),
[`blocks()`](#CarBufferReader_blocks) (returns an `Iterable<Block>`) and
[`cids()`](#CarBufferReader_cids) (returns an `Iterable<CID>`).

Load this class with either `import { CarBufferReader } from '@ipld/car/buffer-reader'`
(`const { CarBufferReader } = require('@ipld/car/buffer-reader')`) or
`import { CarBufferReader } from '@ipld/car'` (`const { CarBufferReader } = require('@ipld/car')`).
The former will likely result in smaller bundle sizes where this is
important.

<a name="CarBufferReader_getRoots"></a>
### `CarBufferReader#getRoots()`

* Returns: `CID[]`

Get the list of roots defined by the CAR referenced by this reader. May be
zero or more `CID`s.

<a name="CarBufferReader_has"></a>
### `CarBufferReader#has(key)`

* `key` `(CID)`

* Returns: `boolean`

Check whether a given `CID` exists within the CAR referenced by this
reader.

<a name="CarBufferReader_get"></a>
### `CarBufferReader#get(key)`

* `key` `(CID)`

* Returns: `Block|undefined`

Fetch a `Block` (a `{ cid:CID, bytes:Uint8Array }` pair) from the CAR
referenced by this reader matching the provided `CID`. In the case where
the provided `CID` doesn't exist within the CAR, `undefined` will be
returned.
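
The lookup behaviour of `has()` and `get()` can be sketched as follows. This mirrors the parallel key array used in `src/buffer-reader-browser.js` in this change, but `LookupSketch` and `fakeCid` are hypothetical names for illustration, and the stand-in CIDs only implement `toString()`:

```js
// Stand-in CID (hypothetical helper): only `toString()` matters here.
const fakeCid = (s) => ({ toString: () => s })

class LookupSketch {
  constructor (blocks) {
    this._blocks = blocks
    // Parallel array of stringified CIDs, in the same order as the blocks.
    this._keys = blocks.map((b) => b.cid.toString())
  }

  has (key) {
    return this._keys.indexOf(key.toString()) > -1
  }

  get (key) {
    const index = this._keys.indexOf(key.toString())
    // A missing key yields `undefined` rather than throwing.
    return index > -1 ? this._blocks[index] : undefined
  }
}

const store = new LookupSketch([
  { cid: fakeCid('cid-a'), bytes: new Uint8Array([1, 2, 3]) },
  { cid: fakeCid('cid-b'), bytes: new Uint8Array([4]) }
])

const hit = store.get(fakeCid('cid-b')) // the second block
const miss = store.get(fakeCid('cid-z')) // undefined
```

Both calls scan the key array linearly, so the cost of a lookup grows with the number of blocks in the CAR.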

<a name="CarBufferReader_blocks"></a>
### `* CarBufferReader#blocks()`

* Returns: `Generator<Block>`

Returns a `Generator<Block>` (a synchronous `Iterable<Block>`) that iterates over all
of the `Block`s (`{ cid:CID, bytes:Uint8Array }` pairs) contained within
the CAR referenced by this reader.

<a name="CarBufferReader_cids"></a>
### `* CarBufferReader#cids()`

* Returns: `Generator<CID>`

Returns a `Generator<CID>` (a synchronous `Iterable<CID>`) that iterates over all of
the `CID`s contained within the CAR referenced by this reader.
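
Because `blocks()` and `cids()` are plain generator methods, they can be consumed with `for...of` or spread syntax, with no `for await` required. A minimal sketch, assuming a hypothetical `IterSketch` class and plain strings standing in for real `CID`s:

```js
// Hypothetical stand-in with the same generator method shapes.
class IterSketch {
  constructor (blocks) {
    this._blocks = blocks
  }

  * blocks () {
    for (const block of this._blocks) {
      yield block
    }
  }

  * cids () {
    for (const block of this._blocks) {
      yield block.cid
    }
  }
}

const reader = new IterSketch([
  { cid: 'cid-a', bytes: new Uint8Array([1]) },
  { cid: 'cid-b', bytes: new Uint8Array([2, 3]) }
])

// Spread a synchronous generator straight into an array.
const cidList = [...reader.cids()]

// Or walk it with an ordinary for...of loop.
let totalBytes = 0
for (const { bytes } of reader.blocks()) {
  totalBytes += bytes.length
}
```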

<a name="CarBufferReader__fromBytes"></a>
### `CarBufferReader.fromBytes(bytes)`

* `bytes` `(Uint8Array)`

* Returns: `CarBufferReader`

Instantiate a [`CarBufferReader`](#CarBufferReader) from a `Uint8Array` blob. This performs a
decode fully in memory and maintains the decoded state in memory for full
access to the data via the `CarBufferReader` API.
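
In the source for this change, `fromBytes()` throws a `TypeError` for anything that is not a `Uint8Array` before handing the bytes to the synchronous decoder. A minimal sketch of that guard, where `decodeSync` is a hypothetical stand-in for the real decoder and `fromBytesSketch` is an illustrative name:

```js
// Hypothetical stand-in for the real synchronous CAR decoder: it ignores its
// input and returns an empty version 1 result.
function decodeSync (bytes) {
  return { header: { version: 1, roots: [] }, blocks: [] }
}

function fromBytesSketch (bytes) {
  // Reject anything that is not a Uint8Array up front.
  if (!(bytes instanceof Uint8Array)) {
    throw new TypeError('fromBytes() requires a Uint8Array')
  }
  return decodeSync(bytes)
}

// A string is rejected with a TypeError.
let rejected = false
try {
  fromBytesSketch('not bytes')
} catch (err) {
  rejected = err instanceof TypeError
}

// A Uint8Array passes the guard.
const decoded = fromBytesSketch(new Uint8Array(0))
```

A Node.js `Buffer` is a `Uint8Array` subclass, so it also passes an `instanceof Uint8Array` check of this kind.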

<a name="CarBufferReader__readRaw"></a>
### `CarBufferReader.readRaw(fd, blockIndex)`

* `fd` `(number)`: A file descriptor from the
Node.js `fs` module. An integer, from `fs.open()`.
* `blockIndex` `(BlockIndex)`: An index pointing to the location of the
Block required. This `BlockIndex` should take the form:
`{cid:CID, blockLength:number, blockOffset:number}`.

* Returns: `Block`: A `{ cid:CID, bytes:Uint8Array }` pair.

Reads a block directly from a file descriptor for an open CAR file. This
function is **only available in Node.js** and not a browser environment.

This function can be used in connection with [`CarIndexer`](#CarIndexer) which emits
the `BlockIndex` objects that are required by this function.

The user is responsible for opening and closing the file used in this call.

## License

Licensed under either of
8 changes: 7 additions & 1 deletion package.json
@@ -78,6 +78,11 @@
"browser": "./src/reader-browser.js",
"import": "./src/reader.js"
},
"./buffer-reader": {
"types": "./dist/src/buffer-reader-browser.d.ts",
"browser": "./src/buffer-reader-browser.js",
"import": "./src/buffer-reader.js"
},
"./writer": {
"types": "./dist/src/writer.d.ts",
"browser": "./src/writer-browser.js",
@@ -190,7 +195,7 @@
"test:examples": "npm run test --prefix examples/",
"dep-check": "aegir dep-check",
"coverage": "c8 --reporter=html --reporter=text mocha test/test-*.js && npx st -d coverage -p 8888",
- "docs": "jsdoc4readme --readme --description-only src/reader*.js src/indexed-reader.js src/iterator.js src/indexer.js src/writer*.js src/buffer-writer.js src/decoder.js"
+ "docs": "jsdoc4readme --readme --description-only src/reader*.js src/indexed-reader.js src/iterator.js src/indexer.js src/writer*.js src/buffer-writer.js src/decoder.js src/buffer-reader*.js"
},
"dependencies": {
"@ipld/dag-cbor": "^9.0.0",
@@ -212,6 +217,7 @@
"./src/index.js": "./src/index-browser.js",
"./src/index-reader.js": "./src/index-reader-browser.js",
"./src/reader.js": "./src/reader-browser.js",
"./src/buffer-reader.js": "./src/buffer-reader-browser.js",
"./src/writer.js": "./src/writer-browser.js",
"fs": false,
"util": false,
22 changes: 20 additions & 2 deletions src/api.ts
@@ -1,5 +1,10 @@
import type { CID } from 'multiformats/cid'

/**
* Literally any `Iterable` (async or regular).
*/
export type AwaitIterable<T> = Iterable<T> | AsyncIterable<T>

export type { CID }
/* Generic types for interfacing with block storage */

@@ -24,15 +29,27 @@ export interface RootsReader {
getRoots: () => Promise<CID[]>
}

export interface SyncRootsReader {
version: number
getRoots: () => CID[]
}

export interface BlockIterator extends AsyncIterable<Block> {}

export interface CIDIterator extends AsyncIterable<CID> {}

export interface BlockReader {
has: (key: CID) => Promise<boolean>
get: (key: CID) => Promise<Block | undefined>
- blocks: () => BlockIterator
- cids: () => CIDIterator
+ blocks: () => AsyncIterable<Block>
+ cids: () => AsyncIterable<CID>
}

export interface SyncBlockReader {
has: (key: CID) => boolean
get: (key: CID) => Block | undefined
blocks: () => Iterable<Block>
cids: () => Iterable<CID>
}

export interface BlockWriter {
@@ -60,6 +77,7 @@ export interface WriterChannel {
}

export interface CarReader extends BlockReader, RootsReader {}
export interface CarBufferReader extends SyncBlockReader, SyncRootsReader {}

/* Specific implementations for CAR block storage */

157 changes: 157 additions & 0 deletions src/buffer-reader-browser.js
@@ -0,0 +1,157 @@
import * as DecoderSync from './decoder-sync.js'

/**
* @typedef {import('multiformats').CID} CID
* @typedef {import('./api').Block} Block
* @typedef {import('./api').CarBufferReader} ICarBufferReader
* @typedef {import('./coding').BytesReader} BytesReader
* @typedef {import('./coding').CarHeader} CarHeader
* @typedef {import('./coding').CarV2Header} CarV2Header
*/

/**
* Provides blockstore-like access to a CAR.
*
* Implements the `SyncRootsReader` interface:
* {@link ICarBufferReader.getRoots `getRoots()`}. And the `SyncBlockReader` interface:
* {@link ICarBufferReader.get `get()`}, {@link ICarBufferReader.has `has()`},
* {@link ICarBufferReader.blocks `blocks()`} (returns an `Iterable<Block>`) and
* {@link ICarBufferReader.cids `cids()`} (returns an `Iterable<CID>`).
*
* Load this class with either `import { CarBufferReader } from '@ipld/car/buffer-reader'`
* (`const { CarBufferReader } = require('@ipld/car/buffer-reader')`) or
* `import { CarBufferReader } from '@ipld/car'` (`const { CarBufferReader } = require('@ipld/car')`).
* The former will likely result in smaller bundle sizes where this is
* important.
*
* @name CarBufferReader
* @class
* @implements {ICarBufferReader}
* @property {number} version The version number of the CAR referenced by this
* reader (should be `1` or `2`).
*/
export class CarBufferReader {
/**
* @constructs CarBufferReader
* @param {CarHeader|CarV2Header} header
* @param {Block[]} blocks
*/
constructor (header, blocks) {
this._header = header
this._blocks = blocks
this._keys = blocks.map((b) => b.cid.toString())
}

/**
* @property version
* @memberof CarBufferReader
* @instance
*/
get version () {
return this._header.version
}

/**
* Get the list of roots defined by the CAR referenced by this reader. May be
* zero or more `CID`s.
*
* @function
* @memberof CarBufferReader
* @instance
* @returns {CID[]}
*/
getRoots () {
return this._header.roots
/* c8 ignore next 2 */
// Node.js 12 c8 bug
}

/**
* Check whether a given `CID` exists within the CAR referenced by this
* reader.
*
* @function
* @memberof CarBufferReader
* @instance
* @param {CID} key
* @returns {boolean}
*/
has (key) {
return this._keys.indexOf(key.toString()) > -1
/* c8 ignore next 2 */
// Node.js 12 c8 bug
}

/**
* Fetch a `Block` (a `{ cid:CID, bytes:Uint8Array }` pair) from the CAR
* referenced by this reader matching the provided `CID`. In the case where
* the provided `CID` doesn't exist within the CAR, `undefined` will be
* returned.
*
* @function
* @memberof CarBufferReader
* @instance
* @param {CID} key
* @returns {Block | undefined}
*/
get (key) {
const index = this._keys.indexOf(key.toString())
return index > -1 ? this._blocks[index] : undefined
Review comment (Member): Why not just find the block?

Suggested change:
- const index = this._keys.indexOf(key.toString())
- return index > -1 ? this._blocks[index] : undefined
+ return this._blocks.find((b) => b.cid.equals(key))

/* c8 ignore next 2 */
// Node.js 12 c8 bug
}

/**
* Returns a `Generator<Block>` (a synchronous `Iterable<Block>`) that iterates over all
* of the `Block`s (`{ cid:CID, bytes:Uint8Array }` pairs) contained within
* the CAR referenced by this reader.
*
* @function
* @memberof CarBufferReader
* @instance
* @generator
* @returns {Generator<Block>}
*/
* blocks () {
for (const block of this._blocks) {
yield block
}
}

/**
* Returns a `Generator<CID>` (a synchronous `Iterable<CID>`) that iterates over all of
* the `CID`s contained within the CAR referenced by this reader.
*
* @function
* @memberof CarBufferReader
* @instance
* @generator
* @returns {Generator<CID>}
*/
* cids () {
for (const block of this._blocks) {
yield block.cid
}
}

/**
* Instantiate a {@link CarBufferReader} from a `Uint8Array` blob. This performs a
* decode fully in memory and maintains the decoded state in memory for full
* access to the data via the `CarBufferReader` API.
*
* @static
* @memberof CarBufferReader
* @param {Uint8Array} bytes
* @returns {CarBufferReader}
*/
static fromBytes (bytes) {
if (!(bytes instanceof Uint8Array)) {
throw new TypeError('fromBytes() requires a Uint8Array')
}

const { header, blocks } = DecoderSync.fromBytes(bytes)
return new CarBufferReader(header, blocks)
}
}

export const __browser = true