-
Notifications
You must be signed in to change notification settings - Fork 1.3k
feat: Add option to specify chunking algorithm when adding files #1469
Conversation
This PR depends on ipfs-inactive/js-ipfs-unixfs-engine#223. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot. Most of my comments are pretty minor ones.
src/cli/commands/files/add.js
Outdated
@@ -135,6 +135,10 @@ module.exports = { | |||
default: false, | |||
describe: 'Only chunk and hash, do not write' | |||
}, | |||
chunker: { | |||
default: 'default', | |||
describe: 'Chunking algorithm to use, formatted like [default, size-{size}, rabin, rabin-{avg}, rabin-{min}-{avg}-{max}]' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Go IPFS CLI uses size-262144
as default. Would it make sense to remove the default
value from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that makes sense, will make that change.
src/core/utils.js
Outdated
const sizeStr = chunker.split('-')[1] | ||
const size = parseInt(sizeStr) | ||
if (isNaN(size)) { | ||
throw new Error('Parameter avg must be an integer') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this message is wrong. The parameter isn't the average, but the fixed size.
src/core/utils.js
Outdated
} | ||
break | ||
case 4: | ||
options.minChunkSize = parseSub(parts[1].split(':'), 'min') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a port of the Go implementations. It seems to support something like rabin-min:123-avg:456-max:789
. This isn't documented (also not in the Go version). I lean towards not supporting it, i.e. less code, less docs, less testing, less bugs :)
Though it would be good to check with the Go team, what the original reason is for having support for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I talked with @whyrusleeping over irc and he said that we didn't have to worry too much about the alternative format, so I will remove.
src/core/utils.js
Outdated
} | ||
break | ||
case 4: | ||
options.minChunkSize = parseSub(parts[1].split(':'), 'min') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a port of the Go implementations. It seems to support something like rabin-min:123-avg:456-max:789
. This isn't documented (also not in the Go version). I lean towards not supporting it, i.e. less code, less docs, less testing, less bugs :)
Though it would be good to check with the Go team, what the original reason is for having support for that.
@@ -157,4 +157,51 @@ describe('utils', () => { | |||
}) | |||
}) | |||
}) | |||
|
|||
describe('parseChunkerString', () => { | |||
it('handles an empty string', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add test cases for error cases like an unsupported chunker.
src/cli/commands/files/add.js
Outdated
@@ -135,6 +135,10 @@ module.exports = { | |||
default: false, | |||
describe: 'Only chunk and hash, do not write' | |||
}, | |||
chunker: { | |||
default: 'size-262144', | |||
describe: 'Chunking algorithm to use, formatted like [default, size-{size}, rabin, rabin-{avg}, rabin-{min}-{avg}-{max}]' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really a minor thing: I've seen that the Go implementation can parse default
as value, but also the CLI help of ipfs add --help
doesn't show this option. I'd remove default
from this list and also the code below. Less options, less bugs :)
src/core/utils.js
Outdated
* @return {Object} Chunker options for DAGBuilder | ||
*/ | ||
function parseChunkerString (chunker) { | ||
if (!chunker || chunker === '') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we ever pass in an empty string (or undefined
)? If not, this if case could be removed (and also removed from the JSDocs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks safe to remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot! LGTM, I think it's ready to merged. I'd prefer if someone from the js-ipfs would do the merge and hence confirm that it's good to go.
This looks great @dordille. Do you have time to submit a PR to https://github.com/ipfs/interface-ipfs-core/blob/master/SPEC/FILES.md#filesadd to document this new option? |
Yeah, I’ll update the docs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. Thank you! ✨
There's just a few issues to iron out before we can merge this. Comments inline.
Additionally, the option needs to also be added to the HTTP API here https://github.com/ipfs/js-ipfs/blob/master/src/http/api/resources/files.js#L152
src/core/components/files.js
Outdated
@@ -133,12 +134,13 @@ class AddHelper extends Duplex { | |||
} | |||
|
|||
module.exports = function files (self) { | |||
function _addPullStream (options) { | |||
function _addPullStream (options = {}) { | |||
const chunkerOptions = parseChunkerString(options.chunker) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parseChunkerString
can throw. add*Stream
methods should never throw when called, but should return a stream that immediately errors. You need to do something like:
let chunkerOptions
try {
chunkerOptions = parseChunkerString(options.chunker)
} catch (err) {
return pull.map(() => { throw err })
}
src/http/api/resources/files.js
Outdated
@@ -233,7 +233,8 @@ exports.add = { | |||
onlyHash: request.query['only-hash'], | |||
hashAlg: request.query['hash'], | |||
wrapWithDirectory: request.query['wrap-with-directory'], | |||
pin: request.query.pin | |||
pin: request.query.pin, | |||
chunker: request.query['chunker'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
request.query['chunker']
=> request.query.chunker
@alanshaw Updated that http spec to include the option, can't retrigger a build though, looks like there wasn't enough space to run the build. |
582d880
to
30ea824
Compare
I've rebased this against master - there were test failures due to changes in dependencies that have now been resolved. Lets see what CI says now 🤞 |
a5eb1e7
to
f9c9d13
Compare
This allows the chunking algorithm, and options to be specified when using the adding files. Specifying chunker and options are identical to go-ipfs and support the following formats: default, size-{size}, rabin, rabin-{avg}, rabin-{min}-{avg}-{max} This is required to achieve parity with go-ipfs. Fixes ipfs#1283 License: MIT Signed-off-by: Dan Ordille <dordille@gmail.com>
License: MIT Signed-off-by: Dan Ordille <dordille@gmail.com>
License: MIT Signed-off-by: Dan Ordille <dordille@gmail.com>
License: MIT Signed-off-by: Dan Ordille <dordille@gmail.com>
License: MIT Signed-off-by: Dan Ordille <dordille@gmail.com>
License: MIT Signed-off-by: Dan Ordille <dordille@gmail.com>
…ull stream License: MIT Signed-off-by: Dan Ordille <dordille@gmail.com>
Tests all passing, only commitlint that failed - am merging! Thanks @dordille ❤️ |
This allows the chunking algorithm, and options to be specified when using the adding files.
Specifying chunker and options are identical to go-ipfs and support the following formats:
default
size-{size}
rabin
rabin-{avg}
rabin-{min}-{avg}-{max}
Example usage via command line as follows
Fixes #1283