Python and JS wrappers #2

DonaldTsang · 2019-07-25T01:47:58Z

It would be useful to have wrappers that make such usage easy.
Standard formatting: https://docs.python.org/3/library/hashlib.html
x = k12(param="number")
x.update(b"binary data")
y = x.digest()


mumbleskates · 2020-05-16T04:47:59Z

There is no standard API for XOFs in Python, so this would entail inventing one at the very minimum.

DonaldTsang · 2020-05-16T06:45:48Z

@mumbleskates thanks for the response, what do you mean by "standard API"? We can simply copy the API standard of BLAKE2 where they allow the existence of digest size, key and salt.
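For reference, a minimal sketch of the BLAKE2 interface mentioned here, using only the standard library. The key, salt, and data values are placeholders chosen for illustration.

```python
import hashlib

# hashlib.blake2b accepts a digest size, a key, and a salt at construction.
h = hashlib.blake2b(
    digest_size=32,        # output length in bytes (1 to 64 for blake2b)
    key=b"example key",    # optional key, at most 64 bytes
    salt=b"example salt",  # optional salt, at most 16 bytes
)
h.update(b"binary data")
tag = h.digest()           # exactly digest_size bytes
```

Note that the output length is fixed when the object is created, which is the point picked up later in the thread.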

mumbleskates · 2020-05-16T07:38:50Z

Cool, that's a fair observation! I forgot they added those in 3.6.

The C API of libk12.so is pretty straightforward, so calling into it with ctypes is not overly complex. I would expect the work to be 30% plumbing, 60% packaging, and 10% left over for nonsense.
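As a rough sketch of what the plumbing part could look like for the all-in-one call: the helper name `k12_oneshot` is hypothetical, and the C prototype and the "0 means success" convention written in the comments are assumptions that should be checked against the installed KangarooTwelve header.

```python
import ctypes

def k12_oneshot(lib, data, out_len, customization=b""):
    """Call the all-in-one KangarooTwelve() function through ctypes.

    `lib` is an already loaded library object, for example
    ctypes.CDLL("libk12.so") (the path is an assumption).
    """
    fn = lib.KangarooTwelve
    # Assumed prototype (verify against the header you build with):
    # int KangarooTwelve(const unsigned char *input, size_t inputByteLen,
    #                    unsigned char *output, size_t outputByteLen,
    #                    const unsigned char *customization,
    #                    size_t customByteLen);
    fn.argtypes = [ctypes.c_char_p, ctypes.c_size_t,
                   ctypes.c_char_p, ctypes.c_size_t,
                   ctypes.c_char_p, ctypes.c_size_t]
    fn.restype = ctypes.c_int
    out = ctypes.create_string_buffer(out_len)  # writable output buffer
    rc = fn(data, len(data), out, out_len,
            customization, len(customization))
    if rc != 0:  # assumed convention: nonzero return signals failure
        raise RuntimeError(f"KangarooTwelve failed with code {rc}")
    return out.raw
```

This only covers the one-shot function; the incremental API needs a state struct, which is where the harder questions start.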

gvanas · 2020-05-16T21:44:55Z

Creating bindings of K12 for other languages would of course be of great interest for the community. See for instance the bindings for Rust written by Jack O'Connor.

However, to be honest, I don't think I will work on this any time soon. For the time being, I stay focused on cryptography and core cryptographic implementations. Also, although I think (or hope) that I have the abilities to write such wrappers, I do not have any experience.

So, if anyone wants to take on this task it would certainly be very useful.

mumbleskates · 2020-05-16T22:32:03Z

Yes, I believe this would definitely belong in a separate repository.

DonaldTsang · 2020-05-17T05:13:52Z

I agree it should be in a separate repo, but it is good to know that it should be done.

mumbleskates · 2020-05-17T08:56:23Z

Regardless of the precedent for variable output length with BLAKE2, here are some immediate challenges I can see:

hashlib objects are designed to be initialized with optional data, optionally updated zero or more times with more data, and then produce a digest. They can then be updated again, and produce another digest of the longer data without re-hashing the earlier data.
Because of this multi step nature, there's no clear way to use the KangarooTwelve(...) all-in-one function, and any wrapper library needs to know how to allocate and manage K12 states if it is to conform to the existing API standard.
Managing allocated structs like this is a non-trivial affair that probably requires additional C code unless public APIs are added to return both sizeof(KangarooTwelve_Instance) and the required alignment (or we assume some janky worst-case and use that instead and hope it keeps working).
blake2 is still not an XOF; it only has a variable-size digest. There's no provision for squeezing from it multiple times, or for producing a number of bytes unknown at initialization time. None of the existing hash objects have stateful modes, so there is no existing function with a concept of "sponge now in squeezing mode, you can't update with more data now".
There is no clear resolution here that solves all these problems, except perhaps to provide an alternate API for absorb/squeeze usage from a module outside hashlib.
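The multi-step behaviour described in the first point can be seen with any existing hashlib object. A small sketch using blake2b, with placeholder data:

```python
import hashlib

h = hashlib.blake2b(b"hello ")  # optional initial data
h.update(b"world")              # zero or more updates
first = h.digest()              # digest of b"hello world"

h.update(b"!")                  # updating after digest() is still allowed
second = h.digest()             # digest of b"hello world!"
```

A sponge that has switched to squeezing cannot absorb again, so a K12 wrapper following this pattern would presumably have to finalize a copy of the state on each digest() call rather than the state itself.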

So from all of the above, there are still non-trivial amounts of work to be done to 1) figure out the desired python API, and either 2a) add a couple helpful bits to K12 or 2b) create a new library with its own .so to use.

I would prefer adding public APIs to return the size and alignment of the struct, since managing a mutable buffer with a bytearray and ctypes is pretty doable and managing lifetimes of memory allocated by a C extension sounds pretty lame. Having these functions seems potentially useful for other language wrappers as well.

gvanas · 2020-05-18T16:18:56Z

> There's no provision for squeezing from it multiple times, or producing a number of bytes from it unknown at initialization time. none of them have stateful modes, so there is no existing function with a concept of "sponge now in squeezing mode, you can't update with more data now".

The API for SHAKE128 and SHAKE256 in Python solves part of the problem. The method digest(length) makes it possible to produce a number of bytes unknown at initialization time, but it does not really enter the "squeezing phase", since requesting more bytes restarts from the beginning of the output stream.
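A short sketch of that behaviour with the standard library, using placeholder input:

```python
import hashlib

x = hashlib.shake_128(b"binary data")
short = x.digest(16)   # first 16 bytes of the output stream
longer = x.digest(32)  # restarts from the beginning, not bytes 16..48

# The second call repeats the first 16 bytes instead of continuing the stream.
assert longer[:16] == short
```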

> I would prefer adding public APIs to return the size and alignment of the struct, since managing a mutable buffer with a bytearray and ctypes is pretty doable and managing lifetimes of memory allocated by a C extension sounds pretty lame. Having these functions seems potentially useful for other language wrappers as well.

A possible fallback solution would be to have a canonical struct to store the state (with fixed size and alignment) and functions that import/export from/to the KangarooTwelve_Instance struct.

mumbleskates · 2020-05-19T01:10:05Z

> A possible fallback solution would be to have a canonical struct to store the state (with fixed size and alignment) and functions that import/export from/to the KangarooTwelve_Instance struct.

We could, but this still punts on the idea of knowing how much space to allocate.

However, looking at it a little more, I think it might be quite safe to take the sizeof() of the current 64-byte-aligned struct and always use that. I believe that currently the most-aligned layout has a sizeof() of 512, and looks like this:

[  0, 200): queueNode.state       -- 200 bytes
 200      : queueNode.byteIOIndex -- 1 byte
 201      : queueNode.squeezing   -- 1 byte
[202, 256):   <padding>           -- 54 bytes
[256, 456): finalNode.state       -- 200 bytes
 456      : finalNode.byteIOIndex -- 1 byte
 457      : finalNode.squeezing   -- 1 byte
[458, 464):   <padding>           -- 6 bytes
[464, 472): fixedOutputLength     -- 8 bytes
[472, 480): blockNumber           -- 8 bytes
[480, 484): queueAbsorbedLen      -- 4 bytes
[484, 488): phase                 -- 4 bytes
[488, 512):   <padding>           -- 24 bytes

This actually seems like ample overhead and it could be very reasonable to just assume that 512 bytes, 64-byte aligned, will probably always be enough. It seems kind of unlikely that we would need to support a build whose padding overhead is somehow even worse than this.
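A minimal sketch of how a wrapper could reserve such a buffer from Python with a bytearray and ctypes. The names `alloc_aligned_state`, `STATE_SIZE`, and `STATE_ALIGN` are hypothetical, and the 512/64 figures are the assumption from the layout above, not values exported by the library.

```python
import ctypes

STATE_SIZE = 512   # assumed worst-case sizeof(KangarooTwelve_Instance)
STATE_ALIGN = 64   # assumed required alignment

def alloc_aligned_state(size=STATE_SIZE, align=STATE_ALIGN):
    """Return (backing, view): `view` is a ctypes byte array of `size`
    bytes whose address is a multiple of `align`, carved out of the
    over-allocated bytearray `backing`."""
    backing = bytearray(size + align - 1)   # room to slide to alignment
    base = ctypes.addressof(ctypes.c_char.from_buffer(backing))
    offset = (-base) % align                # bytes to skip from the start
    view = (ctypes.c_ubyte * size).from_buffer(backing, offset)
    return backing, view

backing, state = alloc_aligned_state()
```

The `state` object could then be passed with ctypes.byref() to the instance functions, and Python's garbage collector would own the memory, which avoids managing lifetimes from a C extension.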
