Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semantics of atomic.increment with other methods; value type system #51

Open
pchickey opened this issue Oct 26, 2024 · 3 comments
Open

Comments

@pchickey
Copy link

pchickey commented Oct 26, 2024

dIn all operations on a keyvalue store, the type of a value is given by list<u8>. atomic.increment is the one exception where the value is passed in as an s64 and returned as an s64. With increment I can create a new s64 value for a key not in the store, modify by adding an s64, and retrieve the current s64 if I pass in a delta of 0.

The spec for atomic.increment needs to describe how the value associated with this key interacts with all of the other operations on the store. This is tricky because many underlying implementations of a store have a more complicated type system than representing every value as just a binary blob list<u8>, and only make operations like atomic addition available on types that were stored as numbers, rather than as blobs.

One possible path for the spec is, if the value in bucket.set to a byte sequence which, as ascii, parses to a base 10 integer value (optional leading -, then up to some limit of digits in 0-9) and contains no trailing symbols, the value is considered an numeric value instead of a binary blob. A bucket.get of the value gives the bytes of the ascii base-10 integer. (If you dont like ascii base 10, you could pick some other encoding of integers here, be it a sequence of 8 bytes with a little-endian 2s complement integer, or EB128 used in the wasm binary format, or whatever else you can dream up - its a big open ended problem). And now atomic.increment works on that value, because its a number value, right? You probably want an error variant now for when you use increment on a non-number value too.

This general path for solving the problem has introduced a phantom datatype system for values - it exists in the semantics of the interface, but it doesn't show up in the syntax.

So then say we introduce some common denominator datatype system that many of these underlying store implementations already have at the keyvalue interface. For example, we could define variant value { string(string), blob(list<u8>), number(s64) } and use that in place of all the list<u8>s in the set/get methods. This has a lot of advantages - now its very obvious what values are numbers versus which are binary blobs, and it means numbers get to be passed across the interface in an encoding defined by the component model, rather than being solved specifically for this proposal. It also puts this spec in a position where we have to choose our datatype system very carefully to have consistent semantics on the wide range of possible underlying common stores - we already have to do that for at least numbers in order to support increment properly, but now we have the opportunity to support storing other types as well, strings being the most obvious.

I think this is a pretty major design problem in the current spec of keyvalue. I'd like to hear what other implementors have done in order to implement atomic.increment so far, and what sort of discussion, if any, there has been to date about introducing a value datatype system.

@Mossaka
Copy link
Collaborator

Mossaka commented Oct 29, 2024

I like the idea of a value type system you brought up in this issue, which obviously adds more complexity to the implementation. I am also curious to hear what implementors thoughts on this.

CC @thomastaylor312 , @devigned

@devigned
Copy link
Collaborator

Recently, I have been implementing atomic.increment for Spin. I'm not sure we can safely implement increment in its current shape without bleeding internal implementation details out to the consumer of the interface.

For example, we do not separate key space between Vec<u8> and numeric key / values, so there is possibility of trying to increment a key that was a bin blob. This might be ok if the consumer were to choose the same bin format for that integer, but it's likely it will not be the case.

I'm also not sure we need to use variant value { string(string), blob(list<u8>), number(u64) }. There are aspects of this I do like, but I think we might be able to safely use the current interface if we were to implement K/V stores by wrapping the incoming value with some additional metadata.

For example, if we were to wrap all values with a metadata wrapper like the following
{ value: Vec<u8>, origin: Origin(integer | blob) } prior to storing in the backend store. This way we could operate on increment with a clear understanding that they stored value is going to be an integer stored in format we expect. The down side to this is that we would not be able to use https://redis.io/docs/latest/commands/incr/ since the value would be stored differently than how the backend store would expect it for increment.

I agree there is a problem. Let's figure out the best way to solve it.

@thomastaylor312
Copy link
Collaborator

My current approach to this is that you shouldn't do get on a value created by increment. We can even document that it is either undefined behavior or an error. The way to just "get" the value would be to call increment: func(bucket, "foo", 0). Unfortunately I have done this mapping of values far too often over the years (the very original keyvalue interface for wasmCloud was based mostly off of redis) and trying to do anything with types is gnarly. list<u8> is the lowest common denominator (and based on what I saw, the most common) value type for a key value store. Yes there are some with types, but for an interface like this it is definitely lowest common denominator. If you want more advanced typing, then it should be an interface specific to that store.

So my opinion is that this is something we should try to document, but not to solve

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants