-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose the paging token API from the C/C++ driver #22
Conversation
Can one of the admins verify this patch? |
5 similar comments
Can one of the admins verify this patch? |
Can one of the admins verify this patch? |
Can one of the admins verify this patch? |
Can one of the admins verify this patch? |
Can one of the admins verify this patch? |
@swift-server-bot add to allowlist |
self.token = token | ||
} | ||
|
||
public func withUnsafeBytes<R>(_ body: (UnsafeRawBufferPointer) throws -> R) rethrows -> R { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this used by clients? ideally we reduce unsafe API surface area
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I think I see, this is used as an "opaque" entity the users pass down to setPagingStateToken
, I guess what we can do to hide some of the details is to create a "guts" internal type and use that inside setPagingStateToken
eg token.guts
etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually want to use this feature myself externally, but doing it safely like the Java implementation. So I need access to the guts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, building that safety into the library itself would have necessitated pulling in additional dependencies and increasing the footprint of the library considerably, so I wanted to keep it simple here and allow users to use the token in a way that makes sense for their application.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you shed more light on what that could look like? generally ew prefer not promote unsafe APIs and here the only public API is unsafe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What the Java implementation does is take the paging token, along with the query, and all the bound parameters, and hashes them together. It then also encodes the protocol version in the token. That way the token can only be used on the exact same query it came from, along with the same protocol it came from.
I originally had implemented that, but I needed a hashing algorithm to use. There doesn't appear to be anything within the Swift standard library, so I brought in SwiftCrypto. Unfortunately that requires a higher platform target, so I would have to bump that as well.
That seemed like a lot to change just to support this paging token. Maybe there's a better option I'm not thinking of?
Otherwise we could just expose the raw byte array and get rid of the contiguous bytes implementation.
What do you think?
Sources/CassandraClient/Data.swift
Outdated
/// A reusable page token that can be used by `Statement` to resume querying | ||
/// at a specific position. | ||
public struct PagingStateToken { | ||
public let token: [UInt8] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the many warnings about these things being unsafe to share with external users or to create from arbitrary data, should we remove the public fields and public init? That would make this type opaque, and so ensure no inappropriate sharing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my use case I actually do want to use externally. Taking the appropriate safety measures like the Java client takes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the warnings come from the C library which also exposes this API. So I think the question is whether or not this library sits closer to the C implementation or offers more high level features like the Java client.
If the latter, then we may need to accept increasing the breadth of dependencies (such as on Swift Crypto in this case).
/// - Warning: The paging state should not be exposed to or come from | ||
/// untrusted environments. The paging state could be spoofed and | ||
/// potentially used to gain access to other data. | ||
func setPagingStateToken(_ pagingStateToken: PagingStateToken) throws { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this function supposed to be public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved.
/// potentially used to gain access to other data. | ||
func setPagingStateToken(_ pagingStateToken: PagingStateToken) throws { | ||
try checkResult { | ||
let bytes = pagingStateToken.token.map(CChar.init) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extra allocation here seems unnecessary: we can use withUnsafeBytes
of pagingStateToken
and pass it in directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved.
8ee2ffb
to
7a31604
Compare
Updated the PR based on some discussions, also updated the description accordingly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once CI is green!
let bufferPointer = UnsafeBufferPointer(start: $0, count: length) | ||
return Array(unsafeUninitializedCapacity: length) { storagePointer, storageCount in | ||
var (unwritten, endIndex) = storagePointer.initialize(from: bufferPointer) | ||
precondition(unwritten.next() == nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we use precondition elsewhere? otherwise maybe guard and throe an error so we dont crash the program?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it is used quite extensively. I am always a bit apprehensive about using it for the reason you highlighted.
Someone suggested this in an earlier review. This isn't a normal error condition, so I don't think throwing is appropriate. For this to fail either means we passed differing lengths in to one of those calls (a really bad bug), or something has seriously gone wrong in the runtime internally.
As a guard against a programming error, it seems a little excessive, but were it to be a reality I think crashing is actually appropriate as things have totally gone off the rails.
Looks like I need to go back a few Swift versions... |
5cd6b6b
to
b443df4
Compare
Seems like |
b443df4
to
4c8b16c
Compare
It took a bit of work to get a Swift 5.4 installation working but looks like everything is good now. |
Adds support for getting/setting a paging token that can be reused.
Motivation:
Cassandra allows for paginated result querying to support iterating over large request sets. This pagination mechanism is currently an implementation detail internal to the client. For end clients talking to a web service, being able to jump to a particular page is useful for larger result sets, especially with unreliable network conditions, or additional business logic and processing that may happen on the end client.
Modifications:
Rows
to retrieve the current paging token.Statement
to set the paging token.Note that the current implementation mostly just exposes the C/C++ token, which is unsafe to use from untrusted sources without additional care to verify that the token is used for the exact same query. For this reason, I'm exposing this as an
OpaquePagingStateToken
to use internally. The internal token is itself available viawithUnsafeBytes
. This is so that people can implement their own safety mechanism on top of this primitive. ThewithUnsafeBytes
should act as a signal that care must be taken, and documentation read.In the future, we may want to provide a higher order abstraction like the Java client has, where the paging token is made safe by way of hash verification.
This pull request provides some early value for those that need paging tokens, while allowing for additional enhancement in the future, if desired.
Result: