Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose the paging token API from the C/C++ driver #22

Merged
merged 1 commit into from
Mar 4, 2023

Conversation

erichmenge
Copy link
Contributor

@erichmenge erichmenge commented Feb 27, 2023

Adds support for getting/setting a paging token that can be reused.

Motivation:

Cassandra allows for paginated result querying to support iterating over large request sets. This pagination mechanism is currently an implementation detail internal to the client. For end clients talking to a web service, being able to jump to a particular page is useful for larger result sets, especially with unreliable network conditions, or additional business logic and processing that may happen on the end client.

Modifications:

  • Adds a function on Rows to retrieve the current paging token.
  • Adds a function on Statement to set the paging token.
  • Include tests for above

Note that the current implementation mostly just exposes the C/C++ token, which is unsafe to use from untrusted sources without additional care to verify that the token is used for the exact same query. For this reason, I'm exposing this as an OpaquePagingStateToken to use internally. The internal token is itself available via withUnsafeBytes. This is so that people can implement their own safety mechanism on top of this primitive. The withUnsafeBytes should act as a signal that care must be taken, and documentation read.

In the future, we may want to provide a higher order abstraction like the Java client has, where the paging token is made safe by way of hash verification.

This pull request provides some early value for those that need paging tokens, while allowing for additional enhancement in the future, if desired.

Result:

  • New public APIs are available for using paging tokens

@swift-server-bot
Copy link

Can one of the admins verify this patch?

5 similar comments
@swift-server-bot
Copy link

Can one of the admins verify this patch?

@swift-server-bot
Copy link

Can one of the admins verify this patch?

@swift-server-bot
Copy link

Can one of the admins verify this patch?

@swift-server-bot
Copy link

Can one of the admins verify this patch?

@swift-server-bot
Copy link

Can one of the admins verify this patch?

@yim-lee
Copy link
Member

yim-lee commented Feb 27, 2023

@swift-server-bot add to allowlist

@yim-lee yim-lee requested review from yim-lee, glbrntt and rnro February 27, 2023 22:54
self.token = token
}

public func withUnsafeBytes<R>(_ body: (UnsafeRawBufferPointer) throws -> R) rethrows -> R {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this used by clients? ideally we reduce unsafe API surface area

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I think I see, this is used as an "opaque" entity the users pass down to setPagingStateToken, I guess what we can do to hide some of the details is to create a "guts" internal type and use that inside setPagingStateToken eg token.guts etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually want to use this feature myself externally, but doing it safely like the Java implementation. So I need access to the guts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, building that safety into the library itself would have necessitated pulling in additional dependencies and increasing the footprint of the library considerably, so I wanted to keep it simple here and allow users to use the token in a way that makes sense for their application.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you shed more light on what that could look like? generally ew prefer not promote unsafe APIs and here the only public API is unsafe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What the Java implementation does is take the paging token, along with the query, and all the bound parameters, and hashes them together. It then also encodes the protocol version in the token. That way the token can only be used on the exact same query it came from, along with the same protocol it came from.

I originally had implemented that, but I needed a hashing algorithm to use. There doesn't appear to be anything within the Swift standard library, so I brought in SwiftCrypto. Unfortunately that requires a higher platform target, so I would have to bump that as well.

That seemed like a lot to change just to support this paging token. Maybe there's a better option I'm not thinking of?

Otherwise we could just expose the raw byte array and get rid of the contiguous bytes implementation.

What do you think?

/// A reusable page token that can be used by `Statement` to resume querying
/// at a specific position.
public struct PagingStateToken {
public let token: [UInt8]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the many warnings about these things being unsafe to share with external users or to create from arbitrary data, should we remove the public fields and public init? That would make this type opaque, and so ensure no inappropriate sharing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my use case I actually do want to use externally. Taking the appropriate safety measures like the Java client takes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the warnings come from the C library which also exposes this API. So I think the question is whether or not this library sits closer to the C implementation or offers more high level features like the Java client.

If the latter, then we may need to accept increasing the breadth of dependencies (such as on Swift Crypto in this case).

/// - Warning: The paging state should not be exposed to or come from
/// untrusted environments. The paging state could be spoofed and
/// potentially used to gain access to other data.
func setPagingStateToken(_ pagingStateToken: PagingStateToken) throws {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this function supposed to be public?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved.

/// potentially used to gain access to other data.
func setPagingStateToken(_ pagingStateToken: PagingStateToken) throws {
try checkResult {
let bytes = pagingStateToken.token.map(CChar.init)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra allocation here seems unnecessary: we can use withUnsafeBytes of pagingStateToken and pass it in directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved.

@erichmenge
Copy link
Contributor Author

Updated the PR based on some discussions, also updated the description accordingly.

Copy link
Member

@tomerd tomerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once CI is green!

let bufferPointer = UnsafeBufferPointer(start: $0, count: length)
return Array(unsafeUninitializedCapacity: length) { storagePointer, storageCount in
var (unwritten, endIndex) = storagePointer.initialize(from: bufferPointer)
precondition(unwritten.next() == nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we use precondition elsewhere? otherwise maybe guard and throe an error so we dont crash the program?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it is used quite extensively. I am always a bit apprehensive about using it for the reason you highlighted.

Someone suggested this in an earlier review. This isn't a normal error condition, so I don't think throwing is appropriate. For this to fail either means we passed differing lengths in to one of those calls (a really bad bug), or something has seriously gone wrong in the runtime internally.

As a guard against a programming error, it seems a little excessive, but were it to be a reality I think crashing is actually appropriate as things have totally gone off the rails.

@erichmenge
Copy link
Contributor Author

once CI is green!

Looks like I need to go back a few Swift versions...

@yim-lee
Copy link
Member

yim-lee commented Mar 4, 2023

Seems like withMemoryRebound was added in Swift 5.7. Perhaps we add a #if swift(>=5.7) check around the feature?

@erichmenge
Copy link
Contributor Author

Seems like withMemoryRebound was added in Swift 5.7. Perhaps we add a #if swift(>=5.7) check around the feature?

It took a bit of work to get a Swift 5.4 installation working but looks like everything is good now.

@yim-lee yim-lee merged commit 1da99aa into apple:main Mar 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants