GCP Support over the S3 XML API #2176
Conversation
@@ -368,9 +384,11 @@ def mem_storage() -> Generator[InMemoryStorageFixture, None, None]:
@pytest.fixture(
    scope="function",
    params=[
Do not merge the commented out fixtures
Force-pushed from b938fc1 to b798d13
void update(const s3::S3Settings& config) {
    config_ = config;
}

std::string to_string() {
This is just slightly helpful for debugging in the Python layer.
LGTM. I just have a couple of comments that don't need to be addressed in this PR:
- As far as I can see, it is still possible to connect to GCP via the `S3Storage` directly, rather than `GCPXMLStorage`. This would be a user error ofc, but an annoying one. I think that we should either:
  - have some logic that forbids this
  - add a fallback in the `S3Storage` for when the batch delete fails
- We should add some tests with a real GCP, not just a simulator
@@ -89,6 +89,12 @@ class ConfigCache {
    storage_conf.config().UnpackTo(&s3_storage);
    storages.emplace_back(create_storage(path, mode, s3::S3Settings(settings).update(s3_storage)));
},
[&storage_conf, &storages, &path, mode] (const s3::GCPXMLSettings& settings) {
    util::check(storage_conf.config().Is<arcticdb::proto::gcp_storage::Config>(), "Only support S3 native settings");
"S3" -> "GCP"
def test_gcp_no_batch_delete(gcp_storage_factory: MotoGcpS3StorageFixtureFactory):
    # Given
    s3 = gcp_storage_factory
    bucket = s3.create_fixture()
"s3" to "gcp" maybe to lessen the confusion?
    return py::make_tuple(settings.aws_auth());
},
[](py::tuple t) {
    util::check(t.size() == 1, "Invalid GCPXMLSettings pickle objects");
If I have read the code correctly, gcpxml settings will be stored in this native class, instead of in protobuf like we used to do.
If so, I think the rest of the settings in the class need to be pickled as well.
It would be nice to have a test for this in test_fork.py, along the lines sketched below.
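A minimal sketch of what such a test could look like, assuming a hypothetical `gcp_library` fixture backed by the moto GCP simulator from this PR (the fixture name, symbol, and data are illustrative, not part of the PR):

```python
import pickle

import pandas as pd


def test_gcp_library_pickle_roundtrip(gcp_library):
    # gcp_library is a hypothetical fixture yielding a library backed by the
    # moto GCP simulator added in this PR; names and data are illustrative.
    gcp_library.write("sym", pd.DataFrame({"col": [1, 2, 3]}))

    # If GCPXMLSettings only serialises aws_auth, the unpickled copy would lose
    # its endpoint/bucket/credentials and this read would fail.
    restored = pickle.loads(pickle.dumps(gcp_library))
    assert restored.read("sym").data["col"].tolist() == [1, 2, 3]
```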
package arcticc.pb2.gcp_storage_pb2;

message Config {
    string prefix = 1;
I wonder why prefix is the special kid that gets to stay in the proto, instead of living in the native config like its siblings 🤔
Because the path of a library is determined when the library is created, incorporating the creation timestamp. We have to store it in `_arctic_cfg`, or we'd never be able to find the library again.
b'</Error>'
)
start_response(
    "503 Slow Down", [("Content-Type", "text/xml"), ("Content-Length", str(len(response_body)))]
I think the HTTP code is not `503 Slow Down` for the response of a not-implemented batch delete, right?
Yeah, fixed in the Windows fixes commit
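For context, the general shape of such a fixture is a thin WSGI wrapper around moto that intercepts the S3 multi-object delete call (`POST …?delete`) and answers with an S3-style XML error, much as GCP rejects batch deletes. A minimal sketch under that assumption — the status line and error code below are placeholders, not necessarily what the final commit uses:

```python
def gcp_like_app(moto_app):
    """Wrap a moto WSGI app so that S3 multi-object delete requests fail,
    mimicking GCP's lack of support for the batch delete API.
    Sketch only: the status line and error code are placeholders."""

    def app(environ, start_response):
        is_batch_delete = (
            environ["REQUEST_METHOD"] == "POST"
            and "delete" in environ.get("QUERY_STRING", "")
        )
        if is_batch_delete:
            body = (
                b"<?xml version='1.0' encoding='UTF-8'?>"
                b"<Error><Code>NotImplemented</Code>"
                b"<Message>POST ?delete is not supported</Message></Error>"
            )
            start_response(
                "501 Not Implemented",
                [("Content-Type", "text/xml"), ("Content-Length", str(len(body)))],
            )
            return [body]
        return moto_app(environ, start_response)

    return app
```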
Thanks! Agreed on 2, we'll do that as a follow-up. I have done a lot of manual testing with real GCP though. On 1, to start with this will be a bit of a "hidden" feature and we can talk people through how to use it. I'd rather not implement any fancy "is it GCP" detection or fallbacks until we know that people get bitten by this.
LGTM apart from the pickle handling
Nice async on delete
Force-pushed from a53e89c to e3787c6
…ssue with test fixture cleanup
Force-pushed from e3787c6 to 5463f96
This adds support for GCP over the S3-compatible XML API.
GCP support did not work out of the box because Google does not support any of the S3 "batch" APIs. We used batch delete for all our deletion operations, which broke. Instead I use the unary delete operation here, using the AWS SDK's async API to reduce the performance impact of sending single object deletes.
Users connect to GCP like:
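As a rough sketch (the endpoint, bucket name, and HMAC credentials below are placeholders, and the query-string format is assumed to mirror the existing `s3://` adapter):

```python
from arcticdb import Arctic

# Placeholders throughout: substitute your own bucket and the HMAC key/secret
# generated for it in the GCP console.
ac = Arctic("gcpxml://storage.googleapis.com:my-bucket?access=MY_HMAC_KEY&secret=MY_HMAC_SECRET")
ac.create_library("my_lib")
```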
We have a URL scheme of `gcpxml` so that if we do a full GCP implementation later, users can get it with `gcp://`. I've added a GCP storage proto object so that users on the `gcpxml` API will later be able to move to a `gcp://` API without updating any library configs. The proto is tiny, just storing the library's prefix, which is the only information we need to serialize.

This is implemented as a small subclass of our existing S3 implementation, with its deletion functionality overridden. The Python library adapter creates the `NativeVersionStore` with a `GCPXMLSettings` in-memory object (in the `NativeVariantStorage`), which makes us create a `GCPXMLStorage` C++ adapter rather than the normal `S3Storage`.
The interesting part of this PR is the threading model in `do_remove_no_batching_impl`. This is called from an IO executor as part of the `RemoveTask`, so it is not safe (due to the risk of deadlocks) to submit any work to the IO executor from it. Instead:
- `s3_client_impl.cpp` submits delete operations with the S3 API. The S3 SDK has its own event loop to handle these. When they complete, they resolve a `folly::Promise`, and the client returns a `folly::Future` out of that promise.
- The calling code (in `detail-inl.hpp`) collects these futures on an inline executor. This is fine because there is no work for this executor to do, other than waiting to be notified that promises have been resolved by the AWS SDK.
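To picture the pattern without the folly and AWS SDK specifics, here is a toy Python analogue (standard library only, not the actual implementation): the "SDK" fulfils a per-key future from its own thread, and the caller merely gathers futures that are already in flight rather than scheduling new work on the IO executor.

```python
import threading
from concurrent.futures import Future, wait


class FakeSdk:
    """Stand-in for the AWS SDK: it completes requests on its own threads
    and fulfils the promise/future handed back to the caller."""

    def delete_object_async(self, key: str) -> Future:
        fut: Future = Future()  # plays the role of the folly::Promise/Future pair

        def complete():
            # network I/O would happen here in the real SDK's event loop
            fut.set_result(key)  # promise resolved by the SDK, not by our executors

        threading.Thread(target=complete, daemon=True).start()
        return fut


def remove_no_batching(sdk: FakeSdk, keys: list) -> None:
    # Runs on the IO executor in the real code, so it must not schedule new
    # work there; it only gathers futures the SDK is already driving.
    futures = [sdk.delete_object_async(k) for k in keys]
    wait(futures)  # analogous to collecting the folly futures on an inline executor


if __name__ == "__main__":
    remove_no_batching(FakeSdk(), ["key/1", "key/2", "key/3"])
```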
For testing, I created a `moto`-based test fixture that returns errors when batch delete operations are invoked. Also tested manually against a real GCP backend.

There's some duplication between the `gcpxml` and the `s3` Python adapters, but since hopefully the `gcpxml` adapter will be temporary until a full GCP implementation, I didn't see the value in refactoring it away.

Monday: 8450684276