Fix user exports to deal with s3 storage #3228
Conversation
- custom storages
- tar.gz within bucket using s3_tar
- slightly changes export directory structure
- major problems still outstanding re delivering s3 files to end users
Ok I think I've worked this out. Hopefully will have a fix and a cleaner PR tomorrow.
- use signed url for s3 downloads
- re-arrange tar.gz file to match original
- delete all working files after tarring
- import from s3 export

TODO:
- check local export and import
- fix error when avatar missing
- deal with multiple s3 storage options (e.g. Azure)
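With s3 storage the finished export never touches the local filesystem, so the download view has to hand the user a time-limited signed URL rather than streaming the file itself. The sketch below illustrates the signing idea with only the standard library (an HMAC over path plus expiry); real S3 presigned URLs use SigV4 and are generated for you by boto3's `generate_presigned_url`, and the `SECRET` and function names here are purely illustrative.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

# Hypothetical server-side secret; S3 presigned URLs derive this from AWS credentials.
SECRET = b"server-side-secret"

def make_signed_url(path, expires_in=900, now=None):
    """Return path?expires=...&sig=..., valid until the expiry timestamp."""
    expires = int(time.time() if now is None else now) + expires_in
    sig = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'sig': sig})}"

def verify_signed_url(path, expires, sig, now=None):
    """Reject tampered or expired links."""
    expected = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
    current = time.time() if now is None else now
    return hmac.compare_digest(expected, sig) and current < int(expires)
```

Because anyone holding the link can download the file until it expires, a short expiry window is the main privacy control.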
pulls Mouse's fix for imagefile serialization
Another approach would be to subclass. Thoughts?
As of 0.1.13, the s3-tar library uses an environment variable (`S3_ENDPOINT_URL`) to determine the AWS endpoint. See: https://github.com/xtream1101/s3-tar/blob/0.1.13/s3_tar/utils.py#L25-L29. To save BookWyrm admins from having to set it (e.g., through `.env`) when they are already setting `AWS_S3_ENDPOINT_URL`, we create a Session class that unconditionally uses that URL, and feed it to S3Tar.
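The wrapper is just a Session whose `client()` forces the endpoint. The sketch below demonstrates the override pattern with a stdlib stand-in for `boto3.Session`, so it runs without boto3 installed; in the real code the subclass extends `boto3.Session`, reads `settings.AWS_S3_ENDPOINT_URL`, and is handed to `S3Tar` (s3-tar accepts a session argument). The class name `EndpointSession` is illustrative.

```python
# Stand-in for boto3.Session so this sketch runs without boto3 installed.
class Session:
    def client(self, service_name, **kwargs):
        # boto3 would return a real client; here we just echo the arguments
        return {"service": service_name, **kwargs}

AWS_S3_ENDPOINT_URL = "https://s3.example.com"  # normally settings.AWS_S3_ENDPOINT_URL

class EndpointSession(Session):  # real code would subclass boto3.Session
    """Unconditionally use the configured endpoint for every client."""
    def client(self, service_name, **kwargs):
        kwargs["endpoint_url"] = AWS_S3_ENDPOINT_URL
        return super().client(service_name, **kwargs)

client = EndpointSession().client("s3")
```

Overriding `client()` rather than setting `S3_ENDPOINT_URL` in the environment keeps the configuration in one place for admins who already set `AWS_S3_ENDPOINT_URL`.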
also undoes a line space change in settings.py to make the PR cleaner
Subclass boto3.Session to use AWS_S3_ENDPOINT_URL
Ok almost done, I just need to disable Azure storage, which I forgot about.
@bookwyrm-social/code-review this is ready to check.
Conflicts:
  bookwyrm/models/bookwyrm_export_job.py
  requirements.txt
I've refactored the creation of the tar file a bit, by not re-using the same
`BookwyrmExportJob` is now just two tasks, which run sequentially. Tests still need to reflect this.
Creating the export JSON and export TAR are now the only two tasks.
I noticed that we are re-using the
I just realised we may need to fix the ACL of the temporary export JSON that is uploaded with the "manual" connection.
Saving avatars to `/images` is problematic because it changes the original file path from `avatars/filename` to `images/avatars/filename`. In this PR prior to this commit, imports failed because they look for a file path beginning with "avatar".
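The mismatch described above can be shown in a couple of lines; the helper name is illustrative, not BookWyrm's actual code.

```python
def exported_path(original, prefix="images"):
    """Hypothetical helper showing the effect of saving every file under images/."""
    return f"{prefix}/{original}"

# The importer matches on the "avatars/" prefix, so the re-prefixed file is never found.
moved = exported_path("avatars/mouse.png")
```

Keeping the stored path identical to the original (`avatars/filename`) avoids the problem entirely.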
@hughrun Thanks for testing and fixing my additions! Do you consider this PR ready to merge?
I do. I've tested the latest changes on s3 and local.
(updated 29 Jan (AEDT))
User exports are failing on instances using s3 storage (i.e. most) because we're treating all image files as local files.
This PR:

- uses the `S3Tar` library to combine s3-stored images and JSON data into a `tar.gz` file when using s3 storage
- uses a custom `Storage` to secure export files from public access

The main guts of the changes here are in a new class called `AddFileToTar` in the export job.

This has been manually tested with local and s3 storage and appears to work with both.
OUTSTANDING ISSUE