Valkyrie::StorageAdapter::File - insert a higher abstraction to support cloud providers? #954
Comments
@dchandekstark Do you have a reference library that handles file abstraction the way you're proposing? We've leaned on IO because it's how folks expect to handle files, and on I'm interested in StreamFiles not cleaning up, that wasn't the intention if it's true. A brief look at the code does look like that's the case though - folks would have to clear out that tmp dir occasionally.
@dchandekstark Have you looked at
@tpendragon I haven't. At this stage, I'm looking for the "simplest thing that could possibly work"(tm), but I'll definitely check it out. I actually contemplated trying to use ActiveStorage in a more or less similar way, but it felt over-engineered -- like I should just switch to using ActiveStorage instead of trying to integrate it into Valkyrie as another storage adapter.
@dchandekstark It's pretty simple. In figgy we have this: then that adapter works just like the disk one.
I'm thinking of something along the lines of the Rack spec? -- not that I'm well versed in it -- but at least the low-level notion of an enumerable. I get the
I get that, but the motivation is that I want to suddenly be able to decide "Actually I don't want to be on disk, I want to be in S3" and have that be as easy and painless a transition as possible.
That's fair. I do wish the temp file stuff in StreamFile could be addressed, but I'm not sure how without impacting
We could use
Tempfile + optional block would work I think.
@tpendragon One more consideration on
Hello! I thought I would chime in because I recently ran into a problem where I was using I ended up fixing the problem by creating a method just like @tpendragon described. It accepts a block and cleans up once the block ends. At the time I spent some time reading the My suggestion would be keep Just my two cents.
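A minimal sketch of the "Tempfile + optional block" idea discussed above, assuming a hypothetical helper named with_disk_copy (this name is not part of Valkyrie). It copies any readable IO-like object to a temp file, yields the path, and relies on Ruby's Tempfile.create to delete the file when the block ends:

```ruby
require "tempfile"
require "stringio"

# Hypothetical helper, not Valkyrie API: gives the caller a real on-disk
# path for the duration of the block, then removes the temp file.
def with_disk_copy(io, basename = "valkyrie-stream")
  # Tempfile.create picks a unique file name and, in block form,
  # closes and unlinks the file when the block returns or raises.
  Tempfile.create(basename) do |tmp|
    tmp.binmode
    IO.copy_stream(io, tmp) # works with any object responding to #read
    tmp.flush               # make sure the bytes are visible via the path
    yield tmp.path
  end
end

contents = with_disk_copy(StringIO.new("example bytes")) do |path|
  File.read(path)
end
puts contents
```

Because Tempfile.create generates the file name itself, separate threads or processes calling the helper for the same source each get their own temp file.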
That's interesting! We do this for disk files and it does fail for our cloud ones. I think you could either proxy through nginx or apache or whatever for these, or I've considered adding a
Ahhh this is great, good work @cgalarza !
I think I'm gonna look at adding a block to
@dchandekstark @cgalarza How's #955 look?
@dchandekstark Years ago I converted Figgy to cloud only as an experiment and I remember doing a thing for Downloading - pulibrary/figgy@main...figgy_in_the_cloud might be useful to you? (We've since decided to keep files local, but it was a good experiment in how flexible Valkyrie really is 😀 )
Yes. It also illustrates (in the respond_to? tests) that path is not always (easily) available.
@tpendragon Seems like a good start! I do confess a little concern that explicit temp file naming is potentially unsafe (e.g., what happens when separate threads/processes act on the same file?).
@tpendragon I feel that I should add to this thread that I appreciate what Valkyrie provides with respect to storage. It's simple, yet effective. I can see the attraction to Shrine and even ActiveStorage, but you don't always need or want that much extra functionality associated with your storage capability. With that said, I thought I would try to fit a cloud provider (in this case S3) into the Valkyrie framework more or less directly and adhere to the storage adapter contracts.
I'm all for more options! Looking forward to it!
I am working on a storage adapter for S3 (in a somewhat different fashion from https://github.com/stkenny/valkyrie-storage-s3), and I find that Valkyrie::StorageAdapter::File feels awkward in a couple of spots. I wonder whether we could insert a higher level abstraction above it to support find_by in a more "natural" way for cloud providers.

First, #disk_path IMO extends the responsibility of V::SA::F too far, and StreamFile feels too opinionated (and appears to create but not clean up temp files?). Likewise rewind, close, and read seem bound to Ruby's IO class -- nothing wrong with that as such, but if you're dealing with a cloud resource, it's not really "on disk" and it's not really an IO either. What I might suggest for a higher level abstraction is a class that minimally implements #each to yield chunks of the file data; I could also see #stream to get an IO. #size is probably fine too, fwiw.

Here's a partial illustration, where the io used to instantiate the class is an instance of Aws::S3::Object
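
The original illustration is not reproduced in this copy of the thread. As a stand-in only (this is not the poster's code), here is a minimal sketch of the kind of class described: #each yields chunks, #size reports the length, and #stream returns an IO-like object. ChunkedFile is a hypothetical name, and it assumes a source that responds to #get with a block and to #content_length, as Aws::S3::Object does. FakeS3Object is an invented in-memory substitute so the example runs without the AWS SDK.

```ruby
require "stringio"

# Hypothetical abstraction, not part of Valkyrie.
class ChunkedFile
  include Enumerable

  attr_reader :id

  # source: assumed to respond to #get (yielding string chunks to a block)
  # and #content_length, which is how Aws::S3::Object is used here.
  def initialize(id:, source:)
    @id = id
    @source = source
  end

  # Yields the file data chunk by chunk without touching local disk.
  def each(&block)
    return enum_for(:each) unless block_given?
    @source.get(&block)
    self
  end

  def size
    @source.content_length
  end

  # Convenience for callers that need an IO. This sketch buffers the
  # whole body in memory, which would not suit very large objects.
  def stream
    StringIO.new(to_a.join)
  end
end

# Invented stand-in for Aws::S3::Object, for demonstration only.
FakeS3Object = Struct.new(:body, :chunk_size) do
  def get
    body.each_char.each_slice(chunk_size) { |chars| yield chars.join }
  end

  def content_length
    body.bytesize
  end
end

file = ChunkedFile.new(id: "s3://bucket/key", source: FakeS3Object.new("hello world", 4))
file.each { |chunk| print chunk }
puts
```

Including Enumerable means callers get each_slice, map, and friends for free once #each exists, which is close to the Rack-style "body is an enumerable" idea raised in the comments.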
Interested in any and all feedback. Thanks!