-
-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ Page dimensions analysis task #349
✨ Page dimensions analysis task #349
Conversation
@JMicheli The compression scheme you outlined is simple and smart! I'm excited to give this a review. I'll try to make time after work sometime this week, but it might land on the weekend. As always, thanks for your time and contributions - I really appreciate it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding some comments re: concerns I had while writing this code.
Change into_image_format into a TryFrom<ContentType> for image::ImageFormat impl; Add doc comment to explain page_dimension model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few really low priority comments/questions but this looks great!
I'll give it another review for my approval stamp once the conflicts are resolved ✅
Edit to add that I meant to call out all of your comments throughout the code. They are very helpful during review and will be beneficial in the long run. I appreciate it!
Co-authored-by: Aaron Leopold <36278431+aaronleopold@users.noreply.github.com>
Co-authored-by: Aaron Leopold <36278431+aaronleopold@users.noreply.github.com>
I'm working on getting things ready to merge - as is tradition, this part is often the most difficult for me. |
No rush on my part! Reach out if you need anything, but I would guess all you need to do is update your feature branch with this repo's experimental branch, either directly if you have a remote set up or just update your fork's experimental and then merge that in |
Alright I think that should do it - you were right, just needed a little |
I fixed a couple of clippy lints and a missing |
This pull request aims to close #180.
I've added a new type of task to the media analysis job created in #307,
AnalyzePageDimensions
. This task attempts to determine the dimensions of each page in a media item and then stores its findings to the database so that they can be accessed through the server's API.The database structure for storing the dimensions data is shown below:
An individual height, width pair is represented in Rust by
PageDimension
.One of the concerns discussed when planning this addition was the possibility that a very long book would have a very long list of dimensions, thus necessitating the separate table to store dimensions, making them optional when loading metadata. But for this same reason, the string encoding a vector of
PageDimensions
could potentially be very large.To avoid database bloat for users with large libraries/long books, I've implemented a simple compression scheme that serializes
Vec<PageDimension>
as follows:I think that this should be effective since many books should have uniform page sizes (excepting the cover and the occasional two-page spread). Thus, many books won't actually need to store that much data in the database. The deserialization function I wrote is copy-free and fast, I think it should not introduce any unwanted overhead.
Finally, the API matches the first option discussed on the issue:
curl -X GET http://stump.cloud/api/v1/media/:id/page/:page/dimensions
gives a single dimension for a single page.curl -X GET http://stump.cloud/api/v1/media/:id/dimensions
gives a list of dimensions for each page (0-indexed).