Prioritize using imagesize library to get image size for ImageFromFile #1259
Co-authored-by: Vinnam Kim <vinnam.kim@gmail.com>
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

Coverage Diff

| | develop | #1259 | +/- |
|---|---|---|---|
| Coverage | 80.54% | 80.51% | -0.04% |
| Files | 271 | 270 | -1 |
| Lines | 30438 | 30426 | -12 |
| Branches | 5930 | 5931 | +1 |
| Hits | 24517 | 24498 | -19 |
| Misses | 4532 | 4535 | +3 |
| Partials | 1389 | 1393 | +4 |
LGTM
Summary
Accelerate loading of image file-based datasets.
I found that printing out the YOLO dataset information for the first time was slow. After some digging, I found that `datumaro` was reading through the entire dataset to get the size of each image.

Interactive encoding with datasets on an HDD is slow. So I added an override of the `size()` property in the `ImageFromFile` class, which first tries to get the image size using `PIL`. The `PIL` library is about 8 times faster than `OpenCV` at getting the image size.

All dataset classes that use the `size` property of `ImageFromFile` can benefit from this modification.

How to test
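The "try a cheap size lookup first, fall back to a full decode" pattern from the summary can be illustrated and checked with a small standalone sketch. This is not the code from this PR: `FileImageSketch`, `png_size_from_header` and the injected `full_decode` callable are hypothetical names, and the header reader is a stdlib-only stand-in (PNG only) for what a library such as `PIL` or `imagesize` does. The `(width, height)` tuple order is also an assumption made for the illustration.

```python
import struct
from typing import Callable, Optional, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size_from_header(path: str) -> Optional[Tuple[int, int]]:
    """Return (width, height) read from the PNG IHDR chunk, or None if the
    file does not start with a PNG header. Only 24 bytes are read."""
    with open(path, "rb") as f:
        # signature (8) + chunk length (4) + chunk type (4) + width (4) + height (4)
        head = f.read(24)
    if len(head) < 24 or head[:8] != PNG_SIGNATURE or head[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", head[16:24])
    return width, height


class FileImageSketch:
    """Hypothetical stand-in for a file-backed image with a lazy `size`."""

    def __init__(self, path: str, full_decode: Callable[[str], Tuple[int, int]]):
        self._path = path
        # Assumed slow path: decodes the whole image and returns its size.
        self._full_decode = full_decode
        self._size: Optional[Tuple[int, int]] = None

    @property
    def size(self) -> Tuple[int, int]:
        if self._size is None:
            # Fast path: header-only read, no pixel data is decoded.
            self._size = png_size_from_header(self._path)
            if self._size is None:
                # Fallback: the header reader did not recognize the file.
                self._size = self._full_decode(self._path)
        return self._size
```

Passing a `full_decode` that records its calls makes it easy to confirm that the slow path runs only for files the header reader cannot handle, and that the result is cached after the first access.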