Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/contour speed optimizations #634

Merged
merged 8 commits into from
May 26, 2023

Conversation

dcaustin33
Copy link
Contributor

Optimizations to write contours all locally and then upload in one file at the end because of colab's speed. For 50 examples this takes us from 60 seconds down to 30 seconds in colab.

@dcaustin33 dcaustin33 requested a review from a team as a code owner May 26, 2023 15:56
@Ben-Epstein
Copy link
Contributor

@dcaustin33 @elboy3 how will retrieval work on the backend with a single file? Are you going to need to download it every time?

Or in runners, will you split this up into individual files?

@codecov-commenter
Copy link

codecov-commenter commented May 26, 2023

Codecov Report

Merging #634 (1f527ae) into main (38db557) will decrease coverage by 0.21%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main     #634      +/-   ##
==========================================
- Coverage   84.42%   84.21%   -0.21%     
==========================================
  Files         165      165              
  Lines       12973    13005      +32     
==========================================
  Hits        10952    10952              
- Misses       2021     2053      +32     
Impacted Files Coverage Δ
...ity/integrations/cv/torch/semantic_segmentation.py 0.00% <0.00%> (ø)
...lity/loggers/model_logger/semantic_segmentation.py 0.00% <0.00%> (ø)
...ataquality/utils/semantic_segmentation/polygons.py 0.00% <0.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@elboy3
Copy link
Contributor

elboy3 commented May 26, 2023

@dcaustin33 @elboy3 how will retrieval work on the backend with a single file? Are you going to need to download it every time?

Or in runners, will you split this up into individual files?

@Ben-Epstein my thought was to keep it as one. We only need to retrieve it for the /rows endpoint (nothing else). I was considering adding it to the cache as well if the file is large and it takes some time to download. wdyt?

@Ben-Epstein
Copy link
Contributor

is it currently a single file, or would this change it into a single file from multiple individual files? I would imagine that a single contours file would be pretty large right?

@dcaustin33
Copy link
Contributor Author

is it currently a single file, or would this change it into a single file from multiple individual files? I would imagine that a single contours file would be pretty large right?

@Ben-Epstein for 200 examples the size of the file rn is 1.4 mb

Copy link
Contributor

@Ben-Epstein Ben-Epstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense on the dq side, i would watch the performance on the API and consider breaking its into individual polygon JSONs (or testing that behavior alongside) if the performance suffers

Comment on lines 160 to 162
if not os.path.isdir(prefix):
with lock:
os.makedirs(prefix, exist_ok=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just do os.makedirs(prefix, exist_ok=True)

or preferably can we just create this folder at the beginning of a run and then not have to do any checks here?

Comment on lines 160 to 161
with lock:
os.makedirs(prefix, exist_ok=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need the lock?

@dcaustin33 dcaustin33 merged commit cfc052e into main May 26, 2023
@dcaustin33 dcaustin33 deleted the feature/contour_speed_optimizations branch May 26, 2023 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants