Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance: use offscreen canvas to decode on-disk masks #5089

Merged
merged 3 commits into from
Nov 13, 2024

Conversation

sashankaryal
Copy link
Contributor

@sashankaryal sashankaryal commented Nov 12, 2024

What changes are proposed in this pull request?

This approach of using an off-screen canvas to decode mask images provides a significant performance boost (almost 2x based on benchmarking) in decoding mask paths compared to a naive javascript approach that the 3pp libraries (fast-png and jpeg-js) were using. Canvas also automatically handles edge-case scenarios in decoding images like multi-palette pngs and non-8-bit images.

Benchmarking script:
canvas-lean.zip

To start benchmarking app:

  1. Run npx http-server in root directory. This is to host images.
  2. Run yarn && yarn start.

How is this patch tested? If it is not, please explain why.

Chose to not write unit tests because canvas in vitest would have to be mocked and that defeats the purpose. Tested locally.

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release
    notes for FiftyOne users.

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other

Summary by CodeRabbit

  • New Features

    • Introduced a new asynchronous function for enhanced image processing using canvas.
    • Expanded support for typed arrays by including Uint8ClampedArray in the array types.
  • Improvements

    • Streamlined image decoding by adopting a canvas-based approach, improving clarity and maintainability.
    • Refined handling of overlay image URLs and dimensions based on new processing logic.
  • Bug Fixes

    • Enhanced error handling and processing logic for overlay images, ensuring consistent functionality.

@sashankaryal sashankaryal requested a review from a team November 12, 2024 01:38
@sashankaryal sashankaryal self-assigned this Nov 12, 2024
Copy link
Contributor

coderabbitai bot commented Nov 12, 2024

Walkthrough

The changes in this pull request introduce enhancements to the numpy.ts, canvas-decoder.ts, and index.ts files within the Looker application. A new Uint8ClampedArray type is added to the ARRAY_TYPES object and the TypedArray union in numpy.ts, expanding the range of typed arrays. Additionally, a new asynchronous function decodeWithCanvas is implemented in canvas-decoder.ts for processing image data using an offscreen canvas. The index.ts file is updated to utilize this new decoding method, refining image processing logic and improving maintainability.

Changes

File Path Change Summary
app/packages/looker/src/numpy.ts - Added Uint8ClampedArray to ARRAY_TYPES.
- Updated TypedArray type union to include Uint8ClampedArray.
app/packages/looker/src/worker/canvas-decoder.ts - Introduced decodeWithCanvas(blob: ImageBitmapSource) function for image data extraction using a canvas.
app/packages/looker/src/worker/index.ts - Replaced previous PNG and JPEG decoding methods with decodeWithCanvas.
- Updated overlay image handling and streamlined error processing.
- Modified async imputeOverlayFromPath(...), ProcessSample, and SetStream interface logic.

Suggested reviewers

  • minhtuev
  • Anddraca

🐇 In the code, a new type does hop,
Clamped arrays now join the crop.
With canvas we decode with grace,
Images dance in their new space.
So let’s cheer for changes bright,
In Looker’s code, all feels just right! 🎉


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@sashankaryal sashankaryal added the app Issues related to App features label Nov 12, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (3)
app/packages/looker/src/worker/canvas-decoder.ts (2)

23-25: LGTM with minor suggestion

The channel calculation logic is correct, but could be more explicit.

Optional improvement for clarity:

-  const numChannels =
-    overlayData.channels ??
-    overlayData.data.length / (overlayData.width * overlayData.height);
+  const pixelCount = overlayData.width * overlayData.height;
+  const numChannels = overlayData.channels ?? (overlayData.data.length / pixelCount);

27-33: Improve type safety

The return value uses type assertion which could be avoided with proper typing.

Consider this improvement:

-  return {
+  const result: OverlayMask = {
     buffer: overlayData.data.buffer,
     channels: numChannels,
     arrayType: overlayData.data.constructor.name as OverlayMask["arrayType"],
-    shape: [overlayData.height, overlayData.width],
+    shape: [overlayData.height, overlayData.width] as [number, number],
-  } as OverlayMask;
+  };
+  return result;
app/packages/looker/src/worker/index.ts (1)

144-152: Consider using the URL API for parsing and handling URLs.

The current implementation manually splits the URL string and inspects query parameters. Utilizing the built-in URL API can improve readability and robustness, especially when dealing with complex URLs.

Apply this diff to refactor the code:

-const urlTokens = overlayImageUrl.split("?");
-
-let baseUrl = overlayImageUrl;
-
-// remove query params if not local URL
-if (!urlTokens.at(1)?.startsWith("filepath=")) {
-  baseUrl = overlayImageUrl.split("?")[0];
-}
-
-const fileExtension = baseUrl.split(".").pop();
+const overlayUrl = new URL(overlayImageUrl);
+
+// Remove query params if not local URL
+if (!overlayUrl.searchParams.has("filepath")) {
+  overlayUrl.search = "";
+}
+
+const fileExtension = overlayUrl.pathname.split(".").pop();
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between dcdfc89 and c52e31a.

⛔ Files ignored due to path filters (3)
  • app/package.json is excluded by !**/*.json
  • app/packages/looker/package.json is excluded by !**/*.json
  • app/yarn.lock is excluded by !**/yarn.lock, !**/*.lock, !**/*.lock
📒 Files selected for processing (3)
  • app/packages/looker/src/numpy.ts (2 hunks)
  • app/packages/looker/src/worker/canvas-decoder.ts (1 hunks)
  • app/packages/looker/src/worker/index.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
app/packages/looker/src/numpy.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

app/packages/looker/src/worker/canvas-decoder.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

app/packages/looker/src/worker/index.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🔇 Additional comments (3)
app/packages/looker/src/worker/canvas-decoder.ts (1)

1-33: Verify performance improvements

While the implementation looks solid, the claimed 2x performance improvement should be verified with benchmarks.

Consider:

  1. Adding performance benchmarks
  2. Alternative testing approaches:
    • Integration tests with a real browser environment
    • Performance comparison tests between old and new implementations
    • Edge case testing with various image formats

Run this script to find existing benchmark/test files:

app/packages/looker/src/numpy.ts (1)

12-12: LGTM! Addition of Uint8ClampedArray aligns with canvas implementation.

The addition of Uint8ClampedArray to ARRAY_TYPES is necessary for supporting the new canvas-based mask decoding implementation. This array type is specifically designed for handling image data with automatic value clamping between 0 and 255.

app/packages/looker/src/worker/index.ts (1)

31-31: Import 'decodeWithCanvas' correctly integrated.

The import statement for decodeWithCanvas from canvas-decoder.ts is appropriately added and necessary for the new image decoding functionality.

@benjaminpkane
Copy link
Contributor

Love it!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between c52e31a and a478f5c.

📒 Files selected for processing (1)
  • app/packages/looker/src/worker/canvas-decoder.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
app/packages/looker/src/worker/canvas-decoder.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🔇 Additional comments (5)
app/packages/looker/src/worker/canvas-decoder.ts (5)

3-6: Enhance documentation and add input validation

The past review comment about adding input validation and documentation is still valid.

Let's check browser compatibility:

#!/bin/bash
# Description: Check browser compatibility for OffscreenCanvas
# Expected: Find any polyfills or fallbacks for browsers not supporting OffscreenCanvas

# Search for browser compatibility checks or polyfills
rg -g '!node_modules' -g '*.{ts,tsx,js,jsx}' 'OffscreenCanvas'

# Search for any existing fallback mechanisms
rg -g '!node_modules' -g '*.{ts,tsx,js,jsx}' 'createImageBitmap'

7-9: Add error handling and verify performance claims

The past review comment about adding error handling for bitmap creation is still valid.

Let's verify the performance claims:

#!/bin/bash
# Description: Find and analyze the previous implementation to verify 2x performance improvement
# Expected: Locate the old implementation using fast-png or jpeg-js

# Search for removed or replaced image decoding implementations
rg -g '!node_modules' -B 10 -A 10 'fast-png|jpeg-js'

# Search for any performance benchmarks or tests
rg -g '!node_modules' -g '*.{ts,tsx,js,jsx}' 'benchmark|performance.now'

11-12: Add context optimization and error handling

The past review comment about adding context type assertion and error handling is still valid.


30-36: Improve type safety and resource management

Replace type assertion with runtime type checking:

-  } as OverlayMask;
+  } satisfies OverlayMask;

Let's check for consistent resource cleanup:

#!/bin/bash
# Description: Check for consistent resource cleanup patterns
# Expected: Find similar image processing code to ensure resources are properly cleaned up

# Search for bitmap cleanup patterns
rg -g '!node_modules' -g '*.{ts,tsx,js,jsx}' 'close\(\)|dispose\(\)'

# Search for canvas cleanup patterns
rg -g '!node_modules' -g '*.{ts,tsx,js,jsx}' 'canvas\.width\s*=\s*0|canvas = null'

14-16: ⚠️ Potential issue

Add error handling and optimize memory usage

Add error handling for drawing operations and image data extraction:

-  ctx.drawImage(imageBitmap, 0, 0);
-  const imageData = ctx.getImageData(0, 0, width, height);
+  try {
+    ctx.drawImage(imageBitmap, 0, 0);
+    const imageData = ctx.getImageData(0, 0, width, height);
+  } catch (error) {
+    throw new Error(`Failed to process image: ${error.message}`);
+  }

Let's check for memory leaks:

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a478f5c and 988d345.

📒 Files selected for processing (1)
  • app/packages/looker/src/worker/index.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
app/packages/looker/src/worker/index.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🔇 Additional comments (3)
app/packages/looker/src/worker/index.ts (3)

31-31: LGTM: Clean import of canvas decoder

The import of decodeWithCanvas aligns with the PR's performance improvement objectives.


162-170: LGTM: Comprehensive MIME type handling

Good implementation with:

  • Support for multiple image formats
  • Safe fallback to 'application/octet-stream'
  • Clean mapping structure

171-174: ⚠️ Potential issue

Add error handling for canvas decoding operations

The canvas decoding implementation lacks error handling for potential failures.

Additionally, verify the buffer size calculation:

#!/bin/bash
# Check for potential buffer overflow issues in image processing
ast-grep --pattern 'new ArrayBuffer($size)' 

Also applies to: 179-179

* Decodes a given image source into an OverlayMask using an OffscreenCanvas
*/
export const decodeWithCanvas = async (blob: ImageBitmapSource) => {
const imageBitmap = await createImageBitmap(blob);
Copy link
Contributor

@benjaminpkane benjaminpkane Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could use ImageBitmap in the Overlay classes instead of ImageData. It seems like that could be a performance improvement.

Just thinking out loud. Nothing actionable in this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea. Let's explore in another thread.

Copy link
Contributor

@benjaminpkane benjaminpkane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app Issues related to App features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants