Support for macos_arm64 platform #664
That's awesome, thanks @simonmaurer! Since it's not currently possible for our CI to test
@AdamHillier oh absolutely, my pleasure. the numbers are just crazy ;)
This is great!
updated the PR with benchmark results
🚀
tensorflow/tensorflow#47639 (comment) and tensorflow/tensorflow#47639 (comment) mention that TFLite can make use of the Accelerate framework on macOS.
yes, totally agree. according to these comments and evaluations this can further improve latency (even more than xnnpack) depending on the model. could we control this somehow via an additional cmdline parameter when executing the lce_benchmark_model binary? if not this can just be activated with the mentioned build flags
This seems to be a build flag, so I don't think we can have a command-line parameter to toggle this behaviour in the benchmark binary.
What do these changes do?
With the latest LCE release v0.6 the TensorFlow dependency has been upgraded to 2.5.0, including relevant changes to support compilation for the macos_arm64 platform using the recent Apple M1 ARM processor.
This also included upstream dependencies on XNNPACK and pthreadpool as needed by the LCE (lce_benchmark_model/lce_minimal) build process.
Starting with the latest bazel versions for arm64, this small PR (given all upstream changes in TF) allows building LCE for Apple M1.
Feel free to check out the discussions here: XNNPACK/pthreadpool/TF pip package/TFLite
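For anyone wanting to confirm they are actually on the macos_arm64 target before trying a build, a minimal sketch in Python could look like the following. The helper name `is_macos_arm64` is hypothetical and not part of LCE or TensorFlow; it only relies on the standard-library `platform` module. Note that an x86_64 Python running under Rosetta 2 will typically report `x86_64`, so the check reflects the interpreter's view of the machine.

```python
import platform


def is_macos_arm64(system=None, machine=None):
    """Return True for macOS on an arm64 CPU (e.g. Apple M1).

    Hypothetical helper for illustration only. `system` and `machine`
    default to the values reported by the running interpreter.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # macOS reports "Darwin"; Apple Silicon usually reports "arm64".
    return system == "Darwin" and machine.lower() in ("arm64", "aarch64")


if __name__ == "__main__":
    print("macos_arm64 host:", is_macos_arm64())
```

The optional arguments exist only so the check can be exercised without an M1 machine at hand.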
How Has This Been Tested?
or with a more recent version of bazel (as of bazel@492829)
Benchmark Results
The table below presents single-/multi-threaded performance of Larq Compute Engine v0.6 on different versions of QuickNet (trained on the ImageNet dataset, released on Larq Zoo) on an Apple Mac mini 2020 (M1):
Related issue number
#604