add config option for ipex (https://github.com/min-jean-cho/serve/blob/ipex_enable/examples/IPEX/README.md) #1319
Conversation
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
thanks @min-jean-cho, an example config could be:

Next steps:
- @min-jean-cho to pass CI
ts/torch_handler/base_handler.py
Outdated

```diff
-        return torch.as_tensor(data, device=self.device)
+        t = torch.as_tensor(data, device=self.device)
+        if ipex_enabled and t.dim() == 4:
+            t = t.to(memory_format=torch.channels_last)
```
Suggest not doing this here, for two reasons:
- The input will be converted to channels last automatically inside the op, as long as the model itself is converted with model.to(memory_format=torch.channels_last) (https://github.com/pytorch/serve/pull/1319/files#diff-d30e1f5ef9fd05e2ab9f652c116461632586cc132dde5d23ac4bc5cd19c799ccR85).
- The channels last conversion only benefits CNN models, and we can't tell whether the model is a CNN just from its input shapes. Moreover, CNN models can also take 3D or 5D inputs, so checking for 4D sounds too specific to me.
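The first point can be checked with a small standalone sketch (not the handler code): converting only the module with `.to(memory_format=torch.channels_last)` is enough for a conv to produce channels-last output, even when the input is plain contiguous NCHW.

```python
import torch
import torch.nn as nn

# Convert only the model, not the input, to channels_last.
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1).to(memory_format=torch.channels_last)

x = torch.randn(1, 3, 32, 32)  # ordinary contiguous NCHW input
out = conv(x)

# The 4D weight was rewritten to channels_last, and the conv propagates
# that layout to its output, so no manual input conversion is needed.
print(conv.weight.is_contiguous(memory_format=torch.channels_last))  # True
print(out.is_contiguous(memory_format=torch.channels_last))          # True
```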
I definitely agree we can't assume CNN based on the input dimension being 4D (e.g., CNN for videos can take 5D inputs).
```diff
@@ -67,6 +67,9 @@
     private static final String TS_NETTY_CLIENT_THREADS = "netty_client_threads";
     private static final String TS_JOB_QUEUE_SIZE = "job_queue_size";
     private static final String TS_NUMBER_OF_GPU = "number_of_gpu";
+    private static final String TS_CPU_LAUNCHER_ENABLE = "cpu_launcher_enable";
```
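For context, this constant reads a key from config.properties. A sketch of what enabling the feature might look like (the `ipex_enable` key name is an assumption based on this PR's purpose; `cpu_launcher_enable` matches the constant above):

```properties
ipex_enable=true
cpu_launcher_enable=true
```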
Can we add a comment on top of the IPEX args, like `// ipex arguments`?
Of course, will do so.
```diff
@@ -73,6 +81,9 @@ def initialize(self, context):
             self.model = self._load_torchscript_model(model_pt_path)

+        self.model.eval()
+        if ipex_enabled:
+            self.model = self.model.to(memory_format=torch.channels_last)
```
Does this apply only to vision models?
Yes, but it is benign for non-CNN models.
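A quick sketch of why this is benign: `nn.Module.to(memory_format=...)` only rewrites the layout of 4D (and 5D) parameters, so a module without conv-style weights passes through unchanged.

```python
import torch
import torch.nn as nn

lin = nn.Linear(4, 4)
lin = lin.to(memory_format=torch.channels_last)  # no 4D params, so this is a no-op

# The 2D weight keeps its ordinary row-major layout.
print(lin.weight.is_contiguous())  # True
```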
ts/torch_handler/base_handler.py
Outdated

```diff
@@ -73,6 +81,9 @@ def initialize(self, context):
             self.model = self._load_torchscript_model(model_pt_path)

+        self.model.eval()
+        if ipex_enabled:
+            self.model = self.model.to(memory_format=torch.channels_last)
+            self.model = ipex.optimize(self.model, dtype=torch.float32, level='O1')
```
Is there a document that describes what the ipex levels mean?
Also, what happens here if a user passes in a model that's already in float16 or bfloat16?
> is there a document that describes what the ipex levels mean?

Yes, there will be a user guide as part of the IPEX 1.10 release. But here, I would suggest we use the default "level" recommended by IPEX rather than hard-coding 'O1'.

> Also what happens here if a user passes in a model that's already in float16 or bfloat16?

@min-jean-cho please remove the `dtype` arg. @msaroufim The default behavior without specifying `dtype` is to do nothing about the data type conversion of the model. If the user specifies a low-precision data type (e.g., bfloat16 or float16; note that float16 is not supported now) different from the low-precision data type the model has, IPEX will do nothing about the data type conversion and will issue a warning message. The behavior will be clarified in the user doc too.

@min-jean-cho Suggest changing this line to self.model = ipex.optimize(self.model)
@jgong5 it would also be great to clarify the different optimization levels in the doc as well.
@HamidShojanazeri Yes, this is in the plan. Some initial draft below; comments welcome:

```
level (string): "O0" or "O1". No optimizations are applied with "O0".
The optimizer function just returns the original model and optimizer.
With "O1", the following optimizations are applied: conv+bn folding,
weights prepack, dropout removal (inference model), master weight split
and fused optimizer update step (training model). The optimization
options can be further overridden by setting the following options
explicitly. The default value is "O1".
```
`self.model = ipex.optimize(self.model, dtype=torch.float32, level='O1')` also only works for fp32, so `if ipex_enabled:` is an insufficient condition to check.
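One way to address this concern (a hypothetical guard, not part of the PR) is to skip the fp32-specific path when the model's parameters are not actually fp32:

```python
import torch
import torch.nn as nn

def ipex_fp32_applicable(model, ipex_enabled):
    """Hypothetical check: only take the fp32 IPEX path for fp32 models."""
    param = next(model.parameters(), None)
    return ipex_enabled and param is not None and param.dtype == torch.float32

model = nn.Linear(2, 2)
print(ipex_fp32_applicable(model, True))         # True for a default fp32 model
print(ipex_fp32_applicable(model.half(), True))  # False once the model is float16
```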
@min-jean-cho may I know the specific scenario you are concerned with if you change the code to self.model = ipex.optimize(self.model)?
@jgong5 You're right. I'll integrate your two suggestions regarding the `dtype` arg and `self.model = ipex.optimize(self.model)` - that would work.
> @jgong5 You're right. I'll integrate your two suggestions regarding the `dtype` arg and `self.model = ipex.optimize(self.model)` - that would work.

Wouldn't we still need the level?
We don't need to explicitly set `level='O1'` since it is the default value, and we are optimizing the model whenever ipex is enabled.
frontend/server/src/main/java/org/pytorch/serve/util/ConfigManager.java
Outdated
```diff
@@ -73,6 +81,9 @@ def initialize(self, context):
             self.model = self._load_torchscript_model(model_pt_path)

+        self.model.eval()
+        if ipex_enabled:
+            self.model = self.model.to(memory_format=torch.channels_last)
+            self.model = ipex.optimize(self.model)
```
@min-jean-cho does it choose optimization level 'O1' by default?
Yes, ipex will choose level 'O1' by default.
[Benchmark results attached: rn50_pytorch_fp32_bs1, rn50_ipex_fp32_bs1, rn50_ipex_int8_bs1]
I just tried running a BERT benchmark with and without IPEX, and the improvements are clear: a 30% improvement on an m6i.4x with 4 workers, without the launcher. I'm running a couple more, but I think it'd be safe to go ahead and get this merged in. Longer-form benchmarks here.
ts/torch_handler/base_handler.py
Outdated

```python
try:
    import intel_extension_for_pytorch as ipex
    ipex_enabled = True
except:
    pass
```
If users are saying they want to use ipex but it's not available, then it makes more sense to print a hint like `print("pip install ipex")` instead of `pass`; otherwise users will still get a failure later.
Agree, we can log a message saying ipex was not installed and hence not used.
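The guarded import with a log message (as suggested above) could look like this; the `logger` setup is an assumption, not the PR's actual code:

```python
import logging

logger = logging.getLogger(__name__)

ipex_enabled = False
try:
    import intel_extension_for_pytorch as ipex
    ipex_enabled = True
except ImportError:
    # Tell the user why IPEX optimizations are skipped instead of failing later.
    logger.warning(
        "intel_extension_for_pytorch is not installed; IPEX optimizations "
        "are disabled. Install it with: pip install intel_extension_for_pytorch"
    )
```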
[Benchmark attached: IPEX release/1.1 + int8 patch, with launcher, JIT, inference-only time (s)]
This PR will replace https://github.com/pytorch/serve/pull/1312/files