Add --enable-unsafe-fp-math flag when -O3 and NNPA is used #2963

Merged 2 commits on Oct 2, 2024

chentong319

Since NNPA uses less precise data, it is reasonable to enable aggressive floating-point optimization.
The flag is added only at -O3, leaving -O0 for debugging.

AlexandreEichenberger left a comment

LGTM, will check if it makes a difference for the test that I was looking at.

If it makes a difference, I almost wonder if we should not always turn it on at O3.

// Enable aggressive optimization for NNPA with -O3
if (OptimizationLevel == OptLevel::O3 &&
    getTargetAccel().find("NNPA") != std::string::npos &&
    getLLVMOption().find("enable-unsafe-fp-math") == std::string::npos) {

I see, the idea is that you just check whether that option is already there. If it is, either it's true and there is no need to add it again, or it's false and then we don't add it either. Clever.
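
For readers following the thread, the check discussed above can be sketched roughly as follows. This is a minimal standalone illustration, not the code from the PR: `addUnsafeFpMath`, its parameters, and the local `OptLevel` enum are hypothetical stand-ins for the compiler's own option state, and the exact flag spelling appended here is an assumption based on the PR title.

```cpp
#include <string>

// Hypothetical stand-in for the compiler's optimization level enum.
enum class OptLevel { O0, O1, O2, O3 };

// Hypothetical helper: returns the LLVM option string to use.
// The flag is appended only when -O3 and NNPA are both selected and the
// user has not already mentioned the option, with either value.
std::string addUnsafeFpMath(OptLevel level, const std::string &accel,
                            const std::string &llvmOptions) {
  const std::string name = "enable-unsafe-fp-math";
  const bool eligible = level == OptLevel::O3 &&
                        accel.find("NNPA") != std::string::npos &&
                        llvmOptions.find(name) == std::string::npos;
  if (!eligible)
    return llvmOptions; // leave the user's choice (or other levels) untouched
  // Separate the new flag from existing options with a space when needed.
  return llvmOptions.empty() ? "--" + name : llvmOptions + " --" + name;
}
```

With this sketch, `addUnsafeFpMath(OptLevel::O3, "NNPA", "")` yields `--enable-unsafe-fp-math`, while an input that already contains `--enable-unsafe-fp-math=false` comes back unchanged, which is the behavior the comment above describes.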

chentong319 merged commit ccf9552 into onnx:main on Oct 2, 2024
7 checks passed
chentong319 deleted the fp-flag branch on October 2, 2024 17:17
@jenkins-droid

Jenkins Linux amd64 Build #15754 [push] implement (#2963) Signe... started at 12:18

@jenkins-droid

Jenkins Linux s390x Build #15757 [push] implement (#2963) Signe... started at 13:18

@jenkins-droid

Jenkins Linux ppc64le Build #14784 [push] implement (#2963) Signe... started at 13:30

@jenkins-droid

Jenkins Linux amd64 Build #15754 [push] implement (#2963) Signe... passed after 1 hr 24 min

@jenkins-droid

Jenkins Linux s390x Build #15757 [push] implement (#2963) Signe... passed after 2 hr 1 min

@jenkins-droid

Jenkins Linux ppc64le Build #14784 [push] implement (#2963) Signe... passed after 2 hr 35 min
