-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apply integer math narrowing before VFPU sin/cos #14406
Conversation
Just to use a common union.
This makes the results much more accurate to the PSP's results. Could narrow a bit further swapping sin/cos/neg, which might be what the hardware does given vrot.
It still gets these off from zero, so let's just special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nits, feel free to address or skip.
Core/MIPS/MIPSVFPUUtils.cpp
Outdated
@@ -946,34 +936,182 @@ void vfpu_sincos_single(float angle, float &sine, float &cosine) { | |||
} | |||
} | |||
|
|||
float vfpu_sin_double(float angle) { | |||
return (float)sin((double)angle * M_PI_2); | |||
float vfpu_sin_mod2(float a) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't mod4 be a more accurate name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First step is mod by 4 (since it has a repeating pattern by 4), but then it mods by 2 right below that (which may negate the result or negate the input.) Ultimately, as in the other note, I may want to mod by 1 and swap sin/cos (which I was already doing in my CORDIC test code), but that part adds unnecessary extra instructions without much accuracy benefit at this point.
-[Unknown]
Core/MIPS/MIPSVFPUUtils.cpp
Outdated
// This subtracts off the 2. If we do, flip sign to inverse the wave. | ||
if (k == 0x80 && mantissa >= (1 << 23)) { | ||
val.i ^= 0x80000000; | ||
mantissa -= 1 << 23; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we just add a phase shift here for one of [sin, cos], and share the rest of the code between them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I almost had it shared, and still waffling on it. But the NAN handling is a bit different and I was worried about the -0 cases making it messy.
-[Unknown]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely can swap sin/cos and negative though (and have to for CORDIC), I was just trying to minimize perf impact here.
-[Unknown]
Tested Reborn BA2. Tested multiple characters and multiplayer same machine. Tested broken kick animation from #12900 It is DONE! The game is 100%. Everything is working. Did some direct tests with hardware as well just to be sure but yes, this seems great! |
Awesome, thanks for testing! |
I lost my password , "Verify your account" to send new password to email
did not work
I tried 2 times to get "number 14" from 9 pictures of 15 times.... but
still fail
… |
@sum2012 did you post in the wrong thread? That doesn't seem relevant here. If you're having trouble logging into github, well, how did you post that? :) |
Hajime no Ippo work fine with this. |
New implementation should work for both cases.
Alright. Let's merge this and see how it goes! |
I'm starting to think the raspberry pi has a completely broken sin() / cos() implementation or something, because there are strange reports of crazy brokenness that only seem to be coming from raspberry pi users and seem new since this was merged. Might try forcing rpi to use sinf/cosf/sincosf, but haven't confirmed this is the problem... -[Unknown] |
Huh, sounds strange :( We could also maybe just replace sin/cos entirely with an cubic-interpolated lookup table after range reduction, we should be able to get close enough... |
Seems like it is not that, per #14456... -[Unknown] |
There's a speed penalty for this but it's not that significant (about 30% of the sin/cos instruction time, looks like.)
This seems to do a better job of narrowing than fmod/fmodf and gives results that match the PSP much better. It also handles some NAN/INF/zero cases carefully. I had to special case +1/-1 on cosine though, to make sure it's properly calculated.
Could use some help testing if this has negative effects. I don't have Cho Aniki Zero (@sum2012), Hajime no Ippo (@Saramagrean), or Hitman Reborn (@somepunkid). I'm hoping all three games still work with this. You can test by going to the Checks tab of this pull request and downloading a build from the Artifacts button.
FF3 does still work fine from my tests. It does not help Ridge Racer of course.
If it works well, I'll probably remove the old single float code entirely.
I might still try to figure out more exact calculation (still suspect it uses CORDIC from the vrot instruction), but it's much closer with this change.
Fixes #12900.
-[Unknown]