Apply integer math narrowing before VFPU sin/cos #14406

Merged
merged 7 commits into from
Apr 25, 2021

Collaborator

@unknownbrackets unknownbrackets commented Apr 24, 2021

There's a speed penalty for this, but it's not that significant (it looks like about 30% of the sin/cos instruction time).

This seems to do a better job of narrowing than fmod/fmodf and gives results that match the PSP much better. It also handles some NAN/INF/zero cases carefully. I had to special-case +1/-1 on cosine though, to make sure it's properly calculated.

Could use some help testing if this has negative effects. I don't have Cho Aniki Zero (@sum2012), Hajime no Ippo (@Saramagrean), or Hitman Reborn (@somepunkid). I'm hoping all three games still work with this. You can test by going to the Checks tab of this pull request and downloading a build from the Artifacts button.

FF3 does still work fine from my tests. It does not help Ridge Racer of course.

If it works well, I'll probably remove the old single float code entirely.

I might still try to figure out a more exact calculation (I still suspect it uses CORDIC from the vrot instruction), but it's much closer with this change.

Fixes #12900.

-[Unknown]
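
For readers unfamiliar with the idea, the following is a minimal sketch of what "integer math narrowing" can look like. It is not the code from this pull request. It assumes the VFPU convention implied by the `angle * M_PI_2` in the diff (angles are measured in quarter turns, so the wave repeats every 4) and reduces the input modulo 4 by masking mantissa bits instead of calling fmod/fmodf. The names `reduce_mod4`, `sin_quarter_turns` and `kHalfPi` are hypothetical, and NAN/INF are simply passed through rather than handled the way the PSP does.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

namespace narrowing_sketch {

// Hypothetical constant; stands in for M_PI_2, which is not available everywhere.
constexpr double kHalfPi = 1.57079632679489661923;

// Reduces a float modulo 4 using integer operations on its bit pattern.
// NaN and infinity are returned unchanged for the caller to deal with.
inline float reduce_mod4(float a) {
	uint32_t bits;
	std::memcpy(&bits, &a, sizeof(bits));

	const uint32_t sign = bits & 0x80000000u;
	const int exponent = (int)((bits >> 23) & 0xFF);

	if (exponent == 255)
		return a;  // NaN or infinity.
	if (exponent <= 128)
		return a;  // |a| < 4 already (this includes zero and denormals).

	float result = 0.0f;
	if (exponent < 152) {
		// |a| == mantissa * 2^(exponent - 150), with the implicit bit restored.
		const uint32_t mantissa = (bits & 0x007FFFFFu) | 0x00800000u;
		// Only the low (152 - exponent) mantissa bits survive a modulus by 4.
		const uint32_t kept = mantissa & ((1u << (152 - exponent)) - 1u);
		// kept < 2^24, so the conversion and the scaling are both exact.
		result = std::ldexp((float)kept, exponent - 150);
	}
	// For exponent >= 152 the value is a multiple of 4, so result stays zero.
	return sign ? -result : result;
}

// Sine of an angle given in quarter turns, after exact narrowing.
inline float sin_quarter_turns(float a) {
	const float r = reduce_mod4(a);
	return (float)std::sin((double)r * kHalfPi);
}

}  // namespace narrowing_sketch
```

With this sketch, `reduce_mod4(6.5f)` yields `2.5f` and `reduce_mod4(16777218.0f)` yields `2.0f`, and no rounding happens before the value reaches `sin`.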

Just to use a common union.
This makes the results much more accurate to the PSP's results.
Could narrow a bit further swapping sin/cos/neg, which might be what the
hardware does given vrot.
It still gets these off from zero, so let's just special case.
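
As an illustration of why such a special case can be needed: the double closest to pi/2 is not exactly pi/2, so `cos(1.0 * M_PI_2)` typically comes back as a tiny nonzero value instead of zero. The sketch below is an assumption-laden stand-in, not the merged code. It uses `std::fmod` for the reduction, the hypothetical names `cos_quarter_turns` and `kHalfPi`, and returns positive zero at odd inputs because the thread does not say which sign the hardware produces.

```cpp
#include <cmath>

namespace cos_sketch {

// Hypothetical constant; stands in for M_PI_2.
constexpr double kHalfPi = 1.57079632679489661923;

// Cosine of an angle given in quarter turns, with exact zeros at odd inputs.
inline float cos_quarter_turns(float a) {
	// cos is even, so the sign of the input can be dropped.
	// std::fmod is a stand-in here; the PR narrows with integer math instead.
	float r = std::fmod(std::fabs(a), 4.0f);

	// cos(x + 2 quarter turns) == -cos(x), so fold [2, 4) onto [0, 2).
	bool negate = false;
	if (r >= 2.0f) {
		r -= 2.0f;
		negate = true;
	}

	// Special case: one quarter turn is exactly a zero crossing.
	// Assumption: positive zero is returned regardless of direction.
	if (r == 1.0f)
		return 0.0f;

	const float c = (float)std::cos((double)r * kHalfPi);
	return negate ? -c : c;
}

}  // namespace cos_sketch
```

Because of the fold, inputs such as `3.0f` and `-1.0f` hit the same special case as `1.0f`.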
Owner

@hrydgard hrydgard left a comment

Just nits, feel free to address or skip.

@@ -946,34 +936,182 @@ void vfpu_sincos_single(float angle, float &sine, float &cosine) {
 	}
 }
 
-float vfpu_sin_double(float angle) {
-	return (float)sin((double)angle * M_PI_2);
+float vfpu_sin_mod2(float a) {
Owner

wouldn't mod4 be a more accurate name?

Collaborator Author

First step is mod by 4 (since it has a repeating pattern by 4), but then it mods by 2 right below that (which may negate the result or negate the input.) Ultimately, as in the other note, I may want to mod by 1 and swap sin/cos (which I was already doing in my CORDIC test code), but that part adds unnecessary extra instructions without much accuracy benefit at this point.

-[Unknown]
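
The two-step reduction described in this reply can be sketched as follows. This is an illustration only, assuming angles in quarter turns: it uses `std::fmod` as a stand-in for the integer narrowing in the PR, and it always negates the result rather than the input. The names `sin_quarter_turns_folded` and `kHalfPi` are hypothetical.

```cpp
#include <cmath>

namespace fold_sketch {

// Hypothetical constant; stands in for M_PI_2.
constexpr double kHalfPi = 1.57079632679489661923;

// Sine of an angle given in quarter turns: mod 4 first, then mod 2 with a sign flip.
inline float sin_quarter_turns_folded(float a) {
	// Step 1: the wave repeats every 4, so reduce modulo 4.
	// std::fmod keeps the sign of a; it is a stand-in for the PR's bit math.
	float r = std::fmod(a, 4.0f);
	bool negate = false;

	// sin is odd: drop the sign of the input and remember to negate the result.
	if (r < 0.0f) {
		r = -r;
		negate = !negate;
	}

	// Step 2: sin(x + 2 quarter turns) == -sin(x), so subtracting 2 flips the sign.
	// The subtraction is exact for r in [2, 4).
	if (r >= 2.0f) {
		r -= 2.0f;
		negate = !negate;
	}

	const float s = (float)std::sin((double)r * kHalfPi);
	return negate ? -s : s;
}

}  // namespace fold_sketch
```

After both steps the argument passed to `sin` is always in [0, 2), which is the narrowing the function name `mod2` refers to in this sketch's reading of the reply.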

Comment on lines 961 to 965
	// This subtracts off the 2. If we do, flip sign to inverse the wave.
	if (k == 0x80 && mantissa >= (1 << 23)) {
		val.i ^= 0x80000000;
		mantissa -= 1 << 23;
	}
Owner

Couldn't we just add a phase shift here for one of [sin, cos], and share the rest of the code between them?

Collaborator Author

I almost had it shared, and I'm still waffling on it. But the NAN handling is a bit different, and I was worried about the -0 cases making it messy.

-[Unknown]

Collaborator Author

We definitely can swap sin/cos and negative though (and have to for CORDIC), I was just trying to minimize perf impact here.

-[Unknown]
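
The sin/cos swap mentioned here is the usual quadrant trick, and a minimal sketch of it follows. It is not code from this pull request. It assumes angles in quarter turns, uses `std::fmod` as a stand-in for the integer narrowing, and returns plain NaN for non-finite inputs instead of reproducing the PSP's NAN/INF behavior. The names `SinCos`, `sincos_quarter_turns` and `kHalfPi` are hypothetical.

```cpp
#include <cmath>
#include <limits>

namespace quadrant_sketch {

// Hypothetical constant; stands in for M_PI_2.
constexpr double kHalfPi = 1.57079632679489661923;

struct SinCos {
	float sine;
	float cosine;
};

// Narrows to a fraction in [0, 1) plus a quadrant, then swaps/negates sin and cos.
inline SinCos sincos_quarter_turns(float a) {
	if (!std::isfinite(a)) {
		// Assumption: hardware-specific NAN/INF results are not modeled here.
		const float nan = std::numeric_limits<float>::quiet_NaN();
		return {nan, nan};
	}

	const float r = std::fmod(std::fabs(a), 4.0f);  // in [0, 4)
	const int quadrant = (int)r;                    // 0..3
	const float frac = r - (float)quadrant;         // exact, in [0, 1)

	const float s = (float)std::sin((double)frac * kHalfPi);
	const float c = (float)std::cos((double)frac * kHalfPi);

	SinCos out{};
	switch (quadrant) {
	case 0: out = {s, c}; break;
	case 1: out = {c, -s}; break;   // sin(x + 1) == cos(x), cos(x + 1) == -sin(x)
	case 2: out = {-s, -c}; break;  // half a turn negates both
	default: out = {-c, s}; break;  // quadrant 3
	}

	// sin is odd and cos is even, so only the sine follows the input's sign.
	if (std::signbit(a))
		out.sine = -out.sine;
	return out;
}

}  // namespace quadrant_sketch
```

One side effect in this sketch is that whole-number inputs produce exact zeros and ones without a separate special case, since `frac` is exactly zero there. The sign of those zeros is whatever falls out of the negations and is not claimed to match hardware.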

@somepunkid

somepunkid commented Apr 25, 2021

Tested Reborn BA2. Tested multiple characters and multiplayer on the same machine. Tested the broken kick animation from #12900.

It is DONE! The game is 100%. Everything is working. Did some direct tests with hardware as well just to be sure but yes, this seems great!

@hrydgard
Owner

Awesome, thanks for testing!

@hrydgard hrydgard added this to the v1.12.0 milestone Apr 25, 2021
@sum2012
Collaborator

sum2012 commented Apr 25, 2021 via email

@hrydgard
Owner

hrydgard commented Apr 25, 2021

@sum2012 did you post in the wrong thread? That doesn't seem relevant here. If you're having trouble logging into github, well, how did you post that? :)

@LunaMoo
Collaborator

LunaMoo commented Apr 25, 2021

Check the mail icon near his nick:
github

Seems he's having some captcha problems when resetting his password, though with the email reply I guess he can reply/contribute even without logging in to the site.

@Saramagrean
Contributor

Hajime no Ippo works fine with this.

New implementation should work for both cases.
@hrydgard
Owner

Alright. Let's merge this and see how it goes!

@hrydgard hrydgard merged commit 0ccc63b into hrydgard:master Apr 25, 2021
@unknownbrackets unknownbrackets deleted the vfpu-sincos branch April 25, 2021 15:11
@unknownbrackets
Collaborator Author

I'm starting to think the Raspberry Pi has a completely broken sin() / cos() implementation or something, because there are strange reports of crazy brokenness that only seem to be coming from Raspberry Pi users and seem new since this was merged.

Might try forcing rpi to use sinf/cosf/sincosf, but haven't confirmed this is the problem...

-[Unknown]

@hrydgard
Owner

Huh, sounds strange :(

We could also maybe just replace sin/cos entirely with a cubic-interpolated lookup table after range reduction; we should be able to get close enough...
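
To make the lookup-table idea concrete, here is one possible sketch. Nothing like it is part of this pull request, and the thread does not say which kind of cubic interpolation was meant: this version picks cubic Hermite interpolation with analytically known slopes. It assumes the input has already been range-reduced to [0, 1] quarter turns. The names `Node`, `table`, `sin_quarter_lut`, `kSegments` and `kHalfPi`, and the table size of 64, are all hypothetical.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

namespace lut_sketch {

// Hypothetical constant; stands in for M_PI_2.
constexpr double kHalfPi = 1.57079632679489661923;
// Hypothetical table size; larger tables trade memory for accuracy.
constexpr std::size_t kSegments = 64;

struct Node {
	double value;  // sin at the sample point
	double slope;  // derivative of sin(x * pi/2) with respect to x
};

// Builds the table once, on first use.
inline const std::array<Node, kSegments + 1> &table() {
	static const std::array<Node, kSegments + 1> nodes = [] {
		std::array<Node, kSegments + 1> t{};
		for (std::size_t i = 0; i <= kSegments; ++i) {
			const double x = (double)i / (double)kSegments;
			t[i].value = std::sin(x * kHalfPi);
			t[i].slope = kHalfPi * std::cos(x * kHalfPi);
		}
		return t;
	}();
	return nodes;
}

// Approximates sin(x * pi/2) for x in [0, 1] with cubic Hermite interpolation.
inline double sin_quarter_lut(double x) {
	const double scaled = x * (double)kSegments;
	std::size_t i = (std::size_t)scaled;
	if (i >= kSegments)
		i = kSegments - 1;  // keeps x == 1 inside the last segment
	const double t = scaled - (double)i;
	const double h = 1.0 / (double)kSegments;

	const Node &a = table()[i];
	const Node &b = table()[i + 1];

	// Standard cubic Hermite basis functions.
	const double t2 = t * t;
	const double t3 = t2 * t;
	const double h00 = 2.0 * t3 - 3.0 * t2 + 1.0;
	const double h10 = t3 - 2.0 * t2 + t;
	const double h01 = -2.0 * t3 + 3.0 * t2;
	const double h11 = t3 - t2;

	return h00 * a.value + h10 * h * a.slope + h01 * b.value + h11 * h * b.slope;
}

}  // namespace lut_sketch
```

A table like this only covers one quadrant, so it would need to be paired with a sin/cos swap and sign handling for the other three. Whether its results would match PSP hardware any better than `sin()` is not something this thread establishes.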

@unknownbrackets
Collaborator Author

Seems like it is not that, per #14456...

-[Unknown]

Successfully merging this pull request may close these issues.

ULJS00218 - Hitman Reborn Battle Arena 2 - Player 2 side broken/reversed/broken kick animation
6 participants