Input lag too high - ideas for improvement #17685

Open
hrydgard opened this issue Jul 10, 2023 · 10 comments
@hrydgard
Owner

hrydgard commented Jul 10, 2023

Keep hearing in the retroachievements discord that PPSSPP has a reputation for high input lag, heh.

Current options for reducing input lag:

  • Set "Buffered frames" as low as possible. This increases the risk of small stalls and stutters, but reduces latency.
  • Run at high refresh rate. The CPU thread will end up waiting on present a lot less.

Future possible improvements:

  • In case of games that have good and reliable 60hz patches, maybe include and suggest them? (Of course, let's not turn them on by default).
  • Closed-loop measurement and beginning-of-frame delay insertion, something like dolphin-emu/dolphin#12035 ("Add closed-loop latency control for Vulkan vsync"), similar in effect to NVIDIA's Reflex
  • Try harder to keep the audio buffer small even in case of overrun
  • Possibly re-introduce single-threaded rendering for more control
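
A minimal sketch of what closed-loop beginning-of-frame delay insertion could look like, assuming the frontend can measure how much idle time (slack) each frame had between finishing its work and its present deadline. The class name, parameters, and gain value are hypothetical; none of this is taken from PPSSPP or from the linked Dolphin PR.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical controller: picks how long to sleep at the start of each frame
// so that input is sampled as late as possible without missing the deadline.
class FrameDelayController {
public:
    FrameDelayController(double frameIntervalMs, double safetyMarginMs, double gain)
        : frameIntervalMs_(frameIntervalMs), safetyMarginMs_(safetyMarginMs), gain_(gain) {
        assert(frameIntervalMs > safetyMarginMs && safetyMarginMs >= 0.0);
        assert(gain > 0.0 && gain <= 1.0);
    }

    // slackMs: measured idle time between finishing the frame's work and the
    // present deadline, with the current delay already applied.
    // Negative slack means the deadline was missed.
    // Returns the delay to insert before sampling input for the next frame.
    double Update(double slackMs) {
        const double error = slackMs - safetyMarginMs_;
        if (error < 0.0) {
            // Too close to (or past) the deadline: back off by the full amount.
            delayMs_ += error;
        } else {
            // Spare time available: creep towards it gradually.
            delayMs_ += gain_ * error;
        }
        const double maxDelay = frameIntervalMs_ - safetyMarginMs_;
        delayMs_ = std::max(0.0, std::min(delayMs_, maxDelay));
        return delayMs_;
    }

    double DelayMs() const { return delayMs_; }

private:
    double frameIntervalMs_;
    double safetyMarginMs_;
    double gain_;
    double delayMs_ = 0.0;
};
```

The asymmetric response (slow to add delay, quick to remove it) is one possible design choice: a missed present costs a whole frame, while an overly cautious delay only costs a little latency.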

Plans:

@hrydgard hrydgard added this to the Future-Prio milestone Jul 10, 2023
@nyanpasu64

Hi, I wrote those PRs you linked to. Still waiting for Dolphin to take a look at them... 💀 Now I'll just write this out before I forget:

  • Dolphin differs from many emulators in that it depends on monotonic clock (real time) throttling on a sub-frame (clock cycles) basis rather than vsync throttling, to maintain a specific emulation speed. I'm not sure why it does so, perhaps it produces better results when talking to hardware Wiimotes, but I think it makes it harder to minimize input latency. This also results in its vsync operating in strange ways. In my PR I had to gradually speed up or slow down CPU throttling to remain roughly in phase with vsyncs.
    • If you're considering beginning-of-frame delay insertion, I'm guessing your emulator performs all the work as early as possible, submits rendering work to the GPU, then (possibly) waits for the frame to start scanning out before starting to emulate the next one? See also https://raphlinus.github.io/ui/graphics/gpu/2021/10/22/swapchain-frame-pacing.html.
      • Dolphin does not wait for the GPU to finish rendering and start showing a frame before starting to emulate (or possibly even render to GPU?) the next one. On Vulkan it submits draw commands (a command buffer) to the GPU when the Wii would present a video frame to the hardware scanout system. You could submit draw commands earlier, but this would only save a few ms and I found they're usually still submitted near the end of a frame. (On Wind Waker, a 30fps game that performs EFB operations which force Dolphin to submit some draw commands early, I found that geometry was inconsistently submitted during the first or second 16ms field of the previous framebuffer being scanned out. I still don't know why.)
      • For inspecting call stacks at runtime and the various timing of events, Tracy Profiler has been invaluable for debugging.
  • Dolphin already has closed-loop audio buffer length control by dynamically controlling the audio output resampling ratio (sadly using linear resampling). However the current code achieves a low jitter partly because the CPU thread is throttled in real time, and submits audio in chunks of just 8 samples (times 2 channels). If you generated an "emulation frame" of audio at a time, you'd need more smoothing, for example ensuring the maximum amount of queued audio seen in the last half second is 1 "emulation frame" plus latency headroom, and the minimum amount of queued audio seen in the last half second equals the latency headroom.
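
The smoothing idea in the last bullet can be sketched roughly as follows, assuming the queue level is sampled at a fixed rate and the window holds about half a second of samples. The type and function names, and the proportional mapping from error to resampling ratio, are illustrative assumptions rather than Dolphin's or PPSSPP's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical tracker for the lowest and highest queue level seen recently.
class QueueLevelTracker {
public:
    explicit QueueLevelTracker(std::size_t windowSize) : windowSize_(windowSize) {
        assert(windowSize > 0);
    }

    // Record the number of queued samples observed at this instant.
    void Push(int queuedSamples) {
        history_.push_back(queuedSamples);
        if (history_.size() > windowSize_) {
            history_.pop_front();
        }
    }

    bool Empty() const { return history_.empty(); }
    int Min() const { return *std::min_element(history_.begin(), history_.end()); }
    int Max() const { return *std::max_element(history_.begin(), history_.end()); }

private:
    std::size_t windowSize_;
    std::deque<int> history_;
};

// Returns input samples consumed per output sample. A value above 1.0 drains
// the queue faster; below 1.0 lets it fill. The goal is for the windowed
// minimum to settle at headroomSamples.
double ResampleRatio(const QueueLevelTracker &tracker, int headroomSamples,
                     double errorScaleSamples, double maxDeviation) {
    if (tracker.Empty()) {
        return 1.0;
    }
    const double error = tracker.Min() - headroomSamples;
    double adjust = error / errorScaleSamples * maxDeviation;
    adjust = std::max(-maxDeviation, std::min(adjust, maxDeviation));
    return 1.0 + adjust;
}
```

Steering on the windowed minimum rather than the instantaneous level keeps the sawtooth caused by frame-sized audio submissions from feeding straight into the resampling ratio.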

@hrydgard
Owner Author

hrydgard commented Jul 12, 2023

We also have a mode to try to maintain pace, although it's off by default.

Our current timing is a bunch of rather messy code in sceDisplay.cpp that tries to maintain a regular rate of running guest CPU code up to a "swapbuffers" event. Unfortunately, this code is completely blind to presentation; it's kinda "what happens happens", which isn't good.

Additionally, in Vulkan and OpenGL we buffer up a whole frame of graphics commands before we pass them over to a separate thread that does the actual command buffer generation. This allows parallelism which is hugely helpful on old Android devices with slow drivers, but is not that useful on faster hardware (additionally, we use the buffering to reorder things in the frame to make more sense for modern GPU hardware, which helps performance in some games). Fortunately, on the same faster hardware, both emulation and command buffer generation are so fast that if we just timed things properly, I think we could still achieve very good latency.

We do already have a resampler similar to Dolphin's for audio, it's not very heavily tuned but it works fine (although it is also linear) - I think for reduced perceived latency, fixing the wait/render/presentation loop is the main priority for us.

As first steps when I get around to attacking this, I'm considering optionally bringing back the old single-threaded mode for Vulkan just to ease reasoning about the presentation loop. Additionally I want to lift all the timing code out of sceDisplay.cpp to EmuScreen.cpp next to PSP_RunLoopWhileState(). After that, we can start using the various present timing extensions where available (VK_KHR_present_wait, VK_GOOGLE_display_timing, the new one when it arrives).

And yes, I probably should finally get around to figuring out tracy integration...

@JMC47

JMC47 commented Jul 12, 2023

Sorry if this is the wrong place to put my recent experience with PPSSPP, but I figure I might as well say something if input latency was considered a serious issue on the project.

Just as a note, I was doing some game research a few weeks ago, and was testing a game in PPSSPP, and the input latency on my computer made the game impossible to play outside of beginner mode (3 click golf game). The Wii version of the game, while it runs at a crappy 20 - 30 FPS (vs 60 FPS for the PSP version - even on my PSP during the actual shots) has much lower input latency in Dolphin and is at least somewhat playable.

My particular case of having issues with input latency may be partly due to moving to Linux, as I used to speedrun using PPSSPP on Windows, and I don't remember the input latency being a huge problem (though it wasn't quite as snappy as console). It could also be that a 3 click golf game is going to make even minor input latency issues more severe. The thing is that even as I played the game quite a bit in PPSSPP, I couldn't adjust to the timings, suggesting they were really bad or simply just inconsistent. Using mouse and keyboard seemed to help with the consistency a little, but the game is still very hard to play. On a real PSP handheld, I can play the game on advanced mode with harsher timings and it is perfectly fine. My biggest fear is that the timings aren't just late on PPSSPP (they absolutely are) but they are also somewhat inconsistent. I noticed on controller that sometimes while preparing shots there would be a delay of nearly half a second on the final hit on the hitbar, completely ruining the shot. I think this only happened after fast-forwarding, but I could be wrong. I have been unable to reproduce this extreme case on keyboard/mouse - only on controller. Even on keyboard and mouse, the timings are never good enough that I feel absolutely comfortable I can hit a perfect shot, whereas on the actual PSP I can usually do it.

This was after doing the buffered frames thing to try to reduce latency. I also ran in fullscreen in an attempt to get exclusive fullscreen, but considering I didn't get the familiar flash of it grabbing, I'm pretty sure it's not working. That could account for some of the issues I guess. But as I noted earlier, I was able to use Dolphin and adjust to the timings there, despite the game running at a worse framerate there. Also, when comparing the Wii to Dolphin, there isn't a hugely noticeable delay when hitting buttons at roughly the same time. On PPSSPP there is a visible delay compared to hitting a button on the PSP.

In other games that aren't so reliant on timings, I haven't really noticed anything out of place. Mega Man Legends 2 feels mostly okay, for instance. I might need to get more PSP golf and rhythm games to really test if this is a global problem or maybe something that only happens in certain games.

My PC's specs are an AMD 5950x, NVIDIA RTX 3060 on a 144Hz monitor running Pop_OS! in case that is pertinent.

@hrydgard
Owner Author

hrydgard commented Jul 12, 2023

@JMC47 thanks for your observations JMC. I'm not really sure where the inconsistency is coming from (though the audio latency does get wonky for a bit after fast-forward). Controller-only is curious - I don't think we do anything weird with controller sampling, but who knows... On Linux I guess the control sampling timing is handled by SDL, and in 30fps games, maybe we don't give SDL control often enough.

The fact that many games are 30fps also makes it worse in other ways because a couple of frames of delay there is a huge amount of time.
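
To put rough numbers on that, here is a small sketch of the arithmetic. The helper names are made up for illustration, and it only counts time spent waiting in whole game frames, ignoring display and controller latency.

```cpp
// Duration of one game frame in milliseconds.
constexpr double FrameTimeMs(double framesPerSecond) {
    return 1000.0 / framesPerSecond;
}

// Delay contributed by a given number of queued game frames.
constexpr double QueuedDelayMs(int queuedFrames, double framesPerSecond) {
    return queuedFrames * FrameTimeMs(framesPerSecond);
}

// The same number of queued frames costs more wall-clock time at 30 fps.
static_assert(QueuedDelayMs(2, 30.0) > QueuedDelayMs(2, 60.0),
              "queued frames cost more time at lower frame rates");
```

Two queued frames come to roughly 66.7 ms in a 30fps game versus roughly 33.3 ms in a 60fps one.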

But it's indeed time to start properly addressing this! While it's a very difficult issue with many possible causes, we need to make progress here. After RetroAchievements is done, this will be my primary focus.

@JMC47

JMC47 commented Jul 12, 2023

Feel free to ping me however if you need any testing/observation. I'll try to get more concrete numbers with a high speed camera down the line so I'm not just going based on estimates and feel.

@EdHerdman

The very recent JIT changes seem to be clearing up audio problems, like static bursts but also possibly some pacing problems, in Infected. The game's almost always playing a licensed audio track so it seems like a good one to check for audio pacing problems.

hrydgard added a commit that referenced this issue Aug 2, 2023
… screen

This extension is not available on Android, there they have
VK_GOOGLE_display_timing, which they also have an abstraction library
for, so will look at that later.

Early part of work on #17685
@hrydgard
Owner Author

hrydgard commented Aug 2, 2023

@EdHerdman The recent JIT changes that I know of were for RISC-V CPUs only, so unlikely to be related.

Anyway, @nyanpasu64 I've just made a PR to measure the problem on PC - that's the first part of knowing that you have one! And yeah, it's bad, even on my 90hz monitor which by its higher refresh rate prevents the frame queue from getting too long.

Next I'll take a look at your stuff, and start refactoring how we control framerate internally to allow for placing the tuned delays where we need them to be in the frame. Also, really should integrate tracy... I started work on that once and got stuck on something.

@LunaMoo
Collaborator

LunaMoo commented Aug 3, 2023

What I find quite funny is if the new option to display latency is working well, turning OFF buffer effects actually doubles latency.

GG to all the people actually turning it off trying to reduce it.

Edit: Oh it's actually happening because it disables frame duplication which I had enabled, maybe one easy way to achieve smaller latency would also be to duplicate frames to a higher value.

@hrydgard
Owner Author

hrydgard commented Aug 3, 2023

Once we have a correct timing algorithm, I think we won't need duplicate frames anymore, it'll only be counterproductive.

@hrydgard
Owner Author

hrydgard commented Aug 3, 2023

So with my new measurements, I've confirmed that:

  • On devices with high framerates where it's hard for queued frames to pile up, there's probably room for up to a frame of input delay improvement, compared to now, although the wins here are not huge.
  • On devices with 60hz displays, when playing 60hz games we very easily end up with three whole frames actually in flight for no good reason even when we could easily keep up, resulting in an unstable 90-100ms latency end to end. Some smart-control-loop delay insertions would really, really help.
  • When playing 30hz games, the situation is often a little better but still not great, there are wins to be had.
  • VK_GOOGLE_display_timing gives results up to 6 frames late! We can still use the result as a statistic, but as an input to a feedback control loop, we're gonna have to be careful not to start oscillating. We might want to lean a bit on the choreographer too on Android (a vsync event) to help timing things, although it's complex stuff - if we can avoid it for a simple control loop, all the better.
  • The AGDK frame pacing library is scarily complex, and if it can be avoided, I'd like to. Though, it does look like it provides a lot of stuff that we want, including the pipelineless mode: https://developer.android.com/games/sdk/frame-pacing
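
One common way to keep a feedback loop from oscillating when its measurements arrive several frames late is to low-pass filter them and apply corrections only once per hold period that is at least as long as the feedback delay. The sketch below illustrates that idea under those assumptions; the class name and parameters are hypothetical, and it does not call any Vulkan or Android API.

```cpp
#include <cassert>

// Hypothetical helper: smooths late-arriving latency measurements and tells
// the caller when enough frames have passed to apply the next correction.
class DelayedFeedbackFilter {
public:
    // alpha: smoothing factor in (0, 1]; smaller values smooth more.
    // holdFrames: frames to wait between corrections, chosen to be at least
    // the worst-case feedback delay (for example 6 frames).
    DelayedFeedbackFilter(double alpha, int holdFrames)
        : alpha_(alpha), holdFrames_(holdFrames) {
        assert(alpha > 0.0 && alpha <= 1.0);
        assert(holdFrames > 0);
    }

    // Feed one measurement per frame. Returns true when the caller should
    // apply a correction based on SmoothedMs().
    bool AddSample(double measuredMs) {
        if (!hasValue_) {
            smoothedMs_ = measuredMs;
            hasValue_ = true;
        } else {
            // Exponential moving average.
            smoothedMs_ += alpha_ * (measuredMs - smoothedMs_);
        }
        if (++framesSinceAdjust_ >= holdFrames_) {
            framesSinceAdjust_ = 0;
            return true;
        }
        return false;
    }

    double SmoothedMs() const { return smoothedMs_; }

private:
    double alpha_;
    int holdFrames_;
    double smoothedMs_ = 0.0;
    int framesSinceAdjust_ = 0;
    bool hasValue_ = false;
};
```

Waiting out the hold period means each correction has a chance to show up in the measurements before the next one is made, at the cost of slower convergence.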

Moving the timing logic out from sceDisplay will be a challenge too.
