cottonvibes_avatar

Introduction to Dynamic Recompilation

This blog post is an introduction to dynamic recompilers (dynarecs), and aims to provide some insight into how they work and why PCSX2 uses them to speed up emulation.
It is probably easier to read on our forums, because some of the code didn't wrap nicely on our main blog page.
(Click here to view blog post in forum)

To understand why dynarecs are useful, you must first be familiar with a basic interpreter emulator.

Read more: Introduction to Dynamic Recompilation

Jake Stine_avatar

A new kind of fullscreen!

One problem that has plagued PCSX2 for as long as almost anyone can remember is its general inflexibility and instability when flipping between windowed and fullscreen modes. This was something we really sought to address as we rewrote the user interface in 0.9.7.

In 0.9.6 the situation was grim: The flip never really worked right in DX9 -- there's an "escape hack" that completely shuts down the emulator just to flip out of fullscreen mode. Hitting alt-enter is usually less forgiving (the plugin handles it, and doesn't allow PCSX2 to completely shut everything down and restart). In DX10 things were better: alt-enter typically works a couple times, but do it too often, or get an unlucky flip, and it'll still result in lost or corrupted video and possibly a crash depending on video drivers and (lack of) luck.

In 0.9.7 we've completely redone our approach to fullscreen. Instead of using what Microsoft DirectX calls Exclusive Mode (you can read about the programmer-centric details of it here), we're taking a more modern approach and using a special type of maximized desktop window instead. As with anything, there are some advantages and disadvantages to this new approach:

Advantages of the new fullscreen method:

  • Works perfectly in DX9 and DX10 alike.
  • No more risk of visual corruption or crashes, and no need to shut down the emulator to avoid them.
  • Much faster and seamless flips.
  • Works with any GS plugin, regardless of how the GS plugin is implemented.
  • Always uses your LCD display's optimal resolution (assuming you have it set in the desktop as such, and you should).
  • Integrates better with your desktop -- Alt-Tab, TaskBar, popup errors, etc. are much less prone to being... annoying. (pulling up a strategy guide in a browser window, for example!)
  • Super-easy to implement, from a programming perspective!


Disadvantages:

  • A slight bit of extra window management overhead.
  • Always uses your desktop resolution on CRT displays (this is an advantage for anyone with an LCD display, but can be a disadvantage for people with CRT displays, depending on your setup .. however few of you are left)


The performance benefit of exclusive fullscreen is mostly related to Aero under Vista/Windows 7, in which case performance is sometimes better in a window than in exclusive mode (depending on video card, drivers, etc.).

The thing that really sold us on it though is the fact that overhead of the non-exclusive fullscreen mode is minimal on modern GPUs. The only real advantage of using exclusive fullscreen is that it bypasses Aero and increases the "multimedia priority" of the app. If you already have Aero disabled, then neither of those apply anyway. And looking toward the future: the next generation of GPUs will reduce that overhead of Aero even further to the point where even that will likely not make a significant impact on performance. So the idea of a seamlessly working fullscreen flip for all DirectX incarnations on all incarnations of Windows over a rather iffy, unstable, and fragile fullscreen flip that might be 2-3% faster on legacy hardware ended up being a no-brainer.

We're still leaving the door open for adding optional support for exclusive mode fullscreen, since there could still be some use for it in special scenarios like CRT displays and TV projection, though there's no timetable for the implementation of the option -- and it would depend on the GS plugin to support it properly; otherwise it'll still be the corruption/crash bomb that it's always been up to now.

Jake Stine_avatar

Global Visitor Stats

Wondering about who's watching as we make and break bleeding edge alpha/beta versions of PCSX2? I do! Here's a map representing the visits to our SVN repository at Googlecode, for the past 2 weeks.

visitor stats

On average, our SVN totals per day:

  • 900 unique viewers
  • 3,200 visits (3.5 visits per viewer)
  • 9,500 page views in total.


Top visitors by city:

1. Sapporo 171
2. London 165
3. Tokyo 151
4. Shanghai 147
5. Moscow 142

Jake Stine_avatar

omg it's r2000

... and omg! It's the r2000! Bring on the celebration! Everybody to the limit! and err.. well... it'd prolly be more impressive if the emulator was working better. :P

r2000

Jake Stine_avatar

Thread Counting...

One thing is for sure: The new 0.9.7 betas will use a lot more threads than the current 0.9.6 releases. Now this doesn't necessarily mean the emulator will take advantage of quad core CPUs better than 0.9.6, at least not in a gameplay sense. As I explained in my previous blog, threading is as much about improving responsiveness and recoverability as it is about sharing a workload across multi-core CPUs, and so far most of the threading implemented in 0.9.7 is the scalable/responsive sort.

One of the major changes in 0.9.7 will be the removal of what I call an "aggressive spinwait" in the EmotionEngine (EEcore) emulation unit. A spinwait is a simple loop that waits on a variable to change, like so:

Code:
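volatile bool IsRunning = true;   // cleared by the threaded action once it finishes
StartThreadedAction();            // starts the work on another thread
while( IsRunning );               // spin: re-test the flag until it changes
// When the above while() exits, the ThreadedAction is done.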


This is a very simple threading design, but it has mostly drawbacks and few advantages. We've continued to use it up to now in PCSX2's EEcore because there wasn't much reason to do away with it; and with the EEcore being the main orchestrator of everything in PCSX2 (GUI included!), having the high-resolution responsiveness of a spinwait made some sense.
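
To make the cost of a spinwait concrete, here is a minimal sketch of the same pattern as a self-contained function. It assumes standard C++11 threading (std::thread, std::atomic) rather than PCSX2's own thread classes, and SpinUntilDone is a hypothetical name used only for illustration. The sleeping worker stands in for the threaded action, and the returned counter shows how many times the waiting thread re-tested the flag while doing nothing useful.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper, not PCSX2 code: starts a worker that takes `work` to
// finish, spins on a flag until it does, and returns the number of re-tests.
std::uint64_t SpinUntilDone(std::chrono::milliseconds work)
{
    std::atomic<bool> isRunning{true};

    std::thread worker([&isRunning, work] {
        std::this_thread::sleep_for(work);                  // stand-in for the threaded action
        isRunning.store(false, std::memory_order_release);  // signal completion
    });

    std::uint64_t spins = 0;
    while (isRunning.load(std::memory_order_acquire))
        ++spins;   // no useful work happens here; the core is simply kept busy

    worker.join();
    return spins;
}
```

Calling SpinUntilDone(std::chrono::milliseconds(50)) will typically report a very large number of re-tests on a desktop CPU, all of them CPU time that another thread could have used. The exact count depends on the machine.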

In the current design in 0.9.6, the EEcore and the GUI share time on the same thread, and when the GS thread is busy, the EEcore will split time between waiting on the GS (via the spinwait) and processing GUI messages. This transition from EEcore emulation to GUI message processing was typically costly, but was necessary to handle input from the user. In the new 0.9.7 design, the EEcore has its own thread separate from the GUI. This allows us to remove the overhead of having to switch to/from GUI processing code, but it came with a somewhat unexpected drawback: the EEcore's aggressive spinwait is now suddenly very aggressive when a game becomes GS-limited. In 0.9.6 the spinwait breaks from time to time to splice in some GUI messages and make time for the GS to do its thing too. This kept everything pretty happy. But with the splicing gone, the EEcore is allowed to run free, soaking up tons of resources simply re-testing the same variable over and over.

The full impact became obvious when we realized that setting 2 software threads in GSdx caused PCSX2 to slow to a crawl (sub 1fps!) on dual core systems. That's what happens when you have three threads using spinwaits on a dual core system -- they completely starve out everything else, and to some extent each other as well. (Yes, GSdx software uses spinwaits also!)

The primary solution is to get rid of the spinwait in the EEcore. In its place we'll put the EEcore to sleep and have it wake up only once the MTGS ringbuffer has emptied to a satisfactory percentage. With the EEcore asleep, the GSdx thread(s) will have free rein over all the resources of the CPU, which will allow them to play "catch up" more efficiently than they can even in 0.9.6. This model will be an obvious win for software rendering, and possibly for DX11's multithreaded pipeline in the future.
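
The sleep-until-drained idea can be sketched with a condition variable. This is only an illustration under stated assumptions, not the actual MTGS code: RingFillGate, its capacity and wake threshold, and DemoFillAndDrain are all hypothetical names, and the class tracks only a fill count rather than real ring buffer data. The producer plays the role of the EEcore and the consumer plays the role of the GS thread.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for ring buffer fill accounting (not PCSX2's class).
class RingFillGate
{
public:
    RingFillGate(std::size_t capacity, std::size_t wakeThreshold)
        : m_capacity(capacity), m_wakeThreshold(wakeThreshold) {}

    // Producer side: queue `count` items. If that fills the ring, sleep
    // (no spinning) until the consumer drains it to the wake threshold.
    void Produce(std::size_t count)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_used += count;
        m_filled.notify_one();
        if (m_used >= m_capacity)
            m_drained.wait(lock, [this] { return m_used <= m_wakeThreshold; });
    }

    // Consumer side: sleep until work exists, retire up to `count` items,
    // and wake the producer once the fill level is low enough.
    void Consume(std::size_t count)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_filled.wait(lock, [this] { return m_used > 0; });
        m_used -= std::min(count, m_used);
        if (m_used <= m_wakeThreshold)
            m_drained.notify_one();
    }

    std::size_t Used()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_used;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_drained;   // signalled when the fill level is low
    std::condition_variable m_filled;    // signalled when new work is queued
    std::size_t m_capacity;
    std::size_t m_wakeThreshold;
    std::size_t m_used = 0;
};

// Demo: the producer fills an 8-slot ring, sleeps, and resumes only after the
// consumer has drained it to 2 or fewer. Returns the fill level seen on wake.
std::size_t DemoFillAndDrain()
{
    RingFillGate gate(8, 2);
    std::thread gs([&gate] {
        for (int i = 0; i < 8; ++i)
            gate.Consume(1);
    });
    gate.Produce(8);                            // blocks while the ring is too full
    const std::size_t fillOnWake = gate.Used(); // at most the wake threshold
    gs.join();
    return fillOnWake;
}
```

While the producer is blocked inside Produce it uses no CPU time, so the consumer thread gets the cores to itself. The wake threshold corresponds to the "satisfactory percentage" mentioned above; the value used in the demo is arbitrary.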
