Jake Stine

A new kind of fullscreen!

One problem that has plagued PCSX2 for as long as almost anyone can remember is its general inflexibility and instability when flipping between windowed and fullscreen modes. This was something we really sought to address as we rewrite the user interface for 0.9.7.

In 0.9.6 the situation was grim: The flip never really worked right in DX9 -- there's an "escape hack" that completely shuts down the emulator just to flip out of fullscreen mode. Hitting alt-enter is usually less forgiving (the plugin handles it, and doesn't allow PCSX2 to completely shut everything down and restart). In DX10 things were better: alt-enter typically works a couple times, but do it too often, or get an unlucky flip, and it'll still result in lost or corrupted video and possibly a crash depending on video drivers and (lack of) luck.

In 0.9.7 we've completely redone our approach to fullscreen. Instead of using what Microsoft DirectX calls Exclusive Mode (you can read about the programmer-centric details of it here), we're taking a more modern approach and using a special type of maximized desktop window. Like anything, there are some advantages and disadvantages to this new approach:

Advantages of the new fullscreen method:

  • Works perfectly in DX9 and DX10 alike.
  • No more risk of visual corruption or crashes, and no need to shut down the emulator to avoid them.
  • Much faster, seamless flips.
  • Works with any GS plugin, regardless of how the GS plugin is implemented.
  • Always uses your LCD display's optimal resolution (assuming you have it set in the desktop as such, and you should).
  • Integrates better with your desktop -- Alt-Tab, TaskBar, popup errors, etc. are much less prone to being... annoying. (pulling up a strategy guide in a browser window, for example!)
  • Super-easy to implement, from a programming perspective!


Disadvantages:

  • A slight bit of extra window management overhead.
  • Always uses your desktop resolution on CRT displays (this is an advantage for anyone with an LCD display, but can be a disadvantage for people with CRT displays, depending on your setup, however few of you are left).


The performance benefit of exclusive fullscreen is mostly related to Aero under Vista/Windows 7, and even then performance is sometimes better in a window than in exclusive mode (depending on video card, drivers, etc.).

The thing that really sold us on it, though, is the fact that the overhead of the non-exclusive fullscreen mode is minimal on modern GPUs. The only real advantage of using exclusive fullscreen is that it bypasses Aero and increases the "multimedia priority" of the app. If you already have Aero disabled, then neither of those applies anyway. And looking toward the future: the next generation of GPUs will reduce the overhead of Aero even further, to the point where even that will likely not make a significant impact on performance. So choosing a seamlessly working fullscreen flip for all DirectX incarnations on all incarnations of Windows over a rather iffy, unstable, and fragile fullscreen flip that might be 2-3% faster on legacy hardware ended up being a no-brainer.

We're still leaving the door open for adding optional support for exclusive mode fullscreen, since it could still be useful for special scenarios like CRT displays and TV projection. There's no timetable for implementing the option, though, and it would depend on the GS plugin supporting it properly; otherwise it'll still be the corruption/crash bomb that it's always been up to now.

Global Visitor Stats

Wondering who's watching as we make and break bleeding edge alpha/beta versions of PCSX2? I do! Here's a map representing the visits to our SVN repository at Google Code for the past 2 weeks.

[Image: visitor stats map]

On average, our SVN totals per day:

  • 900 unique viewers
  • 3,200 visits (3.5 visits per viewer)
  • 9,500 page views in total.


Top visitors by city:

1. Sapporo 171
2. London 165
3. Tokyo 151
4. Shanghai 147
5. Moscow 142

omg it's r2000

... and omg! It's the r2000! Bring on the celebration! Everybody to the limit! And err.. well... it'd prolly be more impressive if the emulator was working better. :P

[Image: r2000]

Thread Counting...

One thing is for sure: the new 0.9.7 betas will use a lot more threads than the current 0.9.6 releases. Now this doesn't necessarily mean the emulator will take advantage of quad core CPUs better than 0.9.6, at least not in a gameplay sense. As I explained in my previous blog, threading is as much about improving responsiveness and recoverability as it is about sharing a workload across multi-core CPUs, and so far most of the threading implemented in 0.9.7 is the scalable/responsive sort.

One of the major changes in 0.9.7 will be the removal of what I call an "aggressive spinwait" in the Emotion Engine (EEcore) emulation unit. A spinwait is a simple loop that waits for a variable to change, like so:

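Code:
#include <atomic>
std::atomic<bool> IsRunning( true );   // cleared by the threaded action when it finishes
StartThreadedAction();
while( IsRunning );                    // spin: re-test the flag until it changes
// When the above while() exits, the ThreadedAction is done.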


This is a very simple threading design, but it has mostly drawbacks and few advantages. We've continued to use it up to now in PCSX2's EEcore because there wasn't much reason to do away with it; and with the EEcore being the main orchestrator of everything in PCSX2 (GUI included!), having the high-resolution responsiveness of a spinwait made some sense.

In the current 0.9.6 design, the EEcore and the GUI share time on the same thread, and when the GS thread is busy, the EEcore splits its time between waiting on the GS (via the spinwait) and processing GUI messages. This transition from EEcore emulation to GUI message processing was typically costly, but was necessary to handle input from the user. In the new 0.9.7 design, the EEcore has its own thread, separate from the GUI. This allows us to remove the overhead of switching to and from GUI processing code, but it came with a somewhat unexpected drawback: the EEcore's aggressive spinwait is now suddenly very aggressive when a game becomes GS-limited. In 0.9.6 the spinwait breaks from time to time to splice in some GUI messages, which also makes time for the GS to do its thing. This kept everything pretty happy. But with the splicing gone, the EEcore is allowed to run free, soaking up tons of resources simply re-testing the same variable over and over.

The full impact became obvious when we realized that setting 2 software threads in GSdx caused PCSX2 to slow to a crawl (sub 1fps!) on dual core systems. That's what happens when you have three threads using spinwaits on a dual core system -- they completely starve out everything else, and to some extent each other as well. (Yes, GSdx software uses spinwaits also!)

The primary solution is to get rid of the spinwait in the EEcore. In its place we'll put the EEcore to sleep and have it wake up only once the MTGS ringbuffer has emptied to a satisfactory percentage. With the EEcore asleep, the GSdx thread(s) will have free rein over all the resources of the CPU, which will allow them to play "catch up" more efficiently than they can even in 0.9.6. This model will be an obvious win for software rendering, and possibly for DX11's multithreaded pipeline in the future.
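
To make the sleep-and-wake idea concrete, here is a minimal sketch of a producer that sleeps on a condition variable until a buffer has drained below a threshold. This is not PCSX2's actual MTGS code: the class `RingFillTracker`, its methods, and the capacity and threshold numbers are all hypothetical, and only the fill level of the ring buffer is modeled.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the MTGS ring buffer; only the fill level is tracked.
class RingFillTracker {
public:
    RingFillTracker(std::size_t capacity, std::size_t wakeThreshold)
        : capacity_(capacity), wakeThreshold_(wakeThreshold) {
        assert(wakeThreshold <= capacity);
    }

    // Producer ("EEcore" side): record queued items. Returns false if they do not fit.
    bool Push(std::size_t count) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (used_ + count > capacity_) return false;
        used_ += count;
        return true;
    }

    // Consumer ("GS" side): record processed items, then notify any sleeper.
    // The sleeper's predicate decides whether it actually resumes.
    void Pop(std::size_t count) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            used_ -= (count < used_) ? count : used_;
        }
        drained_.notify_all();
    }

    // Producer: sleep (no spinning) until the fill level is at or below the threshold.
    void WaitUntilDrained() {
        std::unique_lock<std::mutex> lock(mutex_);
        drained_.wait(lock, [this] { return used_ <= wakeThreshold_; });
    }

    std::size_t Used() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return used_;
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable drained_;
    std::size_t capacity_;
    std::size_t wakeThreshold_;
    std::size_t used_ = 0;
};

// Demo: a "GS" thread drains a full buffer while the calling "EEcore" thread sleeps.
std::size_t DrainDemo() {
    RingFillTracker ring(100, 25);
    ring.Push(100);
    std::thread gs([&ring] {
        for (int i = 0; i < 10; ++i) ring.Pop(10);
    });
    ring.WaitUntilDrained();               // returns once the fill level is <= 25
    std::size_t seenOnWake = ring.Used();  // may be lower; the GS thread keeps draining
    gs.join();
    return seenOnWake;
}
```

While the producer is blocked inside `WaitUntilDrained()` it uses no CPU time, so the consumer thread gets the core to itself. That is the opposite of the spinwait, which competes with the very thread it is waiting on.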

Thread synchronization

It's the year 2009, and it's almost over at that; and as anyone reading this blog well knows, multithreaded applications are the here-and-now and future of desktop computing. It's the only way we can take advantage of multicore CPUs. But multithreaded programming offers more than just improved multicore performance. Using threaded programming is actually very important to developing software that behaves nicely. By that I mean software that refreshes its window contents quickly, responds to your mouse clicks, and lets you cancel stuff.

For that the best approach is usually threading, with the alternative being something called "cooperative multitasking", whereby a program is written such that it splits all tasks into neat little chunks. For example, here are the two possible ways to implement loading an image (let's say a png image):

* Load the image one scanline at a time, and then after each scanline manually check for keyboard, mouse, or other input, and refresh the screen.

* Load the image using a thread, and let the usual "global" windows message handler dispatch keyboard, mouse, and refresh messages as usual.

The second approach has several advantages. For one, it needs fewer temporary heap allocations (which are typically slow and fragment memory). It is more responsive: Windows messages will be handled in parallel to the image loading, so you don't even need to "wait" until the end of a scanline for user input to have its effect. It's also more scalable: while the first system is able to load only one image at a time in cooperative fashion (extending it to support multiple is possible, but very difficult), the threaded approach can be scaled to load dozens of images at once with no additional complications.
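
As a rough illustration of the threaded approach, the sketch below starts one background task per image with `std::async` and collects the results later. Everything here is an assumption made for the example: `Image`, `DecodeImage`, `LoadImagesAsync` and `CollectImages` are hypothetical names, and the "decode" just fabricates a scanline count instead of reading a real png.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical decoded image; a real loader would hold pixel data.
struct Image {
    std::string name;
    std::size_t scanlines;
};

// Stand-in for a slow decode. It only derives a fake scanline count from the name.
Image DecodeImage(const std::string& name) {
    return Image{ name, name.size() * 100 };
}

// Start one background task per image and return immediately.
// The calling thread stays free to process GUI messages in the meantime.
std::vector<std::future<Image>> LoadImagesAsync(const std::vector<std::string>& names) {
    std::vector<std::future<Image>> pending;
    pending.reserve(names.size());
    for (const std::string& name : names) {
        pending.push_back(std::async(std::launch::async, DecodeImage, name));
    }
    return pending;
}

// Gather the finished images. get() blocks only if a task is still running.
std::vector<Image> CollectImages(std::vector<std::future<Image>>& pending) {
    std::vector<Image> images;
    for (std::future<Image>& task : pending) {
        assert(task.valid());   // each future can be collected only once
        images.push_back(task.get());
    }
    return images;
}
```

Going from one image to dozens only changes the length of the `names` list, which is the scalability point made above.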

The drawback is that thread synchronization and especially structured error handling across threads tends to be much more complicated than that of the linear cooperative model. If you don't have errors to handle, or don't really care about handling errors, then threaded tasking isn't so bad.
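
One small piece of the cross-thread error problem can be sketched with standard C++: a `std::future` carries an exception thrown on the worker thread back to whichever thread calls `get()`. The sketch below is only an illustration under that assumption, not how PCSX2 handles errors; `FakeLoad` and `TryLoad` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <future>
#include <stdexcept>
#include <string>

// Hypothetical worker: fails for an empty path, otherwise reports a fake byte count.
std::size_t FakeLoad(const std::string& path) {
    if (path.empty()) throw std::invalid_argument("empty path");
    return path.size();
}

// Runs the worker on another thread and turns any exception into a message
// the GUI thread could show to the user. Returns an empty string on success.
std::string TryLoad(const std::string& path) {
    std::future<std::size_t> result = std::async(std::launch::async, FakeLoad, path);
    assert(result.valid());   // std::async always hands back a usable future
    try {
        result.get();         // rethrows here whatever the worker threw
        return "";
    } catch (const std::exception& ex) {
        return ex.what();
    }
}
```

This covers only the simplest case, one worker and one waiter. Several interdependent threads that can each fail still need a plan for who gets told and in what order, and that is where the real complexity lives.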

Enter PCSX2, where everything ends up being damn complicated. Being a perfectionist, I figured I'd design the new GUI completely on the threaded model, doing away with cooperative design almost completely. Such a design should help avoid any deadlocking scenarios and allow the emu to recover from almost any error gracefully. Problem: the emulator has a lot of inter-dependent parts and pieces that need to be interlocked and synchronized, and all of them can throw out a variety of errors -- which I'd also like to handle smartly, requesting extra user input when appropriate (and not just throwing out annoying or vague message boxes).

Interlocking dependencies can be a nightmare. For example, if you start a thread that loads an image, and then block on that thread until it completes, you're worse off than if you wrote yourself a cooperative image loader because now the whole program stalls waiting for the thread to complete anyway. And like everything else, there are two ways to handle this:

(1) Use a "friendly" blocking mechanism that periodically polls the user input and updates display. This is no better than cooperative single-thread designs though, as it has slow response times and doesn't scale well to multiple threads.

(2) Build your entire GUI around "messages" and "callbacks" (sometimes also called "signals"). This is the most flexible and user-friendly option but can add a lot of "framework" to any codebase.

I tried to use the first approach initially, because I was in a hurry to get things working. But it's been problematic since day 1, so now I'm redoing most things to use the second method instead.

The second one is in fact the recommended design by Microsoft, and one they've been using for almost everything in Windows ever since Win95. It's one of the reasons the Win32 API feels "heavy" to a lot of programmers, but as it turns out, it's not without good reason.
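
A bare-bones version of the message/callback design might look like the sketch below: worker threads post callbacks to a queue, and the GUI thread sleeps until one arrives and then runs it. This is a hypothetical illustration (the `MessageQueue` class and its methods are invented for this example), not the Win32 message loop and not PCSX2's framework.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical message queue: any thread posts callbacks, the GUI thread runs them.
class MessageQueue {
public:
    using Message = std::function<void()>;

    // Called from any thread. Never waits for the GUI to process the message.
    void Post(Message msg) {
        assert(msg != nullptr);   // an empty std::function would throw when invoked
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(std::move(msg));
        }
        ready_.notify_one();
    }

    // Called from the GUI thread: sleeps until a message arrives, then runs it.
    void DispatchOne() {
        Message msg;
        {
            std::unique_lock<std::mutex> lock(mutex_);
            ready_.wait(lock, [this] { return !pending_.empty(); });
            msg = std::move(pending_.front());
            pending_.pop();
        }
        msg();   // run outside the lock so the callback itself may call Post()
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Message> pending_;
};
```

A worker that finishes loading an image would call something like `gui.Post([]{ /* refresh the window */ });` and carry on. Nobody blocks on anybody, which is the property the second method is after, at the price of the extra "framework" mentioned above.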
