General Plot's son (and refraction's nephew) born today

I know this has nothing to do directly with PCSX2, but I thought this occasion deserved an announcement here, as without PCSX2, this never could have been possible. I met refraction's sister nearly five years ago after knowing him for a few years. And as everyone knows, I'm married to his sister. Today his sister and I celebrated the birth of our son. A child made possible (as odd as it may seem) only by this project.

Many people look at the emulator as strictly software, and the human side of it tends to be forgotten, but this is one of those cases where it's all about the human side. An epic emulator that brought about the possibility of an epic life event. It may not be big news, but this project has rewritten the history of our lives and brought a new life into them.

As much as I love this project (always have), no amount of speed increases or bug fixes can compare to the excitement of seeing my son born today. I'm sure this isn't important to most people, but I wanted to share it with anyone who might be interested. Just consider this a reminder of the human side of it.


Path 3 Masking / Geometry Syncing

I promised myself ages ago I would write a blog about this, more for the PS2 community than anything else as this seems to be almost a dark art in terms of understanding how it works.

Some of you may be familiar with Path 3 masking, some of you may not. If you played a game on version 0.9.4 or older and saw, say, writing on some of the floor textures, or graphics that just looked like absolute crap, the likelihood is that it was due to Path 3 masking problems. For those not familiar, here is a picture example of Persona 4 displaying Path 3 masking issues.

So, what is it exactly?

Path 3 masking is a method of synchronizing the order in which geometry data (polygons etc.) and the textures that go on them are sent to the GS (Graphics Synthesizer). It was a pretty efficient way of transferring information: textures and geometry took completely different paths to make optimal use of the PS2 memory bus, and a stalling method kept the texture and geometry lists in order, instead of interrupts, which took extra CPU time. This way, developers could queue up massive amounts of textures and stream them to the GS in time with the geometry data efficiently.

Here is a badly drawn diagram showing what the Mask does:

As you can see, the mask stops PATH3 from transferring while the VU is busy, then lets it go and almost immediately puts the mask back on. This ensures that at the end of the texture transfer, PATH3 stops again!

So what was the problem? Why did everything look like crap?

Truthfully? We never had proper control over these packets. GIF packets can come through the PATH3 DMA channel lumped together, and we should have been stalling partway through that lump, at the end of a packet. But the emulator didn't really care what these packets were, so we just threw it all at the GS and ignored any packet ends. This is what caused the texture/geometry lists to go out of sync, and it is where the crap on the screen came from. As the emulator evolved it became progressively easier to fix this issue. Even with the theory down, though, game developers handled masking in different ways, with different timings for the masks, so it took a fair amount of testing and tweaking to get right.
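To make the ordering problem concrete, here is a minimal toy sketch. It is not PCSX2 code: the function name, the string "packets" and the `honor_packet_ends` flag are all made up for illustration, and it assumes one texture packet is meant to go out before each geometry batch.

```python
from collections import deque

def run_frame(textures, geometry, honor_packet_ends=True):
    """Return the order in which items reach a toy GS.

    textures: texture packets queued on PATH3 (hypothetical labels)
    geometry: geometry batches sent by the VU (hypothetical labels)
    """
    path3 = deque(textures)
    gs_log = []
    for batch in geometry:
        # Mask lifted: PATH3 is allowed to start transferring.
        if honor_packet_ends:
            # The mask goes back on almost immediately, so PATH3
            # stops at the end of the one packet already in flight.
            if path3:
                gs_log.append(path3.popleft())
        else:
            # Packet ends ignored: everything queued is thrown at the GS.
            while path3:
                gs_log.append(path3.popleft())
        # Mask on: PATH3 is stalled while the VU sends its geometry.
        gs_log.append(batch)
    return gs_log

in_sync = run_frame(["tex0", "tex1"], ["geo0", "geo1"])
out_of_sync = run_frame(["tex0", "tex1"], ["geo0", "geo1"],
                        honor_packet_ends=False)
```

With packet ends honored, the order is tex0, geo0, tex1, geo1. With packet ends ignored, both textures arrive before any geometry, so if tex1 lands in the same GS memory as tex0 (an assumption of this toy), geo0 gets drawn with the wrong texture.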

Fortunately, these days we have this pretty much solved: we completely analyze the GIF packets before sending them on their way, so we know where we can stop the transfers. We do still get the odd case where this rears its ugly head, because everything has to be timed perfectly, and by the nature of emulators, timing is painful to get correct and probably never will be perfect. Generally, however, gamefixes such as "EE Timing Fix" can get around these issues without compromising too much on stability.
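As a rough illustration of what "analyzing the packets" buys you, the sketch below walks a lumped-together transfer and records every point where it would be safe to stall. The format is a deliberately simplified stand-in, not the real GIFtag layout: each block is assumed to start with a header word holding a data-word count in its low 15 bits and an end-of-packet flag in bit 15.

```python
def find_stall_points(words):
    """Return the indices in `words` where a packet ends.

    Toy format (an assumption for this example only): each block is
    one header word followed by `count` data words.
      count = header & 0x7FFF
      eop   = (header >> 15) & 1
    """
    stall_points = []
    i = 0
    while i < len(words):
        header = words[i]
        count = header & 0x7FFF
        eop = (header >> 15) & 1
        i += 1 + count                 # skip the header and its data
        if i > len(words):
            raise ValueError("truncated block at end of transfer")
        if eop:
            stall_points.append(i)     # the transfer may stop here
    return stall_points

# Three blocks lumped together; the first and third end a packet.
lump = [0x8002, 10, 11, 0x0001, 20, 0x8001, 21]
points = find_stall_points(lump)
```

Once the stall points are known, a masked transfer can be cut at the first one instead of being sent to the GS in a single piece.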


Threading VU1

Well, if you've kept up with PCSX2's SVN, you'll have noticed we recently added an MTVU (Multi-Threaded microVU1) option, which runs VU1 on its own thread.

Getting PCSX2 to use more cores is something many people have asked for, and they wondered why we weren’t doing it. Some users would go as far as to flame us PCSX2 coders, saying we didn’t have the skills to do it, or some other nonsense.

Read more: Threading VU1

The History of PCSX2

Some forum members had shown quite an interest in the history of the emulator, so I thought, why not? I'll write a history of the emulator to the best of my knowledge for everybody to look at! Hopefully those who have been here longer (like bositman) can fill you in a bit more on what happened. My apologies for any inaccuracies, I didn't join the team until version 0.8.0 (January 2005)! So, here goes...

Read more: The History of PCSX2

Benchmarking Multithreaded PCSX2

As most people probably know, PCSX2 is primarily a dual-thread application. The two main threads are as follows:

  • EE/Core thread emulates the PS2's Emotion Engine (including VIF, SIF, GIF, and VUs) and the IOP (including SPU2, CDVD, and PAD)
  • GS thread emulates the PS2's Graphics Synthesizer (includes texture swizzling, texture filtering, upscaling, and frame rendering)

Each thread relies on the other thread in some way -- the GS thread cannot swizzle texture data until the EE thread has uploaded said data, for example.  Meanwhile, the EE thread cannot upload texture data to the GS thread if the GS thread is currently bogged down rendering last week's frame to video.  During these periods, either thread will sleep, only to be woken up once the other thread has caught up in its workload.
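The sleep/wake relationship can be sketched with a small bounded queue. This is only an illustration of the idea, not how PCSX2 implements it: the function name, the queue depth and the integer "frames" are invented for the example.

```python
import queue
import threading

def run_pipeline(frame_count, queue_depth=2):
    """Pass frames from a toy EE thread to a toy GS thread."""
    work = queue.Queue(maxsize=queue_depth)
    rendered = []

    def ee_thread():
        for frame in range(frame_count):
            # put() sleeps while the queue is full, i.e. while the
            # GS side is still bogged down with earlier work.
            work.put(frame)
        work.put(None)  # sentinel: no more frames

    def gs_thread():
        while True:
            # get() sleeps until the EE side has uploaded something.
            frame = work.get()
            if frame is None:
                break
            rendered.append(frame)

    threads = [threading.Thread(target=ee_thread),
               threading.Thread(target=gs_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rendered

frames = run_pipeline(5)
```

Whichever side runs ahead ends up blocked in `put()` or `get()`, and that blocked time is what the operating system reports as idle for that thread.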

In theory the act of sleeping the EE/GS threads should make benchmarking the CPU load registered by each thread pretty easy: all modern operating systems have built-in APIs for reading the busy/idle time of any thread on the system -- this is the same API used by your tried and true task/process manager, for example:


(Air shows off his personal favorite, ProcessExplorer, part of the SysInternals Suite)

This readout is simple, efficient, and seemingly reliable.  It also avoids a lot of the annoying pitfalls one runs into trying to use common alternatives such as rdtsc and QueryPerformanceCounter.

... and this is precisely the method I decided to use for PCSX2 0.9.7.r3113 (and still in use as of r3878).  Simple theory really: if the GS thread is sleeping a lot (low load) then the game is bottlenecked by EE/Core thread activity.  If the EE thread is sleeping a lot and the GS thread reports 90+%, then the GS thread is the bottleneck (a problem often correctable through using lower internal resolutions, for example).
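Written out as code, the theory looks something like the sketch below. The 90% figure comes from the description above; the function name, the symmetric check for the EE thread and the "unclear" fallback are assumptions added for the example.

```python
def find_bottleneck(ee_load, gs_load, busy_threshold=90.0):
    """Guess the bottleneck from per-thread load percentages (0-100)."""
    ee_busy = ee_load >= busy_threshold
    gs_busy = gs_load >= busy_threshold
    if gs_busy and not ee_busy:
        return "GS"        # EE sleeps a lot while GS is pegged
    if ee_busy and not gs_busy:
        return "EE/Core"   # GS sleeps a lot while EE is pegged
    return "unclear"       # both busy or both idle: no simple answer

verdict = find_bottleneck(ee_load=40.0, gs_load=95.0)
```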

But as I've recently found out, it doesn't work as expected. -_-

It's filled with... threads!

The immediate problem faced by this simple method of load detection is that the latest wave of Windows Vista/7 GPU drivers themselves are multithreaded.  It should have come as little surprise that one of the primary goals of the new DWM/Aero/DX11 systems implemented into Vista/7 is scalable parallel processing that takes better advantage of modern multi-core CPUs.  Why this causes the OS built-in thread load detection to fail might be less obvious; I'll explain with an example:

When the GPU driver receives a directive to render the current scene (aka 'Present' in DirectX lingo), it sends the job to a thread dedicated to the task.  That thread has a Present Queue, typically 1 or 2 frames deep, that automatically handles triple buffered vsync'd page updates.  If the queue is full when the PCSX2 GS thread issues its next Present request, the GPU driver will put the GS thread to sleep until a slot in the Present Queue becomes available.  End result: The GS thread reports idle time to the operating system (and to PCSX2's GS window), but the GPU is still quite overloaded and bottlenecked via work supplied to it by a different thread altogether.
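A small deterministic simulation shows how misleading the reported idle time can be. Everything here is a toy model with made-up numbers: the GS thread is assumed to spend `cpu_ms` building each frame, the GPU `gpu_ms` drawing it, and the Present Queue holds `queue_depth` frames.

```python
def simulate_present_queue(frames, cpu_ms, gpu_ms, queue_depth):
    """Return (gs_thread_load, gpu_load) as fractions of wall time."""
    now = 0.0            # the GS thread's clock
    gpu_free_at = 0.0    # when the GPU finishes its current backlog
    finish_times = []    # completion time of every submitted frame
    busy = 0.0           # time the GS thread actually spent working
    for _ in range(frames):
        now += cpu_ms    # GS thread builds the frame
        busy += cpu_ms
        pending = [t for t in finish_times if t > now]
        if len(pending) >= queue_depth:
            # Queue full: the driver puts the thread to sleep
            # until a slot becomes available.
            now = pending[len(pending) - queue_depth]
        start = max(now, gpu_free_at)
        gpu_free_at = start + gpu_ms
        finish_times.append(gpu_free_at)
    wall = gpu_free_at
    return busy / wall, (frames * gpu_ms) / wall

gs_load, gpu_load = simulate_present_queue(
    frames=100, cpu_ms=1.0, gpu_ms=10.0, queue_depth=2)
```

In this run the GS thread reports roughly 10% load while the GPU is busy nearly 100% of the time, which is exactly the situation where a thread-idle readout points at the wrong bottleneck.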

In essence, it is nearly the same sort of inter-thread dependence that the EE/Core and GS threads have between each other, only now the EE/Core thread's dependency chain extends to include GS and GPU driver threads (of which there could be one or many).

The solution to this problem is to use a more traditional method of manual load checking: timing various sections of code executed in-thread via either the aforementioned rdtsc (timestamp) or QueryPerformanceCounter, read at key points in the GS thread's execution/program flow.  This wasn't such a great idea a few years ago, due to K8/Athlon and P4 generation CPUs lacking a stable internal clock counter.  Fortunately, all modern CPUs have a consistent counter suitable for benchmarking, so the pitfalls that have been long associated with using Intel/AMD timestamps are finally obsolete enough to not be a concern for us here.
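As a rough idea of what in-thread timing looks like, here is a minimal sketch. The `LoadMeter` class and its method names are hypothetical. It reads Python's `time.perf_counter_ns`, which on Windows is backed by QueryPerformanceCounter, rather than calling rdtsc directly.

```python
import time

class LoadMeter:
    """Accumulate time spent inside explicitly marked code sections."""

    def __init__(self, clock=time.perf_counter_ns):
        self._clock = clock
        self._busy = 0
        self._section_start = None
        self._window_start = clock()

    def begin(self):
        """Mark the start of a section that counts as work."""
        self._section_start = self._clock()

    def end(self):
        """Mark the end of the section opened by begin()."""
        self._busy += self._clock() - self._section_start
        self._section_start = None

    def load(self):
        """Fraction of wall time since the last call spent in sections."""
        now = self._clock()
        elapsed = now - self._window_start
        result = self._busy / elapsed if elapsed > 0 else 0.0
        self._busy = 0
        self._window_start = now
        return result

# Deterministic demo with a fake clock: 30 of 100 ticks were "work".
ticks = iter([0, 10, 40, 100])
meter = LoadMeter(clock=lambda: next(ticks))
meter.begin()
meter.end()
measured = meter.load()
```

Because the thread decides which sections count, time spent blocked inside a driver call can be counted as load if the section is drawn around that call. Where exactly to place the markers is a design choice this sketch leaves open.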
