- Created: 30 July 2006
- Written by ZeroFrog
The Playstation 2 uses co-processor 0 to implement virtual paging. Even without COP0, the Playstation 2 memory map is pretty complex and the mapping can change depending on which processor you use to read the memory from. A simple version of how the default mapping looks from the Emotion Engine side is:
- 32 MB of main memory occupying 0000_0000 - 01ff_ffff
- Hardware registers occupying 1000_0000 - 1000_ffff
- VU/BIOS/SPU2 addresses in 1100_0000 - 1fff_ffff
- Special kernel modes etc. in 8000_0000 - bfff_ffff
- A scratch pad at some other address
- ...and of course we can't forget the hidden addresses (thanks SONY)
To make matters worse, these mappings can change depending on the setting of COP0. (Note that at the time of writing, Pcsx2 doesn't emulate even half of COP0 correctly.) The simplest and most straightforward way to emulate this is to have another memory layer through a software Translation-Lookaside-Buffer (TLB). You pass it the PS2 address, and out comes the real physical address or some special code signifying a hardware register, etc. The problem is that every read/write has to be preceded by a TLB lookup. Considering that reads/writes are as common as addition, that's a lot of wasted cycles.
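To make the cost concrete, here is a minimal sketch of such a software translation layer. It is not Pcsx2's actual code: the class names, the result codes, and the 4 KB page size are assumptions made for illustration, and only the hardware register range is taken from the simplified map above.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages

# Hypothetical result codes; the text only says "some special code".
RAM = "ram"
HW_REGISTER = "hw_register"
UNMAPPED = "unmapped"


class SoftwareTLB:
    """Maps PS2 virtual pages to physical pages with a plain dictionary."""

    def __init__(self):
        self.page_table = {}  # virtual page number -> physical page number

    def map_page(self, virtual_addr, physical_addr):
        self.page_table[virtual_addr // PAGE_SIZE] = physical_addr // PAGE_SIZE

    def translate(self, virtual_addr):
        # Hardware registers (1000_0000 - 1000_ffff) get a special code.
        if 0x10000000 <= virtual_addr <= 0x1000FFFF:
            return HW_REGISTER, virtual_addr
        physical_page = self.page_table.get(virtual_addr // PAGE_SIZE)
        if physical_page is None:
            return UNMAPPED, virtual_addr
        return RAM, physical_page * PAGE_SIZE + virtual_addr % PAGE_SIZE


class EmulatedMemory:
    """Every read and write pays for one translate() call first."""

    def __init__(self, ram_size):
        self.ram = bytearray(ram_size)
        self.tlb = SoftwareTLB()
        self.lookups = 0  # counts the per-access overhead

    def _resolve(self, virtual_addr):
        self.lookups += 1
        kind, addr = self.tlb.translate(virtual_addr)
        if kind != RAM:
            raise LookupError(f"{kind} access at {virtual_addr:#010x}")
        return addr

    def read8(self, virtual_addr):
        return self.ram[self._resolve(virtual_addr)]

    def write8(self, virtual_addr, value):
        self.ram[self._resolve(virtual_addr)] = value & 0xFF


memory = EmulatedMemory(ram_size=2 * PAGE_SIZE)
memory.tlb.map_page(0x0000, 0x0000)
memory.tlb.map_page(0x1000, 0x0000)  # a second virtual page, same physical page
memory.write8(0x1004, 0xAB)
value = memory.read8(0x0004)  # reads the byte written through the other page
```

The `lookups` counter grows by one for every single access, which is exactly the overhead described above. On the plus side, pointing two virtual pages at one physical page costs nothing extra in this scheme.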
Well, the OS also uses virtual memory. In fact, every process has its own virtual address space driven by a real hardware TLB. If we could get away with mapping the 4 GB PS2 memory map onto the process's virtual memory, we could eliminate the need for the software translation (Figure 1). Looking at the virtual memory manipulation functions Windows XP offers, there are two major problems with this:
1. Windows XP reserves more than half the address space for OS-specific stuff. A good amount is also reserved for all of Pcsx2's working memory, executable code, and plugins (especially ZeroGS). It looks like we are left with less than 1.5 GB of address range to implement the 4 GB PS2 memory map. Note that this problem doesn't exist on 64-bit operating systems, where the address range is practically... infinite (don't quote me on this 20 years down the road).
2. The Playstation 2 allows more than one virtual page to point to the same physical page; Windows XP doesn't (I don't know about Linux). Assume that each page is 4 KB and that PS2 address 0x1000 points to the same physical page as address 0x0000. Now a write occurs at 0x1000. The game can retrieve that same value just by reading from 0x0000. In Windows XP, these have to be two different pages, so unless some clever solution/technology is discovered, we can kiss our VM dreams goodbye.
The first problem was solved, more or less, by introducing special address transformations that are applied before a read/write occurs.
And thankfully a clever technology presented itself for the second problem: Address Windowing Extensions. This lets Pcsx2 handle the actual physical page instead of a virtual page. We still can't map two virtual pages to the same physical page; however, what we can do instead is switch the mapping of the physical page as many times as needed! To achieve this, Pcsx2 hacks into the root exception handler and intercepts every exception the program generates. Whenever an illegal virtual page is accessed (i.e., no physical page is mapped to it), Pcsx2 gets an EXCEPTION_ACCESS_VIOLATION, remaps the correct physical page to that empty virtual page, and returns. Although I haven't measured it precisely, I'm pretty sure that switching physical pages around is pretty expensive, computationally speaking. So all this works fine under the assumption that game developers won't be crazy and access two virtual pages mapping to the same physical page back-and-forth frequently... [pause].
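The remap-on-fault idea can be simulated in a few lines. This is a toy model only: it does not call the real Address Windowing Extensions API, `PageFault` merely stands in for EXCEPTION_ACCESS_VIOLATION, and all class and attribute names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages


class PageFault(Exception):
    """Stands in for EXCEPTION_ACCESS_VIOLATION in this simulation."""

    def __init__(self, virtual_page):
        super().__init__(f"nothing mapped at virtual page {virtual_page:#x}")
        self.virtual_page = virtual_page


class RemappingMemory:
    """Each physical page can be visible at only ONE virtual page at a time."""

    def __init__(self, aliases):
        # aliases: virtual page -> physical page, as the PS2 expects it
        self.aliases = dict(aliases)
        self.physical = {p: bytearray(PAGE_SIZE) for p in set(self.aliases.values())}
        self.mapped_at = {}  # physical page -> virtual page it is currently mapped to
        self.remaps = 0      # how many times the "exception handler" had to remap

    def _access(self, virtual_addr):
        virtual_page, offset = divmod(virtual_addr, PAGE_SIZE)
        physical_page = self.aliases[virtual_page]
        if self.mapped_at.get(physical_page) != virtual_page:
            raise PageFault(virtual_page)
        return self.physical[physical_page], offset

    def _handle_fault(self, fault):
        # Move the physical page to the virtual page that was just touched.
        self.mapped_at[self.aliases[fault.virtual_page]] = fault.virtual_page
        self.remaps += 1

    def _access_with_retry(self, virtual_addr):
        try:
            return self._access(virtual_addr)
        except PageFault as fault:
            self._handle_fault(fault)
            return self._access(virtual_addr)

    def read8(self, virtual_addr):
        page, offset = self._access_with_retry(virtual_addr)
        return page[offset]

    def write8(self, virtual_addr, value):
        page, offset = self._access_with_retry(virtual_addr)
        page[offset] = value & 0xFF


mem = RemappingMemory({0: 0, 1: 0})  # virtual pages 0 and 1 share physical page 0
mem.write8(0x1004, 0xAB)             # first touch of page 1: one remap
seen = mem.read8(0x0004)             # page 0 is now unmapped: another remap
remaps_after_first_pair = mem.remaps

for _ in range(4):                   # alternate between the two aliases
    mem.write8(0x1000, 1)
    mem.read8(0x0000)
remaps_after_ping_pong = mem.remaps
```

Repeated accesses through the same virtual page cost nothing extra, but every switch between the two aliases triggers a fault and a remap, so the counter climbs with each back-and-forth access.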
Alas, we were wrong... again (see floating-point article). It turns out that there are uncached and cached address ranges, so for a game it is actually optimal to use exactly this kind of bi-mapping trick: write through one virtual range and read through another. Pcsx2 tries to detect such cases and work around them, but there's no clean solution.
And I'm going to stop here before this becomes a book.
So the ultimate question is: why doesn't VM work on some computers with 1 GB of RAM and the newest updates, while it works on others? It turns out that real-time monitoring applications like to take up some of the 1.5 GB of leftover addresses in certain processes (these might be OS-specific programs too). I have also observed that performance/debugging monitors like NvPerfHud do similar tricks. There are probably other reasons for VM builds of Pcsx2 not working, because virtual memory is a pretty complicated issue.
Moral of the blog: Read an OS book. I recommend Operating System Concepts (the dinosaur book) by Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne.
- Created: 24 July 2006
- Written by ZeroFrog
It is very hard to emulate the floating-point calculations of the R5900 FPU and the Vector Units on an x86 CPU because the Playstation 2 does not follow the IEEE standard. Multiplying two numbers on the FPU, VU, and an x86 processor can give you 3 different results all differing by a couple of bits! Operations like square root and division are even more imprecise.
Originally, we thought that a couple of bits shouldn't matter, that game developers would be crazy to rely on such precise calculation. Floating-point numbers are mostly used for world transformations or interpolation calculations, so no one would care if their Holy Sword of Armageddon was 0.00001 meters off from the main player's hand. In short, we were wrong, and game developers are crazier than we thought. Games started breaking just from changing the floating-point rounding mode!
While rounding mode is a problem, the bigger nightmare is the floating-point infinities. The IEEE standard states that when a single-precision result overflows (meaning that its magnitude is larger than 3.4028234663852886E+38), the result will be infinity. Any nonzero number multiplied by infinity is infinity, and 0 * infinity is not even a number (NaN). That sounds great until you figure out that the VUs don't support infinities. Instead they clamp all large numbers to the largest floating-point value possible. This discrepancy breaks a lot of games!
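The difference is easy to reproduce. The sketch below emulates single-precision multiplication in Python: once with IEEE overflow to infinity, and once with the clamping described above. The function names are made up for illustration, and the clamped variant is only a model of the behaviour the text describes, not a bit-exact VU emulation.

```python
import math
import struct

FLT_MAX = 3.4028234663852886e38  # largest finite single-precision value


def to_f32(x):
    """Round a Python float to single precision; overflow becomes infinity (IEEE)."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:  # struct refuses finite values too large for a float
        return math.copysign(math.inf, x)


def ieee_mul(a, b):
    """Single-precision multiply with IEEE overflow semantics."""
    return to_f32(a * b)


def clamped_mul(a, b):
    """Same multiply, but overflow is clamped to +/-FLT_MAX instead of infinity."""
    result = ieee_mul(a, b)
    return max(-FLT_MAX, min(FLT_MAX, result))


big = to_f32(3.0e38)

ieee_overflow = ieee_mul(big, 2.0)        # infinity
clamped_overflow = clamped_mul(big, 2.0)  # FLT_MAX

# The two results keep diverging in later arithmetic:
ieee_halved = ieee_mul(ieee_overflow, 0.5)           # still infinity
clamped_halved = clamped_mul(clamped_overflow, 0.5)  # a finite number again
```

Once an infinity appears on the IEEE side it sticks to every later result, while the clamped side quietly returns to ordinary finite values.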
For example, let's say a game developer tries to normalize a zero vector by dividing by its length, which is 0. On the VU, the end result will be (0,0,0). On x86/IEEE, every component of the result will be non-finite (an infinity or a NaN). Now if the game developer uses this vector to perturb some faces for artificial hair or some type of animation, all final positions on the PS2 will remain the same. All final positions on x86 will go to infinity (or NaN)... and there goes the game's graphics. Now figure out where the problem occurred.
The simplest solution is to clamp the written vector of the current instruction. This requires 2 SSE operations and is SLOW, and sometimes it doesn't even work. To top it off, you can never dismiss the fact that game developers can be loading bad floating-point data to the VUs to begin with! Some games zero out vectors by multiplying them by zero, so the VU doesn't care at all what kind of garbage the original vector holds; x86 does care.
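Here is a small sketch of the zeroing case, using plain Python floats. The two clamping steps are modelled as a min followed by a max, which is an assumption about what the 2 SSE operations are. Mapping NaN to the maximum value is likewise an assumption of this sketch, and all function names are hypothetical.

```python
import math

FLT_MAX = 3.4028234663852886e38  # largest finite single-precision value


def clamp_component(x):
    """Clamp one value into [-FLT_MAX, FLT_MAX].

    Treating NaN as FLT_MAX is an assumption made for this sketch.
    """
    if math.isnan(x):
        return FLT_MAX
    return max(-FLT_MAX, min(FLT_MAX, x))  # one min, one max


def clamp_vector(v):
    return [clamp_component(x) for x in v]


def scale(v, s):
    return [x * s for x in v]


# "Garbage" data a game might leave in a register before zeroing it.
garbage = [math.inf, -math.inf, math.nan, 1.5]

unclamped = scale(garbage, 0.0)              # IEEE: inf * 0 and nan * 0 are NaN
clamped = scale(clamp_vector(garbage), 0.0)  # every component becomes zero
```

Without the clamp, three of the four components come out as NaN instead of zero, and the NaN then poisons everything computed from that vector.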
These two problems make it very hard to emulate floating point both quickly and accurately. The resulting bugs range from screen flickering when a fade occurs, to disappearing characters, to spiky polygon syndrome (the most common problem, widely known as SPS).
In the end, Pcsx2 does all its floating-point operations with SSE since it is easier to cache the registers. Two different rounding modes are used for the FPU and VUs. Whenever a divide or rsqrt occurs on the FPU, overflow is checked. Overflow is checked much more frequently with the VUs. The fact that VUs handle both integer and floating-point data in the same SSE register makes the checking a little longer. In the future, Pcsx2 will read the rounding mode and overflow settings from the patch files. This is so that all games can be accommodated with the best/fastest settings.
Moral of the blog: When comparing two floating-point numbers a and b, never use a == b. Instead use something along the lines of
fabs(a-b) < epsilon
where epsilon is some very small number.
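As a quick illustration of that advice, here is a minimal sketch in Python. The helper name `nearly_equal` and the default epsilon of 1e-6 are arbitrary choices for the example; a suitable epsilon depends on the magnitude of the values being compared.

```python
import math


def nearly_equal(a, b, epsilon=1e-6):
    """Absolute-difference comparison, as in fabs(a-b) < epsilon."""
    return math.fabs(a - b) < epsilon


a = 0.1 + 0.2
b = 0.3

exact_match = (a == b)          # False: the two sums differ in the last bits
close_match = nearly_equal(a, b)  # True: the difference is far below epsilon
```

For values of very different magnitudes, a fixed absolute epsilon stops being meaningful. Python's standard `math.isclose` offers a relative tolerance for that case.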
- Created: 23 July 2006
- Written by CKemu
Today marks the start of a new section called Blog. This is an area for the development team to write technical articles and day-to-day happenings. We hope you find it an interesting read and a general resource for technical information, with an insight into the PCSX2 team.
You will find entries on a variety of subjects, including technical articles, site development, humorous aspects of working on a project such as this, and the 'technical challenges' that are met every day.
You may have gathered that this blogs section will not be a 'light read' like the primary news section. Blogs are aimed at the technically minded, though I dare say there shall be light-hearted entries. Due to the technical nature of this blogs section, the forums will not be supporting general questions relating to blog entries.