bsnes/higan/sfc/cpu/io.cpp
Tim Allen ca277cd5e8 Update to v100r14 release.
byuu says:

(Windows: compile with -fpermissive to silence an annoying error. I'll
fix it in the next WIP.)

I completely replaced the time management system in higan and overhauled
the scheduler.

Before, processor threads would have "int64 clock", and there would
be a 1:1 relationship between two threads. When thread A ran for X
cycles, it'd subtract X * B.Frequency from clock; and when thread B ran
for Y cycles, it'd add Y * A.Frequency to clock. This worked well
and allowed perfect precision; but it doesn't work when you have more
complicated relationships: e.g. the 68K can sync to the Z80 and PSG; the
Z80 to the 68K and PSG; so the PSG needs two counters.

The new system instead uses a "uint64 clock" variable that represents
time in attoseconds. Every time the scheduler exits, it subtracts
the smallest clock count from all threads, to prevent an overflow
scenario. The only real downside is that rounding errors mean that
roughly every 20 minutes, we have a rounding error of one clock cycle
(one 20,000,000th of a second). However, this only applies to systems
with multiple oscillators, like the SNES. And when you're in that
situation ... there's no such thing as a perfect oscillator anyway. A
real SNES will be thousands of times more out of spec than one clock
cycle per 20 minutes.
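
The attosecond bookkeeping described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not higan's actual Thread or Scheduler code: the names AttoThread, normalize and driftPerSecond are hypothetical, and one second is assumed to be exactly 10^18 attoseconds with the per-cycle scalar truncated by integer division.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a scheduler thread (not higan's real Thread class).
struct AttoThread {
  // Assumption: one second is exactly 10^18 attoseconds.
  static constexpr uint64_t Second = 1'000'000'000'000'000'000ull;

  uint64_t frequency;  // cycles per second
  uint64_t scalar;     // attoseconds per cycle, truncated by integer division
  uint64_t clock = 0;  // elapsed time in attoseconds

  explicit AttoThread(uint64_t hz) : frequency(hz), scalar(hz ? Second / hz : 0) {
    assert(hz > 0);
  }

  // Advance this thread's clock by a number of its own cycles.
  void step(uint64_t cycles) { clock += cycles * scalar; }
};

// Subtract the smallest clock from every thread. Relative distances are
// preserved, but the absolute values stay small enough not to overflow.
inline void normalize(std::vector<AttoThread*>& threads) {
  if(threads.empty()) return;
  uint64_t minimum = threads[0]->clock;
  for(auto* thread : threads) minimum = std::min(minimum, thread->clock);
  for(auto* thread : threads) thread->clock -= minimum;
}

// Attoseconds lost to truncation for each full second of emulated cycles.
inline uint64_t driftPerSecond(const AttoThread& thread) {
  return AttoThread::Second - thread.scalar * thread.frequency;
}
```

A uint64 holds only about 18 seconds' worth of attoseconds, which is why the periodic subtraction matters. Dividing a thread's scalar by its driftPerSecond gives a rough estimate of how many seconds pass before truncation adds up to one full cycle of that thread.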

The advantages are pretty immense. First, we obviously can now support
more complex relationships between threads. Second, we can build a
much more abstracted scheduler. All of libco is now abstracted away
completely, which may permit a state-machine / coroutine version of
Thread in the future. We've basically gone from this:

    auto SMP::step(uint clocks) -> void {
      clock += clocks * (uint64)cpu.frequency;
      dsp.clock -= clocks;
      if(dsp.clock < 0 && !scheduler.synchronizing()) co_switch(dsp.thread);
      if(clock >= 0 && !scheduler.synchronizing()) co_switch(cpu.thread);
    }

To this:

    auto SMP::step(uint clocks) -> void {
      Thread::step(clocks);
      synchronize(dsp);
      synchronize(cpu);
    }

As you can see, we don't have to do multiple clock adjustments anymore.
This is a huge win for the SNES CPU, which had to update the SMP, DSP, all
peripherals and all coprocessors. Likewise, we don't have to synchronize
all coprocessors when one runs; now we can just synchronize the active
one to the CPU.
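
To make the step/synchronize pattern concrete, here is a minimal sketch. It assumes that synchronizing means "let the other thread run until its clock is no longer behind ours". The SyncThread type is hypothetical, and a plain loop stands in for the libco context switch that the real scheduler performs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a thread with a shared-unit clock (not higan's API).
struct SyncThread {
  uint64_t scalar;         // time units (e.g. attoseconds) per cycle
  uint64_t clock = 0;      // elapsed time in shared units
  uint64_t cyclesRun = 0;  // how many of its own cycles this thread has executed

  explicit SyncThread(uint64_t unitsPerCycle) : scalar(unitsPerCycle) {
    assert(unitsPerCycle > 0);
  }

  // Each thread only ever adjusts its own clock.
  void step(uint64_t cycles) {
    clock += cycles * scalar;
    cyclesRun += cycles;
  }

  // Assumption: in place of switching coroutines, run the other thread one
  // cycle at a time until it has caught up with this thread's clock.
  void synchronize(SyncThread& other) const {
    while(other.clock < clock) other.step(1);
  }
};
```

Because every clock is expressed in the same unit, a thread can be compared against any number of peers without keeping a separate counter per pair.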

Third, when changing the frequencies of threads (think SGB speed setting
modes, GBC double-speed mode, etc), it no longer causes the "int64
clock" value to be erroneous.

Fourth, this results in a fairly decent speedup, mostly across the
board. Aside from the GBA being mostly a wash (for unknown reasons),
it's about an 8% - 12% speedup in every other emulation core.

Now, all of this said ... this was an unbelievably massive change, so
... you know what that means >_> If anyone can help test all types of
SNES coprocessors, and some other system games, it'd be appreciated.

----

Lastly, we have a bitchin' new about screen. It unfortunately adds
~200KiB onto the binary size, because the PNG->C++ header file
transformation doesn't compress very well, and I want to keep the
original resource files in with the higan archive. I might try some
things to work around this file size increase in the future, but for now
... yeah, slightly larger archive sizes, sorry.

The logo's a bit busted on Windows (the Label control's background
transparency and alignment settings aren't working), but works well on
GTK. I'll have to fix Windows before the next official release. For now,
look on my Twitter feed if you want to see what it's supposed to look
like.

----

EDIT: forgot about ICD2::Enter. It's doing some weird inverse
run-to-save thing that I need to implement support for somehow. So, save
states on the SGB core probably won't work with this WIP.
2016-07-30 13:56:12 +10:00

auto CPU::readAPU(uint24 addr, uint8 data) -> uint8 {
  synchronize(smp);
  return smp.readPort(addr.bits(0,1));
}

auto CPU::readCPU(uint24 addr, uint8 data) -> uint8 {
  switch((uint16)addr) {

  //WMDATA
  case 0x2180: {
    return bus.read(0x7e0000 | io.wramAddress++, r.mdr);
  }

  //JOYSER0
  //7-2 = MDR
  //1-0 = Joypad serial data
  case 0x4016: {
    uint8 v = r.mdr & 0xfc;
    v |= SuperFamicom::peripherals.controllerPort1->data();
    return v;
  }

  //JOYSER1
  case 0x4017: {
    //7-5 = MDR
    //4-2 = Always 1 (pins are connected to GND)
    //1-0 = Joypad serial data
    uint8 v = (r.mdr & 0xe0) | 0x1c;
    v |= SuperFamicom::peripherals.controllerPort2->data();
    return v;
  }

  //RDNMI
  case 0x4210: {
    //7   = NMI acknowledge
    //6-4 = MDR
    //3-0 = CPU (5a22) version
    uint8 v = (r.mdr & 0x70);
    v |= (uint8)(rdnmi()) << 7;
    v |= (version & 0x0f);
    return v;
  }

  //TIMEUP
  case 0x4211: {
    //7   = IRQ acknowledge
    //6-0 = MDR
    uint8 v = (r.mdr & 0x7f);
    v |= (uint8)(timeup()) << 7;
    return v;
  }

  //HVBJOY
  case 0x4212: {
    //7   = VBLANK acknowledge
    //6   = HBLANK acknowledge
    //5-1 = MDR
    //0   = JOYPAD acknowledge
    uint8 v = (r.mdr & 0x3e);
    if(status.autoJoypadActive) v |= 0x01;
    if(hcounter() <= 2 || hcounter() >= 1096) v |= 0x40;  //hblank
    if(vcounter() >= ppu.vdisp()) v |= 0x80;  //vblank
    return v;
  }

  //RDIO
  case 0x4213: {
    return io.pio;
  }

  //RDDIVL
  case 0x4214: {
    return io.rddiv.byte(0);
  }

  //RDDIVH
  case 0x4215: {
    return io.rddiv.byte(1);
  }

  //RDMPYL
  case 0x4216: {
    return io.rdmpy.byte(0);
  }

  //RDMPYH
  case 0x4217: {
    return io.rdmpy.byte(1);
  }

  case 0x4218: return io.joy1.byte(0);  //JOY1L
  case 0x4219: return io.joy1.byte(1);  //JOY1H
  case 0x421a: return io.joy2.byte(0);  //JOY2L
  case 0x421b: return io.joy2.byte(1);  //JOY2H
  case 0x421c: return io.joy3.byte(0);  //JOY3L
  case 0x421d: return io.joy3.byte(1);  //JOY3H
  case 0x421e: return io.joy4.byte(0);  //JOY4L
  case 0x421f: return io.joy4.byte(1);  //JOY4H

  }

  return data;
}

auto CPU::readDMA(uint24 addr, uint8 data) -> uint8 {
  auto& channel = this->channel[addr.bits(4,6)];

  switch(addr & 0xff0f) {

  //DMAPx
  case 0x4300: return (
    channel.transferMode    << 0
  | channel.fixedTransfer   << 3
  | channel.reverseTransfer << 4
  | channel.unused          << 5
  | channel.indirect        << 6
  | channel.direction       << 7
  );

  //BBADx
  case 0x4301: return channel.targetAddress;

  //A1TxL
  case 0x4302: return channel.sourceAddress >> 0;

  //A1TxH
  case 0x4303: return channel.sourceAddress >> 8;

  //A1Bx
  case 0x4304: return channel.sourceBank;

  //DASxL -- union { uint16 transferSize; uint16 indirectAddress; };
  case 0x4305: return channel.transferSize.byte(0);

  //DASxH -- union { uint16 transferSize; uint16 indirectAddress; };
  case 0x4306: return channel.transferSize.byte(1);

  //DASBx
  case 0x4307: return channel.indirectBank;

  //A2AxL
  case 0x4308: return channel.hdmaAddress.byte(0);

  //A2AxH
  case 0x4309: return channel.hdmaAddress.byte(1);

  //NTRLx
  case 0x430a: return channel.lineCounter;

  //???
  case 0x430b:
  case 0x430f: return channel.unknown;

  }

  return data;
}

auto CPU::writeAPU(uint24 addr, uint8 data) -> void {
  synchronize(smp);
  return writePort(addr.bits(0,1), data);
}

auto CPU::writeCPU(uint24 addr, uint8 data) -> void {
  switch((uint16)addr) {

  //WMDATA
  case 0x2180: {
    return bus.write(0x7e0000 | io.wramAddress++, data);
  }

  case 0x2181: io.wramAddress.bits( 0, 7) = data;        return;  //WMADDL
  case 0x2182: io.wramAddress.bits( 8,15) = data;        return;  //WMADDM
  case 0x2183: io.wramAddress.bit (16   ) = data.bit(0); return;  //WMADDH

  //JOYSER0
  case 0x4016: {
    //bit 0 is shared between JOYSER0 and JOYSER1, therefore
    //strobing $4016.d0 affects both controller port latches.
    //$4017 bit 0 writes are ignored.
    SuperFamicom::peripherals.controllerPort1->latch(data.bit(0));
    SuperFamicom::peripherals.controllerPort2->latch(data.bit(0));
    return;
  }

  //NMITIMEN
  case 0x4200: {
    io.autoJoypadPoll = data.bit(0);
    nmitimenUpdate(data);
    return;
  }

  //WRIO
  case 0x4201: {
    if(io.pio.bit(7) && !data.bit(7)) ppu.latchCounters();
    io.pio = data;
    return;
  }

  //WRMPYA
  case 0x4202: io.wrmpya = data; return;

  //WRMPYB
  case 0x4203: {
    io.rdmpy = 0;
    if(alu.mpyctr || alu.divctr) return;

    io.wrmpyb = data;
    io.rddiv = (io.wrmpyb << 8) | io.wrmpya;

    alu.mpyctr = 8;  //perform multiplication over the next eight cycles
    alu.shift = io.wrmpyb;
    return;
  }

  case 0x4204: { io.wrdiva.byte(0) = data; return; }  //WRDIVL
  case 0x4205: { io.wrdiva.byte(1) = data; return; }  //WRDIVH

  //WRDIVB
  case 0x4206: {
    io.rdmpy = io.wrdiva;
    if(alu.mpyctr || alu.divctr) return;

    io.wrdivb = data;

    alu.divctr = 16;  //perform division over the next sixteen cycles
    alu.shift = io.wrdivb << 16;
    return;
  }

  case 0x4207: io.hirqPos.bits(0,7) = data;        return;  //HTIMEL
  case 0x4208: io.hirqPos.bit (8  ) = data.bit(0); return;  //HTIMEH

  case 0x4209: io.virqPos.bits(0,7) = data;        return;  //VTIMEL
  case 0x420a: io.virqPos.bit (8  ) = data.bit(0); return;  //VTIMEH

  //DMAEN
  case 0x420b: {
    for(auto n : range(8)) channel[n].dmaEnabled = data.bit(n);
    if(data) status.dmaPending = true;
    return;
  }

  //HDMAEN
  case 0x420c: {
    for(auto n : range(8)) channel[n].hdmaEnabled = data.bit(n);
    return;
  }

  //MEMSEL
  case 0x420d: {
    io.romSpeed = data.bit(0) ? 6 : 8;
    return;
  }

  }
}

auto CPU::writeDMA(uint24 addr, uint8 data) -> void {
  auto& channel = this->channel[addr.bits(4,6)];

  switch(addr & 0xff0f) {

  //DMAPx
  case 0x4300: {
    channel.transferMode    = data.bits(0,2);
    channel.fixedTransfer   = data.bit (3);
    channel.reverseTransfer = data.bit (4);
    channel.unused          = data.bit (5);
    channel.indirect        = data.bit (6);
    channel.direction       = data.bit (7);
    return;
  }

  //BBADx
  case 0x4301: channel.targetAddress = data; return;

  //A1TxL
  case 0x4302: channel.sourceAddress.byte(0) = data; return;

  //A1TxH
  case 0x4303: channel.sourceAddress.byte(1) = data; return;

  //A1Bx
  case 0x4304: channel.sourceBank = data; return;

  //DASxL -- union { uint16 transferSize; uint16 indirectAddress; };
  case 0x4305: channel.transferSize.byte(0) = data; return;

  //DASxH -- union { uint16 transferSize; uint16 indirectAddress; };
  case 0x4306: channel.transferSize.byte(1) = data; return;

  //DASBx
  case 0x4307: channel.indirectBank = data; return;

  //A2AxL
  case 0x4308: channel.hdmaAddress.byte(0) = data; return;

  //A2AxH
  case 0x4309: channel.hdmaAddress.byte(1) = data; return;

  //NTRLx
  case 0x430a: channel.lineCounter = data; return;

  //???
  case 0x430b:
  case 0x430f: channel.unknown = data; return;

  }
}
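
The DMAPx cases in readDMA and writeDMA mirror each other: a write splits the byte into six fields, and a read reassembles them at the same bit positions. The standalone sketch below checks that round trip using standard integer types in place of higan's uint8 and bit-range helpers. The DmaControl struct and both function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical plain-struct mirror of the per-channel DMAPx fields.
struct DmaControl {
  uint8_t transferMode;    // bits 0-2
  bool    fixedTransfer;   // bit 3
  bool    reverseTransfer; // bit 4
  bool    unused;          // bit 5
  bool    indirect;        // bit 6
  bool    direction;       // bit 7
};

// Split a written DMAPx byte into its fields (same layout as writeDMA).
inline DmaControl unpackDMAP(uint8_t data) {
  DmaControl c;
  c.transferMode    = data & 0x07;
  c.fixedTransfer   = (data >> 3) & 1;
  c.reverseTransfer = (data >> 4) & 1;
  c.unused          = (data >> 5) & 1;
  c.indirect        = (data >> 6) & 1;
  c.direction       = (data >> 7) & 1;
  return c;
}

// Reassemble the fields into the byte a DMAPx read returns (as in readDMA).
inline uint8_t packDMAP(const DmaControl& c) {
  assert(c.transferMode < 8);
  return uint8_t(
    (c.transferMode    << 0)
  | (c.fixedTransfer   << 3)
  | (c.reverseTransfer << 4)
  | (c.unused          << 5)
  | (c.indirect        << 6)
  | (c.direction       << 7)
  );
}
```

Since every bit of the byte, including the otherwise unused bit 5, is stored in some field, packing an unpacked value returns the original byte for all 256 inputs.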