byuu says:
Changelog:
- HALT waits 16 cycles before testing IRQs instead of 1 (probably less
precise, but provides a massive speedup) [we will need to work on this
later]
- MMIO regs for CPU/PPU simplified by combining array accesses
- custom VRAM/PRAM/OAM read/write functions that emulate 8->16-bit
writes
- 16-bit PRAM data (decent speedup)
- emulated memory access speed (but don't handle non-sequential
penalties or PPU access penalties yet) [amazingly, doesn't help speed
at all]
- misc. code cleanups
For this WIP, FPS for Mr. Driller 2 went from 88fps to 172fps.
Compatibility should be unchanged. Timers are still an interesting
avenue to increase performance, but will be very tough to handle the
16MHz timers with eg a period of 65535 (overflow every single tick.) And
that's basically the last major speed boost we'll be able to get.
Blending and windowing is going to hurt performance, but it remains to
be seen how much.
byuu says:
Changelog:
- added FIFO buffer emulation (with DMA and all that jazz) [Cydrak]
- fixed timers and vcounter assign [Cydrak]
- emulated EEPROM (you have to change size manually for 14-bit mode, we
need a database badly now) [SMA runs now]
- removed OAM array, now decoding directly to struct Object {} [128] and
ObjectParam {} [32] (faster this way)
- check forceblank (still doesn't remove all garble between transitions,
though??)
- lots of other stuff
Delete your settings.cfg, or manually change frequencyGBA to 32768, or
bad things will happen (this may change back to 256KHz-4MHz later.)
15 of 16 games are fully playable now, and look and sound great.
The major missing detail right now is PPU blending support, and we
really need to optimize the hell out of the code.
byuu says:
Merged Cydrak's r17c changes:
- BG affine mode added
- BG bitmap mode added
- OBJ affine mode added
- fixed IRQ bug in THUMB mode (fixed almost every game)
- timers added (broke almost every game, whee.)
Cydrak is absolutely amazingly awesome and patient. This really wouldn't
be happening without him.
Also fixed some things from my end, including greatly improved sprite
priorities, and a much better priority sorter. Mr. Driller looks a lot
better now.
byuu says:
Fixed the r15 mask per Cydrak.
Added DMA support (immediate + Vblank + Hblank + HDMA) with IRQ support.
Basically only missing FIFO reload mode for the APU on channel 2.
Added background linear renderer (tilemap mode.)
Added really inefficient pixel priority selector, so that all BGs+OBJ
could be visible onscreen at the same time.
As a result of the above:
* Mr. Driller is our first fully playable game
* Bakunetsu Dodge Ball Fighters is also fully playable
* Pinobee no Daibouken is also fully playable
Most games (15 of 16 tested) are now showing *something*, many things
look really really good in fact.
Absolutely essential missing components:
- APU
- CPU timers and their interrupts
- DMA FIFO mode
- OBJ affine mode
- BG affine mode
- BG bitmap mode
- PPU windows (BG and OBJ)
- PPU mosaic
- PPU blending modes
- SRAM / EEPROM (going to rely on a database, not heuristics. Homebrew
will require a manifest file.)
byuu says:
Fixed aforementioned issues.
[From a previous post:
- mul was using r(d) instead of r(n) for accumulate.
- mull didn't remove c/v clear.
- APU register mask was broken, so SOUNDBIAS was reading out wrong.
- APU was only mapping 0x088 and not 0x089 as well.
- Halfword reads in CPU+PPU+APU were all reading from the low address
each time.]
All CPU+PPU registers are now hooked up (not that they do anything.)
SOUNDBIAS for APU was hooked up, got tired of working on it for the rest :P
I recall from the GB APU that you can't just assign values for the APU
MMIO regs. They do odd reload things as well.
Also, was using MMIO read code like this:
return (
(flaga << 0)
|| (flagb << 1)
|| (flagc << 2)
);
Logical or doesn't work so well with building flags :P
Bad habit from how I split multiple conditionals across several lines.
So ... r14 is basically what r13 should have been yesterday, delaying my
schedule by yet another day :(
byuu says:
Contains all of Cydrak's fixes, sans PPU.
On the PPU front, I've hooked up 100% of read and write registers.
All three DISPSTAT IRQs (Vblank, Hblank, Vcoincidence) are connected now
as well.
Super Mario Advance now runs without *appearing* to crash, although it's
hard to tell since I have no video or sound :P
ARM Wrestler is known to run, as is the BIOS.
byuu says:
Enough to get through the BIOS and into cartridge ROM.
I am a bit annoyed that I was basically told that the GBA PPU wasn't
that bad. Sprites are a clusterfuck, easily worse than Mode7, docs don't
even begin to explain them in enough detail.
This is going to be fun.
byuu says:
Added all of the above fixes and changes. [A lot of individual fixes for
the ARM core from Cydrak - Ed.] Also new is pipeline_decode() to fetch
data, and IME/IE/IF support, and an ARM::processor.irqline flag that
triggers IRQs at 0x18. Only Vblank is hooked up, which is what SWI 4 was
waiting on previously.
I'm sure my interrupt support is horribly broken and wrong. I was never
able to really figure out IE/IF on the Game Boy, so there's no question
this is even worse.
It's now going crazy and writing 0 to IE forever now after the Vblank
IRQ triggers.
byuu says:
Changelog:
- fixed THUMB hi immediate reads (immediate * 4)
- cartridge is properly mirrored to 32MB (eg 12mbit repeats as
lo8+hi4+hi4+lo8+hi4+hi4) [so it's a bit slower than a standard memcpy
fill]
- added ARM - load/store halfword register offset
- added ARM - load/store halfword immediate offset
- added ARM - load signed halfword/byte register offset
- added ARM - load signed halfword/byte immediate offset
- added decode() function to make opcode bit testing a lot clearer
(didn't apply it to the debugger yet)
All ARMv4M and all THUMBv4 instructions should now be implemented.
Although I'm not sure if my implementations of the new instructions are
correct.
byuu says:
Split apart necdsp: core is now in processor/upd96050 (wish I had
a better name for it, but there's no family name that is maskable.)
I would like to support the uPD7720 in the core as well, just for
completeness' sake, but I'll have to modify the decoder to drop one bit
from each mode.
So ... I'll do that later. Worst part is even if I do, I won't be able
to test it :(
Added all of Cydrak's changes. I also simplified LDMIA/STMIA and
PUSH/POP by merging the outer loops.
Probably infinitesimally slower, but less code is nicer. Maybe GCC
optimization will expand it, who knows.
byuu says:
Added some more ARM opcodes, hooked up MMIO. Bind it with mmio[(addr
000-3ff)] = this; inside CPU/PPU/APU, goes to read(), write().
Also moved the Hitachi HG51B core to processor/, and split it apart from
the snes/chip/hitachidsp implementation.
This one actually worked really well. Very clean split between MMIO/DMA
and the processor core. I may move a more generic DMA function inside
the core, not sure yet.
I still believe the HG51B169 to be a variant of the HG51BS family, but
given they're meant to be incredibly flexible microcontrollers, it's
possible that each variant gets its own instruction set.
So, who knows. We'll worry about it if we ever find another HG51B DSP,
I guess.
GBA BIOS is constantly reading from 04000300, but it never writes. If
I return prng()&1, I can get it to proceed until it hits a bad opcode
(stc opcode, which the GBA lacks a coprocessor so ... bad codepath.)
Without it, it just reads that register forever and keeps resetting the
system, or something ...
I guess we're going to have to try and get ARMwrestler working, because
the BIOS seems to need too much emulation code to do anything at all.
There was a "v087r07pre" release that I unfortunately missed.
byuu says (about v087r07pre):
Creates the bsnes/processor folder. This has a shared ARM core there
which both the GBA and ST018 inherit.
There are going to be separate decoders, and revision-specific checks,
to support the differences between v3+.
In the future, I also want to move the other processor cores here:
- GBZ80 (GB, GBC)
- 65816 (SNES CPU, SA-1)
- NEC uPD (7725, 96050, maybe 7720 just for fun)
- Hitachi HG51B169
- SuperFX
- SPC700
- 65(C?)02
Basically, the GBA/ST018 forces my hand to start coding a bit more like
a multi-system emulator.
Right now, the ST018 is broken. Hence the pre. Apparently the GBA core
being used now has some bugs. So this'll be a nice way to stress-test
the GBA core a bit before we make it to ARMwrestler.
byuu says (about v087r07):
Both snes/chip/armdsp and gba/cpu use processor/arm now.
Fixed THUMB to execute the BL prefix and suffix separately. I can now
get the GBA BIOS stuck in some kind of infinite loop. Hooray ...
I guess?
byuu says:
Implemented all of the ARMv3 instructions, and the bx rm instruction as
well. Already hit THUMB mode right at the start of the BIOS, sigh.
Implemented the first THUMB instruction to get that rolling. Also tried
to support the S flag to LDM/STM, but not sure how successful I was.
byuu says:
GBA stuff re-added. Only thing missing that was there before is the ARM
branch opcode.
Since we're going to be staring at it for a very long time, I added
a more interesting test video pattern.
Went from 6fps to 912fps. Amazing what being able to divide can do for
a frame rate.
byuu says:
Fixing the PPU stepping increased FPS to 250. Promising, at least, since
the ARM core is still severely overclocked.
However, I reverted back to r02. This one patches gameboy/ and GameBoy::
to gb/ and GB:: and that's it.
Sorry, I just couldn't shake this bad feeling about the code. There were
some poorly hacked-together constructs. I'd rather just redo two days of
work than feel bad about the codebase for the next several years. Going
to attempt the GBA bridge again. Third time's a charm, I suppose (there
was a pre-r03 WIP I abandoned as well.)
This isn't unprecedented, GB core took a few attempts like this as well.
byuu says:
I wanted to keep this a secret, but unlike other recent additions, this
will easily take several weeks, maybe months, to show anything.
Assuming I can even pull it off. Nothing technically overwhelming here,
I'm more worried about the near-impossibility of debugging the CPU.