It's taken much longer than I would have liked, but I have the redesigned CPU mostly working now.
The CPU is running at 48MHz, and should be about 40x C64 speed, although the exact figure is likely to change.
The reason it isn't 48x is that I have had to put some wait-states in a few places to make the timing work.
Reading from anything other than fast RAM incurs one extra cycle, which means reading from IO currently has a two-cycle penalty. Fast RAM, as the name suggests, has no wait states. Writing to IO also has no wait states.
Also, wherever the CPU makes a memory access whose address or data depends on whatever has just been read from memory, the access has had to be split into two cycles. This mostly affects the Read-Modify-Write (RMW) class of instructions, like INC, DEC, ASL and ROR. In effect, we have a dummy cycle similar to what the real 6502 has.
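The wait-state rules above can be summarised as a small cost model. The Python sketch below is illustrative only and is not taken from the actual design: the region names, the helper functions, and the reading of the IO penalty as two wait states are all assumptions. The 1.2 average cycles per access in the usage note is a hypothetical figure chosen only to show how 48 MHz could come out at roughly 40x.

```python
# Wait states per access, as described in the post. Combinations the post
# does not mention (e.g. writes to slow non-IO memory) are deliberately absent.
# Region names are hypothetical labels, not identifiers from the real design.
WAIT_STATES = {
    ("fastram", "read"): 0,
    ("fastram", "write"): 0,
    ("io", "read"): 2,      # assumes "two-cycle penalty" means two wait states
    ("io", "write"): 0,
    ("other", "read"): 1,
}


def access_cycles(region, kind):
    """Cycles for one memory access: one base cycle plus any wait states."""
    return 1 + WAIT_STATES[(region, kind)]


def rmw_cycles(region):
    """Memory cycles for the read-modify-write part of an instruction.

    Simplified model: the read, one dummy cycle while the new value is
    computed, then the write. Opcode and operand fetches are not counted.
    """
    return access_cycles(region, "read") + 1 + access_cycles(region, "write")


def effective_speedup(clock_mhz, avg_cycles_per_access):
    """Speed relative to a nominal 1 MHz machine with single-cycle accesses."""
    return clock_mhz / avg_cycles_per_access
```

Under these assumptions, `rmw_cycles("fastram")` gives 3 cycles and `rmw_cycles("io")` gives 5, while `effective_speedup(48, 1.2)` comes out at about 40.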
Unfortunately, it isn't very practical to perform the dummy write that the 6502 does, so I will need to add an extra cycle for $D019 so that DEC $D019 and variations work for clearing interrupts.
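To see why the dummy write matters here, recall that interrupt bits in $D019 are acknowledged by writing a 1 to them, and that a real 6502 RMW instruction writes the unmodified value back before writing the final result. The Python sketch below is a toy model under those two facts; the `IrqLatch` class and both helper functions are hypothetical, and the model tracks only the four interrupt source bits rather than the full register.

```python
class IrqLatch:
    """Toy model of a $D019-style latch: writing 1 to a bit clears it."""

    def __init__(self, pending):
        self.pending = pending & 0x0F  # only the four source bits are modelled

    def read(self):
        return self.pending

    def write(self, value):
        # Every bit written as 1 is acknowledged (cleared); 0 bits are untouched.
        self.pending &= ~value & 0x0F


def dec_6502(reg):
    """DEC as on a real 6502: read, dummy write of the old value, final write."""
    old = reg.read()
    reg.write(old)                # dummy write acknowledges every pending bit
    reg.write((old - 1) & 0xFF)   # final write of the decremented value


def dec_single_write(reg):
    """DEC with only the final write, i.e. with no dummy write."""
    old = reg.read()
    reg.write((old - 1) & 0xFF)


with_dummy = IrqLatch(0x01)       # raster interrupt pending
dec_6502(with_dummy)

without_dummy = IrqLatch(0x01)
dec_single_write(without_dummy)
```

In this model `with_dummy.pending` ends up as 0, while `without_dummy.pending` is still 1, because the decremented value no longer has the pending bit set. That is the case any special handling for $D019 has to cover.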
I have some ideas for caching the top of the stack so that RTS can execute in a single cycle, which will provide a solid boost for many programs, but that's some way down the track, because I need to get the CPU working properly first.
The screenshot from simulation below shows that it can run the kickstart ROM and get as far as trying to find the SD card:
The astute observer will notice that the top line of the display is showing the wrong contents. This is because the bad-line for that row of characters had already occurred. If I leave the simulation long enough that it can draw a second frame, then it should show the kickstart banner. As it happens, the simulation managed to draw another frame while I was writing this post, so you can see the real version below:
I need to shake down the remaining bugs like this in the VIC-IV that have crept in with the substantial rework that it has suffered while I have been doing the CPU. Both efforts, CPU and VIC-IV rework, are really targeted at making the whole thing use much less of the FPGA so that I have enough space to implement sprites and the other missing functionality.
In any case, the fact that it can set the video mode, clear the screen, and decide that it is looking for the SD card shows that an awful lot of the CPU is actually working. There are some bugs, however.
First, I haven't finished implementing BRK or interrupts of any sort.
Second, I haven't finished implementing the PHW (push word, either immediate or absolute) instruction. It won't be hard, but it just hasn't hit the priority queue yet, and it is a little weird, since in the CPU the two addressing modes will likely have very different implementations.
Third, there are some weird bugs with accessing IO.
The SD card controller and other IO functions provided by that module aren't mapping in the address space properly when run on the FPGA, even though they simulate fine.
Also, running the following little routine to draw a rough vertical raster bar locks up as soon as the accumulator has the value $F0. Once that happens, the Z flag stays perpetually set, and so nothing more gets drawn.
loop LDA $D012
It works fine, however, if I put a NOP between the CMP and the BEQ. So there is something timing-dependent going on. What is weird is that without the NOP the bug manifests even if the CPU is in single-stepping mode.
Reworking the VIC and CPU at the same time hasn't been the most fun, because the design has gone backwards from working to a seething mess. But it is now finally starting to come back together, and should hopefully soon catch up with where the old, excessively large design got to. Then comes the fun part of adding sprites and other goodies, but that will still be a little while off.