I realised recently that I needed to reimplement the CPU so that it wasn't taking 70% - 80% of what is really a very large FPGA, so that I could fit the extra bits and pieces in that I have been planning.
Ideally, I would include a 1541 and a complete C64 in the FPGA for compatibility. This would free me from having to make the C65/C65GS mode too compatible with the C64. Instead, I would just transfer control of a running program from one to the other when cycle-exact operation or illegal opcodes were needed.
But in the shorter term, I wanted to be able to put a couple of SIDs in, and perhaps also improve the synthesis time a bit.
So I set about redoing the CPU, and the process has been one frustration after another.
My gut feeling was, and remains, that 96MHz should be quite achievable on this FPGA, although the complexity of the memory system of a C65 makes that more difficult than it first seems.
I was keeping a careful eye on the maximum clock speed in the synthesis reports, and realised that for the time being at least, 64MHz would be easier. I could then optimise my way back up.
After bashing my head against a lot of bugs and silly design decisions on my part, I started making some progress and got some instructions running.
But then, testing LDA/LDX/LDY/LDZ, I found that the value loaded into the register was often wrong, or came from the previous instruction.
After a lot of poking around, I finally realised that ISE was not constraining the CPU clock in the design. Late data was being used because the design had a real maximum clock speed of only about 28MHz, but was being run at 64MHz.
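To put rough numbers on that mismatch, here is a minimal sketch in Python that converts the two clock speeds into periods and shows how far the slowest path overshoots the clock. The function names are mine, purely for illustration, and the 28MHz figure is only the approximate value mentioned above, not something taken from a timing report.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(run_mhz, max_mhz):
    """Time left over (negative means too slow) when running at run_mhz
    a design whose slowest path only supports max_mhz."""
    return period_ns(run_mhz) - period_ns(max_mhz)


if __name__ == "__main__":
    print(f"64MHz period:      {period_ns(64):.3f} ns")
    print(f"28MHz period:      {period_ns(28):.3f} ns")
    print(f"slack at 64MHz:    {slack_ns(64, 28):.3f} ns")
```

A negative slack of this size means the slowest signals need more than two 64MHz clock periods to settle, which is consistent with registers picking up stale values from the previous instruction.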
I was a bit cranky with ISE for not realising that a clock created by dividing another clock is in fact a clock.
I eventually found a way to constrain the clock, discovered how horrible the timing situation was, and have been trying to fix it since.
At this stage it looks like 64MHz should be possible, although with the odd drive stage when executing instructions to get values or addresses ready to read or write in the following cycle. Exactly how much impact this will have is hard to estimate at this stage, but the CPU might be in drive states perhaps 20% - 25% of the time.
That means that the speed up compared with a stock C64 might end up around 64MHz * 0.75 / 1MHz = 48x. The result will be helped a little by the fact that many of the single byte instructions will execute in a single cycle. But it is really too early to say whether I will be able to make the CPU work at 64MHz, and exactly what the speed comparison will be. It will all depend on whether I can make actual forwards progress or whether I stay bogged down chasing my tail some more.
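As a rough sketch, the estimate above can be written out in Python for both ends of the 20% - 25% range. The function name is hypothetical, the stock C64 is assumed to run at exactly 1MHz, and the small gain from single-cycle single-byte instructions is not modelled.

```python
def estimated_speedup(clock_mhz, drive_fraction, stock_mhz=1.0):
    """Effective speed-up over a stock machine when drive_fraction of the
    cycles are spent in drive states that do no useful work."""
    return clock_mhz * (1.0 - drive_fraction) / stock_mhz


if __name__ == "__main__":
    for drive in (0.20, 0.25):
        speedup = estimated_speedup(64, drive)
        print(f"{drive:.0%} drive states: about {speedup:.1f}x")
```

So the 48x figure is the pessimistic end of the range, with the optimistic end only a few percent higher.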