Tuesday, 1 July 2014

A frustrating fortnight of work reimplementing the CPU

I realised recently that I needed to reimplement the CPU so that it no longer takes 70% - 80% of what is really a very large FPGA, leaving room for the extra bits and pieces that I have been planning.

Ideally, I would include a 1541 and a complete C64 in the FPGA for compatibility.  This would help to free me from having to make the C65/C65GS mode too compatible with the C64. Instead, I would just transfer control of a running program from one to the other when cycle-exact operation or illegal opcodes were needed.

But in the shorter term, I wanted to be able to put a couple of SIDs in, and perhaps also improve the synthesis time a bit.

So I set about redoing the CPU, and the process has been one frustration after another.

My gut feeling was, and remains, that 96MHz should be achievable on this FPGA, although the complexity of the C65's memory system makes that more difficult than it first seems.

I was keeping a careful eye on the maximum clock speed in the synthesis reports, and realised that for the time being at least, 64MHz would be easier. I could then optimise my way back up.

After bashing my head against a lot of bugs and silly design decisions on my part, I started making some progress and got some instructions running.

But then, while testing LDA/LDX/LDY/LDZ, I found that the value loaded into the register was often wrong, or was left over from the previous instruction.

After a lot of poking around, I finally realised that ISE was not constraining the CPU clock in the design. Late data was being used because the design had a real maximum clock speed of only about 28MHz, but was being run at 64MHz.
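To put rough numbers on that mismatch, here is a small sketch that converts the two frequencies into clock periods. The 64MHz and 28MHz figures are the ones above; the function name is just for illustration, and treating the whole design as a single worst-case path is a simplification.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz

# Figures from the post: the clock the CPU was actually run at, and the
# real maximum clock speed once the design was properly constrained.
clock_period = period_ns(64)      # time available per cycle
required_period = period_ns(28)   # time the slowest path actually needs

# Negative slack means the data arrives after the clock edge that samples it,
# so a register can end up holding a stale value.
slack = clock_period - required_period

print(f"available per cycle: {clock_period:.3f} ns")
print(f"needed by slowest path: {required_period:.3f} ns")
print(f"slack: {slack:.3f} ns")
```

With only about 15.6 ns available and roughly 35.7 ns needed, the slowest paths were missing the clock edge by around 20 ns, which fits with registers picking up a value from the previous instruction.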

I was a bit cranky with ISE for not realising that a clock created by dividing another clock is in fact a clock.

I eventually found a way to constrain the clock, discovered how horrible the timing situation was, and have been trying to fix it since.

At this stage it looks like 64MHz should be possible, although with the odd drive stage when executing instructions, to get values or addresses ready to read or write in the following cycle.  Exactly how much impact this will have is hard to estimate at this stage, but the CPU might be in drive states perhaps 20% - 25% of the time.

That means that the speed up compared with a stock C64 might end up around 64MHz * 0.75 / 1MHz = 48x.  The result will be helped a little by the fact that many of the single byte instructions will execute in a single cycle. But it is really too early to say whether I will be able to make the CPU work at 64MHz, and exactly what the speed comparison will be.  It will all depend on whether I can make actual forwards progress or whether I stay bogged down chasing my tail some more.
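The back-of-envelope estimate above can be written out as a tiny sketch. It assumes, as the estimate does, that a drive-state cycle does no useful work and that a stock C64 runs at a nominal 1MHz. The function name is made up for illustration, and it ignores the single-cycle instructions mentioned above.

```python
def estimated_speedup(clock_mhz, drive_fraction, baseline_mhz=1.0):
    """Rough speed-up over the baseline, treating drive-state cycles as lost."""
    return clock_mhz * (1.0 - drive_fraction) / baseline_mhz

# The two ends of the 20% - 25% drive-state guess at a 64MHz CPU clock.
for fraction in (0.20, 0.25):
    speedup = estimated_speedup(64, fraction)
    print(f"{fraction:.0%} drive states: about {speedup:.1f}x a stock C64")
```

So the guess of 20% - 25% drive states corresponds to somewhere between about 48x and 51x, before counting any gain from single-cycle instructions.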

6 comments:

  1. Thanks for your hard work Paul !

    I hope this won't be for you just an exercise in frustration, but also a path to a better knowledge of FPGAs :)

    1. I am not sure that it is actually possible to separate the two ;)

  2. Impressive! I'll be following this closely. :)

    Do I understand correctly that it will be C64 software compatible and able to run C64 software in the C65 mode? So that one could interoperate software, or even expand on software written for the C64 and use it with extended abilities?

    1. So like the C65, you can access all new features from C64 mode. So it is very easy, for example, to use new video modes or the faster CPU from C64 mode. In other words, it won't be like the C128 that prevented accessing some new features from C64 mode. I have already patched Turbo Assembler to run at 3.5MHz, for example, and also written a little 80-column text editor.

  3. Wow ! I didn't get this feature !

    This project is even greater than I thought :)

    1. Now I just need to get the jolly CPU redesign working. Everything looks like it should work, but it just isn't. Very frustrating.
