Sunday 18 February 2024

More cartridge port fixes for the MEGA65

Some cartridges are still not working correctly on the MEGA65.  These issues seem to not be restricted to the new R5/R6 boards.  For example, Tiny Quest doesn't work reliably.  Thanks to Olivier, I now have this cartridge with me here at home, and have been investigating it for a couple of weeks on and off now.

The key issue that I have discovered is that when I implemented the MEGA65's cartridge port controller, I was not allowing a "hold time" after the falling edge of the 1MHz clock on the cartridge port.  This means that in cartridges with multiple logic chips, signal propagation delays can cause the wrong value to be written to a register in the cartridge.  This is the problem with the Tiny Quest cartridge.

Let's go through this from the ground up:

First, the 6502 datasheet gives us this timing diagram:


The important bits are DATA(READ) and DATA(WRITE) in particular.  We can see that hold times T_HR and T_HW are required after the Phi2 clock goes low.  This is to allow the logic in the CPU to accept and process the data.  The same applies to logic chips in a cartridge. We weren't implementing this hold time. According to the datasheet, these hold times should be between 60 and 150ns.

Now, the MEGA65 doesn't have a 6502 connected directly to the cartridge port bus, but cartridges are made with this kind of timing assumption in mind. Specifically, they are allowed to assume that the data lines will be valid for at least 60 ns after the relevant edge of the 1MHz clock on the cartridge port.

So let's take a look at the schematic of the Tiny Quest cartridge, which uses a HUCKY v1.03 64KB cartridge. The original description is in German, but it is also discussed in English here, including a reconstructed schematic:

The bit I am interested in primarily is the logic that listens for writes to $DE00, and selects which 8KB bank of the EPROM to use, and whether to disable the cartridge completely: Bits 0-2 are inverted bank select bits into the EPROM, and writing a 1 to bit 3 will cause the cartridge to completely disable itself until the /RESET line next goes low.  Here is the relevant part of the schematic enlarged:

If we focus even more on the logic for the cartridge control, it will be easier to read and to follow:

We can see that the 4 lowest data lines are connected to the D inputs of the 74LS173 chip (which is really just a latching buffer).  The outputs feed into the three 74LS04 inverters that go to the EPROM upper address bits (which we will ignore for now), and also directly to the /EXROM line -- this is the important and clever bit.

If the /EXROM line is low, the MEGA65 (or C64) knows that the cartridge is saying that it has an 8KB ROM at $8000-$9FFF.  The 74LS173 has all of its outputs low when the CLR line (pin 15) is high. This is why the /RESET line is fed through an inverter into pin 15 (called Mr in the schematic above), so that when /RESET is low, causing the computer to reset, it also resets the cartridge to be enabled, and selects bank 7 (because the bank select bits are all zero, but are fed through the inverters to make them all 1). So this ensures the cartridge is in a known specific state on power-on.

The 74LS173 is also always outputting its values, because the two /OE lines are tied to ground. However, it only updates the latched values when the clock on pin 7 goes high -- this is formed by inverting the 1MHz PHI2 clock from the MEGA65: That is, it will latch the value on the falling edge of a PHI2 clock.
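
To make the latch behaviour described above concrete, here is a minimal behavioural sketch of the bank register in VHDL. It is not gate-accurate and is not taken from the HUCKY schematic: the entity name and all port names are my own, and n_load in particular is a hypothetical stand-in for whatever decode logic detects a write to $DE00. What it does reflect from the description is the asynchronous clear from /RESET, the latching on the falling edge of PHI2, the inverted bank bits, and bit 3 driving /EXROM directly.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity bank_register_sketch is
  port (
    phi2    : in  std_logic;                     -- 1MHz clock from the cartridge port
    n_reset : in  std_logic;                     -- /RESET, active low
    n_load  : in  std_logic;                     -- hypothetical: low during a write to $DE00
    d       : in  std_logic_vector(3 downto 0);  -- lowest four data lines
    bank    : out std_logic_vector(2 downto 0);  -- upper EPROM address bits
    n_exrom : out std_logic                      -- /EXROM: low = cartridge enabled
  );
end entity bank_register_sketch;

architecture sketch of bank_register_sketch is
  signal q : std_logic_vector(3 downto 0) := (others => '0');
begin
  process (phi2, n_reset) is
  begin
    if n_reset = '0' then
      -- The inverted /RESET drives CLR: all outputs go low.
      q <= (others => '0');
    elsif falling_edge(phi2) then
      -- The chip clocks on the rising edge of inverted PHI2, which is
      -- the falling edge of PHI2 itself.
      if n_load = '0' then
        q <= d;
      end if;
    end if;
  end process;

  -- Bits 0-2 pass through inverters, so a cleared register selects bank 7.
  bank <= not q(2 downto 0);
  -- Bit 3 drives /EXROM directly: writing a 1 disables the cartridge.
  n_exrom <= q(3);
end architecture sketch;
```

In a model like this, d is sampled exactly at the falling edge of phi2. On the real cartridge the clock reaches the latch a little later, after passing through the inverter, which is why the data lines need to stay valid for a hold time after the edge.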

That has shown up the next bug I had: I was making the MEGA65's cartridge port hold the write data around the rising edge of PHI2, rather than the falling edge. So I'll fix that, make sure my tests pass, and then resynthesise.

This follows on work I had done earlier to completely refactor the cartridge port control logic, because it was quite a mess before.  I'll describe that while I wait for resynthesis to run, by annotating the VHDL:

The first part of the logic works out when the next edge of the 8MHz dotclock on the cartridge port should occur. It does this by adding a magic value to a 16 bit counter, and watching for when it overflows. This allows for quite accurate frequency generation. If there is no dotclock edge, then we do nothing and clear any strobe signals to the MEGA65's processor:

      ticker <= ('0'&ticker(15 downto 0)) + dotclock_increment;
      if ticker(16) = '0' then
        cart_access_read_strobe <= '0';
        cart_access_accept_strobe <= '0';

But if we have a dotclock edge, then we need to keep track of where we are in the 1MHz cycle: There are 8MHz/1MHz x 2 edges per cycle = 16 dotclock edges per 1MHz cycle, so we count from 0 to 15, to know where we are in the 1MHz clock cycle:

        -- Each phi2_ticker increment is 1/16th of a 1MHz clock cycle,
        -- so about 64ns.
        if phi2_ticker /= 15 then
          phi2_ticker <= phi2_ticker + 1;
        else
          phi2_ticker <= 0;
        end if;

We then generate the actual dotclock signal on the cartridge port by alternating between 0 and 1 every edge:

        -- Create the 8MHz dotclock signal
        case phi2_ticker is
          when 0 | 2 | 4 | 6 | 8 | 10 | 12 | 14 => cart_dotclock <= '1';
          when others => cart_dotclock <= '0';
        end case;
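
The fragments above are excerpts, so here is a self-contained sketch of the accumulator-overflow technique on its own. The entity and port names are hypothetical, and the default increment is my own assumption rather than the value used in the MEGA65 source: it assumes an 81MHz system clock and 16 million dotclock edges per second. Unlike the real code, it simply toggles the output on each edge rather than deriving it from phi2_ticker.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dotclock_sketch is
  generic (
    -- Assumed values, not taken from the MEGA65 source: an 81MHz system
    -- clock and 16 million edges per second (8MHz x 2 edges), giving
    -- 65536 * 16 / 81 = 12945 after integer division.
    DOTCLOCK_INCREMENT : natural := (65536 * 16) / 81
  );
  port (
    clock       : in  std_logic;
    edge_strobe : out std_logic;  -- high for one clock per dotclock edge
    dotclock    : out std_logic
  );
end entity dotclock_sketch;

architecture sketch of dotclock_sketch is
  signal ticker       : unsigned(16 downto 0) := (others => '0');
  signal dotclock_int : std_logic := '0';
begin
  process (clock) is
  begin
    if rising_edge(clock) then
      -- Discard the previous carry, then add; bit 16 is the new carry out.
      ticker <= ('0' & ticker(15 downto 0)) + DOTCLOCK_INCREMENT;
      -- A set carry means one dotclock edge is due on this clock.
      if ticker(16) = '1' then
        dotclock_int <= not dotclock_int;
      end if;
    end if;
  end process;

  edge_strobe <= ticker(16);
  dotclock    <= dotclock_int;
end architecture sketch;
```

Because the remainder stays in the low 16 bits after each overflow, the error does not accumulate: individual edges jitter by up to one system clock, but the long-run average rate stays very close to the target.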

Then we work out what to do at each stage during the 1MHz clock cycle.  The 6502 bus in the C64 has Phi1 and Phi2 halves of the clock, and can handle two separate things happening each 1MHz cycle staged in this way. Normally the CPU is using the Phi2 half, and the VIC-II the Phi1 half (except when the VIC-II steals some Phi2 cycles). Because the VIC-II can only read and not write, most peripherals will respond to a read asynchronously at any time, but will only process a write when the Phi2 clock goes low, i.e., when a falling edge is seen on the Phi2 clock -- and it is the Phi2 clock that is visible on the cartridge port.

In our state-machine, we have the low-half of the Phi2 clock first, and then have it high in the second half. Now, this is actually a bit of an approximation, as the Phi2 clock isn't actually high for 50% of the cycle, but a bit less than that if you look at the 6502 timing diagram I included above.  But there is no harm in having Phi2 stretch to a full 50% duty-cycle. Rather, it makes the timing a bit more relaxed for cartridges. So let's look at what we do at each of the 16 stages of a 1MHz clock cycle.

We use a case statement to select what to do. In the first stage (we count starting from 0), we set the Phi2 clock signal low, causing the negative edge, and just do some tidying up after any read or write request:

        case phi2_ticker is
          when 0 =>
            cart_phi2 <= '0';
            if cart_read_in_progress='1' then
              complete_read_request := true;
            end if;
            cart_write_in_progress <= '0';

In the 2nd stage (#1) we do nothing at all, as we are just allowing an extra bit of time for the T_HW / T_HR.  By waiting 2 stages, each of which is about 63ns, we are waiting ~126ns, which is near the upper limit of what the 6502 timing diagram specifies, i.e., we are being as accommodating as possible:

          when 1 =>
            -- Allow longer hold time for writes

It is only in the 3rd stage (#2) where we actually release all the cartridge port lines, i.e., stop presenting the address and data value that we might have presented during a write operation. We also release all the /ROML, /ROMH, /IO1 and /IO2 lines as part of this. If we have a new read request, we can accept it now, as well. But we can't accept a write, because Phi2 is low, and the write needs to end on a falling edge, and you can't fall off the floor (handling write requests happens later):

          when 2 =>
            -- Release key bus lines after a short hold time, and start any new
            -- access we have under way, but only if we don't already have an
            -- access happening.
            if cart_read_in_progress = '0' and cart_write_in_progress='0' then
              do_release_lines := true;
              commence_any_pending_read_request := true;
            end if;

Then for the rest of the half-cycle we do nothing at all, i.e., just continue to hold any fresh read request on the bus to give the cartridge time to process the signals:

          when 3 | 4 | 5 | 6 | 7  =>
            -- We are in the middle of the low-half of a PHI2 cycle.
            -- We are either continuing a read or write, or idle.
            -- We don't start doing anything else.
            -- We _could_ start a read now, and satisfy all timing by waiting
            -- the correct number of phi2_ticker ticks, and thus get data back
            -- to the CPU a few cycles earlier, but the benefit is relatively
            -- small, and it might not be compatible with some cartridges.

It is only after this, when we are exactly half-way through, that we do anything different. First, we need to set Phi2 to high. We also tidy up and conclude any read request we had running:

          when 8 =>
            -- Begin high-half of PHI2
            cart_phi2 <= '1';
            do_release_lines := true;
            if cart_read_in_progress='1' then
              complete_read_request := true;
            end if;

Then, similarly to the first half of the cycle, we can start a new read request. But we can also start a write request, because we are in the high-half of the Phi2 cycle, and thus there will be a falling edge to mark the write:

          when 9 | 10 =>
            if cart_read_in_progress = '0' and cart_write_in_progress='0' then
              do_release_lines := true;
              commence_any_pending_read_request := true;
              commence_any_pending_write_request := true;
            end if;

Then just as in the first half, we just hang around to give the cartridge time to process things:

          when 11 | 12 | 13 | 14 =>
            -- We are in the middle of the high-half of a PHI2 cycle.
            -- We are either continuing a read or write, or idle.
            -- We don't start doing anything else.
            -- We could in theory start a read, but not a write, as there
            -- would not be enough time before the falling edge of PHI2.
            -- But as for the low-half, we don't want to implement
            -- any really weird timing.

And finally, we just do some general house-keeping, like keeping track of what we are doing with /RESET etc:

          when 15 =>
            -- End of cycle: Check if we need to update /RESET
            -- We assert reset on cartridge port for 15 phi2 cycles to give
            -- cartridge time to reset.
            if (reset_counter = 1) and (reset='1') then
              reset_counter <= 0;
            elsif reset_counter /= 0 then
              reset_counter <= reset_counter - 1;
            elsif reset_counter = 0 then
              if (not_joystick_cartridge = '1' and force_joystick_cartridge='0') or (disable_joystick_cartridge='1') then
                cart_reset <= reset and (not cart_force_reset);
                cart_reset_int <= reset and (not cart_force_reset);
                if cart_reset_int = '0' then
                  report "Releasing RESET on cartridge port";
                end if;
              end if;
            end if;

And in many ways that's really all there is to it.  Each of those variables that I set to true triggers a block that does the required action, each of which is quite simple, and doesn't really need to be listed here at the moment.

Anyway, the resynthesis has finished, and it's still not working.  The problem I am seeing now is that the data lines are holding the value from a previous write, rather than the results of the read.  

I think what is happening here is that when I was setting the data lines to input, I was actually disabling input and output.  This was happening because the data lines go through a bidirectional buffer that has DIRection and /ENable lines. /EN has to be low to allow data to flow in any direction, and DIR has to be 1 for output and 0 for input. When I wanted to stop outputting I was setting DIR=0 and EN=1. That stops output alright, but it also stops input. As a result the FPGA pins were effectively isolated from the data lines, and any value previously output on those pins would persist in being visible for some time, until the charge on the FPGA pins was consumed. I've fixed that and am resynthesising it now.  
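
As a minimal sketch of the corrected buffer control, assuming only the DIR and /EN behaviour described above (the entity and signal names are hypothetical, not those in the MEGA65 source):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity data_buffer_ctl_sketch is
  port (
    drive_bus : in  std_logic;  -- '1' while the FPGA is writing to the cartridge
    data_dir  : out std_logic;  -- buffer DIR: '1' = output, '0' = input
    data_en_n : out std_logic   -- buffer /EN: '0' = enabled, '1' = isolated
  );
end entity data_buffer_ctl_sketch;

architecture sketch of data_buffer_ctl_sketch is
begin
  -- Direction follows whether we are driving the bus.
  data_dir <= drive_bus;
  -- Keep the buffer enabled at all times. The earlier behaviour amounted
  -- to raising /EN whenever we stopped driving, which also blocked input.
  data_en_n <= '0';
end architecture sketch;
```

The point of the sketch is that stopping output is purely a matter of the direction line: /EN stays low so that reads can still see the cartridge's data lines.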

While that happens, I am curious to see if I can observe the charge dissipating. If so, it will give me more confidence that was really the problem: Nope. Hmm... that makes me think it must be something else. Indeed, after resynthesis the problem persists. I have added extra checks to my simulation tests to ensure that the data direction is correct etc, but it hasn't picked up anything.  I guess the next step will be attacking the cartridge breakout board with the oscilloscope again, to see if I can work out what is happening.

Let's start by seeing if pins 3 - 5 of the 74LS173 get updated when I write to $DE00. They should latch the values from bits 0 - 2. No action is visible, so let's check pins 9 and 10 are low. Pin 9 is the one connected to /EXROM that can go high if the cartridge is commanded to disable itself. And it has indeed gone high -- so something has written to it that it has interpreted as a command to disable itself, i.e., had bit 3 set.

It could be that the cartridge port never goes through the reset process on cold start, since triggering a reset does fix that problem.  Probing it on power-up confirms that this is indeed the case.  I'm not 100% sure why, but I'm suspecting it is because the 5V rail to the cartridge port takes longer to come up than the reset sequence trigger takes to happen, so the reset sequence happens, but without any measurable impact on the cartridge port. 

It looks like it can take perhaps as long as 20 ms for the DC-DC converter to get to voltage. We don't have a feed-back line for sensing when the voltage has stabilised, so we will just need to have an initial figure that is long enough. This would certainly explain the lack of reset to the cartridge on cold start, but working on warm resets. The reset timer is clocked at 81MHz, so 20ms = 0.02 x 81x10^6 = 1.62x10^6. I'll round that up to 2 million, just to be sure.
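
The arithmetic above can be written down as a small standalone sketch of a power-on reset timer. This is not the reset_counter logic shown earlier: the entity, generic and port names are hypothetical, and it relies on the FPGA honouring signal initial values at configuration. The 81MHz clock and the 2 million cycle figure are the ones from the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity cold_reset_sketch is
  generic (
    CLOCK_HZ    : natural := 81_000_000;  -- reset timer clock, from the text
    HOLD_CYCLES : natural := 2_000_000    -- rounded-up figure from the text
  );
  port (
    clock        : in  std_logic;
    cart_reset_n : out std_logic  -- hypothetical name; low = reset asserted
  );
end entity cold_reset_sketch;

architecture sketch of cold_reset_sketch is
  -- 20 ms is 1/50 s, so 81_000_000 / 50 = 1_620_000 cycles.
  constant MIN_CYCLES : natural := CLOCK_HZ / 50;
  -- Starts full, so reset is asserted from the first clock after configuration.
  signal counter : natural range 0 to HOLD_CYCLES := HOLD_CYCLES;
begin
  -- Fail early if the hold is shorter than the assumed rail start-up time.
  assert HOLD_CYCLES >= MIN_CYCLES
    report "HOLD_CYCLES is shorter than the assumed 20 ms rail start-up time"
    severity failure;

  process (clock) is
  begin
    if rising_edge(clock) then
      if counter /= 0 then
        counter <= counter - 1;
      end if;
    end if;
  end process;

  cart_reset_n <= '0' when counter /= 0 else '1';
end architecture sketch;
```

With these defaults the reset is held for 2,000,000 / 81,000,000 s, or about 24.7 ms, which leaves some margin over the 20 ms estimate.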

I'm hoping that will get the cold reset stuff sorted. But nope, now it's totally broken.  I think I have it fixed now, and am resynthesising.

Meanwhile, although reset is broken, after I force /RESET to pulse low, I can write to the $DE00 bank select register, and I can read stuff from the cartridge -- and the banking is even working... but. The "but" is that it's not always reading the correct values.  While annoying, this is still a big step forward. So I'll put some attention on debugging the reading.

It looks like each location ends up reading either what I assume is the correct value, or some incorrect value -- that seems most of the time to be $20:

The above displays 10 attempts of reading $8000-$800F. As we can see, we end up reading $20 a lot of the time. I think it should be reading:

09 80 09 80 C3 C2 CD 38 30 8E 16 D0 20 A3 FD 20

Some byte positions are more stable than others.

I'm not yet sure what is happening here, but I'm suspecting that sometimes we are reading something other than the value from the cartridge data lines.  There doesn't seem to be any other variation apart from this spurious returning of $20 instead of the correct value, which makes me think that the timing is probably ok, as otherwise we would see multiple different values appearing.

So what on earth can be providing the $20 value?  I am a bit suspicious, because to force a reset on the cartridge port I had to write $20 to $7010000. But writing a different value to $7010000 doesn't change the value that gets read back.

Hmmm... After resynthesising with the reset control fixes, the $20 problem has disappeared, but now the banking via $DE00 seems to be broken again. All simulation tests are still passing.  Hmm... If I force a reset on the cartridge port, the cartridge is re-enabled, and responds to banking commands. So in a sense it's already working. But Tiny Quest still fails to start. But I think that's just a fundamental incompatibility of Tiny Quest with the MEGA65 core, so I'm happy to leave that at that.

Let's just make sure that resynthesising the same VHDL again gives the same and predictable result, to make sure it isn't a random synthesis thing: That looks ok.

Let's just double check by trying Sam's Journey, that we expect to work fine.

Yup, that works, so I'm fairly confident that I have it all working well now.  It's a bit of a bummer that Tiny Quest doesn't work in the MEGA65 core, but it does work in the C64 core, so that's actually not a big problem.

In short, I think it's all working now. Next step is for the team to test it, and let me know if there are any problems... which is happening now.

In the process, I tried plugging in an EasyFlash 3, and while it doesn't always seem to work from cold start, after pressing the RESET button it now works again. This is a big improvement, as it stopped working on MEGA65 cores a long time ago, and we hadn't had the opportunity to investigate the cause.  But now it works nicely, at least in terms of loading and showing the menu:

Although, that being said, it is still being a bit temperamental, so some further investigation will be required. But it is still progress.
