Tuesday, 18 August 2020

Debugging IEC port on MEGA65 R2/R3 boards

So we are almost finished testing the MEGA65 R3 board, but have hit a funny problem that is now also showing up on my R2 board: the IEC port is not working reliably.

The symptom is that the CLK and DATA  lines don't go to 0V when they are pulled low by the MEGA65, but stay at around 2V -- but only when a drive is connected.  

The only "drive" I have here in the desert with me is a Pi1541, so that will have to do.  Fortunately it has nice test points:

 

It uses a 7406 inverting hex buffer, to give maximum compatibility.

The Pi1541 uses 1K pull-ups, resulting in 5mA at 5V, while the MEGA65 uses NC7SZ126 drivers, which can sink 25mA. So the MEGA65 really should win, hands down.  But this is not what we see.
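As a quick sanity check on those numbers, here is a back-of-the-envelope calculation in Python. The function and variable names are purely illustrative, and it only uses the figures quoted above (5V, 1K, 25mA, and the roughly 2V we see on the stuck line).

```python
# Back-of-the-envelope check of the pull-up current on one IEC line.
V_SUPPLY = 5.0        # volts, from the text
R_PULLUP = 1_000.0    # ohms, Pi1541 pull-up value from the text
I_SINK_MAX = 0.025    # amps, NC7SZ126 sink figure quoted in the text

def pullup_current(v_supply, r_pullup, v_line=0.0):
    """Current through the pull-up when the line sits at v_line volts."""
    return (v_supply - v_line) / r_pullup

i_low = pullup_current(V_SUPPLY, R_PULLUP)          # line pulled fully to 0V
i_stuck = pullup_current(V_SUPPLY, R_PULLUP, 2.0)   # line stuck at the observed ~2V

print(f"pull-up current at 0V: {i_low * 1000:.1f} mA")
print(f"pull-up current at 2V: {i_stuck * 1000:.1f} mA")
print(f"sink capacity vs pull-up current: {I_SINK_MAX / i_low:.0f}x")
```

With one 1K pull-up the driver has about five times more sink capacity than it needs, so the pull-up on its own should not be able to hold the line at 2V.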

What is interesting is that the problem occurs even if the Pi1541 is not powered.  After being reminded that 7406s can be fried by hot-plugging drives and computers while powered on, I am now wondering if it isn't the 7406 that is fried.  Because the MEGA65's drivers are quite strong, I wonder if that hasn't increased the likelihood.

Basically I can't think of a better theory, since the problem happens only when the Pi1541 hat is connected to the MEGA65, even if I disconnect the hat from the Pi itself.  There just isn't really much more on the board apart from that and the pull-ups.  I did test that the pull-ups weren't too heavy by measuring the resistance between the lines, and also by simulating a 1K pull-up with a resistor directly in the MEGA65's IEC port.

So time to pull and replace the 7406. There are a few problems, though. First, I'm in the middle of the Outback with only limited facilities. I *thought* I had packed a set of hot soldering tweezers as part of my soldering station, but that was, sadly, only my imagination speaking.  So that means it's time for some quality time with my solder sucker. Except that seems to make as good a seal as a fly-screen door on a submarine.  So time to chop the legs off the 7406, poke them out with the soldering iron one by one, and replace the chip.

Now about replacing that chip... I still live in the middle of the Outback.  Fortunately I do have my box of crazy electronics parts, which includes a good selection of early 1980s 74 glue chips. Unfortunately, none are 7406s.  BUT I have several 7405s, which are quite similar, but have lower maximum current sink capacity.  Depending on the manufacturer, the limit can be around 8mA instead of 30 or 40mA for the 7406.  8mA is probably sailing a bit close to the wind, but as the nearest spare 7406 is several hundred kilometres away, we'll give it a try. The worst that can happen is that we let the magic smoke out.

But while I am working through all of that, it occurred to me that the original problem of the IEC port not working properly might have been that I was controlling the IEC drivers on the MEGA65 main board incorrectly, pushing high, instead of tri-stating the output buffer, and relying on the pull-up resistors to float the lines high.  
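To make the difference between "pushing high" and "tri-stating" concrete, here is a toy Python model of a single shared IEC line. It is only a sketch of the general open-collector idea, not the MEGA65's actual VHDL, and the `Drive` and `line_level` names are made up for illustration.

```python
# Toy model of one open-collector IEC line shared by several devices.
from enum import Enum

class Drive(Enum):
    RELEASE = "Z"   # output tri-stated; the pull-up decides the level
    LOW = "0"       # actively pulling the line to ground
    HIGH = "1"      # actively pushing the line to 5V (wrong for IEC)

def line_level(drives):
    """Resolve the line level from what each device is doing.

    Returns "LOW", "HIGH", or "CONTENTION" when one device pulls low
    while another pushes high.
    """
    any_low = Drive.LOW in drives
    any_high = Drive.HIGH in drives
    if any_low and any_high:
        return "CONTENTION"
    if any_low:
        return "LOW"
    return "HIGH"   # either pulled up (all released) or pushed high

# Correct behaviour: computer releases, drive pulls low -> clean LOW.
print(line_level([Drive.RELEASE, Drive.LOW]))
# Suspected bug: computer pushes high while the drive pulls low -> a fight.
print(line_level([Drive.HIGH, Drive.LOW]))
```

In the contention case the real line ends up at whatever voltage the two drivers settle on between them, which is exactly the sort of thing that produces in-between readings rather than a clean 0V or 5V.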

Of course I realised this just after the German part of the team had gone to sleep for the night, and as I don't have any other drives here, I couldn't test it.  Anton has now tested it on his R2 and R3 PCBs, and his SD2IEC and 1581 each work on both boards.  However, his 1541 didn't work on the R3. Testing it on the R2 board found it also not working there. So now we are trying to work out if his 1541 is faulty, or if the MEGA65s are having problems communicating with real 1541s.

We know the problem is unlikely to be timing, because the other drives work.  

So we are now remotely going through the process of probing the IEC lines with his 1581 plugged in, which we know works.  We will then repeat the process with the 1541 plugged in, and see if we can spot any differences. As this can be sensitive to the level of the IEC signals, we will do it 8 times on each, with the different IEC line level combinations.  In the IEC test, the lines are cleared with 1, 3 and 5, and set with 2, 4 and 6 respectively.  So we will use that notation for the tests, and in all cases probe pin 5 (the data line) and pin 4 (the clock line).
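For anyone following along, the eight combinations can be generated mechanically. This Python sketch is just an illustration of the notation; the mapping of the 3/4 keys to the clock line and the 5/6 keys to the data line is read off the measurements in this post, and labelling the 1/2 pair as ATN is my assumption.

```python
from itertools import product

# Each pair is (clear key, set key) for one IEC line.
# Index 0 being ATN is an assumption; CLK and DATA are inferred
# from the readings recorded in this post.
PAIRS = [("1", "2"), ("3", "4"), ("5", "6")]
LINES = ["ATN (assumed)", "CLK", "DATA"]

def combinations():
    """Return the 8 test codes in order: 135, 136, 145, ... 246."""
    return ["".join(keys) for keys in product(*PAIRS)]

def expected_level(code, line_index):
    """A cleared line should read HIGH, a set line should read LOW."""
    clear_key, _set_key = PAIRS[line_index]
    return "HIGH" if code[line_index] == clear_key else "LOW"

for code in combinations():
    print(code, "CLK", expected_level(code, 1), "DATA", expected_level(code, 2))
```

Any reading that disagrees with the expected level for its code is a line worth looking at more closely.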

The easy way

1581: pin 5 (data)

135 - HIGH 4.85 V

136 - LOW ~0V

145 - HIGH 4.85 V

146 - LOW ~0V

235 - HIGH 4.85 V

236 - LOW ~0V

245 - HIGH 4.85 V

246 - LOW ~0V

 

1581: pin 4 (clock)

135 - HIGH 4.85 V

136 - HIGH 4.85 V

145 - LOW

146 - LOW

235 - HIGH 4.85 V

236 - HIGH 4.85 V

245 - LOW

246 - LOW
 

1541: pin 5 (data)

135 - HIGH 4.53 V

136 - LOW

145 - HIGH 4.52 V

146 - LOW

235 - HIGH 4.52 V

236 - LOW

245 - HIGH 4.52 V

246 - LOW

 

1541: pin 4 (clock)

135 - HIGH 4.51 V

136 - HIGH 4.51 V

145 - LOW (always at around 0.08V)

146 - LOW

235 - HIGH 4.51 V

236 - HIGH 4.51 V

245 - LOW

246 - LOW

 

So, those all look pretty much the same.  This makes me think that maybe there is still a CPU speed problem somewhere. If so, it's rather odd, because I can run the freeze-combined.prg test programme, and while it isn't perfectly meta-stable, it's accurate to 99.9% or so of C64 CPU speed (the only difference is we are not currently doing the write(s) of an instruction during the 3-cycle BA setup time).

Investigating further, we find that the first time trying to access the 1581 from C64 mode, we get a device not present error -- but 2nd time it works.  So we tried again from C65 mode, and now the 1541 is responding.  Nothing else has changed, except maybe that the 1541 has warmed up a bit.

Yes, and now it's also responding from C64 mode.

Rechecking voltages on the 1541, it's now up to about 4.86V, so we can't (yet) rule temperature out. But even if it is temperature, we need to know what is going wrong, so that we can actually make it work with all drives all of the time.

Anyway, this is all feeling weirder, although my gut still tells me that some timing issue is still not quite right, and is the cause.  The trick is how we can capture good and bad traces, so that we can play "spot the differences" and work out the problem.  One of those cheap USB data loggers and an IEC extension cable that taps them off might be a good path here. Again, not something I can easily conjure up in the desert, and if I don't get the Pi1541 working, I don't have a way to test anyway, so it will be depending on the rest of the team back in Germany to help me further on this one.
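Once we do have traces, "spot the differences" could be automated along these lines. This is a minimal Python sketch under my own assumptions: each trace is a list of (CLK, DATA) samples taken at a fixed rate, and the function names, the tolerance, and the sample data are all made up for illustration.

```python
def run_lengths(samples):
    """Run-length encode samples into [state, count] pairs."""
    runs = []
    for state in samples:
        if runs and runs[-1][0] == state:
            runs[-1][1] += 1
        else:
            runs.append([state, 1])
    return runs

def first_difference(good, bad, tolerance=1):
    """Index of the first run where the traces diverge, or None.

    Two runs diverge if the bus state differs, or if their durations
    differ by more than `tolerance` samples.
    """
    g, b = run_lengths(good), run_lengths(bad)
    for i, ((g_state, g_len), (b_state, b_len)) in enumerate(zip(g, b)):
        if g_state != b_state or abs(g_len - b_len) > tolerance:
            return i
    return None if len(g) == len(b) else min(len(g), len(b))

# Made-up (CLK, DATA) samples: same sequence, but one phase lasts too long.
good = [(1, 1)] * 3 + [(0, 1)] * 4 + [(0, 0)] * 2
bad = [(1, 1)] * 3 + [(0, 1)] * 8 + [(0, 0)] * 2
print(first_difference(good, bad))
```

Comparing run lengths rather than just the order of states matters here, because a pure timing problem would show the right states in the right order but held for the wrong duration.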

Another option is to use two MEGA65s connected via their IEC busses, with one sniffing and logging the bus traffic, while the other talks to the drive.  Alternatively, we could use something like the Vivado Integrated Logic Analyser, which can log various signals to BRAM in the FPGA for later visualisation.  As the R3 board has more BRAM than the R2, there is actually some free BRAM, enough to be able to try this. But first, some sleep, then to think about the best path of attack.

But before I do hit the sack, we tested using a C64 ROM on the MEGA65, and also using drives set to device 11, and leaving the MEGA65's internal drive sitting on 8 and 9. And that DOES work with all drives. So it looks like we are tickling bugs in the C65's not completely finished DOS -- or the ROM patch to change the device number needs a bit of further tweaking.  But that can all wait for another blog post.

