Monday, 5 October 2020

Pulling out the big guns for improving compatibility with HDMI monitors

The MEGA65 has digital video output that is intended to be as compatible with the HDMI standard as possible.  But we have been battling problems with compatibility for a while. We eventually gave up a few weeks ago, and I ordered an HDMI Protocol Analyser:

These things cost a fortune new, but fortunately we only need old HDMI 1.3 video modes, so a 10-year-old pre-loved protocol analyser would work for us, and can be purchased on eBay for under A$1,000 on a good day -- provided you don't mind having to run the software on Windows XP.

I've been watching eBay on and off for these, and we had a good day, finding one that had been removed from service in a test facility in Japan, and looked to be in great condition. So I took the plunge and bought it. It will also come in handy for other FPGA-based projects where we want digital video output.

International freight is a bit hit and miss at the moment, but it turned up yesterday after less than a month in transit -- not bad, given that it has to also deal with the Outback mail run here.

Anyway, I got it set up, installed the software, and can now run professional HDMI compliance tests on the MEGA65 -- including on the R2 board with the ADV7511 HDMI driver chip, which we could never drive properly.  We might also find out whether our problems with that chip were a case of PEBKAC on my part, or if there really are compliance problems with that chip in the video/audio mode combination that the MEGA65 uses.


I should also add that now that we have access to this nice piece of hardware, we are quite happy to use it to test other folks' boards and retro computer designs.

So, let's start testing the MEGA65 R3 PCB in PAL mode. This is video mode #17 in the HDMI mode database.

The first issue it complains about is that there should be 12 clocks after the end of the active period to the start of HSYNC, but we have only 6.  That's easy to fix.
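To make the front-porch complaint concrete, here is a minimal Python sketch of the horizontal timing check. Only the 12-clock front porch comes from the analyser; the other figures (720 active pixels, 64-clock sync, 68-clock back porch) are assumed from the commonly published CEA-861 table for mode 17, and the names HTiming and front_porch_error are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HTiming:
    """Horizontal timing of one raster line, in pixel clocks."""
    active: int
    front_porch: int  # end of active video to start of HSYNC
    sync: int
    back_porch: int   # end of HSYNC to start of active video

    @property
    def total(self) -> int:
        return self.active + self.front_porch + self.sync + self.back_porch


# Assumed CEA-861 figures for mode 17 (720x576 progressive, 50 Hz).
MODE_17_H = HTiming(active=720, front_porch=12, sync=64, back_porch=68)


def front_porch_error(measured_clocks: int, expected: HTiming = MODE_17_H) -> int:
    """Number of clocks HSYNC must be pushed later (negative means earlier)."""
    return expected.front_porch - measured_clocks


if __name__ == "__main__":
    print("H_TOTAL:", MODE_17_H.total)
    print("HSYNC needs to move later by", front_porch_error(6), "clocks")
```

With the 6 clocks we were producing, the sketch reports that HSYNC has to start 6 clocks later.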

Similarly, our VSYNC was one raster line too long.  Also easy to fix. There seem to be other problems with VSYNC, too, which are a bit more cryptic to understand:

Number of Video Data Periods between each two VSYNC active edges should be 576 (V_ACTIVE)
Error : V_ACTIVE does not equal values for the selected video format(1152).

Number of pixel clocks between VSYNC active edges divided by H_TOTAL should be 1250.0 (V_TOTAL)
Error : V_TOTAL does not equal values for the selected video format(0).

Number of HSYNC pulses from VSYNC active edge to Video Data Period should be 44 (VS_TO_VIDEO)
Error : VS_TO_VIDEO does not equal values for the selected video format(0).

I'm not really sure what the issue is here, other than that we have something wrong.  I can at least quickly test if it is VSYNC polarity, since I can invert the polarity at run-time: Nope, doesn't fix it.
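For what it's worth, the VS_TO_VIDEO figure of 44 is what you get by adding the VSYNC width to the vertical back porch, if you assume the usual CEA-861 values for this mode (5 lines of sync, 39 lines of back porch). The sketch below counts raster lines the way I understand the analyser to be counting HSYNC pulses; the frame model and function names are hypothetical, and this is not the analyser's actual algorithm.

```python
def make_frame(sync_lines=5, back_porch=39, active=576, front_porch=5):
    """Model one frame as (vsync_active, has_video) flags, one per raster line.

    The defaults are assumed CEA-861 values for 720x576p50, not measurements.
    The frame starts at the VSYNC leading edge.
    """
    return (
        [(True, False)] * sync_lines
        + [(False, False)] * back_porch
        + [(False, True)] * active
        + [(False, False)] * front_porch
    )


def vs_to_video(lines):
    """Count raster lines from the VSYNC leading edge to the first video line."""
    start = None
    prev_vsync = False
    for i, (vsync, video) in enumerate(lines):
        if start is None and vsync and not prev_vsync:
            start = i  # VSYNC just became active
        if start is not None and video:
            return i - start
        prev_vsync = vsync
    return None  # no VSYNC edge followed by video was found


if __name__ == "__main__":
    print(vs_to_video(make_frame()))              # nominal timing
    print(vs_to_video(make_frame(sync_lines=6)))  # VSYNC one line too long
```

In this model a VSYNC that is one line too long, with everything else unchanged, pushes the count from 44 to 45.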

Anyway, I'll synthesise those Audio Info Frame, HSYNC position and minor VSYNC position tweaks, and see if we can't reduce the number of errors.

While that is synthesising, I've also taken a look at the MEGA65 R2 PCB HDMI problems with the analyser. The results are quite interesting: The ADV7511 seems to be producing broken General Control Packets, and isn't emitting Audio Info Frames in PAL, but does in NTSC.  I've given this feedback to ADV, and it will be interesting to hear what they have to say about it, now that I am able to show the exact problem that is occurring.

But back to the R3 board: I have now repeated the test, fix, synthesise cycle close to a dozen times today. It sounds like a lot, but it's nothing compared with trying to debug the HDMI problems without the analyser.  I now already have the NTSC mode passing all tests:

And the PAL mode has only a few interrelated problems with the VSYNC polarity and position.  I'm hopeful I will have it fixed by the end of the night.

The bigger issues I have fixed, which affected both modes, were a subtle bug that was preventing the AVI Info Frames being sent at all, and that the Audio Info Frames, while they were being sent, contained rubbish data and an incorrect checksum.  The analyser even lets you dump all of the Info Frames as hex, so I could very quickly and easily work out what was going on, and confirm when I had it fixed.  It might be 11 years old and need Windows XP to run, but it really has worked wonders.
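For anyone wanting to check a hex dump by hand: the InfoFrame checksum rule in CEA-861 is that the three header bytes, the checksum byte and the payload bytes must sum to zero modulo 256. Here is a small Python sketch of that rule, using the Audio Info Frame header values (type 0x84, version 1, length 10). The function names are hypothetical, and this is not the MEGA65 VHDL.

```python
AUDIO_INFOFRAME_TYPE = 0x84
AUDIO_INFOFRAME_VERSION = 0x01
AUDIO_INFOFRAME_LENGTH = 10


def infoframe_checksum(frame_type: int, version: int, payload: bytes) -> int:
    """Checksum byte that makes header + checksum + payload sum to 0 mod 256."""
    total = frame_type + version + len(payload) + sum(payload)
    return (-total) & 0xFF


def infoframe_is_valid(raw: bytes) -> bool:
    """raw = type, version, length, checksum, then the payload bytes."""
    return len(raw) >= 4 and raw[2] == len(raw) - 4 and sum(raw) & 0xFF == 0


if __name__ == "__main__":
    payload = bytes(AUDIO_INFOFRAME_LENGTH)  # all zeros: "refer to stream header"
    csum = infoframe_checksum(AUDIO_INFOFRAME_TYPE, AUDIO_INFOFRAME_VERSION, payload)
    raw = bytes([AUDIO_INFOFRAME_TYPE, AUDIO_INFOFRAME_VERSION, len(payload), csum]) + payload
    print(hex(csum), infoframe_is_valid(raw))
```

Pasting the bytes from the analyser's hex dump into infoframe_is_valid is a quick way to see whether a frame is merely carrying odd data or is actually corrupt.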

The AVI Info Frames were not being sent because the VGA to HDMI wrapper I am using assumed VSYNC and HSYNC leading edges should be coincident, which they weren't. But they should be. So I made the VGA to HDMI wrapper more robust against this, and fixed the problem.  So with all those changes, and getting the correct HSYNC/VSYNC polarity combination, I can make it pass the tests in PAL mode as well:

So now I am synthesising a bitstream that should have those settings as default, and will confirm that it can pass all tests without fiddling... which it does!
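The coincident-edge requirement mentioned above is easy to check on a captured trace. The following Python sketch assumes the two sync signals have been sampled once per pixel clock and normalised so that 1 means "asserted", whatever the electrical polarity; the function names are made up for illustration.

```python
def leading_edges(signal):
    """Indices where the signal goes from de-asserted (0) to asserted (1)."""
    return [i for i in range(1, len(signal)) if signal[i] and not signal[i - 1]]


def vsync_edges_coincide(hsync, vsync):
    """True if every VSYNC leading edge lands on an HSYNC leading edge."""
    h_edges = set(leading_edges(hsync))
    return all(i in h_edges for i in leading_edges(vsync))


if __name__ == "__main__":
    hsync = [0, 1, 0, 0, 0, 1, 0, 0]
    aligned = [0, 1, 1, 1, 1, 1, 0, 0]     # VSYNC rises together with HSYNC
    misaligned = [0, 0, 0, 1, 1, 1, 1, 0]  # VSYNC rises mid-line
    print(vsync_edges_coincide(hsync, aligned))
    print(vsync_edges_coincide(hsync, misaligned))
```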

I will then also repeat the tests on the R2 PCB using the fixed video timing, to see if that doesn't also fix things there.  I suspect that it might, since one of the problems I saw was that no Audio Info Frames were being produced by the R2 in PAL, but were in NTSC.  It's possible the logic in the ADV7511 requires coincident VSYNC/HSYNC edges, or is otherwise fussy about the VSYNC detection when deciding when to trigger an Audio Info Frame. Yes, that is one problem. Now I have audio sample frames being sent in PAL, but the monitor here still makes no noise.

The key difference I can see now is that the ADV7511 puts only one or two samples in each audio sample frame, whereas on the R3 our VHDL implementation puts 4. This means that the ADV7511 is sending many more audio sample packets. It shouldn't matter, but clearly something does matter to many monitors. Anyway, we aren't using the ADV7511 anymore, so I'm not going to waste a great deal more time on it. The last thing I will try here is to disable the broken GCP frames, and see if that fixes it.  Well, it makes it pass the HDMI compliance tests, but still doesn't cause audio to work on the Samsung TV here. Oh well. VSYNC is also several raster lines late in NTSC, but this doesn't seem to cause any problems. Given that the R2 boards are only for our internal testing, this is a reasonable place to leave things.

Meanwhile, the team in Germany have begun testing their R3 board, and noticed that the picture is shifted left by some pixels in VGA mode.  My suspicion here is that the VGA output has a shallower pipeline depth than the HDMI output, with respect to the HSYNC signal.  

It might thus be easily correctable by simply delaying the VGA HSYNC, VSYNC and data enable signals by a few pixels, to bring it all back into line.  But it is now working with a picture on all tested monitors and TVs, and with audio on all except for one Sony TV that still refuses to produce any.  That TV is quite old, and might support only one sample rate that doesn't match the sample rate we are producing.
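The idea of delaying the VGA sync signals by a few pixels amounts to a short shift register clocked at the pixel rate. Here is a software model of that in Python; it is only an illustration of the principle with a hypothetical function name, not the actual VHDL, and the right delay would have to be found by measurement.

```python
from collections import deque


def delay(samples, n, fill=0):
    """Delay a per-pixel-clock signal by n clocks, like an n-stage shift register."""
    pipe = deque([fill] * n)
    out = []
    for s in samples:
        pipe.append(s)
        out.append(pipe.popleft())
    return out


if __name__ == "__main__":
    hsync = [0, 0, 1, 1, 0, 0, 0, 0]
    print(delay(hsync, 3))  # same pulse, three pixel clocks later
```

Applying the same delay to HSYNC, VSYNC and data enable keeps them aligned with each other while shifting them relative to the pixel data.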

I'm now synthesising a fix for the VGA display being displaced sideways.  I've also added code to support "Source Product Description" packets. This is how some TVs etc. are able to say which device is connected to them.  So far I haven't had any luck with having this information be displayed, though.  While the HDMI protocol analyser will happily dump those packets out for me, it doesn't seem to actually test their content. It does at least confirm that they have valid checksums, and I can see that they are being sent properly.  Anyway, that's a low priority.

Now that the HDMI-compatible signalling is all a lot healthier, I have checked one last thing: that the audio sample rate matches the expected value.  Good thing I did, as I had a copy-paste error in the frequency generator for the 12.288 MHz Fs x 128 clock. I had calculated it based on 12.228 MHz instead. As a result we were driving 47.76 kHz, not 48 kHz.  It's possible that this is what has caused the Sony TV to not produce any audio.  So I have fixed that, and also made the logic run-time selectable to choose either 48 kHz or 44.1 kHz sample rate, in case the 48 kHz sample rate is a problem more generally.  Now the usual wait for it to synthesise...
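The arithmetic behind that slip is easy to reproduce. The figures quoted (12.288 MHz giving 48 kHz) work out to a clock 256 times the sample rate, so this Python sketch assumes that ratio; the constant and function names are hypothetical.

```python
CLOCK_TO_FS_RATIO = 256  # assumption: 12.288 MHz / 48 kHz


def sample_rate_hz(audio_clock_hz: float, ratio: int = CLOCK_TO_FS_RATIO) -> float:
    """Sample rate produced by a given audio master clock."""
    return audio_clock_hz / ratio


def audio_clock_hz(sample_rate: float, ratio: int = CLOCK_TO_FS_RATIO) -> float:
    """Audio master clock needed for a given sample rate."""
    return sample_rate * ratio


if __name__ == "__main__":
    intended = sample_rate_hz(12_288_000)
    actual = sample_rate_hz(12_228_000)  # the transposed-digit value
    print(f"intended {intended} Hz, actual {actual} Hz")
    print(f"error {100 * (intended - actual) / intended:.2f} %")
    print(f"clock for 44.1 kHz: {audio_clock_hz(44_100)} Hz")
```

An error of about half a percent is small enough that you would not hear it as a pitch problem, but apparently large enough for a strict receiver to reject the stream.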

And it looks like fixing this sample rate error has fixed audio on some of the pickier Sony TVs and monitors that our team has access to.  We also determined that 48 kHz rather than 44.1 kHz is the more compatible default audio sample rate, so I will lock that in.  We also spotted that the VGA PAL display was shifted sideways, similarly to the VGA NTSC display, so I have put a fix in for that.  Hopefully with those changes, that will be the end of it, and we can deem ourselves to have working VGA and digital video output with sound, and move on to other things on the list. Probably fixing the remaining problems in the HyperRAM.

Meanwhile, if other open-source hardware projects would like to test their hardware using this protocol analyser, I am happy to arrange for this.


  1. Man that is interesting thanks I never knew there was even a thing as an HDMI protocol analyser. I assume we're a long way from having anything remotely open source for this purpose. Something for a DSO to run - or is this crazy talk?

    Also, are you in the sticks for real? I know you're near Adelaide but you again reference the "outback mail run". I'm imagining scrub-dotted red dirt stretching to the big blue sky horizon.

    1. Hi Chris,
      Yes, it would be quite possible to make an open-source HDMI protocol analyser -- just a lot of work. The hardware really just needs to be an FPGA board with HDMI input connector and a LOT of RAM. The one here can capture 20 full seconds of HDMI data. I haven't opened the box, but I'm expecting to see a big motherboard with a small FPGA with a lot of memory modules around it. Probably 16 or 32GB by my rough calculations, which is a lot for a device made in 2009.

      Yes, I am very much in The Sticks(tm), at Arkaroola in the northern Flinders Ranges, about 700km and 10 hours' drive from Adelaide. 130km of dirt road to the nearest actual bitumen road. When it rains (like today), the roads all shut because of washouts and flooding.