Sunday, 17 November 2019

Fixing raster counting and freeze synchronisation for PAL50 video mode

With all the fixes to the video mode stuff in the last week, things are really moving forward. As is quite natural in such situations, we are also finding other funny little problems and regressions. 

One of those that I think is causing a number of weird display glitches is that the VIC-IV was not incrementing $D012 during the vertical blank period.  This is a relic from when the video modes had too many raster lines, and so we purposely wanted to not count beyond the limit, so that things wouldn't get confused, e.g., with PAL vs NTSC detection that expects precisely the right number of raster lines per frame.

To confirm what is going on here, I wrote a simple test program that uses a 63 cycle loop, i.e., exactly one raster line per iteration, and records the value of $D012 on each raster line, so that we can see what is going wrong. One of the nice things with getting the video mode stuff right, is that it is possible to make a meta-stable display, i.e., where it doesn't have any jitter or glitching from frame to frame, but shows exactly the same display continuously.  The program increments the border colour in each iteration, so that I can see that it is working. It only does 256 rasters, as numbering them all would have required 16-bit fiddling. Since I only care about the VSYNC region, this isn't a problem.  This is why there is the big single-colour block.

For those who are interested, here is the routine:

     ; blank screen
     lda #$00
     sta $d011

     ; Wait to near bottom of the screen
     lda #$f8
l1:     cmp $d012
     bne l1
     ldx #$00

     ; 63 cycles per iteration
rloop1:
     ; 14 cycles to do main part
     lda $d012
     sta $0400,x
     inc $d020

     ; 2 for loop setup
     ldy #7
     ; 5 per iteration
rl2:    dey
     bne rl2
     ; Make total 63 cycles
     bit $ea
     bit $ea

     ; 5 cycles for end of loop
     inx
     bne rloop1

     ; start over (the rrloop label is not shown in this listing)
     jmp rrloop

The result is that it writes the raster numbers of 256 consecutive rasters beginning at raster $F8 to $0400 - $04FF.  It gives results like this:

:0400: F8 F9 FA FB FC FD FE FF 00 00 01 02 03 04 05 06
:0410: 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16
:0420: 17 18 19 1A 1B 1C 1D 1E 1F 20 20 20 20 20 20 20
:0430: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
:0440: 20 00 01 02 03 04 05 05 06 07 08 09 0A 0B 0C 0D
:0450: 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D
:0460: 1E 1F 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D
:0470: 2E 2F 30 31 32 33 34 35 36 37 39 3A 3B 3C 3D 3E
:0480: 3F 40 41 42 43 44 45 46 47 49 4A 4B 4C 4D 4E 4F
:0490: 51 51 52 53 54 55 56 57 59 5A 5B 5C 5D 5E 5F 61
:04A0: 62 63 64 64 65 66 67 69 6A 6B 6C 6D 6E 6F 71 72
:04B0: 73 74 75 76 76 77 78 79 7A 7B 7B 7C 7D 7E 7F 7F
:04C0: 81 82 82 83 84 85 86 86 87 88 89 8A 8B 8C 8C 8D
:04D0: 8E 8F 90 90 91 92 93 94 94 95 96 97 98 99 9A 9A
:04E0: 9B 9C 9C 9D 9E 9F A0 A1 A2 A3 A4 A5 A5 A6 A7 A8
:04F0: A9 A9 AA AB BB BC BD BE BF C1 C2 C3 C4 C5 C6 C7

So, apart from seeing that I still haven't got it 100% right for being one iteration per raster line (it might be that the VIC-IV is still generating badlines with the screen off; I'll look into it), we can see the main problem with the long string of $20s, which should in fact be counting up to $37.  What is nice is that we can see that if we replaced the duplicate $20s with ascending numbers, we would get to $37.  So if I can find and fix the problem, it should work.
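To double-check that claim, here is a small Python sketch (not part of the original test program) that takes the three rows of the first dump containing the stuck value and counts the run of $20s. The helper names `parse_dump` and `run_length` are made up for illustration.

```python
# Rows :0420: to :0440: of the first dump, copied verbatim.
DUMP = """
:0420: 17 18 19 1A 1B 1C 1D 1E 1F 20 20 20 20 20 20 20
:0430: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
:0440: 20 00 01 02 03 04 05 05 06 07 08 09 0A 0B 0C 0D
"""

def parse_dump(text):
    """Turn monitor-style dump lines into a flat list of byte values."""
    values = []
    for line in text.strip().splitlines():
        # Drop the ":ADDR:" prefix and keep only the hex bytes.
        _, _, data = line.lstrip(":").partition(":")
        values.extend(int(tok, 16) for tok in data.split())
    return values

def run_length(values, target):
    """Length of the longest consecutive run of `target` in `values`."""
    best = current = 0
    for v in values:
        current = current + 1 if v == target else 0
        best = max(best, current)
    return best

samples = parse_dump(DUMP)
stuck = run_length(samples, 0x20)
# If the counter had kept counting, the last value of the run would be:
last_expected = 0x20 + stuck - 1
print(f"run of $20: {stuck} samples, would reach ${last_expected:02X}")
```

The run is 24 samples long, so counting upwards from $20 would indeed end at $37.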

It turned out to be easy enough to find in the end:

-        if xcounter > 255 then
-          -- ... it isn't VSYNC time, then update Y position
+        if true then
+          -- ... update Y position, even during VSYNC, since frame timing is
+          -- now exact.

Basically we used to check that the raster was not a VSYNC raster, by seeing if the X counter was non-zero (the X counter is suppressed during VSYNC rasters).  So I just needed to change that if statement so that it would always execute.  Actually, I had to change it to only trigger on the edge of the raster strobe as well, but that's a minor detail (and another 45 minute synthesis run).  With this done, running the program now shows we have things almost right:

:0400: F8 F9 FA FB FC FD FE FF 00 01 02 03 04 05 06 07
:0410: 08 09 0A 0B 0C 0D 0E 0F 10 11 12 12 13 14 15 16
:0420: 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23 24 25 26
:0430: 27 28 29 2A 2B 2C 2D 2E 2F 30 31 32 33 34 35 36
:0440: FF 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E
:0450: 0F 10 11 12 13 14 15 16 17 18 19 19 1A 1B 1C 1D
:0460: 1E 1F 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D
:0470: 2E 2F 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E
:0480: 3F 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
:0490: 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 61
:04A0: 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F 71 72
:04B0: 73 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F 81 82 83
:04C0: 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F 91 92 93 94
:04D0: 95 96 97 98 99 9A 9B 9C 9D 9E 9F A0 A1 A1 A2 A3
:04E0: A4 A5 A5 A6 A7 A8 A9 AA AB AC AC AD AE AF B0 B0
:04F0: B1 B2 B3 B4 B4 B5 B6 B7 B8 B9 B9 BA BB BC BD BE

I say almost right, because the rasters count to $36 (actually $136, since it is in the lower part of the screen), instead of $37.  This is because the VIC-IV has an option to report one raster line before the one actually being rendered, because once rendered, there is a one raster delay in displaying.  By reporting the raster line before, this allows a program to change the screen and/or border colours, and to have the effect occur on the intended raster line.  Anyway, if the raster line is $000, then taking one away gives $FFF, and thus the $FF being displayed.  What should happen instead is that if the raster number is $000, it should report the maximum raster line number.  So another little change is required in the VHDL:

       -- That is, we trigger the interrupt when the raster is actually being DISPLAYED,
       -- and make the contents of $D012/$D011 match this.
       if enable_raster_delay='1' then
-        vicii_ycounter_minus_one <= vicii_ycounter_driver - 1;
+        if vicii_ycounter_driver /= 0 then
+          vicii_ycounter_minus_one <= vicii_ycounter_driver - 1;
+        else
+          vicii_ycounter_minus_one <= vicii_max_raster;
+        end if;
       else
         vicii_ycounter_minus_one <= vicii_ycounter_driver;
       end if;

And another 45 minute synthesis run ...
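For anyone who would rather see the wrap-around in software than in VHDL, here is a rough Python model of the two behaviours. It is only a sketch: the 12-bit width is taken from the $FFF mentioned above, and the PAL maximum of $137 is an assumption based on the $136/$137 discussion, not a value read from the VIC-IV source.

```python
RASTER_BITS = 12         # the text describes $000 - 1 wrapping to $FFF
MAX_RASTER_PAL = 0x137   # assumed last PAL raster number

def reported_raster_old(ycounter):
    """Old behaviour: plain subtraction, wrapping within a 12-bit register."""
    return (ycounter - 1) & ((1 << RASTER_BITS) - 1)

def reported_raster_new(ycounter, max_raster=MAX_RASTER_PAL):
    """New behaviour: raster 0 reports the last raster of the previous frame."""
    return max_raster if ycounter == 0 else ycounter - 1

for y in (0x000, 0x001, 0x137):
    old = reported_raster_old(y)
    new = reported_raster_new(y)
    # $D012 only holds the low 8 bits of the raster number.
    print(f"y=${y:03X}: old $D012=${old & 0xFF:02X}, new $D012=${new & 0xFF:02X}")
```

The only input where the two functions differ is raster $000, which is exactly the $FF entry seen at $0440 in the second dump.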

In between these changes, I also implemented the synchronisation of the freezing and unfreezing process I mentioned in the last blog post.  So hopefully I will be able to test that and confirm that we can resume a frozen program while preserving the synchronisation between the CIAs, VIC-IV frame display and the CPU.

Ok, so the synchronised freeze/unfreeze worked just fine, as indeed did the ability to see raster line $37, which allows synthmark64, for example, to finally detect PAL mode correctly.  So that's all great. Here is a short video of freezing and resuming the freeze-combined test programme, showing that everything stays lined up:

What is a bit odd, however, is that some programmes still seem to have trouble with the frame timing. This is theoretically possible due to the CPU running very slightly fast, but that seems rather unlikely.  Wizball is a good example, where it works, even knocking the vertical borders out, but seems to only do the interrupt once per two frames.  If I speed up the CPU, then it works properly, but of course no longer knocks the vertical borders out, because the instructions happen at the wrong time.  A little more investigation reveals that it is the bad-line emulation that causes it trouble: If I disable badlines, then Wizball looks perfect, with the vertical borders knocked out, and no two-frame flashing problem.

So this creates a bit of a mystery for me:  The freeze-combined test program indicates that the CPU is actually running slightly fast, but Wizball seems to indicate that the CPU is running slightly slow.  Ah.. a bit more fiddling around suggests that it might actually be the charging of cycles when branches are taken and/or cross page boundaries that might be the problem.  Or perhaps it is badlines getting triggered in the lower border area. So I still have another mystery to solve there.

Friday, 15 November 2019

(Just about) cycle-perfect PAL 50Hz HDMI video on the MEGA65

I've put a fair bit of effort this week into getting the HDMI video output of the MEGA65 working, and together with that, to get the timing of the video modes as close as possible to perfect. This has involved a few things:

1. Adjusting the frames to have exactly the right number of raster lines (it was out by one before).
2. Making the video generator provide the PHI clock pulses to the CPU, so that there are always the right number per raster line.  As the raster lines are not exactly the right length, this introduces a very small amount of jitter in the cycle durations. However, this will average out to zero over a few raster lines, and so shouldn't produce any audio artefacts.
3. Lots of other miscellaneous fixes.
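To give a feel for point 2, here is a Python sketch of one generic way to spread a fixed number of CPU cycles evenly across a raster line with an accumulator. This is not taken from the MEGA65 VHDL. The tick count is an assumption (two 864-pixel scanlines per C64 raster, at three video-logic ticks per pixel), and `phi_pulse_ticks` is a made-up name.

```python
CYCLES_PER_RASTER = 63          # PAL C64 CPU cycles per raster line
TICKS_PER_RASTER = 2 * 864 * 3  # assumed: two scanlines, 864 pixels, 3 ticks per pixel

def phi_pulse_ticks(cycles=CYCLES_PER_RASTER, ticks=TICKS_PER_RASTER):
    """Return the tick numbers within one raster at which a PHI pulse fires."""
    pulses = []
    acc = 0
    for tick in range(ticks):
        acc += cycles
        if acc >= ticks:   # accumulator overflow: emit one CPU cycle
            acc -= ticks
            pulses.append(tick)
    return pulses

pulses = phi_pulse_ticks()
gaps = sorted({b - a for a, b in zip(pulses, pulses[1:])})
print(len(pulses), "pulses per raster, gaps of", gaps, "ticks")
```

With these assumed numbers every raster gets exactly 63 pulses, and the spacing between pulses only ever differs by one tick, which is the kind of small jitter described above.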

Anyway, combined with the nice 4:3 digital output via the HDMI port, this means we are now creeping towards a really nice level of compatibility for display timing.  Some tricks, like VSP, will take a while before they would be supported by the MEGA65's default FPGA core, but lots of other things should work well now, including most sprite multiplexors etc.

There are still a few rough edges to sort out, in particular there is a bug in the VIC-IV that is preventing it from numbering raster lines in the flyback properly.  This stuffs up some raster interrupts etc, and will get fixed fairly quickly, I expect.

In fact, the video and CPU timing are now just about good enough to run the freeze-combined.prg test from the VICE emulator test suite.  This nasty little program runs interrupts on both CIAs which are used to draw green and yellow raster bars. A separate raster interrupt runs, which draws a red raster bar.  The yellow and green raster bars have accompanying bars drawn in the active part of the display, and to which they should stay lined up.  This requires that the CIAs correctly count down, and that each video frame has exactly the right number of cycles, so that they don't drift up or down between frames.  And that all now works on the MEGA65 :)

The only part that doesn't, is the grey and pink bars in the active part of the display creep upwards, indicating that the CPU is running slightly faster than it should. This is almost certainly because the MEGA65 always charges 40 clock cycles for a bad line, instead of between 40 and 43 as on the C64. Apart from making this display creep, it is unlikely to be a problem for most programs, as the timing is within 1%, and slightly fast rather than slow.

And finally, since the whole point of this test is to be able to demonstrate freezing and unfreezing without messing up the CIA to VIC to CPU synchronisation, I had a bit of a poke around to fix this.  Basically I need to make the freezer always trigger on a fixed raster line, and then have the unfreezer always wait until that raster line before resuming the frozen program.  In theory, that should be enough to restore the synchronisation to within a couple of clock cycles.  To get an idea of how it should behave, and how it (usually) behaves right now, here is a short video I made:

Fixing this synchronisation will have to wait for another post.

Tuesday, 12 November 2019

More work on HDMI video output

Ok. So in the last post I described how I had managed to get to the point where, using code from Mike Field, I was able to get the ADV7511 HDMI driver chip to actually produce a 576p PAL 50Hz image.

Now the challenge is to get it ready for integrating with the MEGA65 target.  There are a few things to do here:

1. Switch from Mike's vga_generator to my pixel_driver for the actual VGA frame generation.

2. Tell HDMI controller the video is 4:3 aspect ratio.

3. Switch from DVI to HDMI encoding, so that we can include audio and other information in the stream.

4. Implement HDMI device identification, so that you can see "MEGA65 Computer" on the list of inputs if you have a fancy TV.

5. Implement audio frames, so that we can do audio over the HDMI link.

We'll work on the first one first.  This should be fairly simple. Indeed, it didn't take too much effort.  The second and third ones were also fairly easy, and after only an hour or two, I had the following display, now showing a 4:3 aspect ratio display via HDMI:

The fourth required adapting the i2c_sender so that it can be told to write to registers of I2C devices with different addresses.  The 7511 has multiple I2C addresses, and the HDMI device identifier, which TVs etc should display as the source name, is accessed via a different I2C device identifier ($70) to the main registers ($7A).

I've implemented this, but so far see no indication on any of the three HDMI monitors and TVs I have here, that they actually display this, or alternatively, that it is being properly sent.  What is clear from plugging it into our main TV, is that the TV thinks the HDMI link has stereo audio (which is how we configured things), and that it is 4:3 (even if I had to change the TV's default from "always 16:9" to "original aspect ratio of source").

So in theory, we are all set to start thinking about including the audio frames, which I'll get to next.  But for now, as all the other preliminary steps have been taken, I have merged the HDMI driver into the MEGA65r2 FPGA target, and am synthesising this.  Hopefully, this will result in nice HDMI video output... And after the usual jiggery-pokery of working through small bugs, it did. First image is NTSC, second is PAL:

The X and Y positions of the text/graphics area within the borders are not quite right. I'm working on this right now.  It shouldn't be too hard to fix. It only got disturbed because I had to re-work the zero point for X within raster lines, along with the general rework to 576p/480p video modes for HDMI.  It's just a case of fine-tuning the positioning constants, and resynthesising between each attempt, to try to refine the position.

What's also nice is that the HDMI video is displayed faster than the VGA video, as the HDMI signalling format includes information about the pixel clock, which makes it easier for a monitor to determine the video mode.

So, that just leaves the fifth item: audio over HDMI.  Here I was hoping to use the same I2S audio transport that I have been using in the MEGAphone to talk to various components, as the ADV7511 also supports it. However, the MEGA65 Rev2 PCB has only the SPDIF signal connected. 

Fortunately Mike Field has once again implemented most of what I need, in the form of an SPDIF transmitter.  It's currently only mono, but that's okay. I can stereo-ise it later. However, it's nearly midnight again, so I'll stop for today, and have a think about the HDMI audio again when I get the chance.

Monday, 11 November 2019

Improving raster timing accuracy / Preparing for HDMI output

While the video PAL and NTSC modes for the MEGA65 are already kind of okay, they are far from perfect for various reasons:

1. Current timing is rather approximate, in terms of cycles per raster and rasters per frame. We need to be able to have this exactly match the C64, if we want better compatibility.
2. For HDMI output, we also need to use "real" video modes.
3. The different pixel clocks for PAL and NTSC modes causes various annoying problems.

In looking through all this, I have decided that the video modes that make the most sense are:

PAL: 720x576 50Hz
NTSC: 720x480 60Hz

These are widely supported HDMI-compatible modes used for TV, and which have the added advantage that they both use a 27MHz pixel clock -- which means we can in theory ditch all the complexity I previously added for supporting 30MHz and 40MHz pixel rates for the old VGA-style modes I was using.

The first question I had was whether VGA-input monitors are able to display these modes.  My initial experimentation with my monitors here, and those of the MEGA team, seems to indicate that this is no big problem -- although some fine-tuning is required to get the best picture on the most screens.  VGA is really only intended as a backup to the HDMI connector, in any case, so we are willing to accept moderate compromises on the VGA output, if required.

So first step, modify the clock generator in the FPGA to provide me with 27MHz for the pixel clock, and some multiples of it, for when I need to do things within a clock tick.  This was a bit of a pain, because the MEGA65's clock generator was generated using the old Xilinx ISE toolset, which we no longer use, and which even when we have tried to reinstall seems to refuse to generate the updated clocking file.

The solution here was to run the ISE generator as far as I could, so that it would tell me the clock multiplier and divider values for the PLL, and then import those by hand into my existing file.  Of course, this is a process prone to errors, so I generated a bitstream that just exports the clocks to some pins on a connector, so that I can verify that the frequencies are correct.  After a few false starts, I managed to get this sorted.

(One side-effect of this, is that I couldn't use exactly the same 40MHz CPU clock, but had to increase the CPU clock to 40.625MHz.  That shouldn't hurt the timing closure too much, but will be a welcome 1.5% CPU speed increase.)

Okay, but now I have a problem: We need a 650MHz clock so that we can divide it in various ways to generate the clocks related to the 27MHz video pixel clock, while also being able to generate the ~40MHz CPU clock.  This works because 162 = 27x6 ~= 650/4, and 40.5 = 162/4.  So that's all fine to better than 1% error. The problem is that I also need a 200MHz clock for the ethernet TX phase control.  There is no integer n which satisfies 200n = 650*.  So I have a problem.

(* Excluding the possibility of assistance from Stupid Bloody Johnson.)
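The arithmetic behind this can be checked with a few lines of Python. This is just a sanity check of the numbers quoted in this post, using exact fractions. The variable names are made up, and it says nothing about how the FPGA's clock unit is actually configured.

```python
from fractions import Fraction

MASTER_MHZ = Fraction(650)

video_x6 = MASTER_MHZ / 4   # 162.5 MHz, close to 27 MHz x 6 = 162 MHz
cpu = video_x6 / 4          # 40.625 MHz, the CPU clock mentioned earlier
error = abs(video_x6 - 27 * 6) / (27 * 6)
divides_to_200 = MASTER_MHZ % 200 == 0

print(f"650/4  = {float(video_x6)} MHz vs 162 MHz target ({float(error):.2%} off)")
print(f"650/16 = {float(cpu)} MHz CPU clock")
print("200 MHz reachable by integer division of 650 MHz:", divides_to_200)
```

So the video-related clocks land well within the 1% mentioned above, while 200MHz simply does not divide into 650MHz.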

One approach to solving this, would be if the FPGA we are using has two such clock units, that I could configure separately.  Reading the datasheet, it seems as though there should be 6 of them -- so I should just be able to create the 200MHz clock on a second one.  This would normally be problematic, because the phase of the 200MHz clock would have no relationship to that of the 50MHz fundamental clock for the ethernet.  However, in this case, as the 200MHz clock is only used to adjust the phase of the ethernet TX clock versus that of the TX bits, we might be able to avoid it. In fact, we could in theory use a different clock frequency for this function.  Quite possibly just 100MHz, and allow only 2 instead of 4 phase offsets for the ethernet.  This should probably be enough, and solves the problem for us, so that's what I will do.

So now I have the correct clocks.  Next step is to rework the frame generators, so that they natively use the 27MHz pixel clock to generate the frames.  In theory this should remove a pile of complexity from those files.  The MythTV database lists them as:

720 480 60 Hz 31.4685 kHz ModeLine "720x480" 27.00 720 736 798 858 480 489 495 525 -HSync -VSync

720 576 50 Hz 31.25 kHz ModeLine "720x576" 27.00 720 732 796 864 576 581 586 625 -HSync -VSync

First, we see that the redraw rates are approximately double what we would expect, with ~31KHz corresponding (coincidentally) to about 31 usec per raster.  That's okay, because we are using double-scanning instead of interlace, so we expect this. The VIC-IV internally manages this, so we can ignore it (for now).
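The line and frame rates follow directly from the numbers in each ModeLine: pixel clock divided by total pixels per line, then divided by total lines per frame. Here is a short Python sketch of that calculation. The function names are made up for illustration, and the field layout assumed is the usual X11 ModeLine ordering.

```python
def parse_modeline(modeline):
    """Extract pixel clock (MHz) and horizontal/vertical totals from a ModeLine."""
    fields = modeline.split()
    # Layout: ModeLine "name" clock hdisp hss hse htotal vdisp vss vse vtotal flags...
    clock_mhz = float(fields[2])
    htotal = int(fields[6])
    vtotal = int(fields[10])
    return clock_mhz, htotal, vtotal

def rates(modeline):
    """Return (line rate in kHz, frame rate in Hz, line duration in usec)."""
    clock_mhz, htotal, vtotal = parse_modeline(modeline)
    line_khz = clock_mhz * 1000 / htotal
    frame_hz = line_khz * 1000 / vtotal
    line_usec = 1000 / line_khz
    return line_khz, frame_hz, line_usec

NTSC = 'ModeLine "720x480" 27.00 720 736 798 858 480 489 495 525 -HSync -VSync'
PAL = 'ModeLine "720x576" 27.00 720 732 796 864 576 581 586 625 -HSync -VSync'

for name, modeline in (("NTSC", NTSC), ("PAL", PAL)):
    line_khz, frame_hz, line_usec = rates(modeline)
    print(f"{name}: {line_khz:.4f} kHz, {frame_hz:.2f} Hz, {line_usec:.2f} usec per scanline")
```

For the PAL mode this gives 27MHz / 864 = 31.25kHz and 31.25kHz / 625 = 50Hz, matching the table entry.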

So, these look pretty good -- except that the NTSC C64 has 65 usec per raster compared with 63 usec per raster for a PAL C64.   This means that if we want accurate timing, we need to adjust the lengths of the NTSC mode to be 65/63 the duration of the PAL mode. We'll come back to this, after we actually get the modes working correctly.

Now, previously I had a FIFO buffer that took the pixels generated by the VIC-IV on the 80MHz pixel clock, and allowed them to be read out using the 30 or 40MHz pixel clock of the actual display mode. In theory, it should work quite well. However, in practice, it seemed to always have jitter problems from somewhere, where the pixel positions would vary sideways a little.  Thus I am taking the FIFO out, and requiring that the pixels have explicit X positions when being written, so that the jitter cannot happen.  This buffer will get read out at the native 27MHz pixel clock rate, which should also completely eliminate the fat/skinny pixel problems caused by the video mode not matching what monitors expect, and thus trying to latch the pixels at some rate other than 27MHz.

The only challenge with this re-work, is that we need to make sure there is no vertical tear line, when the VIC-IV catches up with the pixel driver outputting from the buffer.  I'll work on that once I get it working, as there are a few ways to deal with this.

Actually, stop press on that... The VIC-IV's pixel clock is now 81MHz and the video pixel clock is 27MHz, because we had to make them related to generate them all from the one clock on the FPGA.  This means that the VIC-IV's internal pixel clock is 3x the real pixel clock.  This means we can just clock a pixel out every 3 cycles, and completely dispense with the FIFO, buffer or whatever AND avoid the vertical tear line AND free-up a precious 4KB BRAM in the FPGA in the process.  I'm not sure why it took so long for me to realise this...

Anyway, now that I have realised it, I am in the process of restructuring everything to use this property.  It's really nice in that it is allowing me to strip a pile of complexity out of the system.  It took a few hours to get the pixeltest bitstream working with it, but it now displays a nice stable pattern in both the 50Hz and 60Hz PAL and NTSC video modes.

The next step was integrating the changes into the main targets for the MEGA65 PCB and Nexys4 boards.  This wasn't too hard, but there were a few places where I had to adjust the display fetch logic, to account for the fact that the display is now 720 pixels wide, instead of 800.  In particular, this was preventing the character / sprite / bitplane fetch logic from occurring each raster line, because it was waiting for the X position to reach 800, which of course would now never happen.

There was then about a two month pause, while I had too much workload at work to have any real spare time to investigate any further.  After this involuntary pause, I started looking more seriously into the ADV7511 HDMI controller chip that we are using.

I'd previously had trouble talking to the I2C interface on this chip.  So I started with trying to waggle the SDA and SCL lines of the device.  This revealed that I had one of the pin numbers of the FPGA incorrectly recorded.  After fixing that, I finally was able to communicate with the 7511.  This was quite a nice step forward.  However, while I could read all the registers, it was refusing to write to most registers.  After some hunting around, I discovered that this was because it hadn't detected a monitor, and that causes it to stay in power-down mode, in which you can't write to most of the registers.

The mystery of why it was stuck in power-down when I clearly had a monitor connected took a little bit of exploration to work out.  I hadn't realised that the MEGA65 main board has a kind of helper chip to protect the 7511 from static and other over-voltage hazards.  That chip has an output enable line, that if not asserted prevents its built-in level converter from connecting the 7511 to the actual HDMI port.  Once I figured that out, I was able to activate the output enable, and then I was suddenly able to get the 7511 to power-up, and even to read the EDID information from the connected monitor, which can be seen in the lowest two rows of hex in the second image.  The first image shows the registers before being configured. You can play spot the differences to work out some of the register settings I used ;)

The next step was to configure all the various registers of the 7511 for operation.  The documentation for the 7511 lists a whole bunch of mandatory register settings that are required for correct operation. It also lists a whole bunch of other registers that control video format and other things.    It's quite hard to know exactly the right settings.

After some trial and error, and a general lack of HDMI picture appearing, I had a look at Mike Field's example VHDL for driving the same chip.  I'm not sure why I hadn't discovered this a long time ago.  It took a few hours to adapt his code to run on the MEGA65 main board, since it was intended for another board.  However, it too failed to show any display initially.  At which point I was quite despondent, as I had really expected his code to work, and to thus give me a starting point from which to slowly adapt across to my purposes.  It was also midnight, so I sensibly gave up and went to bed.

It wasn't until this evening that I sat down again to look at this, and it occurred to me that Mike's code might have been assuming the incorrect I2C address for the 7511, since it supports two different addresses.  It was wrong, but fixing it didn't fix the problem.  However, my own code had shown me that sometimes you have to send the I2C commands to the 7511 twice instead of once, if the monitor disconnects and reconnects again.  It seems that both of my HDMI monitors do this.  So I wrote a little routine to trigger the sending of these commands to the 7511 every second.  And FINALLY I had some sort of picture showing up on the HDMI output:

So here we see a couple of problems with the HDMI (right screen) vs VGA (left screen), as well as me writing this blog post on my laptop ;)

1. The HDMI output has the same colour for every pixel on a raster.

2. The colours are All Wrong.

For (1), I don't (yet) have any firm idea, but it could be something to do with the way Mike's program generates the HDMI frame using some complicated DDR output mode that I don't need for our 27MHz pixel clock, and thus haven't really tried to understand.

For (2), amongst the pile of reading I have done on the ADV7511, I know that this is likely a colour-space conversion problem, because RGB black becomes green in the YCbCr colour space.  That should thus be easy enough to fix by changing the default register setup.

Of these, (2) looks the easiest to fix for now.  So I'll start there.  There are two main registers on the 7511 that control the  pixel format and related things:

The upper-nybl of $15 controls the audio sample format, which we are not yet ready to worry about. The lower nybl, however, controls the pixel format.   $x6 indicates YCbCr, which is what Mike's code was using.  That would explain things. So I'll try $x0 instead, which should tell it to use RGB 4:4:4 (i.e., RGB where each colour channel is sampled at the same rate, i.e., Stink Normal RGB). And indeed it makes quite a difference:
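In code terms, changing only the pixel format means rewriting the lower nybble of the register value while leaving the upper nybble alone. A minimal Python sketch of that masking follows. The constant and function names are made up, and the upper nybble value of 2 in the example is arbitrary rather than a recommended audio setting.

```python
PIXEL_FORMAT_RGB444 = 0x0   # "$x0" in the text
PIXEL_FORMAT_YCBCR = 0x6    # "$x6" in the text, as used by Mike's code

def set_pixel_format(reg15, pixel_format):
    """Replace the lower nybble of register $15, preserving the upper nybble."""
    return (reg15 & 0xF0) | (pixel_format & 0x0F)

before = 0x26   # arbitrary upper nybble, YCbCr pixel format
after = set_pixel_format(before, PIXEL_FORMAT_RGB444)
print(f"${before:02X} -> ${after:02X}")
```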

Note that we can now see lots of pixels, which probably means that (1) is not really a problem, but was some weird artefact.  However, not all the colours are being produced the same on both outputs.  The keen observer will also notice that the image is slightly rotated horizontally.  We'll also worry about that later.

The next register of interest is register $16, which controls a number of things related to pixel format. Mike's code sets it to $37.  In my code, I had it set to $30.  The difference is that my value has "Not valid" instead of "Style 2" for the pin assignments for the pixels coming to the 7511. That's definitely a possible problem.  Bit 1 controls DDR rising/falling edge for certain modes, which shouldn't matter for us.  Bit 0 selects the input colour space, where 0 is RGB and 1 is YCbCr.  I'm going to try $30 in Mike's code, and see if it breaks the display of the image, and then if that's the case, try $34, which I think should work in my code.  Except... that it seems to make no difference to Mike's program when I do that (or to my code when I try the different settings).
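To make the three candidate values easier to compare, here is a Python sketch that splits them into the fields described above. The meanings of bit 1 and bit 0 are as stated in the text. Placing the pin-assignment style in bits 3:2 is an assumption inferred from the values themselves ($30 reads as "Not valid", $37 as "Style 2"), so treat the decoder as illustrative rather than as a datasheet summary.

```python
# Only the two style values named in the text are given names here.
STYLE_NAMES = {0b00: "Not valid", 0b01: "Style 2"}

def decode_reg16(value):
    """Split a register $16 value into the fields discussed in the text."""
    style = (value >> 2) & 0b11   # assumed field position (bits 3:2)
    return {
        "style": STYLE_NAMES.get(style, f"style field {style:02b}"),
        "ddr_edge_bit": (value >> 1) & 1,
        "input_colour_space": "YCbCr" if value & 1 else "RGB",
    }

for value in (0x37, 0x30, 0x34):
    print(f"${value:02X}: {decode_reg16(value)}")
```

Under that assumption, $34 keeps "Style 2" from Mike's $37 while selecting RGB input, which is why it looks like the natural value to try.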

So then I tried copying in the colour space conversion registers, but also without any visible effect in my code.

Now I'm a bit confused.  Basically it seems to me that the only two differences are the use of the DDR versus simple delivery of pixels, and the video mode.  Neither of which should prevent the thing from displaying an image.  The video mode in particular should be fine, since the 7511 even reports that it detects the mode correctly as the EDTV 576p or 480p video mode, which means it thinks that the HSYNC and VSYNC timing are all just fine.

But just in case, and also to make it easier to progressively converge Mike's working code and my not working code, I am converting Mike's code to produce a 576p EDTV PAL image.  If that displays on HDMI, then I know it isn't the mode, and must presumably be to do with the pixel format, and that I might have to use DDR pixel delivery after all.

Okay, so video mode switched to 576p 50Hz, and with Mike's code, it still displays just fine.  Well, fine-ish, since I have messed with the colour space stuff.   In adapting Mike's code to produce 576p, I ended up using a slightly different pixel clock of 1400/52=26.9something MHz. 

The monitor shows the mode correctly as 720x576 @ 50Hz, so this might be a good option for the MEGA65 to use, since it might get us back to a simpler set of clock frequencies. But that's for another day.  First, let's try to get some sensible pixel colours happening again, by making sure we have the pixel format set to RGB etc, and finally, we have a working HDMI image:

After simplifying the code, and re-working it to use the MEGA65-style externally supplied 27MHz pixelclock, this is now VERY similar in operation to my own HDMI driver that doesn't work. This is really quite mysterious.  The main difference I can see so far, is Mike's code has an I2C sender that does the initialisation of the HDMI controller automatically, instead of having to be bit-bashed as I do it in mine.  I've checked the registers that it sends, and I am pretty sure I do all of them.  But I have a sneaking suspicion that there is still some subtle difference there.  So I might just try incorporating that into the MEGA65 target next.

Tuesday, 5 November 2019

MEGA65 Interview in the latest podcast

Just a super-quick post to say that the interview I gave a couple of weeks ago about the MEGA65 is now live in the German-language podcast.  This interview runs for about two hours, and covers a wide range of topics.  It is unfortunately only available in my dodgy German that my daughter describes as "Papa Deutsch".  So apologies to those of you who speak no German, and even more so to those of you who can understand it, and have to contend with my German ;)

Saturday, 2 November 2019

Guest post from Wayne on the MEGA65 User Guide Layout

Today we have another guest post, this time from Wayne, who is another of our documentation angels along with Edilbert.  Today, Wayne talks about some of the challenges to be able to produce a User Guide that visually is reminiscent of the C64 Programmer's Reference Guide / User Guide.  Over to Wayne...

One of the most remembered items that shipped with every Commodore 64 was the User Guide. This important book was the gateway for many of us to understanding the internals of the computer and how to program. We want to offer that same experience with the MEGA65. While there is always the option of simply creating online content and tutorials, nothing beats the experience of opening up your User Guide and working through learning your brand new computer.

As you can tell from the screenshots, we're trying to retain the spirit and style of the original Commodore 64 manual. Of course, all content, every word and illustration must be recreated entirely from scratch.

When the team first started on this sub project, the first ticket item was to create components and layouts that would provide a base for other content.

In terms of layout, Paul provided draft templates and styles. The next step was to come up with suitable and flexible components.

This was my trial by fire, my first time working with LaTeX. So I started work on the "Keys" component. For those that remember, when the C64 manual indicated a keypress, it would show something like:

This took a little time to work out, but eventually using the tcolorbox package I got it done. With this new confidence I started on the "Screen Output" component, used to show screen display output from the MEGA65 and for screen program listings.

I sank something like 20 hours into this. LaTeX is a fickle mistress. Much time was spent fighting with various packages and syntax. Thankfully with much help and advice from Edilbert and Paul, it got done.

Next component to tackle was the sprite grid component. I knew the guide would need one, and rather than put it off, it was best just to get started on it.

This resulted in around 40 hours of hard slog and frustration. I broke many typesetting rules to make it look close to the original sprite grid illustration that people would remember. Now we have both monochrome and multi-colour sprite grids at our disposal.

At the same time while all this was going on, Edilbert started chipping away at the BASIC 10 keywords chapter. The huge benefit of his work was that, once complete, programming chapters could be built (having documented the full BASIC 10 keyword set).

At this point I was getting pretty discouraged with LaTeX as a whole and figured we'd never get done until 2022. But the team agreed we should stick with it. A good job too because from here on in, the work became easier.

Chapters started being produced more quickly because the available base elements like titles, sections, subsections, screen output listings, tables and so forth meant that writers could concentrate on the written content. So that was very encouraging.

Work has been progressing steadily on the User Guide. Recently completed sections include:

 - Setting up the MEGA65
 - Keyboard and the Screen Editor
 - ASCII Codes and Special Escape Sequences
 - Working with Decimal, Binary and Hexadecimal

As Edilbert showed in the last blog post, he has finally finished the complete set of BASIC 10 keywords. This means the development of Sprite and Sound chapters (and other programming chapters) will follow soon.

This has been a great sub-project to be involved with within the MEGA65 development team. We can always use more writers, proof readers and people willing to work through the chapters looking for issues. Instructions for how to get involved, how to download the repo and how to set up the required LaTeX software are found at:

Monday, 28 October 2019

Open C64 ROMs Guest post update

And today we have another guest post, this time from our very productive contributor to the C64 Open ROMs project, which we started so that we can have ROMs free of copyright problems for inclusion with the MEGA65, and also for emulators and other projects to use for the same purpose.  But enough from me ...

Hi all! My name is Roman, I'm from Poland, and I work on the Open ROMs. For those of you who missed previous posts - it's a MEGA65 side project to create a free and open-source ROM set for the Commodore 64 and every compatible machine. We will, of course, never reach 100% compatibility - but a reasonable degree should be achievable. My personal goal is to create not just a poor-man's replacement - I want a ROM which is simply better!

Currently both KERNAL and BASIC need a lot of work; for now I have decided to focus on the former.

I started my work by improving the IEC (read: floppy) support. That took me quite some time and a lot of debugging with a specially modified VICE emulator that shows the details of the serial communication. As a result, LOAD should now work on real hardware too - not only with VICE KERNAL hacks. Although the IEC functionality is mostly complete now (SAVE being the big missing piece; somehow I always find more urgent things to do), it will probably still need quite some work to fix incompatibilities with various utilities.

But not only LOAD works - we've got a built-in DOS Wedge too!

Another big task was the migration from Ophis to Kick Assembler. I needed conditional compilation, and Paul wanted to unify the tool-chain across the various MEGA65 sub-projects, so Kick Assembler was a natural choice. A boring job, but it had to be done. The most important advantage is that we can now easily configure builds - enabling and disabling various features (sometimes mutually exclusive, or potentially incompatible with certain existing software) at will.

Kick Assembler has some nice features - it can export labels to the VICE monitor, and the same goes for breakpoints (.break directive). This is fully utilized by our make system - just type 'make test' (or try other test targets), enter the VICE monitor, and see for yourself. I also started adding pseudo-commands for common snippets of code which might be written in a more efficient way on 65C02 and higher CPUs - again, there is still a lot to do here.

I have rewritten our utility for placing floating routines in memory (floating = not having predetermined addresses). The previous one used a very simple method: take the biggest hole (block of free ROM memory) and fill it, starting from the biggest routine. The new approach is to start from the smallest hole and try to fill it as efficiently as possible, using a slightly modified knapsack-solving algorithm that prefers solutions using larger routines. The tool could still be improved to fill more than one hole in a single step, but for now this is not needed: in practice the current solution leaves no unnecessary gaps.
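To make the idea concrete, here is a minimal Python sketch of that kind of placement. It is not the actual Open ROMs tool: the names fill_hole and place_routines, the hole and routine sizes, and the way "prefers larger routines" is interpreted (among equally full solutions, take the one with the fewest routines) are all assumptions made for illustration.

```python
def fill_hole(capacity, routines):
    """Pick routines (name -> size in bytes) that best fill one hole.

    Maximises the bytes used. Among solutions that fill the hole equally
    well, the one with the fewest (and therefore larger) routines wins.
    This tie-break is an assumption, not the real tool's rule.
    """
    # best[used] = (routine count, names) for a subset totalling exactly `used`
    best = {0: (0, ())}
    for name, size in sorted(routines.items(), key=lambda kv: -kv[1]):
        # Iterate over a snapshot so each routine is placed at most once.
        for used, (count, names) in list(best.items()):
            total = used + size
            if total > capacity:
                continue
            if total not in best or count + 1 < best[total][0]:
                best[total] = (count + 1, names + (name,))
    filled = max(best)
    return filled, list(best[filled][1])


def place_routines(holes, routines):
    """Fill holes (name -> size) from the smallest upwards."""
    remaining = dict(routines)
    layout = {}
    for hole, capacity in sorted(holes.items(), key=lambda kv: kv[1]):
        _, chosen = fill_hole(capacity, remaining)
        layout[hole] = chosen
        for name in chosen:
            del remaining[name]
    return layout, remaining


# Hypothetical sizes, for illustration only.
holes = {"small": 10, "big": 20}
routines = {"a": 10, "b": 6, "c": 4, "d": 12, "e": 8}
layout, unplaced = place_routines(holes, routines)
```

With these made-up sizes, the 10-byte hole takes the single 10-byte routine rather than the 6+4 pair, which leaves the two small routines available for whatever hole comes later.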

One of the problems with improving compatibility is how to identify which unofficial routines we need to provide. We don't have a good solution for this yet, but I've extended the warm start routine - if it was caused by BRK (for now that is a quite likely outcome of a failed attempt to call an unofficial routine), it prints the actual BRK address. Try it - launch VICE with Open ROMs and do 'SYS 49152'.

Yet another large task was to rewrite the keyboard scanning. Until recently a TWW/CTR routine was used (…orrect_and_non_kernal_way), adapted to be a part of our KERNAL. But although this code is really advanced, way better than the Commodore one, from our point of view it has a very serious disadvantage: compatibility problems. The algorithm requires more memory than the original, so it has to use some bytes that are normally free for use by the application or game. On the C64 we don't have memory allocation as in modern operating systems; everyone uses whatever they believe is free. A game can put its final boss behaviour implementation in a place used by the extended keyboard scanning, and we won't even notice it. The new routine isn't as sophisticated as the TWW/CTR one - but it still prevents ghost characters from appearing (press A+S+D at the same time on a C64; with the original ROMs you'll get the letter F written). Additionally, it can decode the extended C128 keys - the currently released build still has some problems with them, but a fix is on the way.
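For readers wondering where the phantom F comes from, the following Python sketch models a keyboard matrix without diodes. It only illustrates the general ghosting effect and is not the Open ROMs scanning routine; the function names are made up, and the (row, column) positions are taken from the commonly documented C64 keyboard matrix.

```python
# (row, column) positions in the C64 keyboard matrix for the four keys involved.
A, S, D, F = (1, 2), (1, 5), (2, 2), (2, 5)


def scan(pressed):
    """Return the matrix positions that read as closed for the given keys."""
    reported = set(pressed)
    changed = True
    while changed:
        changed = False
        for r1, c1 in list(reported):
            for r2, c2 in list(reported):
                # Three closed corners of a rectangle connect the fourth
                # corner's row and column electrically, so it reads closed too.
                if (r1 != r2 and c1 != c2
                        and (r1, c2) in reported
                        and (r2, c1) not in reported):
                    reported.add((r2, c1))
                    changed = True
    return reported


def ghost_keys(pressed):
    """Positions that read as closed although the key is not held down."""
    return scan(pressed) - set(pressed)


ghosts = ghost_keys({A, S, D})
```

A scanning routine that wants to avoid ghost characters can, for example, refuse to report new keys whenever a scan contains such a fully closed rectangle. Whether the new Open ROMs routine does exactly that is not stated here.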

In general, a lot is happening 'under the hood', to achieve better compatibility, sometimes using really dirty hacks. FileBrowser64 is now working (other browsers don't work yet), some game cartridges work too (don't bother with utility cartridges, as of yet most of them just crash).

All released builds are now versioned. The release string DEV.191025.FC.1 means that this is a development snapshot made on 25.10.2019 by FeralChild (my nick on C64 forums and on GitHub), and that it is the first snapshot released that day. If you create a bug report, please always include both the release string and the build - for now one of: generic, MEGA65, Ultimate64, testing. Right now all ROM builds can be used on any platform, as there is no hard dependency yet - but, for example, the MEGA65 build won't even try to read the C128 extra keys or initialize SIDs at $D5xx. Please also note that 'testing' contains features that will hurt compatibility; in case of problems it is always a good idea to retest with 'generic'. BTW, don't mix the builds - it will be detected and you'll see a nice KERNAL panic screen.
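If you want to pick such a string apart in a script (say, when sorting bug reports), a small Python sketch could look like the one below. The field layout (type, YYMMDD date, author tag, sequence number) is inferred from the single example above, and the Release type and parse_release function are hypothetical names, not part of the Open ROMs tooling.

```python
from datetime import date, datetime
from typing import NamedTuple


class Release(NamedTuple):
    kind: str       # e.g. "DEV" for a development snapshot
    built: date     # snapshot date
    author: str     # short author tag, e.g. "FC"
    sequence: int   # n-th snapshot released that day


def parse_release(text):
    """Split a release string such as 'DEV.191025.FC.1' into its fields.

    Assumes exactly four dot-separated fields and a YYMMDD date.
    """
    kind, stamp, author, sequence = text.split(".")
    built = datetime.strptime(stamp, "%y%m%d").date()
    return Release(kind, built, author, int(sequence))


release = parse_release("DEV.191025.FC.1")
```

A string with a different number of fields raises ValueError, which is probably what you want when triaging reports by hand.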

For the near future, I'll definitely focus on KERNAL improvements - on compatibility rather than on features. Some official routines are incomplete, and several unofficial entry points need to be provided. Once you get involved with a project like this, there is no way to be bored!