Saturday, 8 October 2022

Adding proportional text support to the MEGA65's "web" browser

In the previous post I sorted out the Ethernet and TCP stack bugs that were preventing the browser from working, as well as getting some of the general infrastructure in place, like having a start screen for the browser, that lets you choose the initial URL to open.

What I want to work on now, is improving the appearance of the pages.  The H65 format allows for using almost all of the MEGA65's graphic features, but my current page generator doesn't really support any of them, except for underlined text and in-lined images. In particular, text is restricted to using the normal C64-style 8x8 fixed width font.  That's fine for starting, and its also good to have that font available for where it makes sense, e.g., for showing BASIC code snippets.  However, it would be nice to be able to use much nicer looking fonts in pages as well, if people want to.  

Fortunately, I have done quite a bit of work on displaying proportional fonts on the MEGA65. In fact, we have MegaWAT!?, a power-point like presentation program for the MEGA65 that renders proportional text in real-time on the MEGA65. In fact, I've even given presentations at conferences using it and a real MEGA65 instead of a boring normal computer. So I already have C code for reading TTF or Type 1 fonts using libtruetype on Linux to create rasterised fonts that the MEGA65 can display.  In fact, for the browser its even easier, as we are generating the H65 files on the server-side, so we can do all of the rendering there.

What will remain the same, though, is that we don't just render the fonts onto a big fat canvas, as that would waste lots of precious chip RAM, and limit the size of pages that can be displayed.  Instead, we convert each glyph from a font into a set of 8x8 FCM characters that the VIC-IV can display. We can then just assemble the glyph each time we want to display it using the same FCM character definitions, thus saving lots of RAM. 

This would mean that all the glyphs would have to be a multiple of 8 pixels wide, which kind of defeats the purpose. But here the VIC-IV in the MEGA65 has a secret weapon: In 16-bit text mode, you can tell the VIC-IV to only draw a variable number of the pixel columns of a character. This makes it nice and easy to display glyphs from fonts that aren't exact multiples of 8 pixels wide.  Thus we can still get the nice appearance, without wasting the memory. 

The second secret weapon of the VIC-IV is the ability to use FCM characters in "alpha blending mode", where instead of allowing each pixel of a character to be chosen from the 256 colour palette, we instead use that value to indicate the fade value between the foreground and background colour of the character.  For example, if the background colour were black and the foreground colour were white, then a pixel value of $FF would display white, and a pixel value of $00 would display black, and a pixel value of $80 would display a colour half-way between them, i.e., some shade of grey. This allows text to be anti-aliased and look much better.  Best of all, it doesn't use up any palette slots on the VIC-IV to do this. 

The third secret weapon is that instead of using FCM + alpha blending mode, we can use NCM + alpha blending mode. This means each 64 byte FCM char block encodes a 16x8 pixel area instead of an 8x8 pixel area, thus reducing the number of char blocks required to display larger glyphs.

By combining all three of these, I intend to support really nice looking text in the MEGA65 browser, like this:



The first step is to modify the md2h65 converter to allow the use of these fonts. This involved pulling a pile of the code from the tool used to prepare fonts for MegaWAT!? into md2h65.  I then had to re-factor the md parser I had made, as it was pretty rubbish, and couldn't support UTF-8, or even properly support most of the mark-down formatting strings. 

Markdown doesn't include a native way to specify typefaces, so I have hacked in a C-inspired syntax that allows declaration of different typefaces for the different text types. These lines look like this:

#define FONT(p) /usr/share/fonts/type1/urw-base35/NimbusRoman-Italic.t1,16

The p in brackets means paragraph, and in addition we also support h1, h2, h3, bold, italic and bolditalic to override all of the fonts.  The ,16 at the end indicates the size of the typeface.  At the moment, the font name has to be the absolute path to the font file, but I hope to improve that along the way.

Next, the renderer in md2h65 has to assemble the lines of text that might be a mix of C64 font and these proportional fonts, all of which might have different heights.  I already had code for mostly handling that in MegaWAT!?, which I also hacked into md2h65.

Initially I haven't implemented the trimming of the character widths, as I just want to make sure I have the font rendering working.  One challenge with debugging this, is that the m65 utility that I use to render screen-shots of the MEGA65 for these posts doesn't yet properly handle the alpha blending mode, so I have to bug fix that.  I then also hit another random bug in the network code, causing some corruption of the received TCP frames, which is confusing the browser code, so as usual, I have to go down a few rabbit holes, before I can progress.

Specifically, a byte of the H65 file is being mutated from $00 to $06 at some point during either network handling, or in my parsing the H65 in the browser code. I can see it is modified in the RX buffer of the TCP socket when it is being parsed.  So now to find out when it is being corrupted...

And it looks like the contents of the Ethernet frames are being corrupted, as I am seeing things like this:

00000000: 53 65 72 76 65 76 3a 22 53 6b 6d 74 6c 65 48 54    Servev:"SkmtleHT
00000010: 54 54 2f 36 2e 36 20 50 79 74 68 6f 6e 2f 32 2e    TT/6.6 Python/2.
00000020: 37 2e 31 38 0d 0e 44 65 74 65 3a 22 53 61 74 2c    7.18MNDete:"Sat,
00000030: 20 30 38 20 4f 67 74 24 32 30 32 32 20 30 33 3a     08 Ogt$2022 03:
00000040: 33 32 3a 37 35 24 47 4d 54 0d 0a 43 6f 6e 74 65    32:75$GMTMJConte
00000050: 6e 74 2d 74 79 70 65 3e 20 61 70 70 6c 6d 63 61    nt-type> applmca
00000060: 74 6d 6f 6e 2f 6f 63 74 65 74 2d 77 74 76 65 65    tmon/octet-wtvee
00000070: 6d 0d 0a 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74    mMJContent-Lengt
00000080: 68 3a 20 31 34 37 38 33 32 0d 0a 4e 61 73 74 2d    h: 147832MJNast-
00000090: 4d 6f 64 6d 66 6d 65 64 3a 22 53 61 74 2c 20 30    Modmfmed:"Sat, 0
000000a0: 38 20 4f 67 74 24 32 30 32 32 20 30 33 3a 33 32    8 Ogt$2022 03:32
000000b0: 3a 33 38 20 47 4d 54 0d 0a 0d 0a 48 36 35 ff 27    :38 GMTMJMJH65~'
000000c0: 00 50 85 54 e8 37 00 06 06 07 f0 c9 00 00 00 00    @PETh7@FFGpI@@@@

For example.. "Servev" should most likely be "Server". 

So why is this suddenly happening now?  Ah, I am still running the old broken bitstream for some reason... Ah, that would be because I updated the bitstream in slot 0, but didn't erase the old bitstream in slot 1. Well, at least that's a simple problem, and means that I haven't got any new network bugs to deal with.  

Okay, so that's all fixed, and I can load pages again.  So now to test the NCM rendered proportional text. Initially I have just a single word at the end of the test page that is in a proportional font, the word "Some".  Without aliasing, it looks a bit messed up, but you can still see what it is:

On real hardware, the background of those gylphs is blue, not black. This is part of that bug in the m65 screenshot renderer I was saying out. So let's fix that first. Righty-oh, now can see it properly:

Again, note that I haven't yet implemented the character trimming. There is also some artefacting around the alpha blended characters in the screenshot that is not visible on the real hardware, which will be easy enough for me to fix.

Done:

Much nicer. Now to implement the width trimming.  I also noticed that my renderer incorrectly things that C64 font text is two lines high, as well, which I'll have to deal with. But first the width trimming:

That's looking nicer.  But we have some "poop" on the right-most edge of the glyphs. I'm guessing I have an out-by-one error there somewhere, which should also be fairly easy to fix.

It turned out I was just not trimming by enough. With that corrected, that single pixel column of poop goes away. Quite why it is even getting into the char blocks though is still a bit of a mystery that I would like to solve. In theory, I'm only copying those columns indicated as part of the glyph. Ah! It's because in NCM mode, I am packing two columns of pixels into a byte, and I don't check if the char is an odd number of pixels wide. So now it looks like this:

Good. Now that single word looks right.  Let's get multiple words working. At the moment we have a problem where each word appears on a separate line of its own, like this:

"Some" and "more" should be next to each other, and "bold text" should also be next to those.  That "bold text" which is just C64 font ends up on a new line tells me that the problem is that a new line is being triggered when outputting proportional text.  This problem was that I wasn't first rendering the word to measure its length, but rather just feeding it out as a line. Probably just from when I started implementing it.  But now with that fixed, we get this:

That's looking more right, except that we aren't emitting spaces following the proportional text words. With a bit of re-factoring, we now have spaces of the correct size proscribed by the fonts:

Now its time to tackle that extra vertical space. That problem was that I was incrementing the Y position during rendering the lines, and then again after the line of text had been rendered: Only one of these is required. With that fixed, its looking right again:

We are now quite close, because we can now assemble whole lines of proportional text. So I'll modify my example MD file here to use the proportional font for all of the paragraph text, and maybe also set some fonts for the headings, bold and italic, and see how it looks. 

And all the text has disappeared, which is very odd!