Saturday, 24 July 2021

Work on a Simple HTTP-based file browser

First, a sneak peek of where we have gotten to, before explaining how we got there:

We have previously improved our port of WeeIP, a tiny TCP/IP engine, so that it works fairly well on the MEGA65 with its integrated 100Mbit Ethernet adapter.

It can now be used to open an HTTP connection, and try to fetch a file.  So the next step is to try to do something useful with that.  One challenge is that we are compiling it with CC65 at the moment, which produces rather large code, and for simplicity for now (until the MEGA65 port of the VBCC compiler is ready), we are stuck with compilers that don't really support >64KB of address space.  A TCP/IP stack is already fairly large, so we end up with a minimum programme size of about 28KB. Given that we are using the C64 memory map, where we have only about 38KB of contiguous RAM available to us for programme code, this is a bit annoying.

Fortunately there is a 3rd C compiler that supports the MEGA65, and that is KickC, which specialises in extreme optimisation of the produced code. Unfortunately, it isn't an ANSI-compliant C dialect. Well, really I should say that it is inconvenient, not unfortunate, because KickC does lots of clever things to make it produce much better code on 6502 targets than a traditional C compiler is able to do.  But it does have some limitations and differences that mean that porting code is not always super simple.  We had anticipated this, by having separate directories for CC65 and KickC in the mega65-libc repository, and now the time has come to start forking things off, and getting stuff working with KickC there, so that I can see how much smaller a working TCP/IP stack KickC can produce for me than CC65 can.

First step is to update KickC to the latest version, ideally from source, so that I can patch the compiler if required.

Cloning the source from https://gitlab.com/camelot/kickc was easy, as was compiling it. This makes a kickc-release.tgz file that you can then extract where you want it.  That all went smoothly, except that when I try to compile something, I get this error:

$ bin/kickc.sh -e examples/helloworld/helloworld.c
java -jar bin/../jar/kickc-0.8.3.jar bin/../jar/kickc-release.jar -F bin/../fragment -e examples/helloworld/helloworld.c -I bin/../include -L bin/../lib -P bin/../target
Unmatched argument at index 4: 'examples/helloworld/helloworld.c'
Usage:
...

The binary release does however work, but quickly runs into problems:

$ make fetchkc.prg
git submodule init
git submodule update
../kickc/bin/kickc.sh -t mega65_c64 -a -I src/mega65-libc/kickc/include -I include -L src -L src/mega65-libc/kickc/src src/fetch.c
java -jar ../kickc/bin/../jar/kickc-0.8.5.jar -F ../kickc/bin/../fragment -t mega65_c64 -a -I src/mega65-libc/kickc/include -I include -L src -L src/mega65-libc/kickc/src src/fetch.c -I ../kickc/bin/../include -L ../kickc/bin/../lib -P ../kickc/bin/../target
//--------------------------------------------------
//   KickC 0.8.5 BETA by Jesper Gravgaard   
//--------------------------------------------------
Compiling src/fetch.c
line 65:8 mismatched input 'union' expecting {TYPEDEFNAME, PAR_BEGIN, 'const', 'extern', 'export', '__align', 'inline', 'volatile', 'static', '__interrupt', 'register', '__zp_reserve', '__address', '__zp', '__mem', '__ssa', '__ma', '__intrinsic', CALLINGCONVENTION, 'struct', 'enum', SIGNEDNESS, SIMPLETYPE}
/home/paul/Projects/mega65/weeIP/include/defs.h:65:9: error: Error parsing file: mismatched input 'union' expecting {TYPEDEFNAME, PAR_BEGIN, 'const', 'extern', 'export', '__align', 'inline', 'volatile', 'static', '__interrupt', 'register', '__zp_reserve', '__address', '__zp', '__mem', '__ssa', '__ma', '__intrinsic', CALLINGCONVENTION, 'struct', 'enum', SIGNEDNESS, SIMPLETYPE}
make: *** [Makefile:33: fetchkc.prg] Fehler 1

This is because KickC doesn't support unions, which is a bit of a bummer, because WeeIP makes fairly extensive use of them.

I've tagged the issue on the KickC repository to see if they might be inclined to add support for them sometime soon.

So that brings us back to working with CC65 for a bit longer, and working harder to keep everything small enough to fit.

One help we have is that we don't have to fit all data into the first 64KB, but can strategically use the other RAM banks to stash stuff, such as the graphics display and hyperlink info.

A while back I started coming up with a really simple file format that the MEGA65 can basically just display, and that has some hyperlink info embedded, which I call H65 for Hypertext65.  I even started making a markdown to H65 converter, the "Markdown to H65 page formatter" in the mega65-tools repository. Here are the comments I made about how I intended it to work:

/*
 * H65 is a very simple (if inefficient) rich hypertext standard
 * for the MEGA65.  It allows text, pictures and hyperlinks only
 * at this stage.  It largely works by pre-formatting a MEGA65
 * screen + colour RAM, and providing the custom FCM character data
 * required to satisfy this.
 *
 * File contains a header that is searched for by the viewer.
 * This contains the lengths of the various fields, and where they
 * should be loaded into memory, with the first 64KB of RAM being
 * reserved for the viewer. Screen data is expected at
 * $12000-$17FFF, colour RAM will load to $FF80000-$FF87FFF for
 * compatibility with MEGA65 models that have only 32KB colour RAM.
 * Hyperlinks are described in $18000-$1F7FF:
 * .word offset to list of link boundaries
 * List of null-terminated URLs.
 * List of screen-RAM offsets, hyperlink length and target link tuples.
 * (4 bytes each).
 * Tile data is allowed to be placed in banks 4 and 5.  Replacing
 * the 128KB ROM with page data is not currently allowed.
 *
 * Header:
 *
 * .dword "H65<FF>" (0x48, 0x36, 0x35, 0xFF) is required.
 * .byte version major, version minor
 * .byte screen line width
 * .word number of screen lines
 * .byte border colour, screen colour, initial text colour
 * <skip to offset $80>
 * [ .dword address, length
 *   .byte data .... ]
 * ...
 * .dword $0 -- end of file marker
 *
 * Parts:
 * Copyright 2002-2010 Guillaume Cottenceau.
 * Copyright 2015-2018 Paul Gardner-Stephen.
 *
 * This software may be freely redistributed under the terms
 * of the X11 license.
 *
 */

 

In short, we are basically making a format for loadable VIC-IV displays.  This is purposely a really unintelligent format, allowing intelligence to be built into the VIC-IV display info. For example, the page gets to provide both the VIC-IV screen RAM and colour RAM for the page. This means all VIC-IV tricks are available, and even that images can be included using FCM 256 colour or 16-colour modes.  It's also possible for the page to provide the palette entries, allowing the images to be even prettier than if a single fixed palette were assumed.
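To make the "just load the blocks" idea concrete, here is a minimal sketch of how a viewer might walk the block list described in the header comment. It is not the actual viewer code. It assumes the header sits at offset 0 of the buffer (the real viewer searches for it), that .dword fields are little-endian, and that the block list starts at offset $80. The names h65_walk, h65_le32 and h65_block_fn are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define H65_BLOCKS_OFFSET 0x80 /* block list starts here, per the header comment */

/* Read a little-endian 32-bit value (assumption: .dword is little-endian). */
static uint32_t h65_le32(const uint8_t *p)
{
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical callback: on the MEGA65 this would copy the block to addr. */
typedef void (*h65_block_fn)(uint32_t addr, const uint8_t *data,
                             uint32_t len, void *ctx);

/* Walk all blocks in an in-memory .h65 file.
 * Returns the number of blocks found, or -1 if the file is malformed. */
static int h65_walk(const uint8_t *file, size_t size, h65_block_fn fn, void *ctx)
{
  static const uint8_t magic[4] = { 0x48, 0x36, 0x35, 0xFF }; /* "H65<FF>" */
  size_t pos = H65_BLOCKS_OFFSET;
  int blocks = 0;

  assert(file != NULL);
  if (size < H65_BLOCKS_OFFSET + 4 || memcmp(file, magic, 4) != 0) return -1;

  for (;;) {
    uint32_t addr, len;

    if (size - pos < 4) return -1;          /* missing end-of-file marker */
    addr = h65_le32(file + pos);
    if (addr == 0) return blocks;           /* .dword $0 ends the file */
    if (size - pos < 8) return -1;          /* truncated block header */
    len = h65_le32(file + pos + 4);
    pos += 8;
    if (len > size - pos) return -1;        /* block data runs past the end */
    if (fn) fn(addr, file + pos, len, ctx); /* hand the block to the loader */
    pos += len;
    blocks++;
  }
}
```

Passing NULL for the callback just validates the file and counts the blocks, which is handy for testing the converter output on the PC side before trying it on real hardware.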

It also allows us to include proportional text etc on the pages, similar to how we do it in our PowerPoint clone, "MegaWAT!?", where we use the VIC-IV's 16-bit text mode and its ability to do hardware kerning and anti-aliasing. But for now, we will just work with normal text, then maybe add images, and work our way up from there.

We allow 24KB for screen RAM, which assuming that we use 80-column mode means a page can be 24*1024/80 = 307 lines long, or 153 lines long if we use 16-bit text mode. Again, as the page can provide VIC-IV register settings, this is up to the page designer.
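The arithmetic above is easy to check in code. This is only a sketch: the function name h65_max_lines is hypothetical, and it assumes every character cell takes either 1 byte (normal text) or 2 bytes (16-bit text mode) of screen RAM.

```c
#include <assert.h>

#define H65_SCREEN_RAM_BYTES (24UL * 1024UL) /* $12000-$17FFF */

/* Whole screen lines that fit in the 24KB screen RAM window. */
static unsigned long h65_max_lines(unsigned int columns, unsigned int bytes_per_char)
{
  assert(columns > 0 && bytes_per_char > 0);
  return H65_SCREEN_RAM_BYTES / ((unsigned long)columns * bytes_per_char);
}
```

With 80 columns this gives 307 lines at one byte per character and 153 lines at two bytes per character, matching the figures above.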

The md2h65.c programme currently has the bare essentials to write out a valid empty .h65 file, but doesn't currently parse a .md file to get the contents of it.  So that's probably the next step, so that we have an example .h65 file that we can fetch. I'll then set up a temporary local webserver, and start testing the complete setup.

Initially I will just support plain text, and bold and maybe a simple header format, just using colours to indicate those styles, since that will be quick and easy to parse.  So something like:

# A heading

Some text with **a bit of bold**

will be allowed.

With a bit of hacking, I have, in theory at least, implemented writing out simple .h65 files generated from a .md file, allowing such formatting.  Now to modify the fetch programme to display it.  This means running a simple webserver, which is made super easy by python:

python2 -m SimpleHTTPServer 8000

in the directory where the files live.  So I can do a single test command like this:

make fetch.prg && m65 -F -4 -r fetch-unpacked.prg && python2 -m SimpleHTTPServer 8000

This will work because the WeeIP TCP/IP stack takes a few seconds to complete DHCP negotiation etc, before actually doing anything, thus giving the web server time enough to get started. 

This means that we need to be able to provide an IP rather than a hostname, since I am testing it on the local LAN. Our DNS library in WeeIP currently only supports resolving actual hostnames, so I have quickly written an IP parser that we try on the passed in hostname before falling back to DNS:

  // Check if IP address, and if so, parse directly.
  offset=0; bytes=0; value=0;
  while(hostname[offset]) {
    printf("Checking $%02x, value=%d, bytes=%d\n",hostname[offset],value,bytes);
    if (hostname[offset]=='.') {
      ip->b[bytes++]=value;
      value=0;
      if (bytes>3) break;
    } else if (hostname[offset]>='0'&&hostname[offset]<='9') {
      value=value*10; value+=hostname[offset]-'0';
    } else
      // Not a digit or a period, so it's not an IP address
      break;
    offset++;
  }
  if (bytes==3&&(!hostname[offset])) {ip->b[3]=value; return 1; }
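The fragment above is deliberately forgiving: it doesn't check that each value fits in a byte, and it accepts empty fields such as "1..2.3". For comparison, here is a stricter stand-alone sketch of the same idea. It is not the code in WeeIP, and the name parse_ipv4 and the plain 4-byte output array are my own choices for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Parse a strict dotted-quad such as "192.168.178.31".
 * Returns 1 and fills out[0..3] on success, 0 if s is not an IPv4 address
 * (in which case the caller would fall back to DNS). */
static int parse_ipv4(const char *s, unsigned char out[4])
{
  unsigned int value = 0;
  int digits = 0; /* digits seen in the current field */
  int bytes = 0;  /* completed fields */

  assert(s != NULL && out != NULL);
  for (;; s++) {
    if (*s >= '0' && *s <= '9') {
      value = value * 10 + (unsigned int)(*s - '0');
      /* Reject over-long fields and values that don't fit in a byte */
      if (++digits > 3 || value > 255) return 0;
    } else if (*s == '.' || *s == '\0') {
      /* Reject empty fields and a fifth field */
      if (digits == 0 || bytes > 3) return 0;
      out[bytes++] = (unsigned char)value;
      value = 0;
      digits = 0;
      if (*s == '\0') return bytes == 4;
    } else {
      /* Not a digit or a period, so it's not an IP address */
      return 0;
    }
  }
}
```

A hostname like "example.com" fails on the first letter, so the cost of trying this before DNS is negligible.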

With the above, I am now able to have the MEGA65 attempt to fetch the page from the web server, and I see signs that it is indeed (mostly) working:


We can see that it connects to the HTTP server and fetches 536 bytes. Because the programme just prints out what it reads for the moment, we see the h65 and the cross-hatch character, which is the header of the .h65 file.  The only problem is that we can see that the content length is supposed to be 1,024 bytes, not 536.  But we can deal with that as we progress.  

What I might work on next is to instead focus on improving the UI of the programme itself. Or more correctly, actually give it a user interface, rather than just making a single fixed connection and displaying a pile of debug fluff on the screen. 

Having some kind of URL bar at the top of the screen, for example, would be a nice idea. What I will probably do, is have a display for entering a URL, and that it also shows while a page is being fetched. Then this will toggle to the page display view, after a page has been loaded.  Before doing this, our file size is currently 29,390 bytes, so we will see how much bigger it gets adding that.

Well, it's now about 34KB, but it can show something like this:

The .md file is pre-converted into a .h65 before being placed on the web server. But it can then be fetched and displayed.  The video mode, colours and underline attribute are all built into the screen and colour RAM data of the .h65 file. So it's just a case of loading the blocks into RAM in the right places, and the page is "rendered".

It's now time to get ready for dinner, but after the kids are in bed, I might look at switching the font for our ASCII font, so that we don't have to munge the charset, and can use upper and lower-case chars. I might make it so that the pages can choose for themselves whether they are PETSCII or ASCII, so that we can have cool 80x50 PETSCII art rich pages.

Once that is done, then I think it will be time to start adding support for hyperlinks. I have a cunning plan for implementing those in a super simple way, as well.  But whether I will start to run out of RAM before then, we will have to see.

Well, I got to spend a bit of time working on this, and even live streamed for a couple of hours as I added scrolling via keyboard and mouse, and set up my new MouSTer USB to C= mouse adapter.

This got things to the point where we can now have a page that is longer than fits on a single screen, and use cursor up and down to scroll through, and also use the mouse to scroll up and down.

So I thought this morning I might try to configure my MouSTer to enable the scroll wheel and 3rd button in C1351 mode, so that I can support them, as it is just so tempting to want to be able to use the scroll wheel.  However, I can't see how to do this. Maybe if the excellent Retrohax folks are reading along, they can drop me a comment below on how to do this.

So instead I'll attack something different for a little while.  First up, I'll call the mouse update routine in all the busy wait spots during DHCP, TCP connection etc, so that the machine doesn't feel so dead.  That makes it feel much nicer. I also modified the configuration of my MouSTer to double the mouse speed, which now also feels much more responsive, without being silly. So that's a couple of simple usability improvements for me. 

While I'm doing such things, I might also modify the mouse pointer code to change colours, so that it doesn't just sit there dully.  Ok, that looks nicer.

Next I might see if I can't make images work, as that will make it all look much fancier.  It won't support animation or anything fancy yet, just static graphics. But nonetheless, it will be much better than just plain text.  

We already have most of the PNG handling code we need from pngtoscreens.c, including palette handling and tile generation (and, I think, tile de-duplication).  I mostly just need to add the parser for the markdown for indicating an image, and then load the image. In fact, we even already have the code for writing out the tiles.  So we just need to draw the tiles on the screen RAM, and we should be good.

Well, that _almost_ works now:

Clearly the colour RAM attributes are borked. The display on the real hardware is actually a bit different, as the screen capture tool doesn't correctly handle transparency, but what is correct is that the palette is all borked up, and VIC-III attributes are being applied to the tiles, when they shouldn't be.  The easy solution to the latter problem is to set the tile colour to a colour <16, although that does cause us some problems for colour $FF not being available. But that's the trade-off for being able to use hardware underline etc for headings and things.

So let's focus on the palette for a bit, and try to get to the bottom of that. It isn't that the palette entries need nybble swapping, as I checked that, and while the result is "different", it's certainly not correct.  I did verify that the mode is displaying correctly, and also checked what the palette looks like by adding a routine on the P key in the fetch programme, to show the palette on 8x8 tiles at the top of the screen. The result while trying to show the well known MEGA65 logo banner is:

Which is clearly Not Correct(tm).  There should be colours matching the four colour bars.  

To try to work out what is going on, I added some debug info that shows when each colour is allocated, and saw that it was allocating colours twice-over.  I remembered that I had put the logo in the .md file twice. So just to keep life simple, I removed the 2nd instance of it, and what do you know? It suddenly displays mostly correctly:

Again, the screen-capture programme mis-renders the transparent pixels as black. But we can see it's now 99% correct, except that the yellow bar isn't showing correctly.  This might be the 256th colour issue I was mentioning earlier.

So now to find out why my image loader is failing to handle the palette properly if it is asked to load a subsequent image.  It should simply remember the existing palette, and every colour should match. But clearly that's not happening, so let's find out why... And it was a buffer overflow, allowing one too many colours to be recorded, thus overwriting the number of recorded colours.  

With the 16 C64 colours already allocated, the banner image requires one too many colours, so it triggers this problem.

This actually touches on a broader problem, which is that all the images on a page are expected to share a single palette, and the converter programme at the moment doesn't do any down-sampling of the colours to make them fit.

We could in theory use a 2nd palette bank for the graphics to ease this problem, or indeed use two separate palette banks for two separate images to allow 512 colours on screen, with 256 (well 256 - 16 = 240) in one and 256 potentially different colours in the other.

But maybe it will just be easier and more flexible to digest the palette requests for all the images in the page in a first pass, and then quantise the palette values to be used in a second pass.  That's probably a better approach.

So in the first pass we will log all colours, and ignore if we have too many colours to render first time around. And if there were too many colours, then we will trigger the quantisation and second pass.  There are fancy algorithms for doing this the best way possible, but for now, I just want a proof of concept. The general consensus seems to be "cluster and reduce"... But first we need a list of all the colours in the image(s)...

Code duly modified to record >256 colours and flag if we need a 2nd pass. What is curious, is that including the same image twice results in more colours being required, than if it is only included once. This should not happen.  This makes me suspect some other memory corruption problem somewhere. Nope, it turns out it was just the order of stdout and stderr messages appearing that made it look like that was the problem.

I've now implemented a very crude colour quantiser. I say crude, in that what it does is find the least frequently used colours, and replaces them with the closest remaining colour, until the palette fits.  This means the first image will have the best colours, but later images might end up with totally borked colours.  But that's okay for now -- remember that this isn't a limitation of the file format, but just me being a bit lazy in getting a proof-of-concept system up and running.
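For the curious, the "merge the least-used colour into its nearest neighbour" idea can be sketched as below. This is an illustration under my own assumptions rather than the converter's actual code: the struct qcolour layout and the names dist2 and quantise_crude are hypothetical, distance is plain squared RGB distance, and a merged colour's pixel count is added to the survivor (which the real quantiser may or may not do).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct qcolour {
  unsigned char r, g, b;
  unsigned long count; /* pixels using this colour */
  int mapped_to;       /* index of surviving colour, or -1 if it survives */
};

/* Squared distance between two colours in RGB space. */
static long dist2(const struct qcolour *a, const struct qcolour *b)
{
  long dr = (long)a->r - (long)b->r;
  long dg = (long)a->g - (long)b->g;
  long db = (long)a->b - (long)b->b;
  return dr * dr + dg * dg + db * db;
}

/* Repeatedly merge the least-used surviving colour into its nearest
 * surviving neighbour until at most max_colours survive.
 * Returns the number of surviving colours. */
static int quantise_crude(struct qcolour *c, int n, int max_colours)
{
  int alive = n;
  int i;

  assert(c != NULL && max_colours >= 1);
  for (i = 0; i < n; i++) c[i].mapped_to = -1;

  while (alive > max_colours) {
    int victim = -1, nearest = -1;
    long best = LONG_MAX;

    /* Find the least frequently used surviving colour */
    for (i = 0; i < n; i++)
      if (c[i].mapped_to < 0 && (victim < 0 || c[i].count < c[victim].count))
        victim = i;

    /* Find the closest other surviving colour */
    for (i = 0; i < n; i++) {
      long d;
      if (i == victim || c[i].mapped_to >= 0) continue;
      d = dist2(&c[victim], &c[i]);
      if (d < best) { best = d; nearest = i; }
    }

    /* Merge, and redirect anything that was already mapped to the victim */
    c[victim].mapped_to = nearest;
    c[nearest].count += c[victim].count;
    for (i = 0; i < n; i++)
      if (c[i].mapped_to == victim) c[i].mapped_to = nearest;
    alive--;
  }
  return alive;
}
```

Because the choice of victim only looks at pixel counts, an image with large flat areas will always win out over a detailed image with many rarely used colours.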

I'll soon grab a few more images, and see how it looks if I load multiple images in. But first, I want to fix the mouse scrolling: Previously it would scroll at a constant rate whenever the mouse was at the top or bottom edge. I now want to have it properly scroll proportional to the mouse's movement, so that it feels much more natural.  That feels much nicer. I'll have to make a video of it in action.

Now, back to multiple images, I have that working now, too. But as we were saying just before, the palette selection stuff is really quite amazingly sub-optimal.  To get an idea of just how suboptimal it is, see the following screenshot:

Because the MEGA65 logo has large slabs of a lot of colours, it is taking precedence over the more complex (and thus fewer pixels of each colour) devkit image. 

It turns out that this quick and dirty colour quantiser really does just suck, as even if I remove the MEGA65 logos, it still looks like rubbish. But if I use a tool like convert or mogrify (or you could use gimp or something graphical like that, too), then the result is, not surprisingly, much better:


So improving the quantiser would obviously be a really good idea. But I'm going to first focus on adding hyper links, so that we can have multiple pages, and navigate between them.  We are still at only 35,600 bytes for the programme, so it should be possible to add support for hyperlinks.

My plan is to have a data structure of URLs and the regions of the page that are links to them in memory as part of the page.  Thus any click of the mouse just needs to look up the position on the page, and see if it matches anything in the list, and if so, get the address of the URL in memory from that.  

We can do this most simply by describing bounding boxes, which will also make it nice and easy to have images act as link anchors, as well.  Speaking of images with links behind them, there isn't actually a proper standard in markdown for doing that, so I'm going to use magic text in the alt text field, which we don't otherwise use.  I'll probably also allow some other attributes in there, such as horizontal alignment directives. But let's start by looking for an href= tag in the alt text.
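The bounding-box lookup itself is only a few lines. Here is a minimal sketch, assuming a hypothetical in-memory layout (struct link_box, with inclusive character-cell coordinates and an offset into the URL list) rather than the exact layout the .h65 file uses:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounding box for one link, in character cells. */
struct link_box {
  unsigned char x1, x2;      /* first and last column, inclusive */
  unsigned short y1, y2;     /* first and last page line, inclusive */
  unsigned short url_offset; /* offset of the null-terminated URL in the URL list */
};

/* Return the first box containing (col, line), or NULL if not over a link.
 * line is the line within the page, i.e. screen row plus scroll position. */
static const struct link_box *find_link(const struct link_box *boxes, size_t count,
                                        unsigned int col, unsigned int line)
{
  size_t i;

  assert(boxes != NULL || count == 0);
  for (i = 0; i < count; i++) {
    const struct link_box *b = &boxes[i];
    if (col >= b->x1 && col <= b->x2 && line >= b->y1 && line <= b->y2)
      return b;
  }
  return NULL;
}
```

The same routine can serve both the click handler and any hover feedback, since both only need to know whether the pointer is currently over a link.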

Pottering away while I listen to the Olympic Games opening ceremony, I have implemented links for both images and text, and cooked up a few inter-linked pages for demonstrating it. And it works. I think. Except that the TCP/IP stack is giving me mountains of grief now, going super slow or corrupting the received stream most times, so I'm investigating that a bit.

But first, here is a nice example of a page with some links:

The image and the two hardware-underlined links are all clickable, and it even will load the target page, if the TCP stack plays nicely.  You can't see it here, but the mouse pointer changes colour based on whether you are over a link or not.

It looks like the TCP slow-start mechanism and ACKing stuff is getting all confused, as we can see in this tcpdump capture.

First, we see .60 (the MEGA65) acking a packet to the linux box, which then results in the Linux box immediately sending the next packet:

23:03:24.457335 IP 192.168.178.60.1093 > 192.168.178.31.8000: Flags [.], ack 13954, win 1536, length 0
23:03:24.457439 IP 192.168.178.31.8000 > 192.168.178.60.1093: Flags [P.], seq 13954:14490, ack 116, win 64125, length 536

... and then the TCP slow-start mechanism decides it would be a great idea to send another packet straight away, to increase throughput:

23:03:24.457502 IP 192.168.178.31.8000 > 192.168.178.60.1093: Flags [.], seq 14490:15026, ack 116, win 64125, length 536

Which should then result in the MEGA65's TCP/IP stack acknowledging both of those packets, which it doesn't do. Instead after 2 seconds it acknowledges the first of them, but not the second:

23:03:26.429488 IP 192.168.178.60.1093 > 192.168.178.31.8000: Flags [.], ack 14490, win 1536, length 0
23:03:26.429603 IP 192.168.178.31.8000 > 192.168.178.60.1093: Flags [P.], seq 15026:15562, ack 116, win 64125, length 536
23:03:28.461701 IP 192.168.178.60.1093 > 192.168.178.31.8000: Flags [.], ack 14490, win 1536, length 0

And similar dysfunction continues.  Well, the good news is that I can see that the problem is clearly on the MEGA65 side, so I should be able to find what is going on, and fix it.

My guess is that the ack isn't sent, because the sequence number doesn't match exactly.

Well, except that I am also seeing spots where the MEGA65 ACKs a packet, and then the Linux box waits a very long time before bothering to send a packet, e.g., as here:

23:34:37.314214 IP 192.168.178.60.4000 > 192.168.178.31.8000: Flags [.], ack 6986, win 1536, length 0
23:35:49.678613 IP 192.168.178.31.8000 > 192.168.178.60.4000: Flags [P.], seq 6986:7522, ack 116, win 64125, length 536

It waits about 72 seconds before sending the packet. I'm guessing the Round-Trip-Time estimation on the Linux side is getting very confused. That will probably somewhat resolve if we can fix problems with ACKing when multiple packets are sent.

I also wonder if we are still having funny problems with the ethernet interface not immediately handing packets over.

Maybe I will make the TCP/IP stack send debug info via the serial monitor interface, so that I can time-stamp it, and determine the exact time-relationship between the various actions.  In particular, I want to know which side is delaying at which time.
