Monday, 6 October 2025

SMS Thread Display, Message Editing etc

In a previous post, I got Unicode text rendering working, complete with line breaking, emojis and a pile of other stuff. That means we can show message threads.

Now we need to refactor that out into something more usable, and add the missing bits, like being able to type a message, hit send, have a functional scroll-bar and so on.

Then we'll have almost an entire working SMS system, sans talking to the actual cellular module -- which I really should finally hook up and test. 

Let's start with refactoring this stuff out into smsscreens.c 

I'd also like proper working scroll bars. After a bit of thought, I decided to use H640 multi-colour sprites that are full height (the VIC-IV in the MEGA65 lets you change the height of sprites, so that you don't need to use a multiplexer). That way I can have one colour for the background of the scroll area, and another for the foreground.

Done this way, the scroll bar implementation becomes embarrassingly simple:

char draw_scrollbar(unsigned char sprite_num,
            unsigned int start,
            unsigned int end,
            unsigned int total)
{
  unsigned char first;
  unsigned char last;
  unsigned long temp;

  // Clamp the inputs so we can't divide by zero or run off the end
  if (!total) total=1;
  if (start>total) start=0;
  if (end>total) end=total;

  // Scale start and end to 0-255 sprite rows. Promote to long before
  // shifting, so that start<<8 can't overflow a 16-bit int.
  temp = ((unsigned long)start)<<8;
  temp /= total;
  first = temp;

  temp = ((unsigned long)end)<<8;
  temp /= total;
  last = temp;

  // Fill the whole bar (0x300 bytes = 256 rows x 3 bytes) with the
  // background bit pattern, then draw the foreground pattern over the
  // rows corresponding to the visible region.
  lfill(0xf100 + (sprite_num*0x300), 0x55, 0x300);
  lfill(0xf100 + (sprite_num*0x300) + first*3, 0xAA, (last-first+1)*3);

  return 0;
}
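
For example, refreshing the scroll bar for the visible window of the thread is then a single call per redraw (the sprite number and variable names here are just illustrative):

  // messages first_visible..last_visible of msg_count currently on screen
  draw_scrollbar(7, first_visible, last_visible, msg_count);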
  

So let's now implement simple scrolling through the message thread using the cursor keys, and show the current scroll region... And it works  :) The main issue is that it's quite slow to draw at the moment, because I haven't optimised anything. That's totally fine for now.

 And a video showing just how slow it is:


Okay, so that's the display of the messages working. Next stop is letting you type and edit a message. I'm not going to allow the up and down cursor keys to navigate the message draft: those will still scroll the thread up and down. Left and right we will make work, though. Any general key press will do what it should, backspace will work, and RETURN will be for send. For simplicity, we'll probably just always show the message edit box at the bottom. For now, I'll make it black on light grey, so that it's easy to tell apart from the message thread above.
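
Here's a sketch of that dispatch. $D610 is the MEGA65 hardware keyboard queue (write to it to acknowledge a key), the key codes are the usual C64/PETSCII ones, and every helper function named here is a hypothetical stand-in for the behaviour just described:

unsigned char k = PEEK(0xD610);
if (k) {
  POKE(0xD610, 0);                         // consume the key
  switch (k) {
  case 0x91: scroll_thread(-1); break;     // cursor up: scroll thread
  case 0x11: scroll_thread(+1); break;     // cursor down: scroll thread
  case 0x9d: move_draft_cursor(-1); break; // cursor left
  case 0x1d: move_draft_cursor(+1); break; // cursor right
  case 0x14: draft_backspace(); break;     // INST/DEL deletes
  case 0x93: clear_draft(); break;         // SHIFT+HOME clears the draft
  case 0x0d: send_draft_message(); break;  // RETURN sends
  default:   draft_insert_char(k); break;  // anything else types into the draft
  }
}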

Unicode entry (e.g., for emoji) is something that I have yet to solve. It's all a bit complicated because our limited glyph buffer means we can only have 512 unique glyphs on the screen at a time. What I might end up doing later is having an emoji/special-character entry button that hides the SMS thread display and just shows an emoji/Unicode code point chooser, and when you select one, it passes it back to the editor that called it. That will probably be a separate helper program, because of the code size it will entail. It's also not on the critical path for demonstrating core functionality, so I'm not going to let myself be distracted by it.

Okay, so let's add this message drafting box. Ah, that reminds me: the way I draw multi-line strings like this, I can't tell it to fill a certain number of lines. So I'll just make it take the space it needs, let it grow as required, and trigger a re-draw of the whole screen if the number of lines in the message draft changes. I'm also not yet sure how I'll handle the cursor. I might make it a special dummy character in the string (possibly just a | character, since it looks sufficiently cursor-ish).

Well, I continue to be pleased with the architecture I've laid down for this. It's not perfect, as we have those things I've described above that we'll need to do. Nonetheless, I've been able to quickly plumb in the ability to do simple message drafting, complete with cursor key navigation and backspace, and because it's a C64-type machine, SHIFT+HOME clears the message draft.

 

In short, we have all the ingredients ready, right up to the point where we need to add the message to the thread to simulate sending it, and then also send it to the cellular modem.

To finish that off, I'll need a routine to write to a record in the D81 on native hardware. That shouldn't be too hard. Then we can try to actually plumb it into the cellular communications stuff.

To test the saving and restoring of SMS drafts, I've added the bit of code necessary to read any saved draft, still using our horrible but adequate hack of using a | character as the cursor:

  // Read last record in disk to get any saved draft
  read_record_by_id(0,USABLE_SECTORS_PER_DISK -1, buffers.textbox.draft);
  buffers.textbox.draft_len = strlen(buffers.textbox.draft);
  // Default the cursor to the end of the draft, in case no marker is found
  buffers.textbox.draft_cursor_position = buffers.textbox.draft_len - 1;
  // Reposition cursor to first '|' character in the draft
  // XXX - We really need a better solution than using | as the cursor, but it works for now
  for(position = 0; position<buffers.textbox.draft_len; position++) {
    if (buffers.textbox.draft[position]=='|') {
      buffers.textbox.draft_cursor_position = position;
      break;
    }
  }

Then all we need to do is to write that record back whenever we modify the draft message:

      // Update saved draft in the D81
      write_record_by_id(0,USABLE_SECTORS_PER_DISK -1, buffers.textbox.draft);

That should be all we need, once I have a working write_sector() routine -- which I think I have implemented now. But the draft message is notably not being restored when I re-run the program, so something is going wrong.

The FDC busy flag strobes when requesting the write, so it seems like it should be working. I've also confirmed that the buffer contents look correct as they're being written.

Okay, I can actually confirm that it's being written to the sector on the D81. So why isn't it being loaded properly when the SMS thread is loaded? The reason is that the read sector call is failing. So why is it failing, when we know it works in the general case? Is our record number too high, causing an invalid read?

Found the problem -- I wasn't mounting the disk image before reading the message draft :) With that fixed, it now retrieves the draft message on start-up. Nice!

So in principle, I should be able to now implement the "send message" function of hitting return by:

1. Build an "outgoing" SMS message.

2. Allocate the next free record and write the message to it.

3. Shift the scroll position in the thread to the new end of the thread.

4. Clear the draft message.

5. Redraw the thread.

We'll ignore what to do if the message thread is over-filled, and just silently fail. 

The good news is that we have all the functions we need to do this already, as this is just replicating what the import utility does when populating these message threads.

So let's start with forming the outgoing SMS message and logging it. It turns out that the sms_log() function does both steps 1 and 2. All it needs is the phone number the message is being sent to. It's a little inefficient, in that it searches for the contact that matches the phone number, instead of just slotting the message into the current contact. I might refactor that out, both for speed, and to stop messages being logged against the wrong contact if multiple contacts have the same phone number.
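
Putting the five steps together, the RETURN handler is then roughly the following sketch -- sms_log() and write_record_by_id() are the real routines discussed in this post, while the other helpers (and sms_log()'s exact argument list) are my assumptions:

void send_draft_message(void)
{
  // Steps 1 & 2: form the outgoing SMS, allocate the next free record,
  // and write the message to it (sms_log() does both)
  strip_cursor_marker(buffers.textbox.draft);              // hypothetical
  sms_log(current_contact_phone_number, buffers.textbox.draft);

  // Step 3: jump the scroll position to the new end of the thread
  scroll_to_end_of_thread();                               // hypothetical

  // Step 4: clear the draft, including the saved copy in the D81
  buffers.textbox.draft[0] = 0;
  buffers.textbox.draft_len = 0;
  write_record_by_id(0, USABLE_SECTORS_PER_DISK - 1, buffers.textbox.draft);

  // Step 5: redraw the thread
  redraw_thread();                                         // hypothetical
}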

Hmm... the program is crashing now. I wonder if I haven't stomped over some RAM somewhere. I've had this before, where CC65 compiled programs go wonky on me when I hit memory limits.

I really have been using CC65 for this out of habit, but perhaps it's time to try one of the other MEGA65-supported compilers, like llvm-mos.

I built and installed llvm-mos like this:

$ git clone https://github.com/llvm-mos/llvm-mos.git
$ cd llvm-mos
$ cmake -C clang/cmake/caches/MOS.cmake -G Ninja -S llvm -B build 
$ cd build
$ ninja
$ cd ..
$ ninja -C build install-distribution
$ cd ..
$ git clone https://github.com/llvm-mos/llvm-mos-sdk.git
$ cd llvm-mos-sdk
$ mkdir build && cd build
$ cmake -G Ninja .. -DCMAKE_INSTALL_PREFIX=/usr/local   # or $HOME/opt/llvm-mos
$ sudo ninja install

With that I should be able to use the same build setup as in GRAZE.

I've got it compiling now, but it's not working correctly, so I have to go through and debug whatever issues the switch to LLVM has caused. On the plus side, with optimisation enabled, the LLVM-compiled program is only about 30KB, which is a significant improvement on CC65 -- even if I suspect it's mostly link-time optimisation, i.e., leaving out functions that never get called. But if it still makes the difference between works and doesn't work, then that's fine by me.

This comes back to a recurring problem that I have with debugging this kind of thing: if the error occurs within nested function calls, then I initially can only detect the outermost error, which is probably just a symptom of some inner thing failing. When I'm cross-compiling, I have a fatal() macro that reports the failure tree. But for native builds that doesn't do anything useful, because the screen gets all messed up, and because the code takes up space. But with LLVM, we have a bit more space available to us again.

We do have the serial monitor interface available to us, though. Or even just dumping stuff into RAM somewhere magic for later extraction.  The MEGAphone software doesn't use the AtticRAM at all, so I could just dump stuff up there, if it's too tricky to push the messages over the serial monitor interface. 
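
If I go the AtticRAM route, a minimal version looks like this -- AtticRAM starts at $8000000 on the MEGA65, and lcopy() is the far-copy routine from mega65-libc; the log layout is just an assumption:

// Append debug bytes to an ever-growing log in AtticRAM,
// for later extraction via the serial monitor
#define ATTIC_LOG_BASE 0x8000000UL
static unsigned long attic_log_ofs = 0;

void attic_log(const unsigned char *buf, unsigned int len)
{
  lcopy((unsigned long)buf, ATTIC_LOG_BASE + attic_log_ofs, len);
  attic_log_ofs += len;
}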

And now I'm pulling my hair out, because LLVM is generating incorrect code for the usleep() function and/or the function for writing strings to the serial UART monitor interface when I use -Oz, -O1, -O2 or -O3 -- and it fails in different ways at each level.

So I think a lot of the problem is that llvm-mos assumes the Z register always contains zero. But then Z gets set in places, and it doesn't realise what's happened. Or perhaps, somewhat equivalently, it is using LDA ($zp) instructions that on the MEGA65 are actually LDA ($zp),Z, without realising.

Found the problem there: I was corrupting Z in another routine, which I've fixed.

But even after that, some stuff is output incorrectly from the mega65_uart_printhex() routine -- but only if I use stack storage for the hex string in that routine. The problem is that the pointer it's using for the stack points to $FFFB, i.e., under the KERNAL, and so it reads rubbish out instead of the correct data.

That happens because the stack pointer is stored at $02 and $03, and something is writing through a null pointer, thus overwriting the stack pointer. Not cool.

Something to do with the shared resources helper assembly is borking things. I'm not sure why just yet, because it has an allocation for the 5 bytes it uses, so they shouldn't be at $0000.

Yup, for some reason the address is resolving to $0000. Found the problem: the extern declaration for the _shres_regs area was as a pointer, and thus dereferencing the stored data, yielding $0000.

With that fixed, I can now see the failure points:

src/telephony/contacts.c:45:mount_contact_qso():0x03
src/telephony/contacts.c:58:mount_contact_qso():0x0A
src/telephony/contacts.c:60:mount_contact_qso():0x08
src/telephony/records.c:146:read_record_by_id():0x03
src/telephony/contacts.c:40:mount_contact_qso():0x01
src/telephony/contacts.c:41:mount_contact_qso():0x02
src/telephony/contacts.c:45:mount_contact_qso():0x03
src/telephony/contacts.c:51:mount_contact_qso():0x05
src/telephony/contacts.c:54:mount_contact_qso():0x06
src/telephony/contacts.c:58:mount_contact_qso():0x0A
src/telephony/contacts.c:60:mount_contact_qso():0x08
src/telephony/records.c:170:write_record_by_id():0x04
src/telephony/records.c:179:write_record_by_id():0x03
 

First up is the top of the list. So I'll start investigating those.

The first one was actually a debug thing I'd put in place. The second one is a file not found error when mounting the D81. This happens because the pointer to the filename is somehow winding up as a null pointer. Possibly because something upstream has messed with a null pointer. I _really_ don't like the SP being at $02-$03 in ZP with LLVM, as even the smallest null pointer dereference write will smash it, and make it hard to track down the root cause. Logged an issue: https://github.com/llvm-mos/llvm-mos/issues/505 

Anyway, I tracked the problem down: I hadn't converted the assembly helper routine from the CC65 calling convention to the LLVM one. Now I'm back to it displaying the message thread for the contact :)

That said, it's still reporting some failures.  I'm now thinking I'd like to be able to get stack back-traces working, to make debugging even easier. After an hour or so of mucking about, I have it working as well as I can without having to load a line-by-line symbol table:

src/telephony/contacts.c:41:mount_contact_qso():0x02
Backtrace (most recent call first):
[02] 0x312A mount_contact_qso+0x00C5, SP=0xD000
[01] 0x10DB main+0x0660, SP=0xD000
[00] 0x0A89 main+0x000E, SP=0xD000
 

This is running entirely on the MEGA65, with this output via the serial monitor UART interface.

I can then make a utility that works out the exact lines those function offsets correspond to, should the need arise. But just having a general idea of the position of the call within the calling function should be enough.

I'm really happy with this. Actually, that statement undersells just how happy I am to have access to stack back-traces in my code at reasonable cost.

I think it has the potential to be handy to lots of other folks, so I've documented it in a blog post of its own (the 28 September entry below).

Okay, so let's get back to debugging our actual bugs...

The problem now is that it's not mounting contact conversations properly. Or perhaps it's doing it correctly once, but not on subsequent attempts.

Yes, it's succeeding on the first attempt, but not on a subsequent call.

What's strange is it looks like the HYPPO chdir() call succeeded. i.e., that the problem is somewhere in the processing of the return value.  There is no HYPPO DOS error code set, which further supports this hypothesis.

Looks like the chdirroot() call isn't working. The problem was that the LLVM fileio.s didn't have the fix to chdirroot() that I had done a while back for CC65.

Now I'm finally back to having the SMS conversation thread displaying. But there is something funny where some messages have only their first line displayed, instead of all lines of the message.

Also, pressing the INST/DEL key crashes it, instead of editing the draft SMS message. And now it's completely screwing up on start. Seems like the Z register contains rubbish again. I've added code to force Z = #$00 on startup, and it's getting further, but it looks like it's not loading the unicode glyphs any more.

So what weird thing is going on now?

Looks like the shared resource loader is loading the same empty glyph for every character. Which in turn is because it thinks that the font it's loading from is of zero length. So let's get to the bottom of this.

The magic string detection is failing. The SD card sector seems to be read correctly, and the magic_string buffer contains the right values, so how on earth is it failing?

The problem is that &magic_string[i] resolves to $002A, not the real address.

It turns out that LLVM doesn't seem to be calling its own __copy_zp_data() function during initialisation prior to entering main(), _and_ the linker decided that it would be better to relocate magic_string down into ZP. It makes no sense, but that's what was happening.

With that fixed, it's looking less bad, at least while I had debug code in there -- it was even loading glyphs, although only the first row of text in a message was displaying.

Now, after removing the debug code I'd added while figuring out that ZP initialisation stuff, it's back to crashing with Z=$F7.

To figure out what's causing that, I've added an NMI catcher that shows the backtrace, like this:

NMI/BRK triggered.
                  Backtrace (most recent call first):
[04] 0x0A7E nmi_catcher+0x0001, SP=0xD000
[03] 0x4260 calc_break_points+0x0163, SP=0xD000
[02] 0x4205 calc_break_points+0x0108, SP=0xD000
[01] 0x1131 main+0x06A1, SP=0xD000
[00] 0xD000 __ashlhi3+0x772B, SP=0x41FF

Now that's a way faster way to get to the heart of the problem :)  

The code to add this was simple, too:


void nmi_catcher(void)
{
  mega65_uart_print("NMI/BRK triggered.\n");
  dump_backtrace();
  while(1) continue;
}

... 

  // Install NMI/BRK catcher: point the KERNAL BRK ($0316/7) and NMI
  // ($0318/9) indirect vectors, plus the hardware IRQ/BRK vector at
  // $FFFE/F, at our handler
  POKE(0x0316,(uint8_t)&nmi_catcher);
  POKE(0x0317,((uint16_t)&nmi_catcher)>>8);
  POKE(0x0318,(uint8_t)&nmi_catcher);
  POKE(0x0319,((uint16_t)&nmi_catcher)>>8);
  POKE(0xFFFE,(uint8_t)&nmi_catcher);
  POKE(0xFFFF,((uint16_t)&nmi_catcher)>>8);
 

The fact that our function instrumentation keeps track of function entry and exit separately from the CPU stack makes this quite robust.

Anyway, let's find out where this is all going west inside calc_break_points().

The doubled entry for calc_break_points() is the result of in-lining, I think.

Anyway, with a bit more work, I have my NMI catcher showing the symbol, offset, and PC value of where it happened:

>>> NMI/BRK triggered.
 A:9E X:84 Y:05 P:B1 S:E8
  BRK source @ 0x4717 calc_break_points+0x04F5
Backtrace (most recent call first):
[03] 0x29A0 nmi_catcher+0x009E, SP=0xD000
[02] 0x43A8 calc_break_points+0x0186, SP=0xD000
[01] 0x111E main+0x06A1, SP=0xD000
[00] 0x0DD0 main+0x0353, SP=0x1E57

With that, I can disassemble and see what's going on: 

,00004700  64 05     STZ   $05
,00004702  64 06     STZ   $06
,00004704  86 07     STX   $07
,00004706  A6 1A     LDX   $1A
,00004708  86 08     STX   $08
,0000470A  64 09     STZ   $09
,0000470C  A6 1B     LDX   $1B
,0000470E  86 0A     STX   $0A
,00004710  A6 1A     LDX   $1A
,00004712  86 0B     STX   $0B
,00004714  A2 84     LDX   #$84
,00004716  0A        ASL   
,00004717  00 D0     BRK   $D0
,00004719  75 57     ADC   $57,X
,0000471B  00 D0     BRK   $D0
,0000471D  86 04     STX   $04

And look at that, there _is_ a BRK instruction in the middle of the instruction stream. What on earth is it doing there?   

Yup, something is bonkers here.  If I use this on the ELF object:

llvm-objdump -drS --no-show-raw-insn --print-imm-hex bin65/unicode-font-test.llvm.prg.elf

I can see this:

;   lcopy((unsigned long)buffers.textbox.break_costs,0x1A000L,RECORD_DATA_SIZE);
    46fe:       stz     $4                      ; 0xe004 <__heap_start+0x43d8>
    4700:       stz     $5                      ; 0xe005 <__heap_start+0x43d9>
    4702:       stz     $6                      ; 0xe006 <__heap_start+0x43da>
    4704:       stx     $7                      ; 0xe007 <__heap_start+0x43db>
    4706:       ldx     $1a                     ; 0xe01a <__heap_start+0x43ee>
    4708:       stx     $8                      ; 0xe008 <__heap_start+0x43dc>
    470a:       stz     $9                      ; 0xe009 <__heap_start+0x43dd>
    470c:       ldx     $1b                     ; 0xe01b <__heap_start+0x43ef>
    470e:       stx     $a                      ; 0xe00a <__heap_start+0x43de>
    4710:       ldx     $1a                     ; 0xe01a <__heap_start+0x43ee>
    4712:       stx     $b                      ; 0xe00b <__heap_start+0x43df>
    4714:       ldx     #$84
    4716:       lda     #$32
    4718:       jsr     $5775 <lcopy>
;   CHECKPOINT("post string_render_analyse");
    471b:       ldx     #$25
    471d:       stx     $4                      ; 0xe004 <__heap_start+0x43d8>

Let's focus on the bit that's broken:

,00004712  86 0B     STX   $0B
,00004714  A2 84     LDX   #$84
,00004716  0A        ASL   
,00004717  00 D0     BRK   $D0
,00004719  75 57     ADC   $57,X
 

vs 

    4712:       stx     $b                      ; 0xe00b <__heap_start+0x43df>
    4714:       ldx     #$84
    4716:       lda     #$32
    4718:       jsr     $5775 <lcopy>
 

Ah -- when I load it, but haven't yet run it, it's okay:

,00004712  86 0B     STX   $0B
,00004714  A2 84     LDX   #$84
,00004716  A9 32     LDA   #$32
,00004718  20 75 57  JSR   $5775
,0000471B  A2 B3     LDX   #$B3
,0000471D  86 04     STX   $04
,0000471F  A2 5D     LDX   #$5D 

So we're somehow overwriting this bit of code. That explains why the behaviour is so random.

Let's stick a watch on $4717 and see what's to blame for corrupting it.

Bingo -- we've found it:

w4717
.t0
.!
PC   A  X  Y  Z  B  SP   MAPH MAPL LAST-OP In     P  P-FLAGS   RGP uS IO ws h RECA8LHC
247B 0A 97 02 F7 00 01F2 0000 0000 A507    00     21 ..E....C ...P 15 -  00 - .....lh.
,0777247B  A4 09     LDY   $09

.D2470
,00002470  08        PHP   
,00002471  85 05     STA   $05
,00002473  A9 00     LDA   #$00
,00002475  A0 02     LDY   #$02
,00002477  91 04     STA   ($04),Y
,00002479  A5 07     LDA   $07
,0000247B  A4 09     LDY   $09
,0000247D  91 04     STA   ($04),Y
,0000247F  18        CLC   
,00002480  A5 08     LDA   $08
,00002482  69 02     ADC   #$02
,00002484  85 07     STA   $07
,00002486  A5 06     LDA   $06
,00002488  69 00     ADC   #$00
,0000248A  5A        PHY   
,0000248B  A4 07     LDY   $07

.m4
:00000004:1547470A1501004C0000000400000000

So why does the ZP vector at $04-$05 point there? And where is this bit of code?

Okay, so this is a bit embarrassing: It's in the trace-back logging code. In particular here:

;   callstack[depth] = (struct frame){ call_site, &__stack };
    2457:       lda     $40                     ; 0x6040 <__bss_size+0x2f24>
    2459:       asl
    245a:       rol     $6                      ; 0x6006 <__bss_size+0x2eea>
    245c:       asl
    245d:       sta     $4                      ; 0x6004 <__bss_size+0x2ee8>
    245f:       rol     $6                      ; 0x6006 <__bss_size+0x2eea>
    2461:       lda     #$15
    2463:       clc
    2464:       adc     $4                      ; 0x6004 <__bss_size+0x2ee8>
    2466:       tay
    2467:       lda     #$6b
    2469:       adc     $6                      ; 0x6006 <__bss_size+0x2eea>
    246b:       sta     $6                      ; 0x6006 <__bss_size+0x2eea>
    246d:       sty     $4                      ; 0x6004 <__bss_size+0x2ee8>
    246f:       sty     $8                      ; 0x6008 <__bss_size+0x2eec>
    2471:       sta     $5                      ; 0x6005 <__bss_size+0x2ee9>
    2473:       lda     #$0
    2475:       ldy     #$2
    2477:       sta     ($4),y                  ; 0x6004 <__bss_size+0x2ee8>
    2479:       lda     $7                      ; 0x6007 <__bss_size+0x2eeb>
    247b:       ldy     $9                      ; 0x6009 <__bss_size+0x2eed>
    247d:       sta     ($4),y                  ; 0x6004 <__bss_size+0x2ee8>
 

And again, the root cause is that Z has somehow got itself set to $F7.

Where on earth is that coming from?  It's all the side-effect of Z not being zero on initial entry, which causes all sorts of things to go wrong in the initial pre-main() routines, and then thereafter.

I was able to force it to be cleared by adding this to hal_asm_llvm.s: 


    ;;  Ensure Z is cleared on entry    
    .section .init.000,"ax",@progbits
    ldz #0            
    cld            ; Because I'm really paranoid
 

With that all in place, it draws the display more or less properly again. But if I try to edit the message, it still crashes -- though it does at least yield a stack back-trace for part of the problem. So I can investigate and fix that, and see what else remains broken:

src/telephony/records.c:144:read_record_by_id():0x01
Backtrace (most recent call first):
[02] 0x30C9 write_record_by_id+0x000E, SP=0xD000
[01] 0x1AFE main+0x107E, SP=0xD000
[00] 0x0A9A main+0x001A, SP=0xD000


src/telephony/records.c:75:append_field():0x01
Backtrace (most recent call first):
[04] 0x2DDE append_field+0x018E, SP=0xD000
[03] 0x1DEE main+0x136E, SP=0xD000
[02] 0x1C84 main+0x1204, SP=0xD000
[01] 0x1AFE main+0x107E, SP=0xD000
[00] 0x0A9A main+0x001A, SP=0xD000

Okay, I'm ready to move forward again after a bunch of further diversions: logging the odd issue against llvm-mos, and implementing a VHDL-based memory write-protection scheme to detect memory corruption bugs earlier.

I also had to re-provision the SD card files, because the SMS thread got corrupted during all of the above.

So now I have it at the point where it _looks_ like sending an SMS works, except that after sending, the message doesn't show up in the thread. Either it's not being written to the thread, or the message count in the thread is not being updated.

It looks like the problem is happening during the index update. Disabling that temporarily, I can now have an SMS message get stored into the message thread and displayed:

I need to fix the removal of the cursor before messages get stored, and then also track down why it messes up when updating the index.

Okay, I have the cursor hiding working now (although there are still some subtle bugs with cursor handling).  I've added instrumentation that lets me see which sectors of which disk image are being written to -- complete with the path and name of the disk image:

DEBUG: BAM sector before allocation
0000: 0 0 FF FF 07 00 00 00 00 00 00 00 00 00 00 00   ................
DEBUG: BAM sector after allocation
0000: 0 0 FF FF 0F 00 00 00 00 00 00 00 00 00 00 00   ................
Image in drive 00 is /PHONE/THREADS/0/0/0/3/MESSAGES.D8
DEBUG: Writing sector data beginning with
0000: 0 0 FF FF 0F 00 00 00 00 00 00 00 00 00 00 00   ................
Allocated record 003 for new SMS message
DEBUG: BAM sector read back after
0000: 0 0 FF FF 0F 00 00 00 00 00 00 00 00 00 00 00   ................
Image in drive 00 is /PHONE/THREADS/0/0/0/3/MESSAGES.D8
DEBUG: Writing sector data beginning with
0000: 0 27 00 00 06 0D 2B 39 39 39 32 36 37 35 3 34   .'....+99926754
Image in drive 0 is /PHONE/THREADS/0/0/0/3/MESSAGES.D8
DEBUG: Writing sector data beginning with
0000: 0 0 FF FF 07 00 00 00 00 00 00 00 00 00 00 00   ................
Image in drive 0 is /PHONE/THREADS/0/0/0/3/MESSAGES.D8
DEBUG: Writing sector data beginning with
0000: 0 03 00 00 06 0D 2B 39 39 39 32 36 37 35 3 34   ......+99926754
Image in drive 0 is /PHONE/THREADS/0/0/0/3/MESSAGES.D8
DEBUG: Writing sector data beginning with
...
 

With this, I can see that for some reason the indexing code thinks it's always writing to MESSAGES.D81 (I need to fix the trimming of the displayed filename), even when updating the index. That would absolutely cause the kind of problem that we're seeing. So time to add some more instrumentation. I am so glad that I have the instrumentation stuff set up now.

Okay, found a big bug in write_sector: It was always selecting drive 0.

I'd like to optimise the indexing code so that we only modify index sectors that have changed, since writing is much slower than reading. Ideally we would use freezer-style multi-sector writes to speed things up further, but we don't have that implemented in the HAL yet. More to the point, because we are accessing via the FDC emulation, multi-sector writes aren't actually possible. So that'll have to go on the back-burner.
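
The write-avoidance part would look something like this sketch -- read_sector() and write_sector() are the routines discussed in this post, but their exact signatures here are assumptions:

// Only write an index sector if its contents actually changed, since
// writes are much slower than reads via the FDC emulation
unsigned char update_index_sector(unsigned long sector, const unsigned char *new_data)
{
  static unsigned char current[512];
  unsigned int i;
  read_sector(sector, current);
  for (i = 0; i < 512; i++)
    if (current[i] != new_data[i]) break;
  if (i == 512) return 0;   // unchanged: skip the slow write
  return write_sector(sector, new_data);
}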

But what I can do in the meantime is add a busy indication with an hour-glass sprite. Except I've decided to go with a 1 TON "weight". I may even add an IRQ routine to animate it, so that the weight seems to drop continuously while "weighting". Okay, so it's tacky. But it puts everything in place for a much better wait indication. Anyone whose eyeballs are bleeding at what I have created is invited to submit alternate artwork for consideration. This is what you have to improve upon:


So now the problem I have is that the text box for drafting the SMS message is not being consistently drawn, and when it is, it's drawn with varying heights. First check is to see whether the flag to draw it is actually being seen.

Okay, so with a bit of debug output, we can see that it's being asked to be drawn, but the number of lines to be drawn is varying all over the place:

with_edit_box_P = 01, textbox.line_count = 00
with_edit_box_P = 01, textbox.line_count = 04
with_edit_box_P = 01, textbox.line_count = 04
with_edit_box_P = 01, textbox.line_count = 04
with_edit_box_P = 01, textbox.line_count = 04
with_edit_box_P = 01, textbox.line_count = 04
with_edit_box_P = 01, textbox.line_count = 01
with_edit_box_P = 01, textbox.line_count = 01
with_edit_box_P = 01, textbox.line_count = 02
with_edit_box_P = 01, textbox.line_count = 02
with_edit_box_P = 01, textbox.line_count = 04
with_edit_box_P = 01, textbox.line_count = 04
with_edit_box_P = 01, textbox.line_count = 02
 

Why? The draft message itself is empty. And that's the problem: if it's zero bytes long, the line-counting returns failure. I've fixed that so it reports a single line in that case. The empty draft now has constant vertical space reserved on the screen, although if the string is empty that line still doesn't get shown. I can live with that, because it should never happen: we should always have a cursor character in there.

This then feeds back to the other bugs affecting the whole cursor thing, because there are a bunch of them.

This current one we can deal with by saying that if we have a string without a cursor marker, we should add one to the end.

Also, if a string has more than one cursor marker, we should get rid of the extras.

Right now we use a | character to approximate a cursor. But we should probably move past that now, and use something better -- possibly a width-trimmed reverse space. We could then enable the hardware blink attribute on it, to make it a proper blinking cursor.

We do need a way to represent the cursor in the string. I'd rather not use a value >0x7F, because then we have to worry about UTF-8 encoding stuff. But we can use a low value <0x20 that normally wouldn't get displayed -- like 0x01, for example -- and then just replace it with a cursor when we encounter it.
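
As a sketch of those two rules with the 0x01 marker (the names here are mine, not the real code):

#define CURSOR_CHAR 0x01

// Ensure the draft contains exactly one cursor marker: drop any extras,
// and if there is none at all, append one at the end
void normalise_cursor(char *draft, unsigned int *len)
{
  unsigned int i, out = 0;
  unsigned char seen = 0;
  for (i = 0; i < *len; i++) {
    if (draft[i] == CURSOR_CHAR) {
      if (seen) continue;                // extra marker: drop it
      seen = 1;
    }
    draft[out++] = draft[i];
  }
  if (!seen) draft[out++] = CURSOR_CHAR; // no marker: cursor goes to the end
  draft[out] = 0;
  *len = out;
}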

Okay, so I've implemented this, and we even now have a nice blinking cursor. But if the cursor isn't at the end of the message, then when the draft gets reloaded, the cursor isn't there, and one of the characters next to where the cursor was gets munched. So I guess I'm handling the cursor finding wrong. So, time for more serial monitor debug messages!

That's all fixed now, too.

So I think the last thing I'd like to deal with for now is allowing messages to be deleted, so that I don't clog up the message threads with all my testing. This shouldn't be too hard: all I need to do is deallocate the record in the BAM, and then update the index. I can probably do this in two steps: first zero out the message we're deleting and pretend to index it, which will update the index; then deallocate the BAM bit. But first, I need a key combo for it. I'm going to use SHIFT+DEL, since it's easy.
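
Sketched out, with index_record() and bam_deallocate_record() as hypothetical names for the two steps (write_record_by_id() and RECORD_DATA_SIZE are from the real code):

// Two-step delete: zero the record and re-index it (which updates the
// index), then free the record's bit in the BAM
void delete_message(unsigned int record_id)
{
  static char zeroes[RECORD_DATA_SIZE];  // stays all 0x00
  write_record_by_id(0, record_id, zeroes);
  index_record(0, record_id);            // hypothetical
  bam_deallocate_record(0, record_id);   // hypothetical
}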

The more complex case is deleting messages that aren't at the end of the conversation. I'm not updating the index when that happens, because the routine for re-indexing a whole disk is currently embarrassingly slow on real hardware (of the order of an hour!). But I don't need the index stuff right now, since searching isn't on the critical path.

So I think that's probably everything we need here for the moment.  Time to get contact list and dialpad working, and then hook it up to the cellular modem and actually do some telephony! 

Saturday, 4 October 2025

Simple Memory Protection Scheme

Using LLVM has me wanting to implement a simple scheme that enforces memory protection, so that I can more easily detect memory corruption events -- in part because of the unfortunate (although understandable) way that LLVM stores its stack pointer in zero-page at $02 and $03, which renders it susceptible to easy corruption, which in turn results in all manner of downstream corruption.

While protecting the stack pointer itself would require munging with LLVM's code generator, I can at least make it possible to write-protect the code and read-only data segments, wired so that they simulate a BRK instruction when a write is attempted.

Tracking via Issue #921

I'm going to have to work out how to do this in a way that can survive freeze and unfreeze, ideally without changing the frozen process image. But that can wait.

We'll start by defining what I want it to do, which I think is fairly simple:

1. Allow two ranges, each of which defines a write-protected region.

2. Two flags that enable each write-protected region.

3. A bit field that indicates whether each region should trigger a hypervisor trap (to the freezer), a BRK-like IRQ, or something else.

It was all a lot more mucking about to get working than expected, due to some weird glitching causing false positives. I've fixed those, and I can now cause an interrupt on a write to either protected region. However, the writes are still occurring, at least to chip RAM, which is hardly ideal.

That would be because of the weird split regime that got added when we moved from the old synthesis tool, whose name I can't even remember at the moment, to Vivado. The old one allowed a slightly weird BRAM timing configuration that the CPU design deeply depended on, and the move was solved by splitting the whole thing into two separate processes. But this means that our detection of the write violation and our inhibiting of the write are now split over two separate processes.

So I need some simple way to fix this, without messing up timing.

Well, I've got it enforcing for chip RAM now, but not IO. But I can live with that.

I doubt that this will make it into the development branch, but who knows. Anyway, here are the (write-only) registers for this:

$FFD5000-1 = low address (inclusive) for write-protect region 0
$FFD5002-3 = high address (inclusive) for write-protect region 0
$FFD5004-5 = low address (inclusive) for write-protect region 1
$FFD5006-7 = high address (inclusive) for write-protect region 1
$FFD5008 bit 0 = enable write protection region 0
$FFD5008 bit 4 = enable write protection region 1
$FFD5008 bits 1-3 = write protection region 0 violation action: 111 = nothing, 000 = simulate BRK, 001 = NMI, 010 = trigger freezer.
$FFD5008 bits 5-7 = write protection region 1 violation action: 111 = nothing, 000 = simulate BRK, 001 = NMI, 010 = trigger freezer.

Writing to any of $FFD5000-$5007 disables write protection for both regions. As does entering the freezer.

So for example we can do:

sffd5000 0 8 10 8 0 0 0 0 1

And that will write-protect $0800-$0810 inclusive, and trigger a fake BRK (which will trigger the MEGA65 ROM monitor by default).
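
Doing the same from C looks something like this -- lpoke() is mega65-libc's 28-bit far write, and the byte layout follows the monitor example above:

// Write-protect the inclusive range low..high using region 0, with the
// violation action left at 000 = simulate BRK
void wp_protect_region0(unsigned int low, unsigned int high)
{
  // Writing the address registers also disables both regions,
  // so the enable write must come last
  lpoke(0xFFD5000UL, low & 0xff);
  lpoke(0xFFD5001UL, low >> 8);
  lpoke(0xFFD5002UL, high & 0xff);
  lpoke(0xFFD5003UL, high >> 8);
  lpoke(0xFFD5008UL, 0x01);   // bit 0: enable region 0
}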

So now I can make a little shim for LLVM that extracts the address ranges of the code and rodata segments, and then enforces write-protection. Well, it felt like it should be possible to do in the linker, but I didn't have the time to dig deep, so I just made some Python that parses the program's map file (which I already had, to provide the symbol tables for the natively generated stack backtraces on BRK instructions) and also emits the 9-byte vector of values to be put at $FFD5000 to set up the write protection.

And with the latest commits to everything, it now works -- and I get a stack backtrace generated whenever the code or read-only data area gets written to :)

So now it's back to fixing the remaining bugs with the LLVM transition for the telephony software... 


Sunday, 28 September 2025

Stack backtrace on MEGA65 using MOS-LLVM

For the MEGAphone, I wanted to make my software debugging on the MEGA65 easier.  Stack back-traces are a great way to help debug errors, but we don't have gdb or lldb or anything like that on the MEGA65.  m65dbg and related tools can help here, but they don't have an easy way to provide the complete call stack.

To solve this for my needs, I added function instrumentation using the following in my Makefile:

COPT_M65=    -Iinclude    -Isrc/telephony/mega65 -Isrc/mega65-libc/include

COMPILER=llvm
COMPILER_PATH=/usr/local/bin
CC=   $(COMPILER_PATH)/mos-c64-clang -mcpu=mos45gs02 -Iinclude -Isrc/telephony/mega65 -Isrc/mega65-libc/include -DLLVM -fno-unroll-loops -ffunction-sections -fdata-sections -mllvm -inline-threshold=0 -fvisibility=hidden -Oz -Wall -Wextra -Wtype-limits

# Uncomment to include stacktraces on calls to fail()
CC+=    -g -finstrument-functions -DWITH_BACKTRACE

LD=   $(COMPILER_PATH)/ld.lld
CL=   $(COMPILER_PATH)/mos-c64-clang -DLLVM -mcpu=mos45gs02
HELPERS=        src/helper-llvm.c

LDFLAGS += -Wl,-Map,bin65/unicode-font-test.map
LDFLAGS += -Wl,-T,src/telephony/asserts.ld

Then for the build target, I run the compiler twice: first to generate a map file with the memory addresses of all the functions in it, from which I generate a C structure listing the address and name of each function, and then a second time to link that structure in:

# For backtrace support we have to compile twice: Once to generate the map file, from which we
# can generate the function list, and then a second time, where we link that in.
bin65/unicode-font-test.llvm.prg:    src/telephony/unicode-font-test.c $(NATIVE_TELEPHONY_COMMON)
    mkdir -p bin65
    rm -f src/telephony/mega65/function_table.c
    echo "struct function_table function_table[]={}; int function_table_count=0;" > src/telephony/mega65/function_table.c
    $(CC) -o bin65/unicode-font-test.llvm.prg -Iinclude -Isrc/mega65-libc/include src/telephony/unicode-font-test.c src/telephony/attr_tables.c src/telephony/helper-llvm.s src/telephony/mega65/hal.c src/telephony/mega65/hal_asm_llvm.s $(SRC_TELEPHONY_COMMON) $(SRC_MEGA65_LIBC_LLVM) $(LDFLAGS)
    tools/function_table.py bin65/unicode-font-test.map src/telephony/mega65/function_table.c
    $(CC) -o bin65/unicode-font-test.llvm.prg -Iinclude -Isrc/mega65-libc/include src/telephony/unicode-font-test.c src/telephony/attr_tables.c src/telephony/helper-llvm.s src/telephony/mega65/hal.c src/telephony/mega65/hal_asm_llvm.s $(SRC_TELEPHONY_COMMON) $(SRC_MEGA65_LIBC_LLVM) $(LDFLAGS)

The tool that generates the function list is fairly simple:

#!/usr/bin/env python3
import sys
import re

if len(sys.argv) != 3:
    print(f"usage: {sys.argv[0]} <mapfile> <output.c>")
    sys.exit(1)

mapfile, outfile = sys.argv[1], sys.argv[2]

entries = []
in_text = False

with open(mapfile) as f:
    for line in f:
        if ".text" in line and line.strip().endswith(".text"):
            in_text = True
            continue
        if ".rodata" in line:
            break
        if not in_text:
            continue
        # match lines like: " a7b      a7b     196b     1                 main"
        m = re.match(r"\s*([0-9a-fA-F]+)\s+[0-9a-fA-F]+\s+[0-9a-fA-F]+\s+\d+\s+(\S+)$", line)
        if m:
            addr = int(m.group(1), 16)
            name = m.group(2)
            # skip synthetic names if you want
            if name.startswith("bin") or name.endswith(".o:"):
                continue
            entries.append((addr, name))

with open(outfile, "w") as out:
    out.write("/* auto-generated from map file */\n")
    out.write("const struct function_table function_table[] = {\n")
    for addr, name in entries:
        out.write(f"  {{ 0x{addr:04x}, \"{name}\" }},\n")
    out.write("};\n")
    out.write(f"const unsigned function_table_count = {len(entries)};\n")

Then in an include file, I have:

#ifdef WITH_BACKTRACE
#define STR_HELPER(x) #x
#define STR(x)        STR_HELPER(x)

#define fail(X) mega65_fail(__FILE__,__FUNCTION__,STR(__LINE__),X)
void mega65_fail(const char *file, const char *function, const char *line, unsigned char error_code);
#else
#define fail(X)
#endif

struct function_table {
  const uint16_t addr;
  const char *function;
};

#endif /* include guard -- the matching #ifndef at the top of the file is not shown */

The last bit of setup then is to have a C file that includes the function table and implements the helper functions:

#include "includes.h"

extern const unsigned char __stack; 

#ifdef WITH_BACKTRACE
#include "function_table.c"
#endif

void dump_backtrace(void);

#ifdef WITH_BACKTRACE

__attribute__((no_instrument_function))
void mega65_uart_print(const char *s)
{  
  while(*s) {
    asm volatile (
        "sta $D643\n\t"   // write A to the trap register
        "clv"             // must be the very next instruction
        :
        : "a"(*s) // put 'error_code' into A before the block
        : "v", "memory"   // CLV changes V; 'memory' blocks reordering across the I/O write
    );

    // Wait a bit between chars
    for(char n=0;n<2;n++) {
      asm volatile(
           "ldx $D012\n"
           "1:\n"
           "cpx $D012\n"
           "beq 1b\n"
           :
           :
           : "x"   // X is clobbered
           );
    }
    
    s++;
  }

}

__attribute__((no_instrument_function))
void mega65_uart_printhex(const unsigned char v)
{
  char hex_str[3];

  hex_str[0]=to_hex(v>>4);
  hex_str[1]=to_hex(v&0xf);
  hex_str[2]=0;
  mega65_uart_print(&hex_str[0]);
}

__attribute__((no_instrument_function))
void mega65_uart_printptr(const void *v)
{
  mega65_uart_print("0x");
  mega65_uart_printhex(((unsigned int)v)>>8);
  mega65_uart_printhex(((unsigned int)v));
}

__attribute__((no_instrument_function))
void mega65_fail(const char *file, const char *function, const char *line, unsigned char error_code)
{

  // Stash the LLVM soft stack pointer (ZP $02/$03) on the C64 screen for inspection
  POKE(0x0428,PEEK(0x02));
  POKE(0x0429,PEEK(0x03));

  mega65_uart_print(file);

  mega65_uart_print(":");

  mega65_uart_print(line);
  mega65_uart_print(":");
  mega65_uart_print(function);
  mega65_uart_print("():0x");

  mega65_uart_printhex(error_code);
  mega65_uart_print("\n\r");

  dump_backtrace();

  while(PEEK(0xD610)) POKE(0xD610,0);              // drain the keyboard queue
  while(!PEEK(0xD610)) POKE(0xD021,PEEK(0xD012));  // flash background until a key is pressed

}

/*
  Stack back-trace facility to help debug error locations.

*/

#define MAX_BT 32
struct frame { const void *site, *stack_pointer; };
static struct frame callstack[MAX_BT];
static uint8_t depth, sp;

__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void) {
  if (depth>=MAX_BT) depth--;
  
  // Get SPL into sp variable declared above.
  __asm__ volatile ("tsx" : "=x"(sp));
  // Now convert that into a pointer into the hardware stack page
  const uint8_t *stack_pointer = (void *)(0x0100 + sp);

  // Recover the caller's address from the JSR return address on the stack
  void *call_site = (void *)((*((uint16_t *)&stack_pointer[1])) - 1);
  
  callstack[depth] = (struct frame){ call_site, &__stack };
  depth++;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void) {
  if (depth) --depth; // simple, assumes well-nested calls
}

__attribute__((no_instrument_function))
void dump_backtrace(void) {
  // For each frame, either:
  //  - print raw addresses, or
  //  - call your on-target addr2line() to print file:line + function

  mega65_uart_print("Backtrace (most recent call first):\n\r");
  // Walk the recorded frames from most recent to oldest (d wraps to 0xFF)
  for(unsigned char d = depth-1;d!=0xff;d--) {
    mega65_uart_print("[");
    mega65_uart_printhex(d);
    mega65_uart_print("] ");

    // Find function in table
    unsigned int func_num = 0;
    while(func_num<(function_table_count-1) && function_table[func_num+1].addr < (uint16_t)callstack[d].site)
      func_num++;

    // Display offset from function
    mega65_uart_printptr(callstack[d].site);
    mega65_uart_print(" ");
    mega65_uart_print(function_table[func_num].function);
    mega65_uart_print("+");
    mega65_uart_printptr((void*)((uint16_t)callstack[d].site - function_table[func_num].addr));

    // Show stack pointer
    mega65_uart_print(", SP=");
    mega65_uart_printptr(callstack[d].stack_pointer);
    mega65_uart_print("\n\r");
  } 
}
#endif

With all that in place, if you call fail(X) where X is an error code, the MEGA65's serial monitor interface will output something like this, and then wait for a keypress on the MEGA65's keyboard before continuing:

src/telephony/contacts.c:44:mount_contact_qso():0x03
Backtrace (most recent call first):
[02] 0x312A mount_contact_qso+0x00C5, SP=0xD000
[01] 0x10DB main+0x0660, SP=0xD000
[00] 0x0A89 main+0x000E, SP=0xD000

 

So now I know that fail(3) was called from inside mount_contact_qso(), which was called from main().

 

Monday, 8 September 2025

Unicode / TrueType Pre-Rendered Fonts for the MEGA65

Okay, so another side-path I have to deal with for the MEGAphone telephony software: I need pre-rendered Unicode fonts so that we can support all languages and emojis (at least for display --- multi-language input and emoji inserting will come a bit later).

Sneak peek of where we get to at the end: 

 

I've already got a utility that I've been writing that can generate ASCII renderings of glyphs using libfreetype. With a bit of mucking about, I have that working for a Nokia-esque open-source font. And in another blog post I have the HYPPO and mega65-libc APIs for accessing pre-rendered fonts via the new MEGA65 System Partition Shared Resource Area (SHRES). So our focus here will be on improving the font pre-renderer, so that we can make font files that I can put into that area.

The format for these fonts will be very simple:

1. 16x16 FCM per glyph, using 256 bytes per Unicode code point.

2. The pre-rendered fonts will be laid out linearly, with a 256-byte slot for each and every possible Unicode code point. All 1,114,112 ( = 64K x 17 ) of them.

3. Use the last byte (corresponding to the bottom-right pixel of each glyph) to encode some important info: the width of the glyph (5 bits), whether it's colour or monochrome (1 bit), and whether the bottom-right pixel should be fully on, fully off, or match the colour of the pixel to its left or above it (2 bits). That should cover almost all plausible requirements for the pixel we are stealing bits from.

In other words, our font files will be 272MiB in size (1,114,112 slots x 256 bytes)!

But on the plus side, they will be super easy to use: multiply the code point by 256, use the SHRES API to request the 256 bytes for the glyph, parse the magic byte that has the width of the glyph etc., and shove it into the right place in Chip RAM.
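
Pulling the fields out of that magic byte looks like this -- width in bits 0-4 and the colour flag in bit 7 match the bit-shuffling macro later in this post; the corner-pixel mode landing in bits 5-6 is my inference from the bits left over:

// Decode the info byte stolen from the glyph's bottom-right pixel
// (byte 255 of each 256-byte glyph)
unsigned char glyph_width(unsigned char flags)       { return flags & 0x1f; }     // width in pixels
unsigned char glyph_is_colour(unsigned char flags)   { return (flags >> 7) & 1; } // colour vs mono
unsigned char glyph_corner_mode(unsigned char flags) { return (flags >> 5) & 3; } // corner pixel treatment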

So let's write the code to write out each glyph in this format. We're making the characters logically be 2 VIC-IV characters wide, and 2 high --- but using the VIC-IV's vertical resolution doubling feature so that we don't have to manage two rows of character data.  This does mean that we have to interlace the pixel data when writing it out.  Not really a big problem.

I've got it allegedly working, and made a little script and Makefile targets for the mega65-modular-telephony repo to prepare the shared resources area on the SD card:

 $ tools/prepare-megaphone-shared-resources.sh
make: Nothing to be done for 'fonts'.
INFO: Attempting to open shared resource file or disk image '/dev/sdb'
INFO: Found MEGA65 SYSPART at sector 0x014d7ffe
INFO: Found MEGA65 SYSPART shared resource area of 2048 MiB.
DEBUG: area_start=0x00029b0ffc00, area_length=0x80000000
Resource Table (4 entries):
Idx  Start   Sectors  Bytes      Flags      Name
---- ------- -------- ---------- ---------- -------------------------
0    6       521244   266876928  0x00000007 NotoColorEmoji
1    521250  521244   266876928  0x00000007 NotoEmoji
2    1042494 61328    31399680   0x00000007 NotoSans
3    1103822 32639    16710912   0x00000007 Nokia Pixel Large

Those four fonts should be a good starting point. The Nokia-like font for a fun retro feel on the phone UI, and then the Noto fonts to cover practically all languages and emoji.

What I don't know is whether those fonts contain all the symbols that we might like to use to draw UI furniture. Those could just be the standard C64 font, but made into an 8x16 interlaced font, allowing smoother curves etc.

But that's secondary to whether we can display the Unicode fonts --- that is where our focus will go now.

We're going to use 80-column mode + double-height characters to effectively give 720x480, but with half the screen RAM etc. requirements. I might come to regret not being able to do half-height characters, but I can go back on that later if I need to, and just re-generate the fonts without the interlacing. But I like the idea of needing less screen RAM fiddling to display text, as it will speed up rendering.

First step is to set up a video mode for this. For 90x30 characters in 16-bit text mode, we'll need 2,700 x 2 = 5,400 bytes. We can stash that under the C64 KERNAL at $E000. Colour RAM I'll put at offset $0800 in the colour RAM. This all also means that the C64 screen stays untouched, which I know from experience is handy to leave available for debug output via printf() etc.

Ah, except we actually need to allow more than 90 characters per row, so that we have extra chars to fill in the space if some unicode glyphs aren't exactly 16 pixels wide. So I'll increase the number of chars per row from 90 to 255.

 

So let's get back to working on our unicode font pre-renderer, to make a font to actually install!

Well, not quite so fast --- I'm making the pull request for mega65-libc, and we've now got a bit more structure for doing this than in the past. So I've had to doxygen-document it (chatgpt is very handy here), clang-format it, and make at least an initial implementation of an llvm-mos compatible version of shres_asm.s. All done. I used chatgpt to make the llvm-mos assembler version of the shres_asm.s file.

https://github.com/MEGA65/mega65-libc/pull/71

Now we can really get back to making the unicode font renderer.

Here's my code that sets up the 16-bit text mode with this screen layout:

unsigned long screen_ram = 0x12000;
unsigned long colour_ram = 0xff80800L;

void screen_setup(void)
{
  // 16-bit text mode 
  POKE(0xD054,0x05);

  // PAL
  POKE(0xD06f,0x00);
  
  // Retract borders to be 1px
  POKE(0xD05C,0x3B);
  POKE(0xD048,0x3B); POKE(0xD049,0x00);
  POKE(0xD04A,0x1c); POKE(0xD04B,0x02);  

  // H640 + fast CPU
  POKE(0xD031,0xc0);  
  
  // 90 columns wide (but with virtual line length of 255)
  // Advance 512 bytes per line
  POKE(0xD058,0x00); POKE(0xD059,0x02);
  // XXX -- We can display more than 128, but then we need to insert GOTOX tokens to prevent RRB wrap-around
  POKE(0xD05E,0x80); // display 128 
  
  // 30 rows
  POKE(0xD07B,30 - 1);
  
  // Chargen vertically centred for 30 rows, and at left-edge of 720px display
  // (We _could_ use all 800px horizontally on the phone display, but then we can't see it on VGA output for development/debugging)
  POKE(0xD04C,0x3B); POKE(0xD04D,0x00);
  POKE(0xD04E,0x39); POKE(0xD04F,0x00);
  
  // Double-height char mode
  POKE(0xD07A,0x10);

  // Colour RAM offset
  POKE(0xD064,colour_ram>>0);
  POKE(0xD065,colour_ram>>8);
  
  // Screen RAM address
  POKE(0xD060,((unsigned long)screen_ram)>>0);
  POKE(0xD061,((unsigned long)screen_ram)>>8);
  POKE(0xD062,((unsigned long)screen_ram)>>16);

}

 

Then the beginnings of the unicode glyph loader --- and here we can see all the hard work of making pre-rendered unicode glyphs and the Shared Resource API paying off:


// 128KB buffer for 128KB / 256 bytes per glyph = 512 unique unicode glyphs on screen at once
#define GLYPH_DATA_START 0x40000
#define GLYPH_CACHE_SIZE 512
#define BYTES_PER_GLYPH 256
unsigned long cached_codepoints[GLYPH_CACHE_SIZE];
unsigned char cached_fontnums[GLYPH_CACHE_SIZE];
unsigned char glyph_buffer[BYTES_PER_GLYPH];

void reset_glyph_cache(void)
{
  lfill(GLYPH_DATA_START,0x00,GLYPH_CACHE_SIZE * BYTES_PER_GLYPH);
  lfill((unsigned long)cached_codepoints,0x00,GLYPH_CACHE_SIZE*sizeof(unsigned long));
}

void load_glyph(int font, unsigned long codepoint, unsigned int cache_slot)
{
  shseek(&fonts[font],codepoint<<8,SEEK_SET);
  shread(glyph_buffer,256,&fonts[font]);
  // XXX Extract glyph flags etc
  lcopy((unsigned long)glyph_buffer,GLYPH_DATA_START + ((unsigned long)cache_slot<<8), BYTES_PER_GLYPH);
  cached_codepoints[cache_slot]=codepoint;
  cached_fontnums[cache_slot]=font;
}

void draw_glyph(int x, int y, int font, unsigned long codepoint)
{
  unsigned int i;
  // Look for the glyph in the cache, stopping at the first empty slot
  for(i=0;i<GLYPH_CACHE_SIZE;i++) {
    if (!cached_codepoints[i]) break;
    if (cached_codepoints[i]==codepoint&&cached_fontnums[i]==font) break;
  }
  // Guard against running off the end when the cache is full
  // XXX -- should evict something sensible; for now recycle the last slot
  if (i==GLYPH_CACHE_SIZE) i=GLYPH_CACHE_SIZE-1;
  if (cached_codepoints[i]!=codepoint) {
    load_glyph(font, codepoint, i);
  }

  // XXX -- Actually set the character pointers here
}

 

We can see that the actual code to load a glyph from the Shared Resources area is almost absurdly simple.  So now I should work on those missing bits in this, and then use it to draw an emoji on the screen :)

I'm planning on having a pre-computed 256-entry table for each screen and colour RAM byte, indexed by the glyph flag byte value -- which encodes whether the glyph is coloured or not, and the width of the glyph, which in turn indicates whether a non-coloured glyph is FCM (1-15px wide) or NCM (16-31px wide). (Note that we use NCM for 16px wide, so that we still have space for a blank column of pixels on the right-hand edge of the glyph, to make it easier to display, without having to add an extra 1px-wide blank character to the right.)

For 16-bit text mode ("Super Extended Attribute Mode") the VIC-IV uses 2 bytes of screen RAM and 2 bytes of colour RAM for each glyph.   

So we can compute the values for these bytes for all possible glyph_flags byte values. In retrospect, I should have made the glyph_flags byte keep all the bits that matter for this in the lower bits, so that I'd only need a 6-bit table. But that said, we can still do that by just shuffling the bits before doing the lookup. So let's start by defining that function. The following should do it:

#define glyph_flags_6(X) ( (X&0x1f) + ((X>>2)&0x20))

This shifts the colour/mono flag from bit 7 down to bit 5, while keeping the glyph width in pixels in the bottom 5 bits.
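
The lookups then all use the same shuffled 6-bit value -- a sketch, where glyph_buffer[255] is the info byte from the font format, and the colour RAM tables are the ones generated further below:

// Index all the attribute tables with the same shuffled 6-bit value
unsigned char idx = glyph_flags_6(glyph_buffer[255]);
unsigned char screen1_left  = screen_ram_1_left[idx];
unsigned char screen1_right = screen_ram_1_right[idx];
unsigned char colour0_left  = colour_ram_0_left[idx];
unsigned char colour1       = colour_ram_1[idx];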

Now in terms of our screen and colour RAM values, let's work  out what we need in each.

For Screen RAM byte 0, we don't need a table, because it only contains information about which character to display.

For Screen RAM byte 1, we have the lower 3 bits of the number of pixels to trim from the right-hand edge of the glyph, stored in the upper 3 bits of the byte. The rest of the bits only contain information about which character to display. Now don't forget that we have two characters being used together to form the entire glyph with our unicode fonts. This means that we may have one or two characters actually rendered, based on whether the glyph fits in just a single 8px-wide character or not. So we'll need a flag to handle that, as well as values for Screen RAM byte 1 Left and Screen RAM byte 1 Right with the appropriate width-reduction values. Whenever the Right character is needed, the Left character will be the full 8px wide.

So let's generate those tables first. To avoid errors in hand-generating them, I'm writing a C program that outputs the C code that actually contains the tables, like this:


// start_table(), emit_val() and end_table() are small helpers (not shown)
// that print the C array boilerplate around the generated values.
int main(int argc,char **argv)
{

  start_table("screen_ram_1_left");
  for(int i=0;i<64;i++)
    {
      int width = i&31;
      int is_colour=i&32;
      int width_subtract = 0;
      if (width<8) width_subtract = 8 - width;
      // zero-width glyphs don't exist
      if (width_subtract == 8) width_subtract = 7;
      
      unsigned char val = 0x00;
      val |= (width_subtract&7)<<5;

      emit_val(val);
    }
  end_table();

  start_table("screen_ram_1_right");
  for(int i=0;i<64;i++)
    {
      int width = i&31;
      int is_colour=i&32;
      int width_subtract = 0;
      if (width<16) width_subtract = 16 - width;
      else if (width<32) width_subtract = 32 - width;
      if (width_subtract == 8) width_subtract = 7;
      
      unsigned char val = 0x00;
      val |= (width_subtract&7)<<5;

      emit_val(val);
    }
  end_table();  
}

Which generates something like this:

$ make tools/gen_attr_tables  && tools/gen_attr_tables 
gcc -o tools/gen_attr_tables tools/gen_attr_tables.c
unsigned char screen_ram_1_left[64]={
  0xe0,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20,
  0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
  0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
  0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
  0xe0,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20,
  0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
  0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
  0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00
};

unsigned char screen_ram_1_right[64]={
  0x00,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20,
  0xe0,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20,
  0x00,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20,
  0xe0,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20,
  0x00,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20,
  0xe0,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20,
  0x00,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20,
  0xe0,0xe0,0xc0,0xa0,0x80,0x60,0x40,0x20
};

That way if I make an error, I can quickly fix it. And if any poor sod has to maintain my program after me, they'll have the equations that were used to generate the tables.

So now to keep generating more tables...

For colour RAM byte 0, we have the 4th bit of the width reduction, as well as the NCM flag, which we have to set if a glyph is >16 pixels wide --- there's an issue in the font generation logic where I thought we had to do it if >15px wide, which I should fix (I forgot that the blank column of pixels was already counted in the width, so there's no need to switch to NCM until we hit 17px wide, since we don't need to add the extra pixel for the inter-character gap).

We also have the alpha/full-colour flag, which we use to choose whether to display a colour glyph, or to interpret its pixel values as intensity values that blend the character foreground and background colours.

So we end up with something like this:


  start_table("colour_ram_0_left");
  for(int i=0;i<64;i++)
    {
      int width = i&31;
      int is_colour=i&32;
      int width_subtract = 0;
      if (width<8) width_subtract = 8 - width;
      // zero-width glyphs don't exist
      if (width_subtract == 8) width_subtract = 7;
      
      unsigned char val = 0x00;
      val |= is_colour? 0x00 : 0x20;
      val |= (width>15)? 0x04 : 0x00;
      val |= (width_subtract&8) ? 0x01 : 0x00;
      
      emit_val(val);
    }
  end_table();
  
  start_table("colour_ram_0_right");
  for(int i=0;i<64;i++)
    {
      int width = i&31;
      int is_colour=i&32;
      int width_subtract = 0;
      if (width<16) width_subtract = 16 - width;
      else if (width<32) width_subtract = 32 - width;
      // zero-width glyphs don't exist
      if (width_subtract >= 8) width_subtract = 7;
      
      unsigned char val = 0x00;
      val |= is_colour? 0x00 : 0x20;
      val |= (width>15)? 0x04 : 0x00;
      val |= (width_subtract&8) ? 0x01 : 0x00;
      
      emit_val(val);
    }
  end_table();

 

That just leaves the 2nd colour RAM byte. The only thing we care about here is setting the "bold" attribute, which tells the VIC-IV to use the alternate palette for the glyph. We want to set that for colour glyphs, so that we select the 332 colour cube -- while still being able to maintain a C64-like primary palette for other UI furniture.

  start_table("colour_ram_1");
  for(int i=0;i<64;i++)
    {
      int is_colour=i&32;
      
      unsigned char val = 0x00;
      val |= is_colour? 0x40 : 0x00;  // Select alternate palette for colour glyphs
      
      emit_val(val);
    }
  end_table();
 

Then with a bit of Makefile magic, I can cause the attribute tables to be auto-generated using this program at build time:

tools/gen_attr_tables:    tools/gen_attr_tables.c
    gcc -o tools/gen_attr_tables tools/gen_attr_tables.c

src/telephony/attr_tables.c:    tools/gen_attr_tables
    $< > $@

bin65/unicode-font-test.prg:    src/telephony/unicode-font-test.c src/telephony/attr_tables.c
    mkdir -p bin65
    $(CC65) -Iinclude -Isrc/mega65-libc/include src/telephony/unicode-font-test.c
    $(CC65) -Iinclude -Isrc/mega65-libc/include src/telephony/attr_tables.c
    $(CC65) -Iinclude -Isrc/mega65-libc/include src/mega65-libc/src/shres.c
    $(CC65) -Iinclude -Isrc/mega65-libc/include src/mega65-libc/src/memory.c
    $(CC65) -Iinclude -Isrc/mega65-libc/include src/mega65-libc/src/hal.c
    $(CL65) -o bin65/unicode-font-test.prg -Iinclude -Isrc/mega65-libc/include src/telephony/unicode-font-test.s src/telephony/attr_tables.s src/mega65-libc/src/shres.s src/mega65-libc/src/cc65/shres_asm.s src/mega65-libc/src/memory.s src/mega65-libc/src/cc65/memory_asm.s src/mega65-libc/src/hal.s

So now that we have those tables, we can pull them in as externs, and use them to try to draw a glyph!
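To make the intended use concrete, here's a rough sketch of how the tables get consumed when poking one half of a glyph into screen and colour RAM. This is illustrative only, not the actual draw_glyph() internals: idx packs the 5-bit width and the is-colour flag, matching the i&31 / i&32 encoding in the generator, and lpoke() is the mega65-libc banked memory write.

  extern unsigned char screen_ram_1_left[64];
  extern unsigned char screen_ram_1_right[64];
  extern unsigned char colour_ram_0_left[64];
  extern unsigned char colour_ram_0_right[64];
  extern unsigned char colour_ram_1[64];

  void poke_glyph_cell(unsigned long screen_addr, unsigned long colour_addr,
                       unsigned int char_num, unsigned char idx,
                       unsigned char is_right, unsigned char colour)
  {
    // Screen RAM byte 0: low 8 bits of the character number
    lpoke(screen_addr + 0, char_num & 0xff);
    // Screen RAM byte 1: high 5 bits of the character number, plus width trim
    lpoke(screen_addr + 1, ((char_num >> 8) & 0x1f)
          | (is_right ? screen_ram_1_right[idx] : screen_ram_1_left[idx]));
    // Colour RAM byte 0: NCM / alpha / extra width-trim flags
    lpoke(colour_addr + 0,
          is_right ? colour_ram_0_right[idx] : colour_ram_0_left[idx]);
    // Colour RAM byte 1: BOLD/alt-palette flag, plus foreground colour nybl
    lpoke(colour_addr + 1, colour_ram_1[idx] | (colour & 0x0f));
  }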

Well, I have it drawing a glyph --- but it all looks like gibberish. So I'll need to figure out what I am doing wrong.  Am I loading the wrong glyph? Is the screen RAM selecting the wrong glyph?

For a start, it is setting the glyph number wrongly. Ah -- I was writing the colour RAM bytes into the screen RAM.  With that fixed, I don't see gibberish, but rather a 1px-wide blank character.

If I replace the glyph loader's code that reads from the fonts in the shared resources and instead construct a dummy glyph, then I can get that to display.  I found the alpha compositor was off, so the alpha-blending wasn't working, which I've fixed now.  

The next problem is that it isn't enabling NCM for glyphs >16px wide. That was a red herring: It was actually the RRB position pointer wrapping. There is a real thing here to be dealt with, though: To support variable-width characters, we have to muck about with the RRB to put the right edge of any UI element in the right place.  I'll have to deal with that in due course, but there are a few ways to do it.  We could do a GOTOX after drawing the background, and then back-track to draw the variable width text in. But that wastes RRB time and screen RAM columns. The other way is to render the variable width text, and then add dummy characters to the right of it to make it line back up where it needs to. That's probably the saner way, since we are using H640 and V400 mode, which means we can't double our RRB time.

But that's all for later -- right now I want to focus on getting glyphs from fonts to actually render.

... and it looks like I might have hit a hardware bug in the MEGA65 core: double-height characters in FCM/NCM mode don't actually do anything.  Yup, the FCM data path completely ignores the flag.  Okay, filed an issue: https://github.com/MEGA65/mega65-core/issues/902

This looks like it should do the trick:

paul@bob:~/Projects/mega65/mega65-core$ git diff
diff --git a/src/vhdl/viciv.vhdl b/src/vhdl/viciv.vhdl
index 492a9a30a..165af9e88 100644
--- a/src/vhdl/viciv.vhdl
+++ b/src/vhdl/viciv.vhdl
@@ -4357,7 +4357,19 @@ begin
             -- We only allow 8192 characters in extended mode.
             -- The spare bits are used to provide some (hopefully useful)
             -- extended attributes.
-            glyph_number(12 downto 8) <= screen_ram_buffer_dout(4 downto 0);
+
+            -- FCM/NCM + CHARY16 causes interlacing of two consecutive glyphs
+            if reg_char_y16='1' and charrow_repeated='1' then
+              glyph_number(7 downto 0) <= glyph_number(7 downto 0) + 1;
+              if glyph_number(7 downto 0) = x"ff" then
+                glyph_number(12 downto 8) <= screen_ram_buffer_dout(4 downto 0) + 1;
+              else
+                glyph_number(12 downto 8) <= screen_ram_buffer_dout(4 downto 0);
+              end if;
+            else
+              glyph_number(12 downto 8) <= screen_ram_buffer_dout(4 downto 0);
+            end if;              
+            
             glyph_width_deduct(2 downto 0) <= screen_ram_buffer_dout(7 downto 5);
             glyph_width_deduct(3) <= '0';
             if screen_ram_buffer_dout = x"ff" then
@@ -4482,6 +4494,7 @@ begin
             -- Mark as possibly coming from ROM
             character_data_from_rom <= '1';
           end if;
+
           raster_fetch_state <= FetchTextCellColourAndSource;
         when FetchBitmapData =>
           -- Show what we are doing in debug display mode

Okay, so let's try synthesising that, and seeing if it does the trick.

Yes --- it works, but the ordering of the fonts is wrong, so I'll have to rebuild them to stack vertically, instead of scanning horizontally among the 4x64-byte cards. But I've put a bit of a hack in for now that rearranges it. And that seems to work.

The issue now is that the non-FCM characters are seemingly being affected by it as well.  Fixing that and resynthesising a bitstream... And then I over-fixed it.

But it is looking very promising --- it looks like it is doing what it should, apart from the rendering problems. Well, and that it seems to pull out the wrong character, offset by a position of 1; e.g., if you ask for BCDE you will get CBED, or thereabouts. Why they are swapped, I have no clear idea as yet. Maybe the font generator is putting them in in the wrong order?

Also, it looks like the nybl order for the NCM fonts is wrong --- but I am seeing tantalising hints of mono Emoji being rendered.  The colour ones are not rendering at all just yet, but I'll look into that later.

But either way, it looks like it's time to dive back into my font generator and:

1. Confirm that it's not reordering glyphs.

2. Swap the nybl order when writing bytes out.

3. Change the card ordering to match my renderer.

4. Figure out what's going on with the colour glyph rendering: Is it just my colour cube not being selected, or is it dodgy data?

First, let's take a look at what's in our pre-rendered font files, and see if the swapped order of glyphs is in there. A good way to do this is to look at character positions 0x20 and 0x21, as 0x20 should be blank (space) and 0x21 should not (exclamation mark).

And it looks like the exclamation mark character is at 0x2100-0x21ff in the file. So they don't seem to be swapped there.  So they probably aren't swapped on the SD card in the system partition. I've just double-checked this by trying to render 0x21, and then looking at the SD card sector buffer to see what was actually loaded.

This makes me think that maybe the shared resources access API is wonky.  Interestingly, the data for the swapped glyphs exists within the same SD card sector.

Yup! In shread() I wasn't waiting for the SD card sector read to finish before using the data.  With that fixed, (1) is now fixed. So on to (2) and (3), as those are fairly easy fixes.

And they were indeed easy to fix. I also fixed an issue I hadn't thought about before: I was rendering glyphs 17px high, not 16, because the Nokia-esque font was pixel-perfect that way. But that meant the very top line of pixels was getting truncated, which is bad.

At this point I am so glad I made all those tools to automate the font processing, and even copy them into the shared resources partition on an actual SD card. This means I can change the font generation, move the SD card from the MEGA65 to my PC, run a single command, then move it back, and test again -- all within a minute or two.

So it looks like (2) and (3) are fixed now. Back onto the colour glyphs: And a quick look in the generated font file shows that it's writing all transparent pixels.  So hopefully this will be a quick fix, too.

The problem here is that the NotoColorEmoji font is being reported as having pixels that are of type FT_PIXEL_MODE_GRAY, when we were expecting them to be FT_PIXEL_MODE_BGRA.  And they are being reported as having 0 rows and 0 columns of pixels.

It turns out that this is because libfreetype2 doesn't support colour fonts (at least, not in the way I'm using it here).  ChatGPT is saying that there's no C library it knows about that supports them, but it suggests some Python approaches. So I'll try to get it to write me a Python equivalent of my C program.

Except that didn't work either.

So I'm going to instead use the Twitter twemoji repository that has SVG files for all emojis and use a bit of Python to make our 16x16 glyphs in our binary format from those.

Got that working.  Now the problem is that the palette is wrong --- it should be selecting the alternate palette.  The problem there was that I hadn't enabled VIC-III Extended Attribute mode.

After that I had to work around a bug with the first row of text in the VIC-IV, where the first row of pixels is doubled up (a long known bug). But when interlaced text is used, it seems to be totally messed up.  

But with that done, on real hardware the Emoji displays! The m65 utility's screen renderer gets it mostly right, but the background and foreground colour of the FCM chars is wrong, so I can't show a digital pixel-perfect version here. But I can take a photo of the screen...


Look at that, we have Unicode text and emoji rendering working!

The code to draw them ends up being super simple, too:

  // Say hello to the world!
  // (Note the gap at x==2: 'm' is wider than 8px, so it takes two cells.)
  draw_glyph(0,1, FONT_UI, 'E',0x01);
  draw_glyph(1,1, FONT_UI, 'm',0x01);
  draw_glyph(3,1, FONT_UI, 'o',0x01);
  draw_glyph(4,1, FONT_UI, 'j',0x01);
  draw_glyph(5,1, FONT_UI, 'i',0x01);
  draw_glyph(6,1, FONT_UI, '!',0x01);
  draw_glyph(7,1, FONT_UI, ' ',0x01);
  draw_glyph(8,1, FONT_EMOJI_COLOUR, 0x1f929L,0x01);

I'll document this all more, and maybe even promote it into a shared library for other projects. For now it's all in:

https://github.com/MEGA65/megaphone-modular/blob/main/src/telephony/unicode-font-test.c

But first, I need to make a routine to parse a UTF-8 string and render it.  First step is to make a routine that extracts the next codepoint from a UTF-8 string, and lets us iterate over it.


unsigned long utf8_next_codepoint(unsigned char **s)
{
  unsigned char *p;
  unsigned long cp;

  if (!s || !(*s)) return 0L;

  p = *s;
  
  if (p[0] < 0x80) {
    cp = p[0];
    (*s)++;
    return cp;
  }

  // 2-byte sequence: 110xxxxx 10xxxxxx
  if ((p[0] & 0xE0) == 0xC0) {
    // Invalid continuation byte: consume the lead byte so we can't loop forever
    if ((p[1] & 0xC0) != 0x80) { (*s)++; return 0xFFFDL; }
    cp = ((p[0] & 0x1F) << 6) | (p[1] & 0x3F);
    *s += 2;
    return cp;
  }

  // 3-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx
  if ((p[0] & 0xF0) == 0xE0) {
    if ((p[1] & 0xC0) != 0x80 || (p[2] & 0xC0) != 0x80) { (*s)++; return 0xFFFDL; }
    // Cast before shifting: with CC65's 16-bit ints, (x << 12) would overflow
    cp = ((unsigned long)(p[0] & 0x0F) << 12) |
         ((p[1] & 0x3F) << 6) |
         (p[2] & 0x3F);
    *s += 3;
    return cp;
  }

  // 4-byte sequence: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  if ((p[0] & 0xF8) == 0xF0) {
    if ((p[1] & 0xC0) != 0x80 || (p[2] & 0xC0) != 0x80 || (p[3] & 0xC0) != 0x80)
      { (*s)++; return 0xFFFDL; }
    cp = ((unsigned long)(p[0] & 0x07) << 18) |
      ((unsigned long)(p[1] & 0x3F) << 12) |
      ((p[2] & 0x3F) << 6) |
      (p[3] & 0x3F);
    *s += 4;
    return cp;
  }

  // Invalid or unsupported UTF-8 byte
  (*s)++;
  return 0xFFFDL;
}

Then we need to know which font to pull the glyphs from.  Eventually I might combine the emoji with other fonts, so that we don't need this logic, but it's probably okay for now.


char pick_font_by_codepoint(unsigned long cp)
{
    // Common emoji ranges
    if ((cp >= 0x1F300 && cp <= 0x1F5FF) ||  // Misc Symbols and Pictographs
        (cp >= 0x1F600 && cp <= 0x1F64F) ||  // Emoticons
        (cp >= 0x1F680 && cp <= 0x1F6FF) ||  // Transport & Map Symbols
        (cp >= 0x1F700 && cp <= 0x1F77F) ||  // Alchemical Symbols
        (cp >= 0x1F780 && cp <= 0x1F7FF) ||  // Geometric Shapes Extended
        (cp >= 0x1F800 && cp <= 0x1F8FF) ||  // Supplemental Arrows-C (used for emoji components)
        (cp >= 0x1F900 && cp <= 0x1F9FF) ||  // Supplemental Symbols and Pictographs
        (cp >= 0x1FA00 && cp <= 0x1FA6F) ||  // Symbols and Pictographs Extended-A
        (cp >= 0x1FA70 && cp <= 0x1FAFF) ||  // Symbols and Pictographs Extended-B
        (cp >= 0x2600 && cp <= 0x26FF)   ||  // Misc symbols (some emoji-like)
        (cp >= 0x2700 && cp <= 0x27BF)   ||  // Dingbats
        (cp >= 0xFE00 && cp <= 0xFE0F)   ||  // Variation Selectors (used with emoji)
        (cp >= 0x1F1E6 && cp <= 0x1F1FF))    // Regional Indicator Symbols (🇦 – 🇿)
        return FONT_EMOJI_COLOUR;    
    
    return FONT_UI;
}

Then we need a routine that uses that to render a native UTF-8 string:

  {
    unsigned char *string="Ümlaute! 👀 😀 😎 🐶 🐙 🍕 🍣 ⚽️ 🎮 🛠️ 🚀 🎲 🧩 📚 🧪 🎵 🎯 💡 🔥 🌈 🪁";
    unsigned char *s=string;
    unsigned char x=0;
    unsigned long cp;

    while ((cp = utf8_next_codepoint(&s))) {
      unsigned char f = pick_font_by_codepoint(cp);
      x += draw_glyph(x, 4, f, cp, 0x01);
    }
  }
 

What's cool here is that I am literally just using a UTF-8 string in the source, and then it renders!


Now, there is something funny going on, with white being shown as cyan or something. This must be some subtle thing in the VIC-IV super extended attribute mode, because it doesn't happen with the m65 -S screenshot tool, which has its own interpreter for those attributes (and other bugs of its own -- which is why the text isn't appearing, and the screen background is black instead of blue):


At some point I'll have to figure out both of those bugs. But it's not a super high priority right now.

What is more annoying is that while the Nokia-like font has good kerning, the Noto font doesn't. I probably need to account for some quirk of that font. It works, but as you can see here, the kerning is all messed up:

 


Compare the word "Ümlaute" (yes, I know, it's not a real word -- it should be Umlaute, but I wanted to test the rendering of an actual umlaut) in this screenshot to the one a few pictures back where it was in the Nokia-like font.  I think some glyphs, like a and e, have some weird shift in them, perhaps because they extend slightly to a negative X position or something.  Fonts are weird and complex.  I'll look into it in my font preparation tools at some point, and enforce that they have a blank column of pixels on the right-hand edge. But for now, that's lower priority.

The next step will be to accumulate characters until an acceptable word break is found, and then check whether we need to advance to the next line before printing the word.  This will require knowledge of the left and right columns available for rendering, both in terms of character columns and pixel positions, so that if either limit is reached, it can wrap.  It probably also needs to know the max row available, so that it can report an error and the point in the string that it was able to render to.

But all that will have to wait until tomorrow. 

Okay, so let's start defining the text box structure.

It's now some time later, and I have in the meantime worked out the whole storage and indexing of messages.  As part of this, there is scope to store some short attributes for each message: for example, the number of bytes of the message in each line when rendered in a message box, and the number of lines, so that we know how much space we need on the screen. Caching this information allows faster re-display of messages.
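Something like this, conceptually (the field names are invented for illustration; the real record layout is whatever the storage code defines):

  // Hypothetical cached render attributes stored alongside a message
  struct message_render_cache {
    unsigned char num_lines;       // screen rows this message needs
    unsigned char line_bytes[16];  // UTF-8 bytes consumed by each rendered line
  };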

So let's start with an algorithm to work out the break-points in a message for our standard message box width. From there we can make a routine that displays a line of text message, and then next up, displaying the whole message.

We already have our little routine for displaying strings on a single line, so we'll use that as a starting point.

One thing we'll need to deal with at some point is calculating all this even when other messages are visible on the screen -- which might mean that our Unicode unique-glyph buffers are too full to determine the full dimensions of the message.  But that can wait for now -- there are a few ways I can deal with it at the right time. For example: don't retain the codepoints read purely for the purpose of determining their width, and just buffer the widths of the last few in a move-to-front cache, so that we don't keep re-fetching the sectors with that information.
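To illustrate the move-to-front idea (a sketch only: the size and names are made up, and the insert-on-miss path isn't shown):

  #define WCACHE_SLOTS 16
  unsigned long wcache_cp[WCACHE_SLOTS];
  unsigned char wcache_width[WCACHE_SLOTS];

  // Return 1 and set *width if we have cp cached; hot glyphs bubble to slot 0
  char wcache_lookup(unsigned long cp, unsigned char *width)
  {
    unsigned char i, j;
    for (i = 0; i < WCACHE_SLOTS; i++) {
      if (wcache_cp[i] == cp) {
        *width = wcache_width[i];
        for (j = i; j; j--) {  // shuffle everything above the hit down one slot
          wcache_cp[j] = wcache_cp[j-1];
          wcache_width[j] = wcache_width[j-1];
        }
        wcache_cp[0] = cp;
        wcache_width[0] = *width;
        return 1;
      }
    }
    return 0;  // miss: caller fetches the width from the font data
  }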

Let's start with the routine to display a string, limited to a given view-port:

char draw_string_nowrap(unsigned char x_glyph_start, unsigned char y_glyph_start, // Starting coordinates in glyphs
            unsigned char f, // font
            unsigned char colour, // colour
            unsigned char *utf8, // UTF-8 string to draw
            // Number of pixels available for width
            unsigned int x_pixels_viewport,
            // Number of glyphs available
            unsigned char x_glyphs_viewport,
             // And return the number of each consumed
            unsigned int *pixels_used,
            unsigned char *glyphs_used)
{
  unsigned char x=0;
  unsigned long cp;
  unsigned char *utf8_start = utf8;
  unsigned int pixels_wide = 0;
  unsigned char glyph_pixels;
  
  if (pixels_used) *pixels_used = 0;
  
  while ((cp = utf8_next_codepoint(&utf8))) {
    // unsigned char f = pick_font_by_codepoint(cp);

    // Abort if the glyph won't fit.
    if (lookup_glyph(f,cp,&glyph_pixels, NULL) + x >= x_glyphs_viewport) break;
    if (glyph_pixels + pixels_wide > x_pixels_viewport) break;

    // Glyph fits, so draw it, and update our dimension trackers
    glyph_pixels = 0;
    x += draw_glyph(x_glyph_start + x, y_glyph_start, f, cp, colour, &glyph_pixels);
    pixels_wide += glyph_pixels;
  }

  if (glyphs_used) *glyphs_used = x;
  if (pixels_used) *pixels_used = pixels_wide;

  // Return the number of bytes of the string that were consumed
  return utf8 - utf8_start;
}
 

There isn't really too much to it: try drawing glyphs until we reach the end of the string, or until the next glyph won't fit in the viewport.
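As a usage sketch (the viewport numbers and colour here are invented for illustration):

  unsigned int px_used;
  unsigned char gl_used;
  unsigned char bytes;

  // Draw into a viewport starting at glyph (0,6), at most 20 glyphs / 160 px wide
  bytes = draw_string_nowrap(0, 6, FONT_UI, 0x01,
                             (unsigned char *)"Hello!",
                             160, 20, &px_used, &gl_used);
  // 'bytes' is how far into the string we got, for resuming on the next line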

Perhaps the next thing is to make a function that after printing a string, pads it out to fill the remaining viewport space with blank glyphs. This is so that, for example, we could draw a box around a message, and have the boundaries line up. Or maybe we just want to be able to put more content on the right-hand side of a message like this.

Using NCM glyphs, we can consume up to 32 pixels per glyph.

I would like to be able to make it work with reversed text as well, so that we could make a filled bubble around messages. This should work with the reverse SEAM flag, and it does work for mono glyphs. But for coloured glyphs, the reverse flag messes with the colours, I think. In any case, it looks rubbish, and the background isn't inverted, which is really the point of what we wanted.

Yup, the reverse flag with FCM glyphs inverts the pixel values. What we want instead is for it to switch the background and foreground colours. I've filed a bug and crafted a possible fix here.

While that fix is synthesising, I can still work on the viewport alignment. Ideally we want the padding to use all tokens in the remainder of the viewport.  So if there are spares after getting the alignment right, we should probably fill the rest with GOTOXs that just stay on the appropriate pixel.  But that means we need to know the absolute X position of the viewport end.  So we need another argument for the "parking position".
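Conceptually, the padding might look something like this (a sketch only, not the working implementation: if I recall the RRB encoding correctly, a GOTOX token has bit 4 set in its colour RAM byte 0, with the new X position in its screen RAM bytes; the addresses and lpoke() plumbing are simplified here):

  void pad_viewport(unsigned long screen_addr, unsigned long colour_addr,
                    unsigned char glyphs_spare, unsigned int park_x)
  {
    unsigned char i;
    // Fill the spare tokens with GOTOXs that all park at the viewport's end
    for (i = 0; i < glyphs_spare; i++) {
      lpoke(screen_addr + i*2 + 0, park_x & 0xff);         // X position, low byte
      lpoke(screen_addr + i*2 + 1, (park_x >> 8) & 0x03);  // X position, high bits
      lpoke(colour_addr + i*2 + 0, 0x10);                  // GOTOX flag
      lpoke(colour_addr + i*2 + 1, 0x00);
    }
  }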

I have it implemented, but it's not working properly: The padding is messing up the display instead of doing what it should. Also, that fix for reversed FCM glyphs doesn't seem to have worked.  I've made one small fix to the VHDL and have set the bitstream synthesising again. While that's underway, I'll investigate what's wrong with the padding stuff. 

Okay, I've fixed a bunch of the padding bugs, but there's still something messed up with the reverse mode for FCM colour glyphs. Their colours are still being inverted, despite the fix for choosing background or foreground colour for the transparent pixels (value $00) now working properly.  But I can't for the life of me see where the inversion is happening.

The string padding is a lot closer now, though: I can display non-reversed text, and it looks fine, and gets padded to a straight right edge.  But when I render it reversed, it looks like the nybl order (this is for monochrome/alpha-blended glyphs) is all messed up. It's kind of funny how it comes out, as it makes the nokiaesque font look almost like a gothic font:


We can see the normal video line at the top looks fine, but all the problems I've been describing can be seen in the reverse-video lines.  The coloured FCM glyphs for the emoji have borked-up colour, while something weird is going on with the pixel column selection for the reverse video.  I'm not sure exactly what it is yet, though.

Actually, I think I have an idea what it is for the alpha-blended glyphs: Nybls with value $F are getting inverted to value $0, which then gets rendered in the foreground colour, because of the fix I did for the colour glyphs.  Time for another synthesis run -- I'm glad I have the fast build box these days. Less than ten minutes to do a synthesis run.

I thought the off colours might be from accidentally selecting the alternate palette, but that's not the cause. Likewise, my attempted fix hasn't worked for the reverse alpha-blended text.  I'm doing a `make clean` and rebuilding the bitstream, just in case it didn't rebuild properly.

So, that's fixed the alpha-blended NCM text:

Now we've got the foreground colour not showing for the emoji full-colour glyphs --- and they're still all messed up in colour.  This has me stumped for the moment, so I'll progressively try a few things in the background, while I focus on getting wrapped string display and determining appropriate breakpoints.

I'm wondering if I shouldn't immediately pull in a text message from one of the D81s rather than use hand-coded strings, so that I can move more directly towards a routine that can display a text message.  It will also give me an idea of the combined code size required to have the string display routines as well as the message handling routines -- if they don't both fit in the same executable, things are going to get interesting.

So let's start by pulling out the screen setup code (which won't need to be in the same executable) and then the string/unicode display stuff into separate source files so that I can pull in exactly which routines we need in each module.

Talking with Mirage and BobbyTables on Discord (thanks both for helping me figure out what's going on), and looking at the screen init code above(!), I realised that the problem is almost certainly in the handling of the REVERSE and BOLD attribute bits in FCM, and in how I have the FCM glyphs use the alternate palette to access the colour cube.

i.e., I'm setting REVERSE and BOLD for ALTPALETTE in all the FCM glyphs.  But then for reverse mode we have a problem, as reversed glyphs need REVERSE but not BOLD set.

Ideally I'd just use a single palette, but life's not quite that simple: in the mode we're using, we have only 16 foreground colours available to us, so we'd need the transparent pixels to come from one palette, while the foreground pixels come from the other!

It's probably not unreasonable to make the transparent pixels use the default palette, regardless. But that still leaves the problem of how to signal both REVERSE and alternate palette.  After a lot of fiddling about, I've used a spare register bit in $D053 to indicate that BOLD = ALTPALETTE, and that REVERSE should operate independently of it.  That way no existing software should get messed up, but we can do what we want here. Synthesising that now.

Okay, some progress: The reverse bit now doesn't cause the palette to get messed up: Both normal and reverse emojis are showing the correct palette.  However, when in reverse, the transparent pixels are being drawn black instead of the white colour I've selected. I'm guessing that it's using the alternate palette for that case, when it should use the primary palette foreground colour.

By writing over the palette entries, I can confirm that is _not_ what is happening. Instead it's a colour from the primary palette. Colour 17, in fact. That's suspiciously what we'd expect if the BOLD bit is working as BOLD still.  Which it will be, since my BOLDISALT fix only forced it to select the alternate palette, not disable its other normal functions.  I have a fix for that synthesising now.

Finally it shows fine:



The only remaining minor thing is that white in the FCM glyphs is represented by value $FF, which gets replaced with the foreground colour of the glyph, making part of the eyes and the soccer ball disappear.  But that's not a problem in this program --- it's in the font preparation program.  Although we _could_ make an FCM display mode that disables the $FF colour substitution, it's better to just change the glyphs to use the closest colour to white.
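On the font side, the remap could be as small as this (a sketch for the font-preparation tool; picking $FE as the nearest colour-cube entry to white is my assumption):

  // FCM pixel value $FF gets replaced with the glyph's foreground colour at
  // display time, so remap white to the closest non-reserved 332 value.
  unsigned char fix_fcm_pixel(unsigned char v)
  {
    return (v == 0xff) ? 0xfe : v;
  }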

So let's move on to wrapped string display now for a bit.  I intend to record the line-break points in the messages when we store them, so that we don't have to calculate them at run-time.  Actually generating the break points is a bit of a rabbit-hole, though.  LaTeX has a nice, simple-ish way of optimally breaking words, with a score-based system to help choose the break points.  It needs a trie structure of ~100KB, which we could accommodate.  But it does need one per language.

That's all a bit complex, so I'll just try to break on white-space, and if that's not possible, then just break arbitrarily. Otherwise I'm going to waste even more time thinking about this.

The algorithm is thus quite simple: find the extent of the next word, and measure its width in pixels (and in glyphs, in case we run out of those), then break it if it's too long, or add it to the line if it fits.  This should all happen off-screen, just to get the break points in the string.  So I'll start by getting the width of each glyph into an array, and then we can do the rest from there.  This will give us a framework for applying a cost to breaking at any given point in the string.
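Roughly, the measuring pass could look like this (a sketch under assumptions: measure_glyph_width() stands in for the real width lookup, breaks only happen on spaces or when a word exceeds the line, and the cost framework isn't shown):

  // Compute byte offsets of line breaks for a message of at most max_lines rows.
  unsigned char find_breakpoints(unsigned char *utf8, unsigned int max_px,
                                 unsigned int *break_offset, unsigned char max_lines)
  {
    unsigned char *s = utf8;
    unsigned char *word_start = utf8;
    unsigned long cp;
    unsigned int line_px = 0, word_px = 0;
    unsigned char lines = 0;

    while ((cp = utf8_next_codepoint(&s))) {
      word_px += measure_glyph_width(pick_font_by_codepoint(cp), cp);
      if (cp == ' ') {
        // Word (plus trailing space) complete: does it fit on the current line?
        if (line_px + word_px > max_px) {
          if (lines >= max_lines) return lines;       // out of rows: report progress
          break_offset[lines++] = word_start - utf8;  // break before this word
          line_px = 0;
        }
        line_px += word_px;
        word_px = 0;
        word_start = s;
      } else if (word_px > max_px) {
        // A single word wider than the line: break arbitrarily mid-word
        if (lines >= max_lines) return lines;
        break_offset[lines++] = s - utf8;
        line_px = 0;
        word_px = 0;
        word_start = s;
      }
    }
    return lines;
  }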

I'm pulling that together. In the process I've got the whole code base now compiling under CC65 and producing a single MEGA65 native binary. I half expected it to barf, because I've got sorting, indexing and a pile of other routines in there at the moment, and the CC65 linker doesn't prune out unneeded stuff. But the binary is currently still only 28KB or so.  That makes my life easier for the moment, even though I'll still need to separate things out later. 

Some progress: I can now display a wrapped string (although I haven't fully exposed the API for this just yet): 

 That looks really good! It's preferentially breaking on words, like we want. Nothing is too wide. And the box is constant width. Great!

Let's give it a bit more space to fit a bit more on each line: 

Ah. There's something going wrong there.

Let's put some strings at a nominal fixed position to the right, and see if the alignment is good and it's just the lack of reversed spaces that's our problem:

Nope. Something is wonderfully borked. Found the bugs: the main one was indeed in the call to the padding when displaying a string segment. Now it works nicely:

Now to use this code to make the routines that store the line-break information into the SMS message record, and then retrieve it again.

Which means that we need to hook in the remaining Hardware Abstraction Layer routines to mount and access the disk images etc.  Then I can make the code actually read real SMS messages and display those. 

Going through the libc source, I'm trying to find if there is already code for calling the MOUNT, CHDIR and other HYPPO calls that we'll need here.

Okay, I've found chdir and chdirroot, but not calls for mounting. A bit annoying, but chdir has the machinery necessary for providing a filename to HYPPO, which I can re-use for making a mount call.  So it shouldn't be too hard.

So this is how chdir() handles its argument and the HYPPO call. It does a bit more than we need, because we don't need to do a findfile() first when mounting: we just need to copy the D81 filename down to $0100, then do the HYPPO setname call, and then the actual mount call.
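In other words, something along these lines (a hedged sketch: hyppo_setname() and hyppo_d81attach() stand in for small assembly thunks that trap to HYPPO via $D640, and their real subfunction numbers live in the assembly, so I'm not reproducing them here):

  char mount_d81(const char *name, unsigned char drive)
  {
    unsigned char i;
    // HYPPO expects the filename at $0100, NUL-terminated
    for (i = 0; name[i]; i++) ((unsigned char *)0x0100)[i] = name[i];
    ((unsigned char *)0x0100)[i] = 0;
    if (hyppo_setname()) return -1;  // point HYPPO at the name we just copied
    return hyppo_d81attach(drive);   // then attach it as drive 0 or 1
  }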

Now, ideally, we'd reuse that code from the mega65-libc, rather than me having to cook it up from scratch, and having it use twice as much code. But the routines are small, so I think this is less of an issue than it might otherwise be.  I should probably log an issue on the mega65-libc repo suggesting pulling in the mount() code that I'll be writing for this repo.

That's all done now -- I did just fork the code and include it in hal_asm.s in the telephony repo, as it was easier. I've confirmed that I can mount disk images on drive IDs 0 and 1, so that's good.  Let's check the CHDIR functionality, including to root directory -- both of those are working, as confirmed by mounting disk images from two separate directories.

Next up is populating my SD card with the telephony directory structure, complete with sample message traffic, using the tooling I made for that earlier.  That tool can write directly to a mounted SD card, to make our life easier. Done.

So now I can use code like this:

  mega65_cdroot();
  mega65_chdir("PHONE");
  mega65_chdir("THREADS");
  mega65_chdir("0");
  mega65_chdir("0");
  mega65_chdir("0");
  mega65_chdir("0");
  mount_d81("MESSAGES.D81",0);

And that will mount the messages D81 for contact 0. In fact, we should be able to use the mount_contact_qso(int contact_id) function to do this. Except that it doesn't work: for some reason the chdirroot() call is failing.  Weird. Ah! The mega65-libc chdirroot() function doesn't set the X register value -- which CC65 only sets if an int argument is provided on the call. And in turn, HYPPO is returning the wrong value on success. So time for issues galore.  With those fixed, I can use mount_contact_qso() in C, and have it mount the messages and message-index disk images as drive IDs 0 and 1 (= CBDOS devices 8 and 9):


If I now add routines for reading and writing sectors from the disk image, I can retrieve a record with a dummy message in it, and try to render it -- it's getting exciting :)

With a bit more mucking about, I now have a message displaying:

This is a great step forward -- now to refactor the code doing this to be all reusable, and to store the line-breaks in the record for faster future rendering.  I've done that, and it's broadly working -- but only if I render to the left-most glyph on each character row.

Fixed a couple more bugs, and now that's working, too, but with a subtle glitch on the first raster of each text box at a precise horizontal position on the screen.

This seems to be related to the max render count for VIC-IV character rows in FCM. Even with the wrap-around protection on the RRB, it still glitches if it's too high. I'm guessing that this is because the RRB runs out of time.  But any value > 128 seems to trigger it, so I'm a little suspicious that something else might be involved here, too.

Okay, so, yes. If the RRB is still being rendered when it should be restarting the render process, then it doesn't abort the rendering. Instead it continues, and basically ignores the request to render the next raster line.  This causes all sorts of ugly almost interlace-looking glitching.

The root cause is that the state machine for raster_fetch_state lives lower down in the viciv.vhdl file, which means that its internal state overrides the impulse to start rendering a new raster line into the RRB.  So I'll fix that, too.

I've also realised that the HDMI output doesn't seem to be showing the full 720px width: only about 680px is visible.  This seems to be a monitor problem, as on a different monitor the cut-off point is different. But it's still not the full 720px.

So it turns out that it's totally normal for even digital PAL over HDMI to be clipped by the monitor to 702 or 704 pixels wide, which is justifiable under the standard. But further trimming to 680 -- 700 pixels? That's just plain annoying.

I suspect this doesn't happen on VGA. Wrong. It does. 680 seems to be all I can convince VGA or HDMI to display. So I guess that's what I'll work with for now.

Next up: we are drawing an extra line of junk at the bottom of the text boxes. Hmm. That seems to be in the actual record on disk.  The characters are supposed to be emoji, but are getting mis-rendered or mis-decoded somewhere.  Ah, it was the automatic font-selection magic for rendering emojis from the colour emoji font: both the width assessment and the actual drawing of the string were messed up.  With those fixed, it now looks good:

So let's try drawing a thread, working from the latest messages at the bottom of the screen, back to the top, until we run out of space. 

Look at that! It's a message thread. Loaded from actual SMS message storage in a D81 on the SD card. We even have the sent and reply sense indicated by the colour and position like on a Sensible Phone!  There's still plenty more to do, but this post is already more than long enough.