Thursday, January 3, 2019

More work on the MEGA65 built-in freezer

Yesterday I posted the progress on the built-in freezer for the MEGA65, and explained a bit how it works.  However, at that point in time, the freezer was not really functional -- it could save and restore some memory and IO registers, but not without problems, and thus it wasn't possible to actually resume a program after freezing.  That has changed today!  After quite a bit of fiddling, the freeze and unfreeze routines are now much better, and generally work.

The main progress is that I am able to save the main memory, the colour RAM, the VIC-IV registers (including the colour palettes), the MEGA65 Hypervisor saved state (which is really the saved state of the program being frozen, since it was saved on entry to the Hypervisor, which is what is actually doing the freezing), along with most of the new MEGA65 registers, e.g., those at $D7xx.

The result is that the program gets fairly convincingly frozen.  Of course, this is no good if the program can't be unfrozen afterwards, but that now works just fine too, as the following video of me playing Krakout and freezing and resuming it multiple times shows.  (Apologies for the shaky video; I don't have my good camera and tripod here at home.  Similarly, there is no audio, because the Zoom recorder isn't here either.)

What is clear is that we can freeze and unfreeze a real game, and it resumes without any noticeable problems.  Doing so multiple times is not a problem either.  It also works fine to freeze BASIC, as the following freezing, frozen and un-frozen images show:

Just to prove that it was still alive after, I typed some rubbish:

(Note the fun feature of the later C65 ROMs of showing error messages in red, regardless of what the cursor colour was before).

While I would like the freeze and unfreeze time to be a little faster, it is already quite acceptable.  Once we have the 8MB expansion RAM in the MEGA65 working, we will be able to freeze to expansion RAM instead of the SD card in the first instance, which should make freezing and unfreezing several times faster.

In fact, the main limitations at the moment are relatively few:

1. Like most C64 freezers, we can't really freeze the state of the SIDs, because of all those SID registers being write-only, and even if they were readable, they would only show what you wrote, not the current ADSR state of the voices etc.  I'll likely add some support for saving and resuming the internal state of the SIDs, so that freezing doesn't mess up music.

2. The CIAs are not currently backed up.  This is really just a little oversight, and should be quite trivial to fix.

3. The Hypervisor doesn't sanity-check the state of any previously mounted disk image(s), and re-mount them if still available.  Similarly, it doesn't check any other bits and pieces in the process descriptor block after loading it back in.

4. I noticed that blindly restoring the VIC-IV registers goes badly if the freeze occurred at a high raster line, because the raster compare register can then be programmed to an impossibly high raster number.  This would cause the program to effectively not resume after unfreezing, unless you manually modified $D011 to clear the high bit of the raster compare register.  Thus I should probably AND $D011 with $7F after restoring the machine state.
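
The fix suggested in item 4 can be sketched in C as follows.  This is only an illustration that runs on a host machine: the function name is hypothetical, and the real fix would live in the Hypervisor's unfreeze routine rather than in C.

```c
#include <stdint.h>

/* Hypothetical helper: clear bit 7 of a restored $D011 value, which is
 * the high bit of the raster compare register, so that the compare
 * value cannot end up beyond the last raster line. */
uint8_t sanitise_restored_d011(uint8_t restored)
{
    return (uint8_t)(restored & 0x7F);
}
```

For example, a saved value of $9B would be written back as $1B, leaving the other seven bits of $D011 untouched.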

Wednesday, January 2, 2019

Working on the MEGA65 Freeze Menu

For a long time, the planned primary interface for controlling the MEGA65 has been a kind of "freeze menu".  While this will be easy for folks to change, our rationale is that it allows the machine to boot to BASIC as expected, while still having all the features you commonly want, e.g., mounting disk images, loading programs from a menu etc., a single button press away.

A while back, I mentioned that we were planning on having a double-tap of RESTORE trigger this.  This has evolved a bit into a long-press of RESTORE (anywhere from ~0.5 seconds to 5 seconds; holding it longer than that will reset the machine instead, which we might remove to avoid accidents, especially since the M65 will come with a reset button).

Quite a lot of work has gone on in the background to actually get to the point of having a freeze menu appear and be useful.  While it isn't quite there yet, it is now getting much closer.  A lot of that work has been on getting functional(ish) freeze and unfreeze routines working, as well as the hypervisor hooks to actually trigger the freeze and load the freeze menu itself.

So let's walk through how this all pulls together, beginning with pressing the RESTORE key, and detecting whether it is a normal press of the RESTORE key, a long-press that should trigger the Hypervisor trap that launches the freeze process, or a press that should reset the CPU.  This is all in src/vhdl/keymapper.vhdl:

        -- 0 = restore down (pressed), 1 = restore up (not-pressed)
        if restore_state='0' and last_restore_state='1' then
          -- Restore has just been pressed, do nothing special.
          -- (Events happen on rising edge)
        elsif restore_state='1' and last_restore_state='0' then
          -- Restore has just been released
          if restore_down_ticks < 8 then
            -- <0.25 seconds = quick tap = trigger NMI
            restore_out <= '0';
          elsif restore_down_ticks < 32 then
            -- 0.25 - ~ 1 second hold = trigger hypervisor trap
            hyper_trap <= '0';

            hyper_trap_count <= hyper_trap_count_internal + 1;
            hyper_trap_count_internal <= hyper_trap_count_internal + 1;
          elsif restore_down_ticks < 128 then
            -- Long hold = do RESET instead of NMI
            -- But holding it down for >4 seconds does nothing,
            -- in case someone holds it by mistake, and wants to abort doing a reset.
            reset_drive <= '0';
            report "asserting reset via RESTORE key";
          end if;
          hyper_trap <= '1';
          restore_out <= '1';
          reset_drive <= '1';
        end if;
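
The decision made on key release can be restated as a small C function, which may be easier to follow than the VHDL.  This is a sketch only: the enum and function names are hypothetical, while the thresholds of 8, 32 and 128 ticks are copied from the snippet above, whose comments imply roughly 32 ticks per second.

```c
#include <stdint.h>

/* Possible outcomes when the RESTORE key is released. */
enum restore_action {
    RESTORE_NMI,         /* quick tap */
    RESTORE_HYPER_TRAP,  /* medium hold: enter the freezer */
    RESTORE_RESET,       /* long hold */
    RESTORE_IGNORE       /* very long hold: treated as a change of mind */
};

/* Classify a key press by how many ticks RESTORE was held down.
 * The thresholds mirror the comparisons in the VHDL above. */
enum restore_action classify_restore_press(uint32_t down_ticks)
{
    if (down_ticks < 8)   return RESTORE_NMI;
    if (down_ticks < 32)  return RESTORE_HYPER_TRAP;
    if (down_ticks < 128) return RESTORE_RESET;
    return RESTORE_IGNORE;
}
```

Under that tick rate, a hold of about half a second (16 ticks) selects the Hypervisor trap, and anything of four seconds or more is ignored.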

When hyper_trap goes to zero, this tells the CPU to trigger the freezer Hypervisor trap.  This really just means that the CPU enters Hypervisor mode after saving register state, and then jumps to a certain location in the Hypervisor programme.  To make writing the freeze menu easy, after saving the state of the machine to freeze slot #0, the Hypervisor loads in the standard C64 character set and a C65 ROM, and assumes that the freeze menu is a program made for C64 mode with entry point at SYS 2061.  This means we can write the freeze menu using CC65, the C compiler for the C64, for example.

In the following snippet from kickstart_task.a65 we can see that the Hypervisor already implements a bunch of very handy routines that make it easy to load the ROM files, and then the freeze menu itself.  Loading the freeze menu is performed by setting the name of the file we want to load from the SD card ("FREEZER.M65"), and then providing the 32-bit load address.  We load it to $07FF instead of $0800 or $0801 as you might have otherwise expected, because we expect the program to have a normal C64-style $01 $08 header on it, and thus we need to pretend it loads at $07FF so that the first real byte of data is placed at $0801.

Otherwise, there is nothing too surprising here.  We set the C64 memory map to make life easier for the program, and we also provide a dummy NMI vector, as we have seen race conditions where an NMI can be triggered before a proper NMI vector has been installed.  Since we don't enter via the C64/C65 ROM's normal entry point, the NMI vector at $0318 won't get set up automatically, thus requiring this precaution.  Finally we set the value of the PC on exit from the Hypervisor, and actually exit the Hypervisor itself:


    ; Freeze to slot 0
    ldx #<$0000
    ldy #>$0000
    jsr freeze_to_slot

    ; Load freeze program
    jsr attempt_loadcharrom
    jsr attempt_loadc65rom

    ldx #<txt_FREEZER
    ldy #>txt_FREEZER
    jsr dos_setname

    ; Prepare 32-bit pointer for loading freezer program ($000007FF)
    ; (i.e. $0801 - 2 byte header, so we can use a normal PRG file)
    lda #$00
    sta <dos_file_loadaddress+2
    sta <dos_file_loadaddress+3
    lda #$07
    sta <dos_file_loadaddress+1
    lda #$ff
    sta <dos_file_loadaddress+0

    jsr dos_readfileintomemory
    jsr task_set_c64_memorymap
    jsr task_dummy_nmi_vector
    ; set entry point and memory config
    lda #<2061
    sta hypervisor_pcl
    lda #>2061
    sta hypervisor_pch

    ; return from hypervisor, causing freeze menu to start
    sta hypervisor_enterexit_trigger
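
The load address arithmetic above can be checked with a short C sketch.  The function names are hypothetical and the code runs on a host machine; it only models the "subtract the two header bytes, then store the pointer least significant byte first" reasoning, not the Hypervisor's actual loader.

```c
#include <stdint.h>

/* Size of the load-address header at the start of a PRG file. */
#define PRG_HEADER_BYTES 2u

/* Address to hand to the loader so that the first byte after the
 * header lands at `first_data_address`. */
uint32_t prg_raw_load_address(uint32_t first_data_address)
{
    return first_data_address - PRG_HEADER_BYTES;
}

/* Split a 32-bit address into four bytes, least significant first,
 * matching the order of the dos_file_loadaddress+0..+3 stores. */
void split_load_address(uint32_t address, uint8_t out[4])
{
    int i;
    for (i = 0; i < 4; i++) {
        out[i] = (uint8_t)((address >> (8 * i)) & 0xFF);
    }
}
```

For a target of $0801 this gives $000007FF, i.e. the bytes $FF, $07, $00, $00 that the assembly stores.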

The actual freezing happens in the Hypervisor in the freeze_to_slot routine, rather than in the freeze menu. Similarly, unfreezing happens in the Hypervisor as well.  This actually solves a lot of problems all at the same time. First, the freeze menu doesn't need to know about changing on-SD formats for the freeze slots.  Second, it makes sure that there is a single freeze and a single unfreeze routine used in all situations. Third, it allows use of the extra memory of the Hypervisor, to allow for near-perfect freezing, without corrupting the stack or any other memory.  It also means that we can provide a nice simple abstracted interface to allow one program to get itself replaced by another in memory, similar to exec() on UNIX-like systems.

The freeze and unfreeze routines are naturally very similar. They basically consist of a loop that iterates through a range of memory areas that have to be loaded or saved, with an optional pre-save or post-load hook.  This allows us to define pseudo regions that save some tricky bits of machine state that we can't just DMA to the SD card.  It also makes it quite easy to modify what gets saved.  Here is the definition of the list of regions to be saved as they currently stand.  We know there are some missing bits, and we have removed some bits to make this easier to read.

    ; start address (4 bytes), length (3 bytes),
    ; preparatory action required before reading/writing (1 byte)
    ; Each segment will live in its own sector (or sectors if
    ; >512 bytes) when frozen. So we should avoid excessive
    ; numbers of blocks.

    ; SDcard sector buffer + SD card registers
    ; We have to save this before anything much else, because
    ; we need it for freezing.
    .dword $ffd6000
    .word $0290
    .byte 0
    .byte freeze_prep_stash_sd_buffer_and_regs

    ; 384KB RAM (includes the 128KB "ROM" area)
    .dword $0000000
    .word $0000     
    .byte 6          ; =6x64K blocks = 384KB
    .byte freeze_prep_none   

    ; 32KB colour RAM
    .dword $ff80000
    .word $8000
    .byte $00
    .byte freeze_prep_none

    ; VIC-IV palette block 0
    .dword $ffd3100
    .word $0400
    .byte 0
    .byte freeze_prep_palette0

    ; VIC-IV palette block 1
    .dword $ffd3100
    .word $0400
    .byte 0
    .byte freeze_prep_palette1

    ; VIC-IV palette block 2
    .dword $ffd3100
    .word $0400
    .byte 0
    .byte freeze_prep_palette2

    ; VIC-IV palette block 3
    .dword $ffd3100
    .word $0400
    .byte 0
    .byte freeze_prep_palette3   

    ; Process scratch space
    .dword currenttask_block
    .word $0100
    .byte 0
    .byte freeze_prep_none

    ; $D640-$D67E hypervisor state registers
    ; XXX - These can't be read by DMA, so we need to have a
    ; prep routine that copies them out first?
    .dword $ffd3640
    .word $003F
    .byte 0
    .byte freeze_prep_none

    ; VIC-IV, F011 $D000-$D0FF
    .dword $ffd3000
    .word $0100
    .byte 0
    .byte freeze_prep_none

    ; $D700-$D7FF CPU registers
    .dword $ffd3700
    .word $0100
    .byte 0
    .byte freeze_prep_none

    ; XXX - Other IO chips!

    ; End of list
    .dword $FFFFFFFF
    .word $FFFF
    .byte $FF
    .byte $FF
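
To make the layout of this list more concrete, here is a sketch in C of how such a list could be walked.  It is not the Hypervisor's code: the struct, the function names and the hook signature are all hypothetical, and it assumes (per the comment at the top of the list) that every region starts on a fresh 512-byte sector.  The hook is called before each region, which models the save direction only.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BYTES 512u
#define END_OF_LIST  0xFFFFFFFFu

/* One entry of the region list: 4-byte start address, 3-byte length
 * (held here in a 32-bit field), and a 1-byte preparatory action. */
struct freeze_region {
    uint32_t start;
    uint32_t length;   /* only the low 24 bits are meaningful */
    uint8_t  prep;
};

/* Combine the .word and .byte parts of a length into one number. */
uint32_t region_length(uint16_t low_word, uint8_t high_byte)
{
    return ((uint32_t)high_byte << 16) | low_word;
}

/* Sectors needed for one region, rounding up to whole sectors. */
uint32_t region_sectors(uint32_t length)
{
    return (length + SECTOR_BYTES - 1u) / SECTOR_BYTES;
}

/* Walk the list until the terminator, calling the optional hook for
 * each region, and return the total number of sectors used. */
uint32_t walk_regions(const struct freeze_region *list,
                      void (*prep_hook)(uint8_t prep))
{
    uint32_t total = 0;
    for (; list->start != END_OF_LIST; list++) {
        if (prep_hook != NULL) {
            prep_hook(list->prep);
        }
        total += region_sectors(list->length);
    }
    return total;
}
```

With the lengths from the list, the 384KB RAM entry (.word $0000, .byte 6) decodes to $060000 bytes and occupies 768 sectors, while the $0290-byte SD card buffer entry rounds up to 2 sectors.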

There are four lots of the VIC-IV palette because the MEGA65 has four palette banks that can be dynamically selected, but which are all mapped to the same region of memory.  The freeze_prep_paletteN routines therefore make sure the correct bank is mapped before the area is saved or loaded. These routines are typically quite simple, e.g.:

    ; We do the same memory map setup during freeze and unfreeze
    ; X = 6, 8, 10 or 12
    ; Use this to pick which of the four palette banks
    ; is visible at $D100-$D3FF
    sbc #freeze_prep_palette0
    ora #$3f  ; keep displaying the default palette
    sta $d070
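
The snippet above is abbreviated, so here is one way the bank selection could be computed, written as host-runnable C.  Several things here are assumptions rather than facts from the source: that the prep identifiers step by two starting at 6 (as the "X = 6, 8, 10 or 12" comment suggests), and that the bank mapped into the palette window is selected by the top two bits of $D070.  The function name is hypothetical.

```c
#include <stdint.h>

/* Assumed value of freeze_prep_palette0, taken from the "X = 6, 8, 10
 * or 12" comment; the real constant may differ. */
#define PREP_PALETTE0 6u

/* Compute a value for $D070 that maps the palette bank implied by
 * `prep` while leaving the low six bits set, as the ORA #$3F does.
 * Assumes the mapped bank is selected by the top two bits. */
uint8_t palette_select_value(uint8_t prep)
{
    uint8_t bank = (uint8_t)((prep - PREP_PALETTE0) / 2u);  /* 0..3 */
    return (uint8_t)((bank << 6) | 0x3Fu);
}
```

Under those assumptions the four prep identifiers give $3F, $7F, $BF and $FF, so only the mapped bank changes between them.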

 Now if we turn our attention to the freeze menu, this basically consists of a normal program that can do whatever we want.  The current version just displays a simple set of options (most of which aren't yet implemented), and selects one of them based on key input.  Key input is done using the MEGA65's super-easy ASCII keyboard input abstraction layer, where you can basically just read $D610 to get the next key from the keyboard, with all modifiers like SHIFT and CONTROL already applied.  Function keys map to $F1 - $FE, making life super simple for menus.  Here is the important bit of freezer.c:

  // Flush input buffer
  while (PEEK(0xD610U)) POKE(0xD610U,0);
  // Main keyboard input loop
  while(1) {
    //    POKE(0xD020U,PEEK(0xD020U)+1);
    if (PEEK(0xD610U)) {
      // Process char
      switch(PEEK(0xD610U)) {
      case 0xf1: // F1 = backup
        // XXX not shown here
        break;
      case 0xf3: // F3 = resume
        // Load memory from freeze slot $0000, i.e., the temporary save space
        // This implicitly restarts the frozen program
        __asm__("LDX #<$0000");
        __asm__("LDY #>$0000");
        __asm__("LDA #$12");
        __asm__("STA $D642");
        __asm__("NOP");  // junk byte after the trap, see below
        break;
      case 0xf7: // F7 = show screen of frozen program
        // XXX for now just show we read the key
        break;
      }
      // Flush char from input buffer
      POKE(0xD610U,0);
    }
  }

The inline assembly in the F3 case makes a Hypervisor call asking for whatever currently lives in freeze slot 0 to be loaded back into memory.  This by definition will replace the freeze menu in memory, so there is nothing more to be done.  We have gone to quite some effort to make calling the Hypervisor really painless, which I think shows here:  All you have to do is prepare the register values for the call, where the accumulator usually indicates the sub-function of the Hypervisor call, and then write to the correct Hypervisor trap address between $D640-$D67F.  It doesn't matter what you write, or from which register, as the act of asking the CPU to write to these registers tells it you want to trap to the Hypervisor.

The Hypervisor automatically (in just one clock cycle!) saves all processor flags, registers and memory mapping settings, and switches to the Hypervisor memory context.  This makes Hypervisor calls very simple and efficient.  The only gotcha at the moment is the need to put a NOP or other single-byte junk instruction after the write that triggers the Hypervisor call.  This is to work around a bug where sometimes the PC value on exit from the Hypervisor call is incremented by one.
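
The key handling described above can also be tried out away from the hardware.  The following C sketch replaces the $D610 register with a hypothetical in-memory queue (struct key_queue and its helpers are stand-ins invented for this example), and maps the three function key codes used in freezer.c to menu actions.

```c
#include <stdint.h>

/* Actions offered by the menu in the freezer.c snippet. */
enum menu_action {
    ACTION_NONE,
    ACTION_BACKUP,       /* F1 */
    ACTION_RESUME,       /* F3 */
    ACTION_SHOW_SCREEN   /* F7 */
};

/* Map a key code, as read from $D610, to a menu action. */
enum menu_action action_for_key(uint8_t key)
{
    switch (key) {
    case 0xF1: return ACTION_BACKUP;
    case 0xF3: return ACTION_RESUME;
    case 0xF7: return ACTION_SHOW_SCREEN;
    default:   return ACTION_NONE;
    }
}

/* Hypothetical stand-in for the hardware key queue. */
struct key_queue {
    const uint8_t *keys;
    unsigned count;
    unsigned next;
};

/* Like PEEK(0xD610): the key at the head of the queue, or 0 if empty. */
uint8_t queue_peek(const struct key_queue *q)
{
    return (q->next < q->count) ? q->keys[q->next] : 0;
}

/* Like POKE(0xD610,0): discard the key at the head of the queue. */
void queue_pop(struct key_queue *q)
{
    if (q->next < q->count) q->next++;
}

/* Consume keys until one selects an action or the queue runs dry. */
enum menu_action next_action(struct key_queue *q)
{
    uint8_t key;
    while ((key = queue_peek(q)) != 0) {
        enum menu_action action = action_for_key(key);
        queue_pop(q);
        if (action != ACTION_NONE) return action;
    }
    return ACTION_NONE;
}
```

Keeping the key-to-action mapping in its own function means new menu entries only need another case label, and unknown keys are simply discarded.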

But enough theory already. We want pictures!

Here is the MEGA65 mid-freeze, with border colour action telling you something is happening:

After a couple of seconds, this is replaced with the freeze menu, which is currently rather spartan. You can probably tell I used to use an Action Replay as my preferred freeze cartridge ;) This program will get a thorough pimping as time goes on.

Finally, here is the view after resuming:

If you want to see it as moving pictures:

There are a few obvious things to point out here:

1. We can clearly trigger loading of the freeze menu program.
2. We can (at least partly) save and restore memory contents and IO registers, as shown by how we manage to restore the C65 BASIC boot screen on un-freeze, complete with switching back to 80 column mode, and restoring colour RAM (so that the bars are different colours etc.).
3. The palette is seriously messed up.  It turns out I have a bug in the DMAgic implementation when reading the palette, where it gets it one byte late.  It might be that we need to have an extra wait-state on reading the palette memory.
4. The frozen program doesn't actually resume after being unfrozen.  I'll have to look at the saved registers etc, and see why they aren't getting restored correctly. Actually, it looks like the unfreeze process never quite completes, but is instead stuck loading a sector from the SD card. I'll have to investigate that.

Anyway, that's where things are up to right now.  Hopefully it won't be too much longer before we can correctly unfreeze with the right colours, and with a running program after.