Monday, 14 April 2014

Beginning to rework the VIC-IV to support DMA delay and other things

Until now, the VIC-IV implementation has fetched the character and colour RAM bytes as it draws each character.  Also, it has basically been hard-wired to fetch characters from the ROM-based character set.

This has been fine for testing text mode things, and also for the new full-colour graphics mode I added, which is really just text mode on steroids, with characters composed of 64 8-bit colour values.

But now I want to get bitmap mode to work, so that at least technically simple graphics can work.  Similarly, it would be really nice for custom character sets to work.

To give an idea of what happens now, see the following screen shot of the crack intro to labyrinth.


My poor old phone camera has trouble capturing the motion, but you can see that the $D016 horizontal smooth scrolling is doing something.  In fact, it is working just fine on the live display.  However, as mentioned above, everything is just using the ROM-based character set, instead of showing what it should.

So the next step is to get it reading character and bitmap data so that things display correctly.

After that, the next job will probably be latching character and colour data for an entire character height, so that it more closely matches VIC-II/VIC-III behaviour.

Sunday, 13 April 2014

Simple disk chooser to pick D81 files from the SD card

Now that it is possible to read from D81 files on the SD card, I wanted a way to pick one of the images on the card, and make the floppy controller think I have inserted it into the internal drive.

So I wrote a little program that reuses much of the FAT32 code from the kickstart ROM, and allows you to say Y or N to each image in turn.  Once you say Y to an image, it makes it available for use.

I haven't debugged whether it sets the disk change line properly, but it does work, directing access to that disk.

I wrote it initially as a program to be loaded, but that is a bit annoying, so instead I include it in the Kickstart ROM, which then copies the program to $C000, so that it can be entered with SYS49152 from C64 mode.

Below you can see a screen shot of it in action:


Clearly this could do with some serious visual improvement, but the core functionality is there.  What is nice is that only the core SD card access routines need to be in this wedge, as it could load a larger program directly from the SD card (optionally after saving the memory it will occupy), to give a nice looking interface with support for DOS subdirectories and FAT32 long file names.

Friday, 11 April 2014

The internal drive now works (for reading)

I am continuing to work on the floppy controller.

I have been tantalisingly close to being able to access the SD card via the internal 1581 DOS for a while, but there always seems to be one more step to cover.

The latest progress has been getting the floppy controller to read sectors from the correct location in the disk images on the SD card.

I had worked out a while back that physical track numbers start at 0, and had taken that into account.

I assumed that sectors also started at zero, since there was no contradictory guidance in the C65 specifications.   However, directory headers were not being loaded correctly, indicating that something was amiss.  I suspected the SWAP line that swaps halves of the sector buffer around, but couldn't find the point when the directory header sector was actually being loaded.

I added extra debug features to the serial monitor so that I could get the CPU to halt when writing to the floppy controller command register, but even that failed to reveal the reading of a sector 0. At this point I began to suspect that sectors might be numbered from 1, not zero.

Looking through the C65 ROM disassembly I managed to find the routine that prepares to load a sector from disk.  After scratching my head over it for a while, I was able to confirm that sectors do in fact start from 1, not zero, and made the change to the FPGA.

This is the routine:

9A34  BD B4 1F   LDA $1FB4,X ; Read logical track number (1 - 80)
9A37  3A         DEC         ; Decrement by one to obtain
                             ;   physical track number (0 to 79)
9A38  8D 84 D0   STA $D084   ; Store physical track number in
                             ;   FDC register
9A3B  BD B5 1F   LDA $1FB5,X ; Read logical sector number 
                             ; (0 - 39)
9A3E  C9 14      CMP #$14    ; Set carry flag if sector number is >=20
9A40  A9 00      LDA #$00
9A42  2A         ROL A       ; Set A to 1 if sector number >=20,
                             ;   or 0 if sector number <20
9A43  8D 86 D0   STA $D086   ; Store physical side number in FDC
                             ;   register
9A46  F0 02      BEQ $9A4A   ; Take branch if sector is on
                             ;   side 0 (A will be $00)
9A48  A9 14      LDA #$14    ; Load A with 20, which will be
                             ;   subtracted from logical sector
                             ;   number
9A4A  42         NEG         ; Calculate 2's complement of A,
                             ;   so will be either 0 (front
                             ;   sector) or -20 (rear sector)
9A4B  18         CLC         ; With C flag clear, A is either $00
                             ;   if sector is on front side, or
                             ;   ($00-20) if on back side
9A4C  7D B5 1F   ADC $1FB5,X ; Now add logical sector number.
                             ;   Result will be 0 for sector 0,
                             ;   through to 19 for sector 39.
9A4F  4A         LSR A       ; Shift sector number right one bit
                             ;   since physical sectors are
                             ;   512 bytes, not 256.  Sector
                             ;   number will now be 0 through 9
9A50  1A         INC         ; Add one to sector number, since
                             ;   physical sectors apparently
                             ;   start at 1, not 0. Result will
                             ;   be between 1 and 10.
9A51  8D 85 D0   STA $D085   ; Store sector number into FDC
                             ;   register
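
For anyone who finds the arithmetic easier to follow outside of assembly, here is a minimal Python sketch of the same mapping. It is not ROM or FPGA code. The function name and the `upper_half` result are illustrative assumptions: the routine above does not store which half of the 512-byte sector is wanted, but the bit shifted out by LSR A is the odd/even bit of the logical sector, and the 7 April post notes that odd logical sectors live in the upper half.

```python
def logical_to_physical(track, sector):
    """Map a 1581 logical (track 1-80, sector 0-39) to FDC register values.

    Returns (physical_track, side, physical_sector, upper_half).
    Hypothetical helper that mirrors the ROM listing step by step.
    """
    if not 1 <= track <= 80 or not 0 <= sector <= 39:
        raise ValueError("track must be 1-80 and sector 0-39")
    physical_track = track - 1                   # DEC: 1-80 becomes 0-79
    side = 1 if sector >= 20 else 0              # CMP #$14 / ROL A
    within_side = sector - 20 * side             # NEG / CLC / ADC: 0-19
    physical_sector = (within_side >> 1) + 1     # LSR A / INC: 1-10
    upper_half = bool(within_side & 1)           # bit shifted out by LSR A
    return physical_track, side, physical_sector, upper_half
```

For example, `logical_to_physical(40, 3)` gives physical track 39, side 0, physical sector 2, upper half.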

Armed with this knowledge, I fixed the sector location calculation (and a few other bugs along the way).

The result is that the DOS now works for loading directories and programs.

First, here is the kickstart startup process where -- if you are quick -- you can see it mounting the .D81 file:

Now a couple of screenshots showing part of the (rather long) directory listing of the C64 emulator test suite that lives on the disk I mounted.  Even holding CONTROL down, the list flies by very quickly because of the speed that this machine is running.




Finally, in this video you can see me run most of the CPU tests from the C64 emulator test suite in about a minute -- the C65GS is already a very fast C64. The video wasn't appearing in the preview here for me, but hopefully it will appear when I publish this post.

Monday, 7 April 2014

Floppy controller emulation now partly works

Some more work on the floppy controller and kickstart ROM has yielded some nice results.

First, I modified the kickstart ROM so that it looks for a file called C65GS.D81 on the SD card.  If present, and if the file is the right length, and contiguous on the SD card, it loads the starting sector into special registers at $D68C-$D68F that the floppy controller uses as a base address for disk accesses.  It then sets some flags in $D68B to indicate that a disk is present and writable.


This is what the startup process looks like now.  In this case, after finding and mounting the 1581 disk image, we see the ROM has already been loaded, as it passes the checksum test, and so doesn't need reloading.  One day I will seriously pretty up the appearance of the kickstart ROM.

This made it much easier to debug the floppy controller, because it would boot up with a real disk image "in the drive".  In C65 mode this gives the yellow border because the disk image lacks a C65 auto run file. In C64 mode, it means I can do a good old LOAD"$",8 (well, LOAD"$ since it is the C65's C64-mode kernel), and see what I get.

As you can see, I can get some partly sensible output from it.  With a bit of fiddling, I can see part of a directory listing.  


The directory header, not shown here, is messed up, and any program I load seems to be messed up as well.

I think what is probably happening here is that the ROM expects the SWAP flag of the FDC to work, which switches the two halves of a physical 512-byte sector in the buffer. This is used when accessing an odd numbered logical 256-byte sector on a track, which really lives in the upper half of a 512-byte physical sector.

It seems likely that with the SWAP flag implemented, I will be able to LOAD and RUN programs from the internal drive.

This will make further testing easier, as I can have a D81 file with all the CPU and other tests in place, and able to run with a shift-RUN-STOP in C64 mode.

That in turn should help me track down whatever the bug is that is stopping C65-mode from interpreting commands typed into BASIC.

Sunday, 6 April 2014

Getting closer with the C65 DOS

The recent work on emulating the F011 floppy drive controller (FDC) is finally starting to come together.

Basically, this boils down to the C65 DOS ROM being able to use the FDC, without getting stuck in any of several infinite-loops that result if the FDC does not behave as expected.

So now, if the SD controller is told to provide the FDC with a disk image, and the C65 boots, the drive light flashes briefly, before loading of the boot file fails (because right now the FDC can't actually read sector contents, and the image I am providing is a placebo piece of SD card memory, not initialised with a 1581 disk image).

All this can be easily detected on a C65, because the border colour increments each time there is a disk error.  This was presumably a diagnostic feature that they intended to remove once they were happy with the ROM, but just as it was useful for them, so it is useful for me. So here you can see the C65 starting up with the border incremented from blue to yellow:


(Note that in 80-column mode on the C65GS there are no side borders by default, since 640x3 = 1,920 pixels. It would be possible to have side borders, but using this resolution they would be 320 pixels wide each, which would look silly.)

So I decided to push things a little further, by talking to the DOS using the C65 ML monitor for convenience.  The ML monitor can be entered at reset/power on by holding down run-stop during reset.

It was nice to see that I could ask the DOS for its status.  I then tried $ to get a directory, but the C65 monitor thought I wanted to convert a hex value to decimal. So then I tried to load a (non-existent) file into memory.  That produced an I/O Error #4, presumably because when it tried to read the disk header sector, it read all $FF bytes (which is about all the floppy controller can return at present), and so gave an illegal track or sector error.



But it is still very pleasing that it can do that much.

It shouldn't be too hard from here to get it actually reading sectors from the SD card, once I figure out the way the FDC buffer operates.

In preparation I will modify the kickstart ROM to look for C65GS.D81 on the SD card, and point the FDC to that by default. I will have to check that the file is contiguous on the SD card, since the FDC emulation assumes contiguous sectors.  I will also need to make sure that it can't write beyond the end of the disk image and corrupt things.
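
A rough sketch of what such a bounds check could look like, in Python rather than VHDL. It assumes the image is a plain linear dump of 80 tracks of 40 logical 256-byte sectors (819,200 bytes, or 1,600 SD card sectors of 512 bytes), uses the physical numbering worked out in the 11 April post (tracks 0-79, sides 0-1, sectors 1-10), and assumes the base is counted in 512-byte sectors. All names here are hypothetical, and this is not the actual FPGA logic.

```python
D81_SD_SECTORS = 80 * 2 * 10   # 1,600 sectors of 512 bytes = 819,200 bytes


def image_sector_index(physical_track, side, physical_sector):
    """Index of a physical sector within the image, or ValueError if outside."""
    if not (0 <= physical_track <= 79
            and side in (0, 1)
            and 1 <= physical_sector <= 10):
        raise ValueError("request falls outside the disk image")
    # 20 physical sectors per track: 10 on side 0, then 10 on side 1
    return physical_track * 20 + side * 10 + (physical_sector - 1)


def sd_sector_for(base_sector, physical_track, side, physical_sector):
    """SD card sector to access, given the image's (assumed) base sector."""
    index = image_sector_index(physical_track, side, physical_sector)
    assert index < D81_SD_SECTORS   # cannot run past the end of the image
    return base_sector + index
```

Rejecting the request before any address is formed means a bad track or sector value can never turn into a write outside the image.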

Saturday, 5 April 2014

Progress towards making the 1581 DOS work

Today I have been working on implementing enough of the F011 floppy controller to make the internal DOS work.

The idea is that it will point to a portion of the SD card storage corresponding with a 1581 image.

The floppy controller is reasonably well described in the C65 specifications manual, so a while back I set about implementing most of the registers.

The more recent work has focussed on tying those registers to the existing SD card controller.

First, I added some registers to select the starting sector on the SD card for the 1581 image, and some flags to tell the FDC whether it thinks it has a disk inserted, and whether it is write-protected.  I already knew that the C65 startup checks the disk inserted bit when it decides whether or not to try booting from the floppy drive.

I also implemented an on-screen-display facility for the drive LED, so that a square of pixels in the top left of the video display goes red if the FDC is active, or blinks red if an error has occurred.  This was fairly simple, because the FDC has dedicated LED control bits used by the C65 ROM to turn the drive LED on or make it blink, with the same indications as a 1541 disk drive.  This works differently to a real 1581, where the LED interpretation is different.

Here you can see the drive LED on-screen display as I cycle it through error indication, on solid for activity, and off a couple of times:



This got me to the point where I could lie to the C65 by making the FDC indicate a disk was present, and see what it did.

It turns out it went into an infinite loop.  Some digging revealed that it was trying to step the head to track 0, and the track0 indicator bit was not being set.  That's not surprising, since I hadn't implemented it.  Here is the relevant bit of code from the C65 ROM.  It uses LSR $D082 to push the TRACK0 flag into the processor's carry flag as an easy way to test the status of this flag:

9B85  A9 10      LDA #$10
9B87  4E 82 D0   LSR $D082                      ;  put bit 0 (TRACK0 flag) of $D082 into C flag
9B8A  B0 05      BCS $9B91
9B8C  20 91 9A   JSR $9A91                      ; Execute FDC command from A, and wait for completion.
9B8F  80 F6      BRA $9B87

9B91  A9 00      LDA #$00
9B93  9D 0F 01   STA $010F,X
9B96  18         CLC 
9B97  60         RTS 
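
To make the failure mode concrete, here is a toy Python model of that loop. It is only a sketch: the `FakeFDC` class, the assumption that the command in A steps the head one track towards track 0, and the `max_steps` guard are all illustrative. The ROM loop has no such guard, which is why it spins forever when the flag is never set.

```python
class FakeFDC:
    """Toy stand-in for the head-position part of the FDC (hypothetical)."""

    def __init__(self, track, track0_implemented=True):
        self.track = track
        self.track0_implemented = track0_implemented

    def track0_flag(self):
        # Models bit 0 of $D082, which LSR $D082 moves into the carry flag
        return self.track0_implemented and self.track == 0

    def step_out(self):
        # Assumed effect of the command issued via JSR $9A91
        if self.track > 0:
            self.track -= 1


def seek_track0(fdc, max_steps=200):
    """Step until TRACK0 is set. Returns steps taken, or None if it gave up."""
    for steps in range(max_steps + 1):
        if fdc.track0_flag():       # BCS $9B91
            return steps
        fdc.step_out()              # JSR $9A91, then BRA $9B87
    return None                     # the real ROM would still be looping here
```

With `track0_implemented=False` the function returns `None`, which corresponds to the hang described above.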

Once that was fixed, it got a bit further, but hit another infinite loop when a sector read succeeded, because the RDREQ bit was not being set to indicate that the sector was successfully read.  Here is the routine.  Basically the ROM checks if the RNF (sector not found) or RDREQ (sector found and read) flags are asserted, and loops otherwise:

; Read a sector?
975B  20 2C 9A   JSR $9A2C                      ; Set FDC track, sector and side register from $1FB4?
975E  B3 12 01   BCS $9872
9761  A9 40      LDA #$40                       ;  command byte: read sector from disk.
9763  8D 81 D0   STA $D081                      ;  FDC Command register
9766  AD 82 D0   LDA $D082                      ;  FDC Status byte 1
9769  29 10      AND #$10
976B  D0 28      BNE $9795                      ;  Take branch and abort if RNF (read not found) bit set
976D  AD 83 D0   LDA $D083                      ;  FDC Status byte 2
9770  10 F4      BPL $9766                      ;  Take branch and keep waiting if sector not yet found (RDREQ bit clear)
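
The same polling logic, as a small Python sketch. The bit masks are taken from the listing above (AND #$10 on status byte 1, and BPL testing bit 7 of status byte 2). The function name, the callables used to read the registers, and the `max_polls` limit are assumptions added for illustration, since the ROM has no timeout.

```python
RNF = 0x10      # bit tested by AND #$10 on status byte 1 ($D082)
RDREQ = 0x80    # bit 7 of status byte 2 ($D083), tested by BPL


def wait_for_sector(read_status1, read_status2, max_polls=1000):
    """Poll the two status bytes; return 'not found', 'read' or 'timeout'."""
    for _ in range(max_polls):
        if read_status1() & RNF:
            return "not found"      # BNE $9795
        if read_status2() & RDREQ:
            return "read"           # BPL not taken, falls through
    return "timeout"                # the ROM would loop here forever
```

If neither flag is ever asserted, every poll falls through both tests, which is the infinite loop described above.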

This is the sort of thing that comes up when specifications for a system are only mostly complete: integration testing is needed to discover and resolve the remaining ambiguities.

What is nice is that the drive light stays on on the screen, so that is working.

I have updated the FDC/SD card controller behaviour to attempt to fix this, by not clearing the RDREQ line after the sector has been read.  The FPGA code is rebuilding, so hopefully it will get a step further tomorrow.  Of course, I still haven't implemented actually reading out the bytes of the sector when they are read, so there will still be more work to do even if this fix is successful.

And that is as far as I have got today.

Thursday, 3 April 2014

C65-mode ML Monitor is working

After bashing my head against various CPU speed optimisations lately, with "working" and "faster" seeming to be mutually exclusive for the moment, I turned my attention to a bug I was seeing with the machine language monitor in the C65 ROM.

On the C65, holding down the Commodore key during reset or power on boots to C64 mode instead of C65 mode, similar to the C128.

Alternatively, if you hold down the run-stop key instead, it drops you into a machine language monitor, without clearing memory.  This is quite a nice feature.

However, when I had tried this with previous builds, the display of hexadecimal characters was all wrong.  There would be one hex digit, followed by a strange character.  A screen shot is below:



As you can see things are quite wrong.  The colour changes are because printing certain characters on Commodore 8 bit computers causes the drawing colour to change.

So I set about discovering where this problem was.

I knew the monitor sits in its own 8KB part of the ROM as a discrete program.  I also knew that it prints a ">" just before printing a hexadecimal value.  From experience, I also knew that the hex printing routine would likely have four LSR A instructions to shift the high nybble down. This, I figured, would help me to find the exact place in the code where things were going wrong.  I hoped to then use the serial monitor debug interface to step through the code and see where it goes AWOL.

It didn't take too much searching to find the routine in question.  Just looking at it, I was immediately sure that it was a hex printing routine.  Apart from the four LSR's, it had the tell-tale comparison with $0A to work out whether it is printing a digit or letter.  Here is the routine:

69EB  48         PHA 
69EC  63 07 00   BSR $69F5
69EF  AA         TAX 
69F0  68         PLA 
69F1  4A         LSR A
69F2  4A         LSR A
69F3  4A         LSR A
69F4  4A         LSR A
69F5  29 0F      AND #$0F
69F7  C9 0A      CMP #$0A
69F9  90 02      BCC $69FD
69FB  69 06      ADC #$06
69FD  69 30      ADC #$30
69FF  60         RTS 
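
For reference, this is what the routine computes when every instruction works, written as a short Python sketch. The function names are illustrative. The one subtle step is ADC #$06: it is only reached when CMP #$0A has set the carry flag, so it actually adds 7, which is the gap between "9" + 1 and "A" in the character set.

```python
def nibble_to_ascii(value):
    """Mirror the code from $69F5 onwards: low nybble of value to a hex digit."""
    a = value & 0x0F        # AND #$0F
    if a >= 0x0A:           # CMP #$0A sets carry when A >= $0A
        a += 0x06 + 1       # ADC #$06 with carry set adds 7
    return a + 0x30         # ADC #$30 (carry is clear on both paths)


def byte_to_hex_chars(value):
    """Return (high, low) digits, as the routine leaves them in A and X."""
    low = nibble_to_ascii(value)         # BSR $69F5, then TAX
    high = nibble_to_ascii(value >> 4)   # PLA, four LSR A, fall through
    return chr(high), chr(low)
```

Note how the routine calls its own tail for the low nybble and then falls through into the same code for the high nybble, so both digits depend on the BSR.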

After staring at it for a few minutes, I realised that the problem was that the BSR instruction was not doing anything.  

BSR is a cross between JSR and the branch instructions on the 6502.  Like JSR, it pushes the return address to the stack, so you can use RTS to resume execution, but like the branch instructions, it uses relative addressing.
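
As a worked example of the relative addressing, here is a small Python sketch that computes the target from the three instruction bytes. The base of "opcode address plus 2" is inferred from the two listings in these posts (BSR at $69EC reaching $69F5, and the BCS at $975E reaching $9872) rather than taken from a datasheet, and the function name is hypothetical.

```python
def word_relative_target(opcode_address, operand_lo, operand_hi):
    """Target of a 16-bit relative branch, given its address and operand bytes."""
    offset = operand_lo | (operand_hi << 8)
    if offset & 0x8000:              # treat the offset as signed
        offset -= 0x10000
    # Base of +2 is an inference from the listings in these posts
    return (opcode_address + 2 + offset) & 0xFFFF
```

Here `word_relative_target(0x69EC, 0x07, 0x00)` gives `0x69F5`, matching the disassembly.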

This struck me as odd, because I knew I had implemented that instruction.  However, I hadn't tested it, because I am still in the process of writing a comprehensive CPU test suite for the 4510.  After staring at the relevant VHDL code, I realised the logic error that was preventing the BSR instruction from ever being executed.  

BSR uses 16-bit relative addressing, and ahead of the check for the BSR instruction there was a case covering all relative addressing instructions, which decides whether or not to take the branch.  As a result, BSR was being treated as a conditional branch with an impossible condition.  A quick bit of rearrangement, so that BSR is handled first, and suddenly hex output was working, and the monitor now seems fully functional: 
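
The shape of that ordering bug can be shown with a deliberately tiny Python model. This is not the VHDL, and the structure is an assumption made purely for illustration: only two conditional branches are modelled, with only the carry flag, and the names are hypothetical.

```python
BSR = 0x63                      # opcode as it appears in the listing above
BRANCH_TESTS = {                # two standard 6502 conditional branches
    0x90: lambda carry: not carry,   # BCC
    0xB0: lambda carry: carry,       # BCS
}
RELATIVE_OPCODES = set(BRANCH_TESTS) | {BSR}


def dispatch_buggy(opcode, carry):
    """Generic relative-addressing case comes first, so BSR never gets through."""
    if opcode in RELATIVE_OPCODES:
        test = BRANCH_TESTS.get(opcode, lambda carry: False)  # BSR: no condition
        return "branch" if test(carry) else "skip"
    if opcode == BSR:           # unreachable: BSR was caught by the case above
        return "call"
    return "other"


def dispatch_fixed(opcode, carry):
    """Same checks, with BSR handled before the generic case."""
    if opcode == BSR:
        return "call"
    if opcode in RELATIVE_OPCODES:
        return "branch" if BRANCH_TESTS[opcode](carry) else "skip"
    return "other"
```

In the buggy ordering BSR always comes back as "skip", whatever the flags are, which matches an instruction that silently does nothing.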


I had hoped that this would also fix the problem with C65-mode BASIC ignoring all commands, but no such luck.  There must still be at least one more CPU bug in there.  All the more reason to get that test suite written.


Tuesday, 1 April 2014

Progress towards CPU speed-ups

The last few days I have been fighting against some subtle bugs introduced recently while trying to speed the processor up a bit.  The machine is almost working, but not booting properly for reasons I have yet to figure out.

Today at lunch Redback dropped by the lab to say hello and see the hardware in action.  However, I didn't have a working FPGA bitstream from the last good point.  

But after some poke and fiddle with the latest bitstream, it spontaneously booted to the C64 READY prompt after I wrote to a random piece of memory from the serial monitor interface.  Most bizarre.

Nonetheless, we seized the moment to run synthmark64, which was showing x18.69, I think because the read-modify-write (RMW) optimisation was running, even on IO addresses.  I need to fix that. So then I set the bit to enable the optimisation for hiding memory read wait states when possible.  That did work, and as the results below show, it provides a roughly 30% speed up.  

This makes the current C65GS prototype 24 times faster than a stock C64, and faster than all other known accelerators overall and for each instruction group, except for reading zero page, where the Chameleon is slightly faster.

Anyway, here is the screen shot from before things went bad when I tried to toggle the RMW optimisation.

I still hope to push the acceleration to closer to x50 in due course.