This problem also affects the freeze menu, which has to save ~512KB of data every time you freeze. Because SD cards are smart internally, they tend to ignore writes with unchanged data, but even then you pay the penalty of the ~64KB block read inside the card. Thus freezing a program that had very different memory contents from the last frozen program could take a number of seconds. This was quite annoying when all you might want to do in the freeze menu is toggle between PAL and NTSC, for example.
Thus I figured the time had come to attack the root cause of these problems and implement multi-sector writes on the SD card. But first I had to figure out exactly how they work. This was not as easy as it sounded, as there is a lot of contradictory information out there.
The only certain thing was that CMD25 should be used instead of CMD24 to initiate a multi-block write. Some sources said you should also use a CMD18 to say how many blocks you are going to write, and others said you should use a CMD23 afterwards to flush the SD card's internal cache. Then everyone seemed to disagree on the actual mechanics of multi-sector writes.
What I figured out was this: you don't need CMD18 or CMD23, just CMD25. When writing the first sector, use the 0xFC data start token instead of the 0xFE used for single-sector writes. Then just keep writing extra sectors, each again starting with a 0xFC token, until done, after which you should write a 0xFD token (without a sector of data following it) to mark the end of the multi-sector write. If you try to write data after the 0xFD, then really bad/weird things happen that require you to physically power things off and on again.
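To make the sequence above concrete, here is a minimal C sketch that lays out only the bytes the host sends for a multi-sector write in SPI mode. It is an illustration, not the controller's actual code: the function names are made up for this example, the 0xFF placeholder CRC assumes the card is not checking CRCs, and the card's responses and busy periods between blocks (which a real driver must wait for) are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SD_BLOCK_SIZE    512u
#define CMD25            25u    /* WRITE_MULTIPLE_BLOCK */
#define TOKEN_MULTI_DATA 0xFCu  /* starts every block of a multi-block write */
#define TOKEN_STOP_TRAN  0xFDu  /* ends the write; no data follows it */
#define FAKE_CRC         0xFFu  /* assumption: card is not checking CRCs */

/* Host-to-card byte count: 6 command bytes, then per block
 * 1 token + 512 data bytes + 2 CRC bytes, then 1 stop token. */
size_t sd_multi_write_stream_len(size_t n_blocks)
{
    return 6u + n_blocks * (1u + SD_BLOCK_SIZE + 2u) + 1u;
}

/* Fill `out` with the bytes the host sends (hypothetical helper).
 * Returns the number of bytes written, or 0 if `out` is too small.
 * Card responses and busy-waits are deliberately not modelled. */
size_t sd_build_multi_write_stream(uint8_t *out, size_t out_cap,
                                   uint32_t addr,
                                   const uint8_t *data, size_t n_blocks)
{
    size_t pos = 0;
    size_t i;

    assert(n_blocks > 0);
    if (out_cap < sd_multi_write_stream_len(n_blocks)) {
        return 0;
    }

    /* CMD25 is sent once, at the start only. */
    out[pos++] = (uint8_t)(0x40u | CMD25);
    out[pos++] = (uint8_t)(addr >> 24);
    out[pos++] = (uint8_t)(addr >> 16);
    out[pos++] = (uint8_t)(addr >> 8);
    out[pos++] = (uint8_t)addr;
    out[pos++] = FAKE_CRC;

    /* Every block, including the first, starts with 0xFC. */
    for (i = 0; i < n_blocks; i++) {
        out[pos++] = TOKEN_MULTI_DATA;
        memcpy(&out[pos], &data[i * SD_BLOCK_SIZE], SD_BLOCK_SIZE);
        pos += SD_BLOCK_SIZE;
        out[pos++] = FAKE_CRC;  /* 16-bit data CRC, placeholder */
        out[pos++] = FAKE_CRC;
    }

    /* A lone 0xFD closes the write. Nothing may follow it. */
    out[pos++] = TOKEN_STOP_TRAN;
    return pos;
}
```

For two sectors this gives 6 + 2 × 515 + 1 = 1037 bytes, with the 0xFC tokens at offsets 6 and 521 and the 0xFD token as the very last byte.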
In the end, the main part of the change to the low-level SD card controller was quite small:
if write_multi = '0' then
+ -- Normal CMD24 write
txCmd_v := WRITE_BLK_CMD_C & addr_i & FAKE_CRC_C; -- Use address supplied by host.
- addr_v := unsigned(addr_i); -- Store address for multi-block operations.
+ state_v := START_TX; -- Go to this FSM subroutine to send the command ...
+ rtnState_v := WR_BLK; -- then go to this state to write the data block.
+ elsif write_multi = '1' and write_multi_first='1' then
+ -- First block of multi-block write
+ -- We should begin things with CMD25 instead of CMD24, but
+ -- then we don't need to repeat it for each extra block
+ txCmd_v := WRITE_MULTI_BLK_CMD_C & addr_i & FAKE_CRC_C; -- Use address supplied by host.
+ state_v := START_TX; -- Go to this FSM subroutine to send the command ...
+ rtnState_v := WR_BLK; -- then go to this state to write the data block.
+ else
+ -- We are in a multi-block write. So just go direct to WR_BLK
+ state_v := WR_BLK; -- No command to send, so go straight to writing the data block.
end if;
Essentially the change was just to write the correct command for starting a multi-block write, and to write subsequent blocks with the correct data token at the start, and without re-sending any CMD25 or anything else. I actually managed to get that part correct on almost the first go, once I had figured the mechanics out, which rather surprised me. But it took quite a bit more effort to get FDISK/FORMAT and the hypervisor freeze routines to work properly with it. Indeed there are still a few wrinkles to work out. For example, when copying the frozen program to a new freeze slot, trying to use multi-sector writes causes things to go haywire. So for now, I have just disabled that.
That said, the changes to the various programs are actually quite small as well. It's just the usual problem of writing assembly language and low-level C code that works reliably. Add in the fun of CC65 not telling you when you have run out of memory during compilation, and life remains "interesting". For example, here are the changes to the freeze routine:
jsr copy_sdcard_regs_to_scratch
; Save current SD card sector buffer contents
- jsr freeze_write_sector_and_wait
+ jsr freeze_write_first_sector_and_wait
; Save each region in the list
ldx #$00
@@ -31,6 +31,8 @@ freeze_next_region:
cmp #$ff
bne freeze_next_region
+ jsr freeze_end_multi_block_write
+
rts
jsr sd_wait_for_ready_reset_if_required
- ; Trigger the write
- lda #$03
+ ; Trigger the write (subsequent sector of multi-sector write)
+ lda #$05
sta $d680
jsr sd_wait_for_ready
@@ -168,6 +206,13 @@ freeze_write_sector_and_wait:
sec
rts
+freeze_end_multi_block_write:
+ jsr sd_wait_for_ready
+ lda #$06
+ sta $d680
+ jsr sd_wait_for_ready
+ rts
+
In short, use the new commands $04 (begin multi-sector write), $05 (write next sector of multi-sector write) and $06 (finish multi-sector write) when writing to $D680, the SD card control register. The changes to FDISK/FORMAT were similarly simple, again, just changing the bulk-erase function to use these new commands. While I was there, I also implemented a little SD card reading speed test, because I was curious how fast (or slow) the SD card was going. My 8GB class 4 card can read more than 700KB/second. One day I will implement the 4-wire interface, which will allow speeds 8x faster, but for now 700KB/sec is ample.
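As a rough sketch of how a caller might sequence these commands, the C function below produces the list of values to store to $D680 for a write of a given number of sectors. The function name is hypothetical and this is not the actual FDISK/FORMAT code. It also leaves out everything else a real routine has to do around each store: setting the sector address, filling the sector buffer, and waiting for the SD card to become ready, as the freeze routine above does with sd_wait_for_ready.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SD_CMD_MULTI_FIRST 0x04u  /* begin multi-sector write */
#define SD_CMD_MULTI_NEXT  0x05u  /* write next sector of multi-sector write */
#define SD_CMD_MULTI_END   0x06u  /* finish multi-sector write */

/* Fill `out` with the values to store to $D680, in order, for a
 * multi-sector write of n_sectors (hypothetical helper).
 * Returns the number of values, or 0 if n_sectors is 0 or `out`
 * is too small. Ready-waits and buffer loading are not modelled. */
size_t sd_multi_write_ctrl_sequence(size_t n_sectors,
                                    uint8_t *out, size_t out_cap)
{
    size_t pos = 0;
    size_t i;

    assert(out != NULL);
    if (n_sectors == 0 || out_cap < n_sectors + 1u) {
        return 0;
    }

    for (i = 0; i < n_sectors; i++) {
        /* Only the first sector uses $04; the rest use $05. */
        out[pos++] = (i == 0) ? SD_CMD_MULTI_FIRST : SD_CMD_MULTI_NEXT;
    }

    /* Exactly one $06 at the end, with no sector data after it. */
    out[pos++] = SD_CMD_MULTI_END;
    return pos;
}
```

Writing three sectors therefore means storing $04, $05, $05 and finally $06 to $D680.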
All up, the result is already very pleasing: Formatting an SD card is easily 10x faster, taking less than 1 minute for an 8GB card now, instead of 10 minutes or more. You can see how fast it is in this video:
Freezing is now also MUCH faster, reliably taking only about 1 second to get to the menu on my machine here. The biggest delay when freezing is now waiting for the monitor to resync if the frozen program was PAL (as the freeze menu is always 60Hz NTSC for better monitor compatibility). You can see how fast the freeze menu is in the videos below:
This time, we will have the English video first, and the German one below:
And in German: