Tuesday, 7 May 2019

Free and Open-Source Replacement ROMs for the C64



While this blog is usually about things for the MEGA65, this post is actually about something for stock standard C64s, and more the point, for emulators, and all re-creations: Free and open-source replacement ROMs, that can be used, modified and distributed by the general public, so that, for example, emulators can ship with fully legal ROMs, without having to be troubled by costs or legal complexities in terms of licensing.

But first, let's step back a bit, and look at the current situation.

The Commodore 64 as we all know uses three ROM parts: The KERNAL, BASIC and the character ROM.  These are all different sizes, but together make up the 20KB of total ROM that a C64 needs to operate.  Some of you will at this point be saying to yourselves, "no, the KERNAL and BASIC ROMs are the same size".  This is actually only a generalisation, because the KERNAL is actually only about 6.5KiB, and BASIC is about 9.5KiB, and uses the bottom 1.5KiB or so of the "KERNAL" ROM.

Anyway, this means that there are these three parts that have to be replaced in order to make a C64 or compatible computer come to life.

The character ROM I have already talked about. Basically it is highly doubtful that a copyright infringement suit could be bought against a user of the font. For a start, in countries like the USA, it simply isn't possible to copyright a bitmap font.  Then given the 8x8 size, there aren't many options for implementing most of the symbols, specially the line and block ones.  Add to that that the symbols have now been added to Unicode, and the long-standing lack of enforcement against distribution of any C64 ROMs, and it really looks like the character ROM isn't a big drama.  Of course, we have also effectively solved this problem by making our own complete char ROM based on a combination of hand-drawn symbols and hand-touched characters from the public domain VGA 8x8 font.  It isn't perfect, but it works.   So we have the 4KB character ROM already under control.

Now, the KERNAL and BASIC are much more interesting beasts.  The KERNAL implements the screen editor, keyboard scanning logic and IEC serial communications protocol, along with a few other bits and pieces.  Then BASIC uses the KERNAL's APIs to provide the familiar BASIC interpretor, which itself has quite a lot of complexity, with the line tokeniser and de-tokeniser, expression parser, variable management, commands, functions and operators.

Also, to have even a minimally working system, that would let you load and run a game or other program that was written in assembly language, you still need the BASIC tokeniser, LOAD, RUN and SYS commands at a bare minimum, with LIST also being practically essential, so that you can actually see what is on a disk.

Then, like the character ROM, we have the problem of how to create new ROMs that are non-infringing on the intellectual property rights of the rights-holders of the C64 ROMs.  This requires considerable care and thought.

The gold-standard for such endeavours is to have one team produce detailed specifications of the software being recreated, and another team implementing it.  Fortunately, with books like Compute's Mapping the 64, we actually have the specification effectively written for us back in the 80s.

This means that we can potentially implement the KERNAL and BASIC ROM functionality using such resources as a guide, and here is the important part, without looking at the C64's ROMs while writing them.

There is a residual risk that because the C64 ROMs are everywhere, and anyone likely to be inclined to write their own ROMs will have been exposed to them, it is very hard to enforce a true "clean room" reimplementation.  However, I think that it is still possible, provided that sufficient care is taken.

Basically the challenge is to have a development process that is transparent and makes it unambiguously obvious to any observer, that no infringement is being made of the original ROMs, and that all code being written is being freshly produced.  Here in many ways our audience is the rights holders to the original ROMs -- we want to make their job of assessing whether we are infringing their rights or not super easy.  We don't want anyone having to waste time and effort on lawyers that will only make everyone poor and sad. Thus it makes sense to take an approach that integrates an "abundance of caution" at every stage, so that all mess can be avoided.  This will hopefully also be clear from the outset, since the whole point of this project is to respect the intellectual property rights of the copyright owners of the C64 ROMs. That is, if we didn't care about their rights, we would just use the original C64 ROMs that are available for free download all over the internet like everyone else.

So, back to planning a process, here is the general process that we have come up with:
  1. Begin with the immutable starting point of the 6502 reset entries, IRQ entry and NMI entries, and the rest of the ROM being empty.  This starting point can have no copyright problems.
  2. Based on the public calling interface of the C64 KERNAL as documented in the C64 Programmer's Reference Guide, make stub routines for the jump table.
  3. All routines begin at the lowest address in the KERNAL, sorted by routine name.  Thus the order of the routines is deterministic, and not the result of any creative process.
  4. Implement publicly documented routines, using secondary sources, such as books about the C64, but without refering to the 64 ROM contents themselves. 
  5. Run test programs using the C64 KERNAL, and collect entry points into the ROMs.
  6. Where an entry point does not correspond to a public API of the KERNAL, research the function by searching for it in Google. Implement it according to those references.
  7. Where an entry point means that previously implemented routines have to be moved to make space at a specific address, move only those routines required to do so, to the next available address.
  8. Where understanding of the inner workings of a routine are required to replicate it, secondary sources, such as the "Mapping the C64" or "C64 Programmer's Reference Guide" should be used. When those do not provide the answer, internet searches based on the name of the routine should be done, and failing that, based on the routine's address if it has no well known name or insufficient material is turned up.  Reference to actual disassemblies of the ROMs is not to be made, to ensure that we have strong defences against any claim of copyright infringement.
A similar process should be followed for the BASIC ROM.

To help with this, I have created a framework that allows a ROM to be compiled from a collection of assembly files, which get linked together to produce the final ROM. This helps to compartmentalise the work, and with careful design of the framework, makes it very easy to move routines around and assign them fixed locations as the research of the secondary sources and the entry points are discovered from running programs and tracing their entry into the ROMs.

This framework turned out to be quite simple. I used the Ophis assembler, as I am already quite familiar with it, and it has a handy pair of pragmas that make it quite convenient to fix the location of a routine, .checkpc and .advance.  These can be used together to make sure that a routine will be located at an exact address, and will complain if there isn't enough space.  To help pack the routines into the free space around the routines, the framework implements a greedy packing algorithm that places the largest un-placed routine into each free space, until the free space is full.  There is room to improve this, for example by placing exactly the right sized routines into spaces, but that can wait until necessitated by the ROM filling up as we implement the last few features at the end.

The adage of "commit early and commit often" is especially true for this project, because we want the source control history to be strong evidence that we have developed each routine ourselves from scratch, and not copied from the C64 ROMs.  Thus commits when things are half-working and half-baked are especially important, as they document this implementation process.

We are also purposely using quite different algorithms and methods for some key parts of the system, so that there is even stronger evidence against infringement. So for example, the BASIC keyword list and tokeniser are implemented using a simple compression scheme for the BASIC keywords.  This not only saves a bit of space, it also means that the BASIC keyword list is not present in the ROM in the same format as the original (even though as a list of facts, it is not copyrightable), and the algorithm for searching for keywords in the compressed list is by necessity an entirely new work: There would be no point in deriving it from the C64 ROM's tokeniser.

Similarly, the keyboard scanner in the KERNAL is based on a publicly documented improved keyboard scanner, that supports multi-key roll-over and rejection of spurious joystick input. In this way, once again, we end up with a routine that has a demonstrably independent ancestory, and offers some nice improvements. We even expanded it slightly, so that the joystick can be used to move the cursor.

For the BASIC interpreter, we also decided to implement banking support from the outset, so that more than 38KiB would be available for BASIC.  The KERNAL LOAD routine was also improved to support loading files bigger than 202 blocks, without writing over the IO area.  Just like the improved keyboard scanner, the result is clearly a new and fresh implementation, and one that brings advantages along with it.

That is, our goal is not to create a 100% identical C64 ROM set, but rather a highly compatible and pleasant to use set of alternate ROMs for C64-compatible computers, and that are free for inclusion in emulators, FPGA-based computers and other projects that would like a C64-style environment, without the legal hazards that come from using the C64's own original ROMs.

So where are we up to?

Well, we have been sneakily working on this in the background for a few weeks now, as we wanted to hold-off until the project had clearly advanced to a point that proved its feasibility, and provided some minimal level of utility. As hinted at above, our idea of minimum utility is the ability to LOAD and RUN assembly-language based software in a manner that feels totally familiar and functional.

And this we have achieved. There are lots of things still missing, like expression parsing and almost all BASIC commands, and a surprising number of bits and pieces in both BASIC and the KERNAL that are not required by a reasonable range of software.  Also, things like RS232 and cassette support are very low on our priority list, as any real C64 has its original ROMs, and any emulator or FPGA-based C64-compatible computer worth its salt will have some kind of bulk storage on hand.

But this is perhaps best explained visually.  The following videos and images show the current progress we have achieved, and shows a number of old and new software titles that can already run using our ROMs.  Also, as a reminder, this is all running on a stock C64 (well, in VICE's C64 emulator). It does not need the MEGA65 in any way (although of course being able to include the ROMs in the MEGA65 is one of the many reasons for creating them).






The source code is at https://github.com/MEGA65/open-roms.

If you want to try the ROMs out yourself in your favourite emulator, you can get the files from here.

In many ways the hardest work is already done, to get this project off the ground, and get minimally functioning KERNAL and BASIC interpreter.  However, there is still much to do and much to be implemented.  We are thus looking for contributors who would be willing to help us implement the missing functionality and improve compatibility.

The next post in this series is here - reducing the attack surface for legal attacks.

8 comments:

  1. As usual, you did a great job.
    61000+ Bytes free :-D
    I like that.

    ReplyDelete
  2. any patents pertaining to the C64 will have expired years ago. there are no patent issues left.

    ReplyDelete
  3. sorry, I meant to say copyright. the only thing you'd likely have to worry about are trademarks, and those are easily changeable.

    ReplyDelete
    Replies
    1. Actually the copyrights can survive for more than 100 years in many countries, so we do need to be careful and respectful of them.

      Delete
  4. I'd love to see a BASIC 7.0 compatible ROM set for the 64- it'd cut down on the need for as much hand-coding of things in BASIC to get to the basic functionality of 7.0-level features.

    For that matter, a GEOS-compatible interface would be astounding.

    ReplyDelete
    Replies
    1. Both would need a lot more than 16KB to fit it in. This isn't to say that it can't be done, or wouldn't be an interesting project, but rather that quite a bit of expansion ROM would be required, i.e., a cartridge or internal ROM switcher type thing.

      Delete
  5. What tools would a seasoned software engineer (one who knows C, SmallTalk, Perl, Java, and other sundry languages) need to have/know, in order to help with this effort?

    ...and what're the odds of wedging in a small DOS wedge into this ROM?

    ReplyDelete
    Replies
    1. Howdy, we'd LOVE to have some help on this project. Basically you need to be able to do at least one of: (a) write 6502 code; (b) read disassembled 6502 code for marking up similarities with the original ROMs; (c) write test cases (could be in C using CC65, or in 6502 assembly, as well as in BASIC for that part of things); (d) run regular regression tests; (e) herd other volunteers to help with such things; (f) other things I am sure to have forgotten, like writing documentation ;)

      As for wedging in a small DOS wedge, I don't see why not. One of the great things of the approach that I have taken with building the code from lots of little files, is it is much easier to swap out one or more for ones that have extra features, such as a DOS wedge. In fact, the biggest challenge to writing a DOS wedge for our ROMs is that we don't have a CHRGET routine in zero page in which to hook (although this could be changed). Thus a bit of creativity will be required to implement it, but with the full source to the ROM obviously available, it won't be hard to hook it directly in where it should go. I'm happy to provide pointers on how to do this, if someone is willing to do the work.

      Delete