Friday, 17 January 2020

Adventures in talking to the QSPI flash

I am getting closer to being able to communicate with the QSPI flash, so that we can have the MEGA65 update its own bitstreams in the field. To recap the current situation:

1. Most of the signals on the QSPI flash are easy to connect to, except the clock, which is normally driven by the FPGA's configuration logic.

2. The FPGA has a facility, the STARTUPE2 component, that allows the running bitstream to take control of this signal.

3. I have managed to achieve (2) in a test bitstream, as confirmed by my new JTAG boundary scan setup.

4. But I haven't got it working for a real bitstream.

To get to this point, from the last blog post, I discovered that the STARTUPE2 component *must* be in the top level of a design.

The question is now why in the real bitstream, it still isn't working, even though I have moved it to the top level.

Basically it works in the pixeltext test target, which lacks an M65 computer, but not in the nexys4ddr-widget target. Weirder still, when I removed the M65 computer component from this second target, it still wasn't working.

This makes me suspect that there might be some kind of target setup in the Vivado project that is to blame. There is a "persist" flag that can be used, which causes the configuration clock to remain active on the QSPI clock pin.  That could be the problem -- but then I would still be expecting to see the line waggle, which it doesn't seem to.

However, digging further, I did manage to control the line with the M65 computer component taken out of the real bitstream.  I am now trying to put it back in, but with a dedicated 1Hz clock on the pin, so that I can eliminate internal problems in the plumbing of the line to the register I had it hooked up to.  Basically I can keep pushing the connection deeper down into the design, until it is in the component where I was controlling it.

Ok, so with the full machine core, and the 1Hz clock in the outer layer, I can control the clock line. Next step is from in the sdcardio.vhdl file where it gets connected to, to see if I can toggle it there under automatic control.  If that works, then I must have some subtle bug in the register plumbing. If not, then the plumbing problem must be between sdcardio.vhdl and the outer layer of the design. Either way, I will be able to considerably narrow down where the problem can be hiding.

So, the clock toggles, meaning the problem is probably in sdcardio.vhdl somewhere...

Okay.... So, this is one of those funny bug fixes that I really hate. It could well be that I have done something really stupid, but if so, I am ignorant of what it is.  But the solution was to create a 2nd register to control the QSPI clock at $D6CD.  With that implemented, magically $D6CC works to control the clock.  I've had this kind of problem before with VHDL, where possibly something is incorrectly optimising out the ability to write to some signal.  Anyway, it is solved for now.

Then I started trying to investigate things, and came to the rapid conclusion that my life would be so much nicer, if I could make my new JTAG boundary scanner produce industry-standard VCD files that I could view in gtkwave, to get a more effective understanding of what is going on.  So I did. It wasn't too hard, and now I can produce pretty pictures like this:

Which is helpfully showing me that I can waggle the clock line, and also control the CS (chip select) line, but that the data lines are seemingly not doing anything.  But I know from prior experimentation that I can indeed control these lines, so this is probably an example of me having an error in my test program.  But how nice it is to be able to determine that in just a few seconds :)
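
For anyone wanting to do something similar: VCD is a plain text format, so producing it is mostly a matter of printing a header and then only the value changes. The following is a minimal sketch in Python, not the actual monitor_load code (which is C). The function name, the 1 us timescale and the sample data are all made up for illustration.

```python
def write_vcd(samples, names, timescale="1 us"):
    """Return VCD text for a list of (time, {name: 0 or 1}) samples."""
    # VCD refers to each signal by a short printable ID code.
    ids = {name: chr(ord("!") + i) for i, name in enumerate(names)}
    out = ["$timescale %s $end" % timescale, "$scope module pins $end"]
    for name in names:
        out.append("$var wire 1 %s %s $end" % (ids[name], name))
    out += ["$upscope $end", "$enddefinitions $end"]
    last = {}
    for t, values in samples:
        changes = [(n, v) for n, v in values.items() if last.get(n) != v]
        if not changes:
            continue  # only value changes are recorded
        out.append("#%d" % t)
        for n, v in changes:
            out.append("%d%s" % (v, ids[n]))
            last[n] = v
    return "\n".join(out) + "\n"

# Made-up samples: the clock toggles, then chip select goes low.
samples = [
    (0, {"qspisck": 0, "qspicsn": 1}),
    (1, {"qspisck": 1, "qspicsn": 1}),
    (2, {"qspisck": 0, "qspicsn": 0}),
]
vcd_text = write_vcd(samples, ["qspisck", "qspicsn"])
print(vcd_text)
```

Writing vcd_text to a file gives something gtkwave should be able to open directly.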

Digging through this, I fixed the initial problem, but also found I had the SO and SI lines switched around from the way they should be, so that will need a resynthesis...  Well, then I wasn't so sure, so I made it so that the four data lines are open-collector with internal pull-ups in the FPGA. This means that the lines can be either driven low, or float high.  This means I can fiddle with which line is which etc, without having to resynthesise each time.
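
The open-collector idea can be modelled in a few lines. This is only a toy model in Python (the function name and arguments are invented for illustration), but it captures the rule: anybody pulling low wins, and otherwise the pull-up decides.

```python
def line_level(drivers_pulling_low, has_pull_up=True):
    """Resolve the level of an open-collector line.

    drivers_pulling_low: one boolean per device on the line, True if that
    device is actively pulling the line to ground.
    Returns 0, 1, or None if the line floats with nothing to define it.
    """
    if any(drivers_pulling_low):
        return 0  # any single device pulling low wins
    return 1 if has_pull_up else None

# FPGA releases the line, flash releases the line: the pull-up gives a 1.
print(line_level([False, False]))
# FPGA releases the line, but the flash pulls it low: we read a 0.
print(line_level([False, True]))
```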

However, I am seeing some quite weird things with the data lines when I look at the JTAG traces:

So let me explain what we have here.  Because I was seeing weird things, I made a test program that tries every possible value on the four data lines, CS and clock pins to the QSPI flash.  The open-collector operation means that the direction pins (the .ctl pins in the lower half) basically indicate what we *should* be seeing on the actual pins (in the top half).  This holds true for QspiDB[2], QspiDB[3], QspiCSn and the clock, but not for QspiDB[1] and QspiDB[0]: These two pins switch a short time later.  This would only make real sense, if the QSPI flash was pulling those lines down (remember, open-collector outputs "float" high, so any device connected to them can pull them down to ground), or there is something really fishy going on with the FPGA control of those pins.  I now need to try to solve this riddle.

Let's look first at FPGA control of the pins as a potential cause. As the other pins don't exhibit this strange behaviour, and the four DB pins are all controlled in an identical manner, I find it hard to believe that the problem is there.  That leaves the QSPI flash as the current primary suspect.

First stop: Check the schematics.  Nothing sinister here on the Nexys4DDR boards: the QSPI flash is directly connected to the FPGA, with only some external pull-up resistors, which can't cause this funny problem I am seeing.
So that suggests it is most likely just the way that I am communicating with the QSPI flash.

Poking around, it seems that DB0 only changes (or is only changeable) when  CS is high. This makes sense, as when CS is high, the QSPI flash is not active, and so shouldn't be trying to drive any lines. When it is low, then DB1 stays tied low.  This makes me 99% sure that DB1 is the line from the QSPI to the FPGA, and DB0 is the command line from the FPGA to the QSPI.

This means, in theory at least, that I should be able to talk to the QSPI flash, if I drive the correct waveform. However, so far at least, there are no signs of active response from the QSPI flash.  And looking at the trace, here we see this weird problem again: The DB0 signal stays low for one clock tick longer than it is being pulled low:

This is really weird. I can slow the clock down even more (it's currently less than 1kHz, anyway) to the point where it looks much better, but this feels altogether wrong: The FPGA can read out its bitstream from this QSPI interface at 66MHz, so ~660Hz should be absolutely no problem!  The 1.8KOhm pull-ups should be able to pull these lines high in <1 microsecond, but we are seeing rise (or delay) times of >1 millisecond -- a thousand times slower.
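
As a back-of-the-envelope check of that "<1 microsecond" expectation, here is the RC arithmetic in Python. The 1.8K value is the pull-up discussed above; the 50pF line capacitance is purely an assumed, fairly generous guess and not a measured value.

```python
R_PULLUP = 1.8e3   # ohms, the pull-up value discussed above
C_LINE = 50e-12    # farads: an ASSUMED guess, not a measured value

tau = R_PULLUP * C_LINE      # RC time constant in seconds
rise_10_90 = 2.2 * tau       # standard 10%-90% rise time of an RC circuit

print("time constant: %.0f ns" % (tau * 1e9))
print("10-90%% rise time: %.0f ns" % (rise_10_90 * 1e9))
```

Even with that pessimistic capacitance, the rise time comes out well under a microsecond, so a millisecond-scale delay cannot be explained by a working 1.8K pull-up.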

This bizarre delay occurs whether the QSPI flash is selected via the CS line, or not.  This would seem to suggest that it is not the QSPI flash to blame -- unless it is in some strange mode following the FPGA configuration process. 

Ok, looking again at the schematic, there are indeed 1.8K pull-ups on the DB2 and DB3 lines, but not on DB0 or DB1. This means that it is possible that running these lines open-collector might not be practicable. So I resynthesised with the ability to push those lines actively high, as well as pull them low, or tri-state them, as before.  Now by actively pushing them, they respond immediately, as expected. So now I can send a byte via the SPI interface, and it all looks right:


Of course, it still isn't working. But that could be because I just realised I am sending the bits least-significant-bit first, instead of most-significant-bit first. And indeed, that suddenly gets it responding to me!
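
To make the bit-order difference concrete, here is a small Python sketch of bit-bashing one byte (not the real test program; the function names are invented, and I am assuming the command being sent is the usual JEDEC RDID opcode, $9F):

```python
def spi_bits(byte, msb_first=True):
    """Return the 8 bits of a byte in the order they go out on the wire."""
    order = range(7, -1, -1) if msb_first else range(8)
    return [(byte >> i) & 1 for i in order]

def bitbang_byte(byte, msb_first=True):
    """Return the (clock, data) pin states for sending one byte."""
    states = []
    for bit in spi_bits(byte, msb_first):
        states.append((0, bit))  # set up the data bit while the clock is low
        states.append((1, bit))  # the flash samples it on the rising edge
    return states

RDID = 0x9F  # assumed command: JEDEC "read identification"
print(spi_bits(RDID))                   # MSB first, what the flash expects
print(spi_bits(RDID, msb_first=False))  # LSB first, the mistake described above
```

Sent the wrong way around, $9F arrives at the flash as $F9, which is a different command entirely, so silence from the chip is what you would expect.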

Now we're finally getting somewhere :)  Again, I am so glad I implemented this VCD logger and JTAG boundary scan stuff.

Of course I could have just figured out how to do it from in Vivado, but it's so much nicer to have a little light-weight and open-source tool.  Also, by having it integrated in monitor_load, I can do multiple things all in one quick action.  Here is how I run the test program, and then ask monitor_load to sample those pins -- all in one single command:

make src/tests/qspitest.prg && src/tools/monitor_load -F -4 -r src/tests/qspitest.prg -V log.vcd -J src/vhdl/nexys4ddr-widget.xdc,${HOME}/build/artix7/public/bsdl/xc7a100tl_csg324.bsd,qspisck,qspicsn,qspidb[3],qspidb[2],qspidb[1],qspidb[0]

Okay, so it's a bit of a long command, but that's what pressing the up arrow in a shell is all about, so you can just use it again and again, without having to re-type it.

When that command has logged the pins for long enough, I just hit control-C, and then launch gtkwave on the resulting log.vcd file, with a little tiny script that tells it to automatically show all signals:

gtkwave -S allsigs.tcl log.vcd

So the whole work-flow is now super easy and efficient.

But anyway, back to figuring out why the test program doesn't read the data from the SPI response correctly... It's currently reading all ones, i.e., not noticing when the DB1 line goes low. Adding a short delay fixes this. Not entirely sure why. But with that, I can finally read some useful things out of the chip, and display them:

QSPI FLASH MANUFACTURER = $01          
QSPI DEVICE ID = $2018                 
RDID BYTE COUNT = 77                   
SECTOR ARCHITECTURE IS 4KB PARAMETER SEC
TORS WITH 64KB SECTORS.                
PART FAMILY IS 8000                    
256/512 BYTE PROGRAM TYPICAL TIME IS 2^8
 MICROSECONDS.                         
ERASE TYPICAL TIME IS 2^8 MILLISECONDS.
 01 80 30 30 80 FF FF FF               
 FF FF FF FF 51 52 59 02               
 00 40 00 53 46 51 00 27               
 36 00 00 06 08 08 0F 02               
 02 03 03 18 02 01 08 00               
 02 1F 00 10 00 FD 00 00               
 01 FF FF FF FF FF FF FF               
 FF FF FF FF 50 52 49 31               
                                       
READY.                                 

I confirmed with the data sheet that these data are broadly sensible.  So the next step will be to extract all the relevant data out, e.g., the information I need to programme the device, and after that, to implement simple block read, erase and write functions... Which turned out to be remarkably painless, if rather boring internally.  The more exciting part will be in the next post, where I (hopefully) actually implement writing of bitstreams to the QSPI flash.
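
As an illustration of pulling the relevant numbers out, here is a Python sketch that decodes a few fields from the 64 bytes dumped above. It assumes the standard CFI layout, where the "QRY" signature sits at CFI offset $10 and the timing and size fields follow at fixed offsets; the offsets used here should be checked against the data sheet before being relied on.

```python
# The 64 bytes shown in the dump above.
dump = bytes.fromhex(
    "01 80 30 30 80 FF FF FF "
    "FF FF FF FF 51 52 59 02 "
    "00 40 00 53 46 51 00 27 "
    "36 00 00 06 08 08 0F 02 "
    "02 03 03 18 02 01 08 00 "
    "02 1F 00 10 00 FD 00 00 "
    "01 FF FF FF FF FF FF FF "
    "FF FF FF FF 50 52 49 31 "
)

qry = dump.find(b"QRY")  # CFI query signature, expected at CFI offset 0x10
assert qry >= 0

def cfi(offset):
    """Fetch a byte by its CFI offset, located relative to the QRY signature."""
    return dump[qry + (offset - 0x10)]

page_program_us = 2 ** cfi(0x20)  # typical page program time, microseconds
sector_erase_ms = 2 ** cfi(0x21)  # typical sector erase time, milliseconds
device_bytes = 2 ** cfi(0x27)     # device size in bytes

print(page_program_us, sector_erase_ms, device_bytes // (1024 * 1024))
```

The two timing values agree with the 2^8 figures the test program printed, and the size field works out to 16MB, which fits a 128Mbit part.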

Thursday, 9 January 2020

Programming the Bitstream Boot Flash and all things JTAG

So, in the last post, I implemented the ability to tell the MEGA65 to switch to a different bitstream. The next challenge is to make it possible for the MEGA65 to be able to re-program the contents of the flash memory, so that we can supply people with an  updated bitstream, and make it super-easy to upgrade the MEGA65.

First piece of detective work was to realise that we can take a .bit file, remove the 120 byte header, and write it directly to the flash somewhere, and it should Just Work (tm).
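
A sketch of that header-stripping step in Python follows. The 120 is the figure from above. The sanity check is my own addition: it looks for the Xilinx 7-series sync word (AA 99 55 66) near the start of what is left, since I would not want to assume the header is the same length for every .bit file. The function name and the fake input data are invented for illustration.

```python
SYNC_WORD = bytes([0xAA, 0x99, 0x55, 0x66])  # Xilinx 7-series sync word

def strip_bit_header(bit_data, header_len=120):
    """Drop the .bit file header, leaving the raw configuration data."""
    body = bit_data[header_len:]
    # Sanity check: the sync word should appear near the start of the body.
    if SYNC_WORD not in body[:256]:
        raise ValueError("no sync word found; is the header length right?")
    return body

# Synthetic stand-in for a real .bit file, just to exercise the function.
fake_bit = bytes(120) + b"\xff" * 32 + SYNC_WORD + bytes(16)
raw = strip_bit_header(fake_bit)
print(len(raw))
```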

So now I need to be able to talk to the SPI boot flash. This is a bit tricky, because the FPGA boot process controls the clock line to this device. Fortunately, there is a way to put this back under control of the VHDL:  Basically you use this slightly magic STARTUPE2 thing, and feed it a clock:

  STARTUPE2_inst: STARTUPE2
    generic map(PROG_USR=>"FALSE", --Activate program event security feature.
                                   --Requires encrypted bitstreams.
                SIM_CCLK_FREQ=>0.0 --Set the Configuration Clock Frequency(ns) for simulation.
    )
    port map(CFGCLK=>CFGCLK,--1-bit output: Configuration main clock output
             CFGMCLK=>CFGMCLK,--1-bit output: Configuration internal oscillator
                              --clock output
             EOS=>EOS,--1-bit output: Active high output signal indicating the
                      --End Of Startup.
             PREQ=>PREQ,--1-bit output: PROGRAM request to fabric output
             CLK=>CLK,--1-bit input: User start-up clock input
             GSR=>GSR,--1-bit input: Global Set/Reset input (GSR cannot be used
                      --for the port name)
             GTS=>GTS,--1-bit input: Global 3-state input (GTS cannot be used
                      --for the port name)
             KEYCLEARB=>KEYCLEARB,--1-bit input: Clear AES Decrypter Key input
                                  --from Battery-Backed RAM (BBRAM)
             PACK=>PACK,--1-bit input: PROGRAM acknowledge input
             USRCCLKO=>spi_clock,--1-bit input: User CCLK input
             USRCCLKTS=>USRCCLKTS,--1-bit input: User CCLK 3-state enable input
             USRDONEO=>USRDONEO,--1-bit input: User DONE pin output control
             USRDONETS=>USRDONETS--1-bit input: User DONE 3-state enable output
             );

The important bits here are the USRCCLK and USRDONE signals.  Basically the first pair of signals let us control the clock to the SPI flash, while the second lets us control the DONE signal, which the FPGA normally outputs high when it is configured.  We just have to keep that one behaving normally, since the MAX10 FPGA depends on it.


When I first attempted to implement this, the system failed to come up.  After a lot of poking around and inadequate documentation from Xilinx, I found this project, that actually showed a working instantiation. From there it wasn't long, before I at least had a working bitstream.

It's actually likely to be helpful for the rest of this part, as well, because it actually does everything that I want: i.e., it allows programming of a connected QSPI flash memory. I'm glad to have finally found some source code that I can look at when I get stuck, to see how others have solved the same problems.

So in theory at this point, I have a bitstream with working ICAPE2 for bitstream switching, and now, a bit-bashing interface that *should* allow me to talk to the QSPI flash.  So I started writing a little test program for that, that basically tries to read some device information from the QSPI chip.

So, not entirely surprisingly, the test program doesn't work, in that it doesn't return the device ID.  If the pins for the QSPI flash chip were exposed on the PCB, I'd be able to stick my oscilloscope on them, and waggle them in software, and make sure that everything is correct.  However, as both the FPGA and QSPI flash are BGA parts with no exposed pins, there is no such possibility.

It should be possible, however, to use JTAG debugging tools to read the pin status of every pin on the FPGA.  The trick is how to do this easily from command line on linux.

The UrJTAG package provides the jtag command that *should* be able to do this.  After some hunting for info, the following should work to detect a MEGA65 connected via the USB debug cable:

jtag> cable FT2232 vid=0x0403 pid=0x6010

Then the detect command should show something connected, like this:

jtag> detect
IR length: 6
Chain length: 1
Device Id: 00010011011000110001000010010011 (0x13631093)
  Manufacturer: Xilinx (0x093)
  Unknown part! (0011011000110001) (/usr/share/urjtag/xilinx/PARTS)

That's looking good, except that the Artix7 FPGA is not in the part list.

There is, however, a newer version of UrJTAG, that has been patched to support the Artix7 series, and even has a boundary scan file for at least one version of the chip -- that should allow us to map the JTAG output to actual pins, which will be very helpful for us. Unfortunately, the pre-built package for Ubuntu lacks this, so I need to build it from scratch.

Building UrJTAG is proving interesting, because it needs ftd2xx.h, which I can't figure out which package on Linux provides. It looks like it might come from here: https://www.ftdichip.com/Drivers/D2XX.htm. You have to copy the include files from the release/ directory into the build directory for UrJTAG, and then it seems to build.

So, building UrJTAG is a bit of a pain. The "make install" script basically doesn't work, so you have to do all that yourself.  With the new jtag binary, I now get this:

jtag> cable FT2232 vid=0x0403 pid=0x6010
Connected to libftd2xx driver.
jtag> detect
IR length: 6
Chain length: 1
Device Id: 00010011011000110001000010010011 (0x13631093)
  Manufacturer: Xilinx (0x093)
  Part(0):      xc7a100t (0x3631)
error: Unable to open file '/usr/local/share/urjtag/xilinx/xc7a100t/STEPPINGS'
  Unknown stepping! (0001) (/usr/local/share/urjtag/xilinx/xc7a100t/STEPPINGS)

So, that's a step forward, but I have no idea yet where to get this STEPPINGS file from, or if it really is necessary. Ah, that was also just a problem with the install script not working. After manually copying the data/ directory's contents into /usr/local/share/urjtag, it works:

jtag> detect
IR length: 6
Chain length: 1
Device Id: 00010011011000110001000010010011 (0x13631093)
  Manufacturer: Xilinx (0x093)
  Part(0):      xc7a100t (0x3631)
  Stepping:     1
  Filename:     /usr/local/share/urjtag/xilinx/xc7a100t/xc7a100t-csg324

This is all very nice, except that it thinks it is the 324 pin part, not the 484 pin part that is actually in the MEGA65 R2 PCB.  It seems that UrJTAG might not support multiple variants of the same part, which is a bit annoying.

The first step, though, is to find the information required to actually even make the file. This seems to be available behind the license-wall at: https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/device-models/bsdl-models/artix-series-fpgas.html.   Using my account there, I downloaded the zip archive of BSDL files, and it seems that they are indeed the source material that I need. The PIN_MAP_STRING in each file seems to be the reverse-order of what appears in the UrJTAG file.  The syntax of BSDL is a bit weird, being VHDL derived, so where there are multiple pins defined on a single line, I'll have to work out how to parse those.

It turns out that UrJTAG has a parser utility for doing this:

bsdl2jtag xc7a100t_csg324.{bsd,jtag}
error: -E- error: In Package STD_1149_6_2003, Line 375, Error in User-Defined Package declarations.
error: -E- error: BSDL file 'xc7a100t_csg324.bsd' contains errors in VHDL stage, stopping
error: system error: Success Cannot open file STD_1149_6_2003 or /usr/local/share/urjtag/bsdl/STD_1149_6_2003

But as we can see, it is missing some files.  I suspect the install target of the Makefile might again be the problem here. Nope, apparently it just doesn't support STD_1149_6_2003. But someone has implemented the missing file. Unfortunately, it gives some error about user defined packages.  Someone else just took to modifying the BSDL files to remove the need for STD_1149_6_2003.  I might try that next.

Meanwhile, as I am out and about this morning, I took a Nexys4DDR board with me, which does have the exact chip that UrJTAG already supports, since I figured that should "just work", and I should be able to poke around with it while waiting for appointments.  Well, I don't get the error I described above, but I do instead get:

jtag> cable FT2232 vid=0x0403 pid=0x6010
Connected to libftd2xx driver.
jtag> detect
warning: TDO seems to be stuck at 1

What I don't know, is whether this is further along or not as far along as the other. I am guessing it is not as far along, since if the JTAG bus is stuck, it won't enumerate, and indeed, we are seeing a lack of enumeration. Fortunately I am not the only person with this problem.  Let's try some of their proposed solutions...

Unfortunately none of the suggestions on that page work. I'd suspect that my FPGA board is broken, except the fpgajtag command I use to send bitstreams to the FPGA via JTAG works perfectly.  So the JTAG interface *does* work, and my computer *can* communicate with it.  Most frustrating.

I also took a look at OpenOCD, an open-source JTAG tool for Linux etc.  This is an excellent project in many ways, but was never designed with simple FPGA boundary scans in mind.  As a result, it still isn't in any way trivial to do them with it. I am sure if I invested enough time and energy I could figure out how to do it, but I really don't want to have to do that, if I don't have to.

I did take a quick look at the internals of the fpgajtag command, to see if I could easily adapt it.  It looks reasonably well-structured, but for someone who doesn't know that much about JTAG (although I am learning), it isn't immediately obvious what I would need to change.

So then I started looking at Vivado to see if the hardware manager in there can easily do a boundary scan.  I am sure it can, but even after a pile of Googling, I can't actually figure out how to do it.  There is a lot of talk about needing a debug bitstream or some debug core in the project.  This strikes me as incomplete information at best, since the JTAG interface on an FPGA, if not disabled, can ALWAYS do a boundary scan, if I understand things correctly.  Also, my workstation this morning doesn't have mains power, so I don't want to kill my battery before the kids' swimming lessons finish for the morning:



The best thing I have found so far is this: https://www.fpga4fun.com/JTAG4.html
While whatever the JTAG library is that the example source code was written for isn't immediately clear, it does show how to go through the process of performing the boundary scan at a low level.  It might thus be enough information, together with the fpgajtag source, to cook something up that can work.  I have found the Xilinx BSDL files for the FPGAs I care about already, so in theory, I have all the information I need.

It also gives me hope of being able to take control of pins on the FPGA, so that I can more quickly test and develop things like this QSPI interface, as I can potentially avoid having to synthesise every change, but instead be able to bit-bash over JTAG.  But of course, I have to succeed in actually getting SOMETHING to work, before I can get that excited.

Well, at least integrating fpgajtag into monitor_load was relatively easy: The only slightly tricky part was re-doing the command line interface parse stuff. But I do want to extend it a little further, so that the fpgajtag stuff which correctly works out which USB serial port to talk to, can also be used to automatically find the correct serial port for the normal monitor_load communications.  This was also not too hard, once I found out that I could map the /dev/ttyUSBx paths to the entries in /sys/bus/usb-serial/devices, and look at the destination of those symlinks to check that the USB bus and port match.
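
The symlink trick looks roughly like this in Python (the real code in monitor_load is C; the function names and the example path here are made up for illustration). It resolves each entry in /sys/bus/usb-serial/devices and picks the USB bus and port out of the resulting path.

```python
import os
import re

def usb_location(sysfs_path):
    """Extract (bus, port path) from a resolved usb-serial sysfs path."""
    # USB interface directories are named like "1-2.3:1.1":
    # bus 1, port path 2.3, configuration 1, interface 1.
    m = re.search(r"/(\d+)-([\d.]+):\d+\.\d+/", sysfs_path)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)

def find_serial_ports(bus, port, base="/sys/bus/usb-serial/devices"):
    """List the /dev/ttyUSBx devices sitting on the given USB bus and port."""
    found = []
    if not os.path.isdir(base):
        return found  # not Linux, or no USB serial adapters present
    for name in sorted(os.listdir(base)):
        target = os.path.realpath(os.path.join(base, name))
        if usb_location(target) == (bus, port):
            found.append("/dev/" + name)
    return found

# Hypothetical symlink target, for illustration only.
example = "/sys/devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2:1.1/ttyUSB1"
print(usb_location(example))
```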

So now, in theory, I have all necessary ingredients to adapt to be able to run a boundary scan from within monitor_load, so long as I can figure out how the  fpgajtag code does the JTAG communications.  But this is not proving as simple as I would like, as fpgajtag has what seems to be a quite clever mechanism for abstracting the low-level JTAG operations.

Unfortunately, there is little documentation in the source, and I am struggling to understand how to adapt it.  I'm pulling my hair out enough that I have logged an issue on the fpgajtag github repository asking for some help in understanding their code.  Within a few hours, I had received some pointers to documentation for the FTDI serial adapters, which gave me enough information, with quite a lot of trial and error, to work out how to control the JTAG interface.  This will also come in handy in the future, when we get to implementing updating the keyboard CPLD from the MEGA65 itself as well, as I will need to implement a JTAG interface for that.

Anyway, back to the point, I now seem to be able to read some JTAG boundary scan data from the FPGA.  It seems to be shifted by a few bits, and I don't yet capture it all, but I am able to see bits toggle as I flip the switches on a Nexys4 DDR board, and in roughly the right place in the boundary scan register.  I suspect the bit order of the bytes might be flipped, and that I need to ignore the first 6 or so bits, to make up for the bits of the boundary scan command itself being shifted out.  But the important thing is that I can now read boundary scan data.  The changes I made to the read_idcode() function to tell it to switch to boundary scan mode ended up being quite simple:

    ENTER_TMS_STATE('I');
    ENTER_TMS_STATE('S');
    write_bit(0, 0, 0xff, 0);     // Select first device on bus
    write_bit(0, 5, IRREG_SAMPLE, 0);     // Send SAMPLE command (instead of IDCODE)
    ENTER_TMS_STATE('I');


(Check out https://github.com/MEGA65/mega65-core/blob/unstable/src/tools/fpgajtag/boundary_scan.c if you would like to see it all together.)

This switches the JTAG interface from Reset to Idle, then to IR-Capture, sends the JTAG SAMPLE command so that it ends up in the IR register, and then returns to Idle state, ready for the usual logic to shift bits in and out.  The boundary data is then in the data shifted back in.  All quite simple, once I had worked it out!
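
For anyone else trying to get their head around the state changes: every JTAG state change is driven by the TMS pin, following the standard 16-state TAP state machine. The Python sketch below is not the fpgajtag mechanism, just the state table plus a search for the shortest TMS sequence between two states. I am assuming that the 'I' and 'S' arguments above correspond to Run-Test/Idle and the instruction register shift state; I have not confirmed that against the fpgajtag source.

```python
from collections import deque

# state: (next state if TMS=0, next state if TMS=1)
TAP = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}

def tms_path(start, goal):
    """Shortest TMS bit sequence that moves the TAP from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for tms in (0, 1):
            nxt = TAP[state][tms]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [tms]))
    return None  # only reached if goal is not a valid state name

print(tms_path("Test-Logic-Reset", "Run-Test/Idle"))
print(tms_path("Run-Test/Idle", "Shift-IR"))
print(tms_path("Shift-IR", "Run-Test/Idle"))
```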

With a bit more work, I have now implemented an amazingly quick and dirty scanner for both the XDC and BSDL file formats.  XDC files indicate the pins used by a project, while BSDL files have the information about the FPGA itself, importantly including the JTAG boundary scan information.  With these parsers, and a bit of glue, I can not only show the status of each FPGA pin, but also the name of the pin in the project.  While there is plenty of room to improve this, the result is already really nice.  Here is a little sample of the output on a Nexys4DDR board:

monitor_load -J src/vhdl/nexys4ddr-widget.xdc,${HOME}/build/artix7/public/bsdl/xc7a100tl_csg324.bsd
make: „src/tools/monitor_load“ ist bereits aktuell.
fpgajtag: Digilent:Digilent USB Device:210292645477; bcd:700; IDCODE:  3631093
Auto-detected serial port '/dev/ttyUSB1'
FPGA is assumed to be a XC7A100TL_CSG324, with 989 bits of boundary scan data.
bit#2 : CCLK_E9 (pin E9, signal {QspiSCK}) = 1
bit#3 : M0_P12 (pin P12, signal <unknown>) = 1
bit#4 : M1_P13 (pin P13, signal <unknown>) = 0
bit#5 : M2_P11 (pin P11, signal <unknown>) = 1
bit#6 : CFGBVS_P8 (pin P8, signal <unknown>) = 1
bit#10 : INIT_B_P7 (pin P7, signal <unknown>) = 1
bit#13 : DONE_P10 (pin P10, signal <unknown>) = 0
bit#53 : IO_U8 (pin U8, signal {sw[9]}) = 0
bit#56 : IO_T8 (pin T8, signal {sw[8]}) = 1

...

First, we have filtered out all the bits that are not marked "input" in the BSDL file, which dramatically shortens the list of output.

Second, we see the nice mapping of the BSDL bit names to FPGA pins and project signals.  sw[9] and sw[8] are two of the slide switches on the Nexys board, and I can happily twiddle those, and re-run the scan, and see the changing values.  So I am confident overall that it's working, and that I can finally go back to what I was trying to do at the beginning: Check whether I am correctly controlling the QSPI interface pins, in particular the CCLK pin.
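
To give a flavour of how little parsing is really needed, here is a quick and dirty Python sketch of the same idea (the real scanner lives in monitor_load and is written in C). The sample XDC and BSDL lines are my assumption of what typical entries look like, not copies from the real files, and real files contain many more variations than these regular expressions handle.

```python
import re

# ASSUMED sample lines, for illustration only.
XDC_SAMPLE = """
set_property -dict { PACKAGE_PIN U8 IOSTANDARD LVCMOS33 } [get_ports {sw[9]}]
set_property -dict { PACKAGE_PIN T8 IOSTANDARD LVCMOS33 } [get_ports {sw[8]}]
"""
BSDL_SAMPLE = """
"  53 (BC_2, IO_U8, input, X)," &
"  56 (BC_2, IO_T8, input, X)," &
"  57 (BC_2, IO_T8, output3, X, 58, 1, PULL0)," &
"""

def parse_xdc(text):
    """Map package pin -> signal name as written in get_ports."""
    pins = {}
    pattern = r"PACKAGE_PIN\s+(\w+).*?get_ports\s+(\{[^}]*\}|\S+?)\]"
    for m in re.finditer(pattern, text):
        pins[m.group(1)] = m.group(2)
    return pins

def parse_bsdl_inputs(text):
    """Map boundary scan bit number -> port name, for input cells only."""
    cells = {}
    pattern = r'"\s*(\d+)\s*\(\s*\w+\s*,\s*(\w+)\s*,\s*input\s*,'
    for m in re.finditer(pattern, text):
        cells[int(m.group(1))] = m.group(2)
    return cells

def describe(bits, xdc_text, bsdl_text):
    """Yield one report line per input cell, given {bit number: value}."""
    pins = parse_xdc(xdc_text)
    for bit, port in sorted(parse_bsdl_inputs(bsdl_text).items()):
        pin = port.rsplit("_", 1)[-1]  # "IO_U8" -> "U8"
        signal = pins.get(pin, "<unknown>")
        yield "bit#%d : %s (pin %s, signal %s) = %d" % (
            bit, port, pin, signal, bits[bit])

lines = list(describe({53: 0, 56: 1}, XDC_SAMPLE, BSDL_SAMPLE))
print("\n".join(lines))
```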

So let's actually fire up a bitstream, and see if we can control the pin... and indeed I have confirmed that everything except the pesky clock pin is controllable.  This is what I had most suspected would be the problem, but now I don't have to suspect -- I can inspect!  But solving that will have to wait for the next blog post.

Meanwhile, if you would like to support me, I've set up a ko-fi page at ko-fi.com/paulgs.

Wednesday, 1 January 2020

Running multiple bitstreams

The MEGA65 is based on an FPGA.  FPGAs are like a blank canvas that you load a hardware design into, with that design being typically stored in flash memory.  Generally you don't notice this, because the whole process of loading the design into the FPGA and starting it, takes only about 0.3 seconds.  This is why the MEGA65 can boot much faster than, say, a THE C64, which has to boot a Linux operating system and fire up an emulator.

It's one of the many advantages of FPGAs, if you have the time and sanity to spare to implement a retro-computer that way, instead of using software emulation.  But there is a potential down-side to this: With software emulation, it's really easy to change the program you are running. So, for example, emulator-based systems typically let you run not only C64, but also VIC-20, Amiga, Spectrum, Apple ][ and a whole pile of other systems.

So how can we have a framework for "swapping programs" like this on an FPGA?  Fortunately, this is a question that lots of big-spending customers of FPGAs asked a very long time ago, and so Xilinx and the other major vendors all have various ways of doing this.  In this blog post, I will document my learning process, as I explore the Xilinx documentation, to work out how to do this on the MEGA65, so that we can potentially have different machine cores down the track, but also, so that we can more easily have updates for the MEGA65's main core, without having the risk of bricking the machine if an update fails part way through.

So the starting point is Xilinx's documentation for configuring their FPGAs. Configuration is Xilinx's name for "loading the design into an FPGA and setting it running".  You can just think of it as being like loading a programme on a regular computer.  Anyway, Xilinx's documentation lives here.  We're particularly interested in Chapter 7 "Reconfiguration and Multiboot", since what Xilinx calls "multiboot" is exactly what we want.

Xilinx's Multiboot facility basically allows one bitstream (the FPGA program) to indicate where the FPGA should look in the flash memory for a different FPGA program, and then tell the FPGA to pretend it has just been turned on, so that it will load the new bitstream instead.  This means two lots of the approximately 0.3 seconds of boot time, if you want to have the first bitstream load the second one.  Actually, it can be a bit quicker, if the first bitstream, which Xilinx calls the "Golden Bitstream," is a really simple design, and thus will compress well.

My current thinking is that our Golden Bitstream will just be a known-working release of the normal MEGA65 core.  At least to begin with.  What I'm thinking of doing, is adding the necessary extra bits to the bitstream to allow the triggering of reconfiguration, together with a little bit of code in the Hypervisor, that checks if any of the number keys from 1 to 9 are being held down. If one of them is, then it will calculate an address in the flash memory based on the number pressed, and then trigger reconfiguration.  This will allow the use of the standard MEGA65 core, as well as up to 9 other cores, subject to them all fitting in the flash memory.

We also want to be able to support having updates to the MEGA65 core itself, which I am currently thinking will be implemented by having the Hypervisor try to load an updated bitstream from a specific part of the flash memory, if none of those number keys are pressed.  If 0 is held down, then I will have it not do this, so if you need to "downgrade", this will be possible. For example, if some bitstream update doesn't work for some particular reason.
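
The decision logic I have in mind can be sketched like this, in Python rather than Hypervisor code. The slot size and the location of the update slot are purely hypothetical placeholders, since none of the flash layout has been decided yet.

```python
SLOT_SIZE = 4 * 1024 * 1024      # HYPOTHETICAL: flash space reserved per core
UPDATE_ADDRESS = 10 * SLOT_SIZE  # HYPOTHETICAL: where an updated core lives

def boot_choice(key):
    """Decide what to boot, given the number key held at power-on.

    key is a one-character string '0'..'9', or None if no number key is held.
    Returns (kind, flash address); the address is None for the golden core.
    """
    if key is None:
        return ("update", UPDATE_ADDRESS)      # try the updated MEGA65 core
    if key == "0":
        return ("golden", None)                # stay in the Golden Bitstream
    if len(key) == 1 and key in "123456789":
        return ("core", int(key) * SLOT_SIZE)  # alternate core 1..9
    raise ValueError("not a number key: %r" % (key,))

print(boot_choice("3"))
print(boot_choice("0"))
print(boot_choice(None))
```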

The Xilinx FPGAs are also capable of a nice trick: If you try to load a bitstream from somewhere else in the flash memory and it fails, the FPGA will reload the Golden Bitstream, but this time with a special flag set to say that it has fallen back to the Golden Bitstream. That way, we can even have the MEGA65 display some kind of message on first boot, if the updated bitstream doesn't work for whatever reason.

All up, this should give us a good basis on which to build a nice update mechanism for the bitstream on the MEGA65.  All I need to do now, is actually extract the information I need from Xilinx's documentation, and then actually implement it.  This could be the fun part, as this is a feature that is notoriously under-documented...

First step: Find out how to instantiate the ICAPE2 thingy (Dingsbums for the Germans reading along), that allows access to the whole configuration system.  This seems to be available here on page 178.  What worries me, is that it looks to be a bit minimalistic:

Library UNISIM;
use UNISIM.vcomponents.all;
-- ICAPE2: Internal Configuration Access Port
-- 7Series
-- Xilinx HDL Libraries Guide, version 2012.2

ICAPE2_inst: ICAPE2 
generic map(
   DEVICE_ID => X"3651093",    -- Specifies the pre-programmed
                               -- Device ID value to be used for
                               -- simulation purposes.
   ICAP_WIDTH => "X32",        -- Specifies the input and output
                               -- data width.
   SIM_CFG_FILE_NAME => "NONE" -- Specifies the Raw Bitstream (RBT)
                               -- file to be parsed by the
                               -- simulation model.
)
port map (
   O => O,        -- 32-bit output : Configuration data output bus
   CLK => CLK,    -- 1-bit input   : Clock Input 
   CSIB => CSIB,  -- 1-bit input   : Active-Low ICAP Enable
   I => I,        -- 32-bit input  : Configuration data input bus
   RDWRB => RDWRB -- 1-bit input   : Read/Write Select input
); 
-- End of ICAPE2_inst instantiation

So now I need to figure out what each of those does.
The DEVICE_ID and SIM_CFG_FILE_NAME are apparently only used for simulation, so that the fake configuration register values can be read out, so we can ignore those, I think.

ICAP_WIDTH, O and I also seem to be pretty logical, defining the width of the input and output buses, and the buses themselves.  The fact that it allows the width to be varied makes it tempting to try to make the interface 8-bit, but I have a gut feeling that that would just Lead To Trouble.  But I'll have a think about it as I keep exploring.

So that just leaves CLK, which should be straight-forward, CSIB and RDWRB, which I am not yet totally sure about.

Reading page 148 of this suggests that we have to write a series of 32-bit values that are basically a pretend tiny bitstream.  This would explain why the interface has only Read/Write select and Chip Select (CS) signals to go with the data: We just have to write the correct series of values. It also suggests that the 8-bit interface mode might just work, too, which would be nice -- if I can get the byte order correct.

Xilinx's recommended set of values to send are:

FFFFFFFF - Dummy word
AA995566 - Sync word 
20000000 - Type 1 NOOP
30020001 - Type 1 write to WBSTAR
00000000 - Warm-boot start address
30008001 - Type 1 write words to CMD
0000000F - IPROG word
20000000 - Type 1 NOOP

Let's try to go through those to understand what is going on.
The dummy word probably doesn't require much explanation. The sync word, I think, helps the FPGA work out the bit/byte order / endian-ness. Might also work to help get 8-bit mode right.  We'll investigate that later.
Then we have some "Type 1 NOOP"s in there. Those we can generally ignore for now, as well.
Then we have the interesting part, where we write to the WBSTAR register.  This sets the upper bits of the flash memory address used to configure the FPGA from.  The lower 8 bits are undefined, so apparently the bitstream should be pre-padded with 256 FF bytes, to make sure.
Then we have the write of the IPROG word to the CMD register. This is apparently what tells the FPGA to reset and reconfigure, while keeping the just-set WBSTAR value.
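To make the command words a little less magic, here is a small Python sketch that pulls a Type 1 packet header apart. The field positions (header type in bits 31-29, opcode in bits 28-27, register address in bits 26-13, word count in bits 10-0) are my reading of Xilinx's 7 series configuration documentation, so treat them as an assumption to check against the guide. The function name is made up for this example.

```python
OPCODES = {0: "NOP", 1: "read", 2: "write"}

def decode_type1(word):
    """Split a 32-bit Type 1 packet header into (opcode, register, word_count)."""
    header_type = (word >> 29) & 0x7
    if header_type != 1:
        raise ValueError("not a Type 1 packet header")
    opcode = (word >> 27) & 0x3          # 0 = NOP, 1 = read, 2 = write
    register = (word >> 13) & 0x3FFF     # configuration register address
    word_count = word & 0x7FF            # number of data words that follow
    return OPCODES.get(opcode, "reserved"), register, word_count
```

With that layout, 30020001 decodes as a write of one word to register $10 (WBSTAR), and 30008001 as a write of one word to register $04 (CMD), which matches Xilinx's comments on the sequence.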

So, let's cook up a bit of VHDL that embeds one of these ICAPE2 thingies, and tries to tell it to load a bitstream from a particular place, and see if we can make it work.

Along the way, I also found that the bit order of each byte written to the ICAPE2 entity has to be reversed. I also found what claims to be a working implementation.
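As an illustration of what that bit reversal means, here is a Python sketch that reverses the bits within each byte of a 32-bit word while leaving the byte positions alone. The function names are hypothetical, and this only models the transformation; it is not taken from the implementation mentioned above.

```python
def reverse_bits_in_byte(value):
    """Mirror the 8 bits of a byte, so bit 0 becomes bit 7 and so on."""
    result = 0
    for i in range(8):
        if value & (1 << i):
            result |= 0x80 >> i
    return result

def bit_swap_word(word):
    """Reverse the bit order inside each of the four bytes of a 32-bit word."""
    swapped = 0
    for shift in (24, 16, 8, 0):
        byte = (word >> shift) & 0xFF
        swapped |= reverse_bits_in_byte(byte) << shift
    return swapped
```

Applying it twice gives back the original word, which makes for a handy sanity check in simulation.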

Then I discovered that on the Artix 7 FPGAs, you have to allow 3 cycles, so the write sequence ends up like this:

  signal bitstream_values : reg_value_pair := (
    x"FFFFFFFF", -- Dummy word
    x"FFFFFFFF", -- Dummy word
    x"FFFFFFFF", -- Dummy word
    x"FFFFFFFF", -- Dummy word
    x"FFFFFFFF", -- Dummy word
    x"AA995566", -- Sync word
    x"20000000", -- Type 1 NOOP
    x"20000000", -- Type 1 NOOP
    x"30020001", -- Type 1 write to WBSTAR
    x"00000000", -- Warm-boot start address
    x"20000000", -- Type 1 NOOP
    x"20000000", -- Type 1 NOOP
    x"30008001", -- Type 1 write words to CMD
    x"0000000F", -- IPROG word
    x"20000000", -- Type 1 NOOP
    x"20000000", -- Type 1 NOOP
    others => x"FFFFFFFF"
    );


I then dynamically change the contents of the entry for the Warm-boot start address via some memory mapped registers:

        bitstream_values(9) <= reconfigure_address;
        cs <= '1';
        rw <= '1';
        if trigger_reconfigure = '1' then
          counter <= 0;
        end if;


Asserting trigger_reconfigure sets the counter to the start of the command stream. The command words are then sent one after another, which triggers the reconfiguration.
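In software terms, the table plus the patched address entry amounts to something like the following Python model. It only mirrors the word order of the VHDL table above, with entry 9 (counting from 0) replaced by the requested address; the names are made up for this sketch, and nothing here models the ICAPE2 timing.

```python
COMMAND_WORDS = [
    0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF,  # dummy words
    0xAA995566,              # sync word
    0x20000000, 0x20000000,  # Type 1 NOOPs
    0x30020001,              # Type 1 write to WBSTAR
    0x00000000,              # warm-boot start address (patched below)
    0x20000000, 0x20000000,  # Type 1 NOOPs
    0x30008001,              # Type 1 write to CMD
    0x0000000F,              # IPROG word
    0x20000000, 0x20000000,  # Type 1 NOOPs
]
ADDRESS_INDEX = 9  # same entry the VHDL overwrites with reconfigure_address

def reconfigure_sequence(address):
    """Return the words to send, in order, to reconfigure from `address`."""
    words = list(COMMAND_WORDS)
    words[ADDRESS_INDEX] = address & 0xFFFFFFFF
    return words
```

Iterating over the returned list corresponds to the counter stepping through the table after trigger_reconfigure is asserted.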

Then it's just a case of memory-mapping access to those registers:

            when x"C8" =>
              -- @IO:GS $D6C8-B - Address of bitstream in boot flash for reconfiguration
              reconfigure_address(7 downto 0) <= fastio_wdata;
            when x"C9" =>
              reconfigure_address(15 downto 8) <= fastio_wdata;
            when x"CA" =>
              reconfigure_address(23 downto 16) <= fastio_wdata;
            when x"CB" =>
              reconfigure_address(31 downto 24) <= fastio_wdata;
            when x"CF" =>
              -- @IO:GS $D6CF - Write $42 to Trigger FPGA reconfiguration to switch to alternate bitstream.
              if fastio_wdata = x"42" then
                trigger_reconfigure <= '1';
              end if;


I got this all together, but then still had a problem: When I tested it, it wouldn't work.  I suspected that this was because I was loading the bitstream via JTAG, rather than from the SPI flash that contains the usual bitstream.  This means that the FPGA hasn't been set up for SPI flash configuration, and thus that trying to load a subsequent bitstream will fail.  To test this, I had to reflash the SPI flash to contain this new bitstream, and then try it from there... And it worked without problem!

To be a bit more specific: If you set $D6C8-$D6CB to the value $00000000, and then write $42 to $D6CF, it will reload itself, since it is at $00000000.  Thus it works like a kind of Very Hard Reset Indeed for the MEGA65.  Better, if you put some other value in there, where no valid bitstream exists, the FPGA has a watchdog timer that gets tripped when the FPGA fails to configure, and thus after a few seconds it falls back to the original bitstream at $00000000!  This means that if something goes wrong you get the "police lights" on the keyboard for a few seconds, before the machine boots normally.
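Spelled out as a list of register writes, the procedure looks like this. It is a Python sketch that only computes the (register, value) pairs; the function name is hypothetical, and actually performing the writes is left to whatever runs on the MEGA65.

```python
def reconfigure_writes(address):
    """List the (register, value) writes that request a reconfigure from `address`."""
    writes = []
    # $D6C8-$D6CB take the address, least significant byte first
    for i in range(4):
        writes.append((0xD6C8 + i, (address >> (8 * i)) & 0xFF))
    # Writing $42 to $D6CF triggers the reconfiguration
    writes.append((0xD6CF, 0x42))
    return writes
```

For address $00000000 this produces four writes of $00 followed by the $42 trigger, which is the Very Hard Reset case described above.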

Now, apparently there is a way to work out if this has happened, so that you can avoid an infinite loop of trying to start a broken bitstream, which I'll look into in due course.  Similarly, I need to work out how to write the extra bitstreams into the flash, so that we can actually use the multi-boot facility. Those who want to follow along, or see all the code, hop over to https://github.com/MEGA65/mega65-core/issues/153.

But for now, I think it's time to open those fireworks we bought for New Year's Eve / Silvester that are sitting in our MEGA65 Hack Session New Year's Kit* and celebrate!


* Limited stocks. Some items not available in some countries.  Contents may vary from above image. Contains small parts. Not suitable for children under 3 years of age or lactating rhinoceroses, except under medical supervision.

Wednesday, 11 December 2019

MEGA65 DevKit prototype r2

One week later, this is a new revision of the MEGA65 DevKit. The board was lowered to leave more room for cables and programmers, which required a backplane and side pieces. Now everything fits perfectly, even if you wish to keep the FPGA proggers attached forever. This time we also removed the top of the floppy drive for a very geeky look (final DevKits will be a bit more sleek, stay tuned):


Inside live the MEGA65 PCB r2 from Trenz and the wonderful MEGA65 keyboard from GMK.

Some of you have asked if the DevKit already is working so here is a little video of it playing Sam's Journey:


I have also compiled a little FAQ for possible other questions about the DevKit:

Q: What's a DevKit?
A: It's aimed at developers so they can develop software for the machine or even help shape the final product before it is released.

Q: Why does it look so different?
A: The case is made in a way that lets it be produced in small batches without the high costs of injection moulds. Its transparency helps you find out if the smoke stays inside the chips.

Q: I am a collector not a developer.
A: The DevKits have laser-engraved Logos and serial numbers to make them unique. DevKits usually are great collector's items.

Q: I only want to play with it!
A: The DevKit is like a "real" MEGA65, only in a preliminary form. You might encounter hiccups but you can always (soft-)update it.

Q: I do not like the floppy.
A: The DevKits come as (hence the name) kits which include a refurbished floppy drive. Feel free to leave that out and donate it to us.

Q: What will it cost?
A: DevKits are always more expensive than mass-produced machines, but they also get strong support from the makers. It will most probably cost under EUR 1.000 but more than the final machines.

Q: Can you build it for me?
A: It's really easy to build, usually under an hour, maybe two if you are clumsy. If you do not dare to build it yourself please ask around e.g. in Forum64. There are many nice people there!

Q: When can I buy it?
A: We are working towards summer 2020. However, we strongly rely on help by the community so anything can happen!

Q: I am a blessed developer and want to sacrifice all my time but I do not have any money!
A: Please talk to us about support!

Q: You said summer 2020. This means no final machines in 2020?
A: You do the math. It depends on donations and support of the community. We are only humans and not so many, less than 10 active. We do this in our free time with no interest in profit.

Q: What do you need most besides money?
A: Probably VHDL experts but anyone willing to help is warmly welcome!

Q: If I buy a DevKit, can I transfer PCB and keyboard into a MEGA65 case later? Can I even 3D-print my own MEGA65 case?
A: Probably yes!

Sunday, 8 December 2019

MEGA65 DevKit prototype assembly

Here are a bunch of pictures of a laser cut acrylic MEGA65 developer kit case with PCB R2 and the MEGA65 keyboard inside (plus a floppy drive). As you can see it is more compact than the original MEGA65 case fitting the floppy drive on the side and above the board. The kit was designed and cut by our partner plexi|laser based on a layout by the MEGA65 assembly team.






The DevKit is an option for people, especially developers, to get a MEGA65 before it is fully finished and be able to carve the machine together with us and to develop software for it before its release. The top shell is bent to give it a more slick look and to avoid hurting those precious developers' hands.



As it is the first version of the DevKit case, we found a couple of things that we want to improve, such as better and quicker access to the board for programming/flashing the board and keyboard, as well as access to the SD card slots. The new revised parts are already in the mail so expect an update soon.


The final version of the case will sport an engraved MEGA65 logo and also a prominent DevKit serial number to make it a great collector's item as well. This should shorten the dreadful waiting time for the final machine to be available significantly. Other possibilities to shorten this time can be found on MEGA65.org.


Monday, 25 November 2019

The long winding path to getting HDMI Audio working

But my notes assure me that it has taken me only 8 or 9 days to get HDMI audio working, even though it feels like a month!

Having spent a few days to get the HDMI video finally working recently, I was keen to get HDMI audio working as well, so that we could just cross the whole HDMI thing off the list of major outstanding tasks for the MEGA65.  It should have been easy, but turned out to be quite nasty for a variety of reasons, not the least of which was a little tiny typo... but more on that later.

The ADV7511 HDMI driver chip we are using supports audio as well as video, and of course we want to have the MEGA65 work with that, so that it is super-easy for people to plug a MEGA65 into their TV or monitor, and get sound as well as picture.  Having got the MEGA65 and MEGAphone prototypes working, I am beginning to get familiar with digital audio formats.

I was expecting to use the same I2S audio format that the MEGAphone uses for various of its audio paths.  However, the MEGA65 R2 PCB has only SPDIF audio hooked up. No problem. After a bit of hunting around, I found a nice SPDIF implementation in VHDL that I could use.  It didn't take too much hassle to get that plumbed in and working. 

It then took a bit of additional fiddling with ADV7511 I2C register settings (and research to find out what I wasn't setting) to get some audio coming out in test bitstreams. This took a bit longer than expected, because at first I didn't realise that I was hitting a rather weird problem: Enabling HDMI audio with the PAL video mode we are using caused the HDMI output to completely fail, i.e., the screen would go to sleep, as though there was no video signal.

Once I figured out that that was happening and switched to using the NTSC mode to work on, I soon had reasonable audio.  I say reasonable, because it was quite recognisable, but also quite distorted. Fixing that meant squashing a number of small bugs, as well as simply reducing the audio output level to one that agreed with the ADV7511.  I also increased the sample rate to 192kHz, since HDMI supports that, and I figured it couldn't hurt.

What then followed was several long frustrating nights, as I tried to figure out what was going on with this crazy PAL video mode problem.  Why on earth would turning audio on with a PAL signal cause the signal to fail?  Why would a monitor even care whether the signal was PAL or NTSC when processing the audio?  These and many other questions kept flying through my head.

To help track the problem down, I reworked the whole HDMI I2C setup code, so that I could directly memory map the I2C registers. That let me see what the ADV7511 was thinking about things, as well as try tweaking any registers that seemed like they might be the culprit. It turned out that none of them were the cause, although I recall that I did improve a few of the settings along the way. It also meant that I wrote a nice little HDMI debug utility that shows the status of many of these registers, and reports on the detected video mode etc:



The PAL problem was at the time still causing me increasing frustration, as there was no apparent cause for it, as can be seen in the rather long github issue log as I tried various things.  So I resorted to asking for help on the Analog Devices forum, and then progressively mutating the NTSC video mode until it matched the PAL video mode, to try to find the actual cause. It was this last idea that finally let me solve the problem:  I had a typo in the VSYNC setup for the PAL video mode, that caused the VIC-IV to produce pixel data during the first line of VSYNC.

Once it started working, it was obviously a very great relief.  Commando sounds excellent.  I think there might be some minor problems with stereo and level normalisation causing some clipping due to DC bias on the digital samples. But those all feel like achievable, relatively small tasks, now that we have the fundamental audio working.

Friday, 22 November 2019

Writing the 2019 Christmas Demo

Today we have another guest post, this time from Anton, describing how he created the nice little Christmas Demo for the C64, C65 and MEGA65.  This is a nice little program to explore, because it works out which machine you are on, and varies the display accordingly.  But let's have Anton explain it in his own words...
Hi everybody, I wanted to give you a little background info about the MEGA65 BASIC 2 Christmas demo I have just coded together with Paul.
Last year a Forum64 member, ZeHa, started a project on Forum64.
The idea was to bring back the times of the 80’s, when you went to your local store, bought the latest computer magazine and, once at home, started to type the listings into your beloved 8-bit machine. So he collected last year’s Xmas-themed BASIC listings from board members and put them together in a magazine.

So, this year, he contacted us, the MEGA65 team, to ask whether we would be interested in providing a nice program as well. We discussed this in the team and we all loved the idea. So I agreed to program something in BASIC. (I haven’t done any BASIC coding since the 80’s…..)

I knew from the beginning that I would like to code something that would run on every C64, but I also wanted to demonstrate what the MEGA65 is capable of.
So that’s when I settled on BASIC 2. One of the biggest advantages of the MEGA65 really is that you can stay in normal C64 mode and activate the VIC-IV features by POKEing:

POKE 53295,71:POKE 53295,83


and you have all the VIC-IV features available at your hand. Easy as that !!!

I started to think about the demo and its Xmas-themed requirements, and slowly an idea formed in my head. I contacted Paul and asked him if there is a way to find out whether there is a VIC-II present, as on the c64, a VIC-III, as on the c65, or a VIC-IV, as the MEGA65 uses. After some testing we had a nice code snippet ready to check what machine our program is running on:
(I love the name, we call it “Knock Knock”-routine)

13710 REM *** "knock knock" TO CHECK WHAT VIC IS USED (c64/c65/MEGA65) ***
13720 REM In c65 Mode we cannot safely write to 53295, so we test a different way
13810 IF PEEK(V+24)AND32THENGOTO14410
13910 POKEV,1:POKEV+47,71:POKEV+47,83
14010 POKEV+256,0:IFPEEK(V)=1THEN VC=65:C$="VIC-IV":RETURN
14110 POKEV,1:POKEV+47,165:POKEV+47,150
14210 POKEV+256,0:IFPEEK(V)=1THEN VC=65:C$="VIC-III":RETURN
14310 VC=64:C$="VIC-II":RETURN
14320 REM we assume we have a c65 here
14410 V1=PEEK(V+80):V2=PEEK(V+80):V3=PEEK(V+80)
14510 IF V1<>V2 OR V1<>V3ORV2<>V3THEN VC=65:C$="VIC-IV":RETURN
14610 GOTO14110
Now that I was able to check what machine my code is running on, I could start with my demo. I wanted to make something simple, nice to watch and completely in BASIC.
Several people have already asked how the 15 colour sprites on the MEGA65 work, so I thought this should go into the demo as well.

The idea took shape. After I found out, with a lot of help from Paul (Thanks Paul !), how the 15 colour sprites work, I started to design one.

For a rough understanding:
The c64 sprites are using in HiRes mode 24bit x 21 bits so a standard sprite is 24 by 21 pixels
The multicolor sprites are using 2  bits for each color which leaves us with 12 doublewide pixels x 21 pixels. So a sprite on the c64 is using 63B (64)Bytes (one Byte stays empty).
The  MEGA65 VIC-IV is able to show a 15 color sprite, which is 16 pixels horizontally x 21 pixels vertically.
Each pixels colour is set by one Nibble (4bits) so a MEGA65 15colour sprite is 64bits horizontally and 21 bits vertically. This means:
4bits x 16 =  64 bits horizontally (8 Bytes)
64 x 21 = 1344 bits or 168 Bytes.
This means, that for a 15 color VIC-IV sprite you have 168B of Data.
The memory area at $0340 would be perfect. So I started drawing a concept.
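The sprite size arithmetic above can be checked with a few lines of Python. The helper name is made up for this sketch; the numbers are the ones from the text.

```python
def sprite_bytes(pixels_wide, bits_per_pixel, rows):
    """Number of bytes needed for a sprite of the given geometry."""
    bits = pixels_wide * bits_per_pixel * rows
    return (bits + 7) // 8  # round up to whole bytes

hires = sprite_bytes(24, 1, 21)        # c64 HiRes sprite
multicolour = sprite_bytes(12, 2, 21)  # c64 multicolour sprite
vic4_15col = sprite_bytes(16, 4, 21)   # VIC-IV 15 colour sprite
```

Both c64 formats come out at 63 bytes, and the 15 colour sprite at 168 bytes, so a sprite stored from $0340 (832) ends just before address 1000.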
Have a look: (PS, the blank middle line was removed by moving the right side of the tree one pixel to the left)
I have to say I was amazed, to see how much you can realize with a sprite with such dimensions !
Paul wrote a nice code snippet, which might not be the fastest routine to get the 15 colour sprite poked to the right memory address area, but it was a very good study example of HOW it works.
Have a look here:
(All the standard sprite commands for positioning, sprite pointers, etc. remain exactly as they would be for a normal HiRes sprite. In this part you only see the commands that are different from the VIC-II.)
(You can see that this routine already uses the checks done in the previous code snippet to see which VIC is used.)

45 REM IF MEGA65 ENABLE VIC-IV FEATURES
48 IFC$="VIC-IV"THEN POKE V+47,71:POKE V+47,83

255 REM *** VIC-IV 15 COLOR SPRITE SETUP ***
260 REM IF NO MEGA65 JUMP TO 390
265 IF C$="VIC-II" OR C$="VIC-III" THEN GOTO 390
270 REM SET ADDRESS 15COL SPRITE TO $0340 (832)
275 AD=768+64
280 REM MAKE SPRITE 3 - 8 AS 15COLOR SPRITES AND MAKE THEM USE 64 BIT
285 POKE V+107,252:POKE V+87,252
290 REM COPY STANDARD COLORS OVER FOR MULTICOL SPRITES 3 - 8
295 FORI=16TO255:FORJ=1TO3:POKEV+256*J+I,PEEK(V+256*J+(I AND 15)):NEXTJ,I
315 REM READ 15COL SPRITE DATA
320 READN$:IFN$="END"THEN 390
325 REM JUMP TO DECODE ROUTINE
330 GOSUB 345
335 GOTO 320
340 REM DECODE STRING OF NYBLS IN N$ AT ADDRESS AD (832)
345 L=LEN(N$)
350 FOR I=1 TO (L/2+1):POKE AD+I,0
355 FOR I= 1 TO L:N=ASC(MID$(N$,I,1))-64:IFN<0THEN N=0
360 B=AD+INT((I-1)/2):IF (I AND 1)=1 THEN N=N*16
365 VA=PEEK(B):POKE B,VA OR N:PRINT"{home}";VA:NEXTI
370 AD=AD+INT(I/2)
375 IF (L AND 1) THEN AD=AD+1
380 RETURN

1445 REM 15 COLOR TREESPRITE DATA 
1450 REM "-" = TRANSPARENT, "A"-"O" = COLOURS 1 TO 15 
1455 DATA "------HGH-------"
1460 DATA "------GGG-------"
1465 DATA "------HGH-------"
1470 DATA "-------G--------"
1475 DATA "------EEH-------"
1480 DATA "-----E-EBE------"
1485 DATA "------ENB-------"
1490 DATA "-----EEEEM------"
1495 DATA "----EHEEEEE-----"
1500 DATA "---E-B-EE--E----"
1505 DATA "----EBEEENE-H---"
1510 DATA "---CEEEHEEEEB---"
1515 DATA "--EE--EBE--EB---"
1520 DATA "---HEEEBCEE-----"
1525 DATA "---BEEEEEEEE----"
1530 DATA "--EBE-EEEEEHE---"
1535 DATA "-EE--EEMEE-BEE--"
1540 DATA "H--EEEEECEEB----"
1545 DATA "BEEENEEEEEEEEE--"
1550 DATA "BEEEE--I--EECEE-"
1555 DATA "-------I--------"
1560 DATA "END"
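For readers who find the packing easier to follow outside of BASIC, here is the same nibble decoding as lines 345 to 375 of the listing, sketched in Python. It is an illustration of the encoding only, not part of the demo: each character becomes its character code minus 64 (so "-" ends up as 0 and "A" to "O" as 1 to 15), and two characters are packed into one byte, first character in the high nibble.

```python
def decode_row(row):
    """Pack one DATA string of colour letters into bytes, two pixels per byte."""
    data = bytearray((len(row) + 1) // 2)
    for i, ch in enumerate(row):
        n = ord(ch) - 64          # "A" -> 1 ... "O" -> 15
        if n < 0:
            n = 0                 # "-" (and anything below "@") is transparent
        if i % 2 == 0:
            n <<= 4               # first pixel of each pair goes in the high nibble
        data[i // 2] |= n
    return bytes(data)

top_of_tree = decode_row("------HGH-------")
```

Each 16 character row gives 8 bytes, so the 21 rows of the tree add up to the 168 bytes worked out earlier.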


I would like to mention that by setting the sprite colour, you define the transparent colour.
So if our tree is sprite 3, setting 53290 ($D02A) to 0 (POKE 53290,0) makes black (0) the transparent colour for sprite 3.
Lines 290 and 295 are also worth mentioning. There we copy the RGB values of the 16 standard colours from the sprite 1 colour bank over to the sprite colour banks of sprites 3 – 8.
This means you can not only use 15 colours per sprite, but can also define what these colours look like.
You have the RGB values under your control.



  
OK, but back to the demo. I drew several other sprites (since I already knew that people who have a normal c64 or a c65 wouldn’t be able to see the 15 colour sprite), and started to put everything together. As I continued to program in BASIC 2, I soon realised that on a normal c64 at 1 MHz I would not be able to realise all the things I wanted to show.
But the MEGA65 is able to switch between 1MHz/2MHz/3.5MHz and 40MHz !!!




This was the idea I was looking for !!!  Enable certain features of the demo only if the machine is able to support a certain speed:
C64 max 1MHz.
C128 max 1MHz (2MHz not useable, since FAST mode on c128 disables screen)
C65 1MHz/3.5 MHz
MEGA65 max 40 MHz
 

So after I knew, what VIC my program is running on, I was able to tell what the maximum speed would be, that this machine can support.
So I needed some code to do the frequency change. After checking the VIC-III and VIC-IV registers in the official MEGA65 manual, which has all the necessary information, I came up with this code snippet:


935 REM *** FREQUENCY CHANGE ***
940 REM SWITCH TO 2MHZ
945 IF MH=1 AND C$="VIC-IV" THEN GOSUB985:POKEV+48,1:MH=2:GOTO995
950 REM SWITCH TO 3.5 MHZ
952 IF MH=1 AND C$="VIC-III" THEN POKEV+49,64:MH=3.5:RETURN
955 IF MH=2 THEN GOSUB985:POKE0,64:MH=3.5:GOTO995
960 REM SWITCH TO 40MHZ
965 IF MH=3.5 AND C$="VIC-IV"THEN GOSUB985:POKE0,65:MH=40:GOTO995
970 REM SWITCH TO 1 MHZ
972 IF MH=3.5 AND C$="VIC-III" THEN POKEV+49,0:MH=1:RETURN
975 IF MH=40 THEN GOSUB985:POKE0,64:MH=1:GOTO995 
980 REM DISABLE VIC-IV FEATURES
985 POKEV+47,0:POKEV+48,0:RETURN
990 REM ENABLE VIC-IV FEATURES
995 POKEV+47,71:POKEV+47,83
1000 IF MH=40 THEN POKEV+49,0
1005 IF MH=3.5 THEN POKEV+49,64
1010 RETURN
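Stripped of the POKEs, the routine above is a small state machine over the MH variable. The following Python sketch models only that cycling order as I read it from the listing; it does not touch any registers, and the function name is made up.

```python
def next_speed(mh, chip):
    """Return the next CPU speed in MHz after `mh` for the detected video chip."""
    if chip == "VIC-IV":
        order = [1, 2, 3.5, 40]   # MEGA65: 1 -> 2 -> 3.5 -> 40 -> 1
    elif chip == "VIC-III":
        order = [1, 3.5]          # c65: 1 -> 3.5 -> 1
    else:
        return mh                 # VIC-II: stays at its current speed
    return order[(order.index(mh) + 1) % len(order)]
```

Calling it repeatedly with chip "VIC-IV" walks through 1, 2, 3.5 and 40 MHz before wrapping around to 1 MHz again.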

So the c64 would run our nice Christmas demo with the fewest functions, while on the MEGA65 you can see all the functions implemented (including the 15 colour demo sprite).
If the demo is running on a machine more capable than a c64/c128, an extra text is shown, which asks you to press (depending again on what machine it’s running on) “SPACE” to cycle between the standard multicolour sprite and the 15 colour VIC-IV sprite, or “F” to cycle through the frequencies.

This is more or less the main idea behind the demo. It will run on a c64, on a c65 and on the MEGA65, and it shows some of the features the MEGA65 offers.
So if you are interested in the demo, hop over to forum64 (this is also the official MEGA65 forum).
Just mention in the thread that you want to receive a copy of the forum64 2019 Christmas magazine.

So to conclude: I want to wish all of you a merry Christmas 80’s style...

Anton