Thursday 9 January 2020

Programming the Bitstream Boot Flash and all things JTAG

So, in the last post, I implemented the ability to tell the MEGA65 to switch to a different bitstream. The next challenge is to make it possible for the MEGA65 to be able to re-program the contents of the flash memory, so that we can supply people with an  updated bitstream, and make it super-easy to upgrade the MEGA65.

First piece of detective work was to realise that we can take a .bit file, remove the 120 byte header, and write it directly to the flash somewhere, and it should Just Work (tm).

So now I need to be able to talk to the SPI boot flash. This is a bit tricky, because the FPGA boot process controls the clock line to this device. Fortunately, there is a way to put this back under control of the VHDL:  Basically you use this slightly magic STARTUPE2 thing, and feed it a clock:

  STARTUPE2_inst: STARTUPE2
    generic map(PROG_USR=>"FALSE", --Activate program event security feature.
                                   --Requires encrypted bitstreams.
                SIM_CCLK_FREQ=>0.0 --Set the Configuration Clock Frequency(ns) for simulation.
    )
    port map(CFGCLK=>CFGCLK,--1-bit output: Configuration main clock output
             CFGMCLK=>CFGMCLK,--1-bit output: Configuration internal oscillator
                              --clock output
             EOS=>EOS,--1-bit output: Active high output signal indicating the
                      --End Of Startup.
             PREQ=>PREQ,--1-bit output: PROGRAM request to fabric output
             CLK=>CLK,--1-bit input: User start-up clock input
             GSR=>GSR,--1-bit input: Global Set/Reset input (GSR cannot be used
                      --for the port name)
             GTS=>GTS,--1-bit input: Global 3-state input (GTS cannot be used
                      --for the port name)
             KEYCLEARB=>KEYCLEARB,--1-bit input: Clear AES Decrypter Key input
                                  --from Battery-Backed RAM (BBRAM)
             PACK=>PACK,--1-bit input: PROGRAM acknowledge input
             USRCCLKO=>spi_clock,--1-bit input: User CCLK input
             USRCCLKTS=>USRCCLKTS,--1-bit input: User CCLK 3-state enable input
             USRDONEO=>USRDONEO,--1-bit input: User DONE pin output control
             USRDONETS=>USRDONETS--1-bit input: User DONE 3-state enable output
             );

The important bits here are the USRCCLK and USRDONE signals.  Basically the first pair of signals let us control the clock to the SPI flash, while the second lets us control the DONE signal, which the FPGA normally outputs high when it is configured.  We just have to keep that one behaving normally, since the MAX10 FPGA depends on it.


When I first attempted to implement this, the system failed to come up.  After a lot of poking around and inadequate documentation from Xilinx, I found this project, that actually showed a working instantiation. From there it wasn't long, before I at least had a working bitstream.

It's actually likely to be helpful for the rest of this part, as well, because it actually does everything that I want: i.e., it allows programming of a connected QSPI flash memory. I'm glad to have finally found some source code that I can look at when I get stuck, to see how others have solved the same problems.

So in theory at this point, I have a bitstream with working ECAPE2 for bitstream switching, and now, a bit-bashing interface that *should* allow me to talk to the QSPI flash.  So I started writing a little test program for that, that basically tries to read some device information from the QSPI chip.

So, not entirely suprisingly, the test program doesn't work, in that it doesn't return the device ID.  If the pins for the QSPI flash chip were exposed on the PCB, I'd be able to stick my oscilloscope on them, and waggle them in software, and make sure that everything is correct.  However, as both the FPGA and QSPI flash are BGA parts with no exposed pins, there is no such possibility.

It should be possible, however, to use JTAG debugging tools to read the pin status of every pin on the FPGA.  The trick is how to do this easily from command line on linux.

The UrJTAG package provides the jtag command that *should* be able to do this.  After some hunting for info, the following should work to detect a MEGA65 connected via the USB debug cable:

jtag> cable FT2232 vid=0x0403 pid=0x6010

Then the detect command should show something connected, like this:

jtag> detect
IR length: 6
Chain length: 1
Device Id: 00010011011000110001000010010011 (0x13631093)
  Manufacturer: Xilinx (0x093)
  Unknown part! (0011011000110001) (/usr/share/urjtag/xilinx/PARTS)

That's looking good, except that the Artix7 FPGA is not in the part list.

There is, however, a newer version of UrJTAG, that has been patched to support the Artix7 series, and even has a boundary scan file for at least one version of the chip -- that should allow us to map the JTAG output to actual pins, which will be very helppful for us. Unfortunately, the pre-built package for Ubuntu lacks this, so I need to build it from scratch.

Building UrJTAG is proving interesting, because it needs ftd2xx.h, which I can't figure out which package on Linux provides. It looks like it might come from here: https://www.ftdichip.com/Drivers/D2XX.htm. You have to copy the include files from the release/ directory into the build directory for UrJTAG, and then it seems to build.

So, builting UrJTAG is a bit of a pain. The "make install" script basically doesn't work, so you have to do all that yourself.  With the new jtag binary, I now get this:

jtag> cable FT2232 vid=0x0403 pid=0x6010
Connected to libftd2xx driver.
jtag> detect
IR length: 6
Chain length: 1
Device Id: 00010011011000110001000010010011 (0x13631093)
  Manufacturer: Xilinx (0x093)
  Part(0):      xc7a100t (0x3631)
error: Unable to open file '/usr/local/share/urjtag/xilinx/xc7a100t/STEPPINGS'
  Unknown stepping! (0001) (/usr/local/share/urjtag/xilinx/xc7a100t/STEPPINGS)

So, that's a step forward, but  I have no idea yet where to get this STEPPINGS file from, or if it really is necessary. Ah, that was also just a problem with the install script not working. After manually copying  the data/ directory's contents into /usr/local/share/urljtag, it works:

jtag> detect
IR length: 6
Chain length: 1
Device Id: 00010011011000110001000010010011 (0x13631093)
  Manufacturer: Xilinx (0x093)
  Part(0):      xc7a100t (0x3631)
  Stepping:     1
  Filename:     /usr/local/share/urjtag/xilinx/xc7a100t/xc7a100t-csg324

This is all very nice, except that it thinks is the 324 pin part, not the 484 pin part that is actually in the MEGA65 R2 PCB.  It seems that UrJTAG might not support multiple variants of the same part, which is a bit annoying.

The first step, though, is to find the information required to actually even make the file. This seems to be available behind the license-wall at: https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/device-models/bsdl-models/artix-series-fpgas.html.   Using my account there, I downloaded the zip archive of BSDL files, and it seems that they are indeed the source material that I need. The PIN_MAP_STRING in each file seems to be the reverse-order of what appears in the UrJTAG file.  The syntax of BSDL is a bit weird, being VHDL derived, so where there are multiple pins defined on a single line, I'll have to work out how to parse those.

It turns out that UrJTAG has a parser utility for doing this:

bsdl2jtag xc7a100t_csg324.{bsd,jtag}
error: -E- error: In Package STD_1149_6_2003, Line 375, Error in User-Defined Package declarations.
error: -E- error: BSDL file 'xc7a100t_csg324.bsd' contains errors in VHDL stage, stopping
error: system error: Success Cannot open file STD_1149_6_2003 or /usr/local/share/urjtag/bsdl/STD_1149_6_2003

But as we can see, it is missing some files.  I suspect the install target of the Makefile might again be the problem here. Nope, apparently it just doesn't support STD_1149_6_2003. But someone has implemented the missing file. Unfortunately, it gives some error about user defined packages.  Someone else just took to modifying the BSDL files to remove the need for STD_1149_6_2003.  I might try that next.

Meanwhile, as I am out and about this morning, I took a Nexys4DDR board with me, which does have the exact chip that UrJTAG already supports, since I figured that should "just work", and I should be able to poke around with it while waiting for appointments.  Well, I don't get the error I described above, but I do instead get:

jtag> cable FT2232 vid=0x0403 pid=0x6010
Connected to libftd2xx driver.
jtag> detect
warning: TDO seems to be stuck at 1

What I don't know, is whether this is further along or not as far along as the other. I am guessing it is not as far along, since if the JTAG bus is stuck, it won't enumerate, and indeed, we are seeing a lack of enumeration. Fortunately I am not the only person with this problem.  Let's try some of their proposed solutions...

Unfortunately none of the suggestions on that page work. I'd suspect that my FPGA board is broken, except the fpgajtag command I use to send bitstreams to the FPGA via JTAG works perfectly.  So the JTAG interface *does* work, and my computer *can* communicate with it.  Most frustrating.

I also took a look at OpenOCD, an open-source JTAG tool for Linux etc.  This is an excellent project in many ways, but was never designed with doing simple FPGA boundary scans in practice.  Thus as a result, it still isn't in any way trivial to do them with it. I am sure if I invested enough time and energy I could figure out how to do it, but I really don't want to have to do that, if I don't have to.

I did take a quick look at the internals of the fpgajtag command, to see if I could easily adapt it.  It looks reasonably well-structured, but for someone who doesn't know that much about JTAG (although I am learning), it isn't immediately obvious what I would need to change.

So then I started looking at Vivado to see if the hardware manager in there can easily do a boundary scan.  I am sure it can, but even after a pile of Googling, I can't actually figure out how to do it.  There is a lot of talk about needing a debug bitstream or some debug core in the project.  This strikes me as incomplete information at best, since the JTAG interface on an FPGA, if not disabled, can ALWAYS do a boundary scan, if I understand things correctly.  Also, my workstation this morning doesn't have mains power, so I don't want to kill my battery before the kids swimming lessons finish for the morning:



The best thing I have found so far is this: https://www.fpga4fun.com/JTAG4.html
While whatever the JTAG library is that the example source code was written for isn't immediately clear, it does show how to go through the process of performing the boundary scan at a low level.  It might thus be enough information, together with the fpgajtag source, to cook something up that can work.  I have found the Xilinx BSDL files for the FPGAs I care about already, so in theory, I have all the information I need.

It also gives me hope of being able to take control of pins on the FPGA, so that I can more quickly test and develop things like this QSPI interface, as I can potentially avoid having to synthesise every change, but instead be able to bit-bash over JTAG.  But of course, I have to succeed in actually getting SOMETHING to work, before I can get that excited.

Well, at least integrating fpgajtag into monitor_load was relatively easy: The only slightly tricky part was re-doing the command line interface parse stuff. But I do want to extend it a little further, so that the fpgajtag stuff which correctly works out which USB serial port to talk to, can also be used to automatically find the correct serial port for the normal monitor_load communications.  This was also not too hard, once I found out that I could map the /dev/ttyUSBx paths to the entries in /sys/bus/usb-serial/devices, and look at the destination of those symlinks to check that the USB bus and port match.

So now, in theory, I have all necessary ingredients to adapt to be able to run a boundary scan from within monitor_load, so long as I can figure out how the  fpgajtag code does the JTAG communications.  But this is not proving as simple as I would like, as fpgajtag has what seems to be a quite clever mechanism for abstracting the low-level JTAG operations.

Unfortunately, there is little documentation in the source, and I am struggling to understand how to adapt it.  I'm pulling my hair out enough that I have logged an issue on the fpgajtag github repository asking for some help in understanding their code.  Within a few hours, I had received some pointers to documentation for the FTDI serial adapters, which gave me enough information, with quite a lot of trial and error, to work out how to control the JTAG interface.  This will also come in handy in the future, when we get to implementing updating the keyboard CPLD from the MEGA65 itself as well, as I will need to implement a JTAG interface for that.

Anyway, back to the point, I now seem to be able to read some JTAG boundary scan data from the FPGA.  It seems to be shifted by a few bits, and I don't yet capture it all, but I am able to see bits toggle as I flip the switches on a Nexys4 DDR board, and in roughly the right place in the boundary scan register.  I suspect the bit order of the bytes might be flipped, and that I need to ignore the first 6 or so bits, to make up for the bits of the boundary scan command itself being shifted out.  But the important thing is that I can now read boundary scan data.  The changes I made to the read_idcode() function to tell it to switch to boundary scan mode ended up being quite simple:

    ENTER_TMS_STATE('I');
    ENTER_TMS_STATE('S');
    write_bit(0, 0, 0xff, 0);     // Select first device on bus
    write_bit(0, 5, IRREG_SAMPLE, 0);     // Send IDCODE command
    ENTER_TMS_STATE('I');


(Checkout https://github.com/MEGA65/mega65-core/blob/unstable/src/tools/fpgajtag/boundary_scan.c if you would like to see it all together.)

This switches the JTAG interface from Reset to Idle, then to IR-Capture, send the JTAG SAMPLE command so that it ends up in the IR register, and then returns to Idle state, ready for the usual logic to shift bits in and out.  The boundary data is then in the data shifted back in.  All quite simple, once I had worked it out!

With a bit more work, I have now implemented an amazingly quick and dirty scanner for both the XDC and BSDL file formats.  XDC files inidicate the pins used by a project, while BSDL files have the information about the FPGA itself, importantly including the JTAG boundary scan information.  With these parsers, and a bit of glue, I can not only show the status of each FPGA pin, but also the name of the pin in the project.  While there is plenty of room to improve this, the result is already really nice.  Here is a little sample of the output on a Nexys4DDR board:

monitor_load -J src/vhdl/nexys4ddr-widget.xdc,${HOME}/build/artix7/public/bsdl/xc7a100tl_csg324.bsd
make: „src/tools/monitor_load“ ist bereits aktuell.
fpgajtag: Digilent:Digilent USB Device:210292645477; bcd:700; IDCODE:  3631093
Auto-detected serial port '/dev/ttyUSB1'
FPGA is assumed to be a XC7A100TL_CSG324, with 989 bits of boundary scan data.
bit#2 : CCLK_E9 (pin E9, signal {QspiSCK}) = 1
bit#3 : M0_P12 (pin P12, signal <unknown>) = 1
bit#4 : M1_P13 (pin P13, signal <unknown>) = 0
bit#5 : M2_P11 (pin P11, signal <unknown>) = 1
bit#6 : CFGBVS_P8 (pin P8, signal <unknown>) = 1
bit#10 : INIT_B_P7 (pin P7, signal <unknown>) = 1
bit#13 : DONE_P10 (pin P10, signal <unknown>) = 0
bit#53 : IO_U8 (pin U8, signal {sw[9]}) = 0
bit#56 : IO_T8 (pin T8, signal {sw[8]}) = 1

...

First, we have filtered out all the bits that are not marked "input" in the BSDL file, which dramatically shortens the list of output.

Second, we see the nice mapping of the BSDL bit names to FPGA pins and project signals.  sw[9] and sw[8] are two of the slide switches on the Nexys board, and I can happily twiddle those, and re-run the scan, and see the changing values.  So I am confident overall that its working, and that I can finally go back to what I was trying to do at the begining: Check whether I am correctly controlling the QSPI interface pins, in particular the CCLK pin.

So let's actually fire up a bitstream, and see if we can control the pin... and indeed I have confirmed that everything except the pesky clock pin is controllable.  This is what I had most suspected would be the problem, but now I don't have to suspect -- I can inspect!  But solving that will have to wait for the next blog post.

Meanwhile, if you would like to support me, I've setup a ko-fi page at ko-fi.com/paulgs.

No comments:

Post a Comment