I have been making small incremental fixes (or what were intended to be fixes, but in some cases were worse) in the sprite rendering module. However, the long synthesis times are a pain for making progress that way. Simulation is much faster, and also allows more insight as to why something is going wrong.
So, today I setup a test harness that allows me to render a single sprite, and work out what is going on. But before we go further, some of the known problems that I am trying to solve with sprites are:
1. Sprites "shimmer" with the data being shown on a physical raster line sometimes varying between what is on the raster above, and the raster below.
2. Sprites display a line (or more) of junk at the bottom. This is actually one extra VGA raster of data, as though the sprite were one VGA pixel taller.
3. Vertical expansion of sprites was broken.
4. 16-colour sprite mode was missing (this is a new MEGA65 sprite mode)
5. Tiling mode didn't work properly in 640H mode, but stopped at pixel 419.
6. Non-tiled sprites would stop in the right border, instead of wrapping around to the left of the next raster line.
There are probably a few others in there, as well.
I have started by refining the logic somewhat so that the Y advance logic and start of sprite rendering is now much cleaner and simpler. This results in the sprite thinking it is drawing the correct data on each raster line, and stopping immediately at the correct point.
I have also found some bugs with the sprite collision checking, which I have tried to fix.
Now the trick is to get the sprites showing the correct data on every line. This is a little bit complicated, because the sprite needs to know what data it needs for the next raster line, to tell the VIC-IV to fetch it. Otherwise, it will end up one VIC-IV raster late, and if the VIC-IV provides the updated data in the middle of a raster line, we can get that shimmer effect I mentioned in the bug list. What is nice is that realising all this has helped to explain to me the kinds of bugs that I am seeing, and gives me confidence that they can be fixed.
Now, to fix the sprite fetching, we need to understand how the sprites work on the MEGA65.
First, each sprite has a 64 bit raster buffer that contains the 8 bytes of sprite data for one row of pixels. On the VIC-II and VIC-III, this is only 3 bytes, because sprites are only 24 pixels wide. However, on the MEGA65, you can ask for sprites to be 64 pixels wide instead (or 16 pixels wide if in 16-colour mode, where four pixels control the colour of each pixel, but without becoming wider like they do in multi-colour sprite mode).
To get data into the sprite raster buffers, there is a ring bus where the sprites put their current offset with respect to the start of the sprite data block onto the bus, and the VIC-IV receives this as the ring bus rotates the data, and the VIC-IV then provides the 8 bytes of data relative to that address once per raster line. The VIC-IV fetches all 8 bytes for all 8 sprites (and also the 8 bitplanes) every raster line, regardless off whether they are to be displayed or not. This simplifies things somewhat, and is possible because of the high pixel clock of the MEGA65 (currently 100MHz) compared to the screen resolution.
This 64 bit buffer is continuously expanded into a 128 bit version for both multi-colour mode and hires mode, so that the sprite drawing logic can just always take two pixels. The only exception to this is 16-colour mode, where the 128 bit version is prepared as for hi-res mode, but the shift register is shifted four times per pixel on subsequent cycles (which requires that the VGA pixel clock be at least 4x the sprite pixel clock. For now, this means that 16-colour sprites cannot be used in 640H mode. I hope to relax this in future, by allowing the barrel shifter to shift either 1x2=2 or 4x2=8 bits each time.) In all cases, the sprite copies the 128 bits of expanded data into the shift register for drawing when the X position matches the left edge of the sprite -- which is why the shimmer can vary based on the sprite number and X position.
What we need to do instead is to latch the 128 bits of drawing data at the start of a raster line, to eliminate the shimmer. Then, to make sure the correct data is drawn each raster, we need to have the sprite request the data for the next row of pixels once it has received the data for the current row of pixels, and latch the new data just as the next row of pixels should begin to be drawn.
Because sprite data fetches happen every single raster line, we can do the fetch for the first row of pixels on a continuous basis, and latch it as soon as we begin drawing that row of pixels, and then begin asking for the next, and latch it in the same way.
Okay, so all of that is synthesising now.
While I wait for that, I want to try to fix a problem with bad-lines. The specific problem I want to solve is that in Gyrrus, it displays the wrong characters for the inter-level announcements. What I am not currently sure about is whether the problem is that badlines happen at the wrong time on the MEGA65, or whether the problem is that the raster numbers for raster interrupts are out by one. The latter would make sense, because a lot of games have the top pixel of text following a raster split missing. So, this says that we need the VIC-II raster interrupts to happen one raster line earlier than the currently do.
So, a bit of investigation using Gyrrus as an example was in order. First, this is how it looks by default. Here is the problem, we see the yellow row of funny characters instead of the copyright message (we can also see some rubbish under the sprites from the previously described bug, as well):
I have seen a funny problem where raster routines don't seem to run quite as fast as they should when the MEGA65 is at 1MHz, so I also tried putting the CPU to 50MHz. This made the shift of two pixels to be too much, but reducing it to a one pixel delta with 50MHz resulted in a perfect display (give or take the sprite rendering bug):
So, are we not getting enough CPU cycles per raster line? This is fairly easy to compute, because we know the pixel clock is 100MHz, and that there are 3,196 cycles per VGA raster = 63.92 microseconds per two raster lines (since we are effectively running double-scan). Given the CPU clock at 1MHz for PAL in the MEGA65 is calculated as (64569/65536) * 1MHz, this gives 63.92 * 64569/65536 = 62.9768 cycles per raster. That is, about once every 43 raster lines, a cycle will be missing. To fix this, we need to make the 1MHz PAL clock run at 64593/65536 = 985.61 KHz, instead of the true PAL clock speed of 985.25 KHz. That is, we are talking about about one part per two thousand. This is most likely an acceptable error.
However, it is best to actually test things like this rather than just trusting in them. I was able to confirm that raster lines do appear to have only somewhere between 62 and 63 cycles each.
Then I thought I should try NTSC. Loading Gyrrus in NTSC and adjusting the raster number by two yielded a perfect result at 1MHz -- no 50MHz boost required. Is it possible that Gyrrus was designed for NTSC instead of PAL? Anyway, it suggests that we should move the VIC-II raster numbers up relative to the physical screen dimensions. However, causing raster lines 0 and 1 to cease to exist is likely to cause other problems, e.g., for raster interrupts set to trigger on those raster lines. Thus there is an easier solution: Simply move the borders and character generator down two pixels. Then everything will line up naturally. We'll try that.
Now, back to the sprite rendering problems, Gyrrus has shown that we still have the junk at the bottom, and also that the top row of pixels in sprites are still only one pixel high. The row of junk is a true sprite pixel high, i.e., it expands with the Y-expand flag. What is helpful is that now with my test harness, I can simulate the drawing of a sprite, either expanded or not, and see what is going on. This image here is of an X and Y expanded sprite. You might need to enlarge the image to see what is going on:
So, we can see a few problems here:
1. The first row of pixels is too short.
2. The missing height of the first row seems to be made up for with pixels from row 21, i.e., a 22nd row of pixels that should not be present.
That is, it looks like it should just be the one problem the fix here. Basically what is happening is that it is pulling the new data in on the off-raster for the Y-expanded sprite. After a bit of poking around, I sorted this out, so now the output looks like:
Okay, so let's make sure we haven't messed things up for non-expanded sprites:
Hmm, here we have a similar problem, except that it is the top row of pixels being drawn too many times, instead of too few. This was fixed by setting the sprite drawing flag at the start of the raster line on which the sprite appears, instead of when the first pixel is due to be drawn, so that the data offset can be set in time. I still have some misgivings about this, as the VIC-IV may not be given enough notice from when the data offset changes to when the bytes are required. This I will have to verify. But at least now in simulation non-expanded sprites are drawing correctly:
Super! A quick check back to Y-expanded sprites to verify that it hasn't stuff that up again, and this patch is off for synthesis.
So, after synthesising, we have some improvements. The text messages in Gyrrus are now visible (still glitchy in PAL, but perfect in NTSC), thanks to the bad lines occurring at the right time, and the position of the rasters properly matching up. However, vertically expanded sprites are still messing up in a new worse way, although they are mostly fine when not vertically expanded. That said, the top row of pixels is still too short for the non-vertically expanded sprites. This can all be seen in this screenshot from Gyrrus:
In short, it feels a bit like two steps forward, one step back, or sideways or something. Anyway, let's start analysing what is going on with those vertically expanded sprites. Instead of just shimmering or being vertically rotated, they are a bit of a mess. What I suspect is happening is that the 64 pixel shift register is only being rotated by the 24 pixels being displayed, and thus isn't properly synchronised for use on the next raster line. If this theory is correct, enabling 64-pixel wide mode should fix the problem. And indeed, it does. In this screenshot I have just set the 64-pixel wide mode bit for the "G" sprite. Of course, now it is completely unreadable, but it is stable.
The frustrating part here is that I have not yet been able to reproduce the same behaviour in the simulation, which of course makes it more difficult to try to fix.
I did in fact try a fix where the pixel shift register gets reset at the right edge sprites, but that didn't work. Looking at that fix again this morning, I can see, however, that it was only being activated some of the time. So, we will see if that fixes that problem.
In the meantime, there is that problem of the top pixel of a non Y-expanded sprite being trimmed. Again, this doesn't show up in simulation. But I did see something like it at one stage in simulation, when the sprite data fetch was happening slightly wrongly. Thus, I am suspecting it is due to an unhelpful interaction of the sprite data latching and when the VIC-IV fetches the sprite data.
The problem in simulation wasn't quite the same, as it was the first row of pixels getting stretched instead of compressed, however, as it is a similar type of error, it is of interest. In simulation I was only fetching sprite data every VIC-II raster, not every VGA raster, i.e., only once every two rasters. Thus if the sprite changed its data pointer during the "off" raster, it would not get updated until one more raster line had passed, and thus would come one raster line late.
Thus I suspect it is some interaction with the VIC-IV sprite data fetch timing, presumably fetching the data before the sprite expects it. I can see if I can reproduce that sort of situation by making my sprite test harness fetch sprite data continuously, and see if that make it appear. Unfortunately, it refuses to do. So, back to testing on the running machine. The short first pixel only appears when a sprite is 24-pixels wide, but not when 64-pixels wide, again suggesting that the sprite pixel buffer is likely to blame in some way. So it is possible that the fix I have already committed to improve the behaviour of that may fix this problem as well. That would be nice. Now I just have to wait for synthesis to complete.
Hmm.. That didn't work. So this time I have added an explicit buffer for the sprites, so that the correct data gets kept and re-used for as long as it is required. This avoids all the shift register synchronisation problems. This time we have some marked improvement, particularly because the shift-register problems are being avoided:
From a distance, that even looks mostly correct. However, if you look carefully, you can see that some rows of pixels in the vertically expanded sprites are wrong. In effect, it looks as though the order of each pair of pixel rows is inverted, or more correctly, every other row of pixels is drawn from the row of pixels that follows, instead of the one it should be drawn from.
Now, it is rather odd that this doesn't happen in simulation. What occurs to me is that it could be a case of order of assignment differing between GHDL, which I am using for simulation, and the Xilinx ISE tools that perform the synthesis. That is, I might have multiple code paths that try in the same cycle to set the sprite pixel buffer, and GHDL happens to interpret them in the correct order, while ISE does not. It is in principle easy to find if this is the case, by putting debug statements on the assignments, and see if any of them happen at the same time. There is no sign of this happening, unfortunately. That said, it could well be other signals that are assigned in such a way.
Thinking about it from another perspective, what we can say is that the newly fetched data is used one VGA raster earlier than it should be, before reverting to using the previously fetch row. From this we can deduce that sprite_pixel_bits_last has the previous row of pixels, since it is later retrieved. However, there is simply no code path that allows a new row of pixels to be used, without replacing the contents of sprite_pixel_bits_last.
However, I noticed something as I was looking at the simulation output: Every other row of pixels, the sprite data is not actually reloaded -- the shift-register is left with what it had after being rotated by 24 pixels. The reason I wasn't seeing the effects of this, is that my test harness generated 8 of the same bytes for each sprite row, and because the buffer of 8 bytes was long enough for the two rows of pixels, each requiring only 3 bytes. However, by making the bytes have a bit field that changes across as well as down, suddenly the problem becomes apparent:
Remembering that for a real sprite, the 4th to 6th bytes of a data fetch are actually the 2nd row of pixels, which thus gives the appearance of the top row being shorter. It also explains why 3 bytes of rubbish appears at the bottom still: It is bytes 4 to 6 of the fetch for the last row of pixels.
The Lesson? Think carefully about your test vectors!
Now that the problem has been clearly identified, the fix was simple: Copy the cached row of pixels at the start of VGA rasters that are not also the start of a new VIC-II raster. And the result: Simulation now looks correct, for both un-expanded and expanded sprites:
And after synthesis, Gyrrus looks like it should, with expanded sprites looking correct, and the half-height first pixel row of the earth also fixed:
Indeed, using Impossible Mission as a test for the extra row of sprite junk, since it was obvious there as well, I was also able to confirm that sprites are drawing correctly there as well:
I have yet to report on is the 16-colour sprite mode, as well as some of the other more esoteric sprite render problems. However, seeing as this post has got a bit long, I'll cover those in separate posts later on.