Wednesday 10 January 2018

Testing Ethernet on the r1 PCB

There are now only a couple of interfaces left to test on the r1 PCB: HDMI output and the 100 Mbit ethernet port.  Ethernet is the next on the list, as it should, in principle, be easy to test, as we already had the same ethernet hardware on the Nexys4 DDR boards.  Thus, it really comes down to verifying that the pin assignments are correct.

However, it has been ages since we used the ethernet interface in earnest, in part because there is still a bug in my VHDL ethernet controller when transmitting (bits get corrupted, most probably due to a timing problem).

Thus, the first step was to get back to a working setup on the Nexys4 DDR board, where I could verify that I had a working test procedure.

The setup was quite simple.  The etherload program is a tiny program that listens for incoming ethernet packets on the MEGA65's ethernet interface and, if they are UDP packets on port 4510, executes the contents of the packet in memory. A companion program on a computer connected via ethernet uses this to send 1KB pieces of the program to be loaded, each together with the little routine to copy it into place.  This scheme allows the ethernet loading program to be <256 bytes in length, including the ability to respond to ARP requests (although with the ethernet transmission problem, this is currently not very useful).
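As a rough illustration of the host side of this scheme, here is a minimal sketch that splits a program into 1KB pieces and sends each one as a UDP datagram to port 4510 on a broadcast address. The payload layout (a 2-byte load address followed by the data) is purely hypothetical: the real etherload payload also carries the small copy routine, and its exact format is not reproduced here. The function names are made up for this sketch.

```python
import socket

ETHERLOAD_PORT = 4510   # UDP port the MEGA65-side etherload listens on
CHUNK_SIZE = 1024       # 1KB pieces, as described above

def chunk_program(data, load_address, chunk_size=CHUNK_SIZE):
    """Yield (address, piece) pairs that together cover the whole program."""
    for offset in range(0, len(data), chunk_size):
        yield load_address + offset, data[offset:offset + chunk_size]

def build_payload(address, piece):
    """HYPOTHETICAL layout: little-endian 16-bit address, then the data."""
    return address.to_bytes(2, "little") + piece

def send_program(broadcast_ip, data, load_address):
    """Broadcast every piece of the program to the etherload port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Required because we send to a broadcast address instead of relying on ARP.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    try:
        for address, piece in chunk_program(data, load_address):
            sock.sendto(build_payload(address, piece),
                        (broadcast_ip, ETHERLOAD_PORT))
    finally:
        sock.close()
```

Keeping each datagram self-describing like this means the receiving side needs no state between packets, which is what lets the MEGA65-side loader stay so small.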

So, I loaded and ran the etherload program on the MEGA65 on a Nexys4DDR board, connected an ethernet cable, and then ran the etherload program on the Linux laptop at the other end of the ethernet cable. Without ARP, the IP address to send to must be a valid broadcast address on the ethernet interface.  I used a command like:

etherload 192.168.1.255 ../c64/games/gyrrus.prg

When etherload is running on the MEGA65 and waiting for packets, it looks like this:



(Note that etherload is so old, that it doesn't explicitly set the CPU to 50MHz, so I had to POKE0,65 before running it to do this. Otherwise it is too slow, and won't capture the packets coming in on the 100 Mbit/sec link.)

Then, when it is finished, it drops back to the ready prompt, like this:



The squiggly characters are drawn one per packet loaded, with the position matching the address of the packet loaded, so that you can see if there are any gaps, which would indicate missed packets. None here, so I could happily run Gyrrus, which worked fine.

So, at this point, I have a test procedure that I can attempt on the r1 PCB.

Trying this on the MEGA65, I see the ethernet link light come on when the ethernet is plugged in, and the ethernet LED blinks on receiving the packets, but the etherload program shows no sign of having seen the packets.  Time to investigate.  Pausing the CPU, and looking at $D6E1 to see if the ethernet controller thinks that any packets have been received shows no signs of life. 

As I have had to debug this once before on the Nexys boards, there is a debug register at $D6E0 that shows the current status of the ethernet receive lines.  Thus I can write a little routine that continually draws the contents of that register on the screen, and try sending it a packet to see if we see signs of life. 

This initially showed no signs of life, so I wrote a program to talk to the ethernet controller via the MIIM / MDIO interface, a two-wire interface that can be used to check the current connection and settings, and to set various link parameters.
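For reference, here is a sketch of what a register read looks like on the MDIO wire, assuming the PHY uses the standard IEEE 802.3 Clause 22 management frame format (32 preamble bits, start, opcode, 5-bit PHY address, 5-bit register address, turnaround, 16 data bits). It also decodes two bits of the standard Basic Status register (register 1). The function names are invented for this sketch, and nothing here is taken from the actual MEGA65 test program.

```python
def mdio_read_header(phy_addr, reg_addr):
    """Bits the host drives for a Clause 22 read, in transmission order."""
    bits = [1] * 32                                           # preamble
    bits += [0, 1]                                            # start of frame
    bits += [1, 0]                                            # opcode: read
    bits += [(phy_addr >> i) & 1 for i in range(4, -1, -1)]   # PHY address, MSB first
    bits += [(reg_addr >> i) & 1 for i in range(4, -1, -1)]   # register address, MSB first
    # After this the host releases the data line for the turnaround,
    # and the PHY drives the 16 data bits, MSB first.
    return bits

def decode_basic_status(value):
    """Pick out the link and auto-negotiation bits of register 1."""
    return {
        "link_up": bool(value & 0x0004),           # bit 2
        "autoneg_complete": bool(value & 0x0020),  # bit 5
    }
```

Polling `decode_basic_status` on the value read from register 1 is the kind of check that shows the link autonegotiating and coming up when a cable is plugged in.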

After some trial and error, I was able to talk to the MIIM interface, and read out the various registers, which showed the link autonegotiating and coming up when a cable was connected.  So I tried again to write a little routine that shows the state of the ethernet interface registers. This time, I wrote the routine to increment a location on screen based on the contents of $D6E0, as a more robust way of seeing what is happening.  This showed that the RX lines were toggling, and that the RX valid line was also changing state when packets were flying on the ethernet connection.  However, etherload still failed to see any packets.

Back when I first implemented ethernet for the Nexys4 boards, I added a feature to allow reading the values arriving on the ethernet RX lines into a buffer to help debug the implementation.  That same function is now helpful for trying to work out what is going on here.  It confirms that the data bits are being received, and that they, in general, look right.  Digging deeper, I can see that packet data is being received, but no packet reception is reported. This most likely means that the CRC is invalid.  Fortunately, when a packet is rejected due to the CRC, it still gets written into the packet buffer.  Here is what I saw after receiving a 500 byte ping packet:

 :FFDE800 00 80 BD 00 5E 00 00 FB 10 05 01 9F FC FD 08 00
 :FFDE810 45 00 00 A9 9A 5F 40 00 FF 11 3E 3E C0 A8 01 02
 :FFDE820 E0 00 00 FB 14 E9 14 E9 00 95 A6 AF 00 00 00 00
 :FFDE830 00 09 00 00 00 00 00 00 05 5F 69 70 70 73 04 5F
 :FFDE840 74 63 70 05 6C 6F 63 61 6C 00 00 0C 00 01 04 5F
 :FFDE850 66 74 70 C0 12 00 0C 00 01 07 5F 77 65 62 64 61
 :FFDE860 76 C0 12 00 0C 00 01 08 5F 77 65 62 64 61 76 73
 :FFDE870 C0 12 00 0C 00 01 09 5F 73 66 74 70 2D 73 73 68
 :FFDE880 C0 12 00 0C 00 01 04 5F 73 6D 62 C0 12 00 0C 00
 :FFDE890 01 0B 5F 61 66 70 6F 76 65 72 74 63 70 C0 12 00
 :FFDE8A0 0C 00 01 04 5F 6E 66 73 C0 12 00 0C 00 01 04 5F
 :FFDE8B0 69 70 70 C0 12 00 0C 00 BD 8D 8E 8F 90 91 92 93
 :FFDE8C0 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F A0 A1 A2 A3
 :FFDE8D0 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF B0 B1 B2 B3
 :FFDE8E0 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF C0 C1 C2 C3
 :FFDE8F0 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF D0 D1 D2 D3
 :FFDE900 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF E0 E1 E2 E3
 :FFDE910 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF F0 F1 F2 F3
 :FFDE920 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF 00 01 02 03
 :FFDE930 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13
 :FFDE940 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23
 :FFDE950 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F 30 31 32 33
 :FFDE960 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42 43
 :FFDE970 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53
 :FFDE980 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 60 61 62 63
 :FFDE990 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F 70 71 72 73
 :FFDE9A0 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F 80 81 82 83
 :FFDE9B0 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F 90 91 92 93
 :FFDE9C0 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F A0 A1 A2 A3
 :FFDE9D0 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF B0 B1 B2 B3
 :FFDE9E0 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF C0 C1 C2 C3
 :FFDE9F0 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF D0 D1 D2 D3

 :FFDEA00 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF E0 E1 E2 E3
 :FFDEA10 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF F0 F1 F2 BD
 

The first two bytes are supposed to indicate the length of the packet, low-order byte first, and with the MSB of the second byte indicating if a CRC error has occurred. If a CRC error occurs, then no packet received interrupt is triggered, and the controller will keep trying to receive a valid packet, instead of marking the receive buffer full (the MEGA65 ethernet controller has two receive buffers, so that one can be processed while the other is receiving a packet).
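The header layout just described can be sketched as a small decoder. This assumes that all of the lower 7 bits of the second byte belong to the length, which is a guess, since the text above only pins down the MSB.

```python
def decode_rx_header(buf):
    """Decode the two-byte header at the start of a receive buffer.

    Returns (length, crc_error). ASSUMPTION: the low 7 bits of the
    second byte are the upper bits of the length.
    """
    length = buf[0] | ((buf[1] & 0x7F) << 8)   # low-order byte first
    crc_error = bool(buf[1] & 0x80)            # MSB of the second byte
    return length, crc_error
```

For the dump above, the header bytes 00 80 decode to a length of zero with the CRC error flag set.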

The byte $BD at the end of the packet is written by the ethernet controller as a handy marker so that if you have been receiving multiple packets, and want to see where the latest one ends, you can.  So, this tells us that the packet was indeed correctly received as being $A1F - $800 - (2 bytes length header) = $21D bytes long.  However, the length header in the first two bytes of the packet says that it is zero bytes long, and that there was a CRC error.  That the length header is wrong tells me that there is something fishy going on.  I am resynthesising with an option to ignore CRC errors, and to try to investigate a little deeper the writing of the length field.

So, synthesis has finally finished an hour and a half later, so I can try etherload again, this time with the ethernet CRC check disabled, and ... it works.  Moreover, there is no sign of the packets having any errors, as I can load a game, and the game runs fine.  This leaves me wondering what is going on, or more specifically, how an incorrect ethernet CRC is getting calculated on what seem to be perfectly correct packets.  To try to solve this riddle, I took a look at the last packet sent by etherload as received by a Nexys4 DDR board and by the MEGA65 r1 PCB. Here is the one from the Nexys4 board:

 :FFDE800 AE 00 FF FF FF FF FF FF 10 05 01 9F FC FD 08 00
 :FFDE810 45 00 00 9C B9 79 40 00 40 11 FC 85 C0 A8 01 02
 :FFDE820 C0 A8 01 FF CE 1F 11 9E 00 88 A1 55 A9 00 EA EA
 :FFDE830 EA EA EA EA A2 00 BD 44 68 9D 40 03 E8 E0 40 D0
 :FFDE840 F5 4C 40 03 A9 47 8D 2F D0 A9 53 8D 2F D0 A9 00
 :FFDE850 A2 0F A0 00 A3 00 5C EA A9 00 A2 00 A0 00 A3 00
 :FFDE860 5C EA 68 68 60 00 00 00 00 00 00 00 00 00 00 00
 :FFDE870 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 :FFDE880 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 :FFDE890 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 :FFDE8A0 00 00 00 00 00 00 00 00 00 00 00 00 41 0B 3F 4D
 :FFDE8B0 BD


Here we see our $BD end of frame marker, and just before it, four bytes that are the CRC.  So, everything is fine there, as we know it is, since etherload works fine on that board with CRC checking enabled.

Now, the same packet received by the MEGA65 r1 PCB:

 :FFDE800 A9 80 FF FF FF FF FF FF 10 05 01 9F FC FD 08 00
 :FFDE810 45 00 00 9C 52 C0 40 00 40 11 63 3F C0 A8 01 02
 :FFDE820 C0 A8 01 FF E8 8C 11 9E 00 88 86 E8 A9 00 EA EA
 :FFDE830 EA EA EA EA A2 00 BD 44 68 9D 40 03 E8 E0 40 D0
 :FFDE840 F5 4C 40 03 A9 47 8D 2F D0 A9 53 8D 2F D0 A9 00
 :FFDE850 A2 0F A0 00 A3 00 5C EA A9 00 A2 00 A0 00 A3 00
 :FFDE860 5C EA 68 68 60 00 00 00 00 00 00 00 00 00 00 00
 :FFDE870 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 :FFDE880 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 :FFDE890 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 :FFDE8A0 00 00 00 00 00 00 00 00 00 00 00 BD


Note that apart from some different values in the IP and UDP header fields, the ethernet frames are identical, except for the lack of a CRC field.  While this is rather confusing, as I have never seen ethernet frames lacking a CRC field before, it at least does explain the behaviour I am seeing.  I also confirmed that if I use my Mac instead of the Linux laptop, the same behaviour is seen on the receiving side.

The MEGA65 r1 PCB does use a different ethernet receiver IC.  Is it possible that this IC does automatic CRC checking, and simply trims the CRC field from the end of the packet?  If so, I can find no mention of this feature in the datasheet for it.  There is a way that this can be tested, however: Connect two MEGA65s back to back via ethernet, send a frame from one to the other, and see if the CRC that the one sent is received by the other.  The MEGA65's ethernet controller I have written in VHDL always sends a CRC, so this eliminates that question.  This is also a good idea because I want to test the sending of ethernet frames anyway: there is a problem with that, which I suspect is due to the timing of the TX bits compared to the 50MHz ethernet clock.

To do this, I wrote a little program that simply copies a sample ethernet frame to the TX buffer and sends the packet, whenever a key is pressed.  First time trying this, I can see that a packet is sent from that side, and received by the other, with the CRC missing.  However, it also showed up a problem with memory mapping, because while I can read from the packet RX buffer when I had used the MAP instruction to make it visible at $6800-$6FFF, I can't write to it. Instead writes are going to colour RAM. Using the serial monitor causes the same problem. Time for another synthesis run to fix that (found the wrong 2-bit constant in the CPU source code that was causing it)...

So, having fixed that memory mapping error, I can now send packets from the Nexys4DDR board to the MEGA65 r1 board, but no CRC is visible. Also, I discovered that the packet length must be set to one more than the number of bytes in the packet. Now, what about in the other direction, from the MEGA65 r1 PCB to the Nexys4 board?

Here we have some interesting things.   First, the data coming through is corrupted, specifically, it looks like the bits that have been transmitted in one cycle are actually often used in the following cycle, i.e., I am presenting the data on the opposite side of the ethernet TX clock compared to when I should be.  Here is the hexadecimal version of the packet as received at the other end:

 :0000428 47 80 FF FF FF FF FF FF 47 45 45 45 45 45 29 00
 :0000438 3F 50 55 5A 5F 54 55 5E 5F 78 7D 7A 7F 7C 7D 7E
 :0000448 7F A0 A5 AA AF B4 B5 BE BF A8 AD AA AF BC BD BE
 :0000458 BF F0 2F FA FF F4 00 05 28 15 3C 3C 3F A0 5F 3F
 :0000468 5A 3C 14 A5 A0 40 7F 5F F4 BD 00 00 00 00 00 00


The first two bytes are the length ($47) + CRC error flag, then we have the usual ethernet fields.  The clue that the bits are being read one cycle late is that the ethernet source address field is 474545454545, when it should be 414141414141.  The $47 has the upper two bits from the $FF of the last ethernet destination address rotated in, and then the $41 rotated left.  $41 = 01000001, so rotating it left by two bits and pulling in the 11 bits, we get 00000111, which isn't quite right. However, if we assume that each bit pair is the logical OR of the previous bit pair and the bits that are being sent now, then it makes sense: 01000001 OR 00000111 = 01000111 = $47.  This says to me quite strongly that it is this marginal timing issue.  Basically, by presenting the bits and clock at the same time, there isn't enough time for them to stabilise and replace the old values before the ethernet controller samples them.
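The OR hypothesis is easy to check against the capture. The sketch below is only a model of the symptom, not of the hardware: each received bit pair is taken to be the OR of the pair being sent and the pair sent just before it, with bit pairs going out least significant first. The function name is made up for this sketch.

```python
def smear(frame, prev=0x00):
    """Model each received di-bit as (current di-bit OR previous di-bit).

    `prev` is the byte transmitted immediately before `frame`.
    """
    out = []
    for cur in frame:
        # The stream delayed by one di-bit: shift left by two bits and
        # pull in the top two bits of the previous byte.
        late = ((cur << 2) | (prev >> 6)) & 0xFF
        out.append(cur | late)
        prev = cur
    return bytes(out)

# Source MAC and ethertype as they should have been sent, preceded by
# the last $FF of the destination address.
sent = bytes([0x41] * 6 + [0x08, 0x00])
print(smear(sent, prev=0xFF).hex(" "))
```

Under this model the intended bytes 41 41 41 41 41 41 08 00 come out as 47 45 45 45 45 45 29 00, which matches the corrupted capture above.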

Despite the difficulty that this glitch provides in determining if the CRC field is there, by repeatedly sending slightly different frames, I can see that the last four bytes of the frame before the $BD end of frame marker (which looks like a reverse = sign on the screen display) change each time. The only other byte that changes is one byte in the frame that I am changing on the transmit side.  You can play spot the differences with me in these shots: There is a single byte different in the body of the packets shown, so that the CRCs would be different, and then the CRC fields themselves:




So, these problems shouldn't be too hard to fix. The out-by-one length error I can very easily fix. The timing error will be a little more work, but not particularly hard. What I will probably do is use a 200MHz clock to drive the TX lines, and have a register that allows me to adjust the phase of the TX data bits with relation to the ethernet clock. That way I will only need to resynthesise once to be able to find the correct settings, which can then be baked into the next synthesis run after that.
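To see what adjustment range a 200MHz clock gives, here is a small back-of-the-envelope calculation. The idea of numbering the settings 0 to 3 is only an assumption about how such a phase register might be encoded.

```python
ETH_CLOCK_HZ = 50_000_000     # ethernet reference clock
FAST_CLOCK_HZ = 200_000_000   # proposed clock for driving the TX lines

def phase_options(eth_hz=ETH_CLOCK_HZ, fast_hz=FAST_CLOCK_HZ):
    """List (setting, delay in ns, phase in degrees) for each possible step."""
    steps = fast_hz // eth_hz          # fast clock ticks per ethernet clock cycle
    fast_period_ns = 1e9 / fast_hz     # length of one fast clock tick
    return [(n, n * fast_period_ns, n * 360 // steps) for n in range(steps)]

for setting, delay_ns, degrees in phase_options():
    print(f"setting {setting}: {delay_ns:.0f} ns delay, {degrees} degrees")
```

That gives four possible positions for the data within each 20ns clock cycle, in 5ns (90 degree) steps.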


So, adding the phase delay on the ethernet TX data lines has fixed the data corruption we were seeing. Here is how it looks now, sent from the MEGA65 r1 PCB to the Nexys 4 DDR board:

 :7776800 47 80 FF FF FF FF FF FF 41 41 41 41 41 41 08 00
 :7776810 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E
 :7776820 1F 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E
 :7776830 2F 30 0F 32 33 34 00 01 08 05 0C 0C 0F 20 17 0F
 :7776840 12 0C 04 21 20 91 51 83 E2


Now we see the MAC address being correctly formed, and all the bytes look correct. Also, as this was received by the Nexys4 DDR board, we see the ethernet CRC field.  However, it still thinks the CRC is wrong.

In the reverse direction, we still don't see the CRC field, so we see packets like this:

 :7776800 42 80 FF FF FF FF FF FF 41 41 41 41 41 41 08 00
 :7776810 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E
 :7776820 1F 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E
 :7776830 2F 30 EE 32 33 34 00 01 08 05 0C 0C 0F 20 17 0F
 :7776840 12 0C 04 21


What is nice is that the same TX line phase delay works in both directions, so we don't need to make that a setting specific to the type of board.

We also see that the number of bytes sent differs between them by one, that is, the MEGA65 r1 is sending one more byte than the Nexys4 DDR board is. This probably explains why the Nexys board sees an incorrect CRC, and is more of a concern.

What I think I will do next, is to send a frame to the r1 PCB, and use the debug mode on the ethernet controller to see the raw data lines, and see if we see the CRC bits arriving.  Here is what I see:

 :7776800 80 80 80 80 80 80 80 80 80 80 80 80 81 81 81 81
 :7776810 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81
 :7776820 81 81 81 81 81 81 81 81 81 81 81 83 83 83 83 83
 :7776830 83 83 83 83 83 83 83 83 83 83 83 83 83 83 83 83
 :7776840 83 83 83 83 81 80 80 81 81 80 80 81 81 80 80 81
 :7776850 81 80 80 81 81 80 80 81 81 80 80 81 80 82 80 80
 :7776860 80 80 80 80 83 83 80 80 80 80 81 80 81 80 81 80
 :7776870 82 80 81 80 83 80 81 80 80 81 81 80 81 81 81 80
 :7776880 82 81 81 80 83 81 81 80 80 82 81 80 81 82 81 80
 :7776890 82 82 81 80 83 82 81 80 80 83 81 80 81 83 81 80
 :77768A0 82 83 81 80 83 83 81 80 80 80 82 80 81 80 82 80
 :77768B0 82 80 82 80 83 80 82 80 80 81 82 80 81 81 82 80
 :77768C0 82 81 82 80 83 81 82 80 80 82 82 80 81 82 82 80
 :77768D0 82 82 82 80 83 82 82 80 80 83 82 80 81 83 82 80
 :77768E0 82 83 82 80 83 83 82 80 80 80 83 80 82 81 80 83
 :77768F0 82 80 83 80 83 80 83 80 80 81 83 80 80 80 80 80
 :7776900 81 80 80 80 80 82 80 80 81 81 80 80 80 83 80 80
 :7776910 80 83 80 80 83 83 80 80 80 80 82 80 83 81 81 80
 :7776920 83 83 80 80 82 80 81 80 80 83 80 80 80 81 80 80
 :7776930 81 80 82 80 80 80 02 80 03 82 00 83 03 81 00 80
 :7776940 02 81 02 80 00 81 02 83 00 00 00 00 00 00 00 00


Each byte in this capture is one 20ns time step on the ethernet interface.  Bit 7 is the "data valid" signal, and bits 0 and 1 are the data being read. Four of these make one byte of actual data, with the first pair of bits received forming the least significant bits. So, let's decode it. The long train of 81's followed by 83's is the ethernet preamble.  So we need to start from the second 83.  We then have the following 4 time steps making the following bytes:

$0000 : 83 83 83 83 = %11111111 = $FF
$0001 : 83 83 83 83 = %11111111 = $FF
$0002 : 83 83 83 83 = %11111111 = $FF
$0003 : 83 83 83 83 = %11111111 = $FF
$0004 : 83 83 83 83 = %11111111 = $FF
$0005 : 83 83 83 83 = %11111111 = $FF
$0006 : 81 80 80 81 = %01000001 = $41
$0007 : 81 80 80 81 = %01000001 = $41
$0008 : 81 80 80 81 = %01000001 = $41
$0009 : 81 80 80 81 = %01000001 = $41
$000a : 81 80 80 81 = %01000001 = $41
$000b : 81 80 80 81 = %01000001 = $41
$000c : 80 82 80 80 = %00001000 = $08
$000d : 80 80 80 80 = %00000000 = $00
$000e : 83 83 80 80 = %00001111 = $0F
$000f : 80 80 81 80 = %00010000 = $10
$0010 : 81 80 81 80 = %00010001 = $11
$0011 : 82 80 81 80 = %00010010 = $12
$0012 : 83 80 81 80 = %00010011 = $13
$0013 : 80 81 81 80 = %00010100 = $14
$0014 : 81 81 81 80 = %00010101 = $15
$0015 : 82 81 81 80 = %00010110 = $16
$0016 : 83 81 81 80 = %00010111 = $17
$0017 : 80 82 81 80 = %00011000 = $18
$0018 : 81 82 81 80 = %00011001 = $19
$0019 : 82 82 81 80 = %00011010 = $1A
$001a : 83 82 81 80 = %00011011 = $1B
$001b : 80 83 81 80 = %00011100 = $1C
$001c : 81 83 81 80 = %00011101 = $1D
$001d : 82 83 81 80 = %00011110 = $1E
$001e : 83 83 81 80 = %00011111 = $1F
$001f : 80 80 82 80 = %00100000 = $20
$0020 : 81 80 82 80 = %00100001 = $21
$0021 : 82 80 82 80 = %00100010 = $22
$0022 : 83 80 82 80 = %00100011 = $23
$0023 : 80 81 82 80 = %00100100 = $24
$0024 : 81 81 82 80 = %00100101 = $25
$0025 : 82 81 82 80 = %00100110 = $26
$0026 : 83 81 82 80 = %00100111 = $27
$0027 : 80 82 82 80 = %00101000 = $28
$0028 : 81 82 82 80 = %00101001 = $29
$0029 : 82 82 82 80 = %00101010 = $2A
$002a : 83 82 82 80 = %00101011 = $2B
$002b : 80 83 82 80 = %00101100 = $2C
$002c : 81 83 82 80 = %00101101 = $2D
$002d : 82 83 82 80 = %00101110 = $2E
$002e : 83 83 82 80 = %00101111 = $2F
$002f : 80 80 83 80 = %00110000 = $30
$0030 : 82 81 80 83 = %11000110 = $C6
$0031 : 82 80 83 80 = %00110010 = $32
$0032 : 83 80 83 80 = %00110011 = $33
$0033 : 80 81 83 80 = %00110100 = $34
$0034 : 80 80 80 80 = %00000000 = $00
$0035 : 81 80 80 80 = %00000001 = $01
$0036 : 80 82 80 80 = %00001000 = $08
$0037 : 81 81 80 80 = %00000101 = $05
$0038 : 80 83 80 80 = %00001100 = $0C
$0039 : 80 83 80 80 = %00001100 = $0C
$003a : 83 83 80 80 = %00001111 = $0F
$003b : 80 80 82 80 = %00100000 = $20
$003c : 83 81 81 80 = %00010111 = $17
$003d : 83 83 80 80 = %00001111 = $0F
$003e : 82 80 81 80 = %00010010 = $12
$003f : 80 83 80 80 = %00001100 = $0C
$0040 : 80 81 80 80 = %00000100 = $04
$0041 : 81 80 82 80 = %00100001 = $21
$0042 : 80 80 02 80 = %00100000 = $20 (some bits missing data valid)
$0043 : 03 82 00 83 = %11001011 = $CB (some bits missing data valid)
$0044 : 03 81 00 80 = %00000111 = $07 (some bits missing data valid)
$0045 : 02 81 02 80 = %00100110 = $26 (some bits missing data valid)
$0046 : 00 81 02 83 = %11100100 = $E4 (some bits missing data valid)
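The hand decoding above can also be done mechanically. This is a minimal sketch, assuming the capture format described earlier (bit 7 = data valid, bits 0 and 1 = data, first bit pair received is the least significant). The function names are made up, and the start-of-frame search simply looks for the first 11 bit pair, which is where the preamble ends.

```python
def find_frame_start(samples):
    """Index of the first sample after the start-of-frame delimiter.

    The preamble is a train of 01 bit pairs, and the delimiter ends with
    a single 11 bit pair, so the frame starts right after the first 11.
    """
    for i, s in enumerate(samples):
        if s & 0x80 and (s & 0x03) == 0x03:
            return i + 1
    raise ValueError("no start-of-frame delimiter found")

def dibits_to_bytes(samples):
    """Pack groups of four samples into bytes, first bit pair lowest."""
    out = []
    for i in range(0, len(samples) - 3, 4):
        value = 0
        for shift, s in zip((0, 2, 4, 6), samples[i:i + 4]):
            value |= (s & 0x03) << shift
        out.append(value)
    return bytes(out)

# A shortened capture: idle, preamble, delimiter, then four bytes of data.
capture = ([0x80, 0x80, 0x81, 0x81, 0x81, 0x83]
           + [0x83] * 4 + [0x81, 0x80, 0x80, 0x81]
           + [0x80, 0x82, 0x80, 0x80] + [0x82, 0x81, 0x80, 0x83])
start = find_frame_start(capture)
print(dibits_to_bytes(capture[start:]).hex(" "))
```

Note that this version ignores the data valid bit once the frame has started, so it decodes the trailing bytes regardless of what bit 7 is doing.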


So, this is VERY interesting.  The ethernet controller isn't filtering out the CRC, but is rather claiming that those bits are not data valid.  Given the very specific pattern, with one di-bit missing the data valid for the last data byte of the packet, and then two di-bits missing the data-valid signal for the CRC, and the same two each byte, I suspect that the ethernet controller might be signalling the end of the frame.  This would mean that it must be buffering at least five bytes' worth of received data, but that is not impossible.  Anyway, it explains where the CRC has gone. So, digging around a bit, I have found that the RX data valid signal is multiplexed with carrier sense on some PHY chips.  This looks like exactly what could be happening here (although the PHY receiver on the Nexys4 doesn't do this, as I have just re-confirmed), thus providing an explanation for what we are seeing.

So, time to resynthesise again, and see if this gets us CRCs received on the MEGA65 r1 PCB.  That should just leave the CRC checksum problems, if they are still occurring after that change (which I expect that they will).
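One possible way of coping with a combined carrier sense / data valid line is to treat the frame as finished only when the line has been low for two time steps in a row. The sketch below applies that rule to capture bytes in the debug format used above. It is an illustration of the idea only, not the change made in the VHDL, and the function names are made up.

```python
def recover_dibits(samples):
    """Return the data bit pairs of a frame, tolerating a toggling valid line.

    A sample with the valid bit (bit 7) low is still accepted as data
    if the following sample has the valid bit high.
    """
    out = []
    for i, s in enumerate(samples):
        if not s & 0x80:
            nxt = samples[i + 1] if i + 1 < len(samples) else 0x00
            if not nxt & 0x80:
                break                 # low twice in a row: the frame has ended
        out.append(s & 0x03)
    return out

def pack_dibits(dibits):
    """Pack bit pairs into bytes, first bit pair lowest."""
    return bytes(
        dibits[i] | dibits[i + 1] << 2 | dibits[i + 2] << 4 | dibits[i + 3] << 6
        for i in range(0, len(dibits) - 3, 4)
    )

# The tail of the capture above, from byte $0041 onwards, then idle.
tail = [0x81, 0x80, 0x82, 0x80,  0x80, 0x80, 0x02, 0x80,
        0x03, 0x82, 0x00, 0x83,  0x03, 0x81, 0x00, 0x80,
        0x02, 0x81, 0x02, 0x80,  0x00, 0x81, 0x02, 0x83,
        0x00, 0x00, 0x00, 0x00]
print(pack_dibits(recover_dibits(tail)).hex(" "))
```

Applied to the end of the capture, this recovers the last data byte and the four CRC bytes (20 CB 07 26 E4) that a strict data valid check throws away.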

Indeed, success! I can now receive the last byte and CRC of a packet on the MEGA65 r1 PCB.  However, I still have a problem with CRCs.  I know that the CRC problem is on the sending side, because I can receive packets sent from my laptop without difficulty -- it is only packets sent from the MEGA65 that have this problem.

My planned approach to investigating this was to capture some good packets, find or write an ethernet CRC checking program, confirm it works for those, and then see how the MEGA65-originated packets fare -- and whether there is some mutation of the packet data that will make the CRC correct. However, then I decided to take a closer look at the CRC generation code in the ethernet controller, and get that to provide me with the list of bytes that it thought it was CRCing, to make sure that there was nothing strange going on. In the process of that, I found that the data valid input to the CRC calculator was remaining high, while clocking the CRC out at the end of a packet.  Thus, only the first two bits of the CRC would be correct, and the rest would be wrong. So, off to synthesis again, to see if this fixes the problem.
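An ethernet CRC checking program of the kind mentioned above needs very little code, since the ethernet frame check sequence uses the same CRC-32 as zlib. Here is a minimal sketch, assuming the frame is given without preamble and that the four CRC bytes appear in the buffer least significant byte first. The function names and the sample frame are made up.

```python
import zlib

def ethernet_fcs(frame):
    """CRC field for a frame (destination address through payload)."""
    return zlib.crc32(frame).to_bytes(4, "little")

def fcs_ok(frame_with_fcs):
    """Check a received frame whose last four bytes are the CRC field."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return ethernet_fcs(body) == fcs

# Example: a made-up broadcast frame from the hardwired 41:41:41:41:41:41 address.
frame = bytes.fromhex("ffffffffffff4141414141410800") + bytes(range(0x0F, 0x3D))
good = frame + ethernet_fcs(frame)
bad = good[:20] + bytes([good[20] ^ 0x01]) + good[21:]
print(fcs_ok(good), fcs_ok(bad))
```

Running `fcs_ok` over a known good capture first is a useful sanity check of the checker itself before pointing it at the suspect frames.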

Testing with this fix, it still wasn't working.  So I took a known good packet sent by my laptop to the MEGA65, and got the other MEGA65 to send it, so that I could compare the CRC with that of the good packet, to try to get some handle on what was still going wrong. I was really quite frustrated at this point, because I had gone through the relevant code carefully, and thought I had understood what was going on, and with the help of simulation, confirmed that it was doing the right thing.  So I was somewhat relieved when I realised what the problem with the CRC was.  Here is the good and the bad CRC:

GOOD: $C2F7B15F = binary 11000010 11110111 10110001 01011111
 BAD: $C1FB72AF = binary 11000001 11111011 01110010 10101111

Looking at the hex, I could see that there were strong similarities, much more so than if the CRC was just plain wrong. But it took me a little while to realise that the two bits in each pair were simply swapped: The routine that copies the CRC bits out, two at a time, for transmission was putting them onto the wrong TX lines. So, it's off for a few hours of synthesis again to fix this up... and after 10,238 seconds of synthesis, we finally have ethernet transmission with working CRC generation.
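The relationship between the two values can be confirmed in a couple of lines by swapping the two bits within every bit pair of the good CRC. The function name is made up for this sketch.

```python
def swap_dibit_lines(crc):
    """Swap the two bits within each bit pair of a 32-bit value."""
    odd_bits = crc & 0xAAAAAAAA    # bits 1, 3, 5, ...
    even_bits = crc & 0x55555555   # bits 0, 2, 4, ...
    return (odd_bits >> 1) | (even_bits << 1)

print(f"${swap_dibit_lines(0xC2F7B15F):08X}")
```

Swapping the pairs of $C2F7B15F gives exactly $C1FB72AF, and swapping again gets back to the good value.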

The only program I have that does any ethernet transmission at the moment is the etherload program, which in theory listens for ARP requests. It would be nice if it also listened for PING packets and replied, but you can't have everything. However, pinging its IP address from my laptop does now result in ARP succeeding, with the very uncreative MAC address hardwired into etherload:

paul@F96NG92-L:~/Projects/mega65/mega65-core$ ping 192.168.1.65
PING 192.168.1.65 (192.168.1.65) 56(84) bytes of data.
^C
--- 192.168.1.65 ping statistics ---
7 packets transmitted, 0 received, 100% packet loss, time 6140ms

paul@F96NG92-L:~/Projects/mega65/mega65-core$ arp -na

...
? (192.168.1.65) auf 40:40:40:40:40:40 [ether] auf enp0s31f6...

What would be nice, would be if etherload responded to pings, and also read out the MAC address it should use from the MIIM, now that I know how to use it. In fact, it would be nice if the ethernet controller provided simple register based access to the MIIM registers, and had an option to automatically populate frames with its MAC address.  I'll add these to the queue.  But for now, I am happy to finally have ethernet working in a solid way, for both transmit and receive, and will move on to testing the HDMI port, and now that I remember, the last aspects of the 3.5" floppy drive interface.

2 comments:

  1. Very interesting post! I'm not sure, however (due to my limited English knowledge, I guess), that I got everything. Do you mean that there is no way the M65 knows its own MAC address, i.e. by reading it somehow? Does the board have a unique MAC address for every ethernet controller chip, or do you need to assign one? Maybe if it's possible to assign one (even if it has one, in the sense that you just send frames with a MAC address you choose when TX'ing a frame), you can work around that situation by using one hard-wired into your test program or so. I don't know too much about ethernet controllers, but I enjoyed, for example, Microchip's ENC28J60 (if I remember the part number exactly now). As far as I remember, it does not have any MAC address set by default (you need to assign one); however, it has nice capabilities, like the "magic filter" which can ease the load of the MCU/CPU interfacing (via SPI, btw) with the 28J60, allowing it to only accept frames targeting the MAC address set, or the broadcast MAC address (to implement the ARP resolution protocol), so it's efficient to implement a network stack then, even if the ethernet segment is otherwise quite "noisy" with frames not targeting us. Hmm, now I'm thinking of trying to implement ethernet emulation in Xemu using a Linux TAP device, so I can "see" and send raw ethernet frames from software (AFAIK OSX, for example, has only a TUN interface by default, so it's layer-3 not layer-2 access unfortunately; Linux has both TUN and TAP).

    Replies
    1. MIIM-compliant ethernet PHYs have a MAC address stored in a special register. However, I am not (yet) reading that out to use it. This is on my list of things to do. Likewise, I can add MAC filtering at the hardware level. These are not particularly hard things to do.
