10G Ethernet Frames

Posted by fmadio | 100G Ethernet

In the previous post we went over the components of a 10G Layer 1 ethernet system (MAC + PCS + PMA + MDI), yet for those without a hardware background there are some more fundamental problems that need addressing before drilling deep into the guts of the Layer 1 dungeon. Hardware systems are just different. It's not orthogonal to software, it's more a parallel universe connected via a wormhole. You need to think differently, deconstruct problems differently and use completely different data structures and algorithms, so let's start with a seemingly simple problem.

In hardware how do you know when a packet starts and when it ends?

  • 10G ethernet layer 1

If you're coding in C with the usual BSD sockets library you know exactly where and when a packet starts/ends (assume this is a raw socket, or udp socket etc etc).

    (run of the mill raw/udp socket code)

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    int fd = socket(AF_INET, SOCK_DGRAM, 0);    /* udp socket, a raw socket looks the same from here */
    . . .
    ssize_t RecvLen = recvfrom(fd, Buffer, BufferLen, 0, NULL, NULL);
    if (RecvLen > 0)
    {
        /* Buffer[0]            <---- packet starts here */
        /* Buffer[RecvLen-1]    <---- packet ends here   */
    }

How about going down a level, is there some special magic the network card's driver does to find the start/end of a packet? Nope, pretty much the same, except it's represented in a different, more hardware friendly data structure that contains an io address + packet length + other info.

    (ripped from solarflare kernel driver)

    . . .
    rx_buf = efx_rx_buffer(rx_queue, index);
    rx_buf->dma_addr = dma_addr + EFX_PAGE_IP_ALIGN;        /* <-- io address of start of packet */
    rx_buf->page = page;
    rx_buf->page_offset = page_offset + EFX_PAGE_IP_ALIGN;
    rx_buf->len = efx->rx_dma_len;                          /* <-- length of the packet at that address */
    rx_buf->flags = 0;
    . . .

...but in hardware it's just a continuous stream of bits every cycle, so how do you encode packet length? Packet start?


Enter the world of framing. In the hardware world it's done differently. Post MAC, the hardware application code will typically have 4 (or more) signals per packet bus: a valid signal, a data bus, a start of packet signal and an end of packet signal. This makes it dead simple for the code to know when a packet starts/ends, as there are dedicated signals for it! An example definition in verilog is below.

    (example verilog definition)

    . . .
    // packet bus
    wire        packet_vld_s;
    wire        packet_sof_s;     // Start of Frame
    wire        packet_eof_s;     // End of Frame
    wire [63:0] packet_data_s;
    . . .

... and a simple example of what it looks like clock by clock. Here there are 7 cycles of valid data, so 7 * 64 bits = 448 bits, which is a 56 byte packet. That makes it an illegal runt packet, i.e. less than the minimum packet size of 64 bytes.
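For the software minded, here is a rough C model of that packet bus. It is only a sketch: the names `bus_cycle_t`, `packet_length_bytes` and `is_runt` are made up for illustration, and it assumes every valid cycle carries a full 8 bytes (a real bus would usually carry extra signals saying how many bytes of the last cycle are valid).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One clock cycle of the packet bus (hypothetical C model of the 4 signals). */
typedef struct {
    bool     vld;   /* packet_vld_s  */
    bool     sof;   /* packet_sof_s  */
    bool     eof;   /* packet_eof_s  */
    uint64_t data;  /* packet_data_s */
} bus_cycle_t;

#define BUS_BYTES_PER_CYCLE 8u    /* 64 bit data bus */
#define ETH_MIN_FRAME_BYTES 64u   /* minimum legal ethernet frame */

/*
 * Walk the cycles and return the byte length of the first complete packet,
 * counting valid cycles from SOF through EOF. Returns 0 if no complete
 * packet is found. Assumption: every valid cycle carries a full 8 bytes.
 */
size_t packet_length_bytes(const bus_cycle_t *cycles, size_t n_cycles)
{
    assert(cycles != NULL || n_cycles == 0);

    bool   in_packet = false;
    size_t bytes = 0;

    for (size_t i = 0; i < n_cycles; i++) {
        if (!cycles[i].vld)
            continue;                 /* idle cycle, nothing on the bus */
        if (cycles[i].sof) {
            in_packet = true;         /* packet starts here */
            bytes = 0;
        }
        if (!in_packet)
            continue;                 /* valid data outside a frame, skip it */
        bytes += BUS_BYTES_PER_CYCLE;
        if (cycles[i].eof)
            return bytes;             /* packet ends here */
    }
    return 0;                         /* never saw a complete SOF..EOF pair */
}

/* A runt is a complete packet shorter than the 64 byte minimum. */
bool is_runt(size_t packet_bytes)
{
    return packet_bytes > 0 && packet_bytes < ETH_MIN_FRAME_BYTES;
}
```

Feed it 7 valid cycles with SOF on the first and EOF on the last and it reports 56 bytes, which `is_runt` flags as a runt, matching the arithmetic above. The point is that the length is never stored anywhere, it falls out of counting cycles between the two markers.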



So what's all of this got to do with the PCS 64b/66b encoding? Not much to be perfectly honest, but we will get there shortly. As we saw above, we need a way to encode the side channel information into a single data stream. Using the above example, we are encoding 64 bits worth of data using:

    64 (packet_data_s) + 1 (packet_vld_s) + 1 (packet_sof_s) + 1 (packet_eof_s) = 67 bits

In this sense it's a 64b/67b encoding, though certainly not a real encoding protocol, more like an intermediate data structure.
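To make the "intermediate data structure" idea concrete, here is a minimal C sketch that folds the three side channel bits and the 64 data bits into one 67 bit word. The type `word67_t`, the function names and the bit layout (data first, little endian, flags in the 9th byte) are all assumptions picked for illustration, not anything from a standard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag positions inside the 9th byte. */
#define FLAG_VLD 0x01u   /* packet_vld_s */
#define FLAG_SOF 0x02u   /* packet_sof_s */
#define FLAG_EOF 0x04u   /* packet_eof_s */

/* 67 bits stored in 9 bytes: the top 5 bits of bytes[8] are unused. */
typedef struct {
    uint8_t bytes[9];
} word67_t;

/* Fold data + side channel signals into a single word. */
word67_t word67_pack(uint64_t data, bool vld, bool sof, bool eof)
{
    word67_t w;
    for (int i = 0; i < 8; i++)
        w.bytes[i] = (uint8_t)(data >> (8 * i));    /* little endian data */
    w.bytes[8] = (uint8_t)((vld ? FLAG_VLD : 0u) |
                           (sof ? FLAG_SOF : 0u) |
                           (eof ? FLAG_EOF : 0u));
    return w;
}

/* Recover the original 4 signals from the packed word. */
void word67_unpack(const word67_t *w, uint64_t *data,
                   bool *vld, bool *sof, bool *eof)
{
    assert(w && data && vld && sof && eof);

    uint64_t d = 0;
    for (int i = 0; i < 8; i++)
        d |= (uint64_t)w->bytes[i] << (8 * i);

    *data = d;
    *vld  = (w->bytes[8] & FLAG_VLD) != 0;
    *sof  = (w->bytes[8] & FLAG_SOF) != 0;
    *eof  = (w->bytes[8] & FLAG_EOF) != 0;
}
```

Packing and then unpacking gives back the same data and flags, so nothing is lost by squeezing the side channel into the data stream. The cost is the 3 extra bits for every 64 bits of payload.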

That's the basics of framing at the hardware level, and hopefully you can see that something trivial in the software world, such as where a packet starts, has quite a lot of effort and thought behind it. Next up we will discuss the XGMII framing protocol, which is formally specified in IEEE 802.3 Clause 46 and is used to communicate between the MAC and PCS, specifically the MAC and the 64b/66b encoder.