Without considering the nuances of buffer architecture, it is difficult for customers to ascertain whether what they see in the datasheet is what they actually get. This blog presents three important buffer-related questions customers can ask themselves to gain better insight into system-level performance.
I want to cover this topic in two parts. In this first part, I will go over the possible architectural choices for on-chip switch packet buffers and their implications. The second part will be dedicated to off-chip “ultra-deep” packet buffer architectures.
The basic purpose of a packet buffer is to provide three functions:
Every other functionality can be mapped to one of these three areas. Ask these three questions to ensure what you see is what you will get.
Question 1: Is the switch buffer a single unit or is it made of multiple fragmented slices?
As one can imagine, a single unit packet buffer is much more efficient than a split buffer architecture. With a single buffer, a congested queue can occupy a substantial portion of the packet buffer. With a 4-way split buffer, the congested queue can occupy 25 percent of the buffer at best (See Figure 1). This leads to very poor burst absorption capabilities. Also, queues belonging to the same port can physically reside in different buffer slices. Scheduling across slices often leads to port-level and flow-level unfairness.
Mellanox Spectrum has a single unit packet buffer which is dynamically shared across all ports. Datasheets for other products just state the total buffer capacity as an aggregate sum of the slices. Dig deeper to find out whether the buffer is a single unit or a set of fragmented slices.
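To make the 25 percent figure concrete, here is a minimal Python sketch of burst absorption. It rests on simplifying assumptions that are mine, not the datasheet's: the congested queue is pinned to one slice, all slices are the same size, the buffer starts empty, and nothing drains while the burst arrives. The function name and the 16 MiB and 10 MiB sizes are purely illustrative.

```python
def absorb_burst(burst_bytes, total_buffer_bytes, slices=1):
    """Return (buffered, dropped) bytes for a burst aimed at one congested queue.

    Assumptions (illustrative only): the queue lives in exactly one slice,
    slices are equally sized, the buffer is empty, and no draining occurs.
    """
    slice_bytes = total_buffer_bytes // slices  # space the queue can reach
    buffered = min(burst_bytes, slice_bytes)
    return buffered, burst_bytes - buffered


MIB = 1024 * 1024
TOTAL = 16 * MIB   # hypothetical total on-chip buffer
BURST = 10 * MIB   # hypothetical incast burst into one queue

single = absorb_burst(BURST, TOTAL, slices=1)  # whole buffer is reachable
split = absorb_burst(BURST, TOTAL, slices=4)   # only one quarter is reachable

print("single unit: buffered %d MiB, dropped %d MiB" % (single[0] // MIB, single[1] // MIB))
print("4-way split: buffered %d MiB, dropped %d MiB" % (split[0] // MIB, split[1] // MIB))
```

Both configurations advertise the same total capacity, yet in this model the split buffer drops most of the burst because the queue can only reach its own slice.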
Question 2: Can the packet buffer system sustain line rate bandwidth from all ports simultaneously?
A line rate packet buffer not only gives better performance, it also makes the rest of the system simpler and more elegant. With an over-subscribed buffer, packets can be dropped even before the forwarding table lookup or packet classification (See Figure 2). This means the packet drops are indiscriminate and port isolation is not guaranteed. Scheduling packets out of an over-subscribed packet buffer can also be tricky and inefficient. This results in very poor flow isolation and inefficient utilization of bandwidth. Note that TCP typically halves its congestion window when it detects a loss, so dropping even a few packets during congestion can have a drastic impact on TCP goodput.
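The line rate question comes down to simple arithmetic, sketched below in Python. This is a rough model under my own assumptions: every port offers traffic at full line rate at the same time, and the buffer admits traffic up to a single aggregate write bandwidth. The function name, the 32 x 100GbE port count, and the bandwidth figures are hypothetical and do not describe any specific product.

```python
def indiscriminate_drop_fraction(port_speeds_gbps, buffer_write_gbps):
    """Fraction of offered traffic the buffer cannot admit at full load.

    Assumption (illustrative only): all ports send at line rate at once,
    and anything beyond the buffer's write bandwidth is dropped before
    lookup or classification, regardless of port or priority.
    """
    offered = sum(port_speeds_gbps)
    if offered <= buffer_write_gbps:
        return 0.0  # buffer sustains line rate from all ports
    return 1.0 - buffer_write_gbps / offered


ports = [100] * 32  # hypothetical 32 x 100GbE platform, 3200 Gb/s offered

line_rate = indiscriminate_drop_fraction(ports, buffer_write_gbps=3200)
oversubscribed = indiscriminate_drop_fraction(ports, buffer_write_gbps=2400)

print("line rate buffer drop fraction:", line_rate)
print("over-subscribed buffer drop fraction:", oversubscribed)
```

In this model the over-subscribed buffer discards a quarter of the offered traffic before the switch knows which port, flow, or priority the packets belong to, which is why port isolation cannot be guaranteed.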
Mellanox Spectrum Packet Buffer supports full line rate. Datasheets for other products sometimes just state the total port capacity as the platform throughput. Dig deeper to find out if the packet buffer can sustain the port line rate.
Question 3: How is buffer occupancy accounting done for ECN, WRED, PFC?
Congestion management algorithms are based on “buffer occupancy”. The definition of buffer occupancy is straightforward when the buffer is a single unit. It becomes complex when the buffer is split n-ways. For example, if only one slice of a 2-way split buffer is experiencing congestion, the system will have to react as if the entire buffer is congested. This is not only sub-optimal; it simply does not work for high performance applications such as Ethernet Storage and Deep Learning. Since the existing dominant on-chip buffer solutions are not working, customers are looking to adopt expensive “ultra-deep” buffer solutions.
Mellanox’s Spectrum Packet Buffer is a single unit, so buffer occupancy calculations are straightforward. Spectrum supports ECN, WRED and PFC with no inefficiencies. Datasheets for other products often just state that they support the congestion management protocols. Again, dig deeper to find out how they do the buffer occupancy accounting, and look for inefficiencies if they have a fragmented buffer.
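The 2-way split example can be sketched in a few lines of Python. This is an assumed, simplified model of occupancy accounting rather than any vendor's actual algorithm: a fragmented buffer reacts to its fullest slice, while a single unit buffer reacts to overall occupancy. The function name, the 80 percent threshold, and the byte counts are hypothetical.

```python
def should_signal_congestion(used_mb, capacity_mb, threshold=0.8, fragmented=True):
    """Decide whether to trigger ECN marking, WRED drops, or PFC pause.

    Assumption (illustrative only): a fragmented buffer must react to its
    most congested slice; a single unit buffer uses aggregate occupancy.
    """
    if fragmented:
        occupancy = max(u / c for u, c in zip(used_mb, capacity_mb))
    else:
        occupancy = sum(used_mb) / sum(capacity_mb)
    return occupancy >= threshold


# Hypothetical 20 MB of buffer: one slice is 90% full, the other 10% full.
used = [9, 1]
capacity = [10, 10]

split_reacts = should_signal_congestion(used, capacity, fragmented=True)
single_reacts = should_signal_congestion(used, capacity, fragmented=False)

print("2-way split signals congestion:", split_reacts)
print("single unit signals congestion:", single_reacts)
```

With the same 10 MB in use, the split model signals congestion while half of the total buffer sits idle, whereas the single unit model sees 50 percent occupancy and keeps forwarding.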
The Bottom Line
Mellanox Spectrum Open Ethernet Switches have the best on-chip buffer architecture. They support line rate at 10GbE/25GbE/50GbE/100GbE speeds. They provide the flow isolation, burst absorption and intelligent congestion management that are critical for Cloud, Storage, Deep Learning and other high performance applications. Additionally, Spectrum is open and disaggregated. Today you can run your choice of Cumulus Linux or Mellanox OS operating system on the platform, and Mellanox will support even more options going forward. Explore more at: http://www.mellanox.com/open-ethernet/ and http://www.mellanox.com/tolly/.
Up next, in part two, we will discuss ultra-deep packet buffer architectures, so stay tuned.