Transport Layer Innovation: RDMA
During my undergraduate days at UC Berkeley in the 1980’s, I remember climbing through the attic of Cory Hall running 10Mbit/sec coaxial cables to professors’ offices. Man, that 10base2 coax was fast!! Here we are in 2014 right on the verge of 100Gbit/sec networks. Four orders of magnitude increase in bandwidth is no small engineering feat, and achieving 100Gb/s network communications requires innovation at every level of the seven layer OSI model.
To tell you the truth, I never really understood the top three layers of this OSI model: I prefer the TCP/IP model which collapses all of them into a single “Application” layer which makes more sense. Unfortunately, it also collapses the Link layer and the Physical layer and I actually don’t think this makes sense to combine these two. I like to build my own ‘hybrid’ model that collapses the top three layers into an Application layer but allows you to consider the Link and Physical layers separately.
It turns out that a tremendous amount of innovation is required in these bottom four layers to achieve effective 100Gb/s communications networks. The application layer needs to change as well to fully take advantage of 100Gb/s networks. For now we’ll focus on the bottom four layers. Continue reading
In 1967, Gene Amdahl developed a formula that calculates the overall efficiency of a computer system by analyzing how much of the processing can be parallelized and the amount of parallelization that can be applied in the specific system.
At that time, deeper performance analysis had to take into consideration the efficiency of three main hardware resources that are needed for the computation job: the compute, memory and storage.
On the compute side, efficiency has to be measured by how many threads can run in parallel (which depends on the number of cores). The memory size affects the percentage of IO operation that needs to access the storage, which slows significantly the execution time and the overall system efficiency.
Those three hardware resources worked very well until the beginning of 2000. At that time, the computer industry started to use a grid-computing or as it known today, scale-out systems. The benefits of the scale-out architecture are clear. It enables building systems with higher performance, easy to scale with built-in high availability at a lower cost. However, the efficiency of those systems heavily depend on the performance and the resiliency of the interconnect solution.
The importance of the Interconnect became even bigger in the virtualized data center, where the amount of east west traffic continues to grow (as more parallel work is being done). So, if we want to use Amdahl’s law to analyze the efficiency of the scale-out system, in addition to the three traditional items (compute, memory & storage) the fourth item, which is the Interconnect, has to be considered as well.
Last year, Open Compute Project (OCP) launched a new network project focused on developing operating system agnostic switches to address the need for a highly efficient and cost effective open switch platform. Mellanox Technologies collaborated with Cumulus Networks and the OCP community to define unified and open drivers for the OCP switch hardware platforms. As a result, any software provider can now deliver a networking operating system to the open switch specifications on top of the Open Network Install Environment (ONIE) boot loader.
At the upcoming OCP Summit, Mellanox will present recent technical advances such as loading Net-OS on an x86 system with ONIE, OCP platform control using Linux sysfs calls, full L2 and L3 Open Ethernet Switch API, and also demonstrate Open SwitchX SDK. To support this, Mellanox developed SX1024-OCP, a SwitchX®-2-based TOR switch which supports 48 10GbE SFP+ ports and up to 12 40GbE QSFP ports.
The SX1024-OCP enables non-blocking connectivity within the OCP’s Open Rack and 1.92Tb/s throughput. Alternatively, it can enable 60 10GbE server ports when using QSFP+ to SFP+ breakout cables to increase rack efficiency for less bandwidth demanding applications.
Mellanox also introduced SX1036-OCP, a SwitchX-2-based spine switch, which supports 36 40GbE QSFP ports. The SX1036-OCP enables non-blocking connectivity between the racks. These open source switches are the first switches on the market to support ONIE over x86 dual core processors.