Network and Link Layer Innovation: Lossless Networks
In a previous post, I discussed how innovation is required at every layer of the communications protocol stack to take advantage of 100Gb/s networks, starting with the need for RDMA at the transport layer. So now let’s look at the requirements at the next two layers of the protocol stack. It turns out that RDMA transport requires innovation at the Network and Link layers in order to provide a lossless infrastructure.
‘Lossless’ in this context does not mean that the network can never lose a packet, as some level of noise and data corruption is unavoidable. Rather, by ‘lossless’ we mean a network designed to avoid intentional, systematic packet loss as a means of signaling congestion. That is, packet loss is the exception rather than the rule.
Lossless networks can be achieved by using priority flow control at the link layer, which allows packets to be forwarded only if buffer space is available in the receiving device. In this way, buffer overflow and packet loss are avoided and the network becomes lossless.
In the Ethernet world, this is standardized as IEEE 802.1Qbb Priority Flow Control (PFC) and is equivalent to putting stop lights at each intersection. A packet in a given priority class can only be forwarded when the light is green.
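The stop-light idea can be illustrated with a toy simulation. This is only a minimal sketch under simplifying assumptions, not how a real switch implements PFC: actual hardware signals with pause frames and buffer thresholds, whereas here the sender simply checks the receiver's free space directly. The names `Receiver`, `can_accept` and `simulate`, the buffer capacity, and the send and drain rates are all hypothetical choices made for illustration.

```python
from collections import deque


class Receiver:
    """Receiving port with one bounded buffer per priority class."""

    def __init__(self, num_priorities=8, capacity=4):
        self.capacity = capacity
        self.queues = [deque() for _ in range(num_priorities)]
        self.dropped = 0

    def can_accept(self, priority):
        # "Green light": this priority class still has free buffer space.
        return len(self.queues[priority]) < self.capacity

    def receive(self, priority, packet):
        if not self.can_accept(priority):
            self.dropped += 1  # buffer overflow: the packet is lost
            return
        self.queues[priority].append(packet)

    def drain(self, priority, count):
        # Forward up to `count` buffered packets onward, freeing space.
        for _ in range(count):
            if self.queues[priority]:
                self.queues[priority].popleft()


def simulate(flow_control, ticks=100, send_per_tick=2, drain_per_tick=1,
             priority=3):
    """Offer more traffic than the receiver can drain; return (sent, dropped)."""
    rx = Receiver()
    sent = 0
    for _tick in range(ticks):
        for _slot in range(send_per_tick):
            if flow_control and not rx.can_accept(priority):
                break  # "Red light": hold the packet at the sender
            rx.receive(priority, sent)
            sent += 1
        rx.drain(priority, drain_per_tick)
    return sent, rx.dropped


if __name__ == "__main__":
    print("with flow control   :", simulate(flow_control=True))
    print("without flow control:", simulate(flow_control=False))
```

With flow control enabled, the sender is throttled to the rate the receiver can drain and nothing is dropped; without it, the sender transmits everything it has and the overflow is lost at the receiver.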
During my undergraduate days at UC Berkeley in the 1980s, I remember climbing through the attic of Cory Hall running 10Mbit/sec coaxial cables to professors’ offices. Man, that 10BASE2 coax was fast!! Here we are in 2014, right on the verge of 100Gbit/sec networks. A four-orders-of-magnitude increase in bandwidth is no small engineering feat, and achieving 100Gb/s network communications requires innovation at every level of the seven-layer OSI model.
To tell you the truth, I never really understood the top three layers of the OSI model. I prefer the TCP/IP model, which collapses all of them into a single “Application” layer and makes more sense. Unfortunately, it also collapses the Link layer and the Physical layer, and I don’t think it makes sense to combine these two. I like to build my own ‘hybrid’ model that collapses the top three layers into an Application layer but allows you to consider the Link and Physical layers separately.
It turns out that a tremendous amount of innovation is required in these bottom four layers to achieve effective 100Gb/s communications networks. The application layer needs to change as well to fully take advantage of 100Gb/s networks. For now we’ll focus on the bottom four layers.
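To make the layer bookkeeping concrete, here is a small sketch that maps each of the seven OSI layers onto the TCP/IP model and onto the five-layer ‘hybrid’ model described above. The function names `tcpip_layer` and `hybrid_layer` are made up for this illustration, and the TCP/IP layer names follow one common convention (Link, Internet, Transport, Application).

```python
# The seven OSI layers, numbered from the bottom of the stack.
OSI_LAYERS = {
    1: "Physical",
    2: "Data Link",
    3: "Network",
    4: "Transport",
    5: "Session",
    6: "Presentation",
    7: "Application",
}


def _check(osi_number):
    if osi_number not in OSI_LAYERS:
        raise ValueError(f"not an OSI layer number: {osi_number!r}")


def tcpip_layer(osi_number):
    """TCP/IP model: top three layers merge, and so do the bottom two."""
    _check(osi_number)
    if osi_number >= 5:
        return "Application"
    return {4: "Transport", 3: "Internet", 2: "Link", 1: "Link"}[osi_number]


def hybrid_layer(osi_number):
    """Hybrid model: top three layers merge, Link and Physical stay separate."""
    _check(osi_number)
    if osi_number >= 5:
        return "Application"
    return {4: "Transport", 3: "Network", 2: "Link", 1: "Physical"}[osi_number]


if __name__ == "__main__":
    for number in sorted(OSI_LAYERS, reverse=True):
        print(f"{number} {OSI_LAYERS[number]:<12} "
              f"TCP/IP: {tcpip_layer(number):<12} "
              f"hybrid: {hybrid_layer(number)}")
```

The hybrid model ends up with five layers, and the "bottom four" are Physical, Link, Network and Transport.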
Big data is for real, but it places heavy demands on IT teams, who have to pull together and provision cloud infrastructure, then offer big data application deployments with validated performance to meet pressing business decision timelines. QualiSystems is partnering with Mellanox to simplify big data deployments over any cloud infrastructure, enabling IT teams to meet line of business needs while reducing operational costs.
We have submitted several speaking sessions for the upcoming OpenStack Summit in Paris, France. Please review the descriptions for each session below, click to vote and share these presentations with your colleagues. Remember: Voting closes on Wednesday, August 6, 2014 at 11:59 pm CDT.
Taking another step towards enabling a world of truly open Ethernet switches, Mellanox recently became the first vendor to release an open source implementation of Multi-Chassis Link Aggregation Group, or, as it is more commonly known, MLAG.
Mellanox is involved in and contributes to other open source projects, such as OpenStack, ONIE, Puppet and others, and has already contributed certain adapter applications to the open source community. Mellanox is the first and only vendor to open-source its switch SDK API. Mellanox is also a leading member of and contributor to the Open Compute Project, where it provides NICs, switches and software.
One of the biggest catchphrases in modern science is the Human Genome, the DNA coding that largely pre-determines who we are and many of our medical outcomes. By mapping and analyzing the structure of the human genetic code, scientists and doctors have already started to identify the causes of many diseases and to pinpoint effective treatments based on the specific genetic sequence of a given patient. With the advanced data that such analysis provides, doctors can offer more targeted strategies for potentially terminal patients at times when no other clinically relevant treatment options exist.
Dell Fluid Cache for SAN is enabled by ConnectX®-3 10/40GbE Network Interface Cards (NICs) with Remote Direct Memory Access (RDMA). The Dell Fluid Cache for SAN solution reduces latency and improves I/O performance for applications such as Online Transaction Processing (OLTP) and Virtual Desktop Infrastructure (VDI).
Dell lab tests have revealed that Dell Fluid Cache for SAN can reduce the average response time by 99 percent and achieve four times more transactions per second with a six-fold increase in concurrent users.
Ethernet switches are simple: they need to move packets around from port to port based on the attributes of each packet. There are plenty of switch vendors from which to choose. Differentiating in this saturated market is the aspiration of each vendor.
Mellanox Technologies switches are unique in this market. They are not just “yet another switch” but a design based on a self-built switching ASIC and offered as a variety of 1RU switches. These switches excel in performance compared to any other switch offered in the market. Being the first and (still) the only vendor with a complete end-to-end 40GbE solution, Mellanox provides a complete interconnect solution and the ability to achieve the highest price-performance ratio.
Companies today are finding that the size and growth of stored data are becoming overwhelming. As databases grow, the challenge is to create value by discovering insights and connections in large databases in as close to real time as possible. In the recently published whitepaper, “Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks,” we describe a combination of high performance networking and graph database and analytics technologies that offers a solution to this need.
Each of the examples in the paper is based on an element of a typical analysis solution. The first example, Vertex Ingest Rate, shows the value of using high performance equipment to enhance real-time data availability. Vertex objects represent nodes in a graph, such as Customers, so this test is representative of the most basic operation: loading new customer data into the graph. The second example, Vertex Query Rate, highlights the improvement in the time needed to receive results, such as finding a particular customer record or a group of customers.
The third example, Distributed Graph Navigation Processing, starts at a Vertex and explores its connections to other Vertices. This is representative of traversing social networks, finding optimal transportation or communications routes, and similar problems. The final example, Task Ingest Rate, shows the performance improvement when loading the data connecting each of the vertices. This is similar to entering orders for products, transit times over a communications path, and so on.
Each of these elements is an important part of a Big Data analysis solution. Taken together, they show that InfiniteGraph can be made significantly more effective when combined with Mellanox interconnect technology.
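For readers who have not worked with graph databases, the four operations measured in the paper can be sketched with a tiny in-memory graph. This is purely illustrative and makes no attempt to mirror the InfiniteGraph API or its distributed design: the `Graph` class, its method names, and the sample customers are all hypothetical, and the data connecting vertices is modeled as simple undirected edges.

```python
from collections import deque


class Graph:
    """Toy in-memory graph (hypothetical, not the InfiniteGraph API)."""

    def __init__(self):
        self.vertices = {}  # vertex id -> dict of properties
        self.edges = {}     # vertex id -> list of (neighbor id, properties)

    def add_vertex(self, vertex_id, **properties):
        # Vertex ingest: load a new node, e.g. a customer record.
        self.vertices[vertex_id] = properties
        self.edges.setdefault(vertex_id, [])

    def find_vertices(self, **criteria):
        # Vertex query: return ids whose properties match every criterion.
        return [vertex_id for vertex_id, props in self.vertices.items()
                if all(props.get(k) == v for k, v in criteria.items())]

    def add_edge(self, source, target, **properties):
        # Ingest of connecting data: link two existing vertices.
        self.edges[source].append((target, properties))
        self.edges[target].append((source, properties))

    def navigate(self, start, max_hops):
        # Navigation: breadth-first exploration outward from one vertex.
        # Returns {vertex id: hop count} for everything within max_hops.
        hops = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if hops[current] == max_hops:
                continue
            for neighbor, _props in self.edges[current]:
                if neighbor not in hops:
                    hops[neighbor] = hops[current] + 1
                    queue.append(neighbor)
        return hops


if __name__ == "__main__":
    g = Graph()
    g.add_vertex("alice", city="Paris")
    g.add_vertex("bob", city="Berlin")
    g.add_vertex("carol", city="Paris")
    g.add_edge("alice", "bob", kind="referred")
    g.add_edge("bob", "carol", kind="referred")
    print(g.find_vertices(city="Paris"))
    print(g.navigate("alice", max_hops=1))
```

In a distributed deployment each of these steps can involve data held on other machines, which is presumably where the interconnect comes into play.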
The University of Edinburgh’s entry into the ISC 2014 Student Cluster Competition, EPCC, has been awarded first place in the LINPACK test. The EPCC team harnessed Boston’s HPC cluster to smash the 10Tflop mark for the first time – shattering the previous record of 9.27Tflops set by students at ASC14 earlier this month. The team recorded a score of 10.14Tflops producing 3.38 Tflops/kW which would achieve a rank of #4 in the Green500, a list of the most energy efficient supercomputers in the world.
This achievement was made possible thanks to the provisioning of a high performance, liquid cooled GPU cluster by Boston. The system consisted of four 1U Supermicro servers, each comprising two Intel® Xeon™ ‘Ivy Bridge’ processors and two NVIDIA® Tesla K40 GPUs, and Mellanox FDR 56Gb/s InfiniBand adapters, switches and cables.