Category Archives: Virtualization

Mellanox ConnectX Ethernet – I/O Consolidation for the Data Center

Today’s data center requires a low-cost, low-power I/O solution with the network flexibility to provide I/O consolidation on a single adapter. Network administrators want the best performance, scalability, and latency while solving all of their LAN, SAN, and IPC (clustering) needs with one adapter card in a virtualized or data center environment.

ConnectX® is a single-chip solution from Mellanox whose hardware and software capabilities deliver these features for data center I/O unification. ConnectX® EN 10 Gigabit Ethernet drivers provide seamless connectivity with optimized 10 Gigabit Ethernet I/O services that scale easily across multi-core CPUs and virtualized server and storage architectures.

Mellanox’s entry into the 10GigE landscape was rather late, considering that 10GigE started showing up on servers in 2001 with PCI-X and later PCIe adapters. With its extensive experience in high-performance computing and a broad range of industry relationships, Mellanox forged ahead with the technology and was the first company to offer 10GigE with PCIe 2.0. Along the way, our products have matured to become the market leader in performance and latency, as well as in consolidating all data center networking onto a single adapter.

In a span of less than two years, Mellanox has introduced a broad range of products supporting various media interconnects and cabling options, including UTP and CX4 for copper, and SR, LR, and LRM for fiber optics.

Technology leadership in networking requires that a company not only have the best hardware solution, but complement it with the best software solution to make a winning combination.

In my experience working at other early startups, as well as at bellwether 10 Gigabit Ethernet companies, the Gen1/Gen2 10GigE products they introduced lacked a clear vision of end-customer requirements. The products were a mishmash of features addressing 10GigE for LAN, clustering (iWARP), TCP acceleration (aka TOE), and iSCSI acceleration. They missed the mark by not solving the real pain points of the data center, be it blazing performance, low latency, CPU utilization, or true I/O consolidation.

Mellanox took a holistic approach to data center networking, drawing on the deep understanding gained from its InfiniBand leadership: server and system configuration, virtualization requirements and benefits, driver software requirements, and, most importantly, the requirements of customers in each vertical segment.

Today, ConnectX® EN 10 Gigabit Ethernet drivers support a broad array of major operating systems, including Windows, Linux, VMware Infrastructure, Citrix XenServer and FreeBSD.

The ConnectX® EN 10 Gigabit Ethernet drivers provide:

– All stateless offload features (a quick way to verify these on a Linux host is sketched after this list)
– Virtualization accelerations
– Data Center Ethernet (DCE) support
– FCoE with full hardware offload for SAN consolidation over 10GigE
– The lowest 10GigE (TCP/IP) latency, comparable to expensive iWARP solutions
– Single Root I/O Virtualization (SR-IOV) for superior virtualization performance
– Linux kernel and Linux distribution support
– WHQL-certified drivers for Windows Server 2003 and 2008
– VMware Ready certification for VMware Virtual Infrastructure (ESX 3.5)
– XenServer 4.1 inbox support
– Line-rate performance with very low CPU utilization
– The ability to replace multiple GigE NICs with a single ConnectX Ethernet adapter
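As a practical aside for readers bringing one of these adapters up on Linux, the short sketch below shows one way to confirm that the stateless offloads are actually active by parsing ethtool output. It is illustrative only: the interface name eth2 is an assumption, and the exact feature names reported vary with the driver and kernel version.

```python
# Illustrative sketch: confirm stateless offloads on a 10GigE port under Linux.
# Assumes ethtool is installed; "eth2" is a placeholder interface name.
import subprocess

INTERFACE = "eth2"  # assumption: substitute the port you are checking
OFFLOADS = (
    "tcp-segmentation-offload",
    "generic-receive-offload",
    "rx-checksumming",
    "tx-checksumming",
    "scatter-gather",
)

def offload_status(iface):
    """Parse `ethtool -k` output into a {feature: 'on'/'off'} dict."""
    out = subprocess.run(["ethtool", "-k", iface],
                         capture_output=True, text=True, check=True).stdout
    status = {}
    for line in out.splitlines()[1:]:          # first line is a banner
        if ":" in line:
            name, _, value = line.partition(":")
            status[name.strip()] = value.split()[0] if value.split() else ""
    return status

if __name__ == "__main__":
    status = offload_status(INTERFACE)
    for feature in OFFLOADS:
        print(f"{feature:30s} {status.get(feature, 'not reported')}")
```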

To complete the last piece of the puzzle, IPC (clustering) for the data center, I’ll soon post on the industry’s Low Latency Ethernet (LLE) initiative and its advantages over the clustering solutions currently available on 10GigE.

Regards,
Satish Kikkeri
satish@mellanox.com

Mellanox at VMworld Europe

Yesterday, Motti Beck, Ali Ayoub (our main VMware software developer at Mellanox), and I diligently put together a very compelling demo that highlights the convergence capabilities of the BridgeX BX 4000 gateway we announced last week.

We unpacked everything and got it all up and running in less than an hour (after we sorted out the usual power and logistical issues that always come with having a booth).

The slide below illustrates the topology of the demo. Essentially, we have two ConnectX adapter cards in one of the Dell servers, each running a different interconnect fabric: one adapter runs 40Gb/s InfiniBand, while the other runs 10 Gigabit Ethernet.

1. The 40Gb/s InfiniBand adapter is connected to our MTS3600 40Gb/s InfiniBand switch, which passes traffic through the BridgeX BX4020, where the packets are converted to Ethernet. The packets then run through the Arista 10GigE switch and into the LeftHand appliance virtual machine, which resides on the Dell server (running ESX 3.5 and our certified 10GigE driver over our ConnectX EN 10GigE SFP+ adapter). We are showing a movie served from the iSCSI storage on the InfiniBand end-point (the Dell Linux server).

2. The 10 Gigabit Ethernet adapter connects directly to the BridgeX BX4020, which converts the traffic to Fibre Channel (effectively FCoE). The traffic then moves through the Brocade Fibre Channel switch directly to the NetApp storage. We are showing a movie served from the FC NetApp storage on the 10GigE end-point (the Dell Linux server).
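For anyone rebuilding a similar setup, the quickest sanity check on path 1 before starting the iSCSI initiator is simply to confirm that the LeftHand appliance’s iSCSI portal is reachable through the BridgeX and Arista hops (the FC path in step 2 can’t be probed this way). Below is a minimal sketch; the portal address is purely a placeholder for whatever IP the appliance VM exposes in your own setup.

```python
# Minimal reachability sketch for the demo's iSCSI storage path.
# The address below is a placeholder -- substitute your appliance's portal IP.
# Port 3260 is the standard iSCSI portal port.
import socket

PORTALS = {
    "LeftHand iSCSI appliance (via BridgeX/Arista)": ("192.168.10.20", 3260),  # placeholder
}

def reachable(host, port, timeout=3.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, (host, port) in PORTALS.items():
    state = "reachable" if reachable(host, port) else "NOT reachable"
    print(f"{name}: {host}:{port} is {state}")
```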

If you are coming to VMworld Europe (or are already here), come see us at Booth #100 and we will be happy to walk you through the demo.

Brian Sparks
Director, Marketing Communications
brian@mellanox.com

QuickTransit Performance Results

As previously suggested, in this post I will review a different application, one focused on converting protocols. QuickTransit, developed by a company called Transitive (recently acquired by IBM), is a cross-platform virtualization technology that allows applications compiled for one operating system and processor to run on servers that use a different processor and operating system, without requiring any source code or binary changes.

We used QuickTransit for Solaris/SPARC-to-Linux/x86-64 and measured latency with a basic test modeled on how the financial industry operates, which stresses server-to-server interconnect performance.

The topology we used was two servers (the first acting as the server and the second as the client). We measured latency at different object sizes and rates over the following interconnects: GigE, Mellanox ConnectX VPI 10GigE, and Mellanox ConnectX VPI 40Gb/s InfiniBand. For anyone who has not read the earlier posts, I would like to reiterate that we are committed to our “out-of-the-box” guideline, meaning that neither the application nor any of the drivers are changed after downloading them from the web.
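To give a concrete sense of the kind of measurement involved, here is a bare-bones TCP ping-pong latency sketch parameterized by message size. This is not the QuickTransit benchmark itself, just an illustration of the methodology: the same unmodified socket code can be run over GigE, 10GigE, or IPoIB simply by pointing it at the IP address assigned to each interface.

```python
# Bare-bones TCP round-trip latency sketch (not the actual benchmark used).
# Run with "server" on one host and "client <server-ip>" on the other; the
# interconnect under test is selected by which interface's IP you target.
import socket, sys, time

PORT = 5001                      # arbitrary test port
ITERATIONS = 1000                # round trips per message size
SIZES = [64, 512, 4096, 65536]   # message sizes in bytes

def recv_exact(conn, n):
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return buf

def server():
    with socket.create_server(("", PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            for size in SIZES:
                for _ in range(ITERATIONS):
                    conn.sendall(recv_exact(conn, size))   # echo back

def client(host):
    with socket.create_connection((host, PORT)) as conn:
        for size in SIZES:
            payload = b"x" * size
            start = time.perf_counter()
            for _ in range(ITERATIONS):
                conn.sendall(payload)
                recv_exact(conn, size)
            rtt_us = (time.perf_counter() - start) / ITERATIONS * 1e6
            print(f"{size:6d} bytes: {rtt_us / 2:8.1f} us one-way (half RTT)")

if __name__ == "__main__":
    client(sys.argv[2]) if sys.argv[1] == "client" else server()
```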

With InfiniBand we used three different upper-layer protocols (ULPs), none requiring code intervention: IPoIB connected mode (CM), IPoIB datagram mode (UD), and the Sockets Direct Protocol (SDP). The results were stunning, mainly because our assumption had been that, with all the layers of software on top of the layer converting SPARC Solaris code to x86 Linux code, the interconnect would have little impact, if any.
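For readers wondering how SDP can be used “without code intervention”: the usual mechanism in OFED is to preload libsdp so that the unmodified application’s socket calls are transparently redirected onto the InfiniBand transport. The sketch below shows the idea only; the library path, config location, target address, and the script name latency_pingpong.py (the ping-pong sketch above, saved under a hypothetical filename) are all assumptions that depend on your installation.

```python
# Sketch: launching an unmodified socket benchmark over SDP by preloading
# libsdp (shipped with OFED). Paths and addresses below are assumptions --
# check where your OFED installation places libsdp.so and its config file.
import os, subprocess

env = dict(os.environ)
env["LD_PRELOAD"] = "/usr/lib64/libsdp.so"        # assumed OFED install path
env["LIBSDP_CONFIG_FILE"] = "/etc/libsdp.conf"    # assumed config location

# Run the same client as before, unchanged; its sockets are mapped onto SDP.
subprocess.run(["python3", "latency_pingpong.py", "client", "192.168.20.10"],
               env=env, check=True)               # placeholder IPoIB address
```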

We learned that 40Gb/s InfiniBand performance is significantly better than GigE across a wide range of packet sizes and transmission rates. Latency was more than 2x better when using InfiniBand, and execution was 30% faster when using 10GigE. Go and beat that…


Let’s look at the results in a couple of different ways, in particular by the size of the messages being sent. The advantage above applies to the small message sizes (see graph #2), while at larger message sizes the advantage, already striking, becomes enormous.
 

In my next blog I plan to show more results that are closely related to the financial markets. If anyone out there identifies an application they would like our dedicated team to benchmark, please step forward and send me an e-mail.

Nimrod Gindi
nimrodg@mellanox.com

Hitting the New Year Running – Virtualization


You don’t have to ask – vacation was awesome and, as always, not as long as we would have liked.

Now that we’ve shaken the rust off our fingers, we’ve made progress with a somewhat more complex testbed.

We decided to look at the virtualization space and run our next application on top of VMware ESX 3.5. The application we picked was Dell DVD Store, a complete online e-commerce test application with a backend database component, a web application layer, and driver programs. To stay in line with what is being used in the industry, we chose a 2-tier configuration using a Microsoft SQL Server database running on VMware. As you can see in the picture, this means we used two hosts/systems running 20 virtual machines, the Microsoft SQL Server, and the client driver.

The database was 1GB in size, serving 2,000,000 customers. During the testing we increased the number of virtual machines running the client driver from 2 to 20 and measured the number of orders per minute generated from the database.

The only change we made after the out-of-the-box deployment (which, if you recall, we set as our goal) was to create some scripts for test execution and results analysis, so we could run the tests more efficiently.
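Those scripts are internal, but to illustrate the flavor of the results-analysis side, here is a minimal sketch that aggregates orders-per-minute figures from per-VM driver logs. The log naming pattern and the “opm=” line format are assumptions for illustration; the actual Dell DVD Store driver output differs in detail.

```python
# Illustrative results-analysis sketch (not the actual scripts we used).
# Assumes each client VM wrote a log containing lines like "opm=1234";
# the file-naming pattern and line format are assumptions.
import glob, re, statistics

OPM_RE = re.compile(r"opm=(\d+)")

def orders_per_minute(logfile):
    """Return the last reported orders-per-minute value in one driver log."""
    values = []
    with open(logfile) as fh:
        for line in fh:
            match = OPM_RE.search(line)
            if match:
                values.append(int(match.group(1)))
    return values[-1] if values else 0

def summarize(pattern="ds2_vm*.log"):
    per_vm = {path: orders_per_minute(path) for path in sorted(glob.glob(pattern))}
    total = sum(per_vm.values())
    print(f"VMs reporting:    {len(per_vm)}")
    print(f"Total orders/min: {total}")
    if per_vm:
        print(f"Mean per VM:      {statistics.mean(per_vm.values()):.1f}")
    return per_vm

if __name__ == "__main__":
    summarize()
```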

The results of our tests are shown in the graph below:
 
The results clearly show a greater than 10% benefit when using VPI (both 10GigE and 40Gb/s InfiniBand). We’ve included the results up to 10 VMs, but the numbers appear valid (with jitter not a factor) only up to 8 VMs, and there seems to be a dependency on the number of cores in the systems running VMware.

In my next blog post I plan to either review a new application or another aspect of this one.
Nimrod Gindi
nimrodg@mellanox.com