Category Archives: Data Center

RoCE in the Data Center

Today’s data centers demand that the underlying interconnect provide the utmost bandwidth and extremely low latency. While high bandwidth is important, it is not worth much without low latency. Moving large amounts of data through a network can be achieved with TCP/IP, but only RDMA can produce the low latency that avoids costly transmission delays.

Fast data transfer is critical to using data efficiently. An interconnect based on Remote Direct Memory Access (RDMA) offers the ideal option for boosting data center efficiency, reducing overall complexity, and increasing data delivery performance. Mellanox RDMA enables sub-microsecond latency and up to 56Gb/s bandwidth, translating into screamingly fast application performance, better storage and data center utilization, and simplified network management.
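To make that concrete, here is a minimal sketch of what bypassing the TCP/IP stack looks like with the RDMA verbs API (libibverbs): a one-sided RDMA write posted directly to the adapter. It assumes an already-connected queue pair and a registered memory region, with the remote buffer address and rkey exchanged out of band; post_rdma_write, local_buf, remote_addr, and remote_rkey are illustrative names, not part of any Mellanox product.

```c
#include <stdint.h>
#include <infiniband/verbs.h>

/* Post a one-sided RDMA WRITE: the local buffer is placed directly into
 * remote memory by the adapters, with no remote CPU or TCP/IP stack involved.
 * Assumes qp is already connected (RTS) and mr covers local_buf. */
static int post_rdma_write(struct ibv_qp *qp, struct ibv_mr *mr,
                           void *local_buf, uint32_t len,
                           uint64_t remote_addr, uint32_t remote_rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)local_buf,
        .length = len,
        .lkey   = mr->lkey,
    };

    struct ibv_send_wr wr = {
        .wr_id      = 1,
        .sg_list    = &sge,
        .num_sge    = 1,
        .opcode     = IBV_WR_RDMA_WRITE,
        .send_flags = IBV_SEND_SIGNALED,   /* generate a completion we can poll */
    };
    wr.wr.rdma.remote_addr = remote_addr;  /* target address on the peer */
    wr.wr.rdma.rkey        = remote_rkey;  /* peer's memory-region key   */

    struct ibv_send_wr *bad_wr = NULL;
    return ibv_post_send(qp, &wr, &bad_wr); /* 0 on success */
}
```

Completion is then detected by polling the completion queue with ibv_poll_cq(), and the same verbs code runs unchanged over InfiniBand or, with RoCE, over Ethernet.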


The Mellanox SX1018HP is a game changer for squeezing every drop of latency out of your network

Guest blog by Steve Barry, Product Line Manager for HP Ethernet Blade Switches

One of the barriers to adoption of blade server technology has been the limited number of network switches available for it. Organizations requiring unique switching capabilities or extra bandwidth have had to rely on top-of-rack switches built by networking companies that have little or no presence in the server market. The result was a potential customer base of users who wanted to realize the benefits of blade server technology but were forced to stay with rack servers and switches due to a lack of alternative networking products. Here’s where Hewlett Packard has once again shown why it remains the leader in blade server technology, announcing a new blade switch that leaves the others in the dust.

 

[Images: Mellanox SX1018HP Ethernet Blade Switch, front and left views]

Working closely with our partner Mellanox, HP has just announced a new blade switch for the c-Class enclosure that is designed specifically for customers who demand performance and raw bandwidth. The Mellanox SX1018HP is built on the latest SwitchX ASIC technology and, for the first time, gives servers a direct path to 40Gb. In fact, this switch can provide up to sixteen 40Gb server downlinks and up to eighteen 40Gb network uplinks, for an amazing 1.3Tb/s of throughput. Now even the most demanding virtualized server applications can get the bandwidth they need. Financial services customers, especially those involved in high-frequency trading, look to squeeze every drop of latency out of their network. Again, the Mellanox SX1018HP excels, dropping port-to-port latency to an industry-leading 230ns at 40Gb. There is no other blade switch currently available that can make that claim.

For customers currently running InfiniBand networks, the appeal of collapsing their data requirements onto a single network has always been tempered by the lack of support for Remote Direct Memory Access (RDMA) on Ethernet networks. Again, HP and Mellanox lead the way in blade switches. The SX1018HP supports RDMA over Converged Ethernet (RoCE), allowing RDMA-tuned applications to work across both InfiniBand and Ethernet networks. When coupled with the recently announced HP544M 40Gb Ethernet/FDR InfiniBand adapter, customers can now support RDMA end to end on either network and begin the migration to a single Ethernet infrastructure. Finally, many customers already familiar with Mellanox IB switches provision and manage their network with Unified Fabric Manager (UFM). The SX1018HP can be managed and provisioned with this same tool, providing a seamless transition to the Ethernet world. Of course, standard CLI and secure web browser management are also available.

Incorporating this switch along with the latest generation of HP blade servers and network adapters now gives any customer the same speed, performance, and scalability that was previously limited to rack deployments using a hodgepodge of suppliers. Data center operations that cater to High Performance Cluster Computing (HPCC), telecom, cloud hosting services, and financial services will find the HP blade server/Mellanox SX1018HP blade switch combination a compelling and unbeatable solution.

 

 Click here for more information on the new Mellanox SX1018HP Ethernet Blade Switch.

Interconnect analysis: InfiniBand and 10GigE in High-Performance Computing

InfiniBand and Ethernet are the leading interconnect solutions for connecting servers and storage systems in high-performance computing and in enterprise (virtualized or not) data centers. Recently, the HPC Advisory Council has put together the most comprehensive database for high-performance computing applications to help users understand the performance, productivity, efficiency and scalability differences between InfiniBand and 10 Gigabit Ethernet.

In summary, a large number of HPC applications need either the lowest possible latency or the highest bandwidth for best performance (for example, oil and gas as well as weather-related applications). Some HPC applications are not latency sensitive; for example, gene sequencing and some bioinformatics applications are insensitive to latency and scale well with TCP-based networks, including GigE and 10GigE. For HPC converged networks, putting HPC message-passing traffic and storage traffic on a single TCP network may not provide enough data throughput for either. Finally, there are a number of examples showing that 10GigE has limited scalability for HPC applications and that InfiniBand proves to be a better performance, price/performance, and power solution than 10GigE.

The complete report can be found under the HPC Advisory Council case studies or by clicking here.

Web Retailer Uses InfiniBand to Improve Response Time to Its Customers

Recently while talking with an IT operations manager for a major Web retailer, I was enlightened on the importance of reducing latency in web-based applications. He explained that they were challenged to find a way to reduce the response time to their web customers. They investigated this for quite some time before discovering that the major issue seemed to be the time it takes to initiate a TCP transaction between their app servers and database servers. Subsequently their search focused on finding the best interconnect fabric to minimize this time.
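As a rough illustration of the kind of measurement involved, the sketch below times nothing more than the TCP three-way handshake to a server; the address 10.0.0.5 and port 5432 are made-up placeholders, not the retailer's setup.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Time how long TCP connection setup to a (placeholder) server takes. */
int main(void)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(5432);                 /* hypothetical DB port  */
    inet_pton(AF_INET, "10.0.0.5", &addr.sin_addr); /* hypothetical address */

    int fd = socket(AF_INET, SOCK_STREAM, 0);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        perror("connect");
        return 1;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double usec = (t1.tv_sec - t0.tv_sec) * 1e6 +
                  (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("TCP connection setup took %.1f microseconds\n", usec);

    close(fd);
    return 0;
}
```

The handshake costs at least one network round trip before any data moves; that is the per-transaction overhead the retailer was trying to eliminate.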

Well, they found it in InfiniBand. With its 1-microsecond latency between servers, this web retailer saw a tremendous opportunity to improve response time to its customers. In their subsequent proof-of-concept testing, they found that they could indeed reduce latency between their app servers and database servers. The resulting improvement in response time to their customers is over 30%. This is a huge advantage in their highly competitive market. I would tell you who they are, but they would probably shoot me.

More and more enterprise data centers are finding that low latency, high-performance interconnects, like InfiniBand, can improve their customer-facing systems and their resulting web business.

If you want to hear more, or try it for yourself, send me an email.

Thanks,

Wayne Augsburger
Vice President of Business Development
wayne@mellanox.com

High-Performance Computing as a Service (HPCaaS)

High-performance clusters bring many advantages to the end user, including flexibility and efficiency. With the increasing number of applications being served by high-performance systems, new systems need to serve multiple users and applications. Traditional high-performance systems typically served a single application at a given time, but to maintain maximum flexibility a new concept of “HPC as a Service” (HPCaaS) has been developed. HPCaaS includes the capability of using clustered servers and storage as resource pools, a web interface for users to submit their job requests, and a smart scheduling mechanism that can schedule multiple different applications simultaneously on a given cluster taking into consideration the different application characteristics for maximum overall productivity.

HPC as a Service enables greater system flexibility since it eliminates the need for dedicated hardware resources per application and allows dynamic allocation of resources per task while maximizing productivity. It is also a key component in bringing high-performance computing into cloud computing. Effective HPCaaS, though, needs to take each application's demands into consideration and provide the minimum hardware resources required per application. Scheduling multiple applications at once requires balancing resources across applications in proportion to their demands, as in the simple sketch below.
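As a toy illustration of that scheduling idea (not the HPC Advisory Council's or Mellanox's actual scheduler), the sketch below splits a fixed pool of cluster nodes among pending applications in proportion to their declared resource demands; the application names and weights are invented.

```c
#include <stdio.h>

/* Toy proportional allocator: split a node pool among applications
 * according to their relative resource demands (invented numbers). */
struct app {
    const char *name;
    double demand;   /* relative resource demand (weight) */
    int nodes;       /* nodes assigned */
};

int main(void)
{
    struct app apps[] = {
        { "weather-model",   4.0, 0 },   /* latency/bandwidth hungry        */
        { "gene-sequencing", 1.0, 0 },   /* scales well, less sensitive     */
        { "cfd-solver",      3.0, 0 },
    };
    const int napps = sizeof(apps) / sizeof(apps[0]);
    const int pool  = 128;               /* total nodes in the cluster      */

    double total = 0;
    for (int i = 0; i < napps; i++)
        total += apps[i].demand;

    int assigned = 0;
    for (int i = 0; i < napps; i++) {
        apps[i].nodes = (int)(pool * apps[i].demand / total);
        assigned += apps[i].nodes;
    }
    apps[0].nodes += pool - assigned;    /* hand leftover nodes to the largest job */

    for (int i = 0; i < napps; i++)
        printf("%-16s -> %3d nodes\n", apps[i].name, apps[i].nodes);
    return 0;
}
```

A production scheduler would also weigh each application's latency sensitivity and interconnect bandwidth needs, as described above, but the proportional idea is the same.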

Research activities on HPCaaS are being performed at the HPC Advisory Council (http://hpcadvisorycouncil.mellanox.com/). The results show the need for high-performance interconnects, such as 40Gb/s InfiniBand, to maintain high productivity levels. It was also shown that scheduling mechanisms can be set to guarantee the same levels of productivity in HPCaaS as with the “native” dedicated hardware approach. HPCaaS is not only critical for the way we will perform high-performance computing in the future; as more HPC elements are brought into the data center, it will also become an important factor in building the most efficient enterprise data centers.

Gilad Shainer
Director, Technical Marketing
gilad@mellanox.com

Mellanox ConnectX Ethernet – I/O Consolidation for the Data Center

Today’s data center requires a low-cost, low-power I/O solution with the network flexibility to provide I/O consolidation on a single adapter. Network administrators want the best performance, scalability, and latency while solving all their LAN, SAN, and IPC (clustering) needs with one adapter card in a virtualized or data center environment.

ConnectX® is a single-chip solution from Mellanox whose hardware and software capabilities provide these features for data center I/O unification. ConnectX® EN 10 Gigabit Ethernet drivers provide seamless connectivity with optimized 10 Gigabit Ethernet I/O services that easily scale with multi-core CPUs and virtualized server and storage architectures.

Mellanox’s entry into the 10GigE landscape was rather late, considering that 10GigE started showing up on servers in 2001, first as PCI-X and then as PCIe adapters. With its extensive experience in high-performance computing and broad range of industry relationships, Mellanox forged ahead with this technology and was the first company to offer 10GigE with PCIe 2.0. Along the way, our products have matured to become the market leader in performance and latency, as well as in consolidating all data center networking onto a single adapter.

In a span of less than two years, Mellanox has introduced a broad range of products supporting various media interconnects and cabling options, including UTP and CX4 for copper, and SR, LR, and LRM for fiber optics.

Technology leadership in networking requires that a company not only have the best hardware solution, but complement it with the best software solution to make a winning combination.

In my experience working at other early startups, as well as at bellwether 10 Gigabit Ethernet networking companies, the Gen1/Gen2 10GigE products introduced lacked a vision of what the end-customer requirements were. The products were a “mish-mosh” of features addressing 10GigE for LAN, clustering (iWARP), TCP acceleration (aka TOE), and iSCSI acceleration. They missed the mark by not solving the pain points of a data center, be it blazing performance, low latency, CPU utilization, or true I/O consolidation.

Mellanox took a holistic approach to data center networking, drawing on its InfiniBand leadership and knowledge base: server and system configuration, virtualization requirements and benefits, driver software requirements, and, most importantly, the customer requirements of each vertical segment.

Today, ConnectX® EN 10 Gigabit Ethernet drivers support a broad array of major operating systems, including Windows, Linux, VMware Infrastructure, Citrix XenServer and FreeBSD.

The ConnectX® EN 10 Gigabit Ethernet drivers provide:

– All stateless offload features
– Virtualization accelerations
– Data Center Ethernet (DCE) support
– FCoE with full hardware offload for SAN consolidation on 10GigE
– The lowest 10GigE (TCP/IP) latency, comparable to expensive iWARP solutions
– Single Root I/O Virtualization (SR-IOV) for superior virtualization performance
– Linux kernel and Linux distribution support
– WHQL-certified drivers for Windows Server 2003 and 2008
– VMware Ready certification for VMware Virtual Infrastructure (ESX 3.5)
– XenServer 4.1 inbox support
– Line-rate performance with very low CPU utilization
– The ability to replace multiple GigE NICs with a single ConnectX Ethernet adapter

To complete the last piece of the puzzle, i.e., IPC (clustering) for the data center, I’ll soon post on the industry’s Low Latency Ethernet (LLE) initiative and its advantages compared to currently available clustering solutions on 10GigE.

Regards,
Satish Kikkeri
satish@mellanox.com

Moore’s Law’s Data Center Disruption

Change happens, and when you talk to anyone involved in the enterprise data center, change has been accelerating and is making their lives more and more complicated. The most recent issue is the growing list of network protocols from which the network engineer has to choose.

 

Previously, the decision on which network protocol to use was very simple. For IP traffic you used Ethernet, and for storage, Fibre Channel. Speeds were pretty simple to choose from as well: you used 1Gb Ethernet for IP and 2 or 4Gb Fibre Channel for storage. The only challenge was choosing the vendor to purchase the equipment from.

 

Now Moore’s Law has made the legacy data center network obsolete. Moore’s Law was originally conceived by one of the founders of Intel, Gordon Moore. He noticed that every generation of microprocessor that Intel made tracked a straight line when transistor count was plotted against time. More profoundly, he noticed that nearly all semiconductor companies tracked this line. He determined that the transistor density of microprocessors doubled every 18 months. His world-famous plot is still used today to describe the steady march of technology.
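Expressed as a formula, and using the 18-month doubling period cited above purely for illustration, the transistor count after $t$ years is roughly:

$$ N(t) \approx N_0 \cdot 2^{t/1.5} $$

where $N_0$ is the starting transistor count and $t$ is measured in years.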

 

Moore’s Law has caused an issue in the data center. Here is what has happened. For any data center to work properly, its major building blocks (storage, servers, and network) should be in balance; for them to work most efficiently, they should be matched. You could also say these three components of the data center have their functionality primarily dependent on semiconductor manufacturing processes, i.e., on the advance of Moore’s Law. Historically, storage and servers have tracked Moore’s Law very nicely. But when you look at the network, you find a big discrepancy: Ethernet and Fibre Channel have not been tracking Moore’s Law. The efficiencies of server processing power and storage bandwidth have progressed so far ahead of the network that the network has become a bottleneck.

 

Looking at present-day data center networks, you can see that not only is the performance sub-par relative to the I/O needs of the servers and storage, but the functionality and features are woefully behind as well. Why is this? If you look at Ethernet and Fibre Channel, you discover these protocols don’t track Moore’s Law. Go ahead and plot the advance in bandwidth over time for both Ethernet and Fibre Channel. Then overlay that onto server CPU density and aggregated storage bandwidth, and you discover that the legacy network (Ethernet and Fibre Channel) has fallen way behind. Even their future roadmaps don’t track Moore’s Law. We are beginning to see the bottlenecks happening. While Ethernet is very popular, it was never designed for the data center. (Try pumping lots of data from tens to hundreds of servers and watch the congestion!) Fibre Channel is simply too slow; even 8Gb is too slow. This failure to match the technological advance of the servers and storage has made traditional approaches to data center network topology a dead end. To get back in balance, the network needs to be matched using newer ways of deploying data center networks.

 

Getting back to my original point: the network administrator of a large data center is probably noticing network problems and is pretty fed up with having to run 8 to 10 network cables to every server. He can move servers anywhere from his desktop, but when it comes to the network, he has to physically go into the data center and add NICs and HBAs plus cables. Throwing adapters and more cables at the problem is counterintuitive and not productive. These activities drive CapEx and OpEx through the roof.

 

There are many new network technologies available to the data center network administrator that offer compelling solutions to the Moore’s Law problem. 10Gb Ethernet, Low Latency Ethernet, Data Center Ethernet, and InfiniBand all offer a wide range of features and solutions for the enterprise data center and cloud computing. The issue is, can people let go of the legacy way and embrace a new way to think about their network? It’s not about the protocol anymore; there are too many choices for that. The new way is to leverage what makes the most sense for the application. By leveraging the newer protocols and their powerful features, the network can be brought back into balance with the servers and storage it connects.

 

The change in the enterprise data center that is causing the network problems is actually a good thing. It is forcing people to think about how they deploy their networks in a new light. By adopting an open viewpoint rather than stubbornly holding onto legacy ways, the network engineer in the enterprise data center can leverage powerful alternatives, which makes choice a good thing.


Tony Rea
tony@mellanox.com

Gain A Competitive Advantage

BridgeX received an excellent response from all the analysts that we briefed over the last few weeks. 

One article talked about how BridgeX reminded the author of the early days of networking, when networking companies delivered bridges for Ethernet, Token Ring, and Banyan VINES. The other talked about the mish-mosh of protocols in the data center as a familiar story.

In my opinion, when data centers moved from Fast Ethernet to Gigabit Ethernet it was an easy decision because of the 10x performance improvement necessitated by the growth in Internet applications. The same 10x performance is now available with 10 Gigabit Ethernet, but data centers have not jumped into deploying the technology yet. Why? The killer app for 10 Gigabit Ethernet is I/O consolidation, but the Ethernet protocol itself is still being enhanced before it can be deployed as an I/O consolidation fabric. Enhancements to the Ethernet protocol are being made within the IEEE Data Center Bridging Workgroup. These enhancements will deliver new functionality to Ethernet, yet the timeline for products is still a BIG question mark. Normally, in a growth economy, products roll out within 12 to 18 months of spec finalization, whereas in the current economic conditions it might take longer, and the spec is at least 18 months away from finalization. Until then, 10 Gigabit Ethernet deployments will happen in smaller, niche applications in the data center and will not be used for I/O consolidation. So, if data centers want to save energy costs, reduce floor space, and lower TCO today, then deploying a proven I/O consolidation fabric is critical.

Just some of the enhancements currently being made to the Ethernet protocol in the IEEE:

  1. Lossless fabric
  2. Creating Virtual Lanes and providing granular QoS
  3. Enabling Fat-Tree
  4. Congestion management

These are already part of the InfiniBand fabric, which has been shipping for almost nine years now and has been successfully deployed in several data centers and high-performance commercial applications.

Oracle Exadata is a great product that drives InfiniBand to the forefront of data centers for database applications. Exadata brings in new thinking and new strategy for delivering higher I/Os and lowering energy costs. Exadata certainly delivers a competitive advantage. 

Similarly, BridgeX coupled with ConnectX adapters and InfiniScale switching platforms provides competitive advantages by delivering a cost-optimized I/O consolidation fabric. Data centers can consolidate their I/O using InfiniBand as the physical fabric, while the virtual fabric continues to be Ethernet or Fibre Channel. This means that applications that need an Ethernet transport or a Fibre Channel transport will run unmodified in the InfiniBand cloud.

I think it is time for the data centers to take a new look at their infrastructure and re-strategize the investments to gain an even greater competitive advantage. When the economy turns around, those who have infrastructure that can leapfrog their competition will eventually win.

TA Ramanujam (TAR)
tar@mellanox.com

Mellanox at VMworld Europe

Yesterday, Motti Beck, Ali Ayoub (our main VMware software developer at Mellanox), and I diligently put together a very compelling demo that highlights the convergence capabilities of our BridgeX BX4000 gateway, which we announced last week.

We unpacked everything and got it all up and running in less than an hour (this after we sorted out the usual power and logistical issues that always come with having a booth).

 

 
The slide below illustrates the topology of the demo. Essentially, we have two ConnectX adapter cards in one of the Dell servers, each running a different interconnect fabric: one adapter is running 40Gb/s InfiniBand, while the other is running 10 Gigabit Ethernet.

1. The 40Gb/s InfiniBand adapter is connected to our MTS3600 40Gb/s InfiniBand switch; traffic then passes through the BridgeX BX4020, where we convert the packets to Ethernet. The packets then run through the Arista 10GigE switch and into the LeftHand appliance virtual machine, which resides on the Dell server (running ESX 3.5 and our certified 10GigE driver over our ConnectX EN 10GigE SFP+ adapter). We are showing a movie from the iSCSI storage on the IB end-point (the Dell Linux server).

2. The 10 Gigabit Ethernet adapter connects directly to the BridgeX BX4020, where it converts the traffic to FC (effectively FCoE). The traffic then moves to the Brocade Fibre Channel switch and then directly to the NetApp storage. We are showing a movie from the FC NetApp storage on the 10GigE end-point (the Dell Linux server).

If you are coming to VMworld Europe (or are already here), come see us at Booth #100 and we will be happy to walk you through the demo.

Brian Sparks
Director, Marketing Communications
brian@mellanox.com

I/O Agnostic Fabric Consolidation

Today we announced one of our most innovative and strategic products: BridgeX, an I/O-agnostic fabric consolidation silicon. Drop it into a 1U enclosure and it becomes a full-fledged system (the BX4000).

A few years back, we defined a product strategy to deliver single-wire I/O consolidation to data centers. The approach was not to support some random transports to deliver I/O consolidation, but to use the transports that data centers already rely on for the smooth running of their businesses. ConnectX, an offspring of this strategy, supports InfiniBand, Ethernet, and FCoE. ConnectX consolidates the I/O on the adapter, but the data still has to go through different access switches. BridgeX, the second offspring of this strategy, provides stateless gateway functionality that allows for access-layer consolidation. BridgeX lets data centers remove two fabrics by deploying a single InfiniBand fabric that can present several virtualized GigE, 10GigE, and 2, 4, or 8Gb FC interfaces in a single physical server. BridgeX, together with its software counterpart BridgeX Manager running alongside on a CPU, delivers management functionality for vNICs and vHBAs for both virtualized OSs (VMware, Xen, Hyper-V) and non-virtualized OSs (Linux and Windows).

Virtual I/O drivers and BridgeX’s stateless gateway implementation preserve packet and frame integrity. The virtual I/O drivers on the host add InfiniBand headers to the Ethernet or Fibre Channel frames, and the gateway (BridgeX) removes the headers and delivers the traffic on the appropriate LAN or SAN port. Similarly, the gateway adds the InfiniBand headers to the packets and frames it receives from the LAN or SAN side and sends them to the host, which removes the encapsulation and delivers the packet or frame to the application. This simple and innovative implementation significantly reduces not only deployment costs but also energy and cooling costs.
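To make the encapsulation idea concrete, here is a deliberately simplified sketch of the two directions: the host-side driver prepends an InfiniBand-style transport header to an Ethernet frame, and the gateway strips it again. The ib_transport_hdr layout is illustrative only; it is not the real LRH/BTH wire format, nor BridgeX’s actual implementation.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative only: a stand-in for the InfiniBand transport headers that
 * the host-side virtual I/O driver prepends; not the real wire format. */
struct ib_transport_hdr {
    uint16_t dest_lid;     /* destination on the InfiniBand fabric        */
    uint16_t src_lid;
    uint32_t dest_qp;      /* queue pair that terminates the tunnel       */
    uint32_t psn;          /* packet sequence number                      */
};

/* Host side: wrap an Ethernet frame for transport over InfiniBand.
 * Returns the total encapsulated length, or 0 if it does not fit. */
static size_t encapsulate(const struct ib_transport_hdr *hdr,
                          const uint8_t *eth_frame, size_t eth_len,
                          uint8_t *out, size_t out_cap)
{
    if (sizeof(*hdr) + eth_len > out_cap)
        return 0;
    memcpy(out, hdr, sizeof(*hdr));                  /* IB header first        */
    memcpy(out + sizeof(*hdr), eth_frame, eth_len);  /* original frame intact  */
    return sizeof(*hdr) + eth_len;
}

/* Gateway side: strip the IB header and recover the original frame. */
static size_t decapsulate(const uint8_t *pkt, size_t pkt_len,
                          uint8_t *eth_out, size_t out_cap)
{
    if (pkt_len < sizeof(struct ib_transport_hdr))
        return 0;
    size_t eth_len = pkt_len - sizeof(struct ib_transport_hdr);
    if (eth_len > out_cap)
        return 0;
    memcpy(eth_out, pkt + sizeof(struct ib_transport_hdr), eth_len);
    return eth_len;
}
```

Because the gateway only adds and strips headers rather than terminating connections, it keeps no per-flow state, which is what “stateless” means here.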

We briefed several analysts over the last few weeks, and most of them concurred that the product is innovative and that, in times like these, a BridgeX-based solution can cut costs, speed up deployments, and improve performance.

TA Ramanujam (TAR)
tar@mellanox.com