All posts by Cecelia Taylor

About Cecelia Taylor

Cecelia has served as the Sr. Social Media Manager for Mellanox since 2013. She previously worked at Cisco & ZipRealty managing social media marketing, publishing and metrics. Prior to her career in social media, she worked in audience development marketing for B2B publishing. She has a BA from Mills College and resides in the SF East Bay. Follow her on Twitter: @CeceliaTaylor

Deploying HPC Clusters with Mellanox InfiniBand Interconnect Solutions

High-performance simulations require the most efficient compute platforms. The execution time of a given simulation depends on many factors, such as the number of CPU/GPU cores and their utilization, and on the interconnect's performance, efficiency, and scalability. Efficient high-performance computing systems require high-bandwidth, low-latency connections between thousands of multi-processor nodes, as well as high-speed storage systems.

Mellanox has released “Deploying HPC Clusters with Mellanox InfiniBand Interconnect Solutions”. This guide describes how to design, build, and test a high-performance computing (HPC) cluster using the Mellanox® InfiniBand interconnect, covering the installation and setup of the infrastructure, including:

  • HPC cluster design
  • Installation and configuration of the Mellanox Interconnect components
  • Cluster configuration and performance testing (a brief testing sketch follows this list)
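As an illustration of the performance-testing step above, here is a minimal, hedged sketch that wraps the standard OFED perftest latency benchmark (ib_write_lat) from Python. It assumes the perftest package is installed on both nodes, that a server-side instance of ib_write_lat is already running on the target host, and that the hostname node01 is only a placeholder.

```python
# Minimal sketch: run the client side of the OFED perftest RDMA write latency
# benchmark and print its report. Assumes "ib_write_lat" (from the perftest
# package) is installed and already running in server mode on SERVER_HOST.
import subprocess

SERVER_HOST = "node01"  # placeholder hostname of the benchmark server


def run_ib_write_lat(server: str) -> str:
    """Run ib_write_lat as a client against `server` and return its text report."""
    result = subprocess.run(
        ["ib_write_lat", server],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_ib_write_lat(SERVER_HOST))
```

A matching bandwidth test (ib_write_bw) can be wrapped the same way, and the reported latency and throughput compared against the cluster's expected baseline.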

Author: Scot Schultz is an HPC technology specialist with broad knowledge of operating systems, high-speed interconnects and processor technologies. Joining the Mellanox team in March 2013 as Director of HPC and Technical Computing, Schultz is a 25-year veteran of the computing industry. Prior to joining Mellanox, he spent 17 years at AMD in various engineering and leadership roles, most recently in strategic HPC technology ecosystem enablement. Scot was also instrumental in the growth and development of the OpenFabrics Alliance as co-chair of its board of directors. Scot currently maintains his role as Director of Educational Outreach and is a founding member of the HPC Advisory Council and of various other industry organizations.

ConnectX-3 Pro Hardware Offload Engines

ConnectX-3 Pro, a new addition to the ConnectX-3 family, shows significant CPU overhead reduction and performance improvement while running NVGRE, dramatically improving ROI for cloud providers by reducing application running costs.

We conducted initial tests to measure the performance improvements and the CPU overhead reduction while utilizing the ConnectX-3 Pro NVGRE hardware offload engines.
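For readers who want to reproduce this kind of measurement in their own environment, the following is a rough, hypothetical sketch (not the actual Mellanox test harness) of one way to sample CPU utilization while a traffic generator runs. It assumes the widely used iperf3 tool and the Python psutil package are installed, and that an iperf3 server is already listening on the target address; the IP shown is a placeholder.

```python
# Hypothetical sketch of the measurement methodology: sample system-wide CPU
# utilization while a network benchmark (iperf3) is running, then report the
# average. This is NOT the Mellanox test setup, only an illustration.
import subprocess
import psutil


def run_benchmark_with_cpu_sampling(server_ip: str, duration: int = 30):
    """Run an iperf3 client for `duration` seconds, sampling CPU usage once per second."""
    proc = subprocess.Popen(
        ["iperf3", "-c", server_ip, "-t", str(duration)],
        stdout=subprocess.PIPE, text=True,
    )
    samples = []
    while proc.poll() is None:
        # cpu_percent(interval=1) blocks for one second and returns utilization in %
        samples.append(psutil.cpu_percent(interval=1))
    report = proc.stdout.read()
    avg_cpu = sum(samples) / len(samples) if samples else 0.0
    return avg_cpu, report


if __name__ == "__main__":
    avg_cpu, report = run_benchmark_with_cpu_sampling("192.0.2.10")  # placeholder server IP
    print(f"Average CPU utilization during the run: {avg_cpu:.1f}%")
    print(report)
```

Comparing the average CPU figure for runs with the offload engines enabled versus disabled gives the kind of overhead delta reported below.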

Results show 2x performance improvement and 40% CPU overhead reduction!

ConnectX-3 Pro also supports VXLAN hardware offload engines in addition to NVGRE, and it is the first adapter on the market to support hardware offload engines for overlay networks, i.e., NVGRE and VXLAN.
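To give a concrete sense of what is being offloaded, here is a small illustrative sketch (not Mellanox code) that builds the 8-byte VXLAN header defined in RFC 7348; encapsulating and de-encapsulating traffic in headers like this one, packet by packet, is the work that ConnectX-3 Pro moves from the CPU into the adapter.

```python
# Illustrative sketch: the 8-byte VXLAN header from RFC 7348. Building and
# parsing this outer header for every packet is the kind of per-packet work
# that the ConnectX-3 Pro offload engines handle in hardware.
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Return the VXLAN header for a 24-bit VXLAN Network Identifier (VNI)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    word0 = VXLAN_FLAG_VNI_VALID << 24  # flags (8 bits) + 24 reserved bits
    word1 = vni << 8                    # VNI (24 bits) + 8 reserved bits
    return struct.pack("!II", word0, word1)


print(build_vxlan_header(5001).hex())  # -> '0800000000138900'
```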

Author: Gadi Singer – Product Manager, Adapter Drivers. Gadi manages the Adapters Product Line at Mellanox Technologies. He served as Marketing Product Manager for the HCA Software division at Mellanox Technologies from 2012 to 2013. Prior to joining Mellanox, Gadi worked at Anobit and PMC-Sierra as a Product Line Manager. Mr. Singer holds a BSc in Electrical Engineering from Ben-Gurion University in Israel.

Advancing Applications Performance With InfiniBand

High-performance scientific applications typically require the lowest possible latency so that parallel processes stay as closely synchronized as possible. In the past, this requirement drove the adoption of SMP machines, where the floating-point elements (CPUs, GPUs) were placed on the same board as much as possible. With the increased demand for compute capability, and with lower costs of adoption making large-scale HPC more accessible, clustering has become the preferred architecture for high-performance computing.

We introduce and explore some of the latest advancements in high-speed networking and suggest new usage models that leverage these technologies to meet the requirements of today’s demanding applications. The recently launched Mellanox Connect-IB™ InfiniBand adapter introduces a novel high-performance, scalable architecture for high-performance clusters. The architecture was designed from the ground up to provide high performance and scalability for the largest supercomputers in the world, today and in the future.

The device includes a new network transport mechanism called Dynamically Connected Transport™ Service (DCT), which was invented to provide a Reliable Connection transport mechanism (the service that provides many of InfiniBand’s advanced capabilities such as RDMA, large message sends, and low-latency kernel bypass) at an unlimited cluster size. We also discuss optimizations for MPI collective communications, which are frequently used for process synchronization, and show how their performance is critical for scalable, high-performance applications.
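As a point of reference, the sketch below shows what an MPI collective looks like at the application level, using the mpi4py bindings (our choice here, purely for illustration); an Allreduce like this is executed by every rank at once, which is why its latency at scale matters so much.

```python
# Minimal mpi4py sketch of an MPI collective (Allreduce), the class of
# operation whose performance at scale Connect-IB and DCT are designed to help.
# Run with, e.g.: mpirun -np 4 python allreduce_example.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Every rank contributes a small vector; Allreduce sums them across all ranks
# and delivers the identical result to everyone.
local = np.full(4, float(rank))
total = np.empty_like(local)
comm.Allreduce(local, total, op=MPI.SUM)

if rank == 0:
    print("Sum across ranks:", total)
```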

Presented by:  Pak Lui, Application Performance Manager, Mellanox – August 12, 2013 – International Computing for the Atmospheric Sciences Symposium, Annecy, France

Mellanox Delivers High Speed Interconnect Solutions for New IBM NeXtScale System

IBM recently introduced their new NeXtScale System, a flexible computing platform that provides 3X as many cores as current one-unit rack servers, making it ideal for the fastest growing workloads such as social media, analytics, technical computing and cloud delivery.

NeXtScale n1200 Enclosure
IBM NeXtScale System Chassis front fully loaded

IBM and Mellanox have worked closely to develop a platform that addresses multiple large-scale markets and solves a variety of complex research and business issues.

Through the use of ConnectX-3 FDR 56Gb/s InfiniBand and 10/40GbE adapters and SwitchX-2 FDR 56Gb/s InfiniBand and 10/40GbE switches, we can provide IBM NeXtScale customers with unrivaled interconnect performance to address the needs for:

  • Large data centers requiring efficiency, density, scale, and scalability;
  • Public, private and hybrid cloud infrastructures;
  • Data analytics applications like customer relationship management, operational optimization, risk/financial management, and new business models;
  • Internet media applications such as online gaming and video streaming;
  • High-resolution imaging for applications ranging from medicine to oil and gas exploration;
  • “Departmental” uses where a small solution can increase the speed of outcome prediction, engineering analysis, and design and modeling

Mellanox’s technology, combined with the IBM NeXtScale compute density, provides customers with a sustainable competitive advantage in building scale-out compute infrastructures. Customers deploying the joint Mellanox-IBM solution will receive maximum bandwidth, lower power consumption and superior application performance.

Driving Innovation with OpenEthernet

Authored by: Amir Sheffer, Sr. Product Manager

For years, data center Ethernet switching equipment has been based on closed, proprietary vendor implementations, providing very limited flexibility for the user. The progress made in open-source applications and software can be leveraged in Ethernet switches to create a new generation of open, flexible and customizable solutions.

Open Source Enables New Solutions / Trends / Technologies

Switches based on the OpenEthernet approach will replace traditional closed-code switches and will allow data center customization for optimized and efficient operation. The OpenEthernet switch is based on functionality developed by the equipment vendor, integrated with public, open cores and tools that can be freely downloaded from the internet.

As a leader of this approach, Mellanox is investing in the integration and development of such tools, which, when combined, provide complete functionality. Examples of such tools include OpenFlow for flow configuration, Puppet and Chef for switch configuration, and Quagga for routing protocols.

Open Ethernet

Mellanox switch software runs over Linux. While the Linux kernel provides a good infrastructure for the switch, it lacks the functionality to connect that infrastructure to the switching and routing functions. For example, a routing reflector unit is required to synchronize the Linux kernel, the routing stack and the silicon data path. For this purpose, we are developing such “reflector” units and opening them to the open community.
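To make the idea of a “reflector” more concrete, here is a heavily simplified, hypothetical sketch: it reads the kernel routing table using the real iproute2 JSON output (ip -j route show) and hands each route to a placeholder function standing in for a vendor SDK call that would program the switch silicon. The name program_hardware_route is our invention for illustration, not an actual Mellanox API.

```python
# Hypothetical, simplified "route reflector": mirror kernel routes into switch
# hardware. The SDK call is a placeholder; only the iproute2 invocation is real.
import json
import subprocess


def read_kernel_routes():
    """Return the kernel's IPv4 routing table, parsed from iproute2 JSON output."""
    out = subprocess.run(
        ["ip", "-j", "route", "show"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)


def program_hardware_route(route: dict) -> None:
    """Placeholder for a vendor SDK call that programs one route into the ASIC."""
    print(f"programming {route.get('dst')} via {route.get('gateway', 'directly connected')}")


if __name__ == "__main__":
    for route in read_kernel_routes():
        program_hardware_route(route)
```

A production reflector would listen for netlink route-change events rather than polling, but the division of labor is the same: the kernel owns the routing state, and the reflector keeps the silicon data path in sync with it.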

Another example is the hardware driver or the software development kit (SDK) application programming interface (API) for the switch. By opening the API to the community, we will be the first to enable full flexibility and ease of implementation for our customers, and we believe others will follow.

In parallel, Mellanox is participating in industry-wide groups that are taking a similar approach. One example is the OpenStack community, in which Mellanox is an active member. Another is the Open Compute Project (OCP), which is defining open and standard equipment for data centers. Mellanox already builds OCP-compatible NICs and has recently contributed the hardware design documents of the SX1024 switch system to OCP.

So far, we have briefly touched on several aspects of OpenEthernet. An important feature that will be explained in the coming weeks is the separation of hardware and software.

To be continued…

The Storage Fabric of the Future Virtualized Data Center

Guest post by Nelson Nahum, Zadara Storage

It is evident that the future data center will be based on cutting-edge software and virtualization technologies in order to make the most effective use of hardware, compute power and storage, to perform essential analytics, and to increase the performance of media-related and advanced web applications. And it turns out that the wires connecting all this technology together are no less crucial to next-generation data centers and clouds than the software and virtualization layers that run within them.


There are multiple storage fabrics and interconnects available today, including Fibre Channel, Ethernet and SAS. Each has its pros and cons, and fabrics are chosen according to needs for performance, compatibility and cost efficiency.

As an enterprise storage-as-a-service provider delivering a software-based cloud storage solution for public, private and hybrid cloud models on commodity hardware, Zadara Storage operates in multiple public cloud and colocation facilities around the globe. Consistency, high availability and predictability are key to supplying the scalable, elastic service our customers expect, regardless of their location, facility or the public cloud they employ. The hardware we use needs to be dependable, pervasive and cost-efficient in order to sustain the performance and cost level of our service, anywhere and at any scale.

When we chose our fabric, Ethernet was the clear choice. Ethernet is likely to become the new standard, and it boasts several advantages vital to our product:

  • Ethernet’s speed roadmap is aggressive: from 10GbE to 40GbE, and upcoming 100GbE
  • Ethernet is ubiquitous: we can employ it with no complication at any data center or colocation facility around the globe
  • We have found the latency to be more than manageable, particularly as we use advanced techniques such as IO virtualization and data passthrough
  • Ethernet is the most cost-effective option: an as-a-service company needs a competitive pricing edge

The future of enterprise storage
The future of enterprise storage lies in software and a choice of hardware (premium or commodity). Software-defined storage can scale performance more easily and cost-effectively than monolithic hardware, and by combining the best of hardware and software, the customer wins. Ethernet is a critical element of our infrastructure, and Mellanox switches offer the significantly higher performance and consistent dependability that enable our storage fabric and meet our customers’ needs.

Zadara Storage at the Mellanox Booth at VMworld 2013
Wednesday, August 28, at 2:15pm
At the Mellanox booth at VMworld 2013, Zadara Storage CEO Nelson Nahum will present the Zadara™ Storage Cloud, based on the patent-pending CloudFabric™ architecture and providing a breakthrough cost structure for data centers. Zadara’s software-defined solution employs standard, off-the-shelf x86 servers and uses Ethernet as its only interconnect to provide performant, reliable, SSD- and spindle-based SAN and NAS as a service.

About Zadara Storage
An Amazon Web Services and Dimension Data technology partner, and winner of the VentureBeat, Tie50, Under the Radar, and Plug and Play cloud competitions, Zadara Storage offers enterprise-class storage for the cloud in the form of Storage as a Service (STaaS). With Zadara Storage, cloud storage leapfrogs ahead to provide cloud servers with high-performance, fully configurable, highly available, fully private, tiered SAN and NAS as a service. By combining the best of enterprise storage with the best of cloud and cloud block storage, Zadara Storage accelerates the cloud by enabling enterprises to migrate existing mission-critical applications to the cloud.