All posts by Scot Schultz

About Scot Schultz

Scot Schultz is an HPC technology specialist with broad knowledge of operating systems, high-speed interconnects and processor technologies. Joining the Mellanox team in March 2013 as Director of HPC and Technical Computing, Schultz is a 25-year veteran of the computing industry. Prior to joining Mellanox, he spent 17 years at AMD in various engineering and leadership roles, most recently in strategic HPC technology ecosystem enablement. Scot was also instrumental in the growth and development of the OpenFabrics Alliance as co-chair of its board of directors. He currently serves as Director of Educational Outreach for the HPC Advisory Council, of which he is a founding member, and participates in various other industry organizations. Follow him on Twitter: @ScotSchultz


Mellanox and Zettar Crush World Record LOSF Performance Using ESnet OSCARS Test Circuit

In the wake of SC16, Mellanox has just broken the record for Lots of Small Files (LOSF) performance using the ESnet OSCARS test circuit, measuring 70Gb/s. Preliminary results show a ten-fold performance improvement (40+ Gb/s, even with TLS encryption enabled) over the best LOSF results DOE researchers have reported thus far (4 Gb/s)[i], despite the bandwidth cap and QoS limitations of the ESnet OSCARS test circuit.

At SC16, Mellanox and Zettar demonstrated real-time transfers round-trip from SLAC to Atlanta, Georgia, and back to SLAC over a 5,000-mile ESnet OSCARS loop. The two companies also exhibited real-time data transfers over two 100Gb/s LAN links, showing line-rate performance moving data memory-to-memory and file-to-file between clusters. The configuration leveraged Mellanox 100Gb/s InfiniBand connectivity on the storage back end and Mellanox 100Gb/s Ethernet connectivity on the front end. The motivation for this work is that the next-generation Linac Coherent Light Source experiment (LCLS-II) at SLAC is expected to achieve an event rate 1,000 times that of today's LCLS. The majority of the data analysis will be performed at the NERSC supercomputer center at Lawrence Berkeley National Laboratory, so a solution capable of supporting this distributed, data-intensive project is essential.
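To see why LOSF transfers behave so differently from bulk transfers, here is a minimal back-of-envelope model. This is a sketch only; the 2 ms per-file overhead and the simple pipelining assumption are illustrative, not figures from the demonstration:

```python
# Illustrative model of Lots-of-Small-Files (LOSF) throughput.
# The per-file overhead (metadata operations, open/close, acknowledgement
# round trips, TLS handshake amortization) is an assumed figure for
# illustration only, not a measurement from the SC16 demonstration.

LINK_GBPS = 100.0            # raw link rate, Gb/s
PER_FILE_OVERHEAD_S = 0.002  # assumed fixed cost per file, seconds

def effective_gbps(file_size_mb: float, concurrent_streams: int = 1) -> float:
    """Effective throughput when moving many files of a given size."""
    wire_time = (file_size_mb * 8 / 1000) / LINK_GBPS       # seconds on the wire
    # Crude pipelining model: overlapping streams hide per-file overhead.
    per_file_time = wire_time + PER_FILE_OVERHEAD_S / concurrent_streams
    return (file_size_mb * 8 / 1000) / per_file_time        # Gb/s

for size_mb in (1, 10, 100, 1000):
    print(f"{size_mb:>5} MB files: "
          f"{effective_gbps(size_mb):6.1f} Gb/s serial, "
          f"{effective_gbps(size_mb, concurrent_streams=32):6.1f} Gb/s with 32-way pipelining")
```

The model shows the qualitative point: for small files the fixed per-file cost dominates, so only aggressive concurrency and pipelining (the kind of scale-out approach Zettar describes) keeps a 100Gb/s link busy.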

Mellanox was delighted and honored to participate in this important technology demonstration, which leveraged a complete, state-of-the-art 100Gb/s InfiniBand and Ethernet connectivity solution. By showing that the foundational interconnect requirements of the LCLS-II project can be met, we now have hard evidence that co-design and open standards are on the trajectory needed to drive next-generation requirements for both science and data.

“Providing a scale-out data transfer solution consisting of matching software and a transfer system design will be paramount for the amount of data generated by projects such as LCLS-II,” said Dr. Chin Fang, CEO and Founder, Zettar, Inc. “Harnessing the latest capabilities of RDMA, 100Gb/s InfiniBand and Ethernet with Zettar’s scale-out data transfer solution, we can achieve the performance needed to satisfy the demands of the future data centers for the most data-intensive research such as LCLS-II, even with the formidable challenges found in LOSF transfers.”

“The rates to transfer the data to NERSC are expected to reach several hundred Gb/s soon after the project turns on in 2020 and exceed a terabyte per second by 2025,” said Dr. Les Cottrell, SLAC National Accelerator Laboratory. “This demonstration will bring to light the growing need we are experiencing for data transfer and High Performance Computing (HPC) for analysis.”

ESnet provides the high-bandwidth, reliable connections that link scientists at national laboratories, universities and other research institutions, enabling them to collaborate on some of the world’s most important scientific challenges including energy, climate science, and the origins of the universe. Funded by the DOE Office of Science, ESnet is managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory. As a nationwide infrastructure and DOE User Facility, ESnet provides scientists with access to unique DOE research facilities and extensive computing resources.

Zettar Inc. delivers scale-out data transfer software and architected a data transfer cluster design that proved the feasibility of using compact, energy-efficient, high-density servers for high-performance big-data transfers. The design leverages the industry-leading Mellanox 100Gb/s ConnectX-4 adapters, Switch-IB 2 InfiniBand switches and the Mellanox SN2410 Spectrum-based 48-port 25GbE + 8-port 100GbE Open Ethernet Platform switches.

Supporting Resources:

[i] See http://datasys.cs.iit.edu/publications/2013_CCGrid13-100Gbps.pdf, Figures 6 and 7

Mellanox Technologies Honored with Six HPCwire Readers’ and Editors’ Choice Awards at the Supercomputing Conference

At SC16, Mellanox’s commitment to innovation and dedication to cultivating the HPC community led to multiple honors, including an award for outstanding leadership in HPC. The company was honored with six HPCwire Readers’ and Editors’ Choice Awards, spanning a variety of categories and acknowledging the company’s achievements in delivering high-performance interconnect technology that enables the highest-performance, most efficient compute and storage platforms. Mellanox Vice President of Marketing, Gilad Shainer, also received an individual award for Outstanding Leadership in HPC, recognizing his contributions to the community over the course of his career, including eight years of service as Chairman of the HPC Advisory Council and his role in the development of the Co-Design architecture.

We were further honored by the recognition from Tom Tabor, CEO of Tabor Communications and publisher of HPCwire: “HPCwire is honored to recognize Mellanox’s technological achievements and applaud Mellanox’s commitment to the HPC community,” said Tabor. “Mellanox has long been a thought leader in the HPC space, recognizing the important role it plays in research, science and global progress, and the company continues to push the innovation envelope with each passing year. This is not the first year – and I’m sure it will not be the last – that our readers and editors have recognized Mellanox’s incredible contributions to the advancement of HPC.”

Mellanox was honored with the following Readers’ Choice Awards:

  • Best Interconnect Product Or Technology: Mellanox EDR InfiniBand
  • Top Five New Products Or Technologies To Watch: Mellanox ConnectX-5
  • Outstanding Leadership in HPC: Gilad Shainer, VP of Marketing, Mellanox

Mellanox was honored with the following Editors’ Choice Awards:

  • Best HPC Collaboration Between Government & Industry: PNNL’s Center for Advanced Technology Evaluation (CENATE), a program for early evaluation of technologies, which is currently assessing products from Micron Technology, Mellanox Technologies, Penguin Computing, NVIDIA, IBM, and Data Vortex.
  • Best Use Of HPC In The Cloud: The San Diego Supercomputer Center and Mellanox collaboration, which made a historic advancement by recording the gravitational waves from two black holes colliding millions of light-years away.
  • Best HPC Interconnect Product Or Technology: Mellanox EDR InfiniBand

The show is in full swing and Mellanox is in the thick of the action. Tonight, we host a party with some seriously funny talent, a.k.a. Gabriel “Fluffy” Iglesias. Follow us and don’t miss a minute of the action on Twitter, Facebook, Google+, LinkedIn, and YouTube.

Visit Mellanox Technologies at SC16 (November 14-17, 2016)

Visit Mellanox Technologies at SC16 (booth #2631) to learn more on the new 200G HDR InfiniBand solutions and to see the full suite of Mellanox’s end-to-end high-performance InfiniBand and Ethernet solutions.

For more information on Mellanox’s booth and speaking activities at SC16, please visit: http://www.mellanox.com/sc16/.

As you can see, we have a lot going on at Supercomputing, so stay tuned: more blogs and news to come.


Mellanox Takes Supercomputing16 by Storm

Although activities have been going on for a few days now, Supercomputing 16 officially opened last night in chilly Salt Lake City, Utah, with Mellanox Technologies leading the charge. I’ll be bringing you updates throughout the week, starting with today’s milestones and several key announcements. Today’s highlights include:

Big news from the TOP500 supercomputers list: Mellanox was chosen by nearly four times more end users than proprietary offerings in 2016. We are accelerating the fastest supercomputer on the list and connecting 65 percent of overall HPC systems, as well as all of the 40G Ethernet systems and the first 100G Ethernet systems. The announcement reflects that our InfiniBand solutions were chosen in nearly four times more end-user projects in 2016 than Omni-Path, and five times more than other proprietary offerings, demonstrating an increase in both InfiniBand usage and market share. InfiniBand accelerates 65 percent of the total HPC systems on the list and 46 percent of the Petaflop-scale infrastructures. Mellanox continues to connect the fastest supercomputer on the list, delivering the highest scalability, performance and efficiency.

Published twice a year and publicly available at www.top500.org, the TOP500 list ranks the world’s most powerful computer systems according to the LINPACK benchmark. A detailed TOP500 presentation can be found here: TOP500

  • Mellanox InfiniBand accelerates the new Australian National Computational Infrastructure (NCI) supercomputer! We just announced that NCI has chosen Mellanox’s 100Gb/s EDR InfiniBand interconnect for its new Lenovo NeXtScale supercomputer. The new system will deliver a 40 percent increase in NCI’s computational capacity starting in January 2017. The solution will also leverage Mellanox smart interconnect and In-Network Computing technology to maximize application performance, efficiency and scalability.
  • Mellanox is driving Virtual Reality to new levels of performance with a demonstration of ultra-low latency over long distances using Mellanox 100Gb/s EDR InfiniBand. At the Supercomputing Conference, Mellanox and Scalable Graphics are showcasing an ultra-low latency solution that demonstrates the ultimate extended virtual reality experience for rapidly growing industry markets including computer-aided engineering, oil and gas, manufacturing, medical, gaming and others. By leveraging the high throughput and low latency of Mellanox 100Gb/s ConnectX®-4 InfiniBand, the Scalable Graphics VR-Link Expander provides a near-zero-latency streaming solution that delivers an optimal Virtual Reality experience even over long distances.
  • Just last Thursday, Mellanox announced the world’s first 200Gb/s InfiniBand data center interconnect solutions. Mellanox ConnectX-6 adapters, Quantum switches and LinkX cables and transceivers together provide a complete 200Gb/s HDR InfiniBand interconnect infrastructure for the next generation of high performance computing, machine learning, big data, cloud, web 2.0 and storage platforms. These 200Gb/s HDR InfiniBand solutions maintain Mellanox’s generation-ahead leadership while enabling customers and users to leverage an open, standards-based technology that maximizes application performance and scalability while minimizing overall data center total cost of ownership. Mellanox 200Gb/s HDR solutions will become generally available in 2017. To quote Mellanox’s CEO, Eyal Waldman:

“The ability to effectively utilize the exponential growth of data and to leverage data insights to gain that competitive advantage in real time is key for business success, homeland security, technology innovation, new research capabilities and beyond. The network is a critical enabler in today’s system designs that will propel the most demanding applications and drive the next life-changing discoveries,” said Eyal Waldman, president and CEO of Mellanox Technologies. “Mellanox is proud to announce the new 200Gb/s HDR InfiniBand solutions that will deliver the world’s highest data speeds and intelligent interconnect and empower the world of data in which we live. HDR InfiniBand sets a new level of performance and scalability records while delivering the next-generation of interconnects needs to our customers and partners.”

In addition, Mellanox received praise and support for the announcement from industry leaders including:

“Ten years ago, when Intersect360 Research began its business tracking the HPC market, InfiniBand had just become the predominant high-performance interconnect option for clusters, with Mellanox as the leading provider,” said Addison Snell, CEO of Intersect360 Research. “Over time, InfiniBand continued to grow, and today it is the leading high-performance storage interconnect for HPC systems as well. This is at a time when high data rate applications like analytics and machine learning are expanding rapidly, increasing the need for high-bandwidth, low-latency interconnects into even more markets. HDR InfiniBand is a big leap forward and Mellanox is making it a reality at a great time.”

  • Finally, last week, in tandem with powerhouses Tencent and IBM, we were part of a blockbuster announcement: together we were named the 2016 winners of Sort Benchmark’s annual global computing competition. Tencent broke records in the GraySort and MinuteSort categories, improving on last year’s overall results from Alibaba by up to five times and achieving more than one terabyte per second of sort performance. In addition, the results improved by up to 33 times on a per-node basis.

Using 512 OpenPOWER-based servers with NVMe-based storage and Mellanox ConnectX®-4 100Gb/s Ethernet adapters, TencentCloud took less than 99 seconds to sort a massive 100 terabytes of data, using 85 percent fewer servers than the 3,377 servers used by last year’s winner. To achieve this, Tencent developed its own sort application and tuned it specifically for the benchmark. The combination of sorting, NVMe storage and high-performance CPUs pushes the analytics boundary, so the latency and bandwidth of the network play a crucial part in achieving maximum performance. With its advanced hardware-based stateless offloads and flow steering engine, Mellanox’s ConnectX-4 adapter reduces the CPU overhead of packet processing and provides the lowest latency and highest bandwidth.
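As a quick sanity check, using only the figures quoted above (no additional data), the per-node data rate works out to roughly 2 GB/s, well within what a single 100Gb/s Ethernet port can carry:

```python
# Back-of-envelope check of the TencentCloud sort figures quoted above.
total_tb = 100        # terabytes sorted
elapsed_s = 99        # seconds (upper bound: "less than 99 seconds")
servers = 512         # OpenPOWER-based servers

aggregate_gbs = total_tb * 1000 / elapsed_s     # GB/s across the whole cluster
per_node_gbs = aggregate_gbs / servers          # GB/s per server
port_gbs = 100 / 8                              # one 100Gb/s Ethernet port = 12.5 GB/s

print(f"Aggregate sort rate : ~{aggregate_gbs:.0f} GB/s")   # ~1 TB/s, as reported
print(f"Per-node sort rate  : ~{per_node_gbs:.2f} GB/s")
print(f"Single 100GbE port  : {port_gbs:.1f} GB/s")
```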

Visit Mellanox Technologies at SC16 (November 14-17, 2016)

Visit Mellanox Technologies at SC16 (booth #2631) to learn more on the new 200G HDR InfiniBand solutions and to see the full suite of Mellanox’s end-to-end high-performance InfiniBand and Ethernet solutions.

For more information on Mellanox’s booth and speaking activities at SC16, please visit: http://www.mellanox.com/sc16/.

As you can see, we have a lot going on at Supercomputing, so stay tuned: more blogs and news to come.

Becoming Familiar with the Mellanox “Smart Interconnect”


Mellanox is renowned for furthering the development of a more effective and efficient interconnect, and today we are weaving intelligence into the interconnect fabric, improving our acceleration engines, and adding capabilities that further offload communication tasks from the CPU and dramatically increase system efficiency.

Historically, performance increases have been pursued with a CPU-centric mindset, developing individual hardware devices, drivers, middleware, and software applications in order to improve scalability and maximize throughput. That model is reaching its limits: as the new era of Co-Design moves the industry toward Exascale-class computing, creating synergies between all system elements is the only approach that can deliver significant performance improvements.

 

As Mellanox advances the Co-Design approach, we strive for more CPU offload capabilities and acceleration techniques while maintaining forward and backward compatibility across new and existing infrastructures; the result is nothing less than the world’s most advanced interconnect, which continues to yield the most powerful and efficient supercomputers ever deployed.

 

ONLOAD vs. OFFLOAD

IF (!ONLOAD) {

ONLOAD technology, such as that attempted years ago with PathScale’s InfiniPath and QLogic’s TrueScale products, has long since been abandoned; the failing IP has actually been sold off twice over. ONLOAD, or the “dumb NIC” approach, was developed to take advantage of additional CPU cores that were not being effectively leveraged by the older software ecosystem, which had not yet matured to exploit emerging multi-core processors. For a short-lived window, there was an opportunity to tax the unused CPU cores and see some benefit from the very simple ONLOAD network host channel adapter design.
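A rough way to see the cost of the ONLOAD approach is to budget host cycles against link speed. The sketch below uses assumed, illustrative cycles-per-byte figures rather than measured ones, but it shows why the math turns against onload as link speeds climb:

```python
# Illustrative (assumed, not measured) model of why ONLOAD designs tax the host:
# every byte moved costs host CPU cycles for protocol processing and copies,
# while an OFFLOAD design delegates most of that work to the adapter (e.g. RDMA).

CORE_GHZ = 2.5                  # assumed clock of one host core
CYCLES_PER_BYTE_ONLOAD = 1.0    # assumed host cycles per byte with onload processing
CYCLES_PER_BYTE_OFFLOAD = 0.05  # assumed residual host cost with full offload

def cores_consumed(link_gbps: float, cycles_per_byte: float) -> float:
    """Host cores fully occupied just to keep the link busy."""
    bytes_per_s = link_gbps / 8 * 1e9
    return bytes_per_s * cycles_per_byte / (CORE_GHZ * 1e9)

for gbps in (10, 56, 100):
    print(f"{gbps:>3} Gb/s link: "
          f"onload ~{cores_consumed(gbps, CYCLES_PER_BYTE_ONLOAD):.1f} cores, "
          f"offload ~{cores_consumed(gbps, CYCLES_PER_BYTE_OFFLOAD):.2f} cores")
```

Under these assumptions, a 100Gb/s link serviced by onload processing ties up several cores that could otherwise run the application, which is exactly the overhead an offloading adapter is designed to remove.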

Continue reading

HPC Leaders Collaborate on Communication Framework to Enable the Unification of High-Performance and Data-Centric Applications

During ISC’15 in Frankfurt, Germany last week, a new unified communication framework was announced, the result of a collaboration between hardware vendors, the academic community and industry leaders. Mellanox, Oak Ridge National Laboratory (ORNL), NVIDIA, IBM, the University of Houston (UH) and the University of Tennessee, Knoxville (UTK) have banded together to develop the Unified Communication X (UCX) framework: an open-source, production-grade communication framework for high-performance computing (HPC) and data-centric applications. The new framework will support all communication libraries and enable a closer connection to the underlying hardware.

 

A new community of supporters has been established behind the UCX effort, including key participants from the HPC industry, national laboratories and academia who will help usher the project forward. At the core of the UCX project are the combined features, ideas and concepts of industry-leading technologies including MXM, PAMI and UCCS. Mellanox Technologies has contributed its MXM technology, which significantly increases the scalability and performance of message communication in the network. ORNL has contributed the UCCS elements, which center on InfiniBand optimizations, support for Cray interconnect devices and unique shared-memory techniques.

 

NVIDIA has also contributed co-design efforts for high-quality GPU acceleration support that aims to be more tightly coupled with networking operations. IBM has led the co-design effort for the network interface and contributes ideas and concepts from PAMI, while UH and UTK have focused on integrating the project with their research platforms.
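Conceptually, a unified framework means that upper layers (MPI, PGAS, data-centric runtimes) program against one interface while transport back ends plug in underneath. The toy sketch below is purely hypothetical and is not the actual UCX API; it only illustrates that layering idea:

```python
# A purely hypothetical sketch of a unified communication layer, in the spirit
# of UCX but NOT its real API: upper layers call one interface, and transports
# (shared memory, InfiniBand verbs, GPU-aware paths, ...) register underneath.
from abc import ABC, abstractmethod


class Transport(ABC):
    """One back end, e.g. shared memory, InfiniBand, or a GPU-direct path."""

    @abstractmethod
    def send(self, peer: str, tag: int, payload: bytes) -> None: ...

    @abstractmethod
    def recv(self, peer: str, tag: int) -> bytes: ...


class SharedMemoryTransport(Transport):
    """Toy intra-node transport backed by an in-process dictionary."""

    def __init__(self) -> None:
        self._mailboxes: dict[tuple[str, int], bytes] = {}

    def send(self, peer: str, tag: int, payload: bytes) -> None:
        self._mailboxes[(peer, tag)] = payload

    def recv(self, peer: str, tag: int) -> bytes:
        return self._mailboxes.pop((peer, tag))


class CommContext:
    """The 'unified' layer: one API on top, interchangeable transports below."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def tag_send(self, peer: str, tag: int, payload: bytes) -> None:
        self._transport.send(peer, tag, payload)

    def tag_recv(self, peer: str, tag: int) -> bytes:
        return self._transport.recv(peer, tag)


if __name__ == "__main__":
    ctx = CommContext(SharedMemoryTransport())
    ctx.tag_send("rank1", tag=7, payload=b"halo exchange")
    print(ctx.tag_recv("rank1", tag=7))
```

In the real project, the value of this layering is that a communication library is written once against the common interface and automatically benefits as new hardware-specific transports and offloads are contributed.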

 

Continue reading

Helping to Generate Economic Growth with the OpenPOWER Ecosystem

We are excited to share today’s announcement by the OpenPOWER Foundation to enable the STFC Hartree Centre. I wrote a short blog post about the announcement, excerpted here:

 

In the latest announcement, the STFC Hartree Centre is setting out to enable the latest world-class, state-of-the-art technologies for the development of advanced software solutions that solve real-world challenges in academia, industry and government and tackle the ever-growing issues of big data. The architecture will include POWER CPUs from IBM, the latest in flash-memory storage, GPUs from NVIDIA and, of course, the most advanced networking technology from Mellanox. Enhanced with native support for CAPI technology and network-offload acceleration capabilities, the Mellanox interconnect will rapidly shuttle data around the system in the most effective and efficient manner, keeping the cores focused on crunching data rather than processing network communications.

 

Read the full post here:  http://openpowerfoundation.org/blogs/mellanox-and-the-openpower-ecosystem-to-help-generate-economic-growth/


ISC 2014 Student Cluster Challenge: EPCC Record-Breaking Cluster

The University of Edinburgh’s entry into the ISC 2014 Student Cluster Competition, EPCC, has been awarded first place in the LINPACK test. The EPCC team harnessed Boston’s HPC cluster to smash the 10 Tflops mark for the first time, shattering the previous record of 9.27 Tflops set by students at ASC14 earlier this year. The team recorded a score of 10.14 Tflops at 3.38 Tflops/kW, an efficiency that would rank #4 on the Green500, the list of the most energy-efficient supercomputers in the world.
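For readers who want to check the arithmetic, the efficiency figure implies a power draw of roughly 3 kW under load. This is a rough calculation using only the numbers quoted above:

```python
# Rough check of the EPCC figures quoted above (no new data, just arithmetic).
linpack_tflops = 10.14            # sustained LINPACK score
efficiency_tflops_per_kw = 3.38   # reported energy efficiency

power_kw = linpack_tflops / efficiency_tflops_per_kw   # implied power under load
gflops_per_watt = efficiency_tflops_per_kw             # Tflops/kW equals Gflops/W

print(f"Implied power draw: ~{power_kw:.1f} kW")        # ~3.0 kW
print(f"Energy efficiency : {gflops_per_watt:.2f} Gflops/W")
```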

 


Members: Chenhui Quan, Georgios Iniatis, Xu Guo,
Emmanouil Farsarakis, Konstantinos Mouzakitis
Photo Courtesy: HPC Advisory Council

 

This achievement was made possible thanks to the provisioning of a high-performance, liquid-cooled GPU cluster by Boston. The system consisted of four 1U Supermicro servers, each comprising two Intel® Xeon™ ‘Ivy Bridge’ processors and two NVIDIA® Tesla K40 GPUs, connected with Mellanox FDR 56Gb/s InfiniBand adapters, switches and cables.

 

Continue reading

Mellanox and IBM Collaborate to Provide Leading Data Center Solution Infrastructures

Mellanox recently announced a collaboration with IBM to produce tightly integrated server and storage solutions that incorporate our end-to-end FDR 56Gb/s InfiniBand and 10/40 Gigabit Ethernet interconnects with IBM POWER CPUs. Combining IBM POWER CPUs with the world’s highest-performance interconnect solution will drive data at optimal rates, maximizing performance and efficiency for all types of applications and workloads, and will enable dynamic storage solutions that allow multiple applications to efficiently share data repositories.


Advances in high-performance applications are enabling analysts, researchers, scientists and engineers to run more complex and detailed simulations and analyses in a bid to gather game-changing insights and deliver new products to market. This is placing greater demand on existing IT infrastructures, driving a need for instant access to resources – compute, storage, and network.

 

Companies are looking for faster and more efficient ways to drive business value from their applications and data.  The combination of IBM processor technologies and Mellanox high-speed interconnect solutions can provide clients with an advanced and efficient foundation to achieve their goals.

Continue reading

Mellanox and IBM Collaborate to Provide Leading Data Center Solution Infrastructures


New advances in Big Data applications are enabling analysts, researchers, scientists and engineers to run more complex and detailed simulations and analyses than ever before.  These applications deliver game-changing insights, bring new products to market and place greater demand on existing IT infrastructures.

 

This ever-growing demand drives the need for instant access to resources – compute, storage, and network. Users are seeking cutting-edge technologies and tools to help them better capture, understand and leverage increasing volumes of data, as well as to build infrastructures that are energy-efficient and can easily scale as their businesses grow.

Continue reading

Mellanox Results are the Best on TopCrunch

The HPC Advisory Council published a best-practices paper showing record application performance for LS-DYNA® automotive crash simulation, one of the automotive industry’s most computationally and network-intensive applications for automotive design and safety. The paper can be downloaded here: HPC Advisory Council: LS-DYNA Performance Benchmark and Profiling.

 

The LS-DYNA benchmarks were run on a Dell™ PowerEdge R720-based cluster comprising 32 nodes, with networking provided by Mellanox Connect-IB™ 56Gb/s InfiniBand adapters and switches. The results demonstrate that the combined solution delivers world-leading performance versus any comparable system at these sizes, and even versus larger core-count supercomputers based on Ethernet or proprietary interconnect solutions.

 

The TopCrunch project is used to track the aggregate performance trends of high performance computer systems and engineering software.  Rather than using a synthetic benchmark, actual engineering software applications are used with real datasets and run on high performance computer systems.

 

