InfiniBand is a network communications protocol that offers a switch-based fabric of point-to-point bi-directional serial links between processor nodes, as well as between processor nodes and input/output nodes, such as disks or storage. Every link has exactly one device connected to each end of the link, such that the characteristics controlling the transmission (sending and receiving) at each end are well defined and controlled.
InfiniBand creates a private, protected channel directly between the nodes via switches, and facilitates data and message movement without CPU involvement with Remote Direct Memory Access (RDMA) and Send/Receive offloads that are managed and performed by InfiniBand adapters. The adapters are connected on one end to the CPU over a PCI Express interface and to the InfiniBand subnet through InfiniBand network ports on the other. This provides distinct advantages over other network communications protocols, including higher bandwidth, lower latency, and enhanced scalability.
One of the biggest catchphrases in modern science is the human genome: the DNA coding that largely predetermines who we are and many of our medical outcomes. By mapping and analyzing the structure of the human genetic code, scientists and doctors have already started to identify the causes of many diseases and to pinpoint effective treatments based on the specific genetic sequence of a given patient. With the advanced data that such analysis provides, doctors can offer more targeted strategies for potentially terminal patients at times when no other clinically relevant treatment options exist.
The University of Edinburgh’s entry into the ISC 2014 Student Cluster Competition, EPCC, has been awarded first place in the LINPACK test. The EPCC team harnessed Boston’s HPC cluster to smash the 10 Tflops mark for the first time, shattering the previous record of 9.27 Tflops set by students at ASC14 earlier this month. The team recorded a score of 10.14 Tflops at 3.38 Tflops/kW, an efficiency that would rank #4 in the Green500, a list of the most energy-efficient supercomputers in the world.
This achievement was made possible by a high-performance, liquid-cooled GPU cluster provided by Boston. The system consisted of four 1U Supermicro servers, each comprising two Intel® Xeon™ ‘Ivy Bridge’ processors and two NVIDIA® Tesla K40 GPUs, together with Mellanox FDR 56Gb/s InfiniBand adapters, switches and cables.
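The LINPACK score and the efficiency figure quoted for the EPCC run together imply a power draw, which the text does not state directly. The short Python sketch below derives it, assuming the efficiency was measured over the same run as the score; the variable names are illustrative only.

```python
# Figures quoted in the text.
linpack_tflops = 10.14            # LINPACK score reported for the EPCC run
efficiency_tflops_per_kw = 3.38   # reported energy efficiency
previous_record_tflops = 9.27     # earlier record set at ASC14

# Efficiency = performance / power, so power = performance / efficiency.
# This is a derived value (an assumption), not a figure from the text.
implied_power_kw = linpack_tflops / efficiency_tflops_per_kw

# 1 Tflops/kW = 1e12 flops / 1e3 W = 1 Gflops/W, so the number is unchanged.
efficiency_gflops_per_w = efficiency_tflops_per_kw

# Relative improvement over the previous record, in percent.
improvement_pct = (linpack_tflops / previous_record_tflops - 1) * 100

print(f"Implied power draw: {implied_power_kw:.2f} kW")
print(f"Efficiency:         {efficiency_gflops_per_w:.2f} Gflops/W")
print(f"Gain over record:   {improvement_pct:.1f}%")
```

Under that assumption the cluster drew about 3 kW during the run, and the new score is a little over 9% above the previous record.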
Hadoop has become a leading programming framework in the big data space. Organizations are replacing several traditional architectures with Hadoop and using it as a storage, database, business intelligence and data warehouse solution. Enabling a single file system for Hadoop and other programming frameworks benefits users who need dynamic scalability of compute and/or storage capabilities.
This week, Las Vegas hosts the National Association of Broadcasters conference, or NAB Show. A big focus is the technology needed to deliver movies and TV shows using 4K video.
Standard DVD video resolution is 720×480. Blu-ray resolution is 1920×1080. But, thanks to digital projection in movie theatres and huge flat-screen TVs at home, more video today is being shot in 4K (4096×2160) resolution. The video is stored compressed but must be streamed uncompressed for many editing, rendering, and other post-production workflows. Each frame has over 8 million pixels and, at the same frame rate and color depth, requires roughly 25x the bandwidth of DVD (about 4x the bandwidth of Blu-ray).
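The bandwidth ratios above follow from pixel counts alone, since uncompressed bandwidth scales with pixels per frame when frame rate and color depth are held constant. The Python sketch below checks them and estimates an uncompressed stream rate; the 24 frames per second and 24 bits per pixel defaults are assumptions chosen for illustration, not figures from the text.

```python
# Resolutions quoted in the text, as (width, height) in pixels.
RESOLUTIONS = {
    "DVD": (720, 480),
    "Blu-ray": (1920, 1080),
    "4K": (4096, 2160),
}


def pixels(name):
    """Pixels per frame for a named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


def uncompressed_gbps(name, fps=24, bits_per_pixel=24):
    """Uncompressed stream rate in Gb/s.

    fps and bits_per_pixel are illustrative assumptions; real
    post-production workflows often use higher values for both.
    """
    return pixels(name) * fps * bits_per_pixel / 1e9


for name in RESOLUTIONS:
    ratio = pixels("4K") / pixels(name)
    print(f"{name:8s} {pixels(name):>9,d} px/frame  "
          f"4K is {ratio:4.1f}x  "
          f"{uncompressed_gbps(name):.2f} Gb/s uncompressed")
```

With these assumed defaults a single 4K stream needs about 5 Gb/s before any protocol overhead, and higher frame rates or deeper color push that figure up proportionally.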
People often ask me why Mellanox is interested in storage, since we make high-speed InfiniBand and Ethernet infrastructure, but don’t sell disks or file systems. It is important to understand the four biggest changes going on in storage today: Flash, Scale-Out, Appliances, and Cloud/Big Data. Each of these really deserves its own blog but it’s always good to start with an overview.
Flash is a hot topic, with IDC forecasting it will consume 17% of enterprise storage spending within three years. It’s 10x to 1000x faster than traditional hard disk drives (HDDs) with both higher throughput and lower latency. It can be deployed in storage arrays or in the servers. If in the storage, you need faster server-to-storage connections. If in the servers, you need faster server-to-server connections. Either way, traditional Fibre Channel and iSCSI are not fast enough to keep up. Even though Flash is cheaper than HDDs on a cost/performance basis, it’s still 5x to 10x more expensive on a cost/capacity basis. Customers want to get the most out of their Flash and not “waste” its higher performance on a slow network.
Flash can be 10x faster in throughput, 300-4000x faster in IOPS per GB (slide courtesy of EMC Corporation)
Windows Azure continues to be the leader in High-Performance Computing Cloud services. Delivering an HPC solution built on top of Windows Server technology and Microsoft HPC Pack, Windows Azure offers the performance and scalability of a world-class supercomputing center to everyone, on demand, in the cloud.
Customers can now run compute-intensive workloads such as parallel Message Passing Interface (MPI) applications with HPC Pack in Windows Azure. By choosing compute-intensive instances such as A8 and A9 for the cloud compute resources, customers can deploy these compute resources on demand in Windows Azure in a “burst to the cloud” configuration, and take advantage of InfiniBand interconnect technology with low latency and high throughput, including Remote Direct Memory Access (RDMA) technology for maximum efficiency. The new high-performance A8 and A9 compute instances also provide customers with ample memory and the latest CPU technology.
Author: Eli Karpilovski manages the Cloud Market Development at Mellanox Technologies. In addition, Mr. Karpilovski serves as the Cloud Advisory Council Chairman. Mr. Karpilovski served as product manager for the HCA Software division at Mellanox Technologies. Mr. Karpilovski holds a Bachelor of Science in Engineering from the Holon Institute of Technology and a Master of Business Administration from The Open University of Israel.
Cloud computing was developed specifically to overcome issues of localization and limitations of power and physical space. Yet many data center facilities are in danger of running out of power, cooling, or physical space.
Mellanox offers an alternative and cost-efficient solution. Mellanox’s new MetroX® long-haul switch system makes it possible to move from the paradigm of multiple, disconnected data centers to a single multi-point meshed mega-cloud. In other words, remote data center sites can now be localized through long-haul connectivity, providing benefits such as faster compute, higher volume data transfer, and improved business continuity. MetroX supports more applications and more cloud users, leading to faster product development, quicker backup, and more immediate disaster recovery.
The more physical data centers you join using MetroX, the more you scale your company’s cloud into a mega-cloud. You can continue to scale your cloud by adding data centers at opportune moments and places, where real estate is inexpensive and power is at its lowest rates, without concern for distance from existing data centers and without fear that there will be a degradation of performance.
Moreover, you can take multiple distinct clouds, whether private or public, and use MetroX to combine them into a single mega-cloud. This enables you to scale your cloud offering without adding significant infrastructure, and it enables your cloud users to access more applications and to conduct more wide-ranging research while maintaining the same level of performance.
Last week (on December 9th, 2013), Symantec announced the GA of their clustered file storage (CFS). The new solution enables customers to access mission critical data and applications 400% faster than traditional Storage Area Networks (SANs) at 60% of the cost.
Faster is cheaper! Sounds like magic! How are they doing it?
To understand the “magic,” it helps to look at the advantages that SSDs combined with a high-performance interconnect bring to modern scale-out (or clustered) storage systems. Up to now, SAN-based storage has typically been used to increase performance and provide data availability for multiple applications and clustered systems. However, with the recent demand from high-performance applications, SAN vendors are trying to add SSDs into the storage array itself to provide higher bandwidth and lower-latency response.
Since SSDs offer an incredibly high number of IOPS and very high bandwidth, it is important to use the right interconnect technology and to avoid bottlenecks associated with access to storage. Older fabrics such as Fibre Channel (FC) cannot cope with the demand for faster pipes, as 8Gb/s (or even 16Gb/s) of bandwidth is not enough to satisfy application requirements. While 40Gb/s Ethernet may look like an alternative, InfiniBand (IB) currently supports up to 56Gb/s, with a roadmap to 100Gb/s in the next year.
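To put the link rates mentioned above side by side, the Python sketch below estimates how long each fabric would take to move a fixed amount of data at its nominal rate. The 1 TB payload is a hypothetical chosen for illustration, and the calculation ignores encoding and protocol overhead, so real transfer times would be longer on every fabric.

```python
# Nominal link rates in Gb/s, as quoted in the text.
LINK_RATES_GBPS = {
    "Fibre Channel 8G": 8,
    "Fibre Channel 16G": 16,
    "Ethernet 40G": 40,
    "InfiniBand FDR 56G": 56,
    "InfiniBand roadmap 100G": 100,
}

# Hypothetical payload: 1 TB (decimal), chosen only for illustration.
PAYLOAD_BYTES = 1_000_000_000_000


def transfer_seconds(rate_gbps, payload_bytes=PAYLOAD_BYTES):
    """Idealized transfer time at the nominal line rate.

    Ignores encoding, protocol overhead, and storage-side limits.
    """
    payload_bits = payload_bytes * 8
    return payload_bits / (rate_gbps * 1e9)


for name, rate in LINK_RATES_GBPS.items():
    print(f"{name:24s} {transfer_seconds(rate):8.1f} s")
```

Even in this idealized form, the gap is large: the same payload that occupies an 8Gb/s link for over sixteen minutes clears a 56Gb/s link in under two and a half.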
Mellanox’s end-to-end FDR 56Gb/s InfiniBand solutions helped lead The University of Texas at Austin to victory at the SC Student Cluster Competition’s Standard Track during SC’13. Utilizing Mellanox’s FDR InfiniBand solutions, The University of Texas at Austin achieved superior application run-time and sustained performance within a power limit of 26 amps at 120 volts, allowing them to complete workloads faster while achieving top benchmark performance. Special recognition was also provided to China’s National University of Defense Technology (NUDT), which, through the use of Mellanox’s FDR 56Gb/s InfiniBand, won the award for highest LINPACK performance.
Held as part of HPC Interconnections, the SC Student Cluster Competition is designed to introduce the next generation of students to the high-performance computing community. In this real-time, non-stop, 48-hour challenge, teams of undergraduate students assembled a small cluster on the SC13 exhibit floor and raced to demonstrate the greatest sustained performance across a series of applications. The winning team was determined based on a combined score for workload completed, benchmark performance, conference attendance, and interviews.