Archive for the ‘Gilad Shainer’ Category

High-Performance Computing as a Service (HPCaaS)

Thursday, April 23rd, 2009

High-performance clusters bring many advantages to the end user, including flexibility and efficiency. As more applications are served by high-performance systems, new systems need to serve multiple users and applications. Traditional high-performance systems typically served a single application at a given time; to maintain maximum flexibility, a new concept, “HPC as a Service” (HPCaaS), has been developed. HPCaaS includes the capability of using clustered servers and storage as resource pools, a web interface for users to submit their job requests, and a smart scheduling mechanism that can schedule multiple different applications simultaneously on a given cluster, taking into consideration the different application characteristics for maximum overall productivity.

HPC as a Service enables greater system flexibility since it eliminates the need for dedicated hardware resources per application and allows dynamic allocation of resources per given task while maximizing productivity. It is also the key component in bringing high-performance computing into cloud computing. Effective HPCaaS, though, needs to take into consideration each application’s demands and provide the minimum hardware resources required per application. Scheduling runs of multiple applications at once requires a proper balance of resources, with each application receiving a share proportional to its demands.
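To make the idea of balancing resources in proportion to demand concrete, here is a minimal sketch in Python. It is not the scheduler used in the HPC Advisory Council research; the function name `allocate_nodes`, the demand weights, and the choice of the largest-remainder method for rounding are all assumptions made for illustration.

```python
def allocate_nodes(total_nodes, demands, min_nodes=1):
    """Split total_nodes among jobs in proportion to their demand weights.

    demands: dict mapping a job name to a positive demand weight
             (hypothetical units, e.g. an estimate of compute need).
    min_nodes: minimum number of nodes every job receives (assumed policy).
    Returns a dict mapping job name to an integer node count; the counts
    sum to total_nodes.
    """
    if total_nodes < min_nodes * len(demands):
        raise ValueError("not enough nodes to give every job its minimum")

    # Reserve the minimum for each job, then share the rest proportionally.
    spare = total_nodes - min_nodes * len(demands)
    total_demand = sum(demands.values())
    exact = {job: spare * d / total_demand for job, d in demands.items()}

    # Start from the integer part of each proportional share.
    shares = {job: min_nodes + int(exact[job]) for job in demands}

    # Largest-remainder rounding: leftover nodes go to the jobs whose
    # exact share had the largest fractional part.
    leftover = total_nodes - sum(shares.values())
    by_remainder = sorted(
        demands, key=lambda job: exact[job] - int(exact[job]), reverse=True
    )
    for job in by_remainder[:leftover]:
        shares[job] += 1
    return shares


if __name__ == "__main__":
    # Three hypothetical applications sharing an 8-node cluster.
    print(allocate_nodes(8, {"a": 2, "b": 1, "c": 1}))
```

A real HPCaaS scheduler would also have to weigh interconnect and memory needs, job priorities, and queue wait times; this sketch only covers the proportional split itself.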

Research activities on HPCaaS are being performed at the HPC Advisory Council (http://hpcadvisorycouncil.mellanox.com/). The results show the need for high-performance interconnects, such as 40Gb/s InfiniBand, to maintain high productivity levels. It was also shown that scheduling mechanisms can be set to guarantee the same levels of productivity in HPCaaS as in the “native” dedicated hardware approach. HPCaaS is not only critical for the way we will perform high-performance computing in the future; as more HPC elements are brought into the data center, it will also become an important factor in building the most efficient enterprise data centers.

Gilad Shainer
Director, Technical Marketing
gilad@mellanox.com

Unleashing Performance, Scalability and Productivity with Intel Xeon 5500 Processors “Nehalem”

Tuesday, March 31st, 2009

The industry has been talking about it for a long time, but on March 30th it was officially announced. The new Xeon 5500 “Nehalem” platform from Intel has introduced a totally new concept of server architecture for Intel-based platforms. The memory has moved from being connected to the chipset to being connected directly to the CPU, and the memory speed has increased. More importantly, PCI-Express (PCIe) Gen2 can now be fully utilized to unleash new performance and efficiency levels from Intel-based platforms. PCIe Gen2 is the interface between the CPU and memory on one side and the networking that connects servers together to form compute clusters on the other. With PCIe Gen2 now being integrated in compute platforms from the majority of OEMs, more data can be sent and received by a single server or blade. This means that applications can exchange data faster and complete simulations much faster, bringing a competitive advantage to end-users. In order to feed PCIe Gen2, one needs a big pipe in the networking solution, and this is what 40Gb/s InfiniBand brings to the table. It is no surprise that multiple server OEMs (for example, HP and Dell) have announced the availability of 40Gb/s InfiniBand in conjunction with Intel’s announcement.

 

I have been testing several applications to compare the performance benefits of Intel Xeon 5500 processors and Mellanox end-to-end 40Gb/s networking solutions. One of those applications was the Weather Research and Forecasting (WRF) application, widely used around the world. With Intel Xeon 5500-based servers, Mellanox 40Gb/s ConnectX InfiniBand adapters, and the MTS3600 36-port 40Gb/s InfiniBand switch system, we witnessed a 100% increase in performance and productivity over previous Intel platforms.

With a digital media rendering application, Direct Transport Compositor, we have seen a 100% increase in frames-per-second delivery, while increasing the screen anti-aliasing at the same time. Other applications have shown similar levels of performance and productivity boost as well.

 

The reasons for the new performance levels are the decrease in latency (1usec) and the huge increase in throughput (more than 3.2GB/s uni-directional and more than 6.5GB/s bi-directional on a single InfiniBand port). With the increase in the number of CPU cores and the new server architecture, bigger pipes in and out of the servers are required in order to keep the system balanced and to avoid creating artificial bottlenecks. Another advantage of InfiniBand is its ability to use RDMA to transfer data directly to and from host memory, without involving the CPU in the data transfer activity. This means one thing only: more CPU cycles can be dedicated to the applications!
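To see how the latency and throughput figures above interact, here is a rough sketch in Python. It assumes a simple model in which transfer time is latency plus message size divided by bandwidth; this is a textbook approximation, not the method used for the measurements in this post. The function names are hypothetical, and the default values are the 1usec and 3.2GB/s figures quoted above.

```python
def transfer_time_us(message_bytes, latency_us=1.0, bandwidth_gb_per_s=3.2):
    """Estimated one-way transfer time in microseconds.

    Assumed model: time = latency + size / bandwidth.
    1 GB/s equals 1e9 bytes/s, which is 1e3 bytes per microsecond.
    """
    bytes_per_us = bandwidth_gb_per_s * 1e3
    return latency_us + message_bytes / bytes_per_us


def effective_bandwidth_gb_per_s(message_bytes, latency_us=1.0,
                                 bandwidth_gb_per_s=3.2):
    """Bandwidth actually achieved once latency is included, in GB/s."""
    time_us = transfer_time_us(message_bytes, latency_us, bandwidth_gb_per_s)
    return message_bytes / time_us / 1e3


if __name__ == "__main__":
    # Small messages are dominated by latency, large ones by bandwidth.
    for size in (64, 3200, 1024 * 1024):
        print(size, "bytes:",
              round(transfer_time_us(size), 2), "us,",
              round(effective_bandwidth_gb_per_s(size), 2), "GB/s")
```

Under this model a 3,200-byte message reaches only half of the peak throughput, because its 1usec of wire time equals the 1usec latency. That is why both low latency and a big pipe matter for applications that exchange a mix of small and large messages.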

 

Gilad Shainer

Director, HPC Marketing