
HPC Advisory Council Blog
August 7, 2008
The World’s Leading High Resolution Visualization System (NASA) - Part 2
I posted earlier about the InfiniBand-based high-resolution visualization system we installed at NASA. Recently, California Gov. Arnold Schwarzenegger and NASA Ames Research Center Director S. Pete Worden examined Hyperwall-2, a state-of-the-art visualization system developed at Ames.

Picture: California Gov. Arnold Schwarzenegger and NASA Ames Research Center Director S. Pete Worden Viewing the Hyperwall-2
Hyperwall-2 is one of the largest displays in the world and is used by scientists for data interpretation. Schwarzenegger visited Ames July 14, 2008, for a behind-the-scenes tour and briefings about NASA's support to firefighters battling California wildfires. Ames scientists are partnering with colleagues at Dryden Flight Research Center, Edwards, Calif., to send NASA's remotely piloted Ikhana aircraft on reconnaissance flights using sophisticated visual and thermal sensors to provide up-to-the-minute information to firefighters in the field.
Gautam Shah
CEO
Colfax
August 1, 2008
HPC Advisory Council Announcement
This week we launched the HPC Advisory Council Network of Expertise. The Network of Expertise is a collaboration of highly technical and knowledgeable individuals from HPC Advisory Council member organizations, who together form a support network for consultations, questions, and issues raised by high-performance computing end-users, software vendors and systems builders.
HPC end-users and vendors that provide HPC-based solutions (systems, software, tools etc.) are encouraged to use the Network of Expertise for any questions, issues or advice. Simply send a request with the relevant description to the Network of Expertise email address at HPCHelp@mellanox.com.
We will publish issues raised to the Network of Expertise that are of interest to general HPC end-users on the Network of Expertise web page.
Gilad Shainer
July 18, 2008
HPC Advisory Council Announcement
This week we officially announced the HPC Advisory Council: http://www.mellanox.com/news/press_releases/pr_071508.php. The press release was well received. I would like to thank each member for their efforts in making the council a key ecosystem in the High Performance Computing world. For a complete member roster please refer to http://www.mellanox.com/partners/hpc_members.php.
The HPC Advisory Council will set up the Network of Knowledge shortly. The Network of Knowledge is a group of technical experts from various council members who will provide support for consultations, questions, issues etc. for HPC end-users. The Network of Knowledge can be reached by email at HPCHelp@mellanox.com (mailing list). A summary of the key technical support issues will be posted on the council web page as well. More information will be posted soon.
Gilad Shainer
July 7, 2008
The world’s highest resolution visualization system (NASA)
The power to visualize highly complex information in a way that's easier for the human mind to grasp is now available with the new NASA hyperwall-2 system, located in the NASA Ames Research Center.
The hyperwall-2 system consists of 128 screens and is capable of rendering quarter-billion-pixel graphics, making it the world's highest resolution scientific visualization and data exploration environment. The system enables scientists to quickly explore datasets that otherwise would take many years to analyze, such as the safety of new space exploration vehicle designs, atmospheric re-entry analysis for the space shuttle, earthquakes, climate change, global weather and black hole collisions.
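As a rough sense of scale, the two figures quoted above imply a per-screen pixel budget. The sketch below assumes the quarter-billion pixels are split evenly across the 128 screens, which the post does not state, and the constant names are illustrative only.

```python
# Figures quoted in the post.
SCREENS = 128
TOTAL_PIXELS = 250_000_000  # "quarter-billion-pixel" graphics

# Assumption: every screen carries an equal share of the total.
pixels_per_screen = TOTAL_PIXELS // SCREENS

print(f"{pixels_per_screen:,} pixels per screen (assuming an even split)")
```

Under that assumption each screen carries a little under two million pixels.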
The system is powered by Colfax’s advanced computing cluster, which consists of 128 graphics processing units and 1,024 AMD processor cores, and provides 74 teraflops of peak processing power. Mellanox ConnectX InfiniBand DDR 20Gb/s adapters interconnect the cluster nodes to supply the needed fast communication capabilities.
You can see pictures of the systems we have installed below (courtesy of NASA, photo by Eric James).
Gautam Shah
CEO
Colfax
June 26, 2008
Appro InfiniBand 40Gb/s demonstration at the International Supercomputing Conference
Last week we featured the first InfiniBand QDR 40Gb/s based cluster on the show floor at the 2008 International Supercomputing Conference in Dresden, Germany. We captured the demo and the Mellanox IB 40Gb/s announcement in this cool video below.
The video shows the Appro Xtreme-X1 Supercomputer cluster, utilizing Mellanox ConnectX QDR adapters and the new Mellanox QDR 36-port switch (InfiniScale IV), running an ANSYS/Fluent airplane structure application. We found that this solution provides a powerful, easy-to-manage supercomputer that reduces latency while significantly improving bandwidth and performance.

Steve Lyness
Vice President of HPC Solutions
Appro
June 25, 2008
Windows HPC debuts in the Top 25 fastest supercomputers in the world...what more do I need to say?
A few months ago we completed our runs for the Top500 list. For those of you not familiar with this twice-yearly benchmark, the Top500 list represents the 500 most powerful computers in the world. It is the supercomputing supergeek superlist. We completed runs with the National Center for Supercomputing Applications (NCSA) and with Umea University. The problem is that even though we did the runs months ago, we weren’t allowed to discuss the results until this week, the week of the International Supercomputing Conference in Dresden, Germany. We had to keep it a secret. Ugh.
The NCSA cluster is amazing: 1,200 nodes, InfiniBand connected, each with 8 cores, creating a 9,600-core cluster. NCSA installed Beta 1 of Windows HPC Server 2008 and ran the benchmark. The results were outstanding: 68.5 teraflops and 77.7% efficiency. Using our beta software, NCSA beat their November score by over 10%. This is the fastest Windows cluster to date. Check out the customer video and case study.
The Umea University cluster, “Akka”, is located in northern Sweden. This system was also running Beta 1 and hit 46 teraflops on 5,376 cores with a VERY impressive 85.5% efficiency score. This is the BEST efficiency score for an x86 architecture cluster on the Top500 list. Umea University will run the new supercomputer at its facility known as “HPC2N”. The university’s cluster employs 672 IBM blade servers, InfiniBand connected, and also marks the first time that Windows HPC Server 2008 has been run publicly on IBM hardware.
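The efficiency figures for the two clusters above are the ratio of measured benchmark performance to theoretical peak. As a minimal sketch, the snippet below works backwards from the numbers quoted in this post to the peak each score implies. The function and variable names are illustrative, and because the quoted teraflops are rounded, the derived values are approximations rather than official Top500 entries.

```python
def implied_peak_tflops(measured_tflops, efficiency_pct):
    """Theoretical peak implied by a measured score and an efficiency percentage."""
    return measured_tflops / (efficiency_pct / 100.0)

# Figures as quoted in the post (measured teraflops, efficiency %, core count).
clusters = {
    "NCSA": {"measured": 68.5, "efficiency": 77.7, "cores": 1200 * 8},
    "Akka": {"measured": 46.0, "efficiency": 85.5, "cores": 5376},
}

for name, c in clusters.items():
    peak = implied_peak_tflops(c["measured"], c["efficiency"])
    # Convert teraflops to gigaflops before dividing by the core count.
    per_core_gflops = peak * 1000.0 / c["cores"]
    print(f"{name}: ~{peak:.1f} TF implied peak, "
          f"~{per_core_gflops:.1f} GF peak per core over {c['cores']} cores")

# The Akka core count divided by its 672 blades gives the cores per blade.
cores_per_blade = clusters["Akka"]["cores"] / 672
```

The same relation read the other way, measured = peak × efficiency, is why a higher efficiency score means more of the hardware's theoretical capability actually shows up in the benchmark result.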
So, the benchmarking numbers are looking pretty good, and those benchmarks were with our first beta. We shipped our second beta last month and we’re shipping our first release candidate at the end of this month.
How did we do so well on the benchmarks? We’ve made big improvements in the Microsoft MPI stack. MPI (Message Passing Interface) is used for tightly coupled communications between servers running in parallel. The biggest improvements were in what are called shared memory interfaces, that is, the interfaces used for communication between processor cores on the same system. Our MPI stack is based on Argonne National Lab’s MPI stack called MPICH2. We will contribute our changes back to Argonne for inclusion in the open source version of MPICH2. These are some of the largest contributions to the open source community by Microsoft. Yep, open source and Microsoft.
Network Direct, our new RDMA (Remote Direct Memory Access) networking stack, was another area of improvement. We collaborated with partners like Mellanox to build a very efficient RDMA stack. Improvements in MPI and Network Direct contributed hugely to our great score.
These are very impressive benchmark results for a product that’s not even released to manufacturing yet, and the benchmark scores were a very hard secret to keep. The release candidate of Windows HPC Server 2008 will be available for customers to download the last week of June.
Group Program Manager on the HPC Dev Team
Microsoft
June 18, 2008
The news from the International Supercomputing Conference
Last week, Mellanox launched the new InfiniBand QDR 40Gb/s silicon solution, with an incredible number of vendors showing demos in their booths, running applications from ANSYS Fluent and Scalable Graphics. Appro, Bull, HP, Intel, Microsoft, Sun Microsystems, Supermicro, TYAN, and Voltaire all participated. Mellanox hosted a special cocktail reception to celebrate 40Gb/s InfiniBand, which was the highlight of the first day (Tuesday).

Picture: Gilad Shainer celebrating InfiniBand QDR with Prof. Dr. Hans Meuer
The new Top500 list was revealed on Wednesday, introducing the new #1 fastest supercomputer on the Top500, the National Nuclear Security Administration’s (NNSA) Roadrunner, the world’s first petaflop supercomputer. Built by IBM and housed at NNSA’s Los Alamos National Laboratory in New Mexico, Roadrunner delivers more than double the performance of the next leading contender on the latest TOP500 list of supercomputers. Roadrunner utilizes the Mellanox ConnectX InfiniBand 20Gb/s interconnect to provide its amazing performance capabilities. There are 5 InfiniBand-based systems in the top 10, including Ranger (built by Sun and hosted at the Texas Advanced Computing Center).
Gilad Shainer
Mellanox Technologies
June 12, 2008

Providing a robust and well-balanced set of technologies that delivers superior performance and scalability is what today’s savvy users demand. I would like to applaud Mellanox for taking the initiative to establish the HPC Advisory Council, as well as for inviting AMD as one of the founding members. Interoperability and standardization across the technologies that drive the highest performing solutions are key to successful adoption by industry users.
Senior Strategic Alliance Manager
High Performance Computing
AMD
June 5, 2008
Windows HPC Server 2008 Beta 2 is Here!
Whew! Friday at 2:18PM we signed off on Beta 2 of Windows HPC Server 2008. It’s a good thing too since the Redmond team is looking at the first sunny and hot Northwest weekend this year. Mother nature usually gives us these days on weekdays. It’s been a hard push since November when we shipped our last beta. Since then we’ve done test runs on a cluster with over 1000 nodes, fixed over 1000 bugs, coded a bunch of new features, and made a bunch of design changes based on customer feedback. For example, one beta customer was using our new WCF Broker for financial risk modeling but wanted a totally reliable messaging solution. We built a solution leveraging MSMQ that still provides high throughput while allowing for reliable messaging.
Now that Beta 2 is finished, our Technology Adoption Partners (TAP) will put this beta into production environments. We’ll carry pagers to help them out if they run into a crit-sit after hours. Actually, we have cell phones. Pagers have gone the way of punch cards, teletypes, and sock garters. I suspect there are teenagers wandering around who don’t know what a pager is.
Anyway, there’s a bunch of new stuff in Beta 2.
We checked in high availability for the head node and a new set of diagnostic tests to help people identify and troubleshoot their clusters. The new UI model is really coming together but for users more comfortable with command line interfaces we provide scripting support through COM and PowerShell. Finally, administrators can run administrative scripts in parallel across the cluster using our improved Clusrun feature.
A bunch of humbling (heh) usability testing pushed us to redesign the To Do List. It should be much easier for people to get through setting up a cluster, adding drivers to images, and configuring patching for the cluster (new feature!). The heat map is working so well that we’ve thrown out the internal monitoring tools we used on Top500 runs.
After lots of, um, passionate debate we’ve finalized the APIs for job submission. It will continue to be easy for ISVs to integrate directly with our job scheduler while at the same time working with a cluster that may have thousands of jobs in the queue, each job with thousands of tasks.
A lot of people don’t know that we co-chair the HPC Basic Profile working group at the Open Grid Forum. With Beta 2, we ship our support for “HPC Basic Profile,” allowing us to interop with the LSF and PBSPro job schedulers.
We completed a few great Top500 runs in the last few weeks. We can’t talk about the numbers until the International Supercomputing Conference in June but it looks like Beta 2’s new MPI stack and new Network Direct RDMA interface are starting to hum.
Finally, our new programming model based on SOA is getting some nice usage from beta customers. Most of the feedback has come from folks in computational finance, but there are also a couple of folks in the life sciences industry who are kicking the tires. For example, what if you came up with a new theory about cancer and wanted to search through thousands of medical scans to see if your theory was correct? For Beta 2 we improved scalability, reduced latency and improved session initialization time. Beta 2 supports multiple WCF Brokers, allowing HPC Server 2008 to run really big SOA workloads.
So, we’re done with Beta 2. Lots of new features (whew) and lots of scalability improvements. We’ve posted build 1345, Beta 2, up at http://connect.microsoft.com
Thanks!
Ryan Waite
Group Program Manager - HPC
Microsoft
May 28, 2008
INTRODUCING THE HPC ADVISORY COUNCIL
It is my pleasure to introduce the HPC Advisory Council. The Council includes leaders from the HPC community: best-in-class original equipment manufacturers (OEMs), strategic technology suppliers, independent software vendors (ISVs) and selected end-users across all HPC market segments. For more info on how to join us, please send an email to HPC@mellanox.com.
The Council members work together to ensure interoperability and robustness between the multiple parts that form HPC clusters – servers, storage, interconnect, CPUs, and applications, to provide the best total solution for the end user. The Council will also work together to form future HPC solutions, and provide an environment for testing, development and optimization of such solutions.
One of the subgroups of the Council is the HPC Advisory Council Support Group, which is a support center for consultations, questions, issues etc. for HPC end-users. You can reach the group mailing list at HPCHelp@mellanox.com.
This blog is the place where the Council members will post their news, exciting events, technology information and so forth. We will be adding news feeds shortly, but until then, please continue to visit this page.
Gilad Shainer
For questions or comments, please contact HPC@mellanox.com










