Accelerate your TensorFlow with RDMA
for InfiniBand and Ethernet

How to Provide a Highly Scalable and Robust Fabric for Deep Learning/Machine Learning Clusters

Webinar Date: Wednesday, May 13, 2020

Webinar Time: 11:00am - 12:00pm India Standard Time


GPU-accelerated computing and ever-growing Deep Learning/Machine Learning workloads pose a unique challenge to network architects looking to design the ideal interconnect fabric. Efficient and sustainable scaling requires expanding the role of the interconnect beyond that of a standard message-passing agent to a more intelligent entity that can accelerate the overall compute process.

In this talk, we will present the next generation of InfiniBand and Ethernet solutions designed to provide a highly scalable and robust fabric for Artificial Intelligence clusters. We will discuss how RDMA forms the backbone of accelerated computing and which capabilities the interconnect needs to meet the new demands. We will cover both InfiniBand and Ethernet fabrics, show how to use TensorFlow with RDMA in InfiniBand and Ethernet GPU clusters, and conclude with some practical network designs for small to mid-sized GPU clusters.

In this webinar you will learn:


  • How to design a highly scalable and robust fabric for DL/ML users
  • How RDMA accelerates overall computing capabilities
  • How to build InfiniBand and Ethernet network designs for GPU clusters

Speaker:


Ashrut Ambastha
Principal Engineer, Solution Architect
NVIDIA, Mellanox Networking Business Unit

