All posts by Avi Alkobi

About Avi Alkobi

Avi Alkobi is the Ethernet Technical Marketing manager for EMEA at Mellanox Technologies. For the past eight years he has worked at Mellanox in various roles focusing on the Ethernet switch product line: first as a software developer, then as the leader of the infrastructure team responsible for the first Ethernet switch management infrastructure. More recently, he has worked as a senior application engineer supporting the field on post-sales, pre-sales, and complex proofs of concept for end-to-end Ethernet and InfiniBand solutions. Before coming to Mellanox, he worked for five years at Comverse Technology and, prior to that, in the Israeli security industry as a software developer. He holds a Bachelor of Science degree in Computer Science and an M.B.A. from Bar-Ilan University in Israel.

VXLAN is finally simple: use EVPN and set up VXLAN in 3 steps

I am writing this blog because, while building a simple overlay network, it occurred to me that with the right Network Operating System there is a simple structure to configuring EVPN, which I call EVPN in three steps: configure the underlay, enable EVPN, and configure the VXLAN overlay.


In my example, the setup is very basic, but this atomic setup is the baseline for bigger setups and for automation playbooks, and it emphasizes the three parts of such a design.


The minimum setup to bring up an EVPN network for a PoC or demo is one spine switch and two leaf switches. However, my recommendation is to use two spine switches for a more relevant test. Herein I will describe the one-spine design.

  • Network hardware: spine & leaf: SN2100, a Mellanox Spectrum-based product with 16 ports of 100GbE each
The SN2100 is a 16x100GbE half-width 19" switch that fits this exercise of a small, low-cost PoC setup just great.
Note: I could use the SN2700 (32x100GbE) for the spine and the SN2410 (48x25GbE + 8x100GbE) as a leaf switch, but it doesn't really matter, as all the Spectrum products are based on the same silicon and all offer the same set of features and performance.
  • Servers: two PCIe Gen3 servers, 16 lanes and 12 cores each (Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz), with a ConnectX-5 100GbE network card
Note: To get the best performance there is a need to do some tuning; a good start can be the simple mlnx_tune utility.
  • Protocols: eBGP for the underlay, VXLAN as the overlay, and EVPN for the control plane
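As a sketch of what the three steps look like on a leaf, here is a minimal Cumulus Linux (NCLU) configuration. The ASN, router ID, IP addresses, VLAN, VNI, and port names below are hypothetical placeholders, not the exact values from my lab:

```shell
# Step 1: the underlay - loopback plus eBGP (BGP unnumbered) on the fabric-facing port
net add loopback lo ip address 10.0.0.11/32        # hypothetical loopback / VTEP IP
net add bgp autonomous-system 65011                # hypothetical private ASN, unique per leaf
net add bgp router-id 10.0.0.11
net add bgp neighbor swp1 interface remote-as external
net add bgp network 10.0.0.11/32                   # advertise the loopback into the underlay

# Step 2: enable EVPN as the control plane on the same BGP session
net add bgp l2vpn evpn neighbor swp1 activate
net add bgp l2vpn evpn advertise-all-vni

# Step 3: the VXLAN overlay - map a VLAN to a VNI (leaf switches only)
net add vxlan vni10 vxlan id 10010                 # hypothetical VNI
net add vxlan vni10 vxlan local-tunnelip 10.0.0.11
net add vxlan vni10 bridge access 10
net add bridge bridge ports swp3,vni10             # swp3: server-facing port
net add bridge bridge vids 10
net commit
```

On a spine, only the first two steps apply, since the spine carries no VNIs.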


The second leaf is the same (different AS, different router ID, different IPs).

And for the spine, it is just steps 1 and 2: configure the underlay and enable EVPN.




That is it! Just don't forget to set the MTU properly, since the VXLAN header adds 50 bytes.
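The arithmetic is simple: assuming a standard 1500-byte host MTU, the fabric-facing interfaces need at least 1550 bytes (in practice, a jumbo MTU such as 9216 leaves plenty of headroom):

```shell
# VXLAN encapsulation adds 50 bytes:
# outer Ethernet (14) + outer IP (20) + UDP (8) + VXLAN header (8)
HOST_MTU=1500
VXLAN_OVERHEAD=50
FABRIC_MTU=$((HOST_MTU + VXLAN_OVERHEAD))
echo "$FABRIC_MTU"   # 1550
```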

Below are some results that I got with iperf; it can be even better (close to 100GbE line rate) if the server is fully tuned:
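For reference, this is roughly what such an iperf run looks like between the two servers; the overlay IP address and the stream count here are hypothetical, not my exact invocation:

```shell
# On the receiving server (address on the overlay network):
iperf -s

# On the sending server: several parallel TCP streams help approach line rate
iperf -c 10.10.10.2 -P 8 -t 30
```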

Supporting Resources:


Open Network for OpenStack

Mellanox Technologies has been a part of the OpenStack community for a very long time. You might ask: why is a networking company that delivers physical network components so involved in a software-oriented project?

I often see Hyperconverged Infrastructure vendors talking about the scale-out concept, which is a fantastic model since it describes the very simple paradigm of a “pay as you grow” process – you can start small with a few compute/storage nodes and keep growing according to your needs.

But what about the network? Can it be static and monotonic? Obviously, the answer is the network needs to evolve and grow at the pace of the other components.

I see a similar analogy when it comes to OpenStack, an open source project that defines cloud infrastructure: can the network be closed and locked in an OpenStack environment? Doesn't that contradict the 'openness' idiom?

For me, OpenStack infrastructure must run together with Open Networking.

What does it mean?

  • Switches that do not lock in anything (not features, not cables, not other switches) and that offer the choice of an operating system and the choice of network applications.
  • Is this a Linux environment or not? Switches need to offer the benefits of Linux together with the traditional management APIs.
  • Orchestration and automation that can cover not only the host side but also the network part.
  • Network cards that are embedded in the compute and storage nodes, can be part of the OpenStack ecosystem, and are not a burden on operations.
  • A network that boosts performance without adding load to the CPUs.
  • Network offload and acceleration capabilities that are fully integrated into the OpenStack projects: inbox solutions.

Here is an example for a common use case:

  • VXLAN overlay in an OpenStack environment.
  • The network is required to provide a VTEP; in other words, encapsulation and de-capsulation of the VXLAN packet. This can be done by the top-of-rack switch or by the hypervisor.
  • With a controller (Neutron itself, or a third-party application such as Midonet, Nuage, PLUMgrid, etc.), or without a controller, by using EVPN.

What are the benefits of using open networking?


  • Mellanox Spectrum™ switches are fully open, no locking of cables or features. All features are available including IP unnumbered BGP for the underlay and VTEP for the overlay, with controller integrations or EVPN.

The switch can run the operating system of your choice; for example, Cumulus Linux.

  • Cumulus Linux is a Debian Jessie-based, networking-focused distribution of Linux that runs on switch hardware and enables a fully Linux-based OpenStack deployment.
  • Ansible, Puppet, SaltStack, Chef, and other automation tools can naturally manage the switch like a server with NetDevOps.
  • Mellanox ConnectX®-5 network cards offer VXLAN offload and ASAP² (OVS offload). This reduces CPU usage because, with the Mellanox cards, network operations are handled by the network card and not by the CPUs.
  • Mellanox Spectrum switches and ConnectX-5 network interface cards come in all high-speed Ethernet speeds, including 10/25/40/50/100GbE, with no compromise on performance.
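As an illustration of the NetDevOps point above, the same ad-hoc Ansible commands used for servers work against a Cumulus Linux switch over SSH; the inventory hostname here is a hypothetical placeholder:

```shell
# Treat the switch like any other Linux host in the Ansible inventory
ansible leaf01 -i hosts -m ping                        # connectivity check over SSH
ansible leaf01 -i hosts -m command -a "ip link show"   # standard Linux tooling on the switch
```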

Supporting Resources

Ultra-Low latency for Everybody

Why should we compromise? Is the low latency network the industry’s legacy for the bold and the beautiful?

Low latency has always been viewed as a costly capability for High Frequency Trading and High Performance Computing. This capability comes at a premium and recently, also with tradeoffs such as limited throughput, bandwidth and feature set.

Because the market has been dominated by a very few vendors with switches based on special ASICs, IT administrators could not enjoy a low-latency network without paying the price, which is usually borne by companies that cannot afford the extra packet delay.

Let’s say that I am your average Joe Q. Customer and I am deploying a public cloud. I can be designing my storage racks and using all flash array, I can also be thinking about deploying the next generation trading environment in my organization. Either way, I am looking at the next generation network infrastructure for a diversity of use cases.

So, this begs the question, what are my network requirements on the different environments?

  • High bandwidth, L2 and MLAG, L3 protocols like OSPF/BGP, BFD, IPv4 and IPv6, Scale and ZTP
  • Overlay VxLAN with a controller or without a controller using EVPN
  • Multicast IGMPv3 and PIM-SM, PTP, and advanced telemetry capabilities, and
  • ACL, SPAN, and TAP aggregation.

Every vertical has its unique feature set, yet bandwidth is a consistent requirement across the board these days. So, it is fair to say that we need more bandwidth because we are consuming more data and from now on, we don’t need to compromise on the features and performance.

We can truly have it all.

The Mellanox Spectrum product line presents a portfolio of high quality switch products, all based on the Mellanox Spectrum ASIC.

The Mellanox Spectrum SN switch series offers 1/10/25/40/50/100GbE on all of the switches, with low and predictable latency; this is the story of the next generation of 25/100GbE at 300ns latency.

No additional fees for features, no special product line, no compromising on feature set, no locking. The quote is very simple, only two lines: switch price and support price. And, as every savvy customer knows, the price per port is the most cost effective in the market.

Advanced features such as VXLAN EVPN, VXLAN routing, high-scale IPv4 and IPv6, advanced L2 features, L2 and L3 multicast, and advanced telemetry capabilities to measure queue depth and latency at the device and fabric level: it is all there, so simple to purchase and simple to manage and deploy.

What does predictable latency mean? It means that the latency is the same no matter what the packet size is and no matter which port is used. Does that sound trivial? Check the latest Tolly report; the benchmark reveals a compelling story. Spectrum simply eliminates this tradeoff: with Spectrum, the latency is low and it is equal for all customers and services connected to the switch.

Spectrum latency is 300ns port to port; and what about the rest of the time?

“Time is really the only capital that any human being has and the thing that he can least afford to waste or lose…” — THOMAS EDISON

Probably the same for a network packet 🙂

Mellanox @ World Hosting Days (WHD) .Fast

WHD.Global, also known as World Hosting Days, is coming. This fantastic event will take place March 25-31 in Rust, Germany.

This great festival presents a unique opportunity to meet the technology leaders of today and learn about the innovations of tomorrow that will shape the way we build and maintain our cloud services.

Mellanox Technologies, as a leading network vendor, is once again happy to sponsor and exhibit at this event. We are proud to be one of the few leading networking companies on the exhibition floor and the only company in the world today that can provide an end-to-end networking solution.

PCIe to PCIe: Mellanox develops and manufactures high-bandwidth NIC cards, cables, and switch products, delivering the industry's most complete line of 10, 25, 40, 50, and 100Gb/s products.

Mellanox Technologies develops every network component, from the ASIC level to the software on top, all of which come together to provide cloud solutions that are much more than just network connectivity; they are complete solutions that take into consideration all the factors in today's complex cloud technologies:

ConnectX-4® and ConnectX-5 Hardware Accelerations and offloads that enhance the server and Storage capabilities

Spectrum Switches that feature the highest bandwidth, lowest latency, most predictable, and fair network forwarding, along with world-class telemetry for real time visibility into what is going on in the network.

LinkX high-quality cable and transceiver products with some of the best bit error rates in the world.

Open Networking

The building blocks that we offer can be composed into full server, storage and network cloud designs that fulfil the most demanding goal of all: Efficient, Flexible and Simple.

We offer the option to run a native Linux operating system on your network devices, which can be so powerful, with an impressive feature list that covers all the modern cloud needs: scale, SDN features, automation capabilities, inbox availability, and very easy activation. That is our way to reduce operational expenses.

And what about the capital expenditure?  Let us surprise you. Come and visit us at our booth Z10 at the dome: Alex Nardella, Eyal Belfer, Arne Heitmann, Daniel Ponten, Martin Rosen, and Niels Finseth will be there.  Please come say hello and get a first-hand impression of our quality products and talk to our technical professionals.

The WHD.Global will take place in Europa Park, and I recommend you take a spin on the Silver Star roller coaster. It is the fastest vehicle in the park... well, that is on regular days. However, between March 25 and 31, Mellanox's SN2100 Spectrum switch will run at 100GbE at the Mellanox booth, and you can't go faster than that.





How to Get 100G Performance Out of Your 100GbE SDN

The high bandwidth requirements of modern data centers are driven by the demands of business applications, data explosion, and the much faster storage devices available today. For example, to utilize a 100GbE link, you needed 250 hard drives in the past, while today, you need only three NVMe SSDs.

After investing in the most advanced network infrastructure, the highest bandwidth links, the shiniest SDN controller, and the latest Cloud automation tools, your expectation is to fully utilize each link, whether 100GbE, 25GbE, or legacy 10GbE, in order to reach the highest IOPs measurements with your Software Defined Storage solution. But, is a collection of the cutting edge technologies enough?

Moving to a scale-out paradigm is a common practice today. Especially with hyper-converged solutions, data traffic is continuously running east-west between storage and compute server nodes. Even with 10GbE interfaces on individual servers, the aggregated data flow can fully utilize 100GbE links between leaf and spine layer switches. In addition, software defined storage generates extra traffic to maintain the solution, providing yet another source to consume network bandwidth.

To get the most from your network equipment, one needs to look at it from a PCIe to PCIe perspective, define the specific use case, and run a few simulations. Let us consider a simple example of an OpenStack deployment:

  • Dual-port 10GbE SFP NICs on a medium-density rack of 40 compute servers, 60 VMs per server
  • Layer 2 between ToR and Server with high availability requirement
  • Layer 3 between ToR and Aggregation layer, VXLAN Overlay and Nuage SDN controller
  • Storage is Ceph connected with 50GbE QSFP28 dual ports

Now, where are those pesky bottlenecks?

VXLAN UDP packets are hitting the NIC card on the server, and the NIC has no idea what to do with this creature, so it pushes it up to the kernel. Once the software is involved in the data plane, it is game over for high performance. The only way to sustain 10Gbps is for the NIC to know how to get inside the UDP packet and parse it for checksum, RSS, TSS and other operations that are natively handled with simple VLAN. If the NIC cannot do that, then the CPU will need to, and that will come at the expense of your application.
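One quick way to see whether a NIC can parse the inner packet is to check the kernel's offload feature flags; `eth0` here is a placeholder for the actual ConnectX interface name:

```shell
# Features such as tx-udp_tnl-segmentation reporting "on" mean the NIC can
# segment and checksum inside the VXLAN UDP tunnel instead of the CPU
ethtool -k eth0 | grep udp_tnl
```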

So, until now we have achieved higher CPU usage and lower performance, but what about the switch?

Can my switch sustain the 100GbE between the ToR and Spine? Losing packets means re-transmissions, how can you be sure that your switch has zero packet loss?

Ceph is now pushing 50GbE toward the compute nodes' 10GbE interfaces; congestion occurs, and you cannot design it in a way that makes the congestion points predictable, since the compute nodes are dispersed. So, the question remains: can you ensure the switch will be able to handle this kind of congestion fairly?

There is a need for VXLAN Termination End Point (VTEP) to connect bare-metal servers to the virtual networks. This should be handled by the switch hardware. Another VTEP can be done on the Hypervisor, but then the OVS becomes the bottleneck. So, what if we offloaded it to the NIC?

I can go on and on about the TCP/IP flow that involves the CPU in network operations, but now let's talk about deploying 100GbE infrastructure and getting a 100GbE SDN deployment via:


  • Mellanox ConnectX-4 Lx on your servers provides VXLAN offload with a single parameter configuration on the driver.
  • Provide your Nuage SDN controller the ability to do VXLAN termination on the server, but with hardware OVS offload; Mellanox ASAP² provides OVS offload with Nuage integration on ConnectX®-4 Lx.
  • Provide your Nuage SDN controller the ability to do VXLAN termination on the ToR for bare-metal servers and other L2 gateway requirements.
  • A switch that runs L2 and L3 at scale, does not lose packets, and can handle congestion scenarios: Mellanox Spectrum-based switches do not lose packets, and they still provide fair buffering for each of the device's ports. This is not a trivial accomplishment.
  • In 1RU, two SN2100 Spectrum-based switches serve 40 x 10GbE servers in an MLAG topology with no oversubscription, using 400GbE of downlinks and 400GbE of uplinks.
  • Run your Ansible, Puppet, or other automation tools for the servers as well as for the Network Operating System (NOS): Cumulus Linux over Spectrum-based switches.
  • A fabric monitoring and provisioning tool for the switch and NIC that can be launched from Horizon, vSphere, or any VM: Mellanox NEO.
  • A tool that can provision the transformation of network types for Ironic and that has an ML2 mechanism driver interface: Mellanox NEO.
  • Reduce complexity by choosing inbox solutions.
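The no-oversubscription point above is simple arithmetic; splitting the uplink bandwidth as 4x100GbE is my assumption for illustration:

```shell
SERVERS=40
SERVER_SPEED_GBE=10
DOWNLINK_GBE=$((SERVERS * SERVER_SPEED_GBE))   # 40 x 10GbE = 400GbE of downlinks
UPLINK_GBE=$((4 * 100))                        # hypothetical 4x100GbE uplink split = 400GbE
echo "${DOWNLINK_GBE}:${UPLINK_GBE}"           # 400:400, i.e. a 1:1 ratio, no oversubscription
```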

And now, what about 25GbE from the server to the top-of-rack switch? You have the infrastructure; make sure that your cables are SFP28 capable. The form factor is the same, and you are all set for the next step. You are now ready for 25G.


Euro 2016 Is the Year of (SN) 2000

SN2700, SN2410 and SN2100 are the high speed Ethernet switches that Europe loves to embrace today, in 2016.

It is spreading all over Europe; it is happening within cloud services and hosting providers, the healthcare industry, universities, and others.

What exactly is the buzz? Mellanox sets the standard for switch products back to the right place and opens up choice again in the networking domain.

More and more European data centers realize that the two basic requirements that they expect from a network product like a switch are not so common anymore:

No packet loss: All data should be kept intact for any packet size and with predictable latency.

Fairness: Network capacity has to be shared equally between all ports and streams as a baseline; from this baseline, one can use the product features to create service levels per customer.

So what is happening? The SN2000 series of Ethernet switches, based on Mellanox's Spectrum switch silicon, doesn't lose packets at any packet size, and with Spectrum customers can be sure that when the network is congested and the shared buffers kick in, each and every port will get an equal share of the available bandwidth.

The Mellanox switch is fair and doesn't lose packets, and this is not a trivial matter for most of today's merchant switch silicon (see the Tolly report).

And what about the choice?

Figures 1 and 2: the SN2100.
Europe has always been more open and advanced when it comes to embracing the logical next steps. For example, does anyone between the U.K. and Eastern Europe accept the single vendor blob model today? No, everyone chooses server hardware from a preferred hardware vendor, and separately chooses which Operating System to use. Moreover, customers choose which applications to run on top.

Spectrum silicon provides the Open API that allows us to develop our protocol stack and applications and ensures that the same software can run on different hardware platforms.

Many choose to run Quagga on their switch. Personally, I decided to manage my switch interfaces with the bash scripts that I have and the Ansible modules that already work great on my servers. The same scripts work great on my switch, too!
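For example, the same plain Linux `ip` commands I use on servers configure switch ports directly; the port names and address below are hypothetical:

```shell
# Cumulus Linux exposes switch ports as ordinary Linux interfaces (swpN)
ip link set swp1 up
ip addr add 192.168.1.1/24 dev swp10
ip -br link show          # the same iproute2 tooling as on any server
```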

And what is the choice of Mellanox customers?

Euro 2016 Group E: Italy, Belgium, Sweden and N. Ireland, in soccer known as the ‘group of death’, who will qualify? Any team can.

In Group E, Belgium (GaaS) is running 100G end-to-end with the SN2700, Cumulus Linux, and ConnectX-4 as part of a VMware cloud, and Italy (cloud services) is building an advanced OpenStack cluster with the SN2700.

In Group B, England (a university) is building an HPC cluster with RDMA over Converged Ethernet (RoCE) using the SN2000 series products.

In Group C, Germany (a hosting company) is running BIRD over Cumulus Linux on the SN2700 and SN2410 as part of cutting-edge, OpenStack-based technology.

2016 is the time of choices: choose MLNX-OS or choose Cumulus Linux; manage your switch with the industry-standard CLI or run your applications on a native Linux operating system; enjoy the performance boost and enjoy the variety of choices. This is Euro 2016.