
Three Ways ASAP2 Beats DPDK for Cloud and NFV

Telecommunication service providers are betting big on technologies such as cloud and Network Function Virtualization (NFV) to revamp their infrastructure. They are putting their cards on the table so that they can be more agile in new service creation, more elastic in on-demand service scaling, and achieve better economics in terms of infrastructure build-out. No wonder some of the leading service providers are also the strongest supporters and developers of NFV technologies. These include China Mobile, China Telecom, NTT, AT&T, Deutsche Telekom, Verizon, Vodafone…the list goes on and on.

One of the key trends of telco cloud and NFV is a shift away from running network functions on specialized appliances (such as a box that is built to run a firewall and only a firewall) to running them on high-volume, general-purpose server, storage and switching platforms, sometimes called COTS (Commercial Off-the-Shelf) devices. It is natural to conclude that designing the infrastructure with the right COTS hardware to best support the NFV applications holds the key to overall NFV performance and efficiency, and ultimately determines how soon NFV can move beyond the proof-of-concept phase into real-world, large-scale deployments.

NFV applications, that is, Virtualized Network Functions (VNFs), place some unique requirements on the compute and networking infrastructure. In this blog, we will focus on the most fundamental need: packet performance, because, intuitively, network functions, whether physical or virtualized, process network traffic. VNFs look at packets, understand them, and take action based upon pre-configured rules and policies. Packet performance is normally measured in millions of packets per second (Mpps). Note that this is different from throughput, which measures how fast the I/O system can move bit streams and is normally measured in Gigabits per second (Gbps). For the same throughput, packet performance can vary significantly. For example, at a throughput of 10Gbps, the theoretical maximum packet rate is 14.88 Mpps for 64-byte packet streams, but only 0.82 Mpps for 1500-byte packet streams. Small packet sizes therefore put much more pressure on the I/O system's ability to sustain line-rate throughput.
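
For readers who want to check the arithmetic, here is a short sketch in plain Python that reproduces the two figures above; it assumes the standard 20 bytes of per-frame wire overhead (8-byte preamble plus 12-byte inter-frame gap) on top of the quoted packet size.

```python
# Theoretical maximum packet rate for a given link speed and frame size.
# Each frame also occupies 20 bytes of wire overhead: 8 bytes of preamble
# plus a 12-byte inter-frame gap.

WIRE_OVERHEAD_BYTES = 20

def max_packet_rate_mpps(link_gbps, frame_bytes):
    """Theoretical line-rate packet rate in millions of packets per second."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame / 1e6

for size in (64, 1500):
    print(f"10G, {size}-byte packets: {max_packet_rate_mpps(10, size):.2f} Mpps")
# 10G, 64-byte packets: 14.88 Mpps
# 10G, 1500-byte packets: 0.82 Mpps
```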

In carrier-grade networks, there is a substantial amount of control traffic that consists mainly of small packets. In addition, packets for real-time voice traffic are just around 100 bytes long. To ensure that an Ethernet network is capable of supporting a variety of services (such as VoIP, video, etc.) and meeting high-availability requirements, the IETF-defined RFC 2544 benchmark is key. It uses seven pre-defined frame sizes (64, 128, 256, 512, 1024, 1280 and 1518 bytes) to simulate various traffic conditions and perform comprehensive Ethernet testing at service turn-up. RFC 2544 and small packet performance are deemed vital to ensuring service quality and increasing customer satisfaction. Unfortunately, some switches are unable to pass RFC 2544; a well-architected switch, however, can switch packets of any size with, wait for it, zero packet loss.

But with NFV, the goal is to replace systems that have passed RFC 2544 with servers built around general-purpose CPUs, and these CPUs and the operating systems running on them were not designed for high-speed packet processing. If we look at the history of how network devices such as routers are built, the CPU has always been a key component. Historically speaking, routers started as software running on general-purpose processors. The AGS (Advanced Gateway Server), shipped in 1986 as Cisco's first commercial multiprotocol router, was based on a Motorola 68000 series CPU (M68k). Due to the speed limitations of this processor and other factors, packet switching performance was limited to the range of 7,000 to 14,000 packets per second (pps) even with fast switching, which is pretty anemic by today's standards. Fast forward three decades, and no reputable router vendor uses a CPU to perform packet data path forwarding. Instead, all vendors use custom ASICs, commercial silicon, or network processors, which are much better equipped to support the tens or hundreds of gigabits of throughput that a router line card needs to push. CPUs on routers play a significant role only on the route processors, which run the control and management planes.

To enhance the CPU's ability to process packets, Intel and other contributing companies created the Data Plane Development Kit (DPDK), a set of data plane libraries and network interface controller drivers aimed at fast packet processing. Aside from optimizing buffer management and other enhancements, DPDK changed the packet receive operation from push mode to poll mode, eliminating the interrupts, context switches and buffer copies of the Linux network stack to achieve a several-fold improvement in packet performance. But the downside is also easy to see: IT professionals who deploy DPDK need to dedicate a significant number of CPU cores just for packet processing. These expensive CPU cores spin in loops, running at GHz rates and basically doing nothing, all while simply waiting for packets to arrive.
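
The cost model is easiest to see as a receive loop. The sketch below is conceptual Python, not actual DPDK (whose APIs are C); MockNic and rx_burst are hypothetical placeholders standing in for a poll-mode driver handle and its burst-receive call. The point is simply that the loop never blocks, so the core it is pinned to stays at 100 percent utilization regardless of how much traffic arrives.

```python
import itertools

class MockNic:
    """Hypothetical stand-in for a poll-mode driver handle (illustration only)."""
    def __init__(self, traffic):
        self._traffic = iter(traffic)

    def rx_burst(self, burst_size):
        # Non-blocking: return whatever packets are ready right now, possibly none.
        return list(itertools.islice(self._traffic, burst_size))

def poll_mode_rx_loop(nic, handle, burst_size=32, iterations=10_000):
    """Busy-poll receive loop: the core spins whether or not packets arrive."""
    for _ in range(iterations):               # a real PMD loop runs forever on a pinned core
        for pkt in nic.rx_burst(burst_size):  # no interrupt, no context switch, no sleep
            handle(pkt)

poll_mode_rx_loop(MockNic(f"pkt-{i}" for i in range(8)), handle=print)
```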

As a leader in high-performance server interconnects, Mellanox provides Poll Mode Driver (PMD) support for DPDK on its NICs, and we set a record of more than 90 Mpps of DPDK performance on our ConnectX®-4 100G NIC. But we have an alternative, and better, way to solve the NFV packet performance challenge. We call it Accelerated Switching and Packet Processing (ASAP2). This solution combines the performance and efficiency of server/storage networking hardware, the NIC (Network Interface Card), with the flexibility of virtual switching software to deliver software-defined networks with the highest total infrastructure efficiency, deployment flexibility and operational simplicity.

Starting with the Mellanox ConnectX-3 Pro series of NICs, Mellanox has designed an embedded switch (eSwitch) into its NIC silicon. This eSwitch is a flow-based switching engine capable of performing Layer-2 (L2) switching for the different VMs running on the server with higher performance and better security and isolation. eSwitch capabilities have been enhanced in the current generation of ConnectX-4 NICs to perform packet classification and overlay virtual network tunneling protocol processing, specifically VXLAN encapsulation and de-capsulation. And in the latest ConnectX-5 NIC, the eSwitch can handle Layer-3 (L3) operations such as header rewrite, MPLS operations such as label push/pop, and even flexible, customer-defined parsing and header rewrite of vendor-specific headers. The ASAP2 accelerator is built on top of the eSwitch NIC hardware and allows either the entire virtual switch, or significant portions of virtual switch or distributed virtual router (DVR) operations, to be offloaded to the Mellanox NIC, all of which achieves greatly improved packet performance with significantly reduced CPU overhead. At the same time, ASAP2 keeps the SDN control plane intact, and the SDN controller still communicates with the virtual switches to pass down the network control information. This is exactly how modern networking devices are built: a software control and management plane and a hardware data plane.
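
The division of labor behind this approach can be sketched in a few lines: the software virtual switch stays in charge of policy and the SDN control plane, while per-flow forwarding decisions are pushed down into the eSwitch so that subsequent packets of the flow are handled entirely in NIC hardware. The Python below is only a conceptual model of that pattern; the class and method names are hypothetical and do not correspond to Mellanox or OVS APIs.

```python
# Conceptual model of flow-based offload: the software virtual switch decides what
# to do with the first packet of a flow, then programs a matching rule into the
# NIC's embedded switch so later packets never touch the CPU.

class ESwitch:
    """Stand-in for the NIC's embedded switch rule table."""
    def __init__(self):
        self.rules = {}                       # flow key -> action programmed in hardware

    def lookup(self, flow_key):
        return self.rules.get(flow_key)       # hit means hardware forwarding, zero CPU

    def offload(self, flow_key, action):
        self.rules[flow_key] = action

class SoftwareVSwitch:
    """Stand-in for the software virtual switch; it keeps the SDN control plane role."""
    def decide(self, flow_key):
        return f"forward-to({flow_key[1]})"   # policy decision made once per flow

def handle_packet(pkt, vswitch, eswitch):
    flow_key = (pkt["src"], pkt["dst"])
    action = eswitch.lookup(flow_key)
    if action is None:                        # first packet of the flow: software slow path
        action = vswitch.decide(flow_key)
        eswitch.offload(flow_key, action)     # the rest of the flow stays in NIC hardware
    return action

eswitch, vswitch = ESwitch(), SoftwareVSwitch()
for packet in [{"src": "vm1", "dst": "vm2"}] * 3:
    print(handle_packet(packet, vswitch, eswitch))
```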

ASAP2 easily beats DPDK in 3 critical ways:

  • Much higher packet performance
  • Much lower CPU overhead (higher infrastructure efficiency)
  • No hidden costs

  1. Much higher packet performance

Everyone intuitively understands that hardware performs better than software, but how much better? We compared packet performance in the ASAP2 scenario, where the data path of Open vSwitch (OVS) is offloaded to the Mellanox ConnectX-4 Lx NIC, with the DPDK-accelerated OVS scenario, where OVS runs entirely in user space, using DPDK libraries to bypass the hypervisor kernel and boost packet performance.


There is a stark contrast in packet performance: with a small number of flows and a 64-byte packet stream, ASAP2 reached 33 Mpps on a 25G interface, where the theoretical maximum packet rate is 37.2 Mpps. For DPDK-accelerated OVS, the best performance is 8-9 Mpps, less than 30 percent of what ASAP2 can deliver.


When we scale up the number of flows, the contrast is even more apparent: ASAP2 delivers roughly 10X the packet performance of DPDK-accelerated OVS. With 60,000 flows, and VXLAN encap/decap handled in the ConnectX-4 Lx eSwitch, ASAP2 achieves 18 Mpps. OVS over DPDK achieves 1.9 Mpps with the same number of flows. It is worth mentioning that the OVS over DPDK configuration used standard VLAN, since configuring VXLAN for OVS over DPDK is not trivial; we expect that with VXLAN, OVS over DPDK performance would be even lower.


  2. Much lower CPU overhead (higher infrastructure efficiency)

In all the above scenarios, DPDK-accelerated OVS consumes four CPU cores, while ASAP2 consumes zero CPU cores, all while delivering significantly higher performance with a software-defined OVS control plane. The CPU overhead of DPDK will become unbearable as we move to 25G, 50G or 100G server connectivity. Would you want to dedicate 10 or 15 CPU cores just to packet processing? If so, you have no CPU cores left to do anything else. For cloud service providers, this directly affects the top line: the more VMs or containers they can run on their servers, the more money they can make. If they allocate a significant number of CPU cores to DPDK, they can spin up fewer VMs on their servers and make less money!

  3. No hidden costs

In his Netdev Conference keynote presentation, Fast Programmable Networks & Encapsulated Protocols, David Miller, the primary maintainer of the Linux networking stack, made his audience repeat three times, "DPDK is NOT Linux." He wanted to make sure people understand that DPDK is a separate, BSD-licensed package that lives outside the Linux kernel. And if you are using a commercial DPDK-accelerated virtual switch solution, there is additional cost associated with that.

But there is absolutely no additional charge associated with ASAP2. ConnectX-4 Lx is a high-volume NIC that we are shipping in significant quantities to our hyperscale customers, and it is priced very competitively compared to other high-volume NICs. All changes related to OVS Offload (using ASAP2 to offload the OVS data plane to NIC hardware) have been upstreamed to the open source communities, including OVS and Linux. OVS Offload is transparent to applications, and no application-level changes are needed to take advantage of ASAP2. Mellanox does not charge a cent for the ASAP2 accelerator, and it comes standard on the latest generations of our ConnectX NICs.

Getting Mellanox ConnectX-4 Lx is easy: it is carried by several of our server OEM partners, including Lenovo, Dell and HP. So try ASAP2 ASAP, and enhance your NFV performance and efficiency today!

Tough Choices No More, at Least for SDN

It is great to have choices, but making decisions is hard, especially when none of the choices seems perfect; they all have their pros and cons. As the 2016 presidential election draws near, many Americans are reminded of the tough decision-making that comes with political freedom and democracy. And across the pond, the Brits are already living the aftermath of their Brexit decision.

Politics aside, when it comes to deploying overlay SDN (Software Defined Networking) solutions, the existing choices are not satisfactory, forcing network and cloud architects to choose between flexibility/programmability and performance/efficiency. As I described in the "Software Defined Networking, Done Right" series of blogs, there are two common ways to deploy a VTEP: a software VTEP in virtual switches, normally running in server hypervisors; or a hardware VTEP in Top of Rack (ToR) switches, and there are tradeoffs between these two approaches. A software VTEP is flexible and conceptually simple, but it can impact performance and raise CPU overhead on edge devices, due to the packet processing associated with relatively new tunneling protocols that not all server Network Interface Cards (NICs) can offload from the CPU. This tradeoff can be even more pronounced when the applications themselves are virtualized network functions (VNFs) in a Network Function Virtualization (NFV) deployment. Hardware VTEPs can often achieve higher performance, but they add complexity to the ToR switch, since the ToR switch needs to be VM-aware, maintain a large forwarding table, and perform VM MAC address or VLAN-to-VXLAN translations.

But what if you had the option to perform VTEP functionality on the host, and still enjoy the same great performance you could potentially get on a ToR? In the upcoming webinar between Nuage Networks and Mellanox Technologies on November 10th, 2016, we will show you our joint solution that can make the SDN deployment decision much simpler. After viewing our webinar, you will at least know how to cast your vote for the right SDN deployment model!


Get Your Cloud Networking Done Right: Find Out How from Mellanox at OpenStack Summit Barcelona

You want your cloud network to be software-defined, but without compromising performance or efficiency. How about 25G, 50G or 100Gbps speed without glitches? How about earth-shattering DPDK or virtualized network function (VNF) performance? How about unleashing your NVMe SSD speed potential with software-defined storage? The hyperscale giants are doing all of this and more with Mellanox solutions. We can enable you to do the same, based on our experience working with some of the largest cloud service providers and hyperscale Internet companies. Our end-to-end interconnect portfolio, including NICs, switches, cables, and the associated network automation and management software, seamlessly integrates with OpenStack and delivers total infrastructure efficiency, operational simplicity, and deployment flexibility. This is what we mean by getting your cloud networking done right:

  • Total Infrastructure Efficiency

Unleash the full potential of the entire data center infrastructure and deliver the highest application performance and workload density, ultimately guaranteeing business service SLAs while capping cost.

  • Operational Simplicity

Ready integration with mainstream virtualization and cloud platforms, state-of-the-art API-driven network automation, and no need to engineer around the network, all of which simplify network design and operations and lower OPEX.

  • Deployment Flexibility

Based on best practices in hyperscale data center infrastructure design, the Mellanox solution employs network disaggregation and open networking principles to free customers from vendor lock-in, reducing risks and enhancing business agility.

There are two exciting demos we are going to show you in our booth #A31 at OpenStack Summit Barcelona:

  • ASAP2 – Accelerated Switching and Packet Processing

This is a unique feature supported by the Mellanox ConnectX®-4 Lx series of Ethernet adapters that delivers more than four times the packet throughput of OVS (Open vSwitch) over DPDK, with zero CPU overhead for the hardware-based forwarding plane and an intact SDN control plane.


  • Ceph over RDMA/RoCE

In this demo, we run a 4-node Ceph cluster with fast storage – NVMe flash drives in each Ceph server, running over 50GbE networks. It all connects using state-of-the-art Open Networking switches running Cumulus Linux over Mellanox Spectrum™ hardware, with PFC/ECN features enabled for RDMA/RoCE. We compare the 4KB read IOPS performance in three different network protocol settings:

  1. Simple Messenger TCP
  2. Async Messenger TCP
  3. XIO Messenger using RDMA/RoCE

In our test environment, RDMA/RoCE delivers nearly 2X the IOPS performance of the TCP-based messengers.


Now comes the really fun part: we are hosting a happy hour party during the OpenStack Summit with our Open Networking partner Cumulus Networks on Oct 25th. Space is limited, so if you want to secure your drinks, register today!


We look forward to seeing you in Barcelona!

Software Defined Networking, Done Right Part 3

Read Software Defined Networking, Done Right Part 1 and Part 2.


In Part II of this SDN blog series, Innovations that can Move SDN from Depths of Disillusionment to Peak of Enlightenment, I discussed some of the technologies that can significantly improve total infrastructure efficiency, operational simplicity and deployment flexibility. In this blog, I will show you the best practices for putting everything together and building the most efficient SDN in production, whether it is OpenFlow- or overlay-based.

As a refresher, the following picture summarizes the SDN deployment models and the Mellanox solutions that apply to each of them:

Recommended deployment for OpenFlow based SDN networks:



  • Leaf-Spine architecture built using Spectrum switches with OpenFlow 1.3 support
  • For physical + virtual fabric, leverage ASAP2 to offload virtual switch data plane to Mellanox ConnectX-4 NICs
  • Advanced flow monitoring with sFlow capabilities on Spectrum

Key benefits:

  • High performance: line-rate throughput at any speed from 10G to 100Gb/s, including on virtualized servers
  • Most flexible OpenFlow switch implementation
  • In-depth visibility into the OpenFlow fabric

Recommended Deployment for Overlay SDN Networks:



  • Multiple ways for virtualized server deployments:
    • Virtual switch as VTEP + Mellanox VXLAN stateless offload
    • Mellanox NIC as VTEP (Leverage ASAP2 to offload virtual switch data plane to Mellanox ConnectX-4 NICs, while keeping SDN control plane operations in virtual switch)
    • For VMs that need SR-IOV on legacy NICs that do not support ASAP2, use a Mellanox Spectrum ToR as the VTEP
  • High-performance hardware VTEP on Mellanox Spectrum ToR for bare metal server or storage provisioning
  • (Roadmap Feature) High-performance hardware Layer 3 VTEP on Mellanox Spectrum spine switches for VXLAN routing.
  • Advanced underlay fabric monitoring with sFlow capabilities on Spectrum

Key benefits:

  • High performance: line-rate throughput at any speed from 10G to 100Gb/s, including on virtualized servers;
  • Most advanced and future-proof VTEP implementation, with the flexibility to place the VTEP at either the NIC or the switch level and the potential to extend to Layer 3 hardware VTEP without a forklift upgrade;
  • In-depth visibility into the SDN underlay fabric to facilitate correlation of stats from both layers and achieve easy troubleshooting.

Closing Thoughts

SDN is an evolving technology, and Mellanox is uniquely positioned to provide the most comprehensive, flexible and efficient SDN support through our end-to-end interconnect and associated software to help our customers with any SDN deployment.

Software Defined Networking, Done Right Part 2

Read Software Defined Networking, Done Right Part 1 and Part 3.


Part 2: Innovations that can Move SDN from Depths of Disillusionment to Peak of Enlightenment

In Part I of this SDN blog series, The State of the SDN Art, I reviewed the mainstream SDN deployment models and discussed the pros and cons of each. In Part II, I will discuss some of the key Mellanox innovations that can enhance the performance and efficiency of SDN deployments, especially large-scale deployments. As Andrew Lerner blogged about the 2016 Gartner Networking Hype Cycle, SDN is now firmly entrenched in the trough of disillusionment. He specifically mentioned that:

"During 2015, we started to see production adoption of SDN solutions, though broad deployments are still relatively rare. A variety of both established vendors and startups continue to develop SDN technologies, but full, robust, end-to-end enterprise-ready solutions are not yet fully established on the market."

In some sense, the technologies we discuss below are key to moving SDN through the hype cycle, from the depths of disillusionment to the peak of enlightenment.

VXLAN Offload on ConnectX® NICs

In the good old days, when network virtualization was realized through VLANs, achieving line-rate performance on the server host was possible because the server could offload some of the CPU-intensive packet processing operations, such as checksum, Receive Side Scaling (RSS) and Large Receive Offload (LRO), into the NIC hardware. This both improved network I/O performance and reduced CPU overhead, ultimately making the infrastructure run more efficiently.

Now, with overlay SDN, a tunneling protocol such as VXLAN, NVGRE or GENEVE is introduced to encapsulate the original payload. For NICs that don't recognize these new packet header formats, even the most basic offloads stop functioning, forcing all packet-manipulation operations to be done in software on the CPU. This can cause significant network I/O performance degradation and excessive CPU overhead, especially as server I/O speed evolves from 10Gb/s to 25, 40, 50, or even 100Gb/s.
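
To see why the offloads break, it helps to look at what an encapsulation such as VXLAN actually puts in front of the original frame. The sketch below (plain Python, using the standard header sizes for an IPv4 underlay) tallies the outer headers; a NIC that cannot parse past the outer UDP header never reaches the inner TCP/IP headers that checksum, RSS and LRO operate on.

```python
# What VXLAN adds in front of the original (inner) Ethernet frame when the
# underlay is IPv4. The inner frame becomes the payload of an outer UDP packet.

outer_headers = [
    ("outer Ethernet", 14),
    ("outer IPv4",     20),
    ("outer UDP",       8),
    ("VXLAN",           8),   # flags + reserved + 24-bit VXLAN Network Identifier (VNI)
]

for name, size in outer_headers:
    print(f"{name:15} {size:3} bytes")
print("total added per packet:", sum(s for _, s in outer_headers), "bytes")  # 50 bytes

# A NIC that does not parse VXLAN stops at the outer UDP header, so checksum, RSS
# and LRO cannot be applied to the inner TCP/IP headers; the CPU does that work instead.
```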

Starting with the ConnectX®-3 Pro series of NICs, Mellanox supports VXLAN hardware offload, which includes stateless offloads such as checksum, RSS, and LRO for VXLAN/NVGRE/GENEVE packets. With VXLAN offload, I/O performance and CPU overhead can be restored to levels similar to VLAN.

The following two graphs show the bandwidth and CPU overhead comparison in three scenarios: VLAN, VXLAN without offload, and VXLAN with offload. VXLAN offload results in greater than 2X throughput improvement with approximately 50 percent lower CPU overhead.


VXLAN Offload is supported at the OS/hypervisor kernel level for Linux, Microsoft Hyper-V, and VMware ESXi, and does not depend on the type of virtual switch or router used.

ASAP2 (Accelerated Switching and Packet Processing) on ConnectX-4 NICs

Starting with the ConnectX®-4 series of NICs, Mellanox supports VTEP capability in server NIC hardware through the ASAP2 feature. With a pipeline-based programmable eSwitch built into the NIC, ConnectX-4 can handle a large portion of packet processing operations in hardware. These operations include VXLAN encapsulation/decapsulation, packet classification based on a set of common L2-L4 header fields, QoS, and Access Control Lists (ACLs). Built on top of these enhanced NIC hardware capabilities, the ASAP2 feature provides a programmable, high-performance and highly efficient hardware forwarding plane that works seamlessly with the SDN control plane. It overcomes the performance degradation issues associated with software VTEPs, as well as the complexity of coordinating between server and ToR devices in the case of hardware VTEPs.

There are two main ASAP2 deployment models: ASAP2 Direct and ASAP2 Flex.


ASAP2 Direct

In this deployment model, VMs establish direct access to the Mellanox ConnectX-4 NIC hardware through an SR-IOV Virtual Function (VF) to achieve the highest network I/O performance in a virtualized environment.

One of the issues with legacy SR-IOV implementations is that SR-IOV bypasses the hypervisor and virtual switch completely, so the virtual switch is not aware of the VMs running in SR-IOV mode. As a result, the SDN control plane cannot influence the forwarding plane for those VMs using SR-IOV on the server host.

ASAP2 Direct overcomes this issue by enabling rules offload between the virtual switch and the ConnectX-4 eSwitch forwarding plane. Here, we use Open vSwitch (OVS), one of the most commonly used virtual switches, as an example. The combination of an SDN control plane (OVS communicating with a corresponding SDN controller) and a NIC hardware forwarding plane offers the best of both worlds: software-defined, flexible network programmability and high network I/O performance at state-of-the-art speeds from 10G to 25/40/50/100G. By letting the NIC hardware take the I/O processing burden off the CPU, CPU resources can be dedicated to application processing, resulting in higher system efficiency.

ASAP2 Direct offers excellent small packet performance beyond raw bit throughput. Mellanox's benchmark shows that on a server with a 25G interface, ASAP2 Direct achieves 33 million packets per second (Mpps) with ZERO CPU cores consumed for a single flow, and about 25 Mpps with 15,000 flows performing VXLAN encap/decap in the ConnectX-4 Lx eSwitch.

ASAP2 Flex

In this deployment model, VMs run in para-virtualized mode and still go through the virtual switch for their network I/O. However, through a set of open APIs such as Linux Traffic Control (TC) or the Data Plane Development Kit (DPDK), the virtual switch can offload some of the CPU-intensive packet processing operations to the Mellanox ConnectX-4 NIC hardware, including VXLAN encapsulation/decapsulation and packet classification. This is a roadmap feature and the availability date will be announced in the future.

OpenFlow support on Spectrum Switches

Spectrum is Mellanox’s 10/25/40/50 and 100Gb/s Ethernet switch solution that is optimized for SDN to enable flexible and efficient data center fabrics with leading port density, low latency, zero packet loss, and non-blocking traffic.

From the ground up, at the switch silicon level, Spectrum is designed with a very flexible processing pipeline that can accommodate a programmable OpenFlow pipeline, allowing packets to be sent to subsequent tables for further processing and allowing metadata to be communicated between OpenFlow tables. This makes Spectrum an ideal choice for supporting the OpenFlow 1.3 specification.
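
The multi-table behavior described above can be pictured as a packet walking through an ordered set of flow tables, each of which can attach metadata and hand the packet to a later table. The toy Python below illustrates that OpenFlow 1.3 pipeline model using made-up match fields; it is not Spectrum firmware or any real OpenFlow controller API.

```python
# Conceptual OpenFlow 1.3 multi-table pipeline: each flow table matches the packet,
# may attach metadata for later tables, and may send the packet to a subsequent table.

def table_0(pkt, metadata):
    metadata["tenant"] = pkt["vlan"]          # classify once, pass the result downstream
    return ("goto", 1)

def table_1(pkt, metadata):
    if metadata["tenant"] == 100:             # match on metadata written by an earlier table
        return ("output", "port-3")
    return ("drop", None)

PIPELINE = {0: table_0, 1: table_1}

def run_pipeline(pkt):
    metadata, table_id = {}, 0
    while True:
        verdict, arg = PIPELINE[table_id](pkt, metadata)
        if verdict == "goto":
            table_id = arg                    # continue processing in a later table
        else:
            return verdict, arg               # terminal action: output to a port, or drop

print(run_pipeline({"vlan": 100}))            # ('output', 'port-3')
print(run_pipeline({"vlan": 200}))            # ('drop', None)
```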

In addition, Spectrum is an OpenFlow-hybrid switch that supports both OpenFlow operation and normal Ethernet switching operation. Users can configure OpenFlow at the port level, assigning some Spectrum ports to perform OpenFlow-based packet processing and others to perform normal Ethernet switching. Spectrum also provides a classification mechanism to direct traffic within one switch port to either the OpenFlow pipeline or the normal Ethernet processing pipeline.

VTEP support in Spectrum Switches

Mellanox Spectrum supports VTEP gateway functionalities that make it ideal to be deployed as:

  • Layer 2 VTEP gateway between virtualized networks using VXLAN and non-virtualized networks using VLAN in the same data center or between data centers.
  • Layer 2 VTEP gateway to provide a high-performance connection to virtualized servers across Layer 3 networks and enable Layer 2 features such as VM live migration (vMotion). On virtualized server hosts where the NIC does not have VTEP capability and a software VTEP can't meet the network I/O performance requirement, the VTEP can be implemented on a Mellanox Spectrum ToR. In some cases, the application running in the VM may want to use advanced networking features such as Remote Direct Memory Access (RDMA) for inter-VM communication or access to storage. RDMA needs to run in SR-IOV mode on virtualized servers, and when a Mellanox NIC is not present, the VTEP is best implemented in the ToR.
  • Layer 3 VTEP gateway that provides VXLAN routing capability for traffic between different VXLAN virtual networks, or for north-south traffic between a VXLAN network and a VPN network or the Internet. This feature is supported in Spectrum hardware, and the software to enable it is still under development.

Spectrum is an Open Ethernet switch and can support multiple switch operating systems running over it. The Layer 2 VTEP gateway features will first be available in Cumulus Linux over Spectrum, and subsequently in MLNX-OS.

In the third blog, I will show how one can put these innovations together to deploy the most efficient SDN in production.

Software Defined Networking, Done Right

Read Software Defined Networking, Done Right Part 2 and Part 3.


Part 1: The State of the SDN Art

Software-Defined Networking (SDN) is a revolutionary approach to designing, building and operating networks that aims to deliver business agility in addition to lowering capital and operational costs through network abstraction, virtualization and orchestration. Conceptually, SDN decouples the control and data planes, and logically centralizes network intelligence and control in software-based controllers that maintain a global view of the network. This enables more streamlined policy-driven external control and automation from applications, which ultimately enhances network programmability and simplifies network orchestration. As such, SDN-based design allows for highly elastic networks that can readily adapt to changing business needs.

The first wave of SDN deployment focuses on functionality, but with many innovations and enhancements in data center interconnect technologies, it is time to take a fresh look at more efficient, higher performance SDN deployment options.

This blog series focuses on SDN solutions for the data center, often an essential part of building a cloud, whether private or public. The three blogs in this series will cover:

  • The State of the SDN Art
  • Innovations that can Move SDN from Trough of Disillusionment to Peak of Enlightenment
  • Putting Everything Together: Deploy the Most Efficient SDN

SDN Deployment Models

Three different deployment models dominate today’s SDN landscape:

Device-Based SDN Deployment Model




In this model, the SDN Controller uses a south-bound device control protocol to directly communicate policy or forwarding table information to the physical and virtual switching and routing devices. OpenFlow is the most commonly used protocol, and some of the early SDN architectures are based on OpenFlow to decouple control plane from network devices.

Examples of SDN implementations based on this model are Big Switch Networks' Big Cloud Fabric, the Open Networking Lab (ON.Lab)'s ONOS, and OpenDaylight (ODL). Beyond OpenFlow, ONOS and ODL also support other southbound protocols such as NETCONF and SNMP for device configuration and management.

Essentially, for every new flow, all the devices that the flow traverses potentially need to be programmed to handle the proper flow operations. This model requires the network devices to be OpenFlow-aware, which can be a challenge when you have legacy networks or a mixture of various generations of network devices.

Overlay SDN Deployment Model


Many customers have an installed base of networking equipment that is not yet OpenFlow-enabled, and a network-wide upgrade may not be an option. The overlay SDN deployment model came into being to bring SDN and network virtualization to these customers without requiring a forklift network upgrade that can be both expensive and disruptive to business services. Overlay SDN has become the most common architecture, and mainstream SDN solutions such as VMware NSX, Nuage Networks (now part of Nokia) VSP, PLUMgrid ONS, OpenContrail and Midokura MidoNet all primarily follow this model.

As the name indicates, in the overlay model, logical networks are established through tunnels between endpoints, and these tunnels are overlaid onto an existing physical network. The intelligence about multi-tenancy and network and security policies is pushed to the network edge. Some of the most commonly used tunneling protocols include Virtual eXtensible LAN (VXLAN), Network Virtualization using GRE (NVGRE), and Generic Network Virtualization Encapsulation (GENEVE). In the case of VXLAN, the tunnel endpoints are known as VXLAN Tunnel End Points (VTEPs). The physical network, or underlay, becomes the "core" network, and its functionality can potentially be simplified to providing high-performance IP connectivity between these VTEPs. An overlay SDN controller primarily communicates with the VTEPs, which are often the virtual switching and routing devices residing on servers.
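
Conceptually, each VTEP keeps a mapping from (virtual network identifier, inner destination MAC) to the IP address of the remote VTEP behind which that MAC lives; encapsulation is a lookup in that table followed by wrapping the original frame. The toy Python below sketches that lookup with made-up addresses, just to make the overlay/underlay split concrete.

```python
# Toy model of a VTEP forwarding table: (VNI, inner destination MAC) -> remote VTEP IP.
# All addresses are made up for illustration.

vtep_table = {
    (5001, "52:54:00:aa:bb:01"): "10.0.0.11",   # VM reachable behind the VTEP on 10.0.0.11
    (5001, "52:54:00:aa:bb:02"): "10.0.0.12",
    (5002, "52:54:00:aa:bb:01"): "10.0.0.13",   # same MAC, different virtual network (VNI)
}

def encapsulate(vni, inner_frame):
    remote_vtep = vtep_table.get((vni, inner_frame["dst_mac"]))
    if remote_vtep is None:
        return "unknown destination: flood within the VNI or ask the controller"
    # The whole inner frame rides as the payload of an outer IP/UDP/VXLAN packet.
    return {"outer_dst_ip": remote_vtep, "vni": vni, "payload": inner_frame}

print(encapsulate(5001, {"dst_mac": "52:54:00:aa:bb:02", "data": b"..."}))
```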

Overlay SDN can be deployed to achieve network virtualization and automation without requiring upgrades to physical networking equipment, more specifically, the network devices that are NOT the VTEPs. Despite its pluses, overlay SDN introduces added complexity when it comes to managing both the overlay and the underlay, and correlating information from both layers during troubleshooting.

There are two common ways to deploy a VTEP: a software VTEP in virtual switches, normally running in server hypervisors; or a hardware VTEP in Top of Rack (ToR) switches, and there are tradeoffs between these two approaches. A software VTEP is flexible and conceptually simple, but it can impact performance and raise CPU overhead on edge devices, due to the packet processing associated with relatively new tunneling protocols that not all server Network Interface Cards (NICs) can offload from the CPU. This can be even more pronounced when the applications themselves are virtualized network functions (VNFs) in a Network Function Virtualization (NFV) deployment. Hardware VTEPs can often achieve higher performance, but they add complexity to the ToR switch, since the ToR switch needs to be VM-aware, maintain a large forwarding table, and perform VM MAC address or VLAN-to-VXLAN translations.


Beyond the virtualized environment with VXLAN/NVGRE/GENEVE, there are often Bare Metal Servers (BMS) or legacy networks that can only use VLAN, or north-south traffic that goes out to a VPN network or the Internet. In those cases, using a software VTEP gateway adds an extra hop and a potential performance bottleneck, so the best practice is to use the ToR that the BMS is connected to as a hardware VTEP.

Proprietary SDN Solutions

There are other, proprietary SDN solutions on the market, such as Cisco Application Centric Infrastructure (ACI), Plexxi and Pluribus. With these solutions, the SDN controller and the SDN switching and routing elements are often tightly coupled. This category of SDN solutions is not as open as the above two, and it poses limitations for ecosystem vendors seeking to integrate with them.

In the next blog in this series, I will provide an overview of some new innovations that overcome some of the issues associated with the overlay and OpenFlow deployment models, and can make these SDN deployments more efficient, flexible and streamlined.

Performance beyond Numbers – Stephen Curry Style Server I/O

This has been an eventful season for the NBA, with the Golden State Warriors making history with 73 wins in an 82-game season, and Kobe Bryant retiring from the LA Lakers. But one man is stealing all the thunder, and that is Stephen Curry, the NBA's first unanimously voted MVP.


Figure 1: Stephen Curry amazes while Paul Allen expresses incredulity.


Why do people love Curry? Because he can perform, and the numbers prove it. Curry is the first player in NBA history at any position to average 30 points per game in less than 35 minutes per game over a full season. In the Warriors' game against the Portland Trail Blazers, Curry set a new NBA record when he scored 17 of his team's 21 points in overtime, leading the Warriors to beat the Blazers 132 to 125. That alone caused a social media buzz when Blazers' owner Paul Allen was caught on camera showing jaw-dropping disbelief.

In server I/O, numbers also speak louder than words, and that is one of the key reasons that Hewlett Packard Enterprise (HPE) is working with Mellanox Technologies to lead server networking into the 25G/100G era. At this week's HPE Discover in Vegas, HPE is showcasing impressive throughput performance for its ProLiant DL380 Gen 9 servers with the Mellanox ConnectX-4 and ConnectX-4 Lx series of server Network Interface Cards (NICs), which support 10/25/40/50/100Gb/s interface speeds. As seen in the following chart, full line-rate throughput is achieved at every interface speed, even for 100Gb/s interfaces. (The gap between the interface speed and TCP throughput is due to TCP header overhead.)



Figure 2: HPE ProLiant DL380 Gen 9 server with Mellanox ConnectX-4 NICs achieves full TCP line-rate performance with 25, 40, and 100Gb Ethernet links.

But the performance numbers don't stop here, because for telco and NFV applications, how fast a server and its NIC can process packets is extremely important. For example, Voice over LTE (VoLTE) media packets are small, around 100 bytes, which means that for the same interface speed, you are going to see a lot more packets per second at 100 bytes than at 9K bytes. This small packet performance, as defined by RFC 2544, is a true test of how solid your server I/O system is. And for that, the HPE-Mellanox combination also provides jaw-dropping numbers that would inspire even Paul Allen to come up with a new Twitter-worthy facial expression. Take a 25Gb/s port as an example: the theoretical maximum small packet rate is 37.2 million packets per second. (The minimum frame occupies 84 bytes on the wire, consisting of 46 bytes of payload, 4 CRC, 2 MAC type (EtherType), 6 source MAC address, 6 destination MAC address, 8 preamble and 12 inter-frame gap. Thus 25Gb/s converts to 25,000,000,000 / (84 bytes * 8 bits/byte) / 1,000,000 = 37.2 Mpps.) Mellanox ConnectX-4 Lx achieves 34 million pps for the smallest, 64-byte traffic, with zero packet loss. This is done by running the Data Plane Development Kit (DPDK) library with the Mellanox Poll Mode Driver (PMD) and bi-directional traffic from an IXIA traffic generator.
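
The 37.2 Mpps ceiling falls straight out of the per-frame byte count quoted above. The short Python snippet below reproduces it and prints the theoretical 25Gb/s line rate for each RFC 2544 frame size, which is the yardstick the measured 34 Mpps result should be judged against.

```python
# Theoretical line rate at 25Gb/s. A minimum-size frame occupies 84 bytes on the wire:
# 6 destination MAC + 6 source MAC + 2 EtherType + 46 payload + 4 CRC = 64 bytes,
# plus 8 bytes of preamble and a 12-byte inter-frame gap.

LINK_BPS = 25e9
WIRE_OVERHEAD_BYTES = 20  # preamble + inter-frame gap

def line_rate_mpps(frame_bytes):
    return LINK_BPS / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8) / 1e6

for size in (64, 128, 256, 512, 1024, 1280, 1518):   # RFC 2544 frame sizes
    print(f"{size:5} bytes: {line_rate_mpps(size):6.2f} Mpps")
# 64 bytes -> 37.20 Mpps, the ceiling the measured 34 Mpps should be compared against
```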


Figure 3: HPE-Mellanox NICs excel at small packet performance.

But the Mellanox DPDK support advantages go beyond pure throughput performance numbers.

In terms of security, Mellanox NICs leverage hardware-based memory protection and translation. The Mellanox DPDK solution registers the process memory and the NIC performs Direct Memory Access (DMA) only to and from memory pages which are owned by this process, enforcing security without additional overhead.

In terms of transparency and troubleshooting, the Mellanox PMD, OFED and DPDK are separate packages, and we allow co-existence with the kernel NIC driver. The end result is that HPE customers can run Mellanox DPDK while still managing the Ethernet port, as opposed to competing solutions that require different drivers for DPDK and non-DPDK applications.

Together, HPE and Mellanox are bringing jaw-dropping, Steph Curry-style performance to a new generation of server I/O at 25GbE and beyond. If you happen to be in Vegas, do check us out at both the Mellanox and HPE booths at HPE Discover.





Mellanox Takes Home “Outstanding Components Vendor” Award

Unlike Leonardo DiCaprio, who finally won his first Oscar in 2016 for the survival epic The Revenant after six nominations, Mellanox Technologies took home the title of "Outstanding Components Vendor" in the Light Reading 2016 Leading Lights Awards on our FIRST TRY. This award is given to the components vendor that stands out from its competitors, is consistently innovative and trendsetting in the industry, makes investors proud and makes employees happy. You can see why we were thrilled to win, and why Mellanox deserves to be the winner.



The Leading Lights program, now in its 12th year, has 26 core categories focusing on next-generation communications technologies, applications, services and strategies. The awards are given to companies that have shown prominent leadership and inventive thinking in their fields over the past year. Judging was conducted by Light Reading's editors and the analyst team from Heavy Reading, and the winners were announced at an awards dinner at Hotel Ella in Austin, Texas, on Monday, May 23, to coincide with the Big Communications Event. Kevin Deierling, our VP of Marketing, Chris Shea, our VP of Business Development for the CSP vertical, and I were able to attend and accept the award in person on behalf of Mellanox Technologies.

As part of the celebration, Mellanox sponsored a table at the award dinner to share this moment with our valued customers, partners and friends from the cloud and NFV ecosystem, including Affirmed Networks, Hewlett Packard Enterprise, Verizon, Technicolor, Nokia and Heavy Reading.


Even more amazing, the Mellanox table won the largest number of awards! Of the six companies represented at our table, we were honored with a total of four trophies, including Affirmed Networks winning "Private Company of the Year", Technicolor honored for "Best New Cable Product", and Nokia hailed for the "Most Innovative SDN Product Strategy". In addition, the Mellanox table was also the loudest cheering table at the award dinner!

We are thrilled! We are also grateful for the recognition from Light Reading; thank you, Steve Saunders and Ray Le Maistre. Based on the volume and quality of entries, as well as how seriously the industry takes these awards, we were very excited and proud to accept it. I'd say it's even better than an Oscar!

Mellanox Named Leading Lights Outstanding Component Vendor Finalist

When I heard that Mellanox was named an Outstanding Components Vendor finalist for Light Reading's Leading Lights Awards, I was thrilled and proud, but not surprised, because I was confident that Mellanox deserved to be in the spotlight. Mellanox is uniquely positioned to help communication service providers build their next-generation infrastructure with our vision for cloud networking and our novel approach to high-performance, high-quality interconnect. It is our mission to drive new technologies to market, revolutionizing the data center.

Mellanox has been a dominant player in the High-Performance Computing sector, supporting large, distributed, computation-intensive workloads that require high-speed communication between processor cores, and between processor cores and data. As a result, the Mellanox architecture and R&D teams have rich experience in designing semiconductor chipsets that push communication speed limits while providing low latency, low power consumption, and predictable, reliable application performance.

Building on our success in HPC, Mellanox expanded its footprint into the hyperscale web and cloud service provider space, penetrating the majority of the top web services giants globally. The infrastructure for this sector normally follows a scale-out, software-defined architectural pattern, and a high-performance data center network fabric is key to supporting its communication and data access needs. More importantly, this new generation of companies carrying out the mission of digital transformation expects its infrastructure to support agile innovation, instead of being a roadblock. As such, they want to build their infrastructure much in the same style as building with Lego blocks. At Mellanox, we call this style of network infrastructure building "Open Composable Networks (OCN)". OCN can truly unleash agile innovation, accelerate diverse workloads, and drive cloud-scale efficiency. It leverages hyperscale web and cloud network architecture designs and is based on network disaggregation, open and unified abstraction interfaces, and automated orchestration and management.

Just as Lego building needs a set of high-quality basic components, the foundation of OCN relies on Mellanox end-to-end interconnect components that guarantee high performance:

Mellanox ConnectX-4 series of NICs:

  • Best DPDK performance: 75 million pps on a 100G interface and 33 million pps on a 25G interface
  • Accelerated Switching and Packet Processing (ASAP2) support: SDN control plane with an accelerated data plane through the NIC ASIC
  • Multi-Host NIC support, enabling higher CPU density per server and an open, flexible combination of CPUs
  • Option of advanced acceleration and intelligent offload through on-board FPGA, multi-core processors and network processors

Mellanox Spectrum Switch IC and Top-of-Rack Switch Systems:

  • Open Ethernet support for Cumulus Linux, Microsoft SONiC, Metaswitch, OpenSwitch and MLNX-OS
  • Zero packet loss, at any packet size, at any speed (10/25/40/50/100Gb/s), with up to 6.4Tb/s switching capacity
  • Efficient switch memory management resulting in 9X-15X more effective buffering and congestion resilience
  • Fair bandwidth allocation independent of physical port
  • Industry-leading, true cut-through latency
  • Forwarding database sized for hyperscale infrastructure build-out
  • Optimized for SDN with OpenFlow and overlay tunneling support, including VXLAN, NVGRE, GENEVE and MPLS

Mellanox LinkX Cables:

  • Copper cables, active optical cables, and optical transceivers to support distances from < 2 m to 2 km
  • Silicon photonics-based single-mode and VCSEL-based multi-mode optical modules and cables for 25, 50, and 100Gb/s networks
  • Full range of 100Gb/s products in the same QSFP28 footprint

OCN is perfect for NFV use cases such as virtualized EPC, IMS/SBC, vCPE, and vCDN, enabling communications service providers to realize virtualization and build multi-cloud infrastructure without a performance penalty.

If you are heading to BCE in Austin, be sure to join Mellanox in our two panel discussions:
• BCE Day 1, May 24th, 4:15-5:05 p.m.: Components: Data Center Interconnects: Delivering 25G TO 400G
• BCE Day 2, May 25th, 2:15-3:05 p.m.: Data Centers and Cloud Services: The New Telco Data Center
My fingers are crossed; I am hoping that Mellanox will walk down the red carpet in Austin as the winner of the Leading Lights Outstanding Components Vendor award.

Affirmed Networks Partners with Mellanox to Further Boost NFV Deployment Efficiency

Mellanox and Affirmed

I am very excited that, after engaging with Affirmed Networks in extensive integration and certification testing, Mellanox is now officially a partner of this leading virtual Evolved Packet Core (EPC) provider and key supplier for AT&T's Domain 2.0 initiative. Through this mutually beneficial partnership, Mellanox aims to boost Affirmed Mobile Content Cloud (MCC) Virtualized Network Function (VNF) deployment efficiency with our high-performance server interconnect solutions.

Affirmed Networks is a leading telecommunications technology supplier with revolutionary Network Function Virtualization (NFV) solutions for EPC, vCPE, Gi LAN and SFC Controller. Affirmed's virtualized MCC software has been designed to run on virtualized, high-volume servers. However, when server I/O capacity becomes constrained, application performance may suffer, resulting in under-utilized CPU resources and an excessive server footprint. Mellanox's high-speed server interconnect solution enhances the utilization of infrastructure resources for Affirmed's virtualized product offerings, enabling optimal application performance and an optimized space and energy footprint for vEPC deployments. Affirmed Networks and Mellanox are both HPE OpenNFV ecosystem partners.

Hear Ron Parker, Senior Director of System Architecture at Affirmed Networks, talk about this partnership and how Mellanox can help supercharge NFV deployments.