Who’s Tapping Your Lines and Snooping On Your Apps?

 
Spoiler alert – it should be you!

No one would dispute that good vision is important for a surgeon, a welder, or an Uber driver. In technology, whether you’re a Cloud Architect or in Network Operations, you need the same kind of visibility into what is going on inside your data center. To sleep soundly at night, you have to actively monitor your network performance and your application performance, and be on the lookout for security breaches. There are analyzers that specialize in each of these three distinct monitoring disciplines: Network Performance, Application Performance, and Security.

How do you get the right traffic to the right analyzers?

You need to “tap your own lines” by placing TAPs at key points in your network. These TAPs copy all the data traversing the links they are attached to. Then you need to aggregate those TAPs, consolidating all the flows onto a few high-bandwidth links to the analyzers. The modern, scaled-out approach for consolidating TAPs is to use a Software Defined TAP Aggregation Fabric, which amounts to a set of Ethernet switches that are specialized only in that they don’t run normal Layer 2/3 protocols. Instead, they steer specific flows to specific analyzers.

TAP Aggregation Fabric

You might want the TAP Aggregation fabric to do more than just steer the right flows to the right analyzers. You may want your TAP Aggregators to do some of the following:

  • Filter out unwanted flows – to save bandwidth to the analyzers and let the analyzers spend their capacity on the traffic that matters
  • Truncate packets – to remove unneeded payload data – especially if your analyzers only look at the packet headers
  • Source tagging – to identify where packets came from by rewriting the MAC address or adding a VLAN tag
  • Time-stamping – to identify exactly when packets hit the wire
  • Matching inside tunnels – to forward the right tunneled traffic to the right analyzer, while preserving the MPLS or VXLAN tunnel headers
  • Centralized management – to configure all the TAP Aggregation switches from a single control point. The per-flow filtering and forwarding rules can be configured in a number of ways, but most people like to use an OpenFlow controller, which is almost purpose-built for this type of application. An added bonus is that it makes automation easy, since the individual switches’ configs are dead simple.
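To make the filter, truncate, and source-tag ideas concrete, here is a minimal Python sketch of how a prioritized rule table might steer copied packets to analyzers. It is a toy model, not OpenFlow and not any vendor's API: the `Packet` and `Rule` classes, the `steer` function, the port numbers, and the VLAN IDs are all hypothetical names chosen for illustration, and truncation is simplified to trimming a payload field.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    in_port: int                 # TAP-facing port the copy arrived on
    ip_proto: int                # 6 = TCP, 17 = UDP
    dst_port: int
    payload: bytes
    vlan: Optional[int] = None   # source tag, set by the aggregator


@dataclass(frozen=True)
class Rule:
    priority: int
    ip_proto: Optional[int] = None     # None acts as a wildcard
    dst_port: Optional[int] = None     # None acts as a wildcard
    out_port: Optional[int] = None     # None means "filter out" (drop)
    truncate_to: Optional[int] = None  # keep only the first N payload bytes

    def matches(self, pkt: Packet) -> bool:
        return (self.ip_proto in (None, pkt.ip_proto)
                and self.dst_port in (None, pkt.dst_port))


def steer(rules, pkt, source_vlans):
    """Return (analyzer_port, packet) for the best match, or None if dropped."""
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if not rule.matches(pkt):
            continue
        if rule.out_port is None:
            return None  # filtered out: never reaches an analyzer
        payload = pkt.payload
        if rule.truncate_to is not None:
            payload = payload[:rule.truncate_to]
        # Tag the copy with a VLAN that identifies the TAP port it came from.
        vlan = source_vlans.get(pkt.in_port, pkt.vlan)
        return rule.out_port, replace(pkt, payload=payload, vlan=vlan)
    return None


# Hypothetical policy: HTTPS headers to analyzer port 10, DNS to port 11.
RULES = [
    Rule(priority=200, ip_proto=6, dst_port=443, out_port=10, truncate_to=64),
    Rule(priority=200, ip_proto=17, dst_port=53, out_port=11),
    Rule(priority=0),  # catch-all: filter out everything else
]
SOURCE_VLANS = {1: 101, 2: 102}  # TAP port -> VLAN tag identifying the source
```

Calling `steer(RULES, pkt, SOURCE_VLANS)` on a TCP/443 packet arriving on port 1 would hand back analyzer port 10 and a copy trimmed to 64 payload bytes with VLAN 101. A real fabric expresses the same intent as match/action entries pushed from the controller.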

Where do you TAP your network?

There is no universal consensus on where to place your TAPs, but there are some very common models:

Financial Services organizations frequently TAP every Tier of their network, so they can measure the latency as packets traverse the network while they also implement security monitoring.
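As a rough illustration of the latency measurement, suppose the aggregation fabric time-stamps each copy and you can correlate the same packet across Tiers. The per-hop latency is then just the difference between consecutive timestamps. The tier names and the `hop_latencies_ns` helper below are hypothetical, and the sketch assumes the timestamps come from synchronized clocks.

```python
def hop_latencies_ns(timestamps_ns, tiers):
    """Latency between consecutive tiers for one packet.

    timestamps_ns maps a tier name to the time (in nanoseconds) at which
    the TAP at that tier saw the packet; tiers lists the tiers in path order.
    """
    return {
        f"{a}->{b}": timestamps_ns[b] - timestamps_ns[a]
        for a, b in zip(tiers, tiers[1:])
    }


# Hypothetical observation of one packet at three tiers.
seen = {"edge": 1_000, "spine": 1_450, "leaf": 2_100}
latencies = hop_latencies_ns(seen, ["edge", "spine", "leaf"])
```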

Many Cloud Providers TAP every Rack in their data centers for their own monitoring purposes, as well as offering Application Performance reports to their customers.

 

How do you know what traffic to send for analysis?

If you have ever enabled too many debug features on a Cisco/Arista switch, you are rightly a bit cautious. (Friendly advice: don’t do it unless you also want a switch reboot.)

TAP Aggregation switches are the ideal place to implement heavy-duty Telemetry features because they only handle copies of the traffic, so they cannot impact your production network.

One technique for determining which flows need to be analyzed is to start monitoring your traffic with sFlow. sFlow can give you a picture of the busiest flows, top talkers, top protocols, most flows, and various traffic anomalies. It can help you detect and diagnose network problems. It can also provide a glimpse into which applications are using the network most.

You can also see when something changes, which points you to the flows that should be sent on for further analysis.
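Here is a small Python sketch of the top-talkers idea. sFlow samples roughly one packet in N, so a common estimate is to scale each sampled frame's length by the sampling rate and sum per flow. The record layout (`src`, `dst`, `frame_len`, `sampling_rate`) and the `top_talkers` function are assumptions made for this example, not the sFlow wire format or any collector's API.

```python
from collections import Counter


def top_talkers(samples, n=3):
    """Estimate bytes per (src, dst) pair from sampled packets.

    Each sample stands in for roughly `sampling_rate` packets, so its
    frame length is scaled up by that factor before summing.
    """
    totals = Counter()
    for s in samples:
        totals[(s["src"], s["dst"])] += s["frame_len"] * s["sampling_rate"]
    return totals.most_common(n)


# Hypothetical samples taken at 1-in-1000.
samples = [
    {"src": "10.0.0.1", "dst": "10.0.0.9", "frame_len": 1500, "sampling_rate": 1000},
    {"src": "10.0.0.1", "dst": "10.0.0.9", "frame_len": 1500, "sampling_rate": 1000},
    {"src": "10.0.0.2", "dst": "10.0.0.9", "frame_len": 64, "sampling_rate": 1000},
]
busiest = top_talkers(samples, n=1)
```

The pairs at the top of that list are natural candidates for a steering rule that sends their full traffic to an analyzer.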

Some of the best monitoring, analytics, and graphing tools are Open Source. Recently, folks have been well served by sending their sFlow data to sFlow-RT for analysis and then monitoring the state of their data center with Grafana.

What to look out for when considering TAP aggregation solutions

  • Go with an open multi-vendor solution – don’t get locked into a proprietary, one-of-a-kind closed solution. In the data center business, we call these “Unicorns” because they are single-vendor focused, single-vendor sourced, and cannot be easily replaced. Beware – Unicorns are expensive!

  • Be sure to make “apples to apples” cost comparisons. Don’t just look at the switch hardware costs, but also look at the per-switch licensing and controller costs.
  • Consider best-of-breed Open Source Tools which were developed for hyperscale data centers and scale better than expensive vendor-specific solutions
  • TAPs are preferred to SPAN (port mirroring), as some switches are not able to mirror every packet.
  • Make sure your TAP Aggregation switches have a sufficient packet rate (packets per second, PPS) to forward every packet sent by the TAPs
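For the packet-rate check, a quick back-of-the-envelope calculation helps. On Ethernet, every frame carries an extra 20 bytes on the wire (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap), so the worst case is a link full of minimum-size 64-byte frames. The function names below are hypothetical, and the sketch assumes you list one entry per direction of each tapped link, since a TAP on a full-duplex link copies both directions.

```python
ETHERNET_OVERHEAD_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap


def line_rate_pps(link_bps, frame_bytes=64):
    """Maximum packets per second on one link direction at a given frame size."""
    return link_bps / ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)


def required_pps(tap_feeds_bps, frame_bytes=64):
    """Worst-case packet rate the aggregator must forward for all TAP feeds."""
    return sum(line_rate_pps(bps, frame_bytes) for bps in tap_feeds_bps)


# One 10 Gb/s link, tapped in both directions -> two 10 Gb/s feeds.
worst_case = required_pps([10e9, 10e9])
```

A single 10 Gb/s direction works out to about 14.88 million packets per second at 64-byte frames, so the two feeds above need roughly 29.76 Mpps of forwarding capacity in the worst case.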

 
