<img height="1" width="1" style="display:none;" alt="" src="https://px.ads.linkedin.com/collect/?pid=2975524&amp;fmt=gif">
BLOG

Packet Deduplication Explained

November 12, 2020

Duplicate packets are one of those things that happen in a busy network. Although using different network equipment and optimizing your network will minimize duplicates, they’ll still crop up on a fairly regular basis. This can be a problem for your monitoring tools, because analyzing duplicate packets can take up their processing resources while also throwing off your analytics.

Network packet brokers (NPBs) can remediate this issue by finding and removing duplicate packets before they can reach your analytics tools. Here are some of the best ways to use this feature to optimize your network and improve monitoring throughput.

Why do Duplicate Packets Appear—And Why are they Bad for Monitoring?

No matter what happens, you’ll encounter duplicate packets in your network from time to time—but if your network uses Switch Port Analyzers (SPAN ports), you’ll find them a lot more often. When SPAN ports are used to either mirror ingress and egress ports, or to mirror traffic onto a VLAN, your monitoring network might experience a 50% increase in traffic from duplicates alone.

One simple fix is to utilize a network TAP instead of a SPAN port, which will slash the number of duplicate packets you need to deal with. We recognize that this is sometimes easier said than done—that you might have legacy equipment or may have to rely on the SPAN for network access. Additionally, switching from SPAN to TAP won’t get rid of duplicate packets altogether. For example, some hosts will commonly transmit duplicate packets in order to guard against packet loss.

If your monitoring needs are sensitive enough that any number of duplicate packets is too many, or if you can’t switch away from SPAN ports, then duplicate packets can cause you real problems.

First, data center traffic has already reached enormous throughput levels and is now approaching the 100G range. Monitoring tools are already undersized for the amount of traffic that they need to deal with. With duplicate packets added to the mix, you’ll find your monitoring tools stretched past the breaking point.

Second, your monitoring and analytics depend on having consistent data—and duplicate packets can skew this data in several ways. Let’s say that your network is starting to become unstable, but your monitoring tools are chewing through a queue of “good” packets—that are in fact all duplicates. You won’t catch a hint of anything wrong until there’s a performance issue. Conversely, a stream of duplicated bad packets can display a false positive for a cyberattack or an unexplained slowdown.

Again, there are ways to mitigate this problem, some tools may have deduplication capabilities but if you want to quash it permanently and not transfer the processing burden to the tools, use a network packet broker with deduplication functionality.

>> Download now: EMA Security Best Practices [Whitepaper]

How Does Deduplication Work?

When a packet comes in, the network packet broker uses one-way encryption to turn the packet into a series of letters and numbers called a hash. Its hash is compared to the hash of every packet that came before it—and comparing two hashes is much faster than comparing two packets line by line. Since the same content will always produce the same hash, any two hashes that match are automatically duplicates. The duplicate hash is discarded before it can be sent on to your network monitoring or security tools.

This explanation is simple, but it also obscures a few edge cases. For example, what happens if two packets have the same header and the same content, but different IP type of service (TOS) numbers. Are they different or are they duplicates? What about time series? If you get two virtually identical packets a few minutes apart, is the second packet a duplicate, or is the same user just logging in a second time?

Deduplication2
Generally, these are all variables that must be defined by the user and which depend on the data center context. If two packets are generally the same, but have different IP TOS or TCP sequence numbers, they may or may not be duplicates—it depends on the data center. If you’ve bought the right kind of NPB, you’ll be able to specify whether they are or not. If specified that they’re duplicates, then the NPB will hash every part of your packet except the IP TOS or TCP sequence numbers, which means that those numbers won’t be taken into consideration when the hashes are compared.

Meanwhile, time series information is something that is also up to the discretion of the administrator. Given the right kind of granular control, you’ll be able to specify how long the network packet broker keeps hashes before it discards them. This allows you to say that packets containing the same or similar information which are received in a short amount of time—say a 500ms window—are likely to be duplicates, but that apparent duplicates received over a longer interval may not be duplicates at all.

With granular controls, network administrators can optimize their control over deduplication at a highly individualized level. This means that they’ll be able to achieve the highest possible throughput for their monitoring equipment while retaining the highest level of accuracy for their analytics. This in turn will help them maintain a stable and more secure network without being forced to purchase, install, and configure large amounts of new equipment.

Adding Deduplication To Existing Infrastructure

In today’s environment with significant investment in security and monitoring tools, Garland recognizes the need for a cost effective solution that allows the flexibility and high speeds that are required for the networks of the future. 

TAP-to-Agg-Dedup

We take the approach of providing solutions that are scalable for future on-demand growth and ROI. Garland’s PacketMAXTM line of deconstructed packet brokers includes an Advanced Features Deduplication device purpose-built to extend the feature set of existing infrastructure and reduce the processing load to security or monitoring tools. Allowing you to deploy what you need, when you need it.


Looking to add a deduplication to your deployment, but not sure where to start? Join us for a brief network Design-IT consultation or demo. No obligation - it’s what we love to do.

EMA Security visibility in your network

See Everything. Secure Everything.

Contact us now to secure and optimized your network operations

Heartbeats Packets Inside the Bypass TAP

If the inline security tool goes off-line, the TAP will bypass the tool and automatically keep the link flowing. The Bypass TAP does this by sending heartbeat packets to the inline security tool. As long as the inline security tool is on-line, the heartbeat packets will be returned to the TAP, and the link traffic will continue to flow through the inline security tool.

If the heartbeat packets are not returned to the TAP (indicating that the inline security tool has gone off-line), the TAP will automatically 'bypass' the inline security tool and keep the link traffic flowing. The TAP also removes the heartbeat packets before sending the network traffic back onto the critical link.

While the TAP is in bypass mode, it continues to send heartbeat packets out to the inline security tool so that once the tool is back on-line, it will begin returning the heartbeat packets back to the TAP indicating that the tool is ready to go back to work. The TAP will then direct the network traffic back through the inline security tool along with the heartbeat packets placing the tool back inline.

Some of you may have noticed a flaw in the logic behind this solution!  You say, “What if the TAP should fail because it is also in-line? Then the link will also fail!” The TAP would now be considered a point of failure. That is a good catch – but in our blog on Bypass vs. Failsafe, I explained that if a TAP were to fail or lose power, it must provide failsafe protection to the link it is attached to. So our network TAP will go into Failsafe mode keeping the link flowing.

Glossary

  1. Single point of failure: a risk to an IT network if one part of the system brings down a larger part of the entire system.

  2. Heartbeat packet: a soft detection technology that monitors the health of inline appliances. Read the heartbeat packet blog here.

  3. Critical link: the connection between two or more network devices or appliances that if the connection fails then the network is disrupted.

NETWORK MANAGEMENT | THE 101 SERIES