Downtime is an expense that few businesses can afford in the age of hybrid work and digital transformation. According to Datto, an hour of downtime costs $8,000 for a small company, $74,000 for a medium company, and $700,000 for a large enterprise. For large businesses, this equates to roughly $11,700 per minute.
Downtime has numerous causes - most of which relate to an overburdened or compromised network. However, many companies fail to realize that their network is in poor health - until it’s too late. To combat this issue, they need a way of finding, diagnosing, and remediating those early signs of trouble. This is where evidence of packet loss becomes essential.
In network communications, a packet is a segment of data that is routed between two destinations. Because large amounts of data can't be transmitted as one unit, they are broken into smaller packets that help the transmission move through the network quickly and efficiently. Packets are a foundational part of digital communications today. Every email, chat message, photo, and video we send or receive over the internet is made up of packets.
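As a rough illustration of the idea, the sketch below splits a message into numbered packets and reassembles it even if the packets arrive out of order. The function names `packetize` and `reassemble` and the 8-byte payload size are assumptions made for the example; real protocols such as TCP/IP add headers, checksums, and much more.

```python
def packetize(data: bytes, payload_size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, payload) packets (illustrative only)."""
    if payload_size <= 0:
        raise ValueError("payload_size must be positive")
    return [
        (seq, data[offset:offset + payload_size])
        for seq, offset in enumerate(range(0, len(data), payload_size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the original data by ordering packets on their sequence number."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


message = b"hello, packet world"
packets = packetize(message, payload_size=8)  # hypothetical payload size
# Simulate out-of-order arrival by reversing the packet list.
restored = reassemble(list(reversed(packets)))
print(len(packets), restored == message)
```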
Unfortunately, not every packet that we send is received. Sometimes packets are lost or dropped as they traverse the network. A familiar example is a streamed movie that appears pixelated or glitchy to the viewer. When a packet is lost, the network comes under extra pressure because the packet has to be retransmitted. Latency, bottlenecks, and performance troubles can often be traced back to packet loss. For all of these reasons, packet loss is a costly fault for any network to sustain. To prevent downtime, optimize network performance, and improve security, improving visibility into packet data is essential.
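To get a feel for the retransmission cost, here is a minimal back-of-the-envelope sketch. It assumes each transmission is lost independently with the same probability and that a packet is resent until it gets through, in which case the average number of transmissions per packet is 1 / (1 - p). Real protocols such as TCP also slow down when they detect loss, so treat this as a lower bound on the impact rather than a model of any specific network. The function name is hypothetical.

```python
def expected_transmissions(loss_probability: float) -> float:
    """Average sends per delivered packet, assuming independent losses
    and retransmission until success (a simplifying assumption)."""
    if not 0.0 <= loss_probability < 1.0:
        raise ValueError("loss_probability must be in [0, 1)")
    return 1.0 / (1.0 - loss_probability)


for p in (0.0, 0.01, 0.05, 0.20):
    extra = expected_transmissions(p) - 1.0
    print(f"loss {p:.0%}: {extra:.1%} extra traffic from retransmissions")
```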
To reduce packet loss, we first need to understand what causes it. As with many information technology issues, packet loss has no one cause. Multiple culprits could be at play, which we will explore below.
Network Congestion
Think of data packets as mail and parcels traveling through the mail system. Now, imagine the mail system during a peak time like the holidays. There’s a lot of congestion. Mail vans are overcrowded, and, inevitably, one or two parcels might get lost along the way. It’s precisely the same for data packets. Packet loss is bound to happen - even on the best of networks. However, large amounts of packet loss can be avoided with intelligent network solutions that manage data traffic and capture discarded packets after high traffic peaks and microbursts.
Outdated hardware
Legacy hardware puts extra pressure on networks - especially when it is running near maximum capacity. Outdated firewalls and routers can consume a large amount of processing power, increasing the likelihood of connectivity loss and discarded packets.
Device overload
From your IT infrastructure to your security and monitoring tools, if they are running at max capacity, they’re at risk of over-utilization or oversubscription and dropping packets. While some devices have buffer mechanisms for this risk, the buffers themselves can become overloaded.
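The way a buffer absorbs small bursts but overflows on large ones can be sketched with a simple tail-drop queue. This is a simplified model under stated assumptions: time advances in discrete ticks, the device forwards a fixed number of packets per tick, and any packet arriving at a full buffer is dropped. The function name, the tick model, and the numbers are all hypothetical.

```python
def simulate_tail_drop(arrivals, service_rate, buffer_size):
    """Return (forwarded, dropped, still_queued) for a tail-drop buffer.

    arrivals: packets arriving in each tick.
    service_rate: packets the device can forward per tick.
    buffer_size: maximum packets the buffer can hold.
    """
    queued = forwarded = dropped = 0
    for arriving in arrivals:
        # Accept as many packets as fit; the rest are dropped (tail drop).
        accepted = min(arriving, buffer_size - queued)
        dropped += arriving - accepted
        queued += accepted
        # Forward up to service_rate packets from the buffer this tick.
        sent = min(queued, service_rate)
        queued -= sent
        forwarded += sent
    return forwarded, dropped, queued


# Average load is 10 packets per tick, equal to the service rate,
# yet the burst of 30 in the third tick still overflows the buffer.
print(simulate_tail_drop([5, 5, 30, 5, 5], service_rate=10, buffer_size=20))
```

The point of the example is that average utilization can look healthy while a short burst still causes drops, which is why per-packet visibility matters.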
Security incidents
Cybercriminals can use numerous tactics to cause network disruption. Packet drop attacks, for example, occur when threat actors access a company router and instruct it to drop packets. Another common attack is a denial-of-service (DoS) attack, where cybercriminals overwhelm the network with too much traffic, causing a complete outage.
Inadequate network monitoring
An insufficient network monitoring solution leads to a lack of visibility. Gaps in network performance may not be properly analyzed, leaving root causes like congestion, outdated hardware, and overloaded devices to continue dropping packets.
While some of the causes of packet loss are out of network administrators’ control, many can be solved by adopting the right solutions and policies. Ultimately, the antidote to packet loss is packet visibility. With a deeper, more unified view of the network, administrators can spot the early signs of packet loss - and pinpoint their causes.
To do this, companies should adopt tools that foster comprehensive network visibility and monitoring. Pairing a network performance monitoring (NPM) tool with a voice over internet protocol (VoIP) and network quality manager is typically the benchmark.
NPM tools monitor the network for performance issues. They collect and analyze performance data relating to packet loss, network delays, and performance bottlenecks so that administrators can gain deeper insight into network troubles. Best-of-breed solutions offer strong visualization capabilities within an intuitive interface, making it straightforward for IT teams to identify, understand, and troubleshoot any issues.
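One common way to measure packet loss from captured traffic is to look for gaps in per-flow sequence numbers. The sketch below is a minimal illustration of that idea, not how any particular NPM product works. It assumes sequence numbers increase by one per packet and do not wrap around, and it cannot see packets lost before the first or after the last one observed. The function name is hypothetical.

```python
def loss_percent(received_seqs):
    """Estimate packet loss (in percent) from the sequence numbers observed.

    Assumes consecutive integer sequence numbers with no wraparound.
    Duplicates (for example, retransmissions) are counted once.
    """
    unique = set(received_seqs)
    if not unique:
        return 0.0
    expected = max(unique) - min(unique) + 1
    return 100.0 * (expected - len(unique)) / expected


# Sequence numbers 4 and 7 never arrived at the monitoring point.
observed = [1, 2, 3, 5, 6, 8, 9, 10]
print(f"{loss_percent(observed):.1f}% loss")
```

Note that this kind of measurement is only as good as the capture feeding it: if the monitoring path itself drops packets, the tool will report loss that never happened on the production link.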
Similarly, VoIP & network quality managers give administrators insights into the network's specific quality of service measurements. This helps IT teams to understand network health in real-time while also facilitating pattern analysis over more extended periods to give a deeper understanding of the quality of service on the network over time.
Both NPM tools and VoIP and network quality managers rely on packet data to give administrators an accurate picture of network performance and to monitor network conditions for VoIP delivery. Incomplete or delayed data degrades the output of these tools, turning them into a costly investment that doesn’t deliver the expected return and ultimately falls short of diagnosing packet loss within the network.
Engineers have two options for sending packet data to their performance monitoring tools: SPAN (port mirroring) or network TAPs. Ironically, using SPAN for monitoring can itself create packet loss in the monitoring network because of the volume of duplicate packets - especially where VLANs are used. Congestion, bottlenecks, and overloaded devices are common sources of network trouble, and all of them can be caused by an over-reliance on SPAN ports. This is because SPAN ports, which were not designed for continuous monitoring, tend to drop packets when oversubscribed.
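The oversubscription problem comes down to simple arithmetic: a full-duplex link carries traffic in both directions at once, but a SPAN port has to squeeze both directions into a single egress stream. The sketch below is a simplified steady-state model under that assumption. It ignores switch buffering, burstiness, and any vendor-specific behavior, and the function name and traffic figures are hypothetical.

```python
def span_drop_fraction(tx_mbps: float, rx_mbps: float, span_port_mbps: float) -> float:
    """Fraction of mirrored traffic that cannot fit on the SPAN port.

    Simplified model: both directions of the monitored link are mirrored
    to one port, and anything above that port's capacity is dropped.
    """
    offered = tx_mbps + rx_mbps
    if offered <= span_port_mbps:
        return 0.0
    return (offered - span_port_mbps) / offered


# A 1 Gbps link carrying 700 Mbps one way and 600 Mbps the other,
# mirrored to a 1 Gbps SPAN port.
fraction = span_drop_fraction(700, 600, 1000)
print(f"{fraction:.1%} of mirrored packets would not fit")
```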
By contrast, network TAPs are designed to prevent packet loss. They provide full-duplex copies of packet data, pass runt frames and physical layer errors through to the monitoring tool, and are purpose-built for 24/7/365 continuous monitoring. This gives NPM tools a complete, unified, and accurate picture of network traffic without sacrificing speed or reliability.
Ultimately, visibility is critical to managing network performance securely and preventing packet loss. To enable network monitoring tools to perform at their best, organizations should integrate network TAP technology to reduce the risks of downtime and packet loss.
Looking to add network TAP visibility to your deployment, but not sure where to start? Join us for a brief network Design-IT consultation or demo. No obligation - it’s what we love to do.
If the inline security tool goes off-line, the TAP will bypass the tool and automatically keep the link flowing. The Bypass TAP does this by sending heartbeat packets to the inline security tool. As long as the inline security tool is on-line, the heartbeat packets will be returned to the TAP, and the link traffic will continue to flow through the inline security tool.
If the heartbeat packets are not returned to the TAP (indicating that the inline security tool has gone off-line), the TAP will automatically 'bypass' the inline security tool and keep the link traffic flowing. The TAP also removes the heartbeat packets before sending the network traffic back onto the critical link.
While the TAP is in bypass mode, it continues to send heartbeat packets out to the inline security tool so that once the tool is back on-line, it will begin returning the heartbeat packets back to the TAP indicating that the tool is ready to go back to work. The TAP will then direct the network traffic back through the inline security tool along with the heartbeat packets placing the tool back inline.
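The heartbeat behavior described above can be sketched as a small state machine. This is an illustration of the logic only, not the firmware of any actual Bypass TAP. In particular, the `miss_threshold` parameter (how many consecutive missed heartbeats trigger bypass) and the choice to go back inline after a single returned heartbeat are assumptions made for the example; the class and method names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    INLINE = "inline"   # link traffic flows through the inline security tool
    BYPASS = "bypass"   # link traffic skips the tool and stays on the link


class BypassTap:
    """Toy model of heartbeat-driven bypass switching."""

    def __init__(self, miss_threshold: int = 3):
        self.miss_threshold = miss_threshold  # assumed, not from the source
        self.mode = Mode.INLINE
        self._missed = 0

    def on_heartbeat_result(self, returned: bool) -> Mode:
        """Record whether the latest heartbeat came back from the tool."""
        if returned:
            # Tool is responding: reset the counter and place it inline.
            self._missed = 0
            self.mode = Mode.INLINE
        else:
            self._missed += 1
            if self._missed >= self.miss_threshold:
                # Tool looks offline: keep the link up by bypassing it.
                self.mode = Mode.BYPASS
        return self.mode


tap = BypassTap(miss_threshold=2)
for returned in (True, False, False, False, True):
    print(returned, tap.on_heartbeat_result(returned).value)
```

Heartbeats keep being evaluated while in bypass mode, which is what lets the model return to inline operation as soon as the tool responds again.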
Some of you may have noticed a flaw in the logic behind this solution! You say, “What if the TAP should fail because it is also in-line? Then the link will also fail!” The TAP would now be considered a point of failure. That is a good catch – but in our blog on Bypass vs. Failsafe, I explained that if a TAP were to fail or lose power, it must provide failsafe protection to the link it is attached to. So our network TAP will go into Failsafe mode keeping the link flowing.
Single point of failure: a part of an IT network whose failure brings down a larger part of the system, or the entire system.
Heartbeat packet: a soft detection technology that monitors the health of inline appliances.
Critical link: a connection between two or more network devices or appliances whose failure disrupts the network.