Lossless Networking is a category of networking technologies that attempt to greatly reduce (and in some cases completely eliminate) packet loss in IT communications infrastructure. A Lossless Network is achieved by using network devices that support lossless operation and by carefully configuring every device that processes data on the network. This includes any storage devices, Network Interface Cards (NICs), and switches that are connected to each other in a datacenter fabric. A Lossless Network is required to leverage the full performance potential of the OpenFlex platform.
Strictly speaking, a lossless Ethernet network can still drop packets in certain cases. However, the number of dropped packets is significantly lower than in a best-effort network, so a network built from carefully selected and configured components provides consistent, high performance.
Why is Lossless Networking required?
Traditional Ethernet is a “Best-Effort” networking protocol. This means that under load packets are dropped, and it is up to the transport protocol to cope with that loss (usually via retransmission). This makes Ethernet natively lossy. With packet drops come retransmissions, which reduce the apparent performance of the network (Bandwidth and Packets Per Second) and increase latency.
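The cost of retransmission can be illustrated with a rough calculation. The sketch below is a deliberately simplified model, not a description of any particular transport protocol: it assumes each packet is lost independently with a fixed probability and is resent until it is delivered. The function names and the sample loss rates are illustrative assumptions. Real transports such as TCP also slow their sending rate and wait on timers after a loss, so the actual impact is typically larger than this model suggests.

```python
def expected_transmissions(loss_probability):
    """Average number of sends per delivered packet.

    Assumes independent loss and retransmission until success
    (a simplifying assumption, not a model of a real transport).
    """
    if not 0.0 <= loss_probability < 1.0:
        raise ValueError("loss_probability must be in [0, 1)")
    return 1.0 / (1.0 - loss_probability)


def goodput_gbps(link_rate_gbps, loss_probability):
    """Useful throughput left once retransmissions consume link capacity."""
    return link_rate_gbps / expected_transmissions(loss_probability)


if __name__ == "__main__":
    # Hypothetical loss rates on a 10G link, chosen only for illustration.
    for loss in (0.0, 0.001, 0.01, 0.05):
        print(f"loss={loss:.3f}  goodput={goodput_gbps(10, loss):.3f} Gb/s")
```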
Why are packets dropped under load?
To understand this, let’s use an analogy. Consider a garden hose. If you connect this hose to your house and the water pressure is consistent, you would expect a consistent flow rate from the end of the hose. Now consider a firehose. If you connect this firehose to a fire hydrant that has consistent pressure, you would have a far higher flow rate than that of the garden hose. So, what would happen if you connected the firehose to the garden hose? If the water flows from the firehose into the garden hose, the much higher flow previously experienced by the firehose would be severely limited by the garden hose. This is called oversubscription.
With water and other fluids, oversubscription presents itself as increased back pressure, as well as a higher sustained flow rate due to the increased pressure (as in a water jet). In networking, however, pressure does not increase the flow rate. Instead, back pressure shows up as memory queues overflowing and packets being dropped. This is called congestion. Oversubscription causes congestion.
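The way a faster sender overflows a queue can be sketched with a small simulation. This is a minimal illustration, assuming a single first-in-first-out buffer that drops any packet arriving while it is full (tail drop); the function name, the per-tick rates, and the buffer size are hypothetical and do not describe any specific switch.

```python
def simulate_queue(arrivals_per_tick, departures_per_tick, queue_capacity, ticks):
    """Simulate a tail-drop buffer between a fast ingress and a slow egress.

    Returns (forwarded, dropped, still_queued) packet counts.
    """
    queued = 0
    dropped = 0
    forwarded = 0
    for _ in range(ticks):
        # Arrivals: accept what fits in the buffer, drop the remainder.
        free_slots = queue_capacity - queued
        accepted = min(arrivals_per_tick, free_slots)
        dropped += arrivals_per_tick - accepted
        queued += accepted
        # Departures: the egress link drains at its own fixed rate.
        sent = min(departures_per_tick, queued)
        queued -= sent
        forwarded += sent
    return forwarded, dropped, queued


if __name__ == "__main__":
    # "Firehose into garden hose": 5 packets in, 1 packet out per tick.
    print(simulate_queue(5, 1, queue_capacity=10, ticks=10))
    # Matched rates: the buffer never fills, so nothing is dropped.
    print(simulate_queue(1, 1, queue_capacity=10, ticks=10))
```

In the oversubscribed case the buffer fills within a few ticks, and from then on every tick drops the difference between the arrival and departure rates. A larger buffer only delays the point at which drops begin.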
Network oversubscription is not always easy to identify.
In a perfect world, network devices would have a full 1:1 relationship with each other. If a computer with a 10G connection is connected directly to another computer with a 10G connection, there is no oversubscription. This 1:1 relationship is unrealistic in practice. First, not every device will have a matching network speed. Some devices may be connected at 1G while others are connected at 50G. When the 50G device (Firehose) sends to the 1G device (Garden Hose), there is oversubscription. Second, consider a network where all devices are connected at 10G. If every device were communicating with every other device simultaneously, the traffic might average out so that there is no oversubscription, but again this is not how traffic behaves in the real world. In the real world you may have five devices all talking to the same single device. In this case five 10G devices are sending to a single 10G device, which is effectively a 5:1 oversubscription. Lastly, not all networks are simple. With a single network switch, barring switch inefficiencies, the switch itself introduces no oversubscription. In a complex network architecture where multiple switches are involved, however, the switch-to-switch links will almost certainly introduce oversubscription.
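The ratios in these scenarios can be computed directly by dividing the total bandwidth that can be offered by the bandwidth available to carry it. The helper below is a small sketch; the function name is hypothetical, speeds are assumed to be whole numbers of Gb/s, and the 48-port switch with four 40G uplinks is an invented example of a switch-to-switch bottleneck rather than a configuration taken from this document.

```python
from fractions import Fraction


def oversubscription_ratio(ingress_gbps, egress_gbps):
    """Return offered-to-available bandwidth as a reduced 'N:M' string.

    Both arguments are lists of integer link speeds in Gb/s.
    """
    ratio = Fraction(sum(ingress_gbps), sum(egress_gbps))
    return f"{ratio.numerator}:{ratio.denominator}"


if __name__ == "__main__":
    # A 10G host connected directly to another 10G host.
    print(oversubscription_ratio([10], [10]))
    # A 50G device sending to a 1G device.
    print(oversubscription_ratio([50], [1]))
    # Five 10G devices all sending to one 10G device.
    print(oversubscription_ratio([10] * 5, [10]))
    # Hypothetical switch: 48 x 10G host ports sharing 4 x 40G uplinks.
    print(oversubscription_ratio([10] * 48, [40] * 4))
```

These figures are worst-case ratios that assume every sender transmits at full line rate at the same time. Real traffic is usually burstier, which is part of why oversubscription is not always easy to identify.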