MIT Discovers Why There’s an Uneven Distribution of Bandwidth
Network congestion algorithms are supposed to fairly distribute internet bandwidth. But that's not always the case, MIT researchers say.
In 2020, Britannica estimated that over 4.5 billion people had access to the internet. As that number continues to grow, it is essential that internet bandwidth be distributed fairly across multiple users.
Bandwidth is defined as the maximum amount of data that can be sent over an internet connection in a given amount of time. For any given internet connection, there are likely many users sharing its bandwidth. Consider, for example, the case of a household internet connection with several users with some using streaming services and others using basic services like email. The internet connection must be allocated to each device in a way that is fair while also ensuring sufficient speed. This is achieved through congestion control algorithms.
The internet can be thought of as a water pipe, with congestion control algorithms metering the water connection to balance both speed of delivery (latency) and fairness of quantity (bandwidth). Image used courtesy of CompiraLabs
Recently, however, a team of researchers at MIT found that flaws in network congestion algorithms may be unevenly distributing bandwidth to internet users.
How Congestion Control Algorithms Can Cause Network “Starvation”
Since congestion control algorithms are critical to sharing the modern internet, researchers have invested heavily in optimizing and streamlining them. In 2017, Google engineers developed the BBR (bottleneck bandwidth and round-trip propagation time) congestion control algorithm (CCA), which was later deployed in 2019 by Amazon Cloudfront.
However, MIT researchers have recently published a study that asserts that BBR and similar “delay-bounding” congestion control algorithms cannot always avoid a scenario known as bandwidth starvation. Starvation in this context refers to one connected device hogging available bandwidth and consequently blocking another connected device from accessing minimal or any bandwidth at all.
Delay-bounding congestion control algorithms work by adjusting their congestion window, typically defined as the number of packets a device is allowed to send out into the network at a given time, by measuring the round-trip time (RTT) of the network. RTT is defined as the duration between when a packet is sent out from a device to when an acknowledgment is received by a remote server.
Illustration of round-trip time (RTT). Image used courtesy of StormIT
MIT Discovers Roadblock in Network Distribution
Recently, MIT proved that delay-bounded CCAs can result in starvation. In a published study, the researchers pointed out that delay-bounded congestion control algorithms are also "delay-convergent," meaning they converge to an RTT that oscillates between two values over time.
Delay-convergent property of delay-bounded congestion control algorithm. Image used courtesy of MIT
Using formal mathematics, the MIT researchers showed that if a delay-bounded (convergent) algorithm operates on two different real-world links (for instance, two different devices sharing a connection), they can converge to vastly different sending rates. In other words, one of the links can be starved of bandwidth.
Two different links converge to different sending rates when delay-convergent CCA is used. Image used courtesy of MIT
This occurs because of network “jitter”—delays in the network not caused by congestion but by other random events. When the jitter for two different links deviates, delay-convergent algorithms converge to an RTT, which may not reflect the actual congestion present on the link.
Improving Congestion Control Algorithms
According to Mohammed Alizadeh, associate professor of electrical engineering and computer science at MIT, one way to resolve this issue is to design an algorithm in which the oscillation converges to an RTT range greater than the jitter present on a link. Alizadeh also asserts that current delay-convergent CCAs almost always result in starvation under certain scenarios. Future CCAs will have to take this behavior into account to avoid bandwidth starvation.