Talk:Network congestion


Source for 1986 congestive collapse[edit]

For https://en.wikipedia.org/wiki/Network_congestion#Congestive_collapse, a reference from the horse’s mouth, Van Jacobson: https://ee.lbl.gov/papers/congavoid.pdf

“In October of ’86, the Internet had the first of what became a series of ‘congestion collapses’. During this period, the data throughput from LBL to UC Berkeley (sites separated by 400 yards and two IMP hops) dropped from 32 Kbps to 40 bps. We were fascinated by this sudden factor-of-thousand drop in bandwidth and embarked on an investigation of why things had gotten so bad. In particular, we wondered if the 4.3BSD (Berkeley UNIX) TCP was mis-behaving or if it could be tuned to work better under abysmal network conditions. The answer to both of these questions was “yes”.” — Preceding unsigned comment added by 174.242.79.227 (talk) 01:58, 22 January 2020 (UTC)[reply]

Reference incorporated. Thanks for the suggestion. ~Kvng (talk) 15:38, 22 November 2022 (UTC)[reply]


[Network Congestion] Avoidance[edit]

I'd like to take umbrage at the statement that "The prevention of network congestion and collapse requires... End-to-end flow control mechanisms designed into the end points which respond to congestion and behave appropriately".

The example I offer in support of this taking of umbrage is ATM's Usage Parameter Control (and Network Parameter Control), where the only requirement on the end points relates to the source, which must not exceed bandwidth and jitter (delay-variation/burstiness) limits in transmission, and then only to prevent the UPC/NPC functions from delaying or discarding some of its transmissions to enforce conformance to the traffic contract. To ensure congestion avoidance, it is also necessary to ensure that shared resources, e.g. switch output buffers, are not oversubscribed by the set of connections routed through them and thus, e.g., liable to overflow; but neither the source nor the destination end point responds to congestion; rather, the actions of the source are entirely proactive. Thus congestion avoidance can be done by predicting the loads on these resources from, e.g., the bandwidths and jitter. However, this prediction of the effect of a connection on congestion is either an off-line function, e.g. at system design time in reliable real-time systems, or a function of connection admission control, and thus not "designed into the end points".
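
(An aside, purely for illustration: the source-side obligation described above can be pictured as a leaky-bucket-style policer in the spirit of ATM's Generic Cell Rate Algorithm, in its "virtual scheduling" form. The class and parameter names below are invented for the example and the timings are arbitrary; this is a minimal sketch, not a normative implementation of UPC/NPC.)

    # Minimal sketch of a GCRA-style ("virtual scheduling") policer.
    class GcraPolicer:
        def __init__(self, emission_interval, jitter_tolerance):
            self.T = emission_interval    # seconds per cell at the contracted rate
            self.tau = jitter_tolerance   # allowed delay variation (seconds)
            self.tat = 0.0                # theoretical arrival time of the next cell

        def conforming(self, arrival_time):
            if arrival_time < self.tat - self.tau:
                return False              # arrived too early: delay or discard it
            self.tat = max(arrival_time, self.tat) + self.T
            return True                   # within contract: pass and advance the TAT

    # Invented example: a 1000 cells/s contract with 0.5 ms jitter tolerance.
    policer = GcraPolicer(emission_interval=1e-3, jitter_tolerance=0.5e-3)
    for t in (0.0, 0.0005, 0.0007, 0.003):
        print(f"cell at {t * 1e3:.1f} ms conforming: {policer.conforming(t)}")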

A similar approach is taken in AFDX, where the source limits transmission by a Bandwidth Allocation Gap (BAG), and switches police traffic to the BAG and an allowed jitter tolerance on a per-Vlink basis, to ensure that the bandwidths of the switch outputs are not oversubscribed and the switch buffers should not or cannot (depending on the rigor of the prediction method, which is not given in the ARINC 664P7 standard) overflow.
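
(Again only as an aside: a back-of-the-envelope check that a set of virtual links does not oversubscribe a switch output. The link rate and the VL parameters below are invented, and this is not the ARINC 664P7 method - as noted above, the standard does not give one - it only illustrates the idea that the load implied by the per-VL contracts must stay below the output's rate.)

    # Rough oversubscription check for an AFDX-like switch output (illustrative only).
    LINK_RATE_BPS = 100_000_000                # assumed 100 Mbit/s output link

    # Hypothetical virtual links: (name, BAG in seconds, Lmax in bytes)
    vlinks = [
        ("VL1", 0.002, 1518),
        ("VL2", 0.008, 512),
        ("VL3", 0.016, 256),
    ]

    offered_bps = sum(lmax * 8 / bag for _, bag, lmax in vlinks)
    utilisation = offered_bps / LINK_RATE_BPS
    print(f"worst-case offered load: {offered_bps / 1e6:.2f} Mbit/s "
          f"({utilisation:.1%} of the link)")
    if utilisation >= 1.0:
        print("oversubscribed: the output buffers are liable to overflow")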

In both ATM and AFDX, it can be argued that such methods are inefficient. However, so is the overprovisioning of networks for QoS purposes, as recommended by the Internet2 project (see the QoS page for refs). Indeed, these methods are, in effect, merely a way of quantifying the overprovisioning that is required to ensure or even guarantee (depending again on the rigor of the predictions) that congestion is avoided. They also underscore that overprovisioning the switch buffers is just as important an issue as overprovisioning the bandwidths of the physical links in the network, which may not necessarily be appreciated where overprovisioning is done as an ad hoc process or using a "wet finger" approach.

So, whilst E2E flow control may be one way of solving the problem, it is not the only one, and thus not a necessary prerequisite, as is implied in this article. I suspect that the problem may have to do with a significant bias towards Ethernet networks and, more specifically, avoiding congestive collapse. And, to be fair, by paragraph 5 or 7 (depending on how you count) the section starts to allude to special measures, but it never really gets past the necessity of designed-in reactive functionality. Also, what is meant by "quality-of-service routing[sic]" is not clear – there seems to be no reference to such routeing on the QoS page.

However, even in Ethernet, there are switches available that do per-VLAN traffic shaping/policing, which would allow congestion to be avoided without the necessity of reactive mechanisms designed into the end points; specifically, they would operate on UDP flows, if these are separately identified, e.g., by VLAN Id and priority. Again, to be fair, these methods may only really apply to private networks, such as on-platform avionic networks and those in automation control, etc. But to state, in effect, that "End-to-end flow control mechanisms designed into the end points which respond to congestion and behave appropriately" are the only way to do it is far too narrow.

Graham.Fountain | Talk 12:58, 2 April 2012 (UTC)[reply]

That's all well and good but what we really need to sort this out is some good references. I have tagged the statement in the article. --Kvng (talk) 18:25, 4 April 2012 (UTC)[reply]

Having thought about this a bit, it may be that the problem is one of semantics and the difference between "congestion avoidance", i.e. taking actions in response to incipient congestion, and what's done in some private networks, e.g. avionic and industrial networks, that might be referred to as "congestion prevention". This is taking continuous actions in the end systems that source the data and, generally, in the switches in the network, such that the network cannot become congested. In which case, there is no need to take actions in specific cases, and such networks can be proved to have no emergent properties, rather than relying on what are essentially mathematical arguments based on assumptions about the self-similarity of the flows, which may or may not be reliable.

The two schemes that employ congestion prevention that come first to mind here are Time Triggered Ethernet (TTE) and ATM and, perhaps more relevantly, ATM’s avatar in the Ethernet context, the Avionics Full-Duplex Switched Ethernet (AFDX) protocol (the coming "down to Earth" aspect of an avatar is ironic, though). There are a few other Ethernet-based protocols, like Profinet and maybe Ethernet Powerlink, worth thinking about where more detail is relevant.

Currently, these methods are, as far as I understand the situation, limited to private networks such as avionic/space-borne systems and industrial control. However, there is some work relevant to the Internet itself. I have no idea how the time-domain constraints of TTE might be applied there. But in the frequency-domain control of ATM and AFDX, for example, the paper "Network Border Patrol: Preventing Congestion Collapse and Promoting Fairness in the Internet" by Célio Albuquerque et al. (IEEE/ACM Transactions on Networking, vol. 12, no. 1, February 2004) addresses this very issue, proposing network border patrol and enhanced core-stateless fair queueing for the prevention of Internet congestion collapse. It addresses only parts of the problem for real-time data transport, but it shows interest and possibly much wider notability than for TTE and AFDX themselves (I wouldn’t mention ATM in this context).

So, (all) that being said, the question is, is there any value in a new section on these methods of congestion prevention, and if so, what should its title be? Graham.Fountain | Talk 15:15, 5 March 2013 (UTC)[reply]

I have created a draft for such a section, presently titled Congestion prevention, at User:Graham.Fountain/Congestion prevention; however, as yet, it contains no references or citations. I will get around to adding these in time, but if anyone is interested in commenting, amending, or adding refs, please feel free. Graham.Fountain | Talk 10:44, 7 March 2013 (UTC)[reply]

@Graham.Fountain it seems like this could be merged into Network congestion § Mitigation. Perhaps the reference already present there could be used to help support your new treatment. ~Kvng (talk) 15:43, 22 November 2022 (UTC)[reply]
@Kvng It might. I've been busy working on an actual implementation of the patented Deterministic Ethernet Fault Tolerant Network (DEFTNet) concept for something specific and not kept up with this. But I suppose something on the congestion prevention mechanisms that are explicit in DEFTNet and in Time Triggered Ethernet (TTE) - and maybe in the scheduled transmission aspects of Time Sensitive Networking - could go close to there. But I think it wants a separate section, i.e. not part of mitigation. The point being that mitigation of congestion is no use for hard or firm real-time systems where there's a requirement for reliable delivery, i.e. a specific probability of delivery within deadline, with a guaranteed data rate, not reliable transport, i.e. notification when data is not delivered, and some buggering about with the data rates when the network is a bit busy. Hard/firm real-time systems need something akin to a MIL-STD-1553B databus, but now we (in avionics) want that implemented on a packet-switched network, to get the advantages in data rates and connectivity, and that should be packet-switched Ethernet (at least the network interfaces should be plain old Ethernet) to get the advantages of availability. Graham.Fountain | Talk 18:17, 22 November 2022 (UTC)[reply]

Throughput and goodput[edit]

To whom it may concern,

"Network congestion in data networking and queueing theory is the reduced quality of service that occurs when a network node is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of congestion is that an incremental increase in offered load leads either only to a small increase or even a decrease in network throughput.[1]

Network protocols that use aggressive retransmissions to compensate for packet loss due to congestion can increase congestion, even after the initial load has been reduced to a level that would not normally have induced network congestion. Such networks exhibit two stable states under the same level of load. The stable state with low throughput is known as congestive collapse."

An illustration: http://packetpushers.net/throughput-vs-goodput/

My personal view:

Congestion will not reduce the throughput of a link, but it will reduce the goodput of the application, as the link is still carrying traffic while congested.
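
(A toy calculation of this distinction, with invented numbers: the congested link stays busy, so its throughput barely falls, while retransmissions and protocol overhead pull the application-level goodput well below it.)

    # Illustrative only; the figures are made up.
    link_throughput_bps = 10_000_000   # bits/s actually crossing the congested link
    retransmitted_share = 0.30         # duplicates resent after lost or late ACKs
    header_overhead_share = 0.05       # protocol headers, counted in throughput only

    goodput_bps = link_throughput_bps * (1 - retransmitted_share - header_overhead_share)
    print(f"throughput: {link_throughput_bps / 1e6:.1f} Mbit/s, "
          f"goodput: {goodput_bps / 1e6:.2f} Mbit/s")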

It would be better to help the audience distinguish these.

Best Regards Yuhang — Preceding unsigned comment added by Yuhang Ye (talkcontribs) 15:29, 23 October 2017 (UTC)[reply]

It seems like a lot of instances of throughput in this article could be replaced with goodput. Is that being too pedantic? Is goodput a familiar enough term to bear this weight? ~Kvng (talk) 15:49, 22 November 2022 (UTC)[reply]

External links modified (February 2018)[edit]

Hello fellow Wikipedians,

I have just modified one external link on Network congestion. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 12:10, 16 February 2018 (UTC)[reply]