Talk:Component (graph theory)

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

Merger proposal[edit]

User:DPoon has suggested that strongly connected component be merged here. I strongly disagree. We could perhaps use an article that would survey various notions of components in graph theory (connected components and biconnected components of undirected graphs, strongly connected components of directed graphs, etc) but both connected components and strongly connected components deserve their own articles: they are fundamental, important, have plenty of algorithmic depth, etc. And while connected components are reasonably intuitive and natural, strongly connected components require more care to define and use; including that material here would necessarily make it shorter and harder to understand while also confusing readers who only care about the undirected case. —David Eppstein (talk) 20:01, 1 October 2008 (UTC)[reply]

Start Class[edit]

This article is definitely not a stub so I have upgraded it to start. I am not sure if it meets the criteria for C-class so left it at that. Acb314 (talk) 11:38, 14 August 2009 (UTC)[reply]

connected components for directed graphs[edit]

According to this articles definition:

[A] connected component of an undirected graph is a subgraph in which any two vertices are connected to each other by paths, and to which no more vertices or edges (from the larger graph) can be added while preserving its connectivity

I'm looking for the name you can give to a "connected component" of a directed graph. That is, I'm looking for XYZ, for which the following definition holds:

A XYZ of an directed graph is a subgraph in which for any two vertices v1, v2 in this subgraph there is a path from v1 to v2 or there is a path from v1 to v2, and to which (subgraph) no more vertices or edges (from the larger graph) can be added while preserving its connectivity.

It should be called something like "maximally connected subgraph of a directed graph", though I guess there is already a special name for it. Thanks, --Abdull (talk) 21:50, 2 August 2010 (UTC)[reply]

I'm not sure that's the definition you really want. For instance, consider a graph with k layers, and n/k vertices per layer, with an edge between every two vertices on consecutive layers (directed from the higher layer to the lower one) — then there are (n/k)k different subgraphs formed by a path from the top layer to the bottom one, each of these subgraphs fits your description, and they don't form any kind of nice partition of the graph. But instead, if you require that there is both a path from v1 to v2 and a path from v2 to v1, then you do get a nice partition of the vertices into strongly connected components. —David Eppstein (talk) 21:59, 2 August 2010 (UTC)[reply]
Thank you for your fast reply (9 minutes! :). Though I didn't understand it yet, as I don't know the definition of layer. There is neither a Wikipedia article on it, nor is there an entry about it in the glossary of graph theory. Can you give a definition or point to one? Cheers, --Abdull (talk) 18:12, 3 August 2010 (UTC)[reply]
I didn't mean it in any standard formal sense, just as a way of identifying subsets of vertices. Define a collection of disjoint subsets of vertices, call them "layers", order them into a sequence, and put a directed edge between two vertices if they are in two layers that are consecutive in the ordering. —David Eppstein (talk) 18:17, 3 August 2010 (UTC)[reply]

Union-find-delete[edit]

The Algorithms section talks about amortized O(|V|) deletion of edges in disjoint-set data structure. Amortized constant time per edge delete can be achieved in disjoint-set while still maintaning the same time complexity of the other operations. Construction of such a data structure is described in several papers, that can be easily found using "union find delete" query in any search engine (Google being probably the best one). I am not familiar with Wikipedia's policies and I don't feel confident to change this article as I am just an undergrad student of computer science. 78.128.196.3 (talk) 09:05, 27 April 2013 (UTC)[reply]

Rename article to Component (graph theory), and refer to all "Connected components" and just components throughout the article.[edit]

A component, by definition is a connected subgraph of a graph, the term connected component is redundant, and does not appear in most texts.

The term "connected component" does however appear a lot in computer science literature, from what I've seen online. However, this is more of a math article than it is a compsci one, so I would argue the definition and use of the term seen most in Graph Theory texts would be more suitable. — Preceding unsigned comment added by Zer0dept (talkcontribs) 22:16, 17 March 2019 (UTC)[reply]

GA Review[edit]

This review is transcluded from Talk:Component (graph theory)/GA1. The edit link for this section can be used to add comments to the review.

Reviewer: Ovinus (talk · contribs) 23:19, 28 February 2022 (UTC)[reply]

GA review (see here for what the criteria are, and here for what they are not)
  1. It is reasonably well written.
    a (prose, spelling, and grammar): b (MoS for lead, layout, word choice, fiction, and lists):
  2. It is factually accurate and verifiable.
    a (reference section): b (citations to reliable sources): c (OR): d (copyvio and plagiarism):
  3. It is broad in its coverage.
    a (major aspects): b (focused):
  4. It follows the neutral point of view policy.
    Fair representation without bias:
  5. It is stable.
    No edit wars, etc.:
  6. It is illustrated by images and other media, where possible and appropriate.
    a (images are tagged and non-free content have non-free use rationales): b (appropriate use with suitable captions):
  7. Overall:
    Pass/Fail:

Hi, I'll take this one up. A cursory look through the article reveals it's in good shape. For reference, I know graph theory concepts that would be found in a typical undergrad computer science sequence. Any edits I make to the article will be minor (and not something I'll be a stickler on). Ovinus (talk) 23:19, 28 February 2022 (UTC)[reply]

Lead[edit]

  • Would it be fair to include something like "the analogous concept for directed graphs is called a strongly connected component" somewhere, perhaps at the end of the first paragraph?
    • Is it really "the" analogous concept? Turn it around: if you're seeking the analogous concept to strongly connected components in undirected graphs, would it be connected components, biconnected components, or something else? Algorithmically, biconnected and strongly connected are a lot closer to each other than they are to the components of this article. Knuth would argue (and has, recently) that the correct analogy from connected components of undirected graphs is to weak components of directed graphs. Another concept frequently used for directed graphs and also analogous in some ways are the components of a directed graph obtained by forgetting the directions and using undirected components. —David Eppstein (talk) 23:47, 28 February 2022 (UTC)[reply]
      • Fair point; maybe "Similar concepts" or "Similar concepts accounting for additional structure". My immediate reaction to the lead sentence was generalizing the idea to other types of graphs. Ovinus (talk) 00:03, 1 March 2022 (UTC)[reply]
        • So this is intended as a summary (as everything in the lead should be a summary) of the last two lines of the "Definitions and examples" section? It would have to be very brief to summarize those lines rather than repeating them. —David Eppstein (talk) 00:49, 1 March 2022 (UTC)[reply]
        • True; I'm convinced. Ovinus (talk) 01:26, 1 March 2022 (UTC)[reply]
  • Set off the definition of "giant component" with dashes or parens
    • It is correct grammar to set off parenthetical clauses such as this one with commas. They could alternatively be replaced by dashes, but according to [1], "dashes have a slightly more emphatic feel, making the reader focus on this information". I'm not convinced that adding additional emphasis to these glosses is desirable here. Also, whatever is done to the definition of giant component should certainly be done to the definition of percolation threshold (because those two parts of the sentence are parallel), and I'm not convinced that it's a good idea to have sentences with more than one parenthetical clause (it makes the structure look confused and cumbersome). —David Eppstein (talk) 00:55, 1 March 2022 (UTC)[reply]
      • This is true in isolation and is fine throughout the article, but in this instance it is part of a list. (Indeed the "incidence of" groups implies the commas are parenthetical and not in a list, but this is only confirmed in the second "of" (for a percolation threshold), which means that a reader might have to backtrack.) If dashes or parentheses are objectionable, a semicolon after "others" works too. Ovinus (talk) 01:10, 1 March 2022 (UTC)[reply]
  • constructed in linear time I'd specify "linear time in the number of vertices" to ensure it's not confused as linear in component count. ("sublinear time algorithms ... " later is clear)
    • It's not actually linear in the number of vertices. Linear time always means, linear in the total size of the input, and here that means vertices+edges. —David Eppstein (talk) 00:55, 1 March 2022 (UTC)[reply]
  • low time per change Would "constant time", "roughly constant time" or "amortized constant time" be too technical here? Not actually true, my bad

Definitions and examples[edit]

  • For instance, the graph shown in the first illustration has three components. I'd be a bit more explicit that the components are disjoint, or we're just stating something obvious; also, "three components" is stated in the caption. Maybe something like "For instance, the three components of the graph in the first illustration are disjoint, and together constitute the whole graph."
    • Writing it that way would suggest, incorrectly, that it's somehow possible for other examples to have components that are not disjoint, or do not cover the graph. —David Eppstein (talk) 23:51, 28 February 2022 (UTC)[reply]
      • How does it suggest that? We're explicitly considering examples of this rule ("For instance"). In any case, the disjoint part should be mentioned, or else the statement feels vacuous. Ovinus (talk) 00:05, 1 March 2022 (UTC)[reply]
        • The graph has exactly three components, in exactly the sense of the word "component" used in this article. There is no other number of components that it could have. Qualifiers do not help make the statement more true. Adding qualifiers creates the sense that the qualifiers are added for a reason, rather than being completely redundant, and therefore suggest that without the qualifier there might be some other number of components, which is false. —David Eppstein (talk) 00:47, 1 March 2022 (UTC)[reply]
          • I'm confused. Is the "For instance" sentence an example of the definition of component instead of the immediately preceding sentence? If so, I'd remove "For instance" entirely and maybe connect it to the next sentence with a semicolon, to show the natural connection between the ideas. Ovinus (talk) 00:53, 1 March 2022 (UTC)[reply]
            • I moved the "for instance" sentence earlier in the paragraph, to make clearer that what it is an example of is the definition in the first sentence, and not so much the additional properties in the later sentences of the paragraph. —David Eppstein (talk) 07:44, 1 March 2022 (UTC)[reply]
  • as sets of vertices, the equivalence classes of vertices, The second part can be removed as the meaning is clear
    • Ok, but I changed "as sets" to "as the sets" to make it less ambiguous that it is the same sets already being discussed.
  • for more advanced forms of graph connectivity I'd just say "for other forms"; advancedness is subjective

Number of components[edit]

  • topological connected components I'd say "topologically", but this is minor and perhaps slightly imprecise
    • I think it is a little imprecise. I meant "connected components as defined for topological spaces", with "topological" modifying the noun phrase "connected components", rather than "components that are topologically connected", with "topologically" modifying the adjective "connected". —David Eppstein (talk) 07:47, 1 March 2022 (UTC)[reply]
  • and representing its edges as line segments between those points I wouldn't call this a "representation" per se, since information about vertices of degree two are lost in the topological setting. Maybe "considering"? nvm
  • and in topological graph theory it can be interpreted as the zeroth Betti number of the graph This seems a bit redundant, since a construction was just essentially given (so it's just terminology)
  • Besides percolation theory–type questions which you discuss, are there any important applications to infinite graphs? It seems infinite graphs aren’t as frequently studied as I thought
    • Not anywhere near as much as finite graphs, definitely, to the point where a lot of references are sloppy and make certain claims as being true for all graphs when really they're true only for finite graphs and require more care for infinite graphs. That's true in many of our Wikipedia articles, too. I sprinkled a few more "finite"s through the article to be safe in cases where this was a little unclear. —David Eppstein (talk) 07:56, 1 March 2022 (UTC)[reply]
      • Makes sense. I have a conceptual question if you don't mind: is the notion of an infinite graph, treated as an object in its own right, ever useful in your experience/research? Most of the graph theory concepts I know of can be sensibly applied to countably infinite graphs, although connectedness would probably require some adjustment (maybe requiring finite paths or indexing the path with ?). Or is it usually easier to deal with the finite case and let the number of vertices tend to infinity? Ovinus (talk) 17:33, 1 March 2022 (UTC)[reply]
        • Do you mean, have I ever used the concept in my own research? Or do you mean, does my experience lead me to believe that it's a worthwhile concept to explore? It would be difficult to do without the infinite grid graphs and their relatives; they've come up repeatedly as an organizing tool in my papers despite those papers mostly being about finite things. So yes, infinite graphs are useful. I've only rarely studied infinite graphs directly in my own research; an exception is the paper "Combinatorics and geometry of finite and infinite squaregraphs". —David Eppstein (talk) 06:01, 2 March 2022 (UTC)[reply]
  • "The same number arises in other ways in graph theory as well." Specify number of components so it's not confused with the rank of the graph; can replace "Numbers of components plays" later with "This number plays" if wanted
  • Perhaps in the sentence on topology, a brief mention of topological separability as the analog of disjointness in the graph case?
    • Added something about this as a gloss for the meaning of topological connected components. —David Eppstein (talk) 06:06, 2 March 2022 (UTC)[reply]

Algorithms[edit]

Mostly prose nitpicks here:

  • To find all the components of a graph, loop through its vertices The imperative mood feels a bit off here; maybe just "Finding all the components of a graph is/can be done by looping ... "
  • breadth first or depth first hyphenate
    • Done. Yes, it should be consistent with the earlier use in the same paragraph. —David Eppstein (talk) 07:46, 2 March 2022 (UTC)[reply]
  • Hopcroft & Tarjan (1973) describe ... "well known" What is the utility of this vs. a plain citation? If the main thrust is "well known" then I think "essentially this algorithm" can be removed
    • It was intended primarily as a way of giving some history: that the algorithm was already well known in 1973, and therefore presumably invented significantly earlier. It was "essentially this algorithm" because the text of the reference is a little vague about how to find the next vertex that is not yet included in a component, and the more precise part of the reference is formatted as a flow chart at a much lower level of detail than would typically be used now, so I wanted to phrase it in a safe way that would allow for the possibility of some of those low-level details being different in an inessential way. —David Eppstein (talk) 07:52, 2 March 2022 (UTC)[reply]
  • or as likely to be part of objects depicted in the image rm "depicted in the image"
    • Well, I can see that this is somewhat redundant, but we somehow need to disambiguate "objects" meaning things that are visible in the image from "objects" meaning data in the computer code for processing the image. —David Eppstein (talk) 07:54, 2 March 2022 (UTC)[reply]
  • Are there any appealing images of CCL available?
  • Researchers in this area rm "in this area"
  • and labeling them as objects This seems quite clear; removing it would make the sentence more concise
  • allowing it to be processed in pixel order Provide why this is important (I assume it's to do with cache locality?)
    • The source had a line or two about this, so I added a line here based on that. That's one reason but not the only reason. Certain kinds of hierarchical image representations, for instance, might allow fast sequential access but not fast random access to the pixels. —David Eppstein (talk) 17:45, 2 March 2022 (UTC)[reply]
  • The second should be or just

In random graphs[edit]

  • "but should not depend on the choice of {\displaystyle n}n" What exactly do you mean here?
    • There's a limiting argument going on in the earlier text about "high probability": we're taking a limit as . As we change in this way, should remain unchanged. It doesn't work, for instance, to set and then try to take the limit. You can define a limit that way, but the thing that you want to be true with high probability will not actually be true in that limit. —David Eppstein (talk) 07:59, 1 March 2022 (UTC)[reply]
      • I understand that part, but this feels like a convoluted way of saying "fix first, then let ". Is "The analysis depends on a fixed which can be chosen arbitrary close to 0" too imprecise? Also, is the asymptotic width of the transition period in terms of known? Ovinus (talk) 17:27, 2 March 2022 (UTC)[reply]
        • @David Eppstein: Just this one point here. Ovinus (talk) 18:35, 2 March 2022 (UTC)[reply]
          • It's really a question of quantifier order. The correct order is something like "for all epsilon: for all delta: exists N: for all n: (n > N and p < (1-epsilon)/n) => Pr[all components are small] > 1-delta". The wrong order is something like "forall delta: exists N: for all epsilon: forall n: ..." Delta and N aren't even explicit in the text but are implied by the wording "arbitrarily close to one for sufficiently large values". Anyway, I'm not sure "a fixed epsilon>0" really conveys the intended meaning, because maybe that would allow us to set epsilon=1/n? That's fixing it, isn't it? In fact this sort of choice of epsilon to be a function of n rather than a value independent of n is commonly done (with other functions than 1/n), in studying the transition area in finer detail. A lot more is known about what happens there, but I didn't think this article was the place to get into details. —David Eppstein (talk) 22:01, 2 March 2022 (UTC)[reply]
            • Indeed it's a question of order of two limit processes, hence why I thought it'd be easiest to just say "choose epsilon first." But I agree it's distracting to mention at the beginning, "choose epsilon > 0," before introducing n. How about "a positive constant which can be chosen to be arbitrarily close to zero but independently of n"? Ovinus (talk) 22:11, 2 March 2022 (UTC)[reply]
              • Ok, but I reversed it to put the independence first before the close to zero part; I think it's a little tighter that way. —David Eppstein (talk) 06:51, 5 March 2022 (UTC)[reply]
  • "y=y(np)" I'm guessing you mean that y is only dependent on the value np, but the notation is confusing. Maybe just "where y is the solution to ..., only dependent on the value np"
    • I think someone else must have added that phrasing, so I can only speculate on the intended meaning. I agree that it is confusing, and unnecessary. I just removed the "=y(np)" part. I don't think we need to be explicitly told that it only depends on and not separately on both variables; what use are we making of this information? —David Eppstein (talk) 17:50, 2 March 2022 (UTC)[reply]

Note[edit]

I totally overestimated my knowledge in this area; if you felt my review wasn't thorough let me know. Ovinus (talk) 00:32, 1 March 2022 (UTC)[reply]

I don't have any complaints so far. But if you see something that you think could be understandable for someone with your level of technical background, but isn't, please let me know; some of this material is unavoidably technical but I don't want it to be technical when it doesn't have to be (and neither does GA criterion 1a). An undergraduate CS student level is a good target for much of this material; at least, for the material in the definition and algorithms sections. The number of components and random graphs sections may need to be a little more advanced, maybe an undergraduate math student level. —David Eppstein (talk) 08:02, 1 March 2022 (UTC)[reply]
Great, thanks. Overall it was quite understandable, despite being a formal mathematical approach and including the random graphs part. Ovinus (talk) 17:27, 2 March 2022 (UTC)[reply]

@Ovinus: I think I've responded to everything above, so please take another look. —David Eppstein (talk) 17:51, 2 March 2022 (UTC)[reply]

One point above and one last nitpick, allows them to be subjected to additional processing -> "allows additional processing", since you immediately describe the processing afterward and it's clear that it's on the identified components. I'll promote after these are addressed. Ovinus (talk) 18:35, 2 March 2022 (UTC)[reply]
Awesome; passing. Thank you for the excellent work! Ovinus (talk) 16:00, 5 March 2022 (UTC)[reply]