Talk:Galton–Watson process

This article is within the scope of WikiProject Human Genetic History, a project which is currently considered to be inactive.

Molecular Biology: Genetics

	This article is within the scope of WikiProject Molecular Biology, a collaborative effort to improve the coverage of Molecular Biology on Wikipedia.
???	This article has not yet received a rating on the importance scale.
	This article is supported by the Genetics task force (assessed as Low-importance).

Statistics Low‑importance

	This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia.
Low	This article has been rated as Low-importance on the importance scale.

Mathematics Low‑priority

	This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia.
Low	This article has been rated as Low-priority on the project's priority scale.

Dead external links

The external links are both dead. Need updating. — Preceding unsigned comment added by 137.112.148.179 (talk) 03:00, 16 November 2005‎

No, the second one is OK, but you need Acrobat or something similar to see it. Not very "accessible". 203.218.141.99 07:32, 11 April 2006 (UTC)

Both links work fine for me. One is a pdf file; one reads it with acrobat. The other is a ps file. On the Linux machine I'm using, with my preferences set as they are, I read it with ghostview. On Microsoft systems, I suspect the appropriate software would be called gsview or something like that. Michael Hardy 20:19, 11 April 2006 (UTC)

example: one-child policy

Suggest:

"As a concrete example, suppose in one generation there are 100 persons (50 male and 50 female) with unique surnames, and a requirement that every person can participate in the conception and naming of at most one child. Then the next generation can have at most 50 unique surnames."

http://en.wikipedia.org/wiki/One-child_policy — Preceding unsigned comment added by 66.228.79.25 (talk) 20:57, 11 July 2006‎

Is it worth mentioning the original paper is wrong?

I just went through the Galton-Watson paper mentioned in the history section, which concludes by saying (incorrectly) that "whenever the survival probabilities can be represented by a polynomial, ... all the surnames, therefore, tend to extinction." It comes to this false conclusion because it says,

We get the equation

$_{r}m_{0}={\frac {1}{(a+b)^{q}}}\left\{a+b\cdot _{r-1}m_{0}\right\}^{q}$

[Here $_{r}m_{0}$ denotes the probability of extinction after r generations.]

whence it follows that as r increases indefinitely the value of $_{r}m_{0}$ approaches indefinitely to the value y where

$y={\frac {1}{a+b}}\left\{a+by\right\}$

that is where y=1.

The first formula is correct, but the second is not; it should be

$y={\frac {1}{(a+b)^{q}}}\left\{a+by\right\}^{q}$

This does have y=1 as one solution, but there are other solutions if the expected number of male descendants is greater than 1. Moreover, in that case the sequence ($_{r}m_{0}$) will approach one of the other solutions, so the probability of the surname surviving indefinitely is nonzero. This is in agreement with the Wikipedia article, but not the Galton-Watson paper.

Meanwhile, all of this is original research, so Wikipedia can't include any of it without a citation. skeptical scientist (talk) 19:53, 31 January 2009 (UTC)
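
The fixed-point argument in this thread can be checked numerically. The following is only a minimal sketch, not something taken from the Galton-Watson paper: the function names `pgf` and `extinction_limit` and the sample values of a, b and q are illustrative assumptions. It iterates the first recursion, starting from an extinction probability of 0 at generation 0, and prints the value the sequence settles on.

```python
def pgf(s, a, b, q):
    """Right-hand side of the recursion: ((a + b*s) / (a + b)) ** q."""
    return ((a + b * s) / (a + b)) ** q


def extinction_limit(a, b, q, generations=500):
    """Iterate m_r = pgf(m_{r-1}) starting from m_0 = 0 (hypothetical helper)."""
    m = 0.0
    for _ in range(generations):
        m = pgf(m, a, b, q)
    return m


# Illustrative values: the derivative of pgf at s = 1 is q*b/(a+b) = 1.5 > 1.
supercritical = extinction_limit(a=1, b=1, q=3)
# Illustrative values: the derivative of pgf at s = 1 is q*b/(a+b) = 0.5 < 1.
subcritical = extinction_limit(a=3, b=1, q=2)
print(supercritical, subcritical)
```

For a = b = 1 and q = 3 the corrected equation reduces to (y - 1)(y^2 + 4y - 1) = 0, so the iteration should settle near the root √5 - 2 rather than at y = 1, while the second set of values should settle at 1.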

Definition unclear

The definition uses (n+1) in the superscript, and it is not clear what this means. The paragraph below implies that $\xi_j^{(n)}$ is a sequence. I am not sure what it means to sum whole sequences (as opposed to elements), especially ones with different numbers of elements. I think something else is meant here. If it is not wrong, I think something is missing in the explanation of the notation. --MATThematical (talk) 16:20, 16 June 2010 (UTC)

I added one sentence describing the correspondence between formal definition and analogy with family names; hopefully this makes the definition easier to parse. I'm not exactly sure if it's fair to call

$\{\xi _{j}^{(n)}\}$

a sequence, as it's a function, not from natural numbers to random variables, but from pairs (j,n) of natural numbers with j ≤ X_n to random variables (and so not technically a sequence, or perhaps an infinite sequence of finite sequences). However, each individual term

$\xi _{j}^{(n)}$

is a natural-number-valued random variable, and those can certainly be summed. skeptical scientist (talk) 13:28, 19 January 2012 (UTC)
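
One way to read the notation discussed in this thread is as a simulation. The sketch below is an illustration only: it assumes the usual recursion in which X_{n+1} is the sum of X_n independent offspring counts, and the names `next_generation`, `simulate` and `two_or_none` are hypothetical. Each call to `offspring(rng)` plays the role of one term $\xi _{j}^{(n)}$, so only individual numbers are ever summed, never whole sequences.

```python
import random


def next_generation(x_n, offspring, rng):
    """Return X_{n+1}: the sum of x_n independent offspring counts."""
    return sum(offspring(rng) for _ in range(x_n))


def simulate(offspring, generations, seed=0, x0=1):
    """Return the list [X_0, X_1, ..., X_generations]."""
    rng = random.Random(seed)
    sizes = [x0]
    for _ in range(generations):
        sizes.append(next_generation(sizes[-1], offspring, rng))
    return sizes


def two_or_none(rng):
    """Illustrative offspring law: 0 or 2 children, each with probability 1/2."""
    return rng.choice([0, 2])


print(simulate(two_or_none, generations=10))
```

Note that the number of terms drawn in generation n depends on X_n, which is why the family of variables is indexed by pairs (j, n) rather than by a single natural number.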

Bad examples

Nbarth (talk · contribs) added Vietnamese, Korean and Chinese as examples of surname extinction. Unless a source is provided, I will remove them.

I do not think these Sinitic names underwent surname extinction. A Sinitic surname is a name for a whole lineage and extremely rarely goes extinct (Japanese surnames are totally different, as I already described at Talk:Japanese name#Surname extinction?). In Korea and Vietnam, a limited number of ruling class clans first adopted Sinitic surnames, and as a result, a small number of surnames came into use. Later, commoners followed them, choosing surnames from the small name pool. That's why they have few surnames. --Nanshu (talk) 05:28, 26 November 2011 (UTC)

Good point about difference in number of family names not being primarily due to extinction, but rather to different processes of creation (and to adoption due to other reasons, as in Nguyễn) – thanks!

I’ve re-written the section in this edit to fix this; it’s much more complex than the simplistic “these are old, hence few, these are new, hence many” caricature there was previously.

I did add the Chinese and Korean examples (in this edit), but I didn’t actually add Vietnamese (someone else did). I also added the modern (Dutch, Japanese, Thai) counter-examples.

The example of Chinese names is very well-studied, and definitely has experienced significant surname extinction, from close to 12,000 recorded surnames in the past to about 3,100 now (a factor of about 4:1 or about 75%), as these references state:

Ruofu, Du; Yida, Yuan; Hwang, Juliana; Mountain, Joanna L.; Cavalli-Sforza, L. Luca (1992), Chinese Surnames and the Genetic Differences between North and South China (PDF), Journal of Chinese Linguistics Monograph Series, pp. 18–22 (History of Chinese surnames and sources of data for the present research), also part of Morrison Institute for Population and Resource Studies Working papers
"O rare John Smith", The Economist (US ed.): 32, June 3, 1995, Only 3,100 surnames are now in use in China [...] compared with nearly 12,000 in the past. An 'evolutionary dwindling' of surnames is common to all societies. [...] [B]ut in China, [Du] says, where surnames have been in use far longer than in most other places, the paucity has become acute.
Cook, Steven (March 6, 1997), "China's Identity Crisis: Many People, Few Names", Christian Science Monitor, Why the lack of surnames, then? The reason, according to Du Ruofu of the Chinese Academy of Sciences, is that all societies experience an 'evolutionary dwindling' of family names as less-common ones die out. Because the Chinese have used surnames for thousands of years (compared to just a few centuries in many parts of Europe), this effect has become particularly significant.

The main authority on this (or at least the most quoted in English) seems to be Du Ruofu, apparently a noted researcher at the Chinese Academy of Sciences; I learned about this from the 1995 Economist article.

Chinese is the classic example of this, and it is frequently contrasted with Japanese (as the first paper does), so I think it appropriate to include here, but, as you note at Japanese names, it’s more due to creative Japanese naming, rather than the recent history.

I don’t know if Korean and Vietnamese names have undergone significant name extinction, but clearly the original small number and other effects are more significant factors, as you note.

I’ve also re-written the Japanese and Korean pages to reflect this.

The issue of the huge diversity of number of family names between countries is clearly of interest (100 Vietnamese names vs. 100,000+ Japanese names) and belongs somewhere, but perhaps it’s better placed at Surname or Family name than here, since it’s not primarily due to this process? (This process seems the main mathematical theory of name frequency/distribution, hence of related interest, but the facts of frequency are separate.)

Thanks again – reading up on this and tracking down papers took some time, but it’s very informative and a much more nuanced picture than the simplistic (and incorrect) explanation before.

Please feel free to make further suggestions or changes as you see fit – in particular, perhaps the extreme examples of name frequency (v. few, v. many) would be better placed somewhere else, and perhaps reworded (if not on this page)? (The US is another example, with over 150,000 family names, AFAICT, where this reflects multiethnic origins.)

Nils von Barth (nbarth) (talk) 12:11, 26 November 2011 (UTC)

Thank you for providing references. They are very intriguing. Unfortunately I have no time to take a closer look right now. There is just one point I would like to note. The surname of the first author of the 1992 paper is Du, not Ruofu. This paper must be cited as Du et al., 1992. --Nanshu (talk) 13:39, 28 November 2011 (UTC)

Thanks for the catch – I thought “Ruofu” was a funny Chinese family name (almost always monosyllabic), but the order listed in the reference was inconsistent (Chinese names family first, Western names family last) – fixed!

Nils von Barth (nbarth) (talk) 04:46, 1 December 2011 (UTC)

I have quickly scanned Du et al., 1992. Its main subject is not surname extinction and it makes no mention of the Galton–Watson process. So I am unsure if the following citation at Chinese surname#Surnames at present is valid:

Of the thousands of surnames which have been identified from historical texts prior to the [[Han Dynasty]], most have either been lost (via the [[Galton–Watson process]] of extinction of family names)<ref>{{Harv|Du et al.|1992}}</ref> or simplified. Historically there are close to 12,000 surnames recorded, of which only about 3,100 are in current use,<ref>{{Harv|Economist|1995}}</ref>

The authors give some possible reasons for the dwindling of surnames (pp. 19–22). The no-children assumption is not presented. Personally I doubt that a lineage dies out peacefully. If a lineage died out, it must have involved a catastrophe of the kind that characterized the end of a dynastic cycle, in which the population was reduced to a half, a quarter or even less (the Three Kingdoms period is a well-known example). Unfortunately, I have never read literature that relates such catastrophes to surname extinction. --Nanshu (talk) 13:57, 4 December 2011 (UTC)

Thanks again! You’re right, that was lazy linking on my part. I’ve changed the link from Galton–Watson process to instead point to extinction of family names (in this edit).

While surnames certainly have become extinct, you’re right that this is not primarily (or perhaps even significantly?) due to the Galton–Watson process, and the source doesn’t say this (it just says “extinction”), so I’ve removed references to “family lines dying out” from the Chinese surname page (in this edit); if someone finds a reference actually citing this, it could be added back.

I’ve left a brief note on “family lines dying out possibly playing a part” on this page (since this is the Galton–Watson page, showing connection to content), but noting that this is not the main story.

Hope the current statements in the articles look ok; feel free to correct or comment if not!

You’re right that the Du et al. 1992 paper is not a great reference; it does cover surname extinction briefly, and the content seems to be “conventional wisdom” on Chinese surname extinction (which is what I’m using it for – some brief general remarks), but proper focused references are really necessary. I’m also completely unfamiliar with this literature; hopefully someday an expert will be of help. If you find the time to read up on it, please improve it!

Nils von Barth (nbarth) (talk) 07:40, 1 January 2012 (UTC)

edit on assumption

I modified the sentence "Assume, as was taken for granted in Galton's time, that surnames are passed on to all male children by their father".

Indeed it is quite absurd to suggest that this assumption was ever taken for granted. Children out of wedlock have always been a known reality. Moreover, even within the traditional family paradigm this assumption was not at all the standard in all cultures, so it is a question not only of "time" but also of place. -- Preceding unsigned comment added by 128.178.14.162 (talk) 12:48, 20 June 2014 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Galton–Watson process. Please take a moment to review my edit. I made the following changes:

Added archive http://www.webcitation.org/6bTLDd2kR?url=http://hsblogs.stanford.edu/morrison/files/2011/02/27.pdf to http://hsblogs.stanford.edu/morrison/files/2011/02/27.pdf

Cheers. InternetArchiveBot 13:34, 7 January 2017 (UTC)

Possible over-emphasis on family names

It's true that the Galton-Watson process was originally invented as a model for family names, but the idea itself is now quite a fundamental plank of stochastic process theory, and has applications in evolutionary biology and nuclear physics, among other fields. I wonder if the article could/should be changed to place more emphasis on its fundamental mathematical nature, and less on its original historical purpose? I say this because currently, without careful reading, the article comes across as talking about quite a specific model of genealogies of names, and doesn't really hint at the idea's broad applicability in other fields. (I realise this re-writing would be a substantial amount of work.) Nathaniel Virgo (talk) 01:05, 16 January 2018 (UTC)

Yes, I agree. Surnames are of interest only as an example that people can quickly grasp; the concept is useful for many other things.

I'd like to add that the math contains an unstated assumption that the population is unbounded. If the population is bounded, the extinction criteria don't apply. (Consider a constant population of N, which has F family names. At arbitrarily long times, there will always remain at least one family name, so clearly the probability of extinction cannot be less than F/N.) Another way to look at it is that the number of descendants of a given line is a random walk with an absorbing boundary condition at zero. But if the total population is not also diminishing to zero, then for every line that disappears, a different line must grow. But I don't have time to seek out a citation for this. Geoffrey.landis (talk) 20:58, 30 November 2020 (UTC)
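
The bounded-population point can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model taken from the article or from the comment above: it assumes a constant population in which every child copies the name of a uniformly chosen member of the previous generation (a Wright-Fisher style resampling), and the function names and parameter values are hypothetical.

```python
import random


def resample_names(names, rng):
    """One generation: each of the N children copies the name of a random parent."""
    return [rng.choice(names) for _ in names]


def surviving_names(n_people, n_names, generations, seed=0):
    """Return the number of distinct names after each generation."""
    rng = random.Random(seed)
    # Spread the starting names as evenly as possible over the population.
    names = [i % n_names for i in range(n_people)]
    counts = [len(set(names))]
    for _ in range(generations):
        names = resample_names(names, rng)
        counts.append(len(set(names)))
    return counts


counts = surviving_names(n_people=50, n_names=10, generations=200)
print(counts[0], counts[-1])
```

Under this assumption the number of distinct names can only stay the same or fall, and it can never drop below one, because the population size itself never changes.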