User:Graham87/Page history observations

Here is a collection of observations I've made about page history oddities. I find page histories to merge by checking the deleted contributions of early editors and by checking articles on place name lists like List of urban areas by population or national place name lists. I also use the lists at WikiProject History Merge to find pages to history merge. Sometimes, I can use old copies of Wikipedia to restore the missing history or find pages with lost edits (see my further notes on this and the "Resolved" section). If you find any other page history oddities, feel free to let me know.

Pages whose history has been lost

Page history started to be reliably kept and dated after the conversion to Phase II software in January 2002 (see Wikipedia:Usemod article histories for caveats). Therefore, all history from that time onwards should theoretically be accessible. However, some page history has disappeared entirely due to moves and deletions. The following situation is typical:

  • Page A is moved to Page B by cut and paste, either before the page move function became available to non-sysops (after which it was used more often and was more reliable) or by a user who was not aware of, or could not use, the page move function.
  • Page B is moved back to page A with the move function, thus deleting page A's old history.
  • The deleted revisions are cleared from the database, meaning that the deleted history of page A is gone permanently. The deleted revisions were last cleared on 8 June 2004 in a database crash, and were previously cleared on 3 December 2003, when the Wikipedia database was transferred to a new server. (The mechanism for storing deleted revisions was established on 10 August 2002.)
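
The three steps above can be sketched with a toy model. This is only an illustration of the sequence of events, not how MediaWiki is implemented: the `ToyWiki` class, its method names, and the sample page text are all hypothetical.

```python
class ToyWiki:
    """Toy model of page histories; not MediaWiki's actual implementation."""

    def __init__(self):
        self.pages = {}    # title -> list of revision texts, oldest first
        self.archive = []  # (title, text) pairs for deleted revisions

    def edit(self, title, text):
        self.pages.setdefault(title, []).append(text)

    def cut_and_paste_move(self, source, target):
        # The latest text is copied by hand; the history stays at the source.
        self.edit(target, self.pages[source][-1])
        self.edit(source, f"#REDIRECT [[{target}]]")

    def move(self, source, target):
        # Any history already at the target is deleted to make way.
        for text in self.pages.pop(target, []):
            self.archive.append((target, text))
        self.pages[target] = self.pages.pop(source)
        self.edit(source, f"#REDIRECT [[{target}]]")

    def clear_archive(self):
        # Models the deleted revisions being cleared from the database.
        self.archive.clear()


wiki = ToyWiki()
wiki.edit("Page A", "first draft")
wiki.edit("Page A", "second draft")
wiki.cut_and_paste_move("Page A", "Page B")  # step 1
wiki.move("Page B", "Page A")                # step 2
archived_before = [text for title, text in wiki.archive if title == "Page A"]
wiki.clear_archive()                         # step 3
print(wiki.pages["Page A"])
print(archived_before)
```

After step 2, the early revisions of Page A exist only in the archive, so they are still recoverable at that point. After step 3, nothing in the model holds them any more.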

Any page history deleted before 8 June 2004 is no longer in the current Wikipedia database. Some very early revisions appear in Nostalgia Wikipedia, a copy of the Wikipedia database from 20 December 2001, and some old database dumps are available, which can be used to find and restore missing edits; for more information about how I copy those edits to the current Wikipedia database, see User:Graham87/Import. The following articles have missing history which seemingly cannot be restored by sysops:

Unresolved

These cases cannot be sufficiently resolved by any of the publicly available database dumps:

Resolved

These cases have been either mostly or entirely fixed using old copies of the Wikipedia database. Unless otherwise specified, they have been completely resolved.

Nostalgia Wikipedia

March 2002 database dump

January 2003 database dump

May 2003 database dump

Some close calls

I have restored some page history that was deleted due to page moves. Most of these operations were trivial, like this one at Accra. However, the following are interesting cases and show the problems with cut and paste moves. See my logs dealing with:

Also see the contributions of Ipo3, a page move vandal, and the logs of Angela Merkel (disambiguation) and Small Business Administration (disambiguation), to see what can go wrong with page move vandalism. Ipo3's edits particularly affected the Klingon article.

In another science fiction franchise, the Palpatine article also had missing edits. Its history was moved to "Palpatine, Dantius", a name for the character coined by SuperShadow, who ran a Star Wars website, and then the page was moved by cut-and-paste back to Palpatine. The "Palpatine, Dantius" page was redirected to the SuperShadow article, then moved to Dantius Palpatine; the redirect was deleted after a deletion discussion for the SuperShadow article, taking all the early history of the Palpatine page with it.

If the deleted revisions had been cleared out or there had been a bad database crash, the page histories mentioned above might have become permanently inaccessible.

Revision ID numbers

When a revision is added to the database, it is assigned an ID number, which is one more than that of the previous revision. Thus, in general, a low revision ID number will indicate an early edit while a higher revision number will indicate a more recent edit. Revision ID numbers can be a reasonable way to estimate the date of a revision, with some caveats.

Edits made when Wikipedia used UseModWiki were imported to the current Wikipedia database on 20 September 2002; therefore they have revision IDs over 200,000. The edit with a revision ID of 1 is not Wikipedia's earliest edit, but it is the first edit to be added using the Phase II software.

Before Wikipedia was upgraded to MediaWiki 1.5 in late June 2005, if a revision was deleted and then undeleted, it would get a new ID number as if it were a brand new revision. For example, this edit to the article "Wikipedia" has an ID number of 13,435,822, even though the edit was made in December 2001, because the article was deleted and restored before June 2005. For comparison, the revision with the previous ID number of 13,435,821 was made in May 2005.[note 1]

Page transfers via Special:Import or manually on the server also result in "out-of-order" revision IDs. Revision IDs do not increase monotonically with time, and you should never rely on this. --brion (talk) 16:42, 20 October 2008 (UTC) [1]
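
As a rough illustration of these caveats, here is a minimal sketch that flags revisions whose timestamps are earlier than that of some lower-numbered revision. The function name `find_out_of_order` is my own invention. The two ID numbers come from the example above, but the exact days are placeholders (only the months are known), and the third revision is entirely hypothetical.

```python
from datetime import datetime


def find_out_of_order(revisions):
    """Return IDs of revisions dated earlier than a lower-numbered revision.

    `revisions` is an iterable of (revision_id, timestamp) pairs.
    """
    flagged = []
    latest_seen = None
    # Walk the revisions in ID order and track the latest timestamp so far.
    for rev_id, timestamp in sorted(revisions):
        if latest_seen is not None and timestamp < latest_seen:
            flagged.append(rev_id)
        else:
            latest_seen = timestamp
    return flagged


sample = [
    (13435821, datetime(2005, 5, 1)),   # May 2005; the day is a placeholder
    (13435822, datetime(2001, 12, 1)),  # December 2001; the day is a placeholder
    (13435823, datetime(2005, 5, 2)),   # hypothetical revision
]
print(find_out_of_order(sample))
```

A check like this only shows that a revision's ID is out of step with its timestamp. It cannot tell whether the cause was an undeletion, an import, or something else.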

Strange times reported in diffs

Here are some revision links that show the consequences of the server clocks being reset. See T4219 for other examples.

The unusual times in this diff at "User:CryptoDerk/CDVF" probably occurred because one of the servers was set to the wrong time zone. The out-of-order edit makes the relevant page history seem rather confusing. Another example of out-of-order edits is this edit to Template:Edit.

Incorrect timestamps can also result from the behaviour of early versions of MediaWiki. When pages were moved over redirects, the edit history of the newly created redirect would show the date and time when the overwritten redirect was created. There are more details in the section of the page move guidance about moving over a redirect, along with an example at Talk:PETA.

Many edits, especially those by Conversion script, are incorrectly reported as having occurred on 25 February 2002 (UTC) due to an early database glitch that was once called the "great oops".

Incorrect timestamps can also affect the reported creation times of accounts at places like the list of all users, because this information wasn't officially recorded in the database until the introduction of the user creation log in September 2005.

Talk pages created before articles

Sometimes, the first visible edit to a talk page can have an earlier timestamp than that of the corresponding article. This can happen for several reasons, including copyright violations (where early article history is deleted) and unusual page moves. With my encouragement, my friend Codeofdusk wrote an extended essay about this topic, which can be found at User:Codeofdusk/ee.

Fun page history facts

Notes

  1. ^ Out-of-order revision IDs are associated with some historical bugs. Until May 2017, the inconsistent ID numbers could cause problems when checking diffs, as diff navigation was based on the revision ID of edits rather than their timestamps (see T4930). Before the MediaWiki 1.18 update, the software also calculated the number of intermediate revisions between diffs by using revision IDs rather than timestamps; that result was incorrect when the order of the revision IDs did not correspond to the dates of the edits. An example of this phenomenon was at this edit to Talk:Netherlands, which has displayed correctly since the MediaWiki 1.18 update.