User:Jnestorius/ELLINKS

From Wikipedia, the free encyclopedia

These edits made by User:BrownHairedGirl are a cleanup exercise after an RFC last year which changed the WP:NC-GAL convention for election names from "Foo election, YYYY" to "YYYY Foo election". Edits replace each wikilink of the form "Foo election, YYYY" with one of the form "YYYY Foo election". Edits are done using Wikipedia:AutoWikiBrowser (AWB) with a customised edit summary. The edit summaries are used to check every edit to spot and fix cases of replacing a bluelink with a redlink (if the list of changes is longer than the edit summary then the edit must be checked manually). A redlink may be replaced with a redlink; that is not a problem.
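
The replacement and the bluelink check described above can be sketched in Python. This is only an illustration, not the AWB find-and-replace rule actually used, which is not shown on this page. The names `rewrite_links`, `blue_to_red` and `existing_titles` are hypothetical, and the sketch assumes every old-style title ends in "election" followed by a comma and a four-digit year.

```python
import re

# Unpiped wikilink in the old format, e.g. [[Foo election, 1927]].
# Assumption: the title ends in "election", then ", ", then a 4-digit year.
OLD_STYLE_LINK = re.compile(r"\[\[([^\[\]|]*?\belection), (\d{4})\]\]")


def rewrite_links(wikitext):
    """Return (new_wikitext, changes); changes is a list of (old_title, new_title)."""
    changes = []

    def swap(match):
        name, year = match.group(1), match.group(2)
        old_title = f"{name}, {year}"
        new_title = f"{year} {name}"
        changes.append((old_title, new_title))
        return f"[[{new_title}]]"

    return OLD_STYLE_LINK.sub(swap, wikitext), changes


def blue_to_red(changes, existing_titles):
    """Flag changes whose old title exists but whose new title does not.

    existing_titles is a hypothetical set of page titles (including
    redirects) supplied by the caller. Red-to-red changes are not flagged.
    """
    return [(old, new) for old, new in changes
            if old in existing_titles and new not in existing_titles]
```

For example, `rewrite_links("[[ThisTown by-election, 1927]]")` yields the new text `[[1927 ThisTown by-election]]` plus one recorded change, and `blue_to_red` reports that change only if the old title is in `existing_titles` and the new one is not.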

This run of edits has three primary purposes:

  1. To fix the use in running text of [[Foo election, YYYY]]. It is much more readable to have [[YYYY Foo election]].
  2. To fix the now-pointless redirects of [[Foo election, YYYY|YYYY Foo election]]. The wikicode is much more readable as [[YYYY Foo election]].
  3. To fix the broken links caused by changes in naming format. This is complex, but surprisingly widespread, so I'll try to explain it without too much verbosity by giving two examples of the permutations I have encountered which raise issues requiring standardisation:
    By-elections
    General elections usually involve many many links to a single title. In the case of Ireland, there have been 32 general elections to Dáil Éireann since 1918, but 131 by-elections to the Dáil. In the UK, there have been 56 general elections since the UK was established in 1801, but 4,167 by-elections.
    It's relatively easy to use redirects to cover most permutations of general election title: a dozen redirects in each case covers over 99%.
    However, doing that with a large set of target articles gets very problematic. For example a biographical article may contain a long-standing link to "ThisTown by-election, 1927" ... but if the by-election article is now created, it should be at "1927 ThisTown by-election", and all the redlinks will remain red. Alternatively, an editor may encounter the redlink in the biog and mistakenly create the page at the old-style "ThisTown by-election, 1927".
    With UK by-elections, there is a further complication in that the place name may have variations: e.g. Midlothian used to be known for some purposes as Edinburghshire, and there are variants such as "Western CountyName"/"West CountyName".
    So canonicalising the year format significantly reduces the chance that a redlink will remain red after article creation, by removing the major variant in naming format.
    Re-named series
    The development of naming conventions has often led to several changes in naming practice for articles. For example:
    • Editors start creating articles on the local elections to FooBar Council, using the format "FooBar Council election, YYYY". Redlinks are created as appropriate, both from lists of elections and from other articles such as biogs, timelines etc.
    • Other editors conclude that greater specificity is needed, so they rename the articles to "FooBar Borough Council election, YYYY". Redirects are of course automatically created from the old titles ... but that leaves redlinks to the articles which did not yet exist still pointing at the old format.
    • Then the WP:NC-GAL renaming happens, and the articles are renamed to "YYYY FooBar Borough Council election". So now we have three naming formats to contend with, giving permutations:
      1. "YYYY FooBar Borough Council election" (the new canonical name)
      2. "FooBar Borough Council election, YYYY"
      3. "FooBar Council election, YYYY"
      4. "YYYY FooBar Council election"
    In some cases, there are even more permutations, e.g. the article currently named 1986 Southwark London Borough Council election could also be titled "1986 Southwark Council election", "1986 Southwark Borough Council election", "1986 London Borough of Southwark Council election", etc. Allowing for the possibility of years at the end instead of the beginning doubles the number of variants, which means more redlinks; and in practice it quadruples the number of variants, because the links may be written with or without a comma, e.g. "Southwark Council election, 1986" or "Southwark Council election 1986". It's a trivial matter for AWB to pick up both variants and standardise them.
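
As a rough illustration of picking up both the comma and no-comma variants, here is a minimal Python sketch. It is not the AWB rule itself; `canonical_title` is a hypothetical helper, and it assumes the title ends in "election" followed by an optional comma and a four-digit year.

```python
import re

# Trailing year, with or without a comma:
# "X election, 1986" or "X election 1986".
TRAILING_YEAR = re.compile(r"^(?P<name>.*\belection),? (?P<year>\d{4})$")


def canonical_title(title):
    """Move a trailing year to the front; leave any other title unchanged."""
    title = title.strip()
    match = TRAILING_YEAR.match(title)
    if match is None:
        return title
    return f"{match.group('year')} {match.group('name')}"


variants = [
    "Southwark Council election, 1986",
    "Southwark Council election 1986",
    "1986 Southwark Council election",
]
# All three year-position variants collapse to a single title.
collapsed = {canonical_title(v) for v in variants}
```

This only standardises the position of the year. Variation in the council name itself ("Southwark Council" versus "Southwark London Borough Council") is a separate problem that the sketch does not attempt to solve.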

When I started on this job a fortnight ago, I was initially doing a very restricted set of use cases. But the more examples I encountered, the more I realised that there was no advantage in doing only a sub-set, when each edit could resolve a much wider set of issues.

So the effect of what I am doing is to fix a set of redirects, some of which may be broken, but where identifying only the broken ones is massively more work than just standardising the lot. AWB just handles text patterns, and can't identify whether a link is red, so unless someone wants to handcode a whole bot which does squillions of system calls to identify only redlinks, this is the neatest way of doing it.

There are some changes (example) of the form [[Foo election, YYYY|alias]] to [[YYYY Foo election|alias]]. This is a mild violation of WP:NOTBROKEN but harmless, and it's quicker to action the change mechanically (albeit unnecessarily) than to spend time calculating whether it would be redundant.
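
The handling of piped links can be sketched the same way. Again this is an assumption-laden Python illustration rather than the actual AWB configuration; `rewrite_piped` is a hypothetical name. A pipe whose label already equals the new title is dropped, and any other alias is kept with the target updated mechanically.

```python
import re

# Piped wikilink with an old-format target, e.g. [[Foo election, 1927|label]].
PIPED_OLD_STYLE = re.compile(
    r"\[\[([^\[\]|]*?\belection), (\d{4})\|([^\[\]|]*)\]\]"
)


def rewrite_piped(wikitext):
    """Update old-format targets in piped links, dropping redundant pipes."""

    def swap(match):
        name, year, label = match.groups()
        target = f"{year} {name}"
        if label.strip() == target:
            # [[Foo election, YYYY|YYYY Foo election]] -> [[YYYY Foo election]]
            return f"[[{target}]]"
        # Any other alias is kept; only the target changes.
        return f"[[{target}|{label}]]"

    return PIPED_OLD_STYLE.sub(swap, wikitext)
```

The second branch changes the target without checking whether the old title was a working redirect, which is the mechanical behaviour described above.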