User:Andrewa/Let us abolish the whole concept of primary topic


In that all pages belong to the whole project, any user may edit this one. But it's generally more helpful (and polite) to discuss the proposed change on its talk page first.

The concept of primary topic (henceforth often P T, with a blank to ease linking) goes back to the early days of Wikipedia, and has never been seriously challenged. However:

  • Readers receive little (and often negative) benefit:
    • Searches are less effective, and wrong articles (which may be large) are loaded as a result.
    • Many readers are astonished, as perceived P T is highly dependent on the background of the reader. Wikipedia unintentionally expresses a POV on controversial names.
    • Bad wikilinks are more frequent.
    • It saves at best the loading of a small page followed by one mouse click.
  • Editors are inconvenienced (and more than you might think):
    • It is responsible for an enormous amount of time spent in often heated discussion, for little benefit (often none at all).
    • It greatly complicates new article creation.
    • Editors whose background gives them a different sense of primary topic get no warning that they've linked to the wrong article.
  • It's a Wikipedia neologism. Nobody used either the term or the concept until early Wikipedians invented it.

Obviously it seemed like a good idea at the time! But consensus can change. Whether it will on this topic is another question! The arguments to date are strongly in favour of change. It's logical. But is it politically possible?

Advantages of the proposed change

Easier searching

Articles would no longer be at ambiguous titles, so all readers would be able to identify from a search results list the article they want.

Having an article at the ambiguous title disadvantages both those readers who want that article, and those who want something else. Obviously, those who want something else are at risk of loading the wrong page. But less obviously, those who do want the article which we currently regard as primary topic also have extra and unnecessary difficulty in finding it. They would also be better served by an unambiguous title.

In many cases there is already a redirect from an unambiguous title, and this extra difficulty is greatly reduced. The reader sees the unambiguous redirect, clicks on it, and gets to the right article. (Perhaps we should have a policy that such redirects are always created, but we don't, and this would make new article creation even more cumbersome.) But if such a redirect does not exist, then they must guess that the article they want is at the ambiguous name.

For some, who may not even know there are other meanings, this guess is easy, but for others who do know of other meanings (and may even consider one of them primary) it is a difficult or even impossible guess. It would be best for them to choose the disambiguation page from the results list, but they may not know what (disambiguation) means.

And even if the redirect does exist, this doesn't help people who are looking at search lists such as Google results that do not include the redirect. So the unnecessary difficulty in finding the article at the ambiguous name is reduced but not eliminated.

The bottom line

In summary, all readers would be better off if all articles had unambiguous names. That's the conclusion of the argument above. What is wrong with it?

The main purpose of an article name is to identify the article topic to the reader. And an ambiguous name quite simply doesn't do that, whether or not the reader's background leads them to the same conclusion as Wikipedia's decision process in deciding the primary topic.

There are other advantages too, both for readers and editors, but that's the bottom line, surely.

Fewer wrong wikilinks

As has been often stated and demonstrated, both readers and editors often disagree on what topic is primary. At present, an editor who links in good faith to an article at an ambiguous base name title gets no warning whatsoever. Nobody does.

The problem is that wrongly wikilinking to an article at an ambiguous base name is not unlikely. The editor may unknowingly disagree with the consensus that chose the primary topic, particularly if the term has an important use in an area of interest to them (and if they're editing an article in that area and linking to the term, that's quite likely). They may not even know that any other use of the term exists. In any case they often just link to the base name, and from then on everyone who follows that link loads the wrong page until someone fixes it.

(And again, the NYRM discussions, which were associated with fixing thousands of wrong base name wikilinks, demonstrated that this happens a lot. It could be argued that every editor should check every wikilink they create when they preview their changes, and many do, but many don't, and if they think they know exactly what the term means anyway they're not likely to.)

And many of these are particularly bad wikilinks, and may take a while to fix. Readers arriving at the wrong page may just assume that there's no article on the topic they want, unless there's a hatnote... and there's no policy or guideline guaranteeing that there will be a hatnote, so many articles are without them.

National bias is reduced

Ethnic perspectives

Many geographical terms, such as Macedonia, are disputed between national or ethnic groups. Wikipedia currently acts as an umpire in such disputes, however reluctantly and unintentionally, which violates WP:NPOV and has nothing to recommend it.

And we're probably quite an influential umpire too, which leads understandably to a great deal of heat in some RM discussions. The stakes are seen as high by those involved. This proposal eliminates this problem completely.

US bias

A large proportion of Wikipedians are from the USA, leading to possible bias in the case of articles such as Second Amendment and State University, which have different primary meanings outside of the USA. This was also discussed as an issue in the NYRM discussions.

Wikipedia's global credibility would be enhanced by eliminating this type of bias or apparent bias.

Reduction of editor effort

RMs

Many RMs, and in particular many difficult ones, are purely about deciding Primary Topic. These would be eliminated.

Many of these RMs have borderline results, and so are likely to be overturned in the future, creating still more unproductive editor effort. (And a small number are just plain wrong, such as NYRM2016 and its many equally wrong predecessors, but hey we are all human. But reducing the number of controversial decisions will at least reduce the number of such fiascos.)

Part of the reason for this is that #The P T guideline is itself controversial.

Creating new articles

Creating new articles is greatly simplified by the proposal.

At present, when an editor wishes to create a new article but finds the obvious name is already taken, they must follow a lengthy procedure.

Under this proposal, the entire procedure is less work than the first step of the existing process. See #Creating new articles in detail.

The P T guideline is itself controversial

There is no consensus that the P T policy/guideline is correct, nor on how it should be interpreted.

IAR is regularly invoked (sometimes explicitly, often not) in RM discussions because voters and even closers simply don't agree with the policy/guideline as it stands.

There are strong opinions in many directions!

Three vague and Wikipedia-only terms would be eliminated

"Primary Topic" is ill-defined. The primary topic can differ between geographical areas and between groups with particular interests, and can easily change with time. Both readers and editors can be assumed to have different ideas as to what the primary topic of a term is, depending on their backgrounds, and this is regularly demonstrated.

"Disambiguation", as used in parenthetical page titles, is a cumbersome word only rarely encountered in day-to-day life. It's a useful technical term for editors to use, but readers seeing it in a search results list cannot be assumed to know what it means. This proposal would move all DABs to their base names, where they would be easily and naturally found by any reader.

"Primary redirect" is particularly problematic, with many seasoned editors not grasping the (perhaps subtle) principles involved. It is often argued that whenever there's a better name for the article on the primary topic for a term, the term is then available for another article. Yes, it's a subtle point isn't it! Why not? But no, even if we (painful example) agree that the New York City article should be called New York City not New York, that doesn't automatically mean that the name New York is available for the article on New York State. It would only be available if the state were the primary topic of New York, and there was never any credible suggestion that this was the case. But many experienced editors argued that the name was available, simply because it wasn't the name of any other article, and regardless of whether or not the state was the primary topic.

The primary redirect guideline makes perfect sense given the primary topic policy, but it is a difficult concept for many to grasp, is often misquoted or totally ignored, and is one that this proposal would make unnecessary and also abolish. See #RMs that show various views on Primary Redirect.

Disadvantages of the proposed change

Raised by opponents

Objection 1

To keep exactly that from happening: to add qualifiers to a bajillion titles that don't need them, hindering both readers (predominantly looking for the, hm, "primary" topic for a title) and editors (who now have to figure out which qualifier to add to nearly every wikilink to a "primary" topic). [1]

I replied in situ, but to summarise:

  • to add qualifiers to a bajillion titles that don't need them... See #Implementation. Yes, a big job, but manageable,
  • hindering both readers (predominantly looking for the, hm, "primary" topic for a title)... This hindrance is restricted to loading a short DAB, and then making one mouse click, and even for these readers there are offsetting benefits. And it doesn't even affect every reader looking for the "primary topic". Most links will point to the correct article... and more so than now. Many, perhaps most, readers will choose the more specific name from the search results list. Popular articles will be ranked far more highly by Google etc than DAB pages. In summary, this is a small inconvenience for a small number of readers.
  • and editors (who now have to figure out which qualifier to add to nearly every wikilink to a "primary" topic)... Not nearly, it's every such link. Again, there are offsetting benefits which dwarf this hindrance. In particular, those editors whose background leads them to choose a different topic as primary currently just link to the wrong page and get no warning of this. Nobody gets any warning of it. Everybody just gets sent to the wrong page until somebody fixes it.

Objection 2

They are serious impacts with negative return on investment. You say "some readers" and "short page" and "single mouse click" to minimize it, but that is exactly the negative return on this investment you're seeking: we do a bunch of editor work (and increase the ongoing maintenance editor work) in order to worsen the overall reader encyclopedia experience. [2]

There's a section on this on the talk page, but in summary:

  • that is exactly the negative return on this investment you're seeking... Yes, some see these as serious impacts, and this is important.
  • (and increase the ongoing maintenance editor work)... No, the net effect is to decrease the work.
  • in order to worsen the overall reader encyclopedia experience... That's a big claim and would be a show-stopper. But it seems to depend on the assumption that having some readers load a short page and then need one extra mouse click overrides the benefits to all readers (including those who do want the primary topic, see #Easier searching above, and note that this is not the only reader benefit). And that's a very shaky assumption at best! But agree that this argument would be a show-stopper if it can be supported.

Objection 3

Primary topic is a solution, not a problem. User essays are fine, but I'm not going to go read and comment on them when they have no chance of becoming consensus. [3]

Or perhaps primary topic is a problem, not a solution.

The problems are described above. The question is simply whether they are big enough to justify change.

But to exactly what problem is primary topic a solution? And the question then would be whether it is the best solution. But even the problem is not obvious.

The problem that several topics often share one common name is solved by disambiguation pages, not by the concept of primary topic.

Primary topic (in glorious hindsight and in the light of many years experience) solves nothing, and causes significant problems. It seemed like a good idea once. It's now just pointless enforcing of a pointless rule. On the other hand, disambiguation pages are a very good solution to this problem, and this proposal would make even better use of them.

Raised by supporters

These may be preemptive or based on actual objections above or elsewhere, and are likely to be a bit of both. It is important not to raise straw men, and to provide diffs etc. when citing others.

No chance of becoming consensus

See #Objection 3. But consensus can change.

How it would work in detail

Creating new articles in detail

At present, when an editor wishes to create a new article but finds the obvious name is already taken, they must follow a lengthy procedure as follows:

  • Assess whether either is primary topic
    • If neither is PT:
      • Choose a disambiguated name for the existing article and move it
      • Create a DAB at the undisambiguated name
      • Create the new article at another disambiguated name
    • If the existing article is PT:
      • Add a hatnote to the existing article
      • Create the new article at a disambiguated name
    • If the new article is PT:
      • Choose a disambiguated name for the existing article and move it
      • Create the new article at the undisambiguated name (with a hatnote)

and that's just the simplest case. If a DAB or even a hatnote to a third article already exists, it's even more complicated.

The corresponding procedure under this proposal is:

  • Choose a disambiguated name for the existing article and move it
  • Create a DAB at the undisambiguated name
  • Create the new article at a disambiguated name

and that's all. This is the most complex case under the proposed system, because if a DAB and/or a third article already exists, then there will already be a DAB at the base name, and the new article just needs to be added to it.

The entire procedure under this proposal is less work than the first step of the existing process.
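
As a rough illustration only, the two procedures above can be written out as a small Python sketch. The function names and the `primary` and `dab_exists` parameters are hypothetical, invented here for comparison; the step wording follows the lists above, and the two steps returned when a DAB already exists are one reading of "the new article just needs to be added to it".

```python
from typing import List, Optional


def current_steps(primary: Optional[str]) -> List[str]:
    """Steps under the existing guideline (simplest case only).

    `primary` is "existing", "new", or None when neither topic is primary.
    The assessment itself is always the first step.
    """
    steps = ["Assess whether either is primary topic"]
    if primary is None:
        steps += [
            "Choose a disambiguated name for the existing article and move it",
            "Create a DAB at the undisambiguated name",
            "Create the new article at another disambiguated name",
        ]
    elif primary == "existing":
        steps += [
            "Add a hatnote to the existing article",
            "Create the new article at a disambiguated name",
        ]
    elif primary == "new":
        steps += [
            "Choose a disambiguated name for the existing article and move it",
            "Create the new article at the undisambiguated name (with a hatnote)",
        ]
    else:
        raise ValueError(f"unknown primary value: {primary!r}")
    return steps


def proposed_steps(dab_exists: bool) -> List[str]:
    """Steps under the proposal; no primary-topic assessment is needed."""
    if dab_exists:
        # A DAB already sits at the base name, so only the new entry is added.
        return [
            "Create the new article at a disambiguated name",
            "Add the new article to the existing DAB",
        ]
    return [
        "Choose a disambiguated name for the existing article and move it",
        "Create a DAB at the undisambiguated name",
        "Create the new article at a disambiguated name",
    ]
```

The point of the comparison is that `proposed_steps` never branches on a judgement about which topic is primary; the only question is whether a DAB already exists, which is a matter of fact.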

Examples

These may duplicate the ones given in the above sections. There's no problem if they do, in fact details of a specific example are probably better here rather than cluttering the discussion above. Nor is there any reason an example can't be in several sections below.

RM discussions concerning change of PT

This is not a comprehensive list, just some examples.

General P T changes

RMs that show various views on Primary Redirect

Different perspectives

  • Macedonia is currently a DAB, which is the best solution under current guidelines as it supports neither the claims of the country nor of the Greek region to the title. But that's not as NPOV as it might seem... we have by this taken a stand that both claims are equally valid. Better still to take no stand at all. And it was an arduous journey, see Wikipedia:Naming conventions (Macedonia) and Wikipedia:Centralized discussion/Macedonia/main articles. Under this proposal, such names would always have a DAB at the base name, but this would carry no implication as to whether one or the other claim was being preferred, or whether both were equally valid. And this is as it should be.
  • Wave is currently about the concept in physics. But this is astonishing to all surfers, and probably to anyone with any experience of the ocean but no training in physics. This was extensively discussed at Talk:Wind wave#Requested move 6 March 2018, with no consensus, and there's not even a link there from Talk:Wave to prevent the same discussion happening again, in the fullness of time.
  • FEMA (disambiguation) links to several possible primary topics depending on the background of the reader.
  • Talk:Triplet#Requested move 19 March 2018 Several experienced hands on each side.

Wrong wikilinks

  • The NYRM saga found thousands of wrong wikilinks. And that's just one article, and a particularly high traffic article, so you'd have hoped that these links would be fixed. But they hadn't been, obviously. New York is currently a DAB, so new wrong wikilinks are being discouraged, found, and fixed. But the issue of primary topic was explicitly unresolved by NYRM2017, so there's a foreshadowed NYRM2018 to resolve it, as there should be under current policy. But under this proposal the DAB would simply stay where it is, as it should.

Food for thought

Examples that might make many think twice about the proposal!

US Presidents

Most if not all Presidents of the United States would be disambiguated, including:

Countries

Note however that the article at Micronesia is about the region, with the country also known simply as Micronesia [5] at Federated States of Micronesia. There being no third topic at present, there is no Micronesia (disambiguation).

Implementation

This section is very much a draft, particularly where the design of the bots and gadget are concerned. It demonstrates one way, but it's possibly not the best way.

This is version 2 of the draft, a complete rewrite, and may not be the last such.

This impacts an enormous number of articles and could not happen overnight. It should be conducted in three phases.

Phase one

Phase one would be to implement the new disambiguation guideline for new articles, and for article page moves that were justified for other reasons, only. This should be an extended period of at least one year and possibly two.

It would be initiated and authorised by consensus to change the disambiguation guideline to eliminate the concept of primary topic. But at this stage, a grandfather clause would explicitly but temporarily exclude existing articles from the new guideline. RMs based only on a change of primary topic, including ones based on there not being one either because of the new guideline or the old one, would be explicitly but temporarily prohibited.

The gadget would be written in preparation for this. It would suggest common disambiguators both for new articles and for those proposed to be moved for some other reason. The categories to which the article belonged, and the existing disambiguators used within those categories, would be used to create and prioritise this list and to keep it reasonably short. Its use would be entirely optional. Its own reports would also be used to tune it.

Phase two

Phase two would only occur if phase one did in fact produce the benefits predicted. But if consensus was achieved that significant benefits had resulted, then the grandfather clause of the guideline would be modified to allow admins, page movers and bots to boldly move existing articles to conform to the new guideline. Consensus to do this would authorise phases two and three.

The first bot would be written in preparation for this phase, to implement the change on older pages. In this phase it would be tested, using at first small subject areas, then moving to a sample of all subject areas.

Input to the first bot to select the pages to be moved, and the new disambiguators to be used, would at this stage be essentially manual, but using the gadget to choose the disambiguator, and to further tune its choices.

Phase two should be quite comprehensive. All WikiProjects would be encouraged to schedule some of the articles under their scope. Using the gadget, this would be a very quick process. Some smaller Wikiprojects might well end up moving all of the articles in their scopes, rather than trust the second bot to choose the new disambiguators. Others would provide valuable input to the tuning (both manual and automatic) of the gadget every time they chose a disambiguator other than the gadget-selected preferred one.

Phase two needs to continue until we are confident that the gadget needs no further tuning, however long that takes, so as to minimise the risk of the bots going astray in phase 3 (but there will of course be stop buttons if they do). But towards the end of phase two, this tuning would be increasingly automatic, see below.

Phase three

Phase three is the big bang. The rest of the older pages would be converted to the new guideline.

The second bot would now be used to provide input to the first. The speed of both bots would be conservatively throttled so as to never overload the servers.

Once this phase was complete, the grandfather clause would have no relevance and should be removed from the guideline.

The gadget

The gadget would suggest preferred parenthetical disambiguators based on the category trees in which the article appeared. For example, if article XYZ was in category:oogleboz players and category:Brutopian musicians and XYZ (musician) was available, that would be suggested, and if not, XYZ (oogleboz player) and XYZ (Brutopian musician) would be suggested. If several suggestions were made, the list would be prioritised, with the preferred disambiguator first. Priority would be given to disambiguators already in use in the categories in which the article appeared, and to those most often in use. Ones that included dates and similar details would be excluded or shortened.
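
One possible shape for this ranking, as a minimal Python sketch and not a design. Everything named here (`suggest`, `disambiguators_in_use`, the `category_members` mapping, the `limit` default, the sample titles) is hypothetical. The sketch looks only at disambiguators already in use in a flat list of categories; it does not walk category trees, does not derive new disambiguators from category names, and simply excludes (rather than shortens) any disambiguator containing a digit as a crude stand-in for "dates and similar details".

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Set

# Matches titles of the form "Base name (disambiguator)".
DISAMBIGUATED = re.compile(r"^(?P<base>.+) \((?P<tag>[^()]+)\)$")


def disambiguators_in_use(titles: Iterable[str]) -> Counter:
    """Count the parenthetical disambiguators used by the given titles."""
    counts: Counter = Counter()
    for title in titles:
        match = DISAMBIGUATED.match(title)
        if match:
            counts[match.group("tag")] += 1
    return counts


def suggest(base: str,
            category_members: Dict[str, List[str]],
            existing_titles: Set[str],
            limit: int = 5) -> List[str]:
    """Return candidate titles for `base`, preferred suggestion first."""
    counts: Counter = Counter()
    for members in category_members.values():
        counts.update(disambiguators_in_use(members))

    ranked: List[str] = []
    for tag, _ in counts.most_common():  # most often used first
        if any(ch.isdigit() for ch in tag):
            continue  # assumption: digits signal dates or similar detail
        candidate = f"{base} ({tag})"
        if candidate not in existing_titles:  # only suggest available names
            ranked.append(candidate)
    return ranked[:limit]


# Invented sample data: other articles already in XYZ's two categories.
sample = {
    "Oogleboz players": ["Ann Example (oogleboz player)"],
    "Brutopian musicians": [
        "Bob Example (musician)",
        "Cy Example (musician)",
        "Di Example (Brutopian musician)",
        "Ed Example (musician, born 1950)",
    ],
}
print(suggest("XYZ", sample, existing_titles=set()))
```

With this sample data "musician" is the most used disambiguator, so XYZ (musician) comes first; if that title is passed in `existing_titles`, the remaining two suggestions are returned instead, matching the example in the paragraph above.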

A single mouse click selecting the disambiguator, or just Enter to select the preferred (or only) suggestion, would then perform the move (or there would of course be a cancel option), either within the article namespace adding the disambiguator in the case of existing articles, or from draft to article namespace adding the disambiguator in the case of new articles. Or, there would be an option to manually override by typing in a disambiguator that was not on the list, or a modification to one that was (to save keystrokes).

Natural disambiguation would not be recommended... that's just too complex to automate. It would be available but as a manual option only, by typing in the entire new article name. All manual overrides would require a verification, possibly by retyping, to minimise errors.

A report would be prepared listing every time a parenthetical disambiguator other than the preferred disambiguator was used, and used to tune the selection criteria. This might take the form of a new category Pages with new manually selected disambiguators. A new section added to the article talk page would state both the preferred option given and the one taken, and whether it was from the list, manually entered, or modified, and there would be an opportunity to (optionally) give reasons, which would also be added to this talk page section. The cancel option would not be logged.

However the gadget is to some extent self-tuning. Every time an article is moved to a disambiguated name, a name with that disambiguator is ipso facto added to all categories in which the article appears, which will then be considered by the gadget in making and prioritising future suggestions. As phase two proceeds, this automatic tuning would become more and more effective, and less and less manual tuning required.

In addition, there would be the option to recommend that suggestions might be added to a blacklist or greylist, and excluded from future recommendation lists (or always shortened in the case of the greylist). The ability to do this might need to be restricted to avoid vandalism... say to admins and page movers, with others invited to recommend additions but these to require admin or page-mover approval before taking effect.

This gadget would remain available and useful (but entirely optional) after the process was complete. In particular, its use in moves from the draft namespace is not required for the bots, which move only within the main namespace, but will remain extremely useful for manual use.

The first bot

The first bot is written and used for phase 2. One important result of phase 2 is to test the first bot to allow its fully automated use in phase 3.

It would act on articles that occupy a base name (say XYZ) when there is also a DAB at XYZ (disambiguation) and a redirect from an article name XYZ (ABC) with no significant history or talk page history to the article at the base name, and where this redirect is in a special maintenance category Primary topic elimination redirects. The article would be moved to overwrite this redirect, and the DAB moved to the base name. A line would be added to the top of the DAB linking to the renamed article.

Pages in the category Primary topic elimination redirects but where the bot was unable to perform the moves would be moved to the category Primary topic elimination exceptions for manual intervention, and an explanation added to their talk pages, and no other action taken by the bot.
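
The eligibility test described above might look something like the following Python sketch. It is an in-memory model only: the `Page` class, its fields, and `plan_moves` are hypothetical, no real wiki API is called, and the talk-page explanation and the line added to the top of the DAB are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

QUEUE = "Primary topic elimination redirects"
EXCEPTIONS = "Primary topic elimination exceptions"


@dataclass
class Page:
    """Hypothetical stand-in for a redirect page and its metadata."""
    title: str
    redirect_target: Optional[str] = None  # None means not a redirect
    has_significant_history: bool = False
    has_talk_history: bool = False
    categories: Set[str] = field(default_factory=set)


def base_name(title: str) -> str:
    """Strip a trailing parenthetical: 'XYZ (ABC)' becomes 'XYZ'."""
    return title.rsplit(" (", 1)[0] if title.endswith(")") else title


def plan_moves(redirect: Page,
               existing_titles: Set[str]) -> Optional[List[Tuple[str, str]]]:
    """Return the (source, destination) moves for one scheduled redirect.

    Returns None, and recategorises the redirect for manual attention,
    if any of the conditions is not met.
    """
    base = base_name(redirect.title)
    dab = f"{base} (disambiguation)"
    eligible = (
        QUEUE in redirect.categories
        and base != redirect.title            # title must be disambiguated
        and redirect.redirect_target == base  # redirect points at the base name
        and not redirect.has_significant_history
        and not redirect.has_talk_history
        and base in existing_titles           # article occupies the base name
        and dab in existing_titles            # DAB exists at XYZ (disambiguation)
    )
    if not eligible:
        redirect.categories.discard(QUEUE)
        redirect.categories.add(EXCEPTIONS)
        return None
    # First the article overwrites the redirect, then the DAB takes the base name.
    return [(base, redirect.title), (dab, base)]
```

Returning a plan rather than performing the moves keeps the check separate from the edits, which would make the conservative throttling mentioned below easier to apply in one place.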

This bot would need to be strictly and conservatively throttled, perhaps dynamically, to avoid overloading the servers, particularly in phase 3.

The second bot

The second bot automates the process for phase 3, using the gadget and the first bot. It acts on DABs remaining at names of the form XYZ (disambiguation), scheduling them to be moved by the first bot. It does this by simply creating pages in the category Primary topic elimination redirects. The redirect points to the former Primary Topic article (at the base name), and has the effect of scheduling both this article and the DAB itself to be moved by the first bot. No talk page is created, owing to an existing software limitation.

If the redirect already exists and has no significant history or talk page history, then this redirect just needs to be added to Primary topic elimination redirects, and its talk page deleted if it exists.

This bot would also be strictly and conservatively throttled, which can be achieved just by keeping the list of scheduled moves short (i.e. by limiting the number of pages in Primary topic elimination redirects; once this limit was reached, the second bot would add no more until the first bot removed some). In view of the throttling of the first bot, and the fact that the second bot performs far fewer database updates per entry than the first, this effectively throttles the second bot too. This limit would start off even more conservative until confidence in the bots and gadget acting together was established, but might then be increased while still being conservative regarding the impact on the server loads.
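
The throttling idea amounts to a bounded queue, and can be modelled in a few lines of Python. This is a sketch under assumptions: `ScheduledQueue`, `refill` and `process_one` are hypothetical names, and an in-memory deque stands in for the category Primary topic elimination redirects.

```python
from collections import deque
from typing import Deque, Iterator, Optional


class ScheduledQueue:
    """Stand-in for the category Primary topic elimination redirects."""

    def __init__(self, limit: int) -> None:
        self.limit = limit  # cap on the number of scheduled entries
        self.pending: Deque[str] = deque()

    def refill(self, candidates: Iterator[str]) -> int:
        """Second bot: schedule DABs until the cap is reached.

        Returns how many entries were added on this pass.
        """
        added = 0
        while len(self.pending) < self.limit:
            title = next(candidates, None)
            if title is None:
                break  # nothing left to schedule
            self.pending.append(title)
            added += 1
        return added

    def process_one(self) -> Optional[str]:
        """First bot: take one scheduled entry, freeing a slot."""
        return self.pending.popleft() if self.pending else None
```

Because `refill` adds nothing once the cap is reached, the second bot can never run ahead of the first by more than `limit` entries, and raising the cap later is a one-value change.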

The second bot would use the gadget to select the new disambiguator, always using the "preferred" disambiguator, so it's important that this should always be reasonable and should normally be right on the money. However there could also be extra reasonableness checks built into the second bot.

Similar proposals

Better redirects

As observed above, some of the difficulty readers have in finding articles that are at ambiguous names can be removed by redirects, and in some cases it already is. Such redirects are intentionally created in some cases, and in others they are the result of previous article moves.

But in the case of wave for example, a surfer wanting the article at wave (physics) must guess that this article, contrary to their own personal mindset, is at wave. The Wikipedia search list is no help whatsoever; the redirect at wave (physics) exists but does not even appear in the search results list.

There's a hatnote at the article at wave of course, and it points to wave (disambiguation), but they don't see it unless they go to that article, by which time it is too late: they've already found the article they want (somehow). And there's one at wind wave too, but it's not at all helpful to them, and in any case they won't go there because they think they know (correctly in this instance) that it's not the article they want.