Wikipedia talk:Date formatting and linking poll


Bug filed

With a great deal of thanks to MZMcBride, I finally managed to get disabling DynamicDates (setting $wgUseDynamicDates = false in LocalSettings.php) tested on a private wiki. The results are good: autoformatting of article text is disabled, whilst date preferences in article histories/logs are left intact. I've therefore gone and filed a bug here to make the change. Please keep discussion here, so as to avoid filling the bug request up with threaded discussion, and I'll link from the bug to here. Ryan Postlethwaite (See the mess I've created or let's have banter) 22:19, 15 April 2009 (UTC)

  • Where can we look at the results of what you are talking about? Are you saying that all the date formats shown in the big table, above, look OK simply by turning DynamicDates off? Greg L (talk) 22:58, 15 April 2009 (UTC)[reply]
  • With regard to turning off DynamicDates: I sure hope you know what you’re doing, Ryan. I’ve only got about two hours where I think I understand some of the ramifications. Anyway… What about date de-linking?? The community consensus on that issue couldn’t be clearer. With millions of linked dates, only a bot can handle it. What’s the plan? Greg L (talk) 23:18, 15 April 2009 (UTC)[reply]
  • Indeed, I'll get to that in the morning. That's going to be part two of the discussion as the autoformatting is close to being resolved. I'm about to go to bed, so it'll give me time to think. Hope that's ok. Ryan Postlethwaite (See the mess I've created or let's have banter) 23:20, 15 April 2009 (UTC)

(outdent). Well, from Bill Clark’s reaction on the Bugzilla, he has a firm grasp of the risks. He sounds like someone with the wherewithal, as he will be gathering statistics on just how many of the above junk-style formats will be junked with this move. I especially liked his quick grasp of the big picture at the end: “I expect to have that list ready later tonight, at which point a decision could be made to either fix those pages before disabling DynamicDates, or that the problem isn't widespread enough to be concerned with (since bots will be starting delinking of ALL lesser-relevant dates soon anyway).” Sweet. Greg L (talk) 23:58, 15 April 2009 (UTC)

Tcncv's table

Below is what I believe may be a fair demonstration of the before and after effect of setting $wgUseDynamicDates = false on both well and poorly formatted dates. Many of these examples are contrived and some look just as bad either way. Best viewed with date preferences set to none. Feel free to add examples if needed.
Code | Recognized | Before | After
[[15 April]][[2009]] | Yes | 15 April2009 | 15 April2009
[[15 April]] [[2009]] | Yes | 15 April 2009 | 15 April 2009
[[15 April]],[[2009]] | Yes | 15 April,2009 | 15 April,2009
[[15 April]], [[2009]] | Yes | 15 April, 2009 | 15 April, 2009
[[April 15]][[2009]] | Yes | April 152009 | April 152009
[[April 15]] [[2009]] | Yes | April 15 2009 | April 15 2009
[[April 15]],[[2009]] | Yes | April 15,2009 | April 15,2009
[[April 15]], [[2009]] | Yes | April 15, 2009 | April 15, 2009
[[2009-04-15]] | Yes | 2009-04-15 | 2009-04-15
[[2009]]-[[04-15]] | Yes | 2009-04-15 | 2009-04-15
[[2009]][[04-15]] | No | 200904-15 | 200904-15
[[2009]] [[04-15]] | No | 2009 04-15 | 2009 04-15
[[2009]] - [[04-15]] | No | 2009 - 04-15 | 2009 - 04-15
[[2009]][[April 15]] | Yes | 2009April 15 | 2009April 15
[[2009]] [[April 15]] | Yes | 2009 April 15 | 2009 April 15
[[2009]],[[April 15]] | Yes | 2009,April 15 | 2009,April 15
[[2009]], [[April 15]] | Yes | 2009, April 15 | 2009, April 15
[[2009]] , [[15 April]] | Yes | 2009 , 15 April | 2009 , 15 April
[[15 April]] | Yes | 15 April | 15 April
[[April 15]] | Yes | April 15 | April 15
[[15 April]] ,[[2009]] | Yes | 15 April ,2009 | 15 April ,2009
[[15 April]] , [[2009]] | Yes | 15 April , 2009 | 15 April , 2009
[[15 April]]··[[2009]] | Yes | 15 April 2009 | 15 April 2009
[[15 April]]··,··[[2009]] | Yes | 15 April , 2009 | 15 April , 2009
[[15 April]],,[[2009]] | Yes/No† | 15 April,,2009 | 15 April,,2009
[[15 April]] [[2009]] | Yes/No† | 15 April 2009 | 15 April 2009
[[15 April]], [[2009]] | Yes/No† | 15 April2009 | 15 April2009

Recognized indicates whether or not the coded format is currently recognized and is reformatted.
Before shows the current presentation and is dependent on your date preferences.
After shows the expected presentation when $wgUseDynamicDates = false.
†For these cases, the Day-month part is recognized, but is formatted separately from year.

(I suspect LightMouse can script a fix for these cases in next to no time.) -- Tcncv (talk) 00:51, 16 April 2009 (UTC)[reply]

…and back to our regularly scheduled programming

  • Thank you very much for your list, Tcncv. Items #1, 3, 5, and 7 look to me like they should be rather common on Wikipedia. We will have to await Bill Clark’s statistics to be sure just how common they are. I will certainly have a difficult time understanding the rush to turn off DynamicDates if instances of those four types prove as common as I suspect they might be.

    It would make much more sense to leave DynamicDates alone until after bots have deleted unnecessary links. Then we will have the opportunity of having a bot go through and revise the syntax on these. Changing, for instance, [[15 April]][[2009]] to [[15 April]] [[2009]] would be trivial for a bot, and it could do so fast. Then we can turn off DynamicDates. I’m an engineer. This conservative, stepped approach is least disruptive to our I.P. users and is how things would be handled in "the real world" of engineering. Greg L (talk) 01:15, 16 April 2009 (UTC)[reply]

  • Well, I now understand why DD wasn't turned off back in December after the two marathon RfCs. It's also good to know that UC Bill (Bill Clark) hasn't bailed out on us totally. Based on current knowledge, I would still be inclined to support switching off DD immediately. Unfortunately, the only person who is able to tell us in detail how Lightbot works, cannot.

    In my travels, I have found that formatting errors are rife throughout WP. I already see many of those abominations because I have my preferences disabled. While the majority of dates are correctly formatted, there is a significant incidence of errors which Lightmouse's scripts work to address. However, neither Lightbot nor the scripts touch any ISO date formats, nor any of the more weird and wonderful permutations such as the ambiguous "12/12/06" and "12-12-06" which also exist across WP. The function was disabled because, I believe, there were a significant number of false positives, most frequently with web links and image names. This may be a significant challenge because the vast majority of references/footnotes I have come across use the ISO format in part or in full. Although MOSNUM guides us to use a uniform date format within the body of an article, the issue of harmonising date links to include footnotes may still need to be discussed. It seems that switching off DD would lead to some red-linked ISO dates, which some editors may be inclined to fix rather than simply remove. Incidentally, I use the code [to convert ISO dates to dmy or mdy] written by Lightmouse in my own script, and need to review all the changes to weed out those false positives.

    Another issue which isn't addressed by Lightbot is the inconsistent date formats. It is a bot, and as such is unable to determine the rule as to whether dmy or mdy should apply. However, the monobook scripts perform that function under human supervision/discretion, and can be applied to categories in semi-automated mode in conjunction with AWB. I think there may still be false negatives when running the script, but false positives are now few and far between because the script has been continuously revised following error reports from users. Ohconfucius (talk) 02:04, 16 April 2009 (UTC)[reply]

  • I added a few more examples above. It appears that the current date formatting process handles any number of spaces and at most one comma between the day-month and year parts, replacing whatever it finds with a single space or comma-space combination, depending on selected format. --Tcncv (talk) 02:39, 16 April 2009 (UTC)[reply]
  • I have actually seen most of those forms manifest, so it's not contrived. Ohconfucius (talk) 07:58, 16 April 2009 (UTC)[reply]

(outdent) Wait a minute. Quoting you, Ohconfucius: “It's also good to know that UC Bill (Bill Clark) hasn't bailed out on us totally.” Do I understand this correctly? If Bill Clark, the individual who responded to Ryan’s Bugzilla 18479 and pledged to come back soon with more statistics, is, in turn, UC Bill, who is the puppetmaster of Sapphic, who is under an indefinite block, then it appears we are in a rather awkward working relationship with Bill Clark, who is UC Bill, who is now also blocked for other sockpuppetry violations, particularly a threat using an account known as Wclark_xoom. Wikipedia is one odd place. And I’ve worked in odd places before. I don’t think I would ever want to be an admin. Greg L (talk) 06:37, 16 April 2009 (UTC)

  • Well <wiping a bit of egg off face>, that was before I became aware of the sockpuppetry which was 'UC Sapphic'. 'His' anger and disruption are clearly no longer welcome here. Ohconfucius (talk) 07:55, 16 April 2009 (UTC)[reply]


  • Confirmed. We will need luck and a large amount of goodwill on Sapphic’s part to obtain date syntax statistics from “Bill Clark.” The Wclark_xoom account that was blocked as a sockpuppet of Sapphic/UC Bill is the e‑mail address (wclark@xoom.org) of the “Bill Clark” upon whom we are/were waiting upon for statistics in Ryan’s Bugzilla #18479. XOOM (from the account Wclark_xoom and from wclark@xoom.org) is a web‑hosting service and Sapphic made a series of edits to that article. I believe the truest identity of this individual is female—the one that does yoga and pilates—and is best described by the profile that used to be on the userpage of “Sapphic” (aka Bill Clark, UC Bill, I.P. User:169.229.149.174, Wclark_xoom, and the e-mail wclark@xoom.org).

    I would be mildly surprised if Sapphic provides us further statistics on Ryan’s bugzilla given the latest events (a series of permanent or indefinite bans). Do we have someone else who can step up to the plate and provide statistical information as to whether problematical date syntaxes shown in Tcncv’s above table are rare or common?

    Or can we use the ol’ *grin test* and intuition to advance an assessment as to whether or not it would be a wise thing to first (maybe ever) shut DynamicDates down? I should think that the date syntax that generates April 152009 ought to be extraordinarily common. I can see no reason at all to cause so many dates to break. I, frankly, am quite skeptical that it would be a good thing to turn off DynamicDates before a bot can first clean this up (even though doing so anyway might be emotionally appealing at some levels).

    Frankly, I’m not catching on with how turning off DynamicDates can be all that appealing; doing so doesn’t get rid of any of the links (other than to turn a few of the blue ones into broken red ones), it just makes many of the dates look incorrectly written, and still others to become essentially unreadable. Come on guys. Lose (most of the links), and keep the blue-linked dates that remain looking half-way nice until there is a sensible plan here. Baby steps. You don’t cut off your nose to spite your face. Greg L (talk) 15:14, 16 April 2009 (UTC)[reply]

  • How about this profile of Sapphic, before some of the more interesting userboxes (ancestries, UC Berkeley) were removed? Ohconfucius (talk) 02:57, 17 April 2009 (UTC)[reply]
!! <rest pre-emptively self-censored>--Goodmorningworld (talk) 08:32, 17 April 2009 (UTC)[reply]

An important point

The majority of people don't have date preferences set. We're worrying about what will happen to the people that have date preferences set; well, we should think of what's happening right now. If a date is linked as 2009-04-15, then the majority of users see a red link. It's already broken and will stay broken if dynamic dates are turned off. These dates need to be fixed regardless of what happens to dynamic dates and it shouldn't affect our thinking at all here. Ryan Postlethwaite (See the mess I've created or let's have banter) 18:58, 16 April 2009 (UTC)

that's apparently not quite right, Ryan Postlethwaite - according to the discussion above Dynamic Dates masks certain errors - like your example - even for people who don't have preferences set. i don't have preferences set, and i see your example as a healthy blue link; some of the charts above show what i'll see when DD is turned off. a bot can start spotting and fixing those errors even before DD is turned off, and GregL is right that that should start ASAP. (but for me that doesn't mean there's a reason to postpone turning off DD - both things should happen.) Sssoul (talk) 19:20, 16 April 2009 (UTC)
  • OK, Sssoul. I see you are pretty much on my side here. Thanks. But I would love to see some explanation from you justifying how turning off DynamicDates right now could possibly be a good thing for our I.P. users and Wikipedia. Doing so would obviously generate a bunch of undecipherable and poor-looking dates while bots scramble to clean up the mess. Do you have reasoning that wouldn’t fall under the heading of “I would have a mind‑numbing orgasm when DynamicDates is shut off”?? Greg L (talk) 20:07, 16 April 2009 (UTC)[reply]
smile: okay, i'll try. if DD is kept on, diehards will keep marking up dates just to see them autoformatted; and pages where there are some linked dates and some unlinked dates will look inconsistent to some users (those who have their preferences set to a different format than the fixed-text dates for a given page). meanwhile, you predict that a whole lotta "ungodly ugliness" will be revealed when DD is turned off, but i don't expect it to be too dire. and i trust that the enlisted bots will clean up any ugliness really quickly, and will be given thanks & praises, which they'll enjoy. 8)
in short, as noted above: sure, let the bots start cleaning up faulty formats without waiting for DD to be turned off, but let's not delay turning off DD either. Sssoul (talk) 21:09, 16 April 2009 (UTC)[reply]
  • Hmm. Then what do you think of this, Sssoul: What if the decision was to get a bot quickly going (within, say, a week from now), that swept through Wikipedia and did cleanup like this:
  1. [[April 15]][[2009]] → [[April 15]], [[2009]]
  2. [[April 15]],[[2009]] → [[April 15]], [[2009]]
Without these fixes, the above two syntaxes will render as April 152009 and April 15,2009 respectively once DynamicDates is turned off. The bot would do the sweep, with the objective that DynamicDates is to be turned off in the next month. I take note of your “…diehards will keep marking up dates just to see them autoformatted” concern. We can change the advice here on MOSNUM to state that support for autoformatting will soon be turned off and dependencies orphaned.
The two key distinctions of this are that 1) we would do some cleanup first so we aren’t scrambling to fix stuff the world can see, and 2) before even doing so, a formal statement would go onto MOSNUM declaring the impending inactivation of DynamicDates and the resultant orphaning of autoformatting. What do you think of this? Greg L (talk) 23:07, 16 April 2009 (UTC)
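The two-item cleanup above could be sketched roughly as follows. This is only an illustration in Python, not the code of any existing bot: the names `MDY_TIGHT` and `fix_mdy_spacing` are made up for the example, and it assumes full English month names and only the two tight forms listed above.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# A month-day link, an optional comma with no spaces, then a year link.
# Assumption: only the two forms listed above are targeted.
MDY_TIGHT = re.compile(rf"(\[\[(?:{MONTHS}) \d{{1,2}}\]\]),?(\[\[\d{{1,4}}\]\])")

def fix_mdy_spacing(wikitext: str) -> str:
    """Insert the comma-space separator while leaving both links intact."""
    return MDY_TIGHT.sub(r"\1, \2", wikitext)

print(fix_mdy_spacing("Born [[April 15]][[2009]] and [[April 15]],[[2009]]."))
```

Dates that already use comma-space are not matched, so a second pass over the same page would change nothing.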
GregL, that sounds way more reasonable than keeping DD on indefinitely. i'm still not that bothered by the idea of the world seeing some of the typos that DD has been masking, and then seeing (and assisting the bots with) some of the clean-up, but yes, that sequence of events you've outlined above sounds sensible.
i also understand Tcncv's reasoning below about commissioning a separate bot to do just the date-typo-fixing - but i do want to know where delinking fits into the proposed sequence of events. commissioning a separate typo-fixing bot wouldn't collide with lifting the temporary injunction against delinking, would it? obviously bots are needed for these tasks but people can assist in the meantime, using Lightmouse's script to correct errors and delink. Sssoul (talk) 06:09, 17 April 2009 (UTC)[reply]
  • Indeed, you aren’t understanding the technical issues correctly, Ryan. Please examine Tcncv's table, above. I don’t give a dump either about what registered editors (those who have their date preferences set to something other than “No preference”) see or don’t see. In the above table, the Before column shows what regular I.P. users see now. The After column shows what all I.P. users would see if we turned DynamicDates off. Everyone (I.P. users and the privileged elite) would see a bunch of crap in many cases. We don’t want to do that. It is not a viable solution because there are many instances on Wikipedia of syntaxes coded in ways that would become April 152009 and 15 April2009. Please see my 15:14, 16 April 2009 post above; particularly the last two paragraphs. Turning off DynamicDates is not a solution we can avail ourselves of, at least not early on. Greg L (talk) 19:35, 16 April 2009 (UTC)

    P.S. The red-checkmarked entries in the Before column are ‘what-ifs’, most of which wouldn’t be found on Wikipedia because they instantly generate broken red links for all editors. These are for illustrative purposes to show us they were examined to evaluate what DynamicDates is or is not capable of parsing. The point is that the syntax shown in rows 1, 3, 5, and 7 should be very, very common on Wikipedia. Even if Lightbot unlinks everything it is supposed to, there will be articles like 1985 that will continue to contain linked dates. Many of the dates in these intrinsically chronological articles have been coded with syntax that will generate the hammered dog shit shown in the After column were we to turn DynamicDates off. Everyone would see that. Greg L (talk) 19:51, 16 April 2009 (UTC)

  • I added footnotes to the table above to clarify the column meanings. Below is my suggestion for fixing the problem dates.
    Proposed date fix up process
    1. Hold off disabling auto-formatting until the majority of the potential poorly formed dates can be cleaned up.
    2. Confirm with someone who knows the software that we have properly identified the cases that need fixing.
    3. Commission a limited scope bot to perform the task of changing the poorly formatted dates into well formatted dates.
        a) The bot would fix the spacing and comma usage to be appropriate for the date style.
        b) The bot would not remove the links. (This can be done later.)
        c) The bot would not change the currently coded date style (even if inconsistent within the article).
        d) For yyyy-mm-dd style dates, the code would be changed to [[yyyy]]-[[Month dd|mm-dd]] to simulate current link behavior.
    4. Get bot approved. The rules defined above should hopefully minimize controversy and potential objections.
    5. Identify and update affected main space pages.
    6. Revisit the request for disabling auto-formatting.
    I believe the limited function bot can get the job done fairly quickly once approved. Also, the limited functionality should minimize the risk of undesirable or controversial results, and would require little operating supervision and intervention (once testing is satisfactorily completed). I expect that even dates in quoted text would not be an issue, because any poorly formatted, linked dates are already being modified by auto-formatting. I suspect the number of pages that need to be fixed will be numerous (1000's?), but not overwhelming.
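    Step 3d of the proposal (rewriting [[yyyy-mm-dd]] as [[yyyy]]-[[Month dd|mm-dd]]) might look something like the Python sketch below. It is illustrative only; the function names are hypothetical and no existing bot is known to work this way. It assumes four-digit years and two-digit months and days, leaves out-of-range months alone, and does not validate the day.

```python
import re

MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

# Assumption: only the fully linked [[yyyy-mm-dd]] form is handled here.
ISO_LINK = re.compile(r"\[\[(\d{4})-(\d{2})-(\d{2})\]\]")

def convert_iso_link(match: re.Match) -> str:
    """Build [[yyyy]]-[[Month dd|mm-dd]] so the displayed text stays yyyy-mm-dd."""
    year, month, day = match.group(1), match.group(2), match.group(3)
    month_number = int(month)
    if not 1 <= month_number <= 12:
        return match.group(0)  # already broken; leave it for a human
    target = f"{MONTH_NAMES[month_number - 1]} {int(day)}"
    return f"[[{year}]]-[[{target}|{month}-{day}]]"

def fix_iso_links(wikitext: str) -> str:
    return ISO_LINK.sub(convert_iso_link, wikitext)

print(fix_iso_links("Retrieved on [[2009-04-15]]."))
```

    The piped link keeps the visible text unchanged while pointing the month-day part at the existing day article, which is the "simulate current link behavior" goal stated in step 3d.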

    One loose item I can think of is templates: Are there any templates that currently emit poorly formatted dates? If so, these will need to be fixed manually. Such cases may not be apparent until after the switch, but I would expect their number to be few (if any) and they should be easy fixes.

    Although some might prefer to rush this through, I think a slow methodical approach is better. -- Tcncv (talk) 00:01, 17 April 2009 (UTC)[reply]

  • Remember that these ISO dates are deemed acceptable within tables to save space and enable sorting. In these cases, it would make more sense to simply delink the dates altogether, rather than this conversion in step 3 above. Ohconfucius (talk) 08:36, 17 April 2009 (UTC)[reply]
  • This proposal makes sense in general. But why hold off delinking? Greg's suggestion that it all be done in one shot seems like a more efficient way forward, bearing in mind just how unpopular the vast majority of date links are... Ohconfucius (talk) 02:49, 17 April 2009 (UTC)[reply]
  • Woa. Where did I make such a suggestion? Hold off; don’t start quoting me. If I made such an assertion (that it “all be done in one shot”) it was purely unintentional.

    I strongly, urgently suggest that DynamicDates be left on until bots have A) removed all the overlinking on Wikipedia, and B) searched through the remaining dates for code syntax that would read improperly if DynamicDates was turned off. Then we can turn off DynamicDates. There is simply no justification for a rush to set some parameter to “false” (oh, soooo easy) and instantly make hundreds, perhaps thousands of dates look like crap (or worse: become unreadable or have broken links).

    I wholeheartedly agree with Tcncv’s enumerated plan, above. Good job, Tcncv. By design, since his preamble stated he was intending to address only the challenge of “fixing the problem dates”, the only important step unmentioned is that a bot also needs to get to work removing the excess links and converting them into plaintext. Greg L (talk) 03:45, 17 April 2009 (UTC)

  • I don't know how long it would take to delink 2,800,000 articles, but I assume that it would be quicker to initially concentrate on the poorly formatted dates so that auto-formatting can be turned off sooner. As for not delinking, I am assuming that there are some dates that should remain linked (not that this is well defined at this point or that I have any idea what they might be), so some operator monitoring might be needed in the general delinking process. I also expect that the general delinking process might involve decisions on date formatting consistency in those articles with a mix. My intent was to define limited activities that are pretty much no-brainers with no decision-making needed, so the bot would be pretty much autonomous. The limited activity might also make it easier to review edits to confirm expected results without seeing unrelated activity. But I'm not a bot expert, so I may be seeing imaginary advantages. -- Tcncv (talk) 04:22, 17 April 2009 (UTC)[reply]
  • Most of the commonest formatting problems are satisfactorily dealt with by Lightmouse's script, which will also render a uniform date format per article (except ISO). One pass of the script over articles will sort out most of them in a non-piecemeal manner. However, it would involve semi-automated editing, which would be most efficient. Ohconfucius (talk) 04:58, 17 April 2009 (UTC)
  • You will all be happy to learn that Bill Clark is still working on the data concerning the various incorrect date formats. He should have some statistics tomorrow. Furthermore, he advises: "DynamicDates should NOT be turned off until we at least know how many links will be affected, and maybe not until they've been corrected. " By correction, I presume this must mean 'with or without delinking'. To repeat what I have said earlier, I see the best way is to set our gnoming/AWB editors free to run Lightmouse's monobook script which delinks dates and corrects most of the badly formatted ones; we could set a target of switching off only the relevant part of Dynamic Dates, say, three months after the injunction has been lifted. By then, the incidence of any messy dates should be minimised. Ohconfucius (talk) 09:07, 17 April 2009 (UTC)[reply]
Ohconfucius wrote: "I see the best way is to set our gnoming/AWB editors free to run Lightmouse's monobook script which delinks dates and corrects most of the badly formatted ones" - do you mean in addition to commissioning a bot like the one Tcncv has described? letting the script-users go to work makes sense, certainly, but a bot is needed too.
someone above said Lightbot can almost certainly make this kind of typo-correction at the same time as it delinks - has that been confirmed? can someone ask Lightmouse? Sssoul (talk) 09:17, 17 April 2009 (UTC)[reply]
  • Of course. While script users can go to work immediately once the injunction is lifted (and apply judgement to whether to use dmy or mdy), bot action is needed as bots work faster and more systematically. I have asked Lightmouse for clarification, and he will no doubt reply on his talk page. Ohconfucius (talk) 09:36, 17 April 2009 (UTC)[reply]
thanks for the clarification, and for asking Lightmouse for more detail. i see he's already pointed out that Lightbot isn't currently authorized to delink autoformatted dates - which means either getting the authorization changed or turning off DD before Lightbot resumes its work. Sssoul (talk) 09:46, 17 April 2009 (UTC)[reply]

It is time to hear from Lightmouse

Ryan, all:

In trying to figure out the role of bots in moving forward, we now have a bunch of Wikipedians wandering around in dark caves, curious as to which tunnel to head down to avoid dead ends or pitfalls, and Lightmouse has the torch at the cave’s entrance! Can you address the “naughty naughty—you” stuff in a different venue and time and politely and graciously invite him to this discussion? Wikipedia needs his volunteer services and, right now, we could greatly benefit from his expertise and counsel (and his services to Wikipedia should he elect to contribute them). Greg L (talk) 14:35, 17 April 2009 (UTC)

  • To heck with this. I’m going to Lightmouse’s talk page and am going to find out what can and can not be done technically and what he would like to do. That all is, after all, a bit relevant here. Anyone interested can look on and participate there. Greg L (talk) 16:44, 17 April 2009 (UTC)[reply]

Conclusions

When all the dust has settled, could someone please remember to add in a prominent place at the top of the main poll page a summary of the poll's results and any consequent actions taken. Thanks! 86.161.41.37 (talk) 03:50, 18 April 2009 (UTC).[reply]

Linked dates statistics

I have created a bunch of new pages, whose directory is here as sub-pages of this discussion. I will rename this and continue using this as the central link when more data are available. So far, the data-extraction of ISO formatted dates indicates that ISO dates are far more commonly linked as [[yyyy-mm-dd]] (some 58,000 articles concerned), whilst the [[yyyy]]-[[mm-dd]] form appears only in about 2000 articles throughout the various spaces of WP. Data is courtesy of Bill Clark. Ohconfucius (talk) 04:24, 18 April 2009 (UTC)[reply]

Thank you both very much. It would also be useful to scan for dates with non-standard space-comma combinations between the parts. These are the ones that look good now, but will show their faults when autoformatting is turned off. I've done quite a bit of testing and it appears that in addition to the yyyy-mm-dd style formats, autoformatting recognizes dates with the following formats:
  • [[Month·Day]]...[[Year]]
  • [[Day·Month]]...[[Year]]
  • [[Year]]...[[Month·Day]]
  • [[Year]]...[[Day·Month]] - A very poor format, but it is recognized.
Where:
  • ... is a combination of zero or more spaces and zero or one comma anywhere in the string (·*(,·*)?). This could be no characters. This could be many spaces with a comma at the beginning, end, or anywhere in the middle.
  • Day is one or two digits (\d{1,2})
  • Month is the case insensitive full month name or three letter abbreviation (Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|June?|July?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?) - (Adapted from Lightmouse's scripts)
  • Year is one to four digits (\d{1,4})
Note that the above are less-than-formal pseudo-regular expressions. They will need to be tailored to whatever tool is used. I used "·" to represent a space. Autoformatting does not appear to recognize other types of whitespace such as tabs or newlines. Three digit days (001), five or more digit years, and alternate month abbreviations such as "Sept" are not recognized. The month-day and day-month part must have exactly one space between day and month. Note that this is the result of testing and reverse-engineering. It would be nice if someone who knows the software could independently confirm these results.
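For anyone wanting to try the pseudo-expressions above in a real tool, one possible translation into Python's `re` syntax is sketched here. It encodes only the reverse-engineered rules described in this post (plain spaces, at most one comma, one-or-two digit days, one-to-four digit years); it is not taken from the MediaWiki source, and `is_recognized` is a name invented for the example.

```python
import re

MONTH = (r"(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|"
         r"Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)")
DAY = r"\d{1,2}"
YEAR = r"\d{1,4}"
SEP = r" *(?:, *)?"  # zero or more spaces with at most one comma; spaces only

MONTH_DAY = rf"\[\[{MONTH} {DAY}\]\]"
DAY_MONTH = rf"\[\[{DAY} {MONTH}\]\]"
YEAR_LINK = rf"\[\[{YEAR}\]\]"

# The four two-link shapes listed above; yyyy-mm-dd styles are not covered here.
RECOGNIZED = re.compile(
    rf"(?:{MONTH_DAY}|{DAY_MONTH}){SEP}{YEAR_LINK}"
    rf"|{YEAR_LINK}{SEP}(?:{MONTH_DAY}|{DAY_MONTH})",
    re.IGNORECASE,
)

def is_recognized(code: str) -> bool:
    """True if the whole string is one of the two-link date shapes."""
    return RECOGNIZED.fullmatch(code) is not None

for sample in ["[[15 April]][[2009]]", "[[April 15]] , [[2009]]",
               "[[15 April]],,[[2009]]", "[[Sept 15]] [[2009]]"]:
    print(sample, is_recognized(sample))
```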
To locate problem cases (those with other than the expected space-comma combination), I would propose running another scan using search strings similar to the following:
  • [[Month·Day]](·*|,|,··+|·+,·*)[[Year]] - Excludes the comma + single space case.
  • [[Day·Month]](|··+|·*,·*)[[Year]] - Excludes the single space case.
  • [[Year]](|··+|·*,·*)[[Month·Day]] - Excludes the single space case.
  • [[Year]]·*(,·*)?[[Day·Month]] - All such cases need reformatting.
It would also be useful if Lightmouse could update his scripts to recognize and correct the general spacing variations before they are used for any large-scale delinking. Although I would expect the number of poorly formatted dates to be relatively small (compared to the yyyy-mm-dd date counts), I think they still need to be identified and fixed before turning autoformatting off. -- Tcncv (talk) 07:15, 18 April 2009 (UTC)
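A rough Python sketch of the proposed scan follows. Rather than transcribing the four search strings literally, it matches every recognized two-link date, captures the separator, and flags the date when the separator is not the expected one, which should select the same cases. The names `CHECKS` and `find_problem_dates` are hypothetical, and the recognition rules are the reverse-engineered ones from this thread, not confirmed software behaviour.

```python
import re

MONTH = (r"(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|"
         r"Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)")
SEP = r"( *(?:, *)?)"  # the only capturing group: the separator between links
MD = rf"\[\[{MONTH} \d{{1,2}}\]\]"
DM = rf"\[\[\d{{1,2}} {MONTH}\]\]"
YR = r"\[\[\d{1,4}\]\]"

# (pattern, separator treated as well formed); None means always flag.
CHECKS = [
    (re.compile(rf"{MD}{SEP}{YR}", re.IGNORECASE), ", "),  # April 15, 2009
    (re.compile(rf"{DM}{SEP}{YR}", re.IGNORECASE), " "),   # 15 April 2009
    (re.compile(rf"{YR}{SEP}{MD}", re.IGNORECASE), " "),   # 2009 April 15
    (re.compile(rf"{YR}{SEP}{DM}", re.IGNORECASE), None),  # 2009 15 April
]

def find_problem_dates(wikitext: str) -> list[str]:
    """Return linked dates that display acceptably only while autoformatting is on."""
    problems = []
    for pattern, expected in CHECKS:
        for match in pattern.finditer(wikitext):
            if match.group(1) != expected:
                problems.append(match.group(0))
    return problems

print(find_problem_dates("On [[April 15]][[2009]] and [[15 April]] [[2009]]."))
```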
The doc is at meta:Help:Date formatting and linking. You're essentially right except it also recognizes lowercase months and years BC. --80.104.235.89 (talk) 09:42, 18 April 2009 (UTC)[reply]
Got it covered with the case insensitive qualification. I've found that the software also recognizes odd cases such as [[ApRiL 1]][[2009]], although the result is typically a red-link like: "ApRiL 1 2009". It also doesn't like months outside the range 01-12, so [[2009]][[13-01]] displays as "2009-13-01" (empty). However, these cases are already broken. My goal here is to identify those cases that display correctly now, but will break when autoformatting is turned off.
If the above appears sufficiently well defined, who should we contact to request another scan? -- Tcncv (talk) 15:26, 18 April 2009 (UTC)[reply]

A new batch of processed wikilinks, involving articles with incorrectly formatted mdy dates, has now been posted (see Wikipedia:Date formatting and linking poll/List of articles with potential issues post Dynamic Dates). So far, the data-extraction of mdy formatted dates indicates that these dates are far more commonly (incorrectly) formatted as [[mm dd]]<without comma> (some 120,000 articles are concerned). Note that some of the formatting of foreign language titles is incorrect due to char substitution, which I will attempt to remedy. Data is courtesy of Bill Clark. Ohconfucius (talk) 12:47, 19 April 2009 (UTC)

"other brands" of DA

just before the poll opened a new (?) autoformatting template was announced: {{formatdate|dmy}} or something like that. i've also recently become aware of the {{date}} template (it's worth checking out that page for a description). so what happens to those now - will they be deactivated along with Dynamic Dates? Sssoul (talk) 07:12, 18 April 2009 (UTC)[reply]

  • Indeed, it seems to involve perhaps 2 thousand article-level transclusions. These templates will indeed be affected as these go against the agreed position. The templates support a wide number of date formats. Ohconfucius (talk) 16:11, 18 April 2009 (UTC)[reply]
Wikitext marked up with the {{date}} template is probably there for a reason, such as in an infobox. All of its occurrences ought to be relatively easy for a bot to find. As it outputs text correctly formatted for wikipedia, its formatting is not really a problem. That is, it doesn't autoformat, but outputs a style dictated by the second parameter (e.g. |dmy) - what you might call "fixed formatting". Nevertheless, it is also capable of producing linked dates by using a second parameter like |ldmy or |lmdy. I would again suggest that changing an |ldmy parameter to |dmy would be trivial for a bot, where the linking is to be removed. --RexxS (talk) 21:39, 18 April 2009 (UTC)[reply]
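As a rough sketch of the bot edit suggested above, the parameter change might look like this in Python. It assumes simple {{date|value|format}} calls and only the two linked variants named in the comment (ldmy and lmdy); the function name is hypothetical, and nested templates or extra parameters are deliberately left untouched.

```python
import re

# Matches simple {{date|<value>|ldmy}} or {{date|<value>|lmdy}} calls.
# The value may not contain braces or pipes, so nested templates are skipped.
LINKED_DATE_TEMPLATE = re.compile(
    r"(\{\{\s*date\s*\|[^{}|]*\|\s*)l(dmy|mdy)(\s*\}\})",
    re.IGNORECASE,
)


def unlink_date_templates(wikitext):
    """Drop the leading 'l' from the format parameter, leaving the rest as-is."""
    return LINKED_DATE_TEMPLATE.sub(r"\1\2\3", wikitext)
```

Calls that already use an unlinked format such as |dmy pass through unchanged.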
but ... 1] where i saw it in use it was in the main body of the article, not an infobox (and the template description specifies that it's for use in articles as well as infoboxes);
2] it looks to me like when the second (optional) parameter isn't set, it does format according to user settings (and if it's used for "fixed formatting", why not just enter the dates as fixed text??); and 3] it goes right against a view that a whole lot of people expressed in the poll: that date formatting doesn't warrant complicating the mark-up at all. sorry, but it (and the other {{formatdate}} template) seem way too much like potential fodder for months of further strife over whether/how to mark up dates. Sssoul (talk) 06:33, 19 April 2009 (UTC)
When the second parameter isn't set, it uses the day-month-year format. As for "why not just enter the dates as fixed text", I guess it is used for stuff such as {{date|{{{year}}}-{{{month}}}-{{{day}}}|myd}} (doing that without the template would require {{#switch:{{{month}}}|January|February|...}}). --80.104.234.195 (talk) 10:26, 19 April 2009 (UTC)[reply]

(outdent) okay, thanks for the correction about it defaulting to dmy - some comments from editors who use the template indicate that they think it formats dates in accordance with user settings, but i guess they're wrong, so i'll strike that misconception from my earlier post. but i don't understand - at all - what you wrote about "stuff such as [code]" . maybe i don't have to understand it, but if it can be explained simply/briefly i'd be interested. thanks Sssoul (talk) 12:02, 19 April 2009 (UTC)[reply]
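To illustrate the "stuff such as [code]" point in plain terms: the template is handed a numeric year, month and day and returns readable text, so nobody has to write out a twelve-way lookup from month number to month name by hand. The Python below is only an analogue of that idea, not the template's actual implementation; the function name and the supported style codes are assumptions, with dmy as the default as described above.

```python
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_date(iso_date, style="dmy"):
    """Turn 'YYYY-MM-DD' into fixed text in the requested style."""
    year, month, day = (int(part) for part in iso_date.split("-"))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    name = MONTH_NAMES[month - 1]  # the lookup the template spares editors
    if style == "dmy":
        return f"{day} {name} {year}"
    if style == "mdy":
        return f"{name} {day}, {year}"
    raise ValueError(f"unsupported style: {style}")
```

For example, format_date("2009-04-19") gives "19 April 2009", and passing "mdy" as the second argument gives "April 19, 2009".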

When can we expect results?

I have asked this above in another section and haven't heard back, so perhaps I should ask in a new section. I've never been involved in arbitration before. When can we expect a decision from ArbCom and go back to editing dates (i.e. when will the date editing injunction be lifted)? RainbowOfLight Talk 06:15, 19 April 2009 (UTC)[reply]

I have the same question. I did not align with either side but don't see why a portion of the issues that are decided on should continue to block editing. For example, it seems to me there is consensus on the common-sense position that date links should have no special status: although we could link every single word in an article, we instead eliminate links that readers would most likely see no reason to click on. It is baffling to me why, after 4 months, there is still a ban on removing date links that make no sense when there is consensus support for allowing their removal. It seems to me that the consensus position is pretty simple to understand, but surely we could have a single document that spells the rules out crisply and thoroughly; then the link ban could be lifted.
Whether or not we can phase in certain aspects of what has been decided, I echo RainbowofLight's question. Is there any sort of guesstimate on when this arbcom ruling is going to be finished? I have a family of date templates whose consideration at MOSNUM is crying out for an RFC, but it's pointless to solicit wider input now since the results of the arbcom decision will undoubtedly have a dramatic impact on the opinions solicited. To remove doubt about the validity of any consensus positions reached, I would have to rerun the RFC. From where I stand, everything having to do with dates is in gridlock. I'm a patient guy, but I'd like to have a reading on how much time I should be expecting on resolution of this matter. -J JMesserly (talk) 21:37, 20 April 2009 (UTC)
The problem right now is that auto formatting has not been rejected (at present it's "no consensus"), and delinking dates would also remove the auto formatting (which still has the previously existing consensus until such time as it is rejected). I'm not willing to sacrifice all the marked up dates we have currently, especially if any agreed upon auto formatting system ultimately utilizes the link syntax (which would mean all the effort to delink dates would be for naught). It makes more sense to continue discussion and try to determine what would satisfy people regarding auto formatting. —Locke Coletc 21:57, 20 April 2009 (UTC)[reply]
But auto formatting would use that new wikitext thing (sorry, I forget the syntax: that recent mediawiki extension that does what the date links did). So date linking for the purposes of formatting is obsolete and there aren't any scenarios where we would still be using the old date link approach, right? I've not been following this super closely, but I don't see why all editors can't be removing links that make no sense just because they happen to link to a year article with a kajillion inbound links that also make no sense. Whether or not auto formatting is approved, you keep the meaningful date links, but the rest of them get the red pen. If that is not fair to some side for some reason, then fine, I'll sit down. I just thought there was agreement from both sides on that much. -J JMesserly (talk) 01:08, 21 April 2009 (UTC)
I think we should leave Locke alone with his rather "unique" interpretation of the consensus. He didn't get it on 25 December 2008, it doesn't look like he's getting it any better on 13 April 2009. Ohconfucius (talk) 02:16, 21 April 2009 (UTC)[reply]
I think we should leave Ohconfucius alone with his rather silly ideas regarding "consensus". —Locke Coletc 11:30, 21 April 2009 (UTC)[reply]
Ohconfucius and Locke, just carry on hitting each other on the head with your sandbuckets if you’d like to see me propose a page ban for you on ANI. At a minimum, please say something actually witty to each other next time. Bishonen | talk 15:26, 21 April 2009 (UTC).[reply]
Trout, anyone? ;-) Ohconfucius (talk) 15:35, 21 April 2009 (UTC)[reply]
Bishonen slaps Ohconfucius around a bit with a ginormous shark. Bishonen | talk 15:42, 21 April 2009 (UTC).[reply]
  • Hey, I hope you're gonna be even handed with the shark, Locke will be extremely jealous. Ohconfucius (talk) 15:51, 21 April 2009 (UTC)[reply]
The formatdate parser function? People argued that the existing system's syntax was "too complicated" by virtue of it using square brackets (as with normal links). I can only imagine the kind of complaints we'd receive if we forced people to mark dates up using {{#formatdate:November 11, 2001}}... There's been a proposal to simply remove the linking in the software, but for whatever reason this continues to be resisted. It's simpler, doesn't involve the use of bots, and keeps dates auto formatted. (And of course the other issues can be fixed with time, assuming opponents don't try for RFC5 and a half...). —Locke Coletc 11:30, 21 April 2009 (UTC)
Off topic... What's formaldehyde got to do with anything? Ohconfucius (talk) 15:41, 21 April 2009 (UTC)[reply]
  • Ohconfucius, yeah, it's an attempt at humour, but please be sensitive to the feelings of people on both sides of this former argument. It is in everyone's interest that the temperature be cooled down. We all need to live with each other productively. Tony (talk) 15:45, 21 April 2009 (UTC)[reply]
  • This dispute has raged an absurd length of time due to intransigent wikilawyering. It should receive all the dignity it deserves. Greg L (talk) 15:41, 22 April 2009 (UTC)[reply]

Throwing fuel onto the fire - Google Timeline

Google has recently (last few days?) put up a new method of data aggregation, Google Timeline. It combines metadata from several sources to create an at-a-glance timeline for the information and probably will be expanded in the future. Presently it is pulling three streams of data from WP: "Wikipedia events", "births" and "deaths".

Unfortunately, I can't find out how they are pulling date data, but the last thing we want to do is limit what they are doing. I realize it is entirely possible they are pulling from unlinked data, but it would be helpful to know if they are in any way taking advantage of date autoformatting or if they're using other means. --MASEM (t) 01:33, 21 April 2009 (UTC)[reply]

  • Google's success isn't an accident. I don't think the sages there would build an entire timeline system relying on something which they couldn't control, and which could change at any minute. Ohconfucius (talk) 02:13, 21 April 2009 (UTC)
    • And even if they did, Google shouldn't have any bearing on how we do things here. That's not our problem. Sillyfolkboy (talk) (edits) 02:21, 21 April 2009 (UTC)[reply]
I agree we are not a back end for google applications, but regarding all these map and now timeline applications, if Wikipedia is not the premier destination for their links, we are doing something wrong. Strategically, I think we should feel little threat from them, and ought to regard them as doing valuable R&D for the Foundation. Here's what I mean. In 1994, the commercial publishers were the last word in electronic encyclopedias. Wikipedia has left them in the dust. Similarly, long term, it is inevitable that the Foundation will provide free software that supplants Google Earth and these Timeline things. As an engineer, I recognize that these visualization systems are not trivial, but the technology is a relatively stationary target, and ultimately the power of collaborative systems will leave Google Earth and Timelines in the dust. So we should welcome them and see how our material best works with theirs.
As for the specifics of the Google timeline as of the time of this post: They really have not done much work on the data extraction. If you take a look at Year 1865 by Month, you will see that all the graphics for all the battles are the same, and they all link to the same article: American Civil War. Really, they don't need to do much more than code filters for a half dozen infoboxes, and they basically have all they need regardless what we do.
An obvious improvement is to link the time coordinates to location coordinates using Google Earth. Google Earth now allows the encoding of timespans inside markers, and KML supports them so basically all the other virtual earths will be able to follow suit. Some geewhiz and technical observations here. Some elaboration of time, metadata, and strategic implications for Wikipedia as a global knowledge base for these sorts of applications here (link to entire thread). -J JMesserly (talk) 04:27, 21 April 2009 (UTC)
And just to reiterate, it is inconceivable that Google would use the old square-bracket system to locate dates. It is blindingly easy to automatically locate dates in WP's text without them. Tony (talk) 04:34, 21 April 2009 (UTC)
Completely agree.-J JMesserly (talk) 04:54, 21 April 2009 (UTC)[reply]
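A minimal Python sketch of the point about locating dates without square brackets: a regular expression can pick out full dates in either common order from plain text. This is an illustration only and says nothing about how Google actually extracts dates; the function name is hypothetical, and only fully spelled-out dates with four-digit years are covered.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# "12 February 1809" (day-month-year) or "April 13, 2009" (month-day-year).
DMY = rf"\b\d{{1,2}} (?:{MONTHS}) \d{{4}}\b"
MDY = rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b"
PLAIN_DATE = re.compile(f"{DMY}|{MDY}")


def find_plain_dates(text):
    """Return every full date found in unlinked running text, in order."""
    return PLAIN_DATE.findall(text)
```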

"Dates" case and temporary injunction: likely timing?

I have made a formal request for information about when we are likely to see movement on these matters. Tony (talk) 09:00, 21 April 2009 (UTC)[reply]

Turning date linking off in one fell swoop

In the discussion around the DynamicDates config option, I think we missed something discussed very early on. Someone (unfortunately, I think it was UC Bill; what a sad situation that is, regardless of position) suggested a one-line change to the autoformatting code to turn off date linking within the current version of autoformatting. This would *not* turn off autoformatting; it would just fix the "sea of blue", all without editing one article, let alone millions. Before we unleash the bots, I thought it would be good to at least consider that option for a minute. Personally, I think this is just a first step, but I like the effect it would have of addressing the clear consensus regarding unlinking dates quickly while giving time for the options where there was more balance and which might require more discussion. dm (talk) 11:27, 21 April 2009 (UTC)

As you say, this could be a decent first step. It definitely wouldn't be appropriate as the only solution, as many of us opposed because we don't like the increased complexity for editing. It also assumes that autoformatting will continue, and I don't think there is consensus for that. Karanacs (talk) 15:57, 21 April 2009 (UTC)[reply]
... there are two long sections above ("action plan" and "bug filed") discussing that very idea, its repercussions, why it's not that simple, etc. Sssoul (talk) 16:03, 21 April 2009 (UTC)[reply]
I believe we're talking about two different things, Sssoul. Above is discussion about the Dynamic Dates config line which would disable autoformatting entirely. I'm referring to a patch which would not touch autoformatting per se (which is what all the complication above appears to be about) but merely modify it to not link dates. dm (talk) 02:58, 22 April 2009 (UTC)

(outdent) ah - sorry! you mean the patch that uses linking markup for dates but instead of linking them it autoformats them, apart from serious problems handling punctuation and date ranges? that was thoroughly discussed while the last RfC was being constructed, and even the pro-DA people decided it was not a good idea to propose that route to the community. someone else can i'm sure direct you to that discussion. meanwhile it's very hard to see any justification at all for a "first step" that would entail complicating the editing process simply to prolong the existence of a function that the community doesn't want: that patch would entail eye-glazing instructions for using double square brackets to link/not to link and for punctuating bracketed dates. Sssoul (talk) 04:47, 22 April 2009 (UTC)[reply]

  • I think I'll stop trying to explain this now, because clearly what I thought in Good Faith would help move this forward is something that you're prepared to keep arguing will not. At this point, I'm sure we could find arguments against gravity thoroughly discussed in the talk pages of MOSNUM, but I'll let you find those for yourself. dm (talk) 11:09, 22 April 2009 (UTC)[reply]
  • This has been the problem all along. Any solution, no matter how intuitive or well reasoned, will be shunned or argued against if it doesn't involve mass delinking of dates via bots. Apparently Lightmouse is the way, the truth and the light, and anything else is... well, clearly not good enough. They've apparently "won" something, and they want their trophy (all dates sans square brackets), even if that doesn't have consensus. —Locke Coletc 13:58, 22 April 2009 (UTC)[reply]

That would be a silly kludge which would turn off all autoformatted links, including the ones which do comply with the new WP:LINK#Chronological items guidelines, and wouldn't turn off any non-autoformatted link, including the seventeenth occurrence of 2007 (without a day link) in the same section. --A. di M. (formerly Army1987) — Deeds, not words. 17:52, 22 April 2009 (UTC)

I don't know that I'd call it a kludge so much as a stop-gap solution so auto formatting can be salvaged without keeping all the links intact. And from my perspective it's a reasonable compromise considering I want to keep all date links (the effect here is that I lose all the date links, but they can be manually added where appropriate). —Locke Coletc 19:21, 22 April 2009 (UTC)[reply]

Article list

Apologies for the misuse of the {{seealso}} template, but I think it's important to keep discussion as centralised as possible, or at least have links to where discussion has taken place.

I see that Ryan is suggesting that removal of date links should not be done by bot. I'd like to disagree with that opinion and give the reasons:

  1. In my humble estimation, there are millions of links within articles leading to date articles which are not germane to the subject and offer no value to the reader of the article. I submit that the community has clearly made its wishes known and that those links should be removed;
  2. In my humble estimation, there are no more than a handful of links within articles leading to date articles which are germane to the subject. So few, in fact, that they could be easily enumerated.

If I am correct, then a list of articles containing useful date links would be simple to compile. This could then be used as an exclusion list for any bot tasked with delinking dates. The "thought need[ed] to go into each and every one to decide whether that is important" (actually relevant is the correct adjective) would then all be done prior to a bot run. Manually delinking millions of dates is a complete waste of editors' time, when a bot could accomplish the identical task far more accurately and rapidly. The other advantage of making a page which lists exceptions is that the arguments about whether the articles 12 February or 1809 are relevant to the article Charles Darwin, etc. could be kept in one place and out of the article itself.

As I believe I'm right about the paucity of relevant links to date articles, I'll start by nominating an article that I believe contains a germane date link. If I'm wrong, then other editors should be able to list far more examples. I don't believe that can happen.

  • Article MM contains a relevant link to date article 2000

--RexxS (talk) 23:59, 23 April 2009 (UTC)[reply]
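The exclusion-list idea above could be sketched in Python roughly as follows. This is a hypothetical illustration, not an existing bot: the function name and the shape of the list are assumptions. Exceptions are keyed on (article, link target) pairs rather than on whole articles, so a page can keep one relevant date link while losing its other date links. Piped links are left alone in this sketch.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Plain [[target]] links only; piped links such as [[2000|MM]] do not match.
LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")

# A bare year, "Month day", or "day Month".
DATE_TARGET = re.compile(
    rf"^(?:\d{{1,4}}|(?:{MONTHS}) \d{{1,2}}|\d{{1,2}} (?:{MONTHS}))$"
)


def delink_dates(article, wikitext, keep):
    """Remove brackets from date links unless (article, target) is in keep."""
    def replace(match):
        target = match.group(1)
        if DATE_TARGET.match(target) and (article, target) not in keep:
            return target          # drop the brackets, keep the text
        return match.group(0)      # non-date link or listed exception
    return LINK.sub(replace, wikitext)
```

With keep = {("MM", "2000")}, the link to 2000 on the MM page survives, while date links in other articles are reduced to plain text.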

  • I agree that bots should be used, and that an exclusion list might be a solution; but remember that an article could have a relevant date link as well as irrelevant date links. I'd have to be honest and say that I don't believe the link to 2000 is relevant in the MM article. Someone might be interested to find out that MM and 2000 can be synonymous, but why that means they would be interested in finding out what else happened in 2000 is beyond me.  HWV258  00:31, 24 April 2009 (UTC)[reply]
  • Well, MM is a disambig. page, whose function is supposed to point readers to different articles; if you think the reader wouldn't be interested in the contents of 2000 there shouldn't be any entry about it on the page; --A. di M. (formerly Army1987) — Deeds, not words. 01:24, 24 April 2009 (UTC)[reply]
  • I understand the point you are making, however I still feel it is okay to associate MM with 2000, but without necessarily linking to 2000. For example, a reader might plug "MM" into WP and say "ah, so it means 2000 does it". Note that there are other entries on that page that have no link, e.g. "Missing Men, a Sky Sports game" (although that might be because no one has created the page yet).  HWV258  01:58, 24 April 2009 (UTC)[reply]
The purpose of a dab page is to direct readers to articles. From WP:DAB: "Each bulleted entry should, in almost every case, have exactly one navigable (blue) link". Anyone searching for "MM" (for example, if they saw it at the end of a film) ought to be able to reach 2000 from that dab page. Annoyingly, they ought to be able to reach 2000 in film as well, but can't! Frankly, I'd either remove "Missing Men, a Sky Sports game" or red-link it, then remove it if nobody creates it after a short time (and that's being generous). --RexxS (talk) 13:43, 24 April 2009 (UTC) On second thoughts, I've red-linked it myself. Please feel free to delete the entry if I forget. --RexxS (talk) 13:56, 24 April 2009 (UTC)[reply]
  • Can we all be very careful to specify whether we mean full (three-part) dates or date fragments (month-day items and years)? I can see confusion creeping in here. First, the proposal was that a Lightbot remove the square brackets around only full dates (February 5, 1972). These full items are what we normally think of as date autoformatting. Although it's true that month-day links (July 19) are by default autoformatted because of the unfortunate piggybacking of DA on top of wikilinking, these two-component dates were never part of the proposal for mass treatment by Lightmouse (see his talk page). The reason is that Option #1 in the month-day question (Q2) of the RFC left open the rare possibility that a month-day item might indeed meet the relevance test for linking to its month-day article. Solitary year links, the subject of Q3, were excluded from the Lightbot proposal for the same reason. The proposal deliberately avoided the administrative and political issue of mass bot removal of these items because the community has endorsed a relevance test, albeit a very tight one. On the contrary, three-item full dates are not subject to a relevance test, and this was never at issue in Q1 of the RFC. Tony (talk) 12:41, 24 April 2009 (UTC)
I take your point, Tony, but please consider this: a full (three-part) date not only autoformats, but produces links, because of the crazy system we have at present. Any of the date-delinking objectors could claim that the original editor intended not only to autoformat, but also to produce one or two links. They then have a perfect excuse to object to using a bot to remove the markup around full dates, "since the bot cannot determine the original intention and may be removing a relevant link". It is far better to sideline these objections before a bot run. I am sure that a bot will eventually have to be used to remove the massive amount of useless date links, both of the full- and fragment- variety. For that reason, I feel we need a solution that is applicable to both varieties, although I can see sense in proceeding carefully. --RexxS (talk) 13:43, 24 April 2009 (UTC)
Why are we still acting as if autoformatting has support for remaining? The poll went clearly against it. The best action is probably to remove the misguided javascript that does autoformatting. Shoemaker's Holiday (talk) 23:25, 27 April 2009 (UTC)[reply]


What does this solve?

Does this result actually solve anything or make edit warring any less likely? Surely we will now get into arguments over what 'germane to the subject' actually means in practice. It could be argued that year links are in some cases by definition germane to the subject if they add historical context to a subject of international politics, say. G-Man ? 23:15, 27 April 2009 (UTC)

Well, it sorted out the autoformatting issue. The rest can percolate through at whatever speed. Shoemaker's Holiday (talk) 00:20, 28 April 2009 (UTC)[reply]

Cleanup up poorly formatted dates

Please see a discussion at WT:Manual of Style (dates and numbers)#Cleanup up poorly formatted dates. -- Tcncv (talk) 06:05, 25 May 2009 (UTC)[reply]

Date unlinking bot proposal

The community RFC about a proposal for a bot to unlink dates is now open. Please see Wikipedia:Full-date unlinking bot and comment here. --Apoc2400 (talk) 10:37, 22 June 2009 (UTC)[reply]

Protected edit request on 8 June 2022 - Deprecated source tags

Could all the <source> tags please be replaced with <syntaxhighlight> tags per Category:Pages using deprecated source tags? Aidan9382 (talk) 17:57, 8 June 2022 (UTC)[reply]

 Not done however, I've unprotected this old page; that being said I don't see any of that tag in the text - so check carefully. — xaosflux Talk 13:25, 9 June 2022 (UTC)[reply]
@Xaosflux: Unfortunately, you unprotected the wrong page. The issue is on a subpage of this page (See the precise edit request location), and that page is still protected, so I can't fix the issue. Aidan9382 (talk) 14:29, 9 June 2022 (UTC)[reply]
@Aidan9382: unprotected that one too now - go for it! — xaosflux Talk 14:37, 9 June 2022 (UTC)[reply]