Wikipedia:WikiProject Red Link Recovery/RLRL

This page is for the discussion of the Red Link Recovery Live tool, hosted on Toolforge at https://tools.wmflabs.org/tb-dev/RLRL.


Ideas for new data sets

  • Pinyin vs Wade-Giles transliteration of Chinese (e.g. [1])
  • Try triple-metaphone for enhanced accuracy of foreign language titles
  • Generally, a weighted Levenshtein distance
  • Abbreviations in general (ltd/ltd./limited)
  • Removal of disambiguation features
  • Names with alternate spellings (Mohammed, Muhammad)
  • More homonyms and homoglyphs
  • Specialised number handler in double metaphone
  • All the tricks used in the MediaWiki Lucene-search, as discussed here
    • Word-order swapping
    • Ignoring stop-words
    • Context-free synonyms (see this list)
    • Linguistic rephrasing (Politics of Africa -> African Politics) - tried, Jan 2011 - initial results look good.
  • Latin prefixes as separate or hyphenated words, e.g. "non-sequitur", "anti personnel"

- TB (talk) 22:34, 15 December 2010 (UTC)[reply]

Also:
* Careless ordinal numbers, e.g. 2rd, 23nd, 55d
* Nouns vs verbs; sprint, sprinting, sprinter
* Inferred links. If A->B, A may be a likely alternate target for a red link on B
- TB (talk) 23:42, 20 November 2011 (UTC)[reply]
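
The 'careless ordinals' idea above could be sketched roughly as follows. This is only an illustration, not RLRL's own code; the names `ordinal_suffix` and `fix_ordinals` are made up for the example, and the corrected text would still need to be matched against existing article titles.

```python
import re

# A run of digits followed by something that looks like an ordinal suffix,
# including the truncated "d" seen in links such as "55d".
ORDINAL = re.compile(r"\b(\d+)(st|nd|rd|th|d)\b")


def ordinal_suffix(n: int) -> str:
    """Return the correct English ordinal suffix for n."""
    if n % 100 in (11, 12, 13):
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def fix_ordinals(text: str) -> str:
    """Rewrite every ordinal-like token in text with its correct suffix."""
    return ORDINAL.sub(
        lambda m: m.group(1) + ordinal_suffix(int(m.group(1))), text
    )
```

A lower-case "3d" meaning three-dimensional would also be rewritten, so results from a rule like this would need the usual human check.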

Stats terms

Can you explain some of the terms in the stats screen?

  • Lost
  • Live results
  • Links vs Titles?

Also, how does Update Stats work?

Great tool. welsh (talk) 09:13, 19 December 2010 (UTC)[reply]

  • 'Lost' is a catch-all status for suggestions that aren't New, Checked, Old or Fixed. So far it's not seen much use other than during the process of adding or removing new data-sets from RLRL. Suggestions for a better term to use are welcome - the best I can come up with is 'out of play'.
  • 'Live results' are those generated on-demand when the 'check a specific page' tool is used. The emphasis in producing the live results is on generating many reasonably plausible suggestions using whichever methods can be run quickly, as opposed to the far more selective and time-consuming methods used to generate the pre-prepared suggestions.
  • 'Links' and 'Titles' refer to whether the text of a red link or the title of an existing article has been modified to produce the suggested match - it's a bit of an artificial distinction to be honest. They're tracked separately as I'm trying to measure the difference in the accuracy of article titles compared to the accuracy of wikilinked text. I believe that more 'thought' goes into choosing the title of an article than into choosing a portion of text within an article or link, and that I can use this fact to further fine-tune future sets of suggestions.
  • 'Update Stats' - Before any suggestion is displayed, checks are run to make sure that the original red link still exists, that it is not now blue and that the suggested new target for it still exists. The 'update stats' makes exactly the same checks, but applies them to every 'in play' (new, checked or old) suggestion in the system.
Happy to elaborate, and very glad that you and others are finding the system useful. - TB (talk) 14:02, 19 December 2010 (UTC)[reply]
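
The three checks behind 'Update Stats' described above can be sketched like this. It is a minimal illustration that assumes the wiki's current state is available as two in-memory sets; the `Suggestion` class and the function names are hypothetical and not taken from RLRL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    page: str      # article that contains the red link
    red_link: str  # title the red link points at
    target: str    # suggested replacement title


def still_valid(s, links, titles):
    """Apply the three checks to one suggestion.

    links:  set of (page, linked_title) pairs currently present in the wiki
    titles: set of titles that currently exist
    """
    link_still_present = (s.page, s.red_link) in links
    link_still_red = s.red_link not in titles
    target_still_exists = s.target in titles
    return link_still_present and link_still_red and target_still_exists


def update_stats(in_play, links, titles):
    """Keep only the 'in play' suggestions that pass every check."""
    return [s for s in in_play if still_valid(s, links, titles)]
```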

Check page updated, 2010-12-23

I have corrected a problem in one of the main algorithms used on the 'check page' tool. Suggestions for titles in Eastern European and Asian languages should now be much more accurate. - TB (talk) 18:43, 23 December 2010 (UTC)[reply]

New suggestions - genitives

800 or so new suggestions posted. This is a new technique, comparing noun and genitive forms of country names to try and match links and titles such as Guatemalan politics and Politics of Guatemala. - TB (talk) 22:55, 8 January 2011 (UTC)[reply]
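
A rough sketch of the rephrasing described above, assuming a hand-made table of noun/adjective pairs. The `DEMONYMS` table and the `rephrasings` function are illustrative only; the actual word lists and matching rules used for this set are not given here.

```python
# Hypothetical noun -> adjective table; the real set used continent,
# country and U.S. state names.
DEMONYMS = {
    "Guatemala": "Guatemalan",
    "Africa": "African",
    "Iraq": "Iraqi",
    "Texas": "Texan",
}


def rephrasings(title):
    """Return alternative phrasings of title ('X-an topic' <-> 'Topic of X')."""
    out = set()
    for noun, adjective in DEMONYMS.items():
        prefix = adjective + " "
        if title.startswith(prefix):
            topic = title[len(prefix):]
            out.add(f"{topic[:1].upper()}{topic[1:]} of {noun}")
        marker = f" of {noun}"
        if title.endswith(marker):
            topic = title[:-len(marker)]
            out.add(f"{adjective} {topic[:1].lower()}{topic[1:]}")
    return out
```

Each phrasing returned for a red link would then be looked up among existing article titles, and only hits would become suggestions.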

This set is completed now (578 good suggestions, 182 bad ones, >75% success). While initially run with continent, country and U.S. state name pairs (e.g. Africa/African, Iraq/Iraqi, Texas/Texan), it strikes me that there are many other classes of words that are commonly used in possessive form - consider Socialism/Socialist, Philosophy/Philosophical, Mathematics/Mathematical, Religion/Religious. Now I've been trying to hunt down any useful list of such words and am drawing a blank. Nor can I come up with any even remotely useful set of rules for detecting or generating the possessive form of English language words. Anyone got any ideas? - TB (talk) 13:51, 14 August 2011 (UTC)[reply]

New suggestions - honorifics

1100 or so new suggestions posted. These suggestions have been generated by removing the honorific 'Sir' from the start of red links and checking what remains against existing article titles. If anyone can find and link the relevant manual of style section or spinoff, that'd be a big help, ta. - TB (talk) 22:03, 27 May 2011 (UTC)[reply]
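
The honorific matching described above might look something like the sketch below. The function names are invented for the example, and the set of existing titles is assumed to be available in memory.

```python
def strip_honorific(link, honorific="Sir"):
    """Return link without a leading honorific, or None if it has none."""
    prefix = honorific + " "
    return link[len(prefix):] if link.startswith(prefix) else None


def honorific_suggestions(red_links, titles):
    """Map each red link to an existing title found by dropping 'Sir'."""
    suggestions = {}
    for link in red_links:
        bare = strip_honorific(link)
        if bare and bare in titles:
            suggestions[link] = bare
    return suggestions
```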

MOS:HONORIFICS - welsh (talk) 16:51, 6 August 2011 (UTC)[reply]
Ah, thanks - the very thing. I only wish I were wiser having read it. My understanding is that the use of 'Sir' in an article title is basically optional. Oh well - at least marrying up instances of omitted/present honorifics in links and article titles will assist in any future standardisation. - TB (talk) 20:57, 6 August 2011 (UTC)[reply]

New suggestions - ampersands

1620 new suggestions involving the expansion of ampersands (&'s) in titles and red links into the word 'and'. Initial check seems good, think this one'll be a keeper. - TB (talk) 13:40, 14 August 2011 (UTC)[reply]
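
As an illustration of the ampersand expansion above, the sketch below only expands ampersands that stand alone between spaces. That restriction is an assumption made for the example (it leaves names like AT&T untouched); the rule actually used for this set is not stated.

```python
import re

# An ampersand with whitespace on both sides.
SPACED_AMPERSAND = re.compile(r"(?<=\s)&(?=\s)")


def expand_ampersands(title):
    """Replace stand-alone ampersands with the word 'and'."""
    return SPACED_AMPERSAND.sub("and", title)


def same_apart_from_ampersands(link, title):
    """True if link and title differ only in '&' versus 'and'."""
    return link != title and expand_ampersands(link) == expand_ampersands(title)
```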

New suggestions - unspecified counties

1920 new suggestions. Where there's a red link in the form 'Place, Some County, State', the middle bit has been removed and the results matched against article titles. Horrible, I know ;) Almost certainly <50% accurate and never to be generated again, but there'll be some useful red link fixes and missing disambig pages for sure. - TB (talk) 15:13, 14 August 2011 (UTC)[reply]
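
Removing the 'middle bit' as described above can be sketched with a single regular expression. The pattern and the `drop_county` name are assumptions for illustration; the result would still have to be matched against article titles.

```python
import re

# 'Place, Some County, State' with exactly three comma-separated parts.
COUNTY_FORM = re.compile(r"^(?P<place>[^,]+), [^,]+ County, (?P<state>[^,]+)$")


def drop_county(link):
    """Return 'Place, State' for a 'Place, Some County, State' link, else None."""
    m = COUNTY_FORM.match(link)
    return f"{m['place']}, {m['state']}" if m else None
```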

This set isn't working well - almost every match seems to be a missing disambiguation page :( - TB (talk) 14:27, 28 August 2011 (UTC)[reply]

New suggestions - leading the's

10000 new suggestions involving removing a leading 'The' from red links. Results seem modestly good. - TB (talk) 14:47, 15 August 2011 (UTC)[reply]

Suggestion likeliness updated

I've revamped the algorithm used to estimate the likely correctness of any given suggestion, largely because the old one was ranking the "unspecified counties" set unduly highly. Hopefully this will push the 'better' suggestions to the top of whatever list(s) people are working through. - TB (talk) 10:46, 28 August 2011 (UTC)[reply]

New suggestions - Limited companies

A small set of links and titles ending with Ltd/Ltd./Limited - if it works well, I'll add in some additional suffixes: Corp/Corp./Corporation, Co/Co./Company, Inc/Inc./Incorporated and so forth - TB (talk) 14:24, 28 August 2011 (UTC)[reply]
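
One way to equate these suffix variants is to reduce each title to a comparison key, as in the sketch below. The `SUFFIXES` table and `company_key` function are hypothetical; they simply encode the variants listed above.

```python
# Every listed variant maps to one canonical spelling.
SUFFIXES = {
    "ltd": "limited", "limited": "limited",
    "corp": "corporation", "corporation": "corporation",
    "co": "company", "company": "company",
    "inc": "incorporated", "incorporated": "incorporated",
}


def company_key(title):
    """Return title with a trailing company suffix in canonical form."""
    words = title.split()
    if not words:
        return title
    last = words[-1].rstrip(".").lower()
    if last in SUFFIXES:
        words[-1] = SUFFIXES[last]
    return " ".join(words)
```

A red link and an article title that are different strings but share a key would become a suggestion.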

Three more sets of these posted - all small sets, all look to be useful. - TB (talk)

Autofix javascript updated

I've made some minor improvements to the autofix javascript at User:Topbanana/RLRL SR Utility.js. If you use autofix links and are finding that the search and replace is matching things it should not, you may need to refresh your browser's copy of the file. To do this, click on the 'preferences' link on any Wikipedia page, then 'appearance', then the 'Custom JavaScript' link for your selected skin. Follow the instructions at the top of the page there. - TB (talk) 16:27, 28 August 2011 (UTC)[reply]

New suggestions - Dash-like characters

A set of 1100 new suggestions generated by equating the various unicode dash-like characters. - TB (talk) 22:00, 30 August 2011 (UTC)[reply]
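
Equating dash-like characters can be sketched by folding them all to a plain hyphen before comparing. Which characters RLRL treated as dash-like is not stated; this example assumes the Unicode 'Pd' (dash punctuation) category plus the minus sign, which Unicode classes as a maths symbol rather than a dash.

```python
import unicodedata

# U+2212 MINUS SIGN is category 'Sm', so it is added by hand (an assumption).
EXTRA_DASHES = {"\u2212"}


def fold_dashes(text):
    """Replace every dash-like character in text with '-'."""
    return "".join(
        "-" if unicodedata.category(ch) == "Pd" or ch in EXTRA_DASHES else ch
        for ch in text
    )
```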

Cross-wiki support

I'm in the process of modifying RLRL to support other Wikimedia projects (Wiktionary, Wikisource and so forth) and more usefully, other languages. Apologies if I break any of the English Wikipedia processing as I do so. - TB (talk) 21:31, 11 September 2011 (UTC)[reply]

This proceeds well; RLRL is now set up for the English-language Wiktionary and Wikiquote sites, and the French- and German-language Wikipediae. What's now needed are some guinea pigs. I'll contact Restauration lien rouge for the former, but cannot find a German-language equivalent of this project. An initial check shows that Wiktionary has very different editing policies from Wikipedia, and therefore might need careful consideration.
So, questions:
Ta. - TB (talk) 06:30, 16 September 2011 (UTC)[reply]
I have made a couple of edits so far on deWiki. As I am about to do some travelling I will lie low for a week or so, but I will then do a batch of 50 and see if there is any fallout. Agathoclea (talk) 05:58, 29 September 2011 (UTC)[reply]
That's great, thanks. - TB (talk) 06:25, 29 September 2011 (UTC)[reply]

Checkpage changes

I've made some significant changes to the "check a specific page" tool - it should give the same results as before but now be dramatically quicker to return results. If anyone notices it behaving oddly, please let me know. - TB (talk) 06:33, 16 September 2011 (UTC)[reply]

New suggestions - Templates

A set of 1700 new suggestions. These are red links into namespace 10 (the template namespace) matched against existing template names by brute force. In essence, the Levenshtein distance between each pair of 30000 candidate red links and 300000 candidate templates was calculated, and those with an edit distance of 1 were retained as suggestions. Entries where the entire edit consisted of numerals were suppressed.

Although far less elegant than other approaches, the brute-force Levenshtein has a greater chance of finding a suggestion for an arbitrary red link than other methods. Of course, this means a resulting decrease in the likelihood of the suggestion being correct, but I can see potential for its use. - TB (talk) 21:26, 19 September 2011 (UTC)[reply]
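
The brute-force pass described above can be sketched as below. Rather than computing a full Levenshtein distance, this example uses a shortcut that only answers "is the distance exactly 1?", which is all the stated rule needs; that substitution, and the names `single_edit` and `suggest`, are choices made for the illustration and not the tool's actual code.

```python
def single_edit(a, b):
    """Return the characters involved if a and b are exactly one edit apart.

    A substitution yields (old_char, new_char); an insertion or deletion
    yields (char,). Anything else yields None.
    """
    if a == b or abs(len(a) - len(b)) > 1:
        return None
    i = 0  # length of the common prefix
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return (a[i], b[i]) if a[i + 1:] == b[i + 1:] else None
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    return (longer[i],) if longer[i + 1:] == shorter[i:] else None


def suggest(red_links, templates):
    """Pair every red link with templates one edit away, skipping numeral-only edits."""
    pairs = []
    for link in red_links:
        for template in templates:
            chars = single_edit(link, template)
            if chars and not all(c.isdigit() for c in chars):
                pairs.append((link, template))
    return pairs
```

The nested loop compares every link with every template, which is why a run over 30000 links and 300000 templates is expensive.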

Domains

I love this tool and think it's great - quick question though - I notice that there are datasets for LGBT, doctor who, and (I think) maritime warfare - what are the actions I'd take if I wanted to do something similar for the disability project? Failedwizard (talk) 16:58, 1 October 2011 (UTC)[reply]

Welcome on board, Failedwizard. Posting here's as good a method as any ;) I'm still tinkering with the system used to generate the domain-specific sets; hopefully I'll have all the rough edges knocked off the process in the next few days and can offer the service to any and all Wikiprojects that would like their own list. For now:
Suggested fixes for 51 of these are now available in RLRL. - TB (talk) 18:54, 1 October 2011 (UTC)[reply]

You star! Failedwizard (talk) 19:34, 1 October 2011 (UTC)[reply]

Fabulous, thank you so much... have fixed all the ones that seemed sensible...(will come back to manually look at the others) by the way - is this an open-source thing that people can help out with...? my code writing is better than my wiki-fu... and the other quick question was - would it be possible to get the breakdown of, well, which of the 1412 red links are in which of the 494 articles? Failedwizard (talk) 21:09, 1 October 2011 (UTC)[reply]
RLRL runs on the Toolserver and is available under the AGPLv3. Happy to share - most of the fun's in the generation of the suggestions rather than the tool to store and serve them up. I'll post up a report on those red links in the next day or two, no problem. - TB (talk) 21:49, 1 October 2011 (UTC)[reply]
Redlink report posted at User:Failedwizard/Disability redlinks. Enjoy ;) - TB (talk) 10:49, 2 October 2011 (UTC)[reply]
Awesome, and still more awesome... *rolls up sleeves* Failedwizard (talk) 12:58, 2 October 2011 (UTC)[reply]

Toolbox bugs

  1. autofix is not working for me to correct the links in the articles. I have to manually change the redlinks that are spelled incorrectly. (redirect autofix works, though)
  2. An option to check for this suggestion is completely wrong. Currently changing something means right and not changing is wrong. This leaves no room for maybe and it is non-intuitive. (At least for me, viewing something without changing it should not be construed as it is wrong.)
  3. checking minor fix and unchecking watch this page would be appreciated

TStein (talk) 05:37, 9 October 2011 (UTC)[reply]

Hi there TStein.
  1. I'll need to wait until I'm home from my hols to check why autofix isn't working for you.
  2. Suggestions that have been looked at two or three times without being fixed are marked as 'old' rather than 'wrong'. This is a deliberate distinction; the 'cost' of someone checking a suggestion is large compared to the cost of generating it. If a suggestion has been shown a few times and not been fixed, best to skip it and move on. I do check the logs every month or two and correct any obvious 'queue draining' activity.
  3. Good idea about the checkboxes on redirects - that annoys me also. Again, I'll be able to check the scripts involved when back home.
- TB (talk) 12:43, 10 October 2011 (UTC)[reply]
Thanks for the response. I don't know if you have had a chance to work on #1 but it still doesn't work for me. (That isn't a real problem, it doesn't really bother me.) As far as number 2, I am not sure what I was trying to say there. I probably edited the sentence too many times and got confused. What inspired it, though was that sometimes I would go through a fair amount of work to investigate whether the recommended fix was good or not and then find out that it was not a good fix. In that case, I thought that an option to say that this suggestion is wrong would help so that other people wouldn't have to duplicate my work.
Finally, there is one more way that you may be able to improve this. In physics the vast majority of red-links by far are biographies and the suggestions are almost never correct (if ever). If it was possible (say by looking for a biography template or such) could you give an option to not count biographies? TStein (talk) 04:34, 12 October 2011 (UTC)[reply]

Housekeeping

I've removed around 8500 of the least-likely suggestions from the 2010 lists to try and aid in getting them closed off. - TB (talk) 19:47, 13 November 2011 (UTC)[reply]

Bank error in your favour, collect £100

Found and fixed an accounting error on the stats page. After 18 months, just shy of 74000 red links have been recovered. Of the pre-calculated suggestions that have been processed, 86.65% were 'correct' enough to result in a fix. Not too shabby :) - TB (talk) 18:38, 20 January 2012 (UTC)[reply]

Stats update

Just noting that last night we passed the 50,000 red links fixed mark on the pre-calculated set of suggestions in the Red Link Recovery Live tool. Adding in the 27,000 red links fixed using the suggestions generated on demand, this represents a significant contribution towards improving Wikipedia.

Over the life of the Red Link Recovery Project, we've now fixed over 425,000 red links - at current rates we'll reach a half a million towards the end of next year. Good job all. - TB (talk) 11:30, 6 March 2012 (UTC)[reply]

Not working

Is this thing broken, or is there something wrong with my browser? It's been working fine the past couple of weeks, but now the 'list of red links with alternate targets' is giving me the same three (already fixed) links over and over. Same with the lists of specific red links. What's going on? DoctorKubla (talk) 19:59, 28 March 2012 (UTC)[reply]

Howdy, Doc. The toolserver on which RLRL runs is partially offline for an update right now. It's been off for 10 or so days now; hopefully it'll be fixed in the next few days. When it is, the tool'll be back to normal. - TB (talk) 10:36, 29 March 2012 (UTC)[reply]
The system's now (mostly) back online - it'll take a few days to catch up with the last week or so's updates to the live database. - TB (talk) 08:25, 4 April 2012 (UTC)[reply]

Comma'd names

How about Comma'd names. eg Bloggs, Fred -> Fred Bloggs? welsh (talk) 06:49, 9 September 2012 (UTC)[reply]

Nice idea - I've popped up a set of 700 or so suggestions generated using this method. If it scores well, a more aggressive run should find another thousand or so. - TB (talk) 15:32, 9 September 2012 (UTC)[reply]
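
The comma'd-name swap suggested above is simple to sketch. The pattern and the `uncomma` name are illustrative assumptions; as with the other methods, the swapped form only becomes a suggestion if an article with that title exists, which filters out place names such as 'Springfield, Ohio'.

```python
import re

# Exactly two comma-separated parts: 'Surname, Forenames'.
COMMA_NAME = re.compile(r"^([^,]+),\s*([^,]+)$")


def uncomma(link):
    """Turn 'Bloggs, Fred' into 'Fred Bloggs'; return None for other links."""
    m = COMMA_NAME.match(link)
    return f"{m.group(2).strip()} {m.group(1).strip()}" if m else None
```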

Unlikely bracket spacing

How about redlinks with unlikely spacing around brackets? Usually an open bracket is preceded by a space and a close bracket followed by a space (unless it is at the end of the link). Also an open bracket is rarely followed by a space. Maybe this should be in the unlikely links tool instead, as this could be fixed even if there is no bluelink solution. welsh (talk) 07:47, 15 October 2012 (UTC)[reply]

Nice one. I've added it to the unlikely links tool (as Spaces near brackets) immediately. A trial of matching red links against titles by adding or removing spaces was run in 2006/07 with fairly poor results, but incorporating a little more knowledge about oddly placed spaces might yield better results. I've added it to my todo list, cheers. - TB (talk) 10:09, 15 October 2012 (UTC)[reply]
That's good. How about brackets with no spaces before or after as relevant? eg Fred(singer). Maybe again as an unlikely links list welsh (talk) 23:01, 15 October 2012 (UTC)[reply]
Have added missing spaces near brackets to supplement the (renamed) extra spaces near brackets above. Mixed results I'd say; not sure if any simple rule defines what's a normal character to follow a close bracket. - TB (talk) 08:47, 16 October 2012 (UTC)[reply]
A closer look indicates problems with neither the suggested rule nor the English language, but rather a fairly large number of poorly punctuated article and redirect titles. I've added functionality to the Unlikely tool to help hunt these down. - TB (talk) 20:59, 17 October 2012 (UTC)[reply]
Okay, I've come up with a test for this that's slightly better than awful. There are still a good few 'correct' titles in the set, but the check is now at least borderline useful. - TB (talk) 22:50, 30 October 2012 (UTC)[reply]
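
The bracket-spacing rules discussed in this thread could be expressed as a handful of regular expressions, as in the sketch below. The four checks and their names are assumptions based on the rules as stated here; the test eventually used in the Unlikely tool is not described.

```python
import re

# Each check flags one kind of unlikely spacing around round brackets.
CHECKS = {
    "missing space before (": re.compile(r"\w\("),
    "extra space after (": re.compile(r"\(\s"),
    "extra space before )": re.compile(r"\s\)"),
    "missing space after )": re.compile(r"\)\w"),
}


def bracket_spacing_problems(title):
    """Return the names of every spacing check that title fails."""
    return [name for name, pattern in CHECKS.items() if pattern.search(title)]
```

Perfectly good titles such as 'f(x)' would be flagged too, which fits the mixed results reported above.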

AUTOFIX now offered for red links on templates

Several of the upcoming datasets focus on broken links within templates. Often, fixing such red links on a single template corrects all instances of it - once of course MediaWiki catches up and propagates the change. To help solve such cases, an AUTOFIX will now always be offered for links on templates. Please do remember to check 'what links here' for the red link 48 hours later if you can, just in case the link existed independently on one or more articles also. - TB (talk) 10:26, 4 December 2012 (UTC)[reply]

RLRL moved to Tool Labs

The new Tool Labs environment will be replacing the Toolserver by the end of 2014. I've moved RLRL over ahead of schedule, and it can now be found at http://tools.wmflabs.org/tb-dev/RLRL/. Kindly report any oddities here. - TB (talk) 15:44, 1 June 2013 (UTC)[reply]

New Suggestions - Renamed Countries

I've posted a small set of suggestions (~250 entries) generated by using previous names of countries (per Geographical renaming). I've no idea how useful these will turn out to be. - TB (talk) 20:30, 13 June 2013 (UTC)[reply]

This set proved to be pretty dismal - of the 219 suggested fixes only 37 resulted in a fix - that's around 17% accurate. Not to be regenerated, although there may be scope to apply the same technique to other renamed entities:
  • Maiden names vs married
  • Commercial enterprises following takeovers/mergers
  • Educational institutions
  • Political parties
  • Re-branded goods (marathon/snickers, I'm looking at you)
  • Sporting events
  • Entertainers/acts (Prince/fishhooksquiggle)
  • Other geographical locations (Russian cities)

- TB (talk) 11:25, 11 August 2013 (UTC)[reply]

AUTOFIX stopped working

Since the switch to https pages AUTOFIX has been blocked by Chrome, "this page includes script from unauthenticated source". Any ideas how to prevent this? I don't see any settings to permit this specific script. welsh (talk) 06:23, 30 August 2013 (UTC)[reply]

Apologies for the tardy response. If Chrome is objecting to a script coming from an http (rather than https) source, the quickest solution is to make it come from an https source. I've made the relevant change to your custom.js file(s). You may need to force your browser to bypass its cache once (CTRL+F5). For other interested parties, the most robust method of importing the AUTOFIX tools is to add the line:
importScriptURI('https://en.wikipedia.org/w/index.php?title=User:Topbanana/RLRL_SR_Utility.js&action=raw&ctype=text/javascript');
to your custom.js file. - TB (talk) 16:38, 28 September 2013 (UTC)[reply]
Belated thanks. It's working fine now. welsh (talk) 22:04, 11 October 2013 (UTC)[reply]

Performance improvement

I have hopefully solved some of the odd delays some people have experienced using this tool. Any problems please let me know. - TB (talk) 21:49, 28 November 2013 (UTC)[reply]

Many new suggestions

Ahum .. have discovered a bug in the suggestion generation process that has been discarding a significant number of likely results. November's "weighted distance" set had 120 or so results originally - re-running it now returns 12700. I've posted these in lieu of December's run for now and may revisit a few of 2013's better sets to see what can be usefully done. - TB (talk) 18:37, 29 December 2013 (UTC)[reply]

Update all stats now not working?

For the past couple of days when I run update all stats now it comes back with zero matches in the report, though it seems to be processing for the normal length of time. welsh (talk) 08:52, 19 April 2014 (UTC)[reply]

OK, got it - there's a 90k second replag. Should've checked. welsh (talk) 11:01, 19 April 2014 (UTC)[reply]
Well spotted; I'd missed that myself. Will look into it. In the meantime I've added the replag indicator back in to the top of the RLRL tool pages. - TB (talk) 17:18, 19 April 2014 (UTC)[reply]

RLRL tools script changes

I've been prompted to update a few of the client-side scripts used by this project. There shouldn't be any noticeable difference if I've done it right, other than possibly some minor speed improvements on modern browsers. As always, please shout if anything misbehaves. - TB (talk) 17:13, 9 June 2014 (UTC)[reply]

New action: move blue link to red title

I've come across quite a few red links that actually were the correct title the article should be under (mostly capitalization changes). It would be nice to offer a "move" or "rename" action next to the blue link, that would open Special:Move, ideally pre-filled with the relevant target, so we could move the current article into the red-linked title. --Waldir talk 17:14, 13 June 2014 (UTC)[reply]

An excellent idea; I'll add this to my todo list. My one (minor) concern would be that by lowering the effort required to move a page, we might encourage people to do so without applying sufficient thought. - TB (talk) 14:12, 15 June 2014 (UTC)[reply]
Well, that would only be easier than a regular move if the target page can be specified in the url (or somehow the gadget kicks in with some javascript to auto-fill the target box). But even so, it's a two-step process, you have to fill the reason and then confirm the move, so I don't think potential impulsiveness would be an issue. Looking forward to this feature :) --Waldir talk 01:11, 19 June 2014 (UTC)[reply]

Why miss the "simplest" change?

Using the RLRL tool I found What Will They Think of Next? in List of programs broadcast by Nickelodeon. The tool offers the "precomputed" What Will They Think Of Next and What Will They Think of Next, but not the "simplest" change What Will They Think Of Next?. (If you want a page that still references the red link, other than this one, I've added it to my sandbox.)

Obviously, with in total four redirects to the original name Science International where all the differences are only case and the final question mark, I'm not that surprised the RLRL was tripped up, but I thought I should report it missing the one character change.

Also, my feature request is to provide a piped autofix option that leaves the current display unchanged.

Mark Hurd (talk) 13:19, 15 June 2014 (UTC)[reply]

Regarding What Will They Think of Next?, this looks like a fault in the logic of the suggestion generating process. Here, the simplest fix (What Will They Think Of Next?) was identified in September 2011 (set 37) and marked as fixed a few months later. When the red link was reintroduced to Wikipedia in early 2014 it should have not only found the same simplest suggestion again but promoted it because it worked last time. I'm guessing that instead it garnered the negative bias a previously incorrect suggestion would. I will look further into this - thanks for reporting it.
Piped fixes have been brought up in the past; in almost every case where you would want to use this facility, creating a redirect is a better solution. Is there a particular class or form of red link you've been wanting the piped option for? - TB (talk) 14:38, 15 June 2014 (UTC)[reply]
Note I've only been using the RLRL tool for a couple of nights so far, so I defer to your experience. To answer your question, however, I have gone back and reviewed my changes: I manually adjusted the change here to keep the displayed é and this was the first change I made using the RLRL -- it must have made an impression :-) I definitely agree that creating a redirect or moving the destination to a correct red link is more likely the right action, when fixing the red link itself is not.
Another feature request is some way to apply multiple changes to one article. Currently I have manually made the extra changes, often using the Find and Replace you've provided, and manually adjusted the Edit summary. Mark Hurd (talk) 15:50, 15 June 2014 (UTC)[reply]
I've tracked down the bug that was suppressing previously successful suggestions and partially fixed it, resulting in a few hundred new suggestions. Well spotted. Alas, no such good news on applying multiple fixes to an article in a single edit - I've had numerous attempts to implement this and have yet to find a workable method. - TB (talk) 22:18, 15 June 2014 (UTC)[reply]
You might want to look at WP:CHECKLINKS — definitely a different approach, but it could work if you provide a number of variations like plural]]s and/or a general manual adjustment. Mark Hurd (talk) 17:44, 16 June 2014 (UTC)[reply]

Sometimes you don't want to change all matches

One more issue I don't know a good UI fix for: I made the mistake of allowing all matches to be changed here and User:Welsh has done the same here. In my case I don't think the UI displayed the multiple matches — I remember because I was surprised it did say 2 replacements made. (BTW I would like the textarea to scroll to the match when I click on it, so I can see the full context.) The RLRL Quicklist pointed me to the newly wrong links in Welsh's case.

Clearly an option is to include \||]] at the end of the regex, but that does then make it harder to correct a plural and other end]]s — where I now adjust the existing find/replace of blahes/blah to blahes]]/blah]]es — and it should cater for trailing spaces anyway.

Obviously, if I'm correct that the UI doesn't always display all changes, that should be fixed, unless it is in some way deliberate, of course. Otherwise enough context around each change is required and/or my suggestion about ensuring the textarea scrolls and perhaps even highlights the target.

Mark Hurd (talk) 11:34, 16 June 2014 (UTC)[reply]

I've had a good check of the two edits listed above; as far as I can tell, the UI is accurately showing the changes being made in each case along with enough context to easily approve the changes. If you find a case where it's definitely misbehaving let me know and I'll pursue this further.
That said, I agree that the current regexp-based search and replace is a rather blunt tool. Cases where the text being modified is a common substring within the article require either manual editing or at the very least fine-tuning of the search and replace strings as you suggest above. Alas, my own expertise lies in the detection of plausible alternate targets for red links; to date none of the attempts to integrate RLRL with the more sophisticated editing tools like AutoWikiBrowser have been fruitful. - TB (talk) 19:56, 16 June 2014 (UTC)[reply]
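
The idea raised earlier in this thread, of requiring a pipe or closing brackets after the matched text, can be sketched as below. This is an illustration in Python rather than the autofix script's own JavaScript, and the `retarget_links` name is invented; it also allows for trailing spaces but deliberately does not handle the plural case (blahes]] to blah]]es).

```python
import re


def retarget_links(wikitext, old, new):
    """Replace old with new only where it is the target of a [[wikilink]].

    Returns (new_text, number_of_replacements).
    """
    pattern = re.compile(
        r"(?<=\[\[)"            # must follow the opening brackets
        + re.escape(old)
        + r"\s*(?=\||\]\])"     # must be followed by a pipe or closing brackets
    )
    # A function is used as the replacement so that any backslashes in
    # `new` are inserted literally.
    return pattern.subn(lambda m: new, wikitext)
```

Plain-text mentions of the same words, and longer links that merely start with them, are left alone.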

AUTOFIX for Creating a Redirect doesn't do much automatically

Just confirming you know, for Chrome 35.0, the AUTOFIX to Create a Redirect only opens the page to edit a new article, with nothing obviously different (except I can see extra stuff in the URL), for me at least. Mark Hurd (talk) 18:09, 21 June 2014 (UTC)[reply]

I've just tried one and it worked for me (Chrome 35.0.1916.153). I suspect the scripts aren't being run or triggered properly, probably due to some weird Chromey security trick. Can you in your common.js change
importScript('User:Topbanana/RLRL SR Utility.js');
to
mw.loader.load('https://en.wikipedia.org/w/index.php?title=User:Topbanana/RLRL_SR_Utility.js&action=raw&ctype=text/javascript');
(remember to reload with CTRL-shift-R / ⌘-Shift-R) and see if it behaves please? - TB (talk) 21:44, 21 June 2014 (UTC)[reply]
That change didn't affect it, but I found the problem: I have turned back on the idea of every edit is minor by default with code in my monobook.js. Unfortunately there is no minor edit checkbox for new pages, so this was silently failing, but also stopping further onload code! I will get it to check for existence before setting it. Mark Hurd (talk) 00:29, 22 June 2014 (UTC)[reply]

Review "Changklan Road"

Why doesn't this offer any suggestions unless you use Brute Force, and then not the "correct page" Chang Khlan Road? (I originally saw this here and placed it in my sandbox again.) Mark Hurd (talk) 17:23, 27 June 2014 (UTC)[reply]

Actually I think it could just be a relatively newly renamed page. Mark Hurd (talk) 17:26, 27 June 2014 (UTC)[reply]
I should point out my previous note was just to probably explain why it did not find the "correct page". I still query why Metaphone finds nothing.
And now I've added Australia & New Zealand Banking Group Ltd. to my sandbox (originally from List of banks in Thailand): I don't mind that Metaphone finds nothing (though I think it probably should), but why does Brute Force find Australian and New Zealand punting glossary and not Australia and New Zealand Banking Group, which hasn't been moved for at least two years. (And there's Australia and New Zealand Banking Group Limited as well.) Mark Hurd (talk) 16:29, 28 June 2014 (UTC)[reply]
The metaphone method depends on a precalculated table that's right now a few months old - I'm guessing I forgot to recreate the cron job that rebuilds it regularly after the move from the toolserver to tool labs. I'll have to look into your second example above - I agree, it looks like it should be finding better results. - TB (talk) 17:21, 28 June 2014 (UTC)[reply]
Ahah - that's a bit better now. Do have a play and let me know of any remaining weirdness. Cheers. - TB (talk) 18:10, 28 June 2014 (UTC)[reply]

I've added clickers to my sandbox: it never offers clicker and only offers anything with Metaphone! (It was originally at Mouse Mischief.) Mark Hurd (talk) 11:27, 29 June 2014 (UTC)[reply]

"Brute force" checks are (somewhat counter-intuitively) more expensive for shorter links; anything less than 9 characters was skipped. The new tool labs server seems a bit more powerful, so I've reduced this to 5 characters for now. I can't vouch for the quality of the results though - a single character change to a 5-character long red link represents a 20% change; there is just not sufficient entropy in short red links to feed the matching process. - TB (talk) 14:10, 29 June 2014 (UTC)[reply]

What is "Suppress redirects" meant to do?[edit]

With the change you've just made we're getting lots of results, so what's "Suppress redirects" meant to do? On my sandbox, which now produces the same results for all 4 types of Thoroughness, changing this checkbox seems to do nothing. Mark Hurd (talk) 01:48, 29 June 2014 (UTC)[reply]

The redirect being suppressed is the one you are typing in. To give an example, if you ask for suggestions for redlinks on B.B.C. (which is a redirect) it gives results for BBC (the target of that redirect) unless you tick the suppression box. This is mostly of use when fixing broken templates on redirects - {{R from diacritics}} being my personal favourite.
If you (or anyone else) have asked for a particular page to be checked recently, there will be results sitting in the database. These are shown alongside any suggestions you have specifically asked for, even if they were generated using a different 'thoroughness' setting. In fact, results are stored on a redlink-by-redlink basis - if a redlink on the page you are checking exists on any page checked recently, there may be other results shown. - TB (talk) 06:34, 29 June 2014 (UTC)[reply]
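The behaviour of the checkbox described above can be sketched roughly as follows. This is illustrative Python only: the `REDIRECTS` table and the `page_to_check` function are hypothetical stand-ins, not the tool's actual code or data.

```python
# Hypothetical redirect table: redirect title -> title of its target.
REDIRECTS = {"B.B.C.": "BBC"}


def page_to_check(title: str, suppress_redirects: bool = False) -> str:
    """Return the page whose red links will be examined.

    By default a redirect is followed to its target; with suppression
    ticked, the redirect page itself is examined instead.
    """
    if suppress_redirects:
        return title
    return REDIRECTS.get(title, title)


print(page_to_check("B.B.C."))                           # follows the redirect
print(page_to_check("B.B.C.", suppress_redirects=True))  # stays on the redirect
```

For a page that is not a redirect, the checkbox makes no difference in this sketch, which would be consistent with it appearing to "do nothing" on a sandbox page.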

Weighted distance 21[edit]

Hi, I've noticed that quite a proportion of fixable suggestions have found their way into the Old category. I am slowly going through them, but it might be worth resetting them back to New status. There are currently about 100 in old of which about 30 at last count are genuine. I don't know whether any other categories are affected. welsh (talk) 06:12, 14 July 2014 (UTC)[reply]

'Old' simply means that a suggestion has been presented to a RLRL user at least twice and not fixed yet. Every red link in Wikipedia is 'fixable' - if not by changing the target or introducing a redirect then by writing a new article or de-linking it. As a set is worked through, what we're really doing is tackling the 'easy to fix' ones (especially when the tool correctly suggests how to fix it) and passing over the 'hard to fix' ones.
There are a few heuristics in place to guard against 'queue draining' - quickly viewing hundreds of suggestions and not attempting to fix any of them, and I can reset sets as you suggest when this happens. That said, the number of red links is so vast that it's more productive all round to just top up the tool with another set of suggestions. Aiming for 90% accuracy, we're getting a few thousand suggestions per month currently, but we could just as happily have ten thousand a month at 80% accuracy. Sifting through the dregs of a previously worked set yields less improvement for the same effort.
If, like me, you are fussy and enjoy tidying up the loose ends of suggestion sets, try the 'expand old suggestions' button on the stats page; this breaks the 'old' suggestions down into more detailed sets, hopefully giving a better indication of when you've wrung the last drops of use from them :) - TB (talk) 15:51, 14 July 2014 (UTC)[reply]

importScriptURI[edit]

Please replace

importScriptURI('https://en.wikipedia.org/w/index.php?title=User:Topbanana/RLRL_SR_Utility.js&action=raw&ctype=text/javascript');

by

mw.loader.load('//en.wikipedia.org/w/index.php?title=User:Topbanana/RLRL_SR_Utility.js&action=raw&ctype=text/javascript');

on toollabs:tb-dev/RLRL/index.php (because importScriptURI is deprecated). Helder.wiki 19:03, 14 July 2014 (UTC)[reply]

Done. Apologies for the delay, testing this change against a sufficient number of languages, projects and browsers took some time. - TB (talk) 13:21, 4 August 2014 (UTC)[reply]

Updates to search and replace JavaScript tools[edit]

I have updated the JavaScript tools at User:Topbanana/RLRL_SR_Utility.js to add a few new features; they should now deal more sensibly with the case of leading characters. I'm hoping to add the ability to carry out suggested page moves in a semi-automated manner, similar to the creation of redirects and modification of link text within articles. Watch this space. - TB (talk) 10:58, 23 August 2014 (UTC)[reply]
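One plausible reading of "the case of leading characters" is that on English Wikipedia the first letter of a link target is case-insensitive, so [[clock tower]] and [[Clock tower]] point at the same page. The sketch below shows how a search pattern could allow for that. It is written in Python for illustration, is not taken from RLRL_SR_Utility.js, and `link_pattern` is a made-up name.

```python
import re


def link_pattern(target: str) -> "re.Pattern[str]":
    """Build a regex matching [[target]] or [[target|label]],
    accepting either case for the first character only."""
    first, rest = target[:1], target[1:]
    upper, lower = first.upper(), first.lower()
    if upper != lower and len(upper) == 1 and len(lower) == 1:
        # Cased letter: accept both forms, e.g. [Cc]
        first_pattern = "[" + re.escape(upper) + re.escape(lower) + "]"
    else:
        # Digits, punctuation, or letters without a simple case pair
        first_pattern = re.escape(first)
    return re.compile(
        r"\[\[" + first_pattern + re.escape(rest) + r"(?:\|[^\]]*)?\]\]"
    )


pattern = link_pattern("Clock tower")
print(bool(pattern.search("See [[clock tower]] nearby.")))      # first letter may differ
print(bool(pattern.search("See [[Clock tower|the tower]].")))   # piped links match too
print(bool(pattern.search("See [[clock Tower]] nearby.")))      # later letters must match
```

Only the first character is relaxed; the rest of the title is matched exactly, since titles are case-sensitive after the first letter.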

Checkpage improvements[edit]

The Red Link Recovery Live tool for checking red links on a specific page has been overhauled. It now makes use of (some of) the tools used to filter the pre-calculated result sets to remove obviously duff suggestions. I've only had a chance to try it on a handful of red-link rich pages, but it seems to be a very effective improvement. Where the suggested target is a redirect, an option to AUTOFIX using the target of the redirect is now shown also. Happy hunting. - TB (talk) 11:26, 7 September 2014 (UTC)[reply]

Catalan Wikipedia implementation[edit]

Hello Topbanana,

A few days ago I asked you to add the Catalan Wikipedia to the tool. It all went OK and worked, but two days ago someone told me that the tool wasn't working. I get the message "This combination of project and language is not configured, sorry." Could you get it working again, if that's not too much work?

Thank you very much!

Gerardduenas (talk) 20:12, 3 October 2014 (UTC)[reply]

All fixed now - it looks like the Catalan Wikipedia database was moved between database servers. I've told the tool where to find things and it's all working again now. - TB (talk) 17:29, 5 October 2014 (UTC)[reply]
Thanks!--Gerardduenas (talk) 14:39, 7 October 2014 (UTC)[reply]

Redlink templates[edit]

Templates 5 on RLRL was fun. Is there a list of all the template calls that are redlinks? Or if that is too large, how about all red templates containing the text stub? welsh (talk) 21:41, 23 February 2015 (UTC)[reply]

There are approximately a quarter of a million red links in the main namespace to the template namespace - too many to comfortably list. Luckily only 600 or so contain the word 'stub' or 'Stub' - I'll pull these out of the database for you when I get a chance. - TB (talk) 08:50, 24 February 2015 (UTC)[reply]
Posted at User:Welsh/stub_redlinks for you. Have fun. - TB (talk) 15:16, 25 February 2015 (UTC)[reply]
Cool, thanks. welsh (talk) 22:02, 26 February 2015 (UTC)[reply]

Autofix[edit]

Is autofix supposed to automatically make the change? I find that I have to manually make the change when it brings me to the edit page for the article. --Arise again, Arisedrew! (talk) 23:36, 24 March 2015 (UTC)[reply]

Howdy, AaA - welcome to Red Link Recovery. I've checked your custom JavaScript page and it looks correct to me. AUTOFIX links should open the article in edit mode with some additional search and replace tools at the top. For example this AUTOFIX link should show:
If it's not doing the above, let me know which web browser and Wikipedia skin you are using and I'll see if I can replicate your problem. - TB (talk) 09:05, 25 March 2015 (UTC)[reply]

New suggestion - Sir or Dame[edit]

I've been working on pages regarding Fellows of the Royal Society. There are lots where the link contains Sir (and some where it contains Dame) but that's a redlink. e.g. Sir James Baddiley vs James Baddiley, or Dame Honor Bridget Fell vs Honor Bridget Fell. Should be easy fixes? Thanks. Tassedethe (talk) 20:23, 26 April 2015 (UTC)[reply]

Sounds good; similar runs have been done using professional titles ('Doctor') and military ranks ('General') in the past with reasonable results. I've popped up an initial set of 300 or so suggestions based on leading 'Sir'/'Dame's to see how it goes. - TB (talk) 10:44, 27 April 2015 (UTC)[reply]
That's great, thanks! Tassedethe (talk) 00:23, 28 April 2015 (UTC)[reply]
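The idea behind this kind of run can be sketched as follows: strip a leading title from the red link and see whether what remains is an existing article. This is an illustrative Python sketch under that assumption only; the names `LEADING_TITLES`, `strip_leading_title` and `suggest` are hypothetical, and the real run works against the database rather than an in-memory set.

```python
# Titles to try removing from the front of a red link (illustrative list).
LEADING_TITLES = ("Sir", "Dame")


def strip_leading_title(red_link, titles=LEADING_TITLES):
    """Return the red link without its leading title, or None if it has none."""
    for title in titles:
        prefix = title + " "
        if red_link.startswith(prefix):
            return red_link[len(prefix):]
    return None


def suggest(red_link, existing_titles):
    """Suggest the untitled form only if an article of that name exists."""
    candidate = strip_leading_title(red_link)
    if candidate is not None and candidate in existing_titles:
        return candidate
    return None


articles = {"James Baddiley", "Honor Bridget Fell"}
print(suggest("Sir James Baddiley", articles))
print(suggest("Dame Honor Bridget Fell", articles))
```

Requiring the space after the title keeps names such as "Sirius" from being treated as "Sir" plus a name.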

New suggestions - Religious titles[edit]

As per "Sirs and Dames" above, but covering a wider selection of religious titles culled from Index of religious honorifics and titles. The set is unusually large at over 3000 entries; I'll be aggressively filtering it over the next few days. Titles covered are listed below. - TB (talk) 16:41, 28 May 2015 (UTC)[reply]

Removed 'Canon' from this list - too many Canon-brand cameras. - TB (talk) 21:42, 8 August 2015 (UTC)[reply]
Abbess, Abbot, Abhyasi, Acharya, Agga_Maha_Pandita, Ajahn, Ajari, Akhoond, Allamah, Amir_al-Mu'minin,
Anagami, Anagarika, Ani, Apostle, Archbishop, Archdeacon, Archimandrite, Archipheracite, Archpriest,
Arhat, Ash_Shakur, Auxiliary_bishop, Av_Beit_Din, Ayatollah, Ayya, Badchen, Bhagat, Bhikkhu, Bhikkhuni,
Bishop, Bodhisattva, Branch_president, Brother, Cantor, Cantorate, Cardinal, Chakravartin,
Chancellor, Chaplain, Chaplain_of_His_Holiness, Chief_Rabbi, Choizer, Chorbishop, Coadjutor_bishop,
Constantinople, Custodian_of_the_Two_Holy_Mosques, Dalai_Lama, Dastur, Deacon, Dean, Dervish, Devadasi,
Dhammacari, Dob-dob, Dorje_Lopön, Dvija, Ecclesiastical_Judge, Ecumenical_Patriarch, Elder, Emir,
Episcopal_Vicar, Father, Fellow_Student, Firekeeper, Gabbai, Gaden_Tripa, Gadol, Gaon, Gelongma,
Geshe, Godman, Goswami, Gothi, Gyalwang_Drukpa, Gymnosophists, Hadrat, Hajji, Hakham, Hakham_Bashi,
Healing_practitioner, Herbad, Hierodeacon, Hieromonk, High_Priest, High_Priestess, Honorary_Prelate,
Illui, Imam, Jagad_guru, Je_Khenpo, Jisha, Judicial_Vicar, Kaisan, Karmapa, Karram-Allah-u_Wajhahu,
Khawaja, Khenpo, Khoja, Kohen, Kohen_Gadol, Lama, Lamane, Lamdan, Life_coach, Mae_ji, Maggid,
Maha_Kapphina, Mahamandaleshwar, Mahant, Maharshi, Mahasiddha, Mahdi, Major_archbishop, Makhdoom,
Mantrik, Marabout, Maran, Marja, Mashgiach, Mashgiach_ruchani, Mashpia, Mawlana, Mawlawi, Meiniach,
Melamed, Melshanthi, Meshulach, Metropolitan_bishop, Mission_president, Mobad, Mobedyar, Mohel,
Moinuddin, Monk, Mu'min, Mufti, Muhaddith, Mujaddid, Mullah, Murshid, Nagid, Novice, Nun, Osho, Otin,
Panchen_Lama, Pandit, Pandita, Paramahamsa, Paramguru, Parochial_Vicar, Pastor, Patriarch, Pir, Pirani,
Pope, Posek, Pratyekabuddha, Preacher, Presbyter, President, Presiding_Bishop, Presiding_Patriarch,
Priest, Priestess, Primate, Prince_bishop, Prior, Prioress, Protodeacon, Protonotary_Apostolic,
Protopriest, Qalandar, Rabbi, Radhiallahu_'anhu, Rahimatullah, Rais, Rajarshi, Rassophore, Rav, Rebbe,
Rector, Religious_Science_Practitioner, Resident_Bishop, Reverend, Rinpoche, Rishi, Rishi_Muni,
Rishon_LeZion, Rosh_yeshiva, Roshi, Sadhaka, Sadhu, Saint, Sakadagami, Saltigue, Samanera, Samaneri,
Sandek, Sannyasa, Sant, Satguru, Savakabuddha, Savoraim, Sayadaw, Sayyid, Schulklopfer, Segan, Sensei,
Shabdrung, Shaliah, Shamarpa, Shankaracharya, Sharif, Shaunaka, Shechita, Sheikh, Sheikh_ul-Islam,
Shishya, Sikkhamana, Singhai, Sofer, Solitary_practitioner, Sotapanna, Sravaka, Stavrophore,
Subhanahu_wa_ta'ala, Suffragan_bishop, Sultan, Sultana, Sunim, Swami, Tai_Situpa, Talmid_Chacham,
Teacher, Temple_boy, Temple_president, Tenzo, Thangal, Thero, Thilashin, Third_Bardor_Tulku_Rinpoche,
Titular_bishop, Tulku, Tzadik, Tzadikim_Nistarim, Ulama, Unsui, Upajjhaya, Upasaka_and_Upasika, Vajracharya,
Vicar_Forane, Vicar_General, Volkhvy, Witch, Yeshiva, Yogi

Tool not working?[edit]

SELECT command denied to user 'p50380g50491'@'10.68.18.47' for table 'schedule'

welsh (talk) 21:20, 2 July 2015 (UTC)[reply]

Back from hols. The above looks like a Tool labs issue; I have manually adjusted the permissions on the relevant MariaDB database to get things running and will look further into the underlying cause once I've unpacked and had a cuppa. - TB (talk) 10:35, 7 July 2015 (UTC)[reply]

Configuration[edit]

When attempting to look at the tool for the first time, new instructions say this tool needs to be configured in some fashion, sending us back to the same information page. But the information page does not use the term configuration anywhere. Whatever this is, it needs to be spelled out, and in particular using the same terms that the instructions refer to. Trackinfo (talk) 21:11, 1 August 2015 (UTC)[reply]

Hi there Trackinfo. I'm guessing the page of instructions that's tripping you up is this one - it is indeed rather half-baked. I'll attempt to explain it better here for now but will try and update that page in the near future.
The heart of 'Red Link Recovery Live' is a computer program that trawls through the red links in Wikipedia and tries to suggest articles that they should point to. The most promising suggestions are stored in a database on tool labs and can be viewed using a web interface also hosted there.
You can optionally use some semi-automated tools to make working through the lists of suggestions a bit less arduous - I've gone ahead and edited your "custom.js" file (User:Trackinfo/common.js) to enable these for you. Clicking on the green "AUTOFIX" links displayed beside suggestions will open the article with the recommended changes ready to apply.
- TB (talk) 20:22, 2 August 2015 (UTC)[reply]

Bug fix: UTF8 encoding issues[edit]

I've squished a bug related to UTF8 encoding (that is, how suggestions with 'special characters' are stored). Quite frankly, I'm a bit surprised at how easy this long-standing issue was to put right and suspect I may have overlooked some complexities. Suffice to say, if you notice anything odd happening to do with accented or foreign-language characters, please let me know - TB (talk) 10:57, 3 August 2015 (UTC)[reply]

Database maintenance[edit]

As part of a more general bit of maintenance work, the database in which much of RLRL's innards are stored has been relocated. You shouldn't notice any difference, unless that is I've messed up and forgotten to migrate something essential over. Do shout if things are awry. Cheers - TB (talk) 09:50, 2 October 2015 (UTC)[reply]

Unexpected identifier[edit]

Hey, just here to say that this program is spitting out errors when I use it. Is there any reason why it might do this? Swordman97 talk to me

Have just checked it now and all seems well - can you give any more information about the error you saw please? Ideally, the approximate time you saw the message and the URL you were visiting or a description of what you were doing when the error occurred. - TB (talk) 08:48, 23 November 2016 (UTC)[reply]

Opening autofix links in a new tab automatically?[edit]

Hi there. Thanks very much for making the red link checker—it's really useful. Just a small feature request: would it be possible to make the Autofix link open in a new tab automatically instead of in the same window as the tool? Having to remember to do it myself is frustrating, and then I have to go back many pages to get the same tool results to carry on through the list. If I could find a code repository, I would do it myself, but I can't find one. Thanks! Issyl0 (talk) 11:03, 14 December 2016 (UTC)[reply]

Not seeing suggestions[edit]

After years of sterling service, https://tools.wmflabs.org/tb-dev/RLRL/ is no longer suggesting any links for me to mend. I just get the header where I can change to Wikipedia in other languages but nothing below. Has the tool moved, is it broken, have we mended all the red links or (most likely) am I doing something wrong? It's worked consistently for ages and I don't think I've changed anything recently. I've tried using http: and accepting cookies - no effect. I do use uBlock Origin but it's not blocking anything on that site. I'm seeing the issue with both Firefox 50.1.0 on Ubuntu 16.04 and IE11 on Windows 7 on different PCs. Any advice please? Even a yes/no as to whether it provides suggestions for someone else today would be useful. Thanks, Certes (talk) 16:24, 22 December 2016 (UTC)[reply]

Doesn't show up for me either (chrome and IE, win 8) --silraks (Talk) 10:43, 23 December 2016 (UTC)[reply]
Thanks! Doing some digging around, I think the tool may now be denied access to the database it queries. I'm sure someone will be around soon to unlock it. Certes (talk) 14:20, 23 December 2016 (UTC)[reply]

New suggestions available[edit]

I've added a few thousand new suggestions to the tool; nothing startlingly new, but if time permits I may have some novel sets to add in the next few weeks. Cheers. - TB (talk) 20:53, 16 February 2017 (UTC)[reply]

AUTOFIX stopped working?[edit]

My AUTOFIX prompts have stopped working - yesterday I think. They load the article in edit mode but fail to do the find/replace. I checked the setup but can't see a problem. Anyone else seeing this? welsh (talk) 20:11, 3 March 2017 (UTC)[reply]

It still works for me. Certes (talk) 21:22, 3 March 2017 (UTC)[reply]
Still puzzling about this. After a reboot it worked once only - which suggests it's my setup. Turned off virus check and adblocker; used a different browser; checked WP settings. No change. welsh (talk) 07:40, 6 March 2017 (UTC)[reply]
Are you blocking JavaScript or cookies? Certes (talk) 13:59, 6 March 2017 (UTC)[reply]

Mobile friendly[edit]

Things to do to make the site more mobile friendly (How to test in Chrome). Add the following viewport tag to <head>:

<META NAME="viewport" CONTENT="width=device-width, initial-scale=1">

Then add some CSS to tbs.css:

/* Nitpick: if you set a background-color, you need to set a foreground color too! */
BODY { color:#000000; }
/* Disable iOS input zoom */
@media screen and (max-width:750px) {
  INPUT[TYPE="text"], SELECT, TEXTAREA {font-size:16px;}
}

More could be done, but would require changing the output format (adding more classes). — Dispenser 17:17, 11 March 2017 (UTC)[reply]

Cheers for the advice. I've implemented the changes above and ordered up 5" Android and 7" iOS devices for testing purposes. - TB (talk) 16:41, 24 June 2017 (UTC)[reply]

Not working?[edit]

The site https://tools.wmflabs.org/tb-dev/RLRL does not work for me in Firefox or Microsoft Edge. Is this tool down or not working anymore? --MrLinkinPark333 (talk) 21:07, 30 August 2017 (UTC)[reply]

Looks like all tools in /tb-dev/ are down... shattered (talk) 21:41, 24 September 2017 (UTC)[reply]
T179599: Adoption of tb-dev Red Link Recovery tools. — Dispenser 11:58, 24 December 2017 (UTC)[reply]