Wikipedia talk:WikiProject Check Wikipedia/Archive 4

Todo

  •  Done I'm often using the view=all page (frwiki). Did you remove the "priority" column? I miss it, especially for distinguishing deactivated errors from the others (it could be displayed in another manner if you don't want the column, like this for deactivated). --NicoV (Talk on frwiki) 21:57, 7 September 2013 (UTC)
  •  Done #64: Generate also for [[Foo|foo]] and [[foo|Foo]]. Matt S. (talk | cont. | cs) 05:49, 30 July 2013 (UTC)[reply]
    And [[Foo_bar|Foo bar]] (I haven't checked if it's already done or not). --NicoV (Talk on frwiki) 12:27, 18 September 2013 (UTC)[reply]
    And [[ Foo bar|Foo bar]] [[Foo bar | Foo bar]] [[ Foo bar | Foo bar ]] etc. Matt S. (talk | cont. | cs) 15:28, 18 September 2013 (UTC)[reply]
  •  Done There are some strange (Hebrew) characters on the Slovak Wikipedia project page. Matt S. (talk | cont. | cs) 09:33, 7 September 2013 (UTC)[reply]
    Oy vey, Adon Suchánek. It's actually Yiddish. Why it is happening, I haven't a clue. In the database, it doesn't contain the Yiddish characters. I'll look into it more this week. Bgwhite (talk) 07:28, 9 September 2013 (UTC)[reply]
  •  Done Checking for error #35 does not work (see de, cs or frwiki). Matt S. (talk | cont. | cs) 09:14, 7 September 2013 (UTC)[reply]
  • I can't see any problems in cswiki but let's wait for a full scan. Matt S. (talk | cont. | cs) 15:28, 18 September 2013 (UTC)
  •  Done Manage white lists for errors. I suggest using whitelistpage for configuration, and that Labs doesn't report an error for pages listed in the whitelist. --NicoV (Talk on frwiki) 12:18, 18 September 2013 (UTC)[reply]
  •  Resolved "403 Forbidden": find a way to prevent that error ? With the new full scan, I tried to clear the list by running WPCleaner on several tasks in parallel: bot for #16, bot for #37, manual fixing for other errors. At some point, I end up with HTTP errors 403 Forbidden from the server, even for lists for humans. --NicoV (Talk on frwiki) 09:27, 21 September 2013 (UTC)[reply]
  • I've added a 10-second delay in WPCleaner before retrying; I hope it will be enough for Labs to become available again. --NicoV (Talk on frwiki) 10:17, 21 September 2013 (UTC)
  • I've asked about this a couple of times to the labs people and I get no response.
  • Ok, with the extra delay, it seems to work (when I receive a 403, it works when I retry). --NicoV (Talk on frwiki) 20:54, 21 September 2013 (UTC)[reply]
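The workaround described in the last bullets (wait, then retry when the server answers 403) could be sketched as follows. This is only an illustration in Python, not WPCleaner's actual code: `fetch_with_retry`, its parameters and the injected `fetch` callable are hypothetical names, and only the 10-second pause comes from the discussion.

```python
import time
from typing import Callable, Tuple


def fetch_with_retry(fetch: Callable[[], Tuple[int, str]],
                     retries: int = 3,
                     delay_seconds: float = 10.0,
                     sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call fetch() and, while it answers 403, pause and try again.

    fetch is any callable returning (http_status, body); it is passed in
    so the sketch does not depend on a particular HTTP library.
    """
    status, body = fetch()
    for _ in range(retries):
        if status != 403:
            break
        # Give the server time to become available again before retrying.
        sleep(delay_seconds)
        status, body = fetch()
    return status, body
```

Passing `sleep` as a parameter keeps the pause replaceable, for example when the caller wants to log each wait or shorten it while experimenting.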

#10

  • Unable to do anything about this. Technically, there should be <code>...</code> around it instead of a space. But this problem also happens in chemistry and math articles. Whitelisting the article is the only thing I can think of at the moment. Bgwhite (talk) 20:25, 21 September 2013 (UTC)

ISBN

  •  Done #72 There's a space missing in the notice column between the ISBN number and the explanation (and a colon would be nice): "2-362620-10-22 vs. 7 (84 mod 11)" should be "2-362620-10-2: 2 vs. 7 (84 mod 11)" --NicoV (Talk on frwiki) 16:30, 21 September 2013 (UTC)[reply]
  •  Done #72 When the result gives 10, the expected checksum is X: "2-07-37281-200 vs. 10 (164 mod 11)" should be "2-07-37281-200 vs. X (164 mod 11)". --NicoV (Talk on frwiki) 18:06, 21 September 2013 (UTC)[reply]
  • Reworking the entire ISBN subroutines is on my todo list. There are other problems as well. For example, #69 just showing "-10" or "-13" isn't too helpful. Bgwhite (talk) 19:54, 21 September 2013 (UTC)[reply]
  • With the enhanced detection, I'm also completely reworking the ISBN part in WPCleaner, that's why I'm making a few comments about it ;-) --NicoV (Talk on frwiki) 19:58, 21 September 2013 (UTC)[reply]
  • Hmm, if you are reworking it... Could add another error. The checksum could be fine, but the ISBN can still be wrong. Any other suggestions? Bgwhite (talk)
  • After some coding, I will add another error. When finished, I'll run it against a dump and see how many errors there are before committing to it. Bgwhite (talk) 07:02, 22 September 2013 (UTC)[reply]
  •  Done Ok, got everything coded up and tested. There are a lot more errors being produced.
#69 now checks for : and numbers. Old code only checked for ISBN-10 and ISBN-13. It now checks for errors such as ISBN-978-... and ISBN:. Last month's enwiki dump produced 64 articles. New code produced 1734 articles.
#71 picked up 37 articles vs 3 articles.
#70 went from 6,300 to 8,000 errors in articles.
There is a caveat: there will be false positives in cases where numbers follow ISBN numbers.
An example is "ISBN 0-8160-1349-7 0267 p. 267". The program thinks 0267 is part of the ISBN when it is not.
It is hard to know when cases of "0267" are ok vs a bad ISBN with more than 13 numbers. An example is ISBN 9781405190732 1405190736, where a person put two ISBNs following only one "ISBN"... this is a valid error.
#72 & #73 are picking up ~25% more errors. I'm not sure why. But all the new errors are valid errors and are found in cite templates.
The error listing for #72 and #73 states: "0-00-654732-7 vs X" or "0-89087-968-9 vs 0". For the other errors, the listing shows what the program thinks is the ISBN number.
Any changes or questions? Bgwhite (talk) 01:19, 26 September 2013 (UTC)[reply]
Not for the moment, I've modified WPCleaner to also detect more situations. I have to check with frwiki detections to see if I cover everything. --NicoV (Talk on frwiki) 20:11, 29 September 2013 (UTC)[reply]
  •  Done I think there's a problem with #70: ISBN-10 with a lowercase "x" checksum are reported (fr:Chris Harman 189887655x, fr:Hergé 2-87415-668-x, ...). --NicoV (Talk on frwiki) 21:37, 29 September 2013 (UTC)[reply]
  •  Done Another problem with #69: in fr:The Beatles, it detects the template parameter isbn-2=978-2020419905. --NicoV (Talk on frwiki) 05:43, 2 October 2013 (UTC)
    • I'm confused. There is no "isbn-2=978-2020419905" in the Beatles article. isbn-2 is invalid in the Mediawiki software. Bgwhite (talk) 06:03, 2 October 2013 (UTC)[reply]
      • There was until right now (I just replaced isbn-2 by isbn2) but with space around the "=" isbn-2 = 978-2020418805 in chapter "L'Inde et le Maharishi". "isbn-2" was used as parameter name for the template Ouvrage, so I think it shouldn't be detected (a wiki can decide to name the parameters isbn-2, isbn-3, ...) --NicoV (Talk on frwiki) 07:17, 2 October 2013 (UTC)[reply]
        • Ahhh. Would you stop being so good at your job. I think you just like giving me work.  :) It should not catch any isbn2, isbn3, ... Now, isbn-2 is a different story. At the moment, I'd like to play ostrich and put my head in the sand. This is sort of tied into #3 below and #87 above. How to whitelist/blacklist different templates for different languages. There are templates that create false positives for other errors. Now throw in template parameters too. Bgwhite (talk) 08:49, 2 October 2013 (UTC)[reply]
          • I see them because I'm checking that WPCleaner can find the errors you have detected. When it doesn't, I either have to fix my code (a lot of times for ISBN recently, but I'm doing it silently) or tell you the problem is on your side ;-) If it's too complex, just forget about it because I'm not sure any wiki is using isbn-2 as a parameter name. With this detection, I just saw that the parameter wasn't used so I renamed it properly. --NicoV (Talk on frwiki) 08:55, 2 October 2013 (UTC)[reply]
NicoV I replaced the link with the direct link to the reference. The isbn link was to a search results list. -- Magioladitis (talk) 18:50, 15 October 2013 (UTC)[reply]
Thanks! --NicoV (Talk on frwiki) 19:07, 15 October 2013 (UTC)[reply]
NicoV, checkwiki shouldn't have caught that in the first place. In the code, there has to be a space before isbn in order for it to show up as an error. I specifically did that to avoid ISBNs in web links. Strange. Bgwhite (talk) 19:56, 15 October 2013 (UTC)[reply]
Hum, you're right, in fact it reports an ISBN-10 and not the web link, my bad (and going to fix WPCleaner for that ;-) ). --NicoV (Talk on frwiki) 20:01, 15 October 2013 (UTC)[reply]
NicoV, I've been having troubles with the vile WPCleaner programmer too. I think the idiot is almost as incompetent as the stupid checkwiki programmer.
Make sure you have fr:International Standard Book Number in the whitelist file for #69 and possibly the other ISBN errors too. Bgwhite (talk) 20:14, 15 October 2013 (UTC)[reply]
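For readers following the #72 notices quoted in this section, here is a minimal Python sketch of an ISBN-10 check-digit calculation that reproduces the sums shown in the reports (84 mod 11, 164 mod 11). It is not the checkwiki or WPCleaner code; the function names are made up for illustration. It weights the first nine digits by their position, prints a remainder of 10 as X, and accepts a lowercase x as raised for #70.

```python
def _isbn10_chars(isbn: str) -> str:
    """Keep only digits and X, treating a lowercase x as X."""
    return "".join(c for c in isbn.upper() if c in "0123456789X")


def isbn10_expected_check(isbn: str) -> tuple:
    """Return (expected check character, weighted sum) for an ISBN-10."""
    chars = _isbn10_chars(isbn)
    if len(chars) != 10 or "X" in chars[:9]:
        raise ValueError(f"not an ISBN-10: {isbn!r}")
    # Weight the first nine digits by their 1-based position.
    weighted_sum = sum(pos * int(d) for pos, d in enumerate(chars[:9], start=1))
    remainder = weighted_sum % 11
    # A remainder of 10 is written as the letter X.
    expected = "X" if remainder == 10 else str(remainder)
    return expected, weighted_sum


def isbn10_is_valid(isbn: str) -> bool:
    """True when the last character matches the expected check character."""
    expected, _ = isbn10_expected_check(isbn)
    return _isbn10_chars(isbn)[-1] == expected
```

With this sketch, `isbn10_expected_check("2-362620-10-2")` gives `("7", 84)` and `isbn10_expected_check("2-07-37281-20")` gives `("X", 164)`, matching the two notices above, while `isbn10_is_valid("2-87415-668-x")` is true because the lowercase x is normalized first.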


On dumps, check if page has been fixed

Won't fix

  • When doing a full scan from a dump, manage pages that have already been fixed since the dump (for example, on frwiki the list for #37 just went back from 40k to 66k pages, the 26k pages were fixed in the last few days). Maybe new errors detected from the dump should be scanned again with the current article before being listed ? --NicoV (Talk on frwiki) 08:49, 21 September 2013 (UTC)[reply]
    I'm not sure of a way to do this from a technical standpoint. On the daily scans, it takes just over 4 hours to check 72,000 articles. 72,000 is the maximum number of articles (500 grabbed every 10 minutes for 24 hours). The real number is less than that due to duplicate articles, usually around 50,000-55,000 articles. The cause for the slowdown over the dump processing is in retrieving the article's text. It would end up taking an extra ~8 hours to process frwiki, ~16 hours for dewiki and ~20 hours for eswiki.
    The real problem lies with labs. I can process frwiki in 5 hours on my 3-year old laptop. It appears the last frwiki run on labs took 33 hours. I'm also getting around a 6 times slowdown with enwiki. Labs people know there is a problem. They know more people are migrating from toolserver. They refuse to do anything about it. Bgwhite (talk) 19:43, 21 September 2013 (UTC)
    Ok, I understand it can be difficult to achieve that with the current slowness of Labs. I have a few ideas on how to optimize this check, but I'm not sure if it would be enough:
    • Check only articles that are being added to the list by the full scan, not if they were already in the list
    • Use the revision id to check if article has been modified since the dump: if not, you don't have to retrieve the article's text
    • Do you use gzip compression to retrieve the article text ?
    If it's too difficult or impossible, we'll live with this even if it's not perfect (I still haven't caught up with the 26k added, only marked about 10k for now, but I have been doing other things in the day). --NicoV (Talk on frwiki) 19:54, 21 September 2013 (UTC)[reply]
    • There can be multiple errors in an article. What happens if one error was fixed, but not the others? What happens if none of the errors were fixed and a new error is in the article?
    • It will actually add time to processing. I'd have to move to a different way of retrieving articles from the dump file that will be dramatically slower. Compare the current way vs. the slower way. The get_next_page_from_dump subroutine is the one to look at. It's not present in the 'current way' because it is not in the top 10 slowest subs. This was run for 10,000 articles. These graphs are from last June when I first started out redoing the program. The get_next_page_from_dump was the first one I redid.
    • Retrieving the article isn't the slow part. The problem is setting up the connection, other end finding it and getting it ready to send. Lab computers are in the same facility as the rest of wikipedia's computers. Bgwhite (talk) 07:35, 22 September 2013 (UTC)[reply]
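NicoV's second suggestion above (compare revision ids so that unchanged pages need no text retrieval) could look roughly like this in Python. It is only a sketch: the two dictionaries mapping titles to revision ids are hypothetical inputs, and it does not address the connection overhead or the multiple-errors question raised in the replies.

```python
def pages_needing_rescan(dump_revids: dict, current_revids: dict) -> list:
    """Return titles whose current revision differs from the dump's.

    dump_revids maps title -> revision id recorded in the dump;
    current_revids maps title -> revision id on the live wiki.
    A title missing from current_revids (for example a deleted page)
    is also returned, since its dump result can no longer be trusted.
    """
    return [title for title, revid in dump_revids.items()
            if current_revids.get(title) != revid]
```

Only the titles returned here would have their text fetched again; every other page keeps the result computed from the dump.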

Error #37

  • Not entirely sure, but I have the impression that some pages are reported while all their categories have a sort key. See the beginning of list of articles marked as done: they were added by the scan tonight, but I didn't find that a DEFAULTSORT needed to be added. --NicoV (Talk on frwiki) 05:22, 23 September 2013 (UTC)[reply]
    • Magioladitis could answer this better than I could, as he has more history with this. Bgwhite (talk) 23:35, 25 September 2013 (UTC)

#48 and imagemaps

 Done Magioladitis brought up that it is ok to have the title linked in text inside imagemaps. Examples are: Area codes 702 and 725, Area code 775 and Area codes 310 and 424. Instead of whitelisting the articles, I'll fix this in the checkwiki program as it applies wiki wide. Bgwhite (talk) 05:27, 4 October 2013 (UTC)[reply]

Ok for me, I will remove the specific code for this in WPCleaner (suggestion to remove the entire line, as it is useless). --NicoV (Talk on frwiki) 07:01, 4 October 2013 (UTC)[reply]
Magioladitis, NicoV, it is done for #48. Also did the same for #64 and <timeline> as it caused the same problem. Bgwhite (talk) 05:21, 11 October 2013 (UTC)[reply]

#35 and image descriptions

 Resolved Wikicode like this is reported as a missing image description, although there is one; see 4D.

<gallery caption="Hyperwürfel">
Datei:Hypercubeorder.svg
Datei:Hypercubecubes.svg
Datei:Hypercubestar.svg
</gallery>

--Betateschter (talk) 18:15, 6 October 2013 (UTC)[reply]

I think #35 detects images without caption inside a gallery, and not a gallery without caption. So it seems normal. --NicoV (Talk on frwiki) 20:26, 6 October 2013 (UTC)[reply]

Bug reports in CheckWikipedia Labs

 Done see here: discussion (de) — Preceding unsigned comment added by Crazy1880 (talkcontribs) 16:53, 5 October 2013 (UTC)[reply]

A few bugs (and as a non-admin I am not allowed to edit the bug report page....):

--WiseWoman (talk) 20:09, 26 September 2013 (CEST)

Moin Moin Stefan, I have one more request: when you open an item and go to "More", only the various IDs are shown there at the moment; it would be nicer if both the ID and the ID name were shown, so that one knows right away what it is. Question: is the script still in Perl? Thanks. --Crazy1880 19:35, 4 October 2013 (CEST)
Moin Moin Stefan, with ID 6 I also noticed that when there is an "&" character in the title, the done flag cannot be set. Regards --Crazy1880 18:49, 5 October 2013 (CEST) - Preceding unsigned comment added by Crazy1880 (talkcontribs) 16:51, 5 October 2013 (UTC)

The man behind the new CHECKWIKI page on MediaWiki Labs is Bgwhite. He does not speak German, but I translate for him what the problem is and he tries to find a solution. -- Magioladitis (talk) 07:16, 11 October 2013 (UTC)

Crazy1880, about error #6: "&" is a special character. On the English Wikipedia we replace "&" with "and". Perhaps the German Wikipedia could do the same and use "und" in DEFAULTSORT. -- Magioladitis (talk) 07:19, 11 October 2013 (UTC)

Hello Magioladitis, or as we say in Lower Saxony, "Moin". It is no problem for me to switch to English. A few users tested CheckWikipedia on Labs and had a few problems. First: if you open any ID and look for articles to fix, you see only the articles for this ID. To see all IDs to fix, you click on "More". That is where the "bugs" we are talking about occur. User WiseWoman reported that when you are on "More" and see all IDs, you cannot click on the article name or on "edit", because you land on a page of this site and not on the Wikipedia of your language (here the German Wikipedia). I tested it and got the same result. Secondly: in Check Wikipedia on the toolserver, if you click on "More" for any ID, you see ID-Description (ideas for improvement), Notice and "Done". On Labs we would like to have the ID-Description (ideas for improvement) column after the error ID. Thirdly: if the article field contains the symbol "&", you cannot mark the article as Done. This happens for any ID. Workaround: you set the whole ID as Done. I hope this is a clear statement; otherwise please ask. Best regards --Crazy1880 (talk) 17:03, 11 October 2013 (UTC)
Moin Crazy1880. I worked in Oldenburg for 3 years :) OK, I have already translated the different problems you reported for Bgwhite. I hope we will reach a solution soon. -- Magioladitis (talk) 18:47, 11 October 2013 (UTC)
Not sure what the primary word for hello is where I live, but I say howdy. So, Howdy Crazy1880. I think 3 of your 4 error reports have been fixed. At least I think there are four reports. The three fixed were related to the web page. For the fourth one, I'm going to need more information... "Der Filter für [3] funktioniert nicht, weil der dauernd glaubt" ("The filter for [3] does not work, because it keeps believing"). For error #46, it does not look at the content, only brackets. So, it is not aware of whether a bracket is in an Image tag or not. If I could get some examples of articles where the problem is happening, it would be helpful. Bgwhite (talk) 23:44, 11 October 2013 (UTC)
Okay, Howdy Bgwhite, thank you for the fast fix. For ID 46 I wrote a message on User WiseWoman's page and asked for examples. In parallel I'm looking for the "bug", because I think it is not one. A side question: in which programming language is your Check Wikipedia script written? Regards --Crazy1880 (talk) 08:45, 12 October 2013 (UTC)
And secondly, if i'm under "More" and click to view the article only there is the wrong URL (see: https://"de.wikipedia.org/wiki/Poesiealbum_neu") the "Edit" is right. Thank you --Crazy1880 (talk) 08:56, 12 October 2013 (UTC)[reply]

Oh, sorry. Since the toolserver was mostly German guys, I was using German. The wmflabs.org pages have been down or slooooooow as molasses the past few days, I hope you get that sorted out. Problem 1 was easy, the URL was set wrong. The second problem I had is more difficult. You are looking for wrongly parenthesized pages in [4] but when there are links in picture captions, the algorithm gets confused and reports errors. [5] is properly parenthesized, but apparently the algorithm does not properly deal with nested pairs such as

 [[Datei:Vindobona Hoher Markt-97.JPG|thumb|[[Scheibenfibel]] mit Darstellung des Flussgottes Danuvius, 150-250 n. Chr (Römermuseum, Wien)]] 

It sees the first closing pair as closing the first opening pair instead of the inner, second pair. --WiseWoman (talk) 09:02, 12 October 2013 (UTC)[reply]

You are wholeheartedly incorrect with your outlandish statement of wmflabs.org being "slooooooow as molasses". wmflabs hasn't been slow, it has been dead. :) Labs people were trying to correct a hardware problem that has been happening since July by replacing a machine. It didn't go well and caused a long outage. So far, my experience has been that labs is a lot slower and buggier than toolserver. However, there haven't been any multi-day or week-long outages.
The problem has nothing to do with images. Checkwiki is correctly identifying a bracket imbalance in de:Danuvius. There are 51 opening brackets and 50 closing brackets in the article. The bracket problem is located in reference #7. The reference uses brackets in a "weird" way that is correct, but causes a bracket imbalance. This is a case where the article should be whitelisted.
Checkwiki is incorrectly reporting the problem at the first bracket in the article because the article starts off with brackets. In this article's case, an image tag contains the first brackets. This is a known problem and is listed up above in the TODO section under " #46 seems to detect a few internal links inside image description." Bgwhite (talk) 06:32, 13 October 2013 (UTC)
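To make the two views in this thread concrete (a pure count of brackets versus matching nested pairs), here is a small Python sketch. It is not checkwiki's algorithm; `bracket_report` is a hypothetical helper that counts square brackets, as in the "51 opening, 50 closing" observation, and also uses a stack so that nested links such as the image caption quoted above pair up correctly.

```python
def bracket_report(text: str) -> dict:
    """Count square brackets and locate the ones left without a partner."""
    stack = []              # positions of "[" still waiting for a "]"
    unmatched_closing = []  # positions of "]" with no earlier "["
    for index, char in enumerate(text):
        if char == "[":
            stack.append(index)
        elif char == "]":
            if stack:
                stack.pop()  # closes the innermost open bracket
            else:
                unmatched_closing.append(index)
    return {
        "opening": text.count("["),
        "closing": text.count("]"),
        "unmatched_opening": stack,
        "unmatched_closing": unmatched_closing,
    }
```

On the nested image caption quoted earlier, this reports 4 opening and 4 closing brackets with nothing unmatched. A text with one surplus "[" gets a single position in `unmatched_opening`, though that position is not necessarily where a human would say the mistake is.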

Okay, the first problem should be fixed. --Crazy1880 (talk) 09:21, 12 October 2013 (UTC)

This is how we treat & symbols in pagetitle in the English Wikipedia. -- Magioladitis (talk) 09:35, 12 October 2013 (UTC)[reply]

Moin Magioladitis, yes, and that is how I have done it for four years. The problem was that you couldn't set the article as Done. But I now see that it works today. I'll have a look at this and will write it down here if I have an example. Thank you --Crazy1880 (talk) 10:41, 12 October 2013 (UTC)
I did fix the problem where the article's title contained an & symbol and setting the article as done. Having & in url links can cause funny issues. Bgwhite (talk) 06:32, 13 October 2013 (UTC)[reply]
Yes Moin, and the next funny issue with the & symbol: if I have an & symbol in the article text and click on "More", the page is completely blank. Regards. --Crazy1880 (talk) 15:26, 13 October 2013 (UTC)
Moin Moin Bgwhite and Magioladitis, could you archive this discussion? The background is that this discussion has meanwhile become too general. I would like to start a new discussion, thank you. --Crazy1880 (talk) 15:43, 22 October 2013 (UTC)
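As background to the "&" problems discussed in this section: an unescaped "&" in a query string is read as a parameter separator, so a title containing it gets cut off. A short Python sketch of the usual remedy, percent-encoding the title, follows. The base URL and the parameter names `project`, `id` and `title` are placeholders for illustration, not the real checkwiki interface.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def mark_done_url(base_url: str, project: str, error_id: int, title: str) -> str:
    """Build a URL whose query values are percent-encoded.

    urlencode() turns "&" inside a value into "%26", so it is no longer
    mistaken for the separator between parameters.
    """
    query = urlencode({"project": project, "id": error_id, "title": title})
    return f"{base_url}?{query}"


def title_from_url(url: str) -> str:
    """Read the title back out of a URL built by mark_done_url()."""
    return parse_qs(urlsplit(url).query)["title"][0]
```

For a title such as "Tom & Jerry" the generated query contains `title=Tom+%26+Jerry`, and `title_from_url` recovers the full title again.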

500 Internal Server Error

 Resolved Hi,

since yesterday, I get a 500 Internal Server Error from time to time when trying to reach the frwiki main page. The global main page is working. Right now it's not working, and hasn't been working for at least 30 minutes. The same problem occurs for the list of pages for a given error.

Internal Server Error

The server encountered an internal error or misconfiguration and was unable to complete your request.

Please contact the server administrator, mpelletier@wikimedia.org and inform them of the time the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.

Additionally, a 500 Internal Server Error error was encountered while trying to use an ErrorDocument to handle the request.

Apache/2.2.22 Server at tools.wmflabs.org Port 80

--NicoV (Talk on frwiki) 07:19, 9 October 2013 (UTC)[reply]

I need to send an email to you when I notice things.
It went down about 6:45 UTC. It is down for everybody. I did send in a bugzilla report and I did send an email to the labs lists. I haven't heard back from anybody. There was supposed to be a downtime later today to replace faulty hardware, but that isn't for another ~12 hours. Bgwhite (talk) 07:27, 9 October 2013 (UTC)[reply]

Hello - some editors fight off the vandal hordes; I do the same by repairing pages with citation errors. If I didn't, there would be a large backlog in Category:Pages with incorrect ref formatting and in Category:Pages with missing references list, as there is in Category:Pages with broken reference names (more than 1500 yesterday). But it is impossible to do this work alone. It is much easier to repair references one hour, one day or one week after the errors were made than months or years later, when these errors are very, very difficult to find.

Only with WikiBlame Search it is possible to find and repair such errors.

Best wishes --Frze > talk 08:49, 10 October 2013 (UTC)[reply]

Monitoring backlogs

User:TheJJJunk monitors certain backlog categories using {{User:TheJJJunk/Backlog}}, a template I made:

Backlog status
Category                                      Current status
Pages with missing references list            Not done
Persondata templates without name parameter   Done
Articles with incorrect citation syntax       Not done
Pages with URL errors                         Done

which determines if the category is empty or not. He also uses ARA, a script that he developed, to help fix citation errors. --Frze > talk 05:14, 12 October 2013 (UTC)[reply]

Bug?

 Resolved This edit [6] replaced a '</br>' with '<b>', wrongly bolding some of the text in the list of founders. Philip Trueman (talk) 06:52, 15 October 2013 (UTC)[reply]

This is not a checkwiki problem. Checkwiki only finds errors; it does not fix them. Checkwiki correctly identified a broken bracket and it was fixed. There were a lot of changes made besides fixing the bracket, thus the edit summary of: Do general fixes and cleanup if needed. Most of the fixes were done manually. Bgwhite (talk) 07:58, 15 October 2013 (UTC)
Ah! I see - fair enough. Thanks. Philip Trueman (talk) 08:42, 15 October 2013 (UTC)[reply]

Site "More" at all

 Done Moin Moin Bgwhite and Magioladitis, here I have some bugs on the pages shown when I click on "More" to see all entries:

  • If you click on the article link on the "More" page, it opens an error page. It tries to open "http://article" but should open http://article, so the quotation marks are wrong. This happens for every ID. The edit link is right.
  • If you are on any normal page and click on "More" and the article has a special character, the "More" page will be blank. Example: +39 Challenge (example page, More). This happens for every ID.

That is all for the "More" page for now. Thank you --Crazy1880 (talk) 16:02, 22 October 2013 (UTC)

I fixed the second problem, but I don't understand the first. Could you give me some examples of where the problem is happening. Bgwhite (talk) 22:21, 22 October 2013 (UTC)[reply]
Check this page and try to click on the pagename. -- Magioladitis (talk) 04:27, 23 October 2013 (UTC)[reply]
Magioladitis, thank you. It has been fixed. Bgwhite (talk) 04:38, 23 October 2013 (UTC)[reply]
Thank you, that had to be said to you. Regards --Crazy1880 (talk) 14:56, 23 October 2013 (UTC)[reply]

#87

 Done

  • I noticed the same thing on enwiki. The main problem is what each wiki uses besides File and Image to identify an image. At the very least, I can add File, Image and a few other terms. Bgwhite (talk) 20:39, 21 September 2013 (UTC)
    • You would have to do an API request to get the full list of terms. --NicoV (Talk on frwiki) 20:54, 21 September 2013 (UTC)[reply]
      • NicoV and Magioladitis: Fixed. Wasn't added in time for today's daily run. Bgwhite (talk) 00:39, 29 October 2013 (UTC)[reply]
        • No worries. Tomorrow. -- Magioladitis (talk) 04:00, 29 October 2013 (UTC)[reply]
          • NicoV, I updated #11 with #87's updated code... Just the searching for html entities. Nothing like going from 50 lines of code to 5. #11 isn't turned on for enwiki, so yell if you see something wrong. Bgwhite (talk) 08:06, 29 October 2013 (UTC)[reply]
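The API request NicoV mentions could be a MediaWiki siteinfo query (`action=query&meta=siteinfo&siprop=namespaces|namespacealiases&format=json`). The Python sketch below only shows how the names for the File namespace (id 6) might be collected from such a response; it makes no network call, the sample data is hand-written in the shape of the default JSON output, and `image_prefixes` is a hypothetical helper, not the code that was deployed.

```python
# Hand-written sample in the shape of a siteinfo response (not real dewiki output).
SAMPLE_SITEINFO = {
    "query": {
        "namespaces": {
            "6": {"id": 6, "case": "first-letter", "*": "Datei", "canonical": "File"},
        },
        "namespacealiases": [
            {"id": 6, "*": "Bild"},
            {"id": 6, "*": "Image"},
            {"id": 7, "*": "Bild Diskussion"},
        ],
    }
}


def image_prefixes(siteinfo: dict, namespace_id: int = 6) -> list:
    """Collect the local name, canonical name and aliases of a namespace."""
    query = siteinfo["query"]
    names = set()
    namespace = query["namespaces"].get(str(namespace_id), {})
    for key in ("*", "canonical"):
        if namespace.get(key):
            names.add(namespace[key])
    for alias in query.get("namespacealiases", []):
        # Aliases of other namespaces (e.g. talk pages) are skipped.
        if alias.get("id") == namespace_id:
            names.add(alias["*"])
    return sorted(names)
```

For the sample above this yields `["Bild", "Datei", "File", "Image"]`, which is the kind of per-wiki term list the first bullet asks for.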

Error #3 arwiki

 Done This error list contains many articles that don't have this error, because we use several templates in place of {{reflist}} or <references />.

The main template we use on arwiki is {{مراجع}}, and the others link to it. The full list of templates used on arwiki:

Reflist
مراجع
ثبت المراجع

--Zaher talk 13:18, 26 September 2013 (UTC)[reply]

Zaher kadour Very good point. There are probably other languages with the same problem. I will work on it. Bgwhite (talk) 17:59, 26 September 2013 (UTC)[reply]
Hi, I would suggest having the list of reference templates in the configuration file. WPCleaner already uses error_003_references_templates_yywiki for this configuration. --NicoV (Talk on frwiki) 21:40, 29 September 2013 (UTC)[reply]
Yes, I already knew about the config file and filed it as not practical and cumbersome for this case. That doesn't mean it ends up as the best available solution. Bgwhite (talk) 06:04, 30 September 2013 (UTC)[reply]
Zaher kadour and Meno25, this should be fixed now. There are still some reference templates I need to find for various languages wikis, but I have most of them. Bgwhite (talk) 08:39, 28 October 2013 (UTC)[reply]
Bgwhite Confirmed fixed. Thank you for your work. --Meno25 (talk) 13:00, 28 October 2013 (UTC)[reply]
Good work, Thanks. --Zaher talk 16:46, 28 October 2013 (UTC)[reply]
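The idea discussed above, treating a per-wiki list of template names as equivalent to <references />, can be sketched in Python as below. This is an illustration only: `has_reference_list` is a hypothetical function, the template list is passed in by the caller (the thread mentions a configuration key and a translation file, but not their exact format), and redirects or underscore/space variants of template names are not handled.

```python
import re


def has_reference_list(wikitext: str, template_names: list) -> bool:
    """True if the text has a <references> tag or one of the given templates."""
    if re.search(r"<references\b", wikitext, flags=re.IGNORECASE):
        return True
    for name in template_names:
        # Match {{name}} or {{name|...}}, allowing spaces around the name.
        pattern = r"\{\{\s*" + re.escape(name) + r"\s*(\||\}\})"
        if re.search(pattern, wikitext, flags=re.IGNORECASE):
            return True
    return False
```

With the arwiki list from this section, `has_reference_list("... {{مراجع}}", ["Reflist", "مراجع", "ثبت المراجع"])` is true, while the same text checked against `["Reflist"]` alone would be flagged as missing a reference list.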

I don't know if this is related or a separate #3 issue, but i have recently (today especially badly) noticed a lot of articles coming up with Error #3 but actually containing a {{reflist}}. The example on my WPCleaner screen right now is Adam Swandi, but there have probably been twenty or thirty of them this morning. Am i missing something easy? Cheers, LindsayHello 08:13, 24 October 2013 (UTC)[reply]

LindsayH I started to do the fix for #3. I finished phase one yesterday, which reads the template names from the translation file. WPCleaner reads the same file. $10 says I goofed the syntax of the Translation file to where WPCleaner has problems with it. NicoV, what did I screw up on your end? Bgwhite (talk) 21:14, 24 October 2013 (UTC)[reply]
Seems to work for me, WPCleaner doesn't report any problem for Adam Swandi (correctly detects {{reflist}}). Did you fix something ? --NicoV (Talk on frwiki) 09:25, 25 October 2013 (UTC)[reply]
Still "detecting" no reflist when there is one for me. I've just spent an hour or so with WPCleaner, and probably 75% of the articles pulled up said that there was no reference list, but there was. Cheers, LindsayHello 11:34, 26 October 2013 (UTC)[reply]
I'm second-guessing myself now, so i just went back to WPCleaner and pulled up twenty random pages and one non-random; of them, the following all showed a missing reference list, yet all have some variety of the {{reflist}} in their Notes or References sections: 18th Infantry Regiment (United States), Dead Moon, Carlo Grano, Greenville Independent School District, Euriphene leonis, Edge Hill (Shadwell, Virginia), Douglas Moore, Vanitas (Anaal Nathrakh album), Midwest Millions, Michael V. Saxl, Los Siete de la Raza, and the non-random Adam Swandi. Don't worry about hurting my feelings; just tell me what i'm doing wrong! Cheers, LindsayHello 12:01, 26 October 2013 (UTC)[reply]
Hi LindsayH, I've just tested with Dead Moon and WPCleaner doesn't detect any problem with it. If I remove the {{reflist}}, error #3 is detected. The only thing I can think of is that you don't have an up to date version of WPCleaner: did you install it with Java Web Start which allows for automatic updates ? --NicoV (Talk on frwiki) 13:10, 26 October 2013 (UTC)[reply]
Hi NicoV. I figured the problem was probably with me (it often is!). I haven't a clue how i installed it, but when i get back from work this evening, i shall reinstall and get up to date! Thanks for the quick response. Cheers, LindsayHello 13:17, 26 October 2013 (UTC)[reply]
Hi LindsayH, did you manage to make it work? I've released WPCleaner 1.30 yesterday, so now you should see "1.30" when running WPCleaner and it should work. If you still have the problem with 1.30, tell me. --NicoV (Talk on frwiki) 09:53, 30 October 2013 (UTC)[reply]
NicoV, i did. I don't know what happened ~ i used to have the Java Web Start, but when i checked a couple of days ago i was using a six-month old version. Now it all seems to be working properly and well. Thank you for the advice! Cheers, LindsayHello 21:15, 30 October 2013 (UTC)[reply]

Add list as input

 Done

Another one on the todo list. Allow checkwiki to take a list of articles as input. Bgwhite (talk) 06:24, 5 October 2013 (UTC)[reply]

Seems to be a nice addition. How do you u

Error #84 (Section without content)

 Done A section that only has a <syntaxhighlight lang=""> block gets flagged as an empty section. Example: C signal handling#Example usage.

Also, the "comments and bugs" link at the bottom of a checklist goes to a non-existent page.

https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Check_Wikipedia&action=edit&section=new
should be
https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:WikiProject_Check_Wikipedia&action=edit&section=new

Thanks, --Bamyers99 (talk) 19:34, 19 October 2013 (UTC)[reply]

Bamyers99 Yea, I knew about the problem, but it hasn't been at the top of my list. I guess you put some fire under my feet.
The program goes thru and identifies some "special" tags (nowiki, comments, source, pre, code, ...) where the code shouldn't check for errors. I've changed it so the only special one for error #84 is the comment tag. Today's daily update contains the new code. Cross fingers.
The "comments and bugs" link was fixed too. Bgwhite (talk) 00:25, 23 October 2013 (UTC)[reply]
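The change described above for #84 (only comments are ignored when deciding whether a section is empty) can be illustrated with a short Python sketch. It is not the checkwiki code: `section_is_empty` and its `ignored_tags` parameter are hypothetical, and the parameter exists only to show how also ignoring tags like syntaxhighlight leads to the false positive that was reported.

```python
import re

COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def section_is_empty(body: str, ignored_tags: tuple = ()) -> bool:
    """True if nothing but whitespace remains after removing ignored parts.

    HTML comments are always removed; any tag named in ignored_tags is
    removed together with its content.
    """
    text = COMMENT.sub("", body)
    for tag in ignored_tags:
        pattern = rf"<{re.escape(tag)}\b.*?</{re.escape(tag)}\s*>"
        text = re.sub(pattern, "", text, flags=re.DOTALL | re.IGNORECASE)
    return text.strip() == ""
```

A section containing only a syntaxhighlight block counts as empty when `ignored_tags=("syntaxhighlight",)` is passed, which mirrors the reported bug, and as non-empty with the default of ignoring comments only.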

Persondata-Script

Moin Moin Bgwhite and Magioladitis, on the German Wikipedia Stefan Kühn has a persondata script (see here). Will you be the person in charge of it, or Stefan Kühn? Perhaps it would be useful for everyone, not only for the German Wikipedia. Regards --Crazy1880 (talk) 15:00, 23 October 2013 (UTC)

Howdy, I didn't know that existed. At least for now, I have no plans to port it over to WMFLabs. Looking at the list of projects ported already, it's not listed there. I think you will have to ask Stefan. With Wikidata deployed, I'm not sure if persondata is even relevant. Bgwhite (talk) 21:19, 23 October 2013 (UTC)[reply]
Hello, at the moment I am also looking at Wikidata. This persondata script works only on the German Wikipedia and has some known bugs. But I won't redesign it, because I don't know how we will create or store this persondata in the future. Maybe we will store this data only in Wikidata (like Commons) and generate infoboxes, lists or the first lines of an article with templates. I don't know. So I am waiting for the first use of Wikidata in this direction. I think we can use the interface and basic script for a new script, and this script would check the data in Wikidata for all languages. -- sk (talk) 13:49, 25 October 2013 (UTC)[reply]

#67[edit]

 Done Hi,

Error #67 doesn't seem to be detected for frwiki. It's supposed to detect the opposite of #61. --NicoV (Talk on frwiki) 15:24, 26 October 2013 (UTC)[reply]

NicoV, I am an idiot. I won't mention the mistake because you will laugh too hard. Bgwhite (talk) 20:30, 26 October 2013 (UTC)[reply]

Add Egyptian Arabic Wikipedia (arzwiki)[edit]

 Done

@Bgwhite: Hi. Could you, please, add the Egyptian Arabic Wikipedia (arzwiki) to the list of languages covered by Check Wikipedia? Currently when I want to fix Check Wikipedia errors I run the bot on all articles which wastes a lot of time. --Meno25 (talk) 20:47, 29 October 2013 (UTC)[reply]

Meno25, could you create a Template file for arzwiki, similar to ar:ويكيبيديا:فحص_ويكيبيديا/ترجمة. Once it is created, I take the file's value and stick it in a database where checkwiki accesses the info. So, I have to ask, what is different about Egyptian Arabic vs Arabic? I'm dyslexic and I can't imagine being dyslexic and having to learn Arabic or Hindi. Bgwhite (talk) 07:16, 30 October 2013 (UTC)[reply]
Bgwhite Created: arz:ويكيبيديا:تشيك ويكيبيديا/ترجمه One of the differences is that for error 3 please note that reflist on arzwiki is arz:قالب:مصادر. --Meno25 (talk) 09:16, 30 October 2013 (UTC)[reply]
Meno25, if you're interested also in using WPCleaner on arzwiki, I've copied its configuration from arwiki to arz:مستخدم:NicoV/WikiCleanerConfiguration (may need some modifications, but I'm not able to know which). I will add arzwiki to WPCleaner tonight. --NicoV (Talk on frwiki) 09:46, 30 October 2013 (UTC)[reply]
NicoV Thank you, sir, for your work. You are really helpful. --Meno25 (talk) 10:30, 30 October 2013 (UTC)[reply]
Done for WPCleaner. Don't hesitate to change its configuration. --NicoV (Talk on frwiki) 16:56, 30 October 2013 (UTC)[reply]
Meno25, it is all setup and the checkwiki program has been run against today's arzwiki dump file. Give a yell if you see anything wrong or needs to be changed. Bgwhite (talk) 21:14, 30 October 2013 (UTC)[reply]
Bgwhite Wow! That was really fast. However, there are 2 small glitches. First: On all priorities, the description of errors is not rendered properly. Example: description of error 42 is "HTML text style element" whereas it should be "HTML text style element <small>". So the tag "<small>" is not displayed. The same happens for errors 41, 40, 38, 39, etc. Second: The translation of error descriptions is not loaded from arz:ويكيبيديا:تشيك ويكيبيديا/ترجمه. For the same error 42 the description should be "عنصر تنسيق HTML <small>" --Meno25 (talk) 21:55, 30 October 2013 (UTC)[reply]
Meno25. Things should be looking much better. Program is using the translation file now. I'm rerunning the dump file and will take about 10 minutes. For the <small> not showing up, the translation file uses html syntax and not wiki syntax. The pages on WMFLabs are using html. To have the small tag show up, type: &lt;small&gt;. This also works for wiki syntax. Bgwhite (talk) 23:10, 30 October 2013 (UTC)[reply]
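The escaping mentioned above can be sketched in Python with the standard library. This only illustrates the general technique; the function name is an assumption, and the real WMFLabs pages may handle the translation file differently.

```python
import html


def escape_description(description):
    """Escape a description so that tag names such as <small> are shown
    literally on an HTML page instead of being interpreted as markup."""
    # quote=False: only &, < and > are replaced; quotation marks are kept.
    return html.escape(description, quote=False)
```

For example, `escape_description("HTML text style element <small>")` yields the `&lt;small&gt;` form that the translation file needs.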

Remove errors[edit]

 Done

Hi all, I have got an idea that Check Wiki could search for links like this:

  • [[2012|2013]] - which year is meant?

Is it good idea? Matt S. (talk | cont. | cs) 12:41, 30 October 2013 (UTC)[reply]

I'd say it's a good idea, especially with VE being unleashed on a lot of wikis. --NicoV (Talk on frwiki) 12:47, 30 October 2013 (UTC)[reply]
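A check along the lines suggested above could look roughly like this Python sketch. The regular expression and function name are assumptions for illustration only; this is not an existing Check Wikipedia error.

```python
import re

# Assumption: only plain four-digit year links such as [[2012|2013]] are of interest.
YEAR_LINK_RE = re.compile(r"\[\[\s*(\d{4})\s*\|\s*(\d{4})\s*\]\]")


def mismatched_year_links(wikitext):
    """Return piped links whose target year differs from the displayed year."""
    return [
        match.group(0)
        for match in YEAR_LINK_RE.finditer(wikitext)
        if match.group(1) != match.group(2)
    ]
```

Links such as `[[2013|2013]]` or unpiped `[[2012]]` are deliberately left alone, since only a mismatch raises the question of which year is meant.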
Matt S., NicoV, magioladitis, Meno25 and NicoV. After I get done adding "Add list as input" from above, I'll turn to adding new errors. I'd like to first remove any errors that should be no longer.
  • Errors #01, #04, #42, #89, #90 and #91 are already deactivated and the code removed from the program.
  • Following errors are used by 15 wikis or less: #30 (15), #33 (11), #35 (15), #41 (13), #62 (7), #68 (10), #77 (13), #79 (11), #82 (8) and #92 (11).
I'd personally like to see deactivated: #30, #33, #35, #41, #77, #79 and #81. #77 seems identical to #66. #33 is a valid html5 tag. #41 is not valid html5, but there is no easy replacement and it can be turned into valid html5 by mediawiki's parser. #30, #79 and #81 errors are just way too common. I don't see the need for #35 as the majority of articles have the description in the article's text and not all galleries need a description. Bgwhite (talk) 21:44, 30 October 2013 (UTC)[reply]
I agree for the already deactivated errors, they are unnecessary. I agree with you that generating lists for #30, #33, #35, #41, #77, #79 is not very useful, but I will probably keep some of them in WPCleaner (on frwiki, we already deactivated some errors for a long time, but kept them detected by WPCleaner). A bit more reserved about #81, but it already seems to be deactivated (not working on frwiki while in middle priority). I don't understand "#77 seems identical to #65" : first one is about <small />, other one is about <br />. --NicoV (Talk on frwiki) 22:11, 30 October 2013 (UTC)[reply]
I meant to say #66 and not #65. What ones would you keep in WPCleaner? #81 is not deactivated, but it should be working again. Bgwhite (talk) 22:26, 30 October 2013 (UTC)[reply]
Ok, #66 and #77 are very close (difference is whether the <small /> tag is around the whole description or not). #81 seems to be working again (170 detections right now). I would keep probably all of them (if some wiki want to use them), but I especially think the following ones can be useful: #30 (but it's probably only for users interested in them), #35 (same), #77, #79 (WPCleaner can analyze the target to propose some titles), #81 (useful to clean up the wikitext, automated for some situations). --NicoV (Talk on frwiki) 09:56, 31 October 2013 (UTC)[reply]
I support removing deactivated errors from Check Wikipedia. For other errors: #30, #35, #41, #79, #77 seem to be safe to remove while I believe #33 and #81 should stay. --Meno25 (talk) 22:57, 30 October 2013 (UTC)[reply]
Firstly, I'm not sure about removing #30, #33, #41, #68 and #79. I find using <u> and <big> very bad. Pictures without descriptions are against accessibility, #68 is often an error, but not always, and I find links like [7] [8] wrong (URL is better). Secondly, I would like to keep #81 and #92. These are useful. Thirdly, #66 and #77 should be merged. Matt S. (talk | cont. | cs) 16:15, 31 October 2013 (UTC)[reply]

Ok, I think this is where we are at. Yes is for delete

Error    #30    #33    #35    #41    #62    #68    #77    #79    #81    #82    #92
Delete?  Yes    No*    Yes    No     Yes    No     Merge  Yes    No     No*    No

Error #33 is yes unless Meno confirms otherwise. I did a scan of enwiki for cases of <u> and the majority were to Arabic articles. It appears you have to underline some Arabic characters. Meno25, is this true? Arabic language and Ahmed II are examples. <u> is valid html and if it has to be used by Arabic articles, then I say remove. If it is not needed, then keep it.

Errors #30, #35 and #79 are to be removed. 3 out of 4 say remove, and looking at the lists generated on other wikis, the errors are not being fixed.

Nobody said anything about #62.

Matt S. was only one to mention #68. Nobody mentioned #82, but it is similar to #68. Matt, should #82 be kept or removed?

Anyone have objections? Bgwhite (talk) 21:11, 31 October 2013 (UTC)[reply]

Bgwhite We don't underline Arabic characters when writing (using Arabic script). However, sometimes we use underlining to show how words are pronounced in Arabic. An example in Arabic language is:
kib(un) 'book', -ti-b(un) 'writer', mak-ta-b(un) 'desk',
By all means feel free to do whatever you think is better either removing the error or not. And a big thank you from me for your work in improving the Check Wikipedia project. --Meno25 (talk) 22:24, 31 October 2013 (UTC)[reply]
Thank you Meno. I guess it is not necessarily needed. So, #33 will be kept. Bgwhite (talk) 22:35, 31 October 2013 (UTC)[reply]
@Bgwhite: Sorry that I couldn't answer on Friday but my device has broken. I am really not sure about #82. Someone could say that for example [[wikt: is bad inside an article. Generally, links to other projects should be included via templates or in bulleted lists in the "External links" section.
#62: I am not interested in fixing this error, and headline alone isn't always bad. So delete. Matt S. (talk | cont. | cs) 10:51, 3 November 2013 (UTC)[reply]
  • Errors #30, #35, #62 and #79 have been removed. Will update the code on labs in a couple of days. Bgwhite (talk) 07:58, 6 November 2013 (UTC)[reply]

Just asking[edit]

 Resolved

How frequent does the list re-scan itself? (sorry for my bad English) --Kc kennylau (talk) 13:55, 31 October 2013 (UTC)[reply]

If I'm not mistaken: every day there's a partial scan done (with some of the pages modified in the last 24h), and there's a full scan each time a new dump is made available. --NicoV (Talk on frwiki) 14:07, 31 October 2013 (UTC)[reply]
As you speak Chinese and Japanese, those two wikis don't have daily partial scan. Those two have a new dump twice a month and scans are made of those. English wiki has a partial scan, but only one dump a month. I wish I could tell when the dumps happen, but it varies. English dump is usually done the first week of the month. Bgwhite (talk) 18:11, 31 October 2013 (UTC)[reply]

Problems with checkwiki_bots.cgi[edit]

 Resolved

Hi Bgwhite, a WPCleaner user is reporting troubles to retrieve the list of pages. I can't test right now with WPCleaner, but I tried a checkwiki_bots.cgi URL and the behaviour is strange: instead of displaying a list of pages, FF wants me to download the cgi file. --NicoV (Talk on frwiki) 12:28, 4 November 2013 (UTC)[reply]

NicoV I can confirm that WPCleaner is not retrieving the list of pages. AWB also doesn't get the list. FF and IE want to download the cgi file, Chrome does not. This is also true for the checkwiki.cgi script. I haven't changed any files for a few days. Time to ask the idiots. Bgwhite (talk) 18:42, 4 November 2013 (UTC)[reply]
Hi, I tried when I came back home about 2h ago, and it was working for me and it's still working. I don't know what's going on again with Labs. --NicoV (Talk on frwiki) 19:54, 4 November 2013 (UTC)[reply]

#28[edit]

 Done

Hi, in some situations, the end of the table is given by a template and not by |}. Would it be possible to add a _templates_ parameter to list possible replacements for the usual end of table ? cf. this modification, reported on my talk page. --NicoV (Talk on frwiki) 07:34, 8 November 2013 (UTC)[reply]

NicoV Yes. I hard-coded 20 templates for enwiki way back when. Majority are sports related. Add the templates to the Template file and I will get it coded up. Bgwhite (talk) 08:09, 8 November 2013 (UTC)[reply]
Great, I've put one template in the list (also sports related). --NicoV (Talk on frwiki) 08:13, 8 November 2013 (UTC)[reply]
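As a rough Python sketch of the idea discussed above: a table counts as closed either by `|}` or by one of a configurable list of end-of-table templates. The function, its parameter and the template name in the usage note are hypothetical; the real checkwiki logic is Perl and may work differently.

```python
import re


def table_not_closed(wikitext, end_templates=()):
    """Return True if there are more table openings than table endings.

    end_templates: names of templates that are assumed to emit the closing
    "|}" themselves (hypothetical configuration value).
    """
    opens = len(re.findall(r"^[ \t]*\{\|", wikitext, flags=re.MULTILINE))
    closes = len(re.findall(r"^[ \t]*\|\}", wikitext, flags=re.MULTILINE))
    for name in end_templates:
        # Match {{Name}} or {{Name|...}}, ignoring letter case.
        pattern = r"\{\{\s*" + re.escape(name) + r"\s*(?:\||\}\})"
        closes += len(re.findall(pattern, wikitext, flags=re.IGNORECASE))
    return opens > closes
```

Called as `table_not_closed(text, ["End table"])`, a table terminated by a hypothetical `{{End table}}` template would no longer be reported.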

Error 37 on arzwiki[edit]

 Done

Why does error 37 on arzwiki have so many results (currently 9889)? Is it broken? --Meno25 (talk) 16:23, 9 November 2013 (UTC)[reply]

Hopefully I just turned it off in the Translation file. This error is for Latin Alphabet wikis only. Bgwhite (talk) 00:14, 10 November 2013 (UTC)[reply]
Thank you for the quick response. --Meno25 (talk) 08:47, 10 November 2013 (UTC)[reply]

New error, no closing </ref>[edit]

 Resolved

Hi, I've come across some pages that lack a closing </ref> tag and I was wondering if you could make that an error here. Thanks, Thegreatgrabber (talk)contribs 01:51, 13 November 2013 (UTC)[reply]

I believe the tracking category Category:Pages with incorrect ref formatting already catches this. Bgwhite (talk) 02:12, 13 November 2013 (UTC)[reply]

Format of translation file ?[edit]

 Resolved

Hi again, is it possible to replace the <pre /> in the configuration file by a <syntaxhighlight>...</syntaxhighlight> ? It would give a better display (ability to wrap lines, maybe partial syntax highlighting). --NicoV (Talk on frwiki) 13:55, 13 November 2013 (UTC)[reply]

What!! Three questions in a row. You are really trying my patience. It should be fine. I'm not coding anything against <pre />. Bgwhite (talk) 20:32, 13 November 2013 (UTC)[reply]
Yes, I was feeling bold today ;-) --NicoV (Talk on frwiki) 20:40, 13 November 2013 (UTC)[reply]
Works like a charm, and a lot prettier to read: configuration file. --NicoV (Talk on frwiki) 12:03, 14 November 2013 (UTC)[reply]

Information about DEFAULTSORT errors[edit]

information Note:

Hi everyone,

This is just an information about some errors regarding DEFAULTSORT:

  • #89, #90 and #91 can be deactivated for quite some time now, since MediaWiki handles correctly uppercase/lowercase for sorting.
  • #6 and #37 may be deactivated for some wikis if someone asks the developers to configure their wiki with a better algorithm for sorting in categories, removing the need to replace characters with accent marks with another character. We did this on frwiki, and it works nicely after a transition time. See bugzilla:54680 for more information. You can contact pl:User:Matma Rex if you're interested in setting this up for your wiki.

--NicoV (Talk on frwiki) 09:22, 13 November 2013 (UTC)[reply]

#89, #90 and #91 are already deactivated in checkwiki.
On Labs, Swedish is the default collating value for the database. sigh.... Bgwhite (talk) 20:24, 13 November 2013 (UTC)[reply]
Ouch, interesting choice (trying to find a reason for such a collating...) --NicoV (Talk on frwiki) 20:42, 13 November 2013 (UTC)[reply]
NicoV, short answer: Labs people are lazy. Long answer: Labs uses MariaDB, which is a fork of MySQL. MySQL is owned by Oracle, which acquired it when it bought Sun. Sun bought it from MySQL AB, whose founders developed MySQL. The MySQL AB founders are from, and MySQL AB was headquartered in... Sweden. One of the very first things you do when you set up MySQL is to change its default collate value. Labs is too lazy to change that, so every table one creates has to set a collate value. I set it to utf8_unicode_ci (ci means case insensitive). Labs also has the default character set as Latin1, which I changed to utf8. Lazy buggers. Bgwhite (talk) 22:22, 13 November 2013 (UTC)[reply]
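Setting the character set and collation per table, as described above, could be done with a statement like the one this small Python helper builds. The helper, the table name and the column in the usage note are made up for the example; only the `utf8` and `utf8_unicode_ci` values come from the discussion.

```python
def create_table_sql(table, columns, charset="utf8", collation="utf8_unicode_ci"):
    """Build a MySQL/MariaDB CREATE TABLE statement that sets the character
    set and collation explicitly instead of relying on the server default.

    columns: list of (name, sql_type) pairs; names are hypothetical.
    """
    cols = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE `{table}` (\n  {cols}\n) "
        f"DEFAULT CHARACTER SET {charset} COLLATE {collation};"
    )
```

For instance, `create_table_sql("cw_example", [("title", "VARCHAR(255)")])` produces a statement ending in `DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;`, so the table does not inherit the Swedish default collation.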

Deprecated errors[edit]

Hi, if I understand correctly, several errors are currently deprecated (not checked any more even if the priority is set in the translation file). Would it be possible to show that information in list of errors (maybe "deprecated" instead of "off") or to even remove them from the list ? --NicoV (Talk on frwiki) 14:15, 22 November 2013 (UTC)[reply]

Yea, it should be done. However, I'm going to be lazy and use those slots when I start adding new errors. Bgwhite (talk) 08:31, 24 November 2013 (UTC)[reply]
Really not a huge fan of reusing previously used slots: all translation files will need to be modified, tools like WPCleaner will need to be modified, ... --NicoV (Talk on frwiki) 13:56, 24 November 2013 (UTC)[reply]
Yea, there are pluses and minuses. Translation files and WPCleaner will need to be modified no matter what, though by how much is a different story. Still, any more trouble that I can cause you is a definite bonus. :) Bgwhite (talk) 23:02, 25 November 2013 (UTC)[reply]
Other big minuses:
  • At least frwiki has several pages describing errors for CheckWiki : translation file, list of errors, individual page per error (fr:P:CS/002), ...
  • Some errors, even deactivated in Check Wiki, have some interest, and WPCleaner can still detect them on a per page basis. Example: listing all pages containing images without description is not very useful, but some people find it useful to detect images without description inside the page they are analyzing (for example, frwiki project has deactivated the list for #30 some time ago, but asked to keep it detected by WPCleaner, error_030_bot_frwiki = true in translation file. Same for #82, #84). I know some people use WPCleaner to check an article they have written, but are not part of CheckWiki project otherwise.
  • If error numbers are not reused, there's no real need to modify translation files (information will just be useless) or WPCleaner
As always, do as you see fit, but my preference goes on not reusing old numbers ;-) --NicoV (Talk on frwiki) 23:10, 25 November 2013 (UTC)[reply]

#61[edit]

 Resolved

Hi,

I was trying to implement in WPCleaner the list of templates for #61 (reference before punctuation), and I tested it with 2014 FIFA World Cup which is reported as having the error on Labs. But I think it shouldn't be detected as an error. It detects Brasil 2014{{refn|group=nb|The [[Portuguese language|Portuguese]] pronunciation is {{IPA-pt|ˈkɔpɐ du ˈmũdu dɐ ˈfifɐ bɾɐˈziw ˈdojz ˈmiw i kɐˈtoʁzi|}}, in Brazil's standard pronunciation.}} which doesn't seem to be a problem: maybe it detects the comma after the {{IPA-pt}} template ? --NicoV (Talk on frwiki) 21:44, 25 November 2013 (UTC)[reply]

NicoV, this is a bug in Checkwiki. I did confirm it for {{ref}}, but hadn't checked {{refn}}, which is why I removed {{ref}} from the translation file and not {{refn}} yet (I've never used either one before). I've been asked to add other templates such as {{citation needed}} (see here). My plan is to work on coding Checkwiki this week, so hopefully I'll get caught up again on requests. Bgwhite (talk) 22:35, 25 November 2013 (UTC)[reply]
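One plausible cause of such a false positive is a matcher that stops at the first `}}` it sees, which here closes the nested {{IPA-pt}} call rather than {{refn}}. The following Python sketch shows how tracking brace depth avoids that. It is an assumption about the failure mode, not the actual Checkwiki code, and all names are illustrative.

```python
PUNCTUATION = ".,;:"


def template_end(text, start):
    """Return the index just past the template that opens at `start`,
    honouring nested templates; return -1 if it is never closed."""
    depth = 0
    i = start
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    return -1


def ref_template_before_punctuation(text, name="refn"):
    """Return offsets of {{name ...}} calls directly followed by punctuation."""
    hits = []
    opener = "{{" + name
    pos = text.find(opener)
    while pos != -1:
        end = template_end(text, pos)
        if end == -1:
            break
        if end < len(text) and text[end] in PUNCTUATION:
            hits.append(pos)
        pos = text.find(opener, end)
    return hits
```

With this approach the comma after the inner {{IPA-pt}} call is ignored, because only the character after the outer closing braces is inspected.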

ISBN errors and external links[edit]

 Resolved

Hi,

I was wondering if incorrect ISBN should be detected in external links. Currently, they seem to be detected (example for #73 fr:L'Absence d'oiseaux d'eau where I fixed the ISBN everywhere except for the link to the page on the editor website, which has an incorrect ISBN). --NicoV (Talk on frwiki) 09:26, 27 November 2013 (UTC)[reply]

I'm confused. I see "external links" as the external links section and I don't think that is what you mean. I think you mean, should they be found inside urls??
Except for error #69, Checkwiki will check ISBNs inside a url. I'm not sure if this is a good or bad idea. In the case of fr:L'Absence d'oiseaux d'eau, it is a good idea. In cases of Japanese Anime, such as List of Fist of the North Star chapters, they like adding X to the beginning of ISBNs (see refs 1, 2 and 3). In the article body of Anime articles, I fix or remove the "bad" ISBN, but can't with the url. There are currently 15 anime articles whitelisted for error #71. Bgwhite (talk) 20:06, 27 November 2013 (UTC)[reply]
Yes, I meant urls. I'm not sure also: they are links to other web sites which we often can't change if the other web site is incorrect (not always true: I sent an email to the webmaster of the editor for fr:L'Absence d'oiseaux d'eau this morning, and their site is now fixed :-) ), but sometimes it can be an error on our side. Ok, let's keep the detection unchanged. I have to modify WPCleaner to detect also in urls. --NicoV (Talk on frwiki) 20:28, 27 November 2013 (UTC)[reply]
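For reference, the standard ISBN-10 and ISBN-13 check-digit rules can be sketched in Python as below. This shows the checksum arithmetic only; it is not the Checkwiki or WPCleaner implementation, and the function name is an assumption. It also shows why an ISBN with an X added at the beginning is rejected.

```python
DIGITS = "0123456789"


def isbn_is_valid(raw):
    """Validate the check digit of an ISBN-10 or ISBN-13 string.

    Hyphens and spaces are ignored. Any other stray character, such as a
    leading "X", makes the value invalid.
    """
    chars = raw.replace("-", "").replace(" ", "").upper()
    if len(chars) == 10:
        # First nine characters must be digits; the last may be X (value 10).
        if any(c not in DIGITS for c in chars[:9]) or chars[9] not in DIGITS + "X":
            return False
        total = sum(
            (10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(chars)
        )
        return total % 11 == 0
    if len(chars) == 13:
        if any(c not in DIGITS for c in chars):
            return False
        # Weights alternate 1, 3, 1, 3, ... across the thirteen digits.
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars))
        return total % 10 == 0
    return False
```

The same function can be applied to an ISBN found in body text or inside a URL; whether the latter should be reported is the judgment call discussed above.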

#65[edit]

 Resolved

I found a problem while working through the itwiki list. All remaining entries have the same issue: <br /><math>...</math> at the end of the description. So the <br /> isn't unnecessary. Thanks! --AlessioMela (talk) 10:59, 30 November 2013 (UTC)[reply]

AlessioMela, this is caused by removing all the math tags before scanning. What is inside the math tags can cause many false positives with other errors. You can set up a whitelist, so the articles don't show up as an error. Look at either Wikipedia:WikiProject Check Wikipedia/Translation or fr:Projet:Correction_syntaxique/Traduction. They both have whitelists for various errors. English one has it set up for #65. Bgwhite (talk) 00:18, 1 December 2013 (UTC)[reply]
Ok, I understand. I'll set up a whitelist for that. If I find similar problems with other errors, can I always use the whitelist system? --AlessioMela (talk) 10:09, 1 December 2013 (UTC)[reply]
AlessioMela, yes, whitelists can be created for all errors. On the other hand, every malfunction should be reported, whitelisting is only for rare cases. Matt S. (talk | cont. | cs) 10:39, 1 December 2013 (UTC)[reply]
Thanks! --AlessioMela (talk) 12:33, 1 December 2013 (UTC)[reply]
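The interaction described above can be reproduced with a small Python sketch: once the math tags are removed, a `<br />` that preceded them ends up at the very end of the description and looks unnecessary. The regular expressions, function names and the whitelist handling are assumptions for illustration, not the actual Checkwiki code.

```python
import re

MATH_RE = re.compile(r"<math[^>]*>.*?</math>", re.DOTALL | re.IGNORECASE)
TRAILING_BR_RE = re.compile(r"<br\s*/?>\s*$", re.IGNORECASE)


def description_ends_with_break(description, strip_math=True):
    """Return True if the image description ends with a <br /> tag.

    With strip_math=True the <math> blocks are removed first, which mimics
    the pre-scan cleanup and produces the false positive discussed here.
    """
    if strip_math:
        description = MATH_RE.sub("", description)
    return bool(TRAILING_BR_RE.search(description))


def report(pages, whitelist=()):
    """pages maps a title to an image description; whitelisted titles are skipped."""
    return [
        title
        for title, description in pages.items()
        if title not in whitelist and description_ends_with_break(description)
    ]
```

A whitelist entry suppresses the report for a page without changing the detection itself, which is why it suits rare cases better than genuine malfunctions.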

Patch for #6 and #37 to support Hebrew[edit]

 Done

Hi. I've made this patch to add Hebrew characters for #6 and #37. How can I get it merged onto the actual running checkwiki.pl? Ijon (talk) 21:00, 5 December 2013 (UTC)[reply]

Ijon, wondered who did that. #6 and #37 are for Latin alphabet based wikis only. What is the reason for the errors in hewiki?

Update isn't running for German CheckWikipedia[edit]

 Resolved

Moin Moin Bgwhite and MPelletier (WMF), since 6 December 2013 no update has been running for the German CheckWikipedia script. In the English version there is no problem, as far as I can see. Could you look into the "problem"? Thanks --Crazy1880 (talk) 07:50, 10 December 2013 (UTC)[reply]

Crazy1880 Most languages are automatically updated except for English, French and German. The way WMFLabs does dump files means those three would be ~15 days old before I'd see them, so I have to manually do those. When a dump gets done for German, it has been so long that a new dump starts right up. The database at WMFLabs has also become unstable. This is causing programs to die because they cannot connect to the database. It is also causing the web pages to be very slow. Last week was horrible for the web pages.
A new German dump file was produced about 2 hours ago, so I just started up the processing. I'll keep an eye on it to make sure it finishes.
German, French, English, Spanish, Arabic and Czech also get updated daily and that has been mostly working. The daily update looks at what articles changed that day and scans them. Due to a maximum number of articles I can get, not all, but most changed articles are scanned. Bgwhite (talk) 08:31, 10 December 2013 (UTC)[reply]
(And svwp. Don't forget the Swedish Wikipedia. It has been 17 days...haha) I really love all your help and efforts! It is highly appreciated!-(tJosve05a (c) 09:03, 10 December 2013 (UTC)[reply]
Josve05a Dang, I do keep forgetting Swedish Wikipedia. I'll make sure it never runs again. :) Bgwhite (talk) 21:18, 10 December 2013 (UTC)[reply]
Bgwhite that's what I meant: "The daily update looks at what articles changed that day and scans them. Due to a maximum number of articles I can get, not all, but most changed articles are scanned." This is what hasn't been running since 6 December. Sorry, I wrote something wrong. --Crazy1880 (talk) 13:22, 10 December 2013 (UTC)[reply]
Crazy1880, my first thought is that it would be the database problems. The database problems have also stopped English from running a couple of times. But, three days in a row is troublesome. French is doing the same thing. I'll look closely at today's run and see how it goes and see if I can see a problem. Bgwhite (talk) 21:18, 10 December 2013 (UTC)[reply]
Bgwhite, I checked it this morning and as far as I can see it's running, isn't it? Regards --Crazy1880 (talk) 06:24, 11 December 2013 (UTC)[reply]
Bgwhite, addition: With the new dump the ID 30 (Image without description) is completely empty. This can't be right. Can you have a look at this? Regards --Crazy1880 (talk) 06:36, 12 December 2013 (UTC)[reply]
Crazy1880, it was removed. See Remove errors for the discussion and other errors that were removed.
Moin Bgwhite, so as far as I can see the daily article scan seems to be running. Thank you. I'll keep an eye on it --Crazy1880 (talk) 16:50, 13 December 2013 (UTC)[reply]
Crazy1880, yea, it has been working. Programs are still dying, so don't be surprised if it doesn't update once in a while. Bgwhite (talk) 18:49, 13 December 2013 (UTC)[reply]
Moin Bgwhite, thank you. But at what time does CheckWikipedia normally run its daily update? Regards --Crazy1880 (talk) 18:21, 15 December 2013 (UTC)[reply]
Daily updates start at 0z everyday. Bgwhite (talk) 22:20, 15 December 2013 (UTC)[reply]

What is error #88[edit]

 Resolved

Ping: @NicoV:, @Magioladitis:, @GoingBatty:. Why is it bad to have a blank space in the beginning of the DEFAULTSORT? -(tJosve05a (c) 09:48, 11 December 2013 (UTC)[reply]

@Josve05a: - I don't know. I intentionally added {{DEFAULTSORT: Roy, Vipul}} (with a space) to Vipul Roy, and the article appears to be correctly sorted under "R" in Category:Indian male film actors. Hopefully Magioladitis or Bgwhite can answer your question. GoingBatty (talk) 14:28, 11 December 2013 (UTC)[reply]
A long time ago in a far away place, a program named MediaWiki didn't like colons followed by a space. Since then, enlightenment has come upon some software, but it looks much better not having a space just any 'ol anywhere. Bgwhite (talk) 18:57, 11 December 2013 (UTC)[reply]
It used to cause categorisation problems in mirror sites. The error is rare anyway, no harm to normalise this. -- Magioladitis (talk) 19:15, 11 December 2013 (UTC)[reply]

But if this does not cause any error or trouble, why is it still a CHECKWIKI-error? -(tJosve05a (c) 20:19, 11 December 2013 (UTC)[reply]

Ignore it. We only do this fix in parallel with others as an extra just in case some mirror sites still have some problems. Very few pages anyway. -- Magioladitis (talk) 23:27, 11 December 2013 (UTC)[reply]

There are more interesting similar things to fix. Check Wikipedia:AWB/FR#Fix_spacing_in_DEFAULTSORT. -- Magioladitis (talk) 23:02, 12 December 2013 (UTC)[reply]
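The normalisation discussed in this thread amounts to removing whitespace directly after the DEFAULTSORT colon. A minimal Python sketch follows; the function name and regular expression are assumptions, and this is not the AWB or WPCleaner code.

```python
import re

# Captures "{{DEFAULTSORT:" so that only the spaces or tabs after it are removed.
DEFAULTSORT_SPACE_RE = re.compile(r"(\{\{\s*DEFAULTSORT\s*:)[ \t]+")


def fix_defaultsort_space(wikitext):
    """Remove spaces or tabs that directly follow the DEFAULTSORT colon."""
    return DEFAULTSORT_SPACE_RE.sub(r"\1", wikitext)
```

Applied to the example above, `{{DEFAULTSORT: Roy, Vipul}}` becomes `{{DEFAULTSORT:Roy, Vipul}}`, while the space inside the sort key is kept.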

FYI... I'm taking suggestions for new errors. Discussion is above this section. Any suggestions are welcome.
hewiki doesn't have a translation file. Want to help in translating? Also with the file, hewiki uses the default for which errors are turned off or on. Here is the English translation file you can use for reference. Don't need to translate the top part, only the individual errors. Bgwhite (talk) 22:10, 5 December 2013 (UTC)[reply]
Thanks for responding. #6 and #37 are in fact running on hewiki as well, and generating false positives, calling out perfectly valid strings in Hebrew, making them useless. My patch should fix it, the way code around it is doing for the Cyrillic alphabet on the Russian Wikipedia, for example. So, can my patch be merged?
I'll think about new features as I get more familiar with all the existing ones and make sure they work well on the Hebrew Wikipedia.
Hebrew Wikipedia does indeed have a translation file, and it seems to be affecting behavior too, as far as I can tell. Do you think otherwise? Ijon (talk) 22:54, 5 December 2013 (UTC)[reply]
Ijon, thank you for telling me about the translation file. I didn't have the location. I've added it, so the web pages and checkwiki program will use it from now on. If you see any errors that should be turned off or on, please edit the translation file. Changes from the translation files get updated at 0z.
I'll look at your patch and get things running for you sometime late next week. I'm dealing with fixing the monthly dump errors right now. I have a bad memory and I try to concentrate on one big task at a time. Bgwhite (talk) 08:07, 7 December 2013 (UTC)[reply]
Ijon. Your patch was added to an obsolete branch. It was scary seeing that old piece of #*$&@#. Code has been added. The last dump run for hewiki was on the 5th. Usually, a new dump is produced every ~15 days, so around the 20th you should see articles for #006 and #037. Bgwhite (talk) 22:36, 10 December 2013 (UTC)[reply]
Oh, excellent, thanks! So, for future reference, what is the git repo and branch I should fork from for any future patches? Ijon (talk) 21:21, 12 December 2013 (UTC)[reply]
https://github.com/scfc/checkwiki/tree/pu/tools-migration is where anything I do goes into. Bgwhite (talk) 22:38, 12 December 2013 (UTC)[reply]

Future of the project?[edit]

Hi everyone (Bgwhite, Magioladitis, all others)

I was really sad when I understood a few months ago that Bgwhite had left the project (and wiki altogether). His work made the project grow a lot since he took over after SK. Now that Bgwhite has also left the project, I'd like to understand what can be done to continue with this project. I understand that enwiki has some strong opponents against it (I don't understand why, keeping a good maintenance level is good for wiki in the long term...), but many other wikis seem to be ok with the project. My experience with frwiki is that I can do a lot of things over there, I just have to explain from time to time what I'm doing and why, but there are no ayatollahs against the project.

It seems that some lists have not been updated in the last weeks (or months ?), so there's probably some maintenance to do on Labs for checkwiki. Is anyone interested in taking over this part of the project (at least making sure that the scripts keep running on a regular basis on Labs) ? How can we get access to checkwiki account on Labs ? Myself, I already have little time available to work on WPC, so working also on checkwiki would be too much...

If we don't manage to keep the checkwiki scripts running, there's still the possibility to use WPC to generate some of the lists (I do it on frwiki), but it's only based on the dumps which are normally produced twice a month.

--NicoV (Talk on frwiki) 16:33, 30 October 2017 (UTC)[reply]

NicoV I think we have to move the entire project on Meta and say goodbye to English Wikipedia, at least for now. There was an effort to encourage people to determine which tasks they find useful but I am not sure about the current status anymore. -- Magioladitis (talk) 18:10, 30 October 2017 (UTC)[reply]
I'm not sure where you get the idea that WP:CHECKWIKI has strong opponents. Most people have been pretty supportive of it, "opponents" included. Moving the project to 'meta' won't change anything about anything. Headbomb {t · c · p · b} 18:45, 30 October 2017 (UTC)[reply]
Headbomb The task is global. We moved the main information from German Wikipedia to English Wikipedia but I think it will reduce drama on English Wikipedia if we have a list of all possible errors that can be fixed on Meta. -- Magioladitis (talk) 18:58, 30 October 2017 (UTC)[reply]
Centralizing the project elsewhere certainly is doable (and may even be desirable for a variety of reasons), but it won't change anything about the level of support, or the drama (or lack thereof) the project has. Headbomb {t · c · p · b} 19:01, 30 October 2017 (UTC)[reply]
Magioladitis I don't think that hosting the discussions on Meta or English wikipedia changes anything: it's just the discussions, not the tools. I'm more worried about maintaining the tools on Labs: currently, the scripts are not working normally, only a few errors are reported... --NicoV (Talk on frwiki) 15:35, 31 October 2017 (UTC)[reply]

Both the enwiki (started: 31 August 2017, job id: 8992714) and frwiki (started: 12 September 2017, job id: 9458602) jobs have hung on the Toolforge grid. Since they are configured to have only one instance running at a time, new instances of these jobs will not run until these are killed. The statuses can be checked using qstat -j 8992714,9458602 on a Toolforge command line. I am not a member of the checkwiki Toolforge project, so I can't kill them. The two log files are at /data/project/checkwiki/var/log/frwiki-delay.o9458602 and /data/project/checkwiki/var/log/enwiki-delay.o8992714 --Bamyers99 (talk) 19:09, 31 October 2017 (UTC)[reply]

Some ideas:

  • Couldn't some WMF department help us? Given how successful the project is...
  • Shouldn't we finally have this integrated as a MediaWiki extension, like the successful Linter? This would spread it to much much more wikis.

Matěj Suchánek (talk) 08:39, 1 November 2017 (UTC)[reply]

There was an effort to integrate it into a MediaWiki extension. Everything was interrupted due to a series of unfortunate events. -- Magioladitis (talk) 10:16, 1 November 2017 (UTC)
@Magioladitis: If you would like to add me as a maintainer to the checkwiki tool I can get the enwiki and frwiki jobs running again. I am an experienced developer and run several bots/tools. --Bamyers99 (talk) 19:04, 1 November 2017 (UTC)[reply]
+1 for Bamyers99's request: we need someone who can do this kind of action. You can add me too if you want, but I've never used these tools. --NicoV (Talk on frwiki) 10:52, 2 November 2017 (UTC)
@Magioladitis and Ladsgroup: Could anyone add a maintainer to unblock the jobs? --NicoV (Talk on frwiki) 07:59, 4 November 2017 (UTC)
Bamyers99 : thanks for the explanation ! Matěj Suchánek: Pinging Whatamidoing to see if we can get some assistance on this one. Linter could integrate many of the detections I think, maybe in the future. --NicoV (Talk on frwiki) 10:44, 1 November 2017 (UTC)[reply]

I just added User:Bamyers99 as maintainer, hope that helps. Ladsgroupoverleg 15:38, 4 November 2017 (UTC)[reply]

@Ladsgroup: Thanks. I am able to log on and will try to get enwiki and frwiki running again and figure out where the infinite loop is. --Bamyers99 (talk) 15:51, 4 November 2017 (UTC)[reply]
enwiki and frwiki are working again. An infinite loop bug has been fixed. --Bamyers99 (talk) 22:44, 6 November 2017 (UTC)[reply]

As an OT comment, I'm glad to see activity in this wikiproject. The English Wikipedia is unfortunately behind with the Tidy fixes, while, if all goes well, a couple of big other wikis like the German and the Italian Wikipedia will be on track for an early switch to Remex a few weeks from now. I plan to nudge the English Wikipedia community once again soon, this time at the technical Village Pump. Obviously curious to hear other ideas of course :) Best, Elitre (WMF) (talk) 18:10, 21 November 2017 (UTC)[reply]

Going to ping Magioladitis (talk · contribs) here. Bamyers99 got the thing to run again. That should give your bot lots of work to do! Headbomb {t · c · p · b} 19:37, 21 November 2017 (UTC)[reply]
Subbu has now provided 2 examples of stuff that needs fixing here. Elitre (WMF) (talk) 10:52, 27 November 2017 (UTC)[reply]

Elitre (WMF) In order to avoid any false accusations of performing "cosmetic edits", I think we first have to wait for all pages to break and then fix the pages using bots. Any preemptive action will be considered by some editors as "unnecessary". Any massive fixes by non-bot accounts may be considered "bot-like editing". -- Magioladitis (talk) 22:45, 24 January 2018 (UTC)

Or you know, get consensus / demonstrate that these things are broken / will break / otherwise needs fixing. Headbomb {t · c · p · b} 22:59, 24 January 2018 (UTC)[reply]
I strongly believe that so far WMF has provided all the relevant information that's necessary to start fixing on a large scale, including simplified instructions for people who are not necessarily "techie", and several community members have shared their tools and techniques. Switching has already happened at major wikis (it.wp, de.wp) and more will follow very soon. How to organise, what to focus on, what to prioritise (example), that's obviously up to each community to decide. Again, if you need specific directions or clarifications, User:SSastry (WMF) is all ears and always available to provide guidance, tips, support etc. Elitre (WMF) (talk) 18:04, 26 January 2018 (UTC)[reply]

Headbomb Do you think that if I initiate this discussion would be a violation of my ban? -- Magioladitis (talk) 18:08, 13 February 2018 (UTC)[reply]

@Magioladitis: A discussion of what exactly? WP:CWERRORS has a list of cosmetic and non-cosmetic fixes. If you're proposing to reclassify something, the place to do that would be at WP:BOTN per "Magioladitis may ask specific questions, at the bot noticeboard or bot request for approval, to clarify whether bot tasks he wishes to undertake, or is currently undertaking, are permitted under remedy 1.1 of this case.". If you're proposing implementing a specific bot task to do non-cosmetic CWERRORS fixes, you're in the clear. If you're proposing to amend WP:COSMETICBOT, that's a violation. Headbomb {t · c · p · b} 18:39, 13 February 2018 (UTC)
Headbomb OK, so I can't initiate a discussion about fixing things that changes in MediaWiki will (in the near future) break because they are not broken right now. The best approach is to wait until the pages actually break and then discuss how to fix them. -- Magioladitis (talk) 18:48, 13 February 2018 (UTC)
Headbomb I don't think all the changes proposed by WMF staff are included in current project's list. The list may need to get updated. -- Magioladitis (talk) 19:00, 13 February 2018 (UTC)[reply]
I'd have to know the specifics of what 'things' will become 'broken' to really opine on that, but as mentioned here and elsewhere, non-HTML5 compliance doesn't mean things will be broken once rendered. As I read the remedy, the ban is on discussing/amending WP:COSMETICBOT as a policy. You're still allowed to ask questions about specific tasks, or clarifications on whether something specific is cosmetic or not. You just can't start the same discussion over and over because you don't like the answer you get. Headbomb {t · c · p · b} 19:01, 13 February 2018 (UTC)
Headbomb I would prefer to be on the same side here and not initiate any discussion that may be interpreted as an action to change the cosmeticbot policy or may lead to a policy change. This is also my advice to everyone else. -- Magioladitis (talk) 19:08, 13 February 2018 (UTC)

Tidy being turned off at 389 wikis next week

phab:T184656 has a list of 389 wikis where Tidy will be removed (and replaced by RemexHTML) on 31 January 2018. These wikis currently have fewer than 10 high-priority problems.

If you are active at other wikis, please look over the list and check your favorites. If you notice problems after the switch, then please feel free to ping me, or (especially if it's urgent) leave a comment in the Phab task where the devs will see it. Whatamidoing (WMF) (talk) 18:10, 24 January 2018 (UTC)[reply]

This was postponed for a week, which means that it will (probably) happen sometime during the next ~12 to 18 hours. Whatamidoing (WMF) (talk) 07:11, 6 February 2018 (UTC)[reply]

Whatamidoing (WMF) My experience says that in English Wikipedia we should wait for the pages to break first and then act. -- Magioladitis (talk) 18:58, 13 February 2018 (UTC)

Three Greek sister projects were in that list. Can you tell me how things are going there? Whatamidoing (WMF) (talk) 19:04, 13 February 2018 (UTC)[reply]
Whatamidoing (WMF) I have not noticed any problems there. I mainly checked the Wiktionary. -- Magioladitis (talk) 19:52, 13 February 2018 (UTC)
Thanks. Small projects sometimes have big problems and no idea where to get help, so I really appreciate you checking on them. Whatamidoing (WMF) (talk) 20:01, 13 February 2018 (UTC)[reply]
Whatamidoing (WMF) The good thing is that Wiktionary is heavily based on templates. It would be easy to fix en masse. -- Magioladitis (talk) 20:18, 13 February 2018 (UTC)

Cross-Origin Requests

I already wrote about this problem, but it's still there: due to the Same Origin Policy, there is no opportunity to make a request to checkwiki from wikipedian's userscripts. @Bamyers99: maybe you can fix it? According to this task („I'll try to hack around it in my own tools, and but other tool authors to hack theirs“), Magnus Manske already faced this problem and, probably, can help with it. Thanks. Facenapalm (talk) 19:39, 24 February 2018 (UTC)[reply]

@Facenapalm: checkwiki has been changed to allow cross origin requests. --Bamyers99 (talk) 20:54, 24 February 2018 (UTC)[reply]
@Bamyers99: thank you very much! Facenapalm (talk) 22:03, 24 February 2018 (UTC)[reply]

Something wrong with updates?

Last date is March 20, but it's already April 10. Maybe somehow related to Wikipedia:Village pump (technical)#Analytics "pagecounts-ez" not generating, but those as far as I can see are only pageviews... Error log is screaming, but I don't know how related that is. @Bamyers99: --Edgars2007 (talk/contribs) 16:36, 10 April 2018 (UTC)[reply]

@Edgars2007: The dump scans are running again. The problem was related to the dump file server move. --Bamyers99 (talk) 19:14, 10 April 2018 (UTC)[reply]
Thanks, Bamyers99. --Edgars2007 (talk/contribs) 09:02, 12 April 2018 (UTC)[reply]

False positives? or actual errors

I have cleaned up "Tag with incorrect syntax" on eswiki and enwiki. There are still 26 to be done on the Spanish one, but I do not actually know the reason why they are listed. I have checked with another more experienced user and we do not find the incorrect syntax on those cases. Could be a false positive? or maybe those errors are not listed in the correct section. They could be actual errors but I cannot find them. Pablohn6 (talk) 23:25, 14 April 2018 (UTC)[reply]

<small/> should be changed to </small>. I don't think <br> takes arguments. I don't know what is happening with the rest, which all appear to use <cite>. – Jonesey95 (talk) 00:04, 15 April 2018 (UTC)[reply]
<cite> requires a closing </cite>. --Bamyers99 (talk) 00:25, 15 April 2018 (UTC)[reply]

False positives for #64

 Resolved

Hi, on frwiki, there are several false positives for the same link: [[(Miss)understood|(miss)understood]] (for fr:Ayumi Hamasaki, and others but I marked them as fixed yesterday...). It's probably reported because the only difference is the first letter which is in uppercase in the link, and lowercase in the text, but as there's a parenthesis at the beginning of the title, it's not automatically equivalent. --NicoV (Talk on frwiki) 11:43, 15 April 2018 (UTC)[reply]

@NicoV: Rule #64 has been fixed to skip the letter case check if the link does not start with a letter. --Bamyers99 (talk) 17:30, 15 April 2018 (UTC)[reply]
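
Editorial note: the rule described in this thread can be illustrated with a minimal Python sketch. It is not the Checkwiki implementation; the function name `is_redundant_piped_link` is hypothetical, and the sketch assumes a wiki where only the first character of a title is case-insensitive (true for most Wikipedias, not for Wiktionary).

```python
def is_redundant_piped_link(target: str, label: str) -> bool:
    """Return True when [[target|label]] could be written as [[label]].

    Hypothetical helper, not taken from the Checkwiki source.
    """
    # Underscores and spaces are equivalent in link targets, and
    # surrounding whitespace is ignored.
    target = target.replace("_", " ").strip()
    label = label.strip()
    if not target or not label:
        return False
    if target == label:
        return True
    # The case of the first character may differ only when that
    # character is a letter; "(Miss)understood" starts with "(",
    # so the letter-case check is skipped for it.
    if target[0].isalpha() and label[0].isalpha():
        return (target[0].lower() == label[0].lower()
                and target[1:] == label[1:])
    return False
```

With this rule, `[[Foo|foo]]` and `[[Foo_bar|Foo bar]]` are reported, while `[[(Miss)understood|(miss)understood]]` is not.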

Cycle links

Hi. Is there a way to add detection of wikilinks to redirects that point back to the article itself? Thank you. IKhitron (talk) 23:16, 15 April 2018 (UTC)

That is job more for SQL, not dump parsing. --Edgars2007 (talk/contribs) 05:55, 16 April 2018 (UTC)[reply]
Hi. WPCleaner can detect this if error #515 is configured, but it's article by article, not on a dump. Several problems with this on a large scale:
  • API doesn't tell if there's an anchor in the redirection, so it will also report redirects to a subsection of the page
  • Many cases where this is perfectly valid, when the redirect could be replaced by a full article
--NicoV (Talk on frwiki) 06:12, 16 April 2018 (UTC)[reply]
Thank you. Edgars2007, I started with quarry, of course. I get thousand results, most of them irrelevant, with a link in a navigation template. NicoV, could you provide an example for the second point, please? IKhitron (talk) 08:11, 16 April 2018 (UTC)[reply]
I think of several cases where the link could be valid, but I don't have existing examples at hand: for example, I remember about an article about a country: there were links to subdivisions of the country (like region), but the article about the regions for this country wasn't written at the moment, it was only a redirect to the country section about its subdivisions. Keeping the link was a good idea, because what was needed was only to replace the redirect by a real article. --NicoV (Talk on frwiki) 08:54, 16 April 2018 (UTC)[reply]
I definitely do not mean redirects with #. IKhitron (talk) 09:01, 16 April 2018 (UTC)
Ok then, but as the API doesn't provide this information directly, WPCleaner isn't able to distinguish them. For the dump analysis, it's probably too complex because you can't process each article independently. --NicoV (Talk on frwiki) 09:14, 16 April 2018 (UTC)
I see. So, find them all, when you know there will be false positives. IKhitron (talk) 10:12, 16 April 2018 (UTC)[reply]

Error #37 reported even if deactivated

 Resolved

Bamyers99, thanks a lot for the fix for #64! Another recurrent problem on frwiki is that even though #37 has been deactivated for a long time, CW keeps reporting errors for it. From time to time, I mark all articles as done, but they progressively come back... --NicoV (Talk on frwiki) 20:52, 16 April 2018 (UTC)

@NicoV: Fixed a bug where it was using error #34's priority instead of #37's. --Bamyers99 (talk) 01:13, 17 April 2018 (UTC)

Error #43 : false positive ?

Hi, I'm stuck in finding where is the #43 problem in fr:Histoire d'Israël. It's reported by CW for this part of the code, which seems ok to me:

{{Encadré|titre= |alignement=gauche|largeur=1000 px|contenu=
=== Les kibboutzim ===
{{article détaillé|Kibboutz}}
Un des traits caractéristiques de la jeune société israélienne est l'existence de communautés de vie et de travail, le plus souvent à objet agricole, appelées ''Kibboutzim''. Le premier kibboutz a été fondé en 1908 à [[Degania]] et il en existe 214 en 1950, regroupant plus de {{nombre|67000|habitants}}. Il y en a, en 2000, 268 pour {{nombre|117000|habitants}}. Les fondateurs étaient souvent de jeunes idéalistes venus d'Europe désireux de trouver un nouveau mode de vie et de participer à la création du nouvel État. Les kibboutzim fonctionnent comme des démocraties directes où tous les membres participent aux assemblées générales et où chacun effectue à tour de rôle les tâches les plus ingrates<ref name=Kib1/>.

Les kibboutzim ont connu un succès remarquable et contribuent à 33 % de la production agricole et à 6,3 % de la production industrielle israéliennes. Dans les années 1970,  près de 15 % des officiers de l'armée viennent des kibboutzim quand leur population ne dépasse pas 4 % de la population totale<ref name=Kib2>{{en}} {{lien web|url=http://www.jewishvirtuallibrary.org/jsource/judaica/ejud_0002_0012_0_11103.html|site=Jewish Virtual Encyclopedia|titre=Kibbutz Movement|date=2008|auteur=Shaked Gilboa}}</ref>. Après un déclin sensible dans les années 1990, les kibboutzim connaissent un certain renouveau qui se caractérise par une économie profitable, mais un abandon au moins partiel des idéaux originels : de 1990 à 2000, le pourcentage de salariés dans les kibboutzim est passé de 30 à 67 %<ref name=Kib1>{{en}} {{lien web|url=http://www.jewishvirtuallibrary.org/jsource/Society_&_Culture/kibbutz.html|site=Jewish Virtual Encyclopedia|titre=The Kibbutz|date=2007}}</ref> et deux tiers des kibboutzim ont maintenant à leur tête des professionnels et non des membres du kibboutz<ref name=Kib2/>.     
}}

Any idea ? --NicoV (Talk on frwiki) 18:40, 19 April 2018 (UTC)[reply]

@NicoV: It is the Encadré template at the end of the section right before === Les villes de développement === --Bamyers99 (talk) 19:07, 19 April 2018 (UTC)[reply]
Are you sure? If I ask CW, it says that the problem is at index 8792, which is the beginning of the Encadré template right before === Les kibboutzim ===. And the template before === Les villes de développement === seems ok too. --NicoV (Talk on frwiki) 19:27, 19 April 2018 (UTC)[reply]
{{Encadré|titre= |alignement=gauche|largeur=1000 px|contenu=
=== Les villes de développement ===
L'[[Histoire des Juifs en Irak|antisémitisme en Irak]], l'activisme sioniste, la [[crise de Suez]] et la décolonisation française en [[Afrique du Nord]] provoque une immigration massive de Juifs en provenance d'Irak, du [[Histoire des Juifs au Yémen|Yémen]], d'[[Histoire des Juifs en Égypte|Égypte]] et du [[Histoire des Juifs au Maroc|Maroc]] et d'autres pays arabes. Dans les années 1950 et 1960, Israël fonde plusieurs dizaines de villes dites de développement pour loger les Juifs venus principalement de pays arabes. Dès leur arrivée ou parfois après être passés dans un camp ou [[ma'abara]], ces réfugiés ont souvent été obligés de s'installer dans ces nouvelles villes n'offrant guère d'opportunités et installées à la périphérie d'Israël plutôt que de pouvoir choisir une grande ville qui aurait été plus accueillante. Cela répondait à un besoin stratégique d'Israël de développer ses régions désertiques et de protéger ses frontières mais cela a aussi contribué à créer une [[Juifs Mizrahim#Les Mizrahim dans l'État d'Israël|société défavorisée de Juifs dits orientaux]] (bien que le Maroc soit plutôt à l'[[occident]]).
{{Article détaillé|Ville de développement|Ma'abarot}}
}}
@NicoV: Saw it with my own two eyes. There are actually 2 unterminated {{Encadré|titre= |alignement=gauche|largeur=1000 px|contenu= in both snippets above. They are both missing a }} at the end. --Bamyers99 (talk) 19:40, 19 April 2018 (UTC)[reply]
@NicoV: Oh, now I see. The heading is inside the template parameter contenu --Bamyers99 (talk) 19:45, 19 April 2018 (UTC)[reply]
Yes, "encadré" means "frame", they put a whole section inside a frame. --NicoV (Talk on frwiki) 19:49, 19 April 2018 (UTC)

@Bamyers99: Ok, I found it. It's much later in the article, in === Les réfugiés africains ===. The problem is that checkarticle.cgi was returning an incorrect index, it was the first one matching the "notice", but not the one where the problem was detected. --NicoV (Talk on frwiki) 19:36, 19 April 2018 (UTC)[reply]

@NicoV: After further research, this was indeed a false positive. The program was supposed to be looking for closing }}, but instead was looking for }. That bug has been fixed. Regarding the "incorrect index" issue, the generic error printing code is calculating that index using a simplistic string search. Changing it to always report the correct index would require changes to each error detecting code block. I feel that that would be more work than the actual benefit. --Bamyers99 (talk) 02:04, 20 April 2018 (UTC)

@Bamyers99: I don't know because it detected a single } which was not necessary. This happens a lot with #43. Usually, that's the main difference of detection between CW and WPCleaner on this one, the one where I have to search manually. I was thinking about modifying WPCleaner so that it would also detect it. --NicoV (Talk on frwiki) 06:58, 20 April 2018 (UTC)[reply]

Error #94: false positives?

Hi, I can't find the problem for #94 in the following articles:

Any idea? --NicoV (Talk on frwiki) 13:12, 22 April 2018 (UTC)[reply]

@NicoV: False positives caused by a < or > in a reference name. The bug has been fixed. --Bamyers99 (talk) 19:29, 22 April 2018 (UTC)[reply]

#43 false positives

Hi. I believe the code of #43 and #47 should be changed to ignore <chem>...</chem>, so the code <chem>{Mn^{+2}(aq)}+{2H2O(l)}->{{Mn_2O}(s)}+{4H^+}(aq)</chem> will not be marked. Thank you. IKhitron (talk) 18:09, 4 May 2018 (UTC)[reply]

@IKhitron: The program has been updated to ignore the contents of the <chem> tag for all error checks. The program should have been updated a couple of years ago when <chem> replaced <ce>. --Bamyers99 (talk) 19:06, 4 May 2018 (UTC)[reply]
@Bamyers99: Thank you. But it's new. See this until midnight. IKhitron (talk) 19:09, 4 May 2018 (UTC)[reply]
@IKhitron: I just updated the program right before I posted my previous message. When I said "should have been updated", I meant that somebody should have updated the program a couple of years ago, but nobody did. --Bamyers99 (talk) 19:23, 4 May 2018 (UTC)[reply]
I see. Very well, thank you. IKhitron (talk) 19:25, 4 May 2018 (UTC)[reply]
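
Editorial note: one way to ignore the contents of tags such as `<chem>` before running brace checks is to blank them out while keeping the text length unchanged, so reported character positions still line up. The following Python sketch is only an illustration under stated assumptions: the tag list, the function names, and the simple count-based brace check are all hypothetical and are not taken from the Checkwiki source.

```python
import re

# Assumed list of tags whose contents should not be checked.
IGNORED_TAGS = ("chem", "math", "nowiki")

_IGNORED_RE = re.compile(
    r"<(%s)\b[^>]*>.*?</\1\s*>" % "|".join(IGNORED_TAGS),
    re.DOTALL | re.IGNORECASE,
)


def mask_ignored_tags(text: str) -> str:
    """Replace each ignored tag and its contents with spaces.

    The length of the text is preserved, so indexes computed on the
    masked text are valid for the original text as well.
    """
    return _IGNORED_RE.sub(lambda m: " " * len(m.group(0)), text)


def has_unbalanced_braces(text: str) -> bool:
    """Crude stand-in for checks #43/#47: compare {{ and }} counts."""
    masked = mask_ignored_tags(text)
    return masked.count("{{") != masked.count("}}")
```

Applied to the formula quoted above, the `{{` inside `<chem>` no longer counts as the start of a template, while a genuinely unclosed template outside the tag is still reported.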

Parser change next month, and some wikis are not ready

If you are interested in other Wikipedias, then please see https://quarry.wmflabs.org/query/26474 for a "scoreboard" of how your favorite large wikis are doing. A few notes:

  • The Polish and Portuguese Wikipedias each have 5,000+ articles with unclosed tags. Do I remember correctly that this is actually pretty easy to fix with (semi-)automated tools?
  • The Spanish Wikipedia has 20,000+ articles with misnested tags, the Ukrainian Wikipedia has 16,000+ of them, and the Portuguese and Chinese Wikipedias have 10,000+. When the numbers are this high, I think it's reasonable to first check for problems in widely used templates (e.g., a template using span tags that gets multiple paragraphs of content).

Please help out by fixing what you can and by telling folks that you know at affected wikis. There is more information at https://lists.wikimedia.org/pipermail/wikitech-ambassadors/2018-April/001836.html Thanks, Whatamidoing (WMF) (talk) 19:02, 9 May 2018 (UTC)[reply]

Help deprecating Template:Tooltip

I recently closed an RfD, the result of which was to deprecate {{tooltip}}. Per Wikipedia:Manual of Style/Accessibility#Text, tooltips should be avoided except for the use of abbreviations. In short, every current use of tooltip (many are in templates) should be either:

  1. Converted to {{abbr}} if it is an abbreviation; OR
  2. Removed, whether by restructuring or converting to a footnote or whatever

This will obviously require some human effort, so while I imagine AWB will be helpful, I wanted to post here as well in case anyone was interested. It seems within the wheelhouse of the project. ~ Amory (utc) 17:52, 19 May 2018 (UTC)[reply]

Double wording

Hi. I do not know how active is the coding part of the project. Is there a way to detect repeated repeated words? Thank you. IKhitron (talk) 10:49, 28 May 2018 (UTC)[reply]

The coding part would be simple, I think. I do this for my Wikipedia (if you want, I can generate such an on-wiki list for hewiki). One thing to remember: there will be taxon-related false positives, a lot of them. --Edgars2007 (talk/contribs) 13:44, 28 May 2018 (UTC)
Thank you. Is this something that I can run by myself? I do not want to bother you with two wikis. IKhitron (talk) 14:03, 28 May 2018 (UTC)[reply]
Yeah, you could use AWB database scanner for that. But I would be running the scan on Toolforge, so two wikis isn't a big problem (the biggest problem is to remember that I have to do the scan, because currently I do that manually). For Latvian I use such regex: \s(([A-Za-zĀČĒĢĪĶĻŅŠŪŽāčēģīķļņšūž]{3,})\s+\2)\s, so I would need a Hebrew version of this regex. ĀČĒĢĪĶĻŅŠŪŽāčēģīķļņšūž are "special" Latvian letters that don't fit into some nice range. Of course, the list would be far from perfect, but to have something... --Edgars2007 (talk/contribs) 16:43, 28 May 2018 (UTC)
Thank you, but it's a misunderstanding. I'd like to run it myself for two wikis in two languages, hewiki and ruwiki, which is why I do not want to cause you problems. IKhitron (talk) 16:48, 28 May 2018 (UTC)
For Russian, I used \s([а-яё]+)\s+\1\s with IGNORECASE flag. This also can detect cases like "Under under the tree". Also it's bad to ignore short words, most errors in ruwiki are with the word "в" (Russian for "in"). And there will be lots of false positives, so I don't think that's a good idea to detect this kind of error with checkwiki. For example, J. J. Abrams is "Джей Джей Абрамс" in Russian, so there are more than 100 of double-Джей in articles. Facenapalm (talk) 23:31, 28 May 2018 (UTC)
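
Editorial note: the regexes discussed above can be generalized into a language-independent sketch in Python. This is a hedged illustration, not something used by Checkwiki or by either commenter; the names `REPEATED_WORD` and `find_repeated_words` are hypothetical. It relies on Python's Unicode-aware `\w` so that no per-language letter list is needed, and, as noted in the thread, it will still produce false positives such as legitimately doubled names.

```python
import re

# [^\W\d_] matches any Unicode letter (a word character that is
# neither a digit nor an underscore). \1 must repeat the same word,
# and IGNORECASE lets "Under under" match as well.
REPEATED_WORD = re.compile(r"\b([^\W\d_]+)\s+\1\b", re.IGNORECASE)


def find_repeated_words(text: str) -> list:
    """Return the first word of every immediately repeated word pair."""
    return [m.group(1) for m in REPEATED_WORD.finditer(text)]
```

Unlike the two regexes quoted above, this variant uses word boundaries instead of surrounding whitespace, so a repetition followed by punctuation is also found, and one-letter words such as "в" are not skipped.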

Replacing macros within math tags.

Hi everyone,

We are currently trying to improve the math support and would need help replacing some macros which create serious compatibility issues for LaTeX-based external programs reading the wikitext and block further improvements to the math support, see phab:T195861. Since I don't have any experience with automated editing it would be great if you could help us:

--Debenben (talk) 21:49, 8 June 2018 (UTC)[reply]

As you can see the lists I created have been deleted as out of scope for MediaWiki-wiki. I am happy to post them here, in the German Wikipedia, on Meta or send them in a format you want.--Debenben (talk) 13:28, 9 June 2018 (UTC)[reply]

Unicode control characters

There has been at least one bot, Josvebot, that has come along and cleaned up articles that I have worked on and removed Unicode control characters. It's not happening as often as it used to, but I would like to stop doing whatever I am doing so that someone doesn't need to clean up after me.

Could this be happening if I copy and modify categories from another article?

Or, if I search for nested subcategories, find the right one, and copy and paste the name of the subcategory into HotCat?

Or, something else?

Thanks so much!!!–CaroleHenson (talk) 13:36, 7 July 2018 (UTC)[reply]

If you copy-paste something, there is a likelihood that such characters are included. I'm not sure why HotCat is so prevalent in causing these, but I'm guessing that the UI of HotCat includes such characters in the category names when displayed so as not to break the names into multiple rows (or something like that). See e.g. User talk:Josvebot#Unicode control characters as well. (tJosve05a (c) 14:53, 7 July 2018 (UTC)
Ok, thanks, Josve05a. I'll stop copy-pasting.–CaroleHenson (talk) 15:01, 7 July 2018 (UTC)[reply]

Hewiki dump

Hi. Just to be sure. You do not need to fix something in the code? It was published that hewiki is a big wiki now, for dumping purposes. IKhitron (talk) 12:59, 25 July 2018 (UTC)[reply]

@IKhitron: Fix not needed. --Bamyers99 (talk) 01:40, 26 July 2018 (UTC)[reply]
Thank you. IKhitron (talk) 10:37, 26 July 2018 (UTC)[reply]

Interface for wikis without translations

@Bamyers99:: if you see the web interface for simplewiki, some of the errors, like #113 and #110, don't match en.wiki, and the description is empty. Both should be retrieved from en.wiki. For other wikis that don't have a translation page or don't have all the errors translated yet, at least the titles should be retrieved from en.wiki. Thanks. --Usgix (talk) 22:15, 1 August 2018 (UTC)[reply]

@Usgix: Checkwiki has been pointed to the existing simplewiki translation page. Programmatically fixing incomplete configurations is outside the scope of my involvement with Checkwiki, which is bug fixes and keeping it running. --Bamyers99 (talk) 12:59, 4 August 2018 (UTC)[reply]

WPCleaner : new installation procedure

Hi, I've finally managed to resume releases on WPCleaner, but with a change of release process, see announcement. In addition, this version includes a lot of additions to help fixing Special:LintErrors. I'm looking for testers for this new release procedure. --NicoV (Talk on frwiki) 16:48, 4 August 2018 (UTC)[reply]

Adding Wikivoyage in French

Hello. I'm trying to add Wikivoyage in French on WPCleaner. Apparently, it is not managed by Check Wiki and I need to ask here to add it. So here I am. Thanks. — Antimuonium U wanna talk? 20:12, 4 August 2018 (UTC)[reply]

@Antimuonium: Here ya go: frwikivoyage Configuration is at: Correction_syntaxique/Traduction --Bamyers99 (talk) 03:27, 5 August 2018 (UTC)[reply]
@Bamyers99: Thank you! — Antimuonium U wanna talk? 06:51, 5 August 2018 (UTC)[reply]

Didn't check new

Moin Moin everyone, for the German Wikipedia there were no new entries, so I think the job didn't run. Could anybody check and fix this? Regards --Crazy1880 (talk) 04:49, 16 August 2018 (UTC)

@Crazy1880: Thanks for reporting. Caused by a bug that got introduced while coding support for mw:Requests for comment/Multi-Content Revisions. The bug has been fixed and the jobs have been re-run. --Bamyers99 (talk) 17:50, 16 August 2018 (UTC)

No recent entries for Simple English Wikipedia

Hello. There seem to be no new entries for simplewiki since 2018-08-03. Is there something we need to do? --Auntof6 (talk) 07:15, 17 August 2018 (UTC)[reply]

@Auntof6: simplewiki is only updated twice a month, a couple of days after the 1st and the 20th. --Bamyers99 (talk) 19:12, 17 August 2018 (UTC)[reply]
OK, thanks. I thought I'd seen it updated more often, or on the same schedule as others. I'll let our folks know. --Auntof6 (talk) 20:17, 17 August 2018 (UTC)[reply]

Structural errors

Hi, would it be possible for someone to compile a sub-list of CheckWiki errors which represent "structural" errors (with respect to the underlying generated HTML), and thus are more likely to cause Lint error detections? Thanks. ShakespeareFan00 (talk) 07:09, 9 September 2018 (UTC)

Error 3 False positives

{{ref list}} isn't being recognized. Dan Thomas (sportscaster) is an example. Jerod Lycett (talk) 02:22, 21 September 2018 (UTC)[reply]

{{RE}} is another one. Jerod Lycett (talk) 03:19, 21 September 2018 (UTC)[reply]

{{reference}} is yet another one. Jerod Lycett (talk) 03:31, 21 September 2018 (UTC)[reply]

@Jerodlycett: These 3 templates have now been added to the list of reference list templates. --Bamyers99 (talk) 18:36, 21 September 2018 (UTC)[reply]

Code template needs whitelisting

Here on en the page HTML has several errors that are caused by things in the code template and are meant to be that way. Jerod Lycett (talk) 17:12, 20 September 2018 (UTC)[reply]

@Jerodlycett: The contents of the {{code}} template and its redirect {{inline syntax}} are now excluded from error checks. --Bamyers99 (talk) 22:18, 21 September 2018 (UTC)

A script to download page titles

Hi people, I wrote a script: hu:user:BinBot/checkwiki.py It works under Python 2 and Python 3. It will download the title list for a given error id to a file. I see 3 ways a bot can use it:

  • Get the titles from the file (-file in Pywikibot).
  • Upload the contents of the file to a wikipage, and get the titles from there (-links in Pywikibot).
  • Use it as a pagegenerator (better to say a title generator) in a Python bot, such as Pywikibot. Needs some knowledge of programming.

The first two do not require Pywikibot, the script is pure Python and the list may be used by any bot. Bináris (talk) 07:37, 27 September 2018 (UTC)[reply]

False positives for #28

https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=huwiki&view=only&id=28
https://hu.wikipedia.org/w/index.php?title=1966%E2%80%931967-es%20nyugatn%C3%A9met%20labdar%C3%BAg%C3%B3-bajnoks%C3%A1g%20(els%C5%91%20oszt%C3%A1ly)&action=edit
https://hu.wikipedia.org/w/index.php?title=Sablon:Fb_cl_footer&action=edit

It is tricky and hard to solve. A lot of sport tables consist of rows which come from templates. Under the last row there is a footer template (3rd link) which begins with |}. This is not a nice solution, but works. That's why we have 508 todos in the list, but I think, 500 of them are of this type. Bináris (talk) 15:17, 27 September 2018 (UTC)[reply]

@Bináris: I have added {{Fb cl footer to the huwiki Translation page to handle this situation. For ideas on other templates that huwiki may be using in the same way, look at the enwiki Translation page and search for error_028_templates. --Bamyers99 (talk) 18:01, 27 September 2018 (UTC)[reply]
Thank you. I didn't know about this possibility. Bináris (talk) 18:12, 27 September 2018 (UTC)

Colon not found (#57)

The colon is not detected when it is not literally the last character of a section title, even though it ends the title semantically.

  • The colon is followed by bolding. After correcting error #44 (bold title), I went through the list of #44 with the fix of #57. It is still unrecognized when title ends with <small>.
  • Here I removed <u> tags together with bolding, but two apostrophes for italic legally remained. The colon was detected by eye, it is not in the list of #44. I modified the fix to disregard trailing apostrophes when looking for the colon (see the next edit).

You may say this is not in the scope of check, I just mentioned as a possible enhancement. Bináris (talk) 18:20, 27 September 2018 (UTC)[reply]

False positives for #90

Should be a list of DEFAULTSORT errors, but seems to belong to another error (a lot of file URLs pointing to the home wiki). Bináris (talk) 13:12, 28 September 2018 (UTC)[reply]

There is some problem in your translation page. #90 detects wikilinks in http protocols format. IKhitron (talk) 13:17, 28 September 2018 (UTC)[reply]
Thx, I will check it. The original English text is the same there. Bináris (talk) 13:47, 28 September 2018 (UTC)[reply]
 error_090_prio_enwiki=2 END
 error_090_head_enwiki=Internal link written as an external link END
 error_090_whitelistpage_enwiki=Wikipedia:WikiProject_Check_Wikipedia/Error_090_whitelist END
 error_090_desc_enwiki=The script finds an external link that should be replaced with a wikilink.  An example would be on enwiki [http://en.wikipedia.org/wiki/Larry_Wall Larry Wall] should be written as [[Larry Wall]]. Script also finds references that use Wikipedia as a source.<br>
<br>
Following tools can correct the problem:
<ul>
<li><a href="https://meta.wikimedia.org/wiki/User:TMg/autoFormatter">Auto-Formatter</a></li>
</ul> END
IKhitron (talk) 13:51, 28 September 2018 (UTC)[reply]
@Bináris: #89, #90, #91, etc. were changed back in December 2013 per these edits. The official error list is here. --Bamyers99 (talk) 15:45, 28 September 2018 (UTC)[reply]

Then our translation page is simply outdated. Badly, badly outdated. Thank you! Bináris (talk) 07:22, 29 September 2018 (UTC)[reply]

Math tags and false positives for #43 and #47

Hi, I suggest adapting the code for #43 and #47 so that double braces between math tags are not treated as the beginning or end of a template. Currently, checkwiki reports false positives for code such as

<math>v_s^2=c^2\Xi\Leftrightarrow{v_s=c\sqrt{\Xi}}</math>
(correct code for the formula v_s² = c²Ξ ⇔ v_s = c√Ξ)

LeFit (talk) 15:40, 3 October 2018 (UTC)[reply]

@LeFit: Thanks for reporting. This was caused by a bug that I introduced while implementing the feature to ignore the contents of specific templates (ie. {{code}}). I have just fixed this. --Bamyers99 (talk) 22:04, 3 October 2018 (UTC)[reply]
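As a rough illustration of the idea discussed in this thread, the sketch below removes the contents of math tags before looking for unbalanced template braces. The function name and the simple brace count are assumptions chosen for brevity; the real scanner's logic for #43 and #47 is not shown in this discussion and is certainly more involved.

```python
import re

# Hypothetical: drop <math>...</math> spans so LaTeX braces are not counted.
_MATH = re.compile(r"<math\b[^>]*>.*?</math\s*>", re.IGNORECASE | re.DOTALL)


def unbalanced_template_braces(wikitext: str) -> bool:
    """Return True if '{{' and '}}' counts differ outside math tags."""
    text = _MATH.sub("", wikitext)
    return text.count("{{") != text.count("}}")
```

The formula reported above ends in `}}` without any `{{`, so a count over the raw text would flag it; once the math span is removed, the page is no longer reported.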

#22 false positives

Hi. The list [9] has 62 problematic articles and 963 false positives, hasn't it? IKhitron (talk) 13:20, 14 September 2018 (UTC)[reply]

@IKhitron: Error #22 (Category with space) did not support right-to-left text. This has been fixed. The list will get updated after the Sept. 20th database dump. --Bamyers99 (talk) 20:25, 14 September 2018 (UTC)[reply]
Thanks a lot! IKhitron (talk) 20:50, 14 September 2018 (UTC)[reply]
Hello again, Bamyers99. It's much better now. There were 24 articles. I fixed 9, but another 15 still look like false positives. Thank you. IKhitron (talk) 18:22, 24 September 2018 (UTC)[reply]
@IKhitron: I misdiagnosed the problem as a right-to-left issue. It was really the program not handling colons (:) in the category name properly. I undid the bad rtl fix and fixed the colon issue. --Bamyers99 (talk) 22:16, 24 September 2018 (UTC)[reply]
I see. Thank you again. IKhitron (talk) 22:57, 24 September 2018 (UTC)[reply]
Hello again, Bamyers99. There are many false positives now. IKhitron (talk) 12:08, 4 October 2018 (UTC)[reply]
@IKhitron: I don't see any false positives. This check looks for 4 things: 1) space before the ]]. 2) space after the [[. 3) space before the |. 4) space on either side of the first :. --Bamyers99 (talk) 13:44, 4 October 2018 (UTC)[reply]
@Bamyers99: Did not know that. So what about [[קטגוריה:הקונסרבטוריון למוזיקה של ניו אינגלנד|*]]? IKhitron (talk) 13:53, 4 October 2018 (UTC)[reply]
@IKhitron: In that case there is a newline before the ]]. A space sometimes means whitespace which includes space, tab, newline. --Bamyers99 (talk) 18:41, 4 October 2018 (UTC)[reply]
I see. Thank you very much. IKhitron (talk) 18:42, 4 October 2018 (UTC)[reply]
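The four conditions listed in this thread can be sketched as follows, reading "space" as any whitespace (space, tab, newline) as explained above. The function name and the way the link is split are assumptions; this is an illustration of the rules as stated here, not the scanner's code.

```python
import re


def category_has_stray_whitespace(link: str) -> bool:
    """Apply the four #22 rules described above to one [[...]] category link."""
    match = re.fullmatch(r"\[\[(.*?)\]\]", link, re.DOTALL)
    if not match:
        return False
    inner = match.group(1)
    target, pipe, _sortkey = inner.partition("|")
    namespace, colon, name = target.partition(":")  # first colon only

    return (
        inner[-1:].isspace()                      # 1) whitespace before ]]
        or inner[:1].isspace()                    # 2) whitespace after [[
        or (pipe != "" and target[-1:].isspace())  # 3) whitespace before |
        or (colon != "" and (namespace[-1:].isspace() or name[:1].isspace()))  # 4)
    )
```

A link whose sort key is followed by a newline before the closing brackets, as in the example asked about above, is reported by rule 1.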

A suggestion for a new bug

Hi. How about a bug with [[<Some date>|<Another date>]]? Thank you. IKhitron (talk) 17:57, 11 October 2018 (UTC)[reply]

Hi IKhitron. WPCleaner can detect this as error #526, you can activate it on Wikipedia:WikiProject Check Wikipedia/Translation, and I can include this error in the dump analysis I'm doing twice a month (like fr:Projet:Correction syntaxique/Analyse 526) if you're interested. On frwiki, it's activated and configured to also use an abuse filter (#241) and to add a template after it to show that the link needs fixing. --NicoV (Talk on frwiki) 18:02, 11 October 2018 (UTC)[reply]
A pity. I can't use WPCleaner, so I asked about the checkwiki. This analysis exists for frwiki only? Thank you. IKhitron (talk) 18:05, 11 October 2018 (UTC)[reply]
Hi IKhitron. I'm already doing a dump analysis for enwiki twice a month, but I haven't included #526 in it. I can if you're interested in working on it. --NicoV (Talk on frwiki) 18:10, 11 October 2018 (UTC)[reply]
I see. But I was talking about two other wikisites. As I understand, there are no such things over there. IKhitron (talk) 18:11, 11 October 2018 (UTC)[reply]
IKhitron, why can't you use WPCleaner? I can always include other wikisites if needed, but probably not twice a month. --NicoV (Talk on frwiki) 18:14, 11 October 2018 (UTC)[reply]
I tried once to install it, and it did not work. It does work now, but I can't find in the manual how to display a specific error above 120. IKhitron (talk) 18:25, 11 October 2018 (UTC)[reply]
Ok. There are no lists above #500, because they are not managed by CW. You can activate errors above #500 in the CW translation page, but you won't have lists. But you can generate lists with WPCleaner by analyzing a dump file. I can guide you if you're interested in doing so for another wiki. --NicoV (Talk on frwiki) 18:54, 11 October 2018 (UTC)[reply]
I see. I'll be glad if you can do it. Thank you. IKhitron (talk) 18:55, 11 October 2018 (UTC)[reply]
IKhitron, for which wiki do you want to use it? --NicoV (Talk on frwiki) 19:01, 11 October 2018 (UTC)[reply]
IKhitron, steps for activating error #526 and generating a list:
Need to go, will continue on the explanation on how to run the analysis later... --NicoV (Talk on frwiki) 19:34, 11 October 2018 (UTC)[reply]
IKhitron, next steps to run the analysis (I suggest running it from the command line: it's a bit more work to set up at first, but after that it's just one command to run):
  • Create a task file with what you want to do: take a look for example at enwiki task for updating the analysis for a bunch of errors (1, 2, 3..., 111), and then listing the problems with ISBN or ISSN. I suggest for the moment a 1 line file with something like:
    ListCheckWiki [Path]\[XX]wiki-$-pages-articles.xml.bz2 wiki:Wikipedia:CHECKWIKI/WPC_{0}_dump 526
    hewiki: for example he_ListCheckWiki.txt: ListCheckWiki [Path]\hewiki-$-pages-articles.xml.bz2 wiki:ויקיפדיה:Check_Wikipedia/WPC_{0}_dump 526
    ruwiki: for example ru_ListCheckWiki.txt: ListCheckWiki [Path]\ruwiki-$-pages-articles.xml.bz2 wiki:Проект:Check Wikipedia/WPC_{0}_dump 526
  • Create a credentials.txt file in the same folder as WPCleaner with your username and password:
    username=IKhitron
    password=...
  • Open a command line prompt in the folder where WPCleaner is installed and run a command like Bot.bat -credentials credentials.txt [XX] DoTasks [TaskFile] (if on Windows): WPCleaner will start in the background and perform the analysis on XX wiki.
    hewiki: for example Bot.bat -credentials credentials.txt he DoTasks he_ListCheckWiki.txt
    ruwiki: for example Bot.bat -credentials credentials.txt ru DoTasks ru_ListCheckWiki.txt
--NicoV (Talk on frwiki) 09:55, 15 October 2018 (UTC)[reply]
Thank you for your help and for your time, NicoV. I'll read it carefully. I'm talking about hewiki and ruwiki. About the filter - I do not understand, how can it locate the local months names? IKhitron (talk) 16:31, 15 October 2018 (UTC)[reply]
IKhitron Argh, I misread your first post: #526 will only work for years, not full dates (too complex to parse reliably), but it's a start: see fr:Projet:Correction syntaxique/Analyse 526 for an example of what can be detected with this error. The filter on frwiki is even simpler: it only detects simple cases where the link is 3 or 4 digits and the text is also 3 or 4 digits but different. --NicoV (Talk on frwiki) 16:40, 15 October 2018 (UTC)[reply]
IKhitron Don't hesitate if you need help. I've added more instructions based on the wikis you're working on. --NicoV (Talk on frwiki) 16:53, 15 October 2018 (UTC)[reply]
Much better. Thank you very much. IKhitron (talk) 17:30, 15 October 2018 (UTC)[reply]
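The simple case handled by the frwiki filter, as described above (link target of 3 or 4 digits, link text of 3 or 4 digits, the two being different), could be approximated like this. The pattern and function name are assumptions for illustration and are not taken from the actual abuse filter or from WPCleaner.

```python
import re

# Hypothetical pattern: [[<3-4 digits>|<3-4 digits>]]
_YEAR_LINK = re.compile(r"\[\[(\d{3,4})\|(\d{3,4})\]\]")


def mismatched_year_links(wikitext: str) -> list[str]:
    """Return piped year links whose target and text are different numbers."""
    return [
        m.group(0)
        for m in _YEAR_LINK.finditer(wikitext)
        if m.group(1) != m.group(2)
    ]
```

Full dates are not covered, in line with the remark above that they are too complex to parse reliably.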

Thanks

Just wanted to show my appreciation of this tool. Very handy. --Palosirkka (talk) 13:18, 26 October 2018 (UTC)[reply]

Page title character immediately following a colon gets capitalized on page listing for #81 (and possibly other ID)

In the Article column of the list of faulty pages for svwiki ID: 81, the character immediately following a colon in page titles incorrectly gets capitalized. Examples of such incorrect page titles are

  • Gustav III:S kröning, it should be Gustav III:s kröning
  • Karl XI:S kröning, it should be Karl XI:s kröning
  • Karl XIII:S staty, it should be Karl XIII:s staty
  • Karl XV:S staty, it should be Karl XV:s staty
  • KFUK-KFUM:S studieförbund, it should be KFUK-KFUM:s studieförbund

The effect is that the edit link on the listing page doesn't work. The fault also propagates to the List for bots. --Larske (talk) 09:56, 27 October 2018 (UTC)[reply]

@Larske: This bug has just been fixed. Removed some unnecessary code that was attempting to capitalize the first character after a namespace. Not needed since checkwiki only looks at article (main) namespace. The next dump scan will have correct article titles. --Bamyers99 (talk) 20:02, 27 October 2018 (UTC)[reply]
👍 Like --Larske (talk) 03:50, 28 October 2018 (UTC)[reply]
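To make the reported symptom concrete, the hypothetical function below reproduces the kind of transformation described in this thread: upper-casing the first character after the first colon, as one might do after a namespace prefix. It is only a model of the observed behaviour, not the code that was removed.

```python
def capitalize_after_colon(title: str) -> str:
    """Upper-case the character right after the first colon (models the bug)."""
    head, sep, tail = title.partition(":")
    if not sep:
        return title  # no colon, nothing to change
    return head + sep + tail[:1].upper() + tail[1:]
```

Applied to an article title such as "Gustav III:s kröning", this yields "Gustav III:S kröning", which no longer matches the real page title and therefore breaks the edit link.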

False positives for #64 (Link equal to linktext)

Links with link texts ending with "special characters" are incorrectly reported as faulty with ID:64 (Link equal to linktext). See this list.

Here are the links in wikitext:

[[Malmö|Malmö Ö]]
[[Â|ÂÂ]]
[[Drottninghög|Drottninghög Ö]]
[[Ö|ÖÖ]]
[[Mölndal|Mölndal Ö]]
[[Â|ÂÂ]]
[[Â|Â,Â]]
[[Â|ÂÂ]]
[[Ö|ÖÖ]]
[[Malmö|Malmö Ö]]

--Larske (talk) 14:24, 29 October 2018 (UTC)[reply]

@Larske: This bug has just been fixed. There was a problem with the dump scanner not handling Unicode characters properly. --Bamyers99 (talk) 18:48, 30 October 2018 (UTC)[reply]

Priority

I originally thought that the priority (e.g. on this page) somehow referred to how important the affected articles are instead of the severity of the found error. Maybe reword that? It would also be nice to be able to search for errors in high-value articles. --Palosirkka (talk) 10:42, 31 October 2018 (UTC)[reply]

False positive with pre tags

Hi! Using CW I found a tiny bug. I'd like to send a pull request directly on github, but since I'm not really familiar with python I'm just reporting here. The regex used to find unclosed pre tags ("Pre tag without correct match") is flaky, and instead of /<pre/ it should be something like /<pre[ >]/. Right now, false positives may happen for instance with a preview tag (see here). Although they're less error prone, you may want to do the same for every other tag, just to be sure. Thanks, --Daimona Eaytoy (Talk) 18:15, 5 November 2018 (UTC)[reply]

@Daimona Eaytoy: This has been fixed for all HTML tags. --Bamyers99 (talk) 23:20, 7 November 2018 (UTC)[reply]
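The difference between the two patterns mentioned in this thread can be seen in a short sketch. The function names are made up for illustration, and the stricter pattern is the one suggested above ("something like"), not necessarily the one that was finally deployed.

```python
import re

_NAIVE = re.compile(r"<pre")        # also matches <preview>, <pretty>, ...
_STRICT = re.compile(r"<pre[ >]")   # requires a space or '>' after the tag name


def has_pre_tag_naive(wikitext: str) -> bool:
    return _NAIVE.search(wikitext) is not None


def has_pre_tag_strict(wikitext: str) -> bool:
    return _STRICT.search(wikitext) is not None
```

Text containing only `<preview>` is treated as a pre tag by the first function but not by the second, while `<pre>` and `<pre style="...">` are recognized by both. Note that the stricter pattern as written would not match a tag name followed directly by a newline or a slash.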

Newer dump for German Wikipedia

Moin Moin together, the last dump from German Wikipedia is from 1 September 2015. Is it possible to get a newer version? I'm well aware that the dump is large. Regards --Crazy1880 (talk)

Moin Magioladitis, sorry for writing to you directly, but could you do this? Or could you tell me, how I could do that? Regards --Crazy1880 (talk) 18:31, 17 October 2018 (UTC)[reply]
Or Bamyers99 could you help? Regards --Crazy1880 (talk) 19:14, 6 November 2018 (UTC)[reply]
@Crazy1880: This has been run. It took over 22 hours to run. I am not going to add it to the dump scan list. dewiki gets daily checkwiki updates of edited articles. Since the dump was as of the 1st, it is going to report again some that have been fixed between the 1st and 7th. --Bamyers99 (talk) 23:31, 7 November 2018 (UTC)[reply]
@Bamyers99: morning, that looks great. Yes, Bgwhite and I discussed this topic back then as well. Even then we left it at the daily updates, because a new dump was so big and took a lot of time. So big thanks, now it is newer and we can handle the daily scans better. Kind regards --Crazy1880 (talk) 05:52, 8 November 2018 (UTC)[reply]

Pseudo headings

Is it possible to watch for pseudo headings that abuse the semicolon markup instead of bold text? The definition-list tag is difficult for people using screen readers. --Janui (talk) 10:48, 13 November 2018 (UTC)[reply]

#2 Closing line-break markup

Chiswick Chap told me on my talk page that the invalid tag </br> should not be fixed. If this invalid tag is necessary because of some bug in the Wikimedia software, you should not list it as Tag with incorrect syntax. If you think this tag should be fixed, could you explain to Chiswick Chap a way to handle his wiki-markup problem? --GünniX (talk) 10:45, 13 November 2018 (UTC)[reply]

@GünniX and Chiswick Chap: I don't know what editor is showing purple text. I checked the VisualEditor with the Jellyfish#Life history and behavior image caption, no purple text there. The HTML standard does not allow a closing tag for <br>. In the Jellyfish article image caption, I noticed a <br> that was missing the trailing slash <br/>. Maybe that is causing the problem with the editor. --Bamyers99 (talk) 15:17, 13 November 2018 (UTC)[reply]

fr.wiktionary could benefit from this tool

 Resolved

Hi, could fr.wiktionary be added to the list of projects supported by this tool? Best, --Automatik (talk) 13:14, 20 November 2018 (UTC)[reply]

@Automatik: frwiktionary is now supported. Configuration is at Correction_syntaxique/Traduction. --Bamyers99 (talk) 02:26, 21 November 2018 (UTC)[reply]
@Automatik: And I've configured WPCleaner to use it if you're interested. --NicoV (Talk on frwiki) 07:06, 21 November 2018 (UTC)[reply]
Thanks to both of you! I'll check WPCleaner to see if it can help. --Automatik (talk) 15:01, 21 November 2018 (UTC)[reply]
@Bamyers99: there is a bug, see the first entry in [10], which does not have an article name. --Automatik (talk) 17:12, 21 November 2018 (UTC)[reply]
@Automatik: This has been fixed. The title (𡥵) is a 4 byte Unicode character. The database only supported 3 byte Unicode characters. I have upgraded the database to support 4 bytes. --Bamyers99 (talk) 03:14, 22 November 2018 (UTC)[reply]
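The byte counts mentioned in the reply above can be checked directly. The sketch below assumes UTF-8 storage; the helper names are hypothetical, and the discussion does not say which database engine or character set setting was involved.

```python
def utf8_length(char: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(char.encode("utf-8"))


def fits_in_three_bytes(title: str) -> bool:
    """True if every character of the title needs at most 3 UTF-8 bytes."""
    return all(utf8_length(ch) <= 3 for ch in title)
```

A title such as "Malmö" passes this test, whereas the single-character title 𡥵 needs 4 bytes and would not fit in a column limited to 3-byte characters.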

Checkwiki dump scanner / webserver new server infrastructure

The checkwiki dump scanner and webserver need to be migrated to a new server infrastructure at Toolforge. The old infrastructure is running operating system software (Ubuntu Trusty 14.04) that will reach its end of life shortly (no more security updates). See wikitech:News/Toolforge Trusty deprecation for more details. The webserver has already been migrated (Tue Jan 15 02:00 UTC 2019). The dump scanners will be migrated after they are tested. --Bamyers99 (talk) 02:12, 15 January 2019 (UTC)[reply]

A small grammatical error

I don't know whether this is the right place to ask my question. If it is not, please tell me where I need to post this. Please visit the link [link 1]. On that page, I can see a link named "Set all articles as done!". Clicking that link leads to another page on which I can see around four main links. My question concerns the grammar of the line "No, I will back!". Should that link title be changed to "No, I will go back!"? Adithyak1997 (talk) 18:12, 22 January 2019 (UTC)[reply]

@Adithyak1997: This is correct grammar. In the phrase "I will go back!", the word back is an adverb of the verb go, meanwhile, in the phrase "I will back!", back is used as a verb, which is proper grammar (See wikt:back#Verb). A good example would be the phrase "I will back away!". Away acts as an adverb here, and so removing it from the phrase still results in the valid sentence, "I will back!". It isn't exactly intuitive though. Hecseur (talk) 11:50, 25 January 2019 (UTC)[reply]

False positives for #3

 Resolved

Please ignore those pages which have Jegyzetek, Források, Lábjegyzet or Hivatkozások template on the Hungarian Wikipedia. These pages have <references /> inside the template. Thanks! Bencemac (talk) 10:12, 10 January 2019 (UTC)[reply]

@Bencemac: The templates have been added here. --Bamyers99 (talk) 16:16, 10 January 2019 (UTC)[reply]
@Bamyers99: Thanks! I am going to check it when the next update arrives. Bencemac (talk) 18:16, 11 January 2019 (UTC)[reply]

It is working well, thanks! Bencemac (talk) 08:50, 9 February 2019 (UTC)[reply]

ID16 in German

Moin Moin together, the ID 16 "Unicode control characters" check in the German Check Wikipedia hasn't been working for some time. In the English scan it is running normally. Could anyone check this problem? Thanks --Crazy1880 (talk) 16:49, 12 February 2019 (UTC)[reply]

@Crazy1880: The scanner was only checking #16 on enwiki. I don't know who coded it that way or why. I have enabled #16 for all wikis. --Bamyers99 (talk) 23:02, 13 February 2019 (UTC)[reply]

Checkwiki suspended by toolforge admin

Checkwiki has been suspended by a toolforge admin per T216167. --Bamyers99 (talk) 23:26, 14 February 2019 (UTC)[reply]

Checkwiki has been re-enabled. --Bamyers99 (talk) 00:54, 19 February 2019 (UTC)[reply]

User-agent checking to reject bot requests

The User-agent string is now being checked so that bot requests (search engines, crawlers, etc) can be rejected to reduce the load on the WMFLabs servers.

The regex is (?:spider|bot[\s_+:,\.\;\/\\\-]|[\s_+:,\.\;\/\\\-]bot)

WPCleaner's User-agent will not get rejected.

If some other tool is getting rejected, I can add an exception to the regex. --Bamyers99 (talk) 01:42, 19 February 2019 (UTC)[reply]
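For tool authors who want to check their own User-agent string against the regex quoted above, here is a small sketch. The regex is copied from this post; the function name and the sample strings are illustrative, and applying it case-sensitively with a substring search is an assumption, since the post does not say how the server applies it.

```python
import re

# Regex as quoted above; case-sensitive substring search is an assumption.
BOT_UA = re.compile(r"(?:spider|bot[\s_+:,\.\;\/\\\-]|[\s_+:,\.\;\/\\\-]bot)")


def is_rejected(user_agent: str) -> bool:
    """Return True if the User-agent string matches the bot pattern."""
    return BOT_UA.search(user_agent) is not None
```

Under these assumptions, a string containing "Googlebot/2.1" or "Baiduspider" matches, and so does a name like "my-bot", whereas an ordinary browser string or a tool name such as the hypothetical "ExampleTool/1.0" does not. A word like "robots", where "bot" sits between letters, is not matched either.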

Fixed #47 (Template without correct beginning) false positive when #9 (Multiple categories on one line) was found

I have just fixed a bug that caused a false positive for #47 if #9 was found. It is a Perl regex modifier (/g) gotcha. --Bamyers99 (talk) 19:25, 19 February 2019 (UTC)[reply]

Checkwiki scripts run on non-WMF wiki dump

Hi guys, some time ago, while user Bgwhite was still active, he kindly provided me with the output of the Check Wikipedia scripts for our small but active wiki. I am looking for someone who could provide me with the output again. Could you help me out? Thanks. --Wesalius (talk) 17:53, 3 March 2019 (UTC)[reply]