User:Aymatth2/SvG clean-up

This page has, for the most part, now been superseded by User:Aymatth2/SvG clean-up/Guidelines. Lists of draft articles to be checked are at User:Aymatth2/SvG clean-up/Guidelines#Lists.


This page is for discussion / coordination of clean-up following the discussion at Wikipedia:Administrators' noticeboard/IncidentArchive941#User:Fram.

Closing decision

The closing decision included clean-up:

... a list of Sander's existing problematic BLP articles should be made for reference for interested parties to recreate properly. Once created, one (1) week's notice should be given in a public enough manner so that editors and interested Wikiprojects (Cycling and Olympics were mentioned by name) can "adopt" articles to either correct during that week or userfy for longer-term correction. After said week, corrected articles should be removed from the list and the remaining uncorrected BLP articles should be deleted. This allows interested editors to fix the low-hanging fruit and provides a list for future development but removes the risk to Wikipedia of potentially hundreds of violations.

The closing admin (user:Avraham) has said:

In my opinion, for whatever that is worth, the people willing to undertake this job can decide among themselves how to best perform it, so long as it gets done. I'd suggest that whomever agrees to take primary responsibility should set up a user subpage for co-ordination, including checklists and order-of-deletion as necessary. ... It's been about a week, so the mass deletion should take place soon, but if the team wants to spread it over a bit more time, that's understandable as long as it gets done. These are BLPs we are discussing, remember.

Users supporting proposal

The following users expressed support for the mass deletion, and will be invited to comment or participate in the process: Aymatth2, Beyond My Ken, Blackmane, Class455, Cwmhiraeth, David Eppstein, EEng, Exemplo347, Fram, Giants2008, Reyk, Smartse, Softlavender, The Banner, Xxanthippe. Any other volunteers would be welcome, particularly on creating the list.

  • I'm afraid I don't see how I can contribute, but if someone can see how, please ping me back. EEng 13:15, 31 December 2016 (UTC)
  • I supported the proposal on the basis that it was better than the other sanctions suggested. I would be unable to help with the production of a script to sort the raw list, but would be happy to help if presented with a list of articles on which to work. Cwmhiraeth (talk) 13:34, 31 December 2016 (UTC)
  • My usefulness when it comes to writing scripts is less than zero, so I can't be of any help there. On the other hand, I would be interested in trying to clean up/confirm as acceptable some of the BLPs once that part of the job starts, if I can find time. While I must confess that I don't believe a week is enough time for a task of this magnitude (I wish it would have been a month, with articles blanked in the interim to hide any BLP issues), I will contribute what I can. Giants2008 (Talk) 16:14, 31 December 2016 (UTC)
  • I totally support this proposal. I've added a comment in the appropriate section below. Exemplo347 (talk) 16:30, 31 December 2016 (UTC)
  • I support this approach, but have no experience writing scripts. Can help with spot-checking some of the individual articles though. Reyk YO! 05:48, 1 January 2017 (UTC)
  • I too am next to useless in writing scripts of this nature; however, @MusikAnimal: may be able to assist in that regard. But like Reyk, Giants2008 and Cwmhiraeth, if I'm able to help with the more manual side of things, please ping me. -Blackmane (talk)
  • I might could help with a script, or a one-off bot task. We can write a simple SQL query to get all the BLPs, and there's an API I built to get word counts (example). For mass deletion, use Twinkle. I've wikified User:Aymatth2/SvG clean-up/Raw article list 0 so that there are wikilinks to each article, once that list is refined to articles you want to delete, use TW > D-Batch MusikAnimal talk 20:02, 1 January 2017 (UTC)
@MusikAnimal: We want to delete all the BLPs, that is, all articles in the raw article list with Category:Living people. If you could create a deletion list that holds links to all articles from the raw article list that are BLPs, for input to D-Batch, that would solve the main problem.
As a secondary concern, we would like to help the project people look for articles to be saved. If you could add the results from musikanimal/api/article_analysis/word_count?articlename to each entry in the deletion list that would be very useful, e.g.
3616. Gaston Fodouor "characters":264 "words":41
If you could also add the other categories in the article to the list entry, skipping categories like "... births", "...stubs" and "people from...", that would be amazing, e.g.
3616. Gaston Fodouor "characters":264 "words":41 category:Cameroonian male weightlifters
Thanks in advance, Aymatth2 (talk) 22:19, 1 January 2017 (UTC)
  • Please advise when the mass-deletion is going to start (giving, say 48hrs notice, if possible). Thanks. Lugnuts Precious bodily fluids 19:29, 4 January 2017 (UTC)
  • I note that only the editors who expressed support for mass deletion were pinged above (a bit counter-productive?). I think several other editors, a selection of whom I'll ping here, might possibly be interested. @Rich Farmbrough, Sportsfan 1234, and Raymarcbadz: 103.6.159.67 (talk) 00:37, 7 January 2017 (UTC)
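
The annotated list entry requested above (word counts plus the more informative categories) could be produced along the following lines. This is only a minimal sketch: the function names are hypothetical, the counts are assumed to have been fetched already from whatever word-count tool is used, and treating "Living people" as a category to skip is an assumption, since every entry in the deletion list carries it.

```python
def is_relevant(category: str) -> bool:
    """Return True for categories worth showing in a list entry.

    Skips "... births", "... stubs" and "People from ..." as requested
    in the thread, plus "Living people" (an assumption: it applies to
    every entry in the deletion list, so it carries no information).
    """
    lowered = category.strip().lower()
    if lowered == "living people":
        return False
    if lowered.endswith(" births") or lowered.endswith(" stubs"):
        return False
    if lowered.startswith("people from "):
        return False
    return True


def format_entry(number, title, characters, words, categories):
    """Build one annotated line in the format shown in the request."""
    parts = [f'{number}. {title} "characters":{characters} "words":{words}']
    parts += [f"category:{c.strip()}" for c in categories if is_relevant(c)]
    return " ".join(parts)


if __name__ == "__main__":
    # Placeholder title and counts; the categories are the handball
    # example quoted elsewhere on this page.
    print(format_entry(
        1, "Example Player", 264, 41,
        [
            "1977 births",
            "Living people",
            "Spanish male handball players",
            "Handball players at the 2004 Summer Olympics",
            "Olympic handball players of Spain",
            "People from Madrid",
            "Spanish handball biography stubs",
        ],
    ))
```

Keeping the filter in its own function means each project can adjust what counts as a relevant category without touching the line format.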

Proposed approach

We may choose to follow a slight variant of the approach defined in the closing decision to avoid the risk of bulk-removal of unreviewed articles from the deletion list. That is, editors interested in saving one or more BLP articles from the list may userfy them. After the defined period, all BLP articles in mainspace will be deleted. The userfied articles may then be checked, cleaned up if needed, and moved back to mainspace.

The proposed main steps are:

  1. Create a raw list of all articles created by SvG. Done, at User:Aymatth2/SvG clean-up/Raw article list 0
  2. Find a volunteer to write a script to create a deletion list which
    • Contains only articles from the raw list with Category:Living people. Done at User:Aymatth2/SvG clean-up/BLP 0
    • Optionally
      • Gives the length of DYK-type readable text in the article (i.e. length excluding stuff in templates, headings, citations, categories etc.)
      • Gives a count of cited sources
      • Gives relevant categories, e.g. 1977 births – Living people – Spanish male handball players – Handball players at the 2004 Summer Olympics – Olympic handball players of Spain – People from Madrid – Spanish handball biography stubs
  3. Execute the script and check results
  4. Create a list of sports-type Wikipedia projects that may want to be involved
  5. Advertise the mass deletion on those project pages, with a deadline for userfication (A note has already been left at Wikipedia talk:WikiProject Sports#Mass deletion)
  6. On expiry of the deadline, delete all BLP articles linked from the deletion list using Wikipedia:Twinkle/doc#Modules for administrators Batch deletion ("D-batch").
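
The optional annotations in step 2 could be approximated directly from an article's wikitext. The sketch below is an illustration under stated assumptions, not the script that was actually used: all function names are hypothetical, the regular expressions handle only common markup, and counting `<ref>` tags only approximates the number of cited sources because a reused named reference is counted each time it appears.

```python
import re

LIVING_PEOPLE = re.compile(
    r"\[\[\s*Category\s*:\s*Living people\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def is_blp(wikitext: str) -> bool:
    """True if the wikitext carries Category:Living people."""
    return bool(LIVING_PEOPLE.search(wikitext))


def count_ref_tags(wikitext: str) -> int:
    """Count opening <ref> tags as a rough proxy for cited sources."""
    return len(re.findall(r"<ref[\s>]", wikitext, flags=re.IGNORECASE))


def readable_text(wikitext: str) -> str:
    """Strip citations, templates, headings and categories; keep prose."""
    # Self-closing named references, then paired <ref>...</ref> blocks.
    text = re.sub(r"<ref\s[^>]*?/>", "", wikitext, flags=re.IGNORECASE)
    text = re.sub(r"<ref(?:\s[^>]*)?>.*?</ref>", "", text,
                  flags=re.IGNORECASE | re.DOTALL)
    # Remove templates innermost-first so nested ones disappear too.
    previous = None
    while previous != text:
        previous = text
        text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    # Section headings and category links.
    text = re.sub(r"^=+.*?=+[ \t]*$", "", text, flags=re.MULTILINE)
    text = re.sub(r"\[\[\s*Category\s*:[^\]]*\]\]", "", text,
                  flags=re.IGNORECASE)
    # Replace remaining wikilinks with their visible label.
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)
    # Drop bold/italic quote markup and collapse whitespace.
    text = text.replace("'''", "").replace("''", "")
    return " ".join(text.split())


def readable_length(wikitext: str) -> int:
    """Length in characters of the readable prose."""
    return len(readable_text(wikitext))
```

A regex approach like this will misjudge unusual markup (tables, files, HTML comments), so the numbers are best treated as a sorting aid for finding candidates rather than as a measure of quality.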

Lists

Lists of articles created are large, so may load slowly. Be patient.

Raw list: User:Aymatth2/SvG clean-up/Raw article list 0
  • 18,340 rows
  • All articles created by SvG, including deleted articles and articles that are not about living people.
  • User:Aymatth2/SvG clean-up/Raw article list 1: ditto, showing creation dates

Deletion list: User:Aymatth2/SvG clean-up/BLP 0
  • 16,104 rows
  • Created by user:MusikAnimal using quarry:query/15207.
  • Spot-checked by user:Aymatth2 to confirm it contains only BLPs that were started by SvG.
  • These will be deleted from draft space after the 90-day salvage period.

Mainspace list: User:Aymatth2/SvG clean-up/BLP mainspace
  • 16,104 rows
  • Same as Deletion list, but points to main space versions of BLPs.
  • Redlink indicates the article has not yet been salvaged.

Saving an article from deletion

This section replaced by User:Aymatth2/SvG clean-up/Guidelines

Any autoconfirmed or confirmed editor may save an article that they do not want deleted by moving it to their user space, then restore it after the mass deletion has occurred. This should be done only for subjects that meet the notability criteria in Wikipedia:Notability#General notability guideline or Wikipedia:Notability (sports). Also consider Wikipedia:Biographies of living persons#Subjects notable only for one event. To save an article from deletion:

  1. Move the article to your user space. E.g. move Fred Smith to User:Myuserid/Fred Smith. This will leave a redirect in mainspace, which the mass deletion will remove.
  2. Tidy up the saved article:
    • Stub out any non-free image, e.g.
      • Change |image=Fred_Smith.jpg to |image=*Fred_Smith.jpg
      • Download a copy of the non-free image, which will soon be deleted. (You may need to consider whether an appropriate WP:NFCC rationale applies.)
      • Save information on the author, date, source etc. of the non-free image for use when restoring it
    • Remove any clean-up or maintenance tags, such as {{One source}} or {{Incomplete}}
    • Disable categories by adding a colon after the opening square brackets, thus [[:category:Living people]]
    • Add the {{Userspace draft}} template at the top of the article
  3. Review the article to ensure that it uses only reliable sources and accurately reflects what the sources say. Do not assume good faith: if you cannot view a cited source online, or it is not a reliable source, drop the citation and any statements it supported. Be particularly cautious about retaining any potentially damaging assertions. Consider Wikipedia:Biographies of living persons#People who are relatively unknown.
  4. Wait until the mass deletion has been run. If you've commented at #Users supporting proposal (above) you'll be notified.
  5. Restore the article:
    • Move the article back to mainspace
    • Upload the saved non-free image if present, and relink the article to the non-free image
    • Enable the categories
    • Remove the {{Userspace draft}} template
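
The mechanical parts of the tidy-up and restore steps above (stubbing out the image and disabling or re-enabling categories) can be expressed as simple text substitutions. The following is a minimal sketch with hypothetical function names; it assumes the infobox parameter is literally named `image` and that every `[[:Category:...]]` link in the draft is a disabled category rather than an intentional inline link.

```python
import re

_ENABLED = re.compile(r"\[\[\s*(category\s*:)", re.IGNORECASE)
_DISABLED = re.compile(r"\[\[\s*:\s*(category\s*:)", re.IGNORECASE)
# An |image= parameter with a value that is not already stubbed out.
_IMAGE = re.compile(r"(\|[ \t]*image[ \t]*=[ \t]*)(?=[^\s*|}])", re.IGNORECASE)


def disable_categories(wikitext: str) -> str:
    """[[Category:X]] -> [[:Category:X]], so the draft is not categorised."""
    return _ENABLED.sub(r"[[:\1", wikitext)


def enable_categories(wikitext: str) -> str:
    """[[:Category:X]] -> [[Category:X]], for the move back to mainspace."""
    return _DISABLED.sub(r"[[\1", wikitext)


def stub_out_image(wikitext: str) -> str:
    """|image=Fred_Smith.jpg -> |image=*Fred_Smith.jpg"""
    return _IMAGE.sub(r"\1*", wikitext)


if __name__ == "__main__":
    draft = "{{Infobox person|image=Fred_Smith.jpg}}\n[[Category:Living people]]"
    print(disable_categories(stub_out_image(draft)))
```

Both the category and image substitutions leave already-converted text unchanged, so running them twice on the same draft should do no harm. The review in step 3 is a matter of editorial judgement and is not something a substitution like this can replace.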

Discussion

Comments on approach

  • Comment - The sheer number of articles (over 18,000) makes it unlikely that every single article can be checked through by a human editor without it taking years. It's obviously unacceptable for the articles to remain on Wikipedia for an extended period given the fact that they've been shown to be riddled with various errors - a straight nuking of the whole lot (without salting them, so people have the option to recreate them if they wish to) is the only realistic outcome. Exemplo347 (talk) 16:28, 31 December 2016 (UTC)
The decision was to publish a list, give a week for salvage attempts, and then nuke all that are left. The critical path item is finding someone who can create the list. If they are able to annotate the entries with length, number of citations, categories etc. it may be possible to focus on the articles most likely to be of value. E.g. longer articles in Olympics categories with multiple sources may be worth userfying for later checking. If it is not practical to make an annotated list, they will almost all get deleted. Unfortunate, but better than knowingly leaving a bunch of BLP violations in mainspace. Aymatth2 (talk) 16:43, 31 December 2016 (UTC)
  • Comment- in Step 1, we may need to check articles not in Category:Living People by eye, to make sure the category wasn't inadvertently left off. Reyk YO! 05:46, 1 January 2017 (UTC)
  • Comment I am using an expanded version of my approach at the Neelix lists for the first week. I am opening each article, checking for BLP violations, checking for notability, and then moving on. At the end of the week, the articles that I haven't addressed, or have found unacceptable, will be nuked according to the community consensus. I'm working at User:Aymatth2/SvG clean-up/BLP 0. I find that this approach is the most effective for large clean-ups, but I wanted to solicit input before getting too far. Cheers, Tazerdadog (talk) 00:49, 5 January 2017 (UTC)
  • @Aymatth2: The efficient way to delete these articles is the d-batch function in twinkle. In order to use it, it is necessary that there are only articles to be deleted in the page at the time d-batch is applied. With my method, that is a relatively easy copy-paste of the unchecked/bad articles into an admin's deletebox, and takes less than a minute. On the other hand, userfying each article takes a real amount of time, as does restoring it to mainspace. My way is simply more efficient. Pinging @Tavix:, who has experience doing this with the Neelix X1 cleanup. Also pinging @Lugnuts:, so he can tell where his olympic cyclist list went. Tazerdadog (talk) 01:22, 5 January 2017 (UTC)
  • @Tazerdadog: There may be confusion when other editors from other projects start editing the same list at the same time, and there is a risk of (accidental) block deletion of unchecked articles from the BLP list. It takes little effort to move an article that may have potential to user space. Then it can be checked carefully without pressure of a deletion deadline, and recreated at leisure. This leaves a useful audit trail, showing who checked each article. Remember, we cannot assume good faith on the sourcing, but must carefully check that each article accurately reflects what the source says. "Looks o.k." is not good enough for a BLP. Aymatth2 (talk) 01:51, 5 January 2017 (UTC)
  • @Aymatth2: You are of course correct on the increased sourcing requirements. I am ensuring that there are no claims that are challenged or likely to be challenged without rock-solid sourcing. Another problem that I have with the userfication method is that it will lead to duplication of work. If multiple people start working from the top of the list, we get a disaster. To prevent this, everyone has to work off of a master list. I would be happy to sign my name onto each page I save so that who saved what is clearly attributed for future audits. Userfying the articles takes longer than checking them however, because the articles are so short and formulaic. Tazerdadog (talk) 02:04, 5 January 2017 (UTC)
  • It is also easy enough to check the sum of the articles in the lists to ensure that no chunks get accidentally deleted, and to recover chunks from the history if it does happen. I'd happily volunteer to do this at the end of the week.Tazerdadog (talk) 02:07, 5 January 2017 (UTC)
  • @Tazerdadog: I don't know what it is like for you, but I find the download painfully slow, and then it takes for ever for the system to check the link colors after the download. If I spent half an hour editing the list, went to save and got an edit conflict, I would cry. The projects can coordinate lists for "their" articles: weightlifters, gymnasts, cyclists, footballers, whatever. Given the huge numbers and tight deadline, there is not time for careful checking at this stage. If it looks like it may have potential, userfy it and move on. Then take your time to check properly after the deadline is past. Aymatth2 (talk) 02:34, 5 January 2017 (UTC)
  • @Aymatth2: That one has an easy fix - just add about 30 sections, and those problems go away. We can also section by sport once that is identified. It's fine for me, but I'm on good internet. The problem is that it takes me longer to userfy an article than it does to fix it. Tazerdadog (talk) 02:45, 5 January 2017 (UTC)
  • @Tazerdadog:, @Aymatth2: I love the procedure of Tazerdadog. I think as long as it's not indicating many BLP issues, there is no need for a deleting rush. MFriedman (talk) 12:31, 5 January 2017 (UTC)
Note I started sorting the articles, see User:MFriedman/sandbox. I think this might be useful. MFriedman (talk) 11:59, 5 January 2017 (UTC)
  • That makes sense. I replied in the early stage, but got lost in it afterwards. Will take a look at it. But still I think only a week for scanning articles and the annoying editing way via user space is much too short. I'm willing to rescue the articles of athletes who participated at Olympics and World Championships, but a week would be too short. But wait, will start reading the above link first. MFriedman (talk) 14:52, 5 January 2017 (UTC)
  • @Aymatth2: After reading most of it I agree the articles need to be checked, but I don't agree that there is a rush. At the time that the proposals were written nobody actually knew how bad the situation was, and so it was closed as "delete all" because there were given examples of BLP violations. But as we know now, the articles of SvG don't show the severe kind of BLP violations as in the proposal. One of the main examples in the original proposal, was the article of Vanessa Hernandez. But if it was that bad to delete all his articles why is the doping allegation not deleted from List of doping cases in sport (H) (as also mentioned)? To continue with that I think SvG is not the main problem, but (people creating/editing) list like List of doping cases in sport (H), without using references for all cases. MFriedman (talk) 15:44, 5 January 2017 (UTC)
  • @MFriedman: It was clear from the discussion that most of the articles did not have serious BLP violations, but several (deleted during the discussion) did have extremely serious violations such as allegations of fraud with bogus citations, and we must assume an unknown number remain among the 16,000. We have to accept the closing decision. A mass of fairly harmless stubs will be lost, but almost all of them took only a few minutes to create, and can easily be recreated. Aymatth2 (talk) 15:58, 5 January 2017 (UTC)
  • @Aymatth2:, Yes I saw the mention of two footballers from Equatorial Guinea, or do you mean more with several? But like I said we can't revert the outcome and I agree articles need to be checked, but we can adjust the time scale as we know better now. I see he had an explanation for it and even made a list of articles with harmful statements he wrote. Having this list, and as no other cases have been raised after screening many articles already, why a rush? (Many articles I saw cannot be recreated within a few minutes.) MFriedman (talk) 16:13, 5 January 2017 (UTC)
  • @MFriedman: Note that Aymatth2 and Fram were the main contributors in wanting to get my articles deleted. So I'm not surprised at all that he wants to do it as quickly as possible and make it as hard as possible to keep pages. Thanks for all your work, I'm done with it. Sander.v.Ginkel (Talk) 17:28, 5 January 2017 (UTC)
  • Comment Would it not be easier to just remove the entry from the master list, rather than moving x article to a userspace, then having to move it back? Everything still on the list at the deadline point can then be deleted. Lugnuts Precious bodily fluids 07:55, 5 January 2017 (UTC)
    • I think that makes sense, but Aymatth2 disagrees with me. Tazerdadog (talk) 08:39, 5 January 2017 (UTC)
    • Yes of course, that makes sense. Hats off to Aymatth2 for beginning this project, but the method described in this section is ridiculous. A far better approach is to move out entries from the "To Be Checked" section to an appropriate section as and when they get reviewed. When the deadline is reached, all remaining articles in the "To Be Checked" section can be nuked by an admin (which they can do by copying the list to a wiki-page of their convenience, such as one in their own userspace, and running TW's D-batch module). Just abandon the master list. I don't doubt the community's ability to edit a single list collaboratively, rather than multiple copies for each editor/project. I propose that User/Aymatth2/SvG clean-up/Tazerdadog checklist be moved to User:Aymatth2/SvG clean-up/Worklist. 103.6.159.86 (talk) 10:44, 5 January 2017 (UTC)
ok, i didn't see the comments made in the above section. Now I tend to agree with Aymatth, but I still feel this policy shouldn't be enforced on everyone. There may be many editors who just may not like moving articles to their userspace. In passing, i note that User/Aymatth2/SvG clean-up/Tazerdadog checklist should be moved out of mainspace. 103.6.159.86 (talk) 11:44, 5 January 2017 (UTC)
    • I agree with Lugnuts. MFriedman (talk) 12:02, 5 January 2017 (UTC)
  • The section on "Saving an article from deletion" describes the full, formal userfication process, but there should be little problem with simply moving an article to user space unchanged, a trivial step. It can be tidied up after. Many of the articles need fixing. Often there is a string of articles derived from one source that each say the subject is "a former road cyclist", or whatever, though the source does not say they have retired. Others imply the subject is still competing when the source does not say that either, or that they competed in several international competitions when only one is reported. These are not major issues, but not "good enough" for a BLP. We should not rush the task of clean-up. We should focus now on saving copies of candidates to be checked, then after the mass deletion carefully check and fix the articles in user space before moving them back to main space. That will give a solid audit trail. Aymatth2 (talk) 14:20, 5 January 2017 (UTC)
  • "We should not rush the task of clean-up. " So why rush in mass-deleting them all, if they can be fixed through cleanup? People can go through the lists, just like the Neelix redirects of days gone by. Anything needing deleting can be tagged as such. Lugnuts Precious bodily fluids 14:57, 5 January 2017 (UTC)
  • Our main objective must be to make sure BLP violations are removed, not to save stubs. Removing articles from the main list is very risky. It will tend to make us rush the review, leaves no usable audit trail and introduces the temptation to remove large blocks of articles, e.g. all the 2016 European Weightlifting Championships stubs, with little or no checking of the individual articles. It is too dangerous. Aymatth2 (talk) 14:20, 5 January 2017 (UTC)
  • Different projects can create their own checklists, perhaps like User:MFriedman/sandbox, to help sort through the 16,000 articles and pull out the ones they want to later review. There is no need for everyone to organize the same way. User:Tazerdadog seems to have a way to finding the cyclists in the list that could perhaps be used for other types of athlete. Aymatth2 (talk) 14:20, 5 January 2017 (UTC)
  • Comment - What will happen to articles which do not violate BLP? There are hundreds, if not thousands of articles about cycling races / competitions / teams which have no BLP violation. Are they getting deleted as well? If so, that seems absurd. Sorry if you've raised this already Lugnuts. XyZAn (talk) 17:32, 5 January 2017 (UTC)
    • Further comment - Aymatth2, Lugnuts I have analysed out the bulk of Sander's articles relating to WP:CYCLING- from article 16536 down in the raw article list 0 - see here: User:XyZAn/sandbox/SVG articles to save Wiki Project Cycling. I've split it into three parts, section 1 is already part of a separate AFD series, it would seem prudent to leave them as is as it has been suggested that the 'by-year' articles are merged. Section 2 lists teams, races, race editions, season articles etc - none have anything to do with BLP violations, so should not be deleted under any circumstances being they meet WP:NCYC. That leaves section 3 - which is the list of cyclists, these need to be checked for BLP violations. XyZAn (talk) 18:24, 5 January 2017 (UTC)
      • Comment The articles about teams, races, competitions etc. have been excluded from the deletion list, as have articles about dead athletes. We are only talking about biographies of living people. The (painful) decision was that these have to be deleted due to the high risk that they contain serious BLP violations and the effort it would take to review them all. But if they are saved in user space, they can be recreated after review and correction. Aymatth2 (talk) 21:52, 5 January 2017 (UTC)

Hang on a second

Maybe I'm missing something here... If the idea is for editors to manually move articles to their draft space, check them, then move them back, why can't a bot move them ALL to draft space in the first place? Delete the redirect out of the mainspace, list them all in manageable lists, with editors either tagging the draft for deletion or moving them back to the mainspace. Let's be pragmatic here. Lugnuts Precious bodily fluids 19:55, 5 January 2017 (UTC)

  • Something like that was discussed early in the debate, but there were objections about flooding queues of draft articles for review. Otherwise, if a bot can be arranged to userfy the articles within the next ten days or so, and they are then organized into lists for review, that seems like an acceptable solution. We would want a project to define the approach and guidelines for review, deletion (e.g. non-notable subjects), correction and restoration, to make sure consistent standards are followed. Aymatth2 (talk) 21:52, 5 January 2017 (UTC)
    The discussion was on @Avraham:'s talk page. I proposed something very similar to this. Avraham disliked the idea due to the possibility of BLP violations lingering in draftspace. I think that the risk of BLP violations in draftspace is outweighed by the benefit of the ease of restoration of acceptable articles, but Avraham disagrees. Flooding the draft queues for review is not a problem, we would not send any through the AfC process. @MusikAnimal:, is such a bot technically practical? Tazerdadog (talk) 22:23, 5 January 2017 (UTC)
    Definitely possible, relatively easy, and I'm happy to code that up MusikAnimal talk 23:17, 5 January 2017 (UTC)
    Thank you MusikAnimal. I'm going to ask you to go ahead and do so. That should, at minimum, give us a better starting place, and get the BLP violations out of mainspace while we figure out a more systematic review process. Tazerdadog (talk) 23:49, 5 January 2017 (UTC)
    I think this is a great idea. The Drover's Wife (talk) 00:47, 6 January 2017 (UTC)
    @Avraham: would it make a difference if we still set a deadline for rescuing the articles in draft space, just not one as short as a week? I agree that BLP violations shouldn't sit around in draft space indefinitely, but I also think there's a lower chance of them causing problems in draft space, and this would make it easier to address the articles that either have no BLP violations or are easily fixable. After a certain time (probably somewhere in the 3-6 month range) we could still delete all of the remaining drafts. TheCatalyst31 ReactionCreation 05:22, 6 January 2017 (UTC)
Thanks MusikAnimal - that would be a great help and would be the best option. Any BLP issue would be gone from the mainspace. Draft articles can be checked, tagged for deletion if needed, or moved back. And I'm more than happy to do the checking too. If we had them in lists of say 500 or 1000 articles per list, it won't seem so daunting to others who want to check. Lugnuts Precious bodily fluids 08:08, 6 January 2017 (UTC)
I can make the lists in chunks of the desired size. Tazerdadog (talk) 08:33, 6 January 2017 (UTC)
That sounds great to me. @Aymatth2: - any objections to this? IE moving all the BLPs to draft and creating lists of them to work through? Thanks. Lugnuts Precious bodily fluids 12:00, 6 January 2017 (UTC)
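
Splitting the deletion list into lists of 500 or 1,000 articles, as suggested above, is a small slicing job. A minimal sketch follows, with hypothetical function names. It also includes the count check raised earlier in the discussion, so that no block of titles is silently lost when the master list is divided.

```python
def chunk(titles, size=500):
    """Split a list of titles into consecutive sub-lists of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [titles[i:i + size] for i in range(0, len(titles), size)]


def nothing_lost(titles, chunks):
    """True if the chunks, taken in order, reproduce the original list."""
    flattened = [title for part in chunks for title in part]
    return flattened == list(titles)


if __name__ == "__main__":
    # Placeholder titles standing in for the 16,104 rows of the BLP list.
    titles = [f"Article {n}" for n in range(1, 16105)]
    parts = chunk(titles, 500)
    print(len(parts), len(parts[-1]), nothing_lost(titles, parts))
```

Comparing the flattened chunks with the original list is stricter than comparing totals, since it also catches duplicated or reordered entries.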

I like the general concept, but propose the following conditions:

  1. All the BLP articles are moved to a noindex space where they will not be picked up by search engines (I think draft space meets this condition) and are given names that can be easily distinguished from other drafts, e.g. Draft:SvG/John Doe
  2. They do not flood the AfC or other review queues. I think this would be true by default.
  3. All draft space articles that have not been deleted or moved back to main space within 90 days are deleted. We cannot let potential BLP violations linger in publicly accessible space indefinitely. Most of the stubs are fairly trivial, so saving them is of minor concern compared to the risk of harming the subject's reputation.
  4. A project is launched to coordinate the clean-up, including guidelines for review, tools for progress tracking, ways of sorting articles into groups for focused checking etc. My main concern is that there should be safeguards against casual, rapid moves back to main space without careful checking. A lot of the stubs "look o.k." but do not reflect what the source says.
Aymatth2 (talk) 14:40, 6 January 2017 (UTC)
Thanks Aymatth2. Yes, I agree to all those points. Lugnuts Precious bodily fluids 15:22, 6 January 2017 (UTC)
Yep, those all look reasonable to me. Tazerdadog (talk) 15:41, 6 January 2017 (UTC)
I think all of the above points sound good, I will assist Lugnuts in going through BLP cyclist articles. XyZAn (talk) 17:38, 6 January 2017 (UTC)
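
Conditions 1 and 3 above amount to a naming rule and a date calculation. As a rough illustration only: the "Draft:SvG/" prefix is the example name floated in condition 1, not a confirmed convention, and the function names and the move date used below are hypothetical.

```python
from datetime import date, timedelta

SALVAGE_DAYS = 90            # salvage period from condition 3
DRAFT_PREFIX = "Draft:SvG/"  # example naming from condition 1 (assumed)


def draft_title(title: str) -> str:
    """Name under which a mainspace article would sit in draft space."""
    return f"{DRAFT_PREFIX}{title}"


def deletion_due(moved_on: date) -> date:
    """Date on which an unreviewed draft becomes eligible for deletion."""
    return moved_on + timedelta(days=SALVAGE_DAYS)


def is_expired(moved_on: date, today: date) -> bool:
    """True once the salvage period has fully elapsed."""
    return today >= deletion_due(moved_on)


if __name__ == "__main__":
    moved = date(2017, 1, 10)  # hypothetical move date
    print(draft_title("John Doe"), deletion_due(moved))
```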

Superb, I think there's broad agreement here to move everything into draft and then get working. From comments here, and recalling what was raised in the ANI discussion, it would be useful to have the following lists created, in batches of 500 articles (if possible):

  1. Olympic cyclists, split by male and female
  2. All other cyclists, split by male and female
  3. Estonian biographies. I recall that these were all created from one source, which turned out not to a) support the birth/death dates in the article or b) stand up as reliable
  4. Weightlifters. All created from a single source that didn't detail birth/death dates
  5. All remaining Olympians not covered by the above, split by male and female
  6. All remaining articles not covered by the above, again split by male and female

At least that would be starting point for various projects (WIR, Olympics, Cycling, etc). Any thoughts/comments on those before we get started with this? Thanks. Lugnuts Precious bodily fluids 13:14, 7 January 2017 (UTC)

  • See this link for a PetScan list of SvG articles on cyclists. You can change the value in the "Categories" box, e.g. to Weightlifters or Estonian people, and click on "Do It!" to get other lists. this link gives women Olympic cyclists. The tool is very flexible: see Manual for PetScan. But before we charge off, I suggest someone volunteers to start a new page with guidelines for the clean-up job, which we can review and refine. Once User:MusikAnimal has moved all the stubs to draft (hint) we can announce the project formally on the project pages. Aymatth2 (talk) 13:46, 7 January 2017 (UTC)
    • I believe this will require a quick BRFA, which I will file soon. Should be able to get to all of this tomorrow. Best MusikAnimal talk 16:44, 7 January 2017 (UTC)
Thanks again. Agree Aymatth2, guidance would be good, esp. on adding tags to drafts that def. need deleting once they've been checked. Lugnuts Precious bodily fluids 18:51, 7 January 2017 (UTC)
  • I'd be happy to help you write that and monitor the process if you would like. Tazerdadog (talk) 21:31, 7 January 2017 (UTC)
  • Thanks. I have made a first attempt with the things I can think of, but it needs fresh eyes. Aymatth2 (talk) 22:47, 7 January 2017 (UTC)
Thanks again. I'll take a look at the guidelines later. Lugnuts Precious bodily fluids 10:24, 8 January 2017 (UTC)
BRFA filed at Wikipedia:Bots/Requests for approval/MusikBot 10. Feel free to chime in MusikAnimal talk 05:57, 9 January 2017 (UTC)

Timescales

What timescales are there for this? When are articles going to be deleted? This element seems a bit unclear. Lugnuts Precious bodily fluids 14:58, 5 January 2017 (UTC)

  • The decision was to mass delete after one week's notice had been given of the BLP list. That could be taken as one week from the time User:Aymatth2/SvG clean-up/BLP 0 was posted here, e.g. 11 January 2017. Since the process has been slow to start, and there is still discussion about the approach, I would prefer to steal a few more days. But as User:Avraham has said, "These are BLPs we are discussing, remember." I do not see that we could reasonably delay beyond 15 January 2017. Aymatth2 (talk) 15:17, 5 January 2017 (UTC)
    • The ANI decision called for one week's notice "in a public enough manner" that interested editors could be aware of what was going to happen. With all due respect to the considerable efforts here, I'm not sure a user subpage is going to be public enough for some. I suggest posting on a few of the relevant WikiProject talk pages (such as Cycling and Olympics) with the deadline when one is decided. Giants2008 (Talk) 17:20, 5 January 2017 (UTC)
Athletics, Cycling, Football, Gymnastics, Olympics, Running, Swimming, Women's sport, Women in Red
I think at minimum, it also has to be one week after we finalize what the process of preventing articles from being deleted will be; it's still somewhat unclear whether these should be moved to userspace, moved to draft space, or removed from the list entirely, and until that's settled people who want to rescue these articles probably won't know where to put them, which could delay their work. TheCatalyst31 ReactionCreation 05:17, 6 January 2017 (UTC)

Copied from WikiProject Women in Red's talk page

Thanks TheCatalyst31 and Aymatth2 for raising these considerations. SvG has been the most active contributor to WiR over the past few months, not only creating more articles than anyone else but also assisting us in general and ensuring that as many new articles as possible are included in our Wikidata statistics on our women's biographies. As one of the most active day-to-day and month-to-month monitors of additions to WiR over the past 18 months, I have repeatedly drawn the attention of Rosiestep and other key WiR coordinators to the inherent danger of including one-line mini-stubs on sports people created by SvG and several other editors. Her reaction, as one of our leading members, was that stubs were always important as they could be built on this month, next month or in the years to come. I have personally carefully and constantly monitored much of the work of SvG and several similar editors who have contributed a huge number of stubs in recent months on women who have played a key role in national and international sports events. While I agree that BLPs should normally be sourced from more than one reference, the fact that SvG -- like many others -- has been able to draw on at least one reliable reference for the biographies of almost 100% of his articles is ample justification for their continued inclusion. If SvG articles on women's BLPs over the past 15 months (probably over some 7,000) are to be deleted on the basis of problems detected with 20 or 30, I would estimate that some 30,000 other mini stubs on women in sports over the same period should also be deleted. Some would maintain that this should not be a problem as just as many of them are on men. But a general analysis on the number of new articles on women in sports compared to men in sports shows that the proportion is far above the overall 17%, bordering on 50%. In the case of SvG, given his specific interest in supporting women, the proportion was much higher.
The deletion of SvG's articles with the additional threat of including those of the five or six editors who have followed more or less the same path would therefore have a devastating effect on WiR's until now successful attempt to increase the EN Wikipedia's coverage of women. I hope therefore that SvG's important work will not simply be lost to the discretion of a couple of administrators.--Ipigott (talk) 18:10, 5 January 2017 (UTC)

Unfortunately, there was a weeks-long discussion on WP:ANI and the consensus was that the risk of keeping potentially thousands, if not tens of thousands of BLP violations outweighs the benefits of the articles. There will be a list of deleted articles to be recreated properly to facilitate the repair of the project, but consensus was reached and a request to reopen the decision was snow-closed. I understand it is frustrating, but that is how we reach difficult decisions, and when dealing with real people, we have a responsibility to take more care. Thank you. -- Avi (talk) 20:51, 5 January 2017 (UTC)
ANI is not a good place for such a discussion. It is frequented by vandal fighters to a large extent. A more central discussion would have been better. I assume that no statistics were done to look at the extent of BLP violations by which I mean actual rather than technical violations? All the best: Rich Farmbrough, 00:59, 7 January 2017 (UTC).
I probably have the biggest sample size of anyone. I checked 221 cyclists from the list. Among them, I flagged 4 articles as containing controversial/negative statements about their subject. The four articles I flagged are Sanne_van_Kerkhof, Hanna_Solovey, Márcia_Fernandes and Bas_van_Dooren. I make no comment on whether any/all of these four cases have the sourcing needed to avoid a BLP violation. I do not claim my sample was representative, as it was just the first 221 articles on the list. I cannot say with 100% confidence that I caught every BLP issue (one likely one that I would have missed is a flipped gender, which has happened to SvG articles in the past); however, I think I caught everything. Many of the articles contain flaws, such as calling their subject a former cyclist when there is no indication that they have actually retired; however, I think it's a stretch to call that a BLP violation. Cheers, Tazerdadog (talk) 01:57, 7 January 2017 (UTC)
Tazerdadog I've had a quick look at the four cyclists you mention. Interestingly, apart from Márcia Fernandes, I see that they are covered in several other languages with similar descriptions. I'm not too sure whether or not this affects the status (acceptability) of the articles or whether, for example, all BLPs relating to drug-taking should be systematically deleted or reworded. I imagine a considerable number of BLPs, even Start, C or higher, contain such details too.--Ipigott (talk) 11:46, 9 January 2017 (UTC)
I've just looked at Category:Doping cases in cycling and see there are 346 articles, mostly BLPs. I've also looked at all those in Category:LGBT sportspeople by nationality if this constitutes "negative" coverage of a living person.--Ipigott (talk) 11:55, 9 January 2017 (UTC)
Yeah, my standard for flagging an article was essentially it had a claim that would be unusually damaging if it was wrong. The other 217 cyclist articles had no apparent problems, and I would judge the risk of a BLP violation in any of them to be low. Tazerdadog (talk) 14:13, 9 January 2017 (UTC)
Some points:
  • We must get the article right. Saying an athlete was born in 1956 when she was in fact born in 1965 may cause significant harm. SvG was very slapdash, so there are a lot of "minor" errors that should be fixed.
  • Coverage with similar wording in other Wikipedias does not mean much. Errors tend to propagate.
  • PetScan gives only 39 results for SvG articles with category:Sportspeople in doping cases, but may miss some that were not categorized. We can report doping, if backed up by reliable sources, but should question whether it is important for a very short article on a minor athlete. If they are known only for competing in one event, maybe the article can be made a redirect to the list of competitors in the event. Or maybe it can be expanded into a more complete and balanced biography.
  • The guideline on sexual orientation categories for living people says they "should not be used unless the subject has publicly self-identified with the belief or orientation in question, and the subject's sexual orientation is relevant to their public life or notability ..." PetScan gives just 3 results. Again, this may miss some.
Aymatth2 (talk) 14:50, 9 January 2017 (UTC)
I think the doping point is a misinterpretation of BLP policy. Any sportsperson's fraud within their sport is key to the topic and should be included where sources are available. I would consider the choice not to include it a violation of the Neutral Point of View policy, as it deliberately obstructs the reader from understanding the context of a sportsperson's performance. In all the examples listed above Sander has stated doping in a neutral, matter-of-fact manner and sourced it. He did not make the LGBT additions so I would say that is not relevant to the case. The examples you've listed are not issues - they are good work very much in line with Wikipedia policy. SFB 12:53, 4 February 2017 (UTC)
Yeah, it can be frustrating that some error-free articles will be lost in the process. But as already observed in the discussions pertaining to the entire process of nuking articles created by SvG, the risks of defamatory content, false citations etc. involved in BLP articles created by him run high. And we can't afford a rerun of the Seigenthaler incident. Light❯❯❯ Saber 16:56, 6 January 2017 (UTC)
What does "can't afford" mean in this context? All the best: Rich Farmbrough, 00:59, 7 January 2017 (UTC).
I realize that it's probably far too late to comment on the sense of moving thousands of BLPs out of the mainspace but while a handful of BLPs have been highlighted for the damage which may be done to the reputation of the individuals they cover, as far as I can see absolutely no attention has been given to how upset people may be to see that they and their achievements were not considered important enough to keep on Wikipedia. It remains to be seen whether any of those concerned will notice the deletions or comment on them if they do.--Ipigott (talk) 11:33, 9 January 2017 (UTC)
  • There has been quite a lengthy discussion on the Women in Red talk page but until now there is not much evidence of a real interest in checking out the thousands of BLPs in question. As a result, apart from those on cycling, it looks as if most will soon be removed from the mainspace.--Ipigott (talk) 11:33, 9 January 2017 (UTC)

Dealing with "former" in articles

One of the most widespread flaws with these articles is the use of former, without a source that stated that they had retired, or stopped competing at a high level. What should be done in these cases? @Sander.v.Ginkel:, what were your criteria for including former in the article?

For reference, here is the readable prose from a typical article containing this flaw:

Janice Bolland (born January 25, 1966 in Cheyenne, United States) is an American former road racing cyclist. She won a gold medal at the 1992 UCI Road World Championships in the team time trial and a silver one in the team time trial in 1993.

Options I see include:

  1. Leave it as-is.
  2. Remove former, leaving the question open whether the athlete is still active, but perhaps misleading readers to believe retired athletes are active by implication.
  3. Base the inclusion or exclusion of the word former on some other factor, such as age (if known), or date of last noteworthy finish.

I personally don't think this is a big deal, so I support option 1, but it has been brought up enough that discussion is warranted. Tazerdadog (talk) 04:51, 6 January 2017 (UTC)

  • The article should say only what the sources say, with no original research. From the sources cited by this article we cannot tell if she has retired or not. "Former" may be quite insulting if she still competes actively in some age category. The cited sources do not give the place of birth, so that should be dropped. She certainly was not born in Cheyenne, which is a group of indigenous people. We can say:
Janice Bolland (born January 25, 1966) represented the United States as a road racing cyclist at the 1992 UCI Road World Championships in the team time trial, where she won a gold medal, and at the team time trial in 1993, where she won a silver medal.
This is accurate, but gives undue weight to just two events in the individual's career, a problem with all these stubs. The primary source for the article, Janice Boland, gives much more information about her career, and a very small amount of research would find this source for Janice Bolland Tanner, from which a rather more complete and balanced biography could be developed. But if we are in too much of a hurry to improve the article, we must at least remove any original research. A BLP must say only what the sources say. Aymatth2 (talk) 15:22, 6 January 2017 (UTC)

Bot swiping

Wouldn't it be better to use a bot to swipe all of Sander's articles and leave just xxx and profession and one source? That way we'd have a lot of sub stubs but at least they would be rid of errors and still expandable. If the infoboxes have problems too, get a bot to remove those too. I just think there's a better way of dealing with this. I think the best thing would be to delete them all and get a bot to recreate them and thousands more but that's unlikely to be approved.♦ Dr. Blofeld 18:35, 6 January 2017 (UTC)

I think that's the main issue with the articles - the source. From the initial ANI post, it was clear that some (not meaning all) of the biographies didn't have a single fact backed-up by the source in the article. The way we're going with this is to move them all into draft space. Then there's 90 days to check them. Anything verified to be OK will be moved back. Everything else will be deleted, either via checking or going past the 90-day cut-off. 14,000 articles, 90 days, 10 volunteers would only need about 16 articles checked per user per day. In theory... Lugnuts Precious bodily fluids 19:14, 6 January 2017 (UTC)
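
For anyone wanting to re-run the workload estimate with different numbers, here is a minimal sketch. The figures (14,000 articles, 90 days, 10 volunteers) come from the comment above; the function name is hypothetical and used only for illustration, and the calculation assumes every volunteer checks articles every day of the window.

```python
import math


def articles_per_volunteer_per_day(total_articles: int, days: int, volunteers: int) -> float:
    """Average number of articles each volunteer must check per day."""
    if days <= 0 or volunteers <= 0:
        raise ValueError("days and volunteers must be positive")
    return total_articles / (days * volunteers)


# Figures taken from the discussion: 14,000 drafts, 90-day window, 10 volunteers.
rate = articles_per_volunteer_per_day(14_000, 90, 10)
print(f"{rate:.1f} articles per volunteer per day")  # 15.6
print(f"rounded up: {math.ceil(rate)}")              # 16
```

Halving the number of active volunteers doubles the daily rate, so the estimate is sensitive to how many people actually take part.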

May have jumped the gun

Going by the guidelines that were previously added to the page, I moved the Richard Rozendaal article to User:Giants2008/Richard Rozendaal and cleaned it up (this in particular stood out). Even though it's still stubby, it has been expanded slightly and I feel it's ready to go back into mainspace, as everything is now sourced properly. However, now it seems that we're moving all of the articles to draftspace. Should this be moved to Draft:Richard Rozendaal or should I wait for the cleanup approach to be finalized? Sorry for any trouble caused by my premature move. Giants2008 (Talk) 23:52, 8 January 2017 (UTC)

It is probably best to wait until all the other articles have been moved to draft space, then move your version to main space, just to make sure it does not get deleted. I will make sure you are notified when the mass move happens. The infobox error seems typical: no bad intent, not really damaging, but slipshod. Aymatth2 (talk) 00:10, 9 January 2017 (UTC)

Question

Hey, so every article from him that was edited by other people (e.g. me with the handball ones) will be deleted if I don't move them into draft space? Even if corrected and fixed? Kante4 (talk) 18:14, 11 January 2017 (UTC)

  • @Kante4: SvG made many errors that could affect the careers of these people. The articles started by SvG will all be moved to draft space (see User:Aymatth2/SvG clean-up/Guidelines) with 90 days for them to be reviewed, corrected and moved back to main space. Any drafts left after 90 days will be deleted. I will notify you when the move to draft space is done. You should check carefully that the articles agree with the cited reliable sources, and fix them if needed before moving them back to main space. If you are unable to view a source, do not assume good faith. Sometimes SvG guessed that a source would support the article without checking whether it even mentioned the subject. Aymatth2 (talk) 20:43, 11 January 2017 (UTC)
Ok, thanks for the notice. I edited some articles he created about handball players and that would be a "shame" if they were deleted. Kante4 (talk) 21:11, 11 January 2017 (UTC)