Wikipedia:Bots/Requests for approval/PseudoBot 4

The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Denied.

PseudoBot 4

Operator: Pseudomonas_(talk)

Automatic or Manually Assisted: Automatic

Programming Language(s): Perl

Function Summary:

More function creep, I'm afraid, guys :) I've had various requests that PseudoBot be run on selected lists other than date pages. I've discussed this on the Village Pump, and there seems to be a general feeling that some, but not all, bulleted lists are appropriate candidates for this. I'd want this only to apply to bulleted lists, and mainly to those where redlinks are a real problem. The most recent request was with respect to List of photographers. Others suggested were lists of bands from areas, lists of notable alumni of institutions, disambiguation pages for people with common forenames, and so forth.

Edit period(s) (e.g. Continuous, daily, one time run): Continuous.

Already has a bot flag (Y/N): N/A, it's an AV bot and so doesn't want one.

Function Details:

I think the summary above, along with the previous discussion for the BAG proposals, covers quite a lot. Things that differ from the bot's current workings:

Articles are going to need to be registered one-by-one, or by category (say Lists of people by nationality) or by some other specification. There are various ways one could do this - templates on the pages, a bot subpage that specifies the pages/cats, a "watched by PseudoBot" category, and many more esoteric options. The simplest way, of course, would be for me to do it by hand, and any additions to the list to be made by leaving a message on my talk page. I welcome suggestions on the best way to manage this, bearing in mind that registering inappropriate pages/sections would be a fast way to bring the entire enterprise into disrepute.
The bot would only deal with bulleted lists. Non-bulleted material would (I think) be too likely to be a relevant comment.
The bot could (as it does at present with the date pages) deal with a specified section or sections of pages - this may be useful in things like the "Notable Alumni" list sections that many school pages have (and which are ridiculously frequent targets of both vanity additions and, more unpleasantly, attack additions).
I think it best that it continue targeting only the edits of anon and very new users. It won't be perfect - nothing is - and this way any autoconfirmed user can undo any mistakes it makes quickly and easily.
As (I very much hope) at present, the bot would leave talk page warnings that are as non-condemnatory and as helpful as possible; Wikipedia:AGF means that the assumption must always be that the user has made a typo or formatting error, and should be given clear and friendly advice on how to fix the problem.
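
The rules listed above can be sketched roughly as follows. This is only an illustrative sketch in Python, not PseudoBot's actual Perl code: the `Edit` record, the `watched` mapping of registered pages to sections, and the `existing_titles` set (standing in for a lookup of which articles exist) are all assumptions made for the example, and title matching is simplified to exact string comparison.

```python
import re
from dataclasses import dataclass, field

# Captures the target of a wikilink, ignoring any "|label" or "#section" part.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


@dataclass
class Edit:
    """Hypothetical summary of one edit, as the bot might see it."""
    page: str
    section: str
    editor_is_anon: bool
    editor_is_autoconfirmed: bool
    added_lines: list = field(default_factory=list)


def redlinks_in_bullets(added_lines, existing_titles):
    """Return link targets added in bulleted lines that have no article."""
    found = []
    for line in added_lines:
        if not line.lstrip().startswith("*"):
            continue  # non-bulleted material is left alone
        for target in WIKILINK.findall(line):
            title = target.strip()
            if title and title not in existing_titles:
                found.append(title)
    return found


def should_revert(edit, watched, existing_titles):
    """Apply the proposed rules: registered page, optional section
    restriction, anon or very new editor, redlink in a bulleted line."""
    sections = watched.get(edit.page)
    if sections is None:
        return False  # page was never registered with the bot
    if sections and edit.section not in sections:
        return False  # only the named sections are watched
    if not edit.editor_is_anon and edit.editor_is_autoconfirmed:
        return False  # established editors are never reverted
    return bool(redlinks_in_bullets(edit.added_lines, existing_titles))
```

Here an empty set of sections is taken to mean "watch the whole page"; a real deployment would also need the registration mechanism and the talk page notice discussed above.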

I look forward to your suggestions.

Discussion

I really don't like this... I think on date pages, it can be somewhat safely assumed that an anon/new user adding a redlink is a nonconstructive edit, but I think if this is going to be expanded to pretty much any article with a bulleted list it's going to need some more complex heuristics than: "redlink + new user == bad" as well as a 1 revert/user-article-day maximum. There was an issue with VoABot a while ago about it reverting every edit by new users to specific articles. This isn't the same thing, but it's a little too close for my liking. Mr.Z-man 03:06, 16 October 2008 (UTC)

There's no way this would be appropriate for most articles with bulleted lists - this is for a minority that have a big problem with inappropriate redlink addition. Pseudomonas_(talk) 08:24, 16 October 2008 (UTC)

It would be a good idea to spam the noticeboards; I think this needs wide input before approval. BJ^Talk 23:01, 16 October 2008 (UTC)

The problem is that right now it's restricted by its approval to only certain types of pages; this would remove that restriction and leave the decision to run it on a page entirely up to ...? Mr.Z-man 23:17, 16 October 2008 (UTC)

Admins only? Just give the bot a fully-protected page (with, say, a semi-protected discussion page) that's a list of the top-level categories it's to patrol. Philip Trueman (talk) 16:57, 17 October 2008 (UTC)

No reason to semiprotect the talk page, but yes, this is one sensible idea, I think. Categories or lists-of-lists would be a good starting point - and it'd be easy to allow special-casing of pages in or out. Pseudomonas_(talk) 17:06, 17 October 2008 (UTC)

(edit conflict) Well, I see various ways it could be done - it could be an admin decision (analogous to page semiprotection) - this would be easily doable (e.g. by having the bot config page be fully-protected). The BAG could approve it for a specific named subset of pages (and subsequently any change in this list), or specify a set of criteria that would make a page eligible for being watched by the bot. A new or existing group of Wikipedians could take charge of its direction. I would like input as to what people here think is most practical. Pseudomonas_(talk) 17:03, 17 October 2008 (UTC)

This bot might be able to start its work quietly, using a set of pages that was manually maintained by the bot owner. Then let him be the one to carefully follow the Talk pages of those articles (and the bot's own Talk page) to see if there were objections. I have a few surname articles on my own watchlist that I clean up periodically, such as List of people with surname Moore. Some of those might be a good choice for redlink removal.

Have any trials been made with this bot since the discussion at the Village Pump in July? I assume that 'PseudoBot 4' is a request for PseudoBot to be allowed to do more things. So the only previous approval for us to look at is Wikipedia:Bots/Requests for approval/PseudoBot. Let us know if there are other links we should be reviewing to see whether the PseudoBot 4 approval is a safe bet. Give us links to PseudoBot 3 etc if you think they are relevant. EdJohnston (talk) 17:52, 17 October 2008 (UTC)

No trials yet, no - depending on the outcome of this discussion a trial may be asked for, which will allow a decision on whether to authorize this bot activity long-term. Pseudomonas_(talk) 18:00, 17 October 2008 (UTC)

How about a trial that allows removing up to 500 red links, and has to be finished in three months? The bot operator would take responsibility for the correctness of all edits, would choose the articles himself, and would respond promptly to any complaints. EdJohnston (talk) 21:37, 17 October 2008 (UTC)

I think a trial would be a good idea. I also think that such a bot would perform a valuable service, given the number of safeguards that are or could be built in: (a) Posting on each article talk page of planned use of the bot, and then proceeding only if there is general consensus that the bot would be useful; (b) Making sure that there is an invisible comment at the top of sections of articles that the bot is used on, stating that links should be added only for existing articles - that is, only for notable people; (c) only using this for bulleted lists; (d) only reverting edits of IP editors and non-autoconfirmed (new) registered editors; (e) posting a note on user talk pages explaining the revert and suggesting, if the problem was simply a misspelling, that the editor post the name again, correctly spelled. Given all those safeguards, the percentage of edits by the bot that will be wrong will be very, very low.

I also think that we don't want to create unnecessary bureaucracy. We already have WikiProject Lists; let them monitor the situation. I'd also like to assume good faith here - I can't see any reason why the bot owner would want to get involved in controversies with established editors about unnecessary removals of redlinks. If that does happen, fine, the bot can be modified or shut down. But to stop the bot before it has even been tried, out of concern that something might go wrong, seems to me to assume bad faith (that the bot owner will use the bot excessively and ignore complaints), and to also assume that Wikipedia processes somehow would fail to correct the situation should it get out of hand. I disagree with both assumptions. -- John Broughton (♫♫) 16:15, 18 October 2008 (UTC)

The main issue I have is that, as far as anti-vandal bots go, this is pretty basic. Granted, it's designed to only deal with 1 particular type of vandalism, but it still seems somewhat crude. Redlinks aren't inherently bad. It just assumes that if one is added by a new user, it's probably vandalism, which has the potential to be Wikipedia:BITE-y. I think some improved heuristics to determine the likelihood that a particular edit is vandalism should be developed before the scope is allowed to expand. Also, it needs some sort of rate limit to make sure it doesn't repeatedly revert the same user on the same article, though that should be implemented regardless of whether or not the scope is expanded. A bot edit warring is really bad. Mr.Z-man 18:17, 18 October 2008 (UTC)
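
One way the rate limit asked for here (at most one revert per user per article per day) could look is sketched below. It is an assumption-laden illustration in Python rather than anything PseudoBot implements: the `RevertLimiter` class and its in-memory dictionary are hypothetical, and a real bot would need to persist this state between runs.

```python
from datetime import datetime, timedelta


class RevertLimiter:
    """Hypothetical limiter: one revert per (user, article) per window."""

    def __init__(self, window=timedelta(days=1)):
        self.window = window
        self._last = {}  # (user, article) -> time of the last allowed revert

    def allow(self, user, article, now):
        """Return True and record the revert if none happened within the window."""
        key = (user, article)
        last = self._last.get(key)
        if last is not None and now - last < self.window:
            return False  # already reverted this user here recently
        self._last[key] = now
        return True


limiter = RevertLimiter()
start = datetime(2008, 10, 18, 18, 17)
first = limiter.allow("192.0.2.1", "List of photographers", start)
second = limiter.allow("192.0.2.1", "List of photographers", start + timedelta(hours=3))
```

With this in place, a second revert of the same user on the same article within the day would be declined and left for a human to handle, which avoids the bot edit warring.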

Well, if there's consensus that we can put a "Do not add people without Wikipedia articles to this list" notice, then it's reasonable to expect people to stick to it (and very politely explain to them how to go about remedying the situation). If there's not agreement on that, then that article is a bad candidate for the bot's attention. Pseudomonas_(talk) 19:36, 18 October 2008 (UTC)

I don't consider adding a redlink to be vandalism, unless it's obviously a bogus name; mostly it's just a mistake. That's why I suggested making sure that an invisible comment is added to such sections/lists, and why it's inappropriate to do a user warning for editors making this mistake. So I don't agree that this is an "anti-vandal bot". BITEy-ness is basically in the way a message is worded, not in the act of removing an error; if BAG wants to review the message that erring editors will see, more power to it. -- John Broughton (♫♫) 15:44, 19 October 2008 (UTC)

The bot (in its current incarnation) is nominally antivandalism, and a small percentage of its edits do revert vandalism - I'd guess maybe 5-10% are obvious vandalism, probably a smaller proportion are typos and mis-formattings (and most of the editors in those cases re-insert the link with the typo corrected); most of the balance is people in good faith making entries that don't belong on the page, and need removing (by either a human or a bot). If the user notices can be made less bite-y then I'll be happy to take any and all suggestions on board. Pseudomonas_(talk) 16:17, 19 October 2008 (UTC)

The problem is with the concept that "notable" (or more correctly "wikipedia-notable") = "has an article on Wikipedia". However, most editors already go by the rule that the two are equal, and in my experience are much more bitey about it than any bot could be. 86.44.27.83 (talk) 19:36, 3 December 2008 (UTC)

In general, I agree; I also think that there are some lists that aren't the place to introduce a subject that hasn't already had an article made - making the article stub first would be the way to do it. Pseudomonas_(talk) 10:36, 4 December 2008 (UTC)

What's the status of this? X clamation point 04:31, 28 January 2009 (UTC)

I'm still waiting to be told by the BAG that I should run a trial / provide more information / make certain changes / go away and stop bothering people. Pseudomonas_(talk) 19:59, 29 January 2009 (UTC)

Denied. I can't help but agree with the concerns above regarding new user + addition of redlink != vandalism 100% of the time. Furthermore, ClueBot's redesign using a neural net will be able to pick up the instances of redlink addition that are vandalism after it 'learns' which is which. This is much more efficacious than a blanket reversion of all redlink additions. Richard ⁰⁶¹² 22:58, 11 February 2009 (UTC)

The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.