Wikipedia talk:Edit filter/Archive 3

This page is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Filter 102

I am concerned that the filter blocking unacceptable user names is generating a lot of false positives. Unfortunately, that filter is set to "private", barring Wikipedia users in general from inspecting it. While this privacy might have good reason behind it, someone ought to look at the usernames actually being blocked and see whether the ones being blocked ought to be blocked. — Rickyrab | Talk 04:10, 28 March 2009 (UTC)
Um, never mind, I read the description of that filter above. So far, it seems like the only post-change username added was "Anonymous is not a group.", which suggests a 4chan user was trying to push his/her luck. — Rickyrab | Talk 04:13, 28 March 2009 (UTC)
One example of false positives in usernames: blocking "admin" in the names of users who are not admins was catching users whose names include the word "Badminton". Anthony Appleyard (talk) 20:35, 30 March 2009 (UTC)
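The "Badminton" case can be illustrated with a minimal sketch in Python (not the filter's own rule syntax). The real username filter is private, so both patterns below are hypothetical: the first matches "admin" anywhere in the name, the second only when it is not embedded in a longer run of letters.

```python
import re

# Hypothetical patterns for illustration; the actual filter rule is private.
NAIVE = re.compile(r"admin", re.IGNORECASE)

# Match "admin", "admins" or "administrator" only when the word is not
# part of a longer run of letters (so "Badminton" is left alone).
BOUNDED = re.compile(r"(?<![a-z])admin(?:istrator)?s?(?![a-z])", re.IGNORECASE)


def naive_hit(username: str) -> bool:
    """Substring check: flags any name containing 'admin'."""
    return NAIVE.search(username) is not None


def bounded_hit(username: str) -> bool:
    """Boundary-aware check: flags 'admin' only as a standalone word."""
    return BOUNDED.search(username) is not None
```

The boundary-aware version still has a cost: a name that runs words together without separators would slip past it, so it trades some missed matches for fewer false positives.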

Archiving consensus

As this talk page is getting rather large, is everyone OK with adding a MiszaBot II auto-archive template? It Is Me Here ^{t / c} 22:09, 28 March 2009 (UTC)

Yeah, I'm sure it's fine. How about 10 days old? ╟─Treasury Tag►contribs─╢ 22:10, 28 March 2009 (UTC)

Changed to 20, as a compromise. It Is Me Here ^{t / c} 17:50, 29 March 2009 (UTC)

Fuckups continue to accumulate

What about some kind of system whereby two different people are required to set a filter to disallow? Or perhaps some kind of follow-up dialog when trying to set a filter to disallow, asking the admin if they've tested the filter first? Or heck, restricting AFE to people who know what they are doing? I'll be the first to remove myself from the group, even though I have the common sense not to screw too much with things that I don't understand. –xeno (talk) 16:46, 29 March 2009 (UTC)

What about needing a clear consensus before setting a filter to disallow? You know, some kinds of rules on how to use this thing? Right now everyone's just doing whatever he feels is best, and that's not really the way to go with a tool this powerful. --Conti|✉ 16:51, 29 March 2009 (UTC)

Yep, I tried proposing that solution above. It seems we sysops have less common sense than is commonly thought. –xeno (talk) 16:53, 29 March 2009 (UTC)

Or removing the abuse filter editor rights from abusers, which I'm going to do now. Cenarium (talk) 17:08, 29 March 2009 (UTC)

Yes, if somebody consistently overreaches their abilities, and fails to recognise this, then remove their ability to edit the filter. Although I don't know the details of what happened, in general terms one mistake is bad but correctable, two similar mistakes = remove access. Tim Vickers (talk) 17:14, 29 March 2009 (UTC)

Why not both? Apart from emergencies, there's really no need to set a filter to disallow immediately, or to test it for just a day (and then set it to disallow because it had 0 false positives (and 0 hits in general)). --Conti|✉ 17:46, 29 March 2009 (UTC)

Of course, filters need to be tested beforehand. Cenarium (talk) 18:21, 29 March 2009 (UTC)

Someone set a filter I created to disallow before it ran long enough to check for possible false positives. At the very least the most active filter editors need to be consulted on whether they think the filter is ready. (Mgm, not logged in) - 87.211.75.45 (talk) 19:18, 29 March 2009 (UTC)

Confirming the above was me. I think we should have "disallow discussions" that run for 5 days and not have actionable opposition before it is enabled. - Mgm|^(talk) 19:47, 29 March 2009 (UTC)

Sure, we have to be careful to make sure that the filters we create do not have (too much) collateral damage. And if filters are shown to have unacceptable or too much collateral damage, then the disallow should be disabled. However, there are two points against 'achieving consensus first'. We do not (always) achieve consensus before blocking editors, before semi or even fully protecting pages, or blacklisting external links. However, all three (except maybe for specifically blocking named editors or really static IPs) often result in collateral damage (blocked IPs may be used by other users, protection often excludes huge numbers of innocent IPs and users, and blacklisting sometimes results in good information not being able to be linked). Still we often do those actions without achieving consensus first! Moreover, using the abuse filter for excluding e.g. a /8 range from editing a specific page will have collateral damage, and we will see that on such filters. However that collateral damage is way less than semi-protecting the page (which excludes a /0 and new editors!), with the semi-protecting having the additional problem of not being seen! Similarly blocking a /24 excludes the editors from editing everything, while the filter can be made to disallow only a handful of pages, which makes the chance of false positives way smaller (but not 0!).

However, I agree that most filters which are deemed to be set to disallow and which do not specify a usergroup significantly smaller than 'all IPs and new editors' OR just a handful of low-traffic pages should first be tested for a couple of days, and if others expect collateral damage, a discussion about that should be here (or well, somewhere). Specific filters on specified low-traffic pages, or on very small user groups are generally for specific current ongoing vandalism, and can (should?), IMHO, be enabled without discussion and testing.

I think we should consider that on some filters we are going to have collateral damage, and accept that that is unavoidable (though I have not followed the bot, I guess AntiAbuseBot will also have blocked editors who may have been good-faith editors??). On the other hand, I do agree that if an abusefilter editor puts filters on disallow which give significant collateral damage, and does that repeatedly, the community should consider removing the rights for some time. --Dirk Beetstra ^{T C} 21:11, 29 March 2009 (UTC)

Seriously what the fuck? For starters let's not add another layer of bullshit by voting on filters, that is lame. Rather, can Werdna please consider the following:

Alter the abuse filter so that a rule must exist for 3 days before it can be set to any action other than notify or log. This would FORCE AFEs to wait and TEST their filters prior to putting any damaging actions on them such as disallow or auto-confirm removal.
Remove the ability to add and remove the abuse filter alteration right from sysop, making it by direct request (handled by crats) similar to RFA, where experience and knowledge are the criteria. We DO NOT need 1600+ users with the ability to severely impede progress on Wikipedia with stupid filters that do not work. Moreover, looking at the list of people with the AFE right, I would bet 1/3 of those people do not know what they are doing. The abuse filter has the potential to cause so much more damage than sysop by itself, so let's not just give it out willy nilly, as in actual fact it should be held to a higher standard than sysop.

And to think werdna wanted to enable blocking and rights stripping straight away... «l| Ψrometheăn ™|l» (talk) 23:47, 29 March 2009 (UTC)
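The three-day waiting rule proposed above could look roughly like the following Python sketch. It is an illustration of the proposal only, not code from the AbuseFilter extension; the function name, the action names, and the idea of passing in a creation timestamp are all assumptions.

```python
from datetime import datetime, timedelta

# Assumed names: actions that only record or announce a match.
PASSIVE_ACTIONS = {"log", "notify"}

# The waiting period suggested in the proposal.
MIN_AGE = timedelta(days=3)


def can_set_action(created_at: datetime, now: datetime, action: str) -> bool:
    """Return True if a rule of this age may be switched to `action`.

    Passive actions are always allowed; anything else (for example
    "disallow") requires the rule to have existed for at least MIN_AGE.
    """
    if action in PASSIVE_ACTIONS:
        return True
    return now - created_at >= MIN_AGE
```

A hard gate like this has no emergency override, which is the objection raised later in the thread for narrowly targeted filters.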

There are several issues here. One, the abuse filter is super powerful, and by extension super dangerous if used inappropriately by people who either don't understand what they are doing or don't exercise adequate care in testing and thinking about specific rules. Reasonable behavior demands that due care and testing be done in order to catch errors and false positives. At the same time, Beetstra correctly notes that some rules can be far more limited than the blocks and protections that we routinely allow admins to unilaterally apply. That is to say, one shouldn't need a huge bureaucratic process in order to deploy a rule that correctly and narrowly targets a specific vandal. Nor would it be a good idea to give a vandal free rein for several days while we discussed it.

I'd suggest the following:

We write up a best practices guideline for writing and modifying rules, with emphasis on due care and testing. I'd further recommend that we allow people to deviate from that, but that they bear the responsibility for that and if significant errors occur due to lack of testing then they risk losing the AF right. The guide could also discuss what tolerable thresholds for collateral damage might be in cases where there is no practical way to eliminate all errors.
We move control of the AF modify right to the crats so it seems like more of a big deal and less of a playground.
We agree on some process (perhaps just local discussion and consensus, but something) for approving the adding and removal of the AF right in the same way that BAG makes approvals about bot flagging. When necessary we use that process to deflag users that create problems.
We modify the rights settings so that all admins can read hidden rules (right now AF-modify is required for reading the protected rules and so some admins have adopted it just for that purpose).

I want to be liberal about granting people the ability to work here, but when your actions have the potential to affect every single edit, they also need to carry a sense of caution and thoughtfulness. And if people make a habit of engaging in careless or reckless behavior then they shouldn't be working here. Dragons flight (talk) 01:12, 30 March 2009 (UTC)

I'm disturbed by the "I want to be liberal" statement, but I agree with most of the other stuff. I think that the ability to read private filters should remain with the AFE group and not be transferred to sysop. It would only take 1 rogue or disgruntled admin of the 1600+ to hand a full copy of the private filters to, say, Wikipedia Review or Wikipedia Watch and we would be back in the days of the public title blacklist. «l| Ψrometheăn ™|l» (talk) 03:03, 30 March 2009 (UTC)

I want $100 million dollars and an end to all vandalism on Wikipedia as well... Read it that way and focus on the clauses that follow it. Dragons flight (talk) 03:14, 30 March 2009 (UTC)

So I'm coming in cold here, but was it ever discussed that it might not be a good idea to put blocking and desysopping abilities in the hands of an automatic filter? What sort of actions, exactly, are required to trigger this response?--HereToHelp ^{(talk to me)} 03:11, 30 March 2009 (UTC)

Blocking and desysopping are not available at this time, so we haven't yet answered that question. Dragons flight (talk) 03:14, 30 March 2009 (UTC)

I totally disagree with Promethean here. Yes, this filter can cause much more damage than blocking, protecting or blacklisting, but that depends on the filter! The filter can, and should, also be used to keep out current cases of vandalism, where blocking, blacklisting or protecting does WAY MORE damage than a specific filter, or where these actions simply don't help. That does not mean that we should be reckless, or that AFEs who are too reckless should not smell the trout from a really close distance, but having to test the filter before being able to put it to disallow means that we have to use the harsher methods of blocking/protecting/blacklisting to keep out the current vandalism, which makes the testing of the filter totally, utterly useless. Please apply some common sense to this.

I am all for Dragons flight's 4 options, but there should be no real process for acquiring the right for admins (point 2), and keeping in mind that in all cases collateral damage is compared to the collateral damage caused by the alternatives, and also keeping an eye on the extent of the vandalism which is supposed to be caught (point 3). --Dirk Beetstra ^{T C} 10:22, 30 March 2009 (UTC)

So you're saying that we should not test the filters that disallow or block autopromote at all; rather, we should throw them together like a burger from Burger King and pray for the best? Sorry, but if that's your standard, it's not anywhere near good enough. «l| Ψrometheăn ™|l» (talk) 11:45, 30 March 2009 (UTC)

I don't entirely get your point. Because some other admin actions can cause more collateral damage than some of the filters, there's no need to do anything about the latter? Why's that? Not to mention that, right now, the large majority of the filters are not about a small group of users or a few articles, they check entire namespaces and/or every IP and every new user. So, your argument seems valid only for a very small portion of the current filters, and I entirely agree that those are not the problem. The problem is the filters that check a large majority of the edits. We need clear guidelines for those, and quite probably a rule that prevents such filters from being set to disallow immediately (with the WP:IAR exception of emergencies, as usual). --Conti|✉ 11:53, 30 March 2009 (UTC)

Now where did I say that? --Dirk Beetstra ^{T C} 11:58, 30 March 2009 (UTC)

Then I really don't get it. :) Do you support a rule (or guideline) saying that there should usually be a consensus before setting a filter to disallow? --Conti|✉ 12:24, 30 March 2009 (UTC)

I understand where Dirk is coming from, and I have no problem if someone who is experienced in these types of things throws up a well-thought-out emergency filter. What I don't want to see is someone who has never made an abuse filter before dive headfirst into things and set their first attempt at a filter to "disallow". So we need some way of preventing this. –xeno (talk) 12:32, 30 March 2009 (UTC)

That is roughly what I meant in an earlier post in this thread, and what I also meant here in the first post of today: if the filter is not specifying a relatively small group of specific users (relatively small with respect to the number of users which would be blocked using e.g. semi-protection, or huge rangeblocks), and/or a specific subset of pages (or even one page), then it should be tested first. If the filter is going to apply to practically all new users on all edits, then testing needs to be done, and the fail rate should be lower than 1% (maybe even way lower). But building in limitations that the filter HAS to be tested first before it can even be enabled has the result that specifically targeted filters with minimal collateral damage (collateral damage compared to blocking a range, blacklisting a link or semi-protecting a page) will never be used, as during the test period the methods with more collateral damage will be applied. I am pretty confident with the system, and still I try to test first, but one of the rules has been activated without testing, as it is a specifically targeted rule on one page which targets a smaller set of edits than would be blocked with a semi-protection of the page; semi-protection would be needed immediately and hence would make the rule useless and untestable. So waiting for a test period to end would there have resulted in either more vandalism, or some other admin getting fed up with it and semi-ing the article. --Dirk Beetstra ^{T C} 14:01, 30 March 2009 (UTC)

I tend to agree. The comments about "praying for the best" make it sound like whether a filter works is just some random, unpredictable event. I would agree with some sort of process to evaluate competency before giving out the editing right, but a full RFA-like process is unnecessary and counterproductive; potentially competent users will be turned off by the prospect of a weeklong barrage of questions. The filters aren't that complicated to make. Anyone with some basic programming knowledge should be able to figure it out pretty quick. Each filter is basically like one "if" statement. Mr.Z-man 20:26, 30 March 2009 (UTC)

Writing regular expressions clearly isn't rocket science but it's equally clearly beyond the competence of most administrators, who were not in any case voted on because of their programming skills. --Malleus Fatuorum 21:46, 30 March 2009 (UTC)

That's basically my point. There needs to be some sort of verification of competency, but a full RFA-like system (like was proposed above) for people who have already gone through RFA is excessive. Mr.Z-man 22:40, 30 March 2009 (UTC)

What about adding something like this to the guideline: "Any filter that affects a large number of edits (by searching through entire namespaces and/or user groups) should be tested for a few days before being set to "disallow" or "warn"." --Conti|✉ 22:06, 30 March 2009 (UTC)

Could the testing software be improved? I had the opposite problem from most - when I was writing a filter - no hits. But the testing software doesn't report no hits so I didn't know where the problem was. Perhaps the problem of too many hits can also be helped with better feedback from the testing software. Will Beback talk 22:55, 30 March 2009 (UTC)
- The batch testing interface allows one to target the history of a specific user, so if you know a vandal that the filter should match you can verify that it does. In the long term it would be nice to have a static cache of say ~100,000 edits (12 hours of editing) that one could test filters against for false positives and efficiency. Dragons flight (talk) 23:50, 30 March 2009 (UTC)
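Testing a rule against a stored sample of edits, as suggested above, might be sketched like this in Python. Everything here is hypothetical: the `Edit` record, the hand-assigned `is_vandalism` label and the `evaluate` helper are assumptions for illustration, not part of the extension. The report always includes the hit count, so a rule that matches nothing is shown explicitly as zero hits.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class Edit:
    user: str
    text: str
    is_vandalism: bool  # label assigned by a human reviewer (assumed)


def evaluate(rule: Callable[[Edit], bool], edits: Iterable[Edit]) -> Dict[str, float]:
    """Run `rule` over a sample of edits and summarise the outcome."""
    edits = list(edits)
    hits = [e for e in edits if rule(e)]
    false_positives = [e for e in hits if not e.is_vandalism]
    # With no hits there is nothing to be wrong about, so report 0.0.
    rate = len(false_positives) / len(hits) if hits else 0.0
    return {
        "checked": len(edits),
        "hits": len(hits),
        "false_positives": len(false_positives),
        "false_positive_rate": rate,
    }
```

A rule would be passed in as an ordinary function or lambda, for example `evaluate(lambda e: "pills" in e.text, sample)`, and the resulting rate compared against whatever threshold is agreed for setting a filter to disallow.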

Another possibility for testing is to test filters on test.wikipedia.org with testing socks before deploying them live here. Dcoetzee 07:34, 31 March 2009 (UTC)

One of the features which I really would like to have on the testing facility here is being able to test on pagename. Now you can only test one sock at a time, but if a sockfarm has hit one page in the last 100 edits, that would greatly enhance the test-facility.

I like something along the lines of Conti's addition to the guideline, if we specify 'large number of edits' being 'more than the number of potential edits being hit by a page-semi-protection, or a /16 rangeblock', then I would agree with that addition. --Dirk Beetstra ^{T C} 08:29, 31 March 2009 (UTC)

Self edits to userspace

Ignore these mostly. Filter 65, however, is warning users against giving themselves userpages; I have seen acceptable userpages not saved by newcomers because of it. --Clark89 (talk) 19:46, 30 March 2009 (UTC)

Do you have an example? Skimming the log the only abandoned User: edits I immediately noticed were spammy pages we wouldn't want in userspace anyway. Dragons flight (talk) 20:01, 30 March 2009 (UTC)

Edits to a user's own userspace have since been excluded if we're talking about the filter I've seen this afternoon.- Mgm|^(talk) 18:14, 31 March 2009 (UTC)

Naive question

If all admins are able to give themselves the "Abuse Filter editor" position, why not enable it for all sysops by default? After all, that's what the situation is with rollback; admins have rollback by default, since the function is packaged with the sysop user class. Am I missing something...? —Anonymous Dissident^Talk 05:39, 31 March 2009 (UTC)

This was discussed elsewhere, and I forget where, but this is basically a solution in search of a problem. The number of people needed to maintain the abuse filter system is small, and those that need to have the right can easily be given it by admins, and admins can give it to themselves. The current system works fine. I, for one, have no use for the tool, and so have not ticked it for myself. You could put it in the sysop package, to be sure, but I would never have use for it, so would know no difference. It's certainly a reasonable idea, in theory, but there does not seem to be much demand to do it, so the status quo, which works fine, seems like the way to go... --Jayron32.talk.contribs 05:48, 31 March 2009 (UTC)

There is a poll above (about bundling AFE to the admin package). Ruslik (talk) 08:47, 31 March 2009 (UTC)

Filter 61

The last diff by KnightLago is weird. Not only is it not clear what exactly was changed, but the filters also suddenly seem to have gotten the idea they need reference formatting. - Mgm|^(talk) 17:59, 31 March 2009 (UTC)

Never mind the first question. He appears to have just changed the title. - Mgm|^(talk) 18:11, 31 March 2009 (UTC)

"Notes" boxes

Should we start putting filter notes at the top of the box, rather than at the end, so that the latest comments are visible without having to scroll down? NawlinWiki (talk) 16:04, 31 March 2009 (UTC)

That might be a good idea, but only if everyone will be doing it that way. If not, it will result in a confusing chaos, so it might be better to do it the standard way for now. We really need a better way to discuss the filters in general, tho, the "notes" box is pretty suboptimal for that (since it was never intended to be used that way, anyhow). --Conti|✉ 23:48, 31 March 2009 (UTC)

Corollary

Shouldn't we format the false positives page so that the latest entries go at the bottom? That way we can make a "click this link to report a false positive" with the entire template preloaded into it. - 131.211.211.247 (talk) 07:27, 1 April 2009 (UTC)

Discussing a specific filter

I want to discuss filter 117 (the one detecting removal of Category:Living people). Where is the best place to do that? Currently, there are false positives I can't explain (like this) and some I can explain (like this). The latter is simply a case of someone replacing the category with {{Lifetime}} (which contains the category - people have argued against putting sensitive categories in templates, but consensus seems to support it here when it was raised at a TfD). Another false positive (here) was where someone added pipe-sorting to the category, but didn't remove it. Typical example here is a correct detection, but it is also a correct edit (saying the person has died). However, it would be nice to have a separate logged stream of such "death announced" edits, so people can check sources were used and confirm the news. And this is just plain wrong. It seems to think that because the category appears on both sides of the diff, it got removed then added. Strange. Carcharoth (talk) 23:43, 31 March 2009 (UTC)

This is a fine place to discuss it. I've made changes that should address the pipe, the template, and in your last example the addition of unnecessary whitespace. Dragons flight (talk) 04:59, 1 April 2009 (UTC)
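The three false-positive cases discussed here (a pipe sort key, the {{Lifetime}} template, and stray whitespace) can be shown with a small Python sketch. This is not the actual rule in filter 117; the patterns and function names are assumptions, and the sketch simply treats any use of {{Lifetime}} as still carrying the category, which is a simplification.

```python
import re

# Matches [[Category:Living people]] with optional whitespace and an
# optional pipe sort key such as [[Category:Living people|Smith, John]].
CATEGORY = re.compile(
    r"\[\[\s*Category\s*:\s*Living[ _]people\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)

# Assumption: any {{Lifetime ...}} call is treated as supplying the category.
LIFETIME = re.compile(r"\{\{\s*Lifetime\b", re.IGNORECASE)


def marks_living(wikitext: str) -> bool:
    """True if the text carries the category directly or via the template."""
    return bool(CATEGORY.search(wikitext) or LIFETIME.search(wikitext))


def living_category_removed(old_text: str, new_text: str) -> bool:
    """True only when the marker was present before and is gone afterwards."""
    return marks_living(old_text) and not marks_living(new_text)
```

An edit that swaps the category for a deaths category still counts as a removal, so reported deaths continue to appear in the log.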

What about when people die? –xeno (talk) 05:24, 1 April 2009 (UTC)

I'm not sure whether people want deaths to be exempted or deaths to be logged, or what exactly. Currently, deaths will show up in the log. Dragons flight (talk) 05:36, 1 April 2009 (UTC)

Since there is no reliable way for the filter to check that a reported death is accurate, I'd prefer it if they were logged. - Mgm|^(talk) 11:36, 1 April 2009 (UTC)

Wonderful. Thanks. I did see one edit that got logged, but when I went to the article, the edit wasn't there. Does that mean the edit had triggered more than one filter and had been disallowed by one of them? Carcharoth (talk) 00:54, 2 April 2009 (UTC)

Abuse filter shutdown

Can you remove the abuse filter from the website? 69.141.191.77 (talk) 10:44, 1 April 2009 (UTC)

The developers had it installed, so they can remove it too, but I doubt they will without a very good reason. - Mgm|^(talk) 11:35, 1 April 2009 (UTC)
Can I ask why you might want it removed? -- The Anome (talk) 13:51, 1 April 2009 (UTC)

Filters and Catchers

I see we at the moment have two types of filters in our list (which may be an unintentional separation ..): those which are meant to block vandalism and warn against typical problematic edits, and another set, which may never be set to block, warn or whatever, but which enable users to catch problematic edits (which don't actually violate policy or guideline, or which are significantly error-prone), so that they can then easily be screened later and reverted by hand. I think that the latter is also a great potential use of the filter!

However, the 'catching' filters (those that are not going to be set to warn) are of lesser importance than the blocking ones. Still they do take away resources for possibly filtering edits (we are getting close to having 100 active rules; I don't know where we hit the limit, but probably 200-250? .. I am not sure if that will be 'enough'). But for the catching filters, it does not matter that they are performed in real time; they can be done after the edit is saved, and it does not matter that they are slow (a couple of seconds after the edit is saved is still fine, as long as it does not become minutes, and as long as it does not impede the performance of the real wiki).

I know that this would require a new version of this extension, which hooks into a different part of the software, etc. etc. But I think that it would be great to have a separate, catching-only 'filter'. It gives another level of filtering, rules can be tested there without a significant risk of screwing things up too much, etc. etc. It could also be used to test rules and see how they behave in terms of speed etc. etc. Would this be a useful spin-off of the filter? --Dirk Beetstra ^{T C} 14:59, 1 April 2009 (UTC)

Filter 31 isn't JUST for ASCII art, is it?

I found this out when an edit I made got repeatedly blocked for joking around on Wikipedia:Miscellany for deletion/User:Jimbo Wales. It used, in part, a common Internet meme, as well as Wikipedia's venerable Bad Jokes and Other Deleted Nonsense. 192.12.88.7 (talk) 15:33, 1 April 2009 (UTC) I found myself having to log in to make the joke, unfortunately. 192.12.88.7 (talk) 15:35, 1 April 2009 (UTC)

It was actually tripping on the line immediately above what you added. Dragons flight (talk) 17:13, 1 April 2009 (UTC)

Recent deaths filter

Is it possible to have a filter set up to do something similar to what is done here? Recent example. Detecting death claims is needed, because false claims are a form of abuse it is good to be able to catch (though it does require people checking the log). That page was set up by User:Sam Korn (who pointed it out to me), so maybe ask him if it involves anything more complicated than detecting a change in category from "living people" to "2009 deaths". He might have managed to successfully detect all the other ways people edit an article to indicate someone has died. The problem with that list, and the changes logged by filter 117 (the removal of category "living people"), is that there doesn't seem to be a way to patrol the log, to avoid people duplicating each other's work and checking the same things. Is there a way to patrol a log of an abuse filter? Carcharoth (talk) 00:19, 3 April 2009 (UTC)

Adding patrolling is on the to do list. Dragons flight (talk) 00:25, 3 April 2009 (UTC)

We could add the abusefilter-patrol rights to reviewers in the trial flaggedrevs implementation. Cenarium (talk) 21:02, 3 April 2009 (UTC)

Filter 97 (Personal attacks by new user)

This filter is triggered quite often when anons/new users either revert a talk page blanking, or manually archive a talk page, which is quite unfortunate. Adding "edit_delta > 10000" to the filter might solve that problem, but I'm not (yet!) an expert with all this, so I figured I better ask here first before making the change. :) --Conti|✉ 16:12, 3 April 2009 (UTC)

& (edit_delta > 10000) should do that. FunPika 16:57, 3 April 2009 (UTC)

Alright, thanks! Modified the filter. --Conti|✉ 19:30, 3 April 2009 (UTC)

Any examples? It might be worth killing off the words that are causing problems. BJ^Talk 22:15, 3 April 2009 (UTC)

This is the most recent one, and here's another one. I'm sure there are more, but it's pretty impossible to properly search through the log. --Conti|✉ 22:32, 3 April 2009 (UTC)

The first hit was a bug in the regex ("shit" needed a word boundary), second hit was fine. BJ^Talk 01:16, 4 April 2009 (UTC)

My experience of auto-censor false positives

In email groups that have auto-censors, I have run across or heard of disallowals from these false positives:

"Penistone" (place in Yorkshire (UK)) as "penis"
"Scunthorpe" (steelmaking town) as "cunt"
"wristwatch" as "twat"
"Dick" as slang for "penis" where it clearly means a man's name
"No hard or soft pornography will be allowed" (in an email group's description) refused because of the word "pornography"
"CP" as "child pornography" where it meant "Canadian Press" (in an email group's description)

Anthony Appleyard (talk) 06:22, 30 March 2009 (UTC)

I suppose it's a good thing we don't use one of those auto-censors, then. --Carnildo (talk) 10:03, 30 March 2009 (UTC)

Add: "new yearseve" as "arse". Anthony Appleyard (talk) 21:15, 30 March 2009 (UTC)

But we don't use those sorts of filters, for precisely that reason! ╟─Treasury Tag►contribs─╢ 07:11, 31 March 2009 (UTC)

Add: "specialist" as "cialis". Anthony Appleyard (talk) 16:37, 5 April 2009 (UTC)
- What's your point here? None of this applies to any filter. --Conti|✉ 16:52, 5 April 2009 (UTC)

Global Abuse filters

There is a discussion on meta to enable global abuse filters that will affect wikis with the abuse filter extension, including the English Wikipedia. Please give your input. Thanks. Techman224^Talk 01:00, 5 April 2009 (UTC)

I think, like with global bots, we should opt out of them. Ruslik (talk) 15:19, 5 April 2009 (UTC)

psst, we don't opt out of global bots Mr.Z-man 17:32, 5 April 2009 (UTC)

G. bots can only be used to update interwiki links, which is almost a complete opt out. Ruslik (talk) 17:38, 5 April 2009 (UTC)

With global bots, there are no conflicts possible technically because they are restricted to interwiki tasks and use a different account. Global filters could conflict because they use an extension installed on meta and here. Techman224^Talk 18:05, 5 April 2009 (UTC)

Server sluggishness

I don't think it is an abuse filter problem per se, but all of the filters have started reporting large processing numbers (2-5 times normal). This occurred even with filters that haven't been changed. I suspect this particular problem is some unrelated high load affecting WMF in general. Filters are certainly capable of creating high loads, but my monitoring suggests it is not our fault (at least this time). So for the moment, don't panic. Dragons flight (talk) 05:55, 5 April 2009 (UTC)

Yep, not us, one of the memcached servers died. Dragons flight (talk) 06:03, 5 April 2009 (UTC)

Diff?

Unresolved

– Perhaps a bugzilla could get us diff links? –xeno (talk) 16:46, 6 April 2009 (UTC)

Could "diff" buttons be added to the AbuseLog to make it easy to revert changes? I understand that at least some of the filters should automate reversion at some point, but for now (and for the filters which don't trigger auto-reversion), it would be a great vandal-fighting tool. –Drilnoth (T • C) 03:23, 19 March 2009 (UTC)

That would be helpful, the only problem is not every log entry results in a diff. For those that did though, it would be great. –xeno (talk) 03:25, 19 March 2009 (UTC)

Which ones don't create a diff? (sorry; I haven't looked at all of the filters yet). –Drilnoth (T • C) 03:36, 19 March 2009 (UTC)

i.e. a log entry that resulted in a warning and the user didn't save the edit. –xeno (talk) 03:38, 19 March 2009 (UTC)

Ah, okay. Thanks for the clarification. –Drilnoth (T • C) 03:41, 19 March 2009 (UTC)

A diff view is visible if you click 'details'. Prodego ^talk 03:49, 19 March 2009 (UTC)

The problem is that that diff has information about the changes, but it's not a normal diff so Twinkle can't rollback the edit, and there isn't even an easily accessible "undo" button. –Drilnoth (T • C) 13:26, 19 March 2009 (UTC)

Seconded. Please could (diff) be a link to the html diff? This would facilitate fixing. Thx. -- Chzz ► 17:33, 19 March 2009 (UTC)

Thirded. I'd love that. ViperSnake151 17:35, 19 March 2009 (UTC)

The problem is that on most of them, no edit took place, and there is nothing to roll back. Prodego ^talk 17:36, 19 March 2009 (UTC)

Yep, I mentioned that above, but for the ones that did result in an edit, it would be great to just have the diff. It would probably also reduce load from people clicking on the "details" or "examine" button... as those take a minute or two to come up. –xeno (talk) 17:38, 19 March 2009 (UTC)

See below on the filter 3 discussion. We need to be able to distinguish between edits that were attempted and ones that can and/or need to be undone anyway. A diff link would help. - Mgm|^(talk) 00:29, 20 March 2009 (UTC)

Bambifan101

User:Bambifan101 is a long-time IP-hopping sockpuppeteer devoted to vandalizing articles about Disney-related topics. Would it be possible to craft a filter that works on a combination of their various known IP ranges and topic-dependent words like "Disney" present in either the original article text or the new article text, and then blocks those edits? Autoconfirm isn't any use in this case, because they're known to create sleeper accounts. -- The Anome (talk) 14:12, 1 April 2009 (UTC)

Not exactly. For privacy reasons we can't directly check IPs if someone is using a username, which I assume this person is, but we can target articles with "Disney" in them as well as new-ish editors who only recently got autoconfirmed. Can you identify some recent socks to show what the vandalism looks like? Dragons flight (talk) 14:25, 1 April 2009 (UTC)

See Category:Wikipedia sockpuppets of Bambifan101. There are a few publicly visible /16s which have come up repeatedly in the vandal's IP edits, and they're the ones I suggest we filter: specifically 65.0.0.0/16 and 68.220.0.0/16. 70.146.0.0/16 looks like another good candidate. -- The Anome (talk) 14:32, 1 April 2009 (UTC)

One could write a filter to log whenever someone from one of those ranges anonymously edits a "Disney" topic. I'm not seeing much of a pattern that would allow one to do anything stronger than that. Dragons flight (talk) 14:48, 1 April 2009 (UTC)

Disney, Teletubbies, or my page would be very useful as he seems determined to get me to go back to watching for him. He is now on the 65 range again. *sigh* -- AnmaFinotera (talk · contribs) 19:55, 3 April 2009 (UTC)

Also, he is now back to using his 70.146.X.x IP range. -- AnmaFinotera (talk · contribs) 00:10, 4 April 2009 (UTC)

Just checking to see if there is any movement on this one, as he just struck again with both the IP and a new named sock. -- AnmaFinotera (talk · contribs) 02:04, 7 April 2009 (UTC)

This request has languished because it isn't entirely clear (at least to me) what you want.

We can log every edit by unregistered editors in 65.0.0.0/16, 68.220.0.0/16, and 70.146.0.0/16 to "Disney" / "teletubbies" pages, which may be helpful. We can't do much about logged in editors though. One could log every new editor to a Disney page, but that doesn't seem practical and would be prone to many false positives. We don't get IP information once someone has logged in. Without something more precise than "he edits Disney pages" the logged in socks would be difficult to target. Dragons flight (talk) 02:31, 7 April 2009 (UTC)
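The logging rule described in the comment above can be sketched in Python. This is not AbuseFilter rule syntax and not the filter that was actually deployed; the function name `should_log`, the keyword tuple, and the convention that `user_ip` is `None` for logged-in editors are all assumptions made for illustration.

```python
import ipaddress

# The three /16 ranges quoted in the discussion above.
WATCHED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("65.0.0.0/16", "68.220.0.0/16", "70.146.0.0/16")
]

# Illustrative topic keywords, matched case-insensitively against the title.
KEYWORDS = ("disney", "teletubbies")


def should_log(user_ip, page_title):
    """Return True for an unregistered edit from a watched range to a watched topic.

    user_ip is None for logged-in editors (an assumption of this sketch),
    reflecting the point that no IP information is available for them.
    """
    if user_ip is None:
        return False
    address = ipaddress.ip_address(user_ip)
    in_range = any(address in network for network in WATCHED_RANGES)
    on_topic = any(keyword in page_title.lower() for keyword in KEYWORDS)
    return in_range and on_topic
```

As the comment notes, a rule of this shape can only log; it says nothing about named socks, because the IP test is skipped for them entirely.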

I don't really understand how these new filters work, so I'm not sure how to fully give the info needed. He frequently hits almost any Disney article (films, series, characters, related books), usually removing tags, reverting to very old versions, restoring long removed trivia, etc. He also loves to go blank article talk pages, randomly remove stuff, revert archiving, edit other people's comments. He lately also goes to RPP and tries to get "all the Disney articles" un semi-protected, protection instigated because of him. He often edits talk pages of his confirmed socks and IP socks. Newest thing is randomly hitting an anime/manga just to get my attention. :( Whenever he comes in with an IP, he will usually make a named sock and "double stack", editing one behind the other as usually only one gets reverted when people rollback all his edits, thereby keeping the vandalism he wanted in place. You can see this here[1] and checking the contribs of today's IP Special:Contributions/68.220.175.82 and Special:Contributions/Hahabricks named sock. -- AnmaFinotera (talk · contribs) 02:38, 7 April 2009 (UTC)

My first filter

I tried my first filter, #135, to catch people just holding keys down or copying and pasting 50 times. I'm accumulating improvements from false positives, but also have a few weird ones like this one. Where's the repetition there? Any other comments/ideas? —Wknight94 (talk) 04:47, 6 April 2009 (UTC)

Why is it marked private? --MZMcBride (talk) 07:22, 6 April 2009 (UTC)

I found neither documented standards nor an obvious pattern in existing filters so I basically flipped a coin. —Wknight94 (talk) 11:07, 6 April 2009 (UTC)

Filterable actions

Is it possible to have the filter act based on the deletion of a page? Thanks. Someguy1221 (talk) 09:30, 6 April 2009 (UTC)

Problem with single-quote in regex rlike

I had to undo a change because a single-quote made the whole regex fail. Anyone know how to include a single-quote in a [ ] group? I tried two single-quotes and preceding with a backslash - no go. Thanks. —Wknight94 (talk) 12:18, 6 April 2009 (UTC)

\' should work. — Werdna • talk 01:04, 7 April 2009 (UTC)

Figured out my problem was with the dash, not the single-quote. I had two characters around a dash inside square brackets, and that makes it match anything in the ASCII range between those characters. I made sure the dash was the last character in the square brackets and all is well (except I got shut down for having too many false positives). —Wknight94 (talk) 03:17, 7 April 2009 (UTC)
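The hyphen pitfall described above is easy to reproduce. The sketch below uses Python's `re` module rather than the filter's own `rlike`, on the assumption that character classes behave the same way in both; the three patterns are made up for illustration and are not taken from filter 135.

```python
import re

# Intended: match only an apostrophe, a hyphen, or the letter z.
range_by_accident = re.compile(r"['-z]")  # hyphen between two characters: a range from ' to z
dash_last = re.compile(r"['z-]")          # hyphen placed last: a literal hyphen
dash_escaped = re.compile(r"['\-z]")      # escaping the hyphen also makes it literal

for name, pattern in [
    ("range_by_accident", range_by_accident),
    ("dash_last", dash_last),
    ("dash_escaped", dash_escaped),
]:
    # "A" lies between "'" and "z" in ASCII, so only the accidental range matches it.
    print(name, bool(pattern.search("A")), bool(pattern.search("-")))
```

Putting the hyphen first or last in the brackets, or escaping it, keeps the class limited to the listed characters.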

Penis

Isn't this just the sort of thing the abuse filter is for? I take it this wasn't caught because there were words other than penis in the edit? Rd232 ^talk 12:36, 6 April 2009 (UTC)

I'm new at the filter thing but it seems like the risk of false positives is fairly high. Although I suppose catching all-caps would reduce that risk. Otherwise, it is a clinical term. —Wknight94 (talk) 13:57, 6 April 2009 (UTC)

Surely any edits like this (or other "clinical" terms) by a brand-new anon user (or non-confirmed user) are likely to be vandalism? ~~ [ジャム]^[t - c] 14:09, 6 April 2009 (UTC)

Not so surely. IPs write a large percentage of the legitimate content on this site. You'd basically be saying no IP could easily write content for many of these pages. Something more specific would be needed IMHO, like all-caps "PENIS" or "penis" allowed but flagged for revision or something. Maybe "penis" allowed only if "penis" was already in the article (and that officially breaks my personal record for most uses of the word "penis" in one post). —Wknight94 (talk) 14:35, 6 April 2009 (UTC)

I didn't mean to imply it would be totally straightforward to avoid false positives. But surely a high % of "penis" in an edit by an anon to an article which has no prior mention of it, say, could be at least flagged if not blocked. (and can we use categories to filter too? eg block if it's outside category urology; flag if it's in). Rd232 ^talk 14:53, 6 April 2009 (UTC)
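Two of the narrowing ideas raised in this thread (flag a term only when the page did not already contain it, and treat the all-caps form separately) can be sketched in Python. The helper names and the whole-word matching are assumptions of this sketch, not a proposal for an actual filter.

```python
import re


def newly_introduced(word, old_text, added_text):
    """True if `word` appears in the added text but nowhere in the old page text."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return bool(pattern.search(added_text)) and not pattern.search(old_text)


def is_shouted(word, added_text):
    """True if the all-caps form of `word` appears as a whole word in the added text."""
    return re.search(r"\b" + re.escape(word.upper()) + r"\b", added_text) is not None
```

Either test alone would still produce false positives on legitimate medical content, which is the concern raised above; they only narrow the set of edits worth flagging.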

Why are so many of these "private"?

E.g. "common page move vandalism". Fulfils a purpose previously served by MediaWiki:Titleblacklist quite well without that needing to be private (and still served by it, just to confuse people). Why is it necessary to become an administrator just to find out what you are and aren't allowed to move pages to? Isn't this project supposed to be open or something? —Preceding unsigned comment added by 217.42.77.168 (talk) 16:39, 6 April 2009 (UTC)

See above at #Criteria for a Private Filter. –xeno (talk) 16:41, 6 April 2009 (UTC)

Which doesn't answer my question. And furthermore claims that MediaWiki:Titleblacklist had "little noticeable effect", whoever wrote that has evidently forgotten the occasions when nobody could create or move anything because some administrator fucked up the regular expressions -- 217.42.77.168 (talk) 17:08, 6 April 2009 (UTC)

See Dragons flight's comments at 02:17, 27 March 2009 (UTC). –xeno (talk) 17:11, 6 April 2009 (UTC)

Yeah, the one that starts with "he", implying that this entire feature is only here because of one person, never mind what is good for the rest of us -- 217.42.77.168 (talk) 17:13, 6 April 2009 (UTC)

It's just an example. Now, my main goal in watching this page is to ensure that AbuseFilters don't disenfranchise anon users such as yourself; if there is a specific case that has prevented you from improving the encyclopedia, please do let me know here or on my talk page, or file a WP:FALSEPOSitive report. –xeno (talk) 17:15, 6 April 2009 (UTC)

Can I has old_html?

It's there when I examine an edit, but when I want to use it in the actual filter I get a syntax error. I'd like to use it at Filter 133 so it only catches newly introduced citation errors. --Conti|✉ 20:24, 4 April 2009 (UTC)

old_html and old_text are unavailable for performance reasons. new_html and new_text may end up being killed for the same reason. It is processor intensive to parse the entire text of the new page. The use of added_lines, removed_lines, new_wikitext, old_wikitext, etc. are recommended whenever possible. Dragons flight (talk) 20:42, 4 April 2009 (UTC)

Hmm, dang. Without any of those, it would be impossible to check for the "cite error" error message, right? --Conti|✉ 20:48, 4 April 2009 (UTC)

Yes, the only way to catch parser generated error messages is by parsing the page, unfortunately. Dragons flight (talk) 20:53, 4 April 2009 (UTC)

new_html and new_text are fine, the parse operation has to happen anyway. old_html and old_text should in theory be pullable from the parser cache, but I haven't got there yet. — Werdna • talk 01:58, 7 April 2009 (UTC)

We've had clear examples of new_html timing out the server on very large pages, and it can easily lead to delays that make a difference for user experience. I don't know why one should be able to preview or save a page that one can't filter with new_html, but it is pretty clear that is the case. Dragons flight (talk) 02:03, 7 April 2009 (UTC)

I'm striking the above because I can't seem to duplicate the problem in testing just now. Very large pages still seem to save even with filters like 133 enabled. Which leaves me a bit befuddled. I've certainly seen pages that appeared unable to save in the past, but being unable to duplicate the issue is surprising. Dragons flight (talk) 03:11, 7 April 2009 (UTC)

Well, I've always had problems editing very, very large pages, long before the abuse filter existed. This might be a dumb question, but how did you know that it was the filter that caused the editing problems? --Conti|✉ 10:53, 7 April 2009 (UTC)

I'm not sure what problems DF is talking about, but I'm familiar with the issues MZMcBride was having with saving large pages. On sites with many filters (enwiki and test.wikipedia), edits that added more than ~250,000 bytes via the API timed out after about a minute with a 504 Gateway timeout. On mediawiki.org, which only has 1 filter, and on my normally slow test wiki, which only had a couple of random filters, the edits worked fine. So it's not a lot of data, but it certainly looks like some correlation between number of filters and edit success. Mr.Z-man 23:03, 7 April 2009 (UTC)

Cleanup

I just removed many (12 or so) old filters with little or no hits for performance reasons. Prodego ^talk 22:54, 4 April 2009 (UTC)

You turned off 19 filters (not 12). I have re-enabled 9 of these. 7 were filters with real hits in the last week and performance burdens < 4 ms. The other two were less than 2 days old. In the mean time I've fixed an unrelated filter that was throwing runtime numbers over 50 ms, which means it had more load than the other 9 combined. I'm working on creating better tools for monitoring load, but we shouldn't throw away the highly targeted low load filters just because they get relatively few hits. Dragons flight (talk) 04:00, 5 April 2009 (UTC)

Sorry, but I feel I have to call this 'ridiculous'. I have reenabled another handful. These filters are specifically designed to keep out specific long-term vandalism, and they should all be enabled, as there is no reason to expect that the vandalism has stopped (the 3 hits for the Argentinian IP hopper were all correct on the MO of the vandal, who returned after 1 year of page protection; hitlerbunker is still active on de and was only inactive here because the article was protected, etc. etc.). Please don't disable such rules on 'no hits' or 'likely no hits'; they do what the abusefilter is for: stop abuse. If rules are to be shut down, then consider shutting down purely monitor-only rules. Thanks. --Dirk Beetstra ^{T C} 11:20, 5 April 2009 (UTC)

~~I rest my case. --Dirk Beetstra ^{T C} 11:29, 5 April 2009 (UTC)~~ .. well, not completely, it would not have caught what it should have. --Dirk Beetstra ^{T C} 11:41, 5 April 2009 (UTC)

MZMcBride reported he could not make large (bot, I think database reports) edits during high load times, because they timed out. This is directly a result of doing too many checks. If you reenable a filter, you add a small load. If you reenable 20 low load filters, they add up to one large load. If they get almost no hits, they are wasting resources, which apparently we need, if edits were timing out. I disabled 133 again, since, after testing, 3 out of 4 users who tried to edit a large article (Timeline of United States inventions and discoveries) had their edits time out. Please prioritize the filters, if they are getting almost no hits, they probably aren't worth the processing time. Prodego ^talk 15:03, 5 April 2009 (UTC)

The answer to the observation actually is in the section above. new_html and new_text require parsing the entire page. Those operations are blocking against large pages. Timeline of United States inventions and discoveries actually gave me a >50 second render time last night. Aside from the fact we really shouldn't have any pages that take that long to render, of course the filter will time out if you ask it to do that operation. It is quite possible that new_html and new_text will be entirely removed to prevent these problems, but if they aren't removed they need to be predicated with conditions specifying new_size less than a few kilobytes because many large articles and large discussion pages will kill performance. All of the other variables should give reasonable performance even on large pages (though it may be possible to create combinations and sequences of operations that are unreasonable). Yes we need better tools to evaluate filter performance and prioritize implementation, but your approach wasn't very good because you didn't have a good measure for what was causing problems. Some filters average 4 ms, but will never ever take more than 10 ms, while others might average 10 ms but once in a 10000 operations take seconds to execute. The first behavior is rarely, if ever, a problem. The second is potentially a big issue. Dragons flight (talk) 17:00, 5 April 2009 (UTC)
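The suggestion above, to put a cheap size condition in front of any use of the parsed page text, relies on short-circuit evaluation. Here is a minimal Python sketch of that idea; the 5,000-byte threshold, the `"Cite error"` marker string, and the `render_html` callable are all hypothetical stand-ins, not the filter's real variables.

```python
MAX_PARSED_SIZE = 5_000  # bytes; hypothetical cut-off for "a few kilobytes"


def check_cite_error(new_size, render_html):
    """Run the expensive parsed-HTML check only on small pages.

    render_html is a zero-argument callable standing in for the costly parse.
    Because `and` short-circuits, it is never called when the size test fails.
    """
    return new_size < MAX_PARSED_SIZE and "Cite error" in render_html()
```

The order of the two conditions is the whole point: with the size test first, a very large page costs one integer comparison instead of a full render.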

Now that would have been a better explanation (and I disagree on calling the filters 'useless' ... but this might be part of the thread 2 below here (server sluggishness)). --Dirk Beetstra ^{T C} 15:08, 5 April 2009 (UTC)

That should be unrelated, it is pretty well after I disabled several of them. I reworded the 'useless' above. Prodego ^talk 15:10, 5 April 2009 (UTC)

Hmm, regarding filter 133, would it help if it would only check articles that are smaller than, say, 50k? --Conti|✉ 15:13, 5 April 2009 (UTC)

Yes, although please test it on a 50k article if you do, on a non-autoconfirmed account. Also remember this is a low load time, so if it takes more than 2 or 3 seconds to make the edit, it might time out when the servers are under higher load. Prodego ^talk 15:51, 5 April 2009 (UTC)

In my opinion, the number should be more like 5k not 50k. Dragons flight (talk) 17:22, 5 April 2009 (UTC)

The filter wouldn't be very useful then, sadly. Is there any way to reliably find out with which settings the filter is starting to cause problems? --Conti|✉ 18:43, 5 April 2009 (UTC)

Everything that generates a cite error is already being placed in Category:Pages with incorrect ref formatting which is almost certainly a better approach. Dragons flight (talk) 19:11, 5 April 2009 (UTC)

Well, it would have been a nice way to prevent citation errors in the first place. :) --Conti|✉ 19:15, 5 April 2009 (UTC)

Again, it would be useful to have the test function for pages, with a timer. Then it can be seen how long it takes to test 100 edits on a large page, and from that we would know which filters are likely giving problems. My guess is that it is the rules which check page content vars on large pages (they may even look relatively fast on average, but they do give time-outs on the large ones). --Dirk Beetstra ^{T C} 16:00, 5 April 2009 (UTC)

So I would ask that those of you who reenabled filters please reconsider if the load is worth the filter. All the filters I disabled had less than 10 hits, which seems to be wasting resources to me (10 edits out of how many tens of thousands?), and disable as appropriate. If they aren't hitting anything, they aren't very good filters. Prodego ^talk 18:15, 5 April 2009 (UTC)

You're completely missing the point of targeted filters. Abuse filters are designed to do just that, stop abuse. We have some patterns of abuse that are incredibly common, and so naturally the filters that block them get a lot of hits. We also have some patterns of abuse that are practiced by notorious users or are otherwise very specific: we (finally) have the ability to block that vandalism too using targeted filters. If the filters are now not getting many hits, yet the problems were previously widespread or severe enough for filters to be designed against them, that is an indicator of success, not failure. Happy‑melon 18:28, 5 April 2009 (UTC)

It is more likely that they are just getting around the filter. As I mentioned on Dragons flight's talk page: "Let's use filter 118 as an example. The filter has 2 hits. It has been enabled 5 days, and has a 2.6ms average run time. So if we have about 2 edits per second, that means that in 5 days 37 minutes of server time were used to get those 2 hits, or 18.5 minutes per hit." 18.5 minutes of server time are used to hit each 'bad' edit (one of them was actually a good edit). With that same amount of server time, 56 edits could be hit using filter 30. It is a matter of priorities: if a filter is only stopping a very small number of 'bad edits', then that time would likely be better used on a filter that can catch many times as many bad edits without using significantly more time. For example, filter 118 takes 2.6ms to evaluate, on average. 30 takes 3.1. It is pretty obvious which is a more efficient use of resources. Prodego ^talk 19:07, 5 April 2009 (UTC)
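The back-of-the-envelope figure for filter 118 can be reproduced directly. The sketch below uses only the numbers quoted in the comment above (2 edits per second, 5 days, 2.6 ms average, 2 hits); the variable names are mine.

```python
edits_per_second = 2
days_enabled = 5
avg_runtime_ms = 2.6
hits = 2

# Every edit made while the filter was enabled pays the average runtime once.
total_edits = edits_per_second * days_enabled * 24 * 60 * 60
server_minutes = total_edits * avg_runtime_ms / 1000 / 60
minutes_per_hit = server_minutes / hits

print(f"{total_edits} edits, {server_minutes:.1f} min total, {minutes_per_hit:.1f} min per hit")
```

This gives 864,000 edits and about 37.4 minutes of cumulative runtime, or roughly 18.7 minutes per hit, in line with the rounded 37 and 18.5 quoted above.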

So if vandals get around the filters, we should remove them? Whose side are you on? :D Wikimedia has over three hundred servers: number crunching and concluding that, when you gear up the processing time four hundred and thirty thousand times, you suddenly get a noticeable number, is both factually obvious and totally pointless. WP:PERFORMANCE applies here just as much as anywhere else: the devs have said "this is a potentially server-intensive operation, here are the boundaries we need you to stay within". That does not, and never has, equated to "stay as far away from the boundaries as possible". Wikimedia server time is not pay-as-you-go; the servers are there, using electricity, whether we're using them or not. I'm not saying we should ignore benchmarking or efficiency, of course it's important. I'm saying that what the devs have told us time and again to do, is to do whatever is best for the project, and let them pick up the pieces on the technical side. If that means we have to make a choice between a filter that gets 100 hits a day and one that gets 2, obviously we'll go for the former. You've made a choice between stopping vandalism and not stopping vandalism by doing what's best for the servers, not the project. Ignore the servers until they come and bite us in the arse; then it'll be time to make the hard choices. Happy‑melon 19:31, 5 April 2009 (UTC)

If we are at the point where MZMcBride is reporting he can't make a valid edit because it is timing out due to the abuse filter, then we are at the point where we have to remove some things. I am creating a table showing the efficiency of all the enabled filters at User:Prodego/Sandbox. I suggest everyone take a look. Prodego ^talk 20:18, 5 April 2009 (UTC)

Yes, we needed to fix the one rule that was blocking. The rest of what you disabled was largely irrelevant. Keep in mind that in order to time out the collective server operation has to run longer than about 30 seconds. new_html and new_text can do that on large pages, but almost nothing else will ever even approach a 30 second runtime. Dragons flight (talk) 20:23, 5 April 2009 (UTC)

HM is basically correct. We can't totally ignore performance because it is possible to write rules that have dreadful performance characteristics. (Things like 133 can totally block editing of very large pages, as noted above.) But we also shouldn't freak out about performance. The servers in general have spare capacity. With performance in mind I've personally written multiple patches for the AbuseFilter, and our current execution burden is only about half what it was during the first week with the AbuseFilter despite the increase in rules. The choice between 118 and 30 offered above is a false dichotomy because it is currently reasonable to have both. If and when we start hitting practical performance limits, we would need to start prioritizing more directly. For the moment though, the issue to focus on isn't 2 ms vs 5 ms, but rather well constructed rules taking 10 ms versus poorly considered rules taking 100 ms. I've already been poking at the bad rules fairly aggressively. Dragons flight (talk) 20:19, 5 April 2009 (UTC)

This is true, however, if users are reporting edits timing out (and you can test this yourself too) then there is a problem. The best way to deal with it is to ensure we are using our resources in the most efficient way. The table at User:Prodego/Sandbox shows how the filters use resources. This is useful to know, since it lets us judge if a filter is helping or slowing down edits unnecessarily. Prodego ^talk 20:32, 5 April 2009 (UTC)

Indeed users are reporting edits timing out, and indeed this is a problem. The problem can have one of two causes: a filter which runs abnormally slowly in the particular situation, or having one or more filters which are generally slow. Both are symptomatic of poor filter design of the filter(s) in question. Asserting that the best way to "deal with" this problem is to start disabling other filters is, at best, somewhat bizarre. Happy‑melon 21:33, 5 April 2009 (UTC)

You forget the third option: having too many filters. Prodego ^talk 00:04, 6 April 2009 (UTC)

Prodego, if there are 5 filters with an average of 3 milliseconds, then that adds 15 milliseconds to the average edit. Indeed, removing 3 of them is a huge improvement (60%!). But if one of those 5 actually has a median of 2 milliseconds, but has a processing time of 30 milliseconds on the large pages, then removing the 3 fast ones (which are fast except on the ONE page they actually should hit on!) only improves the whole from 42 to 33 milliseconds (about a 21% gain) .. if 40 milliseconds times out, then indeed, having the three disabled solved your problem, but the problem would be better solved by reducing the slow filter to 15 milliseconds, as that would allow all filters to stay enabled, especially those which do what this filter was designed for: stop abuse! May I again point at my suggestion somewhere else, where we split off filters which are NEVER going to be set to warn, block or prevent into an after-edit abusefilter (it could even be in the same system; I envisage that it is easy to first evaluate the action filters, and if no action is performed, the edit is saved, followed by the processing of the non-action filters; that would even be good for testing!), as they do not have to monitor in real time? --Dirk Beetstra ^{T C} 12:11, 6 April 2009 (UTC)
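The comparison in the comment above can be laid out as arithmetic. The sketch assumes four fast filters at 3 ms each, plus one filter that takes 30 ms on a large page and could be optimized to 15 ms; the 40 ms timeout is the hypothetical figure from the comment, not a real server limit.

```python
FAST_MS = 3             # typical cost of each of the four fast filters
SLOW_ON_LARGE_MS = 30   # the one slow filter, evaluated on a large page
SLOW_OPTIMIZED_MS = 15  # the same filter after a hypothetical rewrite
TIMEOUT_MS = 40         # hypothetical timeout used in the comment

all_five = 4 * FAST_MS + SLOW_ON_LARGE_MS         # every filter enabled
drop_three_fast = 1 * FAST_MS + SLOW_ON_LARGE_MS  # three fast filters disabled
fix_slow_only = 4 * FAST_MS + SLOW_OPTIMIZED_MS   # all enabled, slow one optimized

for label, total in [
    ("all five", all_five),
    ("drop three fast", drop_three_fast),
    ("fix slow only", fix_slow_only),
]:
    saving = (all_five - total) / all_five
    print(f"{label}: {total} ms, saving {saving:.0%}, under timeout: {total < TIMEOUT_MS}")
```

Under these assumptions, disabling three fast filters saves 9 of 42 ms (a little over a fifth), while optimizing the single slow filter saves 15 ms and keeps every filter running.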

Prodego disabled two filters I created that filter out long-term vandalism to two individual articles. As a result, there was an instance of the same vandalism today.[2] My understanding is that the filters only activate in cases where the articles are edited. Can someone indicate exactly how much load is due to single-article filters? Will Beback talk 01:35, 7 April 2009 (UTC)

I'm a bit late, but there's no performance burden for new_html and new_text, the page has to be parsed anyway. — Werdna • talk 01:57, 7 April 2009 (UTC)

Does that mean that the editing penalty of a filter using new_html or new_text is washed out by the time spent on other operations, and that it doesn't apply at all to pages rejected by the first expression? Is there an example of a good filter using new_html or new_text? More information on optimizing filters would be helpful all around. Will Beback talk 10:54, 7 April 2009 (UTC)

Werdna, do you mean that a filter that executes 'new_html' has the same burden as one that does not? So (user_name = "Beetstra" & new_text contains "Blah") has the same execution time for you as for me? And the rule has the same execution time if I edit U of A (1,025 bytes) as if I edit Timeline of United States inventions and discoveries (417,316 bytes)? --Dirk Beetstra ^{T C} 10:58, 7 April 2009 (UTC)

I believe what he means is that there is no performance burden in creating the text that fills NEW_HTML and NEW_TEXT, because that material has already been parsed. That is, there is almost no performance difference between a rule that reads NEW_TEXT contains "foo" and "foobar" contains "foo", as the AbuseFilter already has the text that will fill the NEW_TEXT variable if it is used in a filter. The usual performance impact of evaluating the contains still applies, so the rule you suggest will take longer for an edit by you, Beetstra, than an edit by another user. I could be wrong, but I think this is what is meant. Happy‑melon 11:44, 7 April 2009 (UTC)

It is also what I think, with the small difference that the difference between (<1 Gb of text> contains "foo") and ("foobar" contains "foo") will be a bit, but not too much, quicker for the latter.

The other, however, has the problem that if it filters out 20% of the edits in the first part (say '!"user" in user_groups'), and the average is 10 milliseconds for the overall rule, that means that .. (performs difficult math in his head) .. err .. the rule would run e.g. 4 times with a speed of 5 milliseconds, and 1 time with a speed of 30 milliseconds (now assuming that the average time to run a two-part statement would be 50% for the first statement and 50% for the second statement, which for difficult rules and for 'uneven' procedures may not be the same)? So if that second part gets run on a large page (very likely when filtering on users, but not on pages), then a timeout could occur when editing a large page (if the effect is a bit more extreme than what I describe). --Dirk Beetstra ^{T C} 11:56, 7 April 2009 (UTC)

I expect that the balance is indeed very much more uneven; running the !"user" in USER_GROUP test probably takes << 1ms, but running a contains on a large page could take 100ms. But the same principle applies, as you say: taking the average disguises the outliers where certain situations prompt the rule to take much longer than the average. We know that, on average, the load on the AbuseFilter is acceptable. It's the outliers that we need to be looking at. Happy‑melon 12:39, 7 April 2009 (UTC)
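The point about averages hiding outliers can be shown with the numbers from the example above: four runs at 5 ms and one run at 30 ms. This is a toy illustration, not measured data.

```python
from statistics import mean

# Four cheap evaluations (first condition fails) and one expensive one.
timings_ms = [5, 5, 5, 5, 30]

average_ms = mean(timings_ms)
worst_ms = max(timings_ms)

print(f"average {average_ms} ms, worst case {worst_ms} ms")
```

The average comes out at 10 ms, which looks harmless, while the single 30 ms run is the one that matters when a timeout is in play.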

The key isn't how long it takes to get a variable, that is pretty quick (with the notable exception of added_links and removed_links). But searching large variables takes a lot of time, which is what Beetstra mentioned above. The average is just an average, it doesn't tell the whole story, like Happy-melon mentioned. You guys have got it now. :) Prodego ^talk 21:31, 8 April 2009 (UTC)

Moving 'disallow' action to be "restricted"

Just noting here that it's possible to restrict certain actions ('disallow' comes to mind) to a smaller group of users. Is this desirable? It would be possible to require a bit more of a formal assessment of filter performance before setting a filter to disallow. Not sure about whether we want it or not, just raising it as an option. — Werdna • talk 03:04, 7 April 2009 (UTC)

How is this implemented? Once a filter is set to disallow, does that mean only disallow-able editors would be able to edit the filter? Dragons flight (talk) 03:13, 7 April 2009 (UTC)

The problem now seems to be lack of process. Who determines whether a filter is appropriate? How do they make that determination? Some people just create a filter and immediately make it disallow with zero hits. I created one and watched. The false positives seemed acceptable to me but it was turned off within a few hours for too many false positives. With no published standards, how could I know that? —Wknight94 (talk) 03:31, 7 April 2009 (UTC)

Concur, better guidelines would be helpful. Dragons flight (talk) 03:39, 7 April 2009 (UTC)

And would that mean that you need a 'disallower' to implement emergency filters? --Dirk Beetstra ^{T C} 10:52, 7 April 2009 (UTC)

The general implementation is that you can't save a filter with the 'disallow' action if you don't have a right called abusefilter-edit-restricted. A side-effect would be that the emergency disable mechanism would also disable 'disallow' filters. I do agree with Dragons flight that some guidelines at least, if not some kind of process (especially for hidden filters), would be helpful. — Werdna • talk 14:56, 7 April 2009 (UTC)

I agree with guidelines as well. Restricting it might mean that most can't use it for emergency filters (which are often both set to disallow and hidden per beans), but that should not mean that it does not need to be thought through before they get enabled. --Dirk Beetstra ^{T C} 10:30, 8 April 2009 (UTC)

Performance data

Wikipedia:Abuse filter/Performance. This is still subject to experimentation. It's got about 36 hours of data so far, and hence the (7 day) columns aren't very meaningful yet. Also, the 1 hour column can be quite noisy for a variety of reasons. Please don't use this as a reason to start slashing at things, because the general load is okay right now. But if we do ultimately need to have discussions about prioritizing then this can provide a more tangible and long-term basis for judgment. Dragons flight (talk) 05:24, 6 April 2009 (UTC)

You've got two sets of time columns there. What's the difference between them? --Carnildo (talk) 05:51, 6 April 2009 (UTC)

If I understand your question, the first set is hits in the last X time interval, and the second set is average execution time of the rule during that time interval. Dragons flight (talk) 05:55, 6 April 2009 (UTC)

Would it be possible to get information on processing time for different input sets? E.g. for an 'A & B & C' filter it would be nice to see the processing time for 'A = false', 'A = true and B = false' and 'A = true and B = true and C = false' .. Rules which hit only one page or a small group of editors are generally very fast (1-2 msec), but they might give problems when they actually hit on a really big page. Getting information on 'fastest and slowest processing time measured', or the percentage of the time that the filter runs <1 msec, 1-2 msec, 2-3 msec, 3-5 msec, 5-10 msec, 10-20 msec, and 20+ msec (rules should preferably have a very low count in the last 2-3 of these categories) might already help. IMHO, the averages don't mean too much. --Dirk Beetstra ^{T C} 12:33, 6 April 2009 (UTC)
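The bucketed breakdown requested above could look something like the following Python sketch. The bucket edges are the ones listed in the comment; the function name, the input format (a plain list of per-evaluation timings in milliseconds), and the choice to put a value that falls exactly on an edge into the higher bucket are assumptions.

```python
from bisect import bisect_right
from collections import Counter

# Bucket edges in milliseconds, as listed in the comment above.
EDGES_MS = [1, 2, 3, 5, 10, 20]
LABELS = ["<1", "1-2", "2-3", "3-5", "5-10", "10-20", "20+"]


def bucket_counts(timings_ms):
    """Count how many timings fall into each bucket.

    A value exactly on an edge goes into the higher bucket (1.0 counts as "1-2").
    """
    counts = Counter(LABELS[bisect_right(EDGES_MS, t)] for t in timings_ms)
    return {label: counts.get(label, 0) for label in LABELS}
```

For example, `bucket_counts([0.5, 1.5, 1.5, 4, 12, 250])` puts one run in each of "<1", "3-5", "10-20" and "20+", and two runs in "1-2", making the single 250 ms outlier visible where an average would blur it.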

What unit of time is on the right? Will Beback talk 01:51, 7 April 2009 (UTC)

The first table seems to show that all filters have taken up a total of 321.9 milliseconds in one hour, and 347.72 ms in one day. Is that correct? Will Beback talk 01:54, 7 April 2009 (UTC)

That is the mean execution time per edit during the last hour/day/etc. So a number of 333 ms means that the average edit took 1/3 of a second longer to save because of filtering. Dragons flight (talk) 01:58, 7 April 2009 (UTC)

Can you add the number of users who were warned and did not proceed to save their edits, and the number of users who were warned and hit the report FP link? Dy yol (talk) 17:53, 11 April 2009 (UTC)

Current Trends

Based on the data as of 4/8/09

Category	Private	Public
Total Filters	41%	59%
Total Hits	2.5%	97.5%
% of Filters that disallow	83%	17%
% of Category that disallow	80%	12%
% of Total Disallows	28%	72%
% of Hits against category resulting in Disallows	88%	6%

Not very interesting numbers, but it does show that private filters are being used to disallow edits with very specific editing patterns as opposed to public disallow filters which have much wider scope, since despite making up only 17% of disallow filters, public filters result in 72% of disallow actions. Burzmali (talk) 14:42, 8 April 2009 (UTC)

That is interesting data, thank you! :) — Werdna • talk 02:05, 16 April 2009 (UTC)

Stop creating filters to catch one specific instance of vandalism

Things like this. Unless you want to make 1000000 filters most of which will have 0 hits and slow editing to a crawl. kthx 86.164.203.7 (talk) 20:51, 11 April 2009 (UTC)

Well, Prodego, an administrator here, says that it's recurring vandalism. In general, our admin community has a lot of technical expertise; the filter is run by Werdna, who is paid full-time to work on the system. I'm sure that they know what they're doing. ╟─Treasury Tag►contribs─╢ 20:53, 11 April 2009 (UTC)

And that's a strange example since it's not even enabled. —Wknight94 (talk) 21:33, 11 April 2009 (UTC)

Actually, your admin community is picked for their article writing experience, and frequently screw up regexes and do things like block article creation, block new user creation and deautoconfirm several hundred people (not mentioning any names) -- 86.164.203.7 (talk) 21:39, 11 April 2009 (UTC)

No, they're not. WP:RFA, WP:ADMIN etc... ╟─Treasury Tag►contribs─╢ 21:46, 11 April 2009 (UTC)

As was already said, the filter in question is not enabled and hasn't been for a week. In any case, the first check it does is for an exact match on the page title; on any article other than Warren G. Harding the time it adds to the edit would likely be <5 ms. Mr.Z-man 21:44, 11 April 2009 (UTC)
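The point about the cheap title check running first can be illustrated with a small sketch. It is written in Python rather than in the filter language, and it assumes that conditions are evaluated left to right and that evaluation stops at the first false one, as the comment above implies. The pattern and the function names are hypothetical stand-ins, not the contents of the actual filter.

```python
import re

# Hypothetical pattern standing in for the expensive part of a single-article filter.
EXPENSIVE_PATTERN = re.compile(r"some\s+recurring\s+phrase", re.IGNORECASE)

# Counter used only to show how often the expensive check really runs.
calls = {"expensive": 0}


def expensive_check(added_lines):
    """Run the regular expression against the added text and count the call."""
    calls["expensive"] += 1
    return EXPENSIVE_PATTERN.search(added_lines) is not None


def filter_matches(page_title, added_lines, target_title="Warren G. Harding"):
    """Cheap exact title comparison first; the regex only runs for the target page."""
    # `and` short-circuits: if the title differs, expensive_check is never called.
    return page_title == target_title and expensive_check(added_lines)
```

Under these assumptions, an edit to any other article costs one string comparison, and the regular expression is only paid for on the one page the filter targets.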

That would be one of the "Colbert Report" filters. It would have been nice to have had in place when the Warren G. Harding siege was going on. I don't know why an IP address would have a problem with such a filter - unless it would thwart his own attempts to vandalize the article. Baseball Bugs ^{What's up, Doc?} carrots 02:55, 12 April 2009 (UTC)

I said what? Don't confuse my comment with that of User:Will Beback ([3]) who enabled the filter. I in fact am the one who disabled that filter. In general I agree that filters should be targeted so to prevent the maximal amount of vandalism for the minimum amount of resources. In some cases this is a very wide filter, in some cases it is a very narrow one. Prodego ^talk 05:29, 13 April 2009 (UTC)

I agree that every extra filter imposes some cost in time and resources, and that we should seek to gain the greatest benefit. We have the recurring situation of certain specific phrases being inserted into single articles, like San Diego, Elephant, or Warren G. Harding. The "Colbert" vandalism seems to drop off because there are few re-runs, while other shows or media may have a more lasting effect. Perhaps using a mix of filters (to cover the peak periods) followed by bots (once the frequency drops off) would address the problem best? Also, if there are ways of optimizing filters for specific articles then that might reduce the problem too. Are there other ways of handling this kind of vandalism that are better than filters? Will Beback talk 06:13, 13 April 2009 (UTC)

According to "Wikipedia:Don't worry about performance", we should let others concern themselves with system performance. Baseball Bugs ^{What's up, Doc?} carrots 08:45, 13 April 2009 (UTC)

That essay was written before administrators were given the ability to potentially add several seconds of processing time to every edit. Gurch (talk) 19:17, 18 April 2009 (UTC)

We are the others Baseball Bugs. The filters are a substantial load, we do have to worry about performance. @Will, generally rotating in more specific ones works for things like Colbert vandalism (which is periodic). Prodego ^talk 04:18, 14 April 2009 (UTC)

On one hand, the nature of this feature means it's always easy to use it to drag the system down, so we do need to worry about performance. On the other hand, if we really needed a lot of single article (or group of page) filters, it should be possible to make them work efficiently, if the right functionality were implemented. But a single fast filter that's changed with the flavor of the week shouldn't create a performance worry. -Steve Sanbeg (talk) 19:58, 15 April 2009 (UTC)

False positives page

Can I get some reassurance that things listed on the false positives page will be read and acted upon? Even sampling random log entries while trying to test code for interacting with the logs, I keep finding stuff that shouldn't be there; I'm sure if I actually looked deeper, I'd find a lot more wrong. The paranoid desire to unnecessarily keep most of the details of these filters hidden from me is not exactly helping, either, it's like trying to debug software by examining its output when you don't have the source code. Are posts there likely to be read, or am I better off sending everything to the administrators' noticeboard? Gurch (talk) 17:47, 18 April 2009 (UTC)

Both cases you reported are for log-only filters; no action has been performed on the editor, and the edit has been saved without the user noticing. I would think that sending these to the administrators' noticeboard would be unnecessary. I hope this explains. --Dirk Beetstra ^{T C} 18:15, 18 April 2009 (UTC)

Being log-only doesn't mean a filter should be catching things that don't match its description, especially when it's a private one and the description is all we have to go on. In fact, such a filter is worse than no filter at all -- you get the performance cost of having the filter without the benefit of actually having the abuse log, well, logging abuse. Gurch (talk) 19:15, 18 April 2009 (UTC)

No, of course, filters should be as good as possible, with as few false positives as possible (preferably: zero). But for some of them, having a filter which catches everything you want to catch, plus a handful of false positives, can also be better than not having a filter at all, as otherwise you will have to devise other ways of finding all occurrences (see filter 129).

Although I agree that we need to keep an eye on performance, and that the overall performance of the 'pedia should not suffer under the filters, on the other hand, there should also be a drive to improve the system behind the filters to make them as fast as possible. --Dirk Beetstra ^{T C} 19:24, 18 April 2009 (UTC)

where should I go?

Hi folks - and no need to point me in the obvious humor direction for my thread title ;). I got a spam filter notice for ezine.com when I tried to use it as a reference. I looked first to the MediaWiki talk:Spam-blacklist/log page, but I can't seem to find the right area to ask. Is ezine.com considered to be an unreliable site for reference? Thanks. — Ched : ? 19:48, 18 April 2009 (UTC)

I also looked at Wikipedia talk:WikiProject Spam and didn't see anything. I should have read the message I guess, instead of assuming it was an "Abuse filter" issue. Instead I just backspaced out of the edit warning window, and used a different reference. Oh well, any info on it would be appreciated. — Ched : ? 19:57, 18 April 2009 (UTC)
- You didn't hit any filter, at least. --Conti|✉ 20:12, 18 April 2009 (UTC)
  Yes, there are various different filters and blacklists that an edit has to pass through to be saved; unfortunately, rather a confusing situation for contributors. In this case, my guess is you were blocked by an entry on the spam blacklist added by someone who didn't bother logging it because it was obvious to them why they didn't like the look of the site. Looking at the site myself, I have to agree with them. "Submit Your High Quality Unique Articles To EzineArticles.com In Exchange For Traffic & Exposure Back To Your Website!" doesn't exactly scream "reliable source". Gurch (talk) 20:36, 18 April 2009 (UTC)

Thanks for the input folks ;) — Ched : ? 04:07, 19 April 2009 (UTC)

Where was the community vote on creating an additional privilege above "admin"?

I don't recall anything like that ever showing up in a watchlist notice. And I don't see the RfAFE's for these people either. --Random832 (contribs) 12:54, 23 March 2009 (UTC)

Did you not get the memo?? Happy‑melon 12:58, 23 March 2009 (UTC)

New software features come about all the time. The developers don't sit around seeking consensus for new features. We develop policy as it is needed. Chillum 13:00, 23 March 2009 (UTC)

A feature is one thing, the implementation of this as a separate user group vs including within sysop is something that the devs tend to follow individual project consensus on - so where was the consensus for this? And how did these people in particular get this flag? If there is no process, then there will be no objection to me going down this list and giving it to everyone on it, right? --Random832 (contribs) 13:05, 23 March 2009 (UTC)

Sysops can grant themselves the AFE right if they want to muck around with the AF. There's no additional red tape. AFE is not "above sysop", it's just granted to (or taken by) those who need it to edit or view private filters. Giving it out to every sysop would be pointy imo, and a waste of time. There is discussion above somewhere about rolling the right into the sysop package and another one about whether it is a good idea to consider granting it to non-sysops. –xeno (talk) 13:09, 23 March 2009 (UTC)

How would it be pointy? Given that every current AFE is also an admin; and that there seems to be no objection to admins granting themselves this flag without any discussion, it seems that the reality is that abusefilter-modify is an admin permission. If any admin is entitled to have the ability to modify filters, then there is no advantage whatsoever for the permissions not to be included in 'sysop'. It's not like an admin can 'accidentally' modify a filter if they don't intend to work in that area. Happy‑melon 14:05, 23 March 2009 (UTC)

Spamming the userrights log to make a point? (When I said "giving it out to every sysop", I was referring to Random's suggestion that he might go down the list of sysops and grant every one of them AFE) –xeno (talk) 14:08, 23 March 2009 (UTC)

Oh, sorry, I misread you. Yes, manually granting the 'abusefilter' flag to every current admin would be pointy. I thought you were talking about bundling the same rights into 'sysop'. Do you have an opinion on that? Happy‑melon 14:11, 23 March 2009 (UTC)

Yea, I don't really see why not, other than it would make it harder to get a list of "sysops-who-muck-about-with-the-AF". –xeno (talk) 14:13, 23 March 2009 (UTC)

Modifications to filters are supposed to be logged in Special:Log; that code is disabled for the time being because one of MediaWiki's standard database tables needs to be changed slightly (a column expanded to allow it to take longer entries) to accommodate the data. Then we'll be able to search all filter changes in the normal way. Happy‑melon 14:37, 23 March 2009 (UTC)

I still see some value in having an actual list, rather than forcing someone to trawl through a log to find an AFE-active admin. –xeno (talk) 14:39, 23 March 2009 (UTC)

The list might be current now, but there's no reason why it should remain so; people are just as likely to add that bit, do a bit of work, then drift off, as they are in any other job. If you want to know who currently does renames, do you look at Special:ListUsers/bureaucrat?? People shouldn't have to look at logs or lists to find AFE-competent people; they ask in the appropriate forum and interested admins watch it. Happy‑melon 16:25, 23 March 2009 (UTC)

It is not about sysop vs not-sysop! Good lord no! It is about having the technical knowledge and sense to use such a powerful tool. We don't give it to every admin because many admins would not have a clue how to use it. If we do grant it to non-admins it should be on the basis of technical skill and trust. It is not above or below admin. This is not a 1 dimensional concept and it cannot be reduced to that. Chillum 14:26, 23 March 2009 (UTC)

A similar amount of technical knowledge is required to edit the Spam blacklist, complicated templates, to perform rangeblocks or to modify the site JavaScript. All of these facilities are available to all administrators by default. Yet the site has not collapsed under the weight of admins who "do not have a clue how to use [them]" randomly playing around with features they do not understand, because we have selected the admin community to be, in general, comprised of users who do not do such things. It is not possible to 'accidentally' modify these things, so if you make the good-faith assumption that admins will not mess with things they do not understand until they have gained that necessary understanding, there is no reason why the ability to modify filters should be three clicks away instead of two. Any admin can grant themselves the AFE flag and go fiddling, if they are comfortable that they know what they are doing. Any admin can go edit the spam blacklist, if they are comfortable that they know what they are doing. Where is the additional 'safety' in the former situation? Happy‑melon 14:43, 23 March 2009 (UTC)

I am not opposed to admins granting the right to themselves. I think that they can decide if they are capable of using a tricky tool. The primary advantage of having it as a separate user group is that we can grant it to capable and trusted users who are not admins. Chillum 15:20, 23 March 2009 (UTC)

Indeed, and that is a separate issue. What's being suggested is that the same permissions that are given by the AFE flag are also included in the 'admin' bundle; in exactly the same fashion as rollback or IPBE. We can discuss the issue of whether or not to give AFE out to non-admins separately, but at the moment every one of the AFE flags are held by users who are also admins; there's complete duplication. Admins are indeed capable of deciding if they are capable of using this tool; they don't need to be put through a nuclear warhead-style interlock if they decide they are comfortable using it. Happy‑melon 16:30, 23 March 2009 (UTC)

I think an admin can decide if they are capable of using it before assigning to themselves. I would hardly call it a "nuclear warhead-style interlock", it is the same procedure used to add or remove any user right. Automatically granting it as part of the admin package seems out of place. It should not be granted to all admins because a) not all admins should be using it b) it would give the appearance to others that simply being an admin qualifies you. Think of it as a little plastic lid covering a dangerous button that can be flipped up if it is really needed. Chillum 17:55, 23 March 2009 (UTC)

The "little plastic lid" idea is the analogy I was really going for, except that that implies that there is nothing else to stop admins destroying the wiki. To screw up a filter from an empty browser, the admin needs to log in, type "Special:AbuseFilter" into the search bar, then scroll down, find a likely filter, click on it, change some settings, then click the "save" button. As opposed to logging in, typing "Special:UserRights/Jimbo", checking the AFE box, clicking save, then continuing the process? Do you really think that there is a danger of admins completing the first chain whilst sleepwalking, while the second will defeat them? Indeed admins can decide if they are capable of using it. Having decided, they don't need to go through extra sanity checks, that's the nuclear missile analogy. If an admin thinks they are capable of editing the filters, they will edit the filters. What's the benefit of the extra paper trail?

Your arguments against are, IMO, based on the continued misassumption that no one who has the technical permission is capable of resisting the temptation to use it. "simply being an admin qualifies you"... to do what? Being an admin qualifies you to have the technical ability to edit the filters; that's as true now as it would be if the permissions were bundled. Being qualified to actually implement and maintain filters is a status restricted neither to admins nor to AFE holders; but that's inevitable. Happy‑melon 19:10, 23 March 2009 (UTC)

We seem to be going round and round on something that in my personal opinion doesn't matter. The difference between having a right by default and having the right to add the right is very small. Could someone organize a vote or something so we can try to get some closure on this? Dragons flight (talk) 19:00, 23 March 2009 (UTC)

I agree that it's small; I wouldn't say it's irrelevant. I also agree that further cyclic discussion is unlikely to be constructive. I'll get a little straw poll going. Happy‑melon 19:11, 23 March 2009 (UTC)

Straw poll

“ Should the abusefilter-modify permission be included in the 'sysop' user right? ”

This will have the effect of giving all administrators the access to the Abuse Filter settings that is currently restricted to the 'abusefilter' group. This issue is separate from the question of whether the 'abusefilter' group should continue to exist and to whom it should be granted, which should be discussed separately. Please indicate support or opposition below. Happy‑melon

Pointless poll. Admins can already assign the right to anybody, including themselves, and the right is basically a technical right, people without the technical knowledge have no use for it, and those with it can get it easily and without problems. This right seems to be 100% noncontroversial, and we shouldn't run around demanding exhaustive community polls every time a new feature is enacted. The abusefilter feature has already been approved by the community, so this side trip down Pointless Poll Lane serves little purpose except to derail the implementation of an otherwise useful feature. Let's just let this be and move on. If an admin needs it, let them take it. If a non-admin needs it, let them get it from an admin. The number of people who are needed to maintain the abuse filter is diminishingly small, so there does not seem to be the need to enact this automatically for 1700 people, nor does there seem to be any need to demand a convoluted "approval" process to get this bit. This is a solution in search of a problem, and the entire thing is pointless. --Jayron32.talk.contribs 22:11, 23 March 2009 (UTC)

I'm not sure I understand much of what you're saying here. How on earth do you think that this poll is somehow intended to "derail" the AbuseFilter? If a non-admin needs it, let them indeed get it from an admin; although I suggested that we consider this issue separately, I'm generally in favour of retaining the separate AFE right to give to non-admins if desirable. But if an admin needs it, why the pointless paperwork of granting themselves an extra bell and whistle before getting to work? The situation is analogous to admins still being able to grant and revoke rollback, but needing to grant it to themselves before being able to use it. Happy‑melon 22:31, 23 March 2009 (UTC)

The status quo seems perfectly fine to me. Admins can give themselves the flag if they want to, and having this as a separate flag opens the possibility to give it to non-admins as well, should there be a consensus to do so in the future. --Conti|✉ 22:14, 23 March 2009 (UTC)

I agree, pointless poll. The setup seems to be working just fine at present. We can revisit this question if there are any serious problems, but for now let's just see how it goes. Tim Vickers (talk) 22:15, 23 March 2009 (UTC)

Well we both know there aren't going to be any problems that would be somehow magically 'fixed' by this change. Not all changes have to fix problems. Just because it's not broken doesn't mean it can't be improved. Happy‑melon 22:34, 23 March 2009 (UTC)

I support its being included in the sysop package, as Administrators are already trusted with tasks of equivalent sensitivity (e.g. Spam blacklists) and hence should be trusted with this. The inclusion of this right would just save Admins a little bit of hassle, in my opinion; its presence does not oblige Sysops to make use of that right every day. To give a similar example, despite technically being able to edit others' .CSS and .JS pages (editusercssjs), upload a file from a URL address (upload_by_url) and mark rolled-back edits as bot edits (markbotedits), I have yet to actually use any of those aforementioned rights. This does not mean, however, that they should be removed from me, as they might come in useful one day. The fact of the matter is, Sysops are trusted not to use their rights with malicious intent, and, as such, the situation whereby they can make use of these technical features should remain. It Is Me Here ^{t / c} 22:18, 23 March 2009 (UTC)

It's completely pointless. Admins can give themselves the ability or get it from a friendly fellow-admin. There's no need to give someone a right they're unlikely to need. This is totally different from ipexempt and rollback which many admins use frequently in day to day editing; the latter more than the first. This is an ability you only need when you plan on editing this. Keeping it separate allows us to give the right to non-admins (similar to the botflag) without giving them full admin rights they likely won't need. - Mgm|^(talk) 22:43, 23 March 2009 (UTC)
Certainly we should keep the separate AFE flag in the same fashion as IPBE or accountcreator. But how is this permission different to those? How is it different to something like the spam blacklist? If the coin had come down the other way and the devs had added the permission to the sysop bundle to start with, would we be having this discussion in the other direction? I agree it's a minor issue, but I wouldn't say it's pointless. Happy‑melon 22:52, 23 March 2009 (UTC)

We probably might have. IPexempt, rollback and accountcreate don't require particular know-how to use. That's why I don't think this can properly be compared to those permissions. - Mgm|^(talk) 00:17, 24 March 2009 (UTC)

Oppose It should be a separate flag. Flagging every admin is like saying every admin is able to understand how the filter works. I trust admins to decide if they are capable of using it safely then giving themselves the right. This extra step is not hard. Chillum 23:20, 23 March 2009 (UTC)

Meh. Who cares? --Carnildo (talk) 00:46, 24 March 2009 (UTC)
Oppose, with a dash of meh If someone wants to edit the filter, they can op themselves. I would actually prefer that mediawiki and template namespace editing be disallowed in the same fashion for admins, but that's another issue. This is a simple way to keep track of who can edit the filter and a potential speed bump before editors who attempt to edit the filter but really shouldn't be (like me). Protonk (talk) 03:06, 24 March 2009 (UTC)
Support. I see no reason not to bundle it into the sysop package. Ruslik (talk) 04:38, 24 March 2009 (UTC)
Support Requiring admins add themselves to a group is a complete waste of time. BJ^Talk 10:37, 24 March 2009 (UTC)
"Why bother" - As I stated above, I still see value in having the list of AFE flagged people easily accessible. –xeno (talk) 12:55, 24 March 2009 (UTC)
If it were certain to be an accurate list, I would agree with you. Unfortunately experience with other user groups shows that it will not remain so. Happy‑melon 13:34, 24 March 2009 (UTC)
What's the point? Any admin does have it, they just need to click an extra switch. rootology (C)(T) 13:10, 24 March 2009 (UTC)
Equally, why bother having to add an extra bell? What's the "point" in making them jump through an extra hoop? Happy‑melon 13:34, 24 March 2009 (UTC)
Exactly. It simply doesn't matter, so why bother even making a fuss over it, and the specific way it was implemented? It's like complaining over a handful more onions than garlic in your meal, or a handful more garlic than onions. A week later, it won't matter since it has zero impact on anything of importance. rootology (C)(T) 13:52, 24 March 2009 (UTC)
Yes, but in the honeymoon phase of AFE I think it'll be helpful. No need to rush into rolling it up. Jmho. –xeno (talk) 13:58, 24 March 2009 (UTC)
Well said. We should at the very least wait until the use of the tool stabilizes before giving it out to people who have not even expressed a desire for it. Chillum 14:00, 24 March 2009 (UTC)
"Pointless question" - Admins who need it can get it, just an extra couple of clicks. If they don't need it, or don't want to use it, then why bother giving it out. Are you also going to hand out AccountCreator, Huggle, VP and popups with the admin bit? Why even make a point about it. --Dirk Beetstra ^{T C} 13:56, 24 March 2009 (UTC)
Exactly. Everyone knows I'm a Process Fan, since Good Process That Is Enforced Can Give Sunlight To Shenanigans, but this is like raising a fuss over garbagemen smoking a cigarette while they collect garbage instead of during "scheduled smoke breaks" for level of crisis. rootology (C)(T) 14:05, 24 March 2009 (UTC)
Account creator and Huggle (through rollback) have already been given to all admins. Popups is not an admin tool. Ruslik (talk) 14:53, 24 March 2009 (UTC)

Similarly for AWB, IPBE, and others. Happy‑melon 15:08, 24 March 2009 (UTC)
Funny, last time I checked, I was not an Account creator, however, I can turn it on. And rollback != Huggle. You're right with popups, that is a free choice to use it, similar to AWB, IPBE, others .. and now (for admins), AbuseFilter. --Dirk Beetstra ^{T C} 19:14, 24 March 2009 (UTC)
It may be of interest that there are already 105 people with the AFE right; all of them are administrators (this is 6% of all administrators, a higher percentage of all active ones); but only 45 have ever modified a filter. Interpret as you will. Happy‑melon 14:14, 24 March 2009 (UTC)

Many are probably like me, that wanted to start out just reviewing filters. I'll probably do more later sometime, as I'm fairly familiar with regex (I should be, after years of use!). rootology (C)(T) 14:18, 24 March 2009 (UTC)

I have the flag because I intend to use it and have confidence in my ability to do so. I intend to start slow after I have had time to read the instructions, so I have not yet used it. Chillum 14:27, 24 March 2009 (UTC)

Support Current implementation makes little sense. — Jake Wartenberg 18:56, 24 March 2009 (UTC)
Oppose bundling it with the admin right. It makes very little real difference, the only advantage with the current situation being that one can see which admins have given themselves the permission = have a rough idea of who uses it. Not a great advantage, I know, but decent enough, I think. ╟─Treasury Tag►contribs─╢ 19:16, 24 March 2009 (UTC)
Support Admins can get it anyway, just by clicking a few buttons. The group should be used for users who are not administrators. Techman224^Talk 19:14, 28 March 2009 (UTC)
Support, userright is cluttered enough as it is. The capability argument is not really relevant, since an incapable admin can give himself that right already. I'd support a separate right if it was given on a case by case basis by bureaucrats. -- lucasbfr ^talk 09:36, 29 March 2009 (UTC)
Support – makes sense to me; see #Naive question. —Anonymous Dissident^Talk 11:39, 1 April 2009 (UTC)
Oppose – ... multiple reasons:
1. Abusefilter can cause temporary chaos (reversible, but the proverbial damage is done) if a filter is improperly crafted. This can range from slowdowns to accidentally preventing legitimate edits from going through to automatically removing autoconfirmed status on innocent users. Maybe it's just me, but I like the idea of encouraging an admin to sorta "figure out" abusefilter before adding himself to the group.
2. "Be bold" does not, in my opinion, apply to abusefilter in the same way it does to editing. I think extreme caution needs to be taken on every abusefilter edit, and each of those edits should be treated as if you're walking on eggshells. There's something reassuring about asking an admin to add himself to the group as an informal contract to be careful— that is, "this is dangerous stuff. add yourself to the group with caution."
3. It would possibly be easier to track who leaks the contents of private filters to the public if the only admins who could leak those filters have to be in a specific usergroup (smaller pool size). From there it's logical deduction, so if you're not going to be editing abuse filters, an admin can stay out of the group unless otherwise needed.
4. It also allows us to track bizarre, sudden self-group-adds to the abusefilter group on dormant admin accounts, which would allow us to more closely watch for compromised accounts before they can do any real damage should they then screw with the abusefilters.
  --slakr^\ talk / 01:48, 20 April 2009 (UTC)

AbuseLog appearance

I found the entries in Special:AbuseLog overly descriptive, so on another wiki I changed MediaWiki:Abusefilter-log-detailedentry into something like

$1: $4: [[Special:AbuseFilter/$3|$3: $7]], $2 on $5, action: $6 ($8) ($9)

so the log entries look like

14:03, 9 April 2009: edit: 25: Filter name, Example (talk | contribs) on Foo, action: Disallow (details) (examine)

Just a thought... —AlexSm 15:59, 10 April 2009 (UTC)

Interesting; I agree that there's scope for improvement. I turned your example into regular text with live links, so we can see better what it would look like. I think this might be a little too compact. How about:

$1: $2 on $5 ($4), [[Special:AbuseFilter/$3|$7]]. Action taken: $6 ($8 | $9)

14:03, 9 April 2009: Example (talk | contribs) on Foo (edit), Filter name. Action taken: Disallow (details | examine)

Thoughts? Happy‑melon 20:41, 10 April 2009 (UTC)

That's going to stop working soon, because of some changes I've made to global filters $3 will be replaced by a link to the filter, rather than the actual filter. The link text will be in a separate message. I suppose I could pass the filter name to that message, though. — Werdna • talk 02:04, 16 April 2009 (UTC)

I see, so if it hits local filter #3 it would link to Special:AbuseFilter/3, whereas if it hit global filter #3 it would have to link to, eg, meta:Special:AbuseFilter/3. Why does the link text need to be in a separate message? Happy‑melon 13:36, 19 April 2009 (UTC)

My mistake, already fixed

On filter 58, already reverted. Sorry. NawlinWiki (talk) 02:12, 19 April 2009 (UTC)

You mean to say you hadn't made enough mistakes with MediaWiki:Titleblacklist? If you can't test before enabling, don't edit the filter. --NE2 04:43, 19 April 2009 (UTC)

I'm wondering why the abuse-filter code even accepted that change. I wouldn't think '("string1" & "string2")' is even syntactically valid, but since it is, I can see why it matches all edits. --Carnildo (talk) 04:46, 19 April 2009 (UTC)

PHP is weakly typed. MER-C 13:07, 19 April 2009 (UTC)

Looking further into it, the abuse filter appears to be effectively untyped, and the problematic addition didn't match all edits, only those with the character '1' in added_lines. --Carnildo (talk) 01:10, 20 April 2009 (UTC)

Somehow I knew it would be you :/ can't you get someone else to do these things? Gurch (talk) 14:19, 19 April 2009 (UTC)

AfD filter #147

I've created a filter (currently disabled) to check AfD !votes. It should be able to flag bolding problems like '''delete''. I've tested it myself several times, but could someone else make sure it works, since I'm new to the abuse filter and it involves the apostrophe (a tricky character)?

Also, I'd like to extend it to perhaps cover the following:

Making sure people don't just cast a vote in an AfD; require them to give a reason.
Making sure people don't make syntax errors such as failing to close bold/italics/other formatting, in other places like articles.

King of ♥ ♦ ♣ ♠ 00:20, 20 April 2009 (UTC)

I don't think forcing people to give a reason is a particularly good idea. –xeno ^talk 01:12, 20 April 2009 (UTC)

This is the abuse filter, not the "enforce my pet guidelines" filter. Gurch (talk) 10:15, 20 April 2009 (UTC)

I agree with Gurch; the abuse filter is not intended for style enforcement. -- The Anome (talk) 11:04, 20 April 2009 (UTC)

IP-ranges

I'm trying to create a filter on svwp which prohibits a certain ip-range from editing certain articles. But when I try to use the ip_in_range thing it either catches every IP or none at all. Presumably I'm doing it wrong! Can you tell me how to use it to catch a certain range? Would be much appreciated. Njaelkies Lea (talk) 17:15, 21 April 2009 (UTC)

The command is 'ip_in_range(user_name,"1.2.3.4/24")', see it for example at work in Special:AbuseFilter/38. Hope this helps. --Dirk Beetstra ^{T C} 17:58, 21 April 2009 (UTC)

I can't see the private filters on enwp unfortunately as I'm not an admin here, but I got the information I needed. Thank you! Njaelkies Lea (talk) 18:56, 21 April 2009 (UTC)
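
As a rough illustration of what a CIDR range check like the one above does, here is a Python sketch using the standard `ipaddress` module. The function name mirrors the filter function but is only a stand-in: how the AbuseFilter extension implements it is not shown on this page, and treating non-IP user names as "not in range" is an assumption.

```python
import ipaddress

def ip_in_range(user_name: str, cidr: str) -> bool:
    """Return True if user_name is an IP address inside the given CIDR range."""
    try:
        address = ipaddress.ip_address(user_name)
    except ValueError:
        # Assumption: a registered account name is never "in range".
        return False
    # strict=False accepts ranges written with host bits set, e.g. 1.2.3.4/24.
    network = ipaddress.ip_network(cidr, strict=False)
    return address in network

print(ip_in_range("1.2.3.200", "1.2.3.4/24"))
print(ip_in_range("1.2.4.1", "1.2.3.4/24"))
```

A filter that "catches every IP or none at all" is often a sign that the range string or the variable being tested is not what was intended, so checking a few known addresses by hand like this can help narrow it down.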

Filter 98

Can this filter exclude autoconfirmed users (and not just sysops)? It's quite clear from looking through the logs that the people tripping it are non-autoconfirmed, and that the people who are autoconfirmed are doing so for a reason (i.e. making a good solid stub as opposed to incoherent gibberish). I can see no reason why sysops should be excluded from this filter if autoconfirmed users aren't. Prom3th3an (talk) 02:28, 23 April 2009 (UTC)

Some examples would be useful. I looked through the past couple hundred hits and didn't find anything. The ones that weren't deleted or tagged for deletion were either redirected, expanded in later edits (often by people other than the creator), or have cleanup tags on them. Mr.Z-man 04:38, 23 April 2009 (UTC)

Filter 131

Resolved

For starters, I don't think removing an image (no matter how often it happens) is abuse, and therefore I don't think it is within the remit of the abuse filter. Secondly, if I were a vandal I would spam those images (and/or upload images with similar names and spam those too) on those pages, because I know no one but a sysop could remove them, which would take somewhat longer than it would for an ordinary user. I think that filter as it stands (as proven above) is flawed and needs to be refined a bit. Prom3th3an (talk) 02:43, 23 April 2009 (UTC)

I've removed your advice to vandals. Seriously, what were you thinking?

As to the rest: Of course the repeated deletion of the Muhammad images is vandalism and abuse; I really don't see how you could describe it otherwise. If there were some discussion of the issue and a new consensus then they could be removed, but there is a long history of ideologically motivated edit warring that has nothing to do with consensus editing. If you have constructive suggestions then please offer them, but the images are already accompanied by comments warning against removal, so further warnings are likely to be pointless. Dragons flight (talk) 02:57, 23 April 2009 (UTC)

Also, I wish to note another flaw in your actions: whilst you may have made the filter private, the request (for which consensus to disallow was somewhat unclear at best) is still easily viewed and contains the pages and images concerned. Are you going to censor that too, or fix or disable your filter? —Preceding unsigned comment added by Prom3th3an (talk • contribs) 03:46, 23 April 2009 (UTC)

Filter 118

It is a joke and far outside the abuse filter's scope. Noting that every rule slows the servers down, I must wonder why Raul needs his own rule that does absolutely nothing and has no clear purpose. I would strongly encourage its deletion, as Prodego already tried. Again, this is an abuse filter, not some office clerk. The abuse filter was made to stop serious issues; it was not made to stop things that we personally find annoying. Prom3th3an (talk) 02:53, 23 April 2009 (UTC)

Wow... yeah. Why does that exist? --NE2 03:09, 23 April 2009 (UTC)

Occasionally, people have tried to self-publish things in the TFA queue. Whether simply by mistake or actively out of a desire to self-promote, I don't know. Having only one real hit, it is probably not an active enough problem to need a rule though. Dragons flight (talk) 03:14, 23 April 2009 (UTC)

Don't those pages get protected? --NE2 03:18, 23 April 2009 (UTC)

I agree it's rather pointless. Are we worried Raul and anyone else (the one real hit was reverted by an anon) is just going to not notice and put some spam on the main page? Mr.Z-man 03:19, 23 April 2009 (UTC)

Filter already disabled once?

Prodego disabled this filter on 5 April, providing "Too specific, disabling, performance -Prodego" as a reason, as well as a detailed explanation of why on Dragons flight's talk page. Within hours [4] of Prodego doing so, Dragons flight re-enabled the filter without any additional comment [5] (on the talk page or on the filter).

This raises serious questions about why Dragons flight reverted another admin when sufficient reason was provided. I also note he has yet to disable it despite increasing calls to do so. Prom3th3an (talk) 04:20, 23 April 2009 (UTC)

I don't see why this needs its own subsection, but it's been less than 2 hours since the "increasing calls" started. I don't see the purpose of rushing this and making vague, ominous statements like "serious questions about why Dragons flight reverted another admin". Have you tried asking him? Mr.Z-man 04:34, 23 April 2009 (UTC)

As was discussed at length on this page, Prodego disabled 19 filters out of a mistaken concern about performance. I re-enabled about half (slightly less I think, but I'm not sure) that had made real hits in the preceding week (or in two cases had no hits but were less than a day old). Since Prodego's concern about performance was mistaken, there was no need to disable potentially useful filters. Since the filter had only been around for a few days and had seen a real hit, it wasn't clear how often the issue would actually arise. Now it is of course more clear. Dragons flight (talk) 04:59, 23 April 2009 (UTC)

I've turned it off. As you should be aware I was a little busy patching the 0day exploit you yourself publicized. Dragons flight (talk) 05:03, 23 April 2009 (UTC)

Ghost edits in the filter logs

Occasionally, I find edits that didn't really seem to exist in the filter logs. For example in the log of filter 81, I find

21:54, 22 April 2009: Until It Sleeps (talk | contribs) triggered filter 81, performing the action "edit" on Louis Armstrong. Actions taken: none; Filter description: Badcharts (details) (examine)

This edit seems to be a complete phantom: it isn't in the history of the article, and it isn't in the editor's contribution history. I'm sure there's a simple explanation, but I don't know what it is, and I'm curious.—Kww(talk) 03:27, 23 April 2009 (UTC)

The only thing I can guess is that the user was using some out-of-date data when vandal patrolling: the phantom rollback was supposedly made 8 minutes after the edits he was trying to roll back had already been rolled back by someone else, but the abuse filter triggered before MediaWiki checked whether the rollback was valid. Things like this can also happen if a user trips multiple filters, where one only logs but the other warns or disallows; if one only sees the log-only hit, it'll look like a phantom edit, but it was actually stopped by another filter. Mr.Z-man 03:35, 23 April 2009 (UTC)

...Yeah, I believe during that time, Huggle had glitched, and was not showing any new changes... Until It Sleeps 03:37, 23 April 2009 (UTC)

It would be incredibly useful if we could see in the log whether a flagged edit was actually made or not. --Conti|✉ 10:11, 23 April 2009 (UTC)

It would be more useful to fix this, as it's really a bug in MediaWiki -- at the very least, it should check whether the rollback can actually succeed or not before going to the abuse filter, and to be honest, I see no reason why the abuse filter should be checking rollbacks at all. Gurch (talk) 12:55, 23 April 2009 (UTC)

Rollbacks should absolutely be checked. If I clean vandalism out of an article and someone attempts to roll the vandalism back in, that's still vandalism.—Kww(talk) 11:31, 24 April 2009 (UTC)

I don't consider this a bug, although it may be useful to, as suggested, record which edits succeed. — Werdna • talk 04:24, 24 April 2009 (UTC)

How about killing two birds with one stone: if the edit succeeded, give us a direct link to the diff. 99% of the hits I monitor need to be reverted, and it's irritating to have to use a three-step process to get to the edit in question.—Kww(talk) 04:27, 24 April 2009 (UTC)

Yes, that would be awesome. Can this be done, pretty please? :) --Conti|✉ 09:28, 24 April 2009 (UTC)

This was the first obvious omission I noticed when I first looked at the abuse filter log. I filed it as bug 18374; apparently there are technical difficulties involved in doing so. My attempts to work around this in external software work for the most part but are also stumped by things like the above (actions that appear from the abuse log to have succeeded but actually didn't). Gurch (talk) 11:13, 24 April 2009 (UTC)

Werdna, whereabouts in the edit saving process does the abuse filter actually kick in? Can edits trip a filter but then fail anyway due to, say, edit conflicts or the spam blacklist, in addition to rollback failing? Personally I think the filter should be the very last step, partly for better performance and partly so things don't get logged that wouldn't have happened anyway. Gurch (talk) 11:16, 24 April 2009 (UTC)

Pseudo-block using abuse filter

I have seen 2 types of situations where, in my opinion, a pseudo-block using the abuse filter could be better:

A sockpuppeteer with lots of sleeping socks - we could, instead of hardblocking the IP address, combine a softblock with an abuse filter which disallows edits that meet all of the following conditions:
1. The account doesn't have the "admin" or "IPBE" flags. (Similar to true blocks. Using IPBE lets admins allow specific false-positive accounts to edit.)
2. The edit isn't to the user's own talk page.
3. The IP address is the one which the sockpuppeteer has been using.
4. The account matches identifying information about the sockpuppets - this can include names, account age, low number of edits, etc. This is the reason to use a pseudo-block instead of an actual block.
Users who were blocked, but there is a very good reason to allow them to edit a small number of pages, such as an open ArbCom case - set a filter to disallow all edits by the user except to the user's own talk page and the pages in question.

Both of these should give a disallow message which resembles MediaWiki:Blockedtext. עוד מישהו Od Mishehu 09:06, 23 April 2009 (UTC)
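
The four conditions for the first case can be sketched as a single predicate. This is only a Python illustration under stated assumptions: the `Edit` fields, the group names, and the edit-count threshold are hypothetical, and whether a filter can actually see the IP address behind a logged-in account is assumed here rather than known.

```python
from dataclasses import dataclass

@dataclass
class Edit:
    # Hypothetical fields, loosely modelled on the conditions listed above.
    user_name: str
    user_groups: set
    user_ip: str
    user_editcount: int
    page_title: str

def pseudo_blocked(edit: Edit, sock_ip: str, max_editcount: int = 10) -> bool:
    """Return True if the edit should be disallowed by the pseudo-block."""
    exempt = bool(edit.user_groups & {"sysop", "ipblock-exempt"})  # condition 1
    own_talk = edit.page_title == f"User talk:{edit.user_name}"    # condition 2
    from_sock_ip = edit.user_ip == sock_ip                         # condition 3
    looks_like_sock = edit.user_editcount < max_editcount          # condition 4 (assumed heuristic)
    return not exempt and not own_talk and from_sock_ip and looks_like_sock

suspect = Edit("Sleeper1", set(), "192.0.2.7", 3, "Some article")
print(pseudo_blocked(suspect, "192.0.2.7"))
```

Condition 4 is what distinguishes this from an ordinary hardblock: an established account on the same address falls outside the heuristic and can keep editing.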

Abuse filter rights and administrators

A long time ago in a galaxy far, far away Jimbo said adminship is no big deal. But, we now have abuse filter rights built into it. Now it's a much bigger deal. One well intentioned administrator can cause serious damage to the project. We've seen editing shut down for a while due to one admin screwing up applying a new filter.

It's already becoming insanely impossible to become an administrator, and the rate of new administrators is way down. Adding abuse filter into the rights mix just makes all the more reason why people would want to limit who becomes an administrator.

I don't think we need abuse filter editors so badly (hell, we already have over a hundred of them) that we have to permit all administrators the notional ability to do it.

I hate, despise, scream in agony at the thought of creating a new bureaucracy, but something has to be done to separate the rights to abuse filter editing away from administrators. --Hammersoft (talk) 15:33, 23 April 2009 (UTC)

What exactly do you propose? Even without the Abuse filter, administrators already have the ability to cause absolutely unbelievable damage through their access to the MediaWiki namespace, far far more than you could possibly cause with the Abuse filter. Even so, they have been, and continue to be trusted with this access. I have seen no evidence that access to the abuse filter makes it more difficult to pass an RfA. In fact, in the last few weeks (although it has slowed recently), I have noticed a marked increase in the number of successful RfAs. J.delanoy ^gabs _adds 15:41, 23 April 2009 (UTC)

It will happen. Count on it. --Hammersoft (talk) 15:44, 23 April 2009 (UTC)

What sort of an argument is that? I'd rather not "count on it", if you don't mind, I'd much rather hear some convincing argument why it is certain or even likely to be the case. :P Happy‑melon 15:46, 23 April 2009 (UTC)

Let's play liar's dice, shall we? I bet I can name ten different ways that an administrator can prevent all edits to a wiki until reverted, without using the abuse filter. I won't, of course, on-wiki, but I'll e-mail anyone who's interested. Unless, of course, someone thinks they can think of eleven... :D

The point is, being able to crash the site is nothing new. Being able to crash the site in new and exciting ways is not new. Being able to crash the site in new and interesting ways that weren't available when the admin passed RfA is not new. What, exactly, is new? Happy‑melon 15:51, 23 April 2009 (UTC)

Part of the issue is that the ability to view private filters is tied to the ability to edit them. We may have 140 abuse filter editors, but only about 60 have ever actually edited a filter. Mr.Z-man 16:05, 23 April 2009 (UTC)

Also, +AFE is not presently bundled with +sysop. yes, we can grant it to ourselves, but there is a distinction to be made. Also, as Z-man says, most of us are simply viewing filters (I made one or two very minor changes, but it's far too complicated for me to be doing major work on it). –xeno ^talk 16:06, 23 April 2009 (UTC)

This is one of those cases where the only way to know for sure is to wait for the future to come. My prediction is a year from now there will be a vetting process for abuse filters editing ability. Those of you who think I am wrong are of course welcome to your opinions. But please remember your opinion and my opinion have the same value. --Hammersoft (talk) 16:13, 23 April 2009 (UTC)

So are you proposing we actually begin to create such a system (as your first, "something has to be done" comment suggested) or just stating your opinion? Mr.Z-man 16:21, 23 April 2009 (UTC)

I'd like to see a system where for 24 hours after each filter is changed it applies to administrators and no-one else. That might help. Gurch (talk) 17:10, 23 April 2009 (UTC)

Yeah, that works fine, except nearly every filter has sysops exempted in some way, which is questionable to say the least. «l| Ψrometheăn ™|l» (talk) 08:24, 24 April 2009 (UTC)

Actually, most of them only apply to non-autoconfirmed users. Trouble is, of course, that that group is the least likely to know how to report a problem, and the most likely to be deterred from contributing by broken filters and/or annoying abuse filter messages. I just figured that if all admins had to go through a 24 hour period when they were warned for doing stuff, rather than just throwing the filter out there and then ignoring it because they couldn't see its effects, errors would be corrected far more quickly, and in particular screwups that prevented all editing would only prevent all admins editing rather than everyone, which is a much less disruptive situation. Gurch (talk) 09:19, 24 April 2009 (UTC)

Yes, that sounds fine, but then the same should be true for page-protections, rangeblocks and blacklist rules that admins apply. The system should there just randomly apply random blacklist rules, page-protections and edit-blocks to the applying admin as well, just to let them see the annoying effects it can have on innocent editors who are however affected by such measures.</sarcasm>

Gurch, please, properly applied rules have WAY less collateral damage than some range blocks and semi-protections have had here. There are some rules in place, which now very specifically target problems, giving the wikipedia back to the editors (I see IP editors improving articles which have been semi-protected for 2 out of the last 3 years!). You are right, there are some dark sides to this tool, but we admins have worked with semi protection, range-blocking and external link blacklisting, tools which can have WAY worse effects than this (some time ago 't' was blacklisted on meta, so no external link with a 't' could be added anywhere!). So unless you can come up with the following two statistics:

The number of editors who have been hit by an abuse filter warning and who a) did not save their edit, and b) did not edit afterwards anymore
The number of editors who have been trying to edit a page, but were deterred by a range-block, a semi-protection or a spam-blacklist block, and who a) (where possible) did not adapt their edit and saved it, and b) did not edit afterwards anymore.

showing that the latter is really lower than the former, I don't see why you are so afraid of the use of the abuse filter. Admins who apply totally wrong filters which have a significant number of wrong hits (especially those who do not adapt/disable their filter quickly when they notice because they go to sleep) should indeed be hit (hard!) with a trout, and we should be generally careful, indeed. But I do not believe that this tool is worse than range-blocking, page protection or blacklisting. --Dirk Beetstra ^{T C} 10:20, 24 April 2009 (UTC)

Well, last I checked the spam blacklist does apply to administrators, and that's the way it should be. As for page protection, that by definition can't be applied to admins too, but you get the same mentality (I've seen many pages protected with summaries like "no non-admin should need to edit this").

Nowhere did I say the abuse filter was more of a deterrent to useful contributions than rangeblocks and semi-protection. I'm opposed to misuse of those things too, especially use of semi-protection for isolated incidents of vandalism, or reasons like "new users shouldn't need to edit the page", and rangeblocks of whole ISPs for the work of one person. Gurch (talk) 11:22, 24 April 2009 (UTC)

I know Gurch, and I hope that we/I do not use all these tools too lightly anywhere. However, in some cases one editor can make life very difficult for some of us. I may misunderstand you, but I get from your remarks the feeling that 'if a filter has any collateral damage it should not be applied at all'. I am afraid, that no collateral damage is hardly possible for anything. On the other hand, a lot of the 'abuse filters' we have here, are NOT against abuse, they are there to, hopefully friendly and encouraging, 'help' editors to make a better wiki. It would be good to know how many of the 'warn' edits ('you forgot to put a dot on the i!') result in editors walking away.

Remark: blacklisted links are generally not used by experienced editors, as they know they do not fit with the policies and guidelines here. However, new good-faith editors who do not know may run into such blacklisting messages, and those messages may drive them away. The effect is not the same as a semi-protection or a rangeblock, but it will also affect new, good-faith editors who are not familiar with policy and guideline more than regulars. --Dirk Beetstra ^{T C} 11:46, 24 April 2009 (UTC)

On Wheels

Can we add "Willy on wheels", "on wheels" and various capitalizations. Spate of vandalism earlier today in this regard and obvious historical reasons.--Fuhghettaboutit (talk) 12:42, 22 April 2009 (UTC)

Oh, and if not already done, "Haggar" "Hagger" etc. with various spellings, capitalizations and diacritic use would be appropriate.--Fuhghettaboutit (talk) 12:46, 22 April 2009 (UTC)

There are legitimate uses, such as Meals on Wheels. --NE2 13:17, 22 April 2009 (UTC)

So we limit it to avoid false positives. If "on wheels" is too general for content filtering for article additions, use only "willy on wheels" and "willy-on-wheels" with various capitalizations. Page moves, however, should be prevented from any existing title to anything "on wheels", as this was a major modus operandi. Limiting this to just moves may prevent much damage and it's very unlikely to result in more than 1 or 2 false positives over many years. In that unlikely event, an admin can be contacted or a requested move can be filed.--Fuhghettaboutit (talk) 17:23, 22 April 2009 (UTC)

I agree it is a reasonable thing to target if it is a current problem, but it's been a very long time since I've noticed "on wheels" vandalism. Can you point to the recent examples? Dragons flight (talk) 19:42, 22 April 2009 (UTC)

Yeah, this got old in 2005, nobody does it any more. Gurch (talk) 20:06, 22 April 2009 (UTC)

See these 14 pages created today. These are not all of the ones created today; just the ones I protected out of a larger group, and that I can therefore easily find. We still get Willy on Wheels vandalism, despite that it's all copycat. Unless the abuse filter has a very finite number of things it can look at, I don't see the point of not doing it. And don't lose the second issue. Haggar is active today.--Fuhghettaboutit (talk) 20:55, 22 April 2009 (UTC)

Are those creations, or moves? If they were moves I would have thought filter 1 would have blocked them, but then I have no way of telling what's in there Gurch (talk) 13:03, 23 April 2009 (UTC)

Creations. Cenarium (talk) 08:59, 24 April 2009 (UTC)

There's no numerical limit to filters, but some of them simply take too much processing time. Stifle (talk) 15:55, 24 April 2009 (UTC)

So "on wheels" isn't on the title blacklist? Seems odd, given half of Unicode is on there. Gurch (talk) 21:53, 25 April 2009 (UTC)

It is (2nd entry under 'ATTACK TITLES AND/OR PAGE MOVE VANDALISM TARGETS'):

.*[OÓÒÔÖÕǑŌŎǪŐŒØƏΌΟΩῸὈὉὌὊὍὋОӨӦӪ][N₦ŃÑŅŇṆΝ][ ]?[WŴẀẂẄẆẈ₩][HΉĤĦȞʰʱḢḤḦḨḪНҢӇӉΗἨἩἪἫἬἭἮἯῊᾘЋΗⱧԋњһh][ÉÈËEĘĚĔĖẺẸẾỀỄễỂểȨȩḜḝĒḖḗȄȅȆȇỆệḘḙḚḛ3عڠeēėèéëẽĕęəẻếềẹ][ÉÈËEĘĚĔĖẺẸẾỀỄễỂểȨȩḜḝĒḖḗȄȅȆȇỆệḘḙḚḛ3عڠeēėèéëẽĕęəẻếềẹ]+[L₤ĹĽḶŁĿΛЛЉ][[S$ŚŜŞŠṢΣЅ].* <moveonly> # Disallows moves with "on wheels" with 2 or more Es
.*on wh33ls.*
.*on whiels.*
.*\bwith wh?iels\b.* <moveonly>
.*on rails.* <moveonly>
.*on treads.* <moveonly>

As you can see, it's had the Unicode treatment --Chris 00:04, 26 April 2009 (UTC)

But set to "moveonly"? Someone should take that out of there, I guess, then it will block page creations too. Better to do that than add another abuse filter, as the title blacklist is more efficient. Gurch (talk) 10:36, 26 April 2009 (UTC)
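
To show why the blacklist entry above uses character classes rather than a plain case-insensitive match, here is a much simplified Python sketch. The character classes are small hypothetical subsets chosen for illustration, not the real blacklist entry, and the `is_on_wheels` helper is made up.

```python
import re

# Simplified stand-in: O, N, optional space, W, H, two or more E-like
# characters, L, S. Each class lists only a few look-alikes.
ON_WHEELS = re.compile(r"[oOóÓ][nNñÑ] ?[wWŵŴ][hHĥĤ][eEéÉ3]{2,}[lLĺĹ][sS$]")

def is_on_wheels(title: str) -> bool:
    """Return True if the title contains an 'on wheels' look-alike."""
    return ON_WHEELS.search(title) is not None

print(is_on_wheels("Example ON WH3ÉLS!"))   # disguised spelling is caught
print(is_on_wheels("Meals on Wheels"))      # legitimate title is caught too
print(re.search("on wheels", "Example ON WH3ELS", re.IGNORECASE) is not None)
```

The second call is the false positive mentioned earlier in the thread: a pattern this broad cannot tell "Meals on Wheels" from vandalism, which is presumably why the real entry is restricted with `<moveonly>`.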

Notification at least?

We can't automatically block as an action... How about having some notification that a filter is being triggered? Persistent vandals are just continually trying their edits tweaking them until they get around filters (see here). Unless you are sitting on the filter refreshing, there is no way to know. Maybe an e-mail? A talk page message? A note on some page that we could watchlist? Wknight94 ^talk 20:28, 26 April 2009 (UTC)

A mechanism to log rapidly repeated triggers of any filter by a single account would be useful. -- The Anome (talk) 20:45, 26 April 2009 (UTC)

For those who use IRC, I have a bot that makes an alert anytime a user trips 5 filters in 5 minutes, a user trips a filter doing a pagemove, or if 10 filters in 5 minutes are tripped on a page. It operates in #wikipedia-en-abuse-log along with a bot that reports most filter hits. Mr.Z-man 20:54, 26 April 2009 (UTC)

A RSS feed could be very helpful, otherwise. I dunno how hard it'd be to have one running (the RSS on the page apparently points to Special:RecentChanges). -- lucasbfr ^talk 22:12, 26 April 2009 (UTC)

Yes, we'd need a way to report users matching certain abuse filters. Maybe a bot could do that, report to WP:AVI any user matching a filter from a certain list. The list of filters could be held in the 'botspace', so that admins can update it as needed. Cenarium (talk) 20:32, 27 April 2009 (UTC)

Made a bot request, see here. Cenarium (talk) 22:25, 27 April 2009 (UTC)
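
The alert rule mentioned above (one user tripping 5 filters within 5 minutes) amounts to a sliding-window counter. Here is a minimal Python sketch of that idea; the `TripMonitor` class is hypothetical and is not the code of the IRC bot, and it assumes hits arrive in time order with timestamps in seconds.

```python
from collections import defaultdict, deque

class TripMonitor:
    """Track filter hits per user and flag bursts inside a time window."""

    def __init__(self, threshold: int = 5, window_seconds: int = 300):
        self.threshold = threshold
        self.window = window_seconds
        self.hits = defaultdict(deque)  # user -> timestamps of recent hits

    def record(self, user: str, timestamp: float) -> bool:
        """Record one hit; return True if the user reached the threshold."""
        recent = self.hits[user]
        recent.append(timestamp)
        # Drop hits that are older than the window relative to this hit.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.threshold

monitor = TripMonitor()
alerts = [monitor.record("Example", t) for t in (0, 60, 120, 180, 240)]
print(alerts)
```

The same structure would work for the per-page rule (10 hits in 5 minutes on one page) by keying on the page title instead of the user name.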

Filter 107

Filter 107 should not be private because its specifications are easily accessible here. «l| Ψrometheăn ™|l» (talk) 11:30, 29 April 2009 (UTC)

It should be, because it also catches some other things. And the privateness of the filter hides the actual implementation, making it more difficult to get around it. --Dirk Beetstra ^{T C} 11:44, 29 April 2009 (UTC)

Filter ID for us noobs?

Wouldn't it make sense to auto-include an ID string on filter comments added to the edit summary? Something like: ([[Wikipedia:AF|AF]]:filter text). In other words, it would produce (AF:Nonsense movie?) to use the one that caught me out.

I'm a sample of one, admittedly, but I was one confused new page patroller because I didn't even know the filter existed. 19 extra characters will save noobs like me a lot of perplexity. 9Nak (talk) 18:49, 3 May 2009 (UTC)

To centralise a bit, as also briefly discussed at Wikipedia:Village pump (technical)/Archive 120#References removed automatically appended to edit summaries. 9Nak (talk) 19:07, 3 May 2009 (UTC)

Criteria for a Private Filter

It's come to my attention that whether a filter is private or not is largely down to the discretion of the administrator who makes it, and there are no guidelines as to what should be private and, more importantly, what should not. I'm of the opinion that all filters should be public unless they contain an elaborate regex rule that could be easily circumvented if the regex were public (i.e. a meme pattern).

An example of a filter that should not be private is [6], because it is an "as is" filter that cannot be circumvented. Another questionable regex is [7]. «l| Ψrometheăn ™|l» (talk) 02:00, 27 March 2009 (UTC)

These filters both contain information that would assist an abusive user in circumventing them. –xeno (talk) 02:06, 27 March 2009 (UTC)

How can you circumvent a move throttle? Seriously? And more to the point, I bet any user could tell you what that filter contains, either in full regex or in layman's terms, including the conditions. «l| Ψrometheăn ™|l» (talk) 02:09, 27 March 2009 (UTC)

Erm, it involves some beans? –xeno (talk) 02:11, 27 March 2009 (UTC)

Well, because I don't know what the filter contains, one can but wonder why a simple move-page vandalism filter needs to be private for it to work. And back to the initial statement: what constitutes a private filter? «l| Ψrometheăn ™|l» (talk) 02:14, 27 March 2009 (UTC)

He can't move pages until he is autoconfirmed by waiting 4 days and making 10 edits, and yet he routinely waits just long enough and makes just enough edits to do that. If he knew the specific requirements of those rules they would be just as beatable. (At least until we changed them anyway, but no one wants an arms war.) Dragons flight (talk) 02:17, 27 March 2009 (UTC)

This entire feature is an arms war, and apparently one fought by people who don't give a damn about civilian casualties (read: those of us who actually want to improve articles) -- 217.42.77.168 (talk) 17:12, 6 April 2009 (UTC)

We've tried public regex-based filters to prevent pagemove vandalism, with little noticeable effect. Mr.Z-man 03:45, 27 March 2009 (UTC)

That filter is quite different to the epic fail blacklist, but an email has provided me with a sufficient explanation to keep that filter private. «l| Ψrometheăn ™|l» (talk) 07:04, 27 March 2009 (UTC)

So does the mediawiki source code, last I checked that wasn't private -- 217.42.77.168 (talk) 17:09, 6 April 2009 (UTC)

He who trades freedom for security deserves neither and loses both, or at least that's what Ben Franklin thought. Maybe we should add an asterisk next to all the proclamations of openness and transparency that litter our statement of principle and other such articles. Burzmali (talk) 17:41, 6 April 2009 (UTC)

Filters at high risk for being circumvented are usually marked private to "cut them off at the pass." One of the bigger problems with titleblacklist, as we found out, was that everything was 100% open to the public to view; as a result, it was regularly circumvented. All that had to be done was for the puppeteer to add the page to his watchlist and adapt in a single adjustment. Private filters, on the other hand, force the puppeteer to keep guessing, using up their throwaway accounts in the process, while being able to do nothing to disrupt the encyclopedia the whole time. It's basically the difference between using a clear lock on your house (where all of the pins are visible to someone trying to pick it) and using an opaque one. With the clear lock, the burglar can see the pin shears and have the lock open in a fraction of the time it would take him to brute force it. --slakr^\ talk / 01:55, 20 April 2009 (UTC)

It's my understanding that lock picking proceeds one pin at a time, and pins already picked are held in place, so I'm not sure your analogy is apt. How about the Titanic trying to avoid icebergs without sonar, when all that can be seen is the tip?

--NE2 02:47, 20 April 2009 (UTC)

Or to continue the lock analogy, an addition to the titleblacklist is like trying to increase security by adding more doors. It adds an extra step, but getting around it is fairly trivial. A private filter is like putting a lock on the door, connected to an alarm that lets you know when someone is trying to break in. Mr.Z-man 02:56, 20 April 2009 (UTC)

Which works fine until you have a bunch of well-intentioned contributors standing around outside the building unable to get in. Gurch (talk) 10:14, 20 April 2009 (UTC)

Keeping the analogy: Gurch, do you realise that blacklisting, semi-protection, and IP(-range)-blocking leave us with that same bunch of well-intentioned contributors standing there (and probably even waaayyy more than with a reasonable filter)? But as we have conveniently turned all streetlights off around the building (with the well-intentioned, precise and irreversible use of a shotgun), we will never know! They can knock, scream, build a trebuchet, paint their faces purple, wear a superman outfit, drop their pants, whatever .. we don't know. Filters can put those editors in the spotlight, and when they knock on the door (or drop their pants :-p ) and can't come in, we can have a look, and open our doors in such a way that we keep out only those which we really want to keep out.

Does such a filter drive away editors? Possibly, but hasn't the blacklisting, semi-protection or blocking alternative done the same? I have seen on some of our filters that we now get those well-intentioned contributors back again after they have been standing there in the dark for years. However, others knock, scream, cry ... --Dirk Beetstra ^{T C} 10:49, 20 April 2009 (UTC)

I agree that the use of this filter system is potentially so powerful that we may be enabled to replace other measures. In particular, it should make the title blacklist obsolete or much less often needed, and similarly for long periods of semi protection or long range blocks. In fact it is so powerful that we may not need to keep very much about it private. DGG (talk) 06:26, 6 May 2009 (UTC)

Help requested for populating AbuseFilter tag wording/descriptions

Since they now show up in contributions lists, as a sort of an urgent action I've gone ahead and created a bunch of less-harsh/less-accusatory tag appearances for several of the tags listed over on Special:Tags that have gotten hits. No worries about permanence, since they're simply part of the interface (i.e., changing something will basically instantly become visible). Bringing it here for people to come up with better ones + populate the extended descriptions. If you're not a sysop and wanna make a change, I'd say just use {{editprotected}} on whatever message's talk page, though it might be better for someone to redirect the talk pages to a centralized location (maybe a subpage of this page?) Anyway, I'm swamped, so talk amongst yourselves. --slakr^\ talk / 01:14, 4 May 2009 (UTC)

Also edited MediaWiki:Abusefilter-edit-action-tag to make it clear what tags actually do + link to special page. --slakr^\ talk / 01:46, 4 May 2009 (UTC)

See Wikipedia:Tags for a place to discuss and describe the tags in one central location. 199.125.109.77 (talk) 03:38, 5 May 2009 (UTC)

Disabled Filters, unnecessary?

OverlordQ (talk · contribs) disabled three filters 94, 101, and 156.

The first two are very lightweight, averaging 0.89 and 1.26 ms respectively (performance data). They are both targeted single page vandalism getting 10 and 8 hits respectively since activation (including 5 and 1 hits respectively in the last two weeks). This kind of persistent targeted vandalism is exactly the kind of thing the Abuse Filter was initially designed for and given that these checks are extremely cheap I am inclined to reactivate these filters.

156, on the other hand, is somewhat more expensive (though 5.08 ms is by no means bad) and has drawn 0 hits. It is also so specific in what it targets that it is probably easily avoided by the vandal (who is probably a lone individual, unlike the vandalism in 94 and 101, where there is a decent argument that multiple people are involved). The mitigating factor is that 156 is only 4 days old. I probably would have waited several more days to see if there would be any hits before disabling (and would re-enable it in the face of ongoing vandalism), but I generally would agree that this filter seems unlikely to be of much long-term use.

In the interest of encouraging more transparency when it comes to filter use, I wanted to start a discussion about these things. In particular, I'd propose reactivating 94 and 101. Dragons flight (talk) 19:35, 29 April 2009 (UTC)

This is similar to an action by Prodego earlier. I disagree with disabling specific filters for specific vandalism which are set to warn or even to disallow, just to gain a bit of performance, especially not without consulting the AFE who has written and designed the filter (and hence probably knows most about the MO of the vandals who are supposed to be blocked by it). Those are the filters this was written for, and they should be enabled if the threat is still active. I know that filters like 29, 39 give way more hits, but they are (probably) never set to warn, they are (I am repeating myself) more logs than abuse filters, which could easily be run by a bot after the edit has been performed who dumps logs on wiki resulting in a huge gain in functionality in stopping abuse, and being able to do a much more rigorous check than reasonably possible with the filter (I am sure some could be improved but that improvement would result in a huge increase in parse-time per action). Please re-enable these filters for which the abuse-filter was written! --Dirk Beetstra ^{T C} 22:05, 29 April 2009 (UTC)

I honestly think that the abuse filter was designed to stop persistent and chronic abuse. It was not written to stop small minor vandalism that rears its head less than 10 times a month. There are enough hugglers around for things like this. The filters mentioned above should remain disabled. 94 is a waste of time: if he can't add vagina to that article he will just go to another, and it occurs rarely enough for standard RC patrol to deal with; if it stays on that article it can be watchlisted at least. 156 is a VERY SPECIFIC (if the guy changes one letter it won't trigger) case of vandalism that is no more common than most other vandalism, making it unworthy of a filter; again, RC patrol and watchlisting. 101 in itself is OK, but the low number of hits raises questions about necessity; again, it may be more suited to watchlisting and RC patrol. We seem to be forgetting that RC patrol didn't stop the day the abuse filter came online. Prom3th3an (talk) 01:04, 30 April 2009 (UTC)

There is no reason that RC, watchlisters, or anyone else ought to be asked to revert these things ten times a month if they don't have to. 2 ms is a negligible burden and it frees up people to worry about other things. 94 is a movie inspired meme, and unlikely to be just one person. The log has IPs from 5 US states and Canada. Similarly 101 has 3 US states and 3 other countries. Besides, maybe when blocked those people will attack somewhere else and maybe they won't. We know that most people shown a page blanking warning (for example) don't continue the edit. I suspect that most people tempted to propagate a specific meme are also likely to give up if blocked. The point of the abuse filter was always to reduce the workload on RC and related tasks, and I'm not sure why we'd back away from that now. I believe historically things like 94 and 101 were in fact more what inspired its creation rather than our more heavily hit filters (like page blanking). Dragons flight (talk) 01:57, 30 April 2009 (UTC)

Prom3th3an pretty much sums up my views, so there's no reason to restate them here. Quick, somebody added a word to an article! Let's add it to the abuse filter! There's vandalism and there's abuse. Ten in a month is vandalism. Ten in a minute is abuse. In my opinion, only the latter should get a filter. (This is of course ignoring egregious cases.) Q ^{T C} 01:21, 30 April 2009 (UTC)

If some of you guys are willing to take over watching the articles that get the same recurring vandalism then great. Reverting something ten times in a month, month after month, gets old. Will Beback talk 01:27, 30 April 2009 (UTC)

Will Beback, regarding the San Diego article: a review of the history indicates that semi-protection would be good here, as most if not all IP contributions are vandalism of some sort. I'm going to assume no clue here and say that if you are not aware of how protection works or where to request it, please see Protection Policy and Requests for Page Protection, as San Diego has never been protected before. So, in the words of arbcom (_{play on words of dispute line}), try all other methods of resolving vandalism (revert, watchlist, block, protect) before requesting a filter. The abuse filter should be the last option. Prom3th3an (talk) 01:54, 30 April 2009 (UTC)

A simple filter should be preferred over semi-protection, since that would allow more editing not less. It is not at all the last option. (I haven't looked at the history of San Diego, so there may be other reasons to encourage protection beside the problem with the movie meme.) Dragons flight (talk) 02:01, 30 April 2009 (UTC)

Where did ArbCom say that? Mr.Z-man 02:08, 30 April 2009 (UTC)

It's a play on words from their dispute resolution line, not meant to be taken literally. San Diego has been semi-protected to break the vandalism cycle on that article, so that (94) abuse filter can remain disabled. Dragons flight: semi-protection to break a persistent, non-stop cycle is far preferable to a permanent abuse filter. Temporary semi-protection will typically make the vandal lose interest in an article, so it should have been tried first. Prom3th3an (talk) 02:13, 30 April 2009 (UTC)

It makes no sense in this context though. ArbCom should be the last step, because there is nothing above it, except perhaps a direct intervention by Jimbo. With the AbuseFilter, we can avoid the collateral damage caused by protection and blocking. Just because there isn't an explicit expiration set for abuse filters doesn't mean that they're permanent. Mr.Z-man 02:20, 30 April 2009 (UTC)

Yes, great idea, let's disenfranchise good faith anons because of vandals. Seems like we'd be letting them win, don't you think? –xeno ^talk 02:21, 30 April 2009 (UTC)

OverlordQ, Promethean, indeed, we don't need a filter for every single form of hit-and-run vandalism. But I am sure that the AFE who wrote these filters wrote them because the problem was broader, and other solutions were causing more aggravation or problems. This still is what the abuse filter was written for, and I do not believe that it was written to log edits which may be problematic but for which the editors will never get even a warning. It makes much more sense to disable the log-only filters (not the ones in test phase, of course) and write a bot that replaces those rules (giving similar functionality), freeing up the resources on the Wikimedia servers and offloading them to the toolserver, e.g. --Dirk Beetstra ^{T C} 07:43, 30 April 2009 (UTC)

I'll stand by my words on this one - 10 (which is the at most scenario) obvious vandal edits a month can be absorbed by RC patrol easily, period. «l| Ψrometheăn ™|l» (talk) 08:11, 30 April 2009 (UTC)

Fine, though I doubt if RC patrol recognises long term POV pushing or spamming on certain articles as inappropriate edits (I know how often I have cleaned up behind our Argentinian POV pusher ...), so it comes down to those knowledgeable in the subject to revert and clean up, and the inappropriate edit may stand for hours. --Dirk Beetstra ^{T C} 08:31, 30 April 2009 (UTC)

Seems to me like the general philosophy of wikipedia would be that it's better to use these vandalism traps than to protect or semi-protect pages. Baseball Bugs ^{What's up, Doc?} carrots 08:52, 30 April 2009 (UTC)

The San Diego vandalism is like the various Colbert vandalisms: one specific false "fact" that keeps getting added. It is easily filtered without requiring the general restrictions like semi protection which would be far more disruptive. If a bot can be tasked with handling these repeat vandalisms then that'd be fine too, but it doesn't require any human judgment. Will Beback talk 14:20, 30 April 2009 (UTC)

Except that a bot would be removing vandalism from an article, whereas a filter would keep it out altogether, and the latter option seems better for the integrity of wikipedia content. Baseball Bugs ^{What's up, Doc?} carrots 14:24, 30 April 2009 (UTC)

Of course it can be handled by RC patrol. If we only used the filter for things that couldn't, we wouldn't be using it at all. We went 8 years without it; it's obviously not providing some critical support such that RC patrol would be overwhelmed without it. Mr.Z-man 17:11, 30 April 2009 (UTC)

I don't see anything related to performance on bugzilla. Could it be as simple as the single article filters get disabled because the filters don't have good support for that sort of thing; while the filters don't have good support for single article filters because they get disabled, anyway? -Steve Sanbeg (talk) 01:50, 1 May 2009 (UTC)

The filters are being disabled manually because people assume that "few hits" == "performance problem" which isn't necessarily the case. A well-written targeted filter may add less than 0.5–2 ms to each edit; you'd need a couple hundred like that to be the equivalent time of an eye blink, so disabling 4 or 5 isn't going to have a visible impact. Mr.Z-man 02:19, 1 May 2009 (UTC)
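As a rough check of the arithmetic in the comment above, here is a minimal Python sketch. The per-filter costs (0.5 ms and 2 ms) come from the comment; the eye-blink duration is not stated anywhere in this discussion, so `ASSUMED_BLINK_MS` is purely an illustrative assumption, and the function names are made up for the example.

```python
import math

# Assumption for illustration only: the discussion gives no blink duration.
ASSUMED_BLINK_MS = 300.0


def filters_to_reach(budget_ms: float, per_filter_ms: float) -> int:
    """Number of filters of a given cost needed to add up to budget_ms."""
    return math.ceil(budget_ms / per_filter_ms)


def added_latency_ms(filter_count: int, per_filter_ms: float) -> float:
    """Total time added to one edit by filter_count filters of equal cost."""
    return filter_count * per_filter_ms


if __name__ == "__main__":
    for cost in (0.5, 2.0):
        needed = filters_to_reach(ASSUMED_BLINK_MS, cost)
        print(f"{cost} ms per filter: {needed} filters to reach one blink")
    print(f"5 filters at 2 ms each add {added_latency_ms(5, 2.0)} ms per edit")
```

Under that assumed blink duration, the result lands in the "couple hundred" range mentioned above, and disabling four or five such filters saves on the order of 10 ms per edit.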

Assume, yes; report where the right people can confirm/deny/fix the problem, no. I think it could be improved, but it's hard to justify the effort if there's only 4 or 5 of them anyway. So if these filters kept accumulating, it eventually could become a problem, which could be fixed. But manually disabling filters because someday a problem may occur, or because we don't want to waste machine's time on something that could just as well be done by humans seems contrary to the purpose of the extension. If the problem does occur, that link shows where to report it; that should be more productive than not using the extension for fear of wasting a fraction of an eye blink. -Steve Sanbeg (talk) 02:36, 1 May 2009 (UTC)

I totally agree. As long as problems (like pages not wanting to save anymore, or continuous sluggishness of the 'pedia) are not observed regularly, there is, IMHO, no reason to turn rules off for performance reasons. Keeping them there gives a realistic feel of what the server is actively doing at the moment, and Werdna and others are still working on speeding up the system. I agree, when performance is a problem, then turning off some of the rules a) that don't have too many hits anyway (and then please, be sure to replace them with other measures again, and notify the creator and/or main AFE: there are or have been problems on the case where the rule hits!) or b) that are particularly slow (even if they do give thousands of hits) is the option. But until then, please leave them enabled. --Dirk Beetstra ^{T C} 07:29, 1 May 2009 (UTC)

I have re-enabled both 94 and 101. In addition to the supportive comments above, I would also like to note that the vandalism targeted by each of these rules has actually occurred during the period they were disabled. Dragons flight (talk) 08:51, 6 May 2009 (UTC)

Protection templates

Think we could have some sort of filter for non-admins adding protection templates to articles? That may help discourage putting them on pages that are not "officially" protected. ViperSnake151 Talk 19:06, 4 May 2009 (UTC)

Protection templates turn invisible when applied to pages that are not actually protected, so there is no need for such a filter. Happy‑melon 20:25, 4 May 2009 (UTC)

Oh cool. ViperSnake151 Talk 20:41, 4 May 2009 (UTC)

Plus I think there is a bot that comes along and removes them. 199.125.109.77 (talk) 03:41, 5 May 2009 (UTC)

User:Legobot III has just been approved for that. Pages are automagically added to Category:Wikipedia pages with incorrect protection templates. Cenarium (talk) 20:59, 7 May 2009 (UTC)

ClueBot

So... if we have abuse filters, then why is ClueBot still running? Cheers, theFace 19:02, 6 May 2009 (UTC)

There are certain patterns of vandalism that would be too computationally demanding for the AbuseFilter to catch without being a drain on system resources or unduly slowing down the editing process. A completely external bot can still act on these, however. Someguy1221 (talk) 19:10, 6 May 2009 (UTC)

The abuse filter isn't a programming language. It has some string/regex functions and basic math. Each filter is basically the equivalent of 1 "if" statement in a program; ClueBot is much more than a list of "if" statements. Mr.Z-man 05:57, 8 May 2009 (UTC)
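To illustrate the "one if statement" point, here is a small Python sketch, not AbuseFilter syntax. The `Edit` class and the function name are hypothetical; the three conditions are borrowed loosely from the filter 98 rule quoted later on this page.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    """Hypothetical container for a few variables describing one edit."""
    user_groups: tuple
    article_namespace: int
    new_size: int


def short_new_article(edit: Edit) -> bool:
    """A whole filter is a single boolean condition: no loops, no state."""
    return (
        edit.article_namespace == 0
        and edit.new_size < 150
        and "autoconfirmed" not in edit.user_groups
    )


if __name__ == "__main__":
    print(short_new_article(Edit(user_groups=(), article_namespace=0, new_size=80)))
```

A bot, by contrast, can keep state between edits, fetch other pages, and run arbitrary scoring logic, which is the distinction being drawn above.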

Violations of Copyright

Hello! I would like to know the code for the filter against possible infringements of copyright. Because I want to install the Wikipedia in Portuguese. Thank you. HyperBroad (talk) 18:20, 11 May 2009 (UTC)

Which filter are you referring to? Someguy1221 (talk) 19:34, 11 May 2009 (UTC)

Perhaps User:CorenSearchBot? Tim Vickers (talk) 20:22, 11 May 2009 (UTC)

Who is the operator of the bot? HyperBroad (talk) 00:00, 12 May 2009 (UTC)

I have not seen the name of Coren, thanks for the tip! HyperBroad (talk) 00:07, 12 May 2009 (UTC)

User self-renaming

I think Special:AbuseFilter/5 needs to be turned on with "prevention" and a custom warning, like testwiki:MediaWiki:Abusefilter-warning/selfrename. The last line of code (about user_talk → article) has to be moved to a separate filter with a separate message. — AlexSm 21:37, 12 May 2009 (UTC)

False positives and no filter admins "monitoring it"

Hi all. I've been monitoring the false positives page for a while, removing inappropriate reports and commenting on some of the reports that come through. However, I'm not a filter admin, so in the case of filters that might be acting inappropriately or are obviously broken and need fixing, I can't do anything about it.

It seems to me that there are no filter admins monitoring this page, and there is now quite a backlog of requests that could really do with being processed (I'm also unsure of when to move them into the "reviewed reports", and also when to archive them).

I think it is a good idea to have this page, as obviously false positives will occur, but it needs monitoring by people who can do something about it. If it is the case that the existing filter admins don't have enough time to do this, then perhaps a request needs putting out for additional filter admins to come on board? ~~ [ジャム]^[t - c] 07:29, 13 May 2009 (UTC)

Looking through all the entries listed since April 30, every reported false positive has what seems to be a reasonable response. Are you complaining about a specific entry? עוד מישהו Od Mishehu 08:27, 14 May 2009 (UTC)

That is because after I commented here, Ruslik0 went through and updated many of the entries. I'm just concerned that it took a (quite strongly worded) message on here to get an admin to either comment on or make changes based on the reports. ~~ [ジャム]^[t - c] 19:07, 14 May 2009 (UTC)

You may wish to poke some more admins with sticks at WP:AN to get more eyes on the page. I doubt there are many admins watching it presently. –xeno ^talk 19:09, 14 May 2009 (UTC)

I did consider that, but thought it best to try and aim at the people who were specifically interested in the abuse filter, rather than all admins. Maybe I'll post an abridged and edited copy at AN, just so it's on record there. ~~ [ジャム]^[t - c] 19:40, 14 May 2009 (UTC)

I've been preoccupied for some weeks running the m:Licensing update, but I'd intend to pay more attention to FALSEPOS once I get more free time. Dragons flight (talk) 20:41, 14 May 2009 (UTC)

Thanks for the note Dragons flight. I appreciate that many of the filter admins have other jobs that they do, but it just seemed that I needed to try and make people aware of issues that were arising which couldn't be dealt with by us "mere mortals" :) ~~ [ジャム]^[t - c] 21:21, 14 May 2009 (UTC)

Filter 102

Resolved

– "Filter" removed as too general. –xeno ^talk 19:08, 14 May 2009 (UTC)

For starters, There are quite a few false positives for filter 102. Secondly, I request that the ACC group be excluded from all filters that impact on account creation in any way, shape or form. Thirdly filter 102's message (and all filters that impact on account creation) should include a link to ACC to get the account made manually, as well as a link to false positives. «l| Ψrometheăn ™|l» (talk) 14:31, 13 May 2009 (UTC)

I think there is a bug already filed, but last I looked user group checks did not work at all for account creation filters, which meant the account creators group can not override a filter. Dragons flight (talk) 14:53, 13 May 2009 (UTC)

Also, of the 434 hits, ~400 were created by a previous coding error so the discussion of false positives should focus on just the time after March 25th. That said, I'm not sure why someone added the equivalent of "filter.*" to the prohibited list. That seems like a bad idea to me. Dragons flight (talk) 14:58, 13 May 2009 (UTC)

Yes, please remove that. I don't even think "abuse filter" should be blocked from creation, but "Filter", definitely not. –xeno ^talk 15:14, 13 May 2009 (UTC)

Remove the filter for short new articles

I suggest we remove the warn filter for short new articles. Longer articles are better, but short stubs are also useful and can be expanded by others. --Apoc2400 (talk) 10:14, 15 May 2009 (UTC)

I could definitely see this as off-putting. –xeno ^talk 12:53, 15 May 2009 (UTC)

I limited the filter to non-autoconfirmed users and reduced the limit to 150. Ruslik (talk) 14:47, 17 May 2009 (UTC)

Upcoming Albums

I think that upcoming albums should be allowed to be posted on the band's page. But I do not think the album's own page should be created unless the album name is known. At the moment neither is allowed. I think it should be fine to write 'Upcoming studio album (year)' and then leave a reference. This is a great resource for fans to see if the band is producing a new album. Who agrees? --Arnies (talk) 11:46, 17 May 2009 (UTC)

They can be mentioned in text, but they don't belong in discography sections until they are real.—Kww(talk) 14:53, 17 May 2009 (UTC)

Help needed with filter 171

When looking through the hits of Filter 171 (currently set to log only), I found some hits which seem to represent multiple edits, as opposed to single edits. For example, this entry seems to represent this diff, which is actually 2 edits by different users. Does anyone understand why this happens? עוד מישהו Od Mishehu 10:47, 20 May 2009 (UTC)

I think, there was an edit conflict and the edits were merged as a result. Ruslik (talk) 11:02, 20 May 2009 (UTC)

Is there any way to get around this problem? עוד מישהו Od Mishehu 11:05, 20 May 2009 (UTC)
(ec) That sounds likely, from the examples I looked at, but it's still a bug. One example with more time between the two edits is diff&details.
Should be easy to test and verify this theory. Amalthea 11:09, 20 May 2009 (UTC)

Proposal - Removing underscore as a repeating character tag?

Love the AF tagging process - makes vandalism catching much easier. However, I've noticed a few cases (two within the last hour) where long strings of underscores, which are appropriately used in org chart pages or family lineage pages, trip the filter. Just wondering if we think this is a safe enough character to be excluded from this AF tag (as well as perhaps ASCII art or other dashes). Would have to weigh the odds of a vandal using underscores, I guess. (Here are two examples: [8] and [9].) 7 ^{talk | Δ |} 01:08, 29 May 2009 (UTC)

Client side abuse filter

Let's stop beating the dead horse :) -- Luk ^talk 12:46, 28 May 2009 (UTC)

Very well. Let go of the stick and calmly walk away.

Smallman12q (talk) 02:17, 30 May 2009 (UTC)

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

Information on the performance of abuse filters can be found at Wikipedia:Abuse_filter/Performance.

A client-side JavaScript abuse filter should be developed so that the wiki servers won't get hammered as much. A number of the filters can easily be converted. Smallman12q (talk) 21:16, 16 May 2009 (UTC)

That would be trivial to circumvent, so it would be useless. Prodego ^talk 22:20, 16 May 2009 (UTC)

Most vandalism comes from people who are bored rather than from 4chan targeted attacks. A client-side abuse filter would cut the amount of vandalism by at least 20%. While it could be circumvented, so can the current filters, and yet we still have them. Smallman12q (talk) 15:53, 23 May 2009 (UTC)

The abuse filters are not "hammering" the servers. If they were, we wouldn't have them. This is why we have the servers, to do as much of the processing as possible so that the client doesn't have to. Additionally, JavaScript is heavily cached by browsers, so if there's any errors made in a JS filter, it might take a month for a fix to propagate. And a JS filter would be far, far, far easier to circumvent than the normal ones; it would take about 20 seconds to bypass every JS filter at once. Mr.Z-man 16:15, 23 May 2009 (UTC)

Even though they can be bypassed, most vandals won't go that far. As I said before, people won't bother going around the client-side filter unless they are persistent vandals (against whom the current filters don't help that much). As for the caching, while that is true, there are ways around this problem. For example, the JavaScript would be given a certain build number, and when it blocked an action, the build number + "filter rule number" would be checked against Wikipedia to see if there was an update. The point is that most vandals are simply bored and decide to come to Wikipedia to kill a bit of time. And while the filters aren't hammering the servers per se, they are increasing the amount of time between edit saves. I've also created a support/oppose list so that the debate can be more clearly seen and so a consensus of opposition or implementation can be reached. Smallman12q (talk) 15:22, 25 May 2009 (UTC)

Support

Support - I believe this would reduce the amount of time between edit saves by moving the processing to the client side (there is really minimal processing). Smallman12q (talk) 15:25, 25 May 2009 (UTC)

Oppose

Completely un-necessary, solution in search of a problem. ╟─Treasury Tag►hemicycle─╢ 15:38, 25 May 2009 (UTC)
I can almost guarantee that a bugzilla request for this would be closed as wontfix. Besides the fact that much of the information the filters use isn't directly accessible to JavaScript, the sole purpose of the servers is to do the processing; that's what they're there for. JavaScript should really only be used for dynamic content, as that can't really be done server-side. It would probably only decrease execution time for people with fast computers; for people with slow computers and old browsers, it would probably take much more time. And the amount of time we're talking about saving here is in the range of milliseconds – "premature optimization is the root of all evil" ^[10] Mr.Z-man 15:57, 25 May 2009 (UTC)
...And we'd kill our users running "slow" computers. -- Luk ^talk 16:06, 25 May 2009 (UTC)
You should discuss this with people who know what they're talking about, or at least tell the person who wrote the extension (me) that this discussion is going on. With that said, this seems pointless, difficult to implement, and with numerous shortcomings. — Werdna • talk 09:02, 27 May 2009 (UTC)
Oppose, I think you would still need to run all the filters, since you would need to guard against people disabling a client-side app. Therefore you wouldn't save any wiki server time. Tim Vickers (talk) 16:51, 27 May 2009 (UTC)
Oppose. Filters on the client can be trivially circumvented, much of the information needed is not visible to the client, and it would also be impossible to keep the details of filters hidden if they were to run on the client. Regarding server load, Wikipedia is not growing anywhere near as fast as Moore's Law -- indeed, the rate of growth is currently slowing -- so this is not a serious scalability issue in the long term, and we can just throw hardware at the problem where necessary in the short term (which is cheap, compared to developing software). As TreasuryTag said, this is a solution in search of a problem. -- The Anome (talk) 11:54, 28 May 2009 (UTC)

Neutral

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Filter Special:Abusefilter/172

Resolved

– Smallman12q (talk) 02:20, 30 May 2009 (UTC)

I wrote a filter to detect section blanking (as opposed to page blanking). As far as I can see it seems quite correct, but I'd like a second AFE to have a look at it (and I am going away for a couple of days as well), as I think it is good enough for a section-blanking notice (it is quite often vandalism, though not necessarily). --Dirk Beetstra ^{T C} 13:01, 21 May 2009 (UTC)

Does the filter also catch subsection blanking, such as "===" or "===="? Smallman12q (talk) 15:36, 25 May 2009 (UTC)

It should: it takes everything that contains a section header and where the next two characters are '==', which should include === and ==== ... maybe worth a test? Otherwise an upgrade is necessary, though it already handles the sections pretty correctly. --Beetstra (public) (Dirk Beetstra^{T C} on public computers) 23:07, 27 May 2009 (UTC)

You mean details ?? --Beetstra (public) (Dirk Beetstra^{T C} on public computers) 23:19, 27 May 2009 (UTC)

And details. --Beetstra (public) (Dirk Beetstra^{T C} on public computers) 07:00, 28 May 2009 (UTC)

Didn't see those. Looks fine then. Smallman12q (talk) 02:20, 30 May 2009 (UTC)
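The point made in this thread about '==' also covering deeper headings can be sketched in Python. This is only an illustration of the idea: the actual rule text of filter 172 is not shown on this page, so the pattern and the function name below are hypothetical stand-ins.

```python
import re

# Hypothetical stand-in for the filter's test: a line that begins with "==".
# "===" and "====" headings begin with the same two characters, so they match too.
HEADER = re.compile(r"^==")


def removed_header_levels(removed_lines: str) -> list:
    """Return the heading depth of every header line in the removed text."""
    levels = []
    for line in removed_lines.splitlines():
        if HEADER.match(line):
            # Count the leading "=" characters to get the heading depth.
            levels.append(len(line) - len(line.lstrip("=")))
    return levels


if __name__ == "__main__":
    sample = "== History ==\nSome text\n=== Early years ===\n==== Details ===="
    print(removed_header_levels(sample))
```

A check anchored on two equals signs therefore needs no extra cases for subsections; it would only need tightening if top-level and nested sections had to be told apart.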

Filter 39

Could someone add the terms fuck, whore, and horny, for example as f(a|e|u)?ck|whor|horny? (I'm not sure if I got the code right.) Thanks. Smallman12q (talk) 18:21, 31 May 2009 (UTC)

I added "whore" and "fuck". Ruslik_Zero 09:36, 2 June 2009 (UTC)
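For anyone wanting to sanity-check the pattern proposed in the request above, here is a short Python sketch. It uses Python's `re` module rather than the filter's own regex engine, so behaviour may differ in detail, and the `BOUNDED` variant is a hypothetical tightening for comparison, not something that was added to filter 39.

```python
import re

# Pattern exactly as proposed in the request above.
PROPOSED = re.compile(r"f(a|e|u)?ck|whor|horny", re.IGNORECASE)

# Hypothetical variant for comparison: whole words only.
BOUNDED = re.compile(r"\b(f(a|e|u)?ck|whore|horny)\b", re.IGNORECASE)


def hits(pattern, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for text in ("what the fuck", "a whorl of leaves", "a feckless plan"):
        print(text, hits(PROPOSED, text), hits(BOUNDED, text))
```

Unanchored fragments such as "whor" match inside innocent words like "whorl", much as "admin" matched "Badminton" in the filter 102 discussion, so word boundaries are worth considering.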

Speed question

Is it worth checking the length in addition to an rlike comparison? For example (for filter 104):

 !("sysop" in user_groups) &
 (article_namespace == 0) &
 (contains_any(added_lines,"{{helpme}}","{{adminhelp}}"))

or (improved)

 !("sysop" in user_groups)
 & (article_namespace == 0)
 &(length(added_lines) >= 10)
 &(lcase(added_lines) rlike "{{(helpme|adminhelp)}}")

My question: is "&(length(added_lines) >= 10)" needed?

Also, is rlike faster than contains_any? (And is there any link to where these speeds are documented? The documentation for the abuse filter is truly lacking.) Smallman12q (talk) 20:42, 31 May 2009 (UTC)
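On the correctness side of the length question, a small Python sketch may help. It is not AbuseFilter code and says nothing about the speed of rlike versus contains_any in the extension; it only checks the logic. The pattern is the one apparently intended in the second version above (with the group closed), and the function names are made up for the example. The shortest string that can match is "{{helpme}}", which is exactly 10 characters, so a length >= 10 test can never change the outcome; at best it is a cheap early exit.

```python
import re

# Pattern as apparently intended in the second version, with the group closed.
TEMPLATE = re.compile(r"\{\{(helpme|adminhelp)\}\}")


def with_length_check(added_lines: str) -> bool:
    """Second version: length pre-check, then case-insensitive regex."""
    return len(added_lines) >= 10 and TEMPLATE.search(added_lines.lower()) is not None


def without_length_check(added_lines: str) -> bool:
    """Same test with the length pre-check dropped."""
    return TEMPLATE.search(added_lines.lower()) is not None


SAMPLES = ["", "{{help}}", "{{helpme}}", "text {{AdminHelp}} more", "{{helpme", "x" * 50]
AGREE = all(with_length_check(s) == without_length_check(s) for s in SAMPLES)

if __name__ == "__main__":
    print(len("{{helpme}}"), AGREE)
```

Note also that the two versions are not strictly equivalent: the lcase call makes the second one case-insensitive, whereas the contains_any version as written matches only the lowercase template names.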

Possibly optimized 98

Old

 (article_namespace == 0)
 & (article_recent_contributors == "")
 & (new_size < 150)
 & !("autoconfirmed" in user_groups)
 & !(contains_any(lcase(new_wikitext), "{{surname}}", "{{given name}}", "{{delrev}}", "#redirect", "{{softredirect}}", "{{db-unpatrolled}}"))
 & !('disambig' in article_text)
 & !('disambig' in lcase(new_html))

Combining 9 and 39

Could filters 9 and 39 be combined? Smallman12q (talk) 20:28, 31 May 2009 (UTC)

I would prefer this not to happen. Filter 39 is not intended as a general vandalism filter, but is intended to target specific problems within a specific subset of articles. The use of the words "expelled", "brothel", "had sex", and "abuse" are good examples of the specific problematic additions to school articles not covered by Filter 9. There may be a case to remove the general vandalism words from Filter 39 where there is duplication, but it is otherwise a different kettle of fish. -- zzuuzz ^(talk) 20:53, 31 May 2009 (UTC)

Okay then, perhaps repeated words should be removed. Smallman12q (talk) 20:57, 31 May 2009 (UTC)

Potentially dumb question

Now, I love the AF. However, I am inclined to ask... does it even work? After reviewing Special:AbuseLog, I notice that there are a lot of actions that still seem to get through and it lets them do it, even though it triggers the AF. Why doesn't it stop all those actions? —Mr. E. Sánchez (that's me!)^{What I Do} / _{What I Say} 22:37, 1 June 2009 (UTC)

Not all actions which trigger a filter need to be stopped. Some are allowed after displaying a specialised notice to the user, for example Users creating autobiographies. Others are tagged for further review in recent changes. The risk of false positives, or acceptable edits which match the filter, means they don't always need to be completely prevented. You can see the different consequences for each filter at Special:AbuseFilter. -- zzuuzz ^(talk) 23:44, 1 June 2009 (UTC)

And some filters are in "log only" mode, which means that there is no trace of the abuse filter action visible anywhere except for the AbuseLog. If you really want to know which actions are stopped, look at the "actions taken" section. עוד מישהו Od Mishehu 08:14, 4 June 2009 (UTC)

Abuse filter 164

Special:AbuseFilter/164 seems to be doing what it was meant to, although it catches a good number of false positives. Namely, editors' using an existing article as a template and failing to remove maintenance tags, or editors' tagging their own articles as unreferenced. I can't tell how many of the deleted or redirected hits came from people monitoring the filter or from everyday newpage patrol, so I was wondering if anyone here could clarify that. Because if it is just newpage patrol catching the more obvious cut & paste moves, I'm planning to remove the unreferenced tag from the search parameters to cut down on the false positives. Someguy1221 (talk) 08:22, 2 June 2009 (UTC)

Excessive whitespace

This filter pops up whenever I archive a talk page. Is it too sensitive? 70.29.208.129 (talk) 14:47, 2 June 2009 (UTC)

The edits you made were to Wikipedia_talk:Images_for_upload/svg/H1N1_USA_deaths_-_CSS_map.svg/1 and Wikipedia talk:Images for upload/svg/H1N1 USA deaths - CSS map.svg/2. I don't quite understand what you were doing on those two pages. Could you explain? Hipocrite (talk) 14:53, 2 June 2009 (UTC)

Those are the process that WP:IFU has come up with for uploading textual based image files. There's a discussion on it at WP:IFU and my talk page. 70.29.208.129 (talk) 15:38, 2 June 2009 (UTC)

I'm not referring to a recent edit, but to old edits from a couple of weeks ago (which I didn't bother to click on the thing), I only happened to notice the "TAG" in blue in the edit history, and am leaving a comment here (since I can't very well make a bug report, since I don't remember which pages I archived) However it happened on a recurring basis several times for several different archivals. 70.29.208.129 (talk) 15:36, 2 June 2009 (UTC)

Thanks. I'm not sure if the cost of experienced IP editors like yourself getting tagged on very technical edits (table merges, text image uploads, and the like) outweighs the benefit of the filter. Have you considered making an account? Hipocrite (talk) 15:40, 2 June 2009 (UTC)

I certainly expect that the text image upload would get tagged (which is why I didn't do a bug report for it). I'm just referring to archivals of talk pages, since I don't see how those can have excessive whitespace. (and the merger as well; since it seems to have a reasonable amount of whitespace). 70.29.208.129 (talk) 15:51, 2 June 2009 (UTC)

I see little reason to have an account. Even less, now that WP:IFU is working properly. It was an idea when IP users were banned from creating articles, but WP:AFC is functioning in a timely manner now, and I just don't bother creating that many articles anymore. 70.29.208.129 (talk) 15:51, 2 June 2009 (UTC)

Automatic AIV reporting

For those who haven't already noticed, I've started running a bot to report some abuse filter hits to AIV. This is an extension of the bot I've been running in the abuse filter IRC channel for a few months now. The full details about what it will report are at Wikipedia:Bots/Requests for approval/Mr.Z-bot 7. The lists of filters it monitors are at User:Mr.Z-bot/filters.js; any admin can change the lists and it will be reloaded by the bot within 5 minutes. One list is for filters that should trigger an AIV report for all hits - anything that would be considered "block on sight" vandalism. The second list is for all other filters that prevent vandalism. These should be filters that catch what is unambiguously vandalism, preferably ones already set to warn or disallow (so the user gets adequately warned). Mr.Z-man 00:53, 3 June 2009 (UTC)

How to save changes?

On forms to create and edit filters, I see all the controls enabled, but no button to actually save the changes. If I can edit filters, shouldn't there be a button to save the edits? If not, shouldn't the controls be disabled? Neon Merlin 02:38, 3 June 2009 (UTC)

Only abusefilter editors (all of which are admins, to my knowledge) can actually edit the filters, for security. It just uses the same interface for admins and non-admins, but there isn't a save button for non-admins. If there is a specific edit that you'd like to have made you can probably mention it here and someone should take a look. –Drilnoth (T • C • L) 02:44, 3 June 2009 (UTC)

External URL abuse filter preventing proper tagging of db-copyvio

When tagging a db-copyvio (using twinkle or manually) where the copyrighted URL is on the blacklist, the spam filter prevents it. As a workaround I insert a space somewhere in the domain name, but it still slows things down a bit.

Would it be possible to have the spam filter ignore text in curly brackets? 7 ^{talk | Δ |} 23:30, 3 June 2009 (UTC)

Ignoring in curly brackets would let the {{official}} template be used, probably others. Maybe a more specific ignore, like "{{db-"? tedder (talk) 23:42, 3 June 2009 (UTC)

This has nothing to do with the abuse filter. As far as I know there is no way for the spam blacklist to be circumvented in this way. The usual solution is to either remove the "http://" part of the URL, or insert a space, neither of which takes very long. It may also be possible to get the template to mangle the URL for you, but I'm not sure if that could be both popular and a time-saver. -- zzuuzz ^(talk) 00:50, 4 June 2009 (UTC)

Strange, because an admin was able to include it in their speedy deletion summary - so I assumed it was SBL and that if user=admin it stops checking. Seems like you should be able to do the same thing if within a db-copyvio tag. Perhaps deletion summaries aren't checked by SBL... 7 ^{talk | Δ |} 08:29, 4 June 2009 (UTC)

What zzuuzz means is that the WP:Spam blacklist is separate and independent of the abuse filter, hence it isn't controlled by us. Aside from the decision of what to list (handled by admins), the behavior of the spam blacklist is mainly set by the software developers. Dragons flight (talk) 08:44, 4 June 2009 (UTC)

Ok - thanks - should I mention it over there, or do you think it's a non-starter? 7 ^{talk | Δ |} 09:14, 4 June 2009 (UTC)

Warning people who ignore abuse filter warnings

Considering that editors have already been warned once when they trigger the abuse filter, shouldn't vandal fighters immediately proceed to warning level two when reverting vandalism where the abuse filter was ignored? Better yet, shouldn't we make a special series of warning templates for people that ignore the abuse filter? PCHS-NJROTC ^(Messages) 00:26, 6 June 2009 (UTC)

I would think so. Please note, however, that vandal fighters will frequently skip levels or repeat the same warning level. עוד מישהו Od Mishehu 07:19, 8 June 2009 (UTC)

Personally I consider each trigger to be a warning, and I block directly (for 3 hours) after 5 triggers (if it's separate edits). -- Luk ^talk 08:35, 8 June 2009 (UTC)

Filter 81

Please, will someone reply to this request? Having the log of bad charts greatly simplifies my life, and letting the list of charts that the filter detects go stale means that I have to go back to manually searching for them: a tedious and error-prone process.—Kww(talk) 01:05, 6 June 2009 (UTC)

Removal of speedy deletion tags

In this [edit here] I replaced a speedy deletion tag with a hangon tag, and it did not trigger the filter. I was not logged in at the time. A new user has just done the same thing on a page which probably will get deleted. Is this a bug? —Preceding unsigned comment added by Martin451 (talk • contribs) 17:46, 9 June 2009 (UTC)

Working as intended ^{I believe}. You removed a deletion tag, but you added the hangon tag, which still categorizes the page in "contested speedies". See Special:AbuseFilter/29. –xeno^talk 17:57, 9 June 2009 (UTC)

Problem

There's a major problem, I can't save a redirect over an existing page. It used to let me press save again, but now pressing it ten times in a row results in no save. So I have to save a redirect on top of content instead of clearing the page to do so. This is a very bad error. 70.29.212.226 (talk) 06:21, 14 June 2009 (UTC)

I added an exception to the filter 33, which will allow redirecting talk pages. Ruslik_Zero 15:12, 14 June 2009 (UTC)

This is an interesting case. There are two filters, which only warn and tag (28—warn only and 33—warn and tag). However the cumulative effect of their combination is disallow. After the first warns, the second tags, then the first warns again etc (infinite cycle). Ruslik_Zero 15:46, 14 June 2009 (UTC)
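
A toy simulation may make this kind of cycle easier to see. It assumes, purely for illustration, that the software remembers only the most recent filter that warned the user and waives the warning only for that filter. The real AbuseFilter bookkeeping may well differ, and `simulate_saves` is a hypothetical name, not part of the extension.

```python
def simulate_saves(matching_warn_filters, attempts):
    """Simulate pressing "save" repeatedly on the same edit.

    ASSUMPTION (not confirmed by the extension's code): only the most
    recent filter that warned is remembered, and a filter lets the edit
    through only if it was the one that warned on the previous attempt.
    """
    last_warned = None
    history = []
    for _ in range(attempts):
        blocked_by = None
        for filter_id in matching_warn_filters:
            if filter_id != last_warned:
                # This filter did not warn on the previous attempt, so it warns now.
                blocked_by = filter_id
                last_warned = filter_id
                break
        history.append(blocked_by if blocked_by is not None else "saved")
        if blocked_by is None:
            break  # the edit went through, stop retrying
    return history


# One warn filter: a single warning, then the second attempt saves.
print(simulate_saves([28], attempts=6))
# Two warn filters matching the same edit: the warnings alternate forever.
print(simulate_saves([28, 33], attempts=6))
```

Under that assumption a single warn filter costs one extra click, while two warn filters that both match the same edit keep invalidating each other's "already warned" state, so the edit is never saved.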

Filter 180

Just wondering how my creation of the article John Paskin triggered the tag "large unwikified new article"? cheers, Struway2 (talk) 09:04, 15 June 2009 (UTC)

I am seeing this also. This seems to be a new bug which started happening today. --Big_iron (talk) 09:57, 15 June 2009 (UTC)

Also happened on Church of St Peter and St Paul, Muchelney - what is this supposed to achieve & why is it labelling articles as unwikified?— Rod ^talk 15:30, 15 June 2009 (UTC)

Please watch your regular expressions.

There's a bug in AbuseFilter as seen here, that can unfortunately kill all edits if an invalid regex is present in ~~added to~~ a filter. ~~So be sure to click CHECK SYNTAX BEFORE submitting.~~ I had to fix this filter. Q ^{T C} 11:29, 15 June 2009 (UTC)

Ironically, this was created by the process that now prevents bad regex from being saved. Before the recent update, there was no error checking on regex at all (any bad regex was permanently false). Now malformed regex will prevent the rule from being saved. It's ironic that I'm the one who asked for built-in regex error checking and I'm also the one caught by it. Sigh. Hopefully once any other bad regex in existing rules is removed this won't be able to recur. Dragons flight (talk) 15:41, 15 June 2009 (UTC)

PS. Just to be clear, before the recent update, check syntax reported all regex as good even if it contained obvious errors. Dragons flight (talk) 15:42, 15 June 2009 (UTC)

Ahh, that explains a bit more. Q ^{T C} 15:45, 15 June 2009 (UTC)

Well and thoroughly borked

Bugzilla:19216. The recent update massively broke things. added_lines and removed_lines are behaving erratically. This is most obvious in rule 180, mentioned further up the page, but is also affecting a large number of other rules. Many may simply need to be disabled until this is fixed. Dragons flight (talk) 15:45, 15 June 2009 (UTC)

I think we need to disable everything, see this for example. The filters don't make sense anymore. We may have massive collateral damage. Cenarium (talk) 15:59, 15 June 2009 (UTC)

Actually, the filter tripped (#30) was the large deletion filter, and may actually have been working as intended. The summary reported was "Page move vandalism" though which is NOT #30. Obviously there is a problem there, but I'm not sure if it is related to the problem described above or not. Disabling everything may be wise. Before doing that, I would suggest deleting any existing disabled filters, so it will be easier to figure out which ones to re-enable later. Dragons flight (talk) 16:24, 15 June 2009 (UTC)

I've filed the incorrect filter description as bugzilla:19218 as I believe it is probably an unrelated issue to the keywords not working. Dragons flight (talk) 16:41, 15 June 2009 (UTC)

This appears to be the case for all log details, even old ones, e.g., the given description of a filter is always Common page-move vandalism; but it doesn't act for it, so it's not a big problem. It's probably unrelated, but may be also due to the recent software changes. The problem with keywords would be serious if other filters with actions had similar problems to those of rule 180. I haven't found other instances of misbehavior, and they would have probably surfaced by now. So there may be no need to deactivate right now, but closely monitor the filter activity. Cenarium (talk) 16:46, 15 June 2009 (UTC)

According to Werdna, 19216 should be fixed as of 30 minutes ago. I've turned 180 back on to see if it is working now. Dragons flight (talk) 15:28, 16 June 2009 (UTC)

Filter stopped working - strange testing results ..

I noticed that Special:AbuseFilter/172 stopped working, and reading above about the problems with other filters, I tried to figure out why the filter stopped working ...

There are no items since early yesterday morning ..
Special:AbuseFilter/test/172 on the last IPs hitting yesterday (114.76.219.114, 196.201.34.165) does not result in any hits (while they hit earlier)

However:

examining a 'non-hit' (which were originally hits) by one of the IPs (test 172 against edits by 196.201.34.165 with showing edits that do not match the filter, examining the edit that did hit the filter originally (June 15, 9:05 to Alien))
clicking 'test'

gives a positive result!

Eh, anyone? --Dirk Beetstra ^{T C} 09:50, 16 June 2009 (UTC)

I played around with this a bit, and saw that testing it without "(added_lines == "")" did give hits. I changed that line now to "(length(added_lines) < 1)" (which is essentially the same ..), and now it works again. Funny bug. --Dirk Beetstra ^{T C} 09:25, 17 June 2009 (UTC)

Automatic AIV reporting 2

The bot to report abuse filter violators to AIV is now approved. Again, the filter lists it uses are at User:Mr.Z-bot/filters.js. Any admin can edit the lists to affect what it will report. Mr.Z-man 20:40, 16 June 2009 (UTC)

Request for filter on html

Not sure if this should be in a filter, or put into Cluebot's code, but would it be feasible to check for the addition of certain html, and tag accordingly.
e.g. <div style="display:none"> like in this edit which effectively blanked the page whilst leaving the source there. Of course the time taken for the filter to work might not be worth the hits it gets. Martin451 (talk) 22:29, 16 June 2009 (UTC)
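
As a rough illustration of what such a check might look like, here is a minimal Python sketch that looks for an opening tag with `display:none` in an inline style. The pattern and the name `adds_hidden_block` are assumptions made for this example; this is not an existing filter or ClueBot code, and a real rule would be written in the filter's own syntax.

```python
import re

# Matches an opening HTML tag whose inline style attribute contains
# "display:none" (hypothetical pattern; whitespace and letter case may vary).
HIDDEN_STYLE = re.compile(
    r"<\w+[^>]*\bstyle\s*=\s*[\"'][^\"']*display\s*:\s*none",
    re.IGNORECASE,
)


def adds_hidden_block(added_lines: str) -> bool:
    """Return True if the added text contains an element hidden via inline CSS."""
    return HIDDEN_STYLE.search(added_lines) is not None


print(adds_hidden_block('<div style="display:none">'))  # hidden wrapper
print(adds_hidden_block('<div style="color:red">'))     # ordinary styling
```

A check like this only sees the wikitext that was added, so it would miss content hidden through templates or CSS classes.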

Tagging legitimate edits

If a user has over 50,000 edits and has been registered for over four years, let's assume they know about copy and paste moves. Stop tagging legitimate edits. Given the number of variables and options available in the AbuseFilter extension, the current behavior is simply unacceptable. Fix the damn filter. Thanks. --MZMcBride (talk) 02:19, 17 June 2009 (UTC)

Done. Ruslik_Zero 07:23, 17 June 2009 (UTC)

Filter Reduction

Nearly 10% of all edits are hitting the condition limit now, so it is rather imperative we reduce the numbers of filters in order to stem this problem. I intend to cut back on the following types of filters: log only, very low hit, tag only, and duplicate filters. Prodego ^talk 07:50, 17 June 2009 (UTC)

Done and we are back below 2%. Let the complaining begin... Prodego ^talk 08:25, 17 June 2009 (UTC)

Actually filter 6 has had just one hit since yesterday, when I created it. The old filter 6 was disabled long ago and had 4 hits. Ruslik_Zero 08:48, 17 June 2009 (UTC)

Ah, I must not have noticed the change. If I recall, filter 6 was pretty light anyway, you can go ahead and reenable if you want. Prodego ^talk 17:40, 17 June 2009 (UTC)

I brought this up with Werdna (due to a general feeling that we jumped in the condition limiter after the update), but he didn't have anything constructive to add. So let me explain a bit about what the condition limiter measures. The following discussion is somewhat approximate (because explaining the code exactly would take longer).

The condition limiter, roughly speaking, counts each "condition" in the rule:

One count for each boolean operand: X & Y & Z = 3 conditions,
One count for each parenthetical clause it must enter:
- X & (Y & Z) is 4 conditions if X is true, but only 2 conditions if X is false, since it never enters "(Y & Z)".
- In general parentheses are a very good thing because they facilitate more efficient short circuiting of unnecessary clauses which benefits both timing and condition counts, and generally improve legibility, though since they do count as a condition this can be overdone.
One count for each parameter in a function it must evaluate, plus one count the first time a function is called with those parameters.
- contains_any(added_lines, "foo", "bar") is four conditions the first time it is evaluated, and three conditions all subsequent times for the same edit.

Incidentally, the update also defined a new syntax of the form "some_var_name := some_expression;", which allows one to create a variable. If one needs the same bit of computed data multiple times within a rule this is one way to increase efficiency. Dragons flight (talk) 18:27, 17 June 2009 (UTC)
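
The approximate counting rules above can be sketched as a small Python model. It is only a toy reading of that description (one count per operand at each level, inner operands counted only when a parenthetical clause is actually entered, one count per function parameter plus one on the first call), not the extension's real code; `Leaf`, `Group`, `count_conditions` and `function_call_cost` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple, Union


@dataclass
class Leaf:
    """A single boolean operand such as X."""
    value: bool


@dataclass
class Group:
    """A parenthetical clause: the AND of its operands."""
    operands: List[Union["Leaf", "Group"]]


def count_conditions(group: Group) -> Tuple[bool, int]:
    """Return (result, conditions used) for an AND of operands."""
    used = len(group.operands)  # one count per operand at this level
    result = True
    for operand in group.operands:
        if not result:
            break  # short circuit: later clauses are never entered
        if isinstance(operand, Group):
            result, inner = count_conditions(operand)
            used += inner  # inner operands count only once entered
        else:
            result = operand.value
    return result, used


def function_call_cost(name: str, args: Tuple[str, ...],
                       seen: Set[Tuple[str, Tuple[str, ...]]]) -> int:
    """One count per parameter, plus one the first time this exact call is seen."""
    key = (name, args)
    cost = len(args) + (0 if key in seen else 1)
    seen.add(key)
    return cost


x_true = Group([Leaf(True), Group([Leaf(True), Leaf(True)])])
x_false = Group([Leaf(False), Group([Leaf(True), Leaf(True)])])
print(count_conditions(x_true)[1], count_conditions(x_false)[1])
```

With these rules, X & (Y & Z) costs 4 conditions when X is true and 2 when X is false, and a repeated `contains_any(added_lines, "foo", "bar")` costs 4 the first time and 3 afterwards, matching the figures given above.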

That was what I was figuring it measured, although I didn't know about parentheses, or that we got the variable name syntax. I think there are a number of filters that could be converted to using a variable. I also submitted a bugzilla request for a contains_all() function to go with contains_any(), which I think could be useful. Prodego ^talk 18:31, 17 June 2009 (UTC)

Filter 175 (repeating characters)

A user had a valid concern about that filter: it also tags repeating characters inside URLs. Could it be possible to tweak it to ignore such instances without hurting performance and accuracy too much? -- Luk ^talk 09:14, 17 June 2009 (UTC)

I added a condition. Ruslik_Zero 09:53, 17 June 2009 (UTC)

The Last Song (novel)

Heh. When someone creates an article with only the words "Dank asshole", I guess I'd rather we not have a filter jump in that says "please consider making your article longer" ... and then inserts that in the edit summary :) - Dank (push to talk) 14:47, 17 June 2009 (UTC)

Styling tags

Now that MediaWiki has been updated past r52071, tags are wrapped in a span which allows us to identify them. There is now an open discussion on whether we should style tags when they appear in RecentChanges, Watchlist, etc. All comments welcome! Happy‑melon 09:53, 18 June 2009 (UTC)

Description for tags

To make Special:Tags more comprehensible, I added descriptions to some tags, all formatted with {{Tag description}}. It supports the parameter inactive=yes for old tags that aren't used by any abuse filter and thus do not appear in recent changes (those we'll be able to get rid of when T20670 will be fixed). You may want to complete the list if you have inspiration. Cenarium (talk) 00:23, 20 June 2009 (UTC)

Massive triggering of filter 28

Noel Streatfield (talk · contribs) triggered the abuse filter 28 more than a thousand times, non-stop; see. This filter is not private and I remember someone having anticipated that vandals could exploit the abuse filter to hinder performance. I don't think it had any major effect besides clogging the log, but reporting here just in case. When approved, Mr.Z-bot may be used to report such users. Cenarium (talk) 17:09, 15 June 2009 (UTC)

Don't those look like errors though? I don't see why the filter should be triggering on those edits. Dragons flight (talk) 17:17, 15 June 2009 (UTC)

Hmmh, they are null edits. Any non-autoconfirmed user making an edit to a redirect will trigger that filter (just tested). This definitely needs to be fixed, the filter should be overhauled or deactivated if it's not of much use. Cenarium (talk) 17:47, 15 June 2009 (UTC)

I added an additional condition: old_size != new_size. Ruslik_Zero 18:43, 15 June 2009 (UTC)
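
For illustration, a small Python sketch of what a size comparison does and does not catch. The helper names are hypothetical, and the byte-length computation is an assumption about how `old_size` and `new_size` are measured.

```python
def is_null_edit(old_text: str, new_text: str) -> bool:
    """A null edit saves the page without changing its text."""
    return old_text == new_text


def passes_size_check(old_text: str, new_text: str) -> bool:
    """Mimics a condition like old_size != new_size (sizes assumed to be in bytes)."""
    return len(old_text.encode("utf-8")) != len(new_text.encode("utf-8"))


redirect = "#REDIRECT [[Alpha]]"

# A null edit has equal sizes, so the extra condition keeps the filter quiet.
print(is_null_edit(redirect, redirect), passes_size_check(redirect, redirect))

# A retarget to a title of the same length also has equal sizes.
retargeted = "#REDIRECT [[Gamma]]"
print(is_null_edit(redirect, retargeted), passes_size_check(redirect, retargeted))
```

The size test excludes every null edit, but it also skips the occasional real edit that happens to leave the page length unchanged.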

It sounds a little strange, though, that the abuse filter checks null edits. Shouldn't the software first check whether an edit is null, and have the abuse filter analyze it only if it's not? I don't know about the impact on performance; is it worth a bug? Cenarium (talk) 15:55, 20 June 2009 (UTC)

Yes, it is probably worth a bug. Ruslik_Zero 17:59, 20 June 2009 (UTC)

I searched and Gurch already reported this, T21267. Cenarium (talk) 18:18, 20 June 2009 (UTC)

Filter 80 borked?

No hits for the last two days... probably caused by the recent scap. MER-C 12:23, 17 June 2009 (UTC)

There's a bug, I (through werdna's instructions) worked around it. Prodego ^talk 16:50, 20 June 2009 (UTC)

Abuse Filter editors group membership request

Od Mishehu has asked that I seek group membership in the "Abuse Filter editors" group. He also mentioned that this is probably the best place to ask for it. I am Cobi, the owner and operator of ClueBot (BRFA · contribs · actions log · block log · flag log · user rights), one of the antivandalism bots here with over 1.1 million edits. I believe that my work with ClueBot demonstrates my technical ability at writing heuristics to identify problematic edits. Thank you. -- Cobi^(t|c|b) 07:23, 22 June 2009 (UTC)

Strong support - I believe that ClueBot is proof that this user can be trusted with this tool. עוד מישהו Od Mishehu 07:27, 22 June 2009 (UTC)

Out of curiosity, why aren't you an admin? Dragons flight (talk) 07:33, 22 June 2009 (UTC)

Lack of article work. -- Cobi^(t|c|b) 07:41, 22 June 2009 (UTC)

Support. The article work is not important for an abuse filter editor. In addition, there was a precedent of granting the full adminship to the editor, who had hardly any contributions at all, in order to allow him to edit spamblacklist. Ruslik_Zero 07:57, 22 June 2009 (UTC)

I support this request but I think it could've done with a bit more discussion, especially being that we're setting a precedent. (Icestorm has already enabled the userright) –xeno^talk 16:09, 22 June 2009 (UTC)

Sorry if I rushed things a bit. I felt that for Cobi it was noncontroversial to give him the user rights, given his technical experience with ClueBot and his trustworthiness. As for granting the abuse filter tag for future cases, wouldn't WP:PERM suffice? Icestorm815 • Talk 16:18, 22 June 2009 (UTC)

As I said, it has nothing to do with to whom the right was granted, it's more that we haven't had any formal discussions or consensus to grant the userright to non-admins. –xeno^talk 16:40, 22 June 2009 (UTC)

Yes, although we could have discussed it more, there wasn't any need for a long discussion: we'd have come to the same conclusion as there is no question that he's both trusted and capable. Tim Vickers (talk) 16:51, 22 June 2009 (UTC)

Agreed and I'm not suggesting we strip the right or anything (that would be wonkery) - but I think we should discuss in general what qualifications someone should bring to the table as well as how the userright should be applied for, how long it should remain for comment, and how to determine if consensus exists for the granting. –xeno^talk 16:53, 22 June 2009 (UTC)

I'd say apply on this talkpage, and discuss for a week unless it is a no-brainer reject or grant. As to qualifications, I'd tend to avoid bureaucracy by being vague and saying "Capable and trusted", leaving it up to people who can edit the filter to decide if somebody else has the expertise necessary. Tim Vickers (talk) 17:10, 22 June 2009 (UTC)

That seems reasonable. –xeno^talk 01:06, 23 June 2009 (UTC)

I'd rather suggest a separate subpage for requests (like Wikipedia:Abuse filter/requests) where we can have a page dedicated to discussing such requests without cluttering the talk page. On a side note: Since the filter allows its editors to perform admin actions (like blocking users), there was a suggestion on WP:AN by Chris G that it should maybe only be assigned by crats like +sysop. I think if we consider how we decide which non-admins to assign the flag, we should also ponder whether we really want all admins to be able to elevate any user to a status where they can block other people (originally we have created crats exactly to control who can assign such "powers"). Regards So Why 12:53, 23 June 2009 (UTC)

I think that if a lot of people started applying then a subpage might be a good idea, but I doubt there will be a flood of applications. I'd much rather move discussion of specific filters to a subpage and leave this page for general discussion and applications, it's more heavily watch'd. –xeno^talk 13:06, 24 June 2009 (UTC)

I must oppose non-admins getting this userright, as it is getting admin tools by the back door. I believe this was the consensus on AN as well. (This is no reflection on Cobi whom I am sure is trustworthy and would support at RfA.) Therefore I think the userright should be deactivated. — Martin (MSGJ · talk) 14:14, 24 June 2009 (UTC)

Out of curiosity, which "admin tools" are you talking about? The AbuseFilter, in its current state, cannot block a user as an action to a filter, or degroup a user. It can only block individual edits. The only actions are: warn; prevent user from doing whatever triggered the filter; revoke the user's autoconfirmed status; tag the edit; and/or throttle the user (rate-limit). -- Cobi^(t|c|b) 15:34, 24 June 2009 (UTC)

So you could (in theory) write a filter that prevents one or multiple users from making any edits whatsoever? Imho that is quite similar to blocking a user. And theoretically, a filter could be created to disallow all editing to a certain page (or set of pages), which is equal to protection of the article, a sysop right. So Martin's concerns are well-reasoned per the technical abilities of the filter. Whether non-admins like you will do such is another concern but imho it is a potent enough tool to be abused to require some sort of community consensus to be awarded (I do think that non-admins should be able to get it, just not simply by asking an admin). Regards So Why 15:43, 24 June 2009 (UTC)

I could (in theory) also write a bot that prevents one or multiple users from making any edits whatsoever. I could also, theoretically, make a bot which would disallow all editing to a certain page (or set of pages). Yes, the bot and I would get blocked nearly immediately, but the same goes if I start making stupid filters. All of the rejected edits are saved in the AbuseLog. -- Cobi^(t|c|b) 15:51, 24 June 2009 (UTC)

Disabling old filters: 44, 112, 137

Filter 44 hasn't been hit since April. Filter 112 since late May. Filter 137 since early May. Perhaps it is time to disable these? -- Gogo Dodo (talk) 04:02, 25 June 2009 (UTC)

Convenience links:

Cheers, tedder (talk) 04:24, 25 June 2009 (UTC)

These appear to be filters associated with specific sockpuppeteers. I think we should wait until all IP/range blocks for the relevant sockpuppeteers have expired, and then about a month more, and at that point disable them. עוד מישהו Od Mishehu 07:06, 25 June 2009 (UTC)

137 goes with 4, and they should probably both be disabled if either is. I'd be a little patient since this guy has shown a propensity for multi-week breaks before, but in general they all seem to be good candidates for being turned off. Perhaps we should establish guidelines on when to disable filters that were useful in the past but ceased getting hits (hopefully because the vandal gave up ;-). My personal feeling is something like 6 to 8 weeks of inactivity. Dragons flight (talk) 07:58, 25 June 2009 (UTC)

The vandal from filter 44 is still active, and responsible for filter 186. Cenarium (talk) 14:52, 25 June 2009 (UTC)

I'd like to wait another month or two on filter 112. There were some edits lately by that guy, but he modified his behaviour, so the filter didn't catch his edits. --Conti|✉ 15:34, 26 June 2009 (UTC)

I disabled 4, 44, 66, and 137, in an attempt to reduce the number of hits against the condition limit (was 15%). Now the condition limit is ~7% which is still too high in my opinion, but may be better served by looking for filters to optimize. Dragons flight (talk) 15:24, 27 June 2009 (UTC)

Oi, that is way too high... I got it down to 2, how did it get back so high? Prodego ^talk 15:28, 27 June 2009 (UTC)

There are a dozen new filters this week, I haven't gone through all of them yet, so some may be poorly designed. Dragons flight (talk) 15:34, 27 June 2009 (UTC)

I just cleaned out a bunch of low hit ones (mostly targeted at specific socks). We are still (barely) above 2% though, which I consider the max which is acceptable. Prodego ^talk 15:40, 27 June 2009 (UTC)

Part of the problem is that we don't really have a way of knowing how many condition counts each filter generates (since short-circuiting may make a big difference). I asked Werdna to include something to monitor this, but he hasn't gotten to it. Dragons flight (talk) 15:47, 27 June 2009 (UTC)

I've asked him as well - however, considering 10% is a very low percentage, we can probably just assume everything isn't being short circuited, and all checks are being made. Prodego ^talk 15:49, 27 June 2009 (UTC)

Tags look like edit summaries sometimes

Can someone point me in the right direction (Mediawiki pagewise or developer-wise) to add a short AF: with a wikilink to an explanation about tags before the tag description? –xeno ^talk 00:51, 15 May 2009 (UTC)

Ah, like the WP:AES arrow? That's probably a good idea. The tags are listed at Special:Tags, I assume you can just add it to each one there. Amalthea 01:14, 15 May 2009 (UTC)

hm, could a more elegant way be devised (as this would require actually creating the ones that are using the default non-modified tag description)? –xeno ^talk 01:16, 15 May 2009 (UTC) P.S. this might not be my idea, I may have read it at one of the pumps

bug 18661 would also help with this. Mr.Z-man 02:21, 15 May 2009 (UTC)

I've gone ahead and added the prefix to the tags with hits; while we wait for a more elegant solution. –xeno ^talk 16:24, 22 May 2009 (UTC)
- Would "Tag:" be better? It seems like it would be a bit more clear. And maybe an arrow. –Drilnoth (T • C • L) 19:00, 28 May 2009 (UTC)
  - can someone knowledgeable explain what the meaning of the tag MediaWiki:Tag-repeating_characters is? There seems to be no reason for banning aa, bb, cc, dd, ee, and the like. Jasy jatere (talk) 11:24, 29 May 2009 (UTC)
    - It doesn't ban anything, it only marks edits, and it needs at least 7 repetitions. Amalthea 12:23, 29 May 2009 (UTC)
      - I would find it helpful if this was documented somewhere, I was really puzzled when I came across this tag first. I cannot edit the tag page, but a short description would surely not be bad Jasy jatere (talk) 17:56, 29 May 2009 (UTC)
        The link in the tags links to Wikipedia:Tags. Is that what you're looking for? –xeno^talk 18:01, 29 May 2009 (UTC)
        please add the info "7 or more repetitions" there, or at another place. Currently, it only says "repetition of characters", which is not helpful without "7". Jasy jatere (talk) 11:46, 30 May 2009 (UTC)
        I find the current tag acceptable, in particular since it doesn't matter much what's written there, only that it's tagged at all. The exact conditions of the filters are complex, attempting to explain them in the tags will be confusing and far too long (and some are set to private). For example, Special:AbuseFilter/135 doesn't match in user space, it doesn't match with autoconfirmed users, and it actually matches character sequence repetitions, where each sequence is made of one to nine arbitrary characters except ':*|=}{-.
        A stronger correlation between tags and filters might be useful, but I think it's counterproductive to link to an explanation effectively detailing how to circumvent a particular tag. Amalthea 12:17, 30 May 2009 (UTC)
        I think it does matter what is written there, since it was that very message which confused me. I agree that a detailed account of the filter is not necessary, but a brief explanation somewhere would be helpful. Something like "this tag is used on edits which could be vandalism because they contain an unusually high repetition of the same character, e.g. jjjjjj" —Preceding unsigned comment added by Jasy jatere (talk • contribs) 30 May 2009(UTC)
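
To make the description above concrete, here is a hedged Python sketch of a "repeated sequence" check. It is not the actual rule from Special:AbuseFilter/135: it assumes that "at least 7 repetitions" means a sequence of one to nine characters followed by six or more copies of itself, and that the excluded characters are exactly `' : * | = } { - .` (the trailing period in the comment above may just be punctuation). The names are hypothetical.

```python
import re

# A sequence of 1 to 9 characters (none of the assumed excluded ones),
# followed by at least six further copies of itself: seven in total.
REPEATED_SEQUENCE = re.compile(r"([^':*|=}{.\-]{1,9})\1{6,}")


def has_repetition(text: str) -> bool:
    """Return True if the text contains a sequence repeated seven or more times."""
    return REPEATED_SEQUENCE.search(text) is not None


for sample in ("jjjjjjj", "jjjjjj", "hahahahahahaha", "======="):
    print(repr(sample), has_repetition(sample))
```

Under these assumptions seven consecutive "j" characters or seven copies of "ha" match, six "j" characters do not, and a run of "=" (common in wiki markup) is ignored because of the excluded characters.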

Hey, now that the span thing is added (I think!) can we automatically prefix the filter descriptions with [[Special:Tags|Tag]]: rather than having to create the mediawiki page for every one of them? –xeno^talk 02:37, 29 June 2009 (UTC)
- Not if you want that to work in all browsers, I'm afraid. I assume Z-Man's comment from above was in reference to the section header, since the tag styling that can now be done helps with that. Amalthea 19:59, 29 June 2009 (UTC)
  - Sad face. –xeno^talk 20:07, 29 June 2009 (UTC)

Michael Jackson vandalism

There appears to be something unusual going on with this tag: it was assigned to the George McHugh article. See Revision history of George McHugh. --Big_iron (talk) 20:43, 27 June 2009 (UTC)

The filter that was created to combat the flood of Michael Jackson related vandalism also looks for quite common phrases. I have changed the tag to "Possible Michael Jackson vandalism" to make it clear that it has a high chance of false positives and that those are no reason for worries. Regards So Why 20:54, 27 June 2009 (UTC)

Please format your tags properly, see at Special:Tags. Cenarium (talk) 20:24, 28 June 2009 (UTC)

Don't change the name of the tag directly in the abuse filter, it will create a new one, not desirable. Instead, change the description. Cenarium (talk) 20:27, 28 June 2009 (UTC)

Thanks for the advice about the tag system. As I just changed an already incorrect tag, I guess we need to educate abuse filter editors about that in general. Regards So Why 20:42, 28 June 2009 (UTC)

Are there any guidelines on the formatting of tags? If not, should there be? I think short, concise, alphanumeric and lowercase would be a sensible suggestion. Currently we have a lot of different formats floating around, which must be hard for people actually using the tag filters.. Happy‑melon 21:07, 28 June 2009 (UTC)

I don't think we can expect people to filter by tags by typing the tag name, a scroll-down menu would be much more usable. But with the impossibility to remove applied tags until now, it wouldn't be manageable. So there's only from Special:Tags that we can really get one. I think we should mention guidelines at MediaWiki:Abusefilter-intro, in particular on tags; this page needs a revamp. We could also add more at WP:Tags. Cenarium (talk) 22:42, 28 June 2009 (UTC)

lol, needs a "!sysop" –xeno^talk 02:30, 29 June 2009 (UTC) -
02:23, 29 June 2009 (hist) (diff) N MediaWiki:Tag-Possible Michael Jackson vandalism ‎ (Tag: Possible Michael Jackson vandalism) (Tag: Possible Michael Jackson vandalism)

This should probably have stricter hit criteria... Maybe restrict by userright and/or namespace. –xeno^talk 13:14, 29 June 2009 (UTC)

Filter 80 "possible link spam": change edit summary?

Special:AbuseFilter/80's current appearance on change lists is "Tag: possibly inappropriate external links". I found this a bit misleading when trying to work out why an edit had been tagged as it suggested to me that the URL triggered it. I had to look at Special:Tags and Special:AbuseFilter/80 to discover that it is triggered by a new user adding several external links within a given timeframe (currently >3 in 20 mins), and the URL is irrelevant, and therefore realise I should examine their recent contributions. Can I suggest that MediaWiki:Tag-possible link spam be changed to something a bit more descriptive such as "New user rapidly adding external links", or if space allows, "New user recently added several external links"? Regards, Qwfp (talk) 12:42, 28 June 2009 (UTC)

how about "repeated addition of external links by non-autoconfirmed user"? (already implemented) –xeno^talk 15:28, 29 June 2009 (UTC)

Great! That sounds better than my suggestions. Many thanks xeno. Qwfp (talk) 15:55, 29 June 2009 (UTC)
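As an illustration of the behaviour Qwfp describes, here is a minimal Python sketch of a rate check that ignores the URL itself and only counts how many link-adding edits a non-autoconfirmed user has made inside a time window. The function name, the data layout, and the reading of ">3 in 20 mins" as "more than three link-adding edits within 20 minutes" are assumptions for illustration; this is not the actual code of Special:AbuseFilter/80.

```python
from datetime import datetime, timedelta

# Assumed thresholds, taken from the ">3 in 20 mins" description above.
MAX_LINK_EDITS = 3
WINDOW = timedelta(minutes=20)


def should_tag(link_edit_times, new_edit_time, autoconfirmed):
    """Return True if the new link-adding edit pushes the user over the limit.

    link_edit_times: datetimes of the user's earlier link-adding edits.
    new_edit_time:   datetime of the edit being checked (it also adds a link).
    autoconfirmed:   True if the user is autoconfirmed (never tagged here).
    """
    if autoconfirmed:
        return False
    window_start = new_edit_time - WINDOW
    # Count earlier edits inside the window, plus the edit being checked.
    recent = sum(1 for t in link_edit_times if window_start <= t <= new_edit_time)
    return recent + 1 > MAX_LINK_EDITS
```

Because the decision depends only on the editor's recent history, a reviewer who sees the tag would look at the user's contributions rather than at the particular URL, which is the point of the more descriptive tag name.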

Detecting if a page is in a category

Isn't there a way to check if the page is in a category with the abuse filter (before the edit, and after) ? I mean, not just pages with the category directly given in wikitext, but also when transcluded. That would be tremendously useful, for example for 29 and 189, those filters use workarounds, but they are far from complete, while it would be easy to simply check respectively if the category is in Category:Candidates for speedy deletion before and not after, and in Category:Living people before. Cenarium (talk) 02:40, 24 June 2009 (UTC)

Unfortunately, there appears not to be. Feel free to open an enhancement request. עוד מישהו Od Mishehu 00:32, 2 July 2009 (UTC)

I opened T21455 yesterday. I requested a list of categories the article is in at the time of the edit. Thinking about it, I don't think it's possible for the software to extrapolate in which categories the page would be after the edit. Cenarium (talk) 00:46, 2 July 2009 (UTC)
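The limitation discussed in this thread can be seen in a small Python sketch: searching the wikitext for a literal category link finds categories written directly on the page but misses ones supplied by a transcluded template. The helper name, the sample wikitext, and the placeholder template name are hypothetical, and the regular expression is only an assumption about how such a workaround might be written, not the code of filter 29 or 189.

```python
import re


def has_direct_category(wikitext, category):
    """Return True if the wikitext contains a literal [[Category:...]] link.

    Only the page's own wikitext is inspected, so a category added through a
    transcluded template is not seen.
    """
    pattern = (
        r"\[\[\s*Category\s*:\s*" + re.escape(category) + r"\s*(\|[^\]]*)?\]\]"
    )
    return re.search(pattern, wikitext, flags=re.IGNORECASE) is not None


# Hypothetical sample pages.
direct = "Some biography text.\n[[Category:Living people]]"
via_template = "Some biography text.\n{{hypothetical category template}}"

print(has_direct_category(direct, "Living people"))
print(has_direct_category(via_template, "Living people"))
```

A list of the page's categories supplied by the software, as requested in T21455, would make this kind of text search unnecessary.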

Page moves

If you write a script to create a permanent log calling wikipedia editors abusive vandals you should know a little more about wikipedia than you do.

IPs can't make page moves. Removing articles written by bots that have had 4077 of their articles removed is not vandalism. Being stopped from removing them is vandalism. Your script is preventing me from editing the last 889 of the bad articles this bot created. Now who's the vandal? 889 potentially bad articles, some have evolutionary theory that's worse than an article written by a creationist. Wikipedia looks like an idiot calling bacteria eukaryotes. Now you're blocking qualified writers from fixing it.

Where's the abuse filter's abusive vandalism log for forcing these bad articles to go unedited? --69.226.103.13 (talk) 15:28, 2 July 2009 (UTC)

Abuse filter 192 warning

I would like to know what users think about my suggested warning for direct use of stub categories before it goes live. עוד מישהו Od Mishehu 07:55, 2 July 2009 (UTC)

This is precisely the sort of thing that shouldn't be giving the user a warning, instead tagging the edit so that a user who understands stub categories can make any fixes that need to be made at a later date. Gurch (talk) 14:35, 3 July 2009 (UTC)

Should the Abuse filter log be permanent ?

... or should the log entries be set to expire after some time (say two weeks) ? After all, they are just meant to highlight possibly problematic edits for review and if an edit hasn't been reviewed in a couple of weeks it's unlikely to get/need attention later. We always have a permanent record of the edit diffs themselves, in case we need to review a user's edits later. Comments ? Abecedare (talk) 18:16, 2 July 2009 (UTC)

No, it should not be. If it is, its permanence should be debated with the community as a whole. People say it's meant to aid in correcting vandalism, then it should be used for that purpose, not to attach a permanent record of tagged edits to a user's account. If that's what its purpose is, record all questionable edits according to the heuristics, then the community should formally be notified of this intention and given the courtesy to decide if that's what is wanted.

You need to test this a lot better before attaching it permanently. I have an IP edit not made by this IP on my log. Get it right first. --69.226.103.13 (talk) 06:04, 3 July 2009 (UTC)

Meh... Why not? I think there is no compelling reason NOT to make it permanent. --Jayron32.talk.contribs 11:28, 3 July 2009 (UTC)

Without getting too beansy, there are indeed good reasons to not keep this data for extended periods. Risker (talk) 14:40, 3 July 2009 (UTC)

Why would you default to "most bureaucratic" without a solid reason? That's a good way to guarantee that bureaucracy will encumber the creation of wikipedia.

If you can't offer a solid reason for increasing the level and tangle of bureaucracy on wikipedia simply don't increase it.

The real problem: anybot should never have been allowed to run on wikipedia, it created 1000s of bad articles and redirects, I'm trying to clean up after it, but the testing-stage heuristics here are preventing that from happening. Funniest yet, anybot edit-warred with other bots, and together with any other bot, anybot created some of the funniest stuff I've seen on wikipedia, yet not one single heuristic captured any of that. If it had been done by a human, they would have been blocked after three edits. Instead, here you are tagging my clean-up of a bot nightmare as non-constructive, arguing for a permanent record of my edits to deleted articles, and making me work twice as hard to get past the heuristics to edit.

I finished up one I started on last night, and removed a few more, fighting the heuristics all the way, but this is too much to be told with every edit I make that it's "potentially unconstructive" when I'm busy removing ridiculous garbage from wikipedia. Someone else will have to do it, if they're not so busy searching the test edit tagging list of users or reverting my edits.

What's really unconstructive is increasing the level of bureaucracy without a compelling reason to do so. --69.226.103.13 (talk) 20:43, 3 July 2009 (UTC)

Have you even tried to contact the editors who created the AbFilters you are running into trouble with? They can easily temporarily disable the offending filters or write exceptions into them for your bot. For every person that runs into an occasional (though sometimes BIG as in your case) problem, there are thousands of vandalism edits we no longer have to revert, or worse, miss and leave to the public view. What is really REALLY unconstructive is not asking for help that could easily be given to you if your first reaction wasn't to indict the entire system over an unforeseen glitch, but was instead to simply ask for help in running your bot in a way that did not run up against the AbFilter heuristics. Seriously, contact the AbFilter devs, ask for help, and the problem can be fixed very quickly. --Jayron32.talk.contribs 01:14, 4 July 2009 (UTC)

I'm not running a bot, I'm cleaning up after a bot whose approval was revoked for the huge mess it made. I asked for help, and I've been asking for help, and I keep running into more ways that some bot interferes with cleaning this mess up. And, can't people offer useful suggestions without criticism when writers are frustrated and have been trying to clean up a nasty mess for weeks, a mess they didn't make to begin with?

The other writer seemed to say IPs can't be exempted. But I'll try that. But that's just one more thing to do instead of editing the articles. How about if it doesn't work you agree to edit some of the articles?

I have not seen any vandalism on wikipedia that compares with this mess created by anybot. Not even close to it. --69.226.103.13 (talk) 01:46, 4 July 2009 (UTC)

In that case, you have not been here long enough... If IPs cannot be exempted, and you are opposed in principle to creating an account why, I will never understand, then you could perhaps create a temporary account which could allow you to get around this problem. Just abandon the account when the task is complete. --Jayron32.talk.contribs 02:05, 4 July 2009 (UTC)

It doesn't matter if you understand, but new editors are hit worse by established users than IPs. That's a sufficient reason for not registering. I come to wikipedia to correct bad science and pitfalls to its readers. This isn't popular when editors like their articles. I've been screamed at for deleting a link to a web page that downloaded a virus, and I've been blocked for pointing out that an administrator was lacking credentials for editing an article (the administrator was later defrocked).

Privacy is not that compelling. I drop in and out, so I can't watch articles, but if I ask someone at articles for creation to make it they monitor it for years: undoing vandalism, correcting spelling errors by other writers. I usually only request articles be created with references at google books, so this makes it easy for article creators who don't know the subject. I used to just randomly ask editors to write the articles, and wikipedia writers were willing to do that also.

Being an IP has its disadvantages, but fewer for me than the hostility towards new users. I've edited as an IP for, well, 5 years, included featured articles and good articles.

Wikipedia writers need to learn to evaluate each edit, rather than assuming a known editor is excellent, and a new editor is a vandal. It's a point that deserves emphasis. When wikipedia writers start valuing a good contribution no matter the editor, maybe I'll register.

The temporary account would still be tagged by the new user heuristics. --69.226.103.13 (talk) 02:40, 4 July 2009 (UTC)

I understand the problems you are having with the AbFilter, and I am trying to sympathize here. However, as absolutely annoying and problematic as the particular issue you are having is, I don't think the problem is the AbFilter; yes, the situation sucks, but from the point of view of the abuse filter, this is a small problem. Your problem should be dealt with, and it is not insignificant, but it is also not worth indicting the entire abuse filter system for what is, measured over the whole Wikipedia, a small problem. With regards to your negative opinions about registering an account; so be it. I am not here to disabuse you of your misplaced aggression towards other registered users. I am saddened for you that you hold this opinion, but you are certainly entitled to it, as misplaced and uninformed as it is. --Jayron32.talk.contribs 02:48, 4 July 2009 (UTC)

That's a little creepy. I'm done here. --69.226.103.13 (talk) 03:19, 4 July 2009 (UTC)

This doesn't answer the point of making adding layers of bureaucracy a default value. If there's a compelling reason related specifically to the development mission of this programming, then spell it out, otherwise the log should not be permanent. --69.226.103.13 (talk) 01:58, 4 July 2009 (UTC)

Filter 189 - BLP Vios

[11] [12] [13] [14] [15]

This filter appears to be b0rked in some way, or is extremely poorly designed. Requesting it to be disabled or fixed. «l| ?romethean ™|l» (talk) 02:26, 29 June 2009 (UTC)

Probably because it checks the content of any paragraph that has been changed, not just the words that have been added or removed. snigbrook (talk) 19:27, 1 July 2009 (UTC)

I'm not aware of a way to reduce this type of false positives that wouldn't reduce cases of 'true' positives too. Cenarium (talk) 23:24, 1 July 2009 (UTC)

You say that as though it's an argument not to change the filter. Apparently a little understanding of statistics is needed here. A filter that catches all edits would catch all vandalism. Any filter that catches some subset of all edits catches some subset of all vandalism. Any reduction in the scope of a filter reduces both the number of bad edits it catches, and the number of good. This does not make a filter that catches everything a good filter. The same logic applies to any situation where there are good edits being caught, and the answer is never to leave the filter as-is "because it would reduce cases of 'true' positives". Gurch (talk) 14:39, 3 July 2009 (UTC)

That was rather an attempt to ask someone else to try to fix it. What you say is true, and obvious, but this filter has no active action against the user, so we can wait a little to find a way to deal with this, there's no utter urgency. At the time I hadn't found a way to remove those false positives without making the filter almost useless. But a few days ago, I have found a way (it won't catch those, and similar ones), and applied the changes. It's more difficult 'cause you need to consider performance issues too. It may need a few more \b s at the beginning and end of words, and checking removed lines for a few other words. There's still the problem that the filter sweeps way too large for added lines and removed lines, but it's inherent to all filters. Cenarium (talk) 01:48, 11 July 2009 (UTC)

Also, the filter detects BLPs by checking if the page contains Category:Living people in the wikitext, so it can't detect those with {{lifetime}} et al, if someone knows a way to detect all BLPs, or a larger part, that would be appreciated. Cenarium (talk) 23:47, 1 July 2009 (UTC)
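To make snigbrook's point in this thread concrete, the following Python sketch contrasts a line-level diff (every changed paragraph counts as "added" in full) with a word-level diff (only the words actually inserted). It uses the standard difflib module; the function names, the watched word "badword", and the sample sentences are hypothetical, and this is only a model of the idea, not how the AbuseFilter extension computes its variables.

```python
import difflib


def added_lines(old_text, new_text):
    """Lines present in the new text but not the old (line-level diff)."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


def added_words(old_text, new_text):
    """Words inserted or substituted by the edit (word-level diff)."""
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_text.split(), b=new_words)
    words = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            words.extend(new_words[j1:j2])
    return words


# Hypothetical edit: only the year changes, but the paragraph already
# contained a watched word.
old = "He was called a badword by critics in 2001."
new = "He was called a badword by critics in 2002."

print(added_lines(old, new))   # the whole changed paragraph
print(added_words(old, new))   # just the substituted word
```

A rule that searches the changed paragraph would fire on this edit even though the editor did not add the watched word, which is the kind of false positive reported above.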

Some way to exempt good-faith IPs from getting caught by the filter

This is related to our 69.x.x.x algae expert, but I could also see it as a problem that may pop up from time to time. Some folks prefer to edit via their IP and no amount of convincing can get them to register an account. Nevertheless they may be experts in a certain refined field (i.e. phycology) and willing to put in a lot of grunt work to fix problems with articles on the same. What about some way to give certain IPs temporary exemption (I was thinking promote to autoconfirmed but it doesn't seem to be grantable nor can I access an IP's userrights). We could write a hack into the filters they're running into but that's suboptimal (though, could someone do that temporarily for the filters that are giving 69.x.x.x trouble? [16]). Just telling them to re-submit their warned edits, or even telling them their edits were just "tagged" isn't a solution because other good faith users may see the tag and assume it's a bad edit (i.e. [17]). Thoughts? –xeno^talk 12:57, 3 July 2009 (UTC)

How about writing filters that don't flag good edits as bad in the first place, then this problem disappears? Gurch (talk) 14:30, 3 July 2009 (UTC)

Perhaps you should apply for AFEship? –xeno^talk 14:36, 3 July 2009 (UTC)

Since my first action if anyone gave me that right would be to disable about half the filters, I doubt that would be granted. Gurch (talk) 14:43, 3 July 2009 (UTC)

Could someone please tell me how to change

!("autoconfirmed" in user_groups)

into "not (autoconfirmed or User:69.226.103.13)" ? –xeno^talk 01:55, 4 July 2009 (UTC)

Ah, I see one just has to add &(user_name != "69.226.103.13") to the end of a filter. Thanks Cenarium. I was chatting with Bjweeks about a way to solve this and came up with one way he thought might be "doable" - maybe we could write a hack in the abuse filter that tells it to evaluate "autoconfirmed" as "autoconfirmed and <these IPs>" ? (and we can keep a list of the IPs on a mediawiki page or something) –xeno^talk 03:09, 4 July 2009 (UTC)–xeno^talk 03:06, 4 July 2009 (UTC)
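For readers unfamiliar with the rule syntax, the combined condition !("autoconfirmed" in user_groups) & (user_name != "69.226.103.13") can be modelled in Python as below. The function name and the sample inputs are hypothetical; only the two variable names and the IP address come from the discussion above.

```python
# Model of the exemption described above: the filter keeps applying to
# non-autoconfirmed editors, except for one named IP address.
EXEMPT_USER = "69.226.103.13"


def condition_matches(user_groups, user_name):
    """Return True when the filter should still apply to this editor."""
    not_autoconfirmed = "autoconfirmed" not in user_groups
    not_exempt = user_name != EXEMPT_USER
    return not_autoconfirmed and not_exempt


print(condition_matches(["*"], "192.0.2.1"))              # other IP: still filtered
print(condition_matches(["*"], "69.226.103.13"))          # exempted IP
print(condition_matches(["*", "user", "autoconfirmed"], "Example"))
```

Each such exemption adds one more comparison to every evaluation of the filter, which is the cost raised later in this thread.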

Ugh, what's next, exemptions for semi-protection? Weeks or months of implementation and testing, etc., all so people don't have to take one whole minute to register an account? All the while real problems don't get taken care of? Aren't there like hundreds or thousands of really annoying bugs in Bugzilla that have been open for years? This is all a tempest in a teapot. Wknight94 ^talk 03:22, 4 July 2009 (UTC)

'tis just a thought. I figured I should write it down, because I would soon forget it. At present, the issue's only cropped up once so I don't think we need to do much more than write a few exemptions while our IP friend does his magic. –xeno^talk 03:26, 4 July 2009 (UTC)

I already exempted in this way a few users and IPs hitting repeatedly the same filter(s). We'd need a way to know when a user repeatedly hits filters. Cenarium (talk) 03:32, 4 July 2009 (UTC)

It's a bad precedent. Doing this once for one rule, is not a big deal. However, if dozens of rules each had dozens of special exemptions the added burden would bring the whole thing to a halt. Dragons flight (talk) 03:34, 4 July 2009 (UTC)

If one filter needs too many exemptions, then it probably needs fixing, for having too many false positives. Cenarium (talk) 03:39, 4 July 2009 (UTC)

I don't know if this is the case: many exemptions are needed. Most of the filters I'm hitting are specifically related to the nature of the articles I am editing: bad articles written by a bot that just had most (4077) of the articles it created deleted all at once. The rest of the articles are intertangled with human edits and have to be checked by hand. The bot also overwrote entire redirects and articles, and it copied improperly from its source, so I have to remove large amounts of text and delete references. I'm not sure there will be that many exemptions of this nature. Maybe the tags should be dropped if the article is subsequently deleted, or if the article is created by a bot, the heuristic might be different. --69.226.103.13 (talk) 03:57, 4 July 2009 (UTC)

Wikipedia rewriting 2 billion years of evolution for the past four months in thousands of articles and getting it copied into cyberspace is a real problem.

And it's not a minute, I'd still have to wait 4 days to be autoconfirmed, and for some of the heuristics I have to be an established user with over 500 edits.

But, go ahead and edit them. I'm not a phycologist by the way, and it's a lot of work for me.

Every exemption is directly related to being able to quickly fix the mess created by anybot, a bot that had its authorization revoked after creating thousands of bad articles and redirects that need correcting. If the bot had never been run, and if there were plenty of wikipedia phycologists, this would not be an issue. --69.226.103.13 (talk) 03:39, 4 July 2009 (UTC)

Absolutely not. Gurch is correct, if the filters are stopping good edits at all, or giving too many false warnings, disable it and/or fix it. Do not waste checks on exempting particular people. As for you, 69.226.103.13, create an account. It's free. Prodego ^talk 04:11, 4 July 2009 (UTC)

It's not mandatory. Until wikipedia bars IPs, and until wikipedia doesn't have such a big problem with how newly registered users are treated that there are guidelines about how not to treat them, others might get used to the idea that a lot of good content is created by IPs, and some people would rather edit as IPs than become newcomers. --69.226.103.13 (talk) 06:42, 4 July 2009 (UTC)

A better idea than an IP exemption

User:Ruslik0 made an exemption for anybot created content, as the problem is the anybot created content. However, if others are not comfortable making an exemption for anybot created content, which removes the IP issue from the discussion and focuses on the real problem issue, then I won't do edits that tag me as potentially unconstructive editor. Let me know.

But if the purpose of the abusive filter is to force people to register, then take up blocking IPs from editing with the community, or stop saying "anybody can edit." The community, though, has already decided the issue of IPs editing, and it is still anybody can edit wikipedia. --69.226.103.13 (talk) 15:41, 4 July 2009 (UTC)

And I tested it and it works.[18] Done with one click. --69.226.103.13 (talk) 15:43, 4 July 2009 (UTC)

Suggestion

I started going through New user changing redirect or redirecting so I can tag c&p moves, of which there is about 1 per day whenever I check. This leads me to check the user's contributions to find what redirect they removed when they did the c&p move. I wonder if the tag can be modified to look for "New user removing redirect" in addition to changing a redirect; this would be a "companion" edit, if you will. Thx. --64.85.216.57 (talk) 13:26, 10 July 2009 (UTC)

question

I'm a sysop on Persian Wikipedia. How can I enable the abuse filter on Persian Wikipedia? I'm translating the messages and have translated half of them. Amir (talk) 15:08, 10 July 2009 (UTC)

You should file a request on http://bugzilla.wikimedia.org Ruslik_Zero 15:39, 10 July 2009 (UTC)

Edit filter 23 attached tag to wrong user - really bad

This filter's heuristics are hidden, but this filter is bad. It is the one that tagged my edit as "possible vandal phrases," while it was referring to an edit made by a different user.

The edit filter should, in the very least, attach the tag to the correct editors. Really bad when it doesn't. See my edit log to find the edit, and note that the discussion and edit in the log are not mine. --69.226.103.13 (talk) 03:04, 4 July 2009 (UTC)

This would appear to be caused by an already known bug in the abuse filter. עוד מישהו Od Mishehu 07:52, 7 July 2009 (UTC)

The abuse filter is not disabled while the bug is already known? Why not? --69.226.103.13 (talk) 19:10, 7 July 2009 (UTC)

Probably because that would present a net-negative. Most editors realize that the AF EF is not perfect and that having entries in one's EF log (especially erroneously attributed ones) does not necessarily indicate they're a bad person. –xeno^talk 19:13, 7 July 2009 (UTC)

I believe that the overall number of false positives caused by this bug is relatively small. I doubt any admin would block you on the basis of an edit filter log without checking that your edits were, in fact, problematic. עוד מישהו Od Mishehu 08:59, 12 July 2009 (UTC)

Changing North America to New World is possible vandalism!

These edit filters need work before they start recording incidents and putting them anywhere. No one reading this cares that Edit Filter 23 tags the wrong contributor, and, now, another one,[19][20] Edit Filter 11, tags the words "New World" as possible vandalism, when it's supposed to be tagging "it sucks", meaning that potentially every taxon that ranges across both continents will be tagged as a vandalism edit if an IP edits it.

It's supposed to tag "You/He/She/It sucks," just like Edit Filter 23 with its hidden code is supposed to tag the actual contributor of the actual edit. When things that aren't in the code start happening, it's time to stop running it and get it cleaned up.

The IP --69.226.103.13 (talk) 18:53, 4 July 2009 (UTC)

If you report false positives like this to Wikipedia:Abuse filter/False positives you may get a faster response. --Jayron32.talk.contribs 19:06, 4 July 2009 (UTC)

They're already being ignored there, but thanks for suggesting the best location. It's hard to find the right page to post on wikipedia.

I'm posting here out of a general concern for the filters, also, since these are basic bug type errors that should have been removed before running. They're also hard to understand how they could happen from the source scripts. --69.226.103.13 (talk) 19:27, 4 July 2009 (UTC)

Actually, if you have serious concerns, another place to try is to find the person who wrote the abuse filter in question, present evidence that the filter is not working as intended, and ask them to disable it until the problems can be fixed. The most direct and personal method may work. --Jayron32.talk.contribs 19:32, 4 July 2009 (UTC)

By the way, I did find the source of the problem. Apparently, the filter tags any edit by an anon which includes the word "suck" in it; any time that article is saved by an anon, since it contains the word "goatsucker", the edit will be tagged as possible vandalism. You may want to bring this up to the person that wrote the filter; it will help them make it run better. --Jayron32.talk.contribs 19:35, 4 July 2009 (UTC)

No, only if they edit that section of that article. The tag does say 'possible vandalism', not 'vandalism'; it's allowed to have a small number of false positives, as it doesn't disallow the edit. Prodego ^talk 19:58, 4 July 2009 (UTC)

But it's not 'possible vandalism' by that editor, as they're not the one adding "goatsucker" or the term "they suck" (in the same paragraph, and more likely) that tripped the filter. Suck is appropriate to describe taking in liquid through a beak/mandible modified to suction liquid.

This contributor noticed his legitimate contribution tagged as "possible vandalism."

The filters should not tag the wrong editor, any editor who did not make the contribution. Bad enough: indiscriminate tagging, but worse: attaching it to the wrong edit. There's no reason or advantage to accept routine false positives like this in the first place. --69.226.103.13 (talk) 21:15, 4 July 2009 (UTC)

Sorry, but that tag is worse than useless. It tags revisions as "possible vandalism" when they might be, but also possibly aren't, vandalism. Anyone patrolling recent changes is going to be looking at all edits from anonymous and newly registered users anyway, so this doesn't help them; it doesn't help find vandalism any more quickly because of its high false positive rate, and it doesn't aid in project maintenance in any other way I can see. What precisely is the purpose of this tag other than to harass anonymous and newly registered users? Gurch (talk) 22:01, 4 July 2009 (UTC)

Ask whomever created the filter. Prodego ^talk 02:55, 5 July 2009 (UTC)

That would be Cenarium, who for reasons that escape me seems to think that exempting individual users is the way to go. Gurch (talk) 18:01, 5 July 2009 (UTC)

WTF? No, it's not me. BTW, it seems to have been fixed by Od Mishehu to avoid when the word was already there. And I don't think individual exemptions are the way to go, that was a temporary fix. And.. I thought you knew we don't have the resources to check all changes... Cenarium (talk) 00:19, 11 July 2009 (UTC)

Yes, it was you that added the "possible vandalism" tag. We have logs of these things. We have the resources to briefly check all changes by anonymous and newly registered users (just not to check the entirety of every revision made by them and fill out a review form for each). And even if we didn't, that is not a very good excuse to reject good changes. Gurch (talk) 22:44, 12 July 2009 (UTC)
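The "goatsucker" false positive discussed in this thread comes down to two separate checks, sketched below in Python with the standard re module. The first contrasts a bare substring search with a word-boundary pattern (the \b approach mentioned elsewhere on this page); the second only counts a match if the word was not already in the paragraph before the edit, which is how the fix is described above. The pattern text, function name, and sample sentences are hypothetical and are not the actual contents of filter 11.

```python
import re

# A bare substring search matches inside longer words such as "goatsucker".
SUBSTRING = re.compile(r"suck", re.IGNORECASE)
# Word boundaries restrict the match to "suck" or "sucks" as whole words.
WHOLE_WORD = re.compile(r"\bsucks?\b", re.IGNORECASE)


def newly_added(pattern, old_paragraph, new_paragraph):
    """True only if the pattern matches more often after the edit than before."""
    before = len(pattern.findall(old_paragraph))
    after = len(pattern.findall(new_paragraph))
    return after > before


# Hypothetical paragraph before and after a harmless edit.
old = "The goatsucker ranges across North America."
new = "The goatsucker ranges across the New World."

print(bool(SUBSTRING.search(new)))          # substring search fires
print(bool(WHOLE_WORD.search(new)))         # whole-word search does not
print(newly_added(SUBSTRING, old, new))     # nothing new was added
```

The before/after comparison also covers a paragraph that legitimately contains the whole word (for example a description of how an animal feeds), since an edit that leaves that word untouched adds no new match.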

Filters 9 and 97

What is the difference between filter 9 and filter 97? The former applies to all edits of unregistered users in the main namespace, while the latter to all large (>5000) additions to pages in all namespaces made by non-autoconfirmed users. They frequently overlap. Does it make sense to have two filters? Ruslik_Zero 16:54, 12 July 2009 (UTC)

No. Not much of the filter system we have does. Gurch (talk) 22:46, 12 July 2009 (UTC)
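The overlap Ruslik describes can be illustrated by writing the two descriptions as predicates. This Python sketch follows only the wording in the question above; the class, the field names, and the reading of ">5000" as bytes added are assumptions, and the real conditions of filters 9 and 97 may differ.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    registered: bool      # False for an IP (unregistered) editor
    autoconfirmed: bool
    namespace: int        # 0 is the main (article) namespace
    bytes_added: int      # assumed unit for the ">5000" threshold


def matches_filter_9(edit):
    """As described above: any edit by an unregistered user in the main namespace."""
    return not edit.registered and edit.namespace == 0


def matches_filter_97(edit):
    """As described above: a large addition by a non-autoconfirmed user, any namespace."""
    return not edit.autoconfirmed and edit.bytes_added > 5000


# Hypothetical edits showing where the two descriptions do and do not overlap.
ip_big_main = Edit(registered=False, autoconfirmed=False, namespace=0, bytes_added=6000)
ip_small_main = Edit(registered=False, autoconfirmed=False, namespace=0, bytes_added=100)
new_user_big_talk = Edit(registered=True, autoconfirmed=False, namespace=1, bytes_added=6000)
```

Under these assumptions, a large mainspace addition by an IP matches both predicates, while small IP edits match only the first and large additions by new accounts outside mainspace match only the second.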

Adding abuse-filter-view-private into sysop package

Resolved

– "Committed with adjustments in r52743." (per Andrew Garrett). Not sure when this will be live, but it's on the way. –xeno^talk 14:31, 3 July 2009 (UTC)

was: Rolling AFE into +sysop redux

Previous discussion: Wikipedia talk:Abuse filter/Archive 3#Straw poll

While I was originally unreceptive to the idea, now that we've settled into the AFE, I think we should roll it into the sysop bundle. Especially so that admins who just want to view private filters (like me =) don't need to add themself into the usergroup. However, please see arguments contra in the last discussion, especially from slakr who does have some compelling reasons against. Thoughts? –xeno^talk 01:06, 23 June 2009 (UTC) Changed my position to just add af-view-private to +sysop - vote for the bug below. –xeno^talk 14:00, 24 June 2009 (UTC)

I tend to agree with the previous decision- don't enable by default, as admins can turn it on for themselves. If there was a bit for "view only" and another for "edit and break the whole encyclopedia", I might agree to roll it in to the bundle. However, as it's just one bit, causing interested admins to flip the bit isn't much work. tedder (talk) 05:17, 23 June 2009 (UTC)

There is bugzilla:19362 now that requests a new right for reviewing private filters without being able to edit them. I think we should let the devs create such a right and roll it into the sysop bundle (or maybe even rollbacker, see haz's suggestion at AN) and leave the editing right a separate permission. Regards So Why 12:56, 23 June 2009 (UTC)

That would preclude this suggestion and is a better idea altogether - though I don't think it should be included in rollbacker by default; otherwise the pool of individuals who could leak filter details would be too great. It's not tough to get an admin to enable rollback for you. –xeno^talk 16:20, 23 June 2009 (UTC)

Agreed, but may be discussed. My original suggestion of this at AN (which led to the bugzilla) was because admins like you and me might want to view the filter but not tamper with it. On a side note (see section above as well), if this flag gets created, the abuse filter group should not be given out lightly without any process, seeing that it would allow non-admins to perform admin tasks. We might want to open up a discussion on this at VPP or similar to discuss some kind of system to set this flag. Regards So Why 17:08, 23 June 2009 (UTC)

Vandals have managed to get rollback rights quite a few times, it's very easy. I think the abusefilter-view should be bundled with admin, as we have no need for non-admin users with view permission without edit permission. Cenarium (talk) 21:46, 23 June 2009 (UTC)

I would support abusefilter-view-private rights to be added to the sysop user group, and for the AFE group to be kept as-is. haz (talk) 13:54, 24 June 2009 (UTC)

I've changed the heading of this thread; this is a better idea all around. –xeno^talk 13:57, 24 June 2009 (UTC)

I support Xeno's original proposal, i.e. add all the AFE userrights into the admin usergroup. There is really no point in separating a userright which a usergroup can give to themselves anyway. Admins have been shown to have the trust of the community, and so we have to trust them not to edit a filter if they don't understand what they are doing! — Martin (MSGJ · talk) 14:18, 24 June 2009 (UTC)

I've been an admin for 2½ years and wouldn't have a freaking clue when it comes to technical stuff like that :) However, I'm in agreement that the view attribute, with read-only access effectively, should be added to the package. Alternatively, I like Tedder's suggestion near the top of this section - enable it for all, but only those who care can switch it on. Orderinchaos 16:21, 24 June 2009 (UTC)

Thanks. And the read-only makes it MUCH easier- give all sysops readonly, allow them to turn on full rights if they want it.

But with read-only, what is the non-admin policy going to be? I.e., give read-only access to most who ask for it, and be picky with full rights to nonadmins? Or absolutely no full rights to nonadmins? tedder (talk) 16:37, 24 June 2009 (UTC)

I don't see a need to have the ability to grant view-private rights to non-admins. If the non-admin wants to work with filters they should probably just apply for full access. If they're not competent enough to write filters, there's no real need for them to view private ones. –xeno^talk 16:42, 24 June 2009 (UTC)

That's also my opinion, there's no need for non-admins to view private filters if they don't have the full access. Admins however, with edit permission or not, should have the read permission by default, as they often have to view filters for information, for example for User:Mr.Z-bot reports at WP:AIV. Cenarium (talk) 19:10, 24 June 2009 (UTC)

I do see a need for a grantable right from my perspective: As a sysop at a smaller sister-project who works with the AbuseFilter there, it would be very helpful to be able to view and export filters that have been developed by the larger community here. This is particularly so because the extension's documentation is so sketchy that it is difficult to make much headway without studying examples, even for one whose programming career is longer than the age of the typical Wikipedian, unless one has specific experience with MediaWiki or PHP. ~ Ningauble (talk) 19:36, 24 June 2009 (UTC)

Most filters aren't private; filters are 'privatized' when dealing with serial vandals or sockpuppets active on this wiki. So there shouldn't be many cases where they'd need to be imported to another wiki. There are also safe off-wiki ways to transmit information if needed, but it doesn't warrant a new userright imo. Cenarium (talk) 20:30, 24 June 2009 (UTC)

Furthermore, if there was a demonstrated need then I'm sure the community would be willing to grant (even if temporarily) +AFE to the admin-on-another-project for view-only purposes. –xeno^talk 20:37, 24 June 2009 (UTC)

Early on, when the extension was first rolled out, a greater proportion were private, which made the learning curve rather steep for those not privy to this community's accumulating knowledge. This is no longer a problem now that most are visible. ~ Ningauble (talk) 15:24, 19 July 2009 (UTC)

Odd false positives

That's correct, the edit filters are inaccurate. It has been reported edit filters can't find the edit they're tagging, but nothing was done. It doesn't matter they are creating a permanent log of nonsense. Anybot created thousands of nonsense articles that had to be deleted. At some point this nonsense will have to be handled. Meanwhile it's in the "accumulate nonsense" action phase. We could bet and try to guess the ultimate number of bad edit filters created? --69.226.103.13 (talk) 00:02, 19 July 2009 (UTC)

Maybe the filter gave a warning and the IP never saved its edits? --NE2 00:42, 19 July 2009 (UTC)

This appears to be bugzilla:19680 again. עוד מישהו Od Mishehu 09:47, 19 July 2009 (UTC)

Special:AbuseFilter/63

This filter is designed to prevent newcomers from making abusive unblock denials. I think we need a filter to prevent these, on the outside chance that the blocked user is, in fact, one who has a chance of getting unblocked - denials like this need to be stopped before the user sees them. עוד מישהו Od Mishehu 09:58, 19 July 2009 (UTC)

I'd disabled this recently because the condition limit percentage was getting rather high (above 4%, meaning 4% of all edits were not being checked for filters), the filter has had relatively few hits, and (as I said in the editor logs) any attempt to decline an unblock request is reported to #wikipedia-en-unblock on freenode. There's usually a couple dozen administrators in that channel, so any abusive denials should be reverted quickly.

That said, the condition limit percentage is now astonishingly low, less than a tenth of a percent. If this filter could be optimized (to use contains_any() or something), I've no problem with it coming back online. Hersfold ^(t/a/c) 19:04, 19 July 2009 (UTC)
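For readers unfamiliar with the function mentioned above, here is a rough Python sketch of the idea, assuming contains_any() is a plain substring test of one text against several needles in a single call. The phrase list and the helper name looks_like_abusive_denial are hypothetical, invented only for illustration; they are not taken from filter 63.

```python
def contains_any(haystack: str, *needles: str) -> bool:
    """Return True if at least one needle occurs as a substring of haystack."""
    return any(needle in haystack for needle in needles)


# Hypothetical phrases, for illustration only (not the real filter's contents).
PHRASES = ("request denied", "no unblock for you")


def looks_like_abusive_denial(added_text: str) -> bool:
    # One lowercase pass and one combined check, rather than a separate
    # condition for every phrase.
    return contains_any(added_text.lower(), *PHRASES)
```

Whether a change like this actually reduces the condition count would depend on how the existing filter is written, so treat the above as a sketch rather than a drop-in fix.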

Future albums

The wording of this template isn't right:

"Wikipedia has a policy stating Wikipedia is not a crystal ball. Adding details of future albums or other record releases where not even the name is known is not consistent with this policy."

WP:CRYSTAL says "All articles about anticipated events must be verifiable". It doesn't say "Articles about anticipated events are not allowed," nor does it say "Articles cannot have details of future albums". Adding well-sourced, verifiable information about future albums is consistent with WP:CRYSTAL, and this filter should make that clear. —Gendralman (talk) 14:51, 19 July 2009 (UTC)

Request for name change

Could the name of this log be changed, please? I just noticed the other day that I have entries in an "abuse" log for linking to YouTube and for creating articles about Michael Jackson, which triggered a suspicion of vandalism. A few other people are voicing the same concern at AN/I, and someone suggested posting the request here. SlimVirgin ^{talk|contribs} 18:11, 2 July 2009 (UTC)

I would support a name change on all public-facing parts of this extension to "Edit filter". Even after we tell people that "Entries in this list do not necessarily mean the edits were abusive.", they still worry about poisoning of their well. –xeno^talk 18:14, 2 July 2009 (UTC)

I too support the name change. As I have commented before: The current name is arguably inaccurate (since it doesn't contain any possible, probable qualifiers) and counter-productive in that it can drive away good faith editors who see their good edits labeled as "abuse". Vandals surely don't care what we call their edits, only good faith editors are likely to be sensitive about their reputation. Abecedare (talk) 18:24, 2 July 2009 (UTC)

Actually, that would be a sensible thing to do. I hadn't thought of it, but indeed a more neutral and less bitey title would be useful, and it would not in any way change the function of the extension. Good idea! --Jayron32.talk.contribs 18:41, 2 July 2009 (UTC)

I changed the name of the log to "Edit filter log". Ruslik_Zero 19:30, 2 July 2009 (UTC)

Thank you, User Ruslik0. User SlimVirgin, you should spend more time at AN/I offering solutions. I don't know why so many wikipedians offer the, "it doesn't bother me to have an abuse filter" defense when there's no point in calling it an abuse filter while denying it is one.

Just for that I'm going to continue editing the garbage created by anybot. Probably only these 900 articles plus maybe a few thousand redirects to delete. --69.226.103.13 (talk) 06:24, 3 July 2009 (UTC)

I changed all system messages. The only remaining steps are moving this page to Wikipedia:Edit filter and changing the name of user group to (I propose) Edit Filter managers. Ruslik_Zero 09:08, 3 July 2009 (UTC)

I did the move; it was a bit of a monster. I'll hold off on fixing the double redirects for a few hours in case all hell breaks loose again... Happy‑melon 18:56, 7 July 2009 (UTC)

We could actually get the special page itself renamed using $aliases['en']['AbuseFilter'] = 'EditFilter'... Worth a site request? Happy‑melon 19:14, 7 July 2009 (UTC)

I renamed Abuse Filter editors to Edit Filter managers. I think Wikipedia:Edit_filter/Performance page needs attention, because it is updated by a bot. The main page itself (Wikipedia:Edit_filter) should be edited to reflect the change in the name of the filter. Ruslik_Zero 19:23, 7 July 2009 (UTC)

Quite happy to see we're removing the word "abuse." --MZMcBride (talk) 19:23, 7 July 2009 (UTC)

T21618 Happy‑melon 16:34, 9 July 2009 (UTC)

Hindsight

Well, they say it's always 20/20. I think "ActionFilter" is better for a number of reasons:

Filters (as far as I'm aware) can apply to more than just &action=edit;
It preserves the AF initialism, which can be helpful for old discussion, shortcuts, etc.;
Using CamelCase keeps in line with other pages like Wikipedia:CheckUser.

I doubt anyone will want to implement this change (esp. as it was only just changed to "Edit filter"), but I thought I'd mention it anyway. --MZMcBride (talk) 23:30, 17 July 2009 (UTC)

I like it, but your conclusion is bang on. ;p –xeno^talk 04:17, 21 July 2009 (UTC)
