Wikipedia talk:Edit filter/Archive 6

From Wikipedia, the free encyclopedia

Filter 220 - Warning

Can 220 be set back to warning people if they are about to add an external link as an image? I think it would be really useful if they could be informed to upload files if they trip this filter, resulting in less trouble for all of us. Regards SoWhy 12:04, 25 July 2011 (UTC)

 Done Reaper Eternal (talk) 12:17, 25 July 2011 (UTC)

Edit filter request

Can somebody add the words "wimpy", "wuss" and "wuss rock" to filter 384? The problem is that a person from Columbus, Ohio has been abusing this for months. S/he started at the mentioned article (a place where s/he lives) by removing the "offensive" nickname "Cowtown". You can see all the history; since January 2011 it has been the same. Then s/he moved to Cowtown, another page now protected against him/her. Then s/he started moving to music-related pages. Here is a list:

A place:

And I could continue, but I think those pages are enough. I have had enough of this user: after six months of blocks, re-blocks and rangeblocks, he has not made any good contributions. Could you please add those words to that filter, or any other? Thanks. Tbhotch. Grammatically incorrect? Correct it! See terms and conditions. 02:48, 26 July 2011 (UTC)

  • Declined - "wimpy" - The chance of false positives is waaay too high.
  • Declined - "wuss" - Doesn't he always just add it in front of "rock" to get "wuss rock"? This may produce false positives.
  • Accepted - "wuss rock" - Doesn't seem to have any practical use other than as a genre insult.
So, the whole thing is  Partly done. Any questions? Cheers! Reaper Eternal (talk) 16:39, 27 July 2011 (UTC)
No, and thanks. Tbhotch. Grammatically incorrect? Correct it! See terms and conditions. 21:19, 27 July 2011 (UTC)

Filter 354 - Promotional userspace

I'm receiving a lot of false positives from this filter, apparently triggered by the common auto-generated edit summary "New userpage through Outreach:ACIP", which is added for people who come through our outreachwiki. I strongly suspect that the "outreach" string is tripping the filter. Can we please remove it from the filter while these edit summaries are still in place, and until outreach can come to its senses? TeleComNasSprVen (talkcontribs) 16:21, 27 July 2011 (UTC)

Could you provide an example of a false positive? I cannot find any. Additionally, the edit summary is not checked in filter 354, and "outreach" is not in the regex. Reaper Eternal (talk) 16:32, 27 July 2011 (UTC)
The filter is suppressed from public view, so we'll have to take Reaper's word for it. Could you give an example of a false positive, though? — Waterfox ~talk~ 00:51, 29 July 2011 (UTC)
I believe this and this would be examples of what TeleComNasSprVen means - however, it's not related to the "outreach" in any way - it's because the users are including links in their userpages, AFAICT. Avicennasis @ 12:59, 14 Av 5771 / 14 August 2011 (UTC)
I guess so, because I think I've found something like it here. The self-description seems harmless to me, but apparently these occurrences trip the filter anyway. TeleComNasSprVen (talkcontribs) 13:48, 25 August 2011 (UTC)
It's the other promotional text that is triggering the edit filter, not the outreach template. Cheers! Reaper Eternal (talk) 15:03, 25 August 2011 (UTC)

An addition to the BLP filters

Could somebody modify the "possible BLP issue or vandalism" filters to catch this sort of thing (not visible to non-admins). Thanks. HJ Mitchell | Penny for your thoughts? 20:38, 25 August 2011 (UTC)

 Done Let me know if there are further attempts to mask obscenities. Reaper Eternal (talk) 18:21, 26 August 2011 (UTC)

timestamp manipulation

I wish to create a filter that checks the old_wikitext of a page for {{templatename|currentyear|currentmonthname|currentdate}}. The only way I can do this AFAICT is to use modular arithmetic on the timestamp, and, for the month name, using := with a switch (or if there are no switches then successive ternary operators). (a) Is there a better way? (b) If that is the only way, then how terribly expensive is such a check? (This is not on enWP.)—msh210 19:32, 1 September 2011 (UTC)

Also, such would require using concatenation ("pattern"+year+"\|"+, etc.). Does that work with rlike? with contains? by some other means?—msh210 20:04, 1 September 2011 (UTC)
If I had to make such a filter it would be something like this (the first line needs to be updated annually):
( lcase(old_wikitext) contains '{{templatename|2011' ) &
( 
  /* expensive conversion here: timestamp -> month & date  */
)
And string concatenations work fine: test "a"+1 in Special:AbuseFilter/tools on your wiki. — AlexSm 21:00, 1 September 2011 (UTC)
Thanks. 'abc' contains 'a'+'b'+'d' evaluates to 1bd, so there's an order-of-operations issue. And trying 'abc' contains ('a'+'b'+'d') doesn't work (evaluates to ∅, as does if ('abc' contains ('a'+'b'+'d')) then 1 else 0 end).—msh210 21:21, 1 September 2011 (UTC)
Why do you expect "abc" to contain "abd"? The code 'abc' contains ('a'+'b'+'c') works fine. — AlexSm 01:21, 2 September 2011 (UTC)
Er, yes, good point. Sorry.—msh210 on a public computer (talk) 06:57, 2 September 2011 (UTC)
The conversion can probably be done using a combination of strpos and substr. I'd be happy to write the code if needed. Just give me a shout. — Kudu ~I/O~ 22:25, 1 September 2011 (UTC)
Would that be cheaper?—msh210 23:42, 1 September 2011 (UTC)
I have a feeling you expect something like 20110901... while what we have is Unix time. — AlexSm 01:21, 2 September 2011 (UTC)
Not at all (expecting 20110901...). But one can use modular arithmetic to convert a (Unix) timestamp, too, to more user-friendly units. But thanks for the heads-up.—msh210 on a public computer (talk) 06:57, 2 September 2011 (UTC)
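The modular-arithmetic conversion discussed in this thread can be tried out offline before porting it to a filter. The following is a minimal Python sketch, not AbuseFilter code: it assumes UTC, uses the well-known integer "civil from days" calculation, and the names `civil_from_unix` and `template_pattern` are hypothetical helpers introduced only for illustration.

```python
import re

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def civil_from_unix(timestamp):
    """Convert a Unix timestamp (seconds, UTC) to (year, month, day)
    using only integer division and subtraction."""
    days = timestamp // 86400              # whole days since 1970-01-01
    z = days + 719468                      # shift the epoch to 0000-03-01
    era = z // 146097                      # completed 400-year cycles
    doe = z - era * 146097                 # day within the cycle
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)  # day of year, March-based
    mp = (5 * doy + 2) // 153              # month index with March = 0
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    year = yoe + era * 400 + (1 if month <= 2 else 0)
    return year, month, day


def template_pattern(template_name, timestamp):
    """Build a regex for {{name|year|monthname|day}} by concatenation."""
    year, month, day = civil_from_unix(timestamp)
    parts = [template_name, str(year), MONTH_NAMES[month - 1], str(day)]
    return r"\{\{" + r"\|".join(re.escape(p) for p in parts) + r"\}\}"
```

The amount of arithmetic per edit is small and fixed, which may help when judging question (b); the actual cost inside the filter engine is not something this sketch measures.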

Why wasn't this stopped?

See contributions for User:60.52.43.195 -- this should have been caught by three different filters (17, 58, and 264), and is showing as a hit for each of those filters, yet the edit was allowed. The condition limits don't seem to be a problem right now. Any ideas? Thanks, NawlinWiki (talk) 14:56, 3 September 2011 (UTC)

Same for User:60.52.122.210 - should have been stopped by 17 and 264, shows as a hit, but not stopped. NawlinWiki (talk) 02:19, 19 September 2011 (UTC)

Filter 420

Filter 420 may cause many false positives. I think it might be best to just tag the edits, removing the rate limit. — Kudu ~I/O~ 22:43, 11 September 2011 (UTC)

  • Could you show some examples please? Perhaps changing edit_delta would resolve the issue, rather than removing the rate limit. I'll request dq to also come in here. Wifione Message 03:02, 12 September 2011 (UTC)
    • I created this at 16:03, 19 June 2011... I think our false positives would have shown a lot more by now (mind you, I haven't really looked into it at all). Over its life it's only stopped ~1,100 edits, which is somewhere around 9 per day. I'd rather see the false positives first before we take action. The fact that the rate limit is already set to 1/hour should be good. -- DQ (t) (e) 03:36, 12 September 2011 (UTC)
  • Example of a FP (editor was removing vandalism). Sole Soul (talk) 04:13, 19 September 2011 (UTC)

Filter 58

Am I being dumb, or is this a bug? Black Kite (t) (c) 23:34, 22 September 2011 (UTC)

It's a false positive but not a bug. I will send to you by e-mail the part of the edit that matched the filter. Sole Soul (talk) 23:48, 23 September 2011 (UTC)

New filters

I recently designed this edit filter, whose code is below:

/* Only act on the bot's on/off switch page */
article_prefixedtext == 'User:ClueBot NG/Run'
& removed_lines rlike "True|true|False|false"
& !(added_lines rlike "True|true|False|false")

The filter is designed to run on the page User:ClueBot NG/Run and block vandals from changing the page to something other than "True" or "False". The page is the "switch" for ClueBot NG. The bot can be blocked if it malfunctions, but the users in this discussion seem to conclude that, since administrators can already block the bot, this page is intended for non-admins to turn off the bot in case it malfunctions. If this filter goes into effect, the action taken should be "disallow".

article_prefixedtext == 'User:ClueBot NG/Run'
& added_lines rlike "False|false"

This filter should warn editors who are changing the switch to "false" that they should only turn the bot off if it appears to be malfunctioning.

Now, I do not have access to Edit Filter rights so I have no idea if these codes would work as I cannot access the Batch testing page. Hopefully the people that do, will fix the codes up and test it out and run it live. Thanks. OpenInfoForAll (talk) 00:11, 24 September 2011 (UTC)
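Since the proposal could not be batch-tested, the intended behaviour of the two filters can at least be checked offline. The following is a minimal Python sketch of the decision logic only, under the simplifying assumption that the whole new page text is inspected rather than the added and removed lines; `switch_page_action` is a hypothetical name, and nothing here is AbuseFilter syntax.

```python
SWITCH_PAGE = "User:ClueBot NG/Run"
ALLOWED_VALUES = {"True", "true", "False", "false"}


def switch_page_action(page_title, new_text):
    """Return 'allow', 'warn' or 'disallow' for an edit, following the
    two proposed filters: only True/False may be saved on the switch
    page, and switching the bot off triggers a warning."""
    if page_title != SWITCH_PAGE:
        return "allow"                 # the filters only apply to one page
    value = new_text.strip()
    if value not in ALLOWED_VALUES:
        return "disallow"              # anything but True/False is rejected
    if value.lower() == "false":
        return "warn"                  # remind the editor when to switch off
    return "allow"
```

For example, `switch_page_action("User:ClueBot NG/Run", "False")` gives `"warn"`, while any other text on that page gives `"disallow"`.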

"Wuss-rock"

I requested here the addition of the words "wuss rock" (a degrading name for pop rock/punk rock/power ballad/soft rock music) because of the long-term abuse by an editor from Columbus, Ohio. The filter worked for months, but now s/he has decided to add a "-" between these words. The filter does not catch this small change, and s/he has returned with "wuss-rock". I know that as of now there have been only two attacks with this, but there is enough evidence that this user won't cooperate and will continue. If it is possible to add this to filter 384, it would be appreciated. Tbhotch. Grammatically incorrect? Correct it! See terms and conditions. 21:06, 27 September 2011 (UTC)

I'll take a look. 28bytes (talk) 21:08, 27 September 2011 (UTC)
Looks like this might have been moved to a different filter. I'll ping the EFMs who've been working on that one. 28bytes (talk) 21:15, 27 September 2011 (UTC)
Vandalism filters are generally divided into two categories: 1) filters which combat common types of vandalism, usually committed by pass-by vandals, and 2) filters which combat specific types of vandalism by persistent, tireless vandals.
Filters of the first type have characteristics which allow them to catch most vandalism, but ironically, these same characteristics allow a persistent vandal to pass them easily. Filters of the second type are harder to pass, and they are changed frequently as vandals adapt to old filters (that's why they are private).
Filter 384 was intended to combat vandalism of the first type. NawlinWiki and MuZemike are maintaining many filters of the second type and I think it is better to ask them to add this word to one of these filters. Sole Soul (talk) 00:02, 28 September 2011 (UTC)
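To illustrate why a literal "wuss rock" match misses "wuss-rock", here is a minimal Python sketch of a separator-tolerant pattern. The regex and the name `mentions_genre_insult` are assumptions made for illustration; the actual contents of filter 384 or of the private filters are not shown in this thread.

```python
import re

# Allow any run of spaces, hyphens or underscores (or none) between the words.
GENRE_INSULT = re.compile(r"\bwuss[\s\-_]*rock\b", re.IGNORECASE)


def mentions_genre_insult(added_text):
    """Return True if the added text contains the phrase in any of the
    simple separator variants covered by the pattern."""
    return GENRE_INSULT.search(added_text) is not None
```

As noted above, a persistent vandal can still step around a pattern like this (for example with other characters), which is the argument for handling such cases in a frequently updated private filter.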

Request (Adrignola)

 Done. -- zzuuzz (talk) 11:49, 1 October 2011 (UTC)

As you may be aware, the edit filter is now enabled on all wikis. en.wikibooks admin/'crat/CU here; I was hoping for emailed content (or temporary access) for:

I'm hoping these additional/updated filters will reassure those wishing they could block unregistered users from editing. – Adrignola talk 02:51, 23 September 2011 (UTC)

  • Adrignola is a bureaucrat, admin, checkuser, editor on enwikibooks, apart from being an editor and a reviewer at en_labswikimedia. Adrignola is also a sysop on outreachwiki and commonswiki. I have no issues on Adrignola being provided the abusefilter right. Wifione Message 17:38, 23 September 2011 (UTC)
    In the interest of full disclosure, you can add editor/reviewer to yourself at en_labswikimedia. Outreach added all trusted users to the administrators group, so it's not a whole lot better than my status as automatic administrator assignment on the checkuser wiki (but I guess it should still indicate a level of trust). I suppose since we're discussing my various hats, I'll mention also that I am an OTRS volunteer clearing images and text of copyright issues through confidential emails. Should any access be provided, whether permanent or temporary, I don't plan to do any editing to filters, instead benefiting from the knowledge others have to share through their hard work on the above filters and the extensive testing that Wikipedia provides through its massive amounts of editing. A long time ago after Wikibooks requested the extension, I had asked for email copies of some filters but I had not asked for all of the ones that would have been useful to limit the inconvenience of the request. Since then some have been updated and it would be useful to have the rest and of course private filters here would remain private at Wikibooks. – Adrignola talk 20:42, 23 September 2011 (UTC)
    In the interest of moving this along, no objections to this request from me. To keep things tidy perhaps temporary permission would be most appropriate, without prejudice to asking for it again. -- zzuuzz (talk) 07:41, 29 September 2011 (UTC)
    No further comments?  Done. -- zzuuzz (talk) 11:49, 1 October 2011 (UTC)

Request for views

Not meant to be an RfC; that's why I titled the section as RfV. Recently, the sysop right was removed from various inactive sysops, with a note that they could have it back if they wished. Should we (or should we not) take up the same exercise for inactive sysops and other inactive users with respect to their abusefilter-modify/abusefilter-view-private rights? Wifione Message 07:15, 29 September 2011 (UTC)

Do it. Prodego talk 04:36, 8 October 2011 (UTC)

New filter that could catch the worst attempts at A10

After Facepalm Supreme facepalm of destiny-ing ICONic Boyz, I noticed that the creator copy-and-pasted the tagline. Could something detect "From Wikipedia, the free encyclopedia" in the first few lines of a new page creation, and mark it with something like "Tag: Possible A10"? →Στc. 01:08, 9 October 2011 (UTC)

There is already a 'possible cut and paste move' tag used by filter 164. It could be added to that. GFOLEY FOUR!— 01:20, 9 October 2011 (UTC)
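The check requested here is simple enough to sketch. The following minimal Python example shows the idea under stated assumptions: the limit of five lines and the name `looks_like_copied_page` are hypothetical, and how filter 164 is actually written is not shown in this thread.

```python
TAGLINE = "from wikipedia, the free encyclopedia"


def looks_like_copied_page(new_page_text, max_lines=5):
    """Return True if the site tagline appears within the first few
    lines of a newly created page (case-insensitive)."""
    head = new_page_text.splitlines()[:max_lines]
    return any(TAGLINE in line.lower() for line in head)
```

A match would only justify a tag, since the tagline may also appear in legitimately quoted text.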

354 Promotional text added by user to own user(-talk) page - Why private

Re. Special:AbuseFilter/354

Is there any reason why this filter needs to be flagged as private?

(This came up on a mediawiki talk; I'll notify the users who've edited it of this thread too)

 Chzz  ►  10:28, 30 October 2011 (UTC)

Well, it was private when I started editing it, and I never bothered to notice that until now. I've changed it to public. -- King of ♠ 00:32, 31 October 2011 (UTC)
No, please change it back to private. Why? Per WP:BEANS: real spammers have a habit of finding ways to get their spam onto Wikipedia. When they do not succeed in getting their links on Wikipedia by using multiple IPs to avoid detection, or by creating massive sockfarms, they will use redirect sites (there is a reason why we blacklist these on sight), etc. The same goes for this filter - once you know what we check for and how, it is easy to avoid those words and create your promotional userspace page without those terms, or to break them up in such a way that they appear but do not trigger the filter. I've said this many times, but (real) spammers are not your run-of-the-mill vandals. They will not come and see their stuff deleted and think 'oh, whatever, we go somewhere else'. They earn money with it, and many come back and try again. If you need to know the words or the system, please email an edit-filter editor (but please avoid posting the filter on-wiki) --Dirk Beetstra T C 08:27, 31 October 2011 (UTC)
Oops, forgot to say, I already made it private again. --Dirk Beetstra T C 08:28, 31 October 2011 (UTC)

EDITS IN ALL CAPS

We have filter 225 for vandalism in all caps and filter 437 for titles in caps. I just ran across this edit and was wondering what people think of a filter for edits in all caps. I'm not sure it should be disallowed, but maybe it should be tagged. Perhaps if 5 or more words were in caps. I don't know regex so I'm not sure if that's possible. Anyway, I'd like to get input here before making a request. Thanks. - Hydroxonium (TCV) 07:15, 5 November 2011 (UTC)

Detection would be good, because a large number of Indian village article creators seem to like to list off "important" people IN ALL CAPS IN LIST FORM!!!! (that's more or less what it looks like) I don't think we'd need to set it to disallow, but certainly triggering something would be good; that will help keep private names from being in the encyclopedia for too long and would help us detect all-caps yet less immediately obvious vandalism. The Blade of the Northern Lights (話して下さい) 18:17, 5 November 2011 (UTC)
We have it already. Sole Soul (talk) 23:51, 5 November 2011 (UTC)
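On the regex question raised above: a run of five or more capitalised words is expressible as a pattern. This minimal Python sketch shows one way to write it; it is an assumption for illustration, not the code of the existing filter, it only handles ASCII letters, and `has_caps_run` is a hypothetical name.

```python
import re

# One all-caps word of 2+ letters, followed by at least four more,
# separated by whitespace or commas: five or more words in total.
FIVE_CAPS_WORDS = re.compile(r"\b[A-Z]{2,}(?:[\s,]+[A-Z]{2,}){4,}\b")


def has_caps_run(added_text):
    """Return True if the text contains five or more consecutive
    all-caps words."""
    return FIVE_CAPS_WORDS.search(added_text) is not None
```

Requiring at least two letters per word keeps single capitals and most short acronym pairs from matching on their own.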

Vandalism missed?

I wonder why this wasn't apparently picked up by a filter?  Chzz  ►  12:28, 18 November 2011 (UTC)

It should have matched filter 380, but the presence of 'asses' within the old_wikitext made it fail. I would guess it should be checking removed_lines instead of old_wikitext. -- zzuuzz (talk) 13:02, 18 November 2011 (UTC)
Thanks for explaining that. It might be something we could refine.  Chzz  ►  17:22, 18 November 2011 (UTC)
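The difference between the two checks can be shown with a small example. This is a minimal Python sketch under stated assumptions: the pattern is a stand-in (the real pattern of filter 380 is not given in this thread), and both function names are hypothetical.

```python
import re

BAD_WORD = re.compile(r"asses", re.IGNORECASE)  # stand-in pattern


def hits_checking_old_text(old_text, added, removed):
    """Match only if the word is added and appears nowhere in the old page."""
    return bool(BAD_WORD.search(added)) and not BAD_WORD.search(old_text)


def hits_checking_removed_lines(old_text, added, removed):
    """Match only if the word is added and was not in the removed lines."""
    return bool(BAD_WORD.search(added)) and not BAD_WORD.search(removed)
```

With an old page that happens to contain "assesses" in an untouched paragraph, the first check is suppressed for every later edit to that page, while the second still fires unless the edit itself removed the word.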

Filter 423 - WikiLove

To the best of my understanding, we use the edit filter for the purpose of tracking bad edits (even if done in good faith). In fact, I believe that this discussion, about Filter 200, is an excellent precedent for this. Filter 423 seems to be a violation of this idea - as most use of WikiLove is probably good. What do other users think about this? עוד מישהו Od Mishehu 11:11, 15 November 2011 (UTC)

Note: Eloquence was notified of this discussion. עוד מישהו Od Mishehu 11:13, 15 November 2011 (UTC)
There was a thread at the village pump about people possibly abusing the WikiLove feature. I believe that's when Eloquence created the filter, so that abuse could be tracked. Maybe, if there's no abuse, the filter could be turned off, as a number of people view EF hits as a negative thing. I'd like to hear some more pros and cons before making a decision, but that's just me. Best regards, - Hydroxonium (TCV) 10:00, 18 November 2011 (UTC)
It's both to track potential misuse and ongoing use of the tool. While the tool is still relatively new, it seems wise to make it easy to track continuing use of it.--Eloquence* 18:10, 19 November 2011 (UTC)

Requests from 2010

I cleaned up WP:EF/R. I'd like to know if any of the EF managers are going to work on the requests from 2010 (i.e. requests that are older than 1 year)? If the answer is no, I'll archive all those requests as  Not done. - Hydroxonium (TCV) 19:40, 19 November 2011 (UTC)

Sandbox

Is there any way, in the Wikipedia talk:Sandbox, to override the filter? I think it would be a good exception if you could override the filter while clearing the Sandbox. 71.146.20.62 (talk) 00:45, 21 November 2011 (UTC) (Note: Copied from top)

It'd be unnecessary, since a bot periodically does it. Jasper Deng (talk) 00:45, 21 November 2011 (UTC)
Oh, okay, thanks. 71.146.20.62 (talk) 21:01, 22 November 2011 (UTC)

Vandalism not caught

How the hell was this huge section blanking not caught? Ten Pound Hammer(What did I screw up now?) 20:57, 22 November 2011 (UTC)

Because of the condition length(added_lines) < 1. This means that if the user added any content, or changed any content, or moved it from one place to an other, this filter (Filter 172) doesn't catch it. עוד מישהו Od Mishehu 11:47, 24 November 2011 (UTC)
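The effect of that condition can be reproduced offline. The following minimal Python sketch uses `difflib` to approximate the added and removed lines of an edit; it counts lines rather than characters, and `min_removed` and both function names are assumptions for illustration, not the real logic of Filter 172.

```python
import difflib


def added_and_removed(old_text, new_text):
    """Return (added_lines, removed_lines) as lists, using a line diff."""
    added, removed = [], []
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
    return added, removed


def pure_blanking(old_text, new_text, min_removed=3):
    """Match only when nothing was added and several lines were removed,
    mirroring a condition of the form length(added_lines) < 1."""
    added, removed = added_and_removed(old_text, new_text)
    return len(added) < 1 and len(removed) >= min_removed
```

An edit that blanks a section but also changes or adds a single line produces a non-empty list of added lines, so a check of this shape lets it through.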

Possible catch

Hi, I think this might be an easy catch [1]. -DePiep (talk) 00:04, 9 December 2011 (UTC)

Use for proactive cleanup

This could be pretty useful for proactive cleanup, using the "warn" feature to notify editors of WP:MOS expectations and stuff. Any occurrence of "irregardless" or "Irregardless" that is not in italics or quotations marks or inside a quotation template; that sort of thing. I'd love to see it catch all cases of "aka" as a string by itself ("aka" not being a word in English; I don't care if you do it as AKA or a.k.a., but it's not "aka"). Even allowing quoted cases might still be too restrictive for a string this simple, though <sigh>. "[I|i]rregardless" seems like an easy one, though. — SMcCandlish Talk⇒ ʕ(Õلō Contribs. 23:09, 25 December 2011 (UTC)

The filter would be insanely broad, however, and would take up an immense amount of server resources. This won't fly well at all, and I don't think that the abuse filter is a good way to stop people from misspelling or making up words. Nice idea though. P.S. The regex would really be "[Ii]rregardless" or "(I|i)rregardless". Reaper Eternal (talk) 15:15, 3 January 2012 (UTC)
Right. It'd unnecessarily gobble CPU cycles, complicate the editing experience for newbies, desensitize people to filter warnings, and it'd be woefully prone to error (much like a spell bot). Like Reaper said, good idea—I like that you think outside the box. :) --slakrtalk / 02:48, 5 January 2012 (UTC)
We could, however, consider writing something like this in JavaScript, implementing it in the edit form. Spellchecking is an obvious feature, but there is much more we could do. Prodego talk 20:23, 5 January 2012 (UTC)
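The regex correction in Reaper Eternal's postscript is easy to verify. In this minimal Python sketch (the variable names are arbitrary), the bracketed form with a pipe is a character class that also accepts a literal "|", whereas the corrected form accepts only the two letters.

```python
import re

WITH_PIPE = re.compile(r"[I|i]rregardless")   # class contains I, | and i
CORRECTED = re.compile(r"[Ii]rregardless")    # class contains I and i only
GROUPED = re.compile(r"(I|i)rregardless")     # alternation inside a group
```

All three match "Irregardless" and "irregardless"; only the first also matches the stray string "|rregardless".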

PaoloNapolitano

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I am picking up anti-vandalism experience, I am a rollbacker, never blocked and I think this would be an interesting right to have. I love doing anti-vandalism work, and I would love working behind the scenes of it. PaoloNapolitano 20:52, 11 February 2012 (UTC)

Normally we like to see a lot more than 800 edits before setting that flag. Anyway, I can assure you it's not that interesting of a right. Most of the filters are public, so any editor can view how they work and suggest improvements. 28bytes (talk) 00:37, 12 February 2012 (UTC)
EFM is not a particularly "interesting" privilege. If you are just interested in looking at filters, almost all of them are viewable to the public anyway. Cheers! Reaper Eternal (talk) 14:36, 24 February 2012 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

139

I've disabled 139 for now because it's attracting a lot of false positives, and they're the worst kind of false positives—like this brand new editor being prevented from writing an article. It's also catching a lot of potentially dodgy but seemingly good-faith edits where disallowing isn't appropriate in my opinion. Could somebody more proficient with filters please go through the log and then try to make the filter more specific—preventing every edit that uses fixed position markup is clearly not working. Thanks, HJ Mitchell | Penny for your thoughts? 19:38, 8 December 2011 (UTC)

The Filter 139 issue is now a trending topic at VP/T: wp:vpt#Quartic_function_has_spam_inserted_directing_all_clicks_to_a_racist_9.2F11_conspiracy_site. Filter 422 seems related. -DePiep (talk) 19:51, 18 February 2012 (UTC)

Filters failing to catch racist vandalism

Can someone please take a look at WP:AN/I, and try to find out why filter 139 hasn't been catching the recent racist template vandalism? Please be careful viewing the diffs given there: they cover the content with an invisible image that clickjacks every link to racist sites which may well also contain malware, even in preview mode. In the meantime, I've turned filter 453 back on, which is simpler, and should perhaps have a better chance of catching it? -- The Anome (talk) 00:45, 19 February 2012 (UTC)

 Fixed. Reaper Eternal (talk) 18:45, 21 February 2012 (UTC)

an addition for a filter?

This is hardly subtle, so I'm thinking it might be possible to tack it onto the end of a filter somewhere, but I don't know my way around filters well enough to do it myself. HJ Mitchell | Penny for your thoughts? 00:42, 20 February 2012 (UTC)

 Done. Added to filter 260. Reaper Eternal (talk) 18:55, 21 February 2012 (UTC)

Jeff G.

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I am actively involved in anti-vandalism efforts here. I am an Autopatroller, Reviewer, Rollbacker, File mover, and Account creator here. I identified to the WMF 03:06, 2012 February 7 (UTC).   — Jeff G. ツ (talk) 04:59, 24 February 2012 (UTC)

No per flag collecting. Reaper Eternal (talk) 14:42, 24 February 2012 (UTC)
    • Seriously? I have been editing here for over five years and have over 50,000 anti-vandalism edits.[2]   — Jeff G. ツ (talk) 16:03, 24 February 2012 (UTC)
I'm not opposing based on how long you've been here and how many edits you have made. I took one look at your userpage, which gives me the strong impression that you are attempting to collect every badge under the sun on every possible wiki. My concerns are shared in opposes #3, #4, #5, #10, #11, and #13 in your RFA, all of which are by editors I greatly respect. The very fact that you mentioned all your user privileges here in this very request (including completely unrelated ones like 'autopatrolled', 'filemover', and 'reviewer') makes me even more concerned. Maybe I'm assuming too much bad faith (in which case other EFMs and editors will support you), but this is just too much for me to ignore. Reaper Eternal (talk) 16:46, 24 February 2012 (UTC)
  • I'm not sure if I'm allowed to "vote" on this but... Oppose: You can only fit so many hats on your head. Quality is better than quantity, I'd rather give someone who has 4,000 edits to articles making them all GAs than someone who has 500,000 edits using automated tools, writing on user talk pages, and requesting permissions. Frood! Ohai What did I break now? 23:37, 24 February 2012 (UTC)

Um, I personally don't care about the flag collecting. I don't know if that's what you're doing, or if you're the kind of person who gets really interested in one field of the 'pedia and then moves on to another, which is a totally good-faith way to accidentally collect lots of hats. Anyway, what I care more about is that you show some kind of competence so we know you're not going to blow up the filter system. Someguy1221 (talk) 08:49, 25 February 2012 (UTC)

Jeff G. is at the maximum "level" they are going to attain already, let's face it. Edit filter rights are neither needed nor warranted. Doc talk 09:04, 25 February 2012 (UTC)

Jeff G., you haven't explained why you want this userright, only that you feel entitled to it. Are there specific abuse filters you wish to write or change? Do you have a need to see the code of filters set to private? You can already see how the filters work by looking at the public ones. Or put another way, can you link us to your past successful requests for filters or filter changes (if you already know what you're doing, it's simpler to just give you the bit)? Franamax (talk) 12:59, 25 February 2012 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Ajraddatz

Hello, I'd like to request edit filter manager rights for the purposes of viewing filters only. I occasionally set up abusefilters on smaller wikis to prevent vandalism/spam, and I'd like to see what some of the enwiki filters do for that. Unfortunately, almost all of the filters that would actually be useful to me are marked as private, thus the need for the ability to view them. Like all permissions, I'll have this removed when I don't need it any more. Thanks, Ajraddatz (Talk) 17:23, 10 March 2012 (UTC)

 Done. 28bytes (talk) 20:16, 11 March 2012 (UTC)
Thanks. Ajraddatz (Talk) 20:22, 11 March 2012 (UTC)

Tomtomn00

I would like to be able to edit the filter, as I am making a filter myself (User:Tomtomn00/editfilter.js). I have over 2000 edits and I am very active on recent changes. ~ ⇒TomTomN00 @ 13:14, 11 March 2012 (UTC)

If I may comment on this, I'd say no due to obvious flag collecting. [3][4][5][6][7] Ajraddatz (Talk) 18:40, 11 March 2012 (UTC)
In line with Ajr's comment and from my own experience with Tom, I would counsel against granting this request. MBisanz talk 18:42, 11 March 2012 (UTC)
If you decline me I will think of reapplying later this year (4/5 months +)~ ⇒TomTomN00 @ 19:55, 11 March 2012 (UTC)
Also, which experience? ~ ⇒TomTomN00 @ 19:58, 11 March 2012 (UTC)
I had to oversight inappropriate content on your userpage. Your lack of understanding about why makes me strongly question your judgment. MBisanz talk 20:00, 11 March 2012 (UTC)
Sorry about that. The thing is, I cannot see any oversight actions, nor can any low-rights user. ~ ⇒TomTomN00 @ 20:02, 11 March 2012 (UTC)

 Not done, per Ajraddatz and MBisanz's comments above. 28bytes (talk) 20:19, 11 March 2012 (UTC)

EFM Flag question

Is there an EFM flag that allows viewing (but not editing) the filters? Salvidrim! 10:19, 8 March 2012 (UTC)

You can even now view most filters without any additional right. Wifione Message 11:05, 8 March 2012 (UTC)
I am well-aware. I was asking because in the course of anti-vandalism work I've come across a few "private" ones; I was considering applying for EFM but have no interest in editing said filters (nor do I have the required knowledge to, anyhow)... hence I was wondering if there was a "read-only flag". Salvidrim! 11:09, 8 March 2012 (UTC)
There have been editors in the past who have been granted view-only rights. In precise terms, the abusefilter flag was added to their user rights, but they were informed that this was only to view the filters rather than to edit them. Best. Wifione Message 12:54, 8 March 2012 (UTC)
I see. I'll take it into consideration. :) Salvidrim! 13:01, 8 March 2012 (UTC)

Questions about writing filters

Moved here from Wikipedia:Edit filter/Requested as suggested by Reaper Eternal. --Chriswaterguy talk 12:43, 15 March 2012 (UTC)

All my AbuseFilter work is on another wiki (Appropedia) - I don't have the privileges here, so I can't look at other filters to see how it's done. I wonder if someone could help me, or suggest another place to ask - I've tried on mw:Extension talk:AbuseFilter but it's awfully quiet there.

  • Question 1: How do I put a comment in a filter?
  • Question 2: What's the most efficient way to test if an expression's value falls within a range? E.g. 4 < SOME_LONG_EXPRESSION < 9.
  • Question 3: Can I assign a variable? (This would probably be the best solution to Question 2.)

Any help is much appreciated. I'm having a lot of success with this fantastic tool, but coding it has involved a lot of trial and error. --Chriswaterguy talk 18:28, 13 March 2012 (UTC)

Okay, I'll try to answer this (this should probably be moved to WT:FILTER, though):
A1: Put a comment in the description box below.
A2: (4 < EXPR) & (EXPR < 9)
A3: foo := "sfgdsgf"
Hope this helps. Reaper Eternal (talk) 19:46, 13 March 2012 (UTC)
Thanks for the help!
Re A1: I was hoping to have some inline comments, as the spam filter I've written is 35 lines and 7700 characters long. But if that's impossible, I'll just document very carefully in the description box.
Re A3: Thanks. I'm trying to set a numerical variable rather than a string, and I found that this variation of your answer works:
age_in_days := user_age/(24*3600); age_in_days < 100
Where I'm stumped is this: I'd like to run basic tests first (the low-server-load tests such as on user_age and user_editcount), THEN define a variable and do some things with that variable. A simplified but non-working example is:
user_age < 24*3600;
spamminess := "foo" in added_lines + "bar" in added_lines ;
spamminess >= 1
This evaluates the last part only, and discards the result of the user_age test. Is there a way I can make it evaluate the user_age test, then calculate the expression "spamminess", and then test the value of spamminess?
(Of course it would be easier to do this without the variable, but the actual case I'm working on is a quite long expression, about 7700 characters, and I want to minimize the load on the servers by only evaluating spamminess once, and only after testing user_age and other simple tests.) --Chriswaterguy talk 14:02, 15 March 2012 (UTC)
That's because the code should be different. Note the ampersand after the account age check:
(user_age < 24*3600) &
(spamminess := ("foo" in added_lines) + ("bar" in added_lines)); (spamminess >= 1)
It might be better if you would email one of us the code (if you don't want it publicly posted) so we could look at what is going wrong. Reaper Eternal (talk) 17:00, 15 March 2012 (UTC)
That's how I tried it first (ampersand instead of semi-colon) but it gave a syntax error. I just copied your suggestion and tried again -> "Syntax error detected: Unrecognised variable spamminess at character 22". If I delete the user_age test and start with spamminess, no error.
Happy to email you code at some stage - thanks. But I'm actually just trying to get these small examples working first, before I try to apply variables to my working spam filter. Thanks --Chriswaterguy talk 16:48, 16 March 2012 (UTC)
Sorry about that—I forgot the parentheses around the spamminess definition. Try now. Reaper Eternal (talk) 16:54, 16 March 2012 (UTC)
It works - fantastic! It'll probably be a week before I do much serious with it (due to work) but very happy to know that this works. I can also see a way to use it to make debugging quicker and tidier, too - by building the overall spamminess out of component variables.
Many thanks. --Chriswaterguy talk 19:47, 16 March 2012 (UTC)
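Note: the evaluation order asked for in this thread (cheap checks first, then a score that is computed once and reused) can be sketched outside the filter language. The Python below is only an illustration of that logic, not AbuseFilter code; the function name is_spammy and the sample keywords are hypothetical.

```python
DAY = 24 * 3600  # seconds in a day, as in the user_age test above


def is_spammy(user_age, added_lines, keywords=("foo", "bar")):
    """Run the cheap account-age test first; score the text only if it passes."""
    if not user_age < DAY:
        # Short-circuit: the (potentially expensive) scoring never runs.
        return False
    # Computed once and stored, playing the role of `spamminess := ...`.
    spamminess = sum(1 for word in keywords if word in added_lines)
    return spamminess >= 1
```

Building the total out of named component scores in the same way would also make it easier to see which part of a long expression is contributing to a hit.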

Debugging tools?

I have a spam filter (that I use on Appropedia) based on a scoring system - if it's above 10, it's blocked. Sometimes I get a weird result, and it takes ages to debug it. Just working out the score takes a while (e.g. " >=10" matches, ">=20" matches, ">=30" fails, ">=25" matches...) Then I take out chunks of the scoring code, and see what the new score is (in the same stab-in-the-dark way) to eventually narrow down where the high score is coming from.

Surely there's a better way...? Thanks! --Chriswaterguy talk 19:32, 16 March 2012 (UTC)

This edit was tagged by Filter 231, despite not being "nonsense characters". Salvidrim! 20:33, 18 March 2012 (UTC)

The "fffffffffffffffffffffffffffffffffffffffuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuucccccccccccccccccccckkkkkkkkkkkkkkkkkkkkkk yyyyyyyyyyyyyyyyyyyoooooooooouuuuuuuuuuuuuuuuuuuuuuuuuuuuuuu" triggered the filter. This is not a false positive. Reaper Eternal (talk) 14:09, 19 March 2012 (UTC)
Hm. I wonder if there wouldn't be a way to detect the profanity (this is obviously "fuck you" but with repeated letters) rather than tag it "nonsense". Salvidrim! 00:49, 20 March 2012 (UTC)

On this update, shouldn't all \1{6} have been changed to \1{7}? The first part is still detecting only 6 characters in the current version of the filter. Helder 20:05, 20 March 2012 (UTC) PS: this was copied from here.
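For readers unfamiliar with the backreference syntax in the question above: a pattern of the form (.)\1{6} matches one character followed by six more copies of it, i.e. a run of seven identical characters, and \1{7} raises that to a run of eight. The Python check below is a sketch of that counting only; the actual text of the filter is not reproduced here, and both patterns are assumed for illustration.

```python
import re

# One captured character plus N repeats of it: a run of N + 1 identical characters.
RUN_OF_SEVEN = re.compile(r"(.)\1{6}")
RUN_OF_EIGHT = re.compile(r"(.)\1{7}")


def has_run(pattern, text):
    """Return True if the text contains a run long enough for the pattern."""
    return pattern.search(text) is not None
```

With these definitions, a string of seven identical letters satisfies the first pattern but not the second, which is why leaving one \1{6} unchanged keeps the lower threshold for that part of the filter.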

DFA

Resolved
 – No. Reaper Eternal (talk) 13:52, 21 March 2012 (UTC)

Hi, I'd like to have this because I am getting more experienced on here, and I want to do more things on here. I would like to have more permissions to edits, and help fix filters. Like all permissions, I'll have this removed when I don't need it any more. DreamFieldArtsTalk 13:50, 21 March 2012 (UTC)

Non-sysop access

In the context of a discussion of privacy issues related to the edit filter, one item that came up was listing non-sysops with editfiltermanager set. I'm not sure if such a list already exists, so I ran an API listing. 146 users are in the access group "abusefilter" and the following are not also in the "sysop" group (+ link to archived permission request):

Charitwo (granted by Slakr 23 Jun 2011)
Chzz (16 Oct 2009)
EdoDodo (19 Sep 2010)
Fran Rogers (ex-admin, self-granted 20 Oct 2009)
Mayur (8 Dec 2011)
Netalarm (21 Aug 2010)
Pek the Penguin (self-granted by Optimist on the run to non-secure alt account)
Petrb (granted by PeterSymonds 15 Oct 2011)
Prodebot (Prodego temp experiment)
Sole Soul (15 Sep 2010)
Tim1357 (26 Apr 2010)
UncleDouggie (?? Oct 2010)
Vito Genovese (17 Jan 2010)

Perhaps occasional review would be a good idea, to confirm that these users still need this access level? Franamax (talk) 21:40, 19 January 2012 (UTC)

If the editors are active, then I should suggest there's no need to reconfirm. If the editor has undertaken issues that may be viewed negatively and someone brings them up, we should consider them. If the editors have gone inactive, then we could use the inactive-admin criteria and remove the right - at the same time leave a congenial message on their talk page that they can have it back whenever they might wish to become active. Do you think this sounds acceptable Franamax? Wifione Message 03:53, 20 January 2012 (UTC)
Seems reasonable enough - though I would call it tighter than the criteria for inactive admins (who were previously vetted by the entire community). Two that I noticed were Charitwo and Vito Genovese, who were using the access to set up filters on other projects and may not need the private access anymore. Would there be value in maintaining this list as a separate section on this page, in case the same question arises in future? Franamax (talk) 19:05, 20 January 2012 (UTC)
I don't know if it's correct to maintain a separate section containing this list. For those to whom we've given the right on a temporary basis (for example, Vito and Charitwo, who might not need it anymore), we could simply remove the right while informing them. For the others, I believe we consider them extremely trustworthy. Therefore, these editors might feel slighted by the move. Just my views. Wifione Message 16:21, 24 January 2012 (UTC)
I've removed the permissions from Pek the Penguin (my alternative account) as I don't need them there.  An optimist on the run! 17:18, 26 January 2012 (UTC)

Just a side note, for those who might not have kept track. Wifione Message 07:57, 28 January 2012 (UTC)

Well my opinion of whether these editors should keep EFM is as follows:
Just my thoughts. Reaper Eternal (talk) 14:25, 24 February 2012 (UTC)
I'm inclined to agree. Charitwo was granted it to export private filters to Wikia; we should ask him if he still needs it and remove it if he doesn't reply in the affirmative. Chzz is apparently retired and has never modified a filter. EdoDodo is inactive and hasn't modified a filter since November 2010. We should ask Mayur if he still needs it. HJ Mitchell | Penny for your thoughts? 17:36, 24 February 2012 (UTC)
I totally agree as well. If we admit that an inactive admin account is dangerous, then so is an inactive filter editor's. Someguy1221 (talk) 08:51, 25 February 2012 (UTC)
Charitwo is no longer a member of the VSTF on Wikia and is currently globally blocked there, so I doubt he still needs it. Mayur's access was just removed per a cross-wiki investigation of abuse. Ajraddatz (Talk) 00:37, 10 March 2012 (UTC)
These dispositions look good to me. It's easy enough to regrant if the editors need it again, but safer to remove the bit while they don't. Franamax (talk) 13:17, 25 February 2012 (UTC)
  • I've left a note on Mayur's page. I'll also leave a note on Chzz' page given the recentness of his absence. Other than that, agree with Reaper's listing. The only issue that I see here is as follows... When long absent accounts become active again, should we have some check to guard against compromised accounts before making abusefilter-private/modify-restricted=true? Wifione Message 14:53, 25 February 2012 (UTC)
(Chzz' replied by email in the affirmative; he wishes to retain the flag. Wifione Message 15:13, 25 February 2012 (UTC))
Hi, I have been less active these days, but this access would be useful for me. I mainly use it for my local wiki abuse filter configuration. Regards--Mayur (talkEmail) 05:54, 26 February 2012 (UTC)
Removed abusefilter right from UncleDouggie, Vito Genovese, EdoDodo and Charitwo with no prejudice to future reinstatement. Wifione Message 11:09, 8 March 2012 (UTC)
Removed from Mayur per ongoing discussion about him, his user rights and abuse of the edit filter at m:Requests for comment/Userrights on hi.wiki. Please do not return without reading the discussion on meta and preferably discussion on enwiki too. The Helpful One 00:34, 10 March 2012 (UTC)
I strongly agree with this action, and have amended my proposition accordingly. Reaper Eternal (talk) 14:08, 21 March 2012 (UTC)

Edit request

I am requesting that WP:FLTR be added to the shortcuts. I know this page is not fully protected, however, I am not at the privilege level to make an edit to this page. Thanks. 75.53.218.81 (talk) 04:03, 21 March 2012 (UTC)

 Done Jasper Deng (talk) 04:05, 21 March 2012 (UTC)

Speedy deletion template filter

Does the filter that catches removed speedy deletion templates cover:

  • placing the tag in <!--invisible comments-->
  • changing the template to {{db-u1}} or {{db-g7}} and then removing that template
  • inserting random characters between the braces (something like {-{db-g13}-})
  • removing braces ({db-g13} or {{db-g13} rather than {{db-g13}})
  • wrapping the tag in any tag that does not properly display templates (such as <source> or <nowiki> or <pre>)

If not, could the filter be made to catch them? →Στc. 00:34, 25 March 2012 (UTC)

I'll reply line-by-line:
  • No. There is no point in wasting the server CPU cycles on that, since almost no newbies will know that is even possible.
  • No. Very unlikely to occur.
  • Yes.
  • No. As per #1.
Hope this helps. Reaper Eternal (talk) 12:39, 12 April 2012 (UTC)
Does the "Yes" refer to both the third and fourth bullet points? Salvidrim! 20:22, 12 April 2012 (UTC)

Filter 39 (School libel and vandalism)

Special:AbuseFilter/39 (School libel and vandalism) is currently marked as private. It was public when I started it, then marked private, then I made it public again only for it to be marked private. I think there is no reason to have this filter private, but I'd like to get consensus about it. IMO it's a useful resource for recent changes patrol. The log is currently hidden to non-EFMs, which I think is a hindrance. I don't know if that's a temporary thing, but there are other reasons. The filter has no reason to be private. The intention of the filter is not to catch determined vandals, and I'm certain virtually every one of them has never heard of an edit filter. I'll bet most haven't even heard of Wikipedia. Vandalism really belongs in a different filter. Having the filter public allows other to offer suggestions as well as look at the log. It offers transparency as well as utility with no downside that I can see. Furthermore Special:AbuseFilter/189 which is almost identical in design and purpose is public. Your opinions, thanks. -- zzuuzz (talk) 09:05, 11 April 2012 (UTC)

I don't think it needs to be private. It's tag-only, and I agree that these aren't exactly career vandals who are getting caught. Someguy1221 (talk) 09:45, 11 April 2012 (UTC)
I agree, it should be public. Most of the antivandalism filters should be public since I seriously doubt they are out looking at the abusefilter solely to find out what can get through. Reaper Eternal (talk) 14:44, 11 April 2012 (UTC)
Even though I cannot see that filter, I agree on the comments made above. ~ ⇒TomTomN00 @ 15:29, 11 April 2012 (UTC)
I'm the one who marked it private - I wasn't aware it had previously been an issue. The reasoning looks fine to me, so I've made it public again. Feezo (send a signal | watch the sky) 01:17, 12 April 2012 (UTC)

Abuse Filters for the Article Feedback Tool

Hey all :). As part of developing Version 5 of the Article Feedback Tool, which includes text-based comments, we're drawing up a list of proposed abuse filters to apply to and integrate it. I would be very grateful if you could take part in the discussion; help with the implementation of proposed filters, propose new ones, and point out any issues you can see. Thanks! Okeyes (WMF) (talk) 18:59, 12 April 2012 (UTC)

Something wrong with filter 50

AFTAB HUSSAIN (PAKISTAN) was written in all capital letters (later deleted) but this did not trigger filter 50. The user who created it was not autoconfirmed. jfd34 (talk) 11:09, 5 May 2012 (UTC)

You've pointed out an important point: the user wasn't autoconfirmed yet, right? The filter syntax says !"autoconfirmed" in user_groups, which means that it triggers for autoconfirmed users. Dipankan (Have a chat?) 10:57, 8 May 2012 (UTC)
Sorry, and correct me if I'm wrong, but wouldn't !"autoconfirmed" in user_groups mean it triggers for non-autoconfirmed users? I did get the chance to have someone confirm the user who created the page wasn't autoconfirmed at the time of creation, so that condition should have been tripped, something else must not have matched.  JoeGazz84  ♦  23:37, 8 May 2012 (UTC)
You are correct, JoeGazz84. Dipankan001, the "!" before the expression is the not operator, which inverts the truth value of the expression. Reaper Eternal (talk) 01:47, 9 May 2012 (UTC)
Seeing as the page is deleted, unless an admin would like to look over the page or provide the content, I don't think we can do much more here in determining why it didn't trip the filter.  JoeGazz84  ♦  20:19, 13 May 2012 (UTC)
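To make the negation discussed above concrete, here is a minimal Python sketch of the same test. It is not filter code; the function name and the group lists used with it are made-up sample values.

```python
def condition_trips(user_groups):
    """Mirror !("autoconfirmed" in user_groups): true only when the group is absent."""
    return not ("autoconfirmed" in user_groups)
```

Called with a group list that lacks "autoconfirmed" it returns True, and with a list that includes it it returns False, so this clause alone would not have excluded a non-autoconfirmed page creator.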

WikiLove

I mean, really? The abuse (edit) filter is for abuse, not for nice, WikiLove giving. --Tomtomn00 (talkcontributions) 20:24, 10 May 2012 (UTC)

That's how the WikiLove designers wanted to track its use. Not really something the EFMs have control over. 28bytes (talk) 21:43, 10 May 2012 (UTC)
Yeah, I disabled it once already, but that just pissed them off. Reaper Eternal (talk) 00:40, 11 May 2012 (UTC)
Heh, it 'pisses' me off. See this... --Tomtomn00 (talkcontributions) 07:31, 11 May 2012 (UTC)
I'm not quite sure what there is to discuss here... It seems like it's going to stay enabled because the developers of WikiLove are using it to track the extension usage. Having an edit filter log isn't bad because people can see that you did good things and it was just tracking, and not all filters have to be for abuse (hence the name edit filter, not abuse filter).  JoeGazz84  ♦  20:12, 11 May 2012 (UTC)

Open proxy filter

Open proxies re-code some characters passing through them, and I thought this can be used in a log-only edit filter like

contains_any(string(added_lines),"%2", "incloak.com")

%2 is rather common for open proxies [8] [9] whereas incloak.com is one of the (many?) sites used to recode html addresses [10]. Suggestions? Other strings? Materialscientist (talk) 06:14, 10 May 2012 (UTC)

I like the idea. How often do you see these edits? Someguy1221 (talk) 06:15, 10 May 2012 (UTC)
The frequency is very low overall and is something like 20% for open proxies (wild guess) for a simple reason - few users use open proxies, and many edits from open proxies are simple reverts or addition of plain text. For me, the addition of %2 is always a worrying sign, but I don't know if some valid processes (which I don't monitor, like bots) use %2 and thus it should be excluded or expanded into a series like %25, %28, %2E, etc. Materialscientist (talk) 06:27, 10 May 2012 (UTC)
Looks like Web proxies to me, and yes, I agree that the edit filter would be useful to catch them more quickly. Cheers, — madman 17:54, 10 May 2012 (UTC) And if you want to restrict further than %2, I'd use %20 and %2E, which is how spaces and periods, respectively, are encoded. — madman 18:09, 10 May 2012 (UTC)
Yes, those links are verified and blocked proxies. Materialscientist (talk) 22:37, 10 May 2012 (UTC)
It's not uncommon to see valid URLs containing %20s (often the result of a messy upload of a PDF or similar file) - would this filter catch those? Andrew Gray (talk) 19:35, 10 May 2012 (UTC)
Andrew: I do anticipate false positives, thus log-only filter. Can you give example of FP and an idea of how frequent they are, so that we could think about setting an exclusion (by article type, etc.). Materialscientist (talk) 22:37, 10 May 2012 (UTC)
It's hard to think of specific cases, but I do know I've run across them occasionally. Usually it's the result of someone uploading a document file of some form to personal webspace without it being configured to strip out spaces (and the user not realising it's a bad idea) - admittedly, in most cases this doesn't imply it'll be a great source, but there can still be valid reasons for linking to the page (especially outside the mainspace). Andrew Gray (talk) 22:06, 11 May 2012 (UTC)

I see what I want but need help with coding: the task is to filter (log-only) addition of the exact text string

%2 

into one of the

ref name= 

operators on any page [11] [12] - this is what proxies do (provided the ref name operator contains a space, full stop, comma or similar symbol). Note that ref name=XXX can have slightly different syntax (with/without spaces and with/without quotes or slash, like <ref name=cia> or <ref name = "cia" />), and the targeted %2 symbol can be anywhere in the XXX string. The choice of %2 is because proxies use different coding: say, a full stop may be coded as %252E or %2E. Materialscientist (talk) 06:43, 12 May 2012 (UTC)

If it's any help I previously considered Filter 244. Its 3 false positives will give you some idea of the task at hand. I think you need a filter which looks for specific character replacements, rather than added_ and removed_ lines. Anything like a URL replacement, or encoding characters like " ' { } < > and the comma, especially near a URL. -- zzuuzz (talk) 17:00, 12 May 2012 (UTC)
This only supports the point by Andrew that we'll get too many FPs from good-faith miscoded urls - this is why I changed my original proposal to targeting those symbols specifically in the ref name= fields. I believe this will greatly increase the signal/noise ratio, but I don't see how to code it in an elegant manner. Materialscientist (talk) 04:51, 13 May 2012 (UTC)
Indeed, one replacement character at a time - it is not so efficient to check for multiple variations, and that's why I never finished that filter. Something like added_lines rlike "ref name\s?=.{0,15}%[23]" should work I'd have thought. It checks for an encoded character early - probably too early - in the URL or reference. -- zzuuzz (talk) 16:40, 20 May 2012 (UTC)
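The pattern suggested above can be tried against a few sample strings before it goes into a filter. The Python sketch below copies the pattern from the comment; the sample reference tags are invented for illustration, and Python's re module is assumed to behave like the filter's regex engine for a pattern this simple.

```python
import re

# Pattern copied from the comment above.
PATTERN = re.compile(r"ref name\s?=.{0,15}%[23]")


def hits(added_lines):
    """Return True if the added text would match the suggested pattern."""
    return PATTERN.search(added_lines) is not None


# Invented sample inputs, paired with whether the pattern matches them.
SAMPLES = {
    '<ref name="CIA%2Efactbook">': True,   # encoded full stop inside the name
    '<ref name="a%252E">': True,           # double-encoded variant
    '<ref name = "cia" />': False,         # clean reference, spaced syntax
    '<ref name=cia>': False,               # clean reference, bare syntax
    '<ref name=a>http://b.c/%20': True,    # match spills past the name into a URL
}
```

The last sample shows the "probably too early" concern: because .{0,15} is not limited to the name itself, a short ref name followed closely by an encoded URL also matches, so some false positives from ordinary %20 links remain possible.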

Filtering the log by multiple filter ID's

If I enter a filter ID into the relevant field of Special:AbuseLog, e.g. 135, it works and shows only hits to that filter. But I can't find a way to show hits to a number of filters. I tried commas (e.g. "135,432"), ampersands, "OR", pipes... but nothing works.

This would be extremely useful if possible. Even a url hack would be great. Anyone know how? --Chriswaterguy talk 06:31, 20 May 2012 (UTC)

I think it's impossible. Reaper Eternal (talk) 01:25, 21 May 2012 (UTC)

Bot for archiving false positives

I'm considering filing a BRFA so I can make a bot that automatically archives the false positive page and sorts it by whether the reports have or have not been actioned. Currently, I'm doing this manually. Any thoughts? Reaper Eternal (talk) 01:54, 23 May 2012 (UTC)

NOINDEX EF/R

Is there any reason why EF/R is _not_ NOINDEXed? I can think of a lot of good reasons why we should NOINDEX it, but can't think of any reasons why it shouldn't be. 64.40.54.240 (talk) 09:12, 23 May 2012 (UTC)

I've BOLDly added NOINDEX to EF/R. Everybody is free to revert if they think it was a mistake, 64.40.57.65 (talk) 18:43, 26 May 2012 (UTC)
Looks fine to me. Reaper Eternal (talk) 19:57, 27 May 2012 (UTC)

Scottywong

I've noticed that the WP:EF/R page is currently being responded to mostly by one or two users, looks like it could use some help. I am very comfortable with regex (almost all of my toolserver tools use it pretty extensively), and would be willing to help out. I'll admit I'm new to the AbuseFilter extension syntax, but I've been reading up on it and it seems pretty straightforward. It would likely take me a little while to ramp up and become comfortable with actually committing new filters, but it looks like it is easy enough to test out new filters before committing them. Thanks. -Scottywong| speak _ 16:49, 7 June 2012 (UTC)

 Done. 28bytes (talk) 17:01, 7 June 2012 (UTC)
Oh. That was easy. Thanks. -Scottywong| babble _ 17:20, 7 June 2012 (UTC)

Question on added_lines

Need help with added_lines. A simplified example: contains_any(added_lines, "%25") returns 1 when applied to this only because some text was changed around the %25 string, whereas the purpose is to catch addition of %25 string (Note: my actual operator is added_lines rlike "ref name\s?=.{0,15}%[23]", but the problem is same; I use [13] against 112.204.31.194). Materialscientist (talk) 08:13, 26 May 2012 (UTC)

Try this:
(added_lines contains "%25") &
!(removed_lines contains "%25")
Reaper Eternal (talk) 19:56, 27 May 2012 (UTC)
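The logic of that suggestion, sketched in Python for anyone who wants to reason about edge cases outside the filter (the function name and the sample strings used with it are hypothetical):

```python
def newly_added(needle, added_lines, removed_lines):
    """True only if the needle appears in the added text and not in the removed text."""
    return needle in added_lines and needle not in removed_lines
```

One limitation follows directly from the logic: an edit that removes one line containing %25 and adds a different one elsewhere is treated the same as an edit that merely rewords a line already containing it, so such an edit would not be caught.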

Hidden filters in logs

Correct me if I'm wrong, but didn't the logs used to show you 'which' (i.e., the number) edit filter was tripped, even if it was a hidden filter? I noticed my bot tripped some, and it only shows the filter numbers for visible filters? Am I misremembering this, or was there a change? Avicennasis @ 10:13, 12 Sivan 5772 / 10:13, 2 June 2012 (UTC)

It changed a few months ago, so you're not misremembering. -- zzuuzz (talk) 07:43, 18 June 2012 (UTC)

Addition for EF 271

A spambot is getting past Special:AbuseFilter/271, e.g. http://en.wikipedia.org/w/index.php?title=Board_of_Admiralty&diff=prev&oldid=497983981, so can someone add "lingerie" to the list? Graeme Bartlett (talk) 21:03, 17 June 2012 (UTC)

I've made two adjustments which should help. -- zzuuzz (talk) 07:42, 18 June 2012 (UTC)

Vito Genovese

As per my original request, I'd like to be reinstated. I am active again following a period of absence due to family matters, and I need the rights right now in order to work on our filters. Please note that my original pledge remains.

Vito Genovese 16:34, 29 June 2012 (UTC)

As you were previously granted the right and it was removed only due to inactivity ("without prejudice to future re-adding"[14]),  Done. 28bytes (talk) 18:34, 29 June 2012 (UTC)
Thank you, 28bytes.
Vito Genovese 18:53, 29 June 2012 (UTC)

Annoying false positive notices from edit filter (from Wikipedia:Village pump (technical))

Hello.

I have had, on several occasions, the edit filter warn me (inaccurately) that I am adding a protection template to an unprotected page when in fact the edit filter is triggering a false message. I have noticed that it is falsely notifying me of doing this when I attempt to edit move-protected pages which have the move protection template. It detects that I am saving the page with the protection template (which already exists), and that the page is not protected in a traditional manner, such as semi-protection, which it is evidently programmed to spot. In that case, this filter needs to be adjusted to recognize pages that are move-protected as well. Thank you. 70.248.186.239 (talk) 02:04, 21 June 2012 (UTC), copied here 03:51, 21 June 2012 (UTC)

Mahitgar

  • request for EFM access

Hi,

I am a long-time editor (contribs on en wiki) and also a crat on Marathi language wikipedia. On Marathi language wikipedia and a few other wikis I mainly work on newbie support, building help pages, and wikipedia values awareness campaigns. Presently my area of concentration is AbuseFilter development and related help pages on Marathi language wikipedia, and we are now in need of already developed filters, including the private ones you guys have over here.

I have already studied and used some of the publicly available AFM used on en wikipedia; besides, I used filter info publicly available on de and fr wikipedia. I need to access even the private filters for comparative study, to be able to export them if found useful, and to file enhancement bugs for features currently not available (my little participation on bugzilla). As stated earlier, I am an admin and crat on Marathi language Wikipedia and an admin on two wiktionaries, and I have been a Wikimedian for over 8 years, so I can be trusted.

I hereby pledge that I will not make a single EFM edit (feel free to revoke the right if you see me editing them), and if I decide to work on them after gaining some experience, I will ask for an additional permission here. I'd normally go for a temporary access, but the filters are constantly developed, and I anticipate that I'll have to check them regularly in order for us to be in synch.

Thanks and Regards

Mahitgar (talk) 02:02, 8 July 2012 (UTC)

Filter 479

Filter 479 is currently set to block users from adding the example image to an article. I vaguely recall this being one of the earliest filters created, and that it was disbanded for some reason I don't remember. That early filter caught not only the example image but basically anything that could be added by hitting a button on the edit toolbar, that had no place in an article - '''Bold text''', ''Italic text'', etc. So my question to the other edit filter managers is, is this something we now want? Should Filter 479 start catching all such mistakes, or only specific ones? Should it be set to warn, block, both, tag, nothing? Thanks. Someguy1221 (talk) 21:12, 27 June 2012 (UTC)

FYI we've got bots that clean up this stuff (e.g. mine.) The original "stray mouse clicks" edit filter was 18. 28bytes (talk) 21:52, 27 June 2012 (UTC)
How often does your bot clean it up? I manually cleaned up several dozen instances of Example.jpg in article space after I created that filter. Is there a way to set the filter to disable, but still give the user a detailed explanation for why the edit was blocked so that they can easily fix it? (Kinda like a combination warn/disable?) -Scottywong| soliloquize _ 23:31, 28 June 2012 (UTC)
You just check the boxes prohibiting the action as well as "Trigger these actions after giving the user a warning". It will warn first, and block a subsequent attempt. But I was just thinking about your bot, 28, and I'll give you a message on your talk page. Someguy1221 (talk) 23:37, 28 June 2012 (UTC)
28bot has a few rules in place that keep it from being too aggressive with the stray mouse clicks. Specifically, if someone adds "Example.jpg" and some text, it won't revert the addition (but will log it), but if they add two "Example.jpg"s, it will revert. So you'll see some articles it intentionally leaves the Example.jpg in to avoid the chance of a false positive. 28bytes (talk) 00:13, 29 June 2012 (UTC)
Ok, I've updated the filter with a warning that should explain clearly what's going on. Keep in mind that the filter only covers article space additions by non-autoconfirmed editors. So, the bot may still be necessary if a more experienced editor makes a stray mouse click. I can't imagine too many scenarios where adding [[File:Example.jpg]] to an article would be intentional (i.e. where it would be a false positive to revert it). -Scottywong| verbalize _ 00:34, 29 June 2012 (UTC)

Gareth Griffith-Jones

I should like to be judged suitable to be granted permission. -- Gareth Griffith-Jones (talk) 18:28, 4 July 2012 (UTC)

Hi Gareth. Permissions for what? Wifione Message 15:48, 7 July 2012 (UTC)
Hi Wifione. I'm terribly sorry ... for my having access to the Edit filter logs.
During a fairly long and instructive 'conversation' on my Talk page last Wednesday with Mcewan, here, he brought the subject into our discussion, and as you will read, gave me a link to this article. Kind regards, -- Gareth Griffith-Jones (talk) 16:54, 7 July 2012 (UTC)
Thanks for the reply. You generally don't need any permission for having access to edit filter logs, which is a permission given to all users. Or are you saying that you are not able to view the contents at Special:AbuseLog? Wifione Message 17:36, 7 July 2012 (UTC)
Well your link to AbuseLog worked okay, and I could open any there that I clicked on. I am confused, though. What is this section, which seems permanent, actually serving? The other user here was asking for some sort of permission to be reinstated. -- Gareth Griffith-Jones (talk) 23:09, 7 July 2012 (UTC)
You need permission to modify the edit filters and to view the details of private ones. Someguy1221 (talk) 00:58, 8 July 2012 (UTC)
Thank you. I cannot imagine I would aspire to do the former. Private being what? -- Gareth Griffith-Jones (talk) 01:17, 8 July 2012 (UTC)
Private meaning that only admins and editors with edit filter permissions can see how the filter works. Some filters, especially those designed for catching specific trolls, are set to private so the trolls/vandals can't see how they are being caught. Someguy1221 (talk) 03:16, 8 July 2012 (UTC)
"Gotcha!" Many thanks for that. -- Gareth Griffith-Jones (talk) 10:21, 8 July 2012 (UTC)

Section Blanking Filter

This filter is tripped whenever someone removes a level 2 header. Often, there are legitimate reasons for doing so. I propose that this filter should be configured to detect if there was an edit summary or not explaining why the level 2 header was removed. Electric Catfish 14:28, 22 July 2012 (UTC)

  • Probably difficult to do, but I agree with this because a user will end up tripping a lot of these filters if they happen to be addressing legitimate sections which are empty. Those identified by the empty section tags are oftentimes completely useless sections from a copy and pasted template which never had any text to begin with. Some templates are good as for the 'XXXX Year' (e.g. 1923 in radio) types which should list certain events, but others really are useless like a 'See Also' section being empty or a 'Recent History' section which is already covered in its history. Since 'recent history' changes I also tend to dislike this section heading. On the other hand, perhaps someone can monitor those changes easily from the filter. I know I tripped a lot of filters for editing before I became auto-confirmed. ChrisGualtieri (talk) 15:44, 22 July 2012 (UTC)

Edit Filter 458

Would someone mind unchecking (Article Feedback) Auto-flag as abuse on filter 458. As discussed at Wikipedia_talk:Article_Feedback_Tool/Version_5#Lots_of_stuff_is_being_incorrectly_automatically_marked_as_abuse and Wikipedia_talk:Article_Feedback_Tool/Version_5#Automated_flagging_as_abuse the filter is generating too many false positives and at the moment we don't have a way to unflag something as abuse once flagged by the filter. Monty845 13:16, 19 July 2012 (UTC)

Also, I'm not an expert of regex, but would it be possible to tweak Special:AbuseFilter/473 to not trigger on repeating periods but leave in other characters? For some reason people like to leave a lot of ....s in their feedback. Monty845 13:26, 19 July 2012 (UTC)
 Done Oliver Keyes, disabled 458 on July 21, 2012. It took me a while to figure out 473, but it works by listing the characters that are allowed to repeat, so I have added the "." as allowed, so repetitions of 7 dots will not trigger abuse. Graeme Bartlett (talk) 21:49, 24 July 2012 (UTC)

Filter 441

Can someone check this filter against this log? I can't see why it disallowed it. Black Kite (talk) 20:01, 21 July 2012 (UTC)

On the face of it, the filter was rather silly. I made it so it only checks against the specific range of this troll, instead of prohibiting all IPs from using certain words. Someguy1221 (talk) 06:38, 24 July 2012 (UTC)

Wikilove/ Barnstar Filter

What's the purpose of this filter? Electric Catfish 14:26, 22 July 2012 (UTC)

The Wikilove people wanted to track the feature's usage. Someguy1221 (talk) 06:34, 24 July 2012 (UTC)
Yes, but isn't there another way to track it? When I patrol the abuse log, I'm trying to find vandals and report them and revert their vandalism. I'm not looking to check out what Wikilove was sent. Is there a way that we can keep track of it without the entries appearing in the abuse log? Thanks, Electric Catfish 11:42, 24 July 2012 (UTC).

Eptalon

Hello all, I have started to develop an edit filter with the aim of detecting edits made by spambots, which can be detected by certain "keywords" in either the edit summary or the text added to the page edited. I am currently active on SimpleWP, where I hold the positions of administrator, checkuser, bureaucrat, and oversighter. The problem I have is that SimpleWP is a wiki with a relatively low traffic volume, which makes testing edit filters more time intensive. I would therefore like to ask you as a community to grant me the privilege to see and change editfilters on this wikipedia, so that the editfilter can be tested and finetuned. The arrangement would be temporary until the filter works sufficiently well, that it can be used on other wikis. The filter in question is filter 30 on SEWP, which is private, for obvious reasons. Thanks for the consideration. --Eptalon (talk) 21:49, 24 July 2012 (UTC)

Hi, here on en.wikipedia there is a spambot filter, Special:AbuseFilter/271, whose triggerers I regularly block. It seems to be highly effective, though it has about a 10% false positive rate; since it only warns, a real person can click to continue. Spambots have not yet developed the smarts to work out how to save the spam. You can contact me by email at Special:EmailUser/Graeme_Bartlett to make your proposal, and I can reveal the 271 code to you. I thought it was public, but now that I check, it is private. The only person I have given edit filter access to is myself, so I am not yet confident enough to give it to anyone else. Graeme Bartlett (talk) 10:42, 25 July 2012 (UTC)

Edit Filter Manager

Hi! I frequently patrol the abuse log and report people who repeatedly trip the filters to AIV. I would like to help you guys out with modifying the filters and fixing errors. However, what are the requirements to getting that user right? Thanks, Electriccatfish2 (talk) 11:41, 16 July 2012 (UTC).

Also, if I qualify for it, then I'd love to help out with the abuse filters. Electriccatfish2 (talk) 18:42, 20 July 2012 (UTC)
You can already help out — quite a number of filters are public, so you can suggest changes to them right on this page. This will demonstrate your knowledge of the filter syntax, which is critically important when editing filters directly. Feezo (send a signal | watch the sky) 20:43, 20 July 2012 (UTC)
Thanks! Electriccatfish2 (talk) 21:00, 20 July 2012 (UTC)
I'm going to help out at false positives for the time being and I'll request it in a month. Electric Catfish 00:23, 26 July 2012 (UTC)

Jasper Deng

I'm an active contributor on the false positives reports, but I can do less these days because I cannot view the abuse log entries for private filters. I have knowledge of regex and the abuse filter syntax.--Jasper Deng (talk) 17:14, 24 July 2012 (UTC)

That page needs all the help it can get, and you look like you'd do a good job, so I see no reason to oppose granting you access. Soap 17:17, 4 August 2012 (UTC)
 Done Unless somebody brings out a critical opposition in the future - in which case the flag could be revoked - I see no problems in assigning the flag right now for viewing purposes. Please take care. Best. Wifione Message 18:40, 4 August 2012 (UTC)

Filter 247 (adding email addresses)

Why didn't this edit get tagged with the adding email address filter? It should be enabled in talk namespaces as well, especially for anonymous users who cannot configure an email address on their account. jfd34 (talk) 06:41, 4 August 2012 (UTC)

Extraneous formatting (Filter #345)

This filter should also check the following code:

<span onmouseover="_tipon(this)" onmouseout="_tipoff()"><span class="google-src-text"

This code is sometimes automatically added by some browsers when viewing a Google-translated version of a page and then clicking the edit link there. IE 8 adds this code; I do not know whether any other browsers have this problem. jfd34 (talk) 10:58, 8 August 2012 (UTC)

Pre-2012 disabled filters all marked as deleted

To reduce clutter on the filter list, I've marked as deleted any filter that: 1) is currently disabled; 2) was last edited in 2011 or earlier; and 3) is not marked as a test filter. Any objections, just revert it. -- King of ♠ 23:19, 8 August 2012 (UTC)

Hoo man

As a user, who is already experienced with abuse filters from several wikis and who is active in global and local vandalism fighting (which I sadly didn't have much time for in the last months), it would be a good thing for me to be able to edit abuse filters and to be able to see hidden ones (to be able to export them and to track down false-positives), over here. Furthermore, there are other, non-vandalism, use cases, like Special:AbuseFilter/485 which I come across, as I'm involved into technical matters and it would be great, if I could work on that myself. - Hoo man (talk) 01:37, 18 August 2012 (UTC)

I know Hoo man very well from his anti-vandalism work cross-wiki as part of his role as a global sysop. In addition to my trust in him, his technical knowledge of Javascript and Regex is sound, as such I am more than happy to strongly support this request for EFM. We need more technical people able to do this work, whilst he may not always be active on all wikis (as I'm sure you can imagine, cross-wiki means there are a lot of wikis to do work on!), Hoo is willing and able to do it, so why not let him? The Helpful One 02:08, 18 August 2012 (UTC)
 Done. 28bytes (talk) 02:51, 18 August 2012 (UTC)
Was just about to flip it myself. Would have supported. Reaper Eternal (talk) 03:09, 18 August 2012 (UTC)

That was a fast one, thanks. Hoo man (talk) 13:39, 18 August 2012 (UTC)

Other discussion

Filter 439

I can't see any reason why the latest exception on this filter was tripping it, seems an odd article to be included in that filter. Black Kite (talk) 23:06, 9 August 2012 (UTC)

Just wondering, why are articleids used instead of article titles? Is it performance? Also, how do you lookup the article from the articleid? -- King of ♠ 00:42, 10 August 2012 (UTC)
View source and then find "articleid", if I'm not mistaken. I don't know if that's any faster than screening by title. Someguy1221 (talk) 01:15, 10 August 2012 (UTC)
Yes, I know that's how you get an articleid from an article. But the reverse? -- King of ♠ 21:27, 10 August 2012 (UTC)
By using curid (example). — AlexSm 21:39, 10 August 2012 (UTC)
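As a small illustration of the curid lookup mentioned above, the following Python sketch builds a link from a page ID. The helper name and the default base URL are assumptions chosen for the example; the page ID in the usage note is arbitrary.

```python
from urllib.parse import urlencode

def page_url_from_id(page_id, base="https://en.wikipedia.org/w/index.php"):
    """Build a URL that asks the wiki for the page with the given ID (hypothetical helper)."""
    # The curid query parameter selects a page by its numeric ID instead of its title.
    return f"{base}?{urlencode({'curid': int(page_id)})}"
```

For example, `page_url_from_id(12345)` gives `https://en.wikipedia.org/w/index.php?curid=12345`, which should resolve to whatever page has that ID.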

Condition limit

We're hitting it pretty badly now (~15% of all edits), so I'm going to make a list of filters that are expensive and see what we can do.

  • #5: User self-renaming or moving user talk pages into article talk space
  • #16: Prolific socker I
  • #58: Long-term pattern abuse
  • #79: New user removing reference grouping tags
  • #80: Link spamming
  • #117: removal of Category:Living people
  • #183: AFC submissions in wrong namespace
    Disabled. Reaper Eternal (talk) 15:52, 17 August 2012 (UTC)
  • #354: Promotional text added by user to own user(-talk) page
  • #460: Feedback: Common Vandalism
  • #461: Feedback: Vandalism in all caps
  • #463: Feedback: Adding email addresses
  • #467: MS filter
    Deleted (was good for a week). Materialscientist (talk) 02:50, 15 August 2012 (UTC)
  • #468: Athletics time-tweaking vandal
    Hit rate ca. 100%, activity declined, maybe just for the summer. The edits are nasty (tweaking stats), thus I would keep this one. Materialscientist (talk) 02:50, 15 August 2012 (UTC)
  • #471: Eurodance vandal
    Hit rate ca. 100%, activity declined, maybe just for the summer. Edits are relatively harmless, thus disable it if resources are a problem. Materialscientist (talk) 02:50, 15 August 2012 (UTC)
  • #472: Feedback: Addition of bad words
  • #473: Feedback: Repeating characters
  • #474: Feedback: Comment with only obscenities
  • #475: Feedback: Vandalism or libel

Thanks! Reaper Eternal (talk) 16:23, 14 August 2012 (UTC)

I think we're now down to a much more acceptable ~0% of edits maxing out the condition limit. Reaper Eternal (talk) 16:06, 17 August 2012 (UTC)

Per your request Reaper I scanned through the error codes. Unfortunately a lot are set to private so I'm not going to be much help. Good luck. Kumioko (talk) 21:18, 17 August 2012 (UTC)

Long Term Pattern of Abuse

I'm not an EFM yet, but Black Kite informed me that tripping the long-term pattern of abuse filter does not mean that the user has actually vandalized, the user has just edited in a pattern that vandals do. Please correct me if I'm wrong here, but Mr. Z-bot is reporting users who trip these filters to AIV and the reports are being declined. Can someone please look into this? Thanks, Electric Catfish 21:27, 14 August 2012 (UTC).

Filter 58 has one of the worst false positive rates of any filter. We could probably get rid of a lot of them by setting the filter to only look at small edits, delta<500 maybe? It seems to be that when a newbie posts a giant new article in a sandbox, it's almost guaranteed to fit one of the bazillion patterns in that filter. I'd like some other opinions. Someguy1221 (talk) 22:26, 14 August 2012 (UTC)
Yes. Perhaps we can modify the regex. Also, what's the false positive rate? I'd like to help out with the regex patterns, but I'm not an EFM yet. Electric Catfish 00:23, 15 August 2012 (UTC)
Fine, but at the moment, can we please stop Mr. Z-bot from reporting users or IPs that trip this filter to AIV? Electric Catfish 11:03, 19 August 2012 (UTC)

Filter 422 again

  • [15]. It seems to me that this filter is far too vague/wide on the editing of templates by non-confirmed editors. Black Kite (talk) 14:17, 28 August 2012 (UTC)
  • [16]. Another one. Black Kite (talk) 11:45, 2 September 2012 (UTC)
I've dealt with it. Reaper Eternal (talk) 15:11, 3 September 2012 (UTC)

Filter 489

I've currently got filter 489 set up to log when IP users from particular address ranges edit, in an attempt to first track a particularly persistent vandal's activity, then, with appropriate content-based filter rules, to block their activities. 78.0.175.227 (talk · contribs · WHOIS) meets those criteria, but the filter seems to have triggered only once on their numerous recent edits, according to the edit log, yet back-testing the rule on all that IP's edits seems to catch them all. Can anyone tell me if I am doing something wrong? Is there some kind of throttling being applied here (perhaps on the logging end?), or is there a problem somewhere else that might account for this? Thanks. -- The Anome (talk) 01:12, 5 September 2012 (UTC)

I am a newbie in edit filters, thus just thinking out loud:
You try to meticulously define ranges, and some even overlap, like 78.0.x.x. I would use [17] and define fewer ranges like 78.0.0.0/14 for the 78.x.x.x IPs, etc., not bothering much if they cover a bit more than intended. Your filter detects all edits from 78.0.175.227 in test mode [18], thus it should be something in the range detection, maybe the overlap. Materialscientist (talk) 01:56, 5 September 2012 (UTC)
I haven't eliminated overlaps from the ranges yet, but I can't see how that should be a problem right now: the code's slightly inefficient, but not incorrect as is. I can tidy this up when the code goes into production, by both removing covered ranges, and also aggregating adjacent ranges into larger ones. -- The Anome (talk) 03:15, 5 September 2012 (UTC)
You have put many OR operators into one condition, which are expensive; maybe wikisoftware kills an evaluation criterion after xx milliseconds, which results in unstable triggering? (just guessing - I also don't see logical errors) Materialscientist (talk) 04:08, 5 September 2012 (UTC)
Possibly: but these IP range comparisons should be cheap operations, vastly less expensive than, say, regexp operations, which many filters perform in abundance, so a CPU-time-based limit shouldn't be an issue here. I'll hack away at the list now to optimize it a bit: if really necessary, I can optimize it further by turning it into a branching search on the IP range. -- The Anome (talk) 13:32, 5 September 2012 (UTC)
I saw something odd with the tag filter "possible BLP issue or vandalism" yesterday, like it stopped logging for some time. Maybe there is nothing wrong with your filter. Materialscientist (talk) 22:32, 5 September 2012 (UTC)
We're probably hitting some form of condition limit. I'll look into what filters can be further pared down or removed. Reaper Eternal (talk) 14:55, 6 September 2012 (UTC)
Holy moly, 20% of edits are maxing out the filter. Reaper Eternal (talk) 15:00, 6 September 2012 (UTC)
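The range tidy-up discussed above (dropping ranges already covered by a larger one and merging adjacent ranges) can be sketched outside the filter with Python's standard ipaddress module. The function names and the example ranges are illustrative assumptions, not the contents of filter 489.

```python
import ipaddress

def tidy_ranges(cidrs):
    """Collapse a list of CIDR strings: covered ranges are dropped, adjacent ones merged."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

def in_any_range(ip, cidrs):
    """Return True if the address falls inside at least one of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(c) for c in cidrs)
```

A shorter list gives the filter fewer comparisons to evaluate per edit, which is the point of the clean-up. Whether that is enough to stay under the condition limit depends on the other active filters.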

Okay, so I've dealt with several filters:

  • #47: Disabled.
  • #56: Disabled.
  • #328: Disabled.
  • #422: Pared down most of the content to a much smaller, leaner, and hopefully more accurate filter.
  • #435: Deleted.
  • #448: Disabled.
  • #454: Disabled.
  • #465: Disabled.
    Restored. He is back (nasty LTA case). Contact me before disabling. Materialscientist (talk) 00:15, 20 September 2012 (UTC)
  • #468: Disabled.
  • #476: Disabled, and a trout to Beetstra (talk · contribs) for coding a filter to "topic ban" editors.
  • #478: Disabled.
  • #483: I've added a check to reduce the false positives and hopefully speed up processing.
  • #484: Deleted.
  • #489: I deleted a large number of redundant CIDR address ranges.

Some expensive filters:

  • #460: Matches nearly 5% of all actions, and they are not false positives! So nearly 5% of article feedback is blatant vandalism....

Reaper Eternal (talk) 16:02, 6 September 2012 (UTC)

Alright. We now are rarely hitting the condition limit, so edits should be checked against your filter. Reaper Eternal (talk) 14:27, 7 September 2012 (UTC)
Thank you! I've now set filter 489 to do some fairly broad-brush edit blocking based on a combination of keywords and IP ranges: although it's quite strong, it only affects a few tens of thousands of IPs worldwide, for a few percent of articles, and should help discourage some quite long-standing and intensive vandalism without going to the measure of fully range-blocking all those IPs. -- The Anome (talk) 23:35, 10 September 2012 (UTC)
Dear Reaper Eternal: thank you so much for spearheading this great clean up -- and for consolidating the article feedback filters! All your great work and advice is much appreciated. It was really helpful to get standardized code to filter a large number of swear words in a consistent way. However, an unintended side-effect was that filter #460 was automatically disabled because it exceeded the 5% limit, given the amount of swear words in all feedback posts. So we asked AFT5 developer mlitn to help solve that issue, and he went ahead and split that filter #460 into 5 separate filters (494, 495, 496, 497). This apparently resolved this particular issue, and we are now effectively filtering out the swear words you kindly helped compile. Please note that all feedback filters are in a special 'feedback' group that is being processed separately from all other filters, so they should not impact the condition limits for all other filters, thanks to a recent code enhancement by Werdna to the AbuseFilter extension. As a result of all this fine work, we expect that much of the abuse in article feedback will now be disallowed, thereby reducing the workload for editors who monitor that feed. We are very grateful for all that you (and other editors like Sole Shoe) have done to help make that possible! - Fabrice Florin (WMF) (talk) 18:52, 27 September 2012 (UTC)

AFTv5 filters

Can anyone tweak the AFTv5 filters so that they can catch vandalism like [19], [20], [21], [22], and maybe even legal threats like this one? jfd34 (talk) 16:15, 25 September 2012 (UTC)

Hello, jfd34, thank you for your good recommendations about filtering feedback abuse. We can easily disallow potentially offensive words (e.g.: 'vagina' in your example above), and will look into addressing some of your other suggestions shortly. Sadly, it is unlikely that we could automatically filter legal threats, though we could 'auto-flag' comments written in ALL-CAPS, which are often a telltale sign of inappropriate feedback (this would allow the comment, but automatically remove it from the 'Most Relevant' list which most readers see). In any case, we appreciate your suggestions and look forward to addressing issues like these in coming weeks, with everyone's help. Much appreciated! - Fabrice Florin (WMF) (talk) 19:08, 27 September 2012 (UTC)
Large caps is not a good indicator of inappropriate feedback. That filter was in before and generated mostly false positives, rather than threats. Graeme Bartlett (talk) 21:07, 27 September 2012 (UTC)

Filter 29

Hey, all, can edit filter 29 be updated to account for the template redirects that the new Page Curation tool uses? The full list can be found here. I think it would look something like:

(user_editcount < 50)
&(lcase(removed_lines) rlike "{{db-[a-z0-9]{2,15}(\||}})|{{db\||{{db}}|{{speedy deletion-.+?}}")
&!(lcase(added_lines) rlike "{{db-[a-z0-9]{2,15}(\||}})|{{db\||{{db}}|{{speedy deletion-.+?}}")
&!(lcase(removed_lines) rlike "{{db-(self|blanked|auth|g7|user|owner|u1)}}|<nowiki>{{db|{{speedy deletion-author request}}|<nowiki>{{speedy deletion")

But someone should probably check this, since my previous experience with PCRE is slim-to-none. Thanks! Writ Keeper 20:00, 11 October 2012 (UTC)

 Done Reaper Eternal (talk) 20:14, 11 October 2012 (UTC)
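For anyone who wants to try the proposed conditions outside the filter, here is a rough Python approximation. It assumes that rlike behaves like an unanchored regex search and that lcase simply lowercases its input. Braces are escaped here for readability in Python. The function name and the sample templates in the usage note are made up for the example.

```python
import re

# Any speedy-deletion tag, including the "speedy deletion-..." style redirects.
DB_TAG = re.compile(
    r"\{\{db-[a-z0-9]{2,15}(\||\}\})|\{\{db\||\{\{db\}\}|\{\{speedy deletion-.+?\}\}"
)
# Tags an author may remove themselves, plus tags wrapped in <nowiki>.
SELF_TAG = re.compile(
    r"\{\{db-(self|blanked|auth|g7|user|owner|u1)\}\}|<nowiki>\{\{db"
    r"|\{\{speedy deletion-author request\}\}|<nowiki>\{\{speedy deletion"
)

def removes_speedy_tag(removed_lines, added_lines, user_editcount):
    """Approximate the four proposed conditions (hypothetical helper, not the live filter)."""
    removed = removed_lines.lower()
    added = added_lines.lower()
    return (
        user_editcount < 50
        and DB_TAG.search(removed) is not None   # a tag was removed
        and DB_TAG.search(added) is None         # and not re-added in the same edit
        and SELF_TAG.search(removed) is None     # and it was not an author-request tag
    )
```

Under these assumptions, `removes_speedy_tag("{{db-spam}}", "", 3)` is true, while removing `{{db-g7}}` or re-adding the tag in the same edit is not flagged.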

Upping the condition limit for AFT5 filters

Hey all

We're talking about upping the condition limit/percentage limit for AFT5 filters and AFT5 filters only, from 5 percent to 10 percent. Obviously we'll run it past ops first, but I wanted to give you guys a heads up and check we're not missing an obvious problem :). Okeyes (WMF) (talk) 18:23, 18 October 2012 (UTC)

Vandalism in all caps

Can anyone modify the filter (#225) to catch this? (replacing E with 3) jfd34 (talk) 07:11, 1 November 2012 (UTC)

The rlike operator can't distinguish between 3 and E, so no. Anyway, such an edit would normally be caught by filter 384. Regardless, the real problem is that the page Deer mentions "penis" quite a few times in completely encyclopedic contexts. Determining whether someone changed the number of bad words on a page is more expensive to do than determining whether they added a bad word that wasn't there to begin with. Someguy1221 (talk) 08:47, 1 November 2012 (UTC)
Filter 384 only catches words that are completely lowercase - not even starting with a capital letter. For example see this edit, where the word "fuck" was not there anywhere in the article prior to the edit, but the "F" being uppercase made it bypass the filter. jfd34 (talk) 07:28, 2 November 2012 (UTC)
Filter 384 ignores the case of the letters, actually. That edit slipped past the filter because at the time that the edit you linked was made, 384 would not trigger on edits with negative delta, a condition which has since been removed. Someguy1221 (talk) 07:54, 2 November 2012 (UTC)
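The two points in this thread (ignoring case, and checking whether a word is new to the page rather than merely present) can be illustrated with a short Python sketch. The look-alike table, the function names and the placeholder word are assumptions for the example. This is not how filter 225 or 384 is written, and mapping digits to letters this way would also alter ordinary numbers.

```python
import re

# Hypothetical table of digit/symbol look-alikes; a real list would need care.
LOOKALIKES = str.maketrans({"3": "e", "0": "o", "1": "i", "4": "a", "5": "s", "@": "a", "$": "s"})

def normalize(text):
    """Lowercase the text and replace common look-alike characters."""
    return text.lower().translate(LOOKALIKES)

def adds_new_word(word, old_text, new_text):
    """True if `word` appears in the new text but was absent from the old text."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return bool(pattern.search(normalize(new_text))) and not pattern.search(normalize(old_text))
```

Checking only for presence in the new text would keep firing on pages that already use the word legitimately, which is the problem described for the Deer article.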

Filter 189 false positives

In filter 189 (BLP vandalism/libel) please change \bwank to \bwank(a|er|ed|ing|s)?\b to avoid false positives like [23] [24] [25] jfd34 (talk) 05:30, 24 November 2012 (UTC)

 Done. Feezo (send a signal | watch the sky) 05:51, 24 November 2012 (UTC)
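The effect of the requested change can be checked with a quick Python comparison of the two patterns. The test phrase "Wankel engine" is an assumed example of a word that merely starts with the same letters; it is not taken from the linked reports.

```python
import re

OLD = re.compile(r"\bwank", re.IGNORECASE)                       # matches any word starting with the stem
NEW = re.compile(r"\bwank(a|er|ed|ing|s)?\b", re.IGNORECASE)     # requires the word to end after a listed suffix

def old_hits(text):
    """True if the original pattern matches anywhere in the text."""
    return OLD.search(text) is not None

def new_hits(text):
    """True if the tightened pattern matches anywhere in the text."""
    return NEW.search(text) is not None
```

The trailing `\b` is what removes the false positives: the old pattern accepted any continuation after the stem, while the new one only accepts the listed endings.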

Edit filter based on ip/username

There is a question at WP:AN about a long-term abuse case: can IP be an edit filter criterion? Specifically, there are some things that might be caught by the abuse filter, but only within a specific set of IP ranges. Is this functionality possible? Shadowjams (talk) 04:39, 25 November 2012 (UTC)

Yes, that is technically possible. Reaper Eternal (talk) 04:47, 25 November 2012 (UTC)

EFM for Legoktm

Resolved

Hi.

Lately, I've been working on some tools that would use the edit filter and have been constantly irritated at how many filters are set to private. Take User:Mr.Z-bot/filters.js for example. All of the filters considered immediate are set to private except for 139, which was specifically un-privated after I talked to an admin about it on IRC. For anyone trying to find/track serious vandals this gets annoying real fast. I understand that many of these filters are private because they're for LTAs and would be gamed if made public, but it still is annoying.

I've also been reporting a few bugs regarding the edit filter: bugzilla:42734 (Non-admins can see contents of deleted pages when viewing abusefilter details), bugzilla:42758 (AbuseFilter log events should show in the IRC feed), bugzilla:42802 (Query multiple filter logs at the same time in the API), and bugzilla:42814 (Abusefilter API does not check for abusefilter-view-private userright). Incidentally I wouldn't have noticed the last one if I did have the EFM right.

One of the things I'm currently working on is an IRC bot that tracks active vandals using the edit filter (working), and is able to accurately recommend blocks (not yet working). I believe this would be a major improvement over the current system in #wikipedia-en-abuse-log. It's currently about halfway done, you can PM me if you want access. If/when bug 42814 is closed, I won't be able to monitor private filters without the EFM right. Nor can I currently start tracking private filters, which are mainly for LTA's, if I can't see them.

As far as editing filters goes, I believe I am competent in regular expressions (having successfully run a number of bots) and have enough clue to know what's a good idea and what's not. But for the most part I don't plan on editing filters.

tl;dr: I would like the EFM permission to view private filters and build useful tools with the edit filter, not so much of editing actual filters. Legoktm (talk) 06:32, 7 December 2012 (UTC)

  • I definitely support this request because he's been trying to help fix up problems we have, as you can see above and below. I recently signed back up as an EFM after a 1-year hiatus to try to help with a simple problem but my skills don't seem to be much help here as I've already come across code I can't remember how to parse. My only advice to Legoktm, if approved, would be to use the testing tools before pushing anything live, even if you're absolutely sure the code is right, as people have thought that before only to be proven wrong. Soap 23:09, 7 December 2012 (UTC)
  • I would trust him with this task, he seems well skilled for it. MBisanz talk 23:25, 7 December 2012 (UTC)
  • Support - Seems reasonably skilled. Reaper Eternal (talk) 21:22, 8 December 2012 (UTC)
  • Support - Reasonable request. NativeForeigner Talk 18:04, 12 December 2012 (UTC)

 Done following 5 days of discussion with no objections. 28bytes (talk) 19:02, 12 December 2012 (UTC)

Thanks, I'll try not to blow anything up! Is there anywhere that EFMs can discuss private filters, like an IRC channel? I took a look at a few and have questions now :P Legoktm (talk) 19:30, 12 December 2012 (UTC)
There may be (I'm not on IRC, so I don't know), but if you have any questions about a specific filter you can always email the person who created or last edited it. 28bytes (talk) 19:34, 12 December 2012 (UTC)
There isn't any such channel that I know of. However, many active EFMs use IRC for other purposes and can be PMed. Reaper Eternal (talk) 20:43, 12 December 2012 (UTC)

Edit filters

Filter False Positive

Unresolved

Any progress on this?: [[26]] 68.50.128.91 (talk) 08:26, 20 December 2012 (UTC)

Filters to show up in IRC feeds

Hi all,

I have submitted a bug and changeset for edit filter log events to show up in the irc.wikimedia.org feed for enwp. Since these feeds are public, only public filters will be included. This will allow for bots/scripts to find out about filter trips as soon as they occur, rather than continually polling the API.

Thanks, Legoktm (talk) 11:48, 13 December 2012 (UTC)

After a few unexpected hiccups, it works! Log entries follow this general format (with colors and stuff too):
<rc-pmtpa> [[Special:Log/abusefilter]] hit * Username *  Username triggered [[Special:AbuseFilter/##|filter ##]], performing the action "edit" on [[Pagename]]. Actions taken: Warn ([[Special:AbuseLog/#####|details]])
Edits to filters will also show up, however that's slightly broken right now (see T45105).
I'll try and work on setting up a bot to bridge filter trips over to irc.freenode.net. Thanks, Legoktm (talk) 01:09, 14 December 2012 (UTC)
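A bot reading the feed could pull the fields out of a line in the format shown above with something like the following Python sketch. The regular expression is derived only from that one sample format, the function names are made up, and the colour-stripping step assumes ordinary mIRC-style control codes. Real feed lines may differ.

```python
import re

# mIRC colour codes (\x03 plus optional fg[,bg] digits) and other formatting control characters.
IRC_FORMATTING = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?|[\x02\x0f\x16\x1d\x1f]")

HIT = re.compile(
    r'\[\[Special:Log/abusefilter\]\] hit \* (?P<user>.+?) \*\s+'
    r'.+? triggered \[\[Special:AbuseFilter/(?P<filter_id>\d+)\|filter \d+\]\], '
    r'performing the action "(?P<action>[^"]+)" on \[\[(?P<page>.+?)\]\]\. '
    r'Actions taken: (?P<taken>.+?) \(\[\[Special:AbuseLog/(?P<log_id>\d+)\|details\]\]\)'
)

def parse_hit(line):
    """Return a dict describing a filter hit, or None if the line is not in the expected format."""
    match = HIT.search(IRC_FORMATTING.sub("", line))
    if match is None:
        return None
    fields = match.groupdict()
    fields["filter_id"] = int(fields["filter_id"])
    fields["log_id"] = int(fields["log_id"])
    return fields
```

Lines that do not match (for example edits to the filters themselves) simply return None, so a bot can skip them.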

Proposal to have private filters show up in the RC feed

Hi all,

I wrote above that edit filter trips will start showing up in the irc.wikimedia.org recent changes feed. However this is only for public filters, not private ones. The only "non-public" information that would be shown (as compared to what Special:AbuseLog shows) would be the filter ## of what was tripped, and the log id. I believe that such information will not enable clever vandals to try and game the filters, however it would make it easier for tools/bots to track private filters. A slight code modification would be required to implement this, however I'm willing to write that. Thanks, Legoktm (talk) 04:35, 14 December 2012 (UTC)

Seems reasonable to me. Private filters were only implemented as a means to hide filter rules from clever vandals, as I understand it. Outputting filter hits to an IRC feed should be fine, I think. --MZMcBride (talk) 04:54, 14 December 2012 (UTC)
Filter numbers are a very small disclosure, this seems a useful step forward. Rich Farmbrough, 03:33, 16 December 2012 (UTC).

Edit filter 31 tripping on pre-existing text on a page

Here, a good edit is rejected because of the paragraph above, which was already there and presumably added before the edit filter existed. Is this a bug, and does it need fixing? Black Kite (talk) 12:40, 20 December 2012 (UTC)

Looks like a bug in the AbuseFilter to me, I'm not sure why those comments were in the added_lines field since they weren't added nor were they in the same line. Legoktm (talk) 09:18, 21 December 2012 (UTC)
added_lines does not include only lines that have been added to the page, it shows the same lines that would be shown in a diff. Prodego talk 21:29, 31 December 2012 (UTC)
Right, but if you look at the diff at Special:AbuseLog/7985879, that text isn't in the added section of the diff. Legoktm (talk) 00:02, 2 January 2013 (UTC)
It's in added_lines, all right. -- King of ♠ 00:10, 2 January 2013 (UTC)
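Prodego's point, that a diff works on whole lines, can be illustrated with a small Python sketch. This uses Python's difflib, not the diff engine the AbuseFilter extension uses, and it does not try to reproduce the specific log entry discussed above. The function name is an assumption for the example.

```python
import difflib

def added_lines(old_text, new_text):
    """Return the new-side lines of every changed or inserted block in a line-based diff."""
    old_l, new_l = old_text.splitlines(), new_text.splitlines()
    matcher = difflib.SequenceMatcher(None, old_l, new_l, autojunk=False)
    out = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            # The whole line is reported, including text that was already on it.
            out.extend(new_l[j1:j2])
    return out
```

Changing a single word on a long line therefore puts the entire line, old text included, into the "added" side, so a pattern can match words the editor never typed.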

Request for permission for ElockidAlternate

Hi all,

I would like to keep an eye out on my filters/other anti-abuse filters but since they're set on private, I cannot see them on my alternate account. Could someone please add abusefilter to this account? Elockid (Alternate) (Talk) 21:13, 29 January 2013 (UTC)

Crat request

Could some of you fine filter managers drop by Wikipedia:BN#Wikipedia:Changing_username.2FSimple and advise on the suitability of that proposal? Thanks. MBisanz talk 04:51, 20 January 2013 (UTC)

Help with filter

Can someone give me a pointer as to how I could modify filter 526 to catch this?—Kww(talk) 00:10, 24 January 2013 (UTC)

You want to prevent IP's from changing numbers in all articles? Maybe you should elaborate on exactly what you're trying to prevent. ‑Scottywong| spout _ 00:54, 24 January 2013 (UTC)
Currently, I warn any IP that attempts to insert a Hot 100 Brasil position that they have to be using the magazine, not the website (which isn't affiliated with Billboard and has been listed on WP:BADCHARTS for years). I would like the filter to detect that the editor was modifying a Billboard Brasil position and give the same warning (MediaWiki:Abusefilter-brasilhot100). It looks like the problem is that the chart isn't in "added_lines", it's only in "edit_diff", but I can't figure out how to pick it up without also picking up edits to the charts immediately above or below. That particular edit was exactly what I'm trying to put a stop to: someone modifying a legitimate position from the magazine to match the illegitimate website.—Kww(talk) 01:06, 24 January 2013 (UTC)

Do we have any statistics?

Hi, I'd like to know how frequently the edit filters prevent edits taking place. Do we have any stats? (Wikipedia:Edit_filter/Performance doesn't seem to have been refreshed since 2009 so I've taken the liberty of declaring it historical). ϢereSpielChequers 21:53, 16 February 2013 (UTC)

Since Toolserver users have access to the log table, someone at WP:DBR might be able to help you out. Legoktm (talk) 22:21, 16 February 2013 (UTC)

Breaking change to AbuseFilter logging

I would have expected someone to have posted this by now, but User:Hoo man has submitted a changeset that would fundamentally change the way logging works, there's a non-technical explanation here. The specific change is gerrit:42501.

Also, if you're not already on the wikitech-ambassadors list, you should subscribe since these kinds of changes will be announced there. Legoktm (talk) 02:31, 17 February 2013 (UTC)

Requesting code

I am an admin and bureaucrat at the Marathi language Wikipedia (mr-wiki). Currently we are using the "contains_any" parameter for filtering out required words. We are looking for effective suggestions for the following (so we can have properly updated help pages):

  • Request 1: The best way to filter a single-letter word while avoiding words with prefixes and suffixes being filtered, to avoid false positives; e.g. in Devanagari script we want to filter "तू", but we do not want words with prefixes and suffixes to be filtered.


  • Request 2: Is it possible for the English language Wikipedia to share old and effective private filter configurations without the confidential keywords? This would help the rest of the sister wiki projects to evaluate their filters more effectively.


Thanks and Regards Mahitgar (talk) 11:49, 19 February 2013 (UTC)

I'm having a hard time understanding what it is you are asking, but if you want to filter out only that character when it is by itself and not part of another word, you could just use: (added_lines rlike "\bतू\b"). Let me know if that is what you want. Cheers! Reaper Eternal (talk) 13:53, 19 February 2013 (UTC)
Yes , you understood it correctly, thanks for prompt help and reply. We will try your solution. Rgds Mahitgar (talk) 14:15, 19 February 2013 (UTC)
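One caveat worth testing before relying on `\b` here: whether a regex engine treats Devanagari vowel signs as word characters depends on the engine and its Unicode settings, so the boundary may not fall where expected. A Python sketch of an alternative, using explicit lookarounds over the Devanagari block (U+0900 to U+097F), is shown below. The helper name is an assumption, and the behaviour of the AbuseFilter engine itself should be checked with the filter testing tools.

```python
import re

DEVANAGARI = "\u0900-\u097F"  # basic Devanagari Unicode block

def standalone_pattern(word):
    """Compile a pattern matching `word` only when no Devanagari character touches it."""
    return re.compile(
        r"(?<![%s])%s(?![%s])" % (DEVANAGARI, re.escape(word), DEVANAGARI)
    )

# The word from the request, written with escapes: U+0924 (ta) followed by U+0942 (vowel sign uu).
TU = standalone_pattern("\u0924\u0942")
```

With this pattern the word is matched when it stands alone or next to spaces and punctuation, but not when it is the start or end of a longer Devanagari word.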

Filter 102

Appears to be stopping this editor from making any edits at all. Must be an FP, surely? Indeed, if you look at the filter log, there are at least two others on the first page that appear to be FPs as well. Black Kite (talk) 13:29, 19 February 2013 (UTC)

That was fixed four days ago by NawlinWiki. Cheers! Reaper Eternal (talk) 13:50, 19 February 2013 (UTC)
Thanks - I hadn't noticed that the report was four days old! Black Kite (talk) 14:16, 19 February 2013 (UTC)

removal of tag

With the advent of Wikidata the removal of interwiki links is now desirable instead of potentially abusive, so this tag should probably be deprecated. Beeblebrox (talk) 20:12, 17 February 2013 (UTC)

  • Specifically, the involved abuse filters are Special:AbuseFilter/270 & Special:AbuseFilter/531. The latter is maintained by AddShore (last change yesterday), the first has been unchanged for two years. :) ·Salvidrim!·  23:35, 18 February 2013 (UTC)
    • Not entirely desirable, as people tend to blindly remove links without checking to make sure it's all on Wikidata. --Rschen7754 23:41, 18 February 2013 (UTC)
531 is specifically if all wikidata tags are removed from a page, the reason is that often at this stage if all of the links have been removed people are not checking for them on wikidata before removing them. Please see the generated list User:Addbot/log/wikidata ·Add§hore· Talk To Me! 00:14, 19 February 2013 (UTC)

I'd prefer for us to properly plan this and just have a flag day where we removed all of them en masse. While the transition is piecemeal, their removal can still be improper if the central support isn't there for a given article. Chris Cunningham (user:thumperward) (talk) 11:12, 19 February 2013 (UTC)

I would prefer if everyone didn't touch them and we just let bots do it all >.<. ·Add§hore· Talk To Me! 12:01, 19 February 2013 (UTC)
I am not sure, but if only one bot is doing all of the work of removing the interwiki links, that can take a long time, as there are currently over 4 million articles plus a couple thousand project and template pages. So wouldn't it be better to have some manual editors work on it as well, as that would most likely cut down on the time it takes? --Clarkcj12 (talk) 21:51, 1 March 2013 (UTC)

Adding a new filter

Hi all! Will it be possible to add a new filter that stops people from adding categories to pages in the Wikipedia talk:Articles for creation/... namespace? This is to prevent them adding categories to non-published pages. Thanks, Mdann52 (talk) 11:04, 25 February 2013 (UTC)

Yeah, it's possible. However, I don't know if it is desirable since the entire edit will be blocked rather than just the categories. Reaper Eternal (talk) 13:21, 25 February 2013 (UTC)

Confidentiality, editfilter requests, and WP:BEANS

Hi,

  • There is an editor who makes quite a large volume of unhelpful edits. They are not vandalism per se, but they give away a lot of personal information. Almost all edits are caught and reverted by RC patrol or watchlists, but I have been requesting oversight wherever I find them in the article history, for this person's protection - edits always contain full name, usually bank account number, often other details like date of birth or address. Accounts have been blocked in the past but the edits keep on coming. They currently use a wide range of IPs - blocking that would cause a lot of collateral damage.
  • Complexity is low - I don't imagine there would be performance problems - and the distinctive contents mean that an edit filter could be a good response, with minimal false positives/false negatives. However, the most distinctive part of the edit text is the personal information that we're trying to protect, so I'm wary of just putting it all out in the open here. Is it possible to request an editfilter through some other channel?

bobrayner (talk) 11:22, 4 March 2013 (UTC)

Most editfilter managers can be contacted via email or on IRC (what I've used in the past). Legoktm (talk) 11:24, 4 March 2013 (UTC)
I am willing to help via email. -- King of ♠ 11:38, 4 March 2013 (UTC)
Please be aware that edit filter logs are publicly visible, and the edits they stopped cannot be oversighted and will remain permanently publicly visible. I'm not certain that anything other than a lawsuit by the injured party against the spammer will be effective in stopping the abuse. Reaper Eternal (talk) 13:29, 4 March 2013 (UTC)
Um, I'm pretty sure Special:AbuseLog entries can be oversighted... Legoktm (talk) 13:33, 4 March 2013 (UTC)
Yeah, edit filter logs can be oversighted now, a bugfix was enacted a year or two ago. There is no way to get them to be "automatically" oversighted though, so that means someone will have to be watching the filter constantly. Soap 13:35, 4 March 2013 (UTC)
It would be nice if we had an easy solution which kept all these details completely hidden on-wiki, but it can never be totally hidden because the person also seems to do similar stuff on other freely-editable parts of the internet, sometimes - although these are much more obscure because they get less search engine love &c. The fastest way to find some of the historic edits is to find mirrors of enwiki which had scraped the personal data before another editor reverted!
The edit filter log is, I think, less "visible" than article-space; there are fewer eyes on it. Nonetheless, if I had a particular edit filter to look for, then over an initial phase of "log only" it might actually help me try one last attempt at communicating with whatever IP the editor is using that day. And in the longer term, if the edits were disallowed, we can only hope that the editor would give up... bobrayner (talk) 14:21, 4 March 2013 (UTC)
Actually, the logs for private filters are not publicly visible, only to administrators and edit filter managers. -- King of ♠ 23:18, 5 March 2013 (UTC)

Filter 58

Many false positives from this filter. Basically, it appears to be stopping a lot of good-faith edits from non-confirmed editors where they're actually doing the right thing and adding sources. Am starting to get a little frustrated with opening reports here and seeing "58" a lot. Examples [27] [28] [29] [30]. Filter 225 seems to be becoming problematic as well. Black Kite (talk) 03:02, 8 March 2013 (UTC)

I fixed the ones you mentioned. But I agree, it's one of the most complex, poorly understood filters we have. It often causes difficulty to admins and edit filter managers when reported on the false positive page. I myself had no idea how to identify the infringing parts in the regex, until I came to this realization: Open up the edit in "examine" mode and check to make sure the filter is positive. Now keep deleting portions of the regex until it goes negative, and then restore the last portion you deleted and delete everything past the end. Repeat (basically, binary search) until you've narrowed it down to one term. That is usually enough to figure out what is going on. In the rare case that even that leaves you bewildered (such as when there are multiple wildcards, disjunction, etc.), then copy the actual text of added_lines (in quotes) to replace added_lines, and binary search that. This is a protip for anyone who's reading this, by the way. -- King of ♠ 10:23, 8 March 2013 (UTC)
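The narrowing-down procedure described above can be sketched in Python. This is only an illustration under stated assumptions: it assumes the filter's regex has already been split by hand into its top-level alternatives, and the function name find_matching_alternative and the sample patterns are hypothetical, not taken from filter 58.

```python
import re

def find_matching_alternative(alternatives, text):
    """Bisect a list of regex alternatives to find one that matches text.

    Returns one matching alternative, or None if nothing matches at all.
    """
    def matches(subset):
        # Wrap each alternative in a non-capturing group so that joining
        # them with "|" cannot change what each one means.
        pattern = "|".join(f"(?:{alt})" for alt in subset)
        return bool(subset) and re.search(pattern, text) is not None

    if not matches(alternatives):
        return None  # the whole pattern is negative for this text
    candidates = list(alternatives)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first = candidates[:half]
        # The combined pattern matched, so if the first half does not,
        # the second half must contain a matching alternative.
        candidates = first if matches(first) else candidates[half:]
    return candidates[0]

# Hypothetical example: four alternatives, one of which trips on the text.
culprit = find_matching_alternative([r"foo\d+", r"bar", r"ba+z", r"qux"], "hello baaz")
```

With n alternatives this needs roughly log2(n) test runs instead of n, which is the point of the binary-search tip.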

Filters for Article Feedback

Hi,

I am trying to understand how filters related to the Article Feedback tool work, before the extension is enabled on the French Wikipedia, where I am an abusefilter editor. I have seen that you have created specific filters, for instance Special:AbuseFilter/502 ("Feedback: Extremely long words"). But I can't see anything that tests the action in the code of this filter (such as action == "feedback"). So how is it possible that this filter does not match normal contributions in articles?

Orlodrim (talk) 21:37, 18 March 2013 (UTC)

It seems that AFTv5 has been temporarily disabled, but when it is enabled, it's placed in a separate group called "feedback", and only filters in that group are run against feedback submissions, and not against anything else. Legoktm (talk) 22:38, 18 March 2013 (UTC)
Ok, I see. I checked the code and this setting appears only if there are at least two valid groups, that's why I have never seen it on the French version. Thanks, Orlodrim (talk) 23:36, 18 March 2013 (UTC)

Requesting a solution

The current parameters in one of the mr-wiki filters are something like the following:

article_namespace >= 0 &
new_size >= 2 &
!article_articleid = 64452 & (This is article related to Rabies)
!contains_any(added_lines,"abc","efg")&
contains_any(added_lines,"synonymOfWordDog")

On mr-wiki, one of the edit filters catches a synonym of the word 'Dog' to prevent the word being used for abuse. Since the synonym was giving false positives on the article related to "Rabies" in the main namespace, we added the parameter "! article_articleid = 64452 &" to avoid false positives on this article. But in a recent abuse attack we noticed that, rather than skipping the article related to "Rabies", the filter is skipping the synonym of the word 'Dog'. Please do suggest an improvement.

Rgds

Mahitgar (talk) 15:35, 7 March 2013 (UTC)

Hi Mahitgar. I'm new at this but I see no one has replied so here is what I think.
  1. The check for article_articleid should use ==, not =.
    == means asking a question, which is what you want here.
  2. I noticed what I think is another problem with your filter.
    When you say article_namespace >= 0, you probably mean article_namespace == 0. The various possible values are listed here. Your filter as it stands will check all edits in every namespace except the ones with negative numbers. What you probably want is to check only edits in the main namespace, meaning only when article_namespace is zero.
  3. The last two lines look fine. Those are probably not the problem.
I hope this helps!
Mattj2 (talk) 03:15, 4 April 2013 (UTC)

Hi, User:Mattj2, thanks for your reply.

1) The main problem we sought a solution for was at the third line (!article_articleid). At the local wiki we found a reliable workaround for the problem, but we are still interested in understanding exactly how this syntax works. Something thought to be an error can perhaps be used constructively in some other code. But I was not sure how much resources it consumes, so I was waiting for some technical discussion to take place.

2)a First line (article_namespace >= 0): On our local wiki people usually understand the importance and value of the wiki to their mother tongue, so we have hardly any deliberate vandalism in the main namespace (other than a few trial-and-error edits). We have left the filter open in main space too, because it has the most edits (since we do not have enough edits in our language in other namespaces) and this helps to study and contain false positives.

2)b Since the info is from a private filter, it won't be wise to mention the namespaces we exempt. At the same time, discussion and suggestions on various syntax options for including or avoiding namespaces are welcome, since they will help in updating help pages.

Thanks and Regards Mahitgar (talk) 13:54, 4 April 2013 (UTC)

Hi Mahitgar.
"1) The main problem we sought solution was at third line (!article_articleid)"
The documentation is here and especially here. According to the documentation, you can use := to assign a value or == to test a value.
This means: Is the article ID 64452?
article_articleid == 64452
This means: Is the article ID not 64452?
!(article_articleid == 64452)
The documentation doesn't mention = at all, so I don't know what it does. The documentation only talks about := and ==.
In regards to your question about resources, the documentation is Controlling efficiency. Does that answer your question?
2a: Ok, the filter as written is going to include all namespaces except "Special" and "Media." If that's what you want, great!
2b: I didn't realize before. You mean a curse word. Ok.
These are two existing filters that test for curse words:
  1. http://en.wikipedia.org/wiki/Special:AbuseFilter/380
  2. http://en.wikipedia.org/wiki/Special:AbuseFilter/384
There are several filters that search for curse words. You can search through the list if you want to find more. These filters generally use regular expressions.
Ok. I hope this helps.
Mattj2 (talk) 04:58, 5 April 2013 (UTC)
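The difference between negating first and comparing first can be modelled in Python. This sketch rests on an assumption that is not confirmed anywhere in this thread: that the filter parser applies ! to article_articleid before the comparison is made. The function names are hypothetical and the code only models the logic; it is not AbuseFilter syntax.

```python
RABIES_ARTICLE_ID = 64452  # the article ID quoted in the filter above

def negate_then_compare(article_id, target):
    # Models "!article_articleid = 64452" under the ASSUMPTION that the
    # negation is applied to the ID first and the result is then compared.
    return (not article_id) == target

def compare_then_negate(article_id, target):
    # Models "!(article_articleid == 64452)": compare first, then negate.
    return not (article_id == target)

# Under that assumption the first form is false for every real page ID,
# so a filter joined with & would never match anything.
always_false = [negate_then_compare(i, RABIES_ARTICLE_ID) for i in (1, 500, 64452)]

# The parenthesised form is true everywhere except on the exempted article.
exempts_only_rabies = [compare_then_negate(i, RABIES_ARTICLE_ID) for i in (1, 500, 64452)]
```

If the parser really behaves this way, it would be consistent with the report that the filter stopped catching the word everywhere rather than only on the exempted page, and the explicit parentheses would be the safer spelling either way.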

Hello, and thanks once again for all your effort, User:Mattj2. All your effort helps us to confirm that we are on the right track. Besides, with such info I keep building up help pages, so your effort has been very valuable.

About article_articleid == 64452: my observation has been that it calls in all the words from an article. Actually I need to confirm this by testing it once more, which I will do eventually. If it proves true it could make certain other work easier, but it would certainly need more testing.

Thanks and regards

Mahitgar (talk) 12:13, 5 April 2013 (UTC)

You're very welcome. I really hope that article_articleid doesn't look at every word on the whole page! That would be terrible if it did. Ok. Have fun. Mattj2 (talk) 06:46, 8 April 2013 (UTC)

"new article with no mention of title"

I patrol new pages, mostly through tags, and have noticed that this tag hasn't been applied to an article for quite some time. I think it's been years since I've seen it applied. I'm guessing the filter has an issue as I'm sure there have been plenty of articles created with no mention of the title. I'm not sure that this is the best place to report an issue but this was the best place I could find. Is there anything I can do to help check these filters (96 and 238) to see if there's an issue that leaves this page perpetually blank? OlYeller21Talktome 02:54, 29 March 2013 (UTC)

238 was deleted over 3 years ago because it simply used way too many resources. I'm not sure why 96 was disabled, I'll leave User:Sole Soul a note about it. Legoktm (talk) 02:59, 29 March 2013 (UTC)
Gotcha. Thanks for the help. It would be nice to have articles tagged if they don't mention the name in the article but I'm sure those articles are still being patrolled. OlYeller21Talktome 03:13, 29 March 2013 (UTC)
I disabled filter 96 because I thought it was not effectively achieving its stated purpose. Many new articles mention the name of their subjects and end up in the filter's log because of a typo or a misspelling. Sole Soul (talk) 05:18, 30 March 2013 (UTC)
That makes sense. Like I said, I'm sure those articles still get the attention they need, even if the NPP is running over a month behind. I just wanted to make sure that there wasn't an error causing the filter not to work. Thanks for helping me understand why these changes were made. OlYeller21Talktome 05:46, 31 March 2013 (UTC)

ASCII art / non-standard characters

Not sure which filter this is, but it probably needs a quick tweak because this got through it? Black Kite (talk) 13:39, 18 March 2013 (UTC)

Hi Black Kite. Thanks for reporting this.
The ASCII art filter is Filter 31. It's set to private so I can't view it, but I would guess that the fact that the edit used Unicode allowed it to evade that filter. (Should the ASCII filter be modified to be tripped by non-ASCII characters?)
I would like to see Filter 30 (Large deletion from article by new editors) be modified in response to this. As it stands, Filter 30 says:
edit_delta < -5000
I think it should be changed to:
added_lines >= 5000 || removed_lines >= 5000
Also the title of the filter should be changed to "Large addition or deletion..."
Thoughts?
Mattj2 (talk) 03:43, 4 April 2013 (UTC)
We already have such a filter but only for featured articles. I think a general "large addition" filter would generate a lot of false positives. Sole Soul (talk) 14:29, 6 April 2013 (UTC)
Ok. The first post here (from Black Kite) is about a user deleting the contents of an article and replacing it with a large amount of non-ASCII gibberish. Are there current filters that cover that case? Should there be? Mattj2 (talk) 06:52, 8 April 2013 (UTC)
There is the "Large non-English contributions" filter. Had it been an addition only edit (not replacing), it would have matched that filter. I will see if I can tweak the filter to include the replacing idea. Sole Soul (talk) 09:09, 9 April 2013 (UTC)
 Done Sole Soul (talk) 22:19, 18 April 2013 (UTC)

Upcoming celebration declarations

  • Task: Any future date (this should be possible with mathematical calculation) + added_lines rlike "event is organised". Applicable to "article namespace", "!autoconfirmed" users.


  • Reason: Novice users often post information (advertise) about an upcoming celebration or get-together event. Most of the time it is likely unencyclopedic. I don't know about enwiki, but small wikis do face this. If it proves useful on one or two small wikis, it may be added to the meta shared list of globally useful filters.

- Mahitgar (talk) 03:34, 22 April 2013 (UTC)

New feature requests made at bugzilla

Hi,

Based on my experience at our local wiki, I have made some feature requests at Bugzilla through the following bugs:


Thanks and Regards

Mahitgar (talk) 05:21, 22 April 2013 (UTC)

Enforcing User:Alan Liefting's ban by filter

He is restricted from "any category-related edits outside of mainspace". Could a filter be constructed to prevent him from doing that? As I see it, he sometimes "forgets" his restriction, and would be unlikely to try to work around it. But I'm not sure how to write the filter, even if there were agreement that it is a good thing. — Arthur Rubin (talk) 01:57, 23 April 2013 (UTC)

It should be easy enough. If you get consensus for it, then I can implement it. -- King of ♠ 02:09, 23 April 2013 (UTC)
Sounds like a great idea to me! Kumioko (talk) 02:18, 23 April 2013 (UTC)


No, thank you. Creating edit filters to enforce topic bans would entail an unnecessary drain on server resources, increasing edit processing times and overflowing the condition limit. If Alan Liefting would like a software warning about edits that potentially violate his topic ban, he can add them to his local javascript page. Reaper Eternal (talk) 00:58, 24 April 2013 (UTC)

If (user=Alan Liefting) is the first condition, it wouldn't be a significant drain on server resources or edit processing times. As I said over at ANI, this would only be appropriate if the user would be unlikely to try to bypass the filter, and if the filter would be relatively simple. As an aside, how could it be done by a private javascript? — Arthur Rubin (talk) 02:17, 24 April 2013 (UTC)
A separate JS would be much much harder to craft but it might take less resources. Anyways, folks seem to care more about making sure a sanction is followed than whether the edits were useful or not. So that to me speaks volumes about the direction the project is going...not in a good way. Kumioko (talk) 02:27, 24 April 2013 (UTC)
This is off-topic here, but Alan's sanctions were expanded because, when restricted from making problematic edits of one sort, he started making problematic edits of a different sort, all related to categories. Enough of his article-space category edits are sensible that there wasn't a consensus to restrict those, although, in samples, I've found about 10% to be problematic. The current sanctions (seem to me to be) an attempt to produce an objective standard, so that any editor should be able to comprehend whether an edit is restricted. I'm not sure the attempt was successful. — Arthur Rubin (talk) 02:56, 24 April 2013 (UTC)
It appears that there is consensus against doing this at WP:ANI, at least partially due to the use of server resources, and partially due to it being a "technological solution for sociological issue", so I must withdraw the request. (It still only costs one condition for all edits other than Alan's, but it may cost significant resources for each of Alan's edits.) If Mahitgar has some questions related to a similar proposal, it might be best if he/she opened a separate section. — Arthur Rubin (talk) 02:17, 25 April 2013 (UTC)
Shifting my questions to separate i.e. next section as you rightly said my questions are not directly related to any specific issue but more of general discussion Mahitgar (talk) 10:16, 25 April 2013 (UTC)

Technical aspects about limited restriction through filters

  • Underlined Point 1 (This 1st point is resolved out of total 6 points)

@User:King of Hearts, could you please provide an example of "timestamp < someunixtimevalue" to use in such filters? I suppose this parameter is used to assign a time limit so that the kind of filters you are discussing here lapse automatically. It seems the related filters on en-wiki are private and I don't have access. If you provide me with an example, we can save time on trial and error at our local wiki. Regards

Mahitgar (talk) 15:19, 23 April 2013 (UTC)

Let's say you want to have a filter expire at 00:00, May 1, 2013 (UTC). Then you convert it using this tool, and so you would use timestamp < 1367366400. -- King of ♠ 06:19, 26 April 2013 (UTC)
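For anyone without the conversion tool at hand, the same number can be derived with Python's standard library. This is a minimal sketch; the helper names expiry_timestamp and filter_still_active are made up for illustration and only model the comparison that "timestamp < 1367366400" performs.

```python
import calendar

def expiry_timestamp(year, month, day):
    """Unix time for 00:00:00 UTC on the given date."""
    # calendar.timegm interprets the tuple as UTC, unlike time.mktime,
    # which would use the local time zone of the machine.
    return calendar.timegm((year, month, day, 0, 0, 0))

def filter_still_active(edit_timestamp, expiry):
    """Models the filter condition 'timestamp < expiry'."""
    return edit_timestamp < expiry

MAY_1_2013 = expiry_timestamp(2013, 5, 1)  # 1367366400, as quoted above
```

An edit made one second before the expiry still satisfies the condition, and an edit made exactly at the expiry time does not, so the filter stops matching from 00:00 UTC onwards.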
@User:King of Hearts Thanks, very nice of you. Experimentation helps in technology in this or that area. Thanks and best wishes. Mahitgar (talk) 11:22, 26 April 2013 (UTC)
1) My query is to understand further usage examples of timestamp; for reference, an old related fixed bug is bug no. 18246. I suppose we should be able to use timestamp in relation to some other conditions too.
2) @User:Reaper Eternal, I respect your concerns on the subject under discussion, but I do have some queries on which I would be happy to have your guidance (basically, that will help me understand and learn).
  • Underlined Point 2
2a Just want to know, your reservation to topic restrictions through edit filter is due to likely use of variable "old_wikitext" (which likely to consume more server resources) ; But in a case where restriction need not be very stringent on a user,whether restrictions through variable added_lines or removed_lines can still be used.
2aa here if we restrict the user from editing large size article (With help of variable old_size) and allow only to edit only small size article , it allows user to keep working on small size articles and we can restrict the said user from editing restricted category pages too (out of small size articles). Thus we save resources.
  • Underlined Point 3
2b In case of small wikis where number of articles/pages in a given namespace & number of edits per day are very less is it ok to use "old_wikitext" variable.
  • Underlined Point 4
2c What is your view of other types of limited purpose restrictions (i.e.edit restrictions to specific page &or namespace) with variable added_lines (not old_wikitext)
Thanks and warm regards
Mahitgar (talk) 08:53, 24 April 2013 (UTC)
This thread seems to have moved to the wider issue of what is involved in implementing a generic filter that would prevent a group of users under restriction from editing articles that belong to specified categories. The first matter to be cleared up is whether different Wikis would find such a filter useful. As background, the matter was discussed here on the English Wikipedia. The proposal there to implement such a filter was rapidly dismissed on two grounds. The first ground was that such a filter would have unacceptable time overheads, since scanning a list of restricted editors would have to precede every edit on the wiki. The second ground was that implementing such a filter provides a "technological solution for a sociological issue". What was meant by that, as expressed clearly in the opposing votes, is that there is a strong preference for the moralising and punishment that occurs in the absence of such filtering.
The first ground for dismissal should be a matter of fact rather than uninformed opinion. What would be the time overhead for an optimally implemented scan of a list of say 20 comma delimited restricted user names averaging 12 characters each on a typical web processor? That is, how long would it take to scan a 260 character string for a 12 character target? Would it not be a matter of microseconds or less, rather than milliseconds?
The second ground for dismissal is the sociological one. Some wikis might prefer sociological alternatives, such as the punishment and moralising the English Wikipedia currently advocates. There is no point in implementing something if no one wants to use it. So the question is, do some other wikis value rationality and might admins on such wikis see the advantages of such a filter? Or will all wikis prefer the mob entertainments of the sociological approach? There is no point in implementing such a filter if no one is going to use it. --Epipelagic (talk) 23:33, 26 April 2013 (UTC)
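The overhead question in the comment above is easy to measure rather than guess. The following Python sketch times a plain substring scan over a comma-delimited list of 20 made-up 12-character names. It says nothing about how the AbuseFilter extension itself performs; the names, helper functions, and repeat count are all assumptions for illustration, and no timing result is claimed here.

```python
import timeit

def build_restricted_list(count=20, name_length=12):
    """Build a comma-delimited string of made-up user names."""
    width = name_length - len("user")
    names = [f"user{i:0{width}d}" for i in range(count)]
    return ",".join(names)

def average_lookup_seconds(haystack, target, repeats=100_000):
    """Average wall-clock time of one substring scan, in seconds."""
    total = timeit.timeit(lambda: target in haystack, number=repeats)
    return total / repeats

restricted = build_restricted_list()      # 20 names of 12 chars plus 19 commas
worst_case_target = "user00000019"        # last name in the list
seconds = average_lookup_seconds(restricted, worst_case_target, repeats=10_000)
```

Note that a bare substring test would also match a name that is a fragment of a longer one, so a real check would need delimiters or an exact comparison; the sketch is only meant to give an order of magnitude on whatever machine runs it.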
I urge anyone wondering about the veracity of the above summary to read the discussions where the obvious will quickly be confirmed—the comments about "punishment and moralising" completely miss the point and are nonsense. Johnuniq (talk) 23:47, 26 April 2013 (UTC)
Well, this is getting off topic. But what other way could it possibly be interpreted? Anyway, for the purposes of this thread, it is sufficient to notice that some people prefer "sociological" solutions to ones that are drama free. --Epipelagic (talk) 00:38, 27 April 2013 (UTC)
@Epipelagic Thanks for your input. (I suppose) you probably came to this page at a later stage. The purpose of this thread is purely technical. For that reason, to avoid mixing up the issues and drifting of the discussion, the thread was forked out to be a purely technical one.
(I suppose) whether applying 'technology to a sociological issue' is right or wrong, good or bad, is for Wikipedia:Village pump (policy) to decide. In that role, an edit filter manager is generally expected to maintain neutrality. What are the technical challenges? Are resources and methods available, and if so at what cost? If the cost is too high by one method, can one devise an alternative method? This is the technical role of an edit filter manager.
A pure technocrat/scientist is open-minded; to him a challenge is a challenge. Many times his efforts do not pay off in the intended area, but experimentation and innovation may benefit other areas. So a good technocrat continues his technical innovation, experimentation and discussion, irrespective of whether a certain thing will really be used by the community or not. A technocrat may change his priorities for the sake of community benefit, but his own quest to take up challenges and find solutions does not end. (Once something becomes technically feasible, it is for the community to use it or not.)
Since you took up the subject, let me clarify: the other wiki communities are independent. They have their own traditions and policies, and what is discussed on en-wiki has no direct bearing on the hundreds of other wikis. If something is technically feasible and a given wiki wants to go ahead, it will go ahead; it is only that, as sister projects, other projects refer to these discussions in a limited manner. There are hundreds of wikis on and off Wikimedia, and they might use some technology. It is not necessary that every technology is implemented on English Wikipedia first. Others may introduce certain technologies, and English Wikipedia can follow if it wants.
I wish and request that we proceed and focus on the original technical discussion in this thread.
Mahitgar (talk) 03:03, 27 April 2013 (UTC)


Forking out to focus on technical aspect Mahitgar (talk) 05:48, 27 April 2013 (UTC)
  • Underlined Point 5
  • About 'time overheads', user Epipelagic says "What would be the time overhead for an optimally implemented scan of a list of say 20 comma delimited restricted user names averaging 12 characters each on a typical web processor? That is, how long would it take to scan a 260 character string for a 12 character target? Would it not be a matter of microseconds or less, rather than milliseconds? " (Besides, user Epipelagic opines that what he stated is technically a matter of fact, and that thinking otherwise would be an uninformed opinion.)
    • User Arthur Rubin says in previous thread If (user=abc) is the first condition, it wouldn't be a significant drain on server resources or edit processing times.
(I suppose) User Arthur Rubin and User:Epipelagic actually wanted to refer to variable user_name rlike "abcd|efg|hijk"
    • "You should always order your filters so that the condition that will knock out the largest number of edits is first. Usually this is a user groups"
In light of above, Can we generally agree with this argument of User Arthur Rubin and User:Epipelagic that if user is the first condition operation is inexpensive.
Mahitgar (talk) 06:16, 27 April 2013 (UTC) Mahitgar (talk) 07:00, 27 April 2013 (UTC)
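The ordering argument quoted above can be illustrated with a small Python model. It assumes that conditions joined with & are evaluated left to right and that evaluation stops at the first false one, which is what the quoted guidance implies but is not verified here. The user names, condition functions, and edit dictionaries are hypothetical.

```python
RESTRICTED_USERS = {"ExampleUserA", "ExampleUserB"}  # hypothetical names

def is_restricted_user(edit):
    return edit["user_name"] in RESTRICTED_USERS

def is_outside_mainspace(edit):
    return edit["namespace"] != 0

def touches_category(edit):
    return "[[Category:" in edit["added_lines"]

def evaluate_filter(conditions, edit):
    """Return (matched, number_of_conditions_evaluated) with short-circuiting."""
    evaluated = 0
    for condition in conditions:
        evaluated += 1
        if not condition(edit):
            return False, evaluated  # stop at the first failing condition
    return True, evaluated

# Cheapest and most selective check first, as the quoted guidance suggests.
CONDITIONS = [is_restricted_user, is_outside_mainspace, touches_category]

ordinary_edit = {"user_name": "SomeoneElse", "namespace": 4,
                 "added_lines": "[[Category:Example]]"}
restricted_edit = {"user_name": "ExampleUserA", "namespace": 4,
                   "added_lines": "[[Category:Example]]"}
```

In this model an edit by anyone outside the restricted set costs exactly one condition, while only the restricted user's own edits reach the more expensive text checks.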
  • Underlined Point 6
  • In Special:Categories software is already scanning and noting down various aspects of categories. So here edit filter is supposed to just share the existing available resources , hence this aspect is not asking for any significant resources
en Wiki seems to have three category related edit filters (namely 117,132,351).These three filters primarily do not show any significant consumption of resources.Please do discuss this technical aspect.
Mahitgar (talk) 07:24, 27 April 2013 (UTC)

Is there a list? And if so, why no link?

Where's the list? I'm trying to help a guy out at Teahouse who apparently triggered edit filter #139. What the heck is that?

There's a list at Special:AbuseFilter. 139 is "Fixed position vandalism": [31] But, yes, it might be helpful to have a few extra links &c to help people find their way round. bobrayner (talk) 13:39, 2 May 2013 (UTC)
Dear friend, I am not an edit filter manager. This page is on my watchlist so that I can follow some other technical discussion. Since you perhaps forgot to sign on this talk page, and your message is difficult to understand, I got curious and wanted to see to what extent I can be of help to you and the edit filter managers.

>>Is there a list?<<

Which list are you looking for? Special:AbuseFilter contains the list of edit filters, and the edit filter you are referring to is Special:AbuseFilter/139. (Special:AbuseFilter and Special:AbuseFilter/139 are clearly shown, so I did not get what you mean by 'why no link?'; maybe an edit filter manager understands this better.) And if you are looking for the right page to post a false positive report, it is Wikipedia:Edit filter/False positives.
For edit filter managers: This user is probably referring to this edit [32]. Whether the edit is right or wrong, and whether it is a false positive, is for the concerned edit filter managers to judge. What I mainly found is that the user who made the edit was not autoconfirmed at that moment, so as far as the filter is concerned it seems to have done its job. The rest, as said, is for edit filter managers.
Sorry if I made any mistake while providing unasked-for help.
Mahitgar (talk) 13:54, 2 May 2013 (UTC)

Efficient use of regex

I just found the function (or whatever) rlike for checking regex patterns. It seems to attempt to match the entire string only, not a part of the string. So how would one match just a part of a string with regex? The dot (.) doesn't seem to represent an arbitrary character either ('a.' rlike 'as' yields False), like the article on Perl Compatible Regular Expressions seems to indicate. Is there any documentation available for the regex used in this extension? --Njardarlogar (talk) 13:29, 13 May 2013 (UTC)

AbuseFilter internally uses PHP's preg_match function for rlike (and similar functions). Documentation on that can be found on PHP.net: [33] - Hoo man (talk) 19:27, 13 May 2013 (UTC)
Use 'as' rlike 'a.' instead. -- zzuuzz (talk) 19:31, 13 May 2013 (UTC)
Oh, no. I'll blame the screw-up on my influenza. Thanks, I did not consider the possibility that the order could matter... --Njardarlogar (talk) 19:38, 13 May 2013 (UTC)
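The operand-order mix-up above is easy to reproduce outside the extension. This Python sketch models rlike as "subject on the left, pattern on the right" using re.search, chosen because the reply above says rlike is built on PHP's preg_match, which also finds a match anywhere in the subject. The helper name rlike is borrowed for readability only; this is a model, not the extension's code.

```python
import re

def rlike(subject, pattern):
    """Model of 'subject rlike pattern': True if pattern matches anywhere."""
    return re.search(pattern, subject) is not None

# Subject 'as', pattern 'a.': the dot matches the 's'.
correct_order = rlike("as", "a.")

# Operands swapped: the pattern 'as' is searched for in the text 'a.',
# which contains no 's', so there is no match.
swapped_order = rlike("a.", "as")

# A pattern also matches when it covers only part of a longer string.
partial_match = rlike("this has been tested", "a.")
```

So the dot does behave as a wildcard and partial matches do work; the original test only failed because the pattern and the subject were on the wrong sides of the operator.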

How to use 'action' variables 'upload' and 'delete'

According to mw:Extension:AbuseFilter/RulesFormat

 action='upload' 

and

 action='delete' 

are supposed to be valid actions. But I could not get them to work in my tests. Any clues, please?

Mahitgar (talk) 15:44, 2 June 2013 (UTC)

You do not have AbuseFilter rights here, so you cannot make a filter here. Are you trying to create a filter on a test wiki? If so, which one? -- King of ♠ 00:15, 3 June 2013 (UTC)

Engagement in Tool

Hello all; As a non-privileged user (who is not likely to become an abusefilter-manager) I was wondering if one of the more experienced managers could help me understand how to get engaged in the process. I'm currently wetting my feet with the syntax at test.Wikipedia, but want to eventually bring those skills where they matter (i.e, here). Fortunately for the project, the system as it stands is designed not to break things, but unfortunately for a newbie, all the contentious edits that do trigger a filter as vandalism, are often marked as private where I cannot interact with them. I'm looking for a way more than "Chat with us on the talk page mate!" to contribute, but am not requesting the flag. Look forward to hearing from you! Cheers! -TIM(Contact)/(Contribs) 11:51, 12 June 2013 (UTC)

I do manage filters for some other wiki and not for en-wiki.Still want to get into this discussion, sorry if you were not expecting me to join.


1) Based on experience from filter management, I have filed some bugs to seek more community participation in open filters.
Usually, in warnings, we provide links to the home page of edit filter management. There one can refer to open filters but has to search manually through the list of filters. At bug 47494 I requested an enhancement whereby the larger community can see only open filters; bug no. 45195 has a similar purpose. Probably my bugs are not well understood by the Bugzilla tech community, or maybe I failed to explain my points properly.
2) What you seem to be expecting is something more than point no. 1. I would like to understand what you are expecting in >>I'm looking for a way more than "Chat with us on the talk page mate!" to contribute<<. If you have any specific ideas in mind, please do elaborate.
Warm regards
Mahitgar (talk) 03:07, 14 June 2013 (UTC)

New edit filter suggestion

I think it would be useful to create a temporary filter to view the edits done by Visual editor, particularly the ones done by new accounts. There are still some significant problems with it, and if they release it to 50% of new accounts today as they have been advertising, it could cause a spike in the errors introduced to articles and in formatting problems. Kumioko (talk) 16:59, 18 June 2013 (UTC)

So basically you want this? ;) Legoktm (talk) 19:44, 18 June 2013 (UTC)
Oh yeah thanks that's it. I learned 2 things today.:-) Kumioko (talk) 19:54, 18 June 2013 (UTC)

Exempting bots from filters

Would it be possible to exempt bots from Filter 167? We are having problems with archiving bots not being able to create new archives. Mdann52 (talk) 08:37, 24 June 2013 (UTC)

 Done. Legoktm (talk) 15:45, 24 June 2013 (UTC)

Problems with count, rcount and regex

I've been getting weird results with count, rcount and regex. I'll show tests below against this blocked edit - which is an edit containing the letter e multiple times.

Part 1: rcount can't count?

Count and rcount (in many examples that I've tested) evaluate to exactly 1 if there is a match, and to minus infinity (or at least a large negative number) if there is no match.

E.g. I test the simple filter:

rcount("e" in added_lines) == 1

This reports "The filter matched this change", when I expect it to not match. The count should be much higher than 1. ">1" or any other comparison I've tried fails to match.

Now I test for a string which is not in the added lines:

rcount("this is a test string blah blah blah" in added_lines) == 1

That matches, but it shouldn't. "0" or any other comparison I tried fails to match.

Also weird:

rcount("e" in "foo") 

...which matches, but shouldn't.

count gives similar results when I've tried it.

Part 2: regex:

Now to try regex for this string which is not found in added lines, testing against the same edit... at first it works correctly:

added_lines regex "this is a test string blah blah blah" 

"The filter did not match this change." Working correctly, no problem.

Using ! for NOT:

!added_lines regex "this is a test string blah blah blah" 

"The filter matched this change." Again, working correctly.

Then it gets weird

added_lines regex "this is a test string blah blah blah" == 0

This reports "The filter did not match this change.". Problem! I expect the first part of the expression to be false, and therefore the whole expression should match.

added_lines regex "this is a test string blah blah blah" < -10000000000000000000000000000000000000

This reports "The filter matched this change". False is somehow given a large negative number.

Can anyone help explain this to me? I've written a filter based on rcount, and on the idea that false is 0 and true is 1 - it's not working, and I came across these anomalies while trying to debug it. --Chriswaterguy talk 08:39, 1 July 2013 (UTC)

Well, for starters, the expression rcount("e" in added_lines) isn't giving the appropriate arguments to rcount. Rcount takes two arguments: the regex and the string with which to compare the regex. (Usage: rcount(string regex, string haystack).)
Secondly, you cannot perform a boolean operation in PHP (to the best of my knowledge) between a boolean and an integer as you can in C. (C does not have the boolean type; the boolean is simply an integer.) For example: ("asdasdasdasdasdffff" rlike "asdd") = 0 is not the same as ("asdasdasdasdasdffff" rlike "asdd") = false. The first expression, rlike, evaluates to 'false' in this case, and then a compare is done between 'false' and '0', which evaluates to 'false'.
Finally, "-10000000000000000000000000000000000000" is an integer underflow in even 64-bit systems.
I hope this helps. Reaper Eternal (talk) 12:27, 1 July 2013 (UTC)
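To make the two-argument form concrete, here is a minimal Python sketch of what a call of the shape rcount(regex, haystack) is described as doing above: counting regex matches in a string. The name rcount_model and the sample text are made up for illustration, and Python's re module is only standing in for the filter's own regex engine, so this models the usage, not the AbuseFilter implementation.

```python
import re


def rcount_model(regex: str, haystack: str) -> int:
    """Hypothetical stand-in for rcount(regex, haystack).

    Counts the non-overlapping matches of `regex` in `haystack`.
    """
    return len(re.findall(regex, haystack))


# Made-up sample standing in for added_lines.
added_lines = "The letter e appears here several times."

# Two separate arguments: the pattern first, then the text to search.
e_count = rcount_model("e", added_lines)
missing_count = rcount_model("this is a test string blah blah blah", added_lines)
```

With two arguments the count can be compared against a number in the usual way (for example, more than 1 for the letter e, and 0 for a string that is absent).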
Thanks - that's a huge help.
Is it reliable to use true and false as 1 and 0 in calculations? It seems to work (E.g. true+false==1 seems to evaluate true for any edit, based on initial testing.) Does using a boolean in a calculation automatically convert it to an integer? --Chriswaterguy talk 01:55, 3 July 2013 (UTC)
I did these tests:
  • added_lines regex "this is a test string blah blah blah" < -10000000000000000000000000000000000000 (matches - i.e. integer underflow)
  • added_lines regex "this is a test string blah blah blah" + 0 == 0 (matches - which is mathematically correct) and:
  • added_lines regex "this is a test string blah blah blah" * 1 == 0 (matches - which is mathematically correct)
So it looks like we can use a statement's true or false value as 1 or 0, respectively, in a calculation - and then use that in a comparison (<, > or ==). But if we don't do any mathematical operation, then as you pointed out, it's a different data type (boolean rather than integer or floating point) and we don't get a meaningful result.
Is that correct? Thanks. --Chriswaterguy talk 06:45, 4 July 2013 (UTC)
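The idea of turning each true/false result into 1 or 0 before doing arithmetic can be sketched in Python as follows. This is only an illustration of the approach: Python has its own rules for mixing booleans and integers, which are not AbuseFilter's, so the conversion is written out explicitly with int(). The helper names and sample patterns are hypothetical.

```python
import re
from typing import List


def matches(pattern: str, text: str) -> bool:
    """Return True if the regex `pattern` is found anywhere in `text`."""
    return re.search(pattern, text) is not None


def score(text: str, patterns: List[str]) -> int:
    """Count how many patterns match, converting each boolean to 1 or 0 explicitly."""
    return sum(int(matches(p, text)) for p in patterns)


# Made-up sample: two of the three patterns are present.
sample_score = score("foo bar", ["foo", "baz", "bar"])
```

The resulting integer can then be used in a comparison such as "at least two patterns matched".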

Request for Interpretation

In the uppermost row of Special:AbuseFilter:

What does "reached the condition limit of 1,000" mean?
In the Zhwiki's AbuseFilter, I encountered a problem: AbuseFilter cannot block edits that match filter rules. At the same time, the number of actions that reached the condition limit of 1,000 on Zhwiki is much higher than on Enwiki; see zh:Special:防滥用过滤器. I do not know why it can be more than 1,000. Therefore, I suspect that the blocking problem is due to this number being too high. WHO CAN HELP ME?乌拉跨氪 (talk) 07:49, 6 June 2013 (UTC)
The first number should be kept below either 2% or 5%, or filters will start to be disabled. I can't find the precise setting but I've always thought it's 2% on this wiki. You can achieve this by disabling filters, combining them, or improving their logic to reduce the number of conditions. For a start, you can remove action==edit from most filters. Then for each filter make the first condition the one which removes the greatest number of edits from the rest of the filter. -- zzuuzz (talk) 09:07, 6 June 2013 (UTC)
You can also add conditions that will remove many edits with a minimal consumption of server resources. (For example, !("confirmed" in user_groups) will eliminate the need to run the rest of the filter on any established users.) With regards to edits matching the filter and not being caught, that is likely because those actions reached the condition limit before reaching the filter that they matched. When an action takes 1000 conditions, it stops being processed by the edit filter. Reaper Eternal (talk) 10:32, 6 June 2013 (UTC)
For filters in the 'default' group, the value is indeed 5% (the 'feedback' group has a higher limit of 20%). This is defined by the wmgAbuseFilterEmergencyDisableThreshold variable at InitialiseSettings.php. Helder 13:51, 13 July 2013 (UTC)
I am still not very clear about how the limit of 1,000 is made up. Each time a filter tries to match an edit and fails, is that counted as one condition? When that has happened 1,000 times, is this filter (or are all filters?) disabled? Are the number of actions that reached the condition limit and the number that matched one of the filters perhaps unrelated?
For zh:Special:防滥用过滤器/140, what can I do to improve efficiency and reduce conditions?

乌拉跨氪 (talk) 11:59, 6 June 2013 (UTC)


The suggestions given above are good. Still, here are my opinions/suggestions on the problem mentioned above. (My assumptions are general observations and may not be accurate; please correct me where I am wrong.)
  • My observation is that if an individual user's actions trip a filter in more than 5% of his (the user's) total actions, the filter also gets deactivated. (A related notice is visible at times on individual filters.)
My observation is that this happens especially on filters that serve warning messages. So, for filters where serving a warning on the very first trip is not essential, it might help to use the rate-limit throttle to postpone the warning, and to optimise the throttle period in seconds (or reduce it if it is too long).
  • I assume that if a filter consumes more conditions it will also consume more run time, so the run time of an individual filter may be a good indicator. So if, on my routine visits, I observe a high run time for a filter, I usually disable the filter temporarily until I have studied, improved and tested its parameters to optimise run time and condition consumption.
The above suggestions are not specific to filter 140 of zh-wiki, since I could not see the filter, it being private. Please do correct me where I am wrong in the above assumptions and observations.
I also feel that a little more clarity on the following would help. Queries that come to my mind:
  • For "Of the last 5,440 actions", how do we know the starting point from which these actions are counted?
  • Does the limit of 1,000 conditions include (total) the conditions consumed by all the filters?
  • If not exactly, what is an approximate and easy way to understand how conditions are counted?
Mahitgar (talk) 15:14, 6 June 2013 (UTC)
Sorry, zh:Special:防滥用过滤器/21 is open. Could you give me some advice? Strangely, the condition counts of Special:AbuseFilter/520 and Special:AbuseFilter/473 are very high, in the thousands, but why has this not affected all the filters?乌拉跨氪 (talk) 18:30, 6 June 2013 (UTC)
Sorry, my assumption of a relation between run time and consumption of conditions seems to be wrong. I did not understand the need for the parameter "action==edit" in your filter 21; I am testing the same on our (mr) wiki too and will keep you updated.
I am also curious to know more about en-wiki filters 473 and 520, which you mentioned.
Mahitgar (talk) 13:49, 7 June 2013 (UTC)

This is zhwiki filter #21:

action == "edit" & 
!("autoconfirmed" in user_groups) & 
!("bot" in user_groups)
& (article_namespace == 6)
& !(user_name in article_recent_contributors)
& (removed_lines rlike "\{\{.*\}\}")
& !(removed_lines in added_lines)

Firstly, I would remove the !("bot" in user_groups) check, since your bot accounts are probably going to be autoconfirmed due to mass editing. I'd also move the namespace check to the front, since that will filter out far more edits than the action == "edit" check. I can't help too much since I can't read Chinese and thus have no clue what this filter is supposed to be doing. This leaves us with the slightly more optimized filter:

(article_namespace == 6) &
!("autoconfirmed" in user_groups) & 
(action == "edit") & 
!(user_name in article_recent_contributors) &
(removed_lines rlike "\{\{.*\}\}") &
!(removed_lines in added_lines)

Reaper Eternal (talk) 14:35, 7 June 2013 (UTC)
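The effect of putting the most selective check first can be illustrated with a toy Python model. It assumes that conditions are evaluated left to right, that evaluation stops at the first false one, and that each evaluated condition costs one unit; whether AbuseFilter counts conditions in exactly this way is an assumption here, and the sample edit and helper names are made up.

```python
from typing import Callable, Dict, List, Tuple

Edit = Dict[str, object]
Condition = Callable[[Edit], bool]


def run_filter(conditions: List[Condition], edit: Edit) -> Tuple[bool, int]:
    """Evaluate conditions in order, stopping at the first false one.

    Returns (matched, number_of_conditions_evaluated).
    """
    used = 0
    for cond in conditions:
        used += 1
        if not cond(edit):
            return False, used
    return True, used


def is_edit(e: Edit) -> bool:
    return e["action"] == "edit"


def not_autoconfirmed(e: Edit) -> bool:
    return "autoconfirmed" not in e["user_groups"]


def in_file_namespace(e: Edit) -> bool:
    return e["namespace"] == 6


original_order = [is_edit, not_autoconfirmed, in_file_namespace]
reordered = [in_file_namespace, not_autoconfirmed, is_edit]

# Made-up sample: a new user editing an ordinary article (namespace 0).
sample = {"action": "edit", "user_groups": ["*", "user"], "namespace": 0}

cost_original = run_filter(original_order, sample)[1]
cost_reordered = run_filter(reordered, sample)[1]
```

In this model, an edit outside namespace 6 is rejected after one condition in the reordered version, whereas the original order spends conditions on checks that almost every edit passes.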

Thanks to all, this discussion is really helpful and productive. By the way, I guess zhwiki filter #21 is based on, or similar to, the March 2009 version of enwiki filter #59.
Mahitgar (talk) 04:43, 8 June 2013 (UTC)
Thanks for your patient answers. Zhwiki's filter problem is too severe to be fixed by altering one or two filters. If anyone wants to help us alter zhwiki's filters, please tell me. 乌拉跨氪 (talk) 12:49, 9 June 2013 (UTC)

List of contributions

Hi folks,

How can I create a template that lists contributions by a specific user just on abuse filters?

We use hu:Template:Adminlista-elem to list special admin activities, such as log pages and editing MediaWiki namespace. I want to enhance it with abuse filter modifications. Bináris (talk) 07:43, 6 July 2013 (UTC)

Edit filter and VE

In case people haven't heard, VisualEditor does not play well with the edit filter. At present any edit done with VE that should trigger either warn or disallow will result in a fatal error: "Error: Hit AbuseFilter: [name of triggered filter]". This means that warn and disallow are functionally the same for VE right now. In addition, the custom error messages designed for the edit filter will not be shown when editing with VE. Dragons flight (talk) 05:56, 16 July 2013 (UTC)

Yeah, I think I reported this on one of the many VE pages. Thanks though. Reaper Eternal (talk) 10:23, 22 July 2013 (UTC)
just noting ... that bug has been fixed. Has anyone stress tested it? e.g. Does throttling work well with VE? John Vandenberg (chat) 14:51, 26 July 2013 (UTC)

Filter 527

I'm sad to see that there is not much traffic on this talk page... lack of camaraderie amongst filter managers, perhaps? :( Anyway, does anyone have any clue what to make of the filter 527 log? I doubt anyone is patrolling it, because it is fairly meaningless. Does anyone have any idea how the createaccount filters work? — This, that and the other (talk) 09:50, 22 July 2013 (UTC)

Abusefilter bug on mobile version

Hello,

There are currently many false positives for filters that trigger on large deletions (I just opened a bug report: bugzilla:52062).

For instance [34] [35] [36] [37] on filter 30 (which has more than 500 detections today, while the number of detections / day is usually around 100).

Until this is fixed, you might want to disable warnings for this filter, and similar filters if you have some.

Kind regards,

Orlodrim (talk) 22:18, 25 July 2013 (UTC)

Someone (else) with SQL access should do an impact analysis - how many edit filters use 'new_wikitext', and do any of them auto-block or throttle the user/IP? (I'd do it, except it's past my bedtime.) John Vandenberg (chat) 14:57, 26 July 2013 (UTC)

Filter to detect pasted refs

Do we have a filter that looks for [1], [2], etc., which indicates refs were pasted from one Wikipedia page to another? If not, I'll propose one in the normal fashion - I'm guessing it might be a perf problem. John Vandenberg (chat) 15:00, 26 July 2013 (UTC)

Filter 164 looks for similar text, [edit], I suppose.
Mahitgar (talk) 10:20, 31 July 2013 (UTC)

Filter for "text added after categories and interwiki" is not working

In a discussion about a similar filter on Portuguese Wikipedia, we noticed that your filter which detects "text added after categories and interwiki" hasn't had any results since February 2013 (we copied the English version and it stopped detecting edits until we reverted to our old version). Helder 19:52, 1 September 2013 (UTC)

Edit Filter bug (possibly VE)?

Hey, all, during NPP, I came across this article. It's obviously spam, so I nominated it for deletion under G11. At nearly the same time, the author added categories to the article with VE. For some reason, their edit tripped the "removing speedy deletion tags" edit filter, even though it didn't remove a speedy deletion tag, and I can't see any obvious reason (though I'm not a regex expert, a regexpert if you will) why it would have tripped the filter. I see from the above that VE is causing some issues with the edit filter; is causing false positives like this a known issue? Writ Keeper (WK to move) 14:30, 5 September 2013 (UTC)

Well they did try to remove the template, as can be seen here. I'm guessing that the edit filter carries over tags like that to a future edit which actually is saved, even if an earlier one wasn't saved. I don't know if this is new behavior or if it's always been like that. Soap 14:33, 5 September 2013 (UTC)
From a first look, the edit filter has done its job perfectly. The edit filter may act even on an action which is not necessarily a save action. If you know for certain that the edit was a VE edit, and the template was not intended to be removed but still got removed, then a VE bug may be responsible and you would need to discuss it on the VE feedback page.
Mahitgar (talk) 15:20, 6 September 2013 (UTC)
No, the point is that the edit didn't remove the speedy deletion template, but was still tagged as having removed the template by the edit filter. Writ Keeper (WK to move) 18:06, 6 September 2013 (UTC)
My hypothesis is that there was an edit conflict, and the filter tripped before the edit conflict check tripped. It should probably be reversed so that edit filters don't run until after the edit conflict check. Jackmcbarn (talk) 18:26, 6 September 2013 (UTC)
Being a non-admin on en-wiki I cannot cross-check deleted page edits; still, I would doubt both hypotheses. The reason: en-wiki filter 29 is more than 4 years old with more than 97000 hits, and if it were a bug on the edit filter side there would be more instances to point to.
Anyway, is it possible to provide 'Steps to reproduce the bug'? This will make the problem easier to understand, and we can test it at our local wiki too. Mahitgar (talk) 02:37, 7 September 2013 (UTC)

Why is this filter hidden? If it only throttles pagemoves, it shouldn't be hidden, so I'll presume it's because it contains something like HAGGER specifics...? Ginsuloft (talk) 23:59, 18 October 2013 (UTC)

This came up before (Wikipedia:Edit filter/Requested#Change some filters to public), though I'm not entirely satisfied with the answer there. Jackmcbarn (talk) 17:07, 19 October 2013 (UTC)
I see. I noticed Reaper Eternal made one of those public (the "your mom" one). From what it looks, filter 68 would have been made public by now, so it probably contains long-term abuse pattern detection. I agree that many filters are probably still unnecessarily hidden (otherwise we'd have a thorough explanation for the hiding of every filter listed on the page you linked), but let the elitists have what they want; if they don't want non-admins to patrol the logs, then we don't have to do so. The logs probably aren't all that interesting anyway, or don't get many hits, therefore no need for patrollers. Ginsuloft (talk) 23:20, 19 October 2013 (UTC)
The reason it's hidden is because we don't want vandals gaming the number of page moves that need to be done to trip this filter. The logs aren't very interesting either. Legoktm (talk) 23:44, 19 October 2013 (UTC)
Ok, thanks! To be honest I didn't expect that to be the reason at all. That's an interesting reason actually. Ginsuloft (talk) 23:47, 19 October 2013 (UTC)

The archive.is filter needs a custom warning template

Special:AbuseFilter/559. Would someone mind cooking one up? I'll get around to it eventually if no one volunteers. But we are getting lots of "false" positive reports about this one, so clearly the newbies have no clue why this is not allowed, and who could expect them to. Someguy1221 (talk) 11:11, 15 November 2013 (UTC)

MediaWiki talk:Abusefilter-warning-archiveis. Just needs an admin to finalize it. Jackmcbarn (talk) 16:04, 15 November 2013 (UTC)
Done. Thank you, Jack. Someguy1221 (talk) 00:24, 16 November 2013 (UTC)
I've just edited the filter warning a bit for easier reading. Wifione Message 03:32, 16 November 2013 (UTC)

Does counting of preview actions lead to false positives?

Hi

This is an informal request for comment.

For throttling in edit filters, all types of actions are considered, and I suppose even preview actions are considered.

Is counting preview actions in the total really helpful for edit filters? Even if we keep the throttle above 4, or even at 6 to 8, can some edit filters give false positives, even to genuine users, only because they previewed their edit several times?


Mahitgar (talk) 03:33, 9 October 2013 (UTC)

No comments yet ? :(
As far as I know, the only actions that throttle a filter are edit, move, createaccount, autocreateaccount, delete, upload. OTAVIO1981 (talk) 16:30, 19 November 2013 (UTC)


On mr-wiki, one particular edit filter has parameters relevant to this discussion: A) action == "edit" and B) Number of actions to allow: = 4 actions. In at least two instances, one user has been filtered on the very first edit. This particular user does a lot of spell checking/correction but is not tech-savvy; probably his preview actions are being counted as actions, so that his first attempt to save is itself throttled. Two such edits by this editor have given false positives.

What happens if a user previews multiple times (say more than 4 times in the above case) before his first attempt to submit the edit with a save action? Whether the filter throttles such a submission or not is what I want to understand.


So, before filing any bug, I want to confirm whether there is really a problem or not.

Rgds Mahitgar (talk) 08:24, 20 November 2013 (UTC)

How to use "new_pst"

Hi,

Among the syntax options available for the edit filter there is one called "new_pst", whose description is "New page wikitext, pre-save transformed". My question is: how do we put this to use? Is there an example of an edit filter where it has already been used?

Mahitgar (talk) 05:41, 12 December 2013 (UTC)

Just use it in place of new_wikitext. It looks at the page after subst's, etc. to make it harder to trick the filters. Jackmcbarn (talk) 15:09, 12 December 2013 (UTC)


Thanks, its nice of you.

Mahitgar (talk) 03:16, 13 December 2013 (UTC)

Minimum & maximum functions?

Is there a way to calculate the maximum or minimum of two terms? I checked mw:Extension:AbuseFilter/Rules_format and couldn't find it. Thanks --Chriswaterguy talk 23:29, 13 December 2013 (UTC)

There aren't any built in. You should be able to do (a<b)?a:b for minimum, or (a<b)?b:a for maximum of a and b. What need is there for it, though? Jackmcbarn (talk) 01:52, 14 December 2013 (UTC)
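For comparison, the same conditional-expression idea written as a small Python sketch. The function names are illustrative only; the comments show which of the two suggested expressions each one mirrors.

```python
def minimum(a, b):
    """Mirror of (a<b)?a:b - pick a when it is the smaller value, otherwise b."""
    return a if a < b else b


def maximum(a, b):
    """Mirror of (a<b)?b:a - pick b when a is the smaller value, otherwise a."""
    return b if a < b else a


smaller = minimum(3, 7)
larger = maximum(3, 7)
```

When both terms are equal, either branch returns the same value, so the choice of branch does not matter.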

Filter 564 Comment

I know which user this is meant to be preventing, but is this an FP? If so, is the filter too broad? Black Kite (talk) 11:34, 22 December 2013 (UTC)

Possible bad hit on filter, but no log?

This edit prompted me with a message saying "Your edit includes new external links." and made me enter a CAPTCHA before I could save it. Looking at the diff, it's easy to see that is not the case. I went to report a false positive, but could not find this entry for my IP or the article in the Edit Filter log? Are some logs hidden? 96.236.155.40 (talk) 01:59, 2 January 2014 (UTC)

It's not the edit filter. It's a different extension, one designed specifically to combat spam, but I cannot recall its name at the moment. Perhaps someone else here remembers. If you ask at WP:VPT you may get a faster response. Someguy1221 (talk) 02:05, 2 January 2014 (UTC)
That's the ConfirmEdit extension, which is for all external links added by anonymous users. There are no public logs available; they are only for sysadmins. Legoktm (talk) 02:13, 2 January 2014 (UTC)

Filter 420 throttle acting like a simple disallow

Looking at this filter log, an anonymous user is attempting to revert the addition of a wall of soapboxing from a talk page. Special:AbuseFilter/420 is supposed to throttle large anonymous deletions from talk pages at a rate of 1 per hour, but this user got disallowed on his first attempt. I wonder if this has something to do with the number of different filters that all triggered on him at the same time, but I don't have any real idea. Does anyone know what's going on? Someguy1221 (talk) 21:24, 1 January 2014 (UTC)

Has this been identified and sorted out? I came across a similar problem at mr-wiki for some filter, which is why I brought the issue up for discussion here earlier, but unfortunately it did not get discussed enough.
We came across some instances where the throttle was set to allow a higher number of actions, yet it gave a false positive on an editor's very first edit of the day. This does not always happen, only sometimes. Although the problem was observed a couple of times, I could not reproduce it, so I have still not reported it on Bugzilla.
If the filter setting "Group throttle by:" is set by page (and not by user), the above problem can make the experience more difficult, but I suppose en-wiki does not have many throttle-by-page filters. De-wikipedia uses some.
Mahitgar (talk) 09:12, 10 January 2014 (UTC)
I think "user" is missing from "Group throttle by". Right now, nothing's there. Jackmcbarn (talk) 15:26, 10 January 2014 (UTC)

Filter 602 not tagging as directed

Filter 602 is not tagging with discretionary sanctions alert as directed to do. However, as you see after using its conditions at Special:AbuseFilter/test and my username under 'Changes by user', the filter is correctly figured. Does anyone know why the filter is not logging and tagging? AGK [•] 00:32, 16 January 2014 (UTC)

Or warning, for that matter. --Rschen7754 00:55, 16 January 2014 (UTC)
  • Now resolved with the assistance and wisdom of MZMcBride. AGK [•] 23:58, 17 January 2014 (UTC)

False positive on filter 466

FWIW, this appears to be a false positive. Yaris678 (talk) 18:10, 13 February 2014 (UTC)

554

I just glanced at my filter log out of passing curiosity, and I wondered how this edit and two subsequent edits tripped filter 554 "top100 blog charts". The topic seems about as far removed from the sorts of articles where this filter might catch legitimate results as it's possible to get. Also, is a filter that's had >12,000 hits (including bots and other edits that clearly aren't what it's aimed at) since last May actually doing anything useful? HJ Mitchell | Penny for your thoughts? 20:08, 19 February 2014 (UTC)

For about 2 hours on 29 January, there was a mistake in the filter's code that caused it to catch all edits. Jackmcbarn (talk) 20:18, 19 February 2014 (UTC)

Request for permission: User:Jimmy xu wrk

Hi, I'm a sysop on zhwiki, and we have a lot of filters copied from here over time. However I've found that vandals often find ways to bypass our filters and we have to improvise, but the enwiki equivalent had been set to private. So I request this permission for viewing purpose only, I don't intend to edit anything here. Thanks. Jimmy Xu (talk) 09:16, 12 February 2014 (UTC)

 Done – After consulting with another admin here. EdJohnston (talk) 16:26, 3 March 2014 (UTC)

279

This filter (Repeated attempts to vandalize) is getting hit a lot by new users trying The Wikipedia Adventure. See here for instance. I've not investigated fully, but I don't think this is vandalism, and may be deterring new users?  —SMALLJIM  17:49, 19 March 2014 (UTC)

I've excluded pages with /TWA in their title, it should resolve the issue. Cenarium (talk) 18:53, 23 March 2014 (UTC)
It seems to have done, thanks!  —SMALLJIM  18:14, 25 March 2014 (UTC)

Request for permission: User:This lousy T-shirt

Hello there! I'm a Veteran Editor quite active in recent patrolling and antivandalism on Wikipedia. This request for permission is based on the usefulness access to the edit filter management group will have for me in my duties. I currently have the permissions of rollbacker, reviewer and autopatrolled and can be trusted not to abuse the tools. Thanks in advance. — This lousy T-shirt — (talk) 15:00, 19 March 2014 (UTC)

 Not done — EFM is typically not needed for recent patrolling nor anti-vandalism work, and, unlike most other permissions, it is highly restricted when it comes to granting it to non-administrators, as it confers the ability to create widespread disruption with even the smallest of changes. On top of that, neither clear understanding of AbuseFilter syntax nor substantial amounts of prior edit filter development/debugging/analysis work here or on other wmf projects has been demonstrated. --slakrtalk / 05:56, 3 April 2014 (UTC)


613 - Signing in article

When patrolling, I've found that new users sometimes add their signatures in articles. I have no idea why, but it happens fairly often. So I decided to add a filter that would prompt a friendly warning, that is, assuming you can make custom warnings?

It's been running in idle for almost 12 hours, with five hits, four of them are correct. The other was with this edit and not the previous edit which was the one that actually had the signature in it. I'm using added_lines to inspect the change. edit_diff didn't seem to be the right one, and edit_diff_pst (which I could use to match ~~~~) seemed to cause the query to time out. I can't figure out why... anyone know what I'm doing wrong? Thanks — MusikAnimal talk 14:39, 16 May 2014 (UTC)

This edit also did not match... obviously I'm missing something. — MusikAnimal talk 17:50, 16 May 2014 (UTC)

Hello there, MusikAnimal. I am an administrator on Swedish Wikipedia, where I work on the edit filters. Our equivalent of this filter is working perfectly, and it just so happens that it's a filter I've been working on. I can see that you had to delete filter 613, so I'd like to share ours and explain how it works.

(article_namespace %2 == 0) & !(article_namespace == 4) 
& !("bot" in user_groups)
& ("~~~" in added_lines)
& !(old_wikitext rlike "~~~")
& !(added_lines irlike "<nowiki>~+<\/nowiki>|{{(information\skommer|(bearbetning\s|arbete\s)?på(går|börjad)|(pågående|ständiga)\suppdateringar)")
& !(article_prefixedtext rlike "Användare:.+\/")
& !( "Användare:" + user_name == article_prefixedtext)

The first line tells the filter to divide the namespace index by two, and if the remainder is zero (i.e. it is not a talk namespace), it should continue. An exception is made for ns-4 (the Wikipedia namespace).
The second line excludes bots (I'm not actually sure this is needed thanks to line six, but it doesn't hurt either).
The third line checks if three tildes are added.
The fourth line makes an exception if three tildes are present in the old wikitext.
The fifth line makes an exception if one or more tildes are added within nowiki tags, or a certain template (or any of its redirects) is added telling others that the page is under construction (the template uses tildes as parameters to show others who is editing the article and when the template was added). If you don't have any such templates on this project, simply leave out |{{(information\skommer|(bearbetning\s|arbete\s)?på(går|börjad)|(pågående|ständiga)\suppdateringar).
The sixth line makes an exception to subpages in the user namespace. You'll want to change Användare to User.
The seventh line makes an exception when a user signs on his or her own user page. Again, you'll want to change Användare to User.

So, assuming you don't have any templates that use tildes as parameters, a working filter on this project would be:

(article_namespace %2 == 0) & !(article_namespace == 4) 
& !("bot" in user_groups)
& ("~~~" in added_lines)
& !(old_wikitext rlike "~~~")
& !(added_lines irlike "<nowiki>~+<\/nowiki>")
& !(article_prefixedtext rlike "User:.+\/")
& !( "User:" + user_name == article_prefixedtext)

I should probably mention that you can't test this filter using the tools, because of how the tildes transform upon saving, but it does work live.
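To show how the seven lines combine, here is a rough Python model of the adapted filter. It is only a sketch: the function and parameter names are invented, Python's re module stands in for rlike/irlike, and the sample inputs assume the filter sees the raw tildes before they are transformed on save.

```python
import re
from typing import List


def signature_filter_model(
    namespace: int,
    user_groups: List[str],
    added_lines: str,
    old_wikitext: str,
    page_title: str,
    user_name: str,
) -> bool:
    """Hypothetical model: True when the edit looks like a signature added to a non-talk page."""
    return (
        namespace % 2 == 0 and namespace != 4        # line 1: even namespace, not Wikipedia:
        and "bot" not in user_groups                 # line 2: ignore bots
        and "~~~" in added_lines                     # line 3: three tildes were added
        and re.search("~~~", old_wikitext) is None   # line 4: tildes not already on the page
        and re.search(r"<nowiki>~+</nowiki>", added_lines, re.IGNORECASE) is None  # line 5
        and re.search(r"User:.+/", page_title) is None  # line 6: not a user subpage
        and "User:" + user_name != page_title        # line 7: not the user's own page
    )


# Made-up sample: a new user signs at the end of an article.
article_hit = signature_filter_model(
    0, ["*", "user"], "Great band! ~~~~", "Some article text.", "Example band", "NewUser"
)
```

The same call with an odd (talk) namespace, with the tildes wrapped in nowiki tags, or on the user's own user page returns False in this model.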

Finally, regarding the warnings you asked about: Yes, as an administrator you can edit and create new warnings and tags for the edit filter. You can see a list of warning messages here and a list of tags here. Nirmos (talk) 04:46, 7 July 2014 (UTC)

@Nirmos: How incredibly helpful! Thank you so much! Basically exactly what I was looking for. I think your example is right on par with excluding the templates; for us, however, it is less about there being some templates that use tildes (I'm pretty sure that there are) and more about the fact that nearly all of the people who would trip this filter are new users. It may be less expensive to restrict the filter to those with an unconfirmed status rather than to exclude edits within some valid template. Similarly, I'll initially try to restrict this to the article space only. I figure we'd warn on the first attempt then tag on the second, as again I'm guessing most of the time signing in an article is an accident. Any results you have observed on your wiki are welcome... not sure what we'll run into exactly. If it's not already clear for anyone else reading this, line 4 of the last example above is what does the magic... make sure the tildes are not in the old_wikitext. I'm going to re-add this filter and put it back in test mode. Thank you Nirmos again for your very informative and thought out response! — MusikAnimal talk 07:59, 7 July 2014 (UTC)

are edit filter-generated tags mutually exclusive?

I've noticed that edits never seem to have more than one tag that is generated by an edit filter. For instance, I've seen lots of page-blanking edits that are tagged with possible vandalism even though blanking would also normally apply but does not. For another example, have a look at the history of Sauli Niinistö: many of the recent edits were correctly tagged as possible BLP issue or vandalism. The vandal also reverted ClueBot NG several times, so the tag reverting anti-vandal bot should also have applied. But for some reason, it did not.

So I'm wondering: are tags generated by edit filters mutually exclusive? If so, is this by design? --Ixfd64 (talk) 18:29, 9 June 2014 (UTC)

@Ixfd64: I just tested this, and it's not actually a limit of one tag. The limit is that only one filter can apply any tags. This doesn't appear intentional. I'll see if I can get it fixed. Jackmcbarn (talk) 18:38, 9 June 2014 (UTC)
Just a guess... Perhaps if one filter disallows, warns, or tags an edit, the edit filter software will short-circuit and ignore subsequent filters in an attempt to reduce server load. Far from ideal, but with all those filters enabled I imagine performance is a concern. — MusikAnimal talk 18:46, 9 June 2014 (UTC)
That's not it. All of the triggered filters show up in the filter log, but only the tags from one of them end up getting applied. Jackmcbarn (talk) 18:49, 9 June 2014 (UTC)
I've submitted a patch that will fix this. Jackmcbarn (talk) 19:07, 9 June 2014 (UTC)
Thanks for the quick resolution! --Ixfd64 (talk) 20:54, 9 June 2014 (UTC)
Hi, Jackmcbarn, I am not sure common patrollers will be happy with this change. Suppose an edit is tagged by more than 3-4 filters: recent changes and article history will get cluttered, and if too many tags appear together at recent changes I doubt how patroller-friendly it will be. And what about those language scripts that make for longer spellings? Or is it that I misunderstood what you are discussing here? I won't oppose completely, but I sincerely doubt the efficacy.
Best wishes and rgds.
Mahitgar (talk) 08:04, 21 June 2014 (UTC)
It won't clutter beyond the same line, unless someone used really long tag names, which we don't. Jackmcbarn (talk) 15:14, 21 June 2014 (UTC)
Ok :) Maybe at mr wiki we will need to reduce the length of a few tags! Hope it works with non-Roman scripts. Thanks. Mahitgar (talk) 04:17, 26 June 2014 (UTC)
My patch for this was just accepted. It will be live here starting July 10th. Jackmcbarn (talk) 23:33, 2 July 2014 (UTC)
@Ixfd64: This is now live here. See [38] for an example of it. Jackmcbarn (talk) 18:57, 10 July 2014 (UTC)
It works perfectly. Thanks for your hard work! --Ixfd64 (talk) 19:03, 11 July 2014 (UTC)

Filter request

Hi all,

There is consensus developing over at Wikipedia talk:AFC#Edit filter for an edit filter to help restrict use of the script to those who meet the criteria which have been agreed on in several RfCs, and are listed here. This follows several recent occurrences of SPAs using the script to mass-move pages to cause disruption. Ideally, the filter should pick up edits with the string "afch" in the edit summary (in either lower or upper case), as it is added automatically by the use of the script. It may also be worth limiting this to draft space only, to limit any false positives. If you would like any more information, please let me know. --Mdann52talk to me! 16:37, 15 August 2014 (UTC)
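As a rough illustration of the matching described in this request, the following Python sketch tests an edit summary for "afch" in any letter case and limits the check to draft space. The function name is hypothetical, and the Draft namespace number (118) is an assumption that should be checked against the wiki's own configuration; the actual filter would be written in the edit filter's rule language rather than Python.

```python
import re

# Assumption: 118 is the Draft namespace number on the wiki in question.
DRAFT_NAMESPACE = 118

# Case-insensitive match for the string the script adds to edit summaries.
AFCH_PATTERN = re.compile(r"afch", re.IGNORECASE)


def looks_like_afch_edit(summary: str, namespace: int) -> bool:
    """Hypothetical check: draft-space edit whose summary mentions afch."""
    # Compare the namespace first; it is cheaper than searching the summary.
    return namespace == DRAFT_NAMESPACE and AFCH_PATTERN.search(summary) is not None
```

A plain substring test like this would also match any unrelated summary that happens to contain those four letters, which is one reason for restricting it to draft space.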

Yes, there is a real problem here, and EFs seem like a solid solution. One question: To the extent that the filter or filters protect the allowed participant list, is the "afch" check required? It would seem that pretty much any edit or modification to that page should be gatekept, but perhaps I'm missing something? --j⚛e deckertalk 17:09, 15 August 2014 (UTC)
Please provide a link to a sample of specific edits that you are suggesting the filter to act upon. Are you suggesting the filter action should be disallow or warn? — xaosflux Talk 17:37, 15 August 2014 (UTC)
Apparently, it's possible to use the script even if the user is not in the participants' list page. If that bug is fixed, an easier solution would be to "protect" the page using the filter. --Glaisher (talk) 17:39, 15 August 2014 (UTC)
This whole thing is ridiculous. A person who wants to use the script wouldn't even need to know how to code to copy it to their own .js file and rip the check routine out of MediaWiki:Gadget-afchelper.js/core.js if they are intent on bypassing the AFC community's rules. Deal with the user's conduct and give up on trying to use technical means to control them. Monty845 17:56, 15 August 2014 (UTC)
@Monty845: It is a gadget and it's supposed to be used only by users with at least 500 edits and an account age of 90 days who are listed at WP:WPAFC/P. However, there is a bug and anyone can use it now even if their username is not on the list. The script has been abused by vandals already. --Glaisher (talk) 18:01, 15 August 2014 (UTC)
(edit conflict × 2) @Xaosflux: edit summaries using the script look like this or (mainly) this (this just invokes one part of the script, but gives you an idea of the structure. Let me know if you need more information). Ideally, it should disallow the edits, as they should not be using the script without meeting the criteria. @Glaisher:. That approach may help, but checking this for all edits will allow us to catch any existing users on the list that don't meet the criteria. --Mdann52talk to me! 18:02, 15 August 2014 (UTC)
  • I agree with Monty. Bug or not, it doesn't take much to copy the script and remove the check, so that one could use it without being on the list (as a matter of fact, I did so myself long ago, so that I wouldn't have to bother with all the stuff AFCH is asking for if I happen to come across a draft I want to review). This is exactly why Twinkle doesn't have a blacklist at all, despite the even greater potential for damage it provides would-be vandals and abusers. And none of this is to mention that one doesn't need the script to review articles anyway. I really don't think that spending even a small amount of edit filter resources would be worth this. Writ Keeper  18:06, 15 August 2014 (UTC)

There seems to be some conflation here, and I'm sorry I didn't notice it earlier. The discussion here apparently refers to a much broader edit filter, whereas the discussion that is linked at WT:AFC refers to an edit filter protecting only the AfC qualified reviewer list. A single URL. This distinction may or may not be important, but it was my view that FP on the participant list was an unnecessary impediment to getting more reviewers working at AfC, and I personally believe that the lack of timely reviews there is a situation which is significantly degrading our ability to attract new editors--people who go to AfC now often have to wait over a month for a review. This is just nuts, and full protecting the participant list would make it worse. --j⚛e deckertalk 18:14, 15 August 2014 (UTC)

Blacklists in userscripts are generally ineffective at keeping deliberate abusers away; if someone wants to evade one, it is relatively simple for them to do so, and virtually impossible to stop them. Again, the example of Twinkle and its blacklist (or more accurately, its lack thereof) seems relevant. Perhaps it would be better to scrap the blacklist, accept that there are going to be people that abuse it, and deal with the consequences through community processes? I see what you're going for, and I suppose there could be value in keeping the script at arm's length from well-meaning but incapable editors, but it seems that this proposal is being made in response to deliberate, repeated abuse of the user script, and a blacklist/whitelist, regardless of how heavily it's protected, will not solve the problem of intentional abuse. Writ Keeper  18:32, 15 August 2014 (UTC)
It's a whitelist, not a blacklist. --j⚛e deckertalk 18:34, 15 August 2014 (UTC)
My point applies equally to both whitelists and blacklists. Writ Keeper  18:35, 15 August 2014 (UTC)
I disagree. Isn't a whitelist far more effective than a blacklist? With careful acceptance of new names a ne'er-do-well should have a much harder time getting on the whitelist. Chris Troutman (talk) 18:40, 15 August 2014 (UTC)
The point is all you need to do to bypass the whitelist is add importScript('User:Writ Keeper/Scripts/afcHelperWrapper.js'); to your common.js file. Or modify the afc scripts yourself to the same effect. Monty845 18:42, 15 August 2014 (UTC)
(edit conflict) You may be quite right that there is a better solution. I'm not sure I've seen it, but .. you may very well be right. I do think blacklists are easier to game than whitelists, for reasons that I'm sure are obvious, but perhaps not so much easier that it matters. I would appreciate significant, informed thought about how to deal with the problems that are occurring, though. And some understanding that I'm kinda frustrated--I think new article creation by new editors is a pretty delicate task that we are failing at horribly, and not a week goes by that I don't wonder whether it would be better to scrap AfC entirely. --j⚛e deckertalk 18:41, 15 August 2014 (UTC)

(3x edit conflict) OK, so let's reframe here---you want an edit filter that prevents ONE PAGE from being edited unless the username is on a list? How often is this page edited? Why can't protection and requests to change be utilized (much like the AWB checklist)? — xaosflux Talk 18:43, 15 August 2014 (UTC)

That's my preference, but not Mdann52's proposal above, yes. Full protection is probably my second-favorite solution; my concern there is that many people will go "eh, too much work" and not make the edit request, at a time we're desperately short of reviewers. This may sound silly, but I think it's non-trivial. PC2, as I said at the proposal on the other page, feels like a bad solution because I don't expect reviewers to understand what we're looking to protect against, and the PC interface never implemented my suggestion of having for each PC-protected page a notice as to what problem or problems a PC reviewer should specifically look for. Semi-protection doesn't solve the abuse. Someone suggested template protection, and I'm more than willing to think IAR is good enough to invoke that, but it brings up the same concerns that brought me to dismiss PC2. But if my specific request was denied, and y'all make some reasonable points about it, I'd probably go to full protection on the participant list.
To contrast my view with that of WK so far, I'm taking the guess that a lot of abusers can manage to insert themselves on a list without really understanding scripting, but not the sort of change Writ Keeper describes. This is just a guess, and I could be entirely full of it.  :) --j⚛e deckertalk 18:48, 15 August 2014 (UTC)
Well, you could be right, too; I don't know how badly people want to abuse the AFC helper script, or how knowledgeable they are about scripting. Let it not be said that I'm the grand master of all things technical. :) In fact, I'm not even sure that this edit filter is something we shouldn't do. (Xaosflux: what Joedecker is looking for is an edit filter that will prevent them from editing one page if they have fewer than 500 edits, etc. Something like:
article_articleid == 12438036 &
user_editcount < 500 &
user_age < 7776000
The idea is to have the edit filter prevent anyone from adding themselves to the whitelist if they don't have enough experience to be reviewing things.) All I'm saying definitively is that preventing these users from adding themselves to the whitelist is only a secondary concern; it's how they then proceed to use the script that's the real problem here. An edit filter could help to fix the former, but it won't necessarily fix the latter, since there are other ways to evade the whitelist, and so if you're proposing this with the expectation that it'll fix both, you may be disappointed. Writ Keeper  19:28, 15 August 2014 (UTC)
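For readers less familiar with the rule syntax, the three conditions above can be modelled in Python as below. This is a sketch only: the function names are hypothetical, and the second variant (matching when either the edit count or the account age falls short) is an assumption about one possible reading of "500 edits and 90 days", not something stated in the thread.

```python
NINETY_DAYS = 90 * 24 * 60 * 60   # 7,776,000 seconds, the user_age threshold quoted above
PARTICIPANT_PAGE_ID = 12438036    # page ID taken from the snippet above
MIN_EDITS = 500


def matches_as_written(page_id: int, edit_count: int, age_seconds: int) -> bool:
    """All three conditions joined with AND, as in the snippet above."""
    return (page_id == PARTICIPANT_PAGE_ID
            and edit_count < MIN_EDITS
            and age_seconds < NINETY_DAYS)


def matches_either_shortfall(page_id: int, edit_count: int, age_seconds: int) -> bool:
    """Hypothetical variant: match if the user falls short on either criterion."""
    return (page_id == PARTICIPANT_PAGE_ID
            and (edit_count < MIN_EDITS or age_seconds < NINETY_DAYS))
```

The two differ for a user who meets only one criterion, for example an account older than 90 days with only a handful of edits: the first function does not match such a user, the second does.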
Writ_Keeper, thanks - I think this is a really bad use for the edit filter, if we need an on-wiki whitelist just full-protect it and make an easy to use form to enter requests; there are plenty of people that review those types of requests, and it could even be transcluded at WP:PERM much like AWB already is. — xaosflux Talk 20:48, 15 August 2014 (UTC)
Writ_Keeper I was thinking more of an edit filter that picked up certain features of the script (eg. Edit summaries or fixes applied), compare it to the user age etc. and deny the edit if they don't meet the criteria. Of course, it's only worth applying this in draft (and maybe user) space. --Mdann52talk to me! 20:52, 15 August 2014 (UTC)
Yes, I know that's what you're looking for, Mdann52, but that's not what Joedecker meant. Unfortunately, your proposed edit filter runs into the same problem that Joe's would: it's just as easy to remove those identifying marks from edit summaries, etc. used in the script as it is to remove the code that checks the whitelist, so either way, an edit filter is far from a guaranteed fix. Moreover, I think your solution would be much more significant, performance-wise, since we would be checking two whole namespaces rather than just a single page. It's also not what's being discussed on the AFCH talk page. Writ Keeper  20:58, 15 August 2014 (UTC)
(edit conflict) WK: Exactly. More specifically, yes, that's precisely the test I was thinking of, and yes, this doesn't solve every problem by far. I'm not even sure we can solve one of the infrequent problems --people moving their own drafts to mainspace, there's not actually a policy prohibiting that. The ACTRIAL precedent may be relevant. But reviewing per se, and mass reviewing, badly, of other people's articles ... that maybe we can manage some hurdles for. --j⚛e deckertalk 21:15, 15 August 2014 (UTC)
Right... With performance in mind, I support protecting just the participants page. --Mdann52talk to me! 05:47, 16 August 2014 (UTC)

Request for permission: User:Shubhamkanodia

Hello, I'm a sysop on hiwiki, which is suffering with a shortage of active editors and hence needs stricter edit filters to keep vandalism in check. Compared to enwiki, the number of such filters on hiwiki is very small. I am looking to import a few filters, some of which have been set to private here. I request this permission for viewing the conditions and content of the filters. I don't intend to make any edits on enwiki. Thank You! Shubhamkanodia (talk)

 Not done Wifione Message 07:19, 28 August 2014 (UTC)

Does anyone review filters?

Is it possible to request that an overly restrictive filter be seriously looked at? —174.141.182.82 (talk) 16:14, 19 August 2014 (UTC)

Yes, it's possible. I assume you're talking about filter 601, which has blocked several edits coming from your IP address? Well, I'm probably not the best person to talk to about it, but the main author of that filter is Kww; they might have input. Writ Keeper  16:26, 19 August 2014 (UTC)
Thanks. I posted to his Talk page, in case he misses the ping here. —174.141.182.82 (talk) 16:41, 19 August 2014 (UTC)
It does far more good in keeping Colton Cosmic's activities at bay than it causes harm. Even the target has figured out that it keeps people within certain IP ranges from anonymously discussing blocks and unblocks. That's an easy enough activity to avoid or to do in an alternative fashion if no points of pride are getting in the way.—Kww(talk) 17:57, 19 August 2014 (UTC)
What about discussing “comment requests” in project space? It won’t even let me mention the term; I couldn’t even link a discussion from the closure noticeboard. —174.141.182.82 (talk) 20:27, 19 August 2014 (UTC)
A side effect of the poor judgement some admins exercised by opening an RFC/U about Colton. It's unfortunate that you share address space with a prolific and determined block evader. I'm very disinclined to weaken it in any way, especially when it's so easy for you to fix the problem on your end. Other editors are watching, and consensus may turn against me, so I will continue to watch this discussion.—Kww(talk) 21:46, 19 August 2014 (UTC)
Understood. This is all a lot less inscrutable to me now, so thanks for the explanations. I didn’t realize this user was still an active problem. Kind of unsettling, in a weird way, to know he’s apparently local to me. But again, thanks. —174.141.182.82 (talk) 01:08, 20 August 2014 (UTC)

No reaction on requests page

Hello all,

I requested an edit filter at the end of May without any response yet. Is there anything I could do to improve the chances of a response? Thanks for any comments, --Null Drei Nulltalk 16:16, 20 August 2014 (UTC)

Typically, if a section I start goes for too long without a response, I re-instate it - remove it from its old location, and re-add it at the end with a new timestamp. In this case, I think your post probably went long enough where it is that if anyone was going to respond to it, they would have. עוד מישהו Od Mishehu 15:16, 21 August 2014 (UTC)

I think we need a more specialized warning for this filter - I've seen several false positive reports which seem to object to these edits being described as "unconstructive", especially when it's only a small piece of an otherwise plausible edit. עוד מישהו Od Mishehu 12:24, 21 August 2014 (UTC)

  • The warning right now goes like this "An automated filter has identified this edit as potentially unconstructive. Please be aware that vandalism may result in revocation of your editing privileges. If this edit is constructive, please click 'Save page' again, and report this error." It seems alright to me. Having said that, there's one change that NawlinWiki made to the filter yesterday, a change that I won't list here, and a change that might reduce the error reports. You might check the change and understand how it might improve the response. Wifione Message 18:14, 21 August 2014 (UTC)
    • I think that we need to have a warning for this that actually mentions that the problem is with links to YouTube. And I checked: the 5 most recent false positives where a YouTube-related response was given are all later than NW's change to the filter. עוד מישהו Od Mishehu 04:15, 22 August 2014 (UTC)
      • @Od Mishehu:, I understand. How does this sound? "Due to certain Youtube link(s) you may be attempting to add or remove, an automated filter has identified this edit as being potentially unconstructive. If this edit is constructive, please click 'Save page' again, and report this error. At the same time, please be aware that vandalism may result in revocation of your editing privileges." Wifione Message 09:47, 25 August 2014 (UTC)

Is it possible to disable one of the filters by using own .js file?

I wonder if there's a way I could prevent one filter from showing up for me (on fiwiki). I don't like to see those warning texts when saving pages. The edit filter is about a possible forgotten signature (this one). --Stryn (talk) 17:39, 11 September 2014 (UTC)

@Stryn: The filters as I understand it are applied to what gets submitted to the server. Frontend scripts would be able to manipulate it before the form is submitted but in this case the filter is looking for something that isn't being submitted (~~~), so I don't see a way to bypass that. I'm intrigued by this filter, it seems like it'd be pretty expensive. Here on en-wikipedia we rely on User:SineBot to automatically sign unsigned posts on talk pages. Perhaps it's worth talking to the bot operator to obtain the source for use on your wiki. — MusikAnimal talk 17:59, 11 September 2014 (UTC)
Thanks for your reply. Well, I think I then need to live with the edit filter as it is, since other users like it. Anyway, I do like the edit filter more than a bot signing unsigned posts. But that's fine for a big wiki, like enwiki really is. --Stryn (talk) 18:40, 11 September 2014 (UTC)

Automatic edit filter

Hmm, what does this edit filter do? I don't understand why users' edits on talk pages may be identified as unconstructive and blocked by an automatic edit filter. That should work on them! --Allen talk 01:36, 13 September 2014 (UTC)

Which particular filter are you talking about? You've never hit any edit filter. Jackmcbarn (talk) 15:27, 13 September 2014 (UTC)

Condition limit

Hi! I have some questions concerning the condition limit in the edit filter. The condition limit restricts how much filter code, and how complex, may be executed before the edit filter stops executing filter code for that edit, i.e. it is not a time limit. I guess that the condition limit has been the same since the edit filter was introduced 5 ½ years ago. During this period, the hardware that the edit filter runs on has presumably at least doubled its speed (only a guess). If that is the case, we now allow the filter code to run for at most approximately half the time, or even less, for each edit compared to when the edit filter was introduced. Wouldn't it be possible to increase the condition limit from the current value of 1000 for this reason? Another question: Is it possible for each language to decide its own condition limit? (I am working with the Swedish edit filter, not the English one.) A final question: It seems that a project is under way to replace the condition limit with a time limit. If that is the case, when is that work expected to be finished? Svensson1 (talk) 20:55, 30 August 2014 (UTC)

@Hoo man: ping. Jackmcbarn (talk) 22:18, 11 September 2014 (UTC)
@Svensson1: Condition limits within the AbuseFilter extension are a complicated and long story. In a nutshell, they aren't really the kind of (CPU) time restriction we want these days, as the condition limit is a very poor metric for how heavy a filter is (a lot of stuff is "one condition"... some very heavy stuff and some very lightweight things). We plan to replace the current condition limits with a CPU time metric (or something similar). Because of the troubles with the current condition limits and without really evaluating what this might mean in production, we're probably not going to raise the condition limit.
Sadly work on a new limiting system is stuck at the moment because there's nobody working on it and AbuseFilter's code is rather bad and very hard to properly understand (so fixing that will take some time). Cheers, Hoo man (talk) 15:19, 14 September 2014 (UTC)
Hoo man, is it really true that nobody's working on it? Then what did Werdna do here on August 25? Nirmos (talk) 01:12, 17 September 2014 (UTC)
Yes, sadly... Werdna last touched that patch on "Aug 21, 2013", more than a year ago. (The recent update date is just from a change to the commit message) - Hoo man (talk) 13:13, 17 September 2014 (UTC)
Ok. Thanks for replying. Nirmos (talk) 01:33, 19 September 2014 (UTC)

Discussions

Vandalism filter

Hi

I had to revdel this edit a couple of days ago. I then went to check the edit filter for this user and noticed that none of the filters had picked up on anything. Surely one of the filters should have picked up on this vandalism? 5 albert square (talk) 19:37, 31 October 2014 (UTC)

I'd say only the second part of that diff could feasibly be detected as vandalism by an edit filter. Problem is at least one of those obscenities is also part of the name of a notable music group. You might be able to add that phrase to Special:AbuseFilter/260, but I wouldn't really consider that to be a common vandalism phrase. The other edit filter that comes to mind is Special:AbuseFilter/380, maybe that could be tweaked to match this. — MusikAnimal talk 19:46, 31 October 2014 (UTC)
Thanks @MusikAnimal:. Never edited one of the filters before. How do I add the phrase? 5 albert square (talk) 21:59, 31 October 2014 (UTC)

Pending changes block

I'm working on a 'mild block' proposal that is to classic block what pending changes protection is to classic protection. I'm posting here because it is suggested that the edit filter be granted the ability to pending changes block (as well as users in a new usergroup, and some anti-vandalism bots). My draft is at User:Cenarium/PCB and I welcome any input before going ahead with the proposal. If there are any technically minded users, I'd particularly appreciate feedback on the suggested implementation. Cenarium (talk) 22:08, 2 November 2014 (UTC)

Block after repeated attempts to spam?

Is there a way for a filter to take an action only after an account or IP triggers the criteria on multiple edit attempts? (Not necessarily on the same pagename, as spammers typically try a different pagename each time.)

E.g. a spammer makes 30 attempts to create a spam page and is prevented each time, but on the 31st attempt they are successful. Instead, is there a way to block them after the nth attempt? (E.g. on the 15th attempt.)

This is for a non-Wikipedia wiki. Maybe the answer is that we need some other kind of bot in addition to our AbuseFilter? Many thanks. --Chriswaterguy talk 04:14, 5 November 2014 (UTC)

We use a bot to report users triggering some filters multiple times at WP:AIV, Mr.Z-bot (talk · contribs), see User:Mr.Z-bot/filters.js. Cenarium (talk) 18:07, 7 November 2014 (UTC)
Blocking is an available action of the AbuseFilter, however the configuration on enwiki has not allowed it here. — xaosflux Talk 21:18, 7 November 2014 (UTC)
Thanks for the replies. Mr.Z-bot looks awesome - I'll look into that. --Chriswaterguy talk 08:54, 16 November 2014 (UTC)
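The "act on the nth attempt" idea raised in this thread can be sketched as a per-user counter kept by an external bot. The class below is a hypothetical Python illustration, not Mr.Z-bot's code and not an AbuseFilter feature: the class name, method name and threshold are all assumptions.

```python
from collections import Counter


class RepeatHitTracker:
    """Hypothetical helper: count filter hits per account or IP."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.hits = Counter()

    def record_hit(self, user: str) -> bool:
        """Record one filter hit; return True exactly when the threshold is reached."""
        self.hits[user] += 1
        # Equality rather than >= so the same user is reported only once.
        return self.hits[user] == self.threshold
```

A bot built along these lines would call record_hit each time a filter log entry appears for a user and would report or block only when the method returns True. Because the count is kept per user rather than per page, it still works when a spammer tries a different page name each time.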

Requests for permissions

-revi

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I have been fighting against an LTA named 대우건설 (Report about this user on meta and enwiki report) globally since Jan 2014. Today, I asked MusikAnimal to create a new filter to counter his socking patterns (I believe nobody with Korean knowledge would say "GO AWAY FUCKING MAN" on enwiki except vandals) and he created Special:Abusefilter/648 (private). I would like to see the abuse log of this filter and add new patterns as needed. I am a sysop on Korean Wikipedia, Commons, Meta-wiki, Wikidata, and two small wikis, and have worked on a few abuse filters.  Revi 13:10, 5 December 2014 (UTC)

  • I support. -Revi and I have talked about 대우건설 on and off wiki, and it's clear there's a real need for this permission. There's nothing complex about this filter in particular, so we're not worried about having vast experience working with regex. Beyond that with such high level of trust across wikis there is little concern -revi will blindly create some new untested filter or one that would otherwise have an adverse effect. — MusikAnimal talk 16:23, 5 December 2014 (UTC)
  • @-revi: Your SUL information looks good, and I've got no concerns with you being able to view the logs. Which other projects have you actively edited/created filters on? — xaosflux Talk 04:12, 6 December 2014 (UTC)
    My work is mostly on Korean wikis. I have created 3 filters on Korean Wikinews. #4 prohibits non-sysops from editing the File: and File talk: namespaces, since non-sysops cannot upload files, and there is only one file - Wiki.png (logo). #5 prevents userpage vandalism by users with fewer than 30 edits whose username does not match the page name, and #6 warns and tags some kinds of repeated policy violations (it's private for some reasons). Also, #2 says the last editor is Hym411, which is my former username. That filter detects blanking and disallows it. Probably AbuseFilter did not change the username on rename. I have edited a few strings for kowiki; that's not a major addition, just adding a string for #50 (equivalent to #648 here) and some other maintenance work.  Revi 04:37, 6 December 2014 (UTC)
    Thank you. Please note, it is customary for this request to last at least a week before actioned. — xaosflux Talk 04:55, 6 December 2014 (UTC)
    Yeah, I have read WP:FILTER and found that. This is not an emergency thing, and I can wait :)  Revi 05:01, 6 December 2014 (UTC)
  •  Done Please note, this right allows access to a tool that can be very disruptive with even the slightest of error, it is not the place to be bold in updates, please use with extreme care - when in doubt bring it to discussion first, and make use of log only mode for anything new that you build first. — xaosflux Talk 05:47, 12 December 2014 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

New proposal: tag edits that turn redirects into non-redirects

I've created a discussion on tagging these edits, and showing the pages on Special:NewPages and Special:NewPagesFeed; please contribute here: Wikipedia:Village_pump_(proposals)#Proposed_technical_change:_show_pages_expanded_from_redirects_on_Special:NewPages_and_Special:NewPagesFeed. Swpbtalk 21:21, 16 December 2014 (UTC)

Erroneous behaviour when more than two warnings are shown simultaneously

I am working with the Swedish edit filter and I have investigated how the edit filter works when several warnings are displayed simultaneously. It seems to work well when two filters show a warning and tag the edit. One warning appears below the other. But when three or four warnings should be shown, it only works sometimes. It seems to depend on the order in which the things that the filters react to are placed in the article. Sometimes all three or four warnings are shown simultaneously and the edit is correctly tagged. But sometimes only two warnings are shown and only two tags are added to the edit, even though there should be more. This is not so good, since the filter whose warning is not shown and whose tag is not added could be a very important one, maybe the most important one. Svensson1 (talk) 15:26, 21 December 2014 (UTC)

Edit filter userright to admins?

Forgive me if this is written somewhere obvious, I couldn't find it, but are admins ok to give themselves the edit filter user right without formal request? Sam Walton (talk) 11:51, 14 January 2015 (UTC)

Yes, it's fine. Most admin filter editors are self-appointed. -- zzuuzz (talk) 12:28, 14 January 2015 (UTC)
Certainly, most leave it off if they are not working in that area, which gives marginally better security; the +sysop bit gives access to all of the logs, etc. Some admins turn it on if they are actually going to make the change and turn it back off when done. — xaosflux Talk 15:14, 14 January 2015 (UTC)
Ok thanks. And do admins need to request new filters or are they trusted to be sensible if they make one? Sam Walton (talk) 22:30, 15 January 2015 (UTC)
Go ahead - always start with log-only, and don't break the site. The key thing - the percentage of actions reaching the condition limit, shown at the top of the EF management page, which is currently at about 9%, needs to be kept below 5%, or filters will start failing. Adding filters increases this number, so other things should probably be optimised or disabled first. -- zzuuzz (talk) 22:58, 15 January 2015 (UTC)

Can't enable filter in spite of being an admin

I've just tried to re-enable Special:AbuseFilter/425, but can't: I get the error "You cannot edit this filter, because it contains one or more restricted actions. Please ask a user with permission to add restricted actions to make the change for you." Which is surprising, because I'm an admin on enwiki, and also have the "edit filter manager" permission set (I've just re-checked this, just to make sure). Can anyone please let me know what I should be doing to be able to edit edit filters again? -- The Anome (talk) 23:25, 2 February 2015 (UTC)

@The Anome: "Revoke the user's autoconfirmed status" is a restricted action. I don't think we're supposed to use those kinds of actions here at all. Jackmcbarn (talk) 23:36, 2 February 2015 (UTC)
@Jackmcbarn: Ah: that wasn't the case in 2012, last time it was in use. However, even with that option unchecked, I am still unable to re-enable the filter: I get exactly the same message. -- The Anome (talk) 23:44, 2 February 2015 (UTC)
 Done, I've just removed the revoke autoconfirmed bit. Cheers, Hoo man (talk) 23:56, 2 February 2015 (UTC)
Thanks! Without that bit set, I can now make a test edit to the filter's comments field, so it looks like all is well again. Is there any way to get access to that restricted action option again (used with care, it was very useful for obstructing the efforts of persistent filter-evaders), or is that now restricted to a special class of user like oversighters? -- The Anome (talk) 01:30, 3 February 2015 (UTC)
I don't think there's consensus to allow restricted options on enwiki. You might want to just list that filter under immediate at User:Mr.Z-bot/filters.js, to send anyone who trips it to AIV. Jackmcbarn (talk) 01:33, 3 February 2015 (UTC)

WikiLove is Vandalism?

Hey, can someone with previous experience or the ability to look through the edit filters help me figure out why WikiLove messages are being tagged as possible vandalism? — {{U|Technical 13}} (etc) 17:36, 4 February 2015 (UTC)

That specific hit was a false positive. (Links <may be restricted>: Special:AbuseFilter/279, Special:AbuseLog/11532497, Special:AbuseFilter/examine/711535017) — xaosflux Talk 17:57, 4 February 2015 (UTC)
This filter doesn't appear to have many false positives, I don't think any further action is needed. — xaosflux Talk 17:57, 4 February 2015 (UTC)
FYI: The wikilove content is not what tripped this...just a coincidence. — xaosflux Talk 17:58, 4 February 2015 (UTC)

The test page says that it checks the past 100 edits but if you enter a user to test against it doesn't seem to follow an obvious constraint, other than not testing against edits from some time ago. What's the length of this time and why is there a limit? i.e. Why can't I check the past 100 edits made by a user or IP up to any time ago? Sam Walton (talk) 16:04, 20 January 2015 (UTC)

@Samwalton9: I think it's checking against recent changes, which only goes back 30 days. Jackmcbarn (talk) 23:38, 2 February 2015 (UTC)
@Jackmcbarn: Could it be changed to the past 100 edits regardless of time frame? It's not so useful when I can't check against old accounts' edits. Sam Walton (talk) 11:08, 11 February 2015 (UTC)
The recent changes table only holds the last 30 days of edits, so no, at least not without a lot of work. Jackmcbarn (talk) 12:52, 13 February 2015 (UTC)

One of the conditions of this filter is user_editcount < 15, and yet the filter caught this edit, by an admin with closer to 150,000 edits than 15. In fact, as far as I can tell, the only condition that was met was the use of "fuck you" in the edit summary. Could somebody more competent than me look into this? Thanks, HJ Mitchell | Penny for your thoughts? 16:30, 18 February 2015 (UTC)

Yes, that's the one. I've moved the test temporarily so it only matches the newer users, but there are probably better filters for it altogether. -- zzuuzz (talk) 16:58, 18 February 2015 (UTC)
The filter reads, in essence, A & B | C. In the absence of explicit parentheses, boolean operations group left to right, so this is equivalent to (A & B) | C. I assume the actual intention was A & (B | C), but that's not the current effect. The inclusion of "fuck you" in that filter seems rather superfluous as well. Dragons flight (talk) 17:42, 18 February 2015 (UTC)
Note: zzuuzz updated it to effectively be A & (B | C), which should allow experienced users to keep cussing. Still not convinced there is any wisdom in including the "fuck you" condition on that filter. Dragons flight (talk) 18:20, 18 February 2015 (UTC)
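The difference between the two groupings can be sketched in Python. This is an illustration only, not the filter itself: the function names are hypothetical, and it is an assumption that A stands for the user_editcount < 15 test and C for the edit summary test (B is whatever the middle condition was).

```python
def filter_as_written(a: bool, b: bool, c: bool) -> bool:
    # "A & B | C" with left-to-right grouping is (A & B) | C
    return (a and b) or c


def filter_as_intended(a: bool, b: bool, c: bool) -> bool:
    # The presumed intention: A & (B | C)
    return a and (b or c)


# Hypothetical experienced editor: fails A (edit count), fails B, matches C (summary).
experienced = dict(a=False, b=False, c=True)

print(filter_as_written(**experienced))   # True: C alone is enough to trip the filter
print(filter_as_intended(**experienced))  # False: A gates everything else
```

The two forms only disagree when A is false and C is true, which is the case reported above.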
FYI: Special:AbuseLog/11614865. Not an especially helpful edit, and from an account that's well on its way to an indef block, but nonetheless not what this filter is aimed at. And no Korean text, no "fuck you" in the edit summary... HJ Mitchell | Penny for your thoughts? 18:04, 18 February 2015 (UTC)
So you don't see "fuck you" next to "Edit summary/reason (summary)" on the log you just linked? Dragons flight (talk) 18:13, 18 February 2015 (UTC)

Condition limit

As mentioned here and here a few of us are confused about how the condition limit works. Could we get some input either at MediaWiki or here on what the condition limit actually refers to and whether we're exceeding some set limit? As I say at MusikAnimal's talk page, it seems some filters are failing to flag edits which should have been flagged per the test interface. Sam Walton (talk) 22:15, 7 February 2015 (UTC)

/Archive_3#Filter_Reduction is one of the best explanations of conditions. The limit was discussed and confirmed at 5% in Archive 6, and raised in relation to the Article Feedback Tool. Some essential reading in the archives. Werdna's previous advice in this situation has been to delete some filters, and that is what we've previously done. -- zzuuzz (talk) 19:06, 14 February 2015 (UTC)
I've disabled a couple and optimised some more, but according to tradition (see archives) some half dozen filters will need to be disabled. We should start with volunteers and nominations. Tag- or log-only and low hits are the likely first targets. Maybe we need Wikipedia:Filters for deletion? @Samwalton9 and MusikAnimal: -- zzuuzz (talk) 17:54, 15 February 2015 (UTC)
Hmm, filters for deletion seems like it could be a sensible idea, though perhaps too much bureaucracy. I've had a bit of a read through the archives but I'm still not fully up to speed on what the message at the top of Special:AbuseFilter means. "Of the last X actions", where I assume actions means edits or attempts at edits(?), "Y have reached the condition limit of 1000", does this mean Y edits have caused an edit filter to run through 1000 or more conditions? Sam Walton (talk) 18:58, 15 February 2015 (UTC)
Yes, that's what that means. I've noticed that in general though, the issue isn't that we have too many filters, but rather that some filters are poorly written such that they will consume hundreds of conditions for certain types of edits. Jackmcbarn (talk) 19:05, 15 February 2015 (UTC)
I finally understand then, thanks. How best can we diagnose which edit filters are consuming large numbers of conditions? The number on the edit filters themselves seems to fluctuate wildly so I guess the best way is to look at filters which contain a lot of conditions? Sam Walton (talk) 19:12, 15 February 2015 (UTC)
I'd say the best way is to look at all of the filters once in a while, and take note of the ones that seem to be unusually high every time you look at them. Jackmcbarn (talk) 19:31, 15 February 2015 (UTC)
Special:AbuseFilter/613 and Special:AbuseFilter/623 could probably be condensed into one - pinging @NawlinWiki: and @MusikAnimal:. Sam Walton (talk) 19:55, 15 February 2015 (UTC)
@Jackmcbarn: Could Special:AbuseFilter/656 be limited to IPs? It uses quite a few conditions at the moment. Sam Walton (talk) 00:40, 16 February 2015 (UTC)
@Samwalton9: 3 isn't really a lot. I'm also not sure that doing so would save any conditions. Jackmcbarn (talk) 01:10, 16 February 2015 (UTC)

Glad we're getting somewhere with this! @Samwalton9: Special:AbuseFilter/613 is a good-faith filter. This happens by accident so many times I decided to make a filter to inform the user that ~~~~ was in the wikitext, and to ensure they wanted to save it (often the rest of the edit is fully constructive). On the other hand, the more restrictive Special:AbuseFilter/623 I believe was aimed at a sockpuppet, but after seeing how well it performed in disallowing vandalism in general, with coincidentally very low false positives, it was just left as is. You may be able to still combine it with some other general vandalism filter, though.

About the "filters for deletion"... note also that it may require SPI-related discussions to take place, and for at least some of those we would need to be careful not to convey details about the filters' implementation.

I think the technical aspects of the condition limit issue are still unclear. That archived discussion was five years ago; with all the upgraded machinery we can surely handle a bigger payload and be able to increase that condition limit, if even a tad. I'd like to get WMF clarification on this; I've tried at mw:Extension_talk:AbuseFilter#Condition_limit and repeatedly on #wikimedia-tech. Perhaps we should open a phab report? MusikAnimal talk 20:31, 16 February 2015 (UTC)

Some notes on construction

The condition limiter is a somewhat ad hoc tool for preventing performance problems. In my personal opinion it should be removed and replaced with a total runtime limit. To the extent that you want to worry about performance, execution times are generally a better measure to be thinking about. Also, the per filter reporting of condition numbers is completely wonky / broken and should not be considered accurate in any way. So don't necessarily rely on those numbers when identifying problems. (Unfortunately, the per filter time numbers are also somewhat broken.)

That said, the condition limiter is the current thing that we use, so it is worthwhile to understand it. The condition limit is (more or less) tracking the number of boolean operands + number of function calls + number of function parameters + the number of parenthetical conditions entered. However, it is also smart enough to bypass functions and parenthetical groups if the value doesn't matter. For example, in the expression A & B, the details of B are only evaluated if A is true. For that reason it is beneficial to performance to put simple limiting conditions, e.g. checks for article namespace, in front of more complex expressions. Also, parentheses are usually your friend even though entering them can count against you. Lastly, I should note that function calls are cached, so they only add to the condition count the first time a specific function result is asked for.
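The benefit of putting a cheap limiting condition first can be sketched with ordinary short-circuit evaluation in Python. This is only an analogy to the bypass behaviour described above, not AbuseFilter code; the edit records, the function names and the call counters are all made up for the illustration.

```python
import re

TEMPLATE = re.compile(r"\{\{.*\}\}")
calls = {"cheap": 0, "expensive": 0}


def in_file_namespace(edit: dict) -> bool:
    """Cheap limiting check: a single integer comparison."""
    calls["cheap"] += 1
    return edit["namespace"] == 6


def removes_templates(edit: dict) -> bool:
    """More expensive check: two regex scans over the changed text."""
    calls["expensive"] += 1
    removed = len(TEMPLATE.findall(edit["removed"]))
    added = len(TEMPLATE.findall(edit["added"]))
    return removed > added


def matches(edit: dict) -> bool:
    # "and" short-circuits: the regex check runs only if the cheap check passes
    return in_file_namespace(edit) and removes_templates(edit)


# Hypothetical sample edits; only one is in namespace 6.
edits = [
    {"namespace": 0, "removed": "", "added": "some text"},
    {"namespace": 0, "removed": "{{cn}}", "added": ""},
    {"namespace": 6, "removed": "{{PD-self}}", "added": ""},
    {"namespace": 1, "removed": "", "added": "{{ping|Example}}"},
]

hits = [matches(edit) for edit in edits]
print(hits)   # [False, False, True, False]
print(calls)  # {'cheap': 4, 'expensive': 1}
```

The expensive check runs once for four edits, because three of them are rejected by the namespace test before it is reached.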

Example 1

For a practical example, consider filter 59:

article_namespace == 6
& !("autoconfirmed" in user_groups)
& !(user_name in article_recent_contributors)
& rcount ("\{\{.*\}\}", removed_lines) > rcount ("\{\{.*\}\}", added_lines)

This can be simplified as:

A & !(B) & !(C) & rcount( D, E ) > rcount( F, G )

Let's consider the branching chart:

  • A is true: new boolean operand, +1 condition
    • B is true: new boolean operand, and enter paren, +2 condition
      A & !(B) is false, enter bypass mode
      • C is true / false: new boolean operand, skip paren, +1 condition
        • rcount expressions: new boolean operand, skip functions, +1 condition
          • Total: 5 conditions
    • B is false: new boolean operand, and enter paren, +2 condition
      • C is true: new boolean operand, and enter paren, +2 condition
        A & !(B) & !(C) is false, enter bypass mode
        • rcount expressions: new boolean operand, skip functions, +1 condition
          • Total: 6 conditions
      • C is false: new boolean operand, and enter paren, +2 condition
        • rcount expressions: new boolean operand, evaluate D, E, F, and G, and evaluate rcount( D, E ) and rcount( F, G ), +7 conditions
          • Total: 12 conditions
  • A is false: new boolean operand, +1 condition
    A is false, enter bypass mode
    • B is true / false: new boolean operand, skip paren, +1 condition
      • C is true / false: new boolean operand, skip paren, +1 condition
        • rcount expressions: new boolean operand, skip functions, +1 condition
          • Total: 4 conditions

So, that filter runs from 4 conditions if the first operation is false to 12 conditions if every operation must be evaluated.

Example 2

Now consider an alternative construction with explicit parentheses for groups and removing excess parentheses around the "in" operations:

article_namespace == 6 & 
( 
  ! "autoconfirmed" in user_groups & 
  (
    ! user_name in article_recent_contributors & 
    rcount ("\{\{.*\}\}", removed_lines) > rcount ("\{\{.*\}\}", added_lines)
  )
)

This can be simplified as:

A & ( ! B & ( ! C & rcount( D, E ) > rcount( F, G ) ) )

Let's consider the branching chart:

  • A is true: new boolean operand, +1 condition
    • B is true: new boolean operand, and enter paren, +2 condition
      A & ! B is false, enter bypass mode
      • C is true / false: new boolean operand, skip paren, +1 condition
        • Total: 4 conditions
    • B is false: new boolean operand, and enter paren, +2 condition
      • C is true: new boolean operand, and enter paren, +2 condition
        A & ! B & ! C is false, enter bypass mode
        • rcount expressions: new boolean operand, skip functions, +1 condition
          • Total: 6 conditions
      • C is false: new boolean operand, and enter paren, +2 condition
        • rcount expressions: new boolean operand, evaluate D, E, F, and G, and evaluate rcount( D, E ) and rcount( F, G ), +7 conditions
          • Total: 12 conditions
  • A is false: new boolean operand, +1 condition
    A is false, enter bypass mode
    • B is true / false: new boolean operand, skip paren, +1 condition
      • Total: 2 conditions

So, that filter runs from 2 conditions if the first operation is false to 12 conditions if every operation must be evaluated. If the initial condition is rarely true, as article_namespace == 6 probably is, then the modified filter will consume only two conditions in most runs, compared to 4 conditions in the example without explicit parentheses. Stacking easy to evaluate but hard to match conditions at the front of a filter will generally improve run times and reduce condition usage. In most cases, the use of explicit parentheses also helps the edit filter parser more efficiently determine branching and also reduce both condition counts and runtimes. Dragons flight (talk) 04:54, 17 February 2015 (UTC)

PS. Before anyone asks, yes it would be nice if the explicit parentheses were entirely superfluous. In principle, the edit filter parser could be designed in such a way that they wouldn't make a difference. However, as presently implemented, the presence or absence of such parentheses does affect both execution speed and conditions counted. Dragons flight (talk) 05:07, 17 February 2015 (UTC)
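The two branching charts above can be reproduced with a small Python model. It is a toy reconstruction of the bookkeeping shown in the charts, not the AbuseFilter parser: the class names, the `walk` and `conditions` functions and the counting rules are assumptions chosen to match the totals given (one condition for the first operand, one for each further operand of an &, one for each parenthesis entered, and one per function call and per function parameter, with parentheses and functions skipped in bypass mode). Caching of function calls is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Var:
    """A simple operand such as A, B or C (no cost of its own in this model)."""
    name: str


@dataclass
class Not:
    child: "Node"


@dataclass
class Paren:
    """An explicit parenthetical group; entering it costs one condition."""
    child: "Node"


@dataclass
class Funcs:
    """A comparison built from function calls, e.g. rcount(D, E) > rcount(F, G)."""
    name: str
    calls: int
    params: int


@dataclass
class And:
    operands: List["Node"]


Node = Union[Var, Not, Paren, Funcs, And]


def walk(node: Node, env: Dict[str, bool], bypass: bool, used: List[int]) -> bool:
    """Evaluate node and add its cost to used[0]; bypass mode skips parens and functions."""
    if isinstance(node, Var):
        return env[node.name]
    if isinstance(node, Not):
        return not walk(node.child, env, bypass, used)
    if isinstance(node, Paren):
        if bypass:
            return False  # value is irrelevant and nothing inside is counted
        used[0] += 1
        return walk(node.child, env, bypass, used)
    if isinstance(node, Funcs):
        if bypass:
            return False
        used[0] += node.calls + node.params
        return env[node.name]
    # And: every operand after the first costs one condition, even when bypassed
    result = walk(node.operands[0], env, bypass, used)
    for operand in node.operands[1:]:
        used[0] += 1
        skip = bypass or not result
        value = walk(operand, env, skip, used)
        if not skip:
            result = value
    return result


def conditions(expr: Node, **env: bool) -> int:
    used = [1]  # the first operand of the filter
    walk(expr, env, False, used)
    return used[0]


RCOUNT = Funcs("R", calls=2, params=4)  # R is the truth value of the rcount comparison
example1 = And([Var("A"), Not(Paren(Var("B"))), Not(Paren(Var("C"))), RCOUNT])
example2 = And([Var("A"), Paren(And([Not(Var("B")), Paren(And([Not(Var("C")), RCOUNT]))]))])

cases = [
    dict(A=False, B=False, C=False, R=False),  # A is false
    dict(A=True, B=True, C=False, R=False),    # A true, B true
    dict(A=True, B=False, C=True, R=False),    # A true, B false, C true
    dict(A=True, B=False, C=False, R=True),    # everything evaluated
]
for case in cases:
    print(case, conditions(example1, **case), conditions(example2, **case))
```

Under these rules the model gives 4, 5, 6 and 12 conditions for Example 1 and 2, 4, 6 and 12 for Example 2, the same totals as the charts.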
This is bug T43693 once again. But I don't think it is entirely true that a & b & c & d & e always takes 5 conditions, as I am sure I have seen some rules in that format that have an average condition count of 2. All the best: Rich Farmbrough 22:00, 17 February 2015 (UTC).

Footnote

I spent a chunk of time reformatting a number of the active filters to add explicit parentheses and to place exclusionary criteria at the front. Right now 0.15% of actions are exceeding the condition limit, down from about 10% when I started. Hopefully I didn't introduce any errors in the process. Dragons flight (talk) 12:56, 17 February 2015 (UTC)

Wow thank you, both for the explanation and the crazily good job at sorting the conditions. Sam Walton (talk) 13:27, 17 February 2015 (UTC)
@Dragons flight: Yes, thank you for the very thorough explanation! If you don't mind me asking, where did you get all of that information? And why isn't this anywhere at mw:Extension:AbuseFilter? Also, how do you feel about requesting an increase in the condition limit? Spending time improving the performance of existing filters is certainly a good idea, but looking ahead we're going to run into the same problem over and over. I'd like to think Moore's law applies here (so to speak) and we can easily handle another thousand conditions. MusikAnimal talk 16:56, 17 February 2015 (UTC)
I somehow missed the part about how you brought that 10% figure to less than 1. Tremendous kudos! :) MusikAnimal talk 22:14, 17 February 2015 (UTC)
By way of background, I like new shiny things. When AbuseFilter first launched, it was new and exciting and I spent a while digging into it deeply. Though Werdna deserves nearly all the credit for creating it, I got far enough into it to contribute a few small patches to improve performance (mostly as relates to condition short-circuiting). In the process, I gained a pretty good understanding of how it works under the hood. I went back and looked at it last night to make sure nothing had dramatically changed in the internals (it hadn't), but mostly I am relying on knowledge from several years ago.
As regards the condition limit, yes, I think it could be pretty safely increased. Really though, it should be replaced. It is a poorly designed limitation that has little connection with actual load. A more direct limit on total execution time would make far more sense, for example, say no more than 1 second of elapsed time per edit, or something like that. I suspect that a 1 second max would accommodate several times more filters than are currently operating. Dragons flight (talk) 23:03, 17 February 2015 (UTC)
Good to know. I think I will put in a phab report about this so we have less to worry about moving forward. I'll first propose implementing the execution time limitation, but otherwise just an increase in the condition limit for enwiki.
How do you feel about me copying over the #Some notes on construction to mw:Extension:AbuseFilter/Documentation (or some other subpage)? I feel we should have something over there as a reference. I take it the parentheses are the real secret here that no one knew about. MusikAnimal talk 23:17, 17 February 2015 (UTC)
Feel free to add whatever documentation you think is useful. It is a wiki after all.  ;-) Dragons flight (talk) 23:46, 17 February 2015 (UTC)
After a few more filter edits by myself and a couple other people, I think we are now consistently at zero violations of the condition limit. Dragons flight (talk) 22:33, 18 February 2015 (UTC)

Toenote

There seems to be an issue where a revised filter can show very high average conditions (600+) despite being simple, and having a run time of c. 0.3 ms. Coming back later the average conditions reduce to something sensible like 2. Is this a known issue? All the best: Rich Farmbrough 21:56, 17 February 2015 (UTC).

As I said above, the per filter condition values are simply wrong and should be ignored. There are two major problems with it. The easier of the problems to understand is caused by a pernicious little race condition where data from one Apache instance can partially overwrite activity data from another simultaneous instance leading to nonsensical comparisons. I don't believe there are any filters that legitimately reach condition counts higher than 50 or so. Dragons flight (talk) 22:46, 17 February 2015 (UTC)
It wouldn't surprise me if there is also a problem with counters not being reset properly when a filter is edited, but I've never looked into that. Dragons flight (talk) 23:49, 17 February 2015 (UTC)
  • Oh BTW, any "rlike" type conditions are going to be slower than binary compares, especially (but not only) where a complex regex is involved. It would be nice to have a bit-mapped representation of user groups. All the best: Rich Farmbrough 22:03, 17 February 2015 (UTC).
This is something I've found too, especially just after saving an edit filter the condition count goes into the thousands for a minute or so. Sam Walton (talk) 22:22, 17 February 2015 (UTC)

rmspecials

Do we have an accurate definition of what specials this removes? All the best: Rich Farmbrough 23:01, 18 February 2015 (UTC).

In PHP, the executed code is preg_replace( '/[^\p{L}\p{N}]/u', '', $s ); where $s is the initial string. I believe that translates to "remove everything that isn't either a letter or a number" as evaluated by PHP's unicode compliant definition of what are letters and numbers. Dragons flight (talk) 23:12, 18 February 2015 (UTC)
Thanks DF. I imagine \p is Posix class or some such. It's unfortunate for some purposes that it removes spaces, which means word boundaries are lost. All the best: Rich Farmbrough 01:18, 22 February 2015 (UTC).
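For reference, \p{L} and \p{N} in that pattern are Unicode property classes (letters and numbers), not POSIX classes. A rough Python equivalent of the quoted preg_replace call is sketched below. It is an approximation: the function name simply mirrors the filter function, Python's built-in re module does not support \p{...}, so the sketch tests Unicode general categories instead, and the Unicode data bundled with PHP and with Python may differ in edge cases.

```python
import unicodedata


def rmspecials(s: str) -> str:
    """Keep only characters whose Unicode general category is a letter (L*) or a number (N*)."""
    return "".join(ch for ch in s if unicodedata.category(ch)[0] in ("L", "N"))


print(rmspecials("w.u-s s  r_o*c k!"))  # wussrock
print(rmspecials("Héllo, wörld 123"))   # Héllowörld123
```

As noted above, spaces are neither letters nor numbers, so they are dropped and word boundaries are lost.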

Filter addition

Feel like these edits [39] (admin viewable only) could be used to add to Filter 58, bit surprised they weren't caught already--Jac16888 Talk 16:26, 20 February 2015 (UTC)

Is this a long term issue or was this one-off vandalism? Sam Walton (talk) 16:58, 20 February 2015 (UTC)
Right now looks to be the only occurrence I've seen (other than my userpage in the last half hour), however if this is who it is acting like and not just a wannabe, well we all know how much advantage they take of a gap when they find it--Jac16888 Talk 17:03, 20 February 2015 (UTC)

! (auto)confirmed in user groups vs article namespace == 0

Those two checks are probably the two most used by the filters. Which should come first ? In the running filters I've checked, the autoconfirmed check seems to come first a bit more often than the mainspace check but there's no clear winner. If we've got enough information and experience to make a performance determination, then the conditions should be in that same order for all filters. I've also noticed that filters sometimes check for "confirmed", and sometimes for "autoconfirmed". Since we only get very few edits by users in the confirmed usergroup, it should come down to whether checking for "autoconfirmed" (an exact match) is faster than checking for "confirmed".  Cenarium (talk) 06:51, 25 February 2015 (UTC)

I'm pretty sure that !autoconfirmed is more selective than article_namespace == 0. You can check using batch testing on either expression. So there should be a small advantage to having !autoconfirmed in front. As for "confirmed" vs. "autoconfirmed", I doubt there is a meaningful difference in performance by choosing one over the other. Dragons flight (talk) 08:49, 25 February 2015 (UTC)
Last time I checked about 40% of edits were to article space, so that rule should kick out 60% of edits. I would suspect that more than 80% of edits are by confirmed users. Therefore checking autoconfirmed first would rule out more cases.
However article namespace is a numerical comparison, taking (in theory) a clock cycle. Pattern matching is much much slower. So doing the article check first is actually likely to be more efficient.
Static pattern matching should use the Boyer–Moore–Horspool algorithm, which means that the longer string matches faster.
All the best: Rich Farmbrough 00:46, 3 March 2015 (UTC).
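The trade-off between selectivity and per-check cost can be put into numbers with a short Python sketch. The pass rates are the rough figures quoted above (about 40% of edits in article space, fewer than 20% by users who are not (auto)confirmed); the cost units and the function names are hypothetical, chosen only to show how the comparison works.

```python
P_ARTICLE = 0.40        # share of edits in article space (figure quoted above)
P_NOT_CONFIRMED = 0.20  # share of edits by non-(auto)confirmed users (upper bound quoted above)


def expected_cost(first_cost: float, first_pass_rate: float, second_cost: float) -> float:
    """Average cost of "first & second" when the second check runs only if the first passes."""
    return first_cost + first_pass_rate * second_cost


def compare(cost_namespace: float, cost_group: float) -> tuple:
    """Return (namespace check first, user group check first) average costs."""
    namespace_first = expected_cost(cost_namespace, P_ARTICLE, cost_group)
    group_first = expected_cost(cost_group, P_NOT_CONFIRMED, cost_namespace)
    return namespace_first, group_first


# Hypothetical costs in arbitrary units.
print(compare(1.0, 1.0))  # equal costs: the more selective user group check should go first
print(compare(1.0, 5.0))  # group check five times dearer: the namespace check should go first
```

With equal costs the averages are roughly 1.4 and 1.2, so the more selective check wins. With these pass rates the namespace check is the better first test once the user group check costs more than about 1.33 times as much, since 0.8 × cost_namespace < 0.6 × cost_group is the break-even condition.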