Edit filter configuration

Differences between versions

Item | Version from 02:56, 23 October 2022 by Suffusion of Yellow | Version from 23:11, 24 March 2023 by Suffusion of Yellow
Basic information
Notes:
*Just familiarizing myself with the syntax for now; if it's horribly, horribly wrong don't hesitate to nuke it. —Nihiltres
*Just familiarizing myself with the syntax for now; if it's horribly, horribly wrong don't hesitate to nuke it. —Nihiltres
*"new_wikitext" now replaced with more epic "added_lines" :) —Nihiltres
*"new_wikitext" now replaced with more epic "added_lines" :) —Nihiltres
*Enabled warning - Tiptoety
*Enabled warning - Tiptoety
*Optimised -- Andrew
*Optimised -- Andrew
*AntiAbuseBot hit something with only "nigger" in it earlier, adding that too. - Hersfold
*AntiAbuseBot hit something with only "nigger" in it earlier, adding that too. - Hersfold
*Made sure it has to be mass-removal --Andrew
*Made sure it has to be mass-removal --Andrew
*I've been adding some more variations —Nihiltres
*I've been adding some more variations —Nihiltres
*Enable "prevent user from doing the action" This filter is good. -Prodego
*Enable "prevent user from doing the action" This filter is good. -Prodego
*too many false positives, four of the first fifteen hits were good edits; dropping to just logging for now - east
*too many false positives, four of the first fifteen hits were good edits; dropping to just logging for now - east
*Fixed, disallow has been set, applying to mainspace only (after hitting a sig of a user with poop in their name) -Prodego
*Fixed, disallow has been set, applying to mainspace only (after hitting a sig of a user with poop in their name) -Prodego
*Adding not-sysop line, hopefully that will help optimize some, this filter is hitting the safeguard level. - hersfold
*Adding not-sysop line, hopefully that will help optimize some, this filter is hitting the safeguard level. - hersfold
*Added one. Does it matter whether you use single/double quotes? - It Is Me Here
*Added one. Does it matter whether you use single/double quotes? - It Is Me Here
*Removed addition by It Is Me Here: since we're using ccnorm() on added_lines, "SHIT" is already covered by "5H1T", and "SHIT" will never turn up. Filters need to be kept short for performance reasons. As far as I know, single/double quotes don't matter. —Nihiltres
*Removed addition by It Is Me Here: since we're using ccnorm() on added_lines, "SHIT" is already covered by "5H1T", and "SHIT" will never turn up. Filters need to be kept short for performance reasons. As far as I know, single/double quotes don't matter. —Nihiltres
*Added HAGGER (duh) -- NawlinWiki
*Added HAGGER (duh) -- NawlinWiki
*Changed to use contains_any :) --Andrew
*Changed to use contains_any :) --Andrew
*Add "Wikipedia is Communism" (boy, there's an oldie) -- NW
*Add "Wikipedia is Communism" (boy, there's an oldie) -- NW
*Added ' WANKER' (note the initial space); if it turns out any false positives, feel free to nuke it. Would it make this filter more efficient to order the most common words earlier in the contains_any()? A study of which words turn up would be interesting. —Nihiltres
*Added ' WANKER' (note the initial space); if it turns out any false positives, feel free to nuke it. Would it make this filter more efficient to order the most common words earlier in the contains_any()? A study of which words turn up would be interesting. —Nihiltres
**It would only make it a tiny bit more efficient, assuming that bad words edits are rare. RF.  
**It would only make it a tiny bit more efficient, assuming that bad words edits are rare. RF.  
*Excluding articles turned into redirects, which causes false positives. --Conti
*Excluding articles turned into redirects, which causes false positives. --Conti
*Refined the redirect check; we don't want "I REDIRECT THIS PAGE TO YOUR ANUS" to be an easy workaround for the filter. —Nihiltres
*Refined the redirect check; we don't want "I REDIRECT THIS PAGE TO YOUR ANUS" to be an easy workaround for the filter. —Nihiltres
*Added variant "A55 H01E" of "A55H01E". Remove if problematic. —Nihiltres
*Added variant "A55 H01E" of "A55H01E". Remove if problematic. —Nihiltres
*Obama "Epic fail" vandal.  --NW  4/13
*Obama "Epic fail" vandal.  --NW  4/13
*Added variant "F U C K"; saw it in an article history and confirmed that it was being used to bypass the filter ( http://en.wikipedia.org/wiki/Special:AbuseLog?title=Special%3AAbuseLog&wpSearchUser=125.237.148.153&wpSearchFilter=12 ). I wonder how computationally expensive contains_any() is; it might be useful to use a regex system if it isn't significantly cheaper. If we get too many variants I'd be tempted to change ccnorm to norm, though that's more likely to produce false positives. —Nihiltres
*Added variant "F U C K"; saw it in an article history and confirmed that it was being used to bypass the filter ( http://en.wikipedia.org/wiki/Special:AbuseLog?title=Special%3AAbuseLog&wpSearchUser=125.237.148.153&wpSearchFilter=12 ). I wonder how computationally expensive contains_any() is; it might be useful to use a regex system if it isn't significantly cheaper. If we get too many variants I'd be tempted to change ccnorm to norm, though that's more likely to produce false positives. —Nihiltres
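The notes above describe normalizing the added text and then checking it against a list of substrings. A minimal Python sketch of that idea follows. It is not AbuseFilter syntax: `normalize` is a hypothetical stand-in for ccnorm() with a three-character table invented for illustration, `contains_any` only imitates the filter function of the same name, and `PATTERNS` holds a harmless placeholder word.

```python
# Hypothetical stand-in for ccnorm(): fold a few look-alike characters into one
# canonical form. The real AbuseFilter mapping is far larger and may differ.
CONFUSABLES = str.maketrans({"S": "5", "I": "1", "O": "0"})


def normalize(text: str) -> str:
    """Uppercase the text, then fold look-alike characters together."""
    return text.upper().translate(CONFUSABLES)


def contains_any(haystack: str, *needles: str) -> bool:
    """Return True if any needle occurs as a plain substring of haystack."""
    return any(needle in haystack for needle in needles)


# Placeholder list, written in already-normalized form.
PATTERNS = ("5P01L5",)


def is_hit(added_lines: str) -> bool:
    """Normalize the added text and test it against every pattern."""
    return contains_any(normalize(added_lines), *PATTERNS)
```

Under these assumptions a letters-only entry such as "SPOILS" would be dead weight in the list, because normalized text never contains it. A spaced-out spelling such as "S P O I L S" normalizes to "5 P 0 1 L 5" and is missed unless it gets its own entry, which is why spaced variants are added separately.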


*Removing restriction of article pages only and adding a phrase for Joker vandalism; this type of vandalism is not appropriate on user talk pages either.  Tested at length.  --NW 5/20
*Removing restriction of article pages only and adding a phrase for Joker vandalism; this type of vandalism is not appropriate on user talk pages either.  Tested at length.  --NW 5/20
**More Joker garbage, tested. --NW 5/20
**More Joker garbage, tested. --NW 5/20
***More from tonight, tested.  --NW 5/21
***More from tonight, tested.  --NW 5/21
****+ 1 more, also tested. --NW 5/21
****+ 1 more, also tested. --NW 5/21
* More, tested.  --NW 5/22
* More, tested.  --NW 5/22
*Shouldn't the additions of the last few days be moved to a separate filter? None of them are actual obscenities. --Conti
*Shouldn't the additions of the last few days be moved to a separate filter? None of them are actual obscenities. --Conti
*Makes sense, unless two filters eat up more time than one.  Or we could just change the name of this one to "replacing a page with vandalism".  --NW
*Makes sense, unless two filters eat up more time than one.  Or we could just change the name of this one to "replacing a page with vandalism".  --NW
**I think we can live with the 5 additional ms. I'd rather not rename this filter, tho. Adding "☺" to pages is not vandalism, it's the MO of a specific vandal, and therefore should deserve its own filter. --Conti
**I think we can live with the 5 additional ms. I'd rather not rename this filter, tho. Adding "☺" to pages is not vandalism, it's the MO of a specific vandal, and therefore should deserve its own filter. --Conti
*** Done -- the non-obscenities are now in filter 13.  --NW
*** Done -- the non-obscenities are now in filter 13.  --NW
*Fixed the "shit" filter (oops) and added a variant. —Nihiltres
*Fixed the "shit" filter (oops) and added a variant. —Nihiltres
*Re-add one not covered elsewhere anymore.  --NW
*Re-add one not covered elsewhere anymore.  --NW
**Modify to deal with 4chan vandalism.  --NW 11/24
**Modify to deal with 4chan vandalism.  --NW 11/24
***Revert self, that's not gonna work.  --NW 11/24
***Revert self, that's not gonna work.  --NW 11/24


Add "Hermaphrodite" per recent attacks -TS 1/3
Add "Hermaphrodite" per recent attacks -TS 1/3
allow users to blank their own talkpages -- Soap 1-21
allow users to blank their own talkpages -- Soap 1-21
exception for bots due to FP; more intelligent solution would be nice. also added space before crap to let "skyscraper" through (can we not use \b with ccnorm? that would be more ideal)  -- Soap 1-23
exception for bots due to FP; more intelligent solution would be nice. also added space before crap to let "skyscraper" through (can we not use \b with ccnorm? that would be more ideal)  -- Soap 1-23
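The trade-off in the note above (a leading space versus a word boundary) can be sketched in Python. This uses Python's `re` module, not AbuseFilter's regex functions, and it does not answer whether `\b` works together with ccnorm; both helper names are hypothetical.

```python
import re


def substring_hit(text: str, needle: str) -> bool:
    """Plain substring test, as a contains-style check would do."""
    return needle in text


def boundary_hit(text: str, word: str) -> bool:
    """Regex test that requires a word boundary before the word."""
    return re.search(r"\b" + re.escape(word), text) is not None


# A bare substring matches inside a longer word.
inside_word = substring_hit("A SKYSCRAPER", "CRAP")

# A leading space lets the longer word through, but misses the word at the
# very start of the text.
space_inside = substring_hit("A SKYSCRAPER", " CRAP")
space_at_start = substring_hit("CRAP AT START", " CRAP")

# A word boundary handles both cases.
boundary_inside = boundary_hit("A SKYSCRAPER", "CRAP")
boundary_at_start = boundary_hit("CRAP AT START", "CRAP")
```

Here the bare substring is a false positive on "SKYSCRAPER", the leading-space version avoids that at the cost of missing a match at the start of the text, and the boundary version avoids both problems.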
Simplified. - Ruslik
Simplified. - Ruslik


I made a sudden change, tested on test wiki first, to correct the false positives. This required a complete redesign, but I took many of the ideas from the old filter. This should reduce false positives. Log only currently to see how it does. - Shirik 27 Jan 2010
I made a sudden change, tested on test wiki first, to correct the false positives. This required a complete redesign, but I took many of the ideas from the old filter. This should reduce false positives. Log only currently to see how it does. - Shirik 27 Jan 2010
+1, blame SGF -- Shirik 5 Mar 2010
+1, blame SGF -- Shirik 5 Mar 2010
+2 from ongoing attacks -- Shirik 7 Mar 2010
+2 from ongoing attacks -- Shirik 7 Mar 2010
Added exception for SPI due to recent false positives until I can find a better way of doing it. -- Shirik 29 Mar 2010
Added exception for SPI due to recent false positives until I can find a better way of doing it. -- Shirik 29 Mar 2010
Rm namespace for SPI exemption, since article_text does not have namespace in it. -Tim Song 31 Mar 2010
Rm namespace for SPI exemption, since article_text does not have namespace in it. -Tim Song 31 Mar 2010
Added replacement with "ORLY" due to an ongoing attack - Shirik 20 Apr 2010
Added replacement with "ORLY" due to an ongoing attack - Shirik 20 Apr 2010
My first edit filter change. Added + signs after each letter, to match things like "FFFUUUCCCKKK". Also added matches for butt hole and bum hole. Changed to log only per request of Shirik. Tim1357 April 26
My first edit filter change. Added + signs after each letter, to match things like "FFFUUUCCCKKK". Also added matches for butt hole and bum hole. Changed to log only per request of Shirik. Tim1357 April 26
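The "+ after each letter" technique from the note above can be sketched in Python. The helper name `stretchy` and the sample words are assumptions for illustration; the filter's actual regex is not shown on this page.

```python
import re


def stretchy(word: str) -> re.Pattern:
    """Build a pattern in which every letter may repeat one or more times."""
    return re.compile("".join(re.escape(ch) + "+" for ch in word))


fail_pattern = stretchy("FAIL")    # compiles F+A+I+L+
loser_pattern = stretchy("LOSER")  # compiles L+O+S+E+R+

# Repeated letters still match.
stretched = fail_pattern.search("EPIC FFFAAAIIILLL") is not None

# Side effect: a doubled letter produces a match on a different spelling too.
also_matches = loser_pattern.search("LOOSER") is not None
```

Because each letter may repeat, such a pattern also matches ordinary words that happen to double a letter, so every entry is worth checking for this kind of overmatch before the filter is set to disallow.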
Turned disallow back on, added LOOSER Tim1357 April 26
Turned disallow back on, added LOOSER Tim1357 April 26
Optimize (move "Sockpuppet investigation" exclusion to before the regex). --Tim
Optimize (move "Sockpuppet investigation" exclusion to before the regex). --Tim
Rm false positive "cook" (oops) -- Tim 5/5
Rm false positive "cook" (oops) -- Tim 5/5
Add a rule so that it only applies if the page is reduced in size, even though I know that it was made this way on purpose.  See Redrose64's filter logs -- Soap
Add a rule so that it only applies if the page is reduced in size, even though I know that it was made this way on purpose.  See Redrose64's filter logs -- Soap
Add line to exclude users with more than 2,000 edits. -- Tim 4/28
Add line to exclude users with more than 2,000 edits. -- Tim 4/28
Reorder to optimize and add some. -- Tim 6/26
Reorder to optimize and add some. -- Tim 6/26
I'm pretty sure a vandal won't survive more than 100 edits. - KoH
I'm pretty sure a vandal won't survive more than 100 edits. - KoH
Exclude "Sandbox" in page title, there is already a bot to clean that page. -Sole Soul
Exclude "Sandbox" in page title, there is already a bot to clean that page. -Sole Soul
Use irlike and simplify redirect regex. RF 2014-02-17
Use irlike and simplify redirect regex. RF 2014-02-17


Clean layout and reduce condition count.  -DF
Clean layout and reduce condition count.  -DF
Simplify regex. RF 2015-07-03
Simplify regex. RF 2015-07-03
Enhanced redirect regex, renamed bad_words, reduced condition count.  Possible that more optimisation could be done using statistics. (E.g. are SPI edits very much rarer than bad_words edits?) RF 20150806
Enhanced redirect regex, renamed bad_words, reduced condition count.  Possible that more optimisation could be done using statistics. (E.g. are SPI edits very much rarer than bad_words edits?) RF 20150806
Updated for both old and new version of ccnorm. RF 20160812
Updated for both old and new version of ccnorm. RF 20160812


https://phabricator.wikimedia.org/T29987 fully deployed and confirmed working, removing old code ~MA 2016.08.18
https://phabricator.wikimedia.org/T29987 fully deployed and confirmed working, removing old code ~MA 2016.08.18


A couple regex fixes --Kaldari 2016-08-19
A couple regex fixes --Kaldari 2016-08-19
Tweak. RF 20160911
Tweak. RF 20160911


Public per [[Special:Permalink/784131724#Privacy of general vandalism filters]] ~MA
Public per [[Special:Permalink/784131724#Privacy of general vandalism filters]] ~MA


Update deprecated, make more readable/optimize. -G 2018.02.21
Update deprecated, make more readable/optimize. -G 2018.02.21


Optimized regex code. --Oshwah 3/18/2022
Optimized regex code. --Oshwah 3/18/2022


Converted OR condition to non-capturing group. --Oshwah 9/29/2022
Converted OR condition to non-capturing group. --Oshwah 9/29/2022


Reorder to put unlikely conditions last. --Suffusion of Yellow 02:56 23 Oct 2022
Reorder to put unlikely conditions last. --Suffusion of Yellow 02:56 23 Oct 2022
Restore disallow; that was meant to be a quick check. Useless as a log-only filter. That said, this isn't all that useful as a disallowing filter, either; nearly everything is stopped by other filters. No objections to disabling completely. --Suffusion of Yellow 23:10 24 Mar 2023
Actions to take when matched
Actions to take when matched
 
Disallow: abusefilter-disallowed