Edit filter configuration

Differences between versions

Item | Version from 11:00, 11 December 2022 by ProcrastinatingReader | Version from 21:06, 20 April 2024 by Suffusion of Yellow
Basic information
Notes:
Start filtering this, per https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Spam_blacklist&oldid=813788883

Simpler version, we are only logging atm.
+1  Beetstra 12/6/2017

Using added_links, maybe cleaner.

added_links is pretty expensive (for some reason), so if added_lines works for logging purposes then let's use it ~MA

Problem is .. we are going to warn at some point.  added_lines does not work, as it may be part of a changed line (or a moved line, both of which in the diff are interpreted as a deleted line and an added line).  3 out of 5 hits were now false positives .. I was considering to turn it to warn after I saw the first two (the proper positives).
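
The false positives described in the note above can be reproduced outside the filter. The Python sketch below is an illustration only: `added_lines` and `added_links` here are hypothetical helpers that approximate the filter variables with a simple set difference, and the URL and page text are made up. It shows a typo fix on a line that already contained a proxy link: the line counts as "added", but no link is new.

```python
import re

# Rough URL matcher for wikitext external links (an assumption, not MediaWiki's parser).
URL_RE = re.compile(r"https?://[^\s\]|<>]+")
PROXY_RE = re.compile(r"\blibproxy\.", re.IGNORECASE)


def added_lines(old_text, new_text):
    """Lines present in the new revision but not in the old one."""
    old = set(old_text.splitlines())
    return [line for line in new_text.splitlines() if line not in old]


def added_links(old_text, new_text):
    """URLs present in the new revision but not in the old one."""
    old = set(URL_RE.findall(old_text))
    return [url for url in URL_RE.findall(new_text) if url not in old]


# The edit only fixes "teh" -> "the"; the proxy link was already on the page.
old_rev = "Smith wrote teh book.[http://journal.libproxy.example.edu/article1]\n"
new_rev = "Smith wrote the book.[http://journal.libproxy.example.edu/article1]\n"

line_hit = any(PROXY_RE.search(line) for line in added_lines(old_rev, new_rev))
link_hit = any(PROXY_RE.search(url) for url in added_links(old_rev, new_rev))

print(line_hit, link_hit)
```

A check on changed lines alone flags this edit, while a check on newly added links does not, which is why the line-based test is only suitable as a cheap pre-filter.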


Check added_lines first to cancel out most edits before using the more expensive added_links ~MA

Set to warn and prevent, see https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Spam_blacklist&oldid=819509337#Follow_up_-_blacklist_links_to_inside_nets_of_Universities (links may end up being blacklisted, making this filter obsolete).

Block more

exempt bots/admins that are doing housekeeping 20190216 ~Xaos

remove infoweb.newsbank.  Maybe not a proxy, caught with others.

They creep in through drafts (either in draft or in userspace).  Blocking there as well (with the risk of being a bit bitey, but most of these are really utterly useless as they can often not be converted to intelligible links)

upgrading to rlike (temporarily as log only for testing; making it warn for a moment) as one of the links is more complex than a straight domain.  --Beetstra 11122019

Disabling momentarily per https://en.wikipedia.org/w/index.php?title=Mexico%E2%80%93United_States_border&type=revision&diff=932069951&oldid=932068693 and will reenable after cleaning up. --Jasper Deng 12/23/2019.

Expanding on ebscohost.com .. is there any reason why we should not exclude ALL of ebscohost?  --Beetstra
Added more, both ebscohost as an other proxy

Adapting ebscohost to exclude all of ebscohost, adding exclusion for links on search.ebscohost.com that contain BOTH an access number (AN) and 'scope=site' (which seem to be the 'permanent links' that are provided by ebsco as universal).  20220210 Beetstra

an lowercase??? 22020503 Beetstra

Check for ezproxy2 (or more generally: ezproxy\d*) --ProcrastinatingReader 11:00 11 Dec 2022
+ Remove rmwhitespace(); combined with ".*" this can cause a match to start in one link and end in another unrelated link. --Suffusion of Yellow 21:05 20 Apr 2024
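
The cross-link match mentioned in the last note can be shown with a short Python sketch. It rests on two assumptions that are not stated on this page: that the list of added links is compared as one newline-separated string, and that "." in the pattern does not match a newline. `rmwhitespace` below is a stand-in written for this example, and both URLs are invented.

```python
import re

# One alternative taken from the filter's pattern.
SUBPATTERN = r"ebscohost\.com.*(pdfviewer|detail)"


def rmwhitespace(text):
    """Stand-in for the filter function: drop all whitespace, newlines included."""
    return re.sub(r"\s+", "", text)


# Two unrelated links, one per line (hypothetical input).
added_links = "https://www.ebscohost.com/about\nhttps://example.org/detail/42\n"

# Old form: whitespace stripped first, so ".*" can run from the first link into the second.
old_match = re.search(SUBPATTERN, rmwhitespace(added_links.lower())) is not None

# New form: case-insensitive match on the raw text; ".*" stops at the line break.
new_match = re.search(SUBPATTERN, added_links, re.IGNORECASE) is not None

print(old_match, new_match)
```

With the whitespace removed, "ebscohost.com" in the first link and "detail" in the second are joined into one match; without it, each link is tested on its own line.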
Filter conditions
Conditions:
!(contains_any(user_groups, "bot", "sysop")) &  
equals_to_any(page_namespace, 0, 2, 118) &
/* Use added_lines to cancel out edits before using more expensive added_links */
− (rmwhitespace(lcase(added_lines)) rlike "\.proxy\.|\.gate.lib\.|ebscohost\.com.*(pdfviewer|detail)|\.ezproxy-v\.|search\.ebscohost\.com|ebscohost\.com\.|\blibproxy\.|ezproxy\d*\.|\.ucsf\.idm\.oclc\.org|tinyurl.galegroup.com.nls.idm.oclc.org") &
+ (added_lines irlike "\.proxy\.|\.gate.lib\.|ebscohost\.com.*(pdfviewer|detail)|\.ezproxy-v\.|search\.ebscohost\.com|ebscohost\.com\.|\blibproxy\.|ezproxy\d*\.|\.ucsf\.idm\.oclc\.org|tinyurl.galegroup.com.nls.idm.oclc.org") &
− (rmwhitespace(lcase(added_links)) rlike "\.proxy\.|\.gate.lib\.|ebscohost\.com.*(pdfviewer|detail)|\.ezproxy-v\.|search\.ebscohost\.com|ebscohost\.com\.|\blibproxy\.|ezproxy\d*\.|\.ucsf\.idm\.oclc\.org|tinyurl.galegroup.com.nls.idm.oclc.org") &
+ (added_links irlike "\.proxy\.|\.gate.lib\.|ebscohost\.com.*(pdfviewer|detail)|\.ezproxy-v\.|search\.ebscohost\.com|ebscohost\.com\.|\blibproxy\.|ezproxy\d*\.|\.ucsf\.idm\.oclc\.org|tinyurl.galegroup.com.nls.idm.oclc.org") &
!(rmwhitespace(lcase(added_links)) rlike "search\.ebscohost.*?an=")
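
For readers unfamiliar with the filter syntax, the 2024 version of the conditions can be paraphrased as a Python predicate. This is a rough sketch, not the AbuseFilter implementation: `filter_matches` and its parameters are names chosen for this example, the user-group test is approximated with list membership, and `added_lines` and `added_links` are assumed to arrive as plain strings. The early returns mirror the intent of checking the cheaper conditions before the more expensive ones.

```python
import re

# Same alternatives as the filter pattern; IGNORECASE stands in for irlike.
PROXY_RE = re.compile(
    r"\.proxy\.|\.gate.lib\.|ebscohost\.com.*(pdfviewer|detail)|\.ezproxy-v\."
    r"|search\.ebscohost\.com|ebscohost\.com\.|\blibproxy\.|ezproxy\d*\."
    r"|\.ucsf\.idm\.oclc\.org|tinyurl.galegroup.com.nls.idm.oclc.org",
    re.IGNORECASE,
)
# Exemption for search.ebscohost links that carry an access number.
PERMALINK_RE = re.compile(r"search\.ebscohost.*?an=")


def filter_matches(user_groups, page_namespace, added_lines, added_links):
    """Return True when the edit would trip the filter (approximation)."""
    if "bot" in user_groups or "sysop" in user_groups:
        return False  # housekeeping by bots and admins is exempt
    if page_namespace not in (0, 2, 118):
        return False  # only the three namespaces listed in the conditions
    if not PROXY_RE.search(added_lines):
        return False  # cheap check on changed lines rules out most edits
    if not PROXY_RE.search(added_links):
        return False  # confirm that a newly added link really matches
    squeezed = re.sub(r"\s+", "", added_links.lower())
    return PERMALINK_RE.search(squeezed) is None


proxy_url = "http://www.jstor.org.ezproxy2.example.edu/stable/1"
permalink = "http://search.ebscohost.com/login.aspx?direct=true&AN=123&scope=site"

print(filter_matches(["user"], 0, "ref: " + proxy_url, proxy_url))
print(filter_matches(["user"], 0, "ref: " + permalink, permalink))
```

Like the condition as written, the sketch exempts a search.ebscohost link whenever "an=" follows it; it does not separately test for 'scope=site'.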