Wikipedia:Bots/Requests for approval/Qwerfjkl (bot) 12: Difference between revisions


Revision as of 14:31, 26 May 2022

Operator: Qwerfjkl

Time filed: 09:13, Saturday, May 14, 2022 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB

Function overview: Remove [[Category:(country) films]] from overcategorized pages.

Links to relevant discussions (where appropriate): Wikipedia:Bot requests#Film categories (and the prior discussion linked there)

Edit period(s): One time run

Estimated number of pages affected: <200,000

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): Yes

Function details: The bot will use a regexp to remove [[Category:(country) films]] from pages found by running a deepcategory query on the relevant categories. The page count is hard to estimate because of the number of categories being removed and the large size of the categories to work on, so I've estimated an upper limit.
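As a rough illustration of the removal step, here is a minimal Python sketch. It is not the bot's actual AWB configuration: the short `NATIONALITIES` list and the helper names `build_pattern` and `strip_country_film_categories` are assumptions made for this example, and the pattern only mirrors the general shape described in this request.

```python
import re

# Hypothetical sample; the real request covers far more nationalities.
NATIONALITIES = ["Dutch East Indies", "Dutch", "French", "New Zealand"]


def build_pattern(names):
    """Join the escaped names into one alternation inside the category link."""
    alternation = "|".join(re.escape(name) for name in names)
    # Optional space after the colon, optional trailing newline.
    return re.compile(r"\[\[Category: ?(" + alternation + r") films\]\]\n?")


def strip_country_film_categories(wikitext, pattern):
    """Remove every matching parent category link from the page text."""
    return pattern.sub("", wikitext)


PATTERN = build_pattern(NATIONALITIES)

sample = (
    "Some article text.\n"
    "[[Category:French films]]\n"
    "[[Category:French drama films]]\n"
)
cleaned = strip_country_film_categories(sample, PATTERN)
```

Only the bare parent category is removed; a more specific category such as `[[Category:French drama films]]` does not match because the pattern requires ` films]]` immediately after the nationality.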

The categories I'll run deepcategory on are:

Discussion

  • Just for a bit of context on why this is warranted, if it would help: WP:FILM formerly had a policy of deeming "(Country) films" categories to be all-inclusive, meaning that they had to directly include all films from that country even if they were already extensively subcategorized for genre or other characteristics. That wasn't necessarily unreasonable 15 to 20 years ago when that rule was first established, as we had far, far fewer articles about films at that time than we do now — but in 2022, a considerable number of the categories are now populated into the thousands or tens of thousands, and would have been deemed too large and in need of diffusion in virtually any other category tree. So the WikiProject has now established a consensus to drop the "all inclusive" rule, but due to the sheer number of articles involved nobody wants to tackle the whole job manually.
    So the idea is to use a bot to clean out the redundant category from articles that are already properly subcategorized, so that the human editors can concentrate our efforts on the smaller number of articles that are only filed in the parent while lacking any subcategorization. Bearcat (talk) 12:51, 14 May 2022 (UTC)
    Approved for trial (64 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Ideally try to spread them out over the various categories. Primefac (talk) 08:59, 26 May 2022 (UTC)
    @Primefac, is there a limit to the length of a regexp? Currently mine is
    /\[\[Category: ?(Afghan|Albanian|Algerian|American‎|Andorran|Angolan|Antigua and Barbuda|Argentine|Armenian|Australian|Austrian|Austro-Hungarian|Azerbaijani|Bahamian|Bahraini|Bangladeshi|Belarusian|Belgian|Beninese|Bhutanese|Bolivian|Bosnia and Herzegovina|Botswana|Brazilian|British‎|Bruneian|Bulgarian|Burkinabé|Burmese|Burundian|Cambodian|Cameroonian|Canadian‎|Cape Verdean|Chadian|Chilean|Chinese|Colombian|Comorian|Democratic Republic of the Congo|Republic of the Congo|Costa Rican|Croatian|Cuban|Curaçaoan|Cypriot|Czech|Czechoslovak|Danish|Djiboutian|Dominican Republic|Dutch East Indies|Dutch|East Timorese|Ecuadorian|Egyptian|Emirati|Equatoguinean|Estonian|Ethiopian|Faroese|Fijian|Finnish|French‎|Gabonese|Gambian|German|Ghanaian|Greek|Greenlandic|Guatemalan|Bissau-Guinean|Guinean|Haitian|Honduran|Hong Kong|Hungarian|Icelandic|Indian|Indonesian|Iranian|Iraqi|Irish|Israeli|Italian|Ivorian|Jamaican|Japanese‎|Jordanian|Kazakhstani|Kenyan|Korean|Kosovan|Kuwaiti|Kyrgyzstani|Laotian|Latvian|Lebanese|Lesotho|Liberian|Libyan|Lithuanian|Luxembourgian|Macedonian|Malagasy|Malawian|Malaysian|Maldivian|Malian|Maltese|Mauritanian|Mauritian|Mexican‎|Moldovan|Mongolian|Montenegrin|Moroccan|Mozambican|Namibian|Nepalese|New Zealand|Nicaraguan|Nigerian|Nigerien|Norwegian|Pakistani|Palestinian|Panamanian|Paraguayan|Peruvian|Philippine|Polish|Portuguese|Qatari|Romanian|Russian‎|Rwandan|Sahrawi|Samoan|Saudi Arabian|Senegalese|Serbian|Sierra Leonean|Singaporean|Slovak|Slovenian|Somalian|South African|South Sudanese|Soviet‎|Spanish|Sri Lankan|Sudanese|Surinamese|Swazi|Swedish‎|Swiss|Syrian|Taiwanese|Tajikistani|Tanzanian|Thai|Togolese|Tongan|Trinidad and Tobago|Tunisian|Turkish|Turkmenistan|Ugandan|Ukrainian|Uruguayan|Uzbekistani) films\]\]\n?/

    which might need splitting up. Qwerfjkl (talk) 14:29, 26 May 2022 (UTC)
    I have no idea; try it, and if it doesn't work split it up. For what it's worth, you have a lot of Unicode spaces in your copy above (which may or may not be present in your original files) so you might want to check that before you run anything. Primefac (talk) 14:31, 26 May 2022 (UTC)
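One way to run the check suggested above is to scan the pattern for characters that do not show up on screen. The following is only a sketch: the helper names `find_invisible_characters` and `strip_invisible` are made up for this example, and it assumes the stray characters are either Unicode format characters (category `Cf`, for instance a left-to-right mark) or non-ASCII space separators (category `Zs`).

```python
import unicodedata


def find_invisible_characters(pattern):
    """Return (index, name) pairs for format characters and non-ASCII spaces."""
    hits = []
    for index, ch in enumerate(pattern):
        category = unicodedata.category(ch)
        if category == "Cf" or (category == "Zs" and ch != " "):
            hits.append((index, unicodedata.name(ch, "U+%04X" % ord(ch))))
    return hits


def strip_invisible(pattern):
    """Drop format characters and turn non-ASCII spaces into plain spaces."""
    cleaned = []
    for ch in pattern:
        category = unicodedata.category(ch)
        if category == "Cf":
            continue  # invisible mark: remove it entirely
        if category == "Zs" and ch != " ":
            cleaned.append(" ")  # exotic space: normalise to ASCII space
        else:
            cleaned.append(ch)
    return "".join(cleaned)


# Escapes are written out so the hidden character is visible in the source.
sample = "American\u200e|Andorran"
report = find_invisible_characters(sample)
fixed = strip_invisible(sample)
```

An empty report means the pattern is clean under these assumptions; anything listed can be inspected by index before the bot is run.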