
Wikipedia:Bots/Requests for approval/DoggoBot 5: Difference between revisions

From Wikipedia, the free encyclopedia


Revision as of 03:56, 26 February 2022

Operator: EpicPupper (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 18:34, Thursday, February 3, 2022 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): JWB

Source code available: JWB + User:Dicklyon/Tennis cleanup JWB JSON (permalink) (updated permalink, see discussion)

Function overview: Fix over-capitalization in tennis articles.

Links to relevant discussions (where appropriate): [1], BOTREQ

Edit period(s): One time run

Estimated number of pages affected: 16,000+

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Requested at BOTREQ. The regex replacements have already been tested on roughly 1,000 articles, and the false positives found were resolved (the regexes should no longer produce false positives).

Discussion

There are quite a few (many hundreds at least, collecting them now) articles with succession boxes with over-capitalized redirect links (as pointed out in the linked discussion Wikipedia_talk:WikiProject_Tennis#Cleanup_edits) that could be fixed by one more regex replace I think. Give me a day or so to finalize. Any other issues? Dicklyon (talk) 04:03, 4 February 2022 (UTC)

Gotcha. This probably will take a day for approval anyways. 🐶 EpicPupper (he/him | talk) 04:23, 4 February 2022 (UTC)
Got it done (found 662 of these articles needing succession box fixes; more than I want to do by hand); I'll keep testing... Dicklyon (talk) 05:56, 4 February 2022 (UTC)
This would be a good time to make a permalink after this edit. How do you do that? Dicklyon (talk) 06:47, 4 February 2022 (UTC)
Ah, here is the permalink to the version of User:Dicklyon/Tennis cleanup JWB JSON to get approval for. Dicklyon (talk) 07:05, 4 February 2022 (UTC)
And I just updated that permalink to a version with some tab/space tweaks. Dicklyon (talk) 07:12, 4 February 2022 (UTC)
I'm adding this permalink to the request above. Dicklyon (talk) 07:14, 4 February 2022 (UTC)
Hold the phone. My pattern was OK with US but missed U.S. (a few hundred false negatives). There was also a sort of idempotent false positive: it tried to lowercase a string that was already lowercase. I fixed both in this edit. New permalink to version. Dicklyon (talk) 21:59, 4 February 2022 (UTC)
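The two fixes described in this comment (accepting both "US" and "U.S.", and not touching text that is already lowercase) can be sketched in Python as follows. This is only an illustration: the actual JWB rules live in the linked JSON and are not reproduced here, and the pattern, the function name `downcase_us_open_draw`, and the sample strings are assumptions.

```python
import re
from typing import Tuple

# Hypothetical pattern (not the real JWB rule): "US Open" or "U.S. Open"
# followed by a capitalized draw name that should be lowercase in prose.
US_OPEN_DRAW = re.compile(r"\b(US|U\.S\.) Open (Singles|Doubles)\b")


def downcase_us_open_draw(text: str) -> Tuple[str, int]:
    """Lowercase the draw name after 'US Open' or 'U.S. Open'.

    Returns the new text and the number of replacements made. Because the
    pattern only matches the capitalized form, text that is already
    lowercase is never matched, so a second run changes nothing.
    """
    return US_OPEN_DRAW.subn(
        lambda m: f"{m.group(1)} Open {m.group(2).lower()}", text
    )
```

Matching only the capitalized form is one way to avoid the no-op replacement mentioned above: a rule that also matched lowercase text would report an edit without changing anything.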

These changes will fail to update a lot of things in articles in Category:1977 Australian Open (December) and Category:1977 Australian Open (January), and in articles linking to them, because of the embedded parenthetical. The same may apply to some others that use Unicode characters, ampersands, dashes, etc., instead of just numbers, letters, and spaces in the main name part. But I don't think there are many, so I'll deal with them as I find them, using JWB and careful inspection. No need to complicate the bot rules for those at this point, and so far I haven't found any of those with succession boxes or other things that would fail to fix, besides the few parenthetical cases. Dicklyon (talk) 05:00, 6 February 2022 (UTC)
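To illustrate why an embedded parenthetical produces a false negative, here is a minimal Python sketch. It assumes a title pattern that allows only letters, digits, and spaces in the tournament name; the patterns `STRICT` and `LOOSE` and the helper `downcase_draw` are hypothetical and are not taken from the JWB JSON.

```python
import re

# Assumed shape of a restrictive title pattern: a year, a name made only of
# letters, digits and spaces, an en dash, then a capitalized draw name.
STRICT = re.compile(
    r"(\d{4} [A-Za-z0-9 ]+ – (?:Men's|Women's) )(Singles|Doubles)"
)

# A looser variant that also tolerates one trailing parenthetical, such as
# "(December)", at the end of the tournament name.
LOOSE = re.compile(
    r"(\d{4} [A-Za-z0-9 ]+(?: \([A-Za-z]+\))? – (?:Men's|Women's) )(Singles|Doubles)"
)


def downcase_draw(pattern: "re.Pattern[str]", text: str) -> str:
    """Lowercase the draw name wherever the given pattern matches."""
    return pattern.sub(lambda m: m.group(1) + m.group(2).lower(), text)


plain = "1977 Australian Open – Men's Singles"
paren = "1977 Australian Open (December) – Men's Singles"
```

With these inputs, `STRICT` rewrites `plain` but leaves `paren` untouched, because "(" is not in the allowed character class; `LOOSE` handles both. The trade-off is the one raised in the comment: each loosening of the pattern widens what it can match, so handling the few exceptions by hand keeps the main rules restrictive.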

OK, I found and fixed a few dozen of those. There may remain a few false negatives, but that's OK. Dicklyon (talk) 17:49, 6 February 2022 (UTC)

@ProcrastinatingReader: thanks for approving TolBot 13A which moved so many of these tennis articles. This is the cleanup to those pages and the rest of the tennis articles with similar over-capitalization. Take a look if you get a chance. Dicklyon (talk) 22:34, 6 February 2022 (UTC)

{{BAG assistance needed}} Could somebody have a look at this, please? Thanks! 🐶 EpicPupper (he/him | talk) 19:45, 12 February 2022 (UTC)

  • I am not overly keen to approve a bot purely for the purposes of avoiding redirects in articles. Primefac (talk) 13:57, 13 February 2022 (UTC)
    Hello @Primefac, I'm slightly confused by your comment. This is an example of an edit; I don't understand how this is avoiding redirects. Rather, the purpose of this task is to fix over-capitalization in prose and/or infoboxes. Pinging @Dicklyon as well. 🐶 EpicPupper (he/him | talk) 21:11, 13 February 2022 (UTC)
    @EpicPupper and Primefac:, that's not an example of what this bot would do, but a different set of over-capitalization that I was working on. Some of the more recent tennis edits just avoided a redirect in the previous and next links, e.g. this one after I added clauses to fix that and the case errors that showed up in succession boxes (actually, it looks like I did that one "by hand"). I was going to ignore the redirects, but the only feedback I got was to fix those, too, so I did. At that same article, the previous edit by JWB shows some of the common tennis case fixes that this set of replaces does. Here is another good example, with both the previous/next redirect updates and the other fixes. This discussion subsection started with my tweaks to succession box links, so I can see how I gave the wrong impression. Dicklyon (talk) 21:40, 13 February 2022 (UTC)

{{BAG assistance needed}} Any BAG folks got time to take a look? Dicklyon (talk) 19:33, 15 February 2022 (UTC)

I legitimately have no idea what type of edits this bot is supposed to be performing. Please do a handful with a non-bot account and link them here as exemplars. Primefac (talk) 12:35, 16 February 2022 (UTC)
@Primefac: I ran more examples; my contribs with summary (case cleanup (test for bot) (via WP:JWB)). Mostly it's just downcasing things like First Round to first round or First round, depending on context, and Singles and Doubles and such where relevant. Some examples that do a few more things in addition:
  • [2] and [3] include case fixes in the bold lead.
  • [4] and [5] and [6] include the case fix redirect bypasses we were talking about.
  • [7] and [8] show fixes to less common terms like gold medalist.
  • [9] fixes "Wild Card" and "Lucky Loser".
  • [10] downcases "singles" in a typical prose context.
  • [11] is an example with some obvious false negatives. I didn't try to guess what all might need to be downcased, and this one surprised me; I'll go further by hand.
  • [12] shows a visible link update only
I did another pass over the 17,000 articles looking for a capital letter after "due to" and fixed those by hand (only 3 in total weren't names). There are surely other false negatives (over-capitalization that I didn't anticipate in the JWB patterns), but none that I can identify at this time. I still haven't seen any false positives; my patterns are pretty restrictive, to avoid them. Dicklyon (talk) 18:27, 18 February 2022 (UTC)
And they all illustrate the widespread basic over-capitalization fixes, which is why essentially all tennis articles are involved. Dicklyon (talk) 23:36, 16 February 2022 (UTC)
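The context-dependent downcasing described above ("First Round" becoming "first round" or "First round") could be approximated along the following lines. This is a rough Python sketch under stated assumptions, not the JWB rule set: the list of ordinals, the set of characters treated as starting a sentence, table cell, or list item, and the name `fix_round_case` are all illustrative.

```python
import re

# Hypothetical pattern covering a few ordinal round names.
ROUND = re.compile(r"\b(First|Second|Third|Fourth) Round\b")


def fix_round_case(text: str) -> str:
    """Downcase 'First Round' to 'first round', or to 'First round' when it
    starts a sentence, a table cell, or a list item."""

    def repl(m: "re.Match[str]") -> str:
        before = text[: m.start()].rstrip()
        # Assumed markers of a new unit: start of text, sentence-ending
        # punctuation, a wikitable cell separator, or a list bullet.
        starts_unit = before == "" or before[-1] in ".!?|*"
        ordinal = m.group(1) if starts_unit else m.group(1).lower()
        return f"{ordinal} round"

    return ROUND.sub(repl, text)
```

For example, `fix_round_case("He lost in the First Round.")` lowercases both words, while `fix_round_case("| First Round")` keeps the leading capital. A heuristic like this will miss contexts it does not anticipate, which matches the false negatives acknowledged in the discussion.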
@Primefac: PTAL. Dicklyon (talk) 18:27, 18 February 2022 (UTC)
@Primefac: Have you decided to not be bothered further by this one? Should I ask for BAG assistance again? Dicklyon (talk) 21:16, 21 February 2022 (UTC)
I only get to BRFA about once a week, if that; I was away this last weekend and did not have as much time to dedicate to my usual weekend rota. Please be patient; I do not see this as a high-priority task. It will get looked at when it gets looked at. Primefac (talk) 08:26, 22 February 2022 (UTC)
  • My gut feeling is that a bot should not be used to enforce capitalisation preferences or to bypass redirects, especially as there is a history of these sorts of changes to Tennis articles being controversial. Thryduulf (talk) 10:57, 23 February 2022 (UTC)
    Do you have another suggestion about how to implement the consensus to fix these articles? It's not about preferences, and bypassing redirects is a trivial and uncommon part of the changes. Have you looked at the diffs? Are there any where leaving the text as is would be a viable alternative? Dicklyon (talk) 02:31, 24 February 2022 (UTC)
    I just did a few hundred more fixes with JWB using the settings linked (see my contribs before now). Let me know if you see anything there that's potentially controversial, or not obviously correct. Dicklyon (talk) 02:34, 24 February 2022 (UTC)
    The changes should not be made automatically. They should be done carefully, individually or in small groups, so that the person making them can ensure that every single one is correct and hasn't introduced more errors, without requiring other editors to trawl through hundreds of your contributions to do that work for you. Thryduulf (talk) 12:48, 24 February 2022 (UTC)
    I have done about a thousand carefully over the last month, and have invited further scrutiny. The only feedback I got was about the items I failed to fix. No false downcasings have been observed, because I was careful about the contexts. There are still 16,000 articles left to fix, hence the bot request. This is a routine way to do such large-scale fixes of stereotypical problems. Dicklyon (talk) 17:18, 24 February 2022 (UTC)

I ran over a thousand more test edits, and found and fixed a couple more misses (false negatives); see this edit to the JWB setup. So let's use this latest version. I also went through and fixed about 13 cases of "&nbsp;" before the dash in before_name in some of the "Boys'" articles, which messed with my patterns. Still no false positives (accidental inappropriate downcasings). Dicklyon (talk) 04:31, 25 February 2022 (UTC)
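As an illustration of the non-breaking-space problem mentioned above, the following Python sketch normalizes a non-breaking space before a dash to an ordinary space, so that patterns expecting " – " match again. It assumes the offending text is either the HTML entity `&nbsp;` or the literal U+00A0 character; the name `normalize_space_before_dash` and the sample title are hypothetical, and the fixes described in the comment were made by hand rather than with code like this.

```python
import re

# Matches the HTML entity or the literal non-breaking space, but only when
# it is immediately followed by an en dash or a hyphen.
NBSP_BEFORE_DASH = re.compile(r"(?:&nbsp;|\u00a0)(?=[–-])")


def normalize_space_before_dash(value: str) -> str:
    """Replace a non-breaking space directly before a dash with a space."""
    return NBSP_BEFORE_DASH.sub(" ", value)
```

For instance, `normalize_space_before_dash("2014 Wimbledon Championships&nbsp;– Boys' Singles")` yields the same title with a plain space before the dash. Whether the non-breaking space should be kept for display reasons is a separate editorial question that this sketch does not address.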

I did more than a thousand more today; no new problems spotted. Of course, that's just getting a glance at each diff as I get into a bot-like clicking rhythm – and yes of course I do take full responsibility for any errors, should any be found. I suppose I can finish the lot this way in a couple of weeks' time, but a bot still makes a lot more sense. Dicklyon (talk) 22:09, 25 February 2022 (UTC)

OK, I found one false positive downcasing in a ref title. One in several thousand seems like a tolerable error rate; I fixed this one. Dicklyon (talk) 03:55, 26 February 2022 (UTC)