User talk:BilledMammal/Mass Creation Draftification
Discussion
@Paradise Chronicle, Avilich, Levivich, Dlthewave, S Marshall, and FOARP: Now that WP:LUGSTUBS is closed we need to decide what group of articles to move forwards with next, using the precedent set by LUGSTUBS.
I don't believe this second group needs to be as conservative as the first group, but I also don't want to propose anything too risky while the process is still being established.
However, I do want to use this group to stretch the boundaries of what the process can be used for; in it I want to do at least some of the following:
- Change the article selection criteria, to avoid the specific values of the first group becoming a standard part of the process
- Nominate articles created by editors other than Lugnuts, to avoid the process becoming a "Lugnuts cleanup process"
- Nominate articles on topics outside the sports topic area - or at least the Olympics topic area - to avoid the process becoming a "sports cleanup process"
- Nominate a larger number of articles, to avoid the process being limited to ~1000 article batches
Currently I have prepared five options. There may be better options available so if any of you have ideas please propose them and I will write a query for it. Of the five options:
- One and two were suggested by FOARP, and I don't know enough about the broader context to comment on the risk of proposing them though I am concerned that opposition may be stronger here; the articles are possibly covered by WP:GEOLAND, and the requirement to have at least one non-database source containing WP:SIGCOV imposed by WP:SPORTSCRIT #5 and WP:MASSCREATE does not apply.
- Three is simply an expansion of the initial group; here the biggest risk is that pushback is likely to be stronger the closer we get to the modern day.
- Four allows us to go beyond Olympians and Lugnuts while remaining within a topic area covered by WP:SPORTSCRIT #5. The risk here is the offline sources that BlackJack included; while there is no reason to believe these contribute to notability they present an unwelcome complexity.
- Five may be the safest; I believe it will be relatively uncontroversial while allowing us to go beyond Olympians, expanding the group size, and tweaking the article selection criteria.
The initial parameters of these options are flexible; please propose any changes you believe would be beneficial and I will write a query to implement them.
If there are other editors you believe may contribute positively to the conversation please ping them; I have likely missed many. BilledMammal (talk) 03:11, 28 May 2023 (UTC)
- I'll add that option 4 and option 5 exclude some sources from within the espncricinfo.com source in the query; while almost all of them are database sources, those whose addresses start with espncricinfo.com/wisden and espncricinfo.com/stories are exceptions. As this is an exclusion, not an inclusion, I'm not certain whether it is worth complicating the actual proposal with a mention of it. BilledMammal (talk) 03:30, 28 May 2023 (UTC)
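As an illustration of the exclusion described above, the following is a minimal Python sketch, not the actual Quarry query. It treats any espncricinfo.com link as a database source unless its path starts with one of the two prefixes mentioned. The function name, the constant, and the handling of the `www.` host are assumptions made for the example.

```python
from urllib.parse import urlparse

# Assumption: these are the only two non-database sections, per the comment above.
NON_DATABASE_PREFIXES = ("/wisden", "/stories")


def is_espncricinfo_database_link(url: str) -> bool:
    """Return True if the URL is an espncricinfo.com link outside the excepted sections."""
    # Accept addresses written without a scheme, e.g. "espncricinfo.com/stories/..."
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host != "espncricinfo.com":
        return False  # not an ESPNcricinfo link at all (other subdomains are ignored here)
    path = parsed.path.lower()
    # Plain prefix match, mirroring "addresses start with espncricinfo.com/wisden"
    return not any(path.startswith(prefix) for prefix in NON_DATABASE_PREFIXES)


if __name__ == "__main__":
    for link in (
        "https://www.espncricinfo.com/player/example-12345",
        "espncricinfo.com/wisdenalmanack/content/story/1.html",
        "https://www.espncricinfo.com/stories/example",
    ):
        print(link, is_espncricinfo_database_link(link))
```

An article would only stay on the list if every ESPNcricinfo link in it is classified as a database link under a rule of this kind.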
- Pinging XAM2175, who joined a discussion on my talk page on this topic. BilledMammal (talk) 03:31, 28 May 2023 (UTC)
- Thanks for the ping. Something around WP:SPORTSCRIT I feel ready and comfortable to support. Makes sense to me. Paradise Chronicle (talk) 05:00, 28 May 2023 (UTC)
- I seriously hope you'll reconsider this, because this looks a lot like the Wikipedia equivalent to the British Railways disaster with the Beeching Axe. The consequences of this action would cause significant damage to the website. I apologize if this is wrongful intrusion or I wasn't supposed to reply to this. I also mean no offense. I've put years of work into articles on this site to the best of my abilities and fear losing everything. If I'm not allowed to comment here, I won't do it again.
— MatthewAnderson707 (talk|sandbox) 04:36, 28 May 2023 (UTC)
- @MatthewAnderson707:
I apologize if this is wrongful intrusion or I wasn't supposed to reply to this.
I'm a little confused how you found this discussion (can I ask how you found it? - given we have never interacted I doubt you are following my edits) but you are welcome to comment; it is helpful to have input from editors who disagree with the broad concept.
- However, I do believe that the curation of Wikipedia's content is in the best interests of the encyclopedia, and this is part of that effort so I intend to move forward with the broad concept though not necessarily the specific proposals I've drafted here. Regarding your concerns about losing your contributions, this process is intended to address problematic mass creation; looking at your contributions now I don't believe you've ever engaged in mass creation, problematic or otherwise, and so it shouldn't affect you. BilledMammal (talk) 04:58, 28 May 2023 (UTC)
- Thank you for the reply. I came across the link while using Google's advanced search to try and find information on article drafting. Sometimes I prefer to use Google over Wikipedia's own search function. I read the discussion and got worried, then butted in. — MatthewAnderson707 (talk|sandbox) 05:41, 28 May 2023 (UTC)
- I think we need to recognise that this only just squeaked through. We had a strong majority but the opposers said it didn't quite amount to a rough consensus within Wikipedia's idiosyncratic definition of the term. They had an arguable case and I think they argued it well. So imv the key objective for the second round isn't to expand the criteria; it's to get it through comfortably. We should normalize this as a process before we try to stretch it. For this reason, I urge being super-conservative. Please consider going with the first ~1,200 articles in query #5 (which is still ratcheting up 25% compared to WP:LUGSTUBS).—S Marshall T/C 10:17, 28 May 2023 (UTC)
- And, please give us a week's warning before launch. I would like to manually check a random sample of them and I'd encourage others to do so.—S Marshall T/C 10:20, 28 May 2023 (UTC)
- +1 I'd also like to go through a few before the launch of a discussion. Paradise Chronicle (talk) 10:22, 28 May 2023 (UTC)
- That's a fair view and I'm happy to go along with it. I've adjusted query #5 to use the old criteria - I can't see any criteria we can add to remove articles, but we can just do an arbitrary "first 1200" or similar.
- Regarding how long to wait before launch, I have been wondering if we should wait until around the 22nd of June - a month after the close review was closed - in order to give the community a break from this topic. However, if you don't think that is necessary I'm happy to open it sooner.
- I definitely encourage editors to manually review a sample of the articles included; it is possible that I missed something that we need to account for. BilledMammal (talk) 10:35, 28 May 2023 (UTC)
- I've converted the page into a proposal for option 5, limited to the first 1200 articles. I am concerned that note b is unclear; I would be grateful if someone could look at it and let me know what they think.
- If anyone wants the query to be adjusted, please let me know. I'm also happy to explore queries on different topics. BilledMammal (talk) 10:56, 28 May 2023 (UTC)
- Yes, I think this is much more likely to succeed. Definitely don't launch it for at least a couple of weeks so as to give the community a break. I'll get reviewing.—S Marshall T/C 11:21, 28 May 2023 (UTC)
- Consider removing Dara Bashir from this.—S Marshall T/C 15:41, 31 May 2023 (UTC)
- I assume the query will get re-run before it gets sent to anywhere? Blue Square Thing (talk) 18:04, 31 May 2023 (UTC)
- BilledMammal has already said it'll be re-run before the formal RfC.—S Marshall T/C 14:38, 1 June 2023 (UTC)
- Are you kidding me? We just finished the first discussion that absolutely drained me–and now you're doing another 1,000–and after that four more of even larger proportions? BeanieFan11 (talk) 14:29, 28 May 2023 (UTC)
- Hi BeanieFan11, how did you find this discussion? BilledMammal (talk) 21:59, 28 May 2023 (UTC)
- I think I saw an edit by Paradise Chronicle in my watchlist and clicked "contributions" (I have a habit of doing that)–or it might have been Blue Square Thing or you. I'm not completely sure. Though, why are you so concerned about how people find discussions? BeanieFan11 (talk) 22:13, 28 May 2023 (UTC)
- It's lucky he did, as this discussion should be taking place in a central location, not hidden away in a user's sandbox. Spike 'em (talk) 20:44, 29 May 2023 (UTC)
- In reply to this, and your other comment; the discussion will take place in a central location, with this just a preliminary drafting process. I had been intending to ping editors like Blue Square Thing, who I have found while opposed to this contribute productively and in a non-disruptive way, prior to opening the discussion to give them time to comment on the draft. BilledMammal (talk) 22:26, 29 May 2023 (UTC)
- @Spike 'em: there is a general expectation amongst a considerable portion of the community that complex and/or potentially-controversial RfCs be drafted before they are formally launched. In any case, since a small bunch of editors at a single user talkpage represents about the lowest possible CONLEVEL it's possible to describe, I don't know what it is that you fear will happen here other than the eventual launch of a formal, and formally-advertised, RfC. XAM2175 (T) 23:09, 29 May 2023 (UTC)
- Although by having this on a user page it's possible to delete it at any point, which might be less helpful. Perhaps we need to ask for a central location where discussions like this can be had - perhaps connected to the ArbCom request for mass deletion stuff? That might reassure people that this isn't the obverse of whatever that group who tried to stop articles being deleted was. Blue Square Thing (talk) 09:26, 30 May 2023 (UTC)
- I've been taking a break, but hiding this away in user space and demanding to know how people found it does not come across as collaborative to me. Spike 'em (talk) 17:46, 27 June 2023 (UTC)
- Q: Has consideration been given to expanding the scope to include similar db-sourced cricketer biography micro-stubs created by editors other than Lugnuts? wjematherplease leave a message... 17:53, 28 May 2023 (UTC)
- @Wjemather: I've considered it, but we often start to include articles that weren't mass created, and articles that are more likely to contain substantial offline sources. If we ever go down that path we'll need a tool more powerful than Quarry to build the list. BilledMammal (talk) 22:02, 28 May 2023 (UTC)
- Rereading your question, I am not certain I answered it. While I don't plan to include creations by multiple editors in the same group, consideration has been given to creating groups of mass created cricketers made by other editors, such as BlackJack. BilledMammal (talk) 23:36, 29 May 2023 (UTC)
- This is going to be a very difficult and extremely long set of articles to review. It would be much easier if the proposal were split into groups based on nationality. For example, we know that New Zealanders and, to an extent, Australians are much easier to find sources for if they exist - the excellent online extensive newspaper archives. At the same time, many of these cricketers will have played only for one team - so a redirect - which is the long-running consensus at AfD - is more obvious and easier to handle. Pakistanis, on the other hand, often played for multiple teams due to the nature of cricket in that country, so redirects are much harder to handle. Sources also seem to be much harder to find for South Asian players in general and for many West Indians. They aren't easy for many South Africans but are generally a little easier for British cricketers and, as I say, much easier for New Zealanders. The crude, blunderbuss style methodology being taken to identify articles is already proving to be far too crude for the Olympians list - there are clear cut cases where articles which already had extensive sourcing in them, albeit only to Olympedia, have been draftified where they are very obviously notable - gut feeling, from a sample of 47, is that this is at a 15% level with a further 20% or so where redirect is clearly a better option. If I had, say, some months without having to look at yet another list, I'd have a chance of knowing for sure whether this is true for a larger sample. I'll add a separate point below. Blue Square Thing (talk) 18:43, 28 May 2023 (UTC)
- Q has the sourcing already in these articles actually been checked? Both Cricinfo and CricketArchive will sometimes have prose in their profiles - this is more true for Anglophone cricketers, and probably more so again for British cricketers. Blue Square Thing (talk) 18:43, 28 May 2023 (UTC)
- You need to check that the query's working, because I'm finding some already which have had other sources added yet are on this list. That shouldn't be the case should it? It suggests an unreliable methodology. There are also some who already have noted on the articles their notability outside of cricket. That should be being picked up shouldn't it? Blue Square Thing (talk) 19:29, 28 May 2023 (UTC)
- @Blue Square Thing: It is possible there is a mistake with the query; it's why we have this review process prior to opening the discussion. Can you link the article you found that you don't believe should be on the list? BilledMammal (talk) 21:59, 28 May 2023 (UTC)
- I think I found the issue; previously, the query was unable to check for the presence of offline sources. I have improved it so it now looks for the presence of citation templates (any template within Category:Citation templates or its children, plus Template:ISBN); if templates of a sort that are not expected (for example, Template:Cite web is expected) are found it will exclude the result. This results in seven articles being dropped from the list; Alan Shiell, Albertus Eckhoff, Andrew Fairbairn (cricketer), Andrew Grieve (cricketer), Anthony Bullick, Anup Ghatak, Colin Dettmer. Few of them contain WP:SIGCOV (the most common source among them was this book) but they all need to be removed to keep the proposal simple and reliable.
- If there are other templates that suggest relevant sources may be present in the article please let me know and I will add them to the exclusion list. BilledMammal (talk) 05:22, 29 May 2023 (UTC)
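The exclusion rule described above can be sketched roughly as follows. This is a hedged Python illustration rather than the real query: the article titles, the sample template names, and the function names are hypothetical, and the set of expected templates is assumed to contain only Template:Cite web because that is the only one named in the comment.

```python
# Assumption: templates the query expects to see on a database-only stub.
EXPECTED_TEMPLATES = {"Cite web"}


def has_unexpected_citation_template(article_templates, citation_templates,
                                     expected=EXPECTED_TEMPLATES):
    """Return True if the article uses a citation template outside the expected set."""
    used_citation_templates = set(article_templates) & set(citation_templates)
    return bool(used_citation_templates - set(expected))


def filter_candidates(articles, citation_templates):
    """Keep only the articles with no unexpected citation templates."""
    return [
        title
        for title, templates in articles.items()
        if not has_unexpected_citation_template(templates, citation_templates)
    ]


if __name__ == "__main__":
    # Hypothetical stand-in for Category:Citation templates (and children) plus Template:ISBN.
    citation_templates = {"Cite web", "Cite book", "ISBN"}
    # Hypothetical articles mapped to the templates they transclude.
    articles = {
        "Stub A": ["Cite web", "Infobox cricketer"],
        "Stub B": ["Cite web", "Cite book"],
        "Stub C": ["ISBN"],
    }
    print(filter_candidates(articles, citation_templates))
```

Under these assumptions only "Stub A" stays on the list. A rule like this cannot see sources added as plain text without any template, which is the limitation raised in the replies.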
- I'd wondered if it was an issue with the sort of thing you were searching for. What if an article doesn't use cite templates at all? I regularly don't use those at all - there's no requirement to do so remember. Does that mean that if articles have CricInfo added as a simple ref rather than using cite web and that was the only reference provided, that they wouldn't be found? Or if someone added the Wisden obituary (as in the Ghatak case) but without the ISBN template that it would be included in the list? I think for the methodology to be reliable you need to find a way to include sources that don't use any template - or at least enable a manual check. I went through perhaps 70 or so articles with anglophone names and found a handful already with alternative sources and maybe the same again, or more really, where the sourcing is already in the article (it's in the CricInfo profile - Alan Shiell for example) or linked from the CricInfo profile. There's also at least one Olympian where Olympedia - with prose - is a source but for the life of me I can't remember the name. Having these split by nationality would be a real advantage when doing that check - Zimbabweans, for example, sometimes have really detailed biographies linked from their CricInfo profile (Alec Taylor (cricketer) for example).
- Example question: can this method find any source at all which doesn't use a template? So, for example, would it find the sources in Bob Lipscomb if the article were shorter, or would it assume it was only references to CricInfo as that's the only reference that's been used? Blue Square Thing (talk) 07:26, 29 May 2023 (UTC)
- If the article doesn't use cite templates I can usually find another template that indicates the presence of sources, or failing that another indicator that a source exists. For example, I caught your improvement to Albertus Eckhoff by looking for ISBN templates. However, I can only do this when I have examples; whenever you find articles that should be excluded can you link them here?
- Bob Lipscomb wouldn't be included in the list; the query would notice the presence of a link to this story and omit him.
- Regarding examples where the database source has some prose I am not concerned about their inclusion; WP:SPORTSCRIT #5 requires coverage in non-database sources. I'll add that Olympedia never counts towards notability as it lacks independence; it is owned by the IOC. BilledMammal (talk) 07:40, 29 May 2023 (UTC)
- Can you from time to time rerun the query because there will be pages that are updated and sourced. Themanwithnowifi (talk) 08:11, 29 May 2023 (UTC)
- Yes; I will run the query again immediately before opening the discussion. I'll also run it whenever I update it, but if you want it run at a different time please let me know. BilledMammal (talk) 08:12, 29 May 2023 (UTC)
- Question 1: to clarify, the link anywhere in refs on the Lipscomb article, even without a cite template being used, would be picked up, yes? Blue Square Thing (talk) 15:31, 29 May 2023 (UTC)
- Links anywhere in an article; they don't need to be in ref tags to be picked up. BilledMammal (talk) 22:27, 29 May 2023 (UTC)
- Question 2: is it technically possible to organise these by nationality? Using categories perhaps? This could make a massive difference in how I view any list put forward. Blue Square Thing (talk) 09:21, 30 May 2023 (UTC)
- @Blue Square Thing: See Quarry:query/74885. This should include categories related to nationality. It also includes a few other categories such as the state the individual played for as I was not able to exclude them; if those are problematic let me know and I will see what I can do about limiting it solely to countries. BilledMammal (talk) 06:55, 2 July 2023 (UTC)
- To the ones who argued Alan Shiell needs to be dropped; I expanded Alan Shiell while double checking the articles in anticipation of an eventual RfC. To me it's also not really clear what articles are included in option 5, as the text of option 5 is stricken. Paradise Chronicle (talk) 02:50, 1 June 2023 (UTC)
Entries to check
- I've already mentioned Dara Bashir above.—S Marshall T/C 14:38, 1 June 2023 (UTC)
- Charith Jayampathi's entry at [1] does contain prose, but I do not for one moment believe it's SIGCOV.—S Marshall T/C 14:38, 1 June 2023 (UTC)
- Adil Raza's entry at [2] contains prose, which is also not SIGCOV. Maybe we should AfD these separately to be on the safe side.—S Marshall T/C 14:41, 1 June 2023 (UTC)
- Dara Bashir will be removed the next time the query is run; it was expanded on 30 May.
- I think Charith Jayampathi and Adil Raza are fine; I don't believe prose there is relevant as ESPNcricinfo is a database source that doesn't contribute to WP:SPORTSCRIT #5 - Olympedia containing prose was not an issue even before I identified it lacked independence. Further, from a practical point of view, there is no way to systematically identify which contain prose. BilledMammal (talk) 01:29, 2 June 2023 (UTC)
- Would you consider the prose at this CricInfo profile to be part of a database? Blue Square Thing (talk) 07:23, 2 June 2023 (UTC)
- Everything in sections outside of "Stories" and "Wisden" is part of the comprehensive database; it's no different to Olympedia in that there may be some prose for some cricketers but that doesn't give it an exemption from the requirement to have a non-database source. I note the article for that cricketer, VVS Laxman, isn't on the list. BilledMammal (talk) 07:29, 2 June 2023 (UTC)
- I think you need to compromise a little here. That's clearly not anything close to a database entry - just like this one isn't. It will hurt your argument if I can produce dozens of similar prose sections. It is, however, more reasonable to suggest that the fact that these articles haven't been improved in X years is a good indication that prose sections such as those are unlikely to exist for the articles on the list you've created. As a counter argument I'll suggest that a suitable amount of time needs to be provided just to double check that, and I tend to be of the view that in the cases where it does - or where other indications of likely notability exist (articles linked from the bottom of the profile, an absolute tonne of top-level appearances etc) - that it's not unreasonable, as a compromise, to remove them from the list and consider them in another way. I've honestly no idea how many articles that would be, but 1,200 is a lot to check, even at 1 minute per article. That way we can move forward with this process, which has some utility, without quite as much conflict. I'll get it in the neck from some of the cricket people for compromising I'm sure. It's not unreasonable to expect a reasonable compromise back. Blue Square Thing (talk) 07:47, 2 June 2023 (UTC)
- Alex Moir isn't on the list either, though? Have you been able to find any on the list with a blurb comparable to Laxman's or Moir's?
- Even if you can, rather than complicating the criteria, I suggest the simplest way to address their inclusion is to find a non-database source. If it is as strong an indicator of notability as you believe that should be easy to do. BilledMammal (talk) 07:57, 2 June 2023 (UTC)
- There are 1,200 articles on the list. I've looked at maybe 70 or so from trawling that list so I honestly don't know if there are any with substantial prose. As I said above, it might be possible to argue that *because* these articles haven't been developed that there *might* be none on the list with any form of prose in the profiles. But to argue that CricInfo profiles are always simple database entries isn't fair and it is reasonable to allow them to be checked.
- You want one from the list - Alec Marks has a pretty decent level of prose which has clearly come from another source - it would also be possible to argue that having played 35 matches should also suggest some level of notability in itself. A less good example is Alfred Bashford, although the prose there strongly suggests to me that we could find more than enough to add plenty of sources. I was only looking at anglophone names and got about as far as him.
- A different issue is raised by the CricInfo profile of Alec Taylor (cricketer) where an obviously detailed biography is clearly linked from the profile. That'll tend to only be the case for Zimbabweans from the early 2000s I imagine. Blue Square Thing (talk) 08:38, 2 June 2023 (UTC)
A different issue is raised by the CricInfo profile of Alec Taylor (cricketer) where an obviously detailed biography is clearly linked from the profile.
Those are easy to address; just add the source to the article. BilledMammal (talk) 08:42, 2 June 2023 (UTC)
- Tbh that's why I need the breakdown by country at some point (there's no rush but it would be nice to get it a reasonable period of time before this goes to RfC). These sorts of ones will be confined to Zimbabweans. Blue Square Thing (talk) 08:49, 3 June 2023 (UTC)
Neutral notification of mention at ANI. Folly Mox (talk) 06:30, 3 June 2023 (UTC)
- The very first article on this list, which I have now expanded, is A. G. Pradeep who is a former India under-19 captain and captained his state at Ranji Trophy; clearly notable. That doesn't bode well for mass deletion of these articles. Beeeggs (talk) 09:17, 3 June 2023 (UTC)
- As a practicality, would it be useful to restrict the list to only go as far as Deendyal Upadhyay? As articles are improved they're added each time the query is run - I assume there's a hard count on the length of the query. The version listed right now goes to Dega Nischal and I know that several articles have been worked on to the extent that they will come off the list. It would make it easier if articles didn't keep on getting added at the bottom each time that happened! (and also allow an easier comparison). Blue Square Thing (talk) 09:34, 3 June 2023 (UTC)
- This isn't the RfC. This is a quiet break between RfCs, to give the community a rest. We've decided the list will be the first 1,200 articles of the ~5,000 which would be eligible. So at this stage, when we remove one, another gets added. Once the RfC starts, any that get improved will come off and not be replaced.—S Marshall T/C 20:50, 3 June 2023 (UTC)
- Oh, I thought this was for suggestions. I didn't realise it had already been decided. It doesn't really matter, but it just means the checking process will take even longer. Blue Square Thing (talk) 08:56, 4 June 2023 (UTC)
- It's not practical to check them all. It's not even reasonable to check them all. Checking one properly takes far more time and effort than Lugnuts ever spent writing them in the first place. The thousands of volunteer-hours we've already blown could have been so much better spent elsewhere; heck, in the time it takes to review and improve one Lugnuts stub, I bet you personally could have written two decent cricketer articles from scratch! You could be writing the stuff you actually want to write, instead of bogged down in all this nonsense, and I think it's a tragedy for the encyclopaedia that you're doing this.
- The sheer scale of the problem we're trying to address with these RfCs makes it impractical to check every one individually. This is the link I showed Arbcom: Table of prolific article-starters. There are 94,000+ Lugnuts articles to get through (and also relevant to cricket are 800+ articles by now-banned sockmaster User:BlackJack, who appears to have forged his offline references). Then there are 70,000+ articles by Carlossuarez46 about world villages, many of which don't exist or turn out to be random farms or landscape features rather than villages. Lugnuts' stuff is higher priority, though, because so many of his are BLPs. At this rate we'll get through four RfCs a year. With only 1,200 articles per RfC, cleaning up after Lugnuts would take nearly twenty years.
- So we have to both speed up, so we're doing more batch RfCs per year, and also include more articles per batch than 1,200.
- Checking them all, in those circumstances, is unreasonable and unachievable.—S Marshall T/C 11:34, 4 June 2023 (UTC)
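The "nearly twenty years" estimate above follows from the figures quoted in the comment (94,000 articles, 1,200 per RfC, four RfCs a year). A minimal Python sketch of the arithmetic, with a hypothetical function name, and treating 94,000 as a floor since the comment says "94,000+":

```python
def years_to_clear(total_articles: int, articles_per_rfc: int, rfcs_per_year: int) -> float:
    """Years needed to process a backlog at a fixed batch size and rate."""
    return total_articles / (articles_per_rfc * rfcs_per_year)


if __name__ == "__main__":
    # Figures quoted in the comment above; the article count is a lower bound.
    years = years_to_clear(94_000, 1_200, 4)
    print(f"{years:.1f} years")  # 94,000 / 4,800 = 19.58..., printed as 19.6 years
```

Doubling either the batch size or the number of RfCs per year halves the estimate, which is the point made about needing both larger batches and more of them.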
- I think we can target ones by nationality - and even better, by side - that we want to though. That information is available in cats usually, so should be workable in to queries. And I am wary of making no attempt to check - Alec Marks for example.
- Fwiw I've not come across one that I remember where there was a problem with verification - the people exist(ed) - or where there wasn't an *arguable case* for notability (i.e. they played at the top-level of domestic cricket - which is pretty notable if we can find suitable sourcing to go around it). There may be the odd one where a match they played has been re-categorised as below the top-level (I can recall an article at AfD recently where this was the case but can't remember who created it) but I'm not sure that there's actually a pressing issue from that perspective is there? I appreciate that people may consider that there are just too many articles. If there is a pressing argument, then perhaps filtering out the BLPs first would be a better approach - that should be technically possible I think? If you really consider that the issue, we should focus the list on those maybe? The historical ones can come later.
- BlackJack, btw, is a whole other issue but that's going to be a lot harder to resolve. I'm happy to share my thoughts on that whenever. Blue Square Thing (talk) 12:17, 4 June 2023 (UTC)
This question about whether Lugnuts is a "pressing" issue is the key point. It's the heart of this whole debate. Where we stand at the moment is:
- According to WP:BLP, paragraph 2, we're meant to be "very firm about the use of high-quality sources".
- But it's widely held that there is no deadline for improvement, and
- AfD, our only venue with a deadline, is not for cleanup, so
- In practice any request to improve the sources in a biography can be put off forever, and
- In anything to do with sports, editors insist that it is put off forever. In a submission to ARBCOM I called this "infinite deferral".
We know how editors go about infinite deferral. They look for people who insist on good sources and label them "deletionists". Editors who engage in it are tracked, monitored, and followed around the encyclopaedia to check what they're doing (much to BilledMammal's surprise earlier on this page). Promptly after BilledMammal began this discussion, prominent and less-than-neutrally-worded links to it appeared in the places where sports inclusionists gather.
But, Wikipedia rules have a hierarchy. WP:BLP is a policy. WP:RS is a guideline. WP:TIND is an essay, and WP:NOTCLEANUP is an essay. Even WP:N is only a guideline. We mustn't allow our requests for better sources to be put off forever (even though it would be a heck of a lot easier and we'd get attacked and accused a lot less if we did).
So yes, in my view, Lugnuts is a pressing issue. And all cases where articles fail WP:RS are pressing issues.
You're right to point out that this process isn't error-free. We're using automated tools so there will be some false positives, like the Alec Marks case you mention. That's exactly why these articles are draftified rather than deleted and given an exceptional five-year lifespan in draftspace. But even though there are going to be false positives, compared to all the poorly-sourced BLPs in the mainspace, I think a small proportion of those is the lesser evil.—S Marshall T/C 15:41, 4 June 2023 (UTC)
- Oh, and if we targeted articles by nationality, we'd be creating systemic bias issues and I would expect to be called a racist by outraged inclusionists.—S Marshall T/C 15:44, 4 June 2023 (UTC)
- I just sampled the first 50 beginning with B. 4 of them contain prose. That's 8%. One of them was quite brief, 3 long - so if we went with 3 it's 6%. Another 3 have lots of links in their CI profile so there's a good chance there's more there (in at least one case I know there's an article about them). So, at the worst case we're at an 8% false positive rate. At the best case it's 14%. We should be looking at 5% as the top really. I'll get to another sample later in the week - maybe the first 50 Cs. Blue Square Thing (talk) 19:03, 4 June 2023 (UTC)
- I should add that I *think* all of these are sourced. I've yet to come across one that isn't verifiable. If unsourced BLPs are the priority then let's do unsourced BLPs. Almost by definition, these 1,200 are going to be "clean" - it's very, very unlikely that there has been anything untoward or requiring additional sourcing added to them - because we're excluding those articles from the list.
- Are you sure we're looking at the right list? Or just a convenient list? Blue Square Thing (talk) 19:06, 4 June 2023 (UTC)
- But look at that prose! Read it! Even a cursory glance tells me that espncricinfo isn't a "high quality source" by any reasonable definition. It's a database with one-paragraph hagiographies. No BLP whose only source is espncricinfo should be in the mainspace. And, if the only source is espncricinfo, how can we know which people are living? We've got to treat pretty much all of them as BLPs.—S Marshall T/C 20:09, 4 June 2023 (UTC)
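The percentages from the sample of the first 50 articles beginning with B, discussed earlier in this thread, can be reproduced as follows. This is a minimal sketch using only the counts quoted there (4 with prose, of which 3 long, plus 3 more with many links in their CI profile); the function and variable names are hypothetical.

```python
def rate(hits: int, sample_size: int) -> float:
    """Share of a sample, expressed as a percentage."""
    return 100 * hits / sample_size


SAMPLE_SIZE = 50      # first 50 articles beginning with B
with_prose = 4        # articles whose source contains prose
long_prose = 3        # of those, the ones with long prose
possible_extra = 3    # articles with many links in their CI profile

print(rate(with_prose, SAMPLE_SIZE))                   # all four with prose
print(rate(long_prose, SAMPLE_SIZE))                   # long prose only
print(rate(with_prose + possible_extra, SAMPLE_SIZE))  # prose plus possible extras
```

The same function can be reused for later samples, such as the proposed first 50 Cs, by changing the counts.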
Listification
Elsewhere, A. B. suggested that information from these articles could be retained as a series of lists. I think this is an interesting option to explore; while I wouldn't want to include the creation of the list as part of the proposal there is no reason it can't be done prior to the discussion being opened.
However, I don't know what the ideal groupings of this list could be; A. B., could you elaborate on your proposal? Blue Square Thing, I imagine you would also have some useful thoughts? BilledMammal (talk) 02:12, 3 June 2023 (UTC)
- First of all, I really defer to the cricketeers; they're already talking about this page at Wikipedia talk:WikiProject Cricket#Heads up. I think you can combine forces.
- My general bias for BLPs is fewer big articles containing the same information as a lot of little stubs. The encyclopaedia still offers the reader the same information but it's easier for us to monitor for BLP violations. So that's why I mentioned a list. But there are multiple ways to do this - a table of basic information about each player. Or a 2-sentence entry on each player.
- A bigger question is how do you organize it? By country? If so, what about someone who plays in 2 countries? By team or league - you get the same issue. The people at WT:CRICKET are already talking about this.
- There's also the Quarry stuff you're doing. It may be that they would want to vary the queries to produce other lists. For instance, just Test players.
- Finally, for the articles you draftify, they may be interested in the disposition of the drafts. I could see a scenario where you do smaller batches than 1000 based on more specific queries. You'd send these 243 which all involve left-handed Australians to one destination, 167 with red-haired New Zealanders to a different location. This might provide organization for subsequent prioritization and development of some into real articles.
- Getting back to the list idea: if these were all created via some automated process in the same standard format, it may be easy to automate stripping them of information and filling a table. However, your table shows they've been edited since creation; if the resulting non-standardization complicates things, then just create the list from scratch using the same database the original creator worked from. (I don't know what COPYVIO issues replicating someone else's list might cause.)
- I suggest you keep this work in the open so you don't get accused of pre-canvassing when you take it to RFC. User space is a good place to work but make sure everybody is aware of it.
- I also suggest thinking of this as "streamlining" information rather than "deleting" it.
- Anyway, these are some ideas. Now go talk to the cricket people. —A. B. (talk • contribs • global count) 02:58, 3 June 2023 (UTC)
- There are a variety of ways in which lists have been done - and redirection to a list is virtually the default option for cricketers at AfD if we can't find anything very much about them (see this, this, this, this and this for example). When suitable lists aren't available we'd usually delete (see this one), although at times there are arguments against redirection made (see this for example).
- There are a number of different types of list. The type used really depends on who developed it and when. List of English cricketers (1826–1840), for example, uses a table and seems to have taken me about 9-10 days of pretty solid work to create. By List of English cricketers (1851–1860) there were too many cricketers to use a table for, so a simple list with notes where an article didn't exist was used. Often, though, we'll use team-based lists, such as List of Essex County Cricket Club players where notes, again, are used to add detail about those who don't justify an article - or some of them at least. It's possible to use tables for teams (see List of Bedfordshire County Cricket Club List A players), but in many cases there are simply too many people to use a table effectively. In other cases statistical tables are used - for example, List of Suffolk County Cricket Club List A players, although personally I dislike these. List of Victoria first-class cricketers takes a different approach again - but note that there are 868 entries on that list - there are several of a similar length.
- Part of the problem comes when someone plays for multiple teams. This is a particular problem in places such as Pakistan where there seem to have been more teams than players at times. Following a discussion earlier this year someone started work on a way to do this but if it ever gets done it'll be an absolutely hideous job and one that I simply wouldn't be able to take on.
- We've struggled at times to come to a decision about which team to redirect to when there's a fairly even split between who they played for.
- And creating lists is a job that requires time and attention. For British, New Zealand and Australian teams the lists probably exist in most cases (and these are the ones where it's generally much easier to source suitable levels of detail to support standalone articles), but for south Asians, South Africans and West Indians it's much patchier.
- Standardising is going to be problematic. There are no real style guides for this sort of thing - and creating one would be problematic. I can't imagine that it would be possible to automate the process very easily - the best source for lists, CricketArchive, is behind a paywall as well, just to make things even easier for everyone (and having used it it's not straightforward anyway). There are good and bad reasons why there's not a style guide fwiw. Partly it does come down to how many people played for a side - fewer and it's easier to create a table with meaningful information; partly it's down to preference - if statistical tables were mandated, for example, I'd walk away as I see no point in them and find them almost impossible to use; partly it's down to use - for a list where the majority of articles are linked a basic list is much easier to use; partly it's down to time - basic lists are quicker to create.
- Lists are better than no information. Redirecting means that if we come across information at some point in the future we can later go back to the article and develop it. We do have all sorts of people who are notable for all sorts of things - I came across a Chief Justice the other evening who happened to have played cricket, for example. The article (Charles MacCormick) had been redirected boldly so it was easy to revert that and add a minimal level of detail in for now - someone else can develop it at some point (there are literally hundreds of hits for him on PapersPast).
- So, it's complex. I'll stop there and ask what you think of the examples I've provided and the points I've made. Blue Square Thing (talk) 08:19, 3 June 2023 (UTC)
- I clicked on three random Australians and all three included a link to "List of (state) representative players" where they were listed. It would seem to be a no-brainer simply to redirect these where there is little prospect of the articles being expanded. Black Kite (talk) 12:16, 3 June 2023 (UTC)
- It might be worth stating that it often takes an article being listed at AfD, being PRODed etc. for some of these articles actually to be expanded. The majority were created by editors who no longer edit, and while there are a number of cricket editors on Wikipedia, we all tend to edit our particular areas of interest, rather than these. While I have stated my support for redirecting a number of these articles (in my opinion the best option and the common middle ground for this debate), I'm almost 100% certain that some of those articles will be of notable people, but they just need someone to put in the work, which unfortunately doesn't really happen until they get brought up. Rugbyfan22 (talk) 12:47, 3 June 2023 (UTC)
- Listification seems like a good option to me. And a micro-stub being converted into a list entry does not in any way preclude a full article being written on that subject later, if someone has the motivation and sources to do it. — SMcCandlish ☏ ¢ 😼 18:04, 16 June 2023 (UTC)
Opening
I plan to open this next weekend; I will update the list then to ensure any articles that have since been improved are removed. If you see any aspects that need correction before then please let me know and I will do so.
Before opening it I will also create a list of appropriate redirects that would be created if there was a consensus per the discussion above, such as Adam Clarke (Cambridge University cricketer) to List of Cambridge University Cricket Club players#C; for the sake of simplicity and because there are likely to be some cricketers for whom a redirect will not be appropriate I will not make the creation of these redirects part of the proposal, but will instead commit to doing them WP:BOLDly if there is a consensus for the proposal. BilledMammal (talk) 07:01, 2 July 2023 (UTC)
- Thanks for providing the list with cats above. I've processed it initially to deal with the duplication and will try and get versions of it online by the end of tomorrow. That will help people take a look at the individuals they might have an interest in - but it is a long list and I could do with running a couple of checker routines really! Do you have any idea how many articles will be in the list you submit this time?
- I think it might be helpful to specifically reference redirecting as a possibility - not necessarily as a direct proposal, but as something that's been mentioned.
- On a practicality, I think it would probably be helpful to have articles that might be sent to draft tagged for a period of time before it happens. This will allow anyone who sees an article on their watchlist that is obviously there in error (one with references that the automatic routine can't pick up, for example) to remove it or notify someone. I'm thinking about a week perhaps, not a long time. Blue Square Thing (talk) 05:35, 3 July 2023 (UTC)
"Do you have any idea how many articles will be in the list you submit this time?"
1200
"I think it might be helpful to specifically reference redirecting as a possibility - not necessarily as a direct proposal, but as something that's been mentioned."
We do; see "If this proposal is successful" #5
"I think it would probably be helpful to have articles that might be sent to draft tagged for a period of time before it happens. This will allow anyone who sees an article on their watchlist that is obviously there in error (one with references that the automatic routine can't pick up, for example) to remove it or notify someone."
I don't think that's necessary; moving an article doesn't remove it from watchlists and if there is an error it will be trivial for editors to move it back. BilledMammal (talk) 05:21, 4 July 2023 (UTC)
- I'm thinking about the redundancy of having a draft article - which persists even after it's moved back - and one that is in main space or a redirect in mainspace. It seems wasteful.
- There's also a strong argument that many people who might be interested in an article simply won't know that it has been listed somewhere. Particularly if people don't edit heavily - I can think of specific examples where people have an interest in (and resources dealing with) particular teams and simply won't know articles have been suggested for removal until they've gone. If we tag them for a few days, they'll possibly step in at that stage and deal with things. That's much easier than needing page mover rights to move an article back - with the technical complexities that that involves. Blue Square Thing (talk) 05:35, 4 July 2023 (UTC)
- You don't need page mover rights to move the article back, and if redirects from draft space to main space were a problem we wouldn't have a consensus to generally keep them.
- Last time there was a moderate delay between me tagging the articles and an admin moving them; I expect there will be again, and believe that will be sufficient given the ease of restoring articles to mainspace. BilledMammal (talk) 05:51, 4 July 2023 (UTC)
- I know that I found moving the one page back that I did extremely confusing, but there you go. There will be a lot of people who simply don't dare do it in the circumstances.
- I'd say that this is worth a 7-day compromise. Sure, it's technically unnecessary. But it looks like you're listening and being aware of the issues that the automated nature of the queries brings to the situation. You don't have to compromise and look for an approach more likely to gain a really broad consensus of course. Personally, in your situation, I would do though - but then I tend to look for compromise all the time. Blue Square Thing (talk) 09:35, 4 July 2023 (UTC)
- I can't see an issue with tagging them in advance if it's genuinely used only for the purpose of identifying clearly-erroneous inclusions. I take it that this would occur after the conclusion of the RfC, should consensus to draftify be found? XAM2175 (T) 20:47, 4 July 2023 (UTC)