Wikipedia:Village pump (idea lab)

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Donald Albury (talk | contribs) at 14:22, 6 December 2022 (→‎Different world views : WMF and Slate vs WP - The banner debate: Reply). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The idea lab section of the village pump is a place where new ideas or suggestions on general Wikipedia issues can be incubated, for later submission for consensus discussion at Village pump (proposals). Try to be creative and positive when commenting on ideas.
Before creating a new section, please note:

Before commenting, note:

  • This page is not for consensus polling. Stalwart "Oppose" and "Support" comments generally have no place here. Instead, discuss ideas and suggest variations on them.
  • Wondering whether someone already had this idea? Search the archives below, and look through Wikipedia:Perennial proposals.

Discussions are automatically archived after remaining inactive for two weeks.

« Archives, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57

Are WMF and WP too different to ever get along ?

WMF (non developers) and WP are very, very different. Using Hofstede's cultural dimensions theory#Dimensions of national cultures just as a model for discussion to show how different they are:

  • Power distance index (PDI): WP medium (new editors), WMF low
  • Individualism vs. collectivism: WP 50/50, WMF very collective
  • Uncertainty avoidance: WP high, WMF low
  • Masculinity vs. femininity: WP more masculine, WMF very feminine (see article for explanation)
  • Short-term (tradition) vs. long-term (change): WP short, WMF long
  • Indulgence vs. restraint: WP high restraint, WMF high indulgence

And bonus:

  • Diversity: WP: em/en dash preference; WMF: people not like WP editors
  • Wikipedia is: for WP it's the editors; for WMF it's the readers, whose desires are only known to WMF
  • Countries: WP is Samurai Japan plus California hippies; WMF is Sweden :-)
  • Wikipedia exists to: WP: create Wikipedia, free speech, an NPOV; WMF: as a cash cow

I think we suck on a few things (mercilessness, no long term planning, ...), but based on this I suggest we need a divorce. WMF can have the $100 M, but we want the kids (WMF developers, IT, organisers, and the person from fundraising who was brave enough to ask about the emails). Wakelamp d[@-@]b (talk) 13:02, 27 August 2022 (UTC)[reply]

It's difficult. Wikipedia, especially enwp, is the WMF's cash cow. In return, WMF provides useful functions such as legal and administrative services, a hardware platform, maintaining the software and (slowly and reluctantly) adding a few requested features. On the other hand, the vast majority of the money goes elsewhere with no direct benefit to editors or readers: interference in government, global diversity workshops, unwanted technical changes, financial trusts with no clear aim. Splitting could benefit Wikipedia in the long run, with a lower level of income spent in a more focused way, but is hampered by the stupid decision to give Wikipedia's domain names and trademarks to a WMF which has a strong incentive to withhold them. Certes (talk) 13:41, 27 August 2022 (UTC)[reply]
I think the Wikimedia foundation is a bit like NASA: slow, bureaucratic, resistant to change, with its reputation primarily provided by its early glory years. I don't have the solution here, but it is imperative that the WMF become nimble if it wants to survive for the next 10 years. CactiStaccingCrane (talk) 17:22, 27 August 2022 (UTC)[reply]
Also, technologies have improved, and it is not out of the question for an individual or group of individuals to fork Wikipedia for the community, like the second reincarnation. To me, this is absolutely not an ideal situation, but it will eventually happen if the WMF continues to be aloof about important issues and to toy with random things. CactiStaccingCrane (talk) 17:24, 27 August 2022 (UTC)[reply]
Splitting worked for LibreOffice and others but it's very much a last resort, because we'd leave behind integration with other WMF projects and the excellent reputation of the Wikipedia brand. The best option is for the WMF to revert to its previous narrow role which has community consensus. However, that would put a lot of people out of a comfortable job, and we can't expect turkeys to vote for Christmas. Certes (talk) 18:50, 27 August 2022 (UTC)[reply]
A fork is a bad idea, but you also can't put the cat back in the bag. The only way out is through, and continuing to move forward. Andre🚐 19:57, 27 August 2022 (UTC)[reply]
A fork of Spanish Wikipedia happened in 2002, you can read about it in meta:Spanish Fork. ~ 🦝 Shushugah (he/him • talk) 18:53, 4 September 2022 (UTC)[reply]
I wouldn't count the developers as on the community's side rather than the WMF. I don't dispute there are exceptions, but consider how many clashes between the WMF and the volunteer community have been IT focussed (the premature release of V/E, that viewer experiment, the reader comments box etc etc). Mixing volunteer and paid staff on the same project is not easy, especially if you want to maintain volunteer motivation and self respect. There are models that can succeed longterm such as "staff are employed to do the things that volunteers want to happen but aren't volunteering to do". However I don't see the WMF agreeing to adopt any of the viable models that could lead to a stable and successful movement for the longterm. Less than a decade ago part of the tension between the community and the WMF was over civility and harassment issues, but with the WMF as the less "woke" side of the equation. One reading of the UCOC dispute is that the WMF has gone from taking such issues less seriously than the community to taking them more seriously. Another reading, and one I find more convincing, is that the WMF's commitment to wokeness is barely skin deep and ill thought out. The UCOC is more about an old fashioned power grab, a secretive and centralised solution by people who neither respect nor understand the way the community works. If the WMF was really as "woke" as the community they wouldn't have been so resistant to the Portuguese language community ending IP editing, and they'd be using their real world influence to get IP companies and Police forces to deal with the people who make death threats against members of our community. To me the divide between the WMF and the volunteer community is cultural, but I don't see it through the lens of Hofstede's theory. 
I see one side of the divide looking at the Movement in Silicon Valley terms as one of the top ten internet sites, where the volunteers and the product are all subject to a "move fast and break things" mindset and standardisation and centralisation are key. And on the other side of the divide, a group of people who have taken the dream of "making the world's knowledge freely available to all" or "making the internet not suck", and found a way to make that a reality in a radically decentralised wiki way... I agree that the two groups are close to needing a divorce, but I'm not 100% convinced it is inevitable or unavoidable. ϢereSpielChequers 19:49, 27 August 2022 (UTC)[reply]
This is very insightful. My thoughts, inspired by this comment: Volunteers give their blood, tears, and sweat equity as early stakeholders in open-source projects. Governance is tricky when it becomes top-down. To the extent the foundation can listen to community feedback and be on the side of the "workers," it will be less of a scab to "management." The startup mentality has always been a part of Wikipedia, the trouble is when control and predictability become more important than spontaneity and creativity in problem-solving. The foundation may not understand the community ad hoc self-organizing systems, and instead try to use a regular hierarchical corporate directive, which won't work. It's about servant leadership, retrospectives, being open-minded on the details but focusing on principles, values, and making sure the specifics remain negotiable and you don't over-plan. Andre🚐 19:56, 27 August 2022 (UTC)[reply]
Thanks. I agree that top down v bottom up is a major faultline in the divide between the WMF and the community. I don't accept that it is inevitable that governance moves from a bottom up philosophy to a top down one. ϢereSpielChequers 20:04, 27 August 2022 (UTC)[reply]
I agree. There are certain objectives and key results that may come from the top-down, but the solutions and the approaches should be determined by the community. Andre🚐 20:08, 27 August 2022 (UTC)[reply]
I am certain that NPP will get their developer, that WMF will tone down their donation approach, and that there will be a community outreach with lots of t-shirts. But I can't see that WP Meta or WMF are prepared to move on root issues for another few years until the lobster heats up a bit more, so I think the best thing is to create our own interWIki council. It's diversity, so it must be good. Wakelamp d[@-@]b (talk) 02:40, 28 August 2022 (UTC)[reply]
Ok - I was wrong :-( The interWiki council is also a bad idea Wakelamp d[@-@]b (talk) 14:29, 15 October 2022 (UTC)[reply]
I think the developers are very much like us based on what they write. It has all the same issues we have been discussing Wakelamp d[@-@]b (talk) 08:05, 28 August 2022 (UTC)[reply]
@WereSpielChequers: a clarification, if I may: "Mixing volunteer and paid staff on the same project is not easy" (agreed wholeheartedly, but...) "especially if you want to maintain volunteer motivation and self respect." — I'm not sure I follow here? — TheresNoTime (talk • she/her) 03:12, 29 August 2022 (UTC)[reply]
As a volunteer on software development for the project, I, like TNT, would also appreciate some clarification on this sentence. —TheDJ (talkcontribs) 11:11, 29 August 2022 (UTC)[reply]
I'm reasonably sure either this post is satire, or one or several of the replies are satire, or all of the above. Otherwise, I'd be significantly more fearful of the future of the project, but of course the posters on meta pages are not really a representative sample of editors. SamuelRiv (talk) 01:16, 28 August 2022 (UTC)[reply]
@SamuelRiv In any satire there is some truth. I don't think a divorce is inevitable (but didn't eswp do that?). As for whether it is representative, I suggest you look at the discussion on Wikimedia donation emails on Proposals, then have a search for #wikipediascam, Google Trends, Google News for wikipedia and donation, Quora, Reddit, and it is worth getting a login for this ycombinator thread. Wakelamp d[@-@]b (talk) 02:24, 28 August 2022 (UTC)[reply]
There was another, bigger ycombinator thread before that one. Andreas JN466 10:16, 28 August 2022 (UTC)[reply]
At least there's a few good points raised. I did comb the donation threads. I refuse to touch q***a on all principles -- if WMF-WP has integrity problems, q'a is satan incarnate. It would be good to collect links to key threads and points in one essay so that we don't have to refer people to all corners of the internet for this grand controversy. I don't know what substantial issue I should look for in particular through the noise, but I do find it funny that most of these threads begin with complaints about the endowment and management bloat, when the 2007 whistleblower thread (also brought up in complaints) was about financial mismanagement which was, as described, due in large part to lack of logistical staff. The logical remedy is to hire competent logistical staff, whose overhead increases with the size and scope of projects and their finances.
I don't know how WMF decided on their endowment goal, but as WMF operates globally and often has to deal with government legal threats directly I imagine a substantial hedge is justifiable. I also don't know how they are perceived as an investment in terms of risk, should they need to leverage funds, which would also substantially influence their endowment goals. Maybe there's some new grand project being planned that hasn't been revealed, who knows, but for an org of this size and scope I fail to see how a $100m endowment is inappropriate. It's also common in reaction threads that commenters will list expenditures that are a waste of money, then immediately follow with an alternative wishlist that is comparably or far more expensive yet always objectively more prudent.
Orgs aren't perfect, and nonprofits and NGOs in particular are notorious for inefficiencies that expand with size (independent audits and open donation ratings help mitigate this, but it seems there will always be at least some baseline inefficiency that is in part intrinsic to the nonprofit and/or donation incentive model). None of this is to suggest there is something wrong with vocally complaining -- it is essential to an open audit system -- but can we at least separate the realistic substance from the cruft from the plain ridiculous? SamuelRiv (talk) 15:28, 28 August 2022 (UTC)[reply]
The key issue is quite different from the money amounts per se – it's about how the money is brought in. It's about making people think you are struggling to keep Wikipedia online when in fact you have $400 million in assets and reserves, have reached your $100 million endowment target in half the time planned, enjoy huge annual surpluses and have steeply rising executive salaries.
Add to that the fact that the Endowment has never to date published audited accounts or disclosed any of its expenditure. Andreas JN466 22:39, 28 August 2022 (UTC)[reply]
Where does the 400 million come from? (I couldn't work out a definite number :-)) I have no problem with paying staff. If they help WP and make editors' lives easier, then go for it. And strongly agree about the [1]. It has some clever people on it, but I think it has one editor on it, and I didn't realize that there were no reports. I would prefer that there was an editor-appointed audit expert. Wakelamp d[@-@]b (talk) 03:03, 29 August 2022 (UTC)[reply]
It's an estimate of financial status at the end of the third quarter of the 2021/2022 fiscal year (i.e. status at the end of March 2022). It is based on the following:
Net assets in June 2021: $231.2 million ($51 million up on year prior)
Endowment in June 2021: $100 million ($37.1 million up on year prior)
Increase in net assets as of 31 March 2022: $51.9 million
Increase in Endowment as of January 2022, the most recent figure available: $13.4 million
So, adding together, we have $231.2 million + $100 million + $51.9 million + $13.4 million = $396.5 million. There is another $3.2 million left in the m:Knowledge Equity Fund, earmarked to go to non-WMF organizations.
Note that the WMF had a total surplus of $88 million in 2020/2021 (Endowment growth included). (For more on Endowment transparency, see Wikipedia:Wikipedia_Signpost/2022-05-29/Opinion.) Andreas JN466 06:38, 29 August 2022 (UTC)[reply]
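For anyone who wants to re-check the sum in the estimate above, here is a minimal sketch. The figures are the ones quoted in the comment; the variable names are purely illustrative, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

# Figures in millions of USD, copied from the estimate quoted above.
# The dictionary keys are illustrative labels, not official line items.
components = {
    "net_assets_june_2021": Decimal("231.2"),
    "endowment_june_2021": Decimal("100"),
    "net_assets_increase_to_march_2022": Decimal("51.9"),
    "endowment_increase_to_january_2022": Decimal("13.4"),
}

# Remainder of the Knowledge Equity Fund, also taken from the comment above.
knowledge_equity_fund = Decimal("3.2")

subtotal = sum(components.values(), Decimal("0"))
total_with_fund = subtotal + knowledge_equity_fund

print(f"Subtotal: ${subtotal} million")
print(f"Including Knowledge Equity Fund: ${total_with_fund} million")
```

The subtotal reproduces the $396.5 million figure, and adding the Knowledge Equity Fund remainder brings it to just under the $400 million mentioned earlier in the thread.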
I was wondering who WMF funds. Please look at page 69 of their IRS 990; I don't understand why we are getting other organisations to do WMF's charity work.
  • Tides $5.5 M
  • Yale $260 K (but the last mention is 2017)
  • Peace Development Fund $150 K - no mention
  • Black Lunch Table $168 K, but I can't find a mention that we donate
  • We also seem to give grants to a Wikipedia DC, which I assume is to get around the non-profit lobbying rules
Also have a look at the breakdown by country for funding and spending and staff, page 41, Schedule F, parts 1 and 2. Wakelamp d[@-@]b (talk) 11:03, 29 August 2022 (UTC)[reply]
Pretending that money is needed to keep the servers running then diverting it to political pressure groups is a gross breach of trust, however noble the causes being advocated. It leaves me with a serious ethical conflict: I want to continue helping our readers, but each article I improve makes me feel like an accessory to fraud. It seems that every time I log in, there's a new proposal to hijack another part of our encyclopedia's governance. I want to continue contributing, but we're very close to the last straw. Certes (talk) 12:20, 29 August 2022 (UTC)[reply]
Sorry, which political pressure groups are being funded? I don't see any listed above. I'm not sure how the Yale grant is any more controversial than any other major WMF grant. The reason you partner with outside institutions is because outside institutions are more established and experienced with the infrastructure to do it better and cheaper than if you did every little thing in house. Should we establish the "WMF office of janitor development" to fund our own office cleaning staff, or just pay a company to do it? For comparison, Google and other companies donate to WMF, a nonprofit, not because they're being nice, but because WMF projects are extremely important to their own R&D and business development, and it makes far more sense to support WMF than to start a "WikiGooglePedia" clone that is identical in every way -- donation based, etc. -- for the sole purpose of being an in-house operation, just to satisfy a handful of employees who don't like the idea of outsourced interests.
Tides Advocacy according to the FAQ is distributing the Equity Fund, not simply being given money to spend on their own lobbying. I'm not quite sure I understand the model of Tides Advocacy still, as OpenSecrets isn't really open about its data -- Tides Advocacy is responsible for lobbying money, but that's part of what its explicit purpose is -- to spend lobbying money on behalf of other nonprofits. It's hard to gather how much, if any, lobbying it does on behalf of its own interests. Regardless, this isn't about Tides lobbying, as WMF's explanation suggests. I don't know what Sched. F is supposed to indicate -- are you surprised $11m is spent in the entirety of Europe in this breakdown? If the amount is what surprises you, why is it not the total -- why Europe specifically?
Finally, I've seen it argued in these same threads that WMF should do more government lobbying on behalf of, say, IP (and typically open source advocacy slips in there too -- some people also voice concerns about China and Russia policy). Would it make more sense for Google to hire, in-house, the dozen or so expensive people needed who specialize in high-level IP lobbying (IP lawyers are not a sufficient substitute), something that would only be really needed to be working on all engines a handful of times every decade when major legislation comes up, or just outsource to a lobbying firm? SamuelRiv (talk) 15:37, 29 August 2022 (UTC)[reply]
@SamuelRiv Lobbying removes our tax exemption. I agree Tides is very confusing (note only the 2019 990 returns are available).
  1. The Tides Foundation companies include Tides Advocacy, and are interlinked with many WMF projects
    • There are many issues (See signpost for more information).
    • “The Wikimedia Foundation, the non-profit organization which manages Wikipedia, is closely tied to the Tides Foundation."
    • "The general counsel of Wikimedia, Amanda Keton, is the former general counsel of the Tides Network, the former head of Tides Foundation, and the former CEO of Tides Advocacy."
    • “The multimillion-dollar Wikimedia Endowment, created in 2016, is managed by the Tides Foundation and has an advisory board appointed by Tides. In 2020, Wikimedia established the multimillion-dollar Wikimedia Foundation Knowledge Fund to be run by Tides Advocacy”
    • There is no mention of Tides on Wikimedia Endowment
    • I found a few other people that overlapped with both companies.
    • No conflict of interest rule is broken, as the WMF counsel is a past employee.
    • The Tides Foundation are not involved with education
    • They support only Democratic candidates, but not that much directly, and instead via PACs and small organizations
    • Tides makes no mention of Wikipedia or Wikimedia or education
  2. Tides Advocacy is not involved with education
    • IRS 990 states "TIDES ADVOCACY MAKES CONTRIBUTIONS TO ORGANIZATIONS THAT SUPPORT POLITICAL ACTIVITY, CONDUCTS INDEPENDENT EXPENDITURES, AND MAKES PARTISAN COMMUNICATION TO EXPRESSLY ADVOCATE FOR THE ELECTION OR DEFEAT OF A CANDIDATE."
    • They funded a Democratic candidate and did 800 K of lobbying
    • There is one mention of education on their 990: a grant of 10 K to North Carolina A. Philip Randolph Educational Fund Inc for the purpose of environmentalism.
    • No grants have a purpose of education.
    • Their website (The Advocacy Fund - Tides) makes no mention of education, but specifies their purpose as "civic participation, Healthy Individuals and Communities"
  3. Some of their consultants' actual work seems to differ from that on the IRS form (company, IRS stated work, their website, $)
    • KIVVIT, Consulting Services (Issues Campaigning), $505,022
    • BASE BUILDER, Payroll Services (Mailing lists), $386,373
    • THREE POINT STRATEGIES, Staffing services (US Electoral Strategy)
    • Natasha MINSKER, Consulting Services (actually "Skilled in Public Policy, Politics, Lobbying, Non-profit Management, Community Organizing, and Criminal Law. Graduated from Stanford University Law School.")
  4. Certain accounts on Tides Advocacy seem very high, especially per employee (bracketed amounts below). I think they have 20 employees (p/t and f/t), but they also hire temps.
    • Travel $1.76 Million (90 k)
    • Other Employee expenses $1.6 million (80K)
    • Office Expenses $1 million (50 K per employee, which does not include occupancy of $534)
    • Conferences 900 K (wow)
    • IT is 260 K (13 K), but they only have 7.3 K of equipment??? (see p. 51 of 990)
    • Political Donations 800 K
  5. Tides Network is mentioned on the Tides Advocacy 990 (p. 72), but as unrelated. On the Tides Network 990 it supports Tides Foundation, Tides Center, Tides Inc. It pays $13 M of network fees, and the 990 states: “Tides Network supports the operating Organizations and appoints board members for Tides Foundation, Tides Center, Tides Two Rivers Fund and Tides, Inc. Tides Network sets the direction and policy orientation for and has economic interest in all of Tides organizations.” Wakelamp d[@-@]b (talk) 06:47, 30 August 2022 (UTC)[reply]
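The bracketed per-employee amounts in point 4 can be reproduced with a rough sketch like the one below. It assumes the headcount of 20 that the comment itself only guesses at, so the results are indicative at best; the names `ASSUMED_EMPLOYEES` and `expenses` are illustrative.

```python
# Assumption: roughly 20 employees, as guessed in the comment above.
ASSUMED_EMPLOYEES = 20

# Annual amounts in USD, as quoted in the list above.
expenses = {
    "Travel": 1_760_000,
    "Other employee expenses": 1_600_000,
    "Office expenses": 1_000_000,
    "IT": 260_000,
}

# Divide each account by the assumed headcount.
per_employee = {
    name: amount / ASSUMED_EMPLOYEES for name, amount in expenses.items()
}

for name, value in per_employee.items():
    print(f"{name}: ${value:,.0f} per employee")
```

Travel works out to $88,000 per head, which the list rounds to 90 K; the other bracketed figures match exactly. A different headcount (for example, counting temps) would change all of them proportionally.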
Giving money to Yale, funding the Black Lunch Table in Chicago, earmarking $4.5 million for non-Wikimedia organizations (more than half the grant money so far going to US orgs) etc. is all very well, but then you shouldn't tell people in India and South Africa that you urgently need their money to keep Wikipedia online. Andreas JN466 10:19, 30 August 2022 (UTC)[reply]
And agreed, we are telling whoppers.
Does anyone know which months the banner ads have run? I would be interested to know whether they have increased the number of times the ads appear.
I turned off the "don't show banner ads" setting a few days ago... but they didn't appear :-( But when I logged out, I got an awesome message of doom that took up the whole screen. It hasn't happened again.
Asides
Hmmm... It would be quite amusing to copy all their banner ads to one page, and have the editors descend to tag it :-) The media would find that amusing :-) Wakelamp d[@-@]b (talk) 11:59, 30 August 2022 (UTC)[reply]
The ads appear in different regions at different times. There's a list at m:Special:CentralNotice – filter on Campaign type = Fundraising – but it seems incomplete: I recall a campaign last December. At least the perpetrators recognise that Users already hate these banners. Certes (talk) 14:00, 30 August 2022 (UTC)[reply]
m:Fundraising has an overview of current_fundraising_activities. In addition to the scheduled campaigns listed on that page there are sometimes low-volume campaigns run for testing purposes, where banners only appear for a small subset of users. That's why people sometimes see a fundraising banner appearing "out of season". Andreas JN466 14:26, 30 August 2022 (UTC)[reply]
Thanks for that. I wanted to find out if WMF had increased the frequency, because one of the fundraising staff wrote [this] saying it was a bad idea. Wakelamp d[@-@]b (talk) 13:36, 31 August 2022 (UTC)[reply]
@Wakelamp: The $5.5 million to Tides are the annual $5 million paid into the Endowment (and included in WMF "expenses"), plus planned gifts, all of which have been diverted from the WMF and redirected into the Endowment since the February 25, 2021 board resolution. So any money people have left to Wikimedia in their wills in the past today goes into the Endowment. Andreas JN466 08:33, 30 August 2022 (UTC)[reply]
They are the 2019 numbers, but it still means that we are funding a Democratic Party fund raiser???
As far as I can see, Tides (which was co-created in '72 by the heir to a cigarette fortune) has taken us over, the trustees are toothless, and the WMF will keep on increasing its staff :-(
Vivat Tidepedia! Wakelamp d[@-@]b (talk) 09:27, 30 August 2022 (UTC)[reply]
It's amazing to me the WMF doesn't fund more software projects. When a proposal was made to create a citation database it was rejected. Just one example. I understand they tried this with VE and got kind of burned, but that was a white whale project from the start. Smaller more doable projects that have a big impact. For example let's get a really good version of what reFill does to get our citations standardized. And make the tools cross-language. -- GreenC 04:46, 28 August 2022 (UTC)[reply]
I agree keeping it small and attainable, with frequent iterative deliveries, and user testing and feedback, is the way to go. The visual editor citation toolbar works pretty well, if you're citing a standard URL or a DOI or ISBN, I mostly use that now instead of reFill and CitationBot, and you can enable it in the beta settings for the wikitext editor as well. Andre🚐 05:04, 28 August 2022 (UTC)[reply]
@Andrevan @CactiStaccingCrane I would like a way of reducing the time/friction to find and cite articles. Reading references is fun, but it's multi step, so it's a pain for new editors.
  • A curated reference search engine (NPP reputation tools), similar to the film project's, with the ability to exclude self-published sources and Google Books without preview, and to check for AKA names from articles.
  • JSTOR, which is mentioned in the missing-ref message for cites, is restricted to 500 edits, so it excludes new users who know citations.
  • Google Book cite is not context sensitive (so manual entry of chapter name, page, chapter author; we don't use the Google Books API; the Google site is missing data references), and there is no image OCR to text tool. Ideally, it would be nice to get special approval from Authors' Guild & Google to have full or at least text access to the snips. Wakelamp d[@-@]b (talk) 07:49, 28 August 2022 (UTC)[reply]
@GreenC WMF doesn't fund things because there is no upside for them. I asked for the road-map the other day, and got pointed to this, and it makes sense because we have no roadmap ourselves :-( Wakelamp d[@-@]b (talk) 07:20, 28 August 2022 (UTC)[reply]
So, let's make a goal for ourselves. What specific goal would we want to achieve by 2023? CactiStaccingCrane (talk) 07:22, 28 August 2022 (UTC)[reply]
Change the proposal process. All proposals are created as an article. Editors comment on the article talk, and write an evidence-based article. Results are fed back to the proposal page. Wakelamp d[@-@]b (talk) 07:59, 28 August 2022 (UTC)[reply]
Changed my mind again :-) Create a WP Development board. Have elections, in which all active editors vote. They can work out what their job is, but they have to report to ALL active editors. Wakelamp d[@-@]b (talk) 08:17, 28 August 2022 (UTC)[reply]
Relating to the "Masculinity vs Femininity" question, how does WMF show "a preference for cooperation, modesty, caring for the weak, and quality of life" as opposed to Wikipedia editors showing "a preference in society for achievement, heroism, assertiveness, and material rewards for success"? Strange post. Dialmayo (talk) (Contribs) she/her 17:48, 2 December 2022 (UTC)[reply]
  • WMF: The WMF vision is very F ("Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment") and matches pretty well with "a preference for cooperation, modesty, caring for the weak, and quality of life". It is not achievement based (John Lennon type imagine), or assertive, or measurable, and the weak in this case are the information poor. The modest doesn't match for management, but I think it describes a large number of WMF employees.
  • WP community: I have placed the WP equivalent in brackets: "a preference in society for achievement (number of edits, articles, references, front page, DYK), heroism (NPP, anti-vandalism, black/white thinking), assertiveness (AfD, NPP, ANI, robust debate), and material rewards for success (well, we don't get paid, and I think many of us are less than flush with funds, but maybe this matches with being an admin, respected, ...)". But it is a model, and "a preference for cooperation, modesty, caring for the weak, and quality of life" also matches well with help desk, tearoom, AfC, reference desk, quarry, helping newbies, and nearly all the editors I have ever interacted with.
Wakelamp d[@-@]b (talk) 10:43, 3 December 2022 (UTC)[reply]
I like to think that the typical long-term Wikipedia editor also exhibits the qualities attributed to the WMF above. If anything, the WMF seems more achievement-based: it emphasises acquiring, measuring and spending donations, various equality metrics and deployment of new software (wanted or not). Certes (talk) 11:32, 3 December 2022 (UTC)[reply]
I am not sure whether WMF measure themselves that much; WMF measure WP outcomes (with donations being the exception, but even that is not really in their control that much). Dev is actually not measured that much, except for, say, a big milestone (Vector 2022); there don't seem to be measures of new feature use, links to business KPIs, or "cost" justifications. With the wanted-or-not software, open source projects typically give you the option to use/not use, but the way WMF manages wiki seems to be very unlike other open source projects. The core product is not being developed to meet external challenges, or evolving based on personal preferences (but we don't have that many available/visible anyway) becoming popular. For instance, with Vector 2022, the dev team had a one-size-fits-all mentality/no preferences: WP editors should have the same UX as readers. They didn't understand that it was our IDE (editing workspace), and their own IDE had a 132-character line. Wakelamp d[@-@]b (talk) 11:35, 4 December 2022 (UTC)[reply]
The WMF currently uses the OKR framework rather than KPIs. There's a dedicated team for measuring results in the Product department (see their current work on Phab), plus every team in Product and Technology has to do their own work. Measuring feature use is routine during development, and sometimes continues afterwards. For example, the Reply tool that I'm using right now has been used 2.42 million times since inception through the end of November, across all wikis, and 184K in November. The Editing Team has also run an A/B test for every major feature in this project, and another is coming up as soon as the instrumentation passes QA. It sounds like you aren't personally aware of the work that's done in this area, and then perhaps jumped to the conclusion that since you didn't know about it, it wasn't happening. And, honestly, I get it: you probably have more important things to do with your life than to wonder which metrics are being used for which bit of software.
More generally, I have looked at your comments on this overall subject over the last couple of months, and I have not found that they align with my experiences either in the WMF or in any of the communities I'm part of. "The WP Community" doesn't have a single culture. Even the much smaller subset of "the English Wikipedia's core community" doesn't have a single culture. It might be interesting for someone to do some proper research on the question. (The last time I read research on a related point, it was years ago and more about individual psychology, and concluded that we [the editors at the English Wikipedia] were conscientious, disagreeable, and neurotic.)
For anyone who finds this sort of thing fun, there's a sort of cultural-personality quiz at https://hbr.org/2014/08/whats-your-cultural-profile that you might enjoy. The only regret I have about it is that it compares your results against your selected country, but doesn't tell you which country's profile you are most closely aligned with. In this context, it would be interesting if we could aggregate the results, so we can see how much the responses spread. Whatamidoing (WMF) (talk) 20:58, 5 December 2022 (UTC)[reply]

@Guy had a similar, better-expressed idea [[2]]: "the idea is basically to give a wider group of stakeholders a voice in setting development priorities and approving feature changes. In business this would typically include development leadership and business stakeholders, and the idea is to make sure that effort is spent where it will have the most impact on organisational priorities. Normally we'd handle this through RfCs at Meta and the like, or by meetings between WMF and devs I guess, but the meta discussions tend to attract only people with detailed interest in things like microformatting, much of the discussion is arcane and they run on geological timescales."

"They can work out what their job is": Is this satire again?
If there's already proposal pages that editors can amend and discuss, and the proposer can submit grant applications and would do/coordinate the implementation, and WMF reviews the grants... what's a board for? (aside from whacking ideas for bloating the WP bureaucracy over the head, Stooges-style.) SamuelRiv (talk) 15:42, 28 August 2022 (UTC)[reply]
Not satire. All the committees started with vague responsibilities, which we then coloured in as we went.
I disagree that the proposal/grant process works well.
  1. After the proposal is approved, it falls down. No one is responsible for testing, reporting back, or supporting the change. Our process at the moment is: proposal approved, WMF starts work, asks for comment directly, installs, RfC complaining, testing, modifications, complaints about defaulting it to all.
  2. Gold plating. The Pareto principle applies to software in that 80% of the cost is the last 20% of work. Our current process causes large, expensive projects with many features that are unused, because multiple voices make demands without triage.
  3. Planning. We have no forward plan for how we think WP will work, which means the WMF has no roadmap. We have no risk reduction or proactive planning. For instance, what do we do if:
  • We have a denial-of-service type attack on NPP, through an AI creating thousands of articles.
  • Google/Facebook creates an AI-generated 'pedia.
  • Our tired UI encourages generational change: the continuing decline in admins, 100K editors leaving per year.
Proposals and system changes are papering over conflict and editor resource issues:
  1. Proposals are solutions, but there is no list of problems, and there is no data analysis to verify opinions.
  2. Benefits of Change. There is no checking of whether the proposal achieved its goal.
  3. The Content creators (mostly new editors) are not involved. Wakelamp d[@-@]b (talk) 07:12, 29 August 2022 (UTC)[reply]
These are great points. If development on WM software is not currently managed well, how would proposing changes through a WP bureaucracy be different from proposing changes through a WMF bureaucracy? You still have to go through the participation of the entrenched volunteer WM programmers, which through divorce of projects you have suddenly fragmented. I don't know how the addition of additional input of WP editors into the process (there's phabricator, WMF project pages, emailing WMF people directly, proposals through WP, and more currently that you can do to raise these concerns or search for whether they are already being addressed) is supposed to help anything except possibly an editor's self-esteem, briefly, until they are ignored. It reminds me of the story Richard Feynman talks about of receiving messages from the public, even after filtering for those who aren't cranks -- he's busy trying to decode the combination lock on a safe, using some tools to make the process more methodical, and comments come in like "did you try 21-3-49? How about the bank manager's birthday?" There's a good case to be made that WM coding should have a better management structure, but in the end that will probably require hiring at least a project manager and principal coder, which means yet more $200k+ salaries to shell out. SamuelRiv (talk) 15:56, 29 August 2022 (UTC)[reply]
I believe I am correct in stating that most WMF staff (employees + contractors) work remotely, i.e. from home. So why do they have to be in the US? There are many parts of the world outside the US where competent coders can be hired for a fraction of the cost. Andreas JN466 14:56, 30 August 2022 (UTC)[reply]
WMF funds some grant requests and not others, and they give reasons why. I have wondered about some of their rejections, whether it is due to being beaten out with a limited grants budget (all departments anywhere have limited budgets) or if they lost on merits or prospects for completion. Could you link the specific proposal you are referring to? SamuelRiv (talk) 15:35, 28 August 2022 (UTC)[reply]
When did WMF and WP start to diverge? Was it after the 2015 Harassment Survey? — Preceding unsigned comment added by Wakelamp (talkcontribs) 07:00, 30 August 2022 (UTC)[reply]
I have been editing for 14 years and have been an administrator for five years. I have always tried to develop friendly relationships with WMF staffers and still have some friendships, though far fewer than in the past. I used to live about 32 miles from WMF headquarters in San Francisco, and was always willing to drive there, pay for bridge tolls and parking, and meet with the staffers to share the perspectives of highly productive volunteer editors who are essential to the success of the encyclopedia. As the years have gone by, the staffers that I knew and who paid attention to what I had to say as a highly active encyclopedia editor have moved on, presumably to even better jobs in the software industry. A few remain who I interact with, but it seems that the WMF is determined to throw its cash resources at "pie in the sky" efforts to draw in editors from poor countries without fixing the fundamental flaws with the mobile sites and apps that such potential editors are most likely to use. The WMF takes in the massive amounts of cash that poor people worldwide donate, and instead of spending that to allow people to truly collaborate on smartphones used by billions of people worldwide, they squander the money to keep overpaid and unproductive code monkeys prosperous for more and more years to come, to the detriment of the encyclopedia. The desktop site works just fine on modern smartphones. Why not shut down all mobile sites and apps, and lay off all of the developers who have utterly failed to make these mobile sites and apps fully functional for over a decade? Why should people who have failed for so long stay on the payroll? I hate to be mean, but no profit-making business would tolerate a dozen years of complete and utter incompetence from a project team, when the free alternative works just fine, as I have proved over and over again over the years. It is nest-feathering behavior by human beings who should know better. Their excuse boils down to "it's hard". 
Not acceptable when the cash donations of poor people are at stake. Cullen328 (talk) 07:48, 30 August 2022 (UTC)[reply]
User talk:Jimbo Wales/Archive 241#Newbie and IP edits should be vetted delayed before they go live " But we have a situation ongoing for a very very long time now that there is a disconnect between the community and the developers... a lack of trust is part of it... so that experimentation and rollback (something that we as Wikipedians should be super comfortable with) isn't allowed to happen. Any change to software is much much harder than it needs to be.-" Wakelamp d[@-@]b (talk) 13:36, 31 August 2022 (UTC)[reply]
Like you, I always switch to desktop view on mobiles and tablets; it's perfectly fine unless you have a 3-inch iPhone screen. It would be nice if the desktop view option were more prominent, or indeed the default setting.
Fundraising squeezing money out of poor people: See the current discussion/RfC over on the Proposals Village Pump, reviewing the Wikimedia fundraising emails about to go out. It touches on that.
As for throwing money at the developing world: While I think there may be some problems with spending decisions (see Wikipedia:Village pump (WMF)/Archive 5#Should the WMF have rules or policies for when banned users apply for or are part of the team that administers grants?), I looked at the Form 990 a while back and found that claims the WMF is spending large amounts of money in the developing world are merely a convenient PR meme. In reality, the amounts have been absolutely minuscule to date – less than 2.5% of revenue. See Wikipedia:Wikipedia Signpost/2022-06-26/News and notes#Where does the Wikimedia Foundation spend its money? for a breakdown. Andreas JN466 08:25, 30 August 2022 (UTC)[reply]
@Kudpung , @Cullen328 Your posts made me wonder where the WMF developers spend their time
Exhibit 1 - fund raising Full board- 5 members
Exhibit 2 - Community Wishlist All in backlog
@WereSpielChequers You are correct - they aren't on our side - They can have the kids as well
??? Can we ask the trustees for the breakdown of costs by project ??? Wakelamp d[@-@]b (talk) 13:03, 30 August 2022 (UTC)[reply]
Your "Exhibit 2 - Community Wishlist" link doesn't seem to work — I think you may have meant phab:tag/Community-Wishlist-Survey-2022? If so, that board appears to be a little misleading, and you might find phab:tag/community-tech a little more helpful in gaining an insight into how this specific team is spending their time In the interest of transparency, I work on the Community Tech team as a software engineer, though I consider myself a volunteer first and foremost. — TheresNoTime-WMF (talk • she/her) 16:38, 30 August 2022 (UTC)[reply]

@WereSpielChequers, Cullen328, and TheresNoTime: Some people are conspicuous by their absence from this list, maybe it's because they are concerned about landing in the Foundation's bad books. Not that it would matter, the WMF doesn't appear to give a hoot because the SF cabal, or at least its management class, is a classic example of groupthink. I've often wondered what it's like to split one's personality between being a volunteer and accepting pay at the same time. Personally I don't think it's possible but I do admire the tiny handful of those who straddle the Great Divide and who are able to remain on the side of the volunteers. Kudpung กุดผึ้ง (talk) 18:49, 30 August 2022 (UTC)[reply]

Hi Kudpung, I've been a volunteer since 2007, and was a WMUK part-time staff member from 2013 to 2015. I can remember leaving the room for one WMUK AGM (I didn't have to be in the room as either a volunteer, a member or an employee, and would have been in an odd situation if I'd been in the room for a particular item). Mostly though I thought it worked, and I think my history of being a volunteer for six years before I joined the staff was a big advantage in my GLAM role. As for your letter, I suspect I'm not the only Inclusionist who was put off by the sewer analogy. If I'd agreed with the letter a bit more I would have signed it; as we both learned many years ago when we and ScottyWong looked into the block logs of the most active editors, discreet and diplomatic approaches are not the best way to influence the WMF. ϢereSpielChequers 19:05, 30 August 2022 (UTC)[reply]
@WereSpielChequers:, I wasn't alluding for a moment to your salaried role in a Wikipedia chapter. I will also never forget the one-to-one meeting we had in Oxford a great many years ago that inspired and encouraged me to become so active on NPP issues for over a decade. NPP has been without any coordination for a couple of years until MB stepped in recently. He and Novem Linguae are doing a grand job which partly includes doing some of the paid WMF's work for them for free. I thoroughly agree that discreet and diplomatic approaches are not the best way to influence the WMF. The new NPP coordinators' initiative with their letter is urgent and admirable, although there might be some very minor turns of phrase that in hindsight could just possibly have been differently worded. I don't think they are a deal breaker though, and NPP certainly needs a lucky break soon. Kudpung กุดผึ้ง (talk) 02:56, 31 August 2022 (UTC)[reply]
  • There are clear culture-clashes between Wikipedia (enwiki in particular) and the WMF; and I definitely agree that some of the stuff they've spent the money enwiki (largely) generates on could have been more productive. But at the end of the day operating a site with the size and prominence of Wikipedia is going to require a sort of legal, financial, and bureaucratic overhead that cannot (currently) be done through enwiki's methods. I think we can and should push for better communication between the two and more input and influence from Wikipedia in the WMF's decision-making process, but I don't think it's realistic to suggest separating the two; we'd just need another WMF eventually. And at the end of the day, while I disagree with some of what the WMF has done, they've done better than the people who run any other high-profile website I can name - would you want to replace the WMF with the people who run Twitter? Google? Facebook? At the end of the day, outside of a tiny number of clashes that ended up having little impact in the grand scheme of things, the WMF has mostly allowed enwiki to do its thing, and enwiki has largely done all right by that arrangement. --Aquillion (talk) 03:14, 31 August 2022 (UTC)[reply]
Aquillion, the difference between the WMF and 'big tech' is that Twitter, Google, Facebook, etc., pay salaries to the people whose work generates the huge corporate profits. This sets the paygrades for the staff at the WMF, several of whom are on celebrity salaries. This is what causes discontent. The WMF expects, yea, demands, that not only do we accept their wasteful, unrequested software 'enhancements', but that our volunteers who have enough to do also do the engineering on projects that the paid devs don't find sexy enough.
The case of the NPP tools is rather essential: without the new articles being promptly and accurately patrolled, Wikipedia will lose the very reputation for clean articles that the Foundation boasts about. Indeed, it's already happening.
We are down to barely 10% of the supposed 750 reviewers, and of that 10% only a tiny handful are doing 90% of the work, and backlog drives are proving largely ineffectual. In the worst case scenario, the reviewers will simply down tools. I wouldn't exactly call ACTRIAL, for example, a clash that ended up having little impact in the grand scheme of things. It had an immediate effect that worked well for a while, but its usefulness has since expired. It was extremely useful in one respect however: it proved loudly and clearly just how totally wrong the WMF can be. They may have forgotten it in the grand scheme of things, but we haven't. Kudpung กุดผึ้ง (talk) 07:00, 31 August 2022 (UTC)[reply]
As Kudpung says, staff at Twitter etc. work for the company and can be expected to follow all reasonable orders in exchange for a salary. At Wikipedia, money flows the other way: unpaid editors create and curate content, which attracts donations, which WMF takes from Wikipedia. Effectively, Wikipedia is buying services such as hosting and legal from the WMF. Even though the transfer of Wikipedia's brand made the WMF a monopolist, it should still act more like a supplier than an employer. Certes (talk) 10:15, 31 August 2022 (UTC)[reply]
Actually, "unpaid editors create and curate content" exactly describes the business model at Google, Twitter, and Facebook. Why do you think those companies give away their services for free? Because then their users generate content for free. Which those companies monetize by selling ads. If you're not the customer, you're the product. -- RoySmith (talk) 21:47, 2 September 2022 (UTC)[reply]
I'm going to have to think about that one. If Wikipedia is really run like Facebook, this will be my last contribution. Certes (talk) 22:51, 2 September 2022 (UTC)[reply]
@Aquillion: Are you aware by just how much WMF revenue has increased over the years?
As for WMF salaries, compare some of the entries here to the corresponding entries two years prior. You've got the CEO's compensation increasing by 7%, the DGC's and GC's by 10%, the CFO's by 11%, the CAO's by 22%, the CCO's by 25%, the CT/CO's by 28%, and the CPO's by 32% over a two-year period when US inflation was at 2%.
Meanwhile, WMF fundraising messages ask donors – including in places like India and South Africa – for money "to keep Wikipedia online". Andreas JN466 11:12, 31 August 2022 (UTC)[reply]
@Aquillion I don't really wish a separation, but a rebalancing of the relationship. I do think that WMF has changed its philosophy. Wakelamp d[@-@]b (talk) 22:57, 31 August 2022 (UTC)[reply]
Has there ever been a proposal to message Spotlight to all active editors? I ask because it might correct the WMF and WP imbalance, since most editors will never go near the pump. For instance, the discussion on the WMF emails only involved 20 editors (plus lurkers). Wakelamp d[@-@]b (talk)

Arbitrary break

What a tangled web the WMF weaves: A recent, long comment from a senior Foundation employee goes to demonstrate once again the reasons for the community's long-ingrained distrust of the Foundation's use of the huge surplus money generated by the free work of volunteers, and of the claims the Foundation makes of supporting the volunteers with the required software. The comment comes across as a rather poorly worded, hurried attempt by the WMF to justify itself, but it clearly contradicts that department's own mission statement.

The community has previously been told quite clearly that the maintenance of the essential PageTriage software is not within the remit of the WMF's Growth Team (although it was a WMF creation). At the same time they are telling us that there will be no action until a request is submitted through their annual Wishlist Survey. Maintaining the features and addressing the bugs in the various elements of the NPP tools is clearly beyond the scope and purpose of Community Tech as described on their own web page. Even if the community were to assume a huge dose of good faith, what is it supposed to believe?
The appeal addresses precisely that question. Kudpung กุดผึ้ง (talk) 00:25, 1 September 2022 (UTC)[reply]

Well, what else can we do? I predict that once the letter is sent, a similar response to the above will be made. I suppose we need some other big idea if we are to improve the program. Perhaps something for me to mull over for a while. Even if the letter does succeed, it wouldn't hurt to devise an alternative solution to the problem. CollectiveSolidarity (talk) 03:23, 1 September 2022 (UTC)[reply]
@CollectiveSolidarity Your user name is very appropriate for a possible solution; have The Spotlight sent to all active editors with editorials explaining the issues, or at least encouraging connectivity/editor retention/community/article improvement. Wakelamp d[@-@]b (talk) 03:50, 1 September 2022 (UTC)[reply]
@Wakelamp, @CollectiveSolidarity, in regards to your question, what can we do to create a semi-permanent resource, where these concerns could be aired? yes, we could send the Spotlight to all active editors. that is one option. are there others? let me ask, is it possible to create a user essay in the "Wikipedia:" namespace, to articulate and summarize these concerns? Sm8900 (talk) 13:39, 9 September 2022 (UTC)[reply]
@Sm8900 I have had difficulties working out another alternative for the essay, because everywhere I looked within WMF, I found reasons for major concern, and a clear policy of avoiding oversight.
What I am very uncertain of is what the appetite for change is at the moment. If we wish to change them, then we will have to address/disprove some of their issues (as they use them to justify their mission), and create an alternate structure that represents the editors, with the media, and in decision making. Otherwise, I expect that even if there was a scandal, or major media attention, changes would be superficial. Wakelamp d[@-@]b (talk) 06:05, 19 September 2022 (UTC)[reply]
hi @Wakelamp. ok. I appreciate your reply on that. thanks. Sm8900 (talk) 15:03, 19 September 2022 (UTC)[reply]
@Wakelamp, general question, how do I find the "Spotlight"? I'm sorry for my basic question. thanks. --Sm8900 (talk) 13:42, 9 September 2022 (UTC)[reply]
Apologies. Freudian slip. I meant The Signpost. Wakelamp d[@-@]b (talk) 00:19, 10 September 2022 (UTC)[reply]
I have an Option 4 which would solve your NPP issue.
  • Define what changes you want,
  • Ask for a ball park quote from an external Wiki developer,
  • Do a press release. But require they do the interview in a way not to reveal your identity - the Secret Wikipedia.
  • Ask for EFF agreement to help, and to use one of their bank accounts.
  • Get agreement from Gutenberg, OpenID, Apache, the Free Software Foundation, or a celebrity to put up banners for us.
  • Get a quote.
  • Have the developer do a detailed quote
  • Create a Kickstarter (after asking EFF to verify whether the statements are legal) explaining our plight, and advise that x% will go to support EFF and to review our management documents. Any surplus will go to the supporting charities. EFF will disburse the cash at development milestones. Kickstarter backers can also get T-shirts with 2005 Wikipedia slogans. (@Jayen466: this was my Option 4.) Wakelamp d[@-@]b (talk) 06:44, 1 September 2022 (UTC)[reply]
Option 4.1 Is similar, but we sell the T-shirts on wikipediocracy. Wakelamp d[@-@]b (talk) 07:06, 1 September 2022 (UTC)[reply]
Send a notice that 'The Spotlight' is available to all editors that have logged on in the last 12 months. If we want change then we need to organise, otherwise they will continue to increase in size, and ignore us.Wakelamp d[@-@]b (talk) 03:46, 3 September 2022 (UTC)[reply]
Wakelamp, let no one doubt for a moment that the required changes to NPP have been thoroughly researched, discussed and defined by the NPP team. It's a lot of ongoing, dedicated work here and on its sub page. The scope of the work is such that, paradoxically, some of the Growth Team's members are telling us it's too big for the current pool of WMF developers, while other members are insisting the changes should be appealed for at their Wishlist. This obviously casts further doubts as to the professionalism and seriousness of those in charge of the Foundation's technology, and puts their sincerity in question. Kudpung กุดผึ้ง (talk) 22:37, 5 September 2022 (UTC)[reply]
On the Org chart they show 4 developers assigned to the project - have they been assigned elsewhere?
Chris Albon Director of Machine Learning
Kevin Bazira Software Engineer III
Aiko Chou Software Engineer III (Contractor)
Tobias Klausmann Senior Site Reliability Engineer (contractor)
Luca Toscano Senior Site Reliability Engineer (Contractor) Wakelamp d[@-@]b (talk) 23:55, 5 September 2022 (UTC)[reply]
https://wikimediafoundation.org/role/staff-contractors is famously out of date, if that's where you got the above from — TheresNoTime (talk • she/her) 06:21, 6 September 2022 (UTC)[reply]
Yes - that's where I got it. Is there a better one? I thought of scraping Meta user, but only IT staff seem to have user pages. The web estimates 900 staff and growth rate of 35 %
WMF is incredibly opaque compared to others Wakelamp d[@-@]b (talk) 08:51, 6 September 2022 (UTC)[reply]
I'm not aware of a more up-to-date global list like that (and I believe that one is slated for removal) — for the Growth team, their team listing on MediaWiki.org may be more helpful? — TheresNoTime (talk • she/her) 12:37, 6 September 2022 (UTC)[reply]
With Growth, the page states it is only for mid-sized wiki.
Thank you for the tip about the possible upcoming deletion. I have now downloaded it just in case.
There is nothing else, so I have already converted it to Excel for the other analysis I am doing.
As an aside, Glassdoor (employee reviews) has been an eye-opener, as it indicates that the WMF internal structure is one of fiefdoms/divisional silos (each has its own section, often a profit centre linked to a project), with absent central control, unquantified goals, and every difficulty factor turned to 11.
Conway's law states that organisations create computer systems that reflect their internal communication, which explains a lot about the WP and WMF systems.
"Org charts" comic by Manu Cornet
Wakelamp d[@-@]b (talk) 13:50, 6 September 2022 (UTC)[reply]
The WMF staff page should remain available: it is on archive.org, along with older versions. Certes (talk) 12:16, 3 October 2022 (UTC)[reply]
Wakelamp's option 4 proposal is phrased as if it's some radical view, but if it was rephrased as "start a Wikimedia affiliate, raise some money (or maybe just get a grant from WMF), and hire a developer to do the things you want", that proposal would be a pretty normal proposal. Bawolff (talk) 05:41, 12 September 2022 (UTC)[reply]
That sounds like a proposal worth taking further. Although most MediaWiki users are readers, many enhancements requested by editors are important too, and will help editors to provide readers with better content. If the WMF structure doesn't allow those developments to happen, for whatever reason, then an alternative route is needed. I've had responses in Phabricator along the lines of "go and do it yourself", but I'm not familiar with the relevant code, unlike the developers funded by donations. Certes (talk) 12:13, 3 October 2022 (UTC)[reply]
@Certes "I've had responses in Phabricator along the lines of 'go and do it yourself'". That isn't very constructive of them. How do you find the wishlist process?
@Bawolff The radical part is that by doing it as a visible Kickstarter, we are openly stating that the donations are not being used for the purpose intended.
@Kudpung, Certes' comment that "I'm not familiar with the relevant code, unlike the developers funded by donations" made me wonder what the ORES/NPP developers are doing instead. Is the work you and yours wish done difficult because the requirement is complex, or because ORES/NPP is complex? Wakelamp d[@-@]b (talk) 23:40, 3 October 2022 (UTC)[reply]
@Wakelamp and Bawolff:, the work that two (or more) of our New Page Reviewers are doing is because they just happen to know about software code (maybe it's their job in RL). They are doing it because the WMF says they won't do it because they don't have enough money, which everyone knows is hogwash, and they have implied that if NPP want it done they should find volunteer developers to do it. You would stay more up to date if you were to read The Signpost, Wikipedia's monthly newspaper. Here's a link to the article that was published in last week's issue. Raising an affiliate is one thing and creating a simple user group is easy enough, but the WMF are hardly likely to provide a grant simply for the running of a user group, and for a user group to receive donations it has to be an incorporated body, such as a registered charity. The catch-22 is that it would first need money to do the fundraising. Kudpung กุดผึ้ง (talk) 00:51, 4 October 2022 (UTC)[reply]
I do find the wishlist process helpful. I vote there annually and have suggested a few items myself, notably the temporary watchlisting facility, an enhancement which the WMF implemented very well and which I now use daily. We don't get everything we wish for but we do get some of it. It's one of the more effective avenues of communication and the WMF deserves credit for its part in that process. Certes (talk) 10:11, 4 October 2022 (UTC)[reply]
Certes, no one is saying the Wishlist is not helpful, and it's fine for community requests for convenience tools. However, PageTriage is a major MediaWiki extension; it's the indispensable motor that enables editors to keep totally inappropriate spam and attack pages out of the encyclopedia. No AI or filters or bots can do that; it's work for experienced human editors. The extension is a big piece of software. The total aberration is that in the same talk page diatribe they are telling us that we should put our request into the Wishlist, then in the same breath that they don't have enough 'resources' and they don't have enough money. How's that for clear, professional thinking? Our work fuels their Big Tech lifestyles; the community deserves some return for it. Cullen328, one of our most mature and respected admins, spells it out even more aggressively than I would dare (mainly because I'm one of the loudest users, one who for years has negotiated with the WMF for NPP and other software that keeps the corpus clean, and I'm still trying). Cullen drives 32 miles to talk to them; I spend $1,000 a time to fly across the world to try and get an audience with them. Kudpung กุดผึ้ง (talk) 12:29, 4 October 2022 (UTC)[reply]
Yes, the proposed NPP extensions are too large to be a wishlist item. I was specifically answering Wakelamp's question on wishlists, without reference to NPP. I agree that NPP improvements should be made, independently of the wishlist, and I signed the recent letter to that effect. Certes (talk) 12:33, 4 October 2022 (UTC)[reply]
When you say we don't get everything that we want, what is the constraint? WMF resources, technical complexity, ....? Wakelamp d[@-@]b (talk) 08:52, 5 October 2022 (UTC)[reply]
@Wakelamp and Certes: I appreciate that type of response is frustrating; however, sometimes there are tasks that nobody is willing to do. In such a case, the choice at Phabricator is to either say some variant of {{sofixit}}, or to lie, say "we'll get to it any day now" and do nothing. Would you really prefer to be lied to? That said, to clarify, there are non-technical things to be done here. Contrary to what everyone here believes, it is rather unclear what is actually wanted for NPP. Prioritizing and clarifying tasks would probably go a long way towards people actually fixing NPP issues. Right now, anyone who looks into it sees a list of like 80 unsorted tasks, thinks to themselves that it looks like a lot of work to get familiar enough with the community's needs to figure out which tasks are important and why, and instead moves on to something else. This is something anyone can help fix.
Kudpung: As far as "openly stating that the donations are not being used for the purpose intended" - if you create a Kickstarter in good faith to raise money to do something, nobody is going to object. On the other hand, if you create a Kickstarter with the intent to embarrass WMF into doing something and no intent to use the raised funds to actually do the dev work you are talking about, then that is going to annoy people. In any case, I doubt WMF is saying it doesn't have enough money. I think it's saying it's using its resources elsewhere. Money buys resources, but money is not the same as resources. In regards to costs of incorporating - we're talking $40 to incorporate, and about $500 if you want to be a 501(c) [Is that really necessary? You could probably have an existing affiliate hold the cash to get around the requirement]. You have a plan that involves raising money in the high tens of thousands of dollars. If you are worried about the nickels and dimes here, it seems unlikely you will get anywhere near your actual goal. Bawolff (talk) 14:11, 4 October 2022 (UTC)[reply]
Bawolff, a couple of points here: if you follow the links in the Signpost article you will see the text where the WMF have clearly stated they are short of cash. Either that is true or they are lying. Financial analyses seem to demonstrate the latter. As much as I believe in building bridges across the chasm that divides the WMF from the unpaid ordinary people here whose work raises the donations that go first and foremost to the WMF executives' salaries, I am not American, I don't live in the US and I'm not going to learn how to incorporate a registered charity there, and I'm definitely not going to put up a penny myself to kickstart it - I'm not a philanthropist. There's no need for me to embarrass the WMF; they shoot themselves in the feet constantly, and as long as they are sitting pretty in Montgomery tower, they do not care. Kudpung กุดผึ้ง (talk) 15:05, 4 October 2022 (UTC)[reply]
Kudpung I followed the links, I did not see that claim. I did see the PM of the "Contributor Tools teams" say that these teams (a small part of the WMF) would prefer to use their budget elsewhere. That's really different - they are saying they have the money but are using it for different things, and they are not talking about WMF's money, but the budget of a specific team which is a small part of the WMF. And hey, maybe that's a stupid decision on the part of WMF. A reasonable argument could be made that it is. But saying the decision is stupid is really different than claiming that WMF is claiming that it is too poor to do it. As far as starting an American charity - then don't start one in America. All I'm claiming here is Wakelamp's proposal about starting some organization, collecting some money, and then doing something, is not the outrageous proposal everyone around here seems to be making it out to be. It's a perfectly normal thing to do, and is how some MediaWiki features already get developed. Bawolff (talk) 17:07, 4 October 2022 (UTC)[reply]
Bawolff, if you followed the links as I suggested, you missed this one and if it isn't a claim that money is in short supply, I don't know what is: Contrary to what some might think, we really don't have an endless supply of money that allows us to fix every important problem, and it's a bit ironic since they have enough cash for their Wishlist to fix all the unimportant ones. Kudpung กุดผึ้ง (talk) 23:48, 4 October 2022 (UTC)[reply]
I stand by my statement. They don't have infinite resources to solve every problem all at once, but then again nobody does. I can understand that the framing of the statement does make it sound like a money issue, possibly as a way to deflect focus from the fact a decision was made and it was an unpopular one. However, taken literally, all it's saying is that there are more bugs in existence than people to solve them. Which is of course true. However, I wouldn't extend that to saying they couldn't solve the problem if they wanted to, just that, rightly or wrongly, there are other things they would rather prioritize. Bawolff (talk) 17:08, 5 October 2022 (UTC)[reply]
Just to be clear: are these "tasks that nobody is willing to do" being declined by volunteers or by paid employees and contractors? Certes (talk) 22:48, 4 October 2022 (UTC)[reply]
Paid employees and contractors, Certes. Kudpung กุดผึ้ง (talk) 23:50, 4 October 2022 (UTC)[reply]
@Certes: Not to speak on behalf of Bawolff here, and addressing the question more generally — it can be a mixture of both. Firstly, in situations where a certain tool/extension is no longer maintained/"owned" by a WMF team, new feature requests/bug reports are (ideally) worked on by volunteer developers. These volunteers are of course free to choose what tasks they work on, which can result in these "{{sofixit}} moments" on those that they opt not to complete. In cases where a tool/extension is still maintained by a WMF team, internal priorities may dictate which tasks are to be worked on next, and with a finite amount of time and resources there's always going to be some which will languish in a backlog. This too can result in a "{{sofixit}}", not from a volunteer who doesn't wish to work on a task, but from a member of staff (normally a product manager) setting expectations and perhaps wishfully hoping a volunteer developer steps in. It's not ideal, and I do honestly hope that at least one outcome of the NPP discussions are a reconsideration of some budget decisions with a renewed focus on supporting the editing community with their technical needs. — TheresNoTime (talk • they/them) 23:56, 4 October 2022 (UTC)[reply]
TheresNoTime, You're right of course, that is indeed an accurate description of the scenario. What disconcerts the NPP team however, is that PageTriage is a major and critical extension and to be told at Phab it has no owner comes as unusual. More to the point, the Foundation's reluctance to address the issues with this important piece of software and expecting the New Page Reviewers to do it themselves seems irresponsible - not all Wikipedia editors are software engineers, nor do they need to be. Some are and they are ready to jump to action but they should most definitely not be doing the Foundation's work for free.
Here's a simple analogy: The truck drivers in a wealthy transport company are complaining to the management that the brakes on their lorries have not been maintained for a long time and are getting dangerous. "No money for it, so fix it yourselves," says the CEO before climbing into his company Rolls Royce. TheresNoTime, do you drive a car? Do you know how to change the brake pads and bleed the system afterwards? Not even in Germany, which has some of the strictest requirements for obtaining a heavy goods vehicle or bus driver licence, and where the candidates are expected to fully understand the workings of the different brake technologies, are the drivers required to know how to repair them. What happens is that the drivers go on strike until the brakes are fixed. In the meantime the company gets a reputation for poor delivery times and loses its customers.
Imagine a world where the Wikipedia volunteer maintenance workers put their tools down... Kudpung กุดผึ้ง (talk) 00:45, 5 October 2022 (UTC)[reply]
@Kudpung: Everyone thinks their component is the most critical, simply asserting it is won't change hearts and minds. You could argue WMF is bad at prioritizing the right things (after all, this is the same organization that for many years felt that MediaWiki itself was legacy and should not have any teams dedicated to it; luckily that time has passed) but it is what it is. Probably the biggest counter-argument for the impact of this tool is that it is only in use by English Wikipedia, which limits the amount of impact any fixes to it make and raises the natural question: if it really is so critical, how come everyone else is getting along just fine without it? [Not trying to argue this, just pointing out that this is the argument that is probably going through many people's heads when they decide what to work on]. Bawolff (talk) 17:08, 5 October 2022 (UTC)[reply]
Bawolff, I'll remind newer users that PageTriage was created by the WMF as a consolation prize for having so rudely and aggressively forbidden the ACTRIAL. Part of their decision to offer it was based on the premise that it would also be used by other Wikipedias. Everyone is getting along 'just fine' without it[citation needed] because they are smaller and, as region specific, they don't carry as much clout as en.Wikipedia, which is the first target for global spammers. I believe I also read somewhere recently that with 6.5 million articles, the en.Wiki has more articles than the sum of all the other wikis put together - but that's beside the point - PageTriage is not a luxury or convenience tool à la Wishlist, it's as essential as this is to human beings. Kudpung กุดผึ้ง (talk) 01:40, 6 October 2022 (UTC)[reply]
It is obviously false that en Wikipedia has more articles than the sum of all the others. French, German and Spanish together already have more than English Wikipedia - as can be seen by visiting https://www.wikipedia.org . Then there is the whole question of how you should count Wikidata, which has a very high edit rate. Anyways, at the end of the day I am not the one you have to convince that it is essential; however, simply asserting that it is is not very convincing. 95% of phabricator posts are people claiming that their thing is the most important thing. Most of them are wrong, some of them are right, but either way, the assertion by itself does nothing to move the needle. Bawolff (talk) 03:30, 6 October 2022 (UTC)[reply]
Bawolff So the problem is priority setting. Can you advise me how to find (group, task, project, tag?) the pt/en/es/wikidata projects on phab that you mention? Do they have wish lists as well? And is it possible to see how many story points/hours were/will be used on a task? Wakelamp d[@-@]b (talk) 11:44, 7 October 2022 (UTC)[reply]
I did not mention any pt/en/es/wikidata projects so I'm not sure what you are referring to. In general though you can just search phab for whatever you are looking for. Most people don't track time on a per-task basis. Story points are estimates from before you start a task, not how long it actually took. They are also not comparable between different groups (or even over the long term in the same team). 1 story point might mean 10 minutes to one person, it might mean 1 day to another. Bawolff (talk) 18:52, 8 October 2022 (UTC)[reply]
"i'm not sure what you are referring to" - I asked based on your comment that "It is obviously false that en wikipedia has more articles" and I assume you meant this link. I agree that the top 4 have more articles, and English may soon be eclipsed by Cebuano_Wikipedia by itself. The articles per encyclopedia metric is becoming meaningless with bot translations; wikidata has so many edits because of the gamified options (I ran up 70 (?) very quickly using this). — Preceding unsigned comment added by Wakelamp (talk • contribs) 12:59, 10 October 2022 (UTC)[reply]
Bawolff, Whose side are you on? The WMF's as a MediaWiki admin, or the Community's? Does it not interest you that the Foundation's reputation depends on an encyclopedia that has reasonably trustworthy content? It's the en.Wiki that probably draws most of the donations. Do you not believe that is a priority? What do you think is more important than vetting the new articles? Kudpung กุดผึ้ง (talk) 07:30, 10 October 2022 (UTC)[reply]
Kudpung That's not constructive. There is no possible answer that Bawolff can give that will satisfy you, because you have defined community as agreeing with you, and as a Wikipedian I proudly say that we never all agree on anything. Especially us. So, if you want to convince an editor, show refs and good faith; if you want to convince devs you need cost/benefits (preferably strategic (long term/multiple areas/stable) over tactics), good faith, and caffeinated beverages. With your NPP proposal, you haven't explained what the benefits are in terms of numbers; I have now looked through your list and I couldn't see how it would solve your problems of capacity/quality. My guess is also that NPP are not considered the Scarlet Pimpernel (NPP: Still heaven or hell for new users – and for the reviewers) — Preceding unsigned comment added by Wakelamp (talk • contribs) 12:59, 10 October 2022 (UTC)[reply]
Kudpung I'm on my own side, in all things. In any case, I think WMF has many flaws, but much of the vocal criticism is misguided, which ultimately helps WMF escape scrutiny because its critics are so easy to dismiss. As far as the matter at hand goes, you need to show that your proposal would help those things, in a significant way, not simply assert it. Furthermore the benefit has to be great enough to justify people abandoning current work in progress that is half done. To be clear, I'm not saying that it isn't, or that if I was in charge I wouldn't decide differently. I am saying that it's not being presented in a convincing way, which is why you are not getting success. Bawolff (talk) 15:53, 10 October 2022 (UTC)[reply]
Excellent choice of sides. I agree the WMF critics are easy to dismiss, especially as we don't seem to have any power over WMF under the California non-profits act:
  1. We are only a small part of their mission
"The organization is also required to serve public, rather than private interests; generally, this means that its activities benefit a large and indefinite class of individuals, as opposed to a small, identifiable group. In particular, the organization may not be organized or operated for impermissible private interests, such as those of specifically designated individuals, the founder of the organization, the founder’s family, or persons or companies controlled by such private interests."
The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally.
  2. We have no voting rights
"If a public benefit corporation has members, the members are typically vested with voting and other rights pertaining to the corporation’s affairs. And, when the corporation’s articles of incorporation or bylaws give members certain voting rights, they are called statutory or voting members"
Statutory members have the right to vote for the corporation’s directors (also known as board members, not to be confused with the corporation’s membership), to vote on the manner in which the corporation’s assets will be disposed upon dissolution or merger, or to vote on changes to the articles of incorporation or bylaws.
If members are deemed statutory members, California law also gives them other rights, including the rights to: inspect certain corporate records; receive notice of member meetings; remove directors; and sue directors in derivative actions, or third parties on behalf of the corporation, under certain circumstances.
ARTICLE III - MEMBERSHIP The Foundation does not have members. (Fla. Stat. Section 617.0601)
  3. We could not prevent an ownership change
"In the unlikely event that the ownership of the Foundation changes, we will provide you 30 days' notice before any Personal Information is transferred to the new owners or becomes subject to a different privacy policy."
www.forbes.com /how-to-successfully-approach-a-nonprofit-takeover/
  4. The board could not prevent an ownership change
The policy change resolution is amazing in its power:
https://foundation.wikimedia.org/wiki/Delegation_of_policy-making_authority , especially coupled with the quorum, trustee appointments, and voting power in the bylaws. Gordon Gekko would approve.
  5. The Global Council
The Global council would seem to weight affiliates with 1 active user the same as en de es https://wikistats.wmcloud.org/display.php?t=wp Wakelamp d[@-@]b (talk) 15:44, 11 October 2022 (UTC)[reply]
Have we any details of the contract with Tides or our obligations to the Endowment?
WMF may have been taken over financially if the contract with Tides specifies that we must put $X irrevocably each year into WE, because we would have ceded control.
WMF can only recommend to Tides how much to spend (but they don't have to spend anything). Tides's fees are effectively a 14+% handling fee (on their web site: sponsorship costs of 10% plus fund fees of 5% for a DAF (donor-advised fund), which is similar to the annual reports on Tides Advocacy and Tides Foundation), plus they receive all positive returns on the investment until it is spent by the grantee.
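As a rough check on the "14+ %" figure, the sketch below applies the two quoted rates (10% sponsorship, 5% fund fee) one after the other to a $10 donation. That the fees compound in this order is an assumption made for illustration, not something the Tides documentation quoted here confirms, and the function name is hypothetical.

```python
from decimal import Decimal


def amount_after_fees(donation, fee_rates):
    """Return what is left of a donation after applying each fee in turn.

    Assumption: every fee is charged on the amount remaining after the
    previous fee (sequential compounding). Real fee schedules may differ.
    """
    remaining = Decimal(donation)
    for rate in fee_rates:
        remaining -= remaining * Decimal(rate)
    return remaining


donation = Decimal("10.00")
quoted_rates = ["0.10", "0.05"]  # 10% sponsorship cost, 5% fund fee, as quoted above

remaining = amount_after_fees(donation, quoted_rates)
effective_rate = (donation - remaining) / donation

print(f"Reaches the grantee: ${remaining:.2f}")
print(f"Effective handling fee: {effective_rate:.1%}")
```

Under that assumption about $8.55 of every $10 would be passed on, an effective fee of 14.5%; simply adding the two rates would give 15% instead. Investment returns retained before the grant is paid out are not modelled.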
DAFs are controversial (even with some billionaires) because of their use in dark money political funding (both left and right); their use by billionaires (up to a 75% capital deduction up front, with the giving spread over many years); boosting "giving" by giving to other DAFs; scams; the risk of funding terrorism; dummy banks; the 75% up-front deductions being on the IRS "Dirty Dozen" in 2008 (as tax minimisation schemes); grants for private purposes, including one case that was part of the Varsity Blues scandal; and because they do not have to show itemised accounts of who receives the grants (so there is a risk that they "sell" two donors the same grant).
"If I want to launder, I can set up a donor-advised fund I control and make donations, then direct those donations to a nonprofit that I also control, so I then control the beginning, middle and end of the transaction and I just cleaned money," Rechtman said. "I can also go through a donor-advised fund I don't control, as long as the donations ultimately go to a charity I do control."
The Tides FAQ is important reading in terms of anonymity and the documentation required. [3] Wakelamp d[@-@]b (talk) 09:35, 14 October 2022 (UTC)[reply]
Just to add that it is not "we" (Wikipedia) who are ceding control to Tides. The WMF has seized control of Wikipedia in a number of frog-boiling stages over the past 20 years. It is the WMF rather than Wikipedia who is now passing on that usurped power. Certes (talk) 11:04, 14 October 2022 (UTC)[reply]

Petition WMF

Since 2012 and the change in fundraising model through the introduction of A/B models, WMF have been afflicted by the [Resource curse]. WMF had two resources in plenty - money and volunteers - so it had little need to listen to those few that complain.

The suggestion is that we advise WMF to freeze:

  1. Fundraising, the Endowment - we have three years
  2. Hiring of staff/contractors/consultants. LinkedIn shows a 35% growth; WMF has 21 jobs open but Indeed shows 75 jobs.
  3. Any structural change to the WMF-WP relationship

Until

  1. They confirm ALL the email claims through a detailed list of staff split into percentages, detailed annual reports for WE and WMF, contracts, staff surveys, statistics, etc.
  2. They explain exactly how we will have transparency on the Endowment and give an example of how much of a $10 donation via the endowment will get to the grantee, and how we will know what they did with it. Wakelamp d[@-@]b (talk)
Wakelamp, I have left a message for you on my talk page. Kudpung กุดผึ้ง (talk) 01:23, 5 October 2022 (UTC)[reply]

Does the WMDE to DEWP relation work better?

Their wishlist process is problem focused. Are they better resourced?

If WMF-dev did as we wanted, how would you want the Wikipedia platform to be different in five years?

WP-WMF/dev communication/priority setting is a big issue in the comments above. But even if we could decide strategy, what do we want the UX/Editing Process/Dev/Roadmap to be in one, five or ten years? Is the [2030 plan] in line with this?

In 2017, we created a strategic direction to guide our Movement into the future: By 2030, Wikimedia will become the essential infrastructure of the ecosystem of free knowledge, and anyone who shares our vision will be able to join us.

Wakelamp d[@-@]b (talk) 15:03, 17 October 2022 (UTC)[reply]

Vector 2022 - everyone else was correct about WMF Dev

I have struck out the above. The WMF's response to the RfC on Vector 2022 makes it clear that it is not an option. So everyone else was correct and my boundless optimism in WMF Dev was misplaced. I have also struck out the suggestion that they hold themselves accountable by providing a summary linked to the comments. It is a great method to prevent sham consultations, but it also works well for getting buy-in from communities, making sure systems work, and refocusing. Oh well, off to research why they are incorrect :-) Wakelamp d[@-@]b (talk) 11:34, 19 October 2022 (UTC)[reply]

Best way to get the WMF's attention

If you don't like the direction the WMF is going... Stop donating, and encourage others to do the same. Blueboar (talk) 11:56, 19 October 2022 (UTC)[reply]

Stopping donating is ineffective - there are far more people who will.
Read the WMF response: ".... it's evident that text becomes difficult to read at those lengths...so why would anybody study that. ..we have no reason to question the WCAG guideline, nor the individual opinions of 15 WMF designers, nor the several other design and typography professionals we've been advised by. Wikipedia is different than many sites, as you and many others have pointed out, but that doesn't mean it is different in regard to what a readable line-length is.
...And yes, more informational density is probably more appropriate for an encyclopaedia than for many other reading experiences, ...So yes, we can be unique and that's wonderful ... most readers are reading on narrower screens, and nudge editors towards having common experiences with editors. If they want to opt-out of that, I think that's their choice,.... and we hopefully embody the experimental spirit you speak of" Wakelamp d[@-@]b (talk) 14:05, 19 October 2022 (UTC)[reply]
I doubt that money from Wikipedia editors forms a significant part of the WMF's income. Our impact comes from donating skill and effort, without which their cash cow would be rewritten by spammers and vandals. I wonder whether it would be fair to our readers to protest by withdrawing that labour during the forthcoming banner campaign. Certes (talk) 17:59, 19 October 2022 (UTC)[reply]
Agreed we don't. But I think we are going to be OK, as WMF are far weaker than I thought.
Their only real advantages are that they control the PR and the narrative, we are weakly networked, and we have no alternate tech strategy.
The big disadvantage of WMF is that they are running a large scale experiment in governance and growth. They are focused on KPIs - 30% growth in staff, 350 languages, press releases, maximising geographic separation, encouraging micro-edits, the ability to create millions of articles at will, creating dummy editors if they wish, no controls over who they fund, and fundraising without limits.
The problem is that there are always limits. And their master plan of creating an unbiased auto-written encyclopedia is not technically possible, and would create zero barriers for entry and zero reasons for donating. Wakelamp d[@-@]b (talk) 14:30, 25 October 2022 (UTC)[reply]
Wikipedia gave away its domain names and trademarks to the WMF. That's the main barrier to a more equal relationship. The WMF can ride the gravy train as far as it likes. Our only recourse is to fork Zxcvbnmpedia with its own URL, at which point the WMF could spend its cash mountain on paid editors to fight us for traffic. Certes (talk) 15:18, 25 October 2022 (UTC)[reply]
I thought so as well, until I realized that it doesn't matter who owns the trademark (we are using wiki on wikipedia - so they can't complain), who owns the domain names (they have no way of stopping us from using it), or whether they hire paid editors (good luck with that - my estimate is on the order of a billion dollars of work per year: median edits of active editors is 250 per year (WMF figure) times 200 k editors times $100 per hour (including on-costs) gives 5 billion).
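The estimate above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function and parameter names are hypothetical, and the time spent per edit is not stated in the comment. The 5 billion figure falls out if each edit is taken to cost one hour; the 12-minute value is purely an assumption, chosen to show what would bring the total down to about a billion.

```python
def annual_replacement_cost(edits_per_editor, editors, hourly_rate, minutes_per_edit=60):
    """Estimate the yearly cost (in dollars) of paying for volunteer edits.

    minutes_per_edit is an assumption: the quoted figures only multiply
    edits by an hourly rate, which implicitly treats one edit as one hour.
    """
    total_minutes = edits_per_editor * editors * minutes_per_edit
    return total_minutes * hourly_rate / 60


# Figures quoted in the comment: 250 edits/year, 200k editors, $100 per hour.
one_hour_per_edit = annual_replacement_cost(250, 200_000, 100)

# Hypothetical alternative: 12 minutes per edit.
twelve_minutes_per_edit = annual_replacement_cost(250, 200_000, 100, minutes_per_edit=12)

print(f"1 hour per edit:     ${one_hour_per_edit:,.0f}")
print(f"12 minutes per edit: ${twelve_minutes_per_edit:,.0f}")
```

The result scales linearly with the time assumed per edit, so that one unknown dominates the estimate.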

They are already running into problems with bias and ineptitude causing the reader decline. Their long term vision is a bit boring, and donations are totally dependent on what we do. The only things they do control are the servers, major software releases, the ability to create new wikis, and the narrative/PR. Correct me if I am wrong, but all the big changes have been tied to grants, or happened because of things we do. They lose control if we

  • remove the need for a centralized server structure.
  • Open the architecture - use open source forums and tech editors, but write them to wikis.
  • reduce the scope - rather than the sum of all knowledge - access to the sum of all knowledge. integrate other communities - Khan academy, Wiktionary, and other databases - rather than replicate.
  • Remove the need for 350 wikis,
  • better communications within wiki. (signpost for all)
  • better communications between major wikis
  • challenge their legitimacy as controllers of a toxic community

and lastly by having us act as the opposition, so that each media release is followed by us explaining that they are bonkers.

Leadership

My impression of the Wikipedia projects and the WMF is that they are a free-for-all in which most everyone does as they please. As different people have different ideas and different approaches you then get a lot of conflict and chaos. So, for example, the members of the English Wikipedia don't seem to get along with each other smoothly, let alone getting along with the WMF too.

What seems to be missing most is leadership as it seems that no-one is in charge. In the early days, you had a fairly clear leader – Jimmy Wales – plus a chief of staff, Larry Sanger. Jimbo then created an organisation – the WMF – and Sue Gardner provided leadership for that, establishing the financial success which now powers the WMF. But after Gardner, we had Lila Tretikov who clearly failed to get a grip and Katherine Maher who seemed to mainly focus on healing the wounds. We now have Maryana Iskander but, as yet, I'm not seeing much visibility or impact from her appointment.

Of course, strong leadership can be a liability – just look at what Elon Musk is doing to Twitter. There's some lessons to be learnt from what's happening over there and so perhaps we should count our blessings. If there's one thing worse than making $ millions, it's losing $ millions!

Andrew🐉(talk) 10:45, 6 November 2022 (UTC)[reply]

Maryana Iskander's background is in charity, grants, politics, and donors; I can't see her fixing the rift, as there are so many vested interests, they see no limits, there is so much status (Katherine is on a Nobel Prize Committee), and so much money at stake.
WP, even with all its chaos, is probably more competent than WMF, because what we do is very visible, and could be easier to change as the number of productive (non bot, non micro) editors/admins drop. So, as we need leadership it must come from within WP. Wakelamp d[@-@]b (talk) 14:43, 6 November 2022 (UTC)[reply]

Dedicated resources

Have we ever considered requesting the WMF hire a couple of software developers whose sole responsibility would be to work on tasks allocated by the enwiki community? An example of where we would choose to allocate them would be the NPP software.

There would be some practical issues to resolve, such as how we allocate tasks and what work they should do if we fail to allocate tasks, but those issues can easily be resolved and if the WMF agreed I think it would go a long way to improving our relations with them, as a significant part of the tension comes from them not properly supporting us. BilledMammal (talk) 14:49, 8 November 2022 (UTC)[reply]

Dedicated developers would be good, but I agree task allocation/co-operation is the issue because WMF have different goals, and frankly what we do isn't exciting for them or helps them achieve their mission.
If we had dedicated developers (and we have lots of volunteers), they could create interfaces to existing open source systems (such as Phab or phpBB) to fill WMF system gaps (or admit, for things like talk, that wikis are a very poor fit), or build a better search interface.
Examples of WMF goals being different
The WMF has no mention of vital articles; Meta has one mention of improving vital articles; Phab has no mention of improving existing editor processes and 10 mentions of "Vital article". WMF have 175 IT staff (for 350 wikis and fundraising/grants/Wikidata/Google Enterprise), of which I have heard it mumbled that 10ish work on enWP for existing editors and the database that supports them. (Wikidata is out of deWP, and I am not sure who funds Commons, but they have also raised a petition.) Wakelamp d[@-@]b (talk) 11:41, 10 November 2022 (UTC)[reply]
We already have an annual ballot for community suggestions for development features. I think it makes sense to continue and perhaps expand that approach rather than have dedicated programmers as the ballot could result in developments that involve different IT skills from year to year, and this emphasises features we want rather than who codes them. ϢereSpielChequers 15:50, 11 November 2022 (UTC)[reply]
So you suggest that rather than request the WMF hire two developers who are dedicated to work on tasks allocated by the enwiki community, we request they allocate the story points equivalent of two developers, allowing them to match the skills of the developer to the task?
That makes sense, though my concern is that if there aren't dedicated developers our tasks will be low on the priority list and often end up uncompleted. BilledMammal (talk) 03:47, 13 November 2022 (UTC)[reply]
With a 100 M a year in donations, I don't see why we have a wish list - why not do all the things that are worthwhile now? Wakelamp d[@-@]b (talk) 15:40, 13 November 2022 (UTC)[reply]

May I pin this topic?

We discuss this each year, I think we should leave it until we have a proposal. Wakelamp d[@-@]b (talk) 11:44, 10 November 2022 (UTC)[reply]

Conway's law and WP and WMF fundamental differences

  • Conway's law states that "Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure." Systems/processes and organisations mirror each other.

WMF organisation: WMF has a centralised organisation with low interaction between nodes (departments), exacerbated by its policy of emphasising geographic spread over effectiveness. Their central KPIs are donations/media mentions, which are unmeasurable or not related to WMF's actions.

WMF system/processes: This organisational structure is reflected in systems and processes. The complicated, partly undocumented architecture and links between systems don't quite fit Conway's law, but I think they might be just a reflection of the organisation, and protective of devs (they need protection!)

  1. A central server running on non-standard configurations (So, we can't spin up more capacity on AWS)
  2. A central development architecture that we get around, and has no strategy
  3. A WMF-focused development team (a UX team who want a single UX experience for both editors and readers, a community wish list rather than a strategy, no attempt to provide tools that editors can hack (proper workflow tools, open interfaces for other edit tools, newsletter systems, project tools...), Fundraising having an unlimited IT budget),
  4. Overcentralisation causing fiefdoms (poor communication between nodes (departments) to the point that you have profit and cost centres mixed, poor financial controls, separate IT departments, high chance of fraud, and entryism),
  5. A command/control mentality towards WP (judgement of editors, especially if it increases donations),
  6. A central tenet that "the mission" justifies the means (their constituency is not the editors, and they should be controlled, or not provided with services; working with Facebook on bias issues),
  7. Avoidance of responsibility (bureaucratic, no measurement of outcomes, restriction of information, faux consensus within WMF or PR speak, weasel-word strategy documents, deliberate poor communication because it is boring or too hard, hiding of bad news, volunteer editors bearing the burden, and no duty of care towards editor mental health or legal issues),
  8. Conformity to norms (hiring by alignment with the mission, social pressure within WMF, focus on status and branding, diversity over competence, inexperienced managers promoted from within, dependent on patronage for perks and promotions),
  9. Centralisation of opinion (WP NPOV means WMF view; non-bias is universal and timeless and must be imposed on other wikis, just like missionaries translating the Bible),
  10. A preference for working with other centralised/high status organisations rather than open source, and
  11. An intolerance of rival structures (enWP editors toxic, a diffusion of major wiki power by 1 vote = 1 wiki, undermining of Trustee power).

Apart from that they are Apples :-) Wakelamp d[@-@]b (talk) 02:23, 26 November 2022 (UTC)[reply]

What does "WMF (non developers)" mean? (People writing on Wikipedia tend to use abbreviations without explanation. This is a very irritating habit that I would really like to see changed. ABD CKF WRET WITWIGO? ) Kdammers (talk) 15:44, 4 December 2022 (UTC)[reply]
I will try to reduce my abbreviations.
The discussion about WMF developers is at the top of the thread. I thought that the WMF developer areas had more in common with WP, but there was disagreement that this was true. It also gets a bit murky what a non-developer area is, because many of the VPs have a core area, an advocacy area, and their own IT departments.
For instance, Lisa Seitz-Gruwell (Chief Advancement Officer/Deputy Chief Executive Officer) is responsible for a staff split into 22 IT, 22 grants/advocacy, 6 endowment, 23 fundraising, 10 education, plus many overseas staff (I guess, based on job title) in Wikimedia-related organisations. So it is doubtful that the priorities of the 22 advancement IT staff would line up with those of WP editors, as their metric is donations.
As an aside, the advancement department breaks a fraud prevention accounting principle called Separation of duties, as it contains multiple steps in an income process, both income and expenditure, and is not that dependent on other departments except for a weekly cash transfer by finance. A structure like this is a recipe for organisational chaos. Wakelamp d[@-@]b (talk) 11:55, 5 December 2022 (UTC)[reply]
That's not true. What makes you think there is any cash transfer to any departments? If you want a bill paid, you send the bill to Finance, and they pay it directly (or not, if they think it shouldn't be paid). Whatamidoing (WMF) (talk) 21:03, 5 December 2022 (UTC)[reply]

Different world views : WMF and Slate vs WP - The banner debate

Rebecca MacKinnon (VP, WMF Global Advocacy) LinkedIn post: "Slate's Stephen Harrison has written a balanced and thoughtful story about a recent debate in the English language Wikipedia community about fundraising banners that the Wikimedia Foundation runs on English Wikipedia. He concludes: 'In recent years, Wikipedia has been attacked by authoritarian regimes and powerful billionaires—people who do not necessarily benefit from the free flow of neutral information. If $3 helps hold them off, then that’s coffee money well spent.'" Wakelamp d[@-@]b (talk) 12:10, 5 December 2022 (UTC)[reply]

No one has yet said what WMF and WP are. The thread starts with "Are WMF and WP too different to ever get along ? == WMF (non developers) and WP." Having waded through comments, I am guessing that WMF is World Finance Fund and WP is Wikipedia. But, considering that they are in quite different spheres, I had no idea when I first tried to read this thread. I repeat my plea: explain the abbreviations you use. Kdammers (talk) 22:23, 5 December 2022 (UTC)[reply]
WMF is Wikimedia Foundation, and WP, as used here, usually means English Wikipedia. I'll avoid commenting about Anglo-centrism in these discussions. Donald Albury 00:18, 6 December 2022 (UTC)[reply]
The Slate article is being covered on the RfC.
@Donald Albury You mentioned that you found the discussion Anglo-centric. Do you see any areas in which the WMF-WP relationship could be improved?
The discussions about WMF accountability, truthful fundraising, the Endowment, structure, and fraud risk don't seem Anglo-centric. Alignment of enWP and enWMF aims and strategy is at worst Anglo-spheric.
De Wikipedia seems to have a less apocalyptic pop-up, but there are complaints on Reddit, which has a similarish global demographic to enWP (14k views and 3k and more), and memes in a few languages: English, Spanish, and Indian. Although there are positive ones (Germany). Wakelamp d[@-@]b (talk) 09:58, 6 December 2022 (UTC)[reply]
I was referring to the mentions in the discussion implying that the English Wikipedia is more important than the Foundation. Donald Albury 14:22, 6 December 2022 (UTC)[reply]

Restricting translations

Should we make it a requirement that translations from foreign Wikipedia articles must be done by people competent in the source language? I'm starting to see more and more machine translations, often with non-trivial factual errors. Alternatively or additionally, should there be a deletion criterion for machine-translated articles? WP:MACHINE discourages their creation, but I don't know whether it's a well-accepted argument for deletion (besides via WP:PROD). Ovinus (talk) 20:27, 19 October 2022 (UTC)[reply]

It's always been a requirement. —Cryptic 21:01, 19 October 2022 (UTC)[reply]
I understand that machine translation has improved significantly in the 19 years since Brion VIBBER wrote that. I suspect that modern advice would say not to use machine translation without manually checking the translation and improving upon it.
I see two problems with declaring that translations into English must be done by "people competent in the source language".
  • There's no way for either editors or software to determine who is "competent" until they've published their translation.
  • The level of linguistic competence needed depends on what you're translating. You need less linguistic competence if you translate a short, simple article in an area you thoroughly understand. You need more linguistic competence for long, complex articles about a subject you don't understand.
Most of us are probably competent to "translate" most substubs that basically say "<Subject> is a <nationality> <profession>." I could probably use machine translation to produce an accurate translation of some basic articles about medical subjects from multiple European languages into English. (Machine translation is stronger for European languages than for others, plus I know enough Spanish and German to make a guess at what's written in related languages. Here's an example of me 'translating' from Swedish – nobody's ever complained about it.) Few of us, though, have the skills to translate complex articles on unfamiliar subjects in a completely unfamiliar script. I think you have to know your own limitations, and use your own judgment about what's within your power and what's beyond your abilities. WhatamIdoing (talk) 21:29, 20 October 2022 (UTC)[reply]
In recent months I'm seeing a number of translations, from Spanish and Portuguese, of long and important articles on art history that might or might not be machine translations (probably they are) but are certainly not checked by anyone competent in English, as the original title of the latest I've seen is enough to show: Greek Classicism sculpture (now moved). Oh God, I see the same editor User:Racnela21, who says they are a paid editor, has now followed up with Ancient Rome Painting (73k bytes, which somebody turned into a redirect) and Rococo Painting (61k bytes), both on the same day, so they must be machine translations. Needless to say these are done without checking to see if they duplicate existing articles, which they normally do. Johnbod (talk) 22:10, 20 October 2022 (UTC)[reply]
I think that Ovinus is talking about people who speak English well, but don't read the source language (e.g., Spanish) well enough to know that "Ella no tiene pelos en la lengua" is a warning that she tends to be blunt, rather than a statement that she's recovered from Hairy tongue. WhatamIdoing (talk) 00:10, 21 October 2022 (UTC)[reply]
Actually I think it's the machine, who doesn't speak either language well. Do you really think that people doing machine translations bother to check their work? I don't. Johnbod (talk) 22:47, 22 October 2022 (UTC)[reply]
@Johnbod, when I use machine translation into any language that I can read, I always check the result. Why wouldn't I expect another editor to do the same thing, for the same reasons? WhatamIdoing (talk) 06:51, 8 November 2022 (UTC)[reply]
I 100% agree. The question is, what should we do about editors who don't seem to have that judgment, and despite being asked to stop, continue to badly translate articles? Ovinus (talk) 22:54, 20 October 2022 (UTC)[reply]
I think we treat it as a behavioral problem, probably with an eye towards a WP:TBAN or even a block.
What we don't want to end up with is incentivizing licensing/attribution problems. It's better to have people using the Wikipedia:Content translation tool than to have them "secretly" translating articles and "forgetting" to say that's what they are doing. Note, for the record, that "using the tool whose edit summary automatically complies with the license" is not the same thing as "enabling the machine translation option inside that tool". I'm only saying that, since we're going to have people translating articles whether we like it or not, it would be better to have them automatically comply with license requirements, and with the interlanguage links in place. Right now, we limit this to editors who have made more edits than 99.75% of registered accounts here (or who know the multiple workarounds), and that risks license/copyright violations rather pointlessly. I'd rather limit it to people who have made more edits than 95% of our accounts. WhatamIdoing (talk) 00:03, 21 October 2022 (UTC)[reply]
  • Every page on Wikipedia carries a disclaimer: "Please be advised that nothing found here has necessarily been reviewed by people with the expertise required to provide you with complete, accurate or reliable information." To require expertise would be a fundamental change, requiring presentation of qualifications, passing a test or other demonstration of competence. For an example of a project that works in this way, see Scholarpedia. The last time I checked, they were producing about 1 article per year. Andrew🐉(talk) 22:16, 20 October 2022 (UTC)[reply]
    • This seems to be an extreme view. To be clear, I didn't mean that "tests" would be administered, just that the editor made a good-faith assertion. Also there is some precedent here, like with ContentTranslation (WP:X2). Ovinus (talk) 22:54, 20 October 2022 (UTC)[reply]
There are several problems with incompetent translations. One is misunderstandings like what WAID is pointing out, another is lack of awareness of the cultural context. What I find worst is when translators do not check the sources but just assume the other language Wikipedia has perfect source to text integrity, no close paraphrasing and that all citations there are correct. This is unlikely to be the case, and we should hold people to account for the sourcing they pretend to use. —Kusma (talk) 05:56, 21 October 2022 (UTC)[reply]
Comment. I think a bad translation, as long as it doesn't violate policies, is better than no translation at all when one is needed. Even professional translators disagree on the wording of translated texts and are not immune to mistakes, though certainly to a lesser degree than non-translators.
Besides, there are not that many translators available. Out of tens of thousands of experienced editors in English Wikipedia, there is only a handful of active Spanish translators at Wikipedia:Translators available. Therefore, on the topic of this thread, I err on the side of the free flow of information. Thinker78 (talk) 17:29, 22 October 2022 (UTC)[reply]
I would agree, except that many of these bad translations have genuine factual errors, or are incredibly misleading. Ovinus (talk) 20:06, 22 October 2022 (UTC)[reply]
I disagree, as we seem to have more translators willing to fill red links than willing to fix poor translations; see for example the six-year backlog at WP:PNTCU. Fixing poor translations is hard work, almost as much as creating a new article, and not half as rewarding. Also, machine translation software is getting better and better and is available as a browser plugin or feature, so sending people to foreign Wikipedias via {{ill}} and letting them read a live translation of a recently updated article using the most up-to-date software can be better than giving them a poorly made, out-of-date translation of the article of ten years ago using the software of ten years ago. I personally stopped translating years ago, and try to write a new article based on the same sources (plus anything in English I can find) instead. Working directly from the sources (which often aren't available for machine translations) reduces errors introduced by multiple rounds of paraphrasing and translation. —Kusma (talk) 21:49, 22 October 2022 (UTC)[reply]
There is a reason why there are classes of articles, like stub, start, etc. Those articles can be in bad shape, but they can be improved. Also, translations—be it words, paragraphs or articles— are also subject to regular policies and guidelines. There are cases where information can be removed due to not complying with said policies. Thinker78 (talk) 00:29, 23 October 2022 (UTC)[reply]
  • Following up the ones I mentioned above, it is now clear that there is a large paid for campaign to add articles machine translated, mostly from Spanish (some Portuguese), to Wikipedia. The Open Knowledge Association (OKA) is funding this; these are their instructions, and they are recording progress here. They've done about 70 and want to do another 180 odd. I think we should urge them to stop. The quality is abysmally low, and many of them duplicate existing articles. We don't want this. The "freelancers" names are given - all seem Spanish. Johnbod (talk) 22:47, 22 October 2022 (UTC)[reply]
    There are many articles with abysmally low quality and we don't urge all new editors to stop just because of it. We let them know policies and guidelines and guide them on how to do it better. Besides, in the link you provided they have a provision that urges to follow Wikipedia's policies. So I don't see a problem with it. Instead, they seem to want to help the project. Thinker78 (talk) 00:35, 23 October 2022 (UTC)[reply]
    Of course they want to help the project, as well as earn money. But so do most of the editors who get reverted every day. Johnbod (talk) 03:35, 23 October 2022 (UTC)[reply]
    Most paid editors are here to promote the organisation that pays them. To be fair, OKA seems to have a more noble goal and has done some good work but also some well-intentioned harm. On the plus side, we're getting usable articles on missing topics. Some of those articles were obviously not written by native English speakers, but we tolerate that from other editors and such work gets cleaned up. On the minus side, other work duplicates existing articles, and the effort would be better spent elsewhere. I hope the compilers of to-do lists check the inter-language links (hidden behind a dropdown top right on ptwp) for an English equivalent first, but that's not always enough: for example, Animal husbandry in Brazil wasn't listed as related to pt:Pecuária no Brasil until an editor cleaned up Wikidata after replacing the new translation by a redirect. One suggestion might be to search enwp more thoroughly for the proposed title and see if an article on the topic comes up; if so then all that need be created is a redirect from the proposed title and an interwiki link in Wikidata. Certes (talk) 14:51, 27 October 2022 (UTC)[reply]
    @Johnbod I checked a couple of the articles created by them and I won't say at all their "quality is abysmally low". Although some articles get turned down, or are duplicates, I found a few of them actually of very good quality. Check for example article 1, article 2, article 3. This latter one got even a wow and a barnstar from an administrator. Thinker78 (talk) 01:04, 23 October 2022 (UTC)[reply]
    That was from User:CaptainEek, not known for article work, who evidently didn't know it was a machine translation. But that one is better than the others, & at first glance seems to fill a gap, which some of the obscure Hispanic history ones may also do. It had clearly been looked through and adjusted. I can see considerable problems with the first two though, though neither is in an area where I know the terminology. The big art history ones by the prolific User:Racnela21 are half in fluent gobbledegook, as shown by the original titles Greek Classicism sculpture and Ancient Rome Painting. Both of these duplicate well-developed existing articles (Ancient Greek sculpture and Roman art, with their subsidiary articles). Someone else has redirected the Roman one as a CFORK, btw, which should probably be done to others. Johnbod (talk) 03:32, 23 October 2022 (UTC)[reply]
    "The big art history ones by the prolific User:Racnela21 are half in fluent gobbledegook, as shown by the original titles Greek Classicism sculpture and Ancient Rome Painting." I read the lead of the first one and found no evident problems, much less gobbledegook. Actually I think the quality is above that of the average Wikipedia article.
    No idea why you are making such drastic criticism, that in my view doesn't nearly reflect the relevant work. If you read a problematic sample, if the work was as bad as you say, then said sample would reflect the condition of the whole work in general, and not be an isolated situation. Thinker78 (talk) 00:29, 24 October 2022 (UTC)[reply]
    Well firstly the lead (but not the rest) has been considerably cleaned up by other people, but I think if you think it is ok you probably don't know much about Ancient Greek sculpture (like what "Classical" means in this context). Are you really happy with "... With it, a form of representation of the human body was inaugurated that was one of the fulcrums for the birth of a new philosophical branch, aesthetics, and was the stylistic foundation of later revivalist movements of enormous importance, such as the Renaissance and Neoclassicism, and remains influential to this day. Thus, its impact on Western culture cannot be emphasized enough, and it is a central reference for the study of Western art history. But apart from its historical value, its intrinsic artistic quality has rarely been questioned, the vast majority of ancient and modern critics praise it vehemently, and the museums that preserve it are visited by millions of people every year"? Mind you, I think you are a native Spanish speaker, so perhaps are used to the airy waffle of the style here, which is unfortunately very prevalent in art history in the Romance languages. Johnbod (talk) 04:46, 24 October 2022 (UTC)[reply]
    Yeah, I am a native Spanish speaker. In contrast to the airy waffle, the Germanic languages—like English—in Guatemala have a reputation of excessive dryness. Lol.
    I understand your point that the language used in Wikipedia English should be objective. But, I think this may be out of the purview of this thread (unsanctioned translations). Do you consider the quoted text translated improperly? Thinker78 (talk) 05:15, 24 October 2022 (UTC)[reply]
    Well, we English-speakers like it that way, and it does mean you can, you know, convey information efficiently, like an encyclopaedia is supposed to do. I haven't checked either the original the machine produced, or the Hispanic original. It's the final product presented in English I'm concerned with. Johnbod (talk) 05:20, 24 October 2022 (UTC)[reply]
    Hi @Johnbod, I am the founder of OKA. I would like to provide a few clarifications here, but before that, I must say that I find some of the statements a bit demeaning to the work of our translators. They indeed receive a stipend from us to cover their costs of living, which is why they state that they are paid, however this is a very meager compensation compared to the hours they put into their work. In that sense, they are more like volunteers, as they could have taken much better paying opportunities, but decided instead to work on Wikipedia. Additionally, OKA is not a for-profit organization financed by companies, but a non-profit (officially recognized as tax exempt in Switzerland), which I have so far funded entirely out of my own pocket. It is ok to have different opinions and to discuss them, but making extreme statements such as "abysmally low quality" and "fluent gobbledegook" doesn't help having a constructive discussion.
    As you can see in our process and our website, our translators are expected to manually review every sentence. So the articles are not pure machine translation. As this is a lot of text, and because none of them are native speakers, it may be that some paragraphs or articles are of lower quality than others, but from my experience, this is more an exception rather than the norm. Additionally, in many cases, the translation work is not the root cause, but rather the fact that the original article was written in a way that may sound less natural in English (as Thinker78 pointed out).
    Whenever a quality issue is flagged on an article, our translators are supposed to work on it to fix it, so their work doesn't stop once the article is published. If it is a more fundamental problem with the source article and the community decides that an article should be deleted, then we take note of that decision and try to update our processes to ensure this doesn't happen again in the future.
    As far as I am aware, out of the 70+ articles we have translated so far (most of them very long), almost none was taken down. Most of these articles could of course be improved, but I am of the view that it is better to have an article in English with small things to improve here and there than no article at all. I know that this is a contentious topic as not all Wikipedians share the same view, but we are trying our best to abide by all the policies of the English Wikipedia community on that front.
    In general, we are very open to the feedback of the community on how we can strengthen our impact. We are happy to involve other Wikipedians in the design of our process, or in the process of reviewing or translating articles altogether. What we are creating is a taskforce of full-time translators dedicated to editing Wikipedia; at the moment, they are focused on creating content, but I am also happy to adjust our processes so that a share of our translators work on quality improvement work and other review process if the community feels it can help balance the situation and add value. If this is of interest, you can reach out to [email protected] 7804j (talk) 18:40, 24 October 2022 (UTC)[reply]
    Taking your comments in reverse order, afaik you made no effort to inform the wp community this was coming, or ask for advice on how best to do it, but just started uploading. There are all sorts of issues with your instructions and the articles you have selected. A number of your articles have been "taken down" and I expect more will be in the future, even if only draftified. It is perfectly clear that some of your "translators" don't speak really English at all, or we wouldn't get titles like Greek Classicism sculpture, Ancient Regime of Spain, Brazilian Romanticism Painting and Ancient Rome Painting. And these are just the titles! The quality of the articles I've looked at is variable, with some ok, but others terrible. It is clear that not enough work is done to see we don't already have an article on the subject; there have been several where we did, in fact most of the general (non-Iberian) ones. I imagine your translator's English was too poor to find them. Your instructions tell your people to avoid subjects that are too Spanish or Portuguese: "Some articles may have high interest in Spanish but low interest in English, because they concern topics where the readers are most likely already Spanish-speaking (e.g., articles about local celebrities). It is usually better to prioritize articles that are universal, i.e. not language or region-specific". This is very bad advice, and exactly the wrong way round - you should prioritize actual gaps on Iberian topics, and avoid universal topics completely. The Spanish and Portuguese wps do not have a very high reputation, and on universal or wide topics like art history understandably place a great deal of emphasis on contributions from the Iberian countries - far too much for what an Anglophone audience is interested in. The art history ones have clearly not been read over by anyone with the slightest familiarity with the correct vocabulary in English. 
When the subject is something like Termination of employment in Argentina, well, who cares really (actually the English seems ok). You say your translators are "expected" and "supposed" to do various things, but it's pretty clear to me they don't. I could go on (and on) but.... Thanks for responding anyway. Johnbod (talk) 04:46, 25 October 2022 (UTC)[reply]
    Dear @Johnbod. You are a textbook case of someone who sees the glass half empty instead of half full. Never mind that 7804j contributes their own money to help the project. You don't care about that, why?
    I think it is a great initiative that very few people do without a commercial interest. And it highlights why there shouldn't be too much restriction about translation qualifications. If anyone can be an editor in Wikipedia, I don't see why only very few people could be translators in Wikipedia.
    I also have to point out that you have experience creating articles. But you just threw at us a wall of text. What happened to paragraphs?

    Shorter sentences and paragraphs make your content easier to skim and less intimidating. Paragraphs should top out around 3 to 8 sentences. Ideal sentence length is around 15 to 20 words.

    — Harvard Library, "Writing Guide"
    Impressive record otherwise, kudos! Thinker78 (talk) 03:10, 26 October 2022 (UTC)[reply]
    This glass is more than half-empty, and I'm afraid 7804j has wasted a high proportion of her money, on work that won't survive, as WP:CFORKs etc. As I expect you know, the community is suspicious, after many years of bad experiences, of paid-for editing initiatives. Many of these are completely well-meaning, but we judge by the results, not the intentions. If 7804j had come to the community & explained her intended initiative, she could have received a lot of advice on how to avoid problems. Instead she seems to have started the initiative in April or before, but only at the end of August mentioned anything about it on her user page and only yesterday, over 6 months in, did she inform the community on a public page. Who suggested that "only very few people could be translators in Wikipedia"? Nobody. But minimum qualifications should be some ability to speak English, and a willingness to check through what the machine throws at you. It's odd that you should complain about a wall of text, as paragraphs many times that length are very characteristic of some of the OKA translations. Perhaps you haven't looked at many. Johnbod (talk) 12:04, 26 October 2022 (UTC)[reply]
    @Johnbod So far, the community didn't agree with your assessment of whether the pages deserve a separate entry. There were a few of our pages that were nominated for potential merge, but where the community thought they would be better to remain as a separate page. Maybe some of the pages will be removed or merged along the way, and I think that's ok -- we will learn from this as well and adjust our ways of working as a result.
    Also, as I stated in my other post, the reason why I waited before bringing it up at the Village pump is that I first wanted to set up everything and run an MVP to see if the model works. I wasn't sure that I would be able to recruit translators and get the non-profit recognition from the government, nor that it would be possible to train them on editing Wikipedia with reasonable effort, so I didn't want to bother the community with hypotheticals. I prefer to act than to talk. Now that the concept has been tested and that I have had time to improve the processes, I am inviting the community to share its inputs.
    Our translators do not define the lengths of the paragraphs they write. It is defined by the source they work on. So far, we have only translated articles that were considered "Featured" or "Good articles" in the source language.
    (by the way, I don't know what makes you feel that I am a "she". If you spent so much time digging in my profile, you should be able to find out the right pronoun)
    7804j (talk) 12:58, 26 October 2022 (UTC)[reply]
    Ah, apologies for that. Obviously, I haven't spent any time at all 'digging in your profile', or I wouldn't have got that wrong! To be honest I had confused your operation with this lot, where the top brass are all female. Can you link to the "few of our pages that were nominated for potential merge, but where the community thought they would be better to remain as a separate page"? I certainly haven't looked at all the articles. But you should realize that when pages appear with (often) no categories, wikiproject ratings & so on, nobody sees them except a new page patroller, whose tick in no way constitutes acceptance by the community - they are highly unlikely to spot a content fork. I had noticed a number of strange art history forks appearing for some time, but hadn't realized there was a project until just recently, when your translators started adding proper declarations on their user pages. It was unhelpful (and against the rules) of you not to have declared your conflict of interest, or any involvement at all, when we were in a discussion back in May at Draft talk:Early modern art. The fact is that mostly, the community just hasn't considered "whether the pages deserve a separate entry" at all, and the process may not be quick. Some cases are not 100% content fork, but only mostly, and may best be broken up for spare parts, with some used elsewhere, as I suggest at Talk:History_of_engraving.
    Now that I have the right website, I think it is a great pity your project hasn't been following your own declared aims to assist with "...topics where volunteers are missing. For example, articles in topics such as Science, technology, engineering, and Finance are lacking compared to topics such as History, Geography, and Humanities." And yet the great majority of your articles are on "topics such as History, Geography, and Humanities." Johnbod (talk) 18:10, 26 October 2022 (UTC)[reply]
    @Johnbod I am not sure about the objectivity or accuracy of your criticism. You would have to compare the frequency of declined drafts or of deleted articles created by random editors vs 7804j's project.

    Meanwhile, I lean to believe that said project illustrates the need to avoid instruction creep in regulating translations. Because the evidence I have read seems to point at least to good quality work.

    Thinker78 (talk) 01:58, 27 October 2022 (UTC)[reply]
    If you think I'm rude, try Wikipediocracy. Johnbod (talk) 03:16, 27 October 2022 (UTC)[reply]
    I think human translators are awesome!!!!
    The machine translations are not. A link to one of the featured articles the other day was machine translated. I spent four hours and still couldn't make sense of it. (Major issues were loss of nuance, literal translation of place names, non-existent links, and word order.)
    Can someone explain how changes are synched? One wiki has translated 6 million articles.
    Also are 350 different versions sustainable? Wakelamp d[@-@]b (talk) 13:37, 25 October 2022 (UTC)[reply]
    They're mostly not translated. The Cebuano Wikipedia likes to run article creation bots. These don't translate articles. They construct them in a kind of mail merge system – every article says the same things, but you fill in the blanks. See Lsjbot and Rambot's contributions if you want an idea of how it works. WhatamIdoing (talk) 06:56, 8 November 2022 (UTC)[reply]
  • I have only skim-read the discussion above, but need to say something about translation. Proficiency is needed in the source language when translating, but even more proficiency is needed in the target language. For example, I am a native speaker of English but am pretty fluent in Polish (I've spoken it most days for over 40 years). I would be perfectly happy translating from Polish to English, but would not presume to be good at translating from English to Polish. I'm sure that my writing would make it pretty obvious that I am not a native speaker. I'm pretty sure that the translators are proficient in Spanish or Portuguese, but is their knowledge of English good enough for them to be writing articles here? Phil Bridger (talk) 18:42, 26 October 2022 (UTC)[reply]
    I have confidently translated a few articles into my native English. I wouldn't have made a good job of translating anything from English. Knowing the source language is useful, but fluency in the target language is critical. The alternatif be half-wrote mess for other someone to leave. Certes (talk) 14:37, 27 October 2022 (UTC)[reply]
Question: Any example of a poor machine translation job in Wikipedia in the last 2 years? Thinker78 (talk) 21:07, 27 October 2022 (UTC)[reply]
WP:PNTCU.—S Marshall T/C 22:22, 27 October 2022 (UTC)[reply]
There must be at least 50 in the OKA spreadsheet here, some mentioned above, including a number of downright ungrammatical titles. Johnbod (talk) 00:26, 28 October 2022 (UTC)[reply]
@S Marshall: I checked WP:PNTCU, but that was not a specific example of a bad machine translation job, just a collection of articles that for one reason or another need various degrees of cleanup.
@Johnbod: I checked the work of another one of the editors of the OKA project. Although I had already checked a few pages without finding evidence that they deserved such a poor rating as you provided, I checked once more. I found evidence backing my opinion doubting the objectivity or accuracy of your criticism of said articles.
You made a statement to User:Racnela21, "It's fairly clear (from the titles alone) that your English isn't good enough to do any checking of these". @Andrew Davidson: replied, "I just started reviewing Kassite dynasty. I've not noticed any problems with the English [...]".
I also read the lead of said article and I concur with Andrew Davidson not noticing any problems with the use of English.
You may do very good editing work, but translation work, assessing statistically others' work or objective analytical criticism may not be your fields. Thinker78 (talk) 18:08, 28 October 2022 (UTC)[reply]
Thinker78, you say you can't see problems with the articles I linked and I assume that you genuinely can't. But most other editors can.—S Marshall T/C 23:39, 28 October 2022 (UTC)[reply]

As noted by S Marshall and Johnbod, we already have Wikipedia:Pages needing translation into English#Translated pages that could still use some cleanup as the coordinating list for those editors able and willing to undertake the Sisyphean task of fixing bad translations. (Note the backlog extending back to 2016, and click on almost anything listed there, or in the associated Category:Wikipedia articles needing cleanup after translation, to see how useless to readers a machine translation can be.) It is indeed our policy that machine-translated text is worse than no article, or no expansion. By their nature, all machine translations not only produce errors (omitting negatives, misidentifying antecedents, mistranslating by choosing wrongly among alternatives), they tend to produce plausible-looking mistranslations of some passages, because they are ultimately based on text search. The WMF's translation tool, mentioned positively above, flooded en.wikipedia with very bad translations and was disabled for use here after community outcry; I understand it can now be used, but only by extended-confirmed editors, enforced by an edit filter. The remnants of the clean-up list from editors being encouraged to use it to add articles to en.wikipedia can be found starting here (linked in its original location at the top of the Pages needing translation section).

Unfortunately, as Phil Bridger notes, there's not one problem, but two: to either translate, check a machine translation, or clean up another editor's translation, both knowledge of the original language (including idioms) and proficiency in the target language (English) are necessary. English Wikipedia attracts a lot of well-intentioned editors who write poor English from scratch; I suspect we have this problem more than do other-language Wikipedias. But while we editors may have become inured to a certain level of ESLese, especially in some topic areas, it's not reasonable to expect readers to wade through incomprehensible prose. We regularly, sadly, block editors for insufficient competence in English, as well as other things. In fact I would say that insufficient competency in English is more of a problem when translating for en.wikipedia than is insufficient competency in the source language. Machine translating programs help in understanding the original; a surprising amount can be understood and fixed in the translation by following the wikilinks in the original and looking at the English interwikis (something I'm surprised that most creators of poor translations evidently haven't thought to do; these are linked databases); there are sometimes even references in English (and we recommend searching for and adding English-language references anyway, to help satisfy WP:V for that vast majority of potential readers who won't be able to read references in the original language)—but it requires competency in English to render the translation in clear English and to spot false friends and other nonsense in machine output.

This was and is the problem with the WMF promoting their machine translations, and this, from the editor whose work I've seen and the editor Johnbod ran into, is the problem with the Open Knowledge Association project; the translators don't even have sufficient competency in English to realize that livestock farming in Brazil can also be called "animal husbandry in Brazil", and are creating content forks. In my opinion, that project, despite its good intentions, is not only a problem for English Wikipedia but should not be fund-raising until it puts in place adequate managerial oversight, including testing the ability of its proposed translators to write professional-level English. It's an axiom in translation that one should only translate into one's native language, and this kind of poor work demonstrates why.

I can't propose any feel-good solution; in my opinion (as a translator and as someone familiar with how thinly stretched our qualified translation checkers are here on en.wikipedia) the OKA project should have been brought to one of the administrators' noticeboards, not flagged at an off-wiki criticism site and mentioned here and elsewhere on the Village Pump with pride. The road to Hell is paved with good intentions, but this project needs quality control implementing immediately at the source or deprecating as harmful to en.WP out of all proportion to the few useful articles it may be adding. (We have processes for requesting translations; see Wikipedia:Translation#Translation from another language to English.) Yngvadottir (talk) 02:23, 28 October 2022 (UTC)[reply]

the problem with the WMF promoting their machine translations – What are you talking about? WhatamIdoing (talk) 06:57, 8 November 2022 (UTC)[reply]

Previous community decisions on this

  • There's a relevant community decision from 2016. Here is the whole vast, sprawling discussion, if you'd like to read it all in context, but the short version is that the community decided to restrict the use of automatic translation tools to extended-confirmed users. This is implemented using an edit filter (Special:AbuseFilter/782). The community has also authorised speedy deletion of edits made using automated translation tools prior to that consensus, which was at WP:CSD X2. I personally deprecated X2 after the community decided that automated translations can also be speedily draftified.
A scant few months after the community authorised speedy draftification of automated translations, the community then passed another rule that articles in draft space can be deleted after six months. Nowadays speedy draftification has become highly entangled with New Pages Patrol, and there seems to be a rule that only recently-created pages can be speedily draftified. This conflicts with the automated translation decision.
If large-scale automated translation is taking place once again, then we need to revisit these old discussions.—S Marshall T/C 15:42, 27 October 2022 (UTC)[reply]
  • This highlights something I have been thinking a lot about recently… I think we may be using Draftspace to solve too many things. I think it could be divided into two parts…
  1. A “Triage” space for new articles… focused mostly on basic sourcing and establishing notability… this would continue to be managed by the NPP and have all the current rules and time limits.
  2. A new “fix it” space for other types of problematic articles. This would not be under the remit of NPP, and would have a much more generous (or perhaps no) time limit attached. The article would simply be removed from Mainspace until fixed.
Poorly done machine translations of otherwise acceptable topics could go into this new “fix it” space until someone who knows the original language can review it and make appropriate corrections. Blueboar (talk) 16:11, 27 October 2022 (UTC)[reply]
  • I think we already have a fix-it space in the form of userspace. It's only sensible to userfy poor articles when we can identify the person who will fix them, but I think that's a feature and not a bug -- when we can't identify a fixer, the disputed content does need to go in the compost heap. With translations in fix-it space, I think the foreseeable problem is that people whose first language is English are notoriously poor at foreign languages and our translator numbers are extremely low relative to other-language wikipedias. WP:PNT has more than a decade of backlog and it's getting worse, and the reason is because when you do have the appropriate dual fluency, it's always so much easier, quicker, and more fun to do your own translation from scratch than to fix someone else's. So such content will tend to linger in fix-it space until the mainspace article is written by someone else.—S Marshall T/C 22:05, 27 October 2022 (UTC)[reply]

Making machine translation available again

I propose we add machine translation back into the WMF's translation tool for English Wikipedia. It's currently removed for all users. This would be a privilege extended only to extended-confirmed users, and, importantly, we should withdraw it on an individual basis for those who consistently produce mediocre machine translations.

  • Competence should be assumed. Why remove the tool from all users, when they may be perfectly capable of using it intelligently?
  • Removing machine translation from the Translate tool doesn't prevent people from doing machine translations; it's a complete waste of time for competent editors to start from scratch or waste time copy-pasting, when they could spend that time improving the translation and checking the sources, etc.
  • Some machine translation tools are very competent (DeepL is far better than Google Translate), so most problems could be better addressed by being more careful about which machine translation services we build in. For example, I used DeepL to create fr:Alliance militaire, IMO a pretty decent translation (my first).
  • It would significantly help address WP:SYSTEMICBIAS, a priority for Wikipedia.
  • WP:MACHINE is incorrect when it claims that machine translations are easily accessible; many browsers do not have built-in translation, especially on mobile (or obviously the Wikipedia app). It also only forbids "unedited" machine translations, and cannot be used to support removing machine translations from the Translate tool altogether.
  • I support Blueboar's proposals above; but even the current mechanisms (drafts, AfC, new page review) would be sufficient here.
  • Bad previous translations are far better addressed by simply re-translating them today. Even an unedited machine translation with today's improved algorithms would be miles better than the crap produced by Google Translate back then. Returning machine translation to the translate tool would make this easier, and would help clear the backlog.

The idea that "people are more interested in creating translations than fixing them" may be true, but it's true for all articles. Our vital articles are very neglected, that's not an argument for anything. DFlhb (talk) 07:07, 4 November 2022 (UTC)[reply]

  • Oppose, just in case that wasn't obvious from the long discussions and positions I linked above.—S Marshall T/C 10:35, 4 November 2022 (UTC)[reply]
How would it "significantly help address WP:SYSTEMICBIAS, a priority for Wikipedia."? Wakelamp d[@-@]b (talk) 12:16, 8 November 2022 (UTC)[reply]
Oppose anything that makes it easier to add articles without verifying the sources. Wikipedias are not reliable sources, and blindly translating them is dangerous. —Kusma (talk) 12:27, 8 November 2022 (UTC)[reply]
  • Oppose, other Wikipedias are not reliable sources. CMD (talk) 13:52, 8 November 2022 (UTC)[reply]
  • Comment there is no point in allowing people to create machine translations, because if you want to read a machine translation, all you need to do is click on the little symbol in the English WP's article, or on the inter-language link if you've found one, and your browser will do a machine translation for you. This will be an up-to-date translation containing the latest best version of the target article, translated to the latest standards of technology, while a machine translation created in English WP is neither. Elemimele (talk) 16:23, 21 November 2022 (UTC)[reply]
  • Oppose Correctly translating Wikipedia articles, along with all the required referencing, is no easy task. Wikipedia does not need any more poorly translated and broken articles, which is the only kind that machine translation is currently capable of creating.
-- LCU ActivelyDisinterested transmissions °co-ords° 17:08, 22 November 2022 (UTC)[reply]

A current example from DYK

I first heard about OKA just today when reviewing Template:Did you know nominations/Gothic sculpture. Oddly enough, the submission had initially been approved, then the machine translation issue came up and folks ran in the other direction. -- RoySmith (talk) 16:15, 14 November 2022 (UTC)[reply]

Ideas sought to arrive at a definition of "article creation at scale" (aka mass creation)

A recently closed RfC found consensus to create a definition of "article creation at scale" (sometimes called mass creation). We are still seeking an agreeable definition, and are inviting input here before taking next steps. I'm including below some of the input that has already been provided, however I'll collapse it in case anyone wants to comment unprimed. –xenotalk 23:48, 5 November 2022 (UTC)[reply]

Ideas from RfC
The following discussion has been closed. Please do not modify it.


Discussion (arriving at a definition of "article creation at scale")

  • If someone wants to create a set of articles, they should be able to look at a given policy/guideline and determine the appropriate approach. They should not be expected to divine which of many interpretations [of a combination of almost-applicable policies] people will apply to them in the pursuit of "case by case". It's wild to me that so many people are arguing that more ambiguity is what's needed to avoid the dreaded wikilawyering. We also need to separate the definition of mass creation/article creation at scale from the processes, venues, and penalties associated with mass creation. We haven't even concluded whether (a) rate, (b) quality, (c) sourcing, and/or which combination thereof is what this definition should address. Maybe simply asking that is a good step, and then fleshing out whichever one or more apply? — Rhododendrites talk \\ 02:36, 6 November 2022 (UTC)[reply]
    Rhododendrites, if the answer to your question isn't primarily or exclusively a rate, then "mass creation" needs a new name. It does not make any sense to call the creation of one lousy stub a day "mass creation", and it makes a lot of sense to call a hundred FAC-quality articles uploaded within an hour "mass creation". WhatamIdoing (talk) 03:32, 6 November 2022 (UTC)[reply]
    I would just like to be sure that it is made clear that this applies to creation of new pages in article space. Mass creation in draft space should be of no concern, unless it is followed by mass movement of unimproved drafts to article space (the latter of which should be treated as problematic). I would also propose that this should not apply to creation of disambiguation pages (though I do not anticipate their creation in these numbers), as these pages do not require sources and are easy to check. BD2412 T 04:05, 6 November 2022 (UTC)[reply]
    If the definition rests in any part on sources then redirects need to be excluded too. Mass creation of these can sometimes cause issues, but those issues are not related to sourcing or depth of coverage in the way articles are. Thryduulf (talk) 12:44, 6 November 2022 (UTC)[reply]
    @Thryduulf: I would also agree with this. Articles need to be defined as articles for this purpose. BD2412 T 02:33, 8 November 2022 (UTC)[reply]
  • I'm definitely repeating myself at this point, but: numerical thresholds are a terrible idea, IMO; they are gamed too easily. My view remains that mass creation occurs when a group of articles is created without the notability of each topic being separately evaluated, and instead the entire group being deemed notable. This isn't always an issue, but it's at the heart of our nasty AfD debates (athletes, villages, and roads are what come to mind). Vanamonde (Talk) 04:10, 6 November 2022 (UTC)[reply]
I'm not sure I agree. The WP:3RR works pretty well, and it's usually pretty obvious when someone is gaming it over an extended period of time. With mass article creations it would be even more clear, since the ultimate purpose of "gaming" it would be to create a bunch of articles without review and there wouldn't be any way to do that without constantly pushing the limits to or near the max again and again over an extended period of time. I think the ideal solution is to say "X articles per day is a hard limit beyond which mass-article creation policies always apply without exception, but is not a guarantee and if you continuously approach this number again and again then that's going to be considered mass creation as well and you're expected to adhere to the relevant policies." You talk about how mass creation occurs when a group of articles is created without the notability of each topic being separately evaluated, and instead the entire group being deemed notable, but the core problem is that people are "deeming" it so themselves or without the level of consensus required for such sweeping changes, then trying to force this through via WP:FAIT, which is completely inappropriate. Having a hard, indisputable "above this threshold you must follow these policies, without exception" coupled with "this is an upper limit, not a guarantee, and if you seem to be gaming this system you could face sanctions" would either force them to slow down due to hard sourcing requirements (or whatever system we decide on using this definition), or would provide a clear policy we could point to and a straightforward way to argue that they are gaming it so they can be sanctioned in order to force them (and anyone else who wants to do the same thing) to the table, as opposed to the current situation where they often rely on WP:FAIT and the fact that mass-article creations are very hard to reverse to try and force through their policy preferences without a clear consensus backing them.
--Aquillion (talk) 04:55, 6 November 2022 (UTC)[reply]
We do need some idea of the sort of order of magnitude we're talking about. Is this 10 articles over a year? I'd say no. Is it 10 articles over the entire time someone spends on Wikipedia? I'd say absolutely not. Is it tens of articles a day or scores of articles a month? Ah, that's more like it, perhaps. Is it hundreds of articles a year? Yeah, probably. Just leaving it with the current "definition" of no one objected to a figure of 25ish is meh. Blue Square Thing (talk) 08:39, 6 November 2022 (UTC)[reply]
@Vanamonde93, I think it would help me to understand which problem you're trying to solve.
IMO the main problem that MASSCREATE is trying to solve is swamping the New Page Patrollers without warning. Can you agree with me that if we woke up tomorrow to discover that a million articles had been added to Wikipedia overnight that the sheer volume would be a problem, even if the individual articles were themselves 100% high-quality and on 100% notable subjects? WhatamIdoing (talk) 16:36, 6 November 2022 (UTC)[reply]
Aquillon, 3RR works well, but it's a bright-line for edit-warring, which is the behavior being regulated. I don't have an issue with a similar bright-line for mass creation, but that can't be the definition, just as 3RR isn't the definition of edit-warring. The definition needs to be about repetitive bot-like creation of similar articles. @WhatamIdoing: I agree that NPP being swamped with a million notable articles at once would be a problem. However, I see no evidence that it's currently a problem, whereas this discussion began as the result of conflicts at AfD that are almost entirely about athletes, villages, and roads. Also: if you are checking each individual topic against WP:GNG, rather than deciding an entire group is notable, then you'd have to be superhuman to even produce 20 articles a day, let alone a million. Vanamonde (Talk) 16:48, 6 November 2022 (UTC)[reply]
20 stubs per day, checked against GNG, is not a superhuman task. You'd just need to work in a subject area that lends itself to the GNG. There are more than 20 red links in the List of ICD-9 codes and List of MeSH codes. Every WHO-recognized disease and approved treatment passes the GNG. Template:Reliable sources for medical articles will even link you straight to the sources that prove it.
For non-GNG subjects, it's even easier. The question about whether a fish species is a notable subject is basically answered by saying "Does it have a Valid name (zoology)? If yes, then it passes WP:NSPECIES." If you know where to look, it takes maybe five seconds to determine this. WhatamIdoing (talk) 17:06, 6 November 2022 (UTC)[reply]
NSPECIES is an essay, not an SNG. GEOLAND is a better example. BilledMammal (talk) 17:08, 6 November 2022 (UTC)[reply]
And mass-creation under GEOLAND is currently a problem: we are in agreement there. But why is it a problem if someone is able to create 20 sourced articles on WHO-recognized diseases? Or the codes you list, which I'm unfamiliar with, but which do not provide scope for hundreds of articles? I think we're talking past each other a little bit. My point is that rate of creation is a symptom of the problem, not the problem itself. The (potential) problem is repetitive creation of a group of articles, and the problem is when that group is one for which no consensus exists on notability, or when the articles are consistently of a poor quality. Also pinging Aquillion, whom I mentioned above but failed to ping correctly. Vanamonde (Talk) 18:13, 6 November 2022 (UTC)[reply]
IMO the point behind MASSCREATE is that, regardless of method (e.g., bot-like) or notability (e.g., obviously good, obviously bad, or not obvious) or article quality, editors who didn't create the article need a reasonable chance to check the article. We handle about 500–600 articles most days. Increasing that by 20 isn't really going to be noticeable. Increasing that by 200 will probably be a problem. Increasing that by 2,000 will definitely be a problem. The rate of creation itself is the problem for reviewers.
It sounds like your concern has nothing to do with the "mass" aspect of mass creation. It's more like "I don't want people creating lousy articles, and therefore I especially don't want them creating a lot of lousy articles." That's not really what MASSCREATE's supposed to address. WhatamIdoing (talk) 23:50, 6 November 2022 (UTC)[reply]
  • I think a simple definition of "Multiple articles created based on boilerplate text" would cover most problematic instances of mass creation and cannot be gamed.
    I also don't think there would be an issue with multiple definitions, as mass creation can take different forms, and would suggest this allows an RfC with multiple questions, each asking "Does X constitute mass creation?"; each question could involve multiple options if there are minor modifications to the same definition X (for example, a minor modification from my above proposal could be "At least ten articles created based on boilerplate text"). Every question that receives a consensus would then be added as a separate definition of mass creation. BilledMammal (talk) 05:20, 6 November 2022 (UTC)[reply]
    If we did the questions one or two at a time, possibly. But we know where too many questions ends up.
    Now, "multiple". So, more than 2 then? That's so open to interpretation that it becomes useless. Blue Square Thing (talk) 08:39, 6 November 2022 (UTC)[reply]
    It can work, particularly when the questions are closely related like they would be here.
    I'm not certain where the threshold should be. I can see an argument for two (the boilerplate itself is the bright line that needs approval), but I can also see the argument for slightly broader latitude. Can you explain why it's so open to interpretation that it becomes useless? BilledMammal (talk) 09:24, 6 November 2022 (UTC)[reply]
    Multiple articles per lifetime? No, thank you. @Blue Square Thing is absolutely correct that "multiple" will be interpreted as meaning two, but even if you set it at a more reasonable number, then you need to be talking about a rate, not an absolute number. WhatamIdoing (talk) 16:32, 6 November 2022 (UTC)[reply]
    Editors aren't going to accidentally be reusing boilerplate text, which is why I'm not convinced we need to set a number. I also don't believe a rate is appropriate; 100 boilerplate articles created over a year should get consensus just as 100 boilerplate articles created over a week should. BilledMammal (talk) 16:49, 6 November 2022 (UTC)[reply]
    How about 100 articles per 20 years? How about 10 articles in the same month, and never repeated? Is 10 articles ever "mass" creation? Even if it's not "mass" creation, is creating 10 similar articles something you think is worth adding a bureaucratic pre-approval process for? WhatamIdoing (talk) 16:55, 6 November 2022 (UTC)[reply]
    Based on a boilerplate text - this goes beyond just similar - but yes.
    "is creating 10 similar articles something you think is worth adding a bureaucratic pre-approval process for" – I believe that the pre-approval process will be similar in bureaucratic overhead to an AfD, which means I suspect we will reduce the overall bureaucratic overhead by implementing this. BilledMammal (talk) 17:03, 6 November 2022 (UTC)[reply]
    @BilledMammal, if someone wants to create 10 articles in a month, and those articles are expected to be similar – including if it's absolutely normal for those articles to be similar, because that's what happens if you follow Wikipedia:WikiProject Albums/Album article style advice – then you actually want people to get pre-approval for writing normal articles? WhatamIdoing (talk) 17:11, 6 November 2022 (UTC)[reply]
    I think you are misunderstanding my proposal. I'm not proposing that this applies to articles that are merely similar; I'm proposing it applies to articles that are based on boilerplate text. BilledMammal (talk) 17:18, 6 November 2022 (UTC)[reply]
    There's no real difference. The accepted start for a notable album appears to be this sentence: Album is the nth {studio|live} album by the <nationality> <genre> {band|singer} <name>, released on <date>, by <label>. followed by a track listing. Whether that is "merely similar" or "boilerplate" is in the eye of the beholder. WhatamIdoing (talk) 17:28, 6 November 2022 (UTC)[reply]
    That would be boilerplate text, proven by you being able to define the boilerplate text used to create the articles. However, I don't see it at WikiProject Albums style guide, and a review of the initial version of a dozen randomly selected albums doesn't appear to follow that text? But if you are right and it is the accepted start for a notable album and thus thousands or tens of thousands of articles are being created based on that boilerplate, then what is the issue with requiring consensus to be obtained for it, to ensure that the project as a whole approves of such actions rather than just WikiProject Albums? BilledMammal (talk) 17:38, 6 November 2022 (UTC)[reply]
    Historically, editors interested in a given topic area have worked out basic skeletons for new articles related to that area. We could mandate that such discussions should take place in a specific venue to facilitate someone watching for all of those discussions, and require that all previously established skeletons be reviewed. However it would create a central bottleneck and a long backlog of reviews which seems disproportionate to the benefits that would accrue. The problems with having many articles created rapidly have centred on a small number of editors, and not the vast many who have followed the same basic skeletons in different topic areas. isaacl (talk) 18:00, 6 November 2022 (UTC)[reply]
    Do you have some examples of these skeletons, and a rough estimate about how many exist? BilledMammal (talk) 18:03, 6 November 2022 (UTC)[reply]
    I know of examples for some sports. Nearly all articles, though, can be categorized with other similar articles. Editors will typically base the creation of a new article on existing ones of the same type. Mandating that creating a new article that mimics the skeleton of existing ones has to be centrally approved would affect virtually all of them. This would be possible to do, and would in essence be creating minimum stub standards for all topics. If we're going to pay that cost, though, personally I'd prefer to focus on the desired content to include, rather than a specific text layout. isaacl (talk) 18:19, 6 November 2022 (UTC)[reply]
    Can you link those examples? BilledMammal (talk) 18:37, 6 November 2022 (UTC)[reply]
    The football people certainly used to have them - very useful as well. There's also guides such as WP:UKCITIES. Blue Square Thing (talk) 19:32, 6 November 2022 (UTC)[reply]
    You're probably thinking of the guides like Wikipedia:WikiProject Football/Players. WhatamIdoing (talk) 21:28, 6 November 2022 (UTC)[reply]
    UKCITIES appears to be a style guide, not a boilerplate. The WikiProject football text could be a boilerplate, if editors are only using the introduction, but given the issues we have had with the creation of articles on football players I don't believe requiring editors wishing to create several or more sub-stubs based on that boilerplate to get consensus is a bad thing. BilledMammal (talk) 23:40, 6 November 2022 (UTC)[reply]
    re: open to interpretation - it doesn't provide any level of distinction between people creating a few articles about similar things (Danish cycle races, for example) and the creation of articles at a scale that becomes potentially problematic. It's just too open I'm afraid - 2 is probably multiple, 3 certainly is. Neither is problematic. Blue Square Thing (talk) 19:34, 6 November 2022 (UTC)[reply]
    This shouldn't impact editors if they are writing articles, rather than sub-stubs, even if they are on similar topics. BilledMammal (talk) 23:40, 6 November 2022 (UTC)[reply]
    BilledMammal, here are some examples:
    I believe, but have not checked, that all of these articles were written by different people at different times. They all begin with the fill-in-the-blank pattern of "Name is a high school in <place>" followed by a statement about the year the school opened.
    I don't think that editors could look at these 10 articles and agree whether these are "merely similar" or an undesirable "boilerplate text". WhatamIdoing (talk) 20:58, 6 November 2022 (UTC)[reply]
    These are good examples of why we absolutely need more than just "uses boilerplate" as a definition. I like the idea below of also saying that the articles reuse the same few sources (generally one). That would clarify that an article doesn't count as mass created if it uses multiple/unique reliable sources. Maybe we even restrict it to "exclusively uses boilerplate and a single common source". Steven Walling • talk 21:39, 6 November 2022 (UTC)[reply]
    Using the "same few sources" is not a problem though. As you seem to suggest, there could be a problem if they use the same one source though. So I agree with your last sentence. But I think we still need a numerical threshold, since 2 or 3 articles that use the same single source are probably not a problem. Rlendog (talk) 22:08, 6 November 2022 (UTC)[reply]
    None of those are boilerplate articles, either now or when they were created.
    If you disagree, try to define a boilerplate that would allow you to create those articles - you won't be able to. BilledMammal (talk) 23:40, 6 November 2022 (UTC)[reply]
    Based on discussion, I believe "A single editor, creating several articles based on boilerplate text and referenced to the same group of sources" is better than my initial proposal. A slight alternative, which I believe could be discussed in the same section, would be "A single editor, creating dozens of articles based on boilerplate text and referenced to the same group of sources". BilledMammal (talk) 23:40, 6 November 2022 (UTC)[reply]
    At this point, I think we can safely conclude that what you think constitutes boilerplate text and what every single person who's responded to you so far thinks is a boilerplate text are not the same thing. WhatamIdoing (talk) 23:52, 6 November 2022 (UTC)[reply]
    Can you define a boilerplate text that would allow you to create those articles? BilledMammal (talk) 23:56, 6 November 2022 (UTC)[reply]
    Articles that are published with only boilerplate text are distinct from articles that just begin with or contain a boilerplate. The examples you provide all fit in the latter, while the hundreds of footballer microstubs that are wholly interchangeable if one removes the text from user entry fields are what BilledMammal is talking about. JoelleJay (talk) 22:07, 9 November 2022 (UTC)[reply]
  • We already have an existing definition at WP:MASSCREATE which is effectively that it's the use of a bot or similar software to mechanically create batches of articles as a single task. The issue seems to be that some want to extend the definition to include manually created stubs. That's a different issue IMO. The problem with stubs is not their mass but their minimal nature. But it doesn't seem to be a big problem as the NPP queue seems to be under control and there are lots of existing ways of handling its entries. So, I'm not seeing any need for an expansion of existing guidelines and policies. If it works, don't fix it. Andrew🐉(talk) 11:17, 6 November 2022 (UTC)[reply]
    • Speaking of WP:MASSCREATE, with the recent changes I'm starting to think it would benefit from being moved out of WP:BOTPOL to its own policy, as it's getting less and less relevant to only bots and automated editing. If nothing else, it not being in the "Bot policy" may reduce the chances of someone arguing that it somehow doesn't apply to their "manual" mass creation. Anomie 12:28, 6 November 2022 (UTC)[reply]
      @Andrew Davidson, MASSCREATE says "While no specific definition of "large-scale" was decided". Please explain why you believe "We already have an existing definition at WP:MASSCREATE" when MASSCREATE explicitly says there is no definition. WhatamIdoing (talk) 16:52, 6 November 2022 (UTC)[reply]
  • WP:MASSCREATE is not a free-standing, separate policy. It's part of the WP:Bot policy which "covers the operation of all bots and automated scripts used to provide automation of Wikipedia edits". That section covers the use of such bots and scripts to create pages. See Context (language use). Andrew🐉(talk) 21:49, 6 November 2022 (UTC)[reply]
    What's the smallest number of articles that you'd call "large-scale"? WhatamIdoing (talk) 23:52, 6 November 2022 (UTC)[reply]
    Also, one of the problems we're having is that people are pointing at MASSCREATE to try to stop editors from creating articles using 100% manual methods, with no hint of a bot or script anywhere in the process. It sounds like, from your contextual reading, that you need bot approval if you want to use a script to create more than n articles, but if you do the work 100% by hand, then you can create them at a rate limited only by how fast you can type. Is that a fair summary of your interpretation? WhatamIdoing (talk) 23:54, 6 November 2022 (UTC)[reply]
    The main point of MASSCREATE is that creations by bot/script should be pre-approved. There's other parts of the bot policy that say things like "Note that high-speed semi-automated editing may effectively be considered bots in some cases (see WP:MEATBOT), even if performed by a human editor." So, botlike work that seems to be erroneous or inattentive may be shut down. But that's generally true for any manual work. For example, see Utterly horrendously written articles from an auto patrolled user. In this case, an editor has created over 1,000 articles which have been criticised as too garbled in some cases. There doesn't seem to be a particular policy issue beyond WP:CIR. People don't seem to need any additional policy to address such cases of manual incompetence. Andrew🐉(talk) 09:12, 7 November 2022 (UTC)[reply]
    What if it's neither "high-speed" nor "semi-automated"? I'm not asking to be picky. I'm asking because we just spilled some thousands of words about whether creating one or two articles per day for a year should be banned under MASSCREATE because that's more than the "25 to 50" listed in MASSCREATE.
    In your opinion, if someone creates just one or two stubs per day for a year, is that mass creation/large-scale creation/creation at scale? WhatamIdoing (talk) 22:39, 7 November 2022 (UTC)[reply]
    One or two stubs per day is not seen by MASSCREATE as a problem. MASSCREATE says "Alternatives to simply creating mass quantities of content pages include creating the pages in small batches..." and doing one or two per day would be such small batches. The idea seems to be that when the rate is low enough for individual human review then it's ok. One or two creations per day will obviously be getting individual attention from the primary author and is well within the capacity of our standard review processes like NPP. Andrew🐉(talk) 09:46, 8 November 2022 (UTC)[reply]
  • I can only repeat my comment quoted in the ideas box above that trying to define something as arbitrary as "mass creation" is impractical. The definition will fail when an influx is needed and will be circumvented when it is not. The only solutions to the perceived problem are to (a) insist that at least one acceptably significant source is cited in each new article; and (b) consider each case of mass creation on its individual merits. For (b), is it a one-off and is it justified; or is there a pattern that amounts to disruption? From what I can see of the Lugnuts case, the basic objection was not that he suddenly produced a mass input to, for example, fill a new category. It was more a case of persistent creation of minimal stubs over a long period of time. The key factor in any question of "mass creation" is circumstance and you cannot impose a rigorous definition on a concept which has such wide variations.
    There is another side to this and, turning to a point raised earlier by another editor (this might be at the RFC), I think retrospective action would be morally wrong. It must be acknowledged that the proverbial goalposts have shifted and that when Lugnuts and others were churning out their stubs in years gone by, the creation of placeholders was not only acceptable but probably a necessity to get the encyclopaedia up and running. We can now insist on quality before quantity, which is great because it means progress has been made. For the older stubs that cannot be expanded, there is WP:ATD and redirection to a suitable list. If no suitable list exists, it's easy to create one with a few items (get the relevant project to help if necessary) and then expand it in due course. There really is no need for "mass create" to evolve into "mass delete". BJóv | talk UTC 14:06, 6 November 2022 (UTC)[reply]
  • Comment - I realize that this will be slightly off topic, but from my perspective the issue goes beyond mass-creation and mass-deletion… the issue is doing anything “en masse”. Mass editing is disruptive no matter what you are doing. For example: while going through an article to conform it to our Manual of Style is considered commendable (and encouraged), we have sanctioned editors who do so at hundreds of articles at the same time. We start accusing the editor of “going on a Crusade” and “acting robotically”.
    Look at any individual edit, and there was nothing “wrong”, but we balk at edits done “en masse”. I suspect that the problem is that doing things (even “good” things) “en masse” overwhelms our community’s ability to mentally process the edits. We just can’t deal with so many edits at once.
    To relate this back to the topic at hand… perhaps what is needed is a broader WP:No mass edits guideline that makes it clear that: “Mass editing of any kind is considered disruptive” Blueboar (talk) 14:56, 6 November 2022 (UTC)[reply]
    Mass edits become a practical problem when watchlists get flooded. Mass article creation is a problem when the reviewers get swamped.
    Separately, I think we have a bias towards a certain style of editing that we have called "generalist" in the past. We want to see a certain amount of randomness, because that makes it look like you're a real human just like me. An editor who obsesses about a single thing (whether that's a subject area or a single typo) is not admired as much as someone who whimsically skips around editing articles they don't really know anything about. WhatamIdoing (talk) 17:15, 6 November 2022 (UTC)[reply]
    Making many similar edits that have community support is not inherently disruptive. There are a lot of editors making style changes in accordance with community consensus without anyone objecting. isaacl (talk) 17:50, 6 November 2022 (UTC)[reply]
    I fundamentally disagree that all mass editing is disruptive. This place is dodgy enough in places anyway. Remove all the people making minor fixes behind the scenes to things like MOS, and it'll be full of absolute garbage that's much harder to go and fix. It would also mean that even more of my selling errors would litter pages than already do. Blue Square Thing (talk) 20:06, 6 November 2022 (UTC)[reply]
  • If I was asked to define "mass creation", I would say that the creation of articles is not "mass creation" unless the total number of articles created exceeds X large number. (This criterion would be a "lifetime" total and would not depend on rate or time period). At this point in time 5,000 articles is the absolute minimum number that I would be prepared to even consider for "X" (for manually created articles). I would prefer something on the order of 10,000 or more. (These seem like relatively large numbers now, but might be less applicable to someone who has been editing for 70 years on our 70th anniversary in 2071.) In my opinion the one-off creation of a single batch of 26 or 51 articles in 24 hours, or even in 24 minutes, is not mass creation, because 51 articles is actually a small number. If someone creates 24 or 49 articles every day, for months and years on end, that might well be mass creation. A definition of mass creation would not be acceptable to the community if it affected large numbers of editors, especially now that it is being said to require BRFA approval, and this kind of criterion is one way to avoid that, since it would confine the definition to fewer than one hundred article creators at the moment. [There were 29 editors with more than 10,000 articles in 2021. Lugnuts created more than 90,000 articles and Carlossuarez46 created more than 80,000 articles]. James500 (talk) 18:35, 6 November 2022 (UTC)[reply]
    5,000 articles spread over 20 years is not much. That's less than one a day. But 5,000 articles in one year could be a challenge for the Wikipedia:New pages patrol work, and 5,000 in a month will be a problem. The definition needs to include a rate that takes into account the ability of the NPP to review the articles. WhatamIdoing (talk) 21:24, 6 November 2022 (UTC)[reply]
    A lifetime cap doesn't really get to the issue at hand, and is probably counterproductive. The issue is a lot of low quality, possibly non-notable articles dumped in a short period of time. I don't think a lifetime cap gets to that at all. — Preceding unsigned comment added by Rlendog (talk • contribs) 22:18, 6 November 2022 (UTC)[reply]
  • I'd define mass creation as creating articles without giving individual attention to the articles; in practice, that looks like a combination of high-volume creation, similar article content, and similar sources (e.g. a single database or document). Of course, exact thresholds for that are hard to pin down, but that's where I'd start. TheCatalyst31 ReactionCreation 20:33, 6 November 2022 (UTC)[reply]
    There needs to be some level below which article creation can't be "mass" creation, because there's so little volume involved. We really don't need editors accusing the m:100wikidays editors of mass creation. WhatamIdoing (talk) 21:21, 6 November 2022 (UTC)[reply]
    This trifecta of factors (large scale, boilerplate text, and repeated sources) is definitely better as a definition than any kind of numeric threshold. I'd maybe take it one stretch further and say that it would help to narrow the scope to articles created en masse from a single source, since that's the most problematic case by far. Steven Walling • talk 21:26, 6 November 2022 (UTC)[reply]
    The problem with not providing a numeric definition is that we already have editors claiming that one or two articles per day is "mass creation" (even when more than one source is provided, even when the subject is known to be notable). What Thryduulf says below about the basic desire here is to prevent people creating large numbers of articles that the person expressing the desire doesn't like resonates with me. Mass creation is being weaponized to stop things I don't like, rather than being about the "mass" creation of articles. WhatamIdoing (talk) 21:31, 6 November 2022 (UTC)[reply]
    I agree 100% with the goal of limiting bureaucracy. What about a definition that includes no sources or only a single source? It is legitimately risky for people to create a large number of articles based only on one source or without references at all, since we know that they can have errors. If we limit the scope to "a large number of stubs created using only boilerplate text and one (or no) source", then the problem can be addressed in ways other than deletion or merging, such as by adding more sources. There is definitely no consensus for making people ask for prior permission to create articles, so what that will produce is normal, healthy "should we delete this or can it be improved with better sources?" discussions at AFD. Steven Walling • talk 21:47, 6 November 2022 (UTC)[reply]
    @Steven Walling, what's "a large number"? If you do m:100wikidays for three years, you'll have created more than 1,000 articles. Is that "a large number"? WhatamIdoing (talk) 23:59, 6 November 2022 (UTC)[reply]
    I think it's clear from the RFC that no one likes trying to specifically define a numeric threshold. A single threshold only works in theory, not in practice. Lack of a single number that crosses the Rubicon from "normal" to "mass creation" prevents people from gaming it and also prevents people from going on a witchhunt looking for any author of more than N articles. Just saying "a large number" and then adding key attributes of the suggested minimum quality bar for mass creation is more effective. Steven Walling • talk 17:34, 7 November 2022 (UTC)[reply]
    @Steven Walling, I think it's clear from the RFC that a substantial number of editors, including me, think it's a good idea to define a numeric threshold. I'd settle for "If you come around crying 'mass creation' over one or two articles a day, we're going to ban you on the twin grounds of disruption and competence", but I think we really do need some numbers. We already have people claiming that one or two articles a day is a violation of "mass creation". WhatamIdoing (talk) 22:42, 7 November 2022 (UTC)[reply]
    As an adjustment on my previous proposal, A single editor, creating many articles based on boilerplate text and referenced to the same group of sources? I would agree that I don't think we need an explicit number of articles that need to be created; I don't see a benefit to having a bright line definition. BilledMammal (talk) 00:27, 8 November 2022 (UTC)[reply]
    If there is a "group of sources" then (assuming the sources are consistent with GNG guidelines) there should not be a problem. Boilerplate text might be a start, but if there are multiple reliable sources attached to the boilerplate text then I don't see an issue. The potential issue comes if there are a lot of articles using boilerplate text without appropriate sources, but then it comes down to how many. I strongly disagree that 10 in a month (or even a week) is any sort of problem (and even 10 in a day, if not repeated, I doubt is a real problem).Rlendog (talk) 00:41, 8 November 2022 (UTC)[reply]
    "Group of sources" are that they are all created from the same collective group of sources. Whether you use database A, database B, or database A and B, if they use the same boilerplate template they are all part of the same mass creation.
    I will add that the purpose of this definition isn't to define problematic mass creation, it is to define all mass creation, as the goal isn't to prevent such actions (the proposal to do that was overwhelmingly rejected), it is to give the community greater oversight of them. This means that appropriate mass creation, such as those that include sufficient sources to demonstrate compliance with GNG, should be included in the definition. BilledMammal (talk) 01:26, 8 November 2022 (UTC)[reply]
    I think you've identified one of the reasons we didn't agree on a definition there. There was too much "it's only mass creation if it doesn't have the right sort of sources" or "it's only mass creation if I can't figure out why anybody would care about this subject" and not enough "it's mass creation if there's a ton of it, and only after we've figured out whether there's a lot of it can we talk about whether it's a desirable or problematic mass creation".
    Along those lines, I think your proposal has too little attention on the key point (How many is "many"?) and too much on the desirable/problematic point (boilerplate text, wrong references). WhatamIdoing (talk) 01:45, 8 November 2022 (UTC)[reply]
    An editor making many articles isn't necessarily engaging in mass creation if the articles aren't related; the purpose of the focus on boilerplate text and reuse of sources is to establish that.
    Regarding the definition of many, I don't think we need a definition of that; WP:MEATBOT's definition is simply high-speed or large-scale, and that ambiguity hasn't caused issues in the past, and I don't believe something similar will here. However, repeating that wording might be useful; A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources BilledMammal (talk) 01:55, 8 November 2022 (UTC)[reply]
    But you are saying "database A" and "database B" as sources. If the sources are clearly not in line with GNG then that may be a problem. But if the sources are in line with GNG then boilerplate text should not be a problem at all. The mere use of boilerplate text and the same group of reasonably good sources is not an issue. So maybe "A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources that are not reasonably consistent with GNG." Rlendog (talk) 14:56, 8 November 2022 (UTC)[reply]
    I would agree that an editor working through the Oxford Dictionary of National Biography to create boilerplate stubs on every individual currently lacking an article is not an issue, assuming the boilerplate is suitable, but they are still engaged in mass creation. What we are trying to do here is create a definition of mass creation; when we have this definition we can consider how to determine which mass creations are problematic and which are not. BilledMammal (talk) 15:09, 8 November 2022 (UTC)[reply]
    But mass creation wouldn't matter if they use the same sources. It would be mass creation if they used no sources. Once we have defined mass creation we can address when mass creation is problematic. I do think we need some sort of quantitative guidelines since "high speed or large scale" can be subjective. We can either define mass creation based on the high speed large scale creation based on boilerplate text, and then say that such mass creation is not problematic if the sources are acceptable, or we can define mass creation based on the high speed large scale creation based on boilerplate text with defined unacceptable sourcing, and then the mass creation is by definition problematic. Rlendog (talk) 17:59, 8 November 2022 (UTC)[reply]
    I included the line about "same group of sources" because I see finding and reviewing individual sources to not be a "mass" action, but I can see why editors can believe that the broader context means it still is mass creation and I have no objection to removing that line from the proposal: A single editor, creating articles at high-speed or large-scale, based on boilerplate text.
    I think it is better to leave "high speed and large scale" subjective, both because it will prevent gaming and because I don't see a benefit of a bright-line definition. In addition, I think including a bright-line definition would prevent this from finding a consensus. BilledMammal (talk) 18:46, 8 November 2022 (UTC)[reply]
  • There are times when it is reasonably desirable to create a bunch of similar articles with basic content so that we have articles that will, naturally, be fleshed out over time. At the start many of these articles will look similar to the extent that they appear to be written based on a framework, possibly even they were. The example that comes to mind for this is following an election to a body that confers notability to its members where many, sometimes most, were not notable previously (e.g. national parliaments). We don't want bureaucracy to get in the way of creating those articles. Thryduulf (talk) 21:22, 6 November 2022 (UTC)[reply]
  • It seems to me that the basic desire here is to prevent people creating large numbers of articles that the person expressing the desire doesn't like. However, few people have been able to objectively define what separates articles of the type they don't like from ones that they do, and different people's definitions correlate poorly with each other. Thryduulf (talk) 21:22, 6 November 2022 (UTC)[reply]
    I don't think that is an accurate way to describe it. I would say that the basic desire is to give the community input into large scale creations, as the status quo can result in WP:FAIT issues. BilledMammal (talk) 04:09, 7 November 2022 (UTC)[reply]
    How does "give the community input into large scale creations" actually differ from "prevent people creating large numbers of articles that the person expressing the desire doesn't like", in practical terms? I agree that the one sounds a lot more friendly, but both of them mean "other people can vote to tell you that your contributions aren't wanted". WhatamIdoing (talk) 22:43, 7 November 2022 (UTC)[reply]
    The first means that the community can decide that your proposed contributions won't improve the encyclopedia; this is aligned with existing community processes such as AfD, with the only difference being that it discusses proposed contributions, in order to address WP:FAIT issues, rather than existing contributions. BilledMammal (talk) 02:50, 8 November 2022 (UTC)[reply]
    Both of these are talking about proposed contributions. Both of these are preventive. The comment from Thryduulf, which you disagreed with, actually contains the word prevent. It is illogical to talk about preventing "existing contributions". Thryduulf says that some folks want to prevent editors from creating articles that they (=editors who consider themselves to represent "the community") don't want – prevent, as in before the proposed contributions are made. You say you want "the community" to decide that some proposed contributions are unwanted, before those proposed contributions are made.
    There is no difference between these two statements. They both mean preventing editors from writing articles that you don't want. IMO the only difference is that you approve of preventing content contributions, and Thryduulf reports this as a view held by others (e.g., editors like you) rather than a view held by himself. WhatamIdoing (talk) 00:56, 9 November 2022 (UTC)[reply]
  • I think there are two facets to the issue - quantity and quality. A definition of mass creation needs to get to the quantity, and either the definition itself or the guidelines around it would get to the quality. I would suggest several thresholds as to quantity to avoid gaming. My opening suggestion (but open to revisions) is 25 in a day, 75 in a week, 200 in a month. Not sure if we need to go onto a year. But even a lot of similar articles with multiple reliable sources are not a problem. So as to quality, I would suggest that once a group of articles on a similar topic trips the threshold we should insist on certain quality parameters, such as at least one or two sources that plausibly meet GNG, and possibly others (maybe length, although I am not sure that in itself is a problem if there are multiple appropriate sources). So the guideline could be "If a single editor creates more than 25 articles on a similar topic in a day, or 75 in a week, or 200 in a month, it is expected that each of the articles will have at least 1 (or 2) reliable sources that plausibly meet GNG." If not, I guess the next step would be to figure out, but could result in some sort of restriction on further creation and/or a bias towards deletion at AfD. Rlendog (talk) 22:18, 6 November 2022 (UTC)[reply]
  • I find it highly paradoxical that basically the only meaningful result of the RFC was to recommend creating a definition, while every meaningful attempt to do anything that would use that definition was soundly defeated. So we're creating a definition of "article creation at scale" to do what, exactly? We're defining a term that, in the end, is pointless as most of the attempts to regulate "article creation at scale" have been soundly defeated. Of the 23 questions in the RFC, 3 were spun off into a new RFC, 11 failed either unanimously or "by a wide margin", 6 were unclear because they either received too few comments to judge consensus, or were phrased in such a way as to be confusing, 2 were too close to call, and this was the only one that passed clearly. Let's say we can come up with a definition. What are we going to use that definition to do if the community is so dead-set against regulating at-scale article creation? --Jayron32 13:39, 7 November 2022 (UTC)[reply]
    This is a fair point. Put simply, I'd say the problem to solve is that the current definition at WP:MASSCREATE ('While no specific definition of "large-scale" was decided, a suggestion of "anything more than 25 or 50" was not opposed.') is bad. The RFC showed clear consensus that a single numeric threshold was unhelpful and ineffective. The RFC rejected prohibiting or discouraging people from creating articles at scale, but there was more support for saying we should suggest a minimum quality bar for sourcing, etc. There's a related discussion kicking off about how to handle them at AFD which also will fall completely flat unless we have a shared working definition. Steven Walling • talk 17:42, 7 November 2022 (UTC)[reply]
    The problem with the old definition isn't that having a number is bad; the problem is that it's unclear what that particular number means. Editors currently disagree whether "anything more than 25 or 50" means "anything more than 25 or 50 in a short time period" or "anything more than 25 or 50 in your entire life". People holding the second view then make up their own extra restrictions (e.g., that only articles which are short, poorly sourced, of doubtful notability, similar to other articles, etc. 'count' towards the limit of 25 or 50). I assume that this is because they have some subconscious recognition that a plain reading of "25 or 50 per lifetime" means that anyone who qualifies for Wikipedia:Autopatrolled would have to "violate" MASSCREATE to reach that point. WhatamIdoing (talk) 22:53, 7 November 2022 (UTC)[reply]
    I don't think I agree with the paradox. Part of the reason many of the suggested "fixes" were opposed was because different editors were coming to the discussion with different ideas of what "mass creation" meant. I know that I opposed some proposals because they seemed to apply to creations that were not mass creations, and that I didn't think were appropriate outside that context. If we have a definition, then editors can participate in a more meaningful discussion of proposals to fix the issues generated by creations that meet that particular definition. If the definition is too broad most proposals will likely be defeated, based on the previous discussion. But if the definition is more narrowly focused I think we could get agreement on some of the proposals directed at the specific problem. Rlendog (talk) 00:15, 8 November 2022 (UTC)[reply]
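    For readers weighing the numeric proposals in this thread, here is a minimal sketch of how the opening suggestion above ("more than 25 ... in a day, or 75 in a week, or 200 in a month") could be checked against one editor's creation timestamps. It is an illustration only, not an agreed rule or an existing tool: the function name `exceeds_thresholds`, the rolling-window reading of "in a day/week/month", and treating a month as 30 days are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds from the suggestion in this thread:
# more than 25 in a day, 75 in a week, or 200 in a month (30 days assumed).
THRESHOLDS = [
    (timedelta(days=1), 25),
    (timedelta(days=7), 75),
    (timedelta(days=30), 200),
]


def exceeds_thresholds(creation_times):
    """Return True if any rolling window holds more creations than its limit."""
    times = sorted(creation_times)
    for window, limit in THRESHOLDS:
        start = 0
        for end, t in enumerate(times):
            # Move the window start forward until it is within `window` of t.
            while t - times[start] >= window:
                start += 1
            # Count of creations in the current window, including t itself.
            if end - start + 1 > limit:
                return True
    return False


# Two stubs per day for a year stays under every limit in this sketch.
base = datetime(2022, 1, 1)
steady = [base + timedelta(hours=12 * i) for i in range(730)]
# A one-off burst of 26 articles, one minute apart, trips the daily limit.
burst = [base + timedelta(minutes=i) for i in range(26)]
print(exceeds_thresholds(steady), exceeds_thresholds(burst))
```

    Under these assumptions, the "one or two articles per day" case debated above never trips a threshold, while a single batch of 26 does. Whether such a batch should count as mass creation is exactly what the thread disputes, and the sketch says nothing about sourcing or boilerplate.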

The big picture

Article creations per month

There's a good measure of article creations at Wikimedia stats. A snapshot to date is shown (right).

The number of creations peaked in 2007 with about 64,000 in July. The latest month of October 2022 was much lower with just 14,253. Note also the spike in October 2002 which was caused by Rambot.

What this seems to show is that mass creation is a declining issue rather than a pressing problem. The latest fuss was mainly about Lugnuts but he was quite exceptional and extreme. With his case resolved, I'm not seeing a need for an exact definition. There aren't lots of Lugnuts out there and the NPP queue of new articles seems to be under control.

If we want to provide some guidance to warn new editors then I suggest it be in the form of outcomes or case studies such as Lugnuts and Rambot which show when someone went too far and crossed a line. We know exactly what happened in those cases and so can give details. When we have a comprehensive list of such test cases, we might try to summarise them in an evidence-based way.

Andrew🐉(talk) 11:03, 8 November 2022 (UTC)[reply]

This is my general feeling as well. - Enos733 (talk) 16:33, 8 November 2022 (UTC)[reply]
This fact is what blows my mind about the whole debate. Article creation has slowed way down (in part due to things like WP:ACTRIAL) but for example, we still have tens of thousands of scientifically described species that lack any coverage in Wikipedia beyond a redlink. That's just one subject! We really need to be discussing ways to encourage article creation, not adding more rules that discourage or prevent it. Steven Walling • talk 17:24, 8 November 2022 (UTC)[reply]
+1 ---Another Believer (Talk) 17:28, 8 November 2022 (UTC)[reply]
-1. The encyclopedia doesn't need a separate page about every species in order to cover every species. Article count is a poor measure of topical coverage. Fewer stand-alone pages are easier to maintain than more stand-alone pages, so we should merge where we can. It's both inevitable and good that article creation has slowed down, and the current rate is still unsustainably high (as it always has been). Levivich (talk) 18:53, 8 November 2022 (UTC)[reply]
The idea that it's easier to not have species articles is totally laughable. It's quite common that just one genus of plants, animals, or fungi can contain more than a thousand species with detailed scientific descriptions. To collapse species up to the genus or family level would require creating and editing a set of truly gigantic, unwieldy lists that would be super complex to edit if they actually contained even a summary of the verifiable information about each one. Not to mention the fact that we'd be making it significantly harder for readers to find encyclopedic information. If we even got Google to index redirects as effectively as articles (which it doesn't), you'd have to scroll or search again within some huge list to find whatever information you were looking for. Steven Walling • talk 20:47, 8 November 2022 (UTC)[reply]
This assumes that a general encyclopedia should summarize all the verifiable information on each species in the first place. Why should we exempt species from WP:INDISCRIMINATE but not astronomical objects or published mathematical lemmas or school council members? Why do we need a million individual articles on insects when most of them can only ever contain the same boilerplate infobox parameters? We explicitly do not want every verifiable detail or anything close to that on a subject, so if the only material with which we can expand a stub is a collection of uncontextualized and/or primary-sourced facts, a standalone is just not merited. JoelleJay (talk) 22:38, 9 November 2022 (UTC)[reply]
Free access to the sum of all human knowledge, that's what we're doing. Steven Walling • talk 18:22, 10 November 2022 (UTC)[reply]
And what qualifies as the sum is explicitly restricted by what Wikipedia is NOT. JoelleJay (talk) 20:17, 14 November 2022 (UTC)[reply]
Jimbo's not a reliable source anyway. (And no, to collapse species up to the genus would not require creating a set of truly gigantic, unwieldy lists, that's just a straw man.) Levivich (talk) 21:28, 14 November 2022 (UTC)[reply]
+1. I agree. We need more articles rather than less. There are so many notable topics which are not covered here. BeanieFan11 (talk) 20:13, 8 November 2022 (UTC)[reply]
+1. One of my takeaways from looking at the detailed article creation stats that someone recently posted at one of the workshop threads is that mass creation is so rare that dealing with it on an editor-by-editor basis is preferable to trying to come up with a system of rules about it. Levivich (talk) 18:53, 8 November 2022 (UTC)[reply]

Sometimes if you can look at common sense / common practice and put it into words you get a good answer. If an editor spends a substantial amount of time writing text specific to the article and / or finding references for the specific article, and they do that individually for many articles, nobody is going to have a problem with that from a mass-creation standpoint. If an editor finds a way to make a large amount of articles with very little time investment for each one, many will have a problem with that from a mass creation standpoint. So maybe, even though this will sound simple minded, it's "if you're creating a larger amount of articles, be sure to make a substantial effort on each one to create text unique to that article". If a question arises, see if that practice has been followed. North8000 (talk) 19:09, 8 November 2022 (UTC)[reply]

If an editor spends a substantial amount of time writing text specific to the article and / or finding references for the specific article, and they do that individually for many articles, nobody is going to have a problem with that from a mass-creation standpoint. I wouldn't even consider that mass creation. I made a proposal above that I believe aligns to that with less subjectivity: A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources. An alternative, as an editor is not convinced by the sources aspect, is A single editor, creating articles at high-speed or large-scale, based on boilerplate text. BilledMammal (talk) 19:18, 8 November 2022 (UTC)[reply]
To me the first bolded definition you have seems sufficiently clear in scope to be useful. I think it's worth having a followup RFC that compares support for that vs. a definition that also includes a minimum number for what large scale or high speed means, like WhatamIdoing and others seem to prefer. Steven Walling • talk 20:58, 8 November 2022 (UTC)[reply]
So if the definition is approved in a first RfC, we hold a second RfC proposing to add quantitative values to "large-scale" and "high-speed"? I think that is a good idea. BilledMammal (talk) 21:49, 8 November 2022 (UTC)[reply]
It would probably be simpler to just do a single runoff RFC where we propose one or the other. (First we need to discuss what the numeric thresholds might be.) Steven Walling • talk 23:44, 8 November 2022 (UTC)[reply]
Are you suggesting we use instant-runoff voting, as discussed at the AfD at scale RfC, and ask:
Which proposed definition of mass creation should we implement?
Please rank your choices by listing, in order of preference from most preferred to least preferred. Preferences will be determined through IRV.

A: A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources.
B: A single editor, creating more than X articles per Y or Z articles overall, based on boilerplate text and referenced to the same group of sources.
C: Status quo

BilledMammal (talk) 00:33, 9 November 2022 (UTC)[reply]
Yep, ranked choice voting would be fine I think. Either we do it that way, or we have to have a whole other fullblown RFC to solicit various definitions. Steven Walling • talk 05:00, 9 November 2022 (UTC)[reply]
@BilledMammal: I would be very much in favour of your RfC proposal. If I might add a few comments: 1) should it say "creating stub-size articles"? 2) should it say "referenced to a single source or to the same small group of sources"? 3) would you envisage having it before the AfD RfC starts, or running it in parallel? 4) – me being pedantic – "high speed" and "large scale" should not have hyphens, unless they are adjectives, which they're not here. Scolaire (talk) 14:12, 9 November 2022 (UTC)[reply]
Good clarifying questions. Including "stub" as a size qualifier makes sense to me. Other than that, we should say either one or add "only" to the same group of sources, so that it's clear that just because one source happens to be reused, that's not a problem as long as there are other unique sources. Also you don't need to say large scale or high speed. If scale is the key attribute just say that. So end result is:
  • A: A single editor creating a large number of stub articles based on boilerplate text and referenced only to the same sources.
  • B: A single editor, creating more than [X] stub articles per [period of time], based on boilerplate text and referenced only to the same sources.
How's that? Steven Walling • talk 16:38, 9 November 2022 (UTC)[reply]
I think we need to stick with "the same group of sources", as small variations in what source is used (for example, an editor switching between using database A or database B or database A and B) shouldn't be enough to mean this is not mass creation, or that it is a different group of mass creation that would need to be discussed separately. BilledMammal (talk) 21:17, 9 November 2022 (UTC)[reply]
  1. I'm not certain it needs it; it adds subjectivity that will result in difficulties at the RfC, since we don't have a clear definition of what a stub is, and I doubt that anyone can create a boilerplate that would create an article beyond stub-size.
  2. The first option won't work, as mass creation can use multiple databases (for example, see some of Lugnuts' Olympic stubs, which use Olympedia and Olympics). The second might be better than the original, since "the same group" could be excessively large.
  3. I believe the plan is to run it before the AfD RfC starts, so that we have a definition of mass creation for the RfC.
  4. Good point, we should change that at WP:MEATBOT as well.
BilledMammal (talk) 21:23, 9 November 2022 (UTC)[reply]
They are used as adjectives at WP:MEATBOT, so they are correct there. Scolaire (talk) 12:29, 10 November 2022 (UTC)[reply]
While I'm commenting, I'd like to address the oft-repeated criticism that setting X at, say, 50 articles will cause editors to "game the system" by creating 49 articles. Of course, by reductio ad absurdum, you can go on decreasing the value of X until it is two articles, and people will "game the system" by creating one. The question is not whether or how we can enforce a given number, but what sort of number patrollers or AfD can comfortably handle. If they can comfortably handle 50 articles in a given time period, who cares whether someone creates 49 or 51? Scolaire (talk) 14:45, 9 November 2022 (UTC)[reply]
Let's set it to 10 a day then, that way no matter what the processes we have in place will work. -- LCU ActivelyDisinterested transmissions °co-ords° 18:02, 9 November 2022 (UTC)[reply]
We need to choose a number that can be handled by AfD even if there are multiple editors engaged in such mass creation, and even if the mass creation isn't noticed for a few months. BilledMammal (talk) 21:17, 9 November 2022 (UTC)[reply]
10 a day, 20 a week, 40 a month. -- LCU ActivelyDisinterested transmissions °co-ords° 21:47, 9 November 2022 (UTC)[reply]
@Scolaire, I don't think we need to define creation speeds in terms of what AFD can handle at one time. Consider:
  • If I create a hundred articles today, I'm not being fair to NPP, but the AFDs could be spread out over the next month, or even the next year.
  • If the articles are very similar, then you could nominate multiple related pages for deletion. The suggested process is to nominate one, see what happens, and then come back with more. Perhaps today you send the first to AFD (or maybe the first couple, in separate nominations), and next week you send a bundle of five or ten, and the week after, you send all the rest in a large bundle.
  • Fairly often, the kinds of articles that folks are complaining about (e.g., substubs about early Olympic athletes) don't need to go to AFD anyway. The heaviest process needed is often Wikipedia:Proposed article mergers, to turn Rae Runner into a redirect to the List of Ruritanian Olympic athletes.
Consequently, I suggest that article creation ought to be limited according to what the article creation-related review processes can handle, and not according to what might be needed if (and only if) AFD becomes relevant. WhatamIdoing (talk) 21:41, 10 November 2022 (UTC)[reply]
WhatamIdoing I'm speaking as a layman here. My point is that we should pick a number that the system can comfortably handle, not any specific component of the system (I just used those two as examples). The responses above and below suggest that 40–50 articles a month is a reasonable figure. If we set it at that, then one or two creators "gaming the system" will not overwhelm the process. Scolaire (talk) 10:56, 11 November 2022 (UTC)[reply]
Thinking about the system as a whole, I think we could safely set the limit-at-which-following-the-rules-isn't-disruptive at 100 articles a month without any part of the system being overwhelmed, so long as those 100 articles aren't all posted in less than a week.
I fully agree with your reductio ad absurdum analysis of the "gaming" fear. WhatamIdoing (talk) 18:36, 11 November 2022 (UTC)[reply]
Could we achieve this by splitting B, with B2 becoming A single editor, creating more than 100 articles per week, based on ...? That would give respondents a lower and a higher level to choose from. -- LCU ActivelyDisinterested transmissions °co-ords° 18:41, 11 November 2022 (UTC)[reply]
A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources - As long as we say this doesn't go into effect until we've defined "high-speed" and "large-scale", that seems more or less ok. Might change "and" to "or" in the second sentence, though (or change "same" to "same or substantially similar"). You don't need the same group of sources to work with a boilerplate, after all. — Rhododendrites talk \\ 21:55, 9 November 2022 (UTC)[reply]
The current proposal is to hold an RfC with three options; one without definitions for those terms, one with definitions for those terms, and one for the status quo. However, we still need a proposal on what definitions we should use for those terms.
Might change "and" to "or" in the second sentence, though (or change "same" to "same or substantially similar"). I agree with that second change, or something like it; the intent is to say that not all of the sources need to be used on every article for it to still be mass creation, but I think that might be being misinterpreted?
I don't support changing it to or, however, as I don't think merely using the same sources is enough for it to be mass creation, but if other editors disagree I won't object. BilledMammal (talk) 07:52, 10 November 2022 (UTC)[reply]
I think BilledMammal is right. The "or" option will lead to someone claiming that if you use any overlapping source, or any source that's the same basic type, that it's culpable mass creation. This is just the nature of the systems we've set up. We've gone so far down the path of rules-lawyering that people feel like they have to claim that all conditions are fulfilled, to accomplish the end that they want.
We used to say that in computing the three scariest things were a programmer with a soldering iron, a hardware tech with a compiler, and a user with an idea. In the current era, I'd change it to "users who can't do their jobs unless they break the security policies". If you tell a sales team that they can either follow the corporate policy about not uploading proprietary content to third-party websites, or they can get the deal-clinching, paycheck-producing document into the customers' hands, they're going to break the policy without a second thought. They will click any button, visit any site, and agree to any terms, so long as it accomplishes the goals. (See also efforts to stop people from uploading copyvios to Commons. Sure, Commons, I definitely took that picture of Queen Victoria myself, because if I don't claim that, the software* won't let me upload it.) [*depending on which software you're using]
We have a similar thing here: We tell people they can only get articles deleted under certain circumstances. If they deeply believe that the article should be deleted, then they will do and say whatever is necessary to reach their goal. If submitting AFD required you to tick a box claiming that this nomination was endorsed by a Nobel Prize winner, people would tick that box.
I don't think there is an easy solution to this, but so long as this problem exists, the rules should be written to defend against overblown claims. WhatamIdoing (talk) 22:00, 10 November 2022 (UTC)[reply]
  • The picture that I am getting here is that the basic problem that actually needs solving is that some editors are putting a disproportionate burden on NPP by creating articles at a rate that NPP cannot comfortably handle. If every registered editor suddenly created a new stub on one specific day it would crash the system, but the probability is vanishingly low so we accept the risk. When we have a few editors who persistently create sufficient articles of dubious quality at a rate which taxes the capacity of NPP we do not want to accept the risk because it is known to happen often enough to be a problem.
    There are two aspects to this recognised problem. Rate of creation, and quality of article after creation is finished. (articles are not necessarily created in one edit, and should not be reviewed until the initial creation is complete - In use and Under construction tags should keep them off the queue).
    The total number of articles created over an editor's Wikipedia career is not relevant to this specific problem, but may be a separate problem where incompetence is involved (there are examples of this happening).
    A limit to rate of creation may be the easiest way to deal with this. I think a running average limit, with a superposed peak daily rate may be a suitable constraint. Ideally it would be automated to choke the flow when it gets too high, and ideally it would be a function of NPP backlog size at the time, but those are technical issues and we do not have the capacity at present (as far as I know), so a dumbed down system that is simple enough for the average editor to follow would be needed as a starting point.
    Daily, weekly and monthly caps should be manageable (something like 10 per day, 30 per week, 50 per month, subject to change if they are found to be unmanageable. This could be linked to backlog status, with the understanding that there would be a hard bottom limit, so that no-one ever gets shut off completely, of 1 per day, whatever the backlog gets to.)
    These limits would only apply to articles that go through NPP.
    If anyone wants to exceed the limits for a period, they apply for permission, and special conditions will apply (on a case by case basis, for the specified batch). If they keep within the limits, no special permission is required. If they inadvertently exceed the limits, they get notified by whoever notices, and are required to slow down. If they fail to respond (by slowing down), they get a 24 hour block, which will slow them down. This block would be preventative and effective as a brake, would attract their attention, and would be applied as often as necessary.
    With an article creation rate limit system like this there is no need to define mass creation or creation at scale, which is basically a red herring. Cheers, · · · Peter Southwood (talk): 01:57, 10 November 2022 (UTC)[reply]
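As a rough illustration of the layered caps described in the comment above, here is a minimal Python sketch. The cap values (10 per day, 30 per week, 50 per month) are the figures floated in that comment, not agreed policy. The function names, the rolling-window interpretation, and the way the daily cap shrinks with backlog are all assumptions made for this example; the only part taken from the comment is the floor of 1 per day.

```python
from datetime import date, timedelta

# Hypothetical caps using the figures floated above: window in days -> max creations.
CAPS = {1: 10, 7: 30, 30: 50}


def exceeded_caps(creation_dates, today, caps=CAPS):
    """Return the window lengths (in days) whose cap is exceeded as of `today`."""
    exceeded = []
    for window_days, cap in sorted(caps.items()):
        # Rolling window that ends on `today` and covers `window_days` days.
        start = today - timedelta(days=window_days - 1)
        count = sum(1 for d in creation_dates if start <= d <= today)
        if count > cap:
            exceeded.append(window_days)
    return exceeded


def scaled_daily_cap(base_cap, backlog, reference_backlog):
    """Shrink the daily cap as the backlog grows past a reference size.

    The scaling rule is an assumption for illustration; the floor of 1 per day
    is the "hard bottom limit" mentioned in the comment.
    """
    if backlog <= reference_backlog:
        return base_cap
    return max(1, base_cap * reference_backlog // backlog)
```

For example, an editor creating two articles every day stays under the daily and weekly caps but passes the monthly cap after 30 days, which is the kind of slow, steady creation a single daily limit would not catch.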

What I'm getting from the above, then, is:

Which proposed definition of mass creation should we implement?
Please rank your choices by listing, in order of preference from most preferred to least preferred. Preferences will be determined through IRV.

A: A single editor, creating articles at high speed or large scale, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
B: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
C: Status quo

Is that about right? Scolaire (talk) 12:20, 10 November 2022 (UTC)[reply]

By the way, what is the status quo? That we continue to discuss mass creation without defining it? Scolaire (talk) 16:14, 10 November 2022 (UTC)[reply]

The status quo would be continued discussion on coming up with a definition. Question 3B of the RFC passed by a wide margin, so this discussion can only come up with a definition, not decide not to define one. -- LCU ActivelyDisinterested transmissions °co-ords° 17:16, 10 November 2022 (UTC)[reply]
Thanks, AD. Scolaire (talk) 20:12, 10 November 2022 (UTC)[reply]
Well, you can't squeeze water out of a stone—if after significant attempts have been made there is no agreement, that's in essence a new consensus reversing the previous one. But what really matters are any new procedures that are agreed upon by consensus. Those will have to establish their own specific criteria on when they apply. isaacl (talk) 21:12, 10 November 2022 (UTC)[reply]
We should make sure to ping all those who voted on Question 3B once the RFC wording has been decided. That way no one can later claim any shenanigans. -- LCU ActivelyDisinterested transmissions °co-ords° 21:36, 10 November 2022 (UTC)[reply]
I still think A is a little too wordy, but yeah Scolaire I think those options are roughly worth having the followup definition RFC about. Steven Walling • talk 18:29, 10 November 2022 (UTC)[reply]
I think we should word it as and referenced to a subset of a small group of sources., but otherwise yes. BilledMammal (talk) 21:09, 10 November 2022 (UTC)[reply]
@Scolaire, could option A be a little clearer about its purely subjective nature, by saying creating articles at a rate that you subjectively believe should be called high speed or large scale? WhatamIdoing (talk) 22:02, 10 November 2022 (UTC)[reply]
I would disagree with both of those. @WhatamIdoing: "high speed or large scale" is objectively subjective (if you'll pardon the expression); anyone !voting for it will know that they are defining it as something that they "subjectively believe" is high speed or large scale. There is no need to spell it out.
@BilledMammal: "a subset of a small group of sources" is confusing; a small group of sources is about as "sub" as you can get; what is meant by a "subset" of those, and why do we need to specify it? On reflection, I disagree with "or substantially similar" for the same reason. If an article creator uses the same two sources for half his new articles, and two "substantially similar" sources for the other half, that is still the same small number (four) of sources for the whole lot.
I missed Steven Walling's earlier suggestion that the word "only" be added before "the same group of sources". That is kind of necessary, I think. I therefore suggest: based on boilerplate text and referenced only to the same small group of sources. Scolaire (talk) 12:03, 11 November 2022 (UTC)[reply]
I also see that Steven Walling suggested changing A to "creating a large number of articles based on boilerplate text..." I'd be inclined to agree: (a) it's plain English, and (b) "high speed" isn't really necessary, since anyone creating a large number of articles based on boilerplate text isn't going to do it over a period of years. Scolaire (talk) 12:39, 11 November 2022 (UTC)[reply]
@Scolaire, please reconcile these things:
  • You: "anyone creating a large number of articles based on boilerplate text isn't going to do it over a period of years"
  • An editor creating more than 500 short, similar articles, individually, manually, after checking the sources (the editor reports having found errors in FishBase), spread out over the course of the last year. This is an average of about one and a half articles per day.
Note, too, Wikipedia:Village pump (proposals)/Archive 194#Mass creation of pages on fish species, where multiple editors have claimed (and others have disagreed) that creating these articles is a violation of MASSCREATE. The editor started this discussion because @BilledMammal complained on the editor's talk page that creating one or two articles per day is a violation of MASSCREATE.
Based on these facts, I have to assume that stopping this "high-speed or large-scale" article creation of just one or two articles per day, involving "boilerplate" (there are only so many ways to say that "<Species> is a <common name> in <genus>", so of course they're going to look like "boilerplate" articles), is exactly what BilledMammal wants to accomplish. Is this what you want to accomplish? WhatamIdoing (talk) 16:14, 11 November 2022 (UTC)[reply]
Isn't this discussion about the RFC wording? Once the RFC is underway editors can express their opinions on the options. -- LCU ActivelyDisinterested transmissions °co-ords° 16:36, 11 November 2022 (UTC)[reply]
Yes, and my goal here is to make it clear to RFC participants that if "A" is chosen, there will be disputes about whether one or two articles per day counts as "high-speed or large-scale", because (a) there is no agreed-upon definition for these terms anywhere, so each editor gets to make up their own numbers, and (b) we have already had disputes about what counts as "high-speed or large-scale", so we can't even pretend to be surprised when (not "if") it happens again in the future. WhatamIdoing (talk) 17:00, 11 November 2022 (UTC)[reply]
That's why the second option that includes some kind of definition is a suggestion. Separately I'd argue that the fish articles fall under the first option due to "or large scale". The "or" would imply one or the other, so even if they were created at one or two a day (not high speed) there is a lot of them (large scale).
So the first option would include them, while the second option would only do so if they were created at a speed reaching the mentioned limits. -- LCU ActivelyDisinterested transmissions °co-ords° 17:14, 11 November 2022 (UTC)[reply]
And Scolaire incorrectly claimed above that nobody would create "a large number of articles [...] over a period of years", which is how we ended up with this sub-thread. Some editors have created a large number of articles over very long periods of time, and we also know that certain other editors, specifically including some editors active in these discussions, believe that creating 500 articles per year (=42 articles per month, 1.4 articles per day) should be banned as "large-scale".
As for your suggestion above about proposing a limit of more than 100 articles per week, I think it's a good idea. There will certainly be editors who oppose it; for example, BilledMammal has previously said that "100 boilerplate articles created over a year should get consensus". I suspect that this is due to a personal dislike of "boilerplate articles" rather than any practical concerns about whether reviewers can handle two articles a week. (Boilerplate articles are generally easier to process than most, because you know exactly what to expect from them and their main sources.) But from the POV of ranked-choice voting, I think that providing people with a range helps them identify their real preferences. You want to have options at both ends of the spectrum that most people are willing to vote against.
Another thing that would help people figure out their preferences is providing some background information, like the number of articles NPP handles in the same time period and the number of editors last year who might have been constrained by such a rule, and whether the rule would have moderated any notorious editors in the past. WhatamIdoing (talk) 20:19, 11 November 2022 (UTC)[reply]
Maybe the first option should change from high speed or large scale to high speed or large scale regardless of speed.
Also C should change to Other to give respondents a chance to give additional input, with D being Do not create a specific definition? Pinging all past participants from the original RFC as previously stated. -- LCU ActivelyDisinterested transmissions °co-ords° 17:20, 11 November 2022 (UTC)[reply]
I like your suggested change to "A".
"Other" does not work well in instant-runoff voting schemes. You can end up with lots of people voting for "Other" but none of them voting for the same thing.
Your suggestion of "Do not create a specific definition" would be more accurately re-phrased as "Overturn the results of the WP:ACAS RFC, whose clearest result was that the community should create a definition". WhatamIdoing (talk) 20:24, 11 November 2022 (UTC)[reply]
See my tweaked text below. I changed Overturn the results of the WP:ACAS RFC as it's overturning Question 3B not the whole RFC. -- LCU ActivelyDisinterested transmissions °co-ords° 20:49, 11 November 2022 (UTC)[reply]

Arbitrary break (article creation at scale)

  • New suggestion for RFC text.
    Question 3B Should we create a definition of "article creation at scale"? of the WP:ACAS RFC was passed, but no specific definition suggested in WP:ACAS managed to pass.
    Which proposed definition of mass creation should we implement?
    Please rank your choices by listing, in order of preference from most preferred to least preferred. Preferences will be determined through IRV
    A: A single editor, creating articles at high speed or large scale regardless of speed, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
    B1: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
    B2: A single editor, creating more than 100 articles per week, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
    C: Overturn the results of the WP:ACAS RFC Question 3B, and leave mass creation undefined.
    -- LCU ActivelyDisinterested transmissions °co-ords° 20:46, 11 November 2022 (UTC)[reply]
A is still a little bit of a word salad. I would really suggest keeping it simple and saying A single editor rapidly creating a large number of articles based on boilerplate text and referenced mostly to the same few sources. How's that? Steven Walling • talk 04:58, 12 November 2022 (UTC)[reply]
@Steven Walling, at least some of the folks here are trying to stop both The Tortoise and the Hare. That is, they want to prevent a single editor from slowly but persistently creating a large number of articles. WhatamIdoing (talk) 07:12, 13 November 2022 (UTC)[reply]
A large number of boilerplate articles, not a large number of articles. No one is trying to prevent editors from creating large numbers of non-boilerplate articles.
The reason we are focused on scale, not speed, is because speed is irrelevant to AfD's ability to handle mass creation. AfD cannot handle an editor creating 1000 boilerplate articles, regardless of whether it takes them one day or one year to create those articles. BilledMammal (talk) 07:31, 13 November 2022 (UTC)[reply]
I'm convinced that AFD can handle 1000 boilerplate articles much more easily than it can handle 1000 completely different articles. But the point is that when someone sets out to create 1000 articles (boilerplate or otherwise), if they're doing it slowly, you can stop them before there are 1000 articles available for AFD to handle. WhatamIdoing (talk) 22:02, 14 November 2022 (UTC)[reply]
Almost all of the boilerplate articles need to be taken through at least part of the AfD process (at a minimum, they need to be taken through WP:BEFORE), which means that the 1000 articles are harder to handle than the 1000 completely different articles, almost all of which need to be taken through no part of the process.
But the point is that when someone sets out to create 1000 articles (boilerplate or otherwise), if they're doing it slowly, you can stop them before there are 1000 articles available for AFD to handle. See Lugnuts for evidence of why that isn't the case. BilledMammal (talk) 23:30, 15 November 2022 (UTC)[reply]
  • Instant runoff works well for scenarios where one of the provided options must be selected. In this case, since these aren't the only possible options, I think it would be best to establish which options have consensus support, and then using the instant runoff procedure amongst those choices. (In essence, this combines approval voting with ranked voting.) isaacl (talk) 23:22, 11 November 2022 (UTC)[reply]
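For readers unfamiliar with the procedure, here is a minimal Python sketch of an instant-runoff tally, with an optional filter that approximates the two-step idea in the comment above (first settle which options have support, then run the runoff among those). The function name, the `allowed` parameter, and the alphabetical tie-break are assumptions made for this example; they are not part of any agreed RfC process, and a real close would be a judgment of consensus rather than a mechanical count.

```python
from collections import Counter


def instant_runoff(ballots, allowed=None):
    """Return the instant-runoff winner for ranked ballots (lists of option labels).

    `allowed`, if given, restricts the count to options already judged to have
    support; this filter is a hypothetical addition for illustration.
    """
    remaining = {option for ballot in ballots for option in ballot}
    if allowed is not None:
        remaining &= set(allowed)
    while remaining:
        # Count each ballot for its highest-ranked option still in the running.
        tally = Counter({option: 0 for option in remaining})
        for ballot in ballots:
            for option in ballot:
                if option in remaining:
                    tally[option] += 1
                    break
        total = sum(tally.values())
        leader = max(tally, key=tally.get)
        if tally[leader] * 2 > total or len(remaining) == 1:
            return leader
        # Eliminate the option with the fewest votes; ties are broken
        # alphabetically, an arbitrary choice made only for this sketch.
        loser = min(tally.items(), key=lambda kv: (kv[1], kv[0]))[0]
        remaining.discard(loser)
    return None
```

With ballots such as `[["A", "B"], ["A", "B"], ["B", "A"], ["C", "B"], ["C", "B"]]`, no option has a majority in the first round, so the option with the fewest first preferences is eliminated and its ballot transfers to that voter's next choice.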

We seem to be going back and forth a bit here. None of my most recent comments were taken into account in this most recent draft. To summarise them: (1) Steven Walling's "A single editor rapidly creating a large number of articles" says the same thing just as completely, and more simply, than "A single editor, creating articles at high speed or large scale regardless of speed"; (2) "or substantially similar" is also unnecessarily complicated, since they don't all need to be the identical sources to be a small group, and a large group of non-identical sources would not fit the definition we're looking for; and (3) the word "only" should be inserted before "the same sources".
Re the latest proposed wording: "overturn 3B" would mean we don't want to define mass creation at all; surely a better alternative to "status quo" would be "none of the above"? I also agree with isaacl that IRV is not necessarily the best choice here; we would be better to leave that out of the question, and let the moderators/closers choose how to decide. One further thing: I would suggest that "which...should we implement" should be changed to "which...should we adopt". My alternative proposal, then, is:
Which proposed definition of mass creation should we adopt?
Please rank your choices by listing, in order of preference from most preferred to least preferred:
A: A single editor, rapidly creating a large number of articles, based on boilerplate text and referenced only to the same small group of sources.
B: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced only to the same small group of sources.
C: A single editor, creating more than 100 articles per week, based on boilerplate text and referenced only to the same small group of sources.
D: None of the above

Scolaire (talk) 14:51, 12 November 2022 (UTC)[reply]

I'd still rather D was explicit in that it would overturn 3B, but it's not a sticking point for me. I'm happy with the text either way. BilledMammal, WhatamIdoing any thoughts? -- LCU ActivelyDisinterested transmissions °co-ords° 18:17, 12 November 2022 (UTC)[reply]
Another way of saying D could be "Leave WP:MASSCREATE as currently defined." Anyway, I'm supportive of Scolaire's suggested version too. Steven Walling • talk 19:58, 12 November 2022 (UTC)[reply]
We need to keep options for defining it in terms of scale, rather than just rate, as myself and many other editors prefer such a definition. I suggest that editors who prefer defining it based on rate create no more than two definitions that they can agree on, and editors who prefer defining it in terms of scale should do the same.
For scale, I suggest:
  1. A: A single editor creating a large number of articles based on boilerplate text
  2. B: A single editor creating more than 100 articles based on boilerplate text
We need to discuss whether we want to add something similar to "using the same group of sources", as some editors appear to believe it is unneeded and adds unnecessary complexity. BilledMammal (talk) 02:22, 13 November 2022 (UTC)[reply]
@BilledMammal: I'm not aware of any editors saying that "using the same group of sources" is unneeded. As far as I can see, using boilerplate text and using only the same small group of sources are both considered necessary for whatever proposals are put forward. On the other hand, to start talking about definition in terms of numbers alone seems to me to be going off on a tangent, just as we were close to agreement on a wording. It is a completely new idea to me, and I haven't seen "many other editors" showing a preference for it. 100 articles in a week is one thing, but 100 articles in a lifetime? How would we even count them? Obviously you are free to propose them, but as D and E, not as A and B. Too many people have said that they agree with the current three proposed definitions. Scolaire (talk) 14:44, 13 November 2022 (UTC)[reply]
See the discussions above. Personally, I support their inclusion and have no objection to them being restored.
I don't think we were close to an agreement on wording as your most recent wording varies significantly from previous wordings, such as the wording of A that you posted at 12:20, 10 November 2022, or the wording of A that ActivelyDisinterested posted at 20:46, 11 November 2022.
The option of B (alternatively A2, to match the format proposed by ActivelyDisinterested) is because we are now including options for the rate of creation that includes an explicit figure; to keep the RFC neutral I believe we need to provide an equivalent option for scale of creation. BilledMammal (talk) 14:48, 13 November 2022 (UTC)[reply]
@Scolaire, these two options are materially different:
  • "A single editor rapidly creating a large number of articles"
  • "A single editor, creating articles at high speed or large scale regardless of speed"
The first covers Alice making 500 articles in a week. It does not cover Bob making 500 articles over the course of ten years.
The second covers Alice making 500 articles in a week plus Bob making 500 articles over the course of ten years.
A few editors (e.g., see BilledMammal's comments) actually want to ban you from creating one article per week over the course of a decade, unless you get written permission to make so many similar articles. You can vote against them, but please don't claim these are the same thing. We really don't need to end up with the written rules saying "high speed" but editors claiming that you're just supposed to know that very slow rates of article creation count as being rapid if you don't quit soon enough. WhatamIdoing (talk) 07:19, 13 November 2022 (UTC)[reply]
  • Again this is pushing the idea of having a VOTE rather than establishing consensus. Trying to mandate draconian rules by straight majority voting is divisive and often results in the opposite of consensus. What you get is trainwrecks like Brexit and the current Blue/Red split in the US. Andrew🐉(talk) 09:13, 13 November 2022 (UTC)[reply]
    Unless we can narrow it back down to one option we're going to need a series of options that editors order. However, I would suggest clarifying the question by saying Preferences, weighted by strength of argument, will be resolved through IRV. BilledMammal (talk) 09:17, 13 November 2022 (UTC)[reply]
    We don't need this aggravation. The big picture above shows that mass creation is not a pressing problem requiring more complex and creepy rules. The real mass creation problem is the endless argumentation and tinkering with the rules. We have multiple policies discouraging this – WP:NOTLAW, WP:BURO, WP:IAR, &c.
Now we had a complex RfC about mass creation and it's over and being closed, right? If it failed to arrive at a consensus about the exact nature of mass creation then it failed and that's that. Time to move on, not repeat the process.
Andrew🐉(talk) 09:53, 13 November 2022 (UTC)[reply]
From the top of this discussion: A recently closed RfC found consensus to create a definition of "article creation at scale" (sometimes called mass creation). BilledMammal (talk) 10:27, 13 November 2022 (UTC)[reply]
That's not the consensus of the RFC; question 3B, as per the closers' notes, passed by a wide margin. This discussion is about how to deal with that. There will be an option in the proposed RFC to overturn 3B and not create a definition. But the RFC consensus was very clear that one should be created. -- LCU ActivelyDisinterested transmissions °co-ords° 15:08, 13 November 2022 (UTC)[reply]
@Andrew Davidson, IMO to stop the "endless argumentation", we need to define what mass creation is, or at least what it isn't. I am firmly in favor of having a definition that even the most motivated wikilawyer could not re-interpret as meaning that participation in m:100wikidays is "mass creation". Right now, editors such as BilledMammal have argued that creating one or two articles a day is "mass creation". Stopping (what I think are) incorrect claims about this would stop the endless argumentation. Leaving things as-is will perpetuate the arguing. WhatamIdoing (talk) 22:32, 13 November 2022 (UTC)[reply]
The trouble is that producing a definition doesn't stop the argumentation; it feeds it as the fanatics proceed to argue about the definition, nibbling away at it to try to achieve their goal. Just look at a simple core policy like WP:V which was produced nearly 20 years ago and which still seems to be a constant battleground over issues like WP:ONUS, verifiability vs verification and whatever else. That policy's talk page now has 76 pages of archive. 76!
The delusion is that making more rules solves problems. This is not a given. Maybe the rules will be counter-productive or dysfunctional. The only certainty is that they will generate more complexity, argument, game-playing and wikilawyering.
To stop this, we already have the policy WP:CREEP. We just have to apply it.
Andrew🐉(talk) 00:21, 14 November 2022 (UTC)[reply]
Providing clear definitions stops arguments over what the definition means. Nobody who sees "Speed Limit 25" is going to start a fight about whether 12 is bigger than 25. They might complain about reckless driving, but they won't even try to claim a speed limit violation. WhatamIdoing (talk) 22:05, 14 November 2022 (UTC)[reply]
  • As there are two versions of the wording with numbered limits, could there be two versions of the first question? One as is and one closer to BilledMammal's ideas. I realise it's becoming more unwieldy than the original text, but at least having the options would put these questions to rest. -- LCU ActivelyDisinterested transmissions °co-ords° 15:11, 13 November 2022 (UTC)[reply]
    In case you missed it, Scolaire's version of the first question is different from earlier proposals as it adds the requirement that the article creation be rapid; that didn't previously exist.
    My preference is for three options:
    A: A single editor creating a large number of articles based on boilerplate text and referenced to the same small group of sources.
    B: A single editor, rapidly creating a large number of articles, based on boilerplate text and referenced to the same small group of sources.
    C: Overturn consensus to create a definition of "article creation at scale"
    However, I believe some editors prefer to include options with explicit numbers. BilledMammal (talk) 15:18, 13 November 2022 (UTC)[reply]
    My point was that it didn't include an option for scale. There has been clear preference for options with numeric limits. So what I'm suggesting is A and B as you stated; options C and D would then be the ones with limits, and E would be to not define. -- LCU ActivelyDisinterested transmissions °co-ords° 16:11, 13 November 2022 (UTC)[reply]
    Agreed - many of the objections to proposals within the RfC this came from was that they lacked quantifiable limits so people didn't actually know what they were really voting for. Given that, I'd be dubious about voting for "large" because I'm not sure how that's being defined (which brings us back to the current definition that suggests 25 or more but gives no timeframe...). "Rapidly and large" I can understand and is a better option, but I imagine people would like to see something with actual values in it as an option at least. Blue Square Thing (talk) 18:50, 13 November 2022 (UTC)[reply]

Another suggested wording

Another go at a final wording.
Which proposed definition of mass creation should we adopt?
Please rank your choices by listing, in order of preference from most preferred to least preferred:
A: A single editor creating a large number of articles based on boilerplate text and referenced to the same small group of sources.
B: A single editor, rapidly creating a large number of articles, based on boilerplate text and referenced only to the same small group of sources.
C: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced only to the same small group of sources.
D: A single editor, creating more than 100 articles per week, based on boilerplate text and referenced only to the same small group of sources.
E: None of the above

-- LCU ActivelyDisinterested transmissions °co-ords° 16:15, 13 November 2022 (UTC)[reply]

If we are including rate definitions with numeric limits we need to include scale definitions with numeric limits. I suggest A single editor creating more than 100 articles based on boilerplate text and referenced to the same small group of sources. BilledMammal (talk) 16:21, 13 November 2022 (UTC)[reply]
As a modification of option A? -- LCU ActivelyDisinterested transmissions °co-ords° 16:23, 13 November 2022 (UTC)[reply]
As an alternative to option A. BilledMammal (talk) 16:24, 13 November 2022 (UTC)[reply]
That seems overkill. Is there a specific reason you think it needs inclusion? Someone saying that 100 or more articles isn't a large quantity of articles doesn't seem likely to be accepted as a valid argument. -- LCU ActivelyDisinterested transmissions °co-ords° 16:36, 13 November 2022 (UTC)[reply]
+1, also we are already coming towards way too many options. More than 3-4 options is going to result in lower RFC participation, which is not good. Steven Walling • talk 16:42, 13 November 2022 (UTC)[reply]
For neutrality; editors who prefer definitions with numeric limits should not be limited to rate based definitions. If we are concerned that there are too many options I would suggest merging C and D, so that there are two rate based definitions and two scale based definitions. BilledMammal (talk) 17:04, 13 November 2022 (UTC)[reply]
The issue with placing a limit of just 100 articles ever, is that it's entirely possible that someone who was editing in 2005 could easily have created 100 articles about really worthwhile stuff, well referenced but based on very similar formats. I'm really not sure that works. The concerns only seem to be valid when the rate of creation overwhelms the ability of systems to deal with it - e.g. NPP, RC etc... That, I imagine, is why we've focussed on rate. 100 articles since 2005 is, what, 1 every couple of months? That's not problematic.
I can see the point of options B ("rapidly" is the key as it overwhelms), C and D here and could support any of them. Tbh, I'm not sure if the numbers in C are too low if we're talking about applying this retrospectively, but the numbers can always be changed if there seem to be issues, or people can use common sense when applying things to articles created back in the early days of Wikipedia. I guess we have to have D, so I don't think it's possible to get this down to fewer than 4 options and still have a sensible set of possibilities. Blue Square Thing (talk) 18:45, 13 November 2022 (UTC)[reply]
I don't think it's possible that someone who was not using a boilerplate would manage to create 100 articles that appear to be using a boilerplate, but that is a discussion that should be held during the RfC.
The system that was overwhelmed, resulting in the ArbCom case and these RfCs, was AfD, and for AfD what matters is not the speed of creation but the scale. BilledMammal (talk) 18:53, 13 November 2022 (UTC)[reply]
Not really? When article creation happens slowly, then the community can (and frequently does) intervene early. If I set out to create five thousand articles, at a rate of one per day, and I'm producing garbage, I'll likely be asked to stop within a month (frequently even within days). At that point, AFD only has to manage a couple dozen articles, which is not difficult. It's unlikely that my work will be completely overlooked for 5,000 days (more than 13 years). WhatamIdoing (talk) 22:36, 13 November 2022 (UTC)[reply]
The level of disruption required to get the community to intervene is higher than the level required to cause issues at AfD, even when the article creation is happening rapidly. For example it took years to topic ban Lugnuts from article creation and that is the case for most problematic mass-creators. BilledMammal (talk) 01:58, 14 November 2022 (UTC)[reply]
I don't think so. If I set out to create 5,000 articles, and you decide that article #16, created on day 16, is garbage, why wouldn't you send that lone article to AFD right away?
Lugnuts' work was accepted at the time of creation. WhatamIdoing (talk) 22:19, 14 November 2022 (UTC)[reply]
That's not the community intervening on your 5000-article plan, that's a single article being sent to AfD. That wouldn't prevent you from continuing to create those 5000 articles, and the community would not do anything whatsoever about you until a much larger number had been taken to AfD, by which point you very well could have 1000+ articles. JoelleJay (talk) 02:35, 16 November 2022 (UTC)[reply]

Revised suggested wording

Another go at a final wording.
Which proposed definition of mass creation should we adopt?
Please rank your choices by listing, in order of preference from most preferred to least preferred. Preferences, weighted by strength of argument, will be resolved through IRV.
A: A single editor creating a large number of articles based on boilerplate text and referenced to the same small group of sources.
B: A single editor creating more than 100 articles based on boilerplate text and referenced to the same small group of sources.
C: A single editor, rapidly creating a large number of articles, based on boilerplate text and referenced only to the same small group of sources.
D: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced only to the same small group of sources.
E: None of the above

BilledMammal (talk) 18:58, 13 November 2022 (UTC)[reply]
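For readers who want to see how option D's numeric limits would apply to an editing history, here is a minimal Python sketch. The proposed wording does not say whether "per day", "per week" and "per month" mean calendar periods or rolling windows; rolling windows of 1, 7 and 30 days are an assumption here, and `exceeds_rate_limits` is a hypothetical helper, not an existing tool. It also ignores the boilerplate and sourcing conditions, which cannot be checked from timestamps alone.

```python
from datetime import datetime, timedelta

# Limits taken from option D. The window lengths (rolling 1, 7 and 30 days)
# are an assumption; the proposal does not define them.
LIMITS = [
    (timedelta(days=1), 10),
    (timedelta(days=7), 20),
    (timedelta(days=30), 50),
]


def exceeds_rate_limits(creation_times, limits=LIMITS):
    """Return True if any rolling window holds more creations than its limit."""
    times = sorted(creation_times)
    for window, limit in limits:
        start = 0
        for end, t in enumerate(times):
            # Move the window start forward until it is within `window` of t.
            while t - times[start] >= window:
                start += 1
            if end - start + 1 > limit:
                return True
    return False


base = datetime(2022, 11, 1)
one_a_day = [base + timedelta(days=i) for i in range(100)]
three_a_day = [base + timedelta(days=d, hours=h) for d in range(7) for h in (0, 8, 16)]
eleven_in_a_day = [base + timedelta(hours=i) for i in range(11)]
```

Under these assumptions, one article a day for 100 days stays under every limit, while three a day for a week exceeds the 20-per-week limit (21 articles in a rolling seven days) and eleven in one day exceeds the daily limit.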

I could go with that. Scolaire (talk) 19:51, 13 November 2022 (UTC)[reply]
Also fine with me, I'm for getting this over and done with. -- LCU ActivelyDisinterested transmissions °co-ords° 20:04, 13 November 2022 (UTC)[reply]
I would like to see the option for "A single editor, creating more than 100 articles per week" restored. I can imagine someone balking at the 50 per month limit (that's less than two per day) and still wanting to set some sort of rate limit.
I would also like to see an option that does not require "boilerplate text" or "same small group of sources". In fact, that could be separated out, which would result in two questions:
  1. What speed or volume qualifies as "mass creation"? (with all of the options)
  2. MASSCREATE has historically been limited to "automated or semi-automated content page creation". Should the rules around mass creation require pre-approval for "articles based on boilerplate text and referenced to the same small group of sources", or for all articles regardless of content? (If "yes", then the words "based on boilerplate text and referenced to the same small group of sources" will be appended to whichever choice in #1 is chosen; if "no", then they won't, and all articles, regardless of content, would count towards any limit that is adopted. Neither choice directly affects the existing wording about "automated or semi-automated".)
Finally, I think we need to add some context. I suggest something along these lines:
"Relatively few editors create more than one article in a day, and except for m:100wikidays, few have ever created more than 100 articles. However, people have had good-faith disagreements over whether certain editing patterns should be subject to the restrictions in WP:MASSCREATE. The Wikipedia:Arbitration Committee/Requests for comment/Article creation at scale RFC concluded with a recommendation that the wording be clarified, to help editors understand which situations are covered by this rule."
This should give editors who haven't been following this for months some idea of why they're getting this question now. WhatamIdoing (talk) 22:54, 13 November 2022 (UTC)[reply]
That first section appears to lead the question. I think editors should add any statistics to their replies. -- LCU ActivelyDisinterested transmissions °co-ords° 00:30, 14 November 2022 (UTC)[reply]
I do agree about the removal of the higher limit option. There appeared to be a lot of discussion in the original RFC about where it should be set, and the low limit / high limit options give that a voice. -- LCU ActivelyDisinterested transmissions °co-ords° 00:33, 14 November 2022 (UTC)[reply]
The higher limit I removed because the list was getting too long and there is little difference between it and declaring that mass creation does not exist; over the past few years I believe it would only apply to two editors, Lugnuts and Estopedist1, both of whom would probably have been able to game it. If such an option is desired I would suggest instead asking Should mass-create be abolished as it will have the same result with less WP:CREEP. BilledMammal (talk) 01:55, 14 November 2022 (UTC)[reply]
  • This is all still junk. Every definition seems to rely upon the phrase boilerplate text as if that is supposed to be clear. But all articles use boilerplate. We call them templates and they are routinely used for citations, infoboxes, navigation templates and more. And if an article is a stub, it is likely to have a fairly clichéd phrasing as we have a house style in which articles tend to follow a common pattern for that type of topic.
And talk of a small number of sources is nonsense too as what matters is the quality of sources not their number. For example, see WP:ANYBIO which makes it very clear that "The person has an entry in a country's standard national biographical dictionary" then that's a sure sign of notability. One just needs a single good reference work like that to generate articles with guaranteed notability.
Andrew🐉(talk) 00:34, 14 November 2022 (UTC)[reply]
I'd be happy to hear suggestions for better wording, the current wording is a bit clunky. -- LCU ActivelyDisinterested transmissions °co-ords° 01:03, 14 November 2022 (UTC)[reply]
Just asking about boilerplate/sources separately should address Andrew's concern. WhatamIdoing (talk) 22:25, 14 November 2022 (UTC)[reply]
The intent is to only cover articles that are entirely based on boilerplate text. I believe it is clear and that most editors here have understood that, but I'm also happy to hear suggestions for better wording.
And talk of a small number of sources is nonsense too as what matters is the quality of sources not their number. This RfC is to determine what mass creation is; determining whether specific examples of mass creation are problematic is a different discussion. I would agree that an editor working through the Oxford Dictionary of National Biography to create boilerplate stubs on every individual currently lacking an article is not an issue, assuming the boilerplate is suitable, but they are still engaged in mass creation. BilledMammal (talk) 01:55, 14 November 2022 (UTC)[reply]
No one is arguing the mere presence of templates automatically makes an article "boilerplate", that would be stupid. But if editors are making many stubs with only boilerplate text, using the same sources, why shouldn't that be considered mass creation? Also, this RfC is still under the umbrella of the 3B option, so, as explained to you earlier, articles that do not need to meet GNG are not even covered here. JoelleJay (talk) 02:48, 16 November 2022 (UTC)[reply]
  • Why does the presence or absence of "boilerplate text" matter? Do we have a clear, relevant and unambiguous definition for boilerplate text in this context? Has it been accepted by the community as suitable for this purpose? If so, where is it? The Wikipedia article Boilerplate text is unsuitable as it may change and was not written for this purpose. If we do not have a suitable, accepted-as-policy definition, then trying to define mass creation in terms of boilerplate text is inherently futile. So "E: None of the above" is the only viable option. · · · Peter Southwood (talk): 13:34, 16 November 2022 (UTC)[reply]
    Of course we don't have a clear definition of boilerplate text. Also, what people mostly seem to care about is very short articles ("Alice Athlete was a Ruritanian gymnast in the 1928 Olympics."), where the "boilerplate text" is identical to the best practice as outlined in Wikipedia:Manual of Style/Lead section#First sentence. The actual, guaranteed-to-be-mass-created-boilerplate articles that we have approved a bot to create in the past (e.g., thousands of multi-paragraph articles on US cities) don't seem to be meant to be included. WhatamIdoing (talk) 17:29, 17 November 2022 (UTC)[reply]
  • If a text and format formula is acceptable for one article, why would it not be acceptable for another article on the same class of topic? · · · Peter Southwood (talk): 13:52, 16 November 2022 (UTC)[reply]
    As per BilledMammal, This RfC is to determine what mass creation is; determining whether specific examples of mass creation are problematic is a different discussion. The proposed definition encompasses both good and problematic mass creation, so there will be plenty of articles based on boilerplate text and few sources that have no threat of being challenged. JoelleJay (talk) 02:26, 17 November 2022 (UTC)[reply]

Clearly mass creation can be done without "boilerplate text", which makes all 5 of the choices not work. But I think that good core elements would be larger numbers of articles with very little content that is unique to the article. North8000 (talk) 18:02, 17 November 2022 (UTC)[reply]

  • So if I have a database or two and just start copying simple facts from them, without using a boilerplate for the text, none of this applies? — Rhododendrites talk \\ 14:13, 21 November 2022 (UTC)[reply]
    • Can you give a single example of a large number of articles that do nothing but copy simple facts from a database or two, but format the text differently in each article? Scolaire (talk) 11:06, 22 November 2022 (UTC)[reply]
  • The structure of D is fine, but the levels are too low. 20 articles per week is less than 3 a day, and 50 articles per month is less than 2 a day. That is a ridiculously low threshold for "mass creation." 10 in a day is arguably reasonable, but if someone creates 15 in one day and then stops, that is hardly difficult to deal with, even if the articles are problematic. D should be reworded as 20 per day, 50 per week and 100 per month, or else that should be a separate option. Rlendog (talk) 16:21, 25 November 2022 (UTC)[reply]
    There was a separate option with a higher limit, (A single editor, creating more than 100 articles per week, based on boilerplate text and referenced only to the same small group of sources.), but it was removed. -- LCU ActivelyDisinterested transmissions °co-ords° 22:36, 27 November 2022 (UTC)[reply]
    I think that was a mistake. I, for one, cannot support the limits in the proposal but would be able to support something like what I proposed. If we do not offer more liberal choices and if other editors feel as I do, this will likely be shot down, even though its structure has merit. Rlendog (talk) 00:04, 28 November 2022 (UTC)[reply]

Close/open?

@Valereee and Xeno: This discussion has ground to a halt. Do you still intend to run an RfC? If so, do you want to declare this closed and decide on a question based on all of the above? It seems nothing is going to get universal support. Scolaire (talk) 16:02, 25 November 2022 (UTC)[reply]

Since it’s the idea lab, I don’t feel particularly strongly about needing to formally close it. If the discussion well has run dry, it can be archived naturally. If Valereee or I (or anyone else, for that matter) can glean something useful to spin into another RfC then we will do so :) [I haven’t had a chance to read it in depth yet!]. I do want to thank everyone for contributing their thoughts here. –xenotalk 18:09, 25 November 2022 (UTC)[reply]
Sorry for the delay. I think we probably do need to run an RfC on this, and it's probably necessary before we can run the RfC on article deletions at scale. We're in a pickle, here. Valereee (talk) 14:22, 1 December 2022 (UTC)[reply]
@Valereee and Xeno: Since it’s the idea lab, I don’t feel particularly strongly about needing to formally close it. By the same token, there's no need necessarily to try and ascertain what the "consensus" is. That's why I started this thread. There's nothing to stop the two of you from putting your heads together and deciding on a question. This could mean choosing one out of all the proposals or, more likely, framing a different question based on your reading of the discussion. Whatever question is asked, participants are going to suggest alternatives anyway. I'd say, just go for it. Scolaire (talk) 12:01, 2 December 2022 (UTC)[reply]

Proposals to improve corporation-related coverage

In September, I came across Amazon (company), which was in a pretty poor state (now significantly better). Currently, many of our corporation-related articles have significant issues, including inconsistent structures, excessive length, and relative confusion about the inclusion criteria. Our "Criticisms of ... Inc." articles are pretty uniformly bad, with poor structures, poor readability, and very unclear criteria for inclusion. These problems aren't isolated to specific articles, so I'd suggest we try to address them systemically, to reduce overall editor workload.

Two ideas I propose are:

  • Creating a topic-specific manual of style guideline for company-related topics, which would give clear examples of what's due/undue, how to treat controversy sections (since WP:POVFORK is harder to apply to these articles), and would propose an outline (history, products & services, corporate affairs, etc.) to promote consistency. This should ideally cover both main articles on corporations, and criticism-specific splits. We already have equivalent MoS pages for television or music-related articles, for example.
  • More tentatively: clarifying or expanding WP:POV (given the surprising obscurity of WP:PROPORTION among editors of corp-related articles) to clarify its application here. It's currently very clear on scientific issues and BLPs, but when it comes to corporations, many editors interpret WP:DUE to support inclusion of practically every controversy that's been covered in reliable sources. When a new controversy appears, many editors indiscriminately add it to the relevant Wikipedia page, resulting in ever-lengthening articles and posing a maintainability burden. It wastes a lot of editors' time to selectively clean up these additions, and potentially need to argue at length in favor of each removal, if anyone disputes them and the content has been in the article for a while.
    • The policy's lack of clarity also means that editors who favor removing such content must base their arguments on inherently ambiguous and less well-known policies like WP:VNOT, WP:NOT, and WP:10YT. (I'll note that WP:RECENTISM is sadly not policy).
    • Policy changes may not be required, but perhaps we could add article notices pointing to WP:PROPORTION that people would see when they click Edit; I'm sure others will have better ideas on systemic ways to minimize editor workloads for these articles.

(As a minor side-note, I want to invite editors to check out Google Chrome version history, iOS version history, and List of iOS and iPadOS devices, since those are somewhat corporation-related too. WP:INDISCRIMINATE relies mostly on editors using "common sense", which seems insufficient. Many of our version history articles are extremely long and copy release notes verbatim, which may constitute copyvio. And many "list of devices" articles are far too detailed and hard to browse (especially with accessibility software). Again, this may not require new policies, but we should likely clarify existing ones or make them more prominent. Please discuss this side-note in a separate subsection to maintain clarity). DFlhb (talk) 20:32, 3 November 2022 (UTC)[reply]

  • One of the few things from WP:N that should apply to article content as well as article existence is WP:SUSTAINED, in particular if something only shows up in a 24 hour news cycle, but then is never mentioned again in any reliable sources; it's likely not WP:DUE. WP:DUE should deal both with how much mainstream sources cover a topic, but also should have a temporal element. If no one is writing about it after the fact, then it likely isn't worth reporting; it doesn't have any historical significance. I have no problem with Wikipedia using current events to add to an article, but I also think that one of the defenses for removing something from an article should be "It's been X years since this happened, and no reliable source has ever covered it since". They say that journalism is the first draft of history, but if it doesn't make it to a second draft, it's probably not worth keeping in an article... --Jayron32 12:03, 4 November 2022 (UTC)[reply]
    I would strongly support this; it would likely increase the signal-to-noise ratio of content additions. DFlhb (talk) 13:13, 4 November 2022 (UTC)[reply]
    This discussion really should be on VPI - it's got a lot of significant changes considered, which would need to be fully spelt out to be taken to VPP as fully formed proposals. Nosebagbear (talk) 13:29, 4 November 2022 (UTC)[reply]
    This angle really needs to be placed into WP:DUE, adding that viewpoints/coverage that only come from a short-term burst of news should not be weighed as heavily as views/coverage that come from the long-tail of some event. Effectively this is codifying parts of WP:RECENTISM that, in how many editors edit WP today, gets readily ignored as facts are added the second something changes without looking towards the longer-term narrative. Masem (t) 14:09, 4 November 2022 (UTC)[reply]
The first and foremost concern I have with any changes to how we cover companies, is that we not permit any kind of entering wedge for corporate PR departments and other hired guns to cleanse and polish their clients' reps. --Orange Mike | Talk 14:04, 4 November 2022 (UTC)[reply]
Yes, but on the other hand, we should not be a catalogue of every single time a company is mentioned in a newspaper. If there was a fire in an Amazon warehouse in 2009, even IF a newspaper wrote an article about it, we don't need to have that event in the Amazon article. How do we decide which events are important enough to cover? We discuss them. There's a middle ground between "allowing corporate shills to cleanse Wikipedia of all negative information" and "cataloguing every single time a company is mentioned in any reliable news article ever". We should not allow the latter to happen merely because we're frightened of the former. --Jayron32 15:00, 4 November 2022 (UTC)[reply]
In my experience most articles on companies suffer from the opposite problem. Amazon is probably a bad example; being one of the largest organisations in human history, it probably attracts a bit more attention than the average company. Usually very few editors are motivated to edit these articles proactively – except for those with an interest in promoting the company, who we'd rather not. The result is that negative coverage gets swept under the rug in favour of peacocky corporate histories and overly-detailed product catalogues (some random examples from Category:Companies in the Nasdaq-100: Amgen, Microchip Technology, Xcel Energy, Intuit). The proliferation of "controversies", "legal issues", etc. sections is a symptom of this. Amazon's appalling treatment of its workers, for example, isn't some free-floating, subjective "criticism" of the company, it's a well-documented fact that is an essential part of its history. Facts like these should be presented where they're relevant, throughout the article text, not relegated to a few lines in a section at the bottom of the page because they happen to reflect poorly on the subject. – Joe (talk) 15:32, 4 November 2022 (UTC)[reply]
What generally happens with "controversies" sections is that you get something that a company did once, got a lot of short term criticism at it, and some WP editor judges that all a "controversy". For example, YouTube recently had temporarily put 4K videos behind its subscription tiers but two weeks later reverted that. Hypothetically, with the way some WP articles are written, that could have been presented as a "4K video controversy" on the WP article, but in reality the change itself likely merits no mention, and if anything, that could be incorporated as criticism related to YouTube's "changes on the fly" practices.
Absolutely there are valid criticisms and controversies like Amazon's that are mentioned above, but that has the long-tail of coverage and didn't come out of an incident lasting only a couple days. Masem (t) 15:37, 4 November 2022 (UTC)[reply]
Joe, you do realize that it is quite possible for two things to be wrong at the same time, right? Yes, where controversial corporate practices have received widespread, sustained, coverage then it is likely that is worthwhile to mention in an article. No one, not anyone, zero people, are talking about that sort of thing. What we're talking about here is the sort of minor, one off, singular events that are minor blips in the history of a company getting out-sized attention in an article. It's quite possible for us to BOTH cover the sort of long-term covered controversial corporate practices that maybe aren't in these articles, AND to avoid the "minor incident of the day" reporting that these articles fill up with, often instead of the actual real stuff we should be covering. --Jayron32 15:43, 4 November 2022 (UTC)[reply]
@Jayron32: You seem to be under the impression that my comment was intended to be a rebuttal of or challenge to you, and all I can suggest is that you read it again. – Joe (talk) 15:51, 4 November 2022 (UTC)[reply]
Not at all. I don't form impressions. It's not worthwhile. --Jayron32 15:55, 4 November 2022 (UTC)[reply]

I will say that this goes both ways. At Sembcorp, for example, several years of UPEs Gone Wild had turned a fossil fuel company into a socially conscious innovator devoted to positive social change #wow #whoa. On the other hand, on Target there was once an entire section devoted to an incident of somebody sneaking into a Target and playing porno clips over the speakers, once, about a decade ago. jp×g 11:43, 17 November 2022 (UTC)[reply]

I'll let this discussion die after this, but others may be interested in Talk:Criticism_of_Apple_Inc.#Strays_away_from_WP:POVFORK. I'm wondering why WP:POVFORK seems to be so rarely applied to non-BLPs. It just makes for way worse articles than what we could have. DFlhb (talk) 09:01, 3 December 2022 (UTC)[reply]

Archived discussions are not being de-archived

Hi everyone, there is a warning on this page: "Discussions are automatically archived after remaining inactive for two weeks." It must be adjusted with a statement "And they do not get returned, whether they are active or not after that period!" If it is easy to implement such returning, that'd be great, but it seems I'm asking for too much.
I've verified that by replying to an idea raised almost six years ago. I have the same idea, and I found it via searching, as suggested at the top of this page. (Yeah, I follow algorithms, and this is an old rule to search before asking). My initial idea is discussed at this link: https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(idea_lab)/Archive_22#A_new_button_%22contest_edit_on_talk_page%22
Tosha Langue (talk) 09:48, 24 November 2022 (UTC)[reply]

Why would that need to be stated? Why would anyone expect discussions to be unarchived? Just start a new discussion.--User:Khajidha (talk) (contributions) 19:02, 24 November 2022 (UTC)[reply]
@Khajidha, I (perhaps) made a tough claim, and you replied with rhetorical questions. Sorry for that! I'll try to explain. I didn't mean that it should be done as I proposed (I assumed it just as an idea).Tosha Langue (talk) 09:07, 25 November 2022 (UTC)[reply]
If you want to revive an archived discussion, you can just do that as well. — xaosflux Talk 01:16, 25 November 2022 (UTC)[reply]
@Xaosflux: that's understood, but how? May I just copy and paste the whole discussion to a new one? To make a clone, so to speak... Or I should retell what happened (to make a recall), either to provide a link to an archived discussion in a passing manner (as I've already done), what is acceptable, what is better? Tosha Langue (talk) 09:22, 25 November 2022 (UTC)[reply]
@Tosha Langue "it depends". In general, if a discussion was recently archived (it would be in the most-recent archive) because it was stale, then just cut-paste it from the archive back to the main and continue it. If it wasn't so recent, just start a new discussion and link to the old one. What you should not do is reply in the archive as you did above, the archive says Please do not edit the contents of this page. If you wish to revive any of these discussions, either start a new thread or use the talk page associated with that topic. If you want to talk about that idea from 2016, just start a new thread as the direction says. — xaosflux Talk 10:12, 25 November 2022 (UTC)[reply]
Indeed. I've gone and reverted your edit to that archive. Graham87 07:59, 26 November 2022 (UTC)[reply]
But I overleaped right to that section from a search result, and didn't notice the warning. Anyway, thanks, @Graham87 Tosha Langue (talk) 15:21, 26 November 2022 (UTC)[reply]
Archives do not have section editing links to prevent such accidents. Unfortunately, the "reply" tool works on pages where section editing is disabled, making it easy to accidentally edit archives. —Kusma (talk) 19:18, 26 November 2022 (UTC)[reply]
According to phab:T249293, people are working on a new magic word to prevent "reply" links on archive pages. —Kusma (talk) 20:46, 26 November 2022 (UTC)[reply]
Marginally related to this, it would be awesome if there was some mechanism for links to discussion threads to continue to work when the thread gets archived. I could imagine something like every new section heading getting a UUID, and a "link to this" clickable thing next to the header. Then the various archiving tools could maintain some sort of forwarding map which could be consulted on the fly to get you to the right archive automatically with the same URL that got you to the pre-archived live thread. -- RoySmith (talk) 20:54, 26 November 2022 (UTC)[reply]
Shamelessly plugging Wikipedia:Convenient Discussions which automatically searches archives for you ■ ∃ Madeline ⇔ ∃ Part of me ; 21:01, 26 November 2022 (UTC)[reply]
@RoySmith, permalinks are planned, but I don't know how soon we'll get them. I believe that a database is being filled this month. The current link structure looks like this: Wikipedia:Village pump (idea lab)#c-RoySmith-20221126205400-Kusma-20221126204600 (for your comment). I'm not sure whether this will change. Whatamidoing (WMF) (talk) 02:02, 29 November 2022 (UTC)[reply]
Cool, thanks! -- RoySmith (talk) 02:09, 29 November 2022 (UTC)[reply]
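
As a rough illustration of the forwarding-map idea RoySmith describes above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the class name ThreadLinkMap, its methods, the example thread title, and the "Archive 99" page are hypothetical, and none of this reflects how MediaWiki, the archiving bots, or the planned permalinks actually work.

```python
import uuid


class ThreadLinkMap:
    """Hypothetical forwarding map from stable thread IDs to current locations."""

    def __init__(self):
        # thread_id -> (page_title, section_anchor)
        self._location = {}

    def register(self, page_title, anchor):
        """Assign a fresh ID when a new section heading is created."""
        thread_id = str(uuid.uuid4())
        self._location[thread_id] = (page_title, anchor)
        return thread_id

    def archive(self, thread_id, archive_page):
        """Record that an archiving tool moved the thread to another page."""
        _, anchor = self._location[thread_id]
        self._location[thread_id] = (archive_page, anchor)

    def resolve(self, thread_id):
        """Return the link target for the thread, wherever it lives now."""
        page_title, anchor = self._location[thread_id]
        return f"{page_title}#{anchor}"


links = ThreadLinkMap()
tid = links.register("Wikipedia:Village pump (idea lab)", "Example thread")
live = links.resolve(tid)

# Hypothetical archive page name, used only for the example.
links.archive(tid, "Wikipedia:Village pump (idea lab)/Archive 99")
archived = links.resolve(tid)
```

The point of the sketch is only that the link shared with readers would carry the ID rather than the page name, so the same link keeps working after the thread is moved.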

Idea: Enable the AbuseFilter blocking action

Background: Edit filter actions

The edit filter (or AbuseFilter) is a tool that allows editors in the edit filter manager group to primarily set automated controls to address common patterns of harmful editing.

When an edit filter's pattern matches an edit, a number of actions can be configured to be triggered. On the English Wikipedia, we have enabled the following edit filter actions:

  • Warning—The user is warned that their edit may not be appreciated, and is given the opportunity to submit it again.
  • Disallowing—Actions matching the filter will be prevented.
  • Revoking auto-promoted groups—Actions matching the filter will cause the user in question to be barred from receiving any auto-promotion (i.e. autoconfirmed / extended confirmed) for 5 days.
  • Tagging—The edit or change can be 'tagged' with a particular tag, which will be shown on Recent Changes, contributions, logs, new pages, history, and everywhere else.

Initial idea

If progressed to a RfC, I would like to propose that the English Wikipedia enables the blocking action for our edit filters (as, for example, Meta Wiki has done). When enabled, and explicitly set as an action, any users matching the filter will be blocked for the time specified, with a descriptive block summary indicating the rule that was triggered. Some specifics about how this would work were recently discussed at the edit filter noticeboard, but rather than move to a RfC at WP:VPR, I thought it best to gain wider feedback and ideas from the community.

Currently, edit filters set to disallow must be tested in logging mode first. The enabling of the disallow action is then announced on the edit filter noticeboard for review. For a filter to have the blocking action enabled, I would propose that it would need to be rigorously tested; first with the logging action, and then with the disallowing action (with attendant announcement to WP:EFN), before the blocking action could be enabled.

I welcome comments, suggestions and feedback on the idea. Of particular interest are the concerns raised at that EFN discussion around how we would handle non-admin EFMs and false-positive blocks. — TheresNoTime (talk • they/them) 11:47, 29 November 2022 (UTC)[reply]
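
The staged rollout proposed above (logging first, then disallowing with an announcement at WP:EFN, and only then blocking) can be sketched as a small state machine. This is a hedged illustration only: the Action enum, the FilterRollout class, and its method names are hypothetical and are not part of the AbuseFilter extension or any on-wiki tooling.

```python
from enum import Enum


class Action(Enum):
    LOG = "log"
    DISALLOW = "disallow"
    BLOCK = "block"


class FilterRollout:
    """Hypothetical model of the proposed log -> disallow -> block sequence."""

    def __init__(self):
        # Every filter starts in logging-only mode.
        self.action = Action.LOG

    def enable_disallow(self, announced_at_efn):
        """Move from logging to disallowing; requires an EFN announcement."""
        if self.action is not Action.LOG:
            raise ValueError("disallow can only follow the logging stage")
        if not announced_at_efn:
            raise ValueError("enabling disallow must be announced at WP:EFN")
        self.action = Action.DISALLOW

    def enable_block(self):
        """Move from disallowing to blocking; the disallow stage cannot be skipped."""
        if self.action is not Action.DISALLOW:
            raise ValueError("block can only follow the disallow stage")
        self.action = Action.BLOCK


rollout = FilterRollout()
try:
    rollout.enable_block()  # attempt to skip straight from logging to blocking
    skipped_stage = True
except ValueError:
    skipped_stage = False

rollout.enable_disallow(announced_at_efn=True)
rollout.enable_block()
```

The sketch does not model how long each stage should run or what counts as "rigorously tested"; those are exactly the policy details the idea leaves open.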

  • Here is an example of what the blocks look like: meta-wiki AF blocks
  • Note, the AF can not leave a "talk page" notice about the block. Generally the editor will get the Abuse Filter notice when they trip the filter; and then they would get the standard blocked interface if attempting to edit during the block.
xaosflux Talk 15:04, 29 November 2022 (UTC)[reply]
Not sure this is a great idea on enwiki. I've seen several instances of people complaining about incorrect blocks on wikis that use this feature (one memorable instance involved a misconfigured filter on enwikinews that would indef any non-autoconfirmed user who tried to edit a page containing the word "contact", which meant that people trying to report the false positive would also get blocked). Enwiki's sheer number of editors makes false positives more likely and I suspect that our ratio of genuine newbies vs. spambots and LTAs is much higher than sites like meta and wikinews. Enwiki also has a large number of active admins compared to most other wikis, which makes me wonder if this is really necessary. If this is ever implemented I think it should be limited to short-term blocks (a few hours at most) which are immediately reported to somewhere like AN or AIV for review. Spicy (talk) 16:04, 29 November 2022 (UTC)[reply]
I'm not convinced this is necessary. An edit filter set to "disallow" should already stop a spambot attack, and blocking in such a case isn't urgent. On the other hand if an editor/bot makes some edits that are disallowed and some edits that are let through, their edits will likely need human review, and reporting to WP:AIV via DatBot provides that already.
Of course, given the state of WP:RFA, we are likely to need this in ten years. But I hope we can avert that. —Kusma (talk) 17:09, 29 November 2022 (UTC)[reply]
It was an earth-shattering experience to get auto-blocked at https://meta.wikimedia.beta.wmflabs.org/ just because I added a link to my enwiki userpage, and it was not even my home wiki, just a test account, that led to this unblock request to TNT on their meta talk page. If this proposal passes, there should be a very strong check on its usage, possibly mandating a minimum number of admin support votes before doing anything. CX Zoom[he/him] (let's talk • {CX}) 17:22, 29 November 2022 (UTC)[reply]
Removing blatant vandalisms is the worst use of an admin's or patroller's time, and somehow we always choose to show more love towards vandals than towards these hard working people. An IP adding 100k of text, C&P-ing "your mom" 30 times, swearing, adding emojis (...) is likely to continue vandalizing the project, one way or another, flooding the filter along the way, and should be stopped... be it just for 2 hours - trust me, it helps! Full support, Ponor (talk) 18:24, 29 November 2022 (UTC)[reply]
Removing blatant vandalism (which is very rare these days compared to pre-edit filter times) is quick and easy work. Winning back an editor who has been incorrectly blocked by an imperfect filter is hard (or even impossible) and takes a long time, so a blocking filter would need to be absolutely perfect in order to be worth it. Admins try to choose the best block parameters (what range to block, from which pages, soft or hard). If we need to review and adjust filter blocks, this doesn't actually help all that much. —Kusma (talk) 19:15, 29 November 2022 (UTC)[reply]
Fair point, @Kusma. You know "your" vandals better than I, and I'm sure enwiki's many filters are already doing a great job. So blocking would be good (@TheresNoTime?) to prevent EF log flooding, and what else? Ponor (talk) 19:45, 29 November 2022 (UTC)[reply]
  • This is going to be a hard no from me. I'm okay with the automated "disallow and report" options that allow AIV to be notified by a bot so an admin can block the account in question, but I still always, 100% of the time, with no exceptions, want an actual human to review the case and make the decision to block. It still requires human nuance to implement a block, IMHO. --Jayron32 19:20, 29 November 2022 (UTC)[reply]
  • I'm with Jayron32 on this. Humans should do the blocking, after reviewing the situation. We have plenty of people here to address these kinds of issues; it's not like we're a small wiki with 40 editors and one or two admins. And we have so many edit filter creators and managers that there's a strong possibility of misuse (intentional or inadvertent). Risker (talk) 19:43, 29 November 2022 (UTC)[reply]
  • There's an incorrect premise that's built into many of the comments here: If we just try really really hard, our filters will have precisely zero false positives. Sorry, Wikipedia is a complex place, and EFMs are only human. There will be false positives. With six million articles, and about 100 edits per minute, you just can't think of everything. Using Ponor's examples, The IP adding 100k of text will be the one reverting a page-blanking vandal. The IP "swearing" is quoting a song title. The one adding emojis is quoting a United States Senator.
    And it gets worse for the LTA filters. You can test the filter all you want, and then five minutes later, the LTA-who-says-"Ni" is now the LTA-who-until-recently-said-"Ni". You've got to modify the filter, and if you have to extensively test it again, you might as well not bother.
    If we're going to go forward with this, we need to acknowledge that a block is two things: (A) A technical measure preventing a user from editing, and (B) A social "stain" on that user's record. Maybe you don't judge people by the length of their block logs. Good for you. But Wikipedia, collectively, does do this.
    What I'd suggest is that AbuseFilter blocks do not appear in the user's block log, at least after they have expired. Whether that's accomplished through a software change, or bot that revdels the log of User:Edit filter, I don't care. But consider me opposed otherwise. Suffusion of Yellow (talk) 19:47, 29 November 2022 (UTC)[reply]
  • I realise that this is a place to be positive, but it's hard to think of an edit that is never constructive or at least tolerable. The furthest we should go is to make a list of edits for which an admin should consider a manual block, but I think the EF logs already provide that data. Certes (talk) 20:01, 29 November 2022 (UTC)[reply]
  • Do we have any use cases in mind? That is, any specific filters that would be good candidates to eventually change to block? Perhaps a concrete example would help. –Novem Linguae (talk) 20:03, 29 November 2022 (UTC)[reply]
    @Novem Linguae I've written filters in the past which contained such absurdly specific or nonsensical strings of text that they ran for months and months without a single false positive, and the user responsible was always blocked. Some examples include "smoothest ashu" (Special:AbuseFilter/690) and "Brian Toussaint Thompson" (Special:AbuseFilter/674). Because those blocks had to wait for me as an administrator to notice that the filter was being tripped, the user in question had fair chance to attempt to circumvent the filter. Blocking would reduce the effectiveness of that. That said, while I've advocated for exploring the use of the blocking function in the past, I understand the hesitance - it is easy to make mistakes with edit filters, and the testing and warning mechanisms aren't great at preventing them. While disallowing a few good edits might be an acceptable mistake, I agree that the risk of outright blocking good faith users is a scarier prospect. I'm personally open to an RfC on this but the policies around its use would need to be restrictive. Sam Walton (talk) 10:31, 1 December 2022 (UTC)[reply]
  • I agree with Jayron32 and Risker. Human discretion is essential when blocking editors. Cullen328 (talk) 20:14, 29 November 2022 (UTC)[reply]
  • Hearing y'all loud and clear on this — no need for a RfC. The main use case probably would have been very automated attacks from IPs/newly created accounts (the sorts of attacks where an IP will repeatedly attempt to do something very obviously disruptive) but thankfully we can (now) mitigate them well before they can even make an edit, so the usefulness of a blocking filter may be moot. As always, thank you all for the comments and intelligent suggestions — TheresNoTime (talk • they/them) 20:33, 29 November 2022 (UTC)[reply]
  • Before doing this, I would like to see some proposed situations where this would apply. In the past I have blocked hundreds of spambot ips. But not every detection by filters was obviously a spambot. Some may have been due to some weirdness with web browsers. Filters were already very good at stopping the disruption being saved. Only occasionally did spambot edits get saved. Graeme Bartlett (talk) 10:14, 1 December 2022 (UTC)[reply]

Article Editing Reviews

As I'm sure many of you know, Wikipedia is often made into a meme because the site is considered an unreliable source. One of the main reasons for this sentiment is given in [4] (https://www.dw.com/en/fact-check-as-wikipedia-turns-20-how-credible-is-it/a-56228222#:~:text=So%20is%20Wikipedia%20a%20credible,manipulated%2C%20and%20sometimes%20almost%20undetectably.): "So is Wikipedia a credible source? Many of the entries are well-documented, checked for quality and — as opposed to reference books — often completely up-to-date, but, 20 years after its creation, the online encyclopedia is not 100% reliable, because information can be manipulated, and sometimes almost undetectably."


So here is a list of ideas I have that correlate with each other:

-A certain amount of users must review an edit before it is included in the article.

-Have articles be reviewed by experts of their respective field, and have an article be marked with a token of approval from one or multiple experts. (i.e. "Expert-Certified")

I think we may need different certification tokens depending on HOW many approve of the article.


-Articles may also be reviewed constantly by others for grammar, vocabulary, etc. 198.175.205.71 (talk) 15:13, 30 November 2022 (UTC)[reply]

Wikipedia:Pending changes, when used, requires that a pending changes reviewer review an edit before it will appear to other users. The community has rejected any use of pending changes except in exceptional cases where necessary to protect articles subject to high and continuing levels of vandalism. Due to the limited number of pending changes reviewers, it would be disruptive to apply pending changes to all articles. Remember, we are all volunteers, and we cannot expect that large numbers of experienced users would dedicate a large amount of their time on Wikipedia to approving pending changes. Various levels of protection, detection and reversal of obvious vandalism by bots, and the marking of suspect edits in watchlists, provide multiple layers of protection from vandalism. Detection of incorrect information in articles often requires scrutiny by users with particular knowledge of the subject area, and requiring that a subject-area specialist review pending changes would mean that most pending changes would not be reviewed for months or years, or maybe never. I will also note that the community has in the past rejected the idea of certifying experts in a field. As for your last suggestion, we do have a lot of Wikignomes constantly fixing all kinds of things. Donald Albury 15:48, 30 November 2022 (UTC)[reply]
(1) there are two fundamental ways to make an encyclopedia reliable: make sure it's written by someone who knows what they're doing ("Britannica"), or make sure it's proof-read by as wide a range of experts as possible ("Wikipedia"). There's a risk that what you're proposing would combine the worst of both worlds: a situation where the article was written by someone who didn't know what they were doing, and appears in print because it's in a niche area with few reviewers, and it's accepted by one reviewer with a wildly non-neutral point of view, and no one else can be bothered to fight through the bureaucracy to deal with the reviewer's ownership issues.
(2) In some fields it's hard enough to find someone prepared to handle an AfC, let alone provide any meaningful review based on expert knowledge; we don't want to end up like Citizendium, with its handful of articles that very few people read.
(3) we don't really need to worry about people thinking Wikipedia is unreliable. Usually this is uttered from the mouths of frustrated teachers and lecturers who don't want 30 copies of the Wikipedia article handed in as homework. They'll feel just the same whoever wrote the article. If people think Wikipedia is unreliable, that's a good thing, because it might encourage them to go and check the references for themselves. And we do have references! So although you're right the debate is always worth having, I favour no change. Elemimele (talk) 17:46, 30 November 2022 (UTC)[reply]
As a teacher, the reason I tell my students not to cite Wikipedia is not because the information can't be trusted. The reason is IT'S AN ENCYCLOPEDIA. I expect them to go deeper. They need to use the same sorts of sources we do here.--User:Khajidha (talk) (contributions) 21:22, 30 November 2022 (UTC)[reply]
You wrote that "the community has in the past rejected the idea of certifying experts in a field..."
Hmm, but for me it is obvious, that one can get acquainted with contributions of a user and get to know a field of his/her expertise. Sometimes this process happens not on purpose, and I just know a user, because I used to notice him/her here and there. I understand that I'm not a certification board, and can't give a certificate to the user. But what keeps me from attracting the user's attention (whom I find more or less expert) to one or another article/edit? What do you think as an expert in the Wikipedia ethics, @Donald Albury? Tosha Langue (talk) 09:38, 1 December 2022 (UTC)[reply]
Wikipedia editors are anonymous. Even when an editor appears to identify with a real life person, we should not rely on such identification, and outing, i.e., posting any personal identifying information about another user on Wikipedia, is defined as harassment, and may end up with the offender being blocked. Therefore, judgement about the expertise of an editor in any area is dependent on the editor's reputation based entirely on their record of editing on Wikipedia. We do not accept any claims as to expertise in real-life as relevant on Wikipedia. See Essjay controversy for one incident in which self-claims of expertise did not end well. This is not likely to change in the near future. The Wikimedia Foundation is pushing to warn new users to carefully maintain their anonymity, as some Wikipedia users in certain countries have been subject to harassment, and even arrest, for the content of their editing on Wikipedia. Donald Albury 17:09, 1 December 2022 (UTC)[reply]
To comment further, the criteria for what the content of an article states is what is supported by reliable sources, and not what any user, 'expert' or not, says. Experienced and trusted users can help judge the reliability of sources, but, in any dispute, a consensus of such users trumps any single 'expert' on the suitability of content. Donald Albury 17:21, 1 December 2022 (UTC)[reply]
There is also the issue of (often unintentional) ownership behaviour. I can think of a number of examples in scientific fields where a Wikipedia article has obviously been written by someone closely associated with the first people to develop an area. These articles are now fossils; the field has maybe developed, other people have brought different insights, and new stuff has happened, but the article cannot change, because any suggested addition is politely reverted with an edit summary that it's not an improvement because the addition is not as important as the original basic concept, or because the new development is subtly different to the original concept and therefore doesn't belong in the article. If you argue with this, the response will be that the original paper has been cited far more than the reference you're using to try to add further information (which is inevitably true, because anyone who's developed the original idea will have cited the original idea), and the subtle-difference argument cannot be refuted either, because of course the new material is subtly different; otherwise it wouldn't be new.
There are other scientific articles where I'm aware there are multiple viewpoints, and I strongly suspect the article is written by someone from one of the camps; the author honestly believes that the other camp is debunked and that anyone still maintaining that view is adopting a fringe position, and what they write is genuinely an unreliable source. But unfortunately the other camp feel the same about the first. So in the worst case scenario, the wikipedia article can be gate-kept to reflect only one viewpoint by a well-intentioned expert author. Fortunately I can only think of a handful of such articles in wikipedia, because we're very good at having a big bust-up about it, and it takes considerable effort to WP:OWN an article. I suspect if we insisted on expert reviewing, our expert reviewers would often become Owners, and we'd have a lot more fossils as well as accurate-but-one-sided articles. Elemimele (talk) 10:30, 2 December 2022 (UTC)[reply]
"Experienced and trusted users can help judge the reliability of sources, but, in any dispute, a consensus of such users trumps..."
This is it, @Donald Albury! But my impression of Wikipedia is that attracting users to help with something is considered as bad manners. We all are volunteers, and we should stay volunteer till the end, and not disturb others, right? Tosha Langue (talk) 10:39, 2 December 2022 (UTC)[reply]
It is allowed to advertise disputes in a neutrally worded notice on Wikipedia:Noticeboards, or on talk pages of users who have made more than trivial contributions to an article, or participated in previous discussions about disputes related to or similar to the current dispute, as long as it is done in accordance with the guidelines at Wikipedia:Canvassing. What will get you in trouble is canvassing only wiki-users you think will support your position, or canvassing off-wiki to bring in partisans of your cause. Donald Albury 14:23, 2 December 2022 (UTC)[reply]

Deprecate "estimated hits" from search engines

The estimated hits from search engines are not reliable or informative, can be misleading, and should be deprecated. We should reword guidelines to explicitly discourage their use in move request discussions and elsewhere.

As an example:

  • On Google:
    • "antisemitic tropes" ADL gives me 8,640 estimated hits (runs out on page 13 after 126 results).
    • "antisemitic myths" ADL: 2,850 estimated hits (runs out after only 81 results).
    • "antisemitic canards" ADL: 2,290 estimated hits (runs out after only 95 results).
  • On Bing:
    • "antisemitic tropes" ADL: 233 million estimated hits (runs out on page 15).
    • "antisemitic myths" ADL: 166 million estimated hits (runs out on page 9).
    • "antisemitic canards" ADL: 259 million estimated hits (runs out on page 10).

Notice the lowest-used term became the highest. Search engines don't (and can't) survey their entire dataset every time a user completes a search. They give you just around a hundred results using fancy-schmancy algorithms. No search engine has ever claimed these estimates were reliable, and they were never meant to be used for the purposes some editors use them for. Even if they weren't complete spitballs, they're nonselective, and count SEO sites, forums, and junk, which are not relevant to us.

One research paper[5] found that estimated counts vary wildly over short periods of time, and don't fluctuate around a central value, making them practically meaningless. Past examples of unsupported use of search engine estimates on Wikipedia can be found here: [6][7][8] (peep that first one, where HTML tags and nonselectiveness led to a very misleading estimate). No doubt you've encountered other examples. DFlhb (talk) 18:00, 1 December 2022 (UTC)[reply]
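
To make the ranking inversion in the example above concrete, here is a short Python sketch that orders the three search terms by the estimated hit counts quoted in this post. The figures are copied from the lists above; the variable and function names are only illustrative.

```python
# Estimated hit counts quoted in the example above (each query also included "ADL").
google = {
    "antisemitic tropes": 8_640,
    "antisemitic myths": 2_850,
    "antisemitic canards": 2_290,
}
bing = {
    "antisemitic tropes": 233_000_000,
    "antisemitic myths": 166_000_000,
    "antisemitic canards": 259_000_000,
}


def ranking(estimates):
    """Return the search terms ordered from most to fewest estimated hits."""
    return sorted(estimates, key=estimates.get, reverse=True)


google_rank = ranking(google)
bing_rank = ranking(bing)
engines_agree = google_rank == bing_rank

print("Google order:", google_rank)
print("Bing order:  ", bing_rank)
print("Same ordering on both engines:", engines_agree)
```

With these numbers the two engines do not produce the same ordering: "antisemitic canards" is last on Google and first on Bing, which is the inversion noted above.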

Neither "estimated hits" nor "runs out" is any good for anything. In particular, the "runs out" number for Google is totally invalid. Google limits the hits that it displays to 1,000 and then eliminates duplicates, so, for example, a search for "Donald Trump" only displays, for me at this time, 117 results. I don't know about Bing, but it seems that it uses a similar algorithm. Of course there are many reliable sources for "Apple Mac", as can be found by looking at the books found by this search, rather than just counting the results. Phil Bridger (talk) 19:29, 1 December 2022 (UTC)
I stated that "runs out" was invalid when I said search engines "can't survey their entire dataset" and "give you just around a hundred results". And you made the same mistake I pointed out for "Apple Mac"; the top books are self-published (don't count), and the search ignores formatting, returning many results that contain "Apple's Mac [...]", as well as non-context text like " Apple / Mac" and "Apple - Mac OS X" from these books' reference lists (also don't count). Which is more informative, that GBooks link, or this? DFlhb (talk) 20:24, 1 December 2022 (UTC)
  • I mean, you can't stop people from making whatever argument they wish in such discussions. You can only counter their points with points of your own. The idea that we can make some type of argument verboten and that will somehow magically stop people from trying to make it is just silly. If the use of Ghits is a bad metric for deciding an article title, then just tell people that after they try to make the argument. Wikipedia already contains guidance that recommends against using Ghits as a metric for all kinds of purposes, see WP:GHITS "Although using a search engine like Google can be useful in determining how common or well-known a particular topic is, a large number of hits on a search engine is no guarantee that the subject is suitable for inclusion in Wikipedia. Similarly, a lack of search engine hits may only indicate that the topic is highly specialized or not generally sourceable via the internet.", WP:GTEST 'Raw "hit" (search result) count is a very crude measure of importance. Some unimportant subjects have many "hits", some notable ones have few or none... Hit-count numbers alone can only rarely "prove" anything about notability, without further discussion of the type of hits, what's been searched for, how it was searched, and what interpretation to give the results...A raw hit count should never be relied upon to prove notability.", WP:UCN "When using Google, generally a search of Google Books and News Archive should be defaulted to before a web search, as they concentrate reliable sources...Search engine results are subject to certain biases and technical limitations". Wikipedia already contains the guidance the OP seeks, as shown above, in multiple places. You can't actually force people to refrain from using bad arguments. They're going to make them. Wikipedia already tells people how to not do so. I'm not sure what you want to do other than that, or why you think that "deprecation" would change anything... 
--Jayron32 20:02, 1 December 2022 (UTC)
    WP:UCRN goes out of its way to suggest using Google, and WP:GTEST incorrectly states these numbers can "Confirm roughly how popularly referenced an expression is". While it also includes your quote, no one does further discussion of the type of hits, [etc.]. It also makes me cringe that WP:UCN recommends Google Books, see my reply above. It should recommend Ngram instead; same dataset, but graphs instead of an estimated count.
    IMO, the point of Wikipedia proposals is to solve problems systemically, to avoid having to run around and tell people they're wrong in each individual discussion; I have better things to do than that. You're completely right that it won't be fully effective, but I don't see why WP:Requested moves shouldn't include something on this. DFlhb (talk) 20:24, 1 December 2022 (UTC)
    WP:UCRN goes out of its way to say avoid using raw basic Google hit numbers, though it does say that using some of the specialized Google products can be useful. If you're saying "No one should ever use Google to help find any information ever" I'm not sure that's a reasonable position to take. If you are trying to say "Avoid using raw "hits" data from a basic Google search when making decisions about things", Wikipedia already recommends that. See above. --Jayron32 12:13, 2 December 2022 (UTC)

Podcast by Wikipedia

Wikipedia has been a very important part of my life, and it makes me sad that it's been struggling recently. I've been talking with some people at my college, and we have come up with a concept to make some money for the site using volunteers.

Wikipedia is a compilation of great information on nearly everything you can think of. I think it could be worthwhile to look into creating a podcast reading over some of the most popular pages. Podcasts get some great sponsorship opportunities and are cheap and easy to put together. I think you could put out an application asking for volunteers to record the podcast. Colleges are a great place to target for things like this as well; they have the right equipment, and something like this would look very good on a resume or higher education essay.

Not everyone has time to sit down and read a Wikipedia page and I think it would be a great way for Wikipedia to continue evolving with the newer generations and a great and easy way to make money. There are many different platforms, and I've seen many podcasts move over to specific sites exclusively as a partnership.

The logistics could be hard to figure out at first, but I think this is a great thing to look into to help the financial stability of this site.

Thanks for your consideration. 97.115.228.152 (talk) 19:17, 1 December 2022 (UTC)

Just a few things in response to your idea:
  1. Wikipedia is not financially struggling. The Wikimedia Foundation (WMF), the organization that owns and runs Wikipedia among other properties, is actually doing quite well financially. As a side note, they still should ask for donations on the regular, as a continuing income stream from donations is the only way they can remain financially stable. Asking for handouts after you're already broke is a terrible way to budget. Being financially stable involves keeping a steady and reliable source of income, and pledge drives are how the WMF maintains that financial stability. So don't think that because the WMF asks for donations that means it is broke. Its only income stream is donations, so it asks for them regularly not because it is in financial trouble, but rather because it wants to avoid future financial trouble. That's what well-run organizations do from a financial point of view.
  2. A Wikipedia podcast sounds like a really great idea. If you want to read the direct contents of Wikipedia articles in such a podcast, that is feasible but requires some proper licensing of your podcast so it is compatible with Wikipedia's license. Wikipedia's so-called "copyleft" license allows reuse, and it isn't limited to print copying. See Wikipedia:Reusing Wikipedia content for more information.
  3. There is already a lot of media out there about Wikipedia and its culture. There is the Wikipedia Weekly podcast, which discusses issues Wikimedians may find interesting or relevant. There's also the Wikipediocracy website and probably lots of others I am missing, some run by or in conjunction with the WMF, and some entirely independent of it.
I hope that helps! --Jayron32 20:11, 1 December 2022 (UTC)
You should look at Wikipedia:WikiProject Spoken Wikipedia. Donald Albury 20:50, 1 December 2022 (UTC)
I feel like this is, bluntly, a useless WikiProject. Articles change wildly over time, and audio clips are clunky to work with. Text-to-speech already exists. The podcast OP suggests is definitely different than this, although I feel it would be too boring to listen to. Sungodtemple (talk • contribs) 02:37, 2 December 2022 (UTC)
A fairly new podcast called m:WIKIMOVE was also launched earlier this year. –xenotalk 02:45, 2 December 2022 (UTC)

Adding rel="me" tags

Would it be feasible to allow users to add rel="me" to links on their user pages? For context, this is prompted by trying and failing to add a link to my Mastodon to my user page, as that's how Mastodon verifies your account, and I'd like to be able to conclusively confirm that my Mastodon and Wikipedia accounts are the same person, as I primarily discuss Wikipedia there. I'm sure this would be useful in other situations as well.

This Wikimedia extension implements this, but I'm not sure if or how this could be added to Wikipedia. TheTranarchist ⚧ Ⓐ (talk) 02:23, 4 December 2022 (UTC)
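
For readers unfamiliar with the mechanism being asked about, here is a rough sketch of what a rel="me" check involves. It assumes, as a simplification, that verification amounts to finding a link on the user page that points at the profile and carries "me" among its rel values; the real verifier may apply further rules. The names RelMeParser and links_back, the sample markup, and the mastodon.example URL are all hypothetical and for illustration only.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect href values of <a>/<link> tags whose rel attribute includes "me"."""

    def __init__(self):
        super().__init__()
        self.rel_me_links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list; a valueless attribute parses as None.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "me" in rel_tokens and href:
            self.rel_me_links.append(href)


def links_back(page_html, profile_url):
    """Return True if page_html has a rel="me" link pointing at profile_url."""
    parser = RelMeParser()
    parser.feed(page_html)
    target = profile_url.rstrip("/")
    return any(link.rstrip("/") == target for link in parser.rel_me_links)


# Hypothetical user page markup, with and without the rel="me" attribute.
USER_PAGE = '<p>Find me on <a rel="me" href="https://mastodon.example/@editor">Mastodon</a>.</p>'
PLAIN_PAGE = '<p>Find me on <a href="https://mastodon.example/@editor">Mastodon</a>.</p>'

if __name__ == "__main__":
    print(links_back(USER_PAGE, "https://mastodon.example/@editor"))
    print(links_back(PLAIN_PAGE, "https://mastodon.example/@editor"))
```

The second page contains the same link without the attribute, which is the situation described above: the link is present, but a check of this kind would not count it.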

@TheTranarchist: afaik, that extension is likely to be deployed to Wikimedia sites in the Near Future™; we may just have to wait for now. (I "get around" this by linking to a verified web page which lists my Wikimedia account username) TheresNoTime (talk • they/them) 02:29, 4 December 2022 (UTC)