Wikipedia talk:Manual of Style/Lead section/Archive 23

From Wikipedia, the free encyclopedia

RfC on defining terms in first sentences of biographies

Should the guidance in MOS:BIOFIRSTSENTENCE be updated to recommend the subject be described using defining terms? Barnards.tar.gz (talk) 08:40, 23 June 2023 (UTC)

Proposal

The proposal is to change the fourth numbered item which currently states:

One, or possibly more, noteworthy positions, activities, or roles that the person held, avoiding subjective or contentious terms.

to:

One, or possibly more, noteworthy and defining positions, activities, or roles that the person held, avoiding subjective or contentious terms.

(change in bold: the words "and defining" are added)

Background

See the talk section above.

Survey (defining terms RfC)

  • Yes. See my initial proposal in the previous talk page section for my rationale as to why a change would be beneficial (short version: the guidance on definingness is an excellent fit for first sentences, because we should be describing subjects using terms that are commonly and consistently used by sources). Note that the actual wording in this RfC is slightly different to my initial proposal, because I agree with @Thinker78's tweak, on the basis that guidance is best articulated in a positive sense rather than in a negative sense. Barnards.tar.gz (talk) 08:40, 23 June 2023 (UTC)
  • Yes per previous section rationale. A lot of puffery could be eliminated, or at least pushed lower in the content. If general notability could be established based on a characteristic alone, it qualifies as defining, otherwise it is not defining. This ties in well with commonly and consistently used by sources, and simplifies deciding on what counts as defining. · · · Peter Southwood (talk): 10:07, 23 June 2023 (UTC)
    @Pbsouthwood, could you explain what you mean about "general notability" being "established based on a characteristic alone"? Are you arguing for an Wikipedia:Inherent notability concept?
    The first sentence of Donald Trump says he "is an American politician, media personality, and businessman who served as the 45th president of the United States from 2017 to 2021." Would you shorten that to "was the 45th president of the United States from 2017 to 2021", since being an American, being a politician, being a media personality, and being a businessman are not the sorts of characteristics that make editors assume the subject is notable? WhatamIdoing (talk) 08:47, 28 June 2023 (UTC)
    WhatamIdoing, re: Are you arguing for an Wikipedia:Inherent notability concept? No. Just that any characteristic that would make the general notability bar on its own can be considered due for the lead sentence; other characteristics which on their own would not satisfy general notability, or are far less notable, should normally not be in the lead sentence. However, if the topic does meet an inherent notability criterion, I would usually expect it to be mentioned in the lead sentence. (Thinking of a species, a city, or a professional footballer here - it would usually be a bit weird if that was not mentioned.)
    In the Trump example, POTUS#45, politician, media personality and businessman are all characteristics Trump is independently notable for, and are eligible for first sentence mention. Each alone would have justified the article's existence on general notability. There are a bunch of other characteristics that would also establish general notability in his case, but we have to draw the line somewhere or it would be unreadable. Politician could be omitted, as it is redundant to POTUS, but he is still a highly notable politician, so also a good argument to be kept. "American" is covered by another convention, and in this case is highly relevant, although implied by POTUS to those who are familiar with the US political system, which is not everybody. Does this clarify my position adequately? · · · Peter Southwood (talk): 11:32, 28 June 2023 (UTC)
    I'm not sure. When you say "any characteristic that would make the general notability bar on its own", do you mean something like "we have enough independent, in-depth sources to write an article about Trump solely from the POV of him being a businessman, so 'Trump is a businessman' would meet the GNG even if he hadn't ever been on a television show or run for elected office"?
    (Characteristics don't establish GNG. Only sources can do that.) WhatamIdoing (talk) 19:19, 28 June 2023 (UTC)
    WhatamIdoing, When you say "any characteristic that would make the general notability bar on its own", do you mean something like "we have enough independent, in-depth sources to write an article about Trump solely from the POV of him being a businessman, so 'Trump is a businessman' would meet the GNG even if he hadn't ever been on a television show or run for elected office"? - Yes, that is what I mean, Trump was already notable as a businessman before he went into politics, and might never have been a television personality without that established notability.
    Also yes, sources establish general notability, and thereby also establish the notable characteristics suitable for mention in the lead sentence. Without those (GNG) sources, other published characteristics may not be notable, though by consensus, special notability conditions may apply. · · · Peter Southwood (talk): 08:02, 29 June 2023 (UTC)
    On further consideration, I am not happy with the term "Defining characteristics", as it has not been adequately defined (if it has, please provide link), and is likely to be the source of much time-wasting contention. · · · Peter Southwood (talk): 08:11, 29 June 2023 (UTC)
    WP:CATDEF. Barnards.tar.gz (talk) 08:14, 29 June 2023 (UTC)
    That seems appropriate. Thanks, · · · Peter Southwood (talk): 11:48, 29 June 2023 (UTC)
  • Support: a reasonable adjustment which may provide benefits in clearer texts. (Aside from the often unnecessary but ever-popular "philanthropist", this may also help address "social worker" and "pedagogue" add-ons, terms whose meanings can be quite different across English-speaking communities.) AllyD (talk) 10:25, 23 June 2023 (UTC)
  • Support. Wording in the first sentence is sometimes arbitrary and reflecting more of the opinion of the editor than what reliable sources state. Thinker78 (talk) 20:57, 23 June 2023 (UTC)
    @Thinker78, how do you imagine this change resulting in less "of the opinion of the editor than what reliable sources state", given that we'd be asking editors to do exactly what they did before, plus to form an opinion on whether X or Y characteristic is "defining"? WhatamIdoing (talk) 08:42, 28 June 2023 (UTC)
    We would be looking for what reliable sources say about the subject, not writing something of our own opinion. I know it probably has its drawbacks but after all it's a guideline that could have workarounds here and there. Regards, Thinker78 (talk) 20:51, 28 June 2023 (UTC)
    @Thinker78, you will not find a reliable source that says "The defining characteristics for Donald Trump are...", so that you can simply just look for what the reliable sources say. Instead, you will find sources that say a lot of things –
    "Donald Trump the political showman", "Donald Trump, the former president", "Donald Trump, White House hopeful", "former President Donald Trump, the 45th president of the United States", "Donald Trump, the current GOP frontrunner", "Donald Trump, the first former president in history to face criminal charges", "Donald Trump the only living president not descended from slaveholders", "Donald Trump, the brash businessman", "Donald Trump the federal defendant", "Donald Trump, the former reality TV personality and real estate mogul", "real-estate developer Donald Trump", "Donald Trump the actor", "Donald Trump the businessman", "Former U.S. president Donald Trump", "ex-President Donald Trump", "real estate magnate, best-selling author and reality TV star, Donald Trump", "Donald Trump, real estate magnate", "Donald Trump, real estate mogul, entrepreneur and billionaire", "businessman Donald Trump", "TV personality Donald Trump", "Billionaire tycoon Mr Trump", "former reality TV star", "billionaire businessman Donald Trump", "businessman, television personality, and politician", "shrewd businessman and self-made billionaire", "self-financed billionaire candidate", "billionaire real estate tycoon", "scion of a wealthy real estate developer" (and many, many, many more)
    – and then editors will have to use their own opinions and their own judgement to decide which of them to include or exclude. WhatamIdoing (talk) 16:59, 29 June 2023 (UTC)
  • Support. Getting the weight in the first sentence right is extremely important because it's often the part of the article that gets the most exposure and because it can set the tone for the rest of the article. We plainly need better guidance for what goes in there on biographies, and defining-ness is a reasonable way to go forwards. All the reasons that we use it for categories also apply to the lead sentence. --Aquillion (talk) 11:23, 25 June 2023 (UTC)
  • Support. 'Defining' is a more apt choice of wording than 'noteworthy' imo. SWinxy (talk) 02:03, 27 June 2023 (UTC)
  • I'm not sure anymore. I'm swayed by WhatamIdoing's oppose in the RFCBEFORE above. We may very well end up dealing with cases where the consistently used by sources will limit us in misrepresentative ways, like when there's a level 2 subheading in the article discussing an important aspect the sources don't commonly address (due to datedness, breadth of coverage, etc). I would hope these cases would be rare. I could also foresee circular arguments of the form "it belongs in the lead because it's defining because it belongs in the lead", and some of the guidance cited at DEFINING may have to be revised to account for this. I think it's an improvement over noteworthy, which describes the actual proposed text, but I am in agreement with the aforementioned oppose backing the idea that the lead should ideally summarise the article, and not interface directly with sources. (Apologies for inclarity; my words today are not good.) Folly Mox (talk) 19:55, 27 June 2023 (UTC)
    Fair point about the potential circularity of definitions, but I think this can be overcome because leads are only mentioned as indicative of definingness, not part of the definition of definingness.
    Bear in mind that this guidance only covers the first sentence. The rest of the lead is free to mention level 2 subheading topics. Barnards.tar.gz (talk) 07:20, 28 June 2023 (UTC)
  • Oppose for the reasons outlined above, but mostly because I think it will be WP:CREEPY and won't achieve the apparent goal. WhatamIdoing (talk) 19:31, 28 June 2023 (UTC)
    @WhatamIdoing Ah! The minority report? Regards, Thinker78 (talk) 21:54, 4 July 2023 (UTC)
    Perhaps the wisdom gained of experience is more relevant. WhatamIdoing (talk) 07:36, 5 July 2023 (UTC)
  • Weak Oppose: how do we define what is "defining"? RS will disagree. Per Whatamidoing. Edward-Woodrow :) [talk] 13:08, 21 July 2023 (UTC)

"This article is about"

I was wondering what people thought about starting an article with This article discusses as Transgender history in Finland does. I was going to change it but saw it was a GA so wasn't sure what I should do. I haven't come across any other examples of this. PalauanLibertarian🗣️ 22:46, 24 May 2023 (UTC)

Ranks in the bottom of lead phrases along with "famous for" and "best known for".—Bagumba (talk) 01:16, 25 May 2023 (UTC)
(1) In this particular article, I think it serves a useful purpose, because the next paragraph (although in the lede) is rather long and detailed;
(2) "This article is about" is also the standard disambiguation template ({{About}}), as at the top of War of 1812, created by typing {{about|the conflict in North America from 1812 to 1815|the Franco–Russian conflict|French invasion of Russia|other uses of this term|War of 1812 (disambiguation)}} —— Shakescene (talk) 01:41, 25 May 2023 (UTC)
The {{about}} hatnote is fine. The OP was commenting on the lead sentence. —Bagumba (talk) 03:30, 25 May 2023 (UTC)
The first sentence should be about the topic not the article. A good analogy is list articles, where MOS:FIRST says do not introduce ... as ... "This list of Xs", ie by analogy do not introduce the article as "This article ...". Ideally the first sentence would be something like "Transgender history in Finland dates back to the earliest records in the 1800s ...", but if that is awkward, reword it and remove the bold format, along the lines of the examples in MOS:REDUNDANCY and MOS:AVOIDBOLD. Mitch Ames (talk) 01:57, 25 May 2023 (UTC)
If that lead sentence were a physical artefact, its marketing would brand it as "craft", "rustic", or "rough and ready". It certainly gets the job done for the reader, but I'm sure the article can be summarised with greater elegance. Folly Mox (talk) 03:41, 25 May 2023 (UTC)
Per MOS:REFERS, "Avoid constructions like "[Subject] refers to..." or "...is a word for..." – the article is about the subject, not a term for the subject." Regards, Thinker78 (talk) 04:07, 25 May 2023 (UTC)
Agreeing with everyone else above, and noting that a one-person review (GA) is not a reason to not edit articles, I have. There is nothing about being a GA that means one should not repair the obvious. SandyGeorgia (Talk) 12:11, 25 May 2023 (UTC)
This article is about is a self-reference that should not be seen in any article, far less a GA. · · · Peter Southwood (talk): 10:24, 23 June 2023 (UTC)

Based on the opinions here, I have added: Avoid "This article is..." or "This list is...": the first sentence should be about the topic not the article. Shhhnotsoloud (talk) 08:08, 9 August 2023 (UTC)

I reverted. There was already very similar guidance and the addition of "first sentence is not about the article" is not something I agree with. Regards, Thinker78 (talk) 20:02, 9 August 2023 (UTC)

Coordinates duplicated in the lead (both inside and outside infobox)

Please discuss at Wikipedia_talk:Manual_of_Style/Dates_and_numbers#Coordinates_duplicated_in_the_lead_(both_inside_and_outside_infobox). fgnievinski (talk) 21:45, 13 August 2023 (UTC)

The redirect Mos:BOLDAVOID has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2023 October 11 § Mos:BOLDAVOID until a consensus is reached. Utopes (talk / cont) 23:44, 11 October 2023 (UTC)

Not contradictory

This article opens "In Wikipedia, the lead section is an introduction to an article and a summary of its most important contents... It is not a news-style lead or "lede" paragraph." Then that links to "A lead paragraph (sometimes shortened to lead; in the United States sometimes spelled lede) is the opening paragraph of an article, book chapter, or other written work that summarizes its main ideas." Erm, that's exactly what it is, then. There is no contradiction in definition, so why claim there is? 2A00:23C8:8F9F:4801:250A:E861:3F91:DA83 (talk) 21:31, 9 September 2023 (UTC)

What exactly are you talking about? This is not an article. And the word "contradictory" or "contradiction" doesn't appear in this guideline, nor on its talk page, except one occurrence under the #LEADCITE rewrite thread which seems to have nothing to do with what you posted. As for "It is not a news-style lead or 'lede' paragraph", see previous discussions in archives. In summary: A journalistic lede is very little like an encyclopedic lead. The former is generally written to "tease" the reader with hints at information to induce them to read more, while the latter is a summary of all the salient details. The former is usually confined to 1-2 compressed-language ("news-speak") sentences, and under 25 words, while the latter is written in normal encyclopedic prose as clearly as possible, is usually much longer (except at a very small stub article), and varies in length by the length of the article it is summarizing. The former vary in style ("hard" versus "soft" leads), while the latter do not (they're all "hard"). PS: See also WP:NOT#NEWS policy: "Wikipedia is not written in news style." We have this policy, and MOS:LEAD avoids using the term "lede", because we have long had a pervasive problem with new editors trying to write in news-journalism instead of encyclopedic style, because they often are most familiar with news writing and have a sense of its style as "the" correct way to write.  — SMcCandlish ¢ 😼  22:19, 9 September 2023 (UTC)
I think the IP's point is that the wikilink is to Lead paragraph, which defines a lead paragraph just as this MOS section does. If it linked instead to News style#Lead, then it would make sense to contrast it to a Wikipedia lead. Schazjmd (talk) 22:22, 9 September 2023 (UTC)
Well, that was an easy enough adjustment to make. :-)  — SMcCandlish ¢ 😼  00:56, 12 October 2023 (UTC)

MOS:BIOFIRSTSENTENCE

I have a question that came up in a recent close review, for which I never received a definitive answer. The closer argued that an occupation could not be mentioned in the first sentence per MOS:BIOFIRSTSENTENCE (also mentioned in MOS:FIRSTBIO). The reasoning was that the occupation was contentious. I don't want to relitigate that close or that article, but I'd like to see some clarification on MOS:BIOFIRSTSENTENCE. Can an occupation be a contentious, value-laden label? Nemov (talk) 13:08, 20 October 2023 (UTC)

Not sure why the OP is being cagey about it, but this is about the term "journalist" being used.  — SMcCandlish ¢ 😼  19:02, 20 October 2023 (UTC)
  • Any claimed fact can be controversial, the more so the less it is supported in independent reliable sources.  — SMcCandlish ¢ 😼  19:03, 20 October 2023 (UTC)
    First of all, what is cagey about asking for clarification? I would remind you to assume good faith. Journalist is an occupation. It wouldn't matter if the occupation is a plumber. My reading of MOS:BIOFIRSTSENTENCE is for value-laden terms. It seems by your interpretation if there's any argument/controversy with an occupation then per MOS:BIOFIRSTSENTENCE it cannot be in the first sentence. That is fine, but then I would suggest wording MOS:BIOFIRSTSENTENCE to be clearer in that regard. Nemov (talk) 19:23, 20 October 2023 (UTC)
    I didn't imply anything faith-wise, I'm just observing that you posted an over-generally phrased question without providing sufficient detail, making us go dig the detail out on our own. Not terribly helpful. Anyway, I don't see anything unclear about BIOFIRSTSENTENCE. If I were notable and I claimed to be a licensed plumber as well as a writer and an IT consultant, WP should not just include the plumber claim without independent sourcing. More to the point of this particular case, sources appear to be in disagreement about whether what Ngo does is journalism, which is probably a more serious matter than not finding any sources that address whether he's a journalist at all (i.e., there is no WP:ABOUTSELF wiggle room to even contemplate). Anyway, when there's a conflict in what sources are saying, this becomes a WP:DUE policy matter and has nothing to do with style guidelines. Or to put it another way, what BIOFIRSTSENTENCE says is simply not relevant when the claimed occupation is disputed, because it is not a style question of any kind but a fact-establishment content question. We establish the facts first, and decide how to style them after the fact.  — SMcCandlish ¢ 😼  19:58, 20 October 2023 (UTC)
    I appreciate you spending time to respond to this, but I'm not here to debate Ngo. I'm curious about the application of MOS:BIOFIRSTSENTENCE because that's what the closer used to justify the action. The closer said [1] that per MOS:BIOFIRSTSENTENCE, we ought to omit labels that are contentious in the first sentence of the lead. Debates about the lead sentence come up a lot in biography discussions. If it's just a bad justification on the closer's part that's fine, they could have justified it differently. Nemov (talk) 20:28, 20 October 2023 (UTC)
    I don't really see a problem. BIOFIRSTSENTENCE has language that is clearly moderated to comply with NPOV and related policies: the opening paragraph of a biographical article should neutrally describe the person .... One, or possibly more, noteworthy positions, activities, or roles that the person held, avoiding subjective or contentious terms. If Ngo's claimed status as a "journalist" is disputed in the RS material, then it's not a neutral description and is subjective and contentious. It might have been a more solid close to cite NPOV policy directly. But I don't think it matters much, since the close was correct either way, and the P&G pages that are applicable are not in conflict, so which one was cited isn't very important. There are lots of pages here that re-state a rule from another page in summary form, and this generally isn't a problem (as long as a WP:POLICYFORK doesn't develop over time).  — SMcCandlish ¢ 😼  21:04, 20 October 2023 (UTC)
    Again, I'm not challenging the close and this isn't about Ngo. My question could be any article. Can an occupation be rejected on BIOFIRSTSENTENCE/WP:CONTENTIOUS as a value-laden label? That's how this was justified. I believe the closer got to the right answer as you have pointed out, but should have said the occupation was "subjective" instead of claiming it's value-laden. Nemov (talk) 21:25, 20 October 2023 (UTC)
    In short, "yes". If you agree with the close in the first place, what is prompting you to ask the question? "Value-laden" (i.e. subjective or contentious and not neutral) is just as valid a rationale as a bare "subjective".  — SMcCandlish ¢ 😼  21:28, 20 October 2023 (UTC)
    I'm asking the question because to claim an occupation is a value-laden term doesn't match what's written at WP:CONTENTIOUS. I would recommend amending it if that's how editors want to interpret it. As Aquillion mentions below, "anything can be controversial." So what's the point in having the distinction in the first place? Nemov (talk) 21:54, 20 October 2023 (UTC)
Anything can be controversial if the sources present it as controversial; anything can be value-laden depending on the context. If there is clear disagreement among sources of comparable weight about something, then we can't state it as a fact in the article voice, per WP:NPOV (Avoid stating seriously contested assertions as facts.) I don't think there is a meaningful distinction between contentious, contested, and value-laden - they're all different ways of saying "do non-opinion sources generally agree on this and state it as uncontested fact." If high-quality non-opinion sources agree on something and state it as fact, then it is uncontested, uncontentious, and not value-laden; and likewise, if there's disagreement among them or they state it in a plainly skeptical manner, then it should be treated as contested, contentious, and value-laden. That's the only threshold that matters - how editors feel about a term doesn't come into it; nor is there some sort of list of "verboten" terms or anything like that. (That said, it should be easy to see that some professions are more prestigious than others, and especially for people in media-related fields can carry value judgments about the value and veracity of their work, as well as their overall methods and intent. Whether or not someone is described using those terms can therefore become a value-laden judgment, so it's unsurprising that there would be cases where the sources would conflict or treat them as controversial. "Propagandist" is also a profession; do you think we could use it in the lead sentence of a bio when there's disagreement over them among the highest-quality sources? What about "prostitute?") --Aquillion (talk) 21:47, 20 October 2023 (UTC)
@Nemov, I hope that the original dispute is long settled and nearly forgotten, but I wanted to circle back to this idea that an occupation is a value-laden term. This is probably not helpful (meaning: practical) language for discussions. A bona fide occupation (e.g., butcher, baker, candlestick maker) is not a value-laden term. But:
  • Some things people do to produce money, or to keep themselves occupied, involve activities that can be described in ways that tend to express an opinion or judgement about the person's activities (an oppressed prostitute, or an empowered sex worker? a professional gambler, or a gambling addict? a business owner or a crime boss? a terrorist or a freedom fighter or a mercenary?). These are sometimes value-laden terms.
  • The right of certain individuals to claim certain careers may be in doubt (e.g., an author who's never been published, a consultant with no clients, a politician who has lost every election...).
Disputes about whether someone should be called a journalist don't really involve "value-laden terms" per se. Instead, the question is what it means for someone to be a journalist, and whether the person really is one. WhatamIdoing (talk) 05:47, 15 November 2023 (UTC)

A wording dispute about technical material

We have this presently:

Make the lead section accessible to as broad an audience as possible. In general, introduce useful abbreviations but avoid difficult-to-understand terminology, symbols, mathematical equations and formulas. Where uncommon terms are essential, they should be placed in context, linked, and briefly defined. The subject should be placed in a context familiar to a normal reader. For example, it is better to describe the location of a town with reference to an area or larger place than with coordinates. Readers should not be dropped into the middle of the subject from the first word; they should be eased into it.

This was recently changed to the following (with the change annotated here like this, for visual clarity):

Make the lead section accessible to as broad an audience as possible. In general, introduce useful abbreviations but avoid difficult-to-understand terminology, symbols, mathematical equations and formulas where such usage would conflict with the goal of making the article as accessible to as wide an audience as possible. Where uncommon terms are essential, they should be placed in context, linked, and briefly defined. The subject should be placed in a context familiar to a normal reader. For example, it is better to describe the location of a town with reference to an area or larger place than with coordinates. Readers should not be dropped into the middle of the subject from the first word; they should be eased into it.

The rationale for the addition was "put back in original wording here. We actually have articles *about* equations and other highly technical subjects. The guideline should not be read as excluding these from the lede." The rationale for the reversion was "No such goal - not the original text".

I'm not inclined to dig back through page history to determine when such wording was added the first time, by whom, or for what rationale. It's more sensible to just discuss whether we think such wording would be appropriate to have here.  — SMcCandlish ¢ 😼  20:15, 16 November 2023 (UTC)

I agree with the rationale, but I'm not sure the underlined text itself expresses the point very well. To paraphrase the given rationale, excluding an equation from an article about that equation would be perverse.
Furthermore, I'm a bit puzzled as to why abbreviations are singled out as acceptable. Why is it fair to introduce a "useful abbreviation" but not a useful symbol? The same be careful about introducing the unfamiliar ethos should apply across the board. XOR'easter (talk) 20:46, 16 November 2023 (UTC)
I'm not certain as to what is meant by "useful". My reading has been that it is okay to introduce an abbreviation and then use it to avoid repeating a long phrase. Hawkeye7 (discuss) 21:05, 16 November 2023 (UTC)
That sounds fair, in principle. My concern is that by the same token, one should be able to introduce a symbol and then use it to avoid repeating its definition or otherwise spilling a lot of words. XOR'easter (talk) 21:10, 16 November 2023 (UTC)
The original was inserted here. To me the additional sentence is not only repetitive, but demands a "goal" that we do not have and which conflicts with our mission. Our goal is to construct an encyclopaedia. Some articles will, of their nature, be quite specialised and of interest only to the specialist reader. Difficult-to-understand mathematical formulae and the like are absolutely essential in an article where that is the subject. @Tito Omburo: Hawkeye7 (discuss) 21:02, 16 November 2023 (UTC)
My take is that whether "equations" are used is a fairly blunt proxy for comprehensibility. Some of the most impenetrable introductory paragraphs in math articles are written entirely in prose, whereas is something that almost anyone can understand, but still might be less than beautiful in the first sentence.
So we should consider the issue of mathematical notation somewhat separately from the broader question of how to present technical material to readers who may not quite have the background for it.
I think it is reasonable to say, not as a hard rule but as a general stylistic preference, that mathematical notation should usually be avoided in the lead sentence, and maybe even the lead paragraph, except in cases where the article is specifically about an equation or similar formal entity (quadratic formula, Pell equation for example).
Note that this is specifically an aesthetic consideration; it is not really about comprehensibility. --Trovatore (talk) 21:23, 16 November 2023 (UTC)

This isn't in any way about "excluding an equation from an article"; it is only about lead sections of articles, and this discussion is going to be needlessly heated and increasingly nonsensical unless this distinction is understood and maintained.  — SMcCandlish ¢ 😼  21:29, 16 November 2023 (UTC)

The {{od}} template fails to make it completely clear whom you're responding to. From the content, I think you're responding to Hawkeye7, is that correct? --Trovatore (talk) 21:36, 16 November 2023 (UTC)
Sorry for accidentally eliding that above (I meant to include "the intro of" and didn't notice I had omitted it until re-reading just now). But... excluding an equation from the lead of an article about that equation is still absurd. I really doubt that anyone wants to remove the illustration of clefs from the lead of Clef. That's what the page is about; removing it in the name of "clarity" would rightly be seen as backwards. XOR'easter (talk) 21:41, 16 November 2023 (UTC)
Well, I was quoting a particular editor clearly, but making a general point: if this discussion gets mired in "what should be in the article at all" instead of remaining focused on "what should be in the lead section" then it's not going to go anywhere useful. This has implications for other statements above, like "consider the issue of mathematical notation somewhat separately from the broader question of how to present technical material". In the lead makes a big difference here, in what such a consideration would entail.  — SMcCandlish ¢ 😼  21:59, 16 November 2023 (UTC)
Another concern: a bajillion times over the years, I've said some variation of "the lead is meant to summarize the body". When a subject is notation-intense, omitting that notation from the lead entirely could well make for a defective summary. (I first wrote "dishonest summary", but that could be construed as implying ill intent.) Now, depending on the subject, it might still make sense to exclude fancy notation from the opening line, or from the first paragraph. The pragmatic choice will depend upon the topic, the length of the article, and the plausible intended audience. XOR'easter (talk) 22:12, 16 November 2023 (UTC)
I'm having trouble reading the changed text. Can we instead say

In general, introduce useful abbreviations but avoid difficult-to-understand terminology, symbols, mathematical equations and formulas where unless such usage would conflict with the goal of making the article as accessible to as wide an audience as possible.

Even with this change, it's still hard to understand. The pile-on wording of "do this avoid that unless the first thing" is convoluted. 67.198.37.16 (talk) 22:22, 16 November 2023 (UTC)
BTW, this change would address XOR'easter's concern: if some highly technical article requires some exceptional lead with unusual wording, it's allowed, because that would meet "the goal of making the article as accessible to as wide an audience as possible." 67.198.37.16 (talk) 22:29, 16 November 2023 (UTC)
"Avoid unless it would conflict with the goal of making the article accessible" means not using this technical content in the cases where it is needed, but allowing this content in cases where it is unnecessary. You are changing the meaning to the opposite of what it should be. —David Eppstein (talk) 22:38, 16 November 2023 (UTC)
Yeah, there seems to be a series-of-negatives problem going on.  — SMcCandlish ¢ 😼  22:42, 16 November 2023 (UTC)
Thus illustrating that technical content may be a sufficient condition for producing difficult-to-read text, but it is not a necessary condition. —David Eppstein (talk) 22:52, 16 November 2023 (UTC)
I mean look, even journal articles ordinarily start with prose. It's not really about comprehension. It just doesn't look nice to start with symbols; it looks like you've wandered into somebody's notes rather than a polished article. I think we can make some such point, maybe not for the whole lead section, but at least for the lead paragraph.
And yes, there does need to be an exception for articles that are specifically about an equation or other symbolic entity. --Trovatore (talk) 23:18, 16 November 2023 (UTC)
I think that is broadly true, but not universally true. Abstracts and opening paragraphs of journal articles do break out the notation if it's sufficiently well-established in their fields that they don't have to define it first. "Our algorithm runs in time", etc. There are terms invented by incorporating notation into words: ∞-category, for example (and heck, ∞-groupoid has to have the symbol in the article title!). XOR'easter (talk) 02:22, 17 November 2023 (UTC)
I said "ordinarily". Yes, there will be exceptions, but as a general rule, it's probably better to defer heavy symbol usage to, at least, the second paragraph. I think it just looks nicer. I don't have a deeper reason than that, at least not that I've analyzed well enough to elucidate. --Trovatore (talk) 03:19, 17 November 2023 (UTC)

The original was changed with this very misleadingly summarized and undiscussed edit, which clearly changed the meaning to an injunction against including equations in the lede. This leads to all sorts of perverse problems, as editors who have experience editing mathematics and technical subjects have already remarked. Tito Omburo (talk) 23:20, 16 November 2023 (UTC)

Also, I am confused by the "no such goal" edit summary. If it is not a goal of this guideline to eliminate mathematics from lede sections of articles on mathematics, perhaps it makes sense to cut out the proscription on equations altogether. Seems like a classic case of WP:BEANS. Tito Omburo (talk) 23:24, 16 November 2023 (UTC)

As someone trying (perhaps poorly) to just facilitate the discussion happening, without "having a dog in the fight", I want to suggest that several of you continuing to edit the pertinent material in the guideline page back and forth while the discussion is going on kind of defeats the purpose of the discussion, which is to come to some consensus about what that material should say and why.  — SMcCandlish ¢ 😼  23:35, 16 November 2023 (UTC)

There is a dispute regarding whether it is DUE to mention A Haunting in Venice, a film adaptation, in the lead of Hallowe'en Party, its source material. TL;DR, proponents argue that the film is the most notable among the handful of adaptations, as evidenced by the fact that it is the only one to have a standalone article and that it has the most WP:SIGCOV; opponents argue that all of the adaptations are equally notable and it is therefore not appropriate to single out the film in the lead. You are invited to weigh in, thanks. InfiniteNexus (talk) 00:06, 31 October 2023 (UTC)

Note: Started an RfC about this, see Talk:Hallowe'en Party#RfC on mention of film adaptation in the lead. Thanks. InfiniteNexus (talk) 00:04, 19 November 2023 (UTC)

Listing large US cities by state in broadcasting article leads

I've had this come up in an FAC (Wikipedia:Featured article candidates/WSNS-TV/archive1) and wanted some clarity on the topic. Some broadcasting articles are on stations located in and licensed to very large, undisambiguated-title-by-state-per-AP Stylebook US cities. Which of these should be preferred?

Pinging for visibility: Mvcg66b3r and MaranoFan. Sorry for double pings, but SMcCandlish asked me to move this over. Sammi Brie (she/her • tc) 04:03, 19 November 2023 (UTC)

  • My personal take on this would be to either link the city to its article, or give the long version, but not both, and prefer the former if there's an infobox that gives the long version. Even a lot of major city names are technically ambiguous (cf. San Francisco (disambiguation)). The rationale for linking would be that, while we don't normally link this class of major metro cities when they are mentioned in passing (e.g. in "Smith moved to Chicago in 2014"), in the lead of an article about a radio/TV station, the market it serves is directly pertinent to fully understanding the subject, so the link is justified.  — SMcCandlish ¢ 😼  04:15, 19 November 2023 (UTC)
    @SMcCandlish, I should have linked them in the examples above, but normally, they are. Examples revised. Sammi Brie (she/her • tc) 04:21, 19 November 2023 (UTC)
    In that case, I would just go with San Francisco, though even San Francisco, California would be preferable to San Francisco, California, United States. We generally don't put "United States" after a US state name, except sometimes in infoboxes (for no reason I've ever seen articulated). The usual presumption that people know where and what San Francisco or Chicago are goes double for entire US states.  — SMcCandlish ¢ 😼  04:41, 19 November 2023 (UTC)
    @SMcCandlish Nikkimaria has gone at me for not having country mentions in articles before. (The relevant infoboxes have a country field.) Sammi Brie (she/her • tc) 04:57, 19 November 2023 (UTC)
    Well, I guess it's good to have a general discussion then and come to a clearer consensus about what to do.  — SMcCandlish ¢ 😼  05:28, 19 November 2023 (UTC)
  • Option A. It's been my take that the ledes in US TV station articles have long been problematic, in that they are overlinked, leading to WP:SEAOFBLUE. For example: WFTY-DT; "It is owned by TelevisaUnivision alongside Newark, New Jersey–licensed UniMás co-flagship WFUT-DT (channel 68) and Paterson, New Jersey–licensed Univision co-flagship WXTV-DT (channel 41)". I feel the excess verbiage could be removed, without lessening the information in the lede. Example: "It is owned by TelevisaUnivision alongside Newark, New Jersey–licensed WFUT-DT (channel 68) and Paterson, New Jersey–licensed WXTV-DT (channel 41)", conveys the same information and allows the reader to choose whether or not they want to click on the wiki-link for more information regarding the sister stations. - BlueboyLINY (talk) 20:41, 19 November 2023 (UTC)
    @BlueboyLINY I've been very attuned to SEAOFBLUE issues in our leads. We have a lot of kludgy lead paragraphs in our topic. Our other issue is that, generally, only articles I've improved have adequate summary leads of their contents. Sammi Brie (she/her • tc) 21:24, 19 November 2023 (UTC)
    BlueboyLINY, it is not clear from your example text what you think should and shouldn't be linked, since you didn't include any links in any of it. This makes your rationale and your desired outcome rather hard to understand.  — SMcCandlish ¢ 😼  21:32, 19 November 2023 (UTC)
    This is the relevant excerpt with links:

    It is owned by TelevisaUnivision alongside Newark, New Jersey–licensed UniMás co-flagship WFUT-DT (channel 68) and Paterson, New Jersey–licensed Univision co-flagship WXTV-DT (channel 41), which WFTY simulcasts on its respective second and third digital subchannels.

    Sammi Brie (she/her • tc) 21:43, 19 November 2023 (UTC)
    And this is the excerpt with links I feel are unnecessary removed:

    It is owned by TelevisaUnivision alongside Newark, New Jersey–licensed WFUT-DT (channel 68) and Paterson, New Jersey–licensed WXTV-DT (channel 41), which WFTY simulcasts on its respective second and third digital subchannels.

    - BlueboyLINY (talk) 03:00, 20 November 2023 (UTC)
    I would rewrite as:
    '''WFTY-DT''' (channel 67) is a [[television station]] licensed to [[Smithtown (CDP), New York|Smithtown, New York]], United States, serving [[Long Island]] as an affiliate of the [[True Crime Network]]. It is owned by [[TelevisaUnivision]] alongside [[Newark, New Jersey]]–licensed [[UniMás]] [[Flagship (broadcasting)|co-flagship]] [[WFUT-DT]] (channel 68) and [[Paterson, New Jersey]]–licensed [[Univision]] co-flagship [[WXTV-DT]] (channel 41), which WFTY [[Simulcast|simulcasts]]…
    +
    '''WFTY-DT''' (channel 67) is an American [[television station]] licensed to [[Smithtown (CDP), New York|Smithtown, New York]], serving [[Long Island]] as an affiliate of the [[True Crime Network]]. It is owned by [[TelevisaUnivision]] alongside [[Newark, New Jersey|Newark]]–licensed [[UniMás]] [[Flagship (broadcasting)|co-flagship]] [[WFUT-DT]] (channel 68) and [[Paterson, New Jersey|Paterson]]–licensed [[Univision]] co-flagship [[WXTV-DT]] (channel 41), which WFTY [[Simulcast|simulcasts]]…
    — HTGS (talk) 22:14, 23 November 2023 (UTC)
  • While we are the English Wikipedia, I think it stands to reason that a vast majority knows that San Francisco is a city in the US. So, I would go with Option A. Where necessary, like with New York City, I would just use a piped link. - NeutralhomerTalk • 13:56, 20 November 2023 (UTC)

FA numbers

@Femke, the number of sentences in FAs was based on a non-random sample of 61 articles. I looked again at the first 10, using the specific version that was promoted. Here are the numbers for each (words/sentences):

  • 361/16
  • 391/15
  • 232/9
  • 244/9
  • 399/19
  • 245/10
  • 334/15
  • 361/15
  • 343/16
  • 137/7

The range for sentences is 7 to 19, and if you exclude the most extreme, it's either 9 to 16 or 10 to 15 sentences per lead.

The mean word count is 305 words per lead, with most of them falling either around 250 or (a little more frequently) 350.

The words per sentence count has a range of 20 to 27, with a mean of 23.

This is similar to the numbers for last December's TFAs, which you can find here. I suggest that instead of raising the number of sentences per lead to 12, which is less accurate, you consider changing the "300 words" to a range (e.g., "250 to 350 words"). WhatamIdoing (talk) 06:17, 27 November 2023 (UTC)
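The summary figures quoted in the comment above can be re-derived from the ten words/sentences pairs it lists. The following is only a minimal sketch for checking the arithmetic; the variable names are illustrative and are not part of the original discussion.

```python
from statistics import mean

# (words, sentences) for the ten leads listed in the comment above
leads = [(361, 16), (391, 15), (232, 9), (244, 9), (399, 19),
         (245, 10), (334, 15), (361, 15), (343, 16), (137, 7)]

words = [w for w, _ in leads]
sentences = sorted(s for _, s in leads)
words_per_sentence = [w / s for w, s in leads]

mean_words = mean(words)                              # mean words per lead
sentence_range = (sentences[0], sentences[-1])        # full range of sentences
trimmed_range = (sentences[1], sentences[-2])         # dropping the single lowest and highest
wps_range = (round(min(words_per_sentence)),
             round(max(words_per_sentence)))          # words per sentence, rounded
mean_wps = mean(words_per_sentence)

print(f"mean words per lead: {mean_words:.0f}")
print(f"sentences per lead: {sentence_range[0]} to {sentence_range[1]}")
print(f"excluding the extremes: {trimmed_range[0]} to {trimmed_range[1]}")
print(f"words per sentence: {wps_range[0]} to {wps_range[1]}, mean {mean_wps:.0f}")
```

With only ten articles the mean is sensitive to single short leads such as the 137-word one, which is one reason a range may describe the sample better than a single figure.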

Thanks for giving the background here. I'm happy with the recent change by Tpbradbury to 200-400 words.
I was surprised to see the combo 10 sentences for 300 words, as that would imply an average 30 words per sentence, above the maximum length (not maximum average) of 25 words in a sentence the UK government uses to assure readability.
In the non-random sample of 61 articles TFA list, the median sentence length is 21 words, which comes closer to what I expect, even though there are two outliers with 30+ words. So for the 200-400, a rough number of sentences would be 10 to 20, taking that median and rounding for simplicity. Happy for that to be added, or for the number of sentences to be omitted altogether. It's a bit odd to have a small range of sentences with a wider length range. —Femke 🐦 (talk) 17:35, 27 November 2023 (UTC)
I think you assumed that the smaller number of sentences (10) had the same number of words (300) as the larger number of sentences (15). In practice, 10-sentence leads tend to have 230–260 words in them, and 15-sentence leads tend to have 300–350 words in them.
We should probably change the "200" to 250. Only a small percentage of FAs have leads as short as 200. WhatamIdoing (talk) 17:43, 27 November 2023 (UTC)
That's indeed how I read the 10 sentences vs 300 words.
You're right it's a bit asymmetric: of the same 61 articles TFA sample, the 10% percentile is 193, 20% is 232, 50% is 282, 80% is 399 and 90% is 446. So a range of 250 to 400 words makes sense if we round to the nearest 50 and take the 20% and 80% percentile.
If we take the same percentiles (20% and 80%), we'd get 10 to 18 sentences. The alternative calculation of dividing word count by median sentence length gives us 11 to 19 sentences, which doesn't feel nicely rounded off. —Femke 🐦 (talk) 17:54, 27 November 2023 (UTC)
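The "alternative calculation" mentioned above (word count divided by median sentence length) can be reproduced from the figures given in this thread: the 20th and 80th percentile word counts (232 and 399) and the median sentence length (21 words). This is a rough sketch of that arithmetic only; the helper name `round_to_nearest` is hypothetical.

```python
# Figures quoted in the discussion for the 61-article TFA sample
median_sentence_length = 21      # words per sentence
p20_words, p80_words = 232, 399  # 20th and 80th percentile word counts


def round_to_nearest(value, step):
    """Round value to the nearest multiple of step (illustrative helper)."""
    return round(value / step) * step


# Word range rounded to the nearest 50, as proposed
word_range = (round_to_nearest(p20_words, 50), round_to_nearest(p80_words, 50))

# Sentence range obtained by dividing word counts by the median sentence length
sentence_low = p20_words / median_sentence_length
sentence_high = p80_words / median_sentence_length

# The earlier "10 sentences for 300 words" combination implies this average
implied_words_per_sentence = 300 / 10

print(f"word range: {word_range[0]} to {word_range[1]}")
print(f"sentences: {sentence_low:.1f} to {sentence_high:.1f}")
print(f"implied words per sentence: {implied_words_per_sentence:.0f}")
```

Note that the 11-to-19 result comes from the unrounded percentiles; dividing the rounded 250 and 400 by 21 would give roughly 12 to 19 instead.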
10 to 18 sentences sounds good. Gawaon (talk) 19:22, 27 November 2023 (UTC)
Those numbers (250–400 and 10–18) also match the December 2022 counts (excluding the smallest 6 and the largest 6). WhatamIdoing (talk) 19:34, 27 November 2023 (UTC)
I don't think it's self-evident that this is the appropriate way to come up with the figures (perhaps we should take a look at articles promoted in the last two, three, or five years to get a reasonably large sample without getting too many articles that do not reflect current best practices? Perhaps we should take into consideration that certain kinds of articles are likely to be over-/underrepresented among featured articles and have on average shorter/longer leads than others? Perhaps we should not be looking at absolute word counts but relative ones?), but more importantly I think this is taking an overly quantitative approach to an issue that is inherently mostly qualitative. The most important thing is that the figures only be used descriptively, not prescriptively. Being somewhat fuzzy about it (such as by using a range, and preferably a fairly broad one) helps here. TompaDompa (talk) 00:33, 28 November 2023 (UTC)
The counts can only be determined by hand; feel free to pick your own set and count them yourself. Another dataset will do no harm.
This information is already presented in a strictly descriptive manner: "Most Featured articles have a lead length of..." – not "You must" or "You should". WhatamIdoing (talk) 01:11, 28 November 2023 (UTC)
What is the aim of including these numbers (which will be misapplied)? What are we gaining? SandyGeorgia (Talk) 01:15, 28 November 2023 (UTC)
I, for one, find them useful guidance. Gawaon (talk) 03:10, 28 November 2023 (UTC)
I tend to agree that this is mostly a qualitative not quantitative matter, and that these numbers will be misapplied, but maybe we still need something like them anyway, perhaps with a strong statement that these are just a ballpark estimate and not grounds for forcing a split or deletion of material. PS: "10 to 18" is also weird to me, oddly arbitrary. It would make more sense as "10 to 20", without having much effect on the other numbers.  — SMcCandlish ¢ 😼  04:22, 28 November 2023 (UTC)
Well, that's why I rounded down the first time, but 10 to 18 is more accurate.
I'd much rather have word and sentence counts than paragraph counts. This page has made length-related suggestions since the first version in 2004, when length considerations took up half the page. Whatever benefit we thought we were getting by creating this page in the first place, IMO we'll get those benefits plus greater clarity by replacing the suggested paragraph count with a suggested word or sentence range. WhatamIdoing (talk) 05:00, 28 November 2023 (UTC)
Why "replacing"? Both are currently there, and it makes a lot of sense to have both (or rather, all three: paragraphs, sentences, and words). Gawaon (talk) 07:32, 28 November 2023 (UTC)
The problem we've had (for years) with the paragraph counts is that people say "Oh, five paragraphs is too many – this lead is too long – look, I removed a line break, and now it is the right length!" Or "One paragraph is too little – this lead is too short – look, I pressed Return in the middle of the paragraph to make two single-sentence paragraphs instead of one two-sentence paragraph, and now it is the right length!" WhatamIdoing (talk) 16:03, 28 November 2023 (UTC)
I agree that "10 to 18 is more accurate". Rounding it up to 20 would feel arbitrary, as would rounding down the lower word count to 200 (as it has been for a short while, meanwhile reverted). Gawaon (talk) 07:34, 28 November 2023 (UTC)
The page is going WP:CREEPy; we still have no evidence that "Most Featured articles have a lead length of about three paragraphs, containing 12 to 15 sentences, or 250–400 words", as that was apparently based on one sample of TFAs for one month (according to the discussion at WP:SIZE, and why is this being discussed in two different places), and it could reflect a skew towards certain kinds of articles that are over-represented at WP:FA (eg hurricanes). We shouldn't be imposing stuff on a guideline that editors will misinterpret (because they always do), and we can't make generalizations like this about FAs without considering the topic. SandyGeorgia (Talk) 07:43, 28 November 2023 (UTC)
It's based on two sample sets (December 2022 + all of WPMED's FAs), both of which had the same results.
(Hurricanes were one of my concerns about FAs; most of them seem to have shorter than usual leads, with two paragraphs.) WhatamIdoing (talk) 15:53, 28 November 2023 (UTC)
Well, as another example of differences, the MED FAs include bios, which are different than medical conditions. So same issue ... it's hard to separate length from topic. SandyGeorgia (Talk) 16:03, 28 November 2023 (UTC)
When you get the same results in two separate studies, the odds of a third study producing different results are pretty small. But if you (or anyone else) would like to pick a third sample set, please feel free to do so, and please share your results. WhatamIdoing (talk) 17:01, 28 November 2023 (UTC)
WhatamIdoing, I suspect this is not a good representative month for this data. That month happened to have three one-paragraph leads, which are actually quite rare (unless FAC has gone way off the rails). If these kinds of numbers are to be used, a broader sample is called for. SandyGeorgia (Talk) 09:18, 28 November 2023 (UTC)
It could be; you could pick another month and see what you find. Looking at Wikipedia:Featured content, 12 of the most recent 15 FAs have three paragraphs in the lead; one has four paragraphs and two have two paragraphs. Three paragraphs is mean, median, and mode in that small sample set. WhatamIdoing (talk) 17:19, 28 November 2023 (UTC)

I'd be happy to take a more representative sample of FAs and redo the analysis later. I find some word count really useful here; I often quote it to say that 600+ words leads are intimidating and difficult to read. Having guidance on the number of paragraphs without guidance on words can lead people to misinterpret as well and write very bloaty paragraphs. To further avoid misinterpreting, we may want to widen to a 10-90 percentile interval instead. —Femke 🐦 (talk) 08:24, 28 November 2023 (UTC)

If we want to be all statistical about it, we could follow the 68–95–99.7 rule. We're currently taking the inner 80%, which is a bit more than one standard deviation from the median. WhatamIdoing (talk) 17:09, 28 November 2023 (UTC)
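As a rough check on the remark above, and assuming (purely for illustration) that lead lengths were normally distributed, the half-width of the inner 80% interval can be computed with the standard library. Real lead-length data may well be skewed, so this is a sketch of the rule being cited rather than a claim about the sample.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# The inner 80% runs from the 10th to the 90th percentile
z_inner_80 = std_normal.inv_cdf(0.90)

# Share of values within one standard deviation (the "68" in 68-95-99.7)
within_one_sd = std_normal.cdf(1) - std_normal.cdf(-1)

print(f"inner 80% extends about {z_inner_80:.2f} standard deviations each way")
print(f"within one standard deviation: {within_one_sd:.1%}")
```

Under that assumption the inner 80% reaches about 1.28 standard deviations either side of the centre, which fits the description "a bit more than one standard deviation".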
The one-paragraph leads are outliers, and three in one month should be an extreme anomaly. (If they're not, something is wonky at FAC.) I'm less interested in re-doing the numbers than I am in seeing better qualifiers put on the text. SandyGeorgia (Talk) 17:23, 28 November 2023 (UTC)
It says "Most Featured articles have a lead length of about three paragraphs". What would you change that to? "Most Featured articles have a lead length of about three paragraphs, and almost never one or five"? "Most Featured articles have a lead length of about three paragraphs, based on multiple samples, all of which found that three was the most common number of paragraphs"? WhatamIdoing (talk) 19:10, 28 November 2023 (UTC)
Most featured articles have a lead length of about three paragraphs; lead length varies depending on the topic and content area, but is rarely less than two or more than five paragraphs. SandyGeorgia (Talk) 19:25, 28 November 2023 (UTC)
I'm not opposed to that, though one could express it even simpler: "Most featured articles have a lead length of about two to four paragraphs." Gawaon (talk) 19:35, 28 November 2023 (UTC)
Brings us right back to the same problem-- no context, the uninitiated will then oppose five or one, which is what we're trying to avoid. They happen, albeit rarely, and they are acceptable. SandyGeorgia (Talk) 19:43, 28 November 2023 (UTC)
There are still the words "most" and "about" even in my proposal, but – anyway. Gawaon (talk) 20:06, 28 November 2023 (UTC)
(edit conflict) I think coords can deal with people at FAC overinterpreting stuff if we're clear that these are just descriptive and not prescriptive. I don't think people oppose for such reasons, really? These guidelines are most useful for showing newbies that a 2 sentence lead is really not that good, and that the leads of featured articles do not become a bloated mess, but typically stay under 500 (550) words.
I can't tell who wrote this, but there is no problem with FAC Coords (who know how to apply guidelines). As explained over at the split discussion (??? Why ??) at the talk page of WP:SIZE, the problem is not FAC, but how less experienced editors will use/interpret this data. SandyGeorgia (Talk) 20:44, 28 November 2023 (UTC)
@SandyGeorgia, what makes you believe that the lead varies according to the topic of the article? Could you articulate an example, like "Music articles tend to have short leads" or "Biographies tend to have long leads"? I'm hoping to find out what the difference is between "varies by the needs of the specific article" (which I assume we could all agree is true) and "varies by subject" (good luck guessing whether your subject tends to be shorter or longer).
I just ran another small set at User:WhatamIdoing/Sandbox#Most recent FAs (15 articles). 80% of them have exactly three paragraphs. 100% of them have between 10 and 17 sentences. 80% of them have a word count between 250 and 400 (and the 20% that don't are within a rounding error of 250, so 100% of them have "about" 250–400 words in the lead).
Someone else has claimed that lead length varies by length of the article; this might be true, but it's a fairly minor effect (articles 3x median have a lead that is in the upper half; articles that are 0.6x median have a lead that is in the bottom half). WhatamIdoing (talk) 21:15, 28 November 2023 (UTC)
Hurricanes and ships are often short; medical and scientific articles (climate change) (not bios) are often longer. Medical articles may have (proportionally) longer leads because what we should include is somewhat prescribed at WP:MEDORDER (that is, we hit classification, signs and symptoms, diagnosis, cause, prognosis, epidemiology, history, cultural, etc ... we have a checklist other content areas might not have). Bios vary. I think I'd be convinced on the number range if we took Femke's analysis of articles passing FAR and looked at all of them (it's unclear to me why some are left out). But even that then would be misleading, as hurricanes aren't passing FAR because there's a CCI ... SandyGeorgia (Talk) 21:20, 28 November 2023 (UTC)
Also, in medical, lead size relating to article size falls apart because we cover the MEDORDER bits no matter how short the article. That is, Ajpolino's lead at the very short Buruli ulcer (not much known) is probably similar to the much longer lung cancer, as he has to hit all the MEDORDER sections anyway. No such list in most other kinds of content ... SandyGeorgia (Talk) 21:22, 28 November 2023 (UTC)
From the WPMED-tagged FAs, I get a mean of 371 words per lead for all of them (biographies, basic science, etc.), and 380 words per lead for only the diseases, drugs, etc. I doubt that difference is either statistically significant or of practical relevance. WhatamIdoing (talk) 21:38, 28 November 2023 (UTC)
OK, I guess that answers that. SandyGeorgia (Talk) 21:38, 28 November 2023 (UTC)
But that's well above the 300 average mentioned before ... (sorry for piecemeal responses, heading out the door soon, trying to finish up). SandyGeorgia (Talk) 21:39, 28 November 2023 (UTC)
It's within the 250–400 range, but longer than the 300ish median. WhatamIdoing (talk) 21:48, 28 November 2023 (UTC)
There's a separate problem looking at what's coming out of FAC; I've seen no evidence that leads are being reviewed, and plenty that they're not. There is one editor (Dying) who basically spends an entire day reviewing the lead of every WP:TFA and copyediting the blurbs-- which is work that should be happening at FAC. So, again, interested in Femke's analysis of what's coming out of FAR, as those tend to be more complex articles, and get more in-depth review (but then FAR misses the many short hurricanes, as they are all quagmired in a CCI). And some *very* short articles are coming out of FAC of late, another concern. SandyGeorgia (Talk) 21:38, 28 November 2023 (UTC)

Analysis with some more articles

I've finished the analysis using articles that were kept at a featured article review. Typically these might be a bit more meaty as the more core articles tend to be saved. Given that Sandy suspected that we're running out of meaty articles to run at today's featured article, I thought this would be a nice dataset to complement the first one with. The rounded values of the combined set (either the 9 to 95% or the 10 to 90% interval) are 200 to 500 words. This gives a nice range to avoid people over-interpreting but still provides some guidance to avoid people writing intimidatingly long unreadable introductions. What do you guys think? For the sentence interval, we could say between 10 and 20 sentences (80% interval: 9-22, 90% interval: 7-23 sentences). —Femke 🐦 (talk) 20:12, 28 November 2023 (UTC)

Looking at these articles (and some others), 500 words in the lead already feels somewhat too long for me. Personally I would be more comfortable with the old range of up to 400 words, or 450 as a "compromise". It's similar with the sentence range, but I guess 20 vs. 18 sentences is close enough that it doesn't matter much. Did you also check the paragraph count? Gawaon (talk) 20:40, 28 November 2023 (UTC)
I didn't, but feel free to add it to the table. —Femke 🐦 (talk) 20:47, 28 November 2023 (UTC)
Femke ... Looking at articles kept at FAR does broaden the sample to something meaningful, so good for that. If you browse the old data I used to keep at Wikipedia talk:Featured article statistics, you'll see that many of the historically very short and very long articles have ended up defeatured. In the case of short, they end up merged elsewhere, in the case of long, reasons vary (but often related to maintenance or people waking up and realizing the prose wasn't tight anyway). And, I suspect that FAC regulars are no longer rigorous about reviewing leads, so I'm glad you branched out in your data selection. So it's interesting that you came up with a good and broader range on kept FARs. Which articles were at the low end? That is, is there a trend in content area? And I do suspect that looking at words may be more helpful than paragraphs, so we don't end up with something like India, where it seems five paragraphs of information has been artificially shoved in to four, perhaps to "comply" with an over-interpretation of a guideline. SandyGeorgia (Talk) 20:52, 28 November 2023 (UTC)
Never mind; I see your list now. Not sure it's complete-- where did you pull it from so that it missed Lung cancer, Hanford Site, J. K. Rowling, Belton House, Diocletian, and many others (see Wikipedia:Featured article review/FASA/Records) for example? SandyGeorgia (Talk) 20:56, 28 November 2023 (UTC)
And speed of light passed FAR with five paras, but now has seven (sigh). SandyGeorgia (Talk) 21:03, 28 November 2023 (UTC)
For that article, I see 4 paragraphs/18 sentences/503 words at the original promotion, 5 paragraphs/23 sentences/571 words at the end of FAR, and 6 paragraphs/22 sentences/546 words today. It is an illustration of the principle that adding blank lines does not always mean that the lead has gotten longer. WhatamIdoing (talk) 21:48, 28 November 2023 (UTC)
Yes; for that reason, focus on paras is misleading (eg India). If we go back to something you stated on the other page (it's not FAC/FAR we have to worry about but newer editors who need guidance), I'm trying to understand how using a word or para or sentence count won't end up mis-used by the very people who need the help. The problems are always the same: POV pushing-- the attempt to get everything you can that supports the editor's POV in to the lead or infobox for maximum exposure. That's the problem that needs solving here, and I fear that the word count will just encourage the POV pushers to chunk in POV right up to the maximum mentioned in the guideline, with no sense of context around the numbers. Most editing isn't like the J. K. Rowling FAR, where experienced editors set aside their differences and wrote the lead following policy and sticking within a valid word range. Many lead problems revolve around POV, with not always experienced editors fighting it out. The huge emphasis on this guideline page needs to stay on due weight and summarizing key points according to preponderance of sources... and any mention of words, sentences, paras has to be highlighted as clearly secondary to that primary aim. That underlies my concern about reading too much in to these numbers, because any way we calculate them, I can tell you where there are sampling issues in FAs. SandyGeorgia (Talk) 22:10, 28 November 2023 (UTC)
First, I think we need to remember that we have many more leads that are too short than those that are too long. There are eight articles tagged with Template:Lead too short for every one with Template:Lead too long. Furthermore, some of these (e.g., Damon Runyon) are probably tagged because they have multiple shorter paragraphs but a very normal word count. Encouraging people to increase the size of the lead is not necessarily problematic, on average.
Second, we are not talking about adding a paragraph count. We have had a paragraph count in this page since literally its first revision. I am talking about providing better guidance for longer leads than what we have had since literally the first revision, which basically said (and still says) "Hey, the max is usually four paragraphs – so remove a couple of blank lines to comply. Four long paragraphs is definitely better than five shorter ones". If your main concern is the small number of articles with very long leads, then providing sentence or word counts should reduce the problem of people "hiding" overly long leads by cramming 900 words into four l-o-n-g paragraphs. You can't hide 900 words by changing the white space but keeping the same content. If we say that 250–400 words is the common length for FAs, then 900 is not going to be accepted as typical or desirable, and the only solution will be to remove some of that POV pushing that was chunked in there. I'm okay with that.
There is another option: We could remove all numeric guidance about lead length completely. We will stop saying anything about the number of paragraphs. We will not add information about sentences or word counts. Each editor will form their own personal opinions about whether all leads should be restricted to one paragraph, or that "too long" is two thousand words, or whatever else they personally feel like. There will be no standards at all. I do not recommend this approach. WhatamIdoing (talk) 22:30, 28 November 2023 (UTC)
I think talking about POV pushing is distracting here. This is mainly about readability. A readable article makes sure that the lead has enough information that many people don't have to read beyond it, while avoiding the problem of very long leads, which are too intimidating.
I've never found myself in discussion with people disagreeing that a lead was too short, whereas I have very often found myself in discussions with people who wanted to defend a 600 or 700 word lead that was difficult to parse. Anyway, whether we make it 200 to 500, or 250 to 400, this will signal to people that there must be a reason these are the typical values. I think the absence of the guidance is more likely to be abused than its presence, especially if we keep it descriptive rather than prescriptive.
In terms of the selection, I just took the first 30 articles from the very old and old list. Slightly more from the very old because the old list doesn't have 15 safe yet. Hence a few missing articles. —Femke 🐦 (talk) 19:54, 29 November 2023 (UTC)

Lists in leads

I noticed some disagreement in the Solar System article regarding a possible presentation of lists in the lead section by @CactiStaccingCrane. To me, this article seems like a place where such a relatively rare presentation is plausible: the topic involves the definition of a medium-sized list (not so small as to fit into prose, not so large as to not belong in the lead). The consensus in the reversion seems to be that prose is always preferred. If that is the case, then it should be stated explicitly here, I think.
Personally, I think there is nothing inherently wrong with lists in the lead section in specific cases, perhaps like the Solar System article—save for the argument that it may require too much vertical space, but I'm skeptical of that. Thoughts? Remsense 21:37, 3 December 2023 (UTC)

Pinging @Praemonitus and @OmegaMantis as they have recently reverted my edits. CactiStaccingCrane (talk) 01:03, 4 December 2023 (UTC)
This is horrible, and not a good idea. Also, considering that the article passed FAR a little over a year ago, why does that lead need to be messed with at all? XOR'easter. SandyGeorgia (Talk) 01:58, 4 December 2023 (UTC)
I, as a general-interest lay reader, read this lede this afternoon, and it sure felt that bulleting would help. After reading that lede, I was just a little bewildered by all those details and lists, and can remember only a little.
This is ironic (and sad) because Solar System went onto my Watchlist 10 or 12 years ago when I praised the then-current lede as nearly ideal: enough information for the casual reader and enough pointers to entice that reader to sections in the body of the article. But, as with many good ledes, successive additions and amendments over the years have destroyed those qualities. The lede to New York City has suffered similar elephantiasis by accretion. @Remsense, CastiStaccingCrane, SandyGeorgia, and Nikkimaria: —— Shakescene (talk) 02:18, 4 December 2023 (UTC)
Yep ... if a lead needs a list, it probably has too much detail. SandyGeorgia (Talk) 02:24, 4 December 2023 (UTC)
SandyGeorgia, if this is the consensus, I think it would be worthwhile to explicitly state in this guideline. Remsense 02:25, 4 December 2023 (UTC)
I've never seen anyone suggest adding a list to a lead before, so that seems WP:CREEPy. SandyGeorgia (Talk) 02:26, 4 December 2023 (UTC)
SandyGeorgia, perhaps. This is not my first time to the best of my recollection, but I figured it was worth a chat about. Remsense 02:28, 4 December 2023 (UTC)
New York City (858 words in the lead) ... ack! A good reason we should be giving some guidance on lead size, too. SandyGeorgia (Talk) 02:25, 4 December 2023 (UTC)
how do you feel about Solar System lede as it presently exists? Remsense 02:36, 4 December 2023 (UTC)
It's fine. It was fine when it passed FAR. From the standpoint of proportionally summarizing the rest of the article the way ledes are supposed to do, it was better in the version that passed FAR. XOR'easter (talk) 02:52, 4 December 2023 (UTC)
I'm not trying to start a discussion over the presentation in particular except as a general example of the guideline. I have no real opinion on that presentation other than it being plausible. Remsense 02:22, 4 December 2023 (UTC)
Perhaps we should link prose in the lead here. Moxy- 02:27, 4 December 2023 (UTC)
Tend to agree with "if a lead needs a list, it probably has too much detail." And we don't need to add a line item about this to MOS:LEAD per WP:CREEP and MOS:BLOAT. This kind of dispute is virtually unheard of (probably because everyone intuitively understands that lists are not how to write a lead), so there is no reason to address it here. MoS is already over-long, and no new rule should be added to it unless it is needed to deal with long-term, recurrent editorial strife of a particular sort.  — SMcCandlish ¢ 😼  09:08, 4 December 2023 (UTC)
I probably should have reverted the lead lists as soon as they appeared in the Solar System, but I am tired of fighting every little format issue. I was asked to take another look by CactiStaccingCrane and immediately tagged the list issue. Sorry if it caused such a fuss. While it is a detail-oriented article, to me the lead was just fine without the bullet lists. Praemonitus (talk) 15:17, 4 December 2023 (UTC)
One of the general problems with bulleted lists is that they draw attention, which means that they draw attention away from the sentences and paragraphs. Our guidance in the lead is probably correct, but our guidance elsewhere may be less than perfect. We recommend that people write something like:

The main causes of scaryitis are age and a high-fat diet.  Some less common risk factors include:

  • eating bacon,
  • drinking alcoholic beverages,
  • breathing dust,
  • running after dark,
  • talking to strangers, and
  • playing in the street.
It's easy for the eye to just skip right past the common points and focus on the list. Since everything in the lead is supposed to be important (at some level), we don't really want to format the lead so that the eye skips over any of it. WhatamIdoing (talk) 16:35, 4 December 2023 (UTC)
Yes, this is a good additional reason to not use vertical lists in the lead.  — SMcCandlish ¢ 😼  02:22, 5 December 2023 (UTC)