Wikipedia:Requests for comment/NOINDEX

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
From an easy look below, you can tell that the community is currently in favor of giving the new features a try, in the face of the possibility that things just may not work out either because a company fails to follow tags left or in the amount of time it takes for entries to disappear. The community approves the following two changes to the system. -- DQ (ʞlɐʇ) 01:59, 21 April 2012 (UTC)[reply]
Thanks, DeltaQuad, for your close :). Okay; I can confirm on behalf of the Wikimedia Foundation that we will start working on this as soon as possible. If I get any more information, and as we get more detailed estimations, I'll post them to the WP:NPT talkpage. Okeyes (WMF) (talk) 05:00, 22 April 2012 (UTC)[reply]

This Request for Comment covers a proposal to prevent search engines finding or caching potentially harmful pages through the addition of a fairly restricted "noindex" property. It comes from discussions with the community as part of development of the Page Triage software, where several editors suggested that a noindex tag would be helpful. As this tag is not technically dependent on New Page Triage, and should have input from the wider community, this independent Request for Comment was established.

There are two proposals: to noindex unpatrolled articles, and to noindex articles tagged with specific deletion templates such as "attack page" or "copyright violation". Editors are asked to indicate their support for or opposition to each feature; we are perfectly happy to turn on both, one or the other, or neither. Suggestions to include more deletion tags, or questions about the technical implementation, should be left on the talkpage. Note that this RfC will run for 30 days, starting on 20 March.

Background

There is already a way to set noindex on pages outside the main article namespace: the {{noindex}} template and its associated magic word. It is disabled in mainspace because it is easy to abuse, and there is currently no way to ensure enough oversight of where the magic word is used.

Noindex unpatrolled new articles

At the moment, unpatrolled new articles are indexed by Google and various other search engines. This can be a problem, because unpatrolled articles may be copyright violations, or attack pages that contain defamatory information. The developers at the Foundation have worked out a way to tie the "noindex" flag to the "patrol" function; this means that until an article is patrolled, it will not be indexed or cached by search engines, preventing the dissemination of potentially harmful information.
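As a rough illustration of the rule described above, the following Python sketch derives a robots directive from a page's patrol status. It is not the Foundation's implementation: the `Page` fields, the function names and the use of a `<meta name="robots">` tag are all assumptions made for the example.

```python
from dataclasses import dataclass

MAIN_NAMESPACE = 0  # assumption: 0 denotes the main article namespace


@dataclass
class Page:
    """Hypothetical, minimal model of a page for this sketch."""
    title: str
    namespace: int
    is_new: bool      # still in the new-page patrol queue
    patrolled: bool   # True once a patroller has marked it as patrolled


def robots_policy(page: Page) -> str:
    """Return 'noindex' for new, unpatrolled articles, otherwise 'index'."""
    if page.namespace == MAIN_NAMESPACE and page.is_new and not page.patrolled:
        return "noindex"
    return "index"


def robots_meta_tag(page: Page) -> str:
    """Render the directive as an HTML robots meta tag."""
    return f'<meta name="robots" content="{robots_policy(page)}">'


if __name__ == "__main__":
    draft = Page("Example article", MAIN_NAMESPACE, is_new=True, patrolled=False)
    print(robots_meta_tag(draft))   # unpatrolled: asks crawlers not to index
    draft.patrolled = True
    print(robots_meta_tag(draft))   # patrolled: indexable again
```

In this sketch the flag follows the patrol status alone, so editors cannot add or remove it by editing the article text. Edits to existing articles are unaffected because only new pages enter the patrol queue.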

One possible (and unfortunate) side-effect is that it could prevent articles on breaking news from being immediately indexed (for example, the Costa Concordia disaster). However, patrol usually happens quite quickly: writing the article takes significantly more time than patrolling it. We hope that improvements in the patrolling interface as part of the Page Triage project will reduce the workload on patrollers, make the process more efficient, and further minimise edge cases such as this.

Support – unpatrolled

  1. Those articles that cover current events (and are sufficiently notable to be here in the first place) are likely to be patrolled almost immediately, due to the level of interest they generate, so I don't think it will inhibit their search engine visibility. Pol430 talk to me 13:18, 20 March 2012 (UTC)[reply]
  2. As a NPPer, I've seen the backlog go from almost a month to just a few days recently. WP's purpose is to be a good reference of information, not a Breaking News outlet (WikiNews,CNN,Drudge Report,etc.). Having a subject not show up on google searches for a bit is a good thing (reduces the feeding frenzy aspect of breaking/unverified news). Hasteur (talk) 13:32, 20 March 2012 (UTC)[reply]
  3. Seems fairly obvious to me. — foxj 13:44, 20 March 2012 (UTC)[reply]
  4. Brilliant idea. Hopefully this will make it so NPP can slow down and take a more careful and targeted approach.--v/r - TP 13:47, 20 March 2012 (UTC)[reply]
  5. I can see no significant downside to this, but plenty of good reason to do so (copyvios, attack pages, etc.). - SudoGhost 13:50, 20 March 2012 (UTC)[reply]
  6. If it's tied to the patrol function, that will both ensure someone has seen it before it gets cached by Google and prevent malicious users from screwing around with NOINDEX. Sounds good to me. The Blade of the Northern Lights (話して下さい) 13:54, 20 March 2012 (UTC)[reply]
  7. Armbrust, B.Ed. Let's talkabout my edits? 15:00, 20 March 2012 (UTC)[reply]
  8. Can't think of a reason to object to something so simple and so likely to be effective. A fluffernutter is a sandwich! (talk) 15:02, 20 March 2012 (UTC)[reply]
  9. Absolutely. It doesn't matter if breaking news isn't immediately indexed (we're not a news source and we're not trying to get as many pagehits as possible) but it does matter if attack pages are cached and remain in Google's index for a while. /ƒETCHCOMMS/ 15:07, 20 March 2012 (UTC)[reply]
  10. Like sudoghost, I don't see any disadvantages to this and plenty of advantages.Yoenit (talk) 15:14, 20 March 2012 (UTC)[reply]
  11. Definitely. This is especially important re attack pages, copyvio, and (to a lesser extent) unreferenced BLPs and I honestly don't see any downsides to this. Voceditenore (talk) 15:51, 20 March 2012 (UTC)
  12. Can't see significant downsides. This could also reduce the incentive to create spam-type articles somewhat.  Sandstein  16:47, 20 March 2012 (UTC)[reply]
  13. Support, with caveat below - if we patch this gap, it looks good. Shimgray | talk | 20:27, 20 March 2012 (UTC)[reply]
    Looks resolved, now. Shimgray | talk | 23:33, 20 March 2012 (UTC)[reply]
  14. Support - Yeah, this is a terrific idea, automatically keeping short-lived attack pages and vandal-a-thons destined for Speedy off the Google machine will reduce the incentive for such behavior exponentially, without impacting anyone's editing experience in the least. Good thinking! Carrite (talk) 20:51, 20 March 2012 (UTC)[reply]
  15. Conditional support: Until Triage is up and running fully, if a page drops off because of the 30-day limit, it needs to remove the noindex tag just as if it had been patrolled. I'll support as long as that's included (it can be removed once NPT is up). - Jorgath (talk) (contribs) 21:08, 20 March 2012 (UTC)[reply]
  16. Support. This is a good idea. Wikipedia does not need to be broadcast as full of unreliable or inappropriate content, and "Googlers" should be spared inaccurate results. Our goal, in a way, is to spread reliable information to others via a user-friendly and reader-friendly encyclopedia that is hopefully as accurate as possible, and this would help in that regard. dci | TALK 22:11, 20 March 2012 (UTC)[reply]
  17. Support, with the same caveat as Shimgray. -- Finlay McWalterTalk 22:13, 20 March 2012 (UTC)[reply]
  18. Support; this eliminates the risk of ruining Wikipedia's reputation for reliability, which we try very hard to protect and improve. Whenaxis talk · contribs | DR goes to Wikimania! 22:55, 20 March 2012 (UTC)
  19. Support The only bad thing about this is why the heck nobody thought of it 7+ years ago. Andrew Lenahan - Starblind 00:06, 21 March 2012 (UTC)[reply]
  20. Support And kudos to the user who spotted we didn't already do this. Begoontalk 01:08, 21 March 2012 (UTC)[reply]
  21. Obvious support. MER-C 03:43, 21 March 2012 (UTC)[reply]
  22. Wow, increase the credibility of Wikipedia. What a wonderful idea? 68.55.112.31 (talk) 08:15, 21 March 2012 (UTC)[reply]
  23. Support This makes sense, and as a new page will now apparently remain listed for reviewing until it is seen by a reviewer, concerns about an inappropriate page sliding unseen under the net will be resolved. SilkTork ✔Tea time 09:52, 21 March 2012 (UTC)[reply]
  24. Support (talk→ BWilkins ←track) 11:37, 21 March 2012 (UTC)[reply]
  25. Support - will help with copyvio, attack pages, etc. Dougweller (talk) 11:50, 21 March 2012 (UTC)[reply]
  26. Strong Support Wikipedia articles are likely to have a high position in any search; this makes them a target for malicious editors interested in propagating false information, which may persist near the top of a search for some time, even if deleted quickly. This is particularly a problem for attacks and copyvios, but is also an issue for more subtle problems, which may not get appropriate tags, so I support the NOINDEX for all new articles, until they are patrolled. The downside that breaking news may not show up for some time is trivial, as we are talking minutes in most instances, and breaking news isn't even the mandate of an encyclopedia. (I see concerns about the old practice of dropping articles from unpatrolled after 30 days, but that seems to be addressed by the change which leaves them on the list until patrolled.) SPhilbrick(Talk) 12:39, 21 March 2012 (UTC)
  27. Support - excellent idea! Skier Dude (talk) 16:07, 21 March 2012 (UTC)[reply]
  28. Better than nothing. T. Canens (talk) 18:54, 21 March 2012 (UTC)[reply]
  29. Seems utterly reasonable. —Tom Morris (talk) 20:45, 21 March 2012 (UTC)[reply]
  30. Support: Definitely, good idea –meiskam (talkcontrib) 01:45, 22 March 2012 (UTC)[reply]
  31. Support Pharaoh of the Wizards (talk) 03:35, 22 March 2012 (UTC)[reply]
  32. Support Saves us from some damage, and will reduce the incentive for spam, which should improve the flow at NPP. Kanguole 11:30, 22 March 2012 (UTC)[reply]
  33. Support. Can't hurt; I wish this were to include all new articles for a few days, simply because it's annoying to search Google for sources on a new article and get only information about the new article, rather than information about its subject. Nyttend (talk) 15:20, 22 March 2012 (UTC)[reply]
  34. Support, provided that the extensions to the time unpatrolled articles remain on Special:NewPages are put in place. A good idea whose time has come. Grondemar 21:51, 22 March 2012 (UTC)[reply]
  35. I've seen tagged attack pages and hoaxes appear in Google for hours. This should not be. →Στc. 07:10, 23 March 2012 (UTC)[reply]
    Did you mean to put this in the "Noindex articles with specific deletion templates" section? – Allen4names 18:40, 23 March 2012 (UTC)[reply]
    No. I believe that they should be noindexed whether they've been tagged or not. →Στc. 21:26, 24 March 2012 (UTC)[reply]
  36. Support: definitely! Pesky (talk) 14:34, 24 March 2012 (UTC)[reply]
  37. Support - a high percentage of articles created by non-admins who don't have the "Autopatrolled" flag (that is, articles which aren't automatically patrolled) are speedy deleted when noticed, and some of these are attack pages. This feature will allow us to filter through these pages before they become widely visible through search engines. עוד מישהו Od Mishehu 12:57, 26 March 2012 (UTC)
  38. Support this commonsense way to ensure that newly created articles do not rocket to the top of Google search results before any checks have been done to ensure the content isn't an attack page, vandalism, copyright infringement, etc. 28bytes (talk) 20:21, 26 March 2012 (UTC)[reply]
  39. support on the strongest possible grounds. There is literally no downside to preventing Google from accessing pages that have never been vetted for accuracy or veracity. In addition there is a powerful upside in preventing BLP-attack and nonsense pages from EVER appearing with our marquee on them to non-wikipedia users. HominidMachinae (talk) 05:28, 27 March 2012 (UTC)[reply]
  40. Bmusician 07:09, 28 March 2012 (UTC)[reply]
  41. Support Sensible, limited. Glad to hear that unpatrolled articles will cease being dequeued; that'd be an issue here otherwise for me. --joe deckertalk to me 21:38, 28 March 2012 (UTC)
  42. Support Common sense and overdue. VolunteerMarek 07:38, 29 March 2012 (UTC)
  43. Support Seems sensible. --JN466 08:01, 29 March 2012 (UTC)[reply]
  44. Support. Piling in on the should have done long time ago wagon. --Piotr Konieczny aka Prokonsul Piotrus| talk to me 16:10, 29 March 2012 (UTC)[reply]
  45. Support – A high Google ranking comes with great responsibility. This feature will help to prevent newly-created attack pages, grudgewanks, trial-by-media material, and unencyclopedic, sensationalized material from instantly appearing on Google and damaging others. --Michaeldsuarez (talk) 12:53, 30 March 2012 (UTC)[reply]
  46. Support. NPP is one of our best lines of defence (though certainly not the only line of defence) against harmful content; this measure can help reduce the impact of any harmful content that hasn't been patrolled yet, and it doesn't add any extra work in day-to-day editing so it's effectively a freebie. I appreciate Rich Farmbrough's point, but I would argue that resolving serious content problems (or applying the right template so somebody else resolves them) is far more likely to be done by an established editor rather than by some novice passerby who arrived here via google. — Preceding unsigned comment added by Bobrayner (talkcontribs) 13:52, 30 March 2012
  47. Support iff this is configurable on-wiki by administrators or bureaucrats (i.e. it doesn't require developer intervention). Thryduulf (talk) 17:18, 30 March 2012 (UTC)[reply]
  48. Support though not affirming that it is a panacea of any sort at all. Collect (talk) 17:27, 30 March 2012 (UTC)[reply]
  49. Support -- excellent idea. Nomoskedasticity (talk) 18:54, 30 March 2012 (UTC)[reply]
  50. Support - about time. Noindexing new pages was discussed as long ago as November 2009, here (and surely that wasn't the first time). Rd232 talk 13:48, 31 March 2012 (UTC)[reply]
  51. Support - I'm in agreement with most reasons stated above. Keeps attack pages/ spam/ useless content off Google et al. SD5 21:53, 31 March 2012 (UTC)[reply]
  52. Support - Should have been done a long time ago. Once spammers realize that we're doing this, they'll have a lot less incentive to spam. —SW— converse 02:27, 3 April 2012 (UTC)[reply]
  53. Support - Would like to find reason to add here, but the 50 or so above have covered all of the bases. Mtking (edits) 07:39, 3 April 2012 (UTC)[reply]
  54. Support per Wikipedia_talk:New_Page_Triage#NoIndex_until_patrolled. I have read the oppose arguments here and at the original discussion and I find them unconvincing. True, this is not a magic bullet that will solve all our problems, but it will move things in the right direction; it should stop the frustration of seeing attack pages persist in mirrors and search engine caches for hours after deletion. So if regulators or the press ask us what we have done in 2012 to reduce the impact of vandalism and attack pages, this will be a great thing to point to. ϢereSpielChequers 21:08, 3 April 2012 (UTC)
  55. Support as it could possibly decrease the incentive to abuse. Of course, there might be many reasons why it might not work, but I don't see a lot of harm. We are no news outlet, and the important articles with a lot of immediate value will get patrolled quick enough. JHSnl (talk) 11:46, 4 April 2012 (UTC)[reply]
  56. Support Per above. Dusty777 16:51, 4 April 2012 (UTC)[reply]
  57. Support I am confident that spammers and some of the attack-page posters post to Wikipedia specifically because they know their page will show up as one of the first Google search results. This change will remove that incentive. NawlinWiki (talk) 19:48, 5 April 2012 (UTC)[reply]
  58. per Pol430 et al. 17:23, 8 April 2012 (UTC)
  59. Obvious support As WP is often one of the first or second results when one searches Google, we need to be pretty careful what gets up there. — Train2104 (talk • contribs) 22:19, 8 April 2012 (UTC)[reply]
  60. Support Once this is implemented, patrolling will actually mean something: a new article has been deemed worthy of being Googleable. Jclemens (talk) 05:27, 9 April 2012 (UTC)[reply]
  61. Support. Yes, yes, yes! There is no reason for not doing it. -- Alan Liefting (talk - contribs) 03:45, 11 April 2012 (UTC)[reply]
  62. Support. A common sense technological tweak that will improve the quality and reliability of reader experiences. Particularly important given Google's lightning fast indexing these days. Ocaasi t | c 10:19, 12 April 2012 (UTC)[reply]
  63. Support – I've found it astounding to see that these unreviewed articles are popping up in Google (almost instantly, in fact). This change is common sensical. mc10 (t/c) 20:21, 12 April 2012 (UTC)[reply]
  64. Support Will help in NPP checking. AllyD (talk) 07:49, 14 April 2012 (UTC)[reply]

Oppose – unpatrolled

  1. Unless this feature has a time out of say 24–72 hours after which the new page will be indexable I will oppose. – Allen4names 17:13, 20 March 2012 (UTC)
    That would sort of defeat the point of the feature ;p. One of the reasons people are suggesting this idea is psychological - it reduces pressure and allows people to happily work through at their own pace. If we stick in a time limit, the stress is still there. Okeyes (WMF) (talk) 17:15, 20 March 2012 (UTC)[reply]
    I would be willing to see the time out increased to up to seven days, beyond that (or even that far) may be contrary to Wikipedia:Assume good faith. – Allen4names 17:27, 20 March 2012 (UTC)[reply]
  2. a) This does not really solve the google problem, as bad content can be introduced at a later time and indexed. b) I also go with Allen that if this gets implemented there should be a timeout at a maximum of seven days. Agathoclea (talk) 18:29, 20 March 2012 (UTC)[reply]
  3. Opposing provisionally. Reason, we want eyes on new pages, and finding bad new pages will encourage people to fix them, and become editors. Rich Farmbrough, 19:21, 20 March 2012 (UTC).[reply]
    Your oppose makes sense, but I disagree. New page patrollers view articles quite often, as do other editors who monitor new additions to the encyclopedia. Though external viewers could be inspired to contribute, they may also be surprised by unreliable or wholly inappropriate content on a site reputed to be one of the Internet's more comprehensive encyclopedic resources. dci | TALK 22:13, 20 March 2012 (UTC)[reply]
    I disagree completely with your reason for opposing. Quite a lot of new-page-patrolling involves maintenance tagging, not actually fixing the articles. Once a page has been tagged, and marked as patrolled in the tagged state, your hypothetical newbies-from-Google will be able to see the tagged poor-quality articles, become editors, and fix them to their hearts' content. - Jorgath (talk) (contribs) 22:17, 21 March 2012 (UTC)[reply]
  4. It brings disadvantages, and as other said, it doesn't solve the problem.--NaBUru38 (talk) 23:06, 20 March 2012 (UTC)[reply]
  5. Oppose until such time as a proper crosswiki search engine is developed, right now the only way to look for a similar article and locate and deal with a crosswiki problem is using custom google search, this proposal would significantly limit this ability. Snowolf How can I help? 15:00, 21 March 2012 (UTC)[reply]
  6. Oppose, although we have no deadline this is the wrong way to "fix" a problem. mabdul 20:23, 24 March 2012 (UTC)[reply]
  7. I think that it's a good idea not to allow the use of NOINDEX in article space, which has been an established stance since Wikipedia started, and I see nothing here to convince me that it would be worthwhile to make an exception.
    — V = IR (Talk • Contribs) 00:16, 31 March 2012 (UTC)[reply]
  8. Status quo is satisfactory. Stifle (talk) 17:11, 1 April 2012 (UTC)[reply]
  9. Nah. I sometimes patrol the unpatrolled backlog and the overwhelming majority of articles there are solid content, with a speedy-able rate no higher than patrolling random pages. I also think the positive effects would be minimal – I don't know how long it takes for Google et al to index an article, but aren't obvious copyvios and attack pages typically deleted so swiftly that they're unlikely to be indexed at all? I think the cost to Wikipedia's relevance and the assumption of good faith outweigh the benefits. – hysteria18 (talk) 02:08, 12 April 2012 (UTC)[reply]
  10. Oppose, I think it's a good thing for people to find sub-par content on Google; it helps us get more editors as people jump in by fixing sub-par articles. Plus, I'm not convinced that the negative effects are really all that bad. Any article that is a significant liability would likely have been speedied well before anyone Googles it. JYolkowski // talk 18:05, 13 April 2012 (UTC)[reply]
  11. Oppose I think we all agree that serving our readers well is the primary goal here. Considering that there is zero evidence that entirely decent encyclopedia articles will not be hidden from readers somehow, I cannot support unselective noindexing. The reason Wikipedia exists is to be the best encyclopedia on the Web, and that is not a goal that demands every article be perfectly up to snuff from the moment it is created. Steven Walling • talk 22:24, 15 April 2012 (UTC)[reply]
Oppose opposition withdrawn - my objection was a concern that a flagged revision mechanism would cause extensive delisting. BO; talk 17:37, 19 April 2012 (UTC)
Is it possible you are confusing flagged revisions with newpage patrol? The only unpatrolled pages will be those of new articles, when an existing article is edited it will not become unpatrolled because it won't become a new article. To get the effect that worries you we would need to implement flagged revisions on EN wiki and make all unflagged revisions NoIndex until reviewed. Such a proposal would as you say cause all our frequently updated pages to be delisted, however no-one is proposing that we make new edits to old articles NoIndex - just newly created articles. ϢereSpielChequers 23:38, 19 April 2012 (UTC)[reply]
I've withdrawn my opposition, and removed the warning. BO; talk 21:11, 20 April 2012 (UTC)

Discussion – unpatrolled

  • One existing problem with new-page patrolling is articles slipping through the gaps - going unpatrolled for the full thirty days (or whatever period it is) then vanishing into the ether. As patrolling is currently linked directly to special:newpages, and can't be done from elsewhere, this means that there's the potential for indefinitely no-indexed articles unless we develop a second mechanism for retroactively finding unpatrolled pages. Can we assume that some kind of "all old noindexed articles" report will be available alongside this? It would also help alleviate objection #1 above... Shimgray | talk | 20:27, 20 March 2012 (UTC)[reply]
    We're going to introduce an end to the practise of letting unpatrolled articles lapse into the aether :). Unpatrolled articles will remain in the queue indefinitely, while patrolled ones will stick around for 60 days. Okeyes (WMF) (talk) 20:35, 20 March 2012 (UTC)[reply]
    This looks like it solves the problem; if unpatrolled articles are recorded indefinitely, we can regularly sweep the backlog and avoid articles being "invisibly" noindexed. Shimgray | talk | 23:32, 20 March 2012 (UTC)[reply]
  • What about those which have already got through NewPages without being patrolled, is there a way to find them (and noindex them too)? --07:39, 21 March 2012 (UTC), Utar (talk)
    There isn't, really :(. Okeyes (WMF) (talk) 12:01, 22 March 2012 (UTC)[reply]
    Pages which have already passed the 30 day window are added to Category:Unreviewed_new_articles, as of writing this there are 78 pages in the category. –meiskam (talkcontrib) 18:43, 30 March 2012 (UTC)[reply]
  • Will there be a hidden category (for example Articles NOINDEXed because of not being patrolled)? --07:39, 21 March 2012 (UTC), Utar (talk)
    Probably not; as I understand it it will be tied into the "patrolled=0" or "patrolled=1" hook in the database. Okeyes (WMF) (talk) 12:01, 22 March 2012 (UTC)[reply]
  • If you want to know what's written about you on the internet, googling your name is the obvious way. Isn't it problematic to hide our sins (libelous content and such) by not indexing?-- (talk) 10:05, 26 March 2012 (UTC)
    • Can you explain what you mean by "problematic"? Okeyes (WMF) (talk) 06:50, 27 March 2012 (UTC)[reply]
      • I'm talking about BLP issues, and I think that if the libel is there, the "victim" ought to be able to find it and set it right. The proposed NOINDEX is an obvious good idea, but there is this problem with it. Hence, I've neither supported nor opposed the proposal, hoping to be enlightened through this discussion.-- (talk) 09:30, 27 March 2012 (UTC)[reply]
        • Well, the NOINDEXing is really a short-term measure; it lasts up to and until the article is patrolled. That's normally very quickly; within 30 days is the standard. Okeyes (WMF) (talk) 16:16, 27 March 2012 (UTC)[reply]
  • My main concern with these sorts of proposals is that indexers (Google, Yahoo!, Microsoft, et al.) will begin systematically ignoring NOINDEX even more than they already do. NOINDEX is really only a suggestion as it is, but this proposal (and the one below) seems to assume that it's some sort of magic bullet. Maybe I'm missing it, but I think that some evidence that adding NOINDEX would have any effect at all would be helpful (both in that the indexers will abide by it, and that them doing so would be helpful).
    — V = IR (Talk • Contribs) 17:07, 30 March 2012 (UTC)[reply]
    We've got contacts at Google, which already recognises our NOINDEXes. The reaction I got when confronting the devs with this same issue is "They're fine with it. We just need to drop them an FYI before we switch it on". I can't speak for other search engines, I'm afraid. Okeyes (WMF) (talk) 17:09, 30 March 2012 (UTC)[reply]
    That doesn't surprise me, but it doesn't exactly address the concern in my opinion. I mean, that's not a contractual understanding at all, it's simply "yea sure, that won't bother us". It's basically meaningless. Someone at Google (presumably in management) could come in to work tomorrow with a position something like "abiding by NOINDEX tags is costing us revenue, so this memo is to get your department to configure our system to ignore all NOINDEX'ing", or some such thing. Doing anything like this with a "wink, wink, nod, nod" assurance, where the effects are (completely!) outside of the control of anyone on Wikipedia or at Wikimedia seems like a bad idea to me. I doubt that this will have any immediate impact at all, and I believe that contacts in Google are being honest in their answers to questions about this, but that won't have any bearing on what happens in the future (nevermind the question about other search engines).
    — V = IR (Talk • Contribs) 20:39, 30 March 2012 (UTC)[reply]
    It's a distinct possibility whenever we do anything that involves us and the outside world. I mean, we have no assurances that google won't NOINDEX the whole damn site, either ;P. But we have a good (albeit informal) working relationship with them, and it seems unlikely. I'm not really sure what I can do to allay your concern, though - there's nothing short of a contractual obligation that would mean "yes, no really, we're definitely not going to ignore your NOINDEX". Okeyes (WMF) (talk) 20:58, 30 March 2012 (UTC)[reply]
    Right, you got it (what the concern is)! :) The reason that I'm bringing this point up is because I don't understand why this should be done when it creates a potential reason for Google and other search engine operators to behave differently. I'm fairly confident that as long as nothing changes then very little will change on the part of Google, but... this is a change. Combine it with other things that may or may not happen externally and who knows what things will be like in 6 months or a year from now (or longer). If there's such a good understanding with Google and others about this then why not simply turn off the configuration restrictions on NOINDEX tagging working everywhere in the main space?
    — V = IR (Talk • Contribs) 21:07, 30 March 2012 (UTC)[reply]
    As in, why not allow NOINDEX everywhere? Because that would be a nightmare for us and a heaven for vandals; people could screw with and blacklist articles willy-nilly. Were you thinking the NOINDEX-not-working-in-mainspace thing was because we are worried about search engines? That's not it; it's to do with our transparency and, well, WP:BEANS. Okeyes (WMF) (talk) 22:56, 30 March 2012 (UTC)[reply]
    Agreed, so... why are we discussing changing that at all?
    — V = IR (Talk • Contribs) 23:06, 30 March 2012 (UTC)[reply]
    Because we're discussing it in a form that doesn't permit people to mess around and just add the tag to anything they want :). Okeyes (WMF) (talk) 23:16, 30 March 2012 (UTC)[reply]
    What makes you think that Google and other Search engine operators even consider ignoring this proposal? If they were to ignore it they would be deliberately giving publicity to attack pages and so forth, and if the press caught them doing that they would be roasted. ϢereSpielChequers 20:48, 3 April 2012 (UTC)[reply]
  • I would be ok with this, assuming all articles are patrolled within a reasonable time. We don't want articles to get stuck in some kind of limbo and never show up on search engines because we have too few patrollers or it was a particularly busy day. --Apoc2400 (talk) 21:47, 8 April 2012 (UTC)[reply]

Noindex articles with specific deletion templates

Another idea (and one that can run in parallel with the proposal to noindex unpatrolled articles) is to enable the "noindex" flag on a specific set of deletion templates. The list of noindex'd templates would be found in a MediaWiki page editable by admins, meaning that if the community wishes to expand the list (for example, to include not only the copyvio CSD tag, but also the page-sized tag used for salvageable articles) they would be able to. This proposal differs from simply enabling the existing {{noindex}} template in mainspace, in that only templates which have a lot of in-built oversight (in the form of CSD queues) would be able to set the property, thus preventing abuse.

The proposed templates to start with are:

  1. The CSD G10 template, which covers attack pages;
  2. The CSD G12 template, for copyright violations, and;
  3. The CSD G3 template, for pure vandalism - something that can sometimes be used as a substitute for an "attack" tag.

Other possible templates to include can be discussed separately by the community. This feature can run in parallel to the proposal to noindex unpatrolled articles, or on its own; they are not technically tied together. If the community chooses to reject one option and accept the other, this is perfectly feasible.
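The template-based mechanism described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the list format (one template name per line, with `#` comment lines), the template names shown and all function names are hypothetical, and the actual MediaWiki page and its syntax were not specified in the proposal.

```python
from typing import Iterable, Set

# Hypothetical contents of the admin-editable list page.
# The names below are illustrative stand-ins for the G10, G12 and G3 tags.
EXAMPLE_LIST_PAGE = """\
# Templates whose presence marks a page as noindex
Db-g10
Db-g12
Db-g3
"""


def normalize(name: str) -> str:
    """Normalise a template name: underscores to spaces, first letter upper-case."""
    name = name.replace("_", " ").strip()
    return name[:1].upper() + name[1:]


def parse_template_list(text: str) -> Set[str]:
    """Read one template name per line, skipping blank lines and # comments."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.add(normalize(line))
    return names


def should_noindex(templates_on_page: Iterable[str], listed: Set[str]) -> bool:
    """True if the page carries at least one template from the list."""
    return any(normalize(t) in listed for t in templates_on_page)


if __name__ == "__main__":
    listed = parse_template_list(EXAMPLE_LIST_PAGE)
    print(should_noindex(["stub", "db-g10"], listed))   # tagged as an attack page
    print(should_noindex(["stub", "infobox"], listed))  # no listed template
```

Keeping the list in a single admin-editable page, as proposed, means the set of templates can be extended by the community without a software change, while ordinary editors still cannot set the flag directly.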

Support – templates

  1. Support this in conjunction with the above proposal. It's not only new pages that can be tagged for speedy deletion. Occasionally, we find some quite old ones that meet the above criteria (mostly copyvio), but have already been patrolled. Whilst they have probably already been indexed, there is no harm in damage limitation. Pol430 talk to me 13:23, 20 March 2012 (UTC)[reply]
    Actually, because of the way google NOINDEXes us, tagging already-indexed articles actually removes them from the cache. So it'd even work here :). Okeyes (WMF) (talk) 13:36, 20 March 2012 (UTC)[reply]
    You're assuming that they will never change that (and that their relevance to the world will always be what it is today...). They do that for their own reasons, which could change at any time and without any notice.
    — V = IR (Talk • Contribs) 17:12, 30 March 2012 (UTC)[reply]
  2. Seems fairly obvious to me. — foxj 13:44, 20 March 2012 (UTC)[reply]
  3. Support adding it to the three specific CSD tags above, especially given Okeyes (WMF)'s comment above (which I didn't know about) - SudoGhost 13:52, 20 March 2012 (UTC)[reply]
  4. I think "CSD F9 Unambiguous copyright infringement" should be also added to this list. Armbrust, B.Ed. Let's talkabout my edits? 15:00, 20 March 2012 (UTC)[reply]
  5. These templates fairly indisputably indicate things we don't want google attributing to us. A fluffernutter is a sandwich! (talk) 15:02, 20 March 2012 (UTC)[reply]
  6. Really, quite obvious. /ƒETCHCOMMS/ 15:07, 20 March 2012 (UTC)[reply]
  7. Support per the above. Two questions, though, and maybe Okeyes can weigh in on them: First, could we add the F9 criteria to this proposal, or would that come as part of the general Copyvio criteria? Second, is there a way to override the NOINDEX generated by the template? Could someone with what is a disputed copyvio article (and tagged as such) add a template that would function, essentially, as a "NO REALLY INDEX THIS" tag? Or would the template's NOINDEX function trump that one? If not, could we (or should we) disable that function as part of this? UltraExactZZ Said ~ Did 15:37, 20 March 2012 (UTC)[reply]
  8. Per everyone above. Template:Copyvio will blank the article, but is generally only added if the situation is more complex and it isn't used for unambiguous blatant copyvio which can be speedied. Voceditenore (talk) 15:56, 20 March 2012 (UTC)[reply]
  9. Yeah, it makes sense that we don't want to index pages with these tags while the issue is unresolved.  Sandstein  16:48, 20 March 2012 (UTC)[reply]
  10. Support adding noindex to G10 and G12 templates. – Allen4names 17:20, 20 March 2012 (UTC)[reply]
    (See discussion by DCI2026 (talk · contribs) below.) – Allen4names 16:37, 22 March 2012 (UTC)[reply]
  11. Support adding noindex to all the speedy deletion templates. No change needed to the templates. -- John of Reading (talk) 18:06, 20 March 2012 (UTC)[reply]
  12. Partial support: I support it for G10, G12, and F9, but not G3. Or at least not G3 until we see how it works on the others. - Jorgath (talk) (contribs) 21:10, 20 March 2012 (UTC)[reply]
  13. "Googlers" should not in any case encounter articles that have been so questioned by the community that they warrant a speedy deletion according to the definitions of those categories. dci | TALK 21:29, 20 March 2012 (UTC)[reply]
  14. Support: eliminates the risk of Wikipedia being portrayed as running awry and bubbling with attack pages. Whenaxis talk · contribs | DR goes to Wikimania! 22:57, 20 March 2012 (UTC)[reply]
  15. Obvious support. I'd also consider NOINDEXing G11 as well to reduce the spam incentive. MER-C 04:10, 21 March 2012 (UTC)[reply]
  16. Support the basic idea as obviously needed for things like copyvios but see my comments below on implementation. Dpmuk (talk) 04:33, 21 March 2012 (UTC)[reply]
  17. Support: Protects WMF from liability, and mitigates the liability of individual authors. 68.55.112.31 (This is not legal advice.) 08:15, 21 March 2012 (UTC)[reply]
  18. T. Canens (talk) 18:54, 21 March 2012 (UTC)[reply]
  19. Support I would support this version of the proposal. It Is Me Here t / c 23:49, 21 March 2012 (UTC)[reply]
  20. Support: Also add CSD F9 –meiskam (talkcontrib) 01:45, 22 March 2012 (UTC)[reply]
  21. Support: a CSD tag indicates that sooner or later an admin will take a look at it, so even if suboptimal for the perspective of locating crosswiki issues, somebody should eventually properly handle it locally. Snowolf How can I help? 12:45, 22 March 2012 (UTC)[reply]
  22. Support; another excellent idea. I would favor applying this to all speedy deletion templates, not just the ones listed above. Grondemar 21:53, 22 March 2012 (UTC)[reply]
  23. Support: again, definitely :D Pesky (talk) 14:38, 24 March 2012 (UTC)[reply]
  24. Support, +F9. mabdul 20:23, 24 March 2012 (UTC)[reply]
  25. Support Pharaoh of the Wizards (talk) 02:40, 26 March 2012 (UTC)[reply]
  26. Support. If this is as straightforward as it should be, add a line of code to the deletion template, then this is a no-brainer. The danger that Google will crawl and cache an attack page in the, hopefully short, period of time between when it is tagged and deleted is greater than the one of legitimate pages being missed because they were incorrectly tagged. I don't feel strongly about the inclusion of G3, when correctly used (or when G10 is also applicable) it is fine but as something of a catch-all category it is more often misused than some of the others. Eluchil404 (talk) 08:21, 28 March 2012 (UTC)[reply]
  27. Support For G10, G12, honestly not sure (and thus Neutral) w.r.t. G3. I see the benefit as being a relatively low-overhead harm reduction. --joe deckertalk to me 21:42, 28 March 2012 (UTC)[reply]
  28. Support Per my own comment on prop 1.VolunteerMarek 07:39, 29 March 2012 (UTC)[reply]
  29. Support. Seems reasonable. --Piotr Konieczny aka Prokonsul Piotrus| talk to me 16:10, 29 March 2012 (UTC)[reply]
  30. Support – For mainly the same reasons that I support the first proposal listed. We need to act responsibly. --Michaeldsuarez (talk) 13:18, 30 March 2012 (UTC)[reply]
  31. Support iff the list of templates that use NOINDEX is configurable on-wiki by administrators or bureaucrats (i.e. it doesn't require developer intervention). Thryduulf (talk) 17:16, 30 March 2012 (UTC)[reply]
  32. Support - same as my reasoning for support on the unpatrolled pages, keeps attack pages etc off search engines. SD5 21:55, 31 March 2012 (UTC)[reply]
  33. Makes sense. —Strange Passerby (talkcont) 12:46, 2 April 2012 (UTC)[reply]
  34. Support - seems a well thought out proposal. Mtking (edits) 07:41, 3 April 2012 (UTC)[reply]
    Support - I thought this was already in place.  An optimist on the run! 07:05, 4 April 2012 (UTC)[reply]
    Switching to Oppose - see below.  An optimist on the run! 19:32, 12 April 2012 (UTC)[reply]
  35. Conditional Support - Brilliant idea, as long as abuse is prevented. (E.g. the suggestion below that people do a no-show div with the CSD template in it.) Also, I think it would be wise to check if there is an increase in tagging with these specific templates after this mechanism is in place. JHSnl (talk) 11:44, 4 April 2012 (UTC)[reply]
  36. Support with reservations. Anything that causes a nocat should also suppress the noindex, and there should be a noindex parameter added to all CSD tags that puts articles in an additional category (for use on egregious cases of anything else).
  37. Support and as a related RFC I want all templates themselves to have NOINDEX. See Wikipedia talk:Template messages#NOINDEX for all template messages. -- Alan Liefting (talk - contribs) 03:50, 11 April 2012 (UTC)[reply]
  38. Support. We wouldn't want these articles shared, so they shouldn't be indexed either. Makes sense. Ocaasi t | c 10:19, 12 April 2012 (UTC)[reply]
  39. Support for G10, G12, and F9. I was originally in support of G3, but now feel differently. I think it is in our best interests to use this approach as narrowly as possible. Google does not HAVE to honor Noindex, and if we cast too wide a net, they might decide not to honor it. (I would hope we would have a discussion if that were a possibility, but I don’t know how communication between WP and Google occurs, if at all.) I think it is likely they would completely understand and support the no-index of copyvios, and understand the desire to noindex attack pages. However, the broader the net, and the more judgment allowed, the more likely it is that Google might decide their interests in broad indexing trump our desire to exclude material. Let's make it easy for them by staying as narrow as possible.--SPhilbrick(Talk) 19:50, 12 April 2012 (UTC)[reply]
  40. Support all templates mentioned above plus F9 – There is no reason that pages tagged with these templates should be indexed. mc10 (t/c) 20:23, 12 April 2012 (UTC)[reply]
  41. Support for G12/F9 and for G10. AllyD (talk) 07:53, 14 April 2012 (UTC)[reply]
  42. Support Absolutely! If a human editor has gone to the trouble of identifying truly harmful articles like attack pages etc, then noindex should be judiciously applied in order to prevent harm (beyond mere factual inaccuracy, which we have never and will never be able to guard against 100% of the time). Steven Walling • talk 22:26, 15 April 2012 (UTC)[reply]
  43. Support - these articles are garbage. Why let readers see them? Chutznik (talk) 23:46, 17 April 2012 (UTC)[reply]
  44. Support - under the condition that the directive on the page will not be contradicted by a directive off the page. BO; talk 22:54, 19 April 2012 (UTC)[reply]

Oppose – templates

  1. Oppose adding noindex to G3 template for now. I would prefer to wait and see how this works on the other two. – Allen4names 17:20, 20 March 2012 (UTC)[reply]
    Would you support it if we said that we will first do it just for G10 and G12, leave it that way for a month, and then discuss other templates we may want to do it with? עוד מישהו Od Mishehu 06:44, 29 March 2012 (UTC)[reply]
    If this gets abused I would want this feature fixed or removed. Other templates can be added later after an RFC. I would either support or oppose on a case by case basis. – Allen4names 04:01, 30 March 2012 (UTC)[reply]
  2. Provisional oppose. These categories should be resolved pretty quickly. For that reason adding NOINDEX is against KISS/Occam's razor. Rich Farmbrough, 19:25, 20 March 2012 (UTC).[reply]
  3. Seems likely to create problems - especially since the nature of search engines is that they do not quickly remove pages which are later tagged as NOINDEX. The ability of people to deliberately template articles with the specific hope that they will then fall off of search engines is quite likely to cause more problems than it solves. Let's just stick with the simpler proposal above which has very wide support. Collect (talk) 17:32, 30 March 2012 (UTC)[reply]
  4. I think that it's a good idea not to allow the use of NOINDEX in article space, which has been an established stance since Wikipedia started, and I see nothing here to convince me that it would be worthwhile to make an exception.
    — V = IR (Talk • Contribs) 00:16, 31 March 2012 (UTC)[reply]
  5. Doesn't seem useful. Most speedy deletion requests are taken care of within a few hours at most. I highly doubt that Google is crawling through all of Wikipedia several times per day. In other words, the article will be dealt with by the time Google updates their search results. And, if we have templates which cause a page to stop showing up in search results, someone will eventually figure out a way to abuse it, and artificially affect the search results for their least favorite pages. —SW— chat 02:30, 3 April 2012 (UTC)[reply]
    As I've explained above, google is patrolling constantly. They've hooked up our recent changes feed; when something alters, a spider is queued to find the new version. Okeyes (WMF) (talk) 02:33, 3 April 2012 (UTC)[reply]
    That's interesting, I wasn't aware of that. However, it still will only affect google's search results for a few hours at the most. I'm not convinced that it's worth exposing this for potential abuse. —SW— converse 17:56, 3 April 2012 (UTC)[reply]
    Can you explain what potential abuse you see? Okeyes (WMF) (talk) 18:10, 3 April 2012 (UTC)[reply]
  6. per Rich Farmbrough, Collect. KillerChihuahua?!? 17:25, 8 April 2012 (UTC)[reply]
  7. Oppose as unworkable. I've been doing some testing in my sandbox: I added a meaningless phrase, and checked how long it was before Google picked it up. This took about 12 minutes. I then added a __NOINDEX__ tag, and waited to see how long this would take to register. So far it's been over an hour, and my sandbox is still appearing in Google. (Before anyone asks, I have purged my browser.) Given that attack pages are very rarely around for that length of time, and are usually blanked anyway, adding NOINDEX via a template would seem to have no effect.  An optimist on the run! 19:33, 12 April 2012 (UTC)[reply]
    But that argument only applies to noindexing associated with speedy tags. One of the main uses I'd see would be with the copyvio tag which, because of the process it's involved in, will be around for at least a week. Dpmuk (talk) 19:45, 12 April 2012 (UTC)[reply]
    Result of my test - the page eventually disappeared from Google's listing after four days - much longer than the average time that CSDs stay up. This therefore seems like a complete waste of time.  An optimist on the run! 09:39, 17 April 2012 (UTC)[reply]
  8. Oppose Too much scope for gaming. Warden (talk) 16:51, 13 April 2012 (UTC)[reply]

Discussion – templates

  • This will require a configuration change since __NOINDEX__ is currently disabled in content namespaces (via $wgExemptFromUserRobotsControl). Amalthea 17:28, 20 March 2012 (UTC)[reply]
    Interesting point :). The devs feel comfortable they can push this through, but I'll pass it on in my update email today. Okeyes (WMF) (talk) 17:48, 20 March 2012 (UTC)[reply]
  • Will this flush what is already in Google? Will there be a mechanism to stop any abuse, i.e. placing malicious speedies to clear a competitor out of search engine rankings? Agathoclea (talk) 18:32, 20 March 2012 (UTC)[reply]
    Yes, as Okeyes mentioned in response to me, CSD tagging an article removes it from the cache. Pol430 talk to me 20:24, 20 March 2012 (UTC)[reply]
    What on Earth are you talking about? --MZMcBride (talk) 17:51, 12 April 2012 (UTC)[reply]
    The mechanism for that would be the admins who evaluate CSD tags. An admin checks the accuracy of each tag before performing the requested article deletion, and if the tag is found to be inappropriate, will remove the tag and possibly warn the tagger. If a tagger makes a habit of malicious or extremely poor tagging and refuses to stop, they would end up blocked for disruption. A fluffernutter is a sandwich! (talk) 18:52, 20 March 2012 (UTC)[reply]
  • I support this proposal, but have a question. What about articles that someone CSD-tags with disruptive intentions? Obviously, well-known articles would find fake CSDs removed almost immediately, but someone could probably post a tag on Louis Joseph, Duke of Vendôme without too much oversight. Of course, this would still be removed quickly by an admin, but could any damage take place during that short time? dci | TALK 21:34, 20 March 2012 (UTC)[reply]
    Barring a mistaken deletion by the admin, not really. The article would be removed from Google's index while the speedy tag was in place, but when the admin removes the tag, Google would re-index the article, AFAIK. - Jorgath (talk) (contribs) 21:44, 20 March 2012 (UTC)[reply]
    We could also have sub-templates noindexed such as {{Db-g10/noindex}} that can be used by trusted editors in place of {{Db-g10}}. The edit filter should be usable for this purpose. – Allen4names 16:37, 22 March 2012 (UTC)[reply]
    What would the criteria be? Editors with autopatrol rights? Autoconfirmed editors? A separate user right that you apply for like you apply for autopatrol rights? Something else? - Jorgath (talk) (contribs) 18:46, 22 March 2012 (UTC)[reply]
    It would likely be bundled with any user right that can be given by an admin. This does not preclude there being a new user group such as noindexer. – Allen4names 05:37, 23 March 2012 (UTC)[reply]
    It would be difficult to implement, I imagine. We could use the edit filter, or the devs could give us some way of setting a page to be non-transcludable except by authorized users, but I doubt that either one is really that worth it. עוד מישהו Od Mishehu 06:49, 29 March 2012 (UTC)[reply]
    What about someone adding <div style="display:none;">{{db-g10}}</div> to articles? Helder 21:35, 24 March 2012 (UTC)[reply]
    That's a good point; I'll bring it up. Okeyes (WMF) (talk) 21:36, 24 March 2012 (UTC)[reply]
    These still appear in the relevant category e.g. Category:Candidates for speedy deletion as copyright violations and so won't go unnoticed (see [1]). MER-C 04:16, 25 March 2012 (UTC)[reply]
    Ok. So, what if the user also suppresses the categorization, as in <div style="display:none;">{{db-g10|demo=true}}</div>? Helder 02:33, 26 March 2012 (UTC)[reply]
    Either the edit filter could be used to prevent such edits or a subtemplate such as I suggested above could be modified so that it would be categorized regardless of any parameter values set. – Allen4names 17:19, 26 March 2012 (UTC)[reply]
  • However this is implemented I think it's important that the list of tags that cause __NOINDEX__ is community (probably admin) maintainable. If this is implemented by simply allowing __NOINDEX__ in mainspace then there isn't a problem (although it creates other problems such as vandals adding it) but if it's implemented in other ways there may be. My reason for this is that I can think of at least one other template not already mentioned and which isn't really a deletion template, {{copyviocore}}, where adding __NOINDEX__ makes sense and I think a case could be made for {{copypaste}} as well. No doubt in time we'll change the name of templates, decide on additional templates where this makes sense etc. Having to get developers involved every time would be a non-starter. Dpmuk (talk) 04:33, 21 March 2012 (UTC)[reply]
    As I understand, it'll work like so; NOINDEX will be enabled, with a call to a special MediaWiki page. If the template that is calling NOINDEX appears on that page, it works; if not, it doesn't. The page will be fully protected, which means that any admin in theory (although, practically speaking, I'd hope "any admin following wide support") can add a template. So for example we could institute this and then decide, 6 months down the line, that we want that massive, hideously ugly copyvio template to also use this function. An admin adds the template to the MediaWiki page, and boom. Okeyes (WMF) (talk) 08:26, 21 March 2012 (UTC)[reply]
  • If you want to know what's written about you on the internet, googling your name is the obvious way. Isn't it problematic to hide our sins (libelous content and such) by not indexing?-- (talk) 10:04, 26 March 2012 (UTC)[reply]
    • Once a page has been tagged (correctly) with one of these tags, the problem is already being dealt with (unless some vandal removes them). They're in a special category for dealing with the problem, a category which admins go to a lot. And I don't buy that these are "our sins" - they're the sins of people using our site against our rules, not our sins. We didn't create these attack pages. עוד מישהו Od Mishehu 04:36, 29 March 2012 (UTC)[reply]
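
The hidden-tag concern raised in this discussion (a speedy template wrapped in a div with display:none, with or without demo=true) can be illustrated with a small Python check. This is only a sketch of the kind of pattern an edit filter rule might look for. It is not edit filter syntax, the function name is hypothetical, and the regular expression is an assumption that covers just the simple inline-style case quoted above.

```python
import re

# Matches a <div> whose inline style contains display:none and which
# wraps a {{db-...}} template before its closing </div>.
# Assumption: only the simple inline-style form from the discussion is
# covered; CSS classes, other tags, or nested divs are not handled.
HIDDEN_SPEEDY_RE = re.compile(
    r"<div[^>]*style\s*=\s*[\"'][^\"']*display\s*:\s*none[^\"']*[\"'][^>]*>"
    r"(?:(?!</div>).)*?\{\{\s*db-[^{}]*\}\}",
    re.IGNORECASE | re.DOTALL,
)


def has_hidden_speedy_tag(wikitext):
    """Return True if a speedy tag sits inside a display:none div."""
    return HIDDEN_SPEEDY_RE.search(wikitext) is not None
```

A check like this would flag both variants quoted in the discussion while leaving a normally visible {{db-g10}} alone, which is the distinction the edit filter suggestion relies on.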
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.