Wikipedia talk:Contributor copyright investigations/20210315

From Wikipedia, the free encyclopedia

Questions and such

@SandyGeorgia, hope you don't mind that I'm answering your questions on this talk page so the investigation page doesn't get cluttered.

  • On the biblical name articles, that's indeed highly annoying. The good news is that the definitions of the names, which I would consider the only "copyrightable" content in those edits, have been removed from them over the years. I don't think the remaining list of names is creative enough to be considered copyrightable, so fixing the attribution isn't needed. I'll mark them down on the listing.
  • For the Sandbox edit: the Internet Archive isn't loading for me right now, but we can blank/delete the sandbox; attribution isn't needed when there's only one author. The sandboxes are listed more to give an idea of how Doug edited and where certain content came from. When editing a CCI where a copyvio has been transferred between a few different pages, I try to remove it from the page it originated on, and then move on to dealing with the other ones.

If you have any further questions I'm happy to answer :) Moneytrees🏝️(Talk) 23:25, 25 January 2023 (UTC)

Great ... this much helps. And I'll put future questions here. Out for the evening, more tomorrow. SandyGeorgia (Talk) 23:28, 25 January 2023 (UTC)

Query WP:CP threshold

See 7&6 topic ban

As explained in the 7&6 AN topic ban, and as I've seen throughout the DC articles and in particular in the Ludington Public Library example, the writing is almost always all DC, while 7&6 would follow to do mostly technical cleanup (citations and the like), and 7&6 doesn't seem to have done any source-to-text integrity or copyvio checking.

I'm also finding it takes up to four hours to clean a DC article when it can't be sent to WP:CP, usually because other editors have done some cleanup so that the percentage contribs left to DC falls below 80% ... forcing reviewers to check every newspaper clipping and seek out the actual source, which is often misnamed.

The authorship tools often show DC's percent participation lower than what is revealed if one looks at the actual content via WP:Who Wrote That?. MER-C, when deciding whether to send an article up to WP:CP, can 7&6's contribs be disregarded if they are all technical? What other advice would be helpful in deciding when to send an article on to WP:CP? SandyGeorgia (Talk) 14:34, 1 March 2023 (UTC)

@SandyGeorgia I don't use Who Wrote That, but as an alternative, when I'm checking whether an article can be listed for presumptive deletion, I have two different strategies:
  • I'll just look at the entire history and determine if certain edits not made by the creator were copyedits/other minor ones
  • I'll take the url of the diff of when the creator last made a major edit, put it into the "URL comparison" field of Earwig, and then compare it to the current revision of the article, which will highlight overlap.
Hope that can help. Moneytrees🏝️(Talk) 15:03, 1 March 2023 (UTC)
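For anyone who wants to try the second strategy offline, here is a minimal sketch of the same idea: compare the text of the creator's last major revision against the current text and list the long word runs that survive. This is not how Earwig works internally; it swaps in Python's standard `difflib` as a rough stand-in. The function names `shared_runs` and `surviving_fraction`, the five-word cutoff, and the sample sentences are all made up for illustration.

```python
from difflib import SequenceMatcher


def shared_runs(old_text, new_text, min_words=5):
    """Return word runs of at least min_words that appear in both texts."""
    old_words = old_text.split()
    new_words = new_text.split()
    # autojunk=False keeps frequent words from being discarded on long inputs.
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    runs = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_words:
            runs.append(" ".join(old_words[block.a:block.a + block.size]))
    return runs


def surviving_fraction(old_text, new_text, min_words=5):
    """Fraction of the old revision's words still present in long shared runs."""
    old_count = len(old_text.split())
    if old_count == 0:
        return 0.0
    kept = sum(len(run.split())
               for run in shared_runs(old_text, new_text, min_words))
    return kept / old_count


# Invented sample text, not taken from any article.
old_revision = ("The hall was built in 1905 with money from a local donor "
                "and opened the next year.")
current_revision = ("Records show the hall was built in 1905 with money from "
                    "a local donor, then opened in 1906.")

for run in shared_runs(old_revision, current_revision):
    print(run)
print(round(surviving_fraction(old_revision, current_revision), 2))
```

The comparison is word-based and case-sensitive, so light copyediting (changed punctuation, reordered clauses) breaks a run; treat the fraction as a rough hint for a human reviewer, not as a threshold.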
Thanks, Moneytrees, but I'm still worried about whether I'm doing this right. Could you and MER-C take this example and tell me what to do?
And after that several copyedited. I can't find a single diff that is more useful than going through the major contributors to indicate that the content is mostly DC and no one seems to have checked sources or looked for copyvio or source-to-text issues or queried offline sources.
Is this a good candidate for WP:CP?
Would an article with a similar editing pattern, but with DC contribs in only the 70% range as a result of copyediting, be a good WP:CP candidate?
XOR'easter is going through cleaning up citations on some DC articles, which lowers the DC contrib percentage, and leaves me unsure if I can still submit those articles to WP:CP, so I'm seeking stronger guidance on how to figure the threshold. SandyGeorgia (Talk) 15:34, 1 March 2023 (UTC)
A better example might be Howard B. Meek. Uses offline sources that can't be checked, but DC contribs as measured by tools now diluted to 73%. Forces us to go and check every blooming source?? SandyGeorgia (Talk) 15:38, 1 March 2023 (UTC)
In this case, you could rip out the DC stuff - leave only the lead, the "Honors and works" section, and all sections with non-creative content. You could also blank the article and wait seven days for someone else to do it. It's a matter of deciding whether to delete or stub, not presumptively remove - if in doubt, blank and list. MER-C 20:10, 1 March 2023 (UTC)
OK, I never trust myself (too little experience), so prefer to blank and list. Sorry to give you extra work :) :) SandyGeorgia (Talk) 20:43, 1 March 2023 (UTC)

Rewrites and CCI

I recently rewrote the article for Aemilia Tertia. See project page under § Pages 161 through 180; see also discussion. Is there anything that ought to be done to reflect that to make CCI any easier? Also, is there a machine-parseable list of these articles that need review? I'd like to know of any overlap with WP:CGR articles. Ifly6 (talk) 20:40, 28 April 2023 (UTC)

If you've rewritten it entirely in your own words and you're confident it's free of any copyright issues (including unattributed copying of public domain sources as well as close paraphrasing), you can take care of the CCI listing yourself. Edit the CCI page, remove the diffs from the CCI listing for Aemilia Tertia, add the {{?}} template since the article was rewritten without determining if there was any CV (this is not a problem - I'm just explaining why that template), and put a comment that you've rewritten it from scratch and sign. Poof! Done :) ♠PMC(talk) 21:09, 28 April 2023 (UTC)
Thanks. Done. Ifly6 (talk) 22:42, 28 April 2023 (UTC)
Perfect, cheers! ♠PMC(talk) 23:13, 28 April 2023 (UTC)
@Ifly6: There are a number of CGR articles that DC worked on that have not yet been investigated, both on this page and the companion page WP:Contributor copyright investigations/20210315 02. If you scan through these two pages you can probably recognize them on sight, including Nous, Otium, Iole, Indibilis and Mandonius, Chiomara, and so on. Wasted Time R (talk) 23:22, 28 April 2023 (UTC)

I wrote a script to query the categories recursively and parse the wikitext to recover titles, which tells me of the following relevant articles. Sorry about the format; this is a pandas print-out, so the table layout is hard-coded.

                   file                          title  handled   cgr
12   dc_copyvio_pg1.txt              Africa (Petrarch)     True  True
36   dc_copyvio_pg1.txt  De Viris Illustribus (Jerome)    False  True
54   dc_copyvio_pg1.txt                           Iole    False  True
176  dc_copyvio_pg1.txt                 Aemilia Tertia     True  True
200  dc_copyvio_pg1.txt                Theory of forms     True  True
216  dc_copyvio_pg1.txt             Augustine of Hippo     True  True
234  dc_copyvio_pg1.txt         Faltonia Betitia Proba     True  True
271  dc_copyvio_pg1.txt                            Ops     True  True
273  dc_copyvio_pg1.txt                       Olympias     True  True
280  dc_copyvio_pg1.txt               Juno (mythology)     True  True
287  dc_copyvio_pg1.txt              Ceres (mythology)     True  True
332  dc_copyvio_pg1.txt                           Livy     True  True
333  dc_copyvio_pg1.txt       Europa (consort of Zeus)     True  True
346  dc_copyvio_pg1.txt                         Medusa     True  True
350  dc_copyvio_pg1.txt          Agrippina the Younger     True  True
406  dc_copyvio_pg2.txt                          Otium    False  True
516  dc_copyvio_pg2.txt         Mary Hamilton Swindler    False  True
535  dc_copyvio_pg2.txt        Indibilis and Mandonius    False  True
754  dc_copyvio_pg2.txt             Augustine of Hippo    False  True
828  dc_copyvio_pg2.txt               Second Punic War     True  True
942  dc_copyvio_pg3.txt                          Otium    False  True

I'll probably go take a look at the ones still marked False. Ifly6 (talk) 00:12, 29 April 2023 (UTC)
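The script itself isn't posted here, so the following is only a rough sketch of the title-recovery and overlap step, not the code that produced the table above. It assumes each listing entry is a bulleted line containing a `[[:Title]]` link, and that a handled entry carries a review marker template such as `{{?}}` (mentioned above), `{{y}}` or `{{n}}`; both the line shape and the `{{y}}`/`{{n}}` markers are assumptions. The helper names and the sample wikitext are hypothetical, and it uses only the standard library rather than pandas.

```python
import re

# Assumed shape of a listing line: "* [[:Title]] ..." optionally followed by a
# review marker template once someone has handled the entry.
TITLE_RE = re.compile(r"\[\[:?([^\]|#]+)")
MARKER_RE = re.compile(r"\{\{\s*(\?|y|n)\s*\}\}", re.IGNORECASE)


def parse_listing(wikitext):
    """Return {title: handled} for each bulleted line containing a wikilink."""
    rows = {}
    for line in wikitext.splitlines():
        if not line.lstrip().startswith("*"):
            continue
        match = TITLE_RE.search(line)
        if match is None:
            continue
        title = match.group(1).strip()
        rows[title] = MARKER_RE.search(line) is not None
    return rows


def unhandled_overlap(listing, project_titles):
    """Titles that are in the project set and not yet marked as handled."""
    return sorted(title for title, handled in listing.items()
                  if title in project_titles and not handled)


# Hypothetical sample; real listing lines carry diffs and byte counts.
sample = """\
* [[:Aemilia Tertia]] (3 edits) {{?}} rewritten from scratch
* [[:Iole]] (5 edits)
* [[:Ludington Public Library]] (9 edits)
"""
cgr_titles = {"Aemilia Tertia", "Iole", "Otium"}

listing = parse_listing(sample)
print(unhandled_overlap(listing, cgr_titles))
```

Because the result is keyed by title, an article listed on more than one page (as Otium and Augustine of Hippo are in the table) should be parsed one page at a time so that a later entry does not overwrite an earlier one.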