User:Physchim62/ITN stats

From Wikipedia, the free encyclopedia

This is an interim statistical analysis of stories posted on the In the news section of the Main Page of English Wikipedia in 2009. At present, the statistics only cover the first five months of 2009.

Dataset

The dataset is every story posted on ITN between 2009-01-01 00:00 (UTC) and 2009-05-31 23:59 (UTC). There were 256 stories posted during this period.

Analysis

All normal stories

For the purposes of this analysis, a "normal" story is one which was removed from ITN to make room for newer items. This excludes:

  • stories which were removed "early" because of complaints or other procedural problems (13 stories);
  • April Fool's Day items (8 stories).

Hence, there are 235 "normal" stories.
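
The count above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the totals come from the text, while the dictionary keys and the function name `count_normal` are labels chosen for this example.

```python
# Figures taken from the text above; the keys are labels chosen for this sketch.
TOTAL_STORIES = 256
EXCLUDED = {
    "removed early": 13,
    "April Fool's Day": 8,
}


def count_normal(total: int, excluded: dict[str, int]) -> int:
    """Return the number of stories left after removing the excluded groups."""
    return total - sum(excluded.values())


normal_stories = count_normal(TOTAL_STORIES, EXCLUDED)
print(normal_stories)  # 235
```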

Time on the Main Page

I have the raw data for this, but I haven't finished analysing it yet.

Viewing figures

Quantile         Page views
9th decile       60.7k
8th decile       37.2k
Upper quartile   33.4k
7th decile       26.4k
6th decile       20.0k
Median           15.6k
4th decile       11.9k
3rd decile       9.4k
Lower quartile   8.3k
2nd decile       7.1k
1st decile       4.8k

The main statistic for viewing figures is the maximum daily viewing figure achieved by the article linked by the bolded link in the ITN story.
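
As a rough illustration of how this statistic and its quantiles could be computed, here is a minimal sketch using Python's standard `statistics` module. The article names and daily figures are invented for the example and are not taken from the ITN dataset; the function name `peak_views` is likewise hypothetical.

```python
from statistics import median, quantiles

# Illustrative daily page views per article; these numbers are invented
# for the sketch and are NOT taken from the ITN dataset.
daily_views = {
    "Article A": [1200, 15600, 9800],
    "Article B": [300, 4800, 2100],
    "Article C": [50000, 60700, 41000],
    "Article D": [700, 8300],
    "Article E": [33400, 12000, 5000],
}


def peak_views(views_by_article):
    """Map each article to its maximum daily viewing figure."""
    return {title: max(days) for title, days in views_by_article.items()}


peaks = peak_views(daily_views)
values = sorted(peaks.values())

print(median(values))           # median of the peak figures
print(quantiles(values, n=4))   # three quartile cut points
print(quantiles(values, n=10))  # nine decile cut points
```

Note that `statistics.quantiles` interpolates between data points, so with a small sample the cut points need not coincide with any observed value.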

For individual articles, this statistic is subject to a number of systematic and semisystematic biases which I shall discuss below when I get round to it. These biases do not prevent its use for finding median viewing figures and similar statistics.

As stories are usually on the Main Page for two to three days, the maximum daily viewing figure will systematically underestimate the total number of page views: no correction has been made for this effect, which is assumed to be proportionally similar for all articles.

No baseline correction has been made, as, for ITN stories, baseline viewing figures are almost always far lower (by at least two orders of magnitude) than peak viewing figures while the article is featured on the Main Page.
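
To put a number on this, the following sketch computes what fraction of a peak figure a baseline subtraction would remove. The function name and the example figures are assumptions made for illustration; with a baseline two orders of magnitude below the peak, the correction would change the figure by about 1%.

```python
def relative_baseline_share(peak: float, baseline: float) -> float:
    """Fraction of the peak figure that subtracting the baseline would remove."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return baseline / peak


# Hypothetical article: peak of 15,600 views and a baseline 100 times lower.
share = relative_baseline_share(15_600, 156)
print(f"{share:.1%}")  # 1.0%
```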

The highest peak viewing figure was for swine influenza, with 1.1M page views; the lowest peak viewing figure was for Slovak presidential election, 2009, with 1.9k page views.

Procedural aspects

Discussion type            Number of stories   Median page views
Notable awards             8                   64.4k
April Fool's Day           8                   37.2k
Obituaries                 2                   29.3k
Space launches             9                   24.6k
Recurring sports events    11                  19.5k
Standard                   199                 15.2k
Elections                  19                  2.8k
Major meteor showers       0

In the news has specific criteria for several types of story:

  • recurring events (sports events, elections, awards, space launches and meteor showers)
  • obituaries
  • April Fool's Day stories

All other stories have been classified as "standard discussions".
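
For illustration, a table of this kind (number of stories and median page views per discussion type) could be produced along the following lines. This is a minimal sketch, not the procedure actually used: the sample rows are invented and the function name `median_by_group` is hypothetical.

```python
from collections import defaultdict
from statistics import median

# (discussion type, peak page views) pairs; invented values for illustration only.
stories = [
    ("Elections", 1900),
    ("Elections", 2800),
    ("Elections", 4100),
    ("Space launches", 24600),
    ("Space launches", 30000),
    ("Standard", 15200),
]


def median_by_group(rows):
    """Return {group: (number of stories, median page views)}."""
    groups = defaultdict(list)
    for group, views in rows:
        groups[group].append(views)
    return {g: (len(v), median(v)) for g, v in groups.items()}


summary = median_by_group(stories)

# Sort by median page views, highest first.
for group, (count, med) in sorted(
    summary.items(), key=lambda kv: kv[1][1], reverse=True
):
    print(f"{group}: {count} stories, median {med / 1000:.1f}k")
```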

The list of recurring events changed considerably during 2009. A story has been listed as a recurring event if:

Articles have only been classified as obituaries if the death of the person was the only news story: hence, stories featuring people who died during other newsworthy events (e.g., Velupillai Prabhakaran, leader of the Sri Lankan Tamil Tigers) have been classified as "standard" discussions.

Subject matter

Subject area               Number of stories   Median page views
Arts & entertainment       8                   64.4k
Science & technology       40                  33.1k
Sports                     16                  19.1k
Business & economics       12                  19.0k
War & diplomacy            21                  18.5k
Overall median                                 15.6k
Other disasters & crime    50                  14.5k
Terrorism                  12                  12.7k
Religion                   3                   11.5k
Natural disasters          19                  9.1k
Politics & elections       54                  7.7k

An attempt was made to classify each story into one of a limited number of subject areas. The choice of subject area is, by nature, somewhat subjective, but this should not overly affect the validity of the medians. To give just one example, different editors might have different dividing lines between "War", "Terrorism" and "Crime".

For the "Other disasters & crime" category, almost all the stories involved homicide or accidental death. "Other disasters" means disasters other than war, terrorism or natural disasters.

"Science & technology" includes medicine, as well as space launches etc.

Regional distribution

Region                      Number of stories   Median page views
Europe                      56                  16.3k
Africa                      33                  8.9k
Americas (excluding USA)    26                  9.3k
United States of America    25                  38.6k
East & Southeast Asia       24                  12.6k
South Asia                  23                  15.5k
International               16                  26.1k
Middle East                 15                  13.9k
Oceania                     9                   8.7k
Outer Space                 5                   44.5k
Antarctica                  3                   35.2k

An attempt was made to assign a country to each story, based on the ISO 3166-1 alpha-3 classification. This proved unsatisfactory for a number of reasons, particularly the large number of countries which feature in ITN stories, which makes statistical analysis unreliable. Instead, the stories (in practice, the countries) were classified into regions, based on the common news regions used by international news providers such as BBC News or Al-Jazeera. Even then, some modifications had to be made to cover the variety of ITN stories.

The choice of country, or even region, is, by nature, somewhat subjective.
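
The country-to-region step could be sketched as a simple lookup keyed by ISO 3166-1 alpha-3 code. The mapping below is a hypothetical excerpt: the region names follow the table above, but the individual assignments, the treatment of stories without a single country as "International", and the function name `region_for` are assumptions of this example rather than the classification actually used.

```python
from collections import Counter

# Hypothetical excerpt of a country-to-region lookup keyed by ISO 3166-1
# alpha-3 code. The assignments are this sketch's assumption.
REGION_BY_COUNTRY = {
    "SVK": "Europe",
    "GBR": "Europe",
    "LKA": "South Asia",
    "USA": "United States of America",
    "MEX": "Americas (excluding USA)",
    "AUS": "Oceania",
}


def region_for(country_code):
    """Return the region for an alpha-3 code; None means no single country applies."""
    if country_code is None:
        return "International"
    # Raises KeyError for codes missing from the lookup, so gaps are noticed.
    return REGION_BY_COUNTRY[country_code.upper()]


# Invented sample of per-story country codes.
story_countries = ["SVK", "LKA", None, "usa", "GBR"]
counts = Counter(region_for(code) for code in story_countries)
print(counts.most_common())
```

Raising an error on an unknown code, rather than silently assigning a default region, keeps the subjective choices visible in the lookup table itself.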

Discussion

Statistics for individual articles