
User:Krexer/sandbox


OLD Version (FULL set of references)

  1. 2011 Survey: 52-item survey; 1319 participants from over 60 countries.[1] Citations include [2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17]
  2. 2010 Survey: 50-item survey; 735 participants from 60 countries.[18][19] Citations include [20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36]
  3. 2009 Survey: 40-item survey; 710 participants from 58 countries.[37] Citations include [38][39][40][41][42][43][44][45][46][47][48][49]
  4. 2008 Survey: 34-item survey; 348 participants from 44 countries.[50] Citations include [51][52][53][54][55][56]
  5. 2007 Survey: 27-item survey; 314 participants from 35 countries.[57][58] Citations include [59][60][61][62][63]


NEW Version (REDUCED set of references)

  1. 2011 Survey: 52-item survey; 1319 participants from over 60 countries.[1] Citations include [2][7][13]
  2. 2010 Survey: 50-item survey; 735 participants from 60 countries.[18][19] Citations include [20][21][22][25][26][27][36]
  3. 2009 Survey: 40-item survey; 710 participants from 58 countries.[37] Citations include [38][41][42][44]
  4. 2008 Survey: 34-item survey; 348 participants from 44 countries.[50] Citations include [56]
  5. 2007 Survey: 27-item survey; 314 participants from 35 countries.[57][58]


Recent survey results (previously posted to the Data Miner Survey page, but now removed)


Results from the most recent survey were unveiled at the October 2011 Predictive Analytics World (PAW) conference held in New York City.[1] Survey participants included data miners working in corporations, consulting firms, tool vendors, academia, and government and non-government organizations.

  • Fields & Goals: Data miners work in a diverse set of fields. CRM/Marketing has been the #1 field in each of the past five years. Fittingly, “improving the understanding of customers”, “retaining customers” and other CRM goals continue to be the goals identified by the most data miners. This is consistent with independent polls of data miners conducted by KDnuggets over the years.[64][65][66]
  • Algorithms: Each year, decision trees, regression, and cluster analysis form the consistent triad of core algorithms for most data miners. However, a wide variety of algorithms are being used. This is consistent with independent polls of data miners conducted by KDnuggets over the years.[67][68]
  • Text Mining: A third of data miners currently use text mining and another third plan to in the future. Text mining is most often used to analyze customer surveys and blogs/social media.
  • Data Mining Tools: Data miners report using an average of four software tools to conduct their analyses. Over the survey years, R has risen in popularity. In 2010 it overtook SPSS Statistics and SAS to become the tool used by the most data miners, and the 2011 survey showed that R is now used by close to half of all data miners (47%). STATISTICA has also grown in popularity. From 2007 to 2009, more data miners indicated that SPSS Clementine (now IBM SPSS Modeler) was their primary data mining tool than any other tool. However, in 2010 and 2011, STATISTICA was cited most frequently as data miners' primary tool. In terms of satisfaction with their tools, in the past few years STATISTICA, IBM SPSS Modeler, R, KNIME, RapidMiner and Salford Systems have received the strongest satisfaction ratings from data miners in these surveys. The growing popularity of R is consistent with independent polls of data miners conducted by KDnuggets, but the KDnuggets polls show a different picture regarding the popularity of commercial data mining software.[69][70][71] Robert Muenchen has taken a multi-faceted approach to assessing the popularity of data analysis software, one that includes blog post counts, Google Scholar data, listserv subscribers, use in competitions, book publications, Google PageRank, and more.[36] His analyses are consistent with the Rexer Analytics Surveys and KDnuggets in outlining the growth of R, but Muenchen illustrates that the popularity of software is more nuanced and that one's conclusions will differ depending on which measure of popularity is used. The Rexer Analytics survey summary reports include analyses of the data miners' satisfaction with 20 dimensions of their software. Haughton et al. and Nisbet have also produced reviews of data mining software.[72][73]
  • Analytic Capabilities & Success: Only 12% of corporate respondents rate their company as having very high analytic sophistication. However, data miners at companies with better analytic capabilities report that their companies are outperforming their peers. Participants in the 2011 survey shared best practices for measuring analytic success.[74]
  • Challenges: Consistently across the years, dirty data, explaining data mining to others, and difficult access to data are the top challenges data miners face. Participants in the 2010 survey shared best practices for overcoming these challenges.[75]
  • Future Trends: Data miners are optimistic about continued growth in data mining adoption and the positive impact data mining will have.


References

  1. ^ a b c Karl Rexer, Heather Allen, & Paul Gearan (2011); 2011 Data Miner Survey Summary, presented at Predictive Analytics World, Oct. 2011.
  2. ^ a b Bob Thompson (2012); Big Data and Analytics in a Customer-Focused Enterprise: Inside Scoop with Karl Rexer, CustomerThink, August 7, 2012.
  3. ^ Paige Roberts (2012); The Measure of an Analytics Tool, Pervasive Big Data Blog, August 17, 2012.
  4. ^ James Taylor (2012); Rexer Analytics Survey – are data miners too focused on their models?, Decision Management Solutions, July 24, 2012.
  5. ^ Bob Thompson (2012); Big Data analysis tools: STATISTICA, KNIME, Rapid Miner, Salford Systems top rated; R on the rise, CustomerThink, July 24, 2012.
  6. ^ Shawn Hessinger (2011); Survey Shows Analytics Improves Business Performance, All Analytics, December 5, 2011.
  7. ^ a b David Smith (2012); The R Language & Big Data, Cloud Computing Journal, August 14, 2012.
  8. ^ Paul Hiller (2012); StatSoft's STATISTICA (TM) Achieves Highest Ratings in World's Largest Survey of Data Analysts, StatSoft press release, September 11, 2012 (as reported in Yahoo Finance).
  9. ^ David Smith (2011); R's popularity continues to rise, Revolution Newsletters: Issue 15 - April 2011.
  10. ^ Knime (2012); KNIME rated #1 in satisfaction for open source analytics platforms, KNIME press release, July 31, 2012.
  11. ^ Gregory Piatetsky-Shapiro (2011); Rexer Analytics 2001 Data Miner Survey: Summary Report, KDnuggets News 2012, Issue 17, July 25, 2011.
  12. ^ StatSoft (2011); STATISTICA is Highest in Adoption and Customer Satisfaction in Predictive Analytics, Statsoft press release 2011.
  13. ^ a b Selena Welz (2012); Meet R: a programming language that makes sense of Big Data, Technology @ Work, Tendo Communications, November 2012.
  14. ^ Rosaria Silipo (2012); KNIME is evaluated very positively by its users, Data Mining and Reporting, August 6, 2012.
  15. ^ Yanchang Zhao (2012); R is reported as being used by about half of all data miners in the 2011 Data Miners Survey, RDataMining blog, July 28, 2012.
  16. ^ Timothy D'Auria (2012); Users Switching from SAS to R, Boston Decisions blog, April 2, 2012.
  17. ^ Alan Calvitti (2012); Can Mathematica.SE help improve Mathematica's Meager Market in Analytics?, Mathematica Meta blog, August 3, 2012.
  18. ^ a b Karl Rexer, Heather Allen, & Paul Gearan (2010); 2010 Data Miner Survey Summary, presented at Predictive Analytics World, Oct. 2010.
  19. ^ a b Karl Rexer, Heather Allen, & Paul Gearan (2011); Understanding Data Miners, Analytics Magazine, May/June 2011 (INFORMS: Institute for Operations Research and the Management Sciences).
  20. ^ a b Paško Konjevoda and Nikola Štambuk (2012); Open-Source Tools for Data Mining in Social Science, Theoretical and Methodological Approaches to Social Sciences and Knowledge Management, Asunción López-Varela (Ed.), ISBN: 978-953-51-0687-6.
  21. ^ a b Emilia Mikołajewska and Dariusz Mikołajewski (2011); System eksploracji danych na potrzeby obronności państwa, Kwartalnik Bellona, 2011, Volume 3, pages 119-129 (Data Mining system for national security purposes, Bellona Quarterly, Scientific Journal of the Polish Ministry of National Defense; Article is in Polish).
  22. ^ a b Tomasz Ząbkowski (2011); Data Mining - Current State and Future Trends, Information Systems in Management XIII, Business Intelligence and Knowledge Management, Warsaw University of Life Sciences Press, Warsaw, 2011, pages 122-130; ISBN 978-83-7583-370-6.
  23. ^ Cheyu Hung (2011); 2010 Data Miner Survey Highlights, StatSoft Taiwan & StatSoft China Newsletter, April 17, 2011 (Newsletter is in Chinese).
  24. ^ Nick Lim (2011); Investigative SNA or large scale SNA? Part 1, Sonamine blog, January 19, 2011.
  25. ^ a b Tuba Islam (2011); How to use Analytics to Improve Your Business: Real Practices, SAS Business Analytics Series, Istanbul, Turkey, April, 2011 (presentation is in Turkish).
  26. ^ a b Shawn Hessinger (2011); CRM & Marketing Top Fields for Data Miners, All Analytics, November 9, 2011.
  27. ^ a b Gustavo Valencia (2012); Minería de Datos: Sesión 0, Universidad Pontificia Bolivariana, Graduate class: Data mining and Information visualization, 2012 (Presentation is in Spanish).
  28. ^ French Wikipedia Entry; Exploration de données, (Data Mining).
  29. ^ French Wikipedia Entry; Logiciels de fouille de données, (Data Mining Software).
  30. ^ James Taylor (2011); Rexer Data Mining Survey Results, Decision Management Solutions, March 8, 2011.
  31. ^ STATISTICA News (2011); STATISTICA Primary Predictive Analytics Tool According to Rexer Survey, STATISTICA News, March 29, 2011.
  32. ^ 刘思喆 (2011); Rexer Analytics 2010年度数据挖掘调查, March 9, 2011 (Blog is in Chinese).
  33. ^ Steve McDonnell (2011); Top Three Challenges for Data Miners, Trends and Outliers, June 22, 2011.
  34. ^ Frank C.S. Liu (2012); The Popularity of Data Analysis Software, Frank Liu's Blog, June 1, 2012 (Blog is in English and Chinese).
  35. ^ Shawn Hessinger (2011); Text Mining Is on the Rise, All Analytics, November 14, 2011.
  36. ^ a b c Robert A. Muenchen (2012); The Popularity of Data Analysis Software.
  37. ^ a b Karl Rexer, Heather Allen, & Paul Gearan (2009); 2009 Data Miner Survey Summary, presented at SPSS Directions Conference, Oct. 2009.
  38. ^ a b M. Arthur Munson (2011); A Study on the Importance of and Time Spent on Different Modeling Steps, ACM SIGKDD Explorations, Volume 13, Issue 2, December 2011, pages 65-71.
  39. ^ Le Grand BI (2010); Karl Rexer mène l’enquête sur les Data Miner, Le Grand BI - Le blog Satirique de la Business Intelligence, March 21, 2010 (Blog is in French).
  40. ^ James Taylor (2009); Early results from the Rexer data mining survey, Decision Management Solutions, November 23, 2009.
  41. ^ a b Ervina Çergani (2009); Data Mining Survey, Survey of Businesses in Tirana, Albania; July, 2009 (Originally in Albanian, translated into English).
  42. ^ a b Valerie Valentine (2010); Data Miner Survey Shows Positive Signs, Information Management, March 25, 2010.
  43. ^ Katarzyna Kołyniak (2010); Rexer Analytics 2009, SPSS Poland Newsletter, April 2010 (Newsletter is in Polish).
  44. ^ a b Ajay Ohri (2009); Interview Karl Rexer - Rexer Analytics.
  45. ^ Gregory Piatetsky-Shapiro (2010); 3rd Annual Rexer Analytics Data Miner Survey - Summary, KDnuggets News 2010, Issue 6, March, 2010.
  46. ^ STATISTICA News (2009); STATISTICA Data Miner Tops Use, Reliability and Satisfaction Ratings in Survey of Data Mining Professionals, StatSoft Company reviews 2009.
  47. ^ Ajay Ohri (2009); Data Mining Survey Results :Tools and Offshoring, DecisionStats, December 17, 2009.
  48. ^ RapidMiner (2009); RapidMiner proves most popular data mining tool, RapidMiner press release, 2009.
  49. ^ Cheyu Hung (2011); 2009 Data Miner Survey Highlights, StatSoft Taiwan & StatSoft China Newsletter, April 17, 2011 (Newsletter is in Chinese).
  50. ^ a b Karl Rexer, Paul Gearan, & Heather Allen (2008); 2008 Data Miner Survey Summary, presented at SPSS Directions Conference, Oct. 2008, and Oracle BIWA Summit, Nov. 2008.
  51. ^ STATISTICA News (2008); STATISTICA Data Miner One of the Top Three Software Systems in Recent Independent Survey of Data Mining Professionals, STATISTICA News, November 3, 2008.
  52. ^ James Taylor (2008); Rexer Analytics Data Mining Survey Results Released, Decision Management Solutions, October 2, 2008.
  53. ^ Cheyu Hung (2011); 2008 Data Miner Survey Highlights, StatSoft Taiwan & StatSoft China Newsletter, April 17, 2011 (Newsletter is in Chinese).
  54. ^ Dean Abbott (2009); Overlap in the Business Intelligence / Predictive Analytics Space, Abbott Analytics: Data Mining and Predictive Analytics Blog, December 15, 2009.
  55. ^ SPSS (2008); Survey: SPSS Predictive Analytics Remains Top Choice Among Data Miners, SPSS press release, October 6, 2008.
  56. ^ a b Mayato (2008); Mayato Study: Data Mining Software 2009, November 2008 (available in German and English).
  57. ^ a b Karl Rexer, Paul Gearan, & Heather Allen (2007); 2007 Data Miner Survey Summary, presented at SPSS Directions Conference, Oct. 2007, and Oracle BIWA Summit, Oct. 2007.
  58. ^ a b Karl Rexer, Paul Gearan, & Heather Allen (2008); Portrait of a data miner, Quirk's Marketing Research Media, March 2008.
  59. ^ Dean Abbott (2008); Data Mining Survey, Abbott Analytics: Data Mining and Predictive Analytics Blog, April 17, 2008.
  60. ^ James Taylor (2007); Data Miner Survey - results, FICO Decision Management Blog, August 10, 2007.
  61. ^ Will Dwinnell (2007); Rexer Analytics Data Miner Survey, Aug-2007, Abbott Analytics: Data Mining and Predictive Analytics Blog, August 11, 2007.
  62. ^ Cheyu Hung (2011); 2007 Data Miner Survey Highlights, StatSoft Taiwan & StatSoft China Newsletter, April 17, 2011 (Newsletter is in Chinese).
  63. ^ SPSS (2007); Independent Survey Names SPSS Predictive Analytics Number One Choice Among Data Miners, SPSS press release, December 10, 2007.
  64. ^ Gregory Piatetsky-Shapiro (2011); Industries / Fields Where You Applied Analytics / Data Mining in 2011, KDnuggets, 2011.
  65. ^ Gregory Piatetsky-Shapiro (2010); Industries / Fields for Analytics / Data Mining, KDnuggets, 2010.
  66. ^ Gregory Piatetsky-Shapiro (2009); Data Mining Applications 2009, KDnuggets, 2009.
  67. ^ Gregory Piatetsky-Shapiro (2011); Algorithms for Data Analysis / Data Mining, KDnuggets, 2011.
  68. ^ Gregory Piatetsky-Shapiro (2007); Data Mining Methods, KDnuggets, 2007.
  69. ^ David Smith (2012); R Tops Data Mining Software Poll, Java Developers Journal, May 31, 2012.
  70. ^ Gregory Piatetsky-Shapiro (2011); Data Mining / Analytic Tools Used, KDnuggets, 2011.
  71. ^ Gregory Piatetsky-Shapiro (2010); Data Mining / Analytic Tools Used Poll, KDnuggets, 2010.
  72. ^ Haughton, Dominique; Deichmann, Joel; Eshghi, Abdolreza; Sayek, Selin; Teebagy, Nicholas; and Topi, Heikki (2003); A Review of Software Packages for Data Mining, The American Statistician, Vol. 57, No. 4, pp. 290–309.
  73. ^ Nisbet, Robert A. (2006); Data Mining Tools: Which One is Best for CRM? Part 1, Information Management Special Reports, January 2006.
  74. ^ Karl Rexer, Paul Gearan, & Heather Allen (2011); Best Practices in Measuring Analytic Project Performance / Success, verbatim responses are available online.
  75. ^ Karl Rexer, Paul Gearan, & Heather Allen (2010); Overcoming Data Mining Challenges, verbatim responses are available online.