On 01/08/2009 01:12 PM, Andrew Choens wrote:
> On Thu, 2009-01-08 at 10:42 -0600, Stas Kolenikov wrote:
>> A really good measure for R will be the total # of the downloads of
>> r-base for all platforms from all CRAN mirrors (and I would expect
>> that # can be found from the servers' logs). Given that it is so easy
>> to download everything nice and clean and up to date, I would doubt
>> anybody will be distributing CD-ROMs with R install files among
>> friends and colleagues. SAS (and Stata, and SPSS, and Minitab, and...)
>> should have their (internal) number of licenses sold (and yes those
>> come on the disks initially), but those are badly blurred by the
>> network licenses, and are commercial secrets, anyway.
>
> The number of r-core downloads is definitely NOT representative of the
> number of people using R. If you use R on Windows or OS X, you will
> obviously download R from the mirrors. However, this methodology would
> effectively ignore many users of R on Linux. I use R on a regular basis
> and I have it installed on three separate systems, all running Ubuntu.
> In all of these cases, I am downloading and installing r-core from the
> Ubuntu Mirror in the USA, not from CRAN.

I would also note that R has been available via the Fedora yum repos
for some time, which, as with the Debian/Ubuntu repos, would be missed
in just counting CRAN downloads. There are quite a few other Linux
distributions that have a similar infrastructure in place where R is
available as an 'add-on' or where the main distribution itself
includes R.

Additionally, there are many folks who will build R from source code,
using the updated source tarballs via FTP or, as I do, by getting the
source code right from the R Subversion repo. These too would not be
counted in a CRAN-based tally.

> Of course, the number of Linux users is minuscule compared to the number
> of Windows users, but I think it is safe to say the Linux users are, in
> general, a more tech-savvy group than Windows users and are more likely
> to be comfortable using R's interactive programming interface. I think
> it is also fair to say that MANY (though not all) Linux users would be
> uncomfortable installing SPSS or SAS or Stata onto their open-source
> system and would prefer to use R. Thus, Linux users probably account for
> a higher proportion of R's user-base than they do in the general
> computing population. . . . although I do not claim to actually know
> this proportion.
>
> Ehh. Comparing the popularity of computer software is incredibly tricky
> to do, especially when some of the software being compared is
> open-source.

Correct. Trying to extrapolate the number of users from any of these
measures is quite complex, if doable at all.

Even the posting frequencies that I used yesterday need to be taken
with a grain of salt when trying to get a sense of growth. As Dirk
noted, the many R-SIG-* e-mail lists have offloaded some level of
traffic from R-Help, which may account for the rate of growth in the
R-Help posts declining somewhat since 2004, as Gabor pointed out, even
though the absolute number of annual posts continues to increase.

In reading the posts on SAS-L since yesterday via Google RSS, where the
NYT article was also posted, I see that some have noted that SAS itself
offers online support forums (http://support.sas.com/forums/index.jspa).
From a quick review, it looks like the SAS.com forums date back to
perhaps early 2006, which may account for some of the recent leveling
off of posts on SAS-L.

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.