[Rd] proposed pbirthday fix
Actually, since NaN's are also detected in na.action operations, a simpler
fix might just be to use the na.rm = TRUE option of min

    upper <- min(n^k/(c^(k - 1)), 1, na.rm = TRUE)

> Recent news articles concerning an article from The Lancet with
> fabricated data indicate that in the sample containing some 900 or so
> patients, more than 200 had the same birthday. I was curious and tried
> out the p and q birthday functions but pbirthday could not handle 250
> coincidences with n = 1000. The calculation of upper prior to using
> uniroot produces NaN,
>
>     upper <- min(n^k/(c^(k-1)), 1)
>
> I was able to get it to work by using logs, however, as in the
> following version
>
> function(n, classes = 365, coincident = 2){
>     k <- coincident
>     c <- classes
>     if (coincident < 2) return(1)
>     if (coincident > n) return(0)
>     if (n > classes * (coincident - 1)) return(1)
>     eps <- 1e-14
>     if (qbirthday(1 - eps, classes, coincident) <= n)
>         return(1 - eps)
>     f <- function(p) qbirthday(p, c, k) - n
>     lower <- 0
>     upper <- min( exp( k * log(n) - (k-1) * log(c) ), 1 )
>     nmin <- uniroot(f, c(lower, upper), tol = eps)
>     nmin$root
> }

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
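To see why the original bound fails for the reported case (n = 1000, 250 coincidences), the sketch below evaluates the three forms discussed in this thread side by side. The variable names are illustrative, and this is not the code that was later committed to R.

```r
# Values from the report: 1000 people, 365 classes, 250 coincidences.
n <- 1000
classes <- 365
k <- 250

# Naive form: both powers overflow to Inf, and Inf / Inf is NaN.
naive <- n^k / classes^(k - 1)
upper_naive <- min(naive, 1)            # the NaN propagates through min()

# Log form: the exponent is roughly 258, so exp() stays finite.
log_bound <- k * log(n) - (k - 1) * log(classes)
upper_log <- min(exp(log_bound), 1)

# na.rm = TRUE drops the NaN and leaves only the cap of 1.
upper_narm <- min(naive, 1, na.rm = TRUE)

print(c(naive = naive, upper_log = upper_log, upper_narm = upper_narm))
```

Both repairs return an upper bound of 1 here; the difference is that the log form never produces the NaN in the first place.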
Re: [Rd] proposed pbirthday fix
> "ken" == ken knoblauch <[EMAIL PROTECTED]>
> on Mon, 23 Jan 2006 09:43:28 +0100 writes:

    ken> Actually, since NaN's are also detected in na.action
    ken> operations, a simpler fix might just be to use the
    ken> na.rm = TRUE option of min

    ken> upper <- min(n^k/(c^(k - 1)), 1, na.rm = TRUE)

Well, I liked your first fix better -- thank you for it! --
since it's always good practice to formulate such expressions so as to
avoid overflow when possible.
All things considered, I think I'd go for

    upper <- min( exp(k * log(n) - (k-1) * log(c)), 1, na.rm = TRUE)

Martin

    Ken> Recent news articles concerning an article from The
    Ken> Lancet with fabricated data indicate that in the sample
    Ken> containing some 900 or so patients, more than 200 had the
    Ken> same birthday. I was curious and tried out the p and q
    Ken> birthday functions but pbirthday could not handle 250
    Ken> coincidences with n = 1000. The calculation of upper
    Ken> prior to using uniroot produces NaN,

    Ken> upper<-min(n^k/(c^(k-1)),1)

    Ken> I was able to get it to work by using logs, however, as
    Ken> in the following version

    >> function(n, classes = 365, coincident = 2){
    >>     k <- coincident
    >>     c <- classes
    >>     if (coincident < 2) return(1)
    >>     if (coincident > n) return(0)
    >>     if (n > classes * (coincident - 1)) return(1)
    >>     eps <- 1e-14
    >>     if (qbirthday(1 - eps, classes, coincident) <= n)
    >>         return(1 - eps)
    >>     f <- function(p) qbirthday(p, c, k) - n
    >>     lower <- 0
    >>     upper <- min( exp( k * log(n) - (k-1) * log(c) ), 1 )
    >>     nmin <- uniroot(f, c(lower, upper), tol = eps)
    >>     nmin$root
    >> }
Re: [Rd] Minimum memory requirements to run R.
Kjetil Brinchmann Halvorsen wrote:
> Prof Brian Ripley wrote:
>
>> Quite a while back we set the goal of running R in 16Mb RAM, as people (I
>> think Kjetil) had teaching labs that small.
>
> It's a while since I actually had R used on such small machines, I think
> 64 MB is quite acceptable now.

May I add another note to this - I recently upgraded to 64-bit
(AMD Opteron) and noticed that the memory footprint of R has shot up.
Just starting R takes up 90+MB virtual. There are corresponding
increases with Python and Perl as well; I suspect R suffers a bit on
64-bit platforms due to extensive use of pointers internally. The
fundamental unit in R, SEXP, is 6 pointers + 1 int (and another
pointer for itself). So I would probably say 64MB is questionable on
64-bit, but then probably nobody is stupid enough to do that...

For those who want to investigate the equivalent in Perl, the Perl
headers corresponding to "R/include/Rinternals.h" are located at the
"-I" flags in the output of:

    perl -MExtUtils::Embed -e ccopts

(no idea where Python stores its stuff...)

Hin-Tak Leung

> Kjetil
>
>> Since then R has grown, and we have recently started to optimize R for
>> speed rather than size. I recently tested R-devel on my ancient Win98
>> notebook with 64Mb RAM -- it ran but startup was rather slow on what I
>> think is a 233MHz processor and very slow disc.
>>
>> R still runs in 16Mb, but that is getting tight. Does anyone have any
>> need to run on a smaller machine than my 64Mb notebook?
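Taking the description above at face value (6 pointers plus 1 int per node), a back-of-the-envelope sketch shows how the per-node size scales with pointer width. The 4-byte int and the padding to a multiple of the pointer size are assumptions made for illustration; `node_bytes` is a hypothetical helper, not part of R.

```r
# Rough per-node size, assuming 6 pointers + 1 int (as stated in the
# message), a 4-byte int, and padding up to a multiple of the pointer size.
node_bytes <- function(pointer_bytes, n_pointers = 6, int_bytes = 4) {
  raw <- n_pointers * pointer_bytes + int_bytes
  ceiling(raw / pointer_bytes) * pointer_bytes
}

node_bytes(4)   # 32-bit pointers: 28
node_bytes(8)   # 64-bit pointers: 56
```

Under these assumptions the node size doubles when moving from 32-bit to 64-bit pointers, which is the effect the message is pointing at.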
Re: [Rd] Minimum memory requirements to run R.
On Mon, 23 Jan 2006, Hin-Tak Leung wrote:

> Kjetil Brinchmann Halvorsen wrote:
>> Prof Brian Ripley wrote:
>>
>>> Quite a while back we set the goal of running R in 16Mb RAM, as people (I
>>> think Kjetil) had teaching labs that small.
>>
>> It's a while since I actually had R used on such small machines, I think
>> 64 MB is quite acceptable now.
>
> May I add another note to this - I recently upgraded to 64-bits
> (AMD opteron) and noticed the memory footprint of R has shot up.
> Just starting R takes up 90+MB virtual.

That's a different question. I said RAM, you quote virtual. I am
surprised at your figure though, as I am used to seeing 40-50Mb virtual
at startup on an Opteron.

The distinction is important: even those small Windows machines had 100s
of Mb of virtual memory available, it was RAM that was in short supply.

> There are corresponding increases with Python and Perl as well; I
> suspect R suffers a bit on 64-bit platform due to extensive use of
> pointers internally. The fundamental unit in R, SEXP, is 6 pointers +
> 1 int, (and another pointer for itself). So I would probably say 64MB
> is questionable on 64-bit, but then probably nobody is stupid enough
> to do that...

We know: we even document it in the appropriate places.

Some of us were running 64-bit R last century on machines with 128Mb
(and others with much more, of course). When I tried in 1997, Solaris
would not run in 64-bit mode with 64Mb RAM (which then cost £1000 or so).

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
[Rd] Read.delim error in 2.3.0 devel, not in 2.2.0
I get this error in R-devel, but not in R-2.2.0. Any insight is
appreciated.

Error in read.delim(txtcon, header = TRUE, sep = "\t", na.strings = "NULL") :
        recursive default argument reference

> sessionInfo()
Version 2.3.0 Under development (unstable) (2006-01-04 r36984)
powerpc-apple-darwin8.3.0

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"

other attached packages:
GEOquery
 "1.5.3"

Thanks,
Sean
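For readers unfamiliar with the message: R raises an error of this kind when the expression for an argument's default refers back to the argument itself. The toy function below is hypothetical (it is not taken from read.delim or GEOquery) and only illustrates the mechanism; it is not a diagnosis of the report above.

```r
# Hypothetical function: the default for `sep` refers to `sep` itself,
# so evaluating the default re-enters its own promise.
bad_reader <- function(sep = sep) {
  sep
}

# Calling it without the argument triggers the error; capture and show it.
res <- tryCatch(bad_reader(), error = function(e) e)
print(conditionMessage(res))

# Supplying the argument explicitly avoids the self-reference.
ok <- bad_reader(sep = "\t")
```

A common way to end up here is a wrapper that passes `sep = sep` while `sep` has no value in the calling frame other than the default being evaluated.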
Re: [Rd] Minimum memory requirements to run R.
Prof Brian Ripley wrote:

> That's a different question. I said RAM, you quote virtual. I am
> surprised at your figure though, as I am used to seeing 40-50Mb virtual
> at startup on an Opteron.

I am somewhat surprised by it as well. But there is nothing unusual
about the build - it is just rebuilding the rpm on CRAN on a FC4 system
with everything as shipped, and should be quite reproducible.
I'll probably have a better look in time.

"R --vanilla" doesn't improve. Still 90+ MB virtual, 20+MB resident.

> The distinction is important: even those small Windows machines had 100s
> of Mb of virtual memory available, it was RAM that was in short supply.

Yes and no. Virtual means it will possibly be used - and there is a big
gray scale between unresponsive/intolerably-slow and slow.

>> There are corresponding increases with Python and Perl as well; I
>> suspect R suffers a bit on 64-bit
>> platform due to extensive use of pointers internally. The fundamental
>> unit in R, SEXP, is 6 pointers + 1 int, (and another
>> pointer for itself). So I would probably say 64MB is questionable on
>> 64-bit, but then probably nobody is stupid enough to do that...
>
> We know: we even document it in the appropriate places.

I went and had a look - it is the last section of R-admin (and of
course, for those who "read the source", R/include/Rinternals.h). It
would be good to mention this in the FAQ (which doesn't, or maybe I
didn't look hard enough), or at the beginning of R-admin?

> Some of us were running 64-bit R last century on machines with 128Mb
> (and others with much more, of course). When I tried in 1997, Solaris
> would not run in 64-bit mode with 64Mb RAM (which then cost £1000 or so).
Re: [Rd] Minimum memory requirements to run R.
On Mon, 23 Jan 2006, Hin-Tak Leung wrote:

> Prof Brian Ripley wrote:
>> We know: we even document it in the appropriate places.
>
> I went and have a look - it is the last section of R-admin (and of
> course, for those who "read the source", R/include/Rinternals.h). It
> would be good to mention this in the FAQ (which it doesn't, or maybe I
> didn't look hard enough), or the beginning of R-admin?

It's not in the FAQ because it isn't a FAQ (yet).

If you use the PDF manual it is in the table of contents on page i.

In the HTML manual it is admittedly less clear: there isn't a table of
contents and there is nothing obvious in the index. To some extent this
is a problem with all the manuals. The structure in the .texi file isn't
translated well to HTML form by the makeinfo tools.

    -thomas
Re: [Rd] Minimum memory requirements to run R.
On Mon, 23 Jan 2006, Thomas Lumley wrote:

> On Mon, 23 Jan 2006, Hin-Tak Leung wrote:
>
>> Prof Brian Ripley wrote:

[About Ncell sizes on 64-bit platforms.]

>>> We know: we even document it in the appropriate places.
>>
>> I went and have a look - it is the last section of R-admin (and of
>> course, for those who "read the source", R/include/Rinternals.h). It
>> would be good to mention this in the FAQ (which it doesn't, or maybe I
>> didn't look hard enough), or the beginning of R-admin?
>
> It's not in the FAQ because it isn't a FAQ (yet).
>
> If you use the PDF manual it is in the table of contents on page i.
>
> In the HTML manual it is admittedly less clear: there isn't a table of
> contents and there is nothing obvious in the index. To some extent this is a
> problem with all the manuals. The structure in the .texi file isn't
> translated well to HTML form by the makeinfo tools.

In my build there is a chapter in the HTML manual

    Choosing between 32- and 64-bit builds

in the top-level contents, and the information is in there.

It is also in ?Memory (a fairly obvious place). It may be elsewhere,
but those are the most obvious places to me.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
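For those who do have a running session, the quantities that ?Memory discusses can be inspected directly. A minimal sketch using only base R; the exact set of columns printed by gc() differs between R versions, so only the first two are shown.

```r
# gc() returns a matrix with one row for cons cells (Ncells) and one
# for vector cells (Vcells).
usage <- gc()
print(usage[, 1:2])

# Pointer size of the running build: 4 on 32-bit, 8 on 64-bit.
ptr <- .Machine$sizeof.pointer
cat("pointer size:", ptr, "bytes\n")
```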
Re: [Rd] Minimum memory requirements to run R.
Prof Brian Ripley wrote:

> [About Ncell sizes on 64-bit platforms.]

> In my build there is a chapter in the HTML manual
>
>     Choosing between 32- and 64-bit builds
>
> in the top-level contents, and the information is in there.

Maybe the one on CRAN needs fixing...
http://cran.r-project.org/doc/manuals/R-admin.html

> It is also in ?Memory (a fairly obvious place). It may be elsewhere,
> but those are the most obvious places to me.

I don't want to be argumentative, but what counts as "obvious"
can look quite different to the authors than it does to the users...

The 32-bit/64-bit issue affects purchasing or upgrading decisions
- whether one wants to spend the money on buying cheaper
32-bit machines, versus more expensive 64-bit machines. That
decision would be based on information available while *not* having
an operational R installation...

Regards,
Hin-Tak Leung
Re: [Rd] proposed pbirthday fix
> "MM" == Martin Maechler <[EMAIL PROTECTED]>
> on Mon, 23 Jan 2006 11:52:55 +0100 writes:

> "ken" == ken knoblauch <[EMAIL PROTECTED]>
> on Mon, 23 Jan 2006 09:43:28 +0100 writes:

    ken> Actually, since NaN's are also detected in na.action
    ken> operations, a simpler fix might just be to use the
    ken> na.rm = TRUE option of min

    ken> upper <- min(n^k/(c^(k - 1)), 1, na.rm = TRUE)

    MM> Well, I liked your first fix better -- thank you for it! --
    MM> since it's always good practice to formulate such as to avoid
    MM> overflow when possible.
    MM> All things considered, I think I'd go for

    MM> upper <- min( exp(k * log(n) - (k-1) * log(c)), 1, na.rm = TRUE)

    MM> Martin

    Ken> Recent news articles concerning an article from The
    Ken> Lancet with fabricated data indicate that in the sample
    Ken> containing some 900 or so patients, more than 200 had the
    Ken> same birthday. I was curious and tried out the p and q
    Ken> birthday functions but pbirthday could not handle 250
    Ken> coincidences with n = 1000. The calculation of upper
    Ken> prior to using uniroot produces NaN,

    Ken> upper<-min(n^k/(c^(k-1)),1)

    Ken> I was able to get it to work by using logs, however, as
    Ken> in the following version

    >>> function(n, classes = 365, coincident = 2){
    ..
    >>>     upper <- min( exp( k * log(n) - (k-1) * log(c) ), 1 )
    >>>     nmin <- uniroot(f, c(lower, upper), tol = eps)
    >>>     nmin$root
    >>> }

Well, now after inspection, I think ``get it to work''
is a bit of an exaggeration, at least for a purist like me
(some famous fortune teller once guessed it may be because I'm
... Swiss) who doesn't like to lose precision in probability
computations unnecessarily.

One can do much better:
The version of [pq]birthday() I've just committed to R-devel *)
now gives

  > sapply(c(20,50,100,200), function(k) pbirthday(1000, coincident= k))
  [1]  8.596245e-08  9.252349e-41 2.395639e-112 1.758236e-285

whereas the 'na.rm=TRUE' fix would simply give

  [1] 8.596245e-08 0.00e+00 0.00e+00 0.00e+00

--
Martin Maechler, ETH Zurich

*) peek at https://svn.r-project.org/R/trunk/src/library/stats/R/pbirthday.R
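The general point about losing precision can be illustrated independently of the committed code, which is not reproduced here. In the sketch below a naive formula rounds a tiny probability to exactly zero, while an algebraically equivalent form keeps it; the example and its variable names are illustrative only.

```r
# P(at least one event) = 1 - exp(-lambda) for a tiny rate lambda.
lambda <- 1e-20

naive  <- 1 - exp(-lambda)   # exp(-lambda) rounds to 1, so this is 0
stable <- -expm1(-lambda)    # expm1(x) computes exp(x) - 1 accurately

print(c(naive = naive, stable = stable))
```

The same idea, working with a form of the expression that does not pass through a value indistinguishable from 0 or 1, is what separates a result like 1.758236e-285 from a flat zero.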
Re: [Rd] Minimum memory requirements to run R.
On Mon, 23 Jan 2006, Hin-Tak Leung wrote:

> The 32-bit/64-bit issue affects purchasing or upgrading decisions
> - whether one wants to spend the money on buying cheaper
> 32-bit machines, versus more expensive 64-bit machines. That
> decision would be based on information available while *not* having
> an operational R installation...

Not necessarily. It's perfectly feasible to use a 32-bit build on a
64-bit machine, as it says in the manual, which is available from
http://www.r-project.org whether or not you have an R installation.

    -thomas
Re: [Rd] Read.delim error in 2.3.0 devel, not in 2.2.0
On Mon, 23 Jan 2006, Sean Davis wrote:

> I get this error in R-devel, but not in R-2.2.0. Any insight is
> appreciated.
>
> Error in read.delim(txtcon, header = TRUE, sep = "\t", na.strings = "NULL") :
>    recursive default argument reference

There is nothing we can reproduce here. What is txtcon, for example?

I was unable to reproduce anything like this reading from a text
connection.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
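A self-contained report of the kind being asked for might look like the sketch below. It assumes that `txtcon` is a text connection over tab-separated text, which the original report does not state; the sample data are made up for illustration.

```r
# Assumption: `txtcon` is a text connection over tab-separated text.
txt <- c("id\tvalue", "1\tNULL", "2\t3.5")
txtcon <- textConnection(txt)

# Same call as in the report; "NULL" entries become NA.
d <- read.delim(txtcon, header = TRUE, sep = "\t", na.strings = "NULL")
close(txtcon)

str(d)
```

Including the definition of every object used in the failing call, together with the sessionInfo() output, lets others run exactly the same code.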
[Rd] too-large notches in boxplot (PR #7690)
PR #7690 points out that if the confidence intervals
(+/-1.58 IQR/sqrt(n)) in a boxplot with notch=TRUE are larger than the
hinges -- which is most likely to happen for small n and asymmetric
distributions -- the resulting plot is ugly, e.g.:

    set.seed(1001)
    npts <- 5
    X <- rnorm(2*npts,rep(3:4,each=npts),sd=1)
    f <- factor(rep(1:2,each=npts))
    boxplot(X~f)
    boxplot(X~f,notch=TRUE)

I can imagine debate about what should be done in this case -- you
could just say "don't do that", since the notches are based on an
asymptotic argument ... the diff below just truncates the notches to
the hinges, but produces a warning saying that the notches have been
truncated.

?? what should the behavior be ??

the diff is against the 11 Jan version of R 2.3.0

*** newboxplot.R    2006-01-23 14:32:12.0 -0500
--- oldboxplot.R    2006-01-23 14:29:29.0 -0500
***************
*** 84,98 ****
  bplt <- function(x, wid, stats, out, conf, notch, xlog, i)
  {
      ## Draw single box plot
-     conf.ok <- TRUE
-     if(!any(is.na(stats))) {
-         ## check for overlap of notches and hinges
-         if (notch && (stats[2]>conf[1] || stats[4] [...] hinges: notches truncated")
      axes <- is.null(pars$axes)
      if(!axes) { axes <- pars$axes; pars$axes <- NULL }
      if(axes) {
--- 231,243 ----
      xysegments <- segments
  }
  for(i in 1:n)
!     bplt(at[i], wid=width[i], stats= z$stats[,i], out = z$out[z$group==i], conf = z$conf[,i], notch= notch, xlog = xlog, i = i)
!
      axes <- is.null(pars$axes)
      if(!axes) { axes <- pars$axes; pars$axes <- NULL }
      if(axes) {

--
620B Bartram Hall                            [EMAIL PROTECTED]
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704
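The condition described in this report can be checked from the numbers that boxplot.stats() already returns. The sketch below is not the patch itself; `notch_exceeds_hinges` and `truncate_notch` are hypothetical helper names, and clipping to the hinges is just the treatment proposed above.

```r
# Does the notch (median +/- 1.58 * IQR / sqrt(n)) extend past a hinge?
notch_exceeds_hinges <- function(x) {
  bs <- boxplot.stats(x)
  # bs$stats: lower whisker, lower hinge, median, upper hinge, upper whisker
  # bs$conf:  lower and upper notch limits
  bs$conf[1] < bs$stats[2] || bs$conf[2] > bs$stats[4]
}

# One possible treatment: clip the notch limits to the hinges.
truncate_notch <- function(x) {
  bs <- boxplot.stats(x)
  c(max(bs$conf[1], bs$stats[2]), min(bs$conf[2], bs$stats[4]))
}

small <- c(1, 2, 3, 10, 20)      # small, asymmetric sample
notch_exceeds_hinges(small)      # TRUE: lower notch falls below the hinge
truncate_notch(small)
notch_exceeds_hinges(1:1000)     # FALSE: large sample, narrow notch
```

For the five-point sample the hinges are 2 and 10 and the notch half-width is about 5.65, so the lower limit lands well below the lower hinge, which is what produces the folded-back box in the plot.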
[Rd] Master's project to coerce linux nvidia drivers to run generalised linear models
Hi,

I am working with a friend on a master's project. Our laboratory does a
lot of statistical analysis using the R stats package and we also have a
lot of under-utilised nvidia cards sitting in the back of our networked
linux machines. Our idea is to coerce the linux nvidia driver to run
some of our statistical analysis for us. Our first thought was to
specifically code up a version of glm() to run on the nvidia cards...

Thinking that this might be of use to the broader community we thought
we might ask for feedback before starting?

Any ideas...

Thanks,

Olly
Re: [Rd] Master's project to coerce linux nvidia drivers to run generalised linear models
On Mon, 2006-01-23 at 15:24 -0500, Oliver LYTTELTON wrote:
>
> Hi,
>
> I am working with a friend on a master's project. Our laboratory does a
> lot of statistical analysis using the R stats package and we also have a
> lot of under-utilised nvidia cards sitting in the back of our networked
> linux machines. Our idea is to coerce the linux nvidia driver to run
> some of our statistical analysis for us. Our first thought was to
> specifically code up a version of glm() to run on the nvidia cards...
>
> Thinking that this might be of use to the broader community we thought
> we might ask for feedback before starting?
>
> Any ideas...
>
> Thanks,
>
> Olly

Well, I'll bite.

My first reaction to this was, why?

Then I did some Googling and found the following article:

http://www.apcmag.com/apc/v3.nsf/0/5F125BA4653309A3CA25705A0005AD27

And also noted the GPU Gems 2 site here:

http://developer.nvidia.com/object/gpu_gems_2_home.html

So, my new found perspective is, why not?

Best wishes for success, especially since I have a certain affinity for
McGill...

HTH,

Marc Schwartz
Re: [Rd] Master's project to coerce linux nvidia drivers to run generalised linear models
I wonder if it would make more sense to get a relatively
low level package to run on it so that all packages that
used that low level package would benefit. The Matrix
package and the functions runmean and sum.exact in
package caTools are some things that come to mind.
Others may have other ideas along these lines.

On 1/23/06, Oliver LYTTELTON <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I am working with a friend on a master's project. Our laboratory does a
> lot of statistical analysis using the R stats package and we also have a
> lot of under-utilised nvidia cards sitting in the back of our networked
> linux machines. Our idea is to coerce the linux nvidia driver to run
> some of our statistical analysis for us. Our first thought was to
> specifically code up a version of glm() to run on the nvidia cards...
>
> Thinking that this might be of use to the broader community we thought
> we might ask for feedback before starting?
>
> Any ideas...
>
> Thanks,
>
> Olly
[Rd] --gui=Tk window does not stretch (PR#8520)
Full_Name: Greg Kochanski
Version: 2.2.0
OS: Debian Linux
Submission from: (NULL) (212.159.16.190)

When you grab the corner of the Tk-R (R's console) window,
the window stretches, but the useable area does not.
It remains firmly fixed at the (rather small) value of
24 lines.

In fact, you end up with a grey border of wasted pixels
around the active white area that contains the text.

(And, please don't tell me that it's not a bug because
it's been that way for 15 years, or because the S
documentation states that the terminal window is
24 lines high. That would shatter my dreams
and illusions.)
Re: [Rd] --gui=Tk window does not stretch (PR#8520)
[EMAIL PROTECTED] writes:

> When you grab the corner of the Tk-R (R's console) window,
> the window stretches, but the useable area does not.
> It remains firmly fixed at the (rather small) value of
> 24 lines.
>
> In fact, you end up with a grey border of wasted pixels
> around the active white area that contains the text.
>
> (And, please don't tell me that it's not a bug because
> it's been that way for 15 years, or because the S
> documentation states that the terminal window is
> 24 lines high. That would shatter my dreams
> and illusions.)

It's not a bug, it's an unimplemented feature...

As you're bound to discover, the Tk console is mainly a
proof-of-concept with shortcomings in many other areas as well. It's
been largely undeveloped (as has the Gnome GUI) because we had very
little feedback to indicate that people were actually interested in
getting it to work better. Patches might be considered.

The particular issue is just a matter of sending suitable options to
the Tk geometry manager. This can be fixed in the console.tcl script,
and in fact also from within the running GUI using

    with(.GUIenv, tkpack("configure", Term, expand=TRUE, fill="both"))

--
   O__        Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ ---  Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) --  University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~ -          ([EMAIL PROTECTED])                     FAX: (+45) 35327907