Re: [Rd] idea for GSoC: an R package for fitting Bayesian Hierarchical Models
Above all, thanks Ben for taking the time to read my proposal!

2008/3/24, Ben Bolker <[EMAIL PROTECTED]>:
> Antonio, Fabio Di Narzo gmail.com> writes:
>
> > I've put online a temp web page with some more info (and sources):
> >
> > http://antonio.fabio.googlepages.com/rgs%3Athergibbssampler
> >
> > Bests,
> > Antonio.
>
> Have you seen Jouni Kerman's Umacs package? It sounds similar
> in spirit.

Ya. But speeds are rather different.
I admittedly missed a comparison with Umacs in my short demo. However, from some early experiments (which I'm running as I write), my approach is, as I suspected, many times faster than Umacs, even if one doesn't specify samplers as C code. Things go even better for my demo implementation if one plugs in samplers specified as pure C code, which would further eliminate a lot of memory allocations/deallocations behind those "rnorm()" calls.

My aim is to obtain something which achieves decent speed compared with JAGS. I mean, I can easily experiment with new samplers by using an interpreted language, but if in the end I obtain something which is *many* times slower than JAGS (which is moreover much more robust and easier to work with), the whole thing is of little practical interest.

More: how can one really experiment with a new custom sampler if doing a few thousand iterations takes forever, so that checking the sampler's practical behaviour is a pain (I speak from personal experience)? That's why I want to always pay attention to speed, and give the user the possibility to use either R or C code at their choice, with the ability to modify model node values in place, without unneeded 'malloc's. Ya, I would abandon pure functional style...

I will try to add a reproducible benchmark comparing my demo implementation with JAGS and Umacs. However, I see that the problem here is still finding someone interested in it at all.

> Something I would love to see done (not that I have the time
> and energy to supervise someone to do it right now) would be
> an R (or Python/etc.: R wouldn't necessarily be the best tool)
> to translate lmer/nlme syntax (Wilkinson-Rogers with extensions
> for specifying random factors, correlation structures, etc.)
> into a BUGS file. It strikes me that it would be a really nice
> way to bridge the gap between what mixed-model code can do
> and what requires BUGS/MCMC. Such models could also serve as
> (1) a way to cross-check the results of mixed model code;
> (2) a way to get started in relaxing the assumptions of mixed
> models (e.g. allowing for non-normal random effects distributions).

That sounds interesting. However, I currently don't have enough know-how to work on something like that.

Antonio.

> Ben Bolker

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] idea for GSoC: an R package for fitting Bayesian Hierarchical Models
Antonio, Fabio Di Narzo gmail.com> writes:

I agree that quantitative differences in speed can make a qualitative difference in the way one works.

Well, I'm somewhat interested, but don't feel that I'm necessarily appropriate as a mentor. I don't think it would be terrible if you contacted particular people (Gelman, Plummer, O'Hara?) to see if they were interested ... they could always say no ...

> > Something I would love to see done (not that I have the time
> > and energy to supervise someone to do it right now) would be
> > an R (or Python/etc.: R wouldn't necessarily be the best tool)
> > to translate lmer/nlme syntax (Wilkinson-Rogers with extensions
> > for specifying random factors, correlation structures, etc.)
> > into a BUGS file. It strikes me that it would be a really nice
> > way to bridge the gap between what mixed-model code can do
> > and what requires BUGS/MCMC. Such models could also serve as
> > (1) a way to cross-check the results of mixed model code;
> > (2) a way to get started in relaxing the assumptions of mixed
> > models (e.g. allowing for non-normal random effects distributions).
>
> That sounds interesting. However, I currently don't have enough
> know-how to work on something like that.

Do you think so? I don't think it would be too hard, if you were interested ...

Ben
Re: [Rd] idea for GSoC: an R package for fitting Bayesian Hierarchical Models
> Ya. But speeds are rather different.
> I admittedly missed a comparison with Umacs in my short demo.
> However, from some early experiments (which I'm running as I write),
> my approach is, as I suspected, many times faster than Umacs,
> even if one doesn't specify samplers as C code. Things go even
> better for my demo implementation if one plugs in samplers
> specified as pure C code, which would further eliminate a lot of
> memory allocations/deallocations behind those "rnorm()" calls.
>
> My aim is to obtain something which achieves decent speed compared
> with JAGS. I mean, I can easily experiment with new samplers by using
> an interpreted language, but if in the end I obtain something which is
> *many* times slower than JAGS (which is moreover much more robust and
> easier to work with), the whole thing is of little practical interest.
>
> More: how can one really experiment with a new custom sampler if doing
> a few thousand iterations takes forever, so that checking the sampler's
> practical behaviour is a pain (I speak from personal experience)?
> That's why I want to always pay attention to speed, and give the user
> the possibility to use either R or C code at their choice, with
> the ability to modify model node values in place, without unneeded
> 'malloc's. Ya, I would abandon pure functional style...

There is some interesting work being done on this topic in computer science, e.g.

@inproceedings{keller:2008,
  Author = {Keller, Gabriele and Chaffey-Millar, Hugh and Chakravarty, Manuel M. T. and Stewart, Don and Barner-Kowollik, Christopher},
  Booktitle = {Proceedings of the Tenth International Symposium on Practical Aspects of Declarative Languages},
  Title = {Specialising Simulator Generators for High-Performance Monte-Carlo Methods},
  Url = {http://www.cse.unsw.edu.au/~chak/project/polysim/},
  Year = {2008}
}

which explores a way to define a simulation at a high level and then compile it down to fast low-level primitives. This seems like an interesting approach, but I suspect you would struggle to find students with the requisite statistical and computational backgrounds.

Hadley

--
http://had.co.nz/
Re: [Rd] idea for GSoC: an R package for fitting Bayesian Hierarchical Models
2008/3/24, hadley wickham <[EMAIL PROTECTED]>:
> > Ya. But speeds are rather different.
> > I admittedly missed a comparison with Umacs in my short demo.
> > However, from some early experiments (which I'm running as I write),
> > my approach is, as I suspected, many times faster than Umacs,
> > even if one doesn't specify samplers as C code. Things go even
> > better for my demo implementation if one plugs in samplers
> > specified as pure C code, which would further eliminate a lot of
> > memory allocations/deallocations behind those "rnorm()" calls.
>
> There is some interesting work being done on this topic in computer
> science, e.g.
>
> @inproceedings{keller:2008,
>   Author = {Keller, Gabriele and Chaffey-Millar, Hugh and Chakravarty,
>     Manuel M. T. and Stewart, Don and Barner-Kowollik, Christopher},
>   Booktitle = {Proceedings of the Tenth International Symposium on
>     Practical Aspects of Declarative Languages},
>   Title = {Specialising Simulator Generators for High-Performance
>     Monte-Carlo Methods},
>   Url = {http://www.cse.unsw.edu.au/~chak/project/polysim/},
>   Year = {2008}
> }
>
> which explores a way to define a simulation at a high level and then
> compile it down to fast low-level primitives. This seems like an
> interesting approach, but I suspect you would struggle to find
> students with the requisite statistical and computational backgrounds.
>
> Hadley

Thanks for the reference: that's surely interesting reading.

Instead of inventing a specialised meta-language for this kind of task (I don't even have the knowledge for doing something like that), I've explored in the recent past the direct use of higher-level languages that can be compiled into native code. I got the most interesting results with Steel Bank Common Lisp and OCaml. They really seem to do what they claim :-)

However, writing something like a general-purpose Gibbs sampler framework in these languages seems to be a waste of time, as one misses a lot of things which are already available in R. Not least: random number generators for a lot of common distributions! OK, one can write wrapper code around the R standalone library, but this all looks like extra work.

So, while waiting for an R-to-native-code compiler, I think a feasible approach is to write R functions with pass-by-reference semantics and a bunch of small C routines. In that respect, I was inspired by the lush project:
http://lush.sourceforge.net
where a mix of high-level and C code is encouraged (note that lush in the end has a native code compiler too).

Antonio.
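The "R functions with pass-by-reference semantics" idea can be sketched in plain R with environments, which are modified in place rather than copied. This is only an illustration and not the code from the demo page: the model (normal data with unknown mean and precision), the function names `update_mu`, `update_tau` and `gibbs`, and the prior parameters `a` and `b` are all assumptions made for the example.

```r
# Hypothetical sketch: node values live in an environment, so the update
# functions change them in place instead of returning modified copies.
# Assumed model: y_i ~ N(mu, 1/tau), flat prior on mu, Gamma(a, b) prior on tau.

update_mu <- function(node, y) {
  n <- length(y)
  # full conditional of mu: N(mean(y), 1 / (n * tau))
  node$mu <- rnorm(1, mean(y), sqrt(1 / (n * node$tau)))
  invisible(NULL)
}

update_tau <- function(node, y, a = 0.01, b = 0.01) {
  n <- length(y)
  # full conditional of tau: Gamma(a + n/2, rate = b + sum((y - mu)^2) / 2)
  node$tau <- rgamma(1, shape = a + n / 2,
                     rate = b + sum((y - node$mu)^2) / 2)
  invisible(NULL)
}

gibbs <- function(y, n_iter = 2000) {
  node <- new.env()          # shared, mutable state
  node$mu <- 0
  node$tau <- 1
  draws <- matrix(NA_real_, n_iter, 2,
                  dimnames = list(NULL, c("mu", "tau")))
  for (i in seq_len(n_iter)) {
    update_mu(node, y)       # modifies node$mu in place
    update_tau(node, y)      # modifies node$tau in place
    draws[i, ] <- c(node$mu, node$tau)
  }
  draws
}

set.seed(1)
y <- rnorm(100, mean = 3, sd = 2)
draws <- gibbs(y)
colMeans(draws[-(1:200), ])  # posterior means after discarding burn-in
```

Each updater here could in principle be swapped for a compiled routine reached through `.C()` or `.Call()`; that part is left out because it depends on how the package would expose node storage.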
Re: [Rd] Roxygen
Hey Peter,

sorry for the delay, I was on Easter holiday.

> Would it suffice, by the way, to source() a file and introspect upon
> its objects with ls(), formals(), typeof(), mode(), and the like; or
> should we formalize, say, a BNF and write the accompanying automaton?

I agree with Duncan and Hadley. I think a good way is to write a parser for the comments and use the parse() function for the R code.

Manuel.
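A minimal sketch of that split, with the comments handled by a small hand-written pass and the code handled by parse(). The `#'` comment prefix, the `@param` tag, and the toy `add` function are assumptions chosen for illustration; the thread does not fix a syntax.

```r
# Hypothetical input: a documented function, as lines of source text.
src <- c(
  "#' Add two numbers",
  "#' @param x first number",
  "#' @param y second number",
  "add <- function(x, y) x + y"
)

# 1. Comments: parse() discards them, so pull the "#'" lines out by hand.
doc_lines <- sub("^#' ?", "", grep("^#'", src, value = TRUE))
params <- sub("^@param +([^ ]+).*", "\\1",
              grep("^@param", doc_lines, value = TRUE))

# 2. Code: parse() returns unevaluated expressions; nothing is executed.
exprs <- parse(text = src)
assignment <- exprs[[1]]                    # the call `add <- function(x, y) x + y`
fun_name <- as.character(assignment[[2]])   # left-hand side of the assignment
fun_args <- names(assignment[[3]][[2]])     # formals stored in the `function` call

fun_name
fun_args
undocumented <- setdiff(fun_args, params)   # arguments lacking a @param entry
undocumented
```

Working on the parsed expressions rather than on source()d objects means the file never has to be run, which matters when the code has side effects.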
[Rd] Inaccurate qgamma() (PR#11030)
I haven't looked inside to see what is causing this, but there's a big discontinuity in qgamma:

  curve(qgamma(x, shape=19), from=1e-10, to=2e-10)

This appears in both R-patched and R-devel.

Duncan Murdoch
[Rd] resampling from string when it runs across multiple lines
Hi,

I need to resample from a long string, which is written in many lines with carriage-return marks at the end of each line. Perhaps because the data looks like a matrix, using the code:

  sample(data, 25, replace=T)

gives me 25 columns of characters from the data, because it is resampling whole columns. What I would like it to do is to treat the data as a vector that has just been spread across many lines, and pick single characters from random positions in randomly chosen lines.

I am reproducing a sample dataset, the command and the output here:

> y
     X..1. X..2. X..3. X..4. X..5. X..6. X..7. X..8. X..9. X..10.
[1,] A     C     G     T     T     G     C     A     G     C
[2,] A     C     G     F     F     F     F     F     F     G
[3,] A     C     G     S     S     S     S     S     G     A
[4,] A     C     G     T     T     G     C     A     G     G
[5,] A     B     B     B     B     B     B     A     G     T

> sample(y, 20, replace=T)
     X..9.   X..4.   X..2.   X..7.   X..9..1 X..3.   X..3..1 X..9..2 X..9..3 X..4..1 X..3..2 X..8.   X..9..4 X..3..3 X..6.   X..7..1
[1,] G       T       C       C       G       G       G       G       G       T       G       A       G       G       G       C
[2,] F       F       C       F       F       G       G       F       F       F       G       F       F       G       F       F
[3,] G       S       C       S       G       G       G       G       G       S       G       S       G       G       S       S
[4,] G       T       C       C       G       G       G       G       G       T       G       A       G       G       G       C
[5,] G       B       B       B       G       B       B       G       G       B       B       A       G       B       B       B
     X..6..1 X..3..4 X..7..2 X..10.
[1,] G       G       C       C
[2,] F       G       F       G
[3,] S       G       S       A
[4,] G       G       C       G
[5,] B       B       B       T

I wanted to try the bootstrap approach (since that's what I am doing: resampling with replacement), but that requires a "statistic" and I don't know what sense that makes for character data.

Any help will be greatly appreciated.

Thanks,
S.
Re: [Rd] resampling from string when it runs across multiple lines
try this:

  y <- as.matrix(read.table(textConnection(
  "A C G T T G C A G C
  A C G F F F F F F G
  A C G S S S S S G A
  A C G T T G C A G G
  A B B B B B B A G T"
  ), stringsAsFactors = FALSE))

  ind <- sample(length(y), 20, TRUE)
  y[ind]

I hope it helps.

Best,
Dimitris

P.S. It would be best to send this kind of e-mail to R-help, not R-devel; check http://www.r-project.org/mail.html for more info regarding the different R mailing lists.

Dimitris Rizopoulos
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

Quoting Suraaga Kulkarni <[EMAIL PROTECTED]>:

> Hi,
>
> I need to resample from a long string, which is written in many lines with
> carriage-return marks at the end of each line. Perhaps because the data
> looks like a matrix, using the code: sample(data, 25, replace=T) gives me 25
> columns of characters from the data because it is resampling whole columns.
> What I would like it to do is to treat the data as a vector that has just
> been spread across many lines, and pick single characters from random
> positions in randomly chosen lines.
>
> I am reproducing a sample dataset, the command and the output here:
>
>> y
>      X..1. X..2. X..3. X..4. X..5. X..6. X..7. X..8. X..9. X..10.
> [1,] A     C     G     T     T     G     C     A     G     C
> [2,] A     C     G     F     F     F     F     F     F     G
> [3,] A     C     G     S     S     S     S     S     G     A
> [4,] A     C     G     T     T     G     C     A     G     G
> [5,] A     B     B     B     B     B     B     A     G     T
>
>> sample(y, 20, replace=T)
>      X..9.   X..4.   X..2.   X..7.   X..9..1 X..3.   X..3..1 X..9..2 X..9..3 X..4..1 X..3..2 X..8.   X..9..4 X..3..3 X..6.   X..7..1
> [1,] G       T       C       C       G       G       G       G       G       T       G       A       G       G       G       C
> [2,] F       F       C       F       F       G       G       F       F       F       G       F       F       G       F       F
> [3,] G       S       C       S       G       G       G       G       G       S       G       S       G       G       S       S
> [4,] G       T       C       C       G       G       G       G       G       T       G       A       G       G       G       C
> [5,] G       B       B       B       G       B       B       G       G       B       B       A       G       B       B       B
>      X..6..1 X..3..4 X..7..2 X..10.
> [1,] G       G       C       C
> [2,] F       G       F       G
> [3,] S       G       S       A
> [4,] G       G       C       G
> [5,] B       B       B       T
>
> I wanted to try the bootstrap approach (since that's what I am doing -
> resampling with replacement) but that requires a "statistic" and I don't
> know what sense that makes for character data.
>
> Any help will be greatly appreciated.
>
> Thanks,
>
> S.

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
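The difference between the two behaviours in this thread can be sketched as follows. The data are the toy rows from the post; holding them in a data frame is an assumption, suggested by the X..1., X..2., ... column names. On a data frame, sample() draws whole columns, because a data frame is a list of columns; flattening to a plain vector first makes it draw individual cells.

```r
# Toy data from the thread, stored as a data frame (assumed).
y <- data.frame(
  rbind(c("A", "C", "G", "T", "T", "G", "C", "A", "G", "C"),
        c("A", "C", "G", "F", "F", "F", "F", "F", "F", "G"),
        c("A", "C", "G", "S", "S", "S", "S", "S", "G", "A"),
        c("A", "C", "G", "T", "T", "G", "C", "A", "G", "G"),
        c("A", "B", "B", "B", "B", "B", "B", "A", "G", "T")),
  stringsAsFactors = FALSE
)

set.seed(42)

# Sampling the data frame itself resamples columns.
cols <- sample(y, 3, replace = TRUE)
dim(cols)                              # 5 rows, 3 columns

# Flatten to a character vector, then sample single cells.
cells <- unlist(y, use.names = FALSE)  # 50 single characters
draw <- sample(cells, 20, replace = TRUE)
draw
```

The index-based version in the reply above, `y[sample(length(y), 20, TRUE)]`, does the same thing once `y` is a matrix, since a matrix is already a vector with dimensions.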
[Rd] revision 44805 breaks promptMethods()
Hi,

It seems that promptMethods() was changed in revision 44805 to include a call to "isGenericFunction" that, at least as of revision 44861, does not exist as far as I can tell.

This results in:

  could not find function "isGenericFunction"

when evaluating e.g. promptMethods("myGeneric")

Thanks,
Michael
Re: [Rd] Inaccurate qgamma() (PR#11030)
On 24/03/2008 1:35 PM, [EMAIL PROTECTED] wrote:
> I haven't looked inside to see what is causing this, but there's a big
> discontinuity in qgamma:
>
> curve(qgamma(x, shape=19), from=1e-10, to=2e-10)
>
> This appears in both R-patched and R-devel.

With debugging turned on, the inaccurate value prints this:

> qgamma(1.2e-10, shape=19)
qgamma(p=1.2e-10, alpha= 19, scale= 1, l.t.= 1, log_p= 0):
nu > .32: Wilson-Hilferty; x = -6.33328
  ==> ch =5.03581:
  Ph.II iter; ch=5.03581, p2=8.8438e-11
  it=2, ch=-0.635427, p2=4.94066e-324
  it=3, ch=1.2e-10, p2=4.94066e-324
  it=4, ch=nan, p2=4.94066e-324
  it=1: p=1.2e-10, x = 2.51791, p.=3.1562e-11; p1:=D{p}=-8.8438e-11
no Newton step done since delta{p} >= last delta
[1] 2.517907

Things are fine if I use log.p=TRUE:

> qgamma(log(1.2e-10), shape=19, log.p=TRUE)
qgamma(p=-22.8435, alpha= 19, scale= 1, l.t.= 1, log_p= 1):
nu > .32: Wilson-Hilferty; x = -6.33328
  ==> ch =5.03581:
  Ph.II iter; ch=5.03581, p2=8.8438e-11
  it=2, ch=-0.635427, p2=4.94066e-324
  it=3, ch=1.2e-10, p2=4.94066e-324
  it=4, ch=nan, p2=4.94066e-324
  it=1: p=-22.8435, x = 2.51791, p.=-24.1791; p1:=D{p}=-1.33554
  it=2, d{p}=-0.0581595
  it=3, d{p}=-0.00011856
  it=4, d{p}=-4.9439e-10
  it=5, d{p}=1.42247e-15
[1] 2.729837

Maybe we should switch to this scale when the first try fails?

Duncan Murdoch
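One way to check which of the two values is right is a round trip through pgamma(): a correct quantile q should give pgamma(q) close to the requested p. The sketch below only uses the numbers quoted above; the helper name `rel_err` is made up for the example, and the result for the plain call depends on the R version, since later versions may no longer show the problem.

```r
p <- 1.2e-10
shape <- 19

q_plain <- qgamma(p, shape = shape)                      # probability scale
q_log   <- qgamma(log(p), shape = shape, log.p = TRUE)   # log scale

# Relative round-trip error: how far pgamma(q) is from the requested p.
rel_err <- function(q) abs(pgamma(q, shape = shape) / p - 1)

c(plain = q_plain, log = q_log)
c(plain = rel_err(q_plain), log = rel_err(q_log))
```

In the session quoted above, the probability-scale call returned 2.517907, for which the debug trace reports p. = 3.1562e-11 instead of 1.2e-10, while the log-scale call converged to 2.729837.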
Re: [Rd] revision 44805 breaks promptMethods()
I think, as of just 2 revisions later (44863) anyway, that problem has been fixed.

Sitting in the methods/R directory:

  grep -nH -e "isGenericFunction" *.R

  Grep finished with no matches found at Mon Mar 24 16:55:39

Try updating & see if all is well.

(r-devel is a moving target, particularly a month before release!)

John

Michael Lawrence wrote:
> Hi,
>
> It seems that promptMethods() was changed in revision 44805 to include a
> call to "isGenericFunction" that at least as of revision 44861 does not
> exist as far as I can tell.
>
> This results in:
> could not find function "isGenericFunction"
>
> when evaluating e.g. promptMethods("myGeneric")
>
> Thanks,
> Michael
[Rd] regexp with [:upper:] (PR#11032)
Full_Name: Mark Bravington
Version: 2.6.2 patched
OS: Windows XP Pro
Submission from: (NULL) (140.79.22.104)

> grep( '[:upper:]', letters, val=T) # shurely shouldn't match anything ??
[1] "e" "p" "r" "u"

The converse ( '[:lower:]' and LETTERS) seems to work OK.

--please do not edit the information below--

Version:
 platform = i386-pc-mingw32
 arch = i386
 os = mingw32
 system = i386, mingw32
 status = Patched
 major = 2
 minor = 6.2
 year = 2008
 month = 03
 day = 21
 svn rev = 44836
 language = R
 version.string = R version 2.6.2 Patched (2008-03-21 r44836)

Windows XP (build 2600) Service Pack 2

Locale:
LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

Search Path:
 .GlobalEnv, package:stats, package:graphics, package:grDevices, package:utils, package:datasets, package:methods, Autoloads, pa
Re: [Rd] regexp with [:upper:] (PR#11032)
That is correct: I suspect you meant the character class [[:upper:]]

> grep( '[[:upper:]]', letters, val=TRUE)
character(0)

You asked for matches amongst :upper:, and that is what you got. As ?regexp does say

   (Note that the brackets in these class names are part of the symbolic
   names, and must be included in addition to the brackets delimiting the
   bracket list.)

this appears to be a homework failure.

On Tue, 25 Mar 2008, [EMAIL PROTECTED] wrote:

> Full_Name: Mark Bravington
> Version: 2.6.2 patched
> OS: Windows XP Pro
> Submission from: (NULL) (140.79.22.104)
>
>
>> grep( '[:upper:]', letters, val=T) # shurely shouldn't match anything ??
> [1] "e" "p" "r" "u"
>
> The converse ( '[:lower:]' and LETTERS) seems to work OK.
>
> --please do not edit the information below--
>
> Version:
> platform = i386-pc-mingw32
> arch = i386
> os = mingw32
> system = i386, mingw32
> status = Patched
> major = 2
> minor = 6.2
> year = 2008
> month = 03
> day = 21
> svn rev = 44836
> language = R
> version.string = R version 2.6.2 Patched (2008-03-21 r44836)
>
> Windows XP (build 2600) Service Pack 2
>
> Locale:
> LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252
>
> Search Path:
> .GlobalEnv, package:stats, package:graphics, package:grDevices, package:utils,
> package:datasets, package:methods, Autoloads, pa

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
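To make the distinction concrete, here is a small sketch. A single pair of brackets delimits a bracket list of literal characters, so the pattern from the report is read as "any one of : u p e r"; the first call below spells that list out in a different order so that it cannot be mistaken for a class name. The test vectors other than `letters` are made up for the example.

```r
# A bracket list of five literal characters: u, p, e, r and ":".
# This is how the report's '[:upper:]' was interpreted.
as_list <- grep("[uper:]", letters, value = TRUE)
as_list

# The character class [:upper:] used inside a bracket list.
none  <- grep("[[:upper:]]", letters, value = TRUE)
upper <- grep("[[:upper:]]", c("a", "B", "cD"), value = TRUE)
none
upper

# Several classes can share one bracket list.
mixed <- grep("[[:upper:][:digit:]]", c("a", "B", "7"), value = TRUE)
mixed
```

The outer brackets are the bracket list and the inner `[:upper:]` is the class name, which is why both pairs are needed.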
Re: [Rd] regexp with [:upper:] (PR#11032)
Aaargh, sorry. I thought it might be to do with Australian English...

Prof Brian Ripley wrote:
> That is correct: I suspect you meant the character class [[:upper:]]
>
>> grep( '[[:upper:]]', letters, val=TRUE)
> character(0)
>
> You asked for matches amongst :upper:, and that is what you got.
> As ?regexp does say
>
>    (Note that the brackets in these class names are part of the
>    symbolic names, and must be included in addition to the brackets
>    delimiting the bracket list.)
>
> this appears to be a homework failure.
>
> On Tue, 25 Mar 2008, [EMAIL PROTECTED] wrote:
>
>> Full_Name: Mark Bravington
>> Version: 2.6.2 patched
>> OS: Windows XP Pro
>> Submission from: (NULL) (140.79.22.104)
>>
>>
>>> grep( '[:upper:]', letters, val=T) # shurely shouldn't match anything ??
>> [1] "e" "p" "r" "u"
>>
>> The converse ( '[:lower:]' and LETTERS) seems to work OK.
>>
>> --please do not edit the information below--
>>
>> Version:
>> platform = i386-pc-mingw32
>> arch = i386
>> os = mingw32
>> system = i386, mingw32
>> status = Patched
>> major = 2
>> minor = 6.2
>> year = 2008
>> month = 03
>> day = 21
>> svn rev = 44836
>> language = R
>> version.string = R version 2.6.2 Patched (2008-03-21 r44836)
>>
>> Windows XP (build 2600) Service Pack 2
>>
>> Locale:
>> LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252
>>
>> Search Path:
>> .GlobalEnv, package:stats, package:graphics, package:grDevices,
>> package:utils, package:datasets, package:methods, Autoloads, pa

--
Mark Bravington
CSIRO Mathematical & Information Sciences
Marine Laboratory
Castray Esplanade
Hobart 7001
TAS

ph (+61) 3 6232 5118
fax (+61) 3 6232 5012
mob (+61) 438 315 623