[R] Question about many-to-one merge
I have spent a whole afternoon searching for the solution, but got nothing so far. It should be a simple problem. I have two datasets to merge. The first one just contains two ID columns and it looks like: FromID ToID 1 2 1 3 2 1 2 3 3 1 3 2 The second one contains a ID column and a variable: ID X 1 100 2 150 3 130 What I want is to merge the two datasets together using the ToID and ID. The final merged dataset should look like: FromID ToID X 12 150 13 130 21 100 23 130 31 100 32 150 The Merge command doesn't work well. I also don't want to use for loop. I feel that there must be some subscripting or other tricks to use, but I couldn't figure it out. Any help is appreciated!! -Hongwei -- View this message in context: http://n4.nabble.com/Question-about-many-to-one-merge-tp1018811p1018811.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question about many-to-one merge
Oh, yes, my bad. Merge works! I guess the reordering automatically made by Merge made me confused. I wrote my code using Merge several months ago, and just didn't feel it right today. Thanks anyway! -- View this message in context: http://n4.nabble.com/Question-about-many-to-one-merge-tp1018811p1037528.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question about many-to-one merge
btw, the re-order and swap make it much clearer, so thanks again! -- View this message in context: http://n4.nabble.com/Question-about-many-to-one-merge-tp1018811p1037860.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Estimate Discrete Choice Models with R
Hi, R users, Does anyone know whether there are any Discrete Choice Modeling modules for R? Any books or websites discussing this? Thanks Harry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Estimate Discrete Choice Models with R
Thanks for the replies. What I'm really interest in are the functions that can do GEV, probit, and mixed logit. I searched R, and I did find the function that can do the regular multinomial logit model, but I did not see the function that can do GEV, probit, and mixed logit. Any further advices? Thanks. Harry On Fri, Aug 14, 2009 at 3:49 PM, Hongwei Dong wrote: > Hi, R users, > Does anyone know whether there are any Discrete Choice Modeling modules > for R? Any books or websites discussing this? > > Thanks > > Harry > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] why summary() does not work here???
Hi, R users, I'm using the function "vglm" to estimate a multinomial logit model. Every time I use "summary()" to ask for the coefficients and std error, I got this: Length Class Mode 1 vglm S4 Anyone know what's wrong here? Thanks. Harry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] why summary() does not work here???
I'm quite sure the VGAM package has been installed because I used it to estimate the model before I use summary(). One reason I can guess is that the default method for summary() is S3, but here it is S4. But I'm not sure and don't know what to do with that. Here is an example you can reproduce: counts = c(18,17,15,20,10,20,25,13,12) outcome = gl(3,1,9) treatment = gl(3,3) print(d.AD <- data.frame(treatment, outcome, counts)) vglm.D93 = vglm(counts ~ outcome + treatment, family=poissonff) summary(vglm.D93) With this, I got: > summary(vglm.D93) Length Class Mode 1 vglm S4 If I just type "vglm.D93", I got: Call: vglm(formula = counts ~ outcome + treatment, family = poissonff) Coefficients: (Intercept) outcome2 outcome3treatment2treatment3 3.044522e+00 -4.542553e-01 -2.929871e-01 7.766813e-16 3.009745e-16 Degrees of Freedom: 9 Total; 4 Residual Residual Deviance: 5.129141 Log-likelihood: -23.38066 Looking forward to see more advices. Thanks. Harry On Sat, Aug 15, 2009 at 8:18 PM, Ronggui Huang wrote: > I cannot reproduce what you mentioned. One possible is that you use > summary() without loading VGAM package. > > > 2009/8/16 Hongwei Dong : > > Hi, R users, > > I'm using the function "vglm" to estimate a multinomial logit model. > Every > > time I use "summary()" to ask for the coefficients and std error, I got > > this: > > > > Length Class Mode > > 1 vglm S4 > > > > Anyone know what's wrong here? Thanks. > > > > Harry > > > >[[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > HUANG Ronggui, Wincent > PhD Candidate > Dept of Public and Social Administration > City University of Hong Kong > Home page: http://asrr.r-forge.r-project.org/rghuang.html > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] why summary() does not work here???
it seems we are using the version of VGAM and R. > packageDescription("VGAM")$Version [1] "0.7-9" > sessionInfo() R version 2.9.1 (2009-06-26) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats4splines stats graphics grDevices utils datasets methods base other attached packages: [1] MASS_7.2-48 nnet_7.2-48 VGAM_0.7-9 loaded via a namespace (and not attached): [1] tools_2.9.1 On Sun, Aug 16, 2009 at 12:34 PM, Gavin Simpson wrote: > On Sun, 2009-08-16 at 10:48 -0700, Hongwei Dong wrote: > > I'm quite sure the VGAM package has been installed because I used it to > > estimate the model before I use summary(). One reason I can guess is that > > the default method for summary() is S3, but here it is S4. But I'm not > sure > > and don't know what to do with that. Here is an example you can > reproduce: > > > > counts = c(18,17,15,20,10,20,25,13,12) > > outcome = gl(3,1,9) > > treatment = gl(3,3) > > print(d.AD <- data.frame(treatment, outcome, counts)) > > vglm.D93 = vglm(counts ~ outcome + treatment, family=poissonff) > > summary(vglm.D93) > > Works for me with: > > > packageDescription("VGAM")$Version > [1] "0.7-9" > > sessionInfo() > R version 2.9.1 Patched (2009-08-07 r49107) > i686-pc-linux-gnu > > locale: > > LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C > > attached base packages: > [1] stats4splines stats graphics grDevices utils > [7] datasets methods base > > other attached packages: > [1] VGAM_0.7-9 > > loaded via a namespace (and not attached): > [1] tools_2.9.1 > > Are you using an older R or version of VGAM? Is this in a clean session? > > G > > > > > With this, I got: > > > > > summary(vglm.D93) > > Length Class Mode > > 1 vglm S4 > > > > If I just type "vglm.D93", I got: > > > > Call: > > vglm(formula = counts ~ outcome + treatment, family = poissonff) > > > > Coefficients: > > (Intercept) outcome2 outcome3treatment2treatment3 > > 3.044522e+00 -4.542553e-01 -2.929871e-01 7.766813e-16 3.009745e-16 > > > > Degrees of Freedom: 9 Total; 4 Residual > > Residual Deviance: 5.129141 > > Log-likelihood: -23.38066 > > > > > > Looking forward to see more advices. Thanks. > > > > Harry > > > > > > > > > > On Sat, Aug 15, 2009 at 8:18 PM, Ronggui Huang >wrote: > > > > > I cannot reproduce what you mentioned. One possible is that you use > > > summary() without loading VGAM package. > > > > > > > > > 2009/8/16 Hongwei Dong : > > > > Hi, R users, > > > > I'm using the function "vglm" to estimate a multinomial logit model. > > > Every > > > > time I use "summary()" to ask for the coefficients and std error, I > got > > > > this: > > > > > > > > Length Class Mode > > > > 1 vglm S4 > > > > > > > > Anyone know what's wrong here? Thanks. > > > > > > > > Harry > > > > > > > >[[alternative HTML version deleted]] > > > > > > > > __ > > > > R-help@r-project.org mailing list > > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > > PLEASE do read the posting guide > > > http://www.R-project.org/posting-guide.html > > > > and provide commented, minimal, self-contained, reproducible code. > > > > > > > > > > > > > > > > -- > > > HUANG Ronggui, Wincent > > > PhD Candidate > > > Dept of Public and Social Administration > > > City University of Hong Kong > > > Home page: http://asrr.r-forge.r-project.org/rghuang.html > > > > > > > [[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > -- > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > Dr. Gavin Simpson [t] +44 (0)20 7679 0522 > ECRC, UCL Geography, [f] +44 (0)20 7679 0565 > Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk > Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ > UK. WC1E 6BT. [w] http://www.freshwaters.org.uk > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] memory limit in R
Hi, all, I'm doing a discrete choice model in R and kept getting this error: Error: cannot allocate vector of size 198.6 Mb. Does this mean the memory limit in R has been reached? > memory.size() [1] 1326.89 > memory.size(TRUE) [1] 1336 > memory.limit() [1] 1535 My laptop has a 4G memory with Windows Vista (32 bit). I increased the memory limit to 2500 M. But still getting the same error message. Any one can give me some suggestions? Thanks. Harry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] memory limit in R
The size of my .Rdata workspace is about 9.2 M and the data I'm using is the only one object in this workspace. Is it a large one? Thanks. Harry On Tue, Aug 18, 2009 at 4:21 AM, jim holtman wrote: > About 2GB is the limit of the address space on 32-bit windows (you can > get upto 3GB with a special flag; check the documentation). Check the > size of the other objects your workspace and remove any you don't need > anymore. A rule of thumb is that your largest object should only be > at most 25% of the available space; in your case I would limit it to > at most 500MB, but try to get by with a smaller object. Since > everything is kept in memory, you need to conserve. You never did say > what you were doing or how large the objects were you were working on. > > If they are large, think about keeping the data in a database and only > retrieving the portion you need. Try to run with a smaller sample. > > On Mon, Aug 17, 2009 at 2:43 PM, Hongwei Dong wrote: > > Hi, all, I'm doing a discrete choice model in R and kept getting this > error: > > Error: cannot allocate vector of size 198.6 Mb. > > > > Does this mean the memory limit in R has been reached? > > > >> memory.size() > > [1] 1326.89 > >> memory.size(TRUE) > > [1] 1336 > >> memory.limit() > > [1] 1535 > > > > My laptop has a 4G memory with Windows Vista (32 bit). I increased the > > memory limit to 2500 M. But still getting the same error message. Any one > > can give me some suggestions? Thanks. > > > > Harry > > > >[[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Jim Holtman > Cincinnati, OH > +1 513 646 9390 > > What is the problem that you are trying to solve? > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] select number of lags in ADF.test
Hi, all, According to the ADF.test help,""Singf" removes the non-significant lags at the 10% level of significance until all the selected lags are significant." However, when I use "singf", it still give me the lags that are not significant at 10% level. Anyone knows why? Thanks. Harry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to use "subset" in lm function
Hi, I'm using R to do a time series analysis. In the model, I use the lags of some variables. such the lags of the variables have different length, I just can't use them directly in the lm function. Intuitively, I feel that "subset" might be useful, but I do not know how to use it. Anyone can give me an example syntax? Thanks. Harry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Plot two graphs with different ranges in one
Hi, I'm trying to plot two variables in one graph. One ranges between 0 and 1, while the other ranges between 50 and 500. Can I plot them in one graph with similar scale? Thanks Harry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to predict dynamic model in R
I have a dynamic time series model like this: dyn$lm( y ~ lag(y,-1) + x + lag(x,-1)+lag(x,-2) ) I need to do an out of sample forecast with this model. Is there any way I can do this with R? It would be greatly appreciated if some one can give me an example. Thanks. Harry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to predict dynamic model in R
Hi, Gabor and Other R users, I'm re-posting my script and the results I got. here is the dynamic model I used to estimate in-sample model (1996-2006) and it works: fit<-dyn$lm(Y~lag(Y,-1)+z+x+lag(x,-1)+lag(x,-2)+lag(x,-3)+lag(x,-4)) Then I used this model to do out sample forecast with the following scripts, which do not work: z<-ts(Z[41:52],start=2006,frequency=4) x<-ts(X[41:52],start=2006,frequency=4) newdata<-data.frame(cbind(z,x)) newdata<-ts(newdata,start=2006,frequency=4) pred<-predict(fit,newdata) Here is the results I got from R: Qtr1 Qtr2 Qtr3 Qtr4 2006 NA NA NA NA 2007 3083.362 NA NA NA 2008 NA NA NA NA 2009 NA NA NA NA I got only one prediction for the first quarter in 2007. Intuitively, there might be two problems: the definition of the newdata and how to define Y in newdata. But I just can't figure this out. It will greatly appreciated if someone can give me some help. Thanks. Harry On Thu, Jul 23, 2009 at 5:15 AM, Gabor Grothendieck wrote: > You have to use consistent classes. You can't start out using > ts class and then suddenly switch to zoo class as if you had been > using zoo all along. Either use ts everywhere or zoo everywhere. > > Also in the future please post reproducible examples as requested > at the bottom of every message to r-help. That means include > a minimal amount of data so we can get exactly what you > are getting. > > On Thu, Jul 23, 2009 at 4:48 AM, Hongwei Dong wrote: > > Thanks, Gabor. This is really helpful. > > When the regressive part, lag(Y,-1), is not included, my sytax works > well. > > However, when I include the lag(Y) in the model, the prediction part does > > not work. Here is my sytax for in-sample estimation and it works well: > > fit<-dyn$lm(Y~lag(Y,-1)+x+lag(x,-1)+lag(x,-2)+lag(x,-3)+lag(x,-4)+z). > > Then I use this model to do out of sample prediction: > > x<-ts(X[41:52],start=2006,frequency=4) > > z<-ts(Z[41:52],start=2006,frequency=4) > > newdata<-data.frame(cbind(x,z)) > > newdata<-zooreg(newdata) > > pred<-predict(fit,newdata) > > With these, I got weird result, a prediction for each year from year 1 to > > the first quarter of year 2007 (all "NA"). What I expect is a prediction > for > > the 8 quarters from 2007 to 2008. Intuitively, I know there must be > > something wrong with my newdata definition. But I can't figure it out. > I'll > > appreciate it very much if you can give some suggestions to modify my > > syntax. Thanks. > > > > Harry > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jul 22, 2009 at 10:53 PM, Gabor Grothendieck > > wrote: > >> > >> Here is an example closer to yours. > >> > >> > library(dyn) > >> > set.seed(123) > >> > x <- zooreg(rnorm(10)) > >> > y <- zooreg(rnorm(10)) > >> > L <- function(x, k = 1) lag(x, k = -k) > >> > mod <- dyn$lm(y ~ L(y) + L(x, 0:2)) > >> > mod > >> > >> Call: > >> lm(formula = dyn(y ~ L(y) + L(x, 0:2))) > >> > >> Coefficients: > >> (Intercept) L(y) L(x, 0:2)1 L(x, 0:2)2 L(x, 0:2)3 > >>0.06355 -0.74540 0.63649 0.44957 -0.41360 > >> > >> > newdata <- cbind(x = c(coredata(x), rnorm(1)), y = c(coredata(y), > >> > rnorm(1))) > >> > newdata <- zooreg(newdata) > >> > predict(mod, newdata) > >> 1 2 3 4 5 6 > >> 7 > >>NA NA 0.9157808 0.6056333 -0.5496422 1.5984615 > >> -0.2574875 > >> 8 9 10 11 12 13 > >> -1.6148859 0.3329285 -0.5284646 -0.1799693 NA NA > >> > >> > >> On Thu, Jul 23, 2009 at 1:04 AM, Gabor > >> Grothendieck wrote: > >> > Use dyn.predict like this: > >> > > >> >> library(dyn) > >> >> x <- y <- zoo(1:5) > >> >> mod <- dyn$lm(y ~ lag(x, -1)) > >> >> predict(mod, list(x = zoo(6:10, 6:10))) > >> > 7 8 9 10 > >> > 7 8 9 10 > >> > > >> > > >> > On Thu, Jul 23, 2009 at 12:54 AM, Hongwei Dong > wrote: > >> >> I have a dynamic time series model like this: > >> >> dyn$lm( y ~ lag(y,-1) + x + lag(x,-1)+lag(x,-2) ) > >> >> > >> >> I nee
Re: [R] how to predict dynamic model in R
Hi, Gabor, got it. Thanks a lot. I have one more question about how the "predict" function works here, especially for the lag(Y,-1) part. In my model, I assume I know predictors x and z in the next two years, and use them to predict Y. For each forecast step at time t, the lag(Y,-1) in the model should be the estimated result from the last forecast step. For example, the prediction of the Y in 2007 should be based on the estimated Y for 2006, not the actual number already in my dataset. Before I report my results, I want to make sure the "predict" function works in this way. Thanks. Harry On Thu, Jul 23, 2009 at 10:28 AM, Gabor Grothendieck < ggrothendi...@gmail.com> wrote: > Please provide your code in a reproducible form (as requested previously). > > Here we fit with first 9 points and then add a point for prediction. (Of > course your model can only predict the current value of Y so you may > have to rethink your model even aside from the implementation if you > really want to predict future values of Y.) > > > library(dyn) > > set.seed(123) > > tt <- ts(cbind(Y = 1:10, x = rnorm(10), z = rnorm(10))) > > L <- function(x, k = 1) lag(x, -k) > > tt.zoo <- as.zoo(tt) > > fit <- dyn$lm(Y ~ L(Y) + L(x, 0:2) + z, tt.zoo[-10, ]) > > predict(fit, tt.zoo) > 1 2 3 4 5 6 7 8 9 10 11 12 > NA NA 3 4 5 6 7 8 9 10 NA NA > > > > > On Thu, Jul 23, 2009 at 1:01 PM, Hongwei Dong wrote: > > Hi, Gabor and Other R users, > > I'm re-posting my script and the results I got. > > > > here is the dynamic model I used to estimate in-sample model > (1996-2006) > > and it works: > > > > fit<-dyn$lm(Y~lag(Y,-1)+z+x+lag(x,-1)+lag(x,-2)+lag(x,-3)+lag(x,-4)) > > Then I used this model to do out sample forecast with the following > > scripts, which do not work: > > > > z<-ts(Z[41:52],start=2006,frequency=4) > > x<-ts(X[41:52],start=2006,frequency=4) > > newdata<-data.frame(cbind(z,x)) > > newdata<-ts(newdata,start=2006,frequency=4) > > pred<-predict(fit,newdata) > > Here is the results I got from R: > > Qtr1 Qtr2 Qtr3 Qtr4 > > 2006 NA NA NA NA > > 2007 3083.362 NA NA NA > > 2008 NA NA NA NA > > 2009 NA NA NA NA > > > > I got only one prediction for the first quarter in 2007. Intuitively, > there > > might be two problems: the definition of the newdata and how to define Y > in > > newdata. But I just can't figure this out. It will greatly appreciated if > > someone can give me some help. Thanks. > > Harry > > > > > > On Thu, Jul 23, 2009 at 5:15 AM, Gabor Grothendieck > > wrote: > >> > >> You have to use consistent classes. You can't start out using > >> ts class and then suddenly switch to zoo class as if you had been > >> using zoo all along. Either use ts everywhere or zoo everywhere. > >> > >> Also in the future please post reproducible examples as requested > >> at the bottom of every message to r-help. That means include > >> a minimal amount of data so we can get exactly what you > >> are getting. > >> > >> On Thu, Jul 23, 2009 at 4:48 AM, Hongwei Dong wrote: > >> > Thanks, Gabor. This is really helpful. > >> > When the regressive part, lag(Y,-1), is not included, my sytax works > >> > well. > >> > However, when I include the lag(Y) in the model, the prediction part > >> > does > >> > not work. Here is my sytax for in-sample estimation and it works well: > >> > fit<-dyn$lm(Y~lag(Y,-1)+x+lag(x,-1)+lag(x,-2)+lag(x,-3)+lag(x,-4)+z). > >> > Then I use this model to do out of sample prediction: > >> > x<-ts(X[41:52],start=2006,frequency=4) > >> > z<-ts(Z[41:52],start=2006,frequency=4) > >> > newdata<-data.frame(cbind(x,z)) > >> > newdata<-zooreg(newdata) > >> > pred<-predict(fit,newdata) > >> > With these, I got weird result, a prediction for each year from year 1 > >> > to > >> > the first quarter of year 2007 (all "NA"). What I expect is a > prediction > >> > for > >> > the 8 quarters from 2007 to 2008. Intuitively, I know there must be > >> > something wrong with my newdata definition. But I can't figure it out. > >> > I'll > >> > appreciate it very much if you can give some suggestions to modify my > >> > syntax. Thanks. > >>
Re: [R] how to predict dynamic model in R
Thanks, Gabor. Here are the problems I'm trying to solve. *FIRST*, I run this to simulate a 20 years time series process. The data from 1-15 years are used to estimate the model, and this model is used to predict the year from 16-20. The following script works. set.seed(123) tt <- ts(cbind(Y = 1:20, x = rnorm(20), z = rnorm(20))) L <- function(x, k = 1) lag(x, -k) tt.zoo <- as.zoo(tt) fit <- dyn$lm(Y ~ L(Y) + z + L(x, 0:1), tt.zoo[(1:15), ]) fit pred <- predict(fit, tt.zoo[(16:20),]) pred *SECONDLY*, I use similar script, but pretend that we do not know the Y data from year 16-20. We know x and z for year 16-20, and use them predict Y based on the model estimated from 1-15 years. So, in the "newdata" part, I use tt.zoo[(16:20), (2:3)] to remove the Y out. here is the script. Unfortunately, it does not work in that way. pred1 <- predict(fit, tt.zoo[(16:20),(2:3)]) pred1 It will be greatly appreciated if you can give me some guide on this. Thanks. Harry On Wed, Jul 22, 2009 at 10:04 PM, Gabor Grothendieck < ggrothendi...@gmail.com> wrote: > Use dyn.predict like this: > > > library(dyn) > > x <- y <- zoo(1:5) > > mod <- dyn$lm(y ~ lag(x, -1)) > > predict(mod, list(x = zoo(6:10, 6:10))) > 7 8 9 10 > 7 8 9 10 > > > On Thu, Jul 23, 2009 at 12:54 AM, Hongwei Dong wrote: > > I have a dynamic time series model like this: > > dyn$lm( y ~ lag(y,-1) + x + lag(x,-1)+lag(x,-2) ) > > > > I need to do an out of sample forecast with this model. Is there any way > I > > can do this with R? > > It would be greatly appreciated if some one can give me an example. > Thanks. > > > > > > Harry > > > >[[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to predict dynamic model in R
What I want R to do is to use the estimated Y at t-1 to be the lag(Y,-1) in the forecast equation for time t. Is there anyway I can realize this with R? For example, when the Y value for year 18 is forecast, the estimated Y for year 17 is used, not the actual Y for year 17 already in the data. Thanks for you patience. I appreciate it. Harry On Thu, Jul 23, 2009 at 5:44 PM, Gabor Grothendieck wrote: > You can't remove Y since its in the rhs of your model. > > On Thu, Jul 23, 2009 at 8:25 PM, Hongwei Dong wrote: > > Thanks, Gabor. Here are the problems I'm trying to solve. > > FIRST, I run this to simulate a 20 years time series process. The data > from > > 1-15 years are used to estimate the model, and this model is used to > predict > > the year from 16-20. The following script works. > > set.seed(123) > > tt <- ts(cbind(Y = 1:20, x = rnorm(20), z = rnorm(20))) > > L <- function(x, k = 1) lag(x, -k) > > tt.zoo <- as.zoo(tt) > > fit <- dyn$lm(Y ~ L(Y) + z + L(x, 0:1), tt.zoo[(1:15), ]) > > fit > > pred <- predict(fit, tt.zoo[(16:20),]) > > pred > > SECONDLY, I use similar script, but pretend that we do not know the Y > data > > from year 16-20. We know x and z for year 16-20, and use them predict Y > > based on the model estimated from 1-15 years. So, in the "newdata" part, > I > > use tt.zoo[(16:20), (2:3)] to remove the Y out. here is the script. > > Unfortunately, it does not work in that way. > > pred1 <- predict(fit, tt.zoo[(16:20),(2:3)]) > > pred1 > > It will be greatly appreciated if you can give me some guide on this. > > Thanks. > > Harry > > > > > > > > > > > > > > On Wed, Jul 22, 2009 at 10:04 PM, Gabor Grothendieck > > wrote: > >> > >> Use dyn.predict like this: > >> > >> > library(dyn) > >> > x <- y <- zoo(1:5) > >> > mod <- dyn$lm(y ~ lag(x, -1)) > >> > predict(mod, list(x = zoo(6:10, 6:10))) > >> 7 8 9 10 > >> 7 8 9 10 > >> > >> > >> On Thu, Jul 23, 2009 at 12:54 AM, Hongwei Dong > wrote: > >> > I have a dynamic time series model like this: > >> > dyn$lm( y ~ lag(y,-1) + x + lag(x,-1)+lag(x,-2) ) > >> > > >> > I need to do an out of sample forecast with this model. Is there any > way > >> > I > >> > can do this with R? > >> > It would be greatly appreciated if some one can give me an example. > >> > Thanks. > >> > > >> > > >> > Harry > >> > > >> >[[alternative HTML version deleted]] > >> > > >> > __ > >> > R-help@r-project.org mailing list > >> > https://stat.ethz.ch/mailman/listinfo/r-help > >> > PLEASE do read the posting guide > >> > http://www.R-project.org/posting-guide.html > >> > and provide commented, minimal, self-contained, reproducible code. > >> > > > > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to predict dynamic model in R
Hi, Gabor, it seems ARIMA model does not have that problem. For example: set.seed(123) y<-ts(c(1:20)) x = ts(rnorm(20)) z = ts(rnorm(20)) tt<-ts(cbind(x, lag(x,-1),lag(x,-2),z)) fit <- arima(y[1:15],order=c(1,0,0),xreg=tt[(1:15),]) fit pred <- predict(fit, n.ahead=5,tt[(16:20),]) pred What do you make of this? Thanks. Harry On Thu, Jul 23, 2009 at 6:36 PM, Gabor Grothendieck wrote: > Try this: > > library(dyn) > set.seed(123) > tz <- zoo(cbind(Y = 0, x = rnorm(10), z = rnorm(10))) > > # simulate values > for(i in 2:10) { > tz$Y[i] <- with(tz, 2*Y[i-1] + 3*z[i] +4* x[i] + 5*x[i-1] + rnorm(1)) > } > > # keep copy of tz to compare later to simulated Y's > tz.orig <- tz > > # NA out Y's that are to be predicted > tz[7:10, "Y"] <- NA > > L <- function(x, k = 1) lag(x, -k) > > # predict 1 ahead each iteration > for(i in 7:10) { ># fit based on first i-1 values >fit <- dyn$lm(Y ~ L(Y) + z + L(x, 0:1), tz, subset = seq_len(i-1)) ># get prediction for ith value >tz[i, "Y"] <- tail(predict(fit, tz[1:i,]), 1) > } > cbind(pred = tz[7:10, "Y"], act = tz.orig[7:10, "Y"]) > > > On Thu, Jul 23, 2009 at 9:02 PM, Hongwei Dong wrote: > > What I want R to do is to use the estimated Y at t-1 to be the lag(Y,-1) > in > > the forecast equation for time t. Is there anyway I can realize this with > R? > > For example, when the Y value for year 18 is forecast, the estimated Y > for > > year 17 is used, not the actual Y for year 17 already in the data. > > Thanks for you patience. I appreciate it. > > Harry > > > > > > > > On Thu, Jul 23, 2009 at 5:44 PM, Gabor Grothendieck > > wrote: > >> > >> You can't remove Y since its in the rhs of your model. > >> > >> On Thu, Jul 23, 2009 at 8:25 PM, Hongwei Dong wrote: > >> > Thanks, Gabor. Here are the problems I'm trying to solve. > >> > FIRST, I run this to simulate a 20 years time series process. The data > >> > from > >> > 1-15 years are used to estimate the model, and this model is used to > >> > predict > >> > the year from 16-20. The following script works. > >> > set.seed(123) > >> > tt <- ts(cbind(Y = 1:20, x = rnorm(20), z = rnorm(20))) > >> > L <- function(x, k = 1) lag(x, -k) > >> > tt.zoo <- as.zoo(tt) > >> > fit <- dyn$lm(Y ~ L(Y) + z + L(x, 0:1), tt.zoo[(1:15), ]) > >> > fit > >> > pred <- predict(fit, tt.zoo[(16:20),]) > >> > pred > >> > SECONDLY, I use similar script, but pretend that we do not know the Y > >> > data > >> > from year 16-20. We know x and z for year 16-20, and use them predict > Y > >> > based on the model estimated from 1-15 years. So, in the "newdata" > part, > >> > I > >> > use tt.zoo[(16:20), (2:3)] to remove the Y out. here is the script. > >> > Unfortunately, it does not work in that way. > >> > pred1 <- predict(fit, tt.zoo[(16:20),(2:3)]) > >> > pred1 > >> > It will be greatly appreciated if you can give me some guide on this. > >> > Thanks. > >> > Harry > >> > > >> > > >> > > >> > > >> > > >> > > >> > On Wed, Jul 22, 2009 at 10:04 PM, Gabor Grothendieck > >> > wrote: > >> >> > >> >> Use dyn.predict like this: > >> >> > >> >> > library(dyn) > >> >> > x <- y <- zoo(1:5) > >> >> > mod <- dyn$lm(y ~ lag(x, -1)) > >> >> > predict(mod, list(x = zoo(6:10, 6:10))) > >> >> 7 8 9 10 > >> >> 7 8 9 10 > >> >> > >> >> > >> >> On Thu, Jul 23, 2009 at 12:54 AM, Hongwei Dong > >> >> wrote: > >> >> > I have a dynamic time series model like this: > >> >> > dyn$lm( y ~ lag(y,-1) + x + lag(x,-1)+lag(x,-2) ) > >> >> > > >> >> > I need to do an out of sample forecast with this model. Is there > any > >> >> > way > >> >> > I > >> >> > can do this with R? > >> >> > It would be greatly appreciated if some one can give me an example. > >> >> > Thanks. > >> >> > > >> >> > > >> >> > Harry > >> >> > > >> >> >[[alternative HTML version deleted]] > >> >> > > >> >> > __ > >> >> > R-help@r-project.org mailing list > >> >> > https://stat.ethz.ch/mailman/listinfo/r-help > >> >> > PLEASE do read the posting guide > >> >> > http://www.R-project.org/posting-guide.html > >> >> > and provide commented, minimal, self-contained, reproducible code. > >> >> > > >> > > >> > > > > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] principal component analysis for class variables
Hi, R users, I'm using the "lme" function in R to estimate a 2 level mixed effects model, in which the size of the groups are different. It turned out that It takes forever for R to converge. I also tried the same thing in SPSS and SPSS can give the results out within 20 minutes. Anyone can give me some advice on the lme function in R, especially why R cannot converge? Thanks. Harry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lme funcion in R
Hi, R users, I'm using the "lme" function in R to estimate a 2 level mixed effects model, in which the size of the subject groups are different. It turned out that It takes forever for R to converge. I also tried the same thing in SPSS and SPSS can give the results out within 20 minutes. Anyone can give me some advice on the lme function in R, especially why R does not converge? Thanks. Harry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme funcion in R
Thanks for the replies above. Here are my script and data structure: library(nlme) tlevel<-lme(fixed = LN_unitlandval ~ MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D+access_emp1+pct_vacant+transit_D+park_dum,data=lusdrdata,random = ~MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D | TAZ) str: $ TAZ : int 100 100 100 100 100 100 100 100 100 100 ... $ MH_D : num 0 0 0 0 0 0 0 0 0 0 ... $ APT_D : num 0 0 0 0 0 0 0 0 0 0 ... $ ResOth_D : num 0 0 0 0 0 0 0 0 0 0 ... $ NonRes_D : num 0 0 0 0 0 0 0 0 0 1 ... $ Vacant_D : num 1 1 1 0 0 1 1 1 1 0 ... $ access_emp1 : num 45.8 45.8 45.8 45.8 45.8 ... $ pct_vacant : num 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 ... $ transit_D : num 0 0 0 0 0 0 0 0 0 0 ... $ park_dum : num 0 0 0 0 0 0 0 0 0 0 ... Thanks. Harry On Mon, Aug 3, 2009 at 10:36 AM, Jason Morgan wrote: > On 2009.08.03 10:15:46, Hongwei Dong wrote: > > Hi, R users, > > I'm using the "lme" function in R to estimate a 2 level mixed effects > > model, in which the size of the subject groups are different. It turned > out > > that It takes forever for R to converge. I also tried the same thing in > SPSS > > and SPSS can give the results out within 20 minutes. Anyone can give me > some > > advice on the lme function in R, especially why R does not converge? > Thanks. > > > > Harry > > Hello Harry, > > As Chuck mentions, providing some more information on the model and > the data you are using would be helpful. Also, be sure to compare the > optimization methods used in SPSS to that used in R. You can change > the optimization method in R if the default seems to be causing > issues. See help(lmeControl) for numerous setting options. > > ~Jason > > -- > Jason W. Morgan > Graduate Student > Department of Political Science > *The Ohio State University* > 154 North Oval Mall > Columbus, Ohio 43210 > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] What does this error message mean?
Hi, I used R to run a linear regression and keep getting the following error message. I do not understand it very well. Anyone can help out? Thanks. Error in storage.mode(y) <- "double" : invalid to change the storage mode of a factor In addition: Warning message: In model.response(mf, "numeric") : using type="numeric" with a factor response will be ignored [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme funcion in R
Thanks. I tried to set a higher tolerance for the convergence, such as changing tolerance from 1e-6 to 1e-5, msTol from 1e-7 to 1e-6, but R still does not converge. Any more suggestions? Harry On Mon, Aug 3, 2009 at 10:36 AM, Jason Morgan wrote: > On 2009.08.03 10:15:46, Hongwei Dong wrote: > > Hi, R users, > > I'm using the "lme" function in R to estimate a 2 level mixed effects > > model, in which the size of the subject groups are different. It turned > out > > that It takes forever for R to converge. I also tried the same thing in > SPSS > > and SPSS can give the results out within 20 minutes. Anyone can give me > some > > advice on the lme function in R, especially why R does not converge? > Thanks. > > > > Harry > > Hello Harry, > > As Chuck mentions, providing some more information on the model and > the data you are using would be helpful. Also, be sure to compare the > optimization methods used in SPSS to that used in R. You can change > the optimization method in R if the default seems to be causing > issues. See help(lmeControl) for numerous setting options. > > ~Jason > > -- > Jason W. Morgan > Graduate Student > Department of Political Science > *The Ohio State University* > 154 North Oval Mall > Columbus, Ohio 43210 > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme funcion in R
Yeah, I have a very large sample size, about 60,000 observations. Multicollinearity should not be a problem here. The weird thing is that SPSS can converge very quickly and gives out reasonable results. The only problem I can think of is that, my first level (random) variables are dummy variables: 6 housing types, and I used five dummies in model and one as the reference. I also tried to combine them into two groups and use only dummy at random level, but it does not work either. is there any one here has similar experience with the LME function in R? Thanks. Harry On Tue, Aug 4, 2009 at 1:28 AM, ONKELINX, Thierry wrote: > Dear Harry, > > Your model seems rather complex. Do you have enough data to support it? > Did you check for multicollinearity between the variables? > > HTH, > > Thierry > > > > > > ir. Thierry Onkelinx > Instituut voor natuur- en bosonderzoek / Research Institute for Nature > and Forest > Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, > methodology and quality assurance > Gaverstraat 4 > 9500 Geraardsbergen > Belgium > tel. + 32 54/436 185 > thierry.onkel...@inbo.be > www.inbo.be > > To call in the statistician after the experiment is done may be no more > than asking him to perform a post-mortem examination: he may be able to > say what the experiment died of. > ~ Sir Ronald Aylmer Fisher > > The plural of anecdote is not data. > ~ Roger Brinner > > The combination of some data and an aching desire for an answer does not > ensure that a reasonable answer can be extracted from a given body of > data. > ~ John Tukey > > -Oorspronkelijk bericht- > Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > Namens Hongwei Dong > Verzonden: maandag 3 augustus 2009 19:45 > Aan: r-help@r-project.org > Onderwerp: Re: [R] lme funcion in R > > Thanks for the replies above. Here are my script and data structure: > library(nlme) > tlevel<-lme(fixed = LN_unitlandval ~ > MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D+access_emp1+pct_vacant+transit_D+p > ark_dum,data=lusdrdata,random > = ~MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D | TAZ) > > str: > > $ TAZ : int 100 100 100 100 100 100 100 100 100 100 ... > $ MH_D : num 0 0 0 0 0 0 0 0 0 0 ... > $ APT_D : num 0 0 0 0 0 0 0 0 0 0 ... $ ResOth_D : num 0 0 0 0 0 0 0 0 0 > 0 ... $ NonRes_D : num 0 0 0 0 0 0 0 0 0 1 ... > $ Vacant_D : num 1 1 1 0 0 1 1 1 1 0 ... > $ access_emp1 : num 45.8 45.8 45.8 45.8 45.8 ... > $ pct_vacant : num 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 ... $ > transit_D : > num 0 0 0 0 0 0 0 0 0 0 ... $ park_dum : num 0 0 0 0 0 0 0 0 0 0 ... > > > Thanks. > > Harry > > > > On Mon, Aug 3, 2009 at 10:36 AM, Jason Morgan > wrote: > > > On 2009.08.03 10:15:46, Hongwei Dong wrote: > > > Hi, R users, > > > I'm using the "lme" function in R to estimate a 2 level mixed > > > effects model, in which the size of the subject groups are > > > different. It turned > > out > > > that It takes forever for R to converge. I also tried the same thing > > > > in > > SPSS > > > and SPSS can give the results out within 20 minutes. Anyone can give > > > > me > > some > > > advice on the lme function in R, especially why R does not converge? > > Thanks. > > > > > > Harry > > > > Hello Harry, > > > > As Chuck mentions, providing some more information on the model and > > the data you are using would be helpful. Also, be sure to compare the > > optimization methods used in SPSS to that used in R. You can change > > the optimization method in R if the default seems to be causing > > issues. See help(lmeControl) for numerous setting options. > > > > ~Jason > > > > -- > > Jason W. Morgan > > Graduate Student > > Department of Political Science > > *The Ohio State University* > > 154 North Oval Mall > > Columbus, Ohio 43210 > > > > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver > weer > en binden het INBO onder geen enkel beding, zolang dit bericht niet > bevestigd is > door een geldig ondertekend document. The views expressed in this message > and any annex are purely those of the writer and may not be regarded as > stating > an official position of INBO, as long as the message is not confirmed by a > duly > signed document. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme funcion in R
Thanks, David, you are right. If I use continuous data such as 1, 2, ...6 to represent those 6 housing types, the model works with the lme function in R. The problem is, the relationship between the 6 housing types are not continuous, which we assume when we use 1,2,..6 to represent them. Harry On Tue, Aug 4, 2009 at 5:47 PM, David Winsemius wrote: > > On Aug 4, 2009, at 7:48 PM, Hongwei Dong wrote: > > Yeah, I have a very large sample size, about 60,000 observations. >> Multicollinearity should not be a problem here. The weird thing is that >> SPSS >> can converge very quickly and gives out reasonable results. >> The only problem I can think of is that, my first level (random) variables >> are dummy variables: 6 housing types, and I used five dummies in model and >> one as the reference. I also tried to combine them into two groups and use >> only dummy at random level, but it does not work either. >> >> is there any one here has similar experience with the LME function in R? >> > > I have absolutely no experience with "LME" but I can predict with very high > probability that you would be getting more sensible result if you modeled > those housing types with a single factor variable rather than creating 6 > dummies. ((Would one generally not create a reference dummy?) > > ?factor > > -- > David. > > > >> Thanks. >> >> Harry >> >> >> >> On Tue, Aug 4, 2009 at 1:28 AM, ONKELINX, Thierry >> wrote: >> >> Dear Harry, >>> >>> Your model seems rather complex. Do you have enough data to support it? >>> Did you check for multicollinearity between the variables? >>> >>> HTH, >>> >>> Thierry >>> >>> >>> >>> >>> >>> ir. Thierry Onkelinx >>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature >>> and Forest >>> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, >>> methodology and quality assurance >>> Gaverstraat 4 >>> 9500 Geraardsbergen >>> Belgium >>> tel. + 32 54/436 185 >>> thierry.onkel...@inbo.be >>> www.inbo.be >>> >>> To call in the statistician after the experiment is done may be no more >>> than asking him to perform a post-mortem examination: he may be able to >>> say what the experiment died of. >>> ~ Sir Ronald Aylmer Fisher >>> >>> The plural of anecdote is not data. >>> ~ Roger Brinner >>> >>> The combination of some data and an aching desire for an answer does not >>> ensure that a reasonable answer can be extracted from a given body of >>> data. >>> ~ John Tukey >>> >>> -Oorspronkelijk bericht- >>> Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] >>> Namens Hongwei Dong >>> Verzonden: maandag 3 augustus 2009 19:45 >>> Aan: r-help@r-project.org >>> Onderwerp: Re: [R] lme funcion in R >>> >>> Thanks for the replies above. Here are my script and data structure: >>> library(nlme) >>> tlevel<-lme(fixed = LN_unitlandval ~ >>> MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D+access_emp1+pct_vacant+transit_D+p >>> ark_dum,data=lusdrdata,random >>> = ~MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D | TAZ) >>> >>> str: >>> >>> $ TAZ : int 100 100 100 100 100 100 100 100 100 100 ... >>> $ MH_D : num 0 0 0 0 0 0 0 0 0 0 ... >>> $ APT_D : num 0 0 0 0 0 0 0 0 0 0 ... $ ResOth_D : num 0 0 0 0 0 0 0 0 0 >>> 0 ... $ NonRes_D : num 0 0 0 0 0 0 0 0 0 1 ... >>> $ Vacant_D : num 1 1 1 0 0 1 1 1 1 0 ... >>> $ access_emp1 : num 45.8 45.8 45.8 45.8 45.8 ... >>> $ pct_vacant : num 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 ... $ >>> transit_D : >>> num 0 0 0 0 0 0 0 0 0 0 ... $ park_dum : num 0 0 0 0 0 0 0 0 0 0 ... >>> >>> >>> Thanks. >>> >>> Harry >>> >>> >>> >>> On Mon, Aug 3, 2009 at 10:36 AM, Jason Morgan >>> wrote: >>> >>> On 2009.08.03 10:15:46, Hongwei Dong wrote: >>>> >>>>> Hi, R users, >>>>> I'm using the "lme" function in R to estimate a 2 level mixed >>>>> effects model, in which the size of the subject groups are >>>>> different. It turned >>>>> >>>> out >>>> >>>
Re: [R] lme funcion in R
In addition, it seems the lme function in R are very troubled with the dummy variables used at the first (random) level. Harry On Tue, Aug 4, 2009 at 6:06 PM, Hongwei Dong wrote: > Thanks, David, you are right. If I use continuous data such as 1, 2, ...6 > to represent those 6 housing types, the model works with the lme function in > R. The problem is, the relationship between the 6 housing types are not > continuous, which we assume when we use 1,2,..6 to represent them. > Harry > > > > On Tue, Aug 4, 2009 at 5:47 PM, David Winsemius wrote: > >> >> On Aug 4, 2009, at 7:48 PM, Hongwei Dong wrote: >> >> Yeah, I have a very large sample size, about 60,000 observations. >>> Multicollinearity should not be a problem here. The weird thing is that >>> SPSS >>> can converge very quickly and gives out reasonable results. >>> The only problem I can think of is that, my first level (random) >>> variables >>> are dummy variables: 6 housing types, and I used five dummies in model >>> and >>> one as the reference. I also tried to combine them into two groups and >>> use >>> only dummy at random level, but it does not work either. >>> >>> is there any one here has similar experience with the LME function in R? >>> >> >> I have absolutely no experience with "LME" but I can predict with very >> high probability that you would be getting more sensible result if you >> modeled those housing types with a single factor variable rather than >> creating 6 dummies. ((Would one generally not create a reference dummy?) >> >> ?factor >> >> -- >> David. >> >> >> >>> Thanks. >>> >>> Harry >>> >>> >>> >>> On Tue, Aug 4, 2009 at 1:28 AM, ONKELINX, Thierry >>> wrote: >>> >>> Dear Harry, >>>> >>>> Your model seems rather complex. Do you have enough data to support it? >>>> Did you check for multicollinearity between the variables? >>>> >>>> HTH, >>>> >>>> Thierry >>>> >>>> >>>> >>>> >>>> >>>> ir. Thierry Onkelinx >>>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature >>>> and Forest >>>> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, >>>> methodology and quality assurance >>>> Gaverstraat 4 >>>> 9500 Geraardsbergen >>>> Belgium >>>> tel. + 32 54/436 185 >>>> thierry.onkel...@inbo.be >>>> www.inbo.be >>>> >>>> To call in the statistician after the experiment is done may be no more >>>> than asking him to perform a post-mortem examination: he may be able to >>>> say what the experiment died of. >>>> ~ Sir Ronald Aylmer Fisher >>>> >>>> The plural of anecdote is not data. >>>> ~ Roger Brinner >>>> >>>> The combination of some data and an aching desire for an answer does not >>>> ensure that a reasonable answer can be extracted from a given body of >>>> data. >>>> ~ John Tukey >>>> >>>> -Oorspronkelijk bericht- >>>> Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] >>>> Namens Hongwei Dong >>>> Verzonden: maandag 3 augustus 2009 19:45 >>>> Aan: r-help@r-project.org >>>> Onderwerp: Re: [R] lme funcion in R >>>> >>>> Thanks for the replies above. Here are my script and data structure: >>>> library(nlme) >>>> tlevel<-lme(fixed = LN_unitlandval ~ >>>> MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D+access_emp1+pct_vacant+transit_D+p >>>> ark_dum,data=lusdrdata,random >>>> = ~MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D | TAZ) >>>> >>>> str: >>>> >>>> $ TAZ : int 100 100 100 100 100 100 100 100 100 100 ... >>>> $ MH_D : num 0 0 0 0 0 0 0 0 0 0 ... >>>> $ APT_D : num 0 0 0 0 0 0 0 0 0 0 ... $ ResOth_D : num 0 0 0 0 0 0 0 0 0 >>>> 0 ... $ NonRes_D : num 0 0 0 0 0 0 0 0 0 1 ... >>>> $ Vacant_D : num 1 1 1 0 0 1 1 1 1 0 ... >>>> $ access_emp1 : num 45.8 45.8 45.8 45.8 45.8 ... >>>> $ pct_vacant : num 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 ... $ >>>> transit_D : >>>> num 0 0 0 0 0 0 0 0 0 0 ... $ park_dum : num 0 0 0 0 0 0 0 0 0 0 ... &
Re: [R] lme funcion in R
Hi, the problem has been solved by using "lmer" rather than the "lme" function. It seems the "lme" function dislike my large data size. "lmer" function deals with large data size much better. Thanks for the replies above. Harry On Tue, Aug 4, 2009 at 7:11 PM, David Winsemius wrote: > > On Aug 4, 2009, at 9:06 PM, Hongwei Dong wrote: > > Thanks, David, you are right. If I use continuous data such as 1, 2, ...6 > to represent those 6 housing types, the model works with the lme function in > R. The problem is, the relationship between the 6 housing types are not > continuous, which we assume when we use 1,2,..6 to represent them. > > > And that is why you use a factor variable rather than a numeric variable. > > > > Harry > > > On Tue, Aug 4, 2009 at 5:47 PM, David Winsemius wrote: > >> >> On Aug 4, 2009, at 7:48 PM, Hongwei Dong wrote: >> >> Yeah, I have a very large sample size, about 60,000 observations. >>> Multicollinearity should not be a problem here. The weird thing is that >>> SPSS >>> can converge very quickly and gives out reasonable results. >>> The only problem I can think of is that, my first level (random) >>> variables >>> are dummy variables: 6 housing types, and I used five dummies in model >>> and >>> one as the reference. I also tried to combine them into two groups and >>> use >>> only dummy at random level, but it does not work either. >>> >>> is there any one here has similar experience with the LME function in R? >>> >> >> I have absolutely no experience with "LME" but I can predict with very >> high probability that you would be getting more sensible result if you >> modeled those housing types with a single factor variable rather than >> creating 6 dummies. ((Would one generally not create a reference dummy?) >> >> ?factor >> >> -- >> David. >> >> >> >>> Thanks. >>> >>> Harry >>> >>> >>> >>> On Tue, Aug 4, 2009 at 1:28 AM, ONKELINX, Thierry >>> wrote: >>> >>> Dear Harry, >>>> >>>> Your model seems rather complex. Do you have enough data to support it? >>>> Did you check for multicollinearity between the variables? >>>> >>>> HTH, >>>> >>>> Thierry >>>> >>>> >>>> >>>> >>>> >>>> ir. Thierry Onkelinx >>>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature >>>> and Forest >>>> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, >>>> methodology and quality assurance >>>> Gaverstraat 4 >>>> 9500 Geraardsbergen >>>> Belgium >>>> tel. + 32 54/436 185 >>>> thierry.onkel...@inbo.be >>>> www.inbo.be >>>> >>>> To call in the statistician after the experiment is done may be no more >>>> than asking him to perform a post-mortem examination: he may be able to >>>> say what the experiment died of. >>>> ~ Sir Ronald Aylmer Fisher >>>> >>>> The plural of anecdote is not data. >>>> ~ Roger Brinner >>>> >>>> The combination of some data and an aching desire for an answer does not >>>> ensure that a reasonable answer can be extracted from a given body of >>>> data. >>>> ~ John Tukey >>>> >>>> -Oorspronkelijk bericht- >>>> Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] >>>> Namens Hongwei Dong >>>> Verzonden: maandag 3 augustus 2009 19:45 >>>> Aan: r-help@r-project.org >>>> Onderwerp: Re: [R] lme funcion in R >>>> >>>> Thanks for the replies above. Here are my script and data structure: >>>> library(nlme) >>>> tlevel<-lme(fixed = LN_unitlandval ~ >>>> MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D+access_emp1+pct_vacant+transit_D+p >>>> ark_dum,data=lusdrdata,random >>>> = ~MH_D+APT_D+ResOth_D+NonRes_D+Vacant_D | TAZ) >>>> >>>> str: >>>> >>>> $ TAZ : int 100 100 100 100 100 100 100 100 100 100 ... >>>> $ MH_D : num 0 0 0 0 0 0 0 0 0 0 ... >>>> $ APT_D : num 0 0 0 0 0 0 0 0 0 0 ... $ ResOth_D : num 0 0 0 0 0 0 0 0 0 >>>> 0 ... $ NonRes_D : num 0 0 0 0 0 0 0 0 0 1 ... >>>>
Re: [R] lme funcion in R
Thanks, Thierry and other R users. I estimate the model using the factor rather than the dummy variables I used previously. It still takes forever for the function "lme" to run. But "lmer" is much better with my large data size (about 60,000 observations). The interesting part is that the results from the model using factor are slightly different from what I got from the model using dummy variables, especially for the variables at random level. The estimated random effects by using dummy variable are like this (each dummy got one intercept): Random effects: Groups Name Variance Std.Dev. Corr TAZ (Intercept)0.059160 0.24323 MH_D 0.215210 0.46391 -0.583 TAZ (Intercept)0.212061 0.46050 APT_D 0.205028 0.45280 -0.992 TAZ (Intercept)0.086223 0.29364 ResOth_D 0.305678 0.55288 0.665 TAZ (Intercept)0.161892 0.40236 NonRes_D 0.537284 0.73300 -0.874 TAZ (Intercept)0.088684 0.29780 Vacant_noimp_D 0.501495 0.70816 -0.570 TAZ (Intercept)0.136630 0.36964 Vacant_imp_D 0.368722 0.60722 -0.850 Residual0.382439 0.61842 Number of obs: 55762, groups: TAZ, 739 The estimated random effects by using factor are like this (one intercept for all): Random effects: Groups Name Variance Std.Dev. Corr TAZ (Intercept)0.83894 0.91594 HousingType1MH_D 0.23214 0.48181 -0.375 HousingType1APT_D 0.28850 0.53712 -0.827 0.630 HousingType1ResOth_D 0.29392 0.54214 0.156 -0.251 -0.165 HousingType1NonRes_D 0.58169 0.76269 -0.572 0.155 0.656 -0.030 HousingType1Vacant_imp_D 0.45349 0.67342 -0.522 0.203 0.265 0.101 0.611 HousingType1Vacant_noimp_D 0.54146 0.73584 -0.286 0.251 0.265 0.390 0.313 0.475 Residual0.38228 0.61829 Number of obs: 55762, groups: TAZ, 739 The fixed coefficients for each group are also slightly different. I'm wondering which one makes more sense. Thanks. Harry Harry R still report error On Wed, Aug 5, 2009 at 1:22 AM, ONKELINX, Thierry wrote: > Harry, > > I you use dummy variables, then you can only use (n-1) dummy variables > if your variable has n levels. Otherwise you introduce > multicollinearity! If you use n dummy variable then you can express one > dummy variable as a linear combination of the others. > > Make use of a factor variable. That is much easier to work with that > dummy variables. The model itself will create the necessary dummy > variables. > > lusdrdata$HousingType <- factor(lusdrdata$HousingType, levels = 1:6, > labels = c("Reference", "MH_D", "APT_D", "ResOth_D", "NonRes_D", > "Vacant_D")) > lme(fixed = LN_unitlandval ~ HousingType + > access_emp1+pct_vacant+transit_D +park_dum,data=lusdrdata, random = ~ > HousingType | TAZ) > > HTH, > > Thierry > > > > ir. Thierry Onkelinx > Instituut voor natuur- en bosonderzoek / Research Institute for Nature > and Forest > Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, > methodology and quality assurance > Gaverstraat 4 > 9500 Geraardsbergen > Belgium > tel. + 32 54/436 185 > thierry.onkel...@inbo.be > www.inbo.be > > To call in the statistician after the experiment is done may be no more > than asking him to perform a post-mortem examination: he may be able to > say what the experiment died of. > ~ Sir Ronald Aylmer Fisher > > The plural of anecdote is not data. > ~ Roger Brinner > > The combination of some data and an aching desire for an answer does not > ensure that a reasonable answer can be extracted from a given body of > data. > ~ John Tukey > > -Oorspronkelijk bericht- > Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > Namens Hongwei Dong > Verzonden: woensdag 5 augustus 2009 1:49 > Aan: r-help@r-project.org > Onderwerp: Re: [R] lme funcion in R > > Yeah, I have a very large sample size, about 60,000 observations. > Multicollinearity should not be a problem here. The weird thing is that > SPSS can converge very quickly and gives out reasonable results. > The only problem I can think of is that, my first level (random) > variables are dummy variables: 6 housing types, and I used five dummies > in model and one as the reference. I also tried to combine them into two > groups and use only dummy at random level, but it does not work either. > > is there any one here has similar experience with the LME function in
[R] Monte Carlo simulation in R
Hi, R-helpers, I'm trying to use R to do a Monte Carlo simulation and need the help. What I have is a matrix that consists of the probabilities for the persons to choose zones. For example, in the matrix shown below, each column represents a person, and each row represents a zone. So, the probability that the first person will choose the 2nd zone is 30%. 25% 30% 10% 30% 20% 0% 20% 50% 60% 50% 0% 10% 20% 0% 20% Based on this matrix, I want to locate the persons to zones based on the probability using a Monte Carlo method. The result I want to see is like this: 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 Could anyone please give some help? Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] with data in the form of an R data objecte: Monte Carlo simulation in R
Hi, please use the following the matrix z as the example: x<-c(2,4,5,7,6,9,8,2,0) y<-matrix(x,3,3) z<-apply(y,2,function(x)x/sum(x)) z On Tue, Mar 23, 2010 at 6:59 PM, David Winsemius wrote: > > On Mar 23, 2010, at 9:05 PM, Hongwei Dong wrote: > > Hi, R-helpers, >> >> I'm trying to use R to do a Monte Carlo simulation and need the help. What >> I >> have is a matrix that consists of the probabilities for the persons to >> choose zones. For example, in the matrix shown below, each column >> represents >> a person, and each row represents a zone. So, the probability that the >> first >> person will choose the 2nd zone is 30%. >> >> 25% 30% 10% 30% 20% 0% 20% 50% 60% 50% 0% 10% 20% 0% 20% >> > > As Alex Trebeck would say: Can you put that in the form of an R data > object? > > > >> Based on this matrix, I want to locate the persons to zones based on the >> probability using a Monte Carlo method. The result I want to see is like >> this: >> >> 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 >> >> Could anyone please give some help? Thanks. >> > > You cannot specify what the result _will_ be and call the process > simulation of results at the same time. The sample function should work for > a vector. Why not use it? > > -- > David. > > > David Winsemius, MD > West Hartford, CT > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to transform the Matrix into the way I want it ???
Hi, R users, I'm trying to transform a matrix A into B (see below). Anyone knows how to do it in R? Thanks. Matrix A (zone to zone travel time) zone z1 z2 z3 z1 0 2.9 4.3 z2 2.9 0 2.5 z3 4.3 2.5 0 B: from to time z1 z1 0 z1 z2 2.9 z1 z3 4.3 z2 z1 2.9 z2 z2 0 z2 z3 2.5 z3 z1 4.3 z3 z2 2.5 z3 z3 0 The real matrix I have is much larger, with more than 2000 zones. But I think it should be the same thing if I can transform A into B. Thanks. Garry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Monte Carlo Simulation in R...
Hi, Dear R users, I'm wondering if I can do Monte Carlo Simulation in R. My problem is like this: I know variable X follows Gamma distribution with shape parameter 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me to simulate a vector of X that satisfies both the probability distribution and the sum. Anyone has a clue to this? Much appreciated. Regards Garry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
Exactly! Thanks, Duncan. Let me re-phrase me question like this: 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 2) Min(X)=1 and Max(X)=85 3) SUM(X)=2000 4) Do I also have to define the number of draws? if yes, it could be 250. Based on these restrictions, I want to generate random draw. I'm wondering how I can do this in R. Thanks. Garry On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch wrote: > On 11/10/2009 1:25 PM, Hongwei Dong wrote: > >> Hi, Dear R users, >> >> I'm wondering if I can do Monte Carlo Simulation in R. My problem is like >> this: I know variable X follows Gamma distribution with shape parameter >> 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help >> me >> to simulate a vector of X that satisfies both the probability distribution >> and the sum. Anyone has a clue to this? Much appreciated. >> > > Your requirements are slightly contradictory or incomplete. Here's one way > to fully specify the problem: > > The X_i values are independent Gammas, with the given shape and scale. You > want to simulate from the joint distribution conditional on the event sum(X) > == 2000. > > Is that your problem? I don't know how to do the simulation, but maybe > someone else does. > > Duncan Murdoch > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
Thanks. I tried the rgamma function too. But I'm still wondering how I can set the min, max, and sum of the variates created by the random draws. Anyone has a clue? Thanks. Garry On Tue, Nov 10, 2009 at 11:47 AM, David Winsemius wrote: > > On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: > > Exactly! Thanks, Duncan. >> >> Let me re-phrase me question like this: >> >> 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 >> 2) Min(X)=1 and Max(X)=85 >> > > You might want to check that your parameterization in in agreement with > that used by the rgamma function. Simply using those numbers yields a > distribution that does not look as though it would get many qualifying > samples. Here are 20 draws without any exclusions outside a range: > > > rgamma(20, shape=0.067, scale = 0.008) > [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 > 7.680773e-38 6.441082e-15 6.168961e-13 > [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 > 1.852885e-04 4.212802e-07 1.774495e-25 > [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 > > http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html > > > > 3) SUM(X)=2000 >> 4) Do I also have to define the number of draws? if yes, it could be 250. >> >> Based on these restrictions, I want to generate random draw. I'm wondering >> how I can do this in R. Thanks. >> >> Garry >> >> >> >> On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch > >wrote: >> >> On 11/10/2009 1:25 PM, Hongwei Dong wrote: >>> >>> Hi, Dear R users, >>>> >>>> I'm wondering if I can do Monte Carlo Simulation in R. My problem is >>>> like >>>> this: I know variable X follows Gamma distribution with shape parameter >>>> 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help >>>> me >>>> to simulate a vector of X that satisfies both the probability >>>> distribution >>>> and the sum. Anyone has a clue to this? Much appreciated. >>>> >>>> >>> Your requirements are slightly contradictory or incomplete. Here's one >>> way >>> to fully specify the problem: >>> >>> The X_i values are independent Gammas, with the given shape and scale. >>> You >>> want to simulate from the joint distribution conditional on the event >>> sum(X) >>> == 2000. >>> >>> Is that your problem? I don't know how to do the simulation, but maybe >>> someone else does. >>> >>> Duncan Murdoch >>> >>> >>[[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > David Winsemius, MD > Heritage Laboratories > West Hartford, CT > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
Sorry for the confusion. Let me put it in this way. Here we have 2000 people and we want to put them into 150 groups. The distribution of the group size follows the Gamma distribution with shape parameter 0.067 and scale parameter 0.008. At the same time, the minimum group size is 1, and the largest one should not be bigger than 85. My questions: Can I generate a set of groups that follow the above rules by generating random draws? By the way, I also confused by the rate and scale parameters in R. I did the distribution test in SPSS and got those shape and scale parameters. In SPSS Q-Q plot, the scale parameter is 0.008. I noticed that someone mentioned that this is actually the rate. I'm confused. Thanks a lot. Garry On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan wrote: > I think he means "rate = 0.008", so he is looking for: > > rgamma(n, shape=0.067, rate=0.008) > > Even then his problem is not well-posed. You cannot have both > "independent" > gamma rv's and have them sum to 2000. > > Ravi. > > > --- > > Ravi Varadhan, Ph.D. > > Assistant Professor, The Center on Aging and Health > > Division of Geriatric Medicine and Gerontology > > Johns Hopkins University > > Ph: (410) 502-2619 > > Fax: (410) 614-9625 > > Email: rvarad...@jhmi.edu > > Webpage: > > http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h > tml > > > > > > > > > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > On > Behalf Of David Winsemius > Sent: Tuesday, November 10, 2009 2:47 PM > To: Hongwei Dong > Cc: R-help Forum; Duncan Murdoch > Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte > Carlo Simulation in R... > > > On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: > > > Exactly! Thanks, Duncan. > > > > Let me re-phrase me question like this: > > > > 1) X_i values are independent Gammas, with the shape 0.067 and scale > > 0.008 > > 2) Min(X)=1 and Max(X)=85 > > You might want to check that your parameterization in in agreement > with that used by the rgamma function. Simply using those numbers > yields a distribution that does not look as though it would get many > qualifying samples. Here are 20 draws without any exclusions outside a > range: > > > rgamma(20, shape=0.067, scale = 0.008) > [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 > 7.680773e-38 6.441082e-15 6.168961e-13 > [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 > 1.852885e-04 4.212802e-07 1.774495e-25 > [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 > > http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html > > > > 3) SUM(X)=2000 > > 4) Do I also have to define the number of draws? if yes, it could be > > 250. > > > > Based on these restrictions, I want to generate random draw. I'm > > wondering > > how I can do this in R. Thanks. > > > > Garry > > > > > > > > On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch > > wrote: > > > >> On 11/10/2009 1:25 PM, Hongwei Dong wrote: > >> > >>> Hi, Dear R users, > >>> > >>> I'm wondering if I can do Monte Carlo Simulation in R. My problem > >>> is like > >>> this: I know variable X follows Gamma distribution with shape > >>> parameter > >>> 0.067 and scale parameter 0.008. The sum of the X is 2000. I need > >>> R help > >>> me > >>> to simulate a vector of X that satisfies both the probability > >>> distribution > >>> and the sum. Anyone has a clue to this? Much appreciated. > >>> > >> > >> Your requirements are slightly contradictory or incomplete. Here's > >> one way > >> to fully specify the problem: > >> > >> The X_i values are independent Gammas, with the given shape and > >> scale. You > >> want to simulate from the joint distribution conditional on the > >> event sum(X) > >> == 2000. > >> > >> Is that your problem? I don't know how to do the simulation, but > >> maybe > >> someone else does. > >> > >> Duncan Murdoch > >> > > > > [[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org maili
Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...
By the way, maybe the number of groups can be determined endogenously. It will be better if I do not have to set the total number of groups exogenously. Thanks Garry On Tue, Nov 10, 2009 at 1:29 PM, Hongwei Dong wrote: > Sorry for the confusion. > > Let me put it in this way. Here we have 2000 people and we want to put them > into 150 groups. The distribution of the group size follows the Gamma > distribution with shape parameter 0.067 and scale parameter 0.008. At the > same time, the minimum group size is 1, and the largest one should not be > bigger than 85. > > My questions: Can I generate a set of groups that follow the above rules > by generating random draws? > > By the way, I also confused by the rate and scale parameters in R. I did > the distribution test in SPSS and got those shape and scale parameters. In > SPSS Q-Q plot, the scale parameter is 0.008. I noticed that someone > mentioned that this is actually the rate. I'm confused. > > > Thanks a lot. > Garry > > > > > On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan wrote: > >> I think he means "rate = 0.008", so he is looking for: >> >> rgamma(n, shape=0.067, rate=0.008) >> >> Even then his problem is not well-posed. You cannot have both >> "independent" >> gamma rv's and have them sum to 2000. >> >> Ravi. >> >> >> --- >> >> Ravi Varadhan, Ph.D. >> >> Assistant Professor, The Center on Aging and Health >> >> Division of Geriatric Medicine and Gerontology >> >> Johns Hopkins University >> >> Ph: (410) 502-2619 >> >> Fax: (410) 614-9625 >> >> Email: rvarad...@jhmi.edu >> >> Webpage: >> >> http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h >> tml<http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html> >> >> >> >> >> >> >> >> >> -Original Message- >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] >> On >> Behalf Of David Winsemius >> Sent: Tuesday, November 10, 2009 2:47 PM >> To: Hongwei Dong >> Cc: R-help Forum; Duncan Murdoch >> Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte >> Carlo Simulation in R... >> >> >> On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote: >> >> > Exactly! Thanks, Duncan. >> > >> > Let me re-phrase me question like this: >> > >> > 1) X_i values are independent Gammas, with the shape 0.067 and scale >> > 0.008 >> > 2) Min(X)=1 and Max(X)=85 >> >> You might want to check that your parameterization in in agreement >> with that used by the rgamma function. Simply using those numbers >> yields a distribution that does not look as though it would get many >> qualifying samples. Here are 20 draws without any exclusions outside a >> range: >> >> > rgamma(20, shape=0.067, scale = 0.008) >> [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07 >> 7.680773e-38 6.441082e-15 6.168961e-13 >> [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11 >> 1.852885e-04 4.212802e-07 1.774495e-25 >> [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05 >> >> http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html >> >> >> > 3) SUM(X)=2000 >> > 4) Do I also have to define the number of draws? if yes, it could be >> > 250. >> > >> > Based on these restrictions, I want to generate random draw. I'm >> > wondering >> > how I can do this in R. Thanks. >> > >> > Garry >> > >> > >> > >> > On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch >> > wrote: >> > >> >> On 11/10/2009 1:25 PM, Hongwei Dong wrote: >> >> >> >>> Hi, Dear R users, >> >>> >> >>> I'm wondering if I can do Monte Carlo Simulation in R. My problem >> >>> is like >> >>> this: I know variable X follows Gamma distribution with shape >> >>> parameter >> >>> 0.067 and scale parameter 0.008. The sum of the X is 2000. I need >> >>> R help >> >>> me >> >>> to simulate a vector of X that satisfies both the probability >> >>> distribution >> >>> and the sum. A
[R] set condition in R
Hi, R users, I'm using while() in R to set a condition. My condition is: i<523 or i>535. How should I write this in R? I try "while (i<523 or i>535)" and it does not work. Thanks. Garry On Tue, Nov 10, 2009 at 11:26 AM, Hongwei Dong wrote: > Exactly! Thanks, Duncan. > > Let me re-phrase me question like this: > > 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008 > 2) Min(X)=1 and Max(X)=85 > 3) SUM(X)=2000 > 4) Do I also have to define the number of draws? if yes, it could be 250. > > Based on these restrictions, I want to generate random draw. I'm wondering > how I can do this in R. Thanks. > > Garry > > > > On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch >wrote: > > > On 11/10/2009 1:25 PM, Hongwei Dong wrote: > > > >> Hi, Dear R users, > >> > >> I'm wondering if I can do Monte Carlo Simulation in R. My problem is > like > >> this: I know variable X follows Gamma distribution with shape parameter > >> 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help > >> me > >> to simulate a vector of X that satisfies both the probability > distribution > >> and the sum. Anyone has a clue to this? Much appreciated. > >> > > > > Your requirements are slightly contradictory or incomplete. Here's one > way > > to fully specify the problem: > > > > The X_i values are independent Gammas, with the given shape and scale. > You > > want to simulate from the joint distribution conditional on the event > sum(X) > > == 2000. > > > > Is that your problem? I don't know how to do the simulation, but maybe > > someone else does. > > > > Duncan Murdoch > > > >[[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to change the font size of x and y axis labels?
Dear R users, Anyone can tell me how to change the font size of X and Y axis labels in "plot" function? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] reduce the size of points in plot???
Hi, R users, Can anyone tell me how I can change the size of points in my plot? For example: x <- c(1,3,6,9,12) y <- c(1.5,2,7,8,15) plot(x,y,pch=20) How do I reduce the size of those points? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] number of iterations in a Tobit model
Hi, R users, I'm running a Tobit model but convergence can not be reached within 30 iterations. Is there anyway I can change the max number of iterations? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Maxiter specification in R
Dear R users, I'm having a problem with maxiter specification in VGLM function. I tried to increase the number of iteration to 100, but it still stopped at 30, which is the default. Here is my script: FIT <- vglm(SFH_PCT ~ RD_DEN + CAR_HH + TRS + RES_L, tobit(Lower=0), maxiter = 100) Thanks Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] use "summary"
Hi, dear R users, I'm running a Tobit regression and using "summary" to display model results. It used to work. But this time, I kept getting this: > summary(RES_TOBIT) Length Class Mode 1 vglm S4 Could anyone tell me how to solve this problem, and why I got this problem? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sample function in R???
Hi, R users, I'm wondering if there a way in R I can select cases based on a probability vector. if a case is selected, that case is marked as 1, otherwise, 0. For example: x<-12:18 y<-1:7 sample(x,2,replace=FALSE,y) I got: [1] 15 17 What I want to see is: [1] 0 0 0 1 0 1 0 Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] identify an element in a column
Hi, R users, I'm wondering if I can identify an element in a column by an element in another column. For example: x<-1:10 y<-11:20 z<-cbind(x,y) z x y [1,] 1 11 [2,] 2 12 [3,] 3 13 [4,] 4 14 [5,] 5 15 [6,] 6 16 [7,] 7 17 [8,] 8 18 [9,] 9 19 [10,] 10 20 What I want to do is: when x=5, y=y-1 Anyone can tell me how to do this? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] "aggregate" in R
Hi, R users, I'm wondering how I can aggregate data in R with different functions for different columns. For example: x<-rep(1:5,3) y<-cbind(x,a=1:15,b=21:35) y<-data.frame(y) I want to aggregate "a" and "b" in y by "x". With "a", I want to use function "mean"; with "b", I want to use function "sum". I tried: > aggregate(y,x,mean(y$a),sum(y$b)) But I got the error: Error in match.fun(FUN) : 'mean(y$a)' is not a function, character or symbol Anyone can tell me how to fix this problem? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] legend position in "barplot"
Hi, R users, I there a way that I can control the position of the legend while using "barplot" function? For example: a<-matrix(c(0,0,0.5,0.8,0.9,0.9),2,3) colnames(a)<-c("X","Y","Z") rownames(a)<-c("A","B") a barplot(a,width = 0.2,legend = TRUE,axes=FALSE,col=c("coral3","steelblue3"),beside=TRUE) In this case, the legend is too small, and I want to move it to the left. Thanks. Regards Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.