[R] specific package to laply
Hi, I have a batch jobs problem for parallel code without sudo. When I try to send a package to the different nodes, as follows: .libPaths( c(.libPaths(), "/my/first/library", "/my/second/library") ) library(foreach) library(iterators) library(parallel) library(doParallel) library(rvest) cl <- makeCluster(detectCores()) registerDoParallel(cl) sites <- paste0("https://www.site",1:2,".com";) html0 <- foreach(i=sites,.packages='rvest') %dopar% html(i) I get the following output: Error in e$fun(obj, substitute(ex), parent.frame(), e$data) : worker initialization failed: there is no package called ‘rvest’ Calls: %dopar% -> Presumably, I need a way to export my .libPaths() to the nodes. Any suggestions? Thanks, Benjamin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] party package: is cforest really bagging?
Hi all, I'm using the "party" package to create random forest of regression trees. I've created a ForestControl class in order to limit my number of trees (ntree), of nodes (maxdepth) and of variables I use to fit a tree (mtry). One thing I'm not sure of is if the cforest algo is using subsets of my training set for each tree it generates or not. I've seen in the documentation that it is bagging so I assume it should. But I'm not sure to understand well what the "subset" input is in that function. I'm also puzzled by the results I get using ctree: when plotting the tree, I see that all my variables of my training set are classified in the different terminal tree nodes while I would have exepected that it only uses a subset here too. So my question is, is cforest doing the same thing as ctree or is it really bagging my training set? Thanks in advance for you help! Ben [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package random forest: can corr.bias be harmful?
Hi, Christian. It sounds like to the maintainer feels that the package needs improvements, i.e., it may not function as expected. However, you may contact him directly using maintainer("randomForest"). Best, Benjamin On Fri, Jun 21, 2013 at 9:36 AM, Christian Kampichler < christian.kampich...@sovon.nl> wrote: > There exists an option corr.bias for making regression analyses with the R > package randomForest. The manual warns Experimental. Use at your own > risk. What does corr.bias precisely do and why can it be dangerous to use > it? As far as I know the bias correction is based on a linear regression > between the out-of-bag-prediction and the observed values which is used for > correction of all predictions made by the model, but this does not appear > so harmful to me. > > -- > Dr. Christian Kampichler > Sovon Dutch Centre for Field Ornithology > PO Box 6521, 6503 GA Nijmegen, The Netherlands > +31 24 7 410 410 > www.sovon.nl > > __** > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/**listinfo/r-help<https://stat.ethz.ch/mailman/listinfo/r-help> > PLEASE do read the posting guide http://www.R-project.org/** > posting-guide.html <http://www.R-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] read.table(..., header == FALSE, colClasses = )
Hello I noticed that starting with R version 3.3.0 onward, this generates a warning: > txt <- c("a", "3.14") > read.table(file = textConnection(txt), header = FALSE, colClasses = c(x = "character", y = "numeric")) the warning is "not all columns named in 'colClasses' exist" and I guess the change was made in response to this? https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16478 Regardless, I am wondering whether this is desirable, that as a result of the change, the code has become stricter about the presence of a (formerly) harmless names attribute. Or am I missing something? Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] read.table(..., header == FALSE, colClasses = )
Jeff, Thank you for your reply. The intent was to construct a minimum reproducible example. The same warning occurs when the 'file' argument points to a file on disk with a million lines. But you are correct, my example was slightly malformed and in fact gives an error under R version 3.2.2. Please allow me to try again; in older versions of R, > read.table(file = textConnection("a\t3.14"), header = FALSE, colClasses = c(x = "character", y = "numeric"), sep="\t") V1 V2 1 a 3.14 (with no warning). As of version 3.3.0, > read.table(file = textConnection("a\t3.14"), header = FALSE, colClasses = c(x = "character", y = "numeric"), sep="\t") V1 V2 1 a 3.14 Warning message: In read.table(file = textConnection("a\t3.14"), header = FALSE, : not all columns named in 'colClasses' exist My intent was not to complain but rather to learn more about best practices regarding the names attribute. Regards Ben On 10/23/2017 08:51 PM, Jeff Newmiller wrote: You are constructing the equivalent of a two-line data file, and complaining that it is not treating it like it was one line. If it did used to accept this silently [skeptical] then I for one am glad it produces a warning now. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] read.table(..., header == FALSE, colClasses = )
Yes, it makes sense now; lesson learned. Thank you both! Sometimes it seems that no matter how good the documentation, some useR will inevitably (ab)use the code in ways that were never intended by the authors. Then when the code and/or documentation changes, it is not always obvious to the useR whether the intent of the authors has changed, or whether the useR had just been "getting the right answer for the wrong reason" all along. In this particular case, the change was documented as stemming from a "new feature" (as opposed to a bugfix or more stringent argument checking) that might appear to be a non fully backwards compatible change. For example one might have the (apparently) bad habit of using col.names as a shortcut to rename headers on-the-fly ... > getRversion() [1] ‘3.2.2’ > read.table(textConnection("x y\na 3.14"), header = TRUE, colClasses = c(x = "character", y = "numeric"), col.names = c("foo", "bar")) foo bar 1 a 3.14 but indeed, the names attribute has zero effect on the result: > read.table(textConnection("x y\na 3.14"), header = TRUE, colClasses = c(y = "character", x = "numeric"), col.names = c("foo", "bar")) foo bar 1 a 3.14 so I agree it is good that we are checking for that now. Regards Ben On 10/24/2017 08:55 AM, Martin Maechler wrote: Benjamin Tyner on Tue, 24 Oct 2017 07:21:33 -0400 writes: > Jeff, > Thank you for your reply. The intent was to construct a minimum > reproducible example. The same warning occurs when the 'file' argument > points to a file on disk with a million lines. But you are correct, my > example was slightly malformed and in fact gives an error under R > version 3.2.2. Please allow me to try again; in older versions of R, > > read.table(file = textConnection("a\t3.14"), header = FALSE, > colClasses = c(x = "character", y = "numeric"), sep="\t") > V1 V2 > 1 a 3.14 > (with no warning). As of version 3.3.0, > > read.table(file = textConnection("a\t3.14"), header = FALSE, > colClasses = c(x = "character", y = "numeric"), sep="\t") > V1 V2 > 1 a 3.14 > Warning message: > In read.table(file = textConnection("a\t3.14"), header = FALSE, : > not all columns named in 'colClasses' exist > My intent was not to complain but rather to learn more about best > practices regarding the names attribute. which is a nice attitude, thank you. An even shorter MRE (as header=FALSE is default, and the default sep="" works, too): tt <- read.table(textConnection("a 3.14"), colClasses = c(x="character", y="numeric")) Warning message: In read.table(file = textConnection("a 3.14"), colClasses = c(x = "character", : not all columns named in 'colClasses' exist If you read in the help page -- you did read that before posting, did you?--- how 'colClasses' should be specified , colClasses: character. A vector of classes to be assumed for the columns. If unnamed, recycled as necessary. If named, names are matched with unspecified values being taken to be ‘NA’. Possible values are .. . and the 'x' and 'y' names you used, are matched with the colnames ... which on the other hand are "V1" and "V2" for you, and so you provoke a warning. Once you have read (and understood) the above part of the help page, it becomes, easy, no? tt <- read.table(textConnection("a 3.14"), colClasses = c("character","numeric")) t2 <- read.table(textConnection("a 3.14"), colClasses=c(x="character",y="numeric"), col.names=c("x","y")) t2 xy 1 a 3.14 i.e., no warning in both of these two cases. So please, please, PLEASE: at least non-beginners like you *should* take the effort to read the help page (and report if these seem incomplete or otherwise improvable)... Best, Martin Maechler ETH Zurich __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] trying to find the multiple combinations...
Hi, I'm trying to find a way to determine what multiples of the combination of three or more numbers equals a forth number. So, if I had a number set like: c(13.4689, 12.85212, 17.05071) What combination and multiples of these numbers would average to 15.0078? (so, something that would tell me x, y, and z in (x*13.4689 + y*12.85212+ z*17.05071) / x+y+z) = 15.0078 I think this is doable with aggregate? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] setSessionLimit
Hello The doc/NEWS.2 file mentions a setSessionLimit function, added with version 2.8.0 o setTimeLimit() function to set limits on the CPU and/or elapsed time for each top-level computation, and setSessionLimit() to set limits for the rest of the session. However, I no longer see this function in recent versions of R, and there is no mention of its removal in the NEWS nor in the svn log. So I'm curious to learn why it was removed...does anyone recall? Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rJava garbage collect
Hi Does rJava offer a way to instruct the JVM to perform a garbage collection? Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rJava garbage collect
Thanks Jeff; indeed it works: .jcall("java/lang/System", method = "gc") On 02/05/2018 11:53 PM, Jeff Newmiller wrote: rJava offers a mechanism to call arbitrary methods in Java. Wouldn't you use that mechanism to call whatever you would call if you were programming in Java (e.g. System.gc)? __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rJava garbage collect
Hi Martin, Thanks for providing the reference. In this particular case, it helped me to discover that 13 JVM threads were garbage collecting in parallel, occasionally resulting in a race condition. Setting options(java.parameters = "-XS:ParallelGCThreads=1") appears to resolve the issue. Regards Ben On 6 February 2018 at 04:34, Benjamin Tyner https://stat.ethz.ch/mailman/listinfo/r-help>> wrote: >/Hi />//>/Does rJava offer a way to instruct the JVM to perform a garbage collection? / Do you really, really need to run the garbage collector? Consider reading: https://stackoverflow.com/questions/5086800/java-garbage-collection Regards Martin __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] system() or pipe(..., open = "r") without child process?
Greetings On linux, is it possible to invoke an OS command from within R without spawning a child process? If not, is it possible to avoid copying the "parts of the caller's context" that are mentioned on the clone manpage? ENOMEM Cannot allocate sufficient memory to allocate a task structure for the child, or to copy those parts of the caller's context that need to be copied. (I am trying to avoid that ENOMEM condition when calling system(), pipe(), etc). Regards, Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Efficient manipulation with list object
You may be able to speed it up further by using `data.table`'s `rbindlist` or a similar function as shown here https://stackoverflow.com/a/49772719/5861244. 2018-06-10 21:20 GMT+02:00 Christofer Bogaso : > Using do.call() reduces my calculation time significantly. > > On Sun, Jun 10, 2018 at 10:45 PM ruipbarradas wrote: > >> Hello, >> >> Instead of Reduce try do.call. >> >> do.call ('rbind', list) >> >> But with such a long list it will still take time. >> >> Hope this helps, >> >> Rui Barradas >> >> >> >> Enviado a partir do meu smartphone Samsung Galaxy. >> Mensagem original >> De: Christofer Bogaso >> Data: 10/06/2018 16:33 (GMT+00:00) >> Para: r-help >> Assunto: [R] Efficient manipulation with list object >> >> Hi, >> >> I have a list of length 10,000, and each element of that list is a matrix >> with 3 columns and 2,000 rows. >> >> Now when I tried to make a Matrix object with that list using >> Reduce('rbind', list), my code is taking a considerable amount of time. >> >> Is there any way to implement same above task in more efficient way? >> >> Thanks, >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] undo compile? (or: remove bytecode from closure)
Hi Given a closure which has been compiled, what's the recommended way to recover the original? For example, > f <- function(x) x+1 > fc <- cmpfun(f) > rm(f) > fc function(x) x+1 what's the best way to recover f from fc ? Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] undo compile? (or: remove bytecode from closure)
Thanks Rui and Duncan, this is most helpful. On 07/16/2018 06:58 AM, Duncan Murdoch wrote: On 16/07/2018 5:31 AM, Rui Barradas wrote: Hello, Maybe the following is not the recommended way but it works (and I believe makes sense). f <- function(){} formals(f) <- formals(fc) body(f) <- body(fc) That's not quite right: it might lose the environment of fc, if it isn't the environment where this took place. But a simpler solution is just f <- fc body(f) <- body(f) because any assignment to the body of a function causes the bytecode to be dropped. Both of our approaches will also cause the source references to be dropped. If you want to save those, you need more steps: f <- fc body(f) <- body(f) attr(f, "srcref") <- getSrcref(fc) Duncan Murdoch f #function (x) #{ # x <- x + 1 # pi * x #} f(1) #[1] 6.283185 Hope this helps, Rui Barradas Às 03:25 de 16-07-2018, Benjamin Tyner escreveu: Hi Given a closure which has been compiled, what's the recommended way to recover the original? For example, > f <- function(x) x+1 > fc <- cmpfun(f) > rm(f) > fc function(x) x+1 what's the best way to recover f from fc ? Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Unknown anomaly
Dear Sir, I writting to you as I am facing an irregularity in R that I do not know the origin. When doing a sequence from 0 to 1 by 0.02 and assigning it to a vector (i.e. code: a <- seq(from=0, to=1, by=0.02)) then, when I try to use the 36th element (and two others behave the same way) it is not recognized correctly. For instance a[36]==0.7, what should give TRUE, gives instead FALSE. However, this works fine for element 35 and 37 and all other elements but two. I do not know the reason. I restarted my R session and tried on another computer. This has been the same. None of my colleagues had an answer. I hope that you would be able to help me fix that as it must be a pretty straightforward error that I do not realise. I would be thankful for any help, With my very Best Regards, Benjamin. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot command error message
I tried to plot a clustered linear regression model with the cplot command in R (code below). Leaflet is a binary variable (I know logit would be better), partisan is nummeric variable (0-4) and partisan_mis a dummy (0,1). As you can see it is clustered around two variables: around individuals and around the specific survey. When I try to run the cplot command I always get this error message: error in plot.window(...) : need finite 'ylim' values Zusätzlich: Warnmeldungen: 1: In min(x) : no non-missing arguments to min; returning Inf 2: In max(x) : no non-missing arguments to max; returning -Inf Does anyone know how to solve this problem? As I suppose that the problem has something to do with the way I coded the variables, here is how i did it: voxit$leaflet[voxit$a65== "Ja"] <- 1 voxit$leaflet[voxit$a65== "Nein"] <- 0 voxitf <- subset(voxit, leaflet >= 0 ) voxitf$partisan[voxitf$a87x== "1 Tag oder weniger vorher"] <- 0 voxitf$partisan[voxitf$a87x== "Einige Tage vorher"] <- 1 voxitf$partisan[voxitf$a87x== "1-2 Woche(n) vorher"] <- 2 voxitf$partisan[voxitf$a87x== "Mehrere Wochen vorher"] <- 3 voxitf$partisan[voxitf$a87x== "Schon immer klar"] <- 4 mean(voxitf$partisan, na.rm = TRUE) voxitf$partisan[is.na(voxitf$a87x)]<- 2.73493 table(voxitf$partisan) voxitf$partisan_mis[is.na(voxitf$a87x)]<- 1 voxitf$partisan_mis[!is.na(voxitf$a87x)]<- 0 model1.1 <- lm(leaflet ~ partisan + partisan_mis , data = voxitf) vcov_clust1.1 <- cluster.vcov(model1.1, cbind(voxitf$id,voxitf$projetx)) (marginal_effects1.1 <- margins(model1.1)) summary(marginal_effects1.1) cplot(model1.1, x = "partisan", dx = "leaflet", what = "effect", se.type = "shade") [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Svyglm Error
Greetings, I am revisiting code from several different files I have saved from the past and all used to run flawlessly; now when I run any of the svyglm related functions, I am coming up with an error: Error in model.frame.default(formula = F3ATTAINB ~ F1PARED, data = data, : the ... list does not contain 4 elements The following is a minimal reproducible example: library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr ##Resetting baseline levels for predictors elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or Less") ) elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") ) elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") ) elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") ) elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Other") ) ##Univariate testing for Other subset Othpared <- svyglm(formula=F3ATTAINB~F1PARED,family="quasibinomial",design=subset(elsq1ch_brr,BYSCTRL==1&G10COHRT==1&F1RTRCC=="Other"),na.action=na.omit) summary(Othpared)? Any help in resolving this concern would be greatly appreciated. Sincerely, Courtney Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Svyglm Error
?Problem solved; I did not have the most updated version of R. When I updated it and ran the code, all worked well again. Thanks very much for helping me solve the mistake of an R novice! Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: Anthony Damico Sent: Wednesday, July 5, 2017 3:47 PM To: Courtney Benjamin Cc: r-help@r-project.org Subject: Re: [R] Svyglm Error hi, i am not hitting an error when i copy and paste your code into a fresh console. maybe compare your sessionInfo() to mine? > sessionInfo() R version 3.4.1 (2017-06-30) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows Server 2008 R2 x64 (build 7601) Service Pack 1 Matrix products: default locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252LC_MONETARY=English_United States.1252 LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] survey_3.32-1 survival_2.41-3 Matrix_1.2-10 RCurl_1.95-4.8 bitops_1.0-6 loaded via a namespace (and not attached): [1] compiler_3.4.1 splines_3.4.1 lattice_0.20-35 On Wed, Jul 5, 2017 at 2:24 PM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: Greetings, I am revisiting code from several different files I have saved from the past and all used to run flawlessly; now when I run any of the svyglm related functions, I am coming up with an error: Error in model.frame.default(formula = F3ATTAINB ~ F1PARED, data = data, : the ... list does not contain 4 elements The following is a minimal reproducible example: library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr ##Resetting baseline levels for predictors elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or Less") ) elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") ) elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") ) elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") ) elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Other") ) ##Univariate testing for Other subset Othpared <- svyglm(formula=F3ATTAINB~F1PARED,family="quasibinomial",design=subset(elsq1ch_brr,BYSCTRL==1&G10COHRT==1&F1RTRCC=="Other"),na.action=na.omit) summary(Othpared)? Any help in resolving this concern would be greatly appreciated. Sincerely, Courtney Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org><mailto:cbenj...@btboces.org<mailto:cbenj...@btboces.org>> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Creating New Variable Using Ifelse
Hello R Help List, I am an R novice and trying to use the ifelse function to create a new binary variable based off of the responses of two other binary variables; NAs are involved. I pulled it off almost successfully, but when I checked the counts of my new variable for accuracy, I found that a small portion of the NA cases were not being passed through as NAs, but as "0" counts in my new variable. My many attempts at creating a nested ifelse statement that would pass the NAs through properly have not been successful. Any help is greatly appreciated. Here is a MRE:? library(RCurl) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq2wbl.csv";) elsq2wbl <- read.csv(text = data) ##Recoding Negative Responses to NA elsq2wbl [elsq2wbl[, "EVERRELJOB"] < -3, "EVERRELJOB"] <- NA elsq2wbl [elsq2wbl[, "PSWBL"] < -2, "PSWBL"] <- NA #Labeling categorical variable levels elsq2wbl$EVERRELJOB <- factor(elsq2wbl$EVERRELJOB, levels = c(0,1), labels = c("No","Yes")) elsq2wbl$PSWBL <- factor(elsq2wbl$PSWBL, levels = c(0,1), labels = c("No","Yes")) ##Trying to create a new variable to indicate if the student had a job #related to the college studies that was NOT a WBL experience elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" & elsq2wbl$EVERRELJOB=="Yes",1,0) #Cross tab to check counts of two variables that new variable is based upon xtabs(~PSWBL+EVERRELJOB,subset(elsq2wbl,BYSCTRL==1&G10COHRT==1),addNA=TRUE) #Checking count of newly created variable Q2sub <- subset(elsq2wbl,BYSCTRL==1&G10COHRT==1) library(plyr) count(Q2sub,'NONWBLRELJOB') #The new variable has the correct count of "1", but 88 cases too many for "0" #The cross tab shows 20 and 68 NA cases that are being incorrectly counted as "0" in the new variable #My other approach at trying to handle the NAs properly-returns an error elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" & elsq2wbl$EVERRELJOB=="Yes",1,ifelse(is.na(elsq2wbl$PSWBL)&is.na(elsq2wbl$EVERRELJOB),NA, ifelse(elsq2wbl$PSWBL!="No" & elsq2wbl$EVERRELJOB!="Yes",0))) Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating New Variable Using Ifelse
Thanks very much; with your tips, I was able to get the nested ifelse statement to work properly! Courtney Benjamin From: PIKAL Petr Sent: Thursday, August 10, 2017 5:39 AM To: Courtney Benjamin; r-help@r-project.org Subject: RE: Creating New Variable Using Ifelse Hi see in line > -Original Message- > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Courtney > Benjamin > Sent: Thursday, August 10, 2017 5:55 AM > To: r-help@r-project.org > Subject: [R] Creating New Variable Using Ifelse > > Hello R Help List, > > I am an R novice and trying to use the ifelse function to create a new binary > variable based off of the responses of two other binary variables; NAs are > involved. I pulled it off almost successfully, but when I checked the counts > of > my new variable for accuracy, I found that a small portion of the NA cases > were > not being passed through as NAs, but as "0" counts in my new variable. My > many attempts at creating a nested ifelse statement that would pass the NAs > through properly have not been successful. Any help is greatly appreciated. > > Here is a MRE:? > > library(RCurl) > data <- > getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech- > ed/master/elsq2wbl.csv") Did not work for me probably due to some restriction of data access. > elsq2wbl <- read.csv(text = data) > > ##Recoding Negative Responses to NA > elsq2wbl [elsq2wbl[, "EVERRELJOB"] < -3, "EVERRELJOB"] <- NA > elsq2wbl [elsq2wbl[, "PSWBL"] < -2, "PSWBL"] <- NA If you wanted recode values below zero to NA you can do this easily without any ifelse > summary(test) mp tepl kryst Min. : 7.11 Min. :100.0 Min. : 21.70 1st Qu.:32.25 1st Qu.:400.0 1st Qu.: 24.15 Median :37.50 Median :500.0 Median : 26.55 Mean :38.44 Mean :485.7 Mean : 42.64 3rd Qu.:44.75 3rd Qu.:600.0 3rd Qu.: 33.62 Max. :76.02 Max. :900.0 Max. :150.00 NA's :3 NA's :6 > test[is.na(test)] <- 999 > summary(test) mp tepl kryst Min. : 7.11 Min. :100.0 Min. : 21.70 1st Qu.: 34.27 1st Qu.:400.0 1st Qu.: 25.93 Median : 41.75 Median :500.0 Median : 92.90 Mean :244.28 Mean :485.7 Mean :452.51 3rd Qu.: 70.05 3rd Qu.:600.0 3rd Qu.:999.00 Max. :999.00 Max. :900.0 Max. :999.00 > > test[test>900] <- NA > summary(test) mp tepl kryst Min. : 7.11 Min. :100.0 Min. : 21.70 1st Qu.:32.25 1st Qu.:400.0 1st Qu.: 24.15 Median :37.50 Median :500.0 Median : 26.55 Mean :38.44 Mean :485.7 Mean : 42.64 3rd Qu.:44.75 3rd Qu.:600.0 3rd Qu.: 33.62 Max. :76.02 Max. :900.0 Max. :150.00 NA's :3 NA's :6 > > #Labeling categorical variable levels > elsq2wbl$EVERRELJOB <- factor(elsq2wbl$EVERRELJOB, levels = c(0,1), labels = > c("No","Yes")) > elsq2wbl$PSWBL <- factor(elsq2wbl$PSWBL, levels = c(0,1), labels = > c("No","Yes")) > > ##Trying to create a new variable to indicate if the student had a job > #related to the college studies that was NOT a WBL experience > elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" & > elsq2wbl$EVERRELJOB=="Yes",1,0) You can use simple logical functions to get desired result First sample data > a <- sample(c("Y", "N"), 30, replace=TRUE) > b <- sample(c("Y", "N"), 30, replace=TRUE) > (a=="N")*(b=="Y") [1] 1 1 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 has 1 only if both conditions are met. and it works with NA values too. > a[c(3,5,8)] <- NA > b[c(3,6,7,8)] <- NA > (a=="N")*(b=="Y") [1] 1 1 NA 0 NA NA NA NA 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 [26] 0 0 0 0 1 Cheers Petr > > #Cross tab to check counts of two variables that new variable is based upon > xtabs(~PSWBL+EVERRELJOB,subset(elsq2wbl,BYSCTRL==1&G10COHRT==1),add > NA=TRUE) > > #Checking count of newly created variable > Q2sub <- subset(elsq2wbl,BYSCTRL==1&G10COHRT==1) > library(plyr) > count(Q2sub,'NONWBLRELJOB') > > #The new variable has the correct count of "1", but 88 cases too many for "0" > #The cross tab shows 20 and 68 NA cases that are being incorrectly counted as > "0" in the new variable > > #My other approach at trying to handle the NAs properly-returns an error > elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" & > elsq2wbl$EVERRELJOB==
[R] How to Display Value Labels in R Outputs?
Hello R Experts, I am using the Survey Package in R to do some initial descriptive stats for my dissertation. With the outputs for both the svymean and the barplot, I would like the value labels to be displayed for the variable-it would make the descriptive statistics much easier to interpret. Instead of the output labels of: F1RTRCC1, F1RTRCC2-I would like to see the value labels of "academic" and "occupational" to be displayed. How do I go about making this happen? I am including a minimal reproducible example with a small subset of my actual data: https://drive.google.com/file/d/0B5fHCR5TGRjaZ29wNzR0cF9YRXc/view?usp=sharing Any help is greatly appreciated. My only experience thus far is with SPSS and I have a feeling that the reason the variable value labels are not appearing is due to either the way I read the dataset into R: elsq1ch<-read.table (file="els-Q1-04-21-16.dat", header = TRUE, sep = "\t", quote = "\"", dec =".") or I am not specifying some detail that is required to manually assign labels to the values of the variables.? Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] commercial license
Dear r-project Team How does It cost a commercial license for the R Console and the R-comander GUI without the Rstudio enviroment. Thanks for helping me. Freundliche Gr�sse/Kind regards Benjamin Stocker Reporting/Controlling MBC Mercedes-Benz Schweiz AG Bernstrasse 55 8952 Schlieren/Switzerland Phone +41 44 755 84 24 Fax +41 44 755 82 17 mailto:benjamin.stoc...@daimler.com www.mercedes-benz.ch<http://www.mercedes-benz.ch/> If you are not the addressee, please inform us immediately that you have received this e-mail by mistake, and delete it. We thank you for your support. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Svyglm Error in Survey Package
In attempting to use the svyglm call in the R Survey Package, I am receiving the error: Error in pwt[i] : invalid subscript type 'list' I have not been able to find a lot of information on how to resolve the error; one source advised it was related to how the subsetting command was executed. This was my initial attempt: mc1 <- svyglm(F3ATTAINMENT~F1SES2QU+F1RGPP2,elsq1ch_brr,subset(elsq1ch_brr,BYSCTRL==1 & G10COHRT==1),na.action) summary(mc1) This was my second approach trying to change up how I had subsetted the data: summary(mc1) samp1 <- subset(elsq1ch_brr,BYSCTRL==1 & G10COHRT==1) dim(samp1) mc1 <- svyglm(F3ATTAINMENT~F1SES2QU+F1RGPP2,elsq1ch_brr,subset=samp1,na.action) summary(mc1)? Both attempts resulted in the same error stated above. Any advisement in how to resolve this error would be greatly appreciated. Sincerely, Courtney Benjamin ? Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Svyglm Error in Survey Package
Hello Dr. Fox, Thank you very much for your recommendations and I will take action on following them. I appreciate your willingness to provide guidance on my novice questions. Sincerely, Courtney From: Fox, John Sent: Sunday, September 25, 2016 9:21 AM To: Courtney Benjamin Cc: r-help@r-project.org Subject: RE: Svyglm Error in Survey Package Dear Courtney, You're confusing a function call, na.action(na.omit), with an argument specification, na.action=na.omit (and, in any event, there is no na.action() function). But you don't have to specify na.action=na.omit, because na.omit (which produces a complete-case analysis) is the default na.action. See ?glm for more information. If you haven't already done so, you might want to read something about how statistical modeling functions in R -- and possibly R more generally -- work. In the long run, that likely will save you some time. I'm cc'ing this to the r-help email list, where you posed your original question. It's generally a good idea to keep responses on the list. I hope this helps, John > -Original Message- > From: Courtney Benjamin [mailto:cbenj...@btboces.org] > Sent: September 24, 2016 10:55 PM > To: Fox, John > Subject: Re: Svyglm Error in Survey Package > > Hello Dr. Fox, > Thank you very much for your explanation; I am in better shape now with the > subset argument and I am no longer getting the error. Now I am not having > success with specifying the na.action argument. I would like to exclude NAs > like I have done in other commands with na.rm=TRUE. > The following are my attempts: > > > summary(svyglm(F3ATTAINMENT~F1PARED+F1SES2QU,design=elsq1ch_brr,su > bset > > =BYSCTRL==1 & G10COHRT==1,na.action)) > Error in glm(formula = F3ATTAINMENT ~ F1PARED + F1SES2QU, subset = > BYSCTRL == : > argument "na.action" is missing, with no default > > > > summary(svyglm(F3ATTAINMENT~F1PARED+F1SES2QU,design=elsq1ch_brr,su > bset > > =BYSCTRL==1 & G10COHRT==1,na.action(na.omit)) > + > Any guidance you would be willing to provide is greatly appreciated; I have > only been using R for about 6 months. > Sincerely > Courtney > > Courtney Benjamin > Broome-Tioga BOCES > Automotive Technology II Teacher > Located at Gault Toyota > Doctoral Candidate-Educational Theory & Practice State University of New York > at Binghamton cbenj...@btboces.org > 607-763-8633 > > > From: Fox, John > Sent: Saturday, September 24, 2016 9:38 AM > To: Courtney Benjamin > Cc: r-help@r-project.org > Subject: RE: Svyglm Error in Survey Package > > Dear Courtney, > > I think that you're confused about how to use the subset argument to svyglm() > and about what the subset() function returns. The subset argument should be a > logical expression, evaluating to TRUE or FALSE for each case; subset() > returns > a data set (e.g., a "survey.design" object). > > Here's an example adapted from ?svyglm: > > --- snip--- > data(api) > dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc) > summary(svyglm(api00~ell+meals+mobility, design=dstrat)) > summary(svyglm(api00~ell+meals+mobility, design=dstrat, subset = pcttest > > 90)) summary(svyglm(api00~ell+meals+mobility, design=subset(dstrat, subset > = pcttest > 90))) > --- snip--- > > The last two commands produce the same results. > > I hope this helps, > John > > - > John Fox, Professor > McMaster University > Hamilton, Ontario > Canada L8S 4M4 > Web: socserv.mcmaster.ca/jfox > > > > > -Original Message- > > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of > > Courtney Benjamin > > Sent: September 23, 2016 11:01 PM > > To: r-help@r-project.org > > Subject: [R] Svyglm Error in Survey Package > > > > In attempting to use the svyglm call in the R Survey Package, I am > > receiving the > > error: Error in pwt[i] : invalid subscript type 'list' > > > > I have not been able to find a lot of information on how to resolve > > the error; one source advised it was related to how the subsetting > > command was executed. > > > > This was my initial attempt: > > > > mc1 <- > > > svyglm(F3ATTAINMENT~F1SES2QU+F1RGPP2,elsq1ch_brr,subset(elsq1ch_brr, > > BYSCTRL==1 & G10COHRT==1),na.action) > > summary(mc1) > > This was my second approach trying to change up how I had subsetted > > the > > data: > > summary(mc1) > > s
Re: [R] non-terminal token lacking children from utils::getParseData
Thank you Yihui for also reporting the bug here: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16354 and thank you Duncan for finding the issue and fixing it! I definitely like your idea to report a summary message instead of the long text string. Regards Ben > I tried to reduce the offending portion as best I could to a > more-or-less minimal example (1136 bytes), which can be downloaded via: > > wget https://www.dropbox.com/s/74rgxr5x2aalr99/badstring.R > > then once in R, > > > b <- parse(file = "~/badstring.R", keep.source = TRUE) > > d <- getParseData(b, includeText = FALSE) > > subset(d, line1 == 2L) >line1 col1 line2 col2 id parent token terminal > 10 25241 10 21 exprFALSE > > subset(d, parent == 10) > [1] line1col1 line2col2 id parent token > terminal > <0 rows> (or 0-length row.names) > > here is my > > > sessionInfo() > R version 3.0.2 (2013-09-25) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > note that while it says R version 3.0.2 above, I have seen the same > behaviour under version 3.1.2 as well. > > Regards > Ben > > On 02/19/2015 06:34 PM, Duncan Murdoch wrote: > >/ On 19/02/2015 6:31 PM, B Tyner wrote: > />>/ Hi, > />>/ > />>/ I have run across a source file for which the return value > />>/ of getParseData() includes a record having FALSE for $terminal, yet it is > />>/ not the parent of any other tokens. Before I spend time constructing a > />>/ reproducible example, I wanted to verify that this is in fact unexpected > />>/ behavior (under R 3.1.2)? > />/ Before I spend the time thinking about that, I'd like to see a > />/ reproducible example. > />/ > />/ Duncan Murdoch > />/ > /> __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] package implementing continuous binomial?
Hi I'm wondering if anyone is aware of an R package implementing (i.e., providing a pdf, cdf, and/or quantile function) for the continuous binomial distribution? Specifically the one characterized here: http://www2.math.uni-paderborn.de/fileadmin/Mathematik/AG-Indlekofer/Workshop/Satellite_meeting/ilenko.pdf Figured I would check here first, before attempting to code it up myself. Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] package implementing continuous binomial?
Thanks David! I'll take a look at zipfR. Regards Ben On 05/07/2015 03:10 PM, David Winsemius wrote: > On May 6, 2015, at 7:00 PM, Benjamin Tyner wrote: > >> Hi >> >> I'm wondering if anyone is aware of an R package implementing (i.e., >> providing a pdf, cdf, and/or quantile function) for the continuous >> binomial distribution? Specifically the one characterized here: >> >> http://www2.math.uni-paderborn.de/fileadmin/Mathematik/AG-Indlekofer/Workshop/Satellite_meeting/ilenko.pdf >> >> Figured I would check here first, before attempting to code it up myself. > I found that reading the ArXiv version of that material was easier to > understand: > > http://arxiv.org/abs/1303.5990 > > zipfR package has an implementation of the incomplete beta function that > might make some of the coding of the pdf and cdf more simple. > > Searching done with Graves' very useful utility package: > > library('sos') > findFn("incomplete beta function") > > (I did't think that doing a search on "continuous Binomial" was likely to be > helpful, but I tried it anyway and did not find any functions named > "continuous binomial" in their help page titles.) > __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Scatterplot : smoothing colors according to density of points
Hello everyone, I have a data frame D with 4 columns id,X,Y,C. I want to plot a simple scatter plot of D$X vs. D$Y and using D$C values as a color. (id is just a text string not used for the plot) But actually, I don't want to use the raw values of D$C, I would prefer to calculate the average values of D$C according to the density of points in a fixed neighborhood. In other words, I would like to smooth the colors according to the density of points. I am looking for any function,package that could solve this. So far, I've been looking at library MASS and the function kde2d which can calculate the density of points in 2 directions, but I don't see how I could then use this information to recalculate my D$C values. Here is a piece of the matrix : > head(D) id X YC 1 O13297 44.44 21.61220 -0.136651639 2 O13329 31.272085 4.01590 -0.117016949 3 O13525 6.865672 2.43884 -0.161173913 4 O13539 14.176245 7.81217 -0.075756757 5 O13541 73.275862 3.59012 -0.006988235 6 O13547 28.991597 258.99900 -0.013985507 > dim(D) [1] 36164 > apply(D[,-1],2,range) X Y C [1,] 0.3378378 0.0003 -0.738 [2,] 100.000 24556.4000 0.5582500 (Y is not linear, so I use log='y' in the plot function) I used a palette of 100 colors ranging from Blue to Yellow to red. >pal = colorRampPalette(c("blue","yellow","red"))(100) To make D$C values correspond to a color, I used a cut with the following breaks (101 breaks from -1.2 to 1.2): > BREAKS [1] -1.2000 -0.8000 -0.4000 -0.3600 -0.3200 -0.2800 -0.2400 -0.2000 -0.1925 [10] -0.1850 -0.1775 -0.1700 -0.1625 -0.1550 -0.1475 -0.1400 -0.1368 -0.1336 [19] -0.1304 -0.1272 -0.1240 -0.1208 -0.1176 -0.1144 -0.1112 -0.1080 -0.1048 [28] -0.1016 -0.0984 -0.0952 -0.0920 -0.0888 -0.0856 -0.0824 -0.0792 -0.0760 [37] -0.0728 -0.0696 -0.0664 -0.0632 -0.0600 -0.0568 -0.0536 -0.0504 -0.0472 [46] -0.0440 -0.0408 -0.0376 -0.0344 -0.0312 -0.0280 -0.0248 -0.0216 -0.0184 [55] -0.0152 -0.0120 -0.0088 -0.0056 -0.0024 0.0008 0.0040 0.0072 0.0104 [64] 0.0136 0.0168 0.0200 0.0232 0.0264 0.0296 0.0328 0.0360 0.0392 [73] 0.0424 0.0456 0.0488 0.0520 0.0552 0.0584 0.0616 0.0648 0.0680 [82] 0.0712 0.0744 0.0776 0.0808 0.0840 0.0872 0.0904 0.0936 0.0968 [91] 0.1000 0.1250 0.1500 0.1750 0.2000 0.2250 0.2500 0.4875 0.7250 [100] 0.9625 1.2000 > C.levels = as.numeric(cut(D$C,breaks=BREAKS)) >length(C.levels) [1] 3616 C.levels ranges from 2 to 98 and then to plot the colors I used pal[C.levels]. > plot( x=D$x, y=D$Y, col=pal[ C.levels ],log='y') [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Rprof and system
Hi I have an R script which invokes WriteXLS() (from the package of the same name) which as you may know, calls perl via system(). I've noticed that when I enable profiling using Rprof(), when the script gets to the part where perl is called, it gets "stuck": it just sits there using 99-100% CPU and around 10% of the RAM, and the perl command does not even show up as a running process under 'top'. While it is churning away, new profiling results continue to be written to the profiling file (at which point the entries are all "system" / "WriteXLS"). I have also noticed this behavior with other system() calls; not just perl. I've also tried pipe(cmd, open = "r") as an alternative to system(cmd) and the result is the same. I have tried in interactive as well as non-interactive mode, and the issue occurs in both modes although seems to be more common in the latter. So, I'm wondering if perhaps Rprof() + system() not a recommended combination ? I did notice from ?Rprof that "the profiler interrupts R asynchronously"; though I do not know what that means, perhaps that is somehow relevant? This is running on R version 3.2.1 under linux (RHEL6). Any suggestions would be greatly appreciated. Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] non-terminal token lacking children from utils::getParseData
I tried to reduce the offending portion as best I could to a more-or-less minimal example (1136 bytes), which can be downloaded via: wget https://www.dropbox.com/s/74rgxr5x2aalr99/badstring.R then once in R, > b <- parse(file = "~/badstring.R", keep.source = TRUE) > d <- getParseData(b, includeText = FALSE) > subset(d, line1 == 2L) line1 col1 line2 col2 id parent token terminal 10 25241 10 21 exprFALSE > subset(d, parent == 10) [1] line1col1 line2col2 id parent token terminal <0 rows> (or 0-length row.names) here is my > sessionInfo() R version 3.0.2 (2013-09-25) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base note that while it says R version 3.0.2 above, I have seen the same behaviour under version 3.1.2 as well. Regards Ben On 02/19/2015 06:34 PM, Duncan Murdoch wrote: > On 19/02/2015 6:31 PM, B Tyner wrote: >> Hi, >> >> I have run across a source file for which the return value >> of getParseData() includes a record having FALSE for $terminal, yet it is >> not the parent of any other tokens. Before I spend time constructing a >> reproducible example, I wanted to verify that this is in fact unexpected >> behavior (under R 3.1.2)? > Before I spend the time thinking about that, I'd like to see a > reproducible example. > > Duncan Murdoch > > __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mboost: Proportional odds boosting model - how to specify the offset?
Dear Madlene, the problem that you observed was twofold. First, mboost expects the offset to be a scalar or a vector with length equal to the number of observations. However, fitted(p.iris) is a matrix. In PropOdds(), the linear or additive predictor is shared among all outcome categories and the thresholds are treated as nuisance parameter. What you need to supply as offset is the result of the linear or additive predictor (i.e., x'beta) instead of the fitted class probabilities. Second, there was a bug in mboost. I fixed it on R-forge [1]. If the package was successfully built use install.packages("mboost", repos="http://R-Forge.R-project.org";) to install it. You can also email to me off list. Then I will send you the package sources directly. Your nuisance parameters (which represent the class thresholds) can be extracted via nuisance(mlp). More details are given in the example below. Best, Benjamin [1] http://r-forge.r-project.org/projects/mboost/ Example code library(MASS) library(mboost) data(iris) iris$Species <- factor(iris$Species, ordered = T) p.iris <- polr(Species ~ Sepal.Length, data = iris) p.iris lm.iris <- glmboost(Species ~ Sepal.Length, data = iris, family = PropOdds(nuirange = c(-0.5, 3))) lm.iris[1000] ## thresholds: nuisance(lm.iris) ## to make these comparable to p.iris use nuisance(lm.iris) - coef(lm.iris)["(Intercept)"] - attr(coef(lm.iris), "offset") ## now use linear predictor as offset: mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width), data = iris, family = PropOdds(nuirange = c(0, 1)), offset = fitted(lm.iris)) Nussbaum Madlene wrote Dear R team The package mboost allows for boosting of proportional odds models. However, I would like to include an offset for every observation. This produces an error - no matter how I put the offset (as response probabilities or as response link). Fitting gamboost-models with offset works satisfactory with family = Gaussian() or Multinomial(). Questions: 1) How do I need to specify the offset with family = PropOdds()? 2) Where in the mboost-object do I find the Theta's (response category dependent intercept)? > > > > # --- minimal example with iris data --- > > library(MASS) > library(mboost) > > data(iris) > iris$Species <- factor(iris$Species, ordered = T) > p.iris <- polr(Species ~ Sepal.Length, data = iris) > mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width), >data = iris, family = PropOdds(), > offset = fitted(p.iris) ) > > Error in tmp[[i]] : subscript out of bounds > > > Thank you > M. Nussbaum > > -- > > ETH Zürich > Madlene Nussbaum > Institut für Terrestrische Ökosysteme > Boden- und Terrestrische Umweltphysik > CHN E 37.2 > Universitätstrasse 16 > CH-8092 Zürich > > Telefon + 44 632 73 21 > Mobile + 79 761 34 66 > madlene.nussbaum@.ethz > www.step.ethz.ch __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Having trouble with gdata read in
Trying to read and clean up the FERC data on Advanced Metering infrastructure. Of course it is in XLS for the first two survey years and then converts to XLSX for the final two. Bad enough that it is all in excel, they had to change the survey design and data format as well. Still, I’m sorting through it. However, when I try and read in the 2008 data, I’m getting this error: ### Wide character in print at /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/xls2csv.pl line 270. Warning message: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : EOF within quoted string ### Here is the code I’m running to get the data: ### install.packages("gdata") library("gdata") fileUrl <- "http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls"; download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") list.files("ami.data") dateDown.2008 <- date() ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, header=TRUE) ### Reviewed the data in the XLS file, and both “” and # are present within it. Don’t know how to get the read.xls to ignore them so I can read all the data into my data frame. Tried : ### ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", header=TRUE) ### And it spits out “More columns than column names” output. Been searching this, and I can find some “solutions” for read.table, but nothing specific to read.xls Many thanks, Benjamin Baker — Sent from Mailbox [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Having trouble with gdata read in
Anthony, XLSX won’t read an XLS file. Additionally, the legacy Java that is required for the xlsx package really effs up my computer. Have to reinstall my OS to fix it. — Sent from Mailbox On Wed, Mar 25, 2015 at 3:51 PM, Anthony Damico wrote: > maybe > library(xlsx) > tf <- tempfile() > ami <- " > http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls > " > download.file( ami , tf , mode = 'wb' ) > ami.data2008 <- read.xlsx( tf , sheetIndex = 1 ) > On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker wrote: >> Trying to read and clean up the FERC data on Advanced Metering >> infrastructure. Of course it is in XLS for the first two survey years and >> then converts to XLSX for the final two. Bad enough that it is all in >> excel, they had to change the survey design and data format as well. Still, >> I’m sorting through it. However, when I try and read in the 2008 data, I’m >> getting this error: >> ### >> Wide character in print at >> /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/ >> xls2csv.pl line 270. >> Warning message: >> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : >> EOF within quoted string >> ### >> >> >> >> Here is the code I’m running to get the data: >> ### >> install.packages("gdata") >> library("gdata") >> fileUrl <- " >> http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls >> " >> download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") >> list.files("ami.data") >> dateDown.2008 <- date() >> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, >> header=TRUE) >> ### >> >> >> Reviewed the data in the XLS file, and both “” and # are present within >> it. Don’t know how to get the read.xls to ignore them so I can read all the >> data into my data frame. Tried : >> ### >> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", >> header=TRUE) >> ### >> >> >> And it spits out “More columns than column names” output. >> >> >> Been searching this, and I can find some “solutions” for read.table, but >> nothing specific to read.xls >> >> >> Many thanks, >> >> >> Benjamin Baker >> >> >> >> — >> Sent from Mailbox >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Having trouble with gdata read in
Jim, Thanks, XLConnect with proper syntax works great for both types of files. — Sent from Mailbox On Thu, Mar 26, 2015 at 5:15 AM, jim holtman wrote: > My suggestion is to use XLConnect to read the file: >> x <- > "C:\\Users\\jh52822\\AppData\\Local\\Temp\\Rtmp6nVgFC\\file385c632aba3.xls" >> require(XLConnect) > Loading required package: XLConnect > Loading required package: XLConnectJars > XLConnect 0.2-10 by Mirai Solutions GmbH [aut], > Martin Studer [cre], > The Apache Software Foundation [ctb, cph] (Apache POI, Apache Commons > Codec), > Stephen Colebourne [ctb, cph] (Joda-Time Java library) > http://www.mirai-solutions.com , > http://miraisolutions.wordpress.com >> input <- f.readXLSheet(x, 1) >> >> str(input) > 'data.frame': 2266 obs. of 51 variables: > $ EIA : num 34 59 87 97 108 118 123 149 > 150 157 ... > $ Entity.Name : chr "City of Abbeville" "City of > Abbeville" "City of Ada" "Adams Electric Cooperative" ... > $ State: chr "SC" "LA" "MN" "IL" ... > $ NERC.Region : chr "SERC" "SPP" "MRO" "SERC" ... > $ Filing.Order : num 12 11 1237 392 252 ... > $ Q5.MultRegion: chr "" "" "" "" ... > $ Q6.OwnMeters.: chr "Yes" "Yes" "Yes" "Yes" ... > $ Q7.ResMeters : num 3051 4253 857 8154 33670 ... > $ Q7.ComMeters : num 531 972 132 155 1719 ... > $ Q7.IntMeters : num 0 19 32 NA 626 NA 29 0 2 NA > ... > $ Q7.TransMeters : num 0 NA NA NA NA NA NA 0 0 NA > ... > $ Q7.OtherMeters : num 0 NA NA 57 NA NA NA 0 0 NA > ... > $ Q7...total.meters: num 3582 5244 1021 8366 36015 ... > $ Q8.15Min.ResAMI : num 0 NA NA NA NA NA NA NA NA NA > ... > $ Q8.15Min.ComAMI : num 0 NA NA 155 NA NA NA NA NA > NA ... > $ Q8.15Min.IndAMI : num 0 NA NA NA NA NA NA NA NA NA > ... > $ Q8.15Min.TransAMI: num 0 NA NA NA NA NA NA NA NA NA > ... > $ Q8.15Min.OtherAMI: num 0 NA NA NA NA NA NA NA NA NA > ... > $ Q8.15Min.TotalAMI: num 0 0 0 155 0 0 0 0 0 0 ... > $ Q8.Hourly.ResAMI : num 0 NA NA NA 16100 NA NA NA NA > NA ... > $ Q8.Hourly.ComAMI : num 0 NA NA NA 1600 NA NA NA NA > NA ... > > Jim Holtman > Data Munger Guru > What is the problem that you are trying to solve? > Tell me what you want to do, not how you want to do it. > On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker wrote: >> Trying to read and clean up the FERC data on Advanced Metering >> infrastructure. Of course it is in XLS for the first two survey years and >> then converts to XLSX for the final two. Bad enough that it is all in >> excel, they had to change the survey design and data format as well. Still, >> I’m sorting through it. However, when I try and read in the 2008 data, I’m >> getting this error: >> ### >> Wide character in print at >> /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/ >> xls2csv.pl line 270. >> Warning message: >> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : >> EOF within quoted string >> ### >> >> >> >> Here is the code I’m running to get the data: >> ### >> install.packages("gdata") >> library("gdata") >> fileUrl <- " >> http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls >> " >> download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") >> list.files("ami.data") >> dateDown.2008 <- date() >> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, >> header=TRUE) >> ### >> >> >> Reviewed the data in the XLS file, and both “” and # are present within >> it. Don’t know how to get the read.xls to ignore them so I can read all the >> data into my data frame. Tried : >> ### >> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", >> header=TRUE) >> ### >> >> >> And it spits out “More columns than column names” output. >> &g
Re: [R] Having trouble with gdata read in
Jim, I’m not seeing the command f.readXLSheet in the documentation, nor is it executing in my code. — Sent from Mailbox On Thursday, Mar 26, 2015 at 5:15 AM, jim holtman , wrote: My suggestion is to use XLConnect to read the file: > x <- > "C:\\Users\\jh52822\\AppData\\Local\\Temp\\Rtmp6nVgFC\\file385c632aba3.xls" > require(XLConnect) Loading required package: XLConnect Loading required package: XLConnectJars XLConnect 0.2-10 by Mirai Solutions GmbH [aut], Martin Studer [cre], The Apache Software Foundation [ctb, cph] (Apache POI, Apache Commons Codec), Stephen Colebourne [ctb, cph] (Joda-Time Java library) http://www.mirai-solutions.com , http://miraisolutions.wordpress.com > input <- f.readXLSheet(x, 1) > > str(input) 'data.frame': 2266 obs. of 51 variables: $ EIA : num 34 59 87 97 108 118 123 149 150 157 ... $ Entity.Name : chr "City of Abbeville" "City of Abbeville" "City of Ada" "Adams Electric Cooperative" ... $ State : chr "SC" "LA" "MN" "IL" ... $ NERC.Region : chr "SERC" "SPP" "MRO" "SERC" ... $ Filing.Order : num 12 11 1237 392 252 ... $ Q5.MultRegion : chr "" "" "" "" ... $ Q6.OwnMeters. : chr "Yes" "Yes" "Yes" "Yes" ... $ Q7.ResMeters : num 3051 4253 857 8154 33670 ... $ Q7.ComMeters : num 531 972 132 155 1719 ... $ Q7.IntMeters : num 0 19 32 NA 626 NA 29 0 2 NA ... $ Q7.TransMeters : num 0 NA NA NA NA NA NA 0 0 NA ... $ Q7.OtherMeters : num 0 NA NA 57 NA NA NA 0 0 NA ... $ Q7...total.meters : num 3582 5244 1021 8366 36015 ... $ Q8.15Min.ResAMI : num 0 NA NA NA NA NA NA NA NA NA ... $ Q8.15Min.ComAMI : num 0 NA NA 155 NA NA NA NA NA NA ... $ Q8.15Min.IndAMI : num 0 NA NA NA NA NA NA NA NA NA ... $ Q8.15Min.TransAMI : num 0 NA NA NA NA NA NA NA NA NA ... $ Q8.15Min.OtherAMI : num 0 NA NA NA NA NA NA NA NA NA ... $ Q8.15Min.TotalAMI : num 0 0 0 155 0 0 0 0 0 0 ... $ Q8.Hourly.ResAMI : num 0 NA NA NA 16100 NA NA NA NA NA ... $ Q8.Hourly.ComAMI : num 0 NA NA NA 1600 NA NA NA NA NA ... Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker wrote: Trying to read and clean up the FERC data on Advanced Metering infrastructure. Of course it is in XLS for the first two survey years and then converts to XLSX for the final two. Bad enough that it is all in excel, they had to change the survey design and data format as well. Still, I’m sorting through it. However, when I try and read in the 2008 data, I’m getting this error: ### Wide character in print at /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/xls2csv.pl line 270. Warning message: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : EOF within quoted string ### Here is the code I’m running to get the data: ### install.packages("gdata") library("gdata") fileUrl <- "http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls"; download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") list.files("ami.data") dateDown.2008 <- date() ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, header=TRUE) ### Reviewed the data in the XLS file, and both “” and # are present within it. Don’t know how to get the read.xls to ignore them so I can read all the data into my data frame. Tried : ### ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", header=TRUE) ### And it spits out “More columns than column names” output. Been searching this, and I can find some “solutions” for read.table, but nothing specific to read.xls Many thanks, Benjamin Baker — Sent from Mailbox [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lattice: packet.number() versus panel.number()
Hi, According to https://svn.r-project.org/R-packages/trunk/lattice/R/print.trellis.R, "[panel.number] is usually the same as, but can be different from packet.number" and I had been under the impression that as long as the user is not using a custom index.cond nor perm.cond, the panel.number would in fact be the same as the packet.number. However, I have recently come across a case where the two are *not* the same, even though I am not using a custom index.cond nor perm.cond. So my question is, what might be some other possible situations in which the two would be expected to differ? Regards, Ben __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lattice: packet.number() versus panel.number()
Thanks David. It turns out that I was in fact using a custom index.perm, but not on purpose. What happened was I used the "[" method on the trellis object ( lattice:::`[.trellis`), which of course is nothing but a short-cut for updating the index.perm. Lesson learned... Regards, Ben On 08/27/2014 02:10 AM, David Winsemius wrote: > On Aug 26, 2014, at 3:25 PM, Benjamin Tyner wrote: > >> Hi, >> >> According to >> https://svn.r-project.org/R-packages/trunk/lattice/R/print.trellis.R, >> >>"[panel.number] is usually the same as, but can be different from >> packet.number" >> >> and I had been under the impression that as long as the user is not >> using a custom index.cond nor perm.cond, the panel.number would in fact >> be the same as the packet.number. >> >> However, I have recently come across a case where the two are *not* the >> same, even though I am not using a custom index.cond nor perm.cond. >> >> So my question is, what might be some other possible situations in which >> the two would be expected to differ? > The immediate hypothesis that leaps to mind is cases where there are multiple > pages. On each page I suspect the upper left numbering restarts with 1, but I > suspect the packet numbers are sequential increasing. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] from cut.Date
Hello, I'm wondering if this is expected? > cut(structure(1, class="Date"), structure(c(11100,1), class="Date")) [1] Levels: 2000-05-23 The help page says that "for ‘"Date"’ objects, only ‘"day"’, ‘"week"’, ‘"month"’, ‘"quarter"’ and ‘"year"’ are allowed" [for the 'breaks' argument]. Though I am not sure whether this statement is only applicable in the context of the previous sentence about interval specification (i.e., a roundabout way of saying that ‘"sec"’, ‘"min"’, ‘"hour"’, and ‘"DSTday"’ are not allowed for 'Date' objects), or whether it also means that a vector of cut points (as in my example) is likewise not allowed? If the latter, then perhaps the function out to error out rather than return in this case? Regards Ben __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] from cut.Date
Thanks Brian! The confusion was due to my failure to notice that these two functions have opposite defaults for 'right': > args(cut.default) function (x, breaks, labels = NULL, include.lowest = FALSE, right = TRUE, dig.lab = 3L, ordered_result = FALSE, ...) > args(cut.Date) function (x, breaks, labels = NULL, start.on.monday = TRUE, right = FALSE, ...) I suppose the latter does make sense, given that days, months, years etc are right-continuous functions of time. Regards Ben > On 17/09/2014 12:04, Benjamin Tyner wrote: > >/ Hello, > />/ > />/ I'm wondering if this is expected? > / > It is as documented! > >/ > />/ > cut(structure(1, class="Date"), structure(c(11100,1), > />/ class="Date")) > />/ [1] > />/ Levels: 2000-05-23 > />/ > />/ The help page says that "for ‘"Date"’ objects, only ‘"day"’, ‘"week"’, > />/ ‘"month"’, ‘"quarter"’ and ‘"year"’ are allowed" [for the 'breaks' > />/ argument]. Though I am not sure whether this statement is only > />/ applicable in the context of the previous sentence about interval > />/ specification (i.e., a roundabout way of saying that ‘"sec"’, ‘"min"’, > />/ ‘"hour"’, and ‘"DSTday"’ are not allowed for 'Date' objects), or whether > />/ it also means that a vector of cut points (as in my example) is likewise > />/ not allowed? If the latter, then perhaps the function out to error out > />/ rather than return in this case? > / > The NA is correct: the value you pass is not covered by the 'breaks' you > specified. As the help says > > Using both ‘right = TRUE’ and ‘include.lowest = TRUE’ will include > both ends of the range of dates. > > With the default values, only the lower end is included. > > -- > Brian D. Ripley, ripley at stats.ox.ac.uk > <https://stat.ethz.ch/mailman/listinfo/r-help> > Emeritus Professor of Applied Statistics, University of Oxford > 1 South Parks Road, Oxford OX1 3TG, UK __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] scaling of Amat still recommended for quadprog::solve.QP ?
Greetings, I ran across this recommendation, to keep the norms of the columns of the Amat on similar scales, https://stat.ethz.ch/pipermail/r-help/2007-September/141335.html However, when I looked at the code, I noticed that the norms are already being calculated: c c calculate the norm of each column of the A matrix c do 51 i=1,q sum = 0.d0 do 52 j=1,n sum = sum + amat(j,i)*amat(j,i) 52 continue work(iwnbv+i) = sqrt(sum) 51 continue nact = 0 iter(1) = 0 iter(2) = 0 50 continue though I am not sure exactly how these norms get used subsequently. My question is, is it no longer necessary to follow Berwin's recommendation from 2007? Or are the norms being calculated for some other purpose, and the recommendation still applies? Regards Ben __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] quadprog::solve.QP sometimes returns NaNs
Hello, Here is an example; hopefully it is reproducible on others' platform: library(quadprog) n <- 66L set.seed(6860) X <- matrix(1e-20, n, n) diag(X) <- 1 Dmat <- crossprod(X) y <- seq_len(n) dvec <- crossprod(X, y) Amat <- diag(n) bvec <- y + runif(n) sol <- solve.QP(Dmat, dvec, Amat, bvec, meq = n) print(sol$solution) # this gives all NaNs under sessionInfo(): R version 3.0.2 (2013-09-25) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] quadprog_1.5-5 Any ideas? Thanks Ben __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] backsolve, chol, Matrix, and SparseM
Hi I have some code which does (on a symmetric matrix 'x') backsolve(chol(x), diag(nrow(x))) and I am wondering what is the recommended way to accomplish this when x is also sparse (from package:Matrix). I know that package:Matrix provides a chol method for such matrices, but not a backsolve method. On the other hand, package:SparseM does provide a backsolve method, but doesn't actually return a sparse matrix. Moreover, I am a little hesitant to use SparseM, as the vignette seems to be from 2003. I did notice that help(topic = "solve", package = "Matrix") says "In ‘solve(a,b)’ in the ‘Matrix’ package, ‘a’ may also be a ‘MatrixFactorization’ instead of directly a matrix." which makes me think this is the right way: Matrix::solve(Cholesky(x), .sparseDiagonal(nrow(x))) but unfortunately this didn't give the same result as: Matrix::solve(chol(x), .sparseDiagonal(nrow(x))) so I'm asking here in case someone has any suggestions. Regards, Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] backsolve, chol, Matrix, and SparseM
Hi Martin, Thanks for the remarks and examples, and for confirming that I was indeed barking up the wrong tree with SparseM. A. I assume that is a typo and you meant to say, no need for backsolve(). B. Absolutely; however, in this case I am taking advantage of quadprog::solve.QP(..., factorized = TRUE) which requires the inverse of the Cholesky factor; it turns out to be faster to compute this one time upfront rather than have solve.QP(..., factorized = FALSE) do it over and over again. Of course the holy grail would be a QP solver which takes advantage of the various innovations from package:Matrix, but I digress... C. Agreed, assuming you are talking about Matrix::solve(X) on X of class Matrix. On the other hand for a regular matrix x it is not difficult to construct examples where backsolve(chol(x), diag(nrow(x))) is twice as fast as base::solve(chol(x)), which led me down this path in the first place. By the way, is R-forge still the correct place to report bugs in package:Matrix? Regards Ben On 09/25/2015 04:25 AM, Martin Maechler wrote: > Dear Ben, > >>>>>> Benjamin Tyner >>>>>> on Thu, 24 Sep 2015 13:47:58 -0400 writes: > > Hi I have some code which does (on a symmetric matrix 'x') > > > backsolve(chol(x), diag(nrow(x))) > > > and I am wondering what is the recommended way to > > accomplish this when x is also sparse (from > > package:Matrix). I know that package:Matrix provides a > > chol method for such matrices, but not a backsolve > > method. On the other hand, package:SparseM does provide a > > backsolve method, but doesn't actually return a sparse > > matrix. Moreover, I am a little hesitant to use SparseM, > > as the vignette seems to be from 2003. > > Roger Koenker has agreed in the past, that new projects should > rather use Matrix. SparseM has been the very first R package > providing sparse matrix support. > > > > I did notice that help(topic = "solve", package = > > "Matrix") says "In ‘solve(a,b)’ in the ‘Matrix’ package, > > ‘a’ may also be a ‘MatrixFactorization’ instead of > > directly a matrix." which makes me think this is the right > > way: > > > Matrix::solve(Cholesky(x), .sparseDiagonal(nrow(x))) > > > but unfortunately this didn't give the same result as: > > > Matrix::solve(chol(x), .sparseDiagonal(nrow(x))) > > > so I'm asking here in case someone has any suggestions. > > You don't give any examples. > So a few remarks and a reproducible example to get more concrete > > A. As the Matrix package has classes for triangular matrices and > Matrix :: chol() returns them, there is no need for > forwardsolve() or backwardsolve(), as just solve() is always > enough. > > B. As Doug Bates has been teaching for many decennia, "it is > almost always computationally *wrong* to compute a matrix > inverse explicitly". > Rather computeA^{-1} B or A^{-1} x {for vector x, > matrix B (but different from Identity). > > C. Inspite of B, there are cases (such as computing sandwich > estimates of covariance matrices) where you do want the inverse. > In that case, > >solve(A) is semantically equivalent to >solve(A, diag(.)) > >and almost always the *first* form is implempented more >efficiently than the second. > > D. In Matrix, use chol(.) ... unless you really read a bit >about Cholesky(.) and its special purpose sparse cholesky decompositions. >As mentioned above, Matrix :: chol() will return a >"formally triangular" matrix, i.e., inheriting from >"triangularMatrix"; in the sparse case, very typically of >specific class "dtCMatrix". > > Here's a small reproducible example, > please use it to ask further questions: > > *.R: > > library(Matrix) > M <- as(diag(4)+1,"dsCMatrix") > m <- as(M, "matrix") # -> traditional R matrix > stopifnot( all(M == m) ) > M > L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper") > (L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically > "dtCMatrix") > (cM <- chol(M))# *upper* triagonal ("dtCMatrix") > (cm <- chol(m))# upper triagonal traditional matrix -- the same "of course" > : > all.equal(as.matrix(cM), cm) # TRUE > > (r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix > (R. <- solve(cM) ) ## the "same" (but nicer printing) > all.equal
[R] efficiently multiply each column of a sparse Matrix by a sparse vector
Hi, Say I have a sparse Matrix X, and a sparse vector (stored as a 1-column sparse Matrix A), with X and A having the same number of rows, and I wish to multiply each column of X by A, but would like the operation to take full advantage of the sparseness of both X and A. In other words I want the result to be another sparse Matrix but not having any zeros calculated or stored unnecessarily. For concreteness, library(Matrix) set.seed(6860) X <- sparseMatrix(i = sample(1:10, 5L), j = sample(1:10, 5L), x = rep(1, 5), dims = c(10L, 10L) ) A <- sparseMatrix(i = sample(1:10, 5L), j = rep(1L, 5L), x = rep(1, 5), dims = c(10L, 1L) ) and observe that print(X * A[, 1L, drop=TRUE]) gives the following, in which three 0s are not represented sparsely, 10 x 10 sparse Matrix of class "dgCMatrix" [1,] . . . . . . . . . . [2,] . . 1 . . . . . . . [3,] . . . . . . . . . . [4,] . . . . . 0 . . . . [5,] . . . . . . 0 . . . [6,] . 1 . . . . . . . . [7,] . . . . . . . . . . [8,] . . . . . . . . . . [9,] . . . . . . . . . . [10,] 0 . . . . . . . . . in other words I am wondering if there is a more efficient way to arrive at the same result as, print(X * A[, rep(1L, ncol(X)), drop=FALSE]) 10 x 10 sparse Matrix of class "dgCMatrix" [1,] . . . . . . . . . . [2,] . . 1 . . . . . . . [3,] . . . . . . . . . . [4,] . . . . . . . . . . [5,] . . . . . . . . . . [6,] . 1 . . . . . . . . [7,] . . . . . . . . . . [8,] . . . . . . . . . . [9,] . . . . . . . . . . [10,] . . . . . . . . . . without the additional overhead of duplicating A for ncol(X) times. This seems like such a simple thing, but has me stumped. Any ideas? Regards, Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] choropleth packages (US)
Hi I wish to draw a basic choropleth (US, by state) and am wondering if anyone has any recommendations? I've tried the following thus far: 1. choroplethr: this works, but required installation of 30+ dependencies. I would prefer something with fewer dependencies. 2. tmap: this also seems promising, but most of the examples I saw were specific to European maps. Can it be adapted for US? 3. statebins: doesn't draw true choropleths, but I liked that it doesn't have many dependencies. Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] choropleth packages (US)
Very nice Adrian. Is there a straightforward way to add Alaska and Hawaii at the lower left? (without resorting to choroplethr package) On 12/10/2015 06:09 AM, Adrian Waddell wrote: Hi, You can also use the 'maps' package for the map data and the 'scales' package for the color mapping. E.g. library(maps) library(scales) m <- map('state', fill=TRUE, plot=FALSE) s_data <- tolower(rownames(USArrests)) s_map <- tolower(m$names) mapping <- lapply(s_data, function(state) { which(grepl(state, s_map)) }) ## check if the mapping is good! col_pal <- col_numeric("Greens", domain=NULL, na.color = 'lightyellow') cols <- rep('lightyellow', length(s_data)) Map(function(indices, col) { cols[indices] <<- col }, mapping, col_pal(USArrests$UrbanPop)) map(m, col=cols, fill=TRUE) Adrian On Mon, Dec 7, 2015 at 9:34 AM, Erich Neuwirth wrote: ggplot2 also can do this with fortify geom_polygon Von meinem iPad gesendet Am 06.12.2015 um 21:03 schrieb Benjamin Tyner : Hi I wish to draw a basic choropleth (US, by state) and am wondering if anyone has any recommendations? I've tried the following thus far: 1. choroplethr: this works, but required installation of 30+ dependencies. I would prefer something with fewer dependencies. 2. tmap: this also seems promising, but most of the examples I saw were specific to European maps. Can it be adapted for US? 3. statebins: doesn't draw true choropleths, but I liked that it doesn't have many dependencies. Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R_DirtyImage and Rprof
Hello, I have some code which was running in interactive mode while Rprof(..., line.profiling = TRUE). Near the end of my script, it opens up a pipe(..., open = "w") to a perl script, and at that point the execution gets stuck using 100% cpu. (The perl script itself never showed up in pstree, as far as I can tell). I did a "tail -f" on the file being written to by Rprof, and it was reporting "sys.save.image" over and over, and in fact an ".RData" file appeared when I had not asked for one, and I was able to load it later. This got me curious, as nowhere in my code do I directly use that function. Looking through the source code for R, it appears that "sys.save.image" is called whenever an R_DirtyImage condition is triggered. This was using R version 3.2.2 under RHEL. My efforts to create a reproducible example of this behavior have thus far been unsuccessful. My questions: is there any documentation for R_DirtyImage, and how plausible is it that the R_DirtyImage condition was triggered by something Rprof did? The reason for my conjecture is that sys.save.image() calls closeAllConnections(), which I imagine might have interfered with the pipe that was open for writing, thus causing the stuck execution at that point. If so, any advice for avoiding the R_DirtyImage condition while profiling? If not, any conjectures for what might actually be going on? For what it's worth, I have observed a similar situation when using Rprof + system() instead of pipe(); for example: https://stat.ethz.ch/pipermail/r-help/2015-August/431286.html Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Significance of Svyrepdesign Object Warning
Hello R Users, I am using Lumley's Survey Package in R to analyze complex survey data that involves 200 balanced repeated replicate (BRR) weight variables. I have ensured that my svyrepdesign object that specifies the application of the BRR weights to the data set is accurate and I have matched the published standard errors of the data set. When doing a logistic regression through the svyglm call, I receive the following warning: In object$survey.design$pweights * presid^2 : longer object length is not a multiple of shorter object length? I have search around quite a bit online and have not been able to find any good interpretation of its meaning. I want to be sure that I am not making some type of mistake that is causing this warning to be produced. Any advisement is greatly appreciated. The following is an MRE that can be pasted into the R console: library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr #Logistic regression call which yields a warning regarding svyrepdesign object svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) allCC <- summary(svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)) allCC #Session Info #R version 3.3.1 (2016-06-21) #Platform: x86_64-w64-mingw32/x64 (64-bit) #Running under: Windows >= 8 x64 (build 9200) #locale: # [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 #[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C #[5] LC_TIME=English_United States.1252 #attached base packages: # [1] grid stats graphics grDevices utils datasets methods base #other attached packages: #[1] survey_3.31-2 survival_2.39-4 Matrix_1.2-6RCurl_1.95-4.8 bitops_1.0-6 #loaded via a namespace (and not attached): #[1] tools_3.3.1 splines_3.3.1 knitr_1.14 lattice_0.20-33 Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Significance of Svyrepdesign Object Warning
Thank you for your help. I did try Anthony's recommendation of removing the 'na.action=na.exclude' ; I thought I needed that argument as the data set includes NA values. I found it interesting that without the 'na.action=na.exclude' argument, the baseline level of two of my predictor variables (BYINCOME & F1HIMATH) were changed. Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: William Dunlap Sent: Sunday, October 23, 2016 2:24 PM To: Anthony Damico Cc: Courtney Benjamin; r-help@r-project.org; Thomas Lumley Subject: Re: [R] Significance of Svyrepdesign Object Warning The immediate problem could be solved by changing the following lines in survey:::summary.svrepglm from presid <- resid(object, "pearson") dispersion <- sum(object$survey.design$pweights * presid^2, na.rm = TRUE)/sum(object$survey.design$pweights) to presid <- resid(object, "pearson") pweights <- naresid(object$na.action, object$survey.design$pweights) dispersion <- sum(pweights * presid^2, na.rm = TRUE)/sum(pweights, na.rm = TRUE) 'naresid' uses the information from na.exclude to match up the residuals with the row in the data that they correspond to. resid() calls it so it should also be applied to pweights so they line up correctly. Bill Dunlap TIBCO Software wdunlap tibco.com<http://tibco.com> On Sun, Oct 23, 2016 at 11:17 AM, Anthony Damico mailto:ajdam...@gmail.com>> wrote: hi, great example. i am ccing survey package author/maintainer dr. lumley. why do you have `na.action=na.exclude`? if you remove it, things work as expected-- library(RCurl) library(survey) data <- getURL(" https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv ") elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr #Logistic regression call which yields a warning regarding svyrepdesign object # your warning a <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) summary(a) # works fine a <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1) summary(a) the mismatch of vectors generating that warning happens inside debug(survey:::summary.svrepglm) [..snip..] Browse[2]> length(presid) [1] 12614 Browse[2]> length(object$survey.design$pweights) [1] 8397 and including vs excluding the na.action=na.exclude gives you a slightly different dispersion parameter calculation (Dispersion parameter for binomial family taken to be 0.7756235) (Dispersion parameter for binomial family taken to be 0.7849244) not sure if the two survey:::residuals.sv<http://residuals.sv>* methods should deal with the na.action= parameter? thanks On Sun, Oct 23, 2016 at 11:56 AM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: > Hello R Users, > > I am using Lumley's Survey Package in R to analyze complex survey data > that involves 200 balanced repeated replicate (BRR) weight variables. I > have ensured that my svyrepdesign object that specifies the application of > the BRR weights to the data set is accurate and I have matched the > published standard errors of the data set. > > When doing a logistic regression through the svyglm call, I receive the > following warning: > > In object$survey.design$pweights * presid^2 : > longer object length is not a multiple of shorter object length? > I have search around quite a bit online and have not been able to find any > good interpretation of its meaning. I want to be sure that I am not making > some type of mistake that is causing this warning to be produced. Any > advisement is greatly appreciated. > The following is an MRE that can be pasted into the R console: > library(RCurl) > library(survey) > data <- getURL("https://raw.githubusercontent.com/ > cbenjamin1821/careertech-ed/master/elsq1adj.csv") > elsq1ch <- read.csv(text = data) > #Specifying the svyrepdesign object which applies the BRR weights > elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = > elsq1ch[,18:217], weights = elsq1ch[,17]
Re: [R] Significance of Svyrepdesign Object Warning
Hello Mr. Dunlap, I have gone back and re-read the responses to my question. I am interested in trying to apply your recommendation so I am doing things correctly; however I am not sure how to go about doing it within my code. It appears that you are digging quite deeply into R where I am not yet familiar. I am including a reproducible example; would you be willing to show an example of how it would be done? I greatly appreciate your advisement and time. Sincerely, Courtney library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr #Logistic regression call which yields a warning regarding svyrepdesign object allCC <-svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) summary(allCC) Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: William Dunlap Sent: Sunday, October 23, 2016 2:24 PM To: Anthony Damico Cc: Courtney Benjamin; r-help@r-project.org; Thomas Lumley Subject: Re: [R] Significance of Svyrepdesign Object Warning The immediate problem could be solved by changing the following lines in survey:::summary.svrepglm from presid <- resid(object, "pearson") dispersion <- sum(object$survey.design$pweights * presid^2, na.rm = TRUE)/sum(object$survey.design$pweights) to presid <- resid(object, "pearson") pweights <- naresid(object$na.action, object$survey.design$pweights) dispersion <- sum(pweights * presid^2, na.rm = TRUE)/sum(pweights, na.rm = TRUE) 'naresid' uses the information from na.exclude to match up the residuals with the row in the data that they correspond to. resid() calls it so it should also be applied to pweights so they line up correctly. Bill Dunlap TIBCO Software wdunlap tibco.com<http://tibco.com> On Sun, Oct 23, 2016 at 11:17 AM, Anthony Damico mailto:ajdam...@gmail.com>> wrote: hi, great example. i am ccing survey package author/maintainer dr. lumley. why do you have `na.action=na.exclude`? if you remove it, things work as expected-- library(RCurl) library(survey) data <- getURL(" https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv ") elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr #Logistic regression call which yields a warning regarding svyrepdesign object # your warning a <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) summary(a) # works fine a <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1) summary(a) the mismatch of vectors generating that warning happens inside debug(survey:::summary.svrepglm) [..snip..] Browse[2]> length(presid) [1] 12614 Browse[2]> length(object$survey.design$pweights) [1] 8397 and including vs excluding the na.action=na.exclude gives you a slightly different dispersion parameter calculation (Dispersion parameter for binomial family taken to be 0.7756235) (Dispersion parameter for binomial family taken to be 0.7849244) not sure if the two survey:::residuals.sv<http://residuals.sv>* methods should deal with the na.action= parameter? thanks On Sun, Oct 23, 2016 at 11:56 AM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: > Hello R Users, > > I am using Lumley's Survey Package in R to analyze complex survey data > that involves 200 balanced repeated replicate (BRR) weight variables. I > have ensured that my svyrepdesign object that specifies the application of > the BRR weights to the data set is accurate and I have matched the > published standard errors of the data set. > > When doing a logistic regression through the svyglm call, I receive the > following warning: > > In object$survey.des
Re: [R] Significance of Svyrepdesign Object Warning
Thank you; I will do so. Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: William Dunlap Sent: Thursday, October 27, 2016 9:59 PM To: Courtney Benjamin Cc: Anthony Damico; r-help@r-project.org; Thomas Lumley Subject: Re: [R] Significance of Svyrepdesign Object Warning For, now I would just use na.action=na.omit instead of na.exclude. My comments were mainly for the package author. Bill Dunlap TIBCO Software wdunlap tibco.com<http://tibco.com> On Thu, Oct 27, 2016 at 5:53 PM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: Hello Mr. Dunlap, I have gone back and re-read the responses to my question. I am interested in trying to apply your recommendation so I am doing things correctly; however I am not sure how to go about doing it within my code. It appears that you are digging quite deeply into R where I am not yet familiar. I am including a reproducible example; would you be willing to show an example of how it would be done? I greatly appreciate your advisement and time. Sincerely, Courtney library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr #Logistic regression call which yields a warning regarding svyrepdesign object allCC <-svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) summary(allCC) Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: William Dunlap mailto:wdun...@tibco.com>> Sent: Sunday, October 23, 2016 2:24 PM To: Anthony Damico Cc: Courtney Benjamin; r-help@r-project.org<mailto:r-help@r-project.org>; Thomas Lumley Subject: Re: [R] Significance of Svyrepdesign Object Warning The immediate problem could be solved by changing the following lines in survey:::summary.svrepglm from presid <- resid(object, "pearson") dispersion <- sum(object$survey.design$pweights * presid^2, na.rm = TRUE)/sum(object$survey.design$pweights) to presid <- resid(object, "pearson") pweights <- naresid(object$na.action, object$survey.design$pweights) dispersion <- sum(pweights * presid^2, na.rm = TRUE)/sum(pweights, na.rm = TRUE) 'naresid' uses the information from na.exclude to match up the residuals with the row in the data that they correspond to. resid() calls it so it should also be applied to pweights so they line up correctly. Bill Dunlap TIBCO Software wdunlap tibco.com<http://tibco.com> On Sun, Oct 23, 2016 at 11:17 AM, Anthony Damico mailto:ajdam...@gmail.com>> wrote: hi, great example. i am ccing survey package author/maintainer dr. lumley. why do you have `na.action=na.exclude`? if you remove it, things work as expected-- library(RCurl) library(survey) data <- getURL(" https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv ") elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr #Logistic regression call which yields a warning regarding svyrepdesign object # your warning a <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) summary(a) # works fine a <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1) summary(a) the mismatch of vectors generating that warning happens inside debug(survey:::summary.svrepglm) [..snip..] Browse[2]> length(presid) [1] 12614 Browse[2]> length(object$survey.design$pweights) [1] 8397 and including vs excluding the na.action=na.exclude gives you a slightly different dispersion parameter calculation (Di
[R] Resetting Baseline Level of Predictor in svyglm Function
Hello R Users: I am using the survey package in R for modeling with complex survey data. I am trying to reset the baseline level of certain predictor variables being used in a logistic regression without success. The following is a reproducible example: library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr #Log. Reg. model allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit) summary(allCC) ##Attempting to reset baseline level for predictor variable #Both attempts did not work elsq1ch$F1HIMATH <- C(elsq1ch$F1HIMATH,contr.treatment, base=1) elsq1ch$F1HIMATH <- relevel(elsq1ch$F1HIMATH,"PreAlg or Less") #Log. Reg. model with no changes in baseline levels for the predictors allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit) summary(allCC) Any guidance is greatly appreciated.? Sincerely, Courtney? Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Resetting Baseline Level of Predictor in svyglm Function
Thank you, Anthony; it worked flawlessly.? Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: Anthony Damico Sent: Tuesday, November 1, 2016 3:20 AM To: Courtney Benjamin Cc: r-help@r-project.org Subject: Re: [R] Resetting Baseline Level of Predictor in svyglm Function hi, i think you want elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or Less") ) On Mon, Oct 31, 2016 at 9:05 PM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: Hello R Users: I am using the survey package in R for modeling with complex survey data. I am trying to reset the baseline level of certain predictor variables being used in a logistic regression without success. The following is a reproducible example: library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr #Log. Reg. model allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit) summary(allCC) ##Attempting to reset baseline level for predictor variable #Both attempts did not work elsq1ch$F1HIMATH <- C(elsq1ch$F1HIMATH,contr.treatment, base=1) elsq1ch$F1HIMATH <- relevel(elsq1ch$F1HIMATH,"PreAlg or Less") #Log. Reg. model with no changes in baseline levels for the predictors allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit) summary(allCC) Any guidance is greatly appreciated.? Sincerely, Courtney? Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org><mailto:cbenj...@btboces.org<mailto:cbenj...@btboces.org>> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression
?Hello R Experts, I am trying to implement the Archer-Lemeshow GOF Test for survey data on a logistic regression model using the survey package based upon an R Help Archive post that I found where Dr. Thomas Lumley advised how to do it: http://r.789695.n4.nabble.com/Goodness-of-t-tests-for-Complex-Survey-Logistic-Regression-td4668233.html Everything is going well until I get to the point where I have to add the objects 'r' and 'g' as variables to the data frame by either using the transform function or the update function to update the svrepdesign object. The log. regression model involved uses a subset of data and some of the values in the data frame are NA, so that is affecting my ability to add 'r' and 'g' as variables; I am getting an error because I only have 8397 rows for the new variables and 16197 in the data frame and svrepdesign object. I am not sure how to overcome this error. The following is a MRE: ##Archer Lemeshow Goodness of Fit Test for Complex Survey Data with Logistic Regression library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr ##Resetting baseline levels for predictors elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or Less") ) elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") ) elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") ) elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") ) elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Academic") ) #Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit) summary(allCC) #Recommendations from Lumley (from R Help Archive) on implementing the Archer Lemeshow GOF test r <- residuals(allCC, type="response") f<-fitted(allCC) g<- cut(f, c(-Inf, quantile(f, (1:9)/10, Inf))) # now create a new design object with r and g added as variables #This is the area where I am having problems as my model involves a subset and some values are NA as well #I am also not sure if I am naming/specifying the new variables of r and g properly transform(elsq1ch,r=r,g=g) elsq1ch_brr <- update(elsq1ch_brr,tag=g,tag=r) #then: decilemodel<- svyglm(r~g, design=newdesign) regTermTest(decilemodel, ~g) #is the F-adjusted mean residual test from the Archer Lemeshow paper Thank you, Courtney ? Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression
Thank you, Anthony. Your approach does work; however, I am concerned to some degree about subsetting the data prior to creating the new svrepdesign as I know that it is not recommended to subset the data prior to creating the svrepdesign object. I am not sure if this is a significant concern in this context as the model was fitted with the original svrepdesign that was created prior to subsetting any data and the new svrepdesign is being used to run the diagnostic for the model. Any thoughts on that issue? Also, from my understanding of the outcome of the diagnostic, low p values indicate a poor model fit. Sincerely, Courtney Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: Anthony Damico Sent: Thursday, November 17, 2016 4:28 AM To: Courtney Benjamin Cc: r-help@r-project.org Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression great minimal reproducible example, thanks. does something like this work? #Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) summary(allCC) r <- residuals(allCC, type="response") f<-fitted(allCC) your_g<- cut(f, c(-Inf, quantile(f, (1:9)/10, Inf))) elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'your_g' ] <- your_g elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'r' ] <- r newdesign<-svrepdesign(variables = elsq1ch, repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") decilemodel<- svyglm(r~your_g, design=newdesign,subset=BYSCTRL==1&G10COHRT==1) regTermTest(decilemodel, ~your_g) On Wed, Nov 16, 2016 at 10:15 PM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: ?Hello R Experts, I am trying to implement the Archer-Lemeshow GOF Test for survey data on a logistic regression model using the survey package based upon an R Help Archive post that I found where Dr. Thomas Lumley advised how to do it: http://r.789695.n4.nabble.com/Goodness-of-t-tests-for-Complex-Survey-Logistic-Regression-td4668233.html Everything is going well until I get to the point where I have to add the objects 'r' and 'g' as variables to the data frame by either using the transform function or the update function to update the svrepdesign object. The log. regression model involved uses a subset of data and some of the values in the data frame are NA, so that is affecting my ability to add 'r' and 'g' as variables; I am getting an error because I only have 8397 rows for the new variables and 16197 in the data frame and svrepdesign object. I am not sure how to overcome this error. The following is a MRE: ##Archer Lemeshow Goodness of Fit Test for Complex Survey Data with Logistic Regression library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr ##Resetting baseline levels for predictors elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or Less") ) elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") ) elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") ) elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") ) elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Academic") ) #Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit) summary(allCC) #Recommendations from Lumley (from R Help Archive) on implementing the Archer Lemeshow GOF test r <- residuals(allCC, type="response") f<-fitted(allCC) g<- cut(f, c(-Inf, quantile(f, (1:9)/10, Inf))) # now create a new design object with r and g added as variables #This is the area where I am having problems as my model involves a subset and some values are NA as well #I am also not sure if I am naming/specifying the new variables of r and g properly transform(elsq1ch,r=r,g=g) elsq1ch_brr <- update(els
Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression
Thank you; I appreciate your advisement. Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: Anthony Damico Sent: Friday, November 18, 2016 5:15 AM To: Courtney Benjamin Cc: r-help@r-project.org Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression hi, my code does not subset the survey design on the line that creates the svrepdesign(). subsetting in order to create a variable while your data is still a data.frame is probably okay, so long as you expect the observations outside of the subset to be NAs like they are in this case. nrow(elsq1ch_brr)==nrow(newdesign) On Fri, Nov 18, 2016 at 5:06 AM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: Thank you, Anthony. Your approach does work; however, I am concerned to some degree about subsetting the data prior to creating the new svrepdesign as I know that it is not recommended to subset the data prior to creating the svrepdesign object. I am not sure if this is a significant concern in this context as the model was fitted with the original svrepdesign that was created prior to subsetting any data and the new svrepdesign is being used to run the diagnostic for the model. Any thoughts on that issue? Also, from my understanding of the outcome of the diagnostic, low p values indicate a poor model fit. Sincerely, Courtney Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: Anthony Damico mailto:ajdam...@gmail.com>> Sent: Thursday, November 17, 2016 4:28 AM To: Courtney Benjamin Cc: r-help@r-project.org<mailto:r-help@r-project.org> Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression great minimal reproducible example, thanks. does something like this work? #Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) summary(allCC) r <- residuals(allCC, type="response") f<-fitted(allCC) your_g<- cut(f, c(-Inf, quantile(f, (1:9)/10, Inf))) elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'your_g' ] <- your_g elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'r' ] <- r newdesign<-svrepdesign(variables = elsq1ch, repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") decilemodel<- svyglm(r~your_g, design=newdesign,subset=BYSCTRL==1&G10COHRT==1) regTermTest(decilemodel, ~your_g) On Wed, Nov 16, 2016 at 10:15 PM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: ?Hello R Experts, I am trying to implement the Archer-Lemeshow GOF Test for survey data on a logistic regression model using the survey package based upon an R Help Archive post that I found where Dr. Thomas Lumley advised how to do it: http://r.789695.n4.nabble.com/Goodness-of-t-tests-for-Complex-Survey-Logistic-Regression-td4668233.html Everything is going well until I get to the point where I have to add the objects 'r' and 'g' as variables to the data frame by either using the transform function or the update function to update the svrepdesign object. The log. regression model involved uses a subset of data and some of the values in the data frame are NA, so that is affecting my ability to add 'r' and 'g' as variables; I am getting an error because I only have 8397 rows for the new variables and 16197 in the data frame and svrepdesign object. I am not sure how to overcome this error. The following is a MRE: ##Archer Lemeshow Goodness of Fit Test for Complex Survey Data with Logistic Regression library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr ##Resetting baseline levels for predictors elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or Less") ) elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") ) elsq1ch_brr <- update( elsq1ch_brr , F1R
Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression
Like I had said, I was concerned with the low p value outcome of this diagnostic test, so I went on to test other interactions that I had not previously tested. I did find an interaction that is significant. Now when I go to run the diagnostic to check for any improvement, I am coming up with an error. ##Testing with interaction model allCCI12 <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC+F1HIMATH*F1RGPP2,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) summary(allCCI12) r <- residuals(allCCI12, type="response") f<-fitted(allCCI12) your_g<- cut(f, c(-Inf, quantile(f, (1:9)/10, Inf))) elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'your_g' ] <- your_g elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'r' ] <- r newdesign<-svrepdesign(variables = elsq1ch, repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") decilemodel<- svyglm(r~your_g, design=newdesign,subset=BYSCTRL==1&G10COHRT==1) regTermTest(decilemodel, ~your_g) Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 From: Anthony Damico Sent: Friday, November 18, 2016 5:15 AM To: Courtney Benjamin Cc: r-help@r-project.org Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression hi, my code does not subset the survey design on the line that creates the svrepdesign(). subsetting in order to create a variable while your data is still a data.frame is probably okay, so long as you expect the observations outside of the subset to be NAs like they are in this case. nrow(elsq1ch_brr)==nrow(newdesign) On Fri, Nov 18, 2016 at 5:06 AM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: Thank you, Anthony. Your approach does work; however, I am concerned to some degree about subsetting the data prior to creating the new svrepdesign as I know that it is not recommended to subset the data prior to creating the svrepdesign object. I am not sure if this is a significant concern in this context as the model was fitted with the original svrepdesign that was created prior to subsetting any data and the new svrepdesign is being used to run the diagnostic for the model. Any thoughts on that issue? Also, from my understanding of the outcome of the diagnostic, low p values indicate a poor model fit. Sincerely, Courtney Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 ____ From: Anthony Damico mailto:ajdam...@gmail.com>> Sent: Thursday, November 17, 2016 4:28 AM To: Courtney Benjamin Cc: r-help@r-project.org<mailto:r-help@r-project.org> Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression great minimal reproducible example, thanks. does something like this work? #Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude) summary(allCC) r <- residuals(allCC, type="response") f<-fitted(allCC) your_g<- cut(f, c(-Inf, quantile(f, (1:9)/10, Inf))) elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'your_g' ] <- your_g elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'r' ] <- r newdesign<-svrepdesign(variables = elsq1ch, repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") decilemodel<- svyglm(r~your_g, design=newdesign,subset=BYSCTRL==1&G10COHRT==1) regTermTest(decilemodel, ~your_g) On Wed, Nov 16, 2016 at 10:15 PM, Courtney Benjamin mailto:cbenj...@btboces.org>> wrote: ?Hello R Experts, I am trying to implement the Archer-Lemeshow GOF Test for survey data on a logistic regression model using the survey package based upon an R Help Archive post that I found where Dr. Thomas Lumley advised how to do it: http://r.789695.n4.nabble.com/Goodness-of-t-tests-for-Complex-Survey-Logistic-Regression-td4668233.html Everything is going well until I get to the point where I have to add the objects 'r' and 'g' as variables to the data frame by either using the transform function or the update function to update the svrepdesign object. The log. regression model involved uses a subset of data a
[R] Further Subsetting of Data for Log. Reg. Results in qr.default Error
Hello R Experts, In further subsetting data within a logistic regression in the survey package, I am getting a qr.default error. Below is an example of the original model that runs correctly, but when I subset the data further to look at students of a particular curriculum concentration, the qr.default error occurs. I thought it may have been related to converting the F1RTRCC variable into a factor for use in the original model; I went back and restored that variable to its original form and it didn't help. Any guidance is greatly appreciated. library(RCurl) library(survey) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr ##Resetting baseline levels for predictors elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or Less") ) elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") ) elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") ) elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") ) elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Academic") ) #Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit) summary(allCC) ##CTE Log. Reg. model that is resulting in the qr.default error CTE <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1&F1RTRCC=="Academic",na.action=na.omit) summary(CTE) Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Svyolr-Properly Specifying Start Values
Hello R Users, I am trying to use the svyolr command and coming up with the following error: Error in MASS::polr(formula, data = df, ..., Hess = TRUE, model = FALSE, : attempt to find suitable starting values failed >From what I have read online, a possible solution is to specify a value in the >start argument of svyolr; unfortunately, I have not been able to find any >detailed/clear descriptions about how to go about specifying values for the >start argument. Any help in explaining how to go about establishing >reasonable start values would be greatly appreciated. library (survey) library(RCurl) data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj2.csv";) elsq1ch <- read.csv(text = data) #Specifying the svyrepdesign object which applies the BRR weights elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR") elsq1ch_brr allCColr <- svyolr(F3ATTAINMENT~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,design=subset(elsq1ch_brr,BYSCTRL==1&G10COHRT==1),na.action=na.omit) ? Courtney Benjamin Broome-Tioga BOCES Automotive Technology II Teacher Located at Gault Toyota Doctoral Candidate-Educational Theory & Practice State University of New York at Binghamton cbenj...@btboces.org<mailto:cbenj...@btboces.org> 607-763-8633 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] modify the imported version of a function
Hi I saw on the assignInNamespace help page, that it changes "the copy in the namespace, but not any copies already exported from the namespace, in particular an object of that name in the package (if already attached) and any copies already imported into other namespaces." So now I'm wondering, whether there is a way to modify such copies already imported into other namespaces? For a concrete example, consider the lint package (now archived on CRAN) which includes a function check_pattern, which calls stringr:::perl, which has been deprecated in newer versions: > stringr:::perl function (pattern) { message("perl is deprecated. Please use regexp instead") regex(pattern) } so, what if I wanted to directly replace stringr:::perl with stringr:::regex, in such a way that lint::check_pattern sees the replacement? Is there a way, preferably without having to attach stringr to the search path? (I'm mostly just interested in learning about how namespace imports actually work; of course in this toy example one could use suppressMessages to hide that deprecation message). Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] modify the imported version of a function
Figured it out...in case it is useful to others: > library(lint) > lint.enclos <- parent.env(asNamespace("lint")) > stopifnot(all(c("perl", "regex") %in% ls(lint.enclos))) > lint.enclos$perl function (pattern) { message("perl is deprecated. Please use regexp instead") regex(pattern) } > unlockBinding("perl", lint.enclos) > lint.enclos$perl <- lint.enclos$regex > lint.enclos$perl function (pattern, ignore_case = FALSE, multiline = FALSE, comments = FALSE, dotall = FALSE, ...) { options <- stri_opts_regex(case_insensitive = ignore_case, multiline = multiline, comments = comments, dotall = dotall, ...) structure(pattern, options = options, class = c("regex", "pattern", "character")) } On 12/16/2016 06:06 PM, Benjamin Tyner wrote: Hi I saw on the assignInNamespace help page, that it changes "the copy in the namespace, but not any copies already exported from the namespace, in particular an object of that name in the package (if already attached) and any copies already imported into other namespaces." So now I'm wondering, whether there is a way to modify such copies already imported into other namespaces? For a concrete example, consider the lint package (now archived on CRAN) which includes a function check_pattern, which calls stringr:::perl, which has been deprecated in newer versions: > stringr:::perl function (pattern) { message("perl is deprecated. Please use regexp instead") regex(pattern) } so, what if I wanted to directly replace stringr:::perl with stringr:::regex, in such a way that lint::check_pattern sees the replacement? Is there a way, preferably without having to attach stringr to the search path? (I'm mostly just interested in learning about how namespace imports actually work; of course in this toy example one could use suppressMessages to hide that deprecation message). Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] file.exists() on device files
Hi, On my linux machine (Ubuntu, and also tested on RHEL), I am curious to know what might be causing file.exists (and also normalizePath) to not see the final device file here: > list.files("/dev/fd", full.names = TRUE) [1] "/dev/fd/0" "/dev/fd/1" "/dev/fd/2" "/dev/fd/3" > file.exists(list.files("/dev/fd", full.names = TRUE)) [1] TRUE TRUE TRUE FALSE > normalizePath(list.files("/dev/fd", full.names = TRUE)) [1] "/dev/pts/2" "/dev/pts/2" "/dev/pts/2" "/dev/fd/3" Warning message: In normalizePath(list.files("/dev/fd", full.names = TRUE)) : path[4]="/dev/fd/3": No such file or directory > sessionInfo() R version 3.2.5 (2016-04-14) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 16.04.1 LTS locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base Regards Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [FORGED] file.exists() on device files
Thank you for the insights, Rolf and Henrik. To give another example, this time in non-interactive mode, Rscript -e "file.exists(commandArgs(TRUE))" <(echo "Hi") [1] TRUE versus Rscript -e "normalizePath(commandArgs(TRUE))" <(echo "Hi") [1] "/dev/fd/63" Warning message: In normalizePath(commandArgs(TRUE)) : path[1]="/dev/fd/63": No such file or directory It almost seems like file.exists and normalizePath use separate criteria for determining existence? Regards Ben On 01/12/2017 01:42 AM, Rolf Turner wrote: On 12/01/17 16:33, Henrik Bengtsson wrote: FYI, the /proc is there because Unix has something called the "proc filesystem (procfs; https://en.wikipedia.org/wiki/Procfs) is a special filesystem in Unix-like operating systems that presents information about processes and other system information in a hierarchical file-like structure". For instance, you can query the uptime of the machine by reading from /proc/uptime: $ cat /proc/uptime 332826.96 661438.10 $ cat /proc/uptime 332871.40 661568.50 You can get all IDs (PIDs) of all processes currently running: $ ls /proc/ | grep -E '^[0-9]+$' and for each process you there are multiple attributes mapped as files, e.g. if I start R as: $ R --args -e "message('hello there')" then I can query that process as: $ pid=$(pidof R) $ echo $pid 26323 $ cat /proc/26323/cmdline /usr/lib/R/bin/exec/R--args-emessage('hello there') Unix is neat Indeed. Couldn't agree more. Thanks for the insight. cheers, Rolf __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] flushing on.exit prior to q()
Hi, When using a custom error function that calls q(), what is the recommended way to "flush" the calling function's on.exit ? For example, say I have a script: #!/usr/bin/Rscript --no-init-file options(error = function() { cat("on error message\n", file = stderr()) q(save = "no", status = 1) }) test <- function() { on.exit(cat("on exit message\n", file = stderr())) stop("error") } test() when I run the script, the "on exit message" does not print. (I found one solution might involve adding a stop() to my custom error function prior to the q(); however this is somewhat less than ideal: it also triggers an "error during wrapup" because of the recursion). Other ideas, best practices? Thanks Ben __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simple... but...
> x <- c(1,3,5) > y <- c(2,4,6) > xy <- sort(c(x,y)) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shubha Vishwanath Karanth Sent: Wednesday, July 23, 2008 8:55 AM To: [EMAIL PROTECTED] Subject: [R] Simple... but... Hi R, If x=c(1,3,5) y=c(2,4,6) I need a vector which is c(1,2,3,4,5,6) from x and y. How do I do it? I mean the best way Thanks, Shubha This e-mail may contain confidential and/or privileged i...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S. News & World Report (2007). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matrix barplot
> data <- data.frame(Year=c(2000,2001,2002), A=c(4,2,1), B=c(3,1,2), C=c(0,3,5)) > data.mat <- as.matrix(data)[,2:4] > rownames(data.mat) <- data$Year > data.mat <- t(data.mat) > barplot(data.mat,beside=TRUE) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andreas Tille Sent: Friday, July 25, 2008 8:14 AM To: r-help@r-project.org Subject: [R] Matrix barplot Hi, as a bloody R beginner I failed to solve the probably simple problem to create a barplot of the following data read from a file Year A BC 2000 4 30 2001 2 13 2002 1 25 The Barplot should look like 5 | C 4 | A C 3 | AB C C 2 | ABA CBC 1 | ABABC ABC +-- 2000 2001 2002 (well, something like that - the colors are encoded as letters in this ASCII-graphics - assume the coloring / shading as usual). My problem is that if I read the table using data <- read.table(file='data.dat', sep = '\t', fill=TRUE, header=TRUE ) everything looks is read as expected if I try data but I have no idea how to separate the Matrix (A B C) to specify the height parameter of barplot as a matrix and the names.arg parameter for the inscription. Whatever I tried I get 'height' must be a vector or a matrix and I have no idea to do this right. So I think I'm just facing a data conversion problem and have to turn data[['Year']] into a vector and the remaining table into a matrix. I guess this is a really simple question - but I failed to find a solution. Kind regards Andreas. -- http://fam-tille.de __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S. News & World Report (2007). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Color of box frame in Legend (Was: Matrix barplot)
Try sourcing in the 'new.legend' function below. It's the legend function with a new argument called 'box.col'. The argument will change the color of the box surrounding the legend. If I understand what it is you are looking for, this should work. Also, I didn't see a way to change the axis bar in your code, so I suppressed the axis in the call to barplot, and manually replaced the axis using the axis function. I hope this works for you. Benjamin data <- data.frame(Year=c(2000,2001,2002), A=c(2,2,1), B=c(3,1,2), C=c(0,3,5)) data.mat <- as.matrix(data)[,2:4] rownames(data.mat) <- data[['Year']] data.mat <- t(data.mat) textcolor="yellow" par(col.axis=textcolor,col.main=textcolor) barplot(data.mat,beside=TRUE,col=rainbow(3),main="Test",yaxt="n") axis(2,at=0:5,col=textcolor) new.legend(x="topleft", colnames(data[,2:4]),fill=rainbow(3), inset=0.05, text.col=textcolor,box.col=textcolor ) ### ### new.legend <- function (x, y = NULL, legend, fill = NULL, col = par("col"), lty, lwd, pch, angle = 45, density = NULL, bty = "o", bg = par("bg"), box.lwd = par("lwd"), box.lty = par("lty"), box.col="black", pt.bg = NA, cex = 1, pt.cex = cex, pt.lwd = lwd, xjust = 0, yjust = 1, x.intersp = 1, y.intersp = 1, adj = c(0, 0.5), text.width = NULL, text.col = par("col"), merge = do.lines && has.pch, trace = FALSE, plot = TRUE, ncol = 1, horiz = FALSE, title = NULL, inset = 0) { if (missing(legend) && !missing(y) && (is.character(y) || is.expression(y))) { legend <- y y <- NULL } mfill <- !missing(fill) || !missing(density) title <- as.graphicsAnnot(title) if (length(title) > 1) stop("invalid title") legend <- as.graphicsAnnot(legend) n.leg <- if (is.call(legend)) 1 else length(legend) if (n.leg == 0) stop("'legend' is of length 0") auto <- if (is.character(x)) match.arg(x, c("bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right", "center")) else NA if (is.na(auto)) { xy <- xy.coords(x, y) x <- xy$x y <- xy$y nx <- length(x) if (nx < 1 || nx > 2) stop("invalid coordinate lengths") } else nx <- 0 xlog <- par("xlog") ylog <- par("ylog") rect2 <- function(left, top, dx, dy, density = NULL, angle, ...) { r <- left + dx if (xlog) { left <- 10^left r <- 10^r } b <- top - dy if (ylog) { top <- 10^top b <- 10^b } rect(left, top, r, b, angle = angle, density = density, ...) } segments2 <- function(x1, y1, dx, dy, ...) { x2 <- x1 + dx if (xlog) { x1 <- 10^x1 x2 <- 10^x2 } y2 <- y1 + dy if (ylog) { y1 <- 10^y1 y2 <- 10^y2 } segments(x1, y1, x2, y2, ...) } points2 <- function(x, y, ...) { if (xlog) x <- 10^x if (ylog) y <- 10^y points(x, y, ...) } text2 <- function(x, y, ...) { if (xlog) x <- 10^x if (ylog) y <- 10^y text(x, y, ...) } if (trace) catn <- function(...) do.call("cat", c(lapply(list(...), formatC), list("\n"))) cin <- par("cin") Cex <- cex * par("cex") if (is.null(text.width)) text.width <- max(abs(strwidth(legend, units = "user", cex = cex))) else if (!is.numeric(text.width) || text.width < 0) stop("'text.width' must be numeric, >= 0") xc <- Cex * xinch(cin[1], warn.log = FALSE) yc <- Cex * yinch(cin[2], warn.log = FALSE) if (xc < 0) text.width <- -text.width xchar <- xc xextra <- 0 yextra <- yc * (y.intersp - 1) ymax <- yc * max(1, strheight(legend, units = "user", cex = cex)/yc) ychar <- yextra + ymax if (trace) catn(" xchar=", xchar, "; (yextra,ychar)=", c(yextra, ychar)) if (mfill) { xbox <- xc * 0.8 ybox <- yc * 0.5 dx.fill
[R] parent in Creating environment object
Hi, I would like to convert a simple list into an environment object. It seems I have to create an environment object with new.env() and assign the single values afterwards. Now what I did not really understand from the guides until now is, how the parent environment supplied to the new.env() function influence the final environment. So: 1. Do I ALWAYS have to supply a parent during creation? 2. If yes, what would that be, when all I want is a conversion from a simple list? Best regards Benjamin == Benjamin Otto University Hospital Hamburg-Eppendorf Institute For Clinical Chemistry Martinistr. 52 D-20246 Hamburg Tel.: +49 40 42803 1908 Fax.: +49 40 42803 4971 == -- Pflichtangaben gemäß Gesetz über elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universitätsklinikum Hamburg-Eppendorf Körperschaft des öffentlichen Rechts Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. Jörg F. Debatin (Vorsitzender) Dr. Alexander Kirstein Ricarda Klein Prof. Dr. Dr. Uwe Koch-Gromus __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Saving environment object
Hi, When I create an environment object with new.env() and populate it with values then how can I save it into an .RData file properly, so it can be loaded later on in a new session? Saving an environment object with save() or save.image() results in an error message when loading again: Error: protect(): protection stack overflow Regards, benjamin == Benjamin Otto University Hospital Hamburg-Eppendorf Institute For Clinical Chemistry Martinistr. 52 D-20246 Hamburg Tel.: +49 40 42803 1908 Fax.: +49 40 42803 4971 == -- Pflichtangaben gemäß Gesetz über elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universitätsklinikum Hamburg-Eppendorf Körperschaft des öffentlichen Rechts Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. Jörg F. Debatin (Vorsitzender) Dr. Alexander Kirstein Ricarda Klein Prof. Dr. Dr. Uwe Koch-Gromus __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Saving environment object
Hi Erik, Yes this is what I was trying and your example or the one of luke is working fine with me. So now I'm not sure if this is due to an environment which takes too much space. The environment troubling me has 644276 entries. Is this too much? Benjamin -Ursprüngliche Nachricht- Von: Erik Iverson [mailto:[EMAIL PROTECTED] Gesendet: Friday, August 15, 2008 3:37 PM An: Benjamin Otto Cc: R-Help Betreff: Re: [R] Saving environment object Benjamin Otto wrote: > Hi, > > When I create an environment object with new.env() and populate it with > values then how can I save it into an .RData file properly, so it can be > loaded later on in a new session? > > Saving an environment object with save() or save.image() results in an error > message when loading again: > > Error: protect(): protection stack overflow Can you give a small, reproducible example as the posting guide asks? And also provide your sessionInfo() ? I am not able to replicate this. test <- new.env() assign("hi", pi, pos = test) save(test, file = "~/testenv.Rdata") does not give me an error. Is this basically what you're trying? -- Pflichtangaben gemäß Gesetz über elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universitätsklinikum Hamburg-Eppendorf Körperschaft des öffentlichen Rechts Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. Jörg F. Debatin (Vorsitzender) Dr. Alexander Kirstein Ricarda Klein Prof. Dr. Dr. Uwe Koch-Gromus __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Saving environment object
Hi Luke, hi all, You have been right guys, the hashing solved the problem. Thanks for your hard efforts. :) Best regards Benjamin -Ursprüngliche Nachricht- Von: Luke Tierney [mailto:[EMAIL PROTECTED] Gesendet: Friday, August 15, 2008 6:29 PM An: Prof Brian Ripley Cc: Benjamin Otto; R-Help Betreff: Re: [R] Saving environment object On Fri, 15 Aug 2008, Prof Brian Ripley wrote: > Having been told that this was a particular very large environment, Luke's > comments in serialize.c wouild seem to apply: > > The output format for dotted pairs writes the ATTRIB value first > rather than last. This allows CDR's to be processed by iterative > tail calls to avoid recursion stack overflows when processing long > lists. The writing code does take advantage of this, but the > reading code does not. It hasn't been a big issue so far--the only > case where it has come up is in saving a large unhashed environment > where saving succeeds but loading fails because the PROTECT stack > overflows. With the ability to create hashed environments at the > user level this is likely to be even less of an issue now. But if > we do need to deal with it we can do so without a change in the > serialization format--just rewrite ReadItem to pass the place to > store the CDR it reads. (It's a bit of a pain to do, that is why it > is being deferred until it is clearly needed.) Thanks -- I had forgotten about that (and would still prefer to defer rewriting ReadItem :-)) luke > So I think the moral is to hash large environments, and increasing > --max-ppsize should enable this one to be read in. > > On Fri, 15 Aug 2008, Luke Tierney wrote: > >> On Fri, 15 Aug 2008, Benjamin Otto wrote: >> >>> Hi, >>> >>> When I create an environment object with new.env() and populate it with >>> values then how can I save it into an .RData file properly, so it can be >>> loaded later on in a new session? >>> >>> Saving an environment object with save() or save.image() results in an >>> error >>> message when loading again: >>> >>> Error: protect(): protection stack overflow >> >> save/load works fine (and is used in many places): >> >>> e<-new.env() >>> assign("e", e, envir = e) >>> assign("x", 2, envir = e) >>> save(e, file = "test.Rda") >>> rm(e) >>> load("test.Rda") >>> e >> >> >> There may be something about the values you are using that is causing >> problems, but there is no way to tell without a reproducible example. >> >> luke >> >>> >>> Regards, >>> >>> benjamin >>> >>> == >>> Benjamin Otto >>> University Hospital Hamburg-Eppendorf >>> Institute For Clinical Chemistry >>> Martinistr. 52 >>> D-20246 Hamburg >>> >>> Tel.: +49 40 42803 1908 >>> Fax.: +49 40 42803 4971 >>> == >>> >>> >>> >>> >> >> -- >> Luke Tierney >> Chair, Statistics and Actuarial Science >> Ralph E. Wareham Professor of Mathematical Sciences >> University of Iowa Phone: 319-335-3386 >> Department of Statistics andFax: 319-335-3017 >> Actuarial Science >> 241 Schaeffer Hall email: [EMAIL PROTECTED] >> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- Luke Tierney Chair, Statistics and Actuarial Science Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics andFax: 319-335-3017 Actuarial Science 241 Schaeffer Hall email: [EMAIL PROTECTED] Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu -- Pflichtangaben gemäß Gesetz über elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universitätsklinikum Hamburg-Eppendorf Körperschaft des öffentlichen Rechts Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. Jörg F. Debatin (Vorsitzender) Dr. Alexander Kirstein Ricarda Klein Prof. Dr. Dr. Uwe Koch-Gromus __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Generic attribute in multinom function
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Generic attribute in multinom function
Hello, I try estimate a MNL model but I need one coefficient as generic. How can I get the same coefficient for all alternatives? As example Call: multinom(formula = choice ~ time + costs, data = DATA) Coefficients: (Intercept)time costs 14.79-0.28 -3.97 2 -3.30 0.053 0.59 32.10-0.048-0.65 But I would like to get the costs parameters as generic attribute, so that I have the following result: Coefficients: (Intercept)time costs 14.79-0.3-1.8 2 -3.36 0.4-1.8 32.10-0.5-1.8 Thanks. Best Regards Benjamin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How can I do a generic specification in multiple logistic regression
I use the function multinom to estimate a multiple logistic regression. but I need one coefficient as generic. How can I do this? Or is it possible to do this with an another function? As example this is what I get: Call: multinom(formula = choice ~ time + costs, data = DATA) Coefficients: (Intercept)time costs 14.79-0.28 -3.97 2 -3.30 0.053 0.59 32.10-0.048-0.65 But I would like to get the costs parameters as generic attribute, so that I have the following result: Coefficients: (Intercept)time costs 14.79-0.3-1.8 2 -3.36 0.4-1.8 32.10-0.5-1.8 Thank you for you help. Best Regards Benjamin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] applying summary() to an object created with ols()
Hello R-list, I am trying to calculate a ridge regression using first the *lm.ridge()* function from the MASS package and then applying the obtained Hoerl Kennard Baldwin (HKB) estimator as a penalty scalar to the *ols()* function provided by Frank Harrell in his Design package. It looks like this: > rrk1<-lm.ridge(lnbcpc ~ lntex + lnbeerp + lnwinep + lntemp + pop, subset(aa, Jahr>=1957 & Jahr<=1966)) > f <- ols(lnbcpc ~ lntex + lnbeerp + lnwinep + lntemp + pop, subset(aa, Jahr>=1957 & Jahr<=1966), penalty = rrk$kHKB) > f which returns >Linear Regression Model > >ols(formula = lnbcpc ~ lntex + lnbeerp + lnwinep + lntemp + pop, >data = subset(aa, Jahr >= 1957 & Jahr <= 1966), penalty = rrk$kHKB) > > n Model L.R. d.f. R2 Sigma >10 38.59 8.814 0.98390.02796 > >Residuals: >1 2 3 4 5 6 7 8 910 >-0.014653 -0.002787 0.017515 -0.018145 -0.008757 -0.008035 0.006066 0.045826 -0.001244 -0.015786 > >Coefficients: >Value Std. Error t Pr(>|t|) >Intercept 1.5240 3.3034 0.4613 0.8496 >lntex 0.3722 0.2071 1.7975 0.6801 >lnbeerp0.9085 0.5760 1.5771 0.6964 >lnwinep -0.1458 0.1874 -0.7781 0.7863 >lntemp-0.0772 0.1344 -0.5743 0.8240 >pop -4.1889 1.9286 -2.1720 0.6571 > >Adjusted R-Squared: 0.2227 All in all beautiful (leaving aside that the results suck). The problem starts when I want to write the obtained coefficients (incl. Std. Errors, t-, and p-values) into a matrix. Via the *f$coef* command I can only access the betas (1st column) and using the *summary(f)* function I get > summary(f) >Fehler in summary.Design(f) : >adjustment values not defined here or with datadist for lntex lnbeerp lnwinep lntemp pop Does anyone know how I can set the *datadist()* and the *options()* such that I will get access to all coefficients? I tried: > options(datadist=NULL) > f <- ols(lnbcpc ~ lntex + lnbeerp + lnwinep + lntemp + pop, subset(aa, Jahr>=1957 & Jahr<=1966), penalty = rrk$kHKB) > d <- datadist(f) but got: > Fehler in sort.list(unique(y)) : 'x' must be atomic for 'sort.list' > Have you called 'sort' on a list? In the R documentation on ?ols() it states concerning the values returned: "the same objects returned from |lm| (/unless |penalty| or |penalty.matrix| are given/ - then an abbreviated list is returned since |lm.pfit| is used as a fitter)..." Unfortunately no information seems to be available on lm.pfit. Does anyone know why the using that function leads to an abbreviated return list? Is there a trick to circumvent that? Thanks Benjamin Volland P.S. Currently using R-version 2.7.1 on a Windows PC. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with specifying variance-covariance matrix for random effects (nlme package)
Hi all, I've been struggling with trying to specify a diagnoal matrix for linear mixed effects model. I think I've got nearly everything correct, except the following message appears: In lme.formula(fixed = fwave ~ sex + sexXbulbar + visit + age + : Fewer observations than random effects in all level 1 groups Not sure if i've provided enough details, but I'm basically trying to perform a mixed effects model analysis, controlling for several covariates. Incorporating random effects for the intercept and slope. What does the second line, 'Fewer observations than random effects in all level 1 groups' mean? Many thanks in advance, Ben Cheah __ [[elided Yahoo spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] all possible subsets, with AIC
I've dabbled in this a little bit, and the result of my dabbling is attached. I'll give you fair warning, however. The attached function can take a long time to run, and if your model has 10 or more predictors, you may be retired before it finishes running. In any case, it will models for all possible subsets of predictors in lm, glm, or coxph. If requested, it will also plot the R-squared, Adjusted R-squared, AIC, or BIC of those models (when the values are applicable to the model). It might give you a good starting point. Benjamin -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of kcleary2 Sent: Friday, February 12, 2010 3:19 PM To: r-help@r-project.org Subject: [R] all possible subsets, with AIC Hello, I have a question about doing ALL possible subsets regression with a general linear model. My goal is to produce cumulative Akaike weights for each of 7 predictor variables-to obtain this I need R to: 1. Show me ALL possible subsets, not just the best possible subsets 2. Give me an AIC value for each model (instead of a BIC value). I have tried to do this in library(RcmdrPlugin.HH), and using the "leaps" code below. With the leaps code my problem is that my response is not a vector, it's a single value (density of a species) ANy help would be greatly appreciated. Thanks a lot, Kate ALL-SUBSETS REGRESSIOM DESCRIPTION leaps() performs an exhaustive search for the best subsets of the variables in x for predicting y in linear regression, using an efficient branch-and-bound algorithm. It is a compatibility wrapper for regsubsets [1] does the same thing better. Since the algorithm returns a best model of each size, the results do not depend on a penalty model for model size: it doesn't make any difference whether you want to use AIC, BIC, CIC, DIC, ... USAGE leaps(x=, y=, wt=rep(1, NROW(x)), int=TRUE, method=c("Cp", "adjr2", "r2"), nbest=10, names=NULL, df=NROW(x), strictly.compatible=TRUE) ARGUMENTS x A matrix of predictors y A response vector wt Optional weight vector int Add an intercept to the model method Calculate Cp, adjusted R-squared or R-squared nbest Number of subsets of each size to report names vector of names for columns of x df Total degrees of freedom to use instead of nrow(x) in calculating Cp and adjusted R-squared strictly.compatible Implement misfeatures of leaps() in S -- Kate Cleary MS Candidate Department of Fish, Wildlife, and Conservation Biology Colorado State University Fort Collins, CO 970-491-3535 Links: -- [1] https://webmail.warnercnr.colostate.edu/leaps/help/regsubsets [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use only by the individual or entity to which it is addressed and may contain information that is privileged, confidential, and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient or the employee or agent responsible for delivering the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and destroy the material in its entirety, whether electronic or hard copy. Thank you. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] First. Last. Data row selection
I've attached some functions I've written based on previous questions that have been posted here. Unfortunately, I was too lazy to give credit to previous commenters in my Rd file, and for that I hope they'll forgive me. In any case, please be assured that the functions I've attached are in no way my original work. Benjamin -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of wookie1976 Sent: Tuesday, February 23, 2010 12:40 PM To: r-help@r-project.org Subject: [R] First. Last. Data row selection I am in the process of switching from SAS over to R. I am working on very large CSV datasets that contain vehicle information. As I am processing the data, I need to select the first (or sometimes the second) record (by date) for any records that have the same license plate number. In SAS, there is a function called 'first.' that can be used on sorted datasets to pull out those first entries for each occurrence of a particular variable (in this case the variable is 'license plate') found in the data. I have spent some time looking around and cannot seem to find an equivalent function in R. Can anyone recommend an efficient technique that would pull this off? I assume the database must first be sorted by vehicle plate and date, and then apply the filter or function. Any help would be greatly appreciated. Thanks, Joe -- View this message in context: http://n4.nabble.com/First-Last-Data-row-selection-tp1566260p1566260.htm l Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use only by the individual or entity to which it is addressed and may contain information that is privileged, confidential, and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient or the employee or agent responsible for delivering the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and destroy the material in its entirety, whether electronic or hard copy. Thank you. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Circles around letters or numbers in plot title
Has anyone ever tried putting a circle around a letter or a number in a plot title? For instance, if I have a plot title "Scatterplot for Subject 24", I want to put a circle around 24 to distinguish that plot from the other 30 I've generated. Any tips or ideas beyond plotting a circle in the margin? Benjamin Benjamin Nutter | Biostatistician | Quantitative Health Sciences Cleveland Clinic | 9500 Euclid Ave. | Cleveland, OH 44195 | (216) 445-1365 === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] counting the number of ones in a vector
What you did works well. You could also try the following. table(x)["1"] -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Randall Wrong Sent: Friday, February 26, 2010 9:41 AM To: r-help@r-project.org Subject: [R] counting the number of ones in a vector Dear R users, I want to count the number of ones in a vector x. That's what I did : length( x[x==1] ) Is that a good solution ? Thank you very much, Randall [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] counting the number of ones in a vector
But if x has any missing values: > x <- c(1, 1, 1, NA, NA, 2, 1, NA) > > sum( x == 1) [1] NA > > sum(x==1, na.rm=TRUE) [1] 4 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique Dallazuanna Sent: Friday, February 26, 2010 9:47 AM To: Randall Wrong Cc: r-help@r-project.org Subject: Re: [R] counting the number of ones in a vector Try: sum(x == 1) On Fri, Feb 26, 2010 at 11:40 AM, Randall Wrong wrote: > Dear R users, > > I want to count the number of ones in a vector x. > > That's what I did : length( x[x==1] ) > > Is that a good solution ? > Thank you very much, > Randall > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Row-wisely converting a data frame into a list
> as.data.frame(t(df)) For example > x <- as.data.frame(t(mtcars)) > typeof(x) [1] "list" -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sebastian Bauer Sent: Tuesday, March 02, 2010 8:12 AM To: r-help@r-project.org Subject: [R] Row-wisely converting a data frame into a list Hello, is there an elegant way, how I can convert each row of a data frame into distinct elements of a list? In essence, what I'm looking for is something like rows.to.lists <- function( df ) { ll <- NULL for( i in 1:nrow(df) ) ll <- append( ll, list(df[i,]) ) return (ll) } but more done more efficiently (the data frame may contain ten-thousands of rows). I thought about using apply() but this function always returns a matrix. Thanks in advance! Bye, Sebastian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help on getting help from manuals
As has been pointed out, there are tools in R to help find the commands you are looking for. As a practical note, I recommend starting with '?' if you think you know what command you need. If you're unsure of what command you need, my next step would be help.search(). Often, the results from here will point you to a command that you are looking for. If a handful of tries with help.search() doesn't get you what you need, then try RSiteSearch(). This will search the mailing list for entries relevant to your search criteria. Often times, you'll find that someone has already asked your question. In fact, I can only recall a couple of times where a search of the mailing list has failed to provide a solution for me. Aside from reading lots of documentation, the one thing that has helped me the most in picking up R has been reading Peter Dalgaard's "Introductory Statistics with R." It's a very small, basic introduction to R, but laid a solid enough foundation that I soon found myself able to produce and refine my own solutions. In short, the best way to improve our knowledge of R is lots of reading and a bit of trial and error. Benjamin -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ManInMoon Sent: Friday, March 12, 2010 4:41 AM To: r-help@r-project.org Subject: [R] Help on getting help from manuals Hi, A number of people have suggested "I read the manuals"... Could someone help me by telling me where the primary start point is please? For example, I am interested in writing functions with variable number of arguments - where should I start to look? "An introduction to R" only show a brief example - with no pointer to where to find further data. I can't do ?xxx from R console in most cases - as I don't know what the function name is that I am looking for!!! People have helped me find "substitute" to get some metadata out - BUT how could I have found that without guidance from nice people in Nabble? Any help on this very much appreciated. -- View this message in context: http://n4.nabble.com/Help-on-getting-help-from-manuals-tp1590323p1590323 .html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Retrieving latitude and longitude via Google Maps API
Does anyone have any experience retrieving latitutde and longitude for an address from the Google Maps API? I'd like to have an R script that submits a street address, city, state, and zip code and returns the coordinates. So far, I've been submitting the coordinates from another program, then loading the coordinates in R and merging them back into the data frame I want to use. It'd be nice to be able to do it all in one script, but I'm not comprehending the API thing very well. I'm using R 2.9.1 on Windows XP. Any suggestions or pointers? Benjamin Benjamin Nutter | Biostatistician | Quantitative Health Sciences Cleveland Clinic | 9500 Euclid Ave. | Cleveland, OH 44195 | (216) 445-1365 === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mboost: Interpreting coefficients from glmboost if center=TRUE
Sorry for the tardy reply but I just found your posting incidentally today. To make long things short: You are right about the centering. We forgot to correct the intercept if center = TRUE. We lately found the problem ourself and fixed it in the current version (mboost 2.0-3). However the problem only occurred if you extracted the coefficients. As the intercept is rarely interpretable we didn't have a closer look and thus it took some time until we found this bug. The model predictions however always took care of the centering and thus were not affected (as you already pointed out). As you realized centering is of hight importance if you use glmboost as it reduces the number of boosting iterations needed to estimate the model and furthermore often improves the estimates. Centering is also important if you use gamboost() and specify linear base-learners without intercept (e.g. bols(x, intercept=FALSE)). However, in this case you have to center the covariates yourself and take care of the intercept correction afterwards. You wrote you weren't aware of a mailing list for mboost but you could write to the maintainer (see http://cran.at.r-project.org/package=mboost) for possible help and/or to report bugs. HTH Benjamin Kyle Werner wrote: Thanks for your reply. In fact, I do use the predict method for model assessment, and it shows that centering leads to a substantial improvement using even the bluntest of assessments of 'goodness' (i.e., binary categorization accuracy). So I agree that the package authors must have internal tools to reverse the effects of centering the variables, at least within the predict method. But it seems to me that the coefficients that I get out should be related to the values that I input, not to the centered values. In other words, centering seems like it should be done "invisibly;" unless I center the variables myself, I would expect the coefficients to be applicable to the original data. I extract the coefficients returned by the model and store them in a database which is web accessible. I reconstruct models periodically, and track various statistics associated with these models in the database. This is why I highly value the fact that mboost has glmboost, which can return linearly interpretable coefficients. It is also why I do not directly call upon R every time I want to query a model. (As an aside, if I were to use R directly, I might consider the gamboost or blackboost methods, which do not return scalar coefficients that are readily extractable.) On Sun, Feb 7, 2010 at 6:31 PM, David Winsemius comcast.net> wrote: > > On Feb 7, 2010, at 5:03 PM, Kyle Werner wrote: > >> I'm running R 2.10.1 with mboost 2.0 in order to build predictive >> models . I am performing prediction on a binomial outcome, using a >> linear function (glmboost). However, I am running into some confusion >> regarding centering. (I am not aware of an mboost-specific mailing >> list, so if the main R list is not the right place for this topic, >> please let me know.) >> >> The boost_control() function allows for the choice between center=TRUE >> and center=FALSE. If I select center=FALSE, I am able to interpret the >> coefficients just like those from standard logistic regression. >> However, if I select center=TRUE, this is no longer the case. In >> theory and in practice with my data, centering improves the >> predictions made by the model, so this is an issue worth pursuing for >> me. >> >> Below is output from running the exact same data in exactly the same >> way, only differing by whether the "center" bit is flipped or not: >> >> Output with center=TRUE: >> [(Intercept)] => -0.04543632 >> [painscore] => 0.007553608 >> [Offset] => -0.546520621809327 >> >> Output with center=FALSE: >> [(Intercept)] => -0.989742 >> [painscore] => 0.001342585 >> [Offset] => -0.546520621809327 >> >> The mean of painscore is 741. It seems to me that for center=FALSE, >> mboost should modify the intercept by subtracting 741*0.007553608 from >> it (thus intercept should = -11.285). If I manually do this, the >> output is credible, and in the ballpark of that given by other methods >> (e.g., lrm or glm with a Binomial link function). If I don't do this, >> then the inverse logistic interpretation of the output is off by >> orders of magnitude. >> >> In the end, with "center=TRUE", and I want to make a prediction based >> on the coefficients returned by mboost, the results only make sense if >> I manually rescale my independent variables prior to making a >> prediction. Is this the desired behavior, or am I doing something >&
[R] dump() an object of type raw ?
Is there a way to do this? I tried x <- writeBin(pi, raw()) dump("x","x.R") source("x.R") but is seems x.R is not source()-able, as it contains an unexpected symbol. Thanks Ben __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] dump() an object of type raw ?
Thank you David for taking the time to respond to my question. Perhaps I should clarify: the man page says "a 'dump' file can be 'source'd"; have I taken the documentation too literally in this case? David Winsemius wrote: On Jul 21, 2009, at 7:19 PM, Benjamin Tyner wrote: Is there a way to do this? I tried x <- writeBin(pi, raw()) dump("x","x.R") source("x.R") but is seems x.R is not source()-able, as it contains an unexpected symbol. Yes, "2d".I get: > source("/Users/davidwinsemius/x.R") Error in source("/Users/davidwinsemius/x.R") : /Users/davidwinsemius/x.R:2:8: unexpected symbol 1: x <- 2: c(18, 2d ^ So there was an good faith effort to source the file but the interpreter was not put on notice that it would be getting hexadecimal. The file contains: x <- c(18, 2d, 44, 54, fb, 21, 09, 40) and when you try to execute that from the command line you get: > x <- + c(18, 2d, 44, 54, fb, 21, 09, 40) Error: syntax error (R variable names cannot start with 2 and R wouldn't assume these are hexadecimal numbers.) David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] numbers on barplot
The only thing you're missing is the midpoints of the bars. Since you specified > Graph <- barplot(dat$Average) You can get the midpoints from the Graph object. So to put the number on top of each bar you might use something like: > text(Graph, dat$Average, dat$Average) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mohsen Jafarikia Sent: Monday, July 27, 2009 10:02 AM To: r-h...@stat.math.ethz.ch Subject: [R] numbers on barplot Hello all, I have this simple barplot code: ifn <- "id.dat" dat <- read.table(ifn) ofn <- "id.png" bitmap(ofn, type = "png256", width = 30, height = 30, pointsize = 30, bg = "white",res=50) par(mar=c(5, 5, 3, 2),lwd=5) par(cex.main=1.6,cex.lab=1.6,cex.axis=1.6) names(dat)<-c("NumberOfPeople","Average") Graph<-barplot(dat$Average) dev.off() and here is the data (id.dat): 150.08 60.09 70.37 I want to write down the ""NumberOfPeople" on top of each of the bars. Can anybody help me on this? Thanks, Mohsen [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S. News & World Report (2008). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Drawing lines in margins
Look at the xpd option in ?par. If you set par(xpd=TRUE) you should be able to add a segment for what you want. But please let me know if someone gives you a better way to do this. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alan Cohen Sent: Wednesday, July 29, 2009 10:22 AM To: r-help@r-project.org Subject: [R] Drawing lines in margins Hi all, Quick question: What function can I use to draw a line in the margin of a plot? segments() and lines() both stop at the margin. In case the answer depends on exactly what I'm trying to do, see below. I'm using R v. 2.8.1 on Windows XP. Cheers, Alan I'm trying to make a horizontal barplot with a column of numbers on the right side. I'd like to put a line between the column header and the numbers. The following reconstructs the idea - just copy and paste it in: aa <- 1:10 plot.mtx2<-cbind(aa,aa+1) colnames(plot.mtx2)<-c("Male","Female") lci2<- cbind(aa-1,aa) uci2<- cbind(aa+1,aa+2) par(mar=c(5,6,4,5)) cols <- c("grey79","grey41") bplot2<-barplot(t(plot.mtx2),beside=TRUE,xlab="Malaria death rates per 100,000", names.arg=paste("state",aa,sep=""),legend.text=F,las=1,xlim=c(0,13), horiz=T, col=cols, main="Malaria death rates by state and sex") legend(8,6,legend=c("Female","Male"),fill=cols[order(2:1)]) segments(y0=bplot2, y1=bplot2, x0=t(lci2), x1=t(uci2)) mtext(10*(aa+1),side=4,line=4,at=seq(3,3*length(aa),by=3)-0.35,padj=0.5, adj=1,las=1,cex=0.85) mtext(10*aa,side=4,line=4,at=seq(2,3*length(aa)-1,by=3)-0.65,padj=0.5,ad j=1,las=1,cex=0.85) mtext("Estimated",side=4,line=3,at=3*length(aa)+2.75,padj=0.5,adj=0.5,la s=1,cex=0.85) mtext("Deaths",side=4,line=3,at=3*length(aa)+1.25,padj=0.5,adj=0.5,las=1 ,cex=0.85) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S. News & World Report (2008). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Converting date format
If x is your vector of character date variables: orig.date <- as.Date(x, format=c("%m/%d/%Y")) new.date <- format(x, format=c("%m/%d/%y")) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hosack, Michael Sent: Tuesday, March 23, 2010 10:11 AM To: R-help@r-project.org Subject: [R] Converting date format R community: Hello, I would to like to convert a character date variable from %m/%d/%Y to %m/%d/%y. Any advice would be greatly appreciated. I have tried functions for changing the formatting and removing the unnecessary digits without success. Mike __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] (no subject)
I've got a problem with the sparseby command (reshape library), and I have reached the peak of my R knowledge (it isn't really that high). I have a small data frame of 23 rows and 15 columns, here is a subset, the first four columns are factors and the rest are numeric (only one, line54 is provided). bearID YEAR Season SEX line54 5 1900 8 3 0 16.3923519 11 2270 5 1 0 233.7414014 12 2271 5 1 0 290.8207652 13 2271 5 2 0 244.7820844 15 2291 5 1 0 0.000 16 2291 5 2 0 14.5037795 17 2291 6 1 0 0.000 18 2293 5 2 0 144.7440752 19 2293 5 3 0 0.000 20 2293 6 1 0 16.0592270 21 2293 6 2 0 30.1383426 28 2298 5 1 0 0.9741067 29 2298 5 2 0 9.6641018 30 2298 6 2 0 8.6533828 31 2309 5 2 0 85.9781303 32 2325 6 1 0 110.8892153 35 2331 6 1 0 26.7335562 44 2390 7 2 0 7.1690620 45 2390 8 2 0 44.1109897 46 2390 8 3 0 503.9074898 47 2390 9 2 0 8.4393660 54 2416 7 3 0 48.6910907 58 2418 8 2 0 5.7951139 Sparseby works fine when I try to calculate mean >sparseby(mF[1:5], mF$Season, mean) mF$Season bearID YEAR Season SEX line54 1 1 NA NA NA 0 84.90228 2 2 NA NA NA 0 54.90713 3 3 NA NA NA 0 142.24773 But it goes nuts when looking for max or min > sparseby(mF[5:6], mF$Season, max) mF$Season structure(c(2169.49621795108, 1885.22677689026, 2492.17544685464 1 1 2169.496 2 2 1885.227 3 3 2492.175 Any ideas? All I want is to calculate create three data.frames, mean, min and max. Thanks, Ben Stewart __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RExcel
Hello- I am a Graduate Assistant for an instructor who has written programs for statistics calculations such as binomial distributions and regressions. The programs had worked with no problem in Excel 2003. Now we are trying to use it with Excel 2007, and we are having some trouble. I have downloaded RandFriends and have ran the binomial distribution process in 2007 Excel and have received an error that says: "Compile error in hidden module: UFDBinomial" However, ther are two demo excel files in the RExcel file called RdemoDens. When I open the first RDemoDens excel file, I can run the processes and they work fine. When I run the second RDemoDens excel file, or a blank excel 2007 file, the processes do not work and I get the error message. I am trying to figure out what is different about the first RDemoDens excel file that allows the calculations to process correctly. I am thinking that something in the macro library in the demo must be different than what is in a blank excel document. I just cannot seem to figure out what it is. One thing that I did notice is that there are two different RExcel files in the RExcel folder. One is labled "RExcel" and one is labed "RExcel 2007." What are the difference between these two RExcel files? I am not sure if this has anything to do with the problem, but perhaps the excel demo in which our calculations work uses the correct RExcel file while a regular excel 2007 document does not call the correct one. If anyone has an idea about what might be happening here, or who else I could ask about the situation, I would appreciate any input. Thanks, Ben Ward [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] row selection
> x[seq(1, nrow(x), by=5), ] -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Thursday, October 08, 2009 4:19 PM To: Ashta Cc: R help Subject: Re: [R] row selection On Oct 8, 2009, at 4:14 PM, Ashta wrote: > Hi all, > I have a matrix named x with N by C > I want to select every 5 th rrow from matrix x I used the following > code > n<- nrow(x) >> for(i in 1: n){ > + b <- a[i+5,] >> b > } > Error: subscript out of bounds What did you expect when "i" in your loop counter became one greater than the number of rows? > David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S. News & World Report (2008). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] row selection
sub3 <- x[-seq(1, nrow(x), by=5), ] Notice the '-' in front of the seq() command. This will select everything but what is in the sequence. From: Ashta [mailto:sewa...@gmail.com] Sent: Friday, October 09, 2009 12:42 PM To: Nutter, Benjamin Cc: r-help@r-project.org Subject: Re: [R] row selection Hi all, Thank you for your help. Now I am able to select every 5th row of the data from the main data set (x) using sub1<- x[seq(1, nrow(x), by=5), ] So sub1 contains one fith of the data set X. I want also create another data set that will contain the remaining data set from X (ie., four fifth of the data set). Any help is highly appreciated. I have a matrix named x with N by C I want to select every 5 th rrow from matrix x I used the following code > n<- nrow(x) >> for(i in 1: n){ > + b <- a[i+5,] >> b > } >sc < x[seq(1, nrow(x), by=5), ] -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Thursday, October 08, 2009 4:19 PM To: Ashta Cc: R help Subject: Re: [R] row selection On Oct 8, 2009, at 4:14 PM, Ashta wrote: > Hi all, > I have a matrix named x with N by C > I want to select every 5 th rrow from matrix x I used the following > code > n<- nrow(x) >> for(i in 1: n){ > + b <- a[i+5,] >> b > } > Error: subscript out of bounds What did you expect when "i" in your loop counter became one greater than the number of rows? > David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S. News & World Report (2008). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S. News & World Report (2008). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:16}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Functional data analysis - problems with smoothing
Hi all, I'm having major issues with smoothing my functional data I'm referring to Jim Ramsay's examples in his books. The following error message keeps appearing, despite all my data being numeric can anyone kindly offer any suggestions? isi - vector of argument values - i.e. the independent variable of the curves rlz - data array TMSfdPar - functional data parameter. > TMSfdPar = fdPar(TMSbasis, 4, 0.01) > TMSfdsmooth = smooth.basis(isi, rlz, TMSfdPar) Error in smooth.basis(isi, rlz, TMSfdPar) : 'argvals' is not numeric. I don't understand why the error message keeps popping up. I've tried playing around with Jim Ramsay's datasets and I think my data is organised in a similar manner to his, so can't understand what's going on oh, the frustration! Thanks in advance, Ben (the amateur R data analyst and statistician.) __ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Functional data analysis - problems with smoothing
Hi Peter and Spencer (and everyone else out there), Thank you for your prompt reply to my post re. FDA - FYI - i'm very new to R and FDA. Currently half way through a biostats masters degree in Sydney, Australia! Absolutely loving it and I hope to include R computing on my future CV! Just to clarify things: When I entered what you suggested, this is what I obtained with my own dataset > str(isi) num [1:14] 1 1.5 2 2.5 3 3.5 4 5 7 10 ... Similarly, for Jim Ramsay's dataset (chapter five of Use R book): > str(age) num [1:31] 1 1.25 1.5 1.75 2 3 4 5 6 7 ... Doesn't this mean that my data object, 'isi' is numeric? Also, I was looking through Jim Ramsay's datasets - I'm fairly sure that my vector, 'isi' was similarly organised to his vector, 'age' in chapter 5 of his new Use R FDA book, so I'm fairly certain that the data is numeric but it obviously isn't Hope this email clarified things - apologies if it hasn't - still very new to the terminology and overall feel of R. The R user network is terrific! Thank you for caring! Ben From: Peter Ehlers Cc: R-help@r-project.org Sent: Fri, 23 October, 2009 5:07:15 PM Subject: Re: [R] Functional data analysis - problems with smoothing The error message is pretty clear: regardless of what *you* think, R says that 'isi' is not numeric. Are you sure that 'isi' is not a *factor* object? I'm willing to bet that it is. Use str() to check your data. -Peter Ehlers Benjamin Cheah wrote: > Hi all, > > I'm having major issues with smoothing my functional data > > I'm referring to Jim Ramsay's > examples in his books. The following error message keeps appearing, > despite all my data being numeric can anyone kindly offer any suggestions? > > isi - vector of argument values - i.e. the independent variable of the curves > rlz - data array > TMSfdPar - functional data parameter. > >> TMSfdPar = fdPar(TMSbasis, 4, 0.01) >> TMSfdsmooth = smooth.basis(isi, rlz, TMSfdPar) > Error in smooth.basis(isi, rlz, TMSfdPar) : 'argvals' is not numeric. > > I don't understand why the error message keeps popping up. I've tried playing > around with Jim Ramsay's datasets and I think my data is organised in a > similar manner to his, so can't understand what's going on oh, the > frustration! > > Thanks in advance, > > Ben > (the amateur R data analyst and statistician.) > > > > > __ > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extracting the last row of each group in a data frame
I usually use the following function: last.record <- function(data, id, ..., na.last=TRUE, decreasing=FALSE){ #*** Make vector of variables to sort by v <- c(id, unlist(list(...))) #*** Sort Data Frame data <- data[do.call(order, c(data[,v, drop=FALSE], na.last=na.last, decreasing=decreasing)),] #*** Extract last record for each id data[!duplicated(data[,id], fromLast=TRUE),] } DataData Frame from which the record is to be extracted Id ID variable from which the record is to be extracted. The data frame is automatically sorted by this variable. May be either a character string or an integer. ... Names of variables (or indices) in additon to id by which data should be sorted. na.last Argument passed to order(). Determines if missing values are placed at the end of the sorting. Decreasing Argument passed to order(). Determines if data frame is sorted in descending order. So, in your example > df <- data.frame(Name = c("A", "A", "A", "B", "B", "C", "D"), Value = c(1, 2, 3, 4, 8, 2, 3)) > last.record(df, "Name", "Value") > last.record(df, 1, 2) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hao Cen Sent: Monday, November 16, 2009 2:43 PM To: r-help@r-project.org Subject: [R] extracting the last row of each group in a data frame Hi, I would like to extract the last row of each group in a data frame. The data frame is as follows Name Value A 1 A 2 A 3 B 4 B 8 C 2 D 3 I would like to get a data frame as Name Value A 3 B 8 C 2 D 3 Thank you for your suggestions in advance Jeff __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. === P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2009). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Adding and Multiplying two Unevaluated Expressions
HI, As I'm trying to compute Taylor series, I'm having problems in adding and multiplying unevaluated expressions. I searched for a solution but found none. my Taylor function works fine for evaluating functions as you can see here: rTaylorVal=function(exp,x0,dx,n) { ls=list(x=x0) newexp=eval(exp,ls) exp0=exp for (i in 1:n){ exp0=D(exp0,"x") newexp=newexp+eval(exp0,ls)/factorial(i)*dx^i } return(newexp) } Where exp is an expression like exp=expression(x^2*sin(x)), x0 is the startvalue, dx the difference between startvalue and searched value and n is the length of the series. So I tried to remove dx as a value, to get a Taylor series expression, but it doesn't work as simple multiplication (*) and accumulation (+) is not good for expressions. That's my point so far, now my question: Is it actually possible to add and/or multiply expressions, and how? Thank you so far. Benjamin Müller Geographer (B.Sc.) -- GRATIS für alle GMX-Mitglieder: Die maxdome Movie-FLAT! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [Fwd: Re: Adding and Multiplying two Unevaluated Expressions]
Original-Nachricht Betreff:Re: [R] Adding and Multiplying two Unevaluated Expressions Datum: Tue, 01 Dec 2009 23:49:39 +0100 Von:Benjamin Müller An: Rolf Turner Referenzen: <20091201144125.316...@gmx.net> <8e40e49f-e8fc-4fbd-8cc5-93789ffb0...@auckland.ac.nz> This works fine for your example, but doesn't work as simple if there's more than these expressions. In my example this would be in line 11: newexp=as.expression(substitute(a+con1/con2*b^con3,list(a=newexp[[1]],b=expression(dx)[[1]],con1=eval(exp0,ls),con2=factorial(i),con3=1*i))) this works just fine and you can still evaluate this. thank you very much! If there's an easier way, let me know. Greets, Ben Müller Rolf Turner schrieb: On 2/12/2009, at 3:41 AM, Benjamin Müller wrote: HI, As I'm trying to compute Taylor series, I'm having problems in adding and multiplying unevaluated expressions. I searched for a solution but found none. my Taylor function works fine for evaluating functions as you can see here: rTaylorVal=function(exp,x0,dx,n) { ls=list(x=x0) newexp=eval(exp,ls) exp0=exp for (i in 1:n){ exp0=D(exp0,"x") newexp=newexp+eval(exp0,ls)/factorial(i)*dx^i } return(newexp) } Where exp is an expression like exp=expression(x^2*sin(x)), x0 is the startvalue, dx the difference between startvalue and searched value and n is the length of the series. So I tried to remove dx as a value, to get a Taylor series expression, but it doesn't work as simple multiplication (*) and accumulation (+) is not good for expressions. That's my point so far, now my question: Is it actually possible to add and/or multiply expressions, and how? This may well be a case of the blind leading the partially sighted, but for what it's worth my answer is ``Yes, but it's a kludge.'' One needs to use substitute, it seems to me. E.g.: e1 <- expression((x+y)^2) e2 <- expression(1/(x^2 + y^2)) e3 <- as.expression(substitute(a+b,list(a=e1[[1]],b=e2[[1]]))) e3 D(e3,"x") D(e3,"y") Older and wiser heads may provide you with better counsel. cheers, Rolf Turner ## Attention:This e-mail message is privileged and confidential. If you are not theintended recipient please delete the message and notify the sender.Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshalwww.marshalsoftware.com ## __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.