[R] quantile() and "factors not allowed"
A list (t) that I'm trying to pass to quantile() is causing this error:

Error in quantile.default(t, probs = c(0.9, 9.95, 0.99))
  factors are not allowed

I've successfully used lists before, and am having difficulty finding my mistake. Any suggestions appreciated!

-Steve

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] quantile() and "factors not allowed"
The underlying data contained values that resulted in factor instead of numeric fields during the read.csv. Problem fixed! I also introduced a typo while copying the error into my message, and as for the poor variable naming, I'll be more careful. Thanks x3!

Corrected structure:

> str(CPU)
'data.frame':   56470 obs. of  8 variables:
 $ Value       : num  2.91 9.10e-01 1.08e+07 3.88e+06 3.03 ...
 $ Timestamp   : Factor w/ 4835 levels "9/17/2010 15:30",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ MetricId    : Factor w/ 5 levels "cpu.usage.average",..: 1 1 4 4 1 1 4 4 1 1 ...
 $ Unit        : Factor w/ 4 levels "%","count","KB",..: 1 1 3 3 1 1 3 3 1 1 ...
 $ Entity      : Factor w/ 2 levels "system1",..: 2 1 2 1 2 1 2 1 2 1 ...
 $ EntityId    : Factor w/ 3 levels "","EI1",..: 2 3 2 3 2 3 2 3 2 3 ...
 $ IntervalSecs: int  1800 1800 1800 1800 1800 1800 1800 1800 1800 1800 ...
 $ Instance    : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...

> Hi Steve,
>
> The basic problem (as the error suggests) is that data of class
> "factor" is not allowed in quantile.default. So one of the elements
> of your list must be a factor. What are the results of str(t)?
> As a side note, since t() is a function, using t as a variable name
> can be a bit confusing.
>
> If your list is relatively small, you could post the results of dput(t),
> which would allow us to see what your data is actually like and
> perhaps identify the exact problem and offer a solution.
>
> Cheers,
>
> Josh
>
> On Tue, Sep 28, 2010 at 5:56 PM, Steve wrote:
>> A list (t) that I'm trying to pass to quantile() is causing this error:
>>
>> Error in quantile.default(t, probs = c(0.9, 9.95, 0.99))
>>   factors are not allowed
>>
>> I've successfully used lists before, and am having difficulty finding my
>> mistake. Any suggestions appreciated!
>>
>> -Steve

> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/
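For the archive, the fix discussed above can be sketched as follows; the column values and file name here are illustrative, not taken from the thread:

```r
# A factor column must go through as.character() before as.numeric():
# as.numeric() applied directly to a factor returns the internal level
# codes, not the printed values.
x <- factor(c("2.91", "0.91", "3.03"))        # what read.csv produced
quantile(as.numeric(as.character(x)), probs = c(0.9, 0.95, 0.99))

# Or avoid the coercion at read time (hypothetical file name):
# CPU <- read.csv("cpu_metrics.csv", stringsAsFactors = FALSE)
```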
[R] 3D tomography data
Hi all,

I was recently informed about R, as I need a program that can calculate the nearest neighbour in 3D tomography data with its vector. However, I am new to R and it isn't exactly intuitive. Does anyone know of any 3D tutorials that may help me get started?

Cheers,
Steve

--
View this message in context: http://r.789695.n4.nabble.com/3D-tomography-data-tp2401591p2401591.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] 3D tomography data
Like I was saying, I want to be able to calculate the nearest neighbour and its vector. I think this can be done using pairdist or K3est in the spatstat package. But I have no idea how to prepare my data in a form that the software will recognise. How do I turn my tomography data into pp3 format?

--
View this message in context: http://r.789695.n4.nabble.com/3D-tomography-data-tp2401591p2403148.html
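For readers landing on this thread, a minimal sketch of building a pp3 object from raw coordinates and getting nearest-neighbour information; it assumes the tomography coordinates sit in columns x, y, z (the data frame and column names are illustrative):

```r
library(spatstat)

# Fake coordinates standing in for the tomography data
dat <- data.frame(x = runif(100), y = runif(100), z = runif(100))

# pp3() takes the coordinates plus a box3 observation window
X <- pp3(dat$x, dat$y, dat$z,
         box3(range(dat$x), range(dat$y), range(dat$z)))

d <- nndist(X)      # distance from each point to its nearest neighbour
i <- nnwhich(X)     # index of that nearest neighbour
v <- dat[i, ] - dat # the vector from each point to its neighbour
```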
[R] How do I get a character and a symbol in a legend
In the following snippet:

plot(1:10, 1:10, type = "n")
points(1:5, 1:5, pch = "+")
points(6:10, 6:10, pch = 20)
legend(5, 5, c("A", "B"), pch = c("+", 20))

I want to get a legend with a "+" and a solid circle (pch=20). However, what I get in the legend is "+" and "2". How can I get a "+" and a solid circle?

thanks,
Steve

> sessionInfo()
R version 2.6.1 (2007-11-26)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices datasets utils methods base

loaded via a namespace (and not attached):
[1] rcompgen_0.1-17
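The cause, for the archive: c("+", 20) is coerced to the character vector c("+", "20"), and only the first character of "20" is drawn, hence the "2". A sketch of the usual fix is to keep pch numeric throughout; numeric code 3 draws a plus sign:

```r
plot(1:10, 1:10, type = "n")
points(1:5, 1:5, pch = 3)     # pch = 3 is the "+" symbol
points(6:10, 6:10, pch = 20)  # solid circle
legend(5, 5, c("A", "B"), pch = c(3, 20))
```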
Re: [R] Plot multiple lines, same plot, different axes?
Ron,

It's Steve Thornton here from Leicester in the UK. Hope you are well. I got your card and newsletter. It sounds like you're still travelling. If you get this e-mail please mail me back, as I'd like to keep in touch. I've been trying to find your e-mail address but you seem to have several. My last messages got bounced back, so I hope this one finds you. Take care and hope to hear from you soon.

Steve
[R] An array of an array of boxplots in lattice
Using the data set fgl in MASS, the following code

layout(matrix(1:9, 3, 3))
for (i in 1:9) {
  boxplot(fgl[, i] ~ type, data = fgl, main = dimnames(fgl)[[2]][i])
}

produces a 3 by 3 array of plots, each one of which consists of six boxplots. Is it possible to do this in lattice?

Steve
"R version 2.7.2 (2008-08-25)" on Ubuntu 6.06
Re: [R] An array of an array of boxplots in lattice
Thank you. Here's my version, using melt instead of do.call(make.groups...):

library(reshape)
fgl2 = melt(fgl[, -10])
fgl2$type = fgl$type
bwplot(value ~ type | variable, data = fgl2)

Steve

Deepayan Sarkar wrote:
> On Mon, Nov 17, 2008 at 11:15 AM, Chuck Cleland <[EMAIL PROTECTED]> wrote:
>> On 11/17/2008 1:50 PM, steve wrote:
>>> Using the data set fgl in MASS the following code
>>>
>>> layout(matrix(1:9, 3, 3))
>>> for (i in 1:9) {
>>>   boxplot(fgl[, i] ~ type, data = fgl, main = dimnames(fgl)[[2]][i])
>>> }
>>>
>>> produces a 3 by 3 array of plots, each one of which consists of six
>>> boxplots. Is it possible to do this in lattice?
>>
>> library(MASS)
>> library(lattice)
>> newdf <- reshape(fgl,
>>   varying = list(c('RI','Na','Mg','Al','Si','K','Ca','Ba','Fe')),
>>   v.names = 'Y',
>>   times = c('RI','Na','Mg','Al','Si','K','Ca','Ba','Fe'),
>>   direction = 'long')
>>
>> bwplot(Y ~ type | time, data = newdf, ylab = "",
>>   scales = list(y = list(relation = 'free')))
>
> And a slightly less verbose version of this step is:
>
> newdf <- do.call(make.groups, fgl[-10])
> newdf$type <- fgl$type
>
> followed by
>
> bwplot(data ~ which | type, data = newdf)
>
> -Deepayan

--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
[R] ggplot2 problem
I'm using ggplot2 2.0.8 and R 2.8.0.

df = data.frame(Year = rep(1:5, 2))
m = ggplot(df, aes(Year = Year))
m + geom_bar()

Error in get("calculate", env = ., inherits = TRUE)(., ...) :
  attempt to apply non-function
Re: [R] ggplot2 problem
Yes - that's it. Thank you!

Steve
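The suggestion being confirmed here is not preserved in the archive; presumably it was that aes(Year = Year) defines an aesthetic named "Year", which ggplot2 does not recognise, and the positional aesthetic must be named x. A sketch of the corrected call:

```r
library(ggplot2)
df <- data.frame(Year = rep(1:5, 2))
ggplot(df, aes(x = Year)) + geom_bar()
```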
[R] Trouble building R 3.5.0 under Ubuntu 18.04
I would love to hear from anyone who has successfully built 3.5.0 under Ubuntu 18.04 (Bionic Beaver). My attempts have failed, including:

export LDFLAGS="$LDFLAGS -fPIC"
export CXXFLAGS="$CXXFLAGS -fPIC"
export CFLAGS="$CFLAGS -fPIC"
./configure --enable-R-shlib --prefix=/usr/lib/R/3.5.0

Configure completes normally, without errors or warnings.

make

make fails, always with lines like:

/usr/bin/x86_64-linux-gnu-ld: ../appl/dtrsl.o: relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/x86_64-linux-gnu-ld: attrib.o: relocation R_X86_64_PC32 against symbol `R_NilValue' can not be used when making a shared object; recompile with -fPIC
/usr/bin/x86_64-linux-gnu-ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
Makefile:177: recipe for target 'libR.so' failed
make[3]: *** [libR.so] Error 1
make[3]: Leaving directory '/home/steve/src/R/R-3.5.0/src/main'
Makefile:135: recipe for target 'R' failed

How does one set the -fPIC flag? I have never had trouble compiling under Mint, which is based on Ubuntu.

Thanks!
Steve

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
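One answer to "How does one set the -fPIC flag?", sketched under the assumption of a fresh source tree: exported flags can be lost between sub-makes, and the failing dtrsl.o is Fortran code, which CFLAGS/CXXFLAGS never touch. Passing all three flag sets as arguments to configure records them for the whole build (the -g -O2 defaults shown are illustrative):

```shell
# configure stores these values, so every C, C++ and Fortran compile
# sees -fPIC consistently across sub-makes.
./configure --enable-R-shlib --prefix=/usr/lib/R/3.5.0 \
    CFLAGS="-g -O2 -fPIC" CXXFLAGS="-g -O2 -fPIC" FFLAGS="-g -O2 -fPIC"
make clean   # stale non-PIC objects (such as dtrsl.o) must be rebuilt
make
```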
Re: [R] boot.stepAIC fails with computed formula
The error is:

"the model fit failed in 50 bootstrap samples Error: non-character argument"

Cheers,
SOH.

On 22/08/2017 17:52, Bert Gunter wrote:
> Failed? What was the error message?
>
> Cheers,
> Bert
>
> Bert Gunter
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it." -- Opus (aka Berkeley Breathed in his
> "Bloom County" comic strip)
>
> On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan wrote:
>> I'm trying to use boot.stepAIC for feature selection; I need to be able
>> to specify the name of the dependent variable programmatically, but this
>> appears to fail. In R-Studio with MS R Open 3.4:
>>
>> library(bootStepAIC)
>>
>> # Fake data
>> n <- 200
>> x1 <- runif(n, -3, 3)
>> x2 <- runif(n, -3, 3)
>> x3 <- runif(n, -3, 3)
>> x4 <- runif(n, -3, 3)
>> x5 <- runif(n, -3, 3)
>> x6 <- runif(n, -3, 3)
>> x7 <- runif(n, -3, 3)
>> x8 <- runif(n, -3, 3)
>> y1 <- 42 + x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>>
>> dat <- data.frame(x1, x2, x3, x4, x5, x6, x7, x8, y1)
>>
>> # The real data won't have these names...
>> cn <- names(dat)
>> trg <- "y1"
>> xvars <- cn[cn != trg]
>>
>> frm1 <- as.formula(paste(trg, "~1"))
>> frm2 <- as.formula(paste(trg, "~ 1 + ", paste(xvars, collapse = "+")))
>>
>> strt = lm(y1 ~ 1, dat)                      # boot.stepAIC works fine
>> # strt = do.call("lm", list(frm1, data=dat))  ## boot.stepAIC FAILS ##
>> # strt = lm(frm1, dat)                        ## boot.stepAIC FAILS ##
>>
>> limit <- 5
>> stp = stepAIC(strt, direction = 'forward', steps = limit,
>>               scope = list(lower = frm1, upper = frm2))
>> bst <- boot.stepAIC(strt, dat, B = 50, alpha = 0.05,
>>                     direction = 'forward', steps = limit,
>>                     scope = list(lower = frm1, upper = frm2))
>> b1 <- bst$Covariates
>> ball <- data.frame(b1)
>> names(ball) = unlist(trg)
>>
>> Any ideas?
>>
>> Cheers,
>> SOH
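A workaround often suggested for this class of problem (not taken from the thread itself, so treat it as a sketch): stepAIC and boot.stepAIC re-fit the model from its stored call, so the call must contain the literal formula rather than the variable name frm1. eval(bquote(...)) splices the formula object into the call:

```r
# Uses the frm1 and dat objects defined in the quoted code above.
strt <- eval(bquote(lm(.(frm1), data = dat)))
strt$call   # the call now reads lm(formula = y1 ~ 1, data = dat)
```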
[R] Trouble getting rms::survplot(..., n.risk=TRUE) to behave properly
Hello folks,

I'm trying to plot the number of patients at risk by setting the `n.risk` parameter to `TRUE` in the rms::survplot function. However, it looks as if the numbers presented in the rows for each category are just summing the total number of patients at risk across all groups at each timepoint; that is, the numbers are equal in each category down the rows, and they don't seem to be the numbers specific to each group.

You can reproduce the observed behavior by simply running the code in the Examples section of ?survplot, which I'll paste below for convenience.

Is the error between the chair and the keyboard here, or is this perhaps a bug?

=== code ===
library(rms)
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, rep=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens, 1, 0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')
S <- Surv(dt, e)

f <- cph(S ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
survplot(f, sex, n.risk=TRUE)
===

I'm using the latest version of rms (4.5-0) running on R 3.3.0-patched.
=== Output of sessionInfo() ===
R version 3.3.0 Patched (2016-05-26 r70671)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.4 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rms_4.5-0       SparseM_1.7     Hmisc_3.17-4    ggplot2_2.1.0
[5] Formula_1.2-1   survival_2.39-4 lattice_0.20-33

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.5         cluster_2.0.4       MASS_7.3-45
 [4] splines_3.3.0       munsell_0.4.3       colorspace_1.2-6
 [7] multcomp_1.4-5      plyr_1.8.3          nnet_7.3-12
[10] grid_3.3.0          data.table_1.9.6    gtable_0.2.0
[13] nlme_3.1-128        quantreg_5.24       TH.data_1.0-7
[16] latticeExtra_0.6-28 MatrixModels_0.4-1  polspline_1.1.12
[19] Matrix_1.2-6        gridExtra_2.2.1     RColorBrewer_1.1-2
[22] codetools_0.2-14    acepack_1.3-3.3     rpart_4.1-10
[25] sandwich_2.3-4      scales_0.4.0        mvtnorm_1.0-5
[28] foreign_0.8-66      chron_2.3-47        zoo_1.7-13
===

Thanks,
-steve

--
Steve Lianoglou
Computational Biologist
Genentech
Re: [R] Trouble getting rms::survplot(..., n.risk=TRUE) to behave properly
Ah! Sorry ... should have dug deeper into the examples section to notice that. Thank you for the quick reply,
-steve

On Thu, Jun 2, 2016 at 8:59 AM, Frank Harrell wrote:
> This happens when you have no strat() variables in the model.
>
> --
> Frank E Harrell Jr
> Professor and Chairman, Department of Biostatistics
> School of Medicine, Vanderbilt University
>
> On Thu, Jun 2, 2016 at 10:55 AM, Steve Lianoglou wrote:
>> Hello folks,
>>
>> I'm trying to plot the number of patients at risk by setting the
>> `n.risk` parameter to `TRUE` in the rms::survplot function. However,
>> it looks as if the numbers presented in the rows for each category
>> are just summing the total number of patients at risk across all
>> groups at each timepoint; that is, the numbers are equal in each
>> category down the rows, and they don't seem to be the numbers
>> specific to each group.
>>
>> You can reproduce the observed behavior by simply running the code
>> in the Examples section of ?survplot.
>>
>> Is the error between the chair and the keyboard here, or is this
>> perhaps a bug?
>>
>> [quoted example code trimmed; see the original message above]
>>
>> I'm using the latest version of rms (4.5-0) running on R 3.3.0-patched.
>> [quoted sessionInfo output trimmed; see the original message above]

--
Steve Lianoglou
Computational Biologist
Genentech
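Continuing the example code from the quoted message, the fix implied by the reply is to enter the grouping variable through strat(), so that each group is a stratum with its own at-risk counts; a sketch:

```r
# sex enters as a stratification factor rather than an ordinary covariate
f <- cph(S ~ rcs(age, 4) + strat(sex), x = TRUE, y = TRUE)
survplot(f, sex, n.risk = TRUE)  # rows now show per-stratum counts
```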
[R] Apply a multi-variable function to a vector
Hello,

I would like to define an arbitrary function of an arbitrary number of variables; for example, for 2 variables:

func2 <- function(time, temp) time + temp

I'd like to keep variable names that have a meaning in the problem (time and temperature above). If I have a vector of values for these variables, for example c(10, 121) in the 2-d case, I'd like to apply my function (in this case func2) and obtain the result. Conceptually, something like:

func2(c(10, 121)) becomes func2(10, 121)

Is there a simple way to accomplish this for an arbitrary number of variables? I'd like something that would work simply from the definition of the function, if that is possible.

Thanks,
Steve Kennedy

CONFIDENTIALITY NOTICE: This e-mail message, including a...{{dropped:11}}
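One standard answer, as a sketch: do.call() calls a function with arguments taken from a list, so as.list() on the vector does exactly the requested conversion, for any number of variables:

```r
func2 <- function(time, temp) time + temp

v <- c(10, 121)
do.call(func2, as.list(v))   # equivalent to func2(10, 121)

# A named vector matches arguments by name, in any order:
do.call(func2, as.list(c(temp = 121, time = 10)))
```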
Re: [R] cbind question, please
This works for me:

get0 = function(x) get(x, pos = 1)
sapply(big.char, get0)

The extra step seems necessary because without it, get() gets base::cat() instead of cat.

cheers,
Steve

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Erin Hodgess
Sent: Friday, 24 April 2015 10:41a
To: R help
Subject: [R] cbind question, please

Hello!

I have a cbind type question, please. Suppose I have the following:

dog <- 1:3
cat <- 2:4
tree <- 5:7

and a character vector:

big.char <- c("dog", "cat", "tree")

I want to end up with a matrix that is a "cbind" of dog, cat, and tree. This is a toy example; there will be a bunch of variables. I experimented with "do.call", but all I got was

1 2 3

Any suggestions would be much appreciated. I still think that do.call might be the key, but I'm not sure.

R version 3.1.3, Windows 7.

Thanks,
Erin

--
Erin Hodgess
Associate Professor
Department of Mathematics and Statistics
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
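An alternative sketch of the same idea: mget() fetches several objects by name in one call, which sidesteps the get()/pos = 1 subtlety with base::cat mentioned above:

```r
dog <- 1:3; cat <- 2:4; tree <- 5:7
big.char <- c("dog", "cat", "tree")

# mget() returns a named list of the objects; cbind assembles the matrix,
# carrying the names over as column names.
do.call(cbind, mget(big.char, envir = globalenv()))
```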
Re: [R] Plotting Confidence Intervals
Have you tried:

library(effects)
plot(allEffects(ines), ylim = c(460, 550))

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Andre Roldao
Sent: Saturday, 2 May 2015 2:50p
To: r-help@r-project.org
Subject: [R] Plotting Confidence Intervals

Hi Guys,

It's the first time I use R-help and I hope you can help me. How can I plot confidence intervals with the data below?

#Package Austria
library(car)
#head(States)
States1 = data.frame(States)
ines = lm(SATM ~ log2(pop) + SATV, data = States1)
summary(ines)

NJ = as.data.frame(States["NJ", c(4, 2, 3)])  # Identification of state NJ

p_conf <- predict(ines, interval = "confidence", NJ, level = 0.95)
p_conf   # Confidence interval for state NJ at the 95% level
round(p_conf, digits = 3)

p_conf1 <- predict(ines, interval = "confidence", NJ, level = 0.99)
p_conf1  # Confidence interval for state NJ at the 99% level
round(p_conf, digits = 3)

p_pred2 <- predict(ines, interval = "prediction", NJ, level = 0.95)
p_pred2  # Prediction interval for state NJ at the 95% level
round(p_pred2, digits = 3)

p_pred3 <- predict(ines, interval = "prediction", NJ, level = 0.99)
p_pred3  # Prediction interval for state NJ at the 99% level
round(p_pred3, digits = 3)

Thanks
Re: [R] Call to a function
Note that objects can have more than one class, in which case your == and %in% might not work as expected. Better to use inherits().

cheers,
Steve

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Steven Yen
Sent: Wednesday, 24 June 2015 11:37a
To: boB Rudis
Cc: r-help mailing list
Subject: Re: [R] Call to a function

Thanks! From this I learn the much-needed class statement

if (class(wt) == "character") wt <- x[, wt]

which serves my need in a bigger project.

Steven Yen

On 6/23/2015 6:20 PM, boB Rudis wrote:
> You can do something like:
>
> aaa <- function(data, w = w) {
>   if (class(w) %in% c("integer", "numeric", "double")) {
>     out <- mean(w)
>   } else {
>     out <- mean(data[, w])
>   }
>   return(out)
> }
>
> (there are some typos in your function you may want to double check, too)
>
> On Tue, Jun 23, 2015 at 5:39 PM, Steven Yen wrote:
>> mydata <- data.frame(matrix(1:20, ncol = 2))
>> colnames(mydata) <- c("v1", "v2")
>> summary(mydata)
>>
>> aaa <- function(data, w = w) {
>>   if (is.vector(w)) {
>>     out <- mean(w)
>>   } else {
>>     out <- mean(data[wt])
>>   }
>>   return(out)
>> }
>>
>> aaa(mydata, mydata$v1)
>> aaa(mydata, "v1")  # want this call to work

--
Steven Yen
My e-mail alert: https://youtu.be/9UwEAruhyhY?list=PLpwR3gb9OGHP1BzgVuO9iIDdogVOijCtO
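A short illustration of the point about multiple classes, as a sketch using a base glm fit:

```r
fit <- glm(mpg ~ wt, data = mtcars)
class(fit)            # c("glm", "lm"): the object carries two classes
class(fit) == "lm"    # length-2 logical vector, unsafe inside if()
inherits(fit, "lm")   # TRUE, and always length 1
```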
Re: [R] Subset() within function: logical error
Using return() within a for loop makes no sense: only the first one will be returned. How about:

alldf.B = subset(alldf, stream == 'B')
# etc...

Also, have a look at unique(alldf$stream) or levels(alldf$stream) if you want to use a for loop over each unique value.

cheers,
Steve

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rich Shepard
Sent: Tuesday, 30 June 2015 12:04p
To: r-help@r-project.org
Subject: [R] Subset() within function: logical error

Moving from interactive use of R to scripts and functions, I have bumped into what I believe is a problem with variable names. I did not see a solution in the two R programming books I have or from my Web searches. Inexperience with ess-tracebug keeps me from refining my bug tracking.

Here's a test data set (cleverly called 'testset.dput'):

structure(list(stream = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L), .Label = c("B", "J", "S"), class = "factor"),
    sampdate = structure(c(8121, 8121, 8121, 8155, 8155, 8155,
    8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
    8257, 8257, 8308, 8785, 8785, 8785, 8785, 8785, 8785, 8785,
    8847, 8847, 8847, 8847, 8847, 8847, 8847, 8875, 8875, 8875,
    8875, 8875, 8875, 8875, 8121, 8121, 8121, 8155, 8155, 8155,
    8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
    8257, 8257, 8301, 8301, 8301), class = "Date"),
    param = structure(c(2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L,
    6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 1L, 2L, 3L, 4L, 5L, 6L,
    7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
    2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L,
    2L, 6L, 7L, 2L, 6L, 7L), .Label = c("Ca", "Cl", "K", "Mg",
    "Na", "SO4", "pH"), class = "factor"),
    quant = c(4, 33, 8.43, 4, 32, 8.46, 4, 31, 8.43, 6, 33, 8.32,
    5, 33, 8.5, 5, 32, 8.5, 5, 59.9, 3.46, 1.48, 29, 7.54, 64.6,
    7.36, 46, 2.95, 1.34, 21.8, 5.76, 48.8, 7.72, 74.2, 5.36,
    2.33, 38.4, 8.27, 141, 7.8, 3, 76, 6.64, 4, 74, 7.46, 2, 82,
    7.58, 5, 106, 7.91, 3, 56, 7.83, 3, 51, 7.6, 6, 149, 7.73)),
    .Names = c("stream", "sampdate", "param", "quant"),
    row.names = c(NA, -61L), class = "data.frame")

I want to subset that data.frame on each of the stream names: B, J, and S. This is the function that has the naming error (eda.R):

extstream = function(alldf) {
  sname = alldf$stream
  sdate = alldf$sampdate
  comp  = alldf$param
  value = alldf$quant
  for (i in sname) {
    sname <- subset(alldf, alldf$stream, select = c(sdate, comp, value))
    return(sname)
  }
}

This is the result of running source('eda.R') followed by:

> extstream(testset)
Error in subset.data.frame(alldf, alldf$stream, select = c(sdate, comp, :
  'subset' must be logical

I've tried using sname for the rows to select, but that produces a different error about trying to select undefined columns. A pointer to the correct syntax for subset() is needed.

Rich
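Beyond per-level subset() calls, split() handles all three streams at once; a sketch using the testset data frame from the post:

```r
# One data frame per stream level, returned as a named list ("B", "J", "S")
bystream <- split(testset[, c("sampdate", "param", "quant")], testset$stream)
names(bystream)
head(bystream$B)
```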
[R] valid LRT between MASS::polr and nnet::multinom
Dear R-helpers,

Does anyone know if the likelihoods calculated by these two packages are comparable in this way? That is, is this a valid likelihood ratio test?

# Reproducible example:
library(MASS)
library(nnet)
data(housing)

polr1 = MASS::polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
mnom1 = nnet::multinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

pll = logLik(polr1)
mll = logLik(mnom1)

res = data.frame(
  model    = c('Proportional odds', 'Multinomial'),
  Function = c('MASS::polr', 'nnet::multinom'),
  nobs     = c(attr(pll, 'nobs'), attr(mll, 'nobs')),
  df       = c(attr(pll, 'df'), attr(mll, 'df')),
  logLik   = c(pll, mll),
  deviance = c(deviance(polr1), deviance(mnom1)),
  AIC      = c(AIC(polr1), AIC(mnom1)),
  stringsAsFactors = FALSE
)
res[3, 1:2] = c("Difference", "")
res[3, 3:7] = apply(res[, 3:7], 2, diff)[1, ]
print(res)

mytest = structure(
  list(
    statistic = setNames(res$logLik[3], "X-squared"),
    parameter = setNames(res$df[3], "df"),
    p.value   = pchisq(res$logLik[3], res$df[3], lower.tail = FALSE),
    method    = "Likelihood ratio test",
    data.name = "housing"
  ),
  class = 'htest'
)
print(mytest)

# If you want to see the fitted results:
library(effects)
plot(allEffects(polr1), layout = c(3, 1), ylim = 0:1)
plot(allEffects(mnom1), layout = c(3, 1), ylim = 0:1)

many thanks,
Steve
[R] modifying a package installed via GitHub
Hi Folks,

I am working with a package installed via GitHub that I would like to modify. However, I am not sure how I would go about loading a 'local' version of the package after I have modified it, and whether that process would include uninstalling the original unmodified package (and, conversely, how to uninstall my local, modified version if I wanted to go back to the unmodified version available on GitHub). Any advice would be appreciated.

Thanks,
Steve

--
View this message in context: http://r.789695.n4.nabble.com/modifying-a-package-installed-via-GitHub-tp4710016.html
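One common workflow, as a sketch rather than a definitive answer; the package name, user, and path are illustrative, and it assumes the devtools and remotes packages are installed:

```r
# After cloning the repository and editing it locally:
devtools::load_all("~/src/somepkg")   # try out edits without installing
devtools::install("~/src/somepkg")    # install the modified version,
                                      # overwriting the GitHub install

# To return to the unmodified GitHub version:
remove.packages("somepkg")
remotes::install_github("someuser/somepkg")
```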
Re: [R] Opposite color in R
I wonder if the hcl colour space is useful? Varying hue while keeping chroma and luminosity constant should give varying colours of perceptually the same "colourness" and brightness.

?hcl
pie(rep(1, 12), col = hcl((1:12) * 30, c = 70), border = NA)

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Atte Tenkanen
Sent: Sunday, 26 July 2015 7:50a
To: r-help@r-project.org
Subject: [R] Opposite color in R

Hi,

I have tried to find a way to find opposite or complementary colors in R. I would like to form a color circle with R like this one:

http://nobetty.net/dandls/colorwheel/complementary_colors.jpg

If you just make a basic color wheel in R, the colors do not form a complementary color circle:

palette(rainbow(24))
Colors = palette()
pie(rep(1, 24), col = Colors)

There is a package "colortools" where you can find the function opposite(), but it doesn't work as is said. I tried

library(colortools)
opposite("violet")

and got green instead of yellow, and

opposite("blue")

and got yellow instead of orange.

Do you know any solutions?

Atte Tenkanen
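Following the hue-rotation idea above, a complementary colour in this sense is the same hcl colour with the hue shifted by 180 degrees; a sketch (the chroma and luminance values are arbitrary):

```r
# Rotate hue by half the circle while keeping chroma and luminance fixed
complement_hcl <- function(h, c = 70, l = 60) hcl((h + 180) %% 360, c, l)

# A hue and its complement side by side:
pie(rep(1, 2), col = c(hcl(120, 70, 60), complement_hcl(120)), border = NA)
```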
[R] apply with multiple references and database interactivity
Hi R Colleagues, I have a small R script that relies on two for-loops to pull data from a database, make some edits to the data returned from the query, then inserts the updated data back into the database. The script works just fine, no problems, except that I am striving to get away from loops, and to focus on the apply family of tools. In this case, though, I did not know quite where to start with apply. I wonder if someone more adept with apply would not mind taking a look at this, and suggesting some tips as to how this could have been accomplished with apply instead of nested loops. More details on what the script is accomplishing are included below. Thanks in advance for your help and consideration. Steve Here, I have a df that includes a list of keywords that need to be edited, and the corresponding edit. The script goes through a database of people, identifies whether any of the keywords associated with each person are in the list of keywords to edit, and, if so, pulls in the list of keywords and the person details, swaps the new keyword for the old keyword, then inserts the updated keywords back into the database for that person (many keywords are associated with each person, and they are in an array, hence the somewhat complicated procedure). The if-statement provides a list of keywords in the df that were not found in the database, and 'm' is just a counter to help me know how many keywords the script changed. 
    for (i in 1:nrow(keywords)) {
      pull <- dbGetQuery(conn = con, statement = paste0(
        "SELECT person_id, expertise FROM people WHERE expertise RLIKE '; ",
        keywords[i, 2], ";'"))
      pull$expertise <- gsub(keywords[i, 2], keywords[i, 3], pull$expertise)
      if (nrow(pull) == 0) {
        sink('~/Desktop/r1', append = TRUE)
        print(keywords[i, ]$keyword)
        sink()
      } else {
        for (j in 1:nrow(pull)) {
          dbSendQuery(conn = con, statement = paste0(
            "UPDATE people SET expertise = '", pull[j, ]$expertise,
            "' WHERE person_id = ", pull[j, ]$person_id))
        }
        m = m + 1
      }
    }
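A rough translation of the nested loops into the apply family (an untested sketch; it assumes the same con connection and keywords data frame, with the old keyword in column 2 and the replacement in column 3):

```r
# One function call per keyword; Map() replaces both the outer loop
# (over keywords) and the inner loop (over matched rows).
update_keyword <- function(old, new) {
  pull <- dbGetQuery(con, paste0(
    "SELECT person_id, expertise FROM people WHERE expertise RLIKE '; ",
    old, ";'"))
  if (nrow(pull) == 0) return(old)      # keyword not found: report it
  pull$expertise <- gsub(old, new, pull$expertise)
  # Map() walks the two columns in parallel, one UPDATE per person:
  Map(function(expertise, id)
        dbSendQuery(con, paste0("UPDATE people SET expertise = '", expertise,
                                "' WHERE person_id = ", id)),
      pull$expertise, pull$person_id)
  NA_character_                         # keyword found and updated
}

results   <- Map(update_keyword, keywords[[2]], keywords[[3]])
unmatched <- Filter(Negate(is.na), unlist(results))  # replaces the sink() log
changed   <- sum(is.na(unlist(results)))             # replaces the counter 'm'
```

Note that the loop itself is not the slow part here; the per-row database round-trips are. The apply version mainly buys readability, not speed.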
Re: [R] the less-than-minus gotcha
All the more reason to use = instead of <- -Original Message- From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ben Bolker Sent: Monday, 2 February 2015 2:07p To: r-h...@stat.math.ethz.ch Subject: Re: [R] the less-than-minus gotcha Mike Miller gmail.com> writes: > > I've got to remember to use more spaces. Here's the basic problem: > > These are the same: > > v< 1 > v<1 > > But these are extremely different: > > v< -1 > v<-1 > This is indeed documented, in passing, in one of the pages you listed: http://tim-smith.us/arrgh/syntax.html Whitespace is meaningless, unless it isn't. Some parsing ambiguities are resolved by considering whitespace around operators. See and despair: x<-y (assignment) is parsed differently than x < -y (comparison)!
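The gotcha itself takes only a couple of lines to reproduce:

```r
x <- 10
x < -1   # with a space: comparison of x with -1, returns FALSE
x<-1     # without a space: parsed as x <- 1, silently reassigns x
```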
Re: [R] the less-than-minus gotcha
Responding to several messages in this thread... > > All the more reason to use = instead of <- > Definitely not! Martin and Rolf are right, it's not a reason for that; I wrote that quickly without thinking it through. An "=" user might be more likely to fall for the gotcha, if not spacing their code nicely. So the lesson learned from the gotcha is that it's good to space your code nicely, as others have said, not which assignment symbol to use. However, I continue to use "=" for assignment on a daily basis without any problems, as I have done for many years. I remain unconvinced by any and all of these arguments against it in favour of "<-". People telling me that I "should" use the arrow need better arguments than what I've seen so far. I find "<-" ugly and "->" useless/pointless, whereas "=" is simpler and also nicely familiar from my experience in other languages. It doesn't matter to me that "=" is not commutative because I don't need it to be. > Further it can be nicely marked up by a real "left arrow" > by e.g. the LaTeX 'listings' package... Now that's just silly, turning R code into graphical characters that are not part of the R language. > foo(x = y) and foo(x <- y) I'm well aware of this distinction and it never causes me any problems. The latter is an example of bad (obfuscated) coding, IMHO; it should be done in two lines for clarity as follows: x = y foo(x) > Using = has it's problems too. Same goes for apostrophes. Shall we discuss putting "else" at the start of a line next? cheers, Steve
Re: [R] the less-than-minus gotcha
I disagree. Assignments in my code are all lines that look like this: variable = expression They are easy to find and easy to read. -Original Message- From: Ista Zahn [mailto:istaz...@gmail.com] Sent: Tuesday, 3 February 2015 3:36p To: Steve Taylor Cc: r-h...@stat.math.ethz.ch Subject: Re: [R] the less-than-minus gotcha On Mon, Feb 2, 2015 at 8:57 PM, Steve Taylor wrote: Fair enough, but you skipped right past the most important one: it makes code easier to read. It's very nice to be able to visually scan through the code and easily see where assignment happens.
Re: [R] the less-than-minus gotcha
Nobody would write x=x or indeed x<-x; both are silly. If I found myself writing f(x=x) I might smirk at the coincidence, but it wouldn't bother me. I certainly wouldn't confuse it with assigning x to itself. By the way, here's another assignment operator we can use: `:=` = `<-` # this is going in my .Rprofile x := 1 -Original Message- From: Jeff Newmiller [mailto:jdnew...@dcn.davis.ca.us] Sent: Tuesday, 3 February 2015 3:54p To: Steve Taylor; r-h...@stat.math.ethz.ch Subject: Re: [R] the less-than-minus gotcha I did not start out liking <-, but I am quite attached to it now, and even Rcpp feels weird to me now. This may seem like yet another variation on a theme that you don't find compelling, but I find that f(x=x) makes sense when scope is considered, but x=x on its own is silly. That is why I prefer to reserve = for assigning parameters... I use it to clarify that I am crossing scope boundaries, while <- never does. (<<- is a dangerous animal, though... to be used only locally in nested function definitions). In my view, this is similar to preferring == from C-derived syntaxes over the overloaded = from, say, Basic. I am sure you can get by with the syntactic overloading, but if you have the option of reducing ambiguity, why not use it? --- Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers) Sent from my phone. Please excuse my brevity. On February 2, 2015 5:57:05 PM PST, Steve Taylor wrote: >Responding to several messages in this thread... > >> > All the more reason to use = instead of <- >> Definitely not! > >Martin and Rolf are right, it's not a reason for that; I wrote that >quickly without thinking it through. An "=" user might be more likely >to fall for the gotcha, if not spacing their code nicely. 
> [remainder of quoted message trimmed; it repeats Steve Taylor's post above verbatim]
Re: [R] Fastest way to calculate quantile in large data.table
Not sure if there is a question in here somewhere? But if I can point out an observation: if you are doing summary calculations across the rows like this, my guess is that using a data.table (data.frame) structure for that will really bite you, because this operation on a data.table/data.frame is expensive: x <- dt[i,] However it's much faster with a matrix. It doesn't seem like you're doing anything with this dataset that takes advantage of data.table's quick grouping/indexing mojo, so why store it in a data.table at all? Witness: R> library(data.table) R> m <- matrix(rnorm(1e6), nrow=10) R> d <- as.data.table(m) R> idxs <- sample(1:nrow(m), 500, replace=TRUE) R> system.time(for (i in idxs) x <- m[i,]) user system elapsed 0.497 0.169 0.670 R> system.time(for (i in idxs) x <- d[i,]) ## I killed it after waiting for 14 seconds -steve On Thu, Feb 5, 2015 at 11:48 AM, Camilo Mora wrote: > In total I found 8 different ways to calculate quantile in a very large > data.table. I share below their performances for future reference. Tests 1, 7 > and 8 were the fastest I found. 
> > Best, > > Camilo > > library(data.table) > v <- data.table(x=runif(1),x2 = runif(1), > x3=runif(1),x4=runif(1)) > > #fastest > Sys.time()->StartTEST1 > t(v[, apply(v,1,quantile,probs =c(.1,.9,.5),na.rm=TRUE)] ) > Sys.time()->EndTEST1 > > Sys.time()->StartTEST2 > v[, quantile(.SD,probs =c(.1,.9,.5)), by = 1:nrow(v)] > Sys.time()->EndTEST2 > > Sys.time()->StartTEST3 > v[, c("L","H","M"):=quantile(.SD,probs =c(.1,.9,.5)), by = 1:nrow(v)] > Sys.time()->EndTEST3 > v > v[, c("L","H","M"):=NULL] > > v[,Names:=rownames(v)] > setkey(v,Names) > > Sys.time()->StartTEST4 > v[, c("L","H","M"):=quantile(.SD,probs =c(.1,.9,.5)), by = Names] > Sys.time()->EndTEST4 > v > v[, c("L","H","M"):=NULL] > > > Sys.time()->StartTEST5 > v[, as.list(quantile(.SD,c(.1,.90,.5),na.rm=TRUE)), by=Names] > Sys.time()->EndTEST5 > > > Sys.time()->StartTEST6 > v[, as.list(quantile(.SD,c(.1,.90,.5),na.rm=TRUE)), by=Names,.SDcols=1:4] > Sys.time()->EndTEST6 > > > Sys.time()->StartTEST7 > v[, as.list(quantile(c(x , x2,x3,x4 > ),c(.1,.90,.5),na.rm=TRUE)), by=Names] > Sys.time()->EndTEST7 > > > # melting the database and doing quantily by summary. This is the second > fastest, which is ironic given that the database has to be melted first > library(reshape2) > Sys.time()->StartTEST8 > vs<-melt(v) > vs[, as.list(quantile(value,c(.1,.90,.5),na.rm=TRUE)), by=Names] > Sys.time()->EndTEST8 > > > EndTEST1-StartTEST1 > EndTEST2-StartTEST2 > EndTEST3-StartTEST3 > EndTEST4-StartTEST4 > EndTEST5-StartTEST5 > EndTEST6-StartTEST6 > EndTEST7-StartTEST7 > EndTEST8-StartTEST8 > > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
-- Steve Lianoglou Computational Biologist Genentech
Re: [R] Scraping HTML using R
You want to take a look at rvest: https://github.com/hadley/rvest On Thu, Feb 5, 2015 at 2:36 PM, Madhuri Maddipatla wrote: > Dear R experts, > > My requirement for web scraping in R goes like this. > > *Step 1* - All the medical condition from from A-Z are listed in the link > below. > > http://www.webmd.com/drugs/index-drugs.aspx?show=conditions > > Choose the first condition say Acid Reflux(GERD-...) > > *Step 2 *- It lands on the this page > > http://www.webmd.com/drugs/condition-1999-Acid%20Reflux%20%20GERD-Gastroesophageal%20Reflux%20Disease%20.aspx?diseaseid=1999&diseasename=Acid+Reflux+(GERD-Gastroesophageal+Reflux+Disease)&source=3 > > with a list of drugs. > > Choose the column user reviews of the first drug say "Nexium Oral" > > *Step 3*: Now it lands on the webpage > > http://www.webmd.com/drugs/drugreview-20536-Nexium+oral.aspx?drugid=20536&drugname=Nexium+oral > > with a list of reviews. > I would like to scrape review information into a tabular format by scraping > the html. > For instance, i would like to fetch the full comment of each review as a > column in a table. > Also it should automatically go to next page and fetch the full comments of > all reviewers. > > > Please help me in this endeavor and thanks a lot in advance for reading my > mail and expecting response with your experience and expertise. > > Also please suggest me the possibility around my stepwise plan and any > advice you would like to give me along with the solution. > > High Regards, > *-* > *Madhuri Maddipatla* > *-* > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
-- Steve Lianoglou Computational Biologist Genentech
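A minimal rvest sketch for step 3 of the plan above (the CSS selector is hypothetical; the real class names must be found by inspecting the page, e.g. with the browser's developer tools):

```r
library(rvest)

url <- "http://www.webmd.com/drugs/drugreview-20536-Nexium+oral.aspx?drugid=20536&drugname=Nexium+oral"
page <- read_html(url)

# ".userPost" is a placeholder selector -- substitute whatever element
# actually wraps each review comment on the page:
reviews <- page %>%
  html_nodes(".userPost") %>%
  html_text(trim = TRUE)
```

Pagination is usually handled by looping over a page parameter in the URL (or following the "next" link) and combining the per-page results.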
Re: [R] Using dates in R
> today <- as.Date("2015-03-04") # default format Better is: today <- Sys.Date() S -Original Message- From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap Sent: Thursday, 5 March 2015 7:47a To: Brian Hamel Cc: r-help@r-project.org Subject: Re: [R] Using dates in R You will need to convert strings like "2/15/15" into one of the time/date classes available in R and then it is easy to do comparisons. E.g., if you have no interest in the time of day you can use the Date class: > d <- as.Date(c("12/2/79", "4/15/15"), format="%m/%d/%y") > today <- as.Date("2015-03-04") # default format > d [1] "1979-12-02" "2015-04-15" > today [1] "2015-03-04" > d < today [1] TRUE FALSE The lubridate package contains a bunch of handy functions for manipulating dates and times. Bill Dunlap TIBCO Software wdunlap tibco.com On Wed, Mar 4, 2015 at 6:54 AM, Brian Hamel wrote: > Hi all, > > I have a dataset that includes a "date" variable. Each observation includes > a date in the form of 2/15/15, for example. I'm looking to create a new > indicator variable that is based on the date variable. So, for example, if > the date is earlier than today, I would need a "0" in the new column, and a > "1" otherwise. Note that my dataset includes dates from 1979-2012, so it is > not one-year (this means I can't easily create a new variable 1-365). > > How does R handle dates? My hunch is "not well," but perhaps there is a > package that can help me with this. Let me know if you have any > recommendations as to how this can be done relatively easily. > > Thanks! Appreciate it. > > Best, > Brian > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
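To finish Brian's original question, the 0/1 indicator falls straight out of a Date comparison (a sketch; the data frame and column names here are made up):

```r
# Hypothetical column of "2/15/15"-style strings:
dates <- as.Date(df$date, format = "%m/%d/%y")

# 0 if the date is earlier than today, 1 otherwise:
df$indicator <- as.integer(dates >= Sys.Date())
```

One %y caveat: two-digit years 69-99 are read as 19xx and 00-68 as 20xx, which happens to fit a 1979-2012 dataset exactly.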
[R] Extract year from date
Hi all, I am trying in vain to create a new object "Year" in my data frame from existing Date data. I have tried many different approaches, but can't seem to get it to work. Here is an example of some code I tried.

    date1 <- as.Date(wells$Date, "%m/%d/%Y")
    wells$year <- as.numeric(format(date1, "%Y"))

I am starting with data that looks like this.

            ID  Date DepthtoWater_bgs test test2
    1  BC-0004 41163           260.60    3     1
    2  BC-0004 41255           261.65    4     2
    3  BC-0003 41345           166.58    5     3
    4  BC-0002 41351           317.85    6     4
    5  BC-0004 41355           262.15    7     5
    6  BC-0003 41438           167.55    8     6
    7  BC-0004 41438           265.45    9     7
    8  BC-0002 41443           317.25   10     8
    9  BC-0002 41521           321.25   11     9
    10 BC-0003 41522           168.65   12    10
    11 BC-0004 41522           266.15   13    11
    12 BC-0003 41627           168.95   14    12
    13 BC-0004 41627           265.25   15    13
    14 BC-0002 41634           312.31   16    14
    15 BC-0003 41703           169.25   17    15
    16 BC-0004 41703           265.05   18    16
    17 BC-0002 41710           313.01   19    17
    18 BC-0003 41795           168.85   20    18
    19 BC-0004 41795           266.95   21    19
    20 BC-0002 41801           330.41   22    20
    21 BC-0003 41905           169.75   23    21
    22 BC-0004 41905           267.75   24    22
    23 BC-0002 41906           321.01   25    23

Any help would be greatly appreciated! -Steve
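The Date column above holds spreadsheet serial numbers (41163, 41255, ...), not "%m/%d/%Y" strings, which would explain why the as.Date() call yields NAs. A sketch of the conversion, assuming the numbers came from Excel on Windows (whose day-zero origin is 1899-12-30):

```r
# Convert an Excel (Windows) serial day number to a Date, then pull the year:
date1 <- as.Date(wells$Date, origin = "1899-12-30")
wells$year <- as.numeric(format(date1, "%Y"))

# e.g. as.Date(41163, origin = "1899-12-30") should be "2012-09-11"
```

If the file came from old Mac Excel, the origin is "1904-01-01" instead, so verify one known date before trusting the conversion.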
Re: [R] regex find anything which is not a number
How about letting a standard function decide which are numbers: which(!is.na(suppressWarnings(as.numeric(myvector)))) Also works with numbers in scientific notation and (presumably) different decimal characters, e.g. comma if that's what the locale uses. -Original Message- From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Adrian Dusa Sent: Thursday, 12 March 2015 8:27a To: r-help@r-project.org Subject: [R] regex find anything which is not a number Hi everyone, I need a regular expression to find those positions in a character vector which contain something which is not a number (either positive or negative, having decimals or not). myvector <- c("a3", "N.A", "1.2", "-3", "3-2", "2.") In this vector, only positions 3 and 4 are numbers; the rest should be captured. So far I am able to detect anything which is not a number, excluding - and . > grep("[^-0-9.]", myvector) [1] 1 2 I still need to capture positions 5 and 6, which in human language would mean to detect anything which contains a "-" or a "." anywhere else except at the beginning of a number. Thanks very much in advance, Adrian -- Adrian Dusa University of Bucharest Romanian Social Data Archive Soseaua Panduri nr.90 050663 Bucharest sector 5 Romania
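If a regular expression is still wanted, inverting a "plain number" pattern captures everything else (a sketch; note it is stricter than the as.numeric() approach, which would also accept "2." and scientific notation):

```r
myvector <- c("a3", "N.A", "1.2", "-3", "3-2", "2.")

# A plain number: optional minus sign, digits, optional fractional part.
# invert = TRUE returns the positions that do NOT match:
grep("^-?[0-9]+(\\.[0-9]+)?$", myvector, invert = TRUE)
# -> 1 2 5 6   ("a3", "N.A", "3-2", "2.")
```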
[R] operations on columns when data frames are in a list
Hello R folks, I have recently discovered the power of working with multiple data frames in lists. However, I am having trouble understanding how to perform operations on individual columns of data frames in the list. For example, I have a water quality data set (sample data included below) that consists of roughly a dozen data frames. Some of the data frames have a chr column called 'Month' that I need to to convert to a date with the proper format. I would like to iterate through all of the data frames in the list and format all of those that have the 'Month' column. I can accomplish this with a for-loop (e.g., below) but I cannot figure out how to do this with the plyr or apply families. This is just one example of the formatting that I have to perform so I would really like to avoid loops, and I would love to learn how to better work with lists as well. I would appreciate greatly any guidance. Thank you and regards, Stevan a for-loop like this works, but is not an ideal solution: for (i in 1:length(data)) {if ("Month" %in% names(data[[i]])) data[[i]]$Month<- as.POSIXct(data[[i]]$Month, format="%Y/%m/%d")} sample data (head of two data frames from the list of all data frames): structure(list(`3D_Fluorescence.csv` = structure(list(ID = 1:6, Site_Number = c("R5", "R6a", "R8", "R9a", "R14", "R15"), Month = c("2001/10/01", "2001/10/01", "2001/10/01", "2001/10/01", "2001/10/01", "2001/10/01"), Exc_A = c(215L, 215L, NA, NA, 215L, 215L), Em_A = c(422.5, 410.5, NA, NA, 408.5, 408), Fl_A = c(303, 296.86, NA, NA, 297.62, 174.75), Exc_B = c(325L, 325L, NA, NA, 325L, 325L), Em_B = c(416, 413, NA, NA, 418.5, 417.5), Fl_B = c(137.32, 116.1, NA, NA, 132.48, 77.44)), .Names = c("ID", "Site_Number", "Month", "Exc_A", "Em_A", "Fl_A", "Exc_B", "Em_B", "Fl_B"), row.names = c(NA, 6L), class = "data.frame"), algae.csv = structure(list( ID = 1:6, SiteNumber = c("R1", "R2A", "R2B", "R3", "R4", "R5"), SiteLocation = c("CAP canal above Waddell Canal", "Lake Pleasant integrated sample", 
"Lake Pleasant integrated sample", "Waddell Canal", "Cap Canal at 7th St.", "Verde River btwn Horseshoe and Bartlett" ), ClusterName = c("cap", "cap", "cap", "cap", "cap", "verde" ), SiteAcronym = c("cap-siphon", "pleasant-epi", "pleasant-hypo", "waddell canal", "cap @ 7th st", "verde abv bartlett"), Date = c("1999/08/18", "1999/08/18", "1999/08/18", "1999/08/18", "1999/08/18", "1999/08/16" ), Month = c("1999/08/01", "1999/08/01", "1999/08/01", "1999/08/01", "1999/08/01", "1999/08/01"), SampleType = c("", "", "", "", "", ""), Conductance = c(800, 890, 850, 870, 830, 500), ChlA = c(0.3, 0.3, 0.6, 0.8, 1.1, 7.6), Phaeophytin = c(0, 0, 0, 0, 0.7, 4.7), PhaeophytinChlA = c(0.7, 0.7, 1.3, 5.3, 0.7, 4.7), Chlorophyta = c(0L, 0L, 18L, 0L, 0L, 21L), Cyanophyta = c(8L, 0L, 0L, 0L, 7L, 79L), Bacillariophyta = c(135L, 76L, 0L, 18L, 54L, 195L), Total = c(147L, 76L, 18L, 18L, 61L, 302L ), AlgaeComments = c("", "", "", "", "", "")), .Names = c("ID", "SiteNumber", "SiteLocation", "ClusterName", "SiteAcronym", "Date", "Month", "SampleType", "Conductance", "ChlA", "Phaeophytin", "PhaeophytinChlA", "Chlorophyta", "Cyanophyta", "Bacillariophyta", "Total", "AlgaeComments"), row.names = c(NA, 6L), class = "data.frame")), .Names = c("3D_Fluorescence.csv", "algae.csv"))
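The same formatting pass written with lapply (a sketch of the for-loop above; unlike the loop, lapply returns a new list, so the result is assigned back to data):

```r
data <- lapply(data, function(df) {
  if ("Month" %in% names(df))
    df$Month <- as.POSIXct(df$Month, format = "%Y/%m/%d")
  df   # return the (possibly modified) data frame
})
```

The pattern generalises: each per-data-frame formatting step becomes a small function applied over the whole list.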
Re: [R] NA's introduced by coercion
Hi, On Tue, Aug 26, 2014 at 9:56 PM, madhvi.gupta wrote: > Hi, > > I am applying the function as.numeric to a vector having many values as NA and it > is giving: > Warning message: > NAs introduced by coercion > > Can anyone help me to know how to remove this warning and sort it out? Let's say that the vector you are calling `as.numeric` over is called `x`. If you could show us the output of the following command: R> head(x[is.na(as.numeric(x))]) You'll see why you are getting the warning. How you choose to sort it out probably depends on what you are trying to do with your data after you convert it to a "numeric". -steve -- Steve Lianoglou Computational Biologist Genentech
Re: [R] NA's introduced by coercion
Hi Madhvi, First, please use "reply-all" when responding to emails from this list so that others can help with (and benefit from) the discussion. Comment down below: On 26 Aug 2014, at 22:15, madhvi.gupta wrote: [earlier exchange quoted above] Hi, I am having this error because the vector contains NA values, but I want to convert that vector to numeric. I don't quite follow what the problem is, then ... what is the end result that you want to happen? When you convert the vector to a numeric, the NAs that were in it originally will remain as NAs (but they will be of a 'numeric' type). What would you like to do with the NA values? Do you just want to keep them, but want to silence the warning? If so, you can do: R> suppressWarnings(y <- as.numeric(x)) -steve -- Steve Lianoglou Computational Biologist Genentech
[R] Extract model from deriv3 or nls
Hello! I am trying to figure out how to extract the model equation when using deriv3 with nls. Here is my example: # # Generate derivatives # Puro.fun2 <- deriv3(expr = ~(Vmax + VmaxT*state) * conc/(K + Kt * state + conc), name = c("Vmax","VmaxT","K","Kt"), function.arg = function(conc, state, Vmax, VmaxT, K, Kt) NULL) # # Fit model using derivative function # Puro.fit1 <- nls(rate ~ Puro.fun2(conc, state == "treated", Vmax, VmaxT, K, Kt), data = Puromycin, start = c(Vmax = 160, VmaxT = 47, K = 0.043, Kt = 0.05)) Normally I would use summary(Puro.fit1)$formula to extract the model but because I am implementing deriv3, the following gets returned: > summary(Puro.fit1)$formula rate ~ Puro.fun2(conc, state == "treated", Vmax, VmaxT, K, Kt) What I would like to do is find something that returns: rate ~ (Vmax + VmaxT*state) * conc/(K + Kt * state + conc) Is there a way to extract this? Please advise. Thanks for your time. Steve 860-441-3435
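One workaround (a sketch, not an extractor for the fitted object: it keeps the symbolic model expression around from the start and rebuilds the display formula with bquote()):

```r
# Keep the model expression yourself ...
expr <- quote((Vmax + VmaxT * state) * conc / (K + Kt * state + conc))

# ... and splice it back into a formula whenever it is needed:
form <- as.formula(bquote(rate ~ .(expr)))
form
# rate ~ (Vmax + VmaxT * state) * conc/(K + Kt * state + conc)
```

The same expr object can also be handed to deriv3() (wrapped as a one-sided formula), so the fitted model and the display formula stay in sync by construction.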
[R] Webdings font on pdf device
Dear R-helpers Has anyone successfully used the Webdings font on a pdf or postscript device? I'm tearing my hair out trying to figure out how to make it work. # It works on a png() device: windowsFonts(Webdings = windowsFont("Webdings")) png('Webdings.png', family = 'Webdings') plot(-3:3,-3:3,type='n',xlab='',ylab='',axes=FALSE) text (rnorm(26),rnorm(26),LETTERS,cex=2) graphics.off() I have tried to set up the Webdings font using the extrafont package but it gives warnings. The output file says it has Webdings in it, but the characters do not show. R> library(extrafont) Registering fonts with R R> loadfonts(device = "pdf", quiet=TRUE) R> pdf('Webdings.pdf', family='Webdings') Warning messages: 1: In pdf("Webdings.pdf", family = "Webdings") : unknown AFM entity encountered 2: In pdf("Webdings.pdf", family = "Webdings") : unknown AFM entity encountered 3: In pdf("Webdings.pdf", family = "Webdings") : unknown AFM entity encountered 4: In pdf("Webdings.pdf", family = "Webdings") : unknown AFM entity encountered R> plot(-3:3,-3:3,type='n',xlab='',ylab='',axes=FALSE) R> text (rnorm(26),rnorm(26),LETTERS,cex=2) There were 27 warnings (use warnings() to see them) R> graphics.off() R> warnings()[1] Warning message: In text.default(rnorm(26), rnorm(26), LETTERS, cex = 2) : font width unknown for character 0x41 Any assistance would be much appreciated. cheers, Steve
Re: [R] loops in R
While you should definitely read the tutorial that Don is referring to, I'd recommend you take a different approach and use more idiomatic R code here. In base R, this could be addressed with a few approaches. Look for help on the following functions:

* tapply
* by
* aggregate

I'd also recommend you learn about some of the packages that are better suited to computing over data.frames, particularly:

* dplyr
* data.table

You can certainly achieve what you want with for loops, but you'll likely find that going this route will be more rewarding in the long run.

HTH,
-steve

On Wed, Nov 5, 2014 at 10:02 AM, Don McKenzie wrote:
> Have you read the tutorial that comes with the R distribution? This is a very basic database calculation that you will encounter (or some slight variation of it) over and over. The solution is a few lines of code, and someone may write it out for you, but if no one does
>
> You have 20 populations, so you will have 20 iterations in your for loop. For each one, you will need a unique identifier that points to the rows of "R" associated with that population. You'll calculate a mean and variance 20 times, and will need a data object to store those calculations.
>
> Look in the tutorial for syntax for identifying subsets of your data frame.
>
>> On Nov 5, 2014, at 5:41 AM, Noha Osman wrote:
>>
>> Hi Folks
>>
>> I am a new user of R and I have a question. Hopefully someone can help me with this issue.
>>
>> I have the following dataset:
>>
>>   Sample      Population Species Tissue        R        G        B
>> 1 Bari1_062-1 Bari1      ret     seed   94.52303 80.70346 67.91760
>> 2 Bari1_062-2 Bari1      ret     seed   98.27683 82.68690 68.55485
>> 3 Bari1_062-3 Bari1      ret     seed  100.53170 86.56411 73.27528
>> 4 Bari1_062-4 Bari1      ret     seed   96.65940 84.09197 72.05974
>> 5 Bari1_062-5 Bari1      ret     seed  117.62474 98.49354 84.65656
>> 6 Bari1_063-1 Bari1      ret     seed  144.39547 113.76170 99.95633
>>
>> and I have 20 populations as follows:
>>
>> [1] Bari1 Bari2 Bari3 Besev Cermik Cudi Derici Destek Egil
>> [10] Gunasan Kalkan Karabace Kayatepe Kesentas Ortanca Oyali Cultivated Sarikaya
>> [19] Savur Sirnak
>>
>> I need to calculate the mean and variance of each population, using column [R], using a for-loop.
>>
>> Thanks
>
> Don McKenzie
> Research Ecologist
> Pacific Wildland Fire Sciences Lab
> US Forest Service
>
> Affiliate Faculty
> School of Environmental and Forest Sciences
> University of Washington
> d...@uw.edu

--
Steve Lianoglou
Computational Biologist
Genentech
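For reference, the tapply/aggregate route mentioned above looks like this (a sketch; 'dat' is a hypothetical stand-in for the poster's data frame, with the same Population and R columns):

```r
# Toy stand-in for the posted data
dat <- data.frame(Population = c("Bari1", "Bari1", "Bari2", "Bari2"),
                  R = c(94.5, 98.3, 100.5, 96.7))

# Per-population mean and variance of column R, no explicit loop needed
means <- tapply(dat$R, dat$Population, mean)
vars  <- tapply(dat$R, dat$Population, var)

# Or both at once, returned as one data frame
aggregate(R ~ Population, data = dat,
          FUN = function(x) c(mean = mean(x), var = var(x)))
```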
[R] "predict" values from object of type "list"
Hi Guys, I could use some help here. I have a set of 3d points (x,y,v). These points are not randomly scattered but lie on a surface. This surface can be taken as a calibration plane for x and y values. My goal is to quantify this surface and then predict the v-values for given pairs of x- and y-coordinates.

This code shows how I started to solve this problem. First, I generate more points between existing points using 3d splines. That way I "pre-smooth" my data set. After that I use interp to create even more points, and I end up with an object called "sp" (class "list"). sp is visualized using surface3d. The surface looks like I wish it to be. Now, how can I predict the value for an x/y-pair of, say, (-2, 2)? Can somebody help? Thanks a lot!

library(rgl)
library(akima)

v <- read.table(text="5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14", sep=" ")
v <- as.numeric(v)

x <- read.table(text="3.4 3.3 3.4 3.4 3.4 3.4 3.6 3.5 3.5 3.4 3.4 3.4 3.4 3.5 3.5 2.6 2.6 2.6 2.7 2.6 2.7 2.9 2.9 2.8 2.7 2.7 2.7 2.7 2.7 2.8 1.8 1.7 1.7 1.7 1.8 1.9 2.1 2.2 2.0 1.9 1.9 1.9 1.9 1.9 2.0 0.8 0.8 0.8 0.8 0.9 1.1 1.3 1.4 1.2 1.1 1.0 1.0 1.0 1.1 1.1 -0.2 -0.2 -0.2 -0.2 0.0 0.2 0.4 0.6 0.3 0.1 0.1 0.1 0.1 0.1 0.2 -1.2 -1.3 -1.3 -1.3 -1.1 -0.8 -0.5 -0.3 -0.6 -0.9 -0.9 -0.9 -0.9 -1.0 -0.9 -2.4 -2.6 -2.6 -2.5 -2.3 -2.0 -1.1 -1.2 -1.6 -2.0 -2.0 -2.0 -2.1 -2.2 -2.1 -3.9 -4.2 -4.3 -4.2 -3.9 -3.6 -2.5 -2.7 -3.3 -3.7 -3.7 -3.8 -3.8 -4.0 -3.9 -5.8 -6.1 -6.2 -6.1 -5.7 -5.3 -3.9 -4.1 -4.8 -5.3 -5.3 -5.3 -5.4 -5.5 -5.4 -7.5 -7.8 -8.0 -7.8 -7.4 -6.8 -5.1 -5.3 -6.1 -6.6 -6.7 -6.8 -6.9 -6.9 -6.9", sep=" ")

y <- read.table(text="0.5 0.6 0.6 0.7 0.7 0.8 0.8 0.9 0.9 1.0 1.0 1.1 1.1 1.2 1.2 0.5 0.5 0.6 0.7 0.8 0.9 0.9 1.0 1.1 1.1 1.2 1.3 1.4 1.4 1.5 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 0.4 0.5 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.5 1.6 1.7 1.9 2.0 2.1 0.4 0.5 0.7 0.8 1.0 1.1 1.2 1.3 1.5 1.7 1.8 2.0 2.1 2.3 2.4 0.3 0.5 0.7 0.9 1.0 1.2 1.4 1.5 1.7 1.9 2.1 2.3 2.5 2.7 2.8 0.2 0.4 0.7 0.9 1.1 1.3 1.4 1.6 1.9 2.2 2.4 2.6 2.8 3.1 3.3 0.2 0.4 0.7 1.0 1.3 1.5 1.6 1.8 2.2 2.5 2.7 3.0 3.3 3.6 3.8 0.2 0.5 0.8 1.1 1.4 1.7 1.8 2.0 2.4 2.8 3.1 3.4 3.7 4.1 4.3 0.1 0.4 0.8 1.2 1.5 1.8 1.9 2.2 2.7 3.1 3.5 3.8 4.2 4.5 4.9", sep=" ")
x <- as.numeric(x)
y <- as.numeric(y)

z <- read.table(text="-35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35", sep=" ")
z <- as.numeric(z)

df <- data.frame(x, y, z, v)  # df is the original calibration data
plot3d(x, y, v)

all_dat <- c()
for (n in seq(min(z), max(z), 5)) {
  blubb <- which(df$z == n)  # find rows with the same angle
  gleicheWink <- df[blubb, ]
  red_df <- data.frame(t = seq(1, length(gleicheWink[, 1]), 1),
                       x = gleicheWink$x, y = gleicheWink$y, v = gleicheWink$v)
  ts <- seq(from = min(red_df$t), max(red_df$t), length = 50)
  d2 <- apply(red_df[, -1], 2, function(u) spline(red_df$t, u, xout = ts)$y)
  all_dat <- rbind(all_dat, d2)
}

x <- all_dat[, 1]
y <- all_dat[, 2]
z <- all_dat[, 3]

sp <- interp(x, y, z, linear = TRUE,
             xo = seq(min(x), max(x), length = 50),
             yo = seq(min(y), max(y), length = 50),
             duplicate = "mean")

open3d(scale = c(1/diff(range(x)), 1/diff(range(y)), 1/diff(range(z))))
zlen <- 5
cols <- heat.colors(zlen)
with(sp, surface3d(x, y, z, color = cols))  # ,alpha=.2
points3d(x, y, z)
title3d(xlab = "x", ylab = "y", zlab = "v")
axes3d()
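One possibility for the "predict" step (a sketch, not from the thread, and untested against the data above): akima also provides interpp(), which evaluates the interpolating surface at arbitrary coordinates rather than on a grid, which is effectively the prediction being asked for.

```r
library(akima)  # assumed available, as in the original post
# x, y, z here are the pre-smoothed vectors built above
pred <- interpp(x, y, z, xo = -2, yo = 2)
pred$z  # interpolated v at (x, y) = (-2, 2); NA if outside the convex hull
```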
[R] Simulate data with binary outcome
Dear R-Users,

I wish to simulate a binary outcome data set with predictors (in the example below, age, sex and systolic BP). Is there a way I can set the frequency of the outcome (y) to be, say, 5% (versus the 0.1% when using the seed below)?

# Example R code based on Frank Harrell's Design help files
library(Hmisc)
n <- 1000
set.seed(123456)
age <- runif(n, 60, 90)
sbp <- rnorm(n, 120, 15)
sex <- factor(sample(c('female','male'), n, TRUE))

# Specify population model for log odds that CHD = Yes
L <- 0.4*(sex == 'male') + 0.045*(age) + 0.05*(sbp)

# Simulate binary y to have Prob(y = 1) = 1/[1+exp(-L)]
y <- ifelse(runif(n) < plogis(L), 1, 0)
table(y)

ddist <- datadist(sex, age, sbp)
options(datadist = 'ddist')
fit <- lrm(y ~ sex + age + sbp)
summary(fit)

Steve Frost MPH
University of Western Sydney
Building 7 Campbelltown Campus
Locked Bag 1797
PENRITH SOUTH DC 1797
Phone 61+ 2 4620 3415
Mobile 0407 291088
Fax 61+ 2 4625 4252
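One common approach (a sketch, not from the thread): the event frequency is controlled entirely by the distribution of plogis(L), so add an intercept b0 to L and solve numerically for the value that makes the average event probability equal to the target prevalence. This reuses the simulation set-up from the post above so it is self-contained.

```r
# Rebuild the linear predictor from the post
set.seed(123456)
n <- 1000
age <- runif(n, 60, 90)
sbp <- rnorm(n, 120, 15)
sex <- factor(sample(c('female', 'male'), n, TRUE))
L <- 0.4 * (sex == 'male') + 0.045 * age + 0.05 * sbp

# Solve for the intercept giving ~5% average event probability
target <- 0.05
b0 <- uniroot(function(b) mean(plogis(b + L)) - target,
              c(-50, 50), tol = 1e-9)$root

# Simulate the outcome with the tuned intercept
y <- ifelse(runif(n) < plogis(b0 + L), 1, 0)
mean(y)  # close to 0.05 on average over simulations
```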
[R] Matching Up Values
Dear all,

I have two files, both of similar formats. In column 1 are Latitude values (real numbers, e.g. -179.25), column 2 has Longitude values (also real numbers) and in one of the files, column 3 has Population Density values (integers); there is no column 3 in the other file. However, the main difference between these two files is that one has fewer rows than the other.

So what I'm looking to do is 'pad out' the shorter file by adding in the rows that are 'missing' relative to the longer file (i.e. if a particular coordinate isn't present in the shorter file but is in the 'longer/master' file), and giving each added row 'zero' as its Population Density value (column 3). This should result in the shorter file becoming the same length as the initially longer file, with both files having the same coordinate values (latitude and longitude on each line).

How would I do this in R?

Thanks for any help offered,

Steve
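A merge()-based sketch of the padding step (data frame and column names here are made up for illustration; this assumes the density column lives in the shorter file): an outer join on the coordinate columns, then replace the resulting NAs with zero.

```r
# Toy stand-ins for the two files
master <- data.frame(Latitude  = c(-179.25, -179.25, -178.75),
                     Longitude = c(10.25, 10.75, 10.25))
short  <- data.frame(Latitude = -179.25, Longitude = 10.25, PopDens = 42L)

# Keep every coordinate in master; unmatched rows get NA density
padded <- merge(master, short, by = c("Latitude", "Longitude"), all.x = TRUE)
padded$PopDens[is.na(padded$PopDens)] <- 0L
padded  # 3 rows; coordinates missing from 'short' get PopDens = 0
```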
Re: [R] Matching Up Values
I think the approach is ok. I'm having difficulties though...! I've managed to get 'merge' working (using the 'all' argument as suggested), but for some strange reason, the output file produces 12 extra rows! So now the shorter file isn't the same length as the 'master' file, it's now longer! The files are fairly sizeable (~6 rows) so it's difficult to pin-point manually where it's lost track.

Is there an obvious solution to this? I was wondering if the best thing might be to 'de-merge' the now-longer file, so that the surplus rows are removed. Is there a command, therefore, which will enable me to compare the now-longer file to the master file, so that any coordinate pairs which are present in the longer file but not in the (now-shorter) master file are removed?

Thanks again,

Steve
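Extra rows after merge(..., all = TRUE) usually mean that one of the inputs contains duplicated coordinate pairs, which the join multiplies. A quick diagnostic sketch ('master' here is a hypothetical stand-in for either file):

```r
# Toy file with one repeated (Latitude, Longitude) pair
master <- data.frame(Latitude  = c(-179.25, -179.25, -178.75),
                     Longitude = c(10.25, 10.25, 10.25))

dup <- duplicated(master[, c("Latitude", "Longitude")])
sum(dup)        # number of repeated coordinate pairs
master[dup, ]   # inspect the offending rows
```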
Re: [R] Matching Up Values
Hmm, I'm having a fair few difficulties using 'merge' now. I managed to get it to work successfully before, but in this case I'm trying to shorten (as opposed to lengthen, as before) a file in relation to a 'master' file. These are the commands I've been using, followed by the dimensions of the files in question - as you can see, the row numbers of the merged file don't correlate to those of the 'coordinates' file (which is what I'm aiming to get 'merged' equal to):

> merge(PopDens.long, coordinates, by=c("Latitude","Longitude"), all = TRUE) -> merged
> dim(PopDens.long); dim(coordinates); dim(merged)
[1] 67870 3
[1] 67420 2
[1] 69849 3

One thing I tried was swapping the order of the files in the merge command, but this causes 'merged' to have the same number of rows (69849). Something else I tried was to leave out the 'all = TRUE' argument, as I'm essentially attempting to shorten the file, but this makes the output file *too* short! (65441 rows as opposed to the intended 67420). Again, the same applies when the order of the input files is swapped.

> merge(PopDens.long, coordinates, by=c("Latitude","Longitude")) -> merged
> dim(PopDens.long); dim(coordinates); dim(merged)
[1] 67870 3
[1] 67420 2
[1] 65441 3

Am I doing something obviously wrong? I'm pretty certain that 'coordinates' is a subset of 'PopDens.long' - so there should be equal numbers of common values when merged. Is there perhaps a more suitable function I could use, or a way of performing checks to see where I might be going wrong?!

Many thanks,

Steve
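Two checks worth running in a situation like this (a sketch, not from the thread): duplicated key pairs inflate the join, and floating-point coordinates can silently fail to match exactly, which would explain an inner join coming back too short.

```r
# 1) Count duplicated (Latitude, Longitude) pairs in each input, e.g.:
#    sum(duplicated(PopDens.long[, c("Latitude", "Longitude")]))
#    sum(duplicated(coordinates[, c("Latitude", "Longitude")]))

# 2) Demonstrate the floating-point matching problem and the rounding fix
a <- data.frame(Latitude = 0.1 + 0.2, Longitude = 1)          # 0.30000000000000004
b <- data.frame(Latitude = 0.3, Longitude = 1, PopDens = 7L)
nrow(merge(a, b))        # 0 -- the latitudes differ in the last bits

a$Latitude <- round(a$Latitude, 6)   # round both keys to a safe precision
b$Latitude <- round(b$Latitude, 6)
nrow(merge(a, b))        # 1 -- the keys now match exactly
```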
[R] Coarsening the Resolution of a Dataset
Dear all,

I have gridded data at 5' (minutes) resolution, which I intend to coarsen to 0.5 degrees. How would I go about doing this in R? I've had a search online and haven't found anything obvious, so any help would be gratefully received.

I'm assuming that there will be several 'coarsening' techniques available - I'm after something fairly simple, which, for example, just takes an average of each 0.5 degree portion of the current dataset. If someone is able to point me in the right direction, then I'd be very grateful.

Many thanks,

Steve
Re: [R] Coarsening the Resolution of a Dataset
Unfortunately, when I get to the 'myCuts' line, I receive the following error:

Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

...and I also receive warnings about memory allocation being reached (even though I've already used memory.limit() to maximise the memory) - this is a fairly sizeable dataset after all, 2160 rows by 4320 columns.

Therefore I was wondering if there are any alternative ways of coarsening a dataset? Or are there any packages/commands built for this sort of thing?

Any advice would be much appreciated!

Thanks again,

Steve
Re: [R] Coarsening the Resolution of a Dataset
Hi - thanks for the advice - I am, however, applying this to the whole data frame. And the code that I'm using is just to read in the data (using read.table) and then the code that you supplied. I could send you the actual dataset if you don't mind a file of ~50MB?!

Thanks again,

Steve

> Date: Tue, 29 Jul 2008 15:34:31 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: [R] Coarsening the Resolution of a Dataset
> CC: r-help@r-project.org
>
> I assume that you are doing this on one column of the matrix, which should only have 2160 entries in it. Can you send the actual code you are using? I tried it with 10,000 samples and it works fine, so I need to understand the data structure you are using. Also, the infinite recursion sounds strange; do you have a function like 'cut' or 'c' redefined? It would help if you could supply a reproducible example.
>
> On Tue, Jul 29, 2008 at 10:09 AM, Steve Murray <[EMAIL PROTECTED]> wrote:
> >
> > Unfortunately, when I get to the 'myCuts' line, I receive the following error:
> >
> > Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
> >
> > ...and I also receive warnings about memory allocation being reached (even though I've already used memory.limit() to maximise the memory) - this is a fairly sizeable dataset after all, 2160 rows by 4320 columns.
> >
> > Therefore I was wondering if there are any alternative ways of coarsening a dataset? Or are there any packages/commands built for this sort of thing?
> >
> > Any advice would be much appreciated!
> >
> > Thanks again,
> >
> > Steve
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
Re: [R] Coarsening the Resolution of a Dataset
Please find below my command inputs, subsequent outputs and the errors that I've been receiving.

> crops <- read.table("crop2000AD.asc", colClasses = "numeric", na="-")
> str(crops[1:10])
'data.frame': 2160 obs. of 10 variables:
 $ V1 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V2 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V3 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V4 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V5 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V6 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V7 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V8 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V9 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V10: num NA NA NA NA NA NA NA NA NA NA ...

Don't worry about all the NAs - this is because there is no data available at the poles of the Earth (at the top and bottom of the dataset).

> min.5 <- 5/60
> dim(crops)
[1] 2160 4320
> n <- 2160*4320
> memory.limit()
[1] 382.9844
> crops <- cbind(interval=seq(0, by=min.5, length=n), value=runif(n))
Error: cannot allocate vector of size 142.4 Mb
In addition: Warning messages:
1: In cbind(interval = seq(0, by = min.5, length = n), value = runif(n)) : Reached total allocation of 382Mb: see help(memory.size)
2: In cbind(interval = seq(0, by = min.5, length = n), value = runif(n)) : Reached total allocation of 382Mb: see help(memory.size)
3: In cbind(interval = seq(0, by = min.5, length = n), value = runif(n)) : Reached total allocation of 382Mb: see help(memory.size)
4: In cbind(interval = seq(0, by = min.5, length = n), value = runif(n)) : Reached total allocation of 382Mb: see help(memory.size)

But it seems to run when 'value = runif(n)' is excluded:

> crops <- cbind(interval=seq(0, by=min.5, length=n))
> head(crops)
     interval
[1,]   0.0000
[2,]   0.0833
[3,]   0.1667
[4,]   0.2500
[5,]   0.3333
[6,]   0.4167
> str(crops[1:10])
 num [1:10] 0.0000 0.0833 0.1667 0.2500 0.3333 ...

> breaks <- c(seq(min(crops[,'interval']), max(crops[, 'interval']), by=0.5), Inf)
> head(breaks)
[1] 0.0 0.5 1.0 1.5 2.0 2.5
> str(breaks)
 num [1:1555201] 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 ...
> myCuts <- cut(crops[, 'interval'], breaks, include.lowest=TRUE)
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
In addition: Warning messages:
1: In formatC(breaks, digits = dig, width = 1) : Reached total allocation of 382Mb: see help(memory.size)
2: In formatC(breaks, digits = dig, width = 1) : Reached total allocation of 382Mb: see help(memory.size)

This is as far as I've got because of the above errors I encounter. Any pointers and advice, or if I'm doing something obviously wrong, then please let me know.

Thanks again for your help.

Steve
Re: [R] Coarsening the Resolution of a Dataset
Hi Jim,

Thanks for your advice. The problem is that I can't lose any of the data - it's a global dataset, where the left-most column = 180 degrees west, and the right-most is 180 degrees east. The top row is the North Pole and the bottom row is the South Pole. I've got 512MB RAM on the machine I'm using - which has been enough to deal with such datasets before...?

I'm wondering, is there an alternative means of achieving this? Perhaps orientated via the desired output of the 'coarsened' dataset - my calculations suggest that the dataset would need to change from the current 2160 x 4320 dimensions to 360 x 720. Is there any way of doing this based on averages of blocks of rows/columns, for example?

Many thanks again,

Steve
Re: [R] Coarsening the Resolution of a Dataset
Ok thanks Jim - I'll give it a go! I'm new to R, so I'm not sure how I'd go about performing averages in subsets... I'll have a look into it, but any subsequent pointers would be gratefully received as ever! I'll also try playing with it in Access, and maybe even Excel 2007 might be able to do the trick too?

Thanks again...!

Steve
[R] Iterative Averages
Dear all,

I have a data frame of 2160 rows and 4320 columns, which I hope to condense to a smaller dataset by finding averages of 6 by 6 blocks of values (to produce a data frame of 360 rows by 720 columns). How would I go about finding the mean of a 6 x 6 block, then find the mean of the next adjacent 6 x 6 block, and so on, until the whole data frame has been covered?

One slight twist is that I have NA values, which I don't want to be included in the calculations unless a particular 6 x 6 block is entirely composed of NA values - in which case, NA should be the output value.

Thanks very much for any advice and solutions.

Steve
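A vectorized sketch of exactly this block average (not from the thread): the matrix 'm' below is a small stand-in for the real 2160 x 4320 data, for which the same call with k = 6 would produce the 360 x 720 result. All-NA blocks stay NA; otherwise NAs are dropped from the mean.

```r
block_mean <- function(m, k) {
  nr <- nrow(m) / k
  nc <- ncol(m) / k
  rg <- rep(seq_len(nr), each = k)[row(m)]  # block row index of each cell
  cg <- rep(seq_len(nc), each = k)[col(m)]  # block column index of each cell
  vals <- tapply(m, list(rg, cg), function(x)
    if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE))
  matrix(as.numeric(vals), nr, nc)
}

m <- matrix(1:16, 4, 4)   # toy 4 x 4 example, averaged in 2 x 2 blocks
m[3, 1] <- NA             # a partially-NA block: the NA is ignored
coarse <- block_mean(m, 2)
coarse[1, 1]              # mean of m[1:2, 1:2] = 3.5
```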
Re: [R] Iterative Averages
Thanks very much for both your suggestions. When you refer to doing a 'double for loop', do you mean finding the average for rowgrp and colgrp within each 6x6 block? If so, how would this be done so that the whole data frame is covered? It would seem to me that the 'mean' operation would need to combine rowgrp AND colgrp together? How would I get the loop to cycle through each set of 6x6 blocks of the data frame?

Thanks again,

Steve
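For completeness, the 'double for loop' version being discussed would look something like this (a sketch; 'm' is a toy stand-in for the real 2160 x 4320 matrix, where k would be 6):

```r
m <- matrix(1:16, 4, 4)  # toy data; use the real matrix and k <- 6 in practice
k <- 2
res <- matrix(NA_real_, nrow(m) / k, ncol(m) / k)
for (i in seq_len(nrow(res))) {
  for (j in seq_len(ncol(res))) {
    # extract the k x k block at block-position (i, j)
    blk <- m[((i - 1) * k + 1):(i * k), ((j - 1) * k + 1):(j * k)]
    # all-NA blocks stay NA; otherwise NAs are dropped from the mean
    if (!all(is.na(blk))) res[i, j] <- mean(blk, na.rm = TRUE)
  }
}
res[1, 1]  # mean of m[1:2, 1:2] = 3.5
```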
Re: [R] SQL Primer for R
Date: Sun, 31 Aug 2008 21:29:38 -0400
From: "ivo welch"
Subject: Re: [R] SQL Primer for R

stumped again by SQL... If I have a table named "main" in an SQLite database, how do I get the names of all its columns? (I have a mysql book that claims the SHOW command does this sort of thing, but it does not seem to work on SQLite.)

It sounds like SQLite's ".schema" command might be what you're looking for. Here's an example:

$ sqlite3 foo.db
SQLite version 3.5.4
Enter ".help" for instructions
sqlite> create table T (c1 integer, c2 integer, c3 integer);
sqlite> .tables
T
sqlite> .schema T
CREATE TABLE T (c1 integer, c2 integer, c3 integer);
sqlite> .quit

Steve Revilak
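From inside R, the same information is available without dropping to the sqlite3 shell (a sketch, assuming the RSQLite/DBI packages are installed):

```r
library(RSQLite)
con <- dbConnect(SQLite(), "foo.db")
dbListFields(con, "main")                    # just the column names
dbGetQuery(con, "PRAGMA table_info(main)")   # names plus declared types
dbDisconnect(con)
```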
[R] Interpolation Problems
Dear all,

I'm trying to interpolate a dataset to give it twice as many values (I'm giving the dataset a finer resolution by interpolating from 1 degree to 0.5 degrees) to match that of a corresponding dataset.

I have the data in both a data frame format (longitude column header values along the top with latitude row header values down the side) and a column format (in the format latitude, longitude, value).

I have used Google to determine that 'approxfun' is the most appropriate command to use for this purpose - I may well be wrong here though! Nevertheless, I've tried using it with the default arguments for the data frame (i.e. interp <- approxfun(dataset) ) but encounter the following errors:

> interp <- approxfun(JanAv)
Error in approxfun(JanAv) : need at least two non-NA values to interpolate
In addition: Warning message:
In approxfun(JanAv) : collapsing to unique 'x' values

However, there are no NA values! And to double-check this, I did the following:

> JanAv[is.na(JanAv)] <- 0

...to ensure that there really are no NAs, but receive the same error message each time. With regard to the latter 'collapsing to unique 'x' values', I'm not sure what this means exactly, or how to deal with it.

Any words of wisdom on how I should go about this, or whether I should use an alternative command (I want to perform a simple (e.g. linear) interpolation), would be much appreciated.

Many thanks for any advice offered,

Steve
Re: [R] Interpolation Problems
Thanks Duncan - a couple of extra points...

I should have perhaps pointed out that the data are on a *regular* 'box' grid (with each value currently spaced at 1 degree intervals). Also, I'm looking for something fairly simple, like a bilinear interpolation (where each new point is created based on the values of the four points surrounding it).

In answer to your question, JanAv is simply the data frame of values. And yes, you're right, I think I'll need a 2D interpolation as it's a grid with latitude and longitude values (which, as an aside, I guess need to be interpolated differently? In a 1D format??). I think you're also right in that the 'akima' package isn't suitable for this job, as it's designed for irregular grids.

Do you, or does anyone, have any suggestions as to what my best option should be?

Thanks again,

Steve

> Date: Mon, 1 Sep 2008 18:45:35 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: r-help@r-project.org
> Subject: Re: [R] Interpolation Problems
>
> On 01/09/2008 6:17 PM, Steve Murray wrote:
>> Dear all,
>>
>> I'm trying to interpolate a dataset to give it twice as many values (I'm giving the dataset a finer resolution by interpolating from 1 degree to 0.5 degrees) to match that of a corresponding dataset.
>>
>> I have the data in both a data frame format (longitude column header values along the top with latitude row header values down the side) or column format (in the format latitude, longitude, value).
>>
>> I have used Google to determine 'approxfun' the most appropriate command to use for this purpose - I may well be wrong here though! Nevertheless, I've tried using it with the default arguments for the data frame (i.e. interp <- approxfun(dataset) ) but encounter the following errors:
>>
>>> interp <- approxfun(JanAv)
>> Error in approxfun(JanAv) :
>> need at least two non-NA values to interpolate
>> In addition: Warning message:
>> In approxfun(JanAv) : collapsing to unique 'x' values
>>
>> However, there are no NA values! And to double-check this, I did the following:
>>
>>> JanAv[is.na(JanAv)] <- 0
>>
>> ...to ensure that there really are no NAs, but receive the same error message each time.
>>
>> With regard to the latter 'collapsing to unique 'x' values', I'm not sure what this means exactly, or how to deal with it.
>>
>> Any words of wisdom on how I should go about this, or whether I should use an alternative command (I want to perform a simple (e.g. linear) interpolation), would be much appreciated.
>
> What is JanAv? approxfun needs to be able to construct x and y values to interpolate; it may be that your JanAv object doesn't allow it to do that. (The general idea is that it will consider y to be a function of x, and will construct a function that takes arbitrary x values and returns y values matching those in the dataset, with some sort of interpolation between values.)
>
> If you really have longitude and latitude on some sort of grid, you probably want a two-dimensional interpolation, not a 1-d interpolation as done by approxfun. The interp() function in the akima package does this, but maybe not in the format you need.
>
> Duncan Murdoch
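For a regular lon/lat grid, one option (a sketch, not from the thread): the 'fields' package provides interp.surface(), which does exactly the four-neighbour bilinear interpolation described above. All the object names here are hypothetical stand-ins.

```r
library(fields)  # assumed installed; interp.surface() is bilinear interpolation
# Hypothetical names: lon/lat are the 1-degree grid axes, z the value matrix
# (rows matching lon, columns matching lat), lon05/lat05 the 0.5-degree axes.
obj <- list(x = lon, y = lat, z = z)
loc <- as.matrix(expand.grid(lon05, lat05))
v <- interp.surface(obj, loc)  # one interpolated value per (lon, lat) pair
```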
[R] printing name of object inside lapply
Dear list members,

I am trying, within a lapply command, to print the name of the objects in a list or data frame. This is so that I can use odfWeave to print out a report with a section for each object, including the object names. I tried e.g.

a = b = c = 1:5
lis = data.frame(a, b, c)
lapply(lis, function(z) {
  obj.nam <- deparse(substitute(z))
  cat("some other text", obj.nam, "and so on", "\n")
})

But instead of getting "a", "b" etc. I get X[[1L]] etc. Any ideas?

www.promente.org
proMENTE social research
Krančevićeva 35
71000 Sarajevo
mob. +387 61 215 997
tel. +387 556 865
fax. +387 556 866
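The usual workaround (a sketch, not from the thread): lapply over the names rather than the elements, so each name arrives inside the function as an ordinary character value, and the element itself is still reachable by name.

```r
a <- b <- c <- 1:5
lis <- data.frame(a, b, c)

out <- lapply(names(lis), function(nm) {
  cat("some other text", nm, "and so on", "\n")
  lis[[nm]]   # the column itself, looked up by name
})
names(out) <- names(lis)  # lapply over names() loses them, so restore
```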
[R] read.table error
Dear all,

I have a tab-delimited text (.txt) file which I'm trying to read into R. The file is in column format - there are in fact 3 columns and 259201 rows (including the column headers). I've been using the following commands, but receive an error each time which prevents the data from being read in:

> Jan <- read.table("JanuaryAvBurntArea.txt", header=TRUE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 1 did not have 6 elements

I tried removing the 'header' argument, but receive a similar message:

> Jan <- read.table("JanuaryAvBurntArea.txt")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 2 did not have 6 elements

What's more confusing about this is that I know that none of the lines have 6 elements! They're not supposed to! Each row only has 3 values (one per column). As a final resort I tried 'scan':

> Jan <- scan("JanuaryAvBurntArea.txt")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  scan() expected 'a real', got 'Latitude'

...which is obviously something to do with there being a header as the first row, but the 'scan' command doesn't seem to have an equivalent of 'header=TRUE' like read.table...?

If anyone is able to shed some light on why I'm receiving these errors, and how I can get the data into R, then I'd be very grateful. I suspect I'm doing something very basic which is wrong!

Many thanks,

Steve
Re: [R] read.table error
Thanks Prof. Ripley! I knew it would be something simple - I'd missed the "\t" (the sep argument) from the read.table command! I won't be doing that again!

Thanks again,

Steve
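For reference, the corrected calls Steve alludes to might look like this (a sketch, assuming the file really is tab-delimited, with the filename from the original post):

```r
## tell read.table the columns are tab-separated
Jan <- read.table("JanuaryAvBurntArea.txt", header = TRUE, sep = "\t")

## scan() has no 'header' argument, but skip = 1 steps over the header row
vals <- scan("JanuaryAvBurntArea.txt", skip = 1)
```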
Re: [R] printing name of object inside lapply
Thanks Prof Ripley! How obvious in retrospect!

Prof Brian Ripley wrote:
> On Thu, 4 Sep 2008, Steve Powell wrote:
>> Dear list members, I am trying, within an lapply command, to print the names of the objects in a list or data frame. This is so that I can use odfWeave to print out a report with a section for each object, including the object names. I tried e.g.
>>
>> a = b = c = 1:5
>> lis = data.frame(a, b, c)
>> lapply(lis, function(z) {
>>   obj.nam <- deparse(substitute(z))
>>   cat("some other text", obj.nam, "and so on", "\n")
>> })
>>
>> But instead of getting "a" "b" etc. I get X[[1L]] etc. Any ideas?
>
> Use a for() loop on the names: lapply is overkill here. But you could use
>
> lapply(names(lis), function(z) {
>   cat("some other text", z, "and so on", "\n")
>   ## references to lis[[z]]
> })
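Prof Ripley's for-loop suggestion as a runnable sketch - iterating over the names, and indexing back into the data frame for the values:

```r
a <- b <- c <- 1:5
lis <- data.frame(a, b, c)
for (nm in names(lis)) {
  cat("some other text", nm, "and so on", "\n")
  ## the column itself is available as lis[[nm]]
}
```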
Re: [R] howto get the number of columns and column names of multiply data frames
Hi,

On Aug 9, 2009, at 11:29 AM, Frank Schäffer wrote:
> Hi, I've read several files with measurements into R data frames (works flawlessly). Each data frame is named by the location of measurement and contains hundreds of rows and about 50 columns, like this:
>
> dataframe1.
> date measurement_1 ... measurement_n
> 1
> 2
> 3
> ...
> n

Just as an aside, it's somehow considered more R-idiomatic to store all of these tables in a list (of tables) and access them as mydata[[1]], mydata[[2]], ..., mydata[[n]]. Assuming the data files are 'filename.1.txt', 'filename.2.txt', etc., you might do this like so:

mydata <- lapply(paste('filename', 1:n, 'txt', sep='.'),
                 read.table, header=TRUE, sep=...)

To test that all colnames are the same, you could do something like:

names1 <- colnames(mydata[[1]])
all(sapply(2:n, function(i) {
  length(intersect(names1, colnames(mydata[[i]]))) == length(names1)
}))

> For further processing I need to check whether or not ncol and colnames are the same for all data frames. Also I need to add a new column to each data frame containing the name of the data frame, so that this column can be treated as a factor in later processing (after merging some selected data frames into one). I tried
>
> for (i in 1:length(ls())) {
>   print(ncol(ls()[i]))
> }
>
> but this does not work because R returns a character for i and therefore NULL as the result. Reading the output of ls() into a list also does not work. How can I accomplish this task?
If you still want to do it this way, see:

?get

for example:

for (varName in paste('dataframe', 1:n, sep='')) {
  cat(colnames(get(varName)), "\n")
}

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
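The part of the question the reply leaves open - tagging each data frame with its own name before merging - could be sketched like this, using the list-of-tables layout suggested above (the list names here are invented for illustration):

```r
mydata <- list(siteA = data.frame(x = 1:2), siteB = data.frame(x = 3:4))

## add a column holding each data frame's name, then stack them
tagged <- Map(function(d, nm) transform(d, source = nm), mydata, names(mydata))
combined <- do.call(rbind, tagged)
combined$source <- factor(combined$source)  # treat the name as a factor later
```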
Re: [R] problem selecting rows meeting a criterion
Hi,

See comments inline:

On Aug 11, 2009, at 2:45 PM, Jim Bouldin wrote:
> No problem John, thanks for your help, and also thanks to Dan and Patrick. Wasn't able to read or try anybody's suggestions yesterday. Here's what I've discovered in the meantime. What I did not include yesterday is that my original data frame, called "data", was this:
>
>    X Y       V3
> 1  1 1 0.00
> 2  2 1 8.062258
> 3  3 1 2.236068
> 4  4 1 6.324555
> 5  5 1 5.00
> 6  1 2 8.062258
> 7  2 2 0.00
> 8  3 2 9.486833
> 9  4 2 2.236068
> 10 5 2 5.656854
> 11 1 3 2.236068
> 12 2 3 9.486833
> 13 3 3 0.00
> 14 4 3 8.062258
> 15 5 3 5.099020
> 16 1 4 6.324555
> 17 2 4 2.236068
> 18 3 4 8.062258
> 19 4 4 0.00
> 20 5 4 5.385165
> 21 1 5 5.00
> 22 2 5 5.656854
> 23 3 5 5.099020
> 24 4 5 5.385165
> 25 5 5 0.00
>
> To this data frame I applied the following command:
>
> data <- data[data$V3 > 0,]; data   # to remove all rows where V3 = 0
>
> giving me this (the point from which I started yesterday):
>
>    X Y       V3
> 2  2 1 8.062258
> 3  3 1 2.236068
> 4  4 1 6.324555
> 5  5 1 5.00
> 6  1 2 8.062258
> 8  3 2 9.486833
> 9  4 2 2.236068
> 10 5 2 5.656854
> 11 1 3 2.236068
> 12 2 3 9.486833
> 14 4 3 8.062258
> 15 5 3 5.099020
> 16 1 4 6.324555
> 17 2 4 2.236068
> 18 3 4 8.062258
> 20 5 4 5.385165
> 21 1 5 5.00
> 22 2 5 5.656854
> 23 3 5 5.099020
> 24 4 5 5.385165
>
> So far so good. But when I then submit the command
>
> data = data[X>Y,]   # to select all rows where X > Y

This won't work in general, and is probably only working in this particular case because you have already defined somewhere in your workspace variables named X and Y. What you wrote above isn't taking the values X, Y from data$X and data$Y, respectively, but rather from the X and Y defined elsewhere. Instead of doing data[X > Y,], do:

data[data$X > data$Y,]

This should get you what you're expecting.

> I get the problem result already mentioned, namely:
>
>    X Y       V3
> 3  3 1 2.236068
> 4  4 1 6.324555
> 5  5 1 5.00
> 6  1 2 8.062258
> 10 5 2 5.656854
> 11 1 3 2.236068
> 12 2 3 9.486833
> 17 2 4 2.236068
> 18 3 4 8.062258
> 24 4 5 5.385165
>
> which is clearly wrong!
> It doesn't matter if I give a new name to the data frame at each step or not, or whether I use the name "data" or not. It always gives the same wrong answer. However, if I instead use the command subset(data, X>Y), I get the right answer, namely:
>
>    X Y       V3
> 2  2 1 8.062258
> 3  3 1 2.236068
> 4  4 1 6.324555
> 5  5 1 5.00
> 8  3 2 9.486833
> 9  4 2 2.236068
> 10 5 2 5.656854
> 14 4 3 8.062258
> 15 5 3 5.099020
> 20 5 4 5.385165

That's because when you use X and Y in your subset(...) call, THIS takes X and Y to mean data$X and data$Y.

> OK so the lesson so far is "use the subset function".

Hopefully you're learning a slightly different lesson now :-)

Does that clear things up at all?

-steve
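A small self-contained illustration of the two equivalent approaches (the values here are made up, not Jim's data):

```r
df <- data.frame(X = c(2, 1, 5), Y = c(1, 2, 4), V3 = c(8.06, 8.06, 5.39))

## qualify the columns explicitly so the comparison uses df's own X and Y
df[df$X > df$Y, ]

## or let subset() evaluate the expression inside df
subset(df, X > Y)

## both keep rows 1 and 3, where X exceeds Y
```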
Re: [R] R help from command line
Hi,

On Aug 11, 2009, at 3:43 PM, Peng Yu wrote:
> Hi, I frequently need to open multiple help pages in R, which requires starting multiple R sessions. I am wondering if there is a way to invoke the help page from the command line, just like 'man'.

I haven't been paying attention, but are you working on a machine with a windowing system + browser? If so, set your help files to be shown in the browser:

R> options(htmlhelp=TRUE)

The next time you ask for some ?help, it should pop up a browser window. Good enough?

-steve
Re: [R] logged2
Hi,

On Aug 12, 2009, at 9:26 AM, amor Gandhi wrote:
> Hi, Can you please tell me what the function logged2 in R does? As I have:
>
> ?logged2
> No documentation for 'logged2' in specified packages and libraries:
> you could try '??logged2'
>
> ??logged2
> No help files found with alias or concept or title matching 'logged2'
> using fuzzy matching.

There is no logged2 function in (base) R. Where are you seeing this function used? Running:

R> RSiteSearch('logged2')

only brings up "logged2" as a parameter to some functions in the samr package, but nothing else. If you can let us know where you're seeing this function used, we could likely provide more help.

-steve
Re: [R] problem loading ncdf library on MAC
Hi,

On Aug 12, 2009, at 2:23 PM, Dan Kelley wrote:
> I think ncdf is broken now, and so is RNetCDF. I can't build from source, either. If I find a solution, I'll post it here. FYI, I have an Intel-based Mac running the latest OS and with R 2.9.1.

While I haven't compiled it myself, it seems that RNetCDF requires a library that you probably don't have installed. Check out its build report:

http://www.r-project.org/nosvn/R.check/r-release-macosx-ix86/RNetCDF-00install.html

Probably it's looking for the udunits library (http://www.unidata.ucar.edu/software/udunits/ (?)) and you don't have it. I just tried to compile ncdf on my machine, and it also failed ... looks like you need netcdf.h. Just look at the status reporting that building the package gives:

checking netcdf.h usability... no
checking netcdf.h presence... no
checking for netcdf.h... no
checking /usr/local/include/netcdf.h usability... no
checking /usr/local/include/netcdf.h presence... no
checking for /usr/local/include/netcdf.h... no
checking /usr/include/netcdf.h usability... no
checking /usr/include/netcdf.h presence... no
...

I guess you just need to get the required libs and try again. Or are you getting different errors?

-steve
Re: [R] Nominal variables in SVM?
Hi,

On Aug 12, 2009, at 2:53 PM, Noah Silverman wrote:
> Hi, The answers to my previous question about nominal variables have led me to a more important question. What is the "best practice" way to feed a nominal variable to an SVM? For example:
>
> color = c("red", "blue", "green")
>
> I could translate that into an index so I wind up with color = (1, 2, 3), but my concern is that the SVM will now think that the values are numeric in "range" and not discrete conditions. Another thought would be to create 3 binary variables from the single color variable, so I have:
>
> red = (0,1)
> blue = (0,1)
> green = (0,1)
>
> An example fed to the SVM would have one positive and two negative values to indicate the color value, i.e. for a blue example: red = 0, blue = 1, green = 0.

Do it this way. So, imagine the features for your examples were color and height: your "feature matrix" for N examples would be N x 4:

0,1,0,15  # blue object, height 15
1,0,0,10  # red object, height 10
0,0,1,5   # green object, height 5
...

-steve
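The "three binary variables" encoding described above is exactly what model.matrix() builds from a factor; a short sketch with made-up data:

```r
color  <- factor(c("red", "blue", "green", "blue"))
height <- c(10, 15, 5, 12)

## '0 +' keeps all three indicator columns rather than dropping a baseline level
X <- model.matrix(~ 0 + color + height)
X   # one 0/1 column per color level, plus the numeric height column
```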
Re: [R] request: Help
Hi,

On Aug 13, 2009, at 3:43 PM, Sarjinder Singh wrote:
> Dear Sir/Madam, Good Day! How can we make an output file in R? In FORTRAN, we could do as follows:
>
>       WRITE (42, 107) x, y
>  107  FORMAT ( 2X, F9.3, 2X, F4.2 )
>
> What is equivalent to this in R?

See:

?file
?cat
?sprintf

-steve
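Putting those three help pages together, the FORTRAN statement might translate to something like this (a sketch; the filename and values are invented):

```r
x <- 123.457
y <- 0.98

## FORMAT( 2X, F9.3, 2X, F4.2 )  ->  "  %9.3f  %4.2f"
line <- sprintf("  %9.3f  %4.2f", x, y)
cat(line, "\n", file = "output.txt")
```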
Re: [R] Coding problem: How can I extract substring of function callwithin the function
Hi,

On Aug 13, 2009, at 5:30 PM, Pitt, Joel wrote:
> Thanks. It's a great solution to part of my problem. It provides the functionality I want and would be no harder for students to use than my approach. I'm still interested in how to make what I was trying to do work -- simply to add to my own R programming proficiency.

I wasn't going to really chime in, but I couldn't help it given the irony in your last sentence (sorry, I don't mean to sound like an ass). I think trying to help your students with some "bumps on the road" is admirable, but this seems just a bit misdirected. As you say, you are on a quest to add to your R programming proficiency, and will apply this newly found technique to actively handicap your students' proficiency.

I guess you're teaching some intro stat analysis class, and glossing over the fact that functions like mean explicitly return NA in the presence of NAs, but this is important to know and recognize when analyzing any real data, because there will surely be NAs "in the wild". Knowing that you should consciously and deliberately deal with/ignore them from the start might be a good first/early lesson for students to digest.

Like I said, I don't mean to sound like an ass, but I think glossing over details of working with data should be avoided. Your other additions, like adding options to plotting/graphing functions, seem less harmful to me. But if you're trampling over some base:: function for something trivial like changing the value of one default parameter to something else, you might as well just get them to learn how to use the ? asap as well.
-steve
Re: [R] Coding problem: How can I extract substring of function callwithin the function
Hi Joel,

On Aug 13, 2009, at 6:10 PM, Pitt, Joel wrote:
> Your objections are quite cogent, but I believe they are misdirected. I'm not at all interested in having my students ignore the existence of NAs, and that's precisely why I'm not using the Default package as someone else suggested. But the mere existence of missing values in a data set doesn't make computation of the mean entirely useless -- why else would the na.rm option be available?

I agree: it's certainly not useless to calculate the mean of a series of numbers while ignoring NAs.

> If a student using my version of mean uses it to find the mean of a variable that has missing values, the function first prints a warning message and only then returns a value. Here's an example:
>
> > length(x)
> [1] 101
> > mean(x)
> Warning: x has 3 missing values.
> [1] 51.69388

Looks nice.

> My only goal in this project has been to provide them with a somewhat friendlier version of R. It was not conceived as part of a quest to improve my programming proficiency.

I didn't think you were pursuing this to improve your programming proficiency from the outset. I just wanted to point out that you found a situation which you wanted to work around for (i) your precise use case. But it was because you were trying to solve (i) that you stumbled on (ii): wanting to know how one might do this for curiosity's sake, to increase your proficiency (perhaps for some other use case in the future). It was simply reason (ii) that prompted me to say something, only because you might not have seen your approach as potentially handicapping your students' (maybe subconscious(?)) pursuit of proficiency. I just wanted to point out that this is what you *might* be doing in the short term. Then again, it might not be. Everyone has their own style of teaching, and it's your prerogative to do it as you see fit.
I wouldn't presume to know which way is best, so good luck with the upcoming semester :-)

-steve
Re: [R] creating list of the from 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, . . ., n, n, n?
Hi,

On Aug 14, 2009, at 10:44 AM, John Sorkin wrote:
> Windows XP, R 2.8.1
> Is there any way to make a list of the form 1,1,1,2,2,2,3,3,3,4,4,4, . . ., n,n,n?

Like so?

R> rep(1:10, each=3)
 [1]  1  1  1  2  2  2  3  3  3  4  4  4  5  5  5  6  6  6  7  7  7  8  8  8  9
[26]  9  9 10 10 10

-steve
[R] Assigning values based on a separate reference (lookup) table
Dear R Users,

I have a data frame of 360 rows by 720 columns (259200 values). For each value in this grid I am hoping to apply an equation, to generate a new grid. One of the parts of the equation (called 'p') relies on reading from a separate reference table. This is Table 4 at:

http://www.fao.org/docrep/s2022e/s2022e07.htm#3.1.3%20blaney%20criddle%20method

(scroll down a little). Therefore, 'p' relies on the latitude of the values in the initial 360 x 720 data frame. The row names of the data frame contain the latitude values, and these range between 89.75 and -89.75 (the latter being south of the Equator).

My question is, how do I go about forming a loop to read each of the 259200 values and assign it a 'p' value (from the associated reference table), based on its latitude? My thinking was to do a series of 'if' statements, but this soon got very, very messy - any ideas which get the job done (and aren't a riddle to follow) would be most welcome.

Many thanks for any advice,

Steve
Re: [R] Assigning values based on a separate reference (lookup) table
Thanks to you both for your responses.

I think these approaches will *nearly* do the trick; however, the problem is that the reference/lookup table is based on 'bins' of latitude values, e.g. >61, 60-56, 55-51, 50-46 etc., whereas the actual data (in my 720 x 360 data frame) are not binned, e.g. 89.75, 89.25, 88.75, 88.25, 87.75 etc. - instead they 'increment' by -0.5 each time, and therefore many of the 259200 values which are in the data frame will have latitude values falling into the same 'reference' bin. It's for this reason that I think the 'merge' approach might fall down, unless there's a way of telling 'merge' that latitudes can still be considered to match if they fall within a range.

For example, if my 720 x 360 data frame has values whose corresponding latitude (row name) values are, say, 56.3, 55.9, 58.2, 56.8 and 57.3, then the original value in the grid needs to be assigned a 'p' value which corresponds with what is read off the reference table from the bin 56-60.

Hope this makes sense! If not, please feel free to ask for clarification.

Many thanks again,

Steve
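One way to match un-binned latitudes to the binned table rows is cut() (or findInterval()); a sketch, where the bin edges and p values below are placeholders - the real ones come from FAO Table 4:

```r
lats   <- c(56.3, 55.9, 58.2, 56.8, 57.3)   # example latitudes
breaks <- c(45, 50, 55, 60, Inf)            # placeholder bin edges
p.tab  <- c(0.27, 0.28, 0.29, 0.30)         # placeholder p value per bin

bin <- cut(lats, breaks = breaks)   # which bin each latitude falls in
p   <- p.tab[as.integer(bin)]       # look up p for each value
```

Applied to the row names of the full grid, this avoids both the series of 'if' statements and the exact-match limitation of merge().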
[R] Rounding to the nearest 5
Dear all,

A hopefully simple question: how do I round a series of values (held in an object) to the nearest 5? I've checked out trunc, round, floor and ceiling, but these appear to be more tailored towards rounding decimal places.

Thanks,

Steve
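A common idiom for this: divide by 5, round to a whole number, then multiply back (values below are made up):

```r
x <- c(3, 7.4, 12.6, -8)
round(x / 5) * 5   # scale, round, scale back: 5 5 15 -10
```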
[R] Reshape package: Casting data to form a grid
Dear R Users,

I'm trying to use the 'cast' function in the 'reshape' package to convert column-format data to gridded-format data. A sample of my dataset is as follows:

> head(finalframe)
  Latitude Longitude Temperature OrigLat  p-value Blaney
1      -90    -38.75          NA  -87.75 17.10167     NA
2      -90    135.75          NA  -87.75 17.10167     NA
3      -90     80.25          NA  -87.75 17.10167     NA
4      -90     95.75          NA  -87.75 17.10167     NA
5      -90     66.75          NA  -87.75 17.10167     NA
6      -90     75.75          NA  -87.75 17.10167     NA

I'm attempting to form a grid based on the OrigLat, Longitude and Blaney columns, to form the rows, columns and values of the new grid respectively. The command I've been using is:

cast_test <- cast(finalframe, finalframe$OrigLat~variable, finalframe$Longitude~variable, finalframe$Blaney~variable)
Error: Casting formula contains variables not found in molten data: finalframe$OrigLat, variable

And I've tried removing the ~variable suffixes:

cast_test <- cast(finalframe, finalframe$OrigLat, finalframe$Longitude, finalframe$Blaney)
Error: Casting formula contains variables not found in molten data: -87.75-87.75-87.75-87.75 [etc etc]

I'm not sure how to get round this error, nor what the 'molten data' is that the error is referring to. I'm assuming it means the data frame presented above, yet the variables are clearly present!

Any help or advice on this would be most welcomed.

Many thanks,

Steve
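For what it's worth, cast() expects bare column names in a single formula over 'molten' (melted) data, not $-prefixed vectors; a hedged sketch of one way this might look (argument names per the reshape package - worth double-checking against its documentation):

```r
library(reshape)

## melt to long ('molten') form: id columns plus 'variable'/'value' columns
molten <- melt(finalframe, id.vars = c("OrigLat", "Longitude"),
               measure.vars = "Blaney")

## rows = OrigLat, columns = Longitude, cells = the Blaney values
grid <- cast(molten, OrigLat ~ Longitude)
```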
[R] Replacing NA values in one column of a data.frame
Dear all,

I'm trying to replace NA values with - in one column of a data frame. I've tried using is.na and the testdata[testdata$onecolumn==NA] <- approach, but whilst neither generates errors, neither results in -s appearing - the NAs remain there!

I'd be grateful for any advice on what I'm doing wrong, or any other suitable approaches.

Many thanks,

Steve
Re: [R] Replacing NA values in one column of a data.frame
Hi,

On Aug 18, 2009, at 10:19 AM, John Kane wrote:
> Perhaps testdata$onecolumn[testdata$onecolumn==NA] <-

I don't think this would work -- the is.na function is the way to go:

R> a <- c(1,2,3,NA,NA,10,20,NA,NA)
R> a
[1]  1  2  3 NA NA 10 20 NA NA
R> a[a == NA] <- 999
R> a
[1]  1  2  3 NA NA 10 20 NA NA
R> a[is.na(a)] <- 999
R> a
[1]   1   2   3 999 999  10  20 999 999

Hope that helps,
-steve

> --- On Mon, 8/17/09, Steve Murray wrote:
>> I'm trying to replace NA values with - in one column of a data frame. I've tried using is.na and the testdata[testdata$onecolumn==NA] <- approach, but whilst neither generates errors, neither results in -s appearing - the NAs remain there! I'd be grateful for any advice on what I'm doing wrong or any other suitable approaches.
Re: [R] printing a dataframe summary to disk
Hi,

On Aug 17, 2009, at 10:54 AM, Philip A. Viton wrote:
> I'd like to write the summary of a dataframe to disk, so that it looks essentially the same as what you'd see on screen; but I can't seem to do it. Can someone tell me how? Thanks!

Look at:

?write.table

and friends. You can write the data.frame that way. If you're looking to serialize the result of a call to summary(my.data.frame) to disk, just note that the summary function returns the summary as a table (and doesn't simply print it to the terminal), so:

my.summary <- summary(some.data.frame)
write.table(my.summary, quote=FALSE, file="summary.txt")

-steve
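An alternative, if the goal is a copy of exactly what prints on screen, is capture.output() from the utils package (data frame below is invented for illustration):

```r
df <- data.frame(a = 1:5, b = c(2.5, 3.1, 4.8, 0.2, 9.9))

## writes the printed summary to disk as it appears in the console
capture.output(summary(df), file = "summary.txt")
```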
Re: [R] Remove columns
Hi Alberto,

On Aug 18, 2009, at 4:14 AM, Alberto Lora M wrote:
> Hi Everybody, Could somebody help me? I need to remove the columns where the sum of their components is equal to zero. For example:
>
> a <- matrix(c(0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,1,0,0,1,0), ncol=4)
> a
>      [,1] [,2] [,3] [,4]
> [1,]    0    0    0    1
> [2,]    0    1    0    1
> [3,]    0    0    0    0
> [4,]    0    1    0    0
> [5,]    0    0    0    1
> [6,]    0    0    0    0
>
> Columns 1 and 3 should be removed; the result should be the following matrix:
>
>      [,2] [,4]
> [1,]    0    1
> [2,]    1    1
> [3,]    0    0
> [4,]    1    0
> [5,]    0    1
> [6,]    0    0

Try this:

R> a[, -which(colSums(a) == 0)]
     [,1] [,2]
[1,]    0    1
[2,]    1    1
[3,]    0    0
[4,]    1    0
[5,]    0    1
[6,]    0    0

Indexing into a matrix/vector/data.frame/list/whatever with negative numbers removes those elements from the result.

-steve
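One caveat worth noting with the which()-based answer: if no column sums to zero, which() returns a zero-length index and `a[, -which(...)]` then drops every column. Indexing with a logical vector (plus drop = FALSE, to keep a one-column result as a matrix) avoids both pitfalls:

```r
a <- matrix(c(0,0,0,0,0,0, 0,1,0,1,0,0, 0,0,0,0,0,0, 1,1,0,0,1,0), ncol = 4)

## keep only columns whose sum is non-zero; safe when none are zero
a[, colSums(a) != 0, drop = FALSE]   # keeps columns 2 and 4
```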
Re: [R] lm.fit algo
Hi,

On Aug 17, 2009, at 5:09 PM, Pavlo Kononenko wrote:
> Hi, everyone. This is a little silly, but I can't figure out the algorithm behind the lm.fit function used in the context of the promax rotation algorithm. The promax function is:
>
> promax <- function(x, m = 4) {
>   if (ncol(x) < 2) return(x)
>   dn <- dimnames(x)
>   xx <- varimax(x)
>   x <- xx$loadings
>   Q <- x * abs(x)^(m - 1)
>   U <- lm.fit(x, Q)$coefficients
>   d <- diag(solve(t(U) %*% U))
>   U <- U %*% diag(sqrt(d))
>   dimnames(U) <- NULL
>   z <- x %*% U
>   U <- xx$rotmat %*% U
>   dimnames(z) <- dn
>   class(z) <- "loadings"
>   list(loadings = z, rotmat = U, crap = x, coeff = Q)
> }
>
> And the line I'm having trouble with is:
>
> U <- lm.fit(x, Q)$coefficients

Isn't this doing a least squares regression using the predictor variables in x and the (I guess) real-valued numbers in Q? x is a matrix of n (observations) by p (predictors). The $coefficients is just taking the vector of coefficients/weights over the predictors -- this would be a vector of length p -- such that x %*% t(t(U)) ~ Q

(* t(t(U)) is ugly, but I just want to say "get U to be a column vector"; * ~ is used as "almost equals")

You'll need some numerical/scientific/matrix library in Java; perhaps this could be a place to start:

http://commons.apache.org/math/userguide/stat.html#a1.5_Multiple_linear_regression

Hope that helps,
-steve
Re: [R] Newbie that don't understand R code
Hi,

Comments inline and at end:

On Aug 17, 2009, at 11:36 AM, kfcnhl wrote:
> I got some R code that I don't understand. Question as comment in code:
>
> rAC <- function(name, n, d, theta) {
>   ## generic function for Archimedean copula simulation
>   illegalpar <- switch(name,
>     clayton = (theta < 0),
>     gumbel = (theta < 1),
>     frank = (theta < 0),
>     BB9 = ((theta[1] < 1) | (theta[2] < 0)),
>     GIG = ((theta[2] < 0) | (theta[3] < 0) |
>            ((theta[1] > 0) & (theta[3] == 0)) |
>            ((theta[1] < 0) & (theta[2] == 0))))
>   if (illegalpar) stop("Illegal parameter value")
>   independence <- switch(name,
>     clayton = (theta == 0),
>     gumbel = (theta == 1),
>     frank = (theta == 0),
>     BB9 = (theta[1] == 1),
>     GIG = FALSE)
>   U <- runif(n * d)
>   U <- matrix(U, nrow = n, ncol = d)
>   if (independence) return(U)
>   Y <- switch(name,
>     clayton = rgamma(n, 1/theta),
>     gumbel = rstable(n, 1/theta) * (cos(pi/(2 * theta)))^theta,
>     frank = rFrankMix(n, theta),
>     BB9 = rBB9Mix(n, theta),
>     GIG = rGIG(n, theta[1], theta[2], theta[3]))
>   Y <- matrix(Y, nrow = n, ncol = d)
>   phi.inverse <- switch(name,
>     clayton = function(t, theta)  // where is t coming from, what is phi inverse
>     {
>       (1 + t)^(-1/theta)
>     }

t isn't coming from anywhere; it's just a parameter to the function definition here. phi.inverse will be THE FUNCTION returned by the switch statement here, depending on the value of ``name``.
>         , gumbel = function(t, theta) {
>             exp(-t^(1/theta))
>         }
>         , frank = function(t, theta) {
>             (-1/theta) * log(1 - (1 - exp(-theta)) * exp(-t))
>         }
>         , BB9 = function(t, theta) {
>             exp(-(theta[2]^theta[1] + t)^(1/theta[1]) + theta[2])
>         }
>         , GIG = function(t, theta) {
>             lambda <- theta[1]
>             chi <- theta[2]
>             psi <- theta[3]
>             if (chi == 0)
>                 out <- (1 + 2*t/psi)^(-lambda)
>             else if (psi == 0)
>                 out <- 2^(lambda+1) *
>                     exp(besselM3(lambda, sqrt(2*chi*t), logvalue=TRUE) -
>                         lambda*log(2*chi*t)/2) / gamma(-lambda)
>             else
>                 out <- exp(besselM3(lambda, sqrt(chi*(psi+2*t)), logvalue=TRUE) +
>                            lambda*log(chi*psi)/2 -
>                            besselM3(lambda, sqrt(chi*psi), logvalue=TRUE) -
>                            lambda*log(chi*(psi+2*t))/2)
>             out
>         })
>     phi.inverse(-log(U)/Y, theta)

phi.inverse was defined as the function returned by the switch statement. ``- log(U)/Y`` is passed in to the function's ``t`` argument.

Does that help?

-steve
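[Editor's note: the pattern that confused the poster -- switch() returning a function that is called later -- boils down to a few lines. A minimal sketch with made-up names:

```r
# switch() selects ONE of several function definitions based on `name`;
# the chosen function is stored and can be called later with any argument,
# which is what gets bound to its parameter `t`.
make.phi.inverse <- function(name) {
  switch(name,
         clayton = function(t, theta) (1 + t)^(-1/theta),
         gumbel  = function(t, theta) exp(-t^(1/theta)))
}

phi.inverse <- make.phi.inverse("clayton")
phi.inverse(1, theta = 2)   # 1 is bound to t: (1 + 1)^(-1/2) ~ 0.707
```

In rAC(), the call `phi.inverse(-log(U)/Y, theta)` binds the matrix `-log(U)/Y` to `t` in exactly the same way.]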
Re: [R] value of nth percentile
Hi Ajay,

On Aug 18, 2009, at 8:16 AM, Ajay Singh wrote:
> Dear All,
> I have to get the value of, say, the 90th percentile of a precipitation
> time series. The series is of daily precipitation values over 96 years,
> and I have to get the 90th percentile value of daily precipitation for
> each year. If you know the R code or command for this please let me
> know. I would appreciate your early response.

R> dat <- rnorm(100, mean=10, sd=2)
R> quantile(dat, .9)
     90%
12.53047
R> sum(dat < quantile(dat, .9)) / length(dat)
[1] 0.9

-steve
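[Editor's note: to get the 90th percentile separately for each of the 96 years, one option is tapply() over a year grouping variable. A sketch with made-up data and column names:

```r
# Hypothetical data: daily precipitation values with a year label per day.
set.seed(42)
precip <- data.frame(year  = rep(2000:2002, each = 365),
                     value = rexp(3 * 365))

# 90th percentile of daily precipitation within each year:
# returns one named value per year.
tapply(precip$value, precip$year, quantile, probs = 0.9)
```

With real data, `year` would be extracted from the series' dates.]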
Re: [R] open txt
On Aug 18, 2009, at 9:41 AM, Stela Valenti Raupp wrote [translated from Portuguese]:
> I can't open the txt file in R; it gives the message:
> Warning message:
> In file(file, "r") : cannot open file 'plantula.txt': No such file or directory
> The file is in the same folder as the script. I don't know what the problem is.

Your file "plantula.txt" is not there (I think).

In short: you are passing some function the name of a file that doesn't exist. Try passing the absolute path to the plantula.txt file to your call to ``file()``.

-steve
Re: [R] Tr : create a table in the console!!
Hi,

On Aug 18, 2009, at 10:52 AM, Inchallah Yarab wrote:
> - Forwarded message
> From: Inchallah Yarab
> To: r-help@r-project.org
> Sent: Tuesday, 18 August 2009, 16:26:20
> Subject: create a table in the console!!
>
> HI
> I want to make a table with R (in the console):
>
>               NumberOfPolicies
> No_GWPMax                    8
> [0-1000]                     4
> [1000-3000]                  3
> [> 3000]                     5
>
> I begin by calculating the number of policies in each GWP_Max class:
>
> Data1 <- read.csv2("c:/Total1.csv", sep=",")
> Data2 <- read.csv2("c:/GWPMax1.csv", sep=",")[1:20, 1:2]
> M <- merge(Data1, Data2, by.x = "Policy.Number", by.y = "Policy.Number",
>            all.x = TRUE, all.y = TRUE)
> (No_GWPMax <- nrow(M[M[,25] == "NA",]))
> [1] 8
> M2 <- merge(Data1, Data2, by.x = "Policy.Number", by.y = "Policy.Number")
> M2$GWP_Max <- as.numeric(as.character(M2$GWP_Max))
> class1 <- M2[M2[,25] > 0 & M2[,25] < 1000,]
> (NbpolicyClass1 <- nrow(class1))
> [1] 5
> class2 <- M2[M2[,25] > 1000 & M2[,25] < 3000,]
> (NbpolicyClass2 <- nrow(class2))
> [1] 3
> class3 <- M2[M2[,25] > 3000,]
> (NbpolicyClass3 <- nrow(class3))
> [1] 4
>
> can you help me?

I don't understand what you want to do. From the code you've pasted, you have already extracted the numbers you wanted, so what do you mean when you say "I want to do a table" with them? Do you just want to put them in a data.frame, or something?

-steve
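[Editor's note: if the goal is counts per GWP_Max class as one printable object, cut() plus table() does it in a single shot. A sketch with made-up values; the breaks are assumed from the classes above, with NA standing in for "No_GWPMax":

```r
# Made-up GWP_Max values; NA plays the role of the "No_GWPMax" policies.
gwp <- c(NA, NA, 250, 700, 1500, 2500, 3200, 5000, 900, NA)

# cut() bins each value into a class; table() counts per class,
# and useNA = "ifany" adds a row for the NA (No_GWPMax) cases.
counts <- table(cut(gwp, breaks = c(0, 1000, 3000, Inf),
                    labels = c("[0-1000]", "[1000-3000]", "[> 3000]")),
                useNA = "ifany")

as.data.frame(counts)   # one row per class, ready to print in the console
```

This avoids building each class count by hand with repeated subsetting.]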
Re: [R] Erros with RVM and LSSVM from kernlab library
Howdy,

On Aug 19, 2009, at 2:54 PM, Noah Silverman wrote:
> Hi Steve,
> No custom kernel. (This is the exact same data that I call svm with;
> svm works without a complaint.)
> traindata is just a dataframe of numerical attributes.
> trainlabels is just a vector of labels ("good", "bad").
> Then I call: model <- rvm(x, y)

Is x really a data.frame? Can you try to turn it into a matrix to see if it will get you over this speed bump?

model <- rvm(as.matrix(x), y)

I reckon if x is a data.frame, R is invoking the rvm function that's meant to work on list(s), rather than the matrix which you think you're passing in.

Does that do the trick?

-steve
Re: [R] Erros with RVM and LSSVM from kernlab library
Hi,

On Aug 19, 2009, at 1:27 PM, Noah Silverman wrote:
> Hello,
> In my ongoing quest to develop a "best" model, I'm testing various forms
> of SVM to see which is best for my application.
> I have been using the SVM from the e1071 library without problem for
> several weeks. Now, I'm interested in RVM and LSSVM to see if I get
> better performance.
> When running RVM or LSSVM on the exact same data as the SVM{e1071}, I
> get an error that I don't understand:
>
> Error in .local(x, ...) : kernel must inherit from class 'kernel'
>
> Does this make sense to anyone? Can you suggest how to resolve this?

Sure, it just means that whatever you are passing as a value to the kernel= parameter of your function call is not a kernel function (that kernlab knows about).

Did you rig up a custom kernel function? If so -- be sure to set its class properly. Otherwise, can you provide something of a self-contained piece of code that you're using to invoke these functions such that it's giving you these errors?

-steve
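[Editor's note: a sketch of what "inherit from class 'kernel'" means in kernlab. The built-in kernel generators such as rbfdot() return objects that pass this check, and a hand-rolled kernel function needs its class set; `mykern` below is a made-up example:

```r
library(kernlab)

# kernlab's built-in kernel generators return kernel objects
k <- rbfdot(sigma = 0.1)
is(k, "kernel")          # TRUE: this is what the error is testing for

# A custom kernel function must have its class set to pass that check
mykern <- function(x, y) sum(x * y)   # hypothetical linear kernel
class(mykern) <- "kernel"
is(mykern, "kernel")     # TRUE after setting the class
```

Passing a plain unclassed function (or a string kernlab doesn't recognize) as kernel= is what triggers the ".local(x, ...)" error above.]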
Re: [R] Erros with RVM and LSSVM from kernlab library
> Steve,
> That makes sense, except that x is a data.frame with about 70 columns.
> So I don't see how it would convert to a list.

Yeah ... not sure if that's what's happening (R class relationships/testing is still a bit of a mystery to me), but see:

R> df <- data.frame(a=1:10, b=1:10)
R> is(df)
[1] "data.frame" "list"       "oldClass"   "mpinput"    "vector"

But:

R> is(df, 'list')
[1] FALSE

So, in short, I don't know if that's what's happening ... did it fix your problem, tho?

-steve
Re: [R] a naive question
Hi,

On Aug 19, 2009, at 4:59 PM, Bogdan Tanasa wrote:
> Hi, and my apologies for the following very naive question:
> I would like to read a column of numbers in R and plot a histogram. e.g.:
>
> x <- read.table("txSTART");
> y <- as.numeric(x);
>
> and I do obtain the error:
> Error: (list) object cannot be coerced to type 'double'.
> Please could you let me know the way to fix it.

Yeah, you can't do that. What does x look like? Can you show us the result of:

R> head(x)

Assuming it's just a single column, you can access the numbers in the first column like so:

R> x[,1]

You can use that to plot a histogram of the numbers in the first column:

R> hist(x[,1])

-steve
[R] Creating a list of combinations
Dear R Users,

I have 120 objects stored in R's memory and I want to pass the names of these many objects to be held as just one single object. The naming convention is month, year in sequence for all months between January 1986 and December 1995 (e.g. Jan86, Feb86, Mar86... through to Dec95).

I hope to pass all these names (and their data, I guess) to an object called file_list; however, I'm experiencing some problems whereby only the first (and possibly last) names seem to make the list, with the remainder recorded as 'NA' values. Here is my code as it stands:

index <- expand.grid(month=month.abb, year=seq(from=86, to=95, by=1))
for (i in seq(nrow(index))) {
    file_list <- paste(index$month[i], index$year[i], sep='')
    print(file_list[i])
}

Output is as follows:

[1] "Jan86"
[1] NA
[1] NA
[1] NA
# [continues to row 120 as NA]

> file_list; file_list[i]
[1] "Dec95"
[1] NA

> head(index)  # this seems to be working fine
  month year
1   Jan   86
2   Feb   86
3   Mar   86
4   Apr   86
5   May   86
6   Jun   86

Any help on how I can populate file_list correctly with all 120 combinations of month + year (without NAs!) would be gratefully received.

Thanks,
Steve
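[Editor's note: the likely culprit in the loop above is that `file_list <- paste(...)` overwrites file_list with a length-one value on every pass, so `file_list[i]` is NA for every i > 1. Since paste() is vectorized, the whole vector can be built in one call:

```r
# paste() is vectorized over its arguments, so no loop is needed:
# this builds all 120 month-year names at once.
index <- expand.grid(month = month.abb, year = seq(from = 86, to = 95, by = 1))
file_list <- paste(index$month, index$year, sep = '')

head(file_list)    # "Jan86" "Feb86" "Mar86" "Apr86" "May86" "Jun86"
length(file_list)  # 120
```

Alternatively, keeping the loop, indexed assignment `file_list[i] <- paste(...)` would also work.]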
Re: [R] Several simple but hard tasks to do with R
For history of both commands and output, consider running R inside emacs using the ESS package and simply saving the buffer to a file. If you save the session as an "S transcript" file (extension .St) it is also easy to reload and re-execute any part of it. Emacs or xemacs is available on most platforms, including Windows.

Rakknar wrote:
>
> "1. logs. help.search("history") and ?savehistory shows you that R does
> exactly what you want very easily (depending on the platform, which,
> contrary to the posting guide's request, you did not tell us)."
>
> I've already found out about the "history" tool, but it was not useful
> because it only records commands, not output from the commands. The
> commands are already stored in scripts (I use Tinn-R; I don't know if
> you would recommend another); what I want is to store the commands AND
> the output from each. I use the Windows version, by the way.
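[Editor's note: staying inside plain R, sink() plus source(echo = TRUE) can also capture both commands and output; a sketch, with made-up filenames:

```r
# Send console output to a file while still seeing it on screen.
sink("session-log.txt", split = TRUE)

x <- 1:10
summary(x)   # this output goes to both the console and the file

sink()       # stop logging

# sink() alone records only OUTPUT. To record commands AND output when
# running a script, source() it with echo = TRUE while the sink is open:
# source("myscript.R", echo = TRUE)  # prints each command before its output
```

This is less convenient than an ESS transcript buffer, but it works in any R front end, including on Windows with Tinn-R.]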
Re: [R] Is there a construct for conditional comment?
Why not:

if (0) {
    # block chosen when 0
} else {
    # block chosen when 1
}

Greg Snow-2 wrote:
>
> I believe that #if lines for C++ programs are handled by the
> preprocessor, not the compiler. So if you want the same functionality
> for R programs, it would make sense to just preprocess the R file.
>
>> In C++, I can use the following construct to choose either of the two
>> blocks to comment out, but not both. Depending on whether the number
>> after "#if" is zero or not, the commented block can be chosen. I'm
>> wondering if such a thing is possible in R?
>>
>> #if 0
>> commented with 0
>> #else
>> commented with 1
>> #endif
>>
>> Regards,
>> Peng
[R] Using 'unlist' (incorrectly?!) to collate a series of objects
Dear R Users,

I am attempting to write a new netCDF file (using the ncdf package), using 120 grids I've created and which are held in R's memory. I am reaching the point where I try to put the data into the newly created file, but receive the following error:

> put.var.ncdf(evap_file, evap_dims, unlist(noquote(file_list)))
Error in put.var.ncdf(evap_file, evap_dims, unlist(noquote(file_list))) :
  put.var.ncdf: error: you asked to write 31104000 values, but the passed data array only has 120 entries!

I think I understand why this is: the 120 grids contain 31104000 values in total; however, it seems that only the names of the 120 objects are being passed to the file. Earlier in the script, I generated the file names using the following code:

> for (i in seq(nrow(index))) {
      file_list[[i]] <- paste(index$month[i], index$year[i], sep='')
      print(file_list[i])
  }

I was hoping, therefore, that when I do put.var.ncdf and use the 'unlist' function (see original section of code), that since the data associated with the names of the grids are held in memory, both the names *and data* would be passed to the newly created file. However, it seems that only the names are being recognised.

My question is therefore: is there an easy way of passing all 120 grids, using the naming convention held in file_list, to an object which can subsequently be used in the put.var.ncdf statement?

Many thanks for any help,
Steve
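[Editor's note: file_list holds only character strings, which is why unlist() passes 120 names rather than the grids' values. One way to gather the objects themselves by name is mget(), which looks the names up in the workspace. A sketch with two made-up stand-in grids:

```r
# Made-up stand-ins for two of the 120 grids held in the workspace
Jan86 <- matrix(runif(4), 2)
Feb86 <- matrix(runif(4), 2)

file_list <- c("Jan86", "Feb86")

# mget() returns a named list of the OBJECTS, not just their names
grids <- mget(file_list)
str(grids)

# A single numeric vector of all the grid values, suitable for writing:
all_values <- unlist(grids, use.names = FALSE)
length(all_values)  # 8 here; 31104000 with the real 120 grids
```

`all_values` (reshaped to match the variable's dimensions) is then what put.var.ncdf() expects as its data argument.]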
Re: [R] LASSO: glmpath and cv.glmpath
Hi,

On Aug 21, 2009, at 9:47 AM, Peter Schüffler wrote:
> Hi,
> perhaps you can help me to find out how to find the best lambda in a
> LASSO model. I have a feature selection problem with 150 proteins
> potentially predicting Cancer or Noncancer. With a lasso model
>
> fit.glm <- glmpath(x=as.matrix(X), y=target, family="binomial")
>
> (target is 0, 1 <- Cancer/Noncancer; X the proteins, numerical in
> expression), I get the following path (PICTURE 1). One of these models
> is the best, according to its cross-validation (PICTURE 2); the red
> line corresponds to the best cross-validation. It's produced by:
>
> cv <- cv.glmpath(x=as.matrix(X), y=unclass(T)-1, family="binomial",
>                  type="response", plot.it=TRUE, se=TRUE)
> abline(v=cv$fraction[max(which(cv$cv.error==min(cv$cv.error)))],
>        col="red", lty=2, lwd=3)
>
> Does anyone know how to get from the Norm fraction in PICTURE 2 to the
> corresponding model in PICTURE 1? What is the best model? Which
> coefficients does it have? I can only see the best model's
> cross-validation error, but not the actual model. How do I see it?

None of your pictures came through, so I'm not sure exactly what you're trying to point out, but in general the cross validation will help you find the best value of lambda for the lasso. I think it's the value of lambda that you'll use for your downstream analysis.

I haven't used the glmpath package, but I have been using the glmnet package, which is also by Hastie, newer, and I believe covers the same use cases as the glmpath library (though, to be honest, I'm not quite familiar w/ the Cox proportional hazards model). Perhaps you might want to look into it.

Anyway, speaking from my experience w/ the glmnet package, you might try this:

1. Determine the best value of lambda using CV. I guess you can use MSE or R^2 as you see fit as your yardstick of "best."
2. Train a model over all of your data and ask it for the coefficients at the given value of lambda from 1.
3. See which proteins have non-zero coefficients.
4. Divine a biological story that is explained by your statistical findings.
5. Publish.

I guess there are many ways to do model selection, and I'm not sure it's clear how effective they are (which isn't to say that you shouldn't do them)[1] ... you might want to further divide your data into training/tuning/test sets (somewhere between steps 1 and 2) as another means of scoring models.

HTH,
-steve

[1] http://hunch.net/?p=29
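[Editor's note: steps 1-3 above can be sketched with glmnet roughly as follows; the data here is simulated stand-in data, and cv.glmnet()'s `lambda.min` is assumed as the CV-chosen lambda:

```r
library(glmnet)

# Simulated stand-in for the 150-protein / cancer-status data
set.seed(1)
X <- matrix(rnorm(100 * 150), nrow = 100, ncol = 150)
y <- rbinom(100, 1, 0.5)

# Step 1: pick lambda by cross-validation
cv  <- cv.glmnet(X, y, family = "binomial")
lam <- cv$lambda.min

# Step 2: fit on all the data and extract coefficients at that lambda
fit   <- glmnet(X, y, family = "binomial")
coefs <- coef(fit, s = lam)

# Step 3: which proteins have non-zero coefficients?
nonzero <- which(as.vector(coefs)[-1] != 0)   # [-1] drops the intercept
```

The indices in `nonzero` are the selected proteins at the CV-chosen lambda.]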
Re: [R] Convert list to data frame while controlling column types
Hi Allie,

On Aug 21, 2009, at 11:47 AM, Alexander Shenkin wrote:
> Hello all,
> I have a list which I'd like to convert to a data frame, while
> maintaining control of the columns' data types (akin to the colClasses
> argument in read.table). My numeric columns, for example, are getting
> converted to factors by as.data.frame.
> Is there a way to do this, or will I have to do as I am doing right
> now: allow as.data.frame to coerce column types as it sees fit, and
> then convert them back manually?

This doesn't sound right ... are there characters buried in your numeric columns somewhere that might be causing this? I'm pretty sure this shouldn't happen, and a small test case here goes along with my intuition:

R> a <- list(a=1:10, b=rnorm(10), c=LETTERS[1:10])
R> df <- as.data.frame(a)
R> sapply(df, is.factor)
    a     b     c
FALSE FALSE  TRUE

Can you check to see if your data's wonky somehow?

-steve
Re: [R] extra .
Hi,

This is somewhat unrelated, but your answer brings up a question that I've been curious about:

On Aug 21, 2009, at 11:48 AM, William Dunlap wrote:
>> From: r-help-boun...@r-project.org On Behalf Of kfcnhl
>> Sent: Thursday, August 20, 2009 7:34 PM
>> To: r-help@r-project.org
>> Subject: [R] extra .
>>
>> sigma0 <- sqrt((6. * var(maxima))/pi)
>>
>> What does the '.' do here?
>
> In R it does nothing: both '6' and '6.' parse as "numerics" (in C,
> double precision numbers). In SV4 and S+ '6' parses as an integer (in
> C, a long) and '6.' parses as a numeric, so putting the decimal point
> after numerics makes the code a bit more portable, although there are
> not too many cases where the difference is significant. Calls to .C,
> etc., and overflowing arithmetic are the main problem points. In R and
> S+ '6L' represents an integer.

If this is true, I'm curious as to why, when I'm poking through the source code of different packages, I see that some people are careful to explicitly include the L after integers. I can't find a good example at the moment, but you can see one such case in the source to the lm.fit function. The details aren't all that important, but here's one of the lines:

r2 <- if (z$rank < p) (z$rank + 1L):p

I mean, why not just (z$rank + 1):p?

Just wondering if anybody has any insight into that. I've always been curious, and I seem to see it done in many different functions and packages, so I feel like I'm missing something ...

Thanks,
-steve
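[Editor's note: the difference the L suffix makes is easy to see at the console, and the usual motivation in package code is keeping arithmetic in integer type:

```r
class(1)     # "numeric" -- a double
class(1L)    # "integer"

# Mixing an integer with a plain numeric literal silently promotes to double;
# adding 1L keeps the result integer.
typeof(2L + 1)    # "double"
typeof(2L + 1L)   # "integer"

# Integer vectors also take roughly half the memory of doubles:
object.size(1:1000000)              # integer vector
object.size(as.numeric(1:1000000))  # the same values as doubles
```

That silent integer-to-double promotion is why code like `(z$rank + 1L):p` writes `1L`: the sequence endpoints stay integer without any coercion.]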
Re: [R] computation of matrices in list of list
Hi Kathie,

On Sat, Aug 22, 2009 at 1:03 PM, kathie wrote:
> Dear Gabor Grothendieck,
>
> thank you for your comments. I've already tried that, but I got this
> error message:
>
> > Reduce("+", z)
> Error in f(init, x[[i]]) : non-numeric argument to binary operator
>
> anyway, thanks
>
> ps.
> > is.matrix(z[[1]][[1]])
> [1] TRUE
>
> I guess the reason "Reduce" doesn't work is that it is a
> multi-dimensional list.

Yeah, what about a double reduce:

# Make a list of matrix lists:
R> z <- list(list(matrix(rnorm(25),5), matrix(rnorm(25),5), matrix(rnorm(25),5)),
             list(matrix(rnorm(25),5), matrix(rnorm(25),5)),
             list(matrix(rnorm(25),5), matrix(rnorm(25),5), matrix(rnorm(25),5)))

# Reduce all matrices in the nested lists
R> r1 <- lapply(z, function(ms) Reduce("+", ms))
R> r1
[[1]]
           [,1]      [,2]       [,3]       [,4]       [,5]
[1,]  2.4884292  1.058375  1.3235864 -1.7800055  3.1095416
[2,] -0.5077567 -1.120329  1.8128142 -1.4255453  1.2478431
[3,] -3.7495272 -2.702159 -2.1013426 -2.1324515 -0.655
[4,] -2.7359066 -1.437341 -0.1735794  0.4892164 -1.1855285
[5,]  4.4842963 -2.312451 -0.6797429  0.4563329  0.2108545

[[2]]
           [,1]        [,2]        [,3]       [,4]       [,5]
[1,]  1.725147 -0.06565073  0.16204140 -0.4859336  1.0162852
[2,]  2.187191 -0.91075148 -2.37727477  1.1329259 -0.3917582
[3,]  1.471685  0.73675444  1.18658159 -0.7677262  1.5632101
[4,] -1.959942  0.51154059 -0.04049294 -1.3777180 -0.9919192
[5,]  0.609865 -2.04175553  1.01257051  1.3094908 -0.9437275

[[3]]
           [,1]        [,2]       [,3]       [,4]       [,5]
[1,]  0.7896427  1.25712740  1.4208904 -0.4634764  1.3859927
[2,] -0.1193923 -0.03666575 -0.9531145  3.4310667  0.8684956
[3,] -2.3761459  1.16104711 -0.4272411 -2.7792338 -0.3665312
[4,] -2.7372060  1.75061841 -0.6583626  0.6655959 -1.5374698
[5,] -0.5498145 -1.70883781  0.1796487 -0.7663076 -1.3042342

# Reduce that
R> Reduce("+", r1)
          [,1]       [,2]       [,3]       [,4]       [,5]
[1,]  5.003219  2.2498521  2.9065183 -2.7294155  5.5118195
[2,]  1.560042 -2.0677467 -1.5175750  3.1384472  1.7245805
[3,] -4.653988 -0.8043570 -1.3420022 -5.6794115  0.5077933
[4,] -7.433055  0.8248184 -0.8724350 -0.2229058 -3.7149176
[5,]  4.544347 -6.0630447  0.5124764  0.9995161 -2.0371072

-steve
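[Editor's note: the two reductions can also be collapsed into one by flattening the nesting first; a tiny sketch with constant matrices so the result is obvious:

```r
# Flatten the list-of-lists one level, then sum every matrix at once.
z <- list(list(matrix(1, 2, 2), matrix(2, 2, 2)),
          list(matrix(3, 2, 2)))

total <- Reduce("+", unlist(z, recursive = FALSE))
total   # every entry is 1 + 2 + 3 = 6
```

unlist(z, recursive = FALSE) removes only the outer list layer, leaving a plain list of matrices that a single Reduce("+", ...) can handle.]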
Re: [R] Help on comparing two matrices
Hi,

On Sat, Aug 22, 2009 at 2:45 PM, Michael Kogan wrote:
> Hi,
>
> I need to compare two matrices with each other. If you can get one of
> them out of the other by re-sorting the rows and/or the columns, then
> both of them are equal, otherwise they're not. A matrix could look like
> this:
>
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
> [1,]    0    1    1    1    0    1    1    0
> [2,]    1    1    0    0    0    1    0    1
> [3,]    1    0    1    0    0    0    1    1
> [4,]    1    1    0    0    1    0    0    0
> [5,]    1    0    1    1    1    0    0    0
> [6,]    0    1    0    1    1    0    0    0
> [7,]    0    0    0    0    0    1    1    1
>
> Note that each matrix consists of ones and zeros, in each row and in
> each column there are at least three ones and one zero, and each pair
> of rows/columns may have at most two positions where both are ones
> (e.g. for the 1st and 2nd rows those positions are 2 and 6).
>
> I was advised to sort both matrices in the same way and then to compare
> them element by element. But I don't manage to get them sorted... My
> approach is as follows:
>
> 1. Sort the rows by the row sums (greater sums first).
> 2. Sort the columns by the first row (columns with ones in the first
> row go left, columns with zeros go right).
> 3. Save the left part (all columns with ones in the first row) and the
> right part in separate matrices.
> 4. Repeat steps 2 and 3 with both of the created matrices (now taking
> the second row for sorting); repeat until all fragments consist of a
> single column.
> 5. Compose the columns into a sorted matrix.
>
> This algorithm has several problems:
>
> 1. How to make a loop that branches out into two subloops on each
> iteration?
> 2. How to organize the intermediate results and compose them without
> losing the order? Maybe save them in lists and sublists?
> 3. A fundamental problem: if there are rows with equal sums, the result
> may depend on which of them is sorted first. Maybe this algorithm won't
> work at all because of this problem?

Ouch, this seems like a real PITA.
If you want to go about this by implementing the algo you described, I think you'd be best suited via some divide-and-conquer/recursion route:

http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm

Perhaps you can take inspiration from some concrete sorting algorithms that are implemented this way:

Merge sort: http://en.wikipedia.org/wiki/Merge_sort
Quick sort: http://en.wikipedia.org/wiki/Quicksort

Hope that helps,
-steve
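[Editor's note: before any heavy machinery, a cheap necessary (not sufficient) condition is worth a check: if the sorted row sums or sorted column sums differ, the matrices cannot be row/column rearrangements of each other. A sketch with a made-up helper name:

```r
# Quick necessary test for "equal up to row/column permutation":
# permuting rows/columns can never change the multiset of row or column sums.
maybe.equal <- function(a, b) {
  all(dim(a) == dim(b)) &&
    identical(sort(rowSums(a)), sort(rowSums(b))) &&
    identical(sort(colSums(a)), sort(colSums(b)))
}

m1 <- matrix(c(1, 0, 0, 1), 2)
m2 <- m1[2:1, ]                    # same matrix with rows swapped

maybe.equal(m1, m2)                # TRUE: sums match, so possibly equal
maybe.equal(m1, matrix(1, 2, 2))   # FALSE: definitely not rearrangements
```

A TRUE here still needs a full test (sorting to a canonical form, or the graph-isomorphism route discussed later in the thread), but a FALSE settles the question immediately.]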
Re: [R] Help on comparing two matrices
On Sat, Aug 22, 2009 at 2:45 PM, Michael Kogan wrote:
>> 1. Sort the rows by the row sums (greater sums first).
>> 2. Sort the columns by the first row (columns with ones in the first
>> row go left, columns with zeros go right).
>> 3. Save the left part (all columns with ones in the first row) and the
>> right part in separate matrices.
>> 4. Repeat steps 2 and 3 with both of the created matrices (now taking
>> the second row for sorting); repeat until all fragments consist of a
>> single column.
>> 5. Compose the columns into a sorted matrix.

> If you want to go about this by implementing the algo you described, I
> think you'd be best suited via some divide-and-conquer/recursion route:

Starting from step 2, that is.

-steve
Re: [R] Help on comparing two matrices
Hi,

On Sun, Aug 23, 2009 at 4:14 PM, Michael Kogan wrote:
> Thanks for all the replies!
>
> Steve: I don't know whether my suggestion is a good one. I'm quite new
> to programming, have absolutely no experience, and this was the only
> one I could think of. :-) I'm not sure whether I'm able to put your
> tips into practice; unfortunately I had no time for much reading today
> but I'll dive into it tomorrow.

Ok, yeah. I'm not sure what the best way to do this is myself; I would first see if one could reduce these matrices in some principled manner and then do a comparison, which might jump to:

> Ted: Wow, that's heavy reading. In fact the matrices that I need to
> compare are incidence matrices, so I suppose it's exactly the thing I
> need, but I don't know if I have the basic knowledge to understand
> this paper within the next months.

Ted's sol'n.

I haven't read the paper, but its title gives me an idea. Perhaps you can assume the two matrices you are comparing are adjacency matrices for a graph, then use the igraph library to do a graph isomorphism test between the two graphs represented by your adjacency matrices and see if they are the same.

This is probably not the most efficient (computationally) way to do it, but it might be the quickest way out coding-wise.

I see your original example isn't using square matrices, and an adjacency matrix has to be square. Maybe you can pad your matrices with zero rows or columns (depending on what's deficient) as an easy way out. Just an idea.

Of course, if David's solution is what you need, then no need to bother with any of this.
-steve
Re: [R] Help on comparing two matrices
Hi,

It looks like you're getting more good stuff, but just to follow up:

On Aug 24, 2009, at 4:01 PM, Michael Kogan wrote:
> Steve: The two matrices I want to compare really are graph matrices,
> just not adjacency but incidence matrices. There should be a way to
> get an adjacency matrix of a graph out of its incidence matrix but I
> don't know it...

If you're working with graph data, do yourself a favor and install igraph (no matter what solution you end up using for this particular problem).

http://cran.r-project.org/web/packages/igraph/
http://igraph.sourceforge.net/

In there, you'll find the `graph.incidence` function, which creates a graph from its incidence matrix. You can then test whether the two graphs are isomorphic. That would look like so:

library(igraph)
g1 <- graph.incidence(matrix.1)
g2 <- graph.incidence(matrix.2)

is.iso <- graph.isomorphic(g1, g2)

# Or, using the (somehow fast) vf2 algorithm
is.iso <- graph.isomorphic.vf2(g1, g2)

HTH,
-steve
Re: [R] Filtering matrices
Hi,

On Tue, Aug 25, 2009 at 10:11 PM, bwgoudey wrote:
>
> I'm using the rcorr function from the Hmisc library to get pair-wise
> correlations from a set of observations for a set of attributes. This
> returns 3 matrices: one of correlations, one of the number of
> observations seen for each pair, and the final of the P values for
> each correlation seen.
>
> From these three matrices, all I wish to do is return a single matrix,
> based on the first correlation matrix, where each value is above a
> certain correlation, has a certain number of instances and has a
> P-value below some other threshold. My question is: what is the nicest
> way of writing this sort of code?

Build some logical indexing vector against the matrix you want to pass the criteria against, and use that on the matrix that has your values.

If you need more help, please provide three small example matrices and let us know what you'd like your indexing to return. Someone will provide the code to show you the correct way to do it.

HTH,
-steve
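[Editor's note: the "logical indexing" idea looks like this with three tiny stand-in matrices; the matrix names and thresholds are made up:

```r
# Three aligned matrices, as rcorr() would return:
# correlations, pairwise observation counts, and p-values.
r <- matrix(c(1,   0.8, 0.8, 1),  2)
n <- matrix(c(50,  45,  45,  50), 2)
p <- matrix(c(0,   0.01, 0.01, 0), 2)

# One logical mask combining all three criteria, element by element
keep <- (abs(r) > 0.5) & (n >= 40) & (p < 0.05)

# Apply the mask to the correlation matrix; everything failing becomes NA
filtered <- r
filtered[!keep] <- NA
filtered
```

Because the three matrices share the same dimensions, the comparisons combine elementwise into a single logical matrix, which then indexes the correlation matrix directly.]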