[R] quantile() and "factors not allowed"

2010-09-28 Thread Steve
A list (t) that I'm trying to pass to quantile() is causing this error:

Error in  quantile.default(t, probs = c(0.9, 9.95, 0.99))
  factors are not allowed

I've successfully used lists before, and am having difficulty finding my
mistake.  Any suggestions appreciated!

-Steve

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] quantile() and "factors not allowed"

2010-09-28 Thread Steve
The underlying data contained values that caused read.csv to produce factor
columns instead of numeric ones.  Problem fixed!

I also introduced a typo while copying the error into my message, and as
for the poor variable naming, I'll be more careful.

Thanks x3!  Corrected structure:

> str(CPU)
'data.frame':   56470 obs. of  8 variables:
 $ Value       : num  2.91 9.10e-01 1.08e+07 3.88e+06 3.03 ...
 $ Timestamp   : Factor w/ 4835 levels "9/17/2010 15:30",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ MetricId    : Factor w/ 5 levels "cpu.usage.average",..: 1 1 4 4 1 1 4 4 1 1 ...
 $ Unit        : Factor w/ 4 levels "%","count","KB",..: 1 1 3 3 1 1 3 3 1 1 ...
 $ Entity      : Factor w/ 2 levels "system1",..: 2 1 2 1 2 1 2 1 2 1 ...
 $ EntityId    : Factor w/ 3 levels "","EI1",..: 2 3 2 3 2 3 2 3 2 3 ...
 $ IntervalSecs: int  1800 1800 1800 1800 1800 1800 1800 1800 1800 1800 ...
 $ Instance    : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
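For the archive, a minimal sketch of the failure mode and the usual repairs
(the column and file names here are illustrative, not from the original data):

```r
# A numeric-looking column read as a factor reproduces the error:
df <- data.frame(Value = factor(c("2.91", "0.91", "3.03")))
## quantile(df$Value, probs = c(0.9, 0.95, 0.99))  # "factors are not allowed"

# Convert via character first -- as.numeric() applied directly to a
# factor would return the internal level codes, not the values:
v <- as.numeric(as.character(df$Value))
quantile(v, probs = c(0.9, 0.95, 0.99))

# Or prevent the coercion at read time:
## CPU <- read.csv("cpu.csv", stringsAsFactors = FALSE)
```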

> Hi Steve,
>
> The basic problem (as the error suggests) is that data of class
> "factor" is not allowed in quantile.default.  So one of the elements
> of your list must be a factor.  What are the results of:   str(t)  ?
> As a side note, since t() is a function, using t as a variable name
> can be a bit confusing.
>
> If your list is relatively small, you could post the results of dput(t)
> which would allow us to see what your data is actually like and
> perhaps identify the exact problem and offer a solution.
>
> Cheers,
>
> Josh
>
>
> On Tue, Sep 28, 2010 at 5:56 PM, Steve  wrote:
>> A list (t) that I'm trying to pass to quantile() is causing this error:
>>
>> Error in  quantile.default(t, probs = c(0.9, 9.95, 0.99))
>>  factors are not allowed
>>
>> I've successfully used lists before, and am having difficulty finding my
>> mistake.  Any suggestions appreciated!
>>
>> -Steve
>>
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/
>



[R] 3D tomography data

2010-08-31 Thread Steve

Hi all,

I was recently informed about R, as I need a program that can calculate the
nearest neighbour in 3D tomography data along with its vector. However, I am
new to R and it isn't exactly intuitive. Does anyone know of any 3D tutorials
that may help me get started?

Cheers,
Steve



Re: [R] 3D tomography data

2010-09-01 Thread Steve

As I was saying, I want to be able to calculate the nearest neighbour and
its vector. I think this can be done using pairdist or K3est in the spatstat
package. But I have no idea as to how I prepare my data in a form that the
software will recognise. How do I turn my tomography data into pp3 type
format?
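A sketch of one way this might look, assuming the tomography data are point
coordinates held in a data frame with columns x, y, z (hypothetical names):

```r
library(spatstat)  # pp3(), box3(), nndist(), nnwhich()

# Suppose 'tomo' holds the tomography points (hypothetical object):
X <- pp3(tomo$x, tomo$y, tomo$z,
         box3(range(tomo$x), range(tomo$y), range(tomo$z)))

d  <- nndist(X)    # distance from each point to its nearest neighbour
nn <- nnwhich(X)   # index of that nearest neighbour

# The nearest-neighbour vector is then the coordinate difference:
vec <- coords(X)[nn, ] - coords(X)
```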



[R] How do I get a character and a symbol in a legend

2008-01-23 Thread steve
In the following snippet

plot(1:10,1:10,type="n")
points(1:5,1:5,pch="+")
points(6:10,6:10,pch=20)
legend(5,5, c("A","B"), pch=c("+",20))

I want to get a legend with a "+" and a solid circle (pch=20).
However, what I get in the legend is "+" and "2". How can I get a "+" 
and a solid circle?
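A likely explanation, for the record: c("+", 20) is coerced to a character
vector, so 20 becomes "20" and legend() draws only its first character.
Using numeric plotting codes throughout avoids the coercion (43 is the ASCII
code for "+"):

```r
plot(1:10, 1:10, type = "n")
points(1:5, 1:5, pch = "+")
points(6:10, 6:10, pch = 20)
legend(5, 5, c("A", "B"), pch = c(43, 20))  # "+" and a solid circle
```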

thanks,  Steve

 > sessionInfo()
R version 2.6.1 (2007-11-26)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices datasets  utils methods   base

loaded via a namespace (and not attached):
[1] rcompgen_0.1-17



Re: [R] Plot multiple lines, same plot, different axes?

2009-02-08 Thread Steve
Ron

 

It's Steve Thornton here from Leicester in the UK. Hope you are well. I got
your card and newsletter. It sounds like you're still travelling. If you get
this E-mail please mail me back as I'd like to keep in touch.

 

I've been trying to find your E-mail but you seem to have several. My last
messages got bounced back so I hope this one finds you.

 

Take care and hope to hear from you soon

 

Steve.

 




[R] An array of an array of boxplots in lattice

2008-11-17 Thread steve

Using the data set fgl in MASS the following code

layout(matrix(1:9,3,3))
for(i in 1:9){
boxplot(fgl[,i] ~ type, data = fgl,main=dimnames(fgl)[[2]][i])}

produces a 3 by 3 array of plots, each one of which consists of six 
boxplots.


Is it possible to do this in lattice?

Steve

 "R version 2.7.2 (2008-08-25)" on Ubuntu 6.06



Re: [R] An array of an array of boxplots in lattice

2008-11-17 Thread steve

Thank you. Here's my version, using melt instead of do.call(make.groups, ...):

library(reshape)
fgl2 = melt(fgl[,-10])
fgl2$type = fgl$type
bwplot(value ~ type | variable, data = fgl2)

Steve




Deepayan Sarkar wrote:

On Mon, Nov 17, 2008 at 11:15 AM, Chuck Cleland <[EMAIL PROTECTED]> wrote:

On 11/17/2008 1:50 PM, steve wrote:

Using the data set fgl in MASS the following code

layout(matrix(1:9,3,3))
for(i in 1:9){
boxplot(fgl[,i] ~ type, data = fgl,main=dimnames(fgl)[[2]][i])}

produces a 3 by 3 array of plots, each one of which consists of six
boxplots.

Is it possible to do this in lattice?

library(MASS)
library(lattice)

newdf <- reshape(fgl, varying =
list(c('RI','Na','Mg','Al','Si','K','Ca','Ba','Fe')),
 v.names = 'Y',
  times=c('RI','Na','Mg','Al','Si','K','Ca','Ba','Fe'),
 direction='long')


And a slightly less verbose version of this step is:

newdf <- do.call(make.groups, fgl[-10])
newdf$type <- fgl$type

followed by

bwplot(data ~ which | type, data = newdf)

-Deepayan


bwplot(Y ~ type | time, data = newdf, ylab="",
  scales=list(y=list(relation='free')))


Steve

 "R version 2.7.2 (2008-08-25)" on Ubuntu 6.06


--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894






[R] ggplot2 problem

2008-11-26 Thread steve

I'm using ggplot2 2.0.8 and R 2.8.0

df = data.frame(Year = rep(1:5,2))
m = ggplot(df, aes(Year=Year))
m + geom_bar()

Error in get("calculate", env = ., inherits = TRUE)(., ...) :
  attempt to apply non-function
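The likely culprit, judging from the follow-up reply: aes(Year = Year) maps
the variable to an aesthetic named "Year", which no geom understands.
Mapping it to x should work:

```r
library(ggplot2)
df <- data.frame(Year = rep(1:5, 2))
m <- ggplot(df, aes(x = Year))  # not aes(Year = Year)
m + geom_bar()
```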



Re: [R] ggplot2 problem

2008-11-26 Thread steve

Yes - that's it.

 thank you

    Steve



[R] Trouble building R 3.5.0 under Ubuntu 18.04

2018-05-22 Thread Steve Gutreuter
I would love to hear from anyone who has successfully built 3.5.0 under
 Ubuntu 18.04 (Bionic Beaver).  My attempts have failed, including:

export LDFLAGS="$LDFLAGS -fPIC"
export CXXFLAGS="$CXXFLAGS -fPIC"
export CFLAGS="$CFLAGS -fPIC"
./configure --enable-R-shlib --prefix=/usr/lib/R/3.5.0

 Configure completes normally without errors or warnings

make

 make fails, always with lines like:
/usr/bin/x86_64-linux-gnu-ld: ../appl/dtrsl.o: relocation R_X86_64_32
against `.rodata' can not be used when making a shared object;
recompile with -fPIC
/usr/bin/x86_64-linux-gnu-ld: attrib.o: relocation R_X86_64_PC32
against symbol `R_NilValue' can not be used when making a shared
object; recompile with -fPIC
/usr/bin/x86_64-linux-gnu-ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
Makefile:177: recipe for target 'libR.so' failed
make[3]: *** [libR.so] Error 1
make[3]: Leaving directory '/home/steve/src/R/R-3.5.0/src/main'
Makefile:135: recipe for target 'R' failed

How does one set the -fPIC flag?

I have never had trouble compiling under Mint, which is based on
Ubuntu.

Thanks!
Steve
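One commonly suggested approach, untested here, is to hand the flags to
configure itself (or set them in config.site) rather than exporting them
beforehand. Note that the failing object dtrsl.o comes from Fortran sources,
so the Fortran flags matter as well:

```sh
./configure --enable-R-shlib --prefix=/usr/lib/R/3.5.0 \
    CFLAGS="-g -O2 -fPIC" \
    CXXFLAGS="-g -O2 -fPIC" \
    FFLAGS="-g -O2 -fPIC" \
    FCFLAGS="-g -O2 -fPIC"
make clean && make
```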



Re: [R] boot.stepAIC fails with computed formula

2017-08-22 Thread Steve O'Hagan

The error is "the model fit failed in 50 bootstrap samples
Error: non-character argument"

Cheers,
SOH.

On 22/08/2017 17:52, Bert Gunter wrote:

Failed?  What was the error message?

Cheers,

Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan
 wrote:

I'm trying to use boot.stepAIC for feature selection. I need to be able to
specify the name of the dependent variable programmatically, but this appears
to fail:

In R-Studio with MS R Open 3.4:

library(bootStepAIC)

#Fake data
n<-200

x1 <- runif(n, -3, 3)
x2 <- runif(n, -3, 3)
x3 <- runif(n, -3, 3)
x4 <- runif(n, -3, 3)
x5 <- runif(n, -3, 3)
x6 <- runif(n, -3, 3)
x7 <- runif(n, -3, 3)
x8 <- runif(n, -3, 3)
y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)

dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
#the real data won't have these names...

cn <- names(dat)
trg <- "y1"
xvars <- cn[cn!=trg]

frm1<-as.formula(paste(trg,"~1"))
frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))

strt=lm(y1~1,dat) # boot.stepAIC Works fine

#strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##

#strt=lm(frm1,dat) ## boot.stepAIC FAILS ##

limit<-5


stp=stepAIC(strt,direction='forward',steps=limit,
 scope=list(lower=frm1,upper=frm2))

bst <- boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
 scope=list(lower=frm1,upper=frm2))

b1 <- bst$Covariates
ball <- data.frame(b1)
names(ball)=unlist(trg)

Any ideas?

Cheers,
SOH
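One guess at a workaround, offered with no certainty: boot.stepAIC re-fits
the model from its stored call, and when that call contains only the symbol
frm1 the refit inside the bootstrap can go wrong. Embedding the actual
formula object in the call may help:

```r
strt <- lm(frm1, data = dat)
strt$call$formula <- frm1   # replace the symbol 'frm1' with the formula itself

bst <- boot.stepAIC(strt, dat, B = 50, alpha = 0.05, direction = 'forward',
                    steps = limit, scope = list(lower = frm1, upper = frm2))
```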






[R] Trouble getting rms::survplot(..., n.risk=TRUE) to behave properly

2016-06-02 Thread Steve Lianoglou
Hello folks,

I'm trying to plot the number of patients at-risk by setting the
`n.risk` parameter to `TRUE` in the rms::survplot function, however it
looks as if the numbers presented in the rows for each category are
just summing up the total number of patients at risk in all groups for
each timepoint -- which is to say that the numbers are equal in each
category down the rows, and they don't seem to be the numbers specific
to each group.

You can reproduce the observed behavior by simply running the code in
the Examples section of ?survplot, which I'll paste below for
convenience.

Is the error between the chair and the keyboard, here, or is this perhaps a bug?

=== code ===
library(rms)
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, rep=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')
S <- Surv(dt,e)

f <- cph(S ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
survplot(f, sex, n.risk=TRUE)
===

I'm using the latest version of rms (4.5-0) running on R 3.3.0-patched.

=== Output of sessionInfo() ===
R version 3.3.0 Patched (2016-05-26 r70671)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.4 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] rms_4.5-0       SparseM_1.7     Hmisc_3.17-4    ggplot2_2.1.0
[5] Formula_1.2-1   survival_2.39-4 lattice_0.20-33

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.5         cluster_2.0.4       MASS_7.3-45
 [4] splines_3.3.0       munsell_0.4.3       colorspace_1.2-6
 [7] multcomp_1.4-5      plyr_1.8.3          nnet_7.3-12
[10] grid_3.3.0          data.table_1.9.6    gtable_0.2.0
[13] nlme_3.1-128        quantreg_5.24       TH.data_1.0-7
[16] latticeExtra_0.6-28 MatrixModels_0.4-1  polspline_1.1.12
[19] Matrix_1.2-6        gridExtra_2.2.1     RColorBrewer_1.1-2
[22] codetools_0.2-14    acepack_1.3-3.3     rpart_4.1-10
[25] sandwich_2.3-4      scales_0.4.0        mvtnorm_1.0-5
[28] foreign_0.8-66      chron_2.3-47        zoo_1.7-13
===


Thanks,
-steve


-- 
Steve Lianoglou
Computational Biologist
Genentech



Re: [R] Trouble getting rms::survplot(..., n.risk=TRUE) to behave properly

2016-06-02 Thread Steve Lianoglou
Ah!

Sorry ... should have dug deeper into the examples section to notice that.

Thank you for the quick reply,
-steve
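Spelling out the fix Frank points to below: per-group at-risk counts require
the grouping variable to enter the model as a stratification term, presumably
like so:

```r
f <- cph(S ~ rcs(age, 4) + strat(sex), x = TRUE, y = TRUE, surv = TRUE)
survplot(f, sex, n.risk = TRUE)  # now one at-risk row per level of sex
```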


On Thu, Jun 2, 2016 at 8:59 AM, Frank Harrell  wrote:
> This happens when you have no strat() variables in the model.
>
>
> --
> Frank E Harrell Jr  Professor and Chairman  School of Medicine
>
> Department of *Biostatistics*  *Vanderbilt University*
>
> On Thu, Jun 2, 2016 at 10:55 AM, Steve Lianoglou 
> wrote:
>
>> Hello folks,
>>
>> I'm trying to plot the number of patients at-risk by setting the
>> `n.risk` parameter to `TRUE` in the rms::survplot function, however it
>> looks as if the numbers presented in the rows for each category are
>> just summing up the total number of patients at risk in all groups for
>> each timepoint -- which is to say that the numbers are equal in each
>> category down the rows, and they don't seem to be the numbers specific
>> to each group.
>>
>> You can reproduce the observed behavior by simply running the code in
>> the Examples section of ?survplot, which I'll paste below for
>> convenience.
>>
>> Is the error between the chair and the keyboard, here, or is this perhaps
>> a bug?
>>
>> === code ===
>> library(rms)
>> n <- 1000
>> set.seed(731)
>> age <- 50 + 12*rnorm(n)
>> label(age) <- "Age"
>> sex <- factor(sample(c('Male','Female'), n, rep=TRUE, prob=c(.6, .4)))
>> cens <- 15*runif(n)
>> h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
>> dt <- -log(runif(n))/h
>> label(dt) <- 'Follow-up Time'
>> e <- ifelse(dt <= cens,1,0)
>> dt <- pmin(dt, cens)
>> units(dt) <- "Year"
>> dd <- datadist(age, sex)
>> options(datadist='dd')
>> S <- Surv(dt,e)
>>
>> f <- cph(S ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
>> survplot(f, sex, n.risk=TRUE)
>> ===
>>
>> I'm using the latest version of rms (4.5-0) running on R 3.3.0-patched.
>>
>> === Output of sessionInfo() ===
>> R version 3.3.0 Patched (2016-05-26 r70671)
>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
>> Running under: OS X 10.11.4 (El Capitan)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>> other attached packages:
>> [1] rms_4.5-0       SparseM_1.7     Hmisc_3.17-4    ggplot2_2.1.0
>> [5] Formula_1.2-1   survival_2.39-4 lattice_0.20-33
>>
>> loaded via a namespace (and not attached):
>>  [1] Rcpp_0.12.5         cluster_2.0.4       MASS_7.3-45
>>  [4] splines_3.3.0       munsell_0.4.3       colorspace_1.2-6
>>  [7] multcomp_1.4-5      plyr_1.8.3          nnet_7.3-12
>> [10] grid_3.3.0          data.table_1.9.6    gtable_0.2.0
>> [13] nlme_3.1-128        quantreg_5.24       TH.data_1.0-7
>> [16] latticeExtra_0.6-28 MatrixModels_0.4-1  polspline_1.1.12
>> [19] Matrix_1.2-6        gridExtra_2.2.1     RColorBrewer_1.1-2
>> [22] codetools_0.2-14    acepack_1.3-3.3     rpart_4.1-10
>> [25] sandwich_2.3-4      scales_0.4.0        mvtnorm_1.0-5
>> [28] foreign_0.8-66      chron_2.3-47        zoo_1.7-13
>> ===
>>
>>
>> Thanks,
>> -steve
>>
>>
>> --
>> Steve Lianoglou
>> Computational Biologist
>> Genentech
>>
>



-- 
Steve Lianoglou
Computational Biologist
Genentech



[R] Apply a multi-variable function to a vector

2016-09-09 Thread Steve Kennedy
Hello,

I would like to define an arbitrary function of an arbitrary number of 
variables, for example, for 2 variables:

func2 <- function(time, temp) time + temp

I'd like to keep variable names that have a meaning in the problem (time and 
temperature above).

If I have a vector of values for these variables, for example in the 2-d case, 
c(10, 121), I'd like to apply my function (in this case func2) and obtain the 
result. Conceptually, something like,

func2(c(10,121))

becomes

func2(10,121)

Is there a simple way to accomplish this, for an arbitrary number of variables? 
 I'd like something that would simply work from the definition of the function. 
 If that is possible.
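For the record, do.call() does exactly this: it applies a function to an
argument list, so converting the vector to a list handles any number of
variables:

```r
func2 <- function(time, temp) time + temp

do.call(func2, as.list(c(10, 121)))   # identical to func2(10, 121): 131

# Named elements are matched to the parameter names:
do.call(func2, as.list(c(temp = 121, time = 10)))
```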

Thanks,

Steve Kennedy




Re: [R] cbind question, please

2015-04-24 Thread Steve Taylor
This works for me...

get0 = function(x) get(x,pos=1)
sapply(big.char, get0)

The extra step seems necessary because without it, get() gets base::cat() 
instead of cat.

cheers,
Steve
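Applied to the toy data from the original post, the suggestion gives the
desired matrix:

```r
dog  <- 1:3
cat  <- 2:4
tree <- 5:7
big.char <- c("dog", "cat", "tree")

get0 <- function(x) get(x, pos = 1)  # note: masks base::get0 in R >= 3.2
sapply(big.char, get0)
#      dog cat tree
# [1,]   1   2    5
# [2,]   2   3    6
# [3,]   3   4    7
```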

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Erin Hodgess
Sent: Friday, 24 April 2015 10:41a
To: R help
Subject: [R] cbind question, please

Hello!

I have a cbind type question, please:  Suppose I have the following:

dog <- 1:3
cat <- 2:4
tree <- 5:7

and a character vector
big.char <- c("dog","cat","tree")

I want to end up with a matrix that is a "cbind" of dog, cat, and tree.
This is a toy example.  There will be a bunch of variables.

I experimented with "do.call", but all I got was
1
2
3

Any suggestions would be much appreciated.  I still think that do.call
might be the key, but I'm not sure.

R Version 3-1.3, Windows 7.

Thanks,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Mathematics and Statistics
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com




Re: [R] Plotting Confidence Intervals

2015-05-03 Thread Steve Taylor
Have you tried:

library(effects)
plot(allEffects(ines),ylim=c(460,550))


-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Andre Roldao
Sent: Saturday, 2 May 2015 2:50p
To: r-help@r-project.org
Subject: [R] Plotting Confidence Intervals

Hi Guys,

It's the first time I've used R-help and I hope you can help me.

How can I plot confidence intervals with the data below?

#Package Austria
library(car)
#head(States)
States1=data.frame(States)

ines=lm(SATM ~ log2(pop) + SATV , data=States1)
summary(ines)

NJ=as.data.frame(States["NJ",c(4,2,3)]) # Select state NJ


p_conf<- predict(ines,interval="confidence",NJ,level=0.95)
p_conf # 95% confidence interval for state NJ
round(p_conf, digits=3)

p_conf1<- predict(ines,interval="confidence",NJ,level=0.99)
p_conf1 # 99% confidence interval for state NJ
round(p_conf1, digits=3)

p_pred2<- predict(ines,interval="prediction",NJ,level=0.95)
p_pred2 # 95% prediction interval for state NJ
round(p_pred2,digits=3)

p_pred3<- predict(ines,interval="prediction",NJ,level=0.99)
p_pred3 # 99% prediction interval for state NJ
round(p_pred3,digits=3)

Thanks


Re: [R] Call to a function

2015-06-23 Thread Steve Taylor
Note that objects can have more than one class, in which case your == and %in% 
might not work as expected.  

Better to use inherits().

cheers,
Steve
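A quick illustration of the difference (the class name "grouped" is made up):

```r
x <- data.frame(a = 1)
class(x) <- c("grouped", "data.frame")  # an object with two classes

class(x) == "data.frame"   # c(FALSE, TRUE) -- length-2 logical, easy to misuse
inherits(x, "data.frame")  # TRUE
```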

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Steven Yen
Sent: Wednesday, 24 June 2015 11:37a
To: boB Rudis
Cc: r-help mailing list
Subject: Re: [R] Call to a function

Thanks! From this I learn the much needed class statement

 if (class(wt)=="character") wt <- x[, wt]

which serves my need in a bigger project.

Steven Yen

On 6/23/2015 6:20 PM, boB Rudis wrote:
> You can do something like:
>
> aaa <- function(data, w=w) {
>if (class(w) %in% c("integer", "numeric", "double")) {
>  out <- mean(w)
>} else {
>  out <- mean(data[, w])
>}
>return(out)
> }
>
> (there are some typos in your function you may want to double check, too)
>
> On Tue, Jun 23, 2015 at 5:39 PM, Steven Yen  wrote:
>> mydata<-data.frame(matrix(1:20,ncol=2))
>> colnames(mydata) <-c("v1","v2")
>> summary(mydata)
>>
>> aaa<-function(data,w=w){
>>if(is.vector(w)){
>>  out<-mean(w)
>>} else {
>>  out<-mean(data[wt])
>>}
>> return(out)
>> }
>>
>> aaa(mydata,mydata$v1)
>> aaa(mydata,"v1")  # want this call to work
>

-- 
Steven Yen
My e-mail alert:
https://youtu.be/9UwEAruhyhY?list=PLpwR3gb9OGHP1BzgVuO9iIDdogVOijCtO




Re: [R] Subset() within function: logical error

2015-06-29 Thread Steve Taylor
Using return() within a for loop makes no sense: only the first one will be 
returned.

How about:
alldf.B = subset(alldf, stream=='B')  # etc...

Also, have a look at unique(alldf$stream) or levels(alldf$stream) if you want 
to use a for loop on each unique value.

cheers,
Steve
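An alternative worth mentioning: split() produces every per-stream subset in
one call, which removes the loop entirely (column names taken from the posted
data):

```r
extstream <- function(alldf) {
  split(alldf[, c("sampdate", "param", "quant")], alldf$stream)
}

bystream <- extstream(testset)
bystream$B   # the rows for stream "B"; likewise bystream$J, bystream$S
```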

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rich Shepard
Sent: Tuesday, 30 June 2015 12:04p
To: r-help@r-project.org
Subject: [R] Subset() within function: logical error

   I'm moving from interactive use of R to scripts and functions and have
bumped into what I believe is a problem with variable names. I did not see a
solution in the two R programming books I have or from my Web searches.
Inexperience with ess-tracebug keeps me from refining my bug tracking.

   Here's a test data set (cleverly called 'testset.dput'):

structure(list(stream = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L), .Label = c("B", "J", "S"), class = "factor"),
 sampdate = structure(c(8121, 8121, 8121, 8155, 8155, 8155,
 8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
 8257, 8257, 8308, 8785, 8785, 8785, 8785, 8785, 8785, 8785,
 8847, 8847, 8847, 8847, 8847, 8847, 8847, 8875, 8875, 8875,
 8875, 8875, 8875, 8875, 8121, 8121, 8121, 8155, 8155, 8155,
 8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
 8257, 8257, 8301, 8301, 8301), class = "Date"), param = structure(c(2L,
 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L,
 6L, 7L, 2L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L,
 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L,
 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L
 ), .Label = c("Ca", "Cl", "K", "Mg", "Na", "SO4", "pH"), class = "factor"),
 quant = c(4, 33, 8.43, 4, 32, 8.46, 4, 31, 8.43, 6, 33, 8.32,
 5, 33, 8.5, 5, 32, 8.5, 5, 59.9, 3.46, 1.48, 29, 7.54, 64.6,
 7.36, 46, 2.95, 1.34, 21.8, 5.76, 48.8, 7.72, 74.2, 5.36,
 2.33, 38.4, 8.27, 141, 7.8, 3, 76, 6.64, 4, 74, 7.46, 2,
 82, 7.58, 5, 106, 7.91, 3, 56, 7.83, 3, 51, 7.6, 6, 149,
 7.73)), .Names = c("stream", "sampdate", "param", "quant"
), row.names = c(NA, -61L), class = "data.frame")

   I want to subset that data.frame on each of the stream names: B, J, and S.
This is the function that has the naming error (eda.R):

extstream = function(alldf) {
 sname = alldf$stream
 sdate = alldf$sampdate
 comp = alldf$param
 value = alldf$quant
 for (i in sname) {
 sname <- subset(alldf, alldf$stream, select = c(sdate, comp, value))
 return(sname)
 }
}

   This is the result of running source('eda.R') followed by

> extstream(testset)
Error in subset.data.frame(alldf, alldf$stream, select = c(sdate, comp,  :
   'subset' must be logical

   I've tried using sname for the rows to select, but that produces a
different error of trying to select undefined columns.

   A pointer to the correct syntax for subset() is needed.

Rich




[R] valid LRT between MASS::polr and nnet::multinom

2015-07-07 Thread Steve Taylor
Dear R-helpers,

Does anyone know if the likelihoods calculated by these two packages are 
comparable in this way?  

That is, is this a valid likelihood ratio test?

# Reproducable example:
library(MASS)
library(nnet)
data(housing)
polr1 = MASS::polr(Sat ~ Infl + Type + Cont, weights=Freq, data=housing)
mnom1 = nnet::multinom(Sat ~ Infl + Type + Cont, weights=Freq, data=housing)
pll = logLik(polr1)
mll = logLik(mnom1)
res = data.frame(
  model = c('Proportional odds','Multinomial'),
  Function = c('MASS::polr','nnet::multinom'),
  nobs = c(attr(pll, 'nobs'), attr(mll, 'nobs')),
  df = c(attr(pll, 'df'), attr(mll, 'df')),
  logLik = c(pll,mll),
  deviance = c(deviance(polr1), deviance(mnom1)),
  AIC = c(AIC(polr1), AIC(mnom1)),
  stringsAsFactors = FALSE
)
res[3,1:2] = c("Difference","")
res[3,3:7] = apply(res[,3:7],2,diff)[1,]
print(res)
mytest = structure(
  list(
statistic = setNames(res$logLik[3], "X-squared"),
parameter = setNames(res$df[3],"df"),
p.value = pchisq(res$logLik[3], res$df[3], lower.tail = FALSE),
method = "Likelihood ratio test",
data.name = "housing"
  ),
  class='htest'
)
print(mytest)

# If you want to see the fitted results:
library(effects)
plot(allEffects(polr1), layout=c(3,1), ylim=0:1)
plot(allEffects(mnom1), layout=c(3,1), ylim=0:1)

many thanks,
Steve



[R] modifying a package installed via GitHub

2015-07-17 Thread Steve E.
Hi Folks,

I am working with a package installed via GitHub that I would like to
modify. However, I am not sure how I would go about loading a 'local'
version of the package after I have modified it, and whether that process
would include uninstalling the original unmodified package (and,
conversely, how to uninstall my local, modified version if I wanted to go
back to the unmodified version available on GitHub).

Any advice would be appreciated.


Thanks,
Steve



--
View this message in context: 
http://r.789695.n4.nabble.com/modifying-a-package-installed-via-GitHub-tp4710016.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Opposite color in R

2015-07-27 Thread Steve Taylor
I wonder if the hcl colour space is useful?  Varying hue while keeping chroma 
and luminosity constant should give varying colours of perceptually the same 
"colourness" and brightness.

?hcl
pie(rep(1,12),col=hcl((1:12)*30,c=70),border=NA)
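Alternatively, here is a rough sketch of a complement helper (`opposite_hue` is my own invention, not from colortools): rotate a colour's hue by half a turn in HSV space. Note this produces RGB/HSV complements (blue pairs with yellow-cyan), not the painter's red-yellow-blue wheel the original post asks about.

```r
# Rotate hue by 180 degrees in HSV space (RGB complement, not the RYB wheel)
opposite_hue <- function(col) {
  h <- rgb2hsv(col2rgb(col))
  hsv((h["h", 1] + 0.5) %% 1, h["s", 1], h["v", 1])
}
opposite_hue("red")  # "#00FFFF" (cyan)
```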


-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Atte Tenkanen
Sent: Sunday, 26 July 2015 7:50a
To: r-help@r-project.org
Subject: [R] Opposite color in R

Hi,

I have tried to find a way to find opposite or complementary colors in R.

I would like to form a color circle with R like this one: 
http://nobetty.net/dandls/colorwheel/complementary_colors.jpg

If you just make a basic color wheel in R, the colors do not form a
complementary color circle:

palette(rainbow(24))
Colors=palette()
pie(rep(1, 24), col = Colors)

There is a package "colortools" where you can find the function opposite(),
but it doesn't work as claimed. I tried

library(colortools)
opposite("violet") and got green instead of yellow and

opposite("blue") and got yellow instead of orange.

Do you know any solutions?

Atte Tenkanen

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] apply with multiple references and database interactivity

2015-08-15 Thread Steve E.
Hi R Colleagues,

I have a small R script that relies on two for-loops to pull data from a
database, make some edits to the data returned from the query, then inserts
the updated data back into the database. The script works just fine, no
problems, except that I am striving to get away from loops, and to focus on
the apply family of tools. In this case, though, I did not know quite where
to start with apply. I wonder if someone more adept with apply would not
mind taking a look at this, and suggesting some tips as to how this could
have been accomplished with apply instead of nested loops. More details on
what the script is accomplishing are included below.

Thanks in advance for your help and consideration.


Steve

Here, I have a df that includes a list of keywords that need to be edited,
and the corresponding edit. The script goes through a database of people,
identifies whether any of the keywords associated with each person are in
the list of keywords to edit, and, if so, pulls in the list of keywords and
the person details, swaps the new keyword for the old keyword, then inserts
the updated keywords back into the database for that person (many keywords
are associated with each person, and they are in an array, hence the
somewhat complicated procedure). The if-statement provides a list of
keywords in the df that were not found in the database, and 'm' is just a
counter to help me know how many keywords the script changed.

for(i in 1:nrow(keywords)) {
  pull <- dbGetQuery(conn = con, statement = paste0("SELECT person_id,
expertise FROM people WHERE expertise RLIKE '; ", keywords[i, 2], ";'"))
  pull$expertise <- gsub(keywords[i, 2], keywords[i, 3], pull$expertise)
  if (nrow(pull)==0) {
sink('~/Desktop/r1', append = TRUE)
print(keywords[i, ]$keyword)
sink() } else
{
for (j in 1:nrow(pull)) {
dbSendQuery(conn = con, statement = paste0("UPDATE people SET expertise
= '", pull[j, ]$expertise, "' WHERE person_id = ", pull[j, ]$person_id)) }
  m=m+1
} }
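One way to express the outer loop with lapply is sketched below; since the real database is not available here, `mock_query` and the two-column keyword table are invented stand-ins for `dbGetQuery` and the real data, shown only to illustrate the pattern:

```r
# lapply pattern with a mock in place of dbGetQuery (illustration only)
keywords <- data.frame(old = c("foo", "bar"), new = c("FOO", "BAR"),
                       stringsAsFactors = FALSE)
mock_query <- function(kw) {
  data.frame(expertise = paste0("; ", kw, ";"), stringsAsFactors = FALSE)
}
updated <- lapply(seq_len(nrow(keywords)), function(i) {
  pull <- mock_query(keywords$old[i])
  gsub(keywords$old[i], keywords$new[i], pull$expertise, fixed = TRUE)
})
unlist(updated)  # "; FOO;" "; BAR;"
```

The database writes are side effects, so even with lapply the `dbSendQuery` step remains a per-row operation; the gain is mostly in readability, not speed.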




--
View this message in context: 
http://r.789695.n4.nabble.com/apply-with-multiple-references-and-database-interactivity-tp4711148.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the less-than-minus gotcha

2015-02-01 Thread Steve Taylor
All the more reason to use = instead of <-


-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ben Bolker
Sent: Monday, 2 February 2015 2:07p
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] the less-than-minus gotcha

Mike Miller  gmail.com> writes:

> 
> I've got to remember to use more spaces.  Here's the basic problem:
> 
> These are the same:
> 
> v< 1
> v<1
> 
> But these are extremely different:
> 
> v< -1
> v<-1
> 

This is indeed documented, in passing, in one of the pages you listed:

http://tim-smith.us/arrgh/syntax.html

Whitespace is meaningless, unless it isn't. Some parsing ambiguities 
are resolved by considering whitespace around operators. See and
despair: x<-y (assignment) is parsed differently than x < -y (comparison)!
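The gotcha can be demonstrated directly (the parentheses around the comparison keep it unambiguous):

```r
v <- 5
cmp <- (v < -1)  # comparison: is 5 less than -1? FALSE
v<-1             # assignment: v is now 1
c(cmp, v)
```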

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] the less-than-minus gotcha

2015-02-02 Thread Steve Taylor
Responding to several messages in this thread...

> > All the more reason to use = instead of <-
> Definitely not!

Martin and Rolf are right, it's not a reason for that; I wrote that quickly 
without thinking it through.  An "=" user might be more likely to fall for the 
gotcha, if not spacing their code nicely.  So the lesson learned from the 
gotcha is that it's good to space your code nicely, as others have said, not 
which assignment symbol to use.

However, I continue to use "=" for assignment on a daily basis without any 
problems, as I have done for many years.  I remain unconvinced by any and all 
of these arguments against it in favour of "<-".  People telling me that I 
"should" use the arrow need better arguments than what I've seen so far.

I find "<-" ugly and "->" useless/pointless, whereas "=" is simpler and also 
nicely familiar from my experience in other languages.  It doesn't matter to me 
that "=" is not commutative because I don't need it to be.

> Further it can be nicely marked up by a real "left arrow" 
> by e.g. the listings LaTeX 'listings' package...

Now that's just silly, turning R code into graphical characters that are not 
part of the R language.

>  foo(x = y) and foo(x <- y)

I'm well aware of this distinction and it never causes me any problems.  The 
latter is an example of bad (obfuscated) coding, IMHO; it should be done in two 
lines for clarity as follows:

x = y
foo(x)

> Using = has it's problems too.
Same goes for apostrophes.

Shall we discuss putting "else" at the start of a line next?

cheers,
 Steve

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the less-than-minus gotcha

2015-02-02 Thread Steve Taylor
I disagree.  Assignments in my code are all lines that look like this:

variable = expression

They are easy to find and easy to read.

-Original Message-
From: Ista Zahn [mailto:istaz...@gmail.com] 
Sent: Tuesday, 3 February 2015 3:36p
To: Steve Taylor
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] the less-than-minus gotcha

On Mon, Feb 2, 2015 at 8:57 PM, Steve Taylor  wrote:
Fair enough, but you skipped right past the most important one: it
makes code easier to read. It's very nice to be able to visually scan
through the code and easily see where assignment happens.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the less-than-minus gotcha

2015-02-02 Thread Steve Taylor
Nobody would write x=x or indeed x<-x; both are silly.  If I found myself 
writing f(x=x) I might smirk at the coincidence, but it wouldn't bother me.  I 
certainly wouldn't confuse it with assigning x to itself.

By the way, here's another assignment operator we can use:

`:=` = `<-`  # this is going in my .Rprofile
x := 1


-Original Message-
From: Jeff Newmiller [mailto:jdnew...@dcn.davis.ca.us] 
Sent: Tuesday, 3 February 2015 3:54p
To: Steve Taylor; r-h...@stat.math.ethz.ch
Subject: Re: [R] the less-than-minus gotcha

I did not start out liking <-, but I am quite attached to it now, and even Rcpp 
feels weird to me now. This may seem like yet another variation on a theme that 
you don't find compelling, but I find that

f(x=x)

makes sense when scope is considered, but

x=x

on its own is silly. That is why I prefer to reserve = for assigning 
parameters... I use it to clarify that I am crossing scope boundaries, while <- 
never does. (<<- is a dangerous animal, though... to be used only locally in 
nested function definitions).

In my view, this is similar to preferring == from C-derived syntaxes over the 
overloaded = from, say, Basic. I am sure you can get by with the syntactic 
overloading, but if you have the option of reducing ambiguity, why not use it?

---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On February 2, 2015 5:57:05 PM PST, Steve Taylor  wrote:
>Responding to several messages in this thread...
>
>> > All the more reason to use = instead of <-
>> Definitely not!
>
>Martin and Rolf are right, it's not a reason for that; I wrote that
>quickly without thinking it through.  An "=" user might be more likely
>to fall for the gotcha, if not spacing their code nicely.  So the
>lesson learned from the gotcha is that it's good to space your code
>nicely, as others have said, not which assignment symbol to use.
>
>However, I continue to use "=" for assignment on a daily basis without
>any problems, as I have done for many years.  I remain unconvinced by
>any and all of these arguments against it in favour of "<-".  People
>telling me that I "should" use the arrow need better arguments than
>what I've seen so far.
>
>I find "<-" ugly and "->" useless/pointless, whereas "=" is simpler and
>also nicely familiar from my experience in other languages.  It doesn't
>matter to me that "=" is not commutative because I don't need it to be.
>
>> Further it can be nicely marked up by a real "left arrow" 
>> by e.g. the listings LaTeX 'listings' package...
>
>Now that's just silly, turning R code into graphical characters that
>are not part of the R language.
>
>>  foo(x = y) and foo(x <- y)
>
>I'm well aware of this distinction and it never causes me any problems.
>The latter is an example of bad (obfuscated) coding, IMHO; it should be
>done in two lines for clarity as follows:
>
>x = y
>foo(x)
>
>> Using = has it's problems too.
>Same goes for apostrophes.
>
>Shall we discuss putting "else" at the start of line next?
>
>cheers,
> Steve
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fastest way to calculate quantile in large data.table

2015-02-05 Thread Steve Lianoglou
Not sure if there is a question in here somewhere?

But if I can point out an observation: if you are doing summary
calculations across the rows like this, my guess is that using a
data.table (data.frame) structure for that will really bite you,
because this operation on a data.table/data.frame is expensive;

  x <- dt[i,]

However it's much faster with a matrix. It doesn't seem like you're
doing anything with this dataset that takes advantage of data.table's
quick grouping/indexing mojo, so why store it in a data.table at all?

Witness:

R> library(data.table)
R> m <- matrix(rnorm(1e6), nrow=10)
R> d <- as.data.table(m)
R> idxs <- sample(1:nrow(m), 500, replace=TRUE)

R> system.time(for (i in idxs) x <- m[i,])
   user  system elapsed
  0.497   0.169   0.670

R> system.time(for (i in idxs) x <- d[i,])
## I killed it after waiting for 14 seconds
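For the row-wise quantile calculation itself, a base-R sketch on the matrix (no data.table needed at all):

```r
# Row-wise quantiles straight from the matrix with apply
m <- matrix(rnorm(1e4), nrow = 100)
q <- t(apply(m, 1, quantile, probs = c(0.1, 0.5, 0.9)))
dim(q)  # 100 rows, one column per requested quantile
```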

-steve

On Thu, Feb 5, 2015 at 11:48 AM, Camilo Mora  wrote:
> In total I found 8 different way to calculate quantile in very a large 
> data.table. I share below their performances for future reference. Tests 1, 7 
> and 8 were the fastest I found.
>
> Best,
>
> Camilo
>
> library(data.table)
> v <- data.table(x=runif(1),x2 = runif(1),  
> x3=runif(1),x4=runif(1))
>
> #fastest
> Sys.time()->StartTEST1
> t(v[, apply(v,1,quantile,probs =c(.1,.9,.5),na.rm=TRUE)] )
> Sys.time()->EndTEST1
>
> Sys.time()->StartTEST2
> v[, quantile(.SD,probs =c(.1,.9,.5)), by = 1:nrow(v)]
> Sys.time()->EndTEST2
>
> Sys.time()->StartTEST3
> v[, c("L","H","M"):=quantile(.SD,probs =c(.1,.9,.5)), by = 1:nrow(v)]
> Sys.time()->EndTEST3
> v
> v[, c("L","H","M"):=NULL]
>
> v[,Names:=rownames(v)]
> setkey(v,Names)
>
> Sys.time()->StartTEST4
> v[, c("L","H","M"):=quantile(.SD,probs =c(.1,.9,.5)), by = Names]
> Sys.time()->EndTEST4
> v
> v[, c("L","H","M"):=NULL]
>
>
> Sys.time()->StartTEST5
> v[,  as.list(quantile(.SD,c(.1,.90,.5),na.rm=TRUE)), by=Names]
> Sys.time()->EndTEST5
>
>
> Sys.time()->StartTEST6
> v[,  as.list(quantile(.SD,c(.1,.90,.5),na.rm=TRUE)), by=Names,.SDcols=1:4]
> Sys.time()->EndTEST6
>
>
> Sys.time()->StartTEST7
> v[,  as.list(quantile(c(x ,   x2,x3,x4 
> ),c(.1,.90,.5),na.rm=TRUE)), by=Names]
> Sys.time()->EndTEST7
>
>
> # melting the database and doing quantily by summary. This is the second 
> fastest, which is ironic given that the database has to be melted first
> library(reshape2)
> Sys.time()->StartTEST8
> vs<-melt(v)
> vs[,  as.list(quantile(value,c(.1,.90,.5),na.rm=TRUE)), by=Names]
> Sys.time()->EndTEST8
>
>
> EndTEST1-StartTEST1
> EndTEST2-StartTEST2
> EndTEST3-StartTEST3
> EndTEST4-StartTEST4
> EndTEST5-StartTEST5
> EndTEST6-StartTEST6
> EndTEST7-StartTEST7
> EndTEST8-StartTEST8
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Steve Lianoglou
Computational Biologist
Genentech

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scraping HTML using R

2015-02-05 Thread Steve Lianoglou
You want to take a look at rvest:

https://github.com/hadley/rvest

On Thu, Feb 5, 2015 at 2:36 PM, Madhuri Maddipatla
 wrote:
> Dear R experts,
>
> My requirement for web scraping in R goes like this.
>
> *Step 1* - All the medical condition from from A-Z are listed in the link
> below.
>
> http://www.webmd.com/drugs/index-drugs.aspx?show=conditions
>
> Choose the first condition say Acid Reflux(GERD-...)
>
> *Step 2 *- It lands on the this page
>
> http://www.webmd.com/drugs/condition-1999-Acid%20Reflux%20%20GERD-Gastroesophageal%20Reflux%20Disease%20.aspx?diseaseid=1999&diseasename=Acid+Reflux+(GERD-Gastroesophageal+Reflux+Disease)&source=3
>
> with a list of drugs.
>
> Choose the column user reviews of the first drug say "Nexium Oral"
>
> *Step 3*: Now it lands on the webpage
>
> http://www.webmd.com/drugs/drugreview-20536-Nexium+oral.aspx?drugid=20536&drugname=Nexium+oral
>
> with a list of reviews.
> I would like to scrape review information into a tabular format by scraping
> the html.
> For instance, I would like to fetch the full comment of each review as a
> column in a table.
> Also it should automatically go to next page and fetch the full comments of
> all reviewers.
>
>
> Please help me in this endeavor and thanks a lot in advance for reading my
> mail and expecting response with your experience and expertise.
>
> Also please suggest me the possibility around my stepwise plan and any
> advice you would like to give me along with the solution.
>
> High Regards,
> *-*
> *Madhuri Maddipatla*
> *-*
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Steve Lianoglou
Computational Biologist
Genentech

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using dates in R

2015-03-04 Thread Steve Taylor
> today <- as.Date("2015-03-04") # default format

Better is:

today <- Sys.Date() 
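Putting Bill's conversion together with `Sys.Date()`, a sketch of the indicator column the original post asks for (0 if the date is earlier than today, 1 otherwise):

```r
# Indicator: 0 for dates before today, 1 otherwise
d <- as.Date(c("12/2/79", "4/15/15"), format = "%m/%d/%y")
indicator <- as.integer(d >= Sys.Date())
indicator
```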

S

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
Sent: Thursday, 5 March 2015 7:47a
To: Brian Hamel
Cc: r-help@r-project.org
Subject: Re: [R] Using dates in R

You will need to convert strings like "2/15/15" into one of the time/date
classes available in R and then it is easy to do comparisons.  E.g., if you
have no interest in the time of day you can use the Date class:

> d <- as.Date(c("12/2/79", "4/15/15"), format="%m/%d/%y")
> today <- as.Date("2015-03-04") # default format
> d
[1] "1979-12-02" "2015-04-15"
> today
[1] "2015-03-04"
> d < today
[1]  TRUE FALSE

The lubridate package contains a bunch of handy functions for manipulating
dates and times.


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Mar 4, 2015 at 6:54 AM, Brian Hamel 
wrote:

> Hi all,
>
> I have a dataset that includes a "date" variable. Each observation includes
> a date in the form of 2/15/15, for example. I'm looking to create a new
> indicator variable that is based on the date variable. So, for example, if
> the date is earlier than today, I would need a "0" in the new column, and a
> "1" otherwise. Note that my dataset includes dates from 1979-2012, so it is
> not one-year (this means I can't easily create a new variable 1-365).
>
> How does R handle dates? My hunch is "not well," but perhaps there is a
> package that can help me with this. Let me know if you have any
> recommendations as to how this can be done relatively easily.
>
> Thanks! Appreciate it.
>
> Best,
> Brian
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Date extract Year

2015-03-08 Thread Steve Archambault
Hi all,

I am trying in vain to create a new object "Year" in my data frame from
existing Date data. I have tried many different approaches, but can't seem
to get it to work. Here is an example of some code I tried.

date1<- as.Date(wells$Date,"%m/%d/%Y")
wells$year<-as.numeric(format(date1, "%Y"))

I am starting with data that looks like this.

ID  Date DepthtoWater_bgs test test2
1  BC-0004 41163   260.60    3 1
2  BC-0004 41255   261.65    4 2
3  BC-0003 41345   166.58    5 3
4  BC-0002 41351   317.85    6 4
5  BC-0004 41355   262.15    7 5
6  BC-0003 41438   167.55    8 6
7  BC-0004 41438   265.45    9 7
8  BC-0002 41443   317.25   10 8
9  BC-0002 41521   321.25   11 9
10 BC-0003 41522   168.65   12    10
11 BC-0004 41522   266.15   13    11
12 BC-0003 41627   168.95   14    12
13 BC-0004 41627   265.25   15    13
14 BC-0002 41634   312.31   16    14
15 BC-0003 41703   169.25   17    15
16 BC-0004 41703   265.05   18    16
17 BC-0002 41710   313.01   19    17
18 BC-0003 41795   168.85   20    18
19 BC-0004 41795   266.95   21    19
20 BC-0002 41801   330.41   22    20
21 BC-0003 41905   169.75   23    21
22 BC-0004 41905   267.75   24    22
23 BC-0002 41906   321.01   25    23

Any help would be greatly appreciated!
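(A possible explanation, offered as a guess only: the Date column above looks like Excel-style serial day numbers, not "%m/%d/%Y" strings, which would explain why the format call yields NAs. If so, conversion via an origin works:)

```r
# Guess: treat Date as an Excel serial number ("1899-12-30" is Excel's
# Windows origin); 41163 then lands in September 2012
date1 <- as.Date(41163, origin = "1899-12-30")
format(date1, "%Y")  # "2012"
```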


-Steve

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extract year from date

2015-03-08 Thread Steve Archambault
Hi all,

I am trying in vain to create a new object "Year" in my data frame from
existing Date data. I have tried many different approaches, but can't seem
to get it to work. Here is an example of some code I tried.

date1<- as.Date(wells$Date,"%m/%d/%Y")
wells$year<-as.numeric(format(date1, "%Y"))

I am starting with data that looks like this.

ID  Date DepthtoWater_bgs test test2
1  BC-0004 41163   260.60    3 1
2  BC-0004 41255   261.65    4 2
3  BC-0003 41345   166.58    5 3
4  BC-0002 41351   317.85    6 4
5  BC-0004 41355   262.15    7 5
6  BC-0003 41438   167.55    8 6
7  BC-0004 41438   265.45    9 7
8  BC-0002 41443   317.25   10 8
9  BC-0002 41521   321.25   11 9
10 BC-0003 41522   168.65   12    10
11 BC-0004 41522   266.15   13    11
12 BC-0003 41627   168.95   14    12
13 BC-0004 41627   265.25   15    13
14 BC-0002 41634   312.31   16    14
15 BC-0003 41703   169.25   17    15
16 BC-0004 41703   265.05   18    16
17 BC-0002 41710   313.01   19    17
18 BC-0003 41795   168.85   20    18
19 BC-0004 41795   266.95   21    19
20 BC-0002 41801   330.41   22    20
21 BC-0003 41905   169.75   23    21
22 BC-0004 41905   267.75   24    22
23 BC-0002 41906   321.01   25    23

Any help would be greatly appreciated!

-Steve
Sent from my iPhone

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] regex find anything which is not a number

2015-03-12 Thread Steve Taylor
How about letting a standard function decide which are numbers:

which(!is.na(suppressWarnings(as.numeric(myvector))))

Also works with numbers in scientific notation and (presumably) different 
decimal characters, e.g. comma if that's what the locale uses.
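Worked on the vector from the original post, inverted to capture the non-numbers. Note that "2." also parses as a number under this definition, so position 6 is not captured, unlike the post's expectation:

```r
myvector <- c("a3", "N.A", "1.2", "-3", "3-2", "2.")
# positions that are NOT parseable as numbers
not_numbers <- which(is.na(suppressWarnings(as.numeric(myvector))))
not_numbers  # 1 2 5
```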


-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Adrian Dușa
Sent: Thursday, 12 March 2015 8:27a
To: r-help@r-project.org
Subject: [R] regex find anything which is not a number

Hi everyone,

I need a regular expression to find those positions in a character
vector which contain something which is not a number (either positive
or negative, having decimals or not).

myvector <- c("a3", "N.A", "1.2", "-3", "3-2", "2.")

In this vector, only positions 3 and 4 are numbers, the rest should be captured.
So far I am able to detect anything which is not a number, excluding - and .

> grep("[^-0-9.]", myvector)
[1] 1 2

I still need to capture positions 5 and 6, which in human language
would mean to detect anything which contains a "-" or a "." anywhere
else except at the beginning of a number.

Thanks very much in advance,
Adrian


-- 
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr.90
050663 Bucharest sector 5
Romania

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] operations on columns when data frames are in a list

2015-04-13 Thread Steve E.
Hello R folks,

I have recently discovered the power of working with multiple data frames in
lists. However, I am having trouble understanding how to perform operations
on individual columns of data frames in the list. For example, I have a
water quality data set (sample data included below) that consists of roughly
a dozen data frames. Some of the data frames have a chr column called
'Month' that I need to to convert to a date with the proper format. I would
like to iterate through all of the data frames in the list and format all of
those that have the 'Month' column. I can accomplish this with a for-loop
(e.g., below) but I cannot figure out how to do this with the plyr or apply
families. This is just one example of the formatting that I have to perform
so I would really like to avoid loops, and I would love to learn how to
better work with lists as well.

I would appreciate greatly any guidance.


Thank you and regards,
Stevan


a for-loop like this works, but is not an ideal solution:

for (i in 1:length(data)) {if ("Month" %in% names(data[[i]]))
data[[i]]$Month<- as.POSIXct(data[[i]]$Month, format="%Y/%m/%d")}
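the same conversion expressed with lapply, on a small self-contained stand-in for the list of data frames:

```r
# Two toy data frames; only the first has a 'Month' column
data <- list(one = data.frame(Month = "2001/10/01", x = 1,
                              stringsAsFactors = FALSE),
             two = data.frame(y = 2))
# Convert Month wherever it exists, leave other data frames untouched
data <- lapply(data, function(df) {
  if ("Month" %in% names(df))
    df$Month <- as.POSIXct(df$Month, format = "%Y/%m/%d")
  df
})
class(data$one$Month)  # "POSIXct" "POSIXt"
```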



sample data (head of two data frames from the list of all data frames):

structure(list(`3D_Fluorescence.csv` = structure(list(ID = 1:6, 
Site_Number = c("R5", "R6a", "R8", "R9a", "R14", "R15"), 
Month = c("2001/10/01", "2001/10/01", "2001/10/01", "2001/10/01", 
"2001/10/01", "2001/10/01"), Exc_A = c(215L, 215L, NA, NA, 
215L, 215L), Em_A = c(422.5, 410.5, NA, NA, 408.5, 408), 
Fl_A = c(303, 296.86, NA, NA, 297.62, 174.75), Exc_B = c(325L, 
325L, NA, NA, 325L, 325L), Em_B = c(416, 413, NA, NA, 418.5, 
417.5), Fl_B = c(137.32, 116.1, NA, NA, 132.48, 77.44)), .Names =
c("ID", 
"Site_Number", "Month", "Exc_A", "Em_A", "Fl_A", "Exc_B", "Em_B", 
"Fl_B"), row.names = c(NA, 6L), class = "data.frame"), algae.csv =
structure(list(
ID = 1:6, SiteNumber = c("R1", "R2A", "R2B", "R3", "R4", 
"R5"), SiteLocation = c("CAP canal above Waddell Canal", 
"Lake Pleasant integrated sample", "Lake Pleasant integrated sample", 
"Waddell Canal", "Cap Canal at 7th St.", "Verde River btwn Horseshoe and
Bartlett"
), ClusterName = c("cap", "cap", "cap", "cap", "cap", "verde"
), SiteAcronym = c("cap-siphon", "pleasant-epi", "pleasant-hypo", 
"waddell canal", "cap @ 7th st", "verde abv bartlett"), Date =
c("1999/08/18", 
"1999/08/18", "1999/08/18", "1999/08/18", "1999/08/18", "1999/08/16"
), Month = c("1999/08/01", "1999/08/01", "1999/08/01", "1999/08/01", 
"1999/08/01", "1999/08/01"), SampleType = c("", "", "", "", 
"", ""), Conductance = c(800, 890, 850, 870, 830, 500), ChlA = c(0.3, 
0.3, 0.6, 0.8, 1.1, 7.6), Phaeophytin = c(0, 0, 0, 0, 0.7, 
4.7), PhaeophytinChlA = c(0.7, 0.7, 1.3, 5.3, 0.7, 4.7), 
Chlorophyta = c(0L, 0L, 18L, 0L, 0L, 21L), Cyanophyta = c(8L, 
0L, 0L, 0L, 7L, 79L), Bacillariophyta = c(135L, 76L, 0L, 
18L, 54L, 195L), Total = c(147L, 76L, 18L, 18L, 61L, 302L
), AlgaeComments = c("", "", "", "", "", "")), .Names = c("ID", 
"SiteNumber", "SiteLocation", "ClusterName", "SiteAcronym", "Date", 
"Month", "SampleType", "Conductance", "ChlA", "Phaeophytin", 
"PhaeophytinChlA", "Chlorophyta", "Cyanophyta", "Bacillariophyta", 
"Total", "AlgaeComments"), row.names = c(NA, 6L), class = "data.frame")),
.Names = c("3D_Fluorescence.csv", 
"algae.csv")) 



--
View this message in context: 
http://r.789695.n4.nabble.com/operations-on-columns-when-data-frames-are-in-a-list-tp4705757.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] NA's introduced by coercion

2014-08-26 Thread Steve Lianoglou
Hi,

On Tue, Aug 26, 2014 at 9:56 PM, madhvi.gupta  wrote:
> Hi,
>
> I am applying function as.numeric to a vector having many values as NA and it
> is giving :
> Warning message:
> NAs introduced by coercion
>
> Can anyone help me to know how to remove this warning and sort it out?

Let's say that the vector you are calling `as.numeric` over is called
`x`. If you could show us the output of the following command:

R> head(x[is.na(as.numeric(x))])

You'll see why you are getting the warning.

How you choose to sort it out probably depends on what you are trying
to do with your data after you convert it to a "numeric"

-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] NA's introduced by coercion

2014-08-26 Thread Steve Lianoglou

Hi Madhvi,

First, please use "reply-all" when responding to emails form this list 
so that others can help (and benefit from) the discussion.


Comment down below:

On 26 Aug 2014, at 22:15, madhvi.gupta wrote:


On 08/27/2014 10:42 AM, Steve Lianoglou wrote:

Hi,

On Tue, Aug 26, 2014 at 9:56 PM, madhvi.gupta 
 wrote:

Hi,

I am applying function as.numeric to a vector having many values as 
NA and it

is giving :
Warning message:
NAs introduced by coercion

Can anyone help me to know how to remove this warning and sort it 
out?

Let's say that the vector you are calling `as.numeric` over is called
`x`. If you could show us the output of the following command:

R> head(x[is.na(as.numeric(x))])

You'll see why you are getting the warning.

How you choose to sort it out probably depends on what you are trying
to do with your data after you convert it to a "numeric"

-steve

Hi,
I am having this error because the vector contains NA values, but I want to 
convert that vector to numeric


I don't quite follow what the problem is, then ... what is the end 
result that you want to happen?


When you convert the vector to a numeric, the NA's that were in it 
originally, will remain as NAs (but they will be of a 'numeric' type).


What would you like to do with the NA values? Do you just want to keep 
them, but want to silence the warning?


If so, you can do:

R> suppressWarnings(y <- as.numeric(x))
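For concreteness, a minimal sketch of both points above (the vector `x` here is made up):

```r
# Toy vector mixing numeric strings with one entry that cannot be coerced
x <- c("1.5", "2", "n/a", "3.7", NA)

y <- suppressWarnings(as.numeric(x))  # silences "NAs introduced by coercion"
y                                     # 1.5 2.0 NA 3.7 NA

# Entries that were not NA before coercion but are NA after it
x[!is.na(x) & is.na(y)]               # "n/a"
```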

-steve

--
Steve Lianoglou
Computational Biologist
Genentech



[R] Extract model from deriv3 or nls

2014-09-18 Thread Riley, Steve
Hello!

I am trying to figure out how to extract the model equation when using deriv3 
with nls.

Here is my example:
#
# Generate derivatives
#
Puro.fun2 <- deriv3(expr = ~(Vmax + VmaxT*state) * conc/(K + Kt * state + conc),
name = c("Vmax","VmaxT","K","Kt"),
function.arg = function(conc, state, Vmax, VmaxT, K, Kt) 
NULL)
#
# Fit model using derivative function
#
Puro.fit1 <- nls(rate ~ Puro.fun2(conc, state == "treated", Vmax, VmaxT, K, Kt),
 data = Puromycin,
 start = c(Vmax = 160, VmaxT = 47, K = 0.043, Kt = 0.05))

Normally I would use summary(Puro.fit1)$formula to extract the model but 
because I am implementing deriv3, the following gets returned:

> summary(Puro.fit1)$formula
rate ~ Puro.fun2(conc, state == "treated", Vmax, VmaxT, K, Kt)

What I would like to do is find something that returns:

rate ~ (Vmax + VmaxT*state) * conc/(K + Kt * state + conc)

Is there a way to extract this? Please advise. Thanks for your time.
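One possible workaround, sketched here under the assumption that you can keep the model expression in its own object: build the deriv3() function from that saved expression, and reconstruct the display formula from it on demand.

```r
# Keep the model expression separately (a one-sided formula)
expr <- ~ (Vmax + VmaxT * state) * conc/(K + Kt * state + conc)

# Build the gradient/Hessian function from the saved expression, as before
Puro.fun2 <- deriv3(expr, c("Vmax", "VmaxT", "K", "Kt"),
                    function.arg = function(conc, state, Vmax, VmaxT, K, Kt) NULL)

# expr[[2]] is the right-hand side; paste it back onto the response
form <- as.formula(paste("rate ~", paste(deparse(expr[[2]]), collapse = " ")))
form
# rate ~ (Vmax + VmaxT * state) * conc/(K + Kt * state + conc)
```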

Steve
860-441-3435





[R] Webdings font on pdf device

2014-11-03 Thread Steve Taylor
Dear R-helpers

Has anyone successfully used the Webdings font on a pdf or postscript device?  
I'm tearing my hair out trying to figure out how to make it work.

# It works on a png() device:
windowsFonts(Webdings = windowsFont("Webdings"))
png('Webdings.png', family = 'Webdings')
plot(-3:3,-3:3,type='n',xlab='',ylab='',axes=FALSE)
text (rnorm(26),rnorm(26),LETTERS,cex=2)
graphics.off()

I have tried to set up the Webdings font using the extrafont package but it 
gives warnings.  The output file says it has Webdings in it, but the characters 
do not show.

R> library(extrafont)
Registering fonts with R
R> loadfonts(device = "pdf", quiet=TRUE)
R> pdf('Webdings.pdf', family='Webdings')
Warning messages:
1: In pdf("Webdings.pdf", family = "Webdings") :
  unknown AFM entity encountered
2: In pdf("Webdings.pdf", family = "Webdings") :
  unknown AFM entity encountered
3: In pdf("Webdings.pdf", family = "Webdings") :
  unknown AFM entity encountered
4: In pdf("Webdings.pdf", family = "Webdings") :
  unknown AFM entity encountered
R> plot(-3:3,-3:3,type='n',xlab='',ylab='',axes=FALSE)
R> text (rnorm(26),rnorm(26),LETTERS,cex=2)
There were 27 warnings (use warnings() to see them)
R> graphics.off()
R> warnings()[1]
Warning message:
In text.default(rnorm(26), rnorm(26), LETTERS, cex = 2) :
  font width unknown for character 0x41  


Any assistance would be much appreciated.
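For what it's worth, one pattern that is often needed with extrafont (a sketch, assuming Ghostscript is installed and the fonts were imported once with font_import()) is to embed the font into the PDF after the device is closed:

```r
library(extrafont)  # assumes font_import() has been run once on this machine

pdf("Webdings.pdf", family = "Webdings")
plot(-3:3, -3:3, type = "n", xlab = "", ylab = "", axes = FALSE)
text(rnorm(26), rnorm(26), LETTERS, cex = 2)
dev.off()

# Rewrite the PDF with the font actually embedded (uses Ghostscript)
embed_fonts("Webdings.pdf")
```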

cheers,
Steve



Re: [R] loops in R

2014-11-05 Thread Steve Lianoglou
While you should definitely read the tutorial that Don is referring
to, I'd recommend you take a different approach and use more R
idiomatic code here.

In base R, this could be addressed with few approaches. Look for help
on the following functions:

  * tapply
  * by
  * aggregate

I'd rather recommend you also learn about some of the packages that
are better suited to deal with computing over data.frames,
particularly:

  * dplyr
  * data.table

You can certainly achieve what you want with for loops, but you'll
likely find that going one of these routes will be more rewarding in the
long run.
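To make the base-R options concrete, a small sketch with made-up numbers in the poster's column layout (Population, R):

```r
d <- data.frame(Population = rep(c("Bari1", "Bari2", "Besev"), each = 4),
                R = c(94.5, 98.3, 100.5, 96.7, 117.6, 144.4, 120.1, 99.2,
                      88.0, 91.3, 90.7, 93.5))

tapply(d$R, d$Population, mean)  # one mean per population
tapply(d$R, d$Population, var)   # one variance per population

# Or both statistics in one call:
aggregate(R ~ Population, data = d,
          FUN = function(x) c(mean = mean(x), var = var(x)))
```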

HTH,
-steve


On Wed, Nov 5, 2014 at 10:02 AM, Don McKenzie  wrote:
> Have you read the tutorial that comes with the R distribution?  This is a 
> very basic database calculation that you will
> encounter (or some slight variation of it) over and over.  The solution is a 
> few lines of code, and someone may write it
> out for you, but if no one does
>
> You have 20 populations, so you will have 20 iterations in your for loop. For 
> each one, you will need a unique identifier that points to
> the rows of "R" associated with that population. You'll calculate a mean and 
> variance 20 times, and will need a data object to store
> those calculations.
>
> Look in the tutorial for syntax for identifying subsets of your data frame.
>
>> On Nov 5, 2014, at 5:41 AM, Noha Osman  wrote:
>>
>> Hi Folks
>>
>> I am a new user of R and I have a question. Hopefully someone can help me 
>> with that issue
>>
>>
>> I have that dataset as following
>>
>> Sample  Population  Species  Tissue  R  G  B
>> 1 Bari1_062-1  Bari1 ret   seed  94.52303  80.70346 67.91760
>> 2 Bari1_062-2  Bari1 ret   seed  98.27683  82.68690 68.55485
>> 3 Bari1_062-3  Bari1 ret   seed 100.53170  86.56411 73.27528
>> 4 Bari1_062-4  Bari1 ret   seed  96.65940  84.09197 72.05974
>> 5 Bari1_062-5  Bari1 ret   seed 117.62474  98.49354 84.65656
>> 6 Bari1_063-1  Bari1 ret   seed 144.39547 113.76170 99.95633
>>
>> and I have 20 populations as following
>>
>> [1]  Bari1      Bari2      Bari3      Besev      Cermik     Cudi
>> [7]  Derici     Destek     Egil       Gunasan    Kalkan     Karabace
>> [13] Kayatepe   Kesentas   Ortanca    Oyali      Cultivated Sarikaya
>> [19] Savur      Sirnak
>>
>> I need to calculate the mean and variance of each population from column [R] 
>> using a for-loop
>>
>>
>> Thanks
>>
>>   [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> Don McKenzie
> Research Ecologist
> Pacific Wildland Fire Sciences Lab
> US Forest Service
>
> Affiliate Faculty
> School of Environmental and Forest Sciences
> University of Washington
> d...@uw.edu
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Steve Lianoglou
Computational Biologist
Genentech



[R] "predict" values from object of type "list"

2016-02-17 Thread Steve Ryan
Hi Guys,

I could need some help here. I have a set of 3d points (x,y,v). These
points are not randomly scattered but lie on a surface. This surface can be
taken as a calibration plane for x and y values. My goal is to quantify
this surface and then predict the v-values for given pairs of x- and
y-coordinates.

This code shows how I started to solve this problem. First, I generate
more points between existing points using 3d splines. That way I
"pre-smooth" my data set. After that, I use interp to create even more
points, and I end up with an object called "sp" (class "list"). sp is
visualized using surface3d. The surface looks like I wish it to be.

Now, how can I predict the v-value for an x/y pair of, say, -2, 2?
Can somebody help?
Thanks a lot!
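One possible answer, sketched with toy data (and assuming the akima package, which the code below already loads): interpp() evaluates the interpolated surface at arbitrary (x, y) points rather than on a whole grid.

```r
library(akima)

set.seed(1)
x <- runif(200, -3, 3)
y <- runif(200, -3, 3)
v <- x^2 + y^2                    # toy stand-in for the calibration surface

# Interpolated v at the single point (x, y) = (-2, 2); should be close to 8
p <- interpp(x, y, v, xo = -2, yo = 2, duplicate = "mean")
p$z
```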

library(rgl)
library(akima)

v <- read.table(text="5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9
9 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 11 11 11
11 11 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12
12 12 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 14 14 14 14 14 14 14 14
14 14 14 14 14 14 14", sep=" ")
v <- as.numeric(v)
x <- read.table(text="3.4 3.3 3.4 3.4 3.4 3.4 3.6 3.5 3.5 3.4 3.4 3.4 3.4
3.5 3.5 2.6 2.6 2.6 2.7 2.6 2.7 2.9 2.9 2.8 2.7 2.7 2.7 2.7 2.7 2.8 1.8 1.7
1.7 1.7 1.8 1.9 2.1 2.2 2.0 1.9 1.9 1.9 1.9 1.9 2.0 0.8 0.8 0.8 0.8 0.9 1.1
1.3 1.4 1.2 1.1 1.0 1.0 1.0 1.1 1.1 -0.2 -0.2 -0.2 -0.2 0.0 0.2 0.4 0.6 0.3
0.1 0.1 0.1 0.1 0.1 0.2 -1.2 -1.3 -1.3 -1.3 -1.1 -0.8 -0.5 -0.3 -0.6 -0.9
-0.9 -0.9 -0.9 -1.0 -0.9 -2.4 -2.6 -2.6 -2.5 -2.3 -2.0 -1.1 -1.2 -1.6 -2.0
-2.0 -2.0 -2.1 -2.2 -2.1 -3.9 -4.2 -4.3 -4.2 -3.9 -3.6 -2.5 -2.7 -3.3 -3.7
-3.7 -3.8 -3.8 -4.0 -3.9 -5.8 -6.1 -6.2 -6.1 -5.7 -5.3 -3.9 -4.1 -4.8 -5.3
-5.3 -5.3 -5.4 -5.5 -5.4 -7.5 -7.8 -8.0 -7.8 -7.4 -6.8 -5.1 -5.3 -6.1 -6.6
-6.7 -6.8 -6.9 -6.9 -6.9", sep=" ")
y <- read.table(text="0.5 0.6 0.6 0.7 0.7 0.8 0.8 0.9 0.9 1.0 1.0 1.1 1.1
1.2 1.2 0.5 0.5 0.6 0.7 0.8 0.9 0.9 1.0 1.1 1.1 1.2 1.3 1.4 1.4 1.5 0.4 0.5
0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 0.4 0.5 0.7 0.8 0.9 1.0
1.1 1.2 1.3 1.5 1.6 1.7 1.9 2.0 2.1 0.4 0.5 0.7 0.8 1.0 1.1 1.2 1.3 1.5 1.7
1.8 2.0 2.1 2.3 2.4 0.3 0.5 0.7 0.9 1.0 1.2 1.4 1.5 1.7 1.9 2.1 2.3 2.5 2.7
2.8 0.2 0.4 0.7 0.9 1.1 1.3 1.4 1.6 1.9 2.2 2.4 2.6 2.8 3.1 3.3 0.2 0.4 0.7
1.0 1.3 1.5 1.6 1.8 2.2 2.5 2.7 3.0 3.3 3.6 3.8 0.2 0.5 0.8 1.1 1.4 1.7 1.8
2.0 2.4 2.8 3.1 3.4 3.7 4.1 4.3 0.1 0.4 0.8 1.2 1.5 1.8 1.9 2.2 2.7 3.1 3.5
3.8 4.2 4.5 4.9", sep=" ")
x <- as.numeric(x)
y <- as.numeric(y)
z <- read.table(text="-35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35
-30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5
10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30
-25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10
15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25
-20 -15 -10 -5 0 5 10 15 20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15
20 25 30 35 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35", sep=" ")
z <- as.numeric(z)

df <- data.frame(x,y,z,v) # here df is the original calibration data
plot3d(x,y,v)
all_dat <- c()

for (n in seq(min(z), max(z),5))
{
  blubb <- (which(df$z == n)) # here rows with the same angle are selected
  gleicheWink <- df[(blubb),]
  red_df <- data.frame(t=seq(1,length(gleicheWink[,1]),1), x =
gleicheWink$x, y= gleicheWink$y, v=gleicheWink$v )
  ts <- seq( from = min(red_df$t), max(red_df$t), length=50 )
  d2 <- apply( red_df[,-1], 2, function(u) spline(red_df$t, u, xout = ts
)$y )
  all_dat <- rbind(all_dat, d2)
}

x <- all_dat[,1]
y <- all_dat[,2]
z <- all_dat[,3]

sp <- interp(x,y,z,linear=TRUE, xo=seq(min(x),max(x), length=50),
 yo=seq(min(y),max(y), length=50), duplicate="mean")

open3d(scale=c(1/diff(range(x)), 1/diff(range(y)), 1/diff(range(z))))

zlen=5
cols <- heat.colors(zlen)

with(sp,surface3d(x,y,z, color=cols)) #,alpha=.2))
points3d(x,y,z)

title3d(xlab="x",ylab="y",zlab="v")
axes3d()




[R] Simulate data with binary outcome

2008-07-15 Thread Steve Frost
Dear R-Users,
 I wish to simulate a binary outcome data set with
predictors (in the example below, age, sex and systolic BP). Is there a
way I can set the frequency of the outcome (y) to be say 5% (versus the
0.1% when using the seed below)?

# Example R-code based on Frank Harrell's Design help files

library(Hmisc)
n <- 1000
set.seed(123456)
age <- runif(n, 60, 90)
sbp <- rnorm(n, 120, 15)
sex <- factor(sample(c('female','male'), n,TRUE))

# Specify population model for log odds that CHD = Yes
L  <- 0.4*(sex == 'male') +
  0.045*(age) +
  0.05*(sbp)

# Simulate binary y to have Prob(y = 1) = 1/[1+exp(-L)]

y <- ifelse(runif(n) < plogis(L), 1, 0)
table(y)

ddist <- datadist(sex,age,sbp)
options(datadist = 'ddist')

fit <- lrm(y ~ sex + age + sbp)

summary(fit)
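One way to hit a target prevalence (a sketch, not from the Design documentation): add an intercept b0 to the linear predictor and solve for the value that makes the average event probability 5%.

```r
set.seed(123456)
n   <- 1000
age <- runif(n, 60, 90)
sbp <- rnorm(n, 120, 15)
sex <- factor(sample(c("female", "male"), n, TRUE))

eta <- 0.4 * (sex == "male") + 0.045 * age + 0.05 * sbp

# Choose b0 so that the mean of Prob(y = 1) is (approximately) 0.05
b0 <- uniroot(function(b) mean(plogis(b + eta)) - 0.05, c(-50, 10))$root

y <- rbinom(n, 1, plogis(b0 + eta))
mean(y)  # close to 0.05
```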



Steve Frost MPH
University of Western Sydney
Building 7
Campbelltown Campus
Locked Bag 1797
PENRITH SOUTH DC 1797
Phone 61+ 2 4620 3415
Mobile 0407 291088
Fax 61+ 2 4625 4252




[R] Matching Up Values

2008-07-17 Thread Steve Murray

Dear all,

I have two files, both of similar formats. In column 1 are Latitude values 
(real numbers, e.g. -179.25), column 2 has Longitude values (also real numbers) 
and in one of the files, column 3 has Population Density values (integers); 
there is no column 3 in the other file.

However, the main difference between these two files is that one has fewer rows 
than the other. So what I'm looking to do is, 'pad out' the shorter file, by 
adding in the rows with those that are 'missing' from the longer file (ie. if a 
particular coordinate isn't present in the shorter file but is in the 
'longer/master' file), and having 'zero' as its Population Density value 
(column C).

This should result in the shorter file becoming the same length as the 
initially longer file, and with each file having the same coordinate values 
(latitude and longitude on each line).

How would I do this in R?
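One base-R route is merge() with all.x = TRUE, then zero-filling the rows that had no match; a sketch with made-up column names (Lat, Lon, PopDens):

```r
master <- data.frame(Lat = c(-179.25, -179.25, -178.75),
                     Lon = c(10.25, 10.75, 10.25))
short  <- data.frame(Lat = -179.25, Lon = 10.25, PopDens = 42L)

# Keep every master coordinate; unmatched rows get NA in PopDens
out <- merge(master, short, by = c("Lat", "Lon"), all.x = TRUE)
out$PopDens[is.na(out$PopDens)] <- 0  # pad the missing cells with zero
out                                   # same coordinates as master, 3 rows
```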

Thanks for any help offered,

Steve




Re: [R] Matching Up Values

2008-07-18 Thread Steve Murray

I think the approach is ok. I'm having difficulties though...!

I've managed to get 'merge' working (using the 'all' function as suggested), 
but for some strange reason, the output file produces 12 extra rows! So now the 
shorter file isn't the same length as the 'master' file, it's now longer!

The files are fairly sizeable (~6 rows) so it's difficult to pin-point 
manually where it's lost track.

Is there an obvious solution to this?

I was wondering if the best thing might be to 'de-merge' the now-longer file, 
so that the surplus rows are removed. Is there a command therefore which will 
enable me to compare the now-longer file to the master file, so that any 
coordinate pairs which are present in the longer file but not in the 
(now-shorter) master file are removed?
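A sketch of one way to "de-merge" (column names made up): keep only the rows whose coordinate pair also occurs in the master file.

```r
merged <- data.frame(Lat = c(1, 1, 2), Lon = c(5, 6, 5), PopDens = c(3, 0, 7))
master <- data.frame(Lat = c(1, 2), Lon = c(5, 5))

# Build a single key per coordinate pair and test for membership
keep <- interaction(merged$Lat, merged$Lon) %in%
        interaction(master$Lat, master$Lon)
merged[keep, ]  # the surplus (1, 6) row is dropped
```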

Thanks again,

Steve



Re: [R] Matching Up Values

2008-07-20 Thread Steve Murray

Hmm, I'm having a fair few difficulties using 'merge' now. I managed to get it 
to work successfully before, but in this case I'm trying to shorten (as opposed 
to lengthening, as before) a file in relation to a 'master' file.

These are the commands I've been using, followed by the dimensions of the files 
in question - as you can see, the row numbers of the merged file don't 
correlate to that of the 'coordinates' file (which is what I'm aiming to get 
'merged' equal to):

> merge(PopDens.long, coordinates, by=c("Latitude","Longitude"), all = TRUE) -> 
> merged
> dim(PopDens.long); dim(coordinates); dim(merged)
[1] 67870 3
[1] 67420 2
[1] 69849 3


One thing I tried was swapping the order of the files in the merge command, but 
this causes 'merged' to have the same number of rows (69849).


Something else I tried was to leave out the 'all = TRUE' argument, as I'm 
essentially attempting to shorten the file, but this makes the output file 
*too* short! (65441 as opposed to the intended 67420). Again, the same applies 
when the order of the input files are swapped.

> merge(PopDens.long, coordinates, by=c("Latitude","Longitude")) -> merged
> dim(PopDens.long); dim(coordinates); dim(merged)
[1] 67870 3
[1] 67420 2
[1] 65441 3


Am I doing something obviously wrong? I'm pretty certain that 'coordinates' is 
a subset of 'PopDens.long' - so there should be equal numbers of common values 
when merged.

Is there perhaps a more suitable function I could use, or a way of performing 
checks to see where I might be going wrong?!
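For checking where it goes wrong: extra rows after merge(..., all = TRUE) are usually caused by duplicated key pairs in one of the inputs, which merge expands combinatorially. A toy sketch:

```r
a <- data.frame(Latitude = c(1, 1), Longitude = c(2, 2), v = c(10, 11))  # duplicated key
b <- data.frame(Latitude = 1, Longitude = 2)

nrow(merge(a, b, by = c("Latitude", "Longitude"), all = TRUE))  # 2 rows, not 1

# Locate duplicated coordinate pairs before merging
anyDuplicated(a[, c("Latitude", "Longitude")]) > 0              # TRUE
```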

Many thanks,

Steve



[R] Coarsening the Resolution of a Dataset

2008-07-26 Thread Steve Murray

Dear all,

I have gridded data at 5' (minutes) resolution, which I intend to coarsen to 
0.5 degrees. How would I go about doing this in R? I've had a search online and 
haven't found anything obvious, so any help would be gratefully received.

I'm assuming that there will be several 'coarsening' techniques available - I'm 
after something fairly simple, which for example, just takes an average of each 
0.5 degree portion of the current dataset.

If someone is able to point me in the right direction, then I'd be very 
grateful.

Many thanks,

Steve




Re: [R] Coarsening the Resolution of a Dataset

2008-07-29 Thread Steve Murray

Unfortunately, when I get to the 'myCuts' line, I receive the following error:

Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

...and I also receive warnings about memory allocation being reached (even 
though I've already used memory.limit() to maximise the memory) - this is a 
fairly sizeable dataset after all, 2160 rows by 4320 columns.

Therefore I was wondering if there are any alternative ways of coarsening a 
dataset? Or are there any packages/commands built for this sort of thing? 

Any advice would be much appreciated!

Thanks again,

Steve





Re: [R] Coarsening the Resolution of a Dataset

2008-07-30 Thread Steve Murray

Hi - thanks for the advice - I am however applying this to the whole data 
frame. And the code that I'm using is just to read in the data (using 
read.table) and then the code that you supplied. I could send you the actual 
dataset if you don't mind a file ~50MB?!

Thanks again,

Steve


> Date: Tue, 29 Jul 2008 15:34:31 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: [R] Coarsening the Resolution of a Dataset
> CC: r-help@r-project.org
> 
> I assume that you are doing this on one column of the matrix which
> should only have 2160 entries in it.  can you send the actual code you
> are using.  I tried it with 10,000 samples and it works fine.  So I
> need to understand the data structure you are using.  Also the
> infinite recursion sounds strange; do you have function like 'cut' or
> 'c' redefined?  So it would help if you could supply a reproducible
> example.
> 
> On Tue, Jul 29, 2008 at 10:09 AM, Steve Murray <[EMAIL PROTECTED]> wrote:
> >
> > Unfortunately, when I get to the 'myCuts' line, I receive the following 
> > error:
> >
> > Error: evaluation nested too deeply: infinite recursion / 
> > options(expressions=)?
> >
> > ...and I also receive warnings about memory allocation being reached (even 
> > though I've already used memory.limit() to maximise the memory) - this is a 
> > fairly sizeable dataset afterall, 2160 rows by 4320 columns.
> >
> > Therefore I was wondering if there are any alternative ways of coarsening a 
> > dataset? Or are there any packages/commands built for this sort of thing?
> >
> > Any advice would be much appreciated!
> >
> > Thanks again,
> >
> > Steve
> >
> >
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem you are trying to solve?





Re: [R] Coarsening the Resolution of a Dataset

2008-07-31 Thread Steve Murray

Please find below my command inputs, subsequent outputs and errors that I've 
been receiving.

> crops <- read.table("crop2000AD.asc", colClasses = "numeric", na="-")
> str(crops[1:10])
'data.frame':   2160 obs. of  10 variables:
 $ V1 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V2 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V3 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V4 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V5 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V6 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V7 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V8 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V9 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V10: num  NA NA NA NA NA NA NA NA NA NA ...

Don't worry about all the NAs - this is because there is no data available at 
the poles of the Earth (at the top and bottom of the dataset).

> min.5 <- 5/60
> dim(crops)
[1] 2160 4320
> n <- 2160*4320

> memory.limit()
[1] 382.9844
> crops <- cbind(interval=seq(0, by=min.5, length=n), value=runif(n))
Error: cannot allocate vector of size 142.4 Mb
In addition: Warning messages:
1: In cbind(interval = seq(0, by = min.5, length = n), value = runif(n)) :
  Reached total allocation of 382Mb: see help(memory.size)
2: In cbind(interval = seq(0, by = min.5, length = n), value = runif(n)) :
  Reached total allocation of 382Mb: see help(memory.size)
3: In cbind(interval = seq(0, by = min.5, length = n), value = runif(n)) :
  Reached total allocation of 382Mb: see help(memory.size)
4: In cbind(interval = seq(0, by = min.5, length = n), value = runif(n)) :
  Reached total allocation of 382Mb: see help(memory.size)

But seems to run when 'value = runif(n)' is excluded

> crops <- cbind(interval=seq(0, by=min.5, length=n))
> head(crops)
       interval
[1,] 0.00000000
[2,] 0.08333333
[3,] 0.16666667
[4,] 0.25000000
[5,] 0.33333333
[6,] 0.41666667
> str(crops[1:10])
 num [1:10] 0.0000 0.0833 0.1667 0.2500 0.3333 ...

> breaks <- c(seq(min(crops[,'interval']), max(crops[, 'interval']), by=0.5), 
> Inf)
> head(breaks)
[1] 0.0 0.5 1.0 1.5 2.0 2.5
> str(breaks)
 num [1:1555201] 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 ...

> myCuts <- cut(crops[, 'interval'], breaks, include.lowest=TRUE)
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
In addition: Warning messages:
1: In formatC(breaks, digits = dig, width = 1) :
  Reached total allocation of 382Mb: see help(memory.size)
2: In formatC(breaks, digits = dig, width = 1) :
  Reached total allocation of 382Mb: see help(memory.size)
>

This is as far as I've got because of the above errors I encounter. Any 
pointers and advice, or if I'm doing something obviously wrong, then please let 
me know.

Thanks again for your help.

Steve




Re: [R] Coarsening the Resolution of a Dataset

2008-08-01 Thread Steve Murray

Hi Jim,

Thanks for your advice. The problem is that I can't lose any of the data - it's 
a global dataset, where the left-most column = 180 degrees west, and the 
right-most is 180 degrees east. The top row is the North Pole and the bottom 
row is the South Pole.

I've got 512MB RAM on the machine I'm using - which has been enough to deal 
with such datasets before...?

I'm wondering, is there an alternative means of achieving this? Perhaps 
orientated via the desired output of the 'coarsened' dataset - my calculations 
suggest that the dataset would need to change from the current 2160 x 4320 
dimensions to 360 x 720. Is there any way of doing this based on averages of 
blocks of rows/columns, for example?

Many thanks again,

Steve





Re: [R] Coarsening the Resolution of a Dataset

2008-08-01 Thread Steve Murray

Ok thanks Jim - I'll give it a go! I'm new to R, so I'm not sure how I'd go 
about performing averages in subsets... I'll have a look into it, but any 
subsequent pointers would be gratefully received as ever!

I'll also try playing with it in Access, and maybe even Excel 2007 might be 
able to do the trick too?

Thanks again...!

Steve



[R] Iterative Averages

2008-08-02 Thread Steve Murray

Dear all,

I have a data frame of 2160 rows and 4320 columns, which I hope to condense to 
a smaller dataset by finding averages of 6 by 6 blocks of values (to produce a 
data frame of 360 rows by 720 columns).

How would I go about finding the mean of a 6 x 6 block, then find the mean of 
the next adjacent 6 x 6 block, and so on, until the whole data frame has been 
covered?

One slight twist is that I have NA values, which I don't want to be included in 
the calculations unless a particular 6 x 6 block is entirely composed of NA 
values - in which case, NA should be the output value.
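A vectorized sketch of exactly this (a toy 12 x 12 matrix stands in for the 2160 x 4320 grid): assign each cell a row-block and column-block index, then average per block, keeping NA only for all-NA blocks.

```r
m <- matrix(rnorm(144), 12, 12)
m[1:6, 1:6] <- NA                 # an all-NA block should stay NA
block <- 6

rg <- (row(m) - 1) %/% block      # row-block index for every cell
cg <- (col(m) - 1) %/% block      # column-block index for every cell

coarse <- tapply(m, list(rg, cg),
                 function(x) if (all(is.na(x))) NA else mean(x, na.rm = TRUE))
dim(coarse)    # 2 x 2; coarse[1, 1] is NA, the rest are block means
```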

Thanks very much for any advice and solutions.

Steve




Re: [R] Iterative Averages

2008-08-02 Thread Steve Murray

Thanks very much for both your suggestions.

When you refer to doing a 'double for loop', do you mean finding the average 
for rowgrp and colgrp within each 6x6 block? If so, how would this be done so 
that the whole data frame is covered? It would seem to me that the 'mean' 
operation would need to combine rowgrp AND colgrp together? How would I get the 
loop to cycle through each set of 6x6 blocks of the data frame?
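A sketch of the double for loop itself (toy 12 x 12 matrix, 6 x 6 blocks): the outer loop walks block-rows, the inner loop walks block-columns, and each block's row/column ranges are derived from the loop counters.

```r
m <- matrix(runif(144), 12, 12)
block <- 6
out <- matrix(NA_real_, nrow(m) %/% block, ncol(m) %/% block)

for (i in seq_len(nrow(out))) {            # block-row
  for (j in seq_len(ncol(out))) {          # block-column
    rows <- ((i - 1) * block + 1):(i * block)
    cols <- ((j - 1) * block + 1):(j * block)
    cell <- m[rows, cols]
    out[i, j] <- if (all(is.na(cell))) NA else mean(cell, na.rm = TRUE)
  }
}
dim(out)  # 2 2
```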

Thanks again,

Steve




Re: [R] SQL Primer for R

2008-09-01 Thread Steve Revilak

Date: Sun, 31 Aug 2008 21:29:38 -0400
From: "ivo welch"
Subject: Re: [R] SQL Primer for R



stumped again by SQL...  If I have a table named "main" in an SQLite
data base, how do I get the names of all its columns?  (I have a mysql
book that claims the SHOW command does this sort of thing, but it does
not seem to work on SQLite.)


It sounds like SQLite's ".schema" command might be what you're looking for.
Here's an example:

  $ sqlite3 foo.db
  SQLite version 3.5.4
  Enter ".help" for instructions
  sqlite> create table T (c1 integer, c2 integer, c3 integer);
  sqlite> .tables
  T
  sqlite> .schema T
  CREATE TABLE T (c1 integer, c2 integer, c3 integer);
  sqlite> .quit
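From inside R, the DBI/RSQLite interface (assuming those packages are installed) exposes the same information without shelling out:

```r
library(RSQLite)

con <- dbConnect(SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE t (c1 INTEGER, c2 INTEGER, c3 INTEGER)")

dbListFields(con, "t")                   # "c1" "c2" "c3"
dbGetQuery(con, "PRAGMA table_info(t)")  # name, declared type, etc. per column

dbDisconnect(con)
```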

Steve Revilak



[R] Interpolation Problems

2008-09-01 Thread Steve Murray

Dear all,

I'm trying to interpolate a dataset to give it twice as many values (I'm giving 
the dataset a finer resolution by interpolating from 1 degree to 0.5 degrees) 
to match that of a corresponding dataset.

I have the data in both a data frame format (longitude column header values 
along the top with latitude row header values down the side) or column format 
(in the format latitude, longitude, value).

I have used Google to determine that 'approxfun' is the most appropriate command 
to use for this purpose - I may well be wrong here though! Nevertheless, I've 
tried using it with the default arguments for the data frame (i.e. interp <- 
approxfun(dataset) ) but encounter the following errors:

> interp <- approxfun(JanAv)
Error in approxfun(JanAv) : 
  need at least two non-NA values to interpolate
In addition: Warning message:
In approxfun(JanAv) : collapsing to unique 'x' values


However, there are no NA values! And to double-check this, I did the following:

> JanAv[is.na(JanAv)] <- 0

...to ensure that there really are no NAs, but receive the same error message 
each time.

With regard to the latter 'collapsing to unique 'x' values', I'm not sure what 
this means exactly, or how to deal with it.


Any words of wisdom on how I should go about this, or whether I should use an 
alternative command (I want to perform a simple (e.g. linear) interpolation), 
would be much appreciated.


Many thanks for any advice offered,

Steve



Re: [R] Interpolation Problems

2008-09-02 Thread Steve Murray

Thanks Duncan - a couple of extra points... I should have perhaps pointed out 
that the data are on a *regular* 'box' grid (with each value currently spaced 
at 1 degree intervals). Also, I'm looking for something fairly simple, like a 
bilinear interpolation (where each new point is created based on the values of 
the four points surrounding it).

In answer to your question, JanAv is simply the data frame of values. And yes, 
you're right, I think I'll need a 2D interpolation as it's a grid with latitude 
and longitude values (which as an aside, I guess these need to be interpolated 
differently? In a 1D format??). I think you're also right in that the 'akima' 
package isn't suitable for this job, as it's designed for irregular grids.

Do you, or does anyone, have any suggestions as to what my best option should 
be?

Thanks again,

Steve



> Date: Mon, 1 Sep 2008 18:45:35 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: r-help@r-project.org
> Subject: Re: [R] Interpolation Problems
>
> On 01/09/2008 6:17 PM, Steve Murray wrote:
>> Dear all,
>>
>> I'm trying to interpolate a dataset to give it twice as many values (I'm 
>> giving the dataset a finer resolution by interpolating from 1 degree to 0.5 
>> degrees) to match that of a corresponding dataset.
>>
>> I have the data in both a data frame format (longitude column header values 
>> along the top with latitude row header values down the side) or column 
>> format (in the format latitude, longitude, value).
>>
>> I have used Google to determine 'approxfun' the most appropriate command to 
>> use for this purpose - I may well be wrong here though! Nevertheless, I've 
>> tried using it with the default arguments for the data frame (i.e. interp <- 
>> approxfun(dataset) ) but encounter the following errors:
>>
>>> interp <- approxfun(JanAv)
>> Error in approxfun(JanAv) :
>> need at least two non-NA values to interpolate
>> In addition: Warning message:
>> In approxfun(JanAv) : collapsing to unique 'x' values
>>
>>
>> However, there are no NA values! And to double-check this, I did the 
>> following:
>>
>>> JanAv[is.na(JanAv)] <- 0
>>
>> ...to ensure that there really are no NAs, but receive the same error 
>> message each time.
>>
>> With regard to the latter 'collapsing to unique 'x' values', I'm not sure 
>> what this means exactly, or how to deal with it.
>>
>>
>> Any words of wisdom on how I should go about this, or whether I should use 
>> an alternative command (I want to perform a simple (e.g. linear) 
>> interpolation), would be much appreciated.
>
> What is JanAv? approxfun needs to be able to construct x and y values
> to interpolate; it may be that your JanAv object doesn't allow it to do
> that. (The general idea is that it will consider y to be a function of
> x, and will construct a function that takes arbitrary x values and
> returns y values matching those in the dataset, with some sort of
> interpolation between values.)
>
> If you really have longitude and latitude on some sort of grid, you
> probably want a two-dimensional interpolation, not a 1-d interpolation
> as done by approxfun. The interp() function in the akima package does
> this, but maybe not in the format you need.
>
> Duncan Murdoch
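As an aside (not part of the original exchange), the bilinear scheme described above - each new point computed from the four grid points surrounding it - is short enough to write directly in base R. The grid axes, test surface, and query point below are all invented for illustration:

```r
## Bilinear interpolation at a single point on a regular grid.
## x, y: sorted grid axes; z: matrix of values (z[i, j] at x[i], y[j]);
## (x0, y0): the point to interpolate at (must lie inside the grid).
bilinear1 <- function(x, y, z, x0, y0) {
  i <- findInterval(x0, x)                 # cell containing x0
  j <- findInterval(y0, y)                 # cell containing y0
  tx <- (x0 - x[i]) / (x[i + 1] - x[i])    # fractional position in cell
  ty <- (y0 - y[j]) / (y[j + 1] - y[j])
  (1 - tx) * (1 - ty) * z[i,     j    ] +
       tx  * (1 - ty) * z[i + 1, j    ] +
  (1 - tx) *      ty  * z[i,     j + 1] +
       tx  *      ty  * z[i + 1, j + 1]
}

x <- y <- seq(0, 3, by = 1)
z <- outer(x, y, "+")           # a smooth test surface: z = x + y
bilinear1(x, y, z, 1.5, 2.5)    # 4 for this linear surface
```

For a whole 0.5-degree target grid, the function can be applied over all target coordinate pairs (e.g. with mapply), or a dedicated package used instead.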



[R] printing name of object inside lapply

2008-09-04 Thread Steve Powell
Dear list members,
I am trying, within an lapply command, to print the names of the objects 
in a list or data frame. This is so that I can use odfWeave to print out a 
report with a section for each object, including the object names.

I tried e.g.
a=b=c=1:5
lis=data.frame(a,b,c)
lapply(
lis, function (z) {
obj.nam <- deparse(substitute(z))
cat("some other text",obj.nam,"and so on","\n")
}
)


But instead of getting "a" "b" etc. I get X[[1L]] etc.

Any ideas?


www.promente.org

proMENTE social research

Krančevićeva 35
71000 Sarajevo

mob. +387 61 215 997
tel. +387 556 865
fax. +387 556 866





[R] read.table error

2008-09-04 Thread Steve Murray

Dear all,

I have a tab-delimited text (.txt) file which I'm trying to read into R. This 
file is of column format - there are in fact 3 columns and 259201 rows 
(including the column headers). I've been using the following commands, but 
receive an error each time which prevents the data from being read in:


> Jan <- read.table("JanuaryAvBurntArea.txt", header=TRUE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  line 1 did not have 6 elements


I tried removing the 'header' argument, but receive a similar message:

> Jan <- read.table("JanuaryAvBurntArea.txt")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  line 2 did not have 6 elements


What's more confusing about this is that I know that none of the lines have 6 
elements! They're not supposed to! Each row only has 3 values (one per column)!

As a final resort I tried 'scan':

 <- scan("JanuaryAvBurntArea.txt")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  scan() expected 'a real', got 'Latitude'

...which is obviously something to do with there being a header as the first 
row, but the 'scan' command doesn't seem to have an equivalent of 'header=TRUE' 
like read.table...?


If anyone is able to shed some light on why I'm receiving these errors, and how 
I can get the data into R, then I'd be very grateful to hear them! I suspect 
I'm doing something very basic which is wrong!

Many thanks,

Steve







Re: [R] read.table error

2008-09-04 Thread Steve Murray

Thanks Prof. Ripley! I knew it would be something simple - I'd missed the "\t" 
from the read.table command! I won't be doing that again...!!

Thanks again,

Steve
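The fix, shown self-contained: a temporary file stands in for JanuaryAvBurntArea.txt (whose real contents are assumed from the thread to be three tab-separated columns with a header).

```r
## Fabricate a small tab-delimited file so the example runs anywhere
tf <- tempfile(fileext = ".txt")
writeLines(c("Latitude\tLongitude\tValue",
             "89.75\t-179.75\t0.1",
             "89.25\t-179.25\t0.2"), tf)

## The crucial argument is sep = "\t" for tab-delimited data
Jan <- read.table(tf, header = TRUE, sep = "\t")
str(Jan)   # 2 obs. of 3 numeric variables
```

As for scan(): it has no header argument, but `scan(file, skip = 1)` skips the header line instead.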



Re: [R] printing name of object inside lapply

2008-09-04 Thread Steve Powell

Thanks Prof Ripley! How obvious in retrospect!

Prof Brian Ripley wrote:

On Thu, 4 Sep 2008, Steve Powell wrote:


Dear list members,
I am trying, within a lapply command, to print the name of the objects
in list or data frame. This is so that I can use odfWeave to print out a
report with a section for each object, including the object names.

I tried e.g.
a=b=c=1:5
lis=data.frame(a,b,c)
lapply(
lis, function (z) {
obj.nam <- deparse(substitute(z))
cat("some other text",obj.nam,"and so on","\n")
}
)


But instead of getting "a" "b" etc. I get X[[1L]] etc.

Any ideas?


Use a for() loop on the names: lapply is overkill here.  But you could 
use


lapply(names(lis), function (z) {
   cat("some other text", z, "and so on","\n")
   ## references to lis[[z]]
})
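Putting the suggestion together as one runnable snippet, using the `lis` from the original post:

```r
a <- b <- c <- 1:5
lis <- data.frame(a, b, c)

## Iterate over the *names*; each column stays reachable as lis[[z]]
invisible(lapply(names(lis), function(z) {
  cat("some other text", z, "and so on", "\n")
}))
```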





www.promente.org

proMENTE social research

Krančevićeva 35
71000 Sarajevo

mob. +387 61 215 997
tel. +387 556 865
fax. +387 556 866










Re: [R] how to get the number of columns and column names of multiple data frames

2009-08-09 Thread Steve Lianoglou

Hi,

On Aug 9, 2009, at 11:29 AM, Frank Schäffer wrote:


Hi,
I've read several files with measurements into R data frames (works
flawlessly). Each dataframe is named by the location of measurement and
contains hundreds of rows and about 50 columns like this

dataframe1.
date   measurement_1   ...   measurement_n
1
2
3
..
..
..
n


Just as an aside, it's somehow considered more R-idiomatic to store  
all of these tables in a list (of tables) and access them as  
mydata[[1]], mydata[[2]], ..., mydata[[n]]. Assuming the datafiles are  
'filename.1.txt', 'filename.2.txt', etc. You might do this like so:


mydata <- lapply(paste('filename', 1:n, 'txt', sep='.'), read.table,  
header=TRUE, sep=...)


To test that all colnames are the same, you could do something like.

names1 <- colnames(mydata[[1]])
all(sapply(2:n, function(dat) length(intersect(names1,
colnames(mydata[[dat]]))) == length(names1)))


For further processing I need to check whether or not ncol and  
colnames are

the same for all dataframes.
Also I need to add a new column to each dataframe with contain the  
name of the
dataframe, so that this column can be treated as factor in later  
processing

(after merging some seleted dataframes to one)

I tried out

for (i in 1:length(ls())) {
print(ncol(ls()[i]))
}

but this does not work because ls()[i] is a "character" (the name, not
the data frame itself) and therefore ncol() returns "NULL".
Reading the output of ls() into a list also does not work.

How can I accomplish this task??


If you still want to do it this way, see: ?get

for example:

for (varName in paste('dataframe', 1:n, sep='')) {
  cat(colnames(get(varName)))
}
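A sketch tying the pieces together: reading the files into a list, checking the column names, and tagging each table with its source. The files are fabricated in a temporary directory so the example is self-contained; the real file names are assumptions.

```r
files <- file.path(tempdir(), paste("dataframe", 1:3, ".csv", sep = ""))
## fabricate the input files (stand-ins for the real measurements)
for (f in files)
  write.csv(data.frame(date = 1:2, m1 = rnorm(2)), f, row.names = FALSE)

mydata <- lapply(files, read.csv)
names(mydata) <- basename(files)

## same column names in every table?
all(sapply(mydata, function(d)
  identical(colnames(d), colnames(mydata[[1]]))))

## add a column recording each table's source, for use as a factor
## after merging selected tables into one
mydata <- lapply(names(mydata),
                 function(nm) cbind(mydata[[nm]], source = nm))
```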

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] problem selecting rows meeting a criterion

2009-08-11 Thread Steve Lianoglou

Hi,

See comments in line:

On Aug 11, 2009, at 2:45 PM, Jim Bouldin wrote:



No problem John, thanks for your help, and also thanks to Dan and  
Patrick.
Wasn't able to read or try anybody's suggestions yesterday.  Here's  
what

I've discovered in the meantime:

What I did not include yesterday is that my original data frame,  
called

"data", was this:

  X Y   V3
1  1 1 0.00
2  2 1 8.062258
3  3 1 2.236068
4  4 1 6.324555
5  5 1 5.00
6  1 2 8.062258
7  2 2 0.00
8  3 2 9.486833
9  4 2 2.236068
10 5 2 5.656854
11 1 3 2.236068
12 2 3 9.486833
13 3 3 0.00
14 4 3 8.062258
15 5 3 5.099020
16 1 4 6.324555
17 2 4 2.236068
18 3 4 8.062258
19 4 4 0.00
20 5 4 5.385165
21 1 5 5.00
22 2 5 5.656854
23 3 5 5.099020
24 4 5 5.385165
25 5 5 0.00

To this data frame I applied the following command:

data <- data[data$V3 >0,];data #to remove all rows where V3 = 0

giving me this (the point from which I started yesterday):

  X Y   V3
2  2 1 8.062258
3  3 1 2.236068
4  4 1 6.324555
5  5 1 5.00
6  1 2 8.062258
8  3 2 9.486833
9  4 2 2.236068
10 5 2 5.656854
11 1 3 2.236068
12 2 3 9.486833
14 4 3 8.062258
15 5 3 5.099020
16 1 4 6.324555
17 2 4 2.236068
18 3 4 8.062258
20 5 4 5.385165
21 1 5 5.00
22 2 5 5.656854
23 3 5 5.099020
24 4 5 5.385165

So far so good.  But when I then submit the command

data = data[X>Y,] #to select all rows where X > Y


This won't work in general, and is probably only working in this  
particular case because you already have defined somewhere in your  
workspace vars named X and Y.


What you wrote above isn't taking the values X,Y from data$X and data 
$Y, respectively, but rather from var X and Y defined elsewhere.


Instead of doing data[X > Y], do:

data[data$X > data$Y,]

This should get you what you're expecting.
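A minimal demonstration of the difference, on a toy frame (not from the thread; the column names X and Y mirror the original data):

```r
dat <- data.frame(X = c(2, 1, 5), Y = c(1, 3, 2))

## Explicit column references always work:
dat[dat$X > dat$Y, ]

## subset() evaluates X and Y *inside* dat, so this is equivalent:
subset(dat, X > Y)

## Bare dat[X > Y, ] only "works" if objects named X and Y happen to
## exist in the workspace - and then it silently uses those, not the
## columns, giving the wrong rows.
```

subset() is convenient interactively; inside functions, explicit `dat$X` indexing is the safer habit.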


I get the problem result already mentioned, namely:

  X Y   V3
3  3 1 2.236068
4  4 1 6.324555
5  5 1 5.00
6  1 2 8.062258
10 5 2 5.656854
11 1 3 2.236068
12 2 3 9.486833
17 2 4 2.236068
18 3 4 8.062258
24 4 5 5.385165

which is clearly wrong!  It doesn't matter if I give a new name to  
the data
frame at each step or not, or whether I use the name "data" or not.   
It

always gives the same wrong answer.

However, if I instead use the command:
subset(data, X>Y), I get the right answer, namely:

  X Y   V3
2  2 1 8.062258
3  3 1 2.236068
4  4 1 6.324555
5  5 1 5.00
8  3 2 9.486833
9  4 2 2.236068
10 5 2 5.656854
14 4 3 8.062258
15 5 3 5.099020
20 5 4 5.385165


That's because when you are using X, and Y in your subset(...) call,  
THIS takes X and Y to mean data$X and data$Y.



OK so the lesson so far is "use the subset function".


Hopefully you're learning a slightly different lesson now :-)

Does that clear things up at all?

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] R help from command line

2009-08-11 Thread Steve Lianoglou

Hi,

On Aug 11, 2009, at 3:43 PM, Peng Yu wrote:


Hi,

I frequently need to open multiple help pages in R, which requires the
start of multiple R sessions. I am wondering if there is a way to
invoke the help page from the command line just like 'man'.


I haven't been paying attention, but are you working on a machine with  
a windowing system + browser?


If so, set your help files to be seen in the browser:

R> options(htmlhelp=TRUE)

next time you ask for some ?help, it should pop up a browser window.

Good enough?

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] logged2

2009-08-12 Thread Steve Lianoglou

Hi,

On Aug 12, 2009, at 9:26 AM, amor Gandhi wrote:


Hi,

Can you please tell me what the function logged2 in R does? As I have:

?logged2

No documentation for 'logged2' in specified packages and libraries:
you could try '??logged2'

??logged2

No help files found with alias or concept or title matching ‘logged2’
using fuzzy matching.


There is no logged2 function in (base) R. Where are you seeing this  
function used? Running:


R> RSiteSearch('logged2')

Only brings up "logged2" as a parameter to some functions in the samr  
package, but nothing else.


If you can let us know where you're seeing this function used, we  
could likely provide more help.


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] problem loading ncdf library on MAC

2009-08-12 Thread Steve Lianoglou

Hi,

On Aug 12, 2009, at 2:23 PM, Dan Kelley wrote:

I think ncdf is broken now, and so is Rnetcdf.  I can't build from  
source,

either.  If I find a solution, I'll post it here.  FYI, I have an
intel-based Mac running the latest OS and with R 2.9.1


While I haven't compiled it myself, it seems that RNetCDF requires a 
library that you probably don't have installed. Check out its build 
report:


http://www.r-project.org/nosvn/R.check/r-release-macosx-ix86/RNetCDF-00install.html

Probably it's looking for the udunits (http://www.unidata.ucar.edu/software/udunits/ 
 (?)) library and you don't have it.


I just tried to compile ncdf on my machine, and it also failed; it 
looks like you need netcdf.h. Just look at the status/reporting that 
building the package gives:



checking netcdf.h usability... no
checking netcdf.h presence... no
checking for netcdf.h... no
checking /usr/local/include/netcdf.h usability... no
checking /usr/local/include/netcdf.h presence... no
checking for /usr/local/include/netcdf.h... no
checking /usr/include/netcdf.h usability... no
checking /usr/include/netcdf.h presence... no
...

I guess you just need to get the required libs and try again.

Or are you getting different errors?

-steve


--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Nominal variables in SVM?

2009-08-12 Thread Steve Lianoglou

Hi,

On Aug 12, 2009, at 2:53 PM, Noah Silverman wrote:


Hi,

The answers to my previous question about nominal variables have led  
me to a more important question.


What is the "best practice" way to feed nominal variable to an SVM.

For example:
color = ("red", "blue", "green")

I could translate that into an index so I wind up with
color= (1,2,3)

But my concern is that the SVM will now think that the values are  
numeric in "range" and not discrete conditions.


Another thought would be to create 3 binary variables from the  
single color variable, so I have:


red = (0,1)
blue = (0,1)
green = (0,1)

An example fed to the SVM would have one positive and two negative  
values to indicate the color value:

i.e. for a blue example:
red = 0, blue =1 , green = 0


Do it this way.

So, imagine if the features for your examples were color and height,  
your "feature matrix" for N examples would be N x 4


0,1,0,15  # blue object, height 15
1,0,0,10  # red object, height 10
0,0,1,5 # green object, height 5
...
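In R, model.matrix() builds exactly this kind of indicator ("one-hot") coding from a factor. A small sketch; the height feature is invented for illustration:

```r
d <- data.frame(color  = factor(c("blue", "red", "green")),
                height = c(15, 10, 5))   # height is a made-up feature

## "- 1" drops the intercept so every level gets its own 0/1 column
X <- cbind(model.matrix(~ color - 1, d), height = d$height)
colnames(X)   # "colorblue" "colorgreen" "colorred" "height"
```

This avoids hand-maintaining one binary variable per level, and keeps the level-to-column mapping consistent across training and test data (as long as the factor levels match).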

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] request: Help

2009-08-13 Thread Steve Lianoglou

Hi,

On Aug 13, 2009, at 3:43 PM, Sarjinder Singh wrote:


Dear Sir/Madam,

Good Day!

How can we make output file in R?

In FORTRAN, we could do as follows:

WRITE (42, 107) x, y
107  FORMAT ( 2x, F9.3, 2x, F4.2)

What is equivalent to this in R?


See:
 ?file
 ?cat
 ?sprintf
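A rough equivalent of the FORTRAN WRITE/FORMAT pair, combining those three functions. The values of x and y and the file name are made up; the format string mirrors 2X, F9.3, 2X, F4.2:

```r
x <- 123.4567; y <- 0.75
fname <- file.path(tempdir(), "fort42.txt")

out <- file(fname, "w")
cat(sprintf("  %9.3f  %4.2f\n", x, y), file = out)  # ~ 2x,F9.3,2x,F4.2
close(out)

readLines(fname)   # "    123.457  0.75"
```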

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Coding problem: How can I extract substring of function callwithin the function

2009-08-13 Thread Steve Lianoglou

Hi,

On Aug 13, 2009, at 5:30 PM, Pitt, Joel wrote:

Thanks. It's a great solution to part of my problem. It provides the  
functionality I want and would be no harder for students to use than my  
approach. I'm still interested in how to make what I was trying to  
do work -- simply to add to my own R programming proficiency.


I wasn't going to really chime in, but I couldn't help it given the  
irony in your last sentence (sorry, I don't mean to sound like an ass).


I think trying to help your students with some "bumps on the road" is  
admirable, but this seems just a bit misdirected. As you say, you are  
on a quest to add to your R programming proficiency, and will apply  
this newly founded technique to actively handicap your students'  
proficiency.


I guess you're teaching some intro stat analysis class, and waving  
over the fact that functions like mean explicitly return NA in the  
presence of NA's, but this is important to know and recognize when  
analyzing any real data, because there will surely be NA's "in the  
wild." Knowing that you should consciously and deliberately deal with/ 
ignore them from the start might be a good first/early lesson for  
students to digest.


Like I said, I don't mean to sound like an ass, but I think glossing  
over details of working with data should be avoided. Your other  
additions, like adding options to plotting/graphing functions, seem  
less harmful to me. But if you're trampling over some base:: function  
for something trivial like changing the value of one default parameter  
to something else, you might as well just get them to learn how to use  
the ? asap as well.


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Coding problem: How can I extract substring of function callwithin the function

2009-08-13 Thread Steve Lianoglou

Hi Joel,

On Aug 13, 2009, at 6:10 PM, Pitt, Joel wrote:

Your objections are quite cogent, but I believe they are  
misdirected. I'm not at all interested in having my students ignore  
the existence of NA's and that's precisely why I'm not using the  
Default package as someone else suggested. But the mere existence of  
missing values in a data set doesn't make computation of the mean  
entirely useless -- why else would the na.rm option be available?


I agree: it's certainly not useless to calc the mean of a series of  
numbers while ignoring NA's.


If a student using my version of mean uses it to find the mean of a  
variable that has missing values, the function first prints a  
warning message and only then returns a value. Here's an example


> length(x)
[1] 101
> mean(x)
Warning: x has 3 missing values.
[1] 51.69388


Looks nice.

My only goal in this project has been to provide them with a somewhat  
friendlier version of R. It was not conceived as part of a quest to  
improve my programming proficiency.


I didn't think you were pursuing this to improve your programming  
proficiency from the outset, I just wanted to point out that you found  
a situation which you wanted to work around for (i) your precise use  
case. But it was because you were trying to solve (i) that you  
stumbled on (ii) and wanted to just know how one might do this for  
curiosity's sake and increase your proficiency (perhaps for some other  
use case in the future).


It was simply reason (ii) that prompted me to say something, only  
because you might not have seen your approach as potentially  
handicapping your students' (maybe subconcious(?)) pursuit of  
proficiency. I just wanted to point out that this is what you *might*  
be doing in the short term. Then again, it might not be.


Everyone has their own style of teaching, and it's your prerogative to  
do it as you see fit. I wouldn't presume to know which way is best, so  
good luck with the upcoming semester :-)


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] creating list of the form 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, . . ., n, n, n?

2009-08-14 Thread Steve Lianoglou

Hi,

On Aug 14, 2009, at 10:44 AM, John Sorkin wrote:


Windows XP
R 2.8.1

Is there any way to make a list of the form
1,1,1,2,2,2,3,3,3,4,4,4, . . ., n,n,n?


Like so?

R> rep(1:10, each=3)
 [1]  1  1  1  2  2  2  3  3  3  4  4  4  5  5  5  6  6  6  7  7  7  8  8  8  9
[26]  9  9 10 10 10

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] Assigning values based on a separate reference (lookup) table

2009-08-14 Thread Steve Murray

Dear R Users,

I have a data frame of 360 rows by 720 columns (259200 values). I am hoping 
to apply an equation to each value in this grid, to generate a new grid. One 
of the parts of the equation (called 'p') relies on reading from a separate 
reference table. This is Table 4 at: 
http://www.fao.org/docrep/s2022e/s2022e07.htm#3.1.3%20blaney%20criddle%20method 
(scroll down a little).

Therefore, 'p' relies on the latitude of the values in the initial 360 x 720 
data frame. The row names of the data frame contain the latitude values and 
these range from between 89.75 to -89.75 (the latter being South of the 
Equator).

My question is, how do I go about forming a loop to read each of the 259200 
values and assign it a 'p' value (from the associated reference table), based 
on its latitude?

My thinking was to do a series of 'if' statements, but this soon got very, very 
messy - any ideas which get the job done (and aren't a riddle to follow), would 
be most welcome.

Many thanks for any advice,

Steve





Re: [R] Assigning values based on a separate reference (lookup) table

2009-08-14 Thread Steve Murray

Thanks to you both for your responses.
I think these approaches will *nearly* do the trick, however, the problem is 
that the reference/lookup table is based on 'bins' of latitude values, eg.>61, 
60-56, 55-51, 50-46 etc. whereas the actual data (in my 720 x 360 data frame) 
are not binned, e.g. 89.75, 89.25, 88.75, 88.25, 87.75 etc. - instead they 
'increment' by -0.5 each time, and therefore many of the 367200 values which 
are in the data frame will have latitude values falling into the same 
'reference' bin.

It's for this reason that I think the 'merge' approach might fall down, unless 
there's a way of telling 'merge' that latitude can still be considered to match 
if they fall within a range. For example, if my 720 x 360 data frame has values 
whose corresponding latitude (row name) values are, say, 56.3, 55.9, 58.2, 56.8 
and 57.3, then the original value in the grid needs to be assigned a 'p' value 
which corresponds with what is read off of the reference table from the bin 
56-60.

Hope this makes sense! If not, please feel free to ask for clarification.

Many thanks again,

Steve
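As an aside (not part of the thread), findInterval() maps each latitude into its bin in one vectorized step, so the bin-matching problem above needs no chained ifs and no fuzzy merge. The bin edges and p values below are invented placeholders, not the real Blaney-Criddle table:

```r
edges <- c(45, 50, 55, 60)                 # hypothetical bin boundaries
p     <- c(0.20, 0.22, 0.24, 0.26, 0.28)   # hypothetical p, one per bin

lat   <- c(56.3, 55.9, 58.2, 46.1, 61.0)   # example latitudes from the post
## findInterval returns 0 for lat < 45, 1 for [45,50), ..., 4 for >= 60,
## so "+ 1" indexes the matching entry of p
pval  <- p[findInterval(lat, edges) + 1]
pval
```

cut() with labels gives the same binning when named bins are preferred to indices.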





[R] Rounding to the nearest 5

2009-08-17 Thread Steve Murray

Dear all,

A hopefully simple question: how do I round a series of values (held in an 
object) to the nearest 5? I've checked out trunc, round, floor and ceiling, but 
these appear to be more tailored towards rounding decimal places.

Thanks,

Steve
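No answer appears in this excerpt, but a common idiom for this is to scale, round, and rescale:

```r
x <- c(1, 2.6, 7, 11, 13)
round(x / 5) * 5   # 0  5  5 10 15
```

Note that R's round() rounds halves to even, so e.g. 12.5 goes to 10, not 15.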






[R] Reshape package: Casting data to form a grid

2009-08-18 Thread Steve Murray

Dear R Users,

I'm trying to use the 'cast' function in the 'reshape' package to convert 
column-format data to gridded-format data. A sample of my dataset is as follows:

head(finalframe)
  Latitude Longitude Temperature OrigLat  p-value Blaney
1  -90-38.75  NA  -87.75 17.10167 NA
2  -90135.75  NA  -87.75 17.10167 NA
3  -90 80.25  NA  -87.75 17.10167 NA
4  -90 95.75  NA  -87.75 17.10167 NA
5  -90 66.75  NA  -87.75 17.10167 NA
6  -90 75.75  NA  -87.75 17.10167 NA


I'm attempting to form a grid based on the OrigLat, Longitude and Blaney 
columns, to form the rows, columns and values of the new grid respectively.

The command I've been using is:

cast_test <- cast(finalframe, finalframe$OrigLat~variable, 
finalframe$Longitude~variable, finalframe$Blaney~variable)
Error: Casting formula contains variables not found in molten data: 
finalframe$OrigLat, variable

And I've tried removing the ~variable suffixes:

cast_test <- cast(finalframe, finalframe$OrigLat, finalframe$Longitude, 
finalframe$Blaney)
Error: Casting formula contains variables not found in molten data: 
-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75-87.75
 [etc etc]


I'm not sure how to get round this error, nor what the 'molten data' is that 
the error is referring to. I'm assuming it means the data frame presented 
above, yet the variables are clearly present!

Any help or advice on this would be most welcomed.

Many thanks,

Steve
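As an aside (not part of the thread): the "molten data" cast() expects is the long-format output of melt() from the same package. With base R alone, a latitude-by-longitude grid of values can be built with tapply; here on a toy version of the frame:

```r
toy <- data.frame(OrigLat   = c(-87.75, -87.75, -87.25, -87.25),
                  Longitude = c(10.25, 10.75, 10.25, 10.75),
                  Blaney    = c(1, 2, 3, 4))

## rows = latitudes, columns = longitudes, cells = (mean) Blaney values
wide <- tapply(toy$Blaney, list(toy$OrigLat, toy$Longitude), mean)
wide
```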





[R] Replacing NA values in one column of a data.frame

2009-08-18 Thread Steve Murray

Dear all,

I'm trying to replace NA values with - in one column of a data frame. I've 
tried using is.na and the testdata[testdata$onecolumn==NA] <-  approach, 
but whilst neither generate errors, neither result in -s appearing - the 
NAs remain there!

I'd be grateful for any advice on what I'm doing wrong or any other suitable 
approaches.

Many thanks,

Steve




Re: [R] Replacing NA values in one column of a data.frame

2009-08-18 Thread Steve Lianoglou

Hi,

On Aug 18, 2009, at 10:19 AM, John Kane wrote:


Perhaps
testdata$onecolumn[testdata$onecolumn==NA] <- 


I don't think this would work -- the is.na function is the way to go:

R> a <- c(1,2,3,NA,NA,10,20,NA,NA)
R> a
[1]  1  2  3 NA NA 10 20 NA NA

R> a[a == NA] <-999
R> a
[1]  1  2  3 NA NA 10 20 NA NA

R> a[is.na(a)] <-999
R> a
[1]   1   2   3 999 999  10  20 999 999
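The reason `a == NA` leaves the NAs untouched is that any comparison with NA returns NA, never TRUE, so the logical index never selects anything:

```r
a <- c(1, 2, NA)
a == NA          # NA NA NA -- NA means "unknown", so every comparison is unknown
is.na(a)         # FALSE FALSE TRUE -- the only reliable test for missingness
```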

Hope that helps,
-steve



--- On Mon, 8/17/09, Steve Murray  wrote:


From: Steve Murray 
Subject: [R] Replacing NA values in one column of a data.frame
To: r-help@r-project.org
Received: Monday, August 17, 2009, 11:41 AM

Dear all,

I'm trying to replace NA values with - in one column of
a data frame. I've tried using is.na and the
testdata[testdata$onecolumn==NA] <-  approach, but
whilst neither generate errors, neither result in -s
appearing - the NAs remain there!

I'd be grateful for any advice on what I'm doing wrong or
any other suitable approaches.

Many thanks,

Steve

_

oticons.





  
__

[[elided Yahoo spam]]



--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] printing a dataframe summary to disk

2009-08-18 Thread Steve Lianoglou

Hi,

On Aug 17, 2009, at 10:54 AM, Philip A. Viton wrote:


I'd like to write the summary of a dataframe to disk, so that it looks
essentially the same as what you'd see on screen; but I can't seem  
to do it. Can someone tell me how? Thanks!


Look at: ?write.table and friends

You can write the data.frame that way. If you're looking to serialize 
the result of a call to summary(my.data.frame) to disk, just note 
that the summary function returns the summary as a table (rather than 
simply printing it to the terminal), so:


my.summary <- summary(some.data.frame)
write.table(my.summary, quote=FALSE, file="summary.txt")
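If the goal is a file that looks exactly like the on-screen printout (rather than a re-readable table), capture.output() is another option — a sketch:

```r
## capture.output() grabs the printed lines verbatim; writeLines() saves them,
## so the file matches what you'd see at the console.
writeLines(capture.output(summary(some.data.frame)), "summary.txt")
```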

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Remove columns

2009-08-18 Thread Steve Lianoglou

Hi Alberto,

On Aug 18, 2009, at 4:14 AM, Alberto Lora M wrote:


Hi Everybody

Could somebody help me?

I need to remove the columns where the sum of its components is equal to
zero.

For example


a<-matrix(c(0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,1,0,0,1,0), ncol=4)
a

     [,1] [,2] [,3] [,4]
[1,]    0    0    0    1
[2,]    0    1    0    1
[3,]    0    0    0    0
[4,]    0    1    0    0
[5,]    0    0    0    1
[6,]    0    0    0    0

Columns 1 and 3 should be removed

the result should be the following matrix

     [,2] [,4]
[1,]    0    1
[2,]    1    1
[3,]    0    0
[4,]    1    0
[5,]    0    1
[6,]    0    0


Try this:

R> a[,-which(colSums(a) == 0)]
     [,1] [,2]
[1,]    0    1
[2,]    1    1
[3,]    0    0
[4,]    1    0
[5,]    0    1
[6,]    0    0

Indexing into a matrix/vector/data.frame/list/whatever with a negative  
number removes those elements from the result.
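One edge case worth hedging: if no column sums to zero, which() returns integer(0), and `a[, -integer(0)]` then drops every column instead of keeping them all. Logical indexing sidesteps that:

```r
## Safe even when nothing sums to zero; drop = FALSE keeps the matrix shape
## if only one column survives.
a[, colSums(a) != 0, drop = FALSE]
```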


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] lm.fit algo

2009-08-18 Thread Steve Lianoglou

Hi,

On Aug 17, 2009, at 5:09 PM, Pavlo Kononenko wrote:


Hi, everyone,

This is a little silly, but I can't figure out the algorithm behind the
lm.fit function used in the context of the promax rotation algorithm:

The promax function is:

promax <- function(x, m = 4)
{
    if(ncol(x) < 2) return(x)
    dn <- dimnames(x)
    xx <- varimax(x)
    x <- xx$loadings
    Q <- x * abs(x)^(m-1)
    U <- lm.fit(x, Q)$coefficients
    d <- diag(solve(t(U) %*% U))
    U <- U %*% diag(sqrt(d))
    dimnames(U) <- NULL
    z <- x %*% U
    U <- xx$rotmat %*% U
    dimnames(z) <- dn
    class(z) <- "loadings"
    list(loadings = z, rotmat = U, crap = x, coeff = Q)
}

And the line I'm having trouble with is:

U <- lm.fit(x, Q)$coefficients


Isn't this doing a least squares regression using the predictor  
variables in x and the (I guess) real valued numbers in vector Q?


x is a matrix of n (observations) by p (predictors)

The $coefficients is just taking the vector of coefficients/weights  
over the predictors  -- this would be a vector of length p -- such that


x %*% t(t(U)) ~ Q

 * t(t(U)) is ugly, but I just want to say get U to be a column vector
 * ~ is used as "almost equals")
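To make that concrete (a sketch of the math, not of lm.fit's actual internals, which use a QR decomposition): for full-rank x the coefficients solve the normal equations, so with hypothetical data you can reproduce them directly:

```r
set.seed(1)
x <- matrix(rnorm(20), 5, 4)   # hypothetical 5 x 4 design matrix
Q <- x * abs(x)^3              # as in promax with m = 4
U  <- lm.fit(x, Q)$coefficients
U2 <- solve(t(x) %*% x, t(x) %*% Q)   # normal-equations least squares
all.equal(unname(U), unname(U2))      # should be TRUE up to numerical error
```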

You'll need some numerical/scientific/matrix library in java, perhaps  
this could be a place to start:


http://commons.apache.org/math/userguide/stat.html#a1.5_Multiple_linear_regression

Hope that helps,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Newbie that don't understand R code

2009-08-18 Thread Steve Lianoglou

Hi,

Comments inline and at end:

On Aug 17, 2009, at 11:36 AM, kfcnhl wrote:


I got some R code that I don't understand.

Question as comment in code:
# where is t coming from, what is phi inverse

rAC <- function(name, n, d, theta){
    # generic function for Archimedean copula simulation
    illegalpar <- switch(name,
        clayton = (theta < 0),
        gumbel = (theta < 1),
        frank = (theta < 0),
        BB9 = ((theta[1] < 1) | (theta[2] < 0)),
        GIG = ((theta[2] < 0) | (theta[3] < 0) |
               ((theta[1] > 0) & (theta[3] == 0)) |
               ((theta[1] < 0) & (theta[2] == 0))))
    if(illegalpar)
        stop("Illegal parameter value")
    independence <- switch(name,
        clayton = (theta == 0),
        gumbel = (theta == 1),
        frank = (theta == 0),
        BB9 = (theta[1] == 1),
        GIG = FALSE)
    U <- runif(n * d)
    U <- matrix(U, nrow = n, ncol = d)
    if(independence)
        return(U)
    Y <- switch(name,
        clayton = rgamma(n, 1/theta),
        gumbel = rstable(n, 1/theta) * (cos(pi/(2 * theta)))^theta,
        frank = rFrankMix(n, theta),
        BB9 = rBB9Mix(n, theta),
        GIG = rGIG(n, theta[1], theta[2], theta[3]))
    Y <- matrix(Y, nrow = n, ncol = d)
    phi.inverse <- switch(name,
        clayton = function(t, theta)
        # where is t coming from, what is phi inverse
        {
            (1 + t)^(-1/theta)
        }


t isn't coming from anywhere, it's just a parameter to the function  
definition here.


phi.inverse will be THE FUNCTION returned by the switch statement  
here, depending on the value of ``name``.



        ,
        gumbel = function(t, theta)
        {
            exp( - t^(1/theta))
        },
        frank = function(t, theta)
        {
            (-1/theta) * log(1 - (1 - exp( - theta)) * exp( - t))
        },
        BB9 = function(t, theta)
        {
            exp( - (theta[2]^theta[1] + t)^(1/theta[1]) + theta[2])
        },
        GIG = function(t, theta)
        {
            lambda <- theta[1]
            chi <- theta[2]
            psi <- theta[3]
            if (chi == 0)
                out <- (1 + 2*t/psi)^(-lambda)
            else if (psi == 0)
                out <- 2^(lambda+1) * exp(besselM3(lambda, sqrt(2*chi*t),
                    logvalue = TRUE) - lambda*log(2*chi*t)/2) / gamma(-lambda)
            else
                out <- exp(besselM3(lambda, sqrt(chi*(psi+2*t)), logvalue = TRUE) +
                    lambda*log(chi*psi)/2 -
                    besselM3(lambda, sqrt(chi*psi), logvalue = TRUE) -
                    lambda*log(chi*(psi+2*t))/2)
            out
        })
    phi.inverse( - log(U)/Y, theta)
}


phi.inverse was defined as the function returned by the switch  
statement. ``- log(U)/Y`` is passed in to the function's ``t`` argument.
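A tiny self-contained illustration of the pattern — switch() returning a function (hypothetical values, clayton branch):

```r
## switch() can return a function; the chosen branch is then called like any
## other function, with its t argument supplied at call time.
phi.inverse <- switch("clayton",
    clayton = function(t, theta) (1 + t)^(-1/theta),
    gumbel  = function(t, theta) exp(-t^(1/theta)))
phi.inverse(1, theta = 2)   # clayton branch: (1 + 1)^(-1/2), about 0.707
```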


Does that help?

-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] value of nth percentile

2009-08-18 Thread Steve Lianoglou

Hi Ajay,

On Aug 18, 2009, at 8:16 AM, Ajay Singh wrote:


Dear All,
I have to get the value of, say, the 90th percentile of a precipitation time 
series. The series is of daily precipitation values over 96 years, and I 
have to get the 90th percentile value of daily precipitation for each 
year. If you know the R code or command for this please let me know.

I would appreciate your early response.


R> dat <- rnorm(100, mean=10, sd=2)
R> quantile(dat, .9)
 90%
12.53047

R> sum(dat < quantile(dat, .9)) / length(dat)
[1] 0.9
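Since the question asks for the 90th percentile per year, a hedged sketch (assuming a data frame with columns `year` and `precip` — adjust to your actual names):

```r
## Fake 96 "years" of daily data, then one 90th percentile per year.
df <- data.frame(year   = rep(1901:1996, each = 365),
                 precip = rexp(96 * 365))
tapply(df$precip, df$year, quantile, probs = 0.9)  # named vector, one entry per year
```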

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] open txt

2009-08-18 Thread Steve Lianoglou


On Aug 18, 2009, at 9:41 AM, Stela Valenti Raupp wrote:


I can't open the txt file in R; it gives the message: Warning message:
In file(file, "r") : cannot open file 'plantula.txt': No such file or
directory

The file is in the same folder as the script.
I don't know what the problem is.


Your file "plantula.txt" is not there (I think).

In short: you are passing some function the name of a file that  
doesn't exist. Try passing the absolute path to the plantula.txt file  
to your call to ``file()``
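For example (a sketch — the path below is a placeholder for wherever plantula.txt actually lives):

```r
getwd()                         # where R is currently looking for files
setwd("C:/path/to/your/data")   # either point R at the right folder...
dat <- read.table("plantula.txt", header = TRUE)  # ...or pass a full path instead
```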


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Tr : create a table in the console!!

2009-08-18 Thread Steve Lianoglou

Hi,

On Aug 18, 2009, at 10:52 AM, Inchallah Yarab wrote:

- Forwarded message -
From: Inchallah Yarab 
To: r-help@r-project.org
Sent: Tuesday, 18 August 2009, 16:26:20
Subject: create a table in the console!!


HI

I want to do a table with R (in the console)

GWP_Max      NumberOfPolicies
No_GWPMax    8
[0-1000]     4
[1000-3000]  3
[> 3000]     5

i begin by calculate the number of policies in each class :

 Data1 <- read.csv2("c:/Total1.csv", sep=",")

Data2 <- read.csv2("c:/GWPMax1.csv",sep=",")[1:20,1:2]
M <- merge(Data1,Data2, by.x = "Policy.Number",by.y =  
"Policy.Number",all.x = TRUE,all.y = TRUE )

(No_GWPMax<-nrow(M[M[,25]=="NA",]))

[1] 8
M2<- merge(Data1,Data2, by.x = "Policy.Number",by.y =  
"Policy.Number")

M2$GWP_Max <- as.numeric(as.character(M2$GWP_Max ))
class1 <- M2[M2[,25]>0 & M2[,25]<1000,]
(NbpolicyClass1 <- nrow(class1))

[1] 5

class2 <- M2[M2[,25]>1000 & M2[,25]<3000,]
(NbpolicyClass2 <- nrow(class2))

[1] 3

class3 <- M2[M2[,25]>3000,]
(NbpolicyClass3 <- nrow(class3))

[1] 4

can you help me ?


I don't understand what you want to do.

From the code you've pasted, you already have extracted the numbers  
you wanted, so what do you mean when you say "I want to do a table"  
with them? Do you just want to put them in a data.frame, or something?
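If a single data.frame of the counts is the goal, a minimal sketch reusing the objects computed above (No_GWPMax, NbpolicyClass1, etc.):

```r
## Assemble the already-computed counts into one printable data.frame.
counts <- data.frame(
    GWP_Max          = c("No_GWPMax", "[0-1000]", "[1000-3000]", "[> 3000]"),
    NumberOfPolicies = c(No_GWPMax, NbpolicyClass1, NbpolicyClass2, NbpolicyClass3))
print(counts)
```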


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Errors with RVM and LSSVM from kernlab library

2009-08-19 Thread Steve Lianoglou

Howdy,

On Aug 19, 2009, at 2:54 PM, Noah Silverman wrote:


Hi Steve,

No custom kernel. (This is the exact same data that I call svm  
with.  svm works without a complaint.)


traindata is just a dataframe of numerical attributes
trainlabels is just a vector of labels. ("good", "bad")

Then I call

model <- rvm(x,y)


is x really a data.frame? Can you try to turn it into a matrix to see  
if it will get you over this speed bump?


model <- rvm(as.matrix(x), y)

I reckon if x is a data.frame, R is invoking the rvm method that's 
meant to work on list(s), rather than the one for the matrix you 
think you're passing in.


Does that do the trick?

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 |  Memorial Sloan-Kettering Cancer Center
 |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Errors with RVM and LSSVM from kernlab library

2009-08-19 Thread Steve Lianoglou

Hi,

On Aug 19, 2009, at 1:27 PM, Noah Silverman wrote:


Hello,

In my ongoing quest to develop a "best" model, I'm testing various  
forms of SVM to see which is best for my application.


I have been using the SVM from the e1071 library without problem for  
several weeks.


Now, I'm interested in RVM and LSSVM to see if I get better  
performance.


When running RVM or LSSVM on the exact same data as the SVM{e1071},  
I get an error that I don't understand:


Error in .local(x, ...) : kernel must inherit from class 'kernel'

Does this make sense to anyone?  Can you suggest how to resolve this?


Sure, it just means that whatever you are passing as a value to the  
kernel= parameter of your function call is not a kernel function (that  
kernlab knows about).


Did you rig up a custom kernel function? If so -- be sure to set its 
class properly. Otherwise, can you provide something of a 
self-contained piece of code that you're using to invoke these 
functions such that it's giving you these errors?


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Errors with RVM and LSSVM from kernlab library

2009-08-19 Thread Steve Lianoglou

Steve,

That makes sense, except that x is a data.frame with about 70  
columns.  So I don't see how it would convert to a list.


Yeah ... not sure if that's what happening (R class relationships/ 
testing is still a bit of a mystery to me), but see:


R> df <- data.frame(a=1:10,b=1:10)
R> is(df)
[1] "data.frame" "list"       "oldClass"   "mpinput"    "vector"

But

R> is(df, 'list')
[1] FALSE

So, in short, I don't know if that's what's happening ... did it fix  
your problem, tho?


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] a naive question

2009-08-19 Thread Steve Lianoglou

Hi,

On Aug 19, 2009, at 4:59 PM, Bogdan Tanasa wrote:

Hi, and my apologies for the following very naive question : I would  
like to

read a column of numbers in R and plot a histogram.

eg :

x<-read.table("txSTART");
y<-as.numeric(x);

and I do obtain the error : Error: (list) object cannot be coerced  
to type

'double'. Please could you let me know the way to fix it.


Yeah, you can't do that. What does x look like? Can you show us the  
result of:


R> head(x)

Assuming it's just a single column, you can access the numbers in the  
first column, like so


R> x[,1]

You can use that to plot a histogram of the numbers in the first column 
(hist() draws the plot itself, so there's no need to wrap it in plot()):

R> hist(x[,1])

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] Creating a list of combinations

2009-08-20 Thread Steve Murray

Dear R Users,

I have 120 objects stored in R's memory and I want to collect the names of 
these objects into one single object. The naming convention is month + year, 
in sequence, for all months between January 1986 and December 1995 
(e.g. Jan86, Feb86, Mar86... through to Dec95). I hope to pass all these names 
(and their data, I guess) to an object called file_list; however, I'm 
experiencing some problems whereby only the first (and possibly last) names 
seem to make the list, with the remainder recorded as 'NA' values.

Here is my code as it stands:

index <- expand.grid(month=month.abb, year=seq(from=86,to=95, by=1))

for (i in seq(nrow(index))) {
file_list <- paste(index$month[i], index$year[i], sep='')
print(file_list[i])
}

Output is as follows:

[1] "Jan86"
[1] NA
[1] NA
[1] NA
#[continues to row 120 as NA]

> file_list; file_list[i]
[1] "Dec95"
[1] NA

> head(index) # this seems to be working fine
  month year
1   Jan   86
2   Feb   86
3   Mar   86
4   Apr   86
5   May   86
6   Jun   86


Any help on how I can populate file_list correctly with all 120 combinations of 
month + year (without NAs!) would be gratefully received.
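For reference, paste() is vectorized, so all 120 names can be built in one call (the loop above also overwrites file_list on each pass instead of assigning to file_list[i]):

```r
index <- expand.grid(month = month.abb, year = seq(from = 86, to = 95, by = 1))
## paste() recycles across whole columns: all 120 names at once, no loop.
file_list <- paste(index$month, index$year, sep = "")
head(file_list)   # "Jan86" "Feb86" "Mar86" "Apr86" "May86" "Jun86"
```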

Thanks,

Steve






Re: [R] Several simple but hard tasks to do with R

2009-08-20 Thread Steve Jaffe

For history of both commands and output, consider running R inside emacs
using the ESS package and simply saving the buffer to a file.  If you save
the session as an "S transcript" file (extension .St) it is also easy to
reload and re-execute any part of it. Emacs or xemacs is available on most
platforms including Windows.
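Another option within R itself rather than ESS (a sketch using base R's sink(); note it diverts output only, not the commands themselves — scripts and history cover those):

```r
sink("session-output.txt", split = TRUE)  # output goes to the file AND the console
summary(cars)                             # ...run whatever whose output you want kept
sink()                                    # stop diverting
```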


Rakknar wrote:
> 
> "1. logs. help.search("history") and ?savehistory shows you that R does
> exactly what you want very easily (depending on the platform, which
> contrary
> to the posting guide's request, you did not tell us)."
> 
> I've already found out about the "history" tool but it was not useful
> because it only registers commands, not output from the commands. The
> commands are already stored in scripts (I use Tinn-R; I don't know if you
> would recommend another), and what I want to do is to store the commands AND
> the outputs from each. I use the Windows version by the way.
> 

-- 
View this message in context: 
http://www.nabble.com/Several-simple-but-hard-tasks-to-do-with-R-tp25052563p25064007.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Is there a construct for conditional comment?

2009-08-20 Thread Steve Jaffe

Why not

if ( 0 ) {
commented with zero
} else {
commented with one
}
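One caveat worth noting (this differs from the C preprocessor, which discards text before the compiler sees it): `if (FALSE)` disables evaluation, not parsing, so the skipped branch must still be syntactically valid R:

```r
x <- if (FALSE) {
    "branch disabled with FALSE"   # never evaluated, but must still parse as R
} else {
    "active branch"
}
x   # "active branch"
```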


Greg Snow-2 wrote:
> 
> I believe that #if lines for C++ programs is handled by the preprocessor,
> not the compiler.  So if you want the same functionality for R programs,
> it would make sense to just preprocess the R file.
> 
>> In C++, I can use the following construct to choice either of the two
>> blocks the comment but not both. Depending on whether the number after
>> "#if" is zero or not, the commented block can be chose. I'm wondering
>> if such thing is possible in R?
>> 
>> #if 0
>> commented with 0
>> #else
>> commented with 1
>> #endif
>> 
>> Regards,
>> Peng
>> 
> 

-- 
View this message in context: 
http://www.nabble.com/Is-there-a-construct-for-conditional-comment--tp25034224p25064798.html
Sent from the R help mailing list archive at Nabble.com.



[R] Using 'unlist' (incorrectly?!) to collate a series of objects

2009-08-20 Thread Steve Murray

Dear R Users,

I am attempting to write a new netCDF file (using the ncdf package), using 120 
grids I've created and which are held in R's memory.

I am reaching the point where I try to put the data into the newly created 
file, but receive the following error:

> put.var.ncdf(evap_file, evap_dims, unlist(noquote(file_list)))
Error in put.var.ncdf(evap_file, evap_dims, unlist(noquote(file_list))) : 
  put.var.ncdf: error: you asked to write 31104000 values, but the passed data 
array only has 120 entries!

I think I understand why this is: the 120 grids contain 31104000 values in 
total, however, it seems that only the names of the 120 objects are being 
passed to the file.

Earlier on in the script, I generated the file names using the following code:

> for (i in seq(nrow(index))) {
file_list[[i]] <- paste(index$month[i], index$year[i], sep='')
print(file_list[i])
}

I was hoping therefore, that when I do put.var.ncdf and use the 'unlist' 
function (see original section of code), that since the data associated with 
the names of the grids are held in memory, both the names *and data* would be 
passed to the newly created file. However, it seems that only the names are 
being recognised.

My question is therefore, is there an easy way of passing all 120 grids, using 
the naming convention held in file_list, to an object, which can subsequently 
be used in the put.var.ncdf statement?
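One hedged sketch: since file_list holds the names of the grids rather than the grids themselves, mget() can fetch the objects by name, and their values can then be flattened into the single numeric vector put.var.ncdf expects:

```r
## Fetch the 120 grid objects by name from the global environment, then
## flatten their values into one vector (sketch -- untested against ncdf).
grids <- mget(unlist(file_list), envir = globalenv())
all_values <- unlist(lapply(grids, as.vector), use.names = FALSE)
length(all_values)   # should now be 31104000, not 120
```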

Many thanks for any help,

Steve





Re: [R] LASSO: glmpath and cv.glmpath

2009-08-21 Thread Steve Lianoglou

Hi,

On Aug 21, 2009, at 9:47 AM, Peter Schüffler wrote:


Hi,

perhaps you can help me find out how to find the best lambda in 
a LASSO model.


I have a feature selection problem with 150 proteins potentially  
predicting Cancer or Noncancer. With a lasso model


fit.glm <- glmpath(x=as.matrix(X), y=target, family="binomial")

(target is 0/1 <- cancer/non-cancer, X the proteins, numerical 
expression values), I get the following path (PICTURE 1).
One of these models is the best, according to its cross-validation 
(PICTURE 2); the red line corresponds to the best cross-validation. 
It's produced by


cv <- cv.glmpath(x=as.matrix(X), y=unclass(T)-1, family="binomial",  
type ="response", plot.it=TRUE, se=TRUE)
abline(v= cv$fraction[max(which(cv$cv.error==min(cv$cv.error)))],  
col="red", lty=2, lwd=3)



Does anyone know, how to conclude from the Normfraction in PICTURE 2  
to the corresponding model in PICTURE 1? What is the best model?  
Which coefficients does it have? I can only see the best model's  
cross validation error, but not the actual model. How to see it?


None of your pictures came through, so I'm not sure exactly what  
you're trying to point out, but in general the cross validation will  
help you find the best value for lambda for the lasso. I think it's  
the value of lambda that you'll use for your downstream analysis.


I haven't used the glmpath package, but I have been using the glmnet  
package which is also by Hastie, newer, and I believe covers the same  
use cases as the glmpath library (though, to be honest, I'm not quite  
familiar w/ the cox proportions hazard model). Perhaps you might want  
to look into it.


Anyway, speaking from my experience w/ the glmnet package, you might 
try this:


1. Determine the best value of lambda using CV. I guess you can use  
MSE or R^2 as you see fit as your yardstick of "best."


2. Train a model over all of your data and ask it for the coefficients  
at the given value of lambda from 1.


3. See which proteins have non-zero coefficients.


4. Divine a biological story that is explained by your statistical  
findings


5. Publish.
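A sketch of steps 1–3 using the glmnet package mentioned above (hedged: adapt X and target to your own objects; cv.glmnet's lambda.min is the CV-best lambda):

```r
library(glmnet)
cvfit <- cv.glmnet(as.matrix(X), target, family = "binomial")  # step 1: CV
lambda.best <- cvfit$lambda.min                                # best lambda by CV
fit <- glmnet(as.matrix(X), target, family = "binomial")       # step 2: full-data fit
beta <- coef(fit, s = lambda.best)                             # coefficients at that lambda
rownames(beta)[as.vector(beta) != 0]                           # step 3: selected proteins
```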


I guess there are many ways to do model selection, and I'm not sure 
it's clear how effective they are (which isn't to say that you 
shouldn't do them)[1] ... you might want to further divide your 
data into training/tuning/test (somewhere between steps 1 and 2) as 
another means of scoring models.


HTH,
-steve

[1] http://hunch.net/?p=29

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Convert list to data frame while controlling column types

2009-08-21 Thread Steve Lianoglou

Hi Allie,

On Aug 21, 2009, at 11:47 AM, Alexander Shenkin wrote:


Hello all,

I have a list which I'd like to convert to a data frame, while
maintaining control of the columns' data types (akin to the colClasses
argument in read.table).  My numeric columns, for example, are getting
converted to factors by as.data.frame.  Is there a way to do this, or
will I have to do as I am doing right now: allow as.data.frame to  
coerce

column-types as it sees fit, and then convert them back manually?


This doesn't sound right ... are there characters buried in your  
numeric columns somewhere that might be causing this?


I'm pretty sure this shouldn't happen, and a small test case here goes  
along with my intuition:


R> a <- list(a=1:10, b=rnorm(10), c=LETTERS[1:10])
R> df <- as.data.frame(a)
R> sapply(df, is.factor)
a b c
FALSE FALSE  TRUE

Can you check to see if your data's wonky somehow?
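For contrast, a sketch of the "wonky data" case: one stray non-numeric entry turns a would-be numeric column into a factor (under the historical stringsAsFactors = TRUE default that applied at the time):

```r
b <- list(a = 1:3, b = c("1.5", "2.7", "oops"))   # "oops" poisons column b
df2 <- as.data.frame(b, stringsAsFactors = TRUE)
sapply(df2, is.factor)   # a is FALSE, b is TRUE
```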

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] extra .

2009-08-21 Thread Steve Lianoglou

Hi,

This is somehow unrelated, but your answer brings up a question that  
I've been curious about:


On Aug 21, 2009, at 11:48 AM, William Dunlap wrote:




-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of kfcnhl
Sent: Thursday, August 20, 2009 7:34 PM
To: r-help@r-project.org
Subject: [R] extra .


sigma0 <- sqrt((6. * var(maxima))/pi)

What does the '.' do here?


In R it does nothing: both '6' and '6.' parse as "numerics"
(in C, double precision numbers).  In SV4 and S+ '6' parses
as an integer (in C, a long) and '6.' parses as a numeric,
so putting the decimal point after numerics makes the
code a bit more portable, although there are not too many
cases where the difference is significant.   Calls to .C, etc.,
and overflowing arithmetic are the main problem points.

In R and S+ '6L' represents an integer.


If this is true, I'm curious as to why when I'm poking through some  
source code of different packages, I see that some people are careful  
to explicitly include the L after integers?


I can't find a good example at the moment, but you can see one such  
case in the source to the lm.fit function. The details aren't all that  
important, but here's one of the lines:


r2 <- if (z$rank < p)
(z$rank + 1L):p

I mean, why not just (z$rank + 1):p ?

Just wondering if anybody has any insight into that. I've always been  
curious and I seem to see it done in many different functions and  
packages, so I feel like I'm missing something ...


Thanks,

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] computation of matrices in list of list

2009-08-22 Thread Steve Lianoglou
Hi Kathie,

On Sat, Aug 22, 2009 at 1:03 PM, kathie  wrote:

Dear Gabor Grothendieck,
>
>
> thank you for your comments.
>
> I've already tried that, but I've got this error message.
>
>
> > Reduce("+",z)
> Error in f(init, x[[i]]) : non-numeric argument to binary operator
>
>
> anyway, thanks
>
> ps.
>
> > is.matrix(z[[1]][[1]])
> [1] TRUE
>
> I guess the reason "Reduce" doesn't work is that it has a multi-dimensional
> list...


Yeah, what about a double reduce:

# Make a list of matrix lists:
R> z <- list(list(matrix(rnorm(25),5), matrix(rnorm(25),5),
matrix(rnorm(25),5)),
list(matrix(rnorm(25),5), matrix(rnorm(25),5)),
list(matrix(rnorm(25),5), matrix(rnorm(25),5),
matrix(rnorm(25),5)))

# Reduce all matrices in the nested lists
R> r1 <- lapply(z, function(ms) Reduce("+", ms))
R> r1
[[1]]
   [,1]  [,2]   [,3]   [,4]   [,5]
[1,]  2.4884292  1.058375  1.3235864 -1.7800055  3.1095416
[2,] -0.5077567 -1.120329  1.8128142 -1.4255453  1.2478431
[3,] -3.7495272 -2.702159 -2.1013426 -2.1324515 -0.655
[4,] -2.7359066 -1.437341 -0.1735794  0.4892164 -1.1855285
[5,]  4.4842963 -2.312451 -0.6797429  0.4563329  0.2108545

[[2]]
  [,1][,2][,3]   [,4]   [,5]
[1,]  1.725147 -0.06565073  0.16204140 -0.4859336  1.0162852
[2,]  2.187191 -0.91075148 -2.37727477  1.1329259 -0.3917582
[3,]  1.471685  0.73675444  1.18658159 -0.7677262  1.5632101
[4,] -1.959942  0.51154059 -0.04049294 -1.3777180 -0.9919192
[5,]  0.609865 -2.04175553  1.01257051  1.3094908 -0.9437275

[[3]]
   [,1][,2]   [,3]   [,4]   [,5]
[1,]  0.7896427  1.25712740  1.4208904 -0.4634764  1.3859927
[2,] -0.1193923 -0.03666575 -0.9531145  3.4310667  0.8684956
[3,] -2.3761459  1.16104711 -0.4272411 -2.7792338 -0.3665312
[4,] -2.7372060  1.75061841 -0.6583626  0.6655959 -1.5374698
[5,] -0.5498145 -1.70883781  0.1796487 -0.7663076 -1.3042342

# Reduce that
R> Reduce("+", r1)
  [,1]   [,2]   [,3]   [,4]   [,5]
[1,]  5.003219  2.2498521  2.9065183 -2.7294155  5.5118195
[2,]  1.560042 -2.0677467 -1.5175750  3.1384472  1.7245805
[3,] -4.653988 -0.8043570 -1.3420022 -5.6794115  0.5077933
[4,] -7.433055  0.8248184 -0.8724350 -0.2229058 -3.7149176
[5,]  4.544347 -6.0630447  0.5124764  0.9995161 -2.0371072
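Equivalently, flattening the nesting one level first lets a single Reduce do the whole sum:

```r
## unlist(recursive = FALSE) turns the list-of-lists into one flat list of
## matrices, which a single Reduce("+") can then sum.
Reduce("+", unlist(z, recursive = FALSE))
```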

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help on comparing two matrices

2009-08-22 Thread Steve Lianoglou
Hi,

On Sat, Aug 22, 2009 at 2:45 PM, Michael Kogan wrote:

> Hi,
>
> I need to compare two matrices with each other. If you can get one of them
> out of the other one by resorting the rows and/or the columns, then both of
> them are equal, otherwise they're not. A matrix could look like this:
>
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
> [1,]    0    1    1    1    0    1    1    0
> [2,]    1    1    0    0    0    1    0    1
> [3,]    1    0    1    0    0    0    1    1
> [4,]    1    1    0    0    1    0    0    0
> [5,]    1    0    1    1    1    0    0    0
> [6,]    0    1    0    1    1    0    0    0
> [7,]    0    0    0    0    0    1    1    1
>
> Note that each matrix consists of ones and zeros, in each row and in each
> column there are at least three ones and one zero and each pair of
> rows/columns may have at most two positions where both are ones (e.g., for
> rows 1 and 2 those positions are 2 and 6).
>
> I was advised to sort both matrices in the same way and then to compare
> them element by element. But I don't manage to get them sorted... My
> approach is as following:
>
> 1. Sort the rows by their row sums (greater sums first).
> 2. Sort the columns by the first row (columns with ones in the first
> row go left, columns with zeros go right).
> 3. Save the left part (all columns with ones in the first row) and the
> right part in separate matrices.
> 4. Repeat steps 2 and 3 on both of the created matrices (now taking the
> second row for sorting); repeat until all fragments consist of a single
> column.
> 5. Compose the columns to a sorted matrix.
>
> This algorithm has several problems:
>
> 1. How to make a loop that branches into two subloops on each
> iteration?
> 2. How to organize the intermediate results and compose them without losing
> the order? Maybe save them in lists and sublists?
> 3. A fundamental problem: if there are rows with equal sums, the result may
> depend on which of them comes first in the sort. Maybe this algorithm won't
> work at all because of this problem?


Ouch, this seems like a real PITA.

If you want to go about this by implementing the algo you described, I think
you'd be best served by some divide-and-conquer/recursion route:

http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm

Perhaps you can take inspiration from some concrete sorting algorithms that
are implemented this way:

Merge sort: http://en.wikipedia.org/wiki/Merge_sort
Quick sort: http://en.wikipedia.org/wiki/Quicksort
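
If all you need is the equality-up-to-permutation test, a simpler (if
tie-fragile) variant of the idea is to sort both matrices into a canonical
form and compare. A rough sketch, with the big caveat the original poster
already raised: rows with equal sums leave ties that this does not resolve,
so it is only a heuristic (canon is a made-up helper name):

```r
# Canonicalize: order rows by row sums (descending), then order columns
# lexicographically by their entries top-to-bottom.
# NOTE: ties in the row sums can defeat this; it is a heuristic only.
canon <- function(m) {
  m <- m[order(rowSums(m), decreasing = TRUE), , drop = FALSE]
  m[, do.call(order, as.data.frame(t(m))), drop = FALSE]
}

a <- matrix(c(0, 1, 1, 0), 2)
b <- a[, 2:1]                   # same matrix with its columns swapped
identical(canon(a), canon(b))   # TRUE for this toy case
```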

Hope that helps,
-steve



Re: [R] Help on comparing two matrices

2009-08-22 Thread Steve Lianoglou
On Sat, Aug 22, 2009 at 2:45 PM, Michael Kogan  wrote:

>>
>> 1. Sort the rows by their row sums (greater sums first).
>> 2. Sort the columns by the first row (columns with ones in the first
>> row go left, columns with zeros go right).
>> 3. Save the left part (all columns with ones in the first row) and the
>> right part in separate matrices.
>> 4. Repeat steps 2 and 3 on both of the created matrices (now taking the
>> second row for sorting); repeat until all fragments consist of a single
>> column.
>> 5. Compose the columns into a sorted matrix.

> If you want to go about this by implementing the algo you described, I think
> you'd be best served by some divide-and-conquer/recursion route:

Starting from step 2, that is.

-steve


Re: [R] Help on comparing two matrices

2009-08-23 Thread Steve Lianoglou
Hi,

On Sun, Aug 23, 2009 at 4:14 PM, Michael Kogan wrote:
> Thanks for all the replies!
>
> Steve: I don't know whether my suggestion is a good one. I'm quite new to
> programming, have absolutely no experience and this was the only one I could
> think of. :-) I'm not sure whether I'm able to put your tips into practice,
> unfortunately I had no time for much reading today but I'll dive into it
> tomorrow.

Ok, yeah. I'm not sure what the best way to do this is myself. I would
first see if one could reduce these matrices in some principled manner
and then do a comparison, which might jump to:

> Ted: Wow, that's heavy reading. In fact the matrices that I need to compare
> are incidence matrices so I suppose it's exactly the thing I need, but I
> don't know if I have the basics knowledge to understand this paper within
> the next months.

Ted's sol'n. I haven't read the paper, but its title gives me an idea.
Perhaps you can treat the two matrices you are comparing as adjacency
matrices for a graph, then use the igraph library to do a graph
isomorphism test between the two graphs represented by your adjacency
matrices and see if they are the same.

This is probably not the most efficient (computationally) way to do
it, but it might be the quickest way out coding-wise.

I see your original example isn't using square matrices, and an
adjacency matrix has to be square. Maybe you can pad your matrices
with zero rows or columns (depending on what's deficient) as an easy
way out.

Just an idea.
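
The zero-padding step I have in mind would be something like this
(pad_square is a made-up helper name, and whether padding preserves the
comparison you care about is worth double-checking):

```r
# Pad a non-square 0/1 matrix with zero rows/columns until it is square.
pad_square <- function(m) {
  d <- max(dim(m))
  out <- matrix(0, d, d)
  out[seq_len(nrow(m)), seq_len(ncol(m))] <- m
  out
}

dim(pad_square(matrix(1, 7, 8)))  # 8 x 8
```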

Of course, if David's solution is what you need, then no need to
bother with any of this.

-steve



Re: [R] Help on comparing two matrices

2009-08-25 Thread Steve Lianoglou

Hi,

It looks like you're getting more good stuff, but just to follow up:

On Aug 24, 2009, at 4:01 PM, Michael Kogan wrote:
> Steve: The two matrices I want to compare really are graph matrices, just
> not adjacency but incidence matrices. There should be a way to get an
> adjacency matrix of a graph out of its incidence matrix but I don't know
> it...


If you're working with graph data, do yourself a favor and install
igraph (no matter what solution you end up using for this particular
problem):


http://cran.r-project.org/web/packages/igraph/
http://igraph.sourceforge.net/

In there, you'll find the `graph.incidence` function, which creates a
graph from its incidence matrix. You can then test whether the two
graphs are isomorphic.


That would look like so:

library(igraph)
g1 <- graph.incidence(matrix.1)
g2 <- graph.incidence(matrix.2)
is.iso <- graph.isomorphic(g1, g2)
# Or, using the (reportedly fast) vf2 algorithm
is.iso <- graph.isomorphic.vf2(g1, g2)

HTH,
-steve



Re: [R] Filtering matrices

2009-08-25 Thread Steve Lianoglou
Hi,

On Tue, Aug 25, 2009 at 10:11 PM, bwgoudey wrote:
>
> I'm using the rcorr function from the Hmisc library to get pair-wise
> correlations from a set of observations for a set of attributes. This
> returns 3 matrices; one of correlations, one of the number of observations
> seen for each pair and the final of the P values for each correlation seen.
>
> From these three matrices, all I wish to do is return a single matrix based
> on the first correlation matrix where each value is above a certain
> correlation, has a certain number of instances and has a P-value below some
> other threshold. My question is what is the nicest way of writing this sort
> of code?

Build a logical index from the matrices that carry your criteria, and
use it on the matrix that holds your values.

If you need more help, please provide three small example matrices and
let us know what you'd like your indexing to return. Someone will
provide the code to show you the correct way to do it.
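
Just to make the idea concrete, here is a sketch with stand-in matrices and
made-up thresholds; substitute the $r, $n and $P components returned by
rcorr() and your own cutoffs:

```r
set.seed(1)
r <- matrix(runif(9, -1, 1), 3)   # stands in for rcorr()$r (correlations)
n <- matrix(25, 3, 3)             # stands in for rcorr()$n (observations)
P <- matrix(runif(9), 3)          # stands in for rcorr()$P (p-values)

# One logical matrix encoding all three criteria at once
keep <- abs(r) > 0.5 & n >= 10 & P < 0.25

filtered <- r
filtered[!keep] <- NA             # mask entries failing any criterion
```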

HTH,
-steve


