[R] party for prediction [REPOST]

2012-10-12 Thread Ed
Apologies for re-posting, my original message seems to have been
overlooked by the moderators.

-- Forwarded message --
From: Ed 
Date: 11 October 2012 19:03
Subject: party for prediction
To: R-help@r-project.org


Hi there

I'm experiencing some problems using the party package (specifically
mob) for prediction. I have a real scalar y I want to predict from a
real valued vector x and an integral vector z. mob seemed the ideal
choice from the documentation.

The first problem I had was at some nodes in a partitioning tree, the
components of x may be extremely highly correlated or effectively
constant (that is x are not independent for all choices of components
of z). When the resulting fit is fed into predict() the result is NA -
this is not the same behaviour as models returned by say lm which
ignore missing coefficients. I have fixed this by defining my own
statsModel (myLinearModel - imaginative) which also ignores such
coefficients when predicting.
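
For reference, the lm behaviour I mean can be reproduced with a small sketch. The data and the names d, x1 and x2 are made up for illustration and have nothing to do with party or mob; x2 is an exact multiple of x1, so its coefficient is not identified.

```r
# Made-up data: x2 is perfectly collinear with x1
set.seed(1)
d <- data.frame(x1 = rnorm(20))
d$x2 <- 2 * d$x1
d$y <- 1 + 3 * d$x1 + rnorm(20, sd = 0.1)

fit <- lm(y ~ x1 + x2, data = d)
coef(fit)  # the coefficient of x2 is reported as NA (aliased)

# predict() drops the aliased term instead of returning NA;
# depending on the R version it may warn about a rank-deficient fit
p <- suppressWarnings(predict(fit, newdata = d))
anyNA(p)
```

A custom statsModel presumably has to do the same thing by hand: drop the NA coefficients and the matching columns of the design matrix before multiplying.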

The second problem I have is that I get "Cholesky not positive
definite" errors at some nodes. I guess this is because of numerical
error and degeneracy in the covariance matrix? Any thoughts on how to
avoid having this happen would be welcome; it is ignorable though for
now.

The third and really big problem I have is that when I apply mob to
large datasets (say hundreds of thousands of elements) I get a
"logical subscript too long" error inside mob_fit_fluctests. It's
caught in a try(), and mob just gives up and treats the node as
terminal. This is really hurting me though; with 1% of my data I can
get a good fit and a worthwhile tree, but with the whole dataset I get
a very stunted tree with a pretty useless prediction ability.

I guess what I really want to know is:
(a) has anyone else had this problem, and if so how did they overcome it?
(b) is there any way to get a line or stack trace out of a try()
without source modification?
(c) failing all of that, does anyone know of an alternative to mob
that does the same thing; for better or worse I'm now committed to
recursive partitioning over linear models, as per mob?
(d) failing all of this, does anyone have a link to a way to rebuild,
or locally modify, an R package (preferably windows, but anything
would do)?

Sorry for the length of this post. If I should RTFM, please point me
at any relevant manual by all means. I've spent a few days on this as
you can maybe tell, but I'm far from being an R expert.

Thanks for any help you can give.

Best wishes,

Ed

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] party for prediction [REPOST]

2012-10-12 Thread Ed
Sorry, my mistake, I didn't get a notification or see it sent. Thanks
for clearing that up.

Best wishes

Ed

On 12 October 2012 16:58, David Winsemius  wrote:
>
> On Oct 12, 2012, at 1:37 AM, Ed wrote:
>
>> Apologies for re-posting, my original message seems to have been
>> overlooked by the moderators.
>>
> No. Your original post _was_ forwarded to the list. On my machine it appeared 
> at October 11, 2012 11:03:08 AM PDT.  No one responded. It seems possible 
> that its lack of data or code is the reason for that state of affairs.
>
> --
> David.
>
>
> David Winsemius, MD
> Alameda, CA, USA
>



Re: [R] party for prediction [REPOST]

2012-10-14 Thread Ed
 such a strategy helps here.

I've considered using rpart() to partition into cells of constant
gradient, then fitting linear models myself to the cells. This is my
next thought. I'm pretty sure partitioning over linear regression is
the way forward for the data we have. I tried mars and glm but there
are good reasons to think they're less reasonable, even though the fit
wasn't particularly poor. I'm not particularly wedded to party's
approach except that it looked like it immediately returned what we
needed, and with some degree of "optimality" into the bargain.
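
A minimal sketch of that rpart-then-lm idea, on made-up data. Note that rpart() splits on the mean of the response, not on the regression slope, so the cells it finds are only a rough stand-in for the cells mob would choose. The names d, tree and fits, and the simulated data, are purely illustrative.

```r
library(rpart)

# Made-up data: the slope of y on x depends on the state variable z
set.seed(42)
n <- 400
d <- data.frame(x = runif(n), z = sample(0:3, n, replace = TRUE))
d$y <- ifelse(d$z < 2, 1 + 2 * d$x, 5 - 3 * d$x) + rnorm(n, sd = 0.1)

# Step 1: partition on the state variable only
tree <- rpart(y ~ z, data = d)

# Step 2: tree$where gives the leaf each observation falls into
d$leaf <- factor(tree$where)

# Step 3: fit a separate linear model inside each leaf
fits <- lapply(split(d, d$leaf), function(cell) lm(y ~ x, data = cell))

# One row of coefficients per leaf
t(sapply(fits, coef))
```

Predicting for new data would additionally need the leaf of each new observation, which this sketch does not cover.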

>> (b) is there any way to get a line or stack trace out of a try()
>> without source modification?
>
> Not sure, I don't know any off the top of my head.

I guess I really will have to bite the bullet and try to figure out
how to install modified libraries. Thanks.

>> (c) failing all of that, does anyone know of an alternative to mob
>> that does the same thing; for better or worse I'm now committed to
>> recursive partitioning over linear models, as per mob?
>
>
> If your partitioning variables are particularly simple (e.g., all binary)
> you could exploit that and it may be easier to write a custom function for
> your particular data. Then likelihood-ratio tests (rather than LM-type
> tests) would also be easier to apply in case of unidentified parameters.
>
> But if there are partitioning variables with different measurement scales,
> then this will not be that simple...

Unfortunately each partitioning variable is essentially a state
indicator, taking values say 0,...,R where R is different for each
component. I'm not a stats expert either; I've spent some time with
the party manuals and papers, but I wouldn't be confident of
implementing something like it in the time available to me (though if
I have to I will, but that wouldn't be a good situation to be in).

>> (d) failing all of this, does anyone have a link to a way to rebuild, or
>> locally modify, an R package (preferably windows, but anything would do)?
>
>
> Have a look at the "Writing R Extensions" manual and the R for Windows FAQ.

Will do.

Thank you very much for your responses, I really appreciate it.

Best wishes,

Ed



Re: [R] party for prediction [REPOST]

2012-10-14 Thread Ed
This was an exceptionally helpful answer, I can only thank you again.
I have plenty of avenues ahead where I was worried before I was
getting trapped in a dead end. If all else fails, the idea of using
anova is brilliant. Thank you!

Ed

On 14 October 2012 18:36, Achim Zeileis  wrote:
> On Sun, 14 Oct 2012, Ed wrote:
>
>> First up, thanks hugely for your response. I've been beating my head
>> against this!
>>
>> On 14 October 2012 16:51, Achim Zeileis  wrote:
>>>
>>> I'm not sure what you mean by "integral vector". If you want to apply the
>>> approach to hundreds of thousands of observations, I guess that these are
>>> categorical (maybe even binary?) but maybe not...
>>
>>
>> I'm sorry I can't go into the details of the data, I would if I could.
>> z are categorical variables represented as integers, mostly ordered,
>> but not all. I've tried fitting them as integers, as well as ordered,
>> but I don't think it made a huge difference.
>
>
> The tests performed for categorical partitioning variables are rather
> different from the tests for numerical partitioning variables. If all of the
> variables are categorical, this may not be immediately obvious, but the
> factor coding should be more appropriate (especially if the number of levels
> is small or moderate).
>
>
>>> If I recall correctly, we kept linearModel as simple as we did to save as
>>> much time as possible. This can be particularly important when one of the
>>> partitioning variables has many possible splits and the linearModel has
>>> to
>>> be fitted thousands of times.
>>
>>
>> I can appreciate that, but maybe having an alternative linearModel which
>> will predict when the fit is degenerate would be worth including? I'm happy
>> to contribute what I have, although it's pretty obvious stuff (and probably
>> done suboptimally since I'm not much of an R coder at this point). For me at
>> least, even with huge datasets, the speed of party is quite good; it's
>> getting a better result that's the problem.
>
>
> As I explained in my last e-mail, in your situation this does not solve the
> problem completely because the subsequent tests are also not adapted to
> this. Setting the empirical estimating functions to zero for non-identified
> coefficients might alleviate the problem but is not really a clean solution.
>
>
>>> Also, mob() assesses the stability of all coefficients of the model in
>>> all
>>> nodes during partitioning. If any of the coefficients is not identified,
>>> this would have to be excluded from all subsequent parameter stability
>>> tests
>>> in that node (and its child nodes). This is currently not provided for in
>>> mob().
>>
>>
>> Would pretending the coefficients were fit at 0 fool mob into doing
>> something moderately meaningful here?
>
>
> The coefficients are not looked at during fitting, only the estfun(). This
> would have to be set to 0.
>
>
>> If not, I would try to hack the code, but I'm honestly at something of
>> a loss as to how to modify it and feed the results back into my
>> interpreter. I have bytecode installed; I downloaded the source, but I
>> haven't squared the circle of modifying the source and installing the
>> result. I will check out the docs on writing extensions you suggest.
>
>
> Writing (or modifying) R packages and installing them under Windows is
> pretty standard and well documented. The pointers I gave you should
> hopefully get you started.
>
>
>>>> The second problem I have is that I get "Cholesky not positive definite"
>>>> errors at some nodes. I guess this is because of numerical error and
>>>> degeneracy in the covariance matrix? Any thoughts on how to avoid having
>>>> this happen would be welcome; it is ignorable though for now.
>>>
>>>
>>> This comes from the parameter stability tests and might be a result of an
>>> unidentified (or close to unidentified) model fit.
>>
>>
>> This is a great help to know. I improved my results quite considerably
>> with aggressive scaling of everything (scaling the response and all
>> the predictors to lie between 0 and 1). That deepened my tree by a
>> factor of two or so (say depth 3 to 7) and improved the quality of fit
>> substantially. Is there any way I can engage a more numerically robust
>> Cholesky in mob?
>
>
> No, I don't think that this is conceivable with the way this is implemented
> at the moment. Instead o

[R] Changing a for loop to a function using sapply

2012-10-21 Thread Ed
Apparently there are one or more concepts that I do not fully understand
in the descriptions of functions and the apply material.  I have
been reading the mail from this forum and have learned much but, in this
case, what I have been reading here and in the manual isn't enough.
The following code produces what I want with the for loop.  From what I
have read on this forum, a for loop is not necessarily the best path,
so I tried to create a function to do the same work.


Using the following 64 bit version on Windows 7 Dell laptop
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-mingw32/x64 (64-bit)



Below is the part that works.

# The following lines create a string of nucleotides and use a for loop
# to create multiple strings.

# random.string replicates NA a random number of times, set by the rs
# sampling criterion.

random.string <- rep(NA, rs <- sample(3:18, 1, replace = TRUE))

# The randomizeString function samples 3 members of the DNAnucleotides
# vector at a time, placing the result in "a".

randomizeString <- function(x) {
  DNAnucleotides <- c("a", "c", "g", "t")
  a <- sample(DNAnucleotides, 3, replace = TRUE)
  return(a)
}

# The following paste output uses random.string to set the number of
# times the function randomizeString selects a triplet from
# DNAnucleotides, creating a text string of a sequence of nucleotides.
# collapse = "" joins the triplets without a separator, producing one
# long string when the result is printed by paste.

paste(c(sapply(random.string, randomizeString, simplify = TRUE), ""),
      collapse = "")


# The for loop uses the paste output to create multiple random-length
# nucleotide strings which can be printed to a file.

DNA <- character(20)
for (i in 1:20) DNA[i] <- paste(c(sapply(rep(NA, rs <- sample(3:21, 1, replace = TRUE)),
    randomizeString, simplify = TRUE), ""), collapse = "")
DNA

Rowname <- c(1:20)     # provides row numbers to be used with the sequences produced
Arrow <- rep(">", 20)  # provides a vector of ">" to separate the row numbers and sequences


# DNAout uses a for loop to combine the vectors to create one string
# vector of sequences.

DNAout <- character(20)
for (j in 1:20) DNAout[j] <- paste(Rowname[j], " ", Arrow[j], " ", DNA[j], collapse = "")

DNAout
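
For comparison, the generating loop can be written without an explicit for loop. This is only a sketch: randomDNA and lengths_in_triplets are hypothetical names, and the helper samples all nucleotides of one string in a single call instead of triplet by triplet, which gives strings of the same lengths.

```r
# Hypothetical helper: one random string made of n_triplets nucleotide triplets
randomDNA <- function(n_triplets) {
  paste(sample(c("a", "c", "g", "t"), 3 * n_triplets, replace = TRUE),
        collapse = "")
}

set.seed(123)

# One random length (counted in triplets) per string
lengths_in_triplets <- sample(3:21, 20, replace = TRUE)

# vapply() calls randomDNA once per length and checks that each result
# is a single character string
DNA <- vapply(lengths_in_triplets, randomDNA, character(1))
DNA
```

vapply() is used here instead of sapply() only because it states the expected result type up front.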



Here is what I have tried in attempts to create a function to replicate 
the results of the for loop above.

This one comes close.

#This repeats the above script without the comments.
##
options(stringsAsFactors=FALSE)
DNA <- character(20)
randomizeString <- function(x) {
  DNAnucleotides <- c("A","C","G","T")
  a <- sample(DNAnucleotides, 3, replace = TRUE)
  return(a)
}
for (i in 1:20) DNA[i] <- paste(c(sapply(rep(NA, rs <- sample(3:18, 1, replace = TRUE)),
    randomizeString, simplify = TRUE), ""), collapse = "")
DNA
Rowname <- c(1:20)
Arrow <- rep(">", 20)

DNAout <- character(20)
for (j in 1:20) DNAout[j] <- paste(Rowname[j], " ", Arrow[j], " ", DNA[j], collapse = "")

DNAout
###
##The following works partially
DNAoutc <- class("character")
DNAoutc <- function(x,y,z){sapply(x, paste(x," ",y," ",z,"\n", collapse = ""))}

DNAoutc(Rowname,Arrow,DNA)

Error in get(as.character(FUN), mode = "function", envir = envir) :
  object '1   >   ACAAACAATGAGGTCCGCCGGATGAAGCTG
2   >   CAAACCTCGTGCAAAGGTGCTTCATGGTAAATCCGTTTAGCCGGGAAAGT
3   >   TACATCGAAGCTCGTTGAAG
4   >   CGTCAACATGAACAAATGACATCCAGACGCACGCTGTAA
5   >   CATTTAACCCTTGGTGTGATG
6   >   AAGTATGAGTGGGCCTTGGGTTCTGGCTCCCACGCGTTGTGC
7   >   AGTTCCCGCAAACTGATACTGATCAGCACTTAGAGACCGCCACTATCAGTT
8   >   AATAATGCATGCTAGGCAGCCCGCTCGACCATTAGGGATAGAGCT
9   >   GACATCAAGTCATAGGTT
10   >   CAGAACAATATACACGTT
11   >   CGCAACCATCTACACTGCGTT
12   >   GTGAACTGAGGTATGACCGGTGGATAATAACGGGACC
13   >   TAGCAACATGAGTGCCTCAGGTTGTCGTTCAATAAACTCGGGAAG
14   >   GCGATGATCCGCTTATAGCATGGACAAAGCAACGTTCTGTCGTCGGATTC
15   >   AGCATGTTAGCAATTTG
16   >   ACTAGTTCTGCCGTCATTTCAATG
17   >   ATTCTTCCCTTG
18   >   CATCTCGATTCTTTCTTACAATGT
19   >   ATAGATACCTTGGTCAAATAATCGTTTCAAGGT
20   >   TGGATAATAGCGGATAC
' of mode 'function' was not found
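
A note on this error, with a small sketch on made-up sequences: sapply() expects a function as its second argument, but paste(...) is evaluated first and hands sapply() one long character string, which R then tries to look up as a function name. The three-element vectors below are illustrative stand-ins for Rowname, Arrow and DNA.

```r
Rowname <- 1:3
Arrow <- rep(">", 3)
DNA <- c("ACG", "TTGACA", "GGCTTA")

# Option 1: paste() is already vectorised over its arguments,
# so no apply function is needed at all
DNAout <- paste(Rowname, Arrow, DNA)

# Option 2: mapply() walks several vectors in parallel and calls the
# function once per position
DNAout2 <- mapply(function(x, y, z) paste(x, y, z), Rowname, Arrow, DNA)

DNAout
```

Both options separate the pieces with single spaces; the extra " " arguments in the loop version would widen the gaps.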

My other attempts give errors, and I cannot figure out what I am missing
to correct them.

Below are a few of the failed attempts.
#

mode(DNAoutf)<-("function")
DNAoutf <-  sapply(x,function(x,y,z){paste(x," ",y," ",z,"\n", collapse 
= "" )})

DNAoutf(Rowname,Arrow,DNA)

> mode(DNAoutf)<-("function")
Error in mode(DNAoutf) <- ("function") : object 'DNAoutf' not found
> DNAoutf<- function(x,y,z) {sapply(x,y,z),paste(x," ",y," ",z,"\n", 
collapse = "" ))}

Error: unexpected ',' in "DNAoutf<- function(x,y,z) {sapply(x,y,z),"
> DNAoutf(R

[R] maximum likelihood using nlm to estimate 4 variables

2011-06-27 Thread ED
Hi, I need help.

I am new to R and am having problems estimating the parameters of a 3-stage
constrained function.

I have constructed the code below, and my data are two columns, R_j and
R_m (sample given below). R_j and R_m represent the dependent and
independent variables respectively. The parameters al_j, au_j, b_j, and
sigma_j need to be estimated, and I have no initial estimates for them.

llik=function(R_j,R_m)
{

LF=if(R_j<
0)sum[ln(1/(2*pi*(sigma_j^2)))-(1/(2*(sigma_j^2))*(R_j+al_j-b_j*R_m))^2] +
if(R_j>
0)sum[ln(1/(2*pi*(sigma_j^2)))-(1/(2*(sigma_j^2))*(R_j+au_j-b_j*R_m))^2] +

if(R_j==0)sum[(ln(%pnorm((au_j-b_j*R_m)/sigma_j)-%pnorm((al_j-b_j*R_m)/sigma_j)))]
}
est.nlm = nlm(llik,0) # not sure what to put for the 4 initial estimates so I just put 0
est.nlm$estimate

Sample Data
R_j      R_m
0.002    0.026567295
0.003    -0.009798475
0.05     0.008497274
-0.01    0.012464578
-0.0009  0.002896023
0.09     0.000879473
0.01     -0.003194435
0.0006   0.010281122

I would appreciate it if you could help me modify my code to get my
estimates, or suggest a better method to use.
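
One possible shape for such a function, as a sketch only. It assumes the formula above is meant as a normal log-density for the non-zero returns and a normal probability mass for the zero returns. The name negloglik, the log scale for sigma_j (to keep it positive) and the starting values are my own choices, not anything prescribed by nlm. With only eight observations the estimates mean little; the point is the structure: one parameter vector in, one number out.

```r
# Sample data from above
R_j <- c(0.002, 0.003, 0.05, -0.01, -0.0009, 0.09, 0.01, 0.0006)
R_m <- c(0.026567295, -0.009798475, 0.008497274, 0.012464578,
         0.002896023, 0.000879473, -0.003194435, 0.010281122)

# Negative log-likelihood; par = c(al_j, au_j, b_j, log(sigma_j))
negloglik <- function(par, R_j, R_m) {
  al <- par[1]
  au <- par[2]
  b <- par[3]
  sigma <- exp(par[4])

  neg <- R_j < 0
  pos <- R_j > 0
  zero <- R_j == 0

  # Non-zero returns: normal log-density of the shifted residual
  ll_neg <- dnorm(R_j[neg] + al - b * R_m[neg], sd = sigma, log = TRUE)
  ll_pos <- dnorm(R_j[pos] + au - b * R_m[pos], sd = sigma, log = TRUE)

  # Zero returns: probability of landing between the two thresholds
  p_zero <- pnorm((au - b * R_m[zero]) / sigma) -
    pnorm((al - b * R_m[zero]) / sigma)

  -(sum(ll_neg) + sum(ll_pos) + sum(log(p_zero)))
}

# nlm() minimises, so it gets the negative log-likelihood and one
# starting value per parameter; extra arguments are passed on to the function
start <- c(0, 0, 1, log(sd(R_j)))
est <- suppressWarnings(nlm(negloglik, start, R_j = R_j, R_m = R_m))

c(al_j = est$estimate[1], au_j = est$estimate[2],
  b_j = est$estimate[3], sigma_j = exp(est$estimate[4]))
```

The main differences from the code above are that the parameters are the first argument of the function, that R uses log() and round brackets where the code has ln and sum[...], and that nlm() needs four starting values, not a single 0.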

Thank you in advance

Edward
Student: Institute of Actuaries








[R] A list of data frames and a list of colnames.

2017-10-23 Thread Ed Siefker
I have a list of file names, and a list of data frames contained in those files.

mynames <- list.files()
mydata <- lapply(mynames, read.delim)

Every file contains two columns.

> colnames(mydata[[1]])
[1] "Name" "NumReads"
> colnames(mydata[[2]])
[1] "Name" "NumReads"

I can set the colnames easily enough with a for loop.

for (i in seq_along(mynames)) {
colnames(mydata[[i]])[2] <- mynames[i]
}

Is there a nicer way to do this?
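
One candidate, sketched on stand-in data since the real files are not available: Map() walks the list of data frames and the vector of names in parallel. The two small data frames and file names below are made up for illustration.

```r
# Stand-ins for the result of list.files() and lapply(mynames, read.delim)
mynames <- c("sampleA.txt", "sampleB.txt")
mydata <- list(
  data.frame(Name = c("g1", "g2"), NumReads = c(10, 20)),
  data.frame(Name = c("g1", "g2"), NumReads = c(5, 7))
)

# Rename the second column of each data frame to the matching file name;
# the function must return the modified data frame
mydata <- Map(function(df, nm) {
  colnames(df)[2] <- nm
  df
}, mydata, mynames)

colnames(mydata[[1]])
colnames(mydata[[2]])
```

Whether this is nicer than the for loop is a matter of taste; the loop is arguably just as readable.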

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] as.data.frame doesn't set col.names

2017-10-24 Thread Ed Siefker
Why doesn't this work?

> samples$geno <- as.data.frame(sapply(yo, toupper), col.names="geno")
> samples
                      quant_samples   age sapply(yo, toupper)
E11.5 F20het BA40 E11.5 F20het BA40 E11.5              F20HET
E11.5 F20het BA45 E11.5 F20het BA45 E11.5              F20HET



Re: [R] as.data.frame doesn't set col.names

2017-10-24 Thread Ed Siefker
Wait.  Now I'm really confused.

>
> head(samples)
                          quant_samples   age sapply(yo, toupper)
E11.5 F20het BA40     E11.5 F20het BA40 E11.5              F20HET
E11.5 F20het BA45     E11.5 F20het BA45 E11.5              F20HET
E11.5 F20het BB84     E11.5 F20het BB84 E11.5              F20HET
E11.5 F9.20DKO KTr3 E11.5 F9.20DKO KTr3 E11.5            F9.20DKO
E11.5 F9.20DKO PEd2 E11.5 F9.20DKO PEd2 E11.5            F9.20DKO
E11.5 F9.20DKO j0J1 E11.5 F9.20DKO j0J1 E11.5            F9.20DKO
> colnames(samples)
[1] "quant_samples" "age"   "geno"

Really, really confused.
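
A possible explanation, sketched on made-up data and assuming yo is a character vector: assigning a whole data frame to samples$geno nests it as a single column. The outer column is called "geno", which is what colnames() reports, while printing shows the column name of the inner data frame.

```r
samples <- data.frame(age = c("E11.5", "E11.5"))
yo <- c("F20het", "F9.20DKO")

# Assigning a data frame to one column nests it inside the outer data frame
nested <- samples
nested$geno <- as.data.frame(sapply(yo, toupper))
colnames(nested)            # the outer name is "geno"
is.data.frame(nested$geno)  # but the column itself is a data frame
colnames(nested$geno)       # the inner name, which printing displays

# Assigning the plain character vector gives an ordinary column;
# toupper() is vectorised, so sapply() is not needed
samples$geno <- toupper(yo)
samples
```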

On Tue, Oct 24, 2017 at 12:58 PM, Ed Siefker  wrote:
> Why doesn't this work?
>
>> samples$geno <- as.data.frame(sapply(yo, toupper), col.names="geno")
>> samples
>                       quant_samples   age sapply(yo, toupper)
> E11.5 F20het BA40 E11.5 F20het BA40 E11.5              F20HET
> E11.5 F20het BA45 E11.5 F20het BA45 E11.5              F20HET



[R] googlesheets gs_reshape_cellfeed()

2017-11-01 Thread Ed Siefker
I have a google spreadsheet with a column of hyperlinks I want the URL from.
The googlesheets package can return this information with gs_read_cellfeed(),
but it needs to be reshaped with gs_reshape_cellfeed().  Problem is,
gs_reshape_cellfeed() returns the 'value' of the cells, not the
'input_value' making
it exactly like gs_read().

How do I extract input_value from a cell feed in a convenient format? I want a
data frame that looks exactly like the output of gs_read(), except returning
'input_value' instead of 'value'.



[R] ggplot inside function doesn't plot

2017-11-02 Thread Ed Siefker
I have a function:

myplot <- function (X) {
    d <- plotCounts(dds2, gene=X, intgroup="condition", returnData=TRUE)
    png(paste("img/", X, ".png", sep=""))
    ggplot(d, aes(x=condition, y=count, color=condition)) +
        geom_point(position=position_jitter(w=0.1,h=0)) +
        scale_y_log10(breaks=c(25,100,400)) +
        ggtitle(X) +
        theme(plot.title = element_text(hjust = 0.5))

    dev.off()
}

'd' is a dataframe

                             count      condition
E11.5 F20HET BA40_quant   955.9788   E11.5 F20HET
E11.5 F20HET BA45_quant   796.2863   E11.5 F20HET
E11.5 F20HET BB84_quant   745.0340   E11.5 F20HET
E11.5 F9.20DKO YEH3_quant 334.2994 E11.5 F9.20DKO
E11.5 F9.20DKO fkm1_quant 313.7307 E11.5 F9.20DKO
E11.5 F9.20DKO zzE2_quant 349.3313 E11.5 F9.20DKO

If I set X="Etv5" and paste the contents of the function into R, I get
'img/Etv5.png'
If I run myplot(X), I get nothing.


> X
[1] "Etv5"
> list.files("img")
character(0)
> myplot(X)
null device
  1
> list.files("img")
character(0)
> d <- plotCounts(dds2, gene=X, intgroup="condition", returnData=TRUE)
> png(paste("img/", X, ".png", sep=""))
> ggplot(d, aes(x=condition, y=count, color=condition)) +
+ geom_point(position=position_jitter(w=0.1,h=0)) +
+ scale_y_log10(breaks=c(25,100,400)) +
+ ggtitle(X) +
+ theme(plot.title = element_text(hjust = 0.5))
> dev.off()
null device
  1
> list.files("img")
[1] "Etv5.png"

Why doesn't my function work?



Re: [R] ggplot inside function doesn't plot

2017-11-02 Thread Ed Siefker
I don't really understand. I mean, I understand the solution is
print(ggplot(...)).  But why is that required in a function and not at
the console?

Shouldn't I be able to rely on what I do at the console working in a
script?  Is this inconsistent behavior by design?
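
To make the difference concrete for myself, here is a sketch with a stand-in class. demo_plot is made up; it only mimics the fact that a ggplot object is drawn by its print method. At the console, R prints the value of each top-level expression automatically; inside a function, a value that is not returned is simply discarded.

```r
# A stand-in class whose print method does the "drawing", as ggplot objects do
demo_plot <- function(title) structure(list(title = title), class = "demo_plot")
print.demo_plot <- function(x, ...) {
  cat("drawing", x$title, "\n")
  invisible(x)
}

# The object is created, but nothing ever asks for it to be printed
silent <- function() {
  demo_plot("Etv5")
  invisible(NULL)
}

# An explicit print() call triggers the drawing
drawn <- function() {
  print(demo_plot("Etv5"))
  invisible(NULL)
}

out_silent <- capture.output(silent())  # nothing is captured
out_drawn <- capture.output(drawn())    # the "drawing" line is captured
```

In myplot() the ggplot call sits in the position of demo_plot("Etv5") in silent(): its value is thrown away before dev.off() runs, so the png device is closed without anything having been drawn on it.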



On Thu, Nov 2, 2017 at 11:54 AM, David Winsemius  wrote:
>
>> On Nov 2, 2017, at 9:27 AM, Ed Siefker  wrote:
>>
>
> `ggplot` creates an object. You need to print it when used inside a function. 
> Inside a function (in a more restricted environment) there is no 
> parse-eval-print-loop.
>
>
>>
>
> David Winsemius
> Alameda, CA, USA
>
> 'Any technology distinguishable from magic is insufficiently advanced.'   
> -Gehm's Corollary to Clarke's Third Law
>
>
>
>
>



[R] drc, ggplot2, and gridExtra

2018-05-18 Thread Ed Siefker
I have dose response data I have analyzed with the 'drc' package.
Using plot() works great.  I want to arrange my plots and source
data on a single page.  I think 'gridExtra' is the usual package for
this.

I could use plot() and par(mfrow=...), but then I can't put the source
data table on the page.

gridExtra provides grid.table() which makes nice graphical tables. It
doesn't work with par(mfrow=...), but has the function grid.arrange()
instead.

Unfortunately, grid.arrange() doesn't accept plot(). It does work with
qplot() from 'ggplot2'.  Unfortunately, qplot() doesn't know how to
deal with data of class drc.

I'm at a loss on how to proceed here.  Any thoughts?



[R] Exporting to text files

2018-05-18 Thread Ed Siefker
I have dose response data analyzed with the package 'drc'.
'summary(mymodel)' prints my kinetic parameters.  I want
that text in an ASCII text file.  I want to get exactly what I
would get if I copied and pasted from the terminal window.

I've read the documentation on data export to text files here:
https://cran.r-project.org/doc/manuals/r-release/R-data.html#Export-to-text-files

write() does not work.

> summary(mymodel)

Model fitted: Michaelis-Menten (2 parms)

Parameter estimates:

              Estimate Std. Error t-value  p-value
d:(Intercept)  213.435     67.094  3.1811 0.009801 **
e:(Intercept)   94.493     59.579  1.5860 0.143820
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error:

 22.03492 (10 degrees of freedom)
> write(summary(mymodel), "kinetics.txt")
Error in cat(x, file = file, sep = c(rep.int(sep, ncolumns - 1), "\n"),  :
  argument 1 (type 'list') cannot be handled by 'cat'

If I unlist() the summary first:
> write(unlist(summary(mymodel)), "kinetics.txt")
I get the following contents of "kinetics.txt":

485.537711262143
4501.62443636671
3821.31920509004
3821.31920509004
3549.67055527084
213.435401944579
94.4931993582911
67.0941460663053
59.5791117361684
3.18113299681396
1.58601222147673
0.00980057624097692
0.143819823442402
MM.2()
continuous
10
4.63571040101587
3.93514151059103
3.93514151059103
3.65540149913749
Michaelis-Menten
2
22.0349202690217
10


How do I get the output of 'summary(mymodel)' verbatim? Why doesn't it
work the way I think it does? What documentation should I read to
understand what's going on here?
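
A sketch of one approach, using an lm fit on the built-in cars data as a stand-in for the drc model: a summary object is a list, and the text seen at the console is produced by its print method. capture.output() collects that printed text as a character vector, which writeLines() can then save. The file name and location are arbitrary.

```r
fit <- lm(dist ~ speed, data = cars)  # stand-in for the drc model
out_file <- file.path(tempdir(), "kinetics.txt")

# capture.output() returns what print() would show, one element per line
txt <- capture.output(summary(fit))
writeLines(txt, out_file)

# The file now holds the same text as the console output
readLines(out_file)
```

sink() is the other base R tool for this: it redirects everything printed to the console into a file until it is switched off again.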



[R] How to alpha entire plot?

2018-05-31 Thread Ed Siefker
I have two chromatograms I want plotted on the same axes.
I would like the plots to be transparent, so the first chart is
not obscured.

I have tried adjustcolor(..., alpha.f=0.3), but the problem is that
my chromatogram is so dense with data points that they
overlap and the entire graph just ends up a solid color.  The
second plot still obscures the first.

Consider this example:


col1 <- adjustcolor("red", alpha.f=0.3)
col2 <- adjustcolor("blue", alpha.f=0.3)
EU <- data.frame(EuStockMarkets)
with(EU, plot(DAX, CAC, col=col2, type="h", ylim=c(0,6000)))
par(new=TRUE)
with(EU, plot(DAX, FTSE, col=col1, type="h", ylim=c(0,6000)))

The density of the red plot around 2000 completely obscures the blue
plot behind it.

What I would like to do is plot both plots in solid colors, then alpha
the entire thing, and then overlay them.  Or some other method that
achieves a comparable result.
Thanks
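
One idea I have sketched but am not sure about: draw each series as a single filled polygon instead of many overlapping type="h" lines. A polygon is filled once, so its transparency cannot accumulate within one series, and only the overlap between the two series gets darker. The outline helper is my own, the polygon only approximates the shape of the spikes, and the pdf device is used because it supports semi-transparent colors.

```r
EU <- data.frame(EuStockMarkets)
o <- order(EU$DAX)
x <- EU$DAX[o]

# Closed outline: along the sorted curve, then back along the baseline y = 0
outline <- function(y) {
  list(x = c(x, max(x), min(x)), y = c(y[o], 0, 0))
}

out_file <- file.path(tempdir(), "overlay.pdf")
pdf(out_file)

# Empty plot with the same axes as the example above
plot(range(x), c(0, 6000), type = "n", xlab = "DAX", ylab = "CAC / FTSE")

# Each polygon is painted once with a semi-transparent fill
polygon(outline(EU$CAC), col = adjustcolor("blue", alpha.f = 0.3), border = NA)
polygon(outline(EU$FTSE), col = adjustcolor("red", alpha.f = 0.3), border = NA)

dev.off()
```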



[R] Boxplot, formula interface, and labels.

2017-09-28 Thread Ed Siefker
I have data I'd like to plot using the formula interface to boxplot.
I call boxplot like so:

with(mydata, boxplot(count ~ geno * tissue))

I get a boxplot with x axis labels like "wt.kidney".  I would like
to change the '.' to a newline.  Where is this separator configured?

Thanks,
-Ed



Re: [R] Boxplot, formula interface, and labels.

2017-09-28 Thread Ed Siefker
Another way to think of this problem: if I could get my hands on the
vector of names boxplot() is creating, I could use gsub() to replace
'.' with '\n'.

Is there something I could run before boxplot() that would give me
that vector of names, which I could then pass to boxplot()?



[R] Fwd: Boxplot, formula interface, and labels.

2017-09-28 Thread Ed Siefker
I knew I was making this harder than it needed to be.  I see it now in ?boxplot
Thanks!

On Thu, Sep 28, 2017 at 12:30 PM, David L Carlson  wrote:
> Just change the separator:
>
> data(Titanic)
> Titanic.df <- as.data.frame(Titanic)
> boxplot(Freq~Class*Sex, Titanic.df, cex.axis=.6, sep="\n")
>
> See attached .png.
>
> 
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ista Zahn
> Sent: Thursday, September 28, 2017 12:27 PM
> To: Ed Siefker 
> Cc: r-help 
> Subject: Re: [R] Boxplot, formula interface, and labels.
>
> mybp <- boxplot(count ~ geno * tissue, data = mydata, plot = FALSE) 
> mybp$names <- gsub("\\.", "\n", mybp$names)
> bxp(mybp)
>
> See ?boxplot for details.
>
> Best,
> Ista

[R] data.matrix output is not numeric

2017-10-13 Thread Ed Siefker
I have a data frame full of integer values.  I need a matrix full of
numeric values.

?data.matrix reads:

 Return the matrix obtained by converting all the variables in a
 data frame to numeric mode and then binding them together as the
 columns of a matrix.

This does not work.

> test.df <- data.frame(a=as.integer(c(1,2,3)), b=as.integer(c(4,5,6)))
> class(test.df[[1,1]])
[1] "integer"
> class(data.matrix(test.df)[[1]])
[1] "integer"

What's going on here?
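
One possible reading, offered as a sketch rather than a definitive answer: in R, "numeric mode" covers both integer and double storage, so an all-integer data frame can legitimately come back as an integer matrix. The example below shows the difference between mode() and typeof(), and one way to force double storage; the name m_dbl is illustrative only.

```r
test.df <- data.frame(a = as.integer(c(1, 2, 3)), b = as.integer(c(4, 5, 6)))
m <- data.matrix(test.df)

mode(m)     # "numeric": the mode the help page talks about
typeof(m)   # "integer": the storage type that class() reports per element

# Change the storage type in place; dim and dimnames are kept
m_dbl <- m
storage.mode(m_dbl) <- "double"
typeof(m_dbl)   # "double"
```

If a double matrix is what downstream code needs, changing storage.mode() avoids rebuilding the matrix by hand.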



[R] reading data into nested frames

2016-06-02 Thread Ed Siefker
I have many data files named like this:

E11.5-021415-dko-1-1-masked-bottom-area.tsv
E11.5-021415-dko-1-1-masked-top-area.tsv
E11.5-021415-dko-1-2-masked-bottom-area.tsv
E11.5-021415-dko-1-2-masked-top-area.tsv
E11.5-021415-dko-1-3-masked-bottom-area.tsv
E11.5-021415-dko-1-3-masked-top-area.tsv

age-date-genotype-num-slicenum-filler-position-data

An individual sample is an age-date-geno-num, each sample has two
parts, and is composed of around 10 slices.  Each row of the tsv is an
area which will be summed for the total area.

What I want is a dataframe, with a row for each sample and a column
for bottom and top.  Under bottom and top, I want each element to be a
dataframe with a row for each slice and a column for the area.

So I can lapply over this list of files, use strsplit to pull out the
slice num and put the area into the correct row of a dataframe easily
enough.  But I have a line for every datapoint, not sample, and there
would be a dataframe for each area.

How can I merge all the data for the slices into one data frame?  Does
this make sense?
Thanks
-Ed
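
A minimal sketch of one alternative to nested data frames, assuming a "long" table with one row per file is acceptable: parse the filename fields once, attach the per-file total area, then tabulate by sample and position. The area values below are hypothetical placeholders standing in for the sums read from each .tsv, and the column names are taken from the field description above.

```r
files <- c("E11.5-021415-dko-1-1-masked-bottom-area.tsv",
           "E11.5-021415-dko-1-1-masked-top-area.tsv",
           "E11.5-021415-dko-1-2-masked-bottom-area.tsv",
           "E11.5-021415-dko-1-2-masked-top-area.tsv",
           "E11.5-021415-dko-1-3-masked-bottom-area.tsv",
           "E11.5-021415-dko-1-3-masked-top-area.tsv")

# Split each name (minus the extension) into its 8 dash-separated fields
parts <- do.call(rbind, strsplit(sub("\\.tsv$", "", files), "-", fixed = TRUE))
meta <- as.data.frame(parts, stringsAsFactors = FALSE)
names(meta) <- c("age", "date", "genotype", "num",
                 "slice", "filler", "position", "data")

# A sample is age-date-genotype-num
meta$sample <- paste(meta$age, meta$date, meta$genotype, meta$num, sep = "-")

# Hypothetical per-file total areas; in practice these would come from the files
meta$area <- c(10, 20, 30, 40, 50, 60)

# One row per sample, one column per position, summed over slices
totals <- tapply(meta$area, list(meta$sample, meta$position), sum)
totals
```

Keeping the slice column in meta means the per-slice values remain available without nesting one data frame inside another.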



[R] merging dataframes in a list

2016-06-03 Thread Ed Siefker
I have a list of data as follows.

> list(data.frame(name="sample1", red=20), data.frame(name="sample1", green=15),
+      data.frame(name="sample2", red=10), data.frame(name="sample2", green=30))
[[1]]
     name red
1 sample1  20

[[2]]
     name green
1 sample1    15

[[3]]
     name red
1 sample2  10

[[4]]
     name green
1 sample2    30


I would like to massage this into a data frame like this:

     name red green
1 sample1  20    15
2 sample2  10    30


I'm imagining I can use aggregate(mylist, by=samplenames, merge)
right?  But how do I get the list of samplenames?  How do I subset
each dataframe inside the list?



Re: [R] merging dataframes in a list

2016-06-03 Thread Ed Siefker
I manually constructed the list of sample names and tried the
aggregate call I mentioned.
Merge works when called manually, but not when using aggregate.

> mylist <- list(data.frame(name="sample1", red=20), data.frame(name="sample1", green=15),
+                data.frame(name="sample2", red=10), data.frame(name="sample2", green=30))
> names <- list("sample1", "sample1", "sample2", "sample2")
> merge(mylist[1], mylist[2])
     name red green
1 sample1  20    15
> merge(mylist[3], mylist[4])
     name red green
1 sample2  10    30
> aggregate(mylist, by=as.list(names), merge)
Error in as.data.frame(y) : argument "y" is missing, with no default

What's the right way to do this?
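
One possible approach, sketched here in base R rather than taken from the thread: group the list by the sample name stored inside each data frame, then fold merge() over each group with Reduce(). The names sample_names, groups, merged and result are illustrative.

```r
mylist <- list(data.frame(name = "sample1", red = 20),
               data.frame(name = "sample1", green = 15),
               data.frame(name = "sample2", red = 10),
               data.frame(name = "sample2", green = 30))

# Pull the sample name out of each data frame
sample_names <- vapply(mylist, function(d) as.character(d$name[1]), character(1))

# split() works on lists, giving one sub-list of data frames per sample
groups <- split(mylist, sample_names)

# Within each sample, merge the data frames pairwise on "name"
merged <- lapply(groups, function(g) {
  Reduce(function(x, y) merge(x, y, by = "name"), g)
})

# Stack the per-sample rows into one data frame
result <- do.call(rbind, merged)
rownames(result) <- NULL
result
```

This assumes every sample contributes the same set of colour columns; rbind() would fail otherwise.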



Re: [R] merging dataframes in a list

2016-06-03 Thread Ed Siefker
aggregate isn't really what I want.  Maybe tapply?  I still can't get
it to work.

> length(mylist)
[1] 4
> length(names)
[1] 4
> tapply(mylist, names, merge)
Error in tapply(mylist, names, merge) : arguments must have same length

I guess because a list isn't an atomic data type.  What function will
do the same on lists?  lapply doesn't have a 'by' argument.



Re: [R] merging dataframes in a list

2016-06-03 Thread Ed Siefker
Thanks, ldply got me a data frame straight away.  But it filled empty
spaces with NA and merge no longer works.

> ldply(mylist)
     name red green
1 sample1  20    NA
2 sample1  NA    15
3 sample2  10    NA
4 sample2  NA    30
> mydf <- ldply(mylist)
> merge(mydf[1,],mydf[2,])
[1] name  red   green
<0 rows> (or 0-length row.names)
> merge(mydf[1,],mydf[2,], by=1)
     name red.x green.x red.y green.y
1 sample1    20      NA    NA      15


How do I merge dataframes with NA?
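
A sketch of one way to collapse such a stacked table without merge(), assuming each sample has at most one non-missing value per colour column: aggregate by name and sum while ignoring NA. The data frame is rebuilt by hand here so the example does not depend on plyr.

```r
mydf <- data.frame(name  = c("sample1", "sample1", "sample2", "sample2"),
                   red   = c(20, NA, 10, NA),
                   green = c(NA, 15, NA, 30))

# na.action = na.pass keeps rows containing NA in the model frame;
# na.rm = TRUE is passed through to sum() so the NA cells are skipped
collapsed <- aggregate(cbind(red, green) ~ name, data = mydf,
                       FUN = sum, na.rm = TRUE, na.action = na.pass)
collapsed
```

Note that sum() over a column that is entirely NA for a sample returns 0 rather than NA, which may or may not be what is wanted.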

On Fri, Jun 3, 2016 at 2:17 PM, Ulrik Stervbo  wrote:
> You can use ldply in the plyr package to bind all the data.frames together
> (a regular loop will also work). Afterwards you can summarise using ddply
>
> Hope this helps
> Ulrik


Re: [R] metafor - code for analysing geometric means

2014-12-07 Thread Purssell, Ed
Dear All

I have tried very hard to work out what to do with putting logged data into 
metafor; the paper says..
'geometric mean antibody concentrations (GMCs) or opsonophagocytic activity 
titres (geometric mean titres [GMT]) were calculated with 95% CIs by taking the 
antilog of the mean of the log concentration or titre transformations.'

Does this look right if I take the reported mean, upper and lower bound of the 
CI, and the number?

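m<-log(gm)          # gm: reported geometric mean
ub<-log(upper)      # upper: upper bound of the 95% CI
lb<-log(lower)      # lower: lower bound of the 95% CI
diff<-ub-lb         # CI width on the log scale
SE<-diff/3.92       # 3.92 = 2 * 1.96
SD<-SE*(sqrt(n))    # n: number in the group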

Then put m, SD and n for each group into metafor as normal.  Or is there a 
better way?  I am afraid I didn't understand how to do it on a log scale.

Thank you

Edward

Edward Purssell PhD
Senior Lecturer

Florence Nightingale Faculty of Nursing and Midwifery
King's College London
James Clerk Maxwell Building
57 Waterloo Road
London SE1 8WA
Telephone 020 7848 3021
Mobile 07782 374217
email edward.purss...@kcl.ac.uk
https://www.researchgate.net/profile/Edward_Purssell


From: Viechtbauer Wolfgang (STAT) 
Sent: 14 November 2014 10:40
To: Michael Dewey; Purssell, Ed; r-help@r-project.org
Subject: RE: [R] metafor - code for analysing geometric means

With "geometric mean 1 CI /3.92", I assume you mean "(upper bound - lower 
bound) / 3.92". Two things:

1) That will give you the SE of the mean, not the SD of the observations (which 
is what you need as input).

2) Probably the CI for the geometric mean was calculated on the log-scale (as 
Michael hinted at). Check if log(upper bound) and log(lower bound) is (within 
rounding error) symmetric around log(geometric mean). Then (log(upper bound) - 
log(lower bound)) / 3.92 * sqrt(n) will give you the SD of the log of the 
values used to compute the geometric mean. Then you could use log(geometric 
mean) and that SD as input. But this would give you the difference of the 
log-transformed geometric means. Not sure if this is what you want to analyze.
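
The check and the back-calculation described in point 2 can be sketched as follows. The numbers are hypothetical and not taken from any study; gm, lower, upper and n are illustrative names for the reported geometric mean, CI bounds and group size.

```r
# Hypothetical reported values for one group
gm    <- 2.50
lower <- 1.80
upper <- 3.47
n     <- 40

log_gm <- log(gm)

# Symmetry check: on the log scale the two gaps should agree within rounding
c(upper_gap = log(upper) - log_gm, lower_gap = log_gm - log(lower))

# A 95% CI spans 2 * qnorm(0.975) (about 3.92) standard errors
z      <- qnorm(0.975)
se_log <- (log(upper) - log(lower)) / (2 * z)

# SD of the log-transformed observations
sd_log <- se_log * sqrt(n)
sd_log
```

log_gm, sd_log and n for each group would then be the inputs, keeping in mind that the resulting effect is a difference of log geometric means.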

Two more articles that may be helpful here:

Friedrich, J. O., Adhikari, N. K., & Beyene, J. (2012). Ratio of geometric 
means to analyze continuous outcomes in meta-analysis: Comparison to mean 
differences and ratio of arithmetic means using empiric data and simulation. 
Statistics in Medicine, 31(17), 1857-1886.

Souverein, O. W., Dullemeijer, C., van 't Veer, P., & van der Voet, H. (2012). 
Transformations of summary statistics as input in meta-analysis for linear 
dose-response models on a logarithmic scale: A methodology developed within 
EURRECA. BMC Medical Research Methodology, 12(57).

Best,
Wolfgang

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Michael Dewey
> Sent: Thursday, November 13, 2014 12:36
> To: Purssell, Ed; r-help@r-project.org
> Subject: Re: [R] metafor - code for analysing geometric means
>
> On 13/11/2014 11:00, Purssell, Ed wrote:
> > Dear All
> >
> > I have some data expressed in geometric means and 95% confidence
> intervals.  Can I code them in metafor as:
> >
> > rma(m1i=geometric mean 1, m2i=geometric mean 2, sd1i=geometric mean 1
> CI /3.92, sd2i=geometric mean 2 CI/3.92...etc, measure="MD")
>
> Would it not be better to work on the log scale?
>
> > All of the studies use geometric means.
> >
> > Thanks!
> >
> > Edward
>
> --
> Michael
> http://www.dewey.myzen.co.uk



[R] inverse of which()

2019-02-27 Thread Ed Siefker
Given a vector of booleans, which() will return indices that are TRUE.

Given a vector of indices, how can I get a vector of booleans?

My intent is to do logical operations on the output of grep().  Maybe
there's a better way to do this?

Thanks
-Ed
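
A minimal sketch of two ways to get there, assuming the indices refer to positions in a known vector x: test each position for membership in the index vector, or skip the indices altogether with grepl(). The vector x and the pattern are made up for illustration.

```r
x <- c("apple", "banana", "cherry", "avocado")

idx <- grep("^a", x)                  # positions of the matches

# Inverse of which(): TRUE at every position listed in idx
as_logical <- seq_along(x) %in% idx

# grepl() returns the logical vector directly
direct <- grepl("^a", x)

identical(as_logical, direct)
```

With logical vectors in hand, the usual &, | and ! operators can combine several patterns.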



Re: [R] inverse of which()

2019-02-28 Thread Ed Siefker
That's exactly what I want! Thanks!
-Ed

On Wed, Feb 27, 2019 at 5:14 PM David L Carlson  wrote:
>
> I'm not sure I completely understand your question. Would using grepl() 
> instead of grep() let you do what you want?
>
> 
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77843-4352


[R] Using compute.es and metafor together

2014-10-03 Thread Purssell, Ed
Dear All



For mathematically challenged people such as myself: is it OK to use the 
compute.es package to calculate effect sizes and then import the effect sizes d 
and variances of d into metafor, coding these as yi and vi respectively, and 
then run the meta-analysis?  This seems easier because compute.es offers a 
lot of ways of calculating d and its variance using similar code.



Thanks

Edward




Edward Purssell PhD
Senior Lecturer

Florence Nightingale Faculty of Nursing and Midwifery
King's College London
James Clerk Maxwell Building
57 Waterloo Road
London SE1 8WA
Telephone 020 7848 3021
Mobile 07782 374217
email edward.purss...@kcl.ac.uk
https://www.researchgate.net/profile/Edward_Purssell



[R] metafor - code for analysing geometric means

2014-11-13 Thread Purssell, Ed
Dear All


I have some data expressed in geometric means and 95% confidence intervals.  
Can I code them in metafor as:


rma(m1i=geometric mean 1, m2i=geometric mean 2, sd1i=geometric mean 1 CI /3.92, 
sd2i=geometric mean 2 CI/3.92...etc, measure="MD")

All of the studies use geometric means.


Thanks!


Edward

 




[R] plot changes usr?

2015-09-28 Thread Ed Siefker
I'm trying to plot() over an existing plot() like this:

> attach(mtcars)
> plot(mpg, hp)
> par(new=TRUE)
> par("usr")
[1]   9.46  34.84  40.68 346.32
> plot(mpg, hp, col="red", axes=FALSE, xlim=par("usr")[1:2],
+      ylim=par("usr")[3:4], xlab="", ylab="")
> par("usr")
[1]   8.4448  35.8552  28.4544 358.5456

For some reason "usr" is changing, and so it's not plotting over the
existing data in the right place.
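
The numbers above look consistent with the default axis style: per ?par, xaxs="r" and yaxs="r" extend the requested range by 4 percent at each end, so passing the old "usr" values as xlim/ylim widens them a second time. A minimal sketch, assuming that is the cause, is to request the "i" (internal) style for the overlay. pdf(NULL) is used only so the example runs without opening a window, and usr1/usr2 are illustrative names.

```r
pdf(NULL)                       # off-screen device, no file is written

plot(mtcars$mpg, mtcars$hp)
usr1 <- par("usr")              # plot region of the first plot

par(new = TRUE)
plot(mtcars$mpg, mtcars$hp, col = "red", axes = FALSE,
     xlab = "", ylab = "",
     xlim = usr1[1:2], ylim = usr1[3:4],
     xaxs = "i", yaxs = "i")    # use the limits exactly as given
usr2 <- par("usr")

all.equal(usr1, usr2)

dev.off()
```

When both data sets share the same axes, points() or lines() on the existing plot avoids the second plot() call entirely.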



[R] problem with formula argument to randomForest

2015-10-28 Thread Ed Komp


The randomForest function generates an error whenever
I supply it with a formula that uses the function I() to inhibit interpretation.
When I do so, I always get an error like this one:
 Error in unique(c("AsIs", oldClass(x))) : object 'Age' not found

Is this because of:
1.  a restriction for the randomForest function that I have not seen documented;
2.  a deficiency / error in randomForest; or
3.  an error in my calling sequence?

I am including a very simple example to demonstrate the problem.
Simply using   I()  generates the error.
This is not a meaningful use of I(), but is very simple.
My Interest is for  I(  / ) .

I also demonstrate that the usage of I() in a formula works just fine
for another discrimination function, lda.

The sample code is included after my signature, along with line-by-line output.

Thanks in advance !

Ed Komp
ITTC Lab, University of Kansas

===
> library(rpart)
> library(MASS)
> library(randomForest)
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
> formula <- as.formula('Kyphosis ~ Age + Number + Start')
> formula
Kyphosis ~ Age + Number + Start
> formulaWithI <- as.formula('Kyphosis ~ I(Age) + Number + Start')
> formulaWithI
Kyphosis ~ I(Age) + Number + Start
> fit <- randomForest(formula,   data=kyphosis)
> fitWithI <- randomForest(formulaWithI,   data=kyphosis)
Error in unique(c("AsIs", oldClass(x))) : object 'Age' not found
>
> fit <- lda(formula, data = kyphosis)
> fitWithI <- lda(formula, data = kyphosis)
> fitWithI
Call:
lda(formula, data = kyphosis)

Prior probabilities of groups:
   absent   present
0.7901235 0.2098765

Group means:
 Age   Number Start
absent  79.89062 3.75 12.609375
present 97.82353 5.176471  7.294118

Coefficients of linear discriminants:
LD1
Age 0.005910971
Number  0.291501797
Start  -0.170496626
>
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] randomForest_4.6-12 MASS_7.3-44 rpart_4.1-10



[R] aggregate and the $ operator

2016-01-22 Thread Ed Siefker
Aggregate does the right thing with column names when passing it
numerical coordinates.
Given a dataframe like this:

  Nuclei Positive Nuclei Slide
1    133              96    A1
2     96              70    A1
3     62              52    A2
4     60              50    A2

I can call 'aggregate' like this:

> aggregate(example[1], by=example[3], sum)
  Slide Nuclei
1    A1    229
2    A2    122

But that means I have to keep track of which column is which number.
If I try it the easy way, it doesn't keep track of column names and it
forces me to coerce the 'by' to a list.

> aggregate(example$Nuclei, by=list(example$Slide), sum)
  Group.1   x
1      A1 229
2      A2 122

Is there a better way to do this?  Thanks
-Ed
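
A short sketch of two name-based alternatives, using a hand-built copy of the table above. The second column is called Positive here purely for illustration; the original column name contains a space.

```r
example <- data.frame(Nuclei   = c(133, 96, 62, 60),
                      Positive = c(96, 70, 52, 50),
                      Slide    = c("A1", "A1", "A2", "A2"))

# Single-bracket indexing by name returns one-column data frames,
# so the column names survive and no list() coercion is needed
by_name <- aggregate(example["Nuclei"], by = example["Slide"], FUN = sum)

# The formula interface also keeps the names
by_formula <- aggregate(Nuclei ~ Slide, data = example, FUN = sum)

# Several response columns at once
both <- aggregate(cbind(Nuclei, Positive) ~ Slide, data = example, FUN = sum)
both
```

The difference from the $ form is that example$Nuclei is a bare vector with no name attached, whereas example["Nuclei"] is still a data frame.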



Re: [R] aggregate and the $ operator

2016-01-22 Thread Ed Siefker
So that's how that works!  Thanks.

On Fri, Jan 22, 2016 at 1:32 PM, Joe Ceradini  wrote:
> Does this do what you want?
>
> aggregate(Nuclei ~ Slide, example, sum)
>
> --
> Cooperative Fish and Wildlife Research Unit
> Zoology and Physiology Dept.
> University of Wyoming
> joecerad...@gmail.com / 914.707.8506
> wyocoopunit.org
>



[R] lists and rownames

2016-04-18 Thread Ed Siefker
I'm doing some string manipulation on a vector of file names, and noticed
something curious.  When I strsplit the vector, I get a list of
character vectors.
The list is numbered, as lists are.  When I cast that list as a data
frame with 'as.data.frame()', the resulting columns have names derived
from the original filenames.

Example code is below.  My question is, where are these names stored
in the list?  Are there methods that can access this from the list?
Is there a way to preserve them verbatim?  Thanks
-Ed

> example.names
[1] "con1-1-masked-bottom-green.tsv" "con1-1-masked-bottom-red.tsv"
[3] "con1-1-masked-top-green.tsv"    "con1-1-masked-top-red.tsv"
> example.list <- strsplit(example.names, "-")
> example.list
[[1]]
[1] "con1"      "1"         "masked"    "bottom"    "green.tsv"

[[2]]
[1] "con1"    "1"       "masked"  "bottom"  "red.tsv"

[[3]]
[1] "con1"      "1"         "masked"    "top"       "green.tsv"

[[4]]
[1] "con1"    "1"       "masked"  "top"     "red.tsv"

> example.df <- as.data.frame(example.list)
> example.df
  c..con1....1....masked....bottom....green.tsv..
1                                            con1
2                                               1
3                                          masked
4                                          bottom
5                                       green.tsv
  c..con1....1....masked....bottom....red.tsv..
1                                          con1
2                                             1
3                                        masked
4                                        bottom
5                                       red.tsv
  c..con1....1....masked....top....green.tsv..
1                                         con1
2                                            1
3                                       masked
4                                          top
5                                    green.tsv
  c..con1....1....masked....top....red.tsv..
1                                       con1
2                                          1
3                                     masked
4                                        top
5                                    red.tsv
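
As far as I can tell, the names are not stored in the list at all: strsplit() returns an unnamed list when its input has no names, and as.data.frame() then builds column names by deparsing each element and passing the result through make.names(). A minimal sketch, under that assumption, of keeping the filenames verbatim by naming the list first and asking as.data.frame() not to alter the names (optional = TRUE):

```r
example.names <- c("con1-1-masked-bottom-green.tsv",
                   "con1-1-masked-bottom-red.tsv",
                   "con1-1-masked-top-green.tsv",
                   "con1-1-masked-top-red.tsv")

example.list <- strsplit(example.names, "-")
names(example.list)                      # NULL: nothing is stored in the list

# Attach the filenames explicitly
names(example.list) <- example.names

# optional = TRUE skips the conversion to syntactic column names
example.df <- as.data.frame(example.list, optional = TRUE,
                            stringsAsFactors = FALSE)
names(example.df)

# Alternative layout: one row per file, filenames as row names
parts <- do.call(rbind, example.list)
rownames(parts)
```

The row-per-file matrix is often easier to work with afterwards, since each filename field then sits in its own column.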



[R] T tests on multiple groups

2017-01-19 Thread Ed Siefker
I have a data set with observations on groups with multiple variables.
Let's call them GENO and AGE.  I have control and test genotypes
and two different ages.  It is only meaningful to compare control and
test within the same age.

I'd like to get the p value for each group compared back to control
of the appropriate age.  T-test requires that the grouping factor has
exactly two levels.   How can I do this efficiently?

I was hoping something like ttest(OBS ~ GENO * AGE, mydata) would work.
Is there something I can do with tapply() or aggregate() to do this?
I'd like to end up with a table that looks like this:

GENO     Age  OBS  p.val
control  10   1.1  1
control  10   0.9  1
control  20   2.1  1
control  20   1.9  1
A        10   11   0.01224066
A        10   9    0.01224066
A        20   21   0.003102783
A        20   19   0.003102783
B        10   4    0.057714305
B        10   6    0.057714305
B        20   14   0.005923285
B        20   16   0.005923285
AB       10   1    0.698488655
AB       10   1.1  0.698488655
AB       20   2    0.552786405
AB       20   2.2  0.552786405
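
A sketch of one way to build such a table, assuming the data are laid out as above and that an equal-variance (Student) two-sample t-test against the same-age control is what is wanted; the p-values in the table appear consistent with that choice. The helper p_vs_control is a made-up name.

```r
mydata <- data.frame(
  GENO = rep(c("control", "A", "B", "AB"), each = 4),
  AGE  = rep(c(10, 10, 20, 20), times = 4),
  OBS  = c(1.1, 0.9, 2.1, 1.9,  11, 9, 21, 19,
           4, 6, 14, 16,  1, 1.1, 2, 2.2),
  stringsAsFactors = FALSE
)

# p-value for one genotype at one age, compared with control at that age;
# control rows get 1 by convention, as in the table above
p_vs_control <- function(geno, age, d) {
  if (geno == "control") return(1)
  ctrl <- d$OBS[d$GENO == "control" & d$AGE == age]
  test <- d$OBS[d$GENO == geno & d$AGE == age]
  t.test(test, ctrl, var.equal = TRUE)$p.value
}

# One call per row, so rows in the same group share a p-value
mydata$p.val <- mapply(p_vs_control, mydata$GENO, mydata$AGE,
                       MoreArgs = list(d = mydata), USE.NAMES = FALSE)
mydata
```

With only two observations per group the tests have very little power, and no correction for multiple comparisons is applied here.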



[R] Average over data sets

2009-09-02 Thread Ed Long

Hello,

I have a number of files output1.dat, output2.dat, ... , output20.dat,  
each of which monitors several variables over a fixed number of  
timepoints. From this I want to create a data frame which contains the  
mean value between all files, for each timepoint and each variable.


The code below works, but it seems like I should be able to do the  
second part without a for loop. I played with sapply(myList, mean),  
but that seems to take the mean between time points and files, rather  
than just between files.


# Number of files to calculate mean value between
numberOfRuns = 20;
myList = list();
for (i in 1:numberOfRuns) {
    # Read in file
    fileName = paste("output", i, ".dat", sep="");
    myData = read.table(fileName, header=TRUE);
    # Append data frame to list
    myList[[i]] = myData;
}

# Create variable to store data means
myAverage = myList[[1]]/numberOfRuns;

for (i in 2:numberOfRuns) {
    myAverage = myAverage + myList[[i]]/numberOfRuns;
}

Is a list of data frames a sensible structure to store this or should  
I use an array?
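
Both options can be sketched without an explicit loop, assuming every file yields a data frame of identical shape with numeric columns only. Simulated data stands in for the output*.dat files, and the column names time, x and y are invented for the example.

```r
set.seed(42)
# Three simulated "runs", each with 4 time points and 2 variables
myList <- lapply(1:3, function(i) {
  data.frame(time = 1:4, x = rnorm(4), y = runif(4))
})

# List approach: add the data frames element-wise, then divide
myAverage <- Reduce("+", myList) / length(myList)

# Array approach: stack the runs along a third dimension and average over it
arr <- array(unlist(lapply(myList, as.matrix)),
             dim = c(dim(myList[[1]]), length(myList)),
             dimnames = list(NULL, names(myList[[1]]), NULL))
myAverage2 <- apply(arr, c(1, 2), mean)

all.equal(unname(as.matrix(myAverage)), unname(myAverage2))
```

The array form makes other summaries, such as a per-cell standard deviation via apply(arr, c(1, 2), sd), a one-line change.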


Any pointers gratefully received.

Ed Long



[R] cygwin clipboard

2019-05-21 Thread Ed Siefker
I'd like to be able to access the windows clipboard from R under Cygwin.
But...

> read.table(file="clipboard")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : unable to contact X11 display
>

Is this supported in any way?  Thanks
-Ed



[R] why is this a factor?

2013-08-28 Thread Ed Siefker
I have a table, and I want a new column to add some annotations to.
But it ends up as a factor instead of characters, and won't let me add
arbitrary text.

> data(iris)
> iris<-data.frame(iris,annot=c(""))
> iris[1,"annot"]<-"annotation"
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "annotation") :
  invalid factor level, NAs generated
> class(iris[,"annot"])
[1] "factor"
> class(c(""))
[1] "character"

Why is c("") a character, but when I add it to a data frame it's a factor?
What am I missing?  Is there a better way to add a new column to
a data frame?
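
A likely explanation: in R versions before 4.0.0, data.frame() converts character vectors to factors by default (stringsAsFactors = TRUE), so the new column only accepts its existing levels. The sketch below shows two ways around this; the names iris1 and iris2 are just for illustration.

```r
data(iris)

# Option 1: ask data.frame() not to convert strings to factors.
iris1 <- data.frame(iris, annot = "", stringsAsFactors = FALSE)
iris1[1, "annot"] <- "annotation"

# Option 2: assign the new column directly; `$<-` stores the character
# vector as-is (a length-1 value is recycled to every row).
iris2 <- iris
iris2$annot <- ""
iris2[1, "annot"] <- "annotation"

class(iris1$annot)
class(iris2$annot)
```

From R 4.0.0 onwards the default is stringsAsFactors = FALSE, so the original code would produce a character column there.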



[R] Metafor - why use escalc?

2014-03-14 Thread Purssell, Ed
Dear All



As you can specify the data directly to rma.uni via n1i, m1i, sd1i, etc in 
Metafor, why would you ever want to use escalc to calculate yi and vi?  Aren't 
these just intermediate steps to the final pooled effect size which is 
calculated by rma.uni; or is there some advantage to calculating yi and vi 
separately using escalc?

Thanks

Ed











[R] metafor combining escalc effect-sizes

2014-03-30 Thread Purssell, Ed

Dear All

I have a question about combining effect sizes generated by escalc in metafor.  
I realise these may be stupid things to do; but they are deliberately so to 
explain what I mean - I don't intend doing this!


I have 3 studies; each of which has a different measure of effect/presents the 
data differently, so I use escalc to calculate the effect size of each and 
combine them into a data-frame:

es1<-escalc(measure="MD", m1i=10 , m2i=5 , n1i=12 , n2i=12, sd1i=2, sd2i=2)
es2<-escalc(measure="RR", ai=10 , bi=5 , ci=12 , di=12)
es3<-escalc(measure="RR", ai=10 , ci=5 , n1i=15 , n2i=12)
es4<-rbind(es1, es2, es3) # combines the 3 effect sizes into a data frame

attach(es4) # makes the data frame available to R

# running the meta-analysis here gives the error message
es5<-rma(yi, vi, data=es4)

Error in rma(yi, vi, data = es4) :
  Length of yi and ni vectors are not the same.

But if I save this as a .csv and open it in R using read.csv("E:/es5.csv", etc) 
I get a data frame that looks like this:

      yi     vi
1 5.0000 0.6667
2 0.6931 0.4667
3 0.4700 0.1500

I can run it using

rma(yi, vi, data=es4)

I have three questions.
1. Can escalc be used in this way to calculate each study effect size 
individually and then rbind them into a data-frame (assuming that it is a 
sensible thing to do, which I realise the above probably isn't)?

2. What is the meaning of the error message:
Error in rma(yi, vi, data = es4) :
  Length of yi and ni vectors are not the same.

3. Is it right to save it as a .csv, open it and re-run it as I have done?


Thanks very much, and to Wolfgang thanks for a great programme! I am using it 
in my MSc teaching here for healthcare students.

Edward



[R] Java requested System.exit(130)

2013-10-01 Thread Ed Siefker
I'm used to using ctrl-c to end operations without killing R.  But I've used
xlsx in this session, which loads Java, which apparently intercepts the
ctrl-C.  Accordingly, I hit ctrl-C, R died, and I lost a lot of work.

I did some looking, and found a
thread(http://comments.gmane.org/gmane.comp.lang.r.rosuda.devel/1368)
that says:

"Yes, at least on Sun JVMs you need to add -Xrs java option so the JVM
doesn't steal SIGINT from R (see archives)."

So, how do I actually do that? I'm not running java from the command line,
I'm using "library(xlsx)".  How do I tell R to pass that option to the JVM?
Thanks
-Ed
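
A possible approach, assuming xlsx starts the JVM through rJava and that -Xrs behaves as described in the quoted thread: rJava picks up JVM options from the java.parameters option when the JVM is initialised, so the option has to be set before the package is loaded. This is a sketch, not a tested fix for the ctrl-C problem.

```r
# Must run before any package that starts the JVM is loaded;
# once the JVM is running, changing the option has no effect.
options(java.parameters = "-Xrs")

# library(xlsx)   # load afterwards (commented out so the sketch runs
#                 # on a machine without Java installed)

getOption("java.parameters")
```

Putting the options() line in .Rprofile would apply it to every session, if that is wanted.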



Re: [R] R licensing query

2010-06-17 Thread Ed Keith
Unfortunately this is how things work in the real world. I suspect the reason 
so many people keep getting in trouble for taking classified information home 
is that they cannot get any work done on the office computer due to things 
like this.

Many of the places I've worked have not permitted me to install Vim on my 
computer. I had to use the MS Visual C++ editor or MS Notepad for all text editing.

Since I usually get paid by the hour, it just costs them more and increases my 
income, but I still find it incredibly annoying.

   -EdK

Ed Keith
e_...@yahoo.com

Blog: edkeith.blogspot.com


--- On Thu, 6/17/10, Frank E Harrell Jr  wrote:

> From: Frank E Harrell Jr 
> Date: Thursday, June 17, 2010, 12:11 PM
> Pardon my English, but you're working for idiots.  I'd look elsewhere
> if there are other options.  IT departments should be here to help get
> things done, not to help prevent good work from being done.
>
> Frank
>
> On 06/17/2010 04:28 AM, McAllister, Gina wrote:
> > I have recently started a new job at an NHS hospital in Scotland.
> > Since I took up this post 6 months ago I have had an ongoing dispute
> > with the IT security dept., who refuse to install R on my computer.
> > I previously worked in another branch of the NHS where R was widely
> > used, and yet there is nothing I can say which will persuade the IT
> > dept. here to even visit the website!  With some help from our head
> > of department, they have now agreed to install R, but only if they
> > receive an email from 'R' ensuring that it is licensed for commercial
> > use, is compatible with Windows XP and will not affect the networked
> > computer system here.  My only other option for data analysis is
> > Excel; we have no money for S-plus or any other stats programme.
> > Can anyone suggest anything or send me a suitable email?
> >
> > Many thanks,
> > Georgina
>
> -- 
> Frank E Harrell Jr   Professor and Chairman   School of Medicine
>                      Department of Biostatistics   Vanderbilt University
> 
> 






Re: [R] F# vs. R

2010-07-08 Thread Ed Keith
It's been a long time since I used Fortran, and I have only dabbled in F#, but 
I do not think translating Fortran (or R) to F# will be easy. F# is basically a 
functional language (like ML) and requires a very different mindset from Fortran (or R).

   -EdK

Ed Keith
e_...@yahoo.com

Blog: edkeith.blogspot.com


--- On Thu, 7/8/10, rkevinbur...@charter.net  wrote:

> From: rkevinbur...@charter.net 
> Subject: Re: [R] F# vs. R
> To: r-help@r-project.org, "Patrick Burns" , 
> serg...@gmail.com
> Date: Thursday, July 8, 2010, 10:16 AM
> True, porting old C and Fortran code to C# or F# would be a pain and
> probably riddled with errors, but it is not too soon to start looking
> to see if there is a better way. There have been numerous ports of
> LAPACK, BLAS, etc. to C#. Maybe they could be leveraged.
>
> Maybe just allowing packages to be written in C# or F# would be
> helpful. And remember there is Mono.
>
> Just my 2 cents.
>
> Patrick Burns wrote:
> > I'd like to hear answers to this as well.
> > A language doesn't have to be a complete
> > replacement to be useful.
> > 
> > F# seems to have some nice features.
> > 
> > Pat
> > 
> > On 07/07/2010 17:54, Sergey Goriatchev wrote:
> > > Hello, Marc
> > >
> > > No, I do not want to validate Cox PH. :-)
> > > I do use R daily, though right now I do not use the statistical
> > > part that much.
> > >
> > > I just generally wonder if any R-user tried F# and his/her opinions.
> > >
> > > Regards,
> > > Sergey
> > >
> > > On Wed, Jul 7, 2010 at 17:56, Marc Schwartz wrote:
> > >> On Jul 7, 2010, at 10:31 AM, Sergey Goriatchev wrote:
> > >>
> > >>> Hello, everyone
> > >>>
> > >>> F# is now public. Compiled code should run faster than R.
> > >>>
> > >>> Anyone has opinion on F# vs. R? Just curious
> > >>>
> > >>> Best,
> > >>> S
> > >>
> > >> The key time critical parts of R are written in compiled C and
> > >> FORTRAN.
> > >>
> > >> Of course, if you want to take the time to code and validate a
> > >> Cox PH or mixed effects model in F# and then run them against
> > >> R's coxph() or lme()/lmer() functions to test the timing, feel
> > >> free...  :-)
> > >>
> > >> So unless there is a pre-existing library of statistical and
> > >> related functionality for F#, perhaps you need to reconsider
> > >> your query.
> > >>
> > >> Regards,
> > >>
> > >> Marc Schwartz
> >
> > -- 
> > Patrick Burns
> > pbu...@pburns.seanet.com
> > http://www.burns-stat.com
> > (home of 'Some hints for the R beginner'
> > and 'The R Inferno')
> > 
> 






[R] Using feather.plot to try and generate a stick plot of current velocity data (and having issues)

2010-09-14 Thread Hughes, Ed
Hello All,

 

I am attempting to use the feather.plot function from the plotrix
package to graph current velocity data as I have speed and direction.  I
let "r" be the first 10 rows of current speed data and "theta" be the
first 10 rows of directional data in radians.  I had tried this with 10
measurements, but keep getting the following error message:

> feather.plot(r,theta,1:10,yref=0,use.arrows=FALSE, fp.type="m")
Error in segments(xpos, yref, xpos + x, y, ...) : invalid third argument

 

My goal was to troubleshoot this smaller data set and see if I could
ramp up to a few thousand entries to basically generate a stick plot of
current flow data.  I'm curious whether I am doing anything obviously wrong
with my arguments or whether I should be using an entirely different function.

 

 

Thanks for any guidance

Eddie Hughes




[R] Alphabetical sequence of data along the x-axis in a box plot

2010-09-27 Thread Hughes, Ed
Hello All,

 

I noticed when I generated some boxplots that the data are presented in
alphabetical order along the x-axis (the data in this case were the four
quadrants of a sample area (NE, NW, SE, SW), which was my first column of
data).  Is there a way to have R plot the data in a different order?  I
imagine you could use a dummy variable, but didn't know if there might
be a simple argument that will address this.

 

Thanks for any guidance,

Eddie Hughes
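
One way to control the order is through the factor levels: boxplot() draws the groups in level order, and levels default to alphabetical. A minimal sketch with made-up data follows; the data frame d and its column names are hypothetical.

```r
# Hypothetical data: a measurement for each quadrant.
set.seed(1)
d <- data.frame(
  quadrant = rep(c("NE", "NW", "SE", "SW"), each = 5),
  value    = rnorm(20)
)

# By default a factor's levels are sorted alphabetically; giving
# `levels` explicitly fixes the order used on the x-axis.
d$quadrant <- factor(d$quadrant, levels = c("NW", "NE", "SW", "SE"))

boxplot(value ~ quadrant, data = d)
levels(d$quadrant)
```

No dummy variable is needed; only the level order of the grouping column changes, not the data.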

 

 




[R] generating skewed random numbers

2011-01-11 Thread Ed Keith
This is not exactly an R specific question, but I think the people on this list 
can probably help.

I'm working on a simulation. In the model I have the first three moments of the 
distributions of the variables. I know how to generate a random number from a 
distribution given the first two moments assuming the third moment is 0. But I 
do not know how to generate a number drawn from a distribution with a nonzero 
third moment.

If someone could point me to a good reference I would appreciate it.
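
One common approach, though by no means the only one, is a shifted gamma distribution, whose three parameters can be matched to a mean, a standard deviation and a skewness. The sketch below assumes the third moment is given as the skewness (the standardised third central moment) and is nonzero; the function name rskew is made up for this example.

```r
# Draw n values from a shifted (and, for negative skew, reflected)
# gamma distribution with the requested mean, sd and skewness.
rskew <- function(n, mean, sd, skew) {
  stopifnot(sd > 0, skew != 0)
  shape <- 4 / skew^2            # gamma skewness is 2 / sqrt(shape)
  scale <- sd * abs(skew) / 2    # gives variance shape * scale^2 = sd^2
  x <- rgamma(n, shape = shape, scale = scale) - shape * scale  # centre at 0
  mean + sign(skew) * x          # reflect for negative skew, then shift
}

set.seed(42)
x <- rskew(1e5, mean = 10, sd = 2, skew = 1)
moments <- c(mean = mean(x), sd = sd(x),
             skew = mean((x - mean(x))^3) / sd(x)^3)
moments
```

The sample moments should land close to the requested values. Note that fixing three moments does not pin down a unique distribution, so the gamma is a modelling choice rather than a consequence of the inputs.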

Thank you in advance,

   -EdK

Ed Keith
e_...@yahoo.com

Blog: edkeith.blogspot.com



[R] Display a DataFrame in a data grid

2011-08-08 Thread Ed Heaton
Hi, all;

 

I'm new to R.  Have been a SAS developer for over 20 years.

 

Whenever I create a new table - you call them dataFrame objects - or modify
an existing one, I like to open the table in a grid with horizontal and
vertical sliders so that I can scan across the table and (especially) look
at all four corners.  If I made a gross error, it often shows up when I look
at the corners of the table.

 

I just can't seem to find how to invoke such a display.  Can anybody help me
here?

 

Ed

 

Ed Heaton
Project Manager, Sr. SAS Developer
Data and Analytic Solutions, Inc.
3057 Nutley Street, #602
Fairfax, VA 22031
Office: 301-520-7414
Fax: 703-991-8182
 <mailto:ehea...@dasconsultants.com> ehea...@dasconsultants.com
 <http://www.dasconsultants.com/> www.dasconsultants.com
CMMI ML-2, SBA 8(a) & SDB, WBE (WBENC), MBE (VA & MD)

 




[R] Thanks for the help with displaying a data frame.

2011-08-08 Thread Ed Heaton
Thanks to Michael Weylandt and Josh Wiley for pointing me to the View()
function.  It worked like a charm - once I learned that R is case-sensitive.
I told you I am new to R!

 

Ed

 

Ed Heaton
10318 Yearling Drive
Rockville, MD 20850-3517
Voice: (301) 424-8186
Mobile: (301) 520-7414
Fax: (301) 424-8187
eMail:  <mailto:e...@heaton.name> e...@heaton.name
URL:  <http://ed.heaton.name/> http://ed.heaton.name

 




[R] Getting data from an *.RData file into a data.frame object.

2011-08-12 Thread Ed Heaton
Hi, all.

I'm new to R.  I've been a SAS programmer for 20 years.

I seem to be having trouble with the most basic task - bringing a table in
an *.RData file into a data.frame object.

Here's how I created the *.RData file.

library(RODBC)
db <- odbcConnect("***")
df <- sqlQuery(
db
  , "select * from schema.table where year(someDate)=2006"
)
save(
df
  , file="C:/Documents and Settings/userName/My Documents/table2006.RData"
)
dim(df)
remove(df)
odbcClose(db)
remove(db)
detach("package:RODBC")

Next, I moved that data file (table2006.RData) to another workstation - not
at the client site.

Now, I need to get that data file into a data.frame object.  I know this
should be simple, but I can't seem to find out how to do that.  I tried the
following.  First, after opening R without doing anything, RGui used 35,008
KB of memory.  I submitted the following.

> debt2005 <- load("T:/R.Data/table2006.RData")

Memory used by RGui jumped to 191,512 KB.  So, it looks like the data
loaded.  However, debt2005 is of type character instead of data.frame.

> ls()
[1] "debt2005"
> class(debt2005)
[1] "character"
>

Help, please.
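
What is probably happening: load() restores the saved objects under their original names (here df) and returns only a character vector of those names, which is what ended up in the assigned variable. The sketch below reproduces this with a small stand-in data frame and a temporary file; the name debt2006 is reused from the post purely for illustration.

```r
# Recreate the situation with a small stand-in data frame.
df <- data.frame(id = 1:3, amount = c(10.5, 20, 7.25))
path <- tempfile(fileext = ".RData")
save(df, file = path)
rm(df)

# load() restores objects under their saved names and returns
# those names as a character vector.
loaded <- load(path)
loaded          # "df"
class(df)       # "data.frame"

# To bind the data to a name of your choosing, load into a
# separate environment and fetch the object from there.
e <- new.env()
nm <- load(path, envir = e)
debt2006 <- get(nm[1], envir = e)
```

If the file is yours to recreate, saveRDS() and readRDS() store a single object without its name, so the result of readRDS() can be assigned directly.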

Ed

Ed Heaton
Project Manager, Sr. SAS Developer
Data and Analytic Solutions, Inc.
10318 Yearling Drive
Rockville, MD 20850
Office: 301-520-7414
ehea...@dasconsultants.com
www.dasconsultants.com <http://www.dasconsultants.com/> 
CMMI ML-2, SBA 8(a) & SDB, WBE (WBENC), MBE (VA & MD)

e...@heaton.name

(Re: http://www.r-project.org/posting-guide.html)



[R] Writing non-graphic (text) output to PDF

2011-08-19 Thread Ed Heaton
Hi, friends.

I keep coming to you because I'm so new to R and can't seem to figure out
some simple things.  Sorry.

Consider the following code.  I want to load a table and write out the
structure to a PDF document.  I just can't seem to manage writing
non-graphic output to PDF.  Any help?  I've tried several functions, but
nothing worked.  All I get is the title.

# **
# Load the DEBT table.
  debt <- readRDS("T:/R.Data/Debt.rData")
  dim(debt)
# Open the debt.pdf file for graphics output.
  pdf(
    file=paste(
      "R:/DAS/DMS/FedDebt"
     ,"DataDiscovery"
     ,"DistributionAnalysis"
     ,"Report"
     ,"Debt.pdf"
     ,sep="/"
    )
  )
# ==
# Write the debt structure to the output PDF.
  plot.new()
  title("DEBT")
  str(debt)
# ==
  dev.off() # Turn off the PDF device.
# ** End of Program 
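
The reason only the title appears is that str() prints to the console rather than to the graphics device. One base-R workaround is to capture the printed text and draw it with text(). The sketch below uses a small stand-in for the DEBT table and a temporary file; the variable names are hypothetical.

```r
# Stand-in for the DEBT table.
debt <- data.frame(id = 1:3, amount = c(10.5, 20, 7.25))

# Capture what str() would print, one element per output line.
txt <- capture.output(str(debt))

out <- tempfile(fileext = ".pdf")
pdf(out)
plot.new()                         # empty page with 0..1 user coordinates
title("DEBT")
text(0, 1, paste(txt, collapse = "\n"),
     adj = c(0, 1),                # anchor the text block at its top-left
     family = "mono", cex = 0.8)   # fixed-width font keeps columns aligned
dev.off()
```

Long output will run off the page with this approach, so for wide tables a smaller cex or splitting txt across several pages may be needed.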



Ed

Ed Heaton
Project Manager, Sr. SAS Developer
Data and Analytic Solutions, Inc.
10318 Yearling Drive
Rockville, MD 20850
Office: 301-520-7414
ehea...@dasconsultants.com
www.dasconsultants.com <http://www.dasconsultants.com/> 
CMMI ML-2, SBA 8(a) & SDB, WBE (WBENC), MBE (VA & MD)

e...@heaton.name

(Re: http://www.r-project.org/posting-guide.html)



[R] Why do I have a column called row.names?

2012-06-04 Thread Ed Siefker
I'm trying to read in a tab separated table with read.delim().
I don't particularly care what the row names are.
My data file looks like this:


start   stop    Symbol  Insert sequence Clone End Pair  FISH
203048  67173930        ABC8-43024000D23                TI:993812543    TI:993834585
255176  87869359        ABC8-43034700N15                TI:995224581    TI:995237913
1022033 1060472 ABC27-1253C21           TI:2094436044   TI:2094696079
1022033 1061172 ABC23-1388A1            TI:2120730727   TI:2121592459


I have to do something with row.names because my first column has
duplicate entries.  So I read in the file like this:

> BACS<-read.delim("testdata.txt", row.names=NULL, fill=TRUE)
> head(BACS)
  row.names    start             stop Symbol Insert.sequence Clone.End.Pair
1    203048 67173930 ABC8-43024000D23     NA    TI:993812543   TI:993834585
2    255176 87869359 ABC8-43034700N15     NA    TI:995224581   TI:995237913
3   1022033  1060472    ABC27-1253C21     NA   TI:2094436044  TI:2094696079
4   1022033  1061172     ABC23-1388A1     NA   TI:2120730727  TI:2121592459
  FISH
1   NA
2   NA
3   NA
4   NA


Why is there a column named "row.names"?  I've tried a few different
ways of invoking this, but I always get the first column named row.names,
and the rest of the columns shifted by one.

Obviously I could fix this by using row.names<-, but I'd like to understand
why this happens.  Any insight?
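
The output above is consistent with the data lines having one more field than the header line (for example because each data line ends in a tab). In that case read.table() treats the first column as row names, and with row.names=NULL it keeps that column under the label "row.names", shifting the header by one. The sketch below reproduces the symptom with a small made-up file; the file contents and the column name "empty" are assumptions for illustration.

```r
# A small file reproducing the symptom: the header has 3 fields but each
# data line ends in a tab, so it has 4 (the last one empty).
tmp <- tempfile(fileext = ".txt")
writeLines(c("start\tstop\tSymbol",
             "1022033\t1060472\tABC27\t",
             "1022033\t1061172\tABC23\t"), tmp)

bad <- read.delim(tmp, row.names = NULL)
shifted <- names(bad)   # first column is labelled "row.names"

# One possible repair: shift the names back and drop the empty column.
fixed <- bad
names(fixed) <- c(shifted[-1], "empty")
fixed <- fixed[, names(fixed) != "empty"]
```

Running count.fields() on the real file with sep = "\t" would show whether the header and the data lines really disagree in length.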



Re: [R] Why do I have a column called row.names?

2012-06-04 Thread Ed Siefker
I did read that, and I still don't understand why I have a column
called row.names.
I used "row.names = NULL" in order to get numbered row names, which was
successful:

> row.names(BACS)
[1] "1" "2" "3" "4"

I don't see what this has to do with an extraneous column name.  Can you be
more explicit as to what exactly I'm supposed to take away from this segment
of the help file?  Thanks.

On Mon, Jun 4, 2012 at 1:05 PM, David L Carlson  wrote:
> Try help("read.delim") - always a good strategy before using a function for
> the first time:
>
> In it, you will find: "Using row.names = NULL forces row numbering. Missing
> or NULL row.names generate row names that are considered to be 'automatic'
> (and not preserved by as.matrix)."
>
> --
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
>> project.org] On Behalf Of Ed Siefker
>> Sent: Monday, June 04, 2012 12:47 PM
>> To: r-help@r-project.org
>> Subject: [R] Why do I have a column called row.names?
>>
>> I'm trying to read in a tab separated table with read.delim().
>> I don't particularly care what the row names are.
>> My data file looks like this:
>>
>>
>> start   stop    Symbol  Insert sequence Clone End Pair  FISH
>> 203048  67173930        ABC8-43024000D23                TI:993812543
>>  TI:993834585
>> 255176  87869359        ABC8-43034700N15                TI:995224581
>>  TI:995237913
>> 1022033 1060472 ABC27-1253C21           TI:2094436044   TI:2094696079
>> 1022033 1061172 ABC23-1388A1            TI:2120730727   TI:2121592459
>>
>>
>>
>> I have to do something with row.names because my first column has
>> duplicate entries.  So I read in the file like this:
>>
>> > BACS<-read.delim("testdata.txt", row.names=NULL, fill=TRUE)
>> > head(BACS)
>>   row.names    start             stop Symbol Insert.sequence
>> Clone.End.Pair
>> 1    203048 67173930 ABC8-43024000D23     NA    TI:993812543
>> TI:993834585
>> 2    255176 87869359 ABC8-43034700N15     NA    TI:995224581
>> TI:995237913
>> 3   1022033  1060472    ABC27-1253C21     NA   TI:2094436044
>> TI:2094696079
>> 4   1022033  1061172     ABC23-1388A1     NA   TI:2120730727
>> TI:2121592459
>>   FISH
>> 1   NA
>> 2   NA
>> 3   NA
>> 4   NA
>>
>>
>> Why is there a column named "row.names"?  I've tried a few different
>> ways of invoking this, but I always get the first column named
>> row.names,
>> and the rest of the columns shifted by one.
>>
>> Obviously I could fix this by using row.names<-, but I'd like to
>> understand
>> why this happens.  Any insight?
>>
>



Re: [R] Lavaan Package - How to Extract Residuals in Data Values

2012-07-17 Thread Ed Merkle

Dear Emily,

The lavaan package is typically used to fit models with latent 
variables, and these models are typically fit to the covariance matrix 
(and not necessarily to the raw data).  Thus, it is usually not 
straightforward to get data residuals from the fitted models. In your 
case, it appears that all variables are observed, so you could use 
"meanstructure=TRUE" within the sem() command to get the intercept for 
your regression.  Then I believe the residuals could be obtained manually.


I also wonder whether your specified model is really what you want. I 
believe that, if you estimate error in all your variables and also 
specify some covariances between independent variables, the model will 
be unidentified.  It appears that you are handling this by fixing b1 to 
be zero, but then you are effectively excluding LOG_SR_A_D from the 
model.  I wonder whether you can get by with a simple regression model 
as estimated by lm().


Ed

--
Ed Merkle, PhD
Assistant Professor
Department of Psychological Sciences
University of Missouri
Columbia, MO, USA 65211




On 7/9/12 1:25 PM, r-help-requ...@r-project.org wrote:

Date: Mon, 9 Jul 2012 11:41:33 -0400
From: Emily Zimmerman
To:r-help@r-project.org
Subject: [R] Lavaan Package - How to Extract Residuals in Data Values
Message-ID:

Content-Type: text/plain

Hello R Community,
I am using the Lavaan package in R 2.15.0 to analyze data collected from
1200 lakes across North America. My dataset includes 3 continuous
independent variables (LOG_NTL, LOG_PTL, and LOG_SR_A_D) and 1 continuous
dependent variable (BIOVOL) . I have successfully constructed structural
equation models using the Lavaan package (example included below with
code), but I have not been able to figure out how to extract the
residuals in the data values themselves (the unexplained values) of my
dependent variable, BIOVOL. For the last step of my analysis, I would like
to plot the residuals for BIOVOL against one of the independent variables
to see the relationship. I understand how to get the residuals for the
covariance matrix, but I do not know how to get the residuals in the data
values themselves for BIOVOL. Does anyone know how to extract residuals for
data values themselves in the Lavaan package?
Here is the code I am using to construct my model and the model that I am
trying to get the residuals for:
#Specify the model

>model2BIOVre <- 'BIOVOL ~ LOG_NTL + LOG_PTL + b1*LOG_SR_A_D

+ LOG_NTL ~~ LOG_PTL
+ LOG_NTL ~~ LOG_SR_A_D
+ b1 == 0'
#Fit the model with the sem function

>fit <- sem(model2BIOVre, data=lakes, fixed.x=FALSE, estimator="MLM")

#Summarize model

>summary(fit, fit.measures=TRUE, standardize=TRUE, rsq=TRUE)

Here is where I am stumped...I have read the package manuals, and tutorials
located at lavaan.ugent.be, as well as some by James Grace. I have also
tried to manipulate some other codes, but I can't get it.  I may have
missed something as I am relatively new to R, but it is not clear to me how
to do this.
Any help would be very much appreciated.
Thank you,
Emily Zimmerman




Re: [R] In rpart, how is "improve" calculated? (in the "class" case)

2011-06-14 Thread Ed Merkle

Tal,

For the Gini criterion, the "improve" value can be calculated as a 
weighted sum of the improvement in impurity.  Continuing with your 
original code:


# for "gini"
impurity_root<- gini(prop.table(table(y)))
impurity_l<- gini(prop.table(table(obs_0)))
impurity_R<-gini(prop.table(table(obs_1)))

# (13 and 7 are sample sizes in respective nodes)
13*(impurity_root - impurity_l) + 7*(impurity_root - impurity_R)
[1] 5.384615

This does not appear to extend immediately to the information criterion, 
however.  I'm not sure about the 6.84.


Ed


On 6/14/11 5:00 AM, r-help-requ...@r-project.org wrote:

--

Message: 4
Date: Mon, 13 Jun 2011 15:47:26 +0300
From: Tal Galili
To:r-help@r-project.org
Subject: [R] In rpart, how is "improve" calculated? (in the "class"
 case)
Message-ID:
Content-Type: text/plain

Hi all,

I apologize in advance if I am missing something very simple here, but since
I failed at resolving this myself, I'm sending this question to the list.

I would appreciate any help in understanding how the rpart function is
(exactly) computing the "improve" (which is given in fit$split), and how it
differs when using the split='information' vs split='gini' parameters.

According to the help in rpart.object:
"improve, which is the improvement in deviance given by this split"

From what I understand, that would mean that the "improve" value should not
be different when using different "split" switches.  Since it is different,
I suspect that it is reflecting the impurity measure somehow, but I
can't seem to understand how exactly.

Below is some simple R code showing the result for a simple classification
tree, with what the function outputs, and what I would have expected to see
if "improve" were to simply reflect the change in impurity.


set.seed(1324)
y<- sample(c(0,1), 20, T)
x<- y
x[1:5]<- 0
require(rpart)
fit<- rpart(y~x, method = "class", parms=list(split='information'))
fit$split[,3] # why is improve here 6.84 ?
fit<- rpart(y~x, method = "class", parms=list(split='gini'))
fit$split[,3] # why is improve here 5.38 ?


# Here is what I thought it should have been:
# for "information"
entropy<- function(p) {
  # works for the case when y has only 0 and 1 categories...
  if(any(p==1)) return(0)
  -sum(p*log(p,2))
}
gini<- function(p) {sum(p*(1-p))}

obs_1<- y[x>.5]
obs_0<- y[x<.5]
n_l<- sum(x>.5)
n_R<- sum(x<.5)
n<- length(x)

# for entropy (information)
impurity_root<- entropy(prop.table(table(y)))
impurity_l<- entropy(prop.table(table(obs_0)))
impurity_R<-entropy(prop.table(table(obs_1)))
# shouldn't this have been "improve" ??
impurity_root - ((n_l/n)*impurity_l + (n_R/n)*impurity_R) # 0.7272

# for "gini"
impurity_root<- gini(prop.table(table(y)))
impurity_l<- gini(prop.table(table(obs_0)))
impurity_R<-gini(prop.table(table(obs_1)))
impurity_root - ((n_l/n)*impurity_l + (n_R/n)*impurity_R) # 0.3757


Thanks upfront,
Tal


Contact
Details:---
Contact me:tal.gal...@gmail.com  |  972-52-7275845
Read me:www.talgalili.com  (Hebrew) |www.biostatistics.co.il  (Hebrew) |
www.r-statistics.com  (English)
--


--
*** Note new email address ***
Ed Merkle, PhD
Assistant Professor
Department of Psychological Sciences (starting August 2011)
University of Missouri
Columbia, MO, USA 65211



[R] logical to vector?

2012-03-07 Thread Ed Siefker
I am trying to use the coXpress function from
the coXpress package.  This function requires
numerical vectors indicating which columns
are in which group.

The problem is, I can only figure out how
to get a logical structure, not a numerical one.
In other words, coXpress wants something like:
"1:3"

 I have something like:
TRUE TRUE TRUE FALSE FALSE

Can I convert one into the other easily?
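
The base function which() does this conversion: it returns the positions of the TRUE entries of a logical vector. A minimal sketch, using the example values from the post (the variable name grp is made up):

```r
grp <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

which(grp)    # indices of the TRUE entries:  1 2 3
which(!grp)   # indices of the FALSE entries: 4 5
```

The two results could then be passed as the two group arguments, assuming the function expects plain integer column indices.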



[R] Rserve as a proxy

2012-03-12 Thread Ed Siefker
Is there a simple way to use Rserve/RSclient as a proxy to transparently
send requests from a local instance of R to a remote instance?   It seems
like this would be doable by wrapping each call that doesn't refer to a
local path inside RSeval.  Is this harder than it seems?  Does this already
exist somewhere?



[R] subsetting by cell value with a list

2012-03-15 Thread Ed Siefker
I would like to subset my data frame by matching all rows that have any value
from a list of values.  I can get it to work if I have exactly one value, but
I'm not sure how to do it with a list of values.

This works and gives me exactly one line:
my.df[ which( my.df$IDX==17), ]

I would like to do something like this:
my.df[ which( my.df$IDX==c(17, 42)), ]

Obviously that won't work, but I hope the meaning is clear.
What's the right way to express this?
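
The usual idiom for "any of these values" is %in%. With ==, the shorter vector is recycled along IDX and compared position by position, which is not a membership test. A minimal sketch with a made-up data frame (the column val and the object hits are hypothetical):

```r
# Hypothetical data frame with an IDX column.
my.df <- data.frame(IDX = c(3, 17, 42, 99), val = c("a", "b", "c", "d"))

# %in% tests each element of IDX for membership in the set, so it
# returns one TRUE/FALSE per row however many values are listed.
hits <- my.df[my.df$IDX %in% c(17, 42), ]
hits
```

which() is not needed here, because %in% never returns NA and a logical index works directly.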



[R] argument names inside a function?

2012-03-24 Thread Ed Siefker
Is there a way I can get the names of the arguments passed to a
function from within a function?



Re: [R] argument names inside a function?

2012-03-24 Thread Ed Siefker
Thanks, deparse(substitute()) does exactly what I want.

On Sat, Mar 24, 2012 at 4:20 PM, R. Michael Weylandt
 wrote:
> Can you be a little more concrete?
>
> If you want the form of the expression given (rather than its value),
> deparse(substitute()) will work:
>
> fnc1 <- function(x){ deparse(substitute(x))}
>
> fnc1(3) # "3"
>
> fnc1(x) # "x"
>
> fnc1(x + 4) # "x + 4"
>
> If you are passing them through the ... argument, you can coerce that
> to a list and use the names() attribute.
>
> If you want to reconstruct the exact call (e.g., for a modelling
> function), match.call() will do it.
>
> Hope this helps,
> Michael
>
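
The other two suggestions in the reply above (names of arguments passed through ... and match.call()) can be sketched as follows. The function names dot_names, dot_exprs and show_call are made up for illustration.

```r
# Names of the arguments passed through '...'
dot_names <- function(...) names(list(...))
nm <- dot_names(a = 1, b = 2)

# The expressions (not the values) supplied through '...';
# substitute() captures them unevaluated, deparse() turns each into text
dot_exprs <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  vapply(exprs, deparse, character(1))
}
ex <- dot_exprs(x + 4, log(y))

# The full call as the function sees it, with argument names filled in
show_call <- function(x, ...) match.call()
cl <- show_call(1, b = 2)
```

Note that dot_exprs never evaluates its arguments, so x and y do not need to exist.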



[R] avoiding for loops

2012-03-25 Thread Ed Siefker
I have data that looks like this:

> df1
  group id
1   red  A
2   red  B
3   red  C
4  blue  D
5  blue  E
6  blue  F


I want a list of the groups containing vectors with the ids. I am
avoiding subset(), as it is only recommended for interactive use.
Here's what I have so far:

df1 <- data.frame(group=c("red", "red", "red", "blue", "blue",
"blue"), id=c("A", "B", "C", "D", "E", "F"), stringsAsFactors=TRUE)

groups <- levels(df1$group)
byid <- lapply(groups, "==", df1$group)
groupIDX <- lapply(byid, which)

> groupIDX
[[1]]
[1] 4 5 6

[[2]]
[1] 1 2 3



This gives me a list of the indices for each group.  I want to subset
df1 based on this list.
If I want just one group I can do this:

> df1[groupIDX[[1]],]$id
[1] D E F


What sort of statement should I use if I want a result like:
[[1]]
[1] D E F
Levels: A B C D E F

[[2]]
[1] A B C
Levels: A B C D E F


So far, I've used a for loop.  Can I express this with apply statements?

groupIDs <- vector("list", length(groupIDX))
for (i in seq_along(groupIDX)) {
    groupIDs[[i]] <- df1[groupIDX[[i]],]$id
}
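
For reference, two loop-free ways to get the same result are sketched below: split() does the grouping in one call, and lapply() can reuse an index list like groupIDX. The data frame is rebuilt with stringsAsFactors = TRUE so that the columns are factors, as in the output shown above; the names ids_by_group and ids_by_index are made up.

```r
df1 <- data.frame(group = c("red", "red", "red", "blue", "blue", "blue"),
                  id = c("A", "B", "C", "D", "E", "F"),
                  stringsAsFactors = TRUE)

# split() partitions one vector by the levels of another
ids_by_group <- split(df1$id, df1$group)

# Equivalent with an explicit index list, mirroring the original approach
groupIDX <- lapply(levels(df1$group), function(g) which(df1$group == g))
ids_by_index <- lapply(groupIDX, function(i) df1$id[i])
```

split() returns a list named by group (blue, red), which is often more convenient than positional elements.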



[R] lapply and paste

2012-03-28 Thread Ed Siefker
I have a list of suffixes I want to turn into file names with extensions.

suff<- c("C1", "C2", "C3")
paste("filename_", suff[[1]], ".ext", sep="")
[1] "filename_C1.ext"

How do I use lapply() on that call to paste()?
What's the right way to do this:

filenames <-  lapply(suff, paste, ...)

?

Can I have lapply() reorder the arguments to FUN?



Re: [R] lapply and paste

2012-03-28 Thread Ed Siefker
Thank you, I was confused about that.  What exactly is lapply for then,
if R handles this kind of thing automatically?  Are there functions that are
not "vectorized"?
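
As a rough illustration of the distinction: paste() works on whole vectors, while a function such as seq_len() expects a single value, so applying it to several inputs needs lapply() or similar. The variable names below are made up.

```r
suff <- c("C1", "C2", "C3")

# Vectorized: one call handles every element
filenames <- paste0("filename_", suff, ".ext")

# Not vectorized: seq_len() takes a single length,
# so lapply() calls it once per element and collects a list
ramps <- lapply(c(2, 3), seq_len)

# lapply() with an anonymous function controls where each element
# lands among the arguments to paste0()
via_lapply <- lapply(suff, function(s) paste0("filename_", s, ".ext"))
```

lapply() is also the natural choice when each result is something other than a single value, since it always returns a list.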


On Wed, Mar 28, 2012 at 1:37 PM, R. Michael Weylandt
 wrote:
> I think you're confused about the need for lapply -- paste is
> vectorized so this
>
> paste("filename_", suff, ".ext", sep = "")
>
> will work. But if you want to use lapply (for whatever reason) try this:
>
> lapply(suff, function(x) paste("filename_", x, ".ext", sep = ""))
>
> Michael
>
> On Wed, Mar 28, 2012 at 2:31 PM, Ed Siefker  wrote:
>> I have a list of suffixes I want to turn into file names with extensions.
>>
>> suff<- c("C1", "C2", "C3")
>> paste("filename_", suff[[1]], ".ext", sep="")
>> [1] "filename_C1.ext"
>>
>> How do I use lapply() on that call to paste()?
>> What's the right way to do this:
>>
>> filenames <-  lapply(suff, paste, ...)
>>
>> ?
>>
>> Can I have lapply() reorder the arguments to FUN?
>>



[R] Sys.setlocale() and text()

2007-12-11 Thread Ed Merkle
Dear HelpeRs,

I have a question about the Sys.setlocale() command and plotting.  I am 
running Windows XP, with R 2.6.1.  My default locale is English_United 
States.1252.

I am trying to add a lowercase sigma to a plot using the following code:

Sys.setlocale("LC_CTYPE","greek")
plot(1:10,1:10)
text(4,3,"\xF3")


For R 2.6.1, this code gives me the glyph from my default (1252) instead 
of from the 1253 codes.  For an older version of R (2.3.0) on the same 
computer, this code gives me the lowercase sigma that I wanted.  I have 
been unable to pinpoint what has changed.  Thanks for the help, and I 
apologize if I am missing something obvious.
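
A locale-independent way to write the character is a Unicode escape rather than a code-page byte. This is only a sketch for current versions of R; whether it picks up a particular Rdevga font on the Windows setup described above is not something it can confirm.

```r
# U+03C3 is GREEK SMALL LETTER SIGMA; the escape does not depend on LC_CTYPE
sigma <- "\u03c3"

plot(1:10, 1:10)
text(4, 3, sigma)
```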


-- 
Ed Merkle, PhD
Assistant Professor
Dept. of Psychology
Wichita State University
Wichita, KS 67260



Re: [R] Sys.setlocale() and text()

2007-12-12 Thread Ed Merkle
Thanks very much for the response.  I think I left out an important 
detail, however.

I want my lowercase sigma to be displayed in a specific font from the 
Rdevga file (my project involves fonts).  So far as I know, quote() does 
not allow me to select a font.  Thus, I am specifically interested in 
the text() command and reasons why my example code performs differently 
in R 2.3.0 vs 2.6.1.

Thanks,
Ed


Gabor Grothendieck wrote:
> Try this:
> 
> plot(1:10, main = quote(sigma ^ 2))
> 
> 
> On Dec 11, 2007 10:09 PM, Ed Merkle <[EMAIL PROTECTED]> wrote:
>> Dear HelpeRs,
>>
>> I have a question about the Sys.setlocale() command and plotting.  I am
>> running Windows XP, with R 2.6.1.  My default locale is English_United
>> States.1252.
>>
>> I am trying to add a lowercase sigma to a plot using the following code:
>>
>> Sys.setlocale("LC_CTYPE","greek")
>> plot(1:10,1:10)
>> text(4,3,"\xF3")
>>
>>
>> For R 2.6.1, this code gives me the glyph from my default (1252) instead
>> of from the 1253 codes.  For an older version of R (2.3.0) on the same
>> computer, this code gives me the lowercase sigma that I wanted.  I have
>> been unable to pinpoint what has changed.  Thanks for the help, and I
>> apologize if I am missing something obvious.
>>
>>
>> --
>> Ed Merkle, PhD
>> Assistant Professor
>> Dept. of Psychology
>> Wichita State University
>> Wichita, KS 67260
>>
>>
>



Re: [R] text vector clustering

2009-01-23 Thread Ed Merkle

Srinivas,

I don't know of a clustering algorithm, but you might check out agrep() 
from the base package and stringMatch() from the MiscPsycho package. 
These can help to identify similar text sequences, and it may be 
possible to group similar names by using these commands over and over again.
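
One possible way to turn string similarity into groups is to compute pairwise edit distances with adist() (in the utils package) and feed them to hclust(). This is a sketch on a handful of made-up names, not the approach described above; the cut height of 3 edits is an arbitrary assumption.

```r
# Made-up names with typos, standing in for the real list
students <- c("John Smith", "Jon Smith", "John Smyth",
              "Maria Garcia", "Maria Gracia", "Wei Chen")

# Pairwise Levenshtein (edit) distances between all names
d <- adist(students)
rownames(d) <- students

# Hierarchical clustering on those distances
tree <- hclust(as.dist(d), method = "average")

# Cut the tree so that names a few edits apart share a group
groups <- cutree(tree, h = 3)
grouped <- split(students, groups)

# agrep() does approximate matching of one pattern against the vector
near_john <- agrep("John Smith", students, max.distance = 2, value = TRUE)
```

With 30,000 names the full distance matrix gets large, so working on the unique names first would likely be necessary.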


Ed

--
Ed Merkle, PhD
Assistant Professor
Dept. of Psychology
Wichita State University
Wichita, KS 67260



Date: Thu, 22 Jan 2009 16:33:03 +0530
From: srinivasa raghavan 
Subject: [R] text vector clustering
To: r-help@r-project.org
Message-ID:

Content-Type: text/plain

Hi,

I am a new user of R, using R 2.8.1 on Windows 2003. I have a CSV file with a
single column which contains 30,000 student names. There were typing
errors when these names were entered. The actual list of names is <
1000. However, we don't have that list for keyword search.

I am interested in grouping/clustering these names by how similar they are,
letter by letter. Is there any text clustering algorithm in R
which can group names of similar type into segments of exactly matching,
90% matching, 80% matching, etc.?

thanks in advance,

regards,
srinivas
statistical analyst.




Re: [R] Couple of Questions about Classification trees

2009-03-12 Thread Ed Merkle
The issue with the sample size is that there are so many measurements in 
comparison to the number of meat samples.


Aside from that, you should check out the rpart package.  Its commands 
are similar to the tree package, but there are more options for the 
plots.  I don't know immediately how to display misclassification rates, 
but the text.rpart command can display numbers of incorrectly- and 
correctly-classified observations in each node.
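
A minimal sketch of the rpart route, using the built-in iris data as a stand-in for the meat spectra (the split into train and test and all variable names are assumptions). It also shows prediction on held-out data: predict() takes the new rows through the newdata argument.

```r
library(rpart)

# Stand-in data: half of iris for training, the rest for testing
set.seed(1)
train_rows <- sample(nrow(iris), nrow(iris) / 2)
train <- iris[train_rows, ]
test  <- iris[-train_rows, ]

# Classification tree on the training half
fit <- rpart(Species ~ ., data = train, method = "class")

# Predicted classes for the held-out half (note 'newdata', not 'data')
pred <- predict(fit, newdata = test, type = "class")

# Which classes get confused with which, and the overall error rate
confusion <- table(actual = test$Species, predicted = pred)
test_error <- mean(pred != test$Species)

# Tree diagram with per-node class counts printed under each label
plot(fit, margin = 0.1)
text(fit, use.n = TRUE)
```

The off-diagonal cells of confusion answer the "can we tell chicken from turkey" kind of question directly.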


Ed

--
Ed Merkle, PhD
Assistant Professor
Dept. of Psychology
Wichita State University
Wichita, KS, USA 67260



Date: Wed, 11 Mar 2009 13:53:46 -0700 (PDT)
From: Jen_mp3 
Subject: Re: [R] Couple of Questions about Classification trees
To: r-help@r-project.org
Message-ID: <22464302.p...@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii



Okay, perhaps I should've been more clear about the data. I'm actually working
on spectroscopic measurements from food authenticity testing. I have five
different types of meat: 55 of chicken, 55 of turkey, 55 of pork, 34 of beef
and 32 of lamb - 231 in total. On each of these 231 meats, 1024
spectroscopic measurements were taken, giving a matrix of 231 by 1024. But the
questions I want answered are which of the 1024 measurements are important
for predicting meat type and which of the different types of meat are
incorrectly classified - i.e. can we tell the difference between chicken and
turkey. So to carry out a multivariate analysis on the data I've split it
into two: a training data set and a test data set - half and half, although I
think the larger half (55 goes into 27 and 28) went into the test data set,
which explains the inequalities in the row numbers. By the way, 1024 is
standard - can't change that. Can't change the 231 either.

So I created a new column with the meat type for each row.

End up with the following R code:
library(tree)
meat.tree <- tree(meat.type~., data=train)
Using cv.tree, the lowest misclassification rate is at size 5, so I cut the
number of nodes down to 5 using prune.tree
prunedtree <- prune.tree(meat.tree, best = 5, method = "misclass")
Then I want to use predict.tree and the test data set.
predicttree <- predict.tree(prunedtree, data = test)
I already said what it produces.

Again, how would I display the misclassification rate at each node on the
diagram? I know about misclass.tree(prunedtree, detail = TRUE) but that
doesn't actually display them on the classification tree - it just gives a
bunch of numbers of the worksheet and it just wouldn't look very neat if I
had to add them later.

--
View this message in context: 
http://www.nabble.com/Couple-of-Questions-about-Classification-trees-tp22461673p22464302.html
Sent from the R help mailing list archive at Nabble.com.




[R] RWeb Server

2013-08-11 Thread Wiebe, Ed@CDCR
Hello,

I am looking for tutorials on setting up R on a Windows 2008 server for the 
purpose of making calls from web pages (e.g. SharePoint) or report engines 
(e.g. SSRS) to embed inline dynamically rendered R content. I found the link to 
RWeb (http://www.math.montana.edu/Rweb/) which I was hoping would get me 
started in the right direction, but apparently the links are broken and the 
contact is no longer available. Other than that I have not been able to find 
related support.

Is there any help you can offer to get me going?

Thank you!

Ed




Ed Wiebe, Manager
Enterprise Architecture, Enterprise Information Services
California Department of Corrections and Rehabilitation
1900 Birkmont Drive
Rancho Cordova, CA 95742
1-916-358-1866 Desk
1-916-358-2019 Fax
ed.wi...@cdcr.ca.gov

"The key to successfully doing something is in successfully understanding what 
you're doing."
- Thomas Erl



Re: [R] Best 64-bit Linux distro for R?

2009-02-08 Thread M. Edward (Ed) Borasky
On Sun, Feb 8, 2009 at 3:54 PM, Dirk Eddelbuettel  wrote:
> To differentiate the then-different
> chips of AMD from Intel's Itanium ia64 line, the 'amd64' name was
> introduced.  These days ia64 is ancient history and "we're all amd64 users".
>
> By the way, if you decide to go with Ubuntu or Debian, the r-sig-debian list
> is there to help.
>
> Hth, Dirk

You will also see the "amd64" architecture referred to by the name
"x86_64". It is yet another name for the same architecture. As far as
the choice of distro is concerned, I've had total success with R on
all the major distros on my 64-bit machine. But I would definitely
give the nod to a Debian-based distro like Ubuntu because of the large
existing base of R packages in the Debian / Ubuntu repositories.
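
To see which architecture a given R build reports, something like the following can be run from within R. The variable names are illustrative, and the exact string depends on the platform.

```r
# Architecture string R was built for (for example "x86_64" on amd64 systems)
arch <- R.version$arch

# Pointer size in bytes: 8 on a 64-bit build, 4 on a 32-bit build
pointer_bytes <- .Machine$sizeof.pointer
is_64bit <- pointer_bytes == 8
```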
-- 
M. Edward (Ed) Borasky

I've never met a happy clam. In fact, most of them were pretty steamed.



Re: [R] installing R on Ubuntu

2009-02-09 Thread M. Edward (Ed) Borasky
On Mon, Feb 9, 2009 at 4:51 AM, Neil Shephard  wrote:
>
> The perceived "difficulty" of installing R under whatever flavour of
> GNU/Linux in this thread stems from being unfamiliar with the process of the
> package management of the flavour of GNU/Linux you use (and in part by the
> various distros not having the most recent version of R in their
> repositories).
>
> People who say "why can't it be as easy as downloading a self-installing
> binary and running that" are trying to fit a round peg (their experience and
> understanding of how applications install in M$-windows) in a square hole
> (or triangular, hexagonal, or whatever depending on the distribution of
> GNU/Linux).

This is true. However, for the most common Linux distros -- Debian, Red
Hat Enterprise / CentOS / Scientific Linux / Fedora, openSUSE and
Ubuntu -- you can install the most recent R compiled for your distro
from

http:///bin/linux/

In addition, most of the distros have third-party repositories where
you can find the latest version of R. In short, if you have an x86 or
x86_64/amd64 system running almost any Linux, you can find a
pre-compiled R. R is a popular package, and it's pretty easy to find
even for Power PC or some of the obscure architectures.

>
> There are pros and cons to each of the GNU/Linux flavours and it's really a
> matter of deciding which you like/have invested time in learning.
>
> Irrespective, it's still simple to install R from source under GNU/Linux...
>
> 1) Download source tar-ball
> 2) Extract and cd to the directory
> 3) ./configure --prefix=/where/you/want/R/to/go (optionally setting the
> install path at this stage)
> 4) make
> 5) make install
>
> ...all documented in the FAQ at
> http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-R-be-installed-_0028Unix_0029

Many Linux distros do *not* install the development tools by default,
and which ones live in which packages varies by distro. Fedora in
particular is extremely stripped when you install from the LiveCD. You
have to install gcc, make and a couple of other things just to install
VMware Tools, for example, when running Fedora as a VMware guest. For
building R from source and installing R packages, you'll also need to
install gfortran. And many libraries with external dependencies, like
Rgraphviz, will require not only the package itself (graphviz) but
also the C headers, which may have the name "graphviz-devel" on some
distros and some other name on other distros.
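
A quick way to check from within R whether the basic build tools mentioned above are on the PATH is Sys.which(). This is only a sketch: it does not check versions or library headers, and the list of tools is an assumption based on the ones named here.

```r
# Look up each tool on the PATH; missing ones come back as empty strings
tools <- Sys.which(c("gcc", "make", "gfortran"))

# Names of any tools that were not found
missing_tools <- names(tools)[tools == ""]

if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
}
```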
>
> This might not be as clean as using the native package management, but does
> mean that you'll have the latest version installed.
>
> Neil
>
> (Addendum - I've tried several different distros, starting with RedHat 7.3,
> then various versions of Slackware 8 through to 9 before settling on Gentoo,
> all were easy to install R in).

I just recently switched from Gentoo to openSUSE. Gentoo usually had
the latest R source in their repository within a day or so of it
coming out of the R Project release cycle. To get it, all you needed
to do was put the package name in the "/etc/portage/package.keywords"
file. And Gentoo, since it is almost all compiled from source, by
nature *does* have all the development tools installed and installs
all the headers when it installs packages.

-- 
M. Edward (Ed) Borasky

I've never met a happy clam. In fact, most of them were pretty steamed.



[R] Scale

2018-05-20 Thread Ed medicine via R-help

I would like to get horizontal numbers on both axes, X and Y.
I got horizontal numbers only on the Y axis when adding las=2.
How can I obtain a horizontal orientation for the numbers on the X axis as well
(now they are vertical)? Here is my code:
plot(survfit(Y~addicts$clinic), fun="cloglog", las=2)
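
For reference, the las graphical parameter takes four values: 0 (parallel to the axis, the default), 1 (always horizontal), 2 (perpendicular to the axis) and 3 (always vertical). A sketch with made-up data; the commented line applies the same setting to the call above and assumes Y and addicts exist as in the original code.

```r
x <- 1:10

# las = 1 draws the tick labels horizontally on both axes
plot(x, x * 1000, las = 1)

# Same idea for the survival plot from the message (not run here):
# plot(survfit(Y ~ addicts$clinic), fun = "cloglog", las = 1)
```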






[R] rounding

2008-07-10 Thread Korn, Ed (NIH/NCI) [E]
Hi,
 
round(0.55,1)=0.5
 
round(2.55,1)=2.6
 
Can this be right?
 
Thanks,
 
Ed
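
Results like these usually trace back to binary floating point: neither 0.55 nor 2.55 can be stored exactly, so each is held as the nearest representable double, which lies slightly above or below the decimal value. Printing more digits shows this. The exact output of round() in such borderline cases has differed between R versions (the algorithm was revised in R 4.0.0), so no expected result is given for it here.

```r
# Show the stored values to 20 decimal places
stored <- sprintf("%.20f", c(0.55, 2.55))
print(stored)

# What this version of R makes of the borderline cases
rounded <- round(c(0.55, 2.55), 1)
print(rounded)
```

The first stored value is slightly above 0.55 and the second slightly below 2.55, so neither is an exact tie at the first decimal place.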
