[R] Splitting dataset for Tuning Parameter with Cross Validation

2009-07-12 Thread Tim

Hi,
My question might be a little general.

I have a number of values to select for the complexity parameters in some 
classifier, e.g. the C and gamma in SVM with RBF kernel. The selection is based 
on which values give the smallest cross validation error.

I wonder if the randomized splitting of the available dataset into folds is 
done only once for all those choices for the parameter values, or once for each 
choice? And why?

Thanks and regards!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Splitting dataset for Tuning Parameter with Cross Validation

2009-07-13 Thread Tim


It seems to me that if the split is done once for all parameter choices the bias 
will be large, and if it is done once for each choice of parameters the variance 
will be large.

In LibSVM, for each choice of (C, gamma), the search script grid.py calls 
svm_cross_validation(), which performs its own random split of the dataset. So that 
appears to be the second method.

As to the first method, I came across it in Ch. 7, Section 10 of "The Elements of 
Statistical Learning" by Hastie et al., which says to first split the dataset, then 
evaluate the validation error CV(alpha) while varying the complexity parameter 
alpha to find the value giving the smallest validation error. There the splitting 
appears to be done once for all choices of the complexity parameter.
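
For concreteness, here is a minimal sketch of the first scheme (an illustration 
added here, not from the original posts; it assumes the e1071 package is 
installed). One random fold assignment is fixed up front and reused for every 
(C, gamma) candidate, so all candidates are compared on exactly the same splits:

library(e1071)
set.seed(1)
dat <- iris[iris$Species != "setosa", ]             # toy two-class problem
dat$Species <- factor(dat$Species)
k <- 5
folds <- sample(rep(1:k, length.out = nrow(dat)))   # split once, reused for every (C, gamma)
grid <- expand.grid(C = 10^(-1:2), gamma = 10^(-2:0))
cv.err <- apply(grid, 1, function(p) {
  mean(sapply(1:k, function(i) {
    fit <- svm(Species ~ ., data = dat[folds != i, ],
               cost = p["C"], gamma = p["gamma"])   # RBF kernel is the default
    mean(predict(fit, dat[folds == i, ]) != dat$Species[folds == i])
  }))
})
grid[which.min(cv.err), ]    # (C, gamma) with the smallest cross-validation error

Re-splitting for each candidate would just mean moving the sample() call inside 
the apply().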

Thanks!

--- On Sun, 7/12/09, Tim  wrote:

> From: Tim 
> Subject: [R] Splitting dataset for Tuning Parameter with Cross Validation
> To: r-h...@stat.math.ethz.ch
> Date: Sunday, July 12, 2009, 6:58 PM
> 
> Hi,
> My question might be a little general.
> 
> I have a number of values to select for the complexity
> parameters in some classifier, e.g. the C and gamma in SVM
> with RBF kernel. The selection is based on which values give
> the smallest cross validation error.
> 
> I wonder if the randomized splitting of the available
> dataset into folds is done only once for all those choices
> for the parameter values, or once for each choice? And why?
> 
> Thanks and regards!
> 
> __
> R-help@r-project.org
> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained,
> reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Get (feature, threshold) from Output of rpart() for Stump Tree

2009-05-08 Thread Tim
Hi, 
I have a question regarding how to get some partial information
from the output of rpart, which could be used as the first argument to
predict. For example, in my code, I try to learn a stump tree (decision
tree of depth 2):

   "fit    <- rpart(y~bx, weights = w/mean(w), control = cntrl)
    print(fit)

    btest[1,]  <- predict(fit, newdata = data.frame(bx)) "

I found that "fit" is of  mode "list" and length 12. If I "print(fit)", I will 
get as output:


"n= 124 
node), split, n, deviance, yval
  * denotes terminal node
1) root 124 61.54839 0.7096774  
  2) bx.21< 13.5 41 40.39024 0.1219512 *
  3) bx.21>=13.5 83  0.0 1.000 *"



I don't want the whole output of print(fit), only two pieces of
information from it: the "21" in "bx.21", which I believe is the feature ID
used by the stump, and 13.5, which I believe is the threshold on that
feature.  If I can get these two values out, I can process them further or
write them to a file.



Any hint?



Thanks and regards!



-Tim







  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Get (feature, threshold) from Output of rpart() for Stump Tree

2009-05-08 Thread Tim
Thank you so much!

It seems that fit$splits[1,] does not contain the feature ID:
 "> fit$splits[1,]
  count    ncat improve   index adj 
124.000  -1.000   0.3437644  13.500   0.000 "

However help(rpart.object) says:
" splits: a matrix describing the splits.  The row label is the name of
  the split variable,..."

I try  to get the row label of fit$splits[1,] by
"> names(fit$splits[1,])
[1] "count"   "ncat"    "improve" "index"   "adj" "

However it has no feature ID. Is this the correct way to get the row label of 
fit$splits[1,]?
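
For what it's worth, a minimal sketch of one way to recover both pieces (added 
here for illustration, using the fitted object from the code above): subsetting a 
single matrix row drops its row name, so query rownames() on the full splits 
matrix instead.

rownames(fit$splits)[1]        # split variable name, e.g. "bx.21"
fit$splits[1, "index"]         # split point for a continuous variable, e.g. 13.5
fit$splits[1, , drop = FALSE]  # keeping the row as a matrix preserves its label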

Regards,
- Tim

--- On Fri, 5/8/09, Terry Therneau  wrote:
From: Terry Therneau 
Subject: Re:  Get (feature, threshold) from Output of rpart() for Stump  Tree
To: "Tim" 
Cc: r-help@r-project.org
Date: Friday, May 8, 2009, 8:05 AM

--- begin included message --
Hi, 
I have a question regarding how to get some partial information
from the output of rpart, which could be used as the first argument to
predict. For example, in my code, I try to learn a stump tree (decision
tree of depth 2):

   "fit    <- rpart(y~bx, weights = w/mean(w), control =
cntrl)

--- end inclusion ---

 1. For stump trees, you can use the depth option in rpart.control to get a 
small tree.  You also might want to set maxsurrogate=0 for speed.
 
 2. Try help(rpart.object) for more information on what is contained in the 
returned rpart object.  In your case fit$splits[1,] would contain all that you 
need.
 
Terry T.





  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ROCR: auc and logarithm plot

2009-05-12 Thread Tim
Hi,
I am quite new to R and I have two questions regarding ROCR.

1. I have tried to understand how to extract area-under-curve value by looking 
at the ROCR document and googling. Still I am not sure if I am doing the right 
thing. Here is my code, is "auc1" the auc value?
"
pred1 <- prediction(resp1,label1)

perf1 <- performance(pred1,"tpr","fpr")
plot( perf1, type="l",col=1 )

auc1 <- performance(pred1,"auc")
auc1 <- auc1@y.values[[2]]
"

2. I have to compare two models that have very close ROCs. I'd like a more 
distinguishable plot of the two curves. Is it possible to use a logarithmic FP 
axis, which might separate them better? Or to zoom in on the part close to the 
upper-left corner of the ROC plot? Or is there any other way to make the ROCs 
easier to tell apart?

Thanks and regards!

--Tim



  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ROCR: auc and logarithm plot

2009-05-12 Thread Tim
Thanks Tobias!
A new question: if I want to draw an average ROC from cross-validation, how do I 
make the bar color the same as the line color? Here is my code:

"plot( perf2,avg="threshold",lty=2,col=2, spread.estimate="stddev",barcol=2)"

Even though I specify "barcol=2", the bars are still black (the default) 
instead of red (2).

--Tim


--- On Tue, 5/12/09, Tobias Sing  wrote:
From: Tobias Sing 
Subject: Re: [R] ROCR: auc and logarithm plot
To: timlee...@yahoo.com, r-help@r-project.org
Date: Tuesday, May 12, 2009, 5:54 AM

> 1. I have tried to understand how to extract area-under-curve value by
looking at the ROCR document and googling. Still I am not sure if I am doing the
right thing. Here is my code, is "auc1" the auc value?
> "
> pred1 <- prediction(resp1,label1)
>
> perf1 <- performance(pred1,"tpr","fpr")
> plot( perf1, type="l",col=1 )
>
> auc1 <- performance(pred1,"auc")
> auc1 <- a...@y.values[[2]]
> "


If you have only one set of predictions and matching class labels, it
would be in auc1@y.values[[1]].
If you have multiple sets (as from cross-validation or bootstrapping),
the AUCs would be in auc1@y.values[[1]], auc1@y.values[[2]], etc.
You can collect all of them for example by unlist(p...@y.values).

Btw, you can use str(auc1) to see the structure of objects.
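
For example, a small sketch (added here, not part of the original reply), using 
the prediction object pred1 from the code above:

auc.perf <- performance(pred1, "auc")
auc.values <- unlist(auc.perf@y.values)   # one AUC per set of predictions/labels
auc.values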


> 2. I have to compare two models that have very close ROCs. I'd like to
have a more distinguishable plot of the ROCs. So is it possible to have a
logarithm FP axis which might probably separate them well? Or zoom in the part
close to the leftup corner of ROC plot? Or any other ways to make the ROCs more
separate?


To "zoom in" to a specific part:
plot(perf1, xlim=c(0,0.2), ylim=c(0.7,1))
plot(perf2, add=TRUE, lty=2, col='red')

If you want logarithmic axes (though I wouldn't personally do this for
a ROC plot), you can set up an empty canvas and add ROC curves to it:
plot(1,1, log='x', xlim=c(0.001,1), ylim=c(0,1), type='n')
plot(perf, add=TRUE)

You can adjust all components of the performance plots. See
?plot.performance and the examples in this slide deck:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Hope that helps,
  Tobias



  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] #Keeping row names when using as.data.frame.matrix

2013-05-17 Thread Tim
#question I have the following data set:

Date<-c("9/7/2010","9/7/2010","9/7/2010","9/7/2010","9/7/2010","9/7/2010","9/8/2010")

EstimatedQuantity<-c(3535,2772,3279,3411,3484,3274,3305)

ScowNo<-c("4001","3002","4002","BR 8","4002","BR 8","4001")

dataset<- data.frame(EstimatedQuantity,Date,ScowNo)

#I'm trying to convert the data set into a contingency table and then back
into a regular data frame:


xtabdata<-as.data.frame.matrix(xtabs(EstimatedQuantity~Date+ScowNo,data=dataset),
 row.names=(dataset$Date),optional=F)

#I'm trying to keep the row names (in xtabsdata) as the dates.
#But the row names keep coming up as integers.
#How can I preserve the row names as dates when
#the table is converted back to a data frame?
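
#One possible fix (a sketch added here, not part of the original post):
#drop the row.names argument, since as.data.frame.matrix() already carries
#over the table's dimnames, so the Date levels become the row names.

tab <- xtabs(EstimatedQuantity ~ Date + ScowNo, data = dataset)
xtabdata <- as.data.frame.matrix(tab)
rownames(xtabdata)
# [1] "9/7/2010" "9/8/2010"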





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] need help reshaping table using aggregate

2012-06-20 Thread Tim
I am trying to learn how to reshape my data set.  I am new to R, so please
bear with me.  Basically, I have the following data set:

site<-c("A","A","B","B")
bug<-c("spider","grasshopper","ladybug","stinkbug")
count<-c(2,4,6,8)
myf <- data.frame(site, bug, count)
myf

  site         bug count
1    A      spider     2
2    A grasshopper     4
3    B     ladybug     6
4    B    stinkbug     8

This means that in site A, I found 2 spiders and 4 grasshopper.  In site B,
I found 6 ladybugs and 8 stinkbugs.

I would like to change the df to aggregate the site column and make the bugs
columns so it arranged like this:

  site spider grasshopper ladybug stinkbug
1    A      2           4       0        0
2    B      0           0       6        8
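
One possible approach (a sketch added here, not from the original post) uses 
xtabs(), which fills the missing site/bug combinations with 0; note the bug 
columns come out in alphabetical order:

wide <- as.data.frame.matrix(xtabs(count ~ site + bug, data = myf))
wide <- data.frame(site = rownames(wide), wide, row.names = NULL)
wide
#   site grasshopper ladybug spider stinkbug
# 1    A           4       0      2        0
# 2    B           0       6      0        8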



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Installation: not creating necessary directories

2008-10-29 Thread Tim
I have tried installing R on a web server on which I have a user
account but not root access.

I checked and the PERL, Fortran, etc. prerequisites all seem in order.

Compiling R with:

% ./configure --with-x=no

works fine, without errors.


When I try "make check", however, I soon get an error because it cannot find
files that should have been built.  Most of the files seem to have
come through, though.

...

collecting examples for package 'base' ...
make[5]: Entering directory `/home/USERACCOUNT/mybin/R-2.8.0/src/library'
/bin/sh: ../../bin/R: No such file or directory
make[5]: *** [Rdfiles] Error 127
make[5]: Leaving directory `/home/USERACCOUNT/mybin/R-2.8.0/src/library'
file ../../library/base/R-ex cannot be opened at
../../share/perl/massage-Examples.pl line 136.

...

Is there any advice on what might be happening or on what I might need to do?

Thanks,
Tim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] CRAN I-MR / Xbar-R / Xbar-S control chart package ?

2017-07-08 Thread Tim Smith
Hi,

I've had a quick look through the package list, and unless I've missed
something, I can't seem to find anything that will do I-MR / Xbar-R /
Xbar-S control charts.

Assuming there is something out there, can anyone point me in the
right direction?

Thanks!

Tim

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] CRAN I-MR / Xbar-R / Xbar-S control chart package ?

2017-07-09 Thread Tim Smith
Interesting, thanks for that.  I came across qcc, but from my quick scan of
the docs I thought it only did Xbar charts. Maybe I need to re-read the
docs; I guess it does the individuals-chart (I-MR) versions too.
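
For reference, a rough sketch of the relevant chart types in qcc (added here for 
illustration; the data are made up, and I have not checked whether qcc offers a 
dedicated moving-range chart):

library(qcc)
set.seed(1)
x  <- rnorm(50, mean = 10)                     # individual measurements
xg <- matrix(rnorm(100, mean = 10), ncol = 5)  # 20 subgroups of size 5 (one per row)
qcc(x,  type = "xbar.one")   # individuals (I) chart
qcc(xg, type = "xbar")       # Xbar chart
qcc(xg, type = "R")          # R chart
qcc(xg, type = "S")          # S chart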

Tim

On 8 July 2017 at 20:53, Rui Barradas  wrote:
> Hello,
>
> I have no experience with I-MR charts but a google search found package qcc.
> Maybe it's what you're looking for.
>
> Hope this helps,
>
> Rui Barradas
>
>
> Em 08-07-2017 09:07, Tim Smith escreveu:
>>
>> Hi,
>>
>> I've had a quick look through the package list, and unless I've missed
>> something, I can't seem to find anything that will do I-MR / Xbar-R /
>> Xbar-S control charts ?
>>
>> Assuming there is something out there, can anyone point me in the
>> right direction ?
>>
>> Thanks !
>>
>> TIm
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Interesting behavior of lm() with small, problematic data sets

2017-09-05 Thread Glover, Tim
I've recently come across the following results reported from the lm() function 
when applied to a particular type of admittedly difficult data.  When working with
small data sets (for instance, 3 points) that have the same response for different 
values of the predicting variable, the resulting slope estimate is a reasonable 
approximation of the expected 0.0, but the p-value of that slope estimate is a 
surprising value.  A reproducible example is included below, along with the output 
of the summary of results.

# example code
x <- c(1,2,3)
y <- c(1,1,1)

#above results in{ (1,1) (2,1) (3,1)} data set to regress

new.rez <- lm (y ~ x) # regress constant y on changing x)
summary(new.rez) # display results of regression

# end of example code

Results:

Call:
lm(formula = y ~ x)

Residuals:
         1          2          3 
 5.906e-17 -1.181e-16  5.906e-17 

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)    
(Intercept)  1.000e+00  2.210e-16  4.525e+15   <2e-16 ***
x           -1.772e-16  1.023e-16 -1.732e+00    0.333    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.447e-16 on 1 degrees of freedom
Multiple R-squared:  0.7794,Adjusted R-squared:  0.5589
F-statistic: 3.534 on 1 and 1 DF,  p-value: 0.3112

Warning message:
In summary.lm(new.rez) : essentially perfect fit: summary may be unreliable


##

There is a warning that the summary may be unreliable due to the essentially 
perfect fit, but a p-value of 0.3112 doesn't seem reasonable.
As a side note, the various R^2 values seem odd too.







Tim Glover
Senior Scientist II (Geochemistry, Statistics), Americas - Environment & 
Infrastructure, Amec Foster Wheeler
271 Mill Road, Chelmsford, Massachusetts, USA 01824-4105
T +01 978 692 9090  D +01 978 392 5383  M +01 850 445 5039
tim.glo...@amecfw.com  amecfw.com



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] rtmvnorm {tmvtnorm} seems broken

2014-12-09 Thread Tim Benham
General linear constraints don't seem to work. I get an error message if I
have more constraint equations than variables. E.g. executing the following
code

print(R.version)
library('tmvtnorm')
cat('tmvtnorm version ')
print(packageVersion('tmvtnorm'))
## Let's constrain our sample to the dwarfed hypercube of dimension p.
p <- 3  # dimension
mean <- rep(0,p)
sigma <- diag(p)
## a <= Dx <= b
a <- c(rep(0,p),-Inf)
b <- c(rep(1,p),2)
D <- rbind(diag(p),rep(1,p))
cat('mean is\n'); print(mean)
cat('a is\n'); print(a)
cat('b is\n'); print(b)
cat('D is\n'); print(D)
X <- rtmvnorm(n=1000, mean, sigma, D=D, lower=a, upper=b, algorithm="gibbsR")

produces the following output

platform   x86_64-w64-mingw32
arch   x86_64
os mingw32
system x86_64, mingw32
status
major  3
minor  1.0
year   2014
month  04
day10
svn rev        65387
language   R
version.string R version 3.1.0 (2014-04-10)
nickname   Spring Dance
tmvtnorm version [1] '1.4.9'
mean is
[1] 0 0 0
a is
[1]    0    0    0 -Inf
b is
[1] 1 1 1 2
D is
 [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1
[4,]    1    1    1
Error in checkTmvArgs(mean, sigma, lower, upper) (from rtmvnorm-test.R#18) :
  mean, lower and upper must have the same length

That error message is not appropriate when a matrix of linear constraints is
passed in. I emailed the package maintainer on the 3rd but received only an
automatic out-of-office reply.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] read.table with missing data and consecutive delimiters

2015-02-11 Thread Tim Victor
All,

Assume we have data in an ASCII file that looks like

Var1$Var2$Var3$Var4
1$2$3$4
2$$5
$$$6

When I execute

read.table( 'test.dat', header=TRUE, sep='$' )

I, of course, receive the following error:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
  line 2 did not have 4 elements

When I set fill=TRUE, e.g., read.table( 'test.dat', header=TRUE, sep='$',
fill=TRUE )

I get:

  Var1 Var2 Var3 Var4
1    1    2    3    4
2    2   NA    5   NA
3   NA   NA   NA    6


What I need is

  Var1 Var2 Var3 Var4
1    1    2    3    4
2    2   NA   NA    5
3   NA   NA   NA    6

What am I missing?

Thanks,

Tim


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem with labeling plots, possibly in font defaults

2014-08-07 Thread Tim Blass
Hello,

I am using R 3.1.1 on a (four year old) MacBook, running OSX 10.9.4.

I just tried making and labeling a plot as follows:

> x<-rnorm(10)
> y<-rnorm(10)
> plot(x,y)
> title(main="random points")

which produces a scatter plot of the random points, but without the title
and without any numbers along the axes. If I then run

> par(family="sans")
> plot(x,y,main="plot title")

the plot has the title and the numbers on the axes (also, 'x' and 'y'
appear as default labels for the axes).

I do not know what is going on, but maybe there is some problem in the
default font settings (I don't know if that could be an R issue or an issue
specific to my Mac)?

This is clearly not a big problem (at least for now) since I can put labels
on plots by running par(), but if it is indicative of a larger underlying
problem (or if there is a simple fix) I would like to know.

Thank you!

tb


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How can I overwrite a method in R?

2014-10-08 Thread Tim Hesterberg
How can I create an improved version of a method in R, and have it be used?

Short version:
I think plot.histogram has a bug, and I'd like to try a version with a fix.
But when I call hist(), my fixed version doesn't get used.

Long version:
hist() calls plot() which calls plot.histogram() which fails to pass ...
when it calls plot.window().
As a result hist() ignores xaxs and yaxs arguments.
I'd like to make my own copy of plot.histogram that passes ... to
plot.window().

If I just make my own copy of plot.histogram, plot() ignores it, because my
version is not part of the same graphics package that plot belongs to.

If I copy hist, hist.default and plot, the copies inherit the same
environments as
the originals, and behave the same.

If I also change the environment of each to .GlobalEnv, hist.default fails
in
a .Call because it cannot find C_BinCount.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How can I overwrite a method in R?

2014-10-09 Thread Tim Hesterberg
Thank you Duncan, Brian, Hadley, and Lin.

In Lin's suggestion, I believe the latter two statements should be
reversed, so that the environment is added before the function is placed
into the graphics namespace.

source('plot.histogram.R')
environment(plot.histogram) <- asNamespace('graphics')
assignInNamespace('plot.histogram', plot.histogram, ns='graphics')

The middle statement could also be
environment(plot.histogram) <- environment(graphics:::plot.histogram)
The point is to ensure that the replacement version has the same
environment as the original.

Having tested this, I will now submit a bug report :-)

On Thu, Oct 9, 2014 at 9:11 AM, C Lin  wrote:

> I posted similar question a while ago. Search for "modify function in a
> package".
> In your case, the following should work.
>
> source('plot.histogram.R')
> assignInNamespace('plot.histogram',plot.histogram,ns='graphics')
> environment(plot.histogram) <- asNamespace('graphics');
>
> Assuming you have your own plot.histogram function inside
> "plot.histogram.R" and the plot.histogram function you are trying to
> overwrite is in graphics package.
>
> Lin
>
> 
> > From: h.wick...@gmail.com
> > Date: Thu, 9 Oct 2014 07:00:31 -0500
> > To: timhesterb...@gmail.com
> > CC: r-h...@stat.math.ethz.ch
> > Subject: Re: [R] How can I overwrite a method in R?
> >
> > This is usually ill-advised, but I think it's the right solution for
> > your problem:
> >
> > assignInNamespace("plot.histogram", function(...) plot(1:10), "graphics")
> > hist(1:10)
> >
> > Hadley
> >
> > On Thu, Oct 9, 2014 at 1:14 AM, Tim Hesterberg 
> wrote:
> >> How can I create an improved version of a method in R, and have it be
> used?
> >>
> >> Short version:
> >> I think plot.histogram has a bug, and I'd like to try a version with a
> fix.
> >> But when I call hist(), my fixed version doesn't get used.
> >>
> >> Long version:
> >> hist() calls plot() which calls plot.histogram() which fails to pass ...
> >> when it calls plot.window().
> >> As a result hist() ignores xaxs and yaxs arguments.
> >> I'd like to make my own copy of plot.histogram that passes ... to
> >> plot.window().
> >>
> >> If I just make my own copy of plot.histogram, plot() ignores it,
> because my
> >> version is not part of the same graphics package that plot belongs to.
> >>
> >> If I copy hist, hist.default and plot, the copies inherit the same
> >> environments as
> >> the originals, and behave the same.
> >>
> >> If I also change the environment of each to .GlobalEnv, hist.default
> fails
> >> in
> >> a .Call because it cannot find C_BinCount.
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> > --
> > http://had.co.nz/
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>



-- 
Tim Hesterberg
http://www.timhesterberg.net
 (resampling, water bottle rockets, computers to Costa Rica, hot shower =
2650 light bulbs, ...)

Help your students understand statistics:
Mathematical Statistics with Resampling and R, Chihara & Hesterberg
http://www.timhesterberg.net/bootstrap/


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] big datasets for R

2014-10-28 Thread Tim Hoolihan
Another source of large datasets is the Public Data Sets on AWS 
http://aws.amazon.com/public-data-sets/
Tim Hoolihan
@thoolihan
http://linkedin.com/in/timhoolihan


On Oct 28, 2014, at 7:00 AM, r-help-requ...@r-project.org wrote:

> --
> 
> Message: 2
> Date: Mon, 27 Oct 2014 13:37:12 +0100
> From: Qiong Cai 
> To: r-help@r-project.org
> Subject: [R] big datasets for R
> Message-ID:
>   
> Content-Type: text/plain; charset="UTF-8"
> 
> Hi,
> 
> Could anyone please tell me where I can find very big datasets for R?  I'd
> like to do some benchmarking on R by stressing R a lot.
> 
> Thanks
> Qiong
> 
>   [[alternative HTML version deleted]]
> 
> 
> 
> --



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Download showing as exploit

2015-09-21 Thread Tim Kingston

Hi,

I work for the NHS, and our IT service has been unable to download R, as its 
anti-virus software says the download contains an exploit.

Is this normal? Is there a way around this?

Kind regards,

Tim Kingston

Sent from my HTC




__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem installing/loading packages from Africa

2016-04-22 Thread Tim Werwie
I'm very new to R and I live in Mali, west Africa. I'm on *OS X 10.7.5*. I
downloaded and installed *R 3.2.1*. I downloaded and installed *RStudio
00.99.893*.

I ran through the free Microsoft data camp intro to R course, then started
another free course through 'edX', for Data and Statistics for Life
Sciences, using R. The first prompt in the course is to
install.packages("swirl"). Copied below are the various error messages I
get when trying to install or load any package.

My best guess is that the problems I'm having are due to being in west
Africa, with unreliable connections, weak connections and no CRAN Mirror
closer than Italy or Spain (as far as I know). I checked into common
package errors on the RStudio page, but I'm not confident enough in my
computing to get into internet proxies and some of the other suggested
troubleshooting.

Any insight would be very helpful. See error messages below. Thanks -- Tim

- When attempting to install, the print-out I get in the Console in RStudio
is any length of:

> install.packages("swirl")
  [curl download progress output: roughly 207k received over about 20 seconds]
The downloaded binary packages are in

/var/folders/yb/7z339kn56mdbwx92ydmsqswcgn/T//RtmpBzQ16u/downloaded_packages

For other packages, ggplot2 for example, a similar printout can run for
hundreds and hundreds of lines. So RStudio tells me that the packages are
available to load, BUT:

- when *loading* any package, an error comes up saying package 'name of
package' was built under R version 3.2.5, but the info on swirl says it should
work on any version of R above 3.0.2. I get similar errors for other
packages (treemap, ggplot2, etc.).

- Or sometimes I"ll get this: Error : .onAttach failed in attachNamespace()
for 'swirl', details: call: stri_c(..., sep = sep, collapse = collapse,
ignore_null = TRUE)
error: object 'C_stri_join' not found
In addition: Warning message:
package ‘swirl’ was built under R version 3.2.5
Error: package or namespace load failed for ‘swirl’

Any suggestions?


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [R-pkgs] CRAN updates: rpg and odeintr

2017-02-20 Thread Tim Keitt
rpg is a package for working with postgresql: https://github.com/thk686/rpg
odeintr is a package for integrating differential equations:
https://github.com/thk686/odeintr

Cheers,
THK

http://www.keittlab.org/


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?

2008-07-27 Thread Tim Hesterberg
I'll address the question of whether you can use the bootstrap to
improve estimates, and whether you can use the bootstrap to "virtually
increase the size of the sample".

Short answer - no, with some exceptions (bumping / Random Forests).

Longer answer:
Suppose you have data (x1, ..., xn) and a statistic ThetaHat,
that you take a number of bootstrap samples (all of size n) and
let ThetaHatBar be the average of those bootstrap statistics from
those samples.

Is ThetaHatBar better than ThetaHat?  Usually not.  Usually it
is worse.  You have not collected any new data; you are just using the
existing data in a different way, and that is usually harmful:
* If the statistic is the sample mean, all this does is to add
  some noise to the estimate
* If the statistic is nonlinear, this gives an estimate that
  has roughly double the bias, without improving the variance.
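
A tiny illustration of the first point (a sketch added here, not part of the
original post):

set.seed(1)
x <- rnorm(30, mean = 5)
theta.hat <- mean(x)
boot.means <- replicate(2000, mean(sample(x, replace = TRUE)))
theta.bar <- mean(boot.means)
c(theta.hat = theta.hat, theta.bar = theta.bar)  # nearly identical; averaging added nothing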

What are the exceptions?  The prime example is tree models (random
forests) - taking bootstrap averages helps smooth out the
discontinuities in tree models.  For a simple example, suppose that a
simple linear regression model really holds:
y = beta x + epsilon
but that you fit a tree model; the tree model predictions are
a step function.  If you bootstrap the data, the boundaries of
the step function will differ from one sample to another, so
the average of the bootstrap samples smears out the steps, getting
closer to the smooth linear relationship.

Aside from such exceptions, the bootstrap is used for inference
(bias, standard error, confidence intervals), not improving on
ThetaHat.

Tim Hesterberg

>Hi Doran,
>
>Maybe I am wrong, but I think bootstrap is a general resampling method which
>can be used for different purposes...Usually it works well when you do not
>have a presentative sample set (maybe with limited number of samples).
>Therefore, I am positive with Michal...
>
>P.S., overfitting, in my opinion, is used to depict when you got a model
>which is quite specific for the training dataset but cannot be generalized
>with new samples..
>
>Thanks,
>
>--Jerry
>2008/7/21 Doran, Harold <[EMAIL PROTECTED]>:
>
>> > I used bootstrap to virtually increase the size of my
>> > dataset, it should result in estimates more close to that
>> > from the population - isn't it the purpose of bootstrap?
>>
>> No, not really. The bootstrap is a resampling method for variance
>> estimation. It is often used when there is not an easy way, or a closed
>> form expression, for estimating the sampling variance of a statistic.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Vista problem -- can't type commands at prompt

2008-08-07 Thread Tim Calkins
Hi All -

I recently moved to Vista and reinstalled R.  I am able to run R as I
typically do (R.exe from the command prompt), and it can work well.
However, if I switch windows to, say, firefox or excel or anything
else, when I return to the R prompt it no longer works.  I am able to
use the up and down arrow keys to access previous commands, but no
other key stroke has any impact.

As long as I leave the R window active, it continues to work as
expected.  I have tried using the batchfiles "el R" method which
doesn't work.  I have disabled UAC and am an admin on this machine.

thanks for any help you might be able to provide.

tim



-- 
Tim Calkins
0406 753 997

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question about type conversion in read.table with columns that contain "+" and "-" in R > 2.7

2008-09-01 Thread Tim Beissbarth
Somewhere between R versions 2.6 and 2.7, the behaviour of the 
function type.convert, and therefore also of read.table, read.csv, etc., 
changed (see below):


In 2.6 and before:
> type.convert(c("+", "-", "+"))
[1] + - +
Levels: + -

In 2.7 and later:
> type.convert(c("+", "-", "+"))
[1] 0 0 0

Apparently, the character strings "+" and "-" are now interpreted as 
numeric and no longer as factors or character strings.


I have quite a number of files with columns that contain "+" or "-" 
and would like to convert these to characters or factors, without 
having to specify the individual column types manually.


Is there any way to still do so in a new version of R?
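
One possible workaround (a sketch, not from the thread): fix the classes of the 
affected columns via colClasses so that type.convert is never consulted for them.  
The file name and column layout below are made-up placeholders.

dat <- read.table("myfile.txt", header = TRUE,
                  colClasses = c("numeric", "character", "numeric"))  # 2nd column holds "+"/"-"
dat[[2]] <- factor(dat[[2]], levels = c("+", "-"))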

Many thanks and best wishes,
Tim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Counting the number of non-NA values per day

2009-08-11 Thread Tim Chatterton
I have a long dataframe ("pollution") that contains a column of hourly 
date information ("date") and a column of pollution measurements ("pol").


I have been happily calculating daily means and daily maximums using the 
aggregate function


DMEANpollution =  aggregate(pollution["pol"], 
format(pollution["date"],"%Y-%j"), mean, na.rm = TRUE)
DMAXpollution =  aggregate(pollution["pol"], 
format(pollution["date"],"%Y-%j"), max, na.rm = TRUE)


However, I also need to count the number of valid measurements for each 
day to check that the mean and max are likely to be valid (for example I 
need at least 18 hourly measurements to calculate a valid daily mean)


Try as I might I have not found a simple way of doing this.
Can anybody help please?
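
One way that follows the aggregate() calls above (a sketch added here, untested 
against the real data):

DCOUNTpollution =  aggregate(pollution["pol"], 
format(pollution["date"],"%Y-%j"), FUN = function(x) sum(!is.na(x)))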

Many thanks,
Tim.

--


__

Dr Tim Chatterton
Senior Research Fellow
Air Quality Management Resource Centre
Faculty of Environment and Technology
University of the West of England
Frenchay Campus
Bristol
BS16 1QY

Tel: 0117 328 2929
Fax: 0117 328 3360
Email: tim.chatter...@uwe.ac.uk

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] REMOVE ME

2009-08-12 Thread Tim Paysen
This mailing list is too intrusive.  Remove my name.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Finding minimum of time subset

2009-08-13 Thread Tim Clark
Dear List,

I have a data frame of data taken every few seconds.  I would like to subset 
the data to retain only the data taken on the quarter hour, and as close to the 
quarter hour as possible.  So far I have figured out how to subset the data to 
the quarter hour, but not how to keep only the minimum time for each quarter 
hour.  

For example:
mytime<-c("12:00:00","12:00:05","12:15:05","12:15:06","12:20:00","12:30:01","12:45:01","13:00:00","13:15:02")
subtime<-grep(pattern="[[:digit:]]+[[:punct:]]00[[:punct:]][[:digit:]]+|[[:digit:]]+[[:punct:]]15[[:punct:]][[:digit:]]+|[[:digit:]]+[[:punct:]]30[[:punct:]][[:digit:]]+|[[:digit:]]+[[:punct:]]45[[:punct:]][[:digit:]]+",mytime)
mytime[subtime]

[1] "12:00:00" "12:00:05" "12:15:05" "12:15:06" "12:30:01" "12:45:01" 
"13:00:00" "13:15:02"

This gives me the data taken at quarter hour intervals (removes 12:20:00) but I 
am still left with multiple values at the quarter hours.  

I would like to obtain:

"12:00:00" "12:15:05" "12:30:01" "12:45:01" "13:00:00" "13:15:02"

Thanks!

Tim




Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding minimum of time subset

2009-08-14 Thread Tim Clark
Jim,

That works great!  However, would you please explain what the '[' and the 1 do 
in the sapply function?  I understand that you are cutting x by quarter, then 
creating a list of x that is split based on those cuts.  I just don't 
understand what "[" means in this context, or what the number one at the end 
does.

Thanks for you help,

Tim



Tim Clark
Department of Zoology 
University of Hawaii


--- On Fri, 8/14/09, jim holtman  wrote:

> From: jim holtman 
> Subject: Re: [R] Finding minimum of time subset
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Friday, August 14, 2009, 6:18 AM
> Here is one way to do it:
> 
> > mytime<-c("12:00:00","12:00:05","12:15:05","12:15:06","12:20:00","12:30:01","12:45:01","13:00:00","13:15:02")
> > # you might want a date on your data
> > x <- as.POSIXct(mytime, format="%H:%M:%S")
> > # create quarter hour intervals for the data range
> > quarter <- seq(trunc(min(x), 'days'), trunc(max(x) + 86400, 'days'), by='15 min') # add 86400 to add a day for truncation
> > # cut the data by quarter hours and then take the first value in each group
> > x.s <- sapply(split(x, cut(x, breaks=quarter), drop=TRUE), '[', 1)
> > # lost the 'class' for some reason; put it back
> > class(x.s) <- c("POSIXt", "POSIXct")
> > # the answer
> > x.s
>       2009-08-14 12:00:00       2009-08-14 12:15:00       2009-08-14 12:30:00       2009-08-14 12:45:00       2009-08-14 13:00:00       2009-08-14 13:15:00
> "2009-08-14 12:00:00 EDT" "2009-08-14 12:15:05 EDT" "2009-08-14 12:30:01 EDT" "2009-08-14 12:45:01 EDT" "2009-08-14 13:00:00 EDT" "2009-08-14 13:15:02 EDT"
> >
> 
> 
> On Thu, Aug 13, 2009 at 4:10 PM, Tim Clark
> wrote:
> > Dear List,
> >
> > I have a data frame of data taken every few seconds.
>  I would like to subset the data to retain only the data
> taken on the quarter hour, and as close to the quarter hour
> as possible.  So far I have figured out how to subset the
> data to the quarter hour, but not how to keep only the
> minimum time for each quarter hour.
> >
> > For example:
> >
> mytime<-c("12:00:00","12:00:05","12:15:05","12:15:06","12:20:00","12:30:01","12:45:01","13:00:00","13:15:02")
> >
> subtime<-grep(pattern="[[:digit:]]+[[:punct:]]00[[:punct:]][[:digit:]]+|[[:digit:]]+[[:punct:]]15[[:punct:]][[:digit:]]+|[[:digit:]]+[[:punct:]]30[[:punct:]][[:digit:]]+|[[:digit:]]+[[:punct:]]45[[:punct:]][[:digit:]]+",mytime)
> > mytime[subtime]
> >
> > [1] "12:00:00" "12:00:05" "12:15:05" "12:15:06"
> "12:30:01" "12:45:01" "13:00:00" "13:15:02"
> >
> > This gives me the data taken at quarter hour intervals
> (removes 12:20:00) but I am still left with multiple values
> at the quarter hours.
> >
> > I would like to obtain:
> >
> > "12:00:00" "12:15:05" "12:30:01" "12:45:01" "13:00:00"
> "13:15:02"
> >
> > Thanks!
> >
> > Tim
> >
> >
> >
> >
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> >
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> >
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem that you are trying to solve?
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding minimum of time subset

2009-08-14 Thread Tim Clark
Jim,

Got it!  Thanks for the explanation and the example.  Always nice to learn new 
tricks on R.

Aloha,

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Fri, 8/14/09, jim holtman  wrote:

> From: jim holtman 
> Subject: Re: [R] Finding minimum of time subset
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Friday, August 14, 2009, 7:51 AM
> sapply(mylist, '[', 1)
> 
> is equivalent to
> 
> sapply(mylist, function(x) x[1])  # select just the
> first element
> 
> "[" is an function that is called with a object and an
> index.  Using
> it the way I did in the email was a shorthand way of doing
> it.  Here
> is an example:
> 
> > x <- list(1,2,3)
> > x[1]
> [[1]]
> [1] 1
> 
> > `[`(x, 1)
> [[1]]
> [1] 1
> 
> Notice the function call  `[`(x,1).  This is what
> is being done in the
> sapply and passing the 1 as the second parameter.
> 
> On Fri, Aug 14, 2009 at 1:30 PM, Tim Clark
> wrote:
> > Jim,
> >
> > That works great!  However, would you please explain
> what the '[' and the 1 do in the sapply function?  I
> understand that you are cutting x by quarter, then creating
> a list of x that is split based on those cuts.  I just
> don't understand what "[" means in this contex, or what the
> number one at the end does.
> >
> > Thanks for you help,
> >
> > Tim
> >
> >
> >
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> >
> >

Re: [R] Finding minimum of time subset

2009-08-14 Thread Tim Clark
Thanks for everyone's help and for the alternative ways of doing this.  I am 
always amazed at how many solutions this list comes up with for things I get 
stuck on!  It really helps us non-programmers learn R!

Aloha,

Tim

Tim Clark
Department of Zoology 
University of Hawaii


--- On Fri, 8/14/09, Henrique Dallazuanna  wrote:

> From: Henrique Dallazuanna 
> Subject: Re: [R] Finding minimum of time subset
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Friday, August 14, 2009, 7:19 AM
> Try this also:
> 
> times <- as.POSIXlt(mytime, format = "%H:%M:%S")
> subTimes <- times[times[['min']] %in% c(0,15,30,45)]
> format(subTimes[!duplicated(format(subTimes, "%H:%M"))], "%H:%M:%S")
> 
> 
> On Thu, Aug 13, 2009 at 5:10 PM,
> Tim Clark 
> wrote:
> 
> Dear List,
> 
> 
> 
> I have a data frame of data taken every few seconds.  I
> would like to subset the data to retain only the data taken
> on the quarter hour, and as close to the quarter hour as
> possible.  So far I have figured out how to subset the data
> to the quarter hour, but not how to keep only the minimum
> time for each quarter hour.
> 
> 
> 
> 
> For example:
> 
> mytime<-c("12:00:00","12:00:05","12:15:05","12:15:06","12:20:00","12:30:01","12:45:01","13:00:00","13:15:02")
> 
> subtime<-grep(pattern="[[:digit:]]+[[:punct:]]00[[:punct:]][[:digit:]]+|[[:digit:]]+[[:punct:]]15[[:punct:]][[:digit:]]+|[[:digit:]]+[[:punct:]]30[[:punct:]][[:digit:]]+|[[:digit:]]+[[:punct:]]45[[:punct:]][[:digit:]]+",mytime)
> 
> 
> mytime[subtime]
> 
> 
> 
> [1] "12:00:00" "12:00:05"
> "12:15:05" "12:15:06"
> "12:30:01" "12:45:01"
> "13:00:00" "13:15:02"
> 
> 
> 
> This gives me the data taken at quarter hour intervals
> (removes 12:20:00) but I am still left with multiple values
> at the quarter hours.
> 
> 
> 
> I would like to obtain:
> 
> 
> 
> "12:00:00" "12:15:05"
> "12:30:01" "12:45:01"
> "13:00:00" "13:15:02"
> 
> 
> 
> Thanks!
> 
> 
> 
> Tim
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Tim Clark
> 
> Department of Zoology
> 
> University of Hawaii
> 
> 
> 
> __
> 
> R-help@r-project.org
> mailing list
> 
> https://stat.ethz.ch/mailman/listinfo/r-help
> 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> 
> and provide commented, minimal, self-contained,
> reproducible code.
> 
> 
> 
> 
> -- 
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
> 
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Trying to rename spatial pts data frame slot that isn't a slot()

2009-08-30 Thread Tim Clark
Dear List,

I am analyzing the home range area of fish and seem to have lost the 
individuals' ID names during my manipulations, and I can't find out how to rename 
them.  I calculated the MCP of each fish using mcp() in adehabitat.  The MCPs were 
converted to a SpatialPolygonsDataFrame and exported to qGIS for manipulation.  
At this point the ID names were lost.  I brought the manipulated shapefiles 
back into R, but can't figure out how to rename the individuals.
#Calculate MCP and save as a shapefile
my.mcp<-mcp(xy, id=id, percent=100)
spol<-area2spol(my.mcp)
spdf <- SpatialPolygonsDataFrame(spol,
          data = data.frame(getSpPPolygonsLabptSlots(spol),
                            row.names = getSpPPolygonsIDSlots(spol)),
          match.ID = TRUE)
writeOGR(spdf, dsn = mcp.dir, layer = "All Mantas MCP", driver = "ESRI Shapefile")

#Read shapefile manipulated in qGIS
mymcp<-readOGR(dsn=mcp.dir,layer="All mantas MCP land differenc")


My spatial points data frame has a number of "Slot"s, including one that 
contained the original names called Slot "ID".  However, I can not access this 
slot using slot() or slotNames().  

> slotNames(mymcp)
[1] "data""polygons""plotOrder"   "bbox"  "proj4string"

What am I missing here?  Is Slot "ID" not a slot?  Can I export the ID's with 
the shapefiles to qGIS?  Can I rename the ID's when I bring them back into R?  
When is a slot not a slot()?
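
For what it's worth, a sketch added here (not part of the original message): in a 
SpatialPolygonsDataFrame the "ID" slot belongs to each Polygons object stored in 
the "polygons" slot, not to the top-level object, which is why slotNames() doesn't 
show it.

ids <- sapply(slot(mymcp, "polygons"), function(p) slot(p, "ID"))
ids
# renaming, assuming new.ids is a character vector of the right length (hypothetical):
mymcp <- spChFIDs(mymcp, new.ids)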

Thanks,

TIm




Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Zoomable graphs with multiple plots

2009-09-02 Thread Tim Shephard
Hi folks,

I was wondering if anyone could confirm/deny whether there exists any
kind of package to facilitate zoomable graphs with multiple plots (eg,
 plot(..) and then points(..)).  I've tried zoom from IDPmisc, and
iplot from the iplot and iplot extreme packages, but as far as I can
tell, neither can handle the task.

Does anyone know anything else that might work?   Or generally know different?

Cheers,

Tim.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Zoomable graphs with multiple plots

2009-09-03 Thread Tim Shephard
This is what I've done.  Just capture two identify() points and then replot.
If the two points are selected right to left, I zoom out.  Works quite well.

Still, can't wait for iplot xtreme.

Cheers,

Tim.

On Thu, Sep 3, 2009 at 5:03 AM, Jim Lemon wrote:
> Tim Shephard wrote:
>>
>> Hi folks,
>>
>> I was wondering if anyone could confirm/deny whether there exists any
>> kind of package to facilitate zoomable graphs with multiple plots (eg,
>>  plot(..) and then points(..)).    I've tried zoom from IDPmisc, and
>> iplot from the iplot and iplot extreme packages, but as far I can
>> tell, neither can handle the task.
>>
>> Does anyone know anything else that might work?   Or generally know
>> different?
>>
>
> Hi Tim,
> zoomInPlot in the plotrix package just plots the same data twice with
> different limits. You could use the same strategy, except instead of passing
> the coordinates to the function, write a similar function that accepts a
> list of plotting commands to be evaluated in each plot.
>
> Jim
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Zoomable graphs with multiple plots

2009-09-03 Thread Tim Shephard
On Thu, Sep 3, 2009 at 1:25 PM, Jim Porzak wrote:
> Tim,
>
> I've had success (& user acceptance) simply plotting to a .pdf &
> passing zoom functionality to Acrobat, or whatever.
>
> Worked especially well with large US map with a lot of fine print annotation.
>
> Of, course, will not replot axes more appropriate for zoom level.
>

That's clever.  It solves another problem too: my function ties up
R until I'm done.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Amazon SimpleDB and R

2009-09-19 Thread Tim Shephard
As far as I know there isn't anything available for this, but I
thought I'd check before working up something of my own.

Is there a way to query Amazon SimpleDB and import the data results
directly into R?

Cheers,

Tim.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ROCR.plot methods, cross validation averaging

2009-09-23 Thread Tim Howard
Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) - 

I think my first question is generic and could apply to many methods, 
which is why I'm directing this initially to R-help as well as Tobias and 
Oliver.

Question 1. The plot function in ROCR will average your cross validation
data if asked. I'd like to use that averaged data to find a "best" cutoff
but I can't figure out how to grab the actual data that get plotted.
A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.

Question 2. I am asking ROCR to average lists with varying lengths for
each list entry. See my example below. None of the ROCR examples have data
structured in this manner. Can anyone speak to whether the averaging
methods in ROCR allow for this? If I can't easily grab the data as desired
from Question 1, can someone help me figure out how to average the lists,
by threshold, similarly?

Question 3. If my cross validation data happen to have a list entry whose
length = 2, ROCR errors out. Please see the second part of my example.
Any suggestions?

#reproducible examples exemplifying my questions
##part one##
library(ROCR)
data(ROCR.xval)
 # set up data so it looks more like my real data
sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
testSet <- ROCR.xval
 # do the extraction
for (i in 1:length(ROCR.xval[[1]])){
  y <- sample(c(1:350),sampSize[i])
  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
  }
 # now massage the data using ROCR, set up for a ROC plot
 # if it errors out here, run the above sample again.
pred <- prediction(testSet$predictions, testSet$labels)
perf <- performance(pred,"tpr","fpr")
 # create the ROC plot, averaging by cutoff value
plot(perf, avg="threshold")
 # check out the structure of the data
str(perf)
 # note the ragged edges of the list and that I assume averaging
 # whether it be vertical, horizontal, or threshold, somehow 
 # accounts for this?
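 # --- editorial sketch, not part of the original post: using 'perf' as it
 #     stands here (before the two-point entry added in part two below), a
 #     threshold-averaged curve can be rebuilt by hand by interpolating every
 #     fold onto a common grid of finite cutoffs and averaging across folds ---
alphas <- sort(unique(unlist(perf@alpha.values)), decreasing = TRUE)
alphas <- alphas[is.finite(alphas)]
interp <- function(vals) sapply(seq_along(vals), function(i) {
  ok <- is.finite(perf@alpha.values[[i]])
  approx(perf@alpha.values[[i]][ok], vals[[i]][ok], xout = alphas, rule = 2)$y
})
avg.x <- rowMeans(interp(perf@x.values))   # averaged fpr at each cutoff
avg.y <- rowMeans(interp(perf@y.values))   # averaged tpr at each cutoff
alphas[which.max(avg.y - avg.x)]           # one "best" cutoff, by Youden's J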

## part two ##
# add a list entry with only two values
perf@x.values[[1]] <- c(0,1)
perf@y.values[[1]] <- c(0,1)
perf@alpha.values[[1]] <- c(Inf,0)

plot(perf, avg="threshold")

##output results in an error with this message
# Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
# missing value where TRUE/FALSE needed


Thanks in advance for your help
Tim Howard
New York Natural Heritage Program

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Find each time a value changes

2010-02-10 Thread Tim Clark
Dear List,

I am trying to find each time a value changes in a dataset.  The numbers are 
variables for day vs. night values, so what I am really getting is the daily 
sunrise and sunset.  

A simplified example is the following:
  
x<-seq(1:100)
y1<-rep(1,10)
y2<-rep(2,10)
y<-c(y1,y2,y1,y1,y1,y2,y1,y2,y1,y2)
xy<-cbind(x,y)


I would like to know each time the numbers change.
Correct answer should be:
x=1,11,21,51,61,71,81,91

I would appreciate any help or suggestions.  It seems like it should be simple 
but I’m stuck!

Thanks,

Tim


Tim Clark
Department of Zoology 
University of Hawaii




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Find each time a value changes

2010-02-10 Thread Tim Clark
Thanks everyone!  I would have been banging my head around for quite a while 
and still wouldn't have come up with either solution.  The function rle() is a 
good one to know!

Aloha,

Tim



Tim Clark
Department of Zoology 
University of Hawaii


--- On Wed, 2/10/10, Ben Tupper  wrote:

> From: Ben Tupper 
> Subject: Re: [R] Find each time a value changes
> To: r-help@r-project.org
> Cc: "Tim Clark" 
> Date: Wednesday, February 10, 2010, 4:16 PM
> Hi,
> 
> On Feb 10, 2010, at 8:58 PM, Tim Clark wrote:
> 
> > Dear List,
> >
> > I am trying to find each time a value changes in a
> dataset.  The  
> > numbers are variables for day vs. night values, so
> what I am really  
> > getting is the daily sunrise and sunset.
> >
> > A simplified example is the following:
> >
> > x<-seq(1:100)
> > y1<-rep(1,10)
> > y2<-rep(2,10)
> > y<-c(y1,y2,y1,y1,y1,y2,y1,y2,y1,y2)
> > xy<-cbind(x,y)
> >
> >
> > I would like to know each time the numbers change.
> > Correct answer should be:
> > x=1,11,21,51,61,71,81,91
> >
> 
> I think this gets close...
> 
> which(diff(y) != 0)
> [1] 10 20 50 60 70 80 90
> 
> You'll need to fiddle to get exactly what you want.
> 
> Cheers,
> Ben
> 
> 
> 
> > I would appreciate any help or suggestions.  It
> seems like it should  
> > be simple but I’m stuck!
> >
> > Thanks,
> >
> > Tim
> >
> >
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> >
> >
> >
> >
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> 
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Find each time a value changes

2010-02-11 Thread Tim Clark
It was brought to my attention that the rle() answer to this question was not 
posted.  The following gives the correct answer once the last value is deleted.

x<-seq(1:100)
y1<-rep(1,10)
y2<-rep(2,10)
y<-c(y1,y2,y1,y1,y1,y2,y1,y2,y1,y2)
xy<-cbind(x,y)

print(xy)
print(str(xy))

# SEE WHAT RLE GIVES
test <- rle(xy[,2])
print(str(test))

# USE JIM'S TRICK OF CUMULATIVE SUMMING
# TO GET THE LOCATIONS  
result <- cumsum(c(1,rle(xy[,2])$lengths))
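
 # For completeness (an editorial addition, not in the original message):
 # the final cumulative sum is one past the end of the series, which is the
 # "last value" the text above says to delete.
result[-length(result)]
 # [1]  1 11 21 51 61 71 81 91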



Tim Clark
Department of Zoology 
University of Hawaii


--- On Wed, 2/10/10, Ben Tupper  wrote:

> From: Ben Tupper 
> Subject: Re: [R] Find each time a value changes
> To: r-help@r-project.org
> Cc: "Tim Clark" 
> Date: Wednesday, February 10, 2010, 4:16 PM
> Hi,
> 
> On Feb 10, 2010, at 8:58 PM, Tim Clark wrote:
> 
> > Dear List,
> >
> > I am trying to find each time a value changes in a
> dataset.  The  
> > numbers are variables for day vs. night values, so
> what I am really  
> > getting is the daily sunrise and sunset.
> >
> > A simplified example is the following:
> >
> > x<-seq(1:100)
> > y1<-rep(1,10)
> > y2<-rep(2,10)
> > y<-c(y1,y2,y1,y1,y1,y2,y1,y2,y1,y2)
> > xy<-cbind(x,y)
> >
> >
> > I would like to know each time the numbers change.
> > Correct answer should be:
> > x=1,11,21,51,61,71,81,91
> >
> 
> I think this gets close...
> 
> which(diff(y) != 0)
> [1] 10 20 50 60 70 80 90
> 
> You'll need to fiddle to get exactly what you want.
> 
> Cheers,
> Ben
> 
> 
> 
> > I would appreciate any help or suggestions.  It
> seems like it should  
> > be simple but I’m stuck!
> >
> > Thanks,
> >
> > Tim
> >
> >
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> >
> >
> >
> >
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> 
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Formatting question for separate polygons

2010-02-11 Thread Tim Clark
Dear List,

I am trying to plot several separate polygons on a graph.  I have figured out 
how to do it by manually, but have too much data to use such a tedious method.  
I would appreciate your help.  I have made a simple example to illustrate the 
problem.  How can I get x into the proper format (x1)?

#Sample data
x<-c(1,2,3,4,5,6)
y<-c(1,2,2,1)

#I need to format the data like this
x1<-c(1,1,2,2,NA,3,3,4,4,NA,5,5,6,6,NA)
y1<-rep(c(1,2,2,1,NA),length(x)/2)

#Final plot
plot(c(1,6), 1:2, type="n")
polygon(x1,y1,density=c(40))


Thanks,

Tim

 
Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Formatting question for separate polygons

2010-02-12 Thread Tim Clark
Thanks Uwe!  And Peter for the correction.  I would never have come up with 
that!

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Fri, 2/12/10, Uwe Ligges  wrote:

> From: Uwe Ligges 
> Subject: Re: [R] Formatting question for separate polygons
> To: "Peter Ehlers" 
> Cc: "Tim Clark" , r-help@r-project.org
> Date: Friday, February 12, 2010, 1:57 AM
> 
> 
> On 12.02.2010 12:54, Peter Ehlers wrote:
> > Nice, Uwe.
> > Small correction: make that nrow=4:
> >
> > x1 <- as.numeric(rbind(matrix(rep(x, each=2),
> nrow=4), NA))
> 
> 
> 
> Whoops, thanks!
> 
> Uwe
> 
> 
> > -Peter Ehlers
> >
> > Uwe Ligges wrote:
> >>
> >>
> >> On 11.02.2010 22:38, Tim Clark wrote:
> >>> Dear List,
> >>>
> >>> I am trying to plot several separate polygons
> on a graph. I have
> >>> figured out how to do it by manually, but have
> too much data to use
> >>> such a tedious method. I would appreciate your
> help. I have made a
> >>> simple example to illustrate the problem. How
> can I get x into the
> >>> proper format (x1)?
> >>>
> >>> #Sample data
> >>> x<-c(1,2,3,4,5,6)
> >>> y<-c(1,2,2,1)
> >>>
> >>> #I need to format the data like this
> >>> x1<-c(1,1,2,2,NA,3,3,4,4,NA,5,5,6,6,NA)
> >>> y1<-rep(c(1,2,2,1,NA),length(x)/2)
> >>
> >>
> >>
> >> x1 <- as.numeric(rbind(matrix(rep(x, each=2),
> nrow=2), NA))
> >>
> >> Uwe Ligges
> >>
> >>
> >>> #Final plot
> >>> plot(c(1,6), 1:2, type="n")
> >>> polygon(x1,y1,density=c(40))
> >>>
> >>>
> >>> Thanks,
> >>>
> >>> Tim
> >>>
> >>>
> >>> Tim Clark
> >>> Department of Zoology
> >>> University of Hawaii
> >>>
> >>>
> __
> >>> R-help@r-project.org
> mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal,
> self-contained, reproducible code.
> >>
> >> __
> >> R-help@r-project.org
> mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained,
> reproducible code.
> >>
> >>
> >
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Non-monotonic spline using splinefun(method = "monoH.FC")

2010-02-15 Thread Tim Heaton
Hi,

 In my version of R, the stats package splinefun code for fitting a
Fritsch and Carlson monotonic spline does not appear to guarantee a
monotonic result. If two adjoining sections both have over/undershoot
the way the resulting adjustment of alpha and beta is performed can give
modified values which still do not satisfy the required constraints. I
do not think this is due to finite precision arithmetic. Is this a known
bug? Have had a look through the bug database but couldn't find anything.

Below is an example created to demonstrate this,

###
# Create the following data
# This is created so that there are two adjoining sections which have to
# be adjusted
x <- 1:8
y <- c(-12, -10, 3.5, 4.45, 4.5, 140, 142, 142)

# Now run the splinefun() function

FailMonSpline <- splinefun(x, y, method = "mono")

# In theory this should be monotonic increasing but the required
# conditions are not satisfied

# Check values of alpha and beta for this curve
m <- FailMonSpline(x, deriv = 1)
nx <- length(x)
n1 <- nx - 1L
dy <- y[-1] - y[-nx]
dx <- x[-1] - x[-nx]
Sx <- dy/dx

alpha <- m[-nx]/Sx
beta <- m[-1]/Sx
a2b3 <- 2 * alpha + beta - 3
ab23 <- alpha + 2 * beta - 3
ok <- (a2b3 > 0 & ab23 > 0)
ok <- ok & (alpha * (a2b3 + ab23) < a2b3^2)
# If the curve is monotonic then all ok should be FALSE however this is
# not the case
ok


# Alternatively it can easily be seen to be non-monotonic by plotting the
# region between 4 and 5

t <- seq(4,5, length = 200)
plot(t, FailMonSpline(t), type = "l")


The version of R I am running is

platform   x86_64-suse-linux-gnu
arch   x86_64
os linux-gnu
system x86_64, linux-gnu
status
major  2
minor  8.1
year   2008
month  12
day22
svn rev47281
language   R
version.string R version 2.8.1 (2008-12-22)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Non-monotonic spline using splinefun(method = "monoH.FC")

2010-02-15 Thread Tim Heaton
Hi,

 Thanks for the reply but I think you are confusing monotonic and
strictly increasing/decreasing. I also just used the y-value of the last
knot as a simple example as that is not the bit where it goes wrong. It
will still produce a non-monotonic spline if you use for example

x <- 1:8
y <- c(-12, -10, 3.5, 4.45, 4.5, 140, 142, 143)
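
# A quick numeric check (an editorial addition, not in the original message):
# sample the fit densely and look for decreasing steps, which a monotone
# interpolant should never produce.
f  <- splinefun(x, y, method = "monoH.FC")
tt <- seq(min(x), max(x), length.out = 5000)
any(diff(f(tt)) < 0)   # TRUE here would confirm the non-monotone stretch near x = 4.5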

I am pretty sure that it's a bug in the way that the alpha's and beta's
are modified in the code itself which does not guarantee (if there are
two overlapping sections which need their alphas and betas modifying)
that after modification they satisfy the constraints as explained in the
original Fritsch and Carlson paper. The original paper is quite vague
about how to deal with multiple sections which need modifying --- should
one do it in order (in which case one might get a different result if
one entered the data in the opposite direction and moving one knot would
no longer guarantee that the curve changes in only a finite region) or
possibly shrink the coefficients twice (which would create a flatter
spline than necessary but would give a finite effect and the same curve
if the data were entered in the opposite direction).

Tim


David Winsemius wrote:
> 
> On Feb 15, 2010, at 10:59 AM, Tim Heaton wrote:
> 
>> Hi,
>>
>> In my version of R, the stats package splinefun code for fitting a
>> Fritsch and Carlson monotonic spline does not appear to guarantee a
>> monotonic result. If two adjoining sections both have over/undershoot
>> the way the resulting adjustment of alpha and beta is performed can give
>> modified values which still do not satisfy the required constraints. I
>> do not think this is due to finite precision arithmetic. Is this a known
>> bug? Have had a look through the bug database but couldn't find anything.
> 
> The help page says that the resulting function will be "monotone
>  iff  the data are."
> 
> y[7] < y[8]  # False
> FailMonSpline[7] < FailMonSpline[8] # False, ... , as promised.
> 
>>
>> Below is an example created to demonstrate this,
>>
>> ###
>> # Create the following data
>> # This is created so that their are two adjoining sections which have to
>> be adjusted
>> x <- 1:8
>> y <- c(-12, -10, 3.5, 4.45, 4.5, 140, 142, 142)
>>
>> # Now run the splinefun() function
>>
>> FailMonSpline <- splinefun(x, y, method = "mono")
>>
>> # In theory this should be monotonic increasing but the required
>> conditions are not satisfied
>>
>> # Check values of alpha and beta for this curve
>> m <- FailMonSpline(x, deriv = 1)
>> nx <- length(x)
>> n1 <- nx - 1L
>> dy <- y[-1] - y[-nx]
>> dx <- x[-1] - x[-nx]
>> Sx <- dy/dx
>>
>> alpha <- m[-nx]/Sx
>> beta <- m[-1]/Sx
>> a2b3 <- 2 * alpha + beta - 3
>> ab23 <- alpha + 2 * beta - 3
>> ok <- (a2b3 > 0 & ab23 > 0)
>> ok <- ok & (alpha * (a2b3 + ab23) < a2b3^2)
>> # If the curve is monotonic then all ok should be FALSE however this is
>> not the case
>> ok
>>
>>
>> # Alternatively can easily seen to be non-monotonic by plotting the
>> region between 4 and 5
>>
>> t <- seq(4,5, length = 200)
>> plot(t, FailMonSpline(t), type = "l")
>>
>> 
>> The version of R I am running is
>>
>> platform   x86_64-suse-linux-gnu
>> arch   x86_64
>> os linux-gnu
>> system x86_64, linux-gnu
>> status
>> major  2
>> minor  8.1
>> year   2008
>> month  12
>> day22
>> svn rev47281
>> language   R
>> version.string R version 2.8.1 (2008-12-22)
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] r help date format changes with c() vs. rbind()

2010-02-19 Thread Tim Clark
Dear List,

I am having a problem with dates and I would like to understand what is going 
on.  Below is an example.  I can produce a date/time using as.POSIXct, but I am 
trying to combine two as.POSIXct objects and keep getting strange results.  I 
thought I was using the wrong origin, but according to 
structure(0,class="Date") I am not (see below).  In my example a is a simple 
date/time object, b combines it using rbind(), c converts b to a date/time 
object again using as.POSIXct and gives the incorrect time, and d combines a 
using c() and gives the correct time.  Why doesn't c give me the correct answer?

Thanks,

Tim


> a<-as.POSIXct("2000-01-01 12:00:00")
> a
[1] "2000-01-01 12:00:00 HST"

> b<-rbind(a,a)
> b
   [,1]
a 946764000
a 946764000

> c<-as.POSIXct(b,origin="1970-01-01")
> c
[1] "2000-01-01 22:00:00 HST"
[2] "2000-01-01 22:00:00 HST"

> d<-c(a,a)
> d
[1] "2000-01-01 12:00:00 HST"
[2] "2000-01-01 12:00:00 HST"


> structure(0,class="Date")
[1] "1970-01-01"


Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] r help date format changes with c() vs. rbind()

2010-02-19 Thread Tim Clark
Glad to know it isn't just me!  I couldn't use Phil's data.frame method since 
my real problem went from a POSIXct object to a large matrix where I used rbind 
and then back to POSIXct.  Jim's function worked great on converting the final 
product back to the proper date.

Thanks!

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Fri, 2/19/10, jim holtman  wrote:

> From: jim holtman 
> Subject: Re: [R] r help date format changes with c() vs. rbind()
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Friday, February 19, 2010, 12:19 PM
> I have used the following function to
> convert to POSIXct from a numeric without any
> problems:
>  
> > unix2POSIXct <- function (time)  
> structure(time, class = c("POSIXt",
> "POSIXct"))
> > 
> > unix2POSIXct(946764000)
> [1] "2000-01-01 17:00:00 EST"
> > 
> 
> 
> 
> 
> On Fri, Feb 19, 2010 at 4:07 PM,
> Tim Clark 
> wrote:
> 
> Dear
> List,
> 
> I am having a problem with dates and I would like to
> understand what is going on.  Below is an example.  I can
> produce a date/time using as.POSIXct, but I am trying to
> combine two as.POSIXct objects and keep getting strange
> results.  I thought I was using the wrong origin, but
> according to structure(0,class="Date") I am not
> (see below).  In my example a is a simple date/time object,
> b combines it using rbind(), c converts b to a date/time
> object again using as.POSIXct and gives the incorrect time,
> and d combines a using c() and gives the correct time.  Why
> doesn't c give me the correct answer?
> 
> 
> Thanks,
> 
> Tim
> 
> 
> > a<-as.POSIXct("2000-01-01 12:00:00")
> > a
> [1] "2000-01-01 12:00:00 HST"
> 
> > b<-rbind(a,a)
> > b
>       [,1]
> a 946764000
> a 946764000
> 
> 
> > c<-as.POSIXct(b,origin="1970-01-01")
> > c
> [1] "2000-01-01 22:00:00 HST"
> [2] "2000-01-01 22:00:00 HST"
> 
> > d<-c(a,a)
> > d
> [1] "2000-01-01 12:00:00 HST"
> 
> [2] "2000-01-01 12:00:00 HST"
> 
> 
> > structure(0,class="Date")
> [1] "1970-01-01"
> 
> 
> Tim Clark
> Department of Zoology
> University of Hawaii
> 
> __
> 
> R-help@r-project.org
> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> 
> and provide commented, minimal, self-contained,
> reproducible code.
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem that you are trying to solve?
> 
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (quite possibly OT) Re: how to make R running on a Linux server display a plot on a Windows machine

2010-02-23 Thread Glover, Tim
One additional option on the Linux-on-the-local-machine option described below. 
 Consider running Microsoft's free virtual machine with a copy of Linux in 
there.  Now you have the advantages of dual-boot with BOTH operating systems 
available with a toggle key sequence. 

Tim Glover 
Senior Environmental Scientist - Geochemistry 
Geoscience Department Atlanta Area 
MACTEC Engineering and Consulting, Inc. 
Kennesaw, Georgia, USA 
Office 770-421-3310 
Fax 770-421-3486 
Email ntglo...@mactec.com 
Web www.mactec.com 
 


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Emmanuel Charpentier
Sent: Monday, February 22, 2010 4:07 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] (quite possibly OT) Re: how to make R running on a Linux server 
display a plot on a Windows machine

Dear Xin,

On Monday, 22 February 2010 at 09:53 -0800, xin wei wrote: 
> hi, Kevin and K.Elo:
> thank you for the suggestion. Can you be more specific on these? (like how
> exactly get into x-switch or man ssh). I am totally ignorant about linux and
> SSH:( Memory limitation forces me to switch from windows to Linux
> cluster.

\begin{SemiOffTopic}
While you may not believe it now, that is a reason for rejoicing rather
than grieving :-) Unix is a bit like sex : uncomprehensible and possibly
frightening to the young/ignorant, extremely enjoyable and almost
indispensable to the mature person ... and source of severe withdrawal
syndrome if/when it becomes unavailable (some employers have strange
ideas about the set of tools their employees are allowed to use) !
You've been warned...
\end{SemiOffTopic}


Your problem seems to have (almost) nothing to do with R *per* *se*, but
with your ability to use it on a *remote* computer. This problem should
be discussed with your *local* friendly help. To the best of my
knowledge, neither the R help list nor its "members" are subsidized by
psu.edu's sponsors (Penn State, right ?) to provide basic computer use
tutoring to its students ; consequently, this list is *definitely* *NOT*
the right place for this. But ...

Notwithstanding my own advice, I'll take the time to try to answer you,
in the (futile?) hope that future beginning students, after *reading*
the Posting Guide, following its advice and searching the list, will
find this answer, thus avoiding overloading our reading bandwidth
*again*... That's also why I rephrased the "subject" of your post.


I suppose that the simplest solution (namely fitting "enough" memory on
your present Windows machine) is deemed impossible. The fun begins...

Now, to work with R, you need only a *text* connection to your server.
It is enough to use functions that create graphs as files that you can
later display on your Windows machine. That's what ssh does : terminal
emulation (plus the ability to copy files back and forth, via scp (which
you *will* need), plus a way to create so-called tunnels and
redirections... But that's a horse of an entirely different color (an
elephant, actually:)).

If you want "real-time" graphics displayed by the R interpreter, you
need, indeed, to use the "-X" switch to ssh. but that is *not*
*enough* : your Windows machine *must* be fitted with software accepting
commands emitted by the server's R interpreter and obeying them to
actually create the image ; that is something called "an X server" (yes,
server : in this case, your windows machine offers a "displaying
service", that your R interpreter uses for displaying your graph, thus
becoming a client of your server).

Installing such a beast under Windows is (was ?) neither easy nor
(usually) cheap. There *are* free (in both meanings of this word) X
server implementations for Windows (most notoriously Cygwin/X and
Xming), but, as far as I know, none of them is "easy" to install for the
uninitiated : to do this, you must understand what you are doing, which
implies (partially) mastering the X window system, which is ...
complicated, to put it mildly. You'd better seek *informed* help around
you on this one.

I am aware of a handful of commercial implementations claiming to be
"easy to install", but cannot offer any opinion of them : the price tag is
enough to give me pause...

Another option (to be discussed with your server's manager) is to
display on your Windows machine the image of a "virtual" X session
started on the server. Such a solution, which has a couple of
implementations (variants of VNC, variants of RDP) might be quite
preferable if your network connection is slow/unreliable : X eats
bandwidth like there's no tomorrow... I find VNC quite useful on the
limited-bandwidth connections that I use almost daily.

But it may well be that the *simplest* solution would be to install Linux
on your own machine (dual boot for a first time...) : X is the

Re: [R] Plotting 15 million points

2010-02-25 Thread Glover, Tim
Have you considered taking a random subset and plotting that?  I'd bet you can 
get a real impression of the distribution with a few hundred thousand points 
at most.
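
For example (an editorial sketch, assuming the values sit in a numeric vector v):

idx <- sample(length(v), 2e5)           # random 200k subsample
hist(v[idx], breaks = 200, freq = FALSE)
lines(density(v), lwd = 2, col = 2)     # density of the full vector, as already computed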

Tim Glover 
Senior Environmental Scientist - Geochemistry 
Geoscience Department Atlanta Area 
MACTEC Engineering and Consulting, Inc. 
Kennesaw, Georgia, USA 
Office 770-421-3310 
Fax 770-421-3486 
Email ntglo...@mactec.com 
Web www.mactec.com 
 


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Abhishek Pratap
Sent: Thursday, February 25, 2010 6:12 PM
To: r-help@r-project.org
Subject: [R] Plotting 15 million points

Hi All

I have a vector of about 15 million numbers which I would like to
plot. The goal is the see the distribution.  I tired the usual steps.

1. Histogram : never gets complete my window freezes w/out log base 10
2. Density  : I first calculated the kernel density and then plotted
it which worked.

It would be nice to superimpose histogram with density but as of now I
am not able to get this data as a histogram. I tried ggplot2 which
also hangs.

Any efficient methods to play with > 10 million numbers in a vector.

Thanks,
-Abhi

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] reading data from web data sources

2010-02-27 Thread Tim Coote

Hullo
I'm trying to read some time series data of meteorological records  
that are available on the web (eg http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat) 
. I'd like to be able to read in the digital data directly into R.  
However, I cannot work out the right function and set of parameters to  
use.  It could be that the only practical route is to write a parser,  
possibly in some other language,  reformat the files and then read  
these into R. As far as I can tell, the informal grammar of the file is:



[
 ]+

and the  are of the form:
  [ ] 12

Readings for days in months where a day does not exist have special  
values. Missing values have a different special value.


And then I've got the problem of iterating over all relevant files to  
get a whole timeseries.


Is there a way to read in this type of file into R? I've read all of  
the examples that I can find, but cannot work out how to do it. I  
don't think that read.table can handle the separate sections of data  
representing each year. read.ftable maybe can be coerced to parse the  
data, but I cannot see how after reading the documentation and  
experimenting with the parameters.


I'm using R 2.10.1 on osx 10.5.8 and 2.10.0 on Fedora 10.

Any help/suggestions would be greatly appreciated. I can see that this  
type of issue is likely to grow in importance, and I'd also like to  
give the data owners suggestions on how to reformat their data so that  
it is easier to consume by machines, while being easy to read for  
humans.


The early records are a serious machine parsing challenge as they are  
tiff images of old notebooks ;-)


tia

Tim
Tim Coote
t...@coote.org
vincit veritas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reading data from web data sources

2010-02-27 Thread Tim Coote
Thanks, Gabor. My take away from this and Phil's post is that I'm  
going to have to construct some code to do the parsing, rather than  
use a standard function. I'm afraid that neither approach works, yet:


Gabor's code has an off-by-one error (days start on the 2nd, not the
first), and the years get messed up around the 29th day.  I think that
na.omit(DF) line is throwing out the baby with the bathwater.  It's
interesting that this approach is based on read.table; I'd assumed
that I'd need read.ftable, which I couldn't understand the
documentation for.  What is it that's removing the -999 and -888
values in this code? They seem to be gone, but I cannot see why.
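
Editorial aside (not part of Tim's message): the minus sign is what removes
those values; any line containing '-' matches the [^ 0-9.] class, so grepl()
flags the whole line and it is dropped before read.table() ever sees it:

grepl("[^ 0-9.]", c(" 2  1.3   2.4", " 3  -999  -888"))
# [1] FALSE  TRUE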


Phil's reads in the data, but interleaves rows with just a year and  
all other values as NA.


Tim
On 27 Feb 2010, at 17:33, Gabor Grothendieck wrote:


Mark Leeds pointed out to me that the code wrapped around in the post
so it may not be obvious that the regular expression in the grep is
(i.e. it contains a space):
"[^ 0-9.]"


On Sat, Feb 27, 2010 at 7:15 AM, Gabor Grothendieck
 wrote:
Try this.  First we read the raw lines into R using grep to remove  
any

lines containing a character that is not a number or space.  Then we
look for the year lines and repeat them down V1 using cumsum.   
Finally

we omit the year lines.

myURL <- "http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat"

raw.lines <- readLines(myURL)
DF <- read.table(textConnection(raw.lines[!grepl("[^ 0-9.]", raw.lines)]), fill = TRUE)
DF$V1 <- DF[cumsum(is.na(DF[[2]])), 1]
DF <- na.omit(DF)
head(DF)


On Sat, Feb 27, 2010 at 6:32 AM, Tim Coote > wrote:

Hullo
I'm trying to read some time series data of meteorological records  
that are

available on the web (eg
http://climate.arm.ac.uk/calibrated/soil/ 
dsoil100_cal_1910-1919.dat). I'd
like to be able to read in the digital data directly into R.  
However, I
cannot work out the right function and set of parameters to use.   
It could
be that the only practical route is to write a parser, possibly in  
some
other language,  reformat the files and then read these into R. As  
far as I

can tell, the informal grammar of the file is:


[
 ]+

and the  are of the form:
  [ ]  
12


Readings for days in months where a day does not exist have  
special values.

Missing values have a different special value.

And then I've got the problem of iterating over all relevant files  
to get a

whole timeseries.

Is there a way to read in this type of file into R? I've read all  
of the
examples that I can find, but cannot work out how to do it. I  
don't think
that read.table can handle the separate sections of data  
representing each
year. read.ftable maybe can be coerced to parse the data, but I  
cannot see
how after reading the documentation and experimenting with the  
parameters.


I'm using R 2.10.1 on osx 10.5.8 and 2.10.0 on Fedora 10.

Any help/suggestions would be greatly appreciated. I can see that  
this type
of issue is likely to grow in importance, and I'd also like to  
give the data
owners suggestions on how to reformat their data so that it is  
easier to

consume by machines, while being easy to read for humans.

The early records are a serious machine parsing challenge as they  
are tiff

images of old notebooks ;-)

tia

Tim
Tim Coote
t...@coote.org
vincit veritas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





Tim Coote
t...@coote.org
vincit veritas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Best Hardware & OS For Large Data Sets

2010-02-27 Thread Tim Coote
Is it possible to run a Linux guest VM on the Wintel box so that you  
can run the 64 bit code?  I used to do this on XP (but not for R).

On 27 Feb 2010, at 20:03, David Winsemius wrote:



On Feb 27, 2010, at 12:47 PM, J. Daniel wrote:



Greetings,

I am acquiring a new computer in order to conduct data analysis.  I
currently have a 32-bit Vista OS with 3G of RAM and I consistently  
run into
memory allocation problems.  I will likely be required to run  
Windows 7 on
the new system, but have flexibility as far as hardware goes.  Can  
people
recommend the best hardware to minimize memory allocation  
problems?  I am
leaning towards dual core on a 64-bit system with 8G of RAM.  Given  
the

Windows constraint, is there anything I am missing here?


Perhaps the fact that the stable CRAN version of R for (any) Windows  
is 32-bit? It would expand your memory space somewhat but not as  
much as you might naively expect.


(There was a recent  announcement that an experimental version of a  
64-bit R was available (even with an installer) and there are  
vendors who will supply a 64-bit Windows version for an un-announced  
price. The fact that there was not as of January support for binary  
packages seems to a bit of a constraint on who would be able to  
"step up" to use full 64 bit R capabilities on Win64. I'm guessing  
from the your failure to mention potential software constraints that  
you are not among that more capable group, as I am also not.)


https://stat.ethz.ch/pipermail/r-devel/2010-January/056301.html
https://stat.ethz.ch/pipermail/r-devel/2010-January/056411.html



I know that Windows limits the RAM that a single application can  
access.
Does this fact over-ride many hardware considerations?  Any way  
around this?


Thanks,

JD


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Tim Coote
t...@coote.org
+44 (0)7866 479 760

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] turn character string into unevaluated R object

2010-03-02 Thread Tim Calkins
fortune('parse')

But if you have a vector of file names you can create a blank list and
read.table() each file into it. I generally find that if I'm reading a
bunch of files at the same time they are probably related, and I end
up coming back and putting them all into a list anyway.


file.names <- c('name 1', 'name 2', 'name 3')

my.list <- list()

for (i in file.names) {
 temp <- read.table(paste(i, '.txt', sep = ''))
# assign temp to i if desired
 assign(i, temp)
 # or put it in the list
 my.list[[i]] <- temp
}




On Wed, Mar 3, 2010 at 10:15 AM, Liviu Andronic wrote:

> On 3/2/10, carol white  wrote:
> >  How to turn a character string into an unevaluated R object? I want to
> load some
> >
> I'm not sure if this is what you're looking for:
> > as.name("iris")
> iris
> > parse(text="iris")
> expression(iris)
> attr(,"srcfile")
> 
> > head(eval(as.name("iris")))
>  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
> 1  5.1 3.5  1.4 0.2  setosa
> 2  4.9 3.0  1.4 0.2  setosa
> 3  4.7 3.2  1.3 0.2  setosa
> 4  4.6 3.1  1.5 0.2  setosa
> 5  5.0 3.6  1.4 0.2  setosa
> 6  5.4 3.9  1.7 0.4  setosa
> > head(eval(parse(text="iris")))
>  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
> 1  5.1 3.5  1.4 0.2  setosa
> 2  4.9 3.0  1.4 0.2  setosa
> 3  4.7 3.2  1.3 0.2  setosa
> 4  4.6 3.1  1.5 0.2  setosa
> 5  5.0 3.6  1.4 0.2  setosa
> 6  5.4 3.9  1.7 0.4  setosa
>
> Liviu
>
> ______
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Tim Calkins
0406 753 997

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2 rose diagram

2010-03-09 Thread Tim Howard
Dear R gurus - 

consider this plot:

library(ggplot2)
dat <- sample(1:8,100,replace=TRUE)
smp <- ggplot(data.frame(dat), aes(x=factor(dat),fill=factor(dat))) + 
geom_bar(width=1)
smp + coord_polar() 


Q1. How do I change the font size and weight of bar labels (1,2,3...)?  I've 
been wallowing in the 'Themes' structure and I just can't figure out the 
correct place to change the definitions. Along these same lines, what does 
'strip' mean when referring to strip text? 

Q2. How can I move the legend defining bar height into the plot, so that it 
overlays the lines they refer to?


Consider the same figure using Circstats:

library(CircStats)
dat.rad <- (dat*((2*pi)/8)) -(2*pi)/16
rose.diag(dat.rad, bins = 8)  #note the origin is to the right rather than on 
top

Q3. The key difference is that CircStats uses an area-based calculation for the 
size of each slice, which makes for a different presentation than ggplot2. Any 
suggestions on how to use this calculation method in the ggplot framework? 

Thanks in advance for your help.
Tim Howard
New York Natural Heritage Program

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question on passing in parameter to Cox hazard

2010-03-09 Thread Tim Smith
Hi,

I wanted to do the cox model using a matrix. The following lines illustrate 
what I want to do:

dat <- matrix(rnorm(30), ncol=3,dimnames = list(1:10,letters[1:3]))
Survival <- rexp(10)
Status <- ifelse(runif(10) < .7, 1, 0)
mat <- as.data.frame(cbind(dat,Survival,Status))

cmod <- coxph(Surv(Survival, Status) ~ a+b+c, mat)
-
This works fine. However, I need to change the code so that the column headers 
(a+b+c) are passed into the coxph function on the fly. What string/object do I 
need to generate so the function works? I am trying:

# For example
chead <- "a+b+c"
cmod <- coxph(Surv(Survival, Status) ~ chead, mat)
but this gives an error since I'm passing in a string. Can I change chead to 
something so that the code works?
many thanks.
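
One common fix (an editorial sketch, not from the original thread) is to build
the formula from the string with as.formula():

library(survival)
chead <- "a+b+c"
fmla  <- as.formula(paste("Surv(Survival, Status) ~", chead))
cmod  <- coxph(fmla, data = mat)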


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ggplot2 rose diagram

2010-03-10 Thread Tim Howard
To answer two of my own questions to get them into the archives (I am slowly 
getting the hang of ggplot):

Q1.  use "opts(axis.text.x = theme_text(size=xx))" to change font size of the 
bar labels:

library(ggplot2)
set.seed(5)
dat <- sample(1:8,100,replace=TRUE)
smp <- ggplot(data.frame(dat), aes(x=factor(dat),fill=factor(dat))) + 
geom_bar(width=1) +
opts(axis.text.x = 
theme_text(size = 18))
smp + coord_polar() 

Q3. calculate the frequencies themselves and use stat="identity" inside the aes 
call:

L <- table(dat)
L.df <- data.frame(L)
L.df <- cbind(L.df, "SQRrelFreq" = sqrt(L.df[,2]/sum(L.df[,2])))
smp2 <- ggplot(L.df, aes(x=dat,y=SQRrelFreq, stat="identity", fill=dat)) + 
geom_bar(width=1) +
opts(axis.text.x = 
theme_text(size = 18))
smp2 + coord_polar() 

Cheers,
Tim

>>> Tim Howard 3/9/2010 9:25 AM >>>
Dear R gurus - 

consider this plot:

library(ggplot2)
dat <- sample(1:8,100,replace=TRUE)
smp <- ggplot(data.frame(dat), aes(x=factor(dat),fill=factor(dat))) + 
geom_bar(width=1)
smp + coord_polar() 


Q1. How do I change the font size and weight of bar labels (1,2,3...)?  I've 
been wallowing in the 'Themes' structure and I just can't figure out the 
correct place to change the definitions. Along these same lines, what does 
'strip' mean when referring to strip text? 

Q2. How can I move the legend defining bar height into the plot, so that it 
overlays the lines they refer to?


Consider the same figure using Circstats:

library(CircStats)
dat.rad <- (dat*((2*pi)/8)) -(2*pi)/16
rose.diag(dat.rad, bins = 8)  #note the origin is to the right rather than on 
top

Q3. The key difference is that CircStats uses an area-based calculation for the 
size of each slice, which makes for a different presentation than ggplot2. Any 
suggestions on how to use this calculation method in the ggplot framework? 

Thanks in advance for your help.
Tim Howard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ggplot2 rose diagram

2010-03-10 Thread Tim Howard
Hadley, 
Thanks for chiming in. 

By Q2 I was trying to refer to the Y-axis labels. For the polar plot, the 
Y-axis labels reside left of the panel. I was looking for a way to get the 
Y-axis labels to radiate out from the center so it would be clear which line 
each label refers to. I still can't find any reference to moving the y-axis 
labels (for any plot type) in any of your documentation. It's probably my 
failure

For Q3, can you speak to whether the square-root transformation of counts for 
the y-axis provides the same function as square-root of the frequencies e.g.  
sqrt(countInBin/totalCount) . My goal is for area of the slice to correlate 
with number of records in each bin (rather than area expanding at a faster 
rate). 

Thanks,
Tim

>>> hadley wickham  3/10/2010 9:14 AM >>>
For Q2 you can use opts(legend.position = c(0.9, 0.9)).

For Q3, you can also use scale_y_sqrt().

Hadley

On Wed, Mar 10, 2010 at 2:05 PM, Tim Howard  wrote:
> To answer two of my own questions to get them into the archives (I am slowly 
> getting the hang of ggplot):
>
> Q1.  use "opts(axis.text.x = theme_text(size=xx))" to change font size of the 
> bar labels:
>
> library(ggplot2)
> set.seed(5)
> dat <- sample(1:8,100,replace=TRUE)
> smp <- ggplot(data.frame(dat), aes(x=factor(dat),fill=factor(dat))) +
>geom_bar(width=1) +
>opts(axis.text.x = 
> theme_text(size = 18))
> smp + coord_polar()
>
> Q3. calculate the frequencies themselves and use stat="identity" inside the 
> aes call:
>
> L <- table(dat)
> L.df <- data.frame(L)
> L.df <- cbind(L.df, "SQRrelFreq" = sqrt(L.df[,2]/sum(L.df[,2])))
> smp2 <- ggplot(L.df, aes(x=dat,y=SQRrelFreq, stat="identity", fill=dat)) +
>    geom_bar(width=1) +
>opts(axis.text.x = 
> theme_text(size = 18))
> smp2 + coord_polar()
>
> Cheers,
> Tim
>
>>>> Tim Howard 3/9/2010 9:25 AM >>>
> Dear R gurus -
>
> consider this plot:
>
> library(ggplot2)
> dat <- sample(1:8,100,replace=TRUE)
> smp <- ggplot(data.frame(dat), aes(x=factor(dat),fill=factor(dat))) + 
> geom_bar(width=1)
> smp + coord_polar()
>
>
> Q1. How do I change the font size and weight of bar labels (1,2,3...)?  I've 
> been wallowing in the 'Themes' structure and I just can't figure out the 
> correct place to change the definitions. Along these same lines, what does 
> 'strip' mean when referring to strip text?
>
> Q2. How can I move the legend defining bar height into the plot, so that it 
> overlays the lines they refer to?
>
>
> Consider the same figure using Circstats:
>
> library(CircStats)
> dat.rad <- (dat*((2*pi)/8)) -(2*pi)/16
> rose.diag(dat.rad, bins = 8)  #note the origin is to the right rather than on 
> top
>
> Q3. The key difference is that CircStats uses an area-based calculation for 
> the size of each slice, which makes for a different presentation than 
> ggplot2. Any suggestions on how to use this calculation method in the ggplot 
> framework?
>
> Thanks in advance for your help.
> Tim Howard
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ggplot2 rose diagram

2010-03-10 Thread Tim Howard
Got it. 
Thanks again so much for the help. 
Best, 
Tim

>>> hadley wickham  3/10/2010 2:46 PM >>>
> By Q2 I was trying to refer to the Y-axis labels. For the polar plot, the 
> Y-axis labels reside left of the panel. I was looking for a way to get the 
> Y-axis labels to radiate out from the center so it would be clear which line 
> each label refers to. I still can't find any reference to moving the y-axis 
> labels (for any plot type) in any of your documentation. It's probably my 
> failure

Ah ok, there's not currently anyway to do that.  You'd be best off
just adding the text and markings yourself with geom_text and
geom_line.

> For Q3, can you speak to whether the square-root transformation of counts for 
> the y-axis provides the same function as square-root of the frequencies e.g.  
> sqrt(countInBin/totalCount) . My goal is for area of the slice to correlate 
> with number of records in each bin (rather than area expanding at a faster 
> rate).

It should produce a plot that looks the same as your explicit
transformation, so yes.

Hadley


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] heatmap.2 - ColSideColors question

2010-03-16 Thread Tim Smith
Hi,

I wanted to make more than one side color bar. For example, I can make one side 
color bar (col1) with the following code:
---
library(gplots)

mat <- matrix(sample(1:100,40),nrow=5)
class1 <- c(rep(0,4),rep(1,4))
col1 <- ifelse(class1 == 0,"blue","red")
class2 <- c(rep(1,3),rep(2,5))
col2 <- ifelse(class2 == 1,"yellow","green")

heatmap.2(mat,col=greenred(75),ColSideColors=col1,trace="none",
  dendrogram = "column",labCol = NULL)


---
How can I modify the code so that both col1 & col2 are displayed in the 
heatmap? 
thanks!
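
One route that is often suggested (an editorial sketch; it assumes the CRAN
package heatmap.plus, whose heatmap.plus() accepts a matrix of column-side
colours, one column per bar, rather than a single vector):

# install.packages("heatmap.plus")
library(heatmap.plus)
cc <- cbind(bar1 = col1, bar2 = col2)      # 8 x 2 character matrix, one column per bar
heatmap.plus(mat, col = greenred(75), ColSideColors = cc)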


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] abline on heatmap

2010-03-20 Thread Tim Smith
Hi,

Is there a way I can draw an abline on a heatmap? I try the abline function, 
but don't get the line. My sample code is:
library(gplots)   # greenred() comes from gplots
mat <- matrix(sample(1:100,40),nrow=5)
heatmap(mat,col=greenred(75),trace="none",
dendrogram = "column",labCol = NULL)
abline(h=5,v=4)


thanks!
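
Editorial note: heatmap() and heatmap.2() draw several panels via layout(), and
by the time a later abline() runs the current plot is typically one of the other
panels (a dendrogram or the colour key), not the image. One hedged workaround,
assuming gplots' heatmap.2() and its add.expr argument, which is evaluated right
after the image is drawn:

library(gplots)
mat <- matrix(sample(1:100, 40), nrow = 5)
heatmap.2(mat, col = greenred(75), trace = "none", dendrogram = "column",
          add.expr = abline(h = 2.5, v = 3.5, lwd = 2))   # h/v are in cell units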


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Calculating distance between spatial points

2009-06-25 Thread Tim Clark

Dear List,

I am trying to determine the speed an animal is traveling on each leg of a 
track.  My data is in longitude and latitude, so I am using the package rgdal 
to convert it into a spatial points data frame and transform it to UTM.  I 
would then like to find the difference between successive longitudes and 
latitudes, find the euclidean distance between points, and compute the speed of 
the animal on each leg.

My problem is that once I convert the lat and long into a spatial points data 
frame I can not access the lat and long individually.  As far as I know I need 
to convert them in order to transform the lat and long to UTM.  Is there a way 
I can call each variable separately in the sp dataframe?  My code with example 
data is below.  Any suggestions would be appreciated.

  library(rgdal) 
  date.diff<-c(20,30,10,30)
  Long<-c(-156.0540 ,-156.0541 ,-156.0550 ,-156.0640)
  Lat<-c(19.73733,19.73734,19.73743,19.73833) 
   
  SP<-data.frame(Long,Lat)  
  SP<-SpatialPoints(SP,proj4string=CRS("+proj=longlat +ellps=WGS84"))
  SP.utm<-spTransform(SP, CRS("+proj=utm +zone=4 +ellps=WGS84")) 
  
  long.diff<-diff(SP.utm$Long)
  lat.diff<-diff(SP.utm$Lat)
  
  d=(long.diff^2+lat.diff^2)^.5
  speed=d/date.diff
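
  # Editorial sketch (not from the thread), assuming sp's coordinates()
  # accessor, which returns the coordinate matrix of a Spatial object:
  xy.utm    <- coordinates(SP.utm)        # two-column matrix: easting, northing
  long.diff <- diff(xy.utm[, 1])
  lat.diff  <- diff(xy.utm[, 2])
  d     <- sqrt(long.diff^2 + lat.diff^2) # metres per leg
  speed <- d / date.diff[seq_along(d)]    # assumes one elapsed-time value per leg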


Aloha,

Tim



Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How should I denormalise a data frame list of lists column?

2009-07-01 Thread Tim Slidel
Hi,

(apologies for initial html posting)

I have a data frame where one column is a list of lists. I would like to
subset the data frame based on membership of the lists in that column and be
able to 'denormalise' the data frame so that a row is duplicated for each of
its list elements. Example code follows:

# The data is read in in this form with the c2 list values in single strings
which I then split to give lists:
> f1 <- data.frame(c1=0:2, c2=c("A,B,C", "A,E", "F,G,H"))
> f1$Split <- strsplit(as.character(f1$c2), ",")
> f1
  c1c2   Split
1  0 A,B,C A, B, C
2  1   A,EA, E
3  2 F,G,H F, G, H

# So f1$Split is the list of lists column I want to denormalise or use as
the subject for subsetting

# f2 is data to use to select subsets from f1
> f2 <- data.frame(c1=LETTERS[0:8], c2=c("Apples",
"Badger","Camel","Dog","Elephants","Fish","Goat","Horse"))
> f2
  c1   c2
1  AApple
2  B   Badger
3  CCamel
4  D  Dog
5  E Elephant
6  F Fish
7  G Goat
8  HHorse

# I was able to find which rows of f2 are represented in the f1 lists (not
entirely sure if this is the best way to do this):
> f3 <- f2[f2$c1 %in% unlist(f1$Split),]
> f3
  c1   c2
1  AApple
2  B   Badger
3  CCamel
5  E Elephant
6  F Fish
7  G Goat
8  HHorse

# Note that 'D' is missing from f3 because it is not in any of the f1$Split
lists

# f4 is a subset of f3 and I want to find the rows of f1 where f1$Split
contains any of f4$c1:
> f4 <- f3[c(1,3),]
> f4
  c1c2
1  A Apple
3  C Camel

# I tried this and it didn't work, presumably because it's trying to match
against each list object rather than the list elements, but unlist doesn't
do the trick here because I need the individual rows, I need to unlist on a
row by row basis.
> f1[f1$Split %in% f4$c1,]
[1] c1c2Split
<0 rows> (or 0-length row.names)
> f1[f4$c1 %in% f1$Split,]
[1] c1c2Split
<0 rows> (or 0-length row.names)
> f1[match(f4$c1, f1$Split),]
 c1   c2 Split
NA   NA   NULL
NA.1 NA   NULL

I also looked at reshape which I don't think helps. I thought I might be
able to create a new data frame with the f1$Split denormalised and use that,
but couldn't find a way to do this, the result I'd want there is something
like:
> f1_denorm
  c1c2   Split   SplitDenorm
1  0 A,B,C A, B, C   A
2  0 A,B,C A, B, C   B
3  0 A,B,C A, B, C   C
4  1   A,EA, E A
5  1   A,EA, E E
6  2 F,G,H F, G, H   F
7  2 F,G,H F, G, H  G
8  2 F,G,H F, G, H  H

I thought perhaps for loops would be the next thing to try, but there must
be a better way!
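
One base-R possibility (an editorial sketch, not from the thread): repeat each
row once per element of its Split list, and test membership row by row:

n <- sapply(f1$Split, length)
f1_denorm <- data.frame(f1[rep(seq_len(nrow(f1)), n), c("c1", "c2")],
                        SplitDenorm = unlist(f1$Split), row.names = NULL)

# rows of f1 whose Split list contains any of f4$c1
f1[sapply(f1$Split, function(s) any(s %in% f4$c1)), ]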

Thanks for any help.

Tim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Collinearity in Linear Multiple Regression

2009-07-21 Thread Tim Paysen
Actually, the condition index (CI) and VIF are just a start.  It is best to look at what 
they call a matrix of "variance proportions" (found in SAS and a few other 
places...)--which hardly anyone understands (including the SAS folks).  It is a 
matrix of estimates of what the variances of the regression coefficients would 
be if you could figure them out in the first place.  It shows which factors 
dominate over others IN THE PARTICULAR SETUP you are analyzing.  The matrix is 
often calculated using eigenvalues, but is best done with Singular Value 
Decomposition techniques (you don't have to have a square matrix, and you 
maintain better precision).  Analysts will say that it can display an unstable 
system -- which is correct, but they generally say that, if it's true, you have 
bad data and should throw it out--or collect more.  I suggest care, because it 
may be illustrating the nature of the system you are studying.
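For readers who want something concrete, a minimal sketch (an editorial illustration,
not code from this thread; it assumes a fitted lm() called fit with several predictors
and that the car package is installed):

library(car)
vif(fit)                                # variance inflation factors
kappa(model.matrix(fit), exact = TRUE)  # condition number of the design matrix
svd(scale(model.matrix(fit)[, -1]))$d   # singular values; a wide spread signals collinearity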

The only decent reference that I know of is a little book (hard to read) that I 
can't remember off the top of my head.  Have to look it up.

Timothy E. Paysen, Phd
Research Forester (ret.)





From: John Sorkin 
To: Alex Roy ; r-help@r-project.org
Sent: Tuesday, July 21, 2009 4:19:11 AM
Subject: Re: [R] Collinearity in Linear Multiple Regression

I suggest you start by doing some reading about Condition index (CI) and 
variation inflation factor (VIF). Once you have reviewed the theory, a search 
of search.r-project.org (under the help menu in a windows-based R installation) 
for VIF will help you obtain values for VIF, c.f. 
http://finzi.psych.upenn.edu/R/library/HH/html/vif.html 
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> Alex Roy  7/21/2009 7:01 AM >>>
Dear all,
                  How can I test for collinearity in the predictor data set
for multiple linear regression?

Thanks

Alex


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Duplicated date values aren't duplicates

2009-07-23 Thread Tim Clark

Dear list,

I just had a function (as.ltraj in Adehabitat) give me the following error:

"Error in as.ltraj(xy, id, date = da) : non unique dates for a given burst"

I checked my dates and got the following:

>   dupes<-mydata$DateTime[duplicated(mydata$DateTime)]
> dupes
[1] (07/30/02 00:00:00) (08/06/03 17:45:00)

Is there a reason different dates would come up as duplicate values?  I would 
prefer not to have to delete them if I don't have to.  Any suggestions on how 
to get R to realize they are different?

Thanks,

Tim



Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Duplicated date values aren't duplicates

2009-07-24 Thread Tim Clark

Don and Jim,

Thanks!  I got it!  Duplicated is only returning one of the two duplicated 
dates (the second date).  It all makes sense now!
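For the archive, a minimal sketch (an editorial addition, assuming mydata$DateTime as in
the original post) that lists every copy of each duplicated timestamp rather than only
the later one:

dupes <- duplicated(mydata$DateTime) | duplicated(mydata$DateTime, fromLast = TRUE)
mydata[dupes, ]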

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Fri, 7/24/09, Don MacQueen  wrote:

> From: Don MacQueen 
> Subject: Re: [R] Duplicated date values aren't duplicates
> To: "Tim Clark" , r-help@r-project.org
> Date: Friday, July 24, 2009, 4:00 AM
> Look at results of
> 
>    table( mydata$DateTime )
> 
> and I think you will see that some are duplicated.
> Specifically, the 
> two in your dupes object.
> 
> -Don
> 
> At 5:50 PM -0700 7/23/09, Tim Clark wrote:
> >Dear list,
> >
> >I just had a function (as.ltraj in Adehabitat) give me
> the following error:
> >
> >"Error in as.ltraj(xy, id, date = da) : non unique
> dates for a given burst"
> >
> >I checked my dates and got the following:
> >
> > 
> >   dupes<-mydata$DateTime[duplicated(mydata$DateTime)]
> >>  dupes
> >[1] (07/30/02 00:00:00) (08/06/03 17:45:00)
> >
> >Is there a reason different dates would come up as
> duplicate values? 
> >I would prefer not to have to delete them if I don't
> have to.  Any 
> >suggestions on how to get R to realize they are
> different?
> >
> >Thanks,
> >
> >Tim
> >
> >
> >
> >Tim Clark
> >Department of Zoology
> >University of Hawaii
> >
> >__
> >R-help@r-project.org
> mailing list
> >https://*stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide 
> >http://*www.*R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained,
> reproducible code.
> 
> 
> -- 
> --
> Don MacQueen
> Environmental Protection Department
> Lawrence Livermore National Laboratory
> Livermore, CA, USA
> 925-423-1062
> --
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Maximizing values in subsetted dataframe

2009-07-28 Thread Tim Clark

Dear List,

I am trying to sub-sample some data by taking a data point every x minutes.  
The data contains missing values, and I would like to take the sub-sample that 
maximizes the number of valid points in the sample, i.e. minimizes the number 
of NA's in the data set.  

For example, given the following:

da<-seq(Sys.time(),by=1,length.out=10)
x<-c(1,2,NA,4,NA,6,NA,8,9,10)
mydata<-data.frame(da,x)

If I wanted to take a subsample every 2 seconds, I would have the following two 
possible answers:

answer1: 2,4,NA,8
answer2: 1,NA,NA,7

I would like a function that would choose between these and obtain the one with 
the fewest missing values.

In my real dataset I have multiple variables collected every second and I would 
like to subsample it every 5, 10, and 15 minutes.
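A minimal sketch of one way to do this (an editorial addition, untested on the real
data): for a step of k rows, try every starting offset and keep the one whose x column
has the fewest NAs.

k       <- 2                                   # e.g. every 2nd row; 300 would be 5 min of 1-s data
offsets <- lapply(seq_len(k), function(s) seq(s, nrow(mydata), by = k))
best    <- which.min(sapply(offsets, function(idx) sum(is.na(mydata$x[idx]))))
mysub   <- mydata[offsets[[best]], ]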

I appreciate your help.

Tim

Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] creating MS Access query objects using RODBC

2009-08-05 Thread Tim Calkins
Hi -
I'm trying to use R to create an MS Access query object. In particular, I
would like to pass a given sql statement to a variety of Access files and
have that sql statement saved as an Access Query in each db. Is this
possible using R?

I'm aware that I could use RODBC sqlQuery and write sql to make a table or
that I could run the sql, extract it to R, and then use sqlSave to save the
dataframe as a table in the db.
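One untested sketch (an assumption on my part, not a documented RODBC feature): Jet/ACE
SQL accepts some DDL such as CREATE VIEW over an ODBC channel, and Access stores the
result as a saved query.  The path and query name below are hypothetical.

library(RODBC)
ch <- odbcConnectAccess2007("C:/data/example.accdb")   # hypothetical database
sqlQuery(ch, "CREATE VIEW qryBigSales AS SELECT * FROM Sales WHERE Amount > 100")
odbcClose(ch)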

thanks in advance,

-- 
Tim Calkins
0406 753 997


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] xts off by one confusion or error

2010-04-08 Thread Tim Coote

Hullo
I may have missed something blindingly obvious here. I'm using xts to  
handle some timeseries data. I've got daily measurements for 100  
years. If I try to reduce the error rate by taking means of each  
month, I'm getting what at first sight appears to be conflicting  
information. Here's a small subset to show the problem:


A small set of data:
> vv
 x
2010-02-01 6.1
2010-02-02 6.1
2010-02-03 6.0
2010-02-04 6.0
2010-02-05 6.0
2010-02-06 6.1
2010-02-07 6.1
2010-02-08 6.1
2010-02-09 6.1
2010-02-10 6.2

Aggregate:
> aggregate (vv, as.yearmon (index (vv)), mean)

Feb 2010 6.08

That's fine. But if I explicitly convert to xts (which the answer  
ought to be, so this should be a noop), the values shift back by one  
month:

> xts (aggregate (vv, as.yearmon (index (vv)), mean))
x
Jan 2010 6.08

Just to confirm the classes:
> class (aggregate (vv, as.yearmon (index (vv)), mean))
[1] "zoo"

> class (vv)
[1] "xts" "zoo"

And to confirm that as.yearmon is returning the right month:
> as.yearmon (index (vv))
 [1] "Feb 2010" "Feb 2010" "Feb 2010" "Feb 2010" "Feb 2010" "Feb 2010"
 [7] "Feb 2010" "Feb 2010" "Feb 2010" "Feb 2010"

This run was on a stock Fedora 10 build:
> version
   _
platform   i386-redhat-linux-gnu
arch   i386
os linux-gnu
system i386, linux-gnu
status
major  2
minor  10.0
year   2009
month  10
day26
svn rev50208
language   R
version.string R version 2.10.0 (2009-10-26)

And from installed.packages ():
xts    NA   NA  "GPL-3"   "2.10.0"
zoo    NA   NA  "GPL-2"   "2.10.0"

Any help gratefully received.

Tim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xts off by one confusion or error

2010-04-08 Thread Tim Coote
I find the following even more confusing as I thought that xts was a  
subclass of zoo and I'd expected that the conversion would have been  
more transparent.


> aggregate (vv, as.yearmon(index(vv)), mean)

Feb 2010 6.08
> xts (aggregate (vv, as.yearmon(index(vv)), mean))
x
Jan 2010 6.08
> zoo (aggregate (vv, as.yearmon(index(vv)), mean))
x
Feb 2010 6.08
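A check worth running (an editorial suggestion, offered as a guess rather than a
diagnosis): the xts index is stored internally as a POSIXct-style number, so the
yearmon-to-POSIXct conversion is time-zone dependent and could account for a shift of
one month.

z <- aggregate (vv, as.yearmon(index(vv)), mean)
index (z)          # yearmon index of the zoo object
index (xts (z))    # index after conversion to xts
Sys.timezone ()    # the conversion above depends on the local time zone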

On 8 Apr 2010, at 15:53, Tim Coote wrote:


Hullo
I may have missed something blindingly obvious here. I'm using xts  
to handle some timeseries data. I've got daily measurements for 100  
years. If I try to reduce the error rate by taking means of each  
month, I'm getting what at first sight appears to be conflicting  
information. Here's a small subset to show the problem:


A small set of data:
> vv
x
2010-02-01 6.1
2010-02-02 6.1
2010-02-03 6.0
2010-02-04 6.0
2010-02-05 6.0
2010-02-06 6.1
2010-02-07 6.1
2010-02-08 6.1
2010-02-09 6.1
2010-02-10 6.2

Aggregate:
> aggregate (vv, as.yearmon (index (vv)), mean)

Feb 2010 6.08

That's fine. But if I explicitly convert to xts (which the answer  
ought to be, so this should be a noop), the values shift back by one  
month:

> xts (aggregate (vv, as.yearmon (index (vv)), mean))
   x
Jan 2010 6.08

Just to confirm the classes:
> class (aggregate (vv, as.yearmon (index (vv)), mean))
[1] "zoo"

> class (vv)
[1] "xts" "zoo"

And to confirm that as.yearmon is returning the right month:
> as.yearmon (index (vv))
[1] "Feb 2010" "Feb 2010" "Feb 2010" "Feb 2010" "Feb 2010" "Feb 2010"
[7] "Feb 2010" "Feb 2010" "Feb 2010" "Feb 2010"

This run was on a stock Fedora 10 build:
> version
  _
platform   i386-redhat-linux-gnu
arch   i386
os linux-gnu
system i386, linux-gnu
status
major  2
minor  10.0
year   2009
month  10
day26
svn rev50208
language   R
version.string R version 2.10.0 (2009-10-26)

And from installed.packages ():
xtsNA   NA  "GPL-3""2.10.0"
zooNA   NA  "GPL-2""2.10.0"

Any help gratefully received.

Tim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Best subset of models for glm.nb()

2010-04-20 Thread Tim Clark
Dear List,

I am looking for a function that will find the best subset of negative binomial 
models.  I have a large data set with 15 variables that I am interested in.  I 
want an easy way to run all possible models and find a subset of the "best" 
models that I can then look at in more detail.  I have found two functions that 
seem to provide what I am looking for, but am not sure which one (if either) 
are appropriate.

glmulti() in package glmulti does an exhaustive search of all models and gives 
a number of candidate models to choose from based on your choice of Information 
Criterion.  This seems to be exactly what I am after, but I found nothing about 
it on this list which makes me think there is some reason no one is using it.

gl1ce() in package lasso2 uses the least absolute shrinkage and selection 
operator (lasso), i.e. an L1 penalty that can shrink some coefficients to exactly 
zero and so drop variables.  I found it at another thread:
http://tolstoy.newcastle.edu.au/R/help/05/03/0121.html
I did not understand the paper it was based on, and want to know if it even 
does what I am interested in before investing a lot of time in trying to 
understand it.

Yes, I have read about the problems with stepwise algorithms and am looking for 
a valid alternative to narrowing down models when you have a lot of data and a 
large number of variables you're interested in.  

Any thoughts on either of these methods?  Or should I be doing something else?

Thanks for your help,

Tim


Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Periodic regression - lunar percent cover

2010-04-29 Thread Tim Clark
Dear List,

I am trying include a lunar variable in a model and am having problems figuring 
out the correct way to include it.  I want to convert the percent lunar 
illumination (fraction of moon showing) to a combination of sin and cos 
variables to account for the periodic nature of the lunar cycle.  Would someone 
let me know if I am doing this correctly?  I have included the first 20 
variables from my dataset as an example.  Y is count data and lp is the lunar 
percent cover.  The lunar period is 29.53.

y<-c(1, 3, 0, 0, 0, 0, 2, 4, 0, 1, 0, 5, 3, 2, 4, 2, 0, 1, 3, 5)
lp<-c(0.80, 0.88, 0.62, 0.19, 0.21, 0.01, 0.70, 1.00, 0.88, 0.04, 0.70, 0.93, 
0.23, 0.99, 0.19, 0.79, 1.00, 0.03, 0.01, 0.00)
g1<-glm(y~cos((2*pi*lp)/29.530589)+sin((2*pi*lp)/29.530589))
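An aside offered as an assumption rather than an answer: in the usual periodic-regression
setup the divisor is the period of the predictor itself, e.g. day-of-lunar-cycle over
29.53 days; if lp is treated as a 0-1 fraction of the cycle, its period is 1 and the
divisor drops out.  The lunar.day vector below is hypothetical illustrative data.

lunar.day <- runif(20, 0, 29.530589)   # hypothetical: days into the lunar cycle
g.day  <- glm(y ~ cos(2*pi*lunar.day/29.530589) + sin(2*pi*lunar.day/29.530589))
g.frac <- glm(y ~ cos(2*pi*lp) + sin(2*pi*lp))  # lp taken as a 0-1 fraction of the cycle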

Thanks,

Tim




Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Estimating theta for negative binomial model

2010-05-03 Thread Tim Clark
Dear List,

I am trying to do model averaging for a negative binomial model using the 
package AICcmodavg.  I need to use glm() since the package does not accept 
glm.nb() models.  I can get glm() to work if I first run glm.nb and take theta 
from that model, but is there a simpler way to estimate theta for the glm 
model?  The two models are: 

mod.nb<-glm.nb(mantas~site,data=mydata)
mod.glm<-glm(mantas~site,data=mydata, family=negative.binomial(mod.nb$theta)) 

How else can I get theta for the family=negative.binomial(theta=???) 
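One minimal sketch (an editorial addition, not a suggestion from the thread): MASS also
provides theta.ml(), which estimates theta directly from the response and fitted means,
so a provisional Poisson fit can supply the means without a full glm.nb() run.

library(MASS)
fit0   <- glm(mantas ~ site, data = mydata, family = poisson)  # provisional fit for the means
theta0 <- theta.ml(mydata$mantas, fitted(fit0))
mod.glm2 <- glm(mantas ~ site, data = mydata, family = negative.binomial(theta0))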

Thanks!

Tim



Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tim Howard
All,
 I'm trying again with a slightly more generic version of my first question. I 
can extract the
plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:

 # get some data
dat <- rnorm(100)
 # grab histogram data
hdat <- hist(dat)
hdat #provides details of the hist output

 #grab boxplot data
bdat <- boxplot(dat)
bdat #provides details of the boxplot output

 # the same works for randomForest
library(randomForest)
data(mtcars)
RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), 
log="y")
RFdat


##But, I can't use this method in ROCR
library(ROCR)
data(ROCR.xval)
RCdat <- plot(perf, avg="threshold")

RCdat
## output:  NULL

Does anyone have any tricks for piping or extracting these data?  
Or, perhaps for steering me in another direction?

Thanks,
Tim


From: "Tim Howard" 
Subject: [R] ROCR.plot methods, cross validation averaging
To: , ,

Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
Content-Type: text/plain; charset=US-ASCII

Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) - 

I think my first question is generic and could apply to many methods, 
which is why I'm directing this initially to R-help as well as Tobias and 
Oliver.

Question 1. The plot function in ROCR will average your cross validation
data if asked. I'd like to use that averaged data to find a "best" cutoff
but I can't figure out how to grab the actual data that get plotted.
A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.

Question 2. I am asking ROCR to average lists with varying lengths for
each list entry. See my example below. None of the ROCR examples have data
structured in this manner. Can anyone speak to whether the averaging
methods in ROCR allow for this? If I can't easily grab the data as desired
from Question 1, can someone help me figure out how to average the lists,
by threshold, similarly?

Question 3. If my cross validation data happen to have a list entry whose
length = 2, ROCR errors out. Please see the second part of my example.
Any suggestions?

#reproducible examples exemplifying my questions
##part one##
library(ROCR)
data(ROCR.xval)
 # set up data so it looks more like my real data
sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
testSet <- ROCR.xval
 # do the extraction
for (i in 1:length(ROCR.xval[[1]])){
  y <- sample(c(1:350),sampSize[i])
  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
  }
 # now massage the data using ROCR, set up for a ROC plot
 # if it errors out here, run the above sample again.
pred <- prediction(testSet$predictions, testSet$labels)
perf <- performance(pred,"tpr","fpr")
 # create the ROC plot, averaging by cutoff value
plot(perf, avg="threshold")
 # check out the structure of the data
str(perf)
 # note the ragged edges of the list and that I assume averaging
 # whether it be vertical, horizontal, or threshold, somehow 
 # accounts for this?

## part two ##
# add a list entry with only two values
perf@x.values[[1]] <- c(0,1)
perf@y.values[[1]] <- c(0,1)
perf@alpha.values[[1]] <- c(Inf,0)

plot(perf, avg="threshold")

##output results in an error with this message
# Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
# missing value where TRUE/FALSE needed


Thanks in advance for your help
Tim Howard
New York Natural Heritage Program

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tim Howard
Whoops, sorry. Here is the full set with the missing lines:

library(ROCR)
data(ROCR.xval)
pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
perf <- performance(pred,"tpr","fpr")
RCdat <- plot(perf, avg="threshold")
RCdat

Thanks.
Tim
>>> David Winsemius  9/24/2009 9:25 AM >>>

On Sep 24, 2009, at 9:09 AM, Tim Howard wrote:

> All,
> I'm trying again with a slightly more generic version of my first  
> question. I can extract the
> plotted values from hist(), boxplot(), and even plot.randomForest().  
> Observe:
>
> # get some data
> dat <- rnorm(100)
> # grab histogram data
> hdat <- hist(dat)
> hdat #provides details of the hist output
>
> #grab boxplot data
> bdat <- boxplot(dat)
> bdat #provides details of the boxplot output
>
> # the same works for randomForest
> library(randomForest)
> data(mtcars)
> RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE,  
> ntree=100), log="y")
> RFdat
>
>
> ##But, I can't use this method in ROCR
> library(ROCR)
> data(ROCR.xval)
> RCdat <- plot(perf, avg="threshold")

That code throws an object not found error. Perhaps you defined perf  
earlier?

David


>
> RCdat
> ## output:  NULL
>
> Does anyone have any tricks for piping or extracting these data?
> Or, perhaps for steering me in another direction?
>
> Thanks,
> Tim
>
>
> From: "Tim Howard" 
> Subject: [R] ROCR.plot methods, cross validation averaging
> To: , ,
>   
> Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
> Content-Type: text/plain; charset=US-ASCII
>
> Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -
>
> I think my first question is generic and could apply to many methods,
> which is why I'm directing this initially to R-help as well as  
> Tobias and Oliver.
>
> Question 1. The plot function in ROCR will average your cross  
> validation
> data if asked. I'd like to use that averaged data to find a "best"  
> cutoff
> but I can't figure out how to grab the actual data that get plotted.
> A simple redirect of the plot (such as test <- plot(mydata)) doesn't  
> do it.
>
> Question 2. I am asking ROCR to average lists with varying lengths for
> each list entry. See my example below. None of the ROCR examples  
> have data
> structured in this manner. Can anyone speak to whether the averaging
> methods in ROCR allow for this? If I can't easily grab the data as  
> desired
> from Question 1, can someone help me figure out how to average the  
> lists,
> by threshold, similarly?
>
> Question 3. If my cross validation data happen to have a list entry  
> whose
> length = 2, ROCR errors out. Please see the second part of my example.
> Any suggestions?
>
> #reproducible examples exemplifying my questions
> ##part one##
> library(ROCR)
> data(ROCR.xval)
> # set up data so it looks more like my real data
> sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
> testSet <- ROCR.xval
> # do the extraction
> for (i in 1:length(ROCR.xval[[1]])){
>  y <- sample(c(1:350),sampSize[i])
>  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
>  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
>  }
> # now massage the data using ROCR, set up for a ROC plot
> # if it errors out here, run the above sample again.
> pred <- prediction(testSet$predictions, testSet$labels)
> perf <- performance(pred,"tpr","fpr")
> # create the ROC plot, averaging by cutoff value
> plot(perf, avg="threshold")
> # check out the structure of the data
> str(perf)
> # note the ragged edges of the list and that I assume averaging
> # whether it be vertical, horizontal, or threshold, somehow
> # accounts for this?
>
> ## part two ##
> # add a list entry with only two values
> perf@x.values[[1]] <- c(0,1)
> perf@y.values[[1]] <- c(0,1)
> perf@alpha.values[[1]] <- c(Inf,0)
>
> plot(perf, avg="threshold")
>
> ##output results in an error with this message
> # Error in if (from == to) rep.int(from, length.out) else  
> as.vector(c(from,  :
> # missing value where TRUE/FALSE needed
>
>
> Thanks in advance for your help
> Tim Howard
> New York Natural Heritage Program
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tim Howard
David, 
Thank you for your reply. Yes, I can access the y-values slot with perf@y.values,
but note that in the cross-validation example (ROCR.xval), the plot function 
averages across the list of ten vectors in the y-values slot. 

I might be able to create a function to average across these ten vectors, but, 
since 
the plot function already does it for me, I thought it most efficient to get 
the values
from the function.  The compounding factor is that averaging needs to 
incorporate 
some kind of complex (to me at least) equalization based on the third slot 
(alpha.values). 

I don't know how to average vectors (especially uneven-length vectors) that 
align
using the alpha-values (suggestions here welcome!). Again, the plot function 
does 
this for me... if I could just get those values. 
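For the archive, a rough sketch of the idea (an editorial approximation of what
threshold averaging does, not ROCR's actual internals; it assumes the pred and perf
objects from the example quoted below): interpolate each fold onto a common grid of
finite cutoffs with approx(), then average across folds.

alphas <- sort(unique(unlist(perf@alpha.values)), decreasing = TRUE)
alphas <- alphas[is.finite(alphas)]
interp <- function(vals, cuts) approx(cuts[is.finite(cuts)], vals[is.finite(cuts)],
                                      xout = alphas, rule = 2)$y
avg.x  <- rowMeans(mapply(interp, perf@x.values, perf@alpha.values))
avg.y  <- rowMeans(mapply(interp, perf@y.values, perf@alpha.values))
plot(avg.x, avg.y, type = "l", xlab = "Average fpr", ylab = "Average tpr")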


Tobias, 
You suggestion to change the plot.performance function is a good one. I'll see 
if 
I can get in there and tweak it. 


Thanks to both of you for the help.
Tim


>>> David Winsemius  9/24/2009 9:43 AM >>>

On Sep 24, 2009, at 9:09 AM, Tim Howard wrote:

> All,
> I'm trying again with a slightly more generic version of my first  
> question. I can extract the
> plotted values from hist(), boxplot(), and even plot.randomForest().  
> Observe:
>
> # get some data
> dat <- rnorm(100)
> # grab histogram data
> hdat <- hist(dat)
> hdat #provides details of the hist output
>
> #grab boxplot data
> bdat <- boxplot(dat)
> bdat #provides details of the boxplot output
>
> # the same works for randomForest
> library(randomForest)
> data(mtcars)
> RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE,  
> ntree=100), log="y")
> RFdat
>
>
> ##But, I can't use this method in ROCR
> library(ROCR)
> data(ROCR.xval)
> RCdat <- plot(perf, avg="threshold")
>
> RCdat
> ## output:  NULL
>
> Does anyone have any tricks for piping or extracting these data?
> Or, perhaps for steering me in another direction?

After looking at the examples in ROCR, my guess is that you really  
ought to examine the perf object itself. It's an S4 object so some of  
the access to internals are a bit different. In the example  
performance object I just created, the y-values slot values would ba  
obtainable with:

perf@y.values 

  There is also help from:
?"plot-methods"

-- 
David
>
> Thanks,
> Tim
>
>
> From: "Tim Howard" 
> Subject: [R] ROCR.plot methods, cross validation averaging
> To: , ,
>   
> Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
> Content-Type: text/plain; charset=US-ASCII
>
> Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -
>
> I think my first question is generic and could apply to many methods,
> which is why I'm directing this initially to R-help as well as  
> Tobias and Oliver.
>
> Question 1. The plot function in ROCR will average your cross  
> validation
> data if asked. I'd like to use that averaged data to find a "best"  
> cutoff
> but I can't figure out how to grab the actual data that get plotted.
> A simple redirect of the plot (such as test <- plot(mydata)) doesn't  
> do it.
>
> Question 2. I am asking ROCR to average lists with varying lengths for
> each list entry. See my example below. None of the ROCR examples  
> have data
> structured in this manner. Can anyone speak to whether the averaging
> methods in ROCR allow for this? If I can't easily grab the data as  
> desired
> from Question 1, can someone help me figure out how to average the  
> lists,
> by threshold, similarly?
>
> Question 3. If my cross validation data happen to have a list entry  
> whose
> length = 2, ROCR errors out. Please see the second part of my example.
> Any suggestions?
>
> #reproducible examples exemplifying my questions
> ##part one##
> library(ROCR)
> data(ROCR.xval)
> # set up data so it looks more like my real data
> sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
> testSet <- ROCR.xval
> # do the extraction
> for (i in 1:length(ROCR.xval[[1]])){
>  y <- sample(c(1:350),sampSize[i])
>  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
>  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
>  }
> # now massage the data using ROCR, set up for a ROC plot
> # if it errors out here, run the above sample again.
> pred <- prediction(testSet$predictions, testSet$labels)
> perf <- performance(pred,"tpr","fpr")
> # create the ROC plot, averaging by cutoff value
> plot(perf, avg="threshold")
> # check out the structure of the data
> str(perf)
> # note the ragged edges of the list and that I assume averagin

Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tim Howard
Yes, that's exactly what I am after. Thank you for clarifying my problem for me!

I'll try to dive into the plot.performance function.

Best, 
Tim

>>> Tobias Sing  9/24/2009 9:57 AM >>>
Tim,

if I understand correctly, you are trying to get the numerical values
of averaged cross-validation curves.
Unfortunately the plot function of ROCR does not return anything in
the current version (it's a good suggestion to change this).

If you want a quick fix, you could change the plot.performance
function of ROCR to return back the values you wanted.

Kind regards,
  Tobias

On Thu, Sep 24, 2009 at 3:09 PM, Tim Howard  wrote:
> All,
>  I'm trying again with a slightly more generic version of my first question. 
> I can extract the
> plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:
>
>  # get some data
> dat <- rnorm(100)
>  # grab histogram data
> hdat <- hist(dat)
> hdat #provides details of the hist output
>
>  #grab boxplot data
> bdat <- boxplot(dat)
> bdat #provides details of the boxplot output
>
>  # the same works for randomForest
> library(randomForest)
> data(mtcars)
> RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), 
> log="y")
> RFdat
>
>
> ##But, I can't use this method in ROCR
> library(ROCR)
> data(ROCR.xval)
> RCdat <- plot(perf, avg="threshold")
>
> RCdat
> ## output:  NULL
>
> Does anyone have any tricks for piping or extracting these data?
> Or, perhaps for steering me in another direction?
>
> Thanks,
> Tim
>
>
> From: "Tim Howard" 
> Subject: [R] ROCR.plot methods, cross validation averaging
> To: , ,
>
> Message-ID: <4aba1079.6d16.00d...@gw.dec.state.ny.us>
> Content-Type: text/plain; charset=US-ASCII
>
> Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -
>
> I think my first question is generic and could apply to many methods,
> which is why I'm directing this initially to R-help as well as Tobias and 
> Oliver.
>
> Question 1. The plot function in ROCR will average your cross validation
> data if asked. I'd like to use that averaged data to find a "best" cutoff
> but I can't figure out how to grab the actual data that get plotted.
> A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.
>
> Question 2. I am asking ROCR to average lists with varying lengths for
> each list entry. See my example below. None of the ROCR examples have data
> structured in this manner. Can anyone speak to whether the averaging
> methods in ROCR allow for this? If I can't easily grab the data as desired
> from Question 1, can someone help me figure out how to average the lists,
> by threshold, similarly?
>
> Question 3. If my cross validation data happen to have a list entry whose
> length = 2, ROCR errors out. Please see the second part of my example.
> Any suggestions?
>
> #reproducible examples exemplifying my questions
> ##part one##
> library(ROCR)
> data(ROCR.xval)
>  # set up data so it looks more like my real data
> sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
> testSet <- ROCR.xval
>  # do the extraction
> for (i in 1:length(ROCR.xval[[1]])){
>  y <- sample(c(1:350),sampSize[i])
>  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
>  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
>  }
>  # now massage the data using ROCR, set up for a ROC plot
>  # if it errors out here, run the above sample again.
> pred <- prediction(testSet$predictions, testSet$labels)
> perf <- performance(pred,"tpr","fpr")
>  # create the ROC plot, averaging by cutoff value
> plot(perf, avg="threshold")
>  # check out the structure of the data
> str(perf)
>  # note the ragged edges of the list and that I assume averaging
>  # whether it be vertical, horizontal, or threshold, somehow
>  # accounts for this?
>
> ## part two ##
> # add a list entry with only two values
> perf@x.values[[1]] <- c(0,1)
> perf@y.values[[1]] <- c(0,1)
> perf@alpha.values[[1]] <- c(Inf,0)
>
> plot(perf, avg="threshold")
>
> ##output results in an error with this message
> # Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
> # missing value where TRUE/FALSE needed
>
>
> Thanks in advance for your help
> Tim Howard
> New York Natural Heritage Program
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] col headers in read.table()

2009-09-24 Thread Tim Smith
Hi,

I was trying to read in a file test.txt, which has the following data:
  norm  norm  norm  class  class  class
a    1     2     3      4      5      6
b    3     4     5      6      7      8
c    5     6     7      8      9     10


in my R code, I do the following:
---
> mat <- read.table('test.txt',header=T,row.names=1,sep='\t')
> mat

  norm norm.1 norm.2 class class.1 class.2
a    1      2      3     4       5       6
b    3      4      5     6       7       8
c    5      6      7     8       9      10
> 
--
What do I need to do so that I don't get 'norm.1', 'norm.2' etc., but just 
'norm', 'norm', ..., i.e. without the appended numbers?
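A minimal sketch of the usual fix (an editorial addition): read.table() passes the
header through make.names(), which appends the numbers to keep the names unique;
check.names = FALSE turns that off.

mat <- read.table('test.txt', header = TRUE, row.names = 1, sep = '\t',
                  check.names = FALSE)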
thanks,


  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Data formatting for matplot

2009-09-27 Thread Tim Clark
Dear List,

I am wanting to produce a multiple line plot, and know I can do it with matplot 
but can't get my data in the format I need.  I have a dataframe with three 
columns: individual ID, x, and y.  I have tried split() but it gives me a list 
of data frames, which is closer but not quite what I need.  For example:

id<-rep(seq(1,5,1),length.out=100)
x<-rnorm(100,5,1)
y<-rnorm(100,20,5)

mydat<-data.frame(id,x,y)
split.dat<-split(mydat[,2:3],mydat[,1])

I would appreciate your help in either how to get this into a format acceptable 
to matplot or other options for creating a multiple line plot.

Thanks,

Tim



Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data formatting for matplot

2009-09-27 Thread Tim Clark
Henrique,

Thanks for the suggestion.  I think I may not understand matplot() because the 
graph did not come out like it should have.  Gabor suggested:

library(lattice)
xyplot(y ~ x, mydat, groups = id)

Which gave what I was looking for.  Is there a way to get matplot() to give the 
same graph?  I don't have to use matplot(), but would like to understand its 
use.

Thanks,

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Sun, 9/27/09, Henrique Dallazuanna  wrote:

> From: Henrique Dallazuanna 
> Subject: Re: [R] Data formatting for matplot
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Sunday, September 27, 2009, 4:47 PM
> You can try this:
> 
> matplot(do.call(cbind, split.dat))
> 
> On Sun, Sep 27, 2009 at 11:42 PM, Tim Clark 
> wrote:
> > Dear List,
> >
> > I am wanting to produce a multiple line plot, and know
> I can do it with matplot but can't get my data in the format
> I need.  I have a dataframe with three columns; individuals
> ID, x, and y.  I have tried split() but it gives me a list
> of matrices, which is closer but not quite what I need.
>  For example:
> >
> > id<-rep(seq(1,5,1),length.out=100)
> > x<-rnorm(100,5,1)
> > y<-rnorm(100,20,5)
> >
> > mydat<-data.frame(id,x,y)
> > split.dat<-split(mydat[,2:3],mydat[,1])
> >
> > I would appreciate your help in either how to get this
> into a format acceptable to matplot or other options for
> creating a multiple line plot.
> >
> > Thanks,
> >
> > Tim
> >
> >
> >
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> >
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> >
> 
> 
> 
> -- 
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data formatting for matplot

2009-09-28 Thread Tim Clark
Thanks for everyone's help.  It is great to have a number of options that result 
in the same graph.  

Aloha,

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Mon, 9/28/09, Henrique Dallazuanna  wrote:

> From: Henrique Dallazuanna 
> Subject: Re: [R] Data formatting for matplot
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Monday, September 28, 2009, 1:43 AM
> Tim,
> 
> With Gabor examples, I understand this,
> 
> You can get a similar graph with  plot:
> 
> with(mydat, plot(x, y, col = id))
> 
> On Mon, Sep 28, 2009 at 3:01 AM, Tim Clark 
> wrote:
> > Henrique,
> >
> > Thanks for the suggestion.  I think I may not
> understand matplot() because the graph did not come out like
> it should have.  Gabor suggested:
> >
> > library(lattice)
> > xyplot(y ~ x, mydat, groups = id)
> >
> > Which gave what I was looking for.  Is there a way to
> get matplot() to give the same graph?  I don't have to use
> matplot(), but would like to understand its use.
> >
> > Thanks,
> >
> > Tim
> >
> >
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> >
> >
> > --- On Sun, 9/27/09, Henrique Dallazuanna 
> wrote:
> >
> >> From: Henrique Dallazuanna 
> >> Subject: Re: [R] Data formatting for matplot
> >> To: "Tim Clark" 
> >> Cc: r-help@r-project.org
> >> Date: Sunday, September 27, 2009, 4:47 PM
> >> You can try this:
> >>
> >> matplot(do.call(cbind, split.dat))
> >>
> >> On Sun, Sep 27, 2009 at 11:42 PM, Tim Clark 
> >> wrote:
> >> > Dear List,
> >> >
> >> > I am wanting to produce a multiple line plot,
> and know
> >> I can do it with matplot but can't get my data in
> the format
> >> I need.  I have a dataframe with three columns;
> individuals
> >> ID, x, and y.  I have tried split() but it gives
> me a list
> >> of matrices, which is closer but not quite what I
> need.
> >>  For example:
> >> >
> >> > id<-rep(seq(1,5,1),length.out=100)
> >> > x<-rnorm(100,5,1)
> >> > y<-rnorm(100,20,5)
> >> >
> >> > mydat<-data.frame(id,x,y)
> >> > split.dat<-split(mydat[,2:3],mydat[,1])
> >> >
> >> > I would appreciate your help in either how to
> get this
> >> into a format acceptable to matplot or other
> options for
> >> creating a multiple line plot.
> >> >
> >> > Thanks,
> >> >
> >> > Tim
> >> >
> >> >
> >> >
> >> > Tim Clark
> >> > Department of Zoology
> >> > University of Hawaii
> >> >
> >> >
> __
> >> > R-help@r-project.org
> >> mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> > PLEASE do read the posting guide 
> >> > http://www.R-project.org/posting-guide.html
> >> > and provide commented, minimal,
> self-contained,
> >> reproducible code.
> >> >
> >>
> >>
> >>
> >> --
> >> Henrique Dallazuanna
> >> Curitiba-Paraná-Brasil
> >> 25° 25' 40" S 49° 16' 22" O
> >>
> >
> >
> >
> >
> 
> 
> 
> -- 
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] xyplot help - colors and break in plot

2009-09-28 Thread Tim Clark
Dear List,

I am new to lattice plots, and am having problems with getting my plot to do 
what I want.  Specifically:

1. I would like the legend to have the same symbols as the plot.  I tried 
simpleKey but can't seem to get it to work with auto.key.  Right now my plot has 
dots (pch=19) and my legend shows circles.

2.  I have nine groups but xyplot seems to only be using seven colors, so two 
groups have the same color.  How do I get a range of nine colors?

3.  I have one group whose y range is much greater than all the others.  I 
would like to split the plot somehow so that the bottom part shows 
ylim=c(0,200) and the top shows ylim=c(450,550).  Is this possible?

What I have so far is:

  library(lattice)
  xyplot(m.dp.area$Area.km2 ~ m.dp.area$DataPoint, m.dp.area, groups = 
m.dp.area$Manta,
main = "Cummulative area of 100% MCP",
xlab = "Data Point",
ylab = "MCP Area",
ylim = c(0,150),
scales = list(tck = c(1, 0)), #Removes tics on top and r-axis
pch=19,cex=.4,
auto.key = list(title = "Mantas", x = .05, y=.95, corner = 
c(0,1),border = TRUE)) #Legend


Thanks,

Tim



Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xyplot help - colors and break in plot

2009-09-29 Thread Tim Clark
Felix,

Thanks, that did the trick!  Lattice is a lot less intuitive than basic 
plotting!

Also, another person suggested using gap.plot from the plotrix package to put a 
break in the graph.  I am surprised Lattice doesn't have something similar 
since it seems like a common problem when you have data that groups in clusters 
separated by a large range.

Aloha,

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Mon, 9/28/09, Felix Andrews  wrote:

> From: Felix Andrews 
> Subject: Re: [R] xyplot help - colors and break in plot
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Monday, September 28, 2009, 1:50 PM
> 2009/9/29 Tim Clark :
> > Dear List,
> >
> > I am new to lattice plots, and am having problems with
> getting my plot to do what I want.  Specifically:
> >
> > 1. I would like the legend to have the same symbols as
> the plot.  I tried simpleKey but can't seem to get it
> to work with autoKey.  Right now my plot has dots
> (pch=19) and my legend shows circles.
> 
> Rather than the pch = 19 argument, use par.settings =
> simpleTheme(pch
> = 19, cex = .4)
> 
> >
> > 2.  I have nine groups but xyplot seems to only
> be using seven colors, so two groups have the same
> color.  How do I get a range of nine colors?
> 
> Yes, in the default theme, there are seven colours: see
> trellis.par.get("superpose.symbol")
> 
> You can change the set of colours yourself by modifying
> that list (via
> trellis.par.set).
> 
> An easier option is to use one of the predefined
> ColorBrewer palettes,
> with custom.theme() from the latticeExtra package, or just
> simpleTheme(). See ?brewer.pal (RColorBrewer package)
> You will see there are a few qualitative color palettes
> with 9 or more
> colours: e.g.
> brewer.pal(9, "Set1")
> brewer.pal(12, "Set3")
> 
> >
> > 3.  I have one group who's y range is much
> greater than all the others.  I would like to split the
> plot somehow so that the bottom part shows ylim=c(0,200) and
> the top shows ylim=c(450,550).  Is this possible?
> 
> Yes... in the absence of a reproducible example, maybe
> something like
> 
>  xyplot(Area.km2 ~ DataPoint | (Area.km2 > 200),
> m.dp.area,
>          groups = Manta,
> scales = list(y = "free"))
> 
> or
> 
> AreaRange <- shingle(Area.km2,
> rbind(c(0,200),c(450,550)))
> xyplot(Area.km2 ~ DataPoint | AreaRange, m.dp.area,
>         groups = Manta, scales = list(y
> = "free"))
> 
> >
> > What I have so far is:
> >
> >  library(lattice)
> >  xyplot(m.dp.area$Area.km2 ~ m.dp.area$DataPoint,
> m.dp.area, groups = m.dp.area$Manta,
> >        main = "Cummulative area of
> 100% MCP",
> >        xlab = "Data Point",
> >        ylab = "MCP Area",
> >        ylim = c(0,150),
> >        scales = list(tck = c(1,
> 0)), #Removes tics on top and r-axis
> >        pch=19,cex=.4,
> >        auto.key = list(title =
> "Mantas", x = .05, y=.95, corner = c(0,1),border = TRUE))
> #Legend
> >
> >
> > Thanks,
> >
> > Tim
> >
> >
> >
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> >
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> >
> 
> 
> -- 
> Felix Andrews / 安福立
> Postdoctoral Fellow
> Integrated Catchment Assessment and Management (iCAM)
> Centre
> Fenner School of Environment and Society [Bldg 48a]
> The Australian National University
> Canberra ACT 0200 Australia
> M: +61 410 400 963
> T: + 61 2 6125 1670
> E: felix.andr...@anu.edu.au
> CRICOS Provider No. 00120C
> -- 
> http://www.neurofractal.org/felix/
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] bwplot scales in alphabetical order

2009-09-30 Thread Tim Clark
Dear List,

I know this has been covered before, but I don't seem to be able to get it 
right.  I am constructing a boxplot in lattice and can't get the scales in the 
correct alphabetical order.  I have already read that this is due to the way 
factors are treated, and I have to redefine the levels of the factors.  
However, I have failed.  
As a simple example:

library(lattice)
id<-rep(letters[1:9], each=20)
x<-rep(seq(1:10),each=18)
y<-rnorm(180,50,20)

#Reverse alphabetical order
  bwplot(y~x|id, horizontal=FALSE)

#alphabetical order reading right to left
  id<-factor(id,levels = sort(id,decreasing = TRUE))
  bwplot(y~x|id, horizontal=FALSE)

It appears that bwplot plots scales from the bottom left to the top right. If 
so my factor levels would need to be levels=c(7,8,9,4,5,6,1,2,3). I tried that 
but can't seem to get the factor function to work.

#Did not work!
id<-factor(id,levels=c(7,8,9,4,5,6,1,2,3),lables=letters[1:9])

Your help would be greatly appreciated.

Tim





Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bwplot scales in alphabetical order

2009-09-30 Thread Tim Clark
Peter,

Thanks, that did it!

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Wed, 9/30/09, Peter Ehlers  wrote:

> From: Peter Ehlers 
> Subject: Re: [R] bwplot scales in alphabetical order
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Wednesday, September 30, 2009, 2:43 AM
> Tim,
> 
> Add the argument as.table=TRUE to your call:
> 
>   bwplot(y~x|id, horizontal=FALSE, as.table=TRUE)
> 
> Peter Ehlers
> 
> Tim Clark wrote:
> > Dear List,
> > 
> > I know this has been covered before, but I don't seem
> to be able to get it right.  I am constructing a
> boxplot in lattice and can't get the scales in the correct
> alphebetical order.  I have already read that this is
> due to the way factors are treated, and I have to redefine
> the levels of the factors.  However, I have
> failed.  As a simple example:
> > 
> > library(lattice)
> > id<-rep(letters[1:9], each=20)
> > x<-rep(seq(1:10),each=18)
> > y<-rnorm(180,50,20)
> > 
> > #Reverse alphebetical order
> >   bwplot(y~x|id, horizontal=FALSE)
> > 
> > #alphebetical order reading right to left
> >   id<-factor(id,levels =
> sort(id,decreasing = TRUE))
> >   bwplot(y~x|id, horizontal=FALSE)
> > 
> > It appears that bwplot plots scales from the bottom
> left to the top right. If so my factor levels would need to
> be levels=c(7,8,9,4,5,6,1,2,3). I tried that but can't seem
> to get the factor function to work.
> > 
> > #Did not work!
> >
> id<-factor(id,levels=c(7,8,9,4,5,6,1,2,3),lables=letters[1:9])
> > 
> > Your help would be greatly appreciated.
> > 
> > Tim
> > 
> > 
> > 
> > 
> > 
> > Tim Clark
> > Department of Zoology University of Hawaii
> > 
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> > 
> > 
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Paste a character to an object

2009-10-03 Thread Tim Clark
Dear List,

I can't seem to get a simple paste function to work like I need.  I have an 
object I need to call but it ends in a character string.  The object is a list 
of home range values for a range of percent isopleths.  I need to loop through 
a vector of percent values, so I need to paste the percent as a character on 
the end of the object variable.  I have no idea why the percent is in character 
form, and I can't use a simple index value (homerange[[1]]$polygons[100]) 
because there are a variable number of isopleths that are calculated and [100] 
will not always correspond to "100".  So I am stuck.

What I want is:

homerange[[1]]$polygons$"100"

What I need is something like the following, but that works:

percent<-c("100","75","50")
p=1
paste(homerange[[1]]$polygons$,percent[p],sep="")

Thanks for the help,

Tim



Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Paste a character to an object

2009-10-03 Thread Tim Clark
David,

Thanks, that helps me in making an example of what I am trying to do.  Given 
the following example, I would like to run through a for loop and obtain a 
vector of the data only for the 100, 75, and 50 percent values.  Is there a way 
to get this to work, either using paste as in the example below or some other 
method?

homerange <- list()
homerange[[1]] <- "test"
homerange[[1]]$polygons <- "test2"
homerange[[1]]$polygons$`100` <- rnorm(20,10,1)
homerange[[1]]$polygons$`90` <- rnorm(20,10,1)
homerange[[1]]$polygons$`75` <- rnorm(20,10,1)
homerange[[1]]$polygons$`50` <- rnorm(20,10,1)

xx<-c()
percent<-c("100","75","50")
for (i in 1:length(percent))
{
x<-paste(homerange[[1]]$polygons$,percent[i]) #This does not work!!!
xx<-rbind(x,xx)
}

The x<-paste(...) in this function does not work, and that is what I am stuck 
on.  The result should be a vector of the values for the "100", "75", and "50" 
levels, but not the "90" level.

Aloha,

Tim




Tim Clark
Department of Zoology 
University of Hawaii


--- On Sat, 10/3/09, David Winsemius  wrote:

> From: David Winsemius 
> Subject: Re: [R] Paste a character to an object
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Saturday, October 3, 2009, 4:45 PM
> 
> On Oct 3, 2009, at 10:26 PM, Tim Clark wrote:
> 
> > Dear List,
> > 
> > I can't seem to get a simple paste function to work
> like I need.  I have an object I need to call but it
> ends in a character string.  The object is a list of
> home range values for a range of percent isopleths.  I
> need to loop through a vector of percent values, so I need
> to paste the percent as a character on the end of the object
> variable.  I have no idea why the percent is in
> character form, and I can't use a simple index value
> (homerange[[1]]$polygons[100]) because there are a variable
> number of isopleths that are calculated and [100] will not
> always correspond to "100".  So I am stuck.
> > 
> > What I want is:
> > 
> > homerange[[1]]$polygons$"100"
> > 
> > What I need is something like the following, but that
> works:
> > 
> > percent<-c("100","75","50")
> > p=1
> > paste(homerange[[1]]$polygons$,percent[p],sep="")
> 
> Not a reproducible example, but here is some code that
> shows that it is possible to construct names that would
> otherwise be invalid due to having numerals as a first
> character by using back-quotes:
> 
> > percent<-c("100","75","50")
> > p=1
> > paste(homerange[[1]]$polygons$,percent[p],sep="")
> Error: syntax error
> > homerange <- list()
> > homerange[[1]] <- "test"
> > homerange[[1]]$polygons <- "test2"
> Warning message:
> In homerange[[1]]$polygons <- "test2" : Coercing LHS to
> a list
> > homerange
> [[1]]
> [[1]][[1]]
> [1] "test"
> 
> [[1]]$polygons
> [1] "test2"
> 
> 
> > homerange[[1]]$polygons$`100` <- percent[1]
> Warning message:
> In homerange[[1]]$polygons$`100` <- percent[1] :
> Coercing LHS to a list
> > homerange[[1]]$polygons$`100`
> [1] "100"
> 
> --David Winsemius
> 
> 
> > 
> > Thanks for the help,
> > 
> > Tim
> > 
> > 
> > 
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> > 
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> 
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Paste a character to an object

2009-10-04 Thread Tim Clark
David,

Thanks!  You just gave me the answer.  All I had to do was:

xx<-c()
for (i in c('100', '75', '50') )
{
x<-homerange[[1]]$polygons[[i]] ; xx<-rbind(x,xx)
}
 xx

I didn't know you could use characters as index values in a for loop, or that 
you could use characters in double brackets instead of using the $ symbol.

homerange[[1]]$polygons[['100']]
is the same as
homerange[[1]]$polygons$`100`
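
A minimal illustration of that equivalence, assuming the homerange list built
earlier in this thread (the variable p is just an example name):

p <- "100"                                  # percent level stored as a character string
homerange[[1]]$polygons[[p]]                # [[ ]] accepts a character index
homerange[[1]]$polygons$`100`               # $ needs back-quotes for a non-syntactic name
identical(homerange[[1]]$polygons[[p]],
          homerange[[1]]$polygons$`100`)    # TRUE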

The list is actually the output of the NNCH function in Adehabitat.  I thought 
about changing the function first, but looked at the code and couldn't figure 
it out.  I knew there had to be an easier way.

I greatly appreciate all your help,

Tim

Tim Clark
Department of Zoology 
University of Hawaii


--- On Sat, 10/3/09, David Winsemius  wrote:

> From: David Winsemius 
> Subject: Re: [R] Paste a character to an object
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Saturday, October 3, 2009, 5:43 PM
> 
> On Oct 3, 2009, at 11:14 PM, Tim Clark wrote:
> 
> > David,
> > 
> > Thanks, that helps me in making an example of what I
> am trying to do.  Given the following example, I would
> like to run through a for loop and obtain a vector of the
> data only for the 100, 75, and 50 percent values.  Is
> there a way to get this to work, either using paste as in
> the example below or some other method?
> > 
> > homerange <- list()
> > homerange[[1]] <- "test"
> > homerange[[1]]$polygons <- "test2"
> > homerange[[1]]$polygons$`100` <- rnorm(20,10,1)
> > homerange[[1]]$polygons$`90` <- rnorm(20,10,1)
> > homerange[[1]]$polygons$`75` <- rnorm(20,10,1)
> > homerange[[1]]$polygons$`50` <- rnorm(20,10,1)
> > 
> > xx<-c()
> > percent<-c("100","75","50")
> > for (i in 1:length(percent))
> > {
> > x<-paste(homerange[[1]]$polygons$, percent[i]) #This does not work!!!
>                                    ^?^
> And why _would_ you expect an expression ending in a "$" to
> be acceptable to the parser? You did not put quotes around
> it so the interpreter tried to evaluate it.
> 
> You are probably looking for the capabilities of the
> functions get and assign, which take a string variable and
> either get the object named by that string or assign a value
> to an object so named.
> 
> But why are you intent on causing yourself all this
> pain?  (Not to mention asking questions I cannot
> answer.)  Working with expressions involving backquotes
> is a recipe for hair-pulling and frustration for us normal
> mortals. Why not call your lists "p100", "p90", "p75",
> "p50"? Then everything is simple:
> 
> > xx<-c()
> > percent<-c(100, 75, 50)
> > for (i in c("p100", "p75", "p50") )
> + {
> + x<-homerange[[1]]$polygons[[i]] ;
> xx<-rbind(x,xx)  # could have simplified this
> + }
> > xx
>        [,1]     [,2]     [,3]      [,4]     [,5]      [,6]      [,7]      [,8]     [,9]
> x  9.660935 10.46526 10.75813  8.866064 9.967950  9.987941 10.757160 10.180826 9.992162
> x 11.674645 10.51753 10.88061 10.515120 9.440838 11.460845 12.033612  9.318392 9.592026
> x 10.057021 10.14339 10.29757  9.164233 8.977280  9.733971  9.965002  9.693649 9.430043
>      [,10]     [,11]     [,12]     [,13]     [,14]     [,15]     [,16]     [,17]    [,18]
> x 11.78904  9.437353 11.910747 10.996167 11.631264  9.386944  9.602160 10.498921  9.09349
> x  9.11036  9.546378 11.030323  9.715164  9.500268 11.762440  9.101104  9.610251 10.56210
> x  9.62574 12.738020  9.146863 10.497626 10.485520 11.644503 10.303581 11.340263 11.34873
>       [,19]     [,20]
> x 10.146955  9.640136
> x  9.334912 10.101603
> x  8.710609 11.265633
> 
> 
> 
> 
> > 
> > 
> > The x<-paste(...) in this function does not work,
> and that is what I am stuck on.  The result should be a
> vector the values for the "100","75",and "50" levels, but
> not the "90" level.
> > 
> > Aloha,
> > 
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> > 
> > 
> > --- On Sat, 10/3/09, David Winsemius 
> wrote:
> > 
> >> From: David Winsemius 
> >> Subject: Re: [R] Paste a character to an object
> >> To: "Tim Clark" 
> >> Cc: r-help@r-project.org
> >> Date: Saturday, October 3, 2009, 4:45 PM
> >> 
> >> On Oct 3, 2009, at 10:26 PM, Tim Clark wrote:
> >> 
> >>> Dea

[R] Satellite ocean color palette?

2009-10-09 Thread Tim Clark
Dear List,

Is there a color palette available similar to what is used in satellite ocean 
color imagery?  I.e. a gradient with blue on one end and red on the other, with 
yellow in the middle?  I have tried topo.colors(n) but that comes out more 
yellow at the end.  I am looking for something similar to what is found on the 
CoastWatch web page:

http://oceanwatch.pifsc.noaa.gov/imagery/GA2009281_2009282_sst_2D_eddy.jpg

Thanks!

Tim


Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Satellite ocean color palette?

2009-10-09 Thread Tim Clark
Thanks!  colorRampPalette() did just what I needed.

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Fri, 10/9/09, Barry Rowlingson  wrote:

> From: Barry Rowlingson 
> Subject: Re: [R] Satellite ocean color palette?
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Friday, October 9, 2009, 9:06 AM
> On Fri, Oct 9, 2009 at 7:51 PM, Tim
> Clark 
> wrote:
> > Dear List,
> >
> > Is there a color palette available similar to what is
> > used in satellite ocean color imagery?  I.e. a gradient
> > with blue on one end and red on the other, with yellow in
> > the middle?  I have tried topo.colors(n) but that comes out
> > more yellow at the end.  I am looking for something similar
> > to what is found on the CoastWatch web page:
> >
> > http://oceanwatch.pifsc.noaa.gov/imagery/GA2009281_2009282_sst_2D_eddy.jpg
> >
> > Thanks!
> 
>  You could build one yourself with the colorRamp function:
> 
> satRampP = colorRampPalette(c("black","blue","cyan","yellow","orange","red","black"))
> 
>  that looks roughly like the one in the jpg, but I'm not
> sure about
> the black at the far end...anyway, let's see:
> 
> image(matrix(seq(0,1,len=100),100,1),col=satRampP(100))
> 
> Or you could try my colour schemes package:
> 
> https://r-forge.r-project.org/projects/colourscheme/
> 
> Barry
> 
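
A minimal usage sketch along the same lines, using only base R (volcano is a
built-in matrix, so this runs as-is; the object name satRamp is made up):

satRamp <- colorRampPalette(c("blue", "cyan", "yellow", "orange", "red"))
image(volcano, col = satRamp(64))   # apply the blue-to-red ramp to a gridded field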




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Bezier interpolation

2009-10-29 Thread Tim Clark
Dear List,

I am trying to interpolate animal tracking data using Bezier curves.  I need a 
function similar to spline() or approx() but that offers a Bezier method.  I have 
tried xspline(), but it does not allow you to set the number of points to 
interpolate over a given interval (n points between min(x) and max(x)).  
Mark Hindell asked the same question in 2006 
(http://tolstoy.newcastle.edu.au/R/e2/help/06/12/7034.html).  I contacted him 
and he never found a workable function.  Has one been developed since then?  

Thanks,

Tim
 
Tim Clark
Department of Zoology 
University of Hawaii
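
No packaged solution turned up in this thread; a minimal Bernstein-polynomial
sketch (the function name bezier_points and the control points below are made
up for illustration) that returns a fixed number n of points along the curve:

bezier_points <- function(x, y, n = 100) {
  # evaluate the Bezier curve defined by control points (x, y) at n parameter values
  stopifnot(length(x) == length(y))
  k <- length(x) - 1                         # curve degree
  t <- seq(0, 1, length.out = n)
  # Bernstein basis: choose(k, i) * t^i * (1 - t)^(k - i), columns i = 0..k
  B <- sapply(0:k, function(i) choose(k, i) * t^i * (1 - t)^(k - i))
  list(x = as.vector(B %*% x), y = as.vector(B %*% y))
}

cp <- list(x = c(0, 1, 3, 4), y = c(0, 2, -1, 1))  # four hypothetical track fixes
bz <- bezier_points(cp$x, cp$y, n = 50)
plot(cp$x, cp$y, pch = 19); lines(bz$x, bz$y)

Note that a single Bezier curve of this form passes through only the first and
last control points, not the interior fixes.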

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Time Series methods

2009-11-01 Thread Tim Bean
Hello, I have a quick question about time series methodology. If I want to
display a boxplot of time series data, sorted by period, I can type:

boxplot(data ~ cycle(data));

where data is of class "ts"

Is there a similar method for calculating, say, the median value of each
time step within the series? (So for a monthly data set, calculate the median
for all Januarys, all Februarys, all Marches, etc.)

Thanks,
Tim

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Time Series methods

2009-11-02 Thread Tim Bean
Sure, but it seems like the point of the time series class is so you don't
have to create a factor based on the period of sampling. Is there any way of
applying calculations like the median to the time series class, or is Kjetil's
suggestion the only approach?

Thanks!
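
A minimal sketch of one way to apply the factor/tapply route Kjetil describes
below, using the built-in monthly AirPassengers series as stand-in data:

data <- AirPassengers               # any monthly ts
tapply(data, cycle(data), median)   # one median per calendar month (1 = Jan, ..., 12 = Dec)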

On Sun, Nov 1, 2009 at 4:09 PM, Kjetil Halvorsen <
kjetilbrinchmannhalvor...@gmail.com> wrote:

> introduce a factor variable with the months and then use tapply?
>
> Kjetil
>
> On Sun, Nov 1, 2009 at 9:07 PM, Tim Bean  wrote:
> > Hello, I have a quick question about time series methodology. If I want
> to
> > display a boxplot of time series data, sorted by period, I can type:
> >
> > boxplot(data ~ cycle(data));
> >
> > where data is of class "ts"
> >
> > Is there a similar method for calculating, say, the median value of each
> > time step within the series? (So for a monthly data set, calculate median
> > for all Januarys, all Februarys, all Marchs, etc.)
> >
> > Thanks,
> > Tim
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Discontinuous graph

2009-11-16 Thread Tim Smith
Hi,
I wanted to make a graph with the following table (2 rows, 3 columns): 
a b c
x 1 3 5
y 5 8 6
The first column represents the start coordinate, and the second column contains 
the end coordinate for the x-axis. The third column contains the y-axis 
coordinate. For example, the first row in the matrix above represents the 
points (1,5), (2,5), (3,5). How would I go about making a discontinuous graph?

thanks!


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Lattice plot

2009-11-17 Thread Tim Smith
Hi,

I was trying to get a graph in lattice with the following data frame (7 rows, 5 
cols):
  chr start1 end1 meth positive
1   1     10   20  1.5        y
2   2     12   18 -0.7        n
3   3     22   34  2.0        y
4   1     35   70  3.0        y
5   1    120  140 -1.3        n
6   1    180  190  0.2        y
7   2    220  300  0.4        y
I wanted the panels to be organized by 'chr' - which is ok. Further, I wanted 
the lines to be discontinuous. For example, in the first row, the x co-ordinate 
starts with a value of 10 (2nd column) and ends with a value of 20 (3rd 
column). The corresponding y value for this range of x values is 1.5 (4th 
column). Similarly, for the same panel (i.e. chr=1), the fourth row would have an x 
co-ordinate range from 35 to 70 with a y co-ordinate of 3.
If it were only one panel, a similar result could be achieved for the data x2:
> x2
  chr start1 end1 meth positive
1   1     10   20  1.5        y
4   1     35   70  3.0        y
5   1    120  140 -1.3        n
6   1    180  190  0.2        y


## Code courtesy of BAPTISTE AUGUIE
library(ggplot2)
ggplot(data=x2) +
 geom_segment(aes(x=start1, xend=end1, y=meth, yend=meth))
Can I get lattice to do a similar graph for the panels?
thanks!
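
A minimal lattice sketch of one possibility (no answer is recorded in this
thread; the data frame dd below just retypes the values shown above):

library(lattice)

dd <- data.frame(chr    = c(1, 2, 3, 1, 1, 1, 2),
                 start1 = c(10, 12, 22, 35, 120, 180, 220),
                 end1   = c(20, 18, 34, 70, 140, 190, 300),
                 meth   = c(1.5, -0.7, 2.0, 3.0, -1.3, 0.2, 0.4))

xyplot(meth ~ start1 | factor(chr), data = dd,
       xlim = range(c(dd$start1, dd$end1)),
       panel = function(x, y, subscripts, ...) {
         # draw each interval as a horizontal segment at its meth value
         panel.segments(x0 = x, y0 = y,
                        x1 = dd$end1[subscripts], y1 = y, lwd = 2)
       })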


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Discontinuous graph

2009-11-17 Thread Tim Smith
Thanks Baptiste. That is exactly what I needed. However, now I also need to 
know how I can achieve this using the lattice package, since I think I will 
have to make several panels. I've just rephrased the problem and put up another 
post. Hopefully, this will avoid some confusion.

best regards,




From: baptiste auguie 
To: r 
Sent: Mon, November 16, 2009 1:31:28 PM
Subject: Re: [R] Discontinuous graph

Hi,

An alternative with ggplot2,

library(ggplot2)

ggplot(data=coords) +
  geom_segment(aes(x=a, xend=b, y=c, yend=c))


HTH,

baptiste

2009/11/16 David Winsemius :
>
> On Nov 16, 2009, at 12:40 PM, Tim Smith wrote:
>
>> Hi,
>> I wanted to make a graph with the following table (2 rows, 3 columns):
>> a b c
>> x 1 3 5
>> y 5 8 6
>> The first column represents the start cordinate, and the second column
>> contains the end cordinate for the x-axis. The third column contains the
>> y-axis co-ordinate. For example, the first row in the matrix above
>> represents the points (1,5),(2,5), (3,5). How would I go about making a
>> discontinuous graph ?
>>
>> thanks!
>
> coords <- read.table(textConnection("a b c
>  x 1 3 5
>  y 5 8 6"), header=TRUE)
>
>  plot(NULL, NULL, xlim = c(min(coords$a)-.5, max(coords$b)+.5),
>       ylim = c(min(coords$c)-.5, max(coords$c)+.5))
>  apply(coords, 1, function(x) segments(x0=x[1], y0=x[3], x1=x[2], y1=x[3]))
>
> --
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Adding columns to lower level of list

2009-11-21 Thread Tim Clark
Dear List,


I have very little experience with lists and am having some very basic 
problems.  I don't know how to add columns to the lower levels of a list, or 
how to take something from the upper level and add it as a column to the lower 
level.  I am analyzing animal movement data in the package Adehabitat.  I have 
a list of animal movements called "cut.ltr" (class ltraj) that have been 
divided into a series of "burst" - i.e. movements with no gaps in time over a 
given threashold.  I would like to 

1.  Add the speed to each item in the list, and also the burst.  I can 
calculate speed as:

sp<-lapply(cut.ltr,function(l){l$dist/l$dt})  

This creates a list of the correct size, but I don't know how to add it to 
my original list, i.e. add a column called "speed" to the lower levels of the 
list.

2.  Add the burst to each lower level of the list.  It is in the upper level, 
but I don't know how to access it.
I have tried attribute(), attr(), cut.ltr$"burst", and several other creative 
guesses.

The first five items in the upper level are below - cut.ltr[1:5], along with 
head(cut.ltr[[1]]).  I would like my final result to have two more columns in 
cut.ltr[[1]].  One with speed, and the second with burst.

Thanks in advance for your help.

Tim





> cut.ltr[1:5]

*** List of class ltraj ***

Type of the traject: Type II (time recorded)
Irregular traject. Variable time lag between two locs

Characteristics of the bursts:
       id     burst nb.reloc NAs          date.begin            date.end
1 Abigail Abigail.1   47   0 2003-05-31 13:29:59 2003-06-01 00:59:56
2 Abigail Abigail.2  288   0 2003-06-18 17:28:11 2003-06-21 17:14:59
3 Abigail Abigail.3   10   0 2003-08-03 23:33:00 2003-08-04 01:43:58
4 Abigail Abigail.4   43   0 2003-08-04 08:15:25 2003-08-04 18:59:58
5 Abigail Abigail.5   78   0 2003-08-05 00:44:19 2003-08-05 20:15:00

> head(cut.ltr[[1]])
         x       y                date         dx       dy     dist  dt       R2n abs.angle   rel.angle
1 809189.8 2189722 2003-05-31 13:29:59   81.87136 315.3389 325.7937 901       0.0  1.316775          NA
2 809271.6 2190037 2003-05-31 13:45:00   13.00097 258.7351 259.0616 901  106141.5  1.520590  0.20381526
3 809284.6 2190296 2003-05-31 14:00:01  250.52656 669.2065 714.5634 898  338561.8  1.212584 -0.30800666
4 809535.2 2190965 2003-05-31 14:14:59 -171.14372 791.1522 809.4516 902 1665046.9  1.783836  0.57125215
5 809364.0 2191756 2003-05-31 14:30:01  302.26979 707.0157 768.9202 900 4169281.4  1.166785 -0.61705039
6 809666.3 2192463 2003-05-31 14:45:01  284.40962 725.2169 778.9919 900 7742615.6  1.197057  0.03027109
> 
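
A minimal sketch of one way to do both at once, assuming (as ltraj elements
normally do) that each burst data frame carries a "burst" attribute; note that
once extra columns are added the result may no longer be a valid ltraj object
for other adehabitat functions:

cut.df <- lapply(cut.ltr, function(l) {
  l$speed <- l$dist / l$dt          # speed for each step
  l$burst <- attr(l, "burst")       # copy the burst name down into a column
  l
})
head(cut.df[[1]])                   # now has speed and burst columns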



Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Removing objects from a list based on nrow

2009-11-29 Thread Tim Clark
Dear List,

I have a list containing data frames with varying numbers of rows.  I need to 
remove any data frame that has fewer than 3 rows.  For example:

df1<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
df2<-data.frame(letter=c("A","B"),number=c(1,2))
df3<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
df4<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))

lst<-list(df1,df2,df3,df4)

How can I determine that the second object (df2) has less than 3 rows and 
remove it from the list?

Thanks!

Tim




Tim Clark
Department of Zoology 
University of Hawaii

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Removing objects from a list based on nrow

2009-11-29 Thread Tim Clark
Linlin,

Thanks!  That works great!

Tim


Tim Clark
Department of Zoology 
University of Hawaii
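
One caveat worth noting as a sketch: the -which() form returns an empty list
when no element matches, so a logical subset (or Filter) is a little safer:

lst <- Filter(function(d) nrow(d) >= 3, lst)   # keeps lst intact when nothing is dropped
# equivalently: lst[sapply(lst, nrow) >= 3]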


--- On Sat, 11/28/09, Linlin Yan  wrote:

> From: Linlin Yan 
> Subject: Re: [R] Removing objects from a list based on nrow
> To: "Tim Clark" 
> Cc: r-help@r-project.org
> Date: Saturday, November 28, 2009, 10:43 PM
> Try these:
> sapply(lst, nrow) # get row numbers
> which(sapply(lst, nrow) < 3) # get the index of rows which has less than 3 rows
> lst <- lst[-which(sapply(lst, nrow) < 3)] # remove the rows from the list
> 
> On Sun, Nov 29, 2009 at 4:36 PM, Tim Clark 
> wrote:
> > Dear List,
> >
> > I have a list containing data frames of various
> numbers of rows.  I need to remove any data frame that has
> less than 3 rows.  For example:
> >
> >
> df1<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
> > df2<-data.frame(letter=c("A","B"),number=c(1,2))
> >
> df3<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
> >
> df4<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
> >
> > lst<-list(df1,df2,df3,df4)
> >
> > How can I determine that the second object (df2) has
> less than 3 rows and remove it from the list?
> >
> > Thanks!
> >
> > Tim
> >
> >
> >
> >
> > Tim Clark
> > Department of Zoology
> > University of Hawaii
> >
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> >
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

