from:"iverson"


Daniel -

First, use order() to arrange the data.frame into an appropriate format.

Then, use duplicated() with the negation operator to get rid of the 
duplicated values.
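
A minimal sketch of those two steps, assuming the data frame is called df1 and has the columns cno and rank shown in the question (the object names sorted and best are just for illustration):

```r
# Sample data copied from the question
df1 <- data.frame(cno  = c(1342, 1342, 1342, 2568, 2568),
                  rank = c(0.23, 0.14, 0.56, 0.15, 0.89))

# Sort by cno, then by descending rank, so that the highest-ranked row
# comes first within each cno
sorted <- df1[order(df1$cno, -df1$rank), ]

# duplicated() flags every repeat of a cno after its first occurrence;
# negating it keeps only that first (highest-ranked) row
best <- sorted[!duplicated(sorted$cno), ]
best
```

This keeps the original rows 3 and 5. Ties in rank within a cno would be resolved by whichever row order() happens to place first.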




Daniel Wagner wrote:

Dear R users,

I have a data frame with a lot of duplicate cases, and I want to delete the duplicates that have a low rank and keep the case that has the highest rank.

e.g.

df1

   cno   rank
1  1342  0.23
2  1342  0.14
3  1342  0.56
4  2568  0.15
5  2568  0.89

So I want to keep the 3rd and 5th cases, which have the highest rank (0.56 & 0.89), and delete the rest of the duplicate cases.

Could somebody help me?

Regards

Daniel

Amsterdam







__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Help with which()


See

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
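
For what that FAQ entry is about, here is a small sketch. The values are made up, since the poster's Operons data was not shown, and the tolerance of 1e-8 is an arbitrary choice:

```r
# Hypothetical values standing in for the poster's data
x <- c(3573.1, 3573.15, 0.1 + 0.2)

# Exact comparison can miss values that print identically
x[3] == 0.3                    # FALSE on typical IEEE 754 hardware

# Compare within a small tolerance instead of using ==
tol <- 1e-8
which(abs(x - 3573.1) < tol)   # only the first element matches
isTRUE(all.equal(x[3], 0.3))   # TRUE
```

Note that 3573.1 and 3573.15 differ by far more than any rounding error, so a tolerance check like this would not treat them as equal.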

[EMAIL PROTECTED] wrote:

Hi,

I'm using which to find the position of a value in my data table, and it is
returning the correct position and the position of another number that differs
by an extra decimal place. For example, when I do:

where1<-which(Operons==3573.1,arr.ind=TRUE)

it returns the position of that number and of 3573.15.

Is it possible to get the function to only return a position if the number
matches exactly?

Thanks,
-Nina




Re: [R] Help with which()




Ben Bolker wrote:

Erik Iverson  biostat.wisc.edu> writes:


See



http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

naw3  duke.edu wrote:



I'm using which to find the position of a value in my data table

[snip]

For example, when I do:

where1<-which(Operons==3573.1,arr.ind=TRUE)

it returns the position of that number and of 3573.15.



  This doesn't seem like the same issue as the FAQ.
  If there are really two elements in the array, one of
which is 3573.1 and the other of which is 3573.15, they
should not be treated as equal.  Something else is going on.
Could the original poster please send a small reproducible
example?



Yes, the more I look at it, the more I agree.  I just reacted when I saw
x == 3573.1 ...

It would be nice to see a reproducible example!


Re: [R] Question: how to build a subset to do separate calculations


Spencer -

Spencer wrote:


Dear R Experts,

I am trying to create several subsets that I want to use as separate 
datasets. After separate subsets are created for each country, I want to 
do some calculations for each. I did the following:


data <- data.frame(na.omit(read.spss("trial.sav")))

attach (data)


You might not want to call it 'data' since 'data' is the name of a 
function in base R.


Also, the usual advice is to avoid attaching data.frames to the search 
path, for the exact reasons that happen to you below.




Netherlands <- subset(data, COUNTRY = "NETHERLANDS")
attach(Netherlands)

But after this step, when I check the summary statistics for 
"Netherlands," what returns is the summary for the original dataset. 


Try typing search() after attaching this to see why.  You may have also 
seen a message about objects being masked when you used the second 
'attach' command; at least I do in my version of R.





I've also tried defining Netherlands as Netherlands <- 
data.frame(subset(data, COUNTRY = "NETHERLANDS")), but get the same result.


Any help would be greatly appreciated!


Don't use attach.  If you want to do summaries by country, there are 
many simpler ways.  See ?tapply, ?aggregate, and ?by for instance.
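
A rough sketch of what that could look like. The data frame trial and the column score are invented stand-ins for the SPSS file, which was not posted. One further point that I am confident of from how subset() handles its arguments: a single = is taken as a named argument and ignored, so all rows come back, whereas == actually tests each row.

```r
# Hypothetical stand-in for the SPSS data; column names are assumptions
trial <- data.frame(COUNTRY = c("NETHERLANDS", "NETHERLANDS", "BELGIUM", "BELGIUM"),
                    score   = c(10, 20, 30, 50))

# '==' tests each row; with a single '=' subset() returns every row
nl <- subset(trial, COUNTRY == "NETHERLANDS")
summary(nl$score)

# Per-country summaries without attach() or separate subsets
tapply(trial$score, trial$COUNTRY, mean)
aggregate(score ~ COUNTRY, data = trial, FUN = mean)
```

Referring to columns as nl$score (or using with()) avoids the masking problems that come with attach().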


Best,
Erik




Spencer







Re: [R] Question: how to build a subset to do separate calculations




Erik Iverson wrote:

Spencer -

Spencer wrote:


Dear R Experts,

I am trying to create several subsets that I want to use as separate 
datasets. After separate subsets are created for each country, I want 
to do some calculations for each. I did the following:


data <- data.frame(na.omit(read.spss("trial.sav")))

attach (data)


You might not want to call it 'data' since 'data' is the name of a 
function in base R.


Also, the usual advice is to avoid attaching data.frames to the search 
path, for the exact reasons that happen to you below.




Netherlands <- subset(data, COUNTRY = "NETHERLANDS")
attach(Netherlands)

But after this step, when I check the summary statistics for 
"Netherlands," what returns is the summary for the original dataset. 




Actually, this might not make sense to me.  What do you mean by "check the 
summary statistics for Netherlands"?


What command are you giving  to R to do that?


Re: [R] accessing list elements

2008-07-29 Thread Erik Iverson


Hello -

Paul Adams wrote:

Hello everyone, I have a list for which I am trying to calculate a max
value. I have the list as w<-c(v[[1]][1],...v[[100]][1]). The problem
I am getting is that the function max is saying the list is an
"invalid type (list) of  argument". When I show the element v[[1]][1]
it shows as $statistic,V,736. I am only wanting to use the number 736
from v[[1]][1] but am not sure how to access that number only. I
believe if I just use the number then I should be able to calculate
the max.

Any help would be appreciated Paul


Please see footer of this message, which states "...provide commented, 
minimal, self-contained, reproducible code."


You do not do this.

Is your problem like this?

a <- list(20, 30)
max(a) ##error
max(unlist(a)) ##no error
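
If each element of v really is a test result with a $statistic component, as the printed output in the question suggests, something along these lines might work. The list v below is an invented stand-in, since the real one was not posted:

```r
# Hypothetical stand-in for 'v': a list of results, each with a
# $statistic element (an assumption based on the question)
v <- list(list(statistic = c(V = 736)),
          list(statistic = c(V = 512)),
          list(statistic = c(V = 801)))

# Pull the number out of each element, giving a plain numeric vector
stats <- sapply(v, function(el) unname(el$statistic))
max(stats)
```

The point is to end up with a numeric vector rather than a list before calling max().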















__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Variable name in a String

2008-07-30 Thread Erik Iverson


?assign, ?get
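
A short sketch of how those two functions fit the question. The stored numbers are arbitrary, and the named-list variant at the end is an alternative I would consider, not something the poster asked for:

```r
prefix <- "year"
number <- 2008
var_name <- paste(prefix, number, sep = "")   # "year2008"

# Create a variable with that name in the current environment
assign(var_name, c(1.5, 2.5, 3.5))

# Later, retrieve it from the same string
get(var_name)

# A named list is often easier to manage than free-floating variables
store <- list()
store[[var_name]] <- c(1.5, 2.5, 3.5)
store[["year2008"]]
```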


R_Learner wrote:

Hi all,
  I have this string "year" and integer 2008 (both are inputs from the
user), and I would like to make a variable called "year2008" that will store
a vector of numbers. Does anyone know how to do this? 


  Also, if the user later input "year" and "2008", how do I treat this as a
variable; how can I access the variable year2008 using the string
paste("year","2008",sep="")?

Thanks so much!



Re: [R] colnames from read.table

2008-07-30 Thread Erik Iverson

Chua Siang Li wrote:

   Hi.  May I know why the colnames is NULL when reading only 1 column from a
   csv file?  

You are not "reading only 1 column from a csv file", you are subsetting 
one column of a data.frame.  See ?Extract and the "drop" argument 
specifically.

How do I get the colname then?

?names in this case.  Always know the class of the object you are 
dealing with, and which methods are defined for that class.

?class, ?methods
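
To illustrate with a made-up data frame (the contents of dataFile.csv are not known, so the column names here are assumptions):

```r
# Hypothetical stand-in for the CSV contents
xy <- data.frame(Price    = c(1.2, 3.4),
                 Date     = c("2008-07-01", "2008-07-02"),
                 Quantity = c(5, 6))

y <- xy[, 1]                   # a plain numeric vector: no column names
colnames(y)                    # NULL

y_df <- xy[, 1, drop = FALSE]  # still a one-column data.frame
colnames(y_df)                 # "Price"

names(xy)[1]                   # the name, taken from the data.frame itself
class(y)
class(y_df)
```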

See
 Thanks.

   > xy = read.table("dataFile.csv",header=T, sep=",")
   > y <- xy[,1]
   > xDate <- xy[,2]
   > x <- xy[,3:8]
   > colnames(y)
   NULL
   > colnames(xDate)
   NULL
   > colnames(x)
   [1] "Market.Price" "Quantity"     "Country"      "Incoterm"     "Channel"
   [6] "PaymentTerm"

   Chua Siang Li
   Consultant - Operations Research
   Acceval Pte Ltd
   Tel: 6297 8740
   Email: [EMAIL PROTECTED]
   Website: www.acceval-intl.com


Re: [R] How to output R image to a file?

2008-07-31 Thread Erik Iverson


?Devices for a start.
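
The general pattern is: open a file device, draw, then close the device. A small sketch using pdf(), with an arbitrary file name in the temporary directory; png(), jpeg() and the other devices listed on that help page follow the same open/draw/close pattern:

```r
# The file name and location are arbitrary choices for this example
pdf_file <- file.path(tempdir(), "plots.pdf")

pdf(pdf_file)                   # open the file device
hist(rnorm(100), main = "Example histogram")
boxplot(rnorm(100))             # each new plot goes on its own page
dev.off()                       # closing the device finishes the file
```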

Aiste Aistike wrote:

Hello,

I would like to ask if anyone could help me. I want to save images I create
(e.g. histograms, boxplots, plots, etc.) to a file or files. Does anyone
know how to do this?

Thank you.

Aiste




Re: [R] Storing Matrices into Hash

2008-07-31 Thread Erik Iverson

I think a named list is probably the easiest way to start off, something 
like:


all_mat <- list(mat1 = mat1, mat2 = mat2)

all_mat$mat2



Gundala Viswanath wrote:

Hi,

Suppose I have these two matrices (could be more).
What I need to do is to store these matrices into a hash.

So that I can call back any of the matrix back later.

Is there a way to do it?


mat_1

               [,1]        [,2]
  [1,] 9.327924e-01 0.067207616
  [2,] 9.869321e-01 0.013067929
  [3,] 9.892814e-01 0.010718579
  [4,] 9.931603e-01 0.006839735
  [5,] 9.149056e-01 0.08509


mat_2

               [,1]        [,2]
  [1,] 9.328202e-01 0.067179769
  [2,] 9.869402e-01 0.013059827
  [3,] 9.892886e-01 0.010711437
  [4,] 9.931660e-01 0.006833979
  [5,] 9.149391e-01 0.085060890


This method I have is not favorable
because it just stack the matrices together as another matrix.
Makes it hard to get individual matrix later.

all_mat <- NULL
all_mat <- c(all_mat, mat1,mat2)



- Gundala Viswanath
Jakarta - Indonesia




Re: [R] Storing Matrices into Hash

2008-07-31 Thread Erik Iverson


Gundala Viswanath wrote:

Thanks so much Erik,

But how do you include that in a loop.

I tried this, but it doesn't seem to work. Please advise:

__BEGIN__
all_mat <- NULL

for (matno in 1:10) {

mat <- process_to_create_matrix(da[matno])
all_mat <- list(all_post, matno = mat)

}



I'm not exactly sure what you're up to, what is all_post, and what is "da"?

You might look at the lapply function as an option for avoiding the loop 
and writing cleaner code.


Something like:

all_mat <- lapply(da, process_to_create_matrix)

may or may not work depending on what "da" is.
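
Here is a self-contained sketch of both approaches. Since "da" and process_to_create_matrix() were not shown, the versions below are invented placeholders:

```r
# Hypothetical stand-ins for objects that were not shown in the thread
da <- list(1:4, 5:8, 9:12)
process_to_create_matrix <- function(v) matrix(v, ncol = 2)

# Loop version: start from an empty list and assign each matrix by name
all_mat <- list()
for (matno in seq_along(da)) {
  all_mat[[paste("mat", matno, sep = "")]] <- process_to_create_matrix(da[[matno]])
}

# lapply version: same result without the explicit loop
all_mat2 <- lapply(da, process_to_create_matrix)
names(all_mat2) <- paste("mat", seq_along(da), sep = "")

all_mat$mat2
```

Assigning with all_mat[[name]] <- ... grows the list by one element per iteration, whereas list(all_mat, ...) would nest the old list inside a new one each time.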

Please see the footer of this message regarding reproducible, 
self-contained example code.


Best,
Erik

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] viewing data in something similar to 'R Data Editor'

2008-08-01 Thread Erik Iverson

See ?View but I don't think it 'auto updates' per your last sentence. 
Maybe there's a better option?


Rachel Schwartz wrote:

Hi,

I would like to view matrices I am working with in a clean, easy to read,
separate window.

A friend showed me how to do something like I want  with edit(). I can view
the matrix in the 'R Data Editor':

For a sample matrix:


mat=matrix(1:15,ncol=3)
mat

     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    2    7   12
[3,]    3    8   13
[4,]    4    9   14
[5,]    5   10   15



look=function(x) invisible(edit(x))
look(mat)


That opens the 'R Data Editor' with mat loaded.


But I am not able to do any other actions in R while this 'R Data Editor' is
open. I want to keep this open while
I do other work.

Is there a way to view my data in something like the 'R Data Editor' that
still allows me to do work at the same time?
I am looking for something other than  str(), head(), and tail() which just
allow me a quick peek at the object. I do not
want to edit the object in the table, but be able to watch the object change
while I run anything that would manipulate it.

Thank you for your help.

Best,
Rachel Schwartz
Graduate Student Researcher
UCSD; Scripps Institution of Oceanography




Re: [R] viewing data in something similar to 'R Data Editor'

2008-08-01 Thread Erik Iverson

Rachel Schwartz wrote:
Thanks Erik, almost worked! I am a mac user and for some reason  View 
worked perfectly for my PC using friend, but

doesn't for me.

When I tried:
 > mat=matrix(1:10,ncol=2)
 > mat
      [,1] [,2]
 [1,]    1    6
 [2,]    2    7
 [3,]    3    8
 [4,]    4    9
 [5,]    5   10
 > View(mat)

I get no error message, but nothing happens (besides spinning ball of 
death) and I have to force quit R. I tried

a couple different variations but still no success with using View.

Suggestions?

Not from me, no Mac here.  Maybe someone else?  Or else there is a Mac 
specific list, R-SIG-Mac, google for it.

On Fri, Aug 1, 2008 at 10:52 AM, Erik Iverson <[EMAIL PROTECTED] 
<mailto:[EMAIL PROTECTED]>> wrote:

See ?View but I don't think it 'auto updates' per your last
sentence. Maybe there's a better option?

Rachel Schwartz wrote:

Hi,

I would like to view matrices I am working with in a clean, easy
to read,
separate window.

A friend showed me how to do something like I want  with edit().
I can view
the matrix in the 'R Data Editor':

For a sample matrix:

mat=matrix(1:15,ncol=3)
mat

     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    2    7   12
[3,]    3    8   13
[4,]    4    9   14
[5,]    5   10   15

look=function(x) invisible(edit(x))
look(mat)

That opens the 'R Data Editor' with mat loaded.

But I am not able to do any other actions in R while this 'R
Data Editor' is
open. I want to keep this open while
I do other work.

Is there a way to view my data in something like the 'R Data
Editor' that
still allows me to do work at the same time?
I am looking for something other than  str(), head(), and tail()
which just
allow me a quick peek at the object. I do not
want to edit the object in the table, but be able to watch the
object change
while I run anything that would manipulate it.

Thank you for your help.

Best,
Rachel Schwartz
Graduate Student Researcher
UCSD; Scripps Institution of Oceanography



Re: [R] Basic factor question.

2008-08-02 Thread Erik Iverson


Kevin -

Read more closely "levels", being an optional vector of the values x 
might have taken.  You are saying x might have taken 1:20, and then 
giving it the first 20 letters, which are not part of "the values x 
might have taken".


Try:

x <- factor(letters[1:20])
levels(x)
x

vs.

y <- factor(letters[1:20], levels = letters)
levels(y)
y

vs.

z <- factor(letters[1:20], levels = letters[1:19])
levels(z)
z

That might help show you what's going on?
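
If the goal was to have a = 1, b = 2 and so on, one possible sketch is to keep the letters as the levels and either relabel them or ask for the integer codes:

```r
# levels = the values found in the data; labels = what to call them
f <- factor(letters[1:20], levels = letters[1:20], labels = 1:20)
f

# Or keep the letters as levels and look at the underlying integer codes
g <- factor(letters[1:20])
as.integer(g)
```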

Best,
Erik Iverson

[EMAIL PROTECTED] wrote:

Doing ?factor I get:

x a vector of data, usually taking a small number of distinct values. 
levels an optional vector of the values that x might have taken. The default is the set of values taken by x, sorted into increasing order. 


So if I do:

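factor(letters[1:20],level=seq(1:20))
 [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
[16] <NA> <NA> <NA> <NA> <NA>
Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

So why all of the NA? What happened to 'a', 'b', etc.? I was expecting a=1, b=2, 
c=3 etc.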

I am missing something. Please help with my understanding. 


Kevin




Re: [R] T Test

2008-08-11 Thread Erik Iverson

Just plug your values into the t-test formula, you don't need R for 
this, you can use a calculator.  If you want a p-value then use the pt() 
function in R after getting the t statistic.
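
As a worked sketch, assuming a two-sample comparison with unequal variances (the Welch form) and known sample sizes, neither of which the question actually states. The helper name t_from_summary and the numbers are made up for illustration:

```r
# Welch two-sample t statistic from summary statistics (illustrative helper)
t_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1                       # squared standard error, group 1
  v2 <- s2^2 / n2                       # squared standard error, group 2
  t_stat <- (m1 - m2) / sqrt(v1 + v2)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p_value <- 2 * pt(-abs(t_stat), df)   # two-sided p-value
  c(t = t_stat, df = df, p.value = p_value)
}

# Made-up means, standard deviations and sample sizes
t_from_summary(m1 = 5.2, s1 = 1.1, n1 = 20, m2 = 4.6, s2 = 1.3, n2 = 25)
```

A one-sample or pooled-variance test would need a different formula for the standard error and the degrees of freedom.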


Angelo Scozzarella wrote:

Hi,

I want to calculate the T-Test from means and sd.

How can I do it?

Thanks

Angelo Scozzarella




Re: [R] produce variable on the fly

2008-08-12 Thread Erik Iverson


?assign , or consider a named vector/list.
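
Both options, sketched with the values 11 to 20 that the question asks for (the object names values, var_names and vars are just for illustration):

```r
values <- 11:20
var_names <- paste("VAR", 1:10, sep = "")

# Option 1: real variables, created with assign()
for (i in 1:10) {
  assign(var_names[i], values[i])
}
VAR1
VAR10

# Option 2: one named vector, usually easier to work with afterwards
vars <- values
names(vars) <- var_names
vars["VAR3"]
```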

jimineep wrote:

Hi guys,

I want to create variable on the fly: for example

for (i in 1:10) {
cat(paste("VAR",i,sep=""))
}
Will print VAR1, VAR2 etc up to VAR10. However I want to make these into
variables, and then give them a value, for example:

vect = c(10:20)

for (i in 1:10) {
cat(paste("VAR",i,sep="")) = vect[i]
}

This doesn't work, but I hope it demonstrates what I'm trying to do: I'm
trying to produce 10 variables, from VAR1 to VAR10, which have the numbers
11...20 respectively, so I would end up with 


VAR1 = 11
VAR2 = 12
...
VAR10 = 20

Hope this makes sense!! Does anyone have any ideas?



Re: [R] merging data sets to match data to date


rcoder wrote:

Hi everyone,

I want to extract data from a data set according to dates specified in a
vector. I have created a blank matrix with row names (dates) that I want to
extract from the full data set. I have then performed a merge to try to output
rows corresponding to common dates to a results matrix, but the operation
did not fill the results matrix. Could anyone offer any advice to assist
with this operation?


Yes, follow the posting guide and provide commented, minimal, 
self-contained, reproducible code of your problem.



Re: [R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

I still don't understand what you are doing.  Can you make a small 
example that shows what you have and what you want?


Is ?split what you are after?
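
In case it is, a small sketch using the toy data frame from the original question, with fixed col1 values in place of the random sample:

```r
# Small version of the data frame from the original question
df <- data.frame(names = factor(c(rep("A", 5), rep("B", 5))),
                 col1  = c(1, 0, 1, 0, 1, 0, 0, 1, 0, 0))

# One pass over the data: a list with one element per level of 'names'
by_name <- split(df$col1, df$names)
by_name$A
by_name[["B"]]
```

This builds the whole list in a single call instead of running which() once per level. I have not timed it on data of the size described.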

Emmanuel Levy wrote:

Dear Peter and Henrik,

Thanks for your replies - this helps speed up a bit, but I thought
there would be something much faster.

What I mean is that I thought that a particular value of a level
could be accessed instantly, similarly to a "hash" key.

Since I've got about 6000 levels in that data frame, it means that
making a list L of the form
L[[1]] = values of name "1"
L[[2]] = values of name "2"
L[[3]] = values of name "3"
...
would take ~1hour.

Best,

Emmanuel




2008/8/12 Henrik Bengtsson <[EMAIL PROTECTED]>:

To simplify:

n <- 2.7e6;
x <- factor(c(rep("A", n/2), rep("B", n/2)));

# Identify 'A':s
t1 <- system.time(res <- which(x == "A"));

# To compare a factor to a string, the factor is in practice
# coerced to a character vector.
t2 <- system.time(res <- which(as.character(x) == "A"));

# Interestingly enough, this seems to be faster (repeated many times)
# Don't know why.
print(t2/t1);
   user   system  elapsed
0.632653 1.60 0.754717

# Avoid coercing the factor, but instead coerce the level compared to
t3 <- system.time(res <- which(x == match("A", levels(x))));

# ...but gives no speed up
print(t3/t1);
   user   system  elapsed
1.041667 1.00 1.018182

# But coercing the factor to integers does
t4 <- system.time(res <- which(as.integer(x) == match("A", levels(x))));
print(t4/t1);
     user    system   elapsed
0.417 0.000 0.3636364

So, the latter seems to be the fastest way to identify those elements.

My $.02

/Henrik


On Tue, Aug 12, 2008 at 7:31 PM, Peter Cowan <[EMAIL PROTECTED]> wrote:

Emmanuel,

On Tue, Aug 12, 2008 at 4:35 PM, Emmanuel Levy <[EMAIL PROTECTED]> wrote:

Dear All,

I have a large data frame ( 270 lines and 14 columns), and I would like to
extract the information in a particular way illustrated below:


Given a data frame "df":


col1=sample(c(0,1),10, rep=T)
names = factor(c(rep("A",5),rep("B",5)))
df = data.frame(names,col1)
df

   names col1
1      A    1
2      A    0
3      A    1
4      A    0
5      A    1
6      B    0
7      B    0
8      B    1
9      B    0
10     B    0

I would like to transform it into the form:


index = c("A","B")
col1[[1]]=df$col1[which(df$name=="A")]
col1[[2]]=df$col1[which(df$name=="B")]

I'm not sure I fully understand your problem; your example would not run for me.

You could get a small speedup by omitting which(), since you can also subset
directly by a logical vector.


n <- 270
foo <- data.frame(

+   one = sample(c(0,1), n, rep = T),
+   two = factor(c(rep("A", n/2 ),rep("B", n/2 )))
+   )

system.time(out <- which(foo$two=="A"))

  user  system elapsed
 0.566   0.146   0.761

system.time(out <- foo$two=="A")

  user  system elapsed
 0.429   0.075   0.588

You might also find use for unstack(), though I didn't see a speedup.

system.time(out <- unstack(foo))

  user  system elapsed
 1.068   0.697   2.004

HTH

Peter


My problem is that the command:  *** which(df$name=="A") ***
takes about 1 second because df is so big.

I was thinking that a "level" could maybe be accessed instantly but I am not
sure about how to do it.

I would be very grateful for any advice that would allow me to speed this up.

Best wishes,

Emmanuel







Re: [R] conditional IF with AND


if(cond1 && cond2) {
  ...
}
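
A concrete version, with a made-up matrix and thresholds, since the actual conditions were not given:

```r
# Hypothetical matrix and indices
a <- matrix(1:6, nrow = 2)
x <- 2
y <- 3

# && combines two single TRUE/FALSE values and stops early if the
# first one is FALSE
if (a[x, y] > 4 && a[x, y] < 10) {
  message("both conditions hold")
}

# & is the element-wise version, for whole vectors (e.g. when indexing)
a[a > 2 & a < 6]
```

See ?Logic for the difference between the two operators.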


rcoder wrote:

Hi everyone,

I'm trying to create an "if" conditional statement with two conditions,
whereby the statement is true when condition 1 AND condition 2 are met:

code structure:
if ?AND? (a[x,y] , a[x,y] )

I've trawled through the help files, but I cannot find an example of the
syntax for incorporating an AND in a conditional IF statement.

Thanks,

rcoder



Re: [R] Conditional statement used in sapply()


Hello -

Altaweel, Mark R. wrote:

Hi,

I have data stored in a list that I would like to aggregate and
perform some basic stats. However, I would like to apply conditional
statements so that not all the data are used.  Basically, I want to
get a specific variable, do some basic functions (such as a mean),
but only get the data in each element's data that match the
condition. The code I used is below:


result <- sapply(res, function(.df) {   # res is the list containing file data
  if (.df$Volume > 0) mean(.df$Volume)  # only have the mean function calculate on values greater than 0
})


I did get a numeric output; however, when I checked the output value
the conditional was ignored (i.e. it did not do anything to the
calculation)

I also obtained these warning statements:

Warning messages:
1: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used
2: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used

Please let me know what am I doing wrong and how can I apply a
conditional statement to the sapply function.



Before you think about sapply, what would you do if you had one element 
of this list.  Write a function to do that.


You wouldn't do :

if(x$Volume > 0)
  mean(x$Volume)

because x$Volume > 0 will create a logical vector of length greater than 1 
(assuming x$Volume has length greater than 1), and then "if" will issue 
the warning.


You might do,

mean(x$Volume[x$Volume > 0])

and turn it into a function.

Then use sapply.
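
Put together, a sketch might look like this. The list res below is an invented stand-in for the real file data, and mean_positive is just an illustrative name:

```r
# Hypothetical stand-in for 'res': a list of data frames with a Volume column
res <- list(a = data.frame(Volume = c(0, 2, 4)),
            b = data.frame(Volume = c(-1, 3, 0, 9)))

# Works on one element: keep the positive values, then average them
mean_positive <- function(x) mean(x$Volume[x$Volume > 0])

# Apply it to every element of the list
result <- sapply(res, mean_positive)
result
```

An element with no positive values would give NaN, which may or may not be what you want.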

Hopefully that gets you started!

Erik


Re: [R] Simple (?) subset problem

2008-08-14 Thread Erik Iverson

I can't tell exactly what's wrong, just check out the ?str and ?levels 
functions for some guidance.
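
For example, one thing those functions can reveal is stray whitespace in the labels. That is only a guess at the cause here, since the real SurveyData was not posted; the data below is invented to show the idea:

```r
# Hypothetical data in which the labels carry trailing spaces
SurveyData <- data.frame(
  direction_ = factor(c("EASTBOUND  ", "WESTBOUND  ", "EASTBOUND  "))
)

str(SurveyData)                 # shows the column type
levels(SurveyData$direction_)   # shows the exact labels, spaces included

# An exact match on "EASTBOUND" finds nothing...
nrow(subset(SurveyData, direction_ == "EASTBOUND"))

# ...but matching after trimming whitespace does
nrow(subset(SurveyData, trimws(as.character(direction_)) == "EASTBOUND"))
```

Note also that the comparison needs ==; with a single = subset() returns every row.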


Farley, Robert wrote:

I can't figure out the syntax I need to get subset to work.  I'm trying
to split my dataframe into two parts.  I'm sure this is a simple issue,
but I'm stumped.  I either get all or none of the original "rows".  

 

 

 

 


XTTable <- xtabs( ~   direction_ , SurveyData)
XTTable

direction_
EASTBOUND  WESTBOUND
      345        307

EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND" )
XTTable <- xtabs( ~   direction_ , EBSurvey)
XTTable

direction_
EASTBOUND  WESTBOUND
        0          0

EBSurvey <- subset(SurveyData, direction_ = "EASTBOUND" )
XTTable <- xtabs( ~   direction_ , EBSurvey)
XTTable

direction_
EASTBOUND  WESTBOUND
      345        307

EBSurvey <- subset(SurveyData, direction_ == 1 )
XTTable <- xtabs( ~   direction_ , EBSurvey)
XTTable

direction_
EASTBOUND  WESTBOUND
        0          0


 

 

 

 


Robert Farley

Metro

1 Gateway Plaza

Mail Stop 99-23-7

Los Angeles, CA 90012-2952

Voice: (213)922-2532

Fax:(213)922-2868

www.Metro.net 

 

 






Re: [R] Opening a web browser from R?

2008-08-15 Thread Erik Iverson


Hello -

[EMAIL PROTECTED] wrote:


Hi,

I was wondering if there's a way in R to open a web browser (such as 
Internet Explorer, or Firefox or whatever).
I'm doing some analyses that have associated urls, and it would be nice 
to have the ability to directly open the relevant page from within R.


I don't know what you mean by "associated".  If you just want to open a 
URL in a web browser from R, see ?browseURL, the help.start function 
uses this. Of course, help.search("browser") would have led you there.
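
A minimal sketch, assuming the URLs can be built from some identifier in the analysis. The base URL, the IDs and the helper make_url are all placeholders:

```r
# Hypothetical base URL and record IDs standing in for the
# "associated urls" in the question
make_url <- function(id) paste("https://example.org/record/", id, sep = "")

urls <- sapply(c(101, 102), make_url)
urls

# Only try to launch a browser in an interactive session
if (interactive()) {
  browseURL(urls[1])
}
```

browseURL() also takes a browser argument if the default browser is not the one you want.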


HTH,
Erik


Re: [R] Saving environment object

2008-08-15 Thread Erik Iverson


Benjamin Otto wrote:

Hi,

When I create an environment object with new.env() and populate it with
values then how can I save it into an .RData file properly, so it can be
loaded later on in a new session?

Saving an environment object with save() or save.image() results in an error
message when loading again:

Error: protect(): protection stack overflow


Can you give a small, reproducible example as the posting guide asks? 
And also provide your sessionInfo() ?


I am not able to replicate this.

test <- new.env()
assign("hi", pi, pos = test)
save(test, file = "~/testenv.Rdata")

does not give me an error.  Is this basically what you're trying?
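
For completeness, a full save/load round trip might look like this. It writes to a temporary file instead of the home directory, which is just a choice for this sketch:

```r
test <- new.env()
assign("hi", pi, envir = test)

# Save to a temporary file, remove the object, then load it back
rdata_file <- tempfile(fileext = ".RData")
save(test, file = rdata_file)
rm(test)

load(rdata_file)
get("hi", envir = test)   # the stored value comes back with the environment
```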


Re: [R] Saving environment object

2008-08-15 Thread Erik Iverson

Of course you said when you load it again.  I just now loaded it, 
without error.


FYI, my sessionInfo(), which I realize is not the latest version.

sessionInfo()
R version 2.7.0 (2008-04-22)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
 [1] grDevices datasets  grid  tcltk splines   graphics  utils
 [8] stats methods   base

other attached packages:
[1] fortunes_1.3-4  debug_1.1.0 mvbutils_1.1.1  erik_0.0-1
[5] reshape_0.8.0   SPLOTS_1.3-50   Hmisc_3.4-3 chron_2.3-21
[9] survival_2.34-1

loaded via a namespace (and not attached):
[1] cluster_1.11.10 lattice_0.17-6  tools_2.7.0

Benjamin Otto wrote:

Hi,

When I create an environment object with new.env() and populate it with
values then how can I save it into an .RData file properly, so it can be
loaded later on in a new session?

Saving an environment object with save() or save.image() results in an error
message when loading again:

Error: protect(): protection stack overflow

Regards,

benjamin

==
Benjamin Otto
University Hospital Hamburg-Eppendorf
Institute For Clinical Chemistry
Martinistr. 52
D-20246 Hamburg

Tel.: +49 40 42803 1908
Fax.: +49 40 42803 4971
==







__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Produce single line graph title composed of text and computed values.

2009-08-10 Thread Erik Iverson

See the ?paste function, instead of the ?c function.  
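
A small sketch of the difference, assuming 'corrln' is the result of cor.test(). The mtcars data and the title text are stand-ins, since the original data were not posted.

```r
## Stand-in for the poster's 'corrln' object (assumed to come from cor.test)
corrln <- cor.test(mtcars$mpg, mtcars$wt)
text <- "MPG vs weight"   # hypothetical title text

## c() would give a character vector with five elements, which is what
## ends up on five lines; paste() glues the pieces into ONE string
ttl <- paste(text, "Corr=", round(corrln$estimate, 2),
             "p=", round(corrln$p.value, 2))
ttl

## plot(mtcars$wt, mtcars$mpg); title(ttl)
```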

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of John Sorkin
Sent: Monday, August 10, 2009 10:36 AM
To: r-help@r-project.org
Subject: [R] Produce single line graph title composed of text and computed 
values.

R 2.8.1
Windows XP

I am trying to produce a title that combines: 
text, a computed value, text,  a computed value

The title contains everything I want, but each element of the title is on a 
separate line, i.e. my title is five lines long. Is there any way I can force 
the entire title to be on the same line? The problem is not the length of the 
title; it is short enough to fit on a single line.

My R code follows:

title(c(text,"Corr=",round(corrln$estimate,2),"p=",round(corrln$p.value,2)))

Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] summary(table)

2009-08-10 Thread Erik Iverson

We cannot reproduce your example since we don't have access to probF.  It seems 
probF is not an object of class "table", but perhaps of class "data.frame".  

Also, summary is not "cutting off the other variables", it is pooling levels of 
a factor into the "Other" category.  All the levels belong to the same 
variable.  

By looking at the help for summary, i.e., by typing ?summary at the R prompt, 
you will find that the data.frame method has an argument called "maxsum", which 
is by default equal to 7.  The argument is described as "integer, indicating 
how many levels should be shown for 'factor's."  That sounds like what you 
want.  
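
A minimal sketch of the effect of maxsum, using a made-up data.frame in place of probF (the column names are borrowed from the question, the values are invented):

```r
## Hypothetical data: one factor column with 10 levels
probF <- data.frame(
  Server.Load = factor(rep(paste0("SL", 1:10), times = 10:1)),
  Avg.CPU     = seq(0, 100, length.out = 55)
)

## Default (maxsum = 7): the less frequent levels are pooled into "(Other)"
s.default <- summary(probF)
s.default

## Raise maxsum so that every level gets its own line
s.all <- summary(probF, maxsum = nlevels(probF$Server.Load) + 1)
s.all
```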


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of mmv.listservs
Sent: Monday, August 10, 2009 12:52 PM
To: r-help@r-project.org
Subject: [R] summary(table)

Hi,

Why when I do a summary on a table it cuts off the other variables? It says
Other :58 or Other: 120.

how can I get the summary for all the variables under ServLoad.Task and
Server.Load and Avg. CPU and Max.CPU?

Thanks,

summary(probF)
   Reboot.Id          ServLoad.Task        Server.Load        Avg.CPU            Max.CPU                  Event.Log
 Min.   : 2.00   120067_122395:  5   SLACT04_1     :21   Min.   : 0.02827   Min.   : 0.40   2009-MAY-13 20:21:37:86
 1st Qu.:18.50   120076_124326:  5   SLSPED06      :19   1st Qu.: 4.22128   1st Qu.: 7.80   2009-MAY-11 20:03:29: 9
 Median :25.00   120030_122501:  4   SLACT04_5     :15   Median :10.29370   Median :23.70   2009-MAY-13 01:25:13: 7
 Mean   :21.18   120046_122891:  4   SL06_SCEVEN_OP:14   Mean   :20.54242   Mean   :32.88   2009-MAY-13 21:03:08: 5
 3rd Qu.:25.00   120046_122933:  4   SLACT04_4     :10   3rd Qu.:33.44267   3rd Qu.:53.70   2009-MAY-11 04:41:51: 4
 Max.   :30.00   120071_121550:  4   SLBF06_3      : 9   Max.   :96.73438   Max.   :99.90   2009-MAY-11 05:30:02: 4
                 (Other)      :120   (Other)       :58                                      (Other)             :31
  Probability
 Min.   :0.34
 1st Qu.:0.000227
 Median :0.009295
 Mean   :2.321083
 3rd Qu.:4.725526
 Max.   :9.432425


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bug in "seq" (or a "feature") ?

2009-08-10 Thread Erik Iverson

General floating point arithmetic issue here:

See FAQ 7.31

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
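
A short sketch of what the FAQ entry means for the code in the question. The tolerance 1e-8 is an arbitrary choice for illustration, not a recommended constant.

```r
x <- seq(-0.1, 0.9, by = 0.05)

## The element that prints as 0.50 need not be exactly 0.5
print(x[13], digits = 17)
x[13] == 0.5                      # exact comparison, may be FALSE
isTRUE(all.equal(x[13], 0.5))     # comparison with a tolerance

## Subset with a small tolerance instead of an exact <=
tol <- 1e-8
kept <- x[x <= 0.5 + tol]
kept
```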



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Tal Galili
Sent: Monday, August 10, 2009 4:14 PM
To: r-help@r-project.org
Subject: [R] Bug in "seq" (or a "feature") ?

(I use R 2.9.1 with win XP)

If I run this code:
seq(-0.1,.9, by = .05)[seq(-0.1,.9, by = .05) <= 0.5]
I get this output:
[1] -0.10 -0.05 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45
Why is 0.50 not in the results ?
(It seems that it gives a slightly bigger number than 0.5 but I don't
understand why it does that)


Whereas if I try:
seq(-0.1,.9, by = .05)[seq(-0.1,.9, by = .05) <= 0.4]
and get:
[1] -0.10 -0.05 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40
Then 0.40 WILL be in the results.

Thanks,Tal




-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pretty display or print for data frames ?

I sometimes use the View() (note the capital V) function to view long/wide 
data.frames. 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Leon Yee
Sent: Wednesday, August 12, 2009 2:12 AM
To: Patrick Connolly
Cc: r-help@r-project.org
Subject: Re: [R] pretty display or print for data frames ?

Patrick Connolly wrote:
> On Wed, 12-Aug-2009 at 12:38PM +0800, Leon Yee wrote:
> 
>> Hi, all
>>
>> I have a question about adjusting the output of a data frame with many  
>> columns. By default, print() will print out several columns according to  
>> the window size, and then it scrolls down and prints out the remaining  
>> columns. How can I make it print all the columns in the same line?
>> I found that options(width=80 or 120 or whatever) will change the  
>> behavior of print() and give the effect I want, but using a  
>> parameter directly doesn't work: print(x, width=xxx).
>> Can anyone help me out? thanks.
> 
> This will get it on one line.
> 
> options(width=120); print(x)
> 
> I'm not sure what your main objective is so that might be of no use.
> If it's something you do often, you could make a little function that
> takes the name of the dataframe and the width you'd like it printed.
> You could even have the function put the width back to the initial
> setting.  (Generally speaking, printing very wide gets messy, so
> there's good reason why you might just want to do that.)

Thanks, Patrick.  Printing wide is sometimes needed. options(width=120) 
just works, but I need to change it back to 80 every time for normal 
printing. Sort of tedious.
I'm just curious why print.data.frame does not take an argument of 
width=120, just like in print.factor.

## S3 method for class 'factor':
print(x, quote = FALSE, max.levels = NULL,
   width = getOption("width"), ...)
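
The "little function" Patrick describes above could be sketched roughly as follows. The name print_wide is hypothetical, not a base R function; it sets the width, prints, and puts the old width back.

```r
## Print x with a temporary console width, then restore the old setting
print_wide <- function(x, width = 120, ...) {
  old <- options(width = width)   # options() returns the previous values
  on.exit(options(old))           # restored even if print() fails
  print(x, ...)
  invisible(x)
}

## Hypothetical wide data frame (2 rows, 30 columns)
wide <- as.data.frame(matrix(round(rnorm(60), 3), nrow = 2))
print_wide(wide, width = 200)
getOption("width")   # back to whatever it was before the call
```

Using on.exit() means the width is reset even when printing stops with an error, which avoids the "change it back every time" step.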

Best wishes,
Leon

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Symbolic references - passing variable names into functions

I think ONE answer to what you actually want to do might be

f <- function(dataf, col1 = "column1", col2 = "column2") {
dataf[[col1]] <- dataf[[col2]] # just as an example
  dataf
}
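
A short usage sketch with a made-up data frame. The assign()/get() lines at the end are an addition for the "(operator)a = 1" part of the question; they apply to ordinary variables, not to data frame columns.

```r
f <- function(dataf, col1 = "column1", col2 = "column2") {
  dataf[[col1]] <- dataf[[col2]]   # [[ ]] accepts the column name as a string
  dataf                            # return the modified copy
}

## Hypothetical data frame for illustration
d <- data.frame(column1 = 1:3, column2 = c(10, 20, 30), other = 7:9)

d2 <- f(d)                                     # uses the default column names
d3 <- f(d, col1 = "other", col2 = "column1")   # any pair of columns

## For ordinary variables, assign() and get() take the name as a string
a <- "myvar"
assign(a, 1)
get(a)
```

Note that f() returns a modified copy; the original 'd' is unchanged unless the result is assigned back to it.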

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Alexander Shenkin
Sent: Wednesday, August 12, 2009 9:27 AM
To: r-help@r-project.org
Subject: [R] Symbolic references - passing variable names into functions

Hello All,

I am trying to write a function which would operate on columns of a
dataframe specified in parameters passed to that function.

f = function(dataf, col1 = "column1", col2 = "column2") {
dataf$col1 = dataf$col2 # just as an example
}

The above, of course, does not work as intended.  In some languages one
can force evaluation of a variable, and then use that evaluation as the
variable name.  Thus,

> a = "myvar"
> (operator)a = 1
> myvar
[1] 1

Is there some operator which allows this symbolic referencing in R?

Thanks,
Allie

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nominal variables in SVM?

Noah, depending on what function you use, it might do this automatically for 
you if you give the function a formula containing a factor.  Otherwise, see 
?model.matrix.  
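
A minimal sketch of the model.matrix() route, with invented data matching the color/height example quoted below. The object names are made up for illustration.

```r
## Hypothetical data with one nominal column
d <- data.frame(
  color  = factor(c("blue", "red", "green"), levels = c("red", "blue", "green")),
  height = c(15, 10, 5)
)

## "- 1" drops the intercept so that every level gets its own 0/1 column
X <- model.matrix(~ color + height - 1, data = d)
X

## Fixing 'levels' explicitly gives new data the same columns as the
## training data, even if some level is absent from the new data
newd <- data.frame(color  = factor(c("red", "green"), levels = levels(d$color)),
                   height = c(12, 7))
X.new <- model.matrix(~ color + height - 1, data = newd)
X.new
```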

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Noah Silverman
Sent: Wednesday, August 12, 2009 3:59 PM
Cc: r help
Subject: Re: [R] Nominal variables in SVM?

That makes sense.

If my data is already nominal, I need to "expand" a single column into 
several binary ones.

Is there an easy function to do this in R, or do I need to create 
something from scratch?  (If I have to create my own, any suggestions?)

Thanks!

-N

On 8/12/09 1:55 PM, Steve Lianoglou wrote:
> Hi,
>
> On Aug 12, 2009, at 2:53 PM, Noah Silverman wrote:
>
>> Hi,
>>
>> The answers to my previous question about nominal variables have led 
>> me to a more important question.
>>
>> What is the "best practice" way to feed nominal variables to an SVM?
>>
>> For example:
>> color = ("red", "blue", "green")
>>
>> I could translate that into an index so I wind up with
>> color = (1,2,3)
>>
>> But my concern is that the SVM will now think that the values are 
>> numeric in "range" and not discrete conditions.
>>
>> Another thought would be to create 3 binary variables from the single 
>> color variable, so I have:
>>
>> red = (0,1)
>> blue = (0,1)
>> green = (0,1)
>>
>> An example fed to the SVM would have one positive and two negative 
>> values to indicate the color value:
>> i.e. for a blue example:
>> red = 0, blue = 1, green = 0
>
> Do it this way.
>
> So, imagine if the features for your examples were color and height, 
> your "feature matrix" for N examples would be N x 4
>
> 0,1,0,15  # blue object, height 15
> 1,0,0,10  # red object, height 10
> 0,0,1,5 # green object, height 5
> ...
>
> -steve
>
> -- 
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>   |  Memorial Sloan-Kettering Cancer Center
>   |  Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nominal variables in SVM?

This is where a small, reproducible example will definitely help us discover 
your problem. 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Noah Silverman
Sent: Wednesday, August 12, 2009 4:29 PM
To: Achim Zeileis
Cc: r help
Subject: Re: [R] Nominal variables in SVM?

Thanks for all the suggestions.

My data was loaded in from a csv file with about 80 columns (3 of these 
columns are nominal)  no specific settings for the nominal columns.

Currently, if I call svm (e1071), I get an error about the nominal column.

Do I need to tell R to change the column to a factor?  i.e. foo$color <- 
factor(foo$color)


On 8/12/09 2:21 PM, Achim Zeileis wrote:
> On Wed, 12 Aug 2009, Noah Silverman wrote:
>
>> Hi,
>>
>> The answers to my previous question about nominal variables have led 
>> me to a more important question.
>>
>> What is the "best practice" way to feed nominal variables to an SVM?
>
> As some of the previous posters have already indicated: The data 
> structure for storing categorical (including nominal) variables in R 
> is a "factor".
>
> Your comment about "truly nominal" is wrong. A character variable is a 
> character variable, not necessarily a categorical variable. 
> Categorical means that the answer falls into one of a finite number of 
> known categories, known as "levels" in R's "factor" class.
>
> If you start out from character information:
>
>   x <- c("red", "red", "blue", "green", "blue")
>
> You can turn it into a factor via:
>
>   x <- factor(x, levels = c("red", "green", "blue"))
>
> R now knows how to do certain things with such a variable, e.g., 
> produces useful summaries or knows how to deal with it in regression 
> problems:
>
>   model.matrix(~ x)
>
> which seems to be what you asked for. Moreover, you don't need to call 
> this yourself but most regression functions in R will do that for you 
> (including svm() in "e1071" or ksvm() in "kernlab", among others).
>
> In short: Keep your categorical variables as "factor" columns in a 
> "data.frame" and use the formula interface of svm()/ksvm() and you are 
> fine.
> Z
>
>
>> For example:
>> color = ("red", "blue", "green")
>>
>> I could translate that into an index so I wind up with
>> color = (1,2,3)
>>
>> But my concern is that the SVM will now think that the values are 
>> numeric in "range" and not discrete conditions.
>>
>> Another thought would be to create 3 binary variables from the single 
>> color variable, so I have:
>>
>> red = (0,1)
>> blue = (0,1)
>> green = (0,1)
>>
>> An example fed to the SVM would have one positive and two negative 
>> values to indicate the color value:
>> i.e. for a blue example:
>> red = 0, blue = 1, green = 0
>>
>> Or, do any of the SVM packages intelligently handle this internally 
>> so that I don't have to mess with it?  If so, do I need to be 
>> concerned about different "translation" of the data if the test data 
>> set isn't exactly the same as the training set?
>> For example:
>> training data = color ("red", "blue", "green")
>> test data = color ("red", "green")
>>
>> How would I be sure that the "red" and "green" examples get encoded 
>> the same so that the SVM is accurate?
>>
>> Thanks in advance!!
>>
>> -N
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Browser and Debug?

2009-08-13 Thread Erik Iverson

This article might help:

http://www.biostat.jhsph.edu/~rpeng/docs/R-debug-tools.pdf
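
As a very small starting point, the sketch below shows the two functions on a toy function. The function 'center' is made up for illustration; the call that would actually enter the debugger is left commented out because it needs an interactive session.

```r
## A toy function to step through (hypothetical example)
center <- function(x) {
  m <- mean(x)
  ## browser() pauses here and gives a prompt inside the function;
  ## it is guarded so that non-interactive runs are not interrupted
  if (interactive() && anyNA(x)) browser()
  x - m
}

## debug() flags a function: every later call is run line by line
debug(center)
isdebugged(center)   # TRUE
## center(c(1, 2, 3))   # at the Browse[2]> prompt: n = next line,
##                      # c = continue, Q = quit, or type any R expression
undebug(center)      # remove the flag again
isdebugged(center)   # FALSE

## debugonce(center) does the same for a single call only
```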

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Inchallah Yarab
Sent: Thursday, August 13, 2009 9:40 AM
To: r-help@r-project.org
Subject: [R] Browser and Debug?

Hi,

Can someone explain to me how to use browser() and debug()?
thank you


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get the n (number of observations) per conditional group

2009-08-13 Thread Erik Iverson

We will only be able to help if you provide a reproducible example!  I'm sure 
this is a simple one-liner, but it's hard to tell from your example what it 
should be.  The functions table, length, tapply, and/or nrow may play a part 
though. 
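
Without the real data this can only be a guess, but a sketch along these lines may be a starting point. The data frame 'dat' is invented; only the column names are taken from the question.

```r
## Hypothetical data: 4 observations for every combination of conditions
set.seed(42)
dat <- expand.grid(spp = 1:2, gyro = 0:1, salinity = c(0, 5),
                   day = 1:3, rep = 1:4)
dat$total <- rpois(nrow(dat), 20)

## n for one specific condition: count the TRUEs of the logical index
i <- 1
n.one <- sum(dat$spp == 1 & dat$gyro == 1 & dat$salinity == 0 & dat$day == i)
n.one

## n for every combination at once
n.tab <- with(dat, table(spp, gyro, salinity, day))

## n and mean per group, one row per combination
n.df <- aggregate(total ~ spp + gyro + salinity + day, data = dat, FUN = length)
m.df <- aggregate(total ~ spp + gyro + salinity + day, data = dat, FUN = mean)
head(n.df)
```

Note that table() counts rows, while the formula interface of aggregate() drops rows with missing values before counting, so the two can differ if 'total' contains NAs.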

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Dax
Sent: Thursday, August 13, 2009 9:11 AM
To: r-help@r-project.org
Subject: [R] How to get the n (number of observations) per conditional group

Hello all,

I have a huge data set that I'm cleaning up a bit. I am extracted the
means per condition, but also need to get the n. Strangely enough I am
unable to find a function that could actually pull this off. I am
unable to use replications or count the rows using nrow.

What I am trying to obtain is the number of observations fitting
something along the lines of "data$total[data$spp==1&data$gyro==1&data
$salinity==0&data$day==i]"

Can anyone help me?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Coding problem: How can I extract substring of function call within the function

2009-08-13 Thread Erik Iverson

Are you sure you just don't want to tell them about the :: operator?  It sounds 
easier than what you're proposing. 

E.g.,

base::mean(c(1:10, NA))
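
To make that concrete, here is a rough sketch. The replacement 'mean' is only a guess at what the package in the question does, and the 'pkg' argument of rStd is an assumption added here; the point is that '...' can be forwarded directly, without rebuilding the call as text.

```r
## A "training wheels" replacement, as described in the question
## (hypothetical implementation: warn about NAs, then drop them)
mean <- function(x, na.rm = TRUE, ...) {
  if (na.rm && anyNA(x)) warning("x contains NA values; they were removed")
  base::mean(x, na.rm = na.rm, ...)
}

suppressWarnings(mean(c(1:10, NA)))   # the replacement: NAs dropped
base::mean(c(1:10, NA))               # the standard version

## One-off wrapper: look up the standard function by name and pass
## the remaining arguments on unchanged
rStd <- function(f, ..., pkg = "base") {
  fname <- as.character(substitute(f))
  std <- getExportedValue(pkg, fname)
  std(...)
}

rStd(mean, c(1:10, NA))
rStd(median, c(1, 3, NA), na.rm = TRUE, pkg = "stats")
```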

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Pitt, Joel
Sent: Thursday, August 13, 2009 3:48 PM
To: r-help@r-project.org
Subject: [R] Coding problem: How can I extract substring of function call 
within the function

In order to ease my students into the R environment I am developing a package 
which installs a variety of utility functions as well as slightly modified 
versions of some standard R functions -- e.g. mean, hist, barplot. In my 
versions of these standard R functions I either add options or alter some 
defaults that seem to create difficulties for most of my students -- for 
example, when they do bar charts for two-dimensional tables they generally want 
the bars to be side-by-side and stumble over the standard default of beside=F, 
and my version of mean by default reports a mean value in the presence of NA's 
after warning of that presence (but retains the option of setting na.rm=F). (I 
don't doubt that some (if not many) of you will doubt the wisdom of this, and I 
would be happy to discuss this in more detail on other occasions.) You might 
want to think of my replacement R functions as a kind of "training wheels" for 
R, and, in the spirit of training wheels, I include a function in my package 
that allows a user to revert to the standard version of one or all functions 
without unloading the package (and losing its additional functionality). 
However, I want to add a function that allows a user to revert to running the 
standard R version of a given function on a one-off basis, and that's where my 
problem comes up.
 
I believe that it should be possible to write a function rStd with the usage 
rStd(x,...) where x is a function -- e.g. mean, hist, barchart, and the 
remaining parameters would be any of the parameters that should be passed to 
the unmodified version of mean, hist, barchart... The problem I have is how to 
get ahold of that collection of parameters as a single character string. Now I 
know that sys.calls()[[1]] will give me the full text of the initial call, but 
the problem is to detach the ... above from that as a text string. If I could 
do that I'd be done.
 
Here's the incomplete code with comments -- see the gap set off by asterisks. 
 
rStd=function(x,...){
if(missing(x))  # must have a specified function
{
   cat("Error: No function specified\n");
   return(invisible(NULL));
}
z=as.character(substitute(x));
# must include code here to check that z is the name
# of one of our altered functions
# if z is an altered function, e.g., "mean"
# then concatenating "x" with z gives the overlaid
# function -- e.g. xmean is the standard mean
#***
# Now we need to get a hold of the ... text
w=sys.calls()[[1]];  # this gets me the whole text of the call
# and now here's where the problem arises:
# how do I get the ... text? If I could get it and, say,
# assign it to the variable params,
# I could then set
cmd=sprintf("x%s(%s)",z,params) # see remarks above about z
   # and then it's done
eval(parse(text=cmd),sys.frame());
}
 
 
Any help would be much appreciated.
 
Regards
Joel
Joel Pitt, Ph.D.
Associate Professor & Chair
Mathematics & Computer Science
Georgian Court University
900 Lakewood Avenue
Lakewood, NJ 08540
732-987-2322


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with median for each row

2009-08-21 Thread Erik Iverson

Edward,

In general, if you have an nxn matrix, you can use the "apply" function to 
apply a function to each row of the matrix, and return the result. 

So, as a start, you could do, 

apply(your.mat, 1, median) 

or

apply(your.mat, 1, median, na.rm = TRUE) 

if you want to pass further arguments to median... You can also write your own 
function and pass it in, of course: 

E.g.,

apply(your.mat, 1, function(x) sum(x)+1001)

See ?apply.  
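
On the second part of the question (an even number of columns): median() averages the two middle values. A sketch of how one might pick the lower or upper one instead, using a made-up matrix; lower.median and upper.median are hypothetical helper names.

```r
## Hypothetical 2 x 4 matrix (an even number of columns)
your.mat <- matrix(c(1, 2, 3, 4,
                     10, 20, 30, 40), nrow = 2, byrow = TRUE)

## median() averages the two middle values when the count is even
apply(your.mat, 1, median)

## To pick one of the two middle values instead, pass your own function
lower.median <- function(x) sort(x)[ceiling(length(x) / 2)]
upper.median <- function(x) sort(x)[floor(length(x) / 2) + 1]

apply(your.mat, 1, lower.median)
apply(your.mat, 1, upper.median)
```

For an odd number of columns both helpers return the same (ordinary) median.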

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Edward Chen
Sent: Friday, August 21, 2009 11:50 AM
To: r-help@r-project.org
Subject: [R] help with median for each row

Hi,

I tried looking through google search on whether there's a way to compute
the median for each row of an nxn matrix and return the medians for each row
for further computation.
And also if the number of columns in the matrix is even, how could I
specify which median to use?

Thank you very much!

-- 
Edward Chen


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] table function

2009-08-24 Thread Erik Iverson

You need to create a factor that indicates which group the values in 'z' belong 
to.  The easiest way to do that based on your situation is to use the 'cut' 
function to construct the factor, and then call 'table' using the result 
created by 'cut'.  See ?cut and ?factor 
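
A short sketch using the z values from the example below. The class labels are my own choice; the breaks follow the classes named in the question.

```r
## The z column from the example in the question
z <- c(100, 1500, 1200, 500, 3500, 2000, 4500)

## cut() turns the numbers into a factor of intervals;
## Inf leaves the last class open-ended
z.class <- cut(z, breaks = c(0, 1000, 3000, Inf),
               labels = c("0-1000", "1000-3000", "> 3000"))

tab <- table(z.class)
tab
```

By default the intervals are closed on the right, so a value of exactly 1000 falls in the first class and a value of exactly 0 would be NA unless include.lowest = TRUE is given.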

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Inchallah Yarab
Sent: Monday, August 24, 2009 10:59 AM
To: r-help@r-project.org
Subject: [R] table function

hi,

I want to use the function table to build a table not of frequencies (the number 
of times a variable is repeated in a list or a data frame) but as a function of 
classes.
I don't find a clear explanation in the examples of ?table.

example

x    y    z
1    0    100
5    1    1500
6    1    1200
2    2    500
1    1    3500
5    2    2000
8    5    4500

I want to make a table summarizing the number of observations where z is in 
[0-1000], [1000-3000], [> 3000]

thank you very much for your help


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Unique command not deleting all duplicate rows

2009-08-24 Thread Erik Iverson

I really don't think this is the issue.  I think the issue is that some columns 
of the data.frame, specifically V1, V2, and V4 should be checked versus R FAQ 
7.31.  
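
A sketch of what I mean, with invented values. Whether unique() itself treats two such rows as distinct may depend on the R version, so the sketch rounds the numeric columns first; the choice of 3 digits is an assumption based on how the posted data print.

```r
## Two rows that print identically but differ in the last bits
## (hypothetical values chosen to show the effect)
d <- data.frame(V1 = c(0.3, 0.1 + 0.2), V3 = c(1170, 1170))
d
d$V1[1] == d$V1[2]                      # FALSE
isTRUE(all.equal(d$V1[1], d$V1[2]))     # TRUE

## Round the numeric columns to the precision that matters,
## then remove duplicates
num <- sapply(d, is.numeric)
d[num] <- lapply(d[num], round, digits = 3)
d.unique <- unique(d)
d.unique
```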

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Don McKenzie
Sent: Monday, August 24, 2009 1:35 PM
To: Mehdi Khan
Cc: r-help@r-project.org
Subject: Re: [R] Unique command not deleting all duplicate rows

duplicated()

 > test.df
        V1     V2   V3      V4 V5   V6 V7
1 -115.380 32.894  195 162.940  D 8419  D
2 -115.432 32.864  115 208.910  D 8419  D
3 -115.447 32.773 1170 264.570  D 8419  D
4 -115.447 32.773 1170 264.570  D 8419  D
5 -115.447 32.773 1170 264.570  D 8419  D
6 -115.447 32.773 1170 264.570  D 8419  D
7 -115.447 32.773  149 186.210  D 8419  D
8 -115.466 32.855  114 205.630  D 8419  D
9 -115.473 32.800 1121 207.469  D 8419  D

 > test.df[!duplicated(test.df),]
        V1     V2   V3      V4 V5   V6 V7
1 -115.380 32.894  195 162.940  D 8419  D
2 -115.432 32.864  115 208.910  D 8419  D
3 -115.447 32.773 1170 264.570  D 8419  D
7 -115.447 32.773  149 186.210  D 8419  D
8 -115.466 32.855  114 205.630  D 8419  D
9 -115.473 32.800 1121 207.469  D 8419  D


On 24-Aug-09, at 11:23 AM, Mehdi Khan wrote:

> Hello everyone, when I run the "unique" command on my data frame,  
> it deletes
> the majority of duplicate rows, but not all of them.  Here is a  
> sample of my
> data. How do I get it to delete all the rows?
>
>  6 -115.38 32.894 195 162.94 D 8419 D
>
>  7 -115.432 32.864 115 208.91 D 8419 D
>
>  8 -115.447 32.773 1170 264.57 D 8419 D
>
>  9 -115.447 32.773 1170 264.57 D 8419 D
>
>  10 -115.447 32.773 1170 264.57 D 8419 D
>
>  11 -115.447 32.773 1170 264.57 D 8419 D
>
>  12 -115.447 32.773 149 186.21 D 8419 D
>
>  13 -115.466 32.855 114 205.63 D 8419 D
>
>  14 -115.473 32.8 1121 207.469 D 8419 D
>
>
> Thanks a bunch!
>
> Mehdi Khan
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Don McKenzie, Research Ecologist
Pacific WIldland Fire Sciences Lab
US Forest Service

Affiliate Professor
School of Forest Resources, College of the Environment
CSES Climate Impacts Group
University of Washington

desk: 206-732-7824
cell: 206-321-5966
d...@u.washington.edu
donaldmcken...@fs.fed.us

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] table, xyplot, names, & loops

2009-08-25 Thread Erik Iverson

Hello, 

"I'm just starting out in R and have a basic question about xyplot and
tables. Suppose I had a table of data with the following names: Height,
Age_group, City. I'd like to plot mean Height vs Age_group for each City"

You did not provide a sample data.frame, so I generated one.  This example is 
basically borrowed directly from Figure 4.3 in Sarkar's excellent book, 
"Lattice". Personally, I do feel that a plot such as the one below is a better 
display choice than a simple table of the means, but some may disagree. 

Please note that my random data do not contain effects for either age.group or 
city, so my guess is that your resulting plot will look cleaner (i.e., contain 
some visual signal.) 

## BEGIN SAMPLE R CODE

## create sample data.frame
df <- data.frame(height <- rnorm(1000, 10),
                 age.group <- sample(gl(10,100,
                     labels = paste("Age Group", 1:10))),
                 city <- sample(gl(4, 250, labels = paste("City", 1:4))))

## tabulate the data in matrix form
h.tab <- with(df, tapply(height,
 list(age.group, city),
 mean))

## use dotplot with the matrix object 
dotplot(h.tab, type = "o",
auto.key = list(lines = TRUE, space = "right"),
xlab = "height")

## END SAMPLE R CODE

Best,
Erik Iverson 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] table, xyplot, names, & loops

2009-08-25 Thread Erik Iverson

And of course I did not test this :).  Within the data.frame argument list, 
please change the <- operators to = signs.  Then it should work.

Erik 
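
For completeness, the sample code with that change applied might read as follows. The library(lattice) call, the seed and the object name 'p' are additions for a self-contained run.

```r
library(lattice)   # dotplot() comes from the lattice package
set.seed(1)        # arbitrary seed, only for reproducibility

## create sample data.frame (note '=' rather than '<-' for the columns)
df <- data.frame(height = rnorm(1000, 10),
                 age.group = sample(gl(10, 100,
                                       labels = paste("Age Group", 1:10))),
                 city = sample(gl(4, 250, labels = paste("City", 1:4))))

## tabulate the data in matrix form
h.tab <- with(df, tapply(height, list(age.group, city), mean))

## use dotplot with the matrix object
p <- dotplot(h.tab, type = "o",
             auto.key = list(lines = TRUE, space = "right"),
             xlab = "height")
p   # printing the object draws the plot
```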


Re: [R] Managing output

2009-08-26 Thread Erik Iverson

How about ?append, but R is vectorized, so why not just

result_list <- 2*item^2 , or for more complicated tasks, the 
apply/sapply/lapply/mapply family of functions?

In general, the "for" loop construct can be avoided so you don't have to think 
about messy indexing.  What exactly are you trying to do? 
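
A minimal sketch of the options mentioned above, assuming the input is a plain numeric vector (the names items and result_* are made up for illustration):

```r
items <- c(1, 2, 3, 4)

## vectorized arithmetic: no loop needed
result_vec <- 2 * items^2

## sapply, for cases where each step is more than simple arithmetic
result_sapply <- sapply(items, function(item) 2 * item^2)

## if a loop really is wanted, allocate the result first and fill it in
result_loop <- numeric(length(items))
for (i in seq_along(items)) {
  result_loop[i] <- 2 * items[i]^2
}
```

All three give the same vector; the first is the shortest and usually the fastest.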

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Noah Silverman
Sent: Wednesday, August 26, 2009 2:20 PM
To: r help
Subject: [R] Managing output

Hi,


Is there a way to build up a vector, item by item?  In perl, we can 
"push" an item onto an array.  How can we do this in R?
I have a loop that generates values as it goes.  I want to end up with a 
vector of all the loop results.

In perl it would be:

for(item in list){
 result <- 2*item^2 (Or whatever formula, this is just a pseudo example)
 Push(@result_list, result)  (This is the step I can't do in R)
}


Thanks!


Re: [R] Simple column selection question- which and character lists

2009-08-31 Thread Erik Iverson

1) Don't call your data.frame "data".  I will call my "example" one "df".

2) If you want the columns NOT in names.species.bio.18, which is what you said, 
then the answer is: 

df[!names(df) %in% names.species.bio.18]
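
A small sketch of why %in% works here where == does not, using a made-up four-column data.frame and a made-up drop vector in place of names.species.bio.18. With ==, the shorter vector is recycled and compared element by element; %in% tests each name for membership in the set.

```r
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)
drop <- c("b", "d")  # stands in for names.species.bio.18

## positions of the columns that ARE in the list
in_list <- which(names(df) %in% drop)

## the columns that are NOT in the list
kept <- df[!names(df) %in% drop]
```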

Best,
Erik 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of AllenL
Sent: Monday, August 31, 2009 11:40 AM
To: r-help@r-project.org
Subject: [R] Simple column selection question- which and character lists


Dear R-list,
Seems simple but have tried multiple approaches, no luck.

I have a list of column names:
>names.species.bio.18=c("Achimillb","Agrosmitb","Amorcaneb","Andrgerab","Ascltubeb","Elymcanab","Koelcrisb","Lespcapib","Liataspeb","Lupipereb","Monafistb","Panivirgb","Petapurpb","Poaprateb","Querellib","Quermacrb","Schiscopb","Sorgnutab")

I want to select the column numbers which correspond to these names in my
data frame:
>which(colnames(data)==names.species.bio.18)

Result:
+[1] 75 76
+Warning message:
+In cols == names.species.bio.18 :
+  longer object length is not a multiple of shorter object length

So I get the first two hits and then it trips an error message.

What is  >which doing? Why does it seem to have trouble with vectors of
characters? 
My goal is to output the column names/indices which correspond to the
columns NOT in the above list, but that is simple once I can find out what
they are.

Thanks!
-Allen


[R] Offtopic, HT vs. HH in coin flips

2009-08-31 Thread Erik Iverson

Dear R-help, 

Could someone please try to explain this paradox to me? What is more likely to 
show up first in a string of coin tosses, "Heads then Tails", or "Heads then 
Heads"?  

## generate 2500 strings of random coin flips
ht <- replicate(2500,
                paste(sample(c("H", "T"), 100, replace = TRUE),
                      collapse = ""))

## find first occurrence of HT
mean(regexpr("HT", ht)) + 1  # mean of HT position, 4

## find first occurrence of HH
mean(regexpr("HH", ht)) + 1  # mean of HH position, 6

FYI, this is not homework; I have not been in school in years.  I saw a similar 
problem posed in a blog post on the Revolutions R blog, and although I believe 
the answer, I'm having a hard time figuring out why this should be. 

Thanks,
Erik Iverson


Re: [R] Offtopic, HT vs. HH in coin flips

2009-08-31 Thread Erik Iverson

Part of my issue was that I was not answering my original question.  "What is 
more likely to show up first, HT or HH?" The answer to that turns out to be 
"neither", or "identical chances". 

ht <- replicate(2500,
                paste(sample(c("H", "T"), 100, replace = TRUE),
                      collapse = ""))

hts <- regexpr("HT", ht) + 1
hhs <- regexpr("HH", ht) + 1

## which is first?
table(hts < hhs)  # about 50/50 

summary(hts)  #mean of 4
summary(hhs)  #mean of 6

So, "What is more likely to show up first, HH or HT?" is of course a different 
question than "Are the expected values of the positions for the first HT or HH 
the same?"  I suppose that's where confusion set in.  It seems that if HH 
appears later in the string on average (i.e., after 6 tosses instead of 4), 
that the probability of it being first would be lower than HT, which is 
obviously wrong!
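
The means of 4 and 6 can also be checked without simulation. This is only a sketch under the usual fair-coin assumption: let e0 be the expected number of further tosses when no progress has been made, and eH the expected number when the last toss was H. For HH, a T after an H throws you back to the start; for HT, an H after an H keeps you where you are. That gives two small linear systems (the variable names are made up for illustration):

```r
## HH:  e0 = 1 + 0.5*eH + 0.5*e0   and   eH = 1 + 0.5*0 + 0.5*e0
A_hh <- matrix(c( 0.5, -0.5,
                 -0.5,  1.0), nrow = 2, byrow = TRUE)

## HT:  e0 = 1 + 0.5*eH + 0.5*e0   and   eH = 1 + 0.5*eH + 0.5*0
A_ht <- matrix(c(0.5, -0.5,
                 0.0,  0.5), nrow = 2, byrow = TRUE)

rhs <- c(1, 1)

expected_hh <- solve(A_hh, rhs)[1]  # expected tosses until the first HH
expected_ht <- solve(A_ht, rhs)[1]  # expected tosses until the first HT
```

The difference comes entirely from what happens after a failed attempt, which is why the waiting times differ even though either pattern is equally likely to come first.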

A quick graphic that helps show this (you must run the above code first):

library(lattice)

ht.df <- data.frame(count = c(hts, hhs),
                    type = gl(2, length(hts), labels = c("HT", "HH")))

barchart(prop.table(xtabs(~ count + type, data = ht.df)),
         stack = FALSE, horizontal = FALSE,
         box.ratio = .8, auto.key = TRUE)

Thanks to all those who replied, and also someone sent me the following link 
off list, it also clears up the confusion:

http://www.mit.edu/~emin/writings/coinGame.html

Best, 
Erik 


Re: [R] Function to find angle between coordinates?

2009-09-01 Thread Erik Iverson

?atan2 is a possible starting point. 
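
As a rough sketch of how atan2 could be used for this, assuming the middle point is the vertex of the angle (the function name angle_at and its argument names are made up for illustration):

```r
## angle in degrees at 'vertex' between the rays towards p1 and p2
angle_at <- function(p1, vertex, p2) {
  v1 <- p1 - vertex
  v2 <- p2 - vertex
  cross <- v1[1] * v2[2] - v1[2] * v2[1]  # proportional to sin(angle)
  dot <- sum(v1 * v2)                     # proportional to cos(angle)
  abs(atan2(cross, dot)) * 180 / pi
}

ang <- angle_at(c(0, 1), c(0, 0), c(1, 0))
```

For the three points in the question this should come out as 90 degrees.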

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of clair.crossup...@googlemail.com
Sent: Tuesday, September 01, 2009 8:09 AM
To: r-help@r-project.org
Subject: [R] Function to find angle between coordinates?

Dear all,

I was doing some self study and was wondering if a function already
exists which allows one to determine the angle between points.  e.g.
given the following (x,y) coordinates

input: (0,1); (0,0); (1,0)

would result in:

output: 90 degrees

Best regards
C.C.


Re: [R] cbind objects using character vectors

2009-09-01 Thread Erik Iverson

Not tested, but instead of 

cbind(vec.names[1], vec.names[2])

try 

cbind(get(vec.names[1]), get(vec.names[2]))
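
If there are more than two names, a possible variation is mget() together with do.call(). This is a sketch, assuming the objects live in the environment the code is run from; a and b are set up here to match the values in the question.

```r
a <- 0:4
b <- 5:9
vec.names <- c("a", "b")

## mget() fetches all named objects as a list; do.call() passes them to cbind
res <- do.call(cbind, mget(vec.names, envir = environment()))
```

The column names are taken from vec.names, so the result should look like cbind(a, b).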

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of jonas garcia
Sent: Tuesday, September 01, 2009 12:53 PM
To: r-help@r-project.org
Subject: [R] cbind objects using character vectors

Dear list,

I have a character vector such as vec.names <- c("a", "b")

It happens that I also have two R objects called "a" and "b" that I would
like to merge. Is it possible to do something like
cbind(vec.names[1], vec.names[2]) and end up with the same result as
cbind(a,b)?

Below is a reproducible example of what I need to do:

dat <- data.frame(A=seq(1,5), B=seq(6,10))
vec.names <- c("a", "b")
for(i in 1:ncol(dat))
{
  tab <- dat[,i]-1
  assign(vec.names[i], tab)
}

cbind(vec.names[1], vec.names[2])
     [,1] [,2]
[1,] "a"  "b"

But I was looking for the following result (using vec.names):

cbind(a,b)
     a b
[1,] 0 5
[2,] 1 6
[3,] 2 7
[4,] 3 8
[5,] 4 9

Thanks in advance

Jonas


Re: [R] Date format in plot

2009-09-01 Thread Erik Iverson

We will need a reproducible example!  Please give us R commands that display 
the behavior you're observing:

For example,

I am having trouble understanding the as.Date function.  When I input 39939, I 
would like to get "06.05.2009", but when I try it, I get 

> as.Date(39939)
Error in as.Date.numeric(39939) : 'origin' must be supplied

I looked up what origin Excel uses for its dates, and it seems like it might 
be January 1, 1900, so I tried

as.Date(39939, origin = "1900-01-01")
[1] "2009-05-08"

Then we will be much better able to help you, because we will be able to paste 
your commands into R, see the results, and make changes. 

But this still seems to be off by two days.  So did you really mean "06.05", or 
"08.05"?

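The two-day offset is what one would expect if the serial number comes from Excel on Windows: the Examples in ?as.Date suggest an origin of "1899-12-30" for such dates. A sketch, assuming that is where 39939 came from:

```r
serial <- 39939

## origin suggested in ?as.Date for Windows Excel serial dates
d <- as.Date(serial, origin = "1899-12-30")

## day.month label, e.g. for axis tick labels
lab <- format(d, "%d.%m")
```

Once the x variable is a Date, the labels can be built with format() as above rather than plotting the raw serial numbers.
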


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of swertie
Sent: Tuesday, September 01, 2009 12:59 PM
To: r-help@r-project.org
Subject: [R] Date format in plot


Hello, I plot the abundance of a species in relation to the date. To have the
date as a continuous variable I put it in the format "standard" in Excel
(e.g. 39939 means 06.05.2009). R uses 39939 on the x axis, but I would like
to have "06.05". I tried to use as.Date as suggested in some discussions, but
I can't get it to work; the returned date is not correct. Do you have any
clue? Thank you


Re: [R] strange results in summary and IQR functions

2009-09-08 Thread Erik Iverson

It's all simply a matter of definitions, and there are many who disagree. See 
?quantile , specifically the "type" argument.  Since IQR does not appear to 
have a type argument, you could easily write your own versions of these that do 
what SAS does (assuming that is your goal).  

With x defined as you have it, look at the results of this function call, which 
shows the different values for quantile that you get by using different "type" 
arguments.

> sapply(1:9, function(y) quantile(x, type = y))

     [,1] [,2] [,3] [,4] [,5]  [,6]  [,7]     [,8]    [,9]
0%      2    2    2  2.0    2  2.00  2.00  2.00000  2.0000
25%    11   11    4  7.5   11  9.25 11.25 10.41667 10.5625
50%    13   14   13 13.0   14 14.00 14.00 14.00000 14.0000
75%    31   31   31 31.0   31 32.50 31.00 31.50000 31.3750
100%   47   47   47 47.0   47 47.00 47.00 47.00000 47.0000
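
A minimal sketch of such a home-made version; the name iqr_type is made up for illustration. Going by the table above, type = 2 reproduces the 11 and 31 that SAS reported for these data, but whether that matches SAS in general is an assumption to check against ?quantile.

```r
x <- c(2, 4, 11, 12, 13, 15, 31, 31, 37, 47)

## interquartile range with a selectable quantile definition
iqr_type <- function(x, type = 7, na.rm = FALSE) {
  q <- quantile(x, c(0.25, 0.75), type = type, na.rm = na.rm, names = FALSE)
  q[2] - q[1]
}

iqr_default <- iqr_type(x)         # same definition as summary() and IQR()
iqr_type2 <- iqr_type(x, type = 2) # based on Q1 = 11 and Q3 = 31
```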


Best,
Erik Iverson 


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Chunhao Tu
Sent: Tuesday, September 08, 2009 10:09 AM
To: r-help@r-project.org
Subject: [R] strange results in summary and IQR functions


Dear R users,
Something is strange in summary and IQR. Suppose, I have a data set and I
would like to find the Q1, Q2, Q3 and IQR.  
 
x<-c(2,4,11,12,13,15,31,31,37,47)
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2.00   11.25   14.00   20.30   31.00   47.00 
> IQR(x)
[1] 19.75
However, I test the same data set in SAS "proc univariate", and SAS shows
that Q1=11, Q2=14 and Q3=31. I think most of us agree that Q1 is 11 not
11.25. 

Could someone please explain to me why R shows Q1=11.25 not 11?

Many Thanks
Tu



Re: [R] Count number of different patterns (Polytomous variable)

2009-09-08 Thread Erik Iverson

If your data.frame was called "test" below, 

nrow(unique(test)) 

would do what you want, I believe. 
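
A quick sketch with the three rows from the question, typed in by hand as a data.frame called test:

```r
test <- data.frame(V1 = c(0, 1, 1),
                   V2 = c(3, 2, 2),
                   V3 = c(1, 0, 0))

## unique() on a data.frame drops repeated rows
n_patterns <- nrow(unique(test))

## equivalent: count the rows that are not duplicates of an earlier row
n_patterns2 <- sum(!duplicated(test))
```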

Erik 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of "Biedermann, Jürgen"
Sent: Tuesday, September 08, 2009 9:24 AM
To: r-help@r-project.org
Subject: [R] Count number of different patterns (Polytomous variable)

Hi there,

Does anyone know a method to calculate the number of different patterns 
in a given data frame. The variables are of polytomous type and not 
binary (for the latter i found a package called "countpattern" which 
unfortunately only functions for binary variables).

V1   V2   V3
0   3   1
1   2   0
1   2   0

So, in this case, i would like to get "2" as output.


Re: [R] The code behind the function

2009-09-09 Thread Erik Iverson

--
How can I see the code behind the function. For example,

> boxplot
function (x, ...) 
UseMethod("boxplot")


I really would like to see how people code this.
--

That *is* the code.  "boxplot" is a generic function, which calls another 
function based on the class of the object you pass to it. You can find out which 
classes boxplot has methods for by using 

> methods("boxplot") 

[1] boxplot.default  boxplot.formula*

   Non-visible functions are asterisked

Next, try typing 

> boxplot.default

to see the implementation for the default function, and then since 
boxplot.formula is hidden, usually

getAnywhere("boxplot.formula") 

will retrieve it.
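
Put together, the steps might look like the sketch below. getS3method() from the utils package is shown as an alternative to getAnywhere(); which methods are listed or hidden may differ between R versions.

```r
## list the methods registered for the generic
m <- methods("boxplot")

## fetch a method by generic and class, whether or not it is exported
f <- getS3method("boxplot", "formula")

## search everywhere, including namespaces, for an object of that name
ga <- getAnywhere("boxplot.formula")
```

Printing f or ga at the prompt then shows the source of the method.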

Section 10.9 of An Introduction to R explains this more:

http://cran.r-project.org/doc/manuals/R-intro.html#Object-orientation

Best Regards,
Erik Iverson   


Re: [R] Joining Characters in R {issue with paste}

2009-09-09 Thread Erik Iverson

And did you read the help file, ?paste , paying attention to the arguments and 
their descriptions, specifically the "sep" argument?  Presumably, you want, 

paste(a, b, sep = "")
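
For completeness, a short sketch with the values from the question, showing the default separator next to the empty one:

```r
a <- "Bio"
b <- "iology"

with_space <- paste(a, b)          # default sep is a single space
no_space <- paste(a, b, sep = "")  # nothing between the pieces
```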

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Abhishek Pratap
Sent: Wednesday, September 09, 2009 4:09 PM
To: r-help@r-project.org
Subject: [R] Joining Characters in R {issue with paste}

Hi Guys
I want to join two strings in R. I am using paste but not getting the
desired result.

For the sake of clarity, a quick example:

> a="Bio"
> b="iology"
> paste(a,b)
[1] "Bio iology"

There is a SPACE in the word biology, which is what I don't want.

Thanks,
-Abhi


Re: [R] Help with for loop

2009-09-14 Thread Erik Iverson

It is difficult to know what you're trying to do here, I think.  Is this it? 
You almost surely don't need a for loop to accomplish your task, and should 
make use of the pre-existing vectorized functions provided to you. 

a <- c(4, 5, 1, 7, 8, 12, 39)
b <- c(3, 7, 8, 4, 7, 25, 78)
d <- a - b

which(d > 0)
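
If the values are wanted along with their positions, logical indexing does both without a loop. A small sketch using the same vectors (pos and positive are made-up names):

```r
a <- c(4, 5, 1, 7, 8, 12, 39)
b <- c(3, 7, 8, 4, 7, 25, 78)
d <- a - b

pos <- which(d > 0)  # positions in d where the difference is positive
positive <- d[pos]   # the differences at those positions
```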

Erik 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Edward Chen
Sent: Monday, September 14, 2009 1:50 PM
To: r-help@r-project.org
Subject: [R] Help with for loop

I have a code:
a = c(4,5,1,7,8,12,39)
b = c(3,7,8,4,7,25,78)
d = a-b
for(i in 1:length(d)){
  if(d[i]>0){
    x = list(d[i])
    print(x)
  } else {
    y = list(d[i])
    print(y)
  }
}

the results are:

[[1]]
[1] 1

[[1]]
[1] -2

[[1]]
[1] -7

[[1]]
[1] 3

[[1]]
[1] 1

[[1]]
[1] -13

[[1]]
[1] -39


which will tell me what d is. But is it possible to output the positions at
which the differences occur in the vector d?
For example, I would want to see x = 1,3,1 and that they are from d[1], d[4],
d[5].
This is just a crude example I thought of to help me do something more
complicated.

Thank you very much!
-- 
Edward Chen
Email: edche...@gmail.com
Cell Phone: 510-371-4717


Re: [R] average for files and graph

2009-09-14 Thread Erik Iverson

It would be even greater if you could get us started with some commented, 
minimal, self-contained, reproducible code.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of kylle345
Sent: Monday, September 14, 2009 1:35 PM
To: r-help@r-project.org
Subject: [R] average for files and graph

Hi,

I have 5 different files.  Each file has about 1000 columns and about 3000
rows.  Basically what I want to do is take the average of the 1000 columns
and then graph it (line graph).  How would I do this for 5 files at the same
time and plot the average of the 5 files into one graph.

it would be great if you can get me started with a quick code.

thanks

kylle

Re: [R] Putting together a constantly evolving package

Steve,

I can't speak to your exact question, but I can perhaps suggest a simple 
alternative.  What I do is simply make changes to the .R file containing my 
code, and use the "source" function to read in the new definitions of my 
functions while I'm tweaking them.  Then, at the end of the day, I do my 
R CMD INSTALL just once.  If you use ESS, this is particularly easy since you 
can just type C-c C-l to source the current .R file into a running *R* process. 

Also, see the Examples section of ?source for a function that sources a bunch 
of files, presumably all the .R files in your package, at once. 
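
Something along these lines, as a sketch only: source_dir is a made-up name, loosely modelled on the idea of the example in ?source, and the path in the usage note is hypothetical.

```r
## source every .R file found in a directory
source_dir <- function(path, ...) {
  files <- list.files(path, pattern = "\\.[Rr]$", full.names = TRUE)
  for (f in files) {
    source(f, ...)  # definitions land in the workspace by default
  }
  invisible(files)
}
```

Calling source_dir("MyPackage/R") after each round of edits would then refresh the function definitions without reinstalling the package.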

I hope that might help,
Erik 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Steve Lianoglou
Sent: Tuesday, September 15, 2009 2:08 PM
To: R-help@r-project.org
Subject: [R] Putting together a constantly evolving package

Hi all,

I'm putting together some common code + data into a custom package,  
everything is working out fine, but the ``R CMD INSTALL MyPackage``  
call seems to take a particularly long time in the "**data" step:

$ R CMD INSTALL MyPackage/
* installing to library '/Library/Frameworks/R.framework/Resources/library'
* installing *source* package 'MyPackage' ...
** R
** data

(here)

I have a handful of not-very-big *.rda files in my data dir, but also  
a rather large sqlite db.

Is R trying to do anything in particular to my data during the  
install? index or something? Is there anything I can do to make this  
step go faster?

If this were a 1-time install, it wouldn't matter, but since this  
package is evolving as I'm using it, I find myself constantly needing  
to tweak some code here, or change something there, and this always  
requires another round of R CMD INSTALLing ...

Is there something I can do to make this cycle turn around quicker?  
How do you guys deal with growing a package organically during your  
analyses?

For this particular situation, I reckon I can create a separate  
package for my dataset since its static (and I might do eventually  
down the road, anyway), but I'm wondering if there are other  
alternatives.

Thanks,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
   |  Memorial Sloan-Kettering Cancer Center
   |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] Viewing Function Code

See the reference to ?getAnywhere in the following post:

http://www.nabble.com/The-code-behind-the-function-td25370743.html

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Michael Pearmain
Sent: Tuesday, September 15, 2009 3:14 PM
To: r-help@r-project.org
Subject: [R] Viewing Function Code

Hi All,
I'd like to see the function code behind the barplot2() function in the
gplots package; however, I've come across a stumbling block in the form of a
hidden function. Can anyone help?

> library(gplots)
> methods(barplot2)
[1] barplot2.default*

   Non-visible functions are asterisked
> barplot2
function (height, ...)
UseMethod("barplot2")

Mike

-- 
Michael Pearmain
Senior Analytics Research Specialist

"I abhor averages.  I like the individual case.  A man may have six meals
one day and none the next, making an average of three meals per day, but
that is not a good way to live.  ~Louis D. Brandeis"

Google UK Ltd
Belgrave House
76 Buckingham Palace Road
London SW1W 9TQ
United Kingdom
t +44 (0) 2032191684
mpearm...@google.com


Re: [R] How to remove 'NA's?

Well, how about the nomatch argument to the match function, see ?match . The 
nomatch argument is NA by default.
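
For example, a sketch with the vectors from the question. Using 0 as the nomatch value is just one choice; it is convenient because zero indices are silently dropped when subsetting.

```r
idx <- match(c(3, 4), c(3, 2, 1), nomatch = 0)

## keep only the positions that were actually found
idx_found <- idx[idx > 0]
```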

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Peng Yu
Sent: Tuesday, September 15, 2009 4:05 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] How to remove 'NA's?

Hi,

> match(c(3,4), c(3,2,1))
[1]  1 NA

The above result has 'NA' in it. Is there a way to make 'match' not
produce any 'NA's?

Regards,
Peng


Re: [R] R console line-wrapping

See ?options, particularly the "width" setting.  

> options(width=200)

Might do what you want, by default it is 80... 
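
If the wider setting is only wanted for one table, the old value can be saved and restored afterwards. A sketch (the variable names are made up):

```r
old <- options(width = 200)  # options() returns the previous settings
w <- getOption("width")      # now 200

## ... print the wide tables here ...

options(old)                 # put the previous width back
```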

Best,
Erik Iverson 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Nick Matzke
Sent: Tuesday, September 15, 2009 4:47 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] R console line-wrapping

Hi all, a quick question I couldn't find the answer to in the usual places:

Is there a way to turn off line-wrapping in the R console?  Or set the 
line width-before-wrapping manually?  Currently it looks like the 
console linewraps after about 70 characters, this occurs even if I 
increase the window size.

(I want to output some simple tables to screen for students in a 
computer lab course)

I am running R2.9, i.e. R.app, in Mac OS X.

Cheers!
Nick

-- 

Nicholas J. Matzke
Ph.D. Candidate, Graduate Student Researcher
Huelsenbeck Lab
Center for Theoretical Evolutionary Genomics
4151 VLSB (Valley Life Sciences Building)
Department of Integrative Biology
University of California, Berkeley

Lab websites:
http://ib.berkeley.edu/people/lab_detail.php?lab=54
http://fisher.berkeley.edu/cteg/hlab.html
Dept. personal page: 
http://ib.berkeley.edu/people/students/person_detail.php?person=370
Lab personal page: http://fisher.berkeley.edu/cteg/members/matzke.html
Lab phone: 510-643-6299
Dept. fax: 510-643-6264
Cell phone: 510-301-0179
Email: mat...@berkeley.edu

Mailing address:
Department of Integrative Biology
3060 VLSB #3140
Berkeley, CA 94720-3140

-
"[W]hen people thought the earth was flat, they were wrong. When people 
thought the earth was spherical, they were wrong. But if you think that 
thinking the earth is spherical is just as wrong as thinking the earth 
is flat, then your view is wronger than both of them put together."

Isaac Asimov (1989). "The Relativity of Wrong." The Skeptical Inquirer, 
14(1), 35-44. Fall 1989.
http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm


Re: [R] T-test to check equality, unable to interpret the results.

2009-09-16 Thread Erik Iverson

Robert, 

We unfortunately do not have enough information to help you interpret the 
results, and this is not really an R question at all, but general statistical 
advice.  You will probably have much better understanding and confidence in 
your results by consulting a local statistical consultant at your university.

The values shown for sample 1 and sample 2 lead me to believe that these are 
not drawn from a homogeneous population.  Whoever ends up helping you is going 
to need to know how these measurements were obtained in much greater detail 
than you've given here. 

Best Regards,
Erik Iverson 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Robert Hall
Sent: Wednesday, September 16, 2009 1:55 PM
To: r-help
Subject: [R] T-test to check equality, unable to interpret the results.

Hi,
I have the precision values of a system on two different data sets.
The snippets of these results are as shown:

sample1: (total 194 samples)
0.600238
0.800119
0.600238
0.200030
0.600238
...
...

sample2: (total 188 samples)
0.8001
0.2000
0.8001
0.
0.8001
0.4001
...
...

I want to check if these results are statistically significant. Intuitively,
the similarity in the two results means the results are statistically
significant.
I am using the t-test t.test(sample1,sample2) to check for similarity between
the two results.
I get the following output:

---
Welch Two Sample t-test

data:  s1p5 and s2p5
t = 0.9778, df = 374.904, p-value = 0.3288
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.03170059  0.09441172
sample estimates:
mean of x mean of y
0.5138298 0.4824742


I believe the t-test checks for a difference between the two sets, and a p-value
< 0.05 means the two sets are statistically different. Here, while checking
for dissimilarity, the p-value is 0.3288. Does a higher p-value (when t.test
checks for dissimilarity) mean that the results are more similar (which is the
case above, as the means of the results are very close)?
Please help me interpret the results.
thanks in advance!

--
Rob Hall
Masters Student
ANU


Re: [R] apply function across two variables by mult factors

2009-09-16 Thread Erik Iverson

Hello, 

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Jon Loehrke
> Sent: Wednesday, September 16, 2009 2:23 PM
> To: r-help@r-project.org
> Subject: [R] apply function across two variables by mult factors
> 
> Greetings,
> 
> I am attempting to run a function, which produces a vector and
> requires two input variables, across two nested factor levels.  I can
> do this using by(X, list(factor1, factor2), function), however I
> haven't found a simple way to extract the list output into an
> organized vector form.  I can do this using nested loops but it isn't
> exactly an optimal approach.
> 
> Thank you for any and all suggestions.  Jon
> 
> # example data frame
> testDF<-data.frame(
>   x=rnorm(12),
>   y=rnorm(12),
>   f1=gl(3,4),
>   f2=gl(2,2,12))
> 

Try this using lapply, split, and mapply.  Maybe the output object is nicer 
for you?

testFun2 <- function(x, y) {
  X <- abs(x);
  Y <- abs(y);
  as.numeric(paste(round(X), round(Y), sep='.'))
}

lapply(split(testDF, list(testDF$f1, testDF$f2)),
   function(x) mapply(testFun2, x[1], x[2]))
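
A minimal sketch of one way to get the per-group results back into the row order of the data frame, assuming (as in the example) that the function is vectorized over its two arguments. The seed and the column name "value" are illustrative choices, and unsplit() is a swapped-in technique rather than part of the reply above:

```r
set.seed(42)  # arbitrary seed, only so the example is repeatable
testDF <- data.frame(x = rnorm(12), y = rnorm(12),
                     f1 = gl(3, 4), f2 = gl(2, 2, 12))

testFun2 <- function(x, y) {
  as.numeric(paste(round(abs(x)), round(abs(y)), sep = "."))
}

groups <- list(testDF$f1, testDF$f2)

# One numeric vector per f1/f2 combination
res <- lapply(split(testDF, groups), function(d) testFun2(d$x, d$y))

# unsplit() reverses split(), restoring the original row order
testDF$value <- unsplit(res, groups)
testDF
```

This gives the same column the nested loop in the original post fills in, without looping over the factor levels by hand.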



> # example function [trivial]
> testFun<-function(x){
>   X<-abs(x[,1]);
>   Y<-abs(x[,2]);
>   as.numeric( paste(round(X), round(Y), sep='.'))
>   }
> 
> # apply by factor levels but hard to extract values
> by(testDF[,1:2], list(testDF$f1, testDF$f2), testFun)
> 
> # Loop works, but not efficient for large datasets
> testDF$value<-NA
> for(i in levels(testDF$f1)){
>   for(j in levels(testDF$f2)){
>   testDF[testDF$f1==i & testDF$f2==j,]$value<-
> testFun(testDF[testDF
> $f1==i & testDF$f2==j,1:2])
>   }
>   }
> testDF
> sessionInfo()
> #R version 2.9.1 Patched (2009-08-07 r49093)
> #i386-apple-darwin8.11.1
> #
> #locale:
> #en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> #
> #attached base packages:
> #[1] stats graphics  grDevices utils datasets  methods   base
> 
> 
> Jon Loehrke
> Graduate Research Assistant
> Department of Fisheries Oceanography
> School for Marine Science and Technology
> University of Massachusetts
> 200 Mill Road, Suite 325
> Fairhaven, MA 02719
> jloeh...@umassd.edu
> T 508-910-6393
> F 508-910-6396
> 
> 

Re: [R] apply function across two variables by mult factors

2009-09-16 Thread Erik Iverson

One correction below, 

---snip---

> >
> > # example data frame
> > testDF<-data.frame(
> > x=rnorm(12),
> > y=rnorm(12),
> > f1=gl(3,4),
> > f2=gl(2,2,12))
> >
> 
> Try this using lapply, split, mapply?  Maybe it is in a nicer output
> object for you?
> 
> testFun2 <- function(x, y) {
>   X <- abs(x);
>   Y <- abs(y);
>   as.numeric(paste(round(X), round(Y), sep='.'))
> }
> 
> lapply(split(testDF, list(testDF$f1, testDF$f2)),
>    function(x) mapply(testFun2, x[1], x[2]))
> 

Or use "list indexing" in the mapply call to get a vector, in this case at 
least...

lapply(split(testDF, list(testDF$f1, testDF$f2)),
   function(x) mapply(testFun2, x[[1]], x[[2]]))

---snip---
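
The difference between the two calls can be seen on a tiny data frame. This is only a sketch; the two-row data frame d is made up for illustration:

```r
testFun2 <- function(x, y) {
  as.numeric(paste(round(abs(x)), round(abs(y)), sep = "."))
}

d <- data.frame(x = c(-1.2, 0.4), y = c(2.6, -0.7))  # made-up values

# d[1] is a one-column data frame (a list of length 1), so mapply makes a
# single call on the whole columns and simplifies to a 2 x 1 matrix
a <- mapply(testFun2, d[1], d[2])

# d[[1]] is the numeric column itself, so mapply makes one call per row
# and simplifies to a plain numeric vector
b <- mapply(testFun2, d[[1]], d[[2]])

dim(a)  # 2 1
b       # 1.3 0.1
```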


Re: [R] generating unordered combinations

2009-09-17 Thread Erik Iverson

Dan, 

Still maybe a bit ugly, but no looping...

> unique(as.data.frame(t(apply(expand.grid(0:2, 0:2, 0:2), 1, sort))))
   V1 V2 V3
1   0  0  0
2   0  0  1
3   0  0  2
5   0  1  1
6   0  1  2
9   0  2  2
14  1  1  1
15  1  1  2
18  1  2  2
27  2  2  2
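
For larger sets, a sketch of an alternative that never builds the full expand.grid. It is not the approach above: it relies on combn() and the usual correspondence between strictly increasing index sets and non-decreasing ones. The helper name multisets is made up for this example:

```r
# Hypothetical helper: all unordered size-k selections (with repetition)
# drawn from 'values'
multisets <- function(values, k) {
  n <- length(values)
  # Strictly increasing index sets of size k taken from 1..(n + k - 1)
  idx <- combn(n + k - 1, k)
  # Subtracting 0, 1, ..., k-1 within each column turns them into
  # non-decreasing indices in 1..n
  idx <- idx - (seq_len(k) - 1)
  as.data.frame(t(matrix(values[idx], nrow = k)))
}

multisets(0:2, 3)
```

For the 0:2 example this should return the same 10 rows as the expand.grid version, and the number of rows in general is choose(n + k - 1, k).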

Best,
Erik 

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Dan Halligan
> Sent: Thursday, September 17, 2009 3:31 PM
> To: r-help@r-project.org
> Subject: [R] generating unordered combinations
> 
> Hi,
> 
> I am trying to generate all unordered combinations of a set of
> numbers / characters, and I can only find a (very) clumsy way of doing
> this using expand.grid.  For example, all unordered combinations of
> the numbers 0, 1, 2 are:
> 0, 0, 0
> 0, 0, 1
> 0, 0, 2
> 0, 1, 1
> 0, 1, 2
> 0, 2, 2
> 1, 1, 1
> 1, 1, 2
> 1, 2, 2
> 2, 2, 2
> 
> (I have not included, for example, 1, 0, 0, since it is equivalent to
> 0, 0, 1).
> 
> I have found a way to generate this data.frame using expand.grid as
> follows:
> 
> g <- expand.grid(c(0,1,2), c(0,1,2), c(0,1,2))
> for(i in 1:nrow(g)) {
>   g[i,] <- sort(as.character(g[i,]))
> }
> o <- order(g$Var1, g$Var2, g$Var3)
> unique(g[o,]).
> 
> This is obviously quite clumsy and hard to generalise to a greater
> number of characters, so I'm keen to find any other solutions.  Can
> anyone suggest a better (more general, quicker) method?
> 
> Cheers
> 

Re: [R] Basic function output/scope question

Hello, 

> 
> testfunc<-function(x)
> { y<-10
> print(y)
> print(x)
> }
> 
> testfunc(4)
> 
> The variables x and y are accessible during execution of the function
> "testfunc" but not afterwards.  

In R, expressions return values.  When you define a function, ?function says 
that, "If the end of a function is reached without calling 'return', the value 
of the last evaluated expression is returned." 

So you are correct, 'x' and 'y' are local variables, and by all accounts they 
should be.  If you want their values accessible, simply return them.

## return just y
testfunc2 <- function(x) {
   y <- 10
   y
}

## return both x and y 
testfunc2 <- function(x) {
   y <- 10
   list(x, y)
}
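
A small sketch of how the returned values can then be used. Naming the list elements is an addition of this example (testfunc3 is a made-up name), not something the code above does:

```r
testfunc3 <- function(x) {
  y <- 10
  list(x = x, y = y)   # the last evaluated expression is the return value
}

out <- testfunc3(4)
out$x   # 4
out$y   # 10
```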

There are ways to make x and y global from within a function, but in general 
that is not the R way to do things! 

Hope that helps, 
Erik Iverson 


Re: [R] Basic function output/scope question

I see you've already been told about "<<-".  The reason I (and presumably 
others?) stay away from that construct is that your function then has side 
effects that a user (even yourself) may not anticipate or want, namely possibly 
overwriting a previous variable in your global environment.  The help for ?<<- 
says, 

"The operators '<<-' and '->>' cause a search to made through the
 environment for an existing definition of the variable being
 assigned.  If such a variable is found (and its binding is not
 locked) then its value is redefined, otherwise assignment takes
 place in the global environment."

So this is not even guaranteed to assign in the Global Environment!!
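
A minimal sketch of that point, using a made-up counter function. Because a variable named count already exists in the enclosing function, "<<-" changes that one and nothing is created in the global environment:

```r
# Hypothetical example: make_counter and count are illustrative names
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # finds 'count' in make_counter's environment
    count
  }
}

counter <- make_counter()
counter()   # 1
counter()   # 2

# No 'count' variable was created in the global environment
exists("count", envir = globalenv(), inherits = FALSE)
```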

 To follow-up on my previous email, you usually assign the results of your 
function call to a variable if you want to use them further. For example, 

## return both x and y
testfunc2 <- function(x) {
   y <- 10
   list(x, y)
}

my.var <- testfunc2(4)

another.function(my.var)

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Erik Iverson
> Sent: Monday, September 21, 2009 10:43 AM
> To: David Young; r-help@r-project.org
> Subject: Re: [R] Basic function output/scope question
> 
> Hello,
> 
> >
> > testfunc<-function(x)
> > { y<-10
> > print(y)
> > print(x)
> > }
> >
> > testfunc(4)
> >
> > The variables x and y are accessible during execution of the function
> > "testfunc" but not afterwards.
> 
> In R, expressions return values.  When you define a function, ?function
> says that, "If the end of a function is reached without calling 'return',
> the value of the last evaluated expression is returned."
> 
> So you are correct, 'x' and 'y' are local variables, and by all accounts
> they should be.  If you want their values accessible, simply return them.
> 
> ## return just y
> testfunc2 <- function(x) {
>    y <- 10
>    y
> }
> 
> ## return both x and y
> testfunc2 <- function(x) {
>    y <- 10
>    list(x, y)
> }
> 
> There are ways to make x and y global from within a function, but in
> general that is not the R way to do things!
> 
> Hope that helps,
> Erik Iverson
> 

Re: [R] missing level of a nested factor results in an NA in lm output

>  > estimable(fit, myEstimate)
>  Estimate Std. Error  t value DF Pr(>|t|)
> test 12.18198  0.6694812 18.19615 10 5.395944e-09

Where are you getting this "estimable" function from?  A package?  Did you 
define it yourself?  


Re: [R] More elegant way of excluding rows with equal values in any 2 columns?

Hello, 

Do you mean exactly 2 columns?  What if the value is equal in more than 2 
columns? 

> 
> I built a data frame "grid" (below) with 4 columns. I want to exclude
> all rows that have equal values in ANY 2 columns. Here is how I am
> doing it:
> 
> index<-expand.grid(1:4,1:4,1:4,1:4)

If you want to drop every row in which a value appears in 2 or more columns, 
i.e., is duplicated within the row, then the following should work, assuming 
index can be coerced to a matrix by apply ... 

t3 <- index[apply(index, 1, function(x) all(!duplicated(x))),]
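
A quick sanity check on the idea, written as a sketch with anyDuplicated() swapped in for all(!duplicated(x)). With four columns each ranging over 1:4, the rows that survive should be exactly the 4! = 24 permutations:

```r
index <- expand.grid(1:4, 1:4, 1:4, 1:4)

# anyDuplicated() returns 0 when no value repeats within the row
keep <- apply(index, 1, function(row) !anyDuplicated(row))
t3 <- index[keep, ]

nrow(index)  # 256
nrow(t3)     # 24
```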


Re: [R] More elegant way of excluding rows with equal values in any 2 columns?

It probably does... I did not look at his until just now, my guess is they are 
equivalent.  There are usually at least a couple ways to do things in R, no 
problem :).  With massive datasets, it might make sense to try a couple 
different ways to see if one or the other is faster though.  

You could also replace "!duplicated" in my function with "unique" ... 

Erik 

> -----Original Message-----
> From: Dimitri Liakhovitski [mailto:ld7...@gmail.com]
> Sent: Monday, September 21, 2009 2:02 PM
> To: Erik Iverson
> Cc: R-Help List
> Subject: Re: [R] More elegant way of excluding rows with equal values in
> any 2 columns?
> 
> Thank you very much, both Dimitris and Erik.
> Erik - you are right, I was trying to remove any duplication (i.e., if
> there are the same values in 2 or 3 or 4 columns).
> And it looks like that's what your solution does.
> But doesn't it do the same thing as Dimitris' solution?
> 
> Dimitri
> 
> On Mon, Sep 21, 2009 at 2:55 PM, Erik Iverson  wrote:
> > Hello,
> >
> > Do you mean exactly any 2 columns.  What if the value is equal in more
> than 2 columns?
> >
> >>
> >> I built a data frame "grid" (below) with 4 columns. I want to exclude
> >> all rows that have equal values in ANY 2 columns. Here is how I am
> >> doing it:
> >>
> >> index<-expand.grid(1:4,1:4,1:4,1:4)
> >
> > If a value is equal in 2 or more rows, i.e., duplicated, then the
> following should work, assuming index can be changed to a matrix for apply
> ...
> >
> > t3 <- index[apply(index, 1, function(x) all(!duplicated(x))),]
> >
> 
> 
> 
> --
> Dimitri Liakhovitski
> Ninah.com
> dimitri.liakhovit...@ninah.com


Re: [R] More elegant way of excluding rows with equal values in any 2 columns?

> 
> You could also replace "!duplicated" in my function with "unique" ...
> 
It turns out you can't, of course :). 
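
The likely reason, sketched below: unique() returns the distinct values themselves rather than a logical vector, so it is not a drop-in replacement inside all(). A version based on unique() has to compare lengths instead (no_repeats is a made-up name):

```r
# Hypothetical helper: TRUE when no value occurs more than once in x
no_repeats <- function(x) length(unique(x)) == length(x)

x <- c(1, 2, 2, 4)
all(!duplicated(x))   # FALSE
no_repeats(x)         # FALSE
no_repeats(1:4)       # TRUE
```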


Re: [R] Working around 256 byte variable names? + trouble opening large file


> I did just try to do that, and it is still returning the same error when I
> try to attach the csv file..
> 
> > vc1<-read.table("P:\\R\\Everything-I.csv",header=T, sep=" ", dec=".",
> na.strings=NA, strip.white=T)
> > attach(vc1)
> Error in attach(vc1) : variable names are limited to 256 bytes
> 
> Each variable name is only 5 to 6 characters long, but I'm sure you're
> right about R reading the entire header line as one variable.
> I cannot figure out though, how to stop it from doing so.
> 
> sep=" ", or sep="," do not seem to work either, though I don't know if it
> is the right thing to be trying.
> 

You will certainly need to specify the correct "sep" argument, and if it is 
truly a CSV file, that is NOT " ", but ",".  Why don't you paste/attach the 
first few lines of your .csv file for inspection if all else fails?  What about 
the unmatched quote idea from a previous post? 
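
A small sketch of the effect of the "sep" argument, using a made-up three-column file written to a temporary location (the column names and values are not from the original data):

```r
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,height,weight",
             "1,170,65",
             "2,182,80"), tmp)

# Wrong separator: each line is read as a single field, so the whole
# header becomes one long variable name
wrong <- read.table(tmp, header = TRUE, sep = " ")
names(wrong)

# Correct separator for a comma-separated file
right <- read.table(tmp, header = TRUE, sep = ",")
names(right)   # "id" "height" "weight"
```

With a real header of several hundred columns, that single merged name is presumably what runs past the 256-byte limit.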


Re: [R] Linear Model "NA" Value Test

> if("fit$coef[[2]]" == "NA") {.cw = 1}

See ?is.na 
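
A minimal sketch of the is.na() approach. The data are made up so that the second predictor is perfectly collinear with the first, which is one way lm ends up reporting an NA coefficient; .cw is the variable name from the question:

```r
d <- data.frame(y = c(1.1, 2.3, 2.9, 4.2), x1 = 1:4)
d$x2 <- 2 * d$x1            # perfectly collinear with x1

fit <- lm(y ~ x1 + x2, data = d)
coef(fit)                   # the x2 coefficient is NA

# Test for NA with is.na(), never with == "NA"
.cw <- if (is.na(coef(fit)[["x2"]])) 1 else 0
.cw                         # 1
```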


Re: [R] How to use a string to refer a function?

2010-02-11 Thread Erik Iverson


Hint:

"somebody let me know how to >>>get<<< the function from 
the name 'f'?"
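
Spelled out as a short sketch (f and fname are placeholder names):

```r
f <- function(x) x^2
fname <- "f"

# get() looks the object up by name; mode = "function" skips non-functions
g <- get(fname, mode = "function")
g(3)                      # 9

# do.call() also accepts the function name as a string
do.call(fname, list(3))   # 9
```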



Re: [R] Which method is called when print(a_list)?

2010-02-15 Thread Erik Iverson




blue sky wrote:

I don't find print.list. Could somebody let me know which method is
called when I run command print(a_list), where a_list is a list? Is
'print.default' used for printing a list?


Yes. You can always debug functions to investigate what's going on, too. 
 See ?debug.



Re: [R] How to check the two different nulls?

2010-02-15 Thread Erik Iverson


blue sky wrote:

x=list(a=1,b=NULL)
is.null(x$b)
is.null(x$c)

Both the above two commands give me TRUE, but in the first one, b is
NULL, in the second one, c doesn't exist. Are there functions that can
help me distinguish the two different nulls?

  

?names
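
That is, something along these lines should tell the two cases apart:

```r
x <- list(a = 1, b = NULL)

is.null(x$b)        # TRUE
is.null(x$c)        # TRUE

# names() distinguishes "present but NULL" from "not there at all"
"b" %in% names(x)   # TRUE
"c" %in% names(x)   # FALSE
```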


Re: [R] delete repeated values - not unique...

2010-02-16 Thread Erik Iverson

Well, can you algorithmically describe what you are trying to do? Your 
example is not sufficient to determine it.  For instance, are you trying to:


1) remove repeated elements of a vector and concatenate the first 
element at the end?


2) remove repeated elements of a vector and concatenate the minimum 
element at the end?


3) always return the vector c(4, 5, 6, 4) ?

4) something else?



jorgusch wrote:

Hello,

I must be blind not to see it, but I have the following vector:

4
4
5
6
6
4

What I would like to have as a result is:

4
5
6
4

All repeated values are gone. I cannot use unique for this, as the second 4
would disappear. Is there another fast function for this problem?

Thanks in advance!




Re: [R] delete repeated values - not unique...

2010-02-16 Thread Erik Iverson


Ah, the request was 'hidden' in the subject of the message, apologies!
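
Assuming the goal is to collapse consecutive repeats (my reading of the subject line), a sketch using rle(), which is not something proposed in the thread itself:

```r
x <- c(4, 4, 5, 6, 6, 4)

# rle() encodes runs of equal values; $values holds one entry per run
rle(x)$values              # 4 5 6 4

# Equivalent for numeric vectors: keep an element when it differs
# from its predecessor
x[c(TRUE, diff(x) != 0)]   # 4 5 6 4
```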

Erik Iverson wrote:
Well, can you algorithmically describe what you are trying to do? Your 
example is not sufficient to determine it.  For instance, are you trying 
to:


1) remove repeated elements of a vector and concatenate the first 
element at the end?


2) remove repeated elements of a vector and concatenate the minimum 
element at the end?


3) always return the vector c(4, 5, 6, 4) ?

4) something else?



jorgusch wrote:

Hello,

I must be blind not to see it, but I have the following vector:

4
4
5
6
6
4

What I would like to have as a result is:

4
5
6
4

All repeated values are gone. I cannot use unique for this, as the 
second 4

would disappear. Is there another fast function for this problem?

Thanks in advance!




Re: [R] Keyboard

2010-02-16 Thread Erik Iverson


Steven Martin wrote:

All,

I installed R-2.10.1 with Readline=no.  Now for some reason R does not 
recognize some key strokes like the directional arrows.
I am not sure if Readline is the problem or not.

  
What particular OS are you using?  In many cases, there is a 
preconfigured package available to save you from compiling it yourself.



Re: [R] Use of R in clinical trials

2010-02-17 Thread Erik Iverson


Frank E Harrell Jr wrote:

Cody,

How amazing that SAS is still used to produce reports that reviewers 
hate and that requires tedious low-level programming.  R + LaTeX has 
it all over that approach IMHO.  We have used that combination very 
successfully for several data and safety monitoring reporting tasks 
for clinical trials for the pharmaceutical industry.


Frank


I used to work for a research group that also used R + LaTeX to produce 
DSMB reports for clinical trials.  If the DSMB members had only been 
exposed to SAS reports before, you could not get them to stop praising 
the quality of the R + LaTeX reports, even years into a trial.


Erik


Re: [R] Problems installing R-2.10.1 on Linux

2010-02-17 Thread Erik Iverson


Rhett Harrison wrote:

Hi,

I have been having problems installing the newest version on Linux
(Ubuntu 9.10) (tried on two machines).

The ./configure appears to work but I get the following error on the
'make' command.
  
Don't know about this, but if there's no particular reason you need to 
compile from source, just add the proper repository and install R from 
there...


http://cran.r-project.org/bin/linux/ubuntu/

Otherwise check out config.log and see what might have gone wrong...


Re: [R] variable substitution


Hello,

Jon Erik Ween wrote:

Hi

I would like to write a script that reads a list of variable names. These variable names 
are some of the column headers in a data.frame. Then I want do a for-loop to execute 
various operations on the specified variables in the data.frame, but can't figure out how 
to do the necessary variable substitution. In bash (or C) I would use "$var", 
but there seems to be no equivalent in R. The data.frame has 300 columns and 2500 rows. 
Not all columns are continuous variables, some are factors, descriptors, etc., so I use 
the varlist to pick which columns I want to analyze.

#Example script
varlist<-read.table(/path/to/varlist)
for (i in 1:length(varlist)){
res<-mean(Dataset$SOMETHINGHERE_i )
write(res) somewhere
}



In your script, perhaps

res <- mean(Dataset[[varlist[i]]])

would work.

Better might be:

colMeans(Dataset[varlist])

or if your actual function is not "mean":

sapply(Dataset[varlist], FUN)

where FUN is whatever function you want (e.g., mean, sd, ...)

and to avoid the "varlist" problem completely, say you only want to run 
it on numeric variables:


sapply(Dataset[sapply(Dataset, is.numeric)], FUN)

None of this is tested, but the ideas should work.
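
For what it is worth, here is a small sketch of those ideas on made-up data. It assumes varlist is a plain character vector of column names; if it comes from read.table it will be a data frame, and something like as.character(varlist[[1]]) would be needed first:

```r
# Made-up data standing in for the 300-column data frame
Dataset <- data.frame(age   = c(30, 40, 50),
                      score = c(1.5, 2.5, 3.5),
                      group = factor(c("a", "b", "a")))

# Assumed to be a plain character vector of column names
varlist <- c("age", "score")

# Loop version: [[ ]] looks a column up by the name stored in v
for (v in varlist) {
  cat(v, mean(Dataset[[v]]), "\n")
}

# Vectorized versions
col_means <- colMeans(Dataset[varlist])
col_sds   <- sapply(Dataset[varlist], sd)

# Every numeric column, no varlist needed
num_means <- sapply(Dataset[sapply(Dataset, is.numeric)], mean)
```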


Re: [R] subset() for multiple values


subset(df, x %in% c(...))
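
A sketch of the negated form the question asks for, on a made-up data frame. The "==" version misbehaves because the vector on the right is recycled element by element against ID instead of being treated as a set:

```r
# Made-up stand-in for NativeDominant.df
df <- data.frame(ID    = c("37-R17", "37-R18", "10-R1", "5-R2"),
                 cover = c(10, 20, 30, 40))

drop_ids <- c("37-R17", "10-R1")

# Keep every row whose ID is NOT in the set
kept <- subset(df, !(ID %in% drop_ids))
kept
```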

chipmaney wrote:

This code works:

subset(NativeDominant.df,!ID=="37-R17")


This code does not:

Tree.df<-subset(NativeDominant.df,!ID==c("37-R17","37-R18","10-R1","37-R21","37-R24","R7A-R1","3-R1","37-R16"))



how do i get subset() to work on a range of values?



Re: [R] Funny result from rep(...) procedure

Cannot reproduce, what is branches?  If you can narrow it down to a 
"commented, minimal, self-contained, reproducible" example, you're far 
more likely to get help from the list.


dkStevens wrote:

I'm observing odd behavior of the rep(...) procedure when using variables as
parameters in a loop. Here's a simple loop on a vector 'branches' that is
c(5,6,5,5,5). The statement in question is   
print(c(ni,rep(i,times=ni)))


that works properly first time through the loop but the second time, when
branches[2] = 6, only prints 5 values of i.

Any ideas, anyone?

  iInd = 1
  for(i in 1:length(branches)) {
print((1:branches[i])+iInd-1)   # iInd is a position shift of the index
ni = branches[i]
print(i)
print(ni)
print(c(ni,rep(i,times=ni)))
# ... some interesting other stuff for my project that gets wrecked because
of this issue
iInd = iInd + branches[i]
}

# first pass through loop
[1] 1 2 3 4 5 # branches[1] + iInd - 1 content
[1] 1# i value to repeat 5 times
[1] 5# ni = 5 on 1st pass
[1] 5 1 1 1 1 1   # five values - 1 1 1 1 1 - OK

# second pass through loop
[1]  6  7  8  9 10 11   # branches[2] + iInd - 1 content
[1] 2   # i value to repeat 6 times
[1] 6   # ni = 6 on 2nd pass
[1] 6 2 2 2 2 2  # 6 'twos' but only shows 5 'twos' 
print(c(ni,rep(i,times=ni))) - why not 6?






Re: [R] Funny result from rep(...) procedure




Erik Iverson wrote:
Cannot reproduce, what is branches?  If you can narrow it down to a 
"commented, minimal, self-contained, reproducible" example, you're far 
more likely to get help from the list.




My blind guess, though, is that it has something to do with FAQ 7.31.
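
If that guess is right (the post does not show how branches was computed, so this is only an assumption), the effect would look like the sketch below. The value of ni is made up to mimic a number that is a hair below 6 after floating-point arithmetic:

```r
ni <- 6 - 1e-9    # hypothetical result of some floating-point arithmetic

print(ni)         # displays as 6 at the default 7 significant digits
ni == 6           # FALSE

# Non-integer 'times' is truncated toward zero, so only 5 copies appear
length(rep(2, times = ni))          # 5

# Rounding first gives the intended count
length(rep(2, times = round(ni)))   # 6
```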


Re: [R] R error- "more columns than column names"

2010-02-24 Thread Erik Iverson

I had a comment character "#" in my header names earlier today that 
threw this error.
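
A sketch of how that can happen with plain read.table, on a made-up tab-separated file. Whether load.gwaa.data passes such options through is not something I know, so the example sticks to read.table:

```r
tmp <- tempfile()
writeLines(c("id\tsnp#1\tvalue\tgroup",
             "1\tA\t0.5\tx",
             "2\tB\t0.7\ty"), tmp)

# With the default comment.char = "#", the header is cut off after "snp",
# leaving 2 column names for 4 columns of data
res <- try(read.table(tmp, header = TRUE, sep = "\t"), silent = TRUE)
inherits(res, "try-error")   # TRUE

# Turning comment handling off reads the header in full
ok <- read.table(tmp, header = TRUE, sep = "\t", comment.char = "")
names(ok)                    # "id" "snp.1" "value" "group"
```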


Euphoria wrote:

Hi all! I am desperately trying to figure out the solution to this error, but
nothing as of yet is working.  


As noted in an earlier post I am using GenABEL.  In an attempt to read in
the phenotype file, in the format .dat, R keeps giving me the error "more
columns than column names"

I have tried to read in the data without the headers; I have also tried to
trim the data to remove any trailing tabs or spaces but it doesn't solve the
problem.  All missing values have been replaced with "NA", and all data
seems to have matching corresponding header value - each column has a
matching column name.

What could be the possible underlying problem? I have tried to problem-solve
but clearly I am at a loss. Thanks for your help! 


Code:
 mix <- load.gwaa.data (phe = "Z:/CCFPhenotypesTAB.dat", gen =
"pedmap-0.raw", force = T)



Re: [R] Subset Question

2010-02-25 Thread Erik Iverson


Chertudi wrote:

Hello helpful R folks,

   First off, please forgive my English.  Second, I'm new with R, I've
searched the archives about subsets, and I haven't found quite the help I
need.

I'm currently analysing a population survey whose data set has about 15000
households (the rows/observations) and 130 variables (the columns).  I've
managed to import the set into R as a data.frame called eu08.  Now, I'm
trying to look at all of the variables, but limited to one province in the
"region" variable.  I think the provinces are factors, and the province of
interest is labeled '3'.
I've tried the following:

region3=subset(eu08, region==3)
--this simply strips all of the rows from the columns, and I know that about
4000 of the observations are specific to region 3.  So does putting the 3 as
'3' and "3".

 Any help would be greatly appreciate.

  
Well, we don't know if it really is a factor.  You can determine that by 
doing...


class(eu08$region)

If it is a factor, then

levels(eu08$region)

should let you know what you can subset with. 


str(eu08) might also be good to look at...
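
As an illustration of why the levels matter, here is a sketch with a made-up eu08 in which the region labels are not bare numbers. The label "Region 3" is purely hypothetical; your levels() output will show the real ones:

```r
# Made-up stand-in for the survey data
eu08 <- data.frame(region = factor(c("Region 3", "Region 1", "Region 3")),
                   income = c(10, 20, 30))

class(eu08$region)    # "factor"
levels(eu08$region)   # "Region 1" "Region 3"

# No level is literally "3", so this matches nothing
nrow(subset(eu08, region == 3))   # 0

# Subsetting on the label exactly as shown by levels() works
region3 <- subset(eu08, region == "Region 3")
nrow(region3)                     # 2
```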

Erik


Re: [R] text editors


Dwayne Blind wrote:

Dear all,

Do you use a text editor ? What would you recommend for Windows users ? What
about Tinn-R ?



Dwayne,

Perhaps you have seen http://www.sciviews.org/_rgui/ , it has 
information on several possibilities.  It would be hard to pull me away 
from using Emacs with ESS (http://ess.r-project.org/), both on Windows 
and Linux.  I use Emacs for a lot of things now, but ESS was the gateway 
that helped me learn it.  The fact that there is always a version of 
Emacs on all the platforms I might be faced with helps a lot too.  I 
know nothing about Tinn-R, but my recollection is that people who use it 
seem to like it just fine.


Erik


Re: [R] R Experts


Hello,

Ryan Kinzer wrote:

I am trying to understand why R is working in a particular way.  I have a
data set with two variables; mark date (markd) and recap date (recapd).  I
would like to know the number of days between capture dates.  But if I
subtract recap date from mark date I often get the wrong results. 



Well, what are the classes of recapd and markd in your case?
Erik
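
To show why the class matters, a sketch that assumes the two variables were read in as character strings in year-month-day format. That is only a guess about your data, and the dates below are made up:

```r
markd  <- c("2009-05-01", "2009-06-15")
recapd <- c("2009-05-20", "2009-07-01")

class(markd)   # "character": subtraction is not meaningful here

# Convert to Date first; the difference is then a number of days
days <- as.numeric(as.Date(recapd) - as.Date(markd))
days           # 19 16
```

If the columns turn out to be factors or use another date layout, as.Date() would need as.character() and a format argument respectively.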


Re: [R] How to add a variable to a dataframe whose values are conditional upon the values of an existing variable

You mention ifelse, so for completeness, I will show you a solution that
should work with that. There are plenty of other possibilities
though, I am sure. The following is not tested.

Assume 'my.df' is your data.frame, containing a variable "DOW".

my.df$DOW1 <- ifelse(my.df$DOW == "SAT", 1,
              ifelse(my.df$DOW == "SUN", 2,
              ifelse(my.df$DOW == "MON", 3,
              ifelse(my.df$DOW == "TUE", 4,
              ifelse(my.df$DOW == "WED", 5,
              ifelse(my.df$DOW == "THU", 6,
              7))))))

(one closing ")" per ifelse call, six in total)

Erik
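
One of those other possibilities, sketched with match() on a made-up data frame. The order of the days vector defines the codes, so SAT is 1 and FRI is 7:

```r
# Made-up stand-in for the real data frame
my.df <- data.frame(DOW = c("MON", "SAT", "FRI", "SUN"))

days <- c("SAT", "SUN", "MON", "TUE", "WED", "THU", "FRI")

# match() returns the position of each DOW value within 'days'
my.df$DOW1 <- match(my.df$DOW, days)
my.df$DOW1   # 3 1 7 2
```

Any DOW value that is not in days comes back as NA, which makes typos in the data easy to spot.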

Steve Matco wrote:

Hi everyone,

I am at my wits' end with what I believe would be considered simple by a more experienced R user. I want to know how to add a variable to a dataframe whose values are conditional on the values of an existing variable. I can't seem to make an ifelse statement work for my situation. The existing variable in my dataframe is a character variable named DOW which contains abbreviated day names (SAT, SUN, MON, ..., FRI). I want to add a numerical variable named DOW1 to my dataframe that will take on the value 1 if DOW equals "SAT", 2 if DOW equals "SUN", 3 if DOW equals "MON", ..., 7 if DOW equals "FRI".
I know this must be a simple problem but I have searched everywhere and tried everything I could think of. Any help would be greatly appreciated.

Thank you,

Mike


Re: [R] R Experts