[R] R help, using R to build choropleth
Hi guys,

I need some help building a choropleth. Basically, I am trying to colour the regions on a map by population. I have the shapefile of the country and the population data, but I am having trouble creating the plot. Below is my code:

population=read.csv("nz2.csv")
population
names(population)
nz=readShapeSpatial((sprintf('nz-geodetic-marks',tmp_dir)))
nz$land_distr
mapnames = nz$land_distr
mapnamesnew=as.character(mapnames)
popnames =as.character(mapnamesnew)
pop =population$People
pop2=as.numeric(pop)
popdich = ifelse(pop2 < 10, "red", "blue")
popnames.lower = tolower(popnames)
cols=popdich[match(mapnames,popnames.lower)]
plot(nz,fill=TRUE,col=cols,proj="GCS_NZGD_2000")

Thanks in advance for any assistance.

Kind regards,
Iverson

--
View this message in context: http://r.789695.n4.nabble.com/R-help-using-R-to-build-choropleth-tp4634686.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] passing arguments to subset from a function
This thread may help? https://stat.ethz.ch/pipermail/r-help/2007-November/145345.html

On Wed, 17 Dec 2008 20:07:08 +0100, "GOUACHE David" wrote:
> Hello R-helpers,
>
> I'm writing a long function in which I manipulate a certain number of
> datasets. I want the arguments of said function to allow me to adapt the
> way I do this. Among other things, I want my function to have an argument
> which I will pass on to subset() somewhere inside my function. Here is a
> quick and simplified example with the iris dataset.
>
> myfunction<-function(table, extraction) {
> table2<-subset(table, extraction)
> return(table2) }
>
> myfunction(iris, extraction= Species=="setosa")
>
> ## end
>
> What I would like is for this function to return exactly the same thing as:
> subset(iris, Species=="setosa")
>
> Thanks for your help.
>
> Regards,
>
> David Gouache
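One way to get the behaviour David asks for is to capture the unevaluated condition with substitute() and evaluate it inside the data frame, which is roughly what subset() does internally. The sketch below is only an illustration and is not necessarily the solution from the linked thread; the name subset_by is made up here.

```r
# Hypothetical helper: accepts an unquoted condition, the way subset() does.
subset_by <- function(table, extraction) {
  condition <- substitute(extraction)              # capture the expression without evaluating it
  rows <- eval(condition, table, parent.frame())   # evaluate with the columns of 'table' in scope
  table[rows & !is.na(rows), , drop = FALSE]       # keep rows where the condition is TRUE
}

result   <- subset_by(iris, Species == "setosa")
expected <- subset(iris, Species == "setosa")
```

Note that this style of non-standard evaluation gets awkward when subset_by is itself called from another function with a condition held in a variable; in that situation passing a logical vector or a column name as a string is often simpler.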
Re: [R] Change in Lattice bwplot?
While I do not know if anything has changed in lattice, you should provide a self-contained, reproducible example as indicated by the posting guide. That way we (and you) can determine if it's a syntax error, or an issue with your particular data.

On Tue, 16 Dec 2008 14:56:09 +, "Fredrik Karlsson" wrote:
> Dear list,
>
> Sorry for asking this question, but has something changed in the
> syntax for bwplot in Lattice? In an old publication, I used
>
> > bwplot( VOTMS ~gender |type * group,
> data=merge(vot,words,by="ord"),
> nint=30,
> horizontal=F,
> layout=c(3,3),
> box.ratio=0.8)
>
> which produced a lovely 3x3 lattice plot with one box/gender in each
> panel. Now, I try
>
> > bwplot( SyllableNucleusDiff ~ SourceLanguage,data=Hstar,horizontal=F)
>
> to get just a simple 1x1 panel plot with groups (which I will then of
> course make into a panel plot by adding | factor1 +factor2...), but I
> get a "Syntax error".
> So, has anything changed, or am I just doing something very silly?
>
> /Fredrik
>
> --
> "Life is like a trumpet - if you don't put anything into it, you don't
> get anything out of it."
Re: [R] help with data layout
Iain Gallagher wrote:
> Hello list
>
> I have been given some Excel sheets with data laid out like this:
>
> Col1  Col2
> A     3
>       2
>       3
> B     4
>       5
>       4
> C     1
>       4
>       3
>
> I was hoping to import this into R as a csv and then get the mean and SD for each letter in column 1.
>
> Could someone give me some guidance on how best to approach this?

Sure. Reading in Excel sheets can be done at least a few ways, see the R Data Import/Export manual on CRAN. The only way I have done it is to save the Excel sheet as a CSV file, and then use read.csv in R to get a data.frame.

One note here is that sometimes the Excel sheet has 'missing' cells where someone has inserted blanks. These may get written out to the CSV file, you'll have to check. For example, I've seen an Excel sheet with something like 10 rows of data that outputs about 100 to the CSV file, mostly all missing.

Anyway, once you have the data.frame, I'd use na.locf from the zoo package to 'fill' in the missing Col1 values, and then use an R function such as ave, tapply, aggregate, or by to do whatever you'd like.

> Thanks
>
> Iain
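The fill-then-summarise steps described above could be sketched as follows. The data frame is typed in by hand to mirror the layout in the question (after read.csv the blank cells may arrive as "" rather than NA, which the na.strings argument can handle), and the forward fill is a base-R stand-in for zoo's na.locf so that the sketch has no package dependencies. It assumes the first row carries a label.

```r
# Made-up data mirroring the question; blank Col1 cells are NA here.
sheet <- data.frame(
  Col1 = c("A", NA, NA, "B", NA, NA, "C", NA, NA),
  Col2 = c(3, 2, 3, 4, 5, 4, 1, 4, 3),
  stringsAsFactors = FALSE
)

# Carry the last non-missing label forward.
label_index <- cumsum(!is.na(sheet$Col1))           # 1,1,1,2,2,2,3,3,3
sheet$Col1  <- sheet$Col1[!is.na(sheet$Col1)][label_index]

# Mean and SD of Col2 for each letter.
group_means <- tapply(sheet$Col2, sheet$Col1, mean)
group_sds   <- tapply(sheet$Col2, sheet$Col1, sd)
```

aggregate(Col2 ~ Col1, data = sheet, FUN = mean) would give the same means as a data frame instead of a named array.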
Re: [R] Display variables when running a script
Depending on your motivation here, you may want to search for 'debugging in R' or something to that effect to look at the various options available. I like the debug package available on CRAN. Also see ?browser and ?trace if you want to debug a function.

[EMAIL PROTECTED] wrote:
> Hi,
>
> I know this must be a stupid question, and sorry in advance for being such a noob. But, is there a way to get R to display only certain variables when running a script? I know if you want to see the value of a variable when using the interface, you just type it in and hit enter, but is there a command that I can use in a script file that will show the value at a certain point?
>
> Thanks,
> -Nina
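If the goal is simply to show a value at a given point rather than to debug, an explicit print() or cat() call is probably enough. A minimal sketch with a made-up variable:

```r
# Values are not auto-printed when a script is run with source(),
# or inside loops and functions, so ask for them explicitly.
total <- sum(1:10)
print(total)                     # same display as typing 'total' at the prompt
cat("total is", total, "\n")     # plain text, handy for progress messages
printed <- capture.output(print(total))
```

source("script.R", echo = TRUE) is another option when every top-level value should be shown.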
Re: [R] Problem with Tcl/Tk on Ubuntu
Davide Massidda wrote:
> Dear all,
>
> I have installed R on Linux/Ubuntu 8.04.

But you don't say how. Are you compiling R yourself or installing the Ubuntu package from CRAN?
Re: [R] Function Error
Please give commented, minimal, self-contained, reproducible code. In this case, not only the function definition, but a simple example showing *how* it does not work, and what you were expecting it to do!

Angelo Scozzarella wrote:
> Hi,
>
> Why this function doesn't work?
>
> function (x) {
>   if (is.factor(x)) {
>     if (!is.ordered(x)) {
>       warning("La mediana non si puo' calcolare!!!")
>       return(NA)
>     }
>     me <- median(unclass(x))
>     if (me - floor(me) != 0) {
>       warning("Mediana indeterminata")
>       return(NA)
>     } else {
>       levels(x)[me]
>     }
>   } else if(class(x)=="histogram"){
>     N<-sum(x$counts)
>     cl<-min(which(cumsum(x$counts)>=N/2))
>     return(x$breaks[cl]+ (N/2-sum(x$counts[1:(cl-1)]))/(x$densit[cl]*N)
>   } else median(x)
> }
>
> thanks
>
> Angelo Scozzarella
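For what it is worth, the function as posted does not parse: the return() call in the histogram branch is missing a closing parenthesis. A hedged sketch of a version that runs is below. Beyond the parenthesis it spells out x$density (the post relies on partial matching of x$densit), handles the case where the median falls in the first class (1:(cl-1) counts backwards when cl is 1), and uses English warning texts, which are my translations of the Italian ones. The name median_any is invented for illustration.

```r
median_any <- function(x) {
  if (is.factor(x)) {
    if (!is.ordered(x)) {
      warning("The median cannot be computed for an unordered factor")
      return(NA)
    }
    me <- median(as.integer(x))                    # median of the level codes
    if (me != floor(me)) {
      warning("Median is indeterminate (falls between two levels)")
      return(NA)
    }
    levels(x)[me]
  } else if (inherits(x, "histogram")) {
    N     <- sum(x$counts)
    cl    <- min(which(cumsum(x$counts) >= N / 2)) # class containing the median
    below <- sum(x$counts[seq_len(cl - 1)])        # 0 when cl == 1
    # Linear interpolation inside the median class.
    x$breaks[cl] + (N / 2 - below) / (x$density[cl] * N)
  } else {
    median(x)
  }
}

h    <- hist(c(1, 2, 3, 4), breaks = 0:4, plot = FALSE)
size <- factor(c("low", "mid", "high"),
               levels = c("low", "mid", "high"), ordered = TRUE)
```

With these inputs median_any(h) interpolates inside the second class and median_any(size) returns the middle level.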
Re: [R] Warning message in if else statement
Monica -

Monica Pisica wrote:
> Hi,
>
> I am using an if else statement inside a function …. If I use that function I have no problems …. If I use the function with the if else statement inside a second function I get the following warning:
>
> Warning message:
> In if (pval == 0) p_value <- "< 2.2e-16" else p_value <- pval :
>   the condition has length > 1 and only the first element will be used

This means that pval has more than one element, try printing its value immediately before the if statement to see what it is and how it got that way.

I also might ask what exactly you're doing with 'real p-values' and testing whether they equal 0?

> Using the second function I get the expected results, with a real p-value even if it is extremely small, or "
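The warning comes from handing a vector to if(), which only looks at one value (R 4.2.0 and later turn this into an error). If pval can legitimately hold several p-values, a vectorised ifelse() is the usual replacement. The sketch below uses made-up p-values; comparing against .Machine$double.eps rather than exactly 0 is my assumption about the intent, not something stated in the thread.

```r
pval <- c(0, 0.03, 1e-20)   # made-up values, one per test

# Element-wise choice instead of a single if/else.
p_value <- ifelse(pval < .Machine$double.eps,
                  "< 2.2e-16",
                  as.character(signif(pval, 3)))
```

If pval is supposed to be a single number, the better fix is upstream: find out why it has more than one element, as suggested above.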
Re: [R] Can R fill in missing values?
Hello -

Jia Ying Mei wrote:
> Hi,
>
> I know this can be done in Stata (which is quite messy) but I wanted to know if it can be done in R. So let's say I have a merged data set (I used the merge function by date for the attached two files), where all the missing values are filled with NAs (which is what the all.x=TRUE does).
>
> Is there any way to replace those NAs with the value of the latest row that contains a value? For example:
>
> > Date<-read.table("Desktop/R/Testdate.txt", head=T, sep="\t")
> > Data<-read.table("Desktop/R/Testinput.txt", head=T, sep="\t")
> > Merged<-merge(Date, Data, all.x=TRUE)
> > Merged
>      Date France Germany
> 1 3/10/07      2       4
> 2 3/11/07     NA      NA
> 3 3/12/07     NA      NA
> 4 3/13/07     NA      NA
> 5 3/14/07     NA      NA
> 6 3/15/07      1       2
>
> Given this Merged data, is there a way to replace every NA value from 3/11 to 3/14 with that of 3/15? But then say there are multiple intervals with NAs that I want to fill with the last given value?

Yes, no loop necessary.

# begin R code
install.packages("zoo")
library(zoo)
Merged$France <- na.locf(Merged$France, fromLast = TRUE)
# end R code

This will of course work on the 'France' column. Use of lapply in conjunction with this idea will lead to solving this problem for N columns in a couple of lines of R. Not messy at all!

Best Regards,
Erik Iverson
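The lapply() idea mentioned above might look like the following. To keep the sketch free of package dependencies it uses a small hand-written helper, fill_backward (a made-up name), in place of zoo's na.locf(..., fromLast = TRUE); with zoo loaded, lapply(Merged[value_cols], na.locf, fromLast = TRUE) should play the same role. The data frame is typed in from the printed output in the question.

```r
Merged <- data.frame(
  Date    = c("3/10/07", "3/11/07", "3/12/07", "3/13/07", "3/14/07", "3/15/07"),
  France  = c(2, NA, NA, NA, NA, 1),
  Germany = c(4, NA, NA, NA, NA, 2)
)

# Replace each NA with the next non-missing value further down the column.
fill_backward <- function(x) {
  r   <- rev(x)                     # work from the bottom up
  idx <- cumsum(!is.na(r))          # position of the most recent non-NA value
  idx[idx == 0] <- NA               # trailing NAs have nothing to copy from
  rev(r[!is.na(r)][idx])
}

value_cols <- c("France", "Germany")
Merged[value_cols] <- lapply(Merged[value_cols], fill_backward)
```

After this, rows 2 to 5 carry the 3/15 values (1 for France, 2 for Germany), which is what the question asks for.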
Re: [R] How to delete duplicate cases?
Daniel -

First, use order() to arrange the data.frame into an appropriate format. Then, use duplicated() with the negation operator to get rid of the duplicated values.

Daniel Wagner wrote:
> Dear R users,
>
> I have a dataframe with a lot of duplicate cases and I want to delete the duplicate ones which have low rank and keep the case which has the highest rank, e.g.
>
> df1
>    cno  rank
> 1 1342  0.23
> 2 1342  0.14
> 3 1342  0.56
> 4 2568  0.15
> 5 2568  0.89
>
> so I want to keep 3rd and 5th cases with highest rank (0.56 & 0.89) and delete rest of the duplicate cases.
>
> Could somebody help me?
>
> Regards
> Daniel
> Amsterdam
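The two steps above, order() followed by !duplicated(), can be sketched with the data from the question:

```r
df1 <- data.frame(
  cno  = c(1342, 1342, 1342, 2568, 2568),
  rank = c(0.23, 0.14, 0.56, 0.15, 0.89)
)

# Sort so that the highest rank comes first within each cno ...
sorted <- df1[order(df1$cno, -df1$rank), ]

# ... then keep only the first row seen for each cno.
best <- sorted[!duplicated(sorted$cno), ]
```

The original row names (3 and 5 here) are preserved in best, which makes it easy to check that the intended cases were kept.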
Re: [R] Help with which()
See http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

[EMAIL PROTECTED] wrote:
> Hi,
>
> I'm using which to find the position of a value in my data table, and it is returning the correct position and the position of another number that differs by an extra decimal place. For example, when I do:
>
> where1<-which(Operons==3573.1,arr.ind=TRUE)
>
> it returns the position of that number and of 3573.15. Is it possible to get the function to only return a position if the number matches exactly?
>
> Thanks,
> -Nina
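The FAQ entry linked above concerns floating-point representation: two numbers that print the same need not be equal under ==. Whether that is what is happening here is not clear from the post, but for reference, a tolerance-based comparison looks like this (the values and the tolerance of 1e-8 are arbitrary choices for illustration):

```r
values <- c(0.1 + 0.2, 0.3, 0.35)

exact_hits <- which(values == 0.3)              # misses 0.1 + 0.2
near_hits  <- which(abs(values - 0.3) < 1e-8)   # finds both representations of 0.3
```

The same abs(x - target) < tolerance test works on a matrix together with arr.ind = TRUE.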
Re: [R] Help with which()
Ben Bolker wrote:
> Erik Iverson biostat.wisc.edu> writes:
>> See http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
>
> naw3 duke.edu wrote:
>> I'm using which to find the position of a value in my data table [snip]
>> For example, when I do:
>> where1<-which(Operons==3573.1,arr.ind=TRUE)
>> it returns the position of that number and of 3573.15.
>
> This doesn't seem like the same issue as the FAQ. If there are really two elements in the array, one of which is 3573.1 and the other of which is 3573.15, they should not be treated as equal. Something else is going on. Could the original poster please send a small reproducible example?

Yes, the more I look at it, the more I agree. I just reacted when I saw x == 3573.1 ... It would be nice to see a reproducible example!
Re: [R] Question: how to build a subset to do separate calculations
Spencer -

Spencer wrote:
> Dear R Experts,
>
> I am trying to create several subsets that I want to use as separate datasets. After separate subsets are created for each country, I want to do some calculations for each. I did the following:
>
> data <- data.frame(na.omit(read.spss("trial.sav")))
> attach (data)

You might not want to call it 'data' since 'data' is the name of a function in base R. Also, the usual advice is to avoid attaching data.frames to the search path, for the exact reasons that happen to you below.

> Netherlands <- subset(data, COUNTRY = "NETHERLANDS")
> attach(Netherlands)
>
> But after this step, when I check the summary statistics for "Netherlands," what returns is the summary for the original dataset.

Try typing search() after attaching this to see why. You may have also seen a message about objects being masked when you used the second 'attach' command; at least I do in my version of R.

> I've also tried defining Netherlands as Netherlands <- data.frame(subset(data, COUNTRY = "NETHERLANDS")), but get the same result.
>
> Any help would be greatly appreciated!

Don't use attach. If you want to do summaries by country, there are many simpler ways. See ?tapply, ?aggregate, and ?by for instance.

Best,
Erik

> Spencer
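One detail in the posted code is worth flagging in addition to the attach() advice: subset(data, COUNTRY = "NETHERLANDS") uses a single =, so R treats COUNTRY as a named argument rather than as a comparison, no condition reaches subset(), and every row comes back. The comparison operator is ==. A small sketch with made-up data (the SCORE column is an assumption for illustration):

```r
survey <- data.frame(
  COUNTRY = c("NETHERLANDS", "NETHERLANDS", "BELGIUM", "BELGIUM"),
  SCORE   = c(1, 2, 3, 5)
)

# '==' tests each row; this keeps only the Dutch rows.
netherlands <- subset(survey, COUNTRY == "NETHERLANDS")

# Per-country summaries without creating (or attaching) separate data frames.
mean_by_country <- tapply(survey$SCORE, survey$COUNTRY, mean)
```

by(survey, survey$COUNTRY, summary) gives a full summary() per country in the same spirit.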
Re: [R] Question: how to build a subset to do separate calculations
Erik Iverson wrote:
> Spencer wrote:
>> [snip]
>> Netherlands <- subset(data, COUNTRY = "NETHERLANDS")
>> attach(Netherlands)
>>
>> But after this step, when I check the summary statistics for "Netherlands," what returns is the summary for the original dataset.

Actually, this might not make sense to me. What do you mean by 'check the summary statistics for Netherlands'? What command are you giving to R to do that?
Re: [R] accessing list elements
Hello -

Paul Adams wrote:
> Hello everyone,
>
> I have a list which I am trying to calculate a max value. I have the list as w<-c(v[[1]][1],...v[[100]][1]). The problem I am getting is that the function max is saying the list is an "invalid type (list) of argument". When I show the element v[[1]][1] it shows as $statistic,V,736. I am only wanting to use the number 736 from v[[1]][1] but am not sure how to access that number only? I believe if I just use the number then I should be able to calculate the max.
>
> Any help would be appreciated
> Paul

Please see footer of this message, which states "...provide commented, minimal, self-contained, reproducible code." You do not do this.

Is your problem like this?

a <- list(20, 30)
max(a) ##error
max(unlist(a)) ##no error

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
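If each element of v is a test result whose first component is $statistic (the "$statistic, V, 736" display suggests something like that, but this is a guess), the numbers can be pulled out with sapply() before calling max(). The list below is a hand-made stand-in for the real results.

```r
# Stand-in for a list of test results; only the shape matters here.
v <- list(
  list(statistic = c(V = 736), p.value = 0.01),
  list(statistic = c(V = 512), p.value = 0.20),
  list(statistic = c(V = 801), p.value = 0.03)
)

# v[[i]][1] is still a list; [[ ]] or $ reaches the number itself.
stats   <- sapply(v, function(res) res$statistic[["V"]])
largest <- max(stats)
```

The key difference is single versus double brackets: v[[1]][1] returns a one-element list, while v[[1]][[1]] returns what is stored inside it.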
Re: [R] Variable name in a String
?assign, ?get

R_Learner wrote:
> Hi all,
>
> I have this string "year" and integer 2008 (both are inputs from the user), and I would like to make a variable called "year2008" that will store a vector of numbers. Does anyone know how to do this?
>
> Also, if the user later input "year" and "2008", how do I treat this as a variable; how can I access the variable year2008 using the string paste("year","2008",sep="")?
>
> Thanks so much!
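A minimal sketch of how the two functions named above fit the question; the stored numbers are made up:

```r
prefix <- "year"
number <- 2008

var_name <- paste(prefix, number, sep = "")   # "year2008"

assign(var_name, c(1.5, 2.5, 3.5))            # creates a variable called year2008
stored <- get(var_name)                       # looks it up again by name
```

exists(var_name) can be used first if the variable might not have been created yet.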
Re: [R] colnames from read.table
Chua Siang Li wrote:
> Hi. May I know why the colnames is NULL when reading only 1 column from a csv file?

You are not "reading only 1 column from a csv file", you are subsetting one column of a data.frame. See ?Extract and the "drop" argument specifically.

> How do I get the colname then?

?names in this case. Always know the class of the object you are dealing with, and which methods are defined for that class. See ?class, ?methods

> Thanks.
>
> > xy = read.table("dataFile.csv",header=T, sep=",")
> > y <- xy[,1]
> > xDate <- xy[,2]
> > x <- xy[,3:8]
> > colnames(y)
> NULL
> > colnames(xDate)
> NULL
> > colnames(x)
> [1] "Market.Price" "Quantity" "Country" "Incoterm" "Channel"
> [6] "PaymentTerm"
>
> Chua Siang Li
> Consultant - Operations Research
> Acceval Pte Ltd
> Tel: 6297 8740
> Email: [EMAIL PROTECTED]
> Website: www.acceval-intl.com
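To illustrate the "drop" argument mentioned above: selecting a single column with xy[, 1] simplifies the result to a plain vector, which has no column names, while drop = FALSE keeps it as a one-column data.frame. The data below are made up.

```r
xy <- data.frame(
  Price    = c(10, 12),
  Date     = c("2008-01-01", "2008-01-02"),
  Quantity = c(5, 7)
)

y_vec <- xy[, 1]                 # plain numeric vector: colnames(y_vec) is NULL
y_df  <- xy[, 1, drop = FALSE]   # still a data.frame with one named column

first_name <- names(xy)[1]       # the column name, taken from the data.frame itself
```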
Re: [R] How to output R image to a file?
?Devices for a start.

Aiste Aistike wrote:
> Hello,
>
> I would like to ask if anyone could help me. I want to save images I create (e.g. histograms, boxplots, plots, etc.) to a file or files. Does anyone know how to do this?
>
> Thank you.
>
> Aiste
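The general pattern with the devices listed in ?Devices is: open a file device, draw, then close the device. A minimal sketch using pdf(); the file location and the data are made up, and png() or jpeg() follow the same open/draw/close pattern where those devices are available.

```r
out_file <- file.path(tempdir(), "histogram.pdf")

pdf(out_file, width = 6, height = 4)   # open the file device (size in inches)
hist(c(2, 3, 3, 4, 4, 4, 5, 5, 6),
     main = "Example histogram", xlab = "value")
invisible(dev.off())                   # close the device so the file is written
```

Nothing appears on screen while the file device is active; the plot goes straight to the file.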
Re: [R] Storing Matrices into Hash
I think a named list is probably the easiest way to start off, something like:

all_mat <- list(mat1 = mat1, mat2 = mat2)
all_mat$mat2

Gundala Viswanath wrote:
> Hi,
>
> Suppose I have these two matrices (could be more). What I need to do is to store these matrices into a hash. So that I can call back any of the matrix back later. Is there a way to do it?
>
> mat_1
>              [,1]        [,2]
> [1,] 9.327924e-01 0.067207616
> [2,] 9.869321e-01 0.013067929
> [3,] 9.892814e-01 0.010718579
> [4,] 9.931603e-01 0.006839735
> [5,] 9.149056e-01 0.08509
>
> mat_2
>              [,1]        [,2]
> [1,] 9.328202e-01 0.067179769
> [2,] 9.869402e-01 0.013059827
> [3,] 9.892886e-01 0.010711437
> [4,] 9.931660e-01 0.006833979
> [5,] 9.149391e-01 0.085060890
>
> This method I have is not favorable because it just stack the matrices together as another matrix. Makes it hard to get individual matrix later.
>
> all_mat <- NULL
> all_mat <- c(all_mat, mat1,mat2)
>
> - Gundala Viswanath
> Jakarta - Indonesia
Re: [R] Storing Matrices into Hash
Gundala Viswanath wrote:
> Thanks so much Erik,
>
> But how do you include that in a loop. I tried this, doesn't seem to work. Please advice:
>
> __BEGIN__
> all_mat <- NULL
> for (matno in 1:10) {
> mat <- process_to_create_matrix(da[matno])
> all_mat <- list(all_post, matno = mat)
> }

I'm not exactly sure what you're up to, what is all_post, and what is "da"? You might look at the lapply function as an option for avoiding the loop and writing cleaner code. Something like:

all_mat <- lapply(da, process_to_create_matrix)

may or may not work depending on what "da" is.

Please see the footer of this message regarding reproducible, self-contained example code.

Best,
Erik

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
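Two things go wrong in the posted loop: list(all_post, matno = mat) wraps the previous list inside a new one on every pass, and matno = uses the literal name "matno" rather than the loop counter. Assigning with [[ ]] and a constructed name avoids both. In the sketch below make_matrix and da are stand-ins, since the real process_to_create_matrix and da were not shown.

```r
# Stand-in for process_to_create_matrix(); returns a small 2 x 2 matrix.
make_matrix <- function(seed_value) matrix(seed_value * 1:4, nrow = 2)
da <- 1:3

# Loop version: add one named element per iteration.
all_mat <- list()
for (matno in seq_along(da)) {
  all_mat[[paste0("mat", matno)]] <- make_matrix(da[matno])
}

# lapply version: same result, names attached afterwards.
all_mat_2 <- lapply(da, make_matrix)
names(all_mat_2) <- paste0("mat", seq_along(da))
```

Either way, individual matrices can then be retrieved by name, for example all_mat$mat2 or all_mat[["mat2"]].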
Re: [R] viewing data in something similar to 'R Data Editor'
See ?View but I don't think it 'auto updates' per your last sentence. Maybe there's a better option?

Rachel Schwartz wrote:
> Hi,
>
> I would like to view matrices I am working with in a clean, easy to read, separate window.
>
> A friend showed me how to do something like I want with edit(). I can view the matrix in the 'R Data Editor':
>
> For a sample matrix:
>
> mat=matrix(1:15,ncol=3)
> mat
>      [,1] [,2] [,3]
> [1,]    1    6   11
> [2,]    2    7   12
> [3,]    3    8   13
> [4,]    4    9   14
> [5,]    5   10   15
>
> look=function(x) invisible(edit(x))
> look(mat)
>
> That opens the 'R Data Editor' with mat loaded. But I am not able to do any other actions in R while this 'R Data Editor' is open. I want to keep this open while I do other work.
>
> Is there a way to view my data in something like the 'R Data Editor' that still allows me to do work at the same time? I am looking for something other than str(), head(), and tail() which just allow me a quick peek at the object. I do not want to edit the object in the table, but be able to watch the object change while I run anything that would manipulate it.
>
> Thank you for your help.
>
> Best,
> Rachel Schwartz
> Graduate Student Researcher
> UCSD; Scripps Institution of Oceanography
Re: [R] viewing data in something similar to 'R Data Editor'
Rachel Schwartz wrote:
> Thanks Erik, almost worked!
>
> I am a mac user and for some reason View worked perfectly for my PC using friend, but doesn't for me. When I tried:
>
> > mat=matrix(1:10,ncol=2)
> > mat
>      [,1] [,2]
> [1,]    1    6
> [2,]    2    7
> [3,]    3    8
> [4,]    4    9
> [5,]    5   10
> > View(mat)
>
> I get no error message, but nothing happens (besides spinning ball of death) and I have to force quit R. I tried a couple different variations but still no success with using View. Suggestions?

Not from me, no Mac here. Maybe someone else? Or else there is a Mac specific list, R-SIG-Mac, google for it.

> On Fri, Aug 1, 2008 at 10:52 AM, Erik Iverson <[EMAIL PROTECTED]> wrote:
>> See ?View but I don't think it 'auto updates' per your last sentence. Maybe there's a better option?
>>
>> Rachel Schwartz wrote:
>>> [snip]
Re: [R] Basic factor question.
Kevin -

Read more closely "levels", being an optional vector of the values x might have taken. You are saying x might have taken 1:20, and then giving it the first 20 letters, which are not part of "the values x might have taken".

Try:

x <- factor(letters[1:20])
levels(x)
x

vs.

y <- factor(letters[1:20], levels = letters)
levels(y)
y

vs.

z <- factor(letters[1:20], levels = letters[1:19])
levels(z)
z

That might help show you what's going on?

Best,
Erik Iverson

[EMAIL PROTECTED] wrote:
> Doing ?factor I get:
>
> x       a vector of data, usually taking a small number of distinct values.
> levels  an optional vector of the values that x might have taken. The default is the set of values taken by x, sorted into increasing order.
>
> So if I do:
>
> factor(letters[1:20],level=seq(1:20))
>  [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
> [16] <NA> <NA> <NA> <NA> <NA>
> Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
>
> So why all of the NA? What happened to 'a', 'b', etc.? I was expecting a=1, b=2, c=3 etc.
>
> I am missing something. Please help with my understanding.
>
> Keviin
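If the aim was the mapping a=1, b=2, c=3 and so on, there are two common ways to get it, sketched below: take the integer codes of the factor, or keep the letters as levels and attach the numbers as labels. This is an illustration of the levels/labels distinction, not something proposed in the thread.

```r
x <- factor(letters[1:20])

# The internal integer codes follow the (sorted) levels: a -> 1, b -> 2, ...
codes <- as.integer(x)

# 'levels' says which values may occur; 'labels' says what to call them.
relabelled <- factor(letters[1:20], levels = letters[1:20], labels = 1:20)
```

In relabelled no value is NA, because every letter supplied does appear in levels.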
Re: [R] T Test
Just plug your values into the t-test formula, you don't need R for this, you can use a calculator. If you want a p-value then use the pt() function in R after getting the t statistic.

Angelo Scozzarella wrote:
> Hi,
>
> I want to calculate the T-Test from means and sd. How can I do it?
>
> Thanks
>
> Angelo Scozzarella
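As a sketch of "plug into the formula, then pt()": the question does not say which t-test is meant, so the code below assumes two independent samples and uses the Welch (unequal variance) form. All the numbers are made up.

```r
# Summary statistics for two groups (made-up values).
m1 <- 10; s1 <- 2; n1 <- 20
m2 <- 12; s2 <- 2; n2 <- 20

se     <- sqrt(s1^2 / n1 + s2^2 / n2)   # standard error of the difference in means
t_stat <- (m1 - m2) / se

# Welch-Satterthwaite approximation to the degrees of freedom.
df <- se^4 / ((s1^2 / n1)^2 / (n1 - 1) + (s2^2 / n2)^2 / (n2 - 1))

p_value <- 2 * pt(-abs(t_stat), df)     # two-sided p-value
```

For a one-sample test the analogous statistic is (mean - mu0) / (sd / sqrt(n)) with n - 1 degrees of freedom.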
Re: [R] produce variable on the fly
?assign , or consider a named vector/list.

jimineep wrote:
> Hi guys,
>
> I want to create variable on the fly: for example
>
> for (i in 1:10) {
> cat(paste("VAR",i,sep=""))
> }
>
> Will print VAR1, VAR2 etc up to VAR10. However I want to make these into variables, and then give them a value, for example:
>
> vect = c(10:20)
> for (i in 1:10) {
> cat(paste("VAR",i,sep="")) = vect[i]
> }
>
> THis doesnt work but I hope it demonstrates what I'm trying to do: I'm trying to produce 10 variables, from VAR1 to VAR10, which have the numbers 11...20 respectively, so I would end up with
>
> VAR1 = 11
> VAR2 = 12
> ...
> VAR10 = 20
>
> Hope this makes sense!! Does anyone have any ideas?
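The named vector/list alternative suggested above keeps the ten values in one object instead of scattering ten variables across the workspace. A minimal sketch, using 11:20 to match the values the question describes:

```r
vect <- 11:20

# One named vector instead of ten separate variables.
vars <- setNames(vect, paste("VAR", 1:10, sep = ""))

third <- vars[["VAR3"]]     # look a value up by its name

# The same thing as a list, useful when the elements are not all one type.
var_list <- as.list(vars)
```

If separate variables really are needed, assign(paste("VAR", i, sep = ""), vect[i]) inside the loop is the direct translation of the attempted code.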
Re: [R] merging data sets to match data to date
rcoder wrote:
> Hi everyone,
>
> I want to extract data from a data set according to dates specified in a vector. I have created a blank matrix with row names (dates) that I want to extract from the full data set. I have then performed a merge to try to o/p rows corresponding to common dates to a results matrix, but the operation did not fill the results matrix. Could anyone offer any advice to assist with this operation?

Yes, follow the posting guide and provide commented, minimal, self-contained, reproducible code of your problem.
Re: [R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?
I still don't understand what you are doing. Can you make a small example that shows what you have and what you want? Is ?split what you are after?

Emmanuel Levy wrote:
Dear Peter and Henrik,
Thanks for your replies - this helps speed up a bit, but I thought there would be something much faster. What I mean is that I thought that a particular value of a level could be accessed instantly, similarly to a "hash" key. Since I've got about 6000 levels in that data frame, it means that making a list L of the form
L[[1]] = values of name "1"
L[[2]] = values of name "2"
L[[3]] = values of name "3"
...
would take ~1hour.
Best,
Emmanuel

2008/8/12 Henrik Bengtsson <[EMAIL PROTECTED]>:
To simplify:
n <- 2.7e6;
x <- factor(c(rep("A", n/2), rep("B", n/2)));
# Identify 'A':s
t1 <- system.time(res <- which(x == "A"));
# To compare a factor to a string, the factor is in practice
# coerced to a character vector.
t2 <- system.time(res <- which(as.character(x) == "A"));
# Interestingly enough, this seems to be faster (repeated many times)
# Don't know why.
print(t2/t1);
user system elapsed
0.632653 1.60 0.754717
# Avoid coercing the factor, but instead coerce the level compared to
t3 <- system.time(res <- which(x == match("A", levels(x))));
# ...but gives no speed up
print(t3/t1);
user system elapsed
1.041667 1.00 1.018182
# But coercing the factor to integers does
t4 <- system.time(res <- which(as.integer(x) == match("A", levels(x))));
print(t4/t1);
user system elapsed
0.417 0.000 0.3636364
So, the latter seems to be the fastest way to identify those elements.
My $.02
/Henrik

On Tue, Aug 12, 2008 at 7:31 PM, Peter Cowan <[EMAIL PROTECTED]> wrote:
Emmanuel,

On Tue, Aug 12, 2008 at 4:35 PM, Emmanuel Levy <[EMAIL PROTECTED]> wrote:
Dear All,
I have a large data frame ( 270 lines and 14 columns), and I would like to extract the information in a particular way illustrated below:
Given a data frame "df":
col1=sample(c(0,1),10, rep=T)
names = factor(c(rep("A",5),rep("B",5)))
df = data.frame(names,col1)
df
   names col1
1      A    1
2      A    0
3      A    1
4      A    0
5      A    1
6      B    0
7      B    0
8      B    1
9      B    0
10     B    0
I would like to transform it into the form:
index = c("A","B")
col1[[1]]=df$col1[which(df$name=="A")]
col1[[2]]=df$col1[which(df$name=="B")]

I'm not sure I fully understand your problem; your example would not run for me. You could get a small speedup by omitting which(), since you can also subset by a logical vector.
n <- 270
foo <- data.frame(
+ one = sample(c(0,1), n, rep = T),
+ two = factor(c(rep("A", n/2 ),rep("B", n/2 )))
+ )
system.time(out <- which(foo$two=="A"))
user system elapsed
0.566 0.146 0.761
system.time(out <- foo$two=="A")
user system elapsed
0.429 0.075 0.588
You might also find use for unstack(), though I didn't see a speedup.
system.time(out <- unstack(foo))
user system elapsed
1.068 0.697 2.004
HTH
Peter

My problem is that the command:
*** which(df$name=="A") ***
takes about 1 second because df is so big. I was thinking that a "level" could maybe be accessed instantly but I am not sure about how to do it. I would be very grateful for any advice that would allow me to speed this up.
Best wishes,
Emmanuel
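The ?split suggestion at the top of this thread can be illustrated on the poster's small example. This is a sketch on made-up data of the same shape, not a benchmark; no timing claim is made for the 6000-level case.

```r
set.seed(42)
df <- data.frame(names = factor(rep(c("A", "B"), each = 5)),
                 col1  = sample(c(0, 1), 10, replace = TRUE))

# One pass over the data: a list with one element per factor level
by_name <- split(df$col1, df$names)
by_name[["A"]]                             # same values as df$col1[df$names == "A"]
```

Because split() builds all groups in a single call, it avoids repeating which(df$names == level) once per level.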
Re: [R] conditional IF with AND
if(cond1 && cond2) { ... }

rcoder wrote:
Hi everyone,
I'm trying to create an "if" conditional statement with two conditions, whereby the statement is true when condition 1 AND condition 2 are met:
code structure: if ?AND? (a[x,y] , a[x,y] )
I've trawled through the help files, but I cannot find an example of the syntax for incorporating an AND in a conditional IF statement.
Thanks,
rcoder
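A small runnable version of the one-line answer, using a made-up matrix a in place of the poster's data. It also shows the element-wise operator &, which is the one to use on whole vectors.

```r
a <- matrix(1:6, nrow = 2)                 # made-up data

# '&&' combines two single TRUE/FALSE values, which is what if() expects
if (a[1, 2] > 2 && a[2, 3] < 10) {
  msg <- "both conditions hold"
} else {
  msg <- "at least one condition fails"
}

# '&' is the element-wise version, for vectors (e.g. when subsetting)
both <- a[1, ] > 2 & a[1, ] < 5
```

See ?Logic for the full set of operators (&&, ||, &, |, !).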
Re: [R] Conditional statement used in sapply()
Hello -

Altaweel, Mark R. wrote:
Hi,
I have data stored in a list that I would like to aggregate and perform some basic stats. However, I would like to apply conditional statements so that not all the data are used. Basically, I want to get a specific variable, do some basic functions (such as a mean), but only get the data in each element's data that match the condition. The code I used is below:
result<-sapply(res, function(.df) { #res is the list containing file data
+ if(.df$Volume>0)mean(.df$Volume) #only have the mean function calculate on values greater than 0
+ })
I did get a numeric output; however, when I checked the output value the conditional was ignored (i.e. it did not do anything to the calculation). I also obtained these warning statements:
Warning messages:
1: In if (.df$Volume > 0) mean(.df$Volume) : the condition has length > 1 and only the first element will be used
2: In if (.df$Volume > 0) mean(.df$Volume) : the condition has length > 1 and only the first element will be used
Please let me know what I am doing wrong and how I can apply a conditional statement to the sapply function.

Before you think about sapply, what would you do if you had one element of this list? Write a function to do that. You wouldn't do
if(x$Volume > 0) mean(x$Volume)
because x$Volume > 0 will create a logical vector of length greater than 1 (assuming x$Volume has length greater than 1), and then "if" will issue the warning. You might do
mean(x$Volume[x$Volume > 0])
and turn it into a function. Then use sapply. Hopefully that gets you started!
Erik
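Erik's two steps (write the function for one element, then use sapply) might look like this. The list res below is a made-up stand-in for the poster's data, and the name mean_positive is hypothetical.

```r
# Made-up stand-in for the poster's list of data frames
res <- list(
  file1 = data.frame(Volume = c(0, 2, 4, 0)),
  file2 = data.frame(Volume = c(1, 0, 5, 6))
)

# Step 1: mean of the strictly positive volumes in ONE data frame
mean_positive <- function(.df) mean(.df$Volume[.df$Volume > 0])

# Step 2: apply it to every element of the list
result <- sapply(res, mean_positive)
```

The logical vector .df$Volume > 0 is used for subsetting rather than inside if(), so no "condition has length > 1" warning arises.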
Re: [R] Simple (?) subset problem
I can't tell exactly what's wrong; check out the ?str and ?levels functions for some guidance.

Farley, Robert wrote:
I can't figure out the syntax I need to get subset to work. I'm trying to split my dataframe into two parts. I'm sure this is a simple issue, but I'm stumped. I either get all or none of the original "rows".

XTTable <- xtabs( ~ direction_ , SurveyData)
XTTable
direction_
EASTBOUND 345
WESTBOUND 307

EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND" )
XTTable <- xtabs( ~ direction_ , EBSurvey)
XTTable
direction_
EASTBOUND 0
WESTBOUND 0

EBSurvey <- subset(SurveyData, direction_ = "EASTBOUND" )
XTTable <- xtabs( ~ direction_ , EBSurvey)
XTTable
direction_
EASTBOUND 345
WESTBOUND 307

EBSurvey <- subset(SurveyData, direction_ == 1 )
XTTable <- xtabs( ~ direction_ , EBSurvey)
XTTable
direction_
EASTBOUND 0
WESTBOUND 0

Robert Farley
Metro
1 Gateway Plaza
Mail Stop 99-23-7
Los Angeles, CA 90012-2952
Voice: (213)922-2532
Fax:(213)922-2868
www.Metro.net
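The thread does not establish the cause, so the following is only one plausible scenario, shown on made-up data: factor levels that carry stray whitespace, which levels() (as suggested in the reply) would reveal. The trailing blanks are an assumption, not something the poster reported.

```r
# Made-up data in which the factor levels carry trailing blanks
SurveyData <- data.frame(direction_ = factor(c("EASTBOUND ", "WESTBOUND ", "EASTBOUND ")))

levels(SurveyData$direction_)              # shows the exact level strings, blanks included
n_exact <- nrow(subset(SurveyData, direction_ == "EASTBOUND"))   # no rows match

# Trim surrounding whitespace from the levels, then subset again
levels(SurveyData$direction_) <-
  gsub("^[[:space:]]+|[[:space:]]+$", "", levels(SurveyData$direction_))
EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND")
```

Separately, the poster's second attempt uses a single =, which subset() treats as a named argument rather than a comparison, so no filtering happens and all rows come back.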
Re: [R] Opening a web browser from R?
Hello -

[EMAIL PROTECTED] wrote:
Hi,
I was wondering if there's a way in R to open a web browser (such as Internet Explorer, or Firefox or whatever). I'm doing some analyses that have associated urls, and it would be nice to have the ability to directly open the relevant page from within R.

I don't know what you mean by "associated". If you just want to open a URL in a web browser from R, see ?browseURL; the help.start function uses this. Of course, help.search("browser") would have led you there.
HTH,
Erik
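A minimal use of browseURL(), with an example URL chosen here for illustration. The interactive() guard is an addition so that the snippet does nothing when run from a script.

```r
url <- "https://www.r-project.org/"        # example URL, not from the thread
if (interactive()) {
  browseURL(url)                           # opens the page in the default browser
}
```

If the URLs are stored in a data frame column, the same call can be made on the element that belongs to the analysis of interest.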
Re: [R] Saving environment object
Benjamin Otto wrote:
Hi,
When I create an environment object with new.env() and populate it with values then how can I save it into an .RData file properly, so it can be loaded later on in a new session? Saving an environment object with save() or save.image() results in an error message when loading again:
Error: protect(): protection stack overflow

Can you give a small, reproducible example as the posting guide asks? And also provide your sessionInfo()? I am not able to replicate this.
test <- new.env()
assign("hi", pi, pos = test)
save(test, file = "~/testenv.Rdata")
does not give me an error. Is this basically what you're trying?
Re: [R] Saving environment object
Of course you said when you load it again. I just now loaded it, without error. FYI, my sessionInfo(), which I realize is not the latest version.

sessionInfo()
R version 2.7.0 (2008-04-22)
i686-pc-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] grDevices datasets grid tcltk splines graphics utils
[8] stats methods base
other attached packages:
[1] fortunes_1.3-4 debug_1.1.0 mvbutils_1.1.1 erik_0.0-1
[5] reshape_0.8.0 SPLOTS_1.3-50 Hmisc_3.4-3 chron_2.3-21
[9] survival_2.34-1
loaded via a namespace (and not attached):
[1] cluster_1.11.10 lattice_0.17-6 tools_2.7.0

Benjamin Otto wrote:
Hi,
When I create an environment object with new.env() and populate it with values then how can I save it into an .RData file properly, so it can be loaded later on in a new session? Saving an environment object with save() or save.image() results in an error message when loading again:
Error: protect(): protection stack overflow
Regards,
benjamin

==
Benjamin Otto
University Hospital Hamburg-Eppendorf
Institute For Clinical Chemistry
Martinistr. 52
D-20246 Hamburg
Tel.: +49 40 42803 1908
Fax.: +49 40 42803 4971
==
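The save-then-load round trip described in the two replies can be written as one self-contained snippet. It uses a temporary file instead of the home directory, which is a choice made here, and it does not attempt to reproduce the poster's "protection stack overflow" error.

```r
test <- new.env()
assign("hi", pi, envir = test)

f <- tempfile(fileext = ".RData")          # temporary file, to leave the home directory alone
save(test, file = f)

rm(test)                                   # simulate a fresh session
load(f)                                    # restores an object named 'test'
get("hi", envir = test)
```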
Re: [R] Produce single line graph title composed of text and computed values.
Use paste() instead of c(); see ?paste.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
Sent: Monday, August 10, 2009 10:36 AM
To: r-help@r-project.org
Subject: [R] Produce single line graph title composed of text and computed values.

R 2.81 Windows XP
I am trying to produce a title that combines: text, a computed value, text, a computed value.
The title contains everything I want, but each element of the title is on a separate line, i.e. my title is five lines long. Is there any way I can force the entire title to be on the same line? The problem is not the length of the title; it is short enough to fit on a single line. My R code follows:
title(c(text,"Corr=",round(corrln$estimate,2),"p=",round(corrln$p.value,2)))
Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
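A sketch of the paste() fix on made-up data. The objects x, y and the title prefix are invented for the example; corrln is assumed to come from cor.test(), which matches the $estimate and $p.value components used in the question.

```r
# Made-up data standing in for the poster's variables
set.seed(1)
x <- rnorm(30)
y <- x + rnorm(30)
corrln <- cor.test(x, y)
text <- "Height vs weight:"                # hypothetical title prefix

# paste() collapses the pieces into ONE string, so title() draws a single line
main_title <- paste(text, "Corr=", round(corrln$estimate, 2),
                    "p=", round(corrln$p.value, 2))

if (interactive()) {                       # only draw when a graphics device makes sense
  plot(x, y)
  title(main_title)
}
```

c() builds a character vector with five elements, and title() draws each element on its own line, which explains the five-line title.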
Re: [R] summary(table)
We cannot reproduce your example since we don't have access to probF. It seems probF is not an object of class "table", but perhaps of class "data.frame". Also, summary is not "cutting off the other variables"; it is pooling levels of a factor into the "Other" category. All the levels belong to the same variable. By looking at the help for summary, i.e., by typing ?summary at the R prompt, you will find that the data.frame method has an argument called "maxsum", which is by default equal to 7. The argument is described as "integer, indicating how many levels should be shown for 'factor's." That sounds like what you want.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of mmv.listservs
Sent: Monday, August 10, 2009 12:52 PM
To: r-help@r-project.org
Subject: [R] summary(table)

Hi,
Why when I do a summary on a table it cuts off the other variables? It says Other :58 or Other: 120. How can I get the summary for all the variables under ServLoad.Task and Server.Load and Avg.CPU and Max.CPU?
Thanks,

summary(probF)
   Reboot.Id        ServLoad.Task         Server.Load        Avg.CPU            Max.CPU                     Event.Log
 Min.   : 2.00   120067_122395:  5   SLACT04_1     :21   Min.   : 0.02827   Min.   : 0.40   2009-MAY-13 20:21:37:86
 1st Qu.:18.50   120076_124326:  5   SLSPED06      :19   1st Qu.: 4.22128   1st Qu.: 7.80   2009-MAY-11 20:03:29: 9
 Median :25.00   120030_122501:  4   SLACT04_5     :15   Median :10.29370   Median :23.70   2009-MAY-13 01:25:13: 7
 Mean   :21.18   120046_122891:  4   SL06_SCEVEN_OP:14   Mean   :20.54242   Mean   :32.88   2009-MAY-13 21:03:08: 5
 3rd Qu.:25.00   120046_122933:  4   SLACT04_4     :10   3rd Qu.:33.44267   3rd Qu.:53.70   2009-MAY-11 04:41:51: 4
 Max.   :30.00   120071_121550:  4   SLBF06_3      : 9   Max.   :96.73438   Max.   :99.90   2009-MAY-11 05:30:02: 4
                 (Other)      :120   (Other)       :58                                      (Other)             :31
  Probability
 Min.   :0.34
 1st Qu.:0.000227
 Median :0.009295
 Mean   :2.321083
 3rd Qu.:4.725526
 Max.   :9.432425
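The effect of maxsum can be seen on a small made-up data frame. The column names are borrowed from the question, but the values and the 12 factor levels are invented.

```r
set.seed(1)
# Made-up data frame with a 12-level factor
probF <- data.frame(
  Server.Load = factor(sample(paste("SL", 1:12, sep = ""), 200, replace = TRUE)),
  Avg.CPU     = runif(200, 0, 100)
)

s_default <- summary(probF)                # pools the rarest levels into "(Other)"
s_full    <- summary(probF, maxsum = 20)   # room for every level, so nothing is pooled
print(s_full)
```

Choosing maxsum at least as large as the number of levels of the biggest factor shows every level individually.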
Re: [R] Bug in "seq" (or a "feature") ?
General floating point arithmetic issue here: see FAQ 7.31
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Tal Galili
Sent: Monday, August 10, 2009 4:14 PM
To: r-help@r-project.org
Subject: [R] Bug in "seq" (or a "feature") ?

(I use R 2.9.1 with win XP)
If I run this code:
seq(-0.1,.9, by = .05)[seq(-0.1,.9, by = .05) <= 0.5]
I get this output:
[1] -0.10 -0.05 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45
Why is 0.50 not in the results? (It seems that it gives a slightly bigger number than 0.5 but I don't understand why it does that.)
Whereas if I try:
seq(-0.1,.9, by = .05)[seq(-0.1,.9, by = .05) <= 0.4]
I get:
[1] -0.10 -0.05 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40
Then 0.40 WILL be in the results.
Thanks,
Tal

--
My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il
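In the spirit of FAQ 7.31, the comparison can be made tolerant of rounding error. The tolerance 1e-8 below is an arbitrary choice for illustration, and the result of the exact comparison may differ between platforms, so no output is claimed for it.

```r
s <- seq(-0.1, 0.9, by = 0.05)

print(s[13], digits = 17)                  # the element displayed as 0.50, at full precision
s[13] == 0.5                               # exact comparison: may be FALSE

# Compare with a small tolerance instead of exactly
tol <- 1e-8
kept <- s[s <= 0.5 + tol]

# all.equal() is the usual tool for "equal up to rounding error"
same <- isTRUE(all.equal(s[13], 0.5))
```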
Re: [R] pretty display or print for data frames ?
I sometimes use the View() (note the capital V) function to view long/wide data.frames.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Leon Yee
Sent: Wednesday, August 12, 2009 2:12 AM
To: Patrick Connolly
Cc: r-help@r-project.org
Subject: Re: [R] pretty display or print for data frames ?

Patrick Connolly wrote:
> On Wed, 12-Aug-2009 at 12:38PM +0800, Leon Yee wrote:
>
>> Hi, all
>>
>> I have a question of adjusting the output of a data frame with many
>> columns. By default, print() will print out several columns according to
>> the window size, and then it scrolls down and print out left columns.
>> How can I make it print all the columns in the same line?
>> I found options(width=80 or 120 or whatever) will change the
>> behavior of print() and got the effect what I want, but using a
>> parameter directly doesn't work: print(x, width=xxx).
>> Can anyone help me out? thanks.
>
> This will get it on one line.
>
> options(width=120); print(x)
>
> I'm not sure what your main objective is so that might be of no use.
> If it's something you do often, you could make a little function that
> takes the name of the dataframe and the width you'd like it printed.
> You could even have the function put the width back to the initial
> setting. (Generally speaking, printing very wide gets messy, so
> there's good reason why you might just want to do that.)

Thanks, Patrick. Printing wide is sometimes needed. options(width=120) just works, but I need to change it back to 80 every time for normal printing. Sort of tedious. I'm just curious why print.data.frame does not take an argument of width=120, just like in print.factor.
## S3 method for class 'factor':
print(x, quote = FALSE, max.levels = NULL, width = getOption("width"), ...)

Best wishes,
Leon
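Patrick's idea of a little function that sets the width and then puts it back could be sketched like this. The name print_wide and the default width of 120 are made up for the example.

```r
# Hypothetical helper: print with a temporary line width, then restore the old setting
print_wide <- function(x, width = 120, ...) {
  old <- options(width = width)            # options() returns the previous values
  on.exit(options(old))                    # restored even if print() fails
  print(x, ...)
  invisible(x)
}

wide <- as.data.frame(matrix(round(rnorm(60), 2), nrow = 3))   # 20 columns
before <- getOption("width")
print_wide(wide, width = 200)
```

After the call, getOption("width") is back at its previous value, so normal printing is unaffected.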
Re: [R] Symbolic references - passing variable names into functions
I think ONE answer to what you actually want to do might be
f <- function(dataf, col1 = "column1", col2 = "column2") {
  dataf[[col1]] <- dataf[[col2]] # just as an example
  dataf
}

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alexander Shenkin
Sent: Wednesday, August 12, 2009 9:27 AM
To: r-help@r-project.org
Subject: [R] Symbolic references - passing variable names into functions

Hello All,
I am trying to write a function which would operate on columns of a dataframe specified in parameters passed to that function.
f = function(dataf, col1 = "column1", col2 = "column2") {
  dataf$col1 = dataf$col2 # just as an example
}
The above, of course, does not work as intended. In some languages one can force evaluation of a variable, and then use that evaluation as the variable name. Thus,
> a = "myvar"
> (operator)a = 1
> myvar
[1] 1
Is there some operator which allows this symbolic referencing in R?
Thanks,
Allie
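A short usage example for the function in the reply, on a made-up data frame d. It also shows that the caller's data frame is left unchanged, since the function works on a copy and returns it.

```r
f <- function(dataf, col1 = "column1", col2 = "column2") {
  dataf[[col1]] <- dataf[[col2]]           # [[ ]] looks the column up by the string's value
  dataf                                    # return the modified copy
}

d  <- data.frame(a = 1:3, b = 4:6)         # made-up data
d2 <- f(d, col1 = "a", col2 = "b")
```

With dataf$col1, R looks for a column literally named "col1", which is why the original version did not behave as intended.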
Re: [R] Nominal variables in SVM?
Noah, depending on what function you use, it might do this automatically for you if you give the function a formula containing a factor. Otherwise, see ?model.matrix.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Noah Silverman
Sent: Wednesday, August 12, 2009 3:59 PM
Cc: r help
Subject: Re: [R] Nominal variables in SVM?

That makes sense. If my data is already nominal, I need to "expand" a single column into several binary ones. Is there an easy function to do this in R, or do I need to create something from scratch? (If I have to create my own, any suggestions?)
Thanks!
-N

On 8/12/09 1:55 PM, Steve Lianoglou wrote:
> Hi,
>
> On Aug 12, 2009, at 2:53 PM, Noah Silverman wrote:
>
>> Hi,
>>
>> The answers to my previous question about nominal variables has lead
>> me to a more important question.
>>
>> What is the "best practice" way to feed nominal variable to an SVM.
>>
>> For example:
>> color = ("red, "blue", "green")
>>
>> I could translate that into an index so I wind up with
>> color= (1,2,3)
>>
>> But my concern is that the SVM will now think that the values are
>> numeric in "range" and not discrete conditions.
>>
>> Another thought would be to create 3 binary variables from the single
>> color variable, so I have:
>>
>> red = (0,1)
>> blue = (0,1)
>> green = (0,1)
>>
>> A example fed to the SVM would have one positive and two negative
>> values to indicate the color value:
>> i.e. for a blue example:
>> red = 0, blue =1 , green = 0
>
> Do it this way.
>
> So, imagine if the features for your examples were color and height,
> your "feature matrix" for N examples would be N x 4
>
> 0,1,0,15 # blue object, height 15
> 1,0,0,10 # red object, height 10
> 0,0,1,5 # green object, height 5
> ...
>
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
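The ?model.matrix pointer can be tried on Steve's three-row example. The data frame d is made up to match his feature matrix; the "- 1" in the formula is a choice made here so that every colour gets its own 0/1 column.

```r
d <- data.frame(
  color  = factor(c("blue", "red", "green"), levels = c("red", "blue", "green")),
  height = c(15, 10, 5)
)

# '- 1' drops the intercept, so each level of the factor gets its own indicator column
X <- model.matrix(~ color + height - 1, data = d)
print(X)
```

Without the "- 1", the first level is absorbed into the intercept and only two colour columns appear, which is the usual coding for regression-type models.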
Re: [R] Nominal variables in SVM?
This is where a small, reproducible example will definitely help us discover your problem.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Noah Silverman
Sent: Wednesday, August 12, 2009 4:29 PM
To: Achim Zeileis
Cc: r help
Subject: Re: [R] Nominal variables in SVM?

Thanks for all the suggestions. My data was loaded in from a csv file with about 80 columns (3 of these columns are nominal) no specific settings for the nominal columns. Currently, if I call svm (e1071), I get an error about the nominal column. Do I need to tell R to change the column to a factor? i.e.
foo$color <- factor(foo$color)

On 8/12/09 2:21 PM, Achim Zeileis wrote:
> On Wed, 12 Aug 2009, Noah Silverman wrote:
>
>> Hi,
>>
>> The answers to my previous question about nominal variables has lead
>> me to a more important question.
>>
>> What is the "best practice" way to feed nominal variable to an SVM.
>
> As some of the previous posters have already indicated: The data
> structure for storing categorical (including nominal) variables in R
> is a "factor".
>
> Your comment about "truly nominal" is wrong. A character variable is a
> character variable, not necessarily a categorical variable.
> Categorical means that the answer falls into one of a finite number of
> known categories, known as "levels" in R's "factor" class.
>
> If you start out from character information:
>
> x <- c("red", "red", "blue", "green", "blue")
>
> You can turn it into a factor via:
>
> x <- factor(x, levels = c("red", "green", "blue"))
>
> R now knows how to do certain things with such a variable, e.g.,
> produces useful summaries or knows how to deal with it in regression
> problems:
>
> model.matrix(~ x)
>
> which seems to be what you asked for. Moreover, you don't need call
> this yourself but most regression functions in R will do that for you
> (including svm() in "e1071" or ksvm() in "kernlab", among others).
>
> In short: Keep your categorical variables as "factor" columns in a
> "data.frame" and use the formula interface of svm()/ksvm() and you are
> fine.
> Z
>
>
>> For example:
>> color = ("red, "blue", "green")
>>
>> I could translate that into an index so I wind up with
>> color= (1,2,3)
>>
>> But my concern is that the SVM will now think that the values are
>> numeric in "range" and not discrete conditions.
>>
>> Another thought would be to create 3 binary variables from the single
>> color variable, so I have:
>>
>> red = (0,1)
>> blue = (0,1)
>> green = (0,1)
>>
>> A example fed to the SVM would have one positive and two negative
>> values to indicate the color value:
>> i.e. for a blue example:
>> red = 0, blue =1 , green = 0
>>
>> Or, do any of the SVM packages intelligently handle this internally
>> so that I don't have to mess with it. If so, do I need to be
>> concerned about different "translation" of the data if the test data
>> set isn't exactly the same as the training set.
>> For example:
>> training data = color ("red, "blue", "green")
>> test data = color ("red, "green")
>>
>> How would I be sure that the "red" and "green" examples get encoded
>> the same so that the SVM is accurate?
>>
>> Thanks in advance!!
>>
>> -N
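The quoted worry about training and test data being encoded differently can be addressed by fixing the factor levels once. The sketch below uses only base R on made-up vectors and does not call svm() or ksvm(), so it shows the encoding, not a fitted model.

```r
train_color <- c("red", "blue", "green", "blue")   # made-up training values
test_color  <- c("red", "green")                   # "blue" does not occur in the test set

lev <- c("red", "green", "blue")                   # fix the set of categories once
train <- data.frame(color = factor(train_color, levels = lev))
test  <- data.frame(color = factor(test_color,  levels = lev))

# Same levels give the same dummy columns in the same order
X_train <- model.matrix(~ color, data = train)
X_test  <- model.matrix(~ color, data = test)
```

Because both factors share the same levels, the test matrix still contains a colorblue column (all zeros here), so "red" and "green" are encoded identically in both sets.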
Re: [R] Browser and Debug?
This article might help: http://www.biostat.jhsph.edu/~rpeng/docs/R-debug-tools.pdf

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Inchallah Yarab
Sent: Thursday, August 13, 2009 9:40 AM
To: r-help@r-project.org
Subject: [R] Browser and Debug?

Hi,
Can someone explain to me how to use browser and debug?
Thank you
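As a starting point alongside the article, here is a tiny made-up function showing where browser() and debug() fit. The interactive() guard is added so that the snippet also runs unattended.

```r
f <- function(x) {
  y <- x^2
  if (interactive()) browser()             # pauses here; inspect x and y at the Browse[1]> prompt
  y + 1
}
out <- f(3)

debug(f)                                   # from now on, every call to f() is stepped through
flagged <- isdebugged(f)
undebug(f)                                 # switch stepping off again
```

At the browser prompt, n runs the next statement, c continues to the end, and Q quits; see ?browser and ?debug.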
Re: [R] How to get the n (number of observations) per conditional group
We will only be able to help if you provide a reproducible example! I'm sure this is a simple one-liner, but it's hard to tell from your example what it should be. The functions table, length, tapply, and/or nrow may play a part though.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dax
Sent: Thursday, August 13, 2009 9:11 AM
To: r-help@r-project.org
Subject: [R] How to get the n (number of observations) per conditional group

Hello all,
I have a huge data set that I'm cleaning up a bit. I have extracted the means per condition, but also need to get the n. Strangely enough I am unable to find a function that could actually pull this off. I am unable to use replications or count the rows using nrow. What I am trying to obtain is the number of observations fitting something along the lines of
"data$total[data$spp==1&data$gyro==1&data$salinity==0&data$day==i]"
Can anyone help me?
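Since the poster's data are not shown, the sketch below invents a data frame with the same column names and shows two ways to get the n: for one condition, and for every combination at once. aggregate() is used here in addition to the functions named in the reply.

```r
# Made-up data with the poster's column names
set.seed(7)
data <- data.frame(spp      = sample(1:2, 40, TRUE),
                   gyro     = sample(0:1, 40, TRUE),
                   salinity = sample(c(0, 5), 40, TRUE),
                   day      = sample(1:3, 40, TRUE),
                   total    = rpois(40, 10))

# n for one specific combination: count the TRUEs of the condition
cond  <- data$spp == 1 & data$gyro == 1 & data$salinity == 0 & data$day == 2
n_one <- sum(cond)

# n and mean for every observed combination at once
n_all    <- aggregate(total ~ spp + gyro + salinity + day, data = data, FUN = length)
mean_all <- aggregate(total ~ spp + gyro + salinity + day, data = data, FUN = mean)
```

length(data$total[cond]) gives the same count as sum(cond), as long as the condition contains no NA values.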
Re: [R] Coding problem: How can I extract substring of function call within the function
Are you sure you don't just want to tell them about the :: operator? It sounds easier than what you're proposing. E.g.,
base::mean(c(1:10, NA))

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Pitt, Joel
Sent: Thursday, August 13, 2009 3:48 PM
To: r-help@r-project.org
Subject: [R] Coding problem: How can I extract substring of function call within the function

In order to ease my students into the R environment I am developing a package which installs a variety of utility functions as well as slightly modified versions of some standard R functions -- e.g. mean, hist, barplot. In my versions of these standard R functions I either add options or alter some defaults that seem to create difficulties for most of my students -- for example, when they do barcharts for two dimensional tables they generally want the bars to be side-by-side and stumble over the standard default of beside=F, and my version of mean by default reports a mean value in the presence of NA's after warning of that presence (but retains the option of setting na.rm=F). (I don't doubt that some (if not many) of you will doubt the wisdom of this, and I would be happy to discuss this in more detail on other occasions.)

You might want to think of my replacement R functions as a kind of "training wheels" for R, and, in the spirit of training wheels I include a function in my package that allows a user to revert to the standard version of one or all functions without unloading the package (and losing its additional functionality). However, I want to add a function that allows a user to revert to running the standard R version of a given function on a one-off basis and that's where my problem comes up.

I believe that it should be possible to write a function rStd with the usage rStd(x,...) where x is a function -- e.g. mean, hist, barchart, and the remaining parameters would be any of the parameters that should be passed to the unmodified version of mean, hist, barchart... The problem I have is how to get ahold of that collection of parameters as a single character string. Now I know that sys.calls()[[1]] will give me the full text of the initial call, but the problem is to detach the ... above from that as a text string. If I could do that I'd be done. Here's the incomplete code with comments -- see the gap set off by asterisks.

rStd=function(x,...){
if(missing(x)) # must have a specified function
{ cat("Error: No function specified\n"); return(invisible(NULL)); }
z=as.character(substitute(x));
# must include code here to check that z is the name
# of one of our altered functions
# if z is an altered function, e.g., "mean"
# then concatenating "x" with z gives the overlaid
# function -- e.g. xmean is the standard mean
#***
# Now we need to get a hold of the ... text
#
w=sys.calls()[[1]]; # this gets me the whole text of the call
#
# and now here's where the problem arises
# how do I get the ... text if I could get it and say
# assign it to the variable params
# I could then set
#
#***
cmd=sprintf("x%s(%s)",z,params) # see remarks above about z
# and then it's done
eval(parse(text=cmd),sys.frame());
}

Any help would be much appreciated.
Regards
Joel

Joel Pitt, Ph.D.
Associate Professor & Chair
Mathematics & Computer Science
Georgian Court University
900 Lakewood Avenue
Lakewood, NJ 08540
732-987-2322
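The :: suggestion, and one way to avoid string manipulation altogether, can be sketched as follows. This is not the poster's package: the override of mean is a made-up stand-in, and the pkg argument of rStd is an assumption added here because the standard functions live in different packages (base for mean, graphics for hist and barplot).

```r
# Hypothetical classroom override that masks base::mean
mean <- function(x, na.rm = TRUE, ...) base::mean(x, na.rm = na.rm, ...)

v <- c(1:10, NA)
masked   <- mean(v)                        # the override drops the NA
standard <- base::mean(v)                  # '::' reaches the original; the NA propagates

# Sketch of rStd without text manipulation: pass '...' straight through
rStd <- function(f, ..., pkg = "base") {
  fname <- as.character(substitute(f))     # e.g. "mean"
  std <- get(fname, envir = asNamespace(pkg))
  std(...)                                 # call the standard version with the user's arguments
}
via_rStd <- rStd(mean, v, na.rm = TRUE)

rm(mean)                                   # remove the override again
```

Forwarding ... directly means the arguments never have to be recovered as text, so sys.calls(), sprintf() and eval(parse()) are not needed.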
Re: [R] help with median for each row
Edward, In general, if you have an nxn matrix, you can use the "apply" function to apply a function to each row of the matrix, and return the result. So, as a start, you could do, apply(your.mat, 1, median) or apply(your.mat, 1, median, na.rm = TRUE) if you want to pass further arguments to median... You can also write your own function and pass it in, of course: E.g., apply(your.mat, 1, function(x) sum(x)+1001) See ?apply. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Edward Chen Sent: Friday, August 21, 2009 11:50 AM To: r-help@r-project.org Subject: [R] help with median for each row Hi, I tried looking through google search on whether there's a way to computer the median for each row of a nxn matrix and return the medians for each row for further computation. And also if the number of columns in the matrix are even, how could I specify which median to use? Thank you very much! -- Edward Chen [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
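The second part of the question (which median to use when the number of columns is even) can be handled by passing your own function to apply. This is a small sketch; the helper names lower.median and upper.median are made up for illustration. median() itself averages the two middle values in the even case.

```r
# Small example matrix with an even number of columns (4)
your.mat <- matrix(c(1, 2, 3, 4,
                     8, 6, 7, 5), nrow = 2, byrow = TRUE)

# Default: the mean of the two middle values when the row length is even
row.medians <- apply(your.mat, 1, median)

# Hypothetical helpers that pick the lower or the upper middle value instead
lower.median <- function(x) sort(x)[floor((length(x) + 1) / 2)]
upper.median <- function(x) sort(x)[ceiling((length(x) + 1) / 2)]

row.lower <- apply(your.mat, 1, lower.median)
row.upper <- apply(your.mat, 1, upper.median)
```

For rows of odd length all three versions return the same middle value.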
Re: [R] table function
You need to create a factor that indicates which group the values in 'z' belong to. The easiest way to do that based on your situation is to use the 'cut' function to construct the factor, and then call 'table' using the result created by 'cut'. See ?cut and ?factor

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Inchallah Yarab Sent: Monday, August 24, 2009 10:59 AM To: r-help@r-project.org Subject: [R] table function

hi, i want to use the function table to build a table not of frequency (the number of times a variable is repeated in a list or a data frame!!) but as a function of classes. I don't find a clear explanation in the examples of ?table !!!

example

x y z
1 0 100
5 1 1500
6 1 1200
2 2 500
1 1 3500
5 2 2000
8 5 4500

i want to do a table summarizing the number of values where z is in [0-1000], [1000-3000], [> 3000]

thank you very much for your help
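The cut-then-table approach can be sketched on the data from the question as follows. The break points and labels are one reading of the requested classes; note that with the default right = TRUE a value of exactly 1000 falls into the first class.

```r
dat <- data.frame(x = c(1, 5, 6, 2, 1, 5, 8),
                  y = c(0, 1, 1, 2, 1, 2, 5),
                  z = c(100, 1500, 1200, 500, 3500, 2000, 4500))

# Build a factor saying which class each z value belongs to
z.class <- cut(dat$z,
               breaks = c(0, 1000, 3000, Inf),
               labels = c("0-1000", "1000-3000", ">3000"),
               include.lowest = TRUE)

# Count the values per class
class.counts <- table(z.class)
print(class.counts)
```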
Re: [R] Unique command not deleting all duplicate rows
I really don't think this is the issue. I think the issue is that some columns of the data.frame, specifically V1, V2, and V4 should be checked versus R FAQ 7.31. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Don McKenzie Sent: Monday, August 24, 2009 1:35 PM To: Mehdi Khan Cc: r-help@r-project.org Subject: Re: [R] Unique command not deleting all duplicate rows duplicated() > test.df V1 V2 V3 V4 V5 V6 V7 1 -115.380 32.894 195 162.940 D 8419 D 2 -115.432 32.864 115 208.910 D 8419 D 3 -115.447 32.773 1170 264.570 D 8419 D 4 -115.447 32.773 1170 264.570 D 8419 D 5 -115.447 32.773 1170 264.570 D 8419 D 6 -115.447 32.773 1170 264.570 D 8419 D 7 -115.447 32.773 149 186.210 D 8419 D 8 -115.466 32.855 114 205.630 D 8419 D 9 -115.473 32.800 1121 207.469 D 8419 D > test.df[!duplicated(test.df),] V1 V2 V3 V4 V5 V6 V7 1 -115.380 32.894 195 162.940 D 8419 D 2 -115.432 32.864 115 208.910 D 8419 D 3 -115.447 32.773 1170 264.570 D 8419 D 7 -115.447 32.773 149 186.210 D 8419 D 8 -115.466 32.855 114 205.630 D 8419 D 9 -115.473 32.800 1121 207.469 D 8419 D On 24-Aug-09, at 11:23 AM, Mehdi Khan wrote: > Hello everyone, when I run the "unique" command on my data frame, > it deletes > the majority of duplicate rows, but not all of them. Here is a > sample of my > data. How do I get it to delete all the rows? > > 6 -115.38 32.894 195 162.94 D 8419 D > > 7 -115.432 32.864 115 208.91 D 8419 D > > 8 -115.447 32.773 1170 264.57 D 8419 D > > 9 -115.447 32.773 1170 264.57 D 8419 D > > 10 -115.447 32.773 1170 264.57 D 8419 D > > 11 -115.447 32.773 1170 264.57 D 8419 D > > 12 -115.447 32.773 149 186.21 D 8419 D > > 13 -115.466 32.855 114 205.63 D 8419 D > > 14 -115.473 32.8 1121 207.469 D 8419 D > > > Thanks a bunch! 
> > Mehdi Khan > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. Don McKenzie, Research Ecologist Pacific WIldland Fire Sciences Lab US Forest Service Affiliate Professor School of Forest Resources, College of the Environment CSES Climate Impacts Group University of Washington desk: 206-732-7824 cell: 206-321-5966 d...@u.washington.edu donaldmcken...@fs.fed.us __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
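The R FAQ 7.31 point can be illustrated with a small sketch: two numbers that print identically need not be equal as doubles, so unique() keeps both. Rounding the numeric columns before calling unique() is one possible workaround; the helper name dedup.rounded and the choice of 3 digits are assumptions made for illustration, not something from the thread.

```r
a <- 0.3
b <- 0.1 + 0.2

a == b                    # FALSE: the two doubles differ in the last bits
isTRUE(all.equal(a, b))   # TRUE: equal up to a small tolerance
length(unique(c(a, b)))   # 2: unique() compares exactly

# Hypothetical helper: round numeric columns, then drop duplicate rows
dedup.rounded <- function(df, digits = 3) {
  num <- sapply(df, is.numeric)
  df[num] <- lapply(df[num], round, digits = digits)
  unique(df)
}

test.df <- data.frame(V1 = c(a, b), V5 = c("D", "D"))
dedup.rounded(test.df)
```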
Re: [R] table, xyplot, names, & loops
Hello,

"I'm just starting out in R and have a basic question about xyplot and tables. Suppose I had a table of data with the following names: Height, Age_group, City. I'd like to plot mean Height vs Age_group for each City"

You did not provide a sample data.frame, so I generated one. This example is basically borrowed directly from Figure 4.3 in Sarkar's excellent book, "Lattice". Personally, I do feel that a plot such as the one below is a better display choice than a simple table of the means, but some may disagree. Please note that my random data do not contain effects for either age.group or city, so my guess is that your resulting plot will look cleaner (i.e., contain some visual signal).

## BEGIN SAMPLE R CODE
library(lattice) ## provides dotplot
## create sample data.frame
df <- data.frame(height <- rnorm(1000, 10),
                 age.group <- sample(gl(10, 100, labels = paste("Age Group", 1:10))),
                 city <- sample(gl(4, 250, labels = paste("City", 1:4))))
## tabulate the data in matrix form
h.tab <- with(df, tapply(height, list(age.group, city), mean))
## use dotplot with the matrix object
dotplot(h.tab, type = "o", auto.key = list(lines = TRUE, space = "right"), xlab = "height")
## END SAMPLE R CODE

Best, Erik Iverson __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] table, xyplot, names, & loops
And of course I did not test this :). Within the data.frame argument list, please change the <- operators to = signs. Then it should work. Erik -Original Message- From: Erik Iverson Sent: Tuesday, August 25, 2009 1:17 PM To: 'w_poet'; r-help@r-project.org Subject: RE: [R] table, xyplot, names, & loops Hello, "I'm just starting out in R and have a basic question about xyplot and tables. Suppose I had a table of data with the following names: Height, Age_group, City. I'd like to plot mean Height vs Age_group for each City" You did not provide a sample data.frame, so I generated one. This example is basically borrowed directly from Figure 4.3 in Sarkar's excellent book, "Lattice". Personally, I do feel that a plot such as the one below is a better display choice than a simple table of the means, but some may disagree. Please note that my random data do not contain effects for either age.group or city, so my guess is that your resulting plot will look cleaner (i.e., contain some visual signal.) ## BEGIN SAMPLE R CODE ## create sample data.frame df <- data.frame(height <- rnorm(1000, 10), age.group <- sample(gl(10,100, labels = paste("Age Group", 1:10))), city <- sample(gl(4, 250, labels = paste("City", 1:4 ## tabulate the data in matrix form h.tab <- with(df, tapply(height, list(age.group, city), mean)) ## use dotplot with the matrix object dotplot(h.tab, type = "o", auto.key = list(lines = TRUE, space = "right"), xlab = "height") ## END SAMPLE R CODE Best, Erik Iverson __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
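With the correction applied (= instead of <- inside the data.frame call), the example would look roughly like this. The set.seed call and the explicit print() are additions for reproducibility and for running the code as a script; they are not part of the original post.

```r
library(lattice)  # provides dotplot

set.seed(1)  # added here only so the random data are reproducible

## create sample data.frame, naming the columns with '='
df <- data.frame(height    = rnorm(1000, 10),
                 age.group = sample(gl(10, 100, labels = paste("Age Group", 1:10))),
                 city      = sample(gl(4, 250, labels = paste("City", 1:4))))

## tabulate the mean height per age group and city in matrix form
h.tab <- with(df, tapply(height, list(age.group, city), mean))

## use dotplot with the matrix object
print(dotplot(h.tab, type = "o",
              auto.key = list(lines = TRUE, space = "right"),
              xlab = "height"))
```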
Re: [R] Managing output
How about ?append, but R is vectorized, so why not just result_list <- 2*item^2 , or for more complicated tasks, the apply/sapply/lapply/mapply family of functions? In general, the "for" loop construct can be avoided so you don't have to think about messy indexing. What exactly are you trying to do? -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Noah Silverman Sent: Wednesday, August 26, 2009 2:20 PM To: r help Subject: [R] Managing output Hi, Is there a way to build up a vector, item by item. In perl, we can "push" an item onto an array. How can we can do this in R? I have a loop that generates values as it goes. I want to end up with a vector of all the loop results. In perl it woud be: for(item in list){ result <- 2*item^2 (Or whatever formula, this is just a pseudo example) Push(@result_list, result) (This is the step I can't do in R) } Thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
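As a small sketch of the alternatives mentioned in the reply, here are three ways to get the vector of results for the pseudo example 2*item^2. The input values are made up; all three give the same result, and the vectorized form is usually the simplest.

```r
items <- c(1, 2, 3, 4)

## 1. Vectorized arithmetic, no loop needed
result.vec <- 2 * items^2

## 2. sapply, for functions that are not vectorized
result.sapply <- sapply(items, function(item) 2 * item^2)

## 3. An explicit loop, filling a pre-allocated vector by index
result.loop <- numeric(length(items))
for (i in seq_along(items)) {
  result.loop[i] <- 2 * items[i]^2
}

## A literal "push" also works, but it copies the vector on every step:
## result <- c(result, new.value)
```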
Re: [R] Simple column selection question- which and character lists
1) Don't call your data.frame "data". I will call my "example" one "df". 2) If you want the columns NOT in names.species.bio.18, which is what you said, then the answer is: df[!names(df) %in% names.species.bio.18] Best, Erik -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of AllenL Sent: Monday, August 31, 2009 11:40 AM To: r-help@r-project.org Subject: [R] Simple column selection question- which and character lists Dear R-list, Seems simple but have tried multiple approaches, no luck. I have a list of column names: >names.species.bio.18=c("Achimillb","Agrosmitb","Amorcaneb","Andrgerab","Ascltubeb","Elymcanab","Koelcrisb","Lespcapib","Liataspeb","Lupipereb","Monafistb","Panivirgb","Petapurpb","Poaprateb","Querellib","Quermacrb","Schiscopb","Sorgnutab") I want to select the column numbers which correspond to these names in my data frame: >which(colnames(data)==names.species.bio.18) Result: +[1] 75 76 +Warning message: +In cols == names.species.bio.18 : + longer object length is not a multiple of shorter object length So I get the first two hits and then it trips an error message. What is >which doing? Why does it seem to have trouble with vectors of characters? My goal is to output the column names/indices which correspond to the columns NOT in the above list, but that is simple once I can find out what they are. Thanks! -Allen -- View this message in context: http://www.nabble.com/Simple-column-selection-question--which-and-character-lists-tp25226500p25226500.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
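To see why == gave the warning and %in% does not: == compares the two vectors element by element (recycling the shorter one), whereas %in% asks, for each column name, whether it occurs anywhere in the list. A small sketch with a made-up data frame (the Other1 and Other2 columns are invented for illustration):

```r
df <- data.frame(Achimillb = 1:3, Other1 = 4:6,
                 Agrosmitb = 7:9, Other2 = 10:12)
wanted <- c("Achimillb", "Agrosmitb")

## positions of the columns that are in the list
in.list <- which(names(df) %in% wanted)

## positions and contents of the columns NOT in the list
not.in.list <- which(!names(df) %in% wanted)
rest <- df[!names(df) %in% wanted]
```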
[R] Offtopic, HT vs. HH in coin flips
Dear R-help, Could someone please try to explain this paradox to me? What is more likely to show up first in a string of coin tosses, "Heads then Tails", or "Heads then Heads"? ##generate 2500 strings of random coin flips ht <- replicate(2500, paste(sample(c("H", "T"), 100, replace = TRUE), collapse = "")) ## find first occurrence of HT mean(regexpr("HT", ht))+1#mean of HT position, 4 ## find first occurrence of HH mean(regexpr("HH", ht))+1#mean of HH position, 6 FYI, this is not homework, I have not been in school in years. I saw a similar problem posed in a blog post on the Revolutions R blog, and although I believe the answer, I'm having a hard time figuring out why this should be? Thanks, Erik Iverson __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Offtopic, HT vs. HH in coin flips
Part of my issue was that I was not answering my original question. "What is more likely to show up first, HT or HH?" The answer to that turns out to be "neither", or "identical chances".

ht <- replicate(2500, paste(sample(c("H", "T"), 100, replace = TRUE), collapse = ""))
hts <- regexpr("HT", ht) + 1
hhs <- regexpr("HH", ht) + 1
## which is first?
table(hts < hhs) # about 50/50
summary(hts) # mean of 4
summary(hhs) # mean of 6

So, "What is more likely to show up first, HH or HT?" is of course a different question than "Are the expected values of the positions for the first HT or HH the same?" I suppose that's where confusion set in. It seems as though, if HH appears later in the string on average (i.e., after 6 tosses instead of 4), the probability of it being first should be lower than that of HT, which is obviously wrong!

A quick graphic that helps show this (you must run the above code first):

library(lattice)
## one label per observation: all the HT positions first, then all the HH positions
ht.df <- data.frame(count = c(hts, hhs), type = gl(2, length(hts), labels = c("HT", "HH")))
barchart(prop.table(xtabs(~ count + type, data = ht.df)), stack = FALSE, horizontal = FALSE, box.ratio = .8, auto.key = TRUE)

Thanks to all those who replied. Someone also sent me the following link off list, which clears up the confusion as well: http://www.mit.edu/~emin/writings/coinGame.html

Best, Erik

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erik Iverson Sent: Monday, August 31, 2009 2:17 PM To: r-help@r-project.org Subject: [R] Offtopic, HT vs. HH in coin flips

Dear R-help, Could someone please try to explain this paradox to me? What is more likely to show up first in a string of coin tosses, "Heads then Tails", or "Heads then Heads"?
##generate 2500 strings of random coin flips ht <- replicate(2500, paste(sample(c("H", "T"), 100, replace = TRUE), collapse = "")) ## find first occurrence of HT mean(regexpr("HT", ht))+1#mean of HT position, 4 ## find first occurrence of HH mean(regexpr("HH", ht))+1#mean of HH position, 6 FYI, this is not homework, I have not been in school in years. I saw a similar problem posed in a blog post on the Revolutions R blog, and although I believe the answer, I'm having a hard time figuring out why this should be? Thanks, Erik Iverson __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
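The simulated means of 4 and 6 can also be checked by hand. The following is a sketch of the standard first-step argument for a fair coin, where $E$ is the expected number of tosses until the pattern is complete and $E_H$ is the expected number of further tosses once the last toss was a head (the symbols are introduced here, not taken from the thread):

```latex
\begin{align*}
\text{HT:}\quad E_H &= 1 + \tfrac12\cdot 0 + \tfrac12 E_H \;\Rightarrow\; E_H = 2,
 & E &= 1 + \tfrac12 E_H + \tfrac12 E \;\Rightarrow\; E = 4,\\
\text{HH:}\quad E_H &= 1 + \tfrac12\cdot 0 + \tfrac12 E,
 & E &= 1 + \tfrac12 E_H + \tfrac12 E \;\Rightarrow\; E_H = 4,\; E = 6.
\end{align*}
```

The difference is that a tail after a head completes HT, while for HH it throws the search back to the start. Which pattern comes first is decided by the single toss after the first head, so each has probability 1/2.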
Re: [R] Function to find angle between coordinates?
?atan2 is a possible starting point. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of clair.crossup...@googlemail.com Sent: Tuesday, September 01, 2009 8:09 AM To: r-help@r-project.org Subject: [R] Function to find angle between coordinates? Dear all, I was doing some self study and was wondering if a function already exists which allows one to determine the angle between points. e.g. given the following (x,y) coordinates input: (0,1); (0,0); (1,0) would result in: output: 90 degrees Best regards C.C. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
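Building on the atan2 hint, a possible sketch: take the direction of each outer point as seen from the middle point and subtract the two angles. The function name angle.deg and its argument names are made up for illustration.

```r
## Angle (in degrees, between 0 and 180) at 'vertex' formed by points p1 and p2
angle.deg <- function(p1, vertex, p2) {
  v1 <- p1 - vertex
  v2 <- p2 - vertex
  ang <- atan2(v2[2], v2[1]) - atan2(v1[2], v1[1])  # difference of directions, in radians
  deg <- abs(ang) * 180 / pi
  if (deg > 180) 360 - deg else deg                 # report the smaller of the two angles
}

## the example from the question: (0,1); (0,0); (1,0)
angle.deg(c(0, 1), c(0, 0), c(1, 0))
```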
Re: [R] cbind objects using character vectors
Not tested: Instead of: cbind(vec.names[1], vec.names[2]) cbind(get(vec.names[1]), get(vec.names[2])) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of jonas garcia Sent: Tuesday, September 01, 2009 12:53 PM To: r-help@r-project.org Subject: [R] cbind objects using character vectors Dear list, I have a character vector such vec.names<- c("a", "b") It happens that I have also two R objects called "a" and "b" that I would like to merge. Is it possible to do something like cbind(vec.names[1], vec.names[2]) ending up with the same result as cbind(a,b) Bellow is a reproducible example of what I need to to: dat<- data.frame(A=seq(1,5), B=seq(6,10)) vec.names<- c("a", "b") for(i in 1:ncol(dat)) { tab<- dat[,i]-1 assign(vec.names[i], tab) } cbind(vec.names[1], vec.names[2]) [,1] [,2] [1,] "a" "b" But I was looking after the following result (using vec.names): cbind(a,b) a b [1,] 0 5 [2,] 1 6 [3,] 2 7 [4,] 3 8 [5,] 4 9 Thanks in advance Jonas [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
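The get() suggestion can be tried on the example from the question. For more than two names, mget() fetches all the objects at once and do.call() passes them to cbind; this is offered as one possible variant, not as the only way to do it.

```r
dat <- data.frame(A = seq(1, 5), B = seq(6, 10))
vec.names <- c("a", "b")

## create the objects 'a' and 'b' as in the question
for (i in seq_along(vec.names)) {
  assign(vec.names[i], dat[, i] - 1)
}

## two names: look each object up by name
two <- cbind(get(vec.names[1]), get(vec.names[2]))

## any number of names: fetch them all, then bind (column names are kept)
combined <- do.call(cbind, mget(vec.names, envir = environment()))
```

Keeping the vectors in a named list from the start, instead of using assign(), would make the lookup by name unnecessary.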
Re: [R] Date format in plot
We will need a reproducible example! Please give us R commands that display the behavior you're observing: For example, I am having trouble understanding the as.Date function. When I input 39939, I would like to get "06.05.2009", but when I try it, I get > as.Date(39939) Error in as.Date.numeric(39939) : 'origin' must be supplied I looked up what origin Excel uses for its' dates, and it seems like it might be January 1, 1900, so I tried as.Date(39939, origin = "1900-01-01") [1] "2009-05-08" Then we will much better be able to help you, because we will be able to paste your commands into R and see the results and make changes. But this still seems to be off by two days. So did you really mean "06.05", or "08.05"? -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of swertie Sent: Tuesday, September 01, 2009 12:59 PM To: r-help@r-project.org Subject: [R] Date format in plot Hello, I plot the abundance of a species in relation to the date. To have the date as a continous variable I put it in the format "standard" in excel (f.ex. 39939 means 06.05.2009). R uses 39939 on the x axis, but I would like to have "06.05". I tried to use as.Date as suggested in some discussion but I don't manage to use it, the returned date is not correct. Do you have any clue? thank you __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
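On the two-day offset: the Examples section of ?as.Date suggests origin = "1899-12-30" for serial dates from Excel on Windows, and with that origin 39939 gives the date the poster expected. This is a sketch under that assumption; a workbook using the 1904 date system would need a different origin.

```r
excel.serial <- 39939

## convert the Excel serial number to a Date
d <- as.Date(excel.serial, origin = "1899-12-30")

## day.month label, e.g. for axis tick labels
label <- format(d, "%d.%m")
```

Once the column is stored as Date, plot() should label the x axis with dates rather than serial numbers, and format() can be used to build custom labels for axis().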
Re: [R] strange results in summary and IQR functions
It's all simply a matter of definitions, and there are many who disagree. See ?quantile, specifically the "type" argument. Since IQR does not appear to have a type argument, you could easily write your own versions of these that do what SAS does (assuming that is your goal). With x defined as you have it, look at the results of this function call, which shows the different values for quantile that you get by using different "type" arguments.

> sapply(1:9, function(y) quantile(x, type = y))
     [,1] [,2] [,3] [,4] [,5]  [,6]  [,7]     [,8]    [,9]
0%      2    2    2  2.0    2  2.00  2.00  2.00000  2.0000
25%    11   11    4  7.5   11  9.25 11.25 10.41667 10.5625
50%    13   14   13 13.0   14 14.00 14.00 14.00000 14.0000
75%    31   31   31 31.0   31 32.50 31.00 31.50000 31.3750
100%   47   47   47 47.0   47 47.00 47.00 47.00000 47.0000

Best, Erik Iverson

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Chunhao Tu Sent: Tuesday, September 08, 2009 10:09 AM To: r-help@r-project.org Subject: [R] strange results in summary and IQR functions

Dear R users, Something is strange in summary and IQR. Suppose, I have a data set and I would like to find the Q1, Q2, Q3 and IQR.

x<-c(2,4,11,12,13,15,31,31,37,47)
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   2.00   11.25   14.00   20.30   31.00   47.00
> IQR(x)
[1] 19.75

However, I test the same data set in SAS "proc univariate", and SAS shows that Q1=11, Q2=14 and Q3=31. I think most of us agree that Q1 is 11 not 11.25. Could someone please explain to me why R shows Q1=11.25 not 11?

Many Thanks Tu -- View this message in context: http://www.nabble.com/strange-results-in-summary-and-IQR-functions-tp25348079p25348079.html Sent from the R help mailing list archive at Nabble.com.
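For this particular data set, type = 2 reproduces the quartiles the poster reports from SAS (11, 14, 31). Whether it matches SAS in general depends on the percentile definition SAS is configured to use, so treat the choice of type below as an assumption. The helper name iqr.type is made up for illustration.

```r
x <- c(2, 4, 11, 12, 13, 15, 31, 31, 37, 47)

## quartiles under quantile definition 2
q2 <- quantile(x, probs = c(0.25, 0.5, 0.75), type = 2)

## hypothetical helper: interquartile range for a chosen quantile type
iqr.type <- function(x, type = 7) {
  unname(diff(quantile(x, probs = c(0.25, 0.75), type = type)))
}

iqr.type(x)            # default definition (type 7), as used by summary()
iqr.type(x, type = 2)  # based on Q1 = 11 and Q3 = 31
```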
Re: [R] Count number of different patterns (Polytomous variable)
If your data.frame was called "test" below, nrow(unique(test)) would do what you want, I believe. Erik -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of "Biedermann, Jürgen" Sent: Tuesday, September 08, 2009 9:24 AM To: r-help@r-project.org Subject: [R] Count number of different patterns (Polytomous variable) Hi there, Does anyone know a method to calculate the number of different patterns in a given data frame. The variables are of polytomous type and not binary (for the latter i found a package called "countpattern" which unfortunately only functions for binary variables). V1 V2 V3 0 3 1 1 2 0 1 2 0 So, in this case, i would like to get "2" as output. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
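Applied to the three rows from the question, the suggestion looks like this. The last line goes one step further and counts how often each pattern occurs; pasting the columns together with "-" is just one convenient way to label a row.

```r
test <- data.frame(V1 = c(0, 1, 1),
                   V2 = c(3, 2, 2),
                   V3 = c(1, 0, 0))

## number of distinct rows (patterns)
n.patterns <- nrow(unique(test))

## how often each pattern occurs, labelled as "V1-V2-V3"
pattern.counts <- table(do.call(paste, c(test, sep = "-")))
```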
Re: [R] The code behind the function
-- How can I see the code behind the function. For example, > boxplot function (x, ...) UseMethod("boxplot") I really would like to see how people code this. -- That *is* the code. "boxplot" is a generic function, which calls another function based on what class of object you pass to it. You can find out which classes the boxplot method works on by using > methods("boxplot") [1] boxplot.default boxplot.formula* Non-visible functions are asterisked Next, try typing > boxplot.default to see the implementation for the default function, and then since boxplot.formula is hidden, usually getAnywhere("boxplot.formula") will retrieve it. Section 10.9 of An Introduction to R explains this more: http://cran.r-project.org/doc/manuals/R-intro.html#Object-orientation Best Regards, Erik Iverson __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
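A toy generic makes the mechanism easier to see. The function describe and its methods are invented purely for illustration: the generic contains nothing but the UseMethod call, and the real work sits in the class-specific methods, just as with boxplot.

```r
## the generic: only dispatches on the class of its first argument
describe <- function(x, ...) UseMethod("describe")

## methods: named <generic>.<class>
describe.numeric <- function(x, ...) "a numeric vector"
describe.default <- function(x, ...) "something else"

describe(c(1.5, 2))  # dispatches to describe.numeric
describe("text")     # no character method, so describe.default is used

## list the methods and look at one of them
methods("describe")
getAnywhere("describe.numeric")
```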
Re: [R] Joining Characters in R {issue with paste}
And did you read the help file, ?paste , paying attention to the arguments and their descriptions, specifically the "sep" argument? Presumably, you want, paste(a, b, sep = "") -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Abhishek Pratap Sent: Wednesday, September 09, 2009 4:09 PM To: r-help@r-project.org Subject: [R] Joining Characters in R {issue with paste} Hi Guys I am want to join to strings in R. I am using paste but not getting desirable result. For the sake of clarity, a quick example: > a="Bio" > b="iology" > paste(a,b) [1] "Bio iology" *There is a SPACE in the word biology which is what I dont want * Thanks, -Abhi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with for loop
It is difficult to know what you're trying to do here, I think. Is this it? You almost surely don't need a for loop to accomplish your task, and should make use of the pre-existing vectorized functions provided to you. a <- c(4, 5, 1, 7, 8, 12, 39) b <- c(3, 7, 8, 4, 7, 25, 78) d <- a - b which(d > 0) Erik -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Edward Chen Sent: Monday, September 14, 2009 1:50 PM To: r-help@r-project.org Subject: [R] Help with for loop I have a code: *a = c(4,5,1,7,8,12,39) b = c(3,7,8,4,7,25,78) d =a-b for(i in 1:length(d)){ if(d[i]>0){x = list(d[i]) print(x)} else{y = list(d[i]) print(y)}} the results are: [[1]] [1] 1 [[1]] [1] -2 [[1]] [1] -7 [[1]] [1] 3 [[1]] [1] 1 [[1]] [1] -13 [[1]] [1] -39 which will tell me what d is. but is it possible to output the order in which the difference is in the vector d? for example I would want to see x = 1,3,1 and they are from d[1], d[4], d[5]. This is just a crude example I thought of to help me do something more complicated. Thank you very much! * -- Edward Chen Email: edche...@gmail.com Cell Phone: 510-371-4717 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
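To get both the positive differences and the positions they came from, as asked in the question, the which() result can be used as an index. A short sketch on the poster's own vectors:

```r
a <- c(4, 5, 1, 7, 8, 12, 39)
b <- c(3, 7, 8, 4, 7, 25, 78)
d <- a - b

pos.idx <- which(d > 0)   # positions in d with a positive difference
pos.val <- d[pos.idx]     # the differences at those positions

## positions and values side by side
data.frame(index = pos.idx, value = pos.val)
```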
Re: [R] average for files and graph
It would be even greater if you could get us started with some commented, minimal, self-contained, reproducible code. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of kylle345 Sent: Monday, September 14, 2009 1:35 PM To: r-help@r-project.org Subject: [R] average for files and graph Hi, I have 5 different files. Each file has about 1000 columns and about 3000 rows. Basically what I want to do is take the average of the 1000 columns and then graph it (line graph). How would I do this for 5 files at the same time and plot the average of the 5 files into one graph. it would be great if you can get me started with a quick code. thanks kylle -- View this message in context: http://www.nabble.com/average-for-files-and-graph-tp25440679p25440679.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Putting together a constantly evolving package
Steve, I can't speak to your exact question, but perhaps suggest a simple alternative. What I do is simply make changes to the .R file containing my code, and use the "source" function to read in the new definitions of my functions while I'm tweaking them. Then, at the end of the day, I do my R CMD INSTALL just once. If you use ESS, this is particularly easy since you can just type C-c C-l to source the current .R file into a running *R* process. Also, see the Examples section of ?source for a function that sources a bunch of files, presumably all the .R files in your package, at once. I hope that might help, Erik -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Steve Lianoglou Sent: Tuesday, September 15, 2009 2:08 PM To: R-help@r-project.org Subject: [R] Putting together a constantly evolving package Hi all, I'm putting together some common code + data into a custom package, everything is working out fine, but the ``R CMD INSTALL MyPackage`` call seems to take a particularly long time in the "**data" step: $ R CMD INSTALL MyPackage/ * installing to library '/Library/Frameworks/R.framework/Resources/ library' * installing *source* package ' MyPackage' ... ** R ** data (here) I have a handful of not-very-big *.rda files in my data dir, but also a rather large sqlite db. Is R trying to do anything in particular to my data during the install? index or something? Is there anything I can do to make this step go faster? If this were a 1-time install, it wouldn't matter, but since this package is evolving as I'm using it, I find myself constantly needing to tweak some code here, or change something there, and this always requires another round of R CMD INSTALLing ... Is there something I can do to make this cycle turn around quicker? How do you guys deal with growing a package organically during your analyses? 
For this particular situation, I reckon I can create a separate package for my dataset since its static (and I might do eventually down the road, anyway), but I'm wondering if there are other alternatives. Thanks, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Viewing Function Code
See the reference to ?getAnywhere in the following post: http://www.nabble.com/The-code-behind-the-function-td25370743.html -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Pearmain Sent: Tuesday, September 15, 2009 3:14 PM To: r-help@r-project.org Subject: [R] Viewing Function Code Hi All, I'd like to see the function code behind the barplots2() function in the gplots package, however i come across a bit of a stumbling block of a hidden function, can anyone help? > library(gplots) > methods(barplot2) [1] barplot2.default* Non-visible functions are asterisked > barplot2 function (height, ...) UseMethod("barplot2") Mike -- Michael Pearmain Senior Analytics Research Specialist "I abhor averages. I like the individual case. A man may have six meals one day and none the next, making an average of three meals per day, but that is not a good way to live. ~Louis D. Brandeis" Google UK Ltd Belgrave House 76 Buckingham Palace Road London SW1W 9TQ United Kingdom t +44 (0) 2032191684 mpearm...@google.com If you received this communication by mistake, please don't forward it to anyone else (it may contain confidential or privileged information), please erase all copies of it, including all attachments, and please let the sender know it went to the wrong person. Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to remove 'NA's?
Well, how about the nomatch argument to the match function? See ?match. The nomatch argument is NA by default.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peng Yu
Sent: Tuesday, September 15, 2009 4:05 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] How to remove 'NA's?

Hi,

> match(c(3,4), c(3,2,1))
[1] 1 NA

The above result has an 'NA' in it. Is there a way to make 'match' not produce any 'NA's?

Regards,
Peng
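A short sketch of the nomatch suggestion, using the vectors from the question. The variable names (lookup, idx, found, kept) are made up for illustration, and which variant is appropriate depends on what the positions are used for afterwards.

```r
x <- c(3, 4)
lookup <- c(3, 2, 1)

# Replace the default NA with 0 for values that are not found
idx <- match(x, lookup, nomatch = 0)
idx

# A zero index selects nothing, so unmatched values simply drop out
lookup[idx]

# If only a found / not found answer is needed, %in% never returns NA
found <- x %in% lookup

# Alternatively, keep the default and discard the NA positions afterwards
pos <- match(x, lookup)
kept <- pos[!is.na(pos)]
```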
Re: [R] R console line-wrapping
See ?options, particularly the "width" setting. > options(width=200) Might do what you want, by default it is 80... Best, Erik Iverson -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nick Matzke Sent: Tuesday, September 15, 2009 4:47 PM To: r-h...@stat.math.ethz.ch Subject: [R] R console line-wrapping Hi all, a quick question I couldn't find the answer to in the usual places: Is there a way to turn off line-wrapping in the R console? Or set the line width-before-wrapping manually? Currently it looks like the console linewraps after about 70 characters, this occurs even if I increase the window size. (I want to output some simple tables to screen for students in a computer lab course) I am running R2.9, i.e. R.app, in Mac OS X. Cheers! Nick -- Nicholas J. Matzke Ph.D. Candidate, Graduate Student Researcher Huelsenbeck Lab Center for Theoretical Evolutionary Genomics 4151 VLSB (Valley Life Sciences Building) Department of Integrative Biology University of California, Berkeley Lab websites: http://ib.berkeley.edu/people/lab_detail.php?lab=54 http://fisher.berkeley.edu/cteg/hlab.html Dept. personal page: http://ib.berkeley.edu/people/students/person_detail.php?person=370 Lab personal page: http://fisher.berkeley.edu/cteg/members/matzke.html Lab phone: 510-643-6299 Dept. fax: 510-643-6264 Cell phone: 510-301-0179 Email: mat...@berkeley.edu Mailing address: Department of Integrative Biology 3060 VLSB #3140 Berkeley, CA 94720-3140 - "[W]hen people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together." Isaac Asimov (1989). "The Relativity of Wrong." The Skeptical Inquirer, 14(1), 35-44. Fall 1989. 
http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm
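For the line-wrapping answer in this thread, a minimal sketch of changing the "width" option and restoring it afterwards. The value 120 is arbitrary, and whether a particular console front end resets the width itself is not something this sketch checks.

```r
# options() returns the previous settings, so they can be restored later
old <- options(width = 120)
getOption("width")

# Printed vectors now wrap at (roughly) 120 characters instead of 80
print(seq(1000, 30000, by = 1000))

# Put the original setting back
options(old)
```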
Re: [R] T-test to check equality, unable to interpret the results.
Robert, We unfortunately do not have enough information to help you interpret the results, and this is not really an R question at all, but general statistical advice. You will probably have much better understanding and confidence in your results by consulting a local statistical consultant at your university. The values shown for sample 1 and sample 2 lead me to believe that these are not drawn from a homogeneous population. Whoever ends up helping you is going to need to know how these measurements were obtained in much greater detail than you've given here. Best Regards, Erik Iverson -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Robert Hall Sent: Wednesday, September 16, 2009 1:55 PM To: r-help Subject: [R] T-test to check equality, unable to interpret the results. Hi, I have the precision values of a system on two different data sets. The snippets of these results are as shown: sample1: (total 194 samples) 0.600238 0.800119 0.600238 0.200030 0.600238 ... ... sample2: (total 188 samples) 0.8001 0.2000 0.8001 0. 0.8001 0.4001 ... ... I want to check if these results are statistically significant? Intuitively, the similarity in the two results mean the results are statistically significant. I am using the t-test t.test(sample1,sample2)to check for similarity amongst the two results. I get the following output: --- Welch Two Sample t-test data: s1p5 and s2p5 t = 0.9778, df = 374.904, p-value = 0.3288 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -0.03170059 0.09441172 sample estimates: mean of x mean of y 0.5138298 0.4824742 I believe the t-test checks for difference amongst the two sets, and p-value < 0.05 means both thesets are statistically different. 
Here, while checking for dissimilarity, the p-value is 0.3288. Does a higher p-value (given that t.test checks for dissimilarity) mean that the results are more similar (which is the case above, as the means of the results are very close)? Please help me interpret the results. Thanks in advance!

--
Rob Hall
Masters Student
ANU
Re: [R] apply function across two variables by mult factors
Hello, > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > On Behalf Of Jon Loehrke > Sent: Wednesday, September 16, 2009 2:23 PM > To: r-help@r-project.org > Subject: [R] apply function across two variables by mult factors > > Greetings, > > I am attempting to run a function, which produces a vector and > requires two input variables, across two nested factor levels. I can > do this using by(X, list(factor1, factor2), function), however I > haven't found a simple way to extract the list output into an > organized vector form. I can do this using nested loops but it isn't > exactly an optimal approach. > > Thank you for any and all suggestions. Jon > > # example data frame > testDF<-data.frame( > x=rnorm(12), > y=rnorm(12), > f1=gl(3,4), > f2=gl(2,2,12)) > Try this using lapply, split, mapply? Maybe it is in a nicer output object for you? testFun2 <- function(x, y) { X <- abs(x); Y <- abs(y); as.numeric(paste(round(X), round(Y), sep='.')) } lapply(split(testDF, list(testDF$f1, testDF$f2)), function(x) mapply(testFun2, x[1], x[2])) > # example function [trivial] > testFun<-function(x){ > X<-abs(x[,1]); > Y<-abs(x[,2]); > as.numeric( paste(round(X), round(Y), sep='.')) > } > > # apply by factor levels but hard to extract values > by(testDF[,1:2], list(testDF$f1, testDF$f2), testFun) > > # Loop works, but not efficient for large datasets > testDF$value<-NA > for(i in levels(testDF$f1)){ > for(j in levels(testDF$f2)){ > testDF[testDF$f1==i & testDF$f2==j,]$value<- > testFun(testDF[testDF > $f1==i & testDF$f2==j,1:2]) > } > } > testDF > sessionInfo() > #R version 2.9.1 Patched (2009-08-07 r49093) > #i386-apple-darwin8.11.1 > # > #locale: > #en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > # > #attached base packages: > #[1] stats graphics grDevices utils datasets methods base > > > Jon Loehrke > Graduate Research Assistant > Department of Fisheries Oceanography > School for Marine Science and Technology > University of 
Massachusetts > 200 Mill Road, Suite 325 > Fairhaven, MA 02719 > jloeh...@umassd.edu > T 508-910-6393 > F 508-910-6396
Re: [R] apply function across two variables by mult factors
One correction below,

---snip---
> > # example data frame
> > testDF<-data.frame(
> >     x=rnorm(12),
> >     y=rnorm(12),
> >     f1=gl(3,4),
> >     f2=gl(2,2,12))
> >
>
> Try this using lapply, split, mapply? Maybe it is in a nicer output
> object for you?
>
> testFun2 <- function(x, y) {
>   X <- abs(x);
>   Y <- abs(y);
>   as.numeric(paste(round(X), round(Y), sep='.'))
> }
>
> lapply(split(testDF, list(testDF$f1, testDF$f2)),
>        function(x) mapply(testFun2, x[1], x[2]))
>

Or use "list indexing" in the mapply call to get a vector, in this case at least...

lapply(split(testDF, list(testDF$f1, testDF$f2)),
       function(x) mapply(testFun2, x[[1]], x[[2]]))

---snip---
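The split/lapply/mapply idea from this thread can be run end to end as follows. This is only a sketch: set.seed() and the final unsplit() step (which puts the per-group results back in the original row order) are additions for illustration and were not part of the original posts.

```r
set.seed(1)  # only so that the example is repeatable
testDF <- data.frame(x  = rnorm(12),
                     y  = rnorm(12),
                     f1 = gl(3, 4),
                     f2 = gl(2, 2, 12))

testFun2 <- function(x, y) {
  as.numeric(paste(round(abs(x)), round(abs(y)), sep = "."))
}

# One data frame per f1/f2 combination
groups <- split(testDF, list(testDF$f1, testDF$f2))

# x[[1]] and x[[2]] extract plain vectors, so each result is a plain vector
res <- lapply(groups, function(d) mapply(testFun2, d[[1]], d[[2]]))

# Reassemble the per-group vectors into a column in the original row order
testDF$value <- unsplit(res, list(testDF$f1, testDF$f2))
```

With a function that is already vectorised, such as testFun2, calling it directly on the two columns gives the same column; the split step matters when the function depends on the group as a whole.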
Re: [R] generating unordered combinations
Dan,

Still maybe a bit ugly, but no looping...

> unique(as.data.frame(t(apply(expand.grid(0:2, 0:2, 0:2), 1, sort))))
   V1 V2 V3
1   0  0  0
2   0  0  1
3   0  0  2
5   0  1  1
6   0  1  2
9   0  2  2
14  1  1  1
15  1  1  2
18  1  2  2
27  2  2  2

Best,
Erik

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Dan Halligan
> Sent: Thursday, September 17, 2009 3:31 PM
> To: r-help@r-project.org
> Subject: [R] generating unordered combinations
>
> Hi,
>
> I am trying to generate all unordered combinations of a set of
> numbers / characters, and I can only find a (very) clumsy way of doing
> this using expand.grid. For example, all unordered combinations of
> the numbers 0, 1, 2 are:
> 0, 0, 0
> 0, 0, 1
> 0, 0, 2
> 0, 1, 1
> 0, 1, 2
> 0, 2, 2
> 1, 1, 1
> 1, 1, 2
> 1, 2, 2
> 2, 2, 2
>
> (I have not included, for example, 1, 0, 0, since it is equivalent to
> 0, 0, 1).
>
> I have found a way to generate this data.frame using expand.grid as
> follows:
>
> g <- expand.grid(c(0,1,2), c(0,1,2), c(0,1,2))
> for(i in 1:nrow(g)) {
>     g[i,] <- sort(as.character(g[i,]))
> }
> o <- order(g$Var1, g$Var2, g$Var3)
> unique(g[o,]).
>
> This is obviously quite clumsy and hard to generalise to a greater
> number of characters, so I'm keen to find any other solutions. Can
> anyone suggest a better (more general, quicker) method?
>
> Cheers
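The one-liner in the reply can be unpacked step by step. This is a sketch of the same approach; the helper name unordered_combos and the generalisation to k slots are hypothetical additions, not something posted in the thread.

```r
# All ordered triples, one row each (27 rows for values 0:2)
g <- expand.grid(0:2, 0:2, 0:2)

# Sort within each row so that, e.g., (1, 0, 0) becomes (0, 0, 1);
# apply() returns the sorted rows as columns, hence the t()
sorted <- t(apply(g, 1, sort))

# Drop the rows that are now identical
res <- unique(as.data.frame(sorted))
nrow(res)

# Hypothetical generalisation: k slots filled from an arbitrary set of values
unordered_combos <- function(values, k) {
  grid <- expand.grid(rep(list(values), k))
  unique(as.data.frame(t(apply(grid, 1, sort))))
}
res2 <- unordered_combos(0:2, 3)
```

Because expand.grid() builds every ordered tuple first, this approach can become slow for large sets or many slots.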
Re: [R] Basic function output/scope question
Hello,

> testfunc<-function(x)
> { y<-10
> print(y)
> print(x)
> }
>
> testfunc(4)
>
> The variables x and y are accessible during execution of the function
> "testfunc" but not afterwards.

In R, expressions return values. When you define a function, ?function says that, "If the end of a function is reached without calling 'return', the value of the last evaluated expression is returned."

So you are correct, 'x' and 'y' are local variables, and by all accounts they should be. If you want their values accessible, simply return them.

## return just y
testfunc2 <- function(x) {
   y <- 10
   y
}

## return both x and y
testfunc3 <- function(x) {
   y <- 10
   list(x, y)
}

There are ways to make x and y global from within a function, but in general that is not the R way to do things!

Hope that helps,
Erik Iverson
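A small sketch of the "simply return them" advice, with the result captured by the caller. The function name testfunc_named and the use of a named list are illustrative choices, not part of the original reply.

```r
# Return both values in a named list; the last expression is the return value
testfunc_named <- function(x) {
  y <- 10
  list(x = x, y = y)
}

# Capture the result in the calling environment
res <- testfunc_named(4)

# The values are now available by name
res$x
res$y
```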
Re: [R] Basic function output/scope question
I see you've already been told about "<<-". The reason I (and presumably others?) stay away from that construct is that your function then has side effects that a user (even yourself) may not anticipate or want, namely possibly overwriting a previous variable in your global environment. The help for ?<<- says, "The operators '<<-' and '->>' cause a search to made through the environment for an existing definition of the variable being assigned. If such a variable is found (and its binding is not locked) then its value is redefined, otherwise assignment takes place in the global environment." So this is not even guaranteed to assign in the Global Environment!! To follow-up on my previous email, you usually assign the results of your function call to a variable if you want to use them further. For example, ## return both x and y testfunc2 <- function(x) { y <- 10 list(x, y) } my.var <- testfunc2(4) another.function(my.var) > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > On Behalf Of Erik Iverson > Sent: Monday, September 21, 2009 10:43 AM > To: David Young; r-help@r-project.org > Subject: Re: [R] Basic function output/scope question > > Hello, > > > > > testfunc<-function(x) > > { y<-10 > > print(y) > > print(x) > > } > > > > testfunc(4) > > > > The variables x and y are accessible during execution of the function > > "testfunc" but not afterwards. > > In R, expressions return values. When you define a function, ?function > says that, "If the end of a function is reached without calling 'return', > the value of the last evaluated expression is returned." > > So you are correct, 'x' and 'y' are local variables, and by all accounts > they should be. If you want their values accessible, simply return them. 
>
> ## return just y
> testfunc2 <- function(x) {
>    y <- 10
>    y
> }
>
> ## return both x and y
> testfunc2 <- function(x) {
>    y <- 10
>    list(x, y)
> }
>
> There are ways to make x and y global from within a function, but in
> general that is not the R way to do things!
>
> Hope that helps,
> Erik Iverson
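To make the point about "<<-" concrete, here is a sketch of both behaviours described in the reply: overwriting a variable that already exists outside the function, and assigning into an enclosing function's environment rather than the global one. All names (counter, bump, make_counter, nxt) are made up for the example.

```r
# Case 1: a variable defined outside the function gets silently overwritten
counter <- 0
bump <- function() {
  counter <<- counter + 1   # side effect on the caller's variable
  invisible(NULL)
}
bump()
bump()
counter

# Case 2: '<<-' stops at the first enclosing environment that has 'n',
# which here is the environment created by make_counter(), not the global one
make_counter <- function() {
  n <- 0
  function() {
    n <<- n + 1
    n
  }
}
nxt <- make_counter()
nxt()
nxt()
third <- nxt()
```

Returning a value and assigning it in the caller, as in the reply, avoids both surprises.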
Re: [R] missing level of a nested factor results in an NA in lm output
> > estimable(fit, myEstimate) > Estimate Std. Error t value DF Pr(>|t|) > test 12.18198 0.6694812 18.19615 10 5.395944e-09 Where are you getting this "estimable" function from? A package? Did you define it yourself? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] More elegant way of excluding rows with equal values in any 2 columns?
Hello,

Do you mean exactly any 2 columns? What if the value is equal in more than 2 columns?

> I built a data frame "grid" (below) with 4 columns. I want to exclude
> all rows that have equal values in ANY 2 columns. Here is how I am
> doing it:
>
> index<-expand.grid(1:4,1:4,1:4,1:4)

If a value is equal in 2 or more columns of a row, i.e., duplicated within that row, then the following should work, assuming index can be changed to a matrix for apply ...

t3 <- index[apply(index, 1, function(x) all(!duplicated(x))),]
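A runnable version of the duplicated() approach, as a sketch. The intermediate variable keep is added for readability; the rows that survive are exactly the permutations of 1:4.

```r
index <- expand.grid(1:4, 1:4, 1:4, 1:4)   # 4^4 = 256 rows

# For each row, TRUE when no value occurs in more than one column
keep <- apply(index, 1, function(x) !any(duplicated(x)))

t3 <- index[keep, ]
nrow(t3)   # 24, i.e. 4! permutations
```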
Re: [R] More elegant way of excluding rows with equal values in any 2 columns?
It probably does... I did not look at his until just now, my guess is they are equivalent. There are usually at least a couple ways to do things in R, no problem :). With massive datasets, it might make sense to try a couple different ways to see if one or the other is faster though. You could also replace "!duplicated" in my function with "unique" ... Erik > -Original Message- > From: Dimitri Liakhovitski [mailto:ld7...@gmail.com] > Sent: Monday, September 21, 2009 2:02 PM > To: Erik Iverson > Cc: R-Help List > Subject: Re: [R] More elegant way of excluding rows with equal values in > any 2 columns? > > Thank you very much, both Dimitris and Erik. > Erik - you are right, I was trying to remove any duplication (i.e., if > there are the same values in 2 or 3 or 4 columns). > And it looks like that's what your solution does. > But doesn't it do the same thing as Dimitris' solution? > > Dimitri > > On Mon, Sep 21, 2009 at 2:55 PM, Erik Iverson wrote: > > Hello, > > > > Do you mean exactly any 2 columns. What if the value is equal in more > than 2 columns? > > > >> > >> I built a data frame "grid" (below) with 4 columns. I want to exclude > >> all rows that have equal values in ANY 2 columns. Here is how I am > >> doing it: > >> > >> index<-expand.grid(1:4,1:4,1:4,1:4) > > > > If a value is equal in 2 or more rows, i.e., duplicated, then the > following should work, assuming index can be changed to a matrix for apply > ... > > > > t3 <- index[apply(index, 1, function(x) all(!duplicated(x))),] > > > > > > -- > Dimitri Liakhovitski > Ninah.com > dimitri.liakhovit...@ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] More elegant way of excluding rows with equal values in any 2 columns?
> You could also replace "!duplicated" in my function with "unique" ...

It turns out you can't, of course :).
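The reason the swap does not work is that unique() returns the distinct values rather than a logical vector, so all(unique(x)) does not test anything useful. A sketch of a unique()-based test that does work, by comparing lengths (the helper name all_distinct is hypothetical):

```r
# TRUE when every element of x is different from the others
all_distinct <- function(x) length(unique(x)) == length(x)

a <- all_distinct(c(1, 2, 3, 4))
b <- all_distinct(c(1, 1, 2, 3))

index <- expand.grid(1:4, 1:4, 1:4, 1:4)
t4 <- index[apply(index, 1, all_distinct), ]
nrow(t4)
```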
Re: [R] Working around 256 byte variable names? + trouble opening large file
> I did just try to do that, and it is still returning the same error when I
> try to attach the csv file..
>
> > vc1<-read.table("P:\\R\\Everything-I.csv",header=T, sep=" ", dec=".",
> na.strings=NA, strip.white=T)
> > attach(vc1)
> Error in attach(vc1) : variable names are limited to 256 bytes
>
> Each variable name is only 5 to 6 characters long, but I'm sure you're
> right about R reading the entire header line as one variable.
> I cannot figure out though, how to stop it from doing so.
>
> sep=" ", or sep="," do not seem to work either, though I don't know if it
> is the right thing to be trying.

You will certainly need to specify the correct "sep" argument, and if it is truly a CSV file, that is NOT " ", but ",".

Why don't you paste/attach the first few lines of your .csv file for inspection if all else fails? What about the unmatched quote idea from a previous post?
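The effect of a wrong "sep" can be reproduced on a tiny file. This is a sketch with a made-up three-column CSV written to a temporary file; the poster's real file may of course have a different problem (for example the unmatched quote mentioned above).

```r
# Write a small, made-up CSV file to a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,height,weight",
             "1,170,65",
             "2,180,80"), tmp)

# Wrong separator: the whole header line becomes a single column name
wrong <- read.table(tmp, header = TRUE, sep = " ")
ncol(wrong)
names(wrong)

# Correct separator: read.csv() uses sep = "," by default
right <- read.csv(tmp)
names(right)

# Looking at the raw lines is a quick way to see the real separator
readLines(tmp, n = 2)

unlink(tmp)
```

With several hundred variables, a header read as one field can easily exceed the 256-byte limit reported by attach().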
Re: [R] Linear Model "NA" Value Test
> if("fit$coef[[2]]" == "NA") {.cw = 1}

See ?is.na
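A sketch of the ?is.na hint. The quoted test compares two fixed strings, so it can never be TRUE, and even an unquoted comparison with NA yields NA rather than TRUE. The data below are made up so that the second predictor is exactly collinear with the first, which is one way lm() ends up reporting an NA coefficient.

```r
# Made-up data: x2 is an exact multiple of x1, so its coefficient is NA
d <- data.frame(y  = c(1.1, 1.9, 3.2, 3.9, 5.1),
                x1 = 1:5)
d$x2 <- 2 * d$x1
fit <- lm(y ~ x1 + x2, data = d)
coef(fit)

# Comparing with == does not work: the result is NA, not TRUE
cmp <- coef(fit)[["x2"]] == NA

# is.na() is the test to use
if (is.na(coef(fit)[["x2"]])) {
  .cw <- 1
}
```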
Re: [R] How to use a string to refer a function?
Hint: "somebody let me know how to >>>get<<< the function from the name 'f'?"
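Spelling out the hint as a sketch: get() looks an object up by name, and match.fun() and do.call() are two related base functions that also accept a function name as a string. The function f below is a made-up example.

```r
f <- function(x) x^2

# Look the function up by its name, then call it
fun <- get("f")
r1 <- fun(3)

# match.fun() does the same but insists that the result is a function
r2 <- match.fun("f")(3)

# do.call() takes the function name and a list of arguments
r3 <- do.call("f", list(3))
```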
Re: [R] Which method is called when print(a_list)?
blue sky wrote:
> I don't find print.list. Could somebody let me know which method is
> called when I run command print(a_list), where a_list is a list? Is
> 'print.default' used for printing a list?

Yes. You can always debug functions to investigate what's going on, too. See ?debug.
Re: [R] How to check the two different nulls?
blue sky wrote:
> x=list(a=1,b=NULL)
> is.null(x$b)
> is.null(x$c)
>
> Both the above two commands give me TRUE, but in the first one, b is
> NULL, in the second one, c doesn't exist. Are there functions that can
> help me distinguish the two different nulls?

?names
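A sketch of how names() separates the two cases from the question: an element that exists and holds NULL versus an element that is not in the list at all.

```r
x <- list(a = 1, b = NULL)

# Both of these are TRUE, which is the source of the confusion
is.null(x$b)
is.null(x$c)

# names() shows which elements actually exist
names(x)
has_b <- "b" %in% names(x)   # element exists, value is NULL
has_c <- "c" %in% names(x)   # element does not exist
```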
Re: [R] delete repeated values - not unique...
Well, can you algorithmically describe what you are trying to do? Your example is not sufficient to determine it. For instance, are you trying to:

1) remove repeated elements of a vector and concatenate the first element at the end?
2) remove repeated elements of a vector and concatenate the minimum element at the end?
3) always return the vector c(4, 5, 6, 4)?
4) something else?

jorgusch wrote:
> Hello,
>
> I must be blind not to see it, but I have the following vector:
> 4 4 5 6 6 4
> What I would like to have as a result is:
> 4 5 6 4
> All repeated values are gone. I cannot use unique for this, as the
> second 4 would disappear. Is there another fast function for this
> problem?
>
> Thanks in advance!
Re: [R] delete repeated values - not unique...
Ah, the request was 'hidden' in the subject of the message, apologies!

Erik Iverson wrote:
> Well, can you algorithmically describe what you are trying to do? Your
> example is not sufficient to determine it. For instance, are you
> trying to:
>
> 1) remove repeated elements of a vector and concatenate the first
>    element at the end?
> 2) remove repeated elements of a vector and concatenate the minimum
>    element at the end?
> 3) always return the vector c(4, 5, 6, 4)?
> 4) something else?
>
> jorgusch wrote:
> > Hello,
> >
> > I must be blind not to see it, but I have the following vector:
> > 4 4 5 6 6 4
> > What I would like to have as a result is:
> > 4 5 6 4
> > All repeated values are gone. I cannot use unique for this, as the
> > second 4 would disappear. Is there another fast function for this
> > problem?
> >
> > Thanks in advance!
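Assuming the goal is to collapse runs of consecutive repeated values (which is how the subject line and the example read), two base R approaches are sketched below. Neither was posted in this part of the thread; they are offered as one possible reading of the request.

```r
x <- c(4, 4, 5, 6, 6, 4)

# rle() computes run lengths; its 'values' component holds one entry per run
collapsed <- rle(x)$values
collapsed

# Equivalent: keep an element when it differs from its predecessor
collapsed2 <- x[c(TRUE, x[-1] != x[-length(x)])]
```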
Re: [R] Keyboard
Steven Martin wrote:
> All,
>
> I installed R-2.10.1 with Readline=no. Now for some reason R does not
> recognize some key strokes like the directional arrows. I am not sure
> if Readline is the problem or not.

What particular OS are you using? In many cases, there is a preconfigured package available to save you from compiling it yourself.
Re: [R] Use of R in clinical trials
Frank E Harrell Jr wrote:
> Cody,
>
> How amazing that SAS is still used to produce reports that reviewers
> hate and that requires tedious low-level programming. R + LaTeX has it
> all over that approach IMHO. We have used that combination very
> successfully for several data and safety monitoring reporting tasks
> for clinical trials for the pharmaceutical industry.
>
> Frank

I used to work for a research group that also used R + LaTeX to produce DSMB reports for clinical trials. If the DSMB members had only been exposed to SAS reports before, you could not get them to stop praising the quality of the R + LaTeX reports, even years into a trial.

Erik
Re: [R] Problems installing R-2.10.1 on Linux
Rhett Harrison wrote:
> Hi,
>
> I have been having problems installing the newest version on Linux
> (Ubuntu 9.10) (tried on two machines). The ./configure appears to work
> but I get the following error on the 'make' command.

Don't know about this, but if there's no particular reason you need to compile from source, just add the proper repository and install R from there...

http://cran.r-project.org/bin/linux/ubuntu/

Otherwise check out config.log and see what might have gone wrong...
Re: [R] variable substitution
Hello,

Jon Erik Ween wrote:
> Hi
>
> I would like to write a script that reads a list of variable names.
> These variable names are some of the column headers in a data.frame.
> Then I want do a for-loop to execute various operations on the
> specified variables in the data.frame, but can't figure out how to do
> the necessary variable substitution. In bash (or C) I would use
> "$var", but there seems to be no equivalent in R. The data.frame has
> 300 columns and 2500 rows. Not all columns are continuous variables,
> some are factors, descriptors, etc., so I use the varlist to pick
> which columns I want to analyze.
>
> #Example script
> varlist<-read.table(/path/to/varlist)
> for (i in 1:length(varlist)){
>     res<-mean(Dataset$SOMETHINGHERE_i )
>     write(res) somewhere
> }

In your script, perhaps

res <- mean(Dataset[[varlist[i]]])

would work. Better might be:

colMeans(Dataset[varlist])

or, if your actual function is not "mean":

sapply(Dataset[varlist], FUN)

where FUN is whatever function you want (e.g., mean, sd, ...). And to avoid the "varlist" problem completely, say you only want to run it on numeric variables:

sapply(Dataset[sapply(Dataset, is.numeric)], FUN)

None of this is tested, but the ideas should work.
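A runnable sketch of these suggestions on a tiny made-up data frame. It assumes varlist is a plain character vector of column names; if the names are read with read.table(), as in the question, they would first need to be pulled out of the resulting data frame (for example with as.character() on its first column).

```r
# Made-up stand-in for the real 300-column data frame
Dataset <- data.frame(age   = c(30, 40, 50),
                      score = c(1.5, 2.5, 3.5),
                      group = factor(c("a", "b", "a")))

# Assumed: a character vector of the columns to analyse
varlist <- c("age", "score")

# [[ ]] with a string plays the role of "$var" in a shell script
for (v in varlist) {
  cat(v, mean(Dataset[[v]]), "\n")
}

# Without a loop
means <- colMeans(Dataset[varlist])
sds   <- sapply(Dataset[varlist], sd)

# Or skip varlist and pick the numeric columns automatically
num_means <- sapply(Dataset[sapply(Dataset, is.numeric)], mean)
```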
Re: [R] subset() for multiple values
subset(df, x %in% c(...))

chipmaney wrote:
> This code works:
>
> subset(NativeDominant.df,!ID=="37-R17")
>
> This code does not:
>
> Tree.df<-subset(NativeDominant.df,!ID==c("37-R17","37-R18","10-R1","37-R21","37-R24","R7A-R1","3-R1","37-R16"))
>
> how do i get subset() to work on a range of values?
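Applied to the question, the %in% suggestion looks like the sketch below. The data frame contents are made up, and only two of the IDs from the post are used. The == version misbehaves because the vector on the right-hand side is recycled along the ID column and compared element by element, instead of testing membership.

```r
# Made-up stand-in for the poster's data
NativeDominant.df <- data.frame(ID    = c("37-R17", "37-R18", "10-R1", "5-R2"),
                                value = 1:4)

drop_ids <- c("37-R17", "37-R18")

# Keep the rows whose ID is NOT one of the listed values
Tree.df <- subset(NativeDominant.df, !(ID %in% drop_ids))
Tree.df
```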
Re: [R] Funny result from rep(...) procedure
Cannot reproduce, what is branches? If you can narrow it down to a "commented, minimal, self-contained, reproducible" example, you're far more likely to get help from the list. dkStevens wrote: I'm observing odd behavior of the rep(...) procedure when using variables as parameters in a loop. Here's a simple loop on a vector 'branches' that is c(5,6,5,5,5). The statement in question is print(c(ni,rep(i,times=ni))) that works properly first time through the loop but the second time, when branches[2] = 6, only prints 5 values of i. Any ideas, anyone? iInd = 1 for(i in 1:length(branches)) { print((1:branches[i])+iInd-1) # iInd is a position shift of the index ni = branches[i] print(i) print(ni) print(c(ni,rep(i,times=ni))) # ... some interesting other stuff for my project that gets wrecked because of this issue iInd = iInd + branches[i] } # first pass through loop [1] 1 2 3 4 5 # branches[1] + iInd - 1 content [1] 1# i value to repeat 5 times [1] 5# ni = 5 on 1st pass [1] 5 1 1 1 1 1 # five values - 1 1 1 1 1 - OK # second pass through loop [1] 6 7 8 9 10 11 # branches[2] + iInd - 1 content [1] 2 # i value to repeat 6 times [1] 6 # ni = 6 on 2nd pass [1] 6 2 2 2 2 2 # 6 'twos' but only shows 5 'twos' print(c(ni,rep(i,times=ni))) - why not 6? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Funny result from rep(...) procedure
Erik Iverson wrote:
> Cannot reproduce, what is branches? If you can narrow it down to a
> "commented, minimal, self-contained, reproducible" example, you're far
> more likely to get help from the list.

My blind guess, though, is that it has something to do with FAQ 7.31.
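FAQ 7.31 concerns floating-point representation. The sketch below shows how a value that prints as 6 can still yield only five repetitions, because rep() truncates a non-integer 'times' towards zero. The value 6 - 1e-9 is constructed by hand for the demonstration; whether this is what actually happened with the poster's 'branches' vector is only a guess, as in the reply.

```r
# A value that is very slightly less than 6
ni <- 6 - 1e-9

print(ni)          # shown as 6 at the default 7 significant digits
ni == 6            # FALSE

# rep() truncates 'times' towards zero, so only five copies come back
short <- rep(2, times = ni)
length(short)

# Rounding first gives the expected six copies
full <- rep(2, times = round(ni))
length(full)
```

Checking `branches == round(branches)` would be one way to test this guess on the real data.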
Re: [R] R error- "more columns than column names"
I had a comment character "#" in my header names earlier today that threw this error.

Euphoria wrote:
> Hi all!
>
> I am desperately trying to figure out the solution to this error, but
> nothing as of yet is working. As noted in an earlier post I am using
> GenABEL. In an attempt to read in the phenotype file, in the format
> .dat, R keeps giving me the error "more columns than column names"
>
> I have tried to read in the data without the headers; I have also
> tried to trim the data to remove any trailing tabs or spaces but it
> doesn't solve the problem. All missing values have been replaced with
> "NA", and all data seems to have matching corresponding header value -
> each column has a matching column name.
>
> What could be the possible underlying problem? I have tried to
> problem-solve but clearly I am at a loss.
>
> Thanks for your help!
>
> Code:
> mix <- load.gwaa.data (phe = "Z:/CCFPhenotypesTAB.dat", gen =
> "pedmap-0.raw", force = T)
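The "#" problem can be reproduced with base read.table() on a small made-up table. This sketch says nothing about how GenABEL's load.gwaa.data reads the phenotype file internally; it only shows how a "#" in a header can cut the header line short.

```r
# Made-up data: the second column name contains a '#'
txt <- "id a#b c d\n1 2 3 4\n5 6 7 8"

# With the default comment.char = "#", the header is cut off after "a",
# leaving 2 names for 4 columns of data
res <- tryCatch(read.table(text = txt, header = TRUE),
                error = function(e) conditionMessage(e))
res

# Turning comment handling off reads the full header line
ok <- read.table(text = txt, header = TRUE, comment.char = "")
names(ok)
```

If editing the file is an option, renaming the offending column avoids the problem altogether.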
Re: [R] Subset Question
Chertudi wrote:
> Hello helpful R folks,
>
> First off, please forgive my English. Second, I'm new with R, I've
> searched the archives about subsets, and I haven't found quite the
> help I need.
>
> I'm currently analysing a population survey whose data set has about
> 15000 households (the rows/observations) and 130 variables (the
> columns). I've managed to import the set into R as a data.frame called
> eu08.
>
> Now, I'm trying to look at all of the variables, but limited to one
> province in the "region" variable. I think the provinces are factors,
> and the province of interest is labeled '3'. I've tried the following:
>
> region3=subset(eu08, region==3)
>
> --this simply strips all of the rows from the columns, and I know that
> about 4000 of the observations are specific to region 3. So does
> putting the 3 as '3' and "3".
>
> Any help would be greatly appreciated.

Well, we don't know if it really is a factor. You can determine that by doing...

class(eu08$region)

If it is a factor, then

levels(eu08$region)

should let you know what you can subset with.

str(eu08)

might also be good to look at...

Erik
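The checks suggested in the reply, run on a small made-up stand-in for eu08. In this toy data the level really is "3", so the subset works; in the real data, levels() may reveal a different label (extra whitespace or a province name, for instance), which would explain an empty result.

```r
# Made-up stand-in for the survey data
eu08 <- data.frame(region = factor(c("1", "3", "3", "2")),
                   income = c(10, 20, 30, 40))

class(eu08$region)    # is it really a factor?
levels(eu08$region)   # the exact labels that can be matched
str(eu08)             # overview of every column

# Subset using a label exactly as it appears in levels()
region3 <- subset(eu08, region == "3")
nrow(region3)
```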
Re: [R] text editors
Dwayne Blind wrote:
> Dear all,
>
> Do you use a text editor? What would you recommend for Windows users? What about Tinn-R?

Dwayne,

Perhaps you have seen http://www.sciviews.org/_rgui/ ; it has information on several possibilities.

It would be hard to pull me away from using Emacs with ESS (http://ess.r-project.org/), both on Windows and Linux. I use Emacs for a lot of things now, but ESS was the gateway that helped me learn it. The fact that there is always a version of Emacs on all the platforms I might be faced with helps a lot too.

I know nothing about Tinn-R, but my recollection is that people who use it seem to like it just fine.

Erik
Re: [R] R Experts
Hello,

Ryan Kinzer wrote:
> I am trying to understand why R is working in a particular way. I have a data set with two variables: mark date (markd) and recap date (recapd). I would like to know the number of days between capture dates. But if I subtract recap date from mark date I often get the wrong results.

Well, what are the classes of recapd and markd in your case?

Erik
Re: [R] How to add a variable to a dataframe whose values are conditional upon the values of an existing variable
You mention ifelse, so for completeness, I will show you a solution that should work with that. There are plenty of other possibilities though, I am sure. The following is not tested.

Assume 'my.df' is your data.frame, containing a variable "DOW".

    my.df$DOW1 <- ifelse(my.df$DOW == "SAT", 1,
                  ifelse(my.df$DOW == "SUN", 2,
                  ifelse(my.df$DOW == "MON", 3,
                  ifelse(my.df$DOW == "TUE", 4,
                  ifelse(my.df$DOW == "WED", 5,
                  ifelse(my.df$DOW == "THU", 6, 7))))))

(Each of the six nested ifelse calls needs its own closing ")", but you get the idea...)

Erik

Steve Matco wrote:
> Hi everyone,
>
> I am at my wits' end with what I believe would be considered simple by a more experienced R user. I want to know how to add a variable to a dataframe whose values are conditional on the values of an existing variable. I can't seem to make an ifelse statement work for my situation.
>
> The existing variable in my dataframe is a character variable named DOW which contains abbreviated day names (SAT, SUN, MON, ..., FRI). I want to add a numerical variable named DOW1 to my dataframe that will take on the value 1 if DOW equals "SAT", 2 if DOW equals "SUN", 3 if DOW equals "MON", ..., 7 if DOW equals "FRI".
>
> I know this must be a simple problem, but I have searched everywhere and tried everything I could think of. Any help would be greatly appreciated.
>
> Thank you,
> Mike
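One of the "other possibilities" for the day-of-week question is a lookup with match(), which returns the position of each value in a reference vector. This is a swapped-in technique, not the ifelse solution from the reply, and the data frame below is made up for illustration.

```r
## hypothetical data frame with abbreviated day names
my.df <- data.frame(DOW = c("MON", "SAT", "FRI", "SUN", "WED"),
                    stringsAsFactors = FALSE)

## the position in this vector is the code: SAT = 1, ..., FRI = 7
days <- c("SAT", "SUN", "MON", "TUE", "WED", "THU", "FRI")

my.df$DOW1 <- match(my.df$DOW, days)
my.df
```

Unlike the nested ifelse, a value that is not one of the seven abbreviations gets NA rather than 7, which makes typos in the data easier to spot.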
Re: [R] R Experts
Ryan Kinzer wrote:
> Erik
>
> Thanks for helping. Both of them are factors.

That's the problem; they need to be of class Date. See the R News article about Date classes in Volume 4/1.

http://cran.r-project.org/doc/Rnews/

I don't see how they could be factors though, since you shouldn't be able to subtract two factors from each other without a warning at least? E.g., when I make up factors f1 and f2:

    > f1 - f2
    Warning message:
    In Ops.factor(f1, f2) : '-' not meaningful for factors

We would have to have a small, reproducible example to know for sure what's going on...

Best Regards,
Erik
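A minimal sketch of the conversion to class Date, using made-up dates. The column names markd and recapd come from the post; the data frame name and the "%Y-%m-%d" format are assumptions, since the actual date format was never shown.

```r
## hypothetical capture data; dates read in as factors
caps <- data.frame(markd  = factor(c("2009-05-01", "2009-05-10")),
                   recapd = factor(c("2009-05-15", "2009-06-01")))

## convert via character; the format string is an assumption about the data
caps$markd  <- as.Date(as.character(caps$markd),  format = "%Y-%m-%d")
caps$recapd <- as.Date(as.character(caps$recapd), format = "%Y-%m-%d")

## subtracting two Date vectors gives a difftime in days
caps$days <- as.numeric(caps$recapd - caps$markd)
caps$days
```

If the dates are stored in another layout (for example day/month/year), only the format string should need to change.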
Re: [R] quickest way convert 1-col df to vector?
sjaffe wrote:
> anything shorter than as.vector(as.matrix( df ) )?

df[[1]]
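A quick check of the two approaches on a made-up one-column data frame (the column name x is hypothetical):

```r
## hypothetical one-column data frame
df <- data.frame(x = c(1.5, 2, 3))

v1 <- as.vector(as.matrix(df))  # the long way from the question
v2 <- df[[1]]                   # extract the first column directly

identical(v1, v2)
is.vector(v2)
```

The two routes are not the same for every column type: for a factor column, df[[1]] returns the factor, whereas going through as.matrix gives a character vector.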
[R] page boundaries for latex printing of summary.formula objects in Hmisc
Hello,

Warning: I'm guessing only those who have used the Hmisc package's summary.formula function with LaTeX will be able to offer much help here.

I am using the Hmisc package's summary.formula function to produce tables for a LaTeX report. The "latex" function in the same package supports longtables in LaTeX. Ideally, I would like page breaks in the LaTeX output to occur only at variable boundaries.

    ## Sample R code

    ## load the Hmisc package
    library(Hmisc)

    ## create an example data.frame
    test.df <- data.frame(sex = gl(2, 110, labels = c("Male", "Female")),
                          fac1 = sample(gl(11, 100, labels = paste("V1 Level", 1:11))),
                          fac2 = sample(gl(11, 100, labels = paste("V2 Level", 1:11))),
                          fac3 = sample(gl(11, 100, labels = paste("V3 Level", 1:11))),
                          fac4 = sample(gl(11, 100, labels = paste("V4 Level", 1:11))),
                          fac5 = sample(gl(11, 100, labels = paste("V5 Level", 1:11))),
                          fac6 = sample(gl(11, 100, labels = paste("V6 Level", 1:11))))

    ## create the summary.formula object
    sf <- summary.formula(sex ~ fac1 + fac2 + fac3 + fac4 + fac5 + fac6,
                          data = test.df, method = "reverse")

    ## print out the LaTeX code to the screen, not a file
    latex(sf, file = "", longtable = TRUE)

Notice how the LaTeX output puts the page break (\newpage) in the middle of factor 4, instead of before or after.
    fac4~:~V4~Level~1&11\%~{\scriptsize~(59)}&~7\%~{\scriptsize~(41)}\tabularnewline
    V4~Level~2&10\%~{\scriptsize~(56)}&~8\%~{\scriptsize~(44)}\tabularnewline
    V4~Level~3&~9\%~{\scriptsize~(47)}&10\%~{\scriptsize~(53)}\tabularnewline
    V4~Level~4&~9\%~{\scriptsize~(49)}&~9\%~{\scriptsize~(51)}\tabularnewline
    V4~Level~5&~9\%~{\scriptsize~(51)}&~9\%~{\scriptsize~(49)}\tabularnewline
    V4~Level~6&~9\%~{\scriptsize~(50)}&~9\%~{\scriptsize~(50)}\tabularnewline
    V4~Level~7&~7\%~{\scriptsize~(39)}&11\%~{\scriptsize~(61)}\tabularnewline
    \newpage
    V4~Level~8&~9\%~{\scriptsize~(51)}&~9\%~{\scriptsize~(49)}\tabularnewline
    V4~Level~9&~9\%~{\scriptsize~(51)}&~9\%~{\scriptsize~(49)}\tabularnewline
    V4~Level~10&~9\%~{\scriptsize~(47)}&10\%~{\scriptsize~(53)}\tabularnewline
    V4~Level~11&~9\%~{\scriptsize~(50)}&~9\%~{\scriptsize~(50)}\tabularnewline

I expect this given the documentation in ?latex of lines.page, which is set to 40 by default.

    lines.page: Applies if ‘longtable=TRUE’. No more than ‘lines.page’ lines in the body of a table will be placed on a single page. Page breaks will only occur at ‘rgroup’ boundaries.

The problem is that variable boundaries don't in general correspond to constants, like 40 lines. So, rgroup sounds promising. I want the lines per variable to be the n.rgroup values, but since my tables are dynamic, in that the variables and the number of levels in them change over time, I can't think of a way to define n.rgroup without specifying it per variable.

My first thought was to compute it from the number of levels per variable in the formula. In fact, I did try this but immediately ran into some misconceptions I had about how continuous variables are represented internally within the latex function.

Is there any easier way to accomplish this breaking of pages on variable boundaries using this set of functions? I suspect not, but thought I'd ask.
I think I can figure out the approach I suggested in the preceding 2 paragraphs, but just want to make sure I'm not missing something ...

Thanks a lot!
Erik Iverson
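The "compute it from the number of levels per variable" idea could start from something like the sketch below, which uses base R only. It is not a tested solution to the page-break problem: the data frame is a small made-up stand-in, the one-row-per-continuous-variable rule is an assumption (the post itself reports surprises with continuous variables), and whether or how these counts can be passed to latex() for a summary.formula object is not confirmed here.

```r
## small stand-in for test.df (hypothetical data, base R only)
set.seed(1)
test.df <- data.frame(sex  = gl(2, 50, labels = c("Male", "Female")),
                      fac1 = sample(gl(5, 20, labels = paste("V1 Level", 1:5))),
                      fac2 = sample(gl(4, 25, labels = paste("V2 Level", 1:4))),
                      age  = rnorm(100, mean = 50, sd = 10))

## right-hand-side variables of the summary formula (drop the response)
rhs <- all.vars(sex ~ fac1 + fac2 + age)[-1]

## ASSUMPTION: one table row per factor level, one row per continuous variable
rows.per.var <- vapply(test.df[rhs],
                       function(v) if (is.factor(v)) nlevels(v) else 1L,
                       integer(1))
rows.per.var
```

The one-row-per-level rule for factors matches the output shown in the post, where the variable name shares a row with its first level.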
Re: [R] Problem with mean
See FAQ 7.31

venkata kirankumar wrote:
> Hi all,
>
> I have an interesting problem with calculating a mean. When I take the mean of the values 0.6, -0.8, 4, -3.8 using
>
> val<-c(0.6, -0.8, 4, -3.8)
> mean(val)
>
> it gives the result 2.775558e-17, but the actual result is "0". Can anyone suggest how I can get the correct mean in this case, and how I can go forward?
>
> thanks and regards
> kiran
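FAQ 7.31 is about floating-point rounding: values such as 0.6 and 0.8 cannot be stored exactly in binary, so a tiny residue like 2.775558e-17 is expected. A minimal sketch of the usual ways to handle it, using the values from the post (the tolerance 1e-12 and the rounding to 10 digits are arbitrary choices for illustration):

```r
val <- c(0.6, -0.8, 4, -3.8)
m <- mean(val)

m == 0                    # may be FALSE because of rounding error
isTRUE(all.equal(m, 0))   # compare within a numerical tolerance instead
abs(m) < 1e-12            # or use an explicit tolerance of your choosing
round(m, 10)              # round for reporting
```

The exact residue can differ between platforms, so comparisons against zero are best done with a tolerance rather than with ==.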
Re: [R] How good is R at making publication quality tables?
Paul Miller wrote:
> Hello Everyone,
>
> I have just started learning R and am in the process of figuring out what it can and can't do. I must say I am very impressed with R so far and am amazed that something this good can actually be free.
>
> Recently, I finished reading "R for SAS and SPSS Users" and have begun reading "SAS and R" and "Data Manipulation with R". Based on what I've read in these books and elsewhere, I get the impression that R is very good at drawing high quality graphs but maybe not so good at creating nice looking tables of the sort I'm used to getting through SAS ODS.

You're really only limited by your imagination here. I have written several custom table functions to output LaTeX, but you can output whatever you like (HTML, plain text, org-mode files...); you're in complete control with R.

I can second the Hmisc package though. I often use a combination of summary.formula and the latex function to output really nice looking tables that get put into a long PDF report for a study. I can say that both of these functions, summary.formula and latex, in Hmisc have a LOT of arguments, and almost every time I said "I wish it looked a little different", there was an option to control it. Specifically, I found the options exclude1, long, longtable, combine, and test to be very useful. I often make tables by some treatment group, so all of these use method = "reverse" to accomplish that.

And if you don't like the output, latex.summary.formula.reverse is a good function to make your own version of, to output exactly what you want. I have a local copy that augments the tables that contain unadjusted p-values with adjusted p-values from a model.

But apart from Hmisc, just realize that with R you have a nice programming language to produce any type of output you want; you're not limited to what someone else gave you.
--Erik
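As a rough idea of what a "custom table function" can look like, here is a minimal base R sketch that turns a data frame into LaTeX tabular lines. The function name df_to_latex and the example data are hypothetical; this is not the author's code and not part of Hmisc.

```r
## hypothetical helper: build LaTeX tabular lines from a data frame
df_to_latex <- function(df) {
  header <- paste(names(df), collapse = " & ")
  ## paste the columns together row by row, separated by " & "
  body <- do.call(paste, c(lapply(df, as.character), sep = " & "))
  c(paste0("\\begin{tabular}{", strrep("l", ncol(df)), "}"),
    paste0(header, " \\\\"),
    "\\hline",
    paste0(body, " \\\\"),
    "\\end{tabular}")
}

## made-up summary table
tab <- data.frame(group = c("Treatment", "Control"), n = c(59L, 41L))
out <- df_to_latex(tab)
writeLines(out)
```

The sketch does no escaping of characters such as % or _, which real data would need, and every column is left-aligned; those are the kinds of details the many options of the Hmisc latex function already take care of.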
Re: [R] How to use "ifelse" to generate random value from a distribution
alex46015 wrote:
> I think I figured it out.
>
> ifelse(data1$x==1,rnorm(12,2,1),ifelse(data1$x==2,rnorm(12,-2,1),rnorm(12,110,1)))

Where is the number 12 coming from? Is that the length of data1$x?

Here is a sample using the fact that rnorm can accept vectors of means and sds. My x is randomly generated since you did not provide a reproducible example.

    #random data, can take on 3 values
    x <- sample(1:3, 100, replace = TRUE)

    #i tried to make this identical to your criteria
    mns <- ifelse(x == 1, 2, ifelse(x==2, -2, 110))

    #use vectors!, do not hardcode length(x)!
    rnorm(length(x), mns)

HTH,
--Erik