[R] aggregate function / custom column names?

2010-02-11 Thread Chuck White
This question is about column names returned by the aggregate function. Consider the following example df <- data.frame( id = c(rep('11',30),rep('22',30),rep('33',30)), value = c(rnorm(30,2,0.5), rnorm(30,3,0.5), rnorm(30,6,0.5)) ) aggregate(df[,c("value"),drop=FALSE], by=list(id=df$id), m

Re: [R] ggplot2 / time series with different scales

2010-02-05 Thread Chuck White
Thank you both for your response. As the names suggest, I am plotting the sales & price data for items over time to understand how certain items may be more responsive than others to price changes. Another way of displaying this information on the same chart as the one showing sales, would

[R] ggplot2 / time series with different scales

2010-02-04 Thread Chuck White
I am trying to plot this dataset using ggplot2: df <- data.frame( sid = c(rep('11',30),rep('22',30)), time = rep(ISOdate(year = 2010, month = 1, day = 1:30),2), sales = c(rnorm(30, 1000, 20),rnorm(30, 900, 10)), price = c(rnorm(30, 2, 0.5),rnorm(30, 3,0.5)) ) Plotting just the sales

Re: [R] merging columns

2010-02-02 Thread Chuck White
ing seems to work: data.frame(sapply(col2.uniq, function(col) { wcol <- which(col==col2) as.numeric(rowSums(data.frame(data.df[,wcol]))>0) })) I had to wrap data.df[,wcol] in another data.frame to handle situations where wcol had one element. Is there a better approach? ---- Chuck Whi

[R] merging columns

2010-02-02 Thread Chuck White
Hello -- I am trying to merge columns in a dataframe based on substring matches in colnames. I would appreciate it if somebody could suggest a faster/cleaner approach (e.g. I would have really liked to avoid the if-else piece but rowSums does not like that). Thanks. data.df <- data.frame(aa=c(1,1,0),

Re: [R] ggplot/time series with indicators question

2010-02-01 Thread Chuck White
> > Since this question properly belongs on the ggplot2 list, it is being cc'ed > there as well. > > HTH, > Dennis > > On Mon, Feb 1, 2010 at 6:06 PM, Chuck White wrote: > > > Hello, I am trying to plot time-series data with certain weeks highlighted >

[R] ggplot/time series with indicators question

2010-02-01 Thread Chuck White
Hello, I am trying to plot time-series data with certain weeks highlighted using symbols. require(ggplot2) #plotting time series data timescale <- seq(as.Date("01/01/09","%m/%d/%y"), length.out=12, by=7) data.all <- data.frame( id = c(rep('111',12),rep('222',12),rep('333',12)), week=c(ti

[R] SemiPar/spm question

2010-01-29 Thread Chuck White
Hello -- I posted this question yesterday and for some reason the post seems to be attached to the wrong thread. Also, I extended my test a little and it seems to indicate the problem is with spm. I would appreciate any help. Thanks. == lib

[R] plyr / spm issue

2010-01-28 Thread Chuck White
I am not able to get spm function in SemiPar to work with plyr. Here's an example: library(plyr) library(SemiPar) data <- data.frame(id=c(rep("111",100),rep("222",200)), value=c(rnorm(100,2,1),rnorm(200,10,5))) #this works data111 <- data[data$id=="111"

[R] splitting a factor column into binary columns for each level

2010-01-26 Thread Chuck White
Yesterday I posted the following question (my apologies for not putting a subject line): =question== Hello -- I would like to know of a more efficient way of writing the following piece of code. Thanks. options(stringsAsFactors=FALSE) orig <- c(rep('11

[R] splitting a factor column into binary columns for each factor

2010-01-26 Thread Chuck White
Yesterday I posted the following question (my apologies for not putting a subject line): =question== Hello -- I would like to know of a more efficient way of writing the following piece of code. Thanks. options(stringsAsFactors=FALSE) orig <- c(rep('11

[R] splitting a factor column into binary columns for each factor

2010-01-26 Thread Chuck White
Yesterday I posted the following question (my apologies for not putting a subject line): =question== Hello -- I would like to know of a more efficient way of writing the following piece of code. Thanks. options(stringsAsFactors=FALSE) orig <- c(rep('11

[R] (no subject)

2010-01-25 Thread Chuck White
Hello -- I would like to know of a more efficient way of writing the following piece of code. Thanks. options(stringsAsFactors=FALSE) orig <- c(rep('',10),rep('',20),rep('',30),rep('',40)) orig.unique <- unique(orig) system.time(df <- as.data.frame

[R] cross-product

2009-12-17 Thread Chuck White
Hello -- I am trying to create a dataframe whose rows are the cross product of rows in two other dataframes. Here's an example: A = data.frame(F1=c('1','2','3','2')) B = data.frame(F2=c('4','5')) C = data.frame() for (x in unique(A[,'F1'])) { C = rbind(C, cbind(F1=x,B)) } How can I achieve t
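
One possible way to avoid the rbind loop above, offered only as a sketch and not as the answer given on the list: base R's merge() returns the Cartesian product of the rows when by has length zero. The toy inputs A and B are copied from the post; applying unique() to A mirrors the unique(A[,'F1']) in the loop. Row order may differ from the loop's output.

```r
# Toy inputs copied from the post
A <- data.frame(F1 = c('1', '2', '3', '2'))
B <- data.frame(F2 = c('4', '5'))

# by = NULL asks merge() for the Cartesian product of the rows;
# unique(A) drops the repeated '2' first, as the original loop does
C <- merge(unique(A), B, by = NULL)

print(C)
```

With three distinct F1 values and two F2 values, C has six rows and the columns F1 and F2.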

[R] lm and levels

2009-11-10 Thread Chuck White
Consider the following example: x <- c(2,4,3,6) y <- c(4,9,5,10) z <- factor(c(1,1,2,2)) summary(lm("y ~ x + z")) The above works fine. Suppose I change z so that x <- c(2,4,3,6) y <- c(4,9,5,10) z <- factor(c(1,1,2,NA)) summary(lm("y ~ x + z")) the last row/observation is not considered in the

Re: [R] merge data

2009-11-10 Thread Chuck White
or will the additional column be added to df1? Thanks. David Winsemius wrote: > > On Nov 10, 2009, at 12:36 PM, Chuck White wrote: > > > df1 -- dataframe with column date and several other columns. #rows > > >40k Several of the dates are repeated. > >

[R] merge data

2009-11-10 Thread Chuck White
df1 -- dataframe with column date and several other columns. #rows >40k. Several of the dates are repeated. df2 -- dataframe with two columns, date and index. #rows ~130. This is really a map from date to index. I would like to create a column called index in df1 which has the corresponding inde
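
A minimal sketch of one way to do the lookup described above, assuming the date columns in both data frames have the same class and that each date appears only once in df2. The data below are hypothetical stand-ins, not the poster's data, and the column name value is made up for illustration.

```r
# Hypothetical stand-ins for the data frames described in the post
df1 <- data.frame(
  date  = as.Date(c("2009-11-02", "2009-11-03", "2009-11-02")),
  value = c(10, 20, 30)   # placeholder for the "several other columns"
)
df2 <- data.frame(
  date  = as.Date(c("2009-11-02", "2009-11-03")),
  index = c(1, 2)
)

# match() gives, for each df1$date, its position in df2$date (NA if absent);
# indexing df2$index with those positions adds the column to df1 in place
df1$index <- df2$index[match(df1$date, df2$date)]

print(df1)
```

Unlike merge(), this keeps df1's row order and row count unchanged; dates missing from df2 get NA in the new column.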

[R] R292 on AIX53 using gcc

2009-10-17 Thread Chuck White
Hello -- I am unable to build R 2.9.2 on IBM PowerPC AIX5.3. I would appreciate any help in this matter. ===details== Machine: IBM PowerPC_POWER5 / 4 proc, 1499 MHz 64-bit / AIX 5.3.0.0 Building R 2.9.2 using gcc/g++/gfortran 4.2.4 Config.site changes

Re: [R] building 2.0.6 on Win32/Py2.5

2009-07-07 Thread Chuck White
My sincere apologies. This message was intended for the rpy2 mailing list. Thanks. Uwe Ligges wrote: > Which previous message? Should we all look up the archives now? Please > cite and stay within the same thread. > > Thank you, > Uwe LIgges > > > > >

[R] building 2.0.6 on Win32/Py2.5

2009-07-07 Thread Chuck White
After I posted the previous message, I repeated the process on a windows machine with Python 2.6 and still get the same error. I would appreciate any help.

[R] grep on vectors?

2009-06-30 Thread Chuck White
Input: dataframe with 300+ columns for a regression. It consists of sets of factors whose names have the same structure. For example, aa1,aa2,aa3 could be one set of factors. After reading in the dataframe, I would like to compute the density (% nonzeroes) for certain groups of factors and delete