Re: [R] Converting 3 columns of a data frame into a matrix

2010-06-22 Thread Jeff08
Okay, crosstab seems to work when I clear out a bunch of my variables (went from 800 MB Vcol to 100 MB). Can anyone explain how memory works in R? My lack of understanding of memory was clearly the problem. Jeff08 wrote: > > ##I have also tried the reshape package > librar

Re: [R] Converting 3 columns of a data frame into a matrix

2010-06-22 Thread Jeff08
lt(df, id=c("date_", "id")) mm1 <- cast(mm, date_~id) aba <- mm1[,2:2365] final <- as.matrix(aba) colnames(final) <- Returns.nodup$id rownames(final) <- mm1$date_ Jeff08 wrote: > > df<-data.frame() > df[1:8,1]<-c("1","2","5

[R] Converting 3 columns of a data frame into a matrix

2010-06-22 Thread Jeff08
df<-data.frame() df[1:8,1]<-c("1","2","5","3","1","4","3","5") ##identifier 1 df[1:8,2]<-c("c","a","b","c","a","b","b","a") ##identifier 2 df[1:8,3]<-c(1,2,3,4,5,6,7,8) ##value ##Each unique combination of identifiers identifies a datapoint ##What I am trying to do is create a matrix with value
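One way to get from the three columns to a matrix in base R is sketched below. This is only an illustration, not necessarily the solution used in the thread: the column names id1, id2 and value are made up for the example, and it assumes identifier 1 should label the rows and identifier 2 the columns.

```r
# Sample data from the post, with hypothetical column names.
df <- data.frame(
  id1   = c("1", "2", "5", "3", "1", "4", "3", "5"),  # identifier 1
  id2   = c("c", "a", "b", "c", "a", "b", "b", "a"),  # identifier 2
  value = c(1, 2, 3, 4, 5, 6, 7, 8),
  stringsAsFactors = FALSE
)

# Empty matrix with one row per identifier 1 and one column per identifier 2.
rows <- sort(unique(df$id1))
cols <- sort(unique(df$id2))
m <- matrix(NA_real_, nrow = length(rows), ncol = length(cols),
            dimnames = list(rows, cols))

# A two-column character matrix indexes cells by (row name, column name).
m[cbind(df$id1, df$id2)] <- df$value
m
```

Combinations that never occur in the data stay NA.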

Re: [R] Retrieving the 2 row of "dist" computations

2010-06-11 Thread Jeff08
Edit: I'm stupid and visualized the "dist" matrix incorrectly in my head. Should be Column # = x, Row # = y. n = 827-(x-2) index = y-1+(n+827)(827-n+1)/2 Everything works just fine. Thanks! Jeff08 wrote: > > Edit: > > There is something funky about the code. It d
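For reference, the help page for dist documents the layout of a "dist" object as the lower triangle stored by columns: with n observations and i < j, the distance between rows i and j sits at position n*(i-1) - i*(i-1)/2 + j-i. The sketch below checks that documented formula on a small random matrix. It is not the formula quoted in the post, and the helper name dist_index is made up for the example.

```r
set.seed(1)
x <- matrix(rnorm(20), nrow = 5)   # 5 observations, 4 variables
d <- dist(x)
n <- attr(d, "Size")

# Position of the (i, j) distance in the dist vector, for i < j <= n.
dist_index <- function(i, j, n) n * (i - 1) - i * (i - 1) / 2 + j - i

# Compare every pair against the full distance matrix.
m <- as.matrix(d)
v <- as.vector(d)
ok <- TRUE
for (i in 1:(n - 1)) {
  for (j in (i + 1):n) {
    ok <- ok && isTRUE(all.equal(v[dist_index(i, j, n)], m[j, i]))
  }
}
ok
```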

Re: [R] Retrieving the 2 row of "dist" computations

2010-06-10 Thread Jeff08
index by ~400 on average) You can then compare this index to the one given by e[i] On Fri, Jun 11, 2010 at 11:06 AM, Jeff08 [via R] < ml-node+2251244-1652160471-274...@n4.nabble.com > wrote: > Edit: > > There is something funky about the code. It definitely returns the right >

Re: [R] Retrieving the 2 row of "dist" computations

2010-06-10 Thread Jeff08
wrote: > > Hi there, > > I am sure there is a better way to do it, but here is a suggestion: > > res <- matrix(NA, ncol = 2, nrow = 5) > for(i in 1:5) res[i, ] <- which(as.matrix(d) == sort(d)[i], arr.ind = > TRUE)[1,] > res > > HTH, > Jorge > > &

Re: [R] Retrieving the 2 row of "dist" computations

2010-06-10 Thread Jeff08
ut here is a suggestion: > > res <- matrix(NA, ncol = 2, nrow = 5) > for(i in 1:5) res[i, ] <- which(as.matrix(d) == sort(d)[i], arr.ind = > TRUE)[1,] > res > > HTH, > Jorge > > > On Wed, Jun 9, 2010 at 11:30 PM, Jeff08 <> wrote: > >> >>

[R] Retrieving the 2 row of "dist" computations

2010-06-09 Thread Jeff08
Dear R Gurus, As you probably know, dist calculates the distance between every two rows of data. What I am interested in is the actual two rows that have the least distance between them, rather than the numerical value of the distance itself. For example, If the minimum distance in the following
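A minimal sketch of one way to get the two closest rows, assuming the data is small enough that the full distance matrix fits in memory. The sample points are made up for the example.

```r
# Five made-up points; rows 2 and 4 are the closest pair.
x <- rbind(c(0, 0), c(10, 10), c(3, 4), c(10, 11), c(-5, 7))

m <- as.matrix(dist(x))
# Blank out the diagonal and the upper triangle so each pair appears once.
m[upper.tri(m, diag = TRUE)] <- NA

# Row and column position of the smallest remaining distance.
pair <- which(m == min(m, na.rm = TRUE), arr.ind = TRUE)
pair
```

If several pairs tie for the minimum, pair has one line per tie.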

Re: [R] Extracting Elements By Date

2010-06-08 Thread Jeff08
Edit: I found out how to declare empty variables in R, but the code still does not work. I get an index out of bounds error because my data is irregular (some have more dates than others, and the matrix will not allow rows of different sizes). Dear R Gurus, Thanks for any help in advance! Data.f

[R] Extracting Elements By Date

2010-06-08 Thread Jeff08
Dear R Gurus, Thanks for any help in advance! Data.frame: Returns.names X id ticker date_ adjClose totret RankStk 258060 258060 13645T10 CP 2001-06-29 18.125 1877.758 My data frame is in the above format. I would like to filter by period, per id (every 125 days)

Re: [R] Adding in Missing Data

2010-06-08 Thread Jeff08
Hey All, I have just thought of a completely different way to accomplish my analysis (requiring a different type of coding). Instead of going in and filling in data, I could remove any dates not shared by ALL the ids. I was thinking about accomplishing this using merge(~~), do you think
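The idea of keeping only the dates shared by every id can be sketched without merge, by counting distinct ids per date. The data frame below is a made-up stand-in for Returns.names with only the id and date_ columns.

```r
# Toy stand-in for Returns.names; id "B" is missing 2001-01-02.
returns <- data.frame(
  id    = c("A", "A", "A", "B", "B", "C", "C", "C"),
  date_ = as.Date(c("2001-01-01", "2001-01-02", "2001-01-03",
                    "2001-01-01", "2001-01-03",
                    "2001-01-01", "2001-01-02", "2001-01-03")),
  stringsAsFactors = FALSE
)

# Number of distinct ids seen on each date.
n_ids <- length(unique(returns$id))
ids_per_date <- tapply(returns$id, returns$date_,
                       function(v) length(unique(v)))

# Keep only the dates on which every id is present.
shared <- names(ids_per_date)[ids_per_date == n_ids]
common <- returns[as.character(returns$date_) %in% shared, ]
common
```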

Re: [R] Adding in Missing Data

2010-06-08 Thread Jeff08
Thanks. That thread talks about adding values to NA. However, the problem with my data is that the missing data points aren't even in the data.frame. The method I have in mind is a loop that checks, ID by ID, whether the date column contains all elements of unique(Returns.names$date_), and if not add t
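As an alternative to the loop, the missing rows can be created by merging against every id/date combination. This is a sketch under assumptions: the data frame is a toy stand-in for Returns.names, and dates are kept as character strings to keep the example simple.

```r
# Toy stand-in for Returns.names; id "B" has no row for 2001-01-02.
returns <- data.frame(
  id     = c("A", "A", "A", "B", "B"),
  date_  = c("2001-01-01", "2001-01-02", "2001-01-03",
             "2001-01-01", "2001-01-03"),
  totret = c(100, 101, 102, 200, 202),
  stringsAsFactors = FALSE
)

# Every id paired with every date that occurs anywhere in the data.
grid <- expand.grid(id = unique(returns$id),
                    date_ = unique(returns$date_),
                    stringsAsFactors = FALSE)

# all.x = TRUE keeps combinations with no match; their totret becomes NA.
filled <- merge(grid, returns, by = c("id", "date_"), all.x = TRUE)
filled <- filled[order(filled$id, filled$date_), ]
filled
```

The NA rows can then be filled with whatever rule the analysis calls for.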

[R] Adding in Missing Data

2010-06-07 Thread Jeff08
Sample Data.Frame format Name is Returns.names X id ticker date_ adjClose totret RankStk 427225 427225 00174410 AHS 2001-11-13 21.661001235 "id" uniquely defines a row What I am trying to do is add missing data for each ID. Important Information: Date is

[R] Filtering out a data.frame

2010-06-07 Thread Jeff08
Sample Data.Frame format Name is Returns.nodup X id ticker date_ adjClose totret RankStk 427225 427225 00174410 AHS 2001-11-13 21.661001235 "id" uniquely defines a row What I am trying to do is filter out ids that have fewer than 1500 data points (by date
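A minimal sketch of this kind of filter, counting rows per id with table(). The data frame is a toy stand-in for Returns.nodup, and the threshold min_obs is set to 3 here instead of 1500 so the example stays small.

```r
# Toy stand-in for Returns.nodup: 4 rows for "A", 2 for "B", 3 for "C".
returns <- data.frame(
  id     = c(rep("A", 4), rep("B", 2), rep("C", 3)),
  totret = seq_len(9),
  stringsAsFactors = FALSE
)

min_obs <- 3   # would be 1500 for the real data

# Count rows per id, then keep only ids that reach the threshold.
counts   <- table(returns$id)
keep_ids <- names(counts)[counts >= min_obs]
filtered <- returns[returns$id %in% keep_ids, ]
filtered
```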

[R] Subsetting subsets of data.frames

2010-06-07 Thread Jeff08
Hey Everyone, I have been stumped by this all day. Basically, I have a data.frame of multiple columns. Of concern are "id" & "date". For some reason, there are often duplicates of data with the same date. I would like to remove the duplicates per different id (removing duplicate dates for
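One common way to drop repeated dates within each id is duplicated() on the id/date pair. A small sketch with made-up data follows; it assumes the first occurrence of each pair is the one to keep.

```r
# Toy data: id "A" and id "B" each have a repeated date.
returns <- data.frame(
  id    = c("A", "A", "A", "B", "B"),
  date  = c("2010-06-01", "2010-06-01", "2010-06-02",
            "2010-06-01", "2010-06-01"),
  value = c(1, 1, 2, 3, 3),
  stringsAsFactors = FALSE
)

# duplicated() flags every repeat of an (id, date) pair after the first.
dedup <- returns[!duplicated(returns[, c("id", "date")]), ]
dedup
```

The same date under two different ids is kept, since the pair differs.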

Re: [R] R Newbie, please help!

2010-06-03 Thread Jeff08
Hey Josh, Thanks for the quick response! I guess I have to switch from the Java mindset to the matrix/vector mindset of R. Your code worked very well, but I just have one problem: Essentially I have a time series of stock A, followed by a time series of stock B, etc. So there are break points

[R] R Newbie, please help!

2010-06-03 Thread Jeff08
Hello Everyone, I just started a new job & it requires heavy use of R to analyze datasets. I have a data.table that looks like this. It is sorted by ID & Date, there are about 150 different IDs & the dataset spans 3 million rows. The main columns of concern are ID, date, and totret. What I need