Okay, crosstab seems to work once I clear out a bunch of my variables
(went from 800 mb Vcol to 100 mb).
Can anyone explain how memory works in R? My lack of understanding
of memory was clearly the problem.
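For what it's worth, here is a minimal sketch of how to see where the memory goes, using base R only. The object name `big` is made up for illustration. The idea is that every object in the workspace is held in RAM, `rm()` drops the name, and `gc()` runs a garbage collection and reports current usage.

```r
# Hypothetical object, only for illustration: one million doubles (about 8 MB).
big <- numeric(1e6)
size_bytes <- as.numeric(object.size(big))
print(object.size(big), units = "Mb")

# Estimated size of every object in the workspace, largest first.
sizes <- sapply(ls(), function(nm) as.numeric(object.size(get(nm))))
print(sort(sizes, decreasing = TRUE))

rm(big)          # drop the name; the memory can be reclaimed at the next collection
invisible(gc())  # collect now; printing gc() instead shows Ncells/Vcells usage
```

Listing object sizes like this is usually the quickest way to decide which variables to clear before a memory-hungry step.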
Jeff08 wrote:
>
> ##I have also tried the reshape package
> librar
lt(df, id=c("date_", "id"))
mm1 <- cast(mm, date_~id)
aba <- mm1[,2:2365]
final <- as.matrix(aba)
colnames(final) <- Returns.nodup$id
rownames(final) <- mm1$date_
Jeff08 wrote:
>
> df<-data.frame()
> df[1:8,1]<-c("1","2","5
df<-data.frame()
df[1:8,1]<-c("1","2","5","3","1","4","3","5") ##identifier 1
df[1:8,2]<-c("c","a","b","c","a","b","b","a") ##identifier 2
df[1:8,3]<-c(1,2,3,4,5,6,7,8) ##value
##Each unique combination of identifiers identifies a datapoint
##What I am trying to do is create a matrix with value
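A possible sketch, assuming the goal is a matrix with one row per value of the first identifier and one column per value of the second (the post is cut off, so that reading is an assumption). The column names `id1`, `id2` and `value` are invented here; the data are the ones posted above.

```r
# Same data as in the post, with made-up column names.
df <- data.frame(
  id1   = c("1", "2", "5", "3", "1", "4", "3", "5"),
  id2   = c("c", "a", "b", "c", "a", "b", "b", "a"),
  value = c(1, 2, 3, 4, 5, 6, 7, 8),
  stringsAsFactors = FALSE
)

# One row per id1, one column per id2.
# Combinations that never occur in the data stay NA.
wide <- tapply(df$value, list(df$id1, df$id2), FUN = sum)
print(wide)
```

Because each identifier pair occurs only once, `sum` just passes the single value through; with repeated pairs it would add them up, which may or may not be what is wanted.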
Edit:
I'm stupid and visualized the "dist" matrix incorrectly in my head.
Should be
Column # = x, Row # = y. n = 827-(x-2)
index = y-1+(n+827)(827-n+1)/2
Everything works just fine. Thanks!
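For anyone who lands here later, a small sketch of the general mapping from a (row, column) pair to a position in a `dist` object, following the layout described in `?dist` (lower triangle, stored column by column). This is the generic form for `n` rows, not the 827-specific formula above, and the helper name `dist_index` and the random data are made up.

```r
set.seed(1)                          # made-up data, only for illustration
x <- matrix(rnorm(24), nrow = 6)
d <- dist(x)
n <- attr(d, "Size")                 # number of rows that went into dist()

# Position in the dist vector of the distance between rows i and j (i > j).
dist_index <- function(i, j, n) {
  stopifnot(all(i > j), all(i <= n), all(j >= 1))
  n * (j - 1) - j * (j - 1) / 2 + i - j
}

m <- as.matrix(d)                    # full symmetric matrix, for checking
k <- dist_index(5, 3, n)
isTRUE(all.equal(as.vector(d)[k], unname(m[5, 3])))
```

Checking a hand-derived index against `as.matrix(d)` like this is a cheap way to catch an off-by-one before trusting the formula on the full data.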
Jeff08 wrote:
>
> Edit:
>
> There is something funky about the code. It d
index by ~400 on average)
You can then compare this index to the one given by e[i]
On Fri, Jun 11, 2010 at 11:06 AM, Jeff08 [via R] <ml-node+2251244-1652160471-274...@n4.nabble.com> wrote:
> Edit:
>
> There is something funky about the code. It definitely returns the right
>
wrote:
>
> Hi there,
>
> I am sure there is a better way to do it, but here is a suggestion:
>
> res <- matrix(NA, ncol = 2, nrow = 5)
> for(i in 1:5) res[i, ] <- which(as.matrix(d) == sort(d)[i], arr.ind =
> TRUE)[1,]
> res
>
> HTH,
> Jorge
>
>
> On Wed, Jun 9, 2010 at 11:30 PM, Jeff08 <> wrote:
>
>>
>>
Dear R Gurus,
As you probably know, dist calculates the distance between every two rows of
data. What I am interested in is the actual two rows that have the least
distance between them, rather than the numerical value of the distance
itself.
For example, if the minimum distance in the following
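One way this could be done, sketched on made-up random data: convert the `dist` object to a full matrix, mask the diagonal, and ask `which()` for the position of the minimum. The variable names are illustrative, and ties are resolved by simply taking the first hit.

```r
set.seed(42)                         # made-up data, only for illustration
x <- matrix(rnorm(30), nrow = 10)
d <- dist(x)

m <- as.matrix(d)
diag(m) <- Inf                       # ignore the zero distance of a row to itself

# Row and column of the smallest remaining entry; these are the two closest rows.
closest <- which(m == min(m), arr.ind = TRUE)[1, ]
print(closest)
```

For a few hundred rows the full matrix is cheap; for much larger inputs the conversion with `as.matrix()` roughly doubles the memory needed.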
edit: I found out how to declare empty variables in R, but the code still
does not work. I get the index out of bounds error since my data is
irregular (some have more dates than others, and the matrix will not allow
for different sized rows)
Dear R Gurus,
Thanks for any help in advance!
Data.frame: Returns.names
            X       id ticker      date_ adjClose   totret RankStk
258060 258060 13645T10     CP 2001-06-29   18.125 1877.758
My data frame is in the above format. I would like to filter by period, per
id (every 125 days)
Hey All,
I have just recently thought of a completely different way to accomplish my
analysis (requiring different type of coding)
Instead of going in and filling in data, I could remove any dates not shared
by ALL the id's.
I was thinking about accomplishing this using merge(~~), do you think
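A rough sketch of the "keep only dates shared by all ids" idea. Note that it counts ids per date rather than using `merge()`, so it is an alternative to what the post proposes. The toy data frame below is invented; only the column names `id`, `date_` and `totret` come from the thread.

```r
# Made-up data: id B is missing 2001-06-28.
returns <- data.frame(
  id     = c("A", "A", "A", "B", "B", "C", "C", "C"),
  date_  = as.Date(c("2001-06-27", "2001-06-28", "2001-06-29",
                     "2001-06-27", "2001-06-29",
                     "2001-06-27", "2001-06-28", "2001-06-29")),
  totret = c(1, 2, 3, 4, 5, 6, 7, 8)
)

n_ids <- length(unique(returns$id))

# Number of distinct ids observed on each date.
ids_per_date <- tapply(returns$id, returns$date_, function(z) length(unique(z)))

# Keep only the dates on which every id has a row.
shared_dates <- names(ids_per_date)[ids_per_date == n_ids]
common <- returns[as.character(returns$date_) %in% shared_dates, ]
print(common)
```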
Thanks,
That thread talks about adding values to NA. However, the problem with my
data is that the missing data points aren't even in the data.frame.
The method I think of is using a loop to check ID by ID, if the date column
contains all elements of unique(Returns.names$date_), and if not add t
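Instead of a loop, one option might be to build every id/date combination and merge the real data onto it, so the absent rows show up as NA. This is only a sketch on invented data; the column names follow the thread, everything else is an assumption.

```r
# Made-up data: id B has no row for 2001-06-28.
returns <- data.frame(
  id     = c("A", "A", "A", "B", "B"),
  date_  = as.Date(c("2001-06-27", "2001-06-28", "2001-06-29",
                     "2001-06-27", "2001-06-29")),
  totret = c(1, 2, 3, 4, 5)
)

# Every combination of id and date that should exist.
grid <- expand.grid(id = unique(returns$id),
                    date_ = unique(returns$date_),
                    stringsAsFactors = FALSE)

# Left join: rows missing from the data appear with NA in the value columns.
filled <- merge(grid, returns, by = c("id", "date_"), all.x = TRUE)
filled <- filled[order(filled$id, filled$date_), ]
print(filled)
```

Once the missing rows exist as NA, the approaches for filling NA values apply again.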
Sample Data.Frame format
Name is Returns.names
X id ticker date_ adjClose totret RankStk
427225 427225 00174410 AHS 2001-11-13 21.661001235
"id" uniquely defines a row
What I am trying to do is add missing data for each ID.
Important Information: Date is
Sample Data.Frame format
Name is Returns.nodup
X id ticker date_ adjClose totret RankStk
427225 427225 00174410 AHS 2001-11-13 21.661001235
"id" uniquely defines a row
What I am trying to do is filter out ids that have fewer than 1500 data
points (by date
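A minimal sketch of that kind of filter, assuming one row per id and date so that counting rows is the same as counting data points. The toy data and the threshold of 3 are made up; the post's threshold would be 1500.

```r
min_obs <- 3   # illustrative threshold; 1500 in the post

# Made-up data: A has 3 rows, B has 2, C has 4.
returns <- data.frame(
  id     = c("A", "A", "A", "B", "B", "C", "C", "C", "C"),
  totret = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
)

counts <- table(returns$id)                    # rows per id
keep   <- names(counts)[counts >= min_obs]     # ids with enough data points
filtered <- returns[returns$id %in% keep, ]
print(filtered)
```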
Hey Everyone,
I have been stumped by this all day.
Basically, I have a data.frame of multiple columns. Of concern are "id" &
"date"
For some reason, oftentimes there are duplicates of data with the same date.
I would like to remove the duplicates per different id (removing duplicate
dates for
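A sketch of one common way to do this, assuming "duplicate" means the same `id` and `date` appearing more than once and that keeping the first occurrence is acceptable. The data are invented.

```r
# Made-up data: A and B each have a repeated row for 2001-06-27.
returns <- data.frame(
  id     = c("A", "A", "A", "B", "B"),
  date   = as.Date(c("2001-06-27", "2001-06-27", "2001-06-28",
                     "2001-06-27", "2001-06-27")),
  totret = c(1, 1, 2, 3, 3)
)

# TRUE for every row whose (id, date) pair has already been seen.
dup <- duplicated(returns[, c("id", "date")])
deduped <- returns[!dup, ]
print(deduped)
```

Because the check uses both columns together, the same date under two different ids is not treated as a duplicate.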
Hey Josh,
Thanks for the quick response!
I guess I have to switch from the Java mindset to the matrix/vector mindset
of R.
Your code worked very well, but I just have one problem:
Essentially I have a time series of stock A, followed by a time series of
stock B, etc.
So there are break points
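If the issue is that a vectorised calculation runs across the boundary between one stock and the next (an assumption, since the post is cut off), one possible sketch is to apply the calculation per id with `ave()`. This is not Josh's code; the data and the column name `chg` are made up.

```r
# Made-up data: stock A followed by stock B, as described in the post.
returns <- data.frame(
  id     = c("A", "A", "A", "B", "B", "B"),
  totret = c(100, 110, 121, 50, 55, 66)
)

# Day-over-day change computed within each id, so the first row of every
# stock gets NA instead of a difference against the previous stock.
returns$chg <- ave(returns$totret, returns$id,
                   FUN = function(v) c(NA, diff(v)))
print(returns)
```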
Hello Everyone,
I just started a new job & it requires heavy use of R to analyze datasets.
I have a data.table that looks like this. It is sorted by ID & Date; there
are about 150 different IDs & the dataset spans 3 million rows. The main
columns of concern are ID, date, and totret. What I need