Dear R users,
I am trying to use a very large file (~3 GiB) with the filehash package. The
dataset has around 4,000,000 observations. I get this message from R
while trying to load the dataset (named "cc084.csv"):
> dumpDF(read.csv("cc084.csv", header=T), dbName="db01")
Error: cannot all
wnames(c)=key
for(i in 2:I(nrow(c))) {
  if(rownames(c)[i]=="TA" & rownames(c)[i-1]=="TA") {
    c[i,] <- colSums(c[i:I(i-1),])
    c[i-1,] <- NA  # sums the rows and replaces the used rows with NA values
  }
}
c <- c[apply(c,1,function(x)any(!is.na(x))),] # remove
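The loop above can be tried on a small example. What follows is only a sketch under assumptions: the matrix `m`, the labels in `key`, and the column names are made up for illustration, and a matrix is used because a data frame does not accept duplicated row names.

```r
# Toy data (hypothetical): two consecutive rows share the label "TA"
key <- c("AA", "TA", "TA", "GC")
m <- matrix(1:8, nrow = 4, dimnames = list(key, c("x", "y")))

for (i in 2:nrow(m)) {
  if (rownames(m)[i] == "TA" && rownames(m)[i - 1] == "TA") {
    # Put the sum of the two consecutive "TA" rows into the later row ...
    m[i, ] <- colSums(m[(i - 1):i, , drop = FALSE])
    # ... and mark the earlier row as used
    m[i - 1, ] <- NA
  }
}

# Drop the rows that now contain only NA values
m <- m[apply(m, 1, function(x) any(!is.na(x))), , drop = FALSE]
print(m)
```

With longer runs of "TA" rows the sum keeps accumulating into the last row of the run, since each step adds the previous (already summed) row to the current one.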
In the first reply, what was calculated was the overall means by group (amino
acids). It does not work for a larger database.
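For reference, an overall mean per group can be computed along these lines. This is only a sketch on made-up data: the data frame `d` and its columns `aa` and `value` are hypothetical and are not taken from the original thread.

```r
# Hypothetical data: one measurement per row, grouped by amino acid
d <- data.frame(
  aa    = c("Ala", "Ala", "Gly", "Gly"),
  value = c(1, 3, 10, 20)
)

# One overall mean per group
means <- aggregate(value ~ aa, data = d, FUN = mean)
print(means)
```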
I am quite new to R, and I worked on your question just to learn how
to manipulate data with R.
The following seems to work. The code could be made a lot more elegan
Thomas,
You are very clever! The "meil2" data frame has twice the common variable
combinations:
> meil2
  dist sexe style     meil
1   38    F  clas 02:43:17
2   38    F  free 02:24:46
3   38    H  clas 02:37:36
4   38    H  free 01:59:35
5   45    F  clas 03:46:15
6   45    F  free 02:20