On Jun 30, 2011, at 12:30 PM, Marc Schwartz wrote:

> On Jun 30, 2011, at 11:20 AM, Edgar Alminar wrote:
>
>>>> I did this:
>>>>
>>>> library(data.table)
>>>>
>>>> dd <- data.table(bl)
>>>> dd[,sum(as.integer(CONTTIME)), by = SCRNO]
>>>>
>>>> (I used as.integer because I got an error message: sum not meaningful for
>>>> factors)
>>>>
>>>> And got this:
>>>>
>>>>            SCRNO  V1
>>>>  [1,] HBA0020036 111
>>>>  [2,] HBA0020087  71
>>>>  [3,] HBA0020209 140
>>>>  [4,] HBA0020213 189
>>>>  [5,] HBA0020222 174
>>>>  [6,] HBA0020292 747
>>>>  [7,] HBA0020310  57
>>>>  [8,] HBA0020317 291
>>>>  [9,] HBA0020365 417
>>>> [10,] HBA0020366 124
>>>>
>>>> All the sums are way too big. Is there something making it not add up
>>>> correctly?
>>>>
>>>> Original dataset:
>>>>
>>      RID      SCRNO VISCODE RECNO CONTTIME
>> 338   43 HBA0020036      bl     1        9
>> 1187  95 HBA0020087      bl     1        3
>> 3251 230 HBA0020209      bl     2        3
>> 3258 230 HBA0020209      bl     1       28
>> 3321 235 HBA0020213      bl     2        5
>> 3351 235 HBA0020213      bl     1        6
>> 3436 247 HBA0020222      bl     1        5
>> 3456 247 HBA0020222      bl     2        4
>> 4569 321 HBA0020292      bl    13        2
>> 4572 321 HBA0020292      bl     5       13
>> 4573 321 HBA0020292      bl     1       25
>> 4576 321 HBA0020292      bl     7        5
>> 4578 321 HBA0020292      bl     8        2
>> 4581 321 HBA0020292      bl     4        4
>> 4582 321 HBA0020292      bl     9        5
>> 4586 321 HBA0020292      bl    12        2
>> 4587 321 HBA0020292      bl     6        2
>> 4590 321 HBA0020292      bl    10        3
>> 4591 321 HBA0020292      bl    11        7
>
>
> That is not the entire dataset....HBA0020366 is missing, as an example.
>
> I don't use the data.table package, but if you are getting an error
> indicating that CONTTIME is a factor, then something is wrong with either the
> data itself (there are non-numeric entries) or the way in which it was
> entered/imported into R.
>
> Thus, I would first check your data for errors. Use str(YourDataSet) to
> review its structure and if CONTTIME is a factor, check into the data to see
> why.
>
> Lastly, review this R FAQ:
>
> http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f
>
> Just as an alternative, with your data in 'DF':
>
> > DF
>      RID      SCRNO VISCODE RECNO CONTTIME
> 338   43 HBA0020036      bl     1        9
> 1187  95 HBA0020087      bl     1        3
> 3251 230 HBA0020209      bl     2        3
> 3258 230 HBA0020209      bl     1       28
> 3321 235 HBA0020213      bl     2        5
> 3351 235 HBA0020213      bl     1        6
> 3436 247 HBA0020222      bl     1        5
> 3456 247 HBA0020222      bl     2        4
> 4569 321 HBA0020292      bl    13        2
> 4572 321 HBA0020292      bl     5       13
> 4573 321 HBA0020292      bl     1       25
> 4576 321 HBA0020292      bl     7        5
> 4578 321 HBA0020292      bl     8        2
> 4581 321 HBA0020292      bl     4        4
> 4582 321 HBA0020292      bl     9        5
> 4586 321 HBA0020292      bl    12        2
> 4587 321 HBA0020292      bl     6        2
> 4590 321 HBA0020292      bl    10        3
> 4591 321 HBA0020292      bl    11        7
>
>
> > aggregate(CONTTIME ~ DF$SCRNO, data = DF, sum)
>     DF$SCRNO CONTTIME
> 1 HBA0020036        9
> 2 HBA0020087        3
> 3 HBA0020209       31
> 4 HBA0020213       11
> 5 HBA0020222        9
> 6 HBA0020292       70

Quick typo correction here. The 'DF$' in DF$SCRNO is superfluous. I did not
clean that up before copying and pasting.

> aggregate(CONTTIME ~ SCRNO, data = DF, sum)
       SCRNO CONTTIME
1 HBA0020036        9
2 HBA0020087        3
3 HBA0020209       31
4 HBA0020213       11
5 HBA0020222        9
6 HBA0020292       70

Marc

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
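
A note on the factor issue raised above, as a minimal sketch rather than a diagnosis of the actual data: calling as.integer() on a factor returns the internal level codes, not the numbers shown in the column, which is one likely reason the data.table sums looked wrong. The conversion described in the R FAQ entry linked above goes through the level labels instead. The toy data frame below is hypothetical (four rows borrowed from the posted data, with CONTTIME deliberately stored as a factor); the real cause of the factor column would still need to be found with str().

```r
# Hypothetical toy data: CONTTIME stored as a factor, as can happen when a
# column is imported with a stray non-numeric entry (an assumption here).
DF <- data.frame(
  SCRNO    = c("HBA0020209", "HBA0020209", "HBA0020213", "HBA0020213"),
  CONTTIME = factor(c("3", "28", "5", "6"))
)

# Inspect the structure first; CONTTIME shows up as a Factor
str(DF)

# as.integer() on a factor gives the level codes, not the displayed values
codes <- as.integer(DF$CONTTIME)

# Conversion via the level labels, as in the R FAQ on factors to numeric
values <- as.numeric(as.character(DF$CONTTIME))

# Replace the factor column with true numbers, then sum by SCRNO
DF$CONTTIME <- values
res <- aggregate(CONTTIME ~ SCRNO, data = DF, sum)
print(res)
```

With the column converted this way, the per-SCRNO sums for these four rows agree with the aggregate() output shown above (31 and 11). If as.numeric(as.character(...)) produces NA values with a warning, those rows are the non-numeric entries worth checking in the source data.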