Hi Robert,

Thanks for reporting this. I'll look into it.

H.

On 06/17/2016 09:53 AM, Robert Castelo wrote:
hi,

the performance of snpsByOverlaps() in terms of time and memory
consumption is quite poor and i wonder whether there is some bug in the
code. here's one example:

library(GenomicRanges)
library(SNPlocs.Hsapiens.dbSNP144.GRCh37)

snps <- SNPlocs.Hsapiens.dbSNP144.GRCh37

gr <- GRanges(seqnames="ch10", IRanges(123276830, 123276830))

system.time(ov <- snpsByOverlaps(snps, gr))
    user  system elapsed
  33.768   0.124  33.955

system.time(ov <- snpsByOverlaps(snps, gr))
    user  system elapsed
  33.150   0.281  33.494


i've shown the call to snpsByOverlaps() twice to account for the fact
that maybe the first call was caching data and the second could be much
faster, but it is not the case.

if i do the same but with a larger GRanges object, for instance the one
attached to this email, then the memory consumption grows until about 20
Gbytes. to me this, in conjunction with the previous observation,
suggests something is wrong with the caching of the data.



i look forward to your comments and possible solutions,


thanks!!!


robert.
_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [email protected]
Phone:  (206) 667-5791
Fax:    (206) 667-1319
