Hello,

I forgot to mention that I am looping over ~70K objects. If I run mclapply on the first 200, it's fine (i.e. it doesn't give NULL values); if I go up to 2K (or run over all of them), then I start to see NULL values.

Also, the function I call uses the functions 'restrict', 'gaps' and 'width' from the IRanges package in Bioconductor. I don't know what calls those functions make under the hood, but could that be a source of the problem? (I saw an earlier post about errors when a function used Java code, but I'm not getting an error like they did.)

Thanks,
Elizabeth

On 3/22/11 1:13 AM, Elizabeth Purdom wrote:
Hello,
I am running large simulations, which unfortunately I can't really replicate here because the code is so extensive. I rely heavily on mclapply, but I realize that I'm losing data somewhere.

There are two worrisome symptoms:
1) I am getting 'NULL' as the return value for some (but not all) elements of the output when I use mclapply, but not when I use lapply:
> tmp2[1:3] #output from lapply
[[1]]
10000076 10000077
      24       24

[[2]]
10000076 10000077
     119      119

[[3]]
10000076
      71

> tmp[1:3] #output from mclapply
[[1]]
NULL

[[2]]
NULL

[[3]]
NULL


2) I am not getting back a list of the same length as the input vector I'm parallelizing over, i.e. a command like this:

tmp<-mclapply(x, FUN=myfunc, mc.cores=16)

gives me back a list tmp which is not the same length as x (and so I'm getting all kinds of errors downstream).

This is extremely discouraging, because I've been using mclapply extensively at many points in simulations that take a very long time to run, and now I'm wondering whether what I'm getting is trustworthy. I don't think I could reasonably finish my results without mclapply, but I am thinking of cutting it out except where it is absolutely necessary, time-wise. If anyone has any suggestions as to why this might be happening and how I can circumvent it (or test for it happening), I would greatly appreciate it.
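
To make the "test for it happening" part concrete, here is a minimal sketch of the kind of check I have in mind. It only detects the two symptoms above; it does not explain them. The helper names find_failed and rerun_failed are hypothetical (they are not part of multicore or any other package), and the sketch assumes that the function being applied never legitimately returns NULL, so that a NULL or a "try-error" element can be treated as a failed job.

```r
# Hypothetical helper (not part of multicore): return the indices of the
# elements of 'res' that look like failed jobs, and stop if the output
# length does not match the input length.
find_failed <- function(x, res) {
  if (length(res) != length(x)) {
    stop(sprintf("result has length %d but input has length %d",
                 length(res), length(x)))
  }
  # Assumption: FUN never returns NULL on purpose, so NULL means "no result".
  is_null  <- vapply(res, is.null, logical(1))
  is_error <- vapply(res, inherits, logical(1), what = "try-error")
  which(is_null | is_error)
}

# Hypothetical helper: rerun only the failed elements serially with lapply
# and put the values back in their original positions.
rerun_failed <- function(x, res, FUN, ...) {
  bad <- find_failed(x, res)
  if (length(bad) > 0) {
    res[bad] <- lapply(x[bad], FUN, ...)
  }
  res
}

# Usage with a simulated result in which two jobs came back as NULL.
x     <- 1:5
res   <- list(1, NULL, 9, NULL, 25)
bad   <- find_failed(x, res)
fixed <- rerun_failed(x, res, function(i) i^2)
```

In my real code this would be called as find_failed(x, tmp) right after the mclapply call, so that a short or partly NULL result is caught immediately instead of causing errors later.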

Thanks,
Elizabeth Purdom

> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C
[9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] multicore_0.1-4       msm_1.0               gtools_2.6.2          graph_1.28.0          Rsamtools_1.2.3
[6] Biostrings_2.18.2     GenomicFeatures_1.2.3 GenomicRanges_1.2.3   IRanges_1.8.9

loaded via a namespace (and not attached):
[1] Biobase_2.10.0     biomaRt_2.6.0      BSgenome_1.18.3    DBI_0.2-5          mvtnorm_0.9-96     RCurl_1.5-0
[7] RSQLite_0.9-4      rtracklayer_1.10.6 splines_2.12.1     survival_2.36-2    tools_2.12.1       XML_3.2-0

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

