[Rd] mclapply returns NULLs on MacOS when running GAM

2020-04-28 Thread Shian Su
Dear R-devel,

I am experiencing issues running GAM models with mclapply: it fails to 
return any values once the input data becomes large. For example, the code 
below runs fine with a df of 100 rows, but fails at 1000.

library(mgcv)
library(parallel)

> df <- data.frame(
+ x = 1:100,
+ y = 1:100
+ )
>
> mclapply(1:2, function(i, df) {
+ fit <- gam(y ~ s(x, bs = "cs"), data = df)
+ },
+ df = df,
+ mc.cores = 2L
+ )
[[1]]

Family: gaussian
Link function: identity

Formula:
y ~ s(x, bs = "cs")

Estimated degrees of freedom:
9  total = 10

GCV score: 0

[[2]]

Family: gaussian
Link function: identity

Formula:
y ~ s(x, bs = "cs")

Estimated degrees of freedom:
9  total = 10

GCV score: 0

>
>
> df <- data.frame(
+ x = 1:1000,
+ y = 1:1000
+ )
>
> mclapply(1:2, function(i, df) {
+ fit <- gam(y ~ s(x, bs = "cs"), data = df)
+ },
+ df = df,
+ mc.cores = 2L
+ )
[[1]]
NULL

[[2]]
NULL

There is no error message returned, and the code runs perfectly fine in lapply.

I am on a MacBook 15 (2016) running MacOS 10.14.6 (Mojave) and R version 3.6.2. 
This bug could not be reproduced on my Ubuntu 19.10 running R 3.6.1.

Kind regards,
Shian Su

Shian Su
PhD Student, Ritchie Lab 6W, Epigenetics and Development
Walter & Eliza Hall Institute of Medical Research
1G Royal Parade, Parkville VIC 3052, Australia



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] mclapply returns NULLs on MacOS when running GAM

2020-04-28 Thread Simon Urbanek
Sorry, the code works perfectly fine for me in R even for 1e6 observations (but 
I was testing with R 4.0.0). Are you using some kind of GUI?

Cheers,
Simon




Re: [Rd] mclapply returns NULLs on MacOS when running GAM

2020-04-28 Thread Shian Su
Yes, I am running RStudio 1.2.5033. I was also running this code without 
error on Ubuntu in RStudio. Checking again in the terminal, it does indeed 
work fine even with large data.frames.

Any idea as to what interaction between RStudio and mclapply causes this?

Thanks,
Shian



Re: [Rd] mclapply returns NULLs on MacOS when running GAM

2020-04-28 Thread Henrik Bengtsson
Hi, a few comments below.

First, from my experience and troubleshooting similar reports from
others, a returned NULL from parallel::mclapply() is often because the
corresponding child process crashed/died. However, when this happens
you should see a warning, e.g.

> y <- parallel::mclapply(1:2, FUN = function(x) if (x == 2) quit("no") else x)
Warning message:
In parallel::mclapply(1:2, FUN = function(x) if (x == 2) quit("no") else x) :
  scheduled core 2 did not deliver a result, all values of the job
will be affected
> str(y)
List of 2
 $ : int 1
 $ : NULL

This warning is produced on R 4.0.0 and R 3.6.2 in Linux, and I would
assume that the warning is also produced on macOS.  It's not clear from
your message whether you also got that warning or not.
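
A minimal sketch of how such silent failures could be surfaced in a script, assuming a Unix-alike where forking is available. The helper name mclapply_checked() is hypothetical and not part of the parallel package; it only collects the warnings that mclapply() raises and lists which elements came back as NULL. Note that a FUN that legitimately returns NULL would be flagged as well.

```r
library(parallel)

## Hypothetical helper (not part of 'parallel'): run mclapply(), collect any
## warnings it raises, and report which elements came back as NULL.
mclapply_checked <- function(X, FUN, ..., mc.cores = 2L) {
  ## mclapply() cannot fork on Windows, so fall back to a single core there
  if (.Platform$OS.type == "windows") mc.cores <- 1L
  warnings_seen <- character()
  res <- withCallingHandlers(
    mclapply(X, FUN, ..., mc.cores = mc.cores),
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  failed <- which(vapply(res, is.null, logical(1)))
  list(value = res, failed = failed, warnings = warnings_seen)
}

out <- mclapply_checked(1:2, function(i) i^2)
```

If out$failed is non-empty or out$warnings mentions a core that "did not deliver a result", the corresponding child process most likely died, as described above.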

Second, forked processing, as used by parallel::mclapply(), is advised
against when using the RStudio Console [0].  Unfortunately, there's no
way to disable forked processing in R [1].  You could add the
following to your ~/.Rprofile startup file:

## Warn when forked processing is used in the RStudio Console
if (Sys.getenv("RSTUDIO") == "1" && !nzchar(Sys.getenv("RSTUDIO_TERM"))) {
  invisible(trace(parallel:::mcfork, tracer = quote(warning(
    "parallel::mcfork() was used. Note that forked processes, ",
    "e.g. parallel::mclapply(), may be unstable when used from ",
    "the RStudio Console ",
    "[https://github.com/rstudio/rstudio/issues/2597#issuecomment-482187011]",
    call. = FALSE
  ))))
}

to detect when forked processing is used in the RStudio Console -
either by you or by some package code that you use directly or
indirectly.  You could even use stop() here if you want to be
conservative.

[0] https://github.com/rstudio/rstudio/issues/2597#issuecomment-482187011
[1] https://stat.ethz.ch/pipermail/r-devel/2020-January/078896.html
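
The same environment-variable check could also be used to choose a core count at run time rather than only to warn. The sketch below is an assumption built on the two variables tested above (RSTUDIO and RSTUDIO_TERM); the function names in_rstudio_console() and safe_cores() are hypothetical. It relies on mclapply() falling back to plain lapply() when given a single core.

```r
## Hypothetical helper: TRUE when running in the RStudio Console
## (as opposed to a terminal, including the RStudio terminal pane).
in_rstudio_console <- function() {
  Sys.getenv("RSTUDIO") == "1" && !nzchar(Sys.getenv("RSTUDIO_TERM"))
}

## Hypothetical helper: use one core (no forking) in the RStudio Console
## or on Windows, otherwise the number of cores the caller asked for.
safe_cores <- function(requested = 2L) {
  if (in_rstudio_console() || .Platform$OS.type == "windows") {
    1L
  } else {
    as.integer(requested)
  }
}

res <- parallel::mclapply(1:2, function(i) i + 1L, mc.cores = safe_cores(2L))
```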

/Henrik



Re: [Rd] Rtools and R 4.0.0?

2020-04-28 Thread Hervé Pagès

Thanks Jeroen!


On Tue, Apr 7, 2020 at 6:07 PM Kevin Ushey  wrote:


Regardless, I would like to thank R core, CRAN, and Jeroen for all of
the time that has gone into creating and validating this new
toolchain. This is arduous work at an especially arduous time, so I'd
like to voice my appreciation for all the time and energy they have
spent on making this possible.


Absolutely. Thanks to R core, CRAN, Jeroen, and all the other people 
involved in creating the new Windows toolchain.


Cheers,
H.



Best,
Kevin

On Tue, Apr 7, 2020 at 7:47 AM Dirk Eddelbuettel  wrote:



There appears to have been some progress on this matter:

-Note that @command{g++} 4.9.x (as used for @R{} on Windows up to 3.6.x)
+Note that @command{g++} 4.9.x (as used on Windows prior to @R{} 4.0.0)

See SVN commit r78169 titled 'anticipate change in Windows toolchain', or the
mirrored git commit at
https://github.com/wch/r-source/commit/bd674e2b76b2384169424e3d899fbfb5ac174978

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319



Re: [Rd] Rtools and R 4.0.0?

2020-04-28 Thread Gabriel Becker
 Huge thanks to you (Jeroen) and R-core for doing this.

I wasn't involved with this directly but I know it was a pretty seriously
heavy lift so well done all around!

~G





Re: [Rd] Rtools and R 4.0.0?

2020-04-28 Thread Avraham Adler
Absolutely; this is a complicated and frustrating procedure, and we
owe Jeroen and all our gratitude!

Avi



[Rd] R 4.0.0 build error with sysdata.rda on ppc64el architecture

2020-04-28 Thread Dirk Eddelbuettel


The R 4.0.0 package migration on Debian is being held back by a failed build
on ppc64el [1]. We can see from the history of build logs [2] that it used
to build, briefly failed, worked again and then failed leading up to R 4.0.0's
release. (And my bad for missing how the alpha1/alpha2/beta/rc builds failed.)

I have however neither changed anything, nor did I ever have to accommodate
ppc64el (as has happened with other platforms in the past).

The automated build gets killed after 150 mins at

make[7]: Entering directory '/<>/src/library/tools/src'
mkdir -p -- ../../../../library/tools/libs
make[7]: Leaving directory '/<>/src/library/tools/src'
make[6]: Leaving directory '/<>/src/library/tools/src'
make[5]: Leaving directory '/<>/src/library/tools'
make[5]: Entering directory '/<>/src/library/tools'
installing 'sysdata.rda'
E: Build killed with signal TERM after 150 minutes of inactivity

as can be seen in [3]. The Debian wiki has pointers for getting a shell
account on such platforms [4] (and that is not limited to Debianers but a
'Minipower' service).  I now have one such account on the VM farm at Unicamp
[5] in Brazil. It uses OpenStack (slick, never used it before) and I just
provisioned a reasonably beefy machine, booted from one of the available OSs
(Ubuntu 20.04), installed the build-dependencies and ... am now hanging at
the exact same spot:

make[7]: Entering directory '/home/ubuntu/git/r-base/src/library/tools/src'
mkdir -p -- ../../../../library/tools/libs
make[7]: Leaving directory '/home/ubuntu/git/r-base/src/library/tools/src'
make[6]: Leaving directory '/home/ubuntu/git/r-base/src/library/tools/src'
make[5]: Leaving directory '/home/ubuntu/git/r-base/src/library/tools'
make[5]: Entering directory '/home/ubuntu/git/r-base/src/library/tools'
installing 'sysdata.rda'

So at least it reproduces. But how do we go about addressing this? Why would
it be looping infinitely trying to assemble sysdata.rda?

Any hints or suggestions or debug flags I should set?

Thanks in advance for any pointers,  Dirk


[1] https://buildd.debian.org/status/package.php?p=r-base&suite=experimental
[2] https://buildd.debian.org/status/logs.php?pkg=r-base&arch=ppc64el
[3] https://buildd.debian.org/status/fetch.php?pkg=r-base&arch=ppc64el&ver=4.0.0-1&stamp=1587737274&raw=0
[4] https://wiki.debian.org/ppc64el
[5] https://openpower.ic.unicamp.br/minicloud/

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [Rd] mclapply returns NULLs on MacOS when running GAM

2020-04-28 Thread Shian Su
Thanks Henrik,

That clears things up significantly. I did see the warning but failed to 
include it in my initial email. It sounds like an RStudio issue, and it seems 
that it’s quite intrinsic to how forks interact with RStudio. Given this code 
is eventually going to be part of a package, should I expect it to fail 
mysteriously in RStudio for my users? Is the best solution here to migrate all 
my parallelism to PSOCK for the foreseeable future?

Thanks,
Shian


Re: [Rd] mclapply returns NULLs on MacOS when running GAM

2020-04-28 Thread Simon Urbanek
Do NOT use mcparallel() in packages except as a non-default option that the 
user can set, for the reasons Henrik explained. Multicore is intended for HPC 
applications that need to use many cores for computing-heavy jobs, but it does 
not play well with RStudio and, more importantly, you don't know the resources 
available, so only the user can tell you when it's safe to use. Multi-core 
machines are often shared, so using all detected cores is a very bad idea. The 
user should be able to explicitly enable it, but it should not be enabled by 
default.
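
One way this opt-in advice might look in package code is sketched below. The function name fit_all() and the option name "mypkg.cores" are hypothetical, chosen only for illustration; the point is that the default is serial and forking happens only when the user explicitly asks for more than one core.

```r
## Hypothetical package function: serial by default, forked only on request.
## 'mypkg.cores' is an assumed option name, not an existing R option.
fit_all <- function(X, FUN, ..., cores = getOption("mypkg.cores", 1L)) {
  cores <- as.integer(cores)
  if (cores > 1L && .Platform$OS.type != "windows") {
    ## The user opted in, e.g. via options(mypkg.cores = 4L)
    parallel::mclapply(X, FUN, ..., mc.cores = cores)
  } else {
    lapply(X, FUN, ...)
  }
}

res <- fit_all(1:3, function(i) i * 2)   # serial unless the user opts in
```

A user on a dedicated machine could then call options(mypkg.cores = 4L) once, while everyone else, including RStudio users, gets the safe serial path.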

As for parallelism, it depends heavily on your use-case. Native parallelism is 
preferred (threads, OpenMP, ...) and I assume you're not talking about that as 
that is always the first option. Multicore works well in cases where there is 
no easy native solution and you need to share a lot of data for small results. 
If the data is small, or you need to read it first, then other methods like 
PSOCK may be preferable. In any case, parallelization only makes sense for code 
that you know will take a long time to run.

Cheers,
Simon



Re: [Rd] mclapply returns NULLs on MacOS when running GAM

2020-04-28 Thread Shian Su
Thanks Simon,

I will take note of the sensible default for core usage. I’m trying to achieve 
small-scale parallelism, where tasks take 1-5 seconds, and make fuller use of 
consumer hardware. It’s not an HPC-worthy computation, but even laptops these 
days come with 4 cores and I don’t see a reason not to make use of them.

The goal for the current piece of code I’m working on is to bootstrap many 
smoothing fits to generate prediction intervals; this is quite easy to write 
using mclapply. When you say native with threads, OpenMP, etc., are you 
referring to the C/C++ level? From my understanding, most parallel packages 
in R end up calling multicore or snow deep down.

I think one of the great advantages of mclapply is that it defaults to lapply 
when running on a single thread, which makes it much easier to maintain code 
with optional parallelism. I’m already running into trouble with the fact that 
PSOCK doesn’t seem to retain loaded packages in spawned processes. I would love 
to know if there are reliable options in R that allow a similar interface to 
mclapply but use a different and more RStudio-stable mode of parallelisation?

Thanks,
Shian

> On 29 Apr 2020, at 1:33 pm, Simon Urbanek  wrote:
>
> Do NOT use mcparallel() in packages except as a non-default option that user 
> can set for the reasons Henrik explained. Multicore is intended for HPC 
> applications that need to use many cores for computing-heavy jobs, but it 
> does not play well with RStudio and more importantly you don't know the 
> resource available so only the user can tell you when it's safe to use. 
> Multi-core machines are often shared so using all detected cores is a very 
> bad idea. The user should be able to explicitly enable it, but it should not 
> be enabled by default.
>
> As for parallelism, it depends heavily on your use-case. Native parallelism 
> is preferred (threads, OpenMP, ...) and I assume you're not talking about 
> that as that is always the first option. Multicore works well in cases where 
> there is no easy native solution and you need to share a lot of data for 
> small results. If the data is small, or you need to read it first, then other 
> methods like PSOCK may be preferable. In any case, parallelization only makes 
> sense for code that you know will take a long time to run.
>
> Cheers,
> Simon
>
>
>> On 29/04/2020, at 11:54 AM, Shian Su  wrote:
>>
>> Thanks Henrik,
>>
>> That clears things up significantly. I did see the warning but failed to 
>> include it in my initial email. It sounds like an RStudio issue, and it 
>> seems that it’s quite intrinsic to how forks interact with RStudio. Given 
>> this code is eventually going to be a part of a package, should I expect it 
>> to fail mysteriously in RStudio for my users? Is the best solution here to 
>> migrate all my parallelism to PSOCK for the foreseeable future?
>>
>> Thanks,
>> Shian
>>
>>> On 29 Apr 2020, at 2:08 am, Henrik Bengtsson  
>>> wrote:
>>>
>>> Hi, a few comments below.
>>>
>>> First, from my experience and troubleshooting similar reports from
>>> others, a returned NULL from parallel::mclapply() is often because the
>>> corresponding child process crashed/died. However, when this happens
>>> you should see a warning, e.g.
>>>
>>> > y <- parallel::mclapply(1:2, FUN = function(x) if (x == 2) quit("no") else x)
>>> Warning message:
>>> In parallel::mclapply(1:2, FUN = function(x) if (x == 2) quit("no") else x) :
>>> scheduled core 2 did not deliver a result, all values of the job
>>> will be affected
>>> > str(y)
>>> List of 2
>>> $ : int 1
>>> $ : NULL
>>>
>>> This warning is produced on R 4.0.0 and R 3.6.2 in Linux, but I would
>>> assume that warning is also produced on macOS.  It's not clear from
>>> your message whether you also got that warning or not.
>>>
>>> Second, forked processing, as used by parallel::mclapply(), is advised
>>> against when using the RStudio Console [0].  Unfortunately, there's no
>>> way to disable forked processing in R [1].  You could add the
>>> following to your ~/.Rprofile startup file:
>>>
>>> ## Warn when forked processing is used in the RStudio Console
>>> if (Sys.getenv("RSTUDIO") == "1" && !nzchar(Sys.getenv("RSTUDIO_TERM"))) {
>>>   invisible(trace(parallel:::mcfork, tracer = quote(warning(
>>>     "parallel::mcfork() was used. Note that forked processes, ",
>>>     "e.g. parallel::mclapply(), may be unstable when used from ",
>>>     "the RStudio Console ",
>>>     "[https://github.com/rstudio/rstudio/issues/2597#issuecomment-482187011]",
>>>     call. = FALSE
>>>   ))))
>>> }
>>>
>>> to detect when forked processing is used in the RStudio Console -
>>> either by you or by some package code that you use directly or
>>> indirectly.  You could even use stop() here if you wanna be
>>> conservative.
>>>
>>> [0] https://github.com/rstudio/rstudio/issues/2597#issuecomment-482187011
>>> [1] https://stat.ethz.ch/pipermail/r-devel/2020-January/078896.html
>>>
>>> /Henrik
>>>
>>> On Tue, Apr 28, 2020 at 2:39 AM Shian Su  wrote:

 Yes 

Re: [Rd] mclapply returns NULLs on MacOS when running GAM

2020-04-28 Thread Henrik Bengtsson
On Tue, Apr 28, 2020 at 9:00 PM Shian Su  wrote:
>
> Thanks Simon,
>
> I will take note of the sensible default for core usage. I’m trying to 
> achieve small-scale parallelism, where tasks take 1-5 seconds, and make 
> fuller use of consumer hardware. It’s not an HPC-worthy computation, but 
> even laptops these days come with 4 cores and I don’t see a reason not to 
> make use of them.
>
> The goal for the current piece of code I’m working on is to bootstrap many 
> smoothing fits to generate prediction intervals; this is quite easy to write 
> using mclapply. When you say native with threads, OpenMP, etc., are you 
> referring to the C/C++ level? From my understanding, most parallel packages 
> in R end up calling multicore or snow deep down.
>
> I think one of the great advantages of mclapply is that it defaults to lapply 
> when running on a single thread, which makes it much easier to maintain code 
> with optional parallelism. I’m already running into trouble with the fact 
> that PSOCK doesn’t seem to retain loaded packages in spawned processes. I 
> would love to know if there are reliable options in R that allow a similar 
> interface to mclapply but use a different and more RStudio-stable mode of 
> parallelisation?

If you use parLapply(cl, ...) and give end-users control over the
cluster 'cl' object (e.g. via an argument), then they have the
option to choose among the different types of clusters that cl <-
parallel::makeCluster(...) can create, notably PSOCK, FORK and MPI
clusters, but the framework supports others.
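
As a rough sketch of the parLapply() approach just described (not code from this thread): the names fit_one_bootstrap() and boot_edf() are hypothetical, and stats::smooth.spline() stands in for the GAM fit so that the example needs only packages shipped with R. The cluster is an optional argument, and the function falls back to plain lapply() when none is supplied.

```r
library(parallel)

# Hypothetical worker: fit one bootstrap resample and return the
# effective degrees of freedom of the smoothing spline.
# Defined at top level so that it can be shipped to PSOCK workers cheaply.
fit_one_bootstrap <- function(i, df) {
  idx <- sample(nrow(df), replace = TRUE)
  stats::smooth.spline(df$x[idx], df$y[idx])$df
}

# Hypothetical developer-facing function: the caller decides whether
# (and on which kind of cluster) the work runs in parallel.
boot_edf <- function(df, n_boot = 20, cl = NULL) {
  if (is.null(cl)) {
    lapply(seq_len(n_boot), fit_one_bootstrap, df = df)   # sequential fallback
  } else {
    parLapply(cl, seq_len(n_boot), fit_one_bootstrap, df = df)
  }
}

set.seed(1)
df <- data.frame(x = seq(0, 1, length.out = 200))
df$y <- sin(2 * pi * df$x) + rnorm(200, sd = 0.2)

res_seq <- boot_edf(df)            # no cluster: plain lapply()

cl <- makeCluster(2)               # PSOCK is the default cluster type
clusterSetRNGStream(cl, 1)         # reproducible random numbers on the workers
res_par <- boot_edf(df, cl = cl)
stopCluster(cl)
```

Calling functions with an explicit namespace prefix, as with stats::smooth.spline() here, is one way to avoid relying on packages being attached in the worker processes.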

The 'foreach' framework takes this separation of *what* to parallelize
(which you decide as a developer) and *how* to parallelize (which the
end-user decides) further by so-called foreach adaptors, aka parallel
backends.  With foreach, users have plenty of doNnn packages to pick
from: doMC, doParallel, doMPI, doSNOW, doRedis, and doFuture.  Several
of these parallel backends build on top of the core functions provided
by the 'parallel' package.  So, with foreach your users can use forked
parallel processing if they want, or something else (selected at
the top of their script).
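
A minimal sketch of that split, assuming the 'foreach' package is installed; boot_means() is a hypothetical function used only for illustration. The developer writes the loop with %dopar%, and the user registers a backend. Here the sequential backend that ships with foreach is registered so the example runs without any extra packages.

```r
library(foreach)

# Developer side: *what* to parallelize.
# Hypothetical example: bootstrap the mean of a numeric vector.
boot_means <- function(x, n_boot = 100) {
  foreach(i = seq_len(n_boot), .combine = c) %dopar% {
    mean(sample(x, replace = TRUE))
  }
}

# User side: *how* to run it, chosen at the top of the script.
# registerDoSEQ() runs the iterations sequentially; a user could instead
# register a parallel backend, e.g. doParallel::registerDoParallel(cl)
# for a cluster 'cl' (requires the doParallel package).
registerDoSEQ()

set.seed(42)
m <- boot_means(rnorm(50))
```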

(Disclaimer: I'm the author) The 'future' framework tries to take this
developer-end-user separation one step further and with a lower level
API - future(), value(), resolved() - for which different parallel
backends have been implemented, e.g. multicore, multisession
("PSOCK"), cluster (any parallel::makeCluster() cluster), callr,
batchtools (HPC job schedulers), etc.  All these have been tested to
conform to the Future API specs, so we know our parallel code works
regardless of which of these backends the user picks.  Now, based on
these basic future low-level functions, other higher level APIs have
been implemented.  For instance, the future.apply package provides
futurized versions of all base R apply functions, e.g. future_lapply(),
future_vapply(), future_Map(), etc.  You can basically take your
lapply(...) code and replace it with future_lapply(...) and things
will just work.  So, try replacing your current mclapply() with
future_lapply().  If you/the user uses the 'multicore' backend - set
by plan(multicore) at the top of the script - you'll get basically what
mclapply() provides.  If plan(multisession) is used, then you basically
get what parLapply() does.  The difference is that you don't have to
worry about globals and packages.  If you like the foreach-style of
map-reduce, you can use futures via the doFuture backend.  If you like
the purrr-style of map-reduce, you can use the 'furrr' package.  So,
and I'm obviously biased, if you pick the future framework, you'll
leave yourself and end-users with more options going forward.
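
To make the mclapply() to future_lapply() swap concrete, here is a hedged sketch assuming the 'future' and 'future.apply' packages are installed. boot_predict() is a hypothetical helper, and stats::smooth.spline() is used in place of mgcv::gam() to keep the example light; it is not the code from the original report.

```r
library(future.apply)

# Hypothetical developer-side function: bootstrap smoothing-spline fits
# and return the predictions of each fit on a common grid.
boot_predict <- function(df, n_boot = 20,
                         grid = seq(min(df$x), max(df$x), length.out = 50)) {
  future_lapply(seq_len(n_boot), function(i) {
    idx <- sample(nrow(df), replace = TRUE)
    fit <- stats::smooth.spline(df$x[idx], df$y[idx])
    predict(fit, grid)$y
  }, future.seed = TRUE)   # parallel-safe random number streams
}

set.seed(1)
df <- data.frame(x = seq(0, 1, length.out = 200))
df$y <- sin(2 * pi * df$x) + rnorm(200, sd = 0.2)

# User side: pick the backend. multisession uses background R sessions
# (no forking); plan(multicore) would use forked processes instead.
future::plan(future::multisession, workers = 2)
preds <- boot_predict(df)
future::plan(future::sequential)   # reset to the default backend

# Pointwise 95% bootstrap band across the grid
pred_mat <- do.call(cbind, preds)
band <- apply(pred_mat, 1, quantile, probs = c(0.025, 0.975))
```

The function itself never calls plan(), so the same code runs sequentially, with forks, or on background sessions depending only on what the user sets.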

Clear as mud?

/Henrik

PS. Simon, I think your explicit comment on mcparallel() & friends is
very helpful for many people and developers. It clearly tells
developers to never use mclapply() as the only path through their
code. I'm quite sure not everyone has been or is aware of this. Now
it's clear. Thank you.

>
> Thanks,
> Shian
>
> > On 29 Apr 2020, at 1:33 pm, Simon Urbanek  
> > wrote:
> >
> > Do NOT use mcparallel() in packages except as a non-default option that 
> > user can set for the reasons Henrik explained. Multicore is intended for 
> > HPC applications that need to use many cores for computing-heavy jobs, but 
> > it does not play well with RStudio and more importantly you don't know the 
> > resource available so only the user can tell you when it's safe to use. 
> > Multi-core machines are often shared so using all detected cores is a very 
> > bad idea. The user should be able to explicitly enable it, but it should 
> > not be enabled by default.
> >
> > As for parallelism, it depends heavily on your use-case. Native parallelism 
> > is preferred (threads, OpenMP, ...) and I assume you're not talking about 
> > that as that is always the first option. Multicore works well in cases 
> > where there is no easy native solution and you need to share a lot of data 
> > for smal