Re: [Rd] Memory leak with tons of closed connections

2016-11-11 Thread Martin Maechler
> Gergely Daróczi 
> on Thu, 10 Nov 2016 16:48:12 +0100 writes:

> Dear All,
> I'm developing an R application running inside of a Java daemon on
> multiple threads, and interacting with the parent daemon via stdin and
> stdout.

> Everything works perfectly fine except for having some memory leaks
> somewhere. Simplified version of the R app:

> while (TRUE) {
> con <- file('stdin', open = 'r', blocking = TRUE)
> line <- scan(con, what = character(0), nlines = 1, quiet = TRUE)
> close(con)
> }

> This loop uses more and more RAM as time passes (see more on this
> below), not sure why, and I have no idea currently on how to debug
> this further. Can someone please try to reproduce it and give me some
> hints on what is the problem?

> Sample bash script to trigger an R process with such memory leak:

> Rscript --vanilla -e "while(TRUE)cat(runif(1),'\n')" |
>   Rscript --vanilla -e "cat(Sys.getpid(),'\n');while(TRUE){con<-file('stdin',open='r',blocking=TRUE);line<-scan(con,what=character(0),nlines=1,quiet=TRUE);close(con);rm(con);gc()}"

> Maybe you have to escape '\n' depending on your shell.

> Thanks for reading this and any hints would be highly appreciated!

I have no hints, sorry... but I can give some more "data":

I've changed the above to *print* the gc() result every 1000th
iteration, and after 100'000 iterations, there is still no
memory increase from the point of view of R itself.

However, monitoring the process (via 'htop', e.g.) shows about
1 MB per second increase in the memory footprint of the process.
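
Since gc() only reports memory that R itself manages, seeing this kind of growth from inside the process needs a check at the operating-system level. A minimal sketch, assuming Linux with a mounted /proc file system; `rss_kb` is a hypothetical helper name, not an existing R function:

```r
# Hypothetical helper: resident set size (kB) of an R process, read from
# /proc/<pid>/status. Assumes Linux; returns NA where /proc is unavailable.
rss_kb <- function(pid = Sys.getpid()) {
  status_file <- file.path("/proc", pid, "status")
  if (!file.exists(status_file)) return(NA_real_)
  lines <- readLines(status_file, warn = FALSE)
  rss <- grep("^VmRSS:", lines, value = TRUE)
  if (length(rss) == 0L) return(NA_real_)
  # The line looks like "VmRSS:    12345 kB"; keep only the digits.
  as.numeric(gsub("[^0-9]", "", rss[1]))
}
```

Printing rss_kb() next to the gc() summary every 1000th iteration would put the R view and the OS view side by side.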

One could argue that the error is with the OS / pipe / bash
rather than with R itself... but I'm not expert enough to
argue here at all.

Here's my version of your sample bash script and its output:

$ Rscript --vanilla -e "while(TRUE)cat(runif(1),'\n')" |
    Rscript --vanilla -e "cat(Sys.getpid(),'\n');i <- 0;
      while(TRUE){con<-file('stdin',open='r',blocking=TRUE);
        line<-scan(con,what=character(0),nlines=1,quiet=TRUE);
        close(con);rm(con);a <- gc(); i <- i+1;
        if(i %% 1000 == 1) {cat('i=',i,'\\n'); print(a)} }"

11059
i= 1
         used (Mb) gc trigger  (Mb) max used (Mb)
Ncells  83216  4.5   10000000 534.1   213529 11.5
Vcells 172923  1.4   16777216 128.0   562476  4.3
i= 1001
         used (Mb) gc trigger  (Mb) max used (Mb)
Ncells  83255  4.5   10000000 534.1   213529 11.5
Vcells 172958  1.4   16777216 128.0   562476  4.3
...
i= 80001
         used (Mb) gc trigger  (Mb) max used (Mb)
Ncells  83255  4.5   10000000 534.1   213529 11.5
Vcells 172958  1.4   16777216 128.0   562476  4.3
i= 81001
         used (Mb) gc trigger  (Mb) max used (Mb)
Ncells  83255  4.5   10000000 534.1   213529 11.5
Vcells 172959  1.4   16777216 128.0   562476  4.3
i= 82001
         used (Mb) gc trigger  (Mb) max used (Mb)
Ncells  83255  4.5   10000000 534.1   213529 11.5
Vcells 172959  1.4   16777216 128.0   562476  4.3
i= 83001
         used (Mb) gc trigger  (Mb) max used (Mb)
Ncells  83255  4.5   10000000 534.1   213529 11.5
Vcells 172958  1.4   16777216 128.0   562476  4.3
i= 84001
         used (Mb) gc trigger  (Mb) max used (Mb)
Ncells  83255  4.5   10000000 534.1   213529 11.5
Vcells 172958  1.4   16777216 128.0   562476  4.3


> Best,
> Gergely

> PS1 see the image posted at
> http://stackoverflow.com/questions/40522584/memory-leak-with-closed-connections
> on memory usage over time
> PS2 the issue doesn't seem to be due to writing more data in the first
> R app compared to what the second R app can handle, as I tried the
> same with adding a Sys.sleep(0.01) in the first app and that's not an
> issue at all in the real application
> PS3 I also tried using stdin() instead of file('stdin'), but that did
> not work well for the stream running on multiple threads started by
> the same parent Java daemon
> PS4 I've tried this on Linux using R 3.2.3 and 3.3.2

For me, it's Linux, too (Fedora 24), using 'R 3.3.2 patched'.

Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Memory leak with tons of closed connections

2016-11-11 Thread Gergely Daróczi
On Fri, Nov 11, 2016 at 12:08 PM, Martin Maechler
 wrote:
[...]

Thank you very much, this was very useful!

I tried to do some more research on this, as Gabor Csardi also
suspected that the memory growth might be due to the writer being faster
than the reader, so data is simply accumulating in the input buffer of
the reader. I double checked this via:

Rscript --vanilla -e "i<-1;while(TRUE){cat(runif(1),'\n');i<-i+1;if(i==1e6){Sys.sleep(15);i<-1}}" |
  Rscript --vanilla -e "cat(Sys.getpid(),'\n');i<-0;while(TRUE){con<-file('stdin',open='r',blocking=TRUE);line<-scan(con,what=character(0),nlines=1,quiet=TRUE);close(con);rm(con);a<-gc();i<-i+1;if(i%%1e3==1){cat('i=',i,'\\n');print(a)}}"

So the writer generates a good number of lines, but sleeps for 15
seconds after a while so that the reader can catch up. Monitoring the
memory footprint of the process (by the way gc reported no memory
increase in the reader, just like in Martin's output) shows that the
memory grows when the writer sends data, and it's constant when the
writer is sleeping, but it never decreases: http://imgur.com/r7T02pK

Maybe it's more like an OS-specific question based on this, you are
absolutely right, but I was not able to reproduce the same memory
issue in plain bash via:

while :;do echo '1';done | bash -c "while :;do read;done"

But I'm not sure if this does exactly 

[Rd] .S3methods: issue in content of info data.frame

2016-11-11 Thread Renaud Gaujoux
Hi,

I was trying to get a list of S3 methods for a given generic, along with the
package in which they are defined, and I came across what looks like an
issue in the data.frame returned in attribute 'info'. The column 'from'
mostly gets the value "registered S3method for ..." except for visible
methods. Is this the expected behavior?
See code and output below.

Thank you.

Bests,
Renaud

$ Rscript -e "library(xtable); attr(.S3methods('plot'), 'info');
sessionInfo()"
                   visible                         from generic  isS4
plot.acf             FALSE registered S3method for plot    plot FALSE
plot.data.frame      FALSE registered S3method for plot    plot FALSE
plot.decomposed.ts   FALSE registered S3method for plot    plot FALSE
plot.default          TRUE                     graphics    plot FALSE
plot.dendrogram      FALSE registered S3method for plot    plot FALSE
plot.density         FALSE registered S3method for plot    plot FALSE
plot.ecdf             TRUE                        stats    plot FALSE
plot.factor          FALSE registered S3method for plot    plot FALSE
plot.formula         FALSE registered S3method for plot    plot FALSE
plot.function         TRUE                     graphics    plot FALSE
plot.hclust          FALSE registered S3method for plot    plot FALSE
plot.histogram       FALSE registered S3method for plot    plot FALSE
plot.HoltWinters     FALSE registered S3method for plot    plot FALSE
plot.isoreg          FALSE registered S3method for plot    plot FALSE
plot.lm              FALSE registered S3method for plot    plot FALSE
plot.medpolish       FALSE registered S3method for plot    plot FALSE
plot.mlm             FALSE registered S3method for plot    plot FALSE
plot.ppr             FALSE registered S3method for plot    plot FALSE
plot.prcomp          FALSE registered S3method for plot    plot FALSE
plot.princomp        FALSE registered S3method for plot    plot FALSE
plot.profile.nls     FALSE registered S3method for plot    plot FALSE
plot.raster          FALSE registered S3method for plot    plot FALSE
plot.spec            FALSE registered S3method for plot    plot FALSE
plot.stepfun          TRUE                        stats    plot FALSE
plot.stl             FALSE registered S3method for plot    plot FALSE
plot.table           FALSE registered S3method for plot    plot FALSE
plot.ts               TRUE                        stats    plot FALSE
plot.tskernel        FALSE registered S3method for plot    plot FALSE
plot.TukeyHSD        FALSE registered S3method for plot    plot FALSE
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_ZA.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_ZA.UTF-8        LC_COLLATE=en_ZA.UTF-8
 [5] LC_MONETARY=en_ZA.UTF-8    LC_MESSAGES=en_ZA.UTF-8
 [7] LC_PAPER=en_ZA.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_ZA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  base

other attached packages:
[1] xtable_1.8-2

loaded via a namespace (and not attached):
[1] tools_3.3.2
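
For the stated goal (each S3 method together with the package that defines it), one possible route that does not depend on the 'from' column is to look up each method function and ask for the name of its enclosing environment. This is only a sketch: `s3_method_origins` is a hypothetical helper, and it assumes the row names of the 'info' data frame are the method names, as in the output above.

```r
# Hypothetical helper (not part of R): for each S3 method of `generic`,
# report the name of the environment in which the method was defined.
s3_method_origins <- function(generic) {
  info <- attr(utils::.S3methods(generic), "info")
  method_names <- rownames(info)
  # Drop the "<generic>." prefix to recover the class part of each name.
  classes <- substring(method_names, nchar(generic) + 2L)
  origin <- vapply(classes, function(cls) {
    f <- utils::getS3method(generic, cls, optional = TRUE)
    # Not found, or no closure environment (e.g. a primitive): report NA.
    if (is.null(f) || is.null(environment(f))) return(NA_character_)
    environmentName(environment(f))
  }, character(1), USE.NAMES = FALSE)
  data.frame(method = method_names, origin = origin,
             stringsAsFactors = FALSE)
}

head(s3_method_origins("plot"))
```

For methods defined in a package this yields the namespace name; methods defined elsewhere (for example in the global workspace) would show whatever environmentName() returns for that environment.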



Re: [Rd] Memory leak with tons of closed connections

2016-11-11 Thread Gábor Csárdi
On Fri, Nov 11, 2016 at 12:46 PM, Gergely Daróczi
 wrote:
[...]
>> I've changed the above to *print* the gc() result every 1000th
>> iteration, and after 100'000 iterations, there is still no
>> memory increase from the point of view of R itself.

Yes, R does not know about it, it does not manage this memory (any
more), but the R process requested this memory from the OS, and never
gave it back, which is basically the definition of a memory leak. No?

I think the leak is because 'stdin' is special and R opens it with fdopen():
https://github.com/wch/r-source/blob/f8cdadb769561970cc42776f563043ea5e12fe05/src/main/connections.c#L561-L579

and then it does not close it:
https://github.com/wch/r-source/blob/f8cdadb769561970cc42776f563043ea5e12fe05/src/main/connections.c#L636

I understand that R cannot fclose the FILE*, because that would also
close the file descriptor, but anyway, this causes a memory leak. I
think.

It seems that you cannot close the FILE* without closing the
descriptor, so maybe a workaround would be to keep one FILE* open,
instead of calling fdopen() to create new ones every time. Another
possible workaround is to use dup(), but I don't know enough about the
details to be sure.
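
At the R level, the analogous idea would be to open the connection once and keep reading from it, rather than calling file('stdin') on every iteration. A minimal sketch under assumptions: `read_loop` and `handle` are hypothetical names, readLines() is swapped in for scan() (it returns the whole line instead of splitting on whitespace), and whether this fits the multi-threaded Java setup described earlier has not been checked here.

```r
# Sketch: read line by line from a connection that is opened only once.
# `handle` is a hypothetical callback standing in for the application logic.
read_loop <- function(con, handle) {
  repeat {
    line <- readLines(con, n = 1L, warn = FALSE)
    if (length(line) == 0L) break   # end of input
    handle(line)
  }
  invisible(NULL)
}

# In the real application, stdin would be opened once, outside the loop:
# con <- file("stdin", open = "r", blocking = TRUE)
# read_loop(con, function(line) cat("got:", line, "\n"))
# close(con)
```

Taking the connection as an argument keeps the loop testable with a textConnection() and leaves the choice of input source to the caller.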

Gabor

[...]


[Rd] Frames in compiled functions

2016-11-11 Thread brodie gaslam via R-devel
I noticed some problems that cropped up in the latest versions of R-devel
(2016-11-08 r71639 in my case) for one of my packages.  I _think_ I have
narrowed it down to the changes to what gets byte-compiled by default.  The
following example run illustrates the problem I'm having:

  compiler::enableJIT(0)
  fun <- function(x) local(as.list(parent.frame(2)))
  fun(1)
  ## $x
  ## [1] 1
  ## 

  

  compiler::cmpfun(fun)(1)
  ## 


Is this considered problematic at all?  If so, might it make sense to broaden 
the list of functions that disable JIT compilation beyond `browser`?  I'm 
pretty sure I can work around this issue in my specific use case, but figured 
it is worth mentioning here since it is a change in behavior.


Regards,

B.



Re: [Rd] Frames in compiled functions

2016-11-11 Thread Winston Chang
It looks like the byte compiler is optimizing local() to an
immediately-invoked function, instead of using eval() and substitute(). I
don't know if that's exactly how it's implemented internally, but that's
what it looks like here:

compiler::enableJIT(0)

fun <- function(x) {
   local(sys.calls())
}
fun(1)
## [[1]]
## fun(1)
##
## [[2]]
## local(sys.calls())
##
## [[3]]
## eval.parent(substitute(eval(quote(expr), envir)))
##
## [[4]]
## eval(expr, p)
##
## [[5]]
## eval(expr, envir, enclos)
##
## [[6]]
## eval(quote(sys.calls()), new.env())
##
## [[7]]
## eval(expr, envir, enclos)


compiler::cmpfun(fun)(1)
## [[1]]
## (compiler::cmpfun(fun))(1)
##
## [[2]]
## (function() sys.calls())()


-Winston




Re: [Rd] Frames in compiled functions

2016-11-11 Thread luke-tierney

That's about it. The plan is to modify the interpreter to do the same
so the inconsistency will go away. Code that is affected by this is
making assumptions that it should not.
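
One way to avoid depending on how many frames local() puts on the stack is to capture the function's environment explicitly before entering local(), instead of counting frames with parent.frame(2). A minimal sketch; `fun_env` is a name introduced here for illustration:

```r
# Sketch: capture the function's frame explicitly rather than reaching it
# by counting frames from inside local().
fun <- function(x) {
  fun_env <- environment()          # frame of this call to fun()
  local({
    # `fun_env` is found through lexical scoping, whatever the call stack
    mget("x", envir = fun_env)
  })
}

interpreted <- fun(1)
compiled <- compiler::cmpfun(fun)(1)
identical(interpreted, compiled)
```

Because the lookup goes through lexical scoping rather than the call stack, the result should not change with the way local() is evaluated.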

Best,

luke



--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
