Re: [Rd] Missing /share/dictionaries/en_stats.rds

2013-01-07 Thread Martin Maechler
> Henrik Bengtsson 
> on Sun, 6 Jan 2013 17:07:12 -0800 writes:

> Hi.  I'm on Windows 7 64-bit with the latest R-devel and
> Rtools (2.16.0.1926).  When I try to enable spell checking
> for 'R CMD check' by setting environment variable
> '_R_CHECK_CRAN_INCOMING_USE_ASPELL_' to 'true', I get:

> * checking CRAN incoming feasibility ...Warning in
> aspell(files, filter = "dcf", control = control, encoding
> = encoding, : The following dictionaries were not found:
> en_stats

> Looking at the source code, I found that I'm missing
> 'en_stats.rds', e.g.

>> Sys.glob(file.path(R.home("share"), "dictionaries", "*.rds"))
> character(0)

> I see that the missing file is in
> http://svn.r-project.org/R/trunk/share/dictionaries/.  Is
> there a reason for why this is not part of the R
> source/binaries (e.g.
> http://cran.r-project.org/src/base/R-2/R-2.15.2.tar.gz)?
> Intentional or forgotten?

But it's in the source tarballs of R-devel, 
and you are talking about R-devel, right?

Also, for me (on Linux), it is in there

>> Sys.glob(file.path(R.home("share"), "dictionaries", "*.rds"))

Are you sure you're not just mixing R-2.15.x
and R-devel?

Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Missing /share/dictionaries/en_stats.rds

2013-01-07 Thread Uwe Ligges



On 07.01.2013 09:47, Martin Maechler wrote:

> Are you sure you're not just mixing R-2.15.x
> and R-devel?


No, he is right, since it is not shipped with the Windows installer.
We are working on it (and the other aspell error report).

Best,
Uwe







[Rd] Plots not drawn with buffered Cairo 1.12

2013-01-07 Thread Milan Bouchet-Valat
Hi!

On Fedora 18 [1] and Arch Linux [2], using R 2.15.2, X11 plots are not
drawn (i.e. the window stays blank) when using X11.options(type="cairo")
and X11.options(type="dbcairo"). They are correctly drawn when using
X11.options(type="nbcairo") and X11.options(type="xlib"), or after
resizing the X11 window.
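
Until the underlying bug is fixed, the unbuffered type reported as working above can be selected at the start of a session. A minimal sketch, assuming a Unix-alike build where grDevices provides X11.options(); the helper name use_unbuffered_cairo is made up for illustration:

```r
# Hypothetical helper: switch the default X11 device type to "nbcairo",
# which the report above describes as drawing correctly.
use_unbuffered_cairo <- function() {
  if (.Platform$OS.type != "unix" ||
      !exists("X11.options", mode = "function")) {
    return(invisible(NULL))   # nothing to do on platforms without X11.options()
  }
  # X11.options() returns the previous settings invisibly when options are set
  X11.options(type = "nbcairo")
}

old <- use_unbuffered_cairo()
```

This only changes the default for devices opened afterwards; it does not address the missing flush itself.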

The bug happens with Cairo 1.12.4 and above, but not with 1.10.2 (I have
not tested versions between these two). I've filed a bug against Cairo
[3], and the developers replied that R is probably not calling
cairo_surface_flush() when it should: Cairo relies on that call, and
recent releases depend on it in more places.

I tried that idea, but so far I have not been able to fix the bug
by adding cairo_surface_flush() calls in the src/modules/X11/devX11.c
code.

I've also discovered that the code in X11_Mode() is never really run
when drawing a simple plot like 'plot(1:10)'. What happens is that
xd->holdlevel is always 1 when X11_Mode() is called, so the function
returns early. And when xd->holdlevel is set back to 0, X11_Mode() is
not called again. By contrast, if I resize the window, the blocks for
mode==0 and mode==1 are run several times. Is this behavior expected?

This also happens with Cairo 1.10.2, when the plot is correctly drawn,
so this is not the cause of the problem. But I find this puzzling, since
a comment says:

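/********************************************************/
/* device_Mode is called whenever the graphics engine   */
/* starts drawing (mode=1) or stops drawing (mode=0)    */
/* the device is not required to do anything            */
/********************************************************/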

According to what I'm seeing, the "starts drawing" part never happens.
So the comment sounds a bit misleading to me, if not completely wrong.

Any ideas?


Regards


1: https://bugzilla.redhat.com/show_bug.cgi?id=891983
2: https://bugs.archlinux.org/task/32597
3: https://bugs.freedesktop.org/show_bug.cgi?id=59085



Re: [Rd] Missing /share/dictionaries/en_stats.rds

2013-01-07 Thread Prof Brian Ripley

On 07/01/2013 09:33, Uwe Ligges wrote:





> No, he is right, since it is not shipped with the Windows installer.
> We are working on it (and the other aspell error report).


Or put another way, aspell() and its use in R CMD check is not yet 
supported on Windows (which is why the directory was not shipped).



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
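
For anyone scripting checks across platforms, the diagnosis in this thread can be turned into a guard. A minimal sketch that only uses the calls and the environment variable quoted above; the variable names dict_dir and have_en_stats are illustrative, and whether the setting reaches 'R CMD check' depends on how the check is launched from the session:

```r
# Look for the dictionary the warning complains about
dict_dir <- file.path(R.home("share"), "dictionaries")
dicts <- Sys.glob(file.path(dict_dir, "*.rds"))
have_en_stats <- "en_stats.rds" %in% basename(dicts)

if (have_en_stats) {
  # Only request spell checking when the dictionary is actually installed
  Sys.setenv("_R_CHECK_CRAN_INCOMING_USE_ASPELL_" = "true")
} else {
  message("en_stats.rds not found in ", dict_dir,
          "; leaving spell checking off")
}
```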



Re: [Rd] weird bug with parallel, RSQlite and tcltk

2013-01-07 Thread Karl Forner
Hello and thank you.
Indeed gsubfn is responsible for loading tcltk in my case.

On Thu, Jan 3, 2013 at 12:14 PM, Gabor Grothendieck
 wrote:
> options(gsubfn.engine = "R")



Re: [Rd] Small changes to big objects (1)

2013-01-07 Thread Douglas Bates
Is there a difference in the copying behavior of

x@little <- other

and

x@little[] <- other

I was using the second form in (yet another!) modification of the internal
representation of mixed-effects models in the lme4 package in the hopes
that it would not trigger copying of the entire object.  The object
representing the model is quite large but the changes during iterations are
to small vectors representing parameters and coefficients.
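
One way to see whether a given form copies the whole object is tracemem(). The following is a minimal sketch, not the lme4 code: the class Model, its slots and the function update_little are made up for illustration, and tracemem() is only available when R was built with memory profiling, hence the capabilities() guard. No output is shown because what gets reported depends on the R version.

```r
library(methods)

# Hypothetical class with one small and one large slot
setClass("Model", representation(little = "numeric", BIG = "numeric"))
x <- new("Model", little = numeric(3), BIG = numeric(1e6))

# Repeated small updates inside one function call, as in an iteration loop
update_little <- function(obj, n) {
  for (i in seq_len(n)) obj@little <- obj@little + 1
  obj
}

if (capabilities("profmem")) {
  invisible(tracemem(x))     # report each duplication of x (and of its copies)
  x <- update_little(x, 3)
  untracemem(x)
} else {
  x <- update_little(x, 3)
}
```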



On Thu, Jan 3, 2013 at 1:08 PM, John Chambers  wrote:

> Martin Morgan commented in email to me that a change to any slot of an
> object that has other, large slot(s) does substantial computation,
> presumably from copying the whole object.  Is there anything to be done?
>
> There are in fact two possible changes, one automatic but only partial,
> the other requiring some action on the programmer's part.  Herewith the
> first; I'll discuss the second in a later email.
>
> Some context:  The notion is that our object has some big data and some
> additional smaller things.  We need to change the small things but would
> rather not copy the big things all the time.  (With long vectors, this
> becomes even more relevant.)
>
> There are three likely scenarios: slots, attributes and named list
> components.  Suppose our object has "little" and "BIG" encoded in one of
> these.
>
> The three relevant computations are:
>
> x@little <- other
> attr(x, "little") <- other
> x$little <- other
>
> It turns out that these are all similar in behavior with one important
> exception--fixing that is the automatic change.
>
> I need to review what R does here. All these are replacement functions,
> `@<-`, `attr<-`, `$<-`.  The evaluator checks before calling any
> replacement whether the object needs to be duplicated (in a routine
> EnsureLocal()).  It does that by examining a special field that holds the
> reference status of the object.
>
> Some languages, such as Python (and S) keep reference counts for each
> object, de-allocating the object when the reference count drops back to
> zero.  R uses a different strategy. Its NAMED() field is 0, 1 or 2
> according to whether the object has been assigned never, once or more than
> once.  The field is not a reference count and is not decremented--relevant
> for this issue.  Objects are de-allocated only when garbage collection
> occurs and the object does not appear in any current frame or other context.
> (I did not write any of this code, so apologies if I'm misrepresenting it.)
>
> When any of these replacement operations first occurs for a particular
> object in a particular function call, it's very likely that the reference
> status will be 2 and EnsureLocal will duplicate it--all of it. Regardless
> of which of the three forms is used.
>
> Here the non-level-playing-field aspect comes in.  `@<-` is a normal R
> function (a "closure") but the other two are primitives in the main code
> for R.  Primitives have no frame in which arguments are stored.  As a
> result the new version of x is normally stored with status 1.
>
> If one does a second replacement in the same call (in a loop, e.g.) that
> should not normally copy again.  But the result of `@<-` will be an object
> from its frame and will have status 2 when saved, forcing a copy each time.
>
> So the change, naturally, is that R 3.0.0 will have a primitive
> implementation of `@<-`.  This has been implemented in r-devel (rev. 61544).
>
> Please try it out _before_ we issue that version, especially if you own a
> package that does things related to this question.
>
> John
>
> PS:  Some may have noticed that I didn't mention a fourth approach: fields
> in a reference class object.  The assumption was that we wanted classical,
> functional behavior here.  Reference classes don't have the copy problem
> but don't behave functionally either.  But that is in fact the direction
> for the other approach.  I'll discuss that later, when the corresponding
> code is available.
>


Re: [Rd] Small changes to big objects (1)

2013-01-07 Thread John Chambers

On 1/7/13 9:59 AM, Douglas Bates wrote:

> Is there a difference in the copying behavior of
>
> x@little <- other
>
> and
>
> x@little[] <- other


Not in the direction you were hoping, as far as I can tell.

Nested replacement expressions in R and S are unraveled and done as 
repeated simple replacements.  So either way you end up with, in effect

  x@little <- something

If x has >1 reference, as it tends to, EnsureLocal() will call duplicate().

I think the only difference is that your second form gets you to 
duplicate the little vector twice. ;-)


John
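
The unraveling described above can be spelled out by hand. A minimal sketch, assuming a made-up class Model with slots little and BIG; the explicit three-step version only approximates what the evaluator does internally, but it ends in the same plain slot replacement:

```r
library(methods)

# Hypothetical class with one small and one large slot
setClass("Model", representation(little = "numeric", BIG = "numeric"))
x <- new("Model", little = c(1, 2, 3), BIG = numeric(1e5))
other <- c(10, 20, 30)

# Nested form as written by the user
y1 <- x
y1@little[] <- other

# Roughly equivalent sequence of simple replacements
y2 <- x
tmp <- y2@little      # extract the small slot
tmp[] <- other        # inner replacement on the extracted vector
y2@little <- tmp      # outer replacement: the same `@<-` as in the first form

same <- identical(y1, y2)
```

Since both forms finish with a slot assignment on the whole object, the nested form cannot avoid whatever duplication that assignment triggers.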




[Rd] "Default" accessor in S4 classes

2013-01-07 Thread Chris Jewell
Hi All,

I'm currently trying to write an S4 class that mimics a data.frame, but stores 
data on disc in HDF5 format.  The idea is that the dataset is likely to be too 
large to fit into a standard desktop machine, and by using subscripts, the user 
may load bits of the dataset at a time.  eg:

> myLargeData <- LargeData("/path/to/file")
> mySubSet <- myLargeData[1:10, seq(1,15,by=3)]

I've therefore defined my LargeData class thus

> LargeData <- setClass("LargeData", representation(filename="character"))
> setMethod("initialize","LargeData", function(.Object,filename) 
> .Object@filename <- filename)

I've then defined the "[" method to call a C++ function (Rcpp), opening the 
HDF5 file, and returning the required rows/cols as a data.frame.

However, what if the user wants to load the entire dataset into memory?  Which 
method do I overload to achieve the following?

> fullData <- myLargeData
> class(fullData)
[1] "data.frame"

or apply transformations:

> myEigen <- eigen(myLargeData)

In C++ I would normally overload the "double" or "float" operator to achieve 
this -- can I do the same thing in R?

Thanks,

Chris



Re: [Rd] "Default" accessor in S4 classes

2013-01-07 Thread Simon Urbanek
Chris,

On Jan 7, 2013, at 6:23 PM, Chris Jewell wrote:

> Hi All,
> 
> I'm currently trying to write an S4 class that mimics a data.frame, but 
> stores data on disc in HDF5 format.  The idea is that the dataset is likely 
> to be too large to fit into a standard desktop machine, and by using 
> subscripts, the user may load bits of the dataset at a time.  eg:
> 
>> myLargeData <- LargeData("/path/to/file")
>> mySubSet <- myLargeData[1:10, seq(1,15,by=3)]
> 
> I've therefore defined my LargeData class thus
> 
>> LargeData <- setClass("LargeData", representation(filename="character"))
>> setMethod("initialize","LargeData", function(.Object,filename) 
>> .Object@filename <- filename)
> 
> I've then defined the "[" method to call a C++ function (Rcpp), opening the 
> HDF5 file, and returning the required rows/cols as a data.frame.
> 
> However, what if the user wants to load the entire dataset into memory?  
> Which method do I overload to achieve the following?
> 
>> fullData <- myLargeData
>> class(fullData)
> [1] "data.frame"
> 

That makes no sense, since a <- b is not a transformation: "a" will have the
same value as "b" by definition, and thus the same class. If you really meant

fullData <- as.data.frame(myLargeData)

then you just need to implement an as.data.frame() method for your class.

Note, however, that a more common way to convert a big data reference to
native format in its entirety is simply myLargeData[] -- you may want to
have a look at the (many) existing big data packages (AFAIR bigmemory uses a
C++ back-end as well). Also note that indexing is tricky in R and easy to get
wrong (remember: negative indices, indexing by name, etc.).
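
A minimal sketch of the myLargeData[] idea, under loud assumptions: read_block is a hypothetical stand-in for the Rcpp/HDF5 reader and simply serves a small in-memory data.frame, so the example runs without any file. Passing the indices straight to data.frame indexing is what lets negative and name-based indices work here; a real back-end would have to handle those itself.

```r
library(methods)

setClass("LargeData", representation(filename = "character"))

# Hypothetical stand-in for the HDF5 reader: ignores the file and
# subsets a small in-memory table instead.
read_block <- function(filename, rows, cols) {
  full <- data.frame(a = 1:5, b = (1:5) * 2, c = letters[1:5])
  full[rows, cols, drop = FALSE]
}

setMethod("[", "LargeData", function(x, i, j, ..., drop = TRUE) {
  rows <- if (missing(i)) TRUE else i   # missing index means "everything"
  cols <- if (missing(j)) TRUE else j
  read_block(x@filename, rows, cols)
})

myLargeData <- new("LargeData", filename = "/path/to/file")
mySubSet <- myLargeData[1:2, c("a", "c")]   # part of the data
fullData <- myLargeData[]                   # the whole data set
```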


> or apply transformations:
> 
>> myEigen <- eigen(myLargeData)
> 
> In C++ I would normally overload the "double" or "float" operator to achieve 
> this -- can I do the same thing in R?
> 

Again, there is no implicit coercion in R (you cannot declare a variable's
type in advance), so this doesn't make sense in the way you have in mind from
C++. The R equivalent is simply to implement an as.double() method, but I
suspect that's not what you had in mind. For generics you can simply implement
a method for your class (one that does the coercion, for example, or uses a
more efficient approach). If you cannot define a generic or don't want to
write your own methods, then you have a problem: the only theoretical way is
to subclass the numeric vector class, but that is not possible in R if you
want to change the representation, because R falls through to the more
efficient internal code too quickly (without extra dispatch).
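
To make the two options concrete, here is a minimal sketch. The loader load_all is a hypothetical stand-in for reading the file, and the choice of summary() as the generic is only an example. The S4 method is set on as.numeric because, per the R documentation, S4 methods for this pair are set on as.numeric rather than as.double.

```r
library(methods)

setClass("LargeData", representation(filename = "character"))

# Hypothetical loader standing in for the HDF5 reader
load_all <- function(filename) data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# Explicit coercion to a numeric vector
setMethod("as.numeric", "LargeData", function(x, ...) {
  unlist(load_all(x@filename), use.names = FALSE)
})

# A method for an existing generic that does the coercion internally
setMethod("summary", "LargeData", function(object, ...) {
  summary(load_all(object@filename), ...)
})

obj <- new("LargeData", filename = "/path/to/file")
v <- as.numeric(obj)
s <- summary(obj)
```

Functions that are not generic, such as eigen(), still need the explicit call, e.g. on the result of a coercion.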

Cheers.
Simon


> Thanks,
> 
> Chris
> 


Re: [Rd] "Default" accessor in S4 classes

2013-01-07 Thread Michael Lawrence
On Mon, Jan 7, 2013 at 3:23 PM, Chris Jewell wrote:

> Hi All,
>
> I'm currently trying to write an S4 class that mimics a data.frame, but
> stores data on disc in HDF5 format.  The idea is that the dataset is likely
> to be too large to fit into a standard desktop machine, and by using
> subscripts, the user may load bits of the dataset at a time.  eg:
>
> > myLargeData <- LargeData("/path/to/file")
> > mySubSet <- myLargeData[1:10, seq(1,15,by=3)]
>
> I've therefore defined my LargeData class thus
>
> > LargeData <- setClass("LargeData", representation(filename="character"))
> > setMethod("initialize","LargeData", function(.Object,filename)
> .Object@filename <- filename)
>
>
The above function needs to return .Object.
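
A minimal sketch of the corrected method. The default value for filename is an addition for illustration, so that new("LargeData") without arguments does not fail:

```r
library(methods)

LargeData <- setClass("LargeData", representation(filename = "character"))

setMethod("initialize", "LargeData",
          function(.Object, filename = character()) {
  .Object@filename <- filename
  .Object            # the initialize method must return the modified object
})

obj <- LargeData("/path/to/file")
```

Without the final .Object, the method would return the value of the assignment, i.e. the filename string, instead of the object.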


> I've then defined the "[" method to call a C++ function (Rcpp), opening
> the HDF5 file, and returning the required rows/cols as a data.frame.
>
> However, what if the user wants to load the entire dataset into memory?
>  Which method do I overload to achieve the following?
>
> > fullData <- myLargeData
> > class(fullData)
> [1] "data.frame"
>
> or apply transformations:
>
> > myEigen <- eigen(myLargeData)
>
> In C++ I would normally overload the "double" or "float" operator to
> achieve this -- can I do the same thing in R?
>

The coercions are going to have to be explicit, since there are no type
declarations. So, an as.data.frame method for coercing to a data.frame  (as
well as a coerce method via setAs), and you'll need methods for many of the
base R functions.  Some of those you can implicitly support using an S3
method on as.data.frame, assuming the function calls it.
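
A minimal sketch of the explicit coercions, assuming a hypothetical loader load_all in place of the HDF5 reader: an S3 method for as.data.frame(), plus a coerce method registered with setAs() so that as(obj, "data.frame") works too.

```r
library(methods)

setClass("LargeData", representation(filename = "character"))

# Hypothetical loader standing in for the HDF5 reader
load_all <- function(filename) data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))

# S3 method: picked up by as.data.frame() and by functions that call it
as.data.frame.LargeData <- function(x, row.names = NULL,
                                    optional = FALSE, ...) {
  load_all(x@filename)
}

# S4 coerce method, reusing the S3 method above
setAs("LargeData", "data.frame", function(from) as.data.frame(from))

obj <- new("LargeData", filename = "/path/to/file")
df1 <- as.data.frame(obj)
df2 <- as(obj, "data.frame")
```

In a package the S3 method would also need to be registered in the NAMESPACE.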


> Thanks,
>
> Chris
>