Re: [Rd] POSIXlt matching bug

2010-07-02 Thread Sklyar, Oleg (London)
POSIXlt is a list, but it is not a list of dates or times; it is a list
of the components of a broken-down time:

> x <- as.POSIXlt(Sys.Date())
> names(x)
[1] "sec"   "min"   "hour"  "mday"  "mon"   "year"  "wday"  "yday"
"isdst"

So if you want to match these things, you should use POSIXct or any
other numeric-based format (as POSIXct is just a double value for the
number of seconds since 1970-01-01) e.g.

> z <- as.POSIXct(Sys.Date())
> x <- as.POSIXct(Sys.Date())
> z==x
[1] TRUE
> match(z,x)
[1] 1
> z %in% x
[1] TRUE
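
A minimal sketch of that workaround, assuming you would rather wrap match() than redefine it. The helper name match_lt is hypothetical (it is not part of base R), and fixed dates are used instead of Sys.Date() only so that the result is reproducible:

```r
# Fixed dates (an assumption for reproducibility) instead of Sys.Date()
x     <- as.POSIXlt(as.Date("2010-07-01"))
table <- as.POSIXlt(as.Date("2010-06-29") + 0:5)

# match_lt() is a hypothetical helper, not a base R function: it converts
# any POSIXlt argument to POSIXct and then defers to the ordinary match().
match_lt <- function(x, table, ...) {
  if (inherits(x, "POSIXlt")) x <- as.POSIXct(x)
  if (inherits(table, "POSIXlt")) table <- as.POSIXct(table)
  match(x, table, ...)
}

idx <- match_lt(x, table)
idx   # position of 2010-07-01 within the table of dates
```

The result has the same length as x, which is what ?match promises.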

Dr Oleg Sklyar
Research Technologist
AHL / Man Investments Ltd
+44 (0)20 7144 3803
oskl...@maninvestments.com 

> -----Original Message-----
> From: r-devel-boun...@r-project.org 
> [mailto:r-devel-boun...@r-project.org] On Behalf Of McGehee, Robert
> Sent: 29 June 2010 15:46
> To: r-b...@r-project.org; r-devel@r-project.org
> Subject: [Rd] POSIXlt matching bug
> 
> I came across the below mis-feature/bug using match with POSIXlt objects
> (from strptime) in R 2.11.1 (though this appears to be an old issue).
> 
> > x <- as.POSIXlt(Sys.Date())
> > table <- as.POSIXlt(Sys.Date()+0:5)
> > length(x)
> [1] 1
> > x %in% table  # I expect TRUE
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> > match(x, table) # I expect 1
> [1] NA NA NA NA NA NA NA NA NA
> 
> This behavior seemed more plausible when the length of a POSIXlt object
> was 9 (back in the day), however since the length was redefined, the
> length of x no longer matches the length of the match function output,
> as specified by the ?match documentation: "A vector of the same length
> as 'x'".
> 
> I would normally suggest that we add a POSIXlt method for match that
> converts x into POSIXct or character first. However, match does not
> appear to be generic. Below is a possible rewrite of match that appears
> to work as desired.
> 
> match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
>     .Internal(match(if(is.factor(x)||inherits(x, "POSIXlt"))
>                         as.character(x) else x,
>                     if(is.factor(table)||inherits(table, "POSIXlt"))
>                         as.character(table) else table,
>                     nomatch, incomparables))
> 
> That said, I understand some people may be very sensitive to the speed
> of the match function, and may prefer a simple change to the ?match
> documentation noting this (odd) behavior for POSIXlt. 
> 
> Thanks, Robert
> 
> R.version
>                _
> platform       x86_64-unknown-linux-gnu
> arch           x86_64
> os             linux-gnu
> system         x86_64, linux-gnu
> status
> major          2
> minor          11.1
> year           2010
> month          05
> day            31
> svn rev        52157
> language       R
> version.string R version 2.11.1 (2010-05-31)
> 
> Robert McGehee, CFA
> Geode Capital Management, LLC
> One Post Office Square, 28th Floor | Boston, MA | 02109
> Tel: 617/392-8396  Fax: 617/476-6389
> mailto:robert.mcge...@geodecapital.com
> 

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] POSIXlt matching bug

2010-07-02 Thread Martin Maechler
> "RobMcG" == McGehee, Robert 
> on Tue, 29 Jun 2010 10:46:06 -0400 writes:

RobMcG> I came across the below mis-feature/bug using match with POSIXlt objects
RobMcG> (from strptime) in R 2.11.1 (though this appears to be an old issue).

>> x <- as.POSIXlt(Sys.Date())
>> table <- as.POSIXlt(Sys.Date()+0:5)
>> length(x)
RobMcG> [1] 1
>> x %in% table  # I expect TRUE
RobMcG> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>> match(x, table) # I expect 1
RobMcG> [1] NA NA NA NA NA NA NA NA NA

RobMcG> This behavior seemed more plausible when the length of a POSIXlt object
RobMcG> was 9 (back in the day), however since the length was redefined, the
RobMcG> length of x no longer matches the length of the match function output,
RobMcG> as specified by the ?match documentation: "A vector of the same length
RobMcG> as 'x'".

RobMcG> I would normally suggest that we add a POSIXlt method for match that
RobMcG> converts x into POSIXct or character first. However, match does not
RobMcG> appear to be generic. Below is a possible rewrite of match that appears
RobMcG> to work as desired.

RobMcG> match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
RobMcG>     .Internal(match(if(is.factor(x)||inherits(x, "POSIXlt"))
RobMcG>                         as.character(x) else x,
RobMcG>                     if(is.factor(table)||inherits(table, "POSIXlt"))
RobMcG>                         as.character(table) else table,
RobMcG>                     nomatch, incomparables))

RobMcG> That said, I understand some people may be very sensitive to the speed
RobMcG> of the match function, 

yes, indeed. 

I'm currently investigating an alternative that takes considerably more
programming time but in the end should lose much less speed:
to .Internal()ize the tests in C code,
so that the resulting R code would simply be

match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
    .Internal(match(x, table, nomatch, incomparables))


Martin Maechler,
ETH Zurich


RobMcG> and may prefer a simple change to the ?match
RobMcG> documentation noting this (odd) behavior for POSIXlt. 

RobMcG> Thanks, Robert



Re: [Rd] Problem with dyn.load() under Windows 64bit at CRAN

2010-07-02 Thread Dominick Samperi
2010/6/30 Uwe Ligges 

> On 30.06.2010 15:44, Dominick Samperi wrote:
>
>> Another odd thing about this is that everything worked under Windows 64bit
>> before the changes were made to serialize the build of packages that
>> depend on each other.
>
> That's untrue. I try to serialize some things on winbuilder.

There was a problem building packages that depended on Rcpp due to the
parallel build process used at CRAN (and a bug in GNU make). The CRAN
build process was changed (by serializing dependent packages) to fix this
problem. The package cxxPack built fine under Windows 64bit before this
change was implemented. Now it fails.

Today the build of cxxPack fails under most of the OS's, and it appears that
the problem is due to a new release of Rcpp.

Perhaps the idea of wrapping libs inside packages simply does not work,
at least not without extra support and attention from the CRAN team.

One possible resolution would be to incorporate whatever I need from Rcpp
into cxxPack to eliminate the package dependency.

Thanks,
Dominick



Re: [Rd] POSIXlt matching bug

2010-07-02 Thread Martin Maechler
> "MM" == Martin Maechler 
> on Fri, 2 Jul 2010 12:22:07 +0200 writes:

> "RobMcG" == McGehee, Robert 
> on Tue, 29 Jun 2010 10:46:06 -0400 writes:


RobMcG> That said, I understand some people may be very sensitive to the speed
RobMcG> of the match function, 

MM> yes, indeed. 

MM> I'm currently investigating an alternative that takes considerably more
MM> programming time but in the end should lose much less speed:
MM> to .Internal()ize the tests in C code,
MM> so that the resulting R code would simply be

MM> match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
MM>     .Internal(match(x, table, nomatch, incomparables))

I have committed such a change to  R-devel, to be 2.12.x.
This should mean that  match() actually is now very slightly
faster than it used to be.
The speed gain may not be measurable though.

Martin Maechler,  ETH Zurich



RobMcG> and may prefer a simple change to the ?match
RobMcG> documentation noting this (odd) behavior for POSIXlt. 

RobMcG> Thanks, Robert



Re: [Rd] Problem with dyn.load() under Windows 64bit at CRAN

2010-07-02 Thread Uwe Ligges

On 02.07.2010 16:10, Dominick Samperi wrote:
> 2010/6/30 Uwe Ligges
>
>> On 30.06.2010 15:44, Dominick Samperi wrote:
>>
>>> Another odd thing about this is that everything worked under Windows 64bit
>>> before the changes were made to serialize the build of packages that
>>> depend on each other.
>>
>> That's untrue. I try to serialize some things on winbuilder.
>
> There was a problem building packages that depended on Rcpp due to the
> parallel build process used at CRAN (and a bug in GNU make).

Well, sometimes, *not* always.
And *only* for Windows, it was fine on other platforms.

> The CRAN
> build process was changed (by serializing dependent packages) to fix this
> problem.

*No*! It was *not* changed, as I already told you in my last reply on
the list. I aimed to, and still have not found the time. And again, only
Windows.

I fear you do not understand what I meant: I am just going to stop
building some packages in parallel. Other stuff is not affected, hence
nothing you or this list need to worry about.
I should not have told you, as I understand now. I just tried to give
you an explanation why your packages failed twice in the past due to bad
luck.
I should just keep quiet in future and answer with a simple "no idea",
given that all the information and even internal guesses are reported in
a different context and used for list-wide speculation in the end.

> The package cxxPack built fine under Windows 64bit before this
> change was implemented. Now it fails.

No, that's not the reason: the change has not yet been implemented.

> Today the build of cxxPack fails under most of the OS's, and it appears that
> the problem is due to a new release of Rcpp.
>
> Perhaps the idea of wrapping libs inside packages simply does not work,
> at least not without extra support and attention from the CRAN team.

Why should the CRAN team care about it?

Really, given my lack of time and your annoying posts to the lists
containing misleading information, I probably should stop to care about
your package at all. There are > 2000 other packages and some more
important tasks to tackle. Consider the CRAN team would take so much
time for each single package ...

Uwe

> One possible resolution would be to incorporate whatever I need from Rcpp
> into cxxPack to eliminate the package dependency.
>
> Thanks,
> Dominick




Re: [Rd] Problem with dyn.load() under Windows 64bit at CRAN

2010-07-02 Thread Dominick Samperi
2010/7/2 Uwe Ligges 

> Really, given my lack of time and your annoying posts to the lists
> containing misleading information, I probably should stop to care about your
> package at all. There are > 2000 other packages and some more important
> tasks to tackle. Consider the CRAN team would take so much time for each
> single package ...
> Uwe


No problem. I understand that you are busy, and my "annoying" posts were
intended to get feedback from others who might have something constructive
to say.

Too bad package developers cannot see more of what goes on at CRAN during
the build process so they can debug problems without bothering the CRAN
team. This would eliminate the need for guessing and postings based on
limited information.

Dominick



[Rd] Best way to determine if you're running 32 or 64 bit R on windows

2010-07-02 Thread Jeffrey Horner
Hi,

Is this sufficient?

if (.Machine$sizeof.pointer==4){
  cat('32\n')
} else {
  cat('64\n')
}

Or is it better to test something in R.version, say os?

I'd like to use this to specify appropriate linker arguments when
building the RMySQL windows package.

Jeff
-- 
http://biostat.mc.vanderbilt.edu/JeffreyHorner



Re: [Rd] Best way to determine if you're running 32 or 64 bit R on windows

2010-07-02 Thread Martin Maechler
Jeffrey Horner  gmail.com> writes:

> Is this sufficient?
> 
> if (.Machine$sizeof.pointer==4){
>   cat('32\n')
> } else {
>   cat('64\n')
> }
> 
> Or is it better to test something in R.version, say os?

No, the above is perfect,  as it also works on other platforms to distinguish
32-bit and 64-bit.

Regards, Martin
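
A minimal sketch of the same check wrapped in a function, in case it is needed in more than one place. The name r_bits is hypothetical (it is not an existing R function); the only thing relied on is .Machine$sizeof.pointer, which is the pointer size in bytes:

```r
# r_bits() is a hypothetical helper name, not an existing R function.
# A pointer is 4 bytes in a 32-bit build and 8 bytes in a 64-bit build.
r_bits <- function() 8L * .Machine$sizeof.pointer

bits <- r_bits()
cat(bits, "\n")
```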



[Rd] Attributes of 1st argument in ...

2010-07-02 Thread Daniel Murphy
R-Devel:

I am trying to get an attribute of the first argument in a call to a
function whose formal arguments consist of dots only and do something, e.g.,
call 'cbind', based on the attribute
f<- function(...) {get first attribute; maybe or maybe not call 'cbind'}

I thought of (ignoring "deparse.level" for the moment)

f <- function(...) {
   x <- attr(list(...)[[1L]], "foo")
   if (x=="bar") cbind(...) else x
}

but I feared my solution might do some extra copying, with a performance
penalty if the dotted objects in the actual call to 'f' are very large.

I thought the following alternative might avoid a potential performance hit
by evaluating the attribute in the parent.frame (and therefore avoid extra
copying?):

f<-function(...)
{
   L<-match.call(expand.dots=FALSE)[[2L]]
   # evaluate in the caller's frame, where the argument expressions live
   x <- eval(substitute(attr(x,"foo"), list(x=L[[1L]])), parent.frame())
   if (x=="bar") cbind(...) else x
}

system.time tests showed this second form to be only marginally faster.

Is my fear about extra copying unwarranted? If not, is there a better way to
get the "foo" attribute of the first argument other than my two
alternatives?

Thanks,
Dan Murphy



Re: [Rd] Attributes of 1st argument in ...

2010-07-02 Thread Olaf Mersmann
Hi Daniel,

On 02.07.2010, at 23:26, Daniel Murphy wrote:
> I am trying to get an attribute of the first argument in a call to a
> function whose formal arguments consist of dots only and do something, e.g.,
> call 'cbind', based on the attribute
> f<- function(...) {get first attribute; maybe or maybe not call 'cbind'}
> 
> I thought of (ignoring "deparse.level" for the moment)
> 
> f<-function(...) {x <- attr(list(...)[[1L]], "foo"); if (x=="bar")
> cbind(...) else x}

What about using the somewhat obscure ..1 syntax? This version runs quite a bit
faster for me:

 g <- function(...) {
   x <- attr(..1, "foo")
   if (x == "bar")
     cbind(...)
   else
     x
 }

but it will be hard to quantify how this pans out for you unless we know how
many and what size and type the arguments are.
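
A small usage sketch of the ..1 approach, under stated assumptions: the function name g2 and the test objects a and b are made up for illustration, and identical() is swapped in for == so that a first argument without a "foo" attribute does not make if() fail on a zero-length condition.

```r
# g2 is a hypothetical variant of g above: identical() replaces ==, so a
# missing "foo" attribute (NULL) is handled instead of raising an error.
g2 <- function(...) {
  x <- attr(..1, "foo")              # ..1 refers to the first element of ...
  if (identical(x, "bar")) cbind(...) else x
}

a <- structure(1:3, foo = "bar")     # first argument carries foo = "bar"
b <- 4:6

res <- g2(a, b)                      # attribute matches: arguments are cbind()-ed
none <- g2(b, a)                     # b has no "foo" attribute: NULL is returned
```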

Cheers,
Olaf



[Rd] kmeans

2010-07-02 Thread Gabor Grothendieck
In kmeans() in stats one gets an error message with the default
clustering algorithm if centers = 1.  It's often useful to calculate
the sum of squares for 1 cluster, 2 clusters, etc. and this error
complicates things since one has to treat 1 cluster as a special case.
A second reason is that easily getting the 1 cluster sum of squares
makes it easy to calculate the between cluster sum of squares when
there is more than 1 cluster.
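
To illustrate the relationship being relied on here, a minimal sketch that computes the 1 cluster sum of squares by hand and derives the between cluster sum of squares from it. The variable names and the choice of USArrests with 3 centers are assumptions for illustration; only the withinss component of the kmeans result is used:

```r
x <- as.matrix(USArrests)

# 1 cluster sum of squares: squared deviations from the column means
totss <- sum(scale(x, scale = FALSE)^2)

set.seed(1)                          # kmeans() chooses random starting centers
fit <- kmeans(x, centers = 3)

tot.withinss <- sum(fit$withinss)    # total within cluster sum of squares
betweenss <- totss - tot.withinss    # between cluster sum of squares
```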

I suggest adding the line marked ### to the source code of kmeans (the
other lines shown are just there to illustrate context).  Adding this
line forces kmeans to use the code for algorithm 3 if centers is 1.
This is useful since unlike the code for the default algorithm, the
code for algorithm 3 succeeds for centers = 1.

if(length(centers) == 1) {
    if (centers == 1) nmeth <- 3 ###
    k <- centers

Also note that KMeans in Rcmdr produces a betweenss and a tot.withinss
and it would be nice if kmeans in stats did that too:

> library(Rcmdr)
> str(KMeans(USArrests, 3))
List of 6
 $ cluster : Named int [1:50] 1 1 1 2 1 2 3 1 1 2 ...
  ..- attr(*, "names")= chr [1:50] "Alabama" "Alaska" "Arizona" "Arkansas" ...
 $ centers : num [1:3, 1:4] 11.81 8.21 4.27 272.56 173.29 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:3] "1" "2" "3"
  .. ..$ : chr [1:4] "Murder" "Assault" "UrbanPop" "Rape"
 $ withinss: num [1:3] 19564 9137 19264
 $ size: int [1:3] 16 14 20
 $ tot.withinss: num 47964  <=
 $ betweenss   : num 307844 <=
 - attr(*, "class")= chr "kmeans"



Re: [Rd] Best way to determine if you're running 32 or 64 bit R on windows

2010-07-02 Thread Prof Brian Ripley

On Fri, 2 Jul 2010, Jeffrey Horner wrote:

> Hi,
>
> Is this sufficient?

Yes, if you want to know in R code.

> if (.Machine$sizeof.pointer==4){
>  cat('32\n')
> } else {
>  cat('64\n')
> }
>
> Or is it better to test something in R.version, say os?

Not 'os' (the OS is the same), but 'arch' changes.  Just as on a Mac
or on Linux.

> I'd like to use this to specify appropriate linker arguments when
> building the RMySQL windows package.

If you mean *installing*  (R CMD INSTALL, not R CMD build) the
documented way is to use the environment variable R_ARCH: there are
also make variables available, e.g. WIN.  See
http://www.stats.ox.ac.uk/~ripley/Win64/W64porting.html (which is
linked from the appropriate manuals).
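
A rough sketch of how these values can be inspected from R code, for comparison with the pointer-size test. The example values in the comments are assumptions about typical builds and are not output from this thread; R_ARCH may well be empty on a given installation:

```r
# 'arch' differs between builds on the same OS (typically something like
# "x86_64" for a 64-bit build and "i386" for a 32-bit one; assumed values).
arch <- R.version$arch

# R_ARCH is an environment variable; Sys.getenv() returns "" if it is unset.
r_arch <- Sys.getenv("R_ARCH")

# The pointer-size test from the original question, for comparison.
is64 <- .Machine$sizeof.pointer == 8L

cat("arch:", arch, " R_ARCH:", r_arch, " 64-bit:", is64, "\n")
```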




> Jeff
> --
> http://biostat.mc.vanderbilt.edu/JeffreyHorner

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
