[Rd] package.skeleton() creates corrupted Rd file (PR#11191)

2008-04-17 Thread pent
Calling package.skeleton() produces a corrupted Rd file stub on my
system: the file is littered with question marks.

This happens only for the package-level Rd file; the Rd stubs for the
*package functions* are generated correctly.

An example of bad Rd file created by package.skeleton():

\name{cmrutils-package}
\alias{cmrutils-package}
\alias{cmrutils}
\docType{package}
\title{
What the package does (short line)
~~ ? ?? ~~
}
\description{
More about what it does (maybe more than one line)
~~ ??? ?? (1-5 ?)  ?? ~~
}
\details{
\tabular{ll}{
Package: \tab cmrutils\cr
Type: \tab Package\cr
Version: \tab 1.0\cr
Date: \tab 2008-03-30\cr
License: \tab What license is it under?\cr
}
~~ ? , ???  ?, ???  ?? ??? ~~
}
\author{
Who wrote it

?: Who to complain to <[EMAIL PROTECTED]>
~~ ? ?/??? ?? ? ~~
}
\references{
~~ ?? ??? ?? ?? ?? ?? ?? ~~
}
~~ ?: ?? ???  ?, ?? ?? ?? ??, ~~
~~ ?? ? KEYWORDS ? ? ? ? ?? R ~~
\keyword{ package }
\seealso{
~~ ?? ?? ?? ??  ???,  ~~
~~ \code{\link[:-package]{}} ~~
}
\examples{
~~ ??? ??? ???  ?? ??? ~~
}

The file itself is corrupted: these ?s are all actual 0x3F characters
on disk, not an artifact of opening the file with the wrong encoding.

After discussion with Dirk Eddelbuettel and Douglas Bates and further
investigation, it appears that the problem is in the package.skeleton()
function, not in promptPackage() and not in the R-ru.po file (the
Russian translations for R). This is the problematic code:

package.skeleton <-
...
{
...

## we need to test in the C locale
curLocale <- Sys.getlocale("LC_CTYPE")
on.exit(Sys.setlocale("LC_CTYPE", curLocale), add = TRUE)
if(Sys.setlocale("LC_CTYPE", "C") != "C")
warning("cannot turn off locale-specific chars via LC_CTYPE")

...
}

Rd stubs for *package functions* are not affected because the
corresponding strings are not (yet) translated.
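
A minimal sketch of how this kind of damage can arise, assuming (as the
report suggests but does not prove) that the translated template text gets
converted to a character set that cannot represent Cyrillic once LC_CTYPE
is "C". The sample string and the variable names are illustrative only, and
iconv() merely mimics such a conversion; it is not what package.skeleton()
itself calls.

```r
# Illustrative Cyrillic text, written with \u escapes so this source stays ASCII
translated <- "\u0417\u0430\u0433\u043e\u043b\u043e\u0432\u043e\u043a"

# Force the text into ASCII, substituting "?" for every byte that
# cannot be represented there
damaged <- iconv(translated, from = "UTF-8", to = "ASCII", sub = "?")
print(damaged)

# Inspect the bytes that ended up in the result
print(unique(charToRaw(damaged)))
```

Once the text has been written out like this, no choice of encoding in an
editor can bring the original characters back.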

I'm ready to provide any additional info.

--please do not edit the information below--

Version:
 platform = i486-pc-linux-gnu
 arch = i486
 os = linux-gnu
 system = i486, linux-gnu
 status = 
 major = 2
 minor = 6.2
 year = 2008
 month = 02
 day = 08
 svn rev = 44383
 language = R
 version.string = R version 2.6.2 (2008-02-08)

Locale:
LC_CTYPE=ru_RU.UTF-8;LC_NUMERIC=C;LC_TIME=ru_RU.UTF-8;LC_COLLATE=ru_RU.UTF-8;LC_MONETARY=ru_RU.UTF-8;LC_MESSAGES=ru_RU.UTF-8;LC_PAPER=ru_RU.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=ru_RU.UTF-8;LC_IDENTIFICATION=C

Search Path:
 .GlobalEnv, package:stats, package:graphics, package:grDevices, package:utils, 
package:datasets, package:methods, Autoloads, package:base

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] segments() with zero-length arguments (PR#11192)

2008-04-17 Thread rmh
Uwe Ligges suggested I post this on R-bugs as a wishlist item with a
proposed patch.  R considers zero-length arguments to segments() to be
an error.  I would like R to allow this and to return without an
error.  It occurs naturally in settings like

valid <- c(FALSE, FALSE, FALSE)
segments(x0[valid], y0[valid], x1[valid], y1[valid])

For what it may be worth, S-Plus does not consider zero-length
arguments to segments() to be an error.


plot(1:10)
segments(1,1,10,10,col='green')
segments(numeric(0), numeric(0), numeric(0), numeric(0), col='green')
Error in segments(x0, y0, x1, y1, col = col, lty = lty, lwd = lwd, ...) : 
invalid first argument


segments.proposal <-
  function (x0, y0, x1, y1, col = par("fg"), lty = par("lty"),
            lwd = par("lwd"), ...) {
    if (length(x0)==0 && length(y0)==0 && length(x1)==0 && length(y1)==0)
      return(invisible(NULL))
    .Internal(segments(x0, y0, x1, y1, col = col, lty = lty, lwd = lwd, ...))
  }

segments.proposal(numeric(0), numeric(0), numeric(0), numeric(0), col='green')
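
In a version of R where the empty call is still an error, the calling code
can guard the call itself. A minimal sketch, assuming a hypothetical helper
named segments_if_any() (not part of R or of the proposal above):

```r
# Hypothetical helper: draw only the selected segments, and do nothing
# (without error) when the selection is empty
segments_if_any <- function(x0, y0, x1, y1, valid, ...) {
  if (!any(valid)) {
    return(invisible(NULL))
  }
  segments(x0[valid], y0[valid], x1[valid], y1[valid], ...)
}

x0 <- 1:3; y0 <- 1:3; x1 <- 4:6; y1 <- 4:6
valid <- c(FALSE, FALSE, FALSE)

# Nothing is selected, so segments() is never reached and no plot is needed
result <- segments_if_any(x0, y0, x1, y1, valid, col = "green")
```

The drawback is that every caller has to remember the guard, which is the
reason for asking that segments() handle the case itself.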



Re: [Rd] segments() with zero-length arguments (PR#11192)

2008-04-17 Thread ripley
I think we should allow only the case where all the arguments are
zero-length (at present any zero-length argument is an error), as per the
proposed R-level code.

arrows() and rect() share the code so it is much cleaner to do this 
internally.

Done for R-devel.

On Thu, 17 Apr 2008, [EMAIL PROTECTED] wrote:

> Uwe Ligges suggested I post this on R-bugs as a wishlist item with a
> proposed patch.  R considers zero-length arguments to segments() to be
> an error.  I would like R to allow this and to return without an
> error.  It occurs naturally in settings like
>
> valid <- c(FALSE, FALSE, FALSE)
> segments(x0[valid], y0[valid], x1[valid], y1[valid])
>
> For what it may be worth, S-Plus does not consider zero-length
> arguments to segments() be an error.
>
>
> plot(1:10)
> segments(1,1,10,10,col='green')
> segments(numeric(0), numeric(0), numeric(0), numeric(0), col='green')
> Error in segments(x0, y0, x1, y1, col = col, lty = lty, lwd = lwd, ...) :
>invalid first argument
>
>
> segments.proposal <-
>  function (x0, y0, x1, y1, col = par("fg"), lty = par("lty"),
>lwd = par("lwd"), ...) {
>if (length(x0)==0 && length(y0)==0 && length(x1)==0 && length(y1)==0)
>  return(invisible(NULL))
>.Internal(segments(x0, y0, x1, y1, col = col, lty = lty, lwd = lwd, ...))
> }
>
> segments.proposal(numeric(0), numeric(0), numeric(0), numeric(0), col='green')
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] segments() with zero-length arguments (PR#11192)

2008-04-17 Thread Peter Dalgaard
[EMAIL PROTECTED] wrote:
> I think we should allow only all-zero arguments (at present any 
> zero-length argument is an error), as per the R-level proposed code.
>
> arrows() and rect() share the code so it is much cleaner to do this 
> internally.
>
>   
There are precedents for not requiring all-or-none zero-length args. In
arithmetic, "recycling" treats the case of one zero length item as if
all had length zero

> 2+numeric(0)
numeric(0)

cbind() and rbind() are a bit anomalous in that they just throw away the
offending item:

> cbind(a=1:2,b=numeric(0),c=3:4)
     a c
[1,] 1 3
[2,] 2 4

Only the former variant would make sense for segments(), but segments()
does recycle, unlike lines() which complains if vectors are of different
length.

I think there are cases where you might want it to just do nothing
rather than warn and do nothing or cause an error. Consider rug()-like code:
segments(x, ytop, x, ybot) with an empty x. (I realize that the real
rug() uses Axis(), but the point remains.)
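
The arithmetic precedent can be written down as a rule for the common length
of a set of arguments. A minimal sketch, assuming a hypothetical helper
named recycled_length(); it illustrates the rule only and is not how
segments() is implemented:

```r
# Hypothetical helper: the length a recycled result would have under the
# arithmetic rule, where any zero-length argument makes the result empty
recycled_length <- function(...) {
  lens <- lengths(list(...))
  if (length(lens) == 0L || any(lens == 0L)) {
    return(0L)
  }
  max(lens)
}

x <- numeric(0)
print(recycled_length(x, 1, x, 0))   # rug()-like call with an empty x
print(recycled_length(1:3, 1))       # ordinary recycling
print(length(2 + numeric(0)))        # what arithmetic itself does
```

Under this rule a rug()-like call with an empty x would draw nothing,
whereas the all-or-none rule would still treat it as an error.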

> Done for R-devel.
>
> On Thu, 17 Apr 2008, [EMAIL PROTECTED] wrote:
>
>   
>> Uwe Ligges suggested I post this on R-bugs as a wishlist item with a
>> proposed patch.  R considers zero-length arguments to segments() to be
>> an error.  I would like R to allow this and to return without an
>> error.  It occurs naturally in settings like
>>
>> valid <- c(FALSE, FALSE, FALSE)
>> segments(x0[valid], y0[valid], x1[valid], y1[valid])
>>
>> For what it may be worth, S-Plus does not consider zero-length
>> arguments to segments() be an error.
>>
>>
>> plot(1:10)
>> segments(1,1,10,10,col='green')
>> segments(numeric(0), numeric(0), numeric(0), numeric(0), col='green')
>> Error in segments(x0, y0, x1, y1, col = col, lty = lty, lwd = lwd, ...) :
>>invalid first argument
>>
>>
>> segments.proposal <-
>>  function (x0, y0, x1, y1, col = par("fg"), lty = par("lty"),
>>lwd = par("lwd"), ...) {
>>if (length(x0)==0 && length(y0)==0 && length(x1)==0 && length(y1)==0)
>>  return(invisible(NULL))
>>.Internal(segments(x0, y0, x1, y1, col = col, lty = lty, lwd = lwd, ...))
>> }
>>
>> segments.proposal(numeric(0), numeric(0), numeric(0), numeric(0), 
>> col='green')
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> 
>
>   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[Rd] [rd] sage <--> r integration

2008-04-17 Thread Mike Hansen
I would like to re-emphasize the above points that William makes.
Both projects could benefit a lot from working with each other.  A
fair number of people, including Persi Diaconis and Susan Holmes, have
had enthusiastic responses when I mentioned that R was being included
with Sage.

This is a good opportunity to build bridges between the two
communities, and the lack of response (regardless of how GSoC issues
play out) is a bit disappointing.

--Mike



Re: [Rd] segments() with zero-length arguments (PR#11192)

2008-04-17 Thread Prof Brian Ripley

[Taken off R-bugs, at least for now.]

On Thu, 17 Apr 2008, Peter Dalgaard wrote:

> [EMAIL PROTECTED] wrote:
>> I think we should allow only all-zero arguments (at present any
>> zero-length argument is an error), as per the R-level proposed code.
>>
>> arrows() and rect() share the code so it is much cleaner to do this
>> internally.
>
> There are precedents for not requiring all-or-none zero-length args. In
> arithmetic, "recycling" treats the case of one zero length item as if
> all had length zero
>
> > 2+numeric(0)
> numeric(0)

There are also rules for array arithmetic.

> cbind() and rbind() are a bit anomalous in that they just throw away the
> offending item:
>
> > cbind(a=1:2,b=numeric(0),c=3:4)
>      a c
> [1,] 1 3
> [2,] 2 4
>
> Only the former variant would make sense for segments(), but segments()
> does recycle, unlike lines() which complains if vectors are of different
> length.

And neither is documented. Also S has changed its recycling rules in
various ways over its lifetime.

> I think there are cases where you might want it to just do nothing
> rather than warn and do nothing or cause an error. Consider rug()-like code:
> segments(x, ytop, x, ybot) with an empty x. (I realize that the real
> rug() uses Axis(), but the point remains.)

I agree it is moot, which is why I commented on this.  Rich's proposal was
the one I ended up implementing, but it was not my first idea. There are
also cases when you pass zero-length arguments in error, and I thought
those more likely.










--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[Rd] Suggestion: add a warning in the help-file of unique()

2008-04-17 Thread Matthieu Stigler
Hello

I'm sorry if this suggestion/correction has already been made, but after
searching the devel list I did not find any mention of it.
I would just suggest adding a warning or an example to the help file of
the function unique(), like

"Note that unique() treats only identical values as duplicates. Values that
print the same but are in fact not identical will be treated as
different."


 > a<-c(0.2, 0.3, 0.2, 0.4-0.1)
 > a
[1] 0.2 0.3 0.2 0.3
 > unique(a)
[1] 0.2 0.3 0.3
 >

Well, this is just the idea and the sentence could be made better (my
poor English...). Maybe a reference to R FAQ 7.31 could be made.
Maybe this behaviour is clear and logical for experienced users, but I
don't think it is for beginners. I personally spent two hours before seeing
that the problem in my code came from this.
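
A short sketch of what is going on and of two common workarounds. The choice
of 10 significant digits is an arbitrary assumption for illustration, not a
recommendation:

```r
a <- c(0.2, 0.3, 0.2, 0.4 - 0.1)

# The last value prints as 0.3 but is not stored as the same number
print(a[2] == a[4])
print(format(a, digits = 17))

# unique() keeps both because they are not identical
print(unique(a))

# Workaround 1: round to a chosen number of significant digits first
print(unique(signif(a, 10)))

# Workaround 2: compare two values with a tolerance
print(isTRUE(all.equal(a[2], a[4])))
```

Rounding first changes the values that are returned, so it only suits data
where that loss of precision is acceptable.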

I was thinking about modifying the function unique() to introduce a "tol"
argument that allows comparison with a tolerance level (with default
value zero to keep unique() consistent), like all.equal(), but it seemed
too complicated for my limited understanding.

Best regards and many thanks for what you do for R!

Matthieu Stigler



Re: [Rd] Suggestion: add a warning in the help-file of unique()

2008-04-17 Thread Ted Harding
On 17-Apr-08 10:44:32, Matthieu Stigler wrote:
> Hello
> 
> I'm sorry if this suggestion/correction was already made
> but after a search in devel list I did not find any mention
> of it. I would just suggest to add a warning or an exemple
> for the help-file of the function unique() like
> 
> "Note that unique() compares only identical values. Values
> which, are printed equally but in facts are not identical
> will be treated as different."
> 
> 
>  > a<-c(0.2, 0.3, 0.2, 0.4-0.1)
>  > a
> [1] 0.2 0.3 0.2 0.3
>  > unique(a)
> [1] 0.2 0.3 0.3
> 
> Well this is just the idea and the sentence could be made better
> (my poor english...). Maybe a reference to RFAQ 7.31 could be made.
> Maybe is this behaviour clear and logical for experienced users,
> but I don't think it is for beginners. I personnaly spent two
> hours to see that the problem in my code came from this.

The above is potentially a useful suggestion, and I would be
inclined to support it. However, for your other suggestion:

> I was thinking about modify the function unique() to introduce
> a "tol" argument which allows to compare with a tolerance level
> (with default value zero to keep unique consistent) like all.equal(),
> but it seemed too complicated with my little understanding.
> 
> Bests regards and many thanks for what you do for R!
> Matthieu Stigler

What is really complicated about it is that the results may
depend on the order of elements. When unique() eliminates only
values which are strictly identical to values which have been
scanned earlier, there is no problem.

But suppose you set "tol=0.11" (a hypothetical argument) in

unique(c(20.0, 30.0, 30.1, 30.2, 40.0))
# 20.0, 30.0, 40.0
[30.1 rejected because within 0.11 of previous 30.0;
 30.2 rejected because within 0.11 of previous 30.1]
and compare with

unique(c(20.0, 30.0, 30.2, 30.1, 40.0))
# 20.0, 30.0, 30.2, 40.0
[30.2 accepted because not within 0.11 of any previous;
 30.1 rejected because within 0.11 of previous 30.2 or 30.0]

This kind of problem is always present in situations where
there are potential "chained tolerances".
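
The order dependence can be reproduced with a small sketch. unique() has no
tolerance argument, so unique_tol() below is a hypothetical helper written
only for this illustration; it rejects a value when it lies within tol of
any value scanned earlier, whether or not that earlier value was kept.

```r
# Hypothetical helper, not part of R: keep x[i] only if no earlier element
# (kept or rejected) lies within 'tol' of it
unique_tol <- function(x, tol = 0) {
  keep <- logical(length(x))
  for (i in seq_along(x)) {
    earlier <- x[seq_len(i - 1L)]
    keep[i] <- !any(abs(x[i] - earlier) <= tol)
  }
  x[keep]
}

print(unique_tol(c(20.0, 30.0, 30.1, 30.2, 40.0), tol = 0.11))
print(unique_tol(c(20.0, 30.0, 30.2, 30.1, 40.0), tol = 0.11))
```

The two calls contain the same values in a different order and return
results of different length. With tol = 0 the helper reduces to exact
matching.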

You cannot see the difference between the position of the
hour-hand of a clock now, and one minute later.

But you may not chain this logic, for, if you could:

If A is indistinguishable from B, and B is indistinguishable
  from C, then A is indistinguishable from C.

10:00 is indistinguishable from 10:01 (on the hour-hand)
10:[n] is indistinguishable from 10:[n+1]

Hence, by induction, 10:00 is indistinguishable from 11:00

Which you do not want!

Best wishes,
Ted.


E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861
Date: 17-Apr-08   Time: 14:54:19
-- XFMail --



[Rd] LinkingTo for 2 packages

2008-04-17 Thread Iago Mosqueira
Hello,

One of our packages contains C++ code that needs to be compiled against
2 other packages. So the LinkingTo field in DESCRIPTION looks like this

LinkingTo: FLCore,FLash

Both packages are also in the Depends field.

In R 2.6.2, the first thing we noticed was that there must be no space
between the two names, although the example in the HTML version of
"Writing R Extensions" does have one:

"A package that wishes to make use of header files in other packages
needs to declare them as a comma-separated list in the field LinkingTo
in the DESCRIPTION file. For example

 Depends: link2, link3
 LinkingTo: link2, link3"

With the space character, this is the compiler call found in 00install.out

g++ -I/usr/local/lib/R/include -I/usr/local/lib/R/include
-I/usr/local/include -I"/usr/local/lib/R/library/FLCore/include"   -fpic
 -g -O2 -c FLBRP.cpp -o FLBRP.o

while deleting it means both include folders are correctly added

g++ -I/usr/local/lib/R/include -I/usr/local/lib/R/include
-I/usr/local/include -I"/usr/local/lib/R/library/FLCore/include"
-I"/usr/local/lib/R/library/FLash/include"   -fpic  -g -O2 -c FLBRP.cpp
-o FLBRP.o

Secondly, this behaviour was observed when running R CMD check on
Linux (R.Version() below), while on Windows the second include is never
generated, even if the space character is deleted.

g++-sjlj   -Ic:/progra~1/r/r-2.6.2/include
-I"c:/progra~1/r/r-2.6.2/library/FLCore/include"-O2 -Wall  -c
FLBRP.cpp -o FLBRP.o

We haven't been able to test yet on the latest R 2.7.0rc.
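
For comparison, a minimal sketch of how the field could be read so that the
space does not matter. This is not the code the install scripts use; the
DESCRIPTION text is made up for the example and the library path is simply
the one from the Linux output above.

```r
# Hypothetical DESCRIPTION content, with a space after the comma
desc_con <- textConnection(
  "Package: demo\nDepends: FLCore, FLash\nLinkingTo: FLCore, FLash"
)
desc <- read.dcf(desc_con, fields = "LinkingTo")
close(desc_con)

# Split on commas and strip surrounding whitespace from each name
pkgs <- trimws(strsplit(desc[1, "LinkingTo"], ",", fixed = TRUE)[[1]])
print(pkgs)

# One -I flag per package
lib <- "/usr/local/lib/R/library"
flags <- sprintf('-I"%s"', file.path(lib, pkgs, "include"))
print(flags)
```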

Thanks,


Iago


WINDOWS:
> R.Version()
$platform
[1] "i386-pc-mingw32"

$arch
[1] "i386"

$os
[1] "mingw32"

$system
[1] "i386, mingw32"

$status
[1] ""

$major
[1] "2"

$minor
[1] "6.2"

$year
[1] "2008"

$month
[1] "02"

$day
[1] "08"

$`svn rev`
[1] "44383"

$language
[1] "R"

$version.string
[1] "R version 2.6.2 (2008-02-08)"


LINUX:
> R.Version()
$platform
[1] "i686-pc-linux-gnu"

$arch
[1] "i686"

$os
[1] "linux-gnu"

$system
[1] "i686, linux-gnu"

$status
[1] ""

$major
[1] "2"

$minor
[1] "6.2"

$year
[1] "2008"

$month
[1] "02"

$day
[1] "08"

$`svn rev`
[1] "44383"

$language
[1] "R"

$version.string
[1] "R version 2.6.2 (2008-02-08)"



[Rd] calculating multiregression for the given data

2008-04-17 Thread man4ish

I am trying to calculate a multiple regression for the following data

0 1 2 3 7 5 0 4 4 3
0 3 4 5 4 0 5 4 5 4
0 1 1 0 1 0 1 1 1 0
1 0 4 5 6 1 2 3 2 1

keeping the 1st column as the dependent variable and the others as
independent variables. How can I do this using the lm function in stats?
Please send me the code for this; I would be really thankful.
Manish Gupta
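
A minimal sketch of one literal reading of the request, assuming each line
above is one observation and the first value on each line is the response.
With only four observations, at most four coefficients (including the
intercept) can be estimated, so lm() reports NA for the rest.

```r
# The four rows exactly as posted; read.table() names the columns V1..V10
d <- read.table(text = "
0 1 2 3 7 5 0 4 4 3
0 3 4 5 4 0 5 4 5 4
0 1 1 0 1 0 1 1 1 0
1 0 4 5 6 1 2 3 2 1
")

# V1 is the response; '.' stands for all remaining columns
fit <- lm(V1 ~ ., data = d)
print(coef(fit))

# Number of coefficients that could actually be estimated
print(fit$rank)
```

If the data are really meant the other way round (ten observations of four
variables), transposing with as.data.frame(t(d)) before the lm() call would
be the corresponding change.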


Re: [Rd] Bug in ci.plot(HH Package) (PR#11163)

2008-04-17 Thread rmh
Thank you for the error report.  I have made the correction
which will be in HH_2.1-11 which will be posted in a few days.

Rich



Re: [Rd] [rd] sage <--> r integration

2008-04-17 Thread Rob Goedman
Mike,

I'm also surprised so few reactions have been forthcoming. I saw  
William's email shortly after Jan's email about Sage.

Having been involved in the Ryacas project (R to Yacas - Yet Another  
Computer Algebra System - with Gabor, Soren and Ayal), I  believe in  
the tremendous benefits of such an integration and have,  over the  
last couple of days, been studying Sage and its notebook interface to  
better understand how it could be used as an alternative/complement to  
the current Mac R.app (having occasionally helped out Simon on the Mac  
R GUI - R.app).

I am a firm believer in Python, and have worked on Python/ 
Django/"Python on Embedded systems for biometrics" for well over 2  
years now. I do believe this is also a major benefit of Sage. And  
projects such as Google's App Engine provide further support.

From William's email I take it the best route right now is to try to
bring this project to the attention of the GSoC mentors.

Regards,
Rob


On Apr 17, 2008, at 3:33 AM, Mike Hansen wrote:

> I would like to re-emphasize the above points that William makes.
> Both projects could benefit a lot from working with each other.  A
> fair number of people, including Persi Diaconis and Susan Holmes, have
> had enthusiastic responses when I mentioned that R was being included
> with Sage.
>
> This is a good opportunity to build bridges between the two
> communities, and the lack of response (regardless of how GSoC issues
> play out) is a bit disappointing.
>
> --Mike
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



[Rd] Couldn't (and shouldn't) is.unsorted() be faster?

2008-04-17 Thread Herve Pages
Hi,

Couldn't is.unsorted() bail out immediately here (after comparing
the first 2 elements):

 > x <- 2000:1
 > system.time(is.unsorted(x), gcFirst=TRUE)
user  system elapsed
   0.084   0.040   0.124

 > x <- 2:1
 > system.time(is.unsorted(x), gcFirst=TRUE)
user  system elapsed
   0.772   0.440   1.214

Thanks!
H.



Re: [Rd] Couldn't (and shouldn't) is.unsorted() be faster?

2008-04-17 Thread Bill Dunlap
On Thu, 17 Apr 2008, Herve Pages wrote:

> Couldn't is.unsorted() bail out immediately here (after comparing
> the first 2 elements):
>
>  > x <- 2000:1
>  > system.time(is.unsorted(x), gcFirst=TRUE)
> user  system elapsed
>0.084   0.040   0.124
>
>  > x <- 2:1
>  > system.time(is.unsorted(x), gcFirst=TRUE)
> user  system elapsed
>0.772   0.440   1.214

The C code does bail out upon seeing the first out-of-order pair, but
before calling the C code, the S code does any(is.na(x)), forcing a
scan of the entire data.  If you remove the is.na calls from
is.unsorted's S code you will see the timings improve in your example.
(It looks easy to do the NA checks in the C code.)

   is.unsorted.no.nacheck <- function (x, na.rm = FALSE) {
       if (is.null(x))
           return(FALSE)
       if (!is.atomic(x))
           return(NA)
       .Internal(is.unsorted(x))
   }
   > x <- 2000:1
   > system.time(is.unsorted(x), gcFirst=TRUE)
      user  system elapsed
     0.356   0.157   0.514
   > system.time(is.unsorted.no.nacheck(x), gcFirst=TRUE)
      user  system elapsed
         0       0       0
   > revx <- rev(x)
   > system.time(is.unsorted(revx), gcFirst=TRUE)
      user  system elapsed
     0.500   0.170   0.672
   > system.time(is.unsorted.no.nacheck(revx),gcFirst=TRUE)
      user  system elapsed
     0.131   0.000   0.132
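
The early exit can also be approximated without touching .Internal. A
minimal sketch, assuming a hypothetical wrapper named is_unsorted_early();
note that it answers TRUE as soon as the first pair is out of order, so it
differs from is.unsorted() when an NA occurs later in the vector.

```r
# Hypothetical wrapper: look at the first pair before paying for the
# full NA scan that is.unsorted() performs
is_unsorted_early <- function(x) {
  if (length(x) < 2L) {
    return(FALSE)
  }
  first <- x[1L] > x[2L]
  if (!is.na(first) && first) {
    return(TRUE)
  }
  is.unsorted(x)
}

print(is_unsorted_early(5:1))
print(is_unsorted_early(1:5))
```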


Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146

 "All statements in this message represent the opinions of the author and do
 not necessarily reflect Insightful Corporation policy or position."



Re: [Rd] Couldn't (and shouldn't) is.unsorted() be faster?

2008-04-17 Thread Prof Brian Ripley
I wouldn't say 'easy', and so I think we need a business case for this 
change.  (One of the issues is that the internals are used elsewhere and 
optimized for inputs without NAs.  So we would need to write separate code 
if we pass NAs down to C level.  As I recall, is.unsorted was a cheap R 
interface to existing C code.)

What real-world problems are being affected by this, and what would the
proportional speedup of the whole analysis be from this change?

On Thu, 17 Apr 2008, Bill Dunlap wrote:

> On Thu, 17 Apr 2008, Herve Pages wrote:
>
>> Couldn't is.unsorted() bail out immediately here (after comparing
>> the first 2 elements):
>>
>> > x <- 2000:1
>> > system.time(is.unsorted(x), gcFirst=TRUE)
>> user  system elapsed
>>0.084   0.040   0.124
>>
>> > x <- 2:1
>> > system.time(is.unsorted(x), gcFirst=TRUE)
>> user  system elapsed
>>0.772   0.440   1.214
>
> The C code does bail out upon seeing the first out- of-order pair, but
> before calling the C code, the S code does any(is.na(x)), forcing a
> scan of the entire data.  If you remove the is.na calls from
> is.unsorted's S code you will see the timings improve in your example.
> (It looks easy to do the NA checks in the C code.)
>
>   is.unsorted.no.nacheck <- function (x, na.rm = FALSE) {
>   if (is.null(x))
>   return(FALSE)
>   if (!is.atomic(x))
>   return(NA)
>   .Internal(is.unsorted(x))
>   }
>   > x <- 2000:1
>   > system.time(is.unsorted(x), gcFirst=TRUE)
>   user  system elapsed
> 0.356   0.157   0.514
>   > system.time(is.unsorted.no.nacheck(x), gcFirst=TRUE)
>   user  system elapsed
>  0   0   0
>   > revx <- rev(x)
>   > system.time(is.unsorted(revx), gcFirst=TRUE)
>  user  system elapsed
> 0.500   0.170   0.672
>   > system.time(is.unsorted.no.nacheck(revx),gcFirst=TRUE)
>  user  system elapsed
> 0.131   0.000   0.132
>
> 
> Bill Dunlap
> Insightful Corporation
> bill at insightful dot com
> 360-428-8146
>
> "All statements in this message represent the opinions of the author and do
> not necessarily reflect Insightful Corporation policy or position."
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] Couldn't (and shouldn't) is.unsorted() be faster?

2008-04-17 Thread Herve Pages
Hi,

Thanks for your answers!

No need to change anything. In my case, 'x' is guaranteed to be an integer
vector with no NAs so I can call .Internal(is.unsorted(x)) directly.

BTW, why not make is.unsorted() a little better prepared for silly user
input:

   > is.unsorted(c(2:5, NA), na.rm=NA)
   Error in if (!is.atomic(x) || (!na.rm && any(is.na(x)))) return(NA) :
     missing value where TRUE/FALSE needed

(or at least silently coerce those silly values to TRUE or FALSE like
max()/min() do, following some obscure logic though).

Also it's arguable that a length-1 vector cannot be considered sorted:
   > is.unsorted(NA)
   [1] NA
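
One way to get a clearer message is to validate the argument in a wrapper. A
minimal sketch, assuming a hypothetical function named
is_unsorted_checked(); the wording of the error message is made up:

```r
# Hypothetical wrapper: reject anything but a single TRUE or FALSE for
# 'na.rm' before delegating to is.unsorted()
is_unsorted_checked <- function(x, na.rm = FALSE) {
  if (!is.logical(na.rm) || length(na.rm) != 1L || is.na(na.rm)) {
    stop("'na.rm' must be TRUE or FALSE")
  }
  is.unsorted(x, na.rm = na.rm)
}

print(is_unsorted_checked(c(2:5, NA), na.rm = TRUE))

# The silly input now fails with the explicit message
msg <- tryCatch(is_unsorted_checked(c(2:5, NA), na.rm = NA),
                error = function(e) conditionMessage(e))
print(msg)
```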

Cheers,
H.


Prof Brian Ripley wrote:
> I wouldn't say 'easy', and so I think we need a business case for this 
> change.  (One of the issues is that the internals are used elsewhere and 
> optimized for inputs without NAs.  So we would need to write separate 
> code if we pass NAs down to C level.  As I recall, is.unsorted was a 
> cheap R interface to existing C code.)
> 
> What real-world problems are being affected by this, and what would be 
> proportional speedup of the whole analysis be from this change?
> 
> On Thu, 17 Apr 2008, Bill Dunlap wrote:
> 
>> On Thu, 17 Apr 2008, Herve Pages wrote:
>>
>>> Couldn't is.unsorted() bail out immediately here (after comparing
>>> the first 2 elements):
>>>
>>> > x <- 2000:1
>>> > system.time(is.unsorted(x), gcFirst=TRUE)
>>> user  system elapsed
>>>0.084   0.040   0.124
>>>
>>> > x <- 2:1
>>> > system.time(is.unsorted(x), gcFirst=TRUE)
>>> user  system elapsed
>>>0.772   0.440   1.214
>>
>> The C code does bail out upon seeing the first out- of-order pair, but
>> before calling the C code, the S code does any(is.na(x)), forcing a
>> scan of the entire data.  If you remove the is.na calls from
>> is.unsorted's S code you will see the timings improve in your example.
>> (It looks easy to do the NA checks in the C code.)
>>
>>   is.unsorted.no.nacheck <- function (x, na.rm = FALSE) {
>>   if (is.null(x))
>>   return(FALSE)
>>   if (!is.atomic(x))
>>   return(NA)
>>   .Internal(is.unsorted(x))
>>   }
>>   > x <- 2000:1
>>   > system.time(is.unsorted(x), gcFirst=TRUE)
>>   user  system elapsed
>> 0.356   0.157   0.514
>>   > system.time(is.unsorted.no.nacheck(x), gcFirst=TRUE)
>>   user  system elapsed
>>  0   0   0
>>   > revx <- rev(x)
>>   > system.time(is.unsorted(revx), gcFirst=TRUE)
>>  user  system elapsed
>> 0.500   0.170   0.672
>>   > system.time(is.unsorted.no.nacheck(revx),gcFirst=TRUE)
>>  user  system elapsed
>> 0.131   0.000   0.132
>>
>>  
>>
>> Bill Dunlap
>> Insightful Corporation
>> bill at insightful dot com
>> 360-428-8146
>>
>> "All statements in this message represent the opinions of the author 
>> and do
>> not necessarily reflect Insightful Corporation policy or position."
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>



[Rd] R-extension in unix system -- help to locate header files

2008-04-17 Thread Kyeongmi Cheon
Hi list,
To call C from R, I used to build R extensions on Windows, but I'm moving to a
Unix system because my PC doesn't have enough memory. My C code needs to
include the following header files:

#include 
#include 
#include 
#include 
#include 
#include 

On Windows, I had no problem with this because I had set the path for Rtools
and R etc.
But on the Unix system, these header files include other header files, and
those include yet other header files... And some files are not in the R
folders, so I have to find them manually and give their paths, which takes
forever. For example,


#include 
#include 
#include 
#include 
#include 
#include 

but now
 calls  and I have to search the whole
directory of my
school computer to find it. Same for other header files...

Is there a better way to do this? Or is there any package for Unix that
has "all" the needed header files, so that I can download them just once
and don't have to search for them? I appreciate your help in advance.
Kyeongmi

University of memphis



Re: [Rd] R-extension in unix system -- help to locate header files

2008-04-17 Thread Prof Brian Ripley
You should never give full paths in #include statements.

The search paths for include ('header') files are set by the compiler and 
supplemented by -I flags on the command line.  If your C compiler is 
unable to find  something is badly wrong, and you need to ask 
your 'unix' advisor for help.  (BTW, these header files are OS-specific, 
and some of them will be compiler-specific -- on my Solaris Unix box cc 
and gcc use different versions of some of these headers.)
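
For R's own headers, the directory can be queried from R rather than
searched for. A minimal sketch; the file name mycode.c is hypothetical and
the build command is only assembled here, not run:

```r
# Directory holding R's own headers (R.h, Rinternals.h, ...)
r_include <- R.home("include")
print(r_include)

# The kind of -I flag that points the compiler at that directory
cpp_flag <- paste0("-I", shQuote(r_include))
print(cpp_flag)

# R CMD SHLIB supplies this flag itself, so the source file needs only
# plain #include lines without any paths
cmd <- paste("R CMD SHLIB", shQuote("mycode.c"))
print(cmd)
```

System headers are a separate matter: they come with the compiler and the
operating system, as described above.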

On Thu, 17 Apr 2008, Kyeongmi Cheon wrote:

> Hi list,
> To call C,  I used to use R-extension in windows but I'm moving to unix system
> because my PC doesn't have enough memory. My C codes requires to include the
> following header files:
>
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
>
> In windows, I had no problem with it because I set the path for Rtools
> and R etc.
> But in unix system, these header files call other header files and
> they call other
> header files... And some files are not in R folders so I have to
> manually find them
> and give path and it takes for ever. For example,
>
>
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
>
> but now
>  calls  and I have to search the whole
> directory of my
> school computer to find it. Same for other header files...
>
> Is there a better way to do it? Or is there any package for Unix that
> has "all" the
> needed header files, so that I can download them only once and don't
> have to search
> for them? I appreciate your help in advance.
> Kyeongmi
>
> University of Memphis
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] Couldn't (and shouldn't) is.unsorted() be faster?

2008-04-17 Thread Prof Brian Ripley
On Thu, 17 Apr 2008, Herve Pages wrote:

> Hi,
>
> Thanks for your answers!
>
> No need to change anything. In my case, 'x' is guaranteed to be an integer
> vector with no NAs so I can call .Internal(is.unsorted(x)) directly.
>
> BTW, why not make is.unsorted() a little better prepared for silly user
> input:

Because R is a volunteer project and resources spent on trapping misuse 
are resources not available to be spent on other things.  (The same goes
for bug reports on fixed issues.)

>  > is.unsorted(c(2:5, NA), na.rm=NA)
>  Error in if (!is.atomic(x) || (!na.rm && any(is.na(x)))) return(NA) :
>    missing value where TRUE/FALSE needed
>
> (or at least silently coerce those silly values to TRUE or FALSE like
> max()/min() do, following some obscure logic though).
>
> Also it's arguable that a length-1 vector cannot be considered sorted:
>  > is.unsorted(NA)
>  [1] NA
>
> Cheers,
> H.
>
>
> Prof Brian Ripley wrote:
>> I wouldn't say 'easy', and so I think we need a business case for this 
>> change.  (One of the issues is that the internals are used elsewhere and 
>> optimized for inputs without NAs.  So we would need to write separate code 
>> if we pass NAs down to C level.  As I recall, is.unsorted was a cheap R 
>> interface to existing C code.)
>> 
>> What real-world problems are being affected by this, and what would the 
>> proportional speedup of the whole analysis be from this change?
>> 
>> On Thu, 17 Apr 2008, Bill Dunlap wrote:
>> 
>>> On Thu, 17 Apr 2008, Herve Pages wrote:
>>> 
>>>> Couldn't is.unsorted() bail out immediately here (after comparing
>>>> the first 2 elements):
>>>>
>>>> > x <- 2000:1
>>>> > system.time(is.unsorted(x), gcFirst=TRUE)
>>>>    user  system elapsed
>>>>   0.084   0.040   0.124
>>>>
>>>> > x <- 2:1
>>>> > system.time(is.unsorted(x), gcFirst=TRUE)
>>>>    user  system elapsed
>>>>   0.772   0.440   1.214
>>> 
>>> The C code does bail out upon seeing the first out-of-order pair, but
>>> before calling the C code, the S code does any(is.na(x)), forcing a
>>> scan of the entire data.  If you remove the is.na calls from
>>> is.unsorted's S code you will see the timings improve in your example.
>>> (It looks easy to do the NA checks in the C code.)
>>>
>>>   is.unsorted.no.nacheck <- function (x, na.rm = FALSE) {
>>>   if (is.null(x))
>>>   return(FALSE)
>>>   if (!is.atomic(x))
>>>   return(NA)
>>>   .Internal(is.unsorted(x))
>>>   }
>>>   > x <- 2000:1
>>>   > system.time(is.unsorted(x), gcFirst=TRUE)
>>>   user  system elapsed
>>> 0.356   0.157   0.514
>>>   > system.time(is.unsorted.no.nacheck(x), gcFirst=TRUE)
>>>   user  system elapsed
>>>  0   0   0
>>>   > revx <- rev(x)
>>>   > system.time(is.unsorted(revx), gcFirst=TRUE)
>>>  user  system elapsed
>>> 0.500   0.170   0.672
>>>   > system.time(is.unsorted.no.nacheck(revx),gcFirst=TRUE)
>>>  user  system elapsed
>>> 0.131   0.000   0.132
>>>
>>> 
>>> 
>>>  
>>> Bill Dunlap
>>> Insightful Corporation
>>> bill at insightful dot com
>>> 360-428-8146
>>> 
>>> "All statements in this message represent the opinions of the author and
>>> do not necessarily reflect Insightful Corporation policy or position."
>>> 
>>> 
>> 
>
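
Bill Dunlap's remark quoted above, that the NA checks look easy to do at the C level, can be illustrated with a rough sketch. This is not R's internal code. It assumes an integer vector in which a missing value is stored as INT_MIN (the encoding R uses for NA_integer_), and every name beginning with sketch_ or SKETCH_ is made up for this example. Note one behavioural difference from scanning for NAs first: the loop reports "unsorted" if it meets an out-of-order pair before it meets an NA.

```c
#include <limits.h>
#include <stddef.h>

/* Assumption: a missing integer is stored as INT_MIN. */
#define SKETCH_NA_INT INT_MIN

/* Hypothetical result codes for this sketch only. */
enum {
    SKETCH_HAS_NA   = -1,
    SKETCH_SORTED   =  0,
    SKETCH_UNSORTED =  1
};

/* Single pass over x: stop at the first NA or the first out-of-order
 * pair, whichever comes first, instead of scanning the whole vector
 * for NAs before comparing neighbours. */
int sketch_is_unsorted_int(const int *x, size_t n)
{
    size_t i;

    if (n == 0)
        return SKETCH_SORTED;
    if (x[0] == SKETCH_NA_INT)
        return SKETCH_HAS_NA;
    for (i = 1; i < n; i++) {
        if (x[i] == SKETCH_NA_INT)
            return SKETCH_HAS_NA;
        if (x[i - 1] > x[i])
            return SKETCH_UNSORTED;
    }
    return SKETCH_SORTED;
}
```

For a strictly decreasing input such a loop returns after inspecting the first two elements, which is the early exit asked for at the start of the thread.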

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
