Re: [Rd] str() with attr(*, "names") is extremely slow for long vectors

2006-05-08 Thread Martin Maechler
> "HenrikB" == Henrik Bengtsson (max 7Mb) <[EMAIL PROTECTED]>
> on Fri, 5 May 2006 11:58:19 -0700 writes:

HenrikB> Hi,
HenrikB> I noticed some time ago that, for instance, named vectors that are
HenrikB> really long make str() really slow when displaying the names attribute.

HenrikB> I don't know exactly when this started, but it
HenrikB> wasn't the case say 1-2 years ago.  Example (on a WinXP 1.8GHz):

Thank you, Henrik, for the note.
Indeed, str() is unnecessarily slow for long character vectors in
general, not just when they are names(); and Rprof() +
summaryRprof() quickly reveal where the culprits lie.

This shouldn't be too hard to improve, I'm having a look.

Martin Maechler, ETH Zurich



>> s <- 1:1000; names(s) <- s
>> system.time(str(s))
HenrikB> Named int [1:1000] 1 2 3 4 5 6 7 8 9 10 ...
HenrikB> - attr(*, "names")= chr [1:1000] "1" "2" "3" "4" ...
HenrikB> [1] 0.08 0.00 0.09   NA   NA

>> s <- 1:10; names(s) <- s
>> system.time(str(s))
HenrikB> Named int [1:10] 1 2 3 4 5 6 7 8 9 10 ...
HenrikB> - attr(*, "names")= chr [1:10] "1" "2" "3" "4" ...
HenrikB> [1] 8.82 0.00 9.11   NA   NA

HenrikB> It looks like all string elements are processed although only the
HenrikB> first few are displayed.

HenrikB> Cheers

HenrikB> Henrik

HenrikB> __
HenrikB> R-devel@r-project.org mailing list
HenrikB> https://stat.ethz.ch/mailman/listinfo/r-devel
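
The observation that the whole names vector seems to be processed can be checked with a small timing comparison. The sketch below is not from the original post: the vector length n and the use of head() and capture.output() are assumptions made for illustration, and the timings depend on the R version and the machine.

```r
# Build a long named integer vector, as in the report (n is an assumed length).
n <- 10000L
s <- seq_len(n)
names(s) <- s   # the names are coerced to character

# Time str() on the full vector; capture.output() only suppresses the printing.
t_full <- system.time(capture.output(str(s)))[["elapsed"]]

# Time str() when only the first few elements are handed over.
t_head <- system.time(capture.output(str(head(s, 10))))[["elapsed"]]

print(c(full = t_full, head = t_head))
```

If str() only touched the elements it prints, the two timings would be about the same regardless of n.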
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Inconsistency in AIC values for glm with family poisson (PR#8840)

2006-05-08 Thread x . sole
Full_Name: Xavier Solé
Version: 2.3.0
OS: Windows XP SP2
Submission from: (NULL) (213.151.99.160)


#When computing AIC for one of the models shown in ?glm we get an inconsistent
AIC value. We also get the same wrong value if we use the "extractAIC" or "AIC"
functions.

example(glm)

glm.D93

extractAIC(glm.D93)

#AIC of this model should be 15.129 (residual deviance + 2*effective degrees of
freedom), but the AIC which R returns is 56.76. Function extractAIC returns the
right number of effective degrees of freedom (5), but anyway seems to fail in
calculating the correct AIC value.



[Rd] Patch for r-intro to update xgobi to ggobi

2006-05-08 Thread hadley wickham

Patch attached.

If this is acceptable, would someone please be able to check this in for me?

Thanks,

Hadley


Re: [Rd] Patch for r-intro to update xgobi to ggobi

2006-05-08 Thread Uwe Ligges
hadley wickham wrote:

> Patch attached.
> 
> If this is acceptable, would someone please be able to check this in for 
> me?


Probably you do not wanted to post this to R-edev (where the attachment 
has been stripped off anyway)

Uwe Ligges



> Thanks,
> 
> Hadley


Re: [Rd] Patch for r-intro to update xgobi to ggobi

2006-05-08 Thread Uwe Ligges
Uwe Ligges wrote:

> hadley wickham wrote:
> 
>> Patch attached.
>>
>> If this is acceptable, would someone please be able to check this in 
>> for me?
> 
> 
> 
> Probably you do not wanted to post this to R-edev (where the attachment 
> has been stripped off anyway)

In fact, I meant to write:
probably you did not want to post this to R-devel ...


> Uwe Ligges
> 
> 
> 
>> Thanks,
>>
>> Hadley


Re: [Rd] Inconsistency in AIC values for glm with family poisson (PR#8841)

2006-05-08 Thread ripley

On Mon, 8 May 2006, [EMAIL PROTECTED] wrote:

> Full_Name: Xavier Solé
> Version: 2.3.0
> OS: Windows XP SP2
> Submission from: (NULL) (213.151.99.160)
>
>
> #When computing AIC for one of the models shown in ?glm we get an 
> inconsistent AIC value. We also get the same wrong value if we use the 
> "extractAIC" or "AIC" functions.

Inconsistent with what?  It seems to me that it consistently gives the 
right answer, but you have not used the actual definition of AIC.

> example(glm)
>
> glm.D93
>
> extractAIC(glm.D93)
>
> #AIC of this model should be 15.129 (residual deviance + 2*effective 
> degrees of freedom), but the AIC which R returns is 56.76. Function 
> extractAIC returns the right number of effective degrees of freedom (5), 
> but anyway seems to fail in calculating the correct AIC value.

Where do you get that from (it is not the definition of AIC)?

> AIC(glm.D93)
[1] 56.76132
> extractAIC(glm.D93)
[1]  5.0 56.76132
> logLik(glm.D93)
'log Lik.' -23.38066 (df=5)

Did you read ?AIC, which gives the actual definition?  You may also need 
to review the definitions (note, plural) of `deviance'.

Please don't expect us to accept your assertions for definitions 
of statistical quantities: you need to supply your credentials and 
references.  In this case the help page actually points out that the 
quantity is not unambiguously defined.
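
The numbers quoted above can be reproduced from the definition given in ?AIC, -2*logLik + 2*k. The following is a minimal sketch that rebuilds glm.D93 from the data used in example(glm); the variable names aic_manual, dev_plus_penalty and ll_sat are introduced here for illustration only.

```r
# Data from the example in ?glm.
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
glm.D93   <- glm(counts ~ outcome + treatment, family = poisson())

ll <- logLik(glm.D93)
k  <- attr(ll, "df")            # number of estimated parameters

# AIC as defined in ?AIC.
aic_manual <- -2 * as.numeric(ll) + 2 * k

# The quantity used in the report: residual deviance + 2 * k.
dev_plus_penalty <- deviance(glm.D93) + 2 * k

# Log-likelihood of the saturated Poisson model (fitted means = observed counts).
ll_sat <- sum(dpois(counts, lambda = counts, log = TRUE))

print(c(AIC = AIC(glm.D93), manual = aic_manual,
        deviance_based = dev_plus_penalty,
        shifted = dev_plus_penalty - 2 * ll_sat))
```

For this Poisson fit the residual deviance equals 2*(ll_sat - logLik), so the deviance-based figure differs from AIC() by the constant 2*ll_sat. That constant is the same for all models fitted to the same counts, so both quantities rank such models identically.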

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[Rd] mean of complex vector (PR#8842)

2006-05-08 Thread john . peters
Full_Name: John Peters
Version: 2.3.0
OS: Windows 2000, xp
Submission from: (NULL) (220.233.20.203)


In R2.3.0 on Windows 2000 and xp

> mean(c(1i))
[1] 0+2i
> mean(c(1i,1i))
[1] 0+3i
> mean(c(1i,1i,1i))
[1] 0+4i

OK in R2.2.1



Re: [Rd] mean of complex vector (PR#8842)

2006-05-08 Thread Peter Dalgaard
[EMAIL PROTECTED] writes:

> Full_Name: John Peters
> Version: 2.3.0
> OS: Windows 2000, xp
> Submission from: (NULL) (220.233.20.203)
> 
> 
> In R2.3.0 on Windows 2000 and xp
> 
> > mean(c(1i))
> [1] 0+2i
> > mean(c(1i,1i))
> [1] 0+3i
> > mean(c(1i,1i,1i))
> [1] 0+4i
> 
> OK in R2.2.1

Yes. This comes from a blunder in summary.c, apparently attempting to
copy the code from the REALSXP case, mutatis mutandis, but not quite so...
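
As a reference for what the result should be, the mean of a complex vector is the mean of its real parts plus i times the mean of its imaginary parts. The helper below, complex_mean(), is a hypothetical name used only for this check; it is not the code in summary.c.

```r
# Reference mean for complex vectors, computed from real and imaginary parts.
complex_mean <- function(z) {
  stopifnot(is.complex(z), length(z) > 0)
  complex(real = mean(Re(z)), imaginary = mean(Im(z)))
}

z <- c(1i, 1i, 1i)
print(complex_mean(z))                            # reference value
print(isTRUE(all.equal(mean(z), complex_mean(z))))  # TRUE on a build without the bug
```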

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[Rd] Efficient Merging of two huge sorted data frames?---Use merge()?

2006-05-08 Thread Charles Cheung
Hello all,

A problem I encountered today is the time it takes to merge two huge data 
frames.

I wish to merge by (X,Y).

Dataframe One consists of variables:
X, Y, sequence, position
having ~700 000 records

another dataframe consists of
X,Y, intensities
having ~900 000 records


Every (X,Y) pair in dataframe One is included in dataframe Two,
however,  the reverse is not true.
Furthermore, (X, Y, position) in data frame One makes the record unique.
(That means there can be multiple records with the same (X,Y) pair!)

Added together, it makes it hard to just combine the two data frames 
together by simply going
data.frame(dataFrameOne, dataFrameTwo) because the mapping won't correspond 
even in sorted records by X and Y.


Intuitively, the merge should require very little time once 
the data records are sorted.
However, it takes very long (the process had not finished after 20 minutes; it 
should only take <1 min) to merge the data frames by X and Y using

merge(dataFrameOne, dataFrameTwo, by=c("X","Y")), which leads me to suspect 
that this process is not optimized for already sorted data.

* assuming the two frames have been sorted, I would be able to do the 
following:


X  Y  seq  pos
1  1  AA   32
1  2  AG   44
1  3  GC   65


X  Y  intensities
1  1  0.4
1  3  0.552

>>Cursor at beginning (1,1) (1,1) -->merge the (1,1) pair.. then cursor 
>>moves to (1,2) (1,3)  --> can't find.. cursor moves to (1,3) (1,3) .. 
>>merge that pair

Is the merge function doing that already?


Is there an efficient way to merge the data frames? (What do you suggest I 
should do?)


(to produce)
X  Y  seq  pos  intensities
1  1  AA   32   0.4
1  3  GC   65   0.552

Thank you in advance!


Charles Cheung
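
For reference, the join described above can be sketched on the toy data from the post. Using match() on a combined key is one possible alternative to merge(); it assumes that each (X,Y) pair occurs at most once in the second data frame, which the post does not state explicitly. The names one, two, key_one and key_two are illustrative, and whether this is faster on the real data would have to be timed.

```r
# Toy data from the post.
one <- data.frame(X = c(1, 1, 1), Y = c(1, 2, 3),
                  seq = c("AA", "AG", "GC"), pos = c(32, 44, 65),
                  stringsAsFactors = FALSE)
two <- data.frame(X = c(1, 1), Y = c(1, 3), intensities = c(0.4, 0.552))

# Reference result: inner join on (X, Y).
merged_ref <- merge(one, two, by = c("X", "Y"))

# Alternative: build one key per row and look it up with match().
# Assumption: keys are unique in 'two'.
key_one <- paste(one$X, one$Y, sep = "_")
key_two <- paste(two$X, two$Y, sep = "_")
idx  <- match(key_one, key_two)      # NA where a pair is absent from 'two'
keep <- !is.na(idx)
merged_key <- cbind(one[keep, ], intensities = two$intensities[idx[keep]])

print(merged_ref)
print(merged_key)
```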



[Rd] "Unfelicity" :-) with edit()

2006-05-08 Thread François Pinard
Hi, people.  This is about R 2.3.0 under Linux.

It seems that edit() may change a function environment.  Here is 
a transcript, more comments follow:

==>
> fix(f)

> f
function ()
{
}

> fix(f)
Erreur dans edit(name, file, title, editor) :
une erreur s'est produite à la ligne 3
 utilisez une commande du genre
 x <- edit()
 pour corriger

> f <- edit()

> f
function ()
{
}

==<

The initial ``fix(f)`` called an editor, which I exited right away.  For 
the second ``fix(f)``, I used the editor for adding a slash between 
braces, and exited.  The French comment produced by R speaks about an 
error at line 3 and suggests using something like ``x <- edit()`` to 
make a correction.  On the third call to the editor, I remove the slash 
and exit.  Now, the environment of the function became "base".

This has unfortunate effects when editing a more substantial function, 
because for example, "stats" or "utils" is not readily available anymore 
after the editing.  Is it reasonable to suggest an improvement in the 
mechanics of edit(), for alleviating this drawback ?

-- 
François Pinard   http://pinard.progiciels-bpi.ca



Re: [Rd] "Unfelicity" :-) with edit()

2006-05-08 Thread Duncan Murdoch
On 5/8/2006 9:03 PM, François Pinard wrote:
> Hi, people.  This is about R 2.3.0 under Linux.
> 
> It seems that edit() may change a function environment.  Here is 
> a transcript, more comments follow:
> 
> ==>
>> fix(f)
> 
>> f
> function ()
> {
> }
> 
>> fix(f)
> Erreur dans edit(name, file, title, editor) :
> une erreur s'est produite à la ligne 3
>  utilisez une commande du genre
>  x <- edit()
>  pour corriger
> 
>> f <- edit()
> 
>> f
> function ()
> {
> }
> 
> ==<
> 
> The initial ``fix(f)`` called an editor, which I exited right away.  For 
> the second ``fix(f)``, I used the editor for adding a slash between 
> braces, and exited.  The French comment produced by R speaks about an 
> error at line 3 and suggests using something like ``x <- edit()`` to 
> make a correction.  On the third call to the editor, I remove the slash 
> and exit.  Now, the environment of the function became "base".
> 
> This has unfortunate effects when editing a more substantial function, 
> because for example, "stats" or "utils" is not readily available anymore 
> after the editing.  Is it reasonable to suggest an improvement in the 
> mechanics of edit(), for alleviating this drawback ?

edit() is a hack, so you should expect problems.  You're better off 
keeping your source in an editor and using source() to get it.  There is 
no way it could preserve the environment of a function if you go through 
the steps you went through above.

However, it's a bug (introduced by me last year when converting NULL to 
.BaseEnv) that it ends up with the base environment instead of the 
global environment.  I'll fix it.
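
The effect of a function ending up in the base environment, and a manual workaround, can be sketched as follows. The function g and the reassignment via environment<- are illustrative assumptions; they are not part of the original report or of the fix mentioned above.

```r
# A function that relies on the 'stats' package being visible.
g <- function(x) median(x)

# Mimic the reported state: the closure's environment is the base environment.
environment(g) <- baseenv()
print(environmentName(environment(g)))

# Calling g() now; any error is caught and its message is kept instead.
res <- tryCatch(g(c(1, 2, 3)), error = function(e) conditionMessage(e))
print(res)

# Workaround: put the function back into the global environment.
environment(g) <- globalenv()
print(environmentName(environment(g)))
print(g(c(1, 2, 3)))
```

Because the enclosure of baseenv() is the empty environment, lookups from such a function never reach the attached packages, which matches the symptom described in the report.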

Duncan Murdoch
