On Tue, 16 Jun 2009, jim holtman wrote:
I think the only way that you are going to get it to stop on the first
mismatch is to write your own function in C if you are concerned about the
time. Matching on character vectors will be even more costly since it is
having to loop to check the equality of each character in each element.
This is one of the places it might pay to convert to factors and then the
comparison only uses the integer values assigned to the factors.
Not so in a recent R: comparison of character vectors is now done by
comparing pointers in the first instance so (at least on a 32-bit
platform) is as fast as comparing integers. And on x86_64 Linux:
> x <- as.character(c(1,2,rep(1,10000000)))
> system.time(print(all(x[1] == x)))
[1] FALSE
   user  system elapsed
  0.123   0.019   0.142
> system.time(xx <- as.factor(x))
   user  system elapsed
  9.874   0.284  10.159
> system.time(print(all(xx[1] == xx)))
[1] FALSE
   user  system elapsed
  0.511   0.145   0.656
Recent pre-release versions of R (e.g. 2.9.1 beta) allow
> system.time(anyDuplicated(x))
   user  system elapsed
  0.034   0.078   0.113
> system.time(anyDuplicated(xx))
   user  system elapsed
  0.037   0.076   0.113
which is probably what the original poster was looking for.
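For readers who have not used anyDuplicated(), here is a small sketch of
what it returns. The vectors are illustrative and much shorter than the
ones timed above. Note that a nonzero result only says that some value
repeats; it does not by itself show that all elements are equal.

```r
# anyDuplicated() returns the index of the first element that repeats an
# earlier one, or 0 if no element is duplicated.
x <- c("1", "2", rep("1", 10))             # illustrative data, not the 10M case
first_dup <- anyDuplicated(x)              # x[3] repeats x[1]
no_dup <- anyDuplicated(c("a", "b", "c"))  # all values distinct
```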
On Tue, Jun 16, 2009 at 8:31 AM, utkarshsinghal <
utkarsh.sing...@global-analytics.com> wrote:
Hi Jim,
What you are saying is correct, although my computer may not be as fast;
for 10M entries I get the following:
   user  system elapsed
  0.559   0.038   0.607
Moreover, for character vectors the time more than doubles.
In my modeling, which is already very time consuming, I need to run this
check on a few thousand vectors, and each vector can easily have 10M
entries. So I am looking for any possible time savings. I am pretty sure
that when the elements are not all equal, this can usually be concluded
from the first few entries. It would be worthwhile to find a way that
stops checking the moment it finds two distinct elements.
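The early exit asked for here can be approximated in plain R, without
writing C, by comparing the vector to its first element in blocks and
returning at the first block that contains a mismatch. This is only a
sketch: all_same and its chunk argument are hypothetical names, not from
the thread, and the function assumes the vector contains no NA values.

```r
# Hypothetical helper (not from the thread): block-wise equality test that
# stops at the first block containing an element different from x[1].
# Assumes x has no NA values; any NA in a scanned block gives FALSE.
all_same <- function(x, chunk = 1000) {
  n <- length(x)
  if (n < 2L) return(TRUE)
  first <- x[1L]
  start <- 2
  while (start <= n) {
    end <- min(start + chunk - 1, n)
    # Vectorized comparison of one block; exit as soon as a block fails.
    if (!isTRUE(all(x[start:end] == first))) return(FALSE)
    start <- end + 1
    # Double the block size so the all-equal case needs few iterations.
    chunk <- chunk * 2
  }
  TRUE
}

x <- c(1, 2, rep(1, 100000))
res_mixed <- all_same(x)                # mismatch found in the first block
res_equal <- all_same(rep(1, 100000))   # every block has to be scanned
```

When the elements are all equal, every block is still scanned, so this
should not be expected to beat all(x[1] == x) in that case; the gain is
only for vectors whose mismatch occurs early.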
Regards
Utkarsh
jim holtman wrote:
Just check that the first (or any other element) is equal to all the rest:
> x = c(1,2,rep(1,10000000)) # 10,000,000
> system.time(print(all(x[1] == x)))
[1] FALSE
   user  system elapsed
   0.18    0.00    0.19
This was for 10M entries.
On Tue, Jun 16, 2009 at 7:42 AM, utkarshsinghal <
utkarsh.sing...@global-analytics.com> wrote:
Hi All,
There are several replies to the question below, but I think there must
be a better way of doing this.
I just want to check whether all the elements of a vector are the same.
My vector has one million elements and it is highly likely that there are
distinct elements among the first few. For example:
> x = c(1,2,rep(1,100000))
I want the answer to be FALSE, which is clear from the first two
observations alone, so there is no need to check the rest.
Does anybody know the most efficient way of doing this?
Regards
Utkarsh
From: Francisco J. Zagmutt <gerifalte28_at_hotmail.com>
Date: Tue 30 Aug 2005 - 06:05:20 EST
Hi Doran
The documentation for isTRUE reads "'isTRUE(x)' is an abbreviation of
'identical(TRUE,x)'", so Vincent's solution is actually "cleaner" than
using identical :)
Cheers
Francisco
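To illustrate the difference between the two solutions quoted below, here
is a small sketch with made-up values (not from the thread). It shows
exact comparison with == failing on a floating-point rounding difference
while the all.equal() version tolerates it, and why the isTRUE() wrapper
is needed.

```r
# Illustrative values: 0.1 + 0.2 is not exactly equal to 0.3 in floating point.
x <- c(0.3, 0.1 + 0.2, 0.3)

# Exact comparison: sensitive to the rounding difference.
exact <- all(x == x[1])

# Tolerance-based comparison, as in the quoted solution.
approx <- isTRUE(all.equal(x, rep(x[1], length(x))))

# all.equal() returns TRUE or a character description of the difference,
# never FALSE, which is why it is wrapped in isTRUE().
msg <- all.equal(c(1, 2), c(1, 1))
```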
> From: "Doran, Harold" <hdo...@air.org>
> To: <vincent.gou...@act.ulaval.ca>, <r-h...@stat.math.ethz.ch>
> Subject: Re: [R] Testing if all elements are equal in a vector/matrix
> Date: Mon, 29 Aug 2005 15:49:20 -0400
>
> See ?identical
>
> -----Original Message-----
> From: r-help-boun...@stat.math.ethz.ch
> [mailto:r-help-boun...@stat.math.ethz.ch] On Behalf Of Vincent Goulet
> Sent: Monday, August 29, 2005 3:35 PM
> To: r-h...@stat.math.ethz.ch
> Subject: [R] Testing if all elements are equal in a vector/matrix
>
> Is there a canonical way to check if all elements of a vector or matrix
> are the same? Solutions below work, but look hackish to me.
>
> > x <- rep(1, 10)
> > all(x == x[1]) # == operator does not provide for small differences
> [1] TRUE
> > isTRUE(all.equal(x, rep(x[1], length(x)))) # ugly
> [1] TRUE
>
> Best,
>
> Vincent
> --
> Vincent Goulet, Associate Professor
> École d'actuariat
> Université Laval, Québec
> Vincent.Goulet_at_act.ulaval.ca http://vgoulet.act.ulaval.ca
>
> ______________________________________________
> r-h...@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595