Re: [R] high and lowest with names

Ben qant Thu, 13 Oct 2011 09:18:03 -0700

Besides being a much better solution, it displays ties (which I see as a
benefit). For example, if I ask for 5 I get 8 top values, since the
fifth-largest value, 12, occurs 4 times.
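
To make the tie behaviour concrete, here is a small sketch. It assumes the same 5x5 matrix built from swiss$Education that is used in this thread; the names kth, n_top and n_tied are only illustrative and are not from anyone's posted code.

```r
# Same sample matrix used in this thread (built-in swiss data set)
dat <- matrix(swiss$Education[1:25], 5, 5)
cnt <- 5

# The cnt-th largest value acts as the cutoff
kth <- sort(dat, decreasing = TRUE)[cnt]

# Every element at or above the cutoff is kept, so values tied with
# the cutoff push the number of results past cnt
n_top  <- sum(dat >= kth)
n_tied <- sum(dat == kth)
print(c(cutoff = kth, returned = n_top, tied_at_cutoff = n_tied))
```

If exactly cnt results are wanted instead, the ties have to be broken explicitly, for example by taking the first cnt positions from order().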


Here is the same thing David posted with slight mods to generalize it a bit
for cnt:

x <- swiss$Education[1:25]
dat = matrix(x,5,5)
colnames(dat) = c('a','b','c','d','e')
rownames(dat) = c('z','y','x','w','v')
cnt = 5
#===============================================
dattop <- which(dat >= c(dat)[rev(order(dat))][cnt], arr.ind=TRUE)
 rbind( top = dat[dattop],
         rows = rownames(dat)[ dattop[,1] ],
         cols = colnames(dat)[ dattop[,2] ])

datbot <- which(dat <= c(dat)[order(dat)][cnt], arr.ind=TRUE)
rbind( bot = dat[datbot],
         rows = rownames(dat)[ datbot[,1] ],
         cols = colnames(dat)[ datbot[,2] ])
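
The two snippets above could be folded into one helper. What follows is only a sketch under my own assumptions: the function name extremes_with_names and its arguments top and keep_ties are hypothetical (nobody in the thread posted them), and it assumes a matrix with row and column names and no NA values.

```r
# Hypothetical helper (not from the thread): returns the cnt largest or
# smallest entries of a named matrix together with their row/column names.
# Assumes m has dimnames and contains no NA values.
extremes_with_names <- function(m, cnt, top = TRUE, keep_ties = TRUE) {
  stopifnot(is.matrix(m), cnt >= 1, cnt <= length(m))
  ord <- order(m, decreasing = top)          # linear indices, best first
  if (keep_ties) {
    cutoff <- m[ord[cnt]]                    # value of the cnt-th entry
    idx <- if (top) which(m >= cutoff) else which(m <= cutoff)
    idx <- idx[order(m[idx], decreasing = top)]
  } else {
    idx <- ord[seq_len(cnt)]                 # exactly cnt entries
  }
  pos <- arrayInd(idx, dim(m))               # linear index -> (row, col)
  data.frame(value = m[idx],
             row   = rownames(m)[pos[, 1]],
             col   = colnames(m)[pos[, 2]],
             stringsAsFactors = FALSE)
}

dat <- matrix(swiss$Education[1:25], 5, 5,
              dimnames = list(c('z','y','x','w','v'),
                              c('a','b','c','d','e')))
top5 <- extremes_with_names(dat, 5)                       # ties kept
bot5 <- extremes_with_names(dat, 5, top = FALSE,
                            keep_ties = FALSE)            # exactly 5
print(top5)
print(bot5)
```

Returning a data frame keeps the values numeric, whereas rbind() on values and names coerces everything to character.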

Thanks David!

Ben


On Thu, Oct 13, 2011 at 9:48 AM, David Winsemius <dwinsem...@comcast.net> wrote:

>
> On Oct 13, 2011, at 10:42 AM, Ben qant wrote:
>
>  Here is a more R'sh solution (speed unknown).
>>
>
> Really? The intermediate, potentially large, objects seem to be
> proliferating.
>
>
>> Courtesy of Mark Leeds (I
>> modified it a bit to generalize it for a cnt input and get min and max).
>> Again, this gets the cnt highest and lowest values in the entire matrix and
>> displays the row and column names with each data point:
>>
>
> 1) For max (or min) I would have thought that one could have much more
> easily gathered the maximum and minimum locations with:
>
>  which(x == max(x), arr.ind=TRUE)   # Bert Gunter's discarded suggestion
>
> ... and used the results as indices into x or rownames(x) or colnames(x).
> But I made no earlier comments because it did not appear that you had
> provided the swiss$Education object in a form that could be easily extracted
> for testing. I see now that setting up a similar object was fairly easy, but
> would encourage you to consider the `dput` function for such problem
> construction in the future;
>
> dat2 <- matrix(sample(1:25, 25), 5,5)
> colnames(dat2) = c('a','b','c','d','e')
> rownames(dat2) = c('z','y','x','w','v')
> arrns <- which(dat2 == max(dat2), arr.ind=TRUE)
> > arrns
>  row col
> v   5   1
> > colnames(dat2)[arrns[,2]] ; rownames(dat2)[arrns[,1]]
> [1] "a"
> [1] "v"
>
> 2) For display of all results with row/column labels :
>
> rbind(c(dat2), rownames(dat2)[row(dat2)], colnames(dat2)[col(dat2)])
>
> 3) For display of values of "bottom five" and top five:
>
>  dat2five <- which(dat2 <= c(dat2)[order(dat2)][5], arr.ind=TRUE)
>  rbind( dat2LT5= dat2[dat2five],
>          Rows = rownames(dat2)[ dat2five[,1] ],
>          Cols = colnames(dat2)[ dat2five[,2] ])
> #--------------
>
>        [,1] [,2] [,3] [,4] [,5]
> dat2LT5 "2"  "3"  "5"  "1"  "4"
> Rows    "x"  "w"  "y"  "y"  "x"
> Cols    "a"  "a"  "c"  "d"  "d"
>
> dat2topfive <- which(dat2 >= c(dat2)[rev(order(dat2))][5], arr.ind=TRUE)
>  rbind( dat2top5= dat2[dat2topfive],
>          Rows = rownames(dat2)[ dat2topfive[,1] ],
>          Cols = colnames(dat2)[ dat2topfive[,2] ])
> #---------------
>
>         [,1] [,2] [,3] [,4] [,5]
> dat2top5 "24" "25" "23" "22" "21"
> Rows     "z"  "v"  "y"  "w"  "v"
> Cols     "a"  "a"  "b"  "e"  "e"
>
>
>
>
>
>
>>  x <- swiss$Education[1:25]
>>> dat = matrix(x,5,5)
>>> colnames(dat) = c('a','b','c','d','e')
>>> rownames(dat) = c('z','y','x','w','v')
>>> cnt = 10
>>> #===============================================
>>> print(dat)
>>>
>>  a  b  c  d  e
>> z 12  7  6  2 10
>> y  9  7 12  8  3
>> x  5  8  7 28 12
>> w  7  7 12 20  6
>> v 15 13  5  9  1
>>
>>>
>>> # MAKE IT A VECTOR FOR EASIER ORDERING
>>> datasvec <- as.vector(dat)
>>> # ORDER IT
>>> datasvecordered<- order(datasvec)
>>> # RECYCLE ROWS AND COLUMNS NAMES FOR EASIER MAPPING
>>> recycledcols <- rep(colnames(dat),each=nrow(dat))
>>> recycledrows <- rep(rownames(dat),times=ncol(dat))
>>>
>>> # GET THE VALUES, THE ROW NAMES AND THE COLUMN NAMES
>>> len = length(datasvecordered)
>>> rr_len = length(recycledrows)
>>>
>>>  rbind(datasvec[datasvecordered][(len-cnt):len],
>>>        recycledrows[datasvecordered][(rr_len-cnt):rr_len],
>>>        recycledcols[datasvecordered][(rr_len-cnt):rr_len])
>>    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
>> [1,] "9"  "9"  "10" "12" "12" "12" "12" "13" "15" "20"  "28"
>> [2,] "y"  "v"  "z"  "z"  "y"  "w"  "x"  "v"  "v"  "w"   "x"
>> [3,] "a"  "d"  "e"  "a"  "c"  "c"  "e"  "b"  "a"  "d"   "d"
>>
>>>
>>>  rbind(datasvec[datasvecordered][1:cnt],
>>>        recycledrows[datasvecordered][1:cnt],
>>>        recycledcols[datasvecordered][1:cnt])
>>    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>> [1,] "1"  "2"  "3"  "5"  "5"  "6"  "6"  "7"  "7"  "7"
>> [2,] "v"  "z"  "y"  "x"  "v"  "z"  "w"  "w"  "z"  "y"
>> [3,] "e"  "d"  "e"  "a"  "c"  "c"  "e"  "a"  "b"  "b"
>>
>> enjoy
>>
>> ben
>>
>> On Wed, Oct 12, 2011 at 11:47 AM, Ben qant <ccqu...@gmail.com> wrote:
>>
>>  Hello,
>>>
>>> This is my solution. This is pretty fast (tested with a larger data set)!
>>> If you have a more elegant way to do it (of similar speed), please reply.
>>> Thanks for the help!
>>>
>>> ################## get highest and lowest values and names of a matrix
>>> # create sample data
>>>
>>> x <- swiss$Education[1:25]
>>> dat = matrix(x,5,5)
>>> colnames(dat) = c('a','b','c','d','e')
>>>
>>> rownames(dat) = c('z','y','x','w','v')
>>>
>>> #my solution
>>>
>>> nms = dimnames(dat) #get matrix row and col names
>>> cnt = 10 # number of max and mins to get
>>>
>>> tmp = dat
>>> mxs = vector("list",cnt) # preallocate a list of length cnt
>>> mns = vector("list",cnt)
>>> for(i in 1:cnt){
>>>  #get maxes
>>>  # get max dims for entire matrix; note: which.max also skips NA's
>>>  mx_dims = arrayInd(which.max(tmp), dim(tmp))
>>>  mx_nm = c(nms[[1]][mx_dims[1]],nms[[2]][mx_dims[2]]) #get names
>>>  mx = tmp[mx_dims] # get max value
>>>  mxs[[i]] = c(mx,mx_nm) # add max and dim names to list of maxes
>>>  tmp[mx_dims] = NA #removes last max so new one is found
>>>
>>>  #get mins (basically same as above)
>>>  mn_dims = arrayInd(which.min(tmp), dim(tmp))
>>>  mn_nm = c(nms[[1]][mn_dims[1]],nms[[2]][mn_dims[2]])
>>>  mn = tmp[mn_dims]
>>>  mns[[i]] = c(mn,mn_nm)
>>>  tmp[mn_dims] = NA
>>> }
>>>
>>> mxs
>>> mns
>>>
>>> # end
>>>
>>> Regards,
>>>
>>> Ben
>>>
>>>
>>> On Tue, Oct 11, 2011 at 5:32 PM, "Dénes TÓTH" <tde...@cogpsyphy.hu>
>>> wrote:
>>>
>>>
>>>> which.max is even faster:
>>>>
>>>> dims <- c(1000,1000)
>>>> tt <- array(rnorm(prod(dims)),dims)
>>>> # which
>>>> system.time(
>>>> replicate(100, which(tt==max(tt), arr.ind=TRUE))
>>>> )
>>>> # which.max (& arrayInd)
>>>> system.time(
>>>> replicate(100, arrayInd(which.max(tt), dims))
>>>> )
>>>>
>>>> Best,
>>>> Denes
>>>>
>>>>  But it's simpler and probably faster to use R's built-in capabilities.
>>>>> ?which ## note the arr.ind argument!
>>>>>
>>>>> As an example:
>>>>>
>>>>> test <- matrix(rnorm(24), nrow = 4)
>>>>> which(test==max(test), arr.ind=TRUE)
>>>>>    row col
>>>>> [1,]   2   6
>>>>>
>>>>> So this gives the row and column indices of the max, from which row and
>>>>> column names can easily be obtained from the dimnames attribute of the
>>>>> matrix.
>>>>>
>>>>> Note: This assumes that the object in question is a matrix, NOT a data
>>>>> frame, for which it would be slightly more complicated.
>>>>>
>>>>> -- Bert
>>>>>
>>>>>
>>>>> On Tue, Oct 11, 2011 at 3:06 PM, Carlos Ortega
>>>>> <c...@qualityexcellence.es> wrote:
>>>>>
>>>>>  Hi,
>>>>>>
>>>>>> With this code you can find row and col names for the largest value
>>>>>> applied to your example:
>>>>>>
>>>>>> r.m.tmp<-apply(dat,1,max)
>>>>>> r.max<-names(r.m.tmp)[r.m.tmp==max(r.m.tmp)]
>>>>>>
>>>>>> c.m.tmp<-apply(dat,2,max)
>>>>>> c.max<-names(c.m.tmp)[c.m.tmp==max(c.m.tmp)]
>>>>>>
>>>>>> It's immediate how to get the same for the smallest and build a
>>>>>> function to calculate everything and return a list.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Carlos Ortega
>>>>>> www.qualityexcellence.es
>>>>>>
>>>>>> 2011/10/11 Ben qant <ccqu...@gmail.com>
>>>>>>
>>>>>>  Hello,
>>>>>>>
>>>>>>> I'm looking to get the values, row names and column names of the
>>>>>>> largest and smallest values in a matrix.
>>>>>>>
>>>>>>> Example (except it does not include the names):
>>>>>>>
>>>>>>>  x <- swiss$Education[1:25]
>>>>>>>> dat = matrix(x,5,5)
>>>>>>>> colnames(dat) = c('a','b','c','d','e')
>>>>>>>> rownames(dat) = c('z','y','x','w','v')
>>>>>>>> dat
>>>>>>>>
>>>>>>>  a  b  c  d  e
>>>>>>> z 12  7  6  2 10
>>>>>>> y  9  7 12  8  3
>>>>>>> x  5  8  7 28 12
>>>>>>> w  7  7 12 20  6
>>>>>>> v 15 13  5  9  1
>>>>>>>
>>>>>>>  #top 10
>>>>>>>> n = length(dat)
>>>>>>>> sort(dat,partial=(n-9):n)[(n-9):n]
>>>>>>>>
>>>>>>> [1]  9 10 12 12 12 12 13 15 20 28
>>>>>>>
>>>>>>>> # bottom 10
>>>>>>>> sort(dat,partial=1:10)[1:10]
>>>>>>>>
>>>>>>> [1] 1 2 3 5 5 6 6 7 7 7
>>>>>>>
>>>>>>> ...except I need the rownames and colnames to go along for the ride
>>>>>>> with the values...because of this, I am guessing the return value will
>>>>>>> need to be a list since all of the values have different row and col
>>>>>>> names (which is fine).
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Ben
>>>>>>>
>>>>>>
> David Winsemius, MD
> West Hartford, CT
>
>


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
