Not off topic at all; there are several matters of naming that I am not at
all settled on yet, and I don't think they are unimportant.
Indeed, those are closely related functions, and I wasn't aware of them
yet, so that's some welcome additional perspective. The Mathematica
function differs in that t
My comment is just on the name.
I'd expect something named `groupby`
to behave essentially like Mathematica's `GatherBy` command.
http://reference.wolfram.com/mathematica/ref/GatherBy.html
I think you are after something more like Matlab's grpstats:
http://www.mathworks.com/help/stats/grpstats.htm
Alan:
The equivalent of that in my current draft would be group_by(keys, values),
which is shorthand for group_by(keys).group(values); an optional values
argument to the constructor of GroupBy is directly bound to return an
iterable over the grouped values; but we often want to bind different value
On 1/26/2014 12:02 PM, Stéfan van der Walt wrote:
> what would the output of
>
> ``group_by((key1, key2))``
I'd expect something named "groupby" to behave as below.
Alan
def groupby(seq, key):
    from collections import defaultdict
    groups = defaultdict(list)
    for item in seq:
        groups[
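The snippet above is cut off in the archive. For comparison only, here is a minimal sketch of a dictionary-based grouping function of that general kind. The name `groupby_dict`, the convention that `key` is a callable applied to each item, and the returned plain dict are all assumptions for illustration, not a reconstruction of the original message.

```python
from collections import defaultdict


def groupby_dict(seq, key):
    """Group the items of seq by key(item); hypothetical helper, not from the thread."""
    groups = defaultdict(list)
    for item in seq:
        # Items that share a key end up in the same list, in input order.
        groups[key(item)].append(item)
    return dict(groups)


words = ["apple", "avocado", "banana", "blueberry", "cherry"]
by_initial = groupby_dict(words, key=lambda w: w[0])
```

This style returns the grouped items themselves, whereas the draft under discussion returns an object on which reductions are then called.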
To follow up with an example as to why it is useful that a temporary object
is created, consider the following (taken from the radial reduction
example):
g = group_by(np.round(radius, 5).flatten())
pp.errorbar(
    g.unique,
    g.mean(sample.flatten())[1],
    g.std(sample.fla
An object of type GroupBy.
So a call to group_by does not return any consumable output directly. If
you want for instance the unique keys, or groups if you will, you can call
GroupBy.unique. In this case, for a tuple of input keys, you'd get a tuple
of unique keys back. If you want to compute sever
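To illustrate why an intermediate object can be useful, here is a rough sketch in which the keys are indexed once and several reductions reuse that index. The class name `SketchGroupBy` is hypothetical, it handles only 1-D keys and 1-D values, and it is not the implementation from the linked draft. Returning a `(unique, result)` pair is modeled on the `[1]` indexing in the errorbar example, but that too is an assumption.

```python
import numpy as np


class SketchGroupBy:
    """Hypothetical stand-in for a GroupBy object; 1-D keys only."""

    def __init__(self, keys):
        keys = np.asarray(keys).ravel()
        # unique: sorted distinct keys; inverse: group index of every item.
        self.unique, inverse = np.unique(keys, return_inverse=True)
        self.inverse = inverse.ravel()
        self.count = np.bincount(self.inverse, minlength=len(self.unique))

    def _sum(self, values):
        # Per-group sum, reusing the index computed in the constructor.
        return np.bincount(self.inverse, weights=values, minlength=len(self.unique))

    def mean(self, values):
        values = np.asarray(values, dtype=float).ravel()
        return self.unique, self._sum(values) / self.count

    def std(self, values):
        values = np.asarray(values, dtype=float).ravel()
        _, means = self.mean(values)
        # Population standard deviation within each group.
        squared_dev = (values - means[self.inverse]) ** 2
        return self.unique, np.sqrt(self._sum(squared_dev) / self.count)


g = SketchGroupBy(list("abaabb"))
values = [1.0, 2.0, 3.0, 5.0, 4.0, 6.0]
keys_out, group_means = g.mean(values)
_, group_stds = g.std(values)
```

The sort inside `np.unique` happens once in the constructor; `mean` and `std` then cost only a pass over the values each.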
Hi Eelco
On Sun, 26 Jan 2014 12:20:04 +0100, Eelco Hoogendoorn wrote:
> key1 = list('abaabb')
> key2 = np.random.randint(0,2,(6,2))
> values = np.random.rand(6,3)
> print group_by((key1, key2)).median(values)
I agree that group_by functionality could be handy in numpy.
In the above example, what
Hi all,
Please critique my draft exploring the possibilities of adding group_by
support to numpy:
http://pastebin.com/c5WLWPbp
In nearly every project I work on, I require group_by functionality of some
sort. There are other libraries that provide this kind of functionality,
such as pandas for ins