On Tue, Jul 5, 2016 at 1:03 AM, Juan Nunez-Iglesias
wrote:
> On 4 July 2016 at 7:27:47 PM, Skip Montanaro (skip.montan...@gmail.com)
> wrote:
>
> Hashing it probably wouldn't work, too
> great a chance for collisions.
>
>
> If the string is ASCII, you can always interpret the bytes as part of an
On 4 July 2016 at 7:27:47 PM, Skip Montanaro (skip.montan...@gmail.com)
wrote:

> Hashing it probably wouldn't work, too
> great a chance for collisions.

If the string is ASCII, you can always interpret the bytes as part of an
8-byte integer. Or, you can map unique values to consecutive integers.
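
Both suggestions above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the `keys` array is hypothetical sample data, the keys are assumed to be ASCII and at most 8 characters long, and the integer values produced by the byte reinterpretation depend on the machine's byte order, so only equality between them is meaningful.

```python
import numpy as np

# Hypothetical key column; assumed ASCII and at most 8 characters per key.
keys = np.array(['a', 'b', 'a', 'b'])

# Option 1: pad each key to 8 bytes and reinterpret those bytes as one int64.
# Equal keys give equal integers; the actual values depend on byte order.
as_int = keys.astype('S8').view(np.int64)

# Option 2: map each distinct key to a consecutive integer code.
# np.unique returns the sorted distinct keys and, with return_inverse=True,
# the code of every element, so that uniques[codes] reproduces keys.
uniques, codes = np.unique(keys, return_inverse=True)
```

Option 2 places no limit on key length and gives small, dense codes, at the cost of a sort over the key column.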
___
On 4 July 2016 at 7:38:48 PM, Skip Montanaro (skip.montan...@gmail.com)
wrote:

> Oh, cool. Precisely the sort of solution I was hoping would turn up.

Except it doesn’t seem to meet your original spec, which retrieved the
first item of each *run* of an index value?
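
The distinction raised here can be made concrete with the four-row sample table from the pandas message in this thread. The sketch below is one possible way to express "first row of each run" in pandas and is not taken from any message in the thread; the variable names are illustrative only.

```python
import pandas as pd

# Same four rows as the sample data in the pandas message.
data = [['a', 27, 14.5], ['b', 12, 99.0], ['a', 17, 100.3], ['b', 12, -329.0]]
df = pd.DataFrame(data, columns=list('ABC'))

# groupby gives one row per distinct key, regardless of where it occurs.
first_per_key = df.groupby('A').first()

# A run starts wherever the key differs from the key on the previous row.
# shift() yields NaN for the first row, so the first row always starts a run.
is_run_start = df['A'] != df['A'].shift()
first_per_run = df[is_run_start]
```

With this data the keys alternate, so every row starts a new run: `first_per_run` keeps all four rows, while `first_per_key` has only two.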
_
> This is trivial in pandas: a simple groupby.
Oh, cool. Precisely the sort of solution I was hoping would turn up.
Straightforward, easy for a novice data slinger like me to understand,
and likely a bazillion times faster than the straightforward version.
Skip
_
This is trivial in pandas: a simple groupby.

In [6]: data = [['a', 27, 14.5], ['b', 12, 99.0], ['a', 17, 100.3],
   ...:         ['b', 12, -329.0]]

In [7]: df = DataFrame(data, columns=list('ABC'))

In [8]: df
Out[8]:
   A   B      C
0  a  27   14.5
1  b  12   99.0
2  a  17  100.3
3  b  12 -329.0

In [9]: df.gro
> Any way that you can make your keys numeric? Then you can run np.diff on
> that first column, and use the indices of nonzero entries (np.flatnonzero)
> to know where values change. With a +1/-1 offset (that I am too lazy to
> figure out right now ;) you can then index into the original rows to ge
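
The np.diff / np.flatnonzero idea quoted above could look roughly like the following. This is a sketch, not the original poster's code: the `codes` array is hypothetical numeric key data, and the `+ 1` is one reading of the offset the message leaves unspecified (np.diff at position i compares elements i and i + 1, so the new run begins at i + 1).

```python
import numpy as np

# Hypothetical numeric key column, with runs of repeated values.
codes = np.array([0, 0, 1, 1, 1, 0, 2, 2])

# Positions where the value differs from its predecessor, shifted by one
# so that they point at the first element of each new run.
change = np.flatnonzero(np.diff(codes)) + 1

# The very first element always starts a run.
starts = np.concatenate(([0], change))

# Index with starts to pick the first row of each run.
first_of_each_run = codes[starts]
```

For the sample array, `starts` is `[0, 2, 5, 6]` and `first_of_each_run` is `[0, 1, 0, 2]`. The same `starts` index can be applied to the other columns of the original rows.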
> 1. This is not a NumPy question; StackExchange would be more appropriate.
Thanks, that is the fairly straightforward -- but slow -- solution, which I
have already implemented. I was asking if numpy had some filtering functions
which might speed things up (it's a huge library, with which I'm not
=========================
Announcing PyTables 3.2.3
=========================
We are happy to announce PyTables 3.2.3.
What's new
==========
This is a bug fix release. It solves many issues reported in the
months since the release of 3.2.2.
In case you want to know more in detail what has