2008/10/9 David Bolme <[EMAIL PROTECTED]>:
> I have written up a basic nearest neighbor algorithm. It does a brute
> force search so it will be slower than kdtrees as the number of points
> gets large. It should however work well for high dimensional data. I
> have also added the option for user d
I have written up a basic nearest neighbor algorithm. It does a brute
force search so it will be slower than kdtrees as the number of points
gets large. It should however work well for high dimensional data. I
have also added the option for user defined distance measures. The
user can set
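
A minimal sketch of the kind of brute-force search described above, with a pluggable distance measure. This is not the code from the message (which is not shown here); the names brute_force_knn, euclidean and manhattan are made up for illustration, and the example is pure Python rather than numpy to keep it self-contained.

```python
import heapq
import math


def euclidean(a, b):
    """Default distance: ordinary Euclidean distance between two points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def brute_force_knn(data, x, k=1, distance=euclidean):
    """Return the k nearest points to x as sorted (distance, index) pairs.

    Hypothetical helper: every stored point is compared against x, so a
    single query costs len(data) distance evaluations.
    """
    scored = ((distance(p, x), i) for i, p in enumerate(data))
    return heapq.nsmallest(k, scored)


def manhattan(a, b):
    """Example of a user-defined distance measure (city-block distance)."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


points = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.5), (0.2, 0.1)]

# Two nearest neighbors of the origin under the default distance.
nearest = brute_force_knn(points, (0.0, 0.0), k=2)

# Single nearest neighbor of (2, 2) under the user-supplied distance.
nearest_l1 = brute_force_knn(points, (2.0, 2.0), k=1, distance=manhattan)
```

Because the distance is just a callable, any measure that returns a comparable number can be swapped in without touching the search loop, which is the main attraction of the brute-force approach over a tree that assumes a particular metric.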
On Oct 8, 2008, at 1:36 PM, Anne Archibald wrote:
> How important did you find the ability to select priority versus
> non-priority search?
>
> How important did you find the ability to select other splitting
> rules?
In both cases, I included those things just because they were options
in
My version of a wrapper of ANN is attached. I wrote it when I had
some issues installing the scikits.ann package. It uses ctypes, and
might be useful in deciding on an API.
Please feel free to take what you like,
-Rob
ann.tar.gz
Description: GNU Zip compressed data
Rob Hetland
2008/10/3 David Bolme <[EMAIL PROTECTED]>:
> I remember reading a paper or book that stated that for data that has
> been normalized, correlation and Euclidean are equivalent and will
> produce the same knn results. To this end I spent a couple hours this
> afternoon doing the math. This document
I remember reading a paper or book that stated that for data that has
been normalized, correlation and Euclidean are equivalent and will
produce the same knn results. To this end I spent a couple hours this
afternoon doing the math. This document is the result.
http://www.cs.colostate.edu/~
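
A small numerical check of the claim above, assuming "normalized" means each vector has its mean subtracted and is then scaled to unit length (the linked document is not reproduced here, so that definition is an assumption). Under that assumption the dot product of two vectors is their Pearson correlation r, and the squared Euclidean distance is 2 - 2r, so sorting by increasing distance gives the same order as sorting by decreasing correlation. The helper names are illustrative only.

```python
import math


def normalize(v):
    """Subtract the mean, then scale to unit Euclidean length."""
    mean = sum(v) / len(v)
    centered = [x - mean for x in v]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


def dot(a, b):
    """Dot product; equals the Pearson correlation for normalized inputs."""
    return sum(x * y for x, y in zip(a, b))


def sq_euclidean(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = normalize([1.0, 2.0, 4.0, 3.0])
data = [normalize(v) for v in ([1.0, 2.0, 3.0, 4.0],
                               [4.0, 3.0, 2.0, 1.0],
                               [2.0, 1.0, 3.0, 4.0])]

# Neighbor order by increasing distance and by decreasing correlation.
by_distance = sorted(range(len(data)),
                     key=lambda i: sq_euclidean(data[i], query))
by_correlation = sorted(range(len(data)),
                        key=lambda i: -dot(data[i], query))

# Largest deviation from the identity ||a - b||^2 = 2 - 2 r.
max_error = max(abs(sq_euclidean(d, query) - (2.0 - 2.0 * dot(d, query)))
                for d in data)
```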
2008/10/2 David Bolme <[EMAIL PROTECTED]>:
> It may be useful to have an interface that handles both cases:
> similarity and dissimilarity. Often I have seen "Nearest Neighbor"
> algorithms that look for maximum similarity instead of minimum
> distance. In my field (biometrics) we often deal with
It may be useful to have an interface that handles both cases:
similarity and dissimilarity. Often I have seen "Nearest Neighbor"
algorithms that look for maximum similarity instead of minimum
distance. In my field (biometrics) we often deal with very
specialized distance or similarity me
2008/10/2 David Bolme <[EMAIL PROTECTED]>:
> I also like the idea of a scipy.spatial library. For the research I
> do in machine learning and computer vision we are often interested in
> specifying different distance measures. It would be nice to have a
> way to specify the distance measure. I w
I also like the idea of a scipy.spatial library. For the research I
do in machine learning and computer vision we are often interested in
specifying different distance measures. It would be nice to have a
way to specify the distance measure. I would like to see a standard
set included: C
> distances, indices = T.query(xs) # single nearest neighbor
I'm not sure if it's implied, but can xs be an NxD matrix here, i.e.
a query for all N points rather than just one? This would reduce the
Python call overhead for large queries.
Also, I have some C++ code for locality sensitive hashing which
2008/10/1 Barry Wark <[EMAIL PROTECTED]>:
> Thanks for taking this on. The scikits.ann has licensing issues (as
> noted above), so it would be nice to have a clean-room implementation
> in scipy. I am happy to port the scikits.ann API to the final API that
> you choose, however, if you think that
2008/10/1 Gael Varoquaux <[EMAIL PROTECTED]>:
> On Tue, Sep 30, 2008 at 06:10:46PM -0400, Anne Archibald wrote:
>> > k=None in the third call to T.query seems redundant. It should be
>> > possible to put in some logic so that the call is simply
>
>> > distances, indices = T.query(xs, distance_upper_b
On Wed, Oct 1, 2008 at 1:26 AM, Gael Varoquaux
<[EMAIL PROTECTED]> wrote:
>
> Absolutely. I just think k should default to None, when
> distance_upper_bound is specified. k=None could be interpreted as k=1
> when distance_upper_bound is not specified.
>
Why not expose the various possibilities th
On Mon, Sep 29, 2008 at 8:24 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> Once again there has been a thread on the numpy/scipy mailing lists
> requesting (essentially) some form of spatial data structure. Pointers
> have been posted to ANN (sadly LGPLed and in C++) as well as a handful
On Tue, Sep 30, 2008 at 06:10:46PM -0400, Anne Archibald wrote:
> > k=None in the third call to T.query seems redundant. It should be
> > possible to put in some logic so that the call is simply
> > distances, indices = T.query(xs, distance_upper_bound=1.0)
> Well, the problem with this is that you
On Tue, Sep 30, 2008 at 6:10 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
>
> Well, the problem with this is that you often want to provide a
> distance upper bound as well as a number of nearest neighbors. For
This use case is also important in scattered data interpolation, so we
definitely want
2008/9/30 Gael Varoquaux <[EMAIL PROTECTED]>:
> On Tue, Sep 30, 2008 at 05:31:17PM -0400, Anne Archibald wrote:
>> T = KDTree(data)
>
>> distances, indices = T.query(xs) # single nearest neighbor
>
>> distances, indices = T.query(xs, k=10) # ten nearest neighbors
>
>> distances, indices = T.query(x
On Tue, Sep 30, 2008 at 05:31:17PM -0400, Anne Archibald wrote:
> T = KDTree(data)
> distances, indices = T.query(xs) # single nearest neighbor
> distances, indices = T.query(xs, k=10) # ten nearest neighbors
> distances, indices = T.query(xs, k=None, distance_upper_bound=1.0) #
> all within 1 o
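
For comparison, a sketch of how these calls look against the KDTree that scipy.spatial currently ships, which is not identical to the proposal quoted above. In current SciPy, query takes k and distance_upper_bound together, xs may be an NxD array so all N points are queried in one call, and the "everything within a radius" case is handled by query_ball_point rather than k=None. The data, radius and variable names below are arbitrary choices for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

rng = np.random.default_rng(0)
data = rng.random((200, 3))   # 200 points in 3 dimensions
xs = rng.random((5, 3))       # 5 query points, one per row (N x D)

T = KDTree(data)

# Single nearest neighbor of every row of xs: both results have shape (5,).
distances, indices = T.query(xs)

# Ten nearest neighbors of every row: both results have shape (5, 10).
distances10, indices10 = T.query(xs, k=10)

# At most ten neighbors, and only those closer than 0.1. Slots with no
# neighbor are reported with infinite distance and index T.n.
d_bounded, i_bounded = T.query(xs, k=10, distance_upper_bound=0.1)
missing = i_bounded == T.n

# All neighbors within radius 0.1: one list of indices per query point.
within = T.query_ball_point(xs, r=0.1)
```

Passing the whole xs array at once keeps the per-point loop out of Python, which addresses the call-overhead concern raised elsewhere in this thread.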
2008/9/30 Peter <[EMAIL PROTECTED]>:
> On Tue, Sep 30, 2008 at 5:10 AM, Christopher Barker
> <[EMAIL PROTECTED]> wrote:
>>
>> Anne Archibald wrote:
>>> I suggest the creation of
>>> a new submodule of scipy, scipy.spatial,
>>
>> +1
>>
>> Here's one to consider:
>> http://pypi.python.org/pypi/Rtree
On Tue, Sep 30, 2008 at 5:10 AM, Christopher Barker
<[EMAIL PROTECTED]> wrote:
>
> Anne Archibald wrote:
>> I suggest the creation of
>> a new submodule of scipy, scipy.spatial,
>
> +1
>
> Here's one to consider:
> http://pypi.python.org/pypi/Rtree
> and perhaps other stuff from:
> http://trac.gisp
On Mon, Sep 29, 2008 at 9:24 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> Once again there has been a thread on the numpy/scipy mailing lists
> requesting (essentially) some form of spatial data structure. Pointers
> have been posted to ANN (sadly LGPLed and in C++) as well as a handful
Anne Archibald wrote:
> I suggest the creation of
> a new submodule of scipy, scipy.spatial,
+1
Here's one to consider:
http://pypi.python.org/pypi/Rtree
and perhaps other stuff from:
http://trac.gispython.org/spatialindex/wiki
which I think is LGPL -- can scipy use that?
By the way, a gener
Hi,
Once again there has been a thread on the numpy/scipy mailing lists
requesting (essentially) some form of spatial data structure. Pointers
have been posted to ANN (sadly LGPLed and in C++) as well as a handful
of pure-python implementations of kd-trees. I suggest the creation of
a new submodul