Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-12 Thread Anne Archibald
2008/10/9 David Bolme <[EMAIL PROTECTED]>: > I have written up a basic nearest neighbor algorithm. It does a brute > force search so it will be slower than kdtrees as the number of points > gets large. It should however work well for high dimensional data. I > have also added the option for user d

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-09 Thread David Bolme
I have written up a basic nearest neighbor algorithm. It does a brute-force search, so it will be slower than kd-trees as the number of points gets large. It should, however, work well for high-dimensional data. I have also added the option for user-defined distance measures. The user can set
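
A brute-force search with a pluggable distance measure, as described in the message above, could look roughly like the sketch below. This is not David's code, which is not reproduced in this archive; the names `brute_force_knn`, `euclidean` and `cityblock` and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def euclidean(a, b):
    """Ordinary L2 distance between two 1-D arrays."""
    return float(np.sqrt(np.sum((a - b) ** 2)))


def cityblock(a, b):
    """L1 (Manhattan) distance, standing in for a user-defined measure."""
    return float(np.sum(np.abs(a - b)))


def brute_force_knn(data, query, k=1, distance=euclidean):
    """Return (distances, indices) of the k rows of `data` closest to `query`.

    `distance` is any callable taking two 1-D arrays and returning a float.
    """
    # Brute force: evaluate the distance to every stored point.
    d = np.array([distance(row, query) for row in data])
    # Stable sort so that ties resolve to the lower index.
    order = np.argsort(d, kind="stable")[:k]
    return d[order], order


data = np.array([[1.0, 1.0], [0.0, 1.8], [3.0, 3.0]])
query = np.array([0.0, 0.0])

d_l2, i_l2 = brute_force_knn(data, query, k=1)
d_l1, i_l1 = brute_force_knn(data, query, k=1, distance=cityblock)
print(i_l2, i_l1)  # the two measures pick different nearest neighbors here
```

Each query costs one distance evaluation per stored point, which is why this approach falls behind kd-trees as the number of points grows, while the choice of distance measure stays completely open.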

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-08 Thread Rob Hetland
On Oct 8, 2008, at 1:36 PM, Anne Archibald wrote: > How important did you find the ability to select priority versus > non-priority search? > > How important did you find the ability to select other splitting > rules? In both cases, I included those things just because they were options in

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-08 Thread Rob Hetland
My version of a wrapper of ANN is attached. I wrote it when I had some issues installing the scikits.ann package. It uses ctypes, and might be useful in deciding on an API. Please feel free to take what you like, -Rob ann.tar.gz Description: GNU Zip compressed data Rob Hetland

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-03 Thread Anne Archibald
2008/10/3 David Bolme <[EMAIL PROTECTED]>: > I remember reading a paper or book that stated that for data that has > been normalized, correlation and Euclidean distance are equivalent and will > produce the same knn results. To this end I spent a couple hours this > afternoon doing the math. This document

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-03 Thread David Bolme
I remember reading a paper or book that stated that for data that has been normalized, correlation and Euclidean distance are equivalent and will produce the same knn results. To this end I spent a couple hours this afternoon doing the math. This document is the result. http://www.cs.colostate.edu/~
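
The equivalence mentioned above can be checked numerically. The sketch below assumes that "normalized" means each vector is shifted to zero mean and scaled to unit Euclidean norm; the message does not spell this out, and the linked document is not reproduced here. Under that assumption ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b = 2 - 2 corr(a, b), so ranking by smallest distance and ranking by largest correlation give the same order. All function names are illustrative.

```python
import math
import random


def standardize(v):
    """Shift a vector to zero mean and scale it to unit Euclidean norm."""
    mean = sum(v) / len(v)
    centered = [x - mean for x in v]
    norm = math.sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered]


def correlation(a, b):
    """Pearson correlation of two standardized vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def sq_euclidean(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


rng = random.Random(0)
points = [standardize([rng.gauss(0, 1) for _ in range(8)]) for _ in range(20)]
query = standardize([rng.gauss(0, 1) for _ in range(8)])

# The identity ||a - b||^2 = 2 - 2 * corr(a, b) holds for every pair.
for p in points:
    assert abs(sq_euclidean(p, query) - (2 - 2 * correlation(p, query))) < 1e-12

# Nearest-first by distance versus most-correlated-first.
by_distance = sorted(range(len(points)),
                     key=lambda i: sq_euclidean(points[i], query))
by_correlation = sorted(range(len(points)),
                        key=lambda i: -correlation(points[i], query))
print(by_distance == by_correlation)
```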

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-02 Thread Anne Archibald
2008/10/2 David Bolme <[EMAIL PROTECTED]>: > It may be useful to have an interface that handles both cases: > similarity and dissimilarity. Often I have seen "Nearest Neighbor" > algorithms that look for maximum similarity instead of minimum > distance. In my field (biometrics) we often deal with

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-02 Thread David Bolme
It may be useful to have an interface that handles both cases: similarity and dissimilarity. Often I have seen "Nearest Neighbor" algorithms that look for maximum similarity instead of minimum distance. In my field (biometrics) we often deal with very specialized distance or similarity me
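
One way to cover both cases with a single interface is a flag that says whether larger scores are better. The sketch below is only an illustration of that idea, not an API proposed in the thread; `nearest`, `higher_is_better`, and the two scoring functions are hypothetical names.

```python
import math


def nearest(items, query, score, higher_is_better=False, k=1):
    """Return indices of the k best matches to `query` under `score`.

    With higher_is_better=False, `score` is treated as a distance
    (smallest wins); with True, it is treated as a similarity (largest wins).
    """
    scored = [(score(item, query), i) for i, item in enumerate(items)]
    scored.sort(key=lambda pair: pair[0], reverse=higher_is_better)
    return [i for _, i in scored[:k]]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


items = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
query = (0.9, 0.1)

closest = nearest(items, query, euclidean, k=2)
most_similar = nearest(items, query, cosine_similarity,
                       higher_is_better=True, k=2)
print(closest, most_similar)
```

An alternative is to negate the similarity and reuse a minimum-distance search, but tree-based methods generally rely on properties of a true metric that a negated similarity need not have.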

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-02 Thread Matthieu Brucher
2008/10/2 David Bolme <[EMAIL PROTECTED]>: > I also like the idea of a scipy.spatial library. For the research I > do in machine learning and computer vision we are often interested in > specifying different distance measures. It would be nice to have a > way to specify the distance measure. I w

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-02 Thread David Bolme
I also like the idea of a scipy.spatial library. For the research I do in machine learning and computer vision we are often interested in specifying different distance measures. It would be nice to have a way to specify the distance measure. I would like to see a standard set included: C

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-01 Thread James Philbin
> distances, indices = T.query(xs) # single nearest neighbor I'm not sure if it's implied, but can xs be an NxD matrix here, i.e. query for all N points rather than just one? This will reduce the Python call overhead for large queries. Also, I have some C++ code for locality sensitive hashing which
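
For reference, the `KDTree` class that later shipped in `scipy.spatial` does accept an N x D array of query points in a single call. The sketch below uses that released class rather than the draft discussed in this thread; the random data and variable names are made up for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

rng = np.random.default_rng(0)
data = rng.random((1000, 3))   # 1000 stored points in 3 dimensions
xs = rng.random((50, 3))       # N = 50 query points, D = 3

tree = KDTree(data)

# One call for all N query points: both results have shape (N,).
distances, indices = tree.query(xs)

# The same answers obtained one point at a time, with N Python-level calls.
looped = [tree.query(x)[1] for x in xs]
print(distances.shape, indices.shape)
```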

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-01 Thread Anne Archibald
2008/10/1 Barry Wark <[EMAIL PROTECTED]>: > Thanks for taking this on. The scikits.ann has licensing issues (as > noted above), so it would be nice to have a clean-room implementation > in scipy. I am happy to port the scikits.ann API to the final API that > you choose, however, if you think that

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-10-01 Thread Anne Archibald
2008/10/1 Gael Varoquaux <[EMAIL PROTECTED]>: > On Tue, Sep 30, 2008 at 06:10:46PM -0400, Anne Archibald wrote: >> > k=None in the third call to T.query seems redundant. It should be >> > possible to put some logic so that the call is simply > >> > distances, indices = T.query(xs, distance_upper_b

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Nathan Bell
On Wed, Oct 1, 2008 at 1:26 AM, Gael Varoquaux <[EMAIL PROTECTED]> wrote: > > Absolutely. I just think k should default to None, when > distance_upper_bound is specified. k=None could be interpreted as k=1 > when distance_upper_bound is not specified. > Why not expose the various possibilities th

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Barry Wark
On Mon, Sep 29, 2008 at 8:24 PM, Anne Archibald <[EMAIL PROTECTED]> wrote: > Hi, > > Once again there has been a thread on the numpy/scipy mailing lists > requesting (essentially) some form of spatial data structure. Pointers > have been posted to ANN (sadly LGPLed and in C++) as well as a handful

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Gael Varoquaux
On Tue, Sep 30, 2008 at 06:10:46PM -0400, Anne Archibald wrote: > > k=None in the third call to T.query seems redundant. It should be > > possible to put some logic so that the call is simply > > distances, indices = T.query(xs, distance_upper_bound=1.0) > Well, the problem with this is that you

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Nathan Bell
On Tue, Sep 30, 2008 at 6:10 PM, Anne Archibald <[EMAIL PROTECTED]> wrote: > > Well, the problem with this is that you often want to provide a > distance upper bound as well as a number of nearest neighbors. For This use case is also important in scattered data interpolation, so we definitely want

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Anne Archibald
2008/9/30 Gael Varoquaux <[EMAIL PROTECTED]>: > On Tue, Sep 30, 2008 at 05:31:17PM -0400, Anne Archibald wrote: >> T = KDTree(data) > >> distances, indices = T.query(xs) # single nearest neighbor > >> distances, indices = T.query(xs, k=10) # ten nearest neighbors > >> distances, indices = T.query(x

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Gael Varoquaux
On Tue, Sep 30, 2008 at 05:31:17PM -0400, Anne Archibald wrote:
> T = KDTree(data)
> distances, indices = T.query(xs) # single nearest neighbor
> distances, indices = T.query(xs, k=10) # ten nearest neighbors
> distances, indices = T.query(xs, k=None, distance_upper_bound=1.0) #
> all within 1 o
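
The calls quoted above are from the draft proposal. As a point of comparison, here is a small sketch against `scipy.spatial.KDTree` as it was later released. Two things differ from the draft and are worth flagging: recent SciPy versions do not accept `k=None`, and the "all points within a radius" query is a separate method, `query_ball_point`. The toy data is invented for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
xs = np.array([[0.1, 0.0], [4.0, 4.0]])

T = KDTree(data)

# Single nearest neighbor of each query point.
distances, indices = T.query(xs)

# Two nearest neighbors: results have shape (len(xs), 2).
d2, i2 = T.query(xs, k=2)

# At most two neighbors, and only those closer than 1.0. Missing
# neighbors are reported with distance inf and index len(data).
d3, i3 = T.query(xs, k=2, distance_upper_bound=1.0)

# All neighbors within radius 1.0, one list of indices per query point.
within = T.query_ball_point(xs, r=1.0)

print(indices, i3, within)
```

Combining `k` with `distance_upper_bound` covers the case raised later in the thread, where both a neighbor count and a distance cutoff are wanted at once.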

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Anne Archibald
2008/9/30 Peter <[EMAIL PROTECTED]>: > On Tue, Sep 30, 2008 at 5:10 AM, Christopher Barker > <[EMAIL PROTECTED]> wrote: >> >> Anne Archibald wrote: >>> I suggest the creation of >>> a new submodule of scipy, scipy.spatial, >> >> +1 >> >> Here's one to consider: >> http://pypi.python.org/pypi/Rtree

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Peter
On Tue, Sep 30, 2008 at 5:10 AM, Christopher Barker <[EMAIL PROTECTED]> wrote: > > Anne Archibald wrote: >> I suggest the creation of >> a new submodule of scipy, scipy.spatial, > > +1 > > Here's one to consider: > http://pypi.python.org/pypi/Rtree > and perhaps other stuff from: > http://trac.gisp

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-29 Thread Charles R Harris
On Mon, Sep 29, 2008 at 9:24 PM, Anne Archibald <[EMAIL PROTECTED]> wrote: > Hi, > > Once again there has been a thread on the numpy/scipy mailing lists > requesting (essentially) some form of spatial data structure. Pointers > have been posted to ANN (sadly LGPLed and in C++) as well as a handful

Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-29 Thread Christopher Barker
Anne Archibald wrote: > I suggest the creation of > a new submodule of scipy, scipy.spatial, +1 Here's one to consider: http://pypi.python.org/pypi/Rtree and perhaps other stuff from: http://trac.gispython.org/spatialindex/wiki which I think is LGPL -- can scipy use that? By the way, a gener

[Numpy-discussion] Proposal: scipy.spatial

2008-09-29 Thread Anne Archibald
Hi, Once again there has been a thread on the numpy/scipy mailing lists requesting (essentially) some form of spatial data structure. Pointers have been posted to ANN (sadly LGPLed and in C++) as well as a handful of pure-python implementations of kd-trees. I suggest the creation of a new submodul