Shailendra <shailendra.vikas <at> gmail.com> writes: > > Hi All, > I want to make a function which should be like this > <code> > cordinates1=(x1,y1) # x1 and y1 are x-cord and y-cord of a large > number of points > cordinates2=(x2,y2) # similar to condinates1 > indices1,indices2= match_cordinates(cordinates1,cordinates2) > <code> > (x1[indices1],y1[indices1]) "matches" (x2[indices2],y2[indices2]) > > where definition of "match" is such that : > If A is closest point to B and distance between A and B is less that > delta than it is a "match". > If A is closest point to B and distance between A and B is more that > delta than there is no match. > Every point has either 1 "match"(closest point) or none
This logic is problematic in general case. See below. You may need to be able to handle several pairs of 'closest points'! > > Also, the size of the cordinates1 and cordinates2 are quite large and > "outer" should not be used. I can think of only C style code to > achieve this. Can any one suggest pythonic way of doing this? > > Thanks, > Shailendra > This is straightforward implementation as a starting point. eat <code> import numpy as np def dist(p1, p2): return np.sqrt(np.sum((p1- p2)** 2, 0)) def cdist(p1, p2, trh): """Expects 2d arrays p1 and p2, with combatible first dimesions and a threshold. Returns indicies of points close to each other -ind[:, 0], array of p1 indicies -ind[:, 1], array of p2 indicies -ambi, list of list of ambiquous situations (where more than 1 pair of points are 'equally close') The indicies are aranged such that dist(p1[:, ind[k, 0]], p2[:, ind[k, 1]])< trh is true for all k. """ ind= [] ambi= [] for k in range(p2.shape[1]): d= dist(p1, p2[:, None, k]) i= np.where(d< trh)[0] if 0< len(i): m= np.where(d[i]== d[i].min())[0] # problematic i= i[m].tolist() ind.append([i[0], k]) if 1< len(m): ambi.append([ind[-1], i]) return np.array(ind), ambi if __name__ == '__main__': n= 10 trh= 2e-1 p1= np.round(np.random.rand(2, n), 1) p2= np.round(p1+ 1e-1* np.random.randn(2, n), 1) ind, ambi= cdist(p1, p2, trh) print 'points close to each other:' if 0< len(ind): print 'p1:' print p1[:, ind[:, 0]], ind[:, 0] print 'p2:' print p2[:, ind[:, 1]], ind[:, 1] print 'valid:' print dist(p1[:, ind[:, 0]], p2[:, ind[:, 1]])< trh print 'with ambiguous situation(s):' if ambi: print ambi else: print 'None' else: print 'None' import timeit n= 1e2 trh= 2e-1 rep= 5 p1= np.random.rand(2, 1e3* n) p2= np.random.randn(2, n) def perf(): ind, ambi= cdist(p1, p2, trh) print 'performance:' t= np.array(timeit.repeat(perf, repeat= rep, number= 1))/ rep print 'min: ', t.min(), 'mean: ', t.mean(), 'max: ', t.max() <code> _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion