Re: [Numpy-discussion] Optimizing similarity matrix algorithm

2007-09-12 Thread Kurdt Bane
Er.. obviously, when I wrote "scan the similarity algorithm and find all the diagonals", I meant "scan the similarity matrix and find all the diagonals". "Similarity matrix" should really be called "Equality matrix", as I imagine it as a matrix with dimensions len(a) x len(b) where M[x][y] = (a[x

[Numpy-discussion] Optimizing similarity matrix algorithm

2007-09-12 Thread Kurdt Bane
Hi to all! For reverse engineering purposes, I need to find where every possible chunk of bytes in file A is contained in file B. Obviously, if a chunk of length n is contained in B, I don't want my script to also recognize all the subchunks of size < n contained in that chunk. I coded a naive impl
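
The "equality matrix" idea from the follow-up message can be sketched roughly as below. This is only an illustration, not the poster's implementation (the original code is cut off in this archive). It assumes the truncated definition reads M[x][y] = (a[x] == b[y]); the function names `equality_matrix` and `maximal_diagonal_runs` and the `min_len` parameter are hypothetical. A chunk shared by both inputs shows up as a run of True values along a diagonal, and keeping only maximal runs avoids reporting the shorter subchunks inside it.

```python
def equality_matrix(a: bytes, b: bytes):
    """Build M with M[x][y] True when a[x] == b[y] (assumed definition)."""
    # With NumPy arrays the same matrix could be built by broadcasting:
    #   a_arr[:, None] == b_arr[None, :]
    return [[x == y for y in b] for x in a]


def maximal_diagonal_runs(m, min_len=1):
    """Return (start_in_a, start_in_b, length) for every maximal run of
    True values along a top-left to bottom-right diagonal of m."""
    rows = len(m)
    cols = len(m[0]) if rows else 0
    runs = []
    # Each diagonal is identified by its offset d = y - x.
    for d in range(-(rows - 1), cols):
        x = max(0, -d)
        y = x + d
        length = 0
        while x < rows and y < cols:
            if m[x][y]:
                length += 1
            else:
                # A False cell ends the current run, if there was one.
                if length >= min_len:
                    runs.append((x - length, y - length, length))
                length = 0
            x += 1
            y += 1
        # A run may extend to the edge of the matrix.
        if length >= min_len:
            runs.append((x - length, y - length, length))
    return runs


if __name__ == "__main__":
    a = b"abcd"
    b = b"xxabcxbcd"
    for start_a, start_b, length in maximal_diagonal_runs(equality_matrix(a, b)):
        print(start_a, start_b, a[start_a:start_a + length])
```

Note that this sketch materializes the full len(a) x len(b) matrix, so memory grows with the product of the two input sizes; it is meant to show the diagonal scan, not to be an optimized solution for large files.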