Re: [Numpy-discussion] Overlapping ranges

2009-03-16 Thread josef . pktd
On Mon, Mar 16, 2009 at 5:29 PM, Robert Kern wrote:
> 2009/3/16 Peter Saffrey:
>
>> At the moment, I'm using a fairly naive approach that finds roughly in the
>> genome (which gene) each point might be and then checking it against the
>> bins in that gene. If I split the problem into chromosomes,

Re: [Numpy-discussion] Overlapping ranges

2009-03-16 Thread Robert Kern
2009/3/16 Peter Saffrey:
> At the moment, I'm using a fairly naive approach that finds roughly in the
> genome (which gene) each point might be and then checking it against the
> bins in that gene. If I split the problem into chromosomes, I feel sure
> there must be some super-fast matrix approac

[Numpy-discussion] Overlapping ranges

2009-03-16 Thread Peter Saffrey
I'm trying to file a set of data points, defined by genome coordinates, into bins, also based on genome coordinates. Each data point is (chromosome, start, end, point) and each bin is (chromosome, start, end). I have about 140 million points to file into around 100,000 bins. Both are (roughly)
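For readers following the thread, here is a minimal sketch of the binning problem as stated above. It is not the approach anyone in the thread proposed. It assumes the work has already been split per chromosome, that the bins on one chromosome are sorted by start and do not overlap, and that a point belongs to a bin only when its whole (start, end) interval lies inside the bin. The function name `assign_bins` and the sample coordinates are hypothetical.

```python
import numpy as np

def assign_bins(bin_starts, bin_ends, pt_starts, pt_ends):
    """For each point, return the index of the bin containing it, or -1.

    Assumes (not stated in the thread): one chromosome at a time, at least
    one bin, bins sorted by start and non-overlapping, and "contained"
    meaning bin_start <= pt_start and pt_end <= bin_end.
    """
    bin_starts = np.asarray(bin_starts)
    bin_ends = np.asarray(bin_ends)
    pt_starts = np.asarray(pt_starts)
    pt_ends = np.asarray(pt_ends)

    # Index of the last bin whose start is <= the point's start
    # (-1 when the point starts before the first bin).
    idx = np.searchsorted(bin_starts, pt_starts, side="right") - 1

    # Clip so the lookup below is always a valid index.
    safe = np.clip(idx, 0, len(bin_starts) - 1)

    # The point is inside only if a candidate bin exists and the point
    # ends no later than that bin does.
    inside = (idx >= 0) & (pt_ends <= bin_ends[safe])
    return np.where(inside, safe, -1)

# Hypothetical coordinates for a single chromosome.
bin_starts = np.array([100, 500, 900])
bin_ends = np.array([200, 650, 1000])
pt_starts = np.array([120, 190, 510, 700, 50])
pt_ends = np.array([130, 210, 520, 710, 60])

result = assign_bins(bin_starts, bin_ends, pt_starts, pt_ends)
print(result)
```

The lookup is one binary search per point, so the cost grows as (number of points) x log(number of bins) per chromosome rather than points x bins. If bins on a chromosome can overlap, this single-candidate lookup is not sufficient and a different structure would be needed.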