Vector based queries

2012-03-10 Thread Pat Ferrel
I have a case where I'd like to get the documents that most closely match a particular vector. Mahout's RowSimilarityJob is ideal for precalculating similarity between existing documents, but in my case the query is constructed at run time: the UI constructs a vector to be used as the query.
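To make the question concrete, here is a minimal brute-force sketch of ranking documents against a vector built at run time. It assumes sparse vectors stored as plain `{term: weight}` dicts and cosine similarity as the measure; the names `cosine`, `rank`, `docs`, and `query` are illustrative only and are not part of Mahout or Lucene.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors given as {term: weight} dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)

def rank(query, docs, top_n=10):
    """Score every document against the query vector and return the best top_n."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Hypothetical document vectors (for example, TF-IDF weights per term).
docs = {
    "doc1": {"mahout": 1.0, "vector": 2.0},
    "doc2": {"lucene": 1.0, "query": 1.0},
    "doc3": {"vector": 1.0, "query": 1.0},
}

# The query vector is built at run time, as the UI would do.
query = {"vector": 1.0, "query": 1.0}
results = rank(query, docs, top_n=2)
```

This scans every document for each query, so it only illustrates the scoring; a large collection would need an index rather than a linear scan.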

Re: Vector based queries

2012-03-11 Thread Pat Ferrel
g for or? paul PS I've always viewed queries as linear forms on the vector space and I'd like to see this really mathematically written one day... On 11 March 2012 at 07:23, Lance Norskog wrote: Look at the MoreLikeThis feature in Lucene. I believe it does roughly what you describe. O

Re: Vector based queries

2012-03-11 Thread Pat Ferrel
this is a little slower than a 2-3 word query but still scalable. Has anyone used this on a very large index? Thanks, Pat On 3/11/12 10:45 AM, Pat Ferrel wrote: MoreLikeThis looks exactly like what I need. I would probably create a new "like" method to take a Mahout vector and build a
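One way the vector-to-query idea might look, as a rough sketch rather than the approach actually taken in this thread: keep the highest-weighted terms of the vector and emit them as an OR query in Lucene's classic query syntax, using `^` to carry each weight as a boost. The function name `vector_to_query`, the field name `text`, and the `max_terms` cutoff are all hypothetical, and the sketch assumes terms contain no characters that need escaping in the query syntax.

```python
def vector_to_query(vector, field="text", max_terms=25):
    """Turn a {term: weight} vector into a boosted OR query string.

    Only the max_terms heaviest terms are kept, which bounds query cost
    in the same spirit as limiting the number of query terms.
    """
    top = sorted(vector.items(), key=lambda pair: pair[1], reverse=True)[:max_terms]
    clauses = [
        f"{field}:{term}^{weight:.2f}"
        for term, weight in top
        if weight > 0  # zero or negative weights contribute nothing useful
    ]
    return " OR ".join(clauses)

# Hypothetical query vector built by the UI at run time.
query_string = vector_to_query(
    {"mahout": 2.0, "vector": 0.5, "noise": 0.0},
    field="body",
    max_terms=2,
)
```

Capping the number of terms is the main lever on speed here: a query with many boosted terms touches more posting lists than a 2-3 word query does.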