tf.idf was invented because cosine similarity takes too much computation. 
tf.idf gives similar results much, much faster than cosine similarity.

I would expect cosine similarity to be slow. I would also expect retrieving 1 
million records to be slow. Doing both of those in one minute is pretty good.
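
To make the cost difference concrete, here is a minimal sketch in Python. It is a toy in-memory model, not Solr or Lucene code, and the function names (cosine_scan, build_index, tfidf_search) are made up for illustration. It assumes the plugin reads a term vector for every record, as described in the original question. The brute-force scan touches every document, while the tf.idf lookup only walks the posting lists for the query terms.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (dicts)."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_scan(query, docs):
    """Brute force: build and score a term vector for every document."""
    q = Counter(query.split())
    scored = ((cosine(q, Counter(text.split())), doc_id)
              for doc_id, text in docs.items())
    # Work done is proportional to the total size of the collection.
    return sorted((pair for pair in scored if pair[0] > 0), reverse=True)


def build_index(docs):
    """Inverted index built once, up front: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.split()).items():
            index[term][doc_id] = tf
    return index


def tfidf_search(query, index, num_docs):
    """Score only documents found in the query terms' posting lists."""
    scores = defaultdict(float)
    for term in set(query.split()):
        postings = index.get(term, {})
        if not postings:
            continue
        # Plain tf * log(N / df) weighting; real engines use tuned variants.
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(((s, d) for d, s in scores.items()), reverse=True)
```

Given a small dict of documents, cosine_scan(query, docs) and tfidf_search(query, build_index(docs), len(docs)) both return (score, doc_id) pairs, best first. The scores differ, but the second function never looks at a document that contains none of the query terms, which is where the speedup comes from on a large index.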

As Kernighan and Plauger said in 1978, "Don't diddle code to make it 
faster -- find a better algorithm."

https://en.wikipedia.org/wiki/The_Elements_of_Programming_Style

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Aug 11, 2019, at 10:40 AM, Doug Turnbull 
> <dturnb...@opensourceconnections.com> wrote:
> 
> Hi Vignan,
> 
> We need to see more details / code of what your query parser plugin does
> exactly with term vectors, we can't really help you without more details.
> Is it open source? Can you share a minimal example that recreates the
> problem?
> 
> On Sun, Aug 11, 2019 at 1:19 PM Vignan Malyala <dsmsvig...@gmail.com> wrote:
> 
>> Hi guys,
>> 
>> I made my custom qparser plugin in Solr for scoring. The plugin only does
>> cosine similarity of vectors for each record. I use term vectors here.
>> Results are fine!
>> 
>> BUT, Solr response is very slow with term vectors. It takes around 55
>> seconds for each request for 1000000 records.
>> How do I make it faster, so that I get my results in milliseconds?
>> Please respond soon, as it's a little urgent.
>> 
>> Note: All my values are stored and indexed. I am not using Solr Cloud.
>> 
> 
> 
> -- 
> *Doug Turnbull **| CTO* | OpenSource Connections
> <http://opensourceconnections.com>, LLC | 240.476.9983
> Author: Relevant Search <http://manning.com/turnbull>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
