Re: Solr is very slow with term vectors

Jan Høydahl Fri, 16 Aug 2019 09:23:45 -0700

I bet your main issue is assuming that this particular plugin is the only way 
to solve your ranking requirements.
I would advise you to start by looking into the various built-in Similarities 
and try tweaking one of those, and/or adding more ranking signals to your 
solution. Perhaps also check whether ReRanking the top 1000 hits is good 
enough. Not knowing anything about what led you to that poorly performing 
custom third-party plugin in the first place, it is hard to guess, but take 
10 steps back and reconsider that choice.
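
To make the ReRanking suggestion concrete, here is a minimal sketch in Python 
of the request parameters involved. It assumes a cheap main query selects the 
candidates and the expensive scoring query runs only on the top hits via 
Solr's ReRank query parser (rq, reRankQuery, reRankDocs, reRankWeight). The 
function name, the example field, and the {!customvec ...} syntax are 
hypothetical placeholders, not the plugin's actual API.

```python
from urllib.parse import urlencode


def build_rerank_params(main_query, expensive_query, rerank_docs=1000,
                        rerank_weight=1.0, rows=10):
    """Build Solr request parameters that rescore only the top hits.

    main_query      -- cheap query that selects and roughly orders candidates
    expensive_query -- costly scoring query, applied to the top rerank_docs only
    """
    return {
        "q": main_query,
        # ReRank query parser: rescore the top N hits of q with $rqq.
        "rq": "{!rerank reRankQuery=$rqq reRankDocs=%d reRankWeight=%s}"
              % (rerank_docs, rerank_weight),
        "rqq": expensive_query,
        "rows": rows,
        "fl": "id,score",
    }


# Hypothetical queries; substitute whatever your schema and parser accept.
params = build_rerank_params("title:solr", "{!customvec f=vector}")
print("/select?" + urlencode(params))
```

With this shape the costly scoring touches at most 1000 documents per 
request instead of all 1M, at the price of missing documents that the main 
query ranks below the cutoff.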


--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 16 Aug 2019, at 15:50, Jörn Franke <[email protected]> wrote:
> 
> You would have to implement that yourself. I don’t think Solr magically 
> multithreads the query parser for you, but maybe some people have more 
> insight on this topic.
> 
>> On 16.08.2019, at 15:42, Vignan Malyala <[email protected]> wrote:
>> 
>> How do I check that in Solr? Can anyone share a link on implementing
>> threads in Solr?
>> 
>>> On Fri 16 Aug, 2019, 4:52 PM Jörn Franke, <[email protected]> wrote:
>>> 
>>> Is your custom query parser multithreaded, and does it leverage all cores?
>>> 
>>>> On 16.08.2019, at 13:12, Vignan Malyala <[email protected]> wrote:
>>>> 
>>>> I want a response time below 3 seconds.
>>>> And FYI, I'm already using 32 cores.
>>>> My cache is already full too, and obviously the same requests don't
>>>> recur in my case.
>>>> 
>>>> 
>>>>> On Fri 16 Aug, 2019, 11:47 AM Jörn Franke, <[email protected]> wrote:
>>>>> 
>>>>> How much response time do you require?
>>>>> I think you have to solve the issue in your code by introducing higher
>>>>> parallelism during calculation and potentially more cores.
>>>>> 
>>>>> Maybe you can also precalculate what you do, cache it, and use the
>>>>> precalculated values at request time.
>>>>> 
>>>>>> On 16.08.2019, at 05:08, Vignan Malyala <[email protected]> wrote:
>>>>>> 
>>>>>> Hi
>>>>>> Any solution for this? It takes around 50 seconds to get a response.
>>>>>> 
>>>>>>> On Mon 12 Aug, 2019, 3:28 PM Vignan Malyala, <[email protected]> wrote:
>>>>>>> 
>>>>>>> Hi Doug / Walter,
>>>>>>> 
>>>>>>> I'm just using this methodology.
>>>>>>> Please find below a link to my sample code.
>>>>>>> https://github.com/saaay71/solr-vector-scoring
>>>>>>> 
>>>>>>> The only issue is the response time for 1M records.
>>>>>>> 
>>>>>>> On Mon, Aug 12, 2019 at 12:24 AM Walter Underwood <[email protected]> wrote:
>>>>>>> 
>>>>>>>> tf.idf was invented because cosine similarity is too much computation.
>>>>>>>> tf.idf gives similar results much, much faster than cosine distance.
>>>>>>>> 
>>>>>>>> I would expect cosine similarity to be slow. I would also expect
>>>>>>>> retrieving 1 million records to be slow. Doing both of those in one
>>>>>>>> minute is pretty good.
>>>>>>>> 
>>>>>>>> As Kernighan and Plauger said in 1978, “Don’t diddle code to make it
>>>>>>>> faster—find a better algorithm.”
>>>>>>>> 
>>>>>>>> https://en.wikipedia.org/wiki/The_Elements_of_Programming_Style
>>>>>>>> 
>>>>>>>> wunder
>>>>>>>> Walter Underwood
>>>>>>>> [email protected]
>>>>>>>> http://observer.wunderwood.org/  (my blog)
>>>>>>>> 
>>>>>>>>> On Aug 11, 2019, at 10:40 AM, Doug Turnbull <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Vignan,
>>>>>>>>> 
>>>>>>>>> We need to see more details / code of what your query parser plugin
>>>>>>>>> does exactly with term vectors; we can't really help you without more
>>>>>>>>> details.
>>>>>>>>> Is it open source? Can you share a minimal example that recreates the
>>>>>>>>> problem?
>>>>>>>>> 
>>>>>>>>> On Sun, Aug 11, 2019 at 1:19 PM Vignan Malyala <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi guys,
>>>>>>>>>> 
>>>>>>>>>> I made my custom qparser plugin in Solr for scoring. The plugin only
>>>>>>>>>> does cosine similarity of vectors for each record. I use term vectors
>>>>>>>>>> here.
>>>>>>>>>> Results are fine!
>>>>>>>>>> 
>>>>>>>>>> BUT, Solr response is very slow with term vectors. It takes around 55
>>>>>>>>>> seconds for each request for 1000000 records.
>>>>>>>>>> How do I make it faster to get my results in ms?
>>>>>>>>>> Please respond soon as it's a little urgent.
>>>>>>>>>> 
>>>>>>>>>> Note: All my values are stored and indexed. I am not using Solr Cloud.
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> *Doug Turnbull **| CTO* | OpenSource Connections
>>>>>>>>> <http://opensourceconnections.com>, LLC | 240.476.9983
>>>>>>>>> Author: Relevant Search <http://manning.com/turnbull>
>>>>>>>>> This e-mail and all contents, including attachments, is considered to
>>>>>>>>> be Company Confidential unless explicitly stated otherwise, regardless
>>>>>>>>> of whether attachments are marked as such.
>>>>>>>> 
>>>>>>>> 
>>>>> 
>>>
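
On the precalculation idea raised earlier in the thread, one possible reading 
is to normalize every document vector once at index time, so that cosine 
similarity at query time reduces to a plain dot product. The sketch below is 
plain Python with made-up vectors and hypothetical names (normalize, dot, 
rank); it only illustrates the arithmetic and is not how the linked plugin 
works.

```python
import math


def normalize(vec):
    """Scale vec to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Index time: pay the normalization cost once per document.
doc_vectors = {"doc1": [1.0, 0.0], "doc2": [1.0, 1.0], "doc3": [0.0, 2.0]}
unit_vectors = {doc_id: normalize(v) for doc_id, v in doc_vectors.items()}


def rank(query_vec, unit_vectors):
    """Return (score, doc_id) pairs sorted by cosine similarity, best first."""
    q = normalize(query_vec)
    # Both sides are unit length, so the dot product equals the cosine.
    scored = [(dot(q, v), doc_id) for doc_id, v in unit_vectors.items()]
    return sorted(scored, reverse=True)


print(rank([1.0, 0.0], unit_vectors))
```

This removes the per-document norm computation from every request, but it 
still scores every document, so it complements rather than replaces limiting 
the candidate set.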
