First, time fetching one million records with all the fields you need, both for display and for re-ranking. If that is slow, then no amount of cosine code tweaking will make it fast.
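A rough way to take that measurement is to time the fetch and the scoring as two separate phases. The sketch below is only an illustration in Python, not code from the plugin: `fake_fetch` is a made-up stand-in for whatever actually pulls the records and their vectors out of Solr, and `time_phases` and `cosine` are hypothetical helper names.

```python
import math
import time
from typing import Callable, Iterable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def time_phases(
    fetch: Callable[[], Iterable[List[float]]],
    query: Sequence[float],
) -> Tuple[int, float, float]:
    """Return (record count, seconds spent fetching, seconds spent scoring)."""
    start = time.perf_counter()
    vectors = list(fetch())  # materialize so the whole fetch cost is paid here
    fetched = time.perf_counter()
    for vector in vectors:
        cosine(query, vector)
    scored = time.perf_counter()
    return len(vectors), fetched - start, scored - fetched


def fake_fetch(n: int = 1000, dim: int = 8) -> Iterable[List[float]]:
    """Hypothetical stand-in for the real fetch; yields n synthetic vectors."""
    for i in range(n):
        yield [float((i + j) % 7) for j in range(dim)]


count, fetch_s, score_s = time_phases(fake_fetch, [1.0] * 8)
print(f"{count} records: fetch {fetch_s:.4f}s, score {score_s:.4f}s")
```

With the real fetch swapped in for `fake_fetch`, the two numbers show which phase dominates before anyone touches the cosine code.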
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Aug 16, 2019, at 9:23 AM, Jan Høydahl <jan....@cominvent.com> wrote:
>
> I bet your main issue is assuming that this particular plugin is the only way
> to solve your ranking requirements.
> I would advise you to start looking into the various built-in Similarities
> and instead try to tweak one of those, and/or add more ranking signals to
> your solution, perhaps see if ReRanking on top 1000 hits is good enough etc.
> Not knowing anything about what led you to that custom bad-performing 3rd
> party plugin in the first place, it is hard to guess, but take 10 steps back
> and re-consider that choice.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
>> On 16 Aug 2019 at 15:50, Jörn Franke <jornfra...@gmail.com> wrote:
>>
>> You would have to implement that. I don't think that Solr is threading the
>> query parser magically for you, but maybe some people have more insight on
>> this topic.
>>
>>> On 16.08.2019 at 15:42, Vignan Malyala <dsmsvig...@gmail.com> wrote:
>>>
>>> How do I check that in Solr? Can anyone share a link on the implementation
>>> of threads in Solr?
>>>
>>>> On Fri 16 Aug, 2019, 4:52 PM Jörn Franke, <jornfra...@gmail.com> wrote:
>>>>
>>>> Is your custom query parser multithreaded, and does it leverage all cores?
>>>>
>>>>> On 16.08.2019 at 13:12, Vignan Malyala <dsmsvig...@gmail.com> wrote:
>>>>>
>>>>> I want response time below 3 seconds.
>>>>> And fyi I'm already using 32 cores.
>>>>> My cache is already full too and obviously same requests don't occur in
>>>>> my case.
>>>>>
>>>>>> On Fri 16 Aug, 2019, 11:47 AM Jörn Franke, <jornfra...@gmail.com> wrote:
>>>>>>
>>>>>> How much response time do you require?
>>>>>> I think you have to solve the issue in your code by introducing higher
>>>>>> parallelism during calculation and potentially more cores.
>>>>>>
>>>>>> Maybe you can also precalculate what you do, cache it, and use the
>>>>>> precalculated values during the request.
>>>>>>
>>>>>>> On 16.08.2019 at 05:08, Vignan Malyala <dsmsvig...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi
>>>>>>> Any solution for this? Taking around 50 seconds to get response.
>>>>>>>
>>>>>>>> On Mon 12 Aug, 2019, 3:28 PM Vignan Malyala, <dsmsvig...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Doug / Walter,
>>>>>>>>
>>>>>>>> I'm just using this methodology.
>>>>>>>> PFB link of my sample code.
>>>>>>>> https://github.com/saaay71/solr-vector-scoring
>>>>>>>>
>>>>>>>> The only issue is speed of response for 1M records.
>>>>>>>>
>>>>>>>> On Mon, Aug 12, 2019 at 12:24 AM Walter Underwood <wun...@wunderwood.org> wrote:
>>>>>>>>
>>>>>>>>> tf.idf was invented because cosine similarity is too much computation.
>>>>>>>>> tf.idf gives similar results much, much faster than cosine distance.
>>>>>>>>>
>>>>>>>>> I would expect cosine similarity to be slow. I would also expect
>>>>>>>>> retrieving 1 million records to be slow. Doing both of those in one
>>>>>>>>> minute is pretty good.
>>>>>>>>>
>>>>>>>>> As Kernighan and Plauger said in 1978, "Don't diddle code to make it
>>>>>>>>> faster--find a better algorithm."
>>>>>>>>>
>>>>>>>>> https://en.wikipedia.org/wiki/The_Elements_of_Programming_Style
>>>>>>>>>
>>>>>>>>> wunder
>>>>>>>>> Walter Underwood
>>>>>>>>> wun...@wunderwood.org
>>>>>>>>> http://observer.wunderwood.org/ (my blog)
>>>>>>>>>
>>>>>>>>>> On Aug 11, 2019, at 10:40 AM, Doug Turnbull <dturnb...@opensourceconnections.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Vignan,
>>>>>>>>>>
>>>>>>>>>> We need to see more details / code of what your query parser plugin
>>>>>>>>>> does exactly with term vectors, we can't really help you without more
>>>>>>>>>> details.
>>>>>>>>>> Is it open source? Can you share a minimal example that recreates the
>>>>>>>>>> problem?
>>>>>>>>>>
>>>>>>>>>> On Sun, Aug 11, 2019 at 1:19 PM Vignan Malyala <dsmsvig...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi guys,
>>>>>>>>>>>
>>>>>>>>>>> I made my custom qparser plugin in Solr for scoring. The plugin only
>>>>>>>>>>> does cosine similarity of vectors for each record. I use term vectors
>>>>>>>>>>> here.
>>>>>>>>>>> Results are fine!
>>>>>>>>>>>
>>>>>>>>>>> BUT, Solr response is very slow with term vectors. It takes around 55
>>>>>>>>>>> seconds for each request for 1000000 records.
>>>>>>>>>>> How do I make it faster to get my results in ms?
>>>>>>>>>>> Please respond soon as it's a little urgent.
>>>>>>>>>>>
>>>>>>>>>>> Note: All my values are stored and indexed. I am not using Solr
>>>>>>>>>>> Cloud.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Doug Turnbull | CTO | OpenSource Connections
>>>>>>>>>> <http://opensourceconnections.com>, LLC | 240.476.9983
>>>>>>>>>> Author: Relevant Search <http://manning.com/turnbull>
>>>>>>>>>> This e-mail and all contents, including attachments, is considered to
>>>>>>>>>> be Company Confidential unless explicitly stated otherwise, regardless
>>>>>>>>>> of whether attachments are marked as such.