Highlighting Performance improvement suggestions required - Solr 6.5.1

2017-08-09 Thread sasarun
Hi All, 

I found quite a few discussions on highlighting performance issues.
I tried most of the suggestions, but none of them improved performance;
if anything, it got worse.
The index is currently small, at about 922 records, but the field on which
highlighting is done holds very large values. Queries with highlighting are
slow, with 85-90% of the time spent on highlighting.
The fieldType configuration in my schema.xml is as below:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">

The query used in Solr is:

hl=true&hl.fl=customContent&hl.fragsize=500&hl.simple.pre=&hl.simple.post=&hl.snippets=1&hl.method=unified&hl.bs.type=SENTENCE&hl.fragListBuilder=simple&hl.maxAnalyzedChars=214748364&facet=true&facet.mincount=1&facet.limit=-1&facet.sort=count&debug=timing&facet.field=contentSpecific
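
For readability, the request parameters above can be split into name/value pairs. This is only a sketch using Python's standard library; the query string is copied from the message with the wrapped line re-joined, and nothing here talks to Solr.

```python
from urllib.parse import parse_qs

# Query string from the message, with the wrapped line re-joined.
query = (
    "hl=true&hl.fl=customContent&hl.fragsize=500&hl.simple.pre=&hl.simple.post="
    "&hl.snippets=1&hl.method=unified&hl.bs.type=SENTENCE&hl.fragListBuilder=simple"
    "&hl.maxAnalyzedChars=214748364&facet=true&facet.mincount=1&facet.limit=-1"
    "&facet.sort=count&debug=timing&facet.field=contentSpecific"
)

# keep_blank_values=True preserves the empty hl.simple.pre / hl.simple.post entries.
params = {k: v[0] for k, v in parse_qs(query, keep_blank_values=True).items()}

for name, value in params.items():
    print(f"{name:24} {value!r}")
```

Listing the parameters this way makes it easier to see which highlighter settings are actually being sent with each request.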

Also note that we tried the FastVectorHighlighter too, but the results were
no better. When we tried hl.offsetSource="term_vectors" with the unified
highlighter, results came back in about half a second, but without any
highlight snippets.

The timing debug output returned by Solr for one such query is shared below for reference:

time=8833.0,
prepare={time=0.0,query={time=0.0},facet={time=0.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=0.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},
process={time=8826.0,query={time=867.0},facet={time=2.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=7953.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},
loadFieldValues={time=28.0}}
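
The 85-90% figure can be checked directly against this output. The sketch below parses only the "process" section quoted above (times in milliseconds) and computes the share taken by the highlight component; the regular expression is an illustrative way of reading this particular text format, not a Solr API.

```python
import re

# "process" section of the debug=timing output quoted above (times in ms).
process = (
    "process={time=8826.0,query={time=867.0},facet={time=2.0},"
    "facet_module={time=0.0},mlt={time=0.0},highlight={time=7953.0},"
    "stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}}"
)

# Each component appears as name={time=<ms>...}; collect them into a dict.
timings = {name: float(ms) for name, ms in re.findall(r"(\w+)=\{time=([\d.]+)", process)}

# The first entry is the total for the whole process phase.
total = timings.pop("process")
highlight_share = 100 * timings["highlight"] / total
print(f"highlight: {highlight_share:.1f}% of {total:.0f} ms")
```

For the numbers in this message the highlight component accounts for roughly 90% of the process time, with the query itself taking 867 ms.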

Any suggestions to improve the performance would be of great help.

Thanks, 
Arun



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Highlighting-Performance-improvement-suggestions-required-Solr-6-5-1-tp4349767.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Highlighting Performance improvement suggestions required - Solr 6.5.1

2017-08-09 Thread sasarun
Hi Amrit, 

Thanks for the response. I did go through both, and that is how I ended up
with the unified method for the highlighter.

Thanks,
Arun





Solr performance issue on querying --> Solr 6.5.1

2017-09-26 Thread sasarun
Hi All, 
I have been using Solr for some time now, but mostly in standalone mode. My
current project uses Solr 6.5.1 hosted on Hadoop. My solrconfig.xml
has the following configuration. In the prod environment, query performance
seems to be really slow. Can anyone help me with a few pointers on
how to improve it?


${solr.hdfs.home:}
${solr.hdfs.blockcache.enabled:true}
${solr.hdfs.blockcache.slab.count:1}
${solr.hdfs.blockcache.direct.memory.allocation:false}
${solr.hdfs.blockcache.blocksperbank:16384}
${solr.hdfs.blockcache.read.enabled:true}
${solr.hdfs.blockcache.write.enabled:false}
${solr.hdfs.nrtcachingdirectory.enable:true}
${solr.hdfs.nrtcachingdirectory.maxmergesizemb:16}
${solr.hdfs.nrtcachingdirectory.maxcachedmb:192}

hdfs
It has 6 collections of the following sizes:
Collection 1 --> 6.41 MB
Collection 2 --> 634.51 KB
Collection 3 --> 4.59 MB
Collection 4 --> 1,020.56 MB
Collection 5 --> 607.26 MB
Collection 6 --> 102.4 KB
Each collection has 5 shards. The heap allocated to the young generation
is about 8 GB and to the old generation about 24 GB, and GC analysis showed
that peak utilisation is really low compared to these values.
But queries against Collection 4 and Collection 5 get really slow responses
even though we are not using any complex queries. The output of debug queries
run with debug=timing is given below for reference. Can anyone suggest a way
to improve the performance?
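
As a rough back-of-the-envelope check, the HDFS block cache implied by the defaults quoted above can be compared with the total index size. This is a sketch under stated assumptions: it assumes 8 KB cache blocks (so that one slab of 16384 blocks is 128 MB, as described in the Solr reference guide) and that the defaults shown in the config are not overridden by system properties.

```python
# Index sizes as listed in the message, converted to MB.
collections_mb = [6.41, 634.51 / 1024, 4.59, 1020.56, 607.26, 102.4 / 1024]
total_index_mb = sum(collections_mb)

# Defaults from the solrconfig.xml placeholders quoted above.
slab_count = 1
blocks_per_bank = 16384
block_size_bytes = 8192  # assumption: 8 KB cache blocks

cache_mb = slab_count * blocks_per_bank * block_size_bytes / (1024 * 1024)
print(f"index total: {total_index_mb:.1f} MB, block cache: {cache_mb:.0f} MB")
print(f"cache covers about {100 * cache_mb / total_index_mb:.0f}% of the index")
```

Under these assumptions the block cache is much smaller than the two large collections, regardless of how much JVM heap is allocated.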

Response to query


zkConnected=true, status=0, QTime=3962


("hybrid electric powerplant" "hybrid electric powerplants" "Electric"
"Electrical" "Electricity" "Engine" "fuel economy" "fuel efficiency" "Hybrid
Electric Propulsion" "Power Systems" "Powerplant" "Propulsion" "hybrid"
"hybrid electric" "electric powerplant")

edismax
true
on

host
title
url
customContent
contentSpecificSearch


id
contentTagsCount

0
OR
OR
3985d7e2-3e54-48d8-8336-229e85f5d9de
600

("hybrid electric powerplant"^100.0 "hybrid electric powerplants"^100.0
"Electric"^50.0 "Electrical"^50.0 "Electricity"^50.0 "Engine"^50.0 "fuel
economy"^50.0 "fuel efficiency"^50.0 "Hybrid Electric Propulsion"^50.0
"Power Systems"^50.0 "Powerplant"^50.0 "Propulsion"^50.0 "hybrid"^15.0
"hybrid electric"^15.0 "electric powerplant"^15.0)





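time=15374.0,
prepare={time=2.0,query={time=2.0},facet={time=0.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=0.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},
process={time=15363.0,query={time=1313.0},facet={time=0.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=0.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=14048.0}}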
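
To see which component dominates, the process-phase values above can be paired with component names. The labels were lost from this response, so the sketch assumes the standard debug=timing component order (query, facet, facet_module, mlt, highlight, stats, expand, terms, debug); that mapping is an assumption, not something the response itself states.

```python
# Assumed component order for debug=timing output; the labels are not
# present in the response above, so this mapping is an assumption.
components = ["query", "facet", "facet_module", "mlt", "highlight",
              "stats", "expand", "terms", "debug"]

# "process" values from the response above, in order of appearance (ms).
process_total = 15363.0
process_values = [1313.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 14048.0]

process = dict(zip(components, process_values))
slowest = max(process, key=process.get)
share = 100 * process[slowest] / process_total
print(f"{slowest}: {process[slowest]:.0f} ms ({share:.1f}% of process time)")
```

If the assumed order holds, almost all of the process time is spent in the last component rather than in the query itself.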





Thanks,
Arun





Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-26 Thread sasarun
Hi Erick, 

Thank you for the quick response. Queries were relatively faster once the
data was read from memory, but I have always felt the response time could be
far better. As suggested, we will set up a non-HDFS environment and report
back on the results.

Thanks, 
Arun





Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-27 Thread sasarun
Hi Erick, 

QTime comes down with rows set to 1. We also noticed that QTime comes down
to about 900 when the debug parameter is not added to the query.

Thanks, 
Arun 





Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-27 Thread sasarun
Hi Emir, 

Please find below the response without the bq parameter and with debugQuery
set to true. We also noticed that QTime comes down drastically, to about
700-800, without the debug parameter.


zkConnected=true, status=0, QTime=3446


("hybrid electric powerplant" "hybrid electric powerplants" "Electric"
"Electrical" "Electricity" "Engine" "fuel economy" "fuel efficiency" "Hybrid
Electric Propulsion" "Power Systems" "Powerplant" "Propulsion" "hybrid"
"hybrid electric" "electric powerplant")

edismax
on

host
title
url
customContent
contentSpecificSearch


id
contentOntologyTagsCount

0
OR
3985d7e2-3e54-48d8-8336-229e85f5d9de
600
true


...



solr-prd-cluster-m-GooglePatent_shard4_replica2-1506504238282-20



35    159   GET_TOP_IDS           41294  ...
29    165   GET_TOP_IDS           40980  ...
31    200   GET_TOP_IDS           41006  ...
43    208   GET_TOP_IDS           41040  ...
181   466   GET_TOP_IDS           41138  ...

1518  1523  GET_FIELDS,GET_DEBUG  110    ...
1562  1573  GET_FIELDS,GET_DEBUG  115    ...
1793  1800  GET_FIELDS,GET_DEBUG  120    ...
2153  2161  GET_FIELDS,GET_DEBUG  125    ...
2957  2970  GET_FIELDS,GET_DEBUG  130    ...




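time=10302.0,
prepare={time=2.0,query={time=2.0},facet={time=0.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=0.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},
process={time=10288.0,query={time=661.0},facet={time=0.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=0.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=9627.0}}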




("hybrid electric powerplant" "hybrid electric powerplants" "Electric"
"Electrical" "Electricity" "Engine" "fuel economy" "fuel efficiency" "Hybrid
Electric Propulsion" "Power Systems" "Powerplant" "Propulsion" "hybrid"
"hybrid electric" "electric powerplant")


("hybrid electric powerplant" "hybrid electric powerplants" "Electric"
"Electrical" "Electricity" "Engine" "fuel economy" "fuel efficiency" "Hybrid
Electric Propulsion" "Power Systems" "Powerplant" "Propulsion" "hybrid"
"hybrid electric" "electric powerplant")


(+(DisjunctionMaxQuery((host:hybrid electric powerplant |
contentSpecificSearch:"hybrid electric powerplant" | customContent:"hybrid
electric powerplant" | title:hybrid electric powerplant | url:hybrid
electric powerplant)) DisjunctionMaxQuery((host:hybrid electric powerplants
| contentSpecificSearch:"hybrid electric powerplants" |
customContent:"hybrid electric powerplants" | title:hybrid electric
powerplants | url:hybrid electric powerplants))
DisjunctionMaxQuery((host:Electric | contentSpecificSearch:electric |
customContent:electric | title:Electric | url:Electric))
DisjunctionMaxQuery((host:Electrical | contentSpecificSearch:electrical |
customContent:electrical | title:Electrical | url:Electrical))
DisjunctionMaxQuery((host:Electricity | contentSpecificSearch:electricity |
customContent:electricity | title:Electricity | url:Electricity))
DisjunctionMaxQuery((host:Engine | contentSpecificSearch:engine |
customContent:engine | title:Engine | url:Engine))
DisjunctionMaxQuery((host:fuel economy | contentSpecificSearch:"fuel
economy" | customContent:"fuel economy" | title:fuel economy | url:fuel
economy)) DisjunctionMaxQuery((host:fuel efficiency |
contentSpecificSearch:"fuel efficiency" | customContent:"fuel efficiency" |
title:fuel efficiency | url:fuel efficiency))
DisjunctionMaxQuery((host:Hybrid Electric Propulsion |
contentSpecificSearch:"hybrid electric propulsion" | customContent:"hybrid
electric propulsion" | title:Hybrid Electric Propulsion | url:Hybrid
Electric Propulsion)) DisjunctionMaxQuery((host:Power Systems |
contentSpecificSearch:"power systems" | customContent:"power systems" |
title:Power Systems | url:Power Systems))
DisjunctionMaxQuery((host:Powerplant | contentSpecificSearch:powerplant |
customContent:powerplant | title:Powerplant | url:Powerplant))
DisjunctionMaxQuery((host:Propulsion | contentSpecificSearch:propulsion |
customContent:propulsion | title:Propulsion | url:Propulsion))
DisjunctionMaxQuery((host:hybrid | contentSpecificSearch:hybrid |
customContent:hybrid | title:hybrid | url:hybrid))
DisjunctionMaxQuery((host:hybrid electric | contentSpecificSearch:"hybrid
electric" | customContent:"hybrid electric" | title:hybrid electric |
url:hybrid electric)) DisjunctionMaxQuery((host:electric powerplant |
contentSpecificSearch:"electric powerplant" | customContent:"electric
powerplant" | title:electric powerplant | url:electric
powerplant/no_coord


+((host:hybrid electric powerplant | contentSpecificSearch:"hybrid electric
powerplant" | customContent:"hybrid electric powerplant" | title:hybrid
electric powerplant | url:hybrid electric powerplant) (host:hybrid electric
powerplants | contentSpecificSearch:"hybrid electric powerplants" |
customContent:"hybrid electric powerplants" | title:hybrid electric
powerplants | url:hybrid electric powerplants) (host:Electric |
contentSpecificSearch:electric | customContent:electric | title:Electric |
url:Electric) (host:Electrical | contentSpecificSearch:electrical |
customContent:electrical | title:Electrical | url:Electrical)
(host:Electricity | contentSpecificSearch:electricity |
customContent:electricity | title:Electricity | url:Electricity)
(host:Engine | contentSpecificSearch:engine | customContent:engine |
title:Engine | url:Engine) (host:fuel econ

Re: Solr performance issue on querying --> Solr 6.5.1

2017-09-30 Thread sasarun
Hi Erick, 

As suggested, I tried a non-HDFS SolrCloud instance and its response times
look much better. On the configuration side too, I am mostly using the
defaults, with block.cache.direct.memory.allocation set to false. On
analysing the HDFS cache, evictions seem to be on the higher side.

Thanks, 
Arun





Clarification on Suggester Component of Solr 6.5.1

2017-12-13 Thread sasarun
Hi All, 

Yesterday I was able to configure the Solr Suggester for recommendations on
my site with the following settings:




  


  
  



  
 



mySuggester
AnalyzingLookupFactory
DocumentDictionaryFactory
query_suggest
text_suggester
false
false
  



  
true
10
mySuggester
  
  
suggest
  



With the above configuration I am able to get suggestions from Solr. My only
point of confusion is that when I repeatedly send the same search word, the
results come back in a different order. Is this expected behaviour for the
suggester component? An example of this pattern is given below for
reference:

localhost:8983/solr/techproduct/suggest?suggest=true&suggest.build=true&suggest.dictionary=mySuggester&wt=json&suggest.q=lat&suggest.cfq=memory

Sample Result 1 
{"responseHeader":{"zkConnected":true,"status":0,"QTime":16},"command":"build","suggest":{"mySuggester":{"lat":{"numFound":3,"suggestions":[{"term":"latest development in electrification","weight":0,"payload":""},{"term":"latest development in the area of digital pdp","weight":0,"payload":""},{"term":"latest technology for materials","weight":0,"payload":""}]

Sample Result 2
{"responseHeader":{"zkConnected":true,"status":0,"QTime":14},"command":"build","suggest":{"mySuggester":{"lat":{"numFound":3,"suggestions":[{"term":"latest development in the area of digital pdp","weight":0,"payload":""},{"term":"latest technology for materials","weight":0,"payload":""},{"term":"latest development in electrification","weight":0,"payload":""}]


The first suggestion is different in the two cases.
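
Two things are visible in both samples: every suggestion has weight 0, and each response reports "command":"build" because suggest.build=true is part of the request URL. The sketch below only confirms that the two responses contain the same suggestions with identical weights and differ in order; whether the equal weights or the rebuild on every request explains the ordering is not established here.

```python
# Suggestions from the two sample responses above, as (term, weight) pairs.
result_1 = [
    ("latest development in electrification", 0),
    ("latest development in the area of digital pdp", 0),
    ("latest technology for materials", 0),
]
result_2 = [
    ("latest development in the area of digital pdp", 0),
    ("latest technology for materials", 0),
    ("latest development in electrification", 0),
]

# Same set of suggestions, regardless of order?
same_terms = sorted(result_1) == sorted(result_2)
# Which weights occur across both responses?
distinct_weights = {weight for _, weight in result_1 + result_2}

print("same suggestions:", same_terms)
print("same order:", result_1 == result_2)
print("distinct weights:", distinct_weights)
```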

Please advise.

Thanks,
Arun





Highlighting keywords which are not in close proximity with in a field

2018-01-22 Thread sasarun
Hi All, 

Currently, when I search for the phrase "Artificial Intelligence in space",
the keyword "Artificial Intelligence" gets highlighted because it occurs
more often in the document, mostly near the start. The word "space",
however, appears near the bottom of the document, so it is not shown in the
highlighting snippet. Is there a way to highlight keywords that are not in
close proximity to each other?

Thanks
Arun





Re: Highlighting Solr 8

2019-10-16 Thread sasarun
Hi Eric,

The Unified highlighter does not have an option to provide an alternate
field when highlighting. That option is available with the Original and
FastVector highlighters. As indicated in the Solr documentation, Unified is
the recommended method for highlighting and meets most use cases. Please
share more details if you are facing any specific issue with
highlighting.

Thanks,

Arun 






Re: Query related APACHE SOLR 8.2.0

2019-10-16 Thread sasarun
Hi Rohit, 

The Solr bundle comes with a Jetty server by default and does not require a
Tomcat instance to run. Although earlier versions of Solr were distributed
as a WAR file, Solr 5.0 and later versions no longer support user-defined
containers. Details are available at the link below for
reference.

https://cwiki.apache.org/confluence/display/solr/WhyNoWar


Details of the system requirements are available at the link below:

https://lucene.apache.org/solr/guide/8_2/solr-system-requirements.html#supported-operating-systems

Thanks,

Arun 



