Re: Streaming Expressions using Solrj.io

2018-02-21 Thread Ryan Yacyshyn
Hi Shawn, Thanks for this. I'm using maven and I had another library unrelated to solrj loading an older version of the httpclient (4.4.1), which was causing the problem. Regards, Ryan On Wed, 21 Feb 2018 at 12:50 Shawn Heisey wrote: > On 2/20/2018 7:54 PM, Ryan Yacyshyn wrote: > > I'd lik

Re: What is “high cardinality” in facet streams?

2018-02-21 Thread Joel Bernstein
With Streaming Expressions you have options for speeding up large aggregations. 1) Shard 2) Use the parallel function to run the aggregation in parallel. 3) Add more replicas When you use the parallel function the same aggregation can be pulled from every shard and every shard replica in the clus

Re: Solr Swap space

2018-02-21 Thread Susheel Kumar
Hi Shawn, Below is the output for the prod machine, based on the steps you described. Please take a look. The Solr searches are returning fine with no performance issues, but over the last 4 months swap usage has been going up. After a restart it comes down to zero, and then within a few weeks its utilization reaches

Re: Solrj : ConcurrentUpdateSolrClient based on QueueSize and Time

2018-02-21 Thread Jason Gerlowski
My apologies Santosh. I added that comment a few releases back based on a misunderstanding I've only recently been disabused of. I will correct it. Anyway, Shawn's explanation above is correct. The queueSize parameter doesn't control batching, as he clarified. Sorry for the trouble. Best, Ja
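
Since queueSize does not control batching, one way to get the "flush on N documents or after T milliseconds" behaviour asked about in this thread is to batch on the client side before handing documents to any client. The following is only a minimal sketch under stated assumptions: SizeOrTimeBatcher and its parameters are hypothetical names, not part of SolrJ, and the sink is a stand-in for something like a wrapped call to SolrClient.add(Collection) with its checked exceptions handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not a SolrJ class): buffers items and hands them to
 * a sink either when maxBatchSize is reached or when a periodic timer fires.
 */
final class SizeOrTimeBatcher<T> implements AutoCloseable {
    private final int maxBatchSize;
    private final Consumer<List<T>> sink; // assumption: wraps the real send call
    private final List<T> buffer = new ArrayList<>();
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    SizeOrTimeBatcher(int maxBatchSize, long maxWaitMillis, Consumer<List<T>> sink) {
        this.maxBatchSize = maxBatchSize;
        this.sink = sink;
        // Periodic flush: a buffered item waits at most about maxWaitMillis.
        timer.scheduleAtFixedRate(() -> {
            try {
                flush();
            } catch (RuntimeException e) {
                // Swallow so later timed flushes still run; the failed batch is lost here.
                System.err.println("timed flush failed: " + e);
            }
        }, maxWaitMillis, maxWaitMillis, TimeUnit.MILLISECONDS);
    }

    /** Adds one item and flushes immediately if the size threshold is reached. */
    synchronized void add(T item) {
        buffer.add(item);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    /** Sends the current buffer as one batch; does nothing if it is empty. */
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = new ArrayList<>(buffer);
        buffer.clear();
        sink.accept(batch); // called while holding the lock, to keep the sketch simple
    }

    /** Stops the timer and sends whatever is still buffered. */
    @Override
    public void close() {
        timer.shutdown();
        flush();
    }
}
```

With a size of 50 and a wait of, say, a few seconds, callers would add documents one at a time and the sink would receive lists of at most 50. Calling the sink while holding the lock keeps the ordering simple but blocks add() during a slow send; a production version would likely hand the batch to a separate executor.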

[Solr Collection Loosing Leader]

2018-02-21 Thread Jagdish Saripella
Hello All, I am running into a frequent issue where the leader shard in solr cloud stays active but does not acknowledge as "leader". This brings down the other replicas as they go into recovery mode and eventually fail trying to sync up. The error seen in "solr.log" is below: { this also simil

Re: Solrj : ConcurrentUpdateSolrClient based on QueueSize and Time

2018-02-21 Thread Santosh Narayan
Thanks for the explanation Shawn. Very helpful. I think I got misled by the JavaDoc text for *ConcurrentUpdateSolrClient.Builder.withQueueSize* /** * The number of documents to batch together before sending to Solr. If not set, this defaults to 10. */ public Builder withQueueSize(

[Solr Collection Loosing Leader]

2018-02-21 Thread Aaryan Reddy
Hello All, I am running into a frequent issue where the leader shard in solr cloud stays active but does not acknowledge as "leader". This brings down the other replicas as they go into recovery mode and eventually fail trying to sync up. The error seen in "solr.log" is below: { this also simil

Re: Solr Swap space

2018-02-21 Thread Shawn Heisey
On 2/21/2018 1:30 PM, Susheel Kumar wrote: > I did go thru your posts on swap usage http://lucene.472066.n3.nabble.com/Solr-4-3-1-memory-swapping-td4126641.html > and my situation is > also similar. Below is top output from our prod and performance test > machine and as you can see the swap util

Re: What is “high cardinality” in facet streams?

2018-02-21 Thread Shawn Heisey
On 2/21/2018 12:08 PM, Alfonso Muñoz-Pomer Fuentes wrote: > Some more details about my collection: > - Approximately 200M documents > - 1.2M different values in the field I’m faceting over > > The query I’m doing is over a single bucket, which after applying q and fq > the 1.2M values are reduced

Re: Odd GC values in solr.in.sh on 7.2.1

2018-02-21 Thread Shawn Heisey
On 2/21/2018 7:51 AM, Bram Van Dam wrote: > solr.in.sh appears to contain broken GC suggestions: > > # These GC settings have shown to work well for a number of common Solr > workloads > #GC_TUNE="-XX:NewRatio=3 -XX:SurvivorRatio=4etc. In 5.x, the default environment variables were specified i

Re: Solrj : ConcurrentUpdateSolrClient based on QueueSize and Time

2018-02-21 Thread Shawn Heisey
On 2/21/2018 7:41 AM, Santosh Narayan wrote: > May be it is my understanding of the documentation. As per the > JavaDoc, ConcurrentUpdateSolrClient > buffers all added documents and writes them into open HTTP connections. > > So I thought that this class would buffer documents in the client side >

Re: q.op in 7.2.1 solconfig.xml

2018-02-21 Thread Emir Arnautović
You are welcome! In case you want to prevent overrides, you can put it in invariants section. Emir On Feb 22, 2018 12:32 AM, "Peter Sturge" wrote: > Hi, > Thanks for your reply. > > I managed to get it working by specifying it in the requestHandler /select > section in solrconfig.xml: > >

Re: q.op in 7.2.1 solconfig.xml

2018-02-21 Thread Peter Sturge
Hi, Thanks for your reply. I managed to get it working by specifying it in the requestHandler /select section in solrconfig.xml: <lst name="defaults"> <str name="echoParams">explicit</str> <int name="rows">10</int> <str name="q.op">AND</str> </lst> Your other suggestions will also be useful for everyone with a similar question. Thanks! Peter On Wed, Feb 21

Re: q.op in 7.2.1 solconfig.xml

2018-02-21 Thread Emir Arnautović
Hi Peter, You should be able to set any param in several places/ways: * define in request handler defaults * define in params.json and reference using useParams in request handler definition * use initParams to define default for one or multiple handlers You can see examples in configs that are p

q.op in 7.2.1 solconfig.xml

2018-02-21 Thread Peter Sturge
Hi, I'm going through a major upgrade from 4.6 to 7.2.1 and I can see the defaultOperator has now been removed. The docs mention it's possible to set a default value for the new q.op directive in solrconfig.xml, but it doesn't say how or where. Does anyone have an example of specifying a default

Regarding solr 5.3.1 boosting issue

2018-02-21 Thread sha p
Hi, I am using Apache Solr version 5.3.1 in my project. I have a requirement where each document can have child documents. When searching, these child documents should be retrieved in the same order in which they were initially added, but when I search for them using Solr's REST API calls, this order is jumbled. I

Re: Solr Swap space

2018-02-21 Thread Susheel Kumar
Hello Shawn, I did go thru your posts on swap usage http://lucene.472066.n3.nabble.com/Solr-4-3-1-memory-swapping-td4126641.html and my situation is also similar. Below is top output from our prod and performance test machine and as you can see the swap utilization on Prod machine is 44% while

Re: storing large text fields in a database? (instead of inside index)

2018-02-21 Thread Roman Chyla
Hi and thanks, Emir! FieldType might indeed be another layer where the logic could live. On Wed, Feb 21, 2018 at 6:32 AM, Emir Arnautović < emir.arnauto...@sematext.com> wrote: > Hi, > Maybe you could use external field type as an example how to hook up > values from DB: https://lucene.apache.org

Re: What is “high cardinality” in facet streams?

2018-02-21 Thread Alfonso Muñoz-Pomer Fuentes
Thank you for all the information. As usual I learn a lot about Solr and data modeling with each reply. Some more details about my collection: - Approximately 200M documents - 1.2M different values in the field I’m faceting over The query I’m doing is over a single bucket, which after applying q

FINAL REMINDER: CFP for Apache EU Roadshow Closes 25th February

2018-02-21 Thread Sharan F
Hello Apache Supporters and Enthusiasts This is your FINAL reminder that the Call for Papers (CFP) for the Apache EU Roadshow is closing soon. Our Apache EU Roadshow will focus on Cloud, IoT, Apache Tomcat, Apache Http and will run from 13-14 June 2018 in Berlin. Note that the CFP deadline has

Odd GC values in solr.in.sh on 7.2.1

2018-02-21 Thread Bram Van Dam
Hey folks, solr.in.sh appears to contain broken GC suggestions: # These GC settings have shown to work well for a number of common Solr workloads #GC_TUNE="-XX:NewRatio=3 -XX:SurvivorRatio=4etc. The "etc." part is copied verbatim from the file. It looks like the original GC_TUNE settings ha

Re: Solrj : ConcurrentUpdateSolrClient based on QueueSize and Time

2018-02-21 Thread Santosh Narayan
Hi Shawn, May be it is my understanding of the documentation. As per the JavaDoc, ConcurrentUpdateSolrClient buffers all added documents and writes them into open HTTP connections. So I thought that this class would buffer documents in the client side itself till the QueueSize is reached and then

Re: Help required with SolrJ

2018-02-21 Thread Aakanksha Gupta
Thanks Shawn! That was just a small fix from my side. Thanks for your help! On Tue, Feb 20, 2018 at 1:43 AM, Shawn Heisey wrote: > On 2/19/2018 8:49 AM, Aakanksha Gupta wrote: > > Thanks for the quick solution. It works. I just had to replace %20 to > space > > in query.addFilterQuery("timestamp

Re: Help required with SolrJ

2018-02-21 Thread Aakanksha Gupta
Thanks Erick. On Tue, Feb 20, 2018 at 1:11 AM, Erick Erickson wrote: > Aakanksha: > > Be a little careful here, filter queries with timestamps can be > tricky. The example you have is fine, but for end-points with finer > granularity may be best if you don't cache them, see: > https://lucidworks

Re: storing large text fields in a database? (instead of inside index)

2018-02-21 Thread Emir Arnautović
Hi, Maybe you could use external field type as an example how to hook up values from DB: https://lucene.apache.org/solr/guide/6_6/working-with-external-files-and-processes.html HTH, Emir -- Monitoring - L

Re: Solrj : ConcurrentUpdateSolrClient based on QueueSize and Time

2018-02-21 Thread Shawn Heisey
On 2/21/2018 1:21 AM, Santosh Narayan wrote: I'm using ConcurrentUpdateSolrClient to push data into Solr. Currently, I'm initializing it as follows: ConcurrentUpdateSolrClient client = new ConcurrentUpdateSolrClient.Builder(serverUrl).withThreadCount(100).withQueueSize(50).build(); This works fi

Re: Deploying solr to tomcat 7

2018-02-21 Thread Shawn Heisey
On 2/21/2018 3:00 AM, Rehaman wrote: We installed Ensembl server in our environment and not able to query databases with large number of entries. And for that purpose we need to use indexed databases through Solr search engine. We have installed Solr search engine (Solr Specification Version:

Deploying solr to tomcat 7

2018-02-21 Thread Rehaman
Dear Solr Team, We installed Ensembl server in our environment and not able to query databases with large number of entries. And for that purpose we need to use indexed databases through Solr search engine. We have installed Solr search engine (Solr Specification Version: 3.6.1) on Tomcat 7. Able

Solrj : ConcurrentUpdateSolrClient based on QueueSize and Time

2018-02-21 Thread Santosh Narayan
Hi all, I'm using ConcurrentUpdateSolrClient to push data into Solr. Currently, I'm initializing it as follows: ConcurrentUpdateSolrClient client = new ConcurrentUpdateSolrClient.Builder(serverUrl).withThreadCount(100).withQueueSize(50).build(); This works fine when there are 50 requests coming in