Re: SOLR indexing strategy

2015-03-20 Thread Shawn Heisey
On 3/20/2015 10:08 PM, Jack Krupansky wrote: > 1. With 1000 fields, you may only get 10 to 25 million rows per node. So, a > single date may take 15 to 50 nodes. > 2. How many of the fields need to be indexed for reference in a query? > 3. Are all the fields populated for each row? > 4. Maybe you c

Re: SOLR indexing strategy

2015-03-20 Thread Jack Krupansky
1. With 1000 fields, you may only get 10 to 25 million rows per node. So, a single date may take 15 to 50 nodes. 2. How many of the fields need to be indexed for reference in a query? 3. Are all the fields populated for each row? 4. Maybe you could split each row, so that one Solr collection would

DocTransformer#setContext

2015-03-20 Thread Ryan Josal
Hey guys, I wanted to ask if I'm using the DocTransformer API as intended. There is a setContext( TransformerContext c ) method which is called by the TextResponseWriter before it calls transform on any docs. That context object contains a DocIterator reference. I want to use a DocTransformer to

How to use ConcurrentUpdateSolrServer for Secured Solr?

2015-03-20 Thread Furkan KAMACI
Is there any way to use ConcurrentUpdateSolrServer for secured Solr, as with CloudSolrServer: HttpClientUtil.setBasicAuth(cloudSolrServer.getLbServer().getHttpClient(), , ); I see no way to access the HttpClient for ConcurrentUpdateSolrServer. Kind Regards, Furkan KAMACI

PostFilter does not seem to work across shards

2015-03-20 Thread Kevin Osborn
I developed a post filter. My documents to be filtered are on two different shards. So, in a single-shard environment, DelegatingCollector.doSetNextReader is called twice. And collect is called the correct number of times. Everything went well and I got my correct number of results back. So, I the

Re: Solr Unexpected Query Parser Exception

2015-03-20 Thread Jack Krupansky
Which query parser are you using? The dismax query parser does not support wild cards or "*:*". Either way, the error message is unhelpful - worth filing a Jira. -- Jack Krupansky On Fri, Mar 20, 2015 at 7:21 AM, Vishnu Mishra wrote: > Hi, I am using solr 4.10.3 and doing distributed shard que

Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Erick Erickson
Are you faceting? That can sometimes use one of the caches (just glanced at stack trace...) as entries are pushed into and removed from the cache during the same request. Shot in the dark. Best, Erick On Fri, Mar 20, 2015 at 12:17 PM, Yonik Seeley wrote: > The document cache is not really going

Re: SOLR indexing strategy

2015-03-20 Thread Erick Erickson
On the surface, this is impossible: bq: This query should load only indexes within this date range How would one "load only indexes with this date range"? The nature of Lucene's merging segments makes it unclear what this would even mean. Best, Erick On Fri, Mar 20, 2015 at 5:09 AM, Priceputu C

Re: Solr Unexpected Query Parser Exception

2015-03-20 Thread Erick Erickson
You have to show us the entire query in order to be able to help. Or are you saying the entire url is blah blah blah/select?q=*:* ? And you should have a trace of this in the Solr log, that would be helpful too as it includes the query that _Solr_ sees. Best, Erick On Fri, Mar 20, 2015 at 4:21 A

Re: SOLR URL

2015-03-20 Thread Shawn Heisey
On 3/20/2015 11:15 AM, richardg wrote: > I'm in the process of upgrading to Solr 5 from Solr 4* using Tomcat w/ > multiple webapps. When setting up a SOLR 5 Node is it possible to change > the URL from localhost:8080/solr to localhost:8080/whateverIwant ? Yes. You can either use Solr 5 as a wa

Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Yonik Seeley
The document cache is not really going to be taking up time here. How many concurrent requests (threads) are you testing with here? One thing I've seen over the years is a false sense of what is taking up time when benchmarks with a lot of threads are used. The reason is that when there are a lot

Re: Have anyone used Automatic Phrase Tokenization (AutoPhrasingTokenFilterFactory) ?

2015-03-20 Thread James Strassburg
I have an autophrase configured for 'wheel chair' and if I run analysis for 'super wheel chair awesome' such that it would index to 'super wheelchair awesome' this is how mine behaves: http://i.imgur.com/iR4IgGp.png When I did the implementation that is how I thought the positioning should work. D

Re: Block join ordering

2015-03-20 Thread Mikhail Khludnev
Hello, I'm not sure I got you right. Here is https://issues.apache.org/jira/browse/SOLR-5882 with the patch which sounds quite relevant. On Fri, Mar 20, 2015 at 4:02 PM, StrW_dev wrote: > Hi, > > I am using block joins in my Solr Index. I am searching and returning the > child document, but I w

Re: How To Interrupt Solr Query Execution

2015-03-20 Thread Gregg Donovan
SOLR-5986 looks like a great enhancement for enforcing timeouts. I'm curious about how to handle *manual* cancellation. We're working on backup requests -- e.g. wait till 90% of shards have responded then send out a backup request for the lagging (e.g. GC, cache miss, overloaded, etc.) shards afte

Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Chris Hostetter
: we have quite a problem with Solr. We are running it in a config 6x3, and : suddenly solr started to hang, taking all the available cpu on the nodes. : : In the threads dump noticed things like this can eat lot of CPU time : : :- org.apache.solr.search.LRUCache.put(LRUCache.java:116) :

SOLR URL

2015-03-20 Thread richardg
I'm in the process of upgrading to Solr 5 from Solr 4* using Tomcat w/ multiple webapps. When setting up a SOLR 5 Node is it possible to change the URL from localhost:8080/solr to localhost:8080/whateverIwant ? Another issue to consider w/ this: I will have multiple SOLR instances. The way I en

Re: index duplicate records from data source into 1 document

2015-03-20 Thread Shawn Heisey
On 3/20/2015 4:03 AM, Toke Eskildsen wrote: > On Thu, 2015-03-19 at 15:44 +0100, Shawn Heisey wrote: >> You could in theory write a custom UpdateRequestProcessor that looks for >> the previous document and merges it in whatever way you desire, so the >> combined information is what will be indexed,

Re: Have anyone used Automatic Phrase Tokenization (AutoPhrasingTokenFilterFactory) ?

2015-03-20 Thread trhodesg
Sorry, I can see my post is munged. This seems to display it legibly: http://lucene.472066.n3.nabble.com/Have-anyone-used-Automatic-Phrase-Tokenization-AutoPhrasingTokenFilterFactory-td4173808.html I'm new to all this, so I hesitate to say the index

Re: data import

2015-03-20 Thread Shawn Heisey
On 3/19/2015 10:36 PM, Midas A wrote: > Thanks for replying .. I need clarity on following points > a) Making store false in schema for few fields will improve indexing time ? Maybe, maybe not. If Solr is I/O bound, then it probably would help ... but usually I/O on the Solr index directory is no

Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Sergey Shvets
Hello Shawn, In that case, the behavior we noticed is a bit strange. LRU was heavy on the CPU in the thread dump, and I don't have any reasonable explanation for that. However, switching to LFU seemingly solved the problem. -- Best regards, Sergey

Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Shawn Heisey
On 3/19/2015 8:49 PM, Umesh Prasad wrote: > It might be because LRUCache by default will try to evict its entries on > each call to put and putAll. LRUCache is built on top of java's > LinkedHashMap. Check the javadoc of removeEldestEntry >

Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Sergey Shvets
Hello Umesh, Thank you, indeed that has given positive results so far. We changed completely to LFU. Today it went quite okay. We will wait until it shows more stability and then work out the optimal cache size. Below is a summary of the changes. [diff content not preserved] -- Best regards, Sergey

Re: How to configure Solr to use ZooKeeper ACLs in order to protect it's content

2015-03-20 Thread Dmitry Karanfilov
Hey Per, This is magic! Works like a charm! Thank you for trying to reproduce this, for finding the cause of the issue, for the step-by-step instructions and, above all, for your time and help! I would never have gotten it working without your help! Regards, Dmitry

Block join ordering

2015-03-20 Thread StrW_dev
Hi, I am using block joins in my Solr Index. I am searching and returning the child document, but I want to order based on an attribute of the parent. Would that be possible? Example doc: Gr -- View this message in context: http://lucene.472066.n3.na

Re: SOLR indexing strategy

2015-03-20 Thread Priceputu Cristian
Why would you need 1000 fields? C On Fri, Mar 20, 2015 at 1:12 PM, varun sharma wrote: > Requirements of the system that we are trying to build are for each date > we need to create a SOLR index containing about 350-500 million documents, > where each document is a single structured record

Re: How to configure Solr to use ZooKeeper ACLs in order to protect it's content

2015-03-20 Thread Per Steffensen
Sorry, I did not follow this mailing list closely enough to notice this question. But Dmitry mailed me privately asking for help, so here I am. Initial steps: * mkdir solr-test * cd solr-test * Downloaded solr-5.0.0.zip and unzipped into solr-test folder, so that I have solr-test/solr-5.0.0 fold

Solr Unexpected Query Parser Exception

2015-03-20 Thread Vishnu Mishra
Hi, I am using solr 4.10.3 and doing distributed shard query. I am getting the following syntax exception at regular intervals. ERROR org.apache.solr.core.SolrCore ? org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Cannot parse '*:*': Encountered "" at line 1, column 3. Was

SOLR indexing strategy

2015-03-20 Thread varun sharma
The requirements of the system that we are trying to build are: for each date we need to create a SOLR index containing about 350-500 million documents, where each document is a single structured record having about 1000 fields. Then query the same based on index keys & date, for instance we will try to

Re: index duplicate records from data source into 1 document

2015-03-20 Thread Toke Eskildsen
On Thu, 2015-03-19 at 15:44 +0100, Shawn Heisey wrote: > You could in theory write a custom UpdateRequestProcessor that looks for > the previous document and merges it in whatever way you desire, so the > combined information is what will be indexed, and configure Solr to use > that update processo