TieredMergePolicy

2014-07-05 Thread Kireet Reddy
I have a question about the maxMergeAtOnce parameter. We are using elasticsearch and one of our nodes seems to have very high merge activity. However, it seems to be high CPU activity and not I/O constrained. I have enabled the IndexWriter info stream logs, and oftentimes it seems to do merges

Re: Field for 'species' data?

2014-07-05 Thread Erick Erickson
re: do this in an update processor or in other parts of the pipeline: whichever is easier, the result will be the same. Personally I like putting stuff like this in other parts of the pipeline if for no other reason than the load isn't concentrated on the Solr machine. In particular if you enrich

Re: Solr and SolrCloud replication, and load balancing questions.

2014-07-05 Thread Erick Erickson
Question1, both sub-cases. You're off on the wrong track here, you have to forget about replication. When documents are added to the index, they get forwarded to _all_ replicas. So the flow is like this... 1> leader gets update request 2> leader indexes docs locally, and adds to (local) transacti

Re: Solr 4.7 Payload

2014-07-05 Thread Erick Erickson
Take a look at PayloadTermQuery, I think that should give you some hints. Best, Erick On Fri, Jul 4, 2014 at 8:19 AM, Ranjith Venkatesan wrote: > Hi all, > > I am evaluating Lucene payloads. I am using Solr 4.7.2 for this. I was > able to index with payloads, but I was not able to retrieve pay

Re: Solr Map Reduce Indexer Tool GoLive to SolrCloud with index on local file system

2014-07-05 Thread Erick Erickson
Ok, I asked some folks who know and the response is that "that should work, but it's not supported/tested". IOW, you're into somewhat uncharted territory. The people who wrote the code don't have this use-case in their priority list and probably won't be expending energy in this direction any time

Re: error during heavy indexing

2014-07-05 Thread Shawn Heisey
On 7/5/2014 9:40 AM, navdeep agarwal wrote: > I am getting the following error during heavy indexing. I am using Solr 4.7, > creating the index in HDFS through MapReduce, and sending docs in batches of 50. > > ERROR org.apache.solr.core.SolrCore – java.lang.RuntimeException: [was > class org.eclipse.jetty.

Re: Field for 'species' data?

2014-07-05 Thread Dan Bolser
The latter On 5 Jul 2014 16:39, "Jack Krupansky" wrote: > So, the immediate question is whether the value in the Solr source > document has the full taxonomy path for the species, or just parts, and > some external taxonomy definition must be consulted to "fill in" the rest > of the hierarchy pat

error during heavy indexing

2014-07-05 Thread navdeep agarwal
I am getting the following error during heavy indexing. I am using Solr 4.7, creating the index in HDFS through MapReduce, and sending docs in batches of 50. ERROR org.apache.solr.core.SolrCore – java.lang.RuntimeException: [was class org.eclipse.jetty.io.EofException] early EOF at com.ctc.wstx.util.Exce

Re: Field for 'species' data?

2014-07-05 Thread Jack Krupansky
So, the immediate question is whether the value in the Solr source document has the full taxonomy path for the species, or just parts, and some external taxonomy definition must be consulted to "fill in" the rest of the hierarchy path for that species. -- Jack Krupansky -Original Message-

Re: Field for 'species' data?

2014-07-05 Thread Dan Bolser
One requirement is that the hierarchical facet implementation matches whatever the Drupal ApacheSolr module does with taxonomy terms. The key thing is to add the taxonomy to the doc which only has one 'leaf' term. On 5 Jul 2014 15:01, "Jack Krupansky" wrote: > Focus on your data model and querie

Re: Field for 'species' data?

2014-07-05 Thread Jack Krupansky
Focus on your data model and queries first, then you can decide on the implementation. Take a semi-complex example and manually break it down into field values and then write some queries, including filters, in English, that do the required navigation. Once you have a handle on what fields you

Solr and SolrCloud replication, and load balancing questions.

2014-07-05 Thread Himanshu Mehrotra
Hi, I had three questions/doubts regarding Solr and SolrCloud functionality. Can anyone help clarify these? I know these are a bit long, please bear with me. [A] Replication related - As I understand it, before SolrCloud, under a classic master/slave replication setup, every 'X' minutes slaves will pull

Re: Field for 'species' data?

2014-07-05 Thread Dan Bolser
I'm a super noob... Why choose to write it as a custom update request processor rather than an analysis pipeline? Cheers, Dan. On 5 Jul 2014 03:45, "Alexandre Rafalovitch" wrote: > Do that with a custom update request processor. > > Just remember Solr is there to find things not to preserve struc

delta import not working properly

2014-07-05 Thread madhav bahuguna
I have 8 tables in my Solr data-config files and all are joined, since I need data from all of them. But out of those 8 tables I have three tables that have common fields, which I can use to link. But the issue is that the common fields in the three tables that I have are such that they are repeating