Constantly high disk read access (40-60M/s)

2014-11-28 Thread Po-Yu Chuang
Hi all, I am using Solr 4.9 with Tomcat. Thanks to the suggestions from Yonik and Dmitry about the slow startup. Everything works fine now, but I noticed that the load average of the server is high because there is constantly heavy disk read access. Please point me in the right direction. Some numbers

Re: SolrCloud replica always fully resync index from leader node

2014-11-28 Thread stephon
Hello Erick, My solrconfig.xml is attached. It is running on a Debian server with 64GB RAM. The evidence of full replication is that coreA_shard1_replica2 is in the recovering state. Since in this state, solr/coreA

Re: Trying to get ALL scores from a previous search in a custom search component ("last-components")

2014-11-28 Thread Darin Amos
Thanks for the advice. I will take a look to see if there is some tuning I can do here; I am not terribly concerned about that yet anyway. My concern remains how I can get the scores of the entire matched set. Maybe it is not possible, or perhaps I need to write my own query/match/s

Re: Trying to get ALL scores from a previous search in a custom search component ("last-components")

2014-11-28 Thread Erick Erickson
Because you're fetching and decompressing the doc from disk. Grouping etc. do their work from _indexed_ terms, which are already in memory. Two different things. If I'm reading this right on a quick scan... Best, Erick On Nov 28, 2014 10:21 AM, "Darin Amos" wrote: > Hi Eric, > > I am curious why

Re: Upgrading Solr from 1.4.1 to 4.10

2014-11-28 Thread Shawn Heisey
On 11/28/2014 2:44 AM, rajadilipchowdary.ko...@cognizant.com wrote: > We are using Apache Solr 1.4.1 for our project. Nowadays we are facing many > problems regarding Solr indexing, so when we looked at the website we found the latest > version is 4.10. Could you please help us in upgrading Solr? > > Is

Re: Trying to get ALL scores from a previous search in a custom search component ("last-components")

2014-11-28 Thread Darin Amos
Hi Eric, I am curious why this would be considered an anti-pattern to check a stored value for every matching document. Is this not what the facet query component is doing anyway so it can get the total counts? Grouping doesn’t solve the issue because again, I will only see groups for the items

Re: Upgrading Solr from 1.4.1 to 4.10

2014-11-28 Thread Erick Erickson
P.S. Do _NOT_ just copy your 1.4 configs to 4.x. Start with the 4.x sample configs and selectively move any customizations from 1.4, or you'll get burned by things like the schema requiring _version_ in 4.x and possibly _root_ etc. Best, Erick On Fri, Nov 28, 2014 at 2:53 AM, David Philip wrote: > Hi

Re: SolrCloud replica always fully resync index from leader node

2014-11-28 Thread Erick Erickson
Stephon: Not quite sure what's going on, but you're hinting that you're mixing old-style replication with SolrCloud, the two are orthogonal. This, for instance, is irrelevant for SolrCloud: and setting replicateAfter:commit So let's see the relevant configuration from solrconfig.xml. Also, what

Re: Trying to get ALL scores from a previous search in a custom search component ("last-components")

2014-11-28 Thread Erick Erickson
Does grouping work for you here? Because even if you solve this problem, if I'm reading this right you're going to fetch stored values for every doc that matches the query, which is an anti-pattern big-time; consider *:*. Of course I did a very quick skim, so maybe I'm all wet. Best, Erick

Solr 4.10.2 - DataImportHandler - Question

2014-11-28 Thread Umang Agrawal
Hi All, I have a question on loading data into Solr using DataImportHandler. I have an XML file which I need to load into Solr using the data import handler via an XPath transformer. XML file structure is: ** * * * * * * * * * 01* * value01* * * * * * 02* * value02* * * * * * * * * * 03* * value03* * *

Re: phrase extraction from user paragraph input

2014-11-28 Thread Ahmet Arslan
Hi, For the first part of the task, you can use key phrase extraction. https://code.google.com/p/maui-indexer/ http://www.nzdl.org/Kea/ Ahmet On Friday, November 28, 2014 11:23 AM, Nikos Chaliasos wrote: Hello, I am investigating a university project where in a part of it, the user would

Re: Upgrading Solr from 1.4.1 to 4.10

2014-11-28 Thread David Philip
Hi Raja, Could you please mention the list of solr features that you were/are using in Solr 1.4. There have been tremendous changes since 1.4 to 4.10. Also, you may have to explore solr cloud for resolving the indexing operation. But what kind of indexing problems are you facing? You should loo

Upgrading Solr from 1.4.1 to 4.10

2014-11-28 Thread RajaDilipChowdary.Kolli
Hi Team, We are using Apache Solr 1.4.1 for our project. Nowadays we are facing many problems regarding Solr indexing, so when we looked at the website we found the latest version is 4.10. Could you please help us in upgrading Solr? Is there any specific things which we need to change from our current

Re: phrase extraction from user paragraph input

2014-11-28 Thread Vineet Mishra
Hi Nikos, Can you quote an example for your use case? I guess that will be helpful for understanding the problem more clearly. Cheers! On Fri, Nov 28, 2014 at 2:31 PM, Nikos Chaliasos wrote: > Hello, > > I am investigating a university project where in a part of it, the user > would give a para

phrase extraction from user paragraph input

2014-11-28 Thread Nikos Chaliasos
Hello, I am investigating a university project where in a part of it, the user would give a paragraph of text as input and the parsing process (after removing stopwords) would extract a series of descriptive topics about the paragraph, with which I could then search in documents for results. Is th

Re: SolrCloud replica always fully resync index from leader node

2014-11-28 Thread stephon
Hello Ludovic, Zookeeper timeout errors not found in log file Here is my SolrCloud environment information. * Solr 4.5.1 used * Index size : ~270G * Index update: every 30 secs, each update will contain 3~4 index version changes * example: * old index version number: 1417165450218

Re: SolrCloud replica always fully resync index from leader node

2014-11-28 Thread lboutros
Hi Stephon, do you see ZooKeeper timeout errors in your log files? Could you please give us additional information, such as: How often is your index updated? Which version of Solr do you use? What is the size of your index? Make sure you have this handler in your Solr configuration file :

Re: Solr mlt doesn't return documents with "exactly the same" contents

2014-11-28 Thread Nishant Kelkar
+1 hhc. Thanks for sharing! Best Regards, Nishant Kelkar On Thu, Nov 27, 2014 at 9:59 PM, hhc wrote: > After carefully reading the mlt parameters here > https://wiki.apache.org/solr/MoreLikeThis > > I found that I can specify the following parameters to return "bbb" when > search for similar do

SolrCloud replica always fully resync index from leader node

2014-11-28 Thread stephon
I have a SolrCloud core with 4 shards and a replication factor of 1, as listed below: * coreA_shard1_replica1 * coreA_shard2_replica1 * coreA_shard3_replica1 * coreA_shard4_replica1 After adding a new replica of coreA_shard1, i.e. coreA_shard1_replica2, it does a full resync from the leader no