Re: solr/lucene index merge and optimize performance improvement

2015-06-17 Thread Toke Eskildsen
On Tue, 2015-06-16 at 09:54 -0700, Shenghua(Daniel) Wan wrote: > Hi, Toke, > Did you try MapReduce with solr? I think it should be a good fit for your > use case. Thanks for the suggestion. Improved logistics, such as starting build of a new shard while the previous shard is optimizing, would work

Re: solr/lucene index merge and optimize performance improvement

2015-06-16 Thread Shenghua(Daniel) Wan
Hi, Toke, Did you try MapReduce with solr? I think it should be a good fit for your use case. On Tue, Jun 16, 2015 at 5:02 AM, Toke Eskildsen wrote: > Shenghua(Daniel) Wan wrote: > > Actually, I am currently interested in how to boost merging/optimizing > > performance of single solr instance.

Re: solr/lucene index merge and optimize performance improvement

2015-06-16 Thread Toke Eskildsen
Shenghua(Daniel) Wan wrote: > Actually, I am currently interested in how to boost merging/optimizing > performance of single solr instance. We have the same challenge (we build static 900GB shards one at a time and the final optimization takes 8 hours with only 1 CPU core at 100%). I know that

Re: solr/lucene index merge and optimize performance improvement

2015-06-15 Thread Shenghua(Daniel) Wan
I think your advice on future incremental updates is very useful. I will keep an eye on that. Actually, I am currently interested in how to boost the merging/optimizing performance of a single Solr instance. Parallelism at the MapReduce level does not help merging/optimizing much, unless Solr/Lucene internally

Re: solr/lucene index merge and optimize performance improvement

2015-06-15 Thread Erick Erickson
Ah, OK. For very slowly changing indexes optimize can make sense. Do note, though, that if you incrementally index after the full build, and especially if you update documents, you're laying a trap for the future. Let's say you optimize down to a single segment. The default TieredMergePolicy trie

Re: solr/lucene index merge and optimize performance improvement

2015-06-15 Thread Shenghua(Daniel) Wan
Hi, Erick, First thanks for sharing the ideas. I am further giving more context here accordingly. 1. why optimize? I have done some experiments to compare the query response time, and there is some difference. In addition, the searcher will be customer-facing. I think any performance boost will be

Re: solr/lucene index merge and optimize performance improvement

2015-06-15 Thread Erick Erickson
The first question is why you're optimizing at all. It's not recommended unless you can demonstrate that an optimized index is giving you enough of a performance boost to be worth the effort. And why are you using embedded solr server? That's kind of unusual so I wonder if you've gone down a wrong

solr/lucene index merge and optimize performance improvement

2015-06-15 Thread Shenghua(Daniel) Wan
Hi, Do you have any suggestions for improving the performance of merging and optimizing an index? I have been using an embedded Solr server to merge and optimize the index. I am looking for the right parameters to tune. My use case has about 300 fields plus 250 copyFields, and a moderate doc size (about 65K

Re: 43sec commit duration - blocked by index merge events?

2015-02-13 Thread Timothy Potter
I think Mark found something similar - https://issues.apache.org/jira/browse/SOLR-6838 On Sat, Feb 14, 2015 at 2:05 AM, Erick Erickson wrote: > Exactly how are you issuing the commit? I'm assuming you're > using SolrJ. the server.commit(whatever, true) waits for the searcher > to be opened befor

Re: 43sec commit duration - blocked by index merge events?

2015-02-13 Thread Erick Erickson
Exactly how are you issuing the commit? I'm assuming you're using SolrJ. the server.commit(whatever, true) waits for the searcher to be opened before returning. This includes (I believe) warmup times. It could be that the warmup times are huge in your case, the solr logs should show you the autowar

Re: 43sec commit duration - blocked by index merge events?

2015-02-13 Thread Jack Krupansky
I wasn't able to follow Otis' answer but... the purpose of commit is to make recent document changes (since the last commit) visible to queries, and has nothing to do with merging of segments. IOW, take the new segment that is being created and not yet ready for use by query, and finish it so

Re: 43sec commit duration - blocked by index merge events?

2015-02-13 Thread Otis Gospodnetic
Check http://search-lucene.com/?q=commit+wait+block&fc_type=mail+_hash_+user e.g. http://search-lucene.com/m/QTPa7Sqx81 Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Fri, Feb 13, 2015 at 8:50 AM, Gili Nachum

Re: 43sec commit duration - blocked by index merge events?

2015-02-13 Thread Gili Nachum
Thanks Otis, can you confirm that a commit call will wait for merges to complete before returning? On Thu, Feb 12, 2015 at 8:46 PM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > If you are using Solr and SPM for Solr, you can check a report that shows > the # of files in an index and th

Re: 43sec commit duration - blocked by index merge events?

2015-02-12 Thread Otis Gospodnetic
If you are using Solr and SPM for Solr, you can check a report that shows the # of files in an index and the report that shows you the max docs-num docs delta. If you see the # of files drop during a commit, that's a merge. If you see a big delta change, that's probably a merge, too. You could a
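The heuristic described above (a drop in the number of index files, or a large change in the maxDoc minus numDocs delta, suggests a merge) can be sketched in a few lines. This is only an illustration: the sample layout, the `probable_merges` helper, and the 50% threshold are assumptions made for the example, and they are not part of SPM or Solr. The sketch also reads "big delta change" as a sharp drop in the delta, which is one possible interpretation.

```python
def deleted_delta(sample):
    """maxDoc minus numDocs: roughly, documents deleted but not yet merged away."""
    return sample["max_doc"] - sample["num_docs"]


def probable_merges(samples, delta_drop_ratio=0.5):
    """Return the positions of samples that look like they follow a merge.

    A sample is flagged when the file count fell since the previous sample,
    or when the deleted-docs delta shrank by at least `delta_drop_ratio`
    (an assumed threshold, not a Solr setting).
    """
    flagged = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        files_dropped = cur["file_count"] < prev["file_count"]
        prev_delta, cur_delta = deleted_delta(prev), deleted_delta(cur)
        delta_dropped = (
            prev_delta > 0 and cur_delta <= prev_delta * (1 - delta_drop_ratio)
        )
        if files_dropped or delta_dropped:
            flagged.append(i)
    return flagged


# Hypothetical monitoring samples, one per reporting interval.
samples = [
    {"file_count": 40, "max_doc": 1000, "num_docs": 900},
    {"file_count": 44, "max_doc": 1100, "num_docs": 980},
    {"file_count": 12, "max_doc": 1000, "num_docs": 1000},
    {"file_count": 14, "max_doc": 1050, "num_docs": 1050},
]
flagged = probable_merges(samples)
```

Lining the flagged positions up with commit timestamps would show whether slow commits coincide with probable merges.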

43sec commit duration - blocked by index merge events?

2015-02-08 Thread Gili Nachum
Hello, During a load test I noticed a commit that took 43 seconds to complete (client hard complete). Is this to be expected? What's causing it? I have a pair of machines hosting a 128M docs collection (8 shards, replication factor=2). Could it be merges? In Lucene merges happen async of commit s

Re: index merge question

2014-04-17 Thread Brett Hoerner
/content/support/en/documentation/cloudera-search/cloudera-search-documentation-v1-latest.html > > -- see the user guide pdf, " update-conflict-resolver" parameter > > > > James > > > > -Original Message- > > From: anirudh...@gmail.com [mail

RE: index merge

2014-03-26 Thread Susheel Kumar
@lucene.apache.org Subject: index merge Hi all. I am trying a Solr index merge on two collections (Solr 4.6.1). Each collection has the same index. I tried the following: http://localhost:8983/solr/admin/cores?action=mergeindexes&core=collection1&srcCore=collection1&srcCore=collection2 It is

index merge

2014-03-26 Thread Cihad Guzel
Hi all. I am trying a Solr index merge on two collections (Solr 4.6.1). Each collection has the same index. I tried the following: http://localhost:8983/solr/admin/cores?action=mergeindexes&core=collection1&srcCore=collection1&srcCore=collection2 It is successful. But although id is the unique key
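For reference, the request quoted above can be assembled programmatically. The following is a minimal sketch, assuming the host, port, and core names from the message; `merge_indexes_url` is a hypothetical helper name, and the code only builds the URL without sending it.

```python
from urllib.parse import urlencode


def merge_indexes_url(base_url, target_core, source_cores):
    """Build a CoreAdmin mergeindexes URL with one srcCore parameter per source."""
    params = [("action", "mergeindexes"), ("core", target_core)]
    params += [("srcCore", name) for name in source_cores]
    return f"{base_url}/admin/cores?{urlencode(params)}"


# Same values as in the message: collection1 is both the target and a source.
url = merge_indexes_url(
    "http://localhost:8983/solr",
    "collection1",
    ["collection1", "collection2"],
)
```

Note that the original request lists collection1 as both the target core and a source core.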

Re: index merge question

2013-06-11 Thread Mark Miller
> Anirudha Jadhav > Sent: Tuesday, June 11, 2013 10:47 AM > To: solr-user@lucene.apache.org > Subject: Re: index merge question > > From my experience the lucene mergeTool and the one invoked by coreAdmin is a > pure lucene implementation and does not understand

RE: index merge question

2013-06-11 Thread James Thomas
ache.org Subject: Re: index merge question From my experience the lucene mergeTool and the one invoked by coreAdmin is a pure lucene implementation and does not understand the concepts of a unique Key(solr land concept) http://wiki.apache.org/solr/MergingSolrIndexes has a cautionary not

Re: index merge question

2013-06-11 Thread Anirudha Jadhav
From my experience the lucene mergeTool and the one invoked by coreAdmin is a pure lucene implementation and does not understand the concept of a unique key (a Solr-land concept). http://wiki.apache.org/solr/MergingSolrIndexes has a cautionary note at the end. We do frequent index merges for which

Re: index merge question

2013-06-11 Thread Mark Miller
Yeah, you have to carefully manage things if you are map/reduce building indexes *and* updating documents in other ways. If your 'source' data for MR index building is the 'truth', you also have the option of not doing incremental index merging, and you could simply rebuild the whole thing eve

Re: index merge question

2013-06-10 Thread Jamie Johnson
Thanks Mark. My question stems from the new Cloudera Search stuff. My concern is that if, while the index is being rebuilt, someone updates a doc, that update could be lost from a Solr perspective. I guess what would need to happen to ensure the correct information was indexed would be to record th

Re: index merge question

2013-06-08 Thread Sourajit Basak
I have noticed that when I write a doc with an id that already exists, it creates a new revision with only the fields from the second write. I guess there is a REST API in the latest solr version which updates only selected fields. In my opinion, merge should be creating a doc which is a union

Re: index merge question

2013-06-08 Thread Mark Miller
On Jun 8, 2013, at 12:52 PM, Jamie Johnson wrote: > When merging through the core admin ( > http://wiki.apache.org/solr/MergingSolrIndexes) what is the policy for > conflicts during the merge? So for instance if I am merging core 1 and > core 2 into core 0 (first example), what happens if core

index merge question

2013-06-08 Thread Jamie Johnson
When merging through the core admin ( http://wiki.apache.org/solr/MergingSolrIndexes) what is the policy for conflicts during the merge? So for instance if I am merging core 1 and core 2 into core 0 (first example), what happens if core 1 and core 2 both have a document with the same key, say core

Re: index merge

2012-05-31 Thread sudarshan
es? Please explain the merge operation. Thanks, Sudarshan -- View this message in context: http://lucene.472066.n3.nabble.com/index-merge-tp472904p3987121.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: index merge

2010-05-20 Thread uma m
Hi All, The problem is resolved. It was purely due to the filesystem: mine was a 32-bit filesystem running on a 64-bit OS. I changed to a 64-bit filesystem and everything works as expected. Uma -- View this message in context: http://lucene.472066.n3.nabble.com/index-merge-tp472904p832053.html Sent from

Re: index merge

2010-05-19 Thread Ahmet Arslan
> I am running solr in 64 bit HP-UX system. The total > index size is about > 5GB and when I try to load any new document, solr tries to > merge the existing > segments first and results in following error. I could see > a temp file is > growing within index dir around 2GB in size and later it > fails

Re: index merge

2010-05-19 Thread uma m
uld anyone help me to resolve this exception? Regards, Uma -- View this message in context: http://lucene.472066.n3.nabble.com/index-merge-tp472904p829810.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: index merge

2010-03-11 Thread Mark Fletcher
Hi All, Thank you for the very valuable suggestions. I am planning to try using the Master - Slave configuration. Best Rgds, Mark. On Mon, Mar 8, 2010 at 11:17 AM, Mark Miller wrote: > On 03/08/2010 10:53 AM, Mark Fletcher wrote: > >> Hi Shalin, >> >> Thank you for the mail. >> My main purpose

Re: index merge

2010-03-08 Thread Mark Miller
On 03/08/2010 10:53 AM, Mark Fletcher wrote: Hi Shalin, Thank you for the mail. My main purpose of having 2 identical cores COREX - always serves user request COREY - every day once, takes the updates/latest data and passes it on to COREX. is:- Suppose say I have only one COREY and suppose a r

Re: index merge

2010-03-08 Thread Shalin Shekhar Mangar
Hi Mark, On Mon, Mar 8, 2010 at 9:23 PM, Mark Fletcher wrote: > > My main purpose of having 2 identical cores > COREX - always serves user request > COREY - every day once, takes the updates/latest data and passes it on to > COREX. > is:- > > Suppose say I have only one COREY and suppose a reque

Re: index merge

2010-03-08 Thread Mark Fletcher
Hi Shalin, Thank you for the mail. My main purpose of having 2 identical cores COREX - always serves user request COREY - every day once, takes the updates/latest data and passes it on to COREX. is:- Suppose say I have only one COREY and suppose a request comes to COREY while the update of the l

Re: index merge

2010-03-08 Thread Shalin Shekhar Mangar
Hi Mark, On Mon, Mar 8, 2010 at 7:38 PM, Mark Fletcher wrote: > > I ran the SWAP command. Now:- > COREX has the dataDir pointing to the updated dataDir of COREY. So COREX > has the latest. > Again, COREY (on which the update regularly runs) is pointing to the old > index of COREX. So this now doe

Re: index merge

2010-03-08 Thread Mark Fletcher
problem in using MERGE concept here. If it is wrong can some > one > > pls suggest the best approach. I tried the various merges explained in my > > previous mail. > > > > > Index merge happens at the Lucene level which has no idea about uniqueKeys. > Therefore when

Re: index merge

2010-03-08 Thread Shalin Shekhar Mangar
e various merges explained in my > previous mail. > > Index merge happens at the Lucene level which has no idea about uniqueKeys. Therefore when you merge two indexes containing exactly the same documents (by uniqueKey), you get double the document count. Looking at your scenario, it seems to
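The doubling described above can be illustrated with a toy model. This is only a sketch using plain Python lists to stand in for indexes; it does not call Lucene or Solr, and the function names and sample documents are made up for the example.

```python
def merge_without_unique_key(*indexes):
    """Concatenate documents the way a key-unaware merge would: nothing is overwritten."""
    merged = []
    for index in indexes:
        merged.extend(index)
    return merged


def dedupe_by_key(docs, key="id"):
    """Keep only the last document seen for each key value."""
    latest = {}
    for doc in docs:
        latest[doc[key]] = doc
    return list(latest.values())


# Two cores holding exactly the same documents (by uniqueKey).
corex = [{"id": "1", "title": "a"}, {"id": "2", "title": "b"}]
corey = [{"id": "1", "title": "a"}, {"id": "2", "title": "b"}]

merged = merge_without_unique_key(corex, corey)
unique = dedupe_by_key(merged)

print(len(merged), len(unique))
```

In this model the merged list holds four documents while only two distinct ids exist, which matches the doubled document count reported in the thread.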

Fwd: index merge

2010-03-07 Thread Mark Fletcher
-- From: Mark Fletcher Date: Sat, Mar 6, 2010 at 9:17 AM Subject: index merge To: solr-user@lucene.apache.org Cc: goks...@gmail.com Hi, I have a doubt regarding Index Merging:- I have set up 2 cores COREX and COREY. COREX - always serves user requests COREY - gets updated with the latest

index merge

2010-03-06 Thread Mark Fletcher
Hi, I have a doubt regarding Index Merging:- I have set up 2 cores COREX and COREY. COREX - always serves user requests COREY - gets updated with the latest values (dataDir is in a different location from COREX) I tried merging coreX and coreY at the end of COREY getting updated with the latest d