On Tue, 2015-06-16 at 09:54 -0700, Shenghua(Daniel) Wan wrote:
> Hi, Toke,
> Did you try MapReduce with solr? I think it should be a good fit for your
> use case.
Thanks for the suggestion. Improved logistics, such as starting the build of
a new shard while the previous shard is optimizing, would work.
Hi, Toke,
Did you try MapReduce with solr? I think it should be a good fit for your
use case.
On Tue, Jun 16, 2015 at 5:02 AM, Toke Eskildsen
wrote:
Shenghua(Daniel) Wan wrote:
> Actually, I am currently interested in how to boost merging/optimizing
> performance of single solr instance.
We have the same challenge (we build static 900GB shards one at a time and the
final optimization takes 8 hours with only 1 CPU core at 100%). I know that
I think your advice on future incremental updates is very useful. I will
keep an eye on that.
Actually, I am currently interested in how to boost merging/optimizing
performance of a single Solr instance.
Parallelism at the MapReduce level does not help merging/optimizing much,
unless Solr/Lucene internally parallelizes the merge.
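As a sketch of what "internally" would have to mean here: Lucene's
ConcurrentMergeScheduler can run several smaller merges in parallel while
indexing, but a forced merge down to one segment still finishes as a single
sequential merge, which matches the one-CPU-core observation above. A minimal
sketch against a recent Lucene API (paths and tuning values are illustrative):

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.ConcurrentMergeScheduler;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.store.FSDirectory;

public class MergeTuning {
    public static void main(String[] args) throws Exception {
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());

        // Let several merges run concurrently during the build.
        ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();
        cms.setMaxMergesAndThreads(6, 3); // maxMergeCount, maxThreadCount (illustrative)
        cfg.setMergeScheduler(cms);

        // Bigger target segments mean fewer merge rounds while indexing.
        TieredMergePolicy tmp = new TieredMergePolicy();
        tmp.setMaxMergedSegmentMB(5 * 1024); // illustrative
        cfg.setMergePolicy(tmp);

        try (IndexWriter writer = new IndexWriter(
                FSDirectory.open(Paths.get("/path/to/index")), cfg)) {
            // ... add documents ...
            // The final merge to one segment is still one sequential merge
            // on a single thread, no matter how the scheduler is tuned.
            writer.forceMerge(1);
        }
    }
}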
Ah, OK. For very slowly changing indexes, optimize can make sense.
Do note, though, that if you incrementally index after the full build, and
especially if you update documents, you're laying a trap for the future. Let's
say you optimize down to a single segment. The default TieredMergePolicy
tries
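One common way to avoid that trap is to merge down to a handful of segments
instead of one, leaving the merge policy room to work later. A minimal SolrJ
sketch (URL and segment count are illustrative; SolrClient.optimize takes
waitFlush, waitSearcher and maxSegments):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class PartialOptimize {
    public static void main(String[] args) throws Exception {
        try (SolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
            // Merge down to at most 8 segments instead of a single huge one,
            // so later incremental updates don't strand a giant segment full
            // of deletes that TieredMergePolicy won't touch for a long time.
            client.optimize(true, true, 8);
        }
    }
}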
Hi, Erick,
First, thanks for sharing the ideas. I am adding more context here
accordingly.
1. Why optimize? I have done some experiments to compare query response
times, and there is some difference. In addition, the searcher will be
customer-facing, so I think any performance boost will be
The first question is why you're optimizing at all. It's not recommended
unless you can demonstrate that an optimized index is giving you enough
of a performance boost to be worth the effort.
And why are you using the embedded Solr server? That's kind of unusual,
so I wonder if you've gone down a wrong path.
Hi,
Do you have any suggestions to improve the performance of merging and
optimizing an index?
I have been using the embedded Solr server to merge and optimize the index. I
am looking for the right parameters to tune. My use case has about 300
fields plus 250 copyFields, and a moderate doc size (about 65K
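For readers unfamiliar with the setup described here: the embedded server
runs Solr in-process against a local core, which is common for offline index
building. A minimal sketch, assuming a Solr 5.x-era API (solr home path and
core name are illustrative):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class OfflineBuild {
    public static void main(String[] args) throws Exception {
        // Solr home must contain solr.xml and the core's conf/ directory.
        CoreContainer container = new CoreContainer("/path/to/solr-home");
        container.load();
        try (SolrClient client = new EmbeddedSolrServer(container, "core1")) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "1");
            client.add(doc);
            client.commit();
            client.optimize(); // the expensive step this thread is about
        }
    }
}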
I think Mark found something similar -
https://issues.apache.org/jira/browse/SOLR-6838
On Sat, Feb 14, 2015 at 2:05 AM, Erick Erickson
wrote:
Exactly how are you issuing the commit? I'm assuming you're
using SolrJ. The server.commit(whatever, true) waits for the searcher
to be opened before returning. This includes (I believe) warmup
times. It could be that the warmup times are huge in your case; the
Solr logs should show you the autowarm times.
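For comparison, the commit variant that does not block on opening and warming
the new searcher; a minimal SolrJ sketch (URL is illustrative):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class CommitWithoutWait {
    public static void main(String[] args) throws Exception {
        try (SolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
            // commit(waitFlush, waitSearcher): with waitSearcher=false the call
            // returns once the data is flushed, without waiting for the new
            // searcher to be opened and autowarmed.
            client.commit(true, false);
        }
    }
}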
I wasn't able to follow Otis' answer but... the purpose of commit is to
make recent document changes (since the last commit) visible to
queries, and it has nothing to do with merging of segments. IOW, take the new
segment that is being created and not yet ready for use by queries, and
finish it so it can be searched.
Check http://search-lucene.com/?q=commit+wait+block&fc_type=mail+_hash_+user
e.g. http://search-lucene.com/m/QTPa7Sqx81
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/
On Fri, Feb 13, 2015 at 8:50 AM, Gili Nachum wrote:
Thanks Otis, can you confirm that a commit call will wait for merges to
complete before returning?
On Thu, Feb 12, 2015 at 8:46 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:
If you are using Solr and SPM for Solr, you can check a report that shows
the # of files in an index and the report that shows you the maxDocs minus
numDocs delta (i.e., deleted documents). If you see the # of files drop
during a commit, that's a merge. If you see a big delta change, that's
probably a merge, too.
You could a
Hello,
During a load test I noticed a commit that took 43 seconds to complete
(client hard commit).
Is this to be expected? What's causing it?
I have a pair of machines hosting a 128M-doc collection (8 shards,
replication factor=2).
Could it be merges? In Lucene, merges happen async of commits
/content/support/en/documentation/cloudera-search/cloudera-search-documentation-v1-latest.html
> > -- see the user guide pdf, "update-conflict-resolver" parameter
> >
> > James
> >
hi all.
I am trying to merge the indexes of two collections (Solr 4.6.1). Both
collections have the same index. I tried as follows:
http://localhost:8983/solr/admin/cores?action=mergeindexes&core=collection1&srcCore=collection1&srcCore=collection2
It was successful. But, although id is the unique key
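The same CoreAdmin call can be issued from SolrJ; a minimal sketch, assuming
the CoreAdminRequest.mergeIndexes helper (URL is illustrative, core names as
in the message above):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class MergeCores {
    public static void main(String[] args) throws Exception {
        // CoreAdmin requests go to the root Solr URL, not to a core.
        try (SolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            // Equivalent of admin/cores?action=mergeindexes&core=collection1
            //   &srcCore=collection1&srcCore=collection2
            CoreAdminRequest.mergeIndexes(
                "collection1",                               // target core
                new String[0],                               // no raw index dirs
                new String[] {"collection1", "collection2"}, // source cores
                client);
        }
    }
}

Note that, as the replies below explain, the merge is a raw Lucene operation,
so duplicate ids across the source cores are not collapsed.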
From my experience the Lucene mergeTool and the one invoked by
coreAdmin are pure Lucene implementations and do not understand the
concept of a uniqueKey (a Solr-land concept).
http://wiki.apache.org/solr/MergingSolrIndexes has a cautionary note
at the end.
We do frequent index merges for which
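Given that caution, one way to check whether a merged core ended up with
duplicated keys is to facet on the uniqueKey field with a mincount of 2. A
minimal SolrJ sketch (URL is illustrative; the uniqueKey field is assumed to
be id, as in this thread):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FindDuplicateKeys {
    public static void main(String[] args) throws Exception {
        try (SolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
            SolrQuery q = new SolrQuery("*:*");
            q.setRows(0);                // we only want the facet counts
            q.addFacetField("id");       // facet on the uniqueKey field
            q.setFacetMinCount(2);       // only ids occurring more than once
            q.setFacetLimit(100);        // first 100 duplicated ids
            QueryResponse rsp = client.query(q);
            FacetField dup = rsp.getFacetField("id");
            for (FacetField.Count c : dup.getValues()) {
                System.out.println(c.getName() + " occurs " + c.getCount() + " times");
            }
        }
    }
}

Faceting on a high-cardinality uniqueKey field is expensive, so this is a
one-off sanity check rather than something to run routinely.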
Yeah, you have to carefully manage things if you are map/reduce building
indexes *and* updating documents in other ways.
If your 'source' data for MR index building is the 'truth', you also have the
option of not doing incremental index merging, and you could simply rebuild
the whole thing every time.
Thanks Mark. My question is stemming from the new Cloudera Search stuff.
My concern is that if someone updates a doc while the index is being rebuilt,
that update could be lost from a Solr perspective. I guess what would need
to happen to ensure the correct information was indexed would be to record
th
I have noticed that when I write a doc with an id that already exists, it
creates a new revision with only the fields from the second write. I
guess there is a REST API in the latest Solr version which updates only
selected fields.
In my opinion, merge should be creating a doc which is a union of the
fields from both writes.
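The "REST API which updates only selected fields" is Solr's atomic updates
(Solr 4+). A minimal SolrJ sketch (URL and field names are illustrative;
atomic updates require the document's other fields to be stored so they can
be reconstructed):

import java.util.Collections;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AtomicUpdate {
    public static void main(String[] args) throws Exception {
        try (SolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1"); // uniqueKey of the existing document
            // The "set" modifier replaces just this field; the other stored
            // fields of doc-1 are carried over instead of being dropped.
            doc.addField("price", Collections.singletonMap("set", 9.99));
            client.add(doc);
            client.commit();
        }
    }
}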
On Jun 8, 2013, at 12:52 PM, Jamie Johnson wrote:
When merging through the core admin (
http://wiki.apache.org/solr/MergingSolrIndexes) what is the policy for
conflicts during the merge? So for instance if I am merging core 1 and
core 2 into core 0 (first example), what happens if core 1 and core 2 both
have a document with the same key, say core
Please explain the merge operation.
Thanks,
Sudarshan
Hi All,
The problem is resolved. It was purely due to the filesystem: my filesystem
was 32-bit, running on a 64-bit OS. I changed to a 64-bit filesystem and
everything works as expected.
Uma
> I am running Solr on a 64-bit HP-UX system. The total index size is about
> 5GB, and when I try to load any new document, Solr tries to merge the
> existing segments first, which results in the following error. I could see
> a temp file growing within the index dir to around 2GB in size, and later
> it fails
Could anyone help me to resolve this exception?
Regards,
Uma
Hi All,
Thank you for the very valuable suggestions.
I am planning to try using the Master - Slave configuration.
Best Rgds,
Mark.
Hi Mark,
On Mon, Mar 8, 2010 at 9:23 PM, Mark Fletcher
wrote:
Hi Shalin,
Thank you for the mail.
My main purpose of having 2 identical cores,
COREX - always serves user requests
COREY - once every day, takes the updates/latest data and passes it on to
COREX,
is this: suppose I have only one COREY, and suppose a request comes to COREY
while the update of the l
Hi Mark,
On Mon, Mar 8, 2010 at 7:38 PM, Mark Fletcher
wrote:
>
> I ran the SWAP command. Now:-
> COREX has the dataDir pointing to the updated dataDir of COREY. So COREX
> has the latest.
> Again, COREY (on which the update regularly runs) is pointing to the old
> index of COREX. So this now doe
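For reference, the SWAP described above is the CoreAdmin SWAP action
(admin/cores?action=SWAP&core=COREX&other=COREY). A minimal SolrJ sketch,
assuming the generic CoreAdminRequest API (URL is illustrative; core names
from this thread):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.CoreAdminRequest;
import org.apache.solr.common.params.CoreAdminParams.CoreAdminAction;

public class SwapCores {
    public static void main(String[] args) throws Exception {
        try (SolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            // After the swap, requests addressed to COREX are served by the
            // index that COREY just finished updating, and vice versa.
            CoreAdminRequest req = new CoreAdminRequest();
            req.setAction(CoreAdminAction.SWAP);
            req.setCoreName("COREX");
            req.setOtherCoreName("COREY");
            req.process(client);
        }
    }
}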
> ...problem in using the MERGE concept here. If it is wrong, can someone
> please suggest the best approach. I tried the various merges explained in
> my previous mail.
Index merge happens at the Lucene level which has no idea about uniqueKeys.
Therefore when you merge two indexes containing exactly the same documents
(by uniqueKey), you get double the document count.
Looking at your scenario, it seems to
--
From: Mark Fletcher
Date: Sat, Mar 6, 2010 at 9:17 AM
Subject: index merge
To: solr-user@lucene.apache.org
Cc: goks...@gmail.com
Hi,
I have a question regarding index merging:
I have set up 2 cores, COREX and COREY.
COREX - always serves user requests
COREY - gets updated with the latest values (dataDir is in a different
location from COREX)
I tried merging COREX and COREY at the end of COREY getting updated with the
latest d