November 01, 2011 10:58 PM
To: solr-user@lucene.apache.org
Subject: RE: large scale indexing issues / single threaded bottleneck
Roman,
How frequently do you update your index? I have a need to do real time
add/delete to SOLR documents at a rate of approximately 20/min.
The total number of documents
We have a rate of 2K small docs/sec, which translates into 90 GB/day of index
space.
You should be fine.
Roman
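
For a rough sense of scale, the two rates quoted in this exchange can be compared with a back-of-the-envelope calculation. This is only a sketch: it assumes the 2K docs/sec rate is sustained around the clock and that "GB" means 10^9 bytes, neither of which is stated in the thread. The function and variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def docs_per_day(docs_per_sec):
    """Documents indexed in one day at a sustained rate (assumption: 24h load)."""
    return docs_per_sec * SECONDS_PER_DAY


def index_bytes_per_doc(gb_per_day, docs_per_sec):
    """Average index space per document, taking 1 GB = 10**9 bytes (assumption)."""
    return gb_per_day * 1e9 / docs_per_day(docs_per_sec)


roman_rate = 2000.0        # "2K small docs/sec"
shishir_rate = 20 / 60.0   # "approximately 20/min", expressed per second

print("docs/day at 2K docs/sec:", docs_per_day(roman_rate))
print("index bytes per doc:", round(index_bytes_per_doc(90, roman_rate)))
print("rate ratio:", round(roman_rate / shishir_rate))
```

Under those assumptions, 90 GB/day works out to roughly half a kilobyte of index per document, and 20 updates/min is several thousand times below the rate described above, which is consistent with "You should be fine."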
From: Roman Alekseenkov [mailto:ralekseen...@gmail.com]
Sent: Sunday, October 30, 2011 6:11 PM
To: solr-user@lucene.apache.org
Subject: Re: large scale indexing issues / single threaded bottleneck
Guys, thank you for all the replies.
I think I have figured out a partial solution for the problem on Friday
Yonik,
Adding overwrite=false doesn't help. XMLLoader doesn't check this HTTP
parameter. Instead, it checks the attribute of the same name in the XML tag.
-Kiril
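
To illustrate the point above, here is a minimal sketch of building an XML update message that carries overwrite as an attribute on the <add> tag rather than as an HTTP parameter. The helper name build_add_message and the field names are hypothetical, and the sketch only constructs the XML string; it does not post it to Solr.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs, overwrite=False):
    """Build a Solr XML <add> message with the overwrite attribute set on the tag.

    `docs` is a list of dicts mapping field name -> value (hypothetical fields).
    """
    add = ET.Element("add", {"overwrite": "true" if overwrite else "false"})
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


message = build_add_message([{"id": "doc-1", "title": "hello"}])
print(message)
```

Note that with overwrite="false" Solr does not replace an existing document that has the same unique key, so this only makes sense when the IDs being sent are known to be new.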
Guys, thank you for all the replies.
I think I have figured out a partial solution for the problem on Friday
night. Adding a whole bunch of debug statements to the info stream showed
that every document is following the "update document" path instead of the "add
document" path, meaning that all documents I
Roman:
2) what would be the best way to port these (and only these) changes
to 3.4.0? I tried to dig into the branching and revisions, but got
lost quickly. Tried something like "svn diff
[…]realtime_search@r953476 […]realtime_search@r1097767", but I'm not
sure if it's even possible to merge th
On Sat, Oct 29, 2011 at 6:35 AM, Michael McCandless
wrote:
> I saw a mention somewhere that you can tell Solr to use
> IW.addDocument (not IW.updateDocument) when you add a document if you
> are certain it's not replacing a previous document with the same ID
Right - adding overwrite=false to
On Fri, Oct 28, 2011 at 3:27 PM, Simon Willnauer
wrote:
> one more thing, after somebody (thanks robert) pointed me at the
> stacktrace it seems kind of obvious what the root cause of your
> problem is. It's Solr :) Solr closes the IndexWriter on commit, which is
> very wasteful since you basically
> abstract away the encoding of the index
Robert, this is what you wrote. "Abstract away the encoding of the
index" means pluggable, otherwise it's not abstract and / or it's a
flawed design. Sounds like it's the latter.
On Fri, Oct 28, 2011 at 8:10 PM, Jason Rutherglen
wrote:
>> Otherwise we have "flexible indexing" where "flexible" means "slower
>> if you do anything but the default".
>
> The other encodings should exist as modules since they are pluggable.
> 4.0 can ship with the existing codec. 4.1 with addit
> Otherwise we have "flexible indexing" where "flexible" means "slower
> if you do anything but the default".
The other encodings should exist as modules since they are pluggable.
4.0 can ship with the existing codec. 4.1 with additional codecs and
the bulk postings at a later time.
Otherwise it
On Fri, Oct 28, 2011 at 5:03 PM, Jason Rutherglen
wrote:
> +1 I suggested it should be backported a while back. Or that Lucene
> 4.x should be released. I'm not sure what is holding up Lucene 4.x at
> this point; bulk postings is only useful for PFOR.
This is not true, most modern index
> We should maybe try to fix this in 3.x too?
+1 I suggested it should be backported a while back. Or that Lucene
4.x should be released. I'm not sure what is holding up Lucene 4.x at
this point; bulk postings is only useful for PFOR.
On Fri, Oct 28, 2011 at 3:27 PM, Simon Willnauer
wro
On Fri, Oct 28, 2011 at 9:17 PM, Simon Willnauer
wrote:
> Hey Roman,
>
> On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov
> wrote:
>> Hi everyone,
>>
>> I'm looking for some help with Solr indexing issues on a large scale.
>>
>> We are indexing a few terabytes/month on a sizeable Solr cluster (8
Hey Roman,
On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov
wrote:
> Hi everyone,
>
> I'm looking for some help with Solr indexing issues on a large scale.
>
> We are indexing a few terabytes/month on a sizeable Solr cluster (8
> masters / serving writes, 16 slaves / serving reads). After certain
I'm wondering if this is relevant:
https://issues.apache.org/jira/browse/LUCENE-2680 - Improve how
IndexWriter flushes deletes against existing segments
Roman
On Fri, Oct 28, 2011 at 11:38 AM, Roman Alekseenkov
wrote:
> Hi everyone,
>
> I'm looking for some help with Solr indexing issues on a la