On 1/13/2015 12:10 AM, ig01 wrote:
> Unfortunately this is the case, we do have hundreds of millions of documents
> on one Solr instance/server. All our configs and schema are with default
> configurations. Our index size is 180G, does that mean that we need at
> least 180G heap size?
If you ha
Hi,
Unfortunately this is the case, we do have hundreds of millions of documents
on one Solr instance/server. All our configs and schema are with default
configurations. Our index size is 180G, does that mean that we need at
least 180G heap size?
Thanks.
On 1/10/2015 11:46 PM, ig01 wrote:
> Thank you all for your response,
> The thing is that we have a 180G index while half of it consists of deleted
> documents. We tried to run an optimization in order to shrink the index
> size, but it crashes with ‘out of memory’ when the process reaches 120G.
> Is it possible to optimize parts of the index?
Hi,
We gave 120G to the JVM, while we have 140G of memory on this machine.
We use the default merge policy ("TieredMergePolicy"), and there are 54
segments in our index.
We tried to perform an optimization with different numbers of maxSegments
(53 and less); it didn't help.
How much memory do we need for a 180G index?
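For reference, segment and deleted-document counts like the ones mentioned
above can be checked with the Luke request handler. A minimal sketch, assuming
a core named collection1 on localhost (adjust host and core name to your
setup):

# Inspect high-level index stats with the Luke request handler.
curl "http://localhost:8983/solr/collection1/admin/luke?numTerms=0&wt=json"
# In the "index" section of the response, maxDoc - numDocs is the number of
# documents that are flagged as deleted but not yet merged away.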
[ disclaimer: this worked for me, ymmv ... ]
I just battled this. It turns out that incrementally optimizing using the
maxSegments attribute was the most efficient solution for me, in particular
when you are actually running out of disk space.
#!/bin/bash
# n-segments I started with
high=400
# n-segments to end up with (host and core below are assumed; adjust to yours)
low=1
for ((n=high-1; n>=low; n--)); do
  curl "http://localhost:8983/solr/collection1/update?optimize=true&maxSegments=$n"
done
OK, why can't you give the JVM more memory, perhaps on
a one-time basis to get past this problem? You've never
told us how much memory you give the JVM in the first place.
Best,
Erick
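Raising the heap for a one-off optimize is usually just a matter of the -Xmx
flag on whatever command starts Solr. A minimal sketch, assuming the old-style
Jetty start.jar layout (the install path and heap sizes here are placeholders,
not recommendations):

# One-off start with a larger heap; leave enough RAM free for the OS page cache.
cd /path/to/solr/example   # hypothetical install path
java -Xms4g -Xmx32g -jar start.jar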
On Sun, Jan 11, 2015 at 7:54 AM, Jack Krupansky wrote:
> Usually, Lucene will be optimizing (merging) segments on the fly so that
> you should only have a fraction of your total deletions present in the
> index ...
Usually, Lucene will be optimizing (merging) segments on the fly so that
you should only have a fraction of your total deletions present in the
index and should never have an absolute need to do an old-fashioned full
optimize.
What merge policy are you using?
Is Solr otherwise running fine, other than this issue?
I believe if you delete all documents in a segment, that segment as a
whole goes away.
A segment is created on every commit whether you reopen the searcher
or not. Do you know which documents would be deleted later (are there
natural clusters)? If yes, perhaps there is a way to index them so that
documents likely to be deleted together end up in the same segments.
Hi,
It's not an option for us; all the documents in our index have the same
deletion probability.
Is there any other way to perform an optimization in order to reduce the
index size?
Thanks in advance.
Not directly related to your question, but you could look at this patch:
https://issues.apache.org/jira/browse/SOLR-6841 - it implements visualization
of Solr (Lucene) segments with exact information about how many deletions are
present in each segment. Looking at this one you could - of course, next
time - react ...
Maybe you should consider creating different generations of indexes and
not keep everything in one index. If the likelihood of documents being
deleted is rather high in, e.g., the first week or so, you could have
one index for the high-probability-of-deletion documents (the fresh
ones) and a second index for the older, more stable documents.
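A minimal sketch of what such generations could look like with plain
(non-SolrCloud) cores, querying both with one distributed request; the core
names, host, and fields below are made up for illustration:

# Index fresh documents into a "fresh" core, older ones into an "archive" core.
curl "http://localhost:8983/solr/fresh/update?commit=true" \
  -H 'Content-Type: application/json' \
  -d '[{"id":"doc-1","text":"new document"}]'
# Search both generations at once via distributed search.
curl "http://localhost:8983/solr/fresh/select?q=text:document&shards=localhost:8983/solr/fresh,localhost:8983/solr/archive"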
Thank you all for your response,
The thing is that we have a 180G index while half of it consists of deleted
documents. We tried to run an optimization in order to shrink the index
size, but it crashes with ‘out of memory’ when the process reaches 120G.
Is it possible to optimize parts of the index?
Please advise.
Well, we are doing the same thing (in a way). We have to do frequent mass
deletions; at a time we are deleting around 20M+ documents. All I am doing is,
after a deletion, firing the command below on each of our Solr nodes and
keeping some patience, as it takes quite a lot of time.
curl -vvv
"http://node1.so
Is there a specific list of which data structures are "sparse" and
"non-sparse" for Lucene and Solr (referencing the G+ post)? I imagine this
is obvious to low-level hackers, but it could actually be nice to
summarize it somewhere for troubleshooting.
Regards,
Alex.
Also see this G+ post I wrote up recently showing how the percentage of
deletions changes over time for an "every add also deletes a previous
document" stress test: https://plus.google.com/112759599082866346694/posts/MJVueTznYnD
Mike McCandless
http://blog.mikemccandless.com
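That "every add also deletes a previous document" pattern is easy to
reproduce: if the ids cycle through a bounded set, each add overwrites (and
therefore deletes) an earlier version of the document. A rough sketch,
assuming a local core named collection1 with id and text fields:

# Rough stress-test sketch: cycling through 1000 ids means almost every add
# also deletes a previous version of that document.
for i in $(seq 1 100000); do
  curl -s "http://localhost:8983/solr/collection1/update?commitWithin=10000" \
    -H 'Content-Type: application/json' \
    -d "[{\"id\":\"doc-$((i % 1000))\",\"text\":\"iteration $i\"}]" > /dev/null
done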
On Wed, Dec 31, 2014 at 12:21 PM,
It's usually not necessary to optimize; as more indexing happens you
should see background merges happen that'll reclaim the space, so I
wouldn't worry about it unless you're seeing actual problems that have
to be addressed. Here's a great visualization of the process:
http://blog.mikemccandless.c