This sounds interesting, I'll check this out.

Thanks!
Elisabeth

2014-04-02 8:54 GMT+02:00 Dmitry Kan <solrexp...@gmail.com>:

> Thanks, Markus, that is useful.
> I'm guessing the higher the weight, the longer the op takes?
>
> On Tue, Apr 1, 2014 at 10:39 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
> > You may want to increase reclaimDeletesWeight for TieredMergePolicy from 2
> > to 3 or 4. By default it may keep too many deleted or updated docs in the
> > index. This can increase index size by 50%!!
> >
> > Dmitry Kan <solrexp...@gmail.com> wrote:
> >
> > Elisabeth,
> >
> > Yes, I believe you are right in that the deletes are part of the optimize
> > process. If you delete often, you may consider (if not already) the
> > TieredMergePolicy, which is suited for this scenario. Check out this
> > relevant discussion I had with Lucene committers:
> > https://twitter.com/DmitryKan/status/399820408444051456
> >
> > HTH,
> >
> > Dmitry
> >
> > On Tue, Apr 1, 2014 at 11:34 AM, elisabeth benoit <elisaelisael...@gmail.com> wrote:
> >
> > > Thanks a lot for your answers!
> > >
> > > Shawn. Our GC configuration has far fewer parameters defined, so we'll
> > > check this out.
> > >
> > > Dmitry, about the expungeDeletes option, we'll add that in the delete
> > > process. But from what I read, this is done in the optimize process (cf.
> > > http://lucene.472066.n3.nabble.com/Does-expungeDeletes-need-calling-during-an-optimize-td1214083.html
> > > ). Or maybe not?
> > >
> > > Thanks again,
> > > Elisabeth
> > >
> > > 2014-04-01 7:52 GMT+02:00 Dmitry Kan <solrexp...@gmail.com>:
> > >
> > > > Hi,
> > > >
> > > > We have noticed something like this as well, but with an older version
> > > > of Solr, 3.4. In our setup we delete documents pretty often. Internally
> > > > in Lucene, when a client requests that a document be deleted, it is not
> > > > physically deleted, but only marked as "deleted". Our original
> > > > assumption about optimization was that the "deleted" documents would
> > > > get physically removed on each optimize command issued. We started to
> > > > suspect it wasn't always true as the shards (especially relatively
> > > > large shards) became slower over time. So we found out about the
> > > > expungeDeletes option, which purges the "deleted" docs and is false by
> > > > default. We have set it to true. If your Solr update lifecycle includes
> > > > frequent deletes, try this out.
> > > >
> > > > This of course does not override working towards finding better
> > > > GC parameters.
> > > >
> > > > https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
> > > >
> > > > On Mon, Mar 31, 2014 at 3:57 PM, elisabeth benoit <elisaelisael...@gmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > We are currently using Solr 4.2.1. Our index is updated on a daily
> > > > > basis. After noticing that Solr query time had increased (to twice
> > > > > the initial value) without any change in index size or in Solr
> > > > > configuration, we tried an optimize on the index but it didn't fix
> > > > > our problem. We checked the garbage collector, but everything seemed
> > > > > fine. What did in fact fix our problem was to delete all documents
> > > > > and reindex from scratch.
> > > > >
> > > > > It looks like over time our index gets "corrupted" and optimize
> > > > > doesn't fix it. Does anyone have a clue how to investigate this
> > > > > situation further?
> > > > >
> > > > > Elisabeth
> > > >
> > > > --
> > > > Dmitry
> > > > Blog: http://dmitrykan.blogspot.com
> > > > Twitter: http://twitter.com/dmitrykan
> >
> > --
> > Dmitry
>
> --
> Dmitry
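
A minimal sketch of the two ideas discussed in the thread above: estimating how much of an index consists of documents that are marked as deleted but not yet purged, and building an update request that commits with expungeDeletes=true. The ratio is computed from the index's numDocs and maxDoc counters; the counts, the host, and the core name used below are assumptions for illustration only, and the helper names are hypothetical, not part of any Solr client.

```python
from urllib.parse import urlencode


def deleted_ratio(num_docs: int, max_doc: int) -> float:
    """Return the fraction of index entries that are marked as deleted.

    num_docs counts live documents; max_doc also counts documents that
    are marked as deleted but still physically present in the segments.
    """
    if max_doc <= 0:
        return 0.0
    return (max_doc - num_docs) / max_doc


def expunge_deletes_url(base_url: str, core: str) -> str:
    """Build an update-handler URL that commits with expungeDeletes=true."""
    params = urlencode({"commit": "true", "expungeDeletes": "true"})
    return f"{base_url.rstrip('/')}/{core}/update?{params}"


if __name__ == "__main__":
    # Hypothetical counters: 1,000,000 live docs out of 1,500,000 entries.
    ratio = deleted_ratio(num_docs=1_000_000, max_doc=1_500_000)
    print(f"deleted ratio: {ratio:.2%}")

    # Assumed local Solr instance and core name; adjust to your setup.
    print(expunge_deletes_url("http://localhost:8983/solr", "collection1"))
```

If the ratio stays high after an optimize or a commit with expungeDeletes, that would suggest looking at the merge policy settings Markus mentions rather than at the delete process itself.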