> Do you have any idea of why this happens?

Something is probably committing on every request, or appending a commitWithin param to the updates it sends. Can you grep the logs for occurrences of the word 'commit'? It's also possible to increase the log verbosity for LogUpdateProcessor.
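In case it helps with the log check, here is a minimal sketch in Python. It assumes the update request parameters show up in the log as name=value pairs (commit=true, softCommit=true, commitWithin=N are standard Solr update parameters); the function name and the log path in the comment are hypothetical, so adjust them to your setup.

```python
import re
from typing import Iterable, List, Tuple

# Request parameters that make a client-side update trigger a commit.
# Assumption: they appear in the log as plain name=value pairs.
EXPLICIT_COMMIT = re.compile(
    r"\b(commit=true|softCommit=true|commitWithin=\d+)", re.IGNORECASE
)
ANY_COMMIT = re.compile(r"commit", re.IGNORECASE)


def scan_for_commits(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (lines mentioning 'commit', lines carrying explicit commit params)."""
    mentions: List[str] = []
    explicit: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if ANY_COMMIT.search(line):
            mentions.append(line)
            if EXPLICIT_COMMIT.search(line):
                explicit.append(line)
    return mentions, explicit


# Hypothetical usage against a log file (path is a placeholder):
#   with open("/path/to/solr.log", encoding="utf-8", errors="replace") as fh:
#       mentions, explicit = scan_for_commits(fh)
#   print(len(mentions), len(explicit))
```

If the second list is non-empty, the commits are coming from the client requests rather than from the autoCommit settings in solrconfig.xml.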

On Thu, Dec 13, 2018 at 10:15 AM Danilo Tomasoni <tomas...@cosbi.eu> wrote:

> Hello I tried setting both autocommit and autosoftcommit to -1, but i
> still see the documents just seconds after indexing it.
>
> These are the actual configurations in <solrcorename>/conf/solrconfig.xml
>
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:9999999}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> <!-- softAutoCommit is like autoCommit except it causes a
>      'soft' commit which only ensures that changes are visible
>      but does not ensure that data is synced to disk. This is
>      faster and more near-realtime friendly than a hard commit.
> -->
>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:9999999}</maxTime>
> </autoSoftCommit>
>
> but even that way after every single POST to /update request handler, If
> I search * I see 1K documents more (i index in chunk of 1k documents).
>
> Do you have any idea of why this happens?
>
> On 12/12/18 17:16, Erick Erickson wrote:
> > The answer to your question is to set the interval to -1.
> >
> > however, for <autoCommit> that's a really bad idea. Why do you think
> > this will help with OOM errors? _Querying_ usually is the place OOMs
> > are generated, especially if you do things like facet on very
> > high-cardinality fields and/or do _not_ have docValues enabled for
> > fields you facet, group, or sort on.
>
> I have a single machine where I just index data, no concurrent querying
> is happening, that's why I don't care about visibility but just about
> speed/no crash.
>
> I'm planning to make a single hard commit at the end (roughly once every
> 500.000 docs)
>
> copy the final index to a clone machine where all the querying happens,
> to avoid OOM presumably generated by concurrent indexing/querying.
>
> I thought this can help lowering the solr memory requirements.
>
> We don't facet, group, sort. The default solr sorting by relevance is ok
> for us.
>
> We just have big edismax queries with sub-edismax queries with different
> mm values. Every sub-edismax query do have a lot (order of K) of
> alternative words/phrases.
>
> > If you do disable hard commits, your TLOG sizes will grow without
> > bound until your entire indexing run is complete. Worse, if the TLOG
> > replays due to abnormal restart, it would try to re-index everything.
> > Hard commits with openSearcher=false are recommended.
>
> yes I know, but I want to have the control on the time where the hard
> commit is triggered.
>
> It would also be nice to know when solr finishes the hard commit, so
> that I can stop sending POST request in that timeframe, but I haven't
> seen any API for that.
>
> Thank you for your help
>
> Danilo
>
> > Best,
> > Erick
> >
> > On Wed, Dec 12, 2018 at 4:44 AM Danilo Tomasoni <tomas...@cosbi.eu> wrote:
> >> I want to disable even that.
> >>
> >> I saw here
> >>
> >> https://lucene.apache.org/solr/guide/6_6/updatehandlers-in-solrconfig.html
> >>
> >> that probably to achieve what I want I just need to comment out the
> >> autoCommit tag.. correct?
> >>
> >> What do you think about disabling autocommit/autosoftcommit?
> >>
> >> it can lower the system requirements while indexing?
> >>
> >> What about transaction logs? they can be disabled?
> >>
> >> When solr crashes I always reimport from scratch because I don't expect
> >> that the documents accepted by solr between the last hard commit and the
> >> crash will be saved somewhere.
> >>
> >> But this article
> >>
> >> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >>
> >> says that solr is capable of restoring documents even if they weren't
> >> committed, is it still correct?
> >>
> >> Thank you
> >>
> >> Danilo
> >>
> >> On 12/12/18 13:33, Mikhail Khludnev wrote:
> >>> What about autoSoftCommit ?
> >>>
> >>> On Wed, Dec 12, 2018 at 3:24 PM Danilo Tomasoni <tomas...@cosbi.eu> wrote:
> >>>
> >>>> Hello, I'm experiencing oom while indexing a big amount of documents.
> >>>>
> >>>> The main idea to avoid OOM is to avoid commit (just one big commit at
> >>>> the end).
> >>>>
> >>>> Is this a correct idea?
> >>>>
> >>>> How can I disable autocommit?
> >>>>
> >>>> I've set
> >>>>
> >>>> <autoCommit>
> >>>>   <maxTime>${solr.autoCommit.maxTime:-1}</maxTime>
> >>>>   <openSearcher>false</openSearcher>
> >>>> </autoCommit>
> >>>>
> >>>> in solrconfig.xml
> >>>>
> >>>> but it's not sufficient, while indexing I still see documents.
> >>>>
> >>>> Thank you
> >>>>
> >>>> Danilo
> >>>>
> >>>> --
> >>>> Danilo Tomasoni
> >>>> COSBI
> >>>>
> >>>> As for the European General Data Protection Regulation 2016/679 on the
> >>>> protection of natural persons with regard to the processing of personal
> >>>> data, we inform you that all the data we possess are object of treatement
> >>>> in the respect of the normative provided for by the cited GDPR.
> >>>>
> >>>> It is your right to be informed on which of your data are used and how;
> >>>> you may ask for their correction, cancellation or you may oppose to their
> >>>> use by written request sent by recorded delivery to The Microsoft Research
> >>>> – University of Trento Centre for Computational and Systems Biology Scarl,
> >>>> Piazza Manifattura 1, 38068 Rovereto (TN), Italy.
> >>
> >> --
> >> Danilo Tomasoni
> >> COSBI
>
> --
> Danilo Tomasoni
> COSBI

--
Sincerely yours
Mikhail Khludnev