Thanks for responding, Erick. I set "start" to zero and "rows" always to 100. I create a CloudSolrClient instance and use it both to query and to index. But I do sleep for 5 secs just to allow for any auto commits.
So: query --> client.add(100 docs) --> wait --> query again

But the weird thing I noticed was that after 8 or 9 batches, i.e. 800/900 docs, the "query again" returns zero docs, causing my while loop to exit... so I was trying to see if I was doing the right thing or if there is an alternate way to do heavy indexing.

Thanks

Ravi Kiran Bhaskar

On Friday, September 25, 2015, Erick Erickson <erickerick...@gmail.com> wrote:

> How are you querying Solr? You say you query for 100 docs,
> update then get the next set. What are you using for a marker?
> If you're using the start parameter, and somehow a commit is
> creeping in things might be weird, especially if you're using any
> of the internal Lucene doc IDs. If you're absolutely sure no commits
> are taking place even that should be OK.
>
> The "deep paging" stuff could be helpful here, see:
>
> https://lucidworks.com/blog/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
>
> Best,
> Erick
>
> On Fri, Sep 25, 2015 at 3:13 PM, Ravi Solr <ravis...@gmail.com> wrote:
> > No problem Walter, it's all fun. Was just wondering if there was some other
> > good way that I did not know of, that's all 😀
> >
> > Thanks
> >
> > Ravi Kiran Bhaskar
> >
> > On Friday, September 25, 2015, Walter Underwood <wun...@wunderwood.org>
> > wrote:
> >
> >> Sorry, I did not mean to be rude. The original question did not say that
> >> you don’t have the docs outside of Solr. Some people jump to the advanced
> >> features and miss the simple ones.
> >>
> >> It might be faster to fetch all the docs from Solr and save them in files.
> >> Then modify them. Then reload all of them. No guarantee, but it is worth a
> >> try.
> >>
> >> Good luck.
> >>
> >> wunder
> >> Walter Underwood
> >> wun...@wunderwood.org
> >> http://observer.wunderwood.org/ (my blog)
> >>
> >>
> >> > On Sep 25, 2015, at 2:59 PM, Ravi Solr <ravis...@gmail.com> wrote:
> >> >
> >> > Walter, Not in a mood for banter right now.... Its 6:00pm on a friday and
> >> > Iam stuck here trying to figure reindexing issues :-)
> >> > I dont have source of docs so I have to query the SOLR, modify and put it
> >> > back and that is seeming to be quite a task in 5.3.0, I did reindex several
> >> > times with 4.7.2 in a master slave env without any issue. Since then we
> >> > have moved to cloud and it has been a pain all day.
> >> >
> >> > Thanks
> >> >
> >> > Ravi Kiran Bhaskar
> >> >
> >> > On Fri, Sep 25, 2015 at 5:25 PM, Walter Underwood <wun...@wunderwood.org>
> >> > wrote:
> >> >
> >> >> Sure.
> >> >>
> >> >> 1. Delete all the docs (no commit).
> >> >> 2. Add all the docs (no commit).
> >> >> 3. Commit.
> >> >>
> >> >> wunder
> >> >> Walter Underwood
> >> >> wun...@wunderwood.org
> >> >> http://observer.wunderwood.org/ (my blog)
> >> >>
> >> >>
> >> >>> On Sep 25, 2015, at 2:17 PM, Ravi Solr <ravis...@gmail.com> wrote:
> >> >>>
> >> >>> I have been trying to re-index the docs (about 1.5 million) as one of the
> >> >>> field needed part of string value removed (accidentally introduced). I was
> >> >>> issuing a query for 100 docs getting 4 fields and updating the doc (atomic
> >> >>> update with "set") via the CloudSolrClient in batches, However from time to
> >> >>> time the query returns 0 results, which exits the re-indexing program.
> >> >>>
> >> >>> I cant understand as to why the cloud returns 0 results when there are 1.4x
> >> >>> million docs which have the "accidental" string in them.
> >> >>>
> >> >>> Is there another way to do bulk massive updates ?
> >> >>>
> >> >>> Thanks
> >> >>>
> >> >>> Ravi Kiran Bhaskar
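
The cursor-based iteration that Erick's link describes can be sketched roughly as below. This is a minimal sketch and not code from the thread: to keep it runnable without a cluster, a TreeMap stands in for the Solr collection, and the names CursorPagingSketch, Page, fetchPage, reindexAll and CURSOR_START are all hypothetical. It does not diagnose why the query returned zero docs; it only shows a loop whose progress depends on a cursor over a fixed sort order rather than on re-running a start=0 query until it comes back empty. With SolrJ the corresponding pieces would be a sort that includes the uniqueKey field, CursorMarkParams.CURSOR_MARK_PARAM on the query (starting from CursorMarkParams.CURSOR_MARK_START), and QueryResponse.getNextCursorMark(), stopping when the returned cursor equals the one that was sent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class CursorPagingSketch {

    /** One page of ids plus the cursor to send with the next request. */
    static final class Page {
        final List<String> ids;
        final String nextCursor;

        Page(List<String> ids, String nextCursor) {
            this.ids = ids;
            this.nextCursor = nextCursor;
        }
    }

    /** Hypothetical start marker: sorts before every non-empty id. */
    static final String CURSOR_START = "";

    /**
     * In-memory stand-in for a query sorted by unique key: returns up to
     * `rows` ids strictly after `cursor`. If nothing is left, the cursor
     * is returned unchanged, which is the end-of-results signal.
     */
    static Page fetchPage(TreeMap<String, String> index, String cursor, int rows) {
        List<String> ids = new ArrayList<>();
        for (String id : index.tailMap(cursor, false).keySet()) {
            if (ids.size() == rows) {
                break;
            }
            ids.add(id);
        }
        String next = ids.isEmpty() ? cursor : ids.get(ids.size() - 1);
        return new Page(ids, next);
    }

    /**
     * Walks the whole index page by page and strips `junk` from every
     * value that contains it. Returns the number of docs changed.
     */
    static int reindexAll(TreeMap<String, String> index, String junk, int rows) {
        int updated = 0;
        String cursor = CURSOR_START;
        while (true) {
            Page page = fetchPage(index, cursor, rows);
            for (String id : page.ids) {
                String value = index.get(id);
                if (value.contains(junk)) {
                    index.put(id, value.replace(junk, ""));
                    updated++;
                }
            }
            // Stop only when the cursor did not advance, not merely
            // because a page happened to contain nothing to fix.
            if (page.nextCursor.equals(cursor)) {
                break;
            }
            cursor = page.nextCursor;
        }
        return updated;
    }

    public static void main(String[] args) {
        TreeMap<String, String> index = new TreeMap<>();
        for (int i = 0; i < 250; i++) {
            index.put(String.format("doc%04d", i), "headline JUNK " + i);
        }
        System.out.println("updated: " + reindexAll(index, "JUNK ", 100));
    }
}
```

Because the cursor walks every doc in key order, an update that makes a doc stop matching a filter does not shift the remaining pages, which is the property the start/rows loop lacks.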