Gili, I was constantly checking the Cloud admin UI and it always stayed green, which is why I initially overlooked the sync issue. Finally, when all other options had dried up, I went to each node individually and queried it, and that is when I found the out-of-sync issue. I resolved it by shutting down the leader that was not syncing properly and letting another node become the leader, then reindexing all docs. Once the reindexing was done, I started the node that had been causing the issue and it synced properly :-)
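In case it helps anyone doing the same per-node check, here is a rough sketch of the idea, not the exact steps I ran. It assumes you query each core directly with distrib=false (the Solr request parameter that keeps a query on the core that receives it) and then compare the numFound values yourself. The class and method names (ReplicaCheck, coreQueryUrl, outOfSync), the host and core names, and the counts in main are all made up for illustration; fetching and parsing the responses is left out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
class ReplicaCheck {

    // Builds a select URL that asks a single core for its own match count.
    // rows=0 skips the documents; distrib=false stops Solr from fanning out.
    static String coreQueryUrl(String solrBaseUrl, String coreName, String query) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return solrBaseUrl + "/" + coreName
                + "/select?q=" + q + "&rows=0&distrib=false&wt=json";
    }

    // Given numFound per replica (collected by hand or by any HTTP client),
    // returns the replicas whose count differs from the most common count.
    // With a tie the "most common" value is ambiguous, so treat the result as a hint.
    static List<String> outOfSync(Map<String, Long> numFoundByReplica) {
        Map<Long, Integer> votes = new LinkedHashMap<>();
        for (long n : numFoundByReplica.values()) {
            votes.merge(n, 1, Integer::sum);
        }
        long majority = 0;
        int best = -1;
        for (Map.Entry<Long, Integer> e : votes.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                majority = e.getKey();
            }
        }
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, Long> e : numFoundByReplica.entrySet()) {
            if (e.getValue() != majority) {
                suspects.add(e.getKey());
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        // Made-up host and core names.
        System.out.println(coreQueryUrl("http://host1:8983/solr",
                "mycollection_shard1_replica1", "uuid:sun.org.mozilla*"));

        // Made-up counts: one replica disagrees with the rest.
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("host1", 0L);
        counts.put("host2", 800000L);
        counts.put("host3", 800000L);
        System.out.println("Check these replicas: " + outOfSync(counts));
    }
}
```

The admin UI only shows replica state, so a per-core count like this is one way to notice a replica that reports active but holds different data.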
Thanks

Ravi Kiran Bhaskar

On Mon, Sep 28, 2015 at 10:26 AM, Gili Nachum <gilinac...@gmail.com> wrote:

> Were all of the shard replicas in active state (green color in admin UI)
> before starting?
> Sounds like it, otherwise you won't hit the replica that is out of sync.
>
> Replicas can get out of sync, and report being in sync, after a sequence
> of stop/start w/o a chance to complete sync.
> See if it might have happened to you:
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201412.mbox/%3CCAOOKt53XTU_e0m2ioJ-S4SfsAp8JC6m-=nybbd4g_mjh60b...@mail.gmail.com%3E
>
> On Sep 27, 2015 06:56, "Ravi Solr" <ravis...@gmail.com> wrote:
>
> > Erick...There is only one type of String
> > "sun.org.mozilla.javascript.internal.NativeString:" and no other
> > variations of that in my index, so no question of missing it. Point
> > taken regarding the CURSORMARK stuff, yes you are correct, my head is so
> > numb at this point after working 3 days on this, I wasn't thinking
> > straight.
> >
> > BTW I found the real issue. I have a total of 8 servers in the Solr
> > cloud. The leader for this specific collection was the one that was
> > returning 0 for the searches. All other 7 servers had roughly 800K docs
> > still needing the string replacement. So maybe the real issue is sync
> > among servers. Just to prove it to myself I shut down the Solr that was
> > giving zero results (i.e. all uuid strings have already been somehow
> > devoid of spurious sun.org.mozilla.javascript.internal.NativeString on
> > that server). Now it ran perfectly fine and is about to finish, as the
> > last 103K are still left when I was writing this email.
> >
> > So the real question is how can we ensure that the sync is always
> > maintained, and what to do if it ever goes out of sync. I did see some
> > Jira tickets from previous 4.10.x versions where sync was an issue. Can
> > you please point me to any doc which says how SolrCloud
> > syncs/replicates?
> >
> > Thanks,
> >
> > Ravi Kiran Bhaskar
> >
> > On Sat, Sep 26, 2015 at 11:00 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> > > bq: 3. Erick, I wasnt getting all 1.4 mill in one shot. I was initially
> > > using 100 docs batch, which, I later increased to 500 docs per batch.
> > > Also it would not be a infinite loop if I commit for each batch, right !!??
> > >
> > > That's not the point at all. Look at the basic logic here:
> > >
> > > You run for a while processing 100 (or 500 or 1,000) docs per batch
> > > and change all uuid fields with this statement:
> > >
> > > uuid.replace("sun.org.mozilla.javascript.internal.NativeString:", "");
> > >
> > > and then update the doc. You run this as long as you have any docs
> > > that satisfy the query "q=uuid:sun.org.mozilla*", _changing_
> > > every one that has this string!
> > >
> > > At that point, theoretically, no document in your index has this
> > > string. So running your update program immediately after should find
> > > _zero_ documents.
> > >
> > > I've been assuming your complaint is that you don't process 1.4 M docs
> > > (in batches), you process some lower number then exit and you think
> > > this is wrong.
> > > I'm claiming that you should only expect to find as many docs as have
> > > been indexed since the last time the program ran.
> > >
> > > As far as the infinite loop is concerned, again trace the logic in the
> > > old code.
> > > Forget about commits and all the mechanics, just look at the logic.
> > > You're querying on "sun.org.mozilla*". But you only change if you get a
> > > match on "sun.org.mozilla.javascript.internal.NativeString:"
> > >
> > > Now imagine you have a doc that has sun.org.mozilla.erick in it. That
> > > doc gets returned from the query but does _not_ get modified because it
> > > doesn't match your pattern. In the older code, it would be found again
> > > and returned next time you queried. Then not modified again. Eventually
> > > you'd be in a position where you never changed any docs, just kept
> > > getting the same docList back over and over again. Marching through
> > > based on the unique key should not have the same potential issue.
> > >
> > > You should not be mixing the new query stuff with CURSORMARK. Deep
> > > paging supposes the exact same query is being run over and over and
> > > you're _paging_ through the results. You're changing the query every
> > > time so the results aren't very predictable.
> > >
> > > Best,
> > > Erick
> > >
> > > On Sat, Sep 26, 2015 at 5:01 PM, Ravi Solr <ravis...@gmail.com> wrote:
> > > > Erick & Shawn, I incorporated your suggestions.
> > > >
> > > > 0. Shut off all other indexing processes.
> > > > 1. As Shawn mentioned, set batch size to 10000.
> > > > 2. Loved Erick's suggestion about not using a filter at all, sorting by
> > > >    uniqueId, and putting the last known uniqueId as the next query's
> > > >    start, while still using cursor marks, as follows:
> > > >
> > > >    SolrQuery q = new SolrQuery("+uuid:sun.org.mozilla* +uniqueId:{" +
> > > >        markerSysId + " TO *]").setRows(10000)
> > > >        .addSort("uniqueId", ORDER.asc)
> > > >        .setFields(new String[]{"uniqueId", "uuid"});
> > > >    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
> > > >
> > > > 3. As per Shawn's advice, commented out autocommit and soft commit in
> > > >    solrconfig.xml, set openSearcher to false, and issued a MANUAL
> > > >    COMMIT for every batch from code as follows:
> > > >
> > > >    client.commit(true, true, true);
> > > >
> > > > Here is the log statement & results:
> > > > log.info("Indexed " + count + "/" + docList.getNumFound());
> > > >
> > > > 2015-09-26 17:29:57 INFO [a.b.c.AdhocCorrectUUID] - Indexed 90000/1344085
> > > > 2015-09-26 17:30:30 INFO [a.b.c.AdhocCorrectUUID] - Indexed 100000/1334085
> > > > 2015-09-26 17:33:26 INFO [a.b.c.AdhocCorrectUUID] - Indexed 110000/1324085
> > > > 2015-09-26 17:36:09 INFO [a.b.c.AdhocCorrectUUID] - Indexed 120000/1314085
> > > > 2015-09-26 17:39:42 INFO [a.b.c.AdhocCorrectUUID] - Indexed 130000/1304085
> > > > 2015-09-26 17:43:05 INFO [a.b.c.AdhocCorrectUUID] - Indexed 140000/1294085
> > > > 2015-09-26 17:46:14 INFO [a.b.c.AdhocCorrectUUID] - Indexed 150000/1284085
> > > > 2015-09-26 17:48:22 INFO [a.b.c.AdhocCorrectUUID] - Indexed 160000/1274085
> > > > 2015-09-26 17:48:25 INFO [a.b.c.AdhocCorrectUUID] - Indexed 160000/0
> > > > 2015-09-26 17:48:25 INFO [a.b.c.AdhocCorrectUUID] - FINISHED !!!
> > > >
> > > > Ran manually a second time to see if the first was a fluke. Still the same.
> > > >
> > > > 2015-09-26 17:55:26 INFO [a.b.c.AdhocCorrectUUID] - Indexed 10000/1264716
> > > > 2015-09-26 17:58:07 INFO [a.b.c.AdhocCorrectUUID] - Indexed 20000/1254716
> > > > 2015-09-26 18:03:09 INFO [a.b.c.AdhocCorrectUUID] - Indexed 30000/1244716
> > > > 2015-09-26 18:06:32 INFO [a.b.c.AdhocCorrectUUID] - Indexed 40000/1234716
> > > > 2015-09-26 18:10:35 INFO [a.b.c.AdhocCorrectUUID] - Indexed 50000/1224716
> > > > 2015-09-26 18:15:23 INFO [a.b.c.AdhocCorrectUUID] - Indexed 60000/1214716
> > > > 2015-09-26 18:15:24 INFO [a.b.c.AdhocCorrectUUID] - Indexed 60000/0
> > > > 2015-09-26 18:15:26 INFO [a.b.c.AdhocCorrectUUID] - FINISHED !!!
> > > >
> > > > Now changed the autocommit in solrconfig.xml as follows...Note the soft
> > > > commit has been shut off as per Shawn's advice
> > > >
> > > > <autoCommit>
> > > >   <!-- <maxDocs>100</maxDocs> -->
> > > >   <maxTime>300000</maxTime>
> > > >   <openSearcher>false</openSearcher>
> > > > </autoCommit>
> > > >
> > > > <!--
> > > > <autoSoftCommit>
> > > >   <maxTime>30000</maxTime>
> > > > </autoSoftCommit>
> > > > -->
> > > >
> > > > 2015-09-26 18:47:44 INFO [com.wpost.search.reindexing.AdhocCorrectUUID] - Indexed 10000/1205451
> > > > 2015-09-26 18:50:49 INFO [com.wpost.search.reindexing.AdhocCorrectUUID] - Indexed 20000/1195451
> > > > 2015-09-26 18:54:18 INFO [com.wpost.search.reindexing.AdhocCorrectUUID] - Indexed 30000/1185451
> > > > 2015-09-26 18:57:04 INFO [com.wpost.search.reindexing.AdhocCorrectUUID] - Indexed 40000/1175451
> > > > 2015-09-26 19:00:10 INFO [com.wpost.search.reindexing.AdhocCorrectUUID] - Indexed 50000/1165451
> > > > 2015-09-26 19:00:13 INFO [com.wpost.search.reindexing.AdhocCorrectUUID] - Indexed 50000/0
> > > > 2015-09-26 19:00:13 INFO [com.wpost.search.reindexing.AdhocCorrectUUID] - FINISHED !!!
> > > >
> > > > The query still returned 0 results when there are over a million docs
> > > > available which match uuid:sun.org.mozilla* ...Then why do I get 0 ???
> > > >
> > > > Thanks
> > > >
> > > > Ravi Kiran Bhaskar
> > > >
> > > > On Sat, Sep 26, 2015 at 3:49 PM, Ravi Solr <ravis...@gmail.com> wrote:
> > > >
> > > >> Thank you Erick & Shawn for taking significant time off your weekends
> > > >> to debug and explain in great detail. I will try to address the main
> > > >> points from your emails to provide more context for a better
> > > >> understanding of my situation.
> > > >>
> > > >> 1. Erick, as part of our upgrade from 4.7.2 to 5.3.0 I re-indexed all
> > > >> docs from my old Master-Slave to my SolrCloud using DIH
> > > >> SolrEntityProcessor, which used a Script Transformer. I unwittingly
> > > >> messed up the script and hence this 'uuid' (String Type field) got
> > > >> messed up. All records prior to Sep 20 2015 have this issue, which I am
> > > >> currently trying to rectify.
> > > >>
> > > >> 2. Regarding openSearcher=true/false, I had it as false all along in
> > > >> my 4.7.2 config. I read somewhere that SolrCloud or 5.x doesn't honor
> > > >> it or it should be left default (don't exactly remember where I read
> > > >> it), hence I removed it from my solrconfig.xml, going against my
> > > >> intuition :-)
> > > >>
> > > >> 3. Erick, I wasn't getting all 1.4 mill in one shot. I was initially
> > > >> using 100 docs batch, which I later increased to 500 docs per batch.
> > > >> Also it would not be an infinite loop if I commit for each batch,
> > > >> right !!??
> > > >>
> > > >> 4. Shawn, you are correct, the uuid is of String Type and it's not the
> > > >> unique key for my schema. My uniqueKey is uniqueId, and systemid is of
> > > >> no consequence here; it's another field for differentiating apps
> > > >> within my Solr.
> > > >>
> > > >> Thank you very much again guys. I will incorporate your suggestions
> > > >> and report back.
> > > >>
> > > >> Thanks
> > > >>
> > > >> Ravi Kiran Bhaskar
> > > >>
> > > >> On Sat, Sep 26, 2015 at 12:58 PM, Erick Erickson <erickerick...@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> Oh, one more thing. _assuming_ you can't change the indexing process
> > > >>> that gets the docs from the system of record, why not just add an
> > > >>> update processor that does this at index time? See:
> > > >>> https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors
> > > >>> In particular the StatelessScriptUpdateProcessorFactory might be a
> > > >>> good candidate. It just takes a bit of javascript (or other scripting
> > > >>> language) and changes the record before it gets indexed.
> > > >>>
> > > >>> FWIW,
> > > >>> Erick
> > > >>>
> > > >>> On Sat, Sep 26, 2015 at 9:52 AM, Shawn Heisey <apa...@elyograg.org>
> > > >>> wrote:
> > > >>> > On 9/26/2015 10:41 AM, Shawn Heisey wrote:
> > > >>> >> <autoCommit> <maxTime>300000</maxTime> </autoCommit>
> > > >>> >
> > > >>> > This needs to include openSearcher=false, as Erick mentioned. I'm
> > > >>> > sorry I screwed that up:
> > > >>> >
> > > >>> > <autoCommit>
> > > >>> >   <maxTime>300000</maxTime>
> > > >>> >   <openSearcher>false</openSearcher>
> > > >>> > </autoCommit>
> > > >>> >
> > > >>> > Thanks,
> > > >>> > Shawn
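
Erick's point in the quoted thread, that a prefix query can keep returning docs which the exact-string replace never changes, can be sketched without Solr at all. This is only a toy in-memory simulation under stated assumptions, not the actual reindexing code: the class name PrefixVsExactReplace, the list standing in for the index, and the sample values are made up (the "sun.org.mozilla.erick" value is Erick's own hypothetical).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: a list of uuid values stands in for the index.
class PrefixVsExactReplace {
    // What the query matches on (like q=uuid:sun.org.mozilla*).
    static final String QUERY_PREFIX = "sun.org.mozilla";
    // What the update code actually strips out.
    static final String BAD_STRING = "sun.org.mozilla.javascript.internal.NativeString:";

    // One pass of "query by prefix, then replace the exact string".
    // Returns how many values the query matched on this pass.
    static int runPass(List<String> index) {
        int matched = 0;
        for (int i = 0; i < index.size(); i++) {
            String uuid = index.get(i);
            if (uuid.startsWith(QUERY_PREFIX)) {
                matched++;
                // Only values containing BAD_STRING actually change.
                index.set(i, uuid.replace(BAD_STRING, ""));
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>(List.of(
                "sun.org.mozilla.javascript.internal.NativeString:abc-123",
                "sun.org.mozilla.erick",
                "def-456"));
        System.out.println("pass 1 matched: " + runPass(index));
        System.out.println("pass 2 matched: " + runPass(index));
        System.out.println("index now: " + index);
    }
}
```

The first pass matches two values and fixes one; every later pass matches the same unfixable value again, which is the loop Erick describes. Walking the index by unique key instead of re-running the prefix query avoids revisiting it.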