_Why_ is reindexing not an option? 200M docs isn't that many. Since you have atomic updates working, your fields must already be retrievable (stored or docValues), so you could easily write a little program that pulls the docs from your existing collection and pushes them to a new one with the new schema.
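A rough sketch of what such a copy program's core loop might look like, under stated assumptions: `Page`, `CursorCopy`, `fetchPage`, `pushBatch` and `schemaFields` are hypothetical names made up for illustration, and the actual calls to Solr (SolrJ or plain HTTP) are deliberately left behind the two callbacks. The cursor handling follows Solr's documented rules: start with `cursorMark=*`, sort on the uniqueKey field, and stop when the returned `nextCursorMark` equals the one that was sent. Matching against a flat set of field names is a simplification; a real schema check would also have to honour any dynamic field patterns that remain.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical holder: one page of docs plus the cursor returned with it. */
final class Page {
    final List<Map<String, Object>> docs;
    final String nextCursorMark;

    Page(List<Map<String, Object>> docs, String nextCursorMark) {
        this.docs = docs;
        this.nextCursorMark = nextCursorMark;
    }
}

final class CursorCopy {
    /** Builds the query string for one cursor page; the sort must include the uniqueKey. */
    static String selectParams(String cursorMark, int rows, String uniqueKey) {
        return "q=" + enc("*:*")
                + "&sort=" + enc(uniqueKey + " asc")
                + "&rows=" + rows
                + "&cursorMark=" + enc(cursorMark);
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    /** Returns a copy of the doc that keeps only fields named in the new schema. */
    static Map<String, Object> stripUnknownFields(Map<String, Object> doc,
                                                  Set<String> schemaFields) {
        Map<String, Object> out = new LinkedHashMap<>(doc);
        out.keySet().retainAll(schemaFields);
        return out;
    }

    /**
     * Walks the source collection page by page and hands each cleaned batch to
     * pushBatch. fetchPage takes the cursorMark to send and returns that page.
     * Returns the number of docs copied.
     */
    static long copyAll(Function<String, Page> fetchPage,
                        Consumer<List<Map<String, Object>>> pushBatch,
                        Set<String> schemaFields) {
        String cursor = "*"; // "*" asks Solr to start a new cursor
        long copied = 0;
        while (true) {
            Page page = fetchPage.apply(cursor);
            List<Map<String, Object>> batch = new ArrayList<>();
            for (Map<String, Object> doc : page.docs) {
                batch.add(stripUnknownFields(doc, schemaFields));
            }
            if (!batch.isEmpty()) {
                pushBatch.accept(batch);
                copied += batch.size();
            }
            // An unchanged cursor means the result set is exhausted.
            if (cursor.equals(page.nextCursorMark)) {
                return copied;
            }
            cursor = page.nextCursorMark;
        }
    }
}
```

In this sketch, leaving `_version_` out of `schemaFields` also keeps the old version stamps from being sent to the new collection, which is probably what you want when the target starts out empty.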
Do use CursorMark if you try that...

You have to be ready to reindex as time passes, either to upgrade to a major version two greater than the one you're using now or because the requirements change yet again.

Best,
Erick

On Thu, Sep 19, 2019 at 12:36 AM Rahul Goswami <rahul196...@gmail.com> wrote:
>
> Eric, Markus,
> Thank you for your inputs. I made sure that the jar file is found correctly
> since the core reloads fine and also prints the log lines from my processor
> during update request (getInstance() method of the update factory). The
> reason why I want to insert the processor between distributed update
> processor (DUP) and run update processor (RUP) is because there are certain
> fields which were indexed against a dynamic field “*” and later the schema
> was patched to remove the * field, causing atomic updates to fail for such
> documents. Reindexing is not an option since the index has nearly 200 million
> docs. My understanding is that the atomic updates are stitched back to a
> complete document in the DUP before being reindexed by RUP. Hence if I am
> able to access the document before being indexed and check for fields which
> are not defined in the schema, I can remove them from the stitched back
> document so that the atomic update can happen successfully for such docs.
> The documentation below mentions that even if I don’t include the DUP in my
> chain it is automatically inserted just before RUP.
>
> https://lucene.apache.org/solr/guide/7_2/update-request-processors.html#custom-update-request-processor-chain
>
>
> I tried both approaches viz. explicitly specifying my processor after DUP
> in the chain and also tried using the “post-processor” option in the chain,
> to have the custom processor execute after DUP. Still looks like the
> processor is just short circuited. I have defined my logic in the
> processAdd() of the processor. Is this an expected behavior?
>
> Regards,
> Rahul
>
>
> On Wed, Sep 18, 2019 at 5:28 PM Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > It Depends (tm). This is a little confused. Why do you have
> > distributed processor in stand-alone Solr? Stand-alone doesn't, well,
> > distribute updates so that seems odd. Do try switching it around and
> > putting it on top, this should be OK since distributed is irrelevant.
> >
> > You can also just set a breakpoint and see for instance, the
> > instructions in the "IntelliJ" section here:
> > https://cwiki.apache.org/confluence/display/solr/HowToContribute
> >
> > One thing I'd do is make very, very sure that my jar file was being
> > found. IIRC, the -v startup option will log exactly where solr looks
> > for jar files. Be sure your custom jar is in one of them and is picked
> > up. I've set a lib directive to one place only to discover that
> > there's an old copy lying around someplace else....
> >
> > Best,
> > Erick
> >
> > On Wed, Sep 18, 2019 at 5:08 PM Markus Jelsma
> > <markus.jel...@openindex.io> wrote:
> > >
> > > Hello Rahul,
> > >
> > > I don't know why you don't see your logs lines, but if i remember
> > > correctly, you must put all custom processors above Log, Distributed and
> > > Run, at least i remember i read it somewhere a long time ago.
> > >
> > > We put all our custom processors on top of the three default processors
> > > and they run just fine.
> > >
> > > Try it.
> > >
> > > Regards,
> > > Markus
> > >
> > > -----Original message-----
> > > > From:Rahul Goswami <rahul196...@gmail.com>
> > > > Sent: Wednesday 18th September 2019 22:20
> > > > To: solr-user@lucene.apache.org
> > > > Subject: Custom update processor not kicking in
> > > >
> > > > Hello,
> > > >
> > > > I am using solr 7.2.1 in a standalone mode. I created a custom update
> > > > request processor and placed it between the distributed processor and run
> > > > update processor in my chain.
> > > > I made sure the chain is invoked since I see
> > > > log lines from the getInstance() method of my processor factory. But I
> > > > don’t see any log lines from the processAdd() method.
> > > >
> > > > Any inputs on why the processor is getting skipped if placed after
> > > > distributed processor?
> > > >
> > > > Thanks,
> > > > Rahul
> > > >
> > >