Yeah it just seems weird that you would need to disable the buffer on the
source cluster though.

The docs say "Replicas do not need to buffer updates, and it is recommended
to disable buffer on the target SolrCloud" which means the source should
have it enabled.

But the fact that it's working for you proves otherwise . What version of
Solr are you running? I'll try reproducing this problem at my end and see
if it's a documentation gap or a bug.

On Mon, Jul 10, 2017 at 7:15 PM, Xie, Sean <sean....@finra.org> wrote:

> Yes. Documents are being sent to target. Monitoring the output from
> “action=queues”, depending your settings, you will see the documents
> replication progress.
>
> On the other hand, if enable the buffer, the lastprocessedversion is
> always returning -1. Reading the source code, the CdcrUpdateLogSynchroizer
> does not continue to do the clean if this value is -1.
>
> Sean
>
> On 7/10/17, 5:18 PM, "Varun Thacker" <va...@vthacker.in> wrote:
>
>     After disabling the buffer are you still seeing documents being
> replicated
>     to the target cluster(s) ?
>
>     On Mon, Jul 10, 2017 at 1:07 PM, Xie, Sean <sean....@finra.org> wrote:
>
>     > After several experiments and observation, finally make it work.
>     > The key point is you have to also disablebuffer on source cluster. I
> don’t
>     > know why in the wiki, it didn’t mention it, but I figured this out
> through
>     > the source code.
>     > Once disablebuffer on source cluster, the lastProcessedVersion will
> become
>     > a position number, and when there is hard commit, the old unused
> tlog files
>     > get deleted.
>     >
>     > Hope my finding can help other users who experience the same issue.
>     >
>     >
>     > On 7/10/17, 9:08 AM, "Michael McCarthy" <michael.mccar...@gm.com>
> wrote:
>     >
>     >     We have been experiencing this same issue for months now, with
> version
>     > 6.2.  No solution to date.
>     >
>     >     -----Original Message-----
>     >     From: Xie, Sean [mailto:sean....@finra.org]
>     >     Sent: Sunday, July 09, 2017 9:41 PM
>     >     To: solr-user@lucene.apache.org
>     >     Subject: [EXTERNAL] Re: CDCR - how to deal with the transaction
> log
>     > files
>     >
>     >     Did another round of testing, the tlog on target cluster is
> cleaned up
>     > once the hard commit is triggered. However, on source cluster, the
> tlog
>     > files stay there and never gets cleaned up.
>     >
>     >     Not sure if there is any command to run manually to trigger the
>     > updateLogSynchronizer. The updateLogSynchronizer already set at run
> at
>     > every 10 seconds, but seems it didn’t help.
>     >
>     >     Any help?
>     >
>     >     Thanks
>     >     Sean
>     >
>     >     On 7/8/17, 1:14 PM, "Xie, Sean" <sean....@finra.org> wrote:
>     >
>     >         I have monitored the CDCR process for a while, the updates
> are
>     > actively sent to the target without a problem. However the tlog size
> and
>     > files count are growing everyday, even when there is 0 updates to
> sent, the
>     > tlog stays there:
>     >
>     >         Following is from the action=queues command, and you can see
> after
>     > about a month or so running days, the total transaction are reaching
> to
>     > 140K total files, and size is about 103G.
>     >
>     >         <response>
>     >         <lst name="responseHeader">
>     >         <int name="status">0</int>
>     >         <int name="QTime">465</int>
>     >         </lst>
>     >         <lst name="queues">
>     >         <lst name="some_zk_url_list">
>     >         <lst name="MY_COLLECTION">
>     >         <long name="queueSize">0</long>
>     >         <str name="lastTimestamp">2017-07-07T23:19:09.655Z</str>
>     >         </lst>
>     >         </lst>
>     >         </lst>
>     >         <long name="tlogTotalSize">102740042616</long>
>     >         <long name="tlogTotalCount">140809</long>
>     >         <str name="updateLogSynchronizer">stopped</str>
>     >         </response>
>     >
>     >         Any help on it? Or do I need to configure something else?
> The CDCR
>     > configuration is pretty much following the wiki:
>     >
>     >         On target:
>     >
>     >           <requestHandler name="/cdcr" class="solr.
> CdcrRequestHandler">
>     >             <lst name="buffer">
>     >               <str name="defaultState">disabled</str>
>     >             </lst>
>     >           </requestHandler>
>     >
>     >           <updateRequestProcessorChain name="cdcr-processor-chain">
>     >             <processor class="solr.CdcrUpdateProcessorFactory"/>
>     >             <processor class="solr.RunUpdateProcessorFactory"/>
>     >           </updateRequestProcessorChain>
>     >
>     >           <requestHandler name="/update" class="solr.
>     > UpdateRequestHandler">
>     >             <lst name="defaults">
>     >               <str name="update.chain">cdcr-processor-chain</str>
>     >             </lst>
>     >           </requestHandler>
>     >
>     >           <updateHandler class="solr.DirectUpdateHandler2">
>     >             <updateLog class="solr.CdcrUpdateLog">
>     >               <str name="dir">${solr.ulog.dir:}</str>
>     >             </updateLog>
>     >             <autoCommit>
>     >               <maxTime>${solr.autoCommit.maxTime:180000}</maxTime>
>     >               <openSearcher>false</openSearcher>
>     >             </autoCommit>
>     >
>     >             <autoSoftCommit>
>     >               <maxTime>${solr.autoSoftCommit.maxTime:30000}<
> /maxTime>
>     >             </autoSoftCommit>
>     >           </updateHandler>
>     >
>     >         On source:
>     >           <requestHandler name="/cdcr" class="solr.
> CdcrRequestHandler">
>     >             <lst name="replica">
>     >               <str name="zkHost">${TargetZk}</str>
>     >               <str name="source">MY_COLLECTION</str>
>     >               <str name="target">MY_COLLECTION</str>
>     >             </lst>
>     >
>     >             <lst name="replicator">
>     >               <str name="threadPoolSize">1</str>
>     >               <str name="schedule">1000</str>
>     >               <str name="batchSize">128</str>
>     >             </lst>
>     >
>     >             <lst name="updateLogSynchronizer">
>     >               <str name="schedule">60000</str>
>     >             </lst>
>     >           </requestHandler>
>     >
>     >           <updateHandler class="solr.DirectUpdateHandler2">
>     >             <updateLog class="solr.CdcrUpdateLog">
>     >               <str name="dir">${solr.ulog.dir:}</str>
>     >             </updateLog>
>     >             <autoCommit>
>     >               <maxTime>${solr.autoCommit.maxTime:180000}</maxTime>
>     >               <openSearcher>false</openSearcher>
>     >             </autoCommit>
>     >
>     >             <autoSoftCommit>
>     >               <maxTime>${solr.autoSoftCommit.maxTime:30000}<
> /maxTime>
>     >             </autoSoftCommit>
>     >           </updateHandler>
>     >
>     >         Thanks.
>     >         Sean
>     >
>     >         On 7/8/17, 12:10 PM, "Erick Erickson" <
> erickerick...@gmail.com>
>     > wrote:
>     >
>     >             This should not be the case if you are actively sending
>     > updates to the
>     >             target cluster. The tlog is used to store unsent
> updates, so
>     > if the
>     >             connection is broken for some time, the target cluster
> will
>     > have a
>     >             chance to catch up.
>     >
>     >             If you don't have the remote DC online and do not intend
> to
>     > bring it
>     >             online soon, you should turn CDCR off.
>     >
>     >             Best,
>     >             Erick
>     >
>     >             On Fri, Jul 7, 2017 at 9:35 PM, Xie, Sean <
> sean....@finra.org>
>     > wrote:
>     >             > Once enabled CDCR, update log stores an unlimited
> number of
>     > entries. This is causing the tlog folder getting bigger and bigger,
> as well
>     > as the open files are growing. How can one reduce the number of open
> files
>     > and also to reduce the tlog files? If it’s not taken care properly,
> sooner
>     > or later the log files size and open file count will exceed the
> limits.
>     >             >
>     >             > Thanks
>     >             > Sean
>     >             >
>     >             >
>     >             > Confidentiality Notice::  This email, including
> attachments,
>     > may include non-public, proprietary, confidential or legally
> privileged
>     > information.  If you are not an intended recipient or an authorized
> agent
>     > of an intended recipient, you are hereby notified that any
> dissemination,
>     > distribution or copying of the information contained in or
> transmitted with
>     > this e-mail is unauthorized and strictly prohibited.  If you have
> received
>     > this email in error, please notify the sender by replying to this
> message
>     > and permanently delete this e-mail, its attachments, and any copies
> of it
>     > immediately.  You should not retain, copy or use this e-mail or any
>     > attachment for any purpose, nor disclose all or any part of the
> contents to
>     > any other person. Thank you.
>     >
>     >
>     >
>     >
>     >
>     >
>     >     Nothing in this message is intended to constitute an electronic
>     > signature unless a specific statement to the contrary is included in
> this
>     > message.
>     >
>     >     Confidentiality Note: This message is intended only for the
> person or
>     > entity to which it is addressed. It may contain confidential and/or
>     > privileged material. Any review, transmission, dissemination or
> other use,
>     > or taking of any action in reliance upon this message by persons or
>     > entities other than the intended recipient is prohibited and may be
>     > unlawful. If you received this message in error, please contact the
> sender
>     > and delete it from your computer.
>     >
>     >
>     >
>
>
>

Reply via email to