Hi,
On 15/06/16 03:18, Bharath Kumar wrote:
> Hi Renaud,
> Thank you so much for your response. It is very helpful and it helped me
> understand the need for turning on buffering.
> Is it recommended to keep the buffering enabled all the time on the
> source cluster? If the target cluster is up and running and the cdcr is
> started, can I turn off the buffering on the source site?
Yes, there is no need to keep buffering on once your target cluster is up
and running and cdcr replication is started.
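
As a rough illustration of turning the buffer off once replication is running, here is a minimal Python sketch that calls the CDCR request handler with the DISABLEBUFFER action. The base URL and collection name are placeholders, and the sketch assumes the handler is registered under /cdcr in solrconfig.xml.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def cdcr_url(base_url, collection, action):
    """Build the URL of a CDCR API call for one collection.

    base_url and collection are placeholders for your own setup; the
    "/cdcr" path assumes the request handler is registered under that name.
    """
    query = urlencode({"action": action, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/cdcr?{query}"


def cdcr_call(base_url, collection, action, timeout=10):
    """Send one CDCR action (e.g. STATUS, START, DISABLEBUFFER) and return the JSON reply."""
    with urlopen(cdcr_url(base_url, collection, action), timeout=timeout) as resp:
        return json.load(resp)
```

For example, `cdcr_call("http://source-host:8983/solr", "mycollection", "DISABLEBUFFER")` would ask the source collection to stop buffering, and the same helper with `"STATUS"` can be used first to check that replication is actually started.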
> As you have mentioned, the transaction logs are kept on the source
> cluster, until the data is replicated on the target cluster, once the
> cdcr is started. Is there a possibility that the target cluster gets out
> of sync with the source cluster and we need to do a hard recovery from
> the source cluster to sync up the target cluster?
If the target cluster goes down while cdcr is replicating, there should
be no loss of information. The source cluster will retry from time to
time to reach the target, and will continue the replication once the
target cluster is back up and running. Until it can resume
communication, the source cluster keeps a pointer to where the
replication should resume, and therefore the update log will not be
cleaned past this point.
The pointer on the source cluster is not persistent (maybe that could be
something to implement). Therefore if the source cluster is restarted,
the pointer will be lost, and the buffer should be activated until the
target cluster is up and running.
> Also I have the below configuration on the source cluster to synchronize
> the update logs.
>   <lst name="updateLogSynchronizer">
>     <str name="schedule">1000</str>
>   </lst>
>
> Regarding the monitoring of the replication, I am planning to add a
> script to check the queue size, to make sure the disk does not fill up
> in case the target site is down and the transaction log size keeps
> growing on the source site.
> Is there any other recommended approach?
The best approach is to use the monitoring API, which provides some
metrics on how the replication is going. The cwiki [1] also has some
recommendations on how to monitor the system.
[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62687462
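
To make the monitoring idea a bit more concrete, here is a hedged Python sketch of a check built on the QUEUES action of the monitoring API. The response layout it parses (a "queues" entry holding nested key/value lists, plus "tlogTotalSize") is my assumption of what the handler returns and should be verified against your Solr version; the thresholds are placeholders, and I assume tlogTotalSize is reported in bytes.

```python
def flat_to_dict(value):
    """Turn Solr's flat key/value JSON lists ([k1, v1, k2, v2, ...]) into nested dicts.

    Any even-length list whose even-indexed items are all strings is treated
    as key/value pairs; everything else is returned unchanged.
    """
    if isinstance(value, dict):
        return {k: flat_to_dict(v) for k, v in value.items()}
    if (isinstance(value, list) and len(value) % 2 == 0
            and all(isinstance(k, str) for k in value[0::2])):
        return {k: flat_to_dict(v) for k, v in zip(value[0::2], value[1::2])}
    return value


def queue_sizes(response):
    """Return {(target, collection): queueSize} from a parsed QUEUES response."""
    sizes = {}
    for target, collections in flat_to_dict(response.get("queues", [])).items():
        for collection, stats in collections.items():
            sizes[(target, collection)] = stats.get("queueSize")
    return sizes


def alerts(response, max_queue_size, max_tlog_bytes):
    """List human-readable warnings when a queue or the tlog size exceeds a threshold."""
    problems = []
    for (target, collection), size in queue_sizes(response).items():
        if size is not None and size > max_queue_size:
            problems.append(
                f"queue to {collection}@{target} has {size} pending updates")
    tlog_size = response.get("tlogTotalSize", 0)
    if tlog_size > max_tlog_bytes:
        problems.append(f"transaction logs use {tlog_size} bytes")
    return problems
```

A cron job could fetch `<collection>/cdcr?action=QUEUES&wt=json` from the source cluster, pass the parsed JSON to `alerts(...)`, and send a notification whenever the returned list is not empty. Checking the tlog size as well as the queue size covers the case where the buffer was left enabled by mistake.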
Kind Regards
--
Renaud Delbru
> Thanks again, your inputs were very helpful.
On Tue, Jun 14, 2016 at 6:50 AM, Davis, Daniel (NIH/NLM) [C]
<daniel.da...@nih.gov> wrote:
I must chime in to clarify something - in case 2, would the
source cluster eventually start a log reader on its own? That
is, would the CDCR heal over time, or would manual action be
required?
-----Original Message-----
From: Renaud Delbru [mailto:renaud@siren.solutions]
Sent: Tuesday, June 14, 2016 4:51 AM
To: solr-user@lucene.apache.org
Subject: Re: Regarding CDCR SOLR 6
Hi Bharath,
The buffer is useful when you need to buffer updates on the
source cluster before starting cdcr, if the source cluster might
receive updates in the meantime and you want to be sure not to
miss them.
To understand this better, you need to understand how cdcr cleans
transaction logs. When started (with the START action), cdcr
instantiates a log reader for each target cluster. The position
of each log reader tells cdcr which transaction logs it
can clean. If all the log readers are beyond a certain point,
then cdcr can clean all the transaction logs up to this point.
However, there might be cases when the source cluster will be up
without any log readers instantiated:
1) The source cluster is started, but cdcr is not started yet.
2) The source cluster is started, cdcr is started, but the
target cluster was not accessible when cdcr was started. In this
case, cdcr will not be able to instantiate a log reader for this
cluster.
In these two scenarios, if updates are received by the source
cluster, then they might be cleaned out from the transaction log
as per the normal update log cleaning procedure.
That is where the buffer becomes useful. If you know that you
will be in one of these two scenarios while starting up your
clusters and cdcr, you can activate the buffer to be sure
not to miss updates. Then, when the source and target clusters
are properly up and cdcr replication is properly started, you
can turn off this buffer.
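
The cleaning rule described above can be summarised with a small toy model in Python. This is only an illustration of the reasoning, not Solr's actual code: positions are abstract increasing integers, the function and parameter names are made up, and the "no reader" branch ignores the fact that, as far as I understand, the normal cleanup still keeps some recent logs.

```python
def cleanable_up_to(reader_positions, latest_position, buffer_enabled):
    """Return the log position up to which transaction logs may be removed.

    reader_positions: {target_name: position of that target's log reader}
    latest_position:  position of the most recent update on the source
    buffer_enabled:   True while the cdcr buffer is on
    """
    if buffer_enabled:
        # Buffer on: nothing is removed, so the logs keep growing.
        return 0
    if not reader_positions:
        # No log reader (cdcr not started, or target unreachable at START):
        # normal cleanup applies, so unreplicated updates can be lost.
        return latest_position
    # Logs are only removed up to the slowest reader.
    return min(reader_positions.values())
```

With two targets at positions 40 and 25, logs can be cleaned up to 25; with no reader and no buffer, everything up to the latest update is eligible for cleanup, which is exactly the situation the buffer protects against.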
--
Renaud Delbru
On 14/06/16 06:41, Bharath Kumar wrote:
> Hi,
>
> I have set up cross data center replication using Solr 6. I want to
> know why the buffer needs to be enabled on the source cluster. Even if
> the buffer is not enabled, I am able to replicate the data between
> source and target sites. What are the advantages of enabling the buffer
> on the source site? If I enable the buffer, the transaction logs are
> never deleted and over a period of time we are running out of disk.
> Can you please let me know why enabling the buffer is required?
>
--
Thanks & Regards,
Bharath MV Kumar
"Life is short, enjoy every moment of it"