On 2016-03-14 13:40, Adam Kocoloski wrote:
My current solution watches the global _changes feed and fires up a
continuous replication to an off-site server whenever it sees a
change. If it doesn't see a change from a database for 10 minutes, it
kills that replication. This means I only have ~150 active
replications running on average at any given time.
How about instead of using continuous replications and killing them,
use non-continuous replications based on _db_updates? They end
automatically and should use fewer resources then.
Best
Jan
--
In my opinion this is actually a design we should adopt for CouchDB’s
own replication manager. Keeping all those _changes listeners running
is needlessly expensive now that we have _db_updates.
Adam
I like Jan's solution, and I'm going to try and use it for a project,
but it can't be the only solution, at least for one of my cases. In this
case I work with devices behind NAT that need to pull data from updated
remote databases with reasonably low latency.
The source database can't start a push replication on _db_updates,
because it can't find the target database behind NAT. "Pull"
replications have to be started by the target database (the NATted
device), which doesn't know whether the source database is updated
unless it's on a continuous replication.
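
To make the pull case concrete, here is a minimal sketch of the request
the NATted device would send to its own local CouchDB. The URLs, the
database names and the two helper function names are made up for
illustration; only the /_replicate endpoint and the "source", "target"
and "continuous" fields come from CouchDB itself.

```python
import json
import urllib.request


def pull_replication_body(remote_source, local_target):
    """Build the JSON body for a continuous pull replication.

    Hypothetical helper: remote_source is the publicly reachable
    database URL, local_target is the database on the NATted device.
    """
    return {
        "source": remote_source,
        "target": local_target,
        # Continuous, so the device learns about new changes without
        # the source ever having to reach it through the NAT.
        "continuous": True,
    }


def start_replication(local_couch_url, body):
    """POST the body to the local CouchDB's /_replicate endpoint."""
    request = urllib.request.Request(
        local_couch_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

On the device this would be called roughly as
start_replication("http://127.0.0.1:5984",
pull_replication_body("https://hub.example.com/sensors", "sensors")),
with authentication left out for brevity.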
Essentially I'm using CouchDB as a kind of STUN server as well as a
replicated datastore, or rather as a STUN server by way of the
replicated datastore. I can't be the only one.
So maybe the non-continuous replications based on _db_updates could be
the default for push replications, keeping the current _changes listener
default for pull replications.
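
For the push side, a rough sketch of how _db_updates events could be
turned into one-shot replications. It assumes the feed is read line by
line (for example from /_db_updates?feed=continuous) and that each event
carries "db_name" and "type" fields; the function names and the remote
URL are hypothetical, and database names are assumed not to need URL
escaping.

```python
import json


def updated_db(line):
    """Return the database name from one _db_updates line, or None."""
    line = line.strip()
    if not line:
        # Continuous feeds send blank lines as heartbeats.
        return None
    event = json.loads(line)
    if event.get("type") != "updated":
        # Ignore "created" and "deleted" events in this sketch.
        return None
    return event.get("db_name")


def push_replication_body(db_name, remote_base_url):
    """Body for a one-shot push replication of db_name.

    There is no "continuous" key, so the replication ends on its own
    once the target has caught up.
    """
    return {
        "source": db_name,
        "target": remote_base_url.rstrip("/") + "/" + db_name,
    }


def plan_pushes(lines, remote_base_url):
    """One replication body per updated database, in first-seen order."""
    seen = []
    for line in lines:
        db_name = updated_db(line)
        if db_name is not None and db_name not in seen:
            seen.append(db_name)
    return [push_replication_body(name, remote_base_url) for name in seen]
```

Each body returned by plan_pushes would then be POSTed to /_replicate on
the source server. Deduplicating per batch keeps a burst of writes to
one database from starting several replications at once.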
My own idiosyncratic glossary for this case:
- source: the updated database, which is written to independently of the
replication
- target: the database that is only written to via replication, updated
from the new data in the source
- "pull" updating: the target runs the replication locally, which
connects to the remote source
- "push" updating: the source runs the replication locally, which
connects to the remote target
- "reasonably low latency": if the connection fails for some reason, so
be it, we can work on an offline-first basis. But if we can have
sub-second latencies between writing to the source database and
replication to the target, the application feels real-time, and that's
the goal.
Thanks for the high quality, high signal/noise list, btw. I don't
usually say anything because I don't think I have anything to add, but I
always read with interest.
Cheers,
Javier