On 2016-03-14 13:40, Adam Kocoloski wrote:

My current solution watches the global _changes feed and fires up a
continuous replication to an off-site server whenever it sees a change. If
it doesn't see a change from a database for 10 minutes, it kills that
replication. This means I only have ~150 active replications running on
average at any given time.

How about instead of using continuous replications and killing them,
use non-continuous replications based on _db_updates? They end
automatically and should use fewer resources then.

Best
Jan
--

In my opinion this is actually a design we should adopt for CouchDB’s
own replication manager. Keeping all those _changes listeners running
is needlessly expensive now that we have _db_updates.

Adam

I like Jan's solution, and I'm going to try to use it for a project, but it can't be the only solution, at least not for one of my use cases. In that case I work with devices behind NAT that need to pull data from updated remote databases with reasonably low latency.

The source database can't start a push replication on _db_updates, because it can't reach the target database behind NAT. "Pull" replications have to be started by the target database (the NATted device), which doesn't know whether the source database has been updated unless it keeps a continuous replication running.
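For the NAT case, the device has to open the connection itself, so the replication is posted to the device's *local* CouchDB with the remote database as source. A minimal sketch of that (the URLs and database names are placeholders, and a real version would need credentials and error handling):

```python
import json
import urllib.request

def pull_replication_body(remote_source, local_target, continuous=True):
    """Build the JSON body for POST /_replicate on the NATted device.

    The device connects *outbound* to the remote source, so NAT is not
    an obstacle; with "continuous": true the replicator keeps the
    connection open and receives new changes as they happen.
    """
    return {
        "source": remote_source,   # e.g. "https://example.org/app" (placeholder)
        "target": local_target,    # local database name on the device
        "continuous": continuous,
    }

def start_replication(couch_url, body):
    """POST the replication request to the device's local CouchDB."""
    req = urllib.request.Request(
        couch_url + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires a running CouchDB on the device):
# start_replication("http://127.0.0.1:5984",
#                   pull_replication_body("https://example.org/app", "app"))
```

This is exactly the continuous pull replication I'd like to avoid keeping open all the time, but as far as I can tell it's the only arrangement that works through NAT.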

Essentially I'm using CouchDB as a kind of STUN server as well as a replicated datastore. I can't be the only one.

So maybe non-continuous replications based on _db_updates could become the default for push replications, while keeping the current continuous _changes listener as the default for pull replications.
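For the push side, Jan's _db_updates idea could look roughly like this: follow the continuous _db_updates feed and fire a one-shot (non-continuous) push replication whenever a database reports a change. This is only a sketch; the event shape assumes the CouchDB 2.x feed format ("db_name"/"type" fields), the remote base URL is a placeholder, and a real version would add auth, retries, and de-duplication of bursts:

```python
import json
import urllib.request

def one_shot_body(db_name, remote_base):
    """Build a non-continuous (one-shot) push replication body for a
    database that _db_updates just reported as changed."""
    return {
        "source": db_name,
        "target": remote_base.rstrip("/") + "/" + db_name,
        # no "continuous" flag: the replication ends once it catches up
    }

def watch_db_updates(couch_url, remote_base):
    """Follow GET /_db_updates?feed=continuous on the source server and
    trigger a one-shot push replication for every 'updated' event.
    (Sketch only: no auth, no error handling, no de-duplication.)"""
    feed = urllib.request.urlopen(couch_url + "/_db_updates?feed=continuous")
    for raw in feed:
        line = raw.strip()
        if not line:
            continue  # heartbeat newline, skip
        event = json.loads(line)
        if event.get("type") == "updated":
            req = urllib.request.Request(
                couch_url + "/_replicate",
                data=json.dumps(one_shot_body(event["db_name"],
                                              remote_base)).encode("utf-8"),
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req)
```

Since each replication ends as soon as it catches up, there's nothing to kill after a timeout, which is the resource win Jan is describing.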

My own idiosyncratic glossary for this case:

- source: the updated database, which is written to independently of the replication
- target: the database that is only written to via replication, updated from the new data in the source

- "pull" updating: the target runs the replication locally, which connects to the remote source
- "push" updating: the source runs the replication locally, which connects to the remote target

- "reasonably low latency": if the connection fails for some reason, so be it, we can work on an offline-first basis. But if we can have subsecond latencies between writing to the source database and replicating to the target, the application feels real-time, and that's the goal.

Thanks for the high-quality, high signal-to-noise list, btw. I don't usually say anything because I don't think I have anything to add, but I always read with interest.

Cheers,

Javier
