To those following along at home, we managed to get PouchDB’s test suite to 
pass against CouchDB 2 RC4 in Node, but not in Firefox.

https://github.com/pouchdb/pouchdb/pull/5628#issuecomment-244584069

Since this is a timeout, it may just be a test artifact, but 100 seconds seems 
like a pretty long timeout to not be a true failure.
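
For reference, the random-name workaround discussed further down the thread
could look roughly like the sketch below. This is only an illustration and not
what the PouchDB test suite actually does: "uniqueDbName" and the "testdb"
prefix are hypothetical names. Each test gets a fresh database name, so it
never has to recreate a name that was just deleted.

```javascript
// Hypothetical helper (not part of PouchDB): builds a database name that is
// unique per call, so tests avoid the create -> delete -> create cycle on
// the same name.
const crypto = require('crypto');

function uniqueDbName(prefix) {
  // CouchDB database names must start with a lowercase letter and may only
  // contain lowercase letters, digits, and a few symbols such as "_" and "-".
  const suffix = crypto.randomBytes(8).toString('hex'); // 16 lowercase hex chars
  return `${prefix}_${Date.now()}_${suffix}`;
}

console.log(uniqueDbName('testdb'));
```

The trade-off raised in the thread still applies: with unique names the suite
no longer exercises recreating the same database.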

–Nolan


> On Sep 3, 2016, at 8:24 PM, Russell Branca <[email protected]> wrote:
> 
> N=1 might reduce some of the issues, but it won't eliminate the problem
> entirely. The fundamental issue is that the "_dbs" db, which contains a
> document corresponding to every clustered database in the system, does not
> provide immediate consistency guarantees, and cycling databases can result
> in conflicts arising in these docs. The docs contain the shard/node
> mappings and conflicts can cause different nodes to have different views of
> the world.
> 
> It's important to remember that the "_dbs" db powers the db -> shards
> mapping and is a fundamental component of the quorum system, so
> unfortunately the standard clustered quorum semantics are not available in
> the "_dbs" db as it operates at a lower level. You can see the initial
> synchronization during bootup in [1] which circles its way back to [2] by
> way of mem3_sync_nodes.erl. You can further see where the "_dbs" db is a
> local db in the way in which shards are loaded in [3] and the fallback for
> creating the "_dbs" db in [4].
> 
> So in summary, the "_dbs" db operates at a lower level than the quorum
> system as the db is a core component that powers the shard mappings, and
> therefore uses a different approach for synchronization where each node has
> a full copy of the "_dbs" db and syncs directly with the other nodes. This
> is a known weak point as can be seen by the impact of cycling databases too
> quickly, and so recommended best practice is to not cycle databases
> quickly. Obviously this is not ideal, and this is one of the areas where a
> CP config store of some sort would be a significant boon, but bolting on a
> CP system to an AP system is fraught with a new set of complexities.
> 
> (A clarification on N=1: with N=1 you only have one replica of the
> database, and the database exists on only one node. The rest of the nodes
> still need to get the updated "_dbs" db doc so they know where the database
> exists, because any node in the cluster can handle any request and it will
> need to know where the database exists. In general, you have one
> coordinating node and N replica nodes containing the N replicas (of each
> shard) for the given database. In a three node cluster with N=3, whatever
> coordinating node the request is handled by will also have a local shard
> replica, but this is a special case. In a cluster with more than 3 nodes,
> say 15 nodes, the coordinating node will only have a 3/15 chance to contain
> a local shard (assuming round robin load balancing across nodes). So
> basically every node must know where every database exists because every
> node can coordinate every request.)
> 
> 
> -Russell
> 
> 
> [1]
> https://github.com/apache/couchdb-mem3/blob/15615b295ec970ca9b12b7b54107a80b95149511/src/mem3_sync.erl#L234-L236
> [2]
> https://github.com/apache/couchdb-mem3/blob/15615b295ec970ca9b12b7b54107a80b95149511/src/mem3_sync.erl#L230-L232
> [3]
> https://github.com/apache/couchdb-mem3/blob/699308f510d335d05bfd0416ad5e893b68a7ec1d/src/mem3_shards.erl#L266-L283
> [4]
> https://github.com/apache/couchdb-mem3/blob/699308f510d335d05bfd0416ad5e893b68a7ec1d/src/mem3_util.erl#L214-L222
> 
> On Fri, Sep 2, 2016 at 10:43 AM, Nolan Lawson <[email protected]> wrote:
> 
>> Thanks, Dale. That was my recollection as well.
>> 
>> Basically PouchDB does PUT -> DELETE -> PUT between every test, so since
>> there are 1000s of tests, this race condition comes up pretty easily. We
>> can add a timeout or do a random DB name, but without doing that we don't
>> know if Couch 2.x is truly "passing" the test suite or not.
>> 
>> I have some time this weekend, so I'll look into adding a patch to do the
>> workaround for Couch 2. I tend to side with Jan that in a clustered system
>> it can't reliably tell us when a database was truly deleted without
>> sacrificing the A in CAP. PouchDB users are already familiar with the weird
>> ways that databases start to behave when you actually DELETE them (e.g.
>> replication gets unreliable), hence workarounds like
>> https://www.npmjs.com/package/pouchdb-erase . In practice I expect PouchDB
>> users to never delete databases, so this is just an artifact of our test
>> suite IMO.
>> 
>> –Nolan
>> 
>> 
>> On Fri, Sep 2, 2016 at 3:14 AM, Dale Harvey <[email protected]> wrote:
>> 
>>> In PouchDB we can look into a workaround that uses random names only when
>>> the tests are run against Couch 2.0. However, I would really like to make
>>> sure that a database not being fully deleted when we get a successful
>>> confirmation of deletion is considered a bug. It has impacts beyond the
>>> test suite: it's really hard to create a reliable system when there is no
>>> way for you to be certain when a database is deleted.
>>> 
>>> Will found it easiest to reproduce this using concurrent scripts, but I
>>> would like to clarify that Pouch doesn't run the test suite in parallel.
>>> This bug can be hit by doing CREATE -> DELETE -> CREATE. It's extremely
>>> hard to nail down and reproduce (the similar bug in PouchDB took many
>>> attempts + months). I will take a look at seeing if I can come up with
>>> easier and clearer steps to reproduce.
>>> 
>>> On 2 September 2016 at 11:01, Jan Lehnardt <[email protected]> wrote:
>>> 
>>>> 
>>>>> On 02 Sep 2016, at 11:58, Will Holley <[email protected]> wrote:
>>>>> 
>>>>> Jan - I can understand that being the case in a clustered setup with
>>>>> distributed shard maps but shouldn't n=1 mitigate that?
>>>> 
>>>> n=1 still does q=8 (8 shards per node) and the software makes
>>>> no consistency guarantees whatsoever.
>>>> 
>>>> n=1 && q=1 might work as a side-effect, but not sure how that is useful
>>>> for reliable tests :)
>>>> 
>>>> Best
>>>> Jan
>>>> --
>>>> 
>>>> 
>>>>> 
>>>>> On 2 September 2016 at 10:53, Jan Lehnardt <[email protected]> wrote:
>>>>>> 
>>>>>>> On 02 Sep 2016, at 11:45, Dale Harvey <[email protected]> wrote:
>>>>>>> 
>>>>>>> In PouchDB we used to generate unique database names for tests, however
>>>>>>> we removed it for several reasons, one large reason being it indicates
>>>>>>> a race condition in critical code if we cannot reliably create ->
>>>>>>> delete -> create the same database (we have uncovered and fixed a lot
>>>>>>> of bugs in PouchDB due to this). While it's not my call how to
>>>>>>> prioritise those bugs, I really do not think we should be closing what
>>>>>>> are fairly serious bugs because it wasn't inconvenient to work around
>>>>>>> them in the couch test suite.
>>>>>> 
>>>>>> It’s just that a CouchDB 2.0 cluster is an AP system, and recreating
>>>>>> databases in quick succession reliably basically requires a CA system,
>>>>>> and that’s not what we can do easily.
>>>>>> 
>>>>>> (I hope I got the CAP letters right, but I think it is clear what I
>>>>>> mean)
>>>>>> 
>>>>>> That is, maybe we skip those tests when run against a CouchDB 2.0
>>>>>> endpoint and keep them for PouchDB?
>>>>>> 
>>>>>> Best
>>>>>> Jan
>>>>>> --
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> On 2 September 2016 at 10:31, Joan Touzet <[email protected]> wrote:
>>>>>>> 
>>>>>>>> Hi Nolan, Will:
>>>>>>>> 
>>>>>>>> A further update from looking deeper with @janl. It appears that we
>>>>>>>> have a pending fix for COUCHDB-3017 and we'll work on getting that
>>>>>>>> merged before 2.0.
>>>>>>>> 
>>>>>>>> COUCHDB-3034 is a WONTFIX. FYI in CouchDB itself we changed all of
>>>>>>>> our tests to use unique database names. I'll update the bug myself
>>>>>>>> shortly.
>>>>>>>> 
>>>>>>>> -Joan
>>>>>>>> 
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Joan Touzet" <[email protected]>
>>>>>>>>> To: [email protected]
>>>>>>>>> Sent: Friday, September 2, 2016 5:15:00 AM
>>>>>>>>> Subject: Re: Getting libraries to test RCs
>>>>>>>>> 
>>>>>>>>> Hi Will,
>>>>>>>>> 
>>>>>>>>> Neither of these are currently tagged as blocking issues for CouchDB
>>>>>>>>> 2.0, only major priority. If you want to flag them as such, this is
>>>>>>>>> your last chance, and even still, there's no guarantee fixes for them
>>>>>>>>> will hit 2.0.
>>>>>>>>> 
>>>>>>>>> Erlangers, is there any chance of at least triaging these today?
>>>>>>>>> 
>>>>>>>>> -Joan
>>>>>>>>> 
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: "Will Holley" <[email protected]>
>>>>>>>>>> To: [email protected], "Joan Touzet" <[email protected]>
>>>>>>>>>> Sent: Friday, September 2, 2016 4:43:48 AM
>>>>>>>>>> Subject: Re: Getting libraries to test RCs
>>>>>>>>>> 
>>>>>>>>>> Assuming nothing's changed in the last few weeks, there are 2 issues
>>>>>>>>>> which cause the PouchDB tests to fail against master: COUCHDB-3017
>>>>>>>>>> and COUCHDB-3034.
>>>>>>>>>> 
>>>>>>>>>> Both could be addressed in the test suite by using different
>>>>>>>>>> database
>>>>>>>>>> names for each test, but that's quite a disruptive change.
>>>>>>>>>> 
>>>>>>>>>> On 2 September 2016 at 03:15, Joan Touzet <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi Nolan, you state that it's 'failing for known reasons.' Is that
>>>>>>>>>>> reasons in PouchDB or anything you need to push back on us? We'd
>>>>>>>>>>> like to know ASAP as we're very, very close to releasing 2.0 now.
>>>>>>>>>>> 
>>>>>>>>>>> I have zero PouchDB knowledge so I'm hoping you can give us a
>>>>>>>>>>> short
>>>>>>>>>>> summary of what you think is wrong.
>>>>>>>>>>> 
>>>>>>>>>>> All the best,
>>>>>>>>>>> Joan
>>>>>>>>>>> 
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: "Nolan Lawson" <[email protected]>
>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>> Sent: Thursday, September 1, 2016 7:56:42 PM
>>>>>>>>>>>> Subject: Re: Getting libraries to test RCs
>>>>>>>>>>>> 
>>>>>>>>>>>> We have been testing CouchDB master in PouchDB for months now,
>>>>>>>>>>>> but
>>>>>>>>>>>> as
>>>>>>>>>>>> an allowed failure because I believe it’s failing for known
>>>>>>>>>>>> reasons.
>>>>>>>>>>>> We test both using Node.js and the browser.
>>>>>>>>>>>> 
>>>>>>>>>>>> Node: https://travis-ci.org/pouchdb/pouchdb/jobs/156198210
>>>>>>>>>>>> Browser: https://travis-ci.org/pouchdb/pouchdb/jobs/156198211
>>>>>>>>>>>> 
>>>>>>>>>>>> For anyone who wants to run the Pouch test suite against
>>>>>>>>>>>> CouchDB,
>>>>>>>>>>>> it’s just:
>>>>>>>>>>>> 
>>>>>>>>>>>> git clone https://github.com/pouchdb/pouchdb.git
>>>>>>>>>>>> cd pouchdb
>>>>>>>>>>>> npm i
>>>>>>>>>>>> COUCH_HOST=http://localhost:5984 BAIL=0 npm t
>>>>>>>>>>>> 
>>>>>>>>>>>> BAIL=0 will tell it to run the full test suite and not stop on
>>>>>>>>>>>> any
>>>>>>>>>>>> failures. That way you can inspect the failures and see if
>>>>>>>>>>>> they’re
>>>>>>>>>>>> serious or not.
>>>>>>>>>>>> 
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Nolan
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Aug 29, 2016, at 12:15 PM, Jan Lehnardt <[email protected]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Anyone on this list who could help with this? The work items
>>>>>>>>>>>>> are
>>>>>>>>>>>>> fairly self-explanatory and not very big individually <3
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best
>>>>>>>>>>>>> Jan
>>>>>>>>>>>>> --
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 10 Aug 2016, at 09:37, Jan Lehnardt <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hey everyone,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> from Joan’s excellent blog post about testing Release
>>>>>>>>>>>>>> Candidates:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> To our valued CouchDB application and library developers:
>>>>>>>>>>>>>>> please,
>>>>>>>>>>>>>>> please run your software against each of the options below.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> — https://blog.couchdb.org/2016/08/08/release-candidates/
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I think we can be a little more proactive about this for
>>>>>>>>>>>>>> CouchDB
>>>>>>>>>>>>>> client libraries: let’s open issues on all the
>>>>>>>>>>>>>> CouchDB-compatible
>>>>>>>>>>>>>> client software we care about to test an RC.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Since there are a lot of projects, and we don’t necessarily
>>>>>>>>>>>>>> know
>>>>>>>>>>>>>> which one we “care” about, we should try to be clever about
>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Maybe something like this can work:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 1. We prepare an issue text explaining the thing: Heya,
>>>>>>>>>>>>>> CouchDB
>>>>>>>>>>>>>> team here, major new version coming up, you should test it
>>>>>>>>>>>>>> like
>>>>>>>>>>>>>> so: <include instructions to test against a 3-node cluster.
>>>>>>>>>>>>>> Maybe
>>>>>>>>>>>>>> even provide a cluster to do this, or Cloudant can sponsor
>>>>>>>>>>>>>> something?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2. Post this message with a call to action on [email protected], the
>>>>>>>>>>>>>> weekly news, and our other (social) media channels.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 3. Ask people who submitted an issue to report back with a
>>>>>>>>>>>>>> link.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 4. Collect the link in an issue or JIRA (this could be done in 3.,
>>>>>>>>>>>>>> but then everybody needs to be added to the wiki write group, and
>>>>>>>>>>>>>> that’s just extra overhead we don’t need). Maybe we borrow a gist
>>>>>>>>>>>>>> for this, or a Google doc.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> That way we encourage client software to check out RCs and we can
>>>>>>>>>>>>>> keep track, while the community helps to select which software to
>>>>>>>>>>>>>> encourage to test 2.0 compat, and helps spread the word and the
>>>>>>>>>>>>>> burden is not left with just a few folks.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>> Jan
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Professional Support for Apache CouchDB:
>>>>>>>>>>>>> https://neighbourhood.ie/couchdb-support/
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Nolan Lawson
>> nolanlawson.com
>> github.com/nolanlawson
>> 
