Sekuraz opened a new issue, #5191:
URL: https://github.com/apache/couchdb/issues/5191
# Error loop with system freeze when removing a node from a cluster
## Description
```python
for db in databases:  # for all databases
    headers, shards = await session._server._get(f"/_node/_local/_dbs/{db}")
    # old = json.dumps(shards)
    shards["changelog"].append(["remove", "00000000-ffffffff", nodename])
    del shards["by_node"][nodename]  # remove node from the map
    # remove the node from all ranges
    for shards_range in shards["by_range"].values():
        if nodename in shards_range:
            shards_range.remove(nodename)
    # put the new shard distribution to the cluster
    await session._server._put(f"/_node/_local/_dbs/{db}", shards)

# leave the cluster
headers, data = await session._server._get(f"/_node/_local/_nodes/{nodename}")
rev = data["_rev"]
await session._server._delete(f"/_node/_local/_nodes/{nodename}?rev={rev}")
```
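The shard-document edit in the loop above can be isolated into a pure function for illustration. This is a sketch based directly on the code above; `shard_doc` mirrors the JSON returned by `GET /_node/_local/_dbs/{db}`, and the function name is ours, not part of any client library:

```python
def remove_node_from_shard_doc(shard_doc: dict, nodename: str) -> dict:
    """Remove nodename from a /_node/_local/_dbs/{db} document in place."""
    # Record the removal in the changelog, covering the full hash range.
    shard_doc["changelog"].append(["remove", "00000000-ffffffff", nodename])
    # Drop the node from the node -> ranges map.
    del shard_doc["by_node"][nodename]
    # Drop the node from every range -> nodes list.
    for nodes in shard_doc["by_range"].values():
        if nodename in nodes:
            nodes.remove(nodename)
    return shard_doc
```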
After executing this code to remove shards from a CouchDB instance, the
instance's CPU usage rose to roughly 300% and its memory usage to about 50% of
8 GB, and it spammed the log with the following error roughly every 100 microseconds:
```
[error] 2024-08-21T19:02:06.679732Z [email protected] emulator --------
Error in process <0.30198.42> on node '[email protected]' with exit value:
{{badmatch,[]},[{mem3_sync,find_next_node,0,[{file,"src/mem3_sync.erl"},{line,309}]},{mem3_sync,sync_nodes_and_dbs,0,[{file,"src/mem3_sync.erl"},{line,265}]},{mem3_sync,initial_sync,1,[{file,"src/mem3_sync.erl"},{line,272}]}]}
[error] 2024-08-21T19:02:06.679803Z [email protected] emulator --------
Error in process <0.30198.42> on node '[email protected]' with exit value:
{{badmatch,[]},[{mem3_sync,find_next_node,0,[{file,"src/mem3_sync.erl"},{line,309}]},{mem3_sync,sync_nodes_and_dbs,0,[{file,"src/mem3_sync.erl"},{line,265}]},{mem3_sync,initial_sync,1,[{file,"src/mem3_sync.erl"},{line,272}]}]}
[error] 2024-08-21T19:02:06.679880Z [email protected] emulator --------
Error in process <0.30200.42> on node '[email protected]' with exit value:
{{badmatch,[]},[{mem3_sync,find_next_node,0,[{file,"src/mem3_sync.erl"},{line,309}]},{mem3_sync,sync_nodes_and_dbs,0,[{file,"src/mem3_sync.erl"},{line,265}]},{mem3_sync,initial_sync,1,[{file,"src/mem3_sync.erl"},{line,272}]}]}
[error] 2024-08-21T19:02:06.679964Z [email protected] emulator --------
Error in process <0.30200.42> on node '[email protected]' with exit value:
{{badmatch,[]},[{mem3_sync,find_next_node,0,[{file,"src/mem3_sync.erl"},{line,309}]},{mem3_sync,sync_nodes_and_dbs,0,[{file,"src/mem3_sync.erl"},{line,265}]},{mem3_sync,initial_sync,1,[{file,"src/mem3_sync.erl"},{line,272}]}]}
[error] 2024-08-21T19:02:06.680089Z [email protected] emulator --------
Error in process <0.30202.42> on node '[email protected]' with exit value:
{{badmatch,[]},[{mem3_sync,find_next_node,0,[{file,"src/mem3_sync.erl"},{line,309}]},{mem3_sync,sync_nodes_and_dbs,0,[{file,"src/mem3_sync.erl"},{line,265}]},{mem3_sync,initial_sync,1,[{file,"src/mem3_sync.erl"},{line,272}]}]}
```
There was only one other node in the cluster. The new shard distribution was
also never synced to the other node.
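The `{badmatch,[]}` in `mem3_sync:find_next_node/0` suggests that some code path ended up with an empty node list. As an assumption on our part (not a confirmed fix), a defensive pre-check before the `_put` could refuse to upload a shard document that leaves any range with no nodes at all; `shard_doc_is_safe` below is a hypothetical helper name:

```python
def shard_doc_is_safe(shard_doc: dict) -> bool:
    """Return True if every shard range still has at least one node.

    Hypothetical sanity check: a range with an empty node list would
    leave no copy of that shard, which may be related to crash loops
    like the {badmatch,[]} shown above.
    """
    return all(len(nodes) > 0 for nodes in shard_doc["by_range"].values())
```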
## Steps to Reproduce
We were not able to reproduce the error.
## Expected Behaviour
The CouchDB instance goes quietly into the night (i.e. nothing happens; it is
simply no longer synced).
## Your Environment
We had 2 CouchDB Docker containers running on different hosts.
* CouchDB version used: 3.3.3
* Browser name and version: N/A
* Operating system and version: Docker on Arch Linux