nickva commented on issue #5004: URL: https://github.com/apache/couchdb/issues/5004#issuecomment-2525305402
@Clashsoft sorry you're having trouble. Yes, it could be some invalid documents; I was trying to think which validation rules we have tightened such that documents accepted in the past would fail to validate today. Another cause of `invalid_json` could be anything in the middle (a proxy, for example) failing a request due to a timeout, auth, request size, etc. Those often return a plain HTML error page, or cut the JSON response off part-way through, which shows up as an invalid JSON error. CouchDB itself could also do that if a document is too large, or perhaps on a timeout as well.

I wonder whether it always fails on the same document, or whether it's random? We could try to narrow it down. Since you already have a low batch size, let's also make the replication use fewer workers and checkpoint more frequently:

* Set https://docs.couchdb.org/en/stable/config/replicator.html#replicator/worker_processes to `1`.
* Set https://docs.couchdb.org/en/stable/config/replicator.html#replicator/checkpoint_interval to `5000`.
* Set https://docs.couchdb.org/en/stable/config/replicator.html#replicator/use_bulk_get to `false`, so the replicator avoids `_bulk_get` and fetches documents individually.

After those changes, re-create the replication to let it continue; it will resume from the last checkpoint. Then watch the request logs on the source and the target and see which requests fail (return a 500 error or terminate unexpectedly). Those are likely the requests the `invalid_json` errors are about.
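The three settings above could be sketched as a config fragment (a non-authoritative example; the section and option names come from the linked CouchDB docs, and the file location may differ on your install):

```ini
; local.ini -- sketch of the replicator settings suggested above
[replicator]
; use a single worker per replication instead of the default pool
worker_processes = 1
; checkpoint every 5000 ms so less progress is lost on a retry
checkpoint_interval = 5000
; skip _bulk_get and fetch documents with individual GETs
use_bulk_get = false
```

The same options can also be changed at runtime through the `_node/_local/_config/replicator/...` HTTP config endpoint instead of editing the file.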

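To illustrate the "cut part-way through" failure mode described above, here is a minimal Python sketch (the payload is made up for illustration, not an actual CouchDB response) showing how a truncated JSON body turns into a parse error like the replicator's `invalid_json`:

```python
import json

# A well-formed _bulk_get-style response body (contents invented for this example).
full_body = '{"results": [{"id": "doc1", "docs": [{"ok": {"_id": "doc1", "value": 42}}]}]}'

# A proxy timeout or size limit can close the connection mid-response,
# leaving the client with only a prefix of the JSON.
truncated_body = full_body[:40]

json.loads(full_body)  # parses fine

try:
    json.loads(truncated_body)
except json.JSONDecodeError as err:
    print("invalid JSON:", err.msg)
```

An HTML error page returned by a proxy in place of the JSON body fails the same way, which is why watching the raw request logs helps tell these cases apart.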