nickva commented on issue #5004:
URL: https://github.com/apache/couchdb/issues/5004#issuecomment-2525305402

   @Clashsoft sorry you're having trouble. Yes, it could be some invalid 
documents; I was trying to think which validation rules we have tightened over 
the years, such that documents accepted back then would fail to validate 
nowadays...
   
   Another cause of `invalid_json` could be anything in the middle (a proxy, 
for example) failing a request due to a timeout, auth, request size, etc. 
Those often return a plain HTML error page or cut the JSON response off 
part-way through, so we end up with an invalid JSON error. CouchDB itself 
could also do that if the document size is too large, and perhaps on a 
timeout as well.
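To illustrate that failure mode, here is a small sketch (the response bodies are made up): both an intermediary's HTML error page and a JSON response truncated mid-stream fail to parse, which is exactly the kind of thing that surfaces as `invalid_json`:

```python
import json

# Hypothetical example responses a replicator might receive instead of
# a complete JSON body:
html_error = "<html><body>502 Bad Gateway</body></html>"   # proxy error page
truncated = '{"results": [{"_id": "doc1", "_rev": "1-abc"'  # cut off mid-reply

for body in (html_error, truncated):
    try:
        json.loads(body)
    except json.JSONDecodeError as err:
        # Both bodies fail to parse as JSON
        print(f"invalid JSON: {err}")
```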
   
   I wonder whether it always fails on the same document, or a random one? We 
could perhaps narrow it down. Since you already have a low batch size, let's 
also make your replication use fewer workers, checkpoint more frequently, and 
avoid `_bulk_get` for fetching: set 
https://docs.couchdb.org/en/stable/config/replicator.html#replicator/worker_processes
 to `1`, 
https://docs.couchdb.org/en/stable/config/replicator.html#replicator/checkpoint_interval
 to `5000`, and 
https://docs.couchdb.org/en/stable/config/replicator.html#replicator/use_bulk_get
 to `false`. 
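Taken together, those suggestions would look like this in the `[replicator]` section of the config (e.g. `local.ini`), per the linked docs:

```ini
[replicator]
; one worker per replication job
worker_processes = 1
; checkpoint every 5 seconds
checkpoint_interval = 5000
; fetch documents with individual GETs instead of _bulk_get
use_bulk_get = false
```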
   
   After those changes, re-create the replication to let it continue; it will 
pick up from the last checkpoint. Then watch the request logs on the source 
and target and see which requests fail (return a 500 error or terminate 
unexpectedly): those might be the ones the `invalid_json` error is about.
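If the replication is driven by a `_replicator` document, re-creating it means deleting the old document and writing a new one. A sketch of what the new document body might contain (the source/target URLs are placeholders, and I believe these tuning options can also be set per replication document rather than server-wide):

```python
import json

# Hypothetical endpoints; substitute your own source and target URLs.
replication_doc = {
    "source": "https://source.example.com/db",
    "target": "https://target.example.com/db",
    "continuous": True,
    # Per-document equivalents of the [replicator] settings above
    "worker_processes": 1,
    "checkpoint_interval": 5000,
    "use_bulk_get": False,
}

# This JSON body would be PUT to a new doc in the _replicator database.
print(json.dumps(replication_doc, indent=2))
```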

