A regular backup creates the files in this order:

drwxr-xr-x 2 root root 63 Jun 27 09:46 snapshot.shard7
drwxr-xr-x 2 root root 159 Jun 27 09:46 snapshot.shard8
drwxr-xr-x 2 root root 135 Jun 27 09:46 snapshot.shard1
drwxr-xr-x 2 root root 178 Jun 27 09:46 snapshot.shard3
drwxr-xr-x 2 root root 210 Jun 27 09:46 snapshot.shard11
drwxr-xr-x 2 root root 218 Jun 27 09:46 snapshot.shard9
drwxr-xr-x 2 root root 180 Jun 27 09:46 snapshot.shard2
drwxr-xr-x 2 root root 164 Jun 27 09:47 snapshot.shard5
drwxr-xr-x 2 root root 252 Jun 27 09:47 snapshot.shard6
drwxr-xr-x 2 root root 103 Jun 27 09:47 snapshot.shard12
drwxr-xr-x 2 root root 135 Jun 27 09:47 snapshot.shard4
drwxr-xr-x 2 root root 119 Jun 27 09:47 snapshot.shard10
drwxr-xr-x 3 root root 4 Jun 27 09:47 zk_backup
-rw-r--r-- 1 root root 185 Jun 27 09:47 backup.properties

While an async backup creates files in this order:

drwxr-xr-x 2 root root 15 Jun 27 09:49 snapshot.shard3
drwxr-xr-x 2 root root 15 Jun 27 09:49 snapshot.shard9
drwxr-xr-x 2 root root 62 Jun 27 09:49 snapshot.shard6
drwxr-xr-x 2 root root 37 Jun 27 09:49 snapshot.shard2
drwxr-xr-x 2 root root 67 Jun 27 09:49 snapshot.shard7
drwxr-xr-x 2 root root 75 Jun 27 09:49 snapshot.shard5
drwxr-xr-x 2 root root 70 Jun 27 09:49 snapshot.shard8
drwxr-xr-x 2 root root 15 Jun 27 09:49 snapshot.shard4
drwxr-xr-x 2 root root 15 Jun 27 09:50 snapshot.shard11
drwxr-xr-x 2 root root 127 Jun 27 09:50 snapshot.shard1
drwxr-xr-x 2 root root 116 Jun 27 09:50 snapshot.shard12
drwxr-xr-x 3 root root 4 Jun 27 09:50 zk_backup
-rw-r--r-- 1 root root 185 Jun 27 09:50 backup.properties
drwxr-xr-x 2 root root 25 Jun 27 09:51 snapshot.shard10

shard10 is much larger than the other shards. Its snapshot directory is the only entry written after backup.properties.

From the logs:

INFO - 2017-06-27 09:50:33.832; [ ] org.apache.solr.cloud.BackupCmd; Completed backing up ZK data for backupName=collection1
INFO - 2017-06-27 09:50:33.800; [ ] org.apache.solr.handler.admin.CoreAdminOperation; Checking request status for : backup1103459705035055
INFO - 2017-06-27 09:50:33.800; [ ] org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/cores params={qt=/admin/cores&requestid=backup1103459705035055&action=REQUESTSTATUS&wt=javabin&version=2} status=0 QTime=0
INFO - 2017-06-27 09:51:33.405; [ ] org.apache.solr.handler.SnapShooter; Done creating backup snapshot: shard10 at file:///online/backup/collection1

Has anyone seen this bug, or does anyone know a workaround?

On 27 June 2017 at 09:47, Damien Kamerman <dami...@gmail.com> wrote:

> Yes, the async command returns, and then I poll with REQUESTSTATUS.
>
> On 27 June 2017 at 01:24, Varun Thacker <va...@vthacker.in> wrote:
>
>> Hi Damien,
>>
>> A backup command with async is supposed to return early. It starts the
>> backup process and returns.
>>
>> Are you using the REQUESTSTATUS (
>> http://lucene.apache.org/solr/guide/6_6/collections-api.html#collections-api
>> ) API to validate if the backup is complete?
>>
>> On Sun, Jun 25, 2017 at 10:28 PM, Damien Kamerman <dami...@gmail.com>
>> wrote:
>>
>> > I've noticed an issue with the Solr 6.5.1 Collections API BACKUP async
>> > command returning early. The state is finished well before one shard is
>> > finished.
>> >
>> > The collection I'm backing up has 12 shards across 6 nodes and I suspect
>> > the issue is that it is not waiting for all backups on the node to
>> > finish.
>> >
>> > Alternatively, if I change the request to not be async it works OK but
>> > sometimes I get the exception "backup the collection time out:180s".
>> >
>> > Has anyone seen this, or knows a workaround?
>> >
>> > Cheers,
>> > Damien.
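
The directory listings in this thread suggest a simple way to spot the early-return symptom: in the async run, snapshot.shard10 carries a later timestamp than backup.properties. A minimal Python sketch of that check follows. It assumes the backup directory layout shown in the listings (snapshot.* directories next to a backup.properties file), and the function name is hypothetical. A directory's mtime only records the last time an entry was added or removed inside it, so this is a rough diagnostic and not proof that a shard backup has finished.

```python
import os


def snapshots_newer_than_properties(backup_dir):
    """Return snapshot.* directories modified after backup.properties.

    Assumes the layout seen in the listings: snapshot.shardN directories
    and a backup.properties file directly under backup_dir.
    """
    props_path = os.path.join(backup_dir, "backup.properties")
    props_mtime = os.stat(props_path).st_mtime

    late = []
    for name in sorted(os.listdir(backup_dir)):
        path = os.path.join(backup_dir, name)
        # Only shard snapshot directories are of interest here.
        if name.startswith("snapshot.") and os.path.isdir(path):
            if os.stat(path).st_mtime > props_mtime:
                late.append(name)
    return late
```

Called on the backup directory (for example the location from the SnapShooter log line, adjusted to the local setup), a non-empty result would indicate that at least one shard snapshot was still being written after backup.properties was created.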