Good point Tomas; I hadn't considered that use-case. I suppose the behavior I suggest could be controlled with a boolean parameter flag like "asyncDeleteStatus" true/false. WDYT? I'm not married to it.
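To make the flag idea concrete, here is a rough sketch of the semantics I have in mind. It is a plain in-memory model, not Solr code: `AsyncStatusRegistry`, `submit`, `finish`, and the `delete_status` argument (standing in for the "asyncDeleteStatus" parameter) are all hypothetical names used only for illustration. A running ID is always rejected; a finished ID is rejected unless the flag is set.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


class AsyncIdRejected(Exception):
    """Raised when a request reuses an async ID that may not be reused."""


class AsyncStatusRegistry:
    """Hypothetical in-memory stand-in for the async status objects in ZK."""

    def __init__(self):
        self._status = {}  # async ID -> State

    def submit(self, async_id, delete_status=False):
        existing = self._status.get(async_id)
        # A request that is still in progress always blocks reuse of its ID.
        if existing is State.RUNNING:
            raise AsyncIdRejected(f"{async_id} is still running")
        # A finished request blocks reuse unless the caller opts in.
        if existing is not None and not delete_status:
            raise AsyncIdRejected(f"{async_id} already exists ({existing.value})")
        # Either the ID is new, or the old finished status is discarded.
        self._status[async_id] = State.RUNNING

    def finish(self, async_id, succeeded=True):
        self._status[async_id] = State.COMPLETED if succeeded else State.FAILED

    def status(self, async_id):
        return self._status.get(async_id)
```

With the flag left at false, the retry workflow Tomás describes keeps working, since a finished ID is still rejected. With it set to true, a sender using an ID like "SPLIT-collectionName-shardName" can resubmit once the earlier split has succeeded or failed.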
BTW these async status objects stored in ZK are in fact cleaned up when
they reach 10k in number.  See SizeLimitedDistributedMap.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, May 10, 2023 at 2:36 PM Tomás Fernández Löbbe <tomasflo...@gmail.com>
wrote:

> I find it very useful to keep the used async IDs regardless of the status
> for some time. For example, If you have a workflow that involves multiple
> steps such as add/remove replicas, you can just retry/restart the workflow
> and be sure Solr will reject the request if the async ID already exists
> (and your code can then handle this accordingly, for example, checking the
> status of success/failed and act accordingly) as long as you use the async
> IDs consistently.
>
> That said, async IDs do need to eventually be removed and AFAIK Solr
> doesn't do this automatically. This is a problem because of ever increasing
> objects in ZooKeeper. I think we should have some sort of task that cleans
> up async ID after some configurable amount of time.
>
> On Wed, May 10, 2023 at 1:01 AM Andras Salamon <andras.sala...@melda.info>
> wrote:
>
> > Hi,
> >
> > How can we be sure that the previous request status info has been already
> > processed? What about the following timeline:
> >
> > -Client1 sends an async request
> > -Client1 reads status info, it's still running
> > -Client1 reads status info, it's still running
> > -Async request finishes
> > -Right after that Client2 sends a new async request with the same ID, we
> > clear the async status because it's already finished
> > -Client1 reads status info, but this time it will read info about the new
> > async request sent by Client2.
> >
> > Andras
> >
> > ---- On Wed, 10 May 2023 05:15:40 +0200 David Smiley <dsmi...@apache.org>
> > wrote ---
> >
> > I noticed that async admin requests to Solr must have a unique asyncId or
> > else a request is rejected.  Makes sense -- maybe the request is in
> > progress.  But what if it isn't -- what if the previous request for the
> > same ID either succeeded or failed?  Shouldn't we clear the previous
> > asyncId status and let the new request go through?
> >
> > I'm imagining leveraging this uniqueness constraint in order to be an
> > additional protection measure against requests that should be done
> > atomically, like a shard split.  Yes there are already locks but this
> > additional measure will allow a fail-fast -- no enqueue of a doomed
> > message to the Overseer that will ultimately never succeed any way.
> > Thus the sender of a shard split can use an async ID like
> > "SPLIT-collectionName-shardName".  Maybe there are other parts of
> > SolrCloud that could leverage this constraint to its advantage likewise.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
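The timeline Andras describes can be replayed in a small sketch, assuming finished statuses are silently overwritten by a new request with the same ID. Everything here is hypothetical and in-memory: `replay_timeline`, the `submit` helper, and the ID "req-1" are made up for illustration, and the `owner` field does not exist in Solr's status objects. It is recorded only so the mix-up is visible in the result.

```python
def replay_timeline():
    """Replay the Client1/Client2 timeline against an in-memory status map."""
    statuses = {}         # async ID -> (owner, state); stand-in for ZK
    seen_by_client1 = []  # what Client1 observes on each status read

    def submit(owner, async_id):
        current = statuses.get(async_id)
        # Only an in-progress request blocks reuse of the ID.
        if current is not None and current[1] == "running":
            raise RuntimeError(f"async ID {async_id} is in use")
        # A finished status is overwritten by the new request.
        statuses[async_id] = (owner, "running")

    submit("client1", "req-1")                    # Client1 sends an async request
    seen_by_client1.append(statuses["req-1"])     # still running
    seen_by_client1.append(statuses["req-1"])     # still running
    statuses["req-1"] = ("client1", "completed")  # async request finishes
    submit("client2", "req-1")                    # Client2 reuses the same ID
    seen_by_client1.append(statuses["req-1"])     # Client1 reads Client2's status
    return seen_by_client1
```

In this model Client1 never observes the "completed" state of its own request. Its last read returns the running status of Client2's request, which is the hazard raised above.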