[ https://issues.apache.org/jira/browse/SOLR-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17070498#comment-17070498 ]
Eugene Tenkaev edited comment on SOLR-8030 at 3/29/20, 11:13 PM:
-----------------------------------------------------------------

We need to operate on the fully constructed document in order to remove a set of dynamic fields that has been replaced by a new set of dynamic fields with different field names. So we came up with a post-processor and put it in the *default chain*. We get the *SolrQueryRequest* from the *AddUpdateCommand*:

{code}
@Override
protected void process(AddUpdateCommand cmd, SolrQueryRequest req, SolrQueryResponse rsp) {
    String value = cmd.getReq().getParams().get(NAME + ".xxx");
    // ... remove the old dynamic fields according to the param value ...
}
{code}

and remove the old set of dynamic fields from the full document according to the parameters in the *SolrQueryRequest*, while ignoring the newly added fields. (A fuller sketch of such a processor is at the end of this comment.) Is there a possibility that this code will not work during log replay, so that we lose the behavior this processor adds?

h4. Possible workaround for our case

We can introduce a workaround: add a special technical field to the schema that carries the command for removing the old set of dynamic fields, but never index this technical field. Our post-processor would then work only with data from the *SolrInputDocument* and this technical field (see the second sketch below). Will this workaround handle the current situation around replaying of updates? Or are there cases when all post-processors are completely ignored, even in the default chain?

h3. Additionally

Regarding the idea of [~elyograg]: is it possible to move the routing code out of *DistributedUpdateProcessor*, so that all processors that come after this routing processor are executed on the proper node? If so, then we could also move Atomic Update processing out of *DistributedUpdateProcessor*, and it would be executed on the node that has the proper data.
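h3. Sketches

For concreteness, here is a minimal sketch of the post-processor described above. The class name, the {{removeOldDynamic.prefix}} request parameter, and the prefix-based matching are hypothetical placeholders for our real logic; only the {{SimpleUpdateProcessorFactory}} base class and its {{process}} hook are taken from our actual code above:

{code}
import java.util.ArrayList;

import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.SimpleUpdateProcessorFactory;

public class RemoveOldDynamicFieldsProcessorFactory extends SimpleUpdateProcessorFactory {

  // hypothetical request parameter naming the prefix of the old dynamic fields
  private static final String NAME = "removeOldDynamic";

  @Override
  protected void process(AddUpdateCommand cmd, SolrQueryRequest req, SolrQueryResponse rsp) {
    String prefix = cmd.getReq().getParams().get(NAME + ".prefix");
    if (prefix == null) {
      return; // nothing requested for this update
    }
    SolrInputDocument doc = cmd.getSolrInputDocument();
    // copy the names first: removeField() mutates the document while we iterate
    for (String name : new ArrayList<>(doc.getFieldNames())) {
      if (name.startsWith(prefix)) {
        doc.removeField(name);
      }
    }
  }
}
{code}

The worry is exactly the {{prefix == null}} branch: since the transaction log stores neither the update chain nor the request params, the synthetic replay request would not carry the parameter, and the removal would silently not happen.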
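And a sketch of the workaround variant, where the command travels inside the document instead of the request params. Only the {{process}} method of the class above changes; {{removeDynamicCommand}} is a hypothetical technical field name:

{code}
@Override
protected void process(AddUpdateCommand cmd, SolrQueryRequest req, SolrQueryResponse rsp) {
  SolrInputDocument doc = cmd.getSolrInputDocument();
  // hypothetical technical field carrying the removal command; it is part of
  // the document itself, so it travels with whatever the tlog stores
  Object command = doc.getFieldValue("removeDynamicCommand");
  if (command == null) {
    return;
  }
  String prefix = command.toString();
  for (String name : new ArrayList<>(doc.getFieldNames())) {
    if (name.startsWith(prefix)) {
      doc.removeField(name);
    }
  }
  // drop the technical field so it is never indexed
  doc.removeField("removeDynamicCommand");
}
{code}

This still assumes that the post-processor runs during replay at all, which is the open question above.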
> Transaction log does not store the update chain (or req params?) used for updates
> ----------------------------------------------------------------------------------
>
>                 Key: SOLR-8030
>                 URL: https://issues.apache.org/jira/browse/SOLR-8030
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 5.3
>            Reporter: Ludovic Boutros
>            Priority: Major
>         Attachments: SOLR-8030.patch
>
> Transaction Log does not store the update chain, or any other details from the original update request such as the request params, used during updates. Therefore the tlog uses the default update chain, and a synthetic request, during log replay.
> If we implement custom update logic with multiple distinct update chains that use custom processors after DistributedUpdateProcessor, or if the default chain uses processors whose behavior depends on other request params, then log replay may be incorrect.
> Potentially problematic scenarios (need test cases):
> * DBQ where the main query string uses local param variables that refer to other request params
> * custom update chain set as {{default="true"}} using something like StatelessScriptUpdateProcessorFactory after DUP, where the script depends on request params
> * multiple named update chains with different processors configured after DUP and specific requests sent to different chains -- ex: ParseDateProcessor w/ custom formats configured after DUP in some special chains, but not in the default chain