Jason Gerlowski created SOLR-14190:
--------------------------------------

             Summary: Add multi-shard support to TaggerRequestHandler
                 Key: SOLR-14190
                 URL: https://issues.apache.org/jira/browse/SOLR-14190
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: query
    Affects Versions: master (9.0)
            Reporter: Jason Gerlowski

As documented in the ref-guide, the Tagger Handler currently works only on single-shard collections. Users who invoke {{/tag}} on a multi-shard collection get results that represent the tags from just one of the shards.

This is easy to reproduce with the tagger tutorial in the [docs|https://lucene.apache.org/solr/guide/8_2/the-tagger-handler.html#tutorial-with-geonames]. If the geonames collection is created with multiple shards (e.g. {{bin/solr create -c geonames -shards 2}}), the tags returned by the API vary based on which shard ends up serving the request. Repeating the same request returns different results:

{code}
➜  solr git:(master) ✗ curl -X POST 'http://localhost:8983/solr/geonames2/tag?overlaps=NO_SUB&tagsLimit=5000&fl=id,name,countrycode&wt=json&indent=on' -H 'Content-Type:text/plain' -d 'Hello New York City'
{
  "responseHeader":{...},
  "tagsCount":2,
  "tags":[[
      "startOffset",10,
      "endOffset",14,
      "ids",["4098776",
        "4562407"]],
    [
      "startOffset",15,
      "endOffset",19,
      "ids",["8347868"]]],
  "response":{"numFound":3,"start":0,"docs":[
      {
        "id":"8347868",
        "name":["City"],
        "countrycode":["AU"]},
      {
        "id":"4098776",
        "name":["York"],
        "countrycode":["US"]},
      {
        "id":"4562407",
        "name":["York"],
        "countrycode":["US"]}]
  }}

➜  solr git:(master) ✗ curl -X POST 'http://localhost:8983/solr/geonames2/tag?overlaps=NO_SUB&tagsLimit=5000&fl=id,name,countrycode&wt=json&indent=on' -H 'Content-Type:text/plain' -d 'Hello New York City'
{
  "responseHeader":{...},
  "tagsCount":1,
  "tags":[[
      "startOffset",6,
      "endOffset",19,
      "ids",["5128581"]]],
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"5128581",
        "name":["New York City"],
        "countrycode":["US"]}]
  }}
{code}

Nothing inherent to {{/tag}} prevents it from handling multi-shard requests; it just wasn't a priority when the initial implementation was written. We should add distributed support to this request handler.
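
To illustrate what a distributed {{/tag}} response would need to reconcile, here is a minimal Python sketch that merges the tag lists from the two responses above. It is not Solr code and not a proposed design: the function name {{merge_no_sub}}, the tuple shape for a tag, and the assumption that each response came from a different shard are all hypothetical. It only assumes that {{overlaps=NO_SUB}} means a tag lying completely within another tag is not emitted.

```python
from typing import Dict, List, Set, Tuple

# A tag is (start_offset, end_offset, ids). This shape is an assumption
# modelled on the "tags" entries in the responses above.
Tag = Tuple[int, int, List[str]]


def merge_no_sub(per_shard_tags: List[List[Tag]]) -> List[Tag]:
    """Hypothetical merge of per-shard tags under NO_SUB overlap semantics."""
    # Union the ids of tags that cover the same span on different shards.
    by_span: Dict[Tuple[int, int], Set[str]] = {}
    for shard_tags in per_shard_tags:
        for start, end, ids in shard_tags:
            by_span.setdefault((start, end), set()).update(ids)

    spans = list(by_span)
    merged: List[Tag] = []
    for start, end in sorted(spans):
        # NO_SUB: drop a tag that lies completely within a different tag.
        is_sub = any(
            (s, e) != (start, end) and s <= start and end <= e
            for s, e in spans
        )
        if not is_sub:
            merged.append((start, end, sorted(by_span[(start, end)])))
    return merged


# Tags copied from the two responses above; treating them as coming from
# two different shards is an assumption.
shard_a = [(10, 14, ["4098776", "4562407"]), (15, 19, ["8347868"])]
shard_b = [(6, 19, ["5128581"])]

merged = merge_no_sub([shard_a, shard_b])
print(merged)
```

Under these assumptions, the "York" and "City" tags fall inside the "New York City" span and are dropped, so only the tag at offsets 6 to 19 survives. A real implementation would also have to merge the matching documents in the {{response}} section and honor the other {{overlaps}} modes.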