[ https://issues.apache.org/jira/browse/SOLR-15220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296185#comment-17296185 ]
Tim Owen commented on SOLR-15220:
---------------------------------

Example in a non-distributed search
{noformat}
{
  facet: {
    authors: {
      type:terms,
      field:author_s,
      sort: "count desc",
      limit:3,
      method:dvhash,
      facet: {
        "count": "min(followers_i)"
      }
    }
  }
}

"facets":{
  "count":2,
  "authors":{
    "buckets":[{
        "val":"bob",
        "count":1,
        "count":50000},
      {
        "val":"tim",
        "count":1,
        "count":12}]}}}
{noformat}

and then with a distributed search the values are merged (with other results from more shards)
{noformat}
"facets":{
  "count":3,
  "authors":{
    "buckets":[{
        "val":"bob",
        "count":50001},
      {
        "val":"tim",
        "count":27}]}}}
{noformat}

If I change the name from {{count}} to something else, it works correctly
{noformat}
"facets":{
  "count":3,
  "authors":{
    "buckets":[{
        "val":"tim",
        "count":2,
        "mycount":12},
      {
        "val":"bob",
        "count":1,
        "mycount":50000}]}}}
{noformat}

> Json faceting allows val and count as stat/subfacet names
> ---------------------------------------------------------
>
>                 Key: SOLR-15220
>                 URL: https://issues.apache.org/jira/browse/SOLR-15220
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Facet Module, JSON Request API
>    Affects Versions: 7.7.3, master (9.0), 8.8.1
>            Reporter: Tim Owen
>            Priority: Minor
>
> The json faceting API allows you to name your stats or subfacets with the
> names {{val}} or {{count}} which leads to confusing results or failed
> requests, because these names are effectively reserved by the code that
> builds the bucket responses.
>
> We noticed this by accident, when some new client code used the name
> {{count}} for a stat and we were getting unexpected results. What seems to be
> happening is that the NamedList from each shard contains *both* the true
> count and our stat value under the same key. Both NamedList and JSON/XML
> allow duplicates so there was no failure at this point.
> Then in distributed mode, the facet merger combines the values from both
> keys, and we ended up with the overall response having an inflated number
> for our stat.
>
> I think we could just validate against those 2 names being used for stats or
> subfacets, in the facet parser {{parseSubs}} method, to avoid this situation.
> I would rather know it's asking for trouble than allow it and get weird
> results or an exception. There may be other reserved names, it depends on the
> facet type used. Alternatively we could throw an exception if a duplicate key
> is used when building the NamedList response, although there isn't a central
> place to check that.
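
To make the failure mode in the issue description concrete, here is a minimal Python sketch of how a duplicate {{count}} key could inflate a merged total, together with the kind of name validation the description proposes. This is a simplified model and not Solr's actual merger or parser code: the list-of-pairs bucket stands in for NamedList, and {{merge_counts}}, {{validate_sub_names}} and {{RESERVED_NAMES}} are hypothetical names chosen for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# A bucket modelled as an ordered list of (key, value) pairs, so duplicate
# keys can coexist (a stand-in for NamedList, not Solr's actual class).
Bucket = List[Tuple[str, object]]

# Names the issue describes as effectively reserved in bucket responses.
RESERVED_NAMES = frozenset({"val", "count"})


def merge_counts(shard_buckets: Iterable[Bucket]) -> Dict[str, int]:
    """Sum every "count" entry per bucket value, duplicates included."""
    totals: Dict[str, int] = {}
    for bucket in shard_buckets:
        val = next(v for k, v in bucket if k == "val")
        for key, value in bucket:
            if key == "count":
                # A stat that was also named "count" is added here too.
                totals[val] = totals.get(val, 0) + value
    return totals


def validate_sub_names(names: Iterable[str]) -> None:
    """Reject stat/subfacet names that clash with reserved bucket keys."""
    clashes = sorted(RESERVED_NAMES.intersection(names))
    if clashes:
        raise ValueError(f"reserved facet names used: {clashes}")


# Buckets from the non-distributed example, treated as one shard's response.
shard = [
    [("val", "bob"), ("count", 1), ("count", 50000)],
    [("val", "tim"), ("count", 1), ("count", 12)],
]
merged = merge_counts(shard)
print(merged)  # {'bob': 50001, 'tim': 13}
```

In this sketch {{validate_sub_names(["mycount"])}} passes, while {{validate_sub_names(["count"])}} raises a ValueError before any request would be run. The value for tim is 13 rather than the 27 reported above because only one shard's buckets are merged here.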