[jira] [Commented] (SOLR-15220) Json faceting allows val and count as stat/subfacet names

Tim Owen (Jira) Fri, 05 Mar 2021 09:04:14 -0800


    [ https://issues.apache.org/jira/browse/SOLR-15220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296185#comment-17296185 ]


Tim Owen commented on SOLR-15220:
---------------------------------

Example in a non-distributed search, showing the request followed by the facets part of the response:
{noformat}
{ facet: { authors: { type:terms, field:author_s, sort: "count desc", limit:3, 
method:dvhash, facet: { "count": "min(followers_i)" } } } }

  "facets":{
    "count":2,
    "authors":{
      "buckets":[{
          "val":"bob",
          "count":1,
          "count":50000},
        {
          "val":"tim",
          "count":1,
          "count":12}]}}} {noformat}
 
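Since each bucket in the response above carries {{count}} twice, a client that parses it into a map can silently lose one of the two values. The following is a minimal sketch in Python, not part of Solr, that parses one bucket and reports repeated keys. The helper name find_duplicate_keys is made up for illustration.

```python
import json
from collections import Counter

# One bucket copied from the non-distributed response above.
bucket_json = '{"val": "bob", "count": 1, "count": 50000}'


def find_duplicate_keys(pairs):
    """object_pairs_hook: keep the pairs in order and list repeated keys."""
    counts = Counter(key for key, _ in pairs)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    return {"pairs": pairs, "duplicates": duplicates}


# A plain json.loads keeps only the last value seen for a repeated key.
plain = json.loads(bucket_json)

# With the hook, both values stay visible and the repeated key is reported.
checked = json.loads(bucket_json, object_pairs_hook=find_duplicate_keys)
```

Here plain ends up holding only the stat value under count, while checked keeps both pairs and names count as a duplicate.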

With a distributed search, the two values under {{count}} are merged into a 
single value, together with the results from the other shards:
{noformat}

  "facets":{
    "count":3,
    "authors":{
      "buckets":[{
          "val":"bob",
          "count":50001},
        {
          "val":"tim",
          "count":27}]}}} {noformat}
 
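The merged number for bob can be reproduced with a toy model. This is only a sketch in Python and not Solr's facet merger: it assumes the merger adds up every value it finds under the key {{count}}, and that bob appears on one shard only. The function name merge_bucket is hypothetical.

```python
# Toy model only: this is not Solr's facet merger. It assumes the merger
# adds up every numeric value it finds under the key "count".


def merge_bucket(shard_buckets):
    """Merge buckets given as ordered (key, value) pair lists, one per shard."""
    merged = {"val": None, "count": 0}
    for pairs in shard_buckets:
        for key, value in pairs:
            if key == "val":
                merged["val"] = value
            elif key == "count":
                # Both the true count and the stat value land here.
                merged["count"] += value
    return merged


# The "bob" bucket from the single-shard response, as ordered pairs so the
# duplicate key survives. Assumption: no other shard returns "bob".
bob_shard = [("val", "bob"), ("count", 1), ("count", 50000)]
merged_bob = merge_bucket([bob_shard])
```

Under these assumptions merged_bob has count 50001, that is 1 + 50000, which matches the distributed response.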

If I change the stat name from {{count}} to something else ({{mycount}} here), it works correctly:
{noformat}

  "facets":{
    "count":3,
    "authors":{
      "buckets":[{
          "val":"tim",
          "count":2,
          "mycount":12},
        {
          "val":"bob",
          "count":1,
          "mycount":50000}]}}}
 {noformat}

> Json faceting allows val and count as stat/subfacet names
> ---------------------------------------------------------
>
>                 Key: SOLR-15220
>                 URL: https://issues.apache.org/jira/browse/SOLR-15220
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Facet Module, JSON Request API
>    Affects Versions: 7.7.3, master (9.0), 8.8.1
>            Reporter: Tim Owen
>            Priority: Minor
>
> The JSON faceting API allows you to name your stats or subfacets {{val}} or 
> {{count}}, which leads to confusing results or failed requests, because 
> these names are effectively reserved by the code that builds the bucket 
> responses.
> We noticed this by accident, when some new client code used the name 
> {{count}} for a stat and we got unexpected results. What seems to be 
> happening is that the NamedList from each shard contains *both* the true 
> count and our stat value under the same key. Both NamedList and JSON/XML 
> allow duplicate keys, so nothing failed at this point. Then, in distributed 
> mode, the facet merger combines the values from both keys, and the overall 
> response ended up with an inflated number for our stat.
> I think we could just validate against those two names being used for stats 
> or subfacets, in the facet parser {{parseSubs}} method, to avoid this 
> situation. I would rather be told that the request is asking for trouble 
> than have it accepted and get weird results or an exception. There may be 
> other reserved names, depending on the facet type used. Alternatively, we 
> could throw an exception if a duplicate key is used when building the 
> NamedList response, although there isn't a central place to check that.
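
The validation idea from the description can be sketched outside Solr. The Python below is an illustration only and not the {{parseSubs}} code: it walks a facet request given as nested dicts and returns the paths of stats or subfacets named {{val}} or {{count}}. The names RESERVED_BUCKET_KEYS and find_reserved_names are hypothetical, and the reserved set holds only the two names from this issue.

```python
# The two names discussed in this issue; other facet types may reserve more.
RESERVED_BUCKET_KEYS = {"val", "count"}


def find_reserved_names(facets, path="facet", nested=False):
    """Return paths of stats/subfacets whose name is a reserved bucket key."""
    problems = []
    for name, spec in facets.items():
        here = f"{path}.{name}"
        # Only names below a parent facet end up inside a bucket.
        if nested and name in RESERVED_BUCKET_KEYS:
            problems.append(here)
        # Subfacets and stats live under the "facet" key of a facet spec.
        if isinstance(spec, dict) and isinstance(spec.get("facet"), dict):
            problems.extend(
                find_reserved_names(spec["facet"], here + ".facet", nested=True)
            )
    return problems


def authors_request(stat_name):
    """Build the request from the comment with a chosen stat name."""
    return {
        "authors": {
            "type": "terms",
            "field": "author_s",
            "sort": "count desc",
            "limit": 3,
            "method": "dvhash",
            "facet": {stat_name: "min(followers_i)"},
        }
    }


bad = find_reserved_names(authors_request("count"))
good = find_reserved_names(authors_request("mycount"))
```

A caller could reject the request whenever the returned list is non-empty. Here bad contains the single path facet.authors.facet.count and good is empty.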



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
