Hi,

I totally agree about the schema; it can't be the issue.

That said, I ran a naive test. Using the schema and the data you attached,
I searched the same Solr instance as if it were two different shards.

For example, searching for:

http://localhost:8081/solr/select?q=*:*&wt=json&indent=true&facet=true&facet.field=ID_bent

gives the correct response (2 hits and correct facet counts):


    "docs": [
      {
        "id": "1",
        "ID_bent": "#77762702P#77762953Y#77768200D#77763320M#77760725D#",
        "score": 2.5645533
      },
      {
        "id": "2",
        "ID_bent": "#77760631F#77766156N#77760725D#77762702P#77765788N#48991207P#77762953Y#77760302T#12312312K#89890001K#77768200D#89890003T#11111111H#77763453T#99999999R#00020080J#Y4332393N#04889446Z#12345655Z#77763320M#11100336Z#Y4222970X#",
        "score": 2.5645533
      }
    ]
  },
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "ID_bent": [
        "77760725D",
        2,
        "77762702P",
        2,
        "77762953Y",
        2,
        "77763320M",
        2,
        "77768200D",
        2,
        "00020080J",



but when searching for:

http://localhost:8081/solr/select?q=*:*&wt=json&indent=true&facet=true&facet.field=ID_bent&shards=localhost:8081/solr,localhost:8081/solr

I get the following response. The document list is fine, because the
documents found on both "shards" can be merged using the id, but the facet
counts are wrong, because the facet values can't be merged that way.

    "numFound": 2,
    "start": 0,
    "maxScore": 2.5645533,
    "docs": [
      {
        "id": "1",
        "ID_bent": "#77762702P#77762953Y#77768200D#77763320M#77760725D#",
        "score": 2.5645533
      },
      {
        "id": "2",
        "ID_bent": "#77760631F#77766156N#77760725D#77762702P#77765788N#48991207P#77762953Y#77760302T#12312312K#89890001K#77768200D#89890003T#11111111H#77763453T#99999999R#00020080J#Y4332393N#04889446Z#12345655Z#77763320M#11100336Z#Y4222970X#",
        "score": 2.5645533
      }
    ]
  },
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "ID_bent": [
        "77760725D",
        4,
        "77762702P",
        4,
        "77762953Y",
        4,
        "77763320M",
        4,
        "77768200D",
        4,
        "00020080J",
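
In case you want to repeat the comparison, here is a rough Python sketch.
It only rebuilds the two request URLs used above and compares the facet
counts copied from the two responses. The helper names (build_url,
facet_pairs_to_dict, count_ratios) are mine and are not part of Solr, and
the sketch does not contact any server.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8081/solr/select"  # the test instance used above


def build_url(shards=None):
    """Build the select URL used in the test; 'shards' is added only if given."""
    params = [
        ("q", "*:*"),
        ("wt", "json"),
        ("indent", "true"),
        ("facet", "true"),
        ("facet.field", "ID_bent"),
    ]
    if shards:
        params.append(("shards", ",".join(shards)))
    # Keep "*", ":", "," and "/" readable instead of percent-encoding them.
    return BASE + "?" + urlencode(params, safe="*:,/")


def facet_pairs_to_dict(flat):
    """Turn the flat [value, count, value, count, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


def count_ratios(single, sharded):
    """Ratio sharded/single for every facet value present in both results."""
    return {v: sharded[v] / single[v] for v in single if v in sharded and single[v]}


# Counts copied from the two responses above (only the values shown there).
single = facet_pairs_to_dict(
    ["77760725D", 2, "77762702P", 2, "77762953Y", 2, "77763320M", 2, "77768200D", 2]
)
sharded = facet_pairs_to_dict(
    ["77760725D", 4, "77762702P", 4, "77762953Y", 4, "77763320M", 4, "77768200D", 4]
)
ratios = count_ratios(single, sharded)

print(build_url())
print(build_url(["localhost:8081/solr", "localhost:8081/solr"]))
print(ratios)
```

Every value shown above comes out with a ratio of 2, which is what you
would expect if each matching document is counted once per shard.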


So there could be two different issues:

   - Duplicated documents (the same document indexed on each shard)
   - A bad request that includes the list of shards twice (although I
   would bet that the first one, duplicated documents, is the real cause)
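
To illustrate the first case, here is a small Python model of what I think
is going on. It is only a simplified sketch under my own assumptions, not
Solr's actual merge code: each "shard" counts facet values over its own
documents, the coordinator adds the per-shard counts, and documents are
merged by id. All function names are made up for the example.

```python
from collections import Counter


def tokenize(id_bent):
    """Split on '#', dropping the empty strings at both ends."""
    return [t for t in id_bent.split("#") if t]


def shard_response(docs):
    """Per-shard result: the docs plus one facet count per doc containing a value."""
    counts = Counter()
    for doc in docs:
        counts.update(set(tokenize(doc["ID_bent"])))
    return {"docs": docs, "facets": counts}


def merge(responses):
    """Merge shard results: docs are deduplicated by id, facet counts are added."""
    merged_docs = {}
    merged_facets = Counter()
    for resp in responses:
        for doc in resp["docs"]:
            merged_docs.setdefault(doc["id"], doc)
        merged_facets.update(resp["facets"])
    return list(merged_docs.values()), merged_facets


# The two documents from the attached data.
docs = [
    {"id": "1", "ID_bent": "#77762702P#77762953Y#77768200D#77763320M#77760725D#"},
    {"id": "2", "ID_bent": "#77760631F#77766156N#77760725D#77762702P#77765788N"
                           "#48991207P#77762953Y#77760302T#12312312K#89890001K"
                           "#77768200D#89890003T#11111111H#77763453T#99999999R"
                           "#00020080J#Y4332393N#04889446Z#12345655Z#77763320M"
                           "#11100336Z#Y4222970X#"},
]

# One shard holding both docs, versus two shards each holding the same docs.
single_docs, single_facets = merge([shard_response(docs)])
dup_docs, dup_facets = merge([shard_response(docs), shard_response(docs)])

print(len(single_docs), single_facets["77760725D"], single_facets["00020080J"])
print(len(dup_docs), dup_facets["77760725D"], dup_facets["00020080J"])
```

In this model the number of documents stays at 2 in both cases, while every
facet count doubles when the same documents live on both shards (2 becomes
4 and 1 becomes 2), which matches what David reported.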

Hope it helps.








On Fri, Feb 6, 2015 at 1:27 PM, <david.dav...@correo.aeat.es> wrote:

> Hi Alvaro,
>
> this is the definition:
>
> <fieldType name="entidades" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="solr.PatternTokenizerFactory" pattern="#"/>
>   </analyzer>
> </fieldType>
>
>
> As you can see, we store all the IDs separated by "#". This has normally
> worked fine, and I don't think the problem has anything to do with the
> definition.
> Besides, I have seen that when the correct value in the facet field would
> be 2, Solr shows 4, and when it would be 1, it shows 2. In short, for
> some reason the values are being doubled. Why? I have no idea. And it
> doesn't always happen; in fact, it only happens with some queries or some
> documents. It's very weird. Maybe SolrCloud is merging the results from
> the two shards incorrectly in some situations, but I have no idea.
>
> Regards,
>
>
> David Dávila Atienza
> AEAT - Departamento de Informática Tributaria
> Subdirección de Tecnologías de Análisis de la Información e Investigación
> del Fraude
> Phone: 915828763
> Extension: 36763
>
>
>
> From:    Alvaro Cabrerizo <topor...@gmail.com>
> To:      "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>,
> Date:    06/02/2015 12:34
> Subject: Re: Problem with faceting
>
>
>
> Hi David,
>
> Yes, it sounds weird.
>
> Just for testing purposes, it would be nice to have the ID_bent field type
> definition.
>
> Regards.
>
> On Fri, Feb 6, 2015 at 9:05 AM, <david.dav...@correo.aeat.es> wrote:
>
> > Hello,
> >
> > we have been using faceting for a long time, but I have now found a
> > problem that I can't understand.
> >
> > The issue is that, in a query with 2 results, Solr reports a count of 4
> > for some facet values. But faceting only applies to the result
> > documents, so I think this makes no sense.
> >
> > This is the query:
> >
> >
> >   "responseHeader": {
> >     "status": 0,
> >     "QTime": 330,
> >     "params": {
> >       "facet": "true",
> >       "fl": "ID_bent",
> >       "indent": "true",
> >       "q": "aitana",
> >       "_": "1423207958751",
> >       "facet.field": "ID_bent",
> >       "wt": "json",
> >       "fq": "ee_Procedimiento:ZZ12 AND ee_Referencia:\"CURSO\" AND
> > doc_FormatoDocumento:PDF"
> >     }
> >   },
> >   "response": {
> >     "numFound": 2,
> >     "start": 0,
> >     "maxScore": 0.17735688,
> >     "docs": [
> >       {
> >         "ID_bent": "#77762702P#77762953Y#77768200D#77763320M#77760725D#"
> >       },
> >       {
> >         "ID_bent": "#77760631F#77766156N#77760725D#77762702P#77765788N#48991207P#77762953Y#77760302T#12312312K#89890001K#77768200D#89890003T#11111111H#77763453T#99999999R#00020080J#Y4332393N#04889446Z#12345655Z#77763320M#11100336Z#Y4222970X#"
> >       }
> >     ]
> >   },
> >   "facet_counts": {
> >     "facet_queries": {},
> >     "facet_fields": {
> >       "ID_bent": [
> >         "77760725D",
> >         4,
> >         "77762702P",
> >         4,
> >         "77762953Y",
> >         4,
> >         "77763320M",
> >         4,
> >         "77768200D",
> >         4,
> >         "00000336Z",
> >         2,
> >         "00020000J",
> >         2,
> >         "04889446Z",
> >         2,
> >         "11111111H",
> >         2,
> >         "12312312K",
> >         2,
> >         "12345655Z",
> >         2,
> >         "48261207P",
> >         2,
> >         "77760302T",
> >         2,
> >         "77760631F",
> >         2,
> >         "77763453T",
> >         2,
> >         "77765788N",
> >         2,
> >
> >
> > We are using Solr 4.7 in cloud configuration with 2 shards. Any idea
> > what is happening?
> >
> > Thanks in advance,
> >
> > David Dávila Atienza
> > AEAT - Departamento de Informática Tributaria
> > Subdirección de Tecnologías de Análisis de la Información e
> > Investigación del Fraude
> >
>
>
