Re: Solr Grouping and empty fields

2013-02-24 Thread Teun Duynstee
We had a comparable situation. We created an extra field and, at index time,
copy the value into it if there is one and generate a unique dummy value if
there is none. We couldn't just make the initial field required, because it
has a meaning other than just being a grouping key.
Teun
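
A minimal sketch of the index-time step described above, assuming the
documents pass through a Python preprocessing stage before being sent to
Solr. The field names "z" and "z_group", the "nogroup-" prefix, and the
function name are hypothetical; the only idea taken from the thread is
"copy the value if present, otherwise generate a unique dummy value".

```python
import uuid


def add_group_key(doc, source_field="z", group_field="z_group"):
    """Return a copy of doc with a dedicated grouping field filled in.

    The original field is left untouched, so it keeps its own meaning.
    """
    out = dict(doc)
    value = doc.get(source_field)
    if value is None or str(value).strip() == "":
        # No real value: use a unique dummy so the document forms its own group.
        out[group_field] = "nogroup-" + uuid.uuid4().hex
    else:
        # Real value: documents sharing it will still be grouped together.
        out[group_field] = value
    return out
```

Grouping would then be done on the extra field (group.field=z_group in this
sketch) instead of the original one. The prefix should be chosen so that it
cannot collide with a real value.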
On 22 Feb 2013 20:47, "Daniel Collins" wrote:

> We had something similar: a cluster information field which was
> unfortunately optional, so all the documents that didn't have this field
> set were grouped together.
>
> It isn't Solr's fault, to be fair. We told it to group on the values of
> field Z; null is a valid value and lots of documents have that value, so
> they all group together.  We got what we asked for :-)
>
> Our solution was to make that field mandatory, and in our indexing
> pipeline we will set that field to some unique value (same as the document
> key if necessary) if it isn't set already to ensure that every document has
> that field set appropriately.
>
> -----Original Message-----
> From: Oussama Jilal
> Sent: Friday, February 22, 2013 5:25 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Grouping and empty fields
>
> OK, I'm sorry if I did not explain my need well. I'll try to give a
> better explanation.
>
> What I have: millions of documents that have a field X, another field
> Y and another field Z which is not required (so it can be empty in some
> documents and not in others).
>
> What I want to do: search for docs whose field X equals something and
> group them by field Z (so that only 1 document is returned for every
> field Z value), BUT I want the documents whose field Z is empty to be
> included in the results (all of them), and I want to sort the results by
> field Y (so I can't separate the request into two requests).
>
> I hope that this is clearer.
>
>
> On 02/22/2013 03:59 PM, Jack Krupansky wrote:
>
>> What?!?! You want them grouped but not grouped together?? What on earth
>> does that mean?! I mean, either they are included or they are not. All
>> results will be in some group, so where exactly do you want these "not to
>> be grouped together" documents to be grouped? In any case, please clarify
>> what your expectations really are.
>>
>> -- Jack Krupansky
>> -----Original Message-----
>> From: Oussama Jilal
>> Sent: Friday, February 22, 2013 7:17 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr Grouping and empty fields
>>
>> Thank you Johannes, but I want the documents that have the field empty to
>> be included in the results, just not grouped together, and if I
>> understood your solution correctly, it will simply remove those
>> documents from the results. (Note: the field values are highly variable
>> and unknown to me.)
>>
>> On 02/22/2013 02:53 PM, Johannes Rodenwald wrote:
>>
>>> Hi Oussama,
>>>
>>> If you have only a few distinct, unchanging values in the field that you
>>> group on, you could add a filter query (query parameter "fq") to the
>>> query that allows all valid values but not an empty field. For
>>> example:
>>>
>>> fq=my_grouping_string_field:( value_a OR value_b OR value_c OR value_d )
>>>
>>> If you use Solr 4.x, you should be able to group on an integer field,
>>> which allows a range filter:
>>> (I still work with 3.6, which can only group on string fields, so I didn't
>>> test this one)
>>>
>>> fq=my_grouping_integer_field:[1 TO *]
>>>
>>> --
>>> Johannes Rodenwald
>>>
>>>
>>> - Original Message -
>>> From: "Oussama Jilal"
>>> To: solr-user@lucene.apache.org
>>> Sent: Friday, 22 February 2013 12:32:13
>>> Subject: Solr Grouping and empty fields
>>>
>>> Hi,
>>>
>>> I need to group some results in Solr based on a field, but I don't want
>>> documents that have that field empty to be grouped together. Does anyone
>>> know how to achieve that?
>>>
>>>
>>
> --
> Oussama Jilal
>
>


Re: Solr Grouping and empty fields

2013-02-24 Thread Teun Duynstee
That would depend on your indexing setup. We have a custom application for
indexing, so we just make a value up. In our case a GUID (UUID). But I
imagine that you could also just copy your id field with a prefix. It
depends on your data and tools.
Teun
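
As a rough illustration of the "copy your id field with a prefix" variant
mentioned above, here is a small sketch. The field names "z" and "id", the
"__nogroup__" prefix, and the function name are assumptions made for the
example, not anything Solr provides.

```python
def group_key_for(doc, source_field="z", id_field="id", prefix="__nogroup__"):
    """Return the grouping key for a document.

    Uses the real value when there is one; otherwise derives a key from the
    document id, which is already unique per document.
    """
    value = doc.get(source_field)
    if value not in (None, ""):
        return value
    return prefix + str(doc[id_field])
```

Compared with a random UUID, a key derived from the id stays the same when
the document is reindexed, which may or may not matter for your setup.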
On 24 Feb 2013 15:00, "Jilal Oussama" wrote:

> Oh, this is a good one! Thank you very much, Teun. (But I have to ask:
> how do you generate a unique value for the copy field when the original
> one is empty? Do you do this manually, or can Solr do it?)
> And thanks again.


Re: numFound is not correct while using Result Grouping

2013-02-25 Thread Teun Duynstee
You have to set group.ngroups=true (see
http://wiki.apache.org/solr/FieldCollapsing). Be aware that including the
number of groups is a surprisingly heavy operation, though.

Teun
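
A small sketch of a grouped request with group.ngroups=true, built with the
Python standard library. The base URL, the query, and the field name in the
usage lines are placeholders; only the group, group.field, and group.ngroups
parameters come from the FieldCollapsing documentation.

```python
from urllib.parse import urlencode


def grouped_query_url(base_url, query, group_field):
    """Build a select URL that asks Solr to also report the number of groups."""
    params = [
        ("q", query),
        ("group", "true"),
        ("group.field", group_field),
        ("group.ngroups", "true"),  # adds the group count; can be expensive
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical core URL and field names, for illustration only.
url = grouped_query_url(
    "http://localhost:8983/solr/collection1/select", "x:something", "z"
)
print(url)
```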


2013/2/25 Nicholas Ding 

> Hello,
>
> I grouped the results and set group.main=true. I was expecting numFound
> to equal the number of groups, but it did not.
>
> How do I get the number of groups?
>
> Thanks
> Nicholas
>


Re: numFound is not correct while using Result Grouping

2013-02-25 Thread Teun Duynstee
Ah, I see. The docs say "Although this result format does not have as much
information, it may be easier for existing solr clients to parse". I guess
the ngroups value could be added to this format, but apparently it isn't. I
do agree with you that, to be useful (as in readable by a client that
doesn't know the grouped format), the number should be that of the
groups, not of the documents.

A quick look at the code shows that it is indeed not calculated in this
case, and it is not completely trivial to fix. Could you use
group.format=simple instead? That will work with ngroups.

Teun
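
A sketch of how a client might read the group count when using
group.format=simple together with group.ngroups=true and wt=json. The
response shape in the usage lines (grouped -> field name -> matches, ngroups,
doclist) is my understanding of that format and should be checked against
your Solr version; the field name "z" and the numbers are made up.

```python
def count_groups(response, group_field):
    """Return the ngroups value for group_field, or None if it is absent."""
    section = response.get("grouped", {}).get(group_field, {})
    return section.get("ngroups")


# Assumed shape of a parsed JSON response; values are invented.
sample = {
    "grouped": {
        "z": {
            "matches": 120,
            "ngroups": 17,
            "doclist": {"numFound": 120, "start": 0, "docs": []},
        }
    }
}
print(count_groups(sample, "z"))
```

Note that numFound inside the doclist is still the number of matching
documents, so the group count has to be read from ngroups.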


2013/2/25 Nicholas Ding 

> Thanks Teun and Carlos. I set group.ngroups=true, but I don't get the
> "ngroups" number when I use group.main=true.
>
> On Mon, Feb 25, 2013 at 12:02 PM, Carlos Maroto <
> cmar...@searchtechnologies.com> wrote:
>
> > Use group.ngroups, check it in the Solr wiki for FieldCollapsing
> >
> > Carlos Maroto
> > Search Architect at Search Technologies (www.searchtechnologies.com)
>


Faceting on the first part or first letter of values

2013-02-26 Thread Teun Duynstee
What I really miss in the SimpleFacets component is the ability to get
facets not on the full term, but grouped by the first letter(s). I opened a
Jira issue for this (https://issues.apache.org/jira/browse/SOLR-4496). I
also wrote a patch with a rather simplistic first attempt at an
implementation.

Now that I've had a better look at faceting on multivalued fields and the
inner workings of UnInvertedField, I see that doing it right is harder than
I thought, but I'd still like to give it a try.

So can anyone give me some tips on how to approach this? Should I treat the
facet on the first letters as a completely independent field? Should I
build the index based on the index of the complete field? By the nature of
this kind of facet, you'll always have a fairly limited number of terms.

Thanks a lot,
Teun
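
To make the intended behaviour concrete, here is a plain in-memory sketch of
first-letter facet counts over a multivalued field. It is not how the
SOLR-4496 patch or UnInvertedField works; the function name, the "name" field
in the usage lines, and the lowercasing are assumptions for illustration. The
point it shows is that a document with several values sharing a prefix should
be counted once for that prefix, so simply summing the full-term facet counts
would overcount.

```python
from collections import Counter


def prefix_facet(docs, field, length=1):
    """Count documents per leading prefix of the values in field."""
    counts = Counter()
    for doc in docs:
        values = doc.get(field) or []
        if isinstance(values, str):
            values = [values]  # treat a single value like a one-element list
        # A set ensures each document counts at most once per prefix.
        prefixes = {v[:length].lower() for v in values if v}
        counts.update(prefixes)
    return dict(counts)


docs = [
    {"name": ["Apple", "Avocado"]},
    {"name": ["Banana"]},
    {"name": "apricot"},
    {},
]
print(prefix_facet(docs, "name"))
```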