Re: Solr Grouping and empty fields
We had a comparable situation. We created an extra field: at index time we copy the value into it if there is one, and create a unique dummy value if there is none. We couldn't just make the original field required, because it has a meaning other than just being a grouping key.

Teun

On 22 Feb 2013 20:47, "Daniel Collins" wrote:

> We had something similar: a cluster information field which was
> unfortunately optional, so all the documents that didn't have this field
> set grouped together.
>
> It isn't Solr's fault, to be fair. We told it to group on the values of
> field Z, null is a valid value and lots of documents have that value, so
> they all group together. We got what we asked for :-)
>
> Our solution was to make that field mandatory, and in our indexing
> pipeline we set that field to some unique value (same as the document
> key if necessary) if it isn't set already, to ensure that every document
> has that field set appropriately.
>
> -----Original Message----- From: Oussama Jilal
> Sent: Friday, February 22, 2013 5:25 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Grouping and empty fields
>
> OK, I'm sorry if I did not explain my need well. I'll try to give a
> better explanation.
>
> What I have: millions of documents that have a field X, another field
> Y and another field Z which is not required (so it can be empty in some
> documents and not in others).
>
> What I want to do: search for docs where field X equals something and
> group them by field Z (so that only 1 document is returned for every
> field Z value), BUT I want documents that have field Z empty to be
> included in the results (all of them), and sort the results by field Y
> (so I can't separate the request into two requests).
>
> I hope that this is clearer.
>
> On 02/22/2013 03:59 PM, Jack Krupansky wrote:
>
>> What?!?! You want them grouped but not grouped together?? What on earth
>> does that mean?! I mean, either they are included or they are not. All
>> results will be in some group, so where exactly do you want these "not to
>> be grouped together" documents to be grouped? In any case, please clarify
>> what your expectations really are.
>>
>> -- Jack Krupansky
>> -----Original Message----- From: Oussama Jilal
>> Sent: Friday, February 22, 2013 7:17 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr Grouping and empty fields
>>
>> Thank you Johannes, but I want the documents having the field empty to
>> be included in the results, just not to be grouped together, and if I
>> understood your solution correctly, it will simply remove those
>> documents from the results (Note: the field values are very variable
>> and unknown to me).
>>
>> On 02/22/2013 02:53 PM, Johannes Rodenwald wrote:
>>
>>> Hi Oussama,
>>>
>>> If you have only a few distinct, unchanging values in the field that you
>>> group upon, you could implement a FilterQuery (query parameter "fq") and
>>> add it to the query, allowing all valid values, but not an empty field.
>>> For example:
>>>
>>> fq=my_grouping_string_field:( value_a OR value_b OR value_c OR value_d )
>>>
>>> If you use SOLR 4.x, you should be able to group upon an integer field,
>>> allowing a range filter:
>>> (I still work with 3.6, which can only group on string fields, so I
>>> didn't test this one)
>>>
>>> fq=my_grouping_integer_field:[**1 TO *]
>>>
>>> --
>>> Johannes Rodenwald
>>>
>>> ----- Original Message -----
>>> From: "Oussama Jilal"
>>> To: solr-user@lucene.apache.org
>>> Sent: Friday, 22 February 2013 12:32:13
>>> Subject: Solr Grouping and empty fields
>>>
>>> Hi,
>>>
>>> I need to group some results in Solr based on a field, but I don't want
>>> documents having that field empty to be grouped together. Does anyone
>>> know how to achieve that?
>>>
>>
> --
> Oussama Jilal
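
The index-time workaround described in this message can be sketched roughly as follows. This is only an illustration in Python, not the posters' actual pipeline: the derived field name `z_group`, the `__nogroup__` marker and the sample documents are all made up, and the fallback uses the document key as Daniel suggests.

```python
from typing import Any, Dict


def add_group_key(doc: Dict[str, Any],
                  source_field: str = "z",
                  group_field: str = "z_group",  # hypothetical derived field
                  id_field: str = "id") -> Dict[str, Any]:
    """Return a copy of doc in which group_field is always populated."""
    out = dict(doc)
    value = doc.get(source_field)
    if value not in (None, ""):
        # A real value exists: group on it as usual.
        out[group_field] = value
    else:
        # No value: derive a per-document key, so documents with an empty
        # source field never end up in one shared group.
        out[group_field] = f"__nogroup__{doc[id_field]}"
    return out


# Made-up sample documents using the X/Y/Z field roles from the thread.
docs = [
    {"id": "1", "x": "foo", "y": 3, "z": "red"},
    {"id": "2", "x": "foo", "y": 1},
    {"id": "3", "x": "foo", "y": 2, "z": ""},
]
prepared = [add_group_key(d) for d in docs]
for d in prepared:
    print(d["id"], d["z_group"])
```

The original field stays untouched, so it keeps its own meaning; a query would then group on the derived field (group=true&group.field=z_group) and can still sort on Y in a single request.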
Re: Solr Grouping and empty fields
That would depend on your indexing setup. We have a custom application for indexing, so we just make a value up; in our case a GUID (UUID). But I imagine that you could also just copy your id field with a prefix. It depends on your data and tools.

Teun

On 24 Feb 2013 15:00, "Jilal Oussama" wrote:

> Oh, this is a good one! Thank you very much, Teun. (But I will have to ask
> you: how do you generate a unique value for the copy field when the original
> one is empty? Do you do this manually, or can Solr do it?)
> And thanks again.
> On Feb 24, 2013 12:11 PM, "Teun Duynstee" wrote:
>
> > We had a comparable situation. We created an extra field and at index time
> > copy the value if there is one and create a unique dummy value if there is
> > none. We couldn't just make the initial field required, because it has a
> > meaning other than just a grouping key.
> > Teun
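
The two ways of making up a value mentioned in this reply (a GUID, or the id field with a prefix) might look like this in Python. Both helper names and the `nogroup-` prefix are hypothetical; the prefix only needs to be something that cannot collide with a real grouping value in your data.

```python
import uuid


def random_key() -> str:
    """A random UUID: unique, but different every time the doc is reindexed."""
    return uuid.uuid4().hex


def key_from_id(doc_id: str, prefix: str = "nogroup-") -> str:
    """The document id with a prefix: unique per document and repeatable."""
    return prefix + doc_id


print(random_key())
print(key_from_id("42"))
```

Which one fits depends on the pipeline: the prefixed id is stable across reindexing, while the UUID needs no knowledge of the document key.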
Re: numFound is not correct while using Result Grouping
You have to set group.ngroups=true (see http://wiki.apache.org/solr/FieldCollapsing). Be aware, though, that including the number of groups is a surprisingly heavy operation.

Teun

2013/2/25 Nicholas Ding

> Hello,
>
> I grouped the result, and set group.main=true. I was expecting numFound
> to equal the number of groups, but it did not.
>
> How do I get the number of groups?
>
> Thanks
> Nicholas
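
As a rough illustration of the advice above, the request parameters and the place where the group count shows up in the default grouped JSON response could be handled like this. The field name `z`, the query and the sample response are hand-written for the example and were not captured from a real server.

```python
import json
from urllib.parse import urlencode

# Request parameters; "z" and the query are placeholders.
params = {
    "q": "x:something",
    "group": "true",
    "group.field": "z",
    "group.ngroups": "true",
    "wt": "json",
}
query_string = urlencode(params)

# Hand-written sample in the shape of the default grouped response format.
sample_response = json.loads("""
{
  "grouped": {
    "z": {
      "matches": 5,
      "ngroups": 2,
      "groups": [
        {"groupValue": "red",
         "doclist": {"numFound": 3, "start": 0, "docs": [{"id": "1"}]}},
        {"groupValue": "blue",
         "doclist": {"numFound": 2, "start": 0, "docs": [{"id": "4"}]}}
      ]
    }
  }
}
""")


def group_count(response: dict, field: str) -> int:
    """Number of groups for a field, as reported when group.ngroups=true."""
    return response["grouped"][field]["ngroups"]


print(query_string)
print(group_count(sample_response, "z"))
```

Here "matches" is the number of matching documents, while "ngroups" is the number of distinct groups.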
Re: numFound is not correct while using Result Grouping
Ah, I see. The docs say "Although this result format does not have as much information, it may be easier for existing solr clients to parse". I guess the ngroups value could be added to this format, but apparently it isn't. I do agree with you that to be useful (as in readable by a client that doesn't know the grouped format), the number should be that of the groups, not of the documents. A quick glance at the code shows that it is indeed not calculated in this case, and it is not completely trivial to fix. Could you use group.format=simple instead? That will work with ngroups.

Teun

2013/2/25 Nicholas Ding

> Thanks Teun and Carlos, I set group.ngroups=true, but I don't get this
> "ngroup" number when I am using group.main=true.
>
> On Mon, Feb 25, 2013 at 12:02 PM, Carlos Maroto <
> cmar...@searchtechnologies.com> wrote:
>
> > Use group.ngroups, check it in the Solr wiki for FieldCollapsing
> >
> > Carlos Maroto
> > Search Architect at Search Technologies (www.searchtechnologies.com)
> >
> > Nicholas Ding wrote:
> >
> > Hello,
> >
> > I grouped the result, and set group.main=true. I was expecting the numFound
> > equals to the number of groups, but actually it was not.
> >
> > How do I get the number of groups?
> >
> > Thanks
> > Nicholas
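
A small sketch of the suggested alternative, using the simple grouped format (the group.format parameter) in place of group.main=true. The helper names are hypothetical and the sample response is hand-written, on the assumption that the simple format returns a flat doclist next to the "matches" and "ngroups" counts.

```python
from typing import Any, Dict, Optional


def simple_format_params(query: str, field: str) -> Dict[str, str]:
    """Parameters for a flat document list that still reports ngroups."""
    return {
        "q": query,
        "group": "true",
        "group.field": field,
        "group.format": "simple",   # instead of group.main=true
        "group.ngroups": "true",
        "wt": "json",
    }


def read_counts(response: Dict[str, Any], field: str) -> Dict[str, Optional[int]]:
    """Pull the document and group counts out of a simple-format response."""
    section = response["grouped"][field]
    return {
        "matches": section["matches"],
        # None if the server did not include a group count.
        "ngroups": section.get("ngroups"),
    }


# Hand-written sample response, assumed shape for group.format=simple.
sample = {
    "grouped": {
        "z": {
            "matches": 5,
            "ngroups": 2,
            "doclist": {"numFound": 5, "start": 0,
                        "docs": [{"id": "1"}, {"id": "4"}]},
        }
    }
}

print(simple_format_params("x:something", "z"))
print(read_counts(sample, "z"))
```

A client still has to look one level below "grouped", but the documents arrive as a single flat list and the group count sits beside it.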
Faceting on the first part or first letter of values
What I really miss in the SimpleFaceting component is the ability to get facets not of the full term, but grouped by the first letter(s). I filed a Jira issue on this (https://issues.apache.org/jira/browse/SOLR-4496) and attached a patch with a rather simplistic first attempt at an implementation.

Now that I've had a better look at faceting on multivalued fields and the inner workings of UnInvertedField, I see that doing it right is harder than I thought, but I'd still like to give it a try. So can anyone give me some tips on how to approach this? Should I treat the facet on the first letters as a completely independent field? Should I build the index based on the index of the complete field? By the nature of this kind of facet, you'll always have a fairly limited number of terms.

Thanks a lot,
Teun
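
To make the intended result concrete, here is a client-side sketch in Python of what a first-letter facet over a multivalued field would have to return. It is not the SOLR-4496 patch and says nothing about how UnInvertedField works; the function name, the lowercasing and the sample data are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable, List


def prefix_facet(values_per_doc: Iterable[List[str]],
                 length: int = 1) -> Dict[str, int]:
    """Count documents (not terms) per prefix of the given length."""
    counts: Counter = Counter()
    for values in values_per_doc:
        # A set per document: a doc with two values sharing a prefix
        # must only be counted once for that prefix.
        prefixes = {v[:length].lower() for v in values if v}
        counts.update(prefixes)
    return dict(counts)


# One list of field values per document (made-up data).
docs = [["Apple", "Avocado"], ["Banana"], ["apricot", "Blueberry"], []]
facet = prefix_facet(docs)
print(facet)
```

The per-document set is the part that makes multivalued fields awkward: simply summing the counts of the full-term facets would report 3 for "a" here, although only 2 documents have a value starting with that letter.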