I'm using the signature field grouping method, as described on this page: https://cwiki.apache.org/confluence/display/solr/De-Duplication
In the de-duplication update processor, set fields to "title" so that the signature is computed from the title, and set signatureField to "signature". Then at query time, instead of &group=true&group.field=title, you can use &group=true&group.field=signature

Regards,
Edwin

On 4 November 2015 at 16:40, Jan Høydahl <jan....@cominvent.com> wrote:

> I second Toke’s recommendation to ensure you have a pure string-version of
> your title.
> For pure de-duplication you could also consider the lighter-weight
> CollapseComponent
>
> Instead of &group=true&group.field=title, use
> &fq={!collapse field=title_string}
>
> See
> https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results
> for more
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > On 3 Nov 2015 at 12:37, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
> >
> > On Tue, 2015-11-03 at 14:53 +0530, vishal raut wrote:
> >> I have indexed various videos in solr which I have in my database. I want
> >> to search for those video titles, but there can be duplicate video titles
> >> as well (If the video is same but source is different, this will have
> >> separate entry in solr). To remove those duplicate titles while searching,
> >> I am using solr group on title.
> >
> > And you get "Too many values for UnInvertedField faceting on field."
> >
> > There is a fairly low (16M per segment or something like that) limit to
> > the amount of unique values that can be uninverted. DocValues has a much
> > higher limit (2 billion I think. At least it works with 600M+ for us).
> >
> > Add your titles to a StrField with docValues, then group on that.
> >
> > - Toke Eskildsen, State and University Library, Denmark
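
For anyone comparing the two query styles mentioned in this thread, here is a rough Python sketch that only builds the request URLs; it does not contact Solr. The host, port and core name ("videos") and the helper names are assumptions made up for illustration. The field names "signature" and "title_string" are the ones used in the messages above.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint and core name; adjust to your own setup.
SOLR_SELECT = "http://localhost:8983/solr/videos/select"


def grouped_query(q, field):
    """Build a select URL that groups results on `field` (group=true)."""
    params = {"q": q, "group": "true", "group.field": field}
    return SOLR_SELECT + "?" + urlencode(params)


def collapsed_query(q, field):
    """Build a select URL that collapses results on `field` via a filter query."""
    params = {"q": q, "fq": "{!collapse field=%s}" % field}
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    # Grouping on the de-duplication signature field.
    print(grouped_query("title:cats", "signature"))
    # Collapsing on a pure string copy of the title.
    print(collapsed_query("title:cats", "title_string"))
```

urlencode takes care of escaping the braces, "!" and the space in the {!collapse ...} local-params syntax, which is easy to get wrong when pasting the filter into a URL by hand.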