Hi Erik,
  Thanks for the tip. That's a good point. Now that I think about it
more, maybe I will just do the word filtering up front and store the
result separately.

Darren

On Thu, 2009-07-30 at 13:05 -0400, Erik Hatcher wrote:
> On Jul 30, 2009, at 1:00 PM, Shalin Shekhar Mangar wrote:
> 
> > On Thu, Jul 30, 2009 at 9:53 PM, <dar...@ontrenet.com> wrote:
> >
> >> Hi,
> >> I am exploring the faceted search results of Solr. My query is like this:
> >>
> >>
> >> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=text&facet.limit=500&facet.prefix=wick
> >>
> >> If I don't use the prefix, I get back totals for terms like 1, a, of, 2, 3, 4:
> >> single-letter or single-digit occurrences in my documents. It's not really
> >> useful, since all the documents have some free-floating single-digit numbers.
> >>
> >> Is there a way to restrict the word frequency results for a facet based on
> >> term length, so I can set it to > 3, or is there a better way?
> >>
> >
> > Yes, you can specify facet.mincount=3 to return only those terms present in
> > at least 3 documents. On a related note, a tokenized field (such as the text
> > type in the example schema) will create a large number of unique terms.
> > Faceting on such a field may not be very useful and/or efficient. Typically
> > faceting is done on untokenized fields (such as the string type).
> 
> I think what was meant by > 3 was that faceting should return only terms of
> length greater than 3, not terms with a count greater than 3.
> 
> You could copyField your text field to another field, set the analyzer  
> to include a LengthFilterFactory with a minimum length specified, and  
> also have other analysis tweaks to have numbers and other stop words  
> removed.
> 
>       Erik
> 
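
A minimal sketch of the schema.xml changes Erik describes, for illustration only. The field and type name (text_facet), the length bounds, the choice of tokenizer, and the stopwords.txt file are assumptions and do not come from the thread; the source field is assumed to be the text field used in the query above.

```xml
<!-- Hypothetical field type: keeps only terms of 4 to 100 characters -->
<fieldType name="text_facet" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drops stop words listed in stopwords.txt (assumed to exist in conf/) -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- min="4" keeps terms of length greater than 3; max is also required -->
    <filter class="solr.LengthFilterFactory" min="4" max="100"/>
  </analyzer>
</fieldType>

<!-- Hypothetical facet-only copy of the text field -->
<field name="text_facet" type="text_facet" indexed="true" stored="false" multiValued="true"/>
<copyField source="text" dest="text_facet"/>
```

With something like this in place, the facet request would target the copy, for example facet.field=text_facet instead of facet.field=text, after the documents have been reindexed. Numbers of four or more digits would still pass the length filter; they could be added to the stop word list or removed with a further filter.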
