Faceting harvests the terms that are already indexed (so the field has
to be indexed; as far as I know it does not also have to be stored) and
uses Java object refs (pointers), without copying the facet values. You
know how log files have multi-line exception stacks and the like? The
multi-line exception stacks after the real log line tend to be the
same. I grabbed all of the lines after each log line and made facets
out of them. That worked quite well for counting "this exception stack
happened 42 times, this other one 250 times". So huge string fields
work as facets.
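
For illustration, here is a rough Python sketch of the grouping step that
happens before indexing. It is not the code I actually ran; the timestamp
pattern and the function name stack_facets are assumptions made up for
this example. Each returned string is what would go into the facet field,
and the Counter only mimics the counts that faceting would hand back.

```python
import re
from collections import Counter

# Assumption: a "real" log line starts with a date such as 2010-08-20.
LOG_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} ")

def stack_facets(lines):
    """Yield the lines that follow each log line, joined into one facet string."""
    stack = []
    for line in lines:
        if LOG_LINE.match(line):
            # A new log line closes the stack collected so far.
            if stack:
                yield "\n".join(stack)
            stack = []
        elif line.strip():
            # Continuation line (part of an exception stack); skip blanks.
            stack.append(line.rstrip())
    if stack:
        yield "\n".join(stack)

sample = [
    "2010-08-20 10:00:01 ERROR request failed",
    "java.lang.NullPointerException",
    "    at com.example.Foo.bar(Foo.java:12)",
    "2010-08-20 10:00:02 INFO ok",
    "2010-08-20 10:00:03 ERROR request failed",
    "java.lang.NullPointerException",
    "    at com.example.Foo.bar(Foo.java:12)",
    "2010-08-20 10:00:04 ERROR other",
    "java.io.IOException: boom",
]

# Local stand-in for facet counts: one entry per distinct stack.
counts = Counter(stack_facets(sample))
```

The whole multi-line stack is the facet value, so two stacks count as the
same facet only if they match character for character.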

I don't know if 'facet.prefix' on 50 characters is faster than 'q=' on
200 characters.

Sending a giant query is easy: use a POST instead of a GET.
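
A minimal sketch of that in Python, using only the standard library. The
URL, the field name and the helper build_select_post are assumptions for
the example; adjust them to your own setup. The parameters travel in the
form-encoded request body instead of the URL, so the URL length limit no
longer applies.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_select_post(base_url, params):
    """Build a POST request whose body carries the query parameters."""
    body = urlencode(params).encode("utf-8")
    return Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
        method="POST",
    )

# Hypothetical Solr URL and field name, for illustration only.
giant_value = "x" * 5000
req = build_select_post(
    "http://localhost:8983/solr/select",
    {"q": "*:*", "fq": 'category_facet:"%s"' % giant_value},
)
# To actually send it: urllib.request.urlopen(req)
```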

If searching on giant facet strings really is a problem, add a hash
code to each facet string. Then, add a separate matching field in each
document that only stores that hashcode. Now, instead of searching on
the giant facet, you pull the hashcode out of it and search the
separate field for that.
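
Roughly like this, as a Python sketch. The field names, the underscore
separator and the choice of SHA-1 are all assumptions for the example;
any stable hash would do. The hash goes at the end of the facet string so
that index-order sorting on the facet is not disturbed.

```python
import hashlib

SEP = "_"  # assumed separator; a hex digest never contains it

def facet_hash(value):
    """Stable hex hash of the facet text."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()

def index_fields(value, facet_field="category_facet", hash_field="category_hash"):
    """Fields to add to a document: the tagged facet string plus the hash alone."""
    h = facet_hash(value)
    return {facet_field: value + SEP + h, hash_field: h}

def filter_for(tagged_facet, hash_field="category_hash"):
    """Pull the hash back out of a facet string and build a short fq from it."""
    h = tagged_facet.rsplit(SEP, 1)[1]
    return "%s:%s" % (hash_field, h)

doc = index_fields("42_Some very long category name_http://example.com/img.png")
fq = filter_for(doc["category_facet"])
```

The fq is then a field name plus 40 hex characters, no matter how long the
facet string itself is.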


On Fri, Aug 20, 2010 at 9:56 PM, Jonathan Rochkind <rochk...@jhu.edu> wrote:
> "A common way is to make a facet string of categoryId-2_name_imageurl.
> Then in your UI display the categoryId part of the facet."
>
> I've been thinking about  doing something like this for the same purposes. 
> Will having an "extra long" facet string like that have any effect on 
> faceting performance?  How about facet sorting with facet.sort=index?  In my
> case, the first part of the facet string would be a 'sortable' value that 
> sorts how I want, not just an id.
>
> I use facet.sort=index, but my display labels don't actually sort the way I 
> want, so I'm thinking of making a sort key that does, and storing 
> "sortkey_label" in the actual facet value.  But I worry this may have an 
> effect on performance if the string gets really long. But I'm thinking/hoping 
> it won't -- at least for faceting the length of the string shouldn't matter, I
> think, but not sure about for sorting.  [Obviously you have to make sure to 
> not accidentally store the same 'id' with differently serialized 'metadata', 
> or you'd wind up with two facet values where you meant to have one].
>
> Is there any reason I couldn't use some non-printing control char as the
> separator, instead of the ASCII underscore used in that example?
>
> And then the other thing is, once I have these weird long facet strings with 
> embedded 'metadata', if I actually want to 'fq' on one, I need to pass that 
> whole weird string in the fq, clearly.  How do people generally deal with 
> this, using this technique? Just do it, pass the whole string?  Use some sort 
> of 'prefix' technique (I guess that would be the * wildcard in the fq)?  Use 
> two different solr fields, one for faceting with embedded metadata, and a 
> different one with the same values without embedded metadata for actual 'fq' 
> filtering?
>
> Thanks for any tips,
>
> Jonathan
>



-- 
Lance Norskog
goks...@gmail.com
