: What I'd really like to see is an XML query language so I can toss all
: the hackish URL query arguments and really move much of the query plugin
: logic out into the query itself instead of in the Java code.

: customers.   We'll introduce dynamic attribute bucketing.  Rather than
: produce a list of counts of all values for an attribute and have "at
: least" or "at most" options, users will be given ranged lists based on
: the actual distribution of the facets.    I haven't really worked out
: the details since I haven't actually begun the design but I'm probably
: going to see if I can't just look at it like it's on a bell curve and
: start picking evenly sized buckets.   Monitors <= 15" (10),  15 -> 17
: (10), 17 -> 21 (10), 21-> 25 (10), > 25 (10).   Now obviously I can't
: force it into a nice distribution like that but I'll figure out
: something.   In any case, the bucket ranges will need to be based on the
: actual distribution (easy to maintain, hard to implement) in the current
: result set and not some pre-manufactured bucket categories (easy to
: implement, hard to maintain) as those get obsoleted fairly quickly.

Those are really the $64,000 questions ... dynamic bucketing works great
in some cases -- but not all.  People like to see price ranges like $0-10,
$10-20, $20-30, $30-infinity ... if you try to make buckets based on
statistical distribution you get things like $0-11.75, $11.75-25.03,
$25.03-70.29, $70.29-infinity.
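
To make the tradeoff concrete, here is a rough sketch (not anything that exists in Solr) of equal-count bucketing on a made-up list of prices, followed by snapping the boundaries to multiples of 10.  The function names, the sample prices and the rounding step are all assumptions for illustration.

```python
def quantile_edges(values, n_buckets):
    """Interior boundaries that split the sorted values into roughly equal-count buckets."""
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // n_buckets] for i in range(1, n_buckets)]


def round_to_nice(edge, step=10):
    """Snap a boundary to the nearest multiple of step (hypothetical 'friendly' rounding)."""
    return step * round(edge / step)


def bucket_counts(values, edges):
    """Count values per bucket; bucket i holds edges[i-1] <= v < edges[i]."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[sum(1 for e in edges if v >= e)] += 1
    return counts


# Made-up prices from a hypothetical result set.
prices = [3.50, 7.25, 9.99, 11.75, 14.00, 19.95,
          25.03, 31.00, 48.50, 70.29, 120.00, 250.00]

raw = quantile_edges(prices, 4)                       # [11.75, 25.03, 70.29]
nice = sorted(set(round_to_nice(e) for e in raw))     # [10, 30, 70]

print(raw, bucket_counts(prices, raw))                # even counts, ugly boundaries
print(nice, bucket_counts(prices, nice))              # friendlier boundaries, uneven counts
```

The raw boundaries give perfectly even counts but read badly; the rounded ones read well but the counts drift.  Rounding can also collapse two boundaries into one (hence the set()), so a real implementation would need to handle that.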

As for where the logic should live -- having a really robust way to
specify the rules you want to be used for determining which fields to
facet on, and which type of faceting to do and what buckets to use, etc...
as query time params to the plugin works great when you've got one client
app that wants to drive the bus -- but when you've got lots of apps hitting
your Solr index, you want that data on the server -- either in "metadata
docs" that the plugin knows how to parse, or in the solrconfig.xml.
solrconfig.xml is easier to maintain, but harder to change on the fly --
and metadata docs have the advantage that there can be an arbitrary number
of them, each with a different identifier so that the client can say "use
rule set 'desktops'" and you search for the metadata doc with id
"desktops", which can tell you everything you want to know -- but for a
different query the client can say "use rule set 'cameras'" and get facets
that make more sense for those types of products.
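
A minimal sketch of the "rule set by id" idea, with a plain dict standing in for metadata docs stored in the index.  The FacetRule shape, the field names and the bucket values are invented for illustration; nothing here is an existing Solr structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FacetRule:
    field_name: str      # field to facet on
    kind: str            # "value" or "range" (hypothetical labels)
    buckets: tuple = ()  # range boundaries, only meaningful when kind == "range"


# Stand-in for metadata docs in the index, keyed by doc id (all values made up).
METADATA_DOCS = {
    "desktops": (
        FacetRule("manufacturer", "value"),
        FacetRule("price", "range", (500, 1000, 2000)),
    ),
    "cameras": (
        FacetRule("brand", "value"),
        FacetRule("megapixels", "range", (5, 8, 12)),
    ),
}


def rules_for(rule_set_id):
    """Return the facet rules for a rule set id, or an empty tuple if the id is unknown."""
    return METADATA_DOCS.get(rule_set_id, ())


for rule in rules_for("cameras"):
    print(rule.field_name, rule.kind, rule.buckets)
```

The client only sends the id; everything about which fields and buckets to use stays on the server.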


I think the ideal robust solution is to come up with a good object
representation for facet rules that can work in all of these cases, and
can be expressed in "solr xml format" and then write a plugin that can
read that info from its init params, or as a query param, or get a query
param that tells it how to search for a metadata doc which it can expect
to find in that format ... then apply those rules (with good caching of
course).
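
Roughly, the lookup order described above could be sketched like this.  The param names "facet.rules" and "facet.ruleset" are made up for the example (they are not real Solr params), the precedence order is an assumption, and the cache is a plain dict that never expires; a real plugin would have to invalidate it when the metadata docs change.

```python
class FacetRuleResolver:
    """Pick facet rules from a query param, a metadata doc, or init params."""

    def __init__(self, init_rules, fetch_metadata_doc):
        self.init_rules = init_rules                  # defaults from plugin config
        self.fetch_metadata_doc = fetch_metadata_doc  # callable: rule set id -> rules
        self._cache = {}                              # rule set id -> rules

    def resolve(self, params):
        # 1. Rules passed inline with the query win (assumed precedence).
        if "facet.rules" in params:
            return params["facet.rules"]
        # 2. Otherwise look up a named rule set, hitting the index only once per id.
        rule_set_id = params.get("facet.ruleset")
        if rule_set_id is not None:
            if rule_set_id not in self._cache:
                self._cache[rule_set_id] = self.fetch_metadata_doc(rule_set_id)
            return self._cache[rule_set_id]
        # 3. Fall back to whatever the plugin was initialized with.
        return self.init_rules


lookups = []  # records every simulated search for a metadata doc


def fake_fetch(rule_set_id):
    """Stand-in for searching the index for a metadata doc."""
    lookups.append(rule_set_id)
    return {"fields": ["brand", "megapixels"]} if rule_set_id == "cameras" else {}


resolver = FacetRuleResolver({"fields": ["category"]}, fake_fetch)
first = resolver.resolve({"facet.ruleset": "cameras"})
second = resolver.resolve({"facet.ruleset": "cameras"})  # served from the cache
print(first, len(lookups))
```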


A lot harder to implement, but it serves every use case I can think of.



-Hoss
