Before Solr had facets, I built my own faceting into Collex as custom
request handlers, in a much cruder and less performant way.
Now the performance issue of warming up the cache needs to be
addressed. I'm going to upgrade Solr and adjust the application to
work with the built-in faceting and see how far I get with that. The
dilemma is that I've got a couple of custom things that don't map to
the built-in faceting and I'm looking for advice on how to proceed.

The index has a "type" field: "A" for archived objects and "C" for
collectibles. All the original objects are indexed in batch fashion
as type "A". Users collect objects and tag/annotate them. When a
user collects an object, a document of type "C" is indexed with the
original object's unique identifier (a URI), the username, tags, and
annotation. My custom facet cache differs from the built-in facets
in that it builds a cross-reference cache from the "C" types to the
"A" types (a JOIN, heh).
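As a rough illustration (not the actual Collex code), the cross-reference could be sketched like this in Python. The field names other than "type" (`uri`, `username`, `tags`, `annotation`, `genre`) and the sample values are made up for the example, and the in-memory list stands in for the index.

```python
from collections import defaultdict

# Hypothetical sample documents; only the "type" values "A" and "C"
# come from the description above, everything else is illustrative.
docs = [
    {"type": "A", "uri": "http://example.org/obj/1", "genre": "poetry"},
    {"type": "A", "uri": "http://example.org/obj/2", "genre": "prose"},
    {"type": "C", "uri": "http://example.org/obj/1",
     "username": "erikhatcher", "tags": ["foo"], "annotation": "a note"},
    {"type": "C", "uri": "http://example.org/obj/1",
     "username": "alice", "tags": [], "annotation": ""},
]

def build_cross_reference(docs):
    """Map each archived object's URI to the "C" documents that point at it."""
    archived = {d["uri"] for d in docs if d["type"] == "A"}
    xref = defaultdict(list)
    for d in docs:
        # Keep only collectibles whose URI matches a known "A" document.
        if d["type"] == "C" and d["uri"] in archived:
            xref[d["uri"]].append(d)
    return dict(xref)

xref = build_cross_reference(docs)
```

Here `xref` has an entry only for objects that at least one user has collected, which is the "JOIN" from the "C" side to the "A" side.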
We can do queries that return facet counts such as:
- all collected objects
- all objects collected by erikhatcher
- all collected objects with tag "foo"
One of the facet counts returned is user, so you can easily see how
many objects each user has collected.
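To make the three example queries concrete, here is a small Python sketch of a per-user facet count over "C" documents, with optional username and tag constraints. It is an assumption about the shape of the computation, not the custom request handler itself; the field names and sample data are hypothetical.

```python
from collections import Counter

# Hypothetical "C" documents (field names are illustrative).
collectibles = [
    {"uri": "http://example.org/obj/1", "username": "erikhatcher", "tags": ["foo"]},
    {"uri": "http://example.org/obj/2", "username": "erikhatcher", "tags": ["bar"]},
    {"uri": "http://example.org/obj/1", "username": "alice", "tags": ["foo"]},
]

def user_facet(collectibles, username=None, tag=None):
    """Count distinct collected objects per user, optionally constrained."""
    seen = set()
    counts = Counter()
    for c in collectibles:
        if username is not None and c["username"] != username:
            continue
        if tag is not None and tag not in c["tags"]:
            continue
        key = (c["username"], c["uri"])
        if key not in seen:  # count each (user, object) pair once
            seen.add(key)
            counts[c["username"]] += 1
    return dict(counts)

all_collected = user_facet(collectibles)                   # all collected objects
by_erik = user_facet(collectibles, username="erikhatcher")  # collected by one user
tagged_foo = user_facet(collectibles, tag="foo")            # collected with tag "foo"
```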
The basic faceting we do on object metadata will fit well
with what Solr has built in, but I'm not quite sure how to build in
the cross-reference and leverage faster warming, so I'm asking here
to see what thoughts folks have on how to proceed.
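For the metadata side, a built-in facet request might look roughly like the one built below. `facet`, `facet.field`, `fq`, and `rows` are standard Solr request parameters; the host, the `/solr/select` path, and the `genre` field are assumptions for the example, not details of the Collex setup.

```python
from urllib.parse import urlencode

# Hypothetical host and field name; adjust to the real deployment and schema.
params = [
    ("q", "*:*"),
    ("fq", "type:A"),         # restrict to archived objects
    ("rows", "0"),            # only the counts are wanted, no documents
    ("facet", "true"),
    ("facet.field", "genre"),
]
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

This covers only plain field facets on "A" documents; it does not address the cross-reference from "C" to "A" documents, which is the open question.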
Thanks,
Erik