Hi Hoss,

Thanks for this.

The Terms Component approach, if I understand it correctly, will be problematic. I need to present not only the next X call numbers in sequence, but also other fields in those documents (e.g. title, author). I assume the Terms Component approach will only give me the next X call number values, not the documents.

It sounds like Glen Newton's suggestion of mapping the call numbers to a float number is the most likely solution.

I know it sounds ridiculous to do all this for a "call number browse" but our faculty have explicitly asked for this. Humanities scholars especially know the call numbers that are of interest to them, and they browse the stacks that way (ML 1500s are opera, V35 is Verdi ...). They are using the research methods that have been successful for their entire careers. Plus, library materials are going to off-site, high-density storage, so the only way for them to browse all materials by call number, regardless of location, is online. I doubt they'll find this feature as useful as they expect, but it behooves us to give the users what they ask for.

So yeah, our user needs are perhaps a little outside of your expectations. :-)

- Naomi


On Nov 29, 2008, at 2:58 PM, Chris Hostetter wrote:


: The results are correct.  But the response time sucks.
:
: Reading the docs about caches, I thought I could populate the query result
: cache with an autowarming query and the response time would be okay. But that
: hasn't worked.  (See excerpts from my solrConfig file below.)
:
: A repeated query is very fast, implying caching happens for a particular
: starting point ("42" above).
:
: Is there a way to populate the cache with the ENTIRE sorted list of values for
: the field, so any arbitrary starting point will get results from the cache,
: rather than grabbing all results from (x) to the end, then sorting all these
: results, then returning the first 10?

there's two "caches" that come into play for something like this...

the first cache is a low level Lucene cache called the "FieldCache" that
is completely hidden from you (and for the most part: from Solr).
anytime you sort on a field, it gets built, and reused for all sorts on
that field.  My original concern was that it wasn't getting warmed on
"newSearcher" (because you have to be explicit about that).

the second cache is the queryResultCache which caches a "window" of an
ordered list of documents based on a query, and a sort. you can see this
cache in your Solr stats, and yes: these two requests result in different
cache keys for the queryResultCache...

       q=yourField:[42+TO+*]&sort=yourField+asc&rows=10
       q=yourField:[52+TO+*]&sort=yourField+asc&rows=10

...BUT! ... the two queries below will result in the same cache key, and
the second will be a cache hit, provided a sufficient value for
the "queryResultWindowSize" ...

       q=yourField:[42+TO+*]&sort=yourField+asc&rows=10
       q=yourField:[42+TO+*]&sort=yourField+asc&rows=10&start=10

so perhaps the key to your problem is to just make sure that once the user gives you an id to start with, you "scroll" by increasing the start param (not altering the id) ... the first query might be "slow" but every query after that should be a cache hit (depending on your page size, and how far
you expect people to scroll, you should consider increasing
queryResultWindowSize)
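
A minimal sketch of that scrolling pattern, in Python, purely for illustration. The helper names (page_params, page_url) and the base URL are hypothetical; the point is only that q and sort stay fixed for a given starting value while start advances, so later pages can be served from the cached window:

```python
from urllib.parse import urlencode


def page_params(field, start_value, page, rows=10):
    """Build Solr request parameters for one page of a sorted range browse.

    q and sort depend only on the user's starting value, so they are
    identical for every page; only the start offset changes.
    """
    return {
        "q": f"{field}:[{start_value} TO *]",
        "sort": f"{field} asc",
        "rows": rows,
        "start": page * rows,
    }


def page_url(base_url, field, start_value, page, rows=10):
    """Return the full request URL (base_url is an assumed select handler)."""
    return base_url + "?" + urlencode(page_params(field, start_value, page, rows))


if __name__ == "__main__":
    # Hypothetical Solr location; adjust to your own install.
    base = "http://localhost:8983/solr/select"
    for page in range(3):
        print(page_url(base, "yourField", "42", page))
```

With rows=10, pages 0 through 4 all fall inside the cached window if queryResultWindowSize is at least 50.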

But as Yonik said: the new TermsComponent may actually be a better option
for you -- doing two requests for every page (the first to get the N Terms
in your id field starting with your input, the second to do a query for
docs matching any of those N ids) might actually be faster even though
there won't likely even be any cache hits.
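
A rough sketch of the two-request idea, in Python. It only builds the parameters for each request. The function names are hypothetical, and the parameter names (terms.fl, terms.lower, terms.limit) are the TermsComponent parameters as I understand them, so check them against your Solr version. It also assumes each term matches a single document; if several documents can share a value, rows needs to be larger:

```python
def terms_params(field, lower, n=10):
    """Request 1: ask the TermsComponent for the next n terms in `field`,
    starting at `lower`. Parameter names should be verified for your version."""
    return {
        "terms": "true",
        "terms.fl": field,
        "terms.lower": lower,
        "terms.limit": n,
    }


def _quote(value):
    """Wrap a term in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def docs_params(field, ids):
    """Request 2: fetch the documents matching any of the terms from request 1."""
    clause = " OR ".join(_quote(v) for v in ids)
    return {
        "q": f"{field}:({clause})",
        "sort": f"{field} asc",
        "rows": len(ids),  # assumes one document per term
    }


if __name__ == "__main__":
    print(terms_params("yourField", "42"))
    # In practice the ids would be parsed from the first response.
    print(docs_params("yourField", ["42", "43", "44"]))
```

The second request returns full documents, so fields such as title and author are available for display.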


My opinion: Your use case sounds like a waste of effort. I can't imagine
anyone using a library catalog system ever wanting to look up a call number,
and then scroll through all possible books with similar call numbers -- it
seems much more likely that I'd want to look at other books with similar
authors, or keywords, or tags ... all things that are actually *easier* to
do with Solr. (but then again: i don't work in a library. i trust that
you know something i don't about what your users want.)


-Hoss


Naomi Dushay
[EMAIL PROTECTED]


