Do you want to page through all items or through the result of a query (like all hits for "civil war" in call number order)?
If you want the former, then a text search engine is really the wrong tool. This problem only requires an indexed sequential file format (like a B-tree), something that worked quite well 30 or 40 years ago, even before relational databases were invented. Text search engines, like Lucene/Solr, have sorting and traversal as a secondary feature. Their primary feature is relevance ranking.

With only 8M items, I'd be inclined to put them in a big array sorted by call number, and use binary search. Sounds dumb, but it is really fast. The entries would be a simple pair, call number and key.

wunder

On 11/28/08 4:41 PM, "Naomi Dushay" <[EMAIL PROTECTED]> wrote:

> Gosh, I'm sorry to be so unclear. Hmm. Trying to clarify below:
>
> On Nov 28, 2008, at 3:52 PM, Chris Hostetter wrote:
>
>> Having read through this thread, i'm not sure i understand what exactly
>> the problem is. my naive understanding is...
>>
>> 1) you want to sort by a field
>> 2) you want to be able to "paginate" through all docs in order of this
>>    field.
>> 3) you want to be able to start your pagination at any arbitrary value for
>>    this field.
>>
>> so (assuming the field is a simple number for now) you could use something
>> like
>>
>>   q=yourField:[42 TO *]&sort=yourField+asc&rows=10&start=0
>>
>> where "42" is the arbitrary ID someone wants to start at.
>>
>
> perfect. This is the query I'm using.
>
> The results are correct. But the response time sucks.
>
> Reading the docs about caches, I thought I could populate the query
> result cache with an autowarming query and the response time would be
> okay. But that hasn't worked. (See excerpts from my solrConfig file
> below.)
>
> A repeated query is very fast, implying caching happens for a
> particular starting point ("42" above).
>
> Is there a way to populate the cache with the ENTIRE sorted list of
> values for the field, so any arbitrary starting point will get results
> from the cache, rather than grabbing all results from (x) to the end,
> then sorting all these results, then returning the first 10?
>
>
>> This sentence below seems to imply that you have a solution which produces
>> correct results, but doesn't produce results quickly...
>
> right.
>
>> : I have a performance problem and I haven't thought of a clever way
>> : around it.
>>
>> ...however this line seems to suggest that you're having trouble
>> getting at least 10 results from any query (?)
>>
>> : Call numbers are squirrelly, so we can't predict the string that will
>> : appropriately grab at least 10 subsequent documents. They are certainly not
>> : consecutive!
>> :
>> : so from
>> :   A123 B34 1970
>> :
>> : we're unable to predict if any of these will return at least 10 results:
>
> I was trying to express that I couldn't do this:
>
>   myfield:[X TO Y]
>
> because I can't algorithmically compute Y.
>
> Glen Newton suggested a workaround, whereby I represent my
> squirrelly, but sortable, field values as floating point numbers, and
> then I can compute Y.
>
>> ...but i'm not sure what exactly that means. for any given field, there
>> is always going to be some values X such that myField:[X TO *] won't
>> return at least 10 docs ... they are the last values in the index in order
>> -- surely it's okay for your app to have an "end" state when you run out
>> of data? :)
>
> yes. Understood. This is not an issue.
>
>> Oh, and BTW...
>>
>> : numbers in sort order".
>> : I have also mucked about with the cache
>> : initialization, but that's not working either:
>> :
>> :     <listener event="firstSearcher" class="solr.QuerySenderListener">
>>
>> ...make sure you also do a newSearcher listener that does the same thing,
>> otherwise your FieldCache (used for sorting) may not be warmed when
>> commits happen)
>
> Yup yup yup.
>
> from solrconfig:
>
>     <filterCache
>       class="solr.LRUCache"
>       size="20000000"
>       initialSize="10000000"
>       autowarmCount="500000"/>
>
>     <queryResultCache
>       class="solr.LRUCache"
>       size="10000000"
>       initialSize="5000000"
>       autowarmCount="5000000"/>
>
>
>     <listener event="newSearcher" class="solr.QuerySenderListener">
>       <arr name="queries">
>         <!-- populate query result cache for sorted queries -->
>         <lst>
>           <str name="q">shelfkey:[0 TO *]</str>
>           <str name="sort">shelfkey asc</str>
>         </lst>
>       </arr>
>     </listener>
>
>     <listener event="firstSearcher" class="solr.QuerySenderListener">
>       <arr name="queries">
>         <!-- populate query result cache for sorted queries -->
>         <lst>
>           <str name="q">shelfkey:[0 TO *]</str>
>           <str name="sort">shelfkey asc</str>
>         </lst>
>
>
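
For what it's worth, the sorted-array-plus-binary-search idea from the top of this message might look roughly like the Python sketch below. It is only an illustration under some assumptions: the sample entries and the `page_from` helper are made up, and it assumes each call number has already been normalized into a shelfkey string that sorts correctly as plain text.

```python
import bisect

# Hypothetical sample data: (shelfkey, doc_key) pairs. The real list would
# hold roughly 8M pairs, built once and kept sorted by shelfkey.
entries = sorted([
    ("A123 B34 1970", "doc1"),
    ("A123 B4 1965", "doc2"),
    ("B10 C2 2001", "doc3"),
    ("Z999 A1 1999", "doc4"),
])

# Parallel list of just the shelfkeys, so bisect can compare strings directly.
keys = [shelfkey for shelfkey, _ in entries]


def page_from(start_key, rows=10):
    """Return up to `rows` entries whose shelfkey is >= start_key."""
    # Binary search for the first position at or after start_key.
    i = bisect.bisect_left(keys, start_key)
    return entries[i:i + rows]


if __name__ == "__main__":
    # Start browsing at an arbitrary shelfkey, two items per page.
    for shelfkey, doc_key in page_from("A123 B4", rows=2):
        print(shelfkey, doc_key)
```

The lookup is O(log n) for any starting point, and there is no need to compute an upper bound Y: the page is just the next `rows` slots in the array. Running off the end simply returns a short or empty page.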