Hi, I think I follow what you said here. Let me check:
It sounds like you are saying that pretty much all getDoc(List|Set)* methods would need to be modified to take an additional CompositeHitCollector (CHC) parameter, correct?

Then I'd modify the following methods (these are the methods that use anonymous HitCollectors and stick docs in some sort of priority queue):

  protected DocSet getDocSetNC(Query query, DocSet filter)
  private DocList getDocListNC(Query query, DocSet filter, …)
  private DocSet getDocListAndSetNC(DocListAndSet out, Query query, DocSet filter, ...)

I'd have to:
- add a new CompositeHitCollector parameter
- if CHC != null:
    hc = new HitCollector { ... the same anonymous HCs that are there now ... }
    CHC.setComposed(hc);

And when you said "...then the meat and potatoes methods of SolrIndexSearcher could take in your custom written CompositeHitCollector, specify the anonymous inner HitCollector it needs to use for the case it finds itself in..." - does "use for the case" refer to the if/else if/else cases in the above methods, such as: if sorting is needed, use FieldSortedHitQueue; if not, use ScorePriorityQueue; and so on?

If I understood that correctly, I'll get to work, though I'm still not sure how DocSetHitCollector will fit into all of this.

But somehow this "add an additional parameter everywhere" approach doesn't sound right. I wish I could write my own WeightedSolrIndexSearcher that extends SolrIndexSearcher and calls some hook methods in SolrIndexSearcher to hook into caching (both get and set).

  public class WeightedHitCollector extends TopDocHitCollector {  // TDHC from Lucene
    public void collect(int docId, float score) {
      // score * weightFromSomewhere
      // stick in PriorityQueue (from super - TDHC)
    }

    public int[] getDocIds() {
      // get them from super.topDocs(), which returns a TopDocs, from which
      // we can get ScoreDoc[] and then docIds
    }
  }

  public class WeightedSolrIndexSearcher extends SolrIndexSearcher {
    public DocList getDocList(Query q, ....) {
      // check the cache
      DocList docList = super.getDocListFromCache(q, ...);
      if (docList == null) {
        // not cached, got to search
        WeightedHitCollector whc = new WeightedHitCollector();
        searcher.search(q, null, whc);
        int[] docIds = whc.getDocIds();
        // cache
        super.cacheDocList(docIds);
      } else {
        return docList;
      }
    }
  }

Super-simplified, but I'm wondering if this is realistic and/or better than adding the additional CompositeHitCollector param.

Thanks,
Otis

----- Original Message ----
From: Chris Hostetter <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wednesday, May 2, 2007 3:14:23 PM
Subject: Re: Custom HitCollector with SolrIndexSearcher and caching

: I feel like I might be missing something, and there is in fact a way to
: use a custom HitCollector and benefit from caching, but I just don't see
: it now.

I can't think of any easy way to do what you describe ... you can always use the low level IndexSearcher methods with a custom HitCollector that wraps a DocSetHitCollector and then explicitly cache the DocSet yourself, but that doesn't really help you with the DocList ...

there definitely doesn't seem to be an *easy* way to do what you're describing at the moment, but with a little refactoring methods like getDocListAndSet *could* take in some sort of CompositeHitCollector class with an API like...

  /**
   * a HitCollector whose collect method will delegate to a specified
   * HitCollector for each match it wants collected
   */
  public abstract class CompositeHitCollector extends HitCollector {
    public abstract void setComposed(HitCollector inner);
  }

...then the meat and potatoes methods of SolrIndexSearcher could take in your custom written CompositeHitCollector, specify the anonymous inner HitCollector it needs to use for the case it finds itself in, and now you've got a window into the collection process where you can muck with scores or ignore certain matches.

It would be a non trivial change, but it would be possible.

-Hoss
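
For reference, the delegation idea in the quoted message might look roughly like the sketch below. It is only an illustration under stated assumptions: HitCollector here is a local stand-in that mirrors the shape of Lucene's collect(int doc, float score) callback so the example compiles without Lucene, setComposed is given a concrete body for brevity, and WeightedCompositeHitCollector, RecordingCollector, weight and minScore are hypothetical names that do not exist in Solr or Lucene.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Lucene's HitCollector, so the sketch compiles on its own.
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// Shape of the API proposed in the thread: the searcher supplies the
// inner collector it would otherwise have used directly.
abstract class CompositeHitCollector extends HitCollector {
    protected HitCollector inner;

    public void setComposed(HitCollector inner) {
        this.inner = inner;
    }
}

// Hypothetical custom collector: rescales each score and ignores
// matches whose rescaled score falls below a threshold.
class WeightedCompositeHitCollector extends CompositeHitCollector {
    private final float weight;
    private final float minScore;

    WeightedCompositeHitCollector(float weight, float minScore) {
        this.weight = weight;
        this.minScore = minScore;
    }

    @Override
    public void collect(int doc, float score) {
        float weighted = score * weight;
        if (weighted < minScore) {
            return; // ignore this match entirely
        }
        inner.collect(doc, weighted); // delegate to the searcher's collector
    }
}

// Hypothetical stand-in for one of the anonymous collectors inside
// SolrIndexSearcher; it just records whatever it is handed.
class RecordingCollector extends HitCollector {
    final List<Integer> docs = new ArrayList<>();
    final List<Float> scores = new ArrayList<>();

    @Override
    public void collect(int doc, float score) {
        docs.add(doc);
        scores.add(score);
    }
}

class CompositeDemo {
    public static void main(String[] args) {
        RecordingCollector searcherCollector = new RecordingCollector();
        WeightedCompositeHitCollector chc = new WeightedCompositeHitCollector(2.0f, 1.0f);

        // What a refactored getDocListNC-style method would do before searching.
        chc.setComposed(searcherCollector);

        // Simulate the search loop feeding raw hits to the composite.
        chc.collect(1, 0.25f);
        chc.collect(2, 0.75f);

        System.out.println("collected docs: " + searcherCollector.docs);
        System.out.println("collected scores: " + searcherCollector.scores);
    }
}
```

The point of the design is that the searcher keeps ownership of its sorting and queueing logic (the inner collector), while the custom collector only decides what gets passed through and with which score.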