: Correct me if I am wrong: In a standard distributed search with
: QueryComponent, the first query sent to the shards asks for
: fl=myUniqueKey or fl=myUniqueKey,score. When the response is being
: generated to send back to the coordinator, SolrIndexSearcher.doc (int i,
: Set<String> fields) is called for each document. As I understand it,
: this will read each document from the index _on disk_ and retrieve the
: myUniqueKey field value for each document.
:
: My idea is to have a FieldCache for the myUniqueKey field in
: SolrIndexSearcher (or somewhere else?) that would be used in cases where
: the only field that needs to be retrieved is myUniqueKey. Is this
: something that would improve performance?

Quite probably ... you typically can't assume that a FieldCache can be
constructed for *any* field, but it should be a safe assumption for the
uniqueKey field, so for that initial request of the multiphase distributed
search it's quite possible it would speed things up.

If you want to try this and report back results, I'm sure a lot of people
would be interested in a patch ... I would guess the best place to make the
change would be in the QueryComponent, so that it used the FieldCache
(probably best to do it via getValueSource() on the uniqueKey's SchemaField)
to put the ids in the response instead of using a SolrDocList.

Hmm, actually... there's no reason why this kind of optimization would need
to be specific to distributed queries. It could be done by the
ResponseWriters directly -- if the field list they are being asked to return
only contains the uniqueKeyField and computed values (like score), then don't
bother calling SolrIndexSearcher.doc at all ... the only hitch is that with
distributed search, function values used as pseudo-fields, and whatnot,
there are more places calling SolrIndexSearcher.doc than there used to be ...
so maybe putting this change directly into SolrIndexSearcher.doc would make
the most sense?

-Hoss
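
For anyone who wants to experiment, here is a rough, self-contained sketch of the fast-path check described above. It does not use the real Solr or Lucene classes: UniqueKeyFastPath, StoredDocReader, onlyKeyAndScore, collectIds and the String[] keyCache are all hypothetical stand-ins (the array plays the role of a FieldCache-style lookup indexed by internal doc id, and the reader plays the role of SolrIndexSearcher.doc), so treat it as an illustration of the decision logic only, not as a patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class UniqueKeyFastPath {

    /** Hypothetical stand-in for the expensive stored-document read from disk. */
    interface StoredDocReader {
        String readUniqueKey(int docId);
    }

    /**
     * Returns true when the requested field list contains nothing but the
     * uniqueKey field and the computed "score" value.
     */
    static boolean onlyKeyAndScore(Set<String> fields, String uniqueKey) {
        // Assumption: a null or empty list means "all stored fields", so no fast path.
        if (fields == null || fields.isEmpty()) {
            return false;
        }
        for (String f : fields) {
            if (!f.equals(uniqueKey) && !f.equals("score")) {
                return false;
            }
        }
        return true;
    }

    /**
     * Collects the uniqueKey value for each matching doc id. Uses the in-memory
     * keyCache when only the key (and score) was requested; otherwise falls
     * back to the stored-document reader.
     */
    static List<String> collectIds(int[] docIds, Set<String> fields, String uniqueKey,
                                   String[] keyCache, StoredDocReader reader) {
        boolean fast = keyCache != null && onlyKeyAndScore(fields, uniqueKey);
        List<String> ids = new ArrayList<>(docIds.length);
        for (int docId : docIds) {
            ids.add(fast ? keyCache[docId] : reader.readUniqueKey(docId));
        }
        return ids;
    }
}
```

With fl=id,score the reader is never touched; adding any other stored field to the list sends every document through the slow path again, which is why the check has to live wherever the field list is known.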