Hi Geert-Jan,

Have you considered storing this data in an external data store and not in the Lucene index? In other words, use the Lucene index only to index the content you need to search. Then, when you search this index, just pull out the single stored field, the unique ID, for each of the top N hits, and use those IDs to pull the actual content for display purposes from the external store. This external store could be an RDBMS, an ODBMS, a BDB, etc. I've worked with very large indices where we successfully used BDBs for this purpose.
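
A minimal sketch of the pattern described above, under stated assumptions: an in-memory SQLite table stands in for the external store, a plain dict stands in for the search index, and the names `search_ids`, `fetch_content`, and the `products` table are hypothetical and not part of Lucene or Solr.

```python
import sqlite3

# Hypothetical stand-in for the external store (an RDBMS here; a BDB would also do).
store = sqlite3.connect(":memory:")
store.execute("CREATE TABLE products (id TEXT PRIMARY KEY, content TEXT)")
store.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("p1", "red shirt, size M"), ("p2", "blue shirt, size L"), ("p3", "red hat")],
)

# Hypothetical stand-in for the search index: it maps terms to unique IDs only
# and holds no display content.
index = {"red": ["p1", "p3"], "blue": ["p2"], "shirt": ["p1", "p2"]}


def search_ids(term, top_n=10):
    """Return the unique IDs of the top N hits (ranking is omitted in this toy)."""
    return index.get(term, [])[:top_n]


def fetch_content(ids):
    """Pull display content for the given IDs from the external store, keeping hit order."""
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    rows = store.execute(
        f"SELECT id, content FROM products WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = dict(rows)
    return [(doc_id, by_id[doc_id]) for doc_id in ids if doc_id in by_id]


hits = fetch_content(search_ids("red"))
for doc_id, content in hits:
    print(doc_id, content)
```

The point of the split is that the index only ever returns one small stored field per hit, while the bulky display data is fetched by key from a store built for that kind of lookup.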
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Geert-Jan Brits <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Thursday, December 27, 2007 11:44:13 AM
Subject: Re: big perf-difference between solr-server vs. SOlrJ req.process(solrserver)

yeah, that makes sense.
so, all in all, could scanning all the fields and loading the 10 fields add up to cost about the same as, or even more than, performing the initial query? (Just making sure)

I am wondering if the following change to the schema would help in this case:

current setup:
It's possible to have up to 2000 product-variants.
each product-variant has:
- 1 price field (stored / indexed)
- 1 multivalued field which contains product-variant characteristics (stored / not indexed).

This adds up to the 4000 fields described. Moreover, there are some fields on the product level, but these would contribute just a tiny bit to the overall scanning / loading costs (about 50 stored and indexed fields in total).

possible new setup (only the changes):
- index but do not store the price field.
- store the price as just another one of the product-variant characteristics in the multivalued product-variant field.

As a result, this would bring the maximum number of stored fields down to about 2050 from 4050, thereby roughly halving scanning / loading costs while leaving the current querying costs intact.
Indexing costs would increase a bit.

Would you expect the same performance gain?

Thanks,
Geert-Jan

2007/12/27, Yonik Seeley <[EMAIL PROTECTED]>:
>
> On Dec 27, 2007 11:01 AM, Britske <[EMAIL PROTECTED]> wrote:
> > after inspecting solrconfig.xml I see that I already have enabled lazy
> > field loading by:
> > <enableLazyFieldLoading>true</enableLazyFieldLoading> (I guess it was
> > enabled by default)
> >
> > Since any query returns about 10 fields (which differ from query to
> > query), would this mean that only these 10 of about 2000-4000 fields are
> > retrieved / loaded?
>
> Yes, but that's not the whole story.
> Lucene stores all of the fields back-to-back with no index (there is
> no random access to particular stored fields)... so all of the fields
> must be at least scanned.
>
> -Yonik
>
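
To make the scanning cost discussed in this thread concrete, here is a toy model, not Lucene's actual stored-field file format. It assumes each document's fields are written back-to-back as length-prefixed name/value pairs with no per-field offset table, so loading 10 fields still means walking past every other field. The field names and the helpers `write_doc`, `load_fields`, and `make_doc` are hypothetical.

```python
import io
import struct


def write_doc(fields):
    """Serialize (name, value) pairs back-to-back, length-prefixed, with no offset table."""
    buf = io.BytesIO()
    for name, value in fields:
        for part in (name.encode(), value.encode()):
            buf.write(struct.pack(">I", len(part)))
            buf.write(part)
    return buf.getvalue()


def load_fields(data, wanted):
    """Load only the wanted fields; all other fields are still scanned, their values skipped."""
    stream = io.BytesIO(data)
    loaded, scanned = {}, 0
    while True:
        header = stream.read(4)
        if not header:
            break
        name = stream.read(struct.unpack(">I", header)[0]).decode()
        (value_len,) = struct.unpack(">I", stream.read(4))
        scanned += 1
        if name in wanted:
            loaded[name] = stream.read(value_len).decode()
        else:
            stream.seek(value_len, io.SEEK_CUR)  # lazy loading: skip the value bytes
    return loaded, scanned


def make_doc(variants, store_price):
    """Build a document with 50 product-level fields plus the per-variant fields."""
    fields = [(f"product_{i}", "x") for i in range(50)]
    for v in range(variants):
        if store_price:
            # Current setup: characteristics and price are separate stored fields.
            fields.append((f"variant_{v}_chars", "color=red"))
            fields.append((f"variant_{v}_price", "9.99"))
        else:
            # Proposed setup: price folded into the characteristics field.
            fields.append((f"variant_{v}_chars", "color=red|price=9.99"))
    return write_doc(fields)


wanted = {f"variant_{v}_chars" for v in range(10)}
loaded_current, scanned_current = load_fields(make_doc(2000, True), wanted)
loaded_new, scanned_new = load_fields(make_doc(2000, False), wanted)
print(len(loaded_current), scanned_current)  # 10 fields loaded, 4050 scanned
print(len(loaded_new), scanned_new)          # 10 fields loaded, 2050 scanned
```

In this model the number of fields scanned tracks the number of stored fields in the document, not the number requested, which is why dropping the stored price field roughly halves the count. Whether real-world latency halves as well depends on details this sketch does not capture.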