I don't know of any other way to organize the documents. We need the specific price that belongs to the user, so I don't think the facets would be the issue. The facet query would be changed to use the price list field for that user. Say the customer belongs to priceList1500: I would use the price from that column (priceList1500) instead of priceList1 or even the price column. Let me post the example data in another way.

INDEX FIELD  | INDEX DATA
------------------------
ID           | 1 (INDEXED | STORED)
NAME         | TEST (INDEXED | STORED | MULTIVALUED)
PRICE        | 1.00 (INDEXED)
PRICELIST1   | 0.99 (INDEXED)
PRICELIST2   | 0.89 (INDEXED)
PRICELIST500 | 0.85 (INDEXED)
------------------------
ID           | 2 (INDEXED | STORED)
NAME         | TEST2 (INDEXED | STORED | MULTIVALUED)
PRICE        | 1.10 (INDEXED)
PRICELIST1   | 1.09 (INDEXED)
PRICELIST250 | 1.05 (INDEXED)
PRICELIST600 | 1.03 (INDEXED)

The price lists correspond to customer contracts with the company for contracted item pricing. Is there a specific limit on the number of index columns Solr/Lucene can handle? Is there a better way of handling this? Do you see an issue with RAM from what I am describing here? Also, with the index so huge, say 5000 columns per data set, will that degrade search performance dramatically? (Note that the search fields would of course not be all of those columns.)

Example Query:
q=name&fl=NAME,ID&facet=true&facet.field=PRICELIST500

Thanks,
Josh B.

-----Original Message-----
From: kenf_nc [mailto:ken.fos...@realestate.com]
Sent: Wednesday, April 13, 2011 10:47 AM
To: solr-user@lucene.apache.org
Subject: Re: Indexing Question for large dataset

Indexing isn't a problem, it's just disk space and space is cheap. But, if you do facets on all those price columns, that gets put into RAM which isn't as cheap or plentiful. Your cache buffers may get overloaded a lot and performance will suffer.

2000 price columns seems like a lot, could the documents be organized differently? Hard to tell from your example.

--
View this message in context: http://lucene.472066.n3.nabble.com/Indexing-Question-for-large-dataset-tp2816344p2816377.html
Sent from the Solr - User mailing list archive at Nabble.com.
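
As a rough illustration of the per-customer facet field described above, here is a minimal Python sketch that builds the example query string. The function name build_facet_query, the rule that a customer's field is "PRICELIST" followed by the price list number, and the fallback to PRICE for a customer with no contract are all assumptions made for illustration, not anything confirmed about the actual application.

```python
from urllib.parse import urlencode


def build_facet_query(search_term, price_list_id=None):
    """Build a Solr query string that facets on one customer's price field.

    price_list_id is the number of the customer's contract price list
    (hypothetical naming rule: field name is PRICELIST<number>).
    When it is None, the sketch assumes the plain PRICE field applies.
    """
    if price_list_id is not None:
        facet_field = f"PRICELIST{price_list_id}"
    else:
        facet_field = "PRICE"

    params = [
        ("q", search_term),
        ("fl", "NAME,ID"),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    # safe="," keeps the comma in the fl list readable instead of %2C
    return urlencode(params, safe=",")


# Customer on price list 500
print(build_facet_query("name", 500))
```

Only one price field is named per request this way, so each query facets on a single column rather than on all of them.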