Re: Question about sampling

2012-05-22 Thread rita
Hi Lance, could you provide more details about implementing this using SignatureUpdateProcessor? An example would be helpful. - Rita

Re: Question about sampling

2012-05-22 Thread Lance Norskog
e. :)
> But I'd love to be wrong!
>
> Otis
>
> Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm
>
>> From: Yuval Dotan
>> To: solr-user
>> Sent: Wednesday, May 16

Re: Question about sampling

2012-05-20 Thread Otis Gospodnetic
: solr-user
>Sent: Wednesday, May 16, 2012 9:43 AM
>Subject: Question about sampling
>
>Hi Guys
>We have an environment containing billions of documents.
>Faceting over this large result set could take many seconds, and so we
>thought we might be able to use statistical sampling

Re: Question about sampling

2012-05-17 Thread Lance Norskog
Yes. The trick is to use a hash value on each document. The SignatureUpdateProcessor provides a tool for this. Store the hash value in a hex string field. Now, do wildcard queries on the hash string. Each hex digit takes one of 16 values, so hash:a* will pseudo-randomly choose about 1/16 of the documents, and hash:00* will pick about 1/256 of the documents. On
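
The prefix arithmetic above can be checked outside Solr with a rough sketch. Everything here is an assumption made for illustration: MD5 stands in for whatever signature the update processor is configured to compute, the document ids are synthetic, and the helper names (hash_field, matches_prefix, sample_fraction, estimate_total) are hypothetical. It only emulates what a wildcard query such as hash:a* would match on a stored hex string; it does not call Solr.

```python
import hashlib


def hash_field(doc_id: str) -> str:
    """Hex digest standing in for the stored hash field (MD5 is an assumption)."""
    return hashlib.md5(doc_id.encode("utf-8")).hexdigest()


def matches_prefix(doc_id: str, prefix: str) -> bool:
    """Emulate the wildcard query hash:<prefix>* against the hex string."""
    return hash_field(doc_id).startswith(prefix)


def sample_fraction(prefix: str) -> float:
    """Expected share of documents matched: one factor of 1/16 per hex digit."""
    return 1.0 / (16 ** len(prefix))


def estimate_total(sample_count: int, prefix: str) -> int:
    """Scale a count observed in the sample back up to the full collection."""
    return sample_count * (16 ** len(prefix))


# Synthetic ids, purely for illustration.
doc_ids = [f"doc-{i}" for i in range(100_000)]

sample_a = [d for d in doc_ids if matches_prefix(d, "a")]
sample_00 = [d for d in doc_ids if matches_prefix(d, "00")]

print("hash:a*  observed", len(sample_a) / len(doc_ids), "expected", sample_fraction("a"))
print("hash:00* observed", len(sample_00) / len(doc_ids), "expected", sample_fraction("00"))
print("estimated total from hash:a* sample:", estimate_total(len(sample_a), "a"))
```

A facet count taken over the sampled subset would be scaled the same way as estimate_total does here, and the estimate gets noisier as the prefix gets longer and the sample shrinks.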