Re: Question about sampling

2012-05-22 Thread rita
Hi Lance, Could you provide more details about implementing this using SignatureUpdateProcessor? An example would be helpful. - Rita -- View this message in context: http://lucene.472066.n3.nabble.com/Question-about-sampling-tp3984103p3985379.html Sent from the Solr - User mailing list archive

Re: Question about sampling

2012-05-22 Thread Lance Norskog
My mistake: I did not check whether the data above is stored as strings. The hash code has to be stored as a string for this trick to work. On Sun, May 20, 2012 at 8:25 PM, Otis Gospodnetic wrote: > I'd be curious about this, too! > I suspect the answer is: not doable, patches welcome. :) > But I

Re: Question about sampling

2012-05-20 Thread Otis Gospodnetic
I'd be curious about this, too! I suspect the answer is: not doable, patches welcome. :) But I'd love to be wrong! Otis Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm > > From: Yuval Dotan >To: solr-user >Sent: Wednes

Re: Question about sampling

2012-05-17 Thread Lance Norskog
Yes. The trick is to use a hash value on each document. The SignatureUpdateProcessor provides a tool for this. Store the hash value in a hex string field. Now, do wildcard queries on the hash string: hash:a* will match roughly 1/16 of the documents, because the first hex digit takes one of 16 values. hash:00* will pick roughly 1/256 of the documents. On
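
To illustrate why a hex prefix acts as a sampling rate, here is a minimal Python sketch that emulates the wildcard query outside Solr. It is not the SignatureUpdateProcessor itself: SHA-256 is only a stand-in for whatever signature Solr would store, and the names hex_signature, sample_by_prefix, expected_fraction and the "doc-N" ids are hypothetical, chosen for the example.

```python
import hashlib


def hex_signature(doc_id: str) -> str:
    """Stand-in for the stored hash field (assumption: SHA-256 of the id, as hex)."""
    return hashlib.sha256(doc_id.encode("utf-8")).hexdigest()


def sample_by_prefix(doc_ids, prefix):
    """Emulate the wildcard query hash:<prefix>* by keeping matching documents."""
    return [d for d in doc_ids if hex_signature(d).startswith(prefix)]


def expected_fraction(prefix: str) -> float:
    """Each hex digit in the prefix divides the expected sample size by 16."""
    return 16.0 ** -len(prefix)


if __name__ == "__main__":
    # Hypothetical corpus of 100,000 document ids.
    doc_ids = [f"doc-{i}" for i in range(100_000)]
    for prefix in ("a", "00"):
        sample = sample_by_prefix(doc_ids, prefix)
        observed = len(sample) / len(doc_ids)
        print(f"hash:{prefix}*  observed={observed:.5f}  "
              f"expected={expected_fraction(prefix):.5f}")
```

The sample is only as random as the hash is uniform, and the same prefix always returns the same documents, so a different sample means picking a different prefix of the same length.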