Is it a business requirement that this is fast? If so, you are going to spend a lot of money on hardware. You might want to ask the business people to think again about their requirements.
Here is one way to do this, using the simplest Solr/Lucene features. An implementation internal to Solr would probably look similar, but might not be much faster, especially if the processing is dominated by disk accesses.

1. Do the query, requesting 100 hits, using the default sort, and only return the ID field.
2. Randomly sample that set of 100, adding the chosen IDs to a set, then request the next 100.
3. Stop when you have a big enough sample (this might be before the end of the list).
4. Make another request with 100 of the IDs and the desired fields. The query will look like: "id:1 OR id:99 OR id:186 OR id:42".

wunder

On 7/7/08 10:15 AM, "Sean Laval" <[EMAIL PROTECTED]> wrote:

> Well it's simply a business requirement from my perspective. I am not sure I
> can say more than that. I could maybe implement a request handler that did
> an initial search to work out how many hits there are resulting from the
> query and then did as many more queries as were required fetching just 1
> document starting at a given random number .. would that work? Sounds a bit
> kludgy to me even as I say it.
>
> Sean
>
> --------------------------------------------------
> From: "Walter Underwood" <[EMAIL PROTECTED]>
> Sent: Monday, July 07, 2008 5:06 PM
> To: <solr-user@lucene.apache.org>
> Subject: Re: implementing a random result request handler - solr 1.2
>
>> Why do you want random hits? If we know more about the bigger
>> problem, we can probably make better suggestions.
>>
>> Fundamentally, Lucene is designed to quickly return the best
>> hits for a query. Returning random hits from the entire
>> matched set is likely to be very slow. It just isn't what
>> Lucene is designed to do.
>>
>> wunder
>>
>> On 7/7/08 8:58 AM, "Sean Laval" <[EMAIL PROTECTED]> wrote:
>>
>>> I have seen various posts about implementing random sorting relating to
>>> the 1.3 code base but I am trying to do this in 1.2. Does anyone have any
>>> suggestions? The approach I have considered is to implement my own
>>> request handler that picks random documents from a larger result list. I
>>> therefore need to be able to create a DocList and add documents to it but
>>> can't seem to do this. Does anyone have any advice they could offer
>>> please?
>>>
>>> Regards,
>>>
>>> Sean
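
P.S. For what it's worth, the four steps at the top of this message could be sketched roughly as follows in Python. This is a minimal sketch under stated assumptions, not a tested Solr client: `fetch_page` is a hypothetical callable that you would write yourself to run the query with Solr's `start` and `rows` parameters and `fl=id`, returning a list of IDs for that page. The names `sample_ids`, `batches`, `build_id_query`, and the `per_page` parameter are also made up for illustration.

```python
import random


def sample_ids(fetch_page, sample_size, page_size=100, per_page=10, rng=None):
    """Walk the result list page by page, sampling IDs from each page.

    fetch_page(start, rows) is a hypothetical callable returning a list of
    document IDs for that page (an empty list once the results run out).
    """
    rng = rng or random.Random()
    chosen = set()
    start = 0
    while len(chosen) < sample_size:
        page = fetch_page(start, page_size)
        if not page:
            break  # reached the end of the result list
        # Take at most per_page IDs from this page, and no more than we still need.
        k = min(per_page, len(page), sample_size - len(chosen))
        chosen.update(rng.sample(page, k))
        start += page_size
    return chosen


def batches(ids, size=100):
    """Split the sampled IDs into groups of at most `size` for the follow-up requests."""
    ids = list(ids)
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def build_id_query(ids, field="id"):
    """Build a query string such as 'id:1 OR id:99 OR id:186 OR id:42'."""
    return " OR ".join(f"{field}:{doc_id}" for doc_id in ids)
```

Each batch from `batches(...)` would be passed to `build_id_query(...)` and sent as its own request asking for the desired fields. Note that stopping early (step 3) means the sample is drawn only from the pages read so far, which under the default sort are the top-ranked ones.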