Ankush,

Unless the reviews are changing constantly, why not do what Erick suggested and flatten your data: store reviews in the hotel index, but re-index your hotels with only the top two reviews each? In other words, I am suggesting that you compute the top two reviews for each hotel offline and store them somewhere.
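A minimal sketch of that offline step, assuming each review is available as a dict with `hotel_id` and `score` keys. Both field names and the function name are hypothetical; the actual review schema is not shown in this thread.

```python
import heapq
from collections import defaultdict


def top_reviews_per_hotel(reviews, n=2):
    """Group reviews by hotel and keep the n highest-scoring ones for each.

    `reviews` is any iterable of dicts with (assumed) keys
    "hotel_id" and "score".
    """
    by_hotel = defaultdict(list)
    for review in reviews:
        by_hotel[review["hotel_id"]].append(review)

    # heapq.nlargest returns the n best items in descending order
    # without fully sorting every hotel's review list.
    return {
        hotel_id: heapq.nlargest(n, hotel_reviews, key=lambda r: r["score"])
        for hotel_id, hotel_reviews in by_hotel.items()
    }
```

The resulting map could then be written into the hotel documents at re-index time, or into a separate store keyed by hotel ID.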
You could store the top two reviews in an RDBMS and let whatever front end you have retrieve the top two from the RDBMS after receiving results from Solr based on your unique ID.

HTH
Amit

On Tue, Apr 28, 2009 at 3:14 PM, Ankush Goyal <ankush.go...@orbitz.com> wrote:

> Hi Erick,
>
> Thanks for response!...the solution I was talking about was same as your
> last solution to get reviews for only required hotel-ids and then parsing
> them in one go to make a hash-map, I guess I didn't explain correctly :)
>
> As far as putting reviews inside the hotel index is concerned, we thought
> about that solution, but we also need to sort the reviews and (let's say)
> show top 2 of maybe 50 reviews for a hotel, so we couldn't put reviews
> inside hotel doc itself.
>
> Now, this again poses another question for the solution we talked about-,
> as it seems like getting reviews for required hotel-ids and then making a
> hash-map corresponding to hotel-ids can improve the performance, but then we
> also need to sort all the reviews for each hotel using a field/ score in the
> review-doc itself, which seems like would lower down the performance
> drastically.
>
> Any ideas on a better solution?
>
> Thanks!
> -Ankush
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Tuesday, April 28, 2009 4:05 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Multiple Queries
>
> Have you considered indexing the reviews along with the hotels right
> in the hotel index? That way you would fetch the reviews right along with
> the hotels...
>
> Really, this is another way of saying "flatten your data" <G>...
>
> Your idea of holding all the hotel reviews in memory is also viable,
> depending upon how many there are. you'd pay some startup costs, but
> that's what caching is all about.
>
> Given your current index structure, have you tried collecting the hotel
> IDs, and submitting a query to your review index that just ORs together
> all the IDs and then parsing that rather than calling your review index
> for one hotel ID at a time?
>
> Best
> Erick
>
> On Tue, Apr 28, 2009 at 4:32 PM, Ankush Goyal <ankush.go...@orbitz.com> wrote:
>
> > Hi,
> >
> > I have been trying to solve a performance issue: I have an index of hotels
> > with their ids and another index of reviews. Now, when someone queries for a
> > location, the current process gets all the hotels for that location.
> > And, then corresponding to each hotel-id from all the hotel documents, it
> > calls the review index to fetch reviews associated with that particular
> > hotel and so on it repeats for all the hotels. This process slows down the
> > request significantly.
> > I need to accumulate reviews according to corresponding hotel-ids, so I
> > can't just fetch all the reviews for all the hotel ids and show them. Now, I
> > was thinking about fetching all the reviews for all the hotel-ids and then
> > parse all those reviews in one go and create a map with hotel-id as key and
> > list of reviews as values.
> >
> > Can anyone comment on whether this procedure would be better or worse, or
> > if there's better way of doing this?
> >
> > --Ankush Goyal
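
The batched lookup discussed in this thread (OR together all the hotel IDs in one query against the review index, then group the returned documents into a map keyed by hotel ID) might look roughly like the sketch below. The `hotel_id` field name and both function names are assumptions for illustration, and the step that actually sends the query to Solr is deliberately left out.

```python
from collections import defaultdict


def build_or_query(hotel_ids, field="hotel_id"):
    """Build a single Lucene-style query matching any of the given IDs.

    Assumes the IDs are plain alphanumeric tokens that need no escaping,
    and that `field` (hypothetical name) holds the hotel ID in each
    review document.
    """
    ids = [str(h) for h in hotel_ids]
    if not ids:
        raise ValueError("need at least one hotel ID")
    return f"{field}:(" + " OR ".join(ids) + ")"


def group_reviews_by_hotel(review_docs, field="hotel_id"):
    """Turn a flat list of review documents into {hotel_id: [reviews]}."""
    grouped = defaultdict(list)
    for doc in review_docs:
        grouped[doc[field]].append(doc)
    return dict(grouped)
```

For example, `build_or_query([101, 102, 103])` produces `hotel_id:(101 OR 102 OR 103)`. With a large number of hotels, the query may need to be split into batches to stay under the index's configured limit on boolean clauses.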