Let's say I have a data model that involves books and bookshelves. I have tens 
of thousands of books and thousands of bookshelves. There is a many-to-many 
relationship between books and bookshelves. All of the books are indexed by Solr.

I need to be able to query Solr and get all the books for a given bookshelf. I 
see two schema design options here:


1)      Each book has a multi-valued field that contains a list of all the 
bookshelf IDs it appears on. Many books will have thousands of bookshelf IDs. 
In this case the query is simple: I just send Solr the bookshelf ID.

2)      I send Solr a query listing each book on the bookshelf, e.g. 
q=book_id:(1+OR+2+OR+3 ....). Many bookshelves will have thousands of book IDs, 
so the query can get rather large.
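
To make the two options concrete, here is a rough Python sketch of the query 
each one would send. The field name bookshelf_ids and the helper functions are 
placeholders made up for illustration, not my real schema. It only builds the 
request parameters and does not talk to Solr.

```python
from urllib.parse import urlencode

# Hypothetical field names; the real schema may differ.
SHELF_FIELD = "bookshelf_ids"  # multi-valued field on each book (option 1)
BOOK_FIELD = "book_id"         # ID field of each book (option 2)


def option1_params(bookshelf_id):
    """Option 1: match every book whose multi-valued field holds this shelf ID."""
    return {"q": f"{SHELF_FIELD}:{bookshelf_id}"}


def option2_params(book_ids):
    """Option 2: OR together the ID of every book on the shelf."""
    clause = " OR ".join(str(book_id) for book_id in book_ids)
    return {"q": f"{BOOK_FIELD}:({clause})"}


if __name__ == "__main__":
    # Option 1 stays the same size no matter how many books are on the shelf.
    print(urlencode(option1_params(42)))
    # Option 2 grows by one OR term for every book on the shelf.
    print(urlencode(option2_params([1, 2, 3])))
```

With option 2 every book ID adds one boolean clause to the query, so the 
query size grows with the size of the bookshelf.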

Right now I am using option 2 and it seems to be working fine. I have had to 
crank 'maxBooleanClauses' right up, but it does seem to be pretty fast.

Anyone have an opinion?
