Hello community, Let me explain my case. I need to implement a feature that combines text search and data aggregation. That is, the app will let the users search for products and set a date range. As result, I need to show them products that matched the search + some data aggregated for that period of time (like number of visits, number of sells, etc). For instance, if a user searches for "ipod" and sets dates from "01/01/11" to "01/10/11", I have to show all the products that match with "ipod" along with the visits, sells, etc for those products in those 10 days.
For that purpose, I have 2 tables in a DB: one with the all the products and the other with the visits and sells per day and product. What do you think is the best approach?. These are my options so far: Option 1: index both products and visits per day in the same index as result of doing a join between 2 tables. That way I can resolve product search + stats (for aggregated data) in 1 query and with 1 index. The drawback is having a huge index that grows in documents on daily basis where just a few of them (one per product) are searchable. Also I can see other problems if, for instance, I want to implement facet navigation over some attributes of the product, like product category, since the count for each element won't reflect the number of product for that category but rather the number of products x days those products have been visited. Option 2: have 2 cores, one with products, other with visits per day. Here I can use the product core/index for product searching, pick the product ids and request stats in the visits core for those ids, then join both and return everything. I was checking the "join" feature in Solr 4.0 to implement this ( http://wiki.apache.org/solr/Join ) , but as far as I know this feature doesn't return fields from both indexes (I would need product attributes from product index + stats from visits index), so I would have to perform both queries manually. The drawback here is code complexity (I need to maintain a code that queries and joins to both indexes) and eventually performance (2 queries per request and eventually with a long list of ids depending on product match result) Option 3: 1 core with products and keep visits in the DB. This option would work exactly as option 2 with the alternative of calculating the stats in the DB. Same drawbacks as option 2. Do you guys see any other option or have faced this problem before? If so what is your advise?. I think the best approach is to separate concerns and have Solr for product matching and something else (high-scalable db?) for stats calculations, but on the other hand it would be nice to have Solr performs both operations and joins results behind the scenes. To make things even more challenging, users could eventually sort by calculated fields. Thanks in advance Esteban