Hi all,

I would like to use Solr to replace our MySQL-based site search, but I am not sure how to map our entities into the search index. The model is described by the attached UML class diagram.

I have a Hotel that resides in some City in some Country. The hotel has various Rooms. For each Room in a Hotel there are some Packages that can be purchased by the client.

Searches will mainly return Hotels, e.g.:
- all hotels in USA
- all hotels in New York
- all hotels with name containing "Hilton"
- all hotels in Egypt with packages with all inclusive boarding
  and price lower than 400 and startDate between 2010-08-20
  and 2010-08-30

Our application also relies heavily on faceting, e.g.:
- # of hotels per country/city
- # of hotels based on room size
    (# of beds - 1 bed - 100 hotels, 2 beds - 200 hotels, ...)
- # of hotels based on all inclusive package prices
    (0-100 EUR, 100-200 EUR, ...)

But there are also use cases when a search should return a Room or Package directly.

I'd like to use the Data Import Handler (DIH) to index directly from our database. But which approach to mapping the entities into the search index should I use? It seems to me that there are at least two options.

1) One index based on Hotel with multivalued fields for Rooms and multivalued fields for Packages. In DIH:
<document>
<entity name="hotel" ...>
   <field name="id" .../>
   <entity name="room" ...>
      <field name="room_id" .../>
      <entity name="package"...>
         <field .../>
      </entity>
   </entity>
</entity>
</document>

But I am not sure whether this will work, because multivalued fields do not keep track of which values belong together. The queries may span across all the entities: I want only hotels that have a room with 2 beds, where that same room has a package with all inclusive boarding and a price lower than 400.
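
To illustrate the concern, here is a rough sketch of how one hotel might end up in the index under this option. The field names room_beds, package_boarding and package_price and all the values are made up for the example; they are not taken from our real schema:

<add>
  <doc>
    <field name="id">hotel-1</field>
    <!-- room A: 2 beds, all inclusive package for 600 -->
    <!-- room B: 1 bed,  all inclusive package for 300 -->
    <field name="room_beds">2</field>
    <field name="room_beds">1</field>
    <field name="package_boarding">all inclusive</field>
    <field name="package_boarding">all inclusive</field>
    <field name="package_price">600</field>
    <field name="package_price">300</field>
  </doc>
</add>

As far as I understand it, a query such as room_beds:2 AND package_boarding:"all inclusive" AND package_price:[* TO 400] would match this hotel, because each clause is satisfied by some value in the document, even though no single room has both 2 beds and a package priced at 400 or less.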

2) Denormalize the data, so that there is only one index based on Packages, with each Package document containing a (duplicated) copy of the data from its Room and Hotel. Then use Field Collapsing on the Hotel ID for both search results and faceting. This would also allow searching directly for Packages or Rooms. But I am not sure about Field Collapsing, which is still a kind of beta functionality, or about the potential performance costs.
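
For this option I imagine a DIH configuration roughly like the following. It is only a sketch: the JDBC URL, the credentials and every table and column name (package, room, hotel, city, country, beds, boarding, ...) are assumptions made for illustration and would have to be adapted to the real database.

<dataConfig>
  <!-- connection details are placeholders -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/hoteldb"
              user="solr" password="secret"/>
  <document>
    <!-- one Solr document per package; room and hotel data is duplicated -->
    <entity name="package"
            query="SELECT p.id AS package_id, p.price, p.boarding, p.start_date,
                          r.id AS room_id, r.beds,
                          h.id AS hotel_id, h.name AS hotel_name,
                          c.name AS city, co.name AS country
                   FROM package p
                   JOIN room r ON p.room_id = r.id
                   JOIN hotel h ON r.hotel_id = h.id
                   JOIN city c ON h.city_id = c.id
                   JOIN country co ON c.country_id = co.id">
      <field column="package_id" name="id"/>
      <field column="price"      name="package_price"/>
      <field column="boarding"   name="package_boarding"/>
      <field column="start_date" name="package_start_date"/>
      <field column="room_id"    name="room_id"/>
      <field column="beds"       name="room_beds"/>
      <!-- the field that results would be collapsed on -->
      <field column="hotel_id"   name="hotel_id"/>
      <field column="hotel_name" name="hotel_name"/>
      <field column="city"       name="city"/>
      <field column="country"    name="country"/>
    </entity>
  </document>
</dataConfig>

Because every document then describes exactly one package in one room, all conditions of a query are checked against the same room and package. The price is that hotel and room data is repeated for every package, and that hotel counts in facets would have to be computed per collapsed hotel rather than per document.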

Can anybody give me some advice or share their experiences?

Thanks a lot
Wenca
