Search document design problem

Wenca Tue, 17 Aug 2010 02:31:27 -0700

Hi all,

I would like to use Solr to replace our MySQL-based site search, but I am not sure how to map our entities into the search index. The model is described by the attached UML class diagram.

I have a Hotel that resides in some City in some Country. The hotel has various Rooms. For each Room in a Hotel there are some Packages that can be purchased by the client.


The entity returned from a search will mainly be the Hotel, e.g.:
- all hotels in USA
- all hotels in New York
- all hotels with name containing "Hilton"
- all hotels in Egypt with packages with all inclusive boarding
  and price lower than 400 and startDate between 2010-08-20
  and 2010-08-30

Our application also uses faceting a lot, e.g.:
- # of hotels per country/city
- # of hotels based on room size
    (# of beds - 1 bed - 100 hotels, 2 beds - 200 hotels, ...)
- # of hotels based on all inclusive package prices
    (0-100 EUR, 100-200 EUR, ...)

But there are also use cases where a search should return a Room or Package directly.

I'd like to use the Data Import Handler to index directly from our database. But which approach to mapping entities into the search index should I use? It seems to me that there are at least 2 ways.

1) One index based on Hotel, with multivalued fields for Rooms and multivalued fields for Packages. In DIH:

<document>
<entity name="hotel" ...>
   <field name="id" .../>
   <entity name="room" ...>
      <field name="room_id" .../>
      <entity name="package"...>
         <field .../>
      </entity>
   </entity>
</entity>
</document>

But I am not sure whether this will work because of the multivalued fields. The queries may span across all the entities: I want only hotels that have a room with 2 beds where that same room has a package with all inclusive boarding and a price lower than 400.
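
To make the concern concrete, here is a small Python sketch (plain Python, not Solr code). The hotel data and the field names room_beds, package_board and package_price are made up for illustration. It assumes that each query clause is matched against its own multivalued field independently, so the link between a room and its packages is lost once the values are flattened.

```python
# Hypothetical data: the 2-bed room only has an expensive package,
# while the cheap all-inclusive package belongs to a 1-bed room.
hotel = {
    "id": "h1",
    "rooms": [
        {"beds": 2, "packages": [{"board": "all_inclusive", "price": 900}]},
        {"beds": 1, "packages": [{"board": "all_inclusive", "price": 350}]},
    ],
}


def flatten(hotel):
    """Mimic one hotel document with independent multivalued fields."""
    rooms = hotel["rooms"]
    packages = [p for r in rooms for p in r["packages"]]
    return {
        "id": hotel["id"],
        "room_beds": [r["beds"] for r in rooms],
        "package_board": [p["board"] for p in packages],
        "package_price": [p["price"] for p in packages],
    }


def matches_flat(doc):
    """Each clause is checked against its own field, independently."""
    return (
        2 in doc["room_beds"]
        and "all_inclusive" in doc["package_board"]
        and any(price < 400 for price in doc["package_price"])
    )


def matches_nested(hotel):
    """Intended meaning: one room/package pair satisfies every clause."""
    return any(
        r["beds"] == 2 and p["board"] == "all_inclusive" and p["price"] < 400
        for r in hotel["rooms"]
        for p in r["packages"]
    )


flat_hit = matches_flat(flatten(hotel))
nested_hit = matches_nested(hotel)
print(flat_hit, nested_hit)
```

In this sketch the flattened document matches even though no single room satisfies all the conditions, which is exactly the kind of false positive I am worried about.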

2) Denormalize the data, so that there is only one index based on Packages, containing all the (duplicated) data from Room and Hotel, and then use Field Collapsing on the Hotel ID for search results and for faceting too. This would also enable direct search for Packages or Rooms, but I am not sure about Field Collapsing, which is still a kind of beta functionality, or about the potential performance costs.
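
As a rough illustration of what the denormalized approach would have to do, here is a Python sketch (again plain Python, not Solr code). The rows, the field names and the helper functions search, collapse and facet_distinct are all hypothetical. Each row stands for one Package document that carries its Room and Hotel data with it.

```python
# Hypothetical denormalized index: one row per Package, with the
# Room and Hotel data duplicated onto it.
rows = [
    {"package_id": "p1", "hotel_id": "h1", "country": "Egypt",
     "beds": 2, "board": "all_inclusive", "price": 350},
    {"package_id": "p2", "hotel_id": "h1", "country": "Egypt",
     "beds": 2, "board": "all_inclusive", "price": 380},
    {"package_id": "p3", "hotel_id": "h2", "country": "Egypt",
     "beds": 1, "board": "all_inclusive", "price": 300},
    {"package_id": "p4", "hotel_id": "h3", "country": "Egypt",
     "beds": 2, "board": "half_board", "price": 200},
]


def search(rows, predicate):
    """Return every package row that satisfies the predicate."""
    return [r for r in rows if predicate(r)]


def collapse(rows, key="hotel_id"):
    """Keep only the first matching row per hotel, preserving order."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], r)
    return list(groups.values())


def facet_distinct(rows, facet_field, key="hotel_id"):
    """Count distinct hotels (not package rows) per facet value."""
    seen = {}
    for r in rows:
        seen.setdefault(r[facet_field], set()).add(r[key])
    return {value: len(ids) for value, ids in seen.items()}


hits = search(
    rows,
    lambda r: r["beds"] == 2
    and r["board"] == "all_inclusive"
    and r["price"] < 400,
)
hotels = [r["hotel_id"] for r in collapse(hits)]
beds_facet = facet_distinct(
    search(rows, lambda r: r["country"] == "Egypt"), "beds"
)
print(hotels, beds_facet)
```

Because all conditions are evaluated on a single package row, the room/package association is preserved. The price is that results and facet counts have to be reduced to distinct hotels, since otherwise a hotel with two matching packages would be counted twice.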


Can anybody give me some advice or share their experiences?

Thanks a lot
Wenca
