On Wed, 16 Jan 2008 16:54:56 +0100
"Philippe Guillard" <[EMAIL PROTECTED]> wrote:

> Hi here,
> 
> It seems that Lucene accepts any kind of XML document but Solr accepts only
> flat name/value pairs inside a document to be indexed.
> You'll find below what I'd like to do, Thanks for help of any kind !
> 
> Phil
> ____________________________________________

Hey Phil,

> 
> I need to index products (hotels) which have a price by date, then search
> them by date or date range and price range.
> Is there a way to do that with Solr?

Yes. Look at the field type definitions (somewhere in the wiki or the sample
schema.xml) for the types used to index dates, integers, etc. There are some
caveats about date fields: indexing them at too fine a resolution produces a
lot of distinct terms, which can slow down range queries and sorting.
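
To illustrate the resolution point, here is a minimal sketch in Python (the
helper name to_solr_date is made up for this example). It assumes you use
Solr's date type, which expects ISO 8601 timestamps in UTC, and it truncates
every value to midnight so that each day maps to a single term:

```python
from datetime import datetime

def to_solr_date(ddmmyyyy: str) -> str:
    """Convert 'DD/MM/YYYY' to a day-resolution Solr date string."""
    day = datetime.strptime(ddmmyyyy, "%d/%m/%Y")
    # Fix the time part at midnight UTC so all values for one day are identical.
    return day.strftime("%Y-%m-%dT00:00:00Z")

print(to_solr_date("30/01/2008"))
```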

> 
> At the moment i have a document for each hotel :
> <add>
>     <doc>
>         <field name="url">http://xxxx/yyy</field>
>         <field name="id">1</field>
>         <field name="name">Hotel Opera</field>
>         <field name="category">4 stars</field>
> .....
>     </doc>
> </add>
> 
> I would need to add my dates/price values like this but it is forbidden in
> Solr indexing:
> <date value="30/01/2008" price="200">
> <date value="31/01/2008" price="150">
> 
> Otherwise i could define a default field (being an integer) and have as many
> fields as dates, like this:
> <field name="30/01/2008">200</field>
> <field name="31/01/2008">150</field>
> indexing would accept it but i think i will not be able to search or sort by
> date

For simple dates like that, why not make use of dynamic fields? Define, for
example, bydate_* as a dynamic field in schema.xml, then you can do:

<field name="bydate_YYYYMMDD">$$$$</field>

So, from your example:
<field name="bydate_20080130">200</field>
<field name="bydate_20080131">150</field>
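
Searching would then mean one range clause per night. A rough sketch in Python
of how the client could build the field names and the query string (the
function names are invented for illustration, and it assumes bydate_* is
declared with an integer type that supports range queries, such as the sint
type in the sample schema.xml):

```python
from datetime import date, timedelta

def bydate_field(day: date) -> str:
    """Name of the dynamic field holding the price for one night."""
    return "bydate_" + day.strftime("%Y%m%d")

def price_query(first: date, last: date, low: int, high: int) -> str:
    """Require a price within [low, high] on every night from first to last."""
    clauses = []
    day = first
    while day <= last:
        # '+' marks the clause as required in Lucene query syntax.
        clauses.append(f"+{bydate_field(day)}:[{low} TO {high}]")
        day += timedelta(days=1)
    return " ".join(clauses)

print(price_query(date(2008, 1, 30), date(2008, 1, 31), 100, 200))
```

Sorting by the price of a given night would use the same field name, e.g.
sorting on bydate_20080130.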


> The only solution i found at the moment is to create a document for each
> date/price
> <add>
>     <doc>
>         <field name="url">http://xxxx/yyy</field>
>         <field name="id">1</field>
>         <field name="name">Hotel Opera</field>
>         <field name="date">30/01/2008</field>
>         <field name="price">200</field>
>     </doc>
>     <doc>
>         <field name="url"> http://xxxx/yyy</field>
>         <field name="id">1</field>
>         <field name="name">Hotel Opera</field>
>         <field name="date">31/01/2008</field>
>         <field name="price">150</field>
>     </doc>
> </add>

If the field 'id' is your schema's unique key, then this wouldn't work as is:
both documents have id 1, so the second would replace the first. But sure, the
approach would be valid, though a bit wasteful with respect to storing the
hotel's metadata in every document. There was a thread on this list a month or
two ago about clever uses of the field defined as the ID in the schema.
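
One way around the id clash, as a sketch only (not necessarily what that
thread suggested): derive the unique key from the hotel id plus the date, and
keep the hotel id in a separate field. The hotel_id field and the
per_date_docs helper below are assumptions made up for this example:

```python
def per_date_docs(hotel_id: str, name: str, prices: dict) -> list:
    """Build one document per night; prices maps 'DD/MM/YYYY' to a price."""
    docs = []
    for ddmmyyyy, price in prices.items():
        day, month, year = ddmmyyyy.split("/")
        docs.append({
            # Composite key, so documents for the same hotel do not overwrite
            # each other when 'id' is the schema's unique key.
            "id": f"{hotel_id}_{year}{month}{day}",
            "hotel_id": hotel_id,  # hypothetical field to group results by hotel
            "name": name,
            "date": f"{year}-{month}-{day}T00:00:00Z",
            "price": price,
        })
    return docs

for doc in per_date_docs("1", "Hotel Opera", {"30/01/2008": 200, "31/01/2008": 150}):
    print(doc["id"], doc["date"], doc["price"])
```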

> then i'll have many documents for 1 hotel
> and in order to search by date range i would need more documents
> like this :
>     <field name="date-range">28/01/2008 to 31/01/2008</field>
>     <field name="date-range">29/01/2008 to 31/01/2008</field>
>     <field name="date-range">30/01/2008 to 31/01/2008</field>
> 
> Since i need to index much other information about a hotel (address,
> telephone, amenities etc...) i wouldn't like to duplicate too much
> information, and i think it would not be scalable to search first in a dates
> index then in hotels index to retrieve hotel information.
> 
> Any idea?

It strikes me you'd probably want a relational DB for this kind
of thing....

B
_________________________
{Beto|Norberto|Numard} Meijome

Unix is user friendly. However, it isn't idiot friendly.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.
