On Wed, 16 Jan 2008 16:54:56 +0100 "Philippe Guillard" <[EMAIL PROTECTED]> wrote:
> Hi here,
>
> It seems that Lucene accepts any kind of XML document but Solr accepts only
> flat name/value pairs inside a document to be indexed.
> You'll find below what I'd like to do. Thanks for help of any kind!
>
> Phil
> ____________________________________________

Hey Phil,

> I need to index products (hotels) which have a price by date, then search
> them by date or date range and price range.
> Is there a way to do that with Solr?

Yes. Look at the field type definitions (in the wiki or in the sample schema.xml) for indexing dates, integers, etc. There are some caveats about date fields: indexing them at too fine a resolution can slow things down considerably.

> At the moment I have a document for each hotel:
> <add>
> <doc>
> <field name="url">http://xxxx/yyy</field>
> <field name="id">1</field>
> <field name="name">Hotel Opera</field>
> <field name="category">4 stars</field>
> .....
> </doc>
> </add>
>
> I would need to add my date/price values like this, but it is forbidden in
> Solr indexing:
> <date value="30/01/2008" price="200">
> <date value="31/01/2008" price="150">
>
> Otherwise I could define a default field (being an integer) and have as many
> fields as dates, like this:
> <field name="30/01/2008">200</field>
> <field name="31/01/2008">150</field>
> Indexing would accept it, but I think I will not be able to search or sort by
> date

For simple dates like that, why not make use of dynamic fields? Define, for example, bydate_* as a dynamic field; then you can do:

<field name="bydate_YYYYMMDD">$$$$</field>

So, from your example:

<field name="bydate_20080130">200</field>
<field name="bydate_20080131">150</field>

> The only solution I found at the moment is to create a document for each
> date/price
> <add>
> <doc>
> <field name="url">http://xxxx/yyy</field>
> <field name="id">1</field>
> <field name="name">Hotel Opera</field>
> <field name="date">30/01/2008</field>
> <field name="price">200</field>
> </doc>
> <doc>
> <field name="url">http://xxxx/yyy</field>
> <field name="id">1</field>
> <field name="name">Hotel Opera</field>
> <field name="date">31/01/2008</field>
> <field name="price">150</field>
> </doc>
> </add>

If the field 'id' is your schema's unique key, then this wouldn't work as written: both documents share id 1, so the second would overwrite the first. With a distinct ID per document the approach is valid, though a bit wasteful with respect to storing the hotel's metadata in every document. There was a thread on this list a month or two ago about clever uses of the field defined as the ID in the schema.

> then I'll have many documents for 1 hotel
> and in order to search by date range I would need more documents
> like this:
> <field name="date-range">28/01/2008 to 31/01/2008</field>
> <field name="date-range">29/01/2008 to 31/01/2008</field>
> <field name="date-range">30/01/2008 to 31/01/2008</field>
>
> Since I need to index much other information about a hotel (address,
> telephone, amenities etc...) I wouldn't like to duplicate too much
> information, and I think it would not be scalable to search first in a dates
> index then in a hotels index to retrieve hotel information.
>
> Any idea?

It strikes me you'd probably want a relational DB for this kind of thing....

B
_________________________
{Beto|Norberto|Numard} Meijome

Unix is user friendly. However, it isn't idiot friendly.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
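
P.S. For reference, the dynamic-field suggestion in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the sortable integer type named "sint" from the example schema.xml of that era, and the attribute values shown are guesses to be checked against your own schema. The bydate_* name is the one proposed in the reply; the hotel values come from Phil's example.

```xml
<!-- schema.xml fragment (assumption: a "sint" sortable-integer field type
     is already defined, as in the example schema shipped with Solr) -->
<dynamicField name="bydate_*" type="sint" indexed="true" stored="true"/>

<!-- update message: one document per hotel, one price field per date -->
<add>
  <doc>
    <field name="url">http://xxxx/yyy</field>
    <field name="id">1</field>
    <field name="name">Hotel Opera</field>
    <field name="bydate_20080130">200</field>
    <field name="bydate_20080131">150</field>
  </doc>
</add>
```

With fields like these, a price range on a given date can be expressed as an ordinary range query, for example bydate_20080130:[100 TO 200]. A search over a date range would still need one clause per date, which is part of why a relational DB may remain the better fit.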