full text search across those columns can all done by putting all values into one text field, are there any benefits in it being multi-valued?
multi-valued is useful for one-to-many representation, e.g. - different features of a product - dates where that product was best seller - countries it is sold in you can then use them for exact searches and generating facets the original documents not following a standard schema probably makes it more appropriate to go straight to solr using groovy with solrj you should be able to manipulate those XML documents either one for one into solr or collect/transform into reduced set of solr documents ----- Original Message ----- From: "caman" <aboxfortheotherst...@gmail.com> To: solr-user@lucene.apache.org Sent: Tuesday, 2 February, 2010 6:31:36 PM Subject: Re: Indexing an oracle warehouse table Ron, Much appreciated. Search requirements are as : 1) Enable search/faceting on author,service,datetime. 2) Enable full text search on all text column which are named as col1.... col800+ -- total of more than 800 columns. Here is what I did so far: Defined entities in db schema in db-config.xml without any column definition in the file, which basically mean is that I want to keep fields name same as in the database. Now in schema.xml : I have <field> tag for each database field retrieved with the SQL queries in db-config.xml, which are more than 800+ (did not write this by hand,wrote a groovy script to generate this for me from the database) Multi-valued : Yes, this is what I am using to copy all the fields col1...col800+ to one multi-valued field. That fileld is set as default for search. You are right about going to original data source but then had to take a different approach. Original source is all XML files which do not follow a standard schema for the structure. I hope what I mentioned above makes sense.appreciate the response. Ron Chan wrote: > > it depends on what the search requirements are, so without knowing the > details here are some vague pointers > > you may only need to have fields for the columns you are going to be > categorizing and searching on, this may be a small subset of the 800 and > the rest can go into one large field to fulfil the full text search > > another thing to look into is the multi value fields, this can sometimes > replace the one-to-many relationships in database > > also it may sometimes be worth while going to the original data source > rather than the warehouse table, as this is already flattened and > denormalised, the flattening and denormalizing will most likely be done a > different way when solr indexing database type data, highly likely you > will end up with less rows and less columns in the solr index, as each > solr document can be seen as "multi-dimensional" > > > ----- Original Message ----- > From: "caman" <aboxfortheotherst...@gmail.com> > To: solr-user@lucene.apache.org > Sent: Tuesday, 2 February, 2010 1:23:01 AM > Subject: Indexing an oracle warehouse table > > > Hello all, > > hope someone can point me to right direction. I am trying to index an > oracle > warehouse table(TableA) with 850 columns. Out of the structure about 800 > fields are CLOBs and are good candidate to enable full-text searching. > Also > have few columns which has relational link to other tables. I am clean on > how to create a root entity and then pull data from other relational link > as > child entities. Most columns in TableA are named as > field1,field2...field800. > Now my question is how to organize the schema efficiently: > First option: > if my query is 'select * from TableA', Do I define <field name="attr1" > column="FIELD1" /> for each of those 800 columns? Seems cumbersome. May be > can write a script to generate XML instead of handwriting both in > data-config.xml and schema.xml. > OR > Dont define any <field name="attr1" column="FIELD1" /> so that column in > SOLR will be same as in the database table. But questions are 1)How do I > define unique field in this scenario? 2) How to copy all the text fields > to > a common field for easy searching? > > Any helpful is appreciated. Please feel free to suggest any alternative > way. > > Thanks > > > > > > -- > View this message in context: > http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27414263.html > > Sent from the Solr - User mailing list archive at Nabble.com. > > > -- View this message in context: http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27425156.html Sent from the Solr - User mailing list archive at Nabble.com.