Re: Indexing an oracle warehouse table

caman Tue, 02 Feb 2010 10:32:08 -0800

Ron,

Much appreciated.  Search requirements are as :
1) Enable search/faceting on author,service,datetime. 
2) Enable full text search on all text column which are named as col1....
col800+  -- total of more than 800 columns.


Here is what I did so far: Defined entities in db schema in db-config.xml
without any column definition in the file, which basically mean is that I
want to keep fields name same as in the database. 
Now in schema.xml : I have <field> tag for each database field retrieved
with the SQL queries in db-config.xml, which are more than 800+ (did not
write this by hand,wrote a groovy script to generate this for me from the
database)

Multi-valued : Yes, this is what I am using to copy all the fields
col1...col800+ to one multi-valued field. That fileld is set as default for
search.

You are right about going to original data source but then had to take a
different approach. Original source is all XML files which do not follow a
standard schema for the structure.

I hope what I mentioned above makes sense.appreciate the response.





Ron Chan wrote:
> 
> it depends on what the search requirements are, so without knowing the
> details here are some vague pointers 
> 
> you may only need to have fields for the columns you are going to be
> categorizing and searching on, this may be a small subset of the 800 and
> the rest can go into one large field to fulfil the full text search 
> 
> another thing to look into is the multi value fields, this can sometimes
> replace the one-to-many relationships in database 
> 
> also it may sometimes be worth while going to the original data source
> rather than the warehouse table, as this is already flattened and
> denormalised, the flattening and denormalizing will most likely be done a
> different way when solr indexing database type data, highly likely you
> will end up with less rows and less columns in the solr index, as each
> solr document can be seen as "multi-dimensional" 
> 
> 
> ----- Original Message ----- 
> From: "caman" <aboxfortheotherst...@gmail.com> 
> To: solr-user@lucene.apache.org 
> Sent: Tuesday, 2 February, 2010 1:23:01 AM 
> Subject: Indexing an oracle warehouse table 
> 
> 
> Hello all, 
> 
> hope someone can point me to right direction. I am trying to index an
> oracle 
> warehouse table(TableA) with 850 columns. Out of the structure about 800 
> fields are CLOBs and are good candidate to enable full-text searching.
> Also 
> have few columns which has relational link to other tables. I am clean on 
> how to create a root entity and then pull data from other relational link
> as 
> child entities. Most columns in TableA are named as 
> field1,field2...field800. 
> Now my question is how to organize the schema efficiently: 
> First option: 
> if my query is 'select * from TableA', Do I define <field name="attr1" 
> column="FIELD1" /> for each of those 800 columns? Seems cumbersome. May be 
> can write a script to generate XML instead of handwriting both in 
> data-config.xml and schema.xml. 
> OR 
> Dont define any <field name="attr1" column="FIELD1" /> so that column in 
> SOLR will be same as in the database table. But questions are 1)How do I 
> define unique field in this scenario? 2) How to copy all the text fields
> to 
> a common field for easy searching? 
> 
> Any helpful is appreciated. Please feel free to suggest any alternative
> way. 
> 
> Thanks 
> 
> 
> 
> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27414263.html
>  
> Sent from the Solr - User mailing list archive at Nabble.com. 
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27425156.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Indexing an oracle warehouse table

Reply via email to