You are out of luck if you are not using a recent version of DIH. The sub-entity will work only if you use the FieldReaderDataSource, in which case you do not need a ClobTransformer either.
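For reference, here is a rough sketch of how the nested entity might look with FieldReaderDataSource, adapted from the configuration quoted below. It assumes a DIH build recent enough to ship FieldReaderDataSource. The data source name "field-source" is made up for illustration, and the column name in dataField may need to match the case your JDBC driver reports.

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" name="data-source-1"
              driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@XXXXX"
              user="abc" password="***" batchSize="100"/>
  <!-- Hypothetical name: reads XML from a field of the parent row instead of a URL -->
  <dataSource type="FieldReaderDataSource" name="field-source"/>
  <document>
    <entity name="item" dataSource="data-source-1" processor="SqlEntityProcessor"
            rootEntity="false"
            query="select xml_col from xml_table where xml_col IS NOT NULL">
      <!-- dataField takes the place of url; no ClobTransformer on the parent entity -->
      <entity name="record" dataSource="field-source" processor="XPathEntityProcessor"
              dataField="item.xml_col"
              forEach="/record">
        <field column="ID" xpath="/record/coreinfo/@a"/>
        <field column="type" xpath="/record/coreinfo/@b"/>
        <field column="streetname" xpath="/record/address/@c"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The point is that the inner entity names a real data source rather than "null", and that it reads the parent's column through dataField rather than url.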
The trunk version of DIH can be used w/ Solr 1.3 release.

On Thu, Jan 22, 2009 at 12:59 PM, Gunaranjan Chandraraju <chandrar...@apple.com> wrote:
> Hi
>
> Yes, the XML is inside the DB in a clob. Would love to use XPath inside
> SQLEntityProcessor as it will save me tons of trouble for file-dumping
> (given that I am not able to post it). This is how I setup my DIH for DB
> import.
>
> <dataConfig>
>   <dataSource type="JdbcDataSource" name="data-source-1"
>       driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@XXXXX"
>       user="abc" password="***" batchSize="100"/>
>   <document>
>     <entity dataSource="data-source-1"
>         name ="item" processor="SqlEntityProcessor"
>         pk="ID"
>         stream="false"
>         rootEntity="false"
>         transformer="ClobTransformer" <!-- custom clob transformer I saw and not the one from 1.4. -->
>         query="select xml_col from xml_table where xml_col IS NOT NULL" >
>       <!-- horrible query I need to work on making it better -->
>
>       <entity
>           dataSource="null" <!-- this is my problem - if I don't give a name here it complains, if I put in null then the code seems to fail with a null pointer -->
>           name="record"
>           processor="XPathEntityProcessor"
>           stream="false"
>           url="${item.xml_col}"
>           forEach="/record">
>
>         <field column="ID" xpath="/record/coreinfo/@a" />
>         <field column="type" xpath="/record/coreinfo/@b" />
>         <field column="streetname" xpath="/record/address/@c" />
>
>         .. and so on
>       </entity>
>
>
>     </entity>
>   </document>
> </dataConfig>
>
>
> The problem with this is that it always fails with this error. I can see
> that the earlier SQL entity extraction and clob transformation is working as
> the values show in the debug jsp (verbose mode with dataimport.jsp).
> However no records are extracted for entity. When I check catalina.out
> file, it shows me the following errors for entity name="record". (the XPath
> entity on top).
>
> java.lang.NullPointerException at
> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85).
>
> I don't have the whole stack trace right now. If you need it I would be
> happy to recreate and post it.
>
> Regards,
> Guna
>
> On Jan 21, 2009, at 8:22 PM, Noble Paul നോബിള് नोब्ळ् wrote:
>
>> On Thu, Jan 22, 2009 at 7:02 AM, Gunaranjan Chandraraju
>> <chandrar...@apple.com> wrote:
>>>
>>> Thanks
>>>
>>> Yes the source of data is a DB. However the xml is also posted on updates
>>> via publish framework. So I can just plug in an adapter here to listen for
>>> changes and post to SOLR. I was trying to use the XPathProcessor inside the
>>> SQLEntityProcessor and this did not work (using 1.3 - I did see support in
>>> 1.4). That is not a show stopper for me and I can just post them via the
>>> framework and use files for the first time load.
>>
>> XPathEntityProcessor works inside SqlEntityProcessor only if a db
>> field contains xml.
>>
>> However, you can have a separate entity (at the root) to read from db
>> for delta.
>> Anyway if your current solution works stick to it.
>>>
>>> Have seen a couple of answers on the backup for crash scenarios. Just
>>> wanted to confirm - if I replace the index with the backed-up files then I
>>> can simply start solr up again and reindex the documents changed since
>>> last backup? Am I right? The slaves will also automatically adjust to this.
>>
>> Yes, you can replace an archived index and Solr should work just fine,
>> but the docs added since the last snapshot was taken will be missing
>> (of course :) )
>>>
>>> Thanks
>>> Guna
>>>
>>>
>>> On Jan 20, 2009, at 9:37 PM, Noble Paul നോബിള് नोब्ळ् wrote:
>>>
>>>> On Wed, Jan 21, 2009 at 5:15 AM, Gunaranjan Chandraraju
>>>> <chandrar...@apple.com> wrote:
>>>>>
>>>>> Hi All
>>>>> We are considering SOLR for a large database of XMLs.
>>>>> I have some newbie questions - if there is a place I can go read about
>>>>> them do let me know and I will go read up :)
>>>>>
>>>>> 1. Currently we are able to pull the XMLs from a file system using
>>>>> FileDataSource. The DIH is convenient since I can map my XML fields using
>>>>> the XPathProcessor. This works for an initial load. However after the
>>>>> initial load, we would like to 'post' changed xmls to SOLR whenever the XML
>>>>> is updated in a separate system. I know we can post xmls with 'add' however
>>>>> I was not sure how to do this and maintain the DIH mapping I use in
>>>>> data-config.xml? I don't want to save the file to the disk and then call
>>>>> the DIH - would prefer to directly post it. Do I need to use solrj for
>>>>> this?
>>>>
>>>> What is the source of your new data? Is it a DB?
>>>>
>>>>>
>>>>> 2. If my solr schema.xml changes then do I HAVE to reindex all the old
>>>>> documents? Suppose in future we have newer XML documents that contain a
>>>>> new additional xml field. The old documents that are already indexed don't
>>>>> have this field and (so) I don't need to search on them with this field.
>>>>> However the new ones need to be search-able on this new field. Can I
>>>>> just add this new field to the SOLR schema, restart the servers and just
>>>>> post the new documents or do I need to reindex everything?
>>>>>
>>>>> 3. Can I back up the index directory, so that in case of a disk crash I
>>>>> can restore this directory and bring solr up? I realize that any documents
>>>>> indexed after this backup would be lost - I can however keep track of these
>>>>> outside and simply re-index documents 'newer' than that backup date. This
>>>>> question is really important to me in the context of using a Master Server
>>>>> with replicated index. I would like to run this backup for the 'Master'.
>>>>
>>>> The snapshot script can be used to take backups on commit.
>>>>>
>>>>> 4. In general what happens when the solr application is bounced? Is the
>>>>> index affected (anything maintained in memory)?
>>>>>
>>>>> Regards
>>>>> Guna
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> --Noble Paul
>>>
>>>
>>
>>
>>
>> --
>> --Noble Paul
>
>

--
--Noble Paul