It is planned to be out in another month or so. But one can never be sure.
On Fri, Jan 23, 2009 at 3:57 AM, Gunaranjan Chandraraju <chandrar...@apple.com> wrote:
> Thanks
>
> A last question - do you have any approximate date for the release of 1.4?
> If it's going to be soon enough (within a month or so) then I can plan
> our development around it.
>
> Thanks
> Guna
>
> On Jan 22, 2009, at 11:04 AM, Noble Paul നോബിള് नोब्ळ् wrote:
>
>> You are out of luck if you are not using a recent version of DIH.
>>
>> The sub-entity will work only if you use the FieldReaderDataSource.
>> Then you do not need a ClobTransformer either.
>>
>> The trunk version of DIH can be used with the Solr 1.3 release.
>>
>> On Thu, Jan 22, 2009 at 12:59 PM, Gunaranjan Chandraraju
>> <chandrar...@apple.com> wrote:
>>>
>>> Hi
>>>
>>> Yes, the XML is inside the DB in a CLOB. I would love to use XPath inside
>>> SqlEntityProcessor as it will save me tons of trouble over file-dumping
>>> (given that I am not able to post it). This is how I set up my DIH for DB
>>> import:
>>>
>>> <dataConfig>
>>>   <dataSource type="JdbcDataSource" name="data-source-1"
>>>               driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@XXXXX"
>>>               user="abc" password="***" batchSize="100"/>
>>>   <document>
>>>     <entity dataSource="data-source-1"
>>>             name="item" processor="SqlEntityProcessor"
>>>             pk="ID"
>>>             stream="false"
>>>             rootEntity="false"
>>>             transformer="ClobTransformer" <!-- custom clob transformer I saw, not the one from 1.4 -->
>>>             query="select xml_col from xml_table where xml_col IS NOT NULL">
>>>             <!-- horrible query, I need to work on making it better -->
>>>
>>>       <entity
>>>           dataSource="null" <!-- this is my problem - if I don't give a name here it complains; if I put in null then the code seems to fail with a null pointer -->
>>>           name="record"
>>>           processor="XPathEntityProcessor"
>>>           stream="false"
>>>           url="${item.xml_col}"
>>>           forEach="/record">
>>>
>>>         <field column="ID" xpath="/record/coreinfo/@a" />
>>>         <field column="type" xpath="/record/coreinfo/@b" />
>>>         <field column="streetname" xpath="/record/address/@c" />
>>>         .. and so on
>>>       </entity>
>>>     </entity>
>>>   </document>
>>> </dataConfig>
>>>
>>> The problem is that it always fails with the error below. I can see that
>>> the earlier SQL entity extraction and CLOB transformation are working, as
>>> the values show in the debug jsp (verbose mode with dataimport.jsp).
>>> However, no records are extracted for the entity. When I check the
>>> catalina.out file, it shows me the following error for the entity
>>> name="record" (the XPath entity above):
>>>
>>> java.lang.NullPointerException at
>>> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
>>>
>>> I don't have the whole stack trace right now. If you need it I would be
>>> happy to recreate and post it.
>>>
>>> Regards,
>>> Guna
>>>
>>> On Jan 21, 2009, at 8:22 PM, Noble Paul നോബിള് नोब्ळ् wrote:
>>>
>>>> On Thu, Jan 22, 2009 at 7:02 AM, Gunaranjan Chandraraju
>>>> <chandrar...@apple.com> wrote:
>>>>>
>>>>> Thanks
>>>>>
>>>>> Yes, the source of data is a DB. However, the XML is also posted on
>>>>> updates via a publish framework, so I can just plug in an adapter here
>>>>> to listen for changes and post to SOLR. I was trying to use the
>>>>> XPathEntityProcessor inside the SqlEntityProcessor and this did not work
>>>>> (using 1.3 - I did see support in 1.4). That is not a show-stopper for
>>>>> me; I can just post them via the framework and use files for the
>>>>> first-time load.
>>>>
>>>> XPathEntityProcessor works inside SqlEntityProcessor only if a db
>>>> field contains xml.
>>>>
>>>> However, you can have a separate entity (at the root) to read from the
>>>> db for deltas.
>>>> Anyway, if your current solution works, stick to it.
>>>>>
>>>>> I have seen a couple of answers on backups for crash scenarios. Just
>>>>> wanted to confirm - if I replace the index with the backed-up files,
>>>>> then I can simply start up solr again and reindex the documents changed
>>>>> since the last backup? Am I right? The slaves will also automatically
>>>>> adjust to this.
>>>>
>>>> Yes, you can replace an archived index and Solr should work just fine,
>>>> but the docs added since the last snapshot was taken will be missing
>>>> (of course :) )
>>>>>
>>>>> Thanks
>>>>> Guna
>>>>>
>>>>>
>>>>> On Jan 20, 2009, at 9:37 PM, Noble Paul നോബിള് नोब्ळ् wrote:
>>>>>
>>>>>> On Wed, Jan 21, 2009 at 5:15 AM, Gunaranjan Chandraraju
>>>>>> <chandrar...@apple.com> wrote:
>>>>>>>
>>>>>>> Hi All
>>>>>>> We are considering SOLR for a large database of XMLs. I have some
>>>>>>> newbie questions - if there is a place I can go read about them, do
>>>>>>> let me know and I will go read up :)
>>>>>>>
>>>>>>> 1. Currently we are able to pull the XMLs from a file system using
>>>>>>> FileDataSource. The DIH is convenient since I can map my XML fields
>>>>>>> using the XPathEntityProcessor. This works for an initial load.
>>>>>>> However, after the initial load, we would like to 'post' changed XMLs
>>>>>>> to SOLR whenever the XML is updated in a separate system. I know we
>>>>>>> can post XMLs with 'add'; however, I was not sure how to do this and
>>>>>>> keep the DIH mapping I use in data-config.xml. I don't want to save
>>>>>>> the file to disk and then call the DIH - I would prefer to post it
>>>>>>> directly. Do I need to use solrj for this?
>>>>>>
>>>>>> What is the source of your new data? Is it a DB?
>>>>>>
>>>>>>>
>>>>>>> 2. If my solr schema.xml changes, do I HAVE to reindex all the old
>>>>>>> documents? Suppose in the future we have newer XML documents that
>>>>>>> contain a new additional xml field. The old documents that are already
>>>>>>> indexed don't have this field and (so) I don't need to search on them
>>>>>>> with this field. However, the new ones need to be searchable on this
>>>>>>> new field. Can I just add this new field to the SOLR schema, restart
>>>>>>> the servers, and post only the new documents, or do I need to reindex
>>>>>>> everything?
>>>>>>>
>>>>>>> 3. Can I back up the index directory, so that in case of a disk crash
>>>>>>> I can restore this directory and bring solr up? I realize that any
>>>>>>> documents indexed after this backup would be lost - I can, however,
>>>>>>> keep track of these outside and simply re-index documents 'newer' than
>>>>>>> that backup date. This question is really important to me in the
>>>>>>> context of using a Master server with a replicated index. I would like
>>>>>>> to run this backup for the 'Master'.
>>>>>>
>>>>>> The snapshot script can be used to take backups on commit.
>>>>>>>
>>>>>>> 4. In general, what happens when the solr application is bounced? Is
>>>>>>> the index affected (anything maintained in memory)?
>>>>>>>
>>>>>>> Regards
>>>>>>> Guna
>>>>>>
>>>>>> --
>>>>>> --Noble Paul
>>>>
>>>> --
>>>> --Noble Paul
>>
>> --
>> --Noble Paul

--
--Noble Paul
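
For reference, a minimal sketch of the FieldReaderDataSource arrangement Noble describes (trunk/1.4 DIH), reusing the table, column, and XPath names from Guna's config above. This is an untested illustration, not a verified configuration: the nested XPathEntityProcessor reads the CLOB column of the parent row via dataField, so no ClobTransformer and no url="${item.xml_col}" is needed.

<dataConfig>
  <dataSource name="data-source-1" type="JdbcDataSource"
              driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@XXXXX"
              user="abc" password="***" batchSize="100"/>
  <!-- reads XML text out of a column of the parent entity's row -->
  <dataSource name="field-reader" type="FieldReaderDataSource"/>
  <document>
    <entity name="item" dataSource="data-source-1" processor="SqlEntityProcessor"
            pk="ID" rootEntity="false"
            query="select xml_col from xml_table where xml_col IS NOT NULL">
      <!-- dataField points at the parent entity's column; no ClobTransformer -->
      <entity name="record" dataSource="field-reader" processor="XPathEntityProcessor"
              dataField="item.xml_col" forEach="/record">
        <field column="ID" xpath="/record/coreinfo/@a"/>
        <field column="type" xpath="/record/coreinfo/@b"/>
        <field column="streetname" xpath="/record/address/@c"/>
      </entity>
    </entity>
  </document>
</dataConfig>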
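On question 3, the snapshot script Noble mentions is typically wired up as a postCommit event listener in solrconfig.xml (the stock 1.3 solrconfig.xml ships a commented-out version of this). The paths below are illustrative and depend on where snapshooter lives in your install:

<listener event="postCommit" class="solr.RunExecutableListener">
  <!-- path to the snapshooter script shipped with Solr; adjust for your layout -->
  <str name="exe">solr/bin/snapshooter</str>
  <!-- working directory for the script -->
  <str name="dir">.</str>
  <!-- block the commit until the snapshot finishes -->
  <bool name="wait">true</bool>
</listener>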
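On question 2, adding an optional field to schema.xml would look roughly like the line below; the field name here is made up for illustration. Documents indexed before the field was added simply have no value for it, which matches the case described where old documents need not be searchable on the new field.

<!-- hypothetical new field; existing documents just lack a value for it -->
<field name="newfield" type="string" indexed="true" stored="true" required="false"/>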