DataImport TXT file entity processor

2009-01-23 Thread Nathan Adams
Is there a way to use the Data Import Handler to index non-XML (i.e. simple
text) files (either via HTTP or the file system)?  I need to put the entire
contents of a text file into a single field of a document, and the other
fields are being pulled out of Oracle...

 

-Nathan
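
One possible approach, sketched here under assumptions rather than taken
from this thread: DIH releases from Solr 1.4 onward include a
PlainTextEntityProcessor, which reads the whole content of its data source
into a single implicit column named plainText. The data source names (db,
files), the table name, the file path, the RESOURCEID column and the target
field name (text) below are all hypothetical placeholders.

```xml
<dataConfig>
  <!-- Hypothetical connection details; "db" and "files" are illustrative names. -->
  <dataSource name="db" type="JdbcDataSource"
              driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@?" user="???" password="???"/>
  <!-- FileDataSource reads from the local file system. -->
  <dataSource name="files" type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- Outer entity: one Solr document per database row (table name assumed). -->
    <entity name="metadata" dataSource="db" query="select * from SOME_TABLE">
      <!-- Inner entity: loads one text file per row into the plainText column. -->
      <entity name="body" processor="PlainTextEntityProcessor" dataSource="files"
              url="/path/to/texts/${metadata.RESOURCEID}.txt">
        <!-- Copies the file content into a schema field assumed to be called "text". -->
        <field column="plainText" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

To fetch the files over HTTP instead, the inner entity would point at an
HTTP data source and use a full URL in the url attribute.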



DIH handling of missing files

2009-01-28 Thread Nathan Adams
I am constructing documents from a JDBC data source and an HTTP data source
(see the data-config file below). My problem is that I cannot know whether a
particular HTTP URL will be available at index time, so I need DIH to
continue processing even if the HTTP location returns a 404.
onError="continue" does not appear to help in this case. Should it?

driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@?"
user="???" password="???"/>

query="select * from " onError="continue">

url="http://???.com/${metadata.RESOURCEID}.xml" forEach="/content"
dataSource="http" processor="XPathEntityProcessor" onError="continue">

Thanks,
Nathan


RE: DIH handling of missing files

2009-01-29 Thread Nathan Adams
I'm running the example from the DIH wiki page:
 
http://wiki.apache.org/solr-data/attachments/DataImportHandler/attachments/example-solr-home.jar
 
-Nathan

 


From: Noble Paul [mailto:noble.p...@gmail.com]
Sent: Wed 01/28/2009 11:32 PM
To: solr-user@lucene.apache.org
Subject: Re: DIH handling of missing files



onError="continue" should help.

Which version of DIH are you using? onError is a Solr 1.4 feature.
--Noble

On Thu, Jan 29, 2009 at 5:04 AM, Nathan Adams wrote:
> I am constructing documents from a JDBC data source and an HTTP data source
> (see the data-config file below). My problem is that I cannot know whether a
> particular HTTP URL will be available at index time, so I need DIH to
> continue processing even if the HTTP location returns a 404.
> onError="continue" does not appear to help in this case. Should it?
>
> driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@?"
> user="???" password="???"/>
>
> query="select * from " onError="continue">
>
> url="http://???.com/${metadata.RESOURCEID}.xml" forEach="/content"
> dataSource="http" processor="XPathEntityProcessor" onError="continue">
>
> Thanks,
> Nathan
>



--
--Noble Paul




RE: DIH handling of missing files

2009-01-29 Thread Nathan Adams
That example appears to be v1.3, which explains the problem.  Thanks!
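
For reference, a config of the kind discussed in this thread might look like
the sketch below on Solr 1.4. The archive stripped the original element
names, so the element structure, the data source name db, the inner entity
name, the table name and the field mapping are assumptions. The parent
entity name metadata is inferred from the ${metadata.RESOURCEID} variable.

```xml
<dataConfig>
  <!-- Connection details are elided exactly as in the original post. -->
  <dataSource name="db" type="JdbcDataSource"
              driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@?" user="???" password="???"/>
  <!-- URLDataSource is the Solr 1.4 name; Solr 1.3 called it HttpDataSource. -->
  <dataSource name="http" type="URLDataSource"/>
  <document>
    <!-- Table name is a placeholder; the original query was truncated. -->
    <entity name="metadata" dataSource="db" query="select * from SOME_TABLE">
      <!-- onError accepts abort (the default), skip, or continue.
           With continue, a failed fetch such as a 404 is ignored and the
           document is built from the remaining fields. -->
      <entity name="content" dataSource="http" processor="XPathEntityProcessor"
              url="http://???.com/${metadata.RESOURCEID}.xml" forEach="/content"
              onError="continue">
        <!-- Hypothetical field mapping; the original mappings were lost. -->
        <field column="body" xpath="/content/body"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

Per Noble's reply, the onError attribute is only honoured from Solr 1.4
onward, which is consistent with it having no effect in the v1.3 example
install.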


