DataImport TXT file entity processor
Is there a way to use the Data Import Handler to index non-XML (i.e. simple text) files, either via HTTP or from the file system? I need to put the entire contents of a text file into a single field of a document; the other fields are being pulled out of Oracle.

-Nathan
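For reference, later DIH releases (Solr 1.4 onward) include a PlainTextEntityProcessor that reads the whole content of a data source into one implicit column named plainText. Below is a minimal sketch of how the setup described in the question might look, assuming that processor is available. The data source names, entity names, Oracle query, URL pattern, and the Solr field name "body" are all hypothetical.

```xml
<dataConfig>
  <!-- Connection details are placeholders, not real values -->
  <dataSource name="db" type="JdbcDataSource"
              driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@?" user="???" password="???"/>
  <!-- URLDataSource fetches over HTTP; hypothetical name "txt" -->
  <dataSource name="txt" type="URLDataSource"/>
  <document>
    <!-- Outer entity: one Solr document per Oracle row (hypothetical query) -->
    <entity name="meta" dataSource="db" query="SELECT ID, TITLE FROM DOCS">
      <!-- Inner entity: reads the whole text file into the implicit
           plainText column and maps it to a single Solr field -->
      <entity name="body" dataSource="txt"
              processor="PlainTextEntityProcessor"
              url="http://example.com/${meta.ID}.txt">
        <field column="plainText" name="body"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

For files on the local file system, a FileDataSource could stand in for the URLDataSource. This processor is not part of Solr 1.3.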
DIH handling of missing files
I am constructing documents from a JDBC datasource and an HTTP datasource (see the data-config fragment below). I cannot know whether a particular HTTP URL will be available at index time, so I need DIH to continue processing even if the HTTP location returns a 404. onError="continue" does not appear to help in this case. Should it?

url="http://???.com/${metadata.RESOURCEID}.xml" forEach="/content" dataSource="http" processor="XPathEntityProcessor" onError="continue">

Thanks,
Nathan
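The archive stripped most of the XML tags from the config in this message. The following is a hedged reconstruction of what such a data-config might look like, for orientation only. The attribute values come from the thread. The element nesting, the outer entity name "metadata" (inferred from the ${metadata.RESOURCEID} variable), the inner entity name "content", and the HTTP data source type are assumptions. The table name and the field mappings were lost and are left elided.

```xml
<dataConfig>
  <!-- JDBC source: connection details are elided in the original -->
  <dataSource driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@?" user="???" password="???"/>
  <!-- HTTP source; the type attribute is an assumption -->
  <dataSource name="http" type="HttpDataSource"/>
  <document>
    <!-- Outer entity: rows from Oracle; the table name is missing in the original -->
    <entity name="metadata" query="select * from " onError="continue">
      <!-- Inner entity: one HTTP fetch per row, which may return a 404 -->
      <entity name="content"
              url="http://???.com/${metadata.RESOURCEID}.xml"
              forEach="/content" dataSource="http"
              processor="XPathEntityProcessor" onError="continue">
        <!-- field mappings were not preserved -->
      </entity>
    </entity>
  </document>
</dataConfig>
```

Per the replies in this thread, onError is a Solr 1.4 feature, so the attribute would not be expected to change behavior on a 1.3 install.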
RE: DIH handling of missing files
I'm running the example from the DIH wiki page:
http://wiki.apache.org/solr-data/attachments/DataImportHandler/attachments/example-solr-home.jar

-Nathan

From: Noble Paul [mailto:noble.p...@gmail.com]
Sent: Wed 01/28/2009 11:32 PM
To: solr-user@lucene.apache.org
Subject: Re: DIH handling of missing files

onError="continue" must help. Which version of DIH are you using? onError is a Solr 1.4 feature.
--Noble

On Thu, Jan 29, 2009 at 5:04 AM, Nathan Adams wrote:
> I am constructing documents from a JDBC datasource and a HTTP datasource
> (see data-config file below.) My problem is that I cannot know if a
> particular HTTP URL is available at index time, so I need DIH to
> continue processing even if the HTTP location returns a 404.
> onError="continue" does not appear to help in this case. Should it?
>
> driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@?"
> user="???" password="???"/>
>
> query="select * from " onError="continue">
>
> url="http://???.com/${metadata.RESOURCEID}.xml"
> forEach="/content"
> dataSource="http" processor="XPathEntityProcessor" onError="continue">
>
> Thanks,
> Nathan

--
--Noble Paul
RE: DIH handling of missing files
Which appears to be v1.3, which explains the problem. Thanks!