Thanks a lot, Lance. So, are these part of the Solr 1.4 release?
-----Original Message-----
From: Lance Norskog [mailto:goks...@gmail.com]
Sent: Thursday, April 15, 2010 9:53 AM
To: solr-user@lucene.apache.org
Subject: Re: DIH

FileListEntityProcessor -> BinFileDataSource -> TikaEntityProcessor (I think)

FLEP walks the directory and supplies a separate record per file. BFDS pulls the file and supplies it to TikaEntityProcessor.

BinFileDataSource is not documented, but you need it for binary data streams like PDF & Word. For text files, use FileDataSource.

On 4/14/10, Sandhya Agarwal <sagar...@opentext.com> wrote:
> Hello,
>
> We want to design a solution where we have one polling directory (data
> source directory) containing the xml files, of all data that must be
> indexed. These XML files contain a reference to the content file. So, we
> need another datasource that must be created for the content files. Could
> somebody please tell me what is the best way to get this working using the
> DIH / tika processor.
>
> Thanks,
> Sandhya

--
Lance Norskog
goks...@gmail.com
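
For illustration, the FileListEntityProcessor -> BinFileDataSource -> TikaEntityProcessor chain described above might look roughly like the following DIH data-config. This is only a sketch: the directory path, the file name pattern, and the target field name are hypothetical placeholders, and the attribute names should be checked against the DIH documentation for the Solr version in use.

```xml
<dataConfig>
  <!-- Binary data source, used by the Tika entity to read PDF/Word streams -->
  <dataSource type="BinFileDataSource" name="bin"/>
  <document>
    <!-- Outer entity: walks the directory and emits one record per file.
         baseDir and fileName are hypothetical placeholders. -->
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="/path/to/content" fileName=".*\.(pdf|doc)"
            recursive="true" rootEntity="false" dataSource="null">
      <!-- Inner entity: pulls each file through BinFileDataSource and
           hands it to Tika for text extraction. -->
      <entity name="tika" processor="TikaEntityProcessor"
              url="${f.fileAbsolutePath}" format="text" dataSource="bin">
        <!-- "text" on the right is a hypothetical schema field name -->
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The outer entity sets rootEntity="false" so that one Solr document is created per file handled by the inner entity, rather than one per directory listing row. For plain text files, the same layout would use FileDataSource in place of BinFileDataSource, as noted above.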