You should not have to do anything with Maven; the instructions you followed date from the 1.4.1 days.
Assuming you're working with a 3.x build, here's a data-config that worked for me with just a straight distro. But note a couple of things:

1> For simplicity, I changed the schema.xml to NOT require the id field. You'll probably have to change this back and select a good <uniqueKey>.
2> I had to add this line to solrconfig.xml to find the path:
   <lib dir="../../dist/" regex="apache-solr-dataimporthandler-extras-\d.*\.jar"/>
3> If this all works without errors in the Solr log and you still can't find anything, be sure you issue a commit.

Best
Erick

<dataConfig>
  <dataSource name="bin" type="BinFileDataSource"/>
  <document>
    <entity baseDir="/Users/Erick/testdocs" fileName=".*pdf" name="sd"
            processor="FileListEntityProcessor" recursive="true" rootEntity="false">
      <entity dataSource="bin" format="text" name="tika-test"
              processor="TikaEntityProcessor" url="${sd.fileAbsolutePath}">
        <field column="Author" meta="true" name="author"/>
        <field column="Content-Type" meta="true" name="title"/>
        <!-- field column="title" name="title" meta="true"/ -->
        <field column="text" name="text"/>
      </entity>
      <!-- field column="fileLastModified" name="date" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss" / -->
      <field column="fileSize" meta="true" name="size"/>
    </entity>
  </document>
</dataConfig>

On Fri, Feb 17, 2012 at 9:35 AM, alessio crisantemi <alessio.crisant...@gmail.com> wrote:

> thanks gora for your help.
> I installed Maven and downloaded Tika following the guide: But I have an
> error during the build of Tika about 'tika compiler', and the maven
> installation of Tika is stopped.
>
> there is another way?
> thank you
> a.
>
> 2012/2/16 Gora Mohanty <g...@mimirtech.com>
>
>> On 16 February 2012 21:37, alessio crisantemi
>> <alessio.crisant...@gmail.com> wrote:
>> > here the log:
>> >
>> >
>> > org.apache.solr.handler.dataimport.DataImporter doFullImport
>> > Grave: Full Import failed
>> > org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir' is
>> > a required attribute Processing Document # 1
>> [...]
>>
>> The exception message above is pretty clear. You need to define a
>> baseDir attribute for the second entity.
>>
>> However, even if you fix this, the setup will *not* work for indexing
>> PDFs. Did you read the URLs that I sent earlier?
>>
>> Regards,
>> Gora
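
For reference, point 2> in the reply at the top can be sketched as a solrconfig.xml fragment. Only the <lib> line is taken from that message; the handler name "/dataimport" and the file name "data-config.xml" are assumptions chosen for illustration, and an actual solrconfig.xml will contain many other settings that are omitted here.

```xml
<!-- Minimal sketch of the solrconfig.xml pieces involved, not a complete file.
     Only the <lib> line is from the message above; the rest is assumed. -->
<config>
  <!-- Load the DataImportHandler extras jar from the distro's dist/ directory -->
  <lib dir="../../dist/" regex="apache-solr-dataimporthandler-extras-\d.*\.jar"/>

  <!-- Register the DataImportHandler; "/dataimport" and "data-config.xml"
       are assumed names, adjust them to your own setup -->
  <requestHandler name="/dataimport"
                  class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </requestHandler>
</config>
```

With a handler registered along these lines, a full import would typically be started by requesting the handler path with command=full-import, after which the Solr log should show whether the PDFs were processed.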