Yes, I put all the files in one directory, and I have checked the file names using code.
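A check of this kind could look roughly like the Python sketch below. It is only an illustration, not the code that was actually run: the function name `find_duplicate_names` is hypothetical, and it assumes that names should be compared case-insensitively and that subdirectories are scanned as well, since the entity is configured with recursive="true".

```python
import os
from collections import defaultdict


def find_duplicate_names(base_dir):
    """Return {lowercased file name: [full paths]} for PDF names seen more than once."""
    paths_by_name = defaultdict(list)
    # Walk subdirectories too, mirroring recursive="true" in the data config.
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        for name in filenames:
            # Assumption: only files ending in .pdf (any letter case) matter.
            if name.lower().endswith(".pdf"):
                paths_by_name[name.lower()].append(os.path.join(dirpath, name))
    # Keep only the names that map to more than one path.
    return {name: paths for name, paths in paths_by_name.items() if len(paths) > 1}
```

Calling `find_duplicate_names("H:/pdf/cls_1_16800_OCRed/1")` would return an empty dict if every file name under that directory is unique; any entry it does return lists the paths that would end up sharing one `id` value.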
At 2012-02-09 20:45:49, "Jan Høydahl" <jan....@cominvent.com> wrote:
>Hi,
>
>Are you 100% sure that the filename is globally unique, since you use it as
>the uniqueKey?
>
>--
>Jan Høydahl, search solution architect
>Cominvent AS - www.cominvent.com
>Solr Training - www.solrtraining.com
>
>On 9. feb. 2012, at 08:30, 荣康 wrote:
>
>> Hey,
>> I am using Solr as my search engine to search my PDF files. I have 18219
>> files (with different file names), and all the files are in the same
>> directory. But when I use Solr to import the files into the index using the
>> DataImport method, Solr reports that it imported only 17233 files. It's very
>> strange. This problem has stopped our project for a few days. I can't
>> handle it.
>>
>> Please help me!
>>
>> Schema.xml
>>
>> <fields>
>> <field name="text" type="text" indexed="true" multiValued="true"
>> termVectors="true" termPositions="true" termOffsets="true"/>
>> <field name="filename" type="filenametext" indexed="true" required="true"
>> termVectors="true" termPositions="true" termOffsets="true"/>
>> <field name="id" type="string" stored="true"/>
>> </fields>
>> <uniqueKey>id</uniqueKey>
>> <copyField source="filename" dest="text"/>
>>
>> and
>>
>> <dataConfig>
>> <dataSource type="BinFileDataSource" name="bin"/>
>> <document>
>> <entity name="f" processor="FileListEntityProcessor" recursive="true"
>> rootEntity="false"
>> dataSource="null" baseDir="H:/pdf/cls_1_16800_OCRed/1"
>> fileName=".*\.(PDF)|(pdf)|(Pdf)|(pDf)|(pdF)|(PDf)|(PdF)|(pDF)"
>> onError="skip">
>>
>> <entity name="tika-test" processor="TikaEntityProcessor"
>> url="${f.fileAbsolutePath}" format="text" dataSource="bin" onError="skip">
>> <field column="text" name="text"/>
>> </entity>
>> <field column="file" name="id"/>
>> <field column="file" name="filename"/>
>> </entity>
>> </document>
>> </dataConfig>
>>
>> Sincerely,
>> Rong Kang