The Tika support inside DIH (TikaEntityProcessor) does not support wildcard mapping of metadata fields. If you are not planning to do any inner-entity content parsing, you may be better off using ExtractingRequestHandler with its uprefix parameter.
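As a rough sketch of what that could look like: uprefix prepends a prefix to every extracted field that is not defined in the schema, so a single dynamic field can catch all metadata. The handler path /update/extract, the attr_ prefix, the text_general field type and the fmap.content mapping below are assumptions modelled on the stock Solr example configuration, not something taken from your setup.

```xml
<!-- solrconfig.xml (fragment): assumed handler name and defaults -->
<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- send the extracted body to the "text" field -->
    <str name="fmap.content">text</str>
    <!-- normalise metadata names, e.g. Author becomes author -->
    <str name="lowernames">true</str>
    <!-- prefix every field that is not defined in the schema -->
    <str name="uprefix">attr_</str>
  </lst>
</requestHandler>

<!-- schema.xml (fragment): catch-all for the prefixed metadata fields -->
<dynamicField name="attr_*" type="text_general"
              indexed="true" stored="true" multiValued="true"/>
```

With something like this in place, metadata you have mapped explicitly (author, title) lands in its own field, and everything else ends up in attr_* fields without being listed in advance.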
Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Sat, May 25, 2013 at 4:44 AM, Gian Maria Ricci <alkamp...@nablasoft.com> wrote:
> Hi to everyone,
>
> I've configured import of a document folder with FileListEntityProcessor,
> everything went smoothly on the first try, but I have a simple question. I'm
> able to map metadata without any problem, but I'd like to import into my
> index all metadata, not only those I've configured with field nodes. In
> this example I've imported Author and title, but I do not know in advance
> which metadata a document could have and I wish to have all of them inside
> my index.
>
> Here is my import config. It is my first try at importing with Tika and
> I'm probably missing something simple.
>
> <dataConfig>
>   <dataSource type="BinFileDataSource" />
>   <document>
>     <entity name="files" dataSource="null" rootEntity="false"
>             processor="FileListEntityProcessor"
>             baseDir="c:/temp/docs" fileName=".*\.(doc)|(pdf)|(docx)"
>             onError="skip"
>             recursive="true">
>       <field column="file" name="id" />
>       <field column="fileAbsolutePath" name="path" />
>       <field column="fileSize" name="size" />
>       <field column="fileLastModified" name="lastModified" />
>
>       <entity name="documentImport"
>               processor="TikaEntityProcessor"
>               url="${files.fileAbsolutePath}"
>               format="text">
>         <field column="file" name="fileName"/>
>         <field column="Author" name="author" meta="true"/>
>         <field column="title" name="title" meta="true"/>
>         <field column="text" name="text"/>
>       </entity>
>     </entity>
>   </document>
> </dataConfig>
>
> --
> Gian Maria Ricci