: > I don't think DIH can do that, but who knows, let's see what others say.
: Looks like the ExtractingRequestHandler uses Tika as well. I might just use
: this but I'm wondering if there will be a large performance difference between
: using it to batch content in over rolling my own Transformer?

I'm confused ... You're using DIH, and some of your fields are URLs to
documents that you want to parse with Tika?

Why would you need a custom Transformer?

http://wiki.apache.org/solr/DataImportHandler#Tika_Integration
http://wiki.apache.org/solr/TikaEntityProcessor

-Hoss
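
For illustration only, here is a rough, untested sketch of the kind of data-config.xml the wiki pages above describe: an outer entity that reads rows from a database, and a nested TikaEntityProcessor entity that fetches and parses the document each row points to. The JDBC driver and connection string, the "docs" table, its "id" and "url" columns, and the Solr field name "content" are all made-up placeholders, not anything from the original poster's setup.

```xml
<dataConfig>
  <!-- Hypothetical database holding one row per document; driver and URL are placeholders -->
  <dataSource name="db" type="JdbcDataSource"
              driver="org.example.Driver"
              url="jdbc:example://localhost/mydb"/>
  <!-- Binary data source used to fetch each document over HTTP -->
  <dataSource name="bin" type="BinURLDataSource"/>

  <document>
    <!-- Outer entity: assumed table "docs" with columns "id" and "url" -->
    <entity name="doc" dataSource="db" query="SELECT id, url FROM docs">
      <field column="id" name="id"/>

      <!-- Nested entity: Tika parses whatever the row's url column points to -->
      <entity name="tika" processor="TikaEntityProcessor"
              dataSource="bin" url="${doc.url}" format="text">
        <!-- "text" is the column TikaEntityProcessor emits for the extracted body;
             "content" is an assumed field in schema.xml -->
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

If something along these lines fits the use case, the parsing happens inside DIH itself, so no custom Transformer should be needed.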