Hi, I have a lot of non-standard IBM RSS feeds that need to be crawled (via ManifoldCF v1.1.1) and indexed into Solr 4.0 final. The problem is that we also need the additional non-standard metadata from those feeds in Solr. I've confirmed via Fiddler that ManifoldCF is indeed sending all the appropriate metadata, but something on the Solr side is dropping all of it. It's either Tika, ROME, or something else in Solr. See this Tika post for more details: <http://lucene.472066.n3.nabble.com/how-to-add-more-metadata-to-tika-extraction-td4043417.html#a4043456>
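To make the goal concrete, here is a minimal sketch of the kind of extraction I'm after. It is plain Java using only the JDK's XML parser, not Tika or ROME, and it is not how Solr does it internally. The class name RssExtraFields, the "ext" prefix, its namespace URI, and the sample feed are all made up for illustration. It simply treats every namespaced child of the first <item> as "additional metadata", on the assumption that core RSS 2.0 elements carry no namespace.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class RssExtraFields {

    // Collects the namespaced (extension) child elements of the first <item>,
    // keyed by qualified name, e.g. "ext:product" -> "Example Product".
    // Core RSS 2.0 elements (title, link, ...) have no namespace and are skipped.
    static Map<String, String> extract(String rssXml) {
        Map<String, String> fields = new LinkedHashMap<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so getNamespaceURI() is populated
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rssXml)));

            NodeList items = doc.getElementsByTagName("item");
            if (items.getLength() == 0) {
                return fields;
            }
            NodeList children = items.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE
                        && child.getNamespaceURI() != null) {
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample feed; the "ext" namespace is an assumption.
        String sample =
                "<rss version=\"2.0\" xmlns:ext=\"http://example.com/ext\">"
              + "<channel><title>Sample</title><item>"
              + "<title>Entry one</title>"
              + "<link>http://example.com/1</link>"
              + "<ext:product>Example Product</ext:product>"
              + "<ext:severity>high</ext:severity>"
              + "</item></channel></rss>";
        extract(sample).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The pairs this returns (ext:product, ext:severity in the sample) are the sort of fields that currently go missing somewhere between the ManifoldCF request and the indexed Solr document.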
So, is there a way to configure Tika (or ROME, which handles the RSS parsing) to capture the additional metadata? I read that the Tika config file is deprecated or obsolete. Is that true? Thanks