(FYI: in the future, please start a new thread with an appropriate subject line when you ask questions -- you probably would have gotten a lot more responses from people interested in Tika and SolrCell if they could tell that this email was about SolrCell)
: I found that Tika read the html and extract metadata like <meta name="id"
: content="12"> from my htmls but my documents has an already an id setted by
: literal.id=10.
:
: I tried to map the id from Tika by fmap.id=ignored_ but it ignore also my
: literal.id

Hmmmm, yeah: that seems like an odd order of operations, but it's documented on the wiki, so evidently it's intentional...

http://wiki.apache.org/solr/ExtractingRequestHandler#Order_of_field_operations

My best suggestions:

 * use the capture param to restrict what gets extracted (it's probably possible to write an XPath query that selects everything *except* metadata[id])
 * change the name of your uniqueKey field to something other than "id" so it's less likely to collide with a value from the document.

I also opened two Jira issues that you may want to post comments in...

https://issues.apache.org/jira/browse/SOLR-1633
https://issues.apache.org/jira/browse/SOLR-1634

-Hoss
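
For illustration only, here is a rough sketch of what the second suggestion might look like from the client side. It assumes the uniqueKey field has been renamed to "doc_id" (a made-up name), that Solr is running at http://localhost:8983/solr with the extract handler at /update/extract, and that the schema has an ignored_* dynamic field as in the original question. The helper build_extract_url is hypothetical; it only builds the request URL and does not send anything.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, unique_key="doc_id"):
    """Build a SolrCell extract URL (sketch; field names are assumptions).

    unique_key is the hypothetical renamed uniqueKey field. Because it is
    no longer called "id", the fmap.id mapping below only affects the "id"
    value Tika pulls out of the HTML <meta> tag, not the literal value.
    """
    params = [
        # Value supplied by the client for the (renamed) uniqueKey field.
        (f"literal.{unique_key}", str(doc_id)),
        # Send the "id" extracted by Tika to an ignored field.
        ("fmap.id", "ignored_"),
        ("commit", "true"),
    ]
    return f"{base_url}/update/extract?{urlencode(params)}"


url = build_extract_url("http://localhost:8983/solr", 10)
print(url)
```

The file itself would still need to be POSTed to that URL (for example with curl -F), and schema.xml would need the matching uniqueKey change; neither step is shown here.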