Re: Solr 4.0 is stripping XML format from RSS content field

2013-10-01 Thread eShard
If anyone is interested, I resolved this a long time ago. I used a Data Import Handler (DIH) instead and it worked beautifully. DIH is very forgiving: it takes whatever XML data is there and injects it into the Solr index. It's a lot faster than crawling too. You use XPath to map the field
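
For anyone landing here later, a DIH setup for an RSS feed might look roughly like the sketch below. This is not the exact config from the original setup: the feed URL, the entity name, and the Solr field names (title, link, content) are all placeholders, and it assumes a plain RSS 2.0 layout with items under /rss/channel/item.

```xml
<!-- data-config.xml: minimal sketch, all names and the URL are assumptions -->
<dataConfig>
  <!-- URLDataSource fetches the feed over HTTP -->
  <dataSource type="URLDataSource" />
  <document>
    <!-- One Solr document is created per element matched by forEach -->
    <entity name="rssFeed"
            pk="link"
            url="http://example.com/feed.xml"
            processor="XPathEntityProcessor"
            forEach="/rss/channel/item">
      <!-- Each field maps an XPath inside the item to a Solr field -->
      <field column="title"   xpath="/rss/channel/item/title" />
      <field column="link"    xpath="/rss/channel/item/link" />
      <field column="content" xpath="/rss/channel/item/description" />
    </entity>
  </document>
</dataConfig>
```

Note that XPathEntityProcessor only understands a limited subset of XPath, and the column names have to match fields defined in your schema.xml, so adjust both to your own feed and schema.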

Solr 4.0 is stripping XML format from RSS content field

2013-02-11 Thread eShard
Hi, I'm running Solr 4.0 final with ManifoldCF 1.1, and I verified via Fiddler that ManifoldCF is indeed sending the content field from an RSS feed that contains XML data. However, when I query the index, the content field is there with just the data; the XML structure is gone. Does anyone know how to st