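It sounds like your code is getting an error page instead of the RSS
feed, and that error page is malformed HTML. That would explain the
parser complaining about </head> and </meta>, which are HTML tags and
not something an RSS feed would contain.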

Is there a proxy on the network where you run the code? If so, your
browser may be going through the proxy while your DIH code is not. You
could try running something like Wireshark, Fiddler or similar to
inspect the request/response you are actually getting.
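
If a packet sniffer is not at hand, a small script can do a similar
check. The sketch below is only an illustration, not part of DIH: it
fetches the feed URL from your config with proxies explicitly disabled
(on the assumption that this is closer to what DIH does than what a
proxied browser does) and reports whether the body parses as XML with
an <rss> root. The names looks_like_rss and fetch and the User-Agent
string are made up for this example.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# URL taken from the data-config in the original question.
FEED_URL = "http://www.alarabiya.net/.mrss/ar.xml"


def looks_like_rss(body: bytes) -> bool:
    """Return True if body is well-formed XML whose root element is <rss>."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Malformed HTML (e.g. an unclosed <meta>) typically ends up here.
        return False
    return root.tag == "rss"


def fetch(url: str, timeout: float = 10.0):
    """Fetch url with no proxy; return (status, content_type, body)."""
    # An empty ProxyHandler mapping turns off system/environment proxies.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(
        url, headers={"User-Agent": "feed-check/0.1"}  # hypothetical UA
    )
    try:
        with opener.open(request, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type")
            return response.status, content_type, response.read()
    except urllib.error.HTTPError as err:
        # Error pages still carry a body worth looking at.
        return err.code, err.headers.get("Content-Type"), err.read()
```

Calling fetch(FEED_URL) on the Solr machine and printing the status,
the Content-Type and the first few hundred bytes of the body should
show whether you are getting the feed or an HTML error page; passing
the body to looks_like_rss gives a quick yes/no.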

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Sat, Jun 7, 2014 at 10:52 AM, ienjreny <ismaeel.enjr...@gmail.com> wrote:
> Hello,
>
> I am using the following script to index RSS items
>
> <dataSource type="URLDataSource" encoding="UTF-8" />
>   <document>
>     <entity name="slashdot"
>             pk="link"
>             url="http://www.alarabiya.net/.mrss/ar.xml"
>             processor="XPathEntityProcessor"
>             forEach="/rss/channel/item">
>
>       <field column="category_name" name="category_name"
>              xpath="/rss/channel/item/title" />
>       <field column="link" name="url" xpath="/rss/channel/item/link" />
>
>     </entity>
>   </document>
>
> But I am facing the following error
>
> Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag
> </head>; expected </meta>.
>
> Can anybody help?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Error-when-using-URLDataSource-to-index-RSS-items-tp4140548.html
> Sent from the Solr - User mailing list archive at Nabble.com.
