Date fields in Solr accept only one specific format (ISO 8601 in UTC, e.g.
1995-12-31T23:59:59Z), see:
http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html
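
One thing that may be worth checking: as far as I understand it, the
dateTimeFormat attribute of DateFormatTransformer describes the incoming
string, not the form Solr stores, so for RSS pubDate values a pattern along
the lines of "EEE, dd MMM yyyy HH:mm:ss Z" would be the one to try. Below is a
minimal Java sketch of that idea using plain java.text.SimpleDateFormat (the
stack trace shows the transformer going through java.text.DateFormat.parse).
The class name RssDateCheck, the helper names, and the use of Locale.ENGLISH
are my own assumptions for illustration, not anything taken from Solr.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, only for checking patterns outside of Solr.
final class RssDateCheck {
    // Describes the RSS input, e.g. "Mon, 06 Dec 2010 23:31:38 +0000".
    static final String RSS_PATTERN = "EEE, dd MMM yyyy HH:mm:ss Z";
    // The pattern from the posted config; it resembles Solr's own date form,
    // not the feed's.
    static final String CONFIG_PATTERN = "yyyy-MM-dd'T'hh:mm:ss'Z'";

    // Returns the parsed date, or null if the value does not match the pattern.
    static Date tryParse(String pattern, String value) {
        try {
            // Locale.ENGLISH is an assumption so "Mon"/"Dec" parse on any machine.
            return new SimpleDateFormat(pattern, Locale.ENGLISH).parse(value);
        } catch (ParseException e) {
            return null;
        }
    }

    // Renders a date in the UTC ISO 8601 form that Solr date fields expect.
    static String toSolrDate(Date date) {
        SimpleDateFormat out =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ENGLISH);
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        return out.format(date);
    }

    public static void main(String[] args) {
        String pubDate = "Mon, 06 Dec 2010 23:31:38 +0000";
        System.out.println("config pattern parses: "
            + (tryParse(CONFIG_PATTERN, pubDate) != null));
        Date parsed = tryParse(RSS_PATTERN, pubDate);
        System.out.println("rss pattern parses:    " + (parsed != null));
        if (parsed != null) {
            System.out.println("as Solr date:          " + toSolrDate(parsed));
        }
    }
}
```

If the second pattern parses your feed's dates here, the same string should be
a reasonable candidate for dateTimeFormat on the pubdate field in the DIH
config.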

Best
Erick

On Sat, Dec 11, 2010 at 6:32 PM, Adam Estrada <estrada.adam.gro...@gmail.com> wrote:

> All,
>
> I am ingesting a lot of RSS feeds as part of my application and I keep
> getting the same error.
>
> WARNING: Could not parse a Date field
> java.text.ParseException: Unparseable date: "Mon, 06 Dec 2010 23:31:38 +0000"
>        at java.text.DateFormat.parse(Unknown Source)
>        at org.apache.solr.handler.dataimport.DateFormatTransformer.process(DateFormatTransformer.java:89)
>        at org.apache.solr.handler.dataimport.DateFormatTransformer.transformRow(DateFormatTransformer.java:69)
>        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:195)
>        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:241)
>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357)
>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:383)
>        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
>        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
>        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
>        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
>        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)
> Dec 11, 2010 6:25:47 PM org.apache.solr.handler.dataimport.DocBuilder finish
> INFO: Import completed successfully
> Dec 11, 2010 6:25:47 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true,expungeDeletes=false)
>
> Are there any tips or tricks to getting standard RSS <update> fields to
> import correctly?
>
> An example for a DIH config XML file is as follows:
>
>      <entity name="CBS"
>        pk="link"
>        datasource="filedatasource"
>        url="http://feeds.cbsnews.com/CBSNewsMain?format=xml"
>        processor="XPathEntityProcessor"
>        forEach="/rss/channel | /rss/channel/item"
>        transformer="DateFormatTransformer,HTMLStripTransformer">
>        <field column="source"       xpath="/rss/channel/title" commonField="true" />
>        <field column="source-link"  xpath="/rss/channel/link" commonField="true" />
>        <field column="subject"      xpath="/rss/channel/description" commonField="true" />
>        <field column="title"        xpath="/rss/channel/item/title" />
>        <field column="link"         xpath="/rss/channel/item/link" />
>        <field column="description"  xpath="/rss/channel/item/description" stripHTML="true" />
>        <field column="creator"      xpath="/rss/channel/item/creator" />
>        <field column="item-subject" xpath="/rss/channel/item/subject" />
>        <field column="author"       xpath="/rss/channel/item/author" />
>        <field column="comments"     xpath="/rss/channel/item/comments" />
>        <field column="pubdate"      xpath="/rss/channel/item/pubDate" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
>      </entity>
>
> Any tips on this would be really appreciated as I need to query based on the
> date the article was published.
>
> Thanks,
> Adam
>
