Dates in Solr have a very specific format (ISO 8601 in UTC, for example 1995-12-31T23:59:59Z), see: http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html
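
The stack trace points at DateFormatTransformer, which hands the dateTimeFormat pattern to java.text.SimpleDateFormat to parse the incoming value. The pattern therefore has to describe the date as the feed sends it, not the format Solr stores. Below is a minimal sketch, assuming the feed always sends RFC 822 style dates like the one in the error message. The class name RssDateDemo and the method toSolrDate are made up for illustration and are not part of Solr.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class RssDateDemo {
    // Assumed input layout: RSS (RFC 822 style) dates such as
    // "Mon, 06 Dec 2010 23:31:38 +0000".
    static final String RSS_PATTERN = "EEE, dd MMM yyyy HH:mm:ss Z";

    // Canonical Solr date layout; the trailing 'Z' is a literal, so the
    // formatter below is pinned to UTC explicitly.
    static final String SOLR_PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    /** Hypothetical helper: converts an RSS date string to Solr's date format. */
    public static String toSolrDate(String rssDate) {
        try {
            // Locale.ENGLISH so "Mon" and "Dec" parse regardless of the JVM default locale.
            SimpleDateFormat in = new SimpleDateFormat(RSS_PATTERN, Locale.ENGLISH);
            Date parsed = in.parse(rssDate);

            SimpleDateFormat out = new SimpleDateFormat(SOLR_PATTERN, Locale.ENGLISH);
            out.setTimeZone(TimeZone.getTimeZone("UTC"));
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not an RSS date: " + rssDate, e);
        }
    }

    public static void main(String[] args) {
        // The value from the error message in this thread.
        System.out.println(toSolrDate("Mon, 06 Dec 2010 23:31:38 +0000"));
    }
}
```

If that pattern holds up against the feeds being imported, the same string ("EEE, dd MMM yyyy HH:mm:ss Z") would go into the dateTimeFormat attribute of the pubdate field. This is untested against the CBS feed, so treat it as a starting point.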

Best
Erick

On Sat, Dec 11, 2010 at 6:32 PM, Adam Estrada <estrada.adam.gro...@gmail.com> wrote:

> All,
>
> I am ingesting a lot of RSS feeds as part of my application and I keep
> getting the same error.
>
> WARNING: Could not parse a Date field
> java.text.ParseException: Unparseable date: "Mon, 06 Dec 2010 23:31:38 +0000"
>         at java.text.DateFormat.parse(Unknown Source)
>         at org.apache.solr.handler.dataimport.DateFormatTransformer.process(DateFormatTransformer.java:89)
>         at org.apache.solr.handler.dataimport.DateFormatTransformer.transformRow(DateFormatTransformer.java:69)
>         at org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:195)
>         at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:241)
>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357)
>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:383)
>         at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
>         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
>         at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
>         at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
>         at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)
> Dec 11, 2010 6:25:47 PM org.apache.solr.handler.dataimport.DocBuilder finish
> INFO: Import completed successfully
> Dec 11, 2010 6:25:47 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true,expungeDeletes=false)
>
> Are there any tips or tricks to getting standard RSS <update> fields to
> import correctly?
>
> An example for a DIH config XML file is as follows:
>
> <entity name="CBS"
>         pk="link"
>         datasource="filedatasource"
>         url="http://feeds.cbsnews.com/CBSNewsMain?format=xml"
>         processor="XPathEntityProcessor"
>         forEach="/rss/channel | /rss/channel/item"
>         transformer="DateFormatTransformer,HTMLStripTransformer">
>     <field column="source" xpath="/rss/channel/title" commonField="true" />
>     <field column="source-link" xpath="/rss/channel/link" commonField="true" />
>     <field column="subject" xpath="/rss/channel/description" commonField="true" />
>     <field column="title" xpath="/rss/channel/item/title" />
>     <field column="link" xpath="/rss/channel/item/link" />
>     <field column="description" xpath="/rss/channel/item/description" stripHTML="true" />
>     <field column="creator" xpath="/rss/channel/item/creator" />
>     <field column="item-subject" xpath="/rss/channel/item/subject" />
>     <field column="author" xpath="/rss/channel/item/author" />
>     <field column="comments" xpath="/rss/channel/item/comments" />
>     <field column="pubdate" xpath="/rss/channel/item/pubDate" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
> </entity>
>
> Any tips on this would be really appreciated as I need to query based on
> the date the article was published.
>
> Thanks,
> Adam