Hi,
Here is the exception I received:
SEVERE: org.apache.solr.common.SolrException: Error while creating field 'date_df{type=trickyDate,properties=indexed,stored,omitNorms,omitTf,multiValued,sortMissingLast}' from value 'c1991.'
    at org.apache.solr.schema.FieldType.createField(FieldType.java:190)
    at org.apache.solr.schema.SchemaField.createField(SchemaField.java:94)
    at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:244)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:140)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: org.apache.solr.common.SolrException: Invalid Date String:'c1991.'
    at org.apache.solr.schema.DateField.parseMath(DateField.java:167)
    at org.apache.solr.schema.DateField.toInternal(DateField.java:138)
    at org.apache.solr.schema.FieldType.createField(FieldType.java:188)
    ... 27 more
My expectation is that a field type behaves like this:
0) I specify a field type as the storage type
1) I give it a string
2) tokenizers and filters parse the string into a given form
3) Solr handles the result as the given type
For example:
0) I set the field type to "solr.DateField"
1) the input string is "1991."
2) the analyzer creates "1991-01-01T00:00:00Z"
3) as this is the normal input form of the date type, Solr
indexes it.
It seems, however, that it is the input string ("1991."), and not
the analyzer's output ("1991-01-01T00:00:00Z"), that must match what
solr.DateField expects.
So the question is: is there a way to "preprocess" the inputs inside
Solr, or is this doable only on the client side?
Péter
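For the client-side option raised in the question, a minimal sketch in Python might look like the following. It is only an illustration, not something from the thread: the function name `to_solr_date` is hypothetical, and it assumes that the first run of four digits in the catalog string is the year and that January 1st, midnight UTC is an acceptable stand-in for the unknown month and day.

```python
import re
from typing import Optional

# Assumption: the first run of four consecutive digits is the year.
_YEAR = re.compile(r"\d{4}")


def to_solr_date(raw: str) -> Optional[str]:
    """Turn a catalog string such as 'c1991.' into a date string in the
    'YYYY-MM-DDThh:mm:ssZ' form, or return None if no year is found."""
    match = _YEAR.search(raw)
    if match is None:
        # No usable year: let the caller decide whether to skip the field.
        return None
    return f"{match.group(0)}-01-01T00:00:00Z"


if __name__ == "__main__":
    for value in ("c1991.", "[c1991.]", "june 12, 1991", "n.d."):
        print(repr(value), "->", to_solr_date(value))
```

The value returned here would be placed into the date field of the document before it is posted to Solr, so that the string reaching solr.DateField is already in the form it parses.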
From: "Grant Ingersoll" <gsing...@apache.org>
Subject: Re: date field type problem
What's the exception?
On Sep 2, 2009, at 3:00 AM, Peter Kiraly wrote:
Hi Solr users,
I have a lot of dates from a library catalog in a format that is
not compatible with solr.DateField. I wrote a new <fieldType>
definition inside schema.xml, which creates
e.g. 1991-01-01T00:00:01Z from the input string '[c1991.]'.
It works fine when I try it with typical values
in http://localhost:8983/solr/admin/analysis.jsp,
but it always throws an exception when I try to index
the records.
<fieldType name="trickyDate" class="solr.DateField"
           sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="sh..?wa \d\d? " replacement="" replace="first"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="june (\d\d), " replacement="" replace="first"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="september (\d\d), " replacement="" replace="first"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="(\D)" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^(\d{4})\d*$" replacement="$1-01-01T00:00:01"
            replace="all"/>
  </analyzer>
</fieldType>
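To check the patterns outside Solr, the filter chain above can be approximated in a few lines of Python. This is a rough sketch rather than a faithful reimplementation: the name `simulate_tricky_date` is hypothetical, Python's `re` module stands in for Java regular expressions (the two differ in details such as how `\d` treats non-ASCII digits), and `str.strip()` stands in for the trim filter.

```python
import re

# (pattern, replacement, replace_all), in the order of the <filter> elements.
# Java's "$1" group reference is written r"\1" in Python.
STEPS = [
    (r"sh..?wa \d\d? ", "", False),
    (r"june (\d\d), ", "", False),
    (r"september (\d\d), ", "", False),
    (r"(\D)", "", True),
    (r"^(\d{4})\d*$", r"\1-01-01T00:00:01", True),
]


def simulate_tricky_date(raw: str) -> str:
    """Approximate the 'trickyDate' analyzer for a single input string."""
    # The keyword tokenizer keeps the whole input as one token; the
    # lower-case and trim filters are then applied to that token.
    token = raw.lower().strip()
    for pattern, replacement, replace_all in STEPS:
        # count=0 replaces every match, count=1 only the first one.
        token = re.sub(pattern, replacement, token, count=0 if replace_all else 1)
    return token


if __name__ == "__main__":
    print(simulate_tricky_date("[c1991.]"))
```

With the patterns exactly as posted, the last replacement carries no trailing "Z", so this sketch yields 1991-01-01T00:00:01 for '[c1991.]' rather than 1991-01-01T00:00:01Z.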
It is quite possible that I have misunderstood something. What I
would like to do is to 'normalize' the input data somehow, and I thought
that this would be more efficient on the Solr side than in the client.
Do you have any advice on how I should proceed?
Péter
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/