Hmm.  We haven’t changed data or the definition in YEARS now.  I'll have to do 
some more digging I guess.  Not sure re-indexing is a great thing to do though 
since this is a production setup and the database for this user is @ 50GB.  It 
would take quite a long time to reindex all that data from scratch.  Hmmmm

Thanks for the quick reply Erick!

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Monday, March 6, 2017 5:33 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Getting an error: <field> was indexed without position data; 
cannot run PhraseQuery

Usually an _s field is a "string" type, so be sure you didn't change the 
definition without completely re-indexing. In fact I generally either index to 
a new collection or remove the data directory entirely.

Right, the field isn't indexed with position information. That, combined with 
(probably) the WordDelimiterFilterFactory in text_en_splitting generating 
multiple tokens for inputs like 3799H, is what produces the phrase query.
See the admin/analysis page for how that gets broken up. Term positions are 
usually enabled by default, so I'm not quite sure why they're gone unless you 
disabled them.

But you're on the right track regardless. You have to
1> include term positions for anything that generates phrase queries
or
2> make sure you don't generate phrase queries. edismax can do this if
you have it configured to, and then there's autoGeneratePhraseQueries that you 
may find useful.

And do reindex completely from scratch if you change the definitions.
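
As a rough sketch, the two options above might look like this in schema.xml. 
The dynamicField line is the one quoted below; the fieldType attributes other 
than autoGeneratePhraseQueries are assumptions, since the actual 
text_en_splitting definition isn't shown in this thread. Treat it as a 
starting point to check against your own schema, not a drop-in fix.

```xml
<!-- Sketch only: values are illustrative assumptions, not the poster's schema. -->

<!-- Option 1: keep term positions so phrase queries can run.
     Existing segments keep whatever they were indexed with,
     so this only takes effect after a full reindex. -->
<dynamicField name="*_s" type="text_en_splitting" indexed="true" stored="true"
              omitTermFreqAndPositions="false" omitPositions="false"/>

<!-- Option 2: stop the query parser from turning the split tokens
     (3799 + H) into a phrase query. positionIncrementGap is assumed. -->
<fieldType name="text_en_splitting" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="false">
  <!-- existing <analyzer> definitions stay unchanged -->
</fieldType>
```

With option 2 the split tokens are searched as separate terms rather than as 
an adjacent pair, so matching may be looser than before; it is worth checking 
a few queries on the admin/analysis page before relying on it.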

Best,
Erick

On Mon, Mar 6, 2017 at 1:41 PM, Pouliot, Scott <scott.poul...@peoplefluent.com> 
wrote:
> We keep getting this in our Tomcat/SOLR Logs and I was wondering if a simple 
> schema change will alleviate this issue:
>
> INFO  - 2017-03-06 07:26:58.751; org.apache.solr.core.SolrCore; 
> [Client_AdvanceAutoParts] webapp=/solr path=/select 
> params={fl=candprofileid,+candid&start=0&q=*:*&wt=json&fq=issearchable:1+AND+cpentitymodifiedon:[2017-01-20T00:00:00.000Z+TO+*]+AND+clientreqid:17672+AND+folderid:132+AND+(engagedid_s:(0)+AND+atleast21_s:(1))+AND+(preferredlocations_s:(3799H))&rows=1000}
>  status=500 QTime=1480 ERROR - 2017-03-06 07:26:58.766; 
> org.apache.solr.common.SolrException; null:java.lang.IllegalStateException: 
> field "preferredlocations_s" was indexed without position data; cannot run 
> PhraseQuery (term=3799)
>                 at 
> org.apache.lucene.search.PhraseQuery$PhraseWeight.scorer(PhraseQuery.java:277)
>                 at 
> org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:351)
>                 at org.apache.lucene.search.Weight.bulkScorer(Weight.java:131)
>                 at 
> org.apache.lucene.search.BooleanQuery$BooleanWeight.bulkScorer(BooleanQuery.java:313)
>                 at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
>                 at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
>                 at 
> org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1158)
>                 at 
> org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:846)
>                 at 
> org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1004)
>                 at 
> org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1517)
>                 at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1397)
>                 at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:478)
>                 at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:461)
>                 at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
>                 at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>                 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
>                 at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
>                 at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
>                 at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
>                 at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>                 at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>                 at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
>                 at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
>                 at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
>                 at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
>                 at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>                 at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
>                 at 
> org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)
>                 at 
> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
>                 at 
> org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
>                 at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown 
> Source)
>                 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown 
> Source)
>                 at java.lang.Thread.run(Unknown Source)
>
>
> The field in question "preferredlocations_s" is not defined in schema.xml 
> explicitly, but we have a dynamicField schema entry that covers it.
>
> <dynamicField name="*_s" type="text_en_splitting" indexed="true" 
> stored="true" />
>
> Would adding omitTermFreqAndPositions="false" to this schema line help out 
> here?  Should I explicitly define this "preferredlocations_s" field in the 
> schema instead and add it there?  We do have a handful of dynamic fields that 
> all get covered by this rule, but it seems the "preferredlocations_s" field 
> is the only one throwing errors.  All it stores is a CSV string with location 
> IDs in it.
>
