I have an index which I cannot rebuild (it is very large).  The schema
includes a field 'dateorigin_sort'. This is an integer, not an 'sint'. It
sorts very quickly but does not support range queries.

Somehow the index has acquired one record, out of millions, in which this
integer field holds an empty string. I would like to isolate that record and
remove it. The field exists solely to make sorting faster, and because one
record has an empty value, any sort on the field fails.
 
Is it possible to find this record? Is there any way to differentiate
between this record and all of the other records which have real numbers
populated?  

This query will isolate records which do not have the field populated. (It
works on all field types.)
    -dateorigin_sort:[* TO *]
But since this field is an integer (not an sint), no other range query
works.
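
For reference, here is a minimal sketch of building the request URL for that
query. The base URL and the rows value are assumptions for illustration only,
and the class and method names are hypothetical; the query string itself is
the one shown above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class MissingFieldQuery {
    // Builds a /select URL for a raw query string, URL-encoding the query.
    static String selectUrl(String solrBase, String query) {
        try {
            return solrBase + "/select?q=" + URLEncoder.encode(query, "UTF-8") + "&rows=10";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed base URL; substitute the real host, port, and path.
        System.out.println(selectUrl("http://localhost:8983/solr", "-dateorigin_sort:[* TO *]"));
    }
}
```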
 
Here are the facet counts:
 
<lst name="facet_counts">
 <lst name="facet_queries"/>
 <lst name="facet_fields">
  <lst name="dateorigin_sort">
        <int name="">1</int>
        <int name="-1">17224588</int>
        <int name="-1019087976">1</int>
        <int name="-1020481693">1</int>
    and millions more

Here is the stack trace from Solr when sorting on dateorigin_sort:

HTTP Status 500 -
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
        at java.lang.Integer.parseInt(Integer.java:468)
        at java.lang.Integer.parseInt(Integer.java:497)
        at org.apache.lucene.search.FieldCacheImpl$1.parseInt(FieldCacheImpl.java:136)
        at org.apache.lucene.search.FieldCacheImpl$3.createValue(FieldCacheImpl.java:171)
        at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
        at org.apache.lucene.search.FieldCacheImpl.getInts(FieldCacheImpl.java:154)
        at org.apache.lucene.search.FieldCacheImpl.getInts(FieldCacheImpl.java:148)
        at org.apache.lucene.search.FieldSortedHitQueue.comparatorInt(FieldSortedHitQueue.java:204)
        at org.apache.lucene.search.FieldSortedHitQueue$1.createValue(FieldSortedHitQueue.java:175)
        at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
        at org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:155)
        at org.apache.lucene.search.FieldSortedHitQueue.<init>(FieldSortedHitQueue.java:56)
        at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1028)
        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:801)
        at org.apache.solr.search.SolrIndexSearcher.getDocListAndSet(SolrIndexSearcher.java:1237)
        at org.apache.solr.request.StandardRequestHandler.handleRequestBody(StandardRequestHandler.java:117)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:191)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:159)
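
As far as I can tell from the trace, the sort fills an int cache by calling
Integer.parseInt on each indexed term of the field, so the single empty term
is enough to fail the whole sort. Here is a small standalone sketch of that
failure using plain Java only (no Lucene or Solr classes). The class and
method names are made up for illustration; the term values are copied from
the facet counts above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EmptyTermDemo {
    // Returns the terms that Integer.parseInt rejects, in input order.
    static List<String> unparsableTerms(List<String> terms) {
        List<String> bad = new ArrayList<String>();
        for (String term : terms) {
            try {
                Integer.parseInt(term);
            } catch (NumberFormatException e) {
                // An empty string is not a valid integer, so it lands here.
                bad.add(term);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Term values taken from the facet output; "" is the bad record.
        List<String> terms = Arrays.asList("", "-1", "-1019087976", "-1020481693");
        System.out.println("unparsable term count: " + unparsableTerms(terms).size());
    }
}
```

Unlike this sketch, the code path in the trace does not appear to catch the
exception, which would explain why one bad term breaks every sort on the
field.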


Cheers,

Lance Norskog
