bq. but I optimized all the cores, which should rewrite every segment as 
docValues.

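Not true. Optimize is a Lucene-level force merge. Dealing with segments, i.e. 
merging and the like, is a low-level Lucene operation, and Lucene has no notion 
of a schema. So a change you made to the schema is irrelevant to merging: a 
merge only rewrites what each segment already contains, so documents indexed 
without docValues still have none afterwards.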

You need something at the Solr level that knows the schema and writes the 
missing docValues while segments are merged. Take a look at 
UninvertDocValuesMergePolicyFactory if you have Solr 7.0 or later. WARNING: I 
haven’t used that personally, and I do not know what the behavior would be on 
an index that is “mixed”, i.e. one that already has segments with some docs 
having DV entries and some not.
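
To get a feel for how mixed a core is, something like the rough sketch below 
could count the empty docs in an /export response body. It is only an 
illustration: the function name count_empty_docs is made up, the sample string 
is invented, and I am assuming the docs array sits either at the top level (as 
in the snippet you pasted) or under a "response" key.

```python
import json


def count_empty_docs(export_json):
    """Return (empty, total) for the docs in an /export response body.

    Hypothetical helper: assumes the "docs" array is either at the top
    level of the JSON or nested under a "response" key.
    """
    body = json.loads(export_json)
    # Fall back to the top-level object when there is no "response" wrapper.
    container = body.get("response", body)
    docs = container.get("docs", [])
    # A doc with no docValues for the requested field comes back as {}.
    empty = sum(1 for doc in docs if not doc)
    return empty, len(docs)


# Invented sample shaped like the output quoted in this thread.
sample = '{"numFound":4,"docs":[{},{},{},{"id":"example-id"}]}'
empty, total = count_empty_docs(sample)
print(f"{empty} of {total} docs are empty")
```

Running that against each shard leader's output would show whether the empty 
fraction lines up with the documents indexed before the schema change.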

Best,
Erick

> On May 31, 2019, at 12:35 AM, Walter Underwood <wun...@wunderwood.org> wrote:
> 
> That field was changed to docValues, but I optimized all the cores, which 
> should rewrite every segment as docValues.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On May 30, 2019, at 7:37 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>> 
>> This is odd. The only reason I know of that would happen is if there were no 
>> docValues for that field in those documents. By any chance were docValues 
>> added to an existing index without totally reindexing into a new collection?
>> 
>> What happens if you just query the collection rather than the individual 
>> core? I’m thinking of using a streaming expression as a check.
>> 
>>> On May 30, 2019, at 6:41 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>>> 
>>> 3/4 of the documents I’m getting back from /export are empty. This 
>>> collection has four shards, so I’m querying the leader core on each shard 
>>> with /export. The results start like this:
>>> 
>>> {"numFound":912370,"docs":[{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},
>>> 
>>> The final 1/4 of the results have UUIDs (the ID type). The id field is 
>>> stored as docValues. This is the URL.
>>> 
>>> http://hostname:8983/solr/decks_shard1_replica1/export?q=id:*&distrib=false&shards=shard1&fl=id&sort=id+asc
>>> 
>>> Running 6.6.2, Solr Cloud. The total number of non-null ids from all four 
>>> shards is a bit less than 1/4 of the document count.
>>> 
>>> Any ideas about what is going on?
>>> 
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>> 
>> 
> 
