Ah. So docValues are managed by Solr outside of Lucene. Interesting. That actually answers a question I had not asked yet. I was curious if it was safe to change the id field to docValues without reindexing if we never sorted on it. It looks like fetching the value won’t work until everything is reindexed.
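For anyone who wants to measure how many documents come back empty from each core, here is a minimal sketch in Python. The function name count_empty_docs is mine, not anything from Solr, and I am assuming the /export body is JSON with a "docs" list either at the top level (as in the sample quoted below) or nested under "response". It only parses a response body you have already fetched.

```python
import json


def count_empty_docs(export_json):
    """Return (empty, total) for the docs in one /export response body.

    Hypothetical helper: assumes "docs" is either at the top level of the
    JSON object or nested under a "response" key.
    """
    body = json.loads(export_json)
    # Fall back to the top-level object when there is no "response" wrapper.
    docs = body.get("response", body).get("docs", [])
    # A doc with no docValues for the requested field shows up as {}.
    empty = sum(1 for doc in docs if not doc)
    return empty, len(docs)
```

Running this against the output of each shard leader and adding up the counts should show whether the empty documents line up with the segments written before the schema change.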
It seems like this would be a useful thing to have supported, migrating a field to docValues.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On May 31, 2019, at 5:00 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> bq. but I optimized all the cores, which should rewrite every segment as
> docValues.
>
> Not true. Optimize is a Lucene level force merge. Dealing with segments, i.e.
> merging and the like, is a low-level Lucene operation and Lucene has no
> notion of a schema. So a change you made to the schema is irrelevant to
> merging.
>
> You have to have something at the Solr level that does some magic for this to
> work. Take a look at UninvertDocValuesMergePolicyFactory if you have Solr 7.0
> or later. WARNING: I haven’t used that personally, and I do not know what the
> behavior would be on an index that is “mixed”, i.e. one that already has
> segments with some docs having DV entries and some not.
>
> Best,
> Erick
>
>> On May 31, 2019, at 12:35 AM, Walter Underwood <wun...@wunderwood.org> wrote:
>>
>> That field was changed to docValues, but I optimized all the cores, which
>> should rewrite every segment as docValues.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>> On May 30, 2019, at 7:37 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>>
>>> This is odd. The only reason I know of that would happen is if there were
>>> no docValues for that field in those documents. By any chance were
>>> docValues added to an existing index without totally reindexing into a new
>>> collection?
>>>
>>> What happens if you just query the collection rather than the individual
>>> core? I’m thinking using a streaming expression as a check…..
>>>
>>>> On May 30, 2019, at 6:41 PM, Walter Underwood <wun...@wunderwood.org>
>>>> wrote:
>>>>
>>>> 3/4 of the documents I’m getting back from /export are empty. This
>>>> collection has four shards, so I’m querying the leader core on each shard
>>>> with /export. The results start like this:
>>>>
>>>> {"numFound":912370,"docs":[{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},
>>>>
>>>> The final 1/4 of the results have UUIDs (the ID type). The id field is
>>>> stored as docValues. This is the URL.
>>>>
>>>> http://hostname:8983/solr/decks_shard1_replica1/export?q=id:*&distrib=false&shards=shard1&fl=id&sort=id+asc
>>>>
>>>> Running 6.6.2, Solr Cloud. The total number of non-null ids from all four
>>>> shards is a bit less than 1/4 of the document count.
>>>>
>>>> Any ideas about what is going on?
>>>>
>>>> wunder
>>>> Walter Underwood
>>>> wun...@wunderwood.org
>>>> http://observer.wunderwood.org/ (my blog)
>>>>
>>>
>>
>