After running a few replications (replicationOnOptimize), on the master I see:
- on the filesystem, files that belong to two segments (I suppose the
older one is just a commit point)
- on the admin console, a single segment
(SolrIndexReader{this=4f2452c6,r=ReadOnlyDirectoryReader@4f2452c6,refCnt=1,segments=1})
but on the slave:
- on the filesystem, I see files belonging only to the latest segment
(which in this case is called *_mx*)
- on the admin console, I see just one segment
- on the stats page, I see FieldCache references to both the new and the
previous (old) segment (*_mv* and *_mx*):
entries_count : 11
...
entry#2 :
'MMapIndexInput(path="/home/agazzarini/solr-indexes/slave-data-dir/cbt/main/data/index/_mv.frq")'=>'title_sort',class
org.apache.lucene.search.FieldCache$StringIndex,null=>org.apache.lucene.search.FieldCache$StringIndex#883647064
entry#3 :
'MMapIndexInput(path="/home/agazzarini/solr-indexes/slave-data-dir/cbt/main/data/index/_mv.frq")'=>'author_sort',class
org.apache.lucene.search.FieldCache$StringIndex,null=>org.apache.lucene.search.FieldCache$StringIndex#1606785643
...
entry#7 :
'MMapIndexInput(path="/home/agazzarini/solr-indexes/slave-data-dir/cbt/main/data/index/_mx.frq")'=>'title_sort',class
org.apache.lucene.search.FieldCache$StringIndex,null=>org.apache.lucene.search.FieldCache$StringIndex#144024863
entry#8 :
'MMapIndexInput(path="/home/agazzarini/solr-indexes/slave-data-dir/cbt/main/data/index/_mx.frq")'=>'author_sort',class
org.apache.lucene.search.FieldCache$StringIndex,null=>org.apache.lucene.search.FieldCache$StringIndex#411802272
...
Why? Those are my sort fields and they occupy a lot of memory (doubled in
this case, but sometimes I see references to three or four "old"
segments).
Is there something I can do to remove those old references? I tried
reloading the core and it seems the old references are then discarded
(i.e. garbage collected), but I don't think that is a good workaround: I
would rather not reload the core after every replication cycle.
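
To quantify the duplication, the stats-page entries can be grouped by segment name. The following is only a minimal sketch in Python, assuming the entries have been pasted as text in the format shown above. The function name, the regular expression and the sample path are illustrative assumptions and not part of Solr.

```python
import re
from collections import defaultdict

# Captures the segment name and the cached field from one stats-page entry,
# e.g.  .../index/_mv.frq")'=>'title_sort'
ENTRY_RE = re.compile(r"""/(_[0-9a-z]+)\.\w+"\)'=>'(\w+)'""")


def fields_by_segment(stats_text):
    """Return {segment name: sorted list of fields cached for that segment}."""
    # Drop mail-style emphasis asterisks, if any, before matching.
    cleaned = stats_text.replace("*", "")
    grouped = defaultdict(set)
    for segment, field in ENTRY_RE.findall(cleaned):
        grouped[segment].add(field)
    return {segment: sorted(fields) for segment, fields in grouped.items()}


# Hypothetical sample in the same shape as the stats output (path shortened).
SAMPLE = """
'MMapIndexInput(path="/path/to/data/index/_mv.frq")'=>'title_sort',class
'MMapIndexInput(path="/path/to/data/index/_mv.frq")'=>'author_sort',class
'MMapIndexInput(path="/path/to/data/index/_mx.frq")'=>'title_sort',class
"""

if __name__ == "__main__":
    for segment, fields in sorted(fields_by_segment(SAMPLE).items()):
        print(segment, fields)
```

Any segment that appears here but no longer exists on the slave filesystem is one of the "old" references described above.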
Best
Andrea