It's my understanding that if my mergeFactor is 10, then there
shouldn't be more than 11 segments in my index directory (10 segments,
plus an additional segment if a merge is in progress). It would seem
to follow that there shouldn't be more than 11 fdt files, 11 tis
files, etc. However, I'm looking at one of my indexes now, and this
doesn't seem to be the case. Here are the tis files for this index,
for instance:

07/22/2008  07:49 PM        77,925,180 _1je.tis
07/23/2008  02:57 AM        65,988,651 _256.tis
07/23/2008  04:18 AM        13,159,578 _29t.tis
07/23/2008  05:08 AM        10,146,941 _2cw.tis
07/23/2008  05:39 AM         6,749,665 _2el.tis
07/23/2008  06:24 AM        12,274,012 _2he.tis
07/23/2008  07:01 AM        14,069,531 _2kh.tis
07/23/2008  07:53 AM        13,795,213 _2nu.tis
07/23/2008  08:20 AM         6,284,902 _2p0.tis
07/23/2008  08:27 AM         1,980,945 _2p9.tis
07/23/2008  08:36 AM         1,674,640 _2pk.tis
07/23/2008  08:37 AM           311,483 _2pl.tis
07/23/2008  08:38 AM           285,881 _2pm.tis
07/23/2008  08:39 AM           245,138 _2pn.tis
07/23/2008  08:40 AM           116,881 _2po.tis
07/17/2008  11:22 PM        69,635,905 _rp.tis
07/18/2008  12:59 AM        15,883,866 _xu.tis

There are 17 of these files. (File sizes are in bytes.) When I open up
the index in Luke, it says all of them are "In Use" and it doesn't
list any of them as "Deletable". This seems to rule out the
possibility that Solr/Lucene somehow "forgot" to clean up files that
were no longer in use.
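
For anyone who wants to reproduce the count on their own index, here is a
minimal Python sketch that groups the files in an index directory by their
name prefix. It assumes that every distinct "_xxx" prefix corresponds to one
segment and that a trailing "_N" (as in a hypothetical "_1je_1.del") is only a
generation suffix; both are assumptions on my part, not something I have
confirmed. The function names and the directory argument are made up for
illustration.

```python
import os
from collections import defaultdict


def files_by_segment(file_names):
    """Group index file names by their assumed segment prefix.

    Assumption: names look like '_<segment>.<ext>' or '_<segment>_<gen>.<ext>'.
    Anything not starting with '_' (e.g. 'segments_2') is ignored.
    """
    groups = defaultdict(list)
    for name in file_names:
        if not name.startswith("_") or "." not in name:
            continue
        stem, extension = name.split(".", 1)
        # Drop an optional generation suffix such as the '_1' in '_1je_1'.
        segment = "_" + stem[1:].split("_", 1)[0]
        groups[segment].append(extension)
    return dict(groups)


def count_segments(index_dir):
    """Count distinct assumed segment prefixes in a directory on disk."""
    return len(files_by_segment(os.listdir(index_dir)))
```

Calling count_segments on the index directory (path not shown here) should
give the same number as counting the tis files by hand, provided each segment
has exactly one tis file.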

I'm noticing that _2pk, _2pl, _2pm, _2pn, _2po are sequential file
names, alphabetically speaking, and their last modified times are very
close to one another. Does this mean they're actually part of the same
segment, even though they are in separate files? If those files are
indeed part of a single segment, then the number of segments
represented by these files would really be 17-4=13. But that's still
more than the expected 11 segments.
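
To check the "sequential" observation numerically, the names can be decoded
as integers. The sketch below assumes the part after the underscore is a
base-36 counter (digits 0-9, then a-z); that is my assumption about the naming
scheme, and the helper name is made up for illustration.

```python
def segment_number(file_name):
    """Decode a name like '_2pk.tis' into an integer, assuming base 36."""
    stem = file_name.split(".", 1)[0]
    return int(stem.lstrip("_"), 36)


names = ["_2pk.tis", "_2pl.tis", "_2pm.tis", "_2pn.tis", "_2po.tis"]
numbers = [segment_number(n) for n in names]

# True when each decoded name is exactly one more than the previous one.
consecutive = all(b - a == 1 for a, b in zip(numbers, numbers[1:]))

print(numbers, consecutive)
```

Under that assumption the five names decode to consecutive integers, which
matches their closely spaced modification times. It does not by itself say
whether they belong to one segment or to five.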

I just discovered that one of my other indexes has over 11,000 tis
files. That's disturbing. I'm not sure if it would have the same
underlying cause.

Any ideas?
