[ https://issues.apache.org/jira/browse/SOLR-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026900#comment-17026900 ]
Jörn Franke commented on SOLR-14202:
------------------------------------

Sorry for my late reply. I made a minor adaptation to the program so that it uses Kerberos, since that matches the system under test. Thanks, it is currently running, and after 34,000,000 it behaves normally, i.e. the collection grows between 370 and 7xx MB. After some time it resets to 370 MB, then grows again, then resets, and so on. This is what I expect, and it shows that Solr is working correctly. I will let it keep running, though, to see what happens.

For my case, I should mention that the (re-)load happens through DIH for legacy reasons. Maybe that is an issue? We will move away from DIH entirely, but this has to be done step by step. Our collections are small for now, but each document contains a lot of text, so nothing that should challenge Solr.

Next, I would investigate removing all the copy fields (there is a "large" content field of several hundred KB of text per document that is copied into several other fields for processing) to see whether this is the issue. Alternatively, I would also look at taking at least the atomic updates out of DIH and applying them afterwards through an update processor. However, since I don't have a concrete error message, I am a little in the dark about what to do.

> Old segments are not deleted after commit
> -----------------------------------------
>
>                 Key: SOLR-14202
>                 URL: https://issues.apache.org/jira/browse/SOLR-14202
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: SolrCloud
>    Affects Versions: 8.4
>            Reporter: Jörn Franke
>            Priority: Major
>         Attachments: eoe.zip
>
>
> The data directory of a collection is growing and growing. It seems that old
> segments are not deleted. They are only deleted during start of Solr.
> How to reproduce: have any collection (e.g. the example collection) and start
> indexing documents.
> Even during the indexing the data directory is growing
> significantly - much more than expected (by several orders of magnitude). If
> certain documents are updated (without significantly increasing the amount of
> data) the index data directory again grows by several orders of magnitude.
> Even for small collections the needed space explodes.
> This reduces significantly if Solr is stopped and then started. During
> startup (not shutdown) Solr purges all those segments if not needed (*
> sometimes some, but not a significant amount, are deleted during shutdown).
> This is of course not a good workaround for normal operations.
> It does not seem to have an effect on queries (their performance does not
> seem to change).
> The configs were the same before and after the upgrade (e.g. from Solr 8.2
> to 8.3 to 8.4, not across major versions), so I assume it could be related to
> Solr 8.4. It may also have been present in Solr 8.3 (not sure), but not in 8.2.
>
> IndexConfig is pretty much default: Lock type: native, autoCommit: 15000,
> openSearcher=false, autoSoftCommit -1 (reproducible with autoCommit 5000).
> Nevertheless, it did not happen in previous versions of Solr and the config
> did not change.
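
The growth-and-reset pattern mentioned in the comment (roughly 370 MB up to 7xx MB and back) and the unbounded growth described in the issue can both be tracked by sampling the size of the index data directory over time. The following is only a minimal sketch, not something taken from the issue: the function names `dir_size_bytes` and `watch` are hypothetical, and the path to the data directory is an assumption that depends on the installation.

```python
import os
import time


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Segment files may vanish between listing and stat,
                # for example when old segments are deleted after a merge.
                pass
    return total


def watch(path, interval_s=60, samples=10):
    """Print one timestamped size sample (in MB) every interval_s seconds."""
    for _ in range(samples):
        size_mb = dir_size_bytes(path) / (1024 * 1024)
        print(f"{time.strftime('%H:%M:%S')} {size_mb:.1f} MB")
        time.sleep(interval_s)
```

Calling `watch(...)` with the index directory of the affected core (the exact location is installation-specific) while indexing runs would show whether the size falls back after commits and merges or only after a restart.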