[
https://issues.apache.org/jira/browse/OAK-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18066076#comment-18066076
]
Julian Sedding commented on OAK-12134:
--------------------------------------
Thank you for the report. I have analysed the issue and can confirm that it
exists. In fact, I found that it can happen for both "tail compaction"
and "full compaction".
New changes can be written to the segmentstore while compaction is running.
In that case, the compaction code notices the new changes and runs an
additional compaction cycle, where only the new changes are processed. The
expectation is that this is a lot faster than the first compaction cycle,
because there are fewer changes. Up to 5 cycles, each expected to be faster than
the previous one, are available before the code blocks concurrent writes to
force compaction to complete. The attached logs show that this happens in
practice.
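As a rough illustration of the retry behaviour described above, here is a
minimal sketch. It is not the actual Oak code: the class, the method and the
"newChangesDuringCycle" probe are hypothetical, and I am assuming one initial
cycle plus up to 5 retry cycles, which would be consistent with the
"6 cycles + force" seen in the 1.88 log.

```java
import java.util.function.BooleanSupplier;

final class CompactionRetrySketch {

    // Assumption: one initial cycle plus up to 5 retry cycles before forcing.
    static final int MAX_RETRY_CYCLES = 5;

    /** Outcome of one simulated compaction run. */
    static final class Outcome {
        final int cycles;     // number of non-blocking cycles that ran
        final boolean forced; // true if writers had to be blocked at the end

        Outcome(int cycles, boolean forced) {
            this.cycles = cycles;
            this.forced = forced;
        }
    }

    /**
     * Simulates the cycle loop.
     *
     * @param newChangesDuringCycle hypothetical probe, called once after each
     *        cycle; returns true if writes landed while that cycle was running
     */
    static Outcome run(BooleanSupplier newChangesDuringCycle) {
        int cycles = 1; // initial cycle over the full content
        boolean dirty = newChangesDuringCycle.getAsBoolean();
        int retries = 0;
        while (dirty && retries < MAX_RETRY_CYCLES) {
            retries++;
            cycles++; // retry cycle that processes only the new changes
            dirty = newChangesDuringCycle.getAsBoolean();
        }
        // Still dirty after all retries: block concurrent writes and force
        // compaction to complete.
        return new Outcome(cycles, dirty);
    }

    public static void main(String[] args) {
        Outcome busy = run(() -> true); // writes arrive during every cycle
        System.out.println(busy.cycles + " cycles, forced=" + busy.forced);
    }
}
```

With a probe that always reports new writes, the sketch runs all cycles and
then ends in a forced compaction. With a probe that goes quiet early, it
stops after that cycle without blocking writers.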
The bug is only triggered when new checkpoints are created during compaction.
Due to incorrect handling of the inputs, the optimization that deduplicates
content between checkpoints is not used. As a result, content is duplicated in
the storage and compaction takes longer. This in turn increases the likelihood
of concurrent writes and potentially also the volume of new changes
(more time = more changes).
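To make the effect of the missing deduplication more tangible, here is a toy
model. It is only a sketch under simplifying assumptions, not how the
compactor is implemented: each checkpoint is modelled as a set of record ids,
and all names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CheckpointDedupSketch {

    /**
     * Counts how many records get written when all checkpoints are compacted.
     *
     * @param checkpoints toy model: each checkpoint is a set of record ids
     * @param dedup       whether content shared between checkpoints is reused
     */
    static int recordsWritten(List<Set<String>> checkpoints, boolean dedup) {
        Set<String> alreadyWritten = new HashSet<>();
        int written = 0;
        for (Set<String> checkpoint : checkpoints) {
            for (String recordId : checkpoint) {
                // With dedup, a record already written for an earlier
                // checkpoint is reused instead of being written again.
                if (dedup && !alreadyWritten.add(recordId)) {
                    continue;
                }
                written++;
            }
        }
        return written;
    }

    public static void main(String[] args) {
        List<Set<String>> checkpoints = List.of(
                Set.of("a", "b", "c"),
                Set.of("a", "b", "c", "d"),
                Set.of("a", "b", "c", "d", "e"));
        System.out.println("with dedup:    " + recordsWritten(checkpoints, true));
        System.out.println("without dedup: " + recordsWritten(checkpoints, false));
    }
}
```

In this model, checkpoints that share most of their content cost little extra
when deduplication is active, while without it every checkpoint is written in
full.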
> compaction with concurrent writes can increase segmentstore size
> ----------------------------------------------------------------
>
> Key: OAK-12134
> URL: https://issues.apache.org/jira/browse/OAK-12134
> Project: Jackrabbit Oak
> Issue Type: Bug
> Affects Versions: 1.88.0
> Reporter: Rishabh Daim
> Assignee: Julian Sedding
> Priority: Major
> Attachments: 1.78.GC#2.log, 1.78.GC#3.log, 1.88.GC#3.log
>
>
> The cleanup behavior is identical on both branches, 1.78 and 1.88.
> Both show 0 bytes in post-compaction cleanup. That was never the real
> difference.
> The actual regression is in compaction itself:
> ||Metric||1.78 GC#3||1.88 GC#3||
> |Compaction time |2.4s (3 cycles)|35.4s (6 cycles + force)|
> |Data written by compaction|~65 MB (489→554 MB)|~800 MB (511→1300 MB)|
> |Initial checkpoints|~66|~46|
> |Force compact needed|No|Yes|
> 1.88 writes 12x more data during compaction despite having fewer checkpoints.
> That's the smoking gun.
> Root cause: OAK-11895
> The CheckpointCompactor change (onto vs after) modified what paths get
> compacted per checkpoint:
> - 1.78 (old): collectSuperRootPaths returns "root" and "checkpoints/X/root"
> — only compacts the repository root and each checkpoint's content root
> - 1.88 (new): returns "" and "checkpoints/X" — compacts the entire
> super-root and the full checkpoint subtree (including metadata, not just the
> root)
> More paths traversed per checkpoint → more nodes copied → more segments
> written → more data produced.
> This has a cascading effect: more data written means compaction takes longer,
> which means more concurrent commits happen during compaction, which means
> more retry cycles, which eventually forces a blocking
> compaction.
> Summary
> The problem is not "cleanup doesn't reclaim space" — cleanup works
> identically on both branches. The problem is that 1.88 TAIL compaction
> produces ~12x more output data than 1.78 due to OAK-11895, causing the
> store to grow significantly after each GC cycle instead of shrinking. This is
> worth raising as a regression against OAK-11895 in Apache JIRA.