"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> As I was testing the release candidate, I stumbled across a regression in
> 'git merge-base' as a result of the switch to generation numbers. The commit
> message in [PATCH 1/1] describes the topology involved, but you can test it
> yourself by comparing 'git merge-base v4.8 v4.9' in the Linux kernel. The
> regression does not show up when running merge-base for tags at least v4.9,
> which is why I didn't see it when I was testing earlier.

Strange that it happens; I'll take a look.

> The solution is simple, but also will conflict with ds/reachable in next. I
> can send a similar patch that applies the same diff into commit-reach.c.
>
> With the integration of generation numbers into most commit walks coming to
> a close [1], it will be time to re-investigate other options for
> reachability indexes [2]. As I was digging into the issue with this
> regression, I discovered a way we can modify our generation numbers and pair
> them with commit dates to give us a simple-to-compute, immutable
> two-dimensional reachability index that would be immune to this regression.
> I will investigate that more and report back, but it is more important to
> fix this regression now.

I am interested in what you have created, especially because commit
creation dates are imperfect indicators because of clock skew etc.

>
> Thanks, -Stolee
>
> [1] https://public-inbox.org/git/[email protected]/[PATCH
> 0/6] Use generation numbers for --topo-order
>
> [2] https://public-inbox.org/git/[email protected]/[RFC] Other chunks
> for commit-graph, part 2 - reachability indexes

Since then I have found few more possible approaches:
- working with repository metagraph[1], where all chains of commits were
  replaced by edge, perhaps together with DAG Reduction[2] boosting
  framework on this metagraph
- using FELINE-like index (the Graph+Label approch, also known as online
  search), and for those where this index have false results, use Labels
  only approach[3]

[1] Xian Tang et. al., "An Optimized Labeling Scheme for Reachability
    Queries", CMC, vol.55, no.2, pp.267-283, 2018
[2] Marco Biazzini, Martin Monperrus, Benoit Baudry "On Analyzing the
    Topology of Commit Histories in Decentralized Version Control
    Systems", ICSME 2014 (conference paper)
[3] Junfeng Zhou et. al., "Accelerating reachability query processing
    based on DAG reduction", The VLDB Journal (2018) 27: 271

--
Jakub Narębski

Reply via email to