Michael Orlitzky <m...@gentoo.org> wrote:
>
> Who generates the metadata when I `git pull`?

For the gentoo repository, it is in general a Gentoo server which
generates the metadata and pushes it to the repository that you
pull as a user.
It is *possible* to use the "plain" repository, but you have to
set up quite a bit for that (updating the metadata is only one of
several steps which you then have to do manually).
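
For illustration only (the paths and the sync URI below are just the
usual defaults and may differ on your system): pulling the git mirror
that already contains the generated md5-cache looks roughly like

  # /etc/portage/repos.conf/gentoo.conf
  [gentoo]
  location = /var/db/repos/gentoo
  sync-type = git
  sync-uri = https://anongit.gentoo.org/git/repo/sync/gentoo.git
  auto-sync = yes

whereas syncing the "plain" developer repository instead means that
no metadata/md5-cache is shipped at all.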

A collection of scripts which perform the missing steps is
https://github.com/vaeth/portage-postsyncd-mv
though I do not know whether they still work. Like probably most
users, I have been using one of the repositories with pre-generated
metadata for years.

> Or in overlays?

There are overlays which provide metadata, and there are overlays
where you have to generate it on your own with egencache. Overlays
which use eclasses from the gentoo repository usually take the latter
approach, since shipped metadata would soon be out-of-date.
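
For example, something along these lines regenerates the cache for
an overlay ("myoverlay" and the job count are of course just
placeholders here):

  egencache --update --repo myoverlay --jobs 4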

However, for most overlays egencache takes far less than a minute,
so this is not really an issue there. For the gentoo repository,
a full metadata generation takes considerable time. As mentioned,
checksums and timestamps are used to minimize the amount of work,
though this does not help for eclass changes (see below).
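
To illustrate why: each entry in metadata/md5-cache records a checksum
of the ebuild and of every inherited eclass, roughly like the following
made-up excerpt (only one eclass shown, hashes invented):

  _eclasses_=toolchain-funcs 5c9a61a07c6a4b65d04cdbfb2bdf3a2b
  _md5_=0123456789abcdef0123456789abcdef

so any change to an inherited eclass changes its checksum and
invalidates the cached entry, even if the resulting metadata would be
identical.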

> but if you record that wasted time in a different "metadata generation"
> ledger, then it looks like I've saved a bunch of time because the "bash
> sourcing" ledger reads zero.

No. You need correct metadata after every sync of the repository;
either the Gentoo server or your machine has to generate it.
This is independent of whether the PM "prefers" the installed or
the repository's metadata. Excluding installed packages from metadata
updates would not be sane and would not save much time anyway, since
the vast majority of ebuilds are not installed.

> Even if I believe in a metadata angel and if we pretend that the PMS
> requires the metadata to be there, then rebuilding whenever metadata
> changes is still not 100% correct (as you point out), because it often
> rebuilds pointlessly. But that's getting into a harder problem.

Yes, usually the metadata rebuilds due to eclass changes are pointless
(except in the few cases where the eclass change actually affects the
metadata).

I remember that, when I used to sync from the "plain" repository and
rebuild the metadata on my system, the sync (i.e. the metadata
regeneration) took 10-30 minutes whenever one of the basic eclasses
(which are sourced by almost every ebuild) had even a trivial change.
Machines are probably somewhat faster by now, and fewer such "basic"
eclasses are needed in newer EAPIs, but it will still take a
considerable time.

> The recompilation isn't always pointless. In the present scenario, the
> rebuild is what installs the python-3.8 copy of the program.

No. If users stay with the defaults for PYTHON_TARGETS and
PYTHON_SINGLE_TARGET, the rebuild has absolutely no effect except for
changing some metadata in /var/db/pkg (and some file timestamps).
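
For example, the python_targets_* flags that such a rebuild records
can be inspected directly in /var/db/pkg (the package name below is
just an example):

  grep -o 'python_targets_[^ ]*' /var/db/pkg/dev-python/certifi-*/USE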

> I'm not arguing that this is the best way to do things, but that it's
> the only way to do them given what's in the PMS today.

Of course. That was the decision made some years ago.

> maybe it should pay a few people to sit around and think about
> this for a week.

There was such a debate some years ago, before the decision mentioned
above was made.
It was a heated discussion, with strong proponents on both sides:
pointless recompilation vs. improvement of the dependency handling
(there are many more arguments for both sides which I will not repeat
now).
The discussion might have turned out differently if I had found the
time to implement the necessary change in portage, but I had the time
neither then nor now. (To be honest: maybe I had the time, but I
dislike python too much.) Nevertheless, I do not think that it was a
good decision.
I am not posting this to re-open that discussion.
But your posting shows that apparently not everybody knows the reasons
why things are the way they are now.

