Re: update blows up

Philippe Meunier Sun, 11 Oct 2009 12:27:52 -0700

Marc Espie wrote:
>Fact: half snapshots are FUCKING ANNOYING. I don't know a working solution
>for this issue. The bandwidth of the T1 line is an issue. The disk usage
>on the servers is an issue.  A full solution would need to solve both.


Let me suggest a partial solution that might get close.  Here's a list
of steps that I think could help, where more and more steps
implemented would mean a better and better solution...

1) add some sort of build number to package names.  The number can be
a real build number, or a date+time number, or just some number modulo
some other number N (say, 100, to limit it to 2 digits) as long as all
mirrors have a very high probability of getting synched at least once
for every N builds.  I presume it wouldn't be too hard to change the
build scripts to do that, and to change pkg_add to handle the new
format.

Once that's done then pkg_add can rely on the build numbers to ensure
that a package and its dependencies all come from the same build.
I.e. if the user wants to install package P, pkg_add finds the current
P-37 from the user-specified mirror, and if P depends on some other
package D, then pkg_add will only look for D-37 and nothing else.  If
pkg_add can't find D-37 (i.e. only D-36 or D-38 exists) then it can at
least give a nice "try again later" error message to the user and
abort (rather than the current situation where pkg_add gets the wrong
build of D and the user's left to guess what the problem is when some
strange error message shows up).  That's essentially what some users
currently use SHA256 for, to ensure P and D come from the same build,
except that now pkg_add can do the check itself automatically (and
more, see below).
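
To make the check concrete, here is a rough sketch in Python (pkg_add
itself is not written in Python, this is only an illustration).  The
"<stem>-<build>" name format and the function names split_name and
pick_dependency are made up for the example, they are not existing
pkg_add code.

```python
import re

# Hypothetical name format: "<stem>-<build>", e.g. "P-37".
# The build is kept as a string so that a zero-padded "07" survives.
NAME_RE = re.compile(r"^(?P<stem>.+)-(?P<build>\d+)$")


def split_name(pkgname):
    """Split 'P-37' into ('P', '37')."""
    m = NAME_RE.match(pkgname)
    if m is None:
        raise ValueError("no build number in package name: %r" % pkgname)
    return m.group("stem"), m.group("build")


def pick_dependency(wanted_pkg, dep_stem, available):
    """Return the dependency name that carries the same build number
    as wanted_pkg, or raise LookupError with a readable message.

    available is the set of file names the mirror currently offers.
    """
    _, build = split_name(wanted_pkg)
    target = "%s-%s" % (dep_stem, build)
    if target in available:
        return target
    raise LookupError(
        "%s not found on this mirror (probably half-synched), "
        "try again later" % target
    )
```

With a mirror that offers only D-36 and D-38, pick_dependency("P-37",
"D", ...) raises the "try again later" error instead of silently
picking the wrong build.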

Advantages: fairly easy to do, no extra disk space or bandwidth
required, mirrors do not need to be actively involved, users get a clear
error message when they hit a half-synched mirror.

Disadvantages: the transition to the new package name format might be
very messy, the package names become even uglier (ideally the build
number ought to be part of the path, not part of the package name, but
then you are back to the problem of mirrors having to deal with
multiple directories and twice the disk space when going from one
build to the next, etc).

2) There is currently a map from country to mirror names
(i.e. ftp.html) and a map from mirror names to mirror IP addresses
(i.e. the DNS).  Write a program that takes both maps as input and
creates a map from country to IP addresses (it should be done with a
program, because both maps probably change regularly).  Use the
resulting map to automatically update a hierarchical region-based DNS
system for mirrors (very much like the www.pool.ntp.org system).  For
example, have jp.mirrors.openbsd.org then asia.mirrors.openbsd.org
then finally world.mirrors.openbsd.org.  Some mirrors already use some
sort of openbsd.org names but here the idea is to organize them all in
a systematic manner based on geography.  Once that's done, change
ftp.html to remove all the "real" names of mirrors and put there only
the names of the DNS-based pools.
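
The map-combining program could look roughly like the Python sketch
below.  The input dictionaries stand in for whatever would be parsed
out of ftp.html and the DNS, the country-to-region map is an extra
input I'm assuming, and build_pools is a made-up name.  Turning the
result into actual DNS records is left out.

```python
def build_pools(country_to_mirrors, mirror_to_addrs, country_to_region,
                domain="mirrors.openbsd.org"):
    """Combine the two existing maps into pool name -> list of addresses.

    country_to_mirrors: e.g. {"jp": ["mirror-a.example"]}  (from ftp.html)
    mirror_to_addrs:    e.g. {"mirror-a.example": ["192.0.2.1"]}  (from DNS)
    country_to_region:  e.g. {"jp": "asia"}  (assumed extra input)
    """
    pools = {}

    def add(level, addr):
        addrs = pools.setdefault("%s.%s" % (level, domain), [])
        if addr not in addrs:          # a mirror may be listed twice
            addrs.append(addr)

    for country, mirrors in country_to_mirrors.items():
        for mirror in mirrors:
            # Mirrors that do not resolve are simply skipped.
            for addr in mirror_to_addrs.get(mirror, []):
                add(country, addr)
                add(country_to_region[country], addr)
                add("world", addr)
    return pools
```

Every address ends up in three pools: its country, its region, and
world.  Rerunning the program whenever either input map changes keeps
the pools current.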

Advantages: fairly easy to do, no extra disk space or bandwidth
required, mirrors do not need to be actively involved, compatible with
the existing version of pkg_add, and it might even help spread the
load more evenly among mirrors in each region (just in case that's a
problem...)

Disadvantages: well we haven't really gained anything in this step,
have we?  It's just preparatory work for the next step.  Some users
might complain because they can't use a specific mirror in their
region anymore (unless they remember its "real" name or keep an old
copy of ftp.html around).

Extra bonus: Theo de Raadt's primary site doesn't have to be in the
pools, it can just be used as an invisible feed for the mirrors and
for the developers only (i.e. other users use mirrors and mirrors only
based on the pools system).  Or it can be just in
world.mirrors.openbsd.org only and therefore get fewer users since
most people will probably use local mirrors first and since the
primary site will just be one anonymous IP address among all the ones
in world.mirrors.openbsd.org.

3) Modify pkg_add to make it mirror-pools aware.  For example, a user
wants to install package P from jp.mirrors.openbsd.org.  pkg_add gets
from the DNS one IP address from that pool, connects to it, finds
P-37.  P depends on D so pkg_add tries to get D-37.  If D-37 exists on
that mirror, you're good to go.  If it does not (i.e. only D-36 or
D-38 exist) then pkg_add automatically tries another mirror in the
same pool jp.mirrors.openbsd.org.  Since the package name and the
build number together uniquely identify the file that pkg_add is
looking for, we know that, if pkg_add finds a D-37 on one of the other
mirrors in the pool, then it's the one we need.  If pkg_add does not
find D-37 in the pool jp.mirrors.openbsd.org then it automatically
tries the next level asia.mirrors.openbsd.org (minus the mirrors in
jp.mirrors.openbsd.org) and if that fails it tries
world.mirrors.openbsd.org (minus asia.mirrors.openbsd.org).
At that point, if pkg_add still cannot find D-37 then it prints an
error message and aborts.
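
The search order described above can be sketched in Python as follows.
The has_file callback is a placeholder for however pkg_add would
actually probe a mirror, and find_in_pools is a made-up name.  The
"minus the mirrors already tried" part is handled with a set of
addresses.

```python
def find_in_pools(filename, pools, has_file):
    """Look for filename in ever larger pools.

    pools:    list of (pool_name, [addresses]) ordered from the
              smallest pool to the largest, e.g. jp, asia, world.
    has_file: callable (address, filename) -> bool; a hypothetical
              probe, not an existing pkg_add function.

    Returns (pool_name, address) for the first mirror that has the
    file, or None if no mirror in any pool has it.
    """
    tried = set()
    for pool_name, addrs in pools:
        for addr in addrs:
            if addr in tried:
                continue               # already checked in a smaller pool
            tried.add(addr)
            if has_file(addr, filename):
                return pool_name, addr
    return None
```

A None result is the point where pkg_add would print its error message
and abort.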

Advantages: fairly easy to do, the world's geography is not likely to
change any time soon so encoding that information (i.e. jp is in Asia
which is in the world) in pkg_add only needs to be done once, no extra
disk space or bandwidth required, mirrors do not need to be actively
involved, users will still come across the half-synched mirror problem
just as often as before but now pkg_add can automatically do something
about it and solve the problem transparently in most (but not all, see
below) cases.

Disadvantages: users might complain when they suddenly realize that
pkg_add has decided to download D-37 from some slow mirror in
Antarctica (well, presumably that's still better than not being able
to download D-37 at all).  Searching all the servers in jp, then asia,
then world might take a looong time, but then again it's better than
nothing and ^C is always available...

Note: DNS pool names might instead be organized like
jp.asia.world.mirrors.openbsd.org in which case pkg_add doesn't even
need to know about the world's geography, it can simply chop off one
level in the name to go to the next bigger pool, but such long names
are sort of ugly...
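
With such nested names, "go to the next bigger pool" is only a string
operation.  A minimal Python sketch, assuming the pools all live under
mirrors.openbsd.org as in the examples above (parent_pool is a made-up
name):

```python
def parent_pool(name, base="mirrors.openbsd.org"):
    """Return the next bigger pool, or None at the top level.

    'jp.asia.world.mirrors.openbsd.org' -> 'asia.world.mirrors.openbsd.org'
    """
    suffix = "." + base
    if not name.endswith(suffix):
        raise ValueError("not a pool name under %s: %r" % (base, name))
    levels = name[:-len(suffix)].split(".")
    if len(levels) <= 1:
        return None                    # already at the biggest pool
    return ".".join(levels[1:]) + suffix
```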

Note: Theo de Raadt's primary site can now play just the role of the
root of the hierarchy of mirror pools (i.e. "universe" above "world")
which is then automatically used by pkg_add only as a site of last
resort when D-37 can't be found anywhere else in the world, i.e. not
very often at all (except that, if users learn about the "universe"
level, they might then be tempted to go for the "universe" server
straight away, but pkg_add can then be modified to disallow the user
from directly using the "universe" level as a start level --- of
course the user can always modify pkg_add to remove that restriction
but that's not likely...)

4) Modify pkg_add to use some smarter heuristics.  Suppose pkg_add
found P-37 and is now looking for D-37 in the current mirror.  Here
are the possible cases:

- D-37 exists => use it.
- D-37 does not exist but D-36 does => look for D-37 on other mirrors
  (going through the various "up" pools if necessary), if it's not
  found (which might or might not be possible, depending on whether
  the "universe" level exists or not) then inform the user, sleep 10
  minutes, and retry at the bottom of the pools.
- D-37 does not exist but D-38 does => abort the installation of P-37
  and try P-38 (which is not available from the current server, unless
  you're really lucky with your race conditions, but which should be
  available elsewhere using the pool system).
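
The three cases can be written as one small decision function.  This
Python sketch assumes build numbers are plain integers that only grow,
so the modulo-N wraparound from step 1 is ignored here.  The name
next_action and the returned strings are made up.  A mirror with no
build of D at all is not covered by the cases above; treating it like
the "older build" case is my assumption.

```python
def next_action(wanted_build, builds_on_mirror):
    """Decide what to do about dependency D given the builds of D
    that the current mirror offers.

    wanted_build:     build number of the parent package, e.g. 37
    builds_on_mirror: set of build numbers of D found on the mirror
    """
    if wanted_build in builds_on_mirror:
        return "use"
    if any(b > wanted_build for b in builds_on_mirror):
        # The mirror has moved on: abort P-37 and go for the newer P.
        return "restart-with-newer-build"
    # Only older builds (or none): look for the wanted build on other
    # mirrors, and if that fails, inform the user, sleep and retry.
    return "search-other-mirrors"
```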

Advantages: well, you've got a pretty smart package system at that
point.

Disadvantages: little by little, pkg_add's code gets fairly
complicated...

5) Modify pkg_add to add extra features.  For example, have a special
build number for packages that need a new maintainer.  When a user
tries to install such a package he gets a message "this package needs
a maintainer, since obviously you are using this package you've
therefore just volunteered yourself for the job, please enter your
email address below, thank you".  There's no limit to the amount of
metadata you can encode in package names so the sky's the limit
(modulo manpower, obviously).

That's it.  I don't claim it's perfect or that the amount of work
required is worth the importance of the problem.  It could help a lot
though, even if "only" steps 1-3 are implemented.  Consider it food
for thought.  And of course I'm not volunteering to do any of that
work, I don't have the time (and I don't use packages anyway) even if
I obviously have enough time to write 150 lines about it!

Philippe
