Bug#875703: debcrossgen defaults to a non-default gcc

Simon McVittie Wed, 11 Oct 2017 01:39:59 -0700

On Wed, 13 Sep 2017 at 21:01:43 +0200, Helmut Grohne wrote:
> debarch cpu     cpu_family
> armel   arm     armv7l

Is this a typo, or is this really what Meson does? I would expect
cpu=armv7hl and cpu_family=arm.

Is cpu_family intended to describe a CPU family, or an ABI? Some CPU
families have more than one incompatible ABI:

* ARM: OABI (GNU: arm*-linux-gnu), EABI soft-float (arm*-linux-gnueabi),
  EABI hard-float (arm*-linux-gnueabihf)
* mips: little-endian (GNU: mipsel-linux-gnu) or big-endian
  (GNU: mips-linux-gnu)
* mips64: little-endian (GNU: mips64el-linux-gnu) or big-endian
  (GNU: mips64-linux-gnu)
* x86_64 (yes really): LP64 x86_64 (GNU: x86_64-linux-gnu) or ILP32 "x32"
  (GNU: x86_64-linux-gnux32)

so the answer to that question does matter.
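
To make the point concrete, here is a small illustrative table built only from the list above. The names KNOWN_ABIS and abis_for_family are hypothetical, not anything Meson or debcrossgen provides; the sketch merely shows that one cpu_family value can map to several mutually incompatible GNU tuples.

```python
# Illustrative data only: (cpu family, GNU tuple pattern, ABI label),
# copied from the list in this message. Not an exhaustive registry.
KNOWN_ABIS = [
    ("arm", "arm*-linux-gnu", "OABI"),
    ("arm", "arm*-linux-gnueabi", "EABI soft-float"),
    ("arm", "arm*-linux-gnueabihf", "EABI hard-float"),
    ("mips", "mipsel-linux-gnu", "little-endian"),
    ("mips", "mips-linux-gnu", "big-endian"),
    ("mips64", "mips64el-linux-gnu", "little-endian"),
    ("mips64", "mips64-linux-gnu", "big-endian"),
    ("x86_64", "x86_64-linux-gnu", "LP64"),
    ("x86_64", "x86_64-linux-gnux32", "ILP32 (x32)"),
]


def abis_for_family(family):
    """Return every (GNU tuple pattern, ABI label) listed for a CPU family."""
    return [(pattern, abi) for fam, pattern, abi in KNOWN_ABIS if fam == family]


if __name__ == "__main__":
    for pattern, abi in abis_for_family("x86_64"):
        print(pattern, "->", abi)
```

If cpu_family is all a build description records, every family in this table is ambiguous.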

On Thu, 14 Sep 2017 at 00:26:04 +0300, Jussi Pakkanen wrote:
> When compiling with Meson, CC and CXX must point to the native
> compilers, not to cross compilers.

That's CC_FOR_BUILD in config.guess and many Autotools projects (also
CPP_FOR_BUILD for cpp;
https://www.gnu.org/software/autoconf-archive/ax_prog_cc_for_build.html
doesn't define CXX_FOR_BUILD, but that would be a logical extension).
Perhaps Meson should look at those variables first, when building things
for the build system? (In GNU terminology: build is the machine where
Meson runs, host is the machine where the result of compilation will be
run.)
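
A minimal sketch of the lookup order I have in mind, not a description of what Meson currently does: prefer CC_FOR_BUILD, fall back to CC, then to a plain "cc". The function name pick_build_compiler and the "cc" default are assumptions made for illustration.

```python
import os


def pick_build_compiler(environ=None, default="cc"):
    """Choose the compiler for the *build* machine (hypothetical helper).

    Checks CC_FOR_BUILD first, then CC, then falls back to `default`.
    """
    env = os.environ if environ is None else environ
    for name in ("CC_FOR_BUILD", "CC"):
        value = env.get(name)
        if value:  # skip unset and empty variables
            return value
    return default


if __name__ == "__main__":
    print(pick_build_compiler())
```

With this order, a packaging tool could export CC=aarch64-linux-gnu-gcc for the host and CC_FOR_BUILD=gcc for the build machine without the two being confused.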

If I wanted to cross-compile with Meson using a non-standard
cross-compiler, for instance running aarch64-linux-gnu-clang instead of
aarch64-linux-gnu-gcc on my x86_64 system, is there an environment variable
that can be set to achieve that, or do I have to generate a non-standard
cross-compilation definition file and use that?

On Thu, 14 Sep 2017 at 00:26:04 +0300, Jussi Pakkanen wrote:
> Yes, we do indeed do some canonicalization here.

Are you deliberately inventing a new vocabulary of CPU types?

If possible (and if it's not too late!) it would be great if this
vocabulary reused some existing one (for example GNU CPU types, which
in fact debcrossgen assumes are the vocabulary in use), or at worst,
was defined to be the same as some existing vocabulary with documented
exceptions (like x86 for the i?86 family).

The GNU tuples (like i386-linux-gnu) seem to be a de facto standard for
cross-compilation: projects that use Make but not Autotools often run
things like "$(CROSS_COMPILE)cc" and expect you to put a GNU tuple +
"-" in CROSS_COMPILE if you are cross-compiling.

Some good vocabularies of CPU types and ABIs that I know about, which
are all similar but non-identical:

* GNU Autotools (result of config.guess/config.sub, then delete the
  first "-" and everything after it if you only want the CPU)

* Debian multiarch (GNU Autotools, then normalize i?86 to i386 and
  arm* to arm because the specific functionality level doesn't imply an
  ABI change, then delete the first "-" and everything after it if you
  only want the CPU)

* Linux uname machine, i.e. uname -m (Linux-specific; no documentation
  of the vocabulary that I know of, other than source code; not rich
  enough to distinguish between ABIs if you care about that; also
  doesn't distinguish between LE and BE mips, and uses ppc where GNU uses
  powerpc; https://lists.debian.org/debian-devel/2017/10/msg00124.html)

* Python platform.machine() (no documentation of the vocabulary that
  I know of; it seems to be uname machine on Unix, but x86 or AMD64 on
  Windows, so in practice the vocabulary varies arbitrarily per-OS)
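
The first two vocabularies can be sketched in a few lines. This is a rough illustration of the rules as described above, not Debian's actual implementation: the function names are hypothetical, and the ARM pattern deliberately covers only armvN-style names.

```python
import re


def gnu_cpu(gnu_type):
    """CPU part of a GNU type: drop the first "-" and everything after it."""
    return gnu_type.split("-", 1)[0]


def multiarch_style_cpu(gnu_type):
    """Apply a multiarch-like normalization to the CPU part (sketch only)."""
    cpu = gnu_cpu(gnu_type)
    if re.fullmatch(r"i\d86", cpu):
        # i486, i586, i686 ... share one ABI, so collapse them to i386
        return "i386"
    if re.fullmatch(r"armv\d\w*", cpu):
        # assumption: only armvN-style names are collapsed here
        return "arm"
    return cpu


if __name__ == "__main__":
    for t in ("i686-linux-gnu", "armv7l-unknown-linux-gnueabihf", "x86_64-linux-gnu"):
        print(t, "->", multiarch_style_cpu(t))
```

Note that the normalized CPU alone still does not identify the ABI: the rest of the tuple (gnueabi versus gnueabihf, gnu versus gnux32) carries that.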

CMake has some sort of ad-hoc vocabulary of CPU types but as far as I'm
aware, doesn't document what it expects to see there, or even whether
it can vary between OSs. It would be great if Meson avoided making the
same mistake.

You could do a lot worse than recycling the GNU vocabulary as-is, or
applying the same CPU family normalization to it that Debian multiarch
does: that's consistent between OSs (e.g. always x86_64, never AMD64),
and if the makers of new CPU families want to run open source software,
in practice they are going to have to tell the maintainers of GNU
config.guess/config.sub about their new CPU family rather soon anyway.

For new CPU families like aarch64, Linux uname -m is usually the same
as the GNU CPU type, so hopefully the awkward cases like p(ower)pc
won't proliferate in future.

> "cpu_family" means the CPU type that is native to the given platform,
> _not_ the CPU type of the kernel currently running.

As I'm sure you've noticed, there's no such thing: Windows with WoW64
and Unix with multilib (lib32/lib/lib64) or multiarch (Debian) can have
any mixture of 32- and 64-bit. Non-x86 systems can also mix ABIs with
multilib or multiarch (ARM and MIPS have several ABIs).

If you're OK with assuming that the Python that is running Meson was
built for the desired architecture (which seems reasonable on Unix,
although possibly not on Windows) then
sysconfig.get_config_var('HOST_GNU_TYPE') can tell you that.
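
For example, something along these lines (a sketch; python_host_cpu is a made-up name, and the None fallback is there because HOST_GNU_TYPE is not set on every platform, Windows builds being the obvious case):

```python
import sysconfig


def python_host_cpu():
    """CPU part of the GNU type this Python was built for, or None if unknown."""
    host = sysconfig.get_config_var("HOST_GNU_TYPE")
    if not host:
        # Variable absent: the caller needs another detection method.
        return None
    return host.split("-", 1)[0]


if __name__ == "__main__":
    print(sysconfig.get_config_var("HOST_GNU_TYPE"), "->", python_host_cpu())
```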

> > mips    mips64  mips64
> > mipsel  mips64  mips64
> 
> These should be plain mips (assuming that this is mips32 userland
> running on top of a 64 bit kernel).
> 
> > powerpc ppc64   ppc64
> 
> Same here.

Those would have to do the same __cpp_symbol__-based detection as x86,
if that's the chosen approach.

32-bit ARM can also run on an aarch64 kernel, and 31-bit S390 can run
on a 64-bit S390x kernel, so uses of that approach will tend to
proliferate. Detecting the build architecture's bit width from uname -m
just doesn't work very well, unfortunately: uname -m is a fact about
the kernel rather than a fact about the user-space OS.
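
A quick way to see the mismatch from inside Python, as a sketch only: compare what the platform module reports with the pointer width of the running interpreter. The helper names are hypothetical. Pointer width is not an architecture either (x32 has 32-bit pointers on x86_64), so this demonstrates the problem rather than solving it.

```python
import platform
import struct


def userspace_pointer_bits():
    """Pointer width, in bits, of the running Python process."""
    return struct.calcsize("P") * 8


def describe():
    """Pair the reported machine name with this process's pointer width."""
    return {
        "machine": platform.machine(),  # uname machine on Unix
        "userspace_bits": userspace_pointer_bits(),
    }


if __name__ == "__main__":
    print(describe())
```

A 32-bit user space on a 64-bit kernel can report a 64-bit machine name alongside 32 user-space bits, which is exactly the case the quoted mips and powerpc rows ran into.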

Regards,
    smcv
