machine-id "must not be exposed in untrusted environments"

Simon McVittie Fri, 09 Aug 2019 00:55:44 -0700

On Fri, 09 Aug 2019 at 02:04:25 +0200, Adam Borowski wrote:
> But... if this ID must not be exposed on the network, why does it need to be
> unique?


At the risk of stating the obvious, a machine ID that is not unique
defeats the object of having one: anything that stores per-machine
state/configuration keyed by the machine ID will think all your
machines and OS installations are the same. To use an analogy with
other strings that might get used
as a unique identifier, it's the same as if you set the hostname of
all your machines to spartacus, or changed all your MAC addresses to
d4:1d:8c:98:f0:0b, or somehow changed the hardware serial numbers and
UUIDs in /sys/class/dmi/id to a constant value.
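
To make the failure mode concrete, here is a minimal sketch of a per-machine
settings store. The store, the function names, the setting name and the
machine ID value are all hypothetical and only illustrate the collision; no
real program is being quoted.

```python
# Hypothetical per-machine settings store, keyed by machine ID.
store = {}


def save_setting(machine_id, key, value):
    """Record a setting under the given machine ID."""
    store.setdefault(machine_id, {})[key] = value


def load_setting(machine_id, key):
    """Return the setting stored for the given machine ID, or None."""
    return store.get(machine_id, {}).get(key)


# Made-up ID that was cloned onto two different machines.
CLONED_ID = "0" * 32

# The laptop saves its preference, then the desktop saves its own.
save_setting(CLONED_ID, "default-audio-sink", "laptop-speakers")
save_setting(CLONED_ID, "default-audio-sink", "desktop-hdmi")

# The laptop now reads back the desktop's value: from the store's point
# of view there was only ever one machine.
laptop_view = load_setting(CLONED_ID, "default-audio-sink")
```

With distinct machine IDs the two entries would live under separate keys and
neither machine would overwrite the other.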

The recommendation in machine-id(5) to use
sd_id128_get_machine_app_specific() (which is implemented as HMAC-SHA256
of the app ID, keyed by the machine ID), combined with the machine ID
being unique, results in
a family of stable per-machine identifiers that are unique to a machine
and constant, but are not obviously related to each other. So for example,
if Chromium and PulseAudio both used sd_id128_get_machine_app_specific(),
an attacker who knows the Chromium app-specific machine ID would not be
able to tell whether the PulseAudio app-specific machine ID belongs to
the same machine or not.
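
As a rough illustration of that derivation, here is a sketch in Python. It
assumes the scheme described in sd_id128_get_machine_app_specific(3):
HMAC-SHA256 of the app ID keyed by the machine ID, truncated to 128 bits and
marked as a version 4 UUID. The function name app_specific_id() and all the
ID values are made up, and I make no promise that the result is
byte-for-byte what libsystemd would return.

```python
import hashlib
import hmac
import uuid


def app_specific_id(machine_id: uuid.UUID, app_id: uuid.UUID) -> uuid.UUID:
    """Derive a stable per-application ID from a machine ID (sketch)."""
    # Key: the 16-byte machine ID. Message: the 16-byte application ID.
    digest = hmac.new(machine_id.bytes, app_id.bytes, hashlib.sha256).digest()
    # Keep the first 128 bits; version=4 sets the UUID version/variant bits.
    return uuid.UUID(bytes=digest[:16], version=4)


# Made-up machine IDs and application IDs, for illustration only.
MACHINE_A = uuid.UUID("0123456789abcdef0123456789abcdef")
MACHINE_B = uuid.UUID("fedcba9876543210fedcba9876543210")
APP_ONE = uuid.UUID("11111111111111111111111111111111")
APP_TWO = uuid.UUID("22222222222222222222222222222222")

# Stable: the same machine and app always give the same derived ID.
id_a_one = app_specific_id(MACHINE_A, APP_ONE)
# Different app on the same machine: a different, unrelated-looking ID.
id_a_two = app_specific_id(MACHINE_A, APP_TWO)
# Same app on a different machine: different again.
id_b_one = app_specific_id(MACHINE_B, APP_ONE)
```

Someone who sees only id_a_one and id_a_two cannot link them without knowing
MACHINE_A, because that is the HMAC key and it never leaves the machine.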

In particular, this is what systemd-networkd does for DHCP: it uses a
keyed hash (HMAC) of the machine-id(5), so the same machine-id(5) gives
you the same DHCP ID (and hence probably the same IP address), and
different machine IDs give you different DHCP IDs, but the actual value
of the machine ID is not sent in the DHCP transaction.

For at least some code that does not follow the recommendation to use a
HMAC, the reason is likely to be compatibility with data stored by older
versions of itself: the code is older than the recommendation, and if it
switched to using sd_id128_get_machine_app_specific() or equivalent now,
it would lose its ability to associate stored state/configuration with
the same machine for which it was stored, causing apparent data loss.

I think it's also important to distinguish between the machine ID
being exposed on the network in a way that can be seen by untrusted
eavesdroppers (like if it was used for DHCP without using a HMAC), and
being exposed to other parties in a way that already involves trusting
them (like sharing it with the other machines sharing an NFS-mounted
home directory alongside confidential personal files, or including
/etc/machine-id in a system backup alongside stored secrets elsewhere in
/etc). If an attacker can read (or write!) your home directory or your
backups, that attacker being able to "fingerprint" your machine ID is
the least of your concerns.

Some mental models for the machine ID that are reasonably close:

- it's like the hostname (except opaque, so users don't want to change it
  to a more aesthetically appealing value and then expect things to still
  work the same)

- it's the same as the MAC address, if all machines had exactly one network
  interface (which they don't, so the MAC address is unsuitable)

- it's the same as the motherboard serial number, if all machines had one
  (which they don't, so this is unsuitable)

- it's the same as the disk serial number, if all machines had exactly
  one disk (which they don't, so this is unsuitable)

Regards,
    smcv
