For what it's worth, I've now had the exact same problem, which led me
here.

On a bare-metal Ubuntu 20.04 install using whole blank HDDs as OSDs
(/dev/sda etc.), installing with cephadm worked fine with an XFS root.
When I later reinstalled with a ZFS root, I got the same behaviour
described above, despite trying device zaps and everything else I could
think of.

It seems that unit.run does two separate steps: first a
"/usr/sbin/ceph-volume lvm activate 0" and then a
"/usr/bin/ceph-osd -n osd.0".

The activate does its work inside a tmpfs "/var/lib/ceph/osd/ceph-0",
which is entirely thrown away when that container ends, so the symlink
"/var/lib/ceph/osd/ceph-0/block" it creates is gone before the ceph-osd
container starts up. As a result ceph-osd no longer finds a "block" and
reports an unknown OSD type.
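
To check this from whichever environment ceph-osd actually sees, a small
script along these lines may help. It is only a sketch: the helper names
fs_type_for and describe_osd_dir are mine, not part of ceph or cephadm,
and it assumes the usual /proc/mounts layout (device, mount point,
filesystem type as the first three fields).

```python
import os


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format. The longest matching
    mount point wins; on a tie the later line wins, since a later mount
    shadows an earlier one. Escaped characters in mount points (such as
    \\040 for a space) are not decoded in this sketch.
    """
    path = os.path.normpath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) >= best_len:
                best_len, best_type = len(mount_point), fs_type
    return best_type


def describe_osd_dir(osd_dir, mounts_text):
    """Summarise what is visible at an OSD directory right now."""
    block = os.path.join(osd_dir, "block")
    return {
        "fs_type": fs_type_for(osd_dir, mounts_text),
        "block_is_symlink": os.path.islink(block),
        # False for a missing or dangling symlink
        "block_target_exists": os.path.exists(block),
    }
```

Calling describe_osd_dir("/var/lib/ceph/osd/ceph-0",
open("/proc/mounts").read()) once where activate ran and once where
ceph-osd runs should show whether the tmpfs and the "block" symlink are
visible in both places.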

I don't understand how that could ever possibly work, so maybe the ZFS
root is not relevant, or maybe it somehow causes activate to use the
tmpfs?

Note that if I run a single container manually and do the same activate
followed by ceph-osd inside it, then the OSD does come up.

How is the "/var/lib/ceph/osd/ceph-0/block" meant to persist between
running the activate in one container and then running the ceph-osd in a
different one afterwards? Or is the "/usr/bin/mount -t tmpfs tmpfs
/var/lib/ceph/osd/ceph-0" that it runs during activate somehow the
source of this problem?

https://bugs.launchpad.net/bugs/1881747

Title:
  cephadm does not work with zfs root

Status in zfs-linux package in Ubuntu:
  In Progress

Bug description:
  When trying to install Ceph on Ubuntu 20.04 with ZFS as the root file
  system, the OSDs do not come up.

  The OSDs give the following error:

  May 29 16:51:11 ip-10-0-0-148 systemd[1]: ceph-a3ed1cb2-a1cb-11ea-8daf-a729fb450032@osd.0.service: Main process exited, code=exited, status=1/FAILURE
  May 29 16:51:12 ip-10-0-0-148 systemd[1]: ceph-a3ed1cb2-a1cb-11ea-8daf-a729fb450032@osd.0.service: Failed with result 'exit-code'.
  May 29 16:51:22 ip-10-0-0-148 systemd[1]: ceph-a3ed1cb2-a1cb-11ea-8daf-a729fb450032@osd.0.service: Scheduled restart job, restart counter is at 4.
  May 29 16:51:22 ip-10-0-0-148 systemd[1]: Stopped Ceph osd.0 for a3ed1cb2-a1cb-11ea-8daf-a729fb450032.
  May 29 16:51:22 ip-10-0-0-148 systemd[1]: Starting Ceph osd.0 for a3ed1cb2-a1cb-11ea-8daf-a729fb450032...
  May 29 16:51:22 ip-10-0-0-148 docker[114525]: Error: No such container: ceph-a3ed1cb2-a1cb-11ea-8daf-a729fb450032-osd.0
  May 29 16:51:22 ip-10-0-0-148 systemd[1]: Started Ceph osd.0 for a3ed1cb2-a1cb-11ea-8daf-a729fb450032.
  May 29 16:51:23 ip-10-0-0-148 bash[114543]: Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
  May 29 16:51:23 ip-10-0-0-148 bash[114543]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
  May 29 16:51:23 ip-10-0-0-148 bash[114543]: Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-b3cf0dc5-a5fb-45c5-af3c-b85ef0b115ee/osd-block-3bfa4417-18e5-49f9->
  May 29 16:51:23 ip-10-0-0-148 bash[114543]: Running command: /usr/bin/ln -snf /dev/ceph-b3cf0dc5-a5fb-45c5-af3c-b85ef0b115ee/osd-block-3bfa4417-18e5-49f9-95ee-4c5912f0fa22 /var/lib/ceph/osd/ceph-0/block
  May 29 16:51:23 ip-10-0-0-148 bash[114543]: Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block
  May 29 16:51:23 ip-10-0-0-148 bash[114543]: Running command: /usr/bin/chown -R ceph:ceph /dev/mapper/ceph--b3cf0dc5--a5fb--45c5--af3c--b85ef0b115ee-osd--block--3bfa4417--18e5--49f9--95ee--4c5912f0fa22
  May 29 16:51:23 ip-10-0-0-148 bash[114543]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
  May 29 16:51:23 ip-10-0-0-148 bash[114543]: --> ceph-volume lvm activate successful for osd ID: 0
  May 29 16:51:24 ip-10-0-0-148 bash[115166]: debug 2020-05-29T16:51:24.602+0000 7f05cfb9cec0  0 set uid:gid to 167:167 (ceph:ceph)
  May 29 16:51:24 ip-10-0-0-148 bash[115166]: debug 2020-05-29T16:51:24.602+0000 7f05cfb9cec0  0 ceph version 15.2.2 (0c857e985a29d90501a285f242ea9c008df49eb8) octopus (stable), process ceph-osd, pid 1
  May 29 16:51:24 ip-10-0-0-148 bash[115166]: debug 2020-05-29T16:51:24.602+0000 7f05cfb9cec0  0 pidfile_write: ignore empty --pid-file
  May 29 16:51:24 ip-10-0-0-148 bash[115166]: debug 2020-05-29T16:51:24.602+0000 7f05cfb9cec0 -1 missing 'type' file and unable to infer osd type

  Using Ubuntu 20.04 without a ZFS root works fine.
