Hi

I had a look at the different proposed systemd service files today.
While this topic was brought up again by a Debian bug report, I don't
think this is distribution-specific. I would very much prefer a unified
solution across all distributions.

Dmitry Smirnov <only...@debian.org> writes:

> Control: tags -1 pending
>
> On Fri, 14 Nov 2014 20:27:13 Clint Adams wrote:
>> Please add systemd service files for ceph.  I see that systemd is
>> special-cased in /etc/init.d/ceph, but it does not appear to work.
>
> I know, it doesn't work... Lately I've been experimenting with new systemd 
> support that I've committed to 
>
>     http://anonscm.debian.org/cgit/pkg-ceph/ceph.git/commit/?h=experimental&id=3c22e192d964789365e8dc21c168c5fd8985f7d8
>
>
> On Fri, 14 Nov 2014 14:35:17 Ken Dreyer wrote:
>> There is are some work-in-progress files here:
>> https://bugzilla.redhat.com/show_bug.cgi?id=771924
>> 
>> I have not yet had a chance to review them, though.
>
> Thanks, those files are not good neither what committed to Ceph repository as
>
>     https://github.com/ceph/ceph/tree/master/systemd
>
> It seems like nobody actually tried to use those files. There are number of 
> issues like incorrect order of includes for environment variables; lack of 
> "ulimit" support (i.e. "LimitNOFILE=32768") in OSD service file which is 
> necessary to prevent OSD termination shortly after start etc.
>
> I've created meta "ceph.service" for compatibility -- it merely depends on 
> other daemons (ceph-mon, ceph-mds) as well as enabled ceph-osd@
> services.

I don't particularly like this in its current state. The line
"ExecStartPre=-/bin/systemctl start ceph-osd*" seems very wrong to me.
I'm not a systemd expert, but I did not find an easy way to create
something like a meta-service that looks properly integrated into
systemd. But then I don't think that's needed either. The way the
current init script tries to start all the different daemons in one
script always seemed odd to me. Do we need a meta-service like this?

>
> IMHO "ceph-mds" and "ceph-mon" are better than "ceph-mds@" and "ceph-mon@" 
> just like in SysV init files where by default we support just one instance 
> per machine etc.

I agree with this. Having multiple instances of ceph-mon or ceph-mds per
machine does not make sense. On the other hand, your proposed
implementation uses "%H", which resolves to the hostname. This is not
compatible with the current implementation in the init script, which
parses the configuration file to find the id of the mds and mon. I'm not
sure how to solve this, but IMO all distributions should do this in the
same way, and at the very least we need an upgrade path for users who
don't have the hostname as the id of their mon and mds (for example
mon.1, mon.2, ... instead of node1, node2, ...). I see 3 possible
solutions:

- Add a script similar to the code in the current init script which
  parses the config file to get the id and use that when starting the
  daemon.
- Agree that mons and mds should have their ids equal to the
  hostname. I don't really like that solution as it seems quite
  inflexible.
- Use a service template (with the @) nonetheless. This is probably the
  simplest solution but requires more manual intervention by the cluster
  administrator, who has to set the id manually when enabling the
  service.
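
To illustrate the third option, a template unit could look roughly like
the sketch below. The file name ceph-mon@.service, the binary path and
the command line flags are assumptions for illustration and are not
taken from any of the proposed files.

```ini
# Hypothetical ceph-mon@.service (sketch only, not a tested unit file)
[Unit]
Description=Ceph monitor daemon (id %i)
After=network.target

[Service]
# %i expands to the instance name given after the "@" when the unit is
# enabled, so the id can be anything (1, a, node1, ...), not only the
# hostname as with %H.
# Assumed binary path and flags: -f stays in the foreground, -i sets the id.
ExecStart=/usr/bin/ceph-mon -f -i %i

[Install]
WantedBy=multi-user.target
```

With a unit like this, an administrator whose monitor is called mon.1
would run "systemctl enable ceph-mon@1.service" once on that host.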

Some other discussion points:
- Restart policy: I think we should take advantage of the fact that
  systemd can monitor processes and restart them if they fail. I propose
  to start the daemon in the foreground (as is done already) and set
  "Restart=on-failure". See man systemd.service[1] for details on what
  this means. Do we need custom values for RestartSec (time to sleep
  before restart, default 100ms), StartLimitInterval, StartLimitBurst
  (both related to start rate limiting, default 5 times in 10 seconds)?

- Mounting OSD filesystems: For sysvinit the init script mounts the OSD
  filesystem. None of the proposed systemd solutions mounts any
  filesystems. I think that mounting filesystems should not be done in
  the ceph init scripts (independent of the init system used). Why was
  this added to the init scripts, and why can't it be done from
  /etc/fstab like for all other filesystems? My preferred solution for
  systemd is to mount filesystems from /etc/fstab and to have
  "RequiresMountsFor=/var/lib/ceph/mds/ceph-%i" in the individual
  service files to ensure that the filesystem is mounted. An alternative
  would be to create mount units or a generator similar to
  systemd-fstab-generator. But this sounds like a lot of work for little
  gain.
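
The two points above could be combined in a service file fragment
roughly like the following sketch. The unit name ceph-osd@.service, the
binary path and flags, and the OSD data path are assumptions (the path
follows the default layout; the directive works the same for whichever
directory the daemon needs). The restart values only restate the
defaults mentioned above and are not a recommendation.

```ini
# Sketch of a hypothetical ceph-osd@.service fragment; not a complete unit.
[Unit]
# Pulls in and orders this unit after the mount unit(s) for the path,
# so the filesystem from /etc/fstab is mounted before the daemon starts.
# Assumed path: default OSD data directory for instance %i.
RequiresMountsFor=/var/lib/ceph/osd/ceph-%i

[Service]
# Assumed binary path and flags: -f stays in the foreground, -i sets the id.
ExecStart=/usr/bin/ceph-osd -f -i %i
# Restart only after an unclean exit (non-zero exit code, signal, timeout).
Restart=on-failure
# Time to sleep before restarting (systemd default).
RestartSec=100ms
# Start rate limiting: at most 5 starts within 10 seconds (systemd defaults).
StartLimitInterval=10s
StartLimitBurst=5
```

Leaving out the last three settings gives the same behaviour, so they
only need to appear if we agree on values that differ from the defaults.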

Gaudenz

[1] http://www.freedesktop.org/software/systemd/man/systemd.service.html
