Thanks for responding :-)

I am not a Ceph expert either.

Steps to reproduce:
Short version: On a running Ceph cluster with at least one OSD blacklist
entry: install ceph-iscsi.

Long version:
Installing Ceph is complicated, so here is a way to do it with Ansible. I know
this is a lot, but believe me: this is the EASY way.

Normally you would install a Ceph cluster on at least 3 servers, but below is
described how to do it on a single machine.

Fresh install of Ubuntu 20.04.1 LTS server. It can be in a VM, but make sure
you have 3 extra disks attached and at least 4GB RAM. Below I assume the
install is on sda, and that sdb, sdc and sdd are the attached blank disks
(I have used 10GB for each).
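Before continuing you can check that the three blank disks are visible; sdb,
sdc and sdd should show up without partitions in:
thw@ff-ceph-4:~$ lsblk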
Make sure install is up-to-date:
thw@ff-ceph-4:~$ sudo apt update
thw@ff-ceph-4:~$ sudo apt dist-upgrade
thw@ff-ceph-4:~$ sudo reboot

Here I have the default admin user thw, and I assume the hostname is
ff-ceph-4. Make sure ff-ceph-4 resolves to the external IP address of the
local machine, e.g. by adding it to /etc/hosts.
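For example, if the machine's external address is 192.168.122.10 (a made-up
address I will reuse below; substitute your own), the /etc/hosts line would
be:

192.168.122.10    ff-ceph-4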

Make sure user thw can sudo any command without a password, by running sudo
visudo and adding the line:
thw    ALL=(ALL) NOPASSWD:ALL

thw@ff-ceph-4:~$ sudo apt install ansible git

Create user ansible with password ansible:
thw@ff-ceph-4:~$ sudo adduser ansible

thw@ff-ceph-4:~$ su - ansible

Get the ceph-ansible playbooks:
ansible@ff-ceph-4:~$ git clone https://github.com/ceph/ceph-ansible.git
ansible@ff-ceph-4:~$ cd ceph-ansible
ansible@ff-ceph-4:~/ceph-ansible$ git checkout stable-5.0

Copy the attached all.yml to the group_vars dir in ceph-ansible. You can diff
it against all.yml.sample to see what I have changed; I advise you to do
this. Make sure the monitor_interface: line names your network interface, and
that public_network: is the network that interface is on.
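As an illustration, with a hypothetical interface ens3 on a 192.168.122.0/24
network, those two lines in all.yml would read:

monitor_interface: ens3
public_network: 192.168.122.0/24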
Copy the attached inventory to the current dir (/home/ansible/ceph-ansible).
NOTE: I could only attach ONE file, so this will be attached in a new comment
below.
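As a rough sketch of what to expect (the real file is the one attached
below), a single-host ceph-ansible inventory looks something like this:

[mons]
ff-ceph-4

[mgrs]
ff-ceph-4

[osds]
ff-ceph-4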

Make sure user ansible can log in to the thw account with an SSH key, then
copy the site playbook and run it (still from the ceph-ansible directory):
ansible@ff-ceph-4:~/ceph-ansible$ ssh-keygen
ansible@ff-ceph-4:~/ceph-ansible$ ssh-copy-id thw@ff-ceph-4
ansible@ff-ceph-4:~/ceph-ansible$ cp site.yml.sample site.yml
ansible@ff-ceph-4:~/ceph-ansible$ ansible-playbook -i inventory site.yml

This will take a few minutes to run.
If something goes wrong, see if you can fix it, and re-run the
ansible-playbook command.

At the end you should hopefully have a running ceph-"cluster".
Go back to the thw user (or add the ansible user to the sudoers list) and
run:
thw@ff-ceph-4:~$ sudo ceph -s

The third line of the output should read:
    health: HEALTH_OK
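For reference, the top of the ceph -s output looks roughly like this (the
fsid will differ):

  cluster:
    id:     <your-fsid>
    health: HEALTH_OK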
    
To reproduce the bug, you should have some blacklist entries. I have them on a 
new install - I don't know why. Check with:
thw@ff-ceph-4:~$ sudo ceph osd blacklist ls
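If there are any, the output looks something like this (address, nonce and
expiry time here are just examples):

192.168.122.10:0/3710147553 2021-01-05T12:00:00.000000+0000
listed 1 entries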

If there are entries listed, fine.
If not, create an entry with:
thw@ff-ceph-4:~$ sudo ceph osd blacklist add <ip-address-of-your-host>
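For example, with the made-up address from above (the default expiry of 3600
seconds is plenty of time for this test):
thw@ff-ceph-4:~$ sudo ceph osd blacklist add 192.168.122.10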

Now:
thw@ff-ceph-4:~$ sudo ceph osd pool create rbd
thw@ff-ceph-4:~$ sudo rbd pool init rbd
thw@ff-ceph-4:~$ sudo apt install ceph-iscsi

You should now see the exceptions in journalctl.
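If the traceback comes from the rbd-target-api service that ceph-iscsi ships
(that is where I would expect it), you can view it with:
thw@ff-ceph-4:~$ sudo journalctl -u rbd-target-api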


** Attachment added: "variables for ansible playbook"
   https://bugs.launchpad.net/ubuntu/+source/ceph-iscsi/+bug/1909399/+attachment/5449256/+files/all.yml
