On 18/06/14 11:58, Jan Kalcic wrote:
Hi all,

I am able to manually deploy a new ceph cluster by successfully bootstrapping the first monitor:

  # ceph -s
      cluster 926daa03-5e59-4ae1-a0bd-401a227e74c7
       health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
       monmap e1: 1 mons at {linux-c904=172.17.43.101:6789/0}, election epoch 2, quorum 0 linux-c904
       osdmap e1: 0 osds: 0 up, 0 in
        pgmap v2: 192 pgs, 3 pools, 0 bytes data, 0 objects
              0 kB used, 0 kB / 0 kB avail
                   192 creating

However, after rebooting the system, or if I try to restart the monitor, I always get the same seg fault:

  # /etc/init.d/ceph -v start mon.linux-c904
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "user"
  === mon.linux-c904 ===
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "run dir"
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "pid file"
  --- linux-c904# mkdir -p /var/run/ceph
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "log dir"
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "auto start"
  --- linux-c904# [ -e /var/run/ceph/mon.linux-c904.pid ] || exit 1  # no pid, presumably not running
  pid=`cat /var/run/ceph/mon.linux-c904.pid`
  [ -e /proc/$pid ] && grep -q ceph-mon /proc/$pid/cmdline && grep -qwe -i.linux-c904 /proc/$pid/cmdline && exit 0  # running
  exit 1  # pid is something else
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "copy executable to"
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "lock file"
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "admin socket"
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "max open files"
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "restart on core dump"
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "valgrind"
  Starting Ceph mon.linux-c904 on linux-c904...
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "pre start eval"
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "pre start command"
  /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.linux-c904 "post start command"
  --- linux-c904# ulimit -n 32768; /usr/bin/ceph-mon -i linux-c904 --pid-file /var/run/ceph/mon.linux-c904.pid -c /etc/ceph/ceph.conf --cluster ceph
  *** Caught signal (Segmentation fault) **
   in thread 7f2028f0a780
   ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
   1: /usr/bin/ceph-mon() [0x89419d]
   2: (()+0xf7c0) [0x7f202887e7c0]
   3: (()+0x61c0) [0x7f2026b621c0]
   4: (_ULx86_64_step()+0x9) [0x7f2026b632a9]
   5: (()+0x393a5) [0x7f2028ac53a5]
   6: (GetStackTrace(void**, int, int)+0xe) [0x7f2028ac4d1e]
   7: (tcmalloc::PageHeap::GrowHeap(unsigned long)+0x10f) [0x7f2028ab4b5f]
   8: (tcmalloc::PageHeap::New(unsigned long)+0xbb) [0x7f2028ab52ab]
   9: (tcmalloc::CentralFreeList::Populate()+0x7b) [0x7f2028ab30ab]
   10: (tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**)+0x58) [0x7f2028ab32e8]
   11: (tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)+0x8b) [0x7f2028ab33ab]
   12: (tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long)+0x69) [0x7f2028ab7719]
   13: (()+0x1994b) [0x7f2028aa594b]
   14: (tc_new()+0x18) [0x7f2028ac63c8]
   15: (std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&)+0x59) [0x7f202762f6a9]
   16: (std::string::_M_mutate(unsigned long, unsigned long, unsigned long)+0x63) [0x7f202762f8a3]
   17: (std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long)+0x2c) [0x7f202762fa3c]
   18: (leveldb::DBImpl::RecoverLogFile(unsigned long, leveldb::VersionEdit*, unsigned long*)+0x425) [0x7f20278950b5]
   19: (leveldb::DBImpl::Recover(leveldb::VersionEdit*)+0x655) [0x7f2027895a45]
   20: (leveldb::DB::Open(leveldb::Options const&, std::string const&, leveldb::DB**)+0xeb) [0x7f2027895dfb]
   21: (LevelDBStore::do_open(std::ostream&, bool)+0x10d) [0x840b7d]
   22: (main()+0x14a5) [0x53d035]
   23: (__libc_start_main()+0xe6) [0x7f2026d88c36]
   24: /usr/bin/ceph-mon() [0x53a209]

By cleaning everything up and starting over, the monitor starts successfully again:

  # rm -rf /var/lib/ceph/mon/ceph-linux-c904/*
  # rm /tmp/ceph.mon.keyring
  # rm /etc/ceph/ceph.client.admin.keyring
  # ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
  # ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow'
  # ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
  # monmaptool --create --add linux-c904 172.17.43.101 --fsid 926daa03-5e59-4ae1-a0bd-401a227e74c7 /tmp/monmap
  # ceph-mon --mkfs -i linux-c904 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
  # /etc/init.d/ceph start mon.linux-c904
  === mon.linux-c904 ===
  Starting Ceph mon.linux-c904 on linux-c904...
  Starting ceph-create-keys on linux-c904...

The following is the content of my ceph.conf file:

  [global]
  fsid = 926daa03-5e59-4ae1-a0bd-401a227e74c7
  mon initial members = linux-c904
  mon host = 172.17.43.101
  auth cluster required = cephx
  auth service required = cephx
  auth client required = cephx
  osd journal size = 1024
  filestore xattr use omap = true
  osd pool default size = 2
  osd pool default min size = 1
  osd pool default pg num = 333
  osd pool default pgp num = 333
  osd crush chooseleaf type = 1

What's wrong with it?
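[Editor's note: the init script's "is the monitor already running?" logic quoted in the verbose output above can be sketched as a standalone shell function. The function name and the /tmp test path below are illustrative only, not part of the actual init script.]

```shell
#!/bin/sh
# Sketch of the init script's pid-file check: the daemon counts as
# running only if the pid file exists, the pid is alive, and
# /proc/<pid>/cmdline names ceph-mon started with "-i.<name>".
mon_running() {
    pidfile=$1
    name=$2
    [ -e "$pidfile" ] || return 1        # no pid file, presumably not running
    pid=$(cat "$pidfile")
    [ -e "/proc/$pid" ] || return 1      # stale pid file
    grep -q ceph-mon "/proc/$pid/cmdline" &&
        grep -qwe "-i.$name" "/proc/$pid/cmdline"
}

# Example: a pid file pointing at this shell is a live pid, but the
# cmdline match fails, so the check correctly reports "not running".
echo $$ > /tmp/mon.example.pid
if mon_running /tmp/mon.example.pid linux-c904; then
    echo running
else
    echo "not running"
fi
```

Note the check only decides whether to start the daemon; it does not explain the segfault, which happens later inside the ceph-mon process itself.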
What's your leveldb and google-perftools/tcmalloc version?

  -Joao

-- 
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
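[Editor's note: one way to gather the version information Joao asks for is to inspect which shared libraries the ceph-mon binary actually links. The package names in the comments vary by distribution and are assumptions; the `ldd` approach needs no package manager at all.]

```shell
#!/bin/sh
# Sketch: list the shared libraries the dynamic linker would load for a
# binary, so the tcmalloc/leveldb versions in play can be identified.
linked_libs() {
    ldd "$1" 2>/dev/null | awk '{print $1}'
}

# On the affected host (binary path taken from the trace above):
#   linked_libs /usr/bin/ceph-mon | grep -E 'tcmalloc|leveldb'
# and, depending on the distro (package names are assumptions):
#   rpm -q gperftools leveldb               # SUSE/RHEL-style
#   dpkg -l 'libtcmalloc*' 'libleveldb*'    # Debian/Ubuntu-style

# Self-contained demonstration against a binary any Linux host has:
linked_libs /bin/sh
```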
