1) Mark the OSD as out
[root@ceph1:/home/s1]# ceph osd out osd.1
marked out osd.1.
2) The cluster runs recovery:
2017-06-03 01:54:21.596632 mon.0 [INF] osdmap e90: 4 osds: 4 up, 3 in
2017-06-03 01:54:21.608675 mon.0 [INF] pgmap v565: 256 pgs: 256 active+clean; 1422 MB data, 2833 MB used, 12493 MB / 15326 MB avail
2017-06-03 01:54:26.352909 mon.0 [INF] pgmap v566: 256 pgs: 1 active, 255 active+clean; 1422 MB data, 2979 MB used, 12347 MB / 15326 MB avail; 2/40 objects degraded (5.000%); 51033 B/s, 0 objects/s recovering
2017-06-03 01:54:28.624334 mon.0 [INF] pgmap v567: 256 pgs: 4 active, 252 active+clean; 1422 MB data, 3427 MB used, 11899 MB / 15326 MB avail; 8/40 objects degraded (20.000%); 51053 B/s, 0 objects/s recovering
2017-06-03 01:54:31.320973 mon.0 [INF] pgmap v568: 256 pgs: 3 active, 253 active+clean; 1422 MB data, 3539 MB used, 11787 MB / 15326 MB avail; 6/40 objects degraded (15.000%); 19414 kB/s, 0 objects/s recovering
2017-06-03 01:54:32.323443 mon.0 [INF] pgmap v569: 256 pgs: 256 active+clean; 1422 MB data, 3730 MB used, 11595 MB / 15326 MB avail; 77801 kB/s, 0 objects/s recovering
2017-06-03 01:56:10.949077 mon.0 [INF] pgmap v570: 256 pgs: 256 active+clean; 1422 MB data, 3730 MB used, 11595 MB / 15326 MB avail
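Log lines like these can be followed live with ceph -w; a one-shot view of recovery progress (pg states and degraded object counts) is also available from ceph -s:
ceph -w    # stream the cluster log, including the pgmap updates shown above
ceph -s    # print a one-time cluster status summary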
3) After recovery completes, the OSD's state is still up, which means its daemon is still running. Now stop the service.
[root@ceph1:/home/s1]# ssh ceph2 service ceph stop osd.1
/etc/init.d/ceph: osd.1 not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )
The command errors out; osd.1 needs to be added to ceph.conf. Add the following to ceph.conf on ceph1:
[osd]
[osd.1]
host = ceph2
[osd.2]
host = ceph1
[osd.3]
host = ceph2
[osd.0]
host = ceph1
Then run ceph-deploy --overwrite-conf config push ceph2 to copy it to ceph2, and restart all the OSD services. Something strange happened:
[root@ceph1:/etc/ceph]# ceph osd tree
# id weight type name up/down reweight
-1 4 root default
-2 4 host ceph1
0 1 osd.0 up 1
2 1 osd.2 up 1
1 1 osd.1 up 0
3 1 osd.3 up 1
-3 0 host ceph2
osd.1 and osd.3 ended up under the ceph1 node! Looking at what the start command does, it had updated osd.1's host in the crush map to ceph1:
[root@ceph1:/etc/ceph]# /etc/init.d/ceph -a start osd
=== osd.1 ===
df: ‘/var/lib/ceph/osd/ceph-1/.’: No such file or directory
create-or-move updating item name 'osd.1' weight 1 at location {host=ceph1,root=default} to crush map
Starting Ceph osd.1 on ceph2...
starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
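Judging from the create-or-move line above, the init script effectively runs something like the following on whichever node it executes on, which is why osd.1 was re-placed under host=ceph1; this is a reconstruction from the log output, not the literal script contents:
ceph osd crush create-or-move osd.1 1 root=default host=ceph1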
It turns out this is actually a Ceph bug: "make osd crush placement on startup handle multiple trees (e.g., ssd + sas)". The same issue is discussed in "OSD location reset after restart". At present Ceph has no mechanism to guarantee that the CRUSH map structure stays unchanged; the simplest workaround is to set osd crush update on start = false in the [osd] section of ceph.conf.
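For reference, the workaround is just this snippet in ceph.conf (it would need to be pushed to every OSD node and takes effect the next time the OSD daemons start):
[osd]
osd crush update on start = false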
Trying to move osd.1 and osd.3 back manually:
[root@ceph1:/etc/ceph]# ceph osd crush remove osd.1
removed item id 1 name 'osd.1' from crush map
[root@ceph1:/etc/ceph]# ceph osd crush remove osd.3
removed item id 3 name 'osd.3' from crush map
[root@ceph1:/etc/ceph]# ceph osd tree
# id weight type name up/down reweight
-1 2 root default
-2 2 host ceph1
0 1 osd.0 up 1
2 1 osd.2 up 1
-3 0 host ceph2
1 0 osd.1 up 0
3 0 osd.3 up 1
[root@ceph1:/etc/ceph]# ceph osd crush set 1 1 root=default host=ceph2
Error ENOENT: unable to set item id 1 name 'osd.1' weight 1 at location {host=ceph2,root=default}: does not exist
The cause of this error was not tracked down at the time (most likely crush set only updates an item that already exists in the crush map, and osd.1 had just been removed from it). Instead, the crush map was edited directly, and the correct layout came back; the edit commands are sketched after the tree output below:
[root@ceph1:/etc/ceph]# ceph osd tree
# id weight type name up/down reweight
-1 2 root default
-2 2 host ceph1
0 1 osd.0 up 1
2 1 osd.2 up 1
-3 0 host ceph2
1 1 osd.1 up 0
3 1 osd.3 up 1
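The direct edit mentioned above is the usual decompile/edit/recompile cycle. A sketch of the commands (crushmap.bin and crushmap.txt are just working file names chosen here):
ceph osd getcrushmap -o crushmap.bin        # dump the current compiled crush map
crushtool -d crushmap.bin -o crushmap.txt   # decompile it into editable text
# edit crushmap.txt: put osd.1 and osd.3 back under host ceph2 with weight 1
crushtool -c crushmap.txt -o crushmap.new   # recompile
ceph osd setcrushmap -i crushmap.new        # inject the fixed map
Alternatively, ceph osd crush add osd.1 1 root=default host=ceph2 (add rather than set) would probably have re-inserted the just-removed item without hand-editing the map.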
Next, ssh ceph2 /etc/init.d/ceph stop osd.1 was run to stop the osd.1 service, but it refused to stop. Reportedly, OSD services deployed with ceph-deploy cannot be stopped this way, so the only remaining option was to kill the process.
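Since the init script would not stop it, the ceph-osd daemon has to be located and killed by hand on ceph2. A minimal sketch (the exact ceph-osd command line may differ, so check the ps output before killing anything):
ssh ceph2 "ps aux | grep '[c]eph-osd -i 1'"   # find the osd.1 daemon and its PID
ssh ceph2 "pkill -f 'ceph-osd -i 1'"          # then terminate it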
Then continue:
[root@ceph1:/etc/ceph]# ceph osd crush remove osd.1
removed item id 1 name 'osd.1' from crush map
[root@ceph1:/etc/ceph]# ceph auth del osd.1
updated
[root@ceph1:/etc/init]# ceph osd rm osd.1
removed osd.1
At this point, osd.1 no longer appears in the osd tree:
[root@ceph1:/etc/ceph]# ceph osd tree
# id weight type name up/down reweight
-1 3 root default
-2 2 host ceph1
0 1 osd.0 up 1
2 1 osd.2 up 1
-3 1 host ceph2
3 1 osd.3 up 1