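Passwords for users created in vCenter expire after 90 days by default

In a vSphere 8.0 environment, a new user created on vCenter has a password lifetime of 90 days unless the default settings are changed.

To disable expiration only for special-purpose accounts, such as a new user created solely for backups, you need to log in to the VCSA virtual machine over ssh and change the setting there.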

Documentation: vSphere IaaS Control Plane 7.0, "dir-cli Command Reference"

(1) Log in to the VCSA virtual machine over ssh

Connect with ssh and log in as the root user.

(2) Switch to shell mode

After logging in you get a "Command>" prompt.
Type "shell" there to make UNIX commands available.

Connected to service

    * List APIs: "help api list"
    * List Plugins: "help pi list"
    * Launch BASH: "shell"

Command> shell
Shell access is granted to root
root@vcsa [ ~ ]#

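(3) Check the current account state with the dir-cli command

/usr/lib/vmware-vmafd/bin/dir-cli user find-by-name --account <account name> --level 2
If "Password never expires:" is "FALSE", password expiration is enabled and the password will expire once the time shown in "Password expiry" has elapsed.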

root@vcsa [ ~ ]# /usr/lib/vmware-vmafd/bin/dir-cli user find-by-name --account backupuser --level 2
Enter password for administrator@vsphere.local:
Account: backupuser
UPN: backupuser@VSPHERE.LOCAL
Account disabled: FALSE
Account locked: FALSE
Password never expires: FALSE
Password expired: FALSE
Password expiry: 874 day(s) 18 hour(s) 17 minute(s) 48 second(s)
root@vcsa [ ~ ]#

(4) Disable password expiration with the dir-cli command

/usr/lib/vmware-vmafd/bin/dir-cli user modify --account <account name> --password-never-expires

root@vcsa [ ~ ]# /usr/lib/vmware-vmafd/bin/dir-cli user modify --account backupuser --password-never-expires
Enter password for administrator@vsphere.local:
Password set to never expire for [backupuser].
root@vcsa [ ~ ]#

(5) Confirm with the dir-cli command that the account state has changed

/usr/lib/vmware-vmafd/bin/dir-cli user find-by-name --account <account name> --level 2
If "Password never expires:" is "TRUE", password expiration is disabled.
"Password expiry: N/A" shows that no expiration date is set either.

root@vcsa [ ~ ]# /usr/lib/vmware-vmafd/bin/dir-cli user find-by-name --account backupuser --level 2
Enter password for administrator@vsphere.local:
Account: backupuser
UPN: backupuser@VSPHERE.LOCAL
Account disabled: FALSE
Account locked: FALSE
Password never expires: TRUE
Password expired: FALSE
Password expiry: N/A
root@vcsa [ ~ ]#
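The check in step (5) can be scripted rather than read by eye. The following is only a minimal sketch: the function name password_never_expires is hypothetical, and it assumes the output keeps the "Field: value" layout shown in the transcripts above.

```shell
# Hypothetical helper: reads the output of
#   dir-cli user find-by-name --account <account name> --level 2
# on stdin and prints the value of the "Password never expires" field.
password_never_expires() {
    awk -F': *' '$1 == "Password never expires" { print $2 }'
}

# Example using output copied from the transcript above.
sample='Account: backupuser
UPN: backupuser@VSPHERE.LOCAL
Account disabled: FALSE
Account locked: FALSE
Password never expires: TRUE
Password expired: FALSE
Password expiry: N/A'

printf '%s\n' "$sample" | password_never_expires
```

On a real appliance the dir-cli output would be piped into the function instead of the sample text.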

Incidentally, when run without the "--level 2" option, only the following information is shown.

root@vcsa [ ~ ]# /usr/lib/vmware-vmafd/bin/dir-cli user find-by-name --account backupuser
Enter password for administrator@vsphere.local:
Account: backupuser
UPN: backupuser@VSPHERE.LOCAL
root@vcsa [ ~ ]#

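Trying to build Ceph on two servers (unresolved)

When I built a Proxmox VE cluster from two Proxmox VE servers plus a corosync qdevice server (three servers in total), I wondered whether Ceph could be set up on it, so I tried it.

Each Proxmox VE server was created with 6 CPU cores, 16 GB of memory and a 120 GB system disk, and runs three 16 GB disks as Ceph storage.

For now, it is working.

One monitor and one manager were configured on each server.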

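However, if one server is stopped, Ceph becomes unusable.

Even with the setting described later that makes both nodes hold the same data evenly, Ceph needs a majority in its own voting, and that cannot be achieved without one more node (a virtual one would do), so I am still wondering what to do.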
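The majority requirement can be put into numbers. This is a minimal sketch with hypothetical function names, assuming a strict majority of monitors (floor(n/2) + 1) has to be up for the cluster to respond.

```shell
# Hypothetical helpers for monitor quorum arithmetic.

# Monitors that must be up for a strict majority: floor(n/2) + 1.
mon_majority() {
    echo $(( $1 / 2 + 1 ))
}

# Monitors that can fail while a majority still remains.
mon_failures_tolerated() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3; do
    echo "mons=$n majority=$(mon_majority "$n") tolerated=$(mon_failures_tolerated "$n")"
done
```

With two monitors both must be up, so stopping either server leaves no majority; a third monitor is the smallest change that tolerates one failure.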

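For now it is good enough for checking how virtual machines behave when placed on Ceph storage, so I am leaving it as it is.

That said, with two nodes the data is mirrored, so less than half of the allocated disk capacity is usable, whereas by my calculation three nodes would make roughly 1/2 to 2/3 usable (for example with 2-copy replication or 2+1 erasure coding; the default 3-copy replication would give only 1/3), so I can't help thinking that configuration might have been the better choice.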
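The capacity fractions can be worked out with simple arithmetic. The sketch below uses hypothetical function names and ignores metadata overhead and the free-space margin Ceph keeps, so real usable space will be lower. The 96 GiB figure is the raw capacity of this test cluster (six 16 GB OSDs).

```shell
# Hypothetical helpers; rough upper bounds only.

# usable_gib RAW_GIB COPIES: replicated pool keeping COPIES copies of each object.
usable_gib() {
    awk -v raw="$1" -v copies="$2" 'BEGIN { printf "%.1f\n", raw / copies }'
}

# ec_usable_gib RAW_GIB K M: erasure-coded pool with K data and M coding chunks.
ec_usable_gib() {
    awk -v raw="$1" -v k="$2" -v m="$3" 'BEGIN { printf "%.1f\n", raw * k / (k + m) }'
}

echo "size=2 replication: $(usable_gib 96 2) GiB"
echo "size=3 replication: $(usable_gib 96 3) GiB"
echo "2+1 erasure coding: $(ec_usable_gib 96 2 1) GiB"
```

Two copies give 1/2 of the raw capacity, three copies give 1/3, and a 2+1 erasure-coded layout gives 2/3.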

Checking the details

First, run "ceph health" and "ceph health detail" to check.

root@proxmoxa:~# ceph health
HEALTH_WARN clock skew detected on mon.proxmoxb; Degraded data redundancy: 28/90 objects degraded (31.111%), 25 pgs degraded, 128 pgs undersized
root@proxmoxa:~# ceph health detail
HEALTH_WARN clock skew detected on mon.proxmoxb; Degraded data redundancy: 39/123 objects degraded (31.707%), 35 pgs degraded, 128 pgs undersized
[WRN] MON_CLOCK_SKEW: clock skew detected on mon.proxmoxb
    mon.proxmoxb clock skew 0.305298s > max 0.05s (latency 0.00675958s)
[WRN] PG_DEGRADED: Degraded data redundancy: 39/123 objects degraded (31.707%), 35 pgs degraded, 128 pgs undersized
    pg 2.0 is stuck undersized for 26m, current state active+undersized, last acting [3,1]
    pg 2.1 is stuck undersized for 26m, current state active+undersized, last acting [2,5]
    pg 2.2 is stuck undersized for 26m, current state active+undersized, last acting [5,1]
    pg 2.3 is stuck undersized for 26m, current state active+undersized, last acting [5,2]
    pg 2.4 is stuck undersized for 26m, current state active+undersized+degraded, last acting [1,4]
    pg 2.5 is stuck undersized for 26m, current state active+undersized, last acting [3,0]
    pg 2.6 is stuck undersized for 26m, current state active+undersized, last acting [1,3]
    pg 2.7 is stuck undersized for 26m, current state active+undersized+degraded, last acting [3,2]
    pg 2.8 is stuck undersized for 26m, current state active+undersized, last acting [3,0]
    pg 2.9 is stuck undersized for 26m, current state active+undersized, last acting [1,4]
    pg 2.a is stuck undersized for 26m, current state active+undersized+degraded, last acting [1,4]
    pg 2.b is stuck undersized for 26m, current state active+undersized, last acting [3,0]
    pg 2.c is stuck undersized for 26m, current state active+undersized, last acting [2,3]
    pg 2.d is stuck undersized for 26m, current state active+undersized, last acting [1,3]
    pg 2.e is stuck undersized for 26m, current state active+undersized+degraded, last acting [2,3]
    pg 2.f is stuck undersized for 26m, current state active+undersized, last acting [4,0]
    pg 2.10 is stuck undersized for 26m, current state active+undersized, last acting [2,4]
    pg 2.11 is stuck undersized for 26m, current state active+undersized, last acting [4,1]
    pg 2.1c is stuck undersized for 26m, current state active+undersized+degraded, last acting [4,2]
    pg 2.1d is stuck undersized for 26m, current state active+undersized, last acting [3,0]
    pg 2.1e is stuck undersized for 26m, current state active+undersized+degraded, last acting [2,5]
    pg 2.1f is stuck undersized for 26m, current state active+undersized+degraded, last acting [0,3]
    pg 2.20 is stuck undersized for 26m, current state active+undersized+degraded, last acting [5,1]
    pg 2.21 is stuck undersized for 26m, current state active+undersized, last acting [2,4]
    pg 2.22 is stuck undersized for 26m, current state active+undersized, last acting [3,2]
    pg 2.23 is stuck undersized for 26m, current state active+undersized, last acting [0,3]
    pg 2.24 is stuck undersized for 26m, current state active+undersized, last acting [5,1]
    pg 2.25 is stuck undersized for 26m, current state active+undersized, last acting [4,1]
    pg 2.26 is stuck undersized for 26m, current state active+undersized, last acting [5,2]
    pg 2.27 is stuck undersized for 26m, current state active+undersized, last acting [3,0]
    pg 2.28 is stuck undersized for 26m, current state active+undersized, last acting [2,3]
    pg 2.29 is stuck undersized for 26m, current state active+undersized+degraded, last acting [3,1]
    pg 2.2a is stuck undersized for 26m, current state active+undersized, last acting [5,0]
    pg 2.2b is stuck undersized for 26m, current state active+undersized, last acting [2,4]
    pg 2.2c is stuck undersized for 26m, current state active+undersized, last acting [2,5]
    pg 2.2d is stuck undersized for 26m, current state active+undersized, last acting [5,2]
    pg 2.2e is stuck undersized for 26m, current state active+undersized+degraded, last acting [5,0]
    pg 2.2f is stuck undersized for 26m, current state active+undersized+degraded, last acting [5,0]
    pg 2.30 is stuck undersized for 26m, current state active+undersized+degraded, last acting [4,0]
    pg 2.31 is stuck undersized for 26m, current state active+undersized, last acting [0,5]
    pg 2.32 is stuck undersized for 26m, current state active+undersized, last acting [5,1]
    pg 2.33 is stuck undersized for 26m, current state active+undersized, last acting [3,1]
    pg 2.34 is stuck undersized for 26m, current state active+undersized+degraded, last acting [5,0]
    pg 2.35 is stuck undersized for 26m, current state active+undersized, last acting [1,3]
    pg 2.36 is stuck undersized for 26m, current state active+undersized, last acting [1,4]
    pg 2.37 is stuck undersized for 26m, current state active+undersized, last acting [3,1]
    pg 2.38 is stuck undersized for 26m, current state active+undersized+degraded, last acting [0,5]
    pg 2.39 is stuck undersized for 26m, current state active+undersized, last acting [1,5]
    pg 2.7d is stuck undersized for 26m, current state active+undersized, last acting [0,4]
    pg 2.7e is stuck undersized for 26m, current state active+undersized+degraded, last acting [0,4]
    pg 2.7f is stuck undersized for 26m, current state active+undersized+degraded, last acting [4,1]
root@proxmoxa:~#

Next, "ceph -s".

root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            clock skew detected on mon.proxmoxb
            Degraded data redundancy: 120/366 objects degraded (32.787%), 79 pgs degraded, 128 pgs undersized

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 27m)
    mgr: proxmoxa(active, since 34m), standbys: proxmoxb
    osd: 6 osds: 6 up (since 28m), 6 in (since 29m); 1 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 122 objects, 436 MiB
    usage:   1.0 GiB used, 95 GiB / 96 GiB avail
    pgs:     120/366 objects degraded (32.787%)
             2/366 objects misplaced (0.546%)
             79 active+undersized+degraded
             49 active+undersized
             1  active+clean+remapped

  io:
    client:   15 KiB/s rd, 8.4 MiB/s wr, 17 op/s rd, 13 op/s wr

root@proxmoxa:~#

clock skew detected

First, look into "clock skew detected".

[WRN] MON_CLOCK_SKEW: clock skew detected on mon.proxmoxb
    mon.proxmoxb clock skew 0.305298s > max 0.05s (latency 0.00675958s)

As described under "MON_CLOCK_SKEW", this means the clocks of the servers differ.

mon_clock_drift_allowed defaults to 0.05 seconds, and the warning is raised because the measured value is "mon.proxmoxb clock skew 0.305298s".
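The comparison behind the warning is just "measured skew greater than allowed drift". A minimal sketch, with the hypothetical function name skew_exceeds and both values given in seconds:

```shell
# Hypothetical helper: succeeds (exit status 0) when the measured skew
# is larger than the allowed drift. awk is used because the shell's own
# arithmetic does not handle fractions.
skew_exceeds() {
    awk -v skew="$1" -v allowed="$2" 'BEGIN { exit !(skew + 0 > allowed + 0) }'
}

# Values taken from the health output above.
if skew_exceeds 0.305298 0.05; then
    echo "warning: clock skew above allowed drift"
fi
```

The same skew checked against a limit of 0.5 does not trigger, which is why raising mon_clock_drift_allowed hides the warning without fixing the clocks.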

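This test environment runs on the free edition of ESXi 8.0, so the skew is probably caused by delays from a lack of overall processing power, and I will ignore it.

To ignore it through configuration, use the "ceph config" command as described in "Ceph Configuration" on the Proxmox wiki.

Check the current value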

root@proxmoxa:~# ceph config get mon mon_clock_drift_allowed
0.050000
root@proxmoxa:~#

Change the setting; this time I will go with about 0.5.

root@proxmoxa:~# ceph config set mon mon_clock_drift_allowed 0.5
root@proxmoxa:~# ceph config get mon mon_clock_drift_allowed
0.500000
root@proxmoxa:~#

Confirmed that the message has disappeared.

It is also gone when running the command.

root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 427/1287 objects degraded (33.178%), 124 pgs degraded, 128 pgs undersized

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 40m)
    mgr: proxmoxa(active, since 47m), standbys: proxmoxb
    osd: 6 osds: 6 up (since 41m), 6 in (since 41m); 1 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 429 objects, 1.6 GiB
    usage:   3.4 GiB used, 93 GiB / 96 GiB avail
    pgs:     427/1287 objects degraded (33.178%)
             2/1287 objects misplaced (0.155%)
             124 active+undersized+degraded
             4   active+undersized
             1   active+clean+remapped

  io:
    client:   685 KiB/s rd, 39 KiB/s wr, 7 op/s rd, 2 op/s wr

root@proxmoxa:~#

Confirmed that it is gone from ceph health detail as well.

root@proxmoxa:~# ceph health
HEALTH_WARN Degraded data redundancy: 426/1284 objects degraded (33.178%), 124 pgs degraded, 128 pgs undersized
root@proxmoxa:~# ceph health detail
HEALTH_WARN Degraded data redundancy: 422/1272 objects degraded (33.176%), 124 pgs degraded, 128 pgs undersized
[WRN] PG_DEGRADED: Degraded data redundancy: 422/1272 objects degraded (33.176%), 124 pgs degraded, 128 pgs undersized
    pg 2.0 is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,1]
    pg 2.1 is stuck undersized for 39m, current state active+undersized+degraded, last acting [2,5]
    pg 2.2 is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,1]
    pg 2.3 is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,2]
    pg 2.4 is stuck undersized for 39m, current state active+undersized+degraded, last acting [1,4]
    pg 2.5 is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,0]
    pg 2.6 is stuck undersized for 39m, current state active+undersized+degraded, last acting [1,3]
    pg 2.7 is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,2]
    pg 2.8 is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,0]
    pg 2.9 is stuck undersized for 39m, current state active+undersized+degraded, last acting [1,4]
    pg 2.a is stuck undersized for 39m, current state active+undersized+degraded, last acting [1,4]
    pg 2.b is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,0]
    pg 2.c is stuck undersized for 39m, current state active+undersized+degraded, last acting [2,3]
    pg 2.d is stuck undersized for 39m, current state active+undersized+degraded, last acting [1,3]
    pg 2.e is stuck undersized for 39m, current state active+undersized+degraded, last acting [2,3]
    pg 2.f is stuck undersized for 39m, current state active+undersized+degraded, last acting [4,0]
    pg 2.10 is stuck undersized for 39m, current state active+undersized+degraded, last acting [2,4]
    pg 2.11 is active+undersized+degraded, acting [4,1]
    pg 2.1c is stuck undersized for 39m, current state active+undersized+degraded, last acting [4,2]
    pg 2.1d is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,0]
    pg 2.1e is stuck undersized for 39m, current state active+undersized+degraded, last acting [2,5]
    pg 2.1f is stuck undersized for 39m, current state active+undersized+degraded, last acting [0,3]
    pg 2.20 is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,1]
    pg 2.21 is stuck undersized for 39m, current state active+undersized+degraded, last acting [2,4]
    pg 2.22 is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,2]
    pg 2.23 is stuck undersized for 39m, current state active+undersized+degraded, last acting [0,3]
    pg 2.24 is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,1]
    pg 2.25 is stuck undersized for 39m, current state active+undersized+degraded, last acting [4,1]
    pg 2.26 is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,2]
    pg 2.27 is stuck undersized for 39m, current state active+undersized, last acting [3,0]
    pg 2.28 is stuck undersized for 39m, current state active+undersized+degraded, last acting [2,3]
    pg 2.29 is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,1]
    pg 2.2a is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,0]
    pg 2.2b is stuck undersized for 39m, current state active+undersized+degraded, last acting [2,4]
    pg 2.2c is stuck undersized for 39m, current state active+undersized+degraded, last acting [2,5]
    pg 2.2d is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,2]
    pg 2.2e is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,0]
    pg 2.2f is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,0]
    pg 2.30 is stuck undersized for 39m, current state active+undersized+degraded, last acting [4,0]
    pg 2.31 is stuck undersized for 39m, current state active+undersized+degraded, last acting [0,5]
    pg 2.32 is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,1]
    pg 2.33 is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,1]
    pg 2.34 is stuck undersized for 39m, current state active+undersized+degraded, last acting [5,0]
    pg 2.35 is stuck undersized for 39m, current state active+undersized+degraded, last acting [1,3]
    pg 2.36 is stuck undersized for 39m, current state active+undersized+degraded, last acting [1,4]
    pg 2.37 is stuck undersized for 39m, current state active+undersized+degraded, last acting [3,1]
    pg 2.38 is stuck undersized for 39m, current state active+undersized+degraded, last acting [0,5]
    pg 2.39 is stuck undersized for 39m, current state active+undersized+degraded, last acting [1,5]
    pg 2.7d is stuck undersized for 39m, current state active+undersized+degraded, last acting [0,4]
    pg 2.7e is stuck undersized for 39m, current state active+undersized+degraded, last acting [0,4]
    pg 2.7f is stuck undersized for 39m, current state active+undersized+degraded, last acting [4,1]
root@proxmoxa:~#

PG_DEGRADED: Degraded data redundancy

Now to investigate the warning that shows up in large numbers.

First, look up PG_DEGRADED...

No OSDs are down, so it does not look helpful.

For now, check the states that seem related.

root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 803/2415 objects degraded (33.251%), 128 pgs degraded, 128 pgs undersized

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 89m)
    mgr: proxmoxa(active, since 96m), standbys: proxmoxb
    osd: 6 osds: 6 up (since 90m), 6 in (since 91m); 1 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 805 objects, 3.1 GiB
    usage:   6.4 GiB used, 90 GiB / 96 GiB avail
    pgs:     803/2415 objects degraded (33.251%)
             2/2415 objects misplaced (0.083%)
             128 active+undersized+degraded
             1   active+clean+remapped

  io:
    client:   20 KiB/s rd, 13 MiB/s wr, 15 op/s rd, 29 op/s wr

root@proxmoxa:~#
root@proxmoxa:~# ceph osd pool stats
pool .mgr id 1
  2/6 objects misplaced (33.333%)

pool cephpool id 2
  826/2478 objects degraded (33.333%)
  client io 14 KiB/s rd, 8.4 MiB/s wr, 14 op/s rd, 15 op/s wr

root@proxmoxa:~#
root@proxmoxa:~# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 18 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 6.06
pool 2 'cephpool' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 39 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_bytes 21474836480 application rbd read_balance_score 1.41
        removed_snaps_queue [2~1]

root@proxmoxa:~#

The current cephpool was created with pg_num=128 and pgp_num=128.

Take a look at the autoscale settings.

root@proxmoxa:~# ceph osd pool autoscale-status
POOL        SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.mgr      452.0k                3.0        98280M  0.0000                                  1.0       1              on         False
cephpool   2234M       20480M   3.0        98280M  0.6252                                  1.0     128              on         False
root@proxmoxa:~#

The article "How I Built a 2-Node HA Proxmox Cluster with Ceph, Podman, and a Raspberry Pi (Yes, It Works)" seems to cover what I want to do.

That page runs "ceph config set osd osd_default_size 2" and "ceph config set osd osd_default_min_size 1", but checking with ceph config get shows that these keys do not seem to exist.

root@proxmoxa:~# ceph config get osd osd_default_size
Error ENOENT: unrecognized key 'osd_default_size'
root@proxmoxa:~# ceph config get osd osd_default_min_size
Error ENOENT: unrecognized key 'osd_default_min_size'
root@proxmoxa:~#

Just in case, I checked whether they could be set anyway, but that resulted in an error.

root@proxmoxa:~# ceph config set osd osd_default_size 2
Error EINVAL: unrecognized config option 'osd_default_size'
root@proxmoxa:~# ceph config set osd osd_default_min_size 1
Error EINVAL: unrecognized config option 'osd_default_min_size'
root@proxmoxa:~#

osd_pool_default_size and osd_pool_default_min_size do exist, so I decided to set those instead.

root@proxmoxa:~# ceph config get osd osd_pool_default_size
3
root@proxmoxa:~# ceph config get osd osd_pool_default_min_size
0
root@proxmoxa:~#
root@proxmoxa:~# ceph config set osd osd_pool_default_size 2
root@proxmoxa:~# ceph config set osd osd_pool_default_min_size 1
root@proxmoxa:~# ceph config get osd osd_pool_default_size
2
root@proxmoxa:~# ceph config get osd osd_pool_default_min_size
1
root@proxmoxa:~#

The state does not appear to have changed.

root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 885/2661 objects degraded (33.258%), 128 pgs degraded, 128 pgs undersized

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 2h)
    mgr: proxmoxa(active, since 2h), standbys: proxmoxb
    osd: 6 osds: 6 up (since 2h), 6 in (since 2h); 1 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 887 objects, 3.4 GiB
    usage:   7.1 GiB used, 89 GiB / 96 GiB avail
    pgs:     885/2661 objects degraded (33.258%)
             2/2661 objects misplaced (0.075%)
             128 active+undersized+degraded
             1   active+clean+remapped

root@proxmoxa:~# ceph osd pool autoscale-status
POOL        SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.mgr      452.0k                3.0        98280M  0.0000                                  1.0       1              on         False
cephpool   2319M       20480M   3.0        98280M  0.6252                                  1.0     128              on         False
root@proxmoxa:~# ceph osd pool stats
pool .mgr id 1
  2/6 objects misplaced (33.333%)

pool cephpool id 2
  885/2655 objects degraded (33.333%)
  client io 170 B/s wr, 0 op/s rd, 0 op/s wr

root@proxmoxa:~# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 18 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 6.06
pool 2 'cephpool' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 39 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_bytes 21474836480 application rbd read_balance_score 1.41
        removed_snaps_queue [2~1]

root@proxmoxa:~#

Check the size and min_size of each pool with the ceph osd pool get command.

root@proxmoxa:~# ceph osd pool get cephpool size
size: 3
root@proxmoxa:~# ceph osd pool get cephpool min_size
min_size: 2
root@proxmoxa:~#
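The same values also appear in the "ceph osd pool ls detail" output shown earlier, so they can be pulled out for all pools at once. This is a rough sketch with a hypothetical function name; it assumes the "pool N 'name' replicated size X min_size Y ..." line layout seen in the transcripts and pool names without spaces.

```shell
# Hypothetical helper: reads `ceph osd pool ls detail` output on stdin
# and prints one line per pool with its size and min_size.
pool_sizes() {
    awk -v q="'" '$1 == "pool" {
        name = $3
        gsub(q, "", name)              # strip the quotes around the pool name
        size = ""; min = ""
        for (i = 4; i < NF; i++) {
            if ($i == "size")     size = $(i + 1)
            if ($i == "min_size") min  = $(i + 1)
        }
        print name, "size=" size, "min_size=" min
    }'
}

# Example with a shortened line from the transcript above.
printf "pool 2 'cephpool' replicated size 3 min_size 2 crush_rule 0\n" | pool_sizes
```

On the cluster itself the input would come from piping the real command output into pool_sizes.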

Change the settings.

root@proxmoxa:~# ceph osd pool set cephpool size 2
set pool 2 size to 2
root@proxmoxa:~# ceph osd pool set cephpool min_size 1
set pool 2 min_size to 1
root@proxmoxa:~# ceph osd pool get cephpool size
size: 2
root@proxmoxa:~# ceph osd pool get cephpool min_size
min_size: 1
root@proxmoxa:~#

Checking the state, ceph health now reports HEALTH_OK.

root@proxmoxa:~# ceph health
HEALTH_OK
root@proxmoxa:~# ceph health detail
HEALTH_OK
root@proxmoxa:~#

Checking the other status outputs as well, everything looks fine.

root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_OK

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 2h)
    mgr: proxmoxa(active, since 2h), standbys: proxmoxb
    osd: 6 osds: 6 up (since 2h), 6 in (since 2h); 1 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 887 objects, 3.4 GiB
    usage:   7.1 GiB used, 89 GiB / 96 GiB avail
    pgs:     2/1776 objects misplaced (0.113%)
             128 active+clean
             1   active+clean+remapped

  io:
    recovery: 1.3 MiB/s, 0 objects/s

root@proxmoxa:~# ceph osd pool autoscale-status
POOL        SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.mgr      452.0k                3.0        98280M  0.0000                                  1.0       1              on         False
cephpool   3479M       20480M   2.0        98280M  0.4168                                  1.0     128              on         False
root@proxmoxa:~# ceph osd pool stats
pool .mgr id 1
  2/6 objects misplaced (33.333%)

pool cephpool id 2
  nothing is going on

root@proxmoxa:~# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 18 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 6.06
pool 2 'cephpool' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 42 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_bytes 21474836480 application rbd read_balance_score 1.17

root@proxmoxa:~#

The GUI also shows HEALTH_OK.

Failure test

What happens when one side is stopped?

The PVE cluster side is still alive.

root@proxmoxa:~# ha-manager status
quorum OK
master proxmoxa (active, Wed Jan 21 17:43:39 2026)
lrm proxmoxa (active, Wed Jan 21 17:43:40 2026)
lrm proxmoxb (old timestamp - dead?, Wed Jan 21 17:43:08 2026)
service vm:100 (proxmoxa, started)
root@proxmoxa:~# pvecm status
Cluster information
-------------------
Name:             cephcluster
Config Version:   3
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Wed Jan 21 17:44:11 2026
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000001
Ring ID:          1.3b
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.2.64 (local)
0x00000000          1            Qdevice
root@proxmoxa:~#

However, Ceph's status is dead.

Running the "ceph health" command gets no response.

root@proxmoxa:~# ceph health

That does not look good, so I brought the stopped node back.

Change size and min_size for the .mgr pool as well.

root@proxmoxa:~# ceph osd pool ls
.mgr
cephpool
root@proxmoxa:~# ceph osd pool get .mgr size
size: 3
root@proxmoxa:~# ceph osd pool get .mgr min_size
min_size: 2
root@proxmoxa:~# ceph osd pool set .mgr size 2
set pool 1 size to 2
root@proxmoxa:~# ceph osd pool set .mgr min_size 1
set pool 1 min_size to 1
root@proxmoxa:~# ceph osd pool get .mgr size
size: 2
root@proxmoxa:~# ceph osd pool get .mgr min_size
min_size: 1
root@proxmoxa:~#

Then, as in the blog post mentioned earlier, the mon parameters are to be changed too, so check the current values first.

root@proxmoxa:~# ceph config get mon mon_osd_min_down_reporters
2
root@proxmoxa:~# ceph config get mon mon_osd_down_out_interval
600
root@proxmoxa:~# ceph config get mon mon_osd_report_timeout
900
root@proxmoxa:~#

Change each of them.

root@proxmoxa:~# ceph config set mon mon_osd_min_down_reporters 1
root@proxmoxa:~# ceph config set mon mon_osd_down_out_interval 120
root@proxmoxa:~# ceph config set mon mon_osd_report_timeout 90
root@proxmoxa:~# ceph config get mon mon_osd_min_down_reporters
1
root@proxmoxa:~# ceph config get mon mon_osd_down_out_interval
120
root@proxmoxa:~# ceph config get mon mon_osd_report_timeout
90
root@proxmoxa:~#

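... ceph -s still stops responding, just as before.

How should the third node needed to keep Ceph alive be created?

The section "Faking a Third Node with a Containerized MON" in the article mentioned earlier describes starting a third ceph mon as a container.

Following the Proxmox VE forum thread "3rd Ceph MON on external QDevice (Podman) – 4-node / 2-site cluster" to the Proxmox VE wiki page "Stretch Cluster", I found that it says to set up a "tie-breaker node" rather than a ceph mon.

The Stretch Cluster page also sets the pool size and min_size changed earlier to size=4 and min_size=2, so that two replicas are guaranteed on each of the two nodes.

For now, change the size/min_size of the pools.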

root@proxmoxa:~# ceph osd pool ls
.mgr
cephpool
root@proxmoxa:~# ceph osd pool get .mgr size
size: 2
root@proxmoxa:~# ceph osd pool get .mgr min_size
min_size: 1
root@proxmoxa:~# ceph osd pool set .mgr size 4
set pool 1 size to 4
root@proxmoxa:~# ceph osd pool set .mgr min_size 2
set pool 1 min_size to 2
root@proxmoxa:~# ceph osd pool get .mgr size
size: 4
root@proxmoxa:~# ceph osd pool get .mgr min_size
min_size: 2
root@proxmoxa:~# ceph osd pool get cephpool size
size: 2
root@proxmoxa:~# ceph osd pool get cephpool min_size
min_size: 1
root@proxmoxa:~# ceph osd pool set cephpool size 4
set pool 2 size to 4
root@proxmoxa:~# ceph osd pool set cephpool min_size 2
set pool 2 min_size to 2
root@proxmoxa:~# ceph osd pool get cephpool size
size: 4
root@proxmoxa:~# ceph osd pool get cephpool min_size
min_size: 2
root@proxmoxa:~#

In this state, ceph osd pool stats shows 50.000% where it had shown 33.333% before.

root@proxmoxa:~# ceph osd pool stats
pool .mgr id 1
  4/8 objects degraded (50.000%)

pool cephpool id 2
  1770/3540 objects degraded (50.000%)

root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 1774/3548 objects degraded (50.000%), 129 pgs degraded, 129 pgs undersized

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 6h)
    mgr: proxmoxb(active, since 6h), standbys: proxmoxa
    osd: 6 osds: 6 up (since 6h), 6 in (since 24h)

  data:
    pools:   2 pools, 129 pgs
    objects: 887 objects, 3.4 GiB
    usage:   7.0 GiB used, 89 GiB / 96 GiB avail
    pgs:     1774/3548 objects degraded (50.000%)
             129 active+undersized+degraded

root@proxmoxa:~#
root@proxmoxa:~# ceph health
HEALTH_WARN Degraded data redundancy: 1774/3548 objects degraded (50.000%), 129 pgs degraded, 129 pgs undersized
root@proxmoxa:~# ceph health detail
HEALTH_WARN Degraded data redundancy: 1774/3548 objects degraded (50.000%), 129 pgs degraded, 129 pgs undersized
[WRN] PG_DEGRADED: Degraded data redundancy: 1774/3548 objects degraded (50.000%), 129 pgs degraded, 129 pgs undersized
    pg 1.0 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,0]
    pg 2.0 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,1]
    pg 2.1 is stuck undersized for 5m, current state active+undersized+degraded, last acting [2,5]
    pg 2.2 is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,1]
    pg 2.3 is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,2]
    pg 2.4 is stuck undersized for 5m, current state active+undersized+degraded, last acting [1,4]
    pg 2.5 is stuck undersized for 5m, current state active+undersized+degraded, last acting [4,0]
    pg 2.6 is stuck undersized for 5m, current state active+undersized+degraded, last acting [1,3]
    pg 2.7 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,2]
    pg 2.8 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,0]
    pg 2.9 is stuck undersized for 5m, current state active+undersized+degraded, last acting [1,4]
    pg 2.a is stuck undersized for 5m, current state active+undersized+degraded, last acting [1,4]
    pg 2.b is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,0]
    pg 2.c is stuck undersized for 5m, current state active+undersized+degraded, last acting [2,3]
    pg 2.d is stuck undersized for 5m, current state active+undersized+degraded, last acting [2,3]
    pg 2.e is stuck undersized for 5m, current state active+undersized+degraded, last acting [2,3]
    pg 2.f is stuck undersized for 5m, current state active+undersized+degraded, last acting [4,0]
    pg 2.10 is active+undersized+degraded, acting [2,4]
    pg 2.1c is stuck undersized for 5m, current state active+undersized+degraded, last acting [4,2]
    pg 2.1d is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,0]
    pg 2.1e is stuck undersized for 5m, current state active+undersized+degraded, last acting [2,5]
    pg 2.1f is stuck undersized for 5m, current state active+undersized+degraded, last acting [0,3]
    pg 2.20 is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,1]
    pg 2.21 is stuck undersized for 5m, current state active+undersized+degraded, last acting [2,4]
    pg 2.22 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,2]
    pg 2.23 is stuck undersized for 5m, current state active+undersized+degraded, last acting [0,3]
    pg 2.24 is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,1]
    pg 2.25 is stuck undersized for 5m, current state active+undersized+degraded, last acting [4,2]
    pg 2.26 is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,2]
    pg 2.27 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,0]
    pg 2.28 is stuck undersized for 5m, current state active+undersized+degraded, last acting [2,3]
    pg 2.29 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,1]
    pg 2.2a is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,0]
    pg 2.2b is stuck undersized for 5m, current state active+undersized+degraded, last acting [2,4]
    pg 2.2c is stuck undersized for 5m, current state active+undersized+degraded, last acting [2,5]
    pg 2.2d is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,2]
    pg 2.2e is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,0]
    pg 2.2f is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,0]
    pg 2.30 is stuck undersized for 5m, current state active+undersized+degraded, last acting [4,0]
    pg 2.31 is stuck undersized for 5m, current state active+undersized+degraded, last acting [0,5]
    pg 2.32 is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,1]
    pg 2.33 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,1]
    pg 2.34 is stuck undersized for 5m, current state active+undersized+degraded, last acting [5,0]
    pg 2.35 is stuck undersized for 5m, current state active+undersized+degraded, last acting [1,3]
    pg 2.36 is stuck undersized for 5m, current state active+undersized+degraded, last acting [1,4]
    pg 2.37 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,1]
    pg 2.38 is stuck undersized for 5m, current state active+undersized+degraded, last acting [0,5]
    pg 2.39 is stuck undersized for 5m, current state active+undersized+degraded, last acting [1,5]
    pg 2.7d is stuck undersized for 5m, current state active+undersized+degraded, last acting [0,4]
    pg 2.7e is stuck undersized for 5m, current state active+undersized+degraded, last acting [0,4]
    pg 2.7f is stuck undersized for 5m, current state active+undersized+degraded, last acting [4,1]
root@proxmoxa:~#
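To follow progress without reading every line, the undersized PGs in a listing like the one above can be counted with a small filter. This is only a minimal sketch: the function name count_undersized_pgs is made up for illustration, and it assumes the "pg <id> is ... undersized ..." line format shown above, which looks like ceph health detail output.

```shell
#!/bin/sh
# Hypothetical helper: count "pg ..." lines on stdin that mention "undersized".
# Assumes the line format shown in the listing above.
count_undersized_pgs() {
    awk '$1 == "pg" && /undersized/ { n++ } END { print n + 0 }'
}

# Example usage (needs a running cluster):
#   ceph health detail | count_undersized_pgs
```

Running it before and after each change gives a single number to compare instead of a long list.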

At this point, ceph osd tree showed the following state:

root@proxmoxa:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.09357  root default
-5         0.04678      host proxmoxa
 3    ssd  0.01559          osd.3          up   1.00000  1.00000
 4    ssd  0.01559          osd.4          up   1.00000  1.00000
 5    ssd  0.01559          osd.5          up   1.00000  1.00000
-3         0.04678      host proxmoxb
 0    ssd  0.01559          osd.0          up   1.00000  1.00000
 1    ssd  0.01559          osd.1          up   1.00000  1.00000
 2    ssd  0.01559          osd.2          up   1.00000  1.00000
root@proxmoxa:~#

Next, create two CRUSH buckets of type room (room1 and room2).

root@proxmoxa:~# ceph osd crush add-bucket room1 room
added bucket room1 type room to crush map
root@proxmoxa:~# ceph osd crush add-bucket room2 room
added bucket room2 type room to crush map
root@proxmoxa:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-8               0  room room2
-7               0  room room1
-1         0.09357  root default
-5         0.04678      host proxmoxa
 3    ssd  0.01559          osd.3          up   1.00000  1.00000
 4    ssd  0.01559          osd.4          up   1.00000  1.00000
 5    ssd  0.01559          osd.5          up   1.00000  1.00000
-3         0.04678      host proxmoxb
 0    ssd  0.01559          osd.0          up   1.00000  1.00000
 1    ssd  0.01559          osd.1          up   1.00000  1.00000
 2    ssd  0.01559          osd.2          up   1.00000  1.00000
root@proxmoxa:~#

Then, presumably, move them under the default root:

root@proxmoxa:~# ceph osd crush move room1 root=default
moved item id -7 name 'room1' to location {root=default} in crush map
root@proxmoxa:~# ceph osd crush move room2 root=default
moved item id -8 name 'room2' to location {root=default} in crush map
root@proxmoxa:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.09357  root default
-5         0.04678      host proxmoxa
 3    ssd  0.01559          osd.3          up   1.00000  1.00000
 4    ssd  0.01559          osd.4          up   1.00000  1.00000
 5    ssd  0.01559          osd.5          up   1.00000  1.00000
-3         0.04678      host proxmoxb
 0    ssd  0.01559          osd.0          up   1.00000  1.00000
 1    ssd  0.01559          osd.1          up   1.00000  1.00000
 2    ssd  0.01559          osd.2          up   1.00000  1.00000
-7               0      room room1
-8               0      room room2
root@proxmoxa:~#

Next, move each node into its own room.

root@proxmoxa:~# ceph osd crush move proxmoxa room=room1
moved item id -5 name 'proxmoxa' to location {room=room1} in crush map
root@proxmoxa:~# ceph osd crush move proxmoxb room=room2
moved item id -3 name 'proxmoxb' to location {room=room2} in crush map
root@proxmoxa:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME              STATUS  REWEIGHT  PRI-AFF
-1         0.09357  root default
-7         0.04678      room room1
-5         0.04678          host proxmoxa
 3    ssd  0.01559              osd.3          up   1.00000  1.00000
 4    ssd  0.01559              osd.4          up   1.00000  1.00000
 5    ssd  0.01559              osd.5          up   1.00000  1.00000
-8         0.04678      room room2
-3         0.04678          host proxmoxb
 0    ssd  0.01559              osd.0          up   1.00000  1.00000
 1    ssd  0.01559              osd.1          up   1.00000  1.00000
 2    ssd  0.01559              osd.2          up   1.00000  1.00000
root@proxmoxa:~#
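The bucket steps above can be collected into a small dry-run script. This is a sketch under assumptions: plan_room_move and plan_rooms are hypothetical helper names, and the script only prints the ceph osd crush commands already used in this article so they can be reviewed before anything is executed. The order is per host rather than per step, which should give the same tree.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the CRUSH commands that place one host
# in one room under a given root. Host and room names come from the caller.
plan_room_move() {
    host=$1 room=$2 root=${3:-default}
    echo "ceph osd crush add-bucket $room room"
    echo "ceph osd crush move $room root=$root"
    echo "ceph osd crush move $host room=$room"
}

# Print the plan for the two-node layout used in this article.
plan_rooms() {
    plan_room_move proxmoxa room1
    plan_room_move proxmoxb room2
}

# Review the output of plan_rooms first; to apply it: plan_rooms | sh -x
```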

Create the CRUSH rule. First, export the current CRUSH map:

root@proxmoxa:~# ceph osd getcrushmap > crush.map.bin
25
root@proxmoxa:~# ls -l crush.map.bin
-rw-r--r-- 1 root root 1104 Jan 22 16:31 crush.map.bin
root@proxmoxa:~# crushtool -d crush.map.bin -o crush.map.txt
root@proxmoxa:~# ls -l crush.map*
-rw-r--r-- 1 root root 1104 Jan 22 16:31 crush.map.bin
-rw-r--r-- 1 root root 1779 Jan 22 16:31 crush.map.txt
root@proxmoxa:~#

crush.map.bin is a binary file, so crushtool -d was used to decompile it into a text file (crush.map.txt).

The current contents were as follows:

root@proxmoxa:~# cat crush.map.txt
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class ssd
device 1 osd.1 class ssd
device 2 osd.2 class ssd
device 3 osd.3 class ssd
device 4 osd.4 class ssd
device 5 osd.5 class ssd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 zone
type 10 region
type 11 root

# buckets
host proxmoxa {
        id -5           # do not change unnecessarily
        id -6 class ssd         # do not change unnecessarily
        # weight 0.04678
        alg straw2
        hash 0  # rjenkins1
        item osd.3 weight 0.01559
        item osd.4 weight 0.01559
        item osd.5 weight 0.01559
}
room room1 {
        id -7           # do not change unnecessarily
        id -10 class ssd                # do not change unnecessarily
        # weight 0.04678
        alg straw2
        hash 0  # rjenkins1
        item proxmoxa weight 0.04678
}
host proxmoxb {
        id -3           # do not change unnecessarily
        id -4 class ssd         # do not change unnecessarily
        # weight 0.04678
        alg straw2
        hash 0  # rjenkins1
        item osd.0 weight 0.01559
        item osd.1 weight 0.01559
        item osd.2 weight 0.01559
}
room room2 {
        id -8           # do not change unnecessarily
        id -9 class ssd         # do not change unnecessarily
        # weight 0.04678
        alg straw2
        hash 0  # rjenkins1
        item proxmoxb weight 0.04678
}
root default {
        id -1           # do not change unnecessarily
        id -2 class ssd         # do not change unnecessarily
        # weight 0.09357
        alg straw2
        hash 0  # rjenkins1
        item room1 weight 0.04678
        item room2 weight 0.04678
}

# rules
rule replicated_rule {
        id 0
        type replicated
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map
root@proxmoxa:~#

Add replicated_stretch_rule at the end of the text file. For the id, look at the rules already in the text and use the next unused number.

root@proxmoxa:~# cp crush.map.txt crush-new.map.txt
root@proxmoxa:~# vi crush-new.map.txt
root@proxmoxa:~# diff -u crush.map.txt crush-new.map.txt
--- crush.map.txt       2026-01-22 16:31:44.837979276 +0900
+++ crush-new.map.txt   2026-01-22 16:35:12.146622553 +0900
@@ -87,3 +87,13 @@
 }

 # end crush map
+
+rule replicated_stretch_rule {
+        id 1
+        type replicated
+        step take default
+        step choose firstn 0 type room
+        step chooseleaf firstn 2 type host
+        step emit
+}
+
root@proxmoxa:~#
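Picking "the next id" by eye is easy with one rule but gets error-prone with several. A minimal sketch that reads the decompiled map and prints the next free rule id follows; next_rule_id is a hypothetical helper name, and it assumes the text layout shown in crush.map.txt above (an "id N" line inside each "rule <name> { ... }" block).

```shell
#!/bin/sh
# Hypothetical helper: print the next unused rule id in a decompiled CRUSH map.
# Only "id N" lines inside "rule <name> { ... }" blocks are considered, so the
# negative bucket ids of hosts, rooms and roots are ignored.
next_rule_id() {
    awk '
        BEGIN { max = -1 }
        $1 == "rule" { in_rule = 1 }
        in_rule && $1 == "id" && $2 + 0 > max { max = $2 + 0 }
        $1 == "}" { in_rule = 0 }
        END { print max + 1 }
    ' "$1"
}

# Example usage: next_rule_id crush.map.txt
```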

Compile the edited file and load it into Ceph.

root@proxmoxa:~# ls -l
total 12
-rw-r--r-- 1 root root 1104 Jan 22 16:31 crush.map.bin
-rw-r--r-- 1 root root 1779 Jan 22 16:31 crush.map.txt
-rw-r--r-- 1 root root 1977 Jan 22 16:35 crush-new.map.txt
root@proxmoxa:~# crushtool -c crush-new.map.txt -o crush-new.map.bin
root@proxmoxa:~# ls -l
total 16
-rw-r--r-- 1 root root 1104 Jan 22 16:31 crush.map.bin
-rw-r--r-- 1 root root 1779 Jan 22 16:31 crush.map.txt
-rw-r--r-- 1 root root 1195 Jan 22 16:36 crush-new.map.bin
-rw-r--r-- 1 root root 1977 Jan 22 16:35 crush-new.map.txt
root@proxmoxa:~# ceph osd setcrushmap -i crush-new.map.bin
26
root@proxmoxa:~#
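It may be worth checking how many OSDs the new rule can actually return. crushtool can simulate placements from a compiled map with --test, and also has a --show-bad-mappings option. The sketch below filters the --show-mappings output for placements shorter than the wanted replica count. short_mappings is a hypothetical helper, and the assumption that each mapping line ends in a bracketed OSD list such as "[3,0]" should be verified against real output. With only one host in each room, a rule that picks hosts inside rooms can probably return at most two OSDs, which would match the two-element acting sets in the listings.

```shell
#!/bin/sh
# Hypothetical helper: from mapping lines that end in a bracketed OSD list
# such as "... [3,0]", print those that contain fewer than $1 OSDs.
# The exact line format of crushtool output is an assumption here.
short_mappings() {
    awk -v want="$1" '
        match($0, /\[[0-9,]*\]$/) {
            list = substr($0, RSTART + 1, RLENGTH - 2)
            n = (list == "") ? 0 : split(list, osds, ",")
            if (n < want) print
        }
    '
}

# Example usage (rule id 1 and 4 replicas, as in this article):
#   crushtool -i crush-new.map.bin --test --rule 1 --num-rep 4 --show-mappings \
#       | short_mappings 4
```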

The new CRUSH rule now appears in the rule list:

root@proxmoxa:~# ceph osd crush rule ls
replicated_rule
replicated_stretch_rule
root@proxmoxa:~#

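The osd tree itself appears unchanged (the weights differ only in the last digit, presumably from rounding in the text round trip):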

root@proxmoxa:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME              STATUS  REWEIGHT  PRI-AFF
-1         0.09354  root default
-7         0.04677      room room1
-5         0.04677          host proxmoxa
 3    ssd  0.01558              osd.3          up   1.00000  1.00000
 4    ssd  0.01558              osd.4          up   1.00000  1.00000
 5    ssd  0.01558              osd.5          up   1.00000  1.00000
-8         0.04677      room room2
-3         0.04677          host proxmoxb
 0    ssd  0.01558              osd.0          up   1.00000  1.00000
 1    ssd  0.01558              osd.1          up   1.00000  1.00000
 2    ssd  0.01558              osd.2          up   1.00000  1.00000
root@proxmoxa:~#
root@proxmoxa:~# ceph health
HEALTH_WARN Degraded data redundancy: 448/3548 objects degraded (12.627%), 32 pgs degraded, 91 pgs undersized
root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 448/3548 objects degraded (12.627%), 32 pgs degraded, 91 pgs undersized

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 7h)
    mgr: proxmoxb(active, since 7h), standbys: proxmoxa
    osd: 6 osds: 6 up (since 7h), 6 in (since 25h); 97 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 887 objects, 3.4 GiB
    usage:   10 GiB used, 86 GiB / 96 GiB avail
    pgs:     448/3548 objects degraded (12.627%)
             1276/3548 objects misplaced (35.964%)
             59 active+undersized+remapped
             34 active+clean+remapped
             32 active+undersized+degraded
             4  active+clean

root@proxmoxa:~#

The data is not being moved?

As before, try changing pgp_num from 128 to 32.

root@proxmoxa:~# ceph osd pool stats
pool .mgr id 1
  4/8 objects misplaced (50.000%)

pool cephpool id 2
  448/3540 objects degraded (12.655%)
  1272/3540 objects misplaced (35.932%)

root@proxmoxa:~# ceph osd pool get cephpool pgp_num
pgp_num: 128
root@proxmoxa:~# ceph osd pool set cephpool pgp_num 32
set pool 2 pgp_num to 32
root@proxmoxa:~# ceph osd pool get cephpool pgp_num
pgp_num: 128
root@proxmoxa:~#

The value does not change; pgp_num still reports 128.

root@proxmoxa:~# ceph osd pool stats
pool .mgr id 1
  4/8 objects misplaced (50.000%)

pool cephpool id 2
  448/3540 objects degraded (12.655%)
  1272/3540 objects misplaced (35.932%)
  client io 170 B/s wr, 0 op/s rd, 0 op/s wr

root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 448/3548 objects degraded (12.627%), 32 pgs degraded, 91 pgs undersized
            1 pools have pg_num > pgp_num

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 7h)
    mgr: proxmoxb(active, since 7h), standbys: proxmoxa
    osd: 6 osds: 6 up (since 7h), 6 in (since 25h); 97 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 887 objects, 3.4 GiB
    usage:   10 GiB used, 86 GiB / 96 GiB avail
    pgs:     448/3548 objects degraded (12.627%)
             1276/3548 objects misplaced (35.964%)
             59 active+undersized+remapped
             34 active+clean+remapped
             32 active+undersized+degraded
             4  active+clean

  io:
    client:   170 B/s wr, 0 op/s rd, 0 op/s wr

root@proxmoxa:~#

Since "1 pools have pg_num > pgp_num" is now reported, try changing pg_num as well?

root@proxmoxa:~# ceph osd pool set cephpool pg_num 32
set pool 2 pg_num to 32
root@proxmoxa:~#
root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 448/3548 objects degraded (12.627%), 32 pgs degraded, 91 pgs undersized

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 7h)
    mgr: proxmoxb(active, since 7h), standbys: proxmoxa
    osd: 6 osds: 6 up (since 7h), 6 in (since 25h); 97 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 887 objects, 3.4 GiB
    usage:   10 GiB used, 86 GiB / 96 GiB avail
    pgs:     448/3548 objects degraded (12.627%)
             1276/3548 objects misplaced (35.964%)
             59 active+undersized+remapped
             34 active+clean+remapped
             32 active+undersized+degraded
             4  active+clean

root@proxmoxa:~#

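I waited a while, but nothing changed.

Is the CRUSH rule actually being applied?

6.3. Verifying that the CRUSH rule has been created and that the pool is set to the correct CRUSH rule

Check the rule IDs of the current rules: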

root@proxmoxa:~# ceph osd crush rule dump | grep -E "rule_(id|name)"
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "rule_id": 1,
        "rule_name": "replicated_stretch_rule",
root@proxmoxa:~#

Check the rule ID actually set on the pool:

root@proxmoxa:~# ceph osd dump|grep cephpool
pool 2 'cephpool' replicated size 4 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 133 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_bytes 21474836480 application rbd read_balance_score 1.22
root@proxmoxa:~#

It says "crush_rule 0", so the pool is still using the old rule.

How to apply a CRUSH rule to an existing pool, from the Red Hat documentation:

root@proxmoxa:~# ceph osd pool get cephpool crush_rule
crush_rule: replicated_rule
root@proxmoxa:~# ceph osd pool set cephpool crush_rule replicated_stretch_rule
set pool 2 crush_rule to replicated_stretch_rule
root@proxmoxa:~# ceph osd pool get cephpool crush_rule
crush_rule: replicated_stretch_rule
root@proxmoxa:~# ceph osd dump|grep cephpool
pool 2 'cephpool' replicated size 4 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 134 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_bytes 21474836480 application rbd read_balance_score 1.22
root@proxmoxa:~#
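The before/after check on the pool line can be scripted instead of read by eye. A minimal sketch, assuming the single-line "pool <id> '<name>' ... key value ..." layout of ceph osd dump shown above; pool_field is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "ceph osd dump" text on stdin and print the value
# that follows a keyword (e.g. crush_rule, pg_num) on the line for one pool.
pool_field() {
    awk -v pool="'$1'" -v key="$2" '
        $1 == "pool" && $3 == pool {
            for (i = 4; i < NF; i++) if ($i == key) { print $(i + 1); exit }
        }
    '
}

# Example usage:
#   ceph osd dump | pool_field cephpool crush_rule
```

The same helper can read other fields on that line, such as pg_num, pgp_num or autoscale_mode.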

The rule was changed.

Hmm... the PGs are still undersized.

root@proxmoxa:~# ceph osd pool stats
pool .mgr id 1
  4/8 objects misplaced (50.000%)

pool cephpool id 2
  194/3540 objects degraded (5.480%)
  1562/3540 objects misplaced (44.124%)

root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 194/3548 objects degraded (5.468%), 14 pgs degraded, 83 pgs undersized

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 8h)
    mgr: proxmoxb(active, since 8h), standbys: proxmoxa
    osd: 6 osds: 6 up (since 8h), 6 in (since 26h); 115 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 887 objects, 3.4 GiB
    usage:   11 GiB used, 85 GiB / 96 GiB avail
    pgs:     194/3548 objects degraded (5.468%)
             1566/3548 objects misplaced (44.138%)
             69 active+undersized+remapped
             45 active+clean+remapped
             14 active+undersized+degraded
             1  active+clean

root@proxmoxa:~# ceph osd pool stats
pool .mgr id 1
  4/8 objects misplaced (50.000%)

pool cephpool id 2
  194/3540 objects degraded (5.480%)
  1562/3540 objects misplaced (44.124%)
  client io 1.4 KiB/s wr, 0 op/s rd, 0 op/s wr

root@proxmoxa:~#

The "Procedure: Increasing the PG count" section talks about changing pg_num and pgp_num, and the example there set pgp_num to 4, so I tried the same.

root@proxmoxa:~# ceph osd pool get cephpool pg_num
pg_num: 128
root@proxmoxa:~# ceph osd pool get cephpool pgp_num
pgp_num: 128
root@proxmoxa:~# ceph osd pool set cephpool pgp_num 4
set pool 2 pgp_num to 4
root@proxmoxa:~# ceph osd pool get cephpool pgp_num
pgp_num: 128
root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 194/3548 objects degraded (5.468%), 14 pgs degraded, 83 pgs undersized
            1 pools have pg_num > pgp_num

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 9h)
    mgr: proxmoxb(active, since 9h), standbys: proxmoxa
    osd: 6 osds: 6 up (since 9h), 6 in (since 27h); 115 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 887 objects, 3.4 GiB
    usage:   11 GiB used, 85 GiB / 96 GiB avail
    pgs:     194/3548 objects degraded (5.468%)
             1566/3548 objects misplaced (44.138%)
             69 active+undersized+remapped
             45 active+clean+remapped
             14 active+undersized+degraded
             1  active+clean

root@proxmoxa:~#

Now "1 pools have pg_num > pgp_num" is reported.

So, try setting pg_num to 4 as well.

root@proxmoxa:~# ceph osd pool set cephpool pg_num 4
set pool 2 pg_num to 4
root@proxmoxa:~# ceph osd pool get cephpool pg_num
pg_num: 128
root@proxmoxa:~# ceph -s
  cluster:
    id:     26b59237-5bed-45fe-906e-aa3b13033b86
    health: HEALTH_WARN
            Degraded data redundancy: 194/3548 objects degraded (5.468%), 14 pgs degraded, 83 pgs undersized

  services:
    mon: 2 daemons, quorum proxmoxa,proxmoxb (age 9h)
    mgr: proxmoxb(active, since 9h), standbys: proxmoxa
    osd: 6 osds: 6 up (since 9h), 6 in (since 27h); 115 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 887 objects, 3.4 GiB
    usage:   11 GiB used, 85 GiB / 96 GiB avail
    pgs:     194/3548 objects degraded (5.468%)
             1566/3548 objects misplaced (44.138%)
             69 active+undersized+remapped
             45 active+clean+remapped
             14 active+undersized+degraded
             1  active+clean

root@proxmoxa:~#

This does not seem to be related.

Fixing a Samba 4.23.3 Active Directory server whose functional level is stuck at 2008 R2

I wanted an Active Directory server on an ESXi 8 Free environment, so I built one on AlmaLinux 9 with samba 4.23.3 compiled from source.

# /usr/local/samba/bin/samba-tool domain provision --use-rfc2307 --interactive
Realm [ADSAMPLE.LOCAL]:
Domain [ADSAMPLE]:
Server Role (dc, member, standalone) [dc]:
DNS backend (SAMBA_INTERNAL, BIND9_FLATFILE, BIND9_DLZ, NONE) [SAMBA_INTERNAL]:
DNS forwarder IP address (write 'none' to disable forwarding) [8.8.8.8]:  8.8.8.8
Administrator password:
Retype password:
INFO 2025-11-10 14:24:37,370 pid:1551 /usr/local/samba/lib64/python3.9/site-packages/samba/provision/__init__.py #2112: Looking up IPv4 addresses
<snip>
INFO 2025-11-10 14:24:49,826 pid:1551 /usr/local/samba/lib64/python3.9/site-packages/samba/provision/__init__.py #501: DOMAIN SID:            S-1-5-21-1830428519-1651848948-1698044471
#

The forest and domain functional levels of the Active Directory server started this way were Windows 2008 R2, as shown below.

# samba-tool domain level show
Domain and forest function level for domain 'DC=adsample,DC=local'

Forest function level: (Windows) 2008 R2
Domain function level: (Windows) 2008 R2
Lowest function level of a DC: (Windows) 2008 R2
#

Trying to upgrade them with the samba-tool domain level raise command fails with errors.

# samba-tool domain level raise --forest-level=2012_R2
ERROR: Forest function level can't be higher than the domain function level(s). Please raise it/them first!
# samba-tool domain level raise --domain-level=2012_R2
ERROR: Domain function level can't be higher than the lowest function level of a DC!
#

This reportedly happens because, in the default Samba configuration, "ad dc functional level" is capped at 2008_R2 (reference: Samba domain controller: raising (all kinds of) level).

Run the testparm command to check the current value.

# /usr/local/samba/bin/testparm -s --section-name=global --parameter-name="ad dc functional level"
Load smb config files from /usr/local/samba/etc/smb.conf
Loaded services file OK.
Weak crypto is allowed by GnuTLS (e.g. NTLM as a compatibility fallback)

2008_R2
#

There is no such entry in the current /usr/local/samba/etc/smb.conf, but this confirms that Samba treats the value as 2008_R2.

Based on this result, add the line "ad dc functional level = 2016" to the [global] section of /usr/local/samba/etc/smb.conf.

# cat /usr/local/samba/etc/smb.conf
# Global parameters
[global]
        dns forwarder = 8.8.8.8
        netbios name = ADSERVER
        realm = ADSAMPLE.LOCAL
        server role = active directory domain controller
        workgroup = ADSAMPLE
        idmap_ldb:use rfc2307 = yes
        ad dc functional level = 2016

[sysvol]
        path = /usr/local/samba/var/locks/sysvol
        read only = No

[netlogon]
        path = /usr/local/samba/var/locks/sysvol/adsample.local/scripts
        read only = No
#

Confirm with testparm that the new setting is picked up.

# /usr/local/samba/bin/testparm -s --section-name=global --parameter-name="ad dc functional level"
Load smb config files from /usr/local/samba/etc/smb.conf
Loaded services file OK.
Weak crypto is allowed by GnuTLS (e.g. NTLM as a compatibility fallback)

2016
#

Restart Samba and check what happened to the functional levels.

# systemctl restart samba-ad-dc
# samba-tool domain level show
Domain and forest function level for domain 'DC=adsample,DC=local'

Forest function level: (Windows) 2008 R2
Domain function level: (Windows) 2008 R2
Lowest function level of a DC: (Windows) 2016
#

"Lowest function level of a DC" has changed, so the other two can now be raised.

First, raise the domain functional level.

# samba-tool domain level raise --domain-level=2012_R2
Domain function level changed!
All changes applied successfully!
# samba-tool domain level show
Domain and forest function level for domain 'DC=adsample,DC=local'

Forest function level: (Windows) 2008 R2
Domain function level: (Windows) 2012 R2
Lowest function level of a DC: (Windows) 2016
#

Then raise the forest functional level.

# samba-tool domain level raise --forest-level=2012_R2
Forest function level changed!
All changes applied successfully!
# samba-tool domain level show
Domain and forest function level for domain 'DC=adsample,DC=local'

Forest function level: (Windows) 2012 R2
Domain function level: (Windows) 2012 R2
Lowest function level of a DC: (Windows) 2016
#

With that, the problem is resolved.
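The order matters here: the DC level (via smb.conf) comes first, then the domain level, then the forest level. To check each value from a script rather than by eye, something like the following could be used. It is a sketch that assumes the samba-tool domain level show output format shown above; function_level is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: extract one level from "samba-tool domain level show"
# output on stdin. $1 is Forest, Domain, or Lowest.
function_level() {
    sed -n "s/^$1 function level[^:]*: (Windows) //p"
}

# Example usage:
#   samba-tool domain level show | function_level Domain
```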

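Notes on the iscsiadm command

When connecting iSCSI storage to HPE VM Essentials, there were many points where the behavior was unclear, and I ended up investigating from the CLI rather than the Web UI, so here are my notes.

Most of this applies to Linux in general.

Checking the connection

Check whether iSCSI sessions are established with "iscsiadm -m session".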

pcuser@hpevme6:~$ sudo iscsiadm -m session
tcp: [1] 192.168.3.34:3260,1029 iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3 (non-flash)
tcp: [2] 192.168.2.34:3260,1028 iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3 (non-flash)
pcuser@hpevme6:~$

When nothing is connected, the output is as follows:

pcuser@hpevme6:~$ sudo iscsiadm -m session
iscsiadm: No active sessions.
pcuser@hpevme6:~$

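For more detail, add the "-P <number>" option (long form: --print=<number>). 0, 1, 2, and 3 can be specified; "-P 0" is the same as omitting the option.

Levels 0 to 2 cover the target IP addresses, login information, and the like.
Level 3 additionally shows whether devices have been recognized, so "sudo iscsiadm -m session -P 3" is essential when troubleshooting.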

pcuser@hpevme6:~$ sudo iscsiadm -m session --print=3
iSCSI Transport Class version 2.0-870
version 2.1.9
Target: iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3 (non-flash)
        Current Portal: 192.168.3.34:3260,1029
        Persistent Portal: 192.168.3.34:3260,1029
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.2024-12.com.hpe:hpevme6:59012
                Iface IPaddress: 192.168.3.60
                Iface HWaddress: default
                Iface Netdev: default
                SID: 1
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 5
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 65536
                MaxBurstLength: 1048576
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 33 State: running
                scsi33 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdb          State: running
                scsi33 Channel 00 Id 0 Lun: 1
                        Attached scsi disk sdd          State: running
        Current Portal: 192.168.2.34:3260,1028
        Persistent Portal: 192.168.2.34:3260,1028
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.2024-12.com.hpe:hpevme6:59012
                Iface IPaddress: 192.168.2.60
                Iface HWaddress: default
                Iface Netdev: default
                SID: 2
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 5
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 65536
                MaxBurstLength: 1048576
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 34 State: running
                scsi34 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdc          State: running
                scsi34 Channel 00 Id 0 Lun: 1
                        Attached scsi disk sde          State: running
pcuser@hpevme6:~$
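Since the -P 3 output is long, the disk names can be pulled out with a short filter. A minimal sketch, assuming the "Attached scsi disk <name> State: ..." line format shown above; attached_disks is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list the disk names from "Attached scsi disk" lines
# in "iscsiadm -m session -P 3" output read on stdin.
attached_disks() {
    awk '/Attached scsi disk/ { print $4 }'
}

# Example usage:
#   sudo iscsiadm -m session -P 3 | attached_disks
```

An empty result while sessions are logged in suggests that no LUN is visible to this initiator.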

The point to check is whether scsi... entries appear after "Attached SCSI devices:".

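If there are none, access may not be permitted on the iSCSI storage side, so check its settings.

First, check the InitiatorName on the Linux side. On Linux it is written in /etc/iscsi/initiatorname.iscsi, and right after OS installation it is often set to a value such as "InitiatorName=iqn.2004-10.com.ubuntu:01:<random>".

On HPE VME, right after hpe-vm setup the name is the Ubuntu-style random one, but once an iSCSI connection is made from the Web UI it is switched to a form built from the hostname and a random value, like the following.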

pcuser@hpevme6:~$ sudo cat /etc/iscsi/initiatorname.iscsi
## DO NOT EDIT OR REMOVE THIS FILE!
## If you remove this file, the iSCSI daemon will not start.
## If you change the InitiatorName, existing access control lists
## may reject this initiator.  The InitiatorName must be unique
## for each iSCSI initiator.  Do NOT duplicate iSCSI InitiatorNames.
InitiatorName=iqn.2024-12.com.hpe:hpevme6:59012
pcuser@hpevme6:~$
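To get just the value for pasting into the storage-side configuration, the comment lines can be skipped. A minimal sketch; initiator_name is a hypothetical helper name, and the default path is the file shown above.

```shell
#!/bin/sh
# Hypothetical helper: print only the InitiatorName value from the given file
# (default: /etc/iscsi/initiatorname.iscsi), ignoring the comment lines.
initiator_name() {
    sed -n 's/^InitiatorName=//p' "${1:-/etc/iscsi/initiatorname.iscsi}"
}

# Example usage (the file is usually readable only by root):
#   sudo sh -c 'sed -n "s/^InitiatorName=//p" /etc/iscsi/initiatorname.iscsi'
```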

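This "InitiatorName" value must be added to the initiator registration on the iSCSI storage side.

Example configuration on NetApp

On HPE VME, the Manager virtual machine rewrites /etc/iscsi/initiatorname.iscsi on each server when iSCSI is configured, so if the connection fails even though the initiator was registered, check that the latest name is the one registered on the iSCSI storage side.

After changing the settings, run "sudo iscsiadm -m session --rescan" to rescan.

The following log shows the LUNs going from unrecognized to recognized after running --rescan.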

pcuser@hpevme6:~$ sudo iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
version 2.1.9
Target: iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3 (non-flash)
        Current Portal: 192.168.3.34:3260,1029
        Persistent Portal: 192.168.3.34:3260,1029
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.2024-12.com.hpe:hpevme6:59012
                Iface IPaddress: 192.168.3.60
                Iface HWaddress: default
                Iface Netdev: default
                SID: 1
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 120
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 65536
                MaxBurstLength: 1048576
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 33 State: running
        Current Portal: 192.168.2.34:3260,1028
        Persistent Portal: 192.168.2.34:3260,1028
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.2024-12.com.hpe:hpevme6:59012
                Iface IPaddress: 192.168.2.60
                Iface HWaddress: default
                Iface Netdev: default
                SID: 2
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 120
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 65536
                MaxBurstLength: 1048576
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 34 State: running
pcuser@hpevme6:~$ sudo iscsiadm -m session --rescan
Rescanning session [sid: 1, target: iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3, portal: 192.168.3.34,3260]
Rescanning session [sid: 2, target: iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3, portal: 192.168.2.34,3260]
pcuser@hpevme6:~$ sudo iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
version 2.1.9
Target: iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3 (non-flash)
        Current Portal: 192.168.3.34:3260,1029
        Persistent Portal: 192.168.3.34:3260,1029
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.2024-12.com.hpe:hpevme6:59012
                Iface IPaddress: 192.168.3.60
                Iface HWaddress: default
                Iface Netdev: default
                SID: 1
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 5
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 65536
                MaxBurstLength: 1048576
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 33 State: running
                scsi33 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdb          State: running
                scsi33 Channel 00 Id 0 Lun: 1
                        Attached scsi disk sdd          State: running
        Current Portal: 192.168.2.34:3260,1028
        Persistent Portal: 192.168.2.34:3260,1028
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.2024-12.com.hpe:hpevme6:59012
                Iface IPaddress: 192.168.2.60
                Iface HWaddress: default
                Iface Netdev: default
                SID: 2
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 5
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 65536
                MaxBurstLength: 1048576
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 34 State: running
                scsi34 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdc          State: running
                scsi34 Channel 00 Id 0 Lun: 1
                        Attached scsi disk sde          State: running
pcuser@hpevme6:~$

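Multipath recognition

iSCSI storage is connected over multiple sessions (multipath), so in the example below each LUN is visible twice, once through scsi33 and once through scsi34.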

pcuser@hpevme6:~$ sudo iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
version 2.1.9
Target: iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3 (non-flash)
        Current Portal: 192.168.3.34:3260,1029
        Persistent Portal: 192.168.3.34:3260,1029
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.2024-12.com.hpe:hpevme6:59012
                Iface IPaddress: 192.168.3.60
                Iface HWaddress: default
                Iface Netdev: default
                SID: 1
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 5
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 65536
                MaxBurstLength: 1048576
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 33 State: running
                scsi33 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdb          State: running
                scsi33 Channel 00 Id 0 Lun: 1
                        Attached scsi disk sdd          State: running
        Current Portal: 192.168.2.34:3260,1028
        Persistent Portal: 192.168.2.34:3260,1028
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.2024-12.com.hpe:hpevme6:59012
                Iface IPaddress: 192.168.2.60
                Iface HWaddress: default
                Iface Netdev: default
                SID: 2
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 5
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 65536
                MaxBurstLength: 1048576
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 34 State: running
                scsi34 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdc          State: running
                scsi34 Channel 00 Id 0 Lun: 1
                        Attached scsi disk sde          State: running
pcuser@hpevme6:~$
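
As a rough sketch (not part of the original procedure), the portal-to-disk mapping can be pulled out of that long output with a small awk filter. The function name `session_disks` is made up for this note, and it assumes the "Current Portal:" and "Attached scsi disk" line layout shown above.

```shell
# Hypothetical helper: read "iscsiadm -m session -P 3" output on stdin
# and print one "<portal> <disk>" pair per attached SCSI disk.
session_disks() {
  awk '
    /Current Portal:/    { portal = $3 }       # remember the portal of the current session
    /Attached scsi disk/ { print portal, $4 }  # 4th field is the sdX device name
  '
}
```

Usage would be something like `sudo iscsiadm -m session -P 3 | session_disks`; for the output above it should pair sdb and sdd with 192.168.3.34 and sdc and sde with 192.168.2.34.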

Consolidating the devices seen over the two paths into a single device is the job of multipathd.

Run "sudo multipath -ll" to check what has been recognized.

pcuser@hpevme6:~$ sudo multipath -ll
3600a09807770457a795d5a4159416c34 dm-2 NETAPP,LUN C-Mode
size=70G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 34:0:0:0 sdc 8:32 active ready running
  `- 33:0:0:0 sdb 8:16 active ready running
3600a09807770457a795d5a4159416c35 dm-1 NETAPP,LUN C-Mode
size=5.0G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 33:0:0:1 sdd 8:48 active ready running
  `- 34:0:0:1 sde 8:64 active ready running
pcuser@hpevme6:~$
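
To check at a glance that every LUN really has both paths, the output can be summarized per map. This is only a sketch: `count_paths` is a hypothetical name, and it assumes the "multipath -ll" layout shown above (a header line containing dm-N, followed by path lines with an H:C:T:L tuple and an sdX name).

```shell
# Hypothetical helper: read "multipath -ll" output on stdin and
# print "<map name> <number of paths>" for each multipath map.
count_paths() {
  awk '
    # Map header: starts in column 1 and contains a dm-N field
    /^[0-9A-Za-z]/ && / dm-[0-9]+ / {
      if (name != "") print name, n
      name = $1; n = 0; next
    }
    # Path line: H:C:T:L tuple followed by an sdX device
    /[0-9]+:[0-9]+:[0-9]+:[0-9]+ +sd[a-z]+/ { n++ }
    END { if (name != "") print name, n }
  '
}
```

Run as `sudo multipath -ll | count_paths`; with the two-session setup in this article each WWID should come back with a count of 2.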

Devices consolidated by multipathd have their device files under /dev/mapper.

pcuser@hpevme6:~$ ls /dev/mapper/*
/dev/mapper/3600a09807770457a795d5a4159416c34  /dev/mapper/control
/dev/mapper/3600a09807770457a795d5a4159416c35  /dev/mapper/ubuntu--vg-ubuntu--lv
pcuser@hpevme6:~$

If "sudo multipath -ll" shows nothing, register the devices manually.

First, run "/lib/udev/scsi_id -g -u -d /dev/sd?" to look up the WWID that corresponds to each recognized /dev/sd? device.

/lib/udev/scsi_id -g -u -d /dev/sdX

Register that WWID with multipath by running "multipath -a WWID".

multipath -a WWID

After registering, reload with "multipath -r" and then check with "multipath -ll" that the device was added.
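
With several LUNs and two paths each, the same WWID comes back once per path, so the manual steps can be wrapped in a small loop that registers each WWID only once. This is a sketch under assumptions: `register_wwids` and the `SCSI_ID` / `MULTIPATH` override variables are names invented here, and it needs to run as root (or with `MULTIPATH="sudo multipath"`).

```shell
# Commands can be overridden, e.g. MULTIPATH="sudo multipath" (assumption, not from the original steps)
SCSI_ID=${SCSI_ID:-/lib/udev/scsi_id}
MULTIPATH=${MULTIPATH:-multipath}

# Hypothetical helper: register the WWID of each given /dev/sdX once, then reload the maps.
register_wwids() {
  seen=""
  for dev in "$@"; do
    wwid=$($SCSI_ID -g -u -d "$dev") || continue
    [ -n "$wwid" ] || continue
    case " $seen " in
      *" $wwid "*) continue ;;   # same LUN already seen through another path
    esac
    seen="$seen $wwid"
    $MULTIPATH -a "$wwid"
  done
  $MULTIPATH -r
}
```

For the devices in this article the call would be `register_wwids /dev/sdb /dev/sdc /dev/sdd /dev/sde`, followed by "multipath -ll" to confirm.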

Initial setup such as target login

Run target discovery against the storage with "iscsiadm -m discovery -t sendtargets -p <IP address>".
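
Discovery only creates the node records; the sessions themselves are established by a login. A minimal sketch of both steps follows. `discover_and_login` and the `ISCSIADM` override variable are names made up for this note; the iscsiadm options themselves are the standard ones.

```shell
# Can be overridden, e.g. ISCSIADM="sudo iscsiadm" (assumption for this sketch)
ISCSIADM=${ISCSIADM:-iscsiadm}

# Hypothetical helper: run sendtargets discovery against each portal IP,
# then log in to every discovered node record.
discover_and_login() {
  for ip in "$@"; do
    $ISCSIADM -m discovery -t sendtargets -p "$ip" || return 1
  done
  $ISCSIADM -m node --login
}
```

For the environment in this article that would be `discover_and_login 192.168.2.34 192.168.3.34`.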

Changing connection parameters

To check the current parameters, first list the portal names with "sudo iscsiadm -m node".

pcuser@hpevme6:~$ sudo iscsiadm -m node
192.168.2.34:3260,1028 iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3
192.168.3.34:3260,1029 iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3
pcuser@hpevme6:~$

Check the parameters set on each portal with "sudo iscsiadm -m node -p <portal name>".

pcuser@hpevme6:~$ sudo iscsiadm -m node -p 192.168.2.34:3260,1028
# BEGIN RECORD 2.1.9
node.name = iqn.1992-08.com.netapp:sn.e56cfbb6bab111f09b2a000c2980b7f5:vs.3
node.tpgt = 1028
node.startup = automatic
node.leading_login = No
iface.iscsi_ifacename = default
iface.net_ifacename = <empty>
iface.ipaddress = <empty>
iface.prefix_len = 0
iface.hwaddress = <empty>
iface.transport_name = tcp
iface.initiatorname = <empty>
iface.state = <empty>
iface.vlan_id = 0
iface.vlan_priority = 0
iface.vlan_state = <empty>
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
iface.bootproto = <empty>
iface.subnet_mask = <empty>
iface.gateway = <empty>
iface.dhcp_alt_client_id_state = <empty>
iface.dhcp_alt_client_id = <empty>
iface.dhcp_dns = <empty>
iface.dhcp_learn_iqn = <empty>
iface.dhcp_req_vendor_id_state = <empty>
iface.dhcp_vendor_id_state = <empty>
iface.dhcp_vendor_id = <empty>
iface.dhcp_slp_da = <empty>
iface.fragmentation = <empty>
iface.gratuitous_arp = <empty>
iface.incoming_forwarding = <empty>
iface.tos_state = <empty>
iface.tos = 0
iface.ttl = 0
iface.delayed_ack = <empty>
iface.tcp_nagle = <empty>
iface.tcp_wsf_state = <empty>
iface.tcp_wsf = 0
iface.tcp_timer_scale = 0
iface.tcp_timestamp = <empty>
iface.redirect = <empty>
iface.def_task_mgmt_timeout = 0
iface.header_digest = <empty>
iface.data_digest = <empty>
iface.immediate_data = <empty>
iface.initial_r2t = <empty>
iface.data_seq_inorder = <empty>
iface.data_pdu_inorder = <empty>
iface.erl = 0
iface.max_receive_data_len = 0
iface.first_burst_len = 0
iface.max_outstanding_r2t = 0
iface.max_burst_len = 0
iface.chap_auth = <empty>
iface.bidi_chap = <empty>
iface.strict_login_compliance = <empty>
iface.discovery_auth = <empty>
iface.discovery_logout = <empty>
node.discovery_address = 192.168.2.34
node.discovery_port = 3260
node.discovery_type = send_targets
node.session.initial_cmdsn = 0
node.session.initial_login_retry_max = 8
node.session.xmit_thread_priority = 0
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.nr_sessions = 1
node.session.auth.authmethod = None
node.session.auth.username = <empty>
node.session.auth.password = <empty>
node.session.auth.username_in = <empty>
node.session.auth.password_in = <empty>
node.session.auth.chap_algs = MD5
node.session.timeo.replacement_timeout = 120
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.tgt_reset_timeout = 30
node.session.err_timeo.host_reset_timeout = 60
node.session.iscsi.FastAbort = Yes
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.DefaultTime2Wait = 2
node.session.iscsi.MaxConnections = 1
node.session.iscsi.MaxOutstandingR2T = 1
node.session.iscsi.ERL = 0
node.session.scan = auto
node.session.reopen_max = 0
node.conn[0].address = 192.168.2.34
node.conn[0].port = 3260
node.conn[0].startup = automatic
node.conn[0].tcp.window_size = 524288
node.conn[0].tcp.type_of_service = 0
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.conn[0].iscsi.MaxXmitDataSegmentLength = 0
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.conn[0].iscsi.HeaderDigest = None
node.conn[0].iscsi.DataDigest = None
node.conn[0].iscsi.IFMarker = No
node.conn[0].iscsi.OFMarker = No
# END RECORD
pcuser@hpevme6:~$

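When one of the multipath sessions drops, node.session.timeo.replacement_timeout sets how long the initiator waits for that session to come back before failing its I/O up to the multipath layer; the default is 120 seconds.

That is long; for example, HPE's "HPE Primera Red Hat Enterprise Linux Implementation Guide" uses 10 seconds.

To change it right away, run iscsiadm.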

pcuser@hpevme6:~$ sudo iscsiadm -m node -p 192.168.2.34:3260,1028 |grep node.session.timeo.replacement_timeout
node.session.timeo.replacement_timeout = 120
pcuser@hpevme6:~$ sudo iscsiadm -m node -p 192.168.2.34:3260,1028 -o update -n node.session.timeo.replacement_timeout -v 10
pcuser@hpevme6:~$ sudo iscsiadm -m node -p 192.168.2.34:3260,1028 |grep node.session.timeo.replacement_timeout
node.session.timeo.replacement_timeout = 10
pcuser@hpevme6:~$
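
The example above changes only one portal, and the other path (192.168.3.34 here) still has its own node record. A sketch that applies the same update to every record listed by "iscsiadm -m node" could look like this. `set_replacement_timeout_all` and the `ISCSIADM` override are hypothetical names, and it assumes each line of that listing is "<portal> <target IQN>" as shown earlier.

```shell
# Can be overridden, e.g. ISCSIADM="sudo iscsiadm" (assumption for this sketch)
ISCSIADM=${ISCSIADM:-iscsiadm}

# Hypothetical helper: set replacement_timeout on every known node record.
set_replacement_timeout_all() {
  secs=$1
  # Each line of "iscsiadm -m node" is "<portal> <target IQN>"
  $ISCSIADM -m node | while read -r portal target; do
    $ISCSIADM -m node -T "$target" -p "$portal" -o update \
      -n node.session.timeo.replacement_timeout -v "$secs" </dev/null
  done
}
```

Called as `set_replacement_timeout_all 10`. As far as I understand, this rewrites the stored node records, so sessions that are already logged in may keep the old value until they log in again.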

To make the change permanent, edit the corresponding line in /etc/iscsi/iscsid.conf.
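
A sketch of that edit done with sed instead of by hand follows. `set_conf_replacement_timeout` is a hypothetical helper name, and it assumes the file contains a "node.session.timeo.replacement_timeout = ..." line (possibly commented out). As far as I understand, iscsid.conf only supplies defaults for node records created by later discoveries, so records that already exist still need the "iscsiadm -o update" step.

```shell
# Hypothetical helper: rewrite the replacement_timeout line in an iscsid.conf-style file.
# A backup of the original is left next to it as <file>.bak.
set_conf_replacement_timeout() {
  conf=$1 secs=$2
  sed -i.bak -E \
    "s/^#?[[:space:]]*(node\.session\.timeo\.replacement_timeout)[[:space:]]*=.*/\1 = ${secs}/" \
    "$conf"
}
```

On the real file this would be run as root, for example `set_conf_replacement_timeout /etc/iscsi/iscsid.conf 10`.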

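Trying out an HCI configuration with HPE Morpheus VM Essentials

HPE Morpheus VM Essentials (HPE VME, HVM, HPE VM Essentials), often described as a vSphere alternative, offers the following choices when setting up a cluster.

One of them is "HVM 1.2 HCI Ceph Cluster on HVM/Ubuntu 24.04", so apparently there is an HCI configuration that uses Ceph as shared storage.

I didn't know what kind of configuration to build, but searching the documentation turned up the requirements for an HCI configuration in a very hard-to-find place: [Infrastructure]-[Clusters]-[HVM Clusters]-[Base Cluster Details].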

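・At least 3 physical servers
・1 or more CPU cores
・4 GB or more of memory; when using Ceph, add 4 GB per disk
・OS disk of 20 GB or more, Ceph data disk of 500 GB or more

A sample configuration is given in "Example Cluster Deployment".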

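First, I installed Ubuntu 24.04 + HVM with minimal specs and set up the Manager virtual machine.

The steps up to the point of creating the cluster are omitted.

One thing that tripped me up was buggy behavior in the layout selection.

On ver 8.0.10, when I selected "HVM 1.2 HCI Ceph Cluster on HVM/Ubuntu 24.04", only one SSH host could be selected, as shown below.

Assuming that was how it worked, I went ahead and started the creation, only to find that two servers I had not specified appeared and the process started...

After redoing the cluster layout selection several times, blank fields for three servers were sometimes displayed, as shown below. In that state, creating the HCI cluster succeeded.

Fill in the fields as shown below and start the creation.

Creation complete.

Once you know how, the configuration itself was simple, but...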

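Not having a screen to check the Ceph status seems like a real gap...

For example, displaying the created HCI cluster under [Infrastructure]-[Clusters] shows "CEPH", as below.

The problem is that there is no interface for finding out anything beyond "Status: WARN".

[Storage] only shows the following information, and the state of Ceph cannot be checked there.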