An old SSD, used as an lvmcache read/write cache, died today. The system hung, and after a reboot it failed to boot with a stream of errors:
Warning: dracut-initqueue timeout - starting timeout scripts
and dropped into the rescue shell (dracut:/#).
In rescue mode, any LVM change operation complains:
Read-only locking type set. Write locks are prohibited.
Can't get lock for...
You need to hand-edit /etc/lvm/lvm.conf, changing
locking_type = 4
to
locking_type = 1
before continuing.
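The edit can also be done with a one-line sed. A minimal sketch (the demo below works on a throwaway copy at /tmp/lvm.conf so it runs anywhere; in the rescue shell you would target /etc/lvm/lvm.conf itself):

```shell
# Switch LVM back to normal file-based locking so write operations are allowed.
# /tmp/lvm.conf is a stand-in for /etc/lvm/lvm.conf in the rescue shell.
conf=/tmp/lvm.conf
printf 'global {\n    locking_type = 4\n}\n' > "$conf"   # minimal sample config

# tolerate variable whitespace around '='
sed -i 's/locking_type *= *4/locking_type = 1/' "$conf"

grep locking_type "$conf"   # now shows: locking_type = 1
```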
First attempt: force-remove the cache. It failed:
dracut:/# lvconvert --uncache centos/root --force
  Couldn't find device with uuid Buatb2-rxoc-8B6f-cf8y-3XR3-JTer-WNS6tJ.
  WARNING: Cache pool data logical volume centos/cache_cdata is missing.
  WARNING: Cache pool metadata logical volume centos/cache_cmeta is missing.
  WARNING: Uncaching of partially missing writethrough cache volume centos/root might destroy your data.
Do you really want to uncache centos/root with missing LVs? [y/n]: y
[ 1366.473799] Buffer I/O error on dev dm-3, logical block 0, async page read
  /dev/mapper/centos-cache_cmeta: read failed: Input/output error
  Failed to activate cache locally centos/root.
Next try: lvm vgreduce --removemissing centos --force. It also failed:
  Couldn't find device with uuid Buatb2-rxoc-8B6f-cf8y-3XR3-JTer-WNS6tJ.
  WARNING: Removing partial LV centos/root.
[ 1388.559232] Buffer I/O error on dev dm-3, logical block 0, async page read
  /dev/mapper/centos-cache_cmeta: read failed: Input/output error
  Failed to activate cache locally centos/root.
  Failed to uncache centos/root.
Some posts suggest attaching a replacement disk and giving it the same PV UUID as the failed one; I didn't go down that road. In the end I used the method from this issue:
At this point, the only way I found this can be fixed is to take the /etc/lvm/backup/lvmgroup file, modify it to remove the cache entries, rename the disk_corig back to disk, add "VISIBLE" flag back, and then run vgcfgrestore -f on the modified file.
First save a copy of the original config (/etc/lvm/backup/centos). For example, the original looks like:
...
centos {
id = "cPeVKa-nu1q-wuPJ-bcXB-1tYT-12aG-ALYYoT"
seqno = 8
format = "lvm2" # informational
status = ["RESIZEABLE", "READ", "WRITE"]
flags = []
extent_size = 8192 # 4 Megabytes
max_lv = 0
max_pv = 0
metadata_copies = 0
physical_volumes {
pv0 {
id = "aJVnc3-w0xI-7WkX-Ejww-ROnQ-V1br-dsz6VX"
device = "/dev/sda2" # Hint only
status = ["ALLOCATABLE"]
flags = []
dev_size = 3902797824 # 1.81738 Terabytes
pe_start = 2048
pe_count = 476415 # 1.81738 Terabytes
}
pv1 {
id = "i6BCMc-GcML-GO3g-UISN-83fg-39Cg-rTZFe8"
device = "/dev/sdb" # Hint only
status = ["ALLOCATABLE"]
flags = []
dev_size = 248774656 # 118.625 Gigabytes
pe_start = 2048
pe_count = 30367 # 118.621 Gigabytes
}
}
logical_volumes {
root {
id = "rmdkDt-zyEW-i9UX-VLch-jzLr-154x-57Si74"
status = ["READ", "WRITE", "VISIBLE"]
flags = []
creation_time = 1693208343 # 2023-08-28 15:39:03 +0800
creation_host = "localhost.localdomain"
segment_count = 1
segment1 {
start_extent = 0
extent_count = 476415 # 1.81738 Terabytes
type = "cache"
cache_pool = "cache"
origin = "root_corig"
}
}
cache {
...
}
cache_cdata {
...
}
cache_cmeta {
...
}
lvol0_pmspare {
...
}
root_corig {
id = "12gqfI-5f9u-SRkw-yTtc-PE12-x38t-h4lYUa"
status = ["READ", "WRITE"]
flags = []
creation_time = 1693276572 # 2023-08-29 10:36:12 +0800
creation_host = "sz-node-8-12"
segment_count = 1
segment1 {
start_extent = 0
extent_count = 476415 # 1.81738 Terabytes
type = "striped"
stripe_count = 1 # linear
stripes = [
"pv0", 0
]
}
}
}
}
Delete all the cache-related LV entries (cache, cache_cdata, cache_cmeta, lvol0_pmspare); the failed pv1 can be removed as well. Then update the root LV: the key change is its segment section, which should take over the one from root_corig, like this:
logical_volumes {
root {
id = "rmdkDt-zyEW-i9UX-VLch-jzLr-154x-57Si74"
status = ["READ", "WRITE", "VISIBLE"]
flags = []
creation_time = 1693208343 # 2023-08-28 15:39:03 +0800
creation_host = "localhost.localdomain"
segment_count = 1
segment1 {
start_extent = 0
extent_count = 476415 # 1.81738 Terabytes
type = "striped"
stripe_count = 1 # linear
stripes = [
"pv0", 0
]
}
}
}
Finally, run lvm vgcfgrestore -f <config file> <VG name> to make it take effect.
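Before running vgcfgrestore, it's worth a quick sanity check that the edited file still balances its braces and carries no leftover references to the removed cache LVs. A sketch, using a hypothetical edited copy at /tmp/centos.edited (the heredoc is a minimal stand-in for the real edited backup file):

```shell
# Sanity-check an edited LVM backup file before feeding it to vgcfgrestore.
cfg=/tmp/centos.edited

# minimal stand-in for the edited /etc/lvm/backup/centos
cat > "$cfg" <<'EOF'
centos {
    logical_volumes {
        root {
            segment1 {
                type = "striped"
                stripes = [ "pv0", 0 ]
            }
        }
    }
}
EOF

# every opening brace must have a closing one
opens=$(grep -o '{' "$cfg" | wc -l)
closes=$(grep -o '}' "$cfg" | wc -l)
[ "$opens" -eq "$closes" ] && echo "braces balanced"

# the cache pool, its sub-LVs, and the _corig origin must all be gone
grep -qE 'cache_(cdata|cmeta)|cache_pool|_corig' "$cfg" \
    || echo "no cache references left"
```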