fangpsh's blog

Handling a failed lvmcache cache disk

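An aging SSD that I was using as an lvmcache read/write cache dropped off the bus today and the system hung. After a reboot there was a pile of errors (Warning: dracut-initqueue timeout - starting timeout scripts), the boot failed, and the machine dropped into the rescue shell, dracut:/#.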

In the rescue shell, every LVM operation that changes metadata fails with:

Read-only locking type set. Write locks are prohibited.
Can't get lock for...

You have to edit /etc/lvm/lvm.conf by hand and change locking_type=4 to locking_type=1 before going any further.
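The same edit can be scripted. This is only a minimal sketch: enable_lvm_write_locking is a hypothetical helper name, and it assumes cp and sed are available in the rescue shell and that the setting sits on an uncommented line of its own.

```shell
# Hypothetical helper: switch locking_type from 4 (read-only) to 1.
# Assumes the setting is on its own uncommented line in the given file.
enable_lvm_write_locking() {
    conf="$1"
    # Keep a copy so the read-only setting can be put back afterwards.
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the active locking_type line; commented lines are left alone.
    sed '/^[[:space:]]*locking_type[[:space:]]*=/ s/=[[:space:]]*4/= 1/' \
        "$conf.bak" > "$conf"
}
```

Called as enable_lvm_write_locking /etc/lvm/lvm.conf, it leaves the original next to it as lvm.conf.bak.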

First I tried to force-remove the cache, which failed:

dracut:/# lvconvert --uncache centos/root --force
Couldn't find device with uuid Buatb2-rxoc-8B6f-cf8y-3XR3-JTer-WNS6tJ.
WARNING: Cache pool data logical volume centos/cache_cdata is missing.
WARNING: Cache pool metadata logical volume centos/cache_cmeta is missing.
WARNING: Uncaching of partially missing writethrough cache volume centos/root might destroy your data.
Do you really want to uncache centos/root with missing LVs? [y/n]: y
[ 1366.473799] Buffer I/O error on dev dm-3, logical block async page read
/dev/mapper/centos-cache_cmeta: read failed: Input/output error
Failed to active cache locally centos/root.

Next I tried lvm vgreduce --removemissing centos --force, which failed as well:

Couldn't find device with uuid BuatbZ-rxoc-8B6r-cf8y-3XR3-JTer-uNS6tj.
WARNING: Removing partial LV centos/root.
[ 1388.559232] Buffer I/O error on dev dm-3, logical block 8, async page read
/dev/mapper/centos-cache_cmeta: read failed: Input/output error
Failed to active cache locally centos/root.
Failed to uncache centos/root.

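I saw suggestions that you can take another disk and give it the same UUID as the missing PV, but I didn't go down that road. In the end I used the method from this issue: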

At this point, the only way I found this can be fixed is to take the /etc/lvm/backup/lvmgroup file, modify it to remove the cache entries, rename the disk_corig back to disk, add "VISIBLE" flag back, and then run vgcfgrestore -f on the modified file.

First save a copy of the original configuration (/etc/lvm/backup/centos). For example, the original looks like this:

...
centos {
        id = "cPeVKa-nu1q-wuPJ-bcXB-1tYT-12aG-ALYYoT"
        seqno = 8
        format = "lvm2"                 # informational
        status = ["RESIZEABLE", "READ", "WRITE"]
        flags = []
        extent_size = 8192              # 4 Megabytes
        max_lv = 0
        max_pv = 0
        metadata_copies = 0

        physical_volumes {

                pv0 {
                        id = "aJVnc3-w0xI-7WkX-Ejww-ROnQ-V1br-dsz6VX"
                        device = "/dev/sda2"    # Hint only

                        status = ["ALLOCATABLE"]
                        flags = []
                        dev_size = 3902797824   # 1.81738 Terabytes
                        pe_start = 2048
                        pe_count = 476415       # 1.81738 Terabytes
                }

                pv1 {
                        id = "i6BCMc-GcML-GO3g-UISN-83fg-39Cg-rTZFe8"
                        device = "/dev/sdb"     # Hint only

                        status = ["ALLOCATABLE"]
                        flags = []
                        dev_size = 248774656    # 118.625 Gigabytes
                        pe_start = 2048
                        pe_count = 30367        # 118.621 Gigabytes
                }
        }
        logical_volumes {

                root {
                        id = "rmdkDt-zyEW-i9UX-VLch-jzLr-154x-57Si74"
                        status = ["READ", "WRITE", "VISIBLE"]
                        flags = []
                        creation_time = 1693208343      # 2023-08-28 15:39:03 +0800
                        creation_host = "localhost.localdomain"
                        segment_count = 1

                        segment1 {
                                start_extent = 0
                                extent_count = 476415   # 1.81738 Terabytes

                                type = "cache"
                                cache_pool = "cache"
                                origin = "root_corig"
                        }
                }
                cache {
                ...
                }
                cache_cdata {
                ...
                }
                cache_cmeta {
                ...
                }
                lvol0_pmspare {
                ...
                }
                root_corig {
                        id = "12gqfI-5f9u-SRkw-yTtc-PE12-x38t-h4lYUa"
                        status = ["READ", "WRITE"]
                        flags = []
                        creation_time = 1693276572      # 2023-08-29 10:36:12 +0800
                        creation_host = "sz-node-8-12"
                        segment_count = 1

                        segment1 {
                                start_extent = 0
                                extent_count = 476415   # 1.81738 Terabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 0
                                ]
                        }
                }
        }

}                                                                                   

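Delete every cache-related LV definition; the failed pv1 can be removed too. Then update the root LV. The main change is its segment section: just move the segment from root_corig over, like this: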

        logical_volumes {

                root {
                        id = "rmdkDt-zyEW-i9UX-VLch-jzLr-154x-57Si74"
                        status = ["READ", "WRITE", "VISIBLE"]
                        flags = []
                        creation_time = 1693208343      # 2023-08-28 15:39:03 +0800
                        creation_host = "localhost.localdomain"
                        segment_count = 1

                        segment1 {
                                start_extent = 0
                                extent_count = 476415   # 1.81738 Terabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 0
                                ]
                        }
                }
        }
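
Before restoring, it may be worth grepping the edited file for anything cache-related that was missed. A rough sketch, where check_uncached_cfg is a hypothetical helper and the patterns are simply the LV names and keys that appear in the backup above:

```shell
# Hypothetical sanity check for a hand-edited VG backup file.
# Prints any leftover cache-related lines and returns non-zero if it finds one.
check_uncached_cfg() {
    cfg="$1"
    # Patterns taken from the example backup: the cache segment type, the
    # cache_pool key, and the hidden _corig/_cdata/_cmeta/pmspare LVs.
    if grep -n -E 'type[[:space:]]*=[[:space:]]*"cache|cache_pool|_corig|_cdata|_cmeta|pmspare' "$cfg"; then
        echo "leftover cache entries in $cfg" >&2
        return 1
    fi
    echo "no cache entries left in $cfg"
}
```

It only looks for leftovers; it does not verify that the remaining metadata is otherwise valid.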

Finally, run lvm vgcfgrestore -f <config file> <VG name> to apply it.
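
As a sketch, the last step could be wrapped like this. The run and restore_vg helpers are hypothetical, and the vgchange -ay and lvs -a calls after the restore are my own assumption about how to activate and check the result, not part of the method in the issue. By default nothing is executed; the commands are only printed until DRY_RUN=0 is set.

```shell
# Hypothetical wrapper: print commands by default, run them only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# restore_vg <edited config file> <VG name>
restore_vg() {
    cfg="$1"
    vg="$2"
    # Write the edited metadata back to the VG.
    run lvm vgcfgrestore -f "$cfg" "$vg" || return 1
    # Assumed follow-up: activate the LVs, then list them to confirm
    # that root no longer shows up as a cached LV.
    run lvm vgchange -ay "$vg" || return 1
    run lvm lvs -a "$vg"
}
```

For example, restore_vg /tmp/centos.fixed centos prints the three commands, and DRY_RUN=0 restore_vg /tmp/centos.fixed centos actually runs them (the /tmp/centos.fixed path is just a placeholder for wherever the edited file was saved).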