Last week I changed the hostname on a batch of production machines:
hostnamectl set-hostname xxx
Later a colleague reported that /etc/resolv.conf had been wiped on two of those machines. All that was left in the file was:
# Generated by NetworkManager
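To spot this state across many machines, a minimal Python sketch like the one below could flag a resolv.conf that holds only comments. The function name `has_nameserver` is made up for illustration, and the check only looks for `nameserver` lines; it does not validate the addresses.

```python
def has_nameserver(text: str) -> bool:
    """Return True if resolv.conf content has at least one nameserver entry."""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (resolv.conf allows '#' and ';').
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            return True
    return False


def file_has_nameserver(path: str = "/etc/resolv.conf") -> bool:
    """Convenience wrapper that reads the file from disk."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return has_nameserver(fh.read())
```

Run against the wiped file shown above, `has_nameserver` returns False, because the only line is a comment.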
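Uh-oh. My first instinct was that the two events had to be related and that I had a knowledge gap here. A quick search confirmed it:
Bug 1344303 - hostnamectl set-hostname over-writes existing resolv.conf entries.
I checked /var/log/messages on the two affected machines and found log entries similar to the ones in the bug report linked above: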
Jun 22 13:48:46 test NetworkManager[605]: <info> Setting system hostname to 'test' (from system configuration)
Jun 22 13:48:46 test dbus[610]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Jun 22 13:48:46 test dbus-daemon[610]: dbus[610]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Jun 22 13:48:46 test systemd[1]: Starting Network Manager Script Dispatcher Service...
Jun 22 13:48:46 test dbus[610]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jun 22 13:48:46 test dbus-daemon[610]: dbus[610]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jun 22 13:48:46 test systemd[1]: Started Network Manager Script Dispatcher Service.
Jun 22 13:48:46 test nm-dispatcher[3006]: Dispatching action 'hostname'
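When auditing dozens of machines, it may help to pull just the hostname-related entries out of the syslog. The following is a rough Python sketch; the regular expression is derived only from the two message formats visible in the log excerpt above, so other NetworkManager versions may word these messages differently. The name `hostname_events` is hypothetical.

```python
import re

# Matches the two hostname-related messages seen in the excerpt above.
HOSTNAME_EVENT = re.compile(
    r"NetworkManager\[\d+\]: <info>\s+Setting system hostname to '[^']*'"
    r"|nm-dispatcher\[\d+\]: Dispatching action 'hostname'"
)


def hostname_events(lines):
    """Return the log lines that record a hostname change being handled."""
    return [line.rstrip("\n") for line in lines if HOSTNAME_EVENT.search(line)]


def scan_log(path="/var/log/messages"):
    """Read a syslog file and return its hostname-related lines."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return hostname_events(fh)
```

A machine where `scan_log()` returns entries timestamped around the hostnamectl run is one where NetworkManager reacted to the hostname change.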
The NetworkManager service was indeed running, and it had logged a related entry:
settings: hostname changed from "" to ""
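The strange part: I changed the hostname on several dozen servers that day, and NetworkManager was running on dozens of them, yet the others all seemed fine. To reproduce the problem I later changed the hostname by hand again on these two machines, and could not reproduce it... orz. My guess is that it depends on what resolv.conf originally contained; the details probably require reading the source.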
Bug 1344303
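Back to the point. Apart from my own knowledge gap and my habit of taking things for granted, this bug really is a bit baffling. As one commenter on the bug put it: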
My point is still, that hostnamectl does not look like you are editing resolv.conf. but I'm with you that there is a relation of the domain name and the FQDN.
I'll update NetworkManager.conf with main/dns=none - no problem. I just wished I knew earlier howto disable the management of resolv.conf
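The main/dns=none setting the commenter mentions tells NetworkManager not to manage resolv.conf. As a minimal sketch, the Python below renders that fragment with the standard-library configparser. The function name is hypothetical, and where the output goes is an assumption: a drop-in under /etc/NetworkManager/conf.d/ (for example a file named 90-dns-none.conf, a name invented here) or the [main] section of NetworkManager.conf itself.

```python
import configparser
import io


def render_dns_none() -> str:
    """Render a NetworkManager config fragment that sets dns=none in [main]."""
    cfg = configparser.ConfigParser()
    cfg["main"] = {"dns": "none"}
    buf = io.StringIO()
    # Write "dns=none" rather than "dns = none" to match the usual style.
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()
```

NetworkManager has to re-read its configuration before the setting takes effect, so a reload or restart of the service would presumably be needed after the file is written.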
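The maintainer, however, held firm, and the replies read a bit like "take it or leave it". All I can say is that, as the ops person, I hit this pitfall because the machines I delivered were not initialized consistently. OK, so I added the following to the initialization script (for CentOS 7.x):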
- name: Disable NetworkManager
  systemd:
    name: NetworkManager
    state: stopped
    enabled: false
  ignore_errors: true
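A reader pointed out:
In RHEL 8, network.service has been removed and all network configuration belongs to NetworkManager, so this may not be applicable there.
For details, see the article 基于RHEL8/CentOS8的网络IP配置详解 (a detailed guide to network IP configuration on RHEL 8/CentOS 8).
The Linux world just seems to have this many wheels. I once thought systemd would unify everything, but what it brought was even more wheels, lol.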
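Taking back control
I came across an article, How to take back control of /etc/resolv.conf on Linux. It turns out there are this many pieces involved. See the article for details; the author keeps it updated and clearly puts care into it.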
- NetworkManager
- /etc/sysconfig/network/config: NETCONFIG_DNS_POLICY
- resolvconf, rdnssd
- systemd-resolved
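Before disabling anything, it helps to know which of the tools above currently owns the file. The Python sketch below is a heuristic only, not taken from the article: it assumes systemd-resolved and resolvconf setups make /etc/resolv.conf a symlink into a path containing /systemd/resolve/ or /resolvconf/ respectively, and that NetworkManager leaves a header comment like the one seen earlier. The function names are hypothetical.

```python
import os


def guess_resolv_conf_owner(text, link_target=None):
    """Guess which tool manages resolv.conf from its symlink target and header."""
    if link_target:
        # Assumption: resolved and resolvconf both publish via a symlink.
        if "/systemd/resolve/" in link_target:
            return "systemd-resolved"
        if "/resolvconf/" in link_target:
            return "resolvconf"
    for line in text.splitlines():
        # NetworkManager marks files it writes with a header comment.
        if line.startswith("#") and "NetworkManager" in line:
            return "NetworkManager"
    return "unknown"


def inspect_resolv_conf(path="/etc/resolv.conf"):
    """Read the real file and its symlink target, then apply the heuristic."""
    target = os.readlink(path) if os.path.islink(path) else None
    with open(path, encoding="utf-8", errors="replace") as fh:
        return guess_resolv_conf_owner(fh.read(), target)
```

An "unknown" result means only that none of these markers were found; the file may be hand-maintained or written by a tool the sketch does not cover, such as netconfig.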
There is always more to learn, and always another pitfall beyond the last.