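Today I finally found a fix for a problem that has bothered me for the past few days, so I am writing it down properly.
The symptoms were as follows: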
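When the system was busy (e.g. Chromium had many tabs open, Chromium was restoring many tabs after a crash and restart, or Eclipse was running), X would sometimes freeze, with neither the mouse nor the keyboard responding. However, after switching to a TTY with Ctrl+Alt+F1, the system was fully usable there, with no lag. Checking with the top command showed that the upowerd process was using a lot of CPU, often more than 80%.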
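I never believed that exhausted hardware resources were the cause: the machine is fairly well specified (a Y400 with an i7-3630QM CPU and 8 GB of RAM), and there was clearly CPU and memory to spare.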
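Things I tried (none of which worked in the end):
- kill all the busy processes
- Restart X (my Gentoo does not use systemd): service xdm restart
- Restart the upowerd process. On 64-bit Gentoo Linux, the binary behind the upowerd process is /usr/lib64/upower/upowerd
- In the interactive top window, raise or lower the nice value of the upowerd process
- Run upower --monitor-detail to monitor events from the upowerd service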
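So how was the problem finally solved?
I ran tail -f /var/log/messages and saw the file growing very quickly, with the terminal being refreshed constantly. A closer look revealed a large number of messages like the following: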
May 1 19:48:37 Gentooo kernel[15730]: seq 24780 queued, 'add' 'drm'
May 1 19:48:37 Gentooo kernel[15730]: passed 239 bytes to netlink monitor 0x6495a0
May 1 19:48:37 Gentooo kernel[15730]: seq 24781 queued, 'remove' 'drm'
May 1 19:48:37 Gentooo kernel: NVRM: NVIDIA init module failed!
May 1 19:48:37 Gentooo kernel[15730]: seq 24782 queued, 'remove' 'drivers'
May 1 19:48:37 Gentooo kernel[18296]: seq 24780 running
May 1 19:48:37 Gentooo kernel[15730]: passed 155 bytes to netlink monitor 0x6495a0
May 1 19:48:37 Gentooo kernel[18296]: no db file to read /run/udev/data/+drm:card1: No such file or directory
May 1 19:48:37 Gentooo kernel[18297]: seq 24782 running
May 1 19:48:37 Gentooo kernel[18296]: GROUP 27 /lib64/udev/rules.d/50-udev-default.rules:30
May 1 19:48:37 Gentooo kernel[18297]: no db file to read /run/udev/data/+drivers:nvidia: No such file or directory
May 1 19:48:37 Gentooo kernel[18296]: RUN 'udev-acl --action=$env{ACTION} --device=$env{DEVNAME}' /lib64/udev/rules.d/70-udev-acl.rules:74
May 1 19:48:37 Gentooo kernel[18297]: device 0x650940 has devpath '/bus/pci/drivers'
May 1 19:48:37 Gentooo kernel[18297]: device 0x65dc10 has devpath '/bus/pci'
May 1 19:48:37 Gentooo kernel[18296]: device 0x64f5f0 has devpath '/devices/pci0000:00/0000:00:01.0/0000:01:00.0'
May 1 19:48:37 Gentooo kernel[18296]: device 0x660ba0 has devpath '/devices/pci0000:00/0000:00:01.0'
May 1 19:48:37 Gentooo kernel[18296]: device 0x64f2e0 has devpath '/devices/pci0000:00'
May 1 19:48:37 Gentooo kernel[18297]: passed -1 bytes to netlink monitor 0x64f7f0
May 1 19:48:37 Gentooo kernel[18297]: seq 24782 processed with 0
May 1 19:48:37 Gentooo kernel[18296]: handling device node '/dev/dri/card1', devnum=c226:1, mode=0660, uid=0, gid=27
May 1 19:48:37 Gentooo kernel[18296]: can not stat() node '/dev/dri/card1' (No such file or directory)
May 1 19:48:37 Gentooo kernel[15730]: seq 24782 done with 0
May 1 19:48:37 Gentooo kernel[18296]: created db file '/run/udev/data/c226:1' for '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1'
May 1 19:48:37 Gentooo kernel[32528]: starting 'udev-acl --action=add --device=/dev/dri/card1'
May 1 19:48:37 Gentooo kernel[18296]: 'udev-acl --action=add --device=/dev/dri/card1' [32528] exit with return code 0
May 1 19:48:37 Gentooo kernel[18296]: passed -1 bytes to netlink monitor 0x6517a0
May 1 19:48:37 Gentooo kernel[18296]: seq 24780 processed with 0
May 1 19:48:37 Gentooo kernel[15730]: seq 24780 done with 0
May 1 19:48:37 Gentooo kernel[15730]: passed 242 bytes to netlink monitor 0x6495a0
May 1 19:48:37 Gentooo kernel[18296]: seq 24781 running
May 1 19:48:37 Gentooo kernel[18296]: device 0x64f2e0 filled with db file data
May 1 19:48:37 Gentooo kernel[18296]: RUN 'udev-acl --action=$env{ACTION} --device=$env{DEVNAME}' /lib64/udev/rules.d/70-udev-acl.rules:74
May 1 19:48:37 Gentooo kernel[18296]: device 0x660ba0 has devpath '/devices/pci0000:00/0000:00:01.0/0000:01:00.0'
May 1 19:48:37 Gentooo kernel[18296]: device 0x64f5f0 has devpath '/devices/pci0000:00/0000:00:01.0'
May 1 19:48:37 Gentooo kernel[18296]: device 0x650cc0 has devpath '/devices/pci0000:00'
May 1 19:48:37 Gentooo kernel[32529]: starting 'udev-acl --action=remove --device=/dev/dri/card1'
May 1 19:48:37 Gentooo kernel[18296]: 'udev-acl --action=remove --device=/dev/dri/card1' [32529] exit with return code 0
May 1 19:48:37 Gentooo kernel[18296]: passed -1 bytes to netlink monitor 0x6517a0
May 1 19:48:37 Gentooo kernel[18296]: seq 24781 processed with 0
May 1 19:48:37 Gentooo kernel[15730]: seq 24781 done with 0
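With a log scrolling this fast, it can help to tally which events are being queued instead of reading it by eye. The following is only a rough sketch: the function name count_udev_events is my own, and the pattern assumes the "seq N queued, 'ACTION' 'SUBSYSTEM'" line format seen in the excerpt above, which may differ between udev versions.

```shell
# count_udev_events: read syslog lines on stdin and print
# "<count> <action> <subsystem>" for every queued udev event,
# most frequent first.
# Assumes lines of the form: ... seq N queued, 'ACTION' 'SUBSYSTEM'
count_udev_events() {
    # grep -o keeps only the matching part of each line
    # (supported by GNU, BSD and busybox grep).
    grep -o "queued, '[a-z]*' '[^']*'" \
        | tr -d "'" \
        | awk '{ print $2, $3 }' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2, $3 }'
}
```

It could be run as count_udev_events < /var/log/messages. In the excerpt above, the queued events are an 'add' and a 'remove' for drm and a 'remove' for drivers; a flood dominated by such add/remove pairs suggests the same device is being created and torn down in a loop.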
Judging from the log, the udev process appears to be involved. In that case, the fix is simple:
service udev stop
OK, problem solved. Everything is back to normal.
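One way to check that the message flood has really stopped is to measure how fast the log grows before and after stopping udev. A minimal sketch, assuming the log lives at /var/log/messages as in this post; the function name log_growth is my own.

```shell
# log_growth FILE SECONDS
# Print how many lines were appended to FILE during SECONDS seconds.
log_growth() {
    file=$1
    secs=$2
    before=$(wc -l < "$file")   # line count at the start
    sleep "$secs"
    after=$(wc -l < "$file")    # line count at the end
    echo $((after - before))
}
```

For example, log_growth /var/log/messages 5 prints the number of lines written in five seconds. A value that drops sharply after service udev stop would be consistent with udev being the source of the messages.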
As for why the udev process can drive the CPU usage of upowerd so high and freeze X... I have not figured that out yet, and will look into it when I have time.
Same problem here…
Stopping udev.service, but it can still be activated by:
system-udevs-control.socket
system-udevs-kernel.socket
What do I do, what do I do, what do I do ( p_q)
Just saw your comment 🙂
Have you tried killing the udev process directly and then restarting?