I wanted to unload ath9k, but the unload failed. Its refcnt was 1, yet lsmod showed no kernel module using it.
    root@OpenWrt:/# lsmod
    ath                    18771  4 ath9k,ath9k_common,ath9k_hw,ath10k_core
    ath10k_core           284008  1 ath10k_pci
    ath10k_pci             33859  0
    ath9k                  98779  1
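After some research I found that a function in the kernel itself was holding a reference to this module. The interfaces for taking and dropping a reference on a kernel module are:
try_module_get and module_put
I modified these two functions in kernel/module.c to print the call stack.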
    bool try_module_get(struct module *module)
    {
        bool ret = true;
        char symname[KSYM_NAME_LEN];

        if (module) {
            /* Debug hack: only trace references taken on ath9k. */
            if (!strcmp(module->name, "ath9k")) {
                /* Resolve our return address to the caller's symbol name. */
                lookup_symbol_name((unsigned long)_RET_IP_, symname);
                dump_stack();
                /* Report success to gpiod_request without taking a reference. */
                if (!strcmp(symname, "gpiod_request"))
                    return true;
            }
            ...
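As shown below, _RET_IP_ is the return address of the current function, that is, an address inside its caller. lookup_symbol_name resolves that address to the caller's name, and dump_stack() prints the call stack.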
    [ 12.963471] CPU: 0 PID: 422 Comm: kmodloader Not tainted 4.9.123 #22
    [ 12.970091] Stack : 804e7672 00000038 00000000 00000000 83891c7c 8046f247 80422cd4 000001a6
    [ 12.978781] 804e37c0 00000000 80470000 00000004 839cc010 800ae18c 83bbd8f8 83bbd8f8
    [ 12.987489] 8043750c 00000800 80426ad4 83bbd93c 80441a7c 800e0ed8 00000000 801e6bb8
    [ 12.996190] 82de80ff 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [ 13.004890] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [ 13.013590] ...
    [ 13.016127] Call Trace:
    [ 13.018656] [<8006bd78>] show_stack+0x54/0x88
    [ 13.023176] [<800ce3a4>] try_module_get+0x88/0x120
    [ 13.028145] [<80204e44>] gpiod_request+0x98/0xf8
    [ 13.033032] [<82c211c8>] ath9k_beacon_config+0x3b0/0x648 [ath9k]
    [ 13.039281] [<801e4204>] snprintf+0x1c/0x28
    [ 13.043661] [<82c21910>] ath_init_leds+0x2c8/0x30c [ath9k]
    [ 13.049405] [<82c22a64>] ath9k_init_device+0x9a4/0xa3c [ath9k]
    [ 13.055512] [<82c2ee98>] ath_pci_exit+0x25c/0x30c [ath9k]
    [ 13.061144] try_module_get ath9k gpiod_request
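The return-address trick can be tried outside the kernel as well. The following is a minimal user-space sketch, assuming GCC or Clang: `RET_IP` is a stand-in I define here for the kernel's `_RET_IP_`, and the function names are made up for illustration. It only shows that the address identifies the call site; turning it into a symbol name is what lookup_symbol_name does in the kernel and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* User-space stand-in (hypothetical) for the kernel's _RET_IP_:
 * the address the current function will return to. */
#define RET_IP ((uintptr_t)__builtin_return_address(0))

/* noinline keeps this a real call, so it has its own return address. */
__attribute__((noinline)) uintptr_t who_called_me(void)
{
	return RET_IP;
}

/* Two different call sites; each sees a different return address. */
__attribute__((noinline)) uintptr_t caller_a(void)
{
	uintptr_t ip = who_called_me();

	assert(ip != 0);
	printf("caller_a: who_called_me returns to %p\n", (void *)ip);
	return ip;
}

__attribute__((noinline)) uintptr_t caller_b(void)
{
	uintptr_t ip = who_called_me();

	assert(ip != 0);
	printf("caller_b: who_called_me returns to %p\n", (void *)ip);
	return ip;
}
```

Calling `caller_a()` and `caller_b()` prints two different addresses, which is why comparing the resolved name against "gpiod_request" is enough to single out one caller.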
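In the end it turned out that the reference was taken by gpiod_request and never released. Looking at the driver source, the release only happens when the driver is unloaded, but the driver cannot be unloaded while its reference count is nonzero. That is effectively a deadlock.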
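The circular dependency can be illustrated with a toy model. This is only a sketch, not kernel code: every name in it is hypothetical, and it assumes, as the trace suggests, that the requested pin is owned by the ath9k module itself, so requesting it pins the very module that would release it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model only: every name below is hypothetical, not a kernel API. */
struct toy_module {
	const char *name;
	int refcnt;	/* references currently held on the module */
	bool loaded;
};

/* Take a reference, as a resource layer does when handing out a resource. */
bool toy_get(struct toy_module *m)
{
	if (!m->loaded)
		return false;
	m->refcnt++;
	return true;
}

/* Drop a reference taken with toy_get(). */
void toy_put(struct toy_module *m)
{
	m->refcnt--;
}

/* Driver init: requesting the LED pin pins the module that owns the pin,
 * which in this model is the driver itself. */
void toy_init(struct toy_module *m)
{
	m->loaded = true;
	(void)toy_get(m);
}

/* Driver exit: the only place where the pin, and so the reference, is released. */
void toy_exit(struct toy_module *m)
{
	toy_put(m);
	m->loaded = false;
}

/* Unload request: refused while anyone holds a reference, so toy_exit()
 * never gets the chance to drop the reference taken in toy_init(). */
bool toy_rmmod(struct toy_module *m)
{
	if (m->refcnt != 0) {
		printf("rmmod: %s is in use (refcnt=%d)\n", m->name, m->refcnt);
		return false;
	}
	toy_exit(m);
	return true;
}

/* Walk through the stuck sequence. */
void demo_stuck_unload(void)
{
	struct toy_module m = { "ath9k", 0, false };

	toy_init(&m);
	assert(m.refcnt == 1);
	assert(!toy_rmmod(&m));	/* refused: exit never runs */
	assert(m.refcnt == 1);	/* so the count can never drop */
}
```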
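With the change shown above, try_module_get returns directly for gpiod_request without incrementing the count. After rebuilding the kernel, flashing it to the board and rebooting, the count is 0 and the module can be unloaded: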
    root@OpenWrt:/# lsmod
    ath                    18771  4 ath9k,ath9k_common,ath9k_hw,ath10k_core
    ath10k_core           284008  1 ath10k_pci
    ath10k_pci             33859  0
    ath9k                  98779  0
However, the kernel now emits a warning, because the count was never incremented but is still decremented on unload:
    root@OpenWrt:/# rmmod ath9k
    [ 53.480231] ------------[ cut here ]------------
    [ 53.485026] WARNING: CPU: 0 PID: 1450 at /work/k2t/k2t-mesh/linux-4.9.123/kernel/module.c:1120 module_put+0x64/0xec
    [ 53.495846] Modules linked in: ath9k(-) ath9k_common pppoe ppp_async ath9k_hw ath10k_pci ath10k_core ath pppox ppp_generic nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ip_tables crc_ccitt compat ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables gpio_button_hotplug
    [ 53.557476] CPU: 0 PID: 1450 Comm: rmmod Not tainted 4.9.123 #24
    [ 53.563692] Stack : 804e7672 00000034 00000000 00000000 82ae07bc 8046f247 80422cd4 000005aa
    [ 53.572392] 804e37c0 00000460 7701a000 00000000 00000000 800ae18c 00000003 80470000
    [ 53.581092] 80428bb8 00000460 80426ad4 82a4bc64 00000000 800e0ef8 804e7672 00000067
    [ 53.589792] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [ 53.598483] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [ 53.607184] ...
    [ 53.609729] Call Trace:
    [ 53.612257] [<8006bd78>] show_stack+0x54/0x88
    [ 53.616772] [<80081bb8>] __warn+0xe4/0x118
    [ 53.621010] [<80081c7c>] warn_slowpath_null+0x1c/0x30
    [ 53.626234] [<800ce4bc>] module_put+0x64/0xec
    [ 53.630757] [<80204f00>] gpiod_free+0x3c/0x68
    [ 53.635275] [<833e15b0>] ath_deinit_leds+0x74/0x10c [ath9k]
    [ 53.641042] [<833e2b2c>] ath9k_deinit_device+0x30/0x990 [ath9k]
    [ 53.647158] [<833eec80>] ath_pci_exit+0x44/0x30c [ath9k]
    [ 53.652663] ---[ end trace c16c94bd8f41a14b ]---
    [ 53.657428] module_Put ath9k gpiod_free
    [ 53.682849] ath9k: ath9k: Driver unloaded
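Why the put side complains can be sketched with another toy model. The names are hypothetical and this is not the kernel's implementation; it only assumes that the put path refuses to take the count below zero and warns instead, which is presumably what the check at module.c:1120 in the log does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy reference counter; names are hypothetical, not kernel APIs. */
struct toy_ref {
	int refcnt;	/* outstanding references */
	int warnings;	/* how many unbalanced puts were seen */
};

/* Patched "get": always reports success, but for the special-cased caller
 * (skip_count == true) it does not actually take a reference. */
bool toy_get_patched(struct toy_ref *r, bool skip_count)
{
	if (!skip_count)
		r->refcnt++;
	return true;
}

/* "put": decrement only if the count is positive, otherwise warn. */
void toy_put(struct toy_ref *r)
{
	if (r->refcnt > 0) {
		r->refcnt--;
		return;
	}
	r->warnings++;
	fprintf(stderr, "WARNING: put without a matching get\n");
}

/* Request skipped the count, free still drops it: one warning. */
void demo_unbalanced_put(void)
{
	struct toy_ref r = { 0, 0 };

	assert(toy_get_patched(&r, true));	/* hacked request path */
	assert(r.refcnt == 0);
	toy_put(&r);				/* free path on unload */
	assert(r.warnings == 1);
	assert(r.refcnt == 0);
}
```

In this model the warning disappears only if the get and put sides agree, for example by skipping the put for the same caller as well. The original hack changes only the get side, so the imbalance shows up at unload time.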