Received: by oss.sgi.com id ; Tue, 20 Jun 2000 21:25:43 -0700
Received: from smtprch1.nortelnetworks.com ([192.135.215.14]:7590 "EHLO smtprch1.nortel.com")
	by oss.sgi.com with ESMTP id ; Tue, 20 Jun 2000 21:25:14 -0700
Received: from zrchb213.us.nortel.com (actually zrchb213)
	by smtprch1.nortel.com; Tue, 20 Jun 2000 23:23:49 -0500
Received: from zctwb003.asiapac.nortel.com ([47.152.32.111])
	by zrchb213.us.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21)
	id NKANVN35; Tue, 20 Jun 2000 23:24:05 -0500
Received: from pwold011.asiapac.nortel.com ([47.181.193.45])
	by zctwb003.asiapac.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21)
	id NCLF8D35; Wed, 21 Jun 2000 14:24:06 +1000
Received: from uow.edu.au (IDENT:akpm@localhost [127.0.0.1])
	by pwold011.asiapac.nortel.com (8.9.3/8.9.3) with ESMTP id OAA28251;
	Wed, 21 Jun 2000 14:24:01 +1000
Message-ID: <39504361.81F03943@uow.edu.au>
Date: Wed, 21 Jun 2000 04:24:01 +0000
X-Sybari-Space: 00000000 00000000 00000000
From: Andrew Morton
X-Mailer: Mozilla 4.61 [en] (X11; I; Linux 2.4.0-test1-ac10 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Keith Owens
CC: "netdev@oss.sgi.com"
Subject: Re: modular net drivers, take 2
References: Your message of "Wed, 21 Jun 2000 02:37:58 GMT."
	<39502A86.6C8C6FC2@uow.edu.au> <12333.961559040@kao2.melbourne.sgi.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Keith Owens wrote:
>
> Anything sleeping loses the lock.  Any sleep in module open code primes
> the race, if the module_exit code also sleeps the race is triggered.

You're a hard man, Mr Owens.

So sys_delete_module() isn't allowed to sleep.  It's hard to make this
rule future-safe.

Do you think that the concept of grabbing the entire machine during
module unload is an acceptable one?
I think it is, because the act of actually unloading kernel text is so
unique and traumatic.  If so then it shouldn't be too hard to find a
way.

You have shown why we can't use the big lock, but we could create a new
one for this purpose.  The challenge is to find a place to put it: a
code path which is regularly traversed in toplevel context and has an
upper bound on the revisit period.  Such as schedule() (we'd get
shot..).

sys_delete_module()
{
	...
	spin_lock(&module_deletion_lock);
	blocked_cpus = 1 << smp_processor_id();
	while (blocked_cpus != ((1 << smp_num_cpus) - 1))
		;
	{
		/*
		 * I think the only code which needs to go in here is
		 * the call to vfree(module).
		 */
	}
	spin_unlock(&module_deletion_lock);
	...
}

schedule()
{
	...
	if (spin_is_locked(&module_deletion_lock))
		wait_while_unloading();
}

wait_while_unloading()
{
	/* set_bit() takes the bit number first, then the address */
	set_bit(smp_processor_id(), &blocked_cpus);
	while (spin_is_locked(&module_deletion_lock))
		;
}
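
For anyone who wants to poke at the rendezvous outside the kernel, here
is a rough user-space sketch of the same idea, with pthreads standing
in for CPUs.  It is not kernel code and not a proposed patch.  All of
the names (run_unload_demo, sched_point, cpu_loop, in_module, the
unloading flag) are made up for illustration, an atomic flag stands in
for module_deletion_lock, and the spin loops yield instead of busy
waiting.  One deliberate difference from the pseudocode above: the mask
is reset to the unloader's own bit before the flag is raised, so a
parked CPU's bit cannot be overwritten afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_CPUS 8

static atomic_int  unloading;     /* stand-in for module_deletion_lock being held */
static atomic_uint blocked_cpus;  /* bit n set => simulated CPU n is parked */
static atomic_int  in_module;     /* CPUs currently "running module text" */
static atomic_int  stop;          /* tells the simulated CPUs to exit */
static int ncpus;
static int violations;            /* times the free saw a CPU inside the module */

/* Park this CPU until the unloader drops the flag. */
static void wait_while_unloading(int cpu)
{
	atomic_fetch_or(&blocked_cpus, 1u << cpu);
	while (atomic_load(&unloading))
		sched_yield();
}

/* Stand-in for the check added to schedule(). */
static void sched_point(int cpu)
{
	if (atomic_load(&unloading))
		wait_while_unloading(cpu);
}

/* One simulated CPU: touch the "module", then pass a schedule point. */
static void *cpu_loop(void *arg)
{
	int cpu = (int)(intptr_t)arg;

	while (!atomic_load(&stop)) {
		atomic_fetch_add(&in_module, 1);  /* enter module text */
		atomic_fetch_sub(&in_module, 1);  /* leave module text */
		sched_point(cpu);
	}
	return NULL;
}

/* Stand-in for the critical part of sys_delete_module(). */
static void delete_module(int cpu)
{
	unsigned int all = (1u << ncpus) - 1;

	atomic_store(&blocked_cpus, 1u << cpu);  /* reset mask first */
	atomic_store(&unloading, 1);             /* then "take the lock" */
	while (atomic_load(&blocked_cpus) != all)
		sched_yield();                   /* wait for every CPU to park */

	/* This is where vfree(module) would go. */
	if (atomic_load(&in_module) != 0)
		violations++;

	atomic_store(&unloading, 0);             /* release the parked CPUs */
}

/* Run one unload with n simulated CPUs; returns the violation count. */
int run_unload_demo(int n)
{
	pthread_t t[MAX_CPUS];

	assert(n >= 1 && n <= MAX_CPUS);
	ncpus = n;
	violations = 0;
	atomic_store(&stop, 0);
	atomic_store(&in_module, 0);

	for (int i = 1; i < n; i++)
		pthread_create(&t[i], NULL, cpu_loop, (void *)(intptr_t)i);
	delete_module(0);
	atomic_store(&stop, 1);
	for (int i = 1; i < n; i++)
		pthread_join(t[i], NULL);
	return violations;
}
```

Calling run_unload_demo(4) from a small main() and building with
-pthread should return 0, since every simulated CPU is parked at its
schedule point when the free happens.  The sketch says nothing about
interrupt context or about how long a real CPU might take to reach
schedule(), which is the hard part of the proposal.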