Received: by oss.sgi.com id ; Tue, 20 Jun 2000 18:49:53 -0700 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:10817 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Tue, 20 Jun 2000 18:49:37 -0700 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via SMTP id SAA04067 for ; Tue, 20 Jun 2000 18:54:45 -0700 (PDT) mail_from (kaos@kao2.melbourne.sgi.com) Received: from kao2.melbourne.sgi.com (kao2.melbourne.sgi.com [134.14.55.180]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA01954; Wed, 21 Jun 2000 11:48:15 +1000 X-Mailer: exmh version 2.1.1 10/15/1999 From: Keith Owens To: Andrew Morton cc: "netdev@oss.sgi.com" Subject: Re: modular net drivers, take 2 In-reply-to: Your message of "Wed, 21 Jun 2000 01:18:43 GMT." <395017F3.516165AD@uow.edu.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 21 Jun 2000 11:48:15 +1000 Message-ID: <6062.961552095@kao2.melbourne.sgi.com> Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing On Wed, 21 Jun 2000 01:18:43 +0000, Andrew Morton wrote: >Keith Owens wrote: >> ioctls are not a problem, as long as they use a file descriptor, i.e. >> no global ioctls. Getting a file descriptor requires open() or its >> equivalent which set the module use_count. The race is in open, I >> don't know of any races after use_count is set and open() has complete >> and left the module. > >I don't think you're right here, Keith. Note I said "ioctls are not a problem, as long as they use a file descriptor". Getting the file descriptor sets the module reference count. But anything that accesses module code without a reference count is a bomb waiting to explode. >ioctls on the netdevice don't require a descriptor which is associated >with dev->open(). For example, Donald's mii-diag application and >ifconfig both call device-specific functions without ever having called >dev->open(): > >modprobe driver >mii-diag -v eth0 >ifconfig -a > >In this example, both dev->ioctl() and dev->get_stats() are called while >the module refcount is zero. So they're as risky as open(); these code >paths need to be audited for races wrt kmalloc->schedule() >opportunities. Absolutely. No module reference count == no safety. mii-diag gets a file descriptor but it is for a general socket and is not tied to the interface module in any way. >How about a totally different approach: > >In the module_exit() we locate all netdevices associated with this >module and overwrite all their function pointers with the addresses of >non-modular stub functions which return ENODEV. Then we don't have to >worry about device methods being called after unload. Change every module that registers anything to make sure that they replace the register data with stubs on exit? And make sure that all of them do so before they sleep anywhere in module cleanup? It would work but is it the best solution? The existing method of avoiding module races is beginning to look like a dead dog. Look at the constraints we have to run under :- * All code that can ever call any module functions must either have a reference count on that module or must run under the same lock as the module unload (big kernel lock). * Every module must be checked to see that it never sleeps before doing MOD_INC_USE_COUNT. * Every module must be checked to see that it never sleeps after doing MOD_DEC_USE_COUNT. * Every module that registers anything must change the registered functions in module cleanup and must do so before sleeping (new). That is an awful lot of opportunities to make mistakes. And forcing lots of code to run under the big kernel lock does not scale well.