From davem@pizda.ninka.net Mon Dec 1 00:06:54 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 00:07:14 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB186sTa017689 for ; Mon, 1 Dec 2003 00:06:54 -0800
Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id AAA05215; Mon, 1 Dec 2003 00:06:16 -0800
Date: Mon, 1 Dec 2003 00:06:16 -0800
From: "David S. Miller"
To: Ben Greear
Cc: scott.feldman@intel.com, netdev@oss.sgi.com
Subject: Re: Problems with e1000 in 2.4.23
Message-Id: <20031201000616.1db7b6f4.davem@redhat.com>
In-Reply-To: <3FCAACAD.7090609@candelatech.com>
References: <3FCAACAD.7090609@candelatech.com>
X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 1784
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Sun, 30 Nov 2003 18:51:25 -0800
Ben Greear wrote:

> Also, I am seeing bogus things on this machine regardless of which
> kernel and which e1000 driver I use, so it's quite possible that either
> the NIC hardware or the MB/RAM/CPU/Whatever is just plain not quite
> right.

Ben, I don't want to beat an old dead horse, but is this
the same system where you were having all of those
overheating problems a long time ago?

From casellas@infres.enst.fr Mon Dec 1 06:36:34 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 06:36:49 -0800 (PST)
Received: from infres.enst.fr (infres.enst.fr [137.194.192.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1EaXTa012794 for ; Mon, 1 Dec 2003 06:36:34 -0800
Received: from gervaise.enst.fr (gervaise.enst.fr [137.194.160.71]) by infres.enst.fr (Postfix) with ESMTP id 5BAFB18DC for ; Mon, 1 Dec 2003 15:36:27 +0100 (MET)
Received: from localhost (casellas@localhost) by gervaise.enst.fr (8.11.6+Sun/8.11.6) with ESMTP id hB1EaQV26056 for ; Mon, 1 Dec 2003 15:36:26 +0100 (MET)
X-Authentication-Warning: gervaise.enst.fr: casellas owned process doing -bs
Date: Mon, 1 Dec 2003 15:36:26 +0100 (MET)
From: Ramon Casellas
X-X-Sender: casellas@gervaise.enst.fr
To: netdev@oss.sgi.com
Subject: Request: Allocate a Netlink Family Number for MPLS
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 1785
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: casellas@infres.enst.fr
Precedence: bulk
X-list: netdev

Jamal/Dave/all,

We're working on porting mpls for linux to 2.6 and one of the planned
features is to move from an IOCTL based approach to a Netlink based one
for updating/querying the MPLS FTN/ILM/Label Mapping/... tables from
userspace. Since netlink families are public, I would
like to know if it is possible to reserve a family for MPLS, even though
the MPLS patch is not part of the official kernel.

Please let me know who I should contact/forward my request (Dave Miller is
listed as the main maintainer of the net core) or if, on the contrary,
there are valid reasons to reject our request.

Thanks in advance,
Ramon

Something like this:

diff -urN linux-2.6.0-test11/include/linux/netlink.h linux-2.6.0-test11-mpls/include/linux/netlink.h
--- linux-2.6.0-test11/include/linux/netlink.h 2003-11-27 14:24:00.000000000 +0100
+++ linux-2.6.0-test11-mpls/include/linux/netlink.h 2003-11-30 13:47:45.000000000 +0100
@@ -12,6 +12,11 @@
 #define NETLINK_NFLOG 5 /* netfilter/iptables ULOG */
 #define NETLINK_XFRM 6 /* ipsec */
 #define NETLINK_ARPD 8
+
+#if defined(CONFIG_MPLS) || defined(CONFIG_MPLS_MODULE)
+#define NETLINK_MPLS 9
+#endif
+
 #define NETLINK_ROUTE6 11 /* af_inet6 route comm channel */
 #define NETLINK_IP6_FW 13
 #define NETLINK_DNRTMSG 14 /* DECnet routing messages */

// -------------------------------------------------------------------
// Ramon Casellas - GET/ENST/INFRES/RHD/A508 - casellas@infres.enst.fr

From hch@infradead.org Mon Dec 1 07:30:54 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 07:31:13 -0800 (PST)
Received: from phoenix.infradead.org (phoenix.infradead.org [213.86.99.234]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1FUrTa017364 for ; Mon, 1 Dec 2003 07:30:54 -0800
Received: from hch by phoenix.infradead.org with local (Exim 4.22) id 1AQq0i-00011I-J0; Mon, 01 Dec 2003 15:30:52 +0000
Date: Mon, 1 Dec 2003 15:30:52 +0000
From: Christoph Hellwig
To: Ramon Casellas
Cc: netdev@oss.sgi.com
Subject: Re: Request: Allocate a Netlink Family Number for MPLS
Message-ID: <20031201153052.A3879@infradead.org>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: ; from casellas@infres.enst.fr on Mon, Dec 01, 2003 at 03:36:26PM +0100
X-archive-position: 1786
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hch@infradead.org
Precedence: bulk
X-list: netdev

On Mon, Dec 01, 2003 at 03:36:26PM +0100, Ramon Casellas wrote:
> #define NETLINK_ARPD 8
> +
> +#if defined(CONFIG_MPLS) || defined(CONFIG_MPLS_MODULE)
> +#define NETLINK_MPLS 9
> +#endif
> +

This is bogus - either it's reserved or not, so the ifdef doesn't
make sense.

Also given that Dave & co are working on mpls already I wonder
whether it's a good idea to reserve this, but I'll let him speak
for himself :)

From greearb@candelatech.com Mon Dec 1 09:26:56 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 09:27:09 -0800 (PST)
Received: from grok.yi.org (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1HQtTa026501 for ; Mon, 1 Dec 2003 09:26:56 -0800
Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id hB1HQcKt029022; Mon, 1 Dec 2003 09:26:45 -0800
Message-ID: <3FCB79CE.40409@candelatech.com>
Date: Mon, 01 Dec 2003 09:26:38 -0800
From: Ben Greear
Organization: Candela Technologies
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "David S. Miller"
CC: scott.feldman@intel.com, netdev@oss.sgi.com
Subject: Re: Problems with e1000 in 2.4.23
References: <3FCAACAD.7090609@candelatech.com> <20031201000616.1db7b6f4.davem@redhat.com>
In-Reply-To: <20031201000616.1db7b6f4.davem@redhat.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 1787
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: greearb@candelatech.com
Precedence: bulk
X-list: netdev

David S. Miller wrote:
> On Sun, 30 Nov 2003 18:51:25 -0800
> Ben Greear wrote:
>
>
>>Also, I am seeing bogus things on this machine regardless of which
>>kernel and which e1000 driver I use, so it's quite possible that either
>>the NIC hardware or the MB/RAM/CPU/Whatever is just plain not quite
>>right.
>
>
> Ben, I don't want to beat an old dead horse, but is this
> the same system where you were having all of those
> overheating problems a long time ago?

Yep, I put a big ole fan right on the NICs but to no avail. However,
it may not be enough, or the chipset/NIC may be so cooked/flaky that
it doesn't really matter any more.

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From niv@us.ibm.com Mon Dec 1 10:12:40 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 10:12:52 -0800 (PST)
Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1ICUTa027633 for ; Mon, 1 Dec 2003 10:12:39 -0800
Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.17.195.12]) by e33.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB1ICLJh340332; Mon, 1 Dec 2003 13:12:21 -0500
Received: from us.ibm.com (d03av01.boulder.ibm.com [9.17.193.81]) by westrelay03.boulder.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB1ICJeF128462; Mon, 1 Dec 2003 11:12:20 -0700
Message-ID: <3FCB8415.7060101@us.ibm.com>
Date: Mon, 01 Dec 2003 10:10:29 -0800
From: Nivedita Singhvi
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Ronnie Sahlberg
CC: netdev@oss.sgi.com
Subject: Re: TCP retransmission timers, questions
References: <010801c3b657$e3093fb0$6501010a@C5043436>
In-Reply-To: <010801c3b657$e3093fb0$6501010a@C5043436>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 1788
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: niv@us.ibm.com
Precedence: bulk
X-list: netdev

Ronnie Sahlberg wrote:

> By looking at the kernel sources it seems that the minimum TCP
> retransmission timeout is hardcoded to 200ms.
> Is this correct?

Yes, that is correct.

> While I understand why it is important to not be too aggressive in
> retransmitting I wonder if it would be possible to get
> an interface in proc where one could "tune" this.

Currently, not unless you edit the kernel header file yourself
and recompile the kernel. Not recommended for several reasons.

> The reason for this is that in some applications you do have a completely
> private, dedicated network used for one specific application.
> Those networks can be dimensioned so that congestion "should" not occur.
> However, packets are lost from time to time and sometimes packets will be
> lost.
> In those isolated dedicated subnets, with end to end network latency in the
> sub ms range, would it not be useful to be able to allow
> the retransmission timeout to drop down to 5-10ms?

Exactly the scheme I was interested in proposing a while ago -
provide an environment for private networks that would allow more
flexible tuning.

> Does anyone know of any work/research in the area of tcp retransmission
> timeouts for very high bandwidth, low latency networks?
> I have checked both the IETF list of drafts, Sally Floyd's pages and google
> but could not find anything.

Not that I could find last year either.

> It seems to me that all research/experimentation in high throughput is for
> high bandwidth high latency links and tuning the slowstart/congestion
> avoidance algorithms.
> What about high throughput, very low latency? Does anyone know of any
> papers in that area?

I'm doing my own experimentation for this environment - case
study a 3 tiered app with a private network between the web
front end and the database backend. I'm playing with gigabit
but hope to do some 10Gb testing sometime in the near future.

Hope to provide an experimental patch to play with, but it won't
be soon. Mostly January.

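
To make the timeout floor under discussion concrete, here is a minimal user-space sketch. It is not the Linux implementation: it assumes the smoothing formulas from RFC 2988 (with the clock-granularity term left out), and rto_state, rto_update() and the min_rto_us parameter are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical per-connection estimator state, all times in microseconds. */
struct rto_state {
    int64_t srtt_us;    /* smoothed round-trip time */
    int64_t rttvar_us;  /* round-trip time variation */
    int     has_sample; /* 0 until the first measurement arrives */
};

/*
 * Feed one RTT measurement and return the resulting timeout.
 * min_rto_us is the floor being debated in this thread: 200000 models
 * the hardcoded 200ms, 5000 would model a 5ms floor on a private net.
 */
int64_t rto_update(struct rto_state *s, int64_t rtt_us, int64_t min_rto_us)
{
    int64_t rto;

    if (!s->has_sample) {
        /* First measurement: SRTT = R, RTTVAR = R/2 */
        s->srtt_us = rtt_us;
        s->rttvar_us = rtt_us / 2;
        s->has_sample = 1;
    } else {
        int64_t err = s->srtt_us - rtt_us;
        if (err < 0)
            err = -err;
        /* RTTVAR = 3/4 RTTVAR + 1/4 |SRTT - R|, using the old SRTT */
        s->rttvar_us = (3 * s->rttvar_us + err) / 4;
        /* SRTT = 7/8 SRTT + 1/8 R */
        s->srtt_us = (7 * s->srtt_us + rtt_us) / 8;
    }

    rto = s->srtt_us + 4 * s->rttvar_us;
    return rto < min_rto_us ? min_rto_us : rto;
}
```

With sub-millisecond samples the computed value stays far below 200ms, so in this model the floor alone determines the timeout, which is why the value of the floor matters so much on a low-latency private network.
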
We had a thread on this a while ago, and DaveM pointed out
that this was really a research area because the 200ms timer
limit (BSD inherited) played a rather critical role in all the
congestion control, and what its impact might be if changed on
Internet traffic really needed to be studied/researched.
However, that wouldn't apply to private, non-routable networks.

> For specific applications, running on completely isolated dedicated
> networks, dimensioned to make congestion unlikely, isolated so it will NEVER
> compete about bandwidth with normal TCPs on the internet, to me it would
> make sense to allow the retransmission timeout to be allowed to drop
> significantly below 200ms.

Exactly.

> Another question, I think it was RFC2988 (but cannot find it again) that
> discussed that a TCP may add an artificial delay in sending the packets
> based on
> the RTT so that when sending an entire window the packets are spaced
> equidistantly across the RTT interval instead of in just one big burst.
> This to prevent the burstiness of the traffic and make buffer
> overruns/congestion less likely.

I haven't seen this help for the most part. This is helpful
only in very selective situations. If you're studying multiple
streams across one network, performance could be equally
hurt/helped. Have you any data on this?

> I have seen indications that w2k/bsd might in some conditions do this.
> Does Linux do this? my search through the sources came up with nothing.
> Does anyone know whether there are other TCPs that do this?
> As i said I have seen something that looked like that on a BSD stack but it
> could have been related to something else.

Linux doesn't, and the others don't either, to my knowledge,
but I could be wrong, it's been a while since I looked at the
other OSs.

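
The spacing idea in the quoted question comes down to a small calculation, sketched here. This is an illustration only, not code from any TCP stack (as noted above, Linux does not do this); pacing_gap_us() and pacing_send_offset_us() are hypothetical helpers, and the window is assumed to be counted in whole packets.

```c
#include <stdint.h>

/*
 * Gap between consecutive packets when cwnd_pkts packets are spread
 * evenly over one round-trip time instead of being sent as one burst.
 * Returns 0 for an empty window.
 */
uint64_t pacing_gap_us(uint64_t rtt_us, uint32_t cwnd_pkts)
{
    if (cwnd_pkts == 0)
        return 0;
    return rtt_us / cwnd_pkts;
}

/* Send time of the i-th packet (0-based), relative to the window start. */
uint64_t pacing_send_offset_us(uint64_t rtt_us, uint32_t cwnd_pkts, uint32_t i)
{
    return (uint64_t)i * pacing_gap_us(rtt_us, cwnd_pkts);
}
```

For a 1ms RTT and a 10-packet window this gives one packet every 100 microseconds, which shows how fine-grained the timers would have to be on the low-latency networks discussed in this thread.
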
hth,
thanks,
Nivedita

From shemminger@osdl.org Mon Dec 1 11:32:56 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 11:33:09 -0800 (PST)
Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1JWtTa028898 for ; Mon, 1 Dec 2003 11:32:56 -0800
Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB1JWUZ21603; Mon, 1 Dec 2003 11:32:32 -0800
Date: Mon, 1 Dec 2003 11:33:12 -0800
From: Stephen Hemminger
To: anand@eis.iisc.ernet.in (SVR Anand)
Cc: davem@redhat.com (David S. Miller), netdev@oss.sgi.com
Subject: Re: Bridging woes after 3 days
Message-Id: <20031201113312.2ce6ec0f.shemminger@osdl.org>
In-Reply-To: <200311290944.PAA27304@eis.iisc.ernet.in>
References: <20031123152601.67646dc1.davem@redhat.com> <200311290944.PAA27304@eis.iisc.ernet.in>
Organization: Open Source Development Lab
X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu)
X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 1789
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: shemminger@osdl.org
Precedence: bulk
X-list: netdev

On Sat, 29 Nov 2003 15:14:08 +0530 (GMT+05:30)
anand@eis.iisc.ernet.in (SVR Anand) wrote:

> Hi,
>
> After a continuous run for 3 days the bridge came down crashing with the
> following kernel panic screen dump. The kernel is 2.6.0-test9-bk25 with
> kernel preemption disabled.
>
> The following call stack is what I have seen on the console. The ethernet
> cards are RTL8139. Please let me know if you want more information or finer
> debugging method, I will pass it on when the bridge fails the next time.

Can you hook a serial console to catch the precise wording?
Also if you save copies of /proc/slabinfo on a regular interval (like per hour),
then it is possible to see if there is a memory leak.

From casellas@infres.enst.fr Mon Dec 1 12:00:21 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 12:00:37 -0800 (PST)
Received: from infres.enst.fr (infres.enst.fr [137.194.192.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1K0ATa006998 for ; Mon, 1 Dec 2003 12:00:11 -0800
Received: from gervaise.enst.fr (gervaise.enst.fr [137.194.160.71]) by infres.enst.fr (Postfix) with ESMTP id 769F818D1; Mon, 1 Dec 2003 20:26:31 +0100 (MET)
Received: from localhost (casellas@localhost) by gervaise.enst.fr (8.11.6+Sun/8.11.6) with ESMTP id hB1JQVE27335; Mon, 1 Dec 2003 20:26:31 +0100 (MET)
X-Authentication-Warning: gervaise.enst.fr: casellas owned process doing -bs
Date: Mon, 1 Dec 2003 20:26:30 +0100 (MET)
From: Ramon Casellas
X-X-Sender: casellas@gervaise.enst.fr
To: Christoph Hellwig
Cc: netdev@oss.sgi.com
Subject: Re: Request: Allocate a Netlink Family Number for MPLS
In-Reply-To: <20031201153052.A3879@infradead.org>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 1790
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: casellas@infres.enst.fr
Precedence: bulk
X-list: netdev

On Mon, 1 Dec 2003, Christoph Hellwig wrote:

> On Mon, Dec 01, 2003 at 03:36:26PM +0100, Ramon Casellas wrote:
> > #define NETLINK_ARPD 8
> > +
> > +#if defined(CONFIG_MPLS) || defined(CONFIG_MPLS_MODULE)
> > +#define NETLINK_MPLS 9
> > +#endif
> > +
>
> This is bogus - either it's reserved or not, so the ifdef doesn't
> make sense.

Yes of course :) that was just a cut&paste from an interim patch, where we
put ifdefs around everything.
If you decide to allocate a family number, then
ifdefs don't make sense (much like in if_ether.h or ppp_defs.h), but
you're right, my mistake, I didn't mean to propose a real patch.

>
> Also given that Dave & co are working on mpls already I wonder
> whether it's a good idea to reserve this, but I'll let him speak
> for himself :)

Well, we are still coordinating efforts, and Jamal has some design docs
around... but regardless of the actual implementation, a mechanism to
communicate with userspace will be needed, and IM(Very, very)HO, netlink
is a good candidate, analogous to rtnetlink, but this is open to discussion.

Thanks,
R.

From herbert@gondor.apana.org.au Mon Dec 1 12:17:07 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 12:17:21 -0800 (PST)
Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1KH2Ta007512 for ; Mon, 1 Dec 2003 12:17:06 -0800
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1AQuTV-00005l-00; Tue, 02 Dec 2003 07:16:53 +1100
Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1AQuTT-0005G7-00; Tue, 02 Dec 2003 07:16:51 +1100
Date: Tue, 2 Dec 2003 07:16:51 +1100
To: "David S. Miller" , netdev@oss.sgi.com
Subject: [ROUTE] PMTU only works on half the time
Message-ID: <20031201201651.GA20194@gondor.apana.org.au>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="Nq2Wo0NMKNjxTN9z"
Content-Disposition: inline
User-Agent: Mutt/1.5.4i
From: Herbert Xu
X-archive-position: 1791
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: herbert@gondor.apana.org.au
Precedence: bulk
X-list: netdev

--Nq2Wo0NMKNjxTN9z
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi Dave:

I found out that PMTU only works on those routing cache entries where
rt_src != 0.
This patch should make it work for all matching entries.

Cheers,
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu 许志壬
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

--Nq2Wo0NMKNjxTN9z
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=p

Index: kernel-source-2.5/net/ipv4/route.c
===================================================================
RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.5/net/ipv4/route.c,v
retrieving revision 1.3
diff -u -r1.3 route.c
--- kernel-source-2.5/net/ipv4/route.c 24 Nov 2003 09:52:04 -0000 1.3
+++ kernel-source-2.5/net/ipv4/route.c 1 Dec 2003 20:15:40 -0000
@@ -1259,9 +1259,9 @@
 		     rth = rth->u.rt_next) {
 			smp_read_barrier_depends();
 			if (rth->fl.fl4_dst == daddr &&
-			    rth->fl.fl4_src == skeys[i] &&
+			    (rth->fl.fl4_src == iph->saddr ||
+			     rth->rt_src == iph->saddr) &&
 			    rth->rt_dst == daddr &&
-			    rth->rt_src == iph->saddr &&
 			    rth->fl.fl4_tos == tos &&
 			    rth->fl.iif == 0 &&
 			    !(dst_metric_locked(&rth->u.dst, RTAX_MTU))) {

--Nq2Wo0NMKNjxTN9z--

From davem@pizda.ninka.net Mon Dec 1 12:46:17 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 12:46:31 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1KkHTa011430 for ; Mon, 1 Dec 2003 12:46:17 -0800
Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA19823; Mon, 1 Dec 2003 12:45:32 -0800
Date: Mon, 1 Dec 2003 12:45:32 -0800
From: "David S.
Miller"
To: Ramon Casellas
Cc: netdev@oss.sgi.com
Subject: Re: Request: Allocate a Netlink Family Number for MPLS
Message-Id: <20031201124532.3f6b6a65.davem@redhat.com>
In-Reply-To:
References:
X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 1792
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Mon, 1 Dec 2003 15:36:26 +0100 (MET)
Ramon Casellas wrote:

> We're working on porting mpls for linux to 2.6 and one of the planned
> features is to move from an IOCTL based approach to a Netlink based one
> for updating/querying the MPLS FTN/ILM/Label Mapping/... tables from
> userspace. Since netlink families are public, I would
> like to know if it is possible to reserve a family for MPLS, even though
> the MPLS patch is not part of the official kernel.

You don't need a whole new netlink family. Just create
a dummy address family for MPLS (ie. AF_MPLS) and then
just use NETLINK_ROUTE.

From herbert@gondor.apana.org.au Mon Dec 1 12:47:11 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 12:47:25 -0800 (PST)
Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1Kl8Ta011610 for ; Mon, 1 Dec 2003 12:47:10 -0800
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1AQuwf-0000D1-00; Tue, 02 Dec 2003 07:47:01 +1100
Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1AQuwe-0005IT-00; Tue, 02 Dec 2003 07:47:00 +1100
Date: Tue, 2 Dec 2003 07:47:00 +1100
To: "David S.
Miller" , netdev@oss.sgi.com
Subject: Re: [ROUTE] PMTU only works on half the time
Message-ID: <20031201204700.GA20349@gondor.apana.org.au>
References: <20031201201651.GA20194@gondor.apana.org.au>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="YZ5djTAD1cGYuMQK"
Content-Disposition: inline
In-Reply-To: <20031201201651.GA20194@gondor.apana.org.au>
User-Agent: Mutt/1.5.4i
From: Herbert Xu
X-archive-position: 1793
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: herbert@gondor.apana.org.au
Precedence: bulk
X-list: netdev

--YZ5djTAD1cGYuMQK
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Dec 02, 2003 at 07:16:51AM +1100, herbert wrote:
>
> I found out that PMTU only works on those routing cache entries where
> rt_src != 0. This patch should make it work for all matching entries.

That patch removed one line too many. This one should be better.
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu 许志壬
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

--YZ5djTAD1cGYuMQK
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=p

Index: kernel-source-2.5/net/ipv4/route.c
===================================================================
RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.5/net/ipv4/route.c,v
retrieving revision 1.3
diff -u -r1.3 route.c
--- kernel-source-2.5/net/ipv4/route.c 24 Nov 2003 09:52:04 -0000 1.3
+++ kernel-source-2.5/net/ipv4/route.c 1 Dec 2003 20:45:22 -0000
@@ -1260,8 +1260,9 @@
 			smp_read_barrier_depends();
 			if (rth->fl.fl4_dst == daddr &&
 			    rth->fl.fl4_src == skeys[i] &&
+			    (rth->fl.fl4_src == iph->saddr ||
+			     rth->rt_src == iph->saddr) &&
 			    rth->rt_dst == daddr &&
-			    rth->rt_src == iph->saddr &&
 			    rth->fl.fl4_tos == tos &&
 			    rth->fl.iif == 0 &&
 			    !(dst_metric_locked(&rth->u.dst, RTAX_MTU))) {

--YZ5djTAD1cGYuMQK--

From
garzik@gtf.org Mon Dec 1 13:02:54 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 13:03:09 -0800 (PST)
Received: from havoc.gtf.org (havoc.gtf.org [63.247.75.124]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1L2sTa020997 for ; Mon, 1 Dec 2003 13:02:54 -0800
Received: by havoc.gtf.org (Postfix, from userid 500) id F1C2A66CF; Mon, 1 Dec 2003 15:55:33 -0500 (EST)
Date: Mon, 1 Dec 2003 15:55:33 -0500
From: Jeff Garzik
To: Octave
Cc: Stephen Hemminger , netdev@oss.sgi.com, linux-net@vger.kernel.org
Subject: Re: NAPI 8139too.c for 2.4.23
Message-ID: <20031201205533.GA15846@gtf.org>
References: <20031201205038.GK10711@ovh.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20031201205038.GK10711@ovh.net>
User-Agent: Mutt/1.3.28i
X-archive-position: 1794
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

On Mon, Dec 01, 2003 at 09:50:38PM +0100, Octave wrote:
> Stephen,
> I get your patch from http://lwn.net/Articles/54815/ for 2.6.X and
> I rewrote it for 2.4.23. Tested with 2.4.23 on high load servers. I
> have no more "Too much work at interrupt".
>
> I dropped it on ftp://ftp.ovh.net/made-in-ovh/8139too.c-2.4-0.9.27
>
> Hope it helps.
> Octave

Very cool! Thanks for testing.

Is there any chance you could do some benchmark runs with ttcp or
somesuch?

Jeff

From voloterreno@tin.it Mon Dec 1 13:08:50 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 13:09:07 -0800 (PST)
Received: from vsmtp12.tin.it (vsmtp12.tin.it [212.216.176.206]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1L8nTa002620 for ; Mon, 1 Dec 2003 13:08:50 -0800
Received: from tin.it (80.180.66.85) by vsmtp12.tin.it (7.0.019) (authenticated as voloterreno@tin.it) id 3FC8F04600174580; Mon, 1 Dec 2003 22:08:33 +0100
Message-ID: <3FCBAEFF.6000606@tin.it>
Date: Mon, 01 Dec 2003 22:13:35 +0100
From: Marcello
User-Agent: Mozilla/5.0 (X11; U; Linux i686; it-IT; rv:1.5) Gecko/20031031
X-Accept-Language: it, en-us, en
MIME-Version: 1.0
To: Octave
CC: Stephen Hemminger , Jeff Garzik , netdev@oss.sgi.com, linux-net@vger.kernel.org
Subject: Re: NAPI 8139too.c for 2.4.23
References: <20031201205038.GK10711@ovh.net>
In-Reply-To: <20031201205038.GK10711@ovh.net>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 1795
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: voloterreno@tin.it
Precedence: bulk
X-list: netdev

Octave wrote:

>Stephen,
>I get your patch from http://lwn.net/Articles/54815/ for 2.6.X and
>I rewrote it for 2.4.23. Tested with 2.4.23 on high load servers. I
>have no more "Too much work at interrupt".
>
>I dropped it on ftp://ftp.ovh.net/made-in-ovh/8139too.c-2.4-0.9.27
>
>Hope it helps.
>Octave
>
>before:
>-------
># ps auxw
>root 256 0.0 0.0 0 0 ? SW Nov28 0:00 [eth0]
># ifconfig
> RX packets:40940899 errors:250542 dropped:7052 overruns:250542 frame:0
> TX packets:33057049 errors:0 dropped:0 overruns:20 carrier:0
># dmesg
>eth0: Setting 100mbps full-duplex based on auto-negotiated partner ability 41e1.
>nfs: server X.X.X.X not responding, still trying
>nfs: server X.X.X.X OK
>eth0: Too much work at interrupt, IntrStatus=0x0040.
>
>with NAPI
>---------
> RX packets:428253 errors:0 dropped:0 overruns:0 frame:0
> TX packets:357949 errors:0 dropped:0 overruns:0 carrier:0
>8139too Fast Ethernet driver 0.9.27
>PCI: Found IRQ 11 for device 00:0b.0
>eth0: RealTek RTL8139 at 0xec00, 00:e0:4c:91:03:b0, IRQ 11
>eth0: Identified 8139 chip type 'RTL-8100B/8139D'
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-net" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>

I'll try your variant of the driver immediately :)

Bye
Marcello

From oles@ovh.net Mon Dec 1 13:20:14 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 13:20:30 -0800 (PST)
Received: from ping.ovh.net (ping.ovh.net [213.186.33.13]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1LK3Ta003081 for ; Mon, 1 Dec 2003 13:20:04 -0800
Received: by ping.ovh.net (Postfix, from userid 502) id 07BC83B7A0; Mon, 1 Dec 2003 21:50:39 +0100 (CET)
Date: Mon, 1 Dec 2003 21:50:38 +0100
From: Octave
To: Stephen Hemminger , Jeff Garzik
Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org
Subject: NAPI 8139too.c for 2.4.23
Message-ID: <20031201205038.GK10711@ovh.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
User-Agent: Mutt/1.5.4i
X-archive-position: 1796
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: oles@ovh.net
Precedence: bulk
X-list: netdev

Stephen,
I get your patch from http://lwn.net/Articles/54815/ for 2.6.X and
I rewrote it for 2.4.23. Tested with 2.4.23 on high load servers. I
have no more "Too much work at interrupt".

I dropped it on ftp://ftp.ovh.net/made-in-ovh/8139too.c-2.4-0.9.27

Hope it helps.
Octave

before:
-------
# ps auxw
root 256 0.0 0.0 0 0 ?
SW Nov28 0:00 [eth0]
# ifconfig
 RX packets:40940899 errors:250542 dropped:7052 overruns:250542 frame:0
 TX packets:33057049 errors:0 dropped:0 overruns:20 carrier:0
# dmesg
eth0: Setting 100mbps full-duplex based on auto-negotiated partner ability 41e1.
nfs: server X.X.X.X not responding, still trying
nfs: server X.X.X.X OK
eth0: Too much work at interrupt, IntrStatus=0x0040.

with NAPI
---------
 RX packets:428253 errors:0 dropped:0 overruns:0 frame:0
 TX packets:357949 errors:0 dropped:0 overruns:0 carrier:0
8139too Fast Ethernet driver 0.9.27
PCI: Found IRQ 11 for device 00:0b.0
eth0: RealTek RTL8139 at 0xec00, 00:e0:4c:91:03:b0, IRQ 11
eth0: Identified 8139 chip type 'RTL-8100B/8139D'

From davem@pizda.ninka.net Mon Dec 1 13:52:52 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 13:53:09 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1LqqTa012987 for ; Mon, 1 Dec 2003 13:52:52 -0800
Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id NAA20001; Mon, 1 Dec 2003 13:51:54 -0800
Date: Mon, 1 Dec 2003 13:51:54 -0800
From: "David S.
Miller"
To: Herbert Xu
Cc: netdev@oss.sgi.com
Subject: Re: [ROUTE] PMTU only works on half the time
Message-Id: <20031201135154.6906454c.davem@redhat.com>
In-Reply-To: <20031201204700.GA20349@gondor.apana.org.au>
References: <20031201201651.GA20194@gondor.apana.org.au> <20031201204700.GA20349@gondor.apana.org.au>
X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 1797
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Tue, 2 Dec 2003 07:47:00 +1100
Herbert Xu wrote:

> On Tue, Dec 02, 2003 at 07:16:51AM +1100, herbert wrote:
> >
> > I found out that PMTU only works on those routing cache entries where
> > rt_src != 0. This patch should make it work for all matching entries.
>
> That patch removed one line too many. This one should be better.

Hmmm...

Herbert, do you see how the outer loop and the skeys[] thing works in
this PMTU handling code? This takes care of comparing both iph->saddr
and '0' against rt->rt_src.

From herbert@gondor.apana.org.au Mon Dec 1 14:05:20 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 14:05:34 -0800 (PST)
Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1M5HTa016835 for ; Mon, 1 Dec 2003 14:05:19 -0800
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1AQwAI-0000pH-00; Tue, 02 Dec 2003 09:05:10 +1100
Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1AQwAH-0005Q9-00; Tue, 02 Dec 2003 09:05:09 +1100
Date: Tue, 2 Dec 2003 09:05:09 +1100
To: "David S.
Miller" Cc: netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-ID: <20031201220509.GA20827@gondor.apana.org.au> References: <20031201201651.GA20194@gondor.apana.org.au> <20031201204700.GA20349@gondor.apana.org.au> <20031201135154.6906454c.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031201135154.6906454c.davem@redhat.com> User-Agent: Mutt/1.5.4i From: Herbert Xu X-archive-position: 1798 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev On Mon, Dec 01, 2003 at 01:51:54PM -0800, David S. Miller wrote: > > Herbert, do you see how the outer loop and the skey[] thing works in > this PMTU handling code? This takes care of comparing both iph->saddr > and '0' against rt->rt_src. It only takes care of fl4_src, not rt_src. Cheers, -- Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ ) Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From davem@pizda.ninka.net Mon Dec 1 14:22:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 14:22:37 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1MMPTa019016 for ; Mon, 1 Dec 2003 14:22:25 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id OAA20104; Mon, 1 Dec 2003 14:21:31 -0800 Date: Mon, 1 Dec 2003 14:21:31 -0800 From: "David S. 
Miller" To: Herbert Xu Cc: netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-Id: <20031201142131.5da50a07.davem@redhat.com> In-Reply-To: <20031201220509.GA20827@gondor.apana.org.au> References: <20031201201651.GA20194@gondor.apana.org.au> <20031201204700.GA20349@gondor.apana.org.au> <20031201135154.6906454c.davem@redhat.com> <20031201220509.GA20827@gondor.apana.org.au> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1799 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev

On Tue, 2 Dec 2003 09:05:09 +1100
Herbert Xu wrote:

> On Mon, Dec 01, 2003 at 01:51:54PM -0800, David S. Miller wrote:
> >
> > Herbert, do you see how the outer loop and the skey[] thing works in
> > this PMTU handling code? This takes care of comparing both iph->saddr
> > and '0' against rt->rt_src.
>
> It only takes care of fl4_src, not rt_src.

Indeed. At the surface it looks like a bug, but look at the
redirect handling tests in ip_rt_redirect(). It's a very similar
key comparison as the PMTU code, just structured differently:

		if (rth->fl.fl4_dst != daddr ||
		    rth->fl.fl4_src != skeys[i] ||
		    rth->fl.fl4_tos != tos ||
		    rth->fl.oif != ikeys[k] ||
		    rth->fl.iif != 0) {
			rthp = &rth->u.rt_next;
			continue;
		}

		if (rth->rt_dst != daddr ||
		    rth->rt_src != saddr ||
		    rth->u.dst.error ||
		    rth->rt_gateway != old_gw ||
		    rth->u.dst.dev != dev)
			break;

See? He's not comparing rt->rt_src against skeys[] and therefore
'0' here either.

I think the tests might be like this for a reason. I could see
Alexey constructing this test wrong in one instance, but in two
instances where the tests were structured totally different in
each case is hard to believe.
Let me think about this some more, maybe you're right and the error exists in both of these places. From oles@ovh.net Mon Dec 1 15:02:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 15:02:16 -0800 (PST) Received: from ping.ovh.net (ping.ovh.net [213.186.33.13]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1N20Ta020336 for ; Mon, 1 Dec 2003 15:02:01 -0800 Received: by ping.ovh.net (Postfix, from userid 502) id 5EF083B7A0; Tue, 2 Dec 2003 00:00:13 +0100 (CET) Date: Tue, 2 Dec 2003 00:00:13 +0100 From: Octave To: Jeff Garzik Cc: Stephen Hemminger , netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: NAPI 8139too.c for 2.4.23 Message-ID: <20031201230013.GM4313@ovh.net> References: <20031201205038.GK10711@ovh.net> <20031201205533.GA15846@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20031201205533.GA15846@gtf.org> User-Agent: Mutt/1.5.4i X-archive-position: 1800 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: oles@ovh.net Precedence: bulk X-list: netdev > Is there any chance you could do some benchmark runs with ttcp or > somesuch? I tested on 6-7 servers running with eepro eth0: Intel Corp. 82557/8/9 [Ethernet Pro 100], 00:E0:18:01:78:6C, IRQ 10. realtek 8139too with NAPI 8139too Fast Ethernet driver 0.9.27 realtek 8139too with no NAPI (standard driver with soft polling) If this quick test is correct, realtek 8139too's driver works as good as eepro's driver. 
Octave

>> from realtek (no NAPI) to realtek (no NAPI)
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from
ttcp-r: 327680000 bytes in 42.90 real seconds = 59671.99 Kbit/sec +++
ttcp-r: 224852 I/O calls, msec/call = 0.20, calls/sec = 5241.16
ttcp-r: 0.1user 1.5sys 0:42real 3% 0i+0d 0maxrss 0+2pf 0+0csw

>> from eepro to eepro
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from
ttcp-r: 327680000 bytes in 28.33 real seconds = 90379.31 Kbit/sec +++
ttcp-r: 225058 I/O calls, msec/call = 0.13, calls/sec = 7945.54
ttcp-r: 0.2user 4.2sys 0:28real 15% 0i+0d 0maxrss 0+2pf 0+0csw

>> from realtek (NAPI) to realtek (NAPI)
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from
ttcp-r: 327680000 bytes in 29.21 real seconds = 87644.11 Kbit/sec +++
ttcp-r: 225735 I/O calls, msec/call = 0.13, calls/sec = 7728.26
ttcp-r: 0.0user 1.7sys 0:29real 6% 0i+0d 0maxrss 0+2pf 0+0csw

>> from eepro to realtek (no NAPI)
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from
ttcp-t: 327680000 bytes in 34.32 real seconds = 74594.99 Kbit/sec +++
ttcp-t: 40000 I/O calls, msec/call = 0.88, calls/sec = 1165.55
ttcp-t: 0.0user 1.2sys 0:34real 3% 0i+0d 0maxrss 0+2pf 0+0csw

>> from realtek (NAPI) to realtek (no NAPI)
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from
ttcp-r: 327680000 bytes in 32.60 real seconds = 78532.74 Kbit/sec +++
ttcp-r: 225544 I/O calls, msec/call = 0.15, calls/sec = 6918.98
ttcp-r: 0.1user 1.6sys 0:32real 5% 0i+0d 0maxrss 0+2pf 0+0csw

>> from realtek (NAPI) to eepro
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from
ttcp-r: 327680000 bytes in 34.02 real seconds = 75250.05 Kbit/sec +++
ttcp-r: 225685 I/O calls, msec/call = 0.15, calls/sec = 6633.91
ttcp-r: 0.1user 3.7sys 0:34real 11% 0i+0d 0maxrss 0+2pf 0+0csw

From
davem@pizda.ninka.net Mon Dec 1 15:23:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 15:23:15 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1NN0Ta021323 for ; Mon, 1 Dec 2003 15:23:00 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id PAA20285; Mon, 1 Dec 2003 15:22:15 -0800 Date: Mon, 1 Dec 2003 15:22:15 -0800 From: "David S. Miller" To: "David S. Miller" Cc: herbert@gondor.apana.org.au, netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-Id: <20031201152215.522c2447.davem@redhat.com> In-Reply-To: <20031201142131.5da50a07.davem@redhat.com> References: <20031201201651.GA20194@gondor.apana.org.au> <20031201204700.GA20349@gondor.apana.org.au> <20031201135154.6906454c.davem@redhat.com> <20031201220509.GA20827@gondor.apana.org.au> <20031201142131.5da50a07.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1801 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 1 Dec 2003 14:21:31 -0800 "David S. Miller" wrote: > Let me think about this some more, maybe you're right and the > error exists in both of these places. Ok, I did my thinking :) rt->rt_src is special. It is the source address we have selected to use with this route. All output packets using this route must use rt->rt_src as iph->saddr. So, in effect, when we say "if (rt->rt_src == iph->saddr)" we are asking the question "did we make this packet?" I think this is why Alexey coded the test in this way. You are speaking of a case of zero source addresses. When would we output such an iph->saddr, by way of a route? Right now this is the only part I'm not seeing. 
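For readers following along, the distinction between the lookup key and the selected source address can be sketched in user space. This is a simplified illustration under stated assumptions, not the kernel code: struct toy_route and the toy_* helpers are hypothetical names, the table is a flat array rather than a hash chain, and only the fields discussed above are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a routing cache entry. */
struct toy_route {
	uint32_t key_src; /* lookup key (fl4_src in the thread); 0 = wildcard */
	uint32_t key_dst; /* lookup key (fl4_dst in the thread) */
	uint32_t rt_src;  /* source address actually selected for this route */
	uint32_t rt_dst;  /* destination address of this route */
};

/* True if a "frag needed" error quoting (saddr, daddr) may update r. */
static bool toy_pmtu_match(const struct toy_route *r, uint32_t skey,
			   uint32_t saddr, uint32_t daddr)
{
	return r->key_dst == daddr &&
	       r->key_src == skey &&   /* key may be saddr or the wildcard 0 */
	       r->rt_dst == daddr &&
	       r->rt_src == saddr;     /* strict: "did we make this packet?" */
}

/* Count the entries that would be updated for one quoted header. */
static int toy_pmtu_candidates(const struct toy_route *tbl, size_t n,
			       uint32_t saddr, uint32_t daddr)
{
	const uint32_t skeys[2] = { saddr, 0 };
	int nkeys = saddr ? 2 : 1; /* avoid probing the wildcard twice */
	int hits = 0;

	assert(tbl != NULL || n == 0);
	for (int i = 0; i < nkeys; i++)
		for (size_t j = 0; j < n; j++)
			if (toy_pmtu_match(&tbl[j], skeys[i], saddr, daddr))
				hits++;
	return hits;
}
```

In this sketch an entry keyed on the wildcard source still qualifies as long as its rt_src equals the quoted source address, while an entry whose rt_src is some other address never does, whichever key it was found under.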
I want to be careful in changing this code, as loosening the key check opens the possibility of new kinds of PMTU lowering attacks. From ja@ssi.bg Mon Dec 1 15:30:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 15:30:25 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1NU0Ta021870 for ; Mon, 1 Dec 2003 15:30:06 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB1NUHSu003229; Tue, 2 Dec 2003 01:30:17 +0200 Date: Tue, 2 Dec 2003 01:30:17 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: Herbert Xu cc: "David S. Miller" , Subject: Re: [ROUTE] PMTU only works on half the time In-Reply-To: <20031201204700.GA20349@gondor.apana.org.au> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1802 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Tue, 2 Dec 2003, Herbert Xu wrote: > On Tue, Dec 02, 2003 at 07:16:51AM +1100, herbert wrote: > > > > I found out that PMTU only works on those routing cache entries where > > rt_src != 0. This patch should make it work for all matching entries. > > That patch removed one line too many. This one should be better. IMO, the rt_src check in ip_rt_frag_needed is ok. I would suspect all rth->fl.fl4_tos checks too. It seems we need toskeys[2] and a second for loop if tos!=0. What about rewriting them to (rth->fl.fl4_tos == toskeys[j]). 
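A rough user-space sketch of what the extra toskeys[] loop would enumerate, assuming tos has already been masked by the caller. toy_key and toy_probe_keys are hypothetical names used only for illustration; the real lookup would also hash each key and walk the matching chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair of lookup keys: source address and TOS. */
struct toy_key {
	uint32_t src;
	uint8_t tos;
};

/*
 * Fill out[] with the (source, tos) keys that would be probed and
 * return how many were written (2 when tos == 0, otherwise 4).
 */
static size_t toy_probe_keys(uint32_t saddr, uint8_t tos,
			     struct toy_key out[4])
{
	const uint32_t skeys[2] = { saddr, 0 };
	const uint8_t toskeys[2] = { tos, 0 };
	size_t n = 0;

	assert(out != NULL);
	/* The second TOS pass is only needed when tos is non-zero. */
	for (int j = 0; j < (tos ? 2 : 1); j++)
		for (int i = 0; i < 2; i++) {
			out[n].src = skeys[i];
			out[n].tos = toskeys[j];
			n++;
		}
	return n;
}
```

For a packet with a non-zero TOS this yields four probes (exact and wildcard source, each with the exact TOS and with TOS 0); with TOS 0 the second pass is skipped and only two remain.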
Regards -- Julian Anastasov From greearb@candelatech.com Mon Dec 1 15:39:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 15:39:33 -0800 (PST) Received: from grok.yi.org (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1NdJTa022296 for ; Mon, 1 Dec 2003 15:39:19 -0800 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id hB1NdCKt012417; Mon, 1 Dec 2003 15:39:13 -0800 Message-ID: <3FCBD120.7070207@candelatech.com> Date: Mon, 01 Dec 2003 15:39:12 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Octave CC: netdev@oss.sgi.com Subject: Re: NAPI 8139too.c for 2.4.23 References: <20031201205038.GK10711@ovh.net> <20031201205533.GA15846@gtf.org> <20031201230013.GM4313@ovh.net> In-Reply-To: <20031201230013.GM4313@ovh.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1803 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Octave wrote: >>Is there any chance you could do some benchmark runs with ttcp or >>somesuch? > > > I tested on 6-7 servers running with > eepro eth0: Intel Corp. 82557/8/9 [Ethernet Pro 100], 00:E0:18:01:78:6C, IRQ 10. > realtek 8139too with NAPI 8139too Fast Ethernet driver 0.9.27 > realtek 8139too with no NAPI (standard driver with soft polling) > > If this quick test is correct, realtek 8139too's driver works as good as > eepro's driver. > > Octave Those are some nice numbers! I may have to bring some of my $5 realteks out of retirement! Anyone make a 4-port NIC with realteks on it? 
Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From davem@pizda.ninka.net Mon Dec 1 15:50:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 15:51:09 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB1NouTa022835 for ; Mon, 1 Dec 2003 15:50:56 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id PAA20371; Mon, 1 Dec 2003 15:50:05 -0800 Date: Mon, 1 Dec 2003 15:50:05 -0800 From: "David S. Miller" To: Julian Anastasov Cc: herbert@gondor.apana.org.au, netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-Id: <20031201155005.1c515793.davem@redhat.com> In-Reply-To: References: <20031201204700.GA20349@gondor.apana.org.au> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1804 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 2 Dec 2003 01:30:17 +0200 (EET) Julian Anastasov wrote: > I would suspect all rth->fl.fl4_tos checks too. > It seems we need toskeys[2] and a second for loop if tos!=0. > What about rewriting them to (rth->fl.fl4_tos == toskeys[j]). I disagree, and this is related to my most recent email in this thread. This packet we are reacting to for PMTU purposes could only have come from us if the TOS matches precisely. 
From ja@ssi.bg Mon Dec 1 16:04:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 16:04:12 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB203kTa023780 for ; Mon, 1 Dec 2003 16:03:52 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB204ASu006161; Tue, 2 Dec 2003 02:04:12 +0200 Date: Tue, 2 Dec 2003 02:04:10 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: herbert@gondor.apana.org.au, Subject: Re: [ROUTE] PMTU only works on half the time In-Reply-To: <20031201155005.1c515793.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1805 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Mon, 1 Dec 2003, David S. Miller wrote: > On Tue, 2 Dec 2003 01:30:17 +0200 (EET) > Julian Anastasov wrote: > > > I would suspect all rth->fl.fl4_tos checks too. > > It seems we need toskeys[2] and a second for loop if tos!=0. > > What about rewriting them to (rth->fl.fl4_tos == toskeys[j]). > > I disagree, and this is related to my most recent email > in this thread. No, only input routes match strictly tos, ip_rt_redirect() is such example that matches input routes. > This packet we are reacting to for PMTU purposes could only > have come from us if the TOS matches precisely. In this case we can react to packet routed with tos=0. We match output routes only. I do not see another place that needs such fix. Example patch (not tested): # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas:
#	ChangeSet	1.1350 -> 1.1351
#	net/ipv4/route.c	1.73 -> 1.74
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/12/02	ja@ssi.bg	1.1351
# [IPV4]: ip_rt_frag_needed: fl4_tos accepts wildcard value for output routes
# --------------------------------------------
#
diff -Nru a/net/ipv4/route.c b/net/ipv4/route.c
--- a/net/ipv4/route.c	Tue Dec 2 02:00:54 2003
+++ b/net/ipv4/route.c	Tue Dec 2 02:00:54 2003
@@ -1239,19 +1239,21 @@
 
 unsigned short ip_rt_frag_needed(struct iphdr *iph, unsigned short new_mtu)
 {
-	int i;
+	int i, j;
 	unsigned short old_mtu = ntohs(iph->tot_len);
 	struct rtable *rth;
 	u32 skeys[2] = { iph->saddr, 0, };
 	u32 daddr = iph->daddr;
 	u8 tos = iph->tos & IPTOS_RT_MASK;
 	unsigned short est_mtu = 0;
+	u8 toskeys[2] = { tos, 0 };
 
 	if (ipv4_config.no_pmtu_disc)
 		return 0;
 
+	for (j = 0; j < (tos ? 2 : 1); j++)
 	for (i = 0; i < 2; i++) {
-		unsigned hash = rt_hash_code(daddr, skeys[i], tos);
+		unsigned hash = rt_hash_code(daddr, skeys[i], toskeys[j]);
 
 		rcu_read_lock();
 		for (rth = rt_hash_table[hash].chain; rth;
@@ -1261,7 +1263,7 @@
 			    rth->fl.fl4_src == skeys[i] &&
 			    rth->rt_dst == daddr &&
 			    rth->rt_src == iph->saddr &&
-			    rth->fl.fl4_tos == tos &&
+			    rth->fl.fl4_tos == toskeys[j] &&
 			    rth->fl.iif == 0 &&
 			    !(dst_metric_locked(&rth->u.dst, RTAX_MTU))) {
 				unsigned short mtu = new_mtu;

Regards

--
Julian Anastasov

From ja@ssi.bg Mon Dec 1 16:06:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 16:06:50 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB206VTa024182 for ; Mon, 1 Dec 2003 16:06:34 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB2070Su006176; Tue, 2 Dec 2003 02:07:00 +0200 Date: Tue, 2 Dec 2003 02:07:00 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S.
Miller" cc: herbert@gondor.apana.org.au, Subject: Re: [ROUTE] PMTU only works on half the time In-Reply-To: <20031201155005.1c515793.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1806 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Mon, 1 Dec 2003, David S. Miller wrote: > On Tue, 2 Dec 2003 01:30:17 +0200 (EET) > Julian Anastasov wrote: > > > I would suspect all rth->fl.fl4_tos checks too. > > It seems we need toskeys[2] and a second for loop if tos!=0. > > What about rewriting them to (rth->fl.fl4_tos == toskeys[j]). > > I disagree, and this is related to my most recent email > in this thread. > > This packet we are reacting to for PMTU purposes could only > have come from us if the TOS matches precisely. Ops, ip_rt_redirect matches output route to, hm, lets think again about it... Regards -- Julian Anastasov From davem@pizda.ninka.net Mon Dec 1 16:09:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 16:09:15 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2092Ta024543 for ; Mon, 1 Dec 2003 16:09:03 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA20448; Mon, 1 Dec 2003 16:08:12 -0800 Date: Mon, 1 Dec 2003 16:08:11 -0800 From: "David S. 
Miller" To: Julian Anastasov Cc: herbert@gondor.apana.org.au, netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-Id: <20031201160811.70904c29.davem@redhat.com> In-Reply-To: References: <20031201155005.1c515793.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1807 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 2 Dec 2003 02:07:00 +0200 (EET) Julian Anastasov wrote: > > On Mon, 1 Dec 2003, David S. Miller wrote: > > > This packet we are reacting to for PMTU purposes could only > > have come from us if the TOS matches precisely. > > Ops, ip_rt_redirect matches output route to, hm, lets think > again about it... Right. I've rewritten emails in this thread probably 3 or 4 times each before actually sending them out, so don't feel bad since my mistakes have been merely hidden :) From romieu@fr.zoreil.com Mon Dec 1 16:13:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 16:14:09 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB20DsTa024933 for ; Mon, 1 Dec 2003 16:13:55 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id hB206pK7029677; Tue, 2 Dec 2003 01:06:51 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id hB206n9P029676; Tue, 2 Dec 2003 01:06:49 +0100 Date: Tue, 2 Dec 2003 01:06:49 +0100 From: Francois Romieu To: netdev@oss.sgi.com Cc: =?unknown-8bit?Q?Fernando_Alencar_Mar=F3stica?= , Brad House , jgarzik@pobox.com Subject: [PATCH 2.6] 2.6.0-test11 - more rtl8169 Message-ID: <20031202010649.A27879@electric-eye.fr.zoreil.com> References: 
<1070212415.1607.17.camel@oxygenium> <20031201020453.A16405@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="Kj7319i9nmIyA2yE" Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20031201020453.A16405@electric-eye.fr.zoreil.com>; from romieu@fr.zoreil.com on Mon, Dec 01, 2003 at 02:04:53AM +0100 X-Organisation: Land of Sunshine Inc. X-archive-position: 1808 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev

--Kj7319i9nmIyA2yE
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Francois Romieu :
[...]
> Btw, there is a flaw in r8169-dma-api-rx-buffers.patch so don't bother
> testing it. I'll do it again tomorrow.

Attached patch should fix it.

Executive summary:

o Pure 2.6.0-test11:
  Get:
  http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.0-test11/r8169-blob.tar.bz2
  debuntarzipe and apply/compile/test in following order:
  r8169-dma-api-tx.patch
  r8169-dma-api-rx-buffers.patch      <-+ Do not test these two patches
  r8169-dma-api-rx-buffers-ahum.patch <-+ separately
  r8169-start-xmit-fixes.patch
  r8169-dma-api-tx-buffers.patch
  r8169-rx_copybreak.patch
  r8169-mac-phy-version.patch
  r8169-init_one.patch
  r8169-timer.patch
  r8169-hw_start.patch
  r8169-missing-tx-stats.patch
  r8169-intr_mask.patch
  r8169-suspend.patch

  The same directory contains each patch alone as well.

o 2.6.0-test11 + 2.6.0-test9-bk25-netdrvr-exp1
  Get:
  http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.0-test11-netdrv/r8169-blob.tar.bz2
  Same thing as above with:
  r8169-mac-phy-version.patch
  r8169-init_one.patch
  r8169-timer.patch
  r8169-hw_start.patch
  r8169-missing-tx-stats.patch
  r8169-intr_mask.patch
  r8169-suspend.patch
  r8169-dma-api-rx-buffers-ahum.patch

  Applying r8169-dma-api-rx-buffers-ahum.patch before
  r8169-mac-phy-version.patch generates a few offsets but works as well.
--
Ueimor

--Kj7319i9nmIyA2yE
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="r8169-dma-api-rx-buffers-ahum.patch"

Brown paper bag time: the Rx descriptors are contiguous and EORbit only
marks the last descriptor in the array. OWNbit implicitly marks the end
of the Rx descriptors segment which is owned by the nic.

 drivers/net/r8169.c | 20 ++++++--------------
 1 files changed, 6 insertions(+), 14 deletions(-)

diff -puN drivers/net/r8169.c~r8169-dma-api-rx-buffers-ahum drivers/net/r8169.c
--- linux-2.6.0-test11/drivers/net/r8169.c~r8169-dma-api-rx-buffers-ahum	2003-12-02 00:22:41.000000000 +0100
+++ linux-2.6.0-test11-fr/drivers/net/r8169.c	2003-12-02 00:22:41.000000000 +0100
@@ -283,6 +283,8 @@ enum _DescStatusBit {
 	LSbit = 0x10000000,
 };
 
+#define RsvdMask 0x3fffc000
+
 struct TxDesc {
 	u32 status;
 	u32 vlan_tag;
@@ -1121,7 +1123,7 @@ rtl8169_hw_start(struct net_device *dev)
 static inline void rtl8169_make_unusable_by_asic(struct RxDesc *desc)
 {
 	desc->buf_addr = 0xdeadbeef;
-	desc->status = EORbit;
+	desc->status &= ~(OWNbit | RsvdMask);
 }
 
 static void rtl8169_free_rx_skb(struct pci_dev *pdev, struct sk_buff **sk_buff,
@@ -1141,7 +1143,7 @@ static inline void rtl8169_return_to_asi
 static inline void rtl8169_give_to_asic(struct RxDesc *desc, dma_addr_t mapping)
 {
 	desc->buf_addr = mapping;
-	desc->status = OWNbit + RX_BUF_SIZE;
+	desc->status |= OWNbit + RX_BUF_SIZE;
 }
 
 static int rtl8169_alloc_rx_skb(struct pci_dev *pdev, struct net_device *dev,
@@ -1209,11 +1211,6 @@ static inline void rtl8169_mark_as_last_
 	desc->status |= EORbit;
 }
 
-static inline void rtl8169_unmark_as_last_descriptor(struct RxDesc *desc)
-{
-	desc->status &= ~EORbit;
-}
-
 static int rtl8169_init_ring(struct net_device *dev)
 {
 	struct rtl8169_private *tp = dev->priv;
@@ -1460,14 +1457,9 @@ rtl8169_rx_interrupt(struct net_device *
 	}
 
 	delta = rtl8169_rx_fill(tp, dev, tp->dirty_rx, tp->cur_rx);
-	if (delta > 0) {
-		u32 old_last = (tp->dirty_rx - 1) % NUM_RX_DESC;
-
+	if (delta > 0)
 		tp->dirty_rx += delta;
-		rtl8169_mark_as_last_descriptor(tp->RxDescArray +
-			(tp->dirty_rx - 1)%NUM_RX_DESC);
-		rtl8169_unmark_as_last_descriptor(tp->RxDescArray + old_last);
-	} else if (delta < 0)
+	else if (delta < 0)
 		printk(KERN_INFO "%s: no Rx buffer allocated\n", dev->name);
 
 	/*

_

--Kj7319i9nmIyA2yE--

From francois@baligant.net Mon Dec 1 17:32:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 17:32:44 -0800 (PST) Received: from casimir.nikita.cx (casimir.nikita.cx [198.63.211.44]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB21WUTa030061 for ; Mon, 1 Dec 2003 17:32:31 -0800 Received: from fortress (fbaligant.net1.nerim.net [213.41.146.186]) by casimir.nikita.cx (8.12.8/8.12.8) with SMTP id hB21WLh0024881 for ; Mon, 1 Dec 2003 19:32:23 -0600 Message-ID: <072501c3b874$2542ae70$15fea8c0@fortress> From: "Francois Baligant" To: Subject: 2.6.0-test11: dst_cache_overflow causing unresponsive box Date: Tue, 2 Dec 2003 02:32:17 +0100 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_0722_01C3B87C.80FBB1A0" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-Copied-To: backup@casimir.nikita.cx (by Synonym - http://www.modulo.ro/synonym) X-Scanned-By: MIMEDefang 2.37 X-archive-position: 1809 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: francois@baligant.net Precedence: bulk X-list: netdev

This is a multi-part message in MIME format...

------=_NextPart_000_0722_01C3B87C.80FBB1A0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

We have a problem with a box running 2.6.0-test11-mjb1 and supporting
around 90k simultaneous TCP connections.
After a few hours/days of running, when lots of clients
connect/disconnect, the console will start to display:

dst cache overflow
NET: 1860 messages suppressed.
dst cache overflow
NET: 1858 messages suppressed.

From there, the box is completely unresponsive, apparently eating all
its CPU in trying to shrink the routing cache. The only solution is a
reboot.

Current sysctl:

net.ipv4.route.max_size = 655360 # I know we shouldn't raise it that high but it's the only cure for now.. it lasts a bit longer like this
net.ipv4.route.gc_min_interval = 2
net.ipv4.route.gc_interval = 10
net.ipv4.route.gc_timeout = 30

rtstat:

  size IN:hit tot mc no_rt bcast madst masrc OUT:hit tot mc GC:tot ignored goal_miss ovrf HASH:in_search out_search
139566  12393 123  0     0     0     0     0     184  21  0    143     142         0    0          26039        375
138876  13080 136  0     0     0     0     0     159  19  0    155     154         0    0          27153        277
139006  12317 125  0     0     0     0     0     180  28  0    153     153         0    0          25810        377
139138  13799 140  0     0     0     0     0     159  16  0    156     156         0    0          28375        331
139275  11610 128  0     0     0     0     0     177  27  0    154     153         0    0          23977        343
139383  12679 124  0     0     0     0     0     173  17  0    141     140         0    0          26717        398
139256  11946 135  0     0     0     0     0     166  17  0    152     151         0    0          24874        304
139353  11646 109  0     0     0     0     0     174  14  0    122     122         0    0          24165        320
138257  12702 116  0     0     0     0     0     180  16  0    131     130         0    0          26324        358
138369  12897 115  0     0     0     0     0     166  20  0    134     134         0    0          26819        339
138553  11309 133  0     0     0     0     0     158  33  0    165     165         0    0          21270        389
138172  17232 182  0     0     0     0     0     125  44  0    225     225         0    0          29702        375
138420  17407 182  0     0     0     0     0     165  73  0    254     253         0    0          29946        548
138833  17052 257  0     0     0     0     0     195 126  0    382     381         0    0          29715        812
139051  16606 224  0     0     0     0     0     238  97  0    320     319         0    0          28559        721
139217  18115 176  0     0     0     0     0     268  51  0    224     224         0    0          32983        527
139326  17531 178  0     0     0     0     0     291  44  0    220     220         0    0          33320        445
139422  15244 140  0     0     0     0     0     357  20  0    160     160         0    0          29934        415
139548  13123 142  0     0     0     0     0     281  12  0    154     154         0    0          26430        351
139684  13290 142  0     0     0     0     0     235  10  0    152     151         0    0          27341        309

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
142340 142296  99%    0.38K  14234       10     56936K ip_dst_cache

Are we tuning the rt_cache in the wrong way?

regards,
Francois

Francois Baligant - http://www.pingouin.be
Change the numbers, change your Life!

------=_NextPart_000_0722_01C3B87C.80FBB1A0--

From jgarzik@pobox.com Mon Dec 1 19:47:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Dec 2003 19:47:40 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB23lQTa031575 for ; Mon, 1 Dec 2003 19:47:27 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:37579 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1AR1VP-0003kj-8Y; Tue, 02 Dec 2003 03:47:19 +0000 Message-ID: <3FCC0B36.9080901@pobox.com> Date: Mon, 01 Dec 2003 22:47:02 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Ben Greear CC: Octave , netdev@oss.sgi.com Subject: Re: NAPI 8139too.c for 2.4.23 References: <20031201205038.GK10711@ovh.net> <20031201205533.GA15846@gtf.org> <20031201230013.GM4313@ovh.net> <3FCBD120.7070207@candelatech.com> In-Reply-To: <3FCBD120.7070207@candelatech.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1810 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev

Ben Greear wrote:
> Those are some nice numbers! I may have to bring some of my $5 realteks
> out of
> retirement! Anyone make a 4-port NIC with realteks on it?

hah! I hope not :)

(I've never heard of such a beast, but who knows...)
Jeff From ja@ssi.bg Tue Dec 2 00:08:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 00:09:04 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB288fTa007374 for ; Tue, 2 Dec 2003 00:08:47 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB21rrSu006664; Tue, 2 Dec 2003 03:53:55 +0200 Date: Tue, 2 Dec 2003 03:53:53 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: herbert@gondor.apana.org.au, Subject: Re: [ROUTE] PMTU only works on half the time In-Reply-To: <20031201155005.1c515793.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1811 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Mon, 1 Dec 2003, David S. Miller wrote: > I disagree, and this is related to my most recent email > in this thread. > > This packet we are reacting to for PMTU purposes could only > have come from us if the TOS matches precisely. Here is what I have for today. I assume all ip_route_output callers provide valid tos (not a wildcard). As result, only RTO_ONLINK and oif have wildcard value. I'm not sure if ip_rt_frag_needed needs an iif argument, may be yes? Also, it seems ip_rt_redirect needs the 'tos, tos | RTO_ONLINK' array too as in ip_rt_frag_needed. Not included yet. 
Another problem: it seems __ip_route_output_key does not hash
with valid tos key bits, fix included below:

--- net/ipv4/route.c.orig	Tue Dec 2 03:25:59 2003
+++ net/ipv4/route.c	Tue Dec 2 03:37:27 2003
@@ -1239,19 +1239,25 @@
 
 unsigned short ip_rt_frag_needed(struct iphdr *iph, unsigned short new_mtu)
 {
-	int i;
+	int i, j, k;
 	unsigned short old_mtu = ntohs(iph->tot_len);
 	struct rtable *rth;
 	u32 skeys[2] = { iph->saddr, 0, };
 	u32 daddr = iph->daddr;
 	u8 tos = iph->tos & IPTOS_RT_MASK;
 	unsigned short est_mtu = 0;
+	u8 toskeys[2] = { tos, tos | RTO_ONLINK };
+	int iif = 0; // Can be argument
+	int ikeys[2] = { iif, 0 };
 
 	if (ipv4_config.no_pmtu_disc)
 		return 0;
 
+	for (k = 0; k < (iif ? 2 : 1); k++)
+	for (j = 0; j < 2; j++)
 	for (i = 0; i < 2; i++) {
-		unsigned hash = rt_hash_code(daddr, skeys[i], tos);
+		unsigned hash = rt_hash_code(daddr, skeys[i] ^ (ikeys[k] << 5),
+			toskeys[j]);
 
 		rcu_read_lock();
 		for (rth = rt_hash_table[hash].chain; rth;
@@ -1261,7 +1267,8 @@
 			    rth->fl.fl4_src == skeys[i] &&
 			    rth->rt_dst == daddr &&
 			    rth->rt_src == iph->saddr &&
-			    rth->fl.fl4_tos == tos &&
+			    rth->fl.fl4_tos == toskeys[j] &&
+			    rth->fl.oif == ikeys[k] &&
 			    rth->fl.iif == 0 &&
 			    !(dst_metric_locked(&rth->u.dst, RTAX_MTU))) {
 				unsigned short mtu = new_mtu;
@@ -2214,7 +2221,8 @@
 	unsigned hash;
 	struct rtable *rth;
 
-	hash = rt_hash_code(flp->fl4_dst, flp->fl4_src ^ (flp->oif << 5), flp->fl4_tos);
+	hash = rt_hash_code(flp->fl4_dst, flp->fl4_src ^ (flp->oif << 5),
+		flp->fl4_tos & (IPTOS_RT_MASK | RTO_ONLINK));
 
 	rcu_read_lock();
 	for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) {

Regards

--
Julian Anastasov

From herbert@gondor.apana.org.au Tue Dec 2 02:10:39 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 02:11:03 -0800 (PST)
Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2AAbTa013648 for ; Tue, 2 Dec 2003 02:10:39 -0800
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by
arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1AR7UD-0003zX-00; Tue, 02 Dec 2003 21:10:29 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1AR7U9-0006cM-00; Tue, 02 Dec 2003 21:10:25 +1100 Date: Tue, 2 Dec 2003 21:10:25 +1100 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-ID: <20031202101025.GA25422@gondor.apana.org.au> References: <20031201201651.GA20194@gondor.apana.org.au> <20031201204700.GA20349@gondor.apana.org.au> <20031201135154.6906454c.davem@redhat.com> <20031201220509.GA20827@gondor.apana.org.au> <20031201142131.5da50a07.davem@redhat.com> <20031201152215.522c2447.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031201152215.522c2447.davem@redhat.com> User-Agent: Mutt/1.5.4i From: Herbert Xu X-archive-position: 1812 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev On Mon, Dec 01, 2003 at 03:22:15PM -0800, David S. Miller wrote: > > You are speaking of a case of zero source addresses. When would > we output such an iph->saddr, by way of a route? Right now this > is the only part I'm not seeing. You're right. My patch is totally bogus. I misread the ip(8) output. I thought that if src wasn't shown that rt_src must be zero. But in fact it means that rt_src == fl4_src. My problem turns out to be that oif != 0 for the outgoing packets. Since frag_needed only handle cache entries where oif == 0 it never has a chance to work. The application that generated these packets is the RPC code in glibc. Cheers, -- Debian GNU/Linux 3.0 is out! 
( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~}
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

From voloterreno@tin.it Tue Dec 2 02:26:23 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 02:26:45 -0800 (PST)
Received: from vsmtp3.tin.it (vsmtp3.tin.it [212.216.176.223]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2AQLTa014229 for ; Tue, 2 Dec 2003 02:26:22 -0800
Received: from tin.it (80.117.50.2) by vsmtp3.tin.it (7.0.019) (authenticated as voloterreno@tin.it) id 3FCB9BE9000107FA; Mon, 1 Dec 2003 23:27:58 +0100
Message-ID: <3FCBC18A.4000405@tin.it>
Date: Mon, 01 Dec 2003 23:32:42 +0100
From: Marcello
User-Agent: Mozilla/5.0 (X11; U; Linux i686; it-IT; rv:1.5) Gecko/20031031
X-Accept-Language: it, en-us, en
MIME-Version: 1.0
To: Octave
CC: Stephen Hemminger , Jeff Garzik , netdev@oss.sgi.com, linux-net@vger.kernel.org
Subject: Re: NAPI 8139too.c for 2.4.23
References: <20031201205038.GK10711@ovh.net>
In-Reply-To: <20031201205038.GK10711@ovh.net>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 1813
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: voloterreno@tin.it
Precedence: bulk
X-list: netdev

Octave wrote:
>Stephen,
>I get your patch from http://lwn.net/Articles/54815/ for 2.6.X and
>I rewrote it for 2.4.23. Tested with 2.4.23 on high load servers. I
>have no more "Too much work at interrupt".
>
>I dropped it on ftp://ftp.ovh.net/made-in-ovh/8139too.c-2.4-0.9.27
>
>Hope it helps.
>Octave
>
>before:
>-------
># ps auxw
>root 256 0.0 0.0 0 0 ? SW Nov28 0:00 [eth0]
># ifconfig
> RX packets:40940899 errors:250542 dropped:7052 overruns:250542 frame:0
> TX packets:33057049 errors:0 dropped:0 overruns:20 carrier:0
># dmesg
>eth0: Setting 100mbps full-duplex based on auto-negotiated partner ability 41e1.
>nfs: server X.X.X.X not responding, still trying >nfs: server X.X.X.X OK >eth0: Too much work at interrupt, IntrStatus=0x0040. > >with NAPI >--------- > RX packets:428253 errors:0 dropped:0 overruns:0 frame:0 > TX packets:357949 errors:0 dropped:0 overruns:0 carrier:0 >8139too Fast Ethernet driver 0.9.27 >PCI: Found IRQ 11 for device 00:0b.0 >eth0: RealTek RTL8139 at 0xec00, 00:e0:4c:91:03:b0, IRQ 11 >eth0: Identified 8139 chip type 'RTL-8100B/8139D' > >- >To unsubscribe from this list: send the line "unsubscribe linux-net" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html > > > OCTAVE!! Your rewrite of the driver is great!! You have resolved all my problems with ethernet!! (collisions for now , but I think errors too :) ) You are my new personal HERO!! :D Thanks Marcello From davem@pizda.ninka.net Tue Dec 2 02:28:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 02:29:02 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2ASmTa014606 for ; Tue, 2 Dec 2003 02:28:48 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id CAA21879; Tue, 2 Dec 2003 02:27:33 -0800 Date: Tue, 2 Dec 2003 02:27:33 -0800 From: "David S. 
Miller" To: Herbert Xu Cc: netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-Id: <20031202022733.43cf693e.davem@redhat.com> In-Reply-To: <20031202101025.GA25422@gondor.apana.org.au> References: <20031201201651.GA20194@gondor.apana.org.au> <20031201204700.GA20349@gondor.apana.org.au> <20031201135154.6906454c.davem@redhat.com> <20031201220509.GA20827@gondor.apana.org.au> <20031201142131.5da50a07.davem@redhat.com> <20031201152215.522c2447.davem@redhat.com> <20031202101025.GA25422@gondor.apana.org.au> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1814 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 2 Dec 2003 21:10:25 +1100 Herbert Xu wrote: > My problem turns out to be that oif != 0 for the outgoing packets. > Since frag_needed only handle cache entries where oif == 0 it > never has a chance to work. > > The application that generated these packets is the RPC code in glibc. That behavior of glibc is incorrect, I know about it, and I explained all this to Uli Drepper some time ago and he fixed it. Current glibc should not be doing this. If it still is, since Uli understood my arguments, it probably just slipped under the rug. Tell me this so I can take care of it. What the glibc code was doing was mirroring the input packet parameters (saddr/daddr/if/etc.) into what it used for the output packet sends. 
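
[Archive note] The mirroring described above can be sketched in C. This is only an illustration under stated assumptions, not the glibc RPC code: it assumes the reply was built by reusing the IP_PKTINFO control data received with the request, and that a non-zero ipi_ifindex handed back to sendmsg() is what makes the output route carry oif != 0. The helper name reply_pktinfo_unbind_oif is hypothetical. It clears the interface index and leaves ipi_spec_dst alone, so the reply can still leave with the address the request was sent to.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (not taken from glibc or the kernel).
 * Walks the control messages of a msghdr that is about to be reused for
 * a reply and clears ipi_ifindex in every IP_PKTINFO item, so the reply
 * is not tied to an output interface. ipi_spec_dst is left untouched.
 * Returns the number of IP_PKTINFO items that were modified.
 */
static int reply_pktinfo_unbind_oif(struct msghdr *msg)
{
	struct cmsghdr *cmsg;
	int changed = 0;

	assert(msg != NULL);

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
	     cmsg = CMSG_NXTHDR(msg, cmsg)) {
		struct in_pktinfo pkti;

		if (cmsg->cmsg_level != IPPROTO_IP ||
		    cmsg->cmsg_type != IP_PKTINFO)
			continue;

		/* Copy in and out: CMSG_DATA() is not guaranteed to be aligned. */
		memcpy(&pkti, CMSG_DATA(cmsg), sizeof(pkti));
		if (pkti.ipi_ifindex != 0) {
			pkti.ipi_ifindex = 0;
			memcpy(CMSG_DATA(cmsg), &pkti, sizeof(pkti));
			changed++;
		}
	}
	return changed;
}
```

A server that receives with recvmsg() and answers with sendmsg() on the same msghdr would call this helper between the two calls. Whether current glibc does something equivalent is not shown in this thread.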
From herbert@gondor.apana.org.au Tue Dec 2 02:33:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 02:33:50 -0800 (PST) Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2AXUTa015049 for ; Tue, 2 Dec 2003 02:33:34 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1AR7q6-00046n-00; Tue, 02 Dec 2003 21:33:06 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1AR7q4-0006hJ-00; Tue, 02 Dec 2003 21:33:04 +1100 Date: Tue, 2 Dec 2003 21:33:04 +1100 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-ID: <20031202103304.GA25658@gondor.apana.org.au> References: <20031201201651.GA20194@gondor.apana.org.au> <20031201204700.GA20349@gondor.apana.org.au> <20031201135154.6906454c.davem@redhat.com> <20031201220509.GA20827@gondor.apana.org.au> <20031201142131.5da50a07.davem@redhat.com> <20031201152215.522c2447.davem@redhat.com> <20031202101025.GA25422@gondor.apana.org.au> <20031202022733.43cf693e.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031202022733.43cf693e.davem@redhat.com> User-Agent: Mutt/1.5.4i From: Herbert Xu X-archive-position: 1815 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev On Tue, Dec 02, 2003 at 02:27:33AM -0800, David S. Miller wrote: > > That behavior of glibc is incorrect, I know about it, and I explained > all this to Uli Drepper some time ago and he fixed it. Cool. I can confirm that this is definitely fixed with a more recent glibc. -- Debian GNU/Linux 3.0 is out! 
( http://www.debian.org/ ) Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From pp@ee.oulu.fi Tue Dec 2 02:40:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 02:41:06 -0800 (PST) Received: from ee.oulu.fi (ee.oulu.fi [130.231.61.23]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2AeoTa017107 for ; Tue, 2 Dec 2003 02:40:51 -0800 Received: from tk28.oulu.fi (tk28 [130.231.48.68]) by ee.oulu.fi (8.12.10/8.12.10) with ESMTP id hB2AeRsm003917; Tue, 2 Dec 2003 12:40:27 +0200 (EET) Received: (from pp@localhost) by tk28.oulu.fi (8.12.10/8.12.10/Submit) id hB2AePev009193; Tue, 2 Dec 2003 12:40:25 +0200 (EET) Date: Tue, 2 Dec 2003 12:40:25 +0200 From: Pekka Pietikainen To: netdev@oss.sgi.com Cc: marcelo.tosatti@cyclades.com.br Subject: Re: [patch] 2.4 lacks dummy SET_NETDEV_DEV Message-ID: <20031202104024.GA9163@ee.oulu.fi> References: <20031110181917.GA25846@ee.oulu.fi> <20031110221430.GA26556@ee.oulu.fi> <20031110221429.04732a57.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <20031110221429.04732a57.davem@redhat.com> User-Agent: Mutt/1.4.1i X-archive-position: 1816 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pp@ee.oulu.fi Precedence: bulk X-list: netdev On Mon, Nov 10, 2003 at 10:14:29PM -0800, David S. 
Miller wrote:
> On Tue, 11 Nov 2003 00:14:30 +0200
> Pekka Pietikainen wrote:
>
> > (I still like the idea of being able to use exactly the same driver
> > source on 2.4/2.6 though)
>
> I agree, someone should merge in the dummy SET_NETDEV_DEV once
> Marcelo starts up 2.4.24-preX

Ping :-)

--- linux-2.4.22-1.2115.nptl/include/linux/netdevice.h.orig	2003-11-10 20:08:42.922635848 +0200
+++ linux-2.4.22-1.2115.nptl/include/linux/netdevice.h	2003-11-10 20:09:09.754556776 +0200
@@ -456,6 +456,8 @@
 #endif /* CONFIG_NET_DIVERT */
 };
 
+/* 2.6 compatibility */
+#define SET_NETDEV_DEV(net, pdev) do { } while (0)
 
 struct packet_type
 {

--
Pekka Pietikainen

From Robert.Olsson@data.slu.se Tue Dec 2 02:44:40 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 02:44:53 -0800 (PST)
Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2AicTa017488 for ; Tue, 2 Dec 2003 02:44:39 -0800
Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.9.3+/8.9.3) with ESMTP id LAA18731; Tue, 2 Dec 2003 11:44:25 +0100
Received: by robur.slu.se (Postfix, from userid 1000) id 86C41EC23B; Tue, 2 Dec 2003 11:44:31 +0100 (CET)
From: Robert Olsson
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16332.27919.502097.988522@robur.slu.se>
Date: Tue, 2 Dec 2003 11:44:31 +0100
To: "Francois Baligant"
Cc:
Subject: 2.6.0-test11: dst_cache_overflow causing unresponsive box
In-Reply-To: <072501c3b874$2542ae70$15fea8c0@fortress>
References: <072501c3b874$2542ae70$15fea8c0@fortress>
X-Mailer: VM 7.17 under Emacs 21.3.1
X-archive-position: 1817
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Robert.Olsson@data.slu.se
Precedence: bulk
X-list: netdev

Francois Baligant writes:
> We have a problem with a box running 2.6.0-test11-mjb1 and supporting around 90k simultaneous TCP connection.
After a few hours/days of running,
> when a lots of clients connects/disconnects, the console will start to display:
>
> dst cache overflow
> NET: 1860 messages suppressed.
>
> From there, the box is completely unresponsive, apparently eating all its CPU in trying to shrink the routing cache. Only solution is reboot.
> Current sysctl:
> net.ipv4.route.max_size = 655360 # I know we shouldn't rise it that high but it's only cure for now.. it lasts a bit longer like this
>
>   size  IN: hit  tot  mc  no_rt  bcast  madst  masrc  OUT: hit  tot  mc  GC: tot  ignored  goal_miss  ovrf  HASH: in_search  out_search
> 139566    12393  123   0      0      0      0      0       184   21   0      143      142          0     0            26039         375
>
>   OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
> 142340 142296 99%    0.38K 14234       10     56936K ip_dst_cache
>
> Are we tuning the rt_cache in a wrong way ?

No experience with 90k TCP-flows but it seems GC is not able to free some
of the dst-entries for some reason. This will slowly kill your box with the
symptoms you describe. We have to ask TCP-experts for timer settings to avoid
pending sessions etc. Also check slab for any other objects growing, as
dst cache overflow is most likely a secondary effect in your case. rtstat
looks sane except for the high number of dst-entries. Tuning is another
story.

Cheers.

--ro

From herbert@gondor.apana.org.au Tue Dec 2 03:02:46 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 03:03:00 -0800 (PST)
Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2B2bTa018100 for ; Tue, 2 Dec 2003 03:02:42 -0800
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1AR8IT-0004F8-00; Tue, 02 Dec 2003 22:02:25 +1100
Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1AR8IS-0006lP-00; Tue, 02 Dec 2003 22:02:24 +1100
Date: Tue, 2 Dec 2003 22:02:24 +1100
To: "David S.
Miller" , netdev@oss.sgi.com Subject: [RTNETLINK] Provide real oif Message-ID: <20031202110224.GA25957@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="ZGiS0Q5IWpPtfppv" Content-Disposition: inline User-Agent: Mutt/1.5.4i From: Herbert Xu X-archive-position: 1818 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --ZGiS0Q5IWpPtfppv Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi Dave: This patch adds a new attribute RTA_PREFOIF to the RTNETLINK interface so that we can send the real oif to user space. Hmm, would RTA_REALOIF be a better name? It really should be called RTA_OIF but that's already taken. Currently if there are two entries in the routing cache that only differ by the value of oif then they will appear identical to user space. The only way to tell them apart would be to export the value of oif from the kernel. Of course this patch by itself doesn't do anything. But once it is in we can extend ip(8) to display it. PS IPv6 doesn't seem to have an oif field so it doesn't need this. Cheers, -- Debian GNU/Linux 3.0 is out! 
( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~}
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

--ZGiS0Q5IWpPtfppv
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=p

Index: kernel-source-2.5/include/linux/rtnetlink.h
===================================================================
RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.5/include/linux/rtnetlink.h,v
retrieving revision 1.5
diff -u -r1.5 rtnetlink.h
--- kernel-source-2.5/include/linux/rtnetlink.h	18 Oct 2003 07:12:47 -0000	1.5
+++ kernel-source-2.5/include/linux/rtnetlink.h	2 Dec 2003 10:52:34 -0000
@@ -200,6 +200,7 @@
 	RTA_FLOW,
 	RTA_CACHEINFO,
 	RTA_SESSION,
+	RTA_PREFOIF,
 };
 
 #define RTA_MAX RTA_SESSION
Index: kernel-source-2.5/net//ipv4/route.c
===================================================================
RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.5/net/ipv4/route.c,v
retrieving revision 1.3
diff -u -r1.3 route.c
--- kernel-source-2.5/net//ipv4/route.c	24 Nov 2003 09:52:04 -0000	1.3
+++ kernel-source-2.5/net//ipv4/route.c	2 Dec 2003 10:57:12 -0000
@@ -2347,6 +2347,8 @@
 #endif
 			RTA_PUT(skb, RTA_IIF, sizeof(int), &rt->fl.iif);
 	}
+	if (rt->fl.oif)
+		RTA_PUT(skb, RTA_PREFOIF, sizeof(int), &rt->fl.oif);
 
 	nlh->nlmsg_len = skb->tail - b;
 	return skb->len;

--ZGiS0Q5IWpPtfppv--

From davem@pizda.ninka.net Tue Dec 2 03:27:01 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 03:27:16 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2BR0Ta019330 for ; Tue, 2 Dec 2003 03:27:00 -0800
Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id DAA22015; Tue, 2 Dec 2003 03:26:06 -0800
Date: Tue, 2 Dec 2003 03:26:06 -0800
From: "David S.
Miller" To: Robert Olsson Cc: francois@baligant.net, netdev@oss.sgi.com Subject: Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box Message-Id: <20031202032606.28db927b.davem@redhat.com> In-Reply-To: <16332.27919.502097.988522@robur.slu.se> References: <072501c3b874$2542ae70$15fea8c0@fortress> <16332.27919.502097.988522@robur.slu.se> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1819 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 2 Dec 2003 11:44:31 +0100 Robert Olsson wrote: > No experience with 90k TCP-flows but it seems GC is not able to free some > the dst-entries for some reason. This will slowly kill your box with > symptoms you describe. We have ask TCP-experts for timer settings to avoid > pending sessions etc. Also check slab for any other objects growing as > dst cache overflow is most likely secondary effect in your case. rtstat > looks sane expect for the high number of dst-entries. Tuning is another > story. Let us assume, for the sake of back of the envelope calculations, that all 90k TCP connections speak to unique destinations. Let us further assume that all of them have at least one packet in flight. This means the routing cache must be able to hold at least 90k entries. All of these routing cache entires will be referenced by the packets in the TCP retransmission queues of all the sockets, and thus the entries are unreclaimable. You are setting net.ipv4.route.max_size to 655360 which should be more than enough. But you also have to make the net.ipv4.route.gc_thresh more reasonable as well, perhaps 90K as a test. 
If net.ipv4.route.gc_thresh is lower than 90K and my assertions above hold, then the kernel will try to garbage collect too early, all the routing cache entries will be in use and therefore uncollectable, and you'll get the message you're seeing. Try to pump up gc_thresh and see if that helps. From davem@pizda.ninka.net Tue Dec 2 03:29:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 03:29:40 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2BTQTa019699 for ; Tue, 2 Dec 2003 03:29:26 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id DAA22031; Tue, 2 Dec 2003 03:28:25 -0800 Date: Tue, 2 Dec 2003 03:28:25 -0800 From: "David S. Miller" To: Pekka Pietikainen Cc: netdev@oss.sgi.com, marcelo.tosatti@cyclades.com.br Subject: Re: [patch] 2.4 lacks dummy SET_NETDEV_DEV Message-Id: <20031202032825.12ded81f.davem@redhat.com> In-Reply-To: <20031202104024.GA9163@ee.oulu.fi> References: <20031110181917.GA25846@ee.oulu.fi> <20031110221430.GA26556@ee.oulu.fi> <20031110221429.04732a57.davem@redhat.com> <20031202104024.GA9163@ee.oulu.fi> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1820 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 2 Dec 2003 12:40:25 +0200 Pekka Pietikainen wrote: > On Mon, Nov 10, 2003 at 10:14:29PM -0800, David S. Miller wrote: > > I agree, someone should merge in the dummy SET_NETDEV_DEV once > > Marcelo starts up 2.4.24-preX > > Ping :-) Yes Marcelo, please apply this. 
> --- linux-2.4.22-1.2115.nptl/include/linux/netdevice.h.orig	2003-11-10 20:08:42.922635848 +0200
> +++ linux-2.4.22-1.2115.nptl/include/linux/netdevice.h	2003-11-10 20:09:09.754556776 +0200
> @@ -456,6 +456,8 @@
>  #endif /* CONFIG_NET_DIVERT */
>  };
>
> +/* 2.6 compatibility */
> +#define SET_NETDEV_DEV(net, pdev) do { } while (0)
>
>  struct packet_type
>  {

From kuznet@ms2.inr.ac.ru Tue Dec 2 04:40:45 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 04:41:06 -0800 (PST)
Received: from yakov.inr.ac.ru (yakov.inr.ac.ru [193.233.7.111]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2CehTa025892 for ; Tue, 2 Dec 2003 04:40:44 -0800
Received: (from kuznet@localhost) by yakov.inr.ac.ru (8.6.13/ANK) id PAA01536; Tue, 2 Dec 2003 15:40:07 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <200312021240.PAA01536@yakov.inr.ac.ru>
Subject: Re: IPv6 MIB:ipv6PrefixTable implementation
To: mashirle@us.ibm.com (Shirley Ma)
Date: Tue, 2 Dec 2003 15:40:07 +0300 (MSK)
Cc: netdev@oss.sgi.com, xma@us.ibm.com
In-Reply-To: <200311191621.38087.mashirle@us.ibm.com> from "Shirley Ma" at Nov 19, 2003 04:21:38
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 1821
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kuznet@ms2.inr.ac.ru
Precedence: bulk
X-list: netdev

Hello!

> One implementation detail question, do you think I need to save all the other
> Prefix Objects: Type, Origin(addrconf, manually, dhcp, others),

These two things should be stored right now. Existing implementation
is quite a mess, but we definitely want to remember origin of each route
in "protocol" and other flags, they are of common interest.

> AutonomoueFlag, AdvPreferredLiftTime and ValidLifeTime

ValidLifeTime is "expires" on this route.
What's about AdvPreferredLiftTime I am puzzled a little, preferred time is not an attribute of a prefix at all, it is attribute of address, is not it? Unless the prefix is used to install local address it does not make sense, right? Alexey From hadi@cyberus.ca Tue Dec 2 04:47:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 04:47:50 -0800 (PST) Received: from mail.cyberus.ca (mail.cyberus.ca [209.197.145.21]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2ClaTa026369 for ; Tue, 2 Dec 2003 04:47:36 -0800 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1AR9wE-0006Ty-Pg; Tue, 02 Dec 2003 07:47:35 -0500 Subject: Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box From: jamal Reply-To: hadi@cyberus.ca To: "David S. Miller" Cc: Robert Olsson , francois@baligant.net, netdev@oss.sgi.com In-Reply-To: <20031202032606.28db927b.davem@redhat.com> References: <072501c3b874$2542ae70$15fea8c0@fortress> <16332.27919.502097.988522@robur.slu.se> <20031202032606.28db927b.davem@redhat.com> Content-Type: text/plain Organization: jamalopolis Message-Id: <1070369223.1027.34.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 02 Dec 2003 07:47:03 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 1822 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev With that many flows the neighbor cache gc may also be killing him. 
cheers, jamal From Helmut.BA.Berger@partner.bmw.de Tue Dec 2 08:11:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 08:11:34 -0800 (PST) Received: from mailgw2.bmwgroup.com (mailgw2.bmwgroup.com [192.109.190.191]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2GBGTa011959 for ; Tue, 2 Dec 2003 08:11:19 -0800 Received: from mhub1.muc (mhub1.muc [160.50.97.116]) by mailgw2.bmwgroup.com with ESMTP for netdev@oss.sgi.com; Tue, 2 Dec 2003 17:11:11 +0100 Received: from mail03.muc (mail03.muc [160.50.97.30]) by mhub1.muc with ESMTP for netdev@oss.sgi.com; Tue, 2 Dec 2003 17:11:10 +0100 Received: from ztpc451-L.muc ([10.249.235.55] (may be forged)) by mail03.muc (8.8.6 (PHNE_17190+no byaddr 2)/8.8.6) with ESMTP id RAA21375 for ; Tue, 2 Dec 2003 17:11:10 +0100 (MET) Subject: Inserting a layer between TCP and IP From: Helmut Berger To: netdev@oss.sgi.com Message-Id: <1070381465.1793.16.camel@ztpc451-L.MUC> X-Mailer: Ximian Evolution 1.4.4 Date: Tue, 02 Dec 2003 17:11:05 +0100 MIME-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 7bit X-archive-position: 1823 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Helmut.BA.Berger@partner.bmw.de Precedence: bulk X-list: netdev Hi, I am not sure if this is the right platform for my question. I am not very familiar with linux driver development. If I am wrong here, please be so kind and tell me, where I should better go to. I want to insert an additional layer between the tcp and ip layer with some special functionality and I am not sure how to do this. I fear it cannot be done with system_calls, can it? That would be very comfortable. I could write a loadable kernel module which registers itself on insert and deregisters itself on remove. Is it possible to insert my special layer at runtime without kernel changes? Or do I have to build a special kernel? 
Which ist the best way to insert a layer between the existing tcp and ip layer? Thanx, Helmut From Robert.Olsson@data.slu.se Tue Dec 2 09:56:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 09:56:56 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2HudTa019684 for ; Tue, 2 Dec 2003 09:56:40 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.9.3+/8.9.3) with ESMTP id SAA24729; Tue, 2 Dec 2003 18:56:21 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 53C6DEC23B; Tue, 2 Dec 2003 18:56:27 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16332.53835.297235.589786@robur.slu.se> Date: Tue, 2 Dec 2003 18:56:27 +0100 To: "David S. Miller" Cc: Robert Olsson , francois@baligant.net, netdev@oss.sgi.com Subject: Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box In-Reply-To: <20031202032606.28db927b.davem@redhat.com> References: <072501c3b874$2542ae70$15fea8c0@fortress> <16332.27919.502097.988522@robur.slu.se> <20031202032606.28db927b.davem@redhat.com> X-Mailer: VM 7.17 under Emacs 21.3.1 X-archive-position: 1824 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev David S. Miller writes: > Let us assume, for the sake of back of the envelope calculations, that > all 90k TCP connections speak to unique destinations. Let us further > assume that all of them have at least one packet in flight. > > This means the routing cache must be able to hold at least 90k entries. > All of these routing cache entires will be referenced by the packets > in the TCP retransmission queues of all the sockets, and thus the > entries are unreclaimable. > > You are setting net.ipv4.route.max_size to 655360 which should be more > than enough. 
But you also have to make the net.ipv4.route.gc_thresh > more reasonable as well, perhaps 90K as a test. > > If net.ipv4.route.gc_thresh is lower than 90K and my assertions above > hold, then the kernel will try to garbage collect too early, all the > routing cache entries will be in use and therefore uncollectable, > and you'll get the message you're seeing. > > Try to pump up gc_thresh and see if that helps. Yes better tuning as gc_thresh and max_size is in better balance but max_size is same so I'll guess we collect unreclaimable entries util we see dst overflow still'. The long time before overflow is suspect "hours to days" We have to ask if this has ever worked before? I'll guess number of hash buckets should be increased for systems like this. Cheers. --ro From hadi@cyberus.ca Tue Dec 2 11:29:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 11:29:48 -0800 (PST) Received: from mail.cyberus.ca (mail.cyberus.ca [209.197.145.21]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2JTYTa026404 for ; Tue, 2 Dec 2003 11:29:35 -0800 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1AR9hc-0005LY-7N; Tue, 02 Dec 2003 07:32:28 -0500 Subject: Re: [RTNETLINK] Provide real oif From: jamal Reply-To: hadi@cyberus.ca To: Herbert Xu Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20031202110224.GA25957@gondor.apana.org.au> References: <20031202110224.GA25957@gondor.apana.org.au> Content-Type: text/plain Organization: jamalopolis Message-Id: <1070368316.1031.30.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 02 Dec 2003 07:31:57 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 1825 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2003-12-02 at 06:02, Herbert Xu wrote: > Hi Dave: > > This patch adds a new attribute RTA_PREFOIF to the RTNETLINK interface > so that we can send the real oif to user space. Hmm, would RTA_REALOIF > be a better name? It really should be called RTA_OIF but that's already > taken. > > Currently if there are two entries in the routing cache that only differ > by the value of oif then they will appear identical to user space. The > only way to tell them apart would be to export the value of oif from > the kernel. > Can you provide a real example (output of route display in user space) where this would be valuable? whats wrong with the combo of RTA_OIF and RTA_SRC? 
cheers, jamal From herbert@gondor.apana.org.au Tue Dec 2 12:54:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 12:54:57 -0800 (PST) Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2KsUTa032602 for ; Tue, 2 Dec 2003 12:54:34 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1ARHXJ-0007RO-00; Wed, 03 Dec 2003 07:54:21 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1ARHXF-0007cM-00; Wed, 03 Dec 2003 07:54:17 +1100 From: Herbert Xu To: hadi@cyberus.ca, netdev@oss.sgi.com Subject: Re: [RTNETLINK] Provide real oif Organization: Core In-Reply-To: <1070368316.1031.30.camel@jzny.localdomain> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.2-20031002 ("Berneray") (UNIX) (Linux/2.4.22-1-686-smp (i686)) Message-Id: Date: Wed, 03 Dec 2003 07:54:17 +1100 X-archive-position: 1826 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev jamal wrote: > > Can you provide a real example (output of route display in user space) > where this would be valuable? Without the real oif you will see entries like this in the output of ip r l c: 192.168.0.7 dev eth0 src 192.168.0.6 cache mtu 1500 advmss 1460 metric10 64 192.168.0.7 dev eth0 src 192.168.0.6 cache mtu 1500 advmss 1460 metric10 64 One of those has oif == 0 while the other one has oif == eth0. To generate an entry with oif == eth0, just send a UDP packet from a socket bound to eth0. > What's wrong with the combo of RTA_OIF and RTA_SRC? Both of those attributes are independent of the real oif. Cheers, -- Debian GNU/Linux 3.0 is out! 
( http://www.debian.org/ ) Email: Herbert Xu 许志壬 Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From pavlin@possum.icir.org Tue Dec 2 13:19:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 13:19:51 -0800 (PST) Received: from possum.icir.org (possum.icir.org [192.150.187.67]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2LJaTa001015 for ; Tue, 2 Dec 2003 13:19:37 -0800 Received: from possum.icir.org (localhost [127.0.0.1]) by possum.icir.org (8.12.9p1/8.12.3) with ESMTP id hB2LJaSx017575; Tue, 2 Dec 2003 13:19:36 -0800 (PST) (envelope-from pavlin@possum.icir.org) Message-Id: <200312022119.hB2LJaSx017575@possum.icir.org> To: netdev@oss.sgi.com Cc: pavlin@icir.org Subject: A request to add RTPROT_XORP to linux/rtnetlink.h Date: Tue, 02 Dec 2003 13:19:36 -0800 From: Pavlin Radoslavov X-archive-position: 1827 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pavlin@icir.org Precedence: bulk X-list: netdev [If this is not the right mailing list for such a request, please let me know where I should send it.] On behalf of the XORP (eXtensible Open Router Platform) project (http://www.xorp.org), I'd like to ask whether a XORP-specific protocol value can be included in linux/rtnetlink.h (similar to the values already assigned to other routing daemons/suites such as GateD, etc.) Below I am including the simple one-line patch against linux-2.6.0-test11. Any chance this can be added to 2.6.0 before its release? 
:) Thanks, Pavlin --- linux-2.6.0-test11/include/linux/rtnetlink.h.org Wed Nov 26 12:45:11 2003 +++ linux-2.6.0-test11/include/linux/rtnetlink.h Tue Dec 2 12:53:18 2003 @@ -138,6 +138,7 @@ #define RTPROT_ZEBRA 11 /* Zebra */ #define RTPROT_BIRD 12 /* BIRD */ #define RTPROT_DNROUTED 13 /* DECnet routing daemon */ +#define RTPROT_XORP 14 /* XORP */ /* rtm_scope From shemminger@osdl.org Tue Dec 2 14:01:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:01:49 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2M1ZTa001791 for ; Tue, 2 Dec 2003 14:01:35 -0800 Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB2M0XZ01347; Tue, 2 Dec 2003 14:00:33 -0800 Date: Tue, 2 Dec 2003 14:01:15 -0800 From: Stephen Hemminger To: Krzysztof Halas , Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] (4//8) dscc4 convert to new hdlc_device Message-Id: <20031202140115.239f998f.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1828 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1493 -> 1.1494 # drivers/net/wan/dscc4.c 1.51 -> 1.52 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/11/26 shemminger@osdl.org 1.1494 # Convert to work with hdlc_device that is not embedded. # Use list macros for list of ports on card. # -------------------------------------------- # diff -Nru a/drivers/net/wan/dscc4.c b/drivers/net/wan/dscc4.c --- a/drivers/net/wan/dscc4.c Wed Nov 26 12:30:13 2003 +++ b/drivers/net/wan/dscc4.c Wed Nov 26 12:30:13 2003 @@ -185,12 +185,14 @@ spinlock_t lock; struct pci_dev *pdev; - struct dscc4_dev_priv *root; + struct list_head devs; dma_addr_t iqcfg_dma; u32 xtal_hz; }; struct dscc4_dev_priv { + struct list_head dev_list; + struct sk_buff *rx_skbuff[RX_RING_SIZE]; struct sk_buff *tx_skbuff[TX_RING_SIZE]; @@ -228,7 +230,7 @@ unsigned short encoding; unsigned short parity; - hdlc_device hdlc; + hdlc_device *hdlc; sync_serial_settings settings; u32 __pad __attribute__ ((aligned (4))); }; @@ -371,9 +373,17 @@ static int dscc4_tx_poll(struct dscc4_dev_priv *, struct net_device *); #endif +static inline struct dscc4_dev_priv *dscc4_root_priv(struct dscc4_pci_priv *ppriv) +{ + struct list_head *first = ppriv->devs.next; + + return (first == &ppriv->devs) ? 
NULL + : list_entry(first, struct dscc4_dev_priv, dev_list); +} + static inline struct dscc4_dev_priv *dscc4_priv(struct net_device *dev) { - return list_entry(dev, struct dscc4_dev_priv, hdlc.netdev); + return dev_to_hdlc(dev)->dev_data; } static void scc_patchl(u32 mask, u32 value, struct dscc4_dev_priv *dpriv, @@ -636,7 +646,7 @@ struct net_device *dev) { struct RxFD *rx_fd = dpriv->rx_fd + dpriv->rx_current%RX_RING_SIZE; - struct net_device_stats *stats = &dpriv->hdlc.stats; + struct net_device_stats *stats = &dpriv->hdlc->stats; struct pci_dev *pdev = dpriv->pci_priv->pdev; struct sk_buff *skb; int pkt_len; @@ -681,19 +691,16 @@ static void dscc4_free1(struct pci_dev *pdev) { - struct dscc4_pci_priv *ppriv; - struct dscc4_dev_priv *root; - int i; + struct dscc4_pci_priv *ppriv = pci_get_drvdata(pdev); + struct dscc4_dev_priv *dpriv, *dnext; - ppriv = pci_get_drvdata(pdev); - root = ppriv->root; - - for (i = 0; i < dev_per_card; i++) - unregister_hdlc_device(&root[i].hdlc); + list_for_each_entry_safe(dpriv, dnext, &ppriv->devs, dev_list) { + unregister_hdlc_device(dpriv->hdlc); + free_hdlc_device(dpriv->hdlc); + } pci_set_drvdata(pdev, NULL); - kfree(root); kfree(ppriv); } @@ -704,7 +711,6 @@ struct dscc4_dev_priv *dpriv; static int cards_found = 0; unsigned long ioaddr; - int i; printk(KERN_DEBUG "%s", version); @@ -743,7 +749,8 @@ priv = (struct dscc4_pci_priv *)pci_get_drvdata(pdev); - if (request_irq(pdev->irq, &dscc4_irq, SA_SHIRQ, DRV_NAME, priv->root)){ + if (request_irq(pdev->irq, &dscc4_irq, SA_SHIRQ, DRV_NAME, + dscc4_root_priv(priv))) { printk(KERN_WARNING "%s: IRQ %d busy\n", DRV_NAME, pdev->irq); goto err_out_free1; } @@ -772,21 +779,19 @@ * SCC 0-3 private rx/tx irq structures * IQRX/TXi needs to be set soon. Learned it the hard way... 
*/ - for (i = 0; i < dev_per_card; i++) { - dpriv = priv->root + i; + list_for_each_entry(dpriv, &priv->devs, dev_list) { dpriv->iqtx = (u32 *) pci_alloc_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), &dpriv->iqtx_dma); if (!dpriv->iqtx) goto err_out_free_iqtx; - writel(dpriv->iqtx_dma, ioaddr + IQTX0 + i*4); + writel(dpriv->iqtx_dma, ioaddr + IQTX0 + dpriv->dev_id*4); } - for (i = 0; i < dev_per_card; i++) { - dpriv = priv->root + i; + list_for_each_entry(dpriv, &priv->devs, dev_list) { dpriv->iqrx = (u32 *) pci_alloc_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), &dpriv->iqrx_dma); if (!dpriv->iqrx) goto err_out_free_iqrx; - writel(dpriv->iqrx_dma, ioaddr + IQRX0 + i*4); + writel(dpriv->iqrx_dma, ioaddr + IQRX0 + dpriv->dev_id*4); } /* Cf application hint. Beware of hard-lock condition on threshold. */ @@ -804,22 +809,19 @@ return 0; err_out_free_iqrx: - while (--i >= 0) { - dpriv = priv->root + i; + list_for_each_entry(dpriv, &priv->devs, dev_list) { pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), dpriv->iqrx, dpriv->iqrx_dma); } - i = dev_per_card; err_out_free_iqtx: - while (--i >= 0) { - dpriv = priv->root + i; + list_for_each_entry(dpriv, &priv->devs, dev_list) { pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), dpriv->iqtx, dpriv->iqtx_dma); } pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), priv->iqcfg, priv->iqcfg_dma); err_out_free_irq: - free_irq(pdev->irq, priv->root); + free_irq(pdev->irq, dscc4_root_priv(priv)); err_out_free1: dscc4_free1(pdev); err_out_iounmap: @@ -863,29 +865,27 @@ static int dscc4_found1(struct pci_dev *pdev, unsigned long ioaddr) { struct dscc4_pci_priv *ppriv; - struct dscc4_dev_priv *root; + struct dscc4_dev_priv *dpriv, *dnext; int i, ret = -ENOMEM; - root = (struct dscc4_dev_priv *) - kmalloc(dev_per_card*sizeof(*root), GFP_KERNEL); - if (!root) { - printk(KERN_ERR "%s: can't allocate data\n", DRV_NAME); - goto err_out; - } - memset(root, 0, dev_per_card*sizeof(*root)); - ppriv = (struct dscc4_pci_priv *) 
kmalloc(sizeof(*ppriv), GFP_KERNEL); if (!ppriv) { printk(KERN_ERR "%s: can't allocate private data\n", DRV_NAME); - goto err_free_dev; + goto err_out; } memset(ppriv, 0, sizeof(struct dscc4_pci_priv)); + INIT_LIST_HEAD(&ppriv->devs); for (i = 0; i < dev_per_card; i++) { - struct dscc4_dev_priv *dpriv = root + i; - hdlc_device *hdlc = &dpriv->hdlc; - struct net_device *d = hdlc_to_dev(hdlc); + hdlc_device *hdlc = alloc_hdlc_device(sizeof(*dpriv)); + struct net_device *d; + + if (!hdlc) { + ret = -ENOMEM; + goto err_unregister; + } + d = hdlc_to_dev(hdlc); d->base_addr = ioaddr; d->init = NULL; d->irq = pdev->irq; @@ -898,6 +898,8 @@ SET_MODULE_OWNER(d); SET_NETDEV_DEV(d, &pdev->dev); + dpriv = dscc4_priv(d); + dpriv->hdlc = hdlc; dpriv->dev_id = i; dpriv->pci_priv = ppriv; spin_lock_init(&dpriv->lock); @@ -920,23 +922,25 @@ unregister_hdlc_device(hdlc); goto err_unregister; } + list_add_tail(&dpriv->dev_list, &ppriv->devs); } - ret = dscc4_set_quartz(root, quartz); + + + ret = dscc4_set_quartz(dscc4_root_priv(ppriv), quartz); if (ret < 0) goto err_unregister; - ppriv->root = root; + spin_lock_init(&ppriv->lock); pci_set_drvdata(pdev, ppriv); return ret; err_unregister: - while (--i >= 0) { - dscc4_release_ring(root + i); - unregister_hdlc_device(&root[i].hdlc); + list_for_each_entry_safe(dpriv, dnext, &ppriv->devs, dev_list) { + dscc4_release_ring(dpriv); + unregister_hdlc_device(dpriv->hdlc); + free_hdlc_device(dpriv->hdlc); } kfree(ppriv); -err_free_dev: - kfree(root); err_out: return ret; }; @@ -964,7 +968,7 @@ sync_serial_settings *settings = &dpriv->settings; if (settings->loopback && (settings->clock_type != CLOCK_INT)) { - struct net_device *dev = hdlc_to_dev(&dpriv->hdlc); + struct net_device *dev = hdlc_to_dev(dpriv->hdlc); printk(KERN_INFO "%s: loopback requires clock\n", dev->name); return -1; @@ -1015,7 +1019,7 @@ static int dscc4_open(struct net_device *dev) { struct dscc4_dev_priv *dpriv = dscc4_priv(dev); - hdlc_device *hdlc = &dpriv->hdlc; + 
hdlc_device *hdlc = dpriv->hdlc; struct dscc4_pci_priv *ppriv; int ret = -EAGAIN; @@ -1467,7 +1471,7 @@ int i, handled = 1; priv = root->pci_priv; - dev = hdlc_to_dev(&root->hdlc); + dev = hdlc_to_dev(root->hdlc); spin_lock_irqsave(&priv->lock, flags); @@ -1518,7 +1522,7 @@ static inline void dscc4_tx_irq(struct dscc4_pci_priv *ppriv, struct dscc4_dev_priv *dpriv) { - struct net_device *dev = hdlc_to_dev(&dpriv->hdlc); + struct net_device *dev = hdlc_to_dev(dpriv->hdlc); u32 state; int cur, loop = 0; @@ -1549,7 +1553,7 @@ if (state & SccEvt) { if (state & Alls) { - struct net_device_stats *stats = &dpriv->hdlc.stats; + struct net_device_stats *stats = &dpriv->hdlc->stats; struct sk_buff *skb; struct TxFD *tx_fd; @@ -1687,7 +1691,7 @@ static inline void dscc4_rx_irq(struct dscc4_pci_priv *priv, struct dscc4_dev_priv *dpriv) { - struct net_device *dev = hdlc_to_dev(&dpriv->hdlc); + struct net_device *dev = hdlc_to_dev(dpriv->hdlc); u32 state; int cur; @@ -1954,23 +1958,21 @@ static void __devexit dscc4_remove_one(struct pci_dev *pdev) { struct dscc4_pci_priv *ppriv; - struct dscc4_dev_priv *root; + struct dscc4_dev_priv *root, *dpriv; unsigned long ioaddr; - int i; ppriv = pci_get_drvdata(pdev); - root = ppriv->root; + root = dscc4_root_priv(ppriv); - ioaddr = hdlc_to_dev(&root->hdlc)->base_addr; + ioaddr = hdlc_to_dev(root->hdlc)->base_addr; dscc4_pci_reset(pdev, ioaddr); free_irq(pdev->irq, root); pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), ppriv->iqcfg, ppriv->iqcfg_dma); - for (i = 0; i < dev_per_card; i++) { - struct dscc4_dev_priv *dpriv = root + i; + list_for_each_entry(dpriv, &ppriv->devs, dev_list) { dscc4_release_ring(dpriv); pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), dpriv->iqrx, dpriv->iqrx_dma); From shemminger@osdl.org Tue Dec 2 14:01:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:01:49 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id 
hB2M1XTa001790 for ; Tue, 2 Dec 2003 14:01:34 -0800 Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB2M0KZ01320; Tue, 2 Dec 2003 14:00:21 -0800 Date: Tue, 2 Dec 2003 14:01:03 -0800 From: Stephen Hemminger To: Jeff Garzik , Krzysztof Halas Cc: netdev@oss.sgi.com Subject: [PATCH] (1/8) hdlc wan device disembedding Message-Id: <20031202140103.6bb2deb4.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1829 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Change the hdlc wan devices so that the net_device structure is no longer embedded inside the hdlc_device structure. Embedding won't work on 2.6, where the net_device structure may need to outlive module unload because of sysfs. Instead, use alloc_netdev with a setup function so that netdev->priv = hdlc, and have hdlc->dev_data point to the device private data. The patch is against the net-drivers-2.5-exp tree. This portion breaks the actual hardware drivers, but that is fixed in parts 2-5. I don't have any of this hardware, so the code has only been tested by compiling and loading the modules. # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1502 -> 1.1503 # drivers/net/wan/hdlc_generic.c 1.14 -> 1.15 # drivers/net/wan/hdlc_cisco.c 1.9 -> 1.10 # drivers/net/wan/hdlc_fr.c 1.9 -> 1.10 # include/linux/hdlc.h 1.10 -> 1.11 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/12/01 shemminger@osdl.org 1.1503 # 1-hdlc_device # -------------------------------------------- # diff -Nru a/drivers/net/wan/hdlc_cisco.c b/drivers/net/wan/hdlc_cisco.c --- a/drivers/net/wan/hdlc_cisco.c Mon Dec 1 14:45:08 2003 +++ b/drivers/net/wan/hdlc_cisco.c Mon Dec 1 14:45:08 2003 @@ -222,8 +222,8 @@ hdlc->state.cisco.settings.timeout * HZ) { hdlc->state.cisco.up = 0; printk(KERN_INFO "%s: Link down\n", hdlc_to_name(hdlc)); - if (netif_carrier_ok(&hdlc->netdev)) - netif_carrier_off(&hdlc->netdev); + if (netif_carrier_ok(hdlc->netdev)) + netif_carrier_off(hdlc->netdev); } cisco_keepalive_send(hdlc, CISCO_KEEPALIVE_REQ, @@ -256,8 +256,8 @@ static void cisco_stop(hdlc_device *hdlc) { del_timer_sync(&hdlc->state.cisco.timer); - if (netif_carrier_ok(&hdlc->netdev)) - netif_carrier_off(&hdlc->netdev); + if (netif_carrier_ok(hdlc->netdev)) + netif_carrier_off(hdlc->netdev); } diff -Nru a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c --- a/drivers/net/wan/hdlc_fr.c Mon Dec 1 14:45:08 2003 +++ b/drivers/net/wan/hdlc_fr.c Mon Dec 1 14:45:08 2003 @@ -543,8 +543,8 @@ hdlc->state.fr.reliable = reliable; if (reliable) { - if (!netif_carrier_ok(&hdlc->netdev)) - netif_carrier_on(&hdlc->netdev); + if (!netif_carrier_ok(hdlc->netdev)) + netif_carrier_on(hdlc->netdev); hdlc->state.fr.n391cnt = 0; /* Request full status */ hdlc->state.fr.dce_changed = 1; @@ -558,8 +558,8 @@ } } } else { - if (netif_carrier_ok(&hdlc->netdev)) - netif_carrier_off(&hdlc->netdev); + if (netif_carrier_ok(hdlc->netdev)) + netif_carrier_off(hdlc->netdev); while (pvc) { /* Deactivate all PVCs */ pvc_carrier(0, pvc); @@ -938,8 +938,8 @@ printk(KERN_DEBUG 
"fr_start\n"); #endif if (hdlc->state.fr.settings.lmi != LMI_NONE) { - if (netif_carrier_ok(&hdlc->netdev)) - netif_carrier_off(&hdlc->netdev); + if (netif_carrier_ok(hdlc->netdev)) + netif_carrier_off(hdlc->netdev); hdlc->state.fr.last_poll = 0; hdlc->state.fr.reliable = 0; hdlc->state.fr.dce_changed = 1; diff -Nru a/drivers/net/wan/hdlc_generic.c b/drivers/net/wan/hdlc_generic.c --- a/drivers/net/wan/hdlc_generic.c Mon Dec 1 14:45:08 2003 +++ b/drivers/net/wan/hdlc_generic.c Mon Dec 1 14:45:08 2003 @@ -71,6 +71,7 @@ void hdlc_set_carrier(int on, hdlc_device *hdlc) { + struct net_device *dev = hdlc_to_dev(hdlc); on = on ? 1 : 0; #ifdef DEBUG_LINK @@ -92,14 +93,14 @@ if (hdlc->carrier) { if (hdlc->proto.start) hdlc->proto.start(hdlc); - else if (!netif_carrier_ok(&hdlc->netdev)) - netif_carrier_on(&hdlc->netdev); + else if (!netif_carrier_ok(dev)) + netif_carrier_on(dev); } else { /* no carrier */ if (hdlc->proto.stop) hdlc->proto.stop(hdlc); - else if (netif_carrier_ok(&hdlc->netdev)) - netif_carrier_off(&hdlc->netdev); + else if (netif_carrier_ok(dev)) + netif_carrier_off(dev); } carrier_exit: @@ -110,6 +111,8 @@ /* Must be called by hardware driver when HDLC device is being opened */ int hdlc_open(hdlc_device *hdlc) { + struct net_device *dev = hdlc_to_dev(hdlc); + #ifdef DEBUG_LINK printk(KERN_DEBUG "hdlc_open carrier %i open %i\n", hdlc->carrier, hdlc->open); @@ -129,11 +132,11 @@ if (hdlc->carrier) { if (hdlc->proto.start) hdlc->proto.start(hdlc); - else if (!netif_carrier_ok(&hdlc->netdev)) - netif_carrier_on(&hdlc->netdev); + else if (!netif_carrier_ok(dev)) + netif_carrier_on(dev); - } else if (netif_carrier_ok(&hdlc->netdev)) - netif_carrier_off(&hdlc->netdev); + } else if (netif_carrier_ok(dev)) + netif_carrier_off(dev); hdlc->open = 1; @@ -224,11 +227,9 @@ } - -int register_hdlc_device(hdlc_device *hdlc) +static void hdlc_setup(struct net_device *dev) { - int result; - struct net_device *dev = hdlc_to_dev(hdlc); + hdlc_device *hdlc = dev->priv; 
dev->get_stats = hdlc_get_stats; dev->change_mtu = hdlc_change_mtu; @@ -245,17 +246,13 @@ hdlc->open = 0; spin_lock_init(&hdlc->state_lock); - result = dev_alloc_name(dev, "hdlc%d"); - if (result < 0) - return result; - - result = register_netdev(dev); - if (result != 0) - return -EIO; - - return 0; + hdlc->dev_data = (hdlc + 1); } +int register_hdlc_device(hdlc_device *hdlc) +{ + return register_netdev(hdlc_to_dev(hdlc)); +} void unregister_hdlc_device(hdlc_device *hdlc) @@ -266,7 +263,17 @@ rtnl_unlock(); } +hdlc_device *alloc_hdlc_device(int sizeof_priv) +{ + struct net_device *dev = alloc_netdev(sizeof_priv + sizeof(hdlc_device), + "hdlc%d", hdlc_setup); + return dev ? dev_to_hdlc(dev) : NULL; +} +void free_hdlc_device(hdlc_device *hdlc) +{ + free_netdev(hdlc_to_dev(hdlc)); +} MODULE_AUTHOR("Krzysztof Halasa "); MODULE_DESCRIPTION("HDLC support module"); @@ -278,6 +285,8 @@ EXPORT_SYMBOL(hdlc_ioctl); EXPORT_SYMBOL(register_hdlc_device); EXPORT_SYMBOL(unregister_hdlc_device); +EXPORT_SYMBOL(alloc_hdlc_device); +EXPORT_SYMBOL(free_hdlc_device); static struct packet_type hdlc_packet_type = { .type = __constant_htons(ETH_P_HDLC), diff -Nru a/include/linux/hdlc.h b/include/linux/hdlc.h --- a/include/linux/hdlc.h Mon Dec 1 14:45:08 2003 +++ b/include/linux/hdlc.h Mon Dec 1 14:45:08 2003 @@ -96,7 +96,9 @@ typedef struct hdlc_device_struct { /* To be initialized by hardware driver */ - struct net_device netdev; /* master net device - must be first */ + struct net_device *netdev; /* master net device */ + void *dev_data; + struct net_device_stats stats; /* used by HDLC layer to take control over HDLC device from hw driver*/ @@ -180,6 +182,8 @@ /* Exported from hdlc.o */ +hdlc_device *alloc_hdlc_device(int sizeof_priv); +void free_hdlc_device(hdlc_device *hdlc); /* Called by hardware driver when a user requests HDLC service */ int hdlc_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd); @@ -188,18 +192,15 @@ int register_hdlc_device(hdlc_device *hdlc); void 
unregister_hdlc_device(hdlc_device *hdlc); - static __inline__ struct net_device* hdlc_to_dev(hdlc_device *hdlc) { - return &hdlc->netdev; + return hdlc->netdev; } - -static __inline__ hdlc_device* dev_to_hdlc(struct net_device *dev) +static __inline__ hdlc_device *dev_to_hdlc(struct net_device *dev) { - return (hdlc_device*)dev; + return dev->priv; } - static __inline__ pvc_device* dev_to_pvc(struct net_device *dev) { From shemminger@osdl.org Tue Dec 2 14:02:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:02:43 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2M2TTa002051 for ; Tue, 2 Dec 2003 14:02:29 -0800 Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB2M1YZ01464; Tue, 2 Dec 2003 14:01:34 -0800 Date: Tue, 2 Dec 2003 14:02:17 -0800 From: Stephen Hemminger To: Krzysztof Halas , Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] (5/8) farsync - hdlc_device conversion Message-Id: <20031202140217.776fdff6.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1830 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1494 -> 1.1495 # drivers/net/wan/farsync.c 1.18 -> 1.19 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/11/26 shemminger@osdl.org 1.1495 # Convert from embedding hdlc_device (with embedded net_device) # to hdlc_device pointer. # -------------------------------------------- # diff -Nru a/drivers/net/wan/farsync.c b/drivers/net/wan/farsync.c --- a/drivers/net/wan/farsync.c Wed Nov 26 12:31:27 2003 +++ b/drivers/net/wan/farsync.c Wed Nov 26 12:31:28 2003 @@ -328,7 +328,7 @@ /* Per port (line or channel) information */ struct fst_port_info { - hdlc_device hdlc; /* HDLC device struct - must be first */ + hdlc_device *hdlc; /* HDLC device struct */ struct fst_card_info *card; /* Card we're associated with */ int index; /* Port index on the card */ int hwif; /* Line hardware (lineInterface copy) */ @@ -353,13 +353,16 @@ spinlock_t card_lock; /* Lock for SMP access */ unsigned short pci_conf; /* PCI card config in I/O space */ /* Per port info */ - struct fst_port_info ports[ FST_MAX_PORTS ]; + struct fst_port_info *ports[ FST_MAX_PORTS ]; }; /* Convert an HDLC device pointer into a port info pointer and similar */ -#define hdlc_to_port(H) ((struct fst_port_info *)(H)) +static __inline__ struct fst_port_info *hdlc_to_port(hdlc_device *hdlc) +{ + return hdlc->dev_data; +} #define dev_to_port(D) hdlc_to_port(dev_to_hdlc(D)) -#define port_to_dev(P) hdlc_to_dev(&(P)->hdlc) +#define port_to_dev(P) hdlc_to_dev(P->hdlc) /* @@ -678,24 +681,24 @@ len ); if ( dmabits != ( RX_STP | RX_ENP ) || len > LEN_RX_BUFFER - 2 ) { - port->hdlc.stats.rx_errors++; + port->hdlc->stats.rx_errors++; /* Update error stats and discard buffer */ if ( dmabits & RX_OFLO ) { - port->hdlc.stats.rx_fifo_errors++; + port->hdlc->stats.rx_fifo_errors++; } if ( dmabits & RX_CRC ) { - port->hdlc.stats.rx_crc_errors++; + port->hdlc->stats.rx_crc_errors++; } if ( dmabits & RX_FRAM ) { - 
port->hdlc.stats.rx_frame_errors++; + port->hdlc->stats.rx_frame_errors++; } if ( dmabits == ( RX_STP | RX_ENP )) { - port->hdlc.stats.rx_length_errors++; + port->hdlc->stats.rx_length_errors++; } /* Discard buffer descriptors until we see the end of packet @@ -732,7 +735,7 @@ { dbg ( DBG_RX,"intr_rx: can't allocate buffer\n"); - port->hdlc.stats.rx_dropped++; + port->hdlc->stats.rx_dropped++; /* Return descriptor to card */ FST_WRB ( card, rxDescrRing[pi][rxp].bits, DMA_OWN ); @@ -756,12 +759,12 @@ port->rxpos = rxp; /* Update stats */ - port->hdlc.stats.rx_packets++; - port->hdlc.stats.rx_bytes += len; + port->hdlc->stats.rx_packets++; + port->hdlc->stats.rx_bytes += len; /* Push upstream */ skb->mac.raw = skb->data; - skb->dev = hdlc_to_dev ( &port->hdlc ); + skb->dev = hdlc_to_dev ( port->hdlc ); skb->protocol = hdlc_type_trans(skb, skb->dev); netif_rx ( skb ); @@ -806,7 +809,7 @@ { event = FST_RDB ( card, interruptEvent.evntbuff[rdidx]); - port = &card->ports[event & 0x03]; + port = card->ports[event & 0x03]; dbg ( DBG_INTR,"intr: %x\n", event ); @@ -835,8 +838,8 @@ * always load up the entire packet for DMA. */ dbg ( DBG_TX,"Tx underflow port %d\n", event & 0x03 ); - port->hdlc.stats.tx_errors++; - port->hdlc.stats.tx_fifo_errors++; + port->hdlc->stats.tx_errors++; + port->hdlc->stats.tx_fifo_errors++; break; case INIT_CPLT: @@ -859,9 +862,10 @@ } FST_WRB ( card, interruptEvent.rdindex, rdidx ); - for ( pi = 0, port = card->ports ; pi < card->nports ; pi++, port++ ) + for ( pi = 0 ; pi < card->nports ; pi++ ) { - if ( ! port->run ) + port = card->ports[pi]; + if ( !port || ! port->run ) continue; /* Check for rx completions */ @@ -1346,8 +1350,8 @@ port = dev_to_port ( dev ); - port->hdlc.stats.tx_errors++; - port->hdlc.stats.tx_aborted_errors++; + port->hdlc->stats.tx_errors++; + port->hdlc->stats.tx_aborted_errors++; if ( port->txcnt > 0 ) fst_issue_cmd ( port, ABORTTX ); @@ -1374,8 +1378,8 @@ if ( ! 
netif_carrier_ok ( dev )) { dev_kfree_skb ( skb ); - port->hdlc.stats.tx_errors++; - port->hdlc.stats.tx_carrier_errors++; + port->hdlc->stats.tx_errors++; + port->hdlc->stats.tx_carrier_errors++; return 0; } @@ -1385,7 +1389,7 @@ dbg ( DBG_TX,"Packet too large %d vs %d\n", skb->len, LEN_TX_BUFFER ); dev_kfree_skb ( skb ); - port->hdlc.stats.tx_errors++; + port->hdlc->stats.tx_errors++; return 0; } @@ -1399,7 +1403,7 @@ spin_unlock_irqrestore ( &card->card_lock, flags ); dbg ( DBG_TX,"Out of Tx buffers\n"); dev_kfree_skb ( skb ); - port->hdlc.stats.tx_errors++; + port->hdlc->stats.tx_errors++; return 0; } if ( ++port->txpos >= NUM_TX_BUFFER ) @@ -1419,8 +1423,8 @@ FST_WRW ( card, txDescrRing[pi][txp].bcnt, cnv_bcnt ( skb->len )); FST_WRB ( card, txDescrRing[pi][txp].bits, DMA_OWN | TX_STP | TX_ENP ); - port->hdlc.stats.tx_packets++; - port->hdlc.stats.tx_bytes += skb->len; + port->hdlc->stats.tx_packets++; + port->hdlc->stats.tx_bytes += skb->len; dev_kfree_skb ( skb ); @@ -1455,11 +1459,23 @@ */ for ( i = 0 ; i < card->nports ; i++ ) { - card->ports[i].card = card; - card->ports[i].index = i; - card->ports[i].run = 0; + hdlc_device *hdlc; + struct fst_port_info *port; + + hdlc = alloc_hdlc_device(sizeof(*port)); + if (!hdlc) { + printk_err ("Cannot allocate HDLC device for port %d\n", i); + card->nports = i; + break; + } - dev = hdlc_to_dev ( &card->ports[i].hdlc ); + card->ports[i] = port = hdlc->dev_data; + port->hdlc = hdlc; + port->card = card; + port->index = i; + port->run = 0; + + dev = hdlc_to_dev ( hdlc ); /* Fill in the net device info */ /* Since this is a PCI setup this is purely @@ -1479,10 +1495,11 @@ dev->do_ioctl = fst_ioctl; dev->watchdog_timeo = FST_TX_TIMEOUT; dev->tx_timeout = fst_tx_timeout; - card->ports[i].hdlc.attach = fst_attach; - card->ports[i].hdlc.xmit = fst_start_xmit; + hdlc->attach = fst_attach; + hdlc->xmit = fst_start_xmit; - if (( err = register_hdlc_device ( &card->ports[i].hdlc )) < 0 ) + err = register_hdlc_device ( hdlc ); + if 
(err) { printk_err ("Cannot register HDLC device for port %d" " (errno %d)\n", i, -err ); @@ -1494,8 +1511,8 @@ spin_lock_init ( &card->card_lock ); printk ( KERN_INFO "%s-%s: %s IRQ%d, %d ports\n", - hdlc_to_dev(&card->ports[0].hdlc)->name, - hdlc_to_dev(&card->ports[card->nports-1].hdlc)->name, + hdlc_to_dev(card->ports[0]->hdlc)->name, + hdlc_to_dev(card->ports[card->nports-1]->hdlc)->name, type_strings[card->type], card->irq, card->nports ); } @@ -1641,13 +1658,14 @@ fst_remove_one ( struct pci_dev *pdev ) { struct fst_card_info *card; - int i; + int i; card = pci_get_drvdata(pdev); - for ( i = 0 ; i < card->nports ; i++ ) - { - unregister_hdlc_device ( &card->ports[i].hdlc ); + for ( i = 0 ; i < card->nports ; i++ ) { + hdlc_device *hdlc = card->ports[i]->hdlc; + unregister_hdlc_device (hdlc); + free_hdlc_device (hdlc); } fst_disable_intr ( card ); From shemminger@osdl.org Tue Dec 2 14:02:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:02:49 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2M2YTa002052 for ; Tue, 2 Dec 2003 14:02:35 -0800 Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB2M1eZ01475; Tue, 2 Dec 2003 14:01:40 -0800 Date: Tue, 2 Dec 2003 14:02:23 -0800 From: Stephen Hemminger To: Krzysztof Halas , Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] (6/8) wanxl - hdlc device conversion Message-Id: <20031202140223.636f178d.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1831 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1495 -> 1.1496 # drivers/net/wan/wanxl.c 1.2 -> 1.3 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/11/26 shemminger@osdl.org 1.1496 # Convert from embedded hdlc_device with embedded net_device to # hdlc_device pointer. # Use explicit error unwind goto's to handle early errors in initialization. # C99 initializers. # -------------------------------------------- # diff -Nru a/drivers/net/wan/wanxl.c b/drivers/net/wan/wanxl.c --- a/drivers/net/wan/wanxl.c Wed Nov 26 12:37:49 2003 +++ b/drivers/net/wan/wanxl.c Wed Nov 26 12:37:50 2003 @@ -51,7 +51,7 @@ typedef struct { - hdlc_device hdlc; /* HDLC device struct - must be first */ + hdlc_device *hdlc; /* HDLC device struct */ struct card_t *card; spinlock_t lock; /* for wanxl_xmit */ int node; /* physical port #0 - 3 */ @@ -84,7 +84,7 @@ static inline port_t* hdlc_to_port(hdlc_device *hdlc) { - return (port_t*)hdlc; + return hdlc->dev_data; } @@ -96,13 +96,13 @@ static inline struct net_device *port_to_dev(port_t* port) { - return hdlc_to_dev(&port->hdlc); + return hdlc_to_dev(port->hdlc); } static inline const char* port_name(port_t *port) { - return hdlc_to_name((hdlc_device*)port); + return hdlc_to_name(port->hdlc); } @@ -172,7 +172,7 @@ printk(KERN_INFO "%s: %s%s module, %s cable%s%s\n", port_name(port), pm, dte, cable, dsr, dcd); - hdlc_set_carrier(value & STATUS_CABLE_DCD, &port->hdlc); + hdlc_set_carrier(value & STATUS_CABLE_DCD, port->hdlc); } @@ -191,13 +191,13 @@ return; case PACKET_UNDERRUN: - port->hdlc.stats.tx_errors++; - port->hdlc.stats.tx_fifo_errors++; + 
port->hdlc->stats.tx_errors++; + port->hdlc->stats.tx_fifo_errors++; break; default: - port->hdlc.stats.tx_packets++; - port->hdlc.stats.tx_bytes += skb->len; + port->hdlc->stats.tx_packets++; + port->hdlc->stats.tx_bytes += skb->len; } desc->stat = PACKET_EMPTY; /* Free descriptor */ pci_unmap_single(port->card->pdev, desc->address, skb->len, @@ -224,7 +224,7 @@ " nonexistent port\n", card_name(card->pdev)); else if (!skb) - port->hdlc.stats.rx_dropped++; + port->hdlc->stats.rx_dropped++; else { pci_unmap_single(card->pdev, desc->address, @@ -236,8 +236,8 @@ skb->len); debug_frame(skb); #endif - port->hdlc.stats.rx_packets++; - port->hdlc.stats.rx_bytes += skb->len; + port->hdlc->stats.rx_packets++; + port->hdlc->stats.rx_bytes += skb->len; skb->mac.raw = skb->data; skb->dev = dev; dev->last_rx = jiffies; @@ -530,7 +530,7 @@ -static void wanxl_pci_remove_one(struct pci_dev *pdev) +static void __devexit wanxl_pci_remove_one(struct pci_dev *pdev) { card_t *card = pci_get_drvdata(pdev); int i; @@ -539,9 +539,13 @@ if (card->irq) free_irq(card->irq, card); - for (i = 0; i < 4; i++) - if (card->ports[i]) - unregister_hdlc_device(&card->ports[i]->hdlc); + for (i = 0; i < 4; i++) { + port_t *port = card->ports[i]; + if (port) { + unregister_hdlc_device(port->hdlc); + free_hdlc_device(port->hdlc); + } + } wanxl_reset(card); @@ -577,7 +581,7 @@ u32 plx_phy; /* PLX PCI base address */ u32 mem_phy; /* memory PCI base addr */ u8 *mem; /* memory virtual base addr */ - int i, ports, alloc_size; + int i, err, ports; #ifndef MODULE static int printed_version; @@ -587,9 +591,9 @@ } #endif - i = pci_enable_device(pdev); - if (i) - return i; + err = pci_enable_device(pdev); + if (err) + goto err_1; /* QUICC can only access first 256 MB of host RAM directly, but PLX9060 DMA does 32-bits for actual packet data transfers */ @@ -601,28 +605,28 @@ if (pci_set_consistent_dma_mask(pdev, 0x0FFFFFFF) || pci_set_dma_mask(pdev, 0x0FFFFFFF)) { printk(KERN_ERR "No usable DMA configuration\n"); - 
return -EIO; + err = -EIO; + goto err_1; } - i = pci_request_regions(pdev, "wanXL"); - if (i) - return i; + err = pci_request_regions(pdev, "wanXL"); + if (err) + goto err_1; switch (pdev->device) { case PCI_DEVICE_ID_SBE_WANXL100: ports = 1; break; case PCI_DEVICE_ID_SBE_WANXL200: ports = 2; break; default: ports = 4; } - - alloc_size = sizeof(card_t) + ports * sizeof(port_t); - card = kmalloc(alloc_size, GFP_KERNEL); + + card = kmalloc(sizeof(card_t), GFP_KERNEL); if (card == NULL) { printk(KERN_ERR "wanXL %s: unable to allocate memory\n", card_name(pdev)); - pci_release_regions(pdev); - return -ENOBUFS; + err = -ENOBUFS; + goto err_2; } - memset(card, 0, alloc_size); + memset(card, 0, sizeof(card_t)); pci_set_drvdata(pdev, card); card->pdev = pdev; @@ -631,8 +635,8 @@ card->status = pci_alloc_consistent(pdev, sizeof(card_status_t), &card->status_address); if (card->status == NULL) { - wanxl_pci_remove_one(pdev); - return -ENOBUFS; + err = -ENOBUFS; + goto err_3; } #ifdef DEBUG_PCI @@ -647,8 +651,8 @@ if (pci_set_consistent_dma_mask(pdev, 0xFFFFFFFF) || pci_set_dma_mask(pdev, 0xFFFFFFFF)) { printk(KERN_ERR "No usable DMA configuration\n"); - wanxl_pci_remove_one(pdev); - return -EIO; + err = -EIO; + goto err_4; } /* set up PLX mapping */ @@ -660,12 +664,12 @@ #endif timeout = jiffies + 20 * HZ; + err = -ENODEV; while ((stat = readl(card->plx + PLX_MAILBOX_0)) != 0) { if (time_before(timeout, jiffies)) { printk(KERN_WARNING "wanXL %s: timeout waiting for" " PUTS to complete\n", card_name(pdev)); - wanxl_pci_remove_one(pdev); - return -ENODEV; + goto err_4; } switch(stat & 0xC0) { @@ -676,8 +680,7 @@ default: printk(KERN_WARNING "wanXL %s: PUTS test 0x%X" " failed\n", card_name(pdev), stat & 0x30); - wanxl_pci_remove_one(pdev); - return -ENODEV; + goto err_4; } schedule(); @@ -697,50 +700,61 @@ " (%u bytes detected, %u bytes required)\n", card_name(pdev), ramsize, BUFFERS_ADDR + (TX_BUFFERS + RX_BUFFERS) * BUFFER_LENGTH * ports); - wanxl_pci_remove_one(pdev); - 
return -ENODEV; + goto err_4; } if (wanxl_puts_command(card, MBX1_CMD_BSWAP)) { printk(KERN_WARNING "wanXL %s: unable to Set Byte Swap" " Mode\n", card_name(pdev)); - wanxl_pci_remove_one(pdev); - return -ENODEV; + goto err_4; } for (i = 0; i < ports; i++) { - port_t *port = (void *)card + sizeof(card_t) + - i * sizeof(port_t); - struct net_device *dev = hdlc_to_dev(&port->hdlc); + hdlc_device *hdlc; + port_t *port; + struct net_device *dev; + + hdlc = alloc_hdlc_device(sizeof(*port)); + if (!hdlc) { + err = -ENOBUFS; + goto err_5; + } + + port = hdlc->dev_data; + port->hdlc = hdlc; + dev = hdlc_to_dev(hdlc); + spin_lock_init(&port->lock); SET_MODULE_OWNER(dev); dev->tx_queue_len = 50; dev->do_ioctl = wanxl_ioctl; dev->open = wanxl_open; dev->stop = wanxl_close; - port->hdlc.attach = wanxl_attach; - port->hdlc.xmit = wanxl_xmit; - if(register_hdlc_device(&port->hdlc)) { + dev->get_stats = wanxl_get_stats; + hdlc->attach = wanxl_attach; + hdlc->xmit = wanxl_xmit; + port->card = card; + port->node = i; + + err = register_hdlc_device(hdlc); + if (err) { printk(KERN_ERR "wanXL %s: unable to register hdlc" " device\n", card_name(pdev)); - wanxl_pci_remove_one(pdev); - return -ENOBUFS; + goto err_5; } card->ports[i] = port; - dev->get_stats = wanxl_get_stats; - port->card = card; - port->node = i; get_status(port)->clocking = CLOCK_EXT; } for (i = 0; i < RX_QUEUE_LENGTH; i++) { struct sk_buff *skb = dev_alloc_skb(BUFFER_LENGTH); card->rx_skbs[i] = skb; - if (skb) - card->status->rx_descs[i].address = - pci_map_single(card->pdev, skb->data, - BUFFER_LENGTH, - PCI_DMA_FROMDEVICE); + if (!skb) + goto err_6; + card->status->rx_descs[i].address = + pci_map_single(card->pdev, skb->data, + BUFFER_LENGTH, + PCI_DMA_FROMDEVICE); } mem = ioremap_nocache(mem_phy, PDM_OFFSET + sizeof(firmware)); @@ -760,8 +774,8 @@ if (wanxl_puts_command(card, MBX1_CMD_ABORTJ)) { printk(KERN_WARNING "wanXL %s: unable to Abort and Jump\n", card_name(pdev)); - wanxl_pci_remove_one(pdev); - return 
-ENODEV; + err = -ENODEV; + goto err_6; } stat = 0; @@ -770,13 +784,13 @@ if ((stat = readl(card->plx + PLX_MAILBOX_5)) != 0) break; schedule(); - }while (time_after(timeout, jiffies)); + } while (time_after(timeout, jiffies)); if (!stat) { printk(KERN_WARNING "wanXL %s: timeout while initializing card" "firmware\n", card_name(pdev)); - wanxl_pci_remove_one(pdev); - return -ENODEV; + err = -ENODEV; + goto err_6; } #if DETECT_RAM @@ -796,12 +810,40 @@ if(request_irq(pdev->irq, wanxl_intr, SA_SHIRQ, "wanXL", card)) { printk(KERN_WARNING "wanXL %s: could not allocate IRQ%i.\n", card_name(pdev), pdev->irq); - wanxl_pci_remove_one(pdev); - return -EBUSY; + err = -EBUSY; + goto err_7; } card->irq = pdev->irq; return 0; + + err_7: + wanxl_reset(card); + err_6: + for (i = 0; i < RX_QUEUE_LENGTH; i++) + if (card->rx_skbs[i]) { + pci_unmap_single(card->pdev, + card->status->rx_descs[i].address, + BUFFER_LENGTH, PCI_DMA_FROMDEVICE); + dev_kfree_skb(card->rx_skbs[i]); + } + err_5: + for (i = 0; i < ports; i++) { + port_t *port = card->ports[i]; + if (port) { + unregister_hdlc_device(port->hdlc); + free_hdlc_device(port->hdlc); + } + } + err_4: + pci_free_consistent(pdev, sizeof(card_status_t), + card->status, card->status_address); + err_3: + kfree(card); + err_2: + pci_release_regions(pdev); + err_1: + return err; } static struct pci_device_id wanxl_pci_tbl[] __devinitdata = { @@ -816,10 +858,10 @@ static struct pci_driver wanxl_pci_driver = { - name: "wanXL", - id_table: wanxl_pci_tbl, - probe: wanxl_pci_init_one, - remove: wanxl_pci_remove_one, + .name = "wanXL", + .id_table = wanxl_pci_tbl, + .probe = wanxl_pci_init_one, + .remove = __devexit_p(wanxl_pci_remove_one), }; From shemminger@osdl.org Tue Dec 2 14:02:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:02:55 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2M2fTa002143 for ; Tue, 2 Dec 2003 14:02:41 -0800 Received: from 
dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB2M1hZ01479; Tue, 2 Dec 2003 14:01:43 -0800 Date: Tue, 2 Dec 2003 14:02:26 -0800 From: Stephen Hemminger To: Krzysztof Halas , Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] (7/8) pc300 hdlc_device conversion Message-Id: <20031202140226.266abaac.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1832 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1496 -> 1.1497 # drivers/net/wan/pc300.h 1.3 -> 1.4 # drivers/net/wan/pc300_drv.c 1.15 -> 1.16 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/11/26 shemminger@osdl.org 1.1497 # Convert for hdlc_device without embedded net_device. # This device was overriding dev->priv to be the pc300 device, this # breaks the hdlc_device interface's assumptions. # Make hdlc_device private pointer be pc300 device and adjust to fix. 
# -------------------------------------------- # diff -Nru a/drivers/net/wan/pc300.h b/drivers/net/wan/pc300.h --- a/drivers/net/wan/pc300.h Wed Nov 26 12:35:58 2003 +++ b/drivers/net/wan/pc300.h Wed Nov 26 12:35:58 2003 @@ -391,7 +391,7 @@ typedef struct pc300ch { struct pc300 *card; int channel; - pc300dev_t d; + pc300dev_t *d; pc300chconf_t conf; ucchar tx_first_bd; /* First TX DMA block descr. w/ data */ ucchar tx_next_bd; /* Next free TX DMA block descriptor */ diff -Nru a/drivers/net/wan/pc300_drv.c b/drivers/net/wan/pc300_drv.c --- a/drivers/net/wan/pc300_drv.c Wed Nov 26 12:35:58 2003 +++ b/drivers/net/wan/pc300_drv.c Wed Nov 26 12:35:58 2003 @@ -615,7 +615,7 @@ while (cpc_readb(falcbase + F_REG(SIS, ch)) & SIS_CEC) { if (i++ >= PC300_FALC_MAXLOOP) { printk("%s: FALC command locked(cmd=0x%x).\n", - card->chan[ch].d.name, cmd); + card->chan[ch].d->name, cmd); break; } } @@ -1361,7 +1361,7 @@ pfalc->loss_mfa || pfalc->blue_alarm) { if (pfalc->sync) { pfalc->sync = 0; - chan->d.line_off++; + chan->d->line_off++; cpc_writeb(falcbase + card->hw.cpld_reg2, cpc_readb(falcbase + card->hw.cpld_reg2) & ~(CPLD_REG2_FALC_LED2 << (2 * ch))); @@ -1369,7 +1369,7 @@ } else { if (!pfalc->sync) { pfalc->sync = 1; - chan->d.line_on++; + chan->d->line_on++; cpc_writeb(falcbase + card->hw.cpld_reg2, cpc_readb(falcbase + card->hw.cpld_reg2) | (CPLD_REG2_FALC_LED2 << (2 * ch))); @@ -1771,7 +1771,8 @@ void cpc_tx_timeout(struct net_device *dev) { - pc300dev_t *d = (pc300dev_t *) dev->priv; + hdlc_device *hdlc = dev->priv; + pc300dev_t *d = hdlc->dev_data; pc300ch_t *chan = (pc300ch_t *) d->chan; pc300_t *card = (pc300_t *) chan->card; struct net_device_stats *stats = &d->hdlc->stats; @@ -1799,7 +1800,8 @@ int cpc_queue_xmit(struct sk_buff *skb, struct net_device *dev) { - pc300dev_t *d = (pc300dev_t *) dev->priv; + hdlc_device *hdlc = dev->priv; + pc300dev_t *d = hdlc->dev_data; pc300ch_t *chan = (pc300ch_t *) d->chan; pc300_t *card = (pc300_t *) chan->card; struct 
net_device_stats *stats = &d->hdlc->stats; @@ -1883,7 +1885,7 @@ void cpc_net_rx(hdlc_device * hdlc) { struct net_device *dev = hdlc_to_dev(hdlc); - pc300dev_t *d = (pc300dev_t *) dev->priv; + pc300dev_t *d = hdlc->dev_data; pc300ch_t *chan = (pc300ch_t *) d->chan; pc300_t *card = (pc300_t *) chan->card; struct net_device_stats *stats = &d->hdlc->stats; @@ -2016,7 +2018,7 @@ while ((status = cpc_readl(scabase + ISR0)) != 0) { for (ch = 0; ch < card->hw.nchan; ch++) { pc300ch_t *chan = &card->chan[ch]; - pc300dev_t *d = &chan->d; + pc300dev_t *d = chan->d; hdlc_device *hdlc = d->hdlc; struct net_device *dev = hdlc_to_dev(hdlc); @@ -2162,7 +2164,7 @@ netif_carrier_off(dev); #endif - card->chan[ch].d.line_off++; + card->chan[ch].d->line_off++; } else { /* DCD = 1 */ printk ("%s: DCD is ON. Going administrative up.\n", dev->name); @@ -2171,7 +2173,7 @@ /* verify if driver is not TTY */ #endif netif_carrier_on(dev); - card->chan[ch].d.line_on++; + card->chan[ch].d->line_on++; } } } @@ -2536,7 +2538,7 @@ int cpc_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) { hdlc_device *hdlc = dev_to_hdlc(dev); - pc300dev_t *d = (pc300dev_t *) dev->priv; + pc300dev_t *d = hdlc->dev_data; pc300ch_t *chan = (pc300ch_t *) d->chan; pc300_t *card = (pc300_t *) chan->card; pc300conf_t conf_aux; @@ -2616,8 +2618,8 @@ memset(&pc300stats, 0, sizeof(pc300stats_t)); pc300stats.hw_type = card->hw.type; - pc300stats.line_on = card->chan[ch].d.line_on; - pc300stats.line_off = card->chan[ch].d.line_off; + pc300stats.line_on = card->chan[ch].d->line_on; + pc300stats.line_off = card->chan[ch].d->line_off; memcpy(&pc300stats.gen_stats, &hdlc->stats, sizeof(struct net_device_stats)); if (card->hw.type == PC300_TE) @@ -2818,7 +2820,8 @@ static struct net_device_stats *cpc_get_stats(struct net_device *dev) { - pc300dev_t *d = (pc300dev_t *) dev->priv; + hdlc_device *hdlc = dev->priv; + pc300dev_t *d = hdlc->dev_data; if (d) return &d->hdlc->stats; @@ -3067,8 +3070,7 @@ static int 
cpc_attach(hdlc_device * hdlc, unsigned short encoding, unsigned short parity) { - struct net_device * dev = hdlc_to_dev(hdlc); - pc300dev_t *d = (pc300dev_t *)dev->priv; + pc300dev_t *d = hdlc->dev_data; pc300ch_t *chan = (pc300ch_t *)d->chan; pc300_t *card = (pc300_t *)chan->card; pc300chconf_t *conf = (pc300chconf_t *)&chan->conf; @@ -3145,7 +3147,7 @@ int cpc_open(struct net_device *dev) { hdlc_device *hdlc = dev_to_hdlc(dev); - pc300dev_t *d = (pc300dev_t *) dev->priv; + pc300dev_t *d = hdlc->dev_data; struct ifreq ifr; int result; @@ -3158,9 +3160,6 @@ } result = hdlc_open(hdlc); - if (hdlc->proto.id == IF_PROTO_PPP) { - dev->priv = d; - } if (result) { return result; } @@ -3174,7 +3173,7 @@ int cpc_close(struct net_device *dev) { hdlc_device *hdlc = dev_to_hdlc(dev); - pc300dev_t *d = (pc300dev_t *) dev->priv; + pc300dev_t *d = hdlc->dev_data; pc300ch_t *chan = (pc300ch_t *) d->chan; pc300_t *card = (pc300_t *) chan->card; uclong flags; @@ -3316,10 +3315,17 @@ for (i = 0; i < card->hw.nchan; i++) { pc300ch_t *chan = &card->chan[i]; - pc300dev_t *d = &chan->d; + pc300dev_t *d; hdlc_device *hdlc; struct net_device *dev; + hdlc = alloc_hdlc_device(sizeof(*d)); + if (hdlc == NULL) + continue; + d = hdlc->dev_data; + d->hdlc = hdlc; + dev = hdlc_to_dev(hdlc); + chan->card = card; chan->channel = i; chan->conf.phys_settings.clock_rate = 0; @@ -3358,12 +3364,6 @@ d->line_on = 0; d->line_off = 0; - d->hdlc = (hdlc_device *) kmalloc(sizeof(hdlc_device), GFP_KERNEL); - if (d->hdlc == NULL) - continue; - memset(d->hdlc, 0, sizeof(hdlc_device)); - - hdlc = d->hdlc; hdlc->xmit = cpc_queue_xmit; hdlc->attach = cpc_attach; @@ -3387,7 +3387,6 @@ dev->do_ioctl = cpc_ioctl; if (register_hdlc_device(hdlc) == 0) { - dev->priv = d; /* We need 'priv', hdlc doesn't */ printk("%s: Cyclades-PC300/", dev->name); switch (card->hw.type) { case PC300_TE: @@ -3647,7 +3646,9 @@ cpc_readw(card->hw.plxbase + card->hw.intctl_reg) & ~(0x0040)); for (i = 0; i < card->hw.nchan; i++) { - 
unregister_hdlc_device(card->chan[i].d.hdlc); + hdlc_device *hdlc = card->chan[i].d->hdlc; + unregister_hdlc_device(hdlc); + free_hdlc_device(hdlc); } iounmap((void *) card->hw.plxbase); iounmap((void *) card->hw.scabase); From shemminger@osdl.org Tue Dec 2 14:02:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:03:04 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2M2lTa002210 for ; Tue, 2 Dec 2003 14:02:47 -0800 Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB2M1kZ01498; Tue, 2 Dec 2003 14:01:46 -0800 Date: Tue, 2 Dec 2003 14:00:43 -0800 From: Stephen Hemminger To: Krzysztof Halas , Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] (8/8) pc300_tty cleanup. Message-Id: <20031202140043.4da79ac3.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1833 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1500 -> 1.1501 # drivers/net/wan/pc300_tty.c 1.14 -> 1.15 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/12/01 shemminger@osdl.org 1.1501 # * make local variables static # * initialize tty_driver struct at compile time # * don't expect net_device to be first in pc300 struct # * don't depend on name being hdlcX # -------------------------------------------- # diff -Nru a/drivers/net/wan/pc300_tty.c b/drivers/net/wan/pc300_tty.c --- a/drivers/net/wan/pc300_tty.c Tue Dec 2 13:55:55 2003 +++ b/drivers/net/wan/pc300_tty.c Tue Dec 2 13:55:55 2003 @@ -102,20 +102,17 @@ volatile struct st_cpc_rx_list buf_rx; /* ptr. to reception buffer */ unsigned char* buf_tx; /* ptr. to transmission buffer */ pc300dev_t* pc300dev; /* ptr. to info struct in PC300 driver */ - unsigned char name[20]; /* interf. name + "-tty" */ + unsigned char name[IFNAMSIZ+6]; /* interf. name + "-tty" */ struct tty_struct *tty; struct work_struct tty_tx_work; /* tx work - tx interrupt */ struct work_struct tty_rx_work; /* rx work - rx interrupt */ } st_cpc_tty_area; -/* TTY data structures */ -static struct tty_driver serial_drv; - /* local variables */ -st_cpc_tty_area cpc_tty_area[CPC_TTY_NPORTS]; +static st_cpc_tty_area cpc_tty_area[CPC_TTY_NPORTS]; -int cpc_tty_cnt=0; /* number of intrfaces configured with MLPPP */ -int cpc_tty_unreg_flag = 0; +static int cpc_tty_cnt=0; /* number of intrfaces configured with MLPPP */ +static int cpc_tty_unreg_flag = 0; /* TTY functions prototype */ static int cpc_tty_open(struct tty_struct *tty, struct file *flip); @@ -152,8 +149,8 @@ int ch = pc300chan->channel; unsigned long flags; - CPC_TTY_DBG("%s-tty: Clear signal DTR\n", - ((struct net_device*)(pc300dev->hdlc))->name); + CPC_TTY_DBG("%s-tty: Clear signal DTR\n", + pc300dev->hdlc->netdev->name); CPC_TTY_LOCK(card, flags); cpc_writeb(card->hw.scabase + M_REG(CTL,ch), 
cpc_readb(card->hw.scabase+M_REG(CTL,ch))& CTL_DTR); @@ -171,13 +168,34 @@ unsigned long flags; CPC_TTY_DBG("%s-tty: Set signal DTR\n", - ((struct net_device*)(pc300dev->hdlc))->name); + pc300dev->hdlc->netdev->name); CPC_TTY_LOCK(card, flags); cpc_writeb(card->hw.scabase + M_REG(CTL,ch), cpc_readb(card->hw.scabase+M_REG(CTL,ch))& ~CTL_DTR); CPC_TTY_UNLOCK(card,flags); } +static struct tty_driver serial_drv = { + .magic = TTY_DRIVER_MAGIC, + .owner = THIS_MODULE, + .driver_name = "pc300_tty", + .name = "ttyCP", + .major = CPC_TTY_MAJOR, + .minor_start = CPC_TTY_MINOR_START, + .num = CPC_TTY_NPORTS, + .type = TTY_DRIVER_TYPE_SERIAL, + .subtype = SERIAL_TYPE_NORMAL, + .flags = TTY_DRIVER_REAL_RAW, + .open = cpc_tty_open, + .close = cpc_tty_close, + .write = cpc_tty_write, + .write_room = cpc_tty_write_room, + .chars_in_buffer = cpc_tty_chars_in_buffer, + .ioctl = cpc_tty_ioctl, + .flush_buffer = cpc_tty_flush_buffer, + .hangup = cpc_tty_hangup, +}; + /* * PC300 TTY initialization routine * @@ -190,48 +208,26 @@ */ void cpc_tty_init(pc300dev_t *pc300dev) { - int port, aux; + struct net_device *dev = pc300dev->hdlc->netdev; + unsigned port; st_cpc_tty_area * cpc_tty; /* hdlcX - X=interface number */ - port = ((struct net_device*)(pc300dev->hdlc))->name[4] - '0'; - if (port >= CPC_TTY_NPORTS) { + if (sscanf(dev->name, "hdlc%u", &port) != 1 + || port >= CPC_TTY_NPORTS) { printk("%s-tty: invalid interface selected (0-%i): %i", - ((struct net_device*)(pc300dev->hdlc))->name, - CPC_TTY_NPORTS-1,port); + dev->name, CPC_TTY_NPORTS-1,port); return; } if (cpc_tty_cnt == 0) { /* first TTY connection -> register driver */ CPC_TTY_DBG("%s-tty: driver init, major:%i, minor range:%i=%i\n", - ((struct net_device*)(pc300dev->hdlc))->name, - CPC_TTY_MAJOR, CPC_TTY_MINOR_START, - CPC_TTY_MINOR_START+CPC_TTY_NPORTS); - /* initialize tty driver struct */ - memset(&serial_drv,0,sizeof(struct tty_driver)); - serial_drv.magic = TTY_DRIVER_MAGIC; - serial_drv.owner = THIS_MODULE; - 
serial_drv.driver_name = "pc300_tty"; - serial_drv.name = "ttyCP"; - serial_drv.major = CPC_TTY_MAJOR; - serial_drv.minor_start = CPC_TTY_MINOR_START; - serial_drv.num = CPC_TTY_NPORTS; - serial_drv.type = TTY_DRIVER_TYPE_SERIAL; - serial_drv.subtype = SERIAL_TYPE_NORMAL; + dev->name, + CPC_TTY_MAJOR, CPC_TTY_MINOR_START, + CPC_TTY_MINOR_START+CPC_TTY_NPORTS); serial_drv.init_termios = tty_std_termios; serial_drv.init_termios.c_cflag = B9600|CS8|CREAD|HUPCL|CLOCAL; - serial_drv.flags = TTY_DRIVER_REAL_RAW; - - /* interface routines from the upper tty layer to the tty driver */ - serial_drv.open = cpc_tty_open; - serial_drv.close = cpc_tty_close; - serial_drv.write = cpc_tty_write; - serial_drv.write_room = cpc_tty_write_room; - serial_drv.chars_in_buffer = cpc_tty_chars_in_buffer; - serial_drv.ioctl = cpc_tty_ioctl; - serial_drv.flush_buffer = cpc_tty_flush_buffer; - serial_drv.hangup = cpc_tty_hangup; /* register the TTY driver */ if (tty_register_driver(&serial_drv)) { @@ -241,7 +237,7 @@ } memset((void *)cpc_tty_area, 0, - sizeof(st_cpc_tty_area) * CPC_TTY_NPORTS); + sizeof(st_cpc_tty_area) * CPC_TTY_NPORTS); } cpc_tty = &cpc_tty_area[port]; @@ -265,11 +261,8 @@ pc300dev->cpc_tty = (void *)cpc_tty; - aux = strlen(((struct net_device*)(pc300dev->hdlc))->name); - memcpy(cpc_tty->name,((struct net_device*)(pc300dev->hdlc))->name,aux); - memcpy(&cpc_tty->name[aux], "-tty", 5); - - cpc_open((struct net_device *)pc300dev->hdlc); + snprintf(cpc_tty->name, sizeof(cpc_tty->name)-1, "%s-tty", dev->name); + cpc_open(pc300dev->hdlc->netdev); cpc_tty_dtr_off(pc300dev); CPC_TTY_DBG("%s: Initializing TTY Sync Driver, tty major#%d minor#%i\n", @@ -997,22 +990,23 @@ static void cpc_tty_trace(pc300dev_t *dev, char* buf, int len, char rxtx) { struct sk_buff *skb; + struct net_device *netdev = dev->hdlc->netdev; if ((skb = dev_alloc_skb(10 + len)) == NULL) { /* out of memory */ CPC_TTY_DBG("%s: tty_trace - out of memory\n", - ((struct net_device *)(dev->hdlc))->name); + 
netdev->name); return; } skb_put (skb, 10 + len); - skb->dev = (struct net_device *) dev->hdlc; + skb->dev = netdev; skb->protocol = htons(ETH_P_CUST); skb->mac.raw = skb->data; skb->pkt_type = PACKET_HOST; skb->len = 10 + len; - memcpy(skb->data,((struct net_device *)(dev->hdlc))->name,5); + memcpy(skb->data,netdev->name,5); skb->data[5] = '['; skb->data[6] = rxtx; skb->data[7] = ']'; @@ -1035,14 +1029,14 @@ if ((cpc_tty= (st_cpc_tty_area *) pc300dev->cpc_tty) == 0) { CPC_TTY_DBG("%s: interface is not TTY\n", - ((struct net_device *)(pc300dev->hdlc))->name); + pc300dev->hdlc->netdev->name); return; } CPC_TTY_DBG("%s: cpc_tty_unregister_service", cpc_tty->name); if (cpc_tty->pc300dev != pc300dev) { CPC_TTY_DBG("%s: invalid tty ptr=%s\n", - ((struct net_device *)(pc300dev->hdlc))->name, cpc_tty->name); + pc300dev->hdlc->netdev->name, cpc_tty->name); return; } @@ -1091,8 +1085,7 @@ int i ; CPC_TTY_DBG("hdlcX-tty: reset variables\n"); - /* reset the tty_driver structure - serial_drv */ - memset(&serial_drv, 0, sizeof(struct tty_driver)); + for (i=0; i < CPC_TTY_NPORTS; i++){ memset(&cpc_tty_area[i],0, sizeof(st_cpc_tty_area)); } From shemminger@osdl.org Tue Dec 2 14:02:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:03:07 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2M2qTa002276 for ; Tue, 2 Dec 2003 14:02:52 -0800 Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB2M1kZ01494; Tue, 2 Dec 2003 14:01:46 -0800 Date: Tue, 2 Dec 2003 13:58:46 -0800 From: Stephen Hemminger To: Krzysztof Halas , Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] (4//8) dscc4 convert to new hdlc_device Message-Id: <20031202135846.28a93010.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: 
&@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1834 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1493 -> 1.1494 # drivers/net/wan/dscc4.c 1.51 -> 1.52 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/11/26 shemminger@osdl.org 1.1494 # Convert to work with hdlc_device that is not embedded. # Use list macros for list of ports on card. # -------------------------------------------- # diff -Nru a/drivers/net/wan/dscc4.c b/drivers/net/wan/dscc4.c --- a/drivers/net/wan/dscc4.c Wed Nov 26 12:30:13 2003 +++ b/drivers/net/wan/dscc4.c Wed Nov 26 12:30:13 2003 @@ -185,12 +185,14 @@ spinlock_t lock; struct pci_dev *pdev; - struct dscc4_dev_priv *root; + struct list_head devs; dma_addr_t iqcfg_dma; u32 xtal_hz; }; struct dscc4_dev_priv { + struct list_head dev_list; + struct sk_buff *rx_skbuff[RX_RING_SIZE]; struct sk_buff *tx_skbuff[TX_RING_SIZE]; @@ -228,7 +230,7 @@ unsigned short encoding; unsigned short parity; - hdlc_device hdlc; + hdlc_device *hdlc; sync_serial_settings settings; u32 __pad __attribute__ ((aligned (4))); }; @@ -371,9 +373,17 @@ static int dscc4_tx_poll(struct dscc4_dev_priv *, struct net_device *); #endif +static inline struct dscc4_dev_priv *dscc4_root_priv(struct dscc4_pci_priv *ppriv) +{ + struct list_head *first = ppriv->devs.next; + + return (first == &ppriv->devs) ? 
NULL + : list_entry(first, struct dscc4_dev_priv, dev_list); +} + static inline struct dscc4_dev_priv *dscc4_priv(struct net_device *dev) { - return list_entry(dev, struct dscc4_dev_priv, hdlc.netdev); + return dev_to_hdlc(dev)->dev_data; } static void scc_patchl(u32 mask, u32 value, struct dscc4_dev_priv *dpriv, @@ -636,7 +646,7 @@ struct net_device *dev) { struct RxFD *rx_fd = dpriv->rx_fd + dpriv->rx_current%RX_RING_SIZE; - struct net_device_stats *stats = &dpriv->hdlc.stats; + struct net_device_stats *stats = &dpriv->hdlc->stats; struct pci_dev *pdev = dpriv->pci_priv->pdev; struct sk_buff *skb; int pkt_len; @@ -681,19 +691,16 @@ static void dscc4_free1(struct pci_dev *pdev) { - struct dscc4_pci_priv *ppriv; - struct dscc4_dev_priv *root; - int i; + struct dscc4_pci_priv *ppriv = pci_get_drvdata(pdev); + struct dscc4_dev_priv *dpriv, *dnext; - ppriv = pci_get_drvdata(pdev); - root = ppriv->root; - - for (i = 0; i < dev_per_card; i++) - unregister_hdlc_device(&root[i].hdlc); + list_for_each_entry_safe(dpriv, dnext, &ppriv->devs, dev_list) { + unregister_hdlc_device(dpriv->hdlc); + free_hdlc_device(dpriv->hdlc); + } pci_set_drvdata(pdev, NULL); - kfree(root); kfree(ppriv); } @@ -704,7 +711,6 @@ struct dscc4_dev_priv *dpriv; static int cards_found = 0; unsigned long ioaddr; - int i; printk(KERN_DEBUG "%s", version); @@ -743,7 +749,8 @@ priv = (struct dscc4_pci_priv *)pci_get_drvdata(pdev); - if (request_irq(pdev->irq, &dscc4_irq, SA_SHIRQ, DRV_NAME, priv->root)){ + if (request_irq(pdev->irq, &dscc4_irq, SA_SHIRQ, DRV_NAME, + dscc4_root_priv(priv))) { printk(KERN_WARNING "%s: IRQ %d busy\n", DRV_NAME, pdev->irq); goto err_out_free1; } @@ -772,21 +779,19 @@ * SCC 0-3 private rx/tx irq structures * IQRX/TXi needs to be set soon. Learned it the hard way... 
*/ - for (i = 0; i < dev_per_card; i++) { - dpriv = priv->root + i; + list_for_each_entry(dpriv, &priv->devs, dev_list) { dpriv->iqtx = (u32 *) pci_alloc_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), &dpriv->iqtx_dma); if (!dpriv->iqtx) goto err_out_free_iqtx; - writel(dpriv->iqtx_dma, ioaddr + IQTX0 + i*4); + writel(dpriv->iqtx_dma, ioaddr + IQTX0 + dpriv->dev_id*4); } - for (i = 0; i < dev_per_card; i++) { - dpriv = priv->root + i; + list_for_each_entry(dpriv, &priv->devs, dev_list) { dpriv->iqrx = (u32 *) pci_alloc_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), &dpriv->iqrx_dma); if (!dpriv->iqrx) goto err_out_free_iqrx; - writel(dpriv->iqrx_dma, ioaddr + IQRX0 + i*4); + writel(dpriv->iqrx_dma, ioaddr + IQRX0 + dpriv->dev_id*4); } /* Cf application hint. Beware of hard-lock condition on threshold. */ @@ -804,22 +809,19 @@ return 0; err_out_free_iqrx: - while (--i >= 0) { - dpriv = priv->root + i; + list_for_each_entry(dpriv, &priv->devs, dev_list) { pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), dpriv->iqrx, dpriv->iqrx_dma); } - i = dev_per_card; err_out_free_iqtx: - while (--i >= 0) { - dpriv = priv->root + i; + list_for_each_entry(dpriv, &priv->devs, dev_list) { pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), dpriv->iqtx, dpriv->iqtx_dma); } pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), priv->iqcfg, priv->iqcfg_dma); err_out_free_irq: - free_irq(pdev->irq, priv->root); + free_irq(pdev->irq, dscc4_root_priv(priv)); err_out_free1: dscc4_free1(pdev); err_out_iounmap: @@ -863,29 +865,27 @@ static int dscc4_found1(struct pci_dev *pdev, unsigned long ioaddr) { struct dscc4_pci_priv *ppriv; - struct dscc4_dev_priv *root; + struct dscc4_dev_priv *dpriv, *dnext; int i, ret = -ENOMEM; - root = (struct dscc4_dev_priv *) - kmalloc(dev_per_card*sizeof(*root), GFP_KERNEL); - if (!root) { - printk(KERN_ERR "%s: can't allocate data\n", DRV_NAME); - goto err_out; - } - memset(root, 0, dev_per_card*sizeof(*root)); - ppriv = (struct dscc4_pci_priv *) 
kmalloc(sizeof(*ppriv), GFP_KERNEL); if (!ppriv) { printk(KERN_ERR "%s: can't allocate private data\n", DRV_NAME); - goto err_free_dev; + goto err_out; } memset(ppriv, 0, sizeof(struct dscc4_pci_priv)); + INIT_LIST_HEAD(&ppriv->devs); for (i = 0; i < dev_per_card; i++) { - struct dscc4_dev_priv *dpriv = root + i; - hdlc_device *hdlc = &dpriv->hdlc; - struct net_device *d = hdlc_to_dev(hdlc); + hdlc_device *hdlc = alloc_hdlc_device(sizeof(*dpriv)); + struct net_device *d; + + if (!hdlc) { + ret = -ENOMEM; + goto err_unregister; + } + d = hdlc_to_dev(hdlc); d->base_addr = ioaddr; d->init = NULL; d->irq = pdev->irq; @@ -898,6 +898,8 @@ SET_MODULE_OWNER(d); SET_NETDEV_DEV(d, &pdev->dev); + dpriv = dscc4_priv(d); + dpriv->hdlc = hdlc; dpriv->dev_id = i; dpriv->pci_priv = ppriv; spin_lock_init(&dpriv->lock); @@ -920,23 +922,25 @@ unregister_hdlc_device(hdlc); goto err_unregister; } + list_add_tail(&dpriv->dev_list, &ppriv->devs); } - ret = dscc4_set_quartz(root, quartz); + + + ret = dscc4_set_quartz(dscc4_root_priv(ppriv), quartz); if (ret < 0) goto err_unregister; - ppriv->root = root; + spin_lock_init(&ppriv->lock); pci_set_drvdata(pdev, ppriv); return ret; err_unregister: - while (--i >= 0) { - dscc4_release_ring(root + i); - unregister_hdlc_device(&root[i].hdlc); + list_for_each_entry_safe(dpriv, dnext, &ppriv->devs, dev_list) { + dscc4_release_ring(dpriv); + unregister_hdlc_device(dpriv->hdlc); + free_hdlc_device(dpriv->hdlc); } kfree(ppriv); -err_free_dev: - kfree(root); err_out: return ret; }; @@ -964,7 +968,7 @@ sync_serial_settings *settings = &dpriv->settings; if (settings->loopback && (settings->clock_type != CLOCK_INT)) { - struct net_device *dev = hdlc_to_dev(&dpriv->hdlc); + struct net_device *dev = hdlc_to_dev(dpriv->hdlc); printk(KERN_INFO "%s: loopback requires clock\n", dev->name); return -1; @@ -1015,7 +1019,7 @@ static int dscc4_open(struct net_device *dev) { struct dscc4_dev_priv *dpriv = dscc4_priv(dev); - hdlc_device *hdlc = &dpriv->hdlc; + 
hdlc_device *hdlc = dpriv->hdlc; struct dscc4_pci_priv *ppriv; int ret = -EAGAIN; @@ -1467,7 +1471,7 @@ int i, handled = 1; priv = root->pci_priv; - dev = hdlc_to_dev(&root->hdlc); + dev = hdlc_to_dev(root->hdlc); spin_lock_irqsave(&priv->lock, flags); @@ -1518,7 +1522,7 @@ static inline void dscc4_tx_irq(struct dscc4_pci_priv *ppriv, struct dscc4_dev_priv *dpriv) { - struct net_device *dev = hdlc_to_dev(&dpriv->hdlc); + struct net_device *dev = hdlc_to_dev(dpriv->hdlc); u32 state; int cur, loop = 0; @@ -1549,7 +1553,7 @@ if (state & SccEvt) { if (state & Alls) { - struct net_device_stats *stats = &dpriv->hdlc.stats; + struct net_device_stats *stats = &dpriv->hdlc->stats; struct sk_buff *skb; struct TxFD *tx_fd; @@ -1687,7 +1691,7 @@ static inline void dscc4_rx_irq(struct dscc4_pci_priv *priv, struct dscc4_dev_priv *dpriv) { - struct net_device *dev = hdlc_to_dev(&dpriv->hdlc); + struct net_device *dev = hdlc_to_dev(dpriv->hdlc); u32 state; int cur; @@ -1954,23 +1958,21 @@ static void __devexit dscc4_remove_one(struct pci_dev *pdev) { struct dscc4_pci_priv *ppriv; - struct dscc4_dev_priv *root; + struct dscc4_dev_priv *root, *dpriv; unsigned long ioaddr; - int i; ppriv = pci_get_drvdata(pdev); - root = ppriv->root; + root = dscc4_root_priv(ppriv); - ioaddr = hdlc_to_dev(&root->hdlc)->base_addr; + ioaddr = hdlc_to_dev(root->hdlc)->base_addr; dscc4_pci_reset(pdev, ioaddr); free_irq(pdev->irq, root); pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), ppriv->iqcfg, ppriv->iqcfg_dma); - for (i = 0; i < dev_per_card; i++) { - struct dscc4_dev_priv *dpriv = root + i; + list_for_each_entry(dpriv, &ppriv->devs, dev_list) { dscc4_release_ring(dpriv); pci_free_consistent(pdev, IRQ_RING_SIZE*sizeof(u32), dpriv->iqrx, dpriv->iqrx_dma); From shemminger@osdl.org Tue Dec 2 14:06:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:06:50 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id 
hB2M6bTa004599 for ; Tue, 2 Dec 2003 14:06:37 -0800 Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB2M0PZ01334; Tue, 2 Dec 2003 14:00:25 -0800 Date: Tue, 2 Dec 2003 14:01:08 -0800 From: Stephen Hemminger To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] (2/8) hdlc_fr - use alloc_ether/netdev Message-Id: <20031202140108.24c1dd6b.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1835 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev The HDLC frame relay device uses kmalloc directly to create the network device. This patch switches to alloc_netdev/alloc_etherdev as necessary. # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1499 -> 1.1500 # drivers/net/wan/hdlc_fr.c 1.10 -> 1.11 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/12/01 shemminger@osdl.org 1.1500 # Make PVC net_devices use alloc_netdev # -------------------------------------------- # diff -Nru a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c --- a/drivers/net/wan/hdlc_fr.c Mon Dec 1 13:53:42 2003 +++ b/drivers/net/wan/hdlc_fr.c Mon Dec 1 13:53:42 2003 @@ -986,7 +986,13 @@ } } - +static void fr_dlci_setup(struct net_device *dev) +{ + dev->type = ARPHRD_DLCI; + dev->flags = IFF_POINTOPOINT; + dev->hard_header_len = 10; + dev->addr_len = 2; +} static int fr_add_pvc(hdlc_device *hdlc, unsigned int dlci, int type) { @@ -1009,26 +1015,26 @@ used = pvc_is_used(pvc); - dev = kmalloc(sizeof(struct net_device) + - sizeof(struct net_device_stats), GFP_KERNEL); + if (type == ARPHRD_ETHER) { + dev = alloc_etherdev(sizeof(struct net_device_stats)); + prefix = "pvceth%d" ; + } else { + dev = alloc_netdev(sizeof(struct net_device_stats), + "pvc", fr_dlci_setup); + prefix = "pvc%d"; + } + if (!dev) { printk(KERN_WARNING "%s: Memory squeeze on fr_pvc()\n", hdlc_to_name(hdlc)); delete_unused_pvcs(hdlc); return -ENOBUFS; } - memset(dev, 0, sizeof(struct net_device) + - sizeof(struct net_device_stats)); if (type == ARPHRD_ETHER) { - ether_setup(dev); memcpy(dev->dev_addr, "\x00\x01", 2); get_random_bytes(dev->dev_addr + 2, ETH_ALEN - 2); } else { - dev->type = ARPHRD_DLCI; - dev->flags = IFF_POINTOPOINT; - dev->hard_header_len = 10; - dev->addr_len = 2; *(u16*)dev->dev_addr = htons(dlci); dlci_to_q922(dev->broadcast, dlci); } @@ -1044,13 +1050,13 @@ result = dev_alloc_name(dev, prefix); if (result < 0) { - kfree(dev); + free_netdev(dev); delete_unused_pvcs(hdlc); return result; } if (register_netdevice(dev) != 0) { - kfree(dev); + free_netdev(dev); delete_unused_pvcs(hdlc); return -EIO; } From samo.pogacnik@s5.net Tue Dec 2 
14:25:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:25:19 -0800 (PST) Received: from ikar.perftech.si (ikar.perftech.si [195.246.0.20]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2MP0Ta005412 for ; Tue, 2 Dec 2003 14:25:01 -0800 Message-Id: <200312022225.hB2MP0Ta005412@oss.sgi.com> Received: from there ([195.246.28.105]) by ikar.perftech.si (IKAR [127.0.0.1]) (MDaemon.PRO.v6.8.5.R) with ESMTP id 27-md50000000093.tmp for ; Tue, 02 Dec 2003 22:59:01 +0100 Content-Type: text/plain; charset="iso-8859-2" From: Samo Pogacnik To: netdev@oss.sgi.com Subject: network interface groups Date: Tue, 2 Dec 2003 22:43:42 +0100 X-Mailer: KMail [version 1.3.2] MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Processed: IKAR, Tue, 02 Dec 2003 22:59:01 +0100 (not processed: message size (17774) exceeds max size (15360)) X-MDRemoteIP: 195.246.28.105 X-Return-Path: samo.pogacnik@s5.net X-MDaemon-Deliver-To: netdev@oss.sgi.com X-archive-position: 1836 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: samo.pogacnik@s5.net Precedence: bulk X-list: netdev hi, I made a small modification to the Linux networking code that might be interesting to someone. So here is the description of what I did and a Linux patch at the end of the description. The following URL: http://friends.s5.net/pogacnik/net_if_grp.html leads to Linux (2.4.18 and 2.6.0-test11) and ifconfig patches. ---------------------------------------------------------------------- Network interface groups 1. Introduction The provided modification lets you assign a set of group numbers between 0 and 31 to each network interface in the system and configure your networking services to be accessible on those groups of interfaces, as you like. The Linux patch implements this functionality (IPv4 only), while the net-tools (ifconfig) patch adds the ability to manage the group numbers of each network interface. 2. 
How this works or How I think that this works?

This code extends the ability of a socket to accept packets on ANY of your network interfaces into the ability to accept packets on a selected group of network interfaces. The goal was achieved by adding an interface group mask to the net_device structure and another one to the inet_opt structure. When a received packet looks for its socket, the group mask of the receiving network interface has to match the socket's group mask in at least one bit. This way a bind system call with the desired interface group number as its parameter (instead of an IP address or INADDR_ANY) selects a group of interfaces that may provide packets on this socket. By default all network interfaces belong only to group 0, and all sockets initialise their group mask to group 0.

You may change the group mask of network interfaces while network services are running, using the patched ifconfig tool. For example, changing the group mask of an interface may cause a network service to stop accepting connections via this interface, while the existing connections remain operational until they are closed. In the same way a network service may start accepting connections via the desired interface.

The good thing about this solution is that the default behaviour of this extension should not make any significant difference (correct me if I'm wrong), because all network interfaces fall into group zero by default.

3. What is it good for?

Separating network services, moving services between interfaces, better security, ....???

4. To think about

Can we use this approach on IPv6? Do we need more than 32 groups?

5. Note:

To be able to build the patched ifconfig tool, you have to synchronise the changes to the Linux headers if.h and sockios.h with the corresponding libc headers (net/if.h and bits/ioctls.h on my system) and correctly set the Linux headers path.
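As a rough illustration of the matching rule from section 2, the mask handling can be sketched in plain user-space C. This is not code from the patch: the helper names ifgrp_bit, ifgrp_add, ifgrp_remove and ifgrp_match are made up for the sketch, and only the 0..31 group range and the "at least one common bit" rule are taken from the description above.

```c
#include <stdint.h>

/* Hypothetical helpers for this sketch; not part of the kernel patch. */
#define IFGRP_MAX 31u   /* highest valid group number (groups 0..31) */

/* Mask with only the bit for group `grp` set, or 0 if grp is out of range. */
static uint32_t ifgrp_bit(unsigned int grp)
{
    return grp <= IFGRP_MAX ? (UINT32_C(1) << grp) : 0;
}

/* Add a group to an interface mask; returns -1 for an invalid group number. */
static int ifgrp_add(uint32_t *mask, unsigned int grp)
{
    uint32_t bit = ifgrp_bit(grp);

    if (!bit)
        return -1;
    *mask |= bit;
    return 0;
}

/* Remove a group from an interface mask; returns -1 for an invalid group number. */
static int ifgrp_remove(uint32_t *mask, unsigned int grp)
{
    uint32_t bit = ifgrp_bit(grp);

    if (!bit)
        return -1;
    *mask &= ~bit;
    return 0;
}

/* A socket may receive from an interface if the two masks share at least one bit. */
static int ifgrp_match(uint32_t sock_mask, uint32_t dev_mask)
{
    return (sock_mask & dev_mask) != 0;
}
```

With both the interface and the socket left at the default (group 0 only), ifgrp_match() succeeds. Once group 0 is removed from the interface and group 5 is added, a socket still in group 0 no longer matches, while a socket bound to group 5 does.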
Comments appreciated:) Samo samo.pogacnik@s5.net -------------------------------------------------------------------------- diff -Nur linux-2.6.0-test11/include/linux/if.h linux-2.6.0-test11-ifgrp/include/linux/if.h --- linux-2.6.0-test11/include/linux/if.h Wed Nov 26 21:43:51 2003 +++ linux-2.6.0-test11-ifgrp/include/linux/if.h Tue Dec 2 19:58:09 2003 @@ -141,6 +141,7 @@ short ifru_flags; int ifru_ivalue; int ifru_mtu; + unsigned int ifru_group; struct ifmap ifru_map; char ifru_slave[IFNAMSIZ]; /* Just fits the size */ char ifru_newname[IFNAMSIZ]; @@ -158,6 +159,7 @@ #define ifr_flags ifr_ifru.ifru_flags /* flags */ #define ifr_metric ifr_ifru.ifru_ivalue /* metric */ #define ifr_mtu ifr_ifru.ifru_mtu /* mtu */ +#define ifr_group ifr_ifru.ifru_group /* device group */ #define ifr_map ifr_ifru.ifru_map /* device map */ #define ifr_slave ifr_ifru.ifru_slave /* slave device */ #define ifr_data ifr_ifru.ifru_data /* for use by interface */ diff -Nur linux-2.6.0-test11/include/linux/ip.h linux-2.6.0-test11-ifgrp/include/linux/ip.h --- linux-2.6.0-test11/include/linux/ip.h Wed Nov 26 21:43:28 2003 +++ linux-2.6.0-test11-ifgrp/include/linux/ip.h Tue Dec 2 20:17:23 2003 @@ -144,6 +144,7 @@ u32 addr; struct flowi fl; } cork; + __u32 if_group; /* Bound network interface group mask */ }; #define IPCORK_OPT 1 /* ip-options has been held in ipcork.opt */ diff -Nur linux-2.6.0-test11/include/linux/netdevice.h linux-2.6.0-test11-ifgrp/include/linux/netdevice.h --- linux-2.6.0-test11/include/linux/netdevice.h Wed Nov 26 21:45:38 2003 +++ linux-2.6.0-test11-ifgrp/include/linux/netdevice.h Tue Dec 2 19:59:42 2003 @@ -295,6 +295,10 @@ int ifindex; int iflink; + /* The 'ifgroup' field adds support for binding services to a group + * of interfaces. 
+ */ + unsigned int ifgroup; struct net_device_stats* (*get_stats)(struct net_device *dev); struct iw_statistics* (*get_wireless_stats)(struct net_device *dev); diff -Nur linux-2.6.0-test11/include/linux/sockios.h linux-2.6.0-test11-ifgrp/include/linux/sockios.h --- linux-2.6.0-test11/include/linux/sockios.h Wed Nov 26 21:42:50 2003 +++ linux-2.6.0-test11-ifgrp/include/linux/sockios.h Tue Dec 2 20:07:01 2003 @@ -116,6 +116,12 @@ #define SIOCBONDINFOQUERY 0x8994 /* rtn info about bond state */ #define SIOCBONDCHANGEACTIVE 0x8995 /* update to a new active slave */ +/* Interface group configuration calls */ + +#define SIOCGIFGROUP 0x89A0 /* Get device group numbers */ +#define SIOCAIFGROUP 0x89A1 /* Add device group number */ +#define SIOCRIFGROUP 0x89A2 /* Remove device group number */ + /* Device private ioctl calls */ /* diff -Nur linux-2.6.0-test11/include/net/raw.h linux-2.6.0-test11-ifgrp/include/net/raw.h --- linux-2.6.0-test11/include/net/raw.h Wed Nov 26 21:45:37 2003 +++ linux-2.6.0-test11-ifgrp/include/net/raw.h Tue Dec 2 20:08:34 2003 @@ -35,7 +35,7 @@ extern struct sock *__raw_v4_lookup(struct sock *sk, unsigned short num, unsigned long raddr, unsigned long laddr, - int dif); + int dif, u32 grp); extern void raw_v4_input(struct sk_buff *skb, struct iphdr *iph, int hash); diff -Nur linux-2.6.0-test11/include/net/tcp.h linux-2.6.0-test11-ifgrp/include/net/tcp.h --- linux-2.6.0-test11/include/net/tcp.h Wed Nov 26 21:43:05 2003 +++ linux-2.6.0-test11-ifgrp/include/net/tcp.h Tue Dec 2 20:18:06 2003 @@ -161,7 +161,7 @@ extern void tcp_bucket_destroy(struct tcp_bind_bucket *tb); extern void tcp_bucket_unlock(struct sock *sk); extern int tcp_port_rover; -extern struct sock *tcp_v4_lookup_listener(u32 addr, unsigned short hnum, int dif); +extern struct sock *tcp_v4_lookup_listener(u32 addr, unsigned short hnum, int dif, u32 grp); /* These are AF independent. 
*/ static __inline__ int tcp_bhashfn(__u16 lport) diff -Nur linux-2.6.0-test11/net/core/dev.c linux-2.6.0-test11-ifgrp/net/core/dev.c --- linux-2.6.0-test11/net/core/dev.c Wed Nov 26 21:44:11 2003 +++ linux-2.6.0-test11-ifgrp/net/core/dev.c Tue Dec 2 20:32:10 2003 @@ -2340,6 +2340,22 @@ dev->tx_queue_len = ifr->ifr_qlen; return 0; + case SIOCGIFGROUP: + ifr->ifr_group = dev->ifgroup; + return 0; + + case SIOCAIFGROUP: + if (ifr->ifr_group > 31) + return -EINVAL; + dev->ifgroup |= (1 << ifr->ifr_group); + return 0; + + case SIOCRIFGROUP: + if (ifr->ifr_group > 31) + return -EINVAL; + dev->ifgroup &= ~(1 << ifr->ifr_group); + return 0; + case SIOCSIFNAME: if (dev->flags & IFF_UP) return -EBUSY; @@ -2452,6 +2468,7 @@ case SIOCGIFMAP: case SIOCGIFINDEX: case SIOCGIFTXQLEN: + case SIOCGIFGROUP: dev_load(ifr.ifr_name); read_lock(&dev_base_lock); ret = dev_ifsioc(&ifr, cmd); @@ -2518,6 +2535,8 @@ case SIOCDELMULTI: case SIOCSIFHWBROADCAST: case SIOCSIFTXQLEN: + case SIOCAIFGROUP: + case SIOCRIFGROUP: case SIOCSIFNAME: case SIOCSMIIREG: case SIOCBONDENSLAVE: @@ -2658,6 +2677,7 @@ goto out; dev->iflink = -1; + dev->ifgroup = 1; /* Init, if this function is available */ if (dev->init) { diff -Nur linux-2.6.0-test11/net/ipv4/af_inet.c linux-2.6.0-test11-ifgrp/net/ipv4/af_inet.c --- linux-2.6.0-test11/net/ipv4/af_inet.c Wed Nov 26 21:43:06 2003 +++ linux-2.6.0-test11-ifgrp/net/ipv4/af_inet.c Tue Dec 2 20:47:02 2003 @@ -403,6 +403,7 @@ inet->mc_ttl = 1; inet->mc_index = 0; inet->mc_list = NULL; + inet->if_group = 1; #ifdef INET_REFCNT_DEBUG atomic_inc(&inet_sock_nr); @@ -476,6 +477,7 @@ unsigned short snum; int chk_addr_ret; int err; + __u32 grp; /* If the socket has its own bind function then use it. (RAW) */ if (sk->sk_prot->bind) { @@ -527,12 +529,17 @@ if (chk_addr_ret == RTN_MULTICAST || chk_addr_ret == RTN_BROADCAST) inet->saddr = 0; /* Use device */ + inet->if_group = 1; /* Make sure we are allowed to bind here. 
*/ if (sk->sk_prot->get_port(sk, snum)) { inet->saddr = inet->rcv_saddr = 0; err = -EADDRINUSE; goto out_release_sock; } + + grp = ntohl(inet->rcv_saddr); + if (grp < 32) + inet->if_group = 1 << grp; if (inet->rcv_saddr) sk->sk_userlocks |= SOCK_BINDADDR_LOCK; diff -Nur linux-2.6.0-test11/net/ipv4/icmp.c linux-2.6.0-test11-ifgrp/net/ipv4/icmp.c --- linux-2.6.0-test11/net/ipv4/icmp.c Wed Nov 26 21:45:38 2003 +++ linux-2.6.0-test11-ifgrp/net/ipv4/icmp.c Tue Dec 2 20:49:24 2003 @@ -696,7 +696,7 @@ if ((raw_sk = sk_head(&raw_v4_htable[hash])) != NULL) { while ((raw_sk = __raw_v4_lookup(raw_sk, protocol, iph->daddr, iph->saddr, - skb->dev->ifindex)) != NULL) { + skb->dev->ifindex, skb->dev->ifgroup)) != NULL) { raw_err(raw_sk, skb, info); raw_sk = sk_next(raw_sk); iph = (struct iphdr *)skb->data; diff -Nur linux-2.6.0-test11/net/ipv4/raw.c linux-2.6.0-test11-ifgrp/net/ipv4/raw.c --- linux-2.6.0-test11/net/ipv4/raw.c Wed Nov 26 21:44:20 2003 +++ linux-2.6.0-test11-ifgrp/net/ipv4/raw.c Tue Dec 2 20:55:16 2003 @@ -104,7 +104,7 @@ struct sock *__raw_v4_lookup(struct sock *sk, unsigned short num, unsigned long raddr, unsigned long laddr, - int dif) + int dif, u32 grp) { struct hlist_node *node; @@ -113,7 +113,7 @@ if (inet->num == num && !(inet->daddr && inet->daddr != raddr) && - !(inet->rcv_saddr && inet->rcv_saddr != laddr) && + !(!(inet->if_group & grp) && inet->rcv_saddr != laddr) && !(sk->sk_bound_dev_if && sk->sk_bound_dev_if != dif)) goto found; /* gotcha */ } @@ -158,7 +158,7 @@ goto out; sk = __raw_v4_lookup(__sk_head(head), iph->protocol, iph->saddr, iph->daddr, - skb->dev->ifindex); + skb->dev->ifindex, skb->dev->ifgroup); while (sk) { if (iph->protocol != IPPROTO_ICMP || !icmp_filter(sk, skb)) { @@ -170,7 +170,7 @@ } sk = __raw_v4_lookup(sk_next(sk), iph->protocol, iph->saddr, iph->daddr, - skb->dev->ifindex); + skb->dev->ifindex, skb->dev->ifgroup); } out: read_unlock(&raw_v4_lock); diff -Nur linux-2.6.0-test11/net/ipv4/tcp_diag.c 
linux-2.6.0-test11-ifgrp/net/ipv4/tcp_diag.c --- linux-2.6.0-test11/net/ipv4/tcp_diag.c Wed Nov 26 21:42:49 2003 +++ linux-2.6.0-test11-ifgrp/net/ipv4/tcp_diag.c Tue Dec 2 20:57:22 2003 @@ -206,7 +206,7 @@ return -1; } -extern struct sock *tcp_v4_lookup(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif); +extern struct sock *tcp_v4_lookup(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif, u32 grp); #ifdef CONFIG_IPV6 extern struct sock *tcp_v6_lookup(struct in6_addr *saddr, u16 sport, struct in6_addr *daddr, u16 dport, @@ -223,7 +223,7 @@ if (req->tcpdiag_family == AF_INET) { sk = tcp_v4_lookup(req->id.tcpdiag_dst[0], req->id.tcpdiag_dport, req->id.tcpdiag_src[0], req->id.tcpdiag_sport, - req->id.tcpdiag_if); + req->id.tcpdiag_if, 1); } #ifdef CONFIG_IPV6 else if (req->tcpdiag_family == AF_INET6) { diff -Nur linux-2.6.0-test11/net/ipv4/tcp_ipv4.c linux-2.6.0-test11-ifgrp/net/ipv4/tcp_ipv4.c --- linux-2.6.0-test11/net/ipv4/tcp_ipv4.c Wed Nov 26 21:43:32 2003 +++ linux-2.6.0-test11-ifgrp/net/ipv4/tcp_ipv4.c Tue Dec 2 21:10:39 2003 @@ -412,7 +412,7 @@ * during the search since they can never be otherwise. */ static struct sock *__tcp_v4_lookup_listener(struct hlist_head *head, u32 daddr, - unsigned short hnum, int dif) + unsigned short hnum, int dif, u32 grp) { struct sock *result = NULL, *sk; struct hlist_node *node; @@ -424,9 +424,10 @@ if (inet->num == hnum && !ipv6_only_sock(sk)) { __u32 rcv_saddr = inet->rcv_saddr; + __u32 if_group = inet->if_group; score = (sk->sk_family == PF_INET ? 1 : 0); - if (rcv_saddr) { + if (!(if_group & grp)) { if (rcv_saddr != daddr) continue; score+=2; @@ -449,7 +450,7 @@ /* Optimize the common listener case. 
*/ inline struct sock *tcp_v4_lookup_listener(u32 daddr, unsigned short hnum, - int dif) + int dif, u32 grp) { struct sock *sk = NULL; struct hlist_head *head; @@ -460,11 +461,11 @@ struct inet_opt *inet = inet_sk((sk = __sk_head(head))); if (inet->num == hnum && !sk->sk_node.next && - (!inet->rcv_saddr || inet->rcv_saddr == daddr) && + ((inet->if_group & grp) || inet->rcv_saddr == daddr) && (sk->sk_family == PF_INET || !ipv6_only_sock(sk)) && !sk->sk_bound_dev_if) goto sherry_cache; - sk = __tcp_v4_lookup_listener(head, daddr, hnum, dif); + sk = __tcp_v4_lookup_listener(head, daddr, hnum, dif, grp); } if (sk) { sherry_cache: @@ -515,21 +516,21 @@ } static inline struct sock *__tcp_v4_lookup(u32 saddr, u16 sport, - u32 daddr, u16 hnum, int dif) + u32 daddr, u16 hnum, int dif, u32 grp) { struct sock *sk = __tcp_v4_lookup_established(saddr, sport, daddr, hnum, dif); - return sk ? : tcp_v4_lookup_listener(daddr, hnum, dif); + return sk ? : tcp_v4_lookup_listener(daddr, hnum, dif, grp); } inline struct sock *tcp_v4_lookup(u32 saddr, u16 sport, u32 daddr, - u16 dport, int dif) + u16 dport, int dif, u32 grp) { struct sock *sk; local_bh_disable(); - sk = __tcp_v4_lookup(saddr, sport, daddr, ntohs(dport), dif); + sk = __tcp_v4_lookup(saddr, sport, daddr, ntohs(dport), dif, grp); local_bh_enable(); return sk; @@ -1003,7 +1004,7 @@ } sk = tcp_v4_lookup(iph->daddr, th->dest, iph->saddr, - th->source, tcp_v4_iif(skb)); + th->source, tcp_v4_iif(skb), skb->dev->ifgroup); if (!sk) { ICMP_INC_STATS_BH(IcmpInErrors); return; @@ -1774,7 +1775,7 @@ sk = __tcp_v4_lookup(skb->nh.iph->saddr, th->source, skb->nh.iph->daddr, ntohs(th->dest), - tcp_v4_iif(skb)); + tcp_v4_iif(skb), skb->dev->ifgroup); if (!sk) goto no_tcp_socket; @@ -1837,7 +1838,8 @@ case TCP_TW_SYN: { struct sock *sk2 = tcp_v4_lookup_listener(skb->nh.iph->daddr, ntohs(th->dest), - tcp_v4_iif(skb)); + tcp_v4_iif(skb), + skb->dev->ifgroup); if (sk2) { tcp_tw_deschedule((struct tcp_tw_bucket *)sk); tcp_tw_put((struct 
tcp_tw_bucket *)sk); diff -Nur linux-2.6.0-test11/net/ipv4/udp.c linux-2.6.0-test11-ifgrp/net/ipv4/udp.c --- linux-2.6.0-test11/net/ipv4/udp.c Wed Nov 26 21:43:09 2003 +++ linux-2.6.0-test11-ifgrp/net/ipv4/udp.c Tue Dec 2 21:17:05 2003 @@ -219,7 +219,7 @@ /* UDP is nearly always wildcards out the wazoo, it makes no sense to try * harder than this. -DaveM */ -struct sock *udp_v4_lookup_longway(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif) +struct sock *udp_v4_lookup_longway(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif, u32 grp) { struct sock *sk, *result = NULL; struct hlist_node *node; @@ -231,7 +231,7 @@ if (inet->num == hnum && !ipv6_only_sock(sk)) { int score = (sk->sk_family == PF_INET ? 1 : 0); - if (inet->rcv_saddr) { + if (!(inet->if_group & grp)) { if (inet->rcv_saddr != daddr) continue; score+=2; @@ -263,12 +263,12 @@ return result; } -__inline__ struct sock *udp_v4_lookup(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif) +__inline__ struct sock *udp_v4_lookup(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif, u32 grp) { struct sock *sk; read_lock(&udp_hash_lock); - sk = udp_v4_lookup_longway(saddr, sport, daddr, dport, dif); + sk = udp_v4_lookup_longway(saddr, sport, daddr, dport, dif, grp); if (sk) sock_hold(sk); read_unlock(&udp_hash_lock); @@ -325,7 +325,7 @@ int harderr; int err; - sk = udp_v4_lookup(iph->daddr, uh->dest, iph->saddr, uh->source, skb->dev->ifindex); + sk = udp_v4_lookup(iph->daddr, uh->dest, iph->saddr, uh->source, skb->dev->ifindex, skb->dev->ifgroup); if (sk == NULL) { ICMP_INC_STATS_BH(IcmpInErrors); return; /* No socket for error */ @@ -1187,7 +1187,7 @@ if(rt->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST)) return udp_v4_mcast_deliver(skb, uh, saddr, daddr); - sk = udp_v4_lookup(saddr, uh->source, daddr, uh->dest, skb->dev->ifindex); + sk = udp_v4_lookup(saddr, uh->source, daddr, uh->dest, skb->dev->ifindex, skb->dev->ifgroup); if (sk != NULL) { int ret = udp_queue_rcv_skb(sk, skb); From romieu@fr.zoreil.com Tue Dec 
2 14:50:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 14:50:46 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2MoTTa006117 for ; Tue, 2 Dec 2003 14:50:30 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id hB2MmBxg010541; Tue, 2 Dec 2003 23:48:11 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id hB2MmBOH010540; Tue, 2 Dec 2003 23:48:11 +0100 Date: Tue, 2 Dec 2003 23:48:11 +0100 From: Francois Romieu To: Stephen Hemminger Cc: Krzysztof Halas , Jeff Garzik , netdev@oss.sgi.com Subject: Re: [PATCH] (4/8) dscc4 convert to new hdlc_device Message-ID: <20031202234811.A10194@electric-eye.fr.zoreil.com> References: <20031202135846.28a93010.shemminger@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20031202135846.28a93010.shemminger@osdl.org>; from shemminger@osdl.org on Tue, Dec 02, 2003 at 01:58:46PM -0800 X-Organisation: Land of Sunshine Inc. X-archive-position: 1837 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Stephen Hemminger : [...] drivers/net/wan/dscc4.c:1502: i = dev_per_card - 1; do { dscc4_rx_irq(priv, root + i); ^^^^^^^^ -> __BOOOOM__ } while (--i >= 0); Anyway it is a long expected change, I'll fix/test it. 
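To spell out the problem for readers of the archive: once every channel is allocated separately, "root + i" no longer points at the i-th channel, so each channel has to be reached through the list links. A minimal user-space sketch of the idea follows, with made-up names (struct chan stands in for struct dscc4_dev_priv, and the plain next pointer stands in for the dev_list linkage); it is not the eventual driver fix.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct dscc4_dev_priv. */
struct chan {
    int dev_id;          /* channel number, assumed to be 0..3 here */
    struct chan *next;   /* stand-in for the dev_list linkage */
};

/*
 * Walk the channels by following the links instead of using pointer
 * arithmetic on the first element. Returns a bitmask of visited dev_ids.
 */
static unsigned int visited_mask(const struct chan *head)
{
    unsigned int mask = 0;

    for (const struct chan *c = head; c != NULL; c = c->next)
        mask |= 1u << c->dev_id;
    return mask;
}
```

The walk works no matter where each struct chan lives in memory, which is exactly what the array-style loop in the interrupt handler cannot guarantee any more.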
-- Ueimor From shemminger@osdl.org Tue Dec 2 15:32:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 15:32:40 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2NWPTa007431 for ; Tue, 2 Dec 2003 15:32:25 -0800 Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB2NW3Z16862; Tue, 2 Dec 2003 15:32:03 -0800 Date: Tue, 2 Dec 2003 15:32:46 -0800 From: Stephen Hemminger To: Krzysztof Halas , Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] (3/8) n2 and c101 hdlc dis-embedding Message-Id: <20031202153246.57ef7c43.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1838 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Change the n2 and c101 drivers to allocate the hdlc device structure and use hdlc->dev_data for their private data. # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1492 -> 1.1493 # drivers/net/wan/hd6457x.c 1.8 -> 1.9 # drivers/net/wan/n2.c 1.13 -> 1.14 # drivers/net/wan/c101.c 1.12 -> 1.13 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/11/26 shemminger@osdl.org 1.1493 # Change to work with hdlc_device without embedded net_device. 
# -------------------------------------------- # diff -Nru a/drivers/net/wan/c101.c b/drivers/net/wan/c101.c --- a/drivers/net/wan/c101.c Wed Nov 26 12:28:38 2003 +++ b/drivers/net/wan/c101.c Wed Nov 26 12:28:38 2003 @@ -54,7 +54,7 @@ typedef struct card_s { - hdlc_device hdlc; /* HDLC device struct - must be first */ + hdlc_device *hdlc; /* HDLC device struct - must be first */ spinlock_t lock; /* TX lock */ u8 *win0base; /* ISA window base address */ u32 phy_winbase; /* ISA physical base address */ @@ -128,8 +128,8 @@ sca_out(stat & ST1_UDRN, MSCI0_OFFSET + ST1, card); if (stat & ST1_UDRN) { - port->hdlc.stats.tx_errors++; /* TX Underrun error detected */ - port->hdlc.stats.tx_fifo_errors++; + port->hdlc->stats.tx_errors++; /* TX Underrun error detected */ + port->hdlc->stats.tx_fifo_errors++; } /* Reset MSCI CDCD status bit - uses ch#2 DCD input */ @@ -137,7 +137,7 @@ if (stat & ST1_CDCD) hdlc_set_carrier(!(sca_in(MSCI1_OFFSET + ST3, card) & ST3_DCD), - &port->hdlc); + port->hdlc); } @@ -288,13 +288,14 @@ release_mem_region(card->phy_winbase, C101_MAPPED_RAM_SIZE); } - kfree(card); + free_hdlc_device(card->hdlc); } static int __init c101_run(unsigned long irq, unsigned long winbase) { + hdlc_device *hdlc; struct net_device *dev; card_t *card; int result; @@ -309,12 +310,13 @@ return -ENODEV; } - card = kmalloc(sizeof(card_t), GFP_KERNEL); - if (card == NULL) { + hdlc = alloc_hdlc_device(sizeof(card_t)); + if (!hdlc) { printk(KERN_ERR "c101: unable to allocate memory\n"); return -ENOBUFS; } - memset(card, 0, sizeof(card_t)); + card = hdlc->dev_data; + card->hdlc = hdlc; if (request_irq(irq, sca_intr, 0, devname, card)) { printk(KERN_ERR "c101: could not allocate IRQ\n"); @@ -347,7 +349,7 @@ sca_init(card, 0); - dev = hdlc_to_dev(&card->hdlc); + dev = hdlc_to_dev(hdlc); spin_lock_init(&card->lock); SET_MODULE_OWNER(dev); @@ -358,11 +360,11 @@ dev->do_ioctl = c101_ioctl; dev->open = c101_open; dev->stop = c101_close; - card->hdlc.attach = sca_attach; - 
card->hdlc.xmit = sca_xmit; + hdlc->attach = sca_attach; + hdlc->xmit = sca_xmit; card->settings.clock_type = CLOCK_EXT; - result = register_hdlc_device(&card->hdlc); + result = register_hdlc_device(hdlc); if (result) { printk(KERN_WARNING "c101: unable to register hdlc device\n"); c101_destroy_card(card); @@ -371,11 +373,11 @@ sca_init_sync_port(card); /* Set up C101 memory */ hdlc_set_carrier(!(sca_in(MSCI1_OFFSET + ST3, card) & ST3_DCD), - &card->hdlc); + card->hdlc); printk(KERN_INFO "%s: Moxa C101 on IRQ%u," " using %u TX + %u RX packets rings\n", - hdlc_to_name(&card->hdlc), card->irq, + hdlc_to_name(card->hdlc), card->irq, card->tx_ring_buffers, card->rx_ring_buffers); *new_card = card; @@ -424,7 +426,7 @@ while (card) { card_t *ptr = card; card = card->next_card; - unregister_hdlc_device(&ptr->hdlc); + unregister_hdlc_device(ptr->hdlc); c101_destroy_card(ptr); } } diff -Nru a/drivers/net/wan/hd6457x.c b/drivers/net/wan/hd6457x.c --- a/drivers/net/wan/hd6457x.c Wed Nov 26 12:28:38 2003 +++ b/drivers/net/wan/hd6457x.c Wed Nov 26 12:28:38 2003 @@ -114,7 +114,7 @@ static inline port_t* hdlc_to_port(hdlc_device *hdlc) { - return (port_t*)hdlc; + return hdlc->dev_data; } @@ -245,7 +245,7 @@ } hdlc_set_carrier(!(sca_in(get_msci(port) + ST3, card) & ST3_DCD), - &port->hdlc); + port->hdlc); } @@ -262,13 +262,13 @@ sca_out(stat & (ST1_UDRN | ST1_CDCD), msci + ST1, card); if (stat & ST1_UDRN) { - port->hdlc.stats.tx_errors++; /* TX Underrun error detected */ - port->hdlc.stats.tx_fifo_errors++; + port->hdlc->stats.tx_errors++; /* TX Underrun error detected */ + port->hdlc->stats.tx_fifo_errors++; } if (stat & ST1_CDCD) hdlc_set_carrier(!(sca_in(msci + ST3, card) & ST3_DCD), - &port->hdlc); + port->hdlc); } #endif @@ -287,7 +287,7 @@ len = readw(&desc->len); skb = dev_alloc_skb(len); if (!skb) { - port->hdlc.stats.rx_dropped++; + port->hdlc->stats.rx_dropped++; return; } @@ -316,12 +316,12 @@ printk(KERN_DEBUG "%s RX(%i):", hdlc_to_name(&port->hdlc), skb->len); 
debug_frame(skb); #endif - port->hdlc.stats.rx_packets++; - port->hdlc.stats.rx_bytes += skb->len; + port->hdlc->stats.rx_packets++; + port->hdlc->stats.rx_bytes += skb->len; skb->mac.raw = skb->data; - skb->dev = hdlc_to_dev(&port->hdlc); + skb->dev = hdlc_to_dev(port->hdlc); skb->dev->last_rx = jiffies; - skb->protocol = hdlc_type_trans(skb, hdlc_to_dev(&port->hdlc)); + skb->protocol = hdlc_type_trans(skb, hdlc_to_dev(port->hdlc)); netif_rx(skb); } @@ -333,7 +333,7 @@ u16 dmac = get_dmac_rx(port); card_t *card = port_to_card(port); u8 stat = sca_in(DSR_RX(phy_node(port)), card); /* read DMA Status */ - struct net_device_stats *stats = &port->hdlc.stats; + struct net_device_stats *stats = &port->hdlc->stats; /* Reset DSR status bits */ sca_out((stat & (DSR_EOT | DSR_EOM | DSR_BOF | DSR_COF)) | DSR_DWE, @@ -401,13 +401,13 @@ break; /* Transmitter is/will_be sending this frame */ desc = desc_address(port, port->txlast, 1); - port->hdlc.stats.tx_packets++; - port->hdlc.stats.tx_bytes += readw(&desc->len); + port->hdlc->stats.tx_packets++; + port->hdlc->stats.tx_bytes += readw(&desc->len); writeb(0, &desc->stat); /* Free descriptor */ port->txlast = next_desc(port, port->txlast, 1); } - netif_wake_queue(hdlc_to_dev(&port->hdlc)); + netif_wake_queue(hdlc_to_dev(port->hdlc)); spin_unlock(&port->lock); } @@ -790,7 +790,7 @@ desc = desc_address(port, port->txin + 1, 1); if (readb(&desc->stat)) /* allow 1 packet gap */ - netif_stop_queue(hdlc_to_dev(&port->hdlc)); + netif_stop_queue(hdlc_to_dev(port->hdlc)); spin_unlock_irq(&port->lock); diff -Nru a/drivers/net/wan/n2.c b/drivers/net/wan/n2.c --- a/drivers/net/wan/n2.c Wed Nov 26 12:28:38 2003 +++ b/drivers/net/wan/n2.c Wed Nov 26 12:28:38 2003 @@ -92,11 +92,10 @@ typedef struct port_s { - hdlc_device hdlc; /* HDLC device struct - must be first */ + hdlc_device *hdlc; /* HDLC device struct */ struct card_s *card; spinlock_t lock; /* TX lock */ sync_serial_settings settings; - int valid; /* port enabled */ int rxpart; /* 
partial frame received, next frame invalid*/ unsigned short encoding; unsigned short parity; @@ -120,7 +119,7 @@ u16 tx_ring_buffers; u8 irq; /* IRQ (3-15) */ - port_t ports[2]; + port_t *ports[2]; struct card_s *next_card; }card_t; @@ -141,8 +140,7 @@ #define phy_node(port) ((port)->phy_node) #define winsize(card) (USE_WINDOWSIZE) #define winbase(card) ((card)->winbase) -#define get_port(card, port) ((card)->ports[port].valid ? \ - &(card)->ports[port] : NULL) +#define get_port(card, port) ((card)->ports[port]) @@ -311,9 +309,13 @@ { int cnt; - for (cnt = 0; cnt < 2; cnt++) - if (card->ports[cnt].card) - unregister_hdlc_device(&card->ports[cnt].hdlc); + for (cnt = 0; cnt < 2; cnt++) { + port_t *port = card->ports[cnt]; + if (port) { + unregister_hdlc_device(port->hdlc); + free_hdlc_device(port->hdlc); + } + } if (card->irq) free_irq(card->irq, card); @@ -434,14 +436,25 @@ sca_init(card, 0); for (cnt = 0; cnt < 2; cnt++) { - port_t *port = &card->ports[cnt]; - struct net_device *dev = hdlc_to_dev(&port->hdlc); + hdlc_device *hdlc; + port_t *port; + struct net_device *dev; + int err; if ((cnt == 0 && !valid0) || (cnt == 1 && !valid1)) continue; + hdlc = alloc_hdlc_device(sizeof(*port)); + if (!hdlc) { + printk(KERN_WARNING "n2: unable to allocate hdlc device\n"); + n2_destroy_card(card); + return -ENOMEM; + } + + dev = hdlc_to_dev(hdlc); + port = hdlc->dev_data; + port->hdlc = hdlc; port->phy_node = cnt; - port->valid = 1; if ((cnt == 1) && valid0) port->log_node = 1; @@ -455,21 +468,23 @@ dev->do_ioctl = n2_ioctl; dev->open = n2_open; dev->stop = n2_close; - port->hdlc.attach = sca_attach; - port->hdlc.xmit = sca_xmit; + hdlc->attach = sca_attach; + hdlc->xmit = sca_xmit; port->settings.clock_type = CLOCK_EXT; - if (register_hdlc_device(&port->hdlc)) { + err = register_hdlc_device(hdlc); + if (err) { printk(KERN_WARNING "n2: unable to register hdlc " "device\n"); n2_destroy_card(card); - return -ENOBUFS; + return err; } port->card = card; sca_init_sync_port(port); 
/* Set up SCA memory */ + card->ports[i] = port; printk(KERN_INFO "%s: RISCom/N2 node %d\n", - hdlc_to_name(&port->hdlc), port->phy_node); + hdlc_to_name(hdlc), port->phy_node); } *new_card = card; From ja@ssi.bg Tue Dec 2 15:39:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 15:40:13 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2NdlTa008115 for ; Tue, 2 Dec 2003 15:39:52 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB2Ne6ee008997; Wed, 3 Dec 2003 01:40:08 +0200 Date: Wed, 3 Dec 2003 01:40:06 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: herbert@gondor.apana.org.au, Subject: Re: [ROUTE] PMTU only works on half the time In-Reply-To: <20031201155005.1c515793.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1839 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, I'm appending new version that handles all missing cases for hashed values on icmp error: - ip_rt_frag_needed now provides interface index - ip_rt_redirect checks for tos|RTO_ONLINK too - ip_rt_frag_needed checks also for tos|RTO_ONLINK and oif!=0 - __ip_route_output_key now ignores illegal bits (bit 1) from tos Please review and edit if needed. 
diff -Nru a/include/net/route.h b/include/net/route.h --- a/include/net/route.h Tue Dec 2 23:37:17 2003 +++ b/include/net/route.h Tue Dec 2 23:37:17 2003 @@ -122,7 +122,7 @@ extern int ip_route_output_key(struct rtable **, struct flowi *flp); extern int ip_route_output_flow(struct rtable **rp, struct flowi *flp, struct sock *sk, int flags); extern int ip_route_input(struct sk_buff*, u32 dst, u32 src, u8 tos, struct net_device *devin); -extern unsigned short ip_rt_frag_needed(struct iphdr *iph, unsigned short new_mtu); +extern unsigned short ip_rt_frag_needed(int iif, struct iphdr *iph, unsigned short new_mtu); extern void ip_rt_send_redirect(struct sk_buff *skb); extern unsigned inet_addr_type(u32 addr); diff -Nru a/net/ipv4/icmp.c b/net/ipv4/icmp.c --- a/net/ipv4/icmp.c Tue Dec 2 23:37:17 2003 +++ b/net/ipv4/icmp.c Tue Dec 2 23:37:17 2003 @@ -626,7 +626,7 @@ "and DF set.\n", NIPQUAD(iph->daddr)); } else { - info = ip_rt_frag_needed(iph, + info = ip_rt_frag_needed(skb->dev->ifindex, iph, ntohs(icmph->un.frag.mtu)); if (!info) goto out; diff -Nru a/net/ipv4/route.c b/net/ipv4/route.c --- a/net/ipv4/route.c Tue Dec 2 23:37:17 2003 +++ b/net/ipv4/route.c Tue Dec 2 23:37:17 2003 @@ -967,11 +967,12 @@ void ip_rt_redirect(u32 old_gw, u32 daddr, u32 new_gw, u32 saddr, u8 tos, struct net_device *dev) { - int i, k; + int i, j, k; struct in_device *in_dev = in_dev_get(dev); struct rtable *rth, **rthp; u32 skeys[2] = { saddr, 0 }; int ikeys[2] = { dev->ifindex, 0 }; + u8 toskeys[2]; tos &= IPTOS_RT_MASK; @@ -992,11 +993,15 @@ goto reject_redirect; } + toskeys[0] = tos; + toskeys[1] = tos | RTO_ONLINK; + if (saddr && daddr) + for (j = 0; j < 2; j++) for (i = 0; i < 2; i++) { for (k = 0; k < 2; k++) { unsigned hash = rt_hash_code(daddr, skeys[i] ^ (ikeys[k] << 5), - tos); + toskeys[j]); rthp=&rt_hash_table[hash].chain; @@ -1007,7 +1012,7 @@ smp_read_barrier_depends(); if (rth->fl.fl4_dst != daddr || rth->fl.fl4_src != skeys[i] || - rth->fl.fl4_tos != tos || + rth->fl.fl4_tos != 
toskeys[j] || rth->fl.oif != ikeys[k] || rth->fl.iif != 0) { rthp = &rth->u.rt_next; @@ -1237,21 +1242,26 @@ return 68; } -unsigned short ip_rt_frag_needed(struct iphdr *iph, unsigned short new_mtu) +unsigned short ip_rt_frag_needed(int iif, struct iphdr *iph, unsigned short new_mtu) { - int i; + int i, j, k; unsigned short old_mtu = ntohs(iph->tot_len); struct rtable *rth; u32 skeys[2] = { iph->saddr, 0, }; u32 daddr = iph->daddr; u8 tos = iph->tos & IPTOS_RT_MASK; unsigned short est_mtu = 0; + u8 toskeys[2] = { tos, tos | RTO_ONLINK }; + int ikeys[2] = { iif, 0 }; if (ipv4_config.no_pmtu_disc) return 0; + for (k = 0; k < (iif ? 2 : 1); k++) + for (j = 0; j < 2; j++) for (i = 0; i < 2; i++) { - unsigned hash = rt_hash_code(daddr, skeys[i], tos); + unsigned hash = rt_hash_code(daddr, skeys[i] ^ (ikeys[k] << 5), + toskeys[j]); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; @@ -1261,7 +1271,8 @@ rth->fl.fl4_src == skeys[i] && rth->rt_dst == daddr && rth->rt_src == iph->saddr && - rth->fl.fl4_tos == tos && + rth->fl.fl4_tos == toskeys[j] && + rth->fl.oif == ikeys[k] && rth->fl.iif == 0 && !(dst_metric_locked(&rth->u.dst, RTAX_MTU))) { unsigned short mtu = new_mtu; @@ -2213,8 +2224,9 @@ { unsigned hash; struct rtable *rth; + u8 tos = flp->fl4_tos & (IPTOS_RT_MASK | RTO_ONLINK); - hash = rt_hash_code(flp->fl4_dst, flp->fl4_src ^ (flp->oif << 5), flp->fl4_tos); + hash = rt_hash_code(flp->fl4_dst, flp->fl4_src ^ (flp->oif << 5), tos); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) { @@ -2226,8 +2238,7 @@ #ifdef CONFIG_IP_ROUTE_FWMARK rth->fl.fl4_fwmark == flp->fl4_fwmark && #endif - !((rth->fl.fl4_tos ^ flp->fl4_tos) & - (IPTOS_RT_MASK | RTO_ONLINK))) { + rth->fl.fl4_tos == tos) { rth->u.dst.lastuse = jiffies; dst_hold(&rth->u.dst); rth->u.dst.__use++; Regards -- Julian Anastasov From ja@ssi.bg Tue Dec 2 15:48:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 15:48:23 -0800 (PST) Received: from 
u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB2Nm4Ta008633 for ; Tue, 2 Dec 2003 15:48:08 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB2NnCee009074; Wed, 3 Dec 2003 01:49:12 +0200 Date: Wed, 3 Dec 2003 01:49:12 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: netdev@oss.sgi.com cc: Sridhar Samudrala , "David S. Miller" Subject: [2.6 PATCH] sctp - provide valid tos and oif values for ip_route_output_key Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="1607745702-506488201-1070408952=:1237" X-archive-position: 1840 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --1607745702-506488201-1070408952=:1237 Content-Type: TEXT/PLAIN; charset=US-ASCII Hello, I'm not sure if this change is needed (it is not tested). Is SO_BINDTODEVICE supported for SCTP? It seems it is bad for PMTUD not to provide valid tos and oif values for the routing call. 
Regards -- Julian Anastasov --1607745702-506488201-1070408952=:1237 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="tos-sctp-1.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: SCTP tos and oif Content-Disposition: attachment; filename="tos-sctp-1.diff" IyBUaGlzIGlzIGEgQml0S2VlcGVyIGdlbmVyYXRlZCBwYXRjaCBmb3IgdGhl IGZvbGxvd2luZyBwcm9qZWN0Og0KIyBQcm9qZWN0IE5hbWU6IExpbnV4IGtl cm5lbCB0cmVlDQojIFRoaXMgcGF0Y2ggZm9ybWF0IGlzIGludGVuZGVkIGZv ciBHTlUgcGF0Y2ggY29tbWFuZCB2ZXJzaW9uIDIuNSBvciBoaWdoZXIuDQoj IFRoaXMgcGF0Y2ggaW5jbHVkZXMgdGhlIGZvbGxvd2luZyBkZWx0YXM6DQoj CSAgICAgICAgICAgQ2hhbmdlU2V0CTEuMTM1NCAgLT4gMS4xMzU1IA0KIwkg bmV0L3NjdHAvcHJvdG9jb2wuYwkxLjU4ICAgIC0+IDEuNTkgICANCiMNCiMg VGhlIGZvbGxvd2luZyBpcyB0aGUgQml0S2VlcGVyIENoYW5nZVNldCBMb2cN CiMgLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0NCiMgMDMvMTIvMDMJamFAc3NpLmJnCTEuMTM1NQ0KIyBbU0NUUF06IHBy b3ZpZGUgdmFsaWQgdG9zIGFuZCBvaWYgdmFsdWVzIGZvciBpcF9yb3V0ZV9v dXRwdXRfa2V5DQojIC0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tDQojDQpkaWZmIC1OcnUgYS9uZXQvc2N0cC9wcm90b2Nv bC5jIGIvbmV0L3NjdHAvcHJvdG9jb2wuYw0KLS0tIGEvbmV0L3NjdHAvcHJv dG9jb2wuYwlXZWQgRGVjICAzIDAxOjA4OjMxIDIwMDMNCisrKyBiL25ldC9z Y3RwL3Byb3RvY29sLmMJV2VkIERlYyAgMyAwMTowODozMSAyMDAzDQpAQCAt NDQ0LDYgKzQ0NCw4IEBADQogDQogCW1lbXNldCgmZmwsIDB4MCwgc2l6ZW9m KHN0cnVjdCBmbG93aSkpOw0KIAlmbC5mbDRfZHN0ICA9IGRhZGRyLT52NC5z aW5fYWRkci5zX2FkZHI7DQorCWZsLmZsNF90b3MgID0gUlRfQ09OTl9GTEFH Uyhhc29jLT5iYXNlLnNrKTsNCisJZmwub2lmICAgICAgPSBhc29jLT5iYXNl LnNrLT5za19ib3VuZF9kZXZfaWY7DQogCWZsLnByb3RvID0gSVBQUk9UT19T Q1RQOw0KIA0KIAlpZiAoc2FkZHIpDQo= --1607745702-506488201-1070408952=:1237-- From ja@ssi.bg Tue Dec 2 16:02:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:02:42 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB302LTa009802 for ; Tue, 2 Dec 2003 16:02:27 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) 
with ESMTP id hB303Kee009180; Wed, 3 Dec 2003 02:03:20 +0200 Date: Wed, 3 Dec 2003 02:03:20 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: netdev@oss.sgi.com cc: Bart De Schuymer , Stephen Hemminger , "David S. Miller" Subject: [2.6 PATCH] bridge - provide valid tos value for ip_route_output_key Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="1607745702-1706568113-1070409800=:1237" X-archive-position: 1841 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --1607745702-1706568113-1070409800=:1237 Content-Type: TEXT/PLAIN; charset=US-ASCII Hello, The attached patch changes bridging to provide valid tos value to routing. Regards -- Julian Anastasov --1607745702-1706568113-1070409800=:1237 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="tos-bridge-1.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: bridge tos Content-Disposition: attachment; filename="tos-bridge-1.diff" IyBUaGlzIGlzIGEgQml0S2VlcGVyIGdlbmVyYXRlZCBwYXRjaCBmb3IgdGhl IGZvbGxvd2luZyBwcm9qZWN0Og0KIyBQcm9qZWN0IE5hbWU6IExpbnV4IGtl cm5lbCB0cmVlDQojIFRoaXMgcGF0Y2ggZm9ybWF0IGlzIGludGVuZGVkIGZv ciBHTlUgcGF0Y2ggY29tbWFuZCB2ZXJzaW9uIDIuNSBvciBoaWdoZXIuDQoj IFRoaXMgcGF0Y2ggaW5jbHVkZXMgdGhlIGZvbGxvd2luZyBkZWx0YXM6DQoj CSAgICAgICAgICAgQ2hhbmdlU2V0CTEuMTM1MSAgLT4gMS4xMzUyIA0KIwlu ZXQvYnJpZGdlL2JyX25ldGZpbHRlci5jCTEuMTUgICAgLT4gMS4xNiAgIA0K Iw0KIyBUaGUgZm9sbG93aW5nIGlzIHRoZSBCaXRLZWVwZXIgQ2hhbmdlU2V0 IExvZw0KIyAtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLQ0KIyAwMy8xMi8wMglqYUBzc2kuYmcJMS4xMzUyDQojIFtCUklE R0VdOiBwcm92aWRlIHZhbGlkIHRvcyB2YWx1ZSBmb3IgaXBfcm91dGVfb3V0 
cHV0X2tleQ0KIyAtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLQ0KIw0KZGlmZiAtTnJ1IGEvbmV0L2JyaWRnZS9icl9uZXRm aWx0ZXIuYyBiL25ldC9icmlkZ2UvYnJfbmV0ZmlsdGVyLmMNCi0tLSBhL25l dC9icmlkZ2UvYnJfbmV0ZmlsdGVyLmMJVHVlIERlYyAgMiAyMzo1NTozMCAy MDAzDQorKysgYi9uZXQvYnJpZGdlL2JyX25ldGZpbHRlci5jCVR1ZSBEZWMg IDIgMjM6NTU6MzAgMjAwMw0KQEAgLTE4MCw3ICsxODAsNyBAQA0KIAkJCXN0 cnVjdCBydGFibGUgKnJ0Ow0KIAkJCXN0cnVjdCBmbG93aSBmbCA9IHsgLm5s X3UgPSANCiAJCQl7IC5pcDRfdSA9IHsgLmRhZGRyID0gaXBoLT5kYWRkciwg LnNhZGRyID0gMCAsDQotCQkJCSAgICAgLnRvcyA9IGlwaC0+dG9zfSB9LCAu cHJvdG8gPSAwfTsNCisJCQkJICAgICAudG9zID0gUlRfVE9TKGlwaC0+dG9z KX0gfSwgLnByb3RvID0gMH07DQogDQogCQkJaWYgKCFpcF9yb3V0ZV9vdXRw dXRfa2V5KCZydCwgJmZsKSkgew0KIAkJCQkvKiBCcmlkZ2VkLWFuZC1ETkFU J2VkIHRyYWZmaWMgZG9lc24ndA0K --1607745702-1706568113-1070409800=:1237-- From davem@pizda.ninka.net Tue Dec 2 16:08:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:08:53 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB308eTa010246 for ; Tue, 2 Dec 2003 16:08:40 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA04395; Tue, 2 Dec 2003 16:07:38 -0800 Date: Tue, 2 Dec 2003 16:07:38 -0800 From: "David S. Miller" To: Julian Anastasov Cc: herbert@gondor.apana.org.au, netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-Id: <20031202160738.439f9f0d.davem@redhat.com> In-Reply-To: References: <20031201155005.1c515793.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1842 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 3 Dec 2003 01:40:06 +0200 (EET) Julian Anastasov wrote: > Please review and edit if needed. 
Thanks Julian, I will review your patch today. But I want to mention something about this. TOS checks I think are pretty unreliable. TOS has meaning within a network realm, right? This means that if we make a packet with TOS X, once it leaves our realm the TOS value may be arbitrarily changed so that the TOS in the packet has meaning within that new realm. If a PMTU or other routing ICMP gets sent back to us after TOS changes have been made, there is absolutely no way to reliably match the routes properly for update. Therefore TOS based routes cannot handle ICMP messages reliably in the global internet. In fact, it is possible and even likely to apply an ICMP for the wrong TOS variant of a given route when this TOS rewriting occurs. From hadi@cyberus.ca Tue Dec 2 16:08:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:09:06 -0800 (PST) Received: from mail.cyberus.ca (mail.cyberus.ca [209.197.145.21]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB308oTa010251 for ; Tue, 2 Dec 2003 16:08:53 -0800 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1ARKZU-0006P8-JB; Tue, 02 Dec 2003 19:08:48 -0500 Subject: Re: [RTNETLINK] Provide real oif From: jamal Reply-To: hadi@cyberus.ca To: Herbert Xu Cc: netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Organization: jamalopolis Message-Id: <1070410097.1038.27.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 02 Dec 2003 19:08:17 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 1843 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2003-12-02 at 15:54, Herbert Xu wrote: > jamal wrote: > > > > Can you provide a real example (output of route display in user space) > > where this would be valuable? 
> > Without the real oif you will see entries like this in the output of > ip r l c: > > 192.168.0.7 dev eth0 src 192.168.0.6 > cache mtu 1500 advmss 1460 metric10 64 > 192.168.0.7 dev eth0 src 192.168.0.6 > cache mtu 1500 advmss 1460 metric10 64 > > One of those has oif == 0 while the other one has oif == eth0. > They both seem to have an oif of eth0, no? > To generate an entry with oif == eth0, just send a UDP packet and > bound to eth0. Sorry, i am a bit slow - i think i may know where you are going with this but help me out; By binding to eth0, you mean socket bind() or msg_dontroute of the udp packet via eth0? Is it unicast? Do you have a small sample proggie i could use to check this? > > whats wrong with the combo of RTA_OIF and RTA_SRC? > > Both of those attributes are independent of the real oif. They are related in the rt selection. If you have a small program i can use to create the above cacheinfo i would appreciate it. Is the same IP attached to multiple interfaces, so that you need to select a preferred OIF?
cheers, jamal From francois@baligant.net Tue Dec 2 16:15:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:15:54 -0800 (PST) Received: from casimir.nikita.cx (casimir.nikita.cx [198.63.211.44]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30FfTa011055 for ; Tue, 2 Dec 2003 16:15:41 -0800 Received: from fortress (fbaligant.net1.nerim.net [213.41.146.186]) by casimir.nikita.cx (8.12.8/8.12.8) with SMTP id hB30Fbh0021361 for ; Tue, 2 Dec 2003 18:15:38 -0600 Message-ID: <078f01c3b932$97919630$15fea8c0@fortress> From: "Francois Baligant" Cc: References: <072501c3b874$2542ae70$15fea8c0@fortress><16332.27919.502097.988522@robur.slu.se> <20031202032606.28db927b.davem@redhat.com> Subject: Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box Date: Wed, 3 Dec 2003 01:15:42 +0100 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-Copied-To: backup@casimir.nikita.cx (by Synonym - http://www.modulo.ro/synonym) X-Scanned-By: MIMEDefang 2.37 X-archive-position: 1844 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: francois@baligant.net Precedence: bulk X-list: netdev Thanks all for your suggestions. Actually I have noticed that with 90k established TCP sessions, I have about twice as many entries in the routing cache. For each TCP session there is an inbound and an outbound cache entry like this: 39.125.111.131 81.64.64.96 39.125.111.129 1500 0 0 eth0 81.64.64.96 39.125.111.131 39.125.111.131 l 0 0 0 lo This system has accepted that many sessions before when running 2.4 and this problem surfaced with 2.6. Now, I can't be sure that the traffic patterns are exactly the same, so I'm not drawing conclusions about 2.6. I will try to raise gc_thresh and keep you informed.
Thanks, Francois ----- Original Message ----- From: "David S. Miller" To: "Robert Olsson" Cc: ; Sent: Tuesday, December 02, 2003 12:26 PM Subject: Re: 2.6.0-test11: dst_cache_overflow causing unresponsive box > On Tue, 2 Dec 2003 11:44:31 +0100 > Robert Olsson wrote: > > > No experience with 90k TCP-flows but it seems GC is not able to free some > > of the dst-entries for some reason. This will slowly kill your box with > > the symptoms you describe. We have to ask TCP-experts for timer settings to avoid > > pending sessions etc. Also check slab for any other objects growing as > > dst cache overflow is most likely a secondary effect in your case. rtstat > > looks sane except for the high number of dst-entries. Tuning is another > > story. > > Let us assume, for the sake of back of the envelope calculations, that > all 90k TCP connections speak to unique destinations. Let us further > assume that all of them have at least one packet in flight. > > This means the routing cache must be able to hold at least 90k entries. > All of these routing cache entries will be referenced by the packets > in the TCP retransmission queues of all the sockets, and thus the > entries are unreclaimable. > > You are setting net.ipv4.route.max_size to 655360 which should be more > than enough. But you also have to make the net.ipv4.route.gc_thresh > more reasonable as well, perhaps 90K as a test. > > If net.ipv4.route.gc_thresh is lower than 90K and my assertions above > hold, then the kernel will try to garbage collect too early, all the > routing cache entries will be in use and therefore uncollectable, > and you'll get the message you're seeing. > > Try to pump up gc_thresh and see if that helps.
> From davem@pizda.ninka.net Tue Dec 2 16:23:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:23:30 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30NGTa014310 for ; Tue, 2 Dec 2003 16:23:17 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA04448; Tue, 2 Dec 2003 16:22:08 -0800 Date: Tue, 2 Dec 2003 16:22:08 -0800 From: "David S. Miller" To: Julian Anastasov Cc: netdev@oss.sgi.com, laforge@gnumonks.org Subject: Re: [2.6 PATCH] ipchains masquerade must select maddr correctly Message-Id: <20031202162208.04629dab.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1845 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 3 Dec 2003 01:55:28 +0200 (EET) Julian Anastasov wrote: > The attached patch fixes ipchains masquerade to use > correctly the routing. This bug-to-bug compatibility with 2.2 > is not valid from long time. Also, a missing unlock is added. Slow down. I don't think it's always desirable to specify a specific TOS when we're working with an input packet. In fact, what you're doing all over the tree is going to cause the routing cache size to explode in some very real usage. 
From davem@pizda.ninka.net Tue Dec 2 16:31:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:31:44 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30VVTa014783 for ; Tue, 2 Dec 2003 16:31:31 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA04523; Tue, 2 Dec 2003 16:30:35 -0800 Date: Tue, 2 Dec 2003 16:30:35 -0800 From: "David S. Miller" To: Pavlin Radoslavov Cc: netdev@oss.sgi.com, pavlin@icir.org Subject: Re: A request to add RTPROT_XORP to linux/rtnetlink.h Message-Id: <20031202163035.698533e7.davem@redhat.com> In-Reply-To: <200312022119.hB2LJaSx017575@possum.icir.org> References: <200312022119.hB2LJaSx017575@possum.icir.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1846 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 02 Dec 2003 13:19:36 -0800 Pavlin Radoslavov wrote: > On behalf of the XORP (eXtensible Open Router Platform) project > (http://www.xorp.org), I'd like to ask if the XORP-specific protocol > value can be included in linux/rtnetlink.h (similar to the values > already assigned to other routing daemons/suites such as GateD, > etc.) Patch applied, thanks. 
From pavlin@possum.icir.org Tue Dec 2 16:32:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:32:50 -0800 (PST) Received: from possum.icir.org (possum.icir.org [192.150.187.67]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30WbTa014987 for ; Tue, 2 Dec 2003 16:32:37 -0800 Received: from possum.icir.org (localhost [127.0.0.1]) by possum.icir.org (8.12.9p1/8.12.3) with ESMTP id hB30WaSx019988; Tue, 2 Dec 2003 16:32:36 -0800 (PST) (envelope-from pavlin@possum.icir.org) Message-Id: <200312030032.hB30WaSx019988@possum.icir.org> To: "David S. Miller" cc: Pavlin Radoslavov , netdev@oss.sgi.com Subject: Re: A request to add RTPROT_XORP to linux/rtnetlink.h In-Reply-To: Message from "David S. Miller" of "Tue, 02 Dec 2003 16:30:35 PST." <20031202163035.698533e7.davem@redhat.com> Date: Tue, 02 Dec 2003 16:32:36 -0800 From: Pavlin Radoslavov X-archive-position: 1847 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pavlin@icir.org Precedence: bulk X-list: netdev > On Tue, 02 Dec 2003 13:19:36 -0800 > Pavlin Radoslavov wrote: > > > On behalf of the XORP (eXtensible Open Router Platform) project > > (http://www.xorp.org), I'd like to ask if the XORP-specific protocol > > value can be included in linux/rtnetlink.h (similar to the values > > already assigned to other routing daemons/suites such as GateD, > > etc.) > > Patch applied, thanks. Great! Thanks!! 
Pavlin From xma@us.ibm.com Tue Dec 2 16:34:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:34:20 -0800 (PST) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30Y7Ta015512 for ; Tue, 2 Dec 2003 16:34:07 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e31.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB30XKb0531060; Tue, 2 Dec 2003 19:33:20 -0500 Received: from d03nm124.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB30XF1k164826; Tue, 2 Dec 2003 17:33:19 -0700 Importance: Normal Sensitivity: Subject: Re: IPv6 MIB:ipv6PrefixTable implementation To: kuznet@ms2.inr.ac.ru Cc: mashirle@us.ibm.com, netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000 Message-ID: From: Shirley Ma Date: Tue, 2 Dec 2003 16:33:13 -0800 X-MIMETrack: Serialize by Router on D03NM124/03/M/IBM(Release 6.0.2CF2|July 23, 2003) at 12/02/2003 17:33:18 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-archive-position: 1848 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: xma@us.ibm.com Precedence: bulk X-list: netdev Hi, Alexy, > preferred time is not an attribute of a prefix at all, > it is attribute of address, is not it? Unless the prefix is used > to install local address it does not make sense, right? The new IPv6 MIBs said that the preferred time is saved in the prefix table, not in the address table. Also, not every onlink prefix is saved in the routing table, for example, if there is already a manual address configured for an interface with the same prefix. 
Thanks Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone: (503) 578-7638 FAX: (503) 578-3228 From davem@pizda.ninka.net Tue Dec 2 16:40:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:40:17 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30e4Ta015954 for ; Tue, 2 Dec 2003 16:40:04 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA04574; Tue, 2 Dec 2003 16:39:02 -0800 Date: Tue, 2 Dec 2003 16:39:02 -0800 From: "David S. Miller" To: Julian Anastasov Cc: herbert@gondor.apana.org.au, netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-Id: <20031202163902.2deb43a7.davem@redhat.com> In-Reply-To: References: <20031201155005.1c515793.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1849 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 3 Dec 2003 01:40:06 +0200 (EET) Julian Anastasov wrote: > - ip_rt_frag_needed now provides interface index I disagree with this. There is no connection between the route (and thus output device) we used to send a packet and the interface on which an ICMP for that packet comes back. Think asymmetric routes. Your changes mean that for routes with specific output interfaces, we will ignore ICMPs for those routes that arrive on other interfaces due to asymmetric routing. Why don't you create a separate patch that just has the TOS masking changes? That's much less controversial and thus something I'm likely to apply.
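The asymmetric-routing objection can be sketched with a toy model. This is only an illustration under stated assumptions: struct toy_route and toy_pmtu_match() are hypothetical names, not kernel code, and the predicate models just the ikeys[] = { iif, 0 } part of the proposed lookup, where a cached route is considered only if its oif equals the interface the ICMP arrived on, or is 0.

```c
#include <stdbool.h>

/* Hypothetical, simplified cache entry: only the field relevant here. */
struct toy_route {
    int oif; /* ifindex the route was bound to when created; 0 = unbound */
};

/*
 * Models the ikeys[] = { iif, 0 } idea from the patch under discussion:
 * an entry is a candidate for a PMTU update only if its oif equals the
 * interface the ICMP arrived on, or if it has no oif at all.
 */
bool toy_pmtu_match(const struct toy_route *rt, int icmp_iif)
{
    return rt->oif == icmp_iif || rt->oif == 0;
}
```

With a route bound to ifindex 2, an ICMP arriving on ifindex 2 matches, but the same ICMP arriving on ifindex 3 (the asymmetric return path) does not, so in this model the PMTU update would be dropped for that route. An unbound route (oif 0) matches either way.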
From ja@ssi.bg Tue Dec 2 16:42:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:42:38 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30gJTa016319 for ; Tue, 2 Dec 2003 16:42:22 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB2NtSee009114; Wed, 3 Dec 2003 01:55:29 +0200 Date: Wed, 3 Dec 2003 01:55:28 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: netdev@oss.sgi.com cc: Harald Welte , "David S. Miller" Subject: [2.6 PATCH] ipchains masquerade must select maddr correctly Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="1607745702-1400563270-1070409328=:1237" X-archive-position: 1850 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --1607745702-1400563270-1070409328=:1237 Content-Type: TEXT/PLAIN; charset=US-ASCII Hello, The attached patch fixes ipchains masquerade to use correctly the routing. This bug-to-bug compatibility with 2.2 is not valid from long time. Also, a missing unlock is added. 
Regards -- Julian Anastasov --1607745702-1400563270-1070409328=:1237 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="ipchains-masq-1.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: ipchains maddr Content-Disposition: attachment; filename="ipchains-masq-1.diff" IyBUaGlzIGlzIGEgQml0S2VlcGVyIGdlbmVyYXRlZCBwYXRjaCBmb3IgdGhl IGZvbGxvd2luZyBwcm9qZWN0Og0KIyBQcm9qZWN0IE5hbWU6IExpbnV4IGtl cm5lbCB0cmVlDQojIFRoaXMgcGF0Y2ggZm9ybWF0IGlzIGludGVuZGVkIGZv ciBHTlUgcGF0Y2ggY29tbWFuZCB2ZXJzaW9uIDIuNSBvciBoaWdoZXIuDQoj IFRoaXMgcGF0Y2ggaW5jbHVkZXMgdGhlIGZvbGxvd2luZyBkZWx0YXM6DQoj CSAgICAgICAgICAgQ2hhbmdlU2V0CTEuMTM1MiAgLT4gMS4xMzUzIA0KIwlu ZXQvaXB2NC9uZXRmaWx0ZXIvaXBfZndfY29tcGF0X21hc3EuYwkxLjExICAg IC0+IDEuMTIgICANCiMNCiMgVGhlIGZvbGxvd2luZyBpcyB0aGUgQml0S2Vl cGVyIENoYW5nZVNldCBMb2cNCiMgLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0NCiMgMDMvMTIvMDMJamFAc3NpLmJnCTEu MTM1Mw0KIyBbSVBDSEFJTlNdOiBtYXNxdWVyYWRlIG11c3Qgc2VsZWN0IG1h ZGRyIGNvcnJlY3RseQ0KIyANCiMgLSBJdCBpcyBmaXhlZCBpbiBMaW51eCAy LjIgZnJvbSBhZ2VzDQojIC0gQWRkIG1pc3NpbmcgV1JJVEVfVU5MT0NLIG9u IHJvdXRlIGVycm9yDQojIC0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tDQojDQpkaWZmIC1OcnUgYS9uZXQvaXB2NC9uZXRm aWx0ZXIvaXBfZndfY29tcGF0X21hc3EuYyBiL25ldC9pcHY0L25ldGZpbHRl ci9pcF9md19jb21wYXRfbWFzcS5jDQotLS0gYS9uZXQvaXB2NC9uZXRmaWx0 ZXIvaXBfZndfY29tcGF0X21hc3EuYwlXZWQgRGVjICAzIDAwOjMyOjA1IDIw MDMNCisrKyBiL25ldC9pcHY0L25ldGZpbHRlci9pcF9md19jb21wYXRfbWFz cS5jCVdlZCBEZWMgIDMgMDA6MzI6MDUgMjAwMw0KQEAgLTY3LDE4ICs2Nywy MyBAQA0KIAkvKiBTZXR1cCB0aGUgbWFzcXVlcmFkZSwgaWYgbm90IGFscmVh ZHkgKi8NCiAJaWYgKCFpbmZvLT5pbml0aWFsaXplZCkgew0KIAkJdV9pbnQz Ml90IG5ld3NyYzsNCi0JCXN0cnVjdCBmbG93aSBmbCA9IHsgLm5sX3UgPSB7 IC5pcDRfdSA9IHsgLmRhZGRyID0gKCpwc2tiKS0+bmguaXBoLT5kYWRkciB9 IH0gfTsNCisJCXN0cnVjdCBmbG93aSBmbCA9IHsgLm5sX3UgPSB7DQorCQkJ CQkuaXA0X3UgPSB7DQorCQkJCQkJLmRhZGRyID0gKCpwc2tiKS0+bmguaXBo LT5kYWRkciwNCisJCQkJCQkudG9zID0gUlRfVE9TKCgqcHNrYiktPm5oLmlw aC0+dG9zKSwNCisJCQkJCX0gfSwNCisJCQkJICAgIC5vaWYgPSAoKnBza2Ip 
LT5kc3QtPmRldi0+aWZpbmRleCB9Ow0KIAkJc3RydWN0IHJ0YWJsZSAqcnQ7 DQogCQlzdHJ1Y3QgaXBfbmF0X211bHRpX3JhbmdlIHJhbmdlOw0KIA0KIAkJ LyogUGFzcyAwIGluc3RlYWQgb2Ygc2FkZHIsIHNpbmNlIGl0J3MgZ29pbmcg dG8gYmUgY2hhbmdlZA0KIAkJICAgYW55d2F5LiAqLw0KIAkJaWYgKGlwX3Jv dXRlX291dHB1dF9rZXkoJnJ0LCAmZmwpICE9IDApIHsNCisJCQlXUklURV9V TkxPQ0soJmlwX25hdF9sb2NrKTsNCiAJCQlERUJVR1AoImlwbmF0X3J1bGVf bWFzcXVlcmFkZTogQ2FuJ3QgcmVyb3V0ZS5cbiIpOw0KIAkJCXJldHVybiBO Rl9EUk9QOw0KIAkJfQ0KLQkJbmV3c3JjID0gaW5ldF9zZWxlY3RfYWRkcihy dC0+dS5kc3QuZGV2LCBydC0+cnRfZ2F0ZXdheSwNCi0JCQkJCSAgUlRfU0NP UEVfVU5JVkVSU0UpOw0KKwkJbmV3c3JjID0gcnQtPnJ0X3NyYzsNCiAJCWlw X3J0X3B1dChydCk7DQogCQlyYW5nZSA9ICgoc3RydWN0IGlwX25hdF9tdWx0 aV9yYW5nZSkNCiAJCQkgeyAxLA0K --1607745702-1400563270-1070409328=:1237-- From khc@pm.waw.pl Tue Dec 2 16:43:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:43:50 -0800 (PST) Received: from hq.pm.waw.pl (hq.pm.waw.pl [195.116.170.10]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30haTa016701 for ; Tue, 2 Dec 2003 16:43:37 -0800 Received: by hq.pm.waw.pl (Postfix, from userid 10) id AD7B6365; Wed, 3 Dec 2003 01:43:32 +0100 (CET) Received: by defiant.pm.waw.pl (Postfix, from userid 500) id AE258302D4; Wed, 3 Dec 2003 00:03:30 +0100 (CET) To: Stephen Hemminger Cc: Jeff Garzik , netdev@oss.sgi.com Subject: Re: [PATCH] (1/8) hdlc wan device disembedding References: <20031202140103.6bb2deb4.shemminger@osdl.org> From: Krzysztof Halasa Date: Wed, 03 Dec 2003 00:03:30 +0100 In-Reply-To: <20031202140103.6bb2deb4.shemminger@osdl.org> (Stephen Hemminger's message of "Tue, 2 Dec 2003 14:01:03 -0800") Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 1851 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: khc@pm.waw.pl Precedence: bulk X-list: netdev Stephen Hemminger writes: > Change the hdlc wan device's to not have the net_device structure embedded > inside the hdlc_device 
structure. This won't work on 2.6 where the > net_device > structure may need to live after module unload due to sysfs. > Instead, use alloc_netdev and setup so that netdev->priv = hdlc > and have hdlc->dev_data for device private data. Hmm... I always wanted dev->priv to be available for hw drivers (and yes, PPP proto doesn't currently meet that). Any other idea maybe? It seems the whole WAN (drivers/net/wan) needs some major rewrite, as the recent patches (starting with the last 2.5.x ones) have broken a few things (at least for me) which are not all fixed. First I would duplicate syncppp.c code in hdlc_ppp.c (removing Cisco HDLC support and polishing it to suit generic HDLC needs) so it no longer depends on syncppp (long-term I plan switching to generic PPP but it's certainly a post-2.6 thing and I'm not even sure how to do it). This would result in a few hundred lines of duplicated code, though. The positive side is that I can test the generic HDLC + hw drivers for C101, N2, wanXL, PCI200SYN (being merged) and PC300. The other thing would be converting drivers using syncppp.c to use generic HDLC instead (it would add support for X.25, Frame-Relay and raw HDLC). While I can probably make a patch I can't test it. The remaining drivers use either Sangoma (sdla/dlci/wanpipe/wanrouter) or comx code. Not sure about their status, are they both maintained? Comments? -- Krzysztof Halasa, B*FH From ja@ssi.bg Tue Dec 2 16:50:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 16:50:15 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30ntTa017246 for ; Tue, 2 Dec 2003 16:49:58 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB30ouee009404; Wed, 3 Dec 2003 02:50:57 +0200 Date: Wed, 3 Dec 2003 02:50:56 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. 
Miller" cc: netdev@oss.sgi.com, Subject: Re: [2.6 PATCH] ipchains masquerade must select maddr correctly In-Reply-To: <20031202162208.04629dab.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1852 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Tue, 2 Dec 2003, David S. Miller wrote: > > The attached patch fixes ipchains masquerade to use > > correctly the routing. This bug-to-bug compatibility with 2.2 > > is not valid from long time. Also, a missing unlock is added. > > Slow down. > > I don't think it's always desirable to specify a specific TOS when > we're working with an input packet. In fact, what you're doing all > over the tree is going to cause the routing cache size to explode in > some very real usage. Yes, it can grow up to 8 times (IPTOS_RT_MASK is 3 bits) if we detect different rt tos values. In fact, ipchains is the only case where tos is not provided :) For some users maybe this is not only a maddr selection; maybe they have real routes by tos for this public IP. Perhaps, TOS matching and hash key should be a sysctl/compile time option? Then a site that does not use tos for routing can safely run PMTUD without problems. I think it is a common case not to route by tos. The good news is that for ipchains this is in->out traffic and maybe there is only one tos value per path. 
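To make the "up to 8 times" figure concrete, here is a minimal userspace sketch. It assumes the mask values from the 2.4/2.6-era kernel headers (IPTOS_TOS_MASK = 0x1E, and IPTOS_RT_MASK defined as IPTOS_TOS_MASK & ~3), copied locally so it builds outside the kernel. The helper name count_rt_tos_values() is hypothetical and is not part of any kernel API.

```c
/* Assumed values, reproduced from the kernel headers of that era
 * so this sketch compiles in userspace. */
#define IPTOS_TOS_MASK 0x1E
#define IPTOS_RT_MASK  (IPTOS_TOS_MASK & ~3)

/* Hypothetical helper: count how many distinct routing-cache tos keys
 * can result from masking every possible 8-bit TOS byte. */
static int count_rt_tos_values(void)
{
    int seen[256] = { 0 };
    int count = 0;
    int tos;

    for (tos = 0; tos < 256; tos++) {
        int key = tos & IPTOS_RT_MASK;   /* value that would enter the hash key */

        if (!seen[key]) {
            seen[key] = 1;
            count++;
        }
    }
    return count;
}
```

With these masks the helper returns 8, one cache entry per distinct masked tos value for an otherwise identical flow, which is the worst case described above. If the mask were 0, as suggested for kernels without tos routing, it would return 1.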
Regards -- Julian Anastasov From davem@pizda.ninka.net Tue Dec 2 16:59:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 17:00:08 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB30xsTa017690 for ; Tue, 2 Dec 2003 16:59:55 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA04690; Tue, 2 Dec 2003 16:58:50 -0800 Date: Tue, 2 Dec 2003 16:58:50 -0800 From: "David S. Miller" To: Julian Anastasov Cc: netdev@oss.sgi.com, laforge@gnumonks.org Subject: Re: [2.6 PATCH] ipchains masquerade must select maddr correctly Message-Id: <20031202165850.3f8c16c4.davem@redhat.com> In-Reply-To: References: <20031202162208.04629dab.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1853 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 3 Dec 2003 02:50:56 +0200 (EET) Julian Anastasov wrote: > TOS matching and hash key should be > a sysctl/compile time option? It already is, CONFIG_IP_ROUTE_TOS. But the issue is that all dist vendors enable that. Also note that what you propose could potentially break some setups. From ja@ssi.bg Tue Dec 2 18:06:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 18:06:43 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB326ITa019786 for ; Tue, 2 Dec 2003 18:06:22 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB31h2ee009586; Wed, 3 Dec 2003 03:43:04 +0200 Date: Wed, 3 Dec 2003 03:43:02 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. 
Miller" cc: herbert@gondor.apana.org.au, Subject: Re: [ROUTE] PMTU only works on half the time In-Reply-To: <20031202163902.2deb43a7.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1854 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Tue, 2 Dec 2003, David S. Miller wrote: > > - ip_rt_frag_needed now provides interface index > > I disagree with this. > > There is no connection between the route (and thus output > device) we used to send a packet and the interface on which > an ICMP for that packet comes back upon. Think assymetric > routes. Agreed. This is a very bad case for PMTUD. It can even be a multipath route with different PMTUs. > You changes mean that for routes with specific output interfaces, > we will ignore ICMPs for those routes that arrive on other interfaces > due to assymetric routing. They are already ignored because the oif is a hash key. With the current code I do not think we can find other oif values in the current row when the oif hash key is 0. With the changes, we are trying to walk the rows for oif=IIF and oif=0. We cannot learn how many paths and oifs we have before the smallest pmtu that generates ICMPs. We are not even sure, when the ICMP comes, if the originating packet was routed with oif 0 or not 0. So maybe it is better to set the smallest pmtu for all these paths, for oif 0 as before and now for oif=IIF. About CONFIG_IP_ROUTE_TOS, why is it not propagated to the routing cache? Then for the fl4_tos value we will have only two cases, onlink or not onlink. In essence, I see it as defining IPTOS_RT_MASK to 0 when CONFIG_IP_ROUTE_TOS is not defined. If this works, it solves so many problems: cache size, PMTUD. Is it really so easy? 
CONFIG_IP_ROUTE_TOS disables tos match in rules and routes, so why do we have to waste memory for the route cache when all these routes receive the same path no matter the tos value? > Why don't you create a seperate patch that just has the TOS masking > changes? That's much less controversial and thus something I'm likely > to apply. ok, maybe after some days of rethinking :) let me know your final thoughts and I can create patch(es) after reading some standards. Regards -- Julian Anastasov From hadi@cyberus.ca Tue Dec 2 19:09:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 19:09:23 -0800 (PST) Received: from mail.cyberus.ca (mail.cyberus.ca [209.197.145.21]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3395Ta020844 for ; Tue, 2 Dec 2003 19:09:06 -0800 Received: from [216.209.86.2] (helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1AQvJ6-0006PF-0p; Mon, 01 Dec 2003 16:10:12 -0500 Subject: Re: Request: Allocate a Netlink Family Number for MPLS From: jamal Reply-To: hadi@cyberus.ca To: Ramon Casellas Cc: Christoph Hellwig , netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Organization: jamalopolis Message-Id: <1070313011.1107.39.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 01 Dec 2003 16:10:11 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 1855 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2003-12-01 at 14:26, Ramon Casellas wrote: > > Well, we are still coordinating efforts, and Jamal has some design docs > around... but regardless of the actual implementation, a mechanism to > communicate with userspace will be needed, and IM(Very, very)HO, netlink > is a good candidate, analog to rtnetlink, but this is open to discussion. > Please leave this alone because the code i have is taking care of this (i.e. it uses netlink). 
And the grander plan is to use the same code for bridging and VLANs as well. I will share the code once James and everybody else reviews things. cheers, jamal From davem@pizda.ninka.net Tue Dec 2 19:46:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Dec 2003 19:46:37 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB33kATa021533 for ; Tue, 2 Dec 2003 19:46:13 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id TAA05015; Tue, 2 Dec 2003 19:44:15 -0800 Date: Tue, 2 Dec 2003 19:44:15 -0800 From: "David S. Miller" To: Xose Vazquez Perez Cc: netdev@oss.sgi.com, jgarzik@pobox.com Subject: Re: Level One LXT1001 GE chip Message-Id: <20031202194415.076637f4.davem@redhat.com> In-Reply-To: <3FC616CE.2060208@wanadoo.es> References: <3FC3FB99.70700@wanadoo.es> <20031127010324.061cac3b.davem@redhat.com> <3FC616CE.2060208@wanadoo.es> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1856 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 27 Nov 2003 16:22:54 +0100 Xose Vazquez Perez wrote: > I thought that you already got homework, tg3TsoFwText > firmware ;-) 8-P~~~ Ok, I threw this together and I'll probably merge it upstream soon. (Jeff I'll take care of this and the bug tg3 fixes I worked on yesterday... 
this email is just a FYI) ===== drivers/net/tg3.c 1.116 vs edited ===== --- 1.116/drivers/net/tg3.c Tue Dec 2 02:35:11 2003 +++ edited/drivers/net/tg3.c Tue Dec 2 17:46:31 2003 @@ -2688,7 +2688,13 @@ mss |= (tsflags << 11); } } else { - mss += tcp_opt_len; + if (tcp_opt_len || skb->nh.iph->ihl > 5) { + int tsflags; + + tsflags = ((skb->nh.iph->ihl - 5) + + (tcp_opt_len >> 2)); + base_flags |= tsflags << 12; + } } } #else @@ -2895,7 +2901,13 @@ mss |= (tsflags << 11); } } else { - mss += tcp_opt_len; + if (tcp_opt_len || skb->nh.iph->ihl > 5) { + int tsflags; + + tsflags = ((skb->nh.iph->ihl - 5) + + (tcp_opt_len >> 2)); + base_flags |= tsflags << 12; + } } } #else @@ -3860,180 +3872,181 @@ #if TG3_TSO_SUPPORT != 0 #define TG3_TSO_FW_RELEASE_MAJOR 0x1 -#define TG3_TSO_FW_RELASE_MINOR 0x3 +#define TG3_TSO_FW_RELASE_MINOR 0x4 #define TG3_TSO_FW_RELEASE_FIX 0x0 #define TG3_TSO_FW_START_ADDR 0x08000000 #define TG3_TSO_FW_TEXT_ADDR 0x08000000 -#define TG3_TSO_FW_TEXT_LEN 0x1ac0 -#define TG3_TSO_FW_RODATA_ADDR 0x08001650 +#define TG3_TSO_FW_TEXT_LEN 0x1a90 +#define TG3_TSO_FW_RODATA_ADDR 0x08001a900 #define TG3_TSO_FW_RODATA_LEN 0x60 -#define TG3_TSO_FW_DATA_ADDR 0x080016a0 +#define TG3_TSO_FW_DATA_ADDR 0x08001b20 #define TG3_TSO_FW_DATA_LEN 0x20 -#define TG3_TSO_FW_SBSS_ADDR 0x080016c0 +#define TG3_TSO_FW_SBSS_ADDR 0x08001b40 #define TG3_TSO_FW_SBSS_LEN 0x2c -#define TG3_TSO_FW_BSS_ADDR 0x080016e0 -#define TG3_TSO_FW_BSS_LEN 0x890 +#define TG3_TSO_FW_BSS_ADDR 0x08001b70 +#define TG3_TSO_FW_BSS_LEN 0x894 static u32 tg3TsoFwText[] = { 0x00000000, 0x10000003, 0x00000000, 0x0000000d, 0x0000000d, 0x3c1d0800, 0x37bd4000, 0x03a0f021, 0x3c100800, 0x26100000, 0x0e000010, 0x00000000, 0x0000000d, 0x00000000, 0x00000000, 0x00000000, 0x27bdffe0, 0x3c04fefe, - 0xafbf0018, 0x0e0005e0, 0x34840002, 0x0e000670, 0x00000000, 0x3c030800, - 0x90631b78, 0x24020002, 0x3c040800, 0x24841acc, 0x14620003, 0x24050001, - 0x3c040800, 0x24841ac0, 0x24060002, 0x00003821, 0xafa00010, 0x0e000684, + 
0xafbf0018, 0x0e0005d4, 0x34840002, 0x0e000664, 0x00000000, 0x3c030800, + 0x90631b58, 0x24020002, 0x3c040800, 0x24841a9c, 0x14620003, 0x24050001, + 0x3c040800, 0x24841a90, 0x24060003, 0x00003821, 0xafa00010, 0x0e000678, 0xafa00014, 0x8f625c50, 0x34420001, 0xaf625c50, 0x8f625c90, 0x34420001, 0xaf625c90, 0x2402ffff, 0x0e000034, 0xaf625404, 0x8fbf0018, 0x03e00008, - 0x27bd0020, 0x00000000, 0x00000000, 0x00000000, 0x27bdffe0, 0xafbf0018, - 0xafb10014, 0x0e000052, 0xafb00010, 0x24110001, 0x8f706820, 0x32020100, - 0x10400003, 0x00000000, 0x0e0000b2, 0x00000000, 0x8f706820, 0x32022000, - 0x10400004, 0x32020001, 0x0e0001e3, 0x24040001, 0x32020001, 0x10400003, - 0x00000000, 0x0e00009a, 0x00000000, 0x0a00003a, 0xaf715028, 0x8fbf0018, - 0x8fb10014, 0x8fb00010, 0x03e00008, 0x27bd0020, 0x27bdffe0, 0x3c040800, - 0x24841ae0, 0x00002821, 0x00003021, 0x00003821, 0xafbf0018, 0xafa00010, - 0x0e000684, 0xafa00014, 0x3c040800, 0x248423e8, 0xa4800000, 0x3c010800, - 0xa0201ba8, 0x3c010800, 0xac201bac, 0x3c010800, 0xac201bb0, 0x3c010800, - 0xac201bb4, 0x3c010800, 0xac201bbc, 0x3c010800, 0xac201bc8, 0x3c010800, - 0xac201bcc, 0x8f624434, 0x3c010800, 0xac221b98, 0x8f624438, 0x3c010800, - 0xac221b9c, 0x8f624410, 0xac80f7a8, 0x3c010800, 0xac201b94, 0x3c010800, - 0xac2023f0, 0x3c010800, 0xac2023d8, 0x3c010800, 0xac2023dc, 0x3c010800, - 0xac202410, 0x3c010800, 0xac221ba0, 0x8f620068, 0x24030007, 0x00021702, - 0x10430005, 0x00000000, 0x8f620068, 0x00021702, 0x14400004, 0x24020001, - 0x3c010800, 0x0a00008e, 0xac20241c, 0xac820034, 0x3c040800, 0x24841aec, - 0x3c050800, 0x8ca5241c, 0x00003021, 0x00003821, 0xafa00010, 0x0e000684, - 0xafa00014, 0x8fbf0018, 0x03e00008, 0x27bd0020, 0x27bdffe0, 0x3c040800, - 0x24841af8, 0x00002821, 0x00003021, 0x00003821, 0xafbf0018, 0xafa00010, - 0x0e000684, 0xafa00014, 0x0e000052, 0x00000000, 0x0e0000ab, 0x00002021, - 0x8fbf0018, 0x03e00008, 0x27bd0020, 0x24020001, 0x8f636820, 0x00821004, - 0x00021027, 0x00621824, 0x03e00008, 0xaf636820, 0x27bdffd0, 0xafbf002c, - 
0xafb60028, 0xafb50024, 0xafb40020, 0xafb3001c, 0xafb20018, 0xafb10014, - 0xafb00010, 0x8f665c5c, 0x3c030800, 0x24631bcc, 0x8c620000, 0x14460005, - 0x3c0200ff, 0x3c020800, 0x90421ba8, 0x14400115, 0x3c0200ff, 0x3442fff8, - 0x00c28824, 0xac660000, 0x00111902, 0x306300ff, 0x30c20003, 0x000211c0, - 0x00623825, 0x00e02821, 0x00061602, 0x3c030800, 0x90631ba8, 0x3044000f, - 0x1460002b, 0x00804021, 0x24020001, 0x3c010800, 0xa0221ba8, 0x00071100, - 0x00821025, 0x3c010800, 0xac201bac, 0x3c010800, 0xac201bb0, 0x3c010800, - 0xac201bb4, 0x3c010800, 0xac201bbc, 0x3c010800, 0xac201bc8, 0x3c010800, - 0xac201bc0, 0x3c010800, 0xac201bc4, 0x3c010800, 0xa42223e8, 0x9623000c, - 0x30628000, 0x10400008, 0x30627fff, 0x2442003e, 0x3c010800, 0xa4221ba6, - 0x24020001, 0x3c010800, 0x0a0000f9, 0xac222404, 0x24620036, 0x3c010800, - 0xa4221ba6, 0x3c010800, 0xac202404, 0x3c010800, 0xac202400, 0x3c010800, - 0x0a000101, 0xac202408, 0x9622000c, 0x3c010800, 0xa42223fc, 0x3c040800, - 0x24841bac, 0x8c820000, 0x00021100, 0x3c010800, 0x00220821, 0xac311bd8, - 0x8c820000, 0x00021100, 0x3c010800, 0x00220821, 0xac261bdc, 0x8c820000, - 0x24a30001, 0x306701ff, 0x00021100, 0x3c010800, 0x00220821, 0xac271be0, - 0x8c820000, 0x00021100, 0x3c010800, 0x00220821, 0xac281be4, 0x96230008, - 0x3c020800, 0x8c421bbc, 0x00432821, 0x3c010800, 0xac251bbc, 0x9622000a, - 0x30420004, 0x14400018, 0x00071100, 0x8f630c14, 0x3063000f, 0x2c620002, - 0x1440000b, 0x3c02c000, 0x8f630c14, 0x3c020800, 0x8c421b50, 0x3063000f, - 0x24420001, 0x3c010800, 0xac221b50, 0x2c620002, 0x1040fff7, 0x3c02c000, - 0x00c21825, 0xaf635c5c, 0x8f625c50, 0x30420002, 0x10400014, 0x00000000, - 0x0a000133, 0x00000000, 0x3c030800, 0x8c631b90, 0x3c040800, 0x94841ba4, - 0x01021025, 0x3c010800, 0xa42223ea, 0x24020001, 0x3c010800, 0xac221bc8, - 0x24630001, 0x0085202a, 0x3c010800, 0x10800003, 0xac231b90, 0x3c010800, - 0xa4251ba4, 0x3c060800, 0x24c61bac, 0x8cc20000, 0x24420001, 0xacc20000, - 0x28420080, 0x14400005, 0x00000000, 0x0e00065e, 0x24040002, 0x0a0001d9, - 
0x00000000, 0x3c020800, 0x8c421bc8, 0x1040007f, 0x24020001, 0x3c040800, - 0x90841ba8, 0x14820077, 0x24020003, 0x3c150800, 0x96b51ba6, 0x3c050800, - 0x8ca51bbc, 0x32a3ffff, 0x00a3102a, 0x14400073, 0x00000000, 0x14a30003, - 0x00000000, 0x3c010800, 0xac242400, 0x10600061, 0x00009021, 0x24d60004, - 0x0060a021, 0x24d30014, 0x8ec20000, 0x00028100, 0x3c110800, 0x02308821, - 0x0e00062d, 0x8e311bd8, 0x00403021, 0x10c00059, 0x00000000, 0x9628000a, - 0x31020040, 0x10400004, 0x2407180c, 0x8e22000c, 0x2407188c, 0xacc20018, - 0x31021000, 0x10400004, 0x34e32000, 0x00081040, 0x3042c000, 0x00623825, - 0x3c030800, 0x00701821, 0x8c631be0, 0x3c020800, 0x00501021, 0x8c421be4, - 0x00031d00, 0x00021400, 0x00621825, 0xacc30014, 0x8ec30004, 0x96220008, - 0x00432023, 0x3242ffff, 0x3083ffff, 0x00431021, 0x0282102a, 0x14400002, - 0x02b22823, 0x00802821, 0x8e620000, 0x30a4ffff, 0x00441021, 0xae620000, - 0x8e220000, 0xacc20000, 0x8e220004, 0x8e63fff4, 0x00431021, 0xacc20004, - 0xa4c5000e, 0x8e62fff4, 0x00441021, 0xae62fff4, 0x96230008, 0x0043102a, - 0x14400005, 0x02459021, 0x8e62fff0, 0xae60fff4, 0x24420001, 0xae62fff0, - 0xacc00008, 0x3242ffff, 0x14540008, 0x24020305, 0x31020080, 0x54400001, - 0x34e70010, 0x24020905, 0xa4c2000c, 0x0a0001bc, 0x34e70020, 0xa4c2000c, - 0x3c020800, 0x8c422400, 0x10400003, 0x3c024b65, 0x0a0001c4, 0x34427654, - 0x3c02b49a, 0x344289ab, 0xacc2001c, 0x30e2ffff, 0xacc20010, 0x0e0005aa, - 0x00c02021, 0x3242ffff, 0x0054102b, 0x1440ffa4, 0x00000000, 0x24020002, - 0x3c010800, 0x0a0001d9, 0xa0221ba8, 0x8ec2083c, 0x24420001, 0x0a0001d9, - 0xaec2083c, 0x14820003, 0x00000000, 0x0e0004b9, 0x00000000, 0x8fbf002c, + 0x27bd0020, 0x00000000, 0x00000000, 0x00000000, 0x27bdffe0, 0xafbf001c, + 0xafb20018, 0xafb10014, 0x0e00005b, 0xafb00010, 0x24120002, 0x24110001, + 0x8f706820, 0x32020100, 0x10400003, 0x00000000, 0x0e0000bb, 0x00000000, + 0x8f706820, 0x32022000, 0x10400004, 0x32020001, 0x0e0001ef, 0x24040001, + 0x32020001, 0x10400003, 0x00000000, 0x0e0000a3, 0x00000000, 0x3c020800, + 
0x90421b88, 0x14520003, 0x00000000, 0x0e0004bf, 0x00000000, 0x0a00003c, + 0xaf715028, 0x8fbf001c, 0x8fb20018, 0x8fb10014, 0x8fb00010, 0x03e00008, + 0x27bd0020, 0x27bdffe0, 0x3c040800, 0x24841ab0, 0x00002821, 0x00003021, + 0x00003821, 0xafbf0018, 0xafa00010, 0x0e000678, 0xafa00014, 0x3c040800, + 0x248423c8, 0xa4800000, 0x3c010800, 0xa0201b88, 0x3c010800, 0xac201b8c, + 0x3c010800, 0xac201b90, 0x3c010800, 0xac201b94, 0x3c010800, 0xac201b9c, + 0x3c010800, 0xac201ba8, 0x3c010800, 0xac201bac, 0x8f624434, 0x3c010800, + 0xac221b78, 0x8f624438, 0x3c010800, 0xac221b7c, 0x8f624410, 0xac80f7a8, + 0x3c010800, 0xac201b74, 0x3c010800, 0xac2023d0, 0x3c010800, 0xac2023b8, + 0x3c010800, 0xac2023bc, 0x3c010800, 0xac2023f0, 0x3c010800, 0xac221b80, + 0x8f620068, 0x24030007, 0x00021702, 0x10430005, 0x00000000, 0x8f620068, + 0x00021702, 0x14400004, 0x24020001, 0x3c010800, 0x0a000097, 0xac2023fc, + 0xac820034, 0x3c040800, 0x24841abc, 0x3c050800, 0x8ca523fc, 0x00003021, + 0x00003821, 0xafa00010, 0x0e000678, 0xafa00014, 0x8fbf0018, 0x03e00008, + 0x27bd0020, 0x27bdffe0, 0x3c040800, 0x24841ac8, 0x00002821, 0x00003021, + 0x00003821, 0xafbf0018, 0xafa00010, 0x0e000678, 0xafa00014, 0x0e00005b, + 0x00000000, 0x0e0000b4, 0x00002021, 0x8fbf0018, 0x03e00008, 0x27bd0020, + 0x24020001, 0x8f636820, 0x00821004, 0x00021027, 0x00621824, 0x03e00008, + 0xaf636820, 0x27bdffd0, 0xafbf002c, 0xafb60028, 0xafb50024, 0xafb40020, + 0xafb3001c, 0xafb20018, 0xafb10014, 0xafb00010, 0x8f675c5c, 0x3c030800, + 0x24631bac, 0x8c620000, 0x14470005, 0x3c0200ff, 0x3c020800, 0x90421b88, + 0x14400118, 0x3c0200ff, 0x3442fff8, 0x00e28824, 0xac670000, 0x00111902, + 0x306300ff, 0x30e20003, 0x000211c0, 0x00622825, 0x00a04021, 0x00071602, + 0x3c030800, 0x90631b88, 0x3044000f, 0x14600036, 0x00804821, 0x24020001, + 0x3c010800, 0xa0221b88, 0x00051100, 0x00821025, 0x3c010800, 0xac201b8c, + 0x3c010800, 0xac201b90, 0x3c010800, 0xac201b94, 0x3c010800, 0xac201b9c, + 0x3c010800, 0xac201ba8, 0x3c010800, 0xac201ba0, 0x3c010800, 0xac201ba4, + 
0x3c010800, 0xa42223c8, 0x9622000c, 0x30437fff, 0x3c010800, 0xa4222400, + 0x30428000, 0x3c010800, 0xa4231bb6, 0x10400005, 0x24020001, 0x3c010800, + 0xac2223e4, 0x0a000102, 0x2406003e, 0x24060036, 0x3c010800, 0xac2023e4, + 0x9622000a, 0x3c030800, 0x94631bb6, 0x3c010800, 0xac2023e0, 0x3c010800, + 0xac2023e8, 0x00021302, 0x00021080, 0x00c21021, 0x00621821, 0x3c010800, + 0xa42223c0, 0x3c010800, 0x0a000115, 0xa4231b86, 0x9622000c, 0x3c010800, + 0xa42223dc, 0x3c040800, 0x24841b8c, 0x8c820000, 0x00021100, 0x3c010800, + 0x00220821, 0xac311bb8, 0x8c820000, 0x00021100, 0x3c010800, 0x00220821, + 0xac271bbc, 0x8c820000, 0x25030001, 0x306601ff, 0x00021100, 0x3c010800, + 0x00220821, 0xac261bc0, 0x8c820000, 0x00021100, 0x3c010800, 0x00220821, + 0xac291bc4, 0x96230008, 0x3c020800, 0x8c421b9c, 0x00432821, 0x3c010800, + 0xac251b9c, 0x9622000a, 0x30420004, 0x14400018, 0x00061100, 0x8f630c14, + 0x3063000f, 0x2c620002, 0x1440000b, 0x3c02c000, 0x8f630c14, 0x3c020800, + 0x8c421b30, 0x3063000f, 0x24420001, 0x3c010800, 0xac221b30, 0x2c620002, + 0x1040fff7, 0x3c02c000, 0x00e21825, 0xaf635c5c, 0x8f625c50, 0x30420002, + 0x10400014, 0x00000000, 0x0a000147, 0x00000000, 0x3c030800, 0x8c631b70, + 0x3c040800, 0x94841b84, 0x01221025, 0x3c010800, 0xa42223ca, 0x24020001, + 0x3c010800, 0xac221ba8, 0x24630001, 0x0085202a, 0x3c010800, 0x10800003, + 0xac231b70, 0x3c010800, 0xa4251b84, 0x3c060800, 0x24c61b8c, 0x8cc20000, + 0x24420001, 0xacc20000, 0x28420080, 0x14400005, 0x00000000, 0x0e000652, + 0x24040002, 0x0a0001e5, 0x00000000, 0x3c020800, 0x8c421ba8, 0x10400077, + 0x24020001, 0x3c050800, 0x90a51b88, 0x14a20071, 0x00000000, 0x3c150800, + 0x96b51b86, 0x3c040800, 0x8c841b9c, 0x32a3ffff, 0x0083102a, 0x1440006b, + 0x00000000, 0x14830003, 0x00000000, 0x3c010800, 0xac2523e0, 0x1060005b, + 0x00009021, 0x24d60004, 0x0060a021, 0x24d30014, 0x8ec20000, 0x00028100, + 0x3c110800, 0x02308821, 0x0e000621, 0x8e311bb8, 0x00402821, 0x10a00053, + 0x00000000, 0x9628000a, 0x31020040, 0x10400004, 0x2407180c, 0x8e22000c, + 
0x2407188c, 0xaca20018, 0x3c030800, 0x00701821, 0x8c631bc0, 0x3c020800, + 0x00501021, 0x8c421bc4, 0x00031d00, 0x00021400, 0x00621825, 0xaca30014, + 0x8ec30004, 0x96220008, 0x00432023, 0x3242ffff, 0x3083ffff, 0x00431021, + 0x0282102a, 0x14400002, 0x02b23023, 0x00803021, 0x8e620000, 0x30c4ffff, + 0x00441021, 0xae620000, 0x8e220000, 0xaca20000, 0x8e220004, 0x8e63fff4, + 0x00431021, 0xaca20004, 0xa4a6000e, 0x8e62fff4, 0x00441021, 0xae62fff4, + 0x96230008, 0x0043102a, 0x14400005, 0x02469021, 0x8e62fff0, 0xae60fff4, + 0x24420001, 0xae62fff0, 0xaca00008, 0x3242ffff, 0x14540008, 0x24020305, + 0x31020080, 0x54400001, 0x34e70010, 0x24020905, 0xa4a2000c, 0x0a0001ca, + 0x34e70020, 0xa4a2000c, 0x3c020800, 0x8c4223e0, 0x10400003, 0x3c024b65, + 0x0a0001d2, 0x34427654, 0x3c02b49a, 0x344289ab, 0xaca2001c, 0x30e2ffff, + 0xaca20010, 0x0e00059f, 0x00a02021, 0x3242ffff, 0x0054102b, 0x1440ffaa, + 0x00000000, 0x24020002, 0x3c010800, 0x0a0001e5, 0xa0221b88, 0x8ec2083c, + 0x24420001, 0x0a0001e5, 0xaec2083c, 0x0e0004bf, 0x00000000, 0x8fbf002c, 0x8fb60028, 0x8fb50024, 0x8fb40020, 0x8fb3001c, 0x8fb20018, 0x8fb10014, 0x8fb00010, 0x03e00008, 0x27bd0030, 0x27bdffd0, 0xafbf0028, 0xafb30024, 0xafb20020, 0xafb1001c, 0xafb00018, 0x8f725c9c, 0x3c0200ff, 0x3442fff8, - 0x3c060800, 0x24c61bc4, 0x02428824, 0x9623000e, 0x8cc20000, 0x00431021, - 0xacc20000, 0x8e220010, 0x30420020, 0x14400011, 0x00809821, 0x0e000643, + 0x3c060800, 0x24c61ba4, 0x02428824, 0x9623000e, 0x8cc20000, 0x00431021, + 0xacc20000, 0x8e220010, 0x30420020, 0x14400011, 0x00809821, 0x0e000637, 0x02202021, 0x3c02c000, 0x02421825, 0xaf635c9c, 0x8f625c90, 0x30420002, 0x10400121, 0x00000000, 0xaf635c9c, 0x8f625c90, 0x30420002, 0x1040011c, - 0x00000000, 0x0a000200, 0x00000000, 0x8e240008, 0x8e230014, 0x00041402, + 0x00000000, 0x0a00020c, 0x00000000, 0x8e240008, 0x8e230014, 0x00041402, 0x000241c0, 0x00031502, 0x304201ff, 0x2442ffff, 0x3042007f, 0x00031942, 0x30637800, 0x00021100, 0x24424000, 0x00625021, 0x9542000a, 0x3084ffff, - 0x30420008, 
0x104000b3, 0x000429c0, 0x3c020800, 0x8c422410, 0x1440002d, - 0x25050008, 0x95020014, 0x3c010800, 0xa42223e0, 0x8d070010, 0x00071402, - 0x3c010800, 0xa42223e2, 0x3c010800, 0xa42723e4, 0x9502000e, 0x30e3ffff, - 0x00431023, 0x3c010800, 0xac222418, 0x8f626800, 0x3c030010, 0x00431024, - 0x10400005, 0x00000000, 0x9503001a, 0x9502001c, 0x0a000235, 0x00431021, - 0x9502001a, 0x3c010800, 0xac22240c, 0x3c02c000, 0x02421825, 0x3c010800, - 0xac282410, 0x3c010800, 0xac322414, 0xaf635c9c, 0x8f625c90, 0x30420002, + 0x30420008, 0x104000b3, 0x000429c0, 0x3c020800, 0x8c4223f0, 0x1440002d, + 0x25050008, 0x95020014, 0x3c010800, 0xa42223c0, 0x8d070010, 0x00071402, + 0x3c010800, 0xa42223c2, 0x3c010800, 0xa42723c4, 0x9502000e, 0x30e3ffff, + 0x00431023, 0x3c010800, 0xac2223f8, 0x8f626800, 0x3c030010, 0x00431024, + 0x10400005, 0x00000000, 0x9503001a, 0x9502001c, 0x0a000241, 0x00431021, + 0x9502001a, 0x3c010800, 0xac2223ec, 0x3c02c000, 0x02421825, 0x3c010800, + 0xac2823f0, 0x3c010800, 0xac3223f4, 0xaf635c9c, 0x8f625c90, 0x30420002, 0x104000df, 0x00000000, 0xaf635c9c, 0x8f625c90, 0x30420002, 0x104000da, - 0x00000000, 0x0a000242, 0x00000000, 0x9502000e, 0x3c030800, 0x946323e4, + 0x00000000, 0x0a00024e, 0x00000000, 0x9502000e, 0x3c030800, 0x946323c4, 0x00434823, 0x3123ffff, 0x2c620008, 0x1040001c, 0x00000000, 0x95020014, 0x24420028, 0x00a22821, 0x00031042, 0x1840000b, 0x00002021, 0x24c60848, 0x00403821, 0x94a30000, 0x8cc20000, 0x24840001, 0x00431021, 0xacc20000, 0x0087102a, 0x1440fff9, 0x24a50002, 0x31220001, 0x1040001f, 0x3c024000, - 0x3c040800, 0x2484240c, 0xa0a00001, 0x94a30000, 0x8c820000, 0x00431021, - 0x0a000281, 0xac820000, 0x8f626800, 0x3c030010, 0x00431024, 0x10400009, - 0x00000000, 0x9502001a, 0x3c030800, 0x8c63240c, 0x00431021, 0x3c010800, - 0xac22240c, 0x0a000282, 0x3c024000, 0x9502001a, 0x9504001c, 0x3c030800, - 0x8c63240c, 0x00441023, 0x00621821, 0x3c010800, 0xac23240c, 0x3c024000, + 0x3c040800, 0x248423ec, 0xa0a00001, 0x94a30000, 0x8c820000, 0x00431021, + 0x0a00028d, 0xac820000, 
0x8f626800, 0x3c030010, 0x00431024, 0x10400009, + 0x00000000, 0x9502001a, 0x3c030800, 0x8c6323ec, 0x00431021, 0x3c010800, + 0xac2223ec, 0x0a00028e, 0x3c024000, 0x9502001a, 0x9504001c, 0x3c030800, + 0x8c6323ec, 0x00441023, 0x00621821, 0x3c010800, 0xac2323ec, 0x3c024000, 0x02421825, 0xaf635c9c, 0x8f625c90, 0x30420002, 0x1440fffc, 0x00000000, - 0x9542000a, 0x30420010, 0x10400095, 0x00000000, 0x3c060800, 0x24c62410, - 0x3c020800, 0x944223e4, 0x8cc50000, 0x3c040800, 0x8c842418, 0x24420030, - 0x00a22821, 0x94a20004, 0x3c030800, 0x8c63240c, 0x00441023, 0x00621821, + 0x9542000a, 0x30420010, 0x10400095, 0x00000000, 0x3c060800, 0x24c623f0, + 0x3c020800, 0x944223c4, 0x8cc50000, 0x3c040800, 0x8c8423f8, 0x24420030, + 0x00a22821, 0x94a20004, 0x3c030800, 0x8c6323ec, 0x00441023, 0x00621821, 0x00603821, 0x00032402, 0x30e2ffff, 0x00823821, 0x00071402, 0x00e23821, - 0x00071027, 0x3c010800, 0xac23240c, 0xa4a20006, 0x3c030800, 0x8c632414, + 0x00071027, 0x3c010800, 0xac2323ec, 0xa4a20006, 0x3c030800, 0x8c6323f4, 0x3c0200ff, 0x3442fff8, 0x00628824, 0x96220008, 0x24040001, 0x24034000, 0x000241c0, 0x00e01021, 0xa502001a, 0xa500001c, 0xacc00000, 0x3c010800, - 0xac241b70, 0xaf635cb8, 0x8f625cb0, 0x30420002, 0x10400003, 0x00000000, - 0x3c010800, 0xac201b70, 0x8e220008, 0xaf625cb8, 0x8f625cb0, 0x30420002, - 0x10400003, 0x00000000, 0x3c010800, 0xac201b70, 0x3c020800, 0x8c421b70, - 0x1040ffec, 0x00000000, 0x3c040800, 0x0e000643, 0x8c842414, 0x0a000320, - 0x00000000, 0x3c030800, 0x90631ba8, 0x24020002, 0x14620003, 0x3c034b65, - 0x0a0002d7, 0x00008021, 0x8e22001c, 0x34637654, 0x10430002, 0x24100002, - 0x24100001, 0x01002021, 0x0e000346, 0x02003021, 0x24020003, 0x3c010800, - 0xa0221ba8, 0x24020002, 0x1202000a, 0x24020001, 0x3c030800, 0x8c632400, - 0x10620006, 0x00000000, 0x3c020800, 0x944223e8, 0x00021400, 0x0a000315, - 0xae220014, 0x3c040800, 0x248423ea, 0x94820000, 0x00021400, 0xae220014, - 0x3c020800, 0x8c421bcc, 0x3c03c000, 0x3c010800, 0xa0201ba8, 0x00431025, + 0xac241b50, 0xaf635cb8, 
0x8f625cb0, 0x30420002, 0x10400003, 0x00000000, + 0x3c010800, 0xac201b50, 0x8e220008, 0xaf625cb8, 0x8f625cb0, 0x30420002, + 0x10400003, 0x00000000, 0x3c010800, 0xac201b50, 0x3c020800, 0x8c421b50, + 0x1040ffec, 0x00000000, 0x3c040800, 0x0e000637, 0x8c8423f4, 0x0a00032c, + 0x00000000, 0x3c030800, 0x90631b88, 0x24020002, 0x14620003, 0x3c034b65, + 0x0a0002e3, 0x00008021, 0x8e22001c, 0x34637654, 0x10430002, 0x24100002, + 0x24100001, 0x01002021, 0x0e000352, 0x02003021, 0x24020003, 0x3c010800, + 0xa0221b88, 0x24020002, 0x1202000a, 0x24020001, 0x3c030800, 0x8c6323e0, + 0x10620006, 0x00000000, 0x3c020800, 0x944223c8, 0x00021400, 0x0a000321, + 0xae220014, 0x3c040800, 0x248423ca, 0x94820000, 0x00021400, 0xae220014, + 0x3c020800, 0x8c421bac, 0x3c03c000, 0x3c010800, 0xa0201b88, 0x00431025, 0xaf625c5c, 0x8f625c50, 0x30420002, 0x10400009, 0x00000000, 0x2484f7e2, 0x8c820000, 0x00431025, 0xaf625c5c, 0x8f625c50, 0x30420002, 0x1440fffa, - 0x00000000, 0x3c020800, 0x24421b94, 0x8c430000, 0x24630001, 0xac430000, + 0x00000000, 0x3c020800, 0x24421b74, 0x8c430000, 0x24630001, 0xac430000, 0x8f630c14, 0x3063000f, 0x2c620002, 0x1440000c, 0x3c024000, 0x8f630c14, - 0x3c020800, 0x8c421b50, 0x3063000f, 0x24420001, 0x3c010800, 0xac221b50, + 0x3c020800, 0x8c421b30, 0x3063000f, 0x24420001, 0x3c010800, 0xac221b30, 0x2c620002, 0x1040fff7, 0x00000000, 0x3c024000, 0x02421825, 0xaf635c9c, 0x8f625c90, 0x30420002, 0x1440fffc, 0x00000000, 0x12600003, 0x00000000, - 0x0e0004b9, 0x00000000, 0x8fbf0028, 0x8fb30024, 0x8fb20020, 0x8fb1001c, - 0x8fb00018, 0x03e00008, 0x27bd0030, 0x8f634450, 0x3c040800, 0x24841b98, + 0x0e0004bf, 0x00000000, 0x8fbf0028, 0x8fb30024, 0x8fb20020, 0x8fb1001c, + 0x8fb00018, 0x03e00008, 0x27bd0030, 0x8f634450, 0x3c040800, 0x24841b78, 0x8c820000, 0x00031c02, 0x0043102b, 0x14400007, 0x3c038000, 0x8c840004, 0x8f624450, 0x00021c02, 0x0083102b, 0x1040fffc, 0x3c038000, 0xaf634444, 0x8f624444, 0x00431024, 0x1440fffd, 0x00000000, 0x8f624448, 0x03e00008, 0x3042ffff, 0x3c024000, 0x00822025, 
0xaf645c38, 0x8f625c30, 0x30420002, 0x1440fffc, 0x00000000, 0x03e00008, 0x00000000, 0x27bdffe0, 0x00805821, - 0x14c00017, 0x256e0008, 0x3c020800, 0x8c422404, 0x1040000a, 0x2402003e, - 0x3c010800, 0xa42223e0, 0x24020016, 0x3c010800, 0xa42223e2, 0x2402002a, - 0x3c010800, 0x0a000360, 0xa42223e4, 0x95620014, 0x3c010800, 0xa42223e0, - 0x8d670010, 0x00071402, 0x3c010800, 0xa42223e2, 0x3c010800, 0xa42723e4, - 0x3c040800, 0x948423e4, 0x3c030800, 0x946323e2, 0x95cf0006, 0x3c020800, - 0x944223e0, 0x00832023, 0x01e2c023, 0x3065ffff, 0x24a20028, 0x01c24821, + 0x14c00011, 0x256e0008, 0x3c020800, 0x8c4223e4, 0x10400007, 0x24020016, + 0x3c010800, 0xa42223c2, 0x2402002a, 0x3c010800, 0x0a000366, 0xa42223c4, + 0x8d670010, 0x00071402, 0x3c010800, 0xa42223c2, 0x3c010800, 0xa42723c4, + 0x3c040800, 0x948423c4, 0x3c030800, 0x946323c2, 0x95cf0006, 0x3c020800, + 0x944223c0, 0x00832023, 0x01e2c023, 0x3065ffff, 0x24a20028, 0x01c24821, 0x3082ffff, 0x14c0001a, 0x01226021, 0x9582000c, 0x3042003f, 0x3c010800, - 0xa42223e6, 0x95820004, 0x95830006, 0x3c010800, 0xac2023f4, 0x3c010800, - 0xac2023f8, 0x00021400, 0x00431025, 0x3c010800, 0xac221bd0, 0x95220004, - 0x3c010800, 0xa4221bd4, 0x95230002, 0x01e51023, 0x0043102a, 0x10400010, - 0x24020001, 0x3c010800, 0x0a000394, 0xac222408, 0x3c030800, 0x8c6323f8, - 0x3c020800, 0x94421bd4, 0x00431021, 0xa5220004, 0x3c020800, 0x94421bd0, - 0xa5820004, 0x3c020800, 0x8c421bd0, 0xa5820006, 0x3c020800, 0x8c422400, - 0x3c0d0800, 0x8dad23f4, 0x3c0a0800, 0x144000e5, 0x8d4a23f8, 0x3c020800, - 0x94421bd4, 0x004a1821, 0x3063ffff, 0x0062182b, 0x24020002, 0x10c2000d, - 0x01435023, 0x3c020800, 0x944223e6, 0x30420009, 0x10400008, 0x00000000, - 0x9582000c, 0x3042fff6, 0xa582000c, 0x3c020800, 0x944223e6, 0x30420009, - 0x01a26823, 0x3c020800, 0x8c422408, 0x1040004a, 0x01203821, 0x3c020800, - 0x944223e2, 0x00004021, 0xa520000a, 0x01e21023, 0xa5220002, 0x3082ffff, + 0xa42223c6, 0x95820004, 0x95830006, 0x3c010800, 0xac2023d4, 0x3c010800, + 0xac2023d8, 0x00021400, 0x00431025, 
0x3c010800, 0xac221bb0, 0x95220004, + 0x3c010800, 0xa4221bb4, 0x95230002, 0x01e51023, 0x0043102a, 0x10400010, + 0x24020001, 0x3c010800, 0x0a00039a, 0xac2223e8, 0x3c030800, 0x8c6323d8, + 0x3c020800, 0x94421bb4, 0x00431021, 0xa5220004, 0x3c020800, 0x94421bb0, + 0xa5820004, 0x3c020800, 0x8c421bb0, 0xa5820006, 0x3c020800, 0x8c4223e0, + 0x3c0d0800, 0x8dad23d4, 0x3c0a0800, 0x144000e5, 0x8d4a23d8, 0x3c020800, + 0x94421bb4, 0x004a1821, 0x3063ffff, 0x0062182b, 0x24020002, 0x10c2000d, + 0x01435023, 0x3c020800, 0x944223c6, 0x30420009, 0x10400008, 0x00000000, + 0x9582000c, 0x3042fff6, 0xa582000c, 0x3c020800, 0x944223c6, 0x30420009, + 0x01a26823, 0x3c020800, 0x8c4223e8, 0x1040004a, 0x01203821, 0x3c020800, + 0x944223c2, 0x00004021, 0xa520000a, 0x01e21023, 0xa5220002, 0x3082ffff, 0x00021042, 0x18400008, 0x00003021, 0x00401821, 0x94e20000, 0x25080001, 0x00c23021, 0x0103102a, 0x1440fffb, 0x24e70002, 0x00061c02, 0x30c2ffff, 0x00623021, 0x00061402, 0x00c23021, 0x00c02821, 0x00061027, 0xa522000a, @@ -4044,131 +4057,127 @@ 0x30e2007f, 0x14400006, 0x25080001, 0x8d630000, 0x3c02007f, 0x3442ff80, 0x00625824, 0x25670008, 0x0109102a, 0x1440fff3, 0x00000000, 0x30820001, 0x10400005, 0x00061c02, 0xa0e00001, 0x94e20000, 0x00c23021, 0x00061c02, - 0x30c2ffff, 0x00623021, 0x00061402, 0x00c23021, 0x0a000479, 0x30c6ffff, - 0x24020002, 0x14c20081, 0x00000000, 0x3c020800, 0x8c42241c, 0x14400007, - 0x00000000, 0x3c020800, 0x944223e2, 0x95230002, 0x01e21023, 0x10620077, - 0x00000000, 0x3c020800, 0x944223e2, 0x01e21023, 0xa5220002, 0x3c020800, - 0x8c42241c, 0x1040001a, 0x31e3ffff, 0x8dc70010, 0x3c020800, 0x94421ba6, + 0x30c2ffff, 0x00623021, 0x00061402, 0x00c23021, 0x0a00047f, 0x30c6ffff, + 0x24020002, 0x14c20081, 0x00000000, 0x3c020800, 0x8c4223fc, 0x14400007, + 0x00000000, 0x3c020800, 0x944223c2, 0x95230002, 0x01e21023, 0x10620077, + 0x00000000, 0x3c020800, 0x944223c2, 0x01e21023, 0xa5220002, 0x3c020800, + 0x8c4223fc, 0x1040001a, 0x31e3ffff, 0x8dc70010, 0x3c020800, 0x94421b86, 0x00e04021, 0x00072c02, 
0x00aa2021, 0x00431023, 0x00823823, 0x00072402, 0x30e2ffff, 0x00823821, 0x00071027, 0xa522000a, 0x3102ffff, 0x3c040800, - 0x948423e4, 0x00453023, 0x00e02821, 0x00641823, 0x006d1821, 0x00c33021, - 0x00061c02, 0x30c2ffff, 0x0a000479, 0x00623021, 0x01203821, 0x00004021, + 0x948423c4, 0x00453023, 0x00e02821, 0x00641823, 0x006d1821, 0x00c33021, + 0x00061c02, 0x30c2ffff, 0x0a00047f, 0x00623021, 0x01203821, 0x00004021, 0x3082ffff, 0x00021042, 0x18400008, 0x00003021, 0x00401821, 0x94e20000, 0x25080001, 0x00c23021, 0x0103102a, 0x1440fffb, 0x24e70002, 0x00061c02, 0x30c2ffff, 0x00623021, 0x00061402, 0x00c23021, 0x00c02821, 0x00061027, 0xa522000a, 0x00003021, 0x2527000c, 0x00004021, 0x94e20000, 0x25080001, 0x00c23021, 0x2d020004, 0x1440fffb, 0x24e70002, 0x95220002, 0x00004021, 0x91230009, 0x00442023, 0x01803821, 0x3082ffff, 0xa4e00010, 0x3c040800, - 0x948423e4, 0x00621821, 0x00c33021, 0x00061c02, 0x30c2ffff, 0x00623021, - 0x00061c02, 0x3c020800, 0x944223e0, 0x00c34821, 0x00441023, 0x00021fc2, + 0x948423c4, 0x00621821, 0x00c33021, 0x00061c02, 0x30c2ffff, 0x00623021, + 0x00061c02, 0x3c020800, 0x944223c0, 0x00c34821, 0x00441023, 0x00021fc2, 0x00431021, 0x00021043, 0x18400010, 0x00003021, 0x00402021, 0x94e20000, 0x24e70002, 0x00c23021, 0x30e2007f, 0x14400006, 0x25080001, 0x8d630000, 0x3c02007f, 0x3442ff80, 0x00625824, 0x25670008, 0x0104102a, 0x1440fff3, - 0x00000000, 0x3c020800, 0x944223fc, 0x00c23021, 0x3122ffff, 0x00c23021, + 0x00000000, 0x3c020800, 0x944223dc, 0x00c23021, 0x3122ffff, 0x00c23021, 0x00061c02, 0x30c2ffff, 0x00623021, 0x00061402, 0x00c23021, 0x00c04021, - 0x00061027, 0xa5820010, 0xadc00014, 0x0a000499, 0xadc00000, 0x8dc70010, + 0x00061027, 0xa5820010, 0xadc00014, 0x0a00049f, 0xadc00000, 0x8dc70010, 0x00e04021, 0x11400007, 0x00072c02, 0x00aa3021, 0x00061402, 0x30c3ffff, 0x00433021, 0x00061402, 0x00c22821, 0x00051027, 0xa522000a, 0x3c030800, - 0x946323e4, 0x3102ffff, 0x01e21021, 0x00433023, 0x00cd3021, 0x00061c02, + 0x946323c4, 0x3102ffff, 0x01e21021, 0x00433023, 
0x00cd3021, 0x00061c02, 0x30c2ffff, 0x00623021, 0x00061402, 0x00c23021, 0x00c04021, 0x00061027, 0xa5820010, 0x3102ffff, 0x00051c00, 0x00431025, 0xadc20010, 0x3c020800, - 0x8c422404, 0x10400002, 0x25e2fff2, 0xa5c20034, 0x3c020800, 0x8c4223f8, - 0x3c040800, 0x8c8423f4, 0x24420001, 0x3c010800, 0xac2223f8, 0x3c020800, - 0x8c421bd0, 0x3303ffff, 0x00832021, 0x3c010800, 0xac2423f4, 0x00431821, - 0x0062102b, 0x10400003, 0x2482ffff, 0x3c010800, 0xac2223f4, 0x3c010800, - 0xac231bd0, 0x03e00008, 0x27bd0020, 0x27bdffb8, 0x3c050800, 0x24a51ba8, + 0x8c4223e4, 0x10400002, 0x25e2fff2, 0xa5c20034, 0x3c020800, 0x8c4223d8, + 0x3c040800, 0x8c8423d4, 0x24420001, 0x3c010800, 0xac2223d8, 0x3c020800, + 0x8c421bb0, 0x3303ffff, 0x00832021, 0x3c010800, 0xac2423d4, 0x00431821, + 0x0062102b, 0x10400003, 0x2482ffff, 0x3c010800, 0xac2223d4, 0x3c010800, + 0xac231bb0, 0x03e00008, 0x27bd0020, 0x27bdffb8, 0x3c050800, 0x24a51b86, 0xafbf0044, 0xafbe0040, 0xafb7003c, 0xafb60038, 0xafb50034, 0xafb40030, - 0xafb3002c, 0xafb20028, 0xafb10024, 0xafb00020, 0x90a30000, 0x24020003, - 0x146200d5, 0x00000000, 0x3c090800, 0x95291ba6, 0x3c020800, 0x944223e0, - 0x3c030800, 0x8c631bc0, 0x3c040800, 0x8c841bbc, 0x01221023, 0x0064182a, - 0xa7a9001e, 0x106000c8, 0xa7a20016, 0x24be0020, 0x97b6001e, 0x24b30018, - 0x24b70014, 0x8fc20000, 0x14400008, 0x00000000, 0x8fc2fff8, 0x97a30016, - 0x8fc4fff4, 0x00431021, 0x0082202a, 0x148000ba, 0x00000000, 0x97d50818, - 0x32a2ffff, 0x104000ad, 0x00009021, 0x0040a021, 0x00008821, 0x0e00062d, - 0x00000000, 0x00403021, 0x14c00007, 0x00000000, 0x3c020800, 0x8c4223ec, - 0x24420001, 0x3c010800, 0x0a00059e, 0xac2223ec, 0x3c100800, 0x02118021, - 0x8e101bd8, 0x9608000a, 0x31020040, 0x10400004, 0x2407180c, 0x8e02000c, - 0x2407188c, 0xacc20018, 0x31021000, 0x10400004, 0x34e32000, 0x00081040, - 0x3042c000, 0x00623825, 0x31020080, 0x54400001, 0x34e70010, 0x3c020800, - 0x00511021, 0x8c421be0, 0x3c030800, 0x00711821, 0x8c631be4, 0x00021500, - 0x00031c00, 0x00431025, 0xacc20014, 0x96040008, 
0x3242ffff, 0x00821021, - 0x0282102a, 0x14400002, 0x02b22823, 0x00802821, 0x8e020000, 0x02459021, - 0xacc20000, 0x8e020004, 0x00c02021, 0x26310010, 0xac820004, 0x30e2ffff, - 0xac800008, 0xa485000e, 0xac820010, 0x24020305, 0x0e0005aa, 0xa482000c, - 0x3242ffff, 0x0054102b, 0x1440ffc0, 0x3242ffff, 0x0a000596, 0x00000000, - 0x8e620000, 0x8e63fffc, 0x0043102a, 0x1040006c, 0x00000000, 0x8e62fff0, - 0x00028900, 0x3c100800, 0x02118021, 0x0e00062d, 0x8e101bd8, 0x00403021, - 0x14c00005, 0x00000000, 0x8e62082c, 0x24420001, 0x0a00059e, 0xae62082c, - 0x9608000a, 0x31020040, 0x10400004, 0x2407180c, 0x8e02000c, 0x2407188c, - 0xacc20018, 0x31021000, 0x10400004, 0x34e32000, 0x00081040, 0x3042c000, - 0x00623825, 0x3c020800, 0x00511021, 0x8c421be0, 0x3c030800, 0x00711821, - 0x8c631be4, 0x00021500, 0x00031c00, 0x00431025, 0xacc20014, 0x8e63fff4, - 0x96020008, 0x00432023, 0x3242ffff, 0x3083ffff, 0x00431021, 0x02c2102a, - 0x10400003, 0x00802821, 0x97a9001e, 0x01322823, 0x8e620000, 0x30a4ffff, - 0x00441021, 0xae620000, 0xa4c5000e, 0x8e020000, 0xacc20000, 0x8e020004, - 0x8e63fff4, 0x00431021, 0xacc20004, 0x8e63fff4, 0x96020008, 0x00641821, - 0x0062102a, 0x14400006, 0x02459021, 0x8e62fff0, 0xae60fff4, 0x24420001, - 0x0a000579, 0xae62fff0, 0xae63fff4, 0xacc00008, 0x3242ffff, 0x10560003, - 0x31020004, 0x10400006, 0x24020305, 0x31020080, 0x54400001, 0x34e70010, - 0x34e70020, 0x24020905, 0xa4c2000c, 0x8ee30000, 0x8ee20004, 0x14620007, - 0x3c02b49a, 0x8ee20860, 0x54400001, 0x34e70400, 0x3c024b65, 0x0a000590, - 0x34427654, 0x344289ab, 0xacc2001c, 0x30e2ffff, 0xacc20010, 0x0e0005aa, - 0x00c02021, 0x3242ffff, 0x0056102b, 0x1440ff96, 0x00000000, 0x8e620000, - 0x8e63fffc, 0x0043102a, 0x1440ff3e, 0x00000000, 0x8fbf0044, 0x8fbe0040, - 0x8fb7003c, 0x8fb60038, 0x8fb50034, 0x8fb40030, 0x8fb3002c, 0x8fb20028, - 0x8fb10024, 0x8fb00020, 0x03e00008, 0x27bd0048, 0x27bdffe8, 0xafbf0014, - 0xafb00010, 0x8f624450, 0x8f634410, 0x0a0005b9, 0x00808021, 0x8f626820, - 0x30422000, 0x10400003, 0x00000000, 0x0e0001e3, 
0x00002021, 0x8f624450, - 0x8f634410, 0x3042ffff, 0x0043102b, 0x1440fff5, 0x00000000, 0x8f630c14, - 0x3063000f, 0x2c620002, 0x1440000b, 0x00000000, 0x8f630c14, 0x3c020800, - 0x8c421b50, 0x3063000f, 0x24420001, 0x3c010800, 0xac221b50, 0x2c620002, - 0x1040fff7, 0x00000000, 0xaf705c18, 0x8f625c10, 0x30420002, 0x10400009, - 0x00000000, 0x8f626820, 0x30422000, 0x1040fff8, 0x00000000, 0x0e0001e3, - 0x00002021, 0x0a0005cc, 0x00000000, 0x8fbf0014, 0x8fb00010, 0x03e00008, - 0x27bd0018, 0x00000000, 0x00000000, 0x00000000, 0x27bdffe8, 0x3c1bc000, + 0xafb3002c, 0xafb20028, 0xafb10024, 0xafb00020, 0x94a90000, 0x3c020800, + 0x944223c0, 0x3c030800, 0x8c631ba0, 0x3c040800, 0x8c841b9c, 0x01221023, + 0x0064182a, 0xa7a9001e, 0x106000bc, 0xa7a20016, 0x24be0022, 0x97b6001e, + 0x24b3001a, 0x24b70016, 0x8fc20000, 0x14400008, 0x00000000, 0x8fc2fff8, + 0x97a30016, 0x8fc4fff4, 0x00431021, 0x0082202a, 0x148000ae, 0x00000000, + 0x97d50818, 0x32a2ffff, 0x104000a1, 0x00009021, 0x0040a021, 0x00008821, + 0x0e000621, 0x00000000, 0x00403021, 0x14c00007, 0x00000000, 0x3c020800, + 0x8c4223cc, 0x24420001, 0x3c010800, 0x0a000593, 0xac2223cc, 0x3c100800, + 0x02118021, 0x8e101bb8, 0x9608000a, 0x31020040, 0x10400004, 0x2407180c, + 0x8e02000c, 0x2407188c, 0xacc20018, 0x31020080, 0x54400001, 0x34e70010, + 0x3c020800, 0x00511021, 0x8c421bc0, 0x3c030800, 0x00711821, 0x8c631bc4, + 0x00021500, 0x00031c00, 0x00431025, 0xacc20014, 0x96040008, 0x3242ffff, + 0x00821021, 0x0282102a, 0x14400002, 0x02b22823, 0x00802821, 0x8e020000, + 0x02459021, 0xacc20000, 0x8e020004, 0x00c02021, 0x26310010, 0xac820004, + 0x30e2ffff, 0xac800008, 0xa485000e, 0xac820010, 0x24020305, 0x0e00059f, + 0xa482000c, 0x3242ffff, 0x0054102b, 0x1440ffc6, 0x3242ffff, 0x0a00058b, + 0x00000000, 0x8e620000, 0x8e63fffc, 0x0043102a, 0x10400066, 0x00000000, + 0x8e62fff0, 0x00028900, 0x3c100800, 0x02118021, 0x0e000621, 0x8e101bb8, + 0x00403021, 0x14c00005, 0x00000000, 0x8e62082c, 0x24420001, 0x0a000593, + 0xae62082c, 0x9608000a, 0x31020040, 0x10400004, 
0x2407180c, 0x8e02000c, + 0x2407188c, 0xacc20018, 0x3c020800, 0x00511021, 0x8c421bc0, 0x3c030800, + 0x00711821, 0x8c631bc4, 0x00021500, 0x00031c00, 0x00431025, 0xacc20014, + 0x8e63fff4, 0x96020008, 0x00432023, 0x3242ffff, 0x3083ffff, 0x00431021, + 0x02c2102a, 0x10400003, 0x00802821, 0x97a9001e, 0x01322823, 0x8e620000, + 0x30a4ffff, 0x00441021, 0xae620000, 0xa4c5000e, 0x8e020000, 0xacc20000, + 0x8e020004, 0x8e63fff4, 0x00431021, 0xacc20004, 0x8e63fff4, 0x96020008, + 0x00641821, 0x0062102a, 0x14400006, 0x02459021, 0x8e62fff0, 0xae60fff4, + 0x24420001, 0x0a00056e, 0xae62fff0, 0xae63fff4, 0xacc00008, 0x3242ffff, + 0x10560003, 0x31020004, 0x10400006, 0x24020305, 0x31020080, 0x54400001, + 0x34e70010, 0x34e70020, 0x24020905, 0xa4c2000c, 0x8ee30000, 0x8ee20004, + 0x14620007, 0x3c02b49a, 0x8ee20860, 0x54400001, 0x34e70400, 0x3c024b65, + 0x0a000585, 0x34427654, 0x344289ab, 0xacc2001c, 0x30e2ffff, 0xacc20010, + 0x0e00059f, 0x00c02021, 0x3242ffff, 0x0056102b, 0x1440ff9c, 0x00000000, + 0x8e620000, 0x8e63fffc, 0x0043102a, 0x1440ff4a, 0x00000000, 0x8fbf0044, + 0x8fbe0040, 0x8fb7003c, 0x8fb60038, 0x8fb50034, 0x8fb40030, 0x8fb3002c, + 0x8fb20028, 0x8fb10024, 0x8fb00020, 0x03e00008, 0x27bd0048, 0x27bdffe8, + 0xafbf0014, 0xafb00010, 0x8f624450, 0x8f634410, 0x0a0005ae, 0x00808021, + 0x8f626820, 0x30422000, 0x10400003, 0x00000000, 0x0e0001ef, 0x00002021, + 0x8f624450, 0x8f634410, 0x3042ffff, 0x0043102b, 0x1440fff5, 0x00000000, + 0x8f630c14, 0x3063000f, 0x2c620002, 0x1440000b, 0x00000000, 0x8f630c14, + 0x3c020800, 0x8c421b30, 0x3063000f, 0x24420001, 0x3c010800, 0xac221b30, + 0x2c620002, 0x1040fff7, 0x00000000, 0xaf705c18, 0x8f625c10, 0x30420002, + 0x10400009, 0x00000000, 0x8f626820, 0x30422000, 0x1040fff8, 0x00000000, + 0x0e0001ef, 0x00002021, 0x0a0005c1, 0x00000000, 0x8fbf0014, 0x8fb00010, + 0x03e00008, 0x27bd0018, 0x00000000, 0x00000000, 0x27bdffe8, 0x3c1bc000, 0xafbf0014, 0xafb00010, 0xaf60680c, 0x8f626804, 0x34420082, 0xaf626804, - 0x8f634000, 0x24020b50, 0x3c010800, 0xac221b64, 
0x24020b78, 0x3c010800, - 0xac221b74, 0x34630002, 0xaf634000, 0x0e00060d, 0x00808021, 0x3c010800, - 0xa0221b78, 0x304200ff, 0x24030002, 0x14430005, 0x00000000, 0x3c020800, - 0x8c421b64, 0x0a000600, 0xac5000c0, 0x3c020800, 0x8c421b64, 0xac5000bc, - 0x8f624434, 0x8f634438, 0x8f644410, 0x3c010800, 0xac221b6c, 0x3c010800, - 0xac231b7c, 0x3c010800, 0xac241b68, 0x8fbf0014, 0x8fb00010, 0x03e00008, + 0x8f634000, 0x24020b50, 0x3c010800, 0xac221b44, 0x24020b78, 0x3c010800, + 0xac221b54, 0x34630002, 0xaf634000, 0x0e000601, 0x00808021, 0x3c010800, + 0xa0221b58, 0x304200ff, 0x24030002, 0x14430005, 0x00000000, 0x3c020800, + 0x8c421b44, 0x0a0005f4, 0xac5000c0, 0x3c020800, 0x8c421b44, 0xac5000bc, + 0x8f624434, 0x8f634438, 0x8f644410, 0x3c010800, 0xac221b4c, 0x3c010800, + 0xac231b5c, 0x3c010800, 0xac241b48, 0x8fbf0014, 0x8fb00010, 0x03e00008, 0x27bd0018, 0x3c040800, 0x8c870000, 0x3c03aa55, 0x3463aa55, 0x3c06c003, 0xac830000, 0x8cc20000, 0x14430007, 0x24050002, 0x3c0355aa, 0x346355aa, 0xac830000, 0x8cc20000, 0x50430001, 0x24050001, 0x3c020800, 0xac470000, 0x03e00008, 0x00a01021, 0x27bdfff8, 0x18800009, 0x00002821, 0x8f63680c, 0x8f62680c, 0x1043fffe, 0x00000000, 0x24a50001, 0x00a4102a, 0x1440fff9, - 0x00000000, 0x03e00008, 0x27bd0008, 0x8f634450, 0x3c020800, 0x8c421b6c, - 0x00031c02, 0x0043102b, 0x14400008, 0x3c038000, 0x3c040800, 0x8c841b7c, + 0x00000000, 0x03e00008, 0x27bd0008, 0x8f634450, 0x3c020800, 0x8c421b4c, + 0x00031c02, 0x0043102b, 0x14400008, 0x3c038000, 0x3c040800, 0x8c841b5c, 0x8f624450, 0x00021c02, 0x0083102b, 0x1040fffc, 0x3c038000, 0xaf634444, 0x8f624444, 0x00431024, 0x1440fffd, 0x00000000, 0x8f624448, 0x03e00008, 0x3042ffff, 0x3082ffff, 0x2442e000, 0x2c422001, 0x14400003, 0x3c024000, - 0x0a000650, 0x2402ffff, 0x00822025, 0xaf645c38, 0x8f625c30, 0x30420002, + 0x0a000644, 0x2402ffff, 0x00822025, 0xaf645c38, 0x8f625c30, 0x30420002, 0x1440fffc, 0x00001021, 0x03e00008, 0x00000000, 0x8f624450, 0x3c030800, - 0x8c631b68, 0x0a000659, 0x3042ffff, 0x8f624450, 0x3042ffff, 
0x0043102b, + 0x8c631b48, 0x0a00064d, 0x3042ffff, 0x8f624450, 0x3042ffff, 0x0043102b, 0x1440fffc, 0x00000000, 0x03e00008, 0x00000000, 0x27bdffe0, 0x00802821, - 0x3c040800, 0x24841b10, 0x00003021, 0x00003821, 0xafbf0018, 0xafa00010, - 0x0e000684, 0xafa00014, 0x0a000668, 0x00000000, 0x8fbf0018, 0x03e00008, + 0x3c040800, 0x24841ae0, 0x00003021, 0x00003821, 0xafbf0018, 0xafa00010, + 0x0e000678, 0xafa00014, 0x0a00065c, 0x00000000, 0x8fbf0018, 0x03e00008, 0x27bd0020, 0x00000000, 0x00000000, 0x00000000, 0x3c020800, 0x34423000, - 0x3c030800, 0x34633000, 0x3c040800, 0x348437ff, 0x3c010800, 0xac221b84, - 0x24020040, 0x3c010800, 0xac221b88, 0x3c010800, 0xac201b80, 0xac600000, + 0x3c030800, 0x34633000, 0x3c040800, 0x348437ff, 0x3c010800, 0xac221b64, + 0x24020040, 0x3c010800, 0xac221b68, 0x3c010800, 0xac201b60, 0xac600000, 0x24630004, 0x0083102b, 0x5040fffd, 0xac600000, 0x03e00008, 0x00000000, - 0x00804821, 0x8faa0010, 0x3c020800, 0x8c421b80, 0x3c040800, 0x8c841b88, - 0x8fab0014, 0x24430001, 0x0044102b, 0x3c010800, 0xac231b80, 0x14400003, - 0x00004021, 0x3c010800, 0xac201b80, 0x3c020800, 0x8c421b80, 0x3c030800, - 0x8c631b84, 0x91240000, 0x00021140, 0x00431021, 0x00481021, 0x25080001, - 0xa0440000, 0x29020008, 0x1440fff4, 0x25290001, 0x3c020800, 0x8c421b80, - 0x3c030800, 0x8c631b84, 0x8f64680c, 0x00021140, 0x00431021, 0xac440008, + 0x00804821, 0x8faa0010, 0x3c020800, 0x8c421b60, 0x3c040800, 0x8c841b68, + 0x8fab0014, 0x24430001, 0x0044102b, 0x3c010800, 0xac231b60, 0x14400003, + 0x00004021, 0x3c010800, 0xac201b60, 0x3c020800, 0x8c421b60, 0x3c030800, + 0x8c631b64, 0x91240000, 0x00021140, 0x00431021, 0x00481021, 0x25080001, + 0xa0440000, 0x29020008, 0x1440fff4, 0x25290001, 0x3c020800, 0x8c421b60, + 0x3c030800, 0x8c631b64, 0x8f64680c, 0x00021140, 0x00431021, 0xac440008, 0xac45000c, 0xac460010, 0xac470014, 0xac4a0018, 0x03e00008, 0xac4b001c, 0x00000000, 0x00000000, }; u32 tg3TsoFwRodata[] = { - 0x4d61696e, 0x43707542, 0x00000000, 0x4d61696e, 0x43707541, - 0x00000000, 0x00000000, 
0x00000000, 0x73746b6f, 0x66666c64, - 0x496e0000, 0x73746b6f, 0x66662a2a, 0x00000000, 0x53774576, - 0x656e7430, 0x00000000, 0x00000000, 0x00000000, 0x00000000, - 0x66617461, 0x6c457272, 0x00000000, 0x00000000, 0x00000000 + 0x4d61696e, 0x43707542, 0x00000000, 0x4d61696e, 0x43707541, 0x00000000, + 0x00000000, 0x00000000, 0x73746b6f, 0x66666c64, 0x496e0000, 0x73746b6f, + 0x66662a2a, 0x00000000, 0x53774576, 0x656e7430, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x66617461, 0x6c457272, 0x00000000, 0x00000000, }; #if 0 /* All zeros, don't eat up space with it. */ From vnuorval@tcs.hut.fi Wed Dec 3 02:26:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 02:26:33 -0800 (PST) Received: from neon.tcs.hut.fi (neon.tcs.hut.fi [130.233.215.20]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3AQ6Ta003330 for ; Wed, 3 Dec 2003 02:26:07 -0800 Received: from rhea.tcs.hut.fi (rhea.tcs.hut.fi [130.233.215.147]) by neon.tcs.hut.fi (Postfix) with ESMTP id 8A4128001C6; Wed, 3 Dec 2003 12:26:04 +0200 (EET) Received: from rhea.tcs.hut.fi (localhost [127.0.0.1]) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-6.6) with ESMTP id hB3AQ4sj029629; Wed, 3 Dec 2003 12:26:04 +0200 Received: from localhost (vnuorval@localhost) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-6.6) with ESMTP id hB3AQ21D029625; Wed, 3 Dec 2003 12:26:03 +0200 Date: Wed, 3 Dec 2003 12:26:02 +0200 (EET) From: Ville Nuorvala To: dahinds@users.sourceforge.net, akpm@digeo.com, jgarzik@pobox.com Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: [2.6 PATCH]: Fix deadlock in driver 3c574_cs.c In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1857 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vnuorval@tcs.hut.fi Precedence: bulk X-list: netdev Hello, I already sent a report about this to linux-net earlier, but I guess it went unnoticed due to thanksgiving... 
The deadlock occurs when update_stats() is called from el3_interrupt(), which already holds the lp->window_lock. The patch below fixes this behavior. Thanks, Ville ===== drivers/net/pcmcia/3c574_cs.c 1.25 vs edited ===== --- 1.25/drivers/net/pcmcia/3c574_cs.c Sat Sep 6 22:50:44 2003 +++ edited/drivers/net/pcmcia/3c574_cs.c Thu Nov 27 10:26:34 2003 @@ -1092,8 +1092,12 @@ { struct el3_private *lp = (struct el3_private *)dev->priv; - if (netif_device_present(dev)) + if (netif_device_present(dev)) { + unsigned long flags; + spin_lock_irqsave(&lp->window_lock, flags); update_stats(dev); + spin_unlock_irqrestore(&lp->window_lock, flags); + } return &lp->stats; } @@ -1105,7 +1109,6 @@ { struct el3_private *lp = (struct el3_private *)dev->priv; ioaddr_t ioaddr = dev->base_addr; - unsigned long flags; u8 rx, tx, up; DEBUG(2, "%s: updating the statistics.\n", dev->name); @@ -1113,8 +1116,6 @@ if (inw(ioaddr+EL3_STATUS) == 0xffff) /* No card. */ return; - spin_lock_irqsave(&lp->window_lock, flags); - /* Unlike the 3c509 we need not turn off stats updates while reading. */ /* Switch to the stats window, and read everything. */ EL3WINDOW(6); @@ -1139,7 +1140,6 @@ lp->stats.tx_bytes += tx + ((up & 0xf0) << 12); EL3WINDOW(1); - spin_unlock_irqrestore(&lp->window_lock, flags); } static int el3_rx(struct net_device *dev, int worklimit) @@ -1281,6 +1281,8 @@ DEBUG(2, "%s: shutting down ethercard.\n", dev->name); if (DEV_OK(link)) { + unsigned long flags; + /* Turn off statistics ASAP. We update lp->stats below. */ outw(StatsDisable, ioaddr + EL3_CMD); @@ -1290,8 +1292,9 @@ /* Note: Switching to window 0 may disable the IRQ. 
*/ EL3WINDOW(0); - + spin_lock_irqsave(&lp->window_lock, flags); update_stats(dev); + spin_unlock_irqrestore(&lp->window_lock, flags); } link->open--; -- Ville Nuorvala Research Assistant, Institute of Digital Communications, Helsinki University of Technology email: vnuorval@tcs.hut.fi, phone: +358 (0)9 451 5257 From herbert@gondor.apana.org.au Wed Dec 3 02:35:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 02:35:54 -0800 (PST) Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3AZbTa003784 for ; Wed, 3 Dec 2003 02:35:38 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1ARULc-0003Xs-00; Wed, 03 Dec 2003 21:35:08 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1ARULX-0000Ny-00; Wed, 03 Dec 2003 21:35:03 +1100 From: Herbert Xu To: hadi@cyberus.ca, netdev@oss.sgi.com Subject: Re: [RTNETLINK] Provide real oif Organization: Core In-Reply-To: <1070410097.1038.27.camel@jzny.localdomain> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.2-20031002 ("Berneray") (UNIX) (Linux/2.4.22-1-686-smp (i686)) Message-Id: Date: Wed, 03 Dec 2003 21:35:03 +1100 X-archive-position: 1858 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev jamal wrote: >> >> Without the real oif you will see entries like this in the output of >> ip r l c: >> >> 192.168.0.7 dev eth0 src 192.168.0.6 >> cache mtu 1500 advmss 1460 metric10 64 >> 192.168.0.7 dev eth0 src 192.168.0.6 >> cache mtu 1500 advmss 1460 metric10 64 >> >> One of those has oif == 0 while the other one has oif == eth0. >> > > They both seem to have an oif of eth0, no? They both have an RTA_OIF of eth0. But RTA_OIF != rt->fl.oif. 
In fact, RTA_OIF is the value of rt->u.dst.dev.

Perhaps a program would be easier to understand. Run the attached program with the arguments x.x.x.x eth0, assuming that x.x.x.x is routed via eth0 on your system. You should then see the apparent duplicate entries in your routing cache.

Cheers,
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~}
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
/*
 * The header names were lost when this message was archived; the set
 * below is an assumed reconstruction covering every call used here.
 */
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

int main(int argc, char **argv)
{
	struct sockaddr_in addr;
	int s;
	char msg = 0;
	int on = 1;
	struct {
		struct cmsghdr hdr;
		struct in_pktinfo pktinfo;
	} ctrl;
	struct ifreq req;
	struct msghdr hdr;
	struct iovec iov;

	if (argc < 3)
		return 1;

	/* Destination: argv[1], UDP port 30000 in network byte order. */
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	inet_aton(argv[1], &addr.sin_addr);
	addr.sin_port = htons(30000);

	/* Control message carrying IP_PKTINFO, used to set the oif. */
	memset(&ctrl.pktinfo, 0, sizeof(ctrl.pktinfo));
	ctrl.hdr.cmsg_len = sizeof(ctrl);
	ctrl.hdr.cmsg_level = SOL_IP;
	ctrl.hdr.cmsg_type = IP_PKTINFO;

	/* Interface name from argv[2], truncated to fit ifr_name. */
	memset(&req, 0, sizeof(req));
	strncpy(req.ifr_name, argv[2], IFNAMSIZ - 1);

	iov.iov_base = &msg;
	iov.iov_len = 1;

	hdr.msg_name = &addr;
	hdr.msg_namelen = sizeof(addr);
	hdr.msg_iov = &iov;
	hdr.msg_iovlen = 1;
	hdr.msg_control = &ctrl;
	hdr.msg_controllen = sizeof(ctrl);
	hdr.msg_flags = 0;

	s = socket(AF_INET, SOCK_DGRAM, 0);
	setsockopt(s, SOL_IP, IP_PKTINFO, &on, sizeof(on));

	/* Look up the interface index of argv[2]. */
	ioctl(s, SIOCGIFINDEX, &req);
	ctrl.pktinfo.ipi_ifindex = req.ifr_ifindex;

	/* First packet: no oif given, so the route is looked up with oif == 0. */
	sendto(s, &msg, 1, 0, (struct sockaddr *)&addr, sizeof(addr));
	/* Second packet: same destination, but with oif set to argv[2]. */
	sendmsg(s, &hdr, 0);

	return 0;
}
From prasanna@in.ibm.com Wed Dec 3 05:51:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 05:51:42 -0800 (PST) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3DpNTa016856 for ; Wed, 3 Dec 2003 05:51:24 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e32.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB3DpHrE066300; Wed,
3 Dec 2003 08:51:17 -0500 Received: from nightmon.in.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB3DpCxa157808; Wed, 3 Dec 2003 06:51:16 -0700 Received: by nightmon.in.ibm.com (Postfix, from userid 500) id 10EB4FB84; Wed, 3 Dec 2003 13:52:17 +0000 (UTC) Date: Wed, 3 Dec 2003 19:22:16 +0530 From: Prasanna S Panchamukhi To: jgarzik@pobox.com Cc: mpm@selenic.com, suparna@in.ibm.com, netdev@oss.sgi.com Subject: [1/4]pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Message-ID: <20031203135216.GA13465@in.ibm.com> Reply-To: prasanna@in.ibm.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 1859 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: prasanna@in.ibm.com Precedence: bulk X-list: netdev Hi Jeff, Below is the pollcontroller patch for e100. This patch can be applied over 2.6.0-test9-bk25-netdrvr-exp1.patch diff -urNp linux.orig/drivers/net/e100/e100_main.c linux-2.6.0-test10/drivers/net/e100/e100_main.c --- linux.orig/drivers/net/e100/e100_main.c 2003-11-26 03:57:38.000000000 +0530 +++ linux-2.6.0-test10/drivers/net/e100/e100_main.c 2003-11-26 03:59:15.554978648 +0530 @@ -544,6 +544,22 @@ e100_trigger_SWI(struct e100_private *bd readw(&(bdp->scb->scb_status)); /* flushes last write, read-safe */ } +#ifdef CONFIG_NET_POLL_CONTROLLER + +/* + * Polling 'interrupt' - used by things like netconsole to send skbs + * without having to re-enable interrupts. It's not called while + * the interrupt routine is executing. 
+ */ +static void +e100_poll(struct net_device *dev) +{ + disable_irq(dev->irq); + e100intr(dev->irq, dev, NULL); + enable_irq(dev->irq); +} +#endif + static int __devinit e100_found1(struct pci_dev *pcid, const struct pci_device_id *ent) { @@ -562,6 +578,9 @@ e100_found1(struct pci_dev *pcid, const SET_MODULE_OWNER(dev); +#ifdef CONFIG_NET_POLL_CONTROLLER + dev->poll_controller = &e100_poll; +#endif if (first_time) { first_time = false; printk(KERN_NOTICE "%s - version %s\n", -- Thanks & Regards Prasanna S Panchamukhi Linux Technology Center India Software Labs, IBM Bangalore Ph: 91-80-5044632 From prasanna@in.ibm.com Wed Dec 3 05:56:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 05:56:47 -0800 (PST) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3DuXTa017253 for ; Wed, 3 Dec 2003 05:56:33 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e31.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB3DuRb0102234; Wed, 3 Dec 2003 08:56:27 -0500 Received: from nightmon.in.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB3DuNxa168940; Wed, 3 Dec 2003 06:56:26 -0700 Received: by nightmon.in.ibm.com (Postfix, from userid 500) id E4DD1FB88; Wed, 3 Dec 2003 13:57:28 +0000 (UTC) Date: Wed, 3 Dec 2003 19:27:28 +0530 From: Prasanna S Panchamukhi To: jgarzik@pobox.com Cc: mpm@selenic.com, suparna@in.ibm.com, netdev@oss.sgi.com Subject: [2/4] pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Message-ID: <20031203135728.GA13585@in.ibm.com> Reply-To: prasanna@in.ibm.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 1860 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: prasanna@in.ibm.com Precedence: bulk X-list: netdev Hi 
Jeff, Below is the pollcontroller patch for e1000. This patch can be applied over 2.6.0-test9-bk25-netdrvr-exp1.patch diff -urNp linux-netdrvr/drivers/net/e1000/e1000_main.c linux-2.6.0-test10/drivers/net/e1000/e1000_main.c --- linux-netdrvr/drivers/net/e1000/e1000_main.c 2003-12-04 11:18:03.000000000 +0530 +++ linux-2.6.0-test10/drivers/net/e1000/e1000_main.c 2003-12-04 11:43:35.412199928 +0530 @@ -338,6 +338,15 @@ e1000_reset(struct e1000_adapter *adapte e1000_phy_get_info(&adapter->hw, &adapter->phy_info); } +#ifdef CONFIG_NET_POLL_CONTROLLER +static void e1000_poll(struct net_device *dev) +{ + disable_irq(dev->irq); + e1000_intr(dev->irq, dev, NULL); + enable_irq(dev->irq); +} +#endif + /** * e1000_probe - Device Initialization Routine * @pdev: PCI device information struct @@ -439,6 +448,9 @@ e1000_probe(struct pci_dev *pdev, adapter->bd_number = cards_found; +#ifdef CONFIG_NET_POLL_CONTROLLER + netdev->poll_controller = &e1000_poll; +#endif /* setup the private structure */ if((err = e1000_sw_init(adapter))) -- Thanks & Regards Prasanna S Panchamukhi Linux Technology Center India Software Labs, IBM Bangalore Ph: 91-80-5044632 From prasanna@in.ibm.com Wed Dec 3 05:59:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 05:59:27 -0800 (PST) Received: from e5.ny.us.ibm.com (e5.ny.us.ibm.com [32.97.182.105]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3DxETa017622 for ; Wed, 3 Dec 2003 05:59:14 -0800 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e5.ny.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB3Dx69n554556; Wed, 3 Dec 2003 08:59:06 -0500 Received: from nightmon.in.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB3Dx2ar106574; Wed, 3 Dec 2003 08:59:04 -0500 Received: by nightmon.in.ibm.com (Postfix, from userid 500) id A9FF8FB88; Wed, 3 Dec 2003 14:00:07 +0000 (UTC) Date: Wed, 3 Dec 2003 19:30:07 +0530 From: Prasanna S Panchamukhi To: 
jgarzik@pobox.com Cc: mpm@selenic.com, suparna@in.ibm.com, netdev@oss.sgi.com Subject: [3/4]pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Message-ID: <20031203140007.GB13585@in.ibm.com> Reply-To: prasanna@in.ibm.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 1861 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: prasanna@in.ibm.com Precedence: bulk X-list: netdev Hi Jeff, Below is the pollcontroller patch for smc ultra net driver. This patch can be applied over 2.6.0-test9-bk25-netdrvr-exp1.patch diff -urNp linux.orig/drivers/net/smc-ultra.c linux-2.6.0-test10/drivers/net/smc-ultra.c --- linux.orig/drivers/net/smc-ultra.c 2003-11-26 03:57:39.000000000 +0530 +++ linux-2.6.0-test10/drivers/net/smc-ultra.c 2003-11-26 03:59:15.630967096 +0530 @@ -121,6 +121,14 @@ MODULE_DEVICE_TABLE(isapnp, ultra_device #define ULTRA_IO_EXTENT 32 #define EN0_ERWCNT 0x08 /* Early receive warning count. */ +#ifdef CONFIG_NET_POLL_CONTROLLER +static void ultra_poll(struct net_device *dev) +{ + disable_irq(dev->irq); + ei_interrupt(dev->irq, dev, NULL); + enable_irq(dev->irq); +} +#endif /* Probe for the Ultra. This looks like a 8013 with the station address PROM at I/O ports +8 to +13, with a checksum following. @@ -134,6 +142,9 @@ static int __init do_ultra_probe(struct SET_MODULE_OWNER(dev); +#ifdef CONFIG_NET_POLL_CONTROLLER + dev->poll_controller = &ultra_poll; +#endif if (base_addr > 0x1ff) /* Check a single specified location. */ return ultra_probe1(dev, base_addr); else if (base_addr != 0) /* Don't probe at all. 
*/ -- Thanks & Regards Prasanna S Panchamukhi Linux Technology Center India Software Labs, IBM Bangalore Ph: 91-80-5044632 From prasanna@in.ibm.com Wed Dec 3 06:01:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 06:01:40 -0800 (PST) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3E1RTa018040 for ; Wed, 3 Dec 2003 06:01:28 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e35.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB3E1Lcp394034; Wed, 3 Dec 2003 09:01:21 -0500 Received: from nightmon.in.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB3E19xa026978; Wed, 3 Dec 2003 07:01:12 -0700 Received: by nightmon.in.ibm.com (Postfix, from userid 500) id D963BFB88; Wed, 3 Dec 2003 14:02:14 +0000 (UTC) Date: Wed, 3 Dec 2003 19:32:14 +0530 From: Prasanna S Panchamukhi To: jgarzik@pobox.com Cc: mpm@selenic.com, suparna@in.ibm.com, netdev@oss.sgi.com Subject: [4/4]pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Message-ID: <20031203140214.GC13585@in.ibm.com> Reply-To: prasanna@in.ibm.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 1862 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: prasanna@in.ibm.com Precedence: bulk X-list: netdev Hi Jeff, Below is the pollcontroller patch for tlan network device driver. 
This patch can be applied over 2.6.0-test9-bk25-netdrvr-exp1.patch diff -urNp linux-netdrvr/drivers/net/tlan.c linux-2.6.0-test10/drivers/net/tlan.c --- linux-netdrvr/drivers/net/tlan.c 2003-11-24 07:01:12.000000000 +0530 +++ linux-2.6.0-test10/drivers/net/tlan.c 2003-12-04 11:56:16.536491448 +0530 @@ -814,6 +814,14 @@ static void __init TLan_EisaProbe (void } /* TLan_EisaProbe */ +#ifdef CONFIG_NET_POLL_CONTROLLER +static void TLan_Poll(struct net_device *dev) +{ + disable_irq(dev->irq); + TLan_HandleInterrupt(dev->irq, dev, NULL); + enable_irq(dev->irq); +} +#endif @@ -893,6 +901,9 @@ static int TLan_Init( struct net_device dev->get_stats = &TLan_GetStats; dev->set_multicast_list = &TLan_SetMulticastList; dev->do_ioctl = &TLan_ioctl; +#ifdef CONFIG_NET_POLL_CONTROLLER + dev->poll_controller = &TLan_Poll; +#endif dev->tx_timeout = &TLan_tx_timeout; dev->watchdog_timeo = TX_TIMEOUT; -- Thanks & Regards Prasanna S Panchamukhi Linux Technology Center India Software Labs, IBM Bangalore Ph: 91-80-5044632 From bdschuym@pandora.be Wed Dec 3 11:13:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 11:13:41 -0800 (PST) Received: from poros.telenet-ops.be (poros.telenet-ops.be [195.130.132.44]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3JDDTa026111 for ; Wed, 3 Dec 2003 11:13:14 -0800 Received: from localhost (localhost.localdomain [127.0.0.1]) by poros.telenet-ops.be (Postfix) with SMTP id 0AB862E6A8C; Wed, 3 Dec 2003 19:38:19 +0100 (MET) Received: from 192.168.0.138 (D5762BBC.kabel.telenet.be [213.118.43.188]) by poros.telenet-ops.be (Postfix) with ESMTP id 8045D2E6AE0; Wed, 3 Dec 2003 19:38:02 +0100 (MET) From: Bart De Schuymer To: Julian Anastasov , netdev@oss.sgi.com Subject: Re: [2.6 PATCH] bridge - provide valid tos value for ip_route_output_key Date: Wed, 3 Dec 2003 19:38:22 +0100 User-Agent: KMail/1.5 Cc: Stephen Hemminger , "David S. 
Miller" References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200312031938.22925.bdschuym@pandora.be> X-archive-position: 1863 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bdschuym@pandora.be Precedence: bulk X-list: netdev Hi Dave, If you haven't already applied the patch from Julian Anastasov below, please do. cheers, Bart On Wednesday 03 December 2003 01:03, Julian Anastasov wrote: > The attached patch changes bridging to provide valid > tos value to routing. diff -Nru a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c --- a/net/bridge/br_netfilter.c Tue Dec 2 23:55:30 2003 +++ b/net/bridge/br_netfilter.c Tue Dec 2 23:55:30 2003 @@ -180,7 +180,7 @@ struct rtable *rt; struct flowi fl = { .nl_u = { .ip4_u = { .daddr = iph->daddr, .saddr = 0 , - .tos = iph->tos} }, .proto = 0}; + .tos = RT_TOS(iph->tos)} }, .proto = 0}; if (!ip_route_output_key(&rt, &fl)) { /* Bridged-and-DNAT'ed traffic doesn't From herbert@gondor.apana.org.au Wed Dec 3 11:51:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 11:52:05 -0800 (PST) Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3JplTa026822 for ; Wed, 3 Dec 2003 11:51:49 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1ARd2A-0006yw-00; Thu, 04 Dec 2003 06:51:38 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1ARd26-0001CK-00; Thu, 04 Dec 2003 06:51:34 +1100 Date: Thu, 4 Dec 2003 06:51:34 +1100 To: jamal Cc: netdev@oss.sgi.com Subject: Re: [RTNETLINK] Provide real oif Message-ID: <20031203195134.GD4482@gondor.apana.org.au> References: <1070459463.1036.179.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: 
text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1070459463.1036.179.camel@jzny.localdomain> User-Agent: Mutt/1.5.4i From: Herbert Xu X-archive-position: 1864 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev On Wed, Dec 03, 2003 at 08:51:03AM -0500, jamal wrote: > > I am wondering if they should have been the same resolved > cache entry to begin with. i.e the fl->oif should not have > been 0 rather the ifindex of eth0. > Even in the case of multipath routing, you have to wait for > the cache to expire before selecting the next one in the slow > path; so fl->oif of zero may not be a very useful artifact. That would be good. In fact, I think that's how IPv6 works. Dave, is this acceptable? -- Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ ) Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From garzik@gtf.org Wed Dec 3 12:13:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 12:13:24 -0800 (PST) Received: from havoc.gtf.org (havoc.gtf.org [63.247.75.124]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3KD3Ta027761 for ; Wed, 3 Dec 2003 12:13:03 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id 8FCFD670C; Wed, 3 Dec 2003 15:09:29 -0500 (EST) Date: Wed, 3 Dec 2003 15:09:29 -0500 From: Jeff Garzik To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [PATCH] 2.6.x net driver updates Message-ID: <20031203200929.GA22839@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 1865 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Patch: 
http://www.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/2.6.0-test11-netdrvr-exp1.patch.bz2 BK users: bk://gkernel.bkbits.net/net-drivers-2.5-exp Nothing new in this patch, just a rediff versus 2.6.0-test11 release. From ddillow@idleaire.com Wed Dec 3 13:43:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 13:43:13 -0800 (PST) Received: from iasrv1.idleaire.net (NS1.idleaire.net [65.220.16.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3LgxTa003878 for ; Wed, 3 Dec 2003 13:42:59 -0800 Received: by iasrv1.idleaire.net (Postfix, from userid 300) id 684B3236E82; Wed, 3 Dec 2003 16:42:54 -0500 (EST) Received: from corp4.idleaire.com (corp4.idleaire.com [10.1.66.36]) by iasrv1.idleaire.net (Postfix) with ESMTP id ED526236E54; Wed, 3 Dec 2003 16:42:53 -0500 (EST) Received: from [10.1.66.124] ([10.1.66.124]) by corp4.idleaire.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 3 Dec 2003 16:42:53 -0500 Subject: [PATCH RESEND] Typhoon (3CR990) updates for 2.4 From: Dave Dillow To: Jeff Garzik Cc: Netdev X-Originating-IP: [216.153.89.133] X-Authentication-Warning: ori.thedillows.org: il1 set sender to dave@thedillows.org using -f X-archive-position: 1708 X-ecartis-version: Ecartis v1.0.0 X-original-sender: dave@thedillows.org X-list: netdev Content-Type: text/plain Organization: Message-Id: <1070487773.5062.6.camel@dillow.idleaire.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 03 Dec 2003 16:42:53 -0500 Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 03 Dec 2003 21:42:53.0990 (UTC) FILETIME=[684B7C60:01C3B9E6] X-archive-position: 1866 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ddillow@idleaire.com Precedence: bulk X-list: netdev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1063.4.32 -> 1.1063.47.1 # drivers/net/typhoon.c 1.4 -> 1.8 # # The following is the BitKeeper ChangeSet Log # -- # 03/11/25 dave@thedillows.org 1.1063.47.1 # Bug fixes: # * Avoid short timeouts when waiting for a reset # * Support newer versions of the sleep image # * Fix link status reporting # -- # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c -- a/drivers/net/typhoon.c Wed Nov 26 00:45:39 2003 +++ b/drivers/net/typhoon.c Wed Nov 26 00:45:39 2003 @@ -85,8 +85,8 @@ #define PKT_BUF_SZ 1536 #define DRV_MODULE_NAME "typhoon" -#define DRV_MODULE_VERSION "1.4.1" -#define DRV_MODULE_RELDATE "03/06/26" +#define DRV_MODULE_VERSION "1.4.2" +#define DRV_MODULE_RELDATE "03/11/25" #define PFX DRV_MODULE_NAME ": " #define ERR_PFX KERN_ERR PFX @@ -127,7 +127,7 @@ static char version[] __devinitdata = "typhoon.c: version " DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n"; -MODULE_AUTHOR("David Dillow "); +MODULE_AUTHOR("David Dillow "); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("3Com Typhoon Family (3C990, 3CR990, and variants)"); MODULE_PARM(rx_copybreak, "i"); @@ -146,11 +146,12 @@ int capabilities; }; -#define TYPHOON_CRYPTO_NONE 0 -#define TYPHOON_CRYPTO_DES 1 -#define TYPHOON_CRYPTO_3DES 2 -#define TYPHOON_CRYPTO_VARIABLE 4 -#define TYPHOON_FIBER 8 +#define TYPHOON_CRYPTO_NONE 0x00 +#define TYPHOON_CRYPTO_DES 0x01 +#define TYPHOON_CRYPTO_3DES 0x02 +#define TYPHOON_CRYPTO_VARIABLE 0x04 +#define TYPHOON_FIBER 0x08 +#define TYPHOON_WAKEUP_NEEDS_RESET 0x10 enum typhoon_cards { TYPHOON_TX = 0, TYPHOON_TX95, TYPHOON_TX97, TYPHOON_SVR, @@ -307,7 +308,8 @@ /* We'll wait up to six seconds for a reset, and half a second normally. 
*/ #define TYPHOON_UDELAY 50 -#define TYPHOON_RESET_TIMEOUT (6 * HZ) +#define TYPHOON_RESET_TIMEOUT_SLEEP (6 * HZ) +#define TYPHOON_RESET_TIMEOUT_NOSLEEP ((6 * 1000000) / TYPHOON_UDELAY) #define TYPHOON_WAIT_TIMEOUT ((1000000 / 2) / TYPHOON_UDELAY) #if LINUX_VERSION_CODE < KERNEL_VERSION(2, 5, 28) @@ -375,10 +377,12 @@ typhoon_reset(unsigned long ioaddr, int wait_type) { int i, err = 0; - int timeout = TYPHOON_RESET_TIMEOUT; + int timeout; if(wait_type == WaitNoSleep) - timeout = (timeout * 1000000) / (HZ * TYPHOON_UDELAY); + timeout = TYPHOON_RESET_TIMEOUT_NOSLEEP; + else + timeout = TYPHOON_RESET_TIMEOUT_SLEEP; writel(TYPHOON_INTR_ALL, ioaddr + TYPHOON_REG_INTR_MASK); writel(TYPHOON_INTR_ALL, ioaddr + TYPHOON_REG_INTR_STATUS); @@ -1857,6 +1861,11 @@ if(typhoon_wait_status(ioaddr, TYPHOON_STATUS_SLEEPING) < 0) return -ETIMEDOUT; + /* Since we cannot monitor the status of the link while sleeping, + * tell the world it went away. + */ + netif_carrier_off(tp->dev); + pci_enable_wake(tp->pdev, state, 1); pci_disable_device(pdev); return pci_set_power_state(pdev, state); @@ -1871,8 +1880,13 @@ pci_set_power_state(pdev, 0); pci_restore_state(pdev, tp->pci_state); + /* Post 2.x.x versions of the Sleep Image require a reset before + * we can download the Runtime Image. But let's not make users of + * the old firmware pay for the reset. 
+ */ writel(TYPHOON_BOOTCMD_WAKEUP, ioaddr + TYPHOON_REG_COMMAND); - if(typhoon_wait_status(ioaddr, TYPHOON_STATUS_WAITING_FOR_HOST) < 0) + if(typhoon_wait_status(ioaddr, TYPHOON_STATUS_WAITING_FOR_HOST) < 0 || + (tp->capabilities & TYPHOON_WAKEUP_NEEDS_RESET)) return typhoon_reset(ioaddr, wait_type); return 0; @@ -2250,7 +2264,7 @@ void *shared; dma_addr_t shared_dma; struct cmd_desc xp_cmd; - struct resp_desc xp_resp; + struct resp_desc xp_resp[3]; int i; int err = 0; @@ -2378,15 +2392,15 @@ } INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_MAC_ADDRESS); - if(typhoon_issue_command(tp, 1, &xp_cmd, 1, &xp_resp) < 0) { + if(typhoon_issue_command(tp, 1, &xp_cmd, 1, xp_resp) < 0) { printk(ERR_PFX "%s: cannot read MAC address\n", pdev->slot_name); err = -EIO; goto error_out_reset; } - *(u16 *)&dev->dev_addr[0] = htons(le16_to_cpu(xp_resp.parm1)); - *(u32 *)&dev->dev_addr[2] = htonl(le32_to_cpu(xp_resp.parm2)); + *(u16 *)&dev->dev_addr[0] = htons(le16_to_cpu(xp_resp[0].parm1)); + *(u32 *)&dev->dev_addr[2] = htonl(le32_to_cpu(xp_resp[0].parm2)); if(!is_valid_ether_addr(dev->dev_addr)) { printk(ERR_PFX "%s: Could not obtain valid ethernet address, " @@ -2394,6 +2408,28 @@ goto error_out_reset; } + /* Read the Sleep Image version last, so the response is valid + * later when we print out the version reported. + */ + INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_VERSIONS); + if(typhoon_issue_command(tp, 1, &xp_cmd, 3, xp_resp) < 0) { + printk(ERR_PFX "%s: Could not get Sleep Image version\n", + pdev->slot_name); + goto error_out_reset; + } + + tp->capabilities = typhoon_card_info[card_id].capabilities; + tp->xcvr_select = TYPHOON_XCVR_AUTONEG; + + /* Typhoon 1.0 Sleep Images return one response descriptor to the + * READ_VERSIONS command. Those versions are OK after waking up + * from sleep without needing a reset. Typhoon 1.1+ Sleep Images + * seem to need a little extra help to get started. Since we don't + * know how to nudge it along, just kick it. 
+ */ + if(xp_resp[0].numDesc != 0) + tp->capabilities |= TYPHOON_WAKEUP_NEEDS_RESET; + if(typhoon_sleep(tp, 3, 0) < 0) { printk(ERR_PFX "%s: cannot put adapter to sleep\n", pdev->slot_name); @@ -2401,9 +2437,6 @@ goto error_out_reset; } - tp->capabilities = typhoon_card_info[card_id].capabilities; - tp->xcvr_select = TYPHOON_XCVR_AUTONEG; - /* The chip-specific entries in the device structure. */ dev->open = typhoon_open; dev->hard_start_xmit = typhoon_start_tx; @@ -2440,6 +2473,32 @@ printk("%2.2x:", dev->dev_addr[i]); printk("%2.2x\n", dev->dev_addr[i]); + /* xp_resp still contains the response to the READ_VERSIONS command. + * For debugging, let the user know what version he has. + */ + if(xp_resp[0].numDesc == 0) { + /* This is the Typhoon 1.0 type Sleep Image, last 16 bits + * of version is Month/Day of build. + */ + u16 monthday = le32_to_cpu(xp_resp[0].parm2) & 0xffff; + printk(KERN_INFO "%s: Typhoon 1.0 Sleep Image built " + "%02u/%02u/2000\n", dev->name, monthday >> 8, + monthday & 0xff); + } else if(xp_resp[0].numDesc == 2) { + /* This is the Typhoon 1.1+ type Sleep Image + */ + u32 sleep_ver = le32_to_cpu(xp_resp[0].parm2); + u8 *ver_string = (u8 *) &xp_resp[1]; + ver_string[25] = 0; + printk(KERN_INFO "%s: Typhoon 1.1+ Sleep Image version " + "%u.%u.%u.%u %s\n", dev->name, HIPQUAD(sleep_ver), + ver_string); + } else { + printk(KERN_WARNING "%s: Unknown Sleep Image version " + "(%u:%04x)\n", dev->name, xp_resp[0].numDesc, + le32_to_cpu(xp_resp[0].parm2)); + } + return 0; error_out_reset: From ddillow@idleaire.com Wed Dec 3 13:43:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 13:43:16 -0800 (PST) Received: from iasrv1.idleaire.net (NS1.idleaire.net [65.220.16.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3Lh2Ta003880 for ; Wed, 3 Dec 2003 13:43:03 -0800 Received: by iasrv1.idleaire.net (Postfix, from userid 300) id D0F15236E84; Wed, 3 Dec 2003 16:42:56 -0500 (EST) Received: from corp4.idleaire.com (corp4.idleaire.com 
[10.1.66.36]) by iasrv1.idleaire.net (Postfix) with ESMTP id 64195236E54; Wed, 3 Dec 2003 16:42:56 -0500 (EST) Received: from [10.1.66.124] ([10.1.66.124]) by corp4.idleaire.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 3 Dec 2003 16:42:56 -0500 Subject: [PATCH RESEND] Typhoon (3CR990) updates for 2.5 From: Dave Dillow To: Jeff Garzik Cc: Netdev X-Originating-IP: [216.153.89.133] X-Authentication-Warning: ori.thedillows.org: il1 set sender to dave@thedillows.org using -f X-archive-position: 1710 X-ecartis-version: Ecartis v1.0.0 X-original-sender: dave@thedillows.org X-list: netdev Content-Type: text/plain Organization: Message-Id: <1070487776.5062.10.camel@dillow.idleaire.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 03 Dec 2003 16:42:56 -0500 Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 03 Dec 2003 21:42:56.0412 (UTC) FILETIME=[69BD0DC0:01C3B9E6] X-archive-position: 1868 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ddillow@idleaire.com Precedence: bulk X-list: netdev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1486 -> 1.1487 # drivers/net/typhoon.c 1.8 -> 1.12 # # The following is the BitKeeper ChangeSet Log # -- # 03/11/26 dave@thedillows.org 1.1487 # Bug fixes: # * Avoid short timeouts when waiting for a reset # * Fix issue with loading runtime image on newer versions of the sleep image # * Fix link status reporting # -- # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c -- a/drivers/net/typhoon.c Wed Nov 26 01:28:53 2003 +++ b/drivers/net/typhoon.c Wed Nov 26 01:28:53 2003 @@ -85,8 +85,8 @@ #define PKT_BUF_SZ 1536 #define DRV_MODULE_NAME "typhoon" -#define DRV_MODULE_VERSION "1.5.1" -#define DRV_MODULE_RELDATE "03/06/26" +#define DRV_MODULE_VERSION "1.5.2" +#define DRV_MODULE_RELDATE "03/11/25" #define PFX DRV_MODULE_NAME ": " #define ERR_PFX KERN_ERR PFX @@ -127,7 +127,7 @@ static char version[] __devinitdata = "typhoon.c: version " DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n"; -MODULE_AUTHOR("David Dillow "); +MODULE_AUTHOR("David Dillow "); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("3Com Typhoon Family (3C990, 3CR990, and variants)"); MODULE_PARM(rx_copybreak, "i"); @@ -146,11 +146,12 @@ int capabilities; }; -#define TYPHOON_CRYPTO_NONE 0 -#define TYPHOON_CRYPTO_DES 1 -#define TYPHOON_CRYPTO_3DES 2 -#define TYPHOON_CRYPTO_VARIABLE 4 -#define TYPHOON_FIBER 8 +#define TYPHOON_CRYPTO_NONE 0x00 +#define TYPHOON_CRYPTO_DES 0x01 +#define TYPHOON_CRYPTO_3DES 0x02 +#define TYPHOON_CRYPTO_VARIABLE 0x04 +#define TYPHOON_FIBER 0x08 +#define TYPHOON_WAKEUP_NEEDS_RESET 0x10 enum typhoon_cards { TYPHOON_TX = 0, TYPHOON_TX95, TYPHOON_TX97, TYPHOON_SVR, @@ -307,7 +308,8 @@ /* We'll wait up to six seconds for a reset, and half a second normally. 
*/ #define TYPHOON_UDELAY 50 -#define TYPHOON_RESET_TIMEOUT (6 * HZ) +#define TYPHOON_RESET_TIMEOUT_SLEEP (6 * HZ) +#define TYPHOON_RESET_TIMEOUT_NOSLEEP ((6 * 1000000) / TYPHOON_UDELAY) #define TYPHOON_WAIT_TIMEOUT ((1000000 / 2) / TYPHOON_UDELAY) #if LINUX_VERSION_CODE < KERNEL_VERSION(2, 5, 28) @@ -375,10 +377,12 @@ typhoon_reset(unsigned long ioaddr, int wait_type) { int i, err = 0; - int timeout = TYPHOON_RESET_TIMEOUT; + int timeout; if(wait_type == WaitNoSleep) - timeout = (timeout * 1000000) / (HZ * TYPHOON_UDELAY); + timeout = TYPHOON_RESET_TIMEOUT_NOSLEEP; + else + timeout = TYPHOON_RESET_TIMEOUT_SLEEP; writel(TYPHOON_INTR_ALL, ioaddr + TYPHOON_REG_INTR_MASK); writel(TYPHOON_INTR_ALL, ioaddr + TYPHOON_REG_INTR_STATUS); @@ -1858,6 +1862,11 @@ if(typhoon_wait_status(ioaddr, TYPHOON_STATUS_SLEEPING) < 0) return -ETIMEDOUT; + /* Since we cannot monitor the status of the link while sleeping, + * tell the world it went away. + */ + netif_carrier_off(tp->dev); + pci_enable_wake(tp->pdev, state, 1); pci_disable_device(pdev); return pci_set_power_state(pdev, state); @@ -1872,8 +1881,13 @@ pci_set_power_state(pdev, 0); pci_restore_state(pdev, tp->pci_state); + /* Post 2.x.x versions of the Sleep Image require a reset before + * we can download the Runtime Image. But let's not make users of + * the old firmware pay for the reset. 
+ */ writel(TYPHOON_BOOTCMD_WAKEUP, ioaddr + TYPHOON_REG_COMMAND); - if(typhoon_wait_status(ioaddr, TYPHOON_STATUS_WAITING_FOR_HOST) < 0) + if(typhoon_wait_status(ioaddr, TYPHOON_STATUS_WAITING_FOR_HOST) < 0 || + (tp->capabilities & TYPHOON_WAKEUP_NEEDS_RESET)) return typhoon_reset(ioaddr, wait_type); return 0; @@ -2251,7 +2265,7 @@ void *shared; dma_addr_t shared_dma; struct cmd_desc xp_cmd; - struct resp_desc xp_resp; + struct resp_desc xp_resp[3]; int i; int err = 0; @@ -2380,15 +2394,15 @@ } INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_MAC_ADDRESS); - if(typhoon_issue_command(tp, 1, &xp_cmd, 1, &xp_resp) < 0) { + if(typhoon_issue_command(tp, 1, &xp_cmd, 1, xp_resp) < 0) { printk(ERR_PFX "%s: cannot read MAC address\n", pci_name(pdev)); err = -EIO; goto error_out_reset; } - *(u16 *)&dev->dev_addr[0] = htons(le16_to_cpu(xp_resp.parm1)); - *(u32 *)&dev->dev_addr[2] = htonl(le32_to_cpu(xp_resp.parm2)); + *(u16 *)&dev->dev_addr[0] = htons(le16_to_cpu(xp_resp[0].parm1)); + *(u32 *)&dev->dev_addr[2] = htonl(le32_to_cpu(xp_resp[0].parm2)); if(!is_valid_ether_addr(dev->dev_addr)) { printk(ERR_PFX "%s: Could not obtain valid ethernet address, " @@ -2396,6 +2410,28 @@ goto error_out_reset; } + /* Read the Sleep Image version last, so the response is valid + * later when we print out the version reported. + */ + INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_VERSIONS); + if(typhoon_issue_command(tp, 1, &xp_cmd, 3, xp_resp) < 0) { + printk(ERR_PFX "%s: Could not get Sleep Image version\n", + pdev->slot_name); + goto error_out_reset; + } + + tp->capabilities = typhoon_card_info[card_id].capabilities; + tp->xcvr_select = TYPHOON_XCVR_AUTONEG; + + /* Typhoon 1.0 Sleep Images return one response descriptor to the + * READ_VERSIONS command. Those versions are OK after waking up + * from sleep without needing a reset. Typhoon 1.1+ Sleep Images + * seem to need a little extra help to get started. Since we don't + * know how to nudge it along, just kick it. 
+ */ + if(xp_resp[0].numDesc != 0) + tp->capabilities |= TYPHOON_WAKEUP_NEEDS_RESET; + if(typhoon_sleep(tp, 3, 0) < 0) { printk(ERR_PFX "%s: cannot put adapter to sleep\n", pci_name(pdev)); @@ -2403,9 +2439,6 @@ goto error_out_reset; } - tp->capabilities = typhoon_card_info[card_id].capabilities; - tp->xcvr_select = TYPHOON_XCVR_AUTONEG; - /* The chip-specific entries in the device structure. */ dev->open = typhoon_open; dev->hard_start_xmit = typhoon_start_tx; @@ -2442,6 +2475,32 @@ printk("%2.2x:", dev->dev_addr[i]); printk("%2.2x\n", dev->dev_addr[i]); + /* xp_resp still contains the response to the READ_VERSIONS command. + * For debugging, let the user know what version he has. + */ + if(xp_resp[0].numDesc == 0) { + /* This is the Typhoon 1.0 type Sleep Image, last 16 bits + * of version is Month/Day of build. + */ + u16 monthday = le32_to_cpu(xp_resp[0].parm2) & 0xffff; + printk(KERN_INFO "%s: Typhoon 1.0 Sleep Image built " + "%02u/%02u/2000\n", dev->name, monthday >> 8, + monthday & 0xff); + } else if(xp_resp[0].numDesc == 2) { + /* This is the Typhoon 1.1+ type Sleep Image + */ + u32 sleep_ver = le32_to_cpu(xp_resp[0].parm2); + u8 *ver_string = (u8 *) &xp_resp[1]; + ver_string[25] = 0; + printk(KERN_INFO "%s: Typhoon 1.1+ Sleep Image version " + "%u.%u.%u.%u %s\n", dev->name, HIPQUAD(sleep_ver), + ver_string); + } else { + printk(KERN_WARNING "%s: Unknown Sleep Image version " + "(%u:%04x)\n", dev->name, xp_resp[0].numDesc, + le32_to_cpu(xp_resp[0].parm2)); + } + return 0; error_out_reset: From ddillow@idleaire.com Wed Dec 3 13:43:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 13:43:14 -0800 (PST) Received: from iasrv1.idleaire.net (NS1.idleaire.net [65.220.16.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3Lh1Ta003879 for ; Wed, 3 Dec 2003 13:43:01 -0800 Received: by iasrv1.idleaire.net (Postfix, from userid 300) id 23381236E83; Wed, 3 Dec 2003 16:42:56 -0500 (EST) Received: from corp4.idleaire.com (corp4.idleaire.com 
[10.1.66.36]) by iasrv1.idleaire.net (Postfix) with ESMTP id A5C39236E54; Wed, 3 Dec 2003 16:42:55 -0500 (EST) Received: from [10.1.66.124] ([10.1.66.124]) by corp4.idleaire.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 3 Dec 2003 16:42:55 -0500 Subject: [BK RESEND Typhoon (3CR990) updates for 2.5 From: Dave Dillow To: Jeff Garzik Cc: Netdev X-Originating-IP: [216.153.89.133] X-Authentication-Warning: ori.thedillows.org: il1 set sender to dave@thedillows.org using -f X-archive-position: 1709 X-ecartis-version: Ecartis v1.0.0 X-original-sender: dave@thedillows.org X-list: netdev Content-Type: text/plain Organization: Message-Id: <1070487775.5062.8.camel@dillow.idleaire.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 03 Dec 2003 16:42:55 -0500 Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 03 Dec 2003 21:42:55.0693 (UTC) FILETIME=[694F57D0:01C3B9E6] X-archive-position: 1867 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ddillow@idleaire.com Precedence: bulk X-list: netdev Jeff, please do a bk pull http://typhoon.bkbits.net/typhoon-2.5 This will update the following files: drivers/net/typhoon.c | 97 ++++++++++++++++++++++++++++++++++++++++-- 1 files changed, 78 insertions(+), 19 deletions(-) through these ChangeSets: (03/11/26 1.1487) Bug fixes: * Avoid short timeouts when waiting for a reset * Fix issue with loading runtime image on newer versions of the sleep image * Fix link status reporting From ja@ssi.bg Wed Dec 3 14:03:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 14:03:18 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3M2sTa005223 for ; Wed, 3 Dec 2003 14:02:57 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB3M3KjV001862; Thu, 4 Dec 2003 00:03:22 +0200 Date: Thu, 4 Dec 2003 00:03:20 +0200 (EET) From: Julian 
Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: herbert@gondor.apana.org.au, Subject: Re: [ROUTE] PMTU only works on half the time In-Reply-To: <20031202163902.2deb43a7.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1869 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Tue, 2 Dec 2003, David S. Miller wrote: > You changes mean that for routes with specific output interfaces, > we will ignore ICMPs for those routes that arrive on other interfaces > due to assymetric routing. > > Why don't you create a seperate patch that just has the TOS masking > changes? That's much less controversial and thus something I'm likely > to apply. Here is a new version, all changes are: - ip_rt_redirect works for entries oif=IIF and oif=0 as before - ip_rt_redirect now supports RTO_ONLINK - ip_rt_frag_needed now supports RTO_ONLINK and all oif!=0 - ifindex is not anymore a hash key. This is required for ip_rt_frag_needed to deliver the event to all entries no matter the oif key. I'm not sure if this is for good or for bad for the hash table distribution. I hope jenkins hash is ready for this as it is not a common case to have multiple oifs per one saddr-daddr-tos. 
- __ip_route_output_key now ignores illegal bits (bit 1) from tos diff -Nru a/net/ipv4/route.c b/net/ipv4/route.c --- a/net/ipv4/route.c Wed Dec 3 23:37:00 2003 +++ b/net/ipv4/route.c Wed Dec 3 23:37:00 2003 @@ -967,11 +967,11 @@ void ip_rt_redirect(u32 old_gw, u32 daddr, u32 new_gw, u32 saddr, u8 tos, struct net_device *dev) { - int i, k; + int i, j; struct in_device *in_dev = in_dev_get(dev); struct rtable *rth, **rthp; u32 skeys[2] = { saddr, 0 }; - int ikeys[2] = { dev->ifindex, 0 }; + u8 toskeys[2]; tos &= IPTOS_RT_MASK; @@ -992,11 +992,12 @@ goto reject_redirect; } + toskeys[0] = tos; + toskeys[1] = tos | RTO_ONLINK; + if (saddr && daddr) for (i = 0; i < 2; i++) { - for (k = 0; k < 2; k++) { - unsigned hash = rt_hash_code(daddr, - skeys[i] ^ (ikeys[k] << 5), - tos); + for (j = 0; j < 2; j++) { + unsigned hash = rt_hash_code(daddr, skeys[i], toskeys[j]); rthp=&rt_hash_table[hash].chain; @@ -1007,8 +1008,9 @@ smp_read_barrier_depends(); if (rth->fl.fl4_dst != daddr || rth->fl.fl4_src != skeys[i] || - rth->fl.fl4_tos != tos || - rth->fl.oif != ikeys[k] || + rth->fl.fl4_tos != toskeys[j] || + (rth->fl.oif && + rth->fl.oif != dev->ifindex) || rth->fl.iif != 0) { rthp = &rth->u.rt_next; continue; @@ -1105,8 +1107,7 @@ } else if ((rt->rt_flags & RTCF_REDIRECTED) || rt->u.dst.expires) { unsigned hash = rt_hash_code(rt->fl.fl4_dst, - rt->fl.fl4_src ^ - (rt->fl.oif << 5), + rt->fl.fl4_src, rt->fl.fl4_tos); #if RT_CACHE_DEBUG >= 1 printk(KERN_DEBUG "ip_rt_advice: redirect to " @@ -1239,19 +1240,21 @@ unsigned short ip_rt_frag_needed(struct iphdr *iph, unsigned short new_mtu) { - int i; + int i, j; unsigned short old_mtu = ntohs(iph->tot_len); struct rtable *rth; u32 skeys[2] = { iph->saddr, 0, }; u32 daddr = iph->daddr; u8 tos = iph->tos & IPTOS_RT_MASK; unsigned short est_mtu = 0; + u8 toskeys[2] = { tos, tos | RTO_ONLINK }; if (ipv4_config.no_pmtu_disc) return 0; + for (j = 0; j < 2; j++) for (i = 0; i < 2; i++) { - unsigned hash = rt_hash_code(daddr, skeys[i], tos); 
+ unsigned hash = rt_hash_code(daddr, skeys[i], toskeys[j]); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; @@ -1261,7 +1264,7 @@ rth->fl.fl4_src == skeys[i] && rth->rt_dst == daddr && rth->rt_src == iph->saddr && - rth->fl.fl4_tos == tos && + rth->fl.fl4_tos == toskeys[j] && rth->fl.iif == 0 && !(dst_metric_locked(&rth->u.dst, RTAX_MTU))) { unsigned short mtu = new_mtu; @@ -1503,7 +1506,7 @@ RT_CACHE_STAT_INC(in_slow_mc); in_dev_put(in_dev); - hash = rt_hash_code(daddr, saddr ^ (dev->ifindex << 5), tos); + hash = rt_hash_code(daddr, saddr, tos); return rt_intern_hash(hash, rth, (struct rtable**) &skb->dst); e_nobufs: @@ -1554,7 +1557,7 @@ if (!in_dev) goto out; - hash = rt_hash_code(daddr, saddr ^ (fl.iif << 5), tos); + hash = rt_hash_code(daddr, saddr, tos); /* Check for the most weird martians, which can be not detected by fib_lookup. @@ -1847,7 +1850,7 @@ int iif = dev->ifindex; tos &= IPTOS_RT_MASK; - hash = rt_hash_code(daddr, saddr ^ (iif << 5), tos); + hash = rt_hash_code(daddr, saddr, tos); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) { @@ -2190,7 +2193,7 @@ rth->rt_flags = flags; - hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src ^ (oldflp->oif << 5), tos); + hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src, tos); err = rt_intern_hash(hash, rth, rp); done: if (free_res) @@ -2213,8 +2216,9 @@ { unsigned hash; struct rtable *rth; + u8 tos = flp->fl4_tos & (IPTOS_RT_MASK | RTO_ONLINK); - hash = rt_hash_code(flp->fl4_dst, flp->fl4_src ^ (flp->oif << 5), flp->fl4_tos); + hash = rt_hash_code(flp->fl4_dst, flp->fl4_src, tos); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) { @@ -2226,8 +2230,7 @@ #ifdef CONFIG_IP_ROUTE_FWMARK rth->fl.fl4_fwmark == flp->fl4_fwmark && #endif - !((rth->fl.fl4_tos ^ flp->fl4_tos) & - (IPTOS_RT_MASK | RTO_ONLINK))) { + rth->fl.fl4_tos == tos) { rth->u.dst.lastuse = jiffies; dst_hold(&rth->u.dst); rth->u.dst.__use++; Regards -- Julian Anastasov 
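
[Editorial sketch, not part of the original message.] The key handling described in the patch above can be illustrated outside the kernel. After the change, __ip_route_output_key reduces the caller's tos to the bits selected by IPTOS_RT_MASK | RTO_ONLINK before hashing and comparing, and ip_rt_frag_needed probes both the plain and the RTO_ONLINK variant of the tos key for each source key. The stand-alone C below mirrors only that arithmetic. The constant values are an assumption that they match the kernel headers of this era (IPTOS_TOS_MASK 0x1E, RTO_ONLINK 0x01), and the helper names route_key_tos and frag_needed_probes are hypothetical; neither appears in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the 2.6-era kernel headers; not taken from the patch. */
#define IPTOS_TOS_MASK 0x1E
#define IPTOS_RT_MASK  (IPTOS_TOS_MASK & ~3)	/* evaluates to 0x1C */
#define RTO_ONLINK     0x01

/* Hypothetical helper: keep only the tos bits that take part in the
 * cache key, as the patched __ip_route_output_key does. Bit 1 (0x02)
 * and the three high bits are dropped. */
static uint8_t route_key_tos(uint8_t tos)
{
	return tos & (IPTOS_RT_MASK | RTO_ONLINK);
}

struct probe {
	uint32_t saddr;
	uint8_t tos;
};

/* Hypothetical helper: list the (saddr, tos) pairs that the patched
 * ip_rt_frag_needed hashes, in the same loop order as the patch
 * (j over toskeys outside, i over skeys inside). Returns the count. */
static size_t frag_needed_probes(uint32_t saddr, uint8_t iph_tos,
				 struct probe out[4])
{
	uint32_t skeys[2] = { saddr, 0 };
	uint8_t tos = iph_tos & IPTOS_RT_MASK;
	uint8_t toskeys[2] = { tos, (uint8_t)(tos | RTO_ONLINK) };
	size_t n = 0;
	int i, j;

	assert(out != NULL);
	for (j = 0; j < 2; j++)
		for (i = 0; i < 2; i++) {
			out[n].saddr = skeys[i];
			out[n].tos = toskeys[j];
			n++;
		}
	return n;
}
```

Because ifindex no longer enters the hash, each ICMP event costs four bucket walks per (saddr, daddr, tos) regardless of how many output interfaces have cached entries; whether that hurts hash distribution is the open question raised in the message itself.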
From johnip@sgi.com Wed Dec 3 14:12:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 14:12:50 -0800 (PST) Received: from tolkor.sgi.com (tolkor.SGI.COM [198.149.18.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3MCbTa005737 for ; Wed, 3 Dec 2003 14:12:37 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [192.48.203.135]) by tolkor.sgi.com (8.12.9/8.12.9/linux-outbound_gateway-1.1) with ESMTP id hB3MXeHc011599 for ; Wed, 3 Dec 2003 16:33:40 -0600 Received: from daisy-e236.americas.sgi.com (daisy-e236.americas.sgi.com [128.162.236.214]) by flecktone.americas.sgi.com (8.12.9/8.12.9/generic_config-1.2) with ESMTP id hB3MBVP516945186; Wed, 3 Dec 2003 16:11:31 -0600 (CST) Received: from sgi.com (fundament.americas.sgi.com [128.162.233.56]) by daisy-e236.americas.sgi.com (8.12.9/SGI-server-1.8) with ESMTP id hB3MBFRn350607139; Wed, 3 Dec 2003 16:11:20 -0600 (CST) Message-ID: <3FCE5F8A.9010503@sgi.com> Date: Wed, 03 Dec 2003 16:11:22 -0600 From: John Partridge Reply-To: johnip@sgi.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031120 Thunderbird/0.3 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: "David S. 
Miller" , ak@suse.de, netdev@oss.sgi.com, jes@sgi.com Subject: Re: Tigon3 5701 PCI-X recv performance problem References: <3F844578.40306@sgi.com> <20031008101046.376abc3b.davem@redhat.com> <3F8455BE.8080300@sgi.com> <20031008183742.GA24822@wotan.suse.de> <20031008122223.1ba5ac79.davem@redhat.com> <20031008202248.GA15611@oldwotan.suse.de> <3F8702FF.70500@sgi.com> <20031010192036.GA31727@wotan.suse.de> <3F8802E6.5030601@sgi.com> <20031011131921.GC21763@wotan.suse.de> <20031011105054.0e16a607.davem@redhat.com> <3F8C290A.3010508@sgi.com> <20031014095323.71c8b9fe.davem@redhat.com> <3FB03A56.7000709@sgi.com> <20031110182911.2c5a121b.davem@redhat.com> <3FB140E2.1070007@sgi.com> <20031111122403.2d7bcf28.davem@redhat.com> <3FB153F1.8040708@sgi.com> <3FB15550.8020807@pobox.com> In-Reply-To: <3FB15550.8020807@pobox.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1870 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnip@sgi.com Precedence: bulk X-list: netdev Jeff, I have not seen the change for this go into 2.6 is there something else I should be doing ? Thanks John Jeff Garzik wrote: > John Partridge wrote: > >> David S. Miller wrote: >> >>> On Tue, 11 Nov 2003 14:04:50 -0600 >>> Why are you depending upon MCKINLEY? Don't all ia64 cpus >>> give traps for unaligned memory accesses? >> >> >> >> I have no evidence of a problem with Itanium1, only on Itanium2 >> (MCKINLEY) >> I have seen it on HP and SGI MCKINLEY > > > > Ask the other David M about eepro100.c patches he wanted to send to me... 
> > Jeff > > > -- John Partridge Silicon Graphics Inc Tel: 651-683-3428 Vnet: 233-3428 E-Mail: johnip@sgi.com From ddillow@idleaire.com Wed Dec 3 14:22:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 14:22:41 -0800 (PST) Received: from iasrv1.idleaire.net (NS1.idleaire.net [65.220.16.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3MM7Ta006583 for ; Wed, 3 Dec 2003 14:22:08 -0800 Received: by iasrv1.idleaire.net (Postfix, from userid 300) id 24A8D236E5C; Wed, 3 Dec 2003 16:42:53 -0500 (EST) Received: from corp4.idleaire.com (corp4.idleaire.com [10.1.66.36]) by iasrv1.idleaire.net (Postfix) with ESMTP id 9A637236E54; Wed, 3 Dec 2003 16:42:52 -0500 (EST) Received: from [10.1.66.124] ([10.1.66.124]) by corp4.idleaire.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 3 Dec 2003 16:42:52 -0500 Subject: [BK RESEND] Typhoon (3CR990) updates for 2.4 From: Dave Dillow To: Jeff Garzik Cc: Netdev X-Originating-IP: [216.153.89.133] X-Authentication-Warning: ori.thedillows.org: il1 set sender to dave@thedillows.org using -f X-archive-position: 1707 X-ecartis-version: Ecartis v1.0.0 X-original-sender: dave@thedillows.org X-list: netdev Content-Type: text/plain Organization: Message-Id: <1070487769.5062.4.camel@dillow.idleaire.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 03 Dec 2003 16:42:49 -0500 Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 03 Dec 2003 21:42:52.0646 (UTC) FILETIME=[677E6860:01C3B9E6] X-archive-position: 1871 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ddillow@idleaire.com Precedence: bulk X-list: netdev Jeff, please do a bk pull http://typhoon.bkbits.net/typhoon-2.4 This will update the following files: drivers/net/typhoon.c | 97 ++++++++++++++++++++++++++++++++++++++++-- 1 files changed, 78 insertions(+), 19 deletions(-) through these ChangeSets: (03/11/25 1.1063.47.1) Bug fixes: * Avoid short timeouts when waiting for a reset * 
Support newer versions of the sleep image * Fix link status reporting From hadi@cyberus.ca Wed Dec 3 14:31:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 14:32:03 -0800 (PST) Received: from mail.cyberus.ca (mail.cyberus.ca [209.197.145.21]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3MVnTa007023 for ; Wed, 3 Dec 2003 14:31:50 -0800 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1ARXPi-0003ME-Hu; Wed, 03 Dec 2003 08:51:34 -0500 Subject: Re: [RTNETLINK] Provide real oif From: jamal Reply-To: hadi@cyberus.ca To: Herbert Xu Cc: netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Organization: jamalopolis Message-Id: <1070459463.1036.179.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 03 Dec 2003 08:51:03 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 1872 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Hi, On Wed, 2003-12-03 at 05:35, Herbert Xu wrote: > They both have an RTA_OIF of eth0. But RTA_OIF != rt->fl.oif. In fact, > RTA_OIF is the value of rt->u.dst.dev. (to be precise: rt->u.dst.dev->ifindex) Ok, now I understand what you are saying. In one case you told the kernel which oif to use; in the other you let it select the default. I.e., to specify the oif: ip r g x.x.x.x dev eth0 and to select the default: ip r g x.x.x.x Then ip r l c should show them to be exact. In your case your selection and the default just happen to be the same device. I am wondering if they should have been the same resolved cache entry to begin with, i.e. fl->oif should not have been 0 but rather the ifindex of eth0. Even in the case of multipath routing, you have to wait for the cache to expire before selecting the next one in the slow path, so an fl->oif of zero may not be a very useful artifact. 
cheers, jamal From romieu@fr.zoreil.com Wed Dec 3 15:26:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 15:26:53 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3NQaTa008009 for ; Wed, 3 Dec 2003 15:26:37 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id hB3NQDxg025696; Thu, 4 Dec 2003 00:26:13 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id hB3NQAp3025695; Thu, 4 Dec 2003 00:26:10 +0100 Date: Thu, 4 Dec 2003 00:26:10 +0100 From: Francois Romieu To: netdev@oss.sgi.com Cc: =?unknown-8bit?Q?Fernando_Alencar_Mar=F3stica?= , Brad House , Matthew Gregan , jgarzik@pobox.com Subject: [PATCH 2.6] 2.6.0-test11 - rtl8169 endianness Message-ID: <20031204002610.A25405@electric-eye.fr.zoreil.com> References: <1070212415.1607.17.camel@oxygenium> <20031201020453.A16405@electric-eye.fr.zoreil.com> <20031202010649.A27879@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="6c2NcOVqGQ03X4Wi" Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20031202010649.A27879@electric-eye.fr.zoreil.com>; from romieu@fr.zoreil.com on Tue, Dec 02, 2003 at 01:06:49AM +0100 X-Organisation: Land of Sunshine Inc. X-archive-position: 1873 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev --6c2NcOVqGQ03X4Wi Content-Type: text/plain; charset=us-ascii Content-Disposition: inline See patch in attachment. Francois Romieu : [...] > o Pure 2.6.0-test11: > > Get: > http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.0-test11/r8169-blob.tar.bz2 > debuntarzipe and apply/compile/test in following order: > r8169-dma-api-tx.patch [...] > r8169-suspend.patch Insert here. [...] 
> o 2.6.0-test11 + 2.6.0-test9-bk25-netdrvr-exp1 > > Get: > http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.0-test11-netdrv/r8169-blob.tar.bz2 > Same thing as above with: > r8169-mac-phy-version.patch [...] > r8169-suspend.patch > r8169-dma-api-rx-buffers-ahum.patch Or here. -- Ueimor --6c2NcOVqGQ03X4Wi Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="r8169-endianness.patch" Endianness update (original idea from Alexandra N. Kossovsky): - descriptors status (bitfields enumerated as _DescStatusBit); - address of buffers stored in Rx/Tx descriptors. drivers/net/r8169.c | 31 ++++++++++++++++--------------- 1 files changed, 16 insertions(+), 15 deletions(-) diff -puN drivers/net/r8169.c~r8169-endianness drivers/net/r8169.c --- linux-2.6.0-test11/drivers/net/r8169.c~r8169-endianness 2003-12-03 23:29:40.000000000 +0100 +++ linux-2.6.0-test11-fr/drivers/net/r8169.c 2003-12-03 23:50:54.000000000 +0100 @@ -1123,13 +1123,14 @@ rtl8169_hw_start(struct net_device *dev) static inline void rtl8169_make_unusable_by_asic(struct RxDesc *desc) { desc->buf_addr = 0xdeadbeef; - desc->status &= ~(OWNbit | RsvdMask); + desc->status &= ~cpu_to_le32(OWNbit | RsvdMask); } static void rtl8169_free_rx_skb(struct pci_dev *pdev, struct sk_buff **sk_buff, struct RxDesc *desc) { - pci_unmap_single(pdev, desc->buf_addr, RX_BUF_SIZE, PCI_DMA_FROMDEVICE); + pci_unmap_single(pdev, le32_to_cpu(desc->buf_addr), RX_BUF_SIZE, + PCI_DMA_FROMDEVICE); dev_kfree_skb(*sk_buff); *sk_buff = NULL; rtl8169_make_unusable_by_asic(desc); @@ -1137,13 +1138,13 @@ static void rtl8169_free_rx_skb(struct p static inline void rtl8169_return_to_asic(struct RxDesc *desc) { - desc->status |= OWNbit + RX_BUF_SIZE; + desc->status |= cpu_to_le32(OWNbit + RX_BUF_SIZE); } static inline void rtl8169_give_to_asic(struct RxDesc *desc, dma_addr_t mapping) { - desc->buf_addr = mapping; - desc->status |= OWNbit + RX_BUF_SIZE; + desc->buf_addr = cpu_to_le32(mapping); + desc->status |= 
cpu_to_le32(OWNbit + RX_BUF_SIZE); } static int rtl8169_alloc_rx_skb(struct pci_dev *pdev, struct net_device *dev, @@ -1208,7 +1209,7 @@ static u32 rtl8169_rx_fill(struct rtl816 static inline void rtl8169_mark_as_last_descriptor(struct RxDesc *desc) { - desc->status |= EORbit; + desc->status |= cpu_to_le32(EORbit); } static int rtl8169_init_ring(struct net_device *dev) @@ -1240,8 +1241,8 @@ static void rtl8169_unmap_tx_skb(struct { u32 len = sk_buff[0]->len; - pci_unmap_single(pdev, desc->buf_addr, len < ETH_ZLEN ? ETH_ZLEN : len, - PCI_DMA_TODEVICE); + pci_unmap_single(pdev, le32_to_cpu(desc->buf_addr), + len < ETH_ZLEN ? ETH_ZLEN : len, PCI_DMA_TODEVICE); desc->buf_addr = 0x00; *sk_buff = NULL; } @@ -1307,17 +1308,17 @@ rtl8169_start_xmit(struct sk_buff *skb, spin_lock_irq(&tp->lock); - if ((tp->TxDescArray[entry].status & OWNbit) == 0) { + if (!(le32_to_cpu(tp->TxDescArray[entry].status) & OWNbit)) { dma_addr_t mapping; mapping = pci_map_single(tp->pci_dev, skb->data, len, PCI_DMA_TODEVICE); tp->Tx_skbuff[entry] = skb; - tp->TxDescArray[entry].buf_addr = mapping; + tp->TxDescArray[entry].buf_addr = cpu_to_le32(mapping); - tp->TxDescArray[entry].status = OWNbit | FSbit | LSbit | len | - (EORbit * !((entry + 1) % NUM_TX_DESC)); + tp->TxDescArray[entry].status = cpu_to_le32(OWNbit | FSbit | + LSbit | len | (EORbit * !((entry + 1) % NUM_TX_DESC))); RTL_W8(TxPoll, 0x40); //set polling bit @@ -1358,7 +1359,7 @@ rtl8169_tx_interrupt(struct net_device * tx_left = tp->cur_tx - dirty_tx; while (tx_left > 0) { - if ((tp->TxDescArray[entry].status & OWNbit) == 0) { + if (!(le32_to_cpu(tp->TxDescArray[entry].status) & OWNbit)) { int cur = dirty_tx % NUM_TX_DESC; struct sk_buff *skb = tp->Tx_skbuff[cur]; @@ -1416,8 +1417,8 @@ rtl8169_rx_interrupt(struct net_device * cur_rx = tp->cur_rx % RX_BUF_SIZE; - while ((tp->RxDescArray[cur_rx].status & OWNbit) == 0) { - u32 status = tp->RxDescArray[cur_rx].status; + while (!(le32_to_cpu(tp->RxDescArray[cur_rx].status) & OWNbit)) { + 
u32 status = le32_to_cpu(tp->RxDescArray[cur_rx].status); if (status & RxRES) { printk(KERN_INFO "%s: Rx ERROR!!!\n", dev->name); _ --6c2NcOVqGQ03X4Wi-- From brad@mainstreetsoftworks.com Wed Dec 3 15:30:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 15:31:08 -0800 (PST) Received: from nameserver1.mcve.com (nameserver1.brainwerkz.net [209.251.159.130]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB3NUsTa010258 for ; Wed, 3 Dec 2003 15:30:54 -0800 Received: from mainstreetsoftworks.com (ip68-105-173-45.ga.at.cox.net [68.105.173.45]) by nameserver1.mcve.com (Postfix) with ESMTP id 7B2EA8527C; Wed, 3 Dec 2003 18:45:20 -0500 (EST) Message-ID: <3FCE7224.4000106@mainstreetsoftworks.com> Date: Wed, 03 Dec 2003 18:30:44 -0500 From: Brad House User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031121 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Francois Romieu Cc: netdev@oss.sgi.com, =?ISO-8859-1?Q?Fernando_Alencar_Mar=F3stica?= , Brad House , Matthew Gregan , jgarzik@pobox.com Subject: Re: [PATCH 2.6] 2.6.0-test11 - rtl8169 endianness References: <1070212415.1607.17.camel@oxygenium> <20031201020453.A16405@electric-eye.fr.zoreil.com> <20031202010649.A27879@electric-eye.fr.zoreil.com> <20031204002610.A25405@electric-eye.fr.zoreil.com> In-Reply-To: <20031204002610.A25405@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1874 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: brad@mainstreetsoftworks.com Precedence: bulk X-list: netdev k, cool ... I should have time to test all this tonight ... -Brad Francois Romieu wrote: > See patch in attachment. > > Francois Romieu : > [...] 
> >>o Pure 2.6.0-test11: >> >>Get: >>http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.0-test11/r8169-blob.tar.bz2 >>debuntarzipe and apply/compile/test in following order: >>r8169-dma-api-tx.patch > > [...] > >>r8169-suspend.patch > > > Insert here. > > [...] > >>o 2.6.0-test11 + 2.6.0-test9-bk25-netdrvr-exp1 >> >>Get: >>http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.0-test11-netdrv/r8169-blob.tar.bz2 >>Same thing as above with: >>r8169-mac-phy-version.patch > > [...] > >>r8169-suspend.patch >>r8169-dma-api-rx-buffers-ahum.patch > > > Or here. > > -- > Ueimor > > > ------------------------------------------------------------------------ > > > Endianness update (original idea from Alexandra N. Kossovsky): > - descriptors status (bitfields enumerated as _DescStatusBit); > - address of buffers stored in Rx/Tx descriptors. > > > > drivers/net/r8169.c | 31 ++++++++++++++++--------------- > 1 files changed, 16 insertions(+), 15 deletions(-) > > diff -puN drivers/net/r8169.c~r8169-endianness drivers/net/r8169.c > --- linux-2.6.0-test11/drivers/net/r8169.c~r8169-endianness 2003-12-03 23:29:40.000000000 +0100 > +++ linux-2.6.0-test11-fr/drivers/net/r8169.c 2003-12-03 23:50:54.000000000 +0100 > @@ -1123,13 +1123,14 @@ rtl8169_hw_start(struct net_device *dev) > static inline void rtl8169_make_unusable_by_asic(struct RxDesc *desc) > { > desc->buf_addr = 0xdeadbeef; > - desc->status &= ~(OWNbit | RsvdMask); > + desc->status &= ~cpu_to_le32(OWNbit | RsvdMask); > } > > static void rtl8169_free_rx_skb(struct pci_dev *pdev, struct sk_buff **sk_buff, > struct RxDesc *desc) > { > - pci_unmap_single(pdev, desc->buf_addr, RX_BUF_SIZE, PCI_DMA_FROMDEVICE); > + pci_unmap_single(pdev, le32_to_cpu(desc->buf_addr), RX_BUF_SIZE, > + PCI_DMA_FROMDEVICE); > dev_kfree_skb(*sk_buff); > *sk_buff = NULL; > rtl8169_make_unusable_by_asic(desc); > @@ -1137,13 +1138,13 @@ static void rtl8169_free_rx_skb(struct p > > static inline void rtl8169_return_to_asic(struct RxDesc *desc) > { > - desc->status |= 
OWNbit + RX_BUF_SIZE; > + desc->status |= cpu_to_le32(OWNbit + RX_BUF_SIZE); > } > > static inline void rtl8169_give_to_asic(struct RxDesc *desc, dma_addr_t mapping) > { > - desc->buf_addr = mapping; > - desc->status |= OWNbit + RX_BUF_SIZE; > + desc->buf_addr = cpu_to_le32(mapping); > + desc->status |= cpu_to_le32(OWNbit + RX_BUF_SIZE); > } > > static int rtl8169_alloc_rx_skb(struct pci_dev *pdev, struct net_device *dev, > @@ -1208,7 +1209,7 @@ static u32 rtl8169_rx_fill(struct rtl816 > > static inline void rtl8169_mark_as_last_descriptor(struct RxDesc *desc) > { > - desc->status |= EORbit; > + desc->status |= cpu_to_le32(EORbit); > } > > static int rtl8169_init_ring(struct net_device *dev) > @@ -1240,8 +1241,8 @@ static void rtl8169_unmap_tx_skb(struct > { > u32 len = sk_buff[0]->len; > > - pci_unmap_single(pdev, desc->buf_addr, len < ETH_ZLEN ? ETH_ZLEN : len, > - PCI_DMA_TODEVICE); > + pci_unmap_single(pdev, le32_to_cpu(desc->buf_addr), > + len < ETH_ZLEN ? ETH_ZLEN : len, PCI_DMA_TODEVICE); > desc->buf_addr = 0x00; > *sk_buff = NULL; > } > @@ -1307,17 +1308,17 @@ rtl8169_start_xmit(struct sk_buff *skb, > > spin_lock_irq(&tp->lock); > > - if ((tp->TxDescArray[entry].status & OWNbit) == 0) { > + if (!(le32_to_cpu(tp->TxDescArray[entry].status) & OWNbit)) { > dma_addr_t mapping; > > mapping = pci_map_single(tp->pci_dev, skb->data, len, > PCI_DMA_TODEVICE); > > tp->Tx_skbuff[entry] = skb; > - tp->TxDescArray[entry].buf_addr = mapping; > + tp->TxDescArray[entry].buf_addr = cpu_to_le32(mapping); > > - tp->TxDescArray[entry].status = OWNbit | FSbit | LSbit | len | > - (EORbit * !((entry + 1) % NUM_TX_DESC)); > + tp->TxDescArray[entry].status = cpu_to_le32(OWNbit | FSbit | > + LSbit | len | (EORbit * !((entry + 1) % NUM_TX_DESC))); > > RTL_W8(TxPoll, 0x40); //set polling bit > > @@ -1358,7 +1359,7 @@ rtl8169_tx_interrupt(struct net_device * > tx_left = tp->cur_tx - dirty_tx; > > while (tx_left > 0) { > - if ((tp->TxDescArray[entry].status & OWNbit) == 0) { > + if 
(!(le32_to_cpu(tp->TxDescArray[entry].status) & OWNbit)) { > int cur = dirty_tx % NUM_TX_DESC; > struct sk_buff *skb = tp->Tx_skbuff[cur]; > > @@ -1416,8 +1417,8 @@ rtl8169_rx_interrupt(struct net_device * > > cur_rx = tp->cur_rx % RX_BUF_SIZE; > > - while ((tp->RxDescArray[cur_rx].status & OWNbit) == 0) { > - u32 status = tp->RxDescArray[cur_rx].status; > + while (!(le32_to_cpu(tp->RxDescArray[cur_rx].status) & OWNbit)) { > + u32 status = le32_to_cpu(tp->RxDescArray[cur_rx].status); > > if (status & RxRES) { > printk(KERN_INFO "%s: Rx ERROR!!!\n", dev->name); > > _ From scott.feldman@intel.com Wed Dec 3 18:26:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Dec 2003 18:26:24 -0800 (PST) Received: from caduceus.jf.intel.com (fmr06.intel.com [134.134.136.7]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB42Q9Ta015917 for ; Wed, 3 Dec 2003 18:26:10 -0800 Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by caduceus.jf.intel.com (8.12.9-20030918-01/8.12.9/d: major-outer.mc,v 1.11 2003/12/03 18:29:28 root Exp $) with ESMTP id hB42QKvY003816; Thu, 4 Dec 2003 02:26:20 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by petasus.jf.intel.com (8.11.6-20030918-01/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id hB42JkG06097; Thu, 4 Dec 2003 02:19:46 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs040.jf.intel.com (SAVSMTP 3.1.1.32) with SMTP id M2003120318254621192 ; Wed, 03 Dec 2003 18:25:46 -0800 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx332.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 3 Dec 2003 18:25:46 -0800 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 Subject: RE: [2/4] pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Date: Wed, 3 Dec 2003 18:25:46 -0800 Message-ID: 
X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [2/4] pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Thread-Index: AcO5pVKFKO1/5MqkTyaatG+4vg7ZuAAaSxqg From: "Feldman, Scott" To: , Cc: , , X-OriginalArrivalTime: 04 Dec 2003 02:25:46.0597 (UTC) FILETIME=[ECC1C950:01C3BA0D] X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id hB42Q9Ta015917 X-archive-position: 1875 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev > Below is the pollcontroller patch for e1000. > This patch can be applied over 2.6.0-test9-bk25-netdrvr-exp1.patch Did you test this w/ and w/o CONFIG_E1000_NAPI? -scott From uucp@coruscant.gnumonks.org Thu Dec 4 02:14:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 02:14:25 -0800 (PST) Received: from coruscant.gnumonks.org (coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB4AE7Ta030791 for ; Thu, 4 Dec 2003 02:14:08 -0800 Received: from uucp by coruscant.gnumonks.org with local-bsmtp (Exim 4.20) id 1ARpNt-0000qI-Tj for netdev@oss.sgi.com; Thu, 04 Dec 2003 10:02:53 +0100 Received: from laforge by obroa-skai.gnumonks.org with local (Exim 3.36 #1) id 1ARpMl-0000XH-00; Thu, 04 Dec 2003 14:31:21 +0530 Date: Thu, 4 Dec 2003 14:31:21 +0530 From: Harald Welte To: "David S. Miller" Cc: netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org Subject: [PATCH 2.4.x] IPv6 multicast (MLD,IGMP) code bypasses netfilter hooks Message-ID: <20031204090121.GD16635@obroa-skai.de.gnumonks.org> Mail-Followup-To: Harald Welte , "David S. 
Miller" , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org References: <20031122090330.GB2745@obroa-skai.de.gnumonks.org> <20031123154344.3e2b0b1a.davem@redhat.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="oj4kGyHlBMXGt3Le" Content-Disposition: inline In-Reply-To: <20031123154344.3e2b0b1a.davem@redhat.com> X-Operating-System: Linux obroa-skai.de.gnumonks.org 2.4.23-pre7-ben0 X-Date: Today is Pungenday, the 46th day of The Aftermath in the YOLD 3169 User-Agent: Mutt/1.5.4i X-archive-position: 1876 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --oj4kGyHlBMXGt3Le Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Nov 23, 2003 at 03:43:22PM -0800, David S. Miller wrote: > If the fix is simple enough (1 or 2 one-liner changes) and easy > to verify, I would consider it for 2.6.0 > > I may even look into this myself. Now that the other Dave's fix has made it into 2.6.0-test11, I have merged it (untested, but compiles) with 2.4.x. Dave, would you consider applying this to 2.4.x? Thanks. Greetings (still from India), Harald. 
--- linux.old/net/ipv6/mcast.c 2003-11-28 23:55:59.000000000 +0530 +++ linux/net/ipv6/mcast.c 2003-12-04 14:21:42.000000000 +0530 @@ -45,6 +45,9 @@ #include #include +#include +#include + #include #include @@ -1262,7 +1265,7 @@ { struct ipv6hdr *pip6 = skb->nh.ipv6h; struct mld2_report *pmr = (struct mld2_report *)skb->h.raw; - int payload_len, mldlen; + int payload_len, mldlen, err; payload_len = skb->tail - (unsigned char *)skb->nh.ipv6h - sizeof(struct ipv6hdr); @@ -1271,8 +1274,10 @@ pmr->csum = csum_ipv6_magic(&pip6->saddr, &pip6->daddr, mldlen, IPPROTO_ICMPV6, csum_partial(skb->h.raw, mldlen, 0)); - dev_queue_xmit(skb); - ICMP6_INC_STATS(Icmp6OutMsgs); + err = NF_HOOK(PF_INET6, NF_IP6_LOCAL_OUT, skb, NULL, skb->dev, + dev_queue_xmit); + if (!err) + ICMP6_INC_STATS(Icmp6OutMsgs); } static int grec_size(struct ifmcaddr6 *pmc, int type, int gdel, int sdel) @@ -1596,12 +1601,16 @@ IPPROTO_ICMPV6, csum_partial((__u8 *) hdr, len, 0)); - dev_queue_xmit(skb); - if (type == ICMPV6_MGM_REDUCTION) - ICMP6_INC_STATS(Icmp6OutGroupMembReductions); - else - ICMP6_INC_STATS(Icmp6OutGroupMembResponses); - ICMP6_INC_STATS(Icmp6OutMsgs); + err = NF_HOOK(PF_INET6, NF_IP6_LOCAL_OUT, skb, NULL, skb->dev, + dev_queue_xmit); + if (!err) { + if (type == ICMPV6_MGM_REDUCTION) + ICMP6_INC_STATS(Icmp6OutGroupMembReductions); + else + ICMP6_INC_STATS(Icmp6OutGroupMembResponses); + ICMP6_INC_STATS(Icmp6OutMsgs); + } + return; out: -- - Harald Welte http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." 
-- Paul Vixie --oj4kGyHlBMXGt3Le Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQE/zvf2XaXGVTD0i/8RAgyPAJ9pxdEJHaHwVGipMPVP4FtEzYVPAwCfUKNd t66omr9VmxcY9XO13onkzl0= =XLO5 -----END PGP SIGNATURE----- --oj4kGyHlBMXGt3Le-- From uucp@coruscant.gnumonks.org Thu Dec 4 05:40:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 05:41:07 -0800 (PST) Received: from coruscant.gnumonks.org (coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB4DeqTa021792 for ; Thu, 4 Dec 2003 05:40:53 -0800 Received: from uucp by coruscant.gnumonks.org with local-bsmtp (Exim 4.20) id 1ARrqn-0001pw-8Y for netdev@oss.sgi.com; Thu, 04 Dec 2003 12:40:53 +0100 Received: from laforge by obroa-skai.gnumonks.org with local (Exim 3.36 #1) id 1ARrpg-0006Lb-00; Thu, 04 Dec 2003 17:09:22 +0530 Date: Thu, 4 Dec 2003 17:09:22 +0530 From: Harald Welte To: Michael Bussmann Cc: netdev@oss.sgi.com, Netfilter Development Mailinglist Subject: Re: [Bug 155] Outgoing MLD packets are not traversing netfilter Message-ID: <20031204113922.GB7680@obroa-skai.de.gnumonks.org> Mail-Followup-To: Harald Welte , Michael Bussmann , netdev@oss.sgi.com, Netfilter Development Mailinglist References: <20031204092816.GC28403@ruf098.fkie.fgan.de> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="cvVnyQ+4j833TQvp" Content-Disposition: inline In-Reply-To: <20031204092816.GC28403@ruf098.fkie.fgan.de> X-Operating-System: Linux obroa-skai.de.gnumonks.org 2.4.23-pre7-ben0 X-Date: Today is Pungenday, the 46th day of The Aftermath in the YOLD 3169 User-Agent: Mutt/1.5.4i X-archive-position: 1877 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --cvVnyQ+4j833TQvp Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Dec 04, 2003 at 10:27:54AM +0100, Michael Bussmann wrote: > However, I still notice an oddity: MLD packets are not recognised as > icmpv6 messages. That's bad. Are you talking about MLDv1 or MLDv2 packets? > I set up a simple set of rules that directs both icmpv6 packets and "any" > packets (just for testing purposes) to the QUEUE target. An application > program (using libipq) dumps all packets: Ouch. We have two issues here: 1) ip6_tables not matching it as icmp6 packet This basically means that skb->nh.ipv6h is not initialized, or points to skb->raw instead of the ipv6 header. I cannot see how this could happen in mld_sendpack(), since it dereferences skb->nh.ipv6h itself. In the igmp6_send() case, ip6_nd_hdr() initializes skb->nh correctly, so I also cannot see this happen. 2) ip6_queue copying the link-layer header to userspace This means that skb->data does not point at the beginning of the ipv6 packet (i.e. skb->data != skb->nh.ip6h), but to the beginning of the hardware header (i.e. skb->data == skb->head). This is because mcast.c writes the full packet including the hardware header and _then_ calls NF_HOOK. All other code (ipv6 or ipv4) is building the layer 3 packet, then calling NF_HOOK and later on (via ipX_output_finish() or dst_output()) adding the hardware header. > I'm not sure whether this is a bug, but it seems to be inconsistent that > some icmpv6 packets are dumped with the LL and others are not. Of course it is a bug. So the fundamental question (with regard to the ipv6 gods) is: Is it really necessary that the ipv6 mcast code bypasses the destination cache? IPv4 multicast just uses the dst_cache, and thus sends the packet using the normal dst_output() functions. I have to admit that I'm not fluent enough with the code to say why ipv6 multicast is the only code in ipv4/ipv6 that manually adds the hardware header and bypasses the normal dst_output() path. 
Can somebody explain? > Let me know if you need further information. As I am currently travelling, I cannot really play with any multicast networking (just a single notebook machine here, and it is ppc, so no user-mode-linux). > Best regards, > MB -- - Harald Welte http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie --cvVnyQ+4j833TQvp Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQE/zx0AXaXGVTD0i/8RAh0SAKCziMP8L76fLxyvBxxC9XI3d7yFDACfUXuX O76opGIfnXi5iO1h47XPnH8= =Sk4I -----END PGP SIGNATURE----- --cvVnyQ+4j833TQvp-- From dlstevens@us.ibm.com Thu Dec 4 09:59:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 09:59:17 -0800 (PST) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB4HwtTa030671 for ; Thu, 4 Dec 2003 09:59:04 -0800 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e33.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB4HwXJh123062; Thu, 4 Dec 2003 12:58:33 -0500 Received: from d03nm121.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB4HwTQQ065278; Thu, 4 Dec 2003 10:58:30 -0700 Importance: Normal Sensitivity: Subject: Re: [Bug 155] Outgoing MLD packets are not traversing netfilter To: Harald Welte Cc: Michael Bussmann , netdev@oss.sgi.com, Netfilter Development Mailinglist X-Mailer: Lotus Notes Release 5.0.4a July 24, 2000 Message-ID: From: David Stevens Date: Thu, 4 Dec 2003 09:58:27 
-0800 X-MIMETrack: Serialize by Router on D03NM121/03/M/IBM(Release 6.0.2CF2HF92 | October 15, 2003) at 12/04/2003 10:58:29 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-archive-position: 1878 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dlstevens@us.ibm.com Precedence: bulk X-list: netdev Harald, >1) ip6_tables not matching it as icmp6 packet > Just a guess, but if netfilter6 doesn't go through all the extension headers, then the protocol won't look like ICMP6 because of the router alert option. >2) ip6_queue copying the linklayer header to userspace > >This means that skb->data does not point at the beginning of the >ipv6 packet (i.e. skb->data != skb->nh.ip6h), but to the beginning of >the hardware header (i.e. skb->data == skb->head). > >This is because mcast.c writes the full packet including the hardware >header and _then_ calls NF_HOOK. All other code (ipv6 or ipv4) is >building the layer 3 packet, then calling NF_HOOK and later on (via >ipX_output_finish() or dst_output()) adding the hardware header. ... >So the fundamental question (with regard to the ipv6 gods) is: Is it >really necessary that the ipv6 mcast code bypasses the destination >cache? It can't go through the routing table. The advertisements are only meaningful on the particular interface, whether or not there is a route pointing outgoing multicasts to a different interface (which is frequently useful, but MLD packets can't go there too). 
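On point 1 above, here is a minimal sketch (plain userspace C, not kernel or netfilter code) of why a matcher that only reads the fixed header's Next Header field would miss these packets. MLD messages carry a hop-by-hop options header with the Router Alert option, so the fixed header announces protocol 0 rather than 58 (ICMPv6), and the upper-layer protocol only shows up after walking the extension header chain. The function name ipv6_upper_proto and the raw byte buffer interface are assumptions made for illustration; the header numbers are the standard protocol numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Next Header values (standard protocol numbers). */
enum {
    NH_HOPOPTS  = 0,   /* hop-by-hop options, carries Router Alert */
    NH_ROUTING  = 43,
    NH_FRAGMENT = 44,
    NH_ICMPV6   = 58,
    NH_DSTOPTS  = 60
};

#define IPV6_FIXED_HDR_LEN 40

/*
 * Hypothetical helper: return the upper-layer protocol of the IPv6
 * packet in pkt[0..len), skipping the extension headers listed above,
 * or -1 if the packet is truncated. Header types not listed (AH, for
 * example) are simply reported as-is.
 */
int ipv6_upper_proto(const uint8_t *pkt, size_t len)
{
    size_t off = IPV6_FIXED_HDR_LEN;
    uint8_t nh;

    assert(pkt != NULL);
    if (len < IPV6_FIXED_HDR_LEN)
        return -1;
    nh = pkt[6];                    /* Next Header of the fixed header */

    for (;;) {
        size_t hdrlen;

        switch (nh) {
        case NH_HOPOPTS:
        case NH_ROUTING:
        case NH_DSTOPTS:
            if (len - off < 2)
                return -1;
            /* Hdr Ext Len counts 8-octet units beyond the first 8. */
            hdrlen = ((size_t)pkt[off + 1] + 1) * 8;
            break;
        case NH_FRAGMENT:
            hdrlen = 8;             /* fragment header has a fixed size */
            break;
        default:
            return nh;              /* not an extension header we skip */
        }
        if (len - off < hdrlen)
            return -1;
        nh = pkt[off];              /* first octet is the next header */
        off += hdrlen;
    }
}
```

For an MLD report built the usual way, pkt[6] is 0 (hop-by-hop) while ipv6_upper_proto() returns 58, which is the difference between matching and not matching an icmpv6 rule.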
+-DLS From davem@pizda.ninka.net Thu Dec 4 10:38:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 10:38:50 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB4IcaTa032156 for ; Thu, 4 Dec 2003 10:38:37 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id KAA22520; Thu, 4 Dec 2003 10:37:19 -0800 Date: Thu, 4 Dec 2003 10:37:19 -0800 From: "David S. Miller" To: Harald Welte Cc: netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org Subject: Re: [PATCH 2.4.x] IPv6 multicast (MLD,IGMP) code bypasses netfilter hooks Message-Id: <20031204103719.56038dba.davem@redhat.com> In-Reply-To: <20031204090121.GD16635@obroa-skai.de.gnumonks.org> References: <20031122090330.GB2745@obroa-skai.de.gnumonks.org> <20031123154344.3e2b0b1a.davem@redhat.com> <20031204090121.GD16635@obroa-skai.de.gnumonks.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1879 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 4 Dec 2003 14:31:21 +0530 Harald Welte wrote: > Now that the other Dave's fix has made it in 2.6.0-test11, I have merged > it (untested, but compiles) with 2.4.x. > > Dave, would you consider applying this to to 2.4.x ? Done, thanks for catching this Harald. 
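A minimal userspace sketch of the accounting pattern in the patch discussed above: the packet is passed through a filter hook before transmission, and the output counter is bumped only when the hook let the packet through. Everything here (struct pkt, run_hook, send_report, the counters) is a hypothetical stand-in and not the kernel's NF_HOOK or ICMP6_INC_STATS; the -EPERM value returned for a dropped packet is likewise an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an sk_buff. */
struct pkt {
    int len;
};

typedef int (*hook_fn)(const struct pkt *);  /* nonzero = accept, 0 = drop */
typedef int (*xmit_fn)(struct pkt *);        /* 0 on success */

unsigned long out_msgs;     /* plays the role of Icmp6OutMsgs */
unsigned long transmitted;  /* packets that reached the "device" */

/* Plays the role of dev_queue_xmit() in the patch. */
int fake_xmit(struct pkt *p)
{
    assert(p != NULL);
    transmitted++;
    return 0;
}

/* Two trivial filter policies for demonstration. */
int accept_all(const struct pkt *p) { (void)p; return 1; }
int drop_all(const struct pkt *p)   { (void)p; return 0; }

/* Plays the role of NF_HOOK(): run the filter, call okfn only on accept. */
int run_hook(hook_fn hook, struct pkt *p, xmit_fn okfn)
{
    if (hook != NULL && !hook(p))
        return -EPERM;      /* assumed error code for a dropped packet */
    return okfn(p);
}

/* Send one report; count it only if it was not filtered out. */
int send_report(hook_fn hook, struct pkt *p)
{
    int err = run_hook(hook, p, fake_xmit);

    if (!err)
        out_msgs++;
    return err;
}
```

With accept_all the packet is transmitted and counted; with drop_all neither counter moves, which is the behaviour the "if (!err)" checks in the patch are after.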
From kumarkr@us.ibm.com Thu Dec 4 13:26:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 13:26:41 -0800 (PST) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB4LQJTa012725 for ; Thu, 4 Dec 2003 13:26:28 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e34.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB4LQD6t363666 for ; Thu, 4 Dec 2003 16:26:13 -0500 Received: from d03nm801.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB4LQCQ1116638 for ; Thu, 4 Dec 2003 14:26:13 -0700 Subject: Gratuitous ARP To: netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.7 March 21, 2001 Message-ID: From: Krishna Kumar Date: Thu, 4 Dec 2003 13:24:57 -0800 X-MIMETrack: Serialize by Router on D03NM801/03/M/IBM(Release 6.0.2CF2HF92 | October 15, 2003) at 12/04/2003 14:26:12 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-archive-position: 1880 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kumarkr@us.ibm.com Precedence: bulk X-list: netdev Hi, I am trying to find out why gratuitous ARP isn't implemented for IPv4 duplicate address detection, even though a comment in arp.c says it was added. Was it removed for some reason? I went through the netdev archives and found some threads relating to this, but none of them explained why this feature is absent or was removed. Could it be to avoid an ARP flood attack? If so, is there any reason why it cannot be implemented under a configuration parameter? I would appreciate it if someone could shed some light on this. 
Thanks, - KK From shemminger@osdl.org Thu Dec 4 16:38:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 16:39:05 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB50cpTa021744 for ; Thu, 4 Dec 2003 16:38:51 -0800 Received: from dell_ss3.pdx.osdl.net (IDENT:2997@dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB50chZ06302; Thu, 4 Dec 2003 16:38:43 -0800 Date: Thu, 4 Dec 2003 16:39:28 -0800 From: Stephen Hemminger To: Jeff Garzik Cc: netdev@oss.sgi.com, Alexander Viro Subject: [PATCH] skfddi - convert to new pci model. Message-Id: <20031204163928.0f34d5d1.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1881 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1502 -> 1.1503 # drivers/net/skfp/skfddi.c 1.20 -> 1.21 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/12/04 shemminger@osdl.org 1.1503 # Convert skfddi device to 2.6. # * use new pci device bus initialization # * allocate network device with alloc_fddidev # and use dev->priv # * get rid of special module/non module distinctions. 
# * fix error unwinds and return values on initialization # -------------------------------------------- # diff -Nru a/drivers/net/skfp/skfddi.c b/drivers/net/skfp/skfddi.c --- a/drivers/net/skfp/skfddi.c Thu Dec 4 16:36:03 2003 +++ b/drivers/net/skfp/skfddi.c Thu Dec 4 16:36:03 2003 @@ -58,6 +58,7 @@ * 07-May-00 DM 64 bit fixes, new dma interface * 31-Jul-03 DB Audit copy_*_user in skfp_ioctl * Daniele Bellucci + * 03-Dec-03 SH Convert to PCI device model * * Compilation options (-Dxxx): * DRIVERDEBUG print lots of messages to log file @@ -70,7 +71,7 @@ /* Version information string - should be updated prior to */ /* each new release!!! */ -#define VERSION "2.06" +#define VERSION "2.07" static const char *boot_msg = "SysKonnect FDDI PCI Adapter driver v" VERSION " for\n" @@ -80,15 +81,11 @@ #include #include -#include -#include #include #include #include #include #include -#include -#include // isdigit #include #include #include @@ -107,11 +104,6 @@ // Define module-wide (static) routines -static struct net_device *alloc_device(struct net_device *dev, u_long iobase); -static struct net_device *insert_device(struct net_device *dev); -static int fddi_dev_index(unsigned char *s); -static void init_dev(struct net_device *dev, u_long iobase); -static void link_modules(struct net_device *dev, struct net_device *tmp); static int skfp_driver_init(struct net_device *dev); static int skfp_open(struct net_device *dev); static int skfp_close(struct net_device *dev); @@ -188,15 +180,6 @@ // Define module-wide (static) variables static int num_boards; /* total number of adapters configured */ -static int num_fddi; -static int autoprobed; - -#ifdef MODULE -static struct net_device *unlink_modules(struct net_device *p); -static int loading_module = 1; -#else -static int loading_module; -#endif // MODULE #ifdef DRIVERDEBUG #define PRINTK(s, args...) 
printk(s, ## args) @@ -207,9 +190,9 @@ #define PRIV(dev) (&(((struct s_smc *)dev->priv)->os)) /* - * ============== - * = skfp_probe = - * ============== + * ================= + * = skfp_init_one = + * ================= * * Overview: * Probes for supported FDDI PCI controllers @@ -218,30 +201,11 @@ * Condition code * * Arguments: - * dev - pointer to device information + * pdev - pointer to PCI device information * * Functional Description: - * This routine is called by the OS for each FDDI device name (fddi0, - * fddi1,...,fddi6, fddi7) specified in drivers/net/Space.c. - * If loaded as a module, it will detect and initialize all - * adapters the first time it is called. - * - * Let's say that skfp_probe() is getting called to initialize fddi0. - * Furthermore, let's say there are three supported controllers in the - * system. Before skfp_probe() leaves, devices fddi0, fddi1, and fddi2 - * will be initialized and a global flag will be set to indicate that - * skfp_probe() has already been called. - * - * However...the OS doesn't know that we've already initialized - * devices fddi1 and fddi2 so skfp_probe() gets called again and again - * until it reaches the end of the device list for FDDI (presently, - * fddi7). It's important that the driver "pretend" to probe for - * devices fddi1 and fddi2 and return success. Devices fddi3 - * through fddi7 will return failure since they weren't initialized. - * - * This algorithm seems to work for the time being. As other FDDI - * drivers are written for Linux, a more generic approach (perhaps - * similar to the Ethernet card approach) may need to be implemented. + * This is now called by PCI driver registration process + * for each board found. * * Return Codes: * 0 - This device (fddi0, fddi1, etc) configured successfully @@ -254,370 +218,166 @@ * initialized and the board resources are read and stored in * the device structure. 
*/ -static int skfp_probe(struct net_device *dev) +static __init int skfp_init_one(struct pci_dev *pdev, + const struct pci_device_id *ent) { - int i; /* used in for loops */ - struct pci_dev *pdev = NULL; /* PCI device structure */ -#ifndef MEM_MAPPED_IO - u16 port; /* temporary I/O (port) address */ - int port_len; /* length of port address range (in bytes) */ -#else - unsigned long port; -#endif - u16 command; /* PCI Configuration space Command register val */ + struct net_device *dev; struct s_smc *smc; /* board pointer */ - struct net_device *tmp = dev; - u8 first_dev_used = 0; - u16 SubSysId; + u32 port, len; + int err; PRINTK(KERN_INFO "entering skfp_probe\n"); - /* - * Verify whether we're going through skfp_probe() again - * - * If so, see if we're going through for a subsequent fddi device that - * we've already initialized. If we are, return success (0). If not, - * return failure (-ENODEV). - */ + if (num_boards == 0) + printk("%s\n", boot_msg); - if (autoprobed) { - PRINTK(KERN_INFO "Already entered skfp_probe\n"); - if (dev != NULL) { - if ((strncmp(dev->name, "fddi", 4) == 0) && - (dev->base_addr != 0)) { - return (0); - } - return (-ENODEV); - } + err = pci_enable_device(pdev); + if (err) + goto err_out1; + + +#ifdef MEM_MAPPED_IO + if (!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM)) { + printk(KERN_ERR "skfp: region is not an MMIO resource\n"); + err = -EIO; + goto err_out1; + } + port = pci_resource_start(pdev, 0); + len = pci_resource_len(pdev, 0); + + if (len < 0x4000) { + printk(KERN_ERR "skfp: Invalid PCI region size: %d\n", len); + err = -EIO; + goto err_out1; } - autoprobed = 1; /* set global flag */ - - printk("%s\n", boot_msg); - - /* Scan for Syskonnect FDDI PCI controllers */ - for (i = 0; i < SKFP_MAX_NUM_BOARDS; i++) { // scan for PCI cards - PRINTK(KERN_INFO "Check device %d\n", i); - if ((pdev=pci_find_device(PCI_VENDOR_ID_SK, PCI_DEVICE_ID_SK_FP, - pdev)) == 0) { - break; - } - if (pci_enable_device(pdev)) - continue; - -#ifndef 
MEM_MAPPED_IO - /* Verify that I/O enable bit is set (PCI slot is enabled) */ - pci_read_config_word(pdev, PCI_COMMAND, &command); - if ((command & PCI_COMMAND_IO) == 0) { - PRINTK("I/O enable bit not set!"); - PRINTK(" Verify that slot is enabled\n"); - continue; - } - - /* Turn off memory mapped space and enable mastering */ - - PRINTK(KERN_INFO "Command Reg: %04x\n", command); - command |= PCI_COMMAND_MASTER; - command &= ~PCI_COMMAND_MEMORY; - pci_write_config_word(pdev, PCI_COMMAND, command); - - /* Read I/O base address from PCI Configuration Space */ - - pci_read_config_word(pdev, PCI_BASE_ADDRESS_1, &port); - port &= PCI_BASE_ADDRESS_IO_MASK; // clear I/O bit (bit 0) - - /* Verify port address range is not already being used */ - - port_len = FP_IO_LEN; - if (check_region(port, port_len) != 0) { - printk("I/O range allocated to adapter"); - printk(" (0x%X-0x%X) is already being used!\n", port, - (port + port_len - 1)); - continue; - } #else - /* Verify that MEM enable bit is set (PCI slot is enabled) */ - pci_read_config_word(pdev, PCI_COMMAND, &command); - if ((command & PCI_COMMAND_MEMORY) == 0) { - PRINTK("MEMORY-I/O enable bit not set!"); - PRINTK(" Verify that slot is enabled\n"); - continue; - } - - /* Turn off IO mapped space and enable mastering */ - - PRINTK(KERN_INFO "Command Reg: %04x\n", command); - command |= PCI_COMMAND_MASTER; - command &= ~PCI_COMMAND_IO; - pci_write_config_word(pdev, PCI_COMMAND, command); - - port = pci_resource_start(pdev, 0); - - port = (unsigned long)ioremap(port, 0x4000); - if (!port){ - printk("skfp: Unable to map MEMORY register, " - "FDDI adapter will be disabled.\n"); - break; - } -#endif - - if ((!loading_module) || first_dev_used) { - /* Allocate a device structure for this adapter */ - tmp = alloc_device(dev, port); - } - first_dev_used = 1; // only significant first time - - pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &SubSysId); - - if (tmp != NULL) { - if (loading_module) - link_modules(dev, tmp); - dev = 
tmp; - init_dev(dev, port); - dev->irq = pdev->irq; - - /* Initialize board structure with bus-specific info */ - - smc = (struct s_smc *) dev->priv; - smc->os.dev = dev; - smc->os.bus_type = SK_BUS_TYPE_PCI; - smc->os.pdev = *pdev; - smc->os.QueueSkb = MAX_TX_QUEUE_LEN; - smc->os.MaxFrameSize = MAX_FRAME_SIZE; - smc->os.dev = dev; - smc->hw.slot = -1; - smc->os.ResetRequested = FALSE; - skb_queue_head_init(&smc->os.SendSkbQueue); - - if (skfp_driver_init(dev) == 0) { - int SubSysId = pdev->subsystem_device; - // only increment global board - // count on success - num_boards++; - if ((SubSysId & 0xff00) == 0x5500 || - (SubSysId & 0xff00) == 0x5800) { - printk("%s: SysKonnect FDDI PCI adapter" - " found (SK-%04X)\n", dev->name, - SubSysId); - } else { - printk("%s: FDDI PCI adapter found\n", - dev->name); - } - } else { - kfree(dev); - i = SKFP_MAX_NUM_BOARDS; // stop search - } - } // if (dev != NULL) - } // for SKFP_MAX_NUM_BOARDS - - /* - * If we're at this point we're going through skfp_probe() for the - * first time. Return success (0) if we've initialized 1 or more - * boards. Otherwise, return failure (-ENODEV). - */ - - if (num_boards > 0) - return (0); - else { - release_region (dev->base_addr, FP_IO_LEN); - printk("no SysKonnect FDDI adapter found\n"); - return (-ENODEV); + if (!(pci_resource_flags(pdev, 1) & IORESOURCE_IO)) { + printk(KERN_ERR "skfp: region is not PIO resource\n"); + err = -EIO; + goto err_out1; } -} // skfp_probe - - -/************************ - * - * Search the entire 'fddi' device list for a fixed probe. If a match isn't - * found then check for an autoprobe or unused device location. If they - * are not available then insert a new device structure at the end of - * the current list. 
- * - ************************/ -static struct net_device *alloc_device(struct net_device *dev, u_long iobase) -{ - struct net_device *adev = NULL; - int fixed = 0, new_dev = 0; - PRINTK(KERN_INFO "entering alloc_device\n"); - if (!dev) - return dev; - - num_fddi = fddi_dev_index(dev->name); - if (loading_module) { - num_fddi++; - dev = insert_device(dev); - return dev; + port = pci_resource_start(pdev, 1); + len = pci_resource_len(pdev, 1); + if (len < FP_IO_LEN) { + printk(KERN_ERR "skfp: Invalid PCI region size: %d\n", + len); + err = -EIO; + goto err_out1; } - while (1) { - if (((dev->base_addr == NO_ADDRESS) || - (dev->base_addr == 0)) && !adev) { - adev = dev; - } else if ((dev->priv == NULL) && (dev->base_addr == iobase)) { - fixed = 1; - } else { - if (dev->next == NULL) { - new_dev = 1; - } else if (strncmp(dev->next->name, "fddi", 4) != 0) { - new_dev = 1; - } - } - if ((dev->next == NULL) || new_dev || fixed) - break; - dev = dev->next; - num_fddi++; - } // while (1) - - if (adev && !fixed) { - dev = adev; - num_fddi = fddi_dev_index(dev->name); - new_dev = 0; - } - if (((dev->next == NULL) && ((dev->base_addr != NO_ADDRESS) && - (dev->base_addr != 0)) && !fixed) || - new_dev) { - num_fddi++; /* New device */ - dev = insert_device(dev); - } - if (dev) { - if (!dev->priv) { - /* Allocate space for private board structure */ - dev->priv = (void *) kmalloc(sizeof(struct s_smc), - GFP_KERNEL); - if (dev->priv == NULL) { - printk("%s: Could not allocate memory for", - dev->name); - printk(" private board structure!\n"); - return (NULL); - } - /* clear structure */ - memset(dev->priv, 0, sizeof(struct s_smc)); - } +#endif + err = pci_request_regions(pdev, "skfddi"); + if (err) + goto err_out1; + + pci_set_master(pdev); + + dev = alloc_fddidev(sizeof(struct s_smc)); + if (!dev) { + printk(KERN_ERR "skfp: Unable to allocate fddi device, " + "FDDI adapter will be disabled.\n"); + err = -ENOMEM; + goto err_out2; + } + +#ifdef MEM_MAPPED_IO + dev->base_addr = 
(unsigned long) ioremap(port, len); + if (!dev->base_addr) { + printk(KERN_ERR "skfp: Unable to map MEMORY register, " + "FDDI adapter will be disabled.\n"); + err = -EIO; + goto err_out3; } - return dev; -} // alloc_device - - - -/************************ - * - * Initialize device structure - * - ************************/ -static void init_dev(struct net_device *dev, u_long iobase) -{ - /* Initialize new device structure */ - - dev->mem_end = 0; /* shared memory isn't used */ - dev->mem_start = 0; /* shared memory isn't used */ - dev->base_addr = iobase; /* save port (I/O) base address */ - dev->if_port = 0; /* not applicable to FDDI adapters */ - dev->dma = 0; /* Bus Master DMA doesn't require channel */ - dev->irq = 0; - - netif_start_queue(dev); +#else + dev->base_addr = port; +#endif + dev->irq = pdev->irq; dev->get_stats = &skfp_ctl_get_stats; dev->open = &skfp_open; dev->stop = &skfp_close; + dev->init = &skfp_driver_init; dev->hard_start_xmit = &skfp_send_pkt; - dev->hard_header = NULL; /* set in fddi_setup() */ - dev->rebuild_header = NULL; /* set in fddi_setup() */ dev->set_multicast_list = &skfp_ctl_set_multicast_list; dev->set_mac_address = &skfp_ctl_set_mac_address; dev->do_ioctl = &skfp_ioctl; - dev->set_config = NULL; /* not supported for now &&& */ dev->header_cache_update = NULL; /* not supported */ - dev->change_mtu = NULL; /* set in fddi_setup() */ SET_MODULE_OWNER(dev); + SET_NETDEV_DEV(dev, &pdev->dev); - /* Initialize remaining device structure information */ - fddi_setup(dev); -} // init_device + /* Initialize board structure with bus-specific info */ + smc = (struct s_smc *) dev->priv; + smc->os.dev = dev; + smc->os.bus_type = SK_BUS_TYPE_PCI; + smc->os.pdev = *pdev; + smc->os.QueueSkb = MAX_TX_QUEUE_LEN; + smc->os.MaxFrameSize = MAX_FRAME_SIZE; + smc->os.dev = dev; + smc->hw.slot = -1; + smc->os.ResetRequested = FALSE; + skb_queue_head_init(&smc->os.SendSkbQueue); + + err = register_netdev(dev); + if (err) + goto err_out4; + + ++num_boards; 
+ pci_set_drvdata(pdev, dev); + + if ((pdev->subsystem_device & 0xff00) == 0x5500 || + (pdev->subsystem_device & 0xff00) == 0x5800) + printk("%s: SysKonnect FDDI PCI adapter" + " found (SK-%04X)\n", dev->name, + pdev->subsystem_device); + else + printk("%s: FDDI PCI adapter found\n", dev->name); + return 0; +err_out4: +#ifdef MEM_MAPPED_IO + iounmap((void *) dev->base_addr); +#endif +err_out3: + free_netdev(dev); +err_out2: + pci_release_regions(pdev); +err_out1: + return err; +} -/************************ - * - * If at end of fddi device list and can't use current entry, malloc - * one up. If memory could not be allocated, print an error message. - * -************************/ -static struct net_device *insert_device(struct net_device *dev) +/* + * Called for each adapter board from pci_unregister_driver + */ +static void __devexit skfp_remove_one(struct pci_dev *pdev) { - struct net_device *new; - int len; + struct net_device *p = pci_get_drvdata(pdev); + struct s_smc *lp = p->priv; - PRINTK(KERN_INFO "entering insert_device\n"); - len = sizeof(struct net_device) + sizeof(struct s_smc); - new = (struct net_device *) kmalloc(len, GFP_KERNEL); - if (new == NULL) { - printk("fddi%d: Device not initialised, insufficient memory\n", - num_fddi); - return NULL; - } else { - memset((char *) new, 0, len); - new->priv = (struct s_smc *) (new + 1); - new->init = skfp_probe; - if (!loading_module) { - new->next = dev->next; - dev->next = new; - } - /* create new device name */ - if (num_fddi > 999) { - sprintf(new->name, "fddi????"); - } else { - sprintf(new->name, "fddi%d", num_fddi); - } - } - return new; -} // insert_device - - -/************************ - * - * Get the number of a "fddiX" string - * - ************************/ -static int fddi_dev_index(unsigned char *s) -{ - int i = 0, j = 0; + unregister_netdev(p); - for (; *s; s++) { - if (isdigit(*s)) { - j = 1; - i = (i * 10) + (*s - '0'); - } else if (j) - break; + if (lp->os.SharedMemAddr) { + 
pci_free_consistent(&lp->os.pdev, + lp->os.SharedMemSize, + lp->os.SharedMemAddr, + lp->os.SharedMemDMA); + lp->os.SharedMemAddr = NULL; + } + if (lp->os.LocalRxBuffer) { + pci_free_consistent(&lp->os.pdev, + MAX_FRAME_SIZE, + lp->os.LocalRxBuffer, + lp->os.LocalRxBufferDMA); + lp->os.LocalRxBuffer = NULL; } - return i; -} // fddi_dev_index - - -/************************ - * - * Used if loaded as module only. Link the device structures - * together. Needed to release them all at unload. - * -************************/ -static void link_modules(struct net_device *dev, struct net_device *tmp) -{ - struct net_device *p = dev; - - if (p) { - while (((struct s_smc *) (p->priv))->os.next_module) { - p = ((struct s_smc *) (p->priv))->os.next_module; - } - - if (dev != tmp) { - ((struct s_smc *) (p->priv))->os.next_module = tmp; - } else { - ((struct s_smc *) (p->priv))->os.next_module = NULL; - } - } - return; -} // link_modules - +#ifdef MEM_MAPPED_IO + iounmap((void *) p->base_addr); +#endif + pci_release_regions(pdev); + free_netdev(p); + pci_set_drvdata(pdev, NULL); +} /* * ==================== @@ -644,11 +404,12 @@ * 0 - initialization succeeded * -1 - initialization failed */ -static int skfp_driver_init(struct net_device *dev) +static __init int skfp_driver_init(struct net_device *dev) { struct s_smc *smc = (struct s_smc *) dev->priv; skfddi_priv *bp = PRIV(dev); u8 val; /* used for I/O read/writes */ + int err = -EIO; PRINTK(KERN_INFO "entering skfp_driver_init\n"); @@ -729,7 +490,7 @@ bp->LocalRxBuffer, bp->LocalRxBufferDMA); bp->LocalRxBuffer = NULL; } - return (-1); + return err; } // skfp_driver_init @@ -757,14 +518,15 @@ static int skfp_open(struct net_device *dev) { struct s_smc *smc = (struct s_smc *) dev->priv; + int err; PRINTK(KERN_INFO "entering skfp_open\n"); /* Register IRQ - support shared interrupts by passing device ptr */ - if (request_irq(dev->irq, (void *) skfp_interrupt, SA_SHIRQ, - dev->name, dev)) { - printk("%s: Requested IRQ %d is busy\n", 
dev->name, dev->irq); - return (-EAGAIN); - } + err = request_irq(dev->irq, (void *) skfp_interrupt, SA_SHIRQ, + dev->name, dev); + if (err) + return err; + /* * Set current address to factory MAC address * @@ -788,6 +550,7 @@ /* Disable promiscuous filter settings */ mac_drv_rx_mode(smc, RX_DISABLE_PROMISC); + netif_start_queue(dev); return (0); } // skfp_open @@ -822,7 +585,6 @@ static int skfp_close(struct net_device *dev) { struct s_smc *smc = (struct s_smc *) dev->priv; - struct sk_buff *skb; skfddi_priv *bp = PRIV(dev); CLI_FBI(); @@ -835,13 +597,7 @@ /* Deregister (free) IRQ */ free_irq(dev->irq, dev); - for (;;) { - skb = skb_dequeue(&bp->SendSkbQueue); - if (skb == NULL) - break; - bp->QueueSkb++; - dev_kfree_skb(skb); - } + skb_queue_purge(&bp->SendSkbQueue); return (0); } // skfp_close @@ -1276,6 +1032,8 @@ break; default: printk("ioctl for %s: unknow cmd: %04x\n", dev->name, ioc.cmd); + status = -EOPNOTSUPP; + } // switch return status; @@ -2529,63 +2287,21 @@ } // drv_reset_indication - -static struct net_device *mdev; +static struct pci_driver skfddi_pci_driver = { + .name = "skfddi", + .id_table = skfddi_pci_tbl, + .probe = skfp_init_one, + .remove = __devexit_p(skfp_remove_one), +}; static int __init skfd_init(void) { - struct net_device *p; - - if ((mdev = insert_device(NULL)) == NULL) - return -ENOMEM; - - for (p = mdev; p != NULL; p = ((struct s_smc *)p->priv)->os.next_module) { - if (register_netdev(p) != 0) { - printk("skfddi init_module failed\n"); - return -EIO; - } - } - - return 0; + return pci_module_init(&skfddi_pci_driver); } -static struct net_device *unlink_modules(struct net_device *p) -{ - struct net_device *next = NULL; - - if (p->priv) { /* Private areas allocated? 
*/ - struct s_smc *lp = (struct s_smc *) p->priv; - - next = lp->os.next_module; - - if (lp->os.SharedMemAddr) { - pci_free_consistent(&lp->os.pdev, - lp->os.SharedMemSize, - lp->os.SharedMemAddr, - lp->os.SharedMemDMA); - lp->os.SharedMemAddr = NULL; - } - if (lp->os.LocalRxBuffer) { - pci_free_consistent(&lp->os.pdev, - MAX_FRAME_SIZE, - lp->os.LocalRxBuffer, - lp->os.LocalRxBufferDMA); - lp->os.LocalRxBuffer = NULL; - } - release_region(p->base_addr, - (lp->os.bus_type == SK_BUS_TYPE_PCI ? FP_IO_LEN : 0)); - } - unregister_netdev(p); - printk("%s: unloaded\n", p->name); - free_netdev(p); /* Free the device structure */ - - return next; -} // unlink_modules - static void __exit skfd_exit(void) { - while (mdev) - mdev = unlink_modules(mdev); + pci_unregister_driver(&skfddi_pci_driver); } module_init(skfd_init); From aviro@redhat.com Thu Dec 4 16:59:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 16:59:53 -0800 (PST) Received: from devserv.devel.redhat.com (pix-525-pool.redhat.com [66.187.233.200]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB50xeTa022324 for ; Thu, 4 Dec 2003 16:59:40 -0800 Received: from devserv.devel.redhat.com (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.10/8.12.10) with ESMTP id hB50xGDl005522; Thu, 4 Dec 2003 19:59:16 -0500 Received: (from aviro@localhost) by devserv.devel.redhat.com (8.12.10/8.12.10/Submit) id hB50xGP7005520; Thu, 4 Dec 2003 19:59:16 -0500 Date: Thu, 4 Dec 2003 19:59:16 -0500 From: Alexander Viro To: Stephen Hemminger Cc: Jeff Garzik , netdev@oss.sgi.com, Alexander Viro Subject: Re: [PATCH] skfddi - convert to new pci model. 
Message-ID: <20031205005915.GD31510@devserv.devel.redhat.com> References: <20031204163928.0f34d5d1.shemminger@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031204163928.0f34d5d1.shemminger@osdl.org> User-Agent: Mutt/1.4.1i X-archive-position: 1882 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: aviro@redhat.com Precedence: bulk X-list: netdev On Thu, Dec 04, 2003 at 04:39:28PM -0800, Stephen Hemminger wrote: > + dev->irq = pdev->irq; > dev->get_stats = &skfp_ctl_get_stats; > dev->open = &skfp_open; > dev->stop = &skfp_close; > + dev->init = &skfp_driver_init; Ehh... Don't do that, please. net_device ->init() means trouble, since getting failure from register_netdev() gives you no clue whether it had failed before, during or after ->init(). Makes for an interesting cleanup... You can do that if required cleanup can be deduced from the state of *dev after register_netdev() failure (e.g. if no cleanup is ever needed), but generally it's less PITA to just call the damn function directly before register_netdev() and leave ->init NULL. 
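The ordering suggested above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not kernel code and not the fix that was eventually applied: struct fake_netdev, fake_register_netdev(), board_init(), board_cleanup() and probe_one() are hypothetical stand-ins for net_device, register_netdev(), skfp_driver_init(), its teardown and the probe routine. The point it tries to show is that with the init step called directly and ->init left NULL, every failure maps to exactly one cleanup label.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct net_device; not the kernel structure. */
struct fake_netdev {
    int (*init)(struct fake_netdev *dev); /* left NULL, as suggested */
    int hw_ready;   /* set once board_init() has succeeded */
    int registered; /* set once fake_register_netdev() has succeeded */
};

/* Knobs used only by this sketch to force a failure at a chosen step. */
int fail_board_init;
int fail_register;

/* Stand-in for the driver's hardware setup (skfp_driver_init in the patch). */
static int board_init(struct fake_netdev *dev)
{
    if (fail_board_init)
        return -EIO;
    dev->hw_ready = 1;
    return 0;
}

/* Undo whatever board_init() set up. */
static void board_cleanup(struct fake_netdev *dev)
{
    dev->hw_ready = 0;
}

/* Stand-in for register_netdev(): runs ->init if set, then may still fail. */
static int fake_register_netdev(struct fake_netdev *dev)
{
    if (dev->init) {
        int err = dev->init(dev);
        if (err)
            return err; /* caller cannot tell this from a later failure */
    }
    if (fail_register)
        return -ENOMEM;
    dev->registered = 1;
    return 0;
}

/* Probe with the init step called directly before registration. */
int probe_one(struct fake_netdev **out)
{
    struct fake_netdev *dev;
    int err;

    assert(out != NULL);

    dev = calloc(1, sizeof(*dev));
    if (!dev)
        return -ENOMEM;

    err = board_init(dev); /* direct call; dev->init stays NULL */
    if (err)
        goto err_free;     /* nothing to undo except the allocation */

    err = fake_register_netdev(dev);
    if (err)
        goto err_board;    /* board_init() is known to have succeeded */

    *out = dev;
    return 0;

err_board:
    board_cleanup(dev);
err_free:
    free(dev);
    return err;
}
```

If ->init pointed at board_init() instead, a failure returned by fake_register_netdev() would not say whether board_cleanup() is needed, which is the ambiguity described above.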
From garzik@gtf.org Thu Dec 4 22:38:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 22:39:19 -0800 (PST) Received: from havoc.gtf.org (havoc.gtf.org [63.247.75.124]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB56cnTa000304 for ; Thu, 4 Dec 2003 22:38:51 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id 1AFE06659; Fri, 5 Dec 2003 01:38:43 -0500 (EST) Date: Fri, 5 Dec 2003 01:38:43 -0500 From: Jeff Garzik To: netdev@oss.sgi.com Subject: [RFR] new e100 driver Message-ID: <20031205063842.GA23843@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 1883 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev e100 3.0.9_dev just got checked into net-drivers-2.5-exp queue. As I do occasionally (especially for smaller drivers), I post them for review and comment. Patches welcome in addition to comments. One thing I am tempted to request is use of the new module_param()... Jeff /******************************************************************************* Copyright(c) 1999 - 2003 Intel Corporation. All rights reserved. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
The full GNU General Public License is included in this distribution in the file called LICENSE. Contact Information: Linux NICS Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497 *******************************************************************************/ /* * e100.c: Intel(R) PRO/100 ethernet driver * * (Re)written 2003 by scott.feldman@intel.com. Based loosely on * original e100 driver, but better described as a munging of * e100, e1000, eepro100, tg3, 8139cp, and other drivers. * * References: * Intel 8255x 10/100 Mbps Ethernet Controller Family, * Open Source Software Developers Manual, * http://sourceforge.net/projects/e1000 * * * Theory of Operation * * I. General * * The driver supports Intel(R) 10/100 Mbps PCI Fast Ethernet * controller family, which includes the 82557, 82558, 82559, 82550, * 82551, and 82562 devices. 82558 and greater controllers * integrate the Intel 82555 PHY. The controllers are used in * server and client network interface cards, as well as in * LAN-On-Motherboard (LOM), CardBus, MiniPCI, and ICHx * configurations. 8255x supports a 32-bit linear addressing * mode and operates at 33 MHz PCI clock rate. * * II. Driver Operation * * Memory-mapped mode is used exclusively to access the device's * shared-memory structure, the Control/Status Registers (CSR). All * setup, configuration, and control of the device, including queuing * of Tx, Rx, and configuration commands is through the CSR. * cmd_lock serializes accesses to the CSR command register. cb_lock * protects the shared Command Block List (CBL). * * 8255x is highly MII-compliant and all accesses to the PHY go * through the Management Data Interface (MDI). Consequently, the * driver leverages the mii.c library shared with other MII-compliant * devices. * * Big- and Little-Endian byte order as well as 32- and 64-bit * archs are supported. Weak-ordered memory and non-cache-coherent * archs are supported. * * III. 
Transmit * * A Tx skb is mapped and hangs off of a TCB. TCBs are linked * together in a fixed-size ring (CBL) thus forming the flexible mode * memory structure. A TCB marked with the suspend-bit indicates * the end of the ring. The last TCB processed suspends the * controller, and the controller can be restarted by issuing a CU * resume command to continue from the suspend point, or a CU start * command to start at a given position in the ring. * * Non-Tx commands (config, multicast setup, etc) are linked * into the CBL ring along with Tx commands. The common structure * used for both Tx and non-Tx commands is the Command Block (CB). * * cb_to_use is the next CB to use for queuing a command; cb_to_clean * is the next CB to check for completion; cb_to_send is the first * CB to start on in case of a previous failure to resume. CB clean * up happens in interrupt context in response to a CU interrupt, or * in dev->poll in the case where NAPI is enabled. cbs_avail keeps * track of the number of free CB resources available. * * Hardware padding of short packets to minimum packet size is * enabled. 82557 pads with 7Eh, while the later controllers pad * with 00h. * * IV. Receive * * The Receive Frame Area (RFA) comprises a ring of Receive Frame * Descriptors (RFD) + data buffer, thus forming the simplified mode * memory structure. Rx skbs are allocated to contain both the RFD * and the data buffer, but the RFD is pulled off before the skb is * indicated. The data buffer is aligned such that encapsulated * protocol headers are u32-aligned. Since the RFD is part of the * mapped shared memory, and completion status is contained within * the RFD, the RFD must be dma_sync'ed to maintain a consistent * view from software and hardware. * * Under typical operation, the receive unit (RU) is started once, * and the controller happily fills RFDs as frames arrive. If * replacement RFDs cannot be allocated, or the RU goes non-active, * the RU must be restarted. 
Frame arrival generates an interrupt, * and Rx indication and re-allocation happen in the same context, * therefore no locking is required. If NAPI is enabled, this work * happens in dev->poll. A software-generated interrupt is * generated from the watchdog to recover from a failed allocation * scenario where all Rx resources have been indicated and none * replaced. * * V. Miscellaneous * * VLAN offload support of tagging, stripping and filtering is not * supported, but driver will accommodate the extra 4-byte VLAN tag * for processing by upper layers. Tx/Rx Checksum offloading is not * supported. Tx Scatter/Gather is not supported. Jumbo Frames are * not supported. * * NAPI support is enabled with CONFIG_E100_NAPI. * * MagicPacket(tm) WoL support is enabled/disabled via ethtool. * * Thanks to JC (jchapman@katalix.com) for helping with * testing/troubleshooting the development driver. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define DRV_NAME "e100" #define DRV_VERSION "3.0.9_dev" #define DRV_DESCRIPTION "Intel(R) PRO/100 Network Driver" #define DRV_COPYRIGHT "Copyright(c) 1999-2003 Intel Corporation" #define PFX DRV_NAME ": " #define E100_WATCHDOG_PERIOD (2 * HZ) #define E100_NAPI_WEIGHT 16 MODULE_DESCRIPTION(DRV_DESCRIPTION); MODULE_AUTHOR(DRV_COPYRIGHT); MODULE_LICENSE("GPL"); static int debug = 3; MODULE_PARM(debug, "i"); MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)"); #define DPRINTK(nlevel, klevel, fmt, args...) 
\ (void)((NETIF_MSG_##nlevel & nic->msg_enable) && \ printk(KERN_##klevel PFX "%s: %s: " fmt, nic->netdev->name, \ __FUNCTION__ , ## args)) #define INTEL_8255X_ETHERNET_DEVICE(device_id, ich) {\ PCI_VENDOR_ID_INTEL, device_id, PCI_ANY_ID, PCI_ANY_ID, \ PCI_CLASS_NETWORK_ETHERNET << 8, 0xFFFF00, ich } static struct pci_device_id e100_id_table[] = { INTEL_8255X_ETHERNET_DEVICE(0x1029, 0), INTEL_8255X_ETHERNET_DEVICE(0x1030, 0), INTEL_8255X_ETHERNET_DEVICE(0x1031, 3), INTEL_8255X_ETHERNET_DEVICE(0x1032, 3), INTEL_8255X_ETHERNET_DEVICE(0x1033, 3), INTEL_8255X_ETHERNET_DEVICE(0x1034, 3), INTEL_8255X_ETHERNET_DEVICE(0x1038, 3), INTEL_8255X_ETHERNET_DEVICE(0x1039, 4), INTEL_8255X_ETHERNET_DEVICE(0x103A, 4), INTEL_8255X_ETHERNET_DEVICE(0x103B, 4), INTEL_8255X_ETHERNET_DEVICE(0x103C, 4), INTEL_8255X_ETHERNET_DEVICE(0x103D, 4), INTEL_8255X_ETHERNET_DEVICE(0x103E, 4), INTEL_8255X_ETHERNET_DEVICE(0x1050, 5), INTEL_8255X_ETHERNET_DEVICE(0x1051, 5), INTEL_8255X_ETHERNET_DEVICE(0x1052, 5), INTEL_8255X_ETHERNET_DEVICE(0x1053, 5), INTEL_8255X_ETHERNET_DEVICE(0x1054, 5), INTEL_8255X_ETHERNET_DEVICE(0x1055, 5), INTEL_8255X_ETHERNET_DEVICE(0x1059, 0), INTEL_8255X_ETHERNET_DEVICE(0x1209, 0), INTEL_8255X_ETHERNET_DEVICE(0x1229, 0), INTEL_8255X_ETHERNET_DEVICE(0x2449, 2), INTEL_8255X_ETHERNET_DEVICE(0x2459, 2), INTEL_8255X_ETHERNET_DEVICE(0x245D, 2), { 0, } }; MODULE_DEVICE_TABLE(pci, e100_id_table); enum mac { mac_82557_D100_A = 0, mac_82557_D100_B = 1, mac_82557_D100_C = 2, mac_82558_D101_A4 = 4, mac_82558_D101_B0 = 5, mac_82559_D101M = 8, mac_82559_D101S = 9, mac_82550_D102 = 12, mac_82550_D102_C = 13, mac_82550_D102_E = 15, mac_unknown = 0xFF, }; enum phy { phy_100a = 0x000003E0, phy_100c = 0x035002A8, phy_82555_tx = 0x015002A8, phy_nsc_tx = 0x5C002000, phy_82562_et = 0x033002A8, phy_82562_em = 0x032002A8, phy_82562_eh = 0x017002A8, phy_unknown = 0xFFFFFFFF, }; /* CSR (Control/Status Registers) */ struct csr { struct { u8 status; u8 stat_ack; u8 cmd_lo; u8 cmd_hi; u32 gen_ptr; } scb; 
u32 port; u16 flash_ctrl; u8 eeprom_ctrl_lo; u8 eeprom_ctrl_hi; u32 mdi_ctrl; u32 rx_dma_count; }; enum scb_status { rus_idle = 0x00, rus_suspended = 0x04, rus_no_resources = 0x08, rus_ready = 0x10, rus_mask = 0x3C, cus_idle = 0x00, cus_suspended = 0x40, cus_active = 0x80, cus_mask = 0xC0, }; enum scb_stat_ack { stat_ack_sw_gen = 0x04, stat_ack_rnr = 0x10, stat_ack_cu_idle = 0x20, stat_ack_frame_rx = 0x40, stat_ack_cu_cmd_done = 0x80, stat_ack_rx = (stat_ack_sw_gen | stat_ack_rnr | stat_ack_frame_rx), stat_ack_tx = (stat_ack_cu_idle | stat_ack_cu_cmd_done), }; enum scb_cmd_hi { irq_mask_none = 0x00, irq_mask_all = 0x01, irq_sw_gen = 0x02, }; enum scb_cmd_lo { ruc_start = 0x01, ruc_load_base = 0x06, cuc_start = 0x10, cuc_resume = 0x20, cuc_dump_addr = 0x40, cuc_dump_stats = 0x50, cuc_load_base = 0x60, cuc_dump_reset = 0x70, }; enum port { software_reset = 0x0000, selftest = 0x0001, selective_reset = 0x0002, }; enum eeprom_ctrl_lo { eesk = 0x01, eecs = 0x02, eedi = 0x04, eedo = 0x08, }; enum mdi_ctrl { mdi_write = 0x04000000, mdi_read = 0x08000000, mdi_ready = 0x10000000, }; enum eeprom_op { op_write = 0x05, op_read = 0x06, op_ewds = 0x10, op_ewen = 0x13, }; enum eeprom_offsets { eeprom_id = 0x0A, eeprom_config_asf = 0x0D, eeprom_smbus_addr = 0x90, }; enum eeprom_id { eeprom_id_wol = 0x0020, }; enum eeprom_config_asf { eeprom_asf = 0x8000, eeprom_gcl = 0x4000, }; enum cb_status { cb_complete = 0x8000, cb_ok = 0x2000, }; enum cb_command { cb_iaaddr = 0x0001, cb_config = 0x0002, cb_multi = 0x0003, cb_tx = 0x0004, cb_dump = 0x0006, cb_tx_sf = 0x0008, cb_cid = 0x1f00, cb_i = 0x2000, cb_s = 0x4000, cb_el = 0x8000, }; struct rfd { u16 status; u16 command; u32 link; u32 rbd; u16 actual_size; u16 size; }; struct rx_list { struct list_head list; struct sk_buff *skb; dma_addr_t dma_addr; unsigned int length; }; #if defined(__BIG_ENDIAN_BITFIELD) #define X(a,b) b,a #else #define X(a,b) a,b #endif struct config { /*0*/ u8 X(byte_count:6, pad0:2); /*1*/ u8 X(X(rx_fifo_limit:4, 
tx_fifo_limit:3), pad1:1); /*2*/ u8 adaptive_ifs; /*3*/ u8 X(X(X(X(mwi_enable:1, type_enable:1), read_align_enable:1), term_write_cache_line:1), pad3:4); /*4*/ u8 X(rx_dma_max_count:7, pad4:1); /*5*/ u8 X(tx_dma_max_count:7, dma_max_count_enable:1); /*6*/ u8 X(X(X(X(X(X(X(late_scb_update:1, direct_rx_dma:1), tno_intr:1), cna_intr:1), standard_tcb:1), standard_stat_counter:1), rx_discard_overruns:1), rx_save_bad_frames:1); /*7*/ u8 X(X(X(X(X(rx_discard_short_frames:1, tx_underrun_retry:2), pad7:2), rx_extended_rfd:1), tx_two_frames_in_fifo:1), tx_dynamic_tbd:1); /*8*/ u8 X(X(mii_mode:1, pad8:6), csma_disabled:1); /*9*/ u8 X(X(X(X(X(rx_tcpudp_checksum:1, pad9:3), vlan_arp_tco:1), link_status_wake:1), arp_wake:1), mcmatch_wake:1); /*10*/ u8 X(X(X(pad10:3, no_source_addr_insertion:1), preamble_length:2), loopback:2); /*11*/ u8 X(linear_priority:3, pad11:5); /*12*/ u8 X(X(linear_priority_mode:1, pad12:3), ifs:4); /*13*/ u8 ip_addr_lo; /*14*/ u8 ip_addr_hi; /*15*/ u8 X(X(X(X(X(X(X(promiscuous_mode:1, broadcast_disabled:1), wait_after_win:1), pad15_1:1), ignore_ul_bit:1), crc_16_bit:1), pad15_2:1), crs_or_cdt:1); /*16*/ u8 fc_delay_lo; /*17*/ u8 fc_delay_hi; /*18*/ u8 X(X(X(X(X(rx_stripping:1, tx_padding:1), rx_crc_transfer:1), rx_long_ok:1), fc_priority_threshold:3), pad18:1); /*19*/ u8 X(X(X(X(X(X(X(addr_wake:1, magic_packet_disable:1), fc_disable:1), fc_restop:1), fc_restart:1), fc_reject:1), full_duplex_force:1), full_duplex_pin:1); /*20*/ u8 X(X(X(pad20_1:5, fc_priority_location:1), multi_ia:1), pad20_2:1); /*21*/ u8 X(X(pad21_1:3, multicast_all:1), pad21_2:4); /*22*/ u8 X(X(rx_d102_mode:1, rx_vlan_drop:1), pad22:6); u8 pad_d102[9]; }; #define E100_MAX_MULTICAST_ADDRS 64 struct multi { u16 count; u8 addr[E100_MAX_MULTICAST_ADDRS * ETH_ALEN + 2/*pad*/]; }; /* Important: keep total struct u32-aligned */ struct cb { u16 status; u16 command; u32 link; union { u8 iaaddr[ETH_ALEN]; struct config config; struct multi multi; struct { u32 tbd_array; u16 tcb_byte_count; u8 
threshold; u8 tbd_count; struct { u32 buf_addr; u16 size; u16 eol; } tbd; } tcb; u32 dump_buffer_addr; } u; struct cb *next, *prev; dma_addr_t dma_addr; struct sk_buff *skb; }; enum loopback { lb_none = 0, lb_mac = 1, lb_phy = 3, }; struct stats { u32 tx_good_frames, tx_max_collisions, tx_late_collisions, tx_underruns, tx_lost_crs, tx_deferred, tx_single_collisions, tx_multiple_collisions, tx_total_collisions; u32 rx_good_frames, rx_crc_errors, rx_alignment_errors, rx_resource_errors, rx_overrun_errors, rx_cdt_errors, rx_short_frame_errors; u32 fc_xmt_pause, fc_rcv_pause, fc_rcv_unsupported; u16 xmt_tco_frames, rcv_tco_frames; u32 complete; }; struct mem { struct { u32 signature; u32 result; } selftest; struct stats stats; u8 dump_buf[596]; }; struct param_range { u32 min; u32 max; u32 count; }; struct params { struct param_range rfds; struct param_range cbs; }; struct nic { /* Begin: frequently used values: keep adjacent for cache effect */ u32 msg_enable ____cacheline_aligned; struct net_device *netdev; struct pci_dev *pdev; struct list_head rx_list_head ____cacheline_aligned; struct rx_list *rx_list; struct rfd blank_rfd; spinlock_t cb_lock ____cacheline_aligned; spinlock_t cmd_lock; struct csr *csr; enum scb_cmd_lo cuc_cmd; unsigned int cbs_avail; struct cb *cbs; struct cb *cb_to_use; struct cb *cb_to_send; struct cb *cb_to_clean; u16 tx_command; /* End: frequently used values: keep adjacent for cache effect */ enum { ich = (1 << 0), promiscuous = (1 << 1), multicast_all = (1 << 2), wol_magic = (1 << 3), } flags ____cacheline_aligned; enum mac mac; enum phy phy; struct params params; struct net_device_stats net_stats; struct timer_list watchdog; struct timer_list blink_timer; struct mii_if_info mii; enum loopback loopback; struct mem *mem; dma_addr_t dma_addr; dma_addr_t cbs_dma_addr; u8 adaptive_ifs; u8 tx_threshold; u32 tx_frames; u32 tx_collisions; u32 tx_deferred; u32 tx_single_collisions; u32 tx_multiple_collisions; u32 tx_fc_pause; u32 tx_tco_frames; u32 
rx_fc_pause; u32 rx_fc_unsupported; u32 rx_tco_frames; u8 rev_id; u16 leds; u16 eeprom_wc; u16 eeprom[256]; u32 pm_state[16]; }; static void e100_get_defaults(struct nic *nic) { struct param_range rfds = { .min = 64, .max = 256, .count = 64 }; struct param_range cbs = { .min = 64, .max = 256, .count = 64 }; pci_read_config_byte(nic->pdev, PCI_REVISION_ID, &nic->rev_id); /* MAC type is encoded as rev ID; exception: ICH is treated as 82559 */ nic->mac = (nic->flags & ich) ? mac_82559_D101M : nic->rev_id; if(nic->mac == mac_unknown) nic->mac = mac_82557_D100_A; nic->params.rfds = rfds; nic->params.cbs = cbs; /* Quadwords to DMA into FIFO before starting frame transmit */ nic->tx_threshold = 0xE0; nic->tx_command = cpu_to_le16(cb_tx | cb_i | cb_tx_sf | ((nic->mac >= mac_82558_D101_A4) ? cb_cid : 0)); /* Template for a freshly allocated RFD */ nic->blank_rfd.status = 0; nic->blank_rfd.command = cpu_to_le16(cb_el); nic->blank_rfd.link = 0; nic->blank_rfd.rbd = 0xFFFFFFFF; nic->blank_rfd.actual_size = 0; nic->blank_rfd.size = cpu_to_le16(VLAN_ETH_FRAME_LEN); } static inline void e100_write_flush(struct nic *nic) { /* Flush previous PCI writes through intermediate bridges * by doing a benign read */ (void)readb(&nic->csr->scb.status); } static inline void e100_enable_irq(struct nic *nic) { writeb(irq_mask_none, &nic->csr->scb.cmd_hi); e100_write_flush(nic); } static inline void e100_disable_irq(struct nic *nic) { writeb(irq_mask_all, &nic->csr->scb.cmd_hi); e100_write_flush(nic); } static void e100_hw_reset(struct nic *nic) { /* Put CU and RU into idle with a selective reset to get * device off of PCI bus */ writel(selective_reset, &nic->csr->port); e100_write_flush(nic); udelay(20); /* Now fully reset device */ writel(software_reset, &nic->csr->port); e100_write_flush(nic); udelay(20); /* TCO workaround - 82559 and greater */ if(nic->mac >= mac_82559_D101M) { /* Issue a redundant CU load base without setting * general pointer, and without waiting for scb to * clear. 
This gets us into post-driver. Finally, * wait 20 msec for reset to take effect. */ writeb(cuc_load_base, &nic->csr->scb.cmd_lo); mdelay(20); } /* Mask off our interrupt line - it's unmasked after reset */ e100_disable_irq(nic); } static int e100_self_test(struct nic *nic) { u32 dma_addr = nic->dma_addr + offsetof(struct mem, selftest); /* Passing the self-test is a pretty good indication * that the device can DMA to/from host memory */ nic->mem->selftest.signature = 0; nic->mem->selftest.result = 0xFFFFFFFF; writel(selftest | dma_addr, &nic->csr->port); e100_write_flush(nic); /* Wait 10 msec for self-test to complete */ set_current_state(TASK_UNINTERRUPTIBLE); schedule_timeout(HZ / 100 + 1); /* Interrupts are enabled after self-test */ e100_disable_irq(nic); /* Check results of self-test */ if(nic->mem->selftest.result != 0) { DPRINTK(HW, ERR, "Self-test failed: result=0x%08X\n", nic->mem->selftest.result); return -ETIMEDOUT; } if(nic->mem->selftest.signature == 0) { DPRINTK(HW, ERR, "Self-test failed: timed out\n"); return -ETIMEDOUT; } return 0; } static void e100_eeprom_write(struct nic *nic, u16 addr_len, u16 addr, u16 data) { u32 cmd_addr_data[3]; u8 ctrl; int i, j; /* Three cmds: write/erase enable, write data, write/erase disable */ cmd_addr_data[0] = op_ewen << (addr_len - 2); cmd_addr_data[1] = (((op_write << addr_len) | addr) << 16) | data; cmd_addr_data[2] = op_ewds << (addr_len - 2); /* Bit-bang cmds to write word to eeprom */ for(j = 0; j < 3; j++) { /* Chip select */ writeb(eecs | eesk, &nic->csr->eeprom_ctrl_lo); e100_write_flush(nic); udelay(4); for(i = 31; i >= 0; i--) { ctrl = (cmd_addr_data[j] & (1 << i)) ? 
eecs | eedi : eecs; writeb(ctrl, &nic->csr->eeprom_ctrl_lo); e100_write_flush(nic); udelay(4); writeb(ctrl | eesk, &nic->csr->eeprom_ctrl_lo); e100_write_flush(nic); udelay(4); } /* Wait 10 msec for cmd to complete */ set_current_state(TASK_UNINTERRUPTIBLE); schedule_timeout(HZ / 100 + 1); /* Chip deselect */ writeb(0, &nic->csr->eeprom_ctrl_lo); e100_write_flush(nic); udelay(4); } }; /* General technique stolen from the eepro100 driver - very clever */ static u16 e100_eeprom_read(struct nic *nic, u16 *addr_len, u16 addr) { u32 cmd_addr_data; u16 data = 0; u8 ctrl; int i; cmd_addr_data = ((op_read << *addr_len) | addr) << 16; /* Chip select */ writeb(eecs | eesk, &nic->csr->eeprom_ctrl_lo); e100_write_flush(nic); udelay(4); /* Bit-bang to read word from eeprom */ for(i = 31; i >= 0; i--) { ctrl = (cmd_addr_data & (1 << i)) ? eecs | eedi : eecs; writeb(ctrl, &nic->csr->eeprom_ctrl_lo); e100_write_flush(nic); udelay(4); writeb(ctrl | eesk, &nic->csr->eeprom_ctrl_lo); e100_write_flush(nic); udelay(4); /* Eeprom drives a dummy zero to EEDO after receiving * complete address. Use this to adjust addr_len. */ ctrl = readb(&nic->csr->eeprom_ctrl_lo); if(!(ctrl & eedo) && i > 16) { *addr_len -= (i - 16); i = 17; } data = (data << 1) | (ctrl & eedo ? 
1 : 0); } /* Chip deselect */ writeb(0, &nic->csr->eeprom_ctrl_lo); e100_write_flush(nic); udelay(4); return data; }; /* Load entire EEPROM image into driver cache and validate checksum */ static int e100_eeprom_load(struct nic *nic) { u16 addr, addr_len = 8, checksum = 0; /* Try reading with an 8-bit addr len to discover actual addr len */ e100_eeprom_read(nic, &addr_len, 0); nic->eeprom_wc = 1 << addr_len; for(addr = 0; addr < nic->eeprom_wc; addr++) { nic->eeprom[addr] = e100_eeprom_read(nic, &addr_len, addr); if(addr < nic->eeprom_wc - 1) checksum += nic->eeprom[addr]; } /* The checksum, stored in the last word, is calculated such that * the sum of words should be 0xBABA */ checksum = 0xBABA - checksum; if(checksum != nic->eeprom[nic->eeprom_wc - 1]) { DPRINTK(PROBE, ERR, "EEPROM corrupted\n"); return -EAGAIN; } return 0; } /* Save (portion of) driver EEPROM cache to device and update checksum */ static int e100_eeprom_save(struct nic *nic, u16 start, u16 count) { u16 addr, addr_len = 8, checksum = 0; /* Try reading with an 8-bit addr len to discover actual addr len */ e100_eeprom_read(nic, &addr_len, 0); nic->eeprom_wc = 1 << addr_len; if(start + count >= nic->eeprom_wc) return -EINVAL; for(addr = start; addr < start + count; addr++) e100_eeprom_write(nic, addr_len, addr, nic->eeprom[addr]); /* The checksum, stored in the last word, is calculated such that * the sum of words should be 0xBABA */ for(addr = 0; addr < nic->eeprom_wc - 1; addr++) checksum += nic->eeprom[addr]; nic->eeprom[nic->eeprom_wc - 1] = 0xBABA - checksum; e100_eeprom_write(nic, addr_len, nic->eeprom_wc - 1, 0xBABA - checksum); return 0; } #define E100_WAIT_SCB_TIMEOUT 40 static inline int e100_exec_cmd(struct nic *nic, u8 cmd, dma_addr_t dma_addr) { unsigned long flags; unsigned int i; int err = 0; spin_lock_irqsave(&nic->cmd_lock, flags); /* Previous command is accepted when SCB clears */ for(i = 0; i < E100_WAIT_SCB_TIMEOUT; i++) { if(likely(!readb(&nic->csr->scb.cmd_lo))) break; 
cpu_relax(); if(unlikely(i > (E100_WAIT_SCB_TIMEOUT >> 1))) udelay(5); } if(unlikely(i == E100_WAIT_SCB_TIMEOUT)) { err = -EAGAIN; goto err_unlock; } if(unlikely(cmd != cuc_resume)) writel(dma_addr, &nic->csr->scb.gen_ptr); writeb(cmd, &nic->csr->scb.cmd_lo); err_unlock: spin_unlock_irqrestore(&nic->cmd_lock, flags); return err; } static inline int e100_exec_cb(struct nic *nic, struct sk_buff *skb, void (*cb_prepare)(struct nic *, struct cb *, struct sk_buff *)) { struct cb *cb; unsigned long flags; int err = 0; spin_lock_irqsave(&nic->cb_lock, flags); if(unlikely(!nic->cbs_avail)) { err = -ENOMEM; goto err_unlock; } cb = nic->cb_to_use; nic->cb_to_use = cb->next; nic->cbs_avail--; cb->skb = skb; if(unlikely(!nic->cbs_avail)) err = -ENOSPC; cb_prepare(nic, cb, skb); /* Order is important otherwise we'll be in a race with h/w: * set S-bit in current first, then clear S-bit in previous. */ cb->command |= cpu_to_le16(cb_s); cb->prev->command &= cpu_to_le16(~cb_s); while(nic->cb_to_send != nic->cb_to_use) { if(unlikely((err = e100_exec_cmd(nic, nic->cuc_cmd, nic->cb_to_send->dma_addr)))) { /* Ok, here's where things get sticky. It's * possible that we can't schedule the command * because the controller is too busy, so * let's just queue the command and try again * when another command is scheduled. */ break; } else { nic->cuc_cmd = cuc_resume; nic->cb_to_send = nic->cb_to_send->next; } } err_unlock: spin_unlock_irqrestore(&nic->cb_lock, flags); return err; } static u16 mdio_ctrl(struct nic *nic, u32 addr, u32 dir, u32 reg, u16 data) { u32 data_out = 0; unsigned int i; writel((reg << 16) | (addr << 21) | dir | data, &nic->csr->mdi_ctrl); for(i = 0; i < 100; i++) { udelay(20); if((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready) break; } DPRINTK(HW, DEBUG, "%s:addr=%d, reg=%d, data_in=0x%04X, data_out=0x%04X\n", dir == mdi_read ? 
"READ" : "WRITE", addr, reg, data, data_out); return (u16)data_out; } static int mdio_read(struct net_device *netdev, int addr, int reg) { return mdio_ctrl(netdev->priv, addr, mdi_read, reg, 0); } static void mdio_write(struct net_device *netdev, int addr, int reg, int data) { mdio_ctrl(netdev->priv, addr, mdi_write, reg, data); } static void e100_configure(struct nic *nic, struct cb *cb, struct sk_buff *skb) { struct config *config = &cb->u.config; u8 *c = (u8 *)config; cb->command = cpu_to_le16(cb_config); memset(config, 0, sizeof(struct config)); config->byte_count = 0x16; /* bytes in this struct */ config->rx_fifo_limit = 0x8; /* bytes in FIFO before DMA */ config->direct_rx_dma = 0x1; /* reserved */ config->standard_tcb = 0x1; /* 1=standard, 0=extended */ config->standard_stat_counter = 0x1; /* 1=standard, 0=extended */ config->rx_discard_short_frames = 0x1; /* 1=discard, 0=pass */ config->tx_underrun_retry = 0x3; /* # of underrun retries */ config->mii_mode = 0x1; /* 1=MII mode, 0=503 mode */ config->pad10 = 0x6; config->no_source_addr_insertion = 0x1; /* 1=no, 0=yes */ config->preamble_length = 0x2; /* 0=1, 1=3, 2=7, 3=15 bytes */ config->ifs = 0x6; /* x16 = inter frame spacing */ config->ip_addr_hi = 0xF2; /* ARP IP filter - not used */ config->pad15_1 = 0x1; config->pad15_2 = 0x1; config->crs_or_cdt = 0x0; /* 0=CRS only, 1=CRS or CDT */ config->fc_delay_hi = 0x40; /* time delay for fc frame */ config->tx_padding = 0x1; /* 1=pad short frames */ config->fc_priority_threshold = 0x7; /* 7=priority fc disabled */ config->pad18 = 0x1; config->full_duplex_pin = 0x1; /* 1=examine FDX# pin */ config->pad20_1 = 0x1F; config->fc_priority_location = 0x1; /* 1=byte#31, 0=byte#19 */ config->pad21_1 = 0x5; config->adaptive_ifs = nic->adaptive_ifs; config->loopback = nic->loopback; if(nic->mii.force_media && nic->mii.full_duplex) config->full_duplex_force = 0x1; /* 1=force, 0=auto */ if(nic->flags & promiscuous || nic->loopback) { config->rx_save_bad_frames = 0x1; /* 
1=save, 0=discard */ config->rx_discard_short_frames = 0x0; /* 1=discard, 0=save */ config->promiscuous_mode = 0x1; /* 1=on, 0=off */ } if(nic->flags & multicast_all) config->multicast_all = 0x1; /* 1=accept, 0=no */ if(!(nic->flags & wol_magic)) config->magic_packet_disable = 0x1; /* 1=off, 0=on */ if(nic->mac >= mac_82558_D101_A4) { config->fc_disable = 0x1; /* 1=Tx fc off, 0=Tx fc on */ config->mwi_enable = 0x1; /* 1=enable, 0=disable */ config->standard_tcb = 0x0; /* 1=standard, 0=extended */ config->rx_long_ok = 0x1; /* 1=VLANs ok, 0=standard */ if(nic->mac >= mac_82559_D101M) config->tno_intr = 0x1; /* TCO stats enable */ else config->standard_stat_counter = 0x0; } DPRINTK(HW, DEBUG, "[00-07]=%02X:%02X:%02X:%02X:%02X:%02X:%02X:%02X\n", c[0], c[1], c[2], c[3], c[4], c[5], c[6], c[7]); DPRINTK(HW, DEBUG, "[08-15]=%02X:%02X:%02X:%02X:%02X:%02X:%02X:%02X\n", c[8], c[9], c[10], c[11], c[12], c[13], c[14], c[15]); DPRINTK(HW, DEBUG, "[16-23]=%02X:%02X:%02X:%02X:%02X:%02X:%02X:%02X\n", c[16], c[17], c[18], c[19], c[20], c[21], c[22], c[23]); } static void e100_setup_iaaddr(struct nic *nic, struct cb *cb, struct sk_buff *skb) { u8 *dev_addr = nic->netdev->dev_addr; DPRINTK(HW, DEBUG, "dev_addr=%02X:%02X:%02X:%02X:%02X:%02X\n", dev_addr[0], dev_addr[1], dev_addr[2], dev_addr[3], dev_addr[4], dev_addr[5]); cb->command = cpu_to_le16(cb_iaaddr); memcpy(cb->u.iaaddr, dev_addr, ETH_ALEN); } static void e100_dump(struct nic *nic, struct cb *cb, struct sk_buff *skb) { cb->command = cpu_to_le16(cb_dump); cb->u.dump_buffer_addr = cpu_to_le32(nic->dma_addr + offsetof(struct mem, dump_buf)); } #define NCONFIG_AUTO_SWITCH 0x0080 #define MII_NSC_CONG MII_RESV1 #define NSC_CONG_ENABLE 0x0100 #define NSC_CONG_TXREADY 0x0400 #define ADVERTISE_FC_SUPPORTED 0x0400 static int e100_phy_init(struct nic *nic) { struct net_device *netdev = nic->netdev; u32 addr; u16 bmcr, stat, id_lo, id_hi, cong; nic->mii.phy_id = 0; nic->mii.advertising = 0; nic->mii.phy_id_mask = 0x1F; 
nic->mii.reg_num_mask = 0x1F; nic->mii.dev = netdev; nic->mii.full_duplex = 0; nic->mii.force_media = 0; nic->mii.mdio_read = mdio_read; nic->mii.mdio_write = mdio_write; /* Discover phy addr by searching addrs in order {1,0,2,..., 31} */ for(addr = 0; addr < 32; addr++) { nic->mii.phy_id = (addr == 0) ? 1 : (addr == 1) ? 0 : addr; bmcr = mdio_read(netdev, nic->mii.phy_id, MII_BMCR); stat = mdio_read(netdev, nic->mii.phy_id, MII_BMSR); stat = mdio_read(netdev, nic->mii.phy_id, MII_BMSR); if(!((bmcr == 0xFFFF) || ((stat == 0) && (bmcr == 0)))) break; } DPRINTK(HW, DEBUG, "phy_addr = %d\n", nic->mii.phy_id); if(addr == 32) return -EAGAIN; /* Select the phy and isolate the rest */ for(addr = 0; addr < 32; addr++) { if(addr != nic->mii.phy_id) { mdio_write(netdev, addr, MII_BMCR, BMCR_ISOLATE); } else { bmcr = mdio_read(netdev, addr, MII_BMCR); mdio_write(netdev, addr, MII_BMCR, bmcr & ~BMCR_ISOLATE); } } /* Get phy ID */ id_lo = mdio_read(netdev, nic->mii.phy_id, MII_PHYSID1); id_hi = mdio_read(netdev, nic->mii.phy_id, MII_PHYSID2); nic->phy = (u32)id_hi << 16 | (u32)id_lo; DPRINTK(HW, DEBUG, "phy ID = 0x%08X\n", nic->phy); /* Handle National tx phy */ if(nic->phy == phy_nsc_tx) { /* Disable congestion control */ cong = mdio_read(netdev, nic->mii.phy_id, MII_NSC_CONG); cong |= NSC_CONG_TXREADY; cong &= ~NSC_CONG_ENABLE; mdio_write(netdev, nic->mii.phy_id, MII_NSC_CONG, cong); } /* enable MDI/MDI-X auto-switching */ if(nic->mac >= mac_82550_D102) mdio_write(netdev, nic->mii.phy_id, MII_NCONFIG, NCONFIG_AUTO_SWITCH); return 0; } static int e100_hw_init(struct nic *nic) { int err; e100_hw_reset(nic); DPRINTK(HW, ERR, "e100_hw_init\n"); if(!in_interrupt() && (err = e100_self_test(nic))) return err; if((err = e100_phy_init(nic))) return err; if((err = e100_exec_cmd(nic, cuc_load_base, 0))) return err; if((err = e100_exec_cmd(nic, ruc_load_base, 0))) return err; if((err = e100_exec_cb(nic, NULL, e100_configure))) return err; if((err = e100_exec_cb(nic, NULL, 
e100_setup_iaaddr))) return err; if((err = e100_exec_cmd(nic, cuc_dump_addr, nic->dma_addr + offsetof(struct mem, stats)))) return err; if((err = e100_exec_cmd(nic, cuc_dump_reset, 0))) return err; e100_disable_irq(nic); return 0; } static void e100_multi(struct nic *nic, struct cb *cb, struct sk_buff *skb) { struct net_device *netdev = nic->netdev; struct dev_mc_list *list = netdev->mc_list; u16 i, count = min(netdev->mc_count, E100_MAX_MULTICAST_ADDRS); cb->command = cpu_to_le16(cb_multi); cb->u.multi.count = cpu_to_le16(count * ETH_ALEN); for(i = 0; list && i < count; i++, list = list->next) memcpy(&cb->u.multi.addr[i*ETH_ALEN], &list->dmi_addr, ETH_ALEN); } static void e100_set_multicast_list(struct net_device *netdev) { struct nic *nic = netdev->priv; DPRINTK(HW, DEBUG, "mc_count=%d, flags=0x%04X\n", netdev->mc_count, netdev->flags); if(netdev->flags & IFF_PROMISC) nic->flags |= promiscuous; else nic->flags &= ~promiscuous; if(netdev->flags & IFF_ALLMULTI || netdev->mc_count > E100_MAX_MULTICAST_ADDRS) nic->flags |= multicast_all; else nic->flags &= ~multicast_all; e100_exec_cb(nic, NULL, e100_configure); e100_exec_cb(nic, NULL, e100_multi); } static void e100_update_stats(struct nic *nic) { struct net_device_stats *ns = &nic->net_stats; struct stats *s = &nic->mem->stats; u32 *complete = (nic->mac < mac_82558_D101_A4) ? &s->fc_xmt_pause : (nic->mac < mac_82559_D101M) ? (u32 *)&s->xmt_tco_frames : &s->complete; /* Device's stats reporting may take several microseconds to * complete, so we're always waiting for results of the * previous command. 
*/ if(*complete == cpu_to_le32(0x0000A007)) { *complete = 0; nic->tx_frames = le32_to_cpu(s->tx_good_frames); nic->tx_collisions = le32_to_cpu(s->tx_total_collisions); ns->tx_aborted_errors += le32_to_cpu(s->tx_max_collisions); ns->tx_window_errors += le32_to_cpu(s->tx_late_collisions); ns->tx_carrier_errors += le32_to_cpu(s->tx_lost_crs); ns->tx_fifo_errors += le32_to_cpu(s->tx_underruns); ns->collisions += nic->tx_collisions; ns->tx_errors += le32_to_cpu(s->tx_max_collisions) + le32_to_cpu(s->tx_lost_crs); ns->rx_dropped += le32_to_cpu(s->rx_resource_errors); ns->rx_length_errors += le32_to_cpu(s->rx_short_frame_errors); ns->rx_over_errors += le32_to_cpu(s->rx_resource_errors); ns->rx_crc_errors += le32_to_cpu(s->rx_crc_errors); ns->rx_frame_errors += le32_to_cpu(s->rx_alignment_errors); ns->rx_fifo_errors += le32_to_cpu(s->rx_overrun_errors); ns->rx_errors += le32_to_cpu(s->rx_crc_errors) + le32_to_cpu(s->rx_alignment_errors) + le32_to_cpu(s->rx_short_frame_errors) + le32_to_cpu(s->rx_cdt_errors); nic->tx_deferred += le32_to_cpu(s->tx_deferred); nic->tx_single_collisions += le32_to_cpu(s->tx_single_collisions); nic->tx_multiple_collisions += le32_to_cpu(s->tx_multiple_collisions); if(nic->mac >= mac_82558_D101_A4) { nic->tx_fc_pause += le32_to_cpu(s->fc_xmt_pause); nic->rx_fc_pause += le32_to_cpu(s->fc_rcv_pause); nic->rx_fc_unsupported += le32_to_cpu(s->fc_rcv_unsupported); if(nic->mac >= mac_82559_D101M) { nic->tx_tco_frames += le16_to_cpu(s->xmt_tco_frames); nic->rx_tco_frames += le16_to_cpu(s->rcv_tco_frames); } } } e100_exec_cmd(nic, cuc_dump_reset, 0); } static void e100_adjust_adaptive_ifs(struct nic *nic, int speed, int duplex) { /* Adjust inter-frame-spacing (IFS) between two transmits if * we're getting collisions on a half-duplex connection. */ if(duplex == DUPLEX_HALF) { u32 prev = nic->adaptive_ifs; u32 min_frames = (speed == SPEED_100) ? 
1000 : 100; if((nic->tx_frames / 32 < nic->tx_collisions) && (nic->tx_frames > min_frames)) { if(nic->adaptive_ifs < 60) nic->adaptive_ifs += 5; } else if (nic->tx_frames < min_frames) { if(nic->adaptive_ifs >= 5) nic->adaptive_ifs -= 5; } if(nic->adaptive_ifs != prev) e100_exec_cb(nic, NULL, e100_configure); } } static void e100_watchdog(unsigned long data) { struct nic *nic = (struct nic *)data; struct ethtool_cmd cmd; DPRINTK(TIMER, DEBUG, "right now = %ld\n", jiffies); /* mii library handles link maintenance tasks */ mii_ethtool_gset(&nic->mii, &cmd); if(mii_link_ok(&nic->mii) && !netif_carrier_ok(nic->netdev)) { DPRINTK(LINK, INFO, "link up, %sMbps, %s-duplex\n", cmd.speed == SPEED_100 ? "100" : "10", cmd.duplex == DUPLEX_FULL ? "full" : "half"); } else if(!mii_link_ok(&nic->mii) && netif_carrier_ok(nic->netdev)) { DPRINTK(LINK, INFO, "link down\n"); } mii_check_link(&nic->mii); /* Software generated interrupt to recover from (rare) Rx * allocation failure */ writeb(irq_sw_gen, &nic->csr->scb.cmd_hi); e100_write_flush(nic); e100_update_stats(nic); e100_adjust_adaptive_ifs(nic, cmd.speed, cmd.duplex); if(nic->mac <= mac_82557_D100_C) /* Issue a multicast command to workaround a 557 lock up */ e100_set_multicast_list(nic->netdev); mod_timer(&nic->watchdog, jiffies + E100_WATCHDOG_PERIOD); } static inline void e100_xmit_prepare(struct nic *nic, struct cb *cb, struct sk_buff *skb) { cb->command = nic->tx_command; cb->u.tcb.tbd_array = cb->dma_addr + offsetof(struct cb, u.tcb.tbd); cb->u.tcb.tcb_byte_count = 0; cb->u.tcb.threshold = nic->tx_threshold; cb->u.tcb.tbd_count = 1; cb->u.tcb.tbd.buf_addr = cpu_to_le32(pci_map_single(nic->pdev, skb->data, skb->len, PCI_DMA_TODEVICE)); cb->u.tcb.tbd.size = cpu_to_le16(skb->len); } static int e100_xmit_frame(struct sk_buff *skb, struct net_device *netdev) { struct nic *nic = netdev->priv; int err = e100_exec_cb(nic, skb, e100_xmit_prepare); switch(err) { case -ENOSPC: /* We queued the skb, but now we're out of space, so * 
stop the queue before we completely run out. */ netif_stop_queue(netdev); break; case -ENOMEM: /* This is a hard error - log it. */ DPRINTK(TX_ERR, DEBUG, "Out of Tx resources, returning skb\n"); netif_stop_queue(netdev); return 1; } netdev->trans_start = jiffies; return 0; } static inline int e100_tx_clean(struct nic *nic) { struct cb *cb; int tx_cleaned = 0; spin_lock(&nic->cb_lock); DPRINTK(TX_DONE, DEBUG, "cb->status = 0x%04X\n", nic->cb_to_clean->status); /* Clean CBs marked complete */ for(cb = nic->cb_to_clean; cb->status & cpu_to_le16(cb_complete); cb = nic->cb_to_clean = cb->next) { if(likely(cb->skb != NULL)) { nic->net_stats.tx_packets++; nic->net_stats.tx_bytes += cb->skb->len; pci_unmap_single(nic->pdev, le32_to_cpu(cb->u.tcb.tbd.buf_addr), le16_to_cpu(cb->u.tcb.tbd.size), PCI_DMA_TODEVICE); dev_kfree_skb_any(cb->skb); tx_cleaned = 1; } cb->status = 0; nic->cbs_avail++; } spin_unlock(&nic->cb_lock); /* Recover from running out of Tx resources in xmit_frame */ if(unlikely(tx_cleaned && netif_queue_stopped(nic->netdev))) netif_wake_queue(nic->netdev); return tx_cleaned; } static void e100_clean_cbs(struct nic *nic, int free_mem) { if(nic->cbs) { while(nic->cb_to_clean != nic->cb_to_use) { struct cb *cb = nic->cb_to_clean; if(cb->skb) { pci_unmap_single(nic->pdev, le32_to_cpu(cb->u.tcb.tbd.buf_addr), le16_to_cpu(cb->u.tcb.tbd.size), PCI_DMA_TODEVICE); dev_kfree_skb(cb->skb); } nic->cb_to_clean = nic->cb_to_clean->next; } nic->cbs_avail = nic->params.cbs.count; if(free_mem) { pci_free_consistent(nic->pdev, sizeof(struct cb) * nic->params.cbs.count, nic->cbs, nic->cbs_dma_addr); nic->cbs = NULL; nic->cbs_avail = 0; } } nic->cuc_cmd = cuc_start; nic->cb_to_use = nic->cb_to_send = nic->cb_to_clean = nic->cbs; } static int e100_alloc_cbs(struct nic *nic) { struct cb *cb; unsigned int i, count = nic->params.cbs.count; nic->cuc_cmd = cuc_start; nic->cb_to_use = nic->cb_to_send = nic->cb_to_clean = NULL; nic->cbs_avail = 0; nic->cbs = 
pci_alloc_consistent(nic->pdev, sizeof(struct cb) * count, &nic->cbs_dma_addr); if(!nic->cbs) return -ENOMEM; for(cb = nic->cbs, i = 0; i < count; cb++, i++) { cb->next = (i + 1 < count) ? cb + 1 : nic->cbs; cb->prev = (i == 0) ? nic->cbs + count - 1 : cb - 1; cb->dma_addr = nic->cbs_dma_addr + i * sizeof(struct cb); cb->link = cpu_to_le32(nic->cbs_dma_addr + ((i+1) % count) * sizeof(struct cb)); } nic->cb_to_use = nic->cb_to_send = nic->cb_to_clean = nic->cbs; nic->cbs_avail = count; return 0; } static inline void e100_start_receiver(struct nic *nic) { /* (Re)start RU if suspended or idle and RFA is fully allocated */ struct rx_list *curr = list_entry(nic->rx_list_head.next, struct rx_list, list); if(curr->skb) { u8 status = readb(&nic->csr->scb.status); if(unlikely((status & rus_mask) != rus_ready)) e100_exec_cmd(nic, ruc_start, curr->dma_addr); } } static inline int e100_rx_alloc_skb(struct nic *nic, struct rx_list *curr) { unsigned int rx_offset = 2; /* u32 align protocol headers */ curr->dma_addr = 0; curr->length = sizeof(struct rfd) + VLAN_ETH_FRAME_LEN; if(!(curr->skb = dev_alloc_skb(curr->length + rx_offset))) return -ENOMEM; skb_reserve(curr->skb, rx_offset); curr->skb->dev = nic->netdev; curr->dma_addr = pci_map_single(nic->pdev, curr->skb->data, curr->length, PCI_DMA_FROMDEVICE); return 0; } static inline void e100_rx_rfa_add_tail(struct nic *nic, struct rx_list *curr) { struct rfd *rfd = (struct rfd *)curr->skb->data; *rfd = nic->blank_rfd; pci_dma_sync_single(nic->pdev, curr->dma_addr, sizeof(struct rfd), PCI_DMA_TODEVICE); if(likely(curr->list.prev != &nic->rx_list_head)) { struct rx_list *prev = (struct rx_list *)curr->list.prev; if(likely(prev->skb != NULL)) { struct rfd *prev_rfd = (struct rfd *)prev->skb->data; put_unaligned(cpu_to_le32(curr->dma_addr), (u32 *)&prev_rfd->link); prev_rfd->command = 0; pci_dma_sync_single(nic->pdev, prev->dma_addr, sizeof(struct rfd), PCI_DMA_TODEVICE); } } } static inline int e100_rx_indicate(struct nic *nic, 
struct rx_list *curr, unsigned int *work_done, unsigned int work_to_do) { struct sk_buff *skb = curr->skb; struct rfd *rfd = (struct rfd *)skb->data; u16 rfd_status, actual_size; if(unlikely(work_done && *work_done >= work_to_do)) return -EAGAIN; /* Need to sync before taking a peek at cb_complete bit */ pci_dma_sync_single(nic->pdev, curr->dma_addr, sizeof(struct rfd), PCI_DMA_FROMDEVICE); rfd_status = le16_to_cpu(rfd->status); DPRINTK(RX_STATUS, DEBUG, "status=0x%04X\n", rfd_status); /* If data isn't ready, nothing to indicate */ if(unlikely(!(rfd_status & cb_complete))) return -EAGAIN; /* Get actual data size */ actual_size = le16_to_cpu(rfd->actual_size) & 0x3FFF; if(unlikely(actual_size > curr->length - sizeof(struct rfd))) actual_size = curr->length - sizeof(struct rfd); /* Get data */ pci_dma_sync_single(nic->pdev, curr->dma_addr, sizeof(struct rfd) + actual_size, PCI_DMA_FROMDEVICE); pci_unmap_single(nic->pdev, curr->dma_addr, curr->length, PCI_DMA_FROMDEVICE); /* Pull off the RFD and put the actual data (minus eth hdr) */ skb_reserve(skb, sizeof(struct rfd)); skb_put(skb, actual_size); skb->protocol = eth_type_trans(skb, nic->netdev); if(unlikely(!(rfd_status & cb_ok)) || actual_size > nic->netdev->mtu + VLAN_ETH_HLEN) { /* Don't indicate if errors */ dev_kfree_skb_any(skb); } else { nic->net_stats.rx_packets++; nic->net_stats.rx_bytes += actual_size; nic->netdev->last_rx = jiffies; #ifdef CONFIG_E100_NAPI netif_receive_skb(skb); #else netif_rx(skb); #endif if(work_done) (*work_done)++; } curr->length = 0; curr->dma_addr = 0; curr->skb = NULL; return 0; } static inline void e100_rx_clean(struct nic *nic, unsigned int *work_done, unsigned int work_to_do) { struct list_head *list, *tmp; struct rx_list *curr; /* Indicate newly arrived packets */ list_for_each(list, &nic->rx_list_head) { curr = list_entry(list, struct rx_list, list); if(likely(curr->skb != NULL)) if(e100_rx_indicate(nic, curr, work_done, work_to_do)) break; } /* Alloc new skbs to refill list 
	 */
	list_for_each_safe(list, tmp, &nic->rx_list_head) {
		curr = list_entry(list, struct rx_list, list);
		if(unlikely(curr->skb != NULL))
			break; /* List is full, done */
		if(unlikely(e100_rx_alloc_skb(nic, curr)))
			break; /* Better luck next time (see watchdog) */
		list_del(&curr->list);
		list_add_tail(&curr->list, &nic->rx_list_head);
		e100_rx_rfa_add_tail(nic, curr);
	}

	e100_start_receiver(nic);
}

static void e100_rx_clean_list(struct nic *nic)
{
	struct list_head *list;

	if(!nic->rx_list)
		return;

	list_for_each(list, &nic->rx_list_head) {
		struct rx_list *curr = list_entry(list, struct rx_list, list);
		if(curr->skb) {
			pci_unmap_single(nic->pdev, curr->dma_addr,
				curr->length, PCI_DMA_FROMDEVICE);
			dev_kfree_skb(curr->skb);
		}
	}

	kfree(nic->rx_list);
	nic->rx_list = NULL;
}

static int e100_rx_alloc_list(struct nic *nic)
{
	struct rx_list *curr;
	unsigned int i, count = nic->params.rfds.count;

	INIT_LIST_HEAD(&nic->rx_list_head);

	if(!(nic->rx_list = kmalloc(sizeof(struct rx_list)*count, GFP_ATOMIC)))
		return -ENOMEM;

	for(curr = nic->rx_list, i = 0; i < count; curr++, i++) {
		if(e100_rx_alloc_skb(nic, curr)) {
			e100_rx_clean_list(nic);
			return -ENOMEM;
		}
		list_add_tail(&curr->list, &nic->rx_list_head);
		e100_rx_rfa_add_tail(nic, curr);
	}

	return 0;
}

static irqreturn_t e100_intr(int irq, void *dev_id, struct pt_regs *regs)
{
	struct net_device *netdev = dev_id;
	struct nic *nic = netdev->priv;
	u8 stat_ack = readb(&nic->csr->scb.stat_ack);

	DPRINTK(INTR, DEBUG, "stat_ack = 0x%02X\n", stat_ack);

	if(stat_ack == 0x00 || /* Not our interrupt */
	   stat_ack == 0xFF)   /* Hardware is ejected (cardbus, hotswap) */
		return IRQ_NONE;

	/* Ack interrupts */
	writeb(stat_ack, &nic->csr->scb.stat_ack);
	e100_write_flush(nic);

#ifdef CONFIG_E100_NAPI
	e100_disable_irq(nic);
	netif_rx_schedule(netdev);
#else
	if(stat_ack & stat_ack_rx)
		e100_rx_clean(nic, NULL, 0);
	if(stat_ack & stat_ack_tx)
		e100_tx_clean(nic);
#endif

	return IRQ_HANDLED;
}

#ifdef CONFIG_E100_NAPI
static int e100_poll(struct net_device *netdev, int *budget)
{
	struct nic *nic = netdev->priv;
	unsigned int work_to_do = min(netdev->quota, *budget);
	unsigned int work_done = 0;
	int tx_cleaned;

	e100_rx_clean(nic, &work_done, work_to_do);
	tx_cleaned = e100_tx_clean(nic);

	/* If no Rx and Tx cleanup work was done, exit polling mode. */
	if((!tx_cleaned && (work_done == 0)) || !netif_running(netdev)) {
		netif_rx_complete(netdev);
		e100_enable_irq(nic);
		return 0;
	}

	*budget -= work_done;
	netdev->quota -= work_done;

	return 1;
}
#endif

static struct net_device_stats *e100_get_stats(struct net_device *netdev)
{
	struct nic *nic = netdev->priv;
	return &nic->net_stats;
}

static int e100_set_mac_address(struct net_device *netdev, void *p)
{
	struct nic *nic = netdev->priv;
	struct sockaddr *addr = p;

	if (!is_valid_ether_addr(addr->sa_data))
		return -EADDRNOTAVAIL;

	memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len);
	e100_exec_cb(nic, NULL, e100_setup_iaaddr);

	return 0;
}

static int e100_change_mtu(struct net_device *netdev, int new_mtu)
{
	if(new_mtu < ETH_ZLEN || new_mtu > ETH_DATA_LEN)
		return -EINVAL;
	netdev->mtu = new_mtu;
	return 0;
}

static int e100_asf(struct nic *nic)
{
	/* ASF can be enabled from eeprom */
	return((nic->pdev->device >= 0x1050) && (nic->pdev->device <= 0x1055) &&
	   (nic->eeprom[eeprom_config_asf] & eeprom_asf) &&
	   !(nic->eeprom[eeprom_config_asf] & eeprom_gcl) &&
	   ((nic->eeprom[eeprom_smbus_addr] & 0xFF) != 0xFE));
}

static int e100_up(struct nic *nic)
{
	int err;

	if((err = e100_rx_alloc_list(nic)))
		return err;
	if((err = e100_alloc_cbs(nic)))
		goto err_rx_clean_list;
	if((err = e100_hw_init(nic)))
		goto err_clean_cbs;
	e100_set_multicast_list(nic->netdev);
	e100_start_receiver(nic);
	netif_start_queue(nic->netdev);
	mod_timer(&nic->watchdog, jiffies);
	if((err = request_irq(nic->pdev->irq, e100_intr, SA_SHIRQ,
		nic->netdev->name, nic->netdev)))
		goto err_no_irq;
	e100_enable_irq(nic);
	return 0;

err_no_irq:
	del_timer_sync(&nic->watchdog);
	netif_stop_queue(nic->netdev);
err_clean_cbs:
	e100_clean_cbs(nic, 1);
err_rx_clean_list:
	e100_rx_clean_list(nic);
	return err;
}

static void e100_down(struct nic *nic)
{
	e100_disable_irq(nic);
	free_irq(nic->pdev->irq, nic->netdev);
	del_timer_sync(&nic->watchdog);
	netif_carrier_off(nic->netdev);
	netif_stop_queue(nic->netdev);
	e100_clean_cbs(nic, 1);
	e100_rx_clean_list(nic);
}

static void e100_tx_timeout(struct net_device *netdev)
{
	struct nic *nic = netdev->priv;

	DPRINTK(TX_ERR, DEBUG, "scb.status=0x%02X\n",
		readb(&nic->csr->scb.status));
	e100_down(netdev->priv);
	e100_up(netdev->priv);
}

static int e100_loopback_test(struct nic *nic, enum loopback loopback_mode)
{
	int err;
	struct sk_buff *skb;
	struct rx_list *rx;

	/* Use driver resources to perform internal MAC or PHY
	 * loopback test.  A single packet is prepared and transmitted
	 * in loopback mode, and the test passes if the received
	 * packet compares byte-for-byte to the transmitted packet. */

	if((err = e100_rx_alloc_list(nic)))
		return err;
	if((err = e100_alloc_cbs(nic)))
		goto err_clean_rx;

	/* ICH PHY loopback is broken so do MAC loopback instead */
	if(nic->flags & ich && loopback_mode == lb_phy)
		loopback_mode = lb_mac;

	nic->loopback = loopback_mode;
	if((err = e100_hw_init(nic)))
		goto err_loopback_none;

	if(loopback_mode == lb_phy)
		mdio_write(nic->netdev, nic->mii.phy_id, MII_BMCR,
			BMCR_LOOPBACK);

	e100_start_receiver(nic);

	if(!(skb = dev_alloc_skb(ETH_DATA_LEN))) {
		err = -ENOMEM;
		goto err_loopback_none;
	}
	skb_put(skb, ETH_DATA_LEN);
	memset(skb->data, 0xFF, ETH_DATA_LEN);
	e100_xmit_frame(skb, nic->netdev);

	set_current_state(TASK_UNINTERRUPTIBLE);
	schedule_timeout(HZ / 100 + 1);

	rx = list_entry(nic->rx_list_head.next, struct rx_list, list);
	if(memcmp(rx->skb->data + sizeof(struct rfd),
	   skb->data, ETH_DATA_LEN))
		err = -EAGAIN;

err_loopback_none:
	mdio_write(nic->netdev, nic->mii.phy_id, MII_BMCR, 0);
	nic->loopback = lb_none;
	e100_hw_init(nic);
	e100_clean_cbs(nic, 1);
err_clean_rx:
	e100_rx_clean_list(nic);
	return err;
}

#define MII_LED_CONTROL	0x1B

static void e100_blink_led(unsigned long data)
{
	struct nic
		*nic = (struct nic *)data;
	enum led_state {
		led_on     = 0x01,
		led_off    = 0x04,
		led_on_559 = 0x05,
		led_on_557 = 0x07,
	};

	nic->leds = (nic->leds & led_on) ? led_off :
		(nic->mac < mac_82559_D101M) ? led_on_557 : led_on_559;
	mdio_write(nic->netdev, nic->mii.phy_id, MII_LED_CONTROL, nic->leds);
	mod_timer(&nic->blink_timer, jiffies + HZ / 4);
}

static int e100_get_settings(struct net_device *netdev, struct ethtool_cmd *cmd)
{
	struct nic *nic = netdev->priv;
	return mii_ethtool_gset(&nic->mii, cmd);
}

static int e100_set_settings(struct net_device *netdev, struct ethtool_cmd *cmd)
{
	struct nic *nic = netdev->priv;
	return mii_ethtool_sset(&nic->mii, cmd);
}

static void e100_get_drvinfo(struct net_device *netdev,
	struct ethtool_drvinfo *info)
{
	struct nic *nic = netdev->priv;
	strcpy(info->driver, DRV_NAME);
	strcpy(info->version, DRV_VERSION);
	strcpy(info->fw_version, "N/A");
	strcpy(info->bus_info, pci_name(nic->pdev));
}

static int e100_get_regs_len(struct net_device *netdev)
{
	struct nic *nic = netdev->priv;
#define E100_PHY_REGS		0x1C
#define E100_REGS_LEN		1 + E100_PHY_REGS + \
	sizeof(nic->mem->dump_buf) / sizeof(u32)
	return E100_REGS_LEN * sizeof(u32);
}

static void e100_get_regs(struct net_device *netdev,
	struct ethtool_regs *regs, void *p)
{
	struct nic *nic = netdev->priv;
	u32 *buff = p;
	int i;

	regs->version = (1 << 24) | nic->rev_id;
	buff[0] = readb(&nic->csr->scb.cmd_hi) << 24 |
		readb(&nic->csr->scb.cmd_lo) << 16 |
		readw(&nic->csr->scb.status);
	for(i = E100_PHY_REGS; i >= 0; i--)
		buff[1 + E100_PHY_REGS - i] =
			mdio_read(netdev, nic->mii.phy_id, i);
	memset(nic->mem->dump_buf, 0, sizeof(nic->mem->dump_buf));
	e100_exec_cb(nic, NULL, e100_dump);
	set_current_state(TASK_UNINTERRUPTIBLE);
	schedule_timeout(HZ / 100 + 1);
	memcpy(&buff[2 + E100_PHY_REGS], nic->mem->dump_buf,
		sizeof(nic->mem->dump_buf));
}

static void e100_get_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
{
	struct nic *nic = netdev->priv;
	wol->supported = (nic->mac >= mac_82558_D101_A4) ?
		WAKE_MAGIC : 0;
	wol->wolopts = (nic->flags & wol_magic) ? WAKE_MAGIC : 0;
}

static int e100_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
{
	struct nic *nic = netdev->priv;

	if(wol->wolopts != WAKE_MAGIC && wol->wolopts != 0)
		return -EOPNOTSUPP;

	if(wol->wolopts)
		nic->flags |= wol_magic;
	else
		nic->flags &= ~wol_magic;

	pci_enable_wake(nic->pdev, 0, nic->flags & (wol_magic | e100_asf(nic)));
	e100_exec_cb(nic, NULL, e100_configure);

	return 0;
}

static u32 e100_get_msglevel(struct net_device *netdev)
{
	struct nic *nic = netdev->priv;
	return nic->msg_enable;
}

static void e100_set_msglevel(struct net_device *netdev, u32 value)
{
	struct nic *nic = netdev->priv;
	nic->msg_enable = value;
}

static int e100_nway_reset(struct net_device *netdev)
{
	struct nic *nic = netdev->priv;
	return mii_nway_restart(&nic->mii);
}

static u32 e100_get_link(struct net_device *netdev)
{
	struct nic *nic = netdev->priv;
	return mii_link_ok(&nic->mii);
}

static int e100_get_eeprom_len(struct net_device *netdev)
{
	struct nic *nic = netdev->priv;
	return nic->eeprom_wc << 1;
}

#define E100_EEPROM_MAGIC	0x1234
static int e100_get_eeprom(struct net_device *netdev,
	struct ethtool_eeprom *eeprom, u8 *bytes)
{
	struct nic *nic = netdev->priv;

	eeprom->magic = E100_EEPROM_MAGIC;
	memcpy(bytes, &((u8 *)nic->eeprom)[eeprom->offset], eeprom->len);

	return 0;
}

static int e100_set_eeprom(struct net_device *netdev,
	struct ethtool_eeprom *eeprom, u8 *bytes)
{
	struct nic *nic = netdev->priv;

	if(eeprom->magic != E100_EEPROM_MAGIC)
		return -EINVAL;
	memcpy(&((u8 *)nic->eeprom)[eeprom->offset], bytes, eeprom->len);

	return e100_eeprom_save(nic, eeprom->offset >> 1,
		(eeprom->len >> 1) + 1);
}

static void e100_get_ringparam(struct net_device *netdev,
	struct ethtool_ringparam *ring)
{
	struct nic *nic = netdev->priv;
	struct param_range *rfds = &nic->params.rfds;
	struct param_range *cbs = &nic->params.cbs;

	ring->rx_max_pending = rfds->max;
	ring->tx_max_pending = cbs->max;
	ring->rx_mini_max_pending = 0;
	ring->rx_jumbo_max_pending = 0;
	ring->rx_pending = rfds->count;
	ring->tx_pending = cbs->count;
	ring->rx_mini_pending = 0;
	ring->rx_jumbo_pending = 0;
}

static int e100_set_ringparam(struct net_device *netdev,
	struct ethtool_ringparam *ring)
{
	struct nic *nic = netdev->priv;
	struct param_range *rfds = &nic->params.rfds;
	struct param_range *cbs = &nic->params.cbs;

	if(netif_running(netdev))
		e100_down(nic);
	rfds->count = max(ring->rx_pending, rfds->min);
	rfds->count = min(rfds->count, rfds->max);
	cbs->count = max(ring->tx_pending, cbs->min);
	cbs->count = min(cbs->count, cbs->max);
	if(netif_running(netdev))
		e100_up(nic);

	return 0;
}

static char e100_gstrings_test[][ETH_GSTRING_LEN] = {
	"Link test (on/offline)",
	"Eeprom test (on/offline)",
	"Self test (offline)",
	"Mac loopback (offline)",
	"Phy loopback (offline)",
};
#define E100_TEST_LEN	sizeof(e100_gstrings_test) / ETH_GSTRING_LEN

static int e100_diag_test_count(struct net_device *netdev)
{
	return E100_TEST_LEN;
}

static void e100_diag_test(struct net_device *netdev,
	struct ethtool_test *test, u64 *data)
{
	struct nic *nic = netdev->priv;
	int i;

	memset(data, 0, E100_TEST_LEN * sizeof(u64));
	data[0] = !mii_link_ok(&nic->mii);
	data[1] = e100_eeprom_load(nic);
	if(test->flags & ETH_TEST_FL_OFFLINE) {
		if(netif_running(netdev))
			e100_down(nic);
		data[2] = e100_self_test(nic);
		data[3] = e100_loopback_test(nic, lb_mac);
		data[4] = e100_loopback_test(nic, lb_phy);
		if(netif_running(netdev))
			e100_up(nic);
	}
	for(i = 0; i < E100_TEST_LEN; i++)
		test->flags |= data[i] ?
			ETH_TEST_FL_FAILED : 0;
}

static int e100_phys_id(struct net_device *netdev, u32 data)
{
	struct nic *nic = netdev->priv;

	if(!data || data > (u32)(MAX_SCHEDULE_TIMEOUT / HZ))
		data = (u32)(MAX_SCHEDULE_TIMEOUT / HZ);
	mod_timer(&nic->blink_timer, jiffies);
	set_current_state(TASK_INTERRUPTIBLE);
	schedule_timeout(data * HZ);
	del_timer_sync(&nic->blink_timer);
	mdio_write(netdev, nic->mii.phy_id, MII_LED_CONTROL, 0);

	return 0;
}

static char e100_gstrings_stats[][ETH_GSTRING_LEN] = {
	"rx_packets", "tx_packets", "rx_bytes", "tx_bytes", "rx_errors",
	"tx_errors", "rx_dropped", "tx_dropped", "multicast", "collisions",
	"rx_length_errors", "rx_over_errors", "rx_crc_errors",
	"rx_frame_errors", "rx_fifo_errors", "rx_missed_errors",
	"tx_aborted_errors", "tx_carrier_errors", "tx_fifo_errors",
	"tx_heartbeat_errors", "tx_window_errors",
	/* device-specific stats */
	"tx_deferred", "tx_single_collisions", "tx_multi_collisions",
	"tx_flow_control_pause", "rx_flow_control_pause",
	"rx_flow_control_unsupported", "tx_tco_packets", "rx_tco_packets",
};
#define E100_NET_STATS_LEN	21
#define E100_STATS_LEN	sizeof(e100_gstrings_stats) / ETH_GSTRING_LEN

static int e100_get_stats_count(struct net_device *netdev)
{
	return E100_STATS_LEN;
}

static void e100_get_ethtool_stats(struct net_device *netdev,
	struct ethtool_stats *stats, u64 *data)
{
	struct nic *nic = netdev->priv;
	int i;

	for(i = 0; i < E100_NET_STATS_LEN; i++)
		data[i] = ((unsigned long *)&nic->net_stats)[i];

	data[i++] = nic->tx_deferred;
	data[i++] = nic->tx_single_collisions;
	data[i++] = nic->tx_multiple_collisions;
	data[i++] = nic->tx_fc_pause;
	data[i++] = nic->rx_fc_pause;
	data[i++] = nic->rx_fc_unsupported;
	data[i++] = nic->tx_tco_frames;
	data[i++] = nic->rx_tco_frames;
}

static void e100_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
{
	switch(stringset) {
	case ETH_SS_TEST:
		memcpy(data, *e100_gstrings_test, sizeof(e100_gstrings_test));
		break;
	case ETH_SS_STATS:
		memcpy(data, *e100_gstrings_stats,
			sizeof(e100_gstrings_stats));
		break;
	}
}

static struct ethtool_ops e100_ethtool_ops = {
	.get_settings = e100_get_settings,
	.set_settings = e100_set_settings,
	.get_drvinfo = e100_get_drvinfo,
	.get_regs_len = e100_get_regs_len,
	.get_regs = e100_get_regs,
	.get_wol = e100_get_wol,
	.set_wol = e100_set_wol,
	.get_msglevel = e100_get_msglevel,
	.set_msglevel = e100_set_msglevel,
	.nway_reset = e100_nway_reset,
	.get_link = e100_get_link,
	.get_eeprom_len = e100_get_eeprom_len,
	.get_eeprom = e100_get_eeprom,
	.set_eeprom = e100_set_eeprom,
	.get_ringparam = e100_get_ringparam,
	.set_ringparam = e100_set_ringparam,
	.self_test_count = e100_diag_test_count,
	.self_test = e100_diag_test,
	.get_strings = e100_get_strings,
	.phys_id = e100_phys_id,
	.get_stats_count = e100_get_stats_count,
	.get_ethtool_stats = e100_get_ethtool_stats,
};

static int e100_do_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
{
	struct nic *nic = netdev->priv;
	struct mii_ioctl_data *mii = (struct mii_ioctl_data *)&ifr->ifr_data;

	return generic_mii_ioctl(&nic->mii, mii, cmd, NULL);
}

static int e100_alloc(struct nic *nic)
{
	nic->mem = pci_alloc_consistent(nic->pdev, sizeof(struct mem),
		&nic->dma_addr);
	return nic->mem ?
		0 : -ENOMEM;
}

static void e100_free(struct nic *nic)
{
	if(nic->mem) {
		pci_free_consistent(nic->pdev, sizeof(struct mem),
			nic->mem, nic->dma_addr);
		nic->mem = NULL;
	}
}

static int e100_open(struct net_device *netdev)
{
	struct nic *nic = netdev->priv;
	int err = 0;

	netif_carrier_off(netdev);
	if((err = e100_up(nic)))
		DPRINTK(IFUP, ERR, "Cannot open interface, aborting.\n");
	return err;
}

static int e100_close(struct net_device *netdev)
{
	e100_down(netdev->priv);
	return 0;
}

static int __devinit e100_probe(struct pci_dev *pdev,
	const struct pci_device_id *ent)
{
	struct net_device *netdev;
	struct nic *nic;
	int err;

	if(!(netdev = alloc_etherdev(sizeof(struct nic)))) {
		if(((1 << debug) - 1) & NETIF_MSG_PROBE)
			printk(KERN_ERR PFX "Etherdev alloc failed, abort.\n");
		return -ENOMEM;
	}

	netdev->open = e100_open;
	netdev->stop = e100_close;
	netdev->hard_start_xmit = e100_xmit_frame;
	netdev->get_stats = e100_get_stats;
	netdev->set_multicast_list = e100_set_multicast_list;
	netdev->set_mac_address = e100_set_mac_address;
	netdev->change_mtu = e100_change_mtu;
	netdev->do_ioctl = e100_do_ioctl;
	SET_ETHTOOL_OPS(netdev, &e100_ethtool_ops);
	netdev->tx_timeout = e100_tx_timeout;
	netdev->watchdog_timeo = E100_WATCHDOG_PERIOD;
#ifdef CONFIG_E100_NAPI
	netdev->poll = e100_poll;
	netdev->weight = E100_NAPI_WEIGHT;
#endif

	nic = netdev->priv;
	nic->netdev = netdev;
	nic->pdev = pdev;
	nic->msg_enable = (1 << debug) - 1;
	pci_set_drvdata(pdev, netdev);

	if((err = pci_enable_device(pdev))) {
		DPRINTK(PROBE, ERR, "Cannot enable PCI device, aborting.\n");
		goto err_out_free_dev;
	}

	if(!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM)) {
		DPRINTK(PROBE, ERR, "Cannot find proper PCI device "
			"base address, aborting.\n");
		err = -ENODEV;
		goto err_out_disable_pdev;
	}

	if((err = pci_request_regions(pdev, DRV_NAME))) {
		DPRINTK(PROBE, ERR, "Cannot obtain PCI resources, aborting.\n");
		goto err_out_disable_pdev;
	}

	pci_set_master(pdev);

	if((err = pci_set_dma_mask(pdev, 0xFFFFFFFFULL))) {
		DPRINTK(PROBE, ERR, "No usable DMA
configuration, aborting.\n");
		goto err_out_free_res;
	}

	SET_MODULE_OWNER(netdev);
	SET_NETDEV_DEV(netdev, &pdev->dev);

	nic->csr = ioremap(pci_resource_start(pdev, 0), sizeof(struct csr));
	if(!nic->csr) {
		DPRINTK(PROBE, ERR, "Cannot map device registers, aborting.\n");
		err = -ENOMEM;
		goto err_out_free_res;
	}

	if(ent->driver_data)
		nic->flags |= ich;
	else
		nic->flags &= ~ich;

	spin_lock_init(&nic->cb_lock);
	spin_lock_init(&nic->cmd_lock);

	init_timer(&nic->watchdog);
	nic->watchdog.function = e100_watchdog;
	nic->watchdog.data = (unsigned long)nic;
	init_timer(&nic->blink_timer);
	nic->blink_timer.function = e100_blink_led;
	nic->blink_timer.data = (unsigned long)nic;

	if((err = e100_alloc(nic))) {
		DPRINTK(PROBE, ERR, "Cannot alloc driver memory, aborting.\n");
		goto err_out_iounmap;
	}

	e100_get_defaults(nic);
	e100_hw_reset(nic);
	e100_phy_init(nic);

	if((err = e100_eeprom_load(nic)))
		goto err_out_free;
	((u16 *)nic->netdev->dev_addr)[0] = le16_to_cpu(nic->eeprom[0]);
	((u16 *)nic->netdev->dev_addr)[1] = le16_to_cpu(nic->eeprom[1]);
	((u16 *)nic->netdev->dev_addr)[2] = le16_to_cpu(nic->eeprom[2]);
	if(!is_valid_ether_addr(nic->netdev->dev_addr)) {
		DPRINTK(PROBE, ERR, "Invalid MAC address from "
			"EEPROM, aborting.\n");
		err = -EAGAIN;
		goto err_out_free;
	}

	/* Wol magic packet can be enabled from eeprom */
	if((nic->mac >= mac_82558_D101_A4) &&
	   (nic->eeprom[eeprom_id] & eeprom_id_wol))
		nic->flags |= wol_magic;

	pci_enable_wake(pdev, 0, nic->flags & (wol_magic | e100_asf(nic)));

	if((err = register_netdev(netdev))) {
		DPRINTK(PROBE, ERR, "Cannot register net device, aborting.\n");
		goto err_out_free;
	}

	return 0;

err_out_free:
	e100_free(nic);
err_out_iounmap:
	iounmap(nic->csr);
err_out_free_res:
	pci_release_regions(pdev);
err_out_disable_pdev:
	pci_disable_device(pdev);
err_out_free_dev:
	pci_set_drvdata(pdev, NULL);
	free_netdev(netdev);
	return err;
}

static void __devexit e100_remove(struct pci_dev *pdev)
{
	struct net_device *netdev = pci_get_drvdata(pdev);

	if(netdev) {
		struct nic *nic =
			netdev->priv;
		unregister_netdev(netdev);
		e100_free(nic);
		iounmap(nic->csr);
		free_netdev(netdev);
		pci_release_regions(pdev);
		pci_disable_device(pdev);
		pci_set_drvdata(pdev, NULL);
	}
}

#ifdef CONFIG_PM
static int e100_suspend(struct pci_dev *pdev, u32 state)
{
	struct net_device *netdev = pci_get_drvdata(pdev);
	struct nic *nic = netdev->priv;

	if(netif_running(netdev))
		e100_down(nic);
	e100_hw_reset(nic);
	netif_device_detach(netdev);

	pci_save_state(pdev, nic->pm_state);
	pci_enable_wake(pdev, state, nic->flags & (wol_magic | e100_asf(nic)));
	pci_disable_device(pdev);
	pci_set_power_state(pdev, state);

	return 0;
}

static int e100_resume(struct pci_dev *pdev)
{
	struct net_device *netdev = pci_get_drvdata(pdev);
	struct nic *nic = netdev->priv;

	pci_set_power_state(pdev, 0);
	pci_restore_state(pdev, nic->pm_state);
	e100_hw_init(nic);

	netif_device_attach(netdev);
	if(netif_running(netdev))
		e100_up(nic);

	return 0;
}
#endif

static struct pci_driver e100_driver = {
	.name = DRV_NAME,
	.id_table = e100_id_table,
	.probe = e100_probe,
	.remove = __devexit_p(e100_remove),
#ifdef CONFIG_PM
	.suspend = e100_suspend,
	.resume = e100_resume,
#endif
};

static int __init e100_init_module(void)
{
	if(((1 << debug) - 1) & NETIF_MSG_DRV) {
		printk(KERN_INFO PFX "%s, %s\n", DRV_DESCRIPTION, DRV_VERSION);
		printk(KERN_INFO PFX "%s\n", DRV_COPYRIGHT);
	}
	return pci_module_init(&e100_driver);
}

static void __exit e100_cleanup_module(void)
{
	pci_unregister_driver(&e100_driver);
}

module_init(e100_init_module);
module_exit(e100_cleanup_module);

From jgarzik@pobox.com Thu Dec 4 22:40:41 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 22:40:54 -0800 (PST)
Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252])
	by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB56eeTa000669
	for ; Thu, 4 Dec 2003 22:40:41 -0800
Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:39574 helo=pobox.com)
	by www.linux.org.uk with esmtp (Exim 4.22)
	id 1AS9dl-0001In-BQ; Fri, 05
	Dec 2003 06:40:37 +0000
Message-ID: <3FD02853.2070804@pobox.com>
Date: Fri, 05 Dec 2003 01:40:19 -0500
From: Jeff Garzik
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: prasanna@in.ibm.com
CC: mpm@selenic.com, suparna@in.ibm.com, netdev@oss.sgi.com
Subject: Re: [2/4] pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1
References: <20031203135728.GA13585@in.ibm.com>
In-Reply-To: <20031203135728.GA13585@in.ibm.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 1884
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

I'm applying smc-ultra and tlan patches, but holding off on e100 and e1000.

For both, as Scott mentioned, they don't look tested under NAPI.

For e100 specifically, there is a spiffy new e100 that would need to be
re-diffed and tested against.
	Jeff

From jgarzik@pobox.com Thu Dec 4 23:47:40 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Dec 2003 23:47:55 -0800 (PST)
Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252])
	by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB57ldTa001984
	for ; Thu, 4 Dec 2003 23:47:40 -0800
Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:39603 helo=pobox.com)
	by www.linux.org.uk with esmtp (Exim 4.22)
	id 1ASAga-0001pT-TO; Fri, 05 Dec 2003 07:47:37 +0000
Message-ID: <3FD03807.6040207@pobox.com>
Date: Fri, 05 Dec 2003 02:47:19 -0500
From: Jeff Garzik
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Dave Dillow
CC: Netdev
Subject: Re: [BK RESEND Typhoon (3CR990) updates for 2.5
References: <1070487775.5062.8.camel@dillow.idleaire.com>
In-Reply-To: <1070487775.5062.8.camel@dillow.idleaire.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 1885
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

thanks, applied to 2.4 and 2.5 bugfix stream

From rask@sygehus.dk Fri Dec 5 07:43:54 2003
Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 07:44:07 -0800 (PST)
Received: from 0x50a172bb.albnxx15.adsl-dhcp.tele.dk (0x50a172bb.albnxx15.adsl-dhcp.tele.dk [80.161.114.187])
	by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5FhqTa031005
	for ; Fri, 5 Dec 2003 07:43:53 -0800
Received: by 0x50a1446d.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500)
	id 037A120BB6; Fri, 5 Dec 2003 16:43:48 +0100 (CET)
Date: Fri, 5 Dec 2003 16:43:47 +0100
From: Rask Ingemann Lambertsen
To: Jeff Garzik
Cc: netdev@oss.sgi.com
Subject: Re: [RFR] new e100 driver
Message-ID: <20031205164347.B1291@sygehus.dk>
References: <20031205063842.GA23843@gtf.org>
Mime-Version: 1.0
Content-Type: text/plain;
	charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <20031205063842.GA23843@gtf.org>; from jgarzik@pobox.com on Fri, Dec 05, 2003 at 01:38:43AM -0500
X-archive-position: 1886
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: rask@sygehus.dk
Precedence: bulk
X-list: netdev

On Fri, Dec 05, 2003 at 01:38:43AM -0500, Jeff Garzik wrote:
>
> e100 3.0.9_dev just got checked into net-drivers-2.5-exp queue.  As I do
> occasionally (especially for smaller drivers), I post them for review
> and comment.

How would I build it? A patch for the Makefile is missing.

> Patches welcome in addition to comments.
>
> One thing I am tempted to request is use of the new module_param()...

Does anybody have a macro that turns module_param() into MODULE_PARM() for
2.4 compatibility?

>  * References:
>  *   Intel 8255x 10/100 Mbps Ethernet Controller Family,
>  *   Open Source Software Developers Manual,
>  *   http://sourceforge.net/projects/e1000

[cut]

>  *   Hardware padding of short packets to minimum packet size is
>  *   enabled.  82557 pads with 7Eh, while the later controllers pad
>  *   with 00h.

It would be nice if the documentation said so. A few days ago I spent half an
hour or so trying to figure out why frames sent by an 82558 were padded with
0x00 rather than the documented 0x7e (with no sign of skb_padto(), memset()
or anything similar in the driver).

> enum mac {
> 	mac_82557_D100_A  = 0,
> 	mac_82557_D100_B  = 1,
> 	mac_82557_D100_C  = 2,
> 	mac_82558_D101_A4 = 4,
> 	mac_82558_D101_B0 = 5,
> 	mac_82559_D101M   = 8,
> 	mac_82559_D101S   = 9,
> 	mac_82550_D102    = 12,
> 	mac_82550_D102_C  = 13,
> 	mac_82550_D102_E  = 15,
> 	mac_unknown       = 0xFF,
> };

Nice for matching the D10* names with chip numbers.
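
As a rough illustration of the padding note and the enum quoted above, here
is a minimal user-space sketch. It is not driver code: the hardware does the
padding itself, and this only models the quoted comment. It assumes that all
three 82557 steppings in the enum pad with 7Eh and that every other value,
including mac_unknown, pads with 00h. The names e100_pad_byte(),
e100_emulate_padding() and MIN_FRAME_LEN are hypothetical and exist only for
this example.

```c
#include <stddef.h>
#include <string.h>

/* MAC revisions, copied from the enum quoted above. */
enum mac {
	mac_82557_D100_A  = 0,
	mac_82557_D100_B  = 1,
	mac_82557_D100_C  = 2,
	mac_82558_D101_A4 = 4,
	mac_82558_D101_B0 = 5,
	mac_82559_D101M   = 8,
	mac_82559_D101S   = 9,
	mac_82550_D102    = 12,
	mac_82550_D102_C  = 13,
	mac_82550_D102_E  = 15,
	mac_unknown       = 0xFF,
};

/* Assumption: minimum Ethernet frame length without the FCS. */
#define MIN_FRAME_LEN 60

/* Hypothetical helper: pad byte per the quoted driver comment.
 * Assumes every 82557 stepping pads with 0x7E, all others with 0x00. */
static unsigned char e100_pad_byte(enum mac mac)
{
	return (mac <= mac_82557_D100_C) ? 0x7E : 0x00;
}

/* Hypothetical helper: model the hardware padding on a plain buffer.
 * Returns the padded length, or 0 if the buffer cannot hold a
 * minimum-size frame. Frames already long enough are left untouched. */
static size_t e100_emulate_padding(unsigned char *frame, size_t len,
				   size_t cap, enum mac mac)
{
	if (len >= MIN_FRAME_LEN)
		return len;
	if (cap < MIN_FRAME_LEN)
		return 0;
	memset(frame + len, e100_pad_byte(mac), MIN_FRAME_LEN - len);
	return MIN_FRAME_LEN;
}
```

With this model, a 14-byte frame on an 82557 would come out as 60 bytes with
0x7E in the tail, while the same frame on an 82558 or later would carry 0x00
there, which matches the behaviour described above.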
--
Regards,
Rask Ingemann Lambertsen

From jleu@mindspring.com Fri Dec 5 09:37:36 2003
Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 09:37:49 -0800 (PST)
Received: from localhost.localdomain (jleu-laptop.dyn.internetnoc.com [209.249.160.113])
	by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5HbaTa005093
	for ; Fri, 5 Dec 2003 09:37:36 -0800
Received: from localhost.localdomain (jleu-laptop [127.0.0.1])
	by localhost.localdomain (8.12.8/8.12.8) with ESMTP id hB5HZWkB003362;
	Fri, 5 Dec 2003 11:35:32 -0600
Received: (from jleu@localhost)
	by localhost.localdomain (8.12.8/8.12.8/Submit) id hB5HZWgZ003360;
	Fri, 5 Dec 2003 11:35:32 -0600
X-Authentication-Warning: localhost.localdomain: jleu set sender to jleu@mindspring.com using -f
Date: Fri, 5 Dec 2003 11:35:32 -0600
From: "James R. Leu"
To: "David S. Miller"
Cc: Ramon Casellas , netdev@oss.sgi.com
Subject: Re: Request: Allocate a Netlink Family Number for MPLS
Message-ID: <20031205173507.GB2241@mindspring.com>
Reply-To: jleu@mindspring.com
References: <20031201124532.3f6b6a65.davem@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20031201124532.3f6b6a65.davem@redhat.com>
User-Agent: Mutt/1.4.1i
X-archive-position: 1887
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jleu@mindspring.com
Precedence: bulk
X-list: netdev

On Mon, Dec 01, 2003 at 12:45:32PM -0800, David S. Miller wrote:
> On Mon, 1 Dec 2003 15:36:26 +0100 (MET)
> Ramon Casellas wrote:
>
> > We're working in porting mpls for linux to 2.6 and one of the planned
> > features is to move from an IOCTL based approach to a Netlink based one
> > for updating/querying the MPLS FTN/ILM/Label Mapping/... tables from
> > userspace. Since netlink families are public, I would
> > like to know if it is possible to reserve a family for MPLS, even though
> > the MPLS patch is not part of the official kernel.
>
> You don't need a whole new netlink family.  Just create a dummy
> address family for MPLS (ie. AF_MPLS) and then just use
> NETLINK_ROUTE.

Agreed.  One could also use AF_MPLS for applications to register for
data associated with the router alert label.  I suppose one could also
go completely nuts and allow an application to open a stream connection
that talks and listens on a specific label, but I think that is a bit
overboard.

--
James R. Leu
jleu@mindspring.com

From jgarzik@pobox.com Fri Dec 5 10:34:32 2003
Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 10:34:47 -0800 (PST)
Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252])
	by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5IYVTa006797
	for ; Fri, 5 Dec 2003 10:34:32 -0800
Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:39769 helo=pobox.com)
	by www.linux.org.uk with esmtp (Exim 4.22)
	id 1ASKGm-0002Hr-Pw; Fri, 05 Dec 2003 18:01:36 +0000
Message-ID: <3FD0C7EF.40305@pobox.com>
Date: Fri, 05 Dec 2003 13:01:19 -0500
From: Jeff Garzik
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Rask Ingemann Lambertsen
CC: netdev@oss.sgi.com
Subject: Re: [RFR] new e100 driver
References: <20031205063842.GA23843@gtf.org> <20031205164347.B1291@sygehus.dk>
In-Reply-To: <20031205164347.B1291@sygehus.dk>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 1888
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

Rask Ingemann Lambertsen wrote:
> On Fri, Dec 05, 2003 at 01:38:43AM -0500, Jeff Garzik wrote:
>
>>e100 3.0.9_dev just got checked into net-drivers-2.5-exp queue.  As I do
>>occasionally (especially for smaller drivers), I post them for review
>>and comment.
>
>
> How would I build it?
> A patch for the Makefile is missing.

Download the net-drivers-2.5-exp patch I am about to post :)

>>Patches welcome in addition to comments.
>>
>>One thing I am tempted to request is use of the new module_param()...
>
>
> Does anybody have a macro that turns module_param() into MODULE_PARM() for
> 2.4 compatibility?

Due to semantics of MODULE_PARM(), that's impossible AFAIK.

>> * References:
>> *   Intel 8255x 10/100 Mbps Ethernet Controller Family,
>> *   Open Source Software Developers Manual,
>> *   http://sourceforge.net/projects/e1000
>
> [cut]
>
>> *   Hardware padding of short packets to minimum packet size is
>> *   enabled.  82557 pads with 7Eh, while the later controllers pad
>> *   with 00h.
>
>
> It would be nice if the documentation said so. A few days ago I spent half an
> hour or so trying to figure out why frames sent by an 82558 were padded with
> 0x00 rather than the documented 0x7e (with no sign of skb_padto(), memset()
> or anything similar in the driver).

Do you mean the PDF documentation?  I dunno how much control Scott
Feldman and his team have over that.
	Jeff

From garzik@gtf.org Fri Dec 5 11:28:35 2003
Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 11:28:50 -0800 (PST)
Received: from havoc.gtf.org (havoc.gtf.org [63.247.75.124])
	by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5JSYTa009128
	for ; Fri, 5 Dec 2003 11:28:35 -0800
Received: by havoc.gtf.org (Postfix, from userid 500)
	id E92476641; Fri, 5 Dec 2003 14:28:28 -0500 (EST)
Date: Fri, 5 Dec 2003 14:28:28 -0500
From: Jeff Garzik
To: netdev@oss.sgi.com
Cc: torvalds@osdl.org, linux-kernel@vger.kernel.org
Subject: [BK PATCHES] 2.6.x experimental net driver queue
Message-ID: <20031205192828.GA15907@gtf.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.3.28i
X-archive-position: 1889
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

This is the latest installment of "stuff I think should go into 2.6.1" ;-)
or IOW, my patch queue.

The changelog was getting quite long, so I'm now omitting all but the
new changes.  The full changelog is in BK, and also will be posted
on kernel.org in the same directory as its respective patch.

The new and interesting thing in today's patch is the e100 rewrite.

A bit of history.  I have been really impressed with what Intel has done
with the e100 driver.  If anyone saw it in the days when e100 only
reared its head in vendor kernels, it was pretty grotty, and had license
problems to boot.  The current guys working on e100 listened to feedback
from the community, and over time improved e100's code and license to
the point where it was acceptable for merging into the mainline kernel.

Well, even after all that work, it was a pretty darn big driver.  It
supported all the whiz bang hw-offload features.  But as hardware does,
you find out about various limits and errata that have crept into the
field over time.
Particularly, some of the hwaccel features don't work on some chips.

So we come full circle, with e100 version 3. I've received a lot of "this works where e100 and eepro100 don't" reports, so I'm anxious to see how it does in wide testing.

To quote from the driver,

 * VLAN offload support of tagging, stripping and filtering is not
 * supported, but driver will accommodate the extra 4-byte VLAN tag
 * for processing by upper layers. Tx/Rx Checksum offloading is not
 * supported. Tx Scatter/Gather is not supported. Jumbo Frames is
 * not supported.
 *
 * NAPI support is enabled with CONFIG_E100_NAPI.
 *
 * MagicPacket(tm) WoL support is enabled/disabled via ethtool.

In other words, the scales tip back in the direction of letting the net stack do a bit more of the work. And guess what? The newer, more simple driver not only seems a bit more stable, but sometimes faster, because it now scales upward with the performance of your system CPU.

If I may risk a tangent, this is another reason why I feel interest in TOE may be more trouble than it's worth. Forget security issues. Forget userland interface issues. The plain and simple fact is that CPUs keep getting faster, while that TOE card you bought last month does not.

Getting back on topic, I like the new e100. Yes, it's a rewrite from scratch. But darn, it's clean, and early reports seem to indicate it's more stable and faster, too. So why not?
Jeff BK users: bk pull bk://gkernel.bkbits.net/net-drivers-2.5-exp Patch: ftp://ftp.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/2.6.0-test11-netdrvr-exp2.patch.bz2 This will update the following files: Documentation/networking/8139too.txt | 438 --- drivers/net/68360enet.c | 951 ------ drivers/net/e100/LICENSE | 339 -- drivers/net/e100/Makefile | 8 drivers/net/e100/e100.h | 999 ------- drivers/net/e100/e100_config.c | 639 ---- drivers/net/e100/e100_config.h | 168 - drivers/net/e100/e100_eeprom.c | 565 ---- drivers/net/e100/e100_main.c | 4358 -------------------------------- drivers/net/e100/e100_phy.c | 1163 -------- drivers/net/e100/e100_phy.h | 158 - drivers/net/e100/e100_test.c | 500 --- drivers/net/e100/e100_ucode.h | 365 -- Documentation/SubmittingPatches | 4 Documentation/networking/netconsole.txt | 57 drivers/char/synclink.c | 43 drivers/net/3c501.c | 116 drivers/net/3c501.h | 1 drivers/net/3c503.c | 117 drivers/net/3c505.c | 128 drivers/net/3c507.c | 120 drivers/net/3c515.c | 328 +- drivers/net/3c523.c | 108 drivers/net/3c527.c | 682 ++--- drivers/net/3c527.h | 6 drivers/net/3c59x.c | 17 drivers/net/8139too.c | 405 +- drivers/net/82596.c | 83 drivers/net/Kconfig | 34 drivers/net/Makefile | 4 drivers/net/Space.c | 587 ++-- drivers/net/a2065.c | 21 drivers/net/ac3200.c | 91 drivers/net/amd8111e.c | 14 drivers/net/apne.c | 81 drivers/net/appletalk/ipddp.c | 62 drivers/net/appletalk/ltpc.c | 2 drivers/net/arcnet/arc-rimi.c | 131 drivers/net/arcnet/arcnet.c | 6 drivers/net/arcnet/com20020-isa.c | 84 drivers/net/arcnet/com20020-pci.c | 54 drivers/net/arcnet/com20020.c | 16 drivers/net/arcnet/com90io.c | 126 drivers/net/arcnet/com90xx.c | 238 - drivers/net/ariadne.c | 10 drivers/net/arm/am79c961a.c | 2 drivers/net/arm/ether00.c | 4 drivers/net/arm/ether1.c | 2 drivers/net/arm/ether3.c | 2 drivers/net/arm/etherh.c | 2 drivers/net/at1700.c | 166 - drivers/net/atari_bionet.c | 62 drivers/net/atari_pamsnet.c | 67 drivers/net/atarilance.c | 58 
drivers/net/atp.c | 40 drivers/net/au1000_eth.c | 61 drivers/net/bagetlance.c | 77 drivers/net/cs89x0.c | 132 drivers/net/de600.c | 59 drivers/net/de600.h | 1 drivers/net/de620.c | 63 drivers/net/declance.c | 17 drivers/net/defxx.c | 2 drivers/net/depca.c | 20 drivers/net/dummy.c | 2 drivers/net/e100.c | 2299 ++++++++++++++++ drivers/net/e100/e100_config.c | 8 drivers/net/e100/e100_main.c | 41 drivers/net/e100/e100_phy.c | 14 drivers/net/e1000/e1000.h | 10 drivers/net/e1000/e1000_ethtool.c | 94 drivers/net/e1000/e1000_hw.c | 65 drivers/net/e1000/e1000_hw.h | 9 drivers/net/e1000/e1000_main.c | 143 - drivers/net/e1000/e1000_param.c | 45 drivers/net/e2100.c | 88 drivers/net/eepro.c | 256 - drivers/net/eepro100.c | 21 drivers/net/eexpress.c | 91 drivers/net/eql.c | 2 drivers/net/es3210.c | 87 drivers/net/eth16i.c | 119 drivers/net/ethertap.c | 3 drivers/net/ewrk3.c | 684 ++--- drivers/net/fc/iph5526.c | 3 drivers/net/fc/iph5526_scsi.h | 2 drivers/net/fmv18x.c | 111 drivers/net/gt96100eth.c | 49 drivers/net/hamradio/baycom_epp.c | 2 drivers/net/hamradio/bpqether.c | 2 drivers/net/hamradio/hdlcdrv.c | 2 drivers/net/hamradio/yam.c | 2 drivers/net/hp-plus.c | 84 drivers/net/hp.c | 84 drivers/net/hp100.c | 743 ++--- drivers/net/hplance.c | 85 drivers/net/hydra.c | 19 drivers/net/jazzsonic.c | 88 drivers/net/lance.c | 129 drivers/net/lasi_82596.c | 17 drivers/net/lne390.c | 88 drivers/net/mac8390.c | 103 drivers/net/mac89x0.c | 77 drivers/net/mace.c | 50 drivers/net/macmace.c | 30 drivers/net/macsonic.c | 103 drivers/net/meth.c | 83 drivers/net/mvme147.c | 64 drivers/net/natsemi.c | 39 drivers/net/ne.c | 83 drivers/net/ne2.c | 82 drivers/net/ne2k-pci.c | 3 drivers/net/ne2k_cbus.c | 107 drivers/net/ne2k_cbus.h | 2 drivers/net/ne3210.c | 18 drivers/net/netconsole.c | 120 drivers/net/ni5010.c | 184 - drivers/net/ni52.c | 118 drivers/net/ni65.c | 101 drivers/net/ns83820.c | 2 drivers/net/oaknet.c | 61 drivers/net/pci-skeleton.c | 2 drivers/net/pcmcia/3c574_cs.c | 25 
drivers/net/pcmcia/3c589_cs.c | 24 drivers/net/pcmcia/axnet_cs.c | 17 drivers/net/pcmcia/com20020_cs.c | 47 drivers/net/pcmcia/fmvj18x_cs.c | 28 drivers/net/pcmcia/ibmtr_cs.c | 15 drivers/net/pcmcia/nmclan_cs.c | 24 drivers/net/pcmcia/pcnet_cs.c | 17 drivers/net/pcmcia/smc91c92_cs.c | 24 drivers/net/pcmcia/xirc2ps_cs.c | 7 drivers/net/pcnet32.c | 11 drivers/net/plip.c | 18 drivers/net/ppp_generic.c | 67 drivers/net/r8169.c | 448 ++- drivers/net/saa9730.c | 63 drivers/net/sb1250-mac.c | 49 drivers/net/seeq8005.c | 97 drivers/net/sgiseeq.c | 29 drivers/net/shaper.c | 11 drivers/net/sk_g16.c | 182 - drivers/net/sk_mca.c | 119 drivers/net/sk_mca.h | 3 drivers/net/skfp/skfddi.c | 32 drivers/net/smc-ultra.c | 107 drivers/net/smc-ultra32.c | 85 drivers/net/smc9194.c | 110 drivers/net/stnic.c | 42 drivers/net/sun3_82586.c | 81 drivers/net/sun3lance.c | 85 drivers/net/tc35815.c | 194 - drivers/net/tg3.c | 16 drivers/net/tlan.c | 11 drivers/net/tokenring/3c359.c | 4 drivers/net/tokenring/abyss.c | 2 drivers/net/tokenring/madgemc.c | 6 drivers/net/tokenring/proteon.c | 184 - drivers/net/tokenring/skisa.c | 182 - drivers/net/tokenring/smctr.c | 194 - drivers/net/tokenring/tmspci.c | 2 drivers/net/tulip/Kconfig | 20 drivers/net/tulip/de2104x.c | 2 drivers/net/tulip/dmfe.c | 2 drivers/net/tulip/interrupt.c | 417 ++- drivers/net/tulip/tulip.h | 18 drivers/net/tulip/tulip_core.c | 78 drivers/net/tulip/winbond-840.c | 2 drivers/net/tulip/xircom_tulip_cb.c | 3 drivers/net/tun.c | 18 drivers/net/wan/cosa.c | 37 drivers/net/wan/lmc/lmc_main.c | 375 -- drivers/net/wan/lmc/lmc_var.h | 15 drivers/net/wan/x25_asy.c | 2 drivers/net/wd.c | 89 drivers/net/wireless/arlan-main.c | 283 -- drivers/net/wireless/arlan.h | 6 drivers/net/wireless/atmel.c | 2 drivers/net/wireless/ray_cs.c | 17 drivers/net/wireless/strip.c | 2 drivers/net/wireless/wavelan.c | 251 - drivers/net/wireless/wavelan.p.h | 58 drivers/net/wireless/wavelan_cs.c | 113 drivers/net/wireless/wavelan_cs.p.h | 49 drivers/net/znet.c 
| 36 drivers/net/zorro8390.c | 19 drivers/s390/net/qeth.c | 18 drivers/usb/gadget/ether.c | 2 include/linux/arcdevice.h | 1 include/linux/com20020.h | 1 include/linux/netdevice.h | 18 include/linux/netpoll.h | 38 include/linux/pci_ids.h | 2 net/Kconfig | 20 net/core/Makefile | 1 net/core/dev.c | 39 net/core/netpoll.c | 646 ++++ net/wanrouter/wanmain.c | 2 198 files changed, 10465 insertions(+), 17186 deletions(-) through these ChangeSets: (03/12/05 1.1505) [netdrvr tlan] netpoll support Hi Jeff, Below is the pollcontroller patch for tlan network device driver. This patch can be applied over 2.6.0-test9-bk25-netdrvr-exp1.patch (03/12/05 1.1504) [netdrvr smc-ultra] netpoll support Hi Jeff, Below is the pollcontroller patch for smc ultra net driver. This patch can be applied over 2.6.0-test9-bk25-netdrvr-exp1.patch (03/12/05 1.1503) [netdrvr e100] complete rewrite of e100 driver [... snip changes from previous net-drivers-2.5-exp patchkits ...] From garzik@gtf.org Fri Dec 5 11:47:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 11:48:05 -0800 (PST) Received: from havoc.gtf.org (havoc.gtf.org [63.247.75.124]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5JloTa015570 for ; Fri, 5 Dec 2003 11:47:51 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id 3EACC6641; Fri, 5 Dec 2003 14:47:14 -0500 (EST) Date: Fri, 5 Dec 2003 14:47:13 -0500 From: Jeff Garzik To: netdev@oss.sgi.com Cc: torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: new e100 (was Re: [BK PATCHES] 2.6.x experimental net driver queue) Message-ID: <20031205194713.GA19321@gtf.org> References: <20031205192828.GA15907@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031205192828.GA15907@gtf.org> User-Agent: Mutt/1.3.28i X-archive-position: 1890 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: 
netdev Oh, since it wasn't very clear, this is largely the work of Scott Feldman at Intel. In addition to having been in public testing for a while, Scott says Intel Q/A beat up e100 v3 for a while, too. Kudos Scott, Jeff From jkenisto@us.ibm.com Fri Dec 5 11:56:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 11:56:30 -0800 (PST) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5JuGTa016041 for ; Fri, 5 Dec 2003 11:56:16 -0800 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e4.ny.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB5Ju1h3176714; Fri, 5 Dec 2003 14:56:01 -0500 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB5JtxIt170414; Fri, 5 Dec 2003 14:55:59 -0500 Message-ID: <3FD0E1FE.1D5B1883@us.ibm.com> Date: Fri, 05 Dec 2003 11:52:30 -0800 From: Jim Keniston X-Mailer: Mozilla 4.75 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 To: LKML , netdev , Jeff Garzik , Andrew Morton CC: "Feldman, Scott" , Larry Kessler Subject: [PATCH 2.6.0-test11] Net device error logging Content-Type: multipart/mixed; boundary="------------A4FE9473AFB3835D185F2625" X-archive-position: 1891 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkenisto@us.ibm.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------A4FE9473AFB3835D185F2625 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit The enclosed patch implements the netdev_* error-logging macros for network drivers. These macros have been discussed at length on the linux-kernel and linux-netdev lists. With the v2.6.0-test6 version of these macros, we addressed all the issues that reviewers had raised. This is just an update for v2.6.0-test11.
As previously discussed, these macros are in demand now (e.g., for the e1000 driver) and have essentially no impact on drivers that don't use them.

RECAP (from previous posts):
Calls to the netdev_* macros (netdev_printk and wrappers such as netdev_err) are intended to replace calls to printk in network device drivers. These macros have the following characteristics:
- Same format + args as the corresponding printk call.
- Approximately the same amount of text as the corresponding printk call.
- The first arg is a pointer to the net_device struct.
- The second arg, which is a NETIF_MSG_* message level, can be used to implement verbosity control.
- Standard message prefixes: verbose (interface name, driver name, bus ID) during probe, or just the interface name once the device is registered.
- The current implementation just calls printk. However, the netdev_* interface (and availability of the net_device pointer) opens the door for logging additional information (via printk, via evlog/netlink, etc.) as desired, with no change to driver code.

Examples:
	netdev_err(netdev, RX_ERR, "No mem: dropped packet\n");
logs a message such as the following if the NETIF_MSG_RX_ERR bit is set in netdev->msg_enable.
	eth2: No mem: dropped packet

	netdev_fatal(netdev, PROBE, "The EEPROM Checksum Is Not Valid\n");
or
	netdev_err(netdev, ALL, "The EEPROM Checksum Is Not Valid\n");
unconditionally logs a message such as:
	eth%d (e1000 0000:00:03.0): The EEPROM Checksum Is Not Valid
The message's prefix includes the driver name and bus ID because the message is logged at probe time, before netdev is registered.

SAMPLE DRIVERS
As examples of how the netdev_* macros could be used, patches for the v2.6.0-test11 e100, e1000, and tg3 drivers are available on request.
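The gating rule from the recap (log when the level is ALL, or when its bit is set in msg_enable, and prefix the interface name) can be tried out in user space. What follows is only a rough sketch under stated assumptions: struct demo_netdev, the DEMO_MSG_* values and demo_netdev_format() are hypothetical stand-ins, not the kernel's net_device, NETIF_MSG_* flags or the patch's __netdev_printk(), and it formats into a caller-supplied buffer instead of calling printk.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the two net_device fields the macros consult. */
struct demo_netdev {
	const char *name;	/* interface name, e.g. "eth2" */
	int msg_enable;		/* bitmask of enabled message levels */
};

/* Illustrative level bits; DEMO_MSG_ALL mirrors the patch's ALL = -1 idea. */
enum {
	DEMO_MSG_PROBE  = 0x0002,
	DEMO_MSG_RX_ERR = 0x0040,
	DEMO_MSG_ALL    = -1
};

/*
 * Write "<name>: <message>" into buf when msglevel is DEMO_MSG_ALL or its
 * bit is set in dev->msg_enable. Returns 1 if a message was produced and
 * 0 if it was suppressed (or the buffer could not hold the prefix).
 */
int demo_netdev_format(char *buf, size_t len, const struct demo_netdev *dev,
		       int msglevel, const char *format, ...)
{
	va_list args;
	int n;

	if (msglevel != DEMO_MSG_ALL && !(dev->msg_enable & msglevel))
		return 0;

	n = snprintf(buf, len, "%s: ", dev->name);
	if (n < 0 || (size_t)n >= len)
		return 0;

	va_start(args, format);
	vsnprintf(buf + n, len - (size_t)n, format, args);
	va_end(args);
	return 1;
}
```

With msg_enable set to DEMO_MSG_RX_ERR only, an RX_ERR message is formatted, a PROBE message is dropped, and an ALL message always goes through, which is the behaviour the examples above describe for netdev_err and netdev_fatal.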
LINUX v2.4 SUPPORT Since there is no v2.6-style struct device underlying the net_device, a v2.4.23-compatible version of netdev_printk would always log the interface name as the message prefix: #define netdev_printk(sevlevel, netdev, msglevel, format, arg...) \ do { \ if (NETIF_MSG_##msglevel == NETIF_MSG_ALL \ || (netdev->msg_enable & NETIF_MSG_##msglevel)) { \ printk(sevlevel "%s: " format , netdev->name , ## arg); \ } \ } while (0) Jim Keniston IBM Linux Technology Center --------------A4FE9473AFB3835D185F2625 Content-Type: text/plain; charset=us-ascii; name="netdev-2.6.0-test11.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="netdev-2.6.0-test11.patch" diff -Naur linux.org/include/linux/netdevice.h linux.netdev.patched/include/linux/netdevice.h --- linux.org/include/linux/netdevice.h Fri Dec 5 10:36:55 2003 +++ linux.netdev.patched/include/linux/netdevice.h Fri Dec 5 10:36:56 2003 @@ -467,6 +467,9 @@ struct divert_blk *divert; #endif /* CONFIG_NET_DIVERT */ + /* NETIF_MSG_* flags to control the types of events we log */ + int msg_enable; + /* class/net/name entry */ struct class_device class_dev; struct net_device_stats* (*last_stats)(struct net_device *); @@ -749,6 +752,7 @@ NETIF_MSG_PKTDATA = 0x1000, NETIF_MSG_HW = 0x2000, NETIF_MSG_WOL = 0x4000, + NETIF_MSG_ALL = -1, /* always log message */ }; #define netif_msg_drv(p) ((p)->msg_enable & NETIF_MSG_DRV) @@ -910,6 +914,41 @@ #ifdef CONFIG_SYSCTL extern char *net_sysctl_strdup(const char *s); #endif + +/* debugging and troubleshooting/diagnostic helpers. */ +/** + * netdev_printk() - Log message with interface name, gated by message level + * @sevlevel: severity level -- e.g., KERN_INFO + * @netdev: net_device pointer + * @msglevel: a standard message-level flag with the NETIF_MSG_ prefix removed. + * Unless msglevel is ALL, log the message only if that flag is set in + * netdev->msg_enable. 
+ * @format: as with printk + * @args: as with printk + */ +extern int __netdev_printk(const char *sevlevel, + const struct net_device *netdev, int msglevel, const char *format, ...); +#define netdev_printk(sevlevel, netdev, msglevel, format, arg...) \ + __netdev_printk(sevlevel , netdev , NETIF_MSG_##msglevel , \ + format , ## arg) + +#ifdef DEBUG +#define netdev_dbg(netdev, msglevel, format, arg...) \ + netdev_printk(KERN_DEBUG , netdev , msglevel , format , ## arg) +#else +#define netdev_dbg(netdev, msglevel, format, arg...) do {} while (0) +#endif + +#define netdev_err(netdev, msglevel, format, arg...) \ + netdev_printk(KERN_ERR , netdev , msglevel , format , ## arg) +#define netdev_info(netdev, msglevel, format, arg...) \ + netdev_printk(KERN_INFO , netdev , msglevel , format , ## arg) +#define netdev_warn(netdev, msglevel, format, arg...) \ + netdev_printk(KERN_WARNING , netdev , msglevel , format , ## arg) + +/* report fatal error unconditionally; msglevel ignored for now */ +#define netdev_fatal(netdev, msglevel, format, arg...) \ + netdev_printk(KERN_ERR , netdev , ALL , format , ## arg) #endif /* __KERNEL__ */ diff -Naur linux.org/net/core/dev.c linux.netdev.patched/net/core/dev.c --- linux.org/net/core/dev.c Fri Dec 5 10:36:56 2003 +++ linux.netdev.patched/net/core/dev.c Fri Dec 5 10:36:56 2003 @@ -3049,6 +3049,79 @@ subsys_initcall(net_dev_init); +static spinlock_t netdev_printk_lock = SPIN_LOCK_UNLOCKED; +/** + * __netdev_printk() - Log message with interface name, gated by message level + * @sevlevel: severity level -- e.g., KERN_INFO + * @netdev: net_device pointer + * @msglevel: a standard message-level flag such as NETIF_MSG_PROBE. + * Unless msglevel is NETIF_MSG_ALL, log the message only if + * that flag is set in netdev->msg_enable. + * @format: as with printk + * @args: as with printk + * + * Does the work for the netdev_printk macro. + * For a lot of network drivers, the probe function looks like + * ... 
+ * netdev = alloc_netdev(...); // or alloc_etherdev(...) + * SET_NETDEV_DEV(netdev, dev); + * ... + * register_netdev(netdev); + * ... + * netdev_printk and its wrappers (e.g., netdev_err) can be used as + * soon as you have a valid net_device pointer -- e.g., from alloc_netdev, + * alloc_etherdev, or init_etherdev. (Before that, use dev_printk and + * its wrappers to report device errors.) It's common for an interface to + * have a name like "eth%d" until the device is successfully configured, + * and the call to register_netdev changes it to a "real" name like "eth0". + * + * If the interface's reg_state is NETREG_REGISTERED, we assume that it has + * been successfully set up in sysfs, and we prepend only the interface name + * to the message -- e.g., "eth0: NIC Link is Down". The interface + * name can be used to find eth0's driver, bus ID, etc. in sysfs. + * + * For any other value of reg_state, we prepend the driver name and bus ID + * as well as the (possibly incomplete) interface name -- e.g., + * "eth%d (e100 0000:00:03.0): Failed to map PCI address..." + * + * Probe functions that alloc and register in one step (via init_etherdev), + * or otherwise register the device before the probe completes successfully, + * may need to take other steps to ensure that the failing device is clearly + * identified. + */ +int __netdev_printk(const char *sevlevel, const struct net_device *netdev, + int msglevel, const char *format, ...) 
+{ + if (!netdev || !format) { + return -EINVAL; + } + if (msglevel == NETIF_MSG_ALL || (netdev->msg_enable & msglevel)) { + static char msg[512]; /* protected by netdev_printk_lock */ + unsigned long flags; + va_list args; + struct device *dev = netdev->class_dev.dev; + + spin_lock_irqsave(&netdev_printk_lock, flags); + va_start(args, format); + vsnprintf(msg, 512, format, args); + va_end(args); + + if (!sevlevel) { + sevlevel = ""; + } + + if (netdev->reg_state == NETREG_REGISTERED || !dev) { + printk("%s%s: %s", sevlevel, netdev->name, msg); + } else { + printk("%s%s (%s %s): %s", sevlevel, netdev->name, + dev->driver->name, dev->bus_id, msg); + } + spin_unlock_irqrestore(&netdev_printk_lock, flags); + } + return 0; +} + +EXPORT_SYMBOL(__netdev_printk); EXPORT_SYMBOL(__dev_get); EXPORT_SYMBOL(__dev_get_by_flags); EXPORT_SYMBOL(__dev_get_by_index); --------------A4FE9473AFB3835D185F2625-- From mashirle@us.ibm.com Fri Dec 5 12:15:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 12:15:24 -0800 (PST) Received: from linux2.suntekindustrial.com (wbar1.sjo1-4-4-004-065.sjo1.dsl-verizon.net [4.4.4.65]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5KF9Ta016629 for ; Fri, 5 Dec 2003 12:15:09 -0800 Received: from ibm-mxl (bi01p1.co.us.ibm.com [32.97.110.142]) (authenticated bits=0) by linux2.suntekindustrial.com (8.12.8/8.12.8) with ESMTP id hB5LQKbO022422; Fri, 5 Dec 2003 13:26:26 -0800 From: Shirley Ma Organization: IBM Linux To: kuznet@ms2.inr.ac.ru Subject: [PATCH]snmp6 64-bit counter support in proc.c Date: Fri, 5 Dec 2003 12:14:47 -0800 User-Agent: KMail/1.4.3 Cc: netdev@oss.sgi.com, xma@us.ibm.com References: <200312021240.PAA01536@yakov.inr.ac.ru> In-Reply-To: <200312021240.PAA01536@yakov.inr.ac.ru> MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="------------Boundary-00=_NWUFQ050M8TIMNC8CGFE" Message-Id: <200312051214.48076.mashirle@us.ibm.com> X-archive-position: 1892 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mashirle@us.ibm.com Precedence: bulk X-list: netdev --------------Boundary-00=_NWUFQ050M8TIMNC8CGFE Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable I add this because there are some 64-bit counters in the new IPv6 MIBs. This patch has been tested against linux-2.6.0-test9, and cleanly applied to linux-2.6.0-test11. Thanks Shirley Ma IBM Linux Technology Center --------------Boundary-00=_NWUFQ050M8TIMNC8CGFE Content-Type: text/x-diff; charset="iso-8859-1"; name="linux-2.6.0-test11-ipv6mib2-64.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="linux-2.6.0-test11-ipv6mib2-64.patch" diff -urN linux-2.6.0-test11/net/ipv6/proc.c linux-2.6.0-test11-ipv6mib2-64/net/ipv6/proc.c --- linux-2.6.0-test11/net/ipv6/proc.c 2003-11-26 12:43:38.000000000 -0800 +++ linux-2.6.0-test11-ipv6mib2-64/net/ipv6/proc.c 2003-12-05 12:04:19.000000000 -0800 @@ -60,13 +60,14 @@ struct snmp6_item { char *name; + int size; int offset; }; -#define SNMP6_SENTINEL { .name = NULL, .offset = 0 } +#define SNMP6_SENTINEL { .name = NULL, .size = 0, .offset = 0 } static struct snmp6_item snmp6_ipv6_list[] = { /* ipv6 mib according to RFC 2465 */ -#define SNMP6_GEN(x) { .name = #x , .offset = offsetof(struct ipv6_mib, x) } +#define SNMP6_GEN(x) { .name = #x , .size = sizeof(((struct ipv6_mib *)0)->x), .offset = offsetof(struct ipv6_mib, x) } SNMP6_GEN(Ip6InReceives), SNMP6_GEN(Ip6InHdrErrors), SNMP6_GEN(Ip6InTooBigErrors), @@ -104,7 +105,7 @@ OutRouterAdvertisements too. OutGroupMembQueries too.
*/ -#define SNMP6_GEN(x) { .name = #x , .offset = offsetof(struct icmpv6_mib, x) } +#define SNMP6_GEN(x) { .name = #x , .size = sizeof(((struct icmpv6_mib *)0)->x), .offset = offsetof(struct icmpv6_mib, x) } SNMP6_GEN(Icmp6InMsgs), SNMP6_GEN(Icmp6InErrors), SNMP6_GEN(Icmp6InDestUnreachs), @@ -138,7 +139,7 @@ }; static struct snmp6_item snmp6_udp6_list[] = { -#define SNMP6_GEN(x) { .name = "Udp6" #x , .offset = offsetof(struct udp_mib, Udp##x) } +#define SNMP6_GEN(x) { .name = "Udp6" #x , .size = sizeof(((struct udp_mib *)0)->Udp##x), .offset = offsetof(struct udp_mib, Udp##x) } SNMP6_GEN(InDatagrams), SNMP6_GEN(NoPorts), SNMP6_GEN(InErrors), @@ -147,22 +148,27 @@ SNMP6_SENTINEL }; -static unsigned long -fold_field(void *mib[], int offt) +static unsigned long long +fold_field(void *mib[], int size, int offt) { - unsigned long res = 0; + unsigned long long res = 0; int i; for (i = 0; i < NR_CPUS; i++) { if (!cpu_possible(i)) continue; - res += - *((unsigned long *) (((void *)per_cpu_ptr(mib[0], i)) + - offt)); - res += - *((unsigned long *) (((void *)per_cpu_ptr(mib[1], i)) + - offt)); - } + if (size == 4) { + res += *((unsigned long *) + (((void *)per_cpu_ptr(mib[0], i)) + offt)); + res += *((unsigned long *) + (((void *)per_cpu_ptr(mib[1], i)) + offt)); + } else if (size == 8) { + res += *((unsigned long long *) + (((void *)per_cpu_ptr(mib[0], i)) + offt)); + res += *((unsigned long long *) + (((void *)per_cpu_ptr(mib[1], i)) + offt)); + } + } return res; } @@ -170,9 +176,9 @@ snmp6_seq_show_item(struct seq_file *seq, void **mib, struct snmp6_item *itemlist) { int i; - for (i=0; itemlist[i].name; i++) - seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name, - fold_field(mib, itemlist[i].offset)); + for (i=0; itemlist[i].name; i++) + seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name, + fold_field(mib, itemlist[i].size, itemlist[i].offset)); } static int snmp6_seq_show(struct seq_file *seq, void *v) --------------Boundary-00=_NWUFQ050M8TIMNC8CGFE-- From 
davem@pizda.ninka.net Fri Dec 5 12:33:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 12:33:44 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5KXNTa019910 for ; Fri, 5 Dec 2003 12:33:23 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA06420; Fri, 5 Dec 2003 12:31:33 -0800 Date: Fri, 5 Dec 2003 12:31:33 -0800 From: "David S. Miller" To: Shirley Ma Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, xma@us.ibm.com Subject: Re: [PATCH]snmp6 64-bit counter support in proc.c Message-Id: <20031205123133.3743fe39.davem@redhat.com> In-Reply-To: <200312051214.48076.mashirle@us.ibm.com> References: <200312021240.PAA01536@yakov.inr.ac.ru> <200312051214.48076.mashirle@us.ibm.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1893 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 5 Dec 2003 12:14:47 -0800 Shirley Ma wrote: > I add this because there are some 64-bit counters in the new IPv6 MIBs. > This patch has been tested against linux-2.6.0-test9, and cleanly applied to > linux-2.6.0-test11. "sizeof(unsigned long)" evaluates to 8 on 64-bit systems, yet you assume it always evaluates to 4 as on 32-bit systems. Maybe it would be wiser to explicitly use 'u32' and 'u64' for the types of the snmp counters? This has always been a sore area.
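A user-space sketch of the fixed-width idea suggested above, offered only as an illustration: it assumes <stdint.h> types in place of the kernel's u32/u64, and struct demo_mib, demo_read_counter() and demo_fold_field() are hypothetical names, not anything from the patch. An array of copies stands in for the per-CPU MIB copies. The point is that the 4-byte branch reads a type that is 4 bytes on every platform, unlike unsigned long.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical MIB block mixing counter widths (not the kernel's struct). */
struct demo_mib {
	uint32_t in_errors;
	uint64_t in_octets;
};

/* Load one counter of exactly `size` bytes found `offt` bytes into `mib`. */
uint64_t demo_read_counter(const void *mib, size_t size, size_t offt)
{
	const unsigned char *p = (const unsigned char *)mib + offt;
	uint32_t v32;
	uint64_t v64;

	if (size == sizeof(v32)) {
		memcpy(&v32, p, sizeof(v32));
		return v32;
	}
	if (size == sizeof(v64)) {
		memcpy(&v64, p, sizeof(v64));
		return v64;
	}
	return 0;	/* unsupported width */
}

/* Sum one counter across `n` copies, standing in for the per-CPU copies. */
uint64_t demo_fold_field(const struct demo_mib *copies, size_t n,
			 size_t size, size_t offt)
{
	uint64_t res = 0;
	size_t i;

	for (i = 0; i < n; i++)
		res += demo_read_counter(&copies[i], size, offt);
	return res;
}
```

Because the width test compares against sizeof(uint32_t) and sizeof(uint64_t) rather than the literals 4 and 8 paired with unsigned long, the same code reads the right number of bytes on both 32-bit and 64-bit builds.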
From davem@pizda.ninka.net Fri Dec 5 12:43:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 12:43:33 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5KhKTa020735 for ; Fri, 5 Dec 2003 12:43:20 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA09715; Fri, 5 Dec 2003 12:43:09 -0800 Date: Fri, 5 Dec 2003 12:43:09 -0800 From: "David S. Miller" To: Julian Anastasov Cc: herbert@gondor.apana.org.au, netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-Id: <20031205124309.1490a2f4.davem@redhat.com> In-Reply-To: References: <20031202163902.2deb43a7.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1894 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 4 Dec 2003 00:03:20 +0200 (EET) Julian Anastasov wrote: > - ip_rt_redirect now supports RTO_ONLINK > > - ip_rt_frag_needed now supports RTO_ONLINK and all oif!=0 Let's discuss what RTO_ONLINK means :) It means, "Do not route this packet". It goes to the local subnet and that's the end of the story. Therefore there are no PMTU nor redirects to handle when this TOS bit is set. RTO_ONLINK is set by these two cases: 1) ARP 2) RAW/UDP sockets when MSG_DONTROUTE is specified by the user. So this patch still needs some work. 
From davem@pizda.ninka.net Fri Dec 5 12:45:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 12:45:19 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5Kj6Ta021006 for ; Fri, 5 Dec 2003 12:45:06 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA10693; Fri, 5 Dec 2003 12:44:55 -0800 Date: Fri, 5 Dec 2003 12:44:55 -0800 From: "David S. Miller" To: Bart De Schuymer Cc: ja@ssi.bg, netdev@oss.sgi.com, shemminger@osdl.org Subject: Re: [2.6 PATCH] bridge - provide valid tos value for ip_route_output_key Message-Id: <20031205124455.6b313caf.davem@redhat.com> In-Reply-To: <200312031938.22925.bdschuym@pandora.be> References: <200312031938.22925.bdschuym@pandora.be> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1895 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 3 Dec 2003 19:38:22 +0100 Bart De Schuymer wrote: > Hi Dave, If you haven't already applied the patch from Julian Anastasov > below, please do. Applied, thanks guys. 
From scott.feldman@intel.com Fri Dec 5 13:36:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 13:36:54 -0800 (PST) Received: from caduceus.jf.intel.com (fmr06.intel.com [134.134.136.7]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5LacTa022111 for ; Fri, 5 Dec 2003 13:36:41 -0800 Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by caduceus.jf.intel.com (8.12.9-20030918-01/8.12.9/d: major-outer.mc,v 1.11 2003/12/03 18:29:28 root Exp $) with ESMTP id hB5Lb6Lk029859 for ; Fri, 5 Dec 2003 21:37:06 GMT Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by petasus.jf.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.6 2003/12/04 00:48:44 root Exp $) with SMTP id hB5LaOHN026685 for ; Fri, 5 Dec 2003 21:36:26 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs041.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003120513363106919 ; Fri, 05 Dec 2003 13:36:31 -0800 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Fri, 5 Dec 2003 13:36:31 -0800 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 Subject: RE: [2/4] pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Date: Fri, 5 Dec 2003 13:36:30 -0800 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [2/4] pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Thread-Index: AcO6+s2n/YLrRrA9R2aOF4Vb2yga6AAe/gAQ From: "Feldman, Scott" To: "Jeff Garzik" , Cc: , , X-OriginalArrivalTime: 05 Dec 2003 21:36:31.0562 (UTC) FILETIME=[D92D1AA0:01C3BB77] X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id hB5LacTa022111 X-archive-position: 1896 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev > For both, as Scott mentioned, they don't look tested under NAPI. For > e100 specifically, there is a spiffy new e100 that would need to be > re-diffed and tested against. Specifically, we need to care about ensuring netif_rx is called in the poll_controller callback, rather than netif_receive_skb. If you use netif_receive_skb (NAPI mode), the netdump conversation is one-sided. tg3.c looks broken in this regard in the 2.6-exp BK tree, BTW. Additionally, if you have VLANs enabled on the interface, we might need to care about vlan_hwaccel_[rx|receive_skb]. Something tells me all of the netif_* receive funcs need to handle the netdump conversation. That would be easier than special casing poll_controller in each driver. Is someone working on this? We're working on e100/e1000 patches for the current scheme, so we'll post these when we've tested the various NAPI/VLAN combinations.
-scott From mashirle@us.ibm.com Fri Dec 5 13:52:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 13:52:14 -0800 (PST) Received: from linux2.suntekindustrial.com (wbar1.sjo1-4-4-004-065.sjo1.dsl-verizon.net [4.4.4.65]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5LpsTa022612 for ; Fri, 5 Dec 2003 13:52:01 -0800 Received: from ibm-mxl (bi01p1.co.us.ibm.com [32.97.110.142]) (authenticated bits=0) by linux2.suntekindustrial.com (8.12.8/8.12.8) with ESMTP id hB5N3BbO022496; Fri, 5 Dec 2003 15:03:12 -0800 From: Shirley Ma Organization: IBM Linux To: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: [PATCH] IPv6 MIB:ipv6Prefix netlink notification Date: Fri, 5 Dec 2003 13:51:47 -0800 User-Agent: KMail/1.4.3 Cc: xma@us.ibm.com References: <200311191621.38087.mashirle@us.ibm.com> In-Reply-To: <200311191621.38087.mashirle@us.ibm.com> MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="------------Boundary-00=_BEZFRZA61GQXT3MPYBIA" Message-Id: <200312051351.47962.mashirle@us.ibm.com> X-archive-position: 1897 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mashirle@us.ibm.com Precedence: bulk X-list: netdev --------------Boundary-00=_BEZFRZA61GQXT3MPYBIA Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable On receiving a router advertisement message for prefix info, a netlink notification event will be created.
This patch has been tested against linux-2.6.0-test11.=20 Thanks Shirley Ma IBM Linux Technology Center --------------Boundary-00=_BEZFRZA61GQXT3MPYBIA Content-Type: text/x-diff; charset="iso-8859-1"; name="linux-2.6.0-test11-ipv6mib3.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="linux-2.6.0-test11-ipv6mib3.patch" diff -urN linux-2.6.0-test11/include/linux/rtnetlink.h linux-2.6.0-test11-ipv6mib3/include/linux/rtnetlink.h --- linux-2.6.0-test11/include/linux/rtnetlink.h 2003-11-26 12:45:11.000000000 -0800 +++ linux-2.6.0-test11-ipv6mib3/include/linux/rtnetlink.h 2003-12-05 12:20:11.000000000 -0800 @@ -44,7 +44,10 @@ #define RTM_DELTFILTER (RTM_BASE+29) #define RTM_GETTFILTER (RTM_BASE+30) -#define RTM_MAX (RTM_BASE+31) +#define RTM_NEWPREFIX (RTM_BASE+36) +#define RTM_GETPREFIX (RTM_BASE+38) + +#define RTM_MAX (RTM_BASE+39) /* Generic structure for encapsulation of optional route information. @@ -458,6 +461,34 @@ unsigned ifi_change; /* IFF_* change mask */ }; +/******************************************************************** + * prefix information + ****/ + +struct prefixmsg +{ + unsigned char prefix_family; + int prefix_ifindex; + unsigned char prefix_type; + unsigned char prefix_len; + unsigned char prefix_flags; +}; + +enum +{ + PREFIX_UNSPEC, + PREFIX_ADDRESS, + PREFIX_CACHEINFO, +}; + +#define PREFIX_MAX PREFIX_CACHEINFO + +struct prefix_cacheinfo +{ + __u32 preferred_time; + __u32 valid_time; +}; + /* The struct should be in sync with struct net_device_stats */ struct rtnl_link_stats { @@ -614,6 +645,8 @@ #define RTMGRP_DECnet_IFADDR 0x1000 #define RTMGRP_DECnet_ROUTE 0x4000 +#define RTMGRP_IPV6_PREFIX 0x20000 + /* End of information exported to user level */ #ifdef __KERNEL__ diff -urN linux-2.6.0-test11/include/net/if_inet6.h linux-2.6.0-test11-ipv6mib3/include/net/if_inet6.h --- linux-2.6.0-test11/include/net/if_inet6.h 2003-11-26 12:45:45.000000000 -0800 +++ linux-2.6.0-test11-ipv6mib3/include/net/if_inet6.h 
2003-12-05 12:20:11.000000000 -0800 @@ -25,6 +25,10 @@ #define IF_RA_RCVD 0x20 #define IF_RS_SENT 0x10 +/* prefix flags */ +#define IF_PREFIX_ONLINK 0x01 +#define IF_PREFIX_AUTOCONF 0x02 + #ifdef __KERNEL__ struct inet6_ifaddr diff -urN linux-2.6.0-test11/net/ipv6/addrconf.c linux-2.6.0-test11-ipv6mib3/net/ipv6/addrconf.c --- linux-2.6.0-test11/net/ipv6/addrconf.c 2003-11-26 12:45:37.000000000 -0800 +++ linux-2.6.0-test11-ipv6mib3/net/ipv6/addrconf.c 2003-12-05 12:21:59.000000000 -0800 @@ -138,6 +138,8 @@ static void addrconf_rs_timer(unsigned long data); static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifa); +static void inet6_prefix_notify(int event, struct inet6_dev *idev, + struct prefix_info *pinfo); static int ipv6_chk_same_addr(const struct in6_addr *addr, struct net_device *dev); static struct notifier_block *inet6addr_chain; @@ -1491,6 +1493,7 @@ addrconf_verify(0); } } + inet6_prefix_notify(RTM_NEWPREFIX, in6_dev, pinfo); in6_dev_put(in6_dev); } @@ -2751,6 +2754,66 @@ return skb->len; } +static int inet6_fill_prefix(struct sk_buff *skb, struct inet6_dev *idev, + struct prefix_info *pinfo, u32 pid, u32 seq, int event) +{ + struct prefixmsg *pmsg; + struct nlmsghdr *nlh; + unsigned char *b = skb->tail; + struct prefix_cacheinfo ci; + + nlh = NLMSG_PUT(skb, pid, seq, event, sizeof(*pmsg)); + + if (pid) + nlh->nlmsg_flags |= NLM_F_MULTI; + + pmsg = NLMSG_DATA(nlh); + pmsg->prefix_family = AF_INET6; + pmsg->prefix_ifindex = idev->dev->ifindex; + pmsg->prefix_len = pinfo->prefix_len; + pmsg->prefix_type = pinfo->type; + + pmsg->prefix_flags = 0; + if (pinfo->onlink) + pmsg->prefix_flags |= IF_PREFIX_ONLINK; + if (pinfo->autoconf) + pmsg->prefix_flags |= IF_PREFIX_AUTOCONF; + + RTA_PUT(skb, PREFIX_ADDRESS, sizeof(pinfo->prefix), &pinfo->prefix); + + ci.preferred_time = ntohl(pinfo->prefered); + ci.valid_time = ntohl(pinfo->valid); + RTA_PUT(skb, PREFIX_CACHEINFO, sizeof(ci), &ci); + + nlh->nlmsg_len = skb->tail - b; + return skb->len; + 
+nlmsg_failure: +rtattr_failure: + skb_trim(skb, b - skb->data); + return -1; +} + +static void inet6_prefix_notify(int event, struct inet6_dev *idev, + struct prefix_info *pinfo) +{ + struct sk_buff *skb; + int size = NLMSG_SPACE(sizeof(struct prefixmsg)+128); + + skb = alloc_skb(size, GFP_ATOMIC); + if (!skb) { + netlink_set_err(rtnl, 0, RTMGRP_IPV6_PREFIX, ENOBUFS); + return; + } + if (inet6_fill_prefix(skb, idev, pinfo, 0, 0, event) < 0) { + kfree_skb(skb); + netlink_set_err(rtnl, 0, RTMGRP_IPV6_PREFIX, EINVAL); + return; + } + NETLINK_CB(skb).dst_groups = RTMGRP_IPV6_PREFIX; + netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_PREFIX, GFP_ATOMIC); +} + static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = { [RTM_GETLINK - RTM_BASE] = { .dumpit = inet6_dump_ifinfo, }, [RTM_NEWADDR - RTM_BASE] = { .doit = inet6_rtm_newaddr, }, --------------Boundary-00=_BEZFRZA61GQXT3MPYBIA-- From xma@us.ibm.com Fri Dec 5 13:55:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 13:56:10 -0800 (PST) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5LtuTa023006 for ; Fri, 5 Dec 2003 13:55:57 -0800 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e33.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hB5LtIJh085606; Fri, 5 Dec 2003 16:55:18 -0500 Received: from d03nm124.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hB5LtHIn059586; Fri, 5 Dec 2003 14:55:18 -0700 Importance: Normal Sensitivity: Subject: Re: [PATCH]snmp6 64-bit counter support in proc.c To: "David S. 
Miller" Cc: mashirle@us.ibm.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000 Message-ID: From: Shirley Ma Date: Fri, 5 Dec 2003 13:55:15 -0800 X-MIMETrack: Serialize by Router on D03NM124/03/M/IBM(Release 6.0.2CF2|July 23, 2003) at 12/05/2003 14:55:17 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-archive-position: 1898 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: xma@us.ibm.com Precedence: bulk X-list: netdev

Thanks, Dave. When I created the patch, I was concerned about this. I will change the MIB counters to 'u32' or 'u64'.

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone: (503) 578-7638
FAX: (503) 578-3228

"David S. Miller" on 12/05/2003 12:31:33 PM
To: mashirle@us.ltcfwd.linux.ibm.com
cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, Shirley Ma/Beaverton/IBM@IBMUS
Subject: Re: [PATCH]snmp6 64-bit counter support in proc.c

On Fri, 5 Dec 2003 12:14:47 -0800 Shirley Ma wrote:

> I added this because there are some 64-bit counters in the new IPv6 MIBs.
> This patch has been tested against linux-2.6.0-test9, and applies cleanly to
> linux-2.6.0-test11.

"sizeof(unsigned long)" evaluates to 8 on 64-bit systems, yet you assume it always evaluates to 4 as on 32-bit systems.

Maybe it would be wiser to explicitly use 'u32' and 'u64' for the types of the snmp counters? This has always been a sore area.
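
[Editorial sketch] The point about 'u32' and 'u64' can be illustrated outside the kernel. The block below is only a sketch under stated assumptions: it uses the <stdint.h> types as userspace stand-ins for the kernel's u32/u64, and struct demo_mib and the demo_* helpers are hypothetical names invented for illustration, not anything from the snmp6 patch. It shows that counters declared with explicit widths keep the same size and wrap behaviour whether unsigned long is 4 or 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's u32/u64 types (assumption: this
 * sketch is built outside the kernel, so <stdint.h> supplies the widths). */
typedef uint32_t u32;
typedef uint64_t u64;

/* The widths are part of the type, so they hold on ILP32 and LP64 alike. */
static_assert(sizeof(u32) == 4, "u32 must be 4 bytes");
static_assert(sizeof(u64) == 8, "u64 must be 8 bytes");

/* Hypothetical counter block; not the layout used by the snmp6 patch. */
struct demo_mib {
	u32 in_receives;	/* 32-bit counter, always 4 bytes */
	u64 in_octets;		/* 64-bit counter, always 8 bytes */
};

/* Size of the 32-bit counter field, independent of sizeof(unsigned long). */
size_t demo_counter32_size(void)
{
	return sizeof(((struct demo_mib *)0)->in_receives);
}

/* Size of the 64-bit counter field, independent of sizeof(unsigned long). */
size_t demo_counter64_size(void)
{
	return sizeof(((struct demo_mib *)0)->in_octets);
}

/* Adds to a 32-bit counter; the result wraps modulo 2^32 on every target. */
u32 demo_add32(u32 counter, u32 delta)
{
	return (u32)(counter + delta);
}

/* Adds to a 64-bit counter; no wrap at 2^32 even on 32-bit targets. */
u64 demo_add64(u64 counter, u64 delta)
{
	return counter + delta;
}
```

With 'unsigned long' counters, the same two helpers would behave differently on 32-bit and 64-bit builds, which is the assumption the review comment objects to.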
From scott.feldman@intel.com Fri Dec 5 14:01:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 14:02:07 -0800 (PST) Received: from caduceus.jf.intel.com (fmr06.intel.com [134.134.136.7]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5M1sTa023492 for ; Fri, 5 Dec 2003 14:01:54 -0800 Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by caduceus.jf.intel.com (8.12.9-20030918-01/8.12.9/d: major-outer.mc,v 1.11 2003/12/03 18:29:28 root Exp $) with ESMTP id hB5M2ELk014008 for ; Fri, 5 Dec 2003 22:02:14 GMT Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by petasus.jf.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.6 2003/12/04 00:48:44 root Exp $) with SMTP id hB5M1GHR015564 for ; Fri, 5 Dec 2003 22:01:34 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs041.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003120514013807369 ; Fri, 05 Dec 2003 14:01:38 -0800 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx332.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Fri, 5 Dec 2003 14:01:38 -0800 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 Subject: RE: [RFR] new e100 driver Date: Fri, 5 Dec 2003 14:01:37 -0800 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [RFR] new e100 driver Thread-Index: AcO7XoL3HvkggIM8TJamxnwXBN/tjgAHRT6Q From: "Feldman, Scott" To: "Jeff Garzik" , "Rask Ingemann Lambertsen" Cc: X-OriginalArrivalTime: 05 Dec 2003 22:01:38.0264 (UTC) FILETIME=[5B3D9580:01C3BB7B] X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id hB5M1sTa023492 X-archive-position: 1899 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev

> > It would be nice if the documentation said so. A few days
> > ago I spent half an hour or so trying to figure out why frames
> > sent by an 82558 were padded with 0x00 rather than the
> > documented 0x7e (with no sign of skb_padto(), memset() or
> > anything similar in the driver).
>
> Do you mean the PDF documentation? I dunno how much control Scott
> Feldman and his team have over that.

Sorry Rask. It'll be easier to accumulate several doc changes and roll them out at one time. Do you have other doc updates? I'll try to maintain a list for when we do an update.

-scott

From scott.feldman@intel.com Fri Dec 5 14:06:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 14:07:04 -0800 (PST) Received: from caduceus.jf.intel.com (fmr06.intel.com [134.134.136.7]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5M6oTa023951 for ; Fri, 5 Dec 2003 14:06:51 -0800 Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by caduceus.jf.intel.com (8.12.9-20030918-01/8.12.9/d: major-outer.mc,v 1.11 2003/12/03 18:29:28 root Exp $) with ESMTP id hB5M75Lk017526 for ; Fri, 5 Dec 2003 22:07:06 GMT Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by petasus.jf.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.6 2003/12/04 00:48:44 root Exp $) with SMTP id hB5M5aHQ019480 for ; Fri, 5 Dec 2003 22:06:16 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs041.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003120514062108171 ; Fri, 05 Dec 2003 14:06:21 -0800 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx332.amr.corp.intel.com with
Microsoft SMTPSVC(5.0.2195.5329); Fri, 5 Dec 2003 14:06:21 -0800 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 Subject: RE: [2/4] pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Date: Fri, 5 Dec 2003 14:06:20 -0800 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [2/4] pollcontroller patch for 2.6.0-test10-bk25-netdrvr-exp1 Thread-Index: AcO6+s2n/YLrRrA9R2aOF4Vb2yga6AAe/gAQAAFu+AA= From: "Feldman, Scott" To: "Feldman, Scott" , "Jeff Garzik" , Cc: , , X-OriginalArrivalTime: 05 Dec 2003 22:06:21.0221 (UTC) FILETIME=[03E56550:01C3BB7C] X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id hB5M6oTa023951 X-archive-position: 1900 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev > Additionally, if you have VLANs enabled on the interface, we > might need to care about netif_hwaccell_[rx|receive_skb]. > Something tells me all of the netif_* receive funcs need to > handle the netdump conversation. That would be easier than > special casing poll_controller in each driver. Is someone > working on this? Ok, nevermind, we looked closer at lkcd/netpoll patches and it looks like this is addressed. -scott From davem@pizda.ninka.net Fri Dec 5 14:57:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 14:57:16 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5Mv3Ta027568 for ; Fri, 5 Dec 2003 14:57:03 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id OAA23054; Fri, 5 Dec 2003 14:57:00 -0800 Date: Fri, 5 Dec 2003 14:57:00 -0800 From: "David S. 
Miller" To: Shirley Ma Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, xma@us.ibm.com Subject: Re: [PATCH] IPv6 MIB:ipv6Prefix netlink notification Message-Id: <20031205145700.120e9e68.davem@redhat.com> In-Reply-To: <200312051351.47962.mashirle@us.ibm.com> References: <200311191621.38087.mashirle@us.ibm.com> <200312051351.47962.mashirle@us.ibm.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1901 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 5 Dec 2003 13:51:47 -0800 Shirley Ma wrote: > Once receiving a router advertisement message for prefix info, > a netlink notification event will be created. > > This patch has been tested against linux-2.6.0-test11. Let's queue this up for 2.6.1, please resend it after 2.6.0 is released. 
From shemminger@osdl.org Fri Dec 5 15:08:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 15:08:23 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5N87Ta028878 for ; Fri, 5 Dec 2003 15:08:07 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB5N62Z19877; Fri, 5 Dec 2003 15:06:02 -0800 Date: Fri, 5 Dec 2003 15:06:01 -0800 From: Stephen Hemminger To: Russell King , Jeff Garzik , Jean Tourrilhes , Alexander Viro Cc: netdev@oss.sgi.com, irda-users@sourceforge.net Subject: [PATCH] sa1100ir - fix for 2.6 Message-Id: <20031205150601.68c52de3.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1902 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Update the ARM SA1100 irda driver to current 2.6 network device model * allocate network device with alloc_irdadev - get rid of dev_alloc * change to current system device class model - looks like it was using early old version of interface from 2.5 * better error unwind This has been cross compiled successfully against net-drivers-2.5-exp diff -Nru a/drivers/net/irda/sa1100_ir.c b/drivers/net/irda/sa1100_ir.c --- a/drivers/net/irda/sa1100_ir.c Fri Dec 5 14:48:38 2003 +++ b/drivers/net/irda/sa1100_ir.c Fri Dec 5 14:48:38 2003 @@ -30,6 +30,7 @@ #include #include #include +#include #include #include @@ -56,6 +57,11 @@ static int tx_lpm; static int max_rate = 4000000; +/* + * Our netdevice. 
There is only ever one of these. + */ +static struct net_device *netdev; + struct sa1100_irda { unsigned char hscr0; unsigned char utcr4; @@ -947,56 +953,77 @@ return io->head ? 0 : -ENOMEM; } -static struct device_driver sa1100ir_driver = { - .name = "sa1100ir", - .bus = &system_bus_type, +static struct sysdev_class sa1100ir_sysdev_class = { + set_kset_name("sa1100ir"), .suspend = sa1100_irda_suspend, .resume = sa1100_irda_resume, }; static struct sys_device sa1100ir_device = { - .name = "sa1100ir", .id = 0, - .root = NULL, - .dev = { - .name = "Intel Corporation SA11x0 [IrDA]", - .bus_id = "0", - .driver = &sa1100ir_driver, - }, + .cls = &sa1100ir_sysdev_class, }; -static int sa1100_irda_net_init(struct net_device *dev) +static int __init sa1100_irda_init(void) { - struct sa1100_irda *si = dev->priv; + struct net_device *dev; + struct sa1100_irda *si; unsigned int baudrate_mask; - int err = -ENOMEM; - - si = kmalloc(sizeof(struct sa1100_irda), GFP_KERNEL); - if (!si) - goto out; - - memset(si, 0, sizeof(*si)); + int err; - si->dev = &sa1100ir_device.dev; + /* Register as a system device with sysfs */ + err = sysdev_class_register(&sa1100ir_sysdev_class); + if (err) + goto err_sys_1; + + err = sys_device_register(&sa1100ir_device); + if (err) + goto err_sys_2; /* - * Initialise the HP-SIR buffers + * Limit power level a sensible range. */ - err = sa1100_irda_init_iobuf(&si->rx_buff, 14384); + if (power_level < 1) + power_level = 1; + if (power_level > 3) + power_level = 3; + + err = request_mem_region(__PREG(Ser2UTCR0), 0x24, "IrDA") ? 0 : -EBUSY; if (err) - goto out; - err = sa1100_irda_init_iobuf(&si->tx_buff, 4000); + goto err_mem_1; + err = request_mem_region(__PREG(Ser2HSCR0), 0x1c, "IrDA") ? 0 : -EBUSY; + if (err) + goto err_mem_2; + err = request_mem_region(__PREG(Ser2HSCR2), 0x04, "IrDA") ? 
0 : -EBUSY; if (err) - goto out_free_rx; + goto err_mem_3; + + err = -ENOMEM; + dev = alloc_irdadev(sizeof(struct sa1100_irda)); + if (!dev) + goto err_mem_4; + + si = dev->priv; + dev->irq = IRQ_Ser2ICP; - dev->priv = si; dev->hard_start_xmit = sa1100_irda_hard_xmit; dev->open = sa1100_irda_start; dev->stop = sa1100_irda_stop; dev->do_ioctl = sa1100_irda_ioctl; dev->get_stats = sa1100_irda_stats; - irda_device_setup(dev); + SET_MODULE_OWNER(dev); + + /* + * Initialise the HP-SIR buffers + */ + err = sa1100_irda_init_iobuf(&si->rx_buff, 14384); + if (err) + goto err_buf_1; + err = sa1100_irda_init_iobuf(&si->tx_buff, 4000); + if (err) + goto err_buf_2; + irda_init_max_qos_capabilies(&si->qos); /* @@ -1030,110 +1057,55 @@ Ser2UTCR4 = si->utcr4; Ser2HSCR0 = HSCR0_UART; + err = register_netdev(dev); + if (err) + goto err_registration; + + netdev = dev; + return 0; +err_registration: kfree(si->tx_buff.head); -out_free_rx: +err_buf_2: kfree(si->rx_buff.head); -out: - kfree(si); - - return err; -} +err_buf_1: + free_netdev(dev); +err_mem_4: -/* - * Remove all traces of this driver module from the kernel, so we can't be - * called. Note that the device has already been stopped, so we don't have - * to worry about interrupts or dma. - */ -static void sa1100_irda_net_uninit(struct net_device *dev) -{ - struct sa1100_irda *si = dev->priv; - - dev->hard_start_xmit = NULL; - dev->open = NULL; - dev->stop = NULL; - dev->do_ioctl = NULL; - dev->get_stats = NULL; - dev->priv = NULL; - - kfree(si->tx_buff.head); - kfree(si->rx_buff.head); - kfree(si); -} - -static int __init sa1100_irda_init(void) -{ - struct net_device *dev; - int err; - - /* - * Limit power level a sensible range. - */ - if (power_level < 1) - power_level = 1; - if (power_level > 3) - power_level = 3; - - err = request_mem_region(__PREG(Ser2UTCR0), 0x24, "IrDA") ? 0 : -EBUSY; - if (err) - goto err_mem_1; - err = request_mem_region(__PREG(Ser2HSCR0), 0x1c, "IrDA") ? 
0 : -EBUSY; - if (err) - goto err_mem_2; - err = request_mem_region(__PREG(Ser2HSCR2), 0x04, "IrDA") ? 0 : -EBUSY; - if (err) - goto err_mem_3; - - driver_register(&sa1100ir_driver); - sys_device_register(&sa1100ir_device); - - rtnl_lock(); - dev = dev_alloc("irda%d", &err); - if (dev) { - dev->irq = IRQ_Ser2ICP; - dev->init = sa1100_irda_net_init; - dev->uninit = sa1100_irda_net_uninit; - - err = register_netdevice(dev); - - if (err) - kfree(dev); - else - dev_set_drvdata(&sa1100ir_device.dev, dev); - } - rtnl_unlock(); - - if (err) { - sys_device_unregister(&sa1100ir_device); - driver_unregister(&sa1100ir_driver); - - release_mem_region(__PREG(Ser2HSCR2), 0x04); + release_mem_region(__PREG(Ser2HSCR2), 0x04); err_mem_3: - release_mem_region(__PREG(Ser2HSCR0), 0x1c); + release_mem_region(__PREG(Ser2HSCR0), 0x1c); err_mem_2: - release_mem_region(__PREG(Ser2UTCR0), 0x24); - } + release_mem_region(__PREG(Ser2UTCR0), 0x24); err_mem_1: + sys_device_unregister(&sa1100ir_device); +err_sys_2: + sysdev_class_unregister(&sa1100ir_sysdev_class); +err_sys_1: return err; } static void __exit sa1100_irda_exit(void) { - struct net_device *dev = dev_get_drvdata(&sa1100ir_device.dev); + struct net_device *dev = netdev; - if (dev) + if (dev) { + struct sa1100_irda *si = dev->priv; unregister_netdev(dev); - sys_device_unregister(&sa1100ir_device); - driver_unregister(&sa1100ir_driver); + kfree(si->tx_buff.head); + kfree(si->rx_buff.head); + release_mem_region(__PREG(Ser2HSCR2), 0x04); + release_mem_region(__PREG(Ser2HSCR0), 0x1c); + release_mem_region(__PREG(Ser2UTCR0), 0x24); + free_netdev(dev); - release_mem_region(__PREG(Ser2HSCR2), 0x04); - release_mem_region(__PREG(Ser2HSCR0), 0x1c); - release_mem_region(__PREG(Ser2UTCR0), 0x24); + sys_device_unregister(&sa1100ir_device); + sysdev_class_unregister(&sa1100ir_sysdev_class); - if(dev) - free_netdev(dev); + netdev = NULL; + } } static int __init sa1100ir_setup(char *line) From shemminger@osdl.org Fri Dec 5 15:13:04 2003 Received: 
with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 15:13:18 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5ND4Ta029319 for ; Fri, 5 Dec 2003 15:13:04 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hB5NCnZ20679; Fri, 5 Dec 2003 15:12:49 -0800 Date: Fri, 5 Dec 2003 15:12:49 -0800 From: Stephen Hemminger To: Alexander Viro Cc: Jeff Garzik , netdev@oss.sgi.com, Alexander Viro Subject: Re: [PATCH] skfddi - convert to new pci model. Message-Id: <20031205151249.44a067df.shemminger@osdl.org> In-Reply-To: <20031205005915.GD31510@devserv.devel.redhat.com> References: <20031204163928.0f34d5d1.shemminger@osdl.org> <20031205005915.GD31510@devserv.devel.redhat.com> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1903 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Second revision of the cleanup of skfddi driver. * use new pci device bus initialization * allocate network device with alloc_fddidev and use dev->priv * get rid of special module/non module distinctions. 
* fix error unwinds and return values on initialization * call driver_init directly not via register_netdev * reset internal queue count after purge * get rid of h[iy]sterical comment that is no longer true diff -Nru a/drivers/net/skfp/skfddi.c b/drivers/net/skfp/skfddi.c --- a/drivers/net/skfp/skfddi.c Fri Dec 5 14:49:12 2003 +++ b/drivers/net/skfp/skfddi.c Fri Dec 5 14:49:12 2003 @@ -39,12 +39,6 @@ * are skfddi.c, h/types.h, h/osdef1st.h, h/targetos.h. * The others belong to the SysKonnect FDDI Hardware Module and * should better not be changed. - * NOTE: - * Compiling this driver produces some warnings, but I did not fix - * this, because the Hardware Module source is used for different - * drivers, and fixing it for Linux might bring problems on other - * projects. To keep the source common for all those drivers (and - * thus simplify fixes to it), please do not clean it up! * * Modification History: * Date Name Description @@ -58,6 +52,7 @@ * 07-May-00 DM 64 bit fixes, new dma interface * 31-Jul-03 DB Audit copy_*_user in skfp_ioctl * Daniele Bellucci + * 03-Dec-03 SH Convert to PCI device model * * Compilation options (-Dxxx): * DRIVERDEBUG print lots of messages to log file @@ -70,7 +65,7 @@ /* Version information string - should be updated prior to */ /* each new release!!! 
*/ -#define VERSION "2.06" +#define VERSION "2.07" static const char *boot_msg = "SysKonnect FDDI PCI Adapter driver v" VERSION " for\n" @@ -80,15 +75,11 @@ #include #include -#include -#include #include #include #include #include #include -#include -#include // isdigit #include #include #include @@ -107,11 +98,6 @@ // Define module-wide (static) routines -static struct net_device *alloc_device(struct net_device *dev, u_long iobase); -static struct net_device *insert_device(struct net_device *dev); -static int fddi_dev_index(unsigned char *s); -static void init_dev(struct net_device *dev, u_long iobase); -static void link_modules(struct net_device *dev, struct net_device *tmp); static int skfp_driver_init(struct net_device *dev); static int skfp_open(struct net_device *dev); static int skfp_close(struct net_device *dev); @@ -188,15 +174,6 @@ // Define module-wide (static) variables static int num_boards; /* total number of adapters configured */ -static int num_fddi; -static int autoprobed; - -#ifdef MODULE -static struct net_device *unlink_modules(struct net_device *p); -static int loading_module = 1; -#else -static int loading_module; -#endif // MODULE #ifdef DRIVERDEBUG #define PRINTK(s, args...) printk(s, ## args) @@ -207,9 +184,9 @@ #define PRIV(dev) (&(((struct s_smc *)dev->priv)->os)) /* - * ============== - * = skfp_probe = - * ============== + * ================= + * = skfp_init_one = + * ================= * * Overview: * Probes for supported FDDI PCI controllers @@ -218,30 +195,11 @@ * Condition code * * Arguments: - * dev - pointer to device information + * pdev - pointer to PCI device information * * Functional Description: - * This routine is called by the OS for each FDDI device name (fddi0, - * fddi1,...,fddi6, fddi7) specified in drivers/net/Space.c. - * If loaded as a module, it will detect and initialize all - * adapters the first time it is called. - * - * Let's say that skfp_probe() is getting called to initialize fddi0. 
- * Furthermore, let's say there are three supported controllers in the - * system. Before skfp_probe() leaves, devices fddi0, fddi1, and fddi2 - * will be initialized and a global flag will be set to indicate that - * skfp_probe() has already been called. - * - * However...the OS doesn't know that we've already initialized - * devices fddi1 and fddi2 so skfp_probe() gets called again and again - * until it reaches the end of the device list for FDDI (presently, - * fddi7). It's important that the driver "pretend" to probe for - * devices fddi1 and fddi2 and return success. Devices fddi3 - * through fddi7 will return failure since they weren't initialized. - * - * This algorithm seems to work for the time being. As other FDDI - * drivers are written for Linux, a more generic approach (perhaps - * similar to the Ethernet card approach) may need to be implemented. + * This is now called by PCI driver registration process + * for each board found. * * Return Codes: * 0 - This device (fddi0, fddi1, etc) configured successfully @@ -254,370 +212,176 @@ * initialized and the board resources are read and stored in * the device structure. */ -static int skfp_probe(struct net_device *dev) +static __init int skfp_init_one(struct pci_dev *pdev, + const struct pci_device_id *ent) { - int i; /* used in for loops */ - struct pci_dev *pdev = NULL; /* PCI device structure */ -#ifndef MEM_MAPPED_IO - u16 port; /* temporary I/O (port) address */ - int port_len; /* length of port address range (in bytes) */ -#else - unsigned long port; -#endif - u16 command; /* PCI Configuration space Command register val */ + struct net_device *dev; struct s_smc *smc; /* board pointer */ - struct net_device *tmp = dev; - u8 first_dev_used = 0; - u16 SubSysId; + u32 port, len; + int err; PRINTK(KERN_INFO "entering skfp_probe\n"); - /* - * Verify whether we're going through skfp_probe() again - * - * If so, see if we're going through for a subsequent fddi device that - * we've already initialized. 
If we are, return success (0). If not, - * return failure (-ENODEV). - */ + if (num_boards == 0) + printk("%s\n", boot_msg); - if (autoprobed) { - PRINTK(KERN_INFO "Already entered skfp_probe\n"); - if (dev != NULL) { - if ((strncmp(dev->name, "fddi", 4) == 0) && - (dev->base_addr != 0)) { - return (0); - } - return (-ENODEV); - } + err = pci_enable_device(pdev); + if (err) + goto err_out1; + + +#ifdef MEM_MAPPED_IO + if (!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM)) { + printk(KERN_ERR "skfp: region is not an MMIO resource\n"); + err = -EIO; + goto err_out1; + } + port = pci_resource_start(pdev, 0); + len = pci_resource_len(pdev, 0); + + if (len < 0x4000) { + printk(KERN_ERR "skfp: Invalid PCI region size: %d\n", len); + err = -EIO; + goto err_out1; } - autoprobed = 1; /* set global flag */ - - printk("%s\n", boot_msg); - - /* Scan for Syskonnect FDDI PCI controllers */ - for (i = 0; i < SKFP_MAX_NUM_BOARDS; i++) { // scan for PCI cards - PRINTK(KERN_INFO "Check device %d\n", i); - if ((pdev=pci_find_device(PCI_VENDOR_ID_SK, PCI_DEVICE_ID_SK_FP, - pdev)) == 0) { - break; - } - if (pci_enable_device(pdev)) - continue; - -#ifndef MEM_MAPPED_IO - /* Verify that I/O enable bit is set (PCI slot is enabled) */ - pci_read_config_word(pdev, PCI_COMMAND, &command); - if ((command & PCI_COMMAND_IO) == 0) { - PRINTK("I/O enable bit not set!"); - PRINTK(" Verify that slot is enabled\n"); - continue; - } - - /* Turn off memory mapped space and enable mastering */ - - PRINTK(KERN_INFO "Command Reg: %04x\n", command); - command |= PCI_COMMAND_MASTER; - command &= ~PCI_COMMAND_MEMORY; - pci_write_config_word(pdev, PCI_COMMAND, command); - - /* Read I/O base address from PCI Configuration Space */ - - pci_read_config_word(pdev, PCI_BASE_ADDRESS_1, &port); - port &= PCI_BASE_ADDRESS_IO_MASK; // clear I/O bit (bit 0) - - /* Verify port address range is not already being used */ - - port_len = FP_IO_LEN; - if (check_region(port, port_len) != 0) { - printk("I/O range allocated to 
adapter"); - printk(" (0x%X-0x%X) is already being used!\n", port, - (port + port_len - 1)); - continue; - } #else - /* Verify that MEM enable bit is set (PCI slot is enabled) */ - pci_read_config_word(pdev, PCI_COMMAND, &command); - if ((command & PCI_COMMAND_MEMORY) == 0) { - PRINTK("MEMORY-I/O enable bit not set!"); - PRINTK(" Verify that slot is enabled\n"); - continue; - } - - /* Turn off IO mapped space and enable mastering */ - - PRINTK(KERN_INFO "Command Reg: %04x\n", command); - command |= PCI_COMMAND_MASTER; - command &= ~PCI_COMMAND_IO; - pci_write_config_word(pdev, PCI_COMMAND, command); - - port = pci_resource_start(pdev, 0); - - port = (unsigned long)ioremap(port, 0x4000); - if (!port){ - printk("skfp: Unable to map MEMORY register, " - "FDDI adapter will be disabled.\n"); - break; - } -#endif - - if ((!loading_module) || first_dev_used) { - /* Allocate a device structure for this adapter */ - tmp = alloc_device(dev, port); - } - first_dev_used = 1; // only significant first time - - pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &SubSysId); - - if (tmp != NULL) { - if (loading_module) - link_modules(dev, tmp); - dev = tmp; - init_dev(dev, port); - dev->irq = pdev->irq; - - /* Initialize board structure with bus-specific info */ - - smc = (struct s_smc *) dev->priv; - smc->os.dev = dev; - smc->os.bus_type = SK_BUS_TYPE_PCI; - smc->os.pdev = *pdev; - smc->os.QueueSkb = MAX_TX_QUEUE_LEN; - smc->os.MaxFrameSize = MAX_FRAME_SIZE; - smc->os.dev = dev; - smc->hw.slot = -1; - smc->os.ResetRequested = FALSE; - skb_queue_head_init(&smc->os.SendSkbQueue); - - if (skfp_driver_init(dev) == 0) { - int SubSysId = pdev->subsystem_device; - // only increment global board - // count on success - num_boards++; - if ((SubSysId & 0xff00) == 0x5500 || - (SubSysId & 0xff00) == 0x5800) { - printk("%s: SysKonnect FDDI PCI adapter" - " found (SK-%04X)\n", dev->name, - SubSysId); - } else { - printk("%s: FDDI PCI adapter found\n", - dev->name); - } - } else { - kfree(dev); - i = 
SKFP_MAX_NUM_BOARDS; // stop search - } - } // if (dev != NULL) - } // for SKFP_MAX_NUM_BOARDS - - /* - * If we're at this point we're going through skfp_probe() for the - * first time. Return success (0) if we've initialized 1 or more - * boards. Otherwise, return failure (-ENODEV). - */ - - if (num_boards > 0) - return (0); - else { - release_region (dev->base_addr, FP_IO_LEN); - printk("no SysKonnect FDDI adapter found\n"); - return (-ENODEV); + if (!(pci_resource_flags(pdev, 1) & IO_RESOURCE_IO)) { + printk(KERN_ERR "skfp: region is not PIO resource\n"); + err = -EIO; + goto err_out1; } -} // skfp_probe - -/************************ - * - * Search the entire 'fddi' device list for a fixed probe. If a match isn't - * found then check for an autoprobe or unused device location. If they - * are not available then insert a new device structure at the end of - * the current list. - * - ************************/ -static struct net_device *alloc_device(struct net_device *dev, u_long iobase) -{ - struct net_device *adev = NULL; - int fixed = 0, new_dev = 0; - - PRINTK(KERN_INFO "entering alloc_device\n"); - if (!dev) - return dev; - - num_fddi = fddi_dev_index(dev->name); - if (loading_module) { - num_fddi++; - dev = insert_device(dev); - return dev; + port = pci_resource_start(pdev, 1); + len = pci_resource_len(pdev, 1); + if (len < FP_IO_LEN) { + printk(KERN_ERR "skfp: Invalid PCI region size: %d\n", + io_len); + err = -EIO; + goto err_out1; } - while (1) { - if (((dev->base_addr == NO_ADDRESS) || - (dev->base_addr == 0)) && !adev) { - adev = dev; - } else if ((dev->priv == NULL) && (dev->base_addr == iobase)) { - fixed = 1; - } else { - if (dev->next == NULL) { - new_dev = 1; - } else if (strncmp(dev->next->name, "fddi", 4) != 0) { - new_dev = 1; - } - } - if ((dev->next == NULL) || new_dev || fixed) - break; - dev = dev->next; - num_fddi++; - } // while (1) - - if (adev && !fixed) { - dev = adev; - num_fddi = fddi_dev_index(dev->name); - new_dev = 0; - } - if 
(((dev->next == NULL) && ((dev->base_addr != NO_ADDRESS) && - (dev->base_addr != 0)) && !fixed) || - new_dev) { - num_fddi++; /* New device */ - dev = insert_device(dev); - } - if (dev) { - if (!dev->priv) { - /* Allocate space for private board structure */ - dev->priv = (void *) kmalloc(sizeof(struct s_smc), - GFP_KERNEL); - if (dev->priv == NULL) { - printk("%s: Could not allocate memory for", - dev->name); - printk(" private board structure!\n"); - return (NULL); - } - /* clear structure */ - memset(dev->priv, 0, sizeof(struct s_smc)); - } +#endif + err = pci_request_regions(pdev, "skfddi"); + if (err) + goto err_out1; + + pci_set_master(pdev); + + dev = alloc_fddidev(sizeof(struct s_smc)); + if (!dev) { + printk(KERN_ERR "skfp: Unable to allocate fddi device, " + "FDDI adapter will be disabled.\n"); + err = -ENOMEM; + goto err_out2; + } + +#ifdef MEM_MAPPED_IO + dev->base_addr = (unsigned long) ioremap(port, len); + if (!dev->base_addr) { + printk(KERN_ERR "skfp: Unable to map MEMORY register, " + "FDDI adapter will be disabled.\n"); + err = -EIO; + goto err_out3; } - return dev; -} // alloc_device - - - -/************************ - * - * Initialize device structure - * - ************************/ -static void init_dev(struct net_device *dev, u_long iobase) -{ - /* Initialize new device structure */ - - dev->mem_end = 0; /* shared memory isn't used */ - dev->mem_start = 0; /* shared memory isn't used */ - dev->base_addr = iobase; /* save port (I/O) base address */ - dev->if_port = 0; /* not applicable to FDDI adapters */ - dev->dma = 0; /* Bus Master DMA doesn't require channel */ - dev->irq = 0; - - netif_start_queue(dev); +#else + dev->base_addr = port; +#endif + dev->irq = pdev->irq; dev->get_stats = &skfp_ctl_get_stats; dev->open = &skfp_open; dev->stop = &skfp_close; dev->hard_start_xmit = &skfp_send_pkt; - dev->hard_header = NULL; /* set in fddi_setup() */ - dev->rebuild_header = NULL; /* set in fddi_setup() */ dev->set_multicast_list = 
&skfp_ctl_set_multicast_list; dev->set_mac_address = &skfp_ctl_set_mac_address; dev->do_ioctl = &skfp_ioctl; - dev->set_config = NULL; /* not supported for now &&& */ dev->header_cache_update = NULL; /* not supported */ - dev->change_mtu = NULL; /* set in fddi_setup() */ SET_MODULE_OWNER(dev); + SET_NETDEV_DEV(dev, &pdev->dev); - /* Initialize remaining device structure information */ - fddi_setup(dev); -} // init_device - - -/************************ - * - * If at end of fddi device list and can't use current entry, malloc - * one up. If memory could not be allocated, print an error message. - * -************************/ -static struct net_device *insert_device(struct net_device *dev) -{ - struct net_device *new; - int len; - - PRINTK(KERN_INFO "entering insert_device\n"); - len = sizeof(struct net_device) + sizeof(struct s_smc); - new = (struct net_device *) kmalloc(len, GFP_KERNEL); - if (new == NULL) { - printk("fddi%d: Device not initialised, insufficient memory\n", - num_fddi); - return NULL; - } else { - memset((char *) new, 0, len); - new->priv = (struct s_smc *) (new + 1); - new->init = skfp_probe; - if (!loading_module) { - new->next = dev->next; - dev->next = new; - } - /* create new device name */ - if (num_fddi > 999) { - sprintf(new->name, "fddi????"); - } else { - sprintf(new->name, "fddi%d", num_fddi); - } - } - return new; -} // insert_device - - -/************************ - * - * Get the number of a "fddiX" string - * - ************************/ -static int fddi_dev_index(unsigned char *s) -{ - int i = 0, j = 0; - - for (; *s; s++) { - if (isdigit(*s)) { - j = 1; - i = (i * 10) + (*s - '0'); - } else if (j) - break; - } - return i; -} // fddi_dev_index + /* Initialize board structure with bus-specific info */ + smc = (struct s_smc *) dev->priv; + smc->os.dev = dev; + smc->os.bus_type = SK_BUS_TYPE_PCI; + smc->os.pdev = *pdev; + smc->os.QueueSkb = MAX_TX_QUEUE_LEN; + smc->os.MaxFrameSize = MAX_FRAME_SIZE; + smc->os.dev = dev; + smc->hw.slot = -1; 
+ smc->os.ResetRequested = FALSE; + skb_queue_head_init(&smc->os.SendSkbQueue); + + err = skfp_driver_init(dev); + if (err) + goto err_out4; + + err = register_netdev(dev); + if (err) + goto err_out5; + + ++num_boards; + pci_set_drvdata(pdev, dev); + + if ((pdev->subsystem_device & 0xff00) == 0x5500 || + (pdev->subsystem_device & 0xff00) == 0x5800) + printk("%s: SysKonnect FDDI PCI adapter" + " found (SK-%04X)\n", dev->name, + pdev->subsystem_device); + else + printk("%s: FDDI PCI adapter found\n", dev->name); + return 0; +err_out5: + if (smc->os.SharedMemAddr) + pci_free_consistent(pdev, smc->os.SharedMemSize, + smc->os.SharedMemAddr, + smc->os.SharedMemDMA); + pci_free_consistent(pdev, MAX_FRAME_SIZE, + smc->os.LocalRxBuffer, smc->os.LocalRxBufferDMA); +err_out4: +#ifdef MEM_MAPPED_IO + iounmap((void *) dev->base_addr); +#endif +err_out3: + free_netdev(dev); +err_out2: + pci_release_regions(pdev); +err_out1: + return err; +} -/************************ - * - * Used if loaded as module only. Link the device structures - * together. Needed to release them all at unload. 
- * -************************/ -static void link_modules(struct net_device *dev, struct net_device *tmp) +/* + * Called for each adapter board from pci_unregister_driver + */ +static void __devexit skfp_remove_one(struct pci_dev *pdev) { - struct net_device *p = dev; + struct net_device *p = pci_get_drvdata(pdev); + struct s_smc *lp = p->priv; - if (p) { - while (((struct s_smc *) (p->priv))->os.next_module) { - p = ((struct s_smc *) (p->priv))->os.next_module; - } + unregister_netdev(p); - if (dev != tmp) { - ((struct s_smc *) (p->priv))->os.next_module = tmp; - } else { - ((struct s_smc *) (p->priv))->os.next_module = NULL; - } + if (lp->os.SharedMemAddr) { + pci_free_consistent(&lp->os.pdev, + lp->os.SharedMemSize, + lp->os.SharedMemAddr, + lp->os.SharedMemDMA); + lp->os.SharedMemAddr = NULL; + } + if (lp->os.LocalRxBuffer) { + pci_free_consistent(&lp->os.pdev, + MAX_FRAME_SIZE, + lp->os.LocalRxBuffer, + lp->os.LocalRxBufferDMA); + lp->os.LocalRxBuffer = NULL; } - return; -} // link_modules - +#ifdef MEM_MAPPED_IO + iounmap((void *) p->base_addr); +#endif + pci_release_regions(pdev); + free_netdev(p); + pci_set_drvdata(pdev, NULL); +} /* * ==================== @@ -644,11 +408,11 @@ * 0 - initialization succeeded * -1 - initialization failed */ -static int skfp_driver_init(struct net_device *dev) +static __init int skfp_driver_init(struct net_device *dev) { struct s_smc *smc = (struct s_smc *) dev->priv; skfddi_priv *bp = PRIV(dev); - u8 val; /* used for I/O read/writes */ + int err = -EIO; PRINTK(KERN_INFO "entering skfp_driver_init\n"); @@ -657,9 +421,7 @@ smc->hw.iop = dev->base_addr; // Get the interrupt level from the PCI Configuration Table - val = dev->irq; - - smc->hw.irq = val; + smc->hw.irq = dev->irq; spin_lock_init(&bp->DriverLock); @@ -729,7 +491,7 @@ bp->LocalRxBuffer, bp->LocalRxBufferDMA); bp->LocalRxBuffer = NULL; } - return (-1); + return err; } // skfp_driver_init @@ -757,14 +519,15 @@ static int skfp_open(struct net_device *dev) { struct s_smc 
*smc = (struct s_smc *) dev->priv; + int err; PRINTK(KERN_INFO "entering skfp_open\n"); /* Register IRQ - support shared interrupts by passing device ptr */ - if (request_irq(dev->irq, (void *) skfp_interrupt, SA_SHIRQ, - dev->name, dev)) { - printk("%s: Requested IRQ %d is busy\n", dev->name, dev->irq); - return (-EAGAIN); - } + err = request_irq(dev->irq, (void *) skfp_interrupt, SA_SHIRQ, + dev->name, dev); + if (err) + return err; + /* * Set current address to factory MAC address * @@ -788,6 +551,7 @@ /* Disable promiscuous filter settings */ mac_drv_rx_mode(smc, RX_DISABLE_PROMISC); + netif_start_queue(dev); return (0); } // skfp_open @@ -822,7 +586,6 @@ static int skfp_close(struct net_device *dev) { struct s_smc *smc = (struct s_smc *) dev->priv; - struct sk_buff *skb; skfddi_priv *bp = PRIV(dev); CLI_FBI(); @@ -835,13 +598,8 @@ /* Deregister (free) IRQ */ free_irq(dev->irq, dev); - for (;;) { - skb = skb_dequeue(&bp->SendSkbQueue); - if (skb == NULL) - break; - bp->QueueSkb++; - dev_kfree_skb(skb); - } + skb_queue_purge(&bp->SendSkbQueue); + bp->QueueSkb = MAX_TX_QUEUE_LEN; return (0); } // skfp_close @@ -1276,6 +1034,8 @@ break; default: printk("ioctl for %s: unknow cmd: %04x\n", dev->name, ioc.cmd); + status = -EOPNOTSUPP; + } // switch return status; @@ -2529,63 +2289,21 @@ } // drv_reset_indication - -static struct net_device *mdev; +static struct pci_driver skfddi_pci_driver = { + .name = "skfddi", + .id_table = skfddi_pci_tbl, + .probe = skfp_init_one, + .remove = __devexit_p(skfp_remove_one), +}; static int __init skfd_init(void) { - struct net_device *p; - - if ((mdev = insert_device(NULL)) == NULL) - return -ENOMEM; - - for (p = mdev; p != NULL; p = ((struct s_smc *)p->priv)->os.next_module) { - if (register_netdev(p) != 0) { - printk("skfddi init_module failed\n"); - return -EIO; - } - } - - return 0; + return pci_module_init(&skfddi_pci_driver); } -static struct net_device *unlink_modules(struct net_device *p) -{ - struct net_device *next = NULL; 
- - if (p->priv) { /* Private areas allocated? */ - struct s_smc *lp = (struct s_smc *) p->priv; - - next = lp->os.next_module; - - if (lp->os.SharedMemAddr) { - pci_free_consistent(&lp->os.pdev, - lp->os.SharedMemSize, - lp->os.SharedMemAddr, - lp->os.SharedMemDMA); - lp->os.SharedMemAddr = NULL; - } - if (lp->os.LocalRxBuffer) { - pci_free_consistent(&lp->os.pdev, - MAX_FRAME_SIZE, - lp->os.LocalRxBuffer, - lp->os.LocalRxBufferDMA); - lp->os.LocalRxBuffer = NULL; - } - release_region(p->base_addr, - (lp->os.bus_type == SK_BUS_TYPE_PCI ? FP_IO_LEN : 0)); - } - unregister_netdev(p); - printk("%s: unloaded\n", p->name); - free_netdev(p); /* Free the device structure */ - - return next; -} // unlink_modules - static void __exit skfd_exit(void) { - while (mdev) - mdev = unlink_modules(mdev); + pci_unregister_driver(&skfddi_pci_driver); } module_init(skfd_init); From aviro@redhat.com Fri Dec 5 15:27:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 15:27:59 -0800 (PST) Received: from devserv.devel.redhat.com (pix-525-pool.redhat.com [66.187.233.200]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5NRkTa029893 for ; Fri, 5 Dec 2003 15:27:46 -0800 Received: from devserv.devel.redhat.com (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.10/8.12.10) with ESMTP id hB5NRMDl026312; Fri, 5 Dec 2003 18:27:22 -0500 Received: (from aviro@localhost) by devserv.devel.redhat.com (8.12.10/8.12.10/Submit) id hB5NRLhD026310; Fri, 5 Dec 2003 18:27:22 -0500 Date: Fri, 5 Dec 2003 18:27:21 -0500 From: Alexander Viro To: Stephen Hemminger Cc: Alexander Viro , Jeff Garzik , netdev@oss.sgi.com Subject: Re: [PATCH] skfddi - convert to new pci model. 
Message-ID: <20031205232721.GA24715@devserv.devel.redhat.com> References: <20031204163928.0f34d5d1.shemminger@osdl.org> <20031205005915.GD31510@devserv.devel.redhat.com> <20031205151249.44a067df.shemminger@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031205151249.44a067df.shemminger@osdl.org> User-Agent: Mutt/1.4.1i X-archive-position: 1904 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: aviro@redhat.com Precedence: bulk X-list: netdev On Fri, Dec 05, 2003 at 03:12:49PM -0800, Stephen Hemminger wrote: > Second revision of the cleanup of skfddi driver. > * use new pci device bus initialization > * allocate network device with alloc_fddidev > and use dev->priv > * get rid of special module/non module distinctions. > > * fix error unwinds and return values on initialization > * call driver_init directly not via register_netdev > * reset internal queue count after purge > * get rid of h[iy]sterical comment that is no longer true Looks sane. 
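Editorial note on the "fix error unwinds" item quoted above: the patch gives skfp_init_one() a chain of err_outN labels, so a failure at any step releases only what was acquired before it, in reverse order. What follows is a minimal userspace sketch of that pattern, not the driver code. The names probe_state, probe_sketch, remove_sketch and the fail_at parameter are hypothetical, malloc() merely stands in for pci_request_regions(), alloc_fddidev() and ioremap(), and the -EBUSY code for the first step is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the resources a probe routine acquires. */
struct probe_state {
	void *regions;	/* models pci_request_regions() */
	void *netdev;	/* models alloc_fddidev() */
	void *mapping;	/* models ioremap() */
};

/*
 * Acquire three resources in order.  fail_at (1..3) forces that step to
 * fail; 0 lets every step succeed.  On failure, control jumps to the
 * label that releases only the resources acquired so far, newest first.
 */
int probe_sketch(struct probe_state *st, int fail_at)
{
	int err;

	st->regions = NULL;
	st->netdev = NULL;
	st->mapping = NULL;

	st->regions = (fail_at == 1) ? NULL : malloc(16);
	if (!st->regions) {
		err = -EBUSY;	/* assumed error code for this sketch */
		goto err_out1;
	}

	st->netdev = (fail_at == 2) ? NULL : malloc(16);
	if (!st->netdev) {
		err = -ENOMEM;
		goto err_out2;
	}

	st->mapping = (fail_at == 3) ? NULL : malloc(16);
	if (!st->mapping) {
		err = -EIO;
		goto err_out3;
	}
	return 0;

err_out3:
	free(st->netdev);
	st->netdev = NULL;
err_out2:
	free(st->regions);
	st->regions = NULL;
err_out1:
	return err;
}

/* Teardown mirrors a fully successful probe, again in reverse order. */
void remove_sketch(struct probe_state *st)
{
	assert(st != NULL);
	free(st->mapping);
	free(st->netdev);
	free(st->regions);
	st->mapping = NULL;
	st->netdev = NULL;
	st->regions = NULL;
}
```

Calling probe_sketch(&st, 2) leaves every pointer NULL and returns -ENOMEM, while probe_sketch(&st, 0) followed by remove_sketch(&st) releases all three resources. The labels fall through, so each failure point needs only a single goto.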
From jt@bougret.hpl.hp.com Fri Dec 5 15:45:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 15:45:25 -0800 (PST) Received: from palrel11.hp.com (palrel11.hp.com [156.153.255.246]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB5NjBTa030404 for ; Fri, 5 Dec 2003 15:45:11 -0800 Received: from tomil.hpl.hp.com (tomil.hpl.hp.com [15.0.152.100]) by palrel11.hp.com (Postfix) with ESMTP id 357311C01174; Fri, 5 Dec 2003 15:45:11 -0800 (PST) Received: from bougret.hpl.hp.com (bougret.hpl.hp.com [15.4.92.227]) by tomil.hpl.hp.com (8.9.3 (PHNE_28810+JAGae91741+JAGae92668)/8.9.3 HPLabs Timeshare Server) with ESMTP id PAA28571; Fri, 5 Dec 2003 15:45:10 -0800 (PST) Received: from jt by bougret.hpl.hp.com with local (Exim 3.35 #1 (Debian)) id 1ASPdG-0000fE-00; Fri, 05 Dec 2003 15:45:10 -0800 Date: Fri, 5 Dec 2003 15:45:10 -0800 To: Linux kernel mailing list , Thomas Bogendörfer Cc: Jeff Garzik , netdev@oss.sgi.com Subject: [BUG 2.6.0-test11] pcnet32 oops Message-ID: <20031205234510.GA2319@bougret.hpl.hp.com> Reply-To: jt@hpl.hp.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i Organisation: HP Labs Palo Alto Address: HP Labs, 1U-17, 1501 Page Mill road, Palo Alto, CA 94304, USA. E-mail: jt@hpl.hp.com From: Jean Tourrilhes X-archive-position: 1905 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jt@bougret.hpl.hp.com Precedence: bulk X-list: netdev Hi Frank, Would you mind having a look at the following bugs and forward as appropriate ? I'm trying to get an AMD Lance going on my box. It mostly works, but I get oops when I deconfigure the card or remove the module. 
A few samples : At "ifdown" : ---------------------------------- Badness in local_bh_enable at kernel/softirq.c:121 Call Trace: [] local_bh_enable+0x35/0x58 [] ip_ct_find_proto+0xd2/0xd8 [] destroy_conntrack+0x13/0x19c [] __kfree_skb+0xc8/0xfc [] pcnet32_purge_tx_ring+0x94/0xc4 [pcnet32] [] pcnet32_restart+0x14/0x6c [pcnet32] [] pcnet32_set_multicast_list+0x7d/0x90 [pcnet32] [] __dev_mc_upload+0x1e/0x24 [] dev_mc_upload+0x26/0x38 [] dev_change_flags+0x2d/0x104 [] devinet_ioctl+0x327/0x64c [] inet_ioctl+0xbb/0xf8 [] sock_ioctl+0x29d/0x2cc [] sys_ioctl+0x21d/0x264 [] error_code+0x2d/0x38 [] syscall_call+0x7/0xb ------------------------------------------------- At "rmmod" : --------------------------------------------- kernel BUG at include/linux/dcache.h:274! invalid operand: 0000 [#1] CPU: 1 EIP: 0060:[sysfs_remove_dir+21/256] Not tainted EFLAGS: 00010246 EIP is at sysfs_remove_dir+0x15/0x100 eax: 00000000 ebx: d085e760 ecx: c02ccb2c edx: ffff0001 esi: c1983220 edi: d085e744 ebp: 00000880 esp: c1a49f04 ds: 007b es: 007b ss: 0068 Process rmmod (pid: 750, threadinfo=c1a48000 task=ca972060) Stack: d085e760 c02ccb2c d085e744 00000880 c01858c3 d085e760 d085e760 c01858db d085e760 c02ccae0 c01aec48 d085e760 d085e744 00000000 00000000 c01aef5c d085e744 d085e720 00000000 c018b7e6 d085e744 c1875000 d085c88c d085e720 Call Trace: [kobject_del+75/88] kobject_del+0x4b/0x58 [kobject_unregister+11/24] kobject_unregister+0xb/0x18 [bus_remove_driver+92/108] bus_remove_driver+0x5c/0x6c [driver_unregister+12/50] driver_unregister+0xc/0x32 [pci_unregister_driver+14/28] pci_unregister_driver+0xe/0x1c [_end+273681636/1070212184] pcnet32_cleanup_module+0xac/0xc0 [pcnet32] [sys_delete_module+394/432] sys_delete_module+0x18a/0x1b0 [sys_munmap+69/100] sys_munmap+0x45/0x64 [syscall_call+7/11] syscall_call+0x7/0xb Code: 0f 0b 12 01 c2 dd 25 c0 f0 ff 06 85 f6 0f 84 d0 00 00 00 8b --------------------------------------------- Info : Dual PIII 550 SMP, 2.6.0-test11, many PCI cards, hotplug 02:05.0 
Ethernet controller: Advanced Micro Devices [AMD] 79c970 [PCnet LANCE] (rev 36) Subsystem: Hewlett-Packard Company Ethernet with LAN remote power Adapter Flags: bus master, medium devsel, latency 64, IRQ 18 I/O ports at 7ce0 [size=32] Memory at fdfff400 (32-bit, non-prefetchable) [size=32] Expansion ROM at [disabled] [size=1M] Capabilities: [40] Power Management version 1 pcnet32.c:v1.27b 01.10.2002 tsbogend@alpha.franken.de pcnet32: PCnet/FAST+ 79C972 at 0x7ce0, 00 10 83 34 ba e5 tx_start_pt(0x0c00):~220 bytes, BCR18(9861):BurstWrEn BurstRdEn NoUFlow SRAMSIZE=0x1700, SRAM_BND=0x0800, assigned IRQ 18. eth0: registered as PCnet/FAST+ 79C972 pcnet32: 1 cards_found. eth0: link up, 100Mbps, full-duplex, lpa 0x45E1 Good luck... Jean From jt@bougret.hpl.hp.com Fri Dec 5 16:30:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 16:30:33 -0800 (PST) Received: from palrel13.hp.com (palrel13.hp.com [156.153.255.238]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB60UJTa002003 for ; Fri, 5 Dec 2003 16:30:20 -0800 Received: from tomil.hpl.hp.com (tomil.hpl.hp.com [15.0.152.100]) by palrel13.hp.com (Postfix) with ESMTP id 4A32A1C01BED; Fri, 5 Dec 2003 16:30:19 -0800 (PST) Received: from bougret.hpl.hp.com (bougret.hpl.hp.com [15.4.92.227]) by tomil.hpl.hp.com (8.9.3 (PHNE_28810+JAGae91741+JAGae92668)/8.9.3 HPLabs Timeshare Server) with ESMTP id QAA00328; Fri, 5 Dec 2003 16:30:18 -0800 (PST) Received: from jt by bougret.hpl.hp.com with local (Exim 3.35 #1 (Debian)) id 1ASQKw-0000le-00; Fri, 05 Dec 2003 16:30:18 -0800 Date: Fri, 5 Dec 2003 16:30:18 -0800 To: Stephen Hemminger Cc: Russell King , Jeff Garzik , Alexander Viro , netdev@oss.sgi.com, irda-users@sourceforge.net Subject: Re: [PATCH] sa1100ir - fix for 2.6 Message-ID: <20031206003018.GA2941@bougret.hpl.hp.com> Reply-To: jt@hpl.hp.com References: <20031205150601.68c52de3.shemminger@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<20031205150601.68c52de3.shemminger@osdl.org> User-Agent: Mutt/1.3.28i Organisation: HP Labs Palo Alto Address: HP Labs, 1U-17, 1501 Page Mill road, Palo Alto, CA 94304, USA. E-mail: jt@hpl.hp.com From: Jean Tourrilhes X-archive-position: 1906 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jt@bougret.hpl.hp.com Precedence: bulk X-list: netdev On Fri, Dec 05, 2003 at 03:06:01PM -0800, Stephen Hemminger wrote: > Update the ARM SA1100 irda driver to current 2.6 network device model > * allocate network device with alloc_irdadev - get rid of dev_alloc > * change to current system device class model - looks like it was using early > old version of interface from 2.5 > * better error unwind > > This has been cross compiled successfully against net-drivers-2.5-exp I don't have the hardware, so I can't test that (and I'm not familiar with sysfs). If you want, I can add that as part of my next IrDA update (probably 2.6.1). Russel, if you do testing you may want to grab the "F-timer fix" from my web page. Regards, Jean From davem@pizda.ninka.net Fri Dec 5 17:02:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 17:02:17 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB6123Ta002914 for ; Fri, 5 Dec 2003 17:02:04 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA23330; Fri, 5 Dec 2003 16:59:00 -0800 Date: Fri, 5 Dec 2003 16:59:00 -0800 From: "David S. 
Miller" To: jt@hpl.hp.com Cc: jt@bougret.hpl.hp.com, linux-kernel@vger.kernel.org, tsbogend@alpha.franken.de, jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [BUG 2.6.0-test11] pcnet32 oops Message-Id: <20031205165900.2920ea6a.davem@redhat.com> In-Reply-To: <20031205234510.GA2319@bougret.hpl.hp.com> References: <20031205234510.GA2319@bougret.hpl.hp.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1907 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 5 Dec 2003 15:45:10 -0800 Jean Tourrilhes wrote: > Badness in local_bh_enable at kernel/softirq.c:121 > Call Trace: > [] local_bh_enable+0x35/0x58 > [] ip_ct_find_proto+0xd2/0xd8 > [] destroy_conntrack+0x13/0x19c > [] __kfree_skb+0xc8/0xfc > [] pcnet32_purge_tx_ring+0x94/0xc4 [pcnet32] > [] pcnet32_restart+0x14/0x6c [pcnet32] > [] pcnet32_set_multicast_list+0x7d/0x90 [pcnet32] This is the classic case of doing disabling/enabling of software interrupts with hardware interrupts disabled, which is a bug. In this case pcnet32_set_multicast_list() is disabling hardware interrupts, and the packet freeing of pcnet32_purge_tx_ring() is what leads to the software interrupt disable/enable. However, I'm inclined to believe that we should change dev_kfree_skb_any() to fix this class of problems, by making it check for hardware interrupts being disabled as well as being in an interrupt. But we might not want to do it like this, just reenabling the hardware interrupts at the top level won't cause the software interrupt to fire to actually execute the SKB freeing work.... Need to think about this some more. 
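Editorial note: the decision being weighed here can be modeled outside the kernel. The sketch below is an illustration only, under stated assumptions. The struct ctx_model, the enum free_path and both pick_free_path_* functions are hypothetical names; the two booleans merely stand in for what in_irq() and irqs_disabled() would report, and nothing here calls kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the execution context; not kernel API. */
struct ctx_model {
	bool in_hardirq;	/* models in_irq() */
	bool irqs_off;		/* models irqs_disabled() */
};

enum free_path {
	FREE_DIRECT,	/* models dev_kfree_skb(): free immediately */
	FREE_DEFERRED	/* models dev_kfree_skb_irq(): queue for later */
};

/* Existing check: defer only when running inside a hard IRQ handler. */
enum free_path pick_free_path_old(const struct ctx_model *c)
{
	return c->in_hardirq ? FREE_DEFERRED : FREE_DIRECT;
}

/* Check under consideration: also defer when hardware IRQs are off. */
enum free_path pick_free_path_new(const struct ctx_model *c)
{
	return (c->in_hardirq || c->irqs_off) ? FREE_DEFERRED : FREE_DIRECT;
}
```

The pcnet32 trace corresponds to a context with in_hardirq false and irqs_off true: the old check picks the direct path, which is where the softirq disable/enable happens with hardware interrupts off, while the widened check picks the deferred path. The model does not capture the open question raised in the message, namely whether the deferred work actually gets run once interrupts are reenabled.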
===== include/linux/netdevice.h 1.66 vs edited ===== --- 1.66/include/linux/netdevice.h Sat Nov 1 14:11:04 2003 +++ edited/include/linux/netdevice.h Fri Dec 5 16:53:01 2003 @@ -634,7 +634,7 @@ */ static inline void dev_kfree_skb_any(struct sk_buff *skb) { - if (in_irq()) + if (in_irq() || irqs_disabled()) dev_kfree_skb_irq(skb); else dev_kfree_skb(skb); From jgarzik@pobox.com Fri Dec 5 20:23:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 20:24:07 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB64NqTa008566 for ; Fri, 5 Dec 2003 20:23:53 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:39888 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1ASMB2-0003Tw-Me; Fri, 05 Dec 2003 20:03:48 +0000 Message-ID: <3FD0E498.8070703@pobox.com> Date: Fri, 05 Dec 2003 15:03:36 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jim Keniston CC: LKML , netdev , Andrew Morton , "Feldman, Scott" , Larry Kessler , "David S. Miller" , Linus Torvalds Subject: Re: [PATCH 2.6.0-test11] Net device error logging References: <3FD0E1FE.1D5B1883@us.ibm.com> In-Reply-To: <3FD0E1FE.1D5B1883@us.ibm.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1908 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Jim Keniston wrote: > The enclosed patch implements the netdev_* error-logging macros for > network drivers. These macros have been discussed at length on the > linux-kernel and linux-netdev lists. With the v2.6.0-test6 version of > these macros, we addressed all the issues that reviewers had raised. > This is just an update for v2.6.0-test11. 
> > As previously discussed, these macros are in demand now (e.g., for > the e1000 driver) and have essentially no impact on drivers that > don't use them. > > RECAP (from previous posts): > Calls to the netdev_* macros (netdev_printk and wrappers such as > netdev_err) are intended to replace calls to printk in network device > drivers. These macros have the following characteristics: > - Same format + args as the corresponding printk call. > - Approximately the same amount of text as the corresponding printk call. > - The first arg is a pointer to the net_device struct. > - The second arg, which is a NETIF_MSG_* message level, can be used to > implement verbosity control. > - Standard message prefixes: verbose (interface name, driver name, bus ID) > during probe, or just the interface name once the device is registered. > - The current implementation just calls printk. However, the netdev_* > interface (and availability of the net_device pointer) opens the door > for logging additional information (via printk, via evlog/netlink, etc.) > as desired, with no change to driver code. > > Examples: > netdev_err(netdev, RX_ERR, "No mem: dropped packet\n"); > logs a message such as the following if the NETIF_MSG_RX_ERR bit is set > in netdev->msg_enable. > eth2: No mem: dropped packet > > netdev_fatal(netdev, PROBE, "The EEPROM Checksum Is Not Valid\n"); > or > netdev_err(netdev, ALL, "The EEPROM Checksum Is Not Valid\n"); > unconditionally logs a message such as: > eth%d (e1000 0000:00:03.0): The EEPROM Checksum Is Not Valid > The message's prefix includes the driver name and bus ID because the > message is logged at probe time, before netdev is registered. > > SAMPLE DRIVERS > As examples of how the netdev_* macros could be used, patches for the > v2.6.0-test11 e100, e1000, and tg3 drivers are available on request. 
> > LINUX v2.4 SUPPORT > Since there is no v2.6-style struct device underlying the net_device, > a v2.4.23-compatible version of netdev_printk would always log the > interface name as the message prefix: > > #define netdev_printk(sevlevel, netdev, msglevel, format, arg...) \ > do { \ > if (NETIF_MSG_##msglevel == NETIF_MSG_ALL \ > || (netdev->msg_enable & NETIF_MSG_##msglevel)) { \ > printk(sevlevel "%s: " format , netdev->name , ## arg); \ > } \ > } while (0) I discussed this a bit with David. My personal feelings are that I prefer just leaving all the printk's as they are. But Linus and GregKH have been accepting patches into other parts of the tree like this one, and logging additional already-computer-parsed information is probably not a bad thing long-term, so perhaps I've been being a bit of a Luddite on this issue. It's definitely too late for 2.6.0-testX mainline merging, until Andrew (2.6) and Linus (2.7) re-open their respective trees, but a good first step would be for me to merge this into the net-drivers-2.5-exp queue. Jeff From ja@ssi.bg Fri Dec 5 23:51:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Dec 2003 23:52:08 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB67plTa013244 for ; Fri, 5 Dec 2003 23:51:52 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hB67q1iX002045; Sat, 6 Dec 2003 09:52:03 +0200 Date: Sat, 6 Dec 2003 09:52:01 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. 
Miller" cc: herbert@gondor.apana.org.au, Subject: Re: [ROUTE] PMTU only works on half the time In-Reply-To: <20031205124309.1490a2f4.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1909 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Fri, 5 Dec 2003, David S. Miller wrote: > Let's discuss what RTO_ONLINK means :) > > It means, "Do not route this packet". It goes to the local subnet > and that's the end of the story. Therefore there are no PMTU nor > redirects to handle when this TOS bit is set. You are right about the case with redirects but for PMTUD I'm still not sure. Here is one example: host A (we) thinks that host B is onlink (rt_dst == rt_gateway). But host B runs DNAT, tunneling or cluster software and forwards the packet through other hosts which generate frag_needed. Such example is IPVS TUN mode, I now see that it is not working because IPVS on host B does not handle correctly the ICMP error and it can not reach host A - a new difficult thing to fix. > RTO_ONLINK is set by these two cases: > > 1) ARP > 2) RAW/UDP sockets when MSG_DONTROUTE is specified by the user. UDP+RTO_ONLINK+PMTUD is still valid if we want to support the above example setup. But I'm not sure someone will use such combination of parameters. Should I remove the RTO_ONLINK case from PMTUD? I have some questions about the redirects: Assume we have direct (rt_dst==rt_gateway) route with SCOPE_HOST and shared_media is ON (if shared media is OFF ip_fib_check_default already avoids SCOPE_HOST routes). I do not see whether the standards cover such case, our target sends redirect message but we are sure we hit the target IP directly. Is it supposed we to change rt_gateway if rt_gateway==rt_dst ? I now included such check in ip_rt_redirect but may be have to remove it. 
IOW, the question is whether the ICMP redirects modify only routes via gateway when shared_media is ON? > So this patch still needs some work. OK, here is a new version, may be before the final one. Changes from previous version: - removed the RTO_ONLINK case from ip_rt_redirect - removed the useless 'saddr && daddr' check which was added in previous changes - added temporarily 'rth->rt_dst == rth->rt_gateway' in ip_rt_redirect. Is it needed? diff -Nru a/net/ipv4/route.c b/net/ipv4/route.c --- a/net/ipv4/route.c Sat Dec 6 09:05:58 2003 +++ b/net/ipv4/route.c Sat Dec 6 09:05:58 2003 @@ -967,11 +967,10 @@ void ip_rt_redirect(u32 old_gw, u32 daddr, u32 new_gw, u32 saddr, u8 tos, struct net_device *dev) { - int i, k; + int i; struct in_device *in_dev = in_dev_get(dev); struct rtable *rth, **rthp; u32 skeys[2] = { saddr, 0 }; - int ikeys[2] = { dev->ifindex, 0 }; tos &= IPTOS_RT_MASK; @@ -993,10 +992,7 @@ } for (i = 0; i < 2; i++) { - for (k = 0; k < 2; k++) { - unsigned hash = rt_hash_code(daddr, - skeys[i] ^ (ikeys[k] << 5), - tos); + unsigned hash = rt_hash_code(daddr, skeys[i], tos); rthp=&rt_hash_table[hash].chain; @@ -1008,7 +1004,9 @@ if (rth->fl.fl4_dst != daddr || rth->fl.fl4_src != skeys[i] || rth->fl.fl4_tos != tos || - rth->fl.oif != ikeys[k] || + (rth->fl.oif && + rth->fl.oif != dev->ifindex) || + rth->rt_dst == rth->rt_gateway || rth->fl.iif != 0) { rthp = &rth->u.rt_next; continue; @@ -1075,7 +1073,6 @@ rcu_read_unlock(); do_next: ; - } } in_dev_put(in_dev); return; @@ -1105,8 +1102,7 @@ } else if ((rt->rt_flags & RTCF_REDIRECTED) || rt->u.dst.expires) { unsigned hash = rt_hash_code(rt->fl.fl4_dst, - rt->fl.fl4_src ^ - (rt->fl.oif << 5), + rt->fl.fl4_src, rt->fl.fl4_tos); #if RT_CACHE_DEBUG >= 1 printk(KERN_DEBUG "ip_rt_advice: redirect to " @@ -1239,19 +1235,21 @@ unsigned short ip_rt_frag_needed(struct iphdr *iph, unsigned short new_mtu) { - int i; + int i, j; unsigned short old_mtu = ntohs(iph->tot_len); struct rtable *rth; u32 skeys[2] = { iph->saddr, 
0, }; u32 daddr = iph->daddr; u8 tos = iph->tos & IPTOS_RT_MASK; unsigned short est_mtu = 0; + u8 toskeys[2] = { tos, tos | RTO_ONLINK }; if (ipv4_config.no_pmtu_disc) return 0; + for (j = 0; j < 2; j++) for (i = 0; i < 2; i++) { - unsigned hash = rt_hash_code(daddr, skeys[i], tos); + unsigned hash = rt_hash_code(daddr, skeys[i], toskeys[j]); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; @@ -1261,7 +1259,7 @@ rth->fl.fl4_src == skeys[i] && rth->rt_dst == daddr && rth->rt_src == iph->saddr && - rth->fl.fl4_tos == tos && + rth->fl.fl4_tos == toskeys[j] && rth->fl.iif == 0 && !(dst_metric_locked(&rth->u.dst, RTAX_MTU))) { unsigned short mtu = new_mtu; @@ -1503,7 +1501,7 @@ RT_CACHE_STAT_INC(in_slow_mc); in_dev_put(in_dev); - hash = rt_hash_code(daddr, saddr ^ (dev->ifindex << 5), tos); + hash = rt_hash_code(daddr, saddr, tos); return rt_intern_hash(hash, rth, (struct rtable**) &skb->dst); e_nobufs: @@ -1554,7 +1552,7 @@ if (!in_dev) goto out; - hash = rt_hash_code(daddr, saddr ^ (fl.iif << 5), tos); + hash = rt_hash_code(daddr, saddr, tos); /* Check for the most weird martians, which can be not detected by fib_lookup. 
@@ -1847,7 +1845,7 @@ int iif = dev->ifindex; tos &= IPTOS_RT_MASK; - hash = rt_hash_code(daddr, saddr ^ (iif << 5), tos); + hash = rt_hash_code(daddr, saddr, tos); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) { @@ -2190,7 +2188,7 @@ rth->rt_flags = flags; - hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src ^ (oldflp->oif << 5), tos); + hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src, tos); err = rt_intern_hash(hash, rth, rp); done: if (free_res) @@ -2213,8 +2211,9 @@ { unsigned hash; struct rtable *rth; + u8 tos = flp->fl4_tos & (IPTOS_RT_MASK | RTO_ONLINK); - hash = rt_hash_code(flp->fl4_dst, flp->fl4_src ^ (flp->oif << 5), flp->fl4_tos); + hash = rt_hash_code(flp->fl4_dst, flp->fl4_src, tos); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) { @@ -2226,8 +2225,7 @@ #ifdef CONFIG_IP_ROUTE_FWMARK rth->fl.fl4_fwmark == flp->fl4_fwmark && #endif - !((rth->fl.fl4_tos ^ flp->fl4_tos) & - (IPTOS_RT_MASK | RTO_ONLINK))) { + rth->fl.fl4_tos == tos) { rth->u.dst.lastuse = jiffies; dst_hold(&rth->u.dst); rth->u.dst.__use++; Regards -- Julian Anastasov From rmk@arm.linux.org.uk Sat Dec 6 00:38:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 06 Dec 2003 00:38:44 -0800 (PST) Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [212.18.232.186]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB68cRTa017003 for ; Sat, 6 Dec 2003 00:38:30 -0800 Received: from flint.arm.linux.org.uk ([2002:d412:e8ba:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with asmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.22) id 1ASXx6-0000Kv-Ko; Sat, 06 Dec 2003 08:38:12 +0000 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.22) id 1ASXx5-0002VF-O7; Sat, 06 Dec 2003 08:38:11 +0000 Date: Sat, 6 Dec 2003 08:38:11 +0000 From: Russell King To: Stephen Hemminger Cc: Jeff Garzik , Jean Tourrilhes , Alexander Viro , netdev@oss.sgi.com, irda-users@sourceforge.net Subject: Re: [PATCH] sa1100ir - fix for 
2.6
Message-ID: <20031206083811.A9622@flint.arm.linux.org.uk>
References: <20031205150601.68c52de3.shemminger@osdl.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <20031205150601.68c52de3.shemminger@osdl.org>; from shemminger@osdl.org on Fri, Dec 05, 2003 at 03:06:01PM -0800
X-archive-position: 1910
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: rmk@arm.linux.org.uk
Precedence: bulk
X-list: netdev

On Fri, Dec 05, 2003 at 03:06:01PM -0800, Stephen Hemminger wrote:
> Update the ARM SA1100 irda driver to current 2.6 network device model
> * allocate network device with alloc_irdadev - get rid of dev_alloc
> * change to current system device class model - looks like it was using early
>   old version of interface from 2.5
> * better error unwind

It isn't a system device though. Please revert the sysdev changes.

--
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

From bunk@fs.tum.de Sat Dec 6 00:57:45 2003
Received: with ECARTIS (v1.0.0; list netdev); Sat, 06 Dec 2003 00:57:59 -0800 (PST)
Received: from hermes.fachschaften.tu-muenchen.de (hermes.fachschaften.tu-muenchen.de [129.187.202.12]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB68vfTa017611 for ; Sat, 6 Dec 2003 00:57:45 -0800
Received: (qmail 817 invoked from network); 6 Dec 2003 08:54:22 -0000
Received: from mimas.fachschaften.tu-muenchen.de (129.187.202.58) by hermes.fachschaften.tu-muenchen.de with QMQP; 6 Dec 2003 08:54:22 -0000
Date: Sat, 6 Dec 2003 09:57:36 +0100
From: Adrian Bunk
To: Jeff Garzik
Cc: netdev@oss.sgi.com, torvalds@osdl.org, linux-kernel@vger.kernel.org
Subject: [patch] remove com20020-isa.c unused variables
Message-ID: <20031206085736.GQ20739@fs.tum.de>
References: <20031205192828.GA15907@gtf.org>
Mime-Version: 1.0
Content-Type:
text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031205192828.GA15907@gtf.org> User-Agent: Mutt/1.4.1i X-archive-position: 1911 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@fs.tum.de Precedence: bulk X-list: netdev Hi Jeff, your experimental net driver queue patch introduced the following compile warnings when compiling com20020-isa statically into the kernel: <-- snip --> ... CC drivers/net/arcnet/com20020-isa.o drivers/net/arcnet/com20020-isa.c: In function `com20020isa_setup': drivers/net/arcnet/com20020-isa.c:189: warning: unused variable `lp' drivers/net/arcnet/com20020-isa.c:188: warning: unused variable `dev' ... <-- snip --> The fix is trivial (the net driver queue patch removes all uses of these variables): --- linux-2.6.0-test11-full-no-smp/drivers/net/arcnet/com20020-isa.c.old 2003-12-06 09:51:10.000000000 +0100 +++ linux-2.6.0-test11-full-no-smp/drivers/net/arcnet/com20020-isa.c 2003-12-06 09:53:41.000000000 +0100 @@ -185,8 +185,6 @@ #ifndef MODULE static int __init com20020isa_setup(char *s) { - struct net_device *dev; - struct arcnet_local *lp; int ints[8]; s = get_options(s, 8, ints); cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. 
Buck - Dragon Seed From greg@kroah.com Sat Dec 6 01:05:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 06 Dec 2003 01:05:59 -0800 (PST) Received: from perch.kroah.org (mail.kroah.org [65.200.24.183]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB695kTa018154 for ; Sat, 6 Dec 2003 01:05:47 -0800 Received: from [192.168.0.10] (12-203-11-189.client.attbi.com [12.203.11.189]) (authenticated) by perch.kroah.org (8.11.6/8.11.6) with ESMTP id hB694YM17209; Sat, 6 Dec 2003 01:04:35 -0800 Received: from greg by echidna.kroah.org with local (masqmail 0.2.19) id 1ASYMJ-67C-00; Sat, 06 Dec 2003 01:04:15 -0800 Date: Sat, 6 Dec 2003 01:04:14 -0800 From: Greg KH To: Jeff Garzik Cc: Jim Keniston , LKML , netdev , Andrew Morton , "Feldman, Scott" , Larry Kessler , "David S. Miller" , Linus Torvalds Subject: Re: [PATCH 2.6.0-test11] Net device error logging Message-ID: <20031206090414.GA23445@kroah.com> References: <3FD0E1FE.1D5B1883@us.ibm.com> <3FD0E498.8070703@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3FD0E498.8070703@pobox.com> User-Agent: Mutt/1.4.1i X-archive-position: 1912 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greg@kroah.com Precedence: bulk X-list: netdev On Fri, Dec 05, 2003 at 03:03:36PM -0500, Jeff Garzik wrote: > I discussed this a bit with David. My personal feelings are that I > prefer just leaving all the printk's as they are. But Linus and GregKH > have been accepting patches into other parts of the tree like this one, > and logging additional already-computer-parsed information is probably > not a bad thing long-term, so perhaps I've been being a bit of a Luddite > on this issue. To be fair, the patches I've taken (dev_err and friends) are _much_ simpler than these, so accepting them was not that big of a deal. 
It enabled the subsystems that have started to use them (USB and I2C) to
log better messages (we now know exactly which device caused the errors,
instead of just which driver).

So please judge this patch on its own, and feel no pressure due to the
dev_*() calls :)

thanks,

greg k-h

From devik@cdi.cz Sat Dec 6 06:19:47 2003
Received: with ECARTIS (v1.0.0; list netdev); Sat, 06 Dec 2003 06:20:00 -0800 (PST)
Received: from www1.cdi.cz (www1.cdi.cz [194.213.194.49]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB6EJjTa027129 for ; Sat, 6 Dec 2003 06:19:46 -0800
Received: from gprs207-188.eurotel.cz ([160.218.207.188] helo=devix) by www1.cdi.cz with asmtp (Exim 3.34 #3) id 1ASdHN-0004vI-00; Sat, 06 Dec 2003 15:19:30 +0100
Received: from devik (helo=localhost) by devix with local-esmtp (Exim 3.16 #8) id 1ASdGX-0007IG-00; Sat, 06 Dec 2003 15:18:37 +0100
Date: Sat, 6 Dec 2003 15:18:37 +0100 (CET)
From: devik
X-X-Sender:
To: "David S. Miller"
cc: , ,
Subject: HTB 2.6.0-test10 oops related patch
In-Reply-To: <4371.1067710407@www6.gmx.net>
Message-ID:
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="8323328-564380524-1070720317=:839"
X-CDI: passed
X-archive-position: 1913
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: devik@cdi.cz
Precedence: bulk
X-list: netdev

  This message is in MIME format. The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.
  Send mail to mime@docserver.cac.washington.edu for more info.

--8323328-564380524-1070720317=:839
Content-Type: TEXT/PLAIN; charset=US-ASCII

Hello Dave,

I discovered the reason for Daniel's first oops: a change
that was not forward-ported to 2.6.
The fix is attached.

However, the cause of Daniel's other oops (pppd detach related)
is still unknown. I tried to duplicate all conditions
except pppoe (I have no pppoe device here) and I was able to
kill pppd with htb under full load with no problem.
I'll look into it again today. ------------------------------- Martin Devera aka devik Linux kernel QoS/HTB maintainer http://luxik.cdi.cz/~devik/ --8323328-564380524-1070720317=:839 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="htb2.6fix.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="htb2.6fix.diff" LS0tIHNjaF9odGIub2xkCVNhdCBEZWMgIDYgMTU6MDU6MjEgMjAwMw0KKysr IHNjaF9odGIuYwlTYXQgRGVjICA2IDE1OjA1OjI0IDIwMDMNCkBAIC03MDMs NyArNzAzLDcgQEANCiANCiAgICAgc2NoLT5xLnFsZW4rKzsNCiAgICAgc2No LT5zdGF0cy5wYWNrZXRzKys7IHNjaC0+c3RhdHMuYnl0ZXMgKz0gc2tiLT5s ZW47DQotICAgIEhUQl9EQkcoMSwxLCJodGJfZW5xX29rIGNsPSVYIHNrYj0l cFxuIixjbD9jbC0+Y2xhc3NpZDowLHNrYik7DQorICAgIEhUQl9EQkcoMSwx LCJodGJfZW5xX29rIGNsPSVYIHNrYj0lcFxuIiwoY2wgJiYgY2wgIT0gSFRC X0RJUkVDVCk/Y2wtPmNsYXNzaWQ6MCxza2IpOw0KICAgICByZXR1cm4gTkVU X1hNSVRfU1VDQ0VTUzsNCiB9DQogDQpAQCAtNzMxLDcgKzczMSw3IEBADQog CSAgICBodGJfYWN0aXZhdGUgKHEsY2wpOw0KIA0KICAgICBzY2gtPnEucWxl bisrOw0KLSAgICBIVEJfREJHKDEsMSwiaHRiX3JlcV9vayBjbD0lWCBza2I9 JXBcbiIsY2w/Y2wtPmNsYXNzaWQ6MCxza2IpOw0KKyAgICBIVEJfREJHKDEs MSwiaHRiX3JlcV9vayBjbD0lWCBza2I9JXBcbiIsKGNsICYmIGNsICE9IEhU Ql9ESVJFQ1QpP2NsLT5jbGFzc2lkOjAsc2tiKTsNCiAgICAgcmV0dXJuIE5F VF9YTUlUX1NVQ0NFU1M7DQogfQ0KIA0KQEAgLTE0NzcsNyArMTQ3Nyw3IEBA DQogCQljbC0+bWFnaWMgPSBIVEJfQ01BR0lDOw0KICNlbmRpZg0KIA0KLQkJ LyogY3JlYXRlIGxlYWYgcWRpc2MgZWFybHkgYmVjYXVzZSBpdCB1c2VzIGtt YWxsb2MoR1BGX0tFUk5FTCkNCisJCS8qIGNyZWF0ZSBsZWFmIHFkaXNjIGVh cmx5IGJlY2F1c2UgaXQgdXNlcyBrbWFsbG9jKEdGUF9LRVJORUwpDQogCQkg ICBzbyB0aGF0IGNhbid0IGJlIHVzZWQgaW5zaWRlIG9mIHNjaF90cmVlX2xv Y2sNCiAJCSAgIC0tIHRoYW5rcyB0byBLYXJsaXMgUGVpc2VuaWVrcyAqLw0K IAkJbmV3X3EgPSBxZGlzY19jcmVhdGVfZGZsdChzY2gtPmRldiwgJnBmaWZv X3FkaXNjX29wcyk7DQo= --8323328-564380524-1070720317=:839-- From xose@wanadoo.es Sat Dec 6 08:50:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 06 Dec 2003 08:50:34 -0800 (PST) Received: from smtp12.eresmas.com (smtp12.eresmas.com [62.81.235.112]) by oss.sgi.com 
(8.12.10/8.12.9) with SMTP id hB6GoKTa032011 for ; Sat, 6 Dec 2003 08:50:21 -0800 Received: from [192.168.108.53] (helo=mx03.eresmas.com) by smtp12.eresmas.com with esmtp (Exim 4.10) id 1ASfdD-00080y-00; Sat, 06 Dec 2003 17:50:11 +0100 Received: from [80.103.141.169] (helo=wanadoo.es) by mx03.eresmas.com with esmtp (Exim 4.20) id 1ASfdF-00008J-MI; Sat, 06 Dec 2003 17:50:14 +0100 Message-ID: <3FD20895.2020605@wanadoo.es> Date: Sat, 06 Dec 2003 17:49:25 +0100 From: Xose Vazquez Perez User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003 X-Accept-Language: gl, es, en MIME-Version: 1.0 To: Jeff Garzik , netdev@oss.sgi.com, "David S. Miller" Subject: [PATCH] tg3, more IDs X-Enigmail-Version: 0.63.3.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: multipart/mixed; boundary="------------050809000102050606080308" X-archive-position: 1914 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: xose@wanadoo.es Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------050809000102050606080308 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit hi, two new IDs for tg3: 0x14e4,0x1649 is a BCM5704S based board. 0x14e4,0x166e is a BCM5705F based board. 
-thanks- --------------050809000102050606080308 Content-Type: text/plain; name="tg3_ids.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="tg3_ids.diff" --- linux/drivers/net/tg3.c 2003-11-29 00:00:15.000000000 +0100 +++ n/drivers/net/tg3.c 2003-12-06 17:26:24.000000000 +0100 @@ -175,6 +175,10 @@ PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5901_2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, + { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5704S_2, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, + { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705F, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, { PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_9DXX, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, { PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_9MXX, --- linux/include/linux/pci_ids.h 2003-11-29 00:00:21.000000000 +0100 +++ n/include/linux/pci_ids.h 2003-12-06 17:30:10.000000000 +0100 @@ -1656,11 +1656,13 @@ #define PCI_DEVICE_ID_TIGON3_5702 0x1646 #define PCI_DEVICE_ID_TIGON3_5703 0x1647 #define PCI_DEVICE_ID_TIGON3_5704 0x1648 +#define PCI_DEVICE_ID_TIGON3_5704S_2 0x1649 #define PCI_DEVICE_ID_TIGON3_5702FE 0x164d #define PCI_DEVICE_ID_TIGON3_5705 0x1653 #define PCI_DEVICE_ID_TIGON3_5705_2 0x1654 #define PCI_DEVICE_ID_TIGON3_5705M 0x165d #define PCI_DEVICE_ID_TIGON3_5705M_2 0x165e +#define PCI_DEVICE_ID_TIGON3_5705F 0x166e #define PCI_DEVICE_ID_TIGON3_5782 0x1696 #define PCI_DEVICE_ID_TIGON3_5788 0x169c #define PCI_DEVICE_ID_TIGON3_5702X 0x16a6 --------------050809000102050606080308-- From shmulik.hen@intel.com Sat Dec 6 09:05:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 06 Dec 2003 09:05:33 -0800 (PST) Received: from hermes-pilot.fm.intel.com (fmr99.intel.com [192.55.52.32]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB6H5ETa032596 for ; Sat, 6 Dec 2003 09:05:14 -0800 Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37]) by hermes-pilot.fm.intel.com (8.12.9-20030918-01/8.12.9/d: 
major-outer.mc,v 1.11 2003/12/03 18:29:28 root Exp $) with ESMTP id hB4FAISN030021 for ; Thu, 4 Dec 2003 15:10:18 GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.6 2003/12/04 00:48:44 root Exp $) with SMTP id hB4F57WU024709 for ; Thu, 4 Dec 2003 15:05:10 GMT Received: from jrslxjul5.npdj.intel.com ([10.12.220.55]) by fmsmsxvs042.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003120407135228761 ; Thu, 04 Dec 2003 07:13:54 -0800 From: Shmulik Hen Organization: Intel Date: Thu, 4 Dec 2003 17:13:52 +0200 User-Agent: KMail/1.5 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com, Jay Vosburgh MIME-Version: 1.0 Content-Disposition: inline To: Jeff Garzik Subject: [PATCH SET][bonding 2.4] cleanup Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-Id: <200312041713.52021.shmulik.hen@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 1915 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shmulik.hen@intel.com Precedence: bulk X-list: netdev Hi, Here is another go with the bonding cleanup. This set is similar to the previous set I sent on 10/12, but built against 2.4.23 and includes the silent failure fix reported by Per Hedeland. The set is broken down to many patches for better tracking. It has already undergone a thorough set of testing by our QA group, and is the basis for all future developments we plan for bonding. Jeff, Please apply this set and forward it to Marcelo for inclusion in 2.4.24-pre1 since I was told by Jay it would be OK to apply this set once 2.4.23 is released. 
The set can be downloaded from:
http://osdn.dl.sourceforge.net/sourceforge/bonding/bonding_cleanup_3-2.4.23.tar.bz2

It will update the following files:
    Documentation/networking/bonding.txt
    Documentation/networking/ifenslave.c
    drivers/net/bonding/bond_3ad.c
    drivers/net/bonding/bond_alb.c
    drivers/net/bonding/bond_alb.h
    drivers/net/bonding/bonding.h
    drivers/net/bonding/bond_main.c
    include/linux/if_bonding.h

Description:

patch 1 - ifenslave lite - No more IP settings to slaves, unified
          printing format, code re-org and split into more functions.
patch 2 - convert all debug prints to use the dprintk macro and
          consolidate format of all prints (e.g. "bonding: Error: ...").
patch 3 - death of typedef. eliminate bonding_t/slave_t types and
          consolidate casting.
patch 4 - remove dead code and redundant checks.
patch 5 - consolidate timers initialization, error checking and
          re-queuing.
patch 6 - convert overly long if-else chains to a switch-case. Fix all
          locations that handle bond->primary.
patch 7 - eliminate the multicast_mode module param. settings are now
          done only according to mode.
patch 8 - slave list iteration - bond is no longer part of the list,
          added cyclic list iteration macros.
patch 9 - consolidate function declarations:
          o all functions begin with bond_
          o return value, function name and all params are on the
            same line.
          o change some function names to be more descriptive.
patch 10 - consolidate names of function params and variables (e.g.
           bond_dev instead of dev/master/master_dev).
patch 11 - change names/types for some of the members in struct
           bonding. change position of members.
patch 12 - consolidate return values of functions.
patch 13 - put curly braces around all if, else, for, while, switch
           statements. change conditions to short format.
           e.g. (ptr == NULL) ==> (!ptr)
patch 14 - consolidate error handling in all xmit functions.
patch 15 - chomp all trailing white space.
patch 16 - remove duplicate empty lines. add empty lines where needed
           to improve readability.
patch 17 - fix indentations.
patch 18 - code re-organization in bond_main.c according to context
           (e.g. module initialization, bond initialization, device
           entry points, monitoring, etc).

--
| Shmulik Hen   Advanced Network Services  |
| Israel Design Center, Jerusalem          |
| LAN Access Division, Platform Networking |
| Intel Communications Group, Intel corp.  |

From xose@wanadoo.es Sat Dec 6 10:04:34 2003
Received: with ECARTIS (v1.0.0; list netdev); Sat, 06 Dec 2003 10:04:47 -0800 (PST)
Received: from smtp14.eresmas.com (smtp14.eresmas.com [62.81.235.114]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB6I4WTa001243 for ; Sat, 6 Dec 2003 10:04:33 -0800
Received: from [192.168.108.56] (helo=mx06.eresmas.com) by smtp14.eresmas.com with esmtp (Exim 4.10) id 1ASgn4-0003yp-00; Sat, 06 Dec 2003 19:04:26 +0100
Received: from [80.103.141.169] (helo=wanadoo.es) by mx06.eresmas.com with esmtp (Exim 4.20) id 1ASgn4-0008OL-Vn; Sat, 06 Dec 2003 19:04:27 +0100
Message-ID: <3FD219FA.5010908@wanadoo.es>
Date: Sat, 06 Dec 2003 19:03:38 +0100
From: Xose Vazquez Perez
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003
X-Accept-Language: gl, es, en
MIME-Version: 1.0
To: Jeff Garzik , netdev@oss.sgi.com, "David S. Miller"
Subject: Re: [PATCH] tg3, more IDs
References: <3FD20895.2020605@wanadoo.es>
X-Enigmail-Version: 0.63.3.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: multipart/mixed; boundary="------------000005030202000100050105"
X-archive-position: 1916
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: xose@wanadoo.es
Precedence: bulk
X-list: netdev

This is a multi-part message in MIME format.
--------------000005030202000100050105
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Xose Vazquez Perez wrote:

> two new IDs for tg3:
> 0x14e4,0x1649 is a BCM5704S based board.
> 0x14e4,0x166e is a BCM5705F based board.
forget my last patch, here is a more complete one:

- adds two new IDs:
	0x14e4,0x1649 - BCM5704S based board.
	0x14e4,0x166e - BCM5705F based board.

- in subsys_id_to_phy_id:
	replaced BCM_ID magic numbers with macros
	added one IBM board
	deleted incompatible COMPAQ boards

- 5705F limited to 10/100 too.

As a side note, Broadcom has enabled TSO _by default_ in its
latest driver 7.1.9 (11/03/2003). For the curious:
http://www.broadcom.com/drivers/downloaddrivers.php

-thanks-

--------------000005030202000100050105
Content-Type: text/plain; name="tg3_patch.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="tg3_patch.diff"

--- linux/drivers/net/tg3.c 2003-12-06 17:44:05.000000000 +0100
+++ n/drivers/net/tg3.c 2003-12-06 18:34:59.000000000 +0100
@@ -175,6 +175,10 @@
 	  PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL },
 	{ PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5901_2,
 	  PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL },
+	{ PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5704S_2,
+	  PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL },
+	{ PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705F,
+	  PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL },
 	{ PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_9DXX,
 	  PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL },
 	{ PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_9MXX,
@@ -6401,25 +6405,22 @@

 static struct subsys_tbl_ent subsys_id_to_phy_id[] = {
 	/* Broadcom boards.
*/ - { 0x14e4, 0x1644, PHY_ID_BCM5401 }, /* BCM95700A6 */ - { 0x14e4, 0x0001, PHY_ID_BCM5701 }, /* BCM95701A5 */ - { 0x14e4, 0x0002, PHY_ID_BCM8002 }, /* BCM95700T6 */ - { 0x14e4, 0x0003, PHY_ID_SERDES }, /* BCM95700A9 */ - { 0x14e4, 0x0005, PHY_ID_BCM5701 }, /* BCM95701T1 */ - { 0x14e4, 0x0006, PHY_ID_BCM5701 }, /* BCM95701T8 */ - { 0x14e4, 0x0007, PHY_ID_SERDES }, /* BCM95701A7 */ - { 0x14e4, 0x0008, PHY_ID_BCM5701 }, /* BCM95701A10 */ - { 0x14e4, 0x8008, PHY_ID_BCM5701 }, /* BCM95701A12 */ - { 0x14e4, 0x0009, PHY_ID_BCM5701 }, /* BCM95703Ax1 */ - { 0x14e4, 0x8009, PHY_ID_BCM5701 }, /* BCM95703Ax2 */ + { PCI_VENDOR_ID_BROADCOM, 0x1644, PHY_ID_BCM5401 }, /* BCM95700A6 */ + { PCI_VENDOR_ID_BROADCOM, 0x0001, PHY_ID_BCM5701 }, /* BCM95701A5 */ + { PCI_VENDOR_ID_BROADCOM, 0x0002, PHY_ID_BCM8002 }, /* BCM95700T6 */ + { PCI_VENDOR_ID_BROADCOM, 0x0003, PHY_ID_SERDES }, /* BCM95700A9 */ + { PCI_VENDOR_ID_BROADCOM, 0x0005, PHY_ID_BCM5701 }, /* BCM95701T1 */ + { PCI_VENDOR_ID_BROADCOM, 0x0006, PHY_ID_BCM5701 }, /* BCM95701T8 */ + { PCI_VENDOR_ID_BROADCOM, 0x0007, PHY_ID_SERDES }, /* BCM95701A7 */ + { PCI_VENDOR_ID_BROADCOM, 0x0008, PHY_ID_BCM5701 }, /* BCM95701A10 */ + { PCI_VENDOR_ID_BROADCOM, 0x8008, PHY_ID_BCM5701 }, /* BCM95701A12 */ + { PCI_VENDOR_ID_BROADCOM, 0x0009, PHY_ID_BCM5701 }, /* BCM95703Ax1 */ + { PCI_VENDOR_ID_BROADCOM, 0x8009, PHY_ID_BCM5701 }, /* BCM95703Ax2 */ /* 3com boards. 
*/ { PCI_VENDOR_ID_3COM, 0x1000, PHY_ID_BCM5401 }, /* 3C996T */ { PCI_VENDOR_ID_3COM, 0x1006, PHY_ID_BCM5701 }, /* 3C996BT */ - /* { PCI_VENDOR_ID_3COM, 0x1002, PHY_ID_XXX }, 3C996CT */ - /* { PCI_VENDOR_ID_3COM, 0x1003, PHY_ID_XXX }, 3C997T */ { PCI_VENDOR_ID_3COM, 0x1004, PHY_ID_SERDES }, /* 3C996SX */ - /* { PCI_VENDOR_ID_3COM, 0x1005, PHY_ID_XXX }, 3C997SZ */ { PCI_VENDOR_ID_3COM, 0x1007, PHY_ID_BCM5701 }, /* 3C1000T */ { PCI_VENDOR_ID_3COM, 0x1008, PHY_ID_BCM5701 }, /* 3C940BR01 */ @@ -6434,7 +6435,10 @@ { PCI_VENDOR_ID_COMPAQ, 0x009a, PHY_ID_BCM5701 }, /* BANSHEE_2 */ { PCI_VENDOR_ID_COMPAQ, 0x007d, PHY_ID_SERDES }, /* CHANGELING */ { PCI_VENDOR_ID_COMPAQ, 0x0085, PHY_ID_BCM5701 }, /* NC7780 */ - { PCI_VENDOR_ID_COMPAQ, 0x0099, PHY_ID_BCM5701 } /* NC7780_2 */ + { PCI_VENDOR_ID_COMPAQ, 0x0099, PHY_ID_BCM5701 }, /* NC7780_2 */ + + /* IBM boards. */ + { PCI_VENDOR_ID_IBM, 0x0281, PHY_ID_SERDES } /* IBM??? */ }; static int __devinit tg3_phy_probe(struct tg3 *tp) @@ -6974,7 +6978,8 @@ (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705 && tp->pdev->vendor == PCI_VENDOR_ID_BROADCOM && (tp->pdev->device == PCI_DEVICE_ID_TIGON3_5901 || - tp->pdev->device == PCI_DEVICE_ID_TIGON3_5901_2))) + tp->pdev->device == PCI_DEVICE_ID_TIGON3_5901_2 || + tp->pdev->device == PCI_DEVICE_ID_TIGON3_5705F))) tp->tg3_flags |= TG3_FLAG_10_100_ONLY; err = tg3_phy_probe(tp); --- linux/include/linux/pci_ids.h 2003-12-06 17:44:05.000000000 +0100 +++ n/include/linux/pci_ids.h 2003-12-06 17:30:10.000000000 +0100 @@ -1656,11 +1656,13 @@ #define PCI_DEVICE_ID_TIGON3_5702 0x1646 #define PCI_DEVICE_ID_TIGON3_5703 0x1647 #define PCI_DEVICE_ID_TIGON3_5704 0x1648 +#define PCI_DEVICE_ID_TIGON3_5704S_2 0x1649 #define PCI_DEVICE_ID_TIGON3_5702FE 0x164d #define PCI_DEVICE_ID_TIGON3_5705 0x1653 #define PCI_DEVICE_ID_TIGON3_5705_2 0x1654 #define PCI_DEVICE_ID_TIGON3_5705M 0x165d #define PCI_DEVICE_ID_TIGON3_5705M_2 0x165e +#define PCI_DEVICE_ID_TIGON3_5705F 0x166e #define PCI_DEVICE_ID_TIGON3_5782 0x1696 
 #define PCI_DEVICE_ID_TIGON3_5788 0x169c
 #define PCI_DEVICE_ID_TIGON3_5702X 0x16a6

--------------000005030202000100050105--

From davem@pizda.ninka.net Sat Dec 6 23:11:37 2003
Received: with ECARTIS (v1.0.0; list netdev); Sat, 06 Dec 2003 23:11:49 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB77BaTa019068 for ; Sat, 6 Dec 2003 23:11:36 -0800
Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id XAA16107; Sat, 6 Dec 2003 23:10:48 -0800
Date: Sat, 6 Dec 2003 23:10:47 -0800
From: "David S. Miller"
To: devik
Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org, daniel.blueman@gmx.net
Subject: Re: HTB 2.6.0-test10 oops related patch
Message-Id: <20031206231047.01257548.davem@redhat.com>
In-Reply-To:
References: <4371.1067710407@www6.gmx.net>
X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 1917
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Sat, 6 Dec 2003 15:18:37 +0100 (CET)
devik wrote:

> I discovered the reason for Daniel's first oops: a change
> that was not forward-ported to 2.6.
> The fix is attached.

Thanks a lot for tracking this problem down, Devik.

Patch applied, thanks again.
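
For readers following the HTB thread: the base64 attachment htb2.6fix.diff
decodes to a change in the HTB_DBG() calls, which now test
"cl && cl != HTB_DIRECT" before reading cl->classid instead of testing only
"cl". The sketch below illustrates that guard in isolation. It is not the
sch_htb.c code: struct htb_class_sketch, HTB_DIRECT_SKETCH and
classid_for_debug() are hypothetical names, and the assumption that
HTB_DIRECT is a non-NULL sentinel pointer rather than a real class is
inferred from the guard itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the scheduler's class structure. */
struct htb_class_sketch {
	uint32_t classid;
};

/*
 * Assumption: a sentinel meaning "no class, queue directly".
 * It is non-NULL, so a plain NULL test lets it through, but it
 * does not point at a real object and must not be dereferenced.
 */
#define HTB_DIRECT_SKETCH ((struct htb_class_sketch *)-1)

/* Return the class id to print, or 0 for NULL and for the sentinel. */
static uint32_t classid_for_debug(const struct htb_class_sketch *cl)
{
	/* A bare "cl ? cl->classid : 0" would dereference the sentinel. */
	return (cl && cl != HTB_DIRECT_SKETCH) ? cl->classid : 0;
}
```

With this shape, a debug print can pass classid_for_debug(cl) as its
argument regardless of which of the three cases (real class, NULL,
sentinel) the enqueue path produced.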
From herbert@gondor.apana.org.au Sun Dec 7 01:03:04 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 01:03:17 -0800 (PST)
Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB7930Ta026054 for ; Sun, 7 Dec 2003 01:03:03 -0800
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1ASunp-0007MK-00; Sun, 07 Dec 2003 20:02:09 +1100
Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1ASunl-0000cg-00; Sun, 07 Dec 2003 20:02:05 +1100
Date: Sun, 7 Dec 2003 20:02:05 +1100
To: kuznet@ms2.inr.ac.ru
Cc: "David S. Miller" , jmorris@redhat.com, netdev@oss.sgi.com
Subject: Re: [IPSEC] Move hardware headers for decaped packets
Message-ID: <20031207090205.GA2358@gondor.apana.org.au>
References: <20030925121131.GA17968@gondor.apana.org.au> <200309251228.QAA11650@yakov.inr.ac.ru> <20030925124102.GA18188@gondor.apana.org.au>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="5mCyUwZo2JvN/JJP"
Content-Disposition: inline
In-Reply-To: <20030925124102.GA18188@gondor.apana.org.au>
User-Agent: Mutt/1.5.4i
From: Herbert Xu
X-archive-position: 1918
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: herbert@gondor.apana.org.au
Precedence: bulk
X-list: netdev

--5mCyUwZo2JvN/JJP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Sep 25, 2003 at 10:41:02PM +1000, herbert wrote:
>
> A quick grep shows just below 100 occurrences of mac.raw under
> drivers/net. I'll have a go at it unless someone else does
> it first.
>
> On the other hand, I've found a way to avoid the move without
> requiring mac_len to be set at all. We can do this in netif_rx:

It looks like I've neglected this issue for a few months.
I have had another look at it and avoiding the move altogether is actually not that useful. The reason is that anything that uses mac.raw will need to handle the case where the MAC header is not next to the payload. In particular, this is non-trivial in the AF_PACKET case so we'll end up moving the MAC header anyway. Worse yet, if there are multiple AF_PACKET users, this means that the copy/move will have to be repeated. So I'd like to go with the previous approach which is to move for tunnel SAs only by using mac_len. Here is that patch again. -- Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ ) Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --5mCyUwZo2JvN/JJP Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p Index: include/linux/skbuff.h =================================================================== RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.5/include/linux/skbuff.h,v retrieving revision 1.1.1.8 retrieving revision 1.6 diff -u -r1.1.1.8 -r1.6 --- include/linux/skbuff.h 9 Aug 2003 08:12:03 -0000 1.1.1.8 +++ include/linux/skbuff.h 13 Sep 2003 00:11:55 -0000 1.6 @@ -159,6 +161,7 @@ * @cb: Control buffer. Free for use by every layer. 
Put private vars here * @len: Length of actual data * @data_len: Data length + * @mac_len: Length of link layer header * @csum: Checksum * @__unused: Dead field, may be reused * @cloned: Head may be cloned (check refcnt to be sure) @@ -199,6 +202,7 @@ struct icmphdr *icmph; struct igmphdr *igmph; struct iphdr *ipiph; + struct ipv6hdr *ipv6h; unsigned char *raw; } h; @@ -227,6 +231,7 @@ unsigned int len, data_len, + mac_len, csum; unsigned char local_df, cloned, Index: net/core/dev.c =================================================================== RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.5/net/core/dev.c,v retrieving revision 1.1.1.17 retrieving revision 1.2 diff -u -r1.1.1.17 -r1.2 --- net/core/dev.c 22 Aug 2003 23:56:14 -0000 1.1.1.17 +++ net/core/dev.c 13 Sep 2003 00:11:55 -0000 1.2 @@ -1547,6 +1547,7 @@ #endif skb->h.raw = skb->nh.raw = skb->data; + skb->mac_len = skb->nh.raw - skb->mac.raw; pt_prev = NULL; rcu_read_lock(); Index: net/ipv4/xfrm4_input.c =================================================================== RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.5/net/ipv4/xfrm4_input.c,v retrieving revision 1.1.1.6 retrieving revision 1.9 diff -u -r1.1.1.6 -r1.9 --- net/ipv4/xfrm4_input.c 22 Aug 2003 23:52:18 -0000 1.1.1.6 +++ net/ipv4/xfrm4_input.c 13 Sep 2003 00:11:55 -0000 1.9 @@ -9,6 +9,7 @@ * */ +#include #include #include #include @@ -18,9 +19,10 @@ return xfrm4_rcv_encap(skb, 0); } -static inline void ipip_ecn_decapsulate(struct iphdr *outer_iph, struct sk_buff *skb) +static inline void ipip_ecn_decapsulate(struct sk_buff *skb) { - struct iphdr *inner_iph = skb->nh.iph; + struct iphdr *outer_iph = skb->nh.iph; + struct iphdr *inner_iph = skb->h.ipiph; if (INET_ECN_is_ce(outer_iph->tos) && INET_ECN_is_not_ce(inner_iph->tos)) @@ -95,10 +97,16 @@ if (x->props.mode) { if (iph->protocol != IPPROTO_IPIP) goto drop; - skb->nh.raw = skb->data; + if (!pskb_may_pull(skb, sizeof(struct iphdr))) + goto drop; + if 
(skb_cloned(skb) && + pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) + goto drop; if (!(x->props.flags & XFRM_STATE_NOECN)) - ipip_ecn_decapsulate(iph, skb); - iph = skb->nh.iph; + ipip_ecn_decapsulate(skb); + skb->mac.raw = memmove(skb->data - skb->mac_len, + skb->mac.raw, skb->mac_len); + skb->nh.raw = skb->data; memset(&(IPCB(skb)->opt), 0, sizeof(struct ip_options)); decaps = 1; break; Index: net/ipv6/xfrm6_input.c =================================================================== RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.5/net/ipv6/xfrm6_input.c,v retrieving revision 1.1.1.7 retrieving revision 1.8 diff -u -r1.1.1.7 -r1.8 --- net/ipv6/xfrm6_input.c 9 Aug 2003 08:12:04 -0000 1.1.1.7 +++ net/ipv6/xfrm6_input.c 13 Sep 2003 00:11:55 -0000 1.8 @@ -9,17 +9,20 @@ * IPv6 support */ +#include #include #include #include #include -static inline void ipip6_ecn_decapsulate(struct ipv6hdr *iph, - struct sk_buff *skb) +static inline void ipip6_ecn_decapsulate(struct sk_buff *skb) { - if (INET_ECN_is_ce(ip6_get_dsfield(iph)) && - INET_ECN_is_not_ce(ip6_get_dsfield(skb->nh.ipv6h))) - IP6_ECN_set_ce(skb->nh.ipv6h); + struct ipv6hdr *outer_iph = skb->nh.ipv6h; + struct ipv6hdr *inner_iph = skb->h.ipv6h; + + if (INET_ECN_is_ce(ip6_get_dsfield(outer_iph)) && + INET_ECN_is_not_ce(ip6_get_dsfield(inner_iph))) + IP6_ECN_set_ce(inner_iph); } int xfrm6_rcv(struct sk_buff **pskb, unsigned int *nhoffp) @@ -77,10 +80,16 @@ if (x->props.mode) { /* XXX */ if (nexthdr != IPPROTO_IPV6) goto drop; - skb->nh.raw = skb->data; + if (!pskb_may_pull(skb, sizeof(struct ipv6hdr))) + goto drop; + if (skb_cloned(skb) && + pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) + goto drop; if (!(x->props.flags & XFRM_STATE_NOECN)) - ipip6_ecn_decapsulate(iph, skb); - iph = skb->nh.ipv6h; + ipip6_ecn_decapsulate(skb); + skb->mac.raw = memmove(skb->data - skb->mac_len, + skb->mac.raw, skb->mac_len); + skb->nh.raw = skb->data; decaps = 1; break; } --5mCyUwZo2JvN/JJP-- From herbert@gondor.apana.org.au 
Sun Dec 7 01:34:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 01:34:30 -0800 (PST) Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB79YBTa027369 for ; Sun, 7 Dec 2003 01:34:13 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1ASvI8-0007WX-00; Sun, 07 Dec 2003 20:33:28 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1ASvI4-0000gX-00; Sun, 07 Dec 2003 20:33:24 +1100 Date: Sun, 7 Dec 2003 20:33:24 +1100 To: kuznet@ms2.inr.ac.ru Cc: davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org Subject: Re: 2.6 IPSEC + SNAT Message-ID: <20031207093324.GA2520@gondor.apana.org.au> References: <20031001121648.GA8755@gondor.apana.org.au> <200310071251.QAA32679@yakov.inr.ac.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200310071251.QAA32679@yakov.inr.ac.ru> User-Agent: Mutt/1.5.4i From: Herbert Xu X-archive-position: 1919 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev On Tue, Oct 07, 2003 at 04:51:22PM +0400, kuznet@ms2.inr.ac.ru wrote: > > I thought about this (ages ago, so this can be stale) > The verdict was that self-consistent picture is possible only > when NAT rules are integral part of SPD. It does not look like > a stimulating idea. :-) I've hit this issue again recently [1] so I'd like to know if you've had any new revelations on this. It seems to me that for the forward path at least, there is no issue with breaking the routing/xfrm_lookup couple, and turning it into routing/FORWARD/POSTROUTING1/xfrm_lookup/xfrm_output/POSTROUTING2. Do you agree with this? 
If so, then the problem is restricted to the case of locally generated packets. Which leads us back to the previous discussion about what you're going to do with the xfrm_lookup calls on TCP/UDP/RAW sockets. Can you give an update on your work on that? Thanks, [1] The setup is as follows: Client <-----> Gateway <-----> Internet IPSEC If the gateway does SNAT and the client starts an active FTP connection, then the TCP sequence numbers may have to be modified. This is OK on the way out, but on the way back to the client, the sequence change can only occur in POSTROUTING which unfortunately occurs after IPSEC encapsulation. -- Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ ) Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From davem@pizda.ninka.net Sun Dec 7 01:48:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 01:48:17 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB79m4Ta027859 for ; Sun, 7 Dec 2003 01:48:04 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id BAA16296; Sun, 7 Dec 2003 01:47:14 -0800 Date: Sun, 7 Dec 2003 01:47:14 -0800 From: "David S. 
Miller" To: Herbert Xu Cc: kuznet@ms2.inr.ac.ru, jmorris@redhat.com, netdev@oss.sgi.com Subject: Re: [IPSEC] Move hardware headers for decaped packets Message-Id: <20031207014714.03e79cd2.davem@redhat.com> In-Reply-To: <20031207090205.GA2358@gondor.apana.org.au> References: <20030925121131.GA17968@gondor.apana.org.au> <200309251228.QAA11650@yakov.inr.ac.ru> <20030925124102.GA18188@gondor.apana.org.au> <20031207090205.GA2358@gondor.apana.org.au> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1920 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 7 Dec 2003 20:02:05 +1100 Herbert Xu wrote: > So I'd like to go with the previous approach which is to move > for tunnel SAs only by using mac_len. Ok, I'll review this again. Did you audit the tree to make sure you updated mac_len everywhere that 'skb->mac.*' is modified? From herbert@gondor.apana.org.au Sun Dec 7 01:56:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 01:56:28 -0800 (PST) Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB79uBTa028334 for ; Sun, 7 Dec 2003 01:56:13 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1ASvdR-0007nj-00; Sun, 07 Dec 2003 20:55:29 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1ASvdP-0000jP-00; Sun, 07 Dec 2003 20:55:27 +1100 Date: Sun, 7 Dec 2003 20:55:27 +1100 To: "David S. 
Miller" Cc: kuznet@ms2.inr.ac.ru, jmorris@redhat.com, netdev@oss.sgi.com Subject: Re: [IPSEC] Move hardware headers for decaped packets Message-ID: <20031207095527.GA2767@gondor.apana.org.au> References: <20030925121131.GA17968@gondor.apana.org.au> <200309251228.QAA11650@yakov.inr.ac.ru> <20030925124102.GA18188@gondor.apana.org.au> <20031207090205.GA2358@gondor.apana.org.au> <20031207014714.03e79cd2.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031207014714.03e79cd2.davem@redhat.com> User-Agent: Mutt/1.5.4i From: Herbert Xu X-archive-position: 1921 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev On Sun, Dec 07, 2003 at 01:47:14AM -0800, David S. Miller wrote: > > Ok, I'll review this again. Did you audit the tree to make sure you > updated mac_len everywhere that 'skb->mac.*' is modified? To be honest, no. The reason is that my use of mac_len starts from netif_receive_skb and ends just before the reentrance into netif_rx in xfrm[46]_input. At the start of that path, mac_len is initialised from a value that we know to be correct. I have also verified that within the path, nobody expands/contracts the MAC header. Cheers, -- Debian GNU/Linux 3.0 is out! 
( http://www.debian.org/ ) Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From devik@cdi.cz Sun Dec 7 04:02:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 04:03:13 -0800 (PST) Received: from www1.cdi.cz (www1.cdi.cz [194.213.194.49]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB7C2qTa031842 for ; Sun, 7 Dec 2003 04:02:55 -0800 Received: from gprs129-34.eurotel.cz ([160.218.129.34] helo=devix) by www1.cdi.cz with asmtp (Exim 3.34 #3) id 1ASx8L-00065J-00; Sun, 07 Dec 2003 12:31:34 +0100 Received: from devik (helo=localhost) by devix with local-esmtp (Exim 3.16 #8) id 1ASx7E-0000uE-00; Sun, 07 Dec 2003 12:30:20 +0100 Date: Sun, 7 Dec 2003 12:30:20 +0100 (CET) From: devik X-X-Sender: To: "David S. Miller" cc: , , Subject: Re: HTB 2.6.0-test10 oops related patch In-Reply-To: <20031206231047.01257548.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="8323328-49484585-1070796620=:635" X-CDI: passed X-archive-position: 1922 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: devik@cdi.cz Precedence: bulk X-list: netdev This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --8323328-49484585-1070796620=:635 Content-Type: TEXT/PLAIN; charset=US-ASCII Hello Dave, I just spotted the second bug too. (For those interested it is described in the patch). Please revert the small patch I sent you yesterday before applying the new one. It contains yesterday's one too (it is because it is for 2.4 too). Please apply it against both 2.4.23 and 2.6.0-test10. It should apply cleanly in 2.4.23 case and with some offsets to 2.6.0-test10. Patch comments: - Fixed oops in debugging of enqueue/requeue. 
- Fixed oops in htb_destroy. Note that I tested 2.6 only because I'm still downloading 2.4.22 patch on my slow line. I just wanted to sent the patch ASAP. I'll inform you about result later but I expect no problems. ------------------------------- Martin Devera aka devik Linux kernel QoS/HTB maintainer http://luxik.cdi.cz/~devik/ On Sat, 6 Dec 2003, David S. Miller wrote: > On Sat, 6 Dec 2003 15:18:37 +0100 (CET) > devik wrote: > > > I discovered reason of the first Daniel's OOPS. It is change > > which was not forward-ported to 2.6. > > It is attached. > > Thanks a lot for tracking this problem down Devik. > > Patch applied, thanks again. > > --8323328-49484585-1070796620=:635 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="htb_k2.6.0t10_n_k2.4.23_to_3.14.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="htb_k2.6.0t10_n_k2.4.23_to_3.14.diff" LS0tIHNjaF9odGIuMjQyMy5jCVN1biBEZWMgIDcgMTE6NTA6NDIgMjAwMw0K KysrIHNjaF9odGIuYwlTdW4gRGVjICA3IDEyOjEyOjM5IDIwMDMNCkBAIC0y Myw3ICsyMyw3IEBADQogICoJCQlzcG90dGVkIGJ1ZyBpbiBkZXF1ZXVlIGNv ZGUgYW5kIGhlbHBlZCB3aXRoIGZpeA0KICAqCQlhbmQgbWFueSBvdGhlcnMu IHRoYW5rcy4NCiAgKg0KLSAqICRJZDogc2NoX2h0Yi5jLHYgMS4yNCAyMDAz LzA3LzI4IDE1OjI1OjIzIGRldmlrIEV4cCBkZXZpayAkDQorICogJElkOiBz Y2hfaHRiLmMsdiAxLjI1IDIwMDMvMTIvMDcgMTE6MDg6MjUgZGV2aWsgRXhw IGRldmlrICQNCiAgKi8NCiAjaW5jbHVkZSA8bGludXgvY29uZmlnLmg+DQog I2luY2x1ZGUgPGxpbnV4L21vZHVsZS5oPg0KQEAgLTc1LDcgKzc1LDcgQEAN CiAjZGVmaW5lIEhUQl9IWVNURVJFU0lTIDEvKiB3aGV0aGVyIHRvIHVzZSBt b2RlIGh5c3RlcmVzaXMgZm9yIHNwZWVkdXAgKi8NCiAjZGVmaW5lIEhUQl9R TE9DSyhTKSBzcGluX2xvY2tfYmgoJihTKS0+ZGV2LT5xdWV1ZV9sb2NrKQ0K ICNkZWZpbmUgSFRCX1FVTkxPQ0soUykgc3Bpbl91bmxvY2tfYmgoJihTKS0+ ZGV2LT5xdWV1ZV9sb2NrKQ0KLSNkZWZpbmUgSFRCX1ZFUiAweDMwMDBkCS8q IG1ham9yIG11c3QgYmUgbWF0Y2hlZCB3aXRoIG51bWJlciBzdXBsaWVkIGJ5 IFRDIGFzIHZlcnNpb24gKi8NCisjZGVmaW5lIEhUQl9WRVIgMHgzMDAwZQkv KiBtYWpvciBtdXN0IGJlIG1hdGNoZWQgd2l0aCBudW1iZXIgc3VwbGllZCBi 
eSBUQyBhcyB2ZXJzaW9uICovDQogDQogI2lmIEhUQl9WRVIgPj4gMTYgIT0g VENfSFRCX1BST1RPVkVSDQogI2Vycm9yICJNaXNtYXRjaGVkIHNjaF9odGIu YyBhbmQgcGt0X3NjaC5oIg0KQEAgLTcxNyw3ICs3MTcsNyBAQA0KIA0KICAg ICBzY2gtPnEucWxlbisrOw0KICAgICBzY2gtPnN0YXRzLnBhY2tldHMrKzsg c2NoLT5zdGF0cy5ieXRlcyArPSBza2ItPmxlbjsNCi0gICAgSFRCX0RCRygx LDEsImh0Yl9lbnFfb2sgY2w9JVggc2tiPSVwXG4iLGNsP2NsLT5jbGFzc2lk OjAsc2tiKTsNCisgICAgSFRCX0RCRygxLDEsImh0Yl9lbnFfb2sgY2w9JVgg c2tiPSVwXG4iLChjbCAmJiBjbCAhPSBIVEJfRElSRUNUKT9jbC0+Y2xhc3Np ZDowLHNrYik7DQogICAgIHJldHVybiBORVRfWE1JVF9TVUNDRVNTOw0KIH0N CiANCkBAIC03NDUsNyArNzQ1LDcgQEANCiAJICAgIGh0Yl9hY3RpdmF0ZSAo cSxjbCk7DQogDQogICAgIHNjaC0+cS5xbGVuKys7DQotICAgIEhUQl9EQkco MSwxLCJodGJfcmVxX29rIGNsPSVYIHNrYj0lcFxuIixjbD9jbC0+Y2xhc3Np ZDowLHNrYik7DQorICAgIEhUQl9EQkcoMSwxLCJodGJfcmVxX29rIGNsPSVY IHNrYj0lcFxuIiwoY2wgJiYgY2wgIT0gSFRCX0RJUkVDVCk/Y2wtPmNsYXNz aWQ6MCxza2IpOw0KICAgICByZXR1cm4gTkVUX1hNSVRfU1VDQ0VTUzsNCiB9 DQogDQpAQCAtMTM5NiwxMSArMTM5NiwxNiBAQA0KICNpZmRlZiBIVEJfUkFU RUNNDQogCWRlbF90aW1lcl9zeW5jICgmcS0+cnR0aW0pOw0KICNlbmRpZg0K KwkvKiBUaGlzIGxpbmUgdXNlZCB0byBiZSBhZnRlciBodGJfZGVzdHJveV9j bGFzcyBjYWxsIGJlbG93DQorCSAgIGFuZCBzdXJwcmlzaW5nbHkgaXQgd29y a2VkIGluIDIuNC4gQnV0IGl0IG11c3QgcHJlY2VkZSBpdCANCisJICAgYmVj YXVzZSBmaWx0ZXIgbmVlZCBpdHMgdGFyZ2V0IGNsYXNzIGFsaXZlIHRvIGJl IGFibGUgdG8gY2FsbA0KKwkgICB1bmJpbmRfZmlsdGVyIG9uIGl0ICh3aXRo b3V0IE9vcHMpLiAqLw0KKwlodGJfZGVzdHJveV9maWx0ZXJzKCZxLT5maWx0 ZXJfbGlzdCk7DQorCQ0KIAl3aGlsZSAoIWxpc3RfZW1wdHkoJnEtPnJvb3Qp KSANCiAJCWh0Yl9kZXN0cm95X2NsYXNzIChzY2gsbGlzdF9lbnRyeShxLT5y b290Lm5leHQsDQogCQkJCQlzdHJ1Y3QgaHRiX2NsYXNzLHNpYmxpbmcpKTsN CiANCi0JaHRiX2Rlc3Ryb3lfZmlsdGVycygmcS0+ZmlsdGVyX2xpc3QpOw0K IAlfX3NrYl9xdWV1ZV9wdXJnZSgmcS0+ZGlyZWN0X3F1ZXVlKTsNCiAJTU9E X0RFQ19VU0VfQ09VTlQ7DQogfQ0KQEAgLTE0OTMsNyArMTQ5OCw3IEBADQog CQljbC0+bWFnaWMgPSBIVEJfQ01BR0lDOw0KICNlbmRpZg0KIA0KLQkJLyog Y3JlYXRlIGxlYWYgcWRpc2MgZWFybHkgYmVjYXVzZSBpdCB1c2VzIGttYWxs b2MoR1BGX0tFUk5FTCkNCisJCS8qIGNyZWF0ZSBsZWFmIHFkaXNjIGVhcmx5 
IGJlY2F1c2UgaXQgdXNlcyBrbWFsbG9jKEdGUF9LRVJORUwpDQogCQkgICBz byB0aGF0IGNhbid0IGJlIHVzZWQgaW5zaWRlIG9mIHNjaF90cmVlX2xvY2sN CiAJCSAgIC0tIHRoYW5rcyB0byBLYXJsaXMgUGVpc2VuaWVrcyAqLw0KIAkJ bmV3X3EgPSBxZGlzY19jcmVhdGVfZGZsdChzY2gtPmRldiwgJnBmaWZvX3Fk aXNjX29wcyk7DQo= --8323328-49484585-1070796620=:635-- From vdb128@stargate.kotnet.org Sun Dec 7 09:42:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 09:42:26 -0800 (PST) Received: from stargate.picaros.org (ns1.picaros.org [195.162.202.41]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB7Hg4Ta014681 for ; Sun, 7 Dec 2003 09:42:11 -0800 Received: from stargate.kotnet.org (localhost [127.0.0.1]) by stargate.picaros.org [:25] (8.12.10/8.12.10) with ESMTP id hB7Hg0eS010464 for ; Sun, 7 Dec 2003 18:42:00 +0100 Received: (from vdb128@localhost) by stargate.kotnet.org (8.12.10/8.12.10/Submit) id hB7Hg0Kt010463 for netdev@oss.sgi.com; Sun, 7 Dec 2003 18:42:00 +0100 From: vdb128@stargate.kotnet.org Message-Id: <200312071742.hB7Hg0Kt010463@stargate.kotnet.org> Subject: [patch] icmp ipip relay To: netdev@oss.sgi.com Date: Sun, 7 Dec 2003 18:42:00 +0100 (CET) X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1923 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vdb128@stargate.kotnet.org Precedence: bulk X-list: netdev Dear ipv4/ipip.c maintainers, The patch here below relays ICMP IPIP errors to the originating host. This is attempted if in addition to the IPIP header length (hlen) more then 20+8 bytes data remain. Otherwise, it defaults to the current code and increments the error count. 
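The length test described above can be sketched on its own. This is only a user-space illustration, not code from the patch: the helper name ipip_err_would_relay and the two constants are hypothetical, and an option-free inner IP header of 20 bytes is assumed. The posted diff tests len < hlen + sizeof(struct iphdr) + 8, so the relay path is taken as soon as at least 20+8 bytes remain after the outer header.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes: an inner IP header without options, plus the
 * 8 bytes of inner payload that the patch's test requires. */
#define INNER_IPHDR_LEN   20u
#define INNER_PAYLOAD_LEN  8u

/*
 * Hypothetical helper (not part of the patch).
 * data/len describe the bytes quoted in the ICMP error, starting at
 * the outer (tunnel) IP header.  Returns 1 when enough of the inner
 * packet is present for the full relay path, 0 when the code would
 * fall back to the existing error accounting.
 */
static int ipip_err_would_relay(const uint8_t *data, size_t len)
{
    size_t hlen;

    if (len < 1)
        return 0;                      /* nothing to inspect */

    /* Low nibble of the first byte is ihl; this mirrors iph->ihl << 2. */
    hlen = (size_t)(data[0] & 0x0f) << 2;

    return len >= hlen + INNER_IPHDR_LEN + INNER_PAYLOAD_LEN;
}
```

With a 20-byte outer header (first byte 0x45), a quoted length of 28 bytes selects the fallback and a quoted length of 60 bytes selects the relay path.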
Yours sincerely, Servaas Vandenberghe ----------------------------- at local host 172.24.105.1 ---------------------- stargate:~$ tracepath 195.162.196.2 1?: [LOCALHOST] pmtu 1500 1: be-bru02a-ra2-vl-201.aorta.net (195.162.209.253) 6.817ms (0) 2: rubicon.chello.be (195.162.196.2) 5.986ms reached Resume: pmtu 1500 hops 2 back 2 10: tunl1@NONE: mtu 1480 qdisc noqueue link/ipip 195.162.202.41 peer 195.162.196.2 inet 172.24.105.24 peer 172.24.10.24/32 scope global tunl1 stargate:~$ tracepath 172.24.10.24 1?: [LOCALHOST] pmtu 1480 1: no reply 2: tkstar.picaros.org (172.24.105.24) 6.409ms asym 1 !H Resume: pmtu 1480 ==> /var/log/debug <== Dec 7 02:56:34 stargate kernel: ipip_err: icmp 195.162.209.253>195.162.202.41 11 0 simple Dec 7 02:56:36 stargate last message repeated 2 times Dec 7 02:56:37 stargate kernel: ipip_err: icmp 195.162.196.2>195.162.202.41 3 2 full Dec 7 02:56:37 stargate kernel: ipip_err: ipip 195.162.202.41>195.162.196.2 172.24.105.24>172.24.10.24 ----------------------- at firewalled host 172.24.105.1 --------------------- reddwarf:~$ tracepath 195.162.196.2 1?: [LOCALHOST] pmtu 1500 1: stargate.picaros.org (172.24.105.14) 4.021ms 2: be-bru02a-ra2-vl-201.aorta.net (195.162.209.253) 12.948ms (0) 3: rubicon.chello.be (195.162.196.2) 10.130ms reached Resume: pmtu 1500 hops 3 back 3 reddwarf:~$ tracepath 172.24.10.24 1?: [LOCALHOST] pmtu 1500 1: stargate.picaros.org (172.24.105.14) 3.902ms 2: stargate.picaros.org (172.24.105.14) 3.771ms asym 1 pmtu 1480 3: stargate.picaros.org (172.24.105.14) 20.500ms asym 1 !H Resume: pmtu 1480 ==> /var/log/debug <== Dec 7 03:03:39 stargate kernel: ipip_err: icmp 195.162.196.2>195.162.202.41 3 2 full Dec 7 03:03:39 stargate kernel: ipip_err: ip_route_input non local pkt Dec 7 03:03:39 stargate kernel: ipip_err: ipip 195.162.202.41>195.162.196.2 172.24.105.1>172.24.10.24 reddwarf:~$ tracepath -m 40 172.24.10.24 1: stargate.picaros.org (172.24.105.14) 1.335ms 2: no reply 3: stargate.picaros.org (172.24.105.14) 3.172ms asym 1 
!H Resume: pmtu 40 ==> /var/log/debug <== Dec 7 03:52:41 stargate kernel: ipip_err: icmp 195.162.209.253>195.162.202.41 11 0 simple Dec 7 03:52:43 stargate last message repeated 2 times Dec 7 03:52:44 stargate kernel: ipip_err: icmp 195.162.196.2>195.162.202.41 3 2 full Dec 7 03:52:44 stargate kernel: ipip_err: ip_route_input non local pkt Dec 7 03:52:44 stargate kernel: ipip_err: ipip 195.162.202.41>195.162.196.2 172.24.105.1>172.24.10.24 stargate:~# tcpdump -i eth1 -s 1024 -v host 195.162.196.2 or host 195.162.202.41 tcpdump: listening on eth1 03:52:41.658061 ns1.picaros.org > rubicon.chello.be: reddwarf.picaros.org.1028 > 172.24.10.24.44445: [udp sum ok] udp 12 (DF) [ttl 1] (id 0, len 40) (DF) [ttl 1] (id 0, len 60) 03:52:41.659947 be-bru02a-ra2-vl-201.aorta.net > ns1.picaros.org: icmp 28: time exceeded in-transit [tos 0xc0] (ttl 255, id 58562, len 56) 03:52:42.659757 ns1.picaros.org > rubicon.chello.be: reddwarf.picaros.org.1028 > 172.24.10.24.44446: [udp sum ok] udp 12 (DF) [ttl 1] (id 0, len 40) (DF) [ttl 1] (id 0, len 60) 03:52:42.665514 be-bru02a-ra2-vl-201.aorta.net > ns1.picaros.org: icmp 28: time exceeded in-transit [tos 0xc0] (ttl 255, id 58594, len 56) 03:52:43.669749 ns1.picaros.org > rubicon.chello.be: reddwarf.picaros.org.1028 > 172.24.10.24.44447: [udp sum ok] udp 12 (DF) [ttl 1] (id 0, len 40) (DF) [ttl 1] (id 0, len 60) 03:52:43.671649 be-bru02a-ra2-vl-201.aorta.net > ns1.picaros.org: icmp 28: time exceeded in-transit [tos 0xc0] (ttl 255, id 58632, len 56) 03:52:44.686199 ns1.picaros.org > rubicon.chello.be: reddwarf.picaros.org.1028 > 172.24.10.24.44448: [udp sum ok] udp 12 (DF) (ttl 2, id 0, len 40) (DF) (ttl 2, id 0, len 60) 03:52:44.688177 rubicon.chello.be > ns1.picaros.org: icmp 60: rubicon.chello.be protocol 4 unreachable (DF) (ttl 254, id 14625, len 88) ############################################################################# --- linux-2.4.23/net/ipv4/ipip-dist.c Fri Nov 28 19:26:21 2003 +++ linux-2.4.23/net/ipv4/ipip.c Sun Dec 7 
02:01:13 2003 @@ -289,19 +290,29 @@ dev_put(dev); } +#define IPIP_DEBUG 0 void ipip_err(struct sk_buff *skb, u32 info) { -#ifndef I_WISH_WORLD_WERE_PERFECT + struct iphdr *iph = (struct iphdr*)skb->data; + int hlen = iph->ihl<<2; + unsigned int len = skb->len; + int type = skb->h.icmph->type; + int code = skb->h.icmph->code; + + if (len < hlen+sizeof(struct iphdr)+8) { + /*#ifndef I_WISH_WORLD_WERE_PERFECT*/ /* It is not :-( All the routers (except for Linux) return only 8 bytes of packet payload. It means, that precise relaying of ICMP in the real Internet is absolutely infeasible. */ - struct iphdr *iph = (struct iphdr*)skb->data; - int type = skb->h.icmph->type; - int code = skb->h.icmph->code; struct ip_tunnel *t; - +#if IPIP_DEBUG + if (net_ratelimit()) + printk(KERN_DEBUG "ipip_err: icmp %u.%u.%u.%u>%u.%u.%u.%u %d %d " + "simple \n", NIPQUAD(skb->nh.iph->saddr), + NIPQUAD(skb->nh.iph->daddr), type,code); +#endif switch (type) { default: case ICMP_PARAMETERPROB: @@ -344,22 +355,21 @@ t->err_time = jiffies; out: read_unlock(&ipip_lock); - return; -#else - struct iphdr *iph = (struct iphdr*)dp; - int hlen = iph->ihl<<2; - struct iphdr *eiph; - int type = skb->h.icmph->type; - int code = skb->h.icmph->code; - int rel_type = 0; - int rel_code = 0; + } else { /* #else */ + struct iphdr *eiph = (struct iphdr*)(skb->data + hlen); + int rel_type = type; + int rel_code = code; int rel_info = 0; - struct sk_buff *skb2; - struct rtable *rt; - - if (len < hlen + sizeof(struct iphdr)) - return; - eiph = (struct iphdr*)(dp + hlen); + struct ip_tunnel *tlook, *troute; + struct sk_buff *skb2=NULL; + struct rtable *rt=NULL; + +#if IPIP_DEBUG + if (net_ratelimit()) + printk(KERN_DEBUG "ipip_err: icmp %u.%u.%u.%u>%u.%u.%u.%u %d %d " + "full \n", NIPQUAD(skb->nh.iph->saddr), + NIPQUAD(skb->nh.iph->daddr), type,code); +#endif switch (type) { default: @@ -407,62 +417,90 @@ break; } + tlook = ipip_tunnel_lookup(iph->daddr, iph->saddr); + if (tlook == NULL || tlook->parms.iph.daddr 
== 0) { +#if IPIP_DEBUG + if (net_ratelimit()) + printk(KERN_DEBUG "ipip_err: %u.%u.%u.%u>%u.%u.%u.%u " + "no tunnel ! \n", NIPQUAD(iph->saddr), NIPQUAD(iph->daddr)); +#endif + return; + } + /* Prepare fake skb to feed it to icmp_send */ skb2 = skb_clone(skb, GFP_ATOMIC); if (skb2 == NULL) return; dst_release(skb2->dst); - skb2->dst = NULL; - skb_pull(skb2, skb->data - (u8*)eiph); + skb2->dst = NULL; + skb_pull(skb2, hlen); skb2->nh.raw = skb2->data; - /* Try to guess incoming interface */ + /* Try to guess original incoming interface before encapsulation */ if (ip_route_output(&rt, eiph->saddr, 0, RT_TOS(eiph->tos), 0)) { - kfree_skb(skb2); - return; + kfree_skb(skb2); + return; } skb2->dev = rt->u.dst.dev; /* route "incoming" packet */ if (rt->rt_flags&RTCF_LOCAL) { - ip_rt_put(rt); - rt = NULL; - if (ip_route_output(&rt, eiph->daddr, eiph->saddr, eiph->tos, 0) || - rt->u.dst.dev->type != ARPHRD_IPGRE) { - ip_rt_put(rt); - kfree_skb(skb2); - return; - } + struct rtable *rtout=NULL; + if (ip_route_output(&rtout, eiph->daddr, eiph->saddr, eiph->tos, 0) + || rtout->u.dst.dev->type != ARPHRD_TUNNEL) { + ip_rt_put(rtout); + goto end_error; + } + skb2->dst = &rtout->u.dst; /* needed by icmp_send() */ } else { - ip_rt_put(rt); - if (ip_route_input(skb2, eiph->daddr, eiph->saddr, eiph->tos, skb2->dev) || - skb2->dst->dev->type != ARPHRD_IPGRE) { - kfree_skb(skb2); - return; - } +#if IPIP_DEBUG + if (net_ratelimit()) + printk(KERN_DEBUG "ipip_err: ip_route_input non local pkt\n"); +#endif + if (ip_route_input(skb2, eiph->daddr, eiph->saddr, eiph->tos, skb2->dev) || + skb2->dst->dev->type != ARPHRD_TUNNEL) { + goto end_error; + } + } + +#if IPIP_DEBUG + if (net_ratelimit()) + printk(KERN_DEBUG "ipip_err: ipip %u.%u.%u.%u>%u.%u.%u.%u " + "%u.%u.%u.%u>%u.%u.%u.%u \n", + NIPQUAD(iph->saddr), NIPQUAD(iph->daddr), + NIPQUAD(eiph->saddr), NIPQUAD(eiph->daddr)); +#endif + troute = (struct ip_tunnel*)skb2->dst->dev->priv; + if (troute != tlook) { + if (net_ratelimit()) + 
printk(KERN_INFO "ipip_err: %u.%u.%u.%u>%u.%u.%u.%u " + "%u.%u.%u.%u>%u.%u.%u.%u different tunnel ! \n", + NIPQUAD(iph->saddr), NIPQUAD(iph->daddr), + NIPQUAD(eiph->saddr), NIPQUAD(eiph->daddr)); + goto end_error; } /* change mtu on this route */ if (type == ICMP_DEST_UNREACH && code == ICMP_FRAG_NEEDED) { - if (rel_info > skb2->dst->pmtu) { - kfree_skb(skb2); - return; - } + if (rel_info > skb2->dst->pmtu) + goto end_error; skb2->dst->pmtu = rel_info; rel_info = htonl(rel_info); } else if (type == ICMP_TIME_EXCEEDED) { - struct ip_tunnel *t = (struct ip_tunnel*)skb2->dev->priv; - if (t->parms.iph.ttl) { + if (troute->parms.iph.ttl) { rel_type = ICMP_DEST_UNREACH; rel_code = ICMP_HOST_UNREACH; } } icmp_send(skb2, rel_type, rel_code, rel_info); + end_error: + ip_rt_put(rt); kfree_skb(skb2); - return; -#endif + } /* if len< min length */ + return; } +#undef IPIP_DEBUG static inline void ipip_ecn_decapsulate(struct iphdr *outer_iph, struct sk_buff *skb) { ######################### informative: traceroute patch ##################### --- iputils-ss020927/tracepath-dist.c Sat Feb 23 01:10:59 2002 +++ iputils-ss020927/tracepath.c Sun Dec 7 04:44:09 2003 @@ -23,30 +23,22 @@ #include #include -struct hhistory -{ - int hops; - struct timeval sendtime; +struct probehdr { + __u32 ttl; + struct timeval tv; }; -struct hhistory his[64]; -int hisptr; - -struct sockaddr_in target; -__u16 base_port; - -const int overhead = 28; -int mtu = 65535; -int hops_to = -1; -int hops_from = -1; -int no_resolve = 0; +struct hhistory { + int ptr; + struct probehdr hdr[64]; +}; -struct probehdr -{ - __u32 ttl; - struct timeval tv; +struct hopslist { + int to, from; }; +const static int overhead=28, maxttl=255; + void data_wait(int fd) { fd_set fds; @@ -54,11 +46,11 @@ FD_ZERO(&fds); FD_SET(fd, &fds); tv.tv_sec = 1; - tv.tv_usec = 0; + tv.tv_usec = 10000; /* ARP timeout is 3s */ select(fd+1, &fds, NULL, NULL, &tv); } -int recverr(int fd, int ttl) +int recverr(int fd, unsigned base_port, int ttl, 
int no_resolve, int *mtu, struct hhistory *his, struct hopslist *hops) { int res; struct probehdr rcvbuf; @@ -70,11 +62,8 @@ struct sockaddr_in addr; struct timeval tv; struct timeval *rettv; - int slot; - int rethops; - int sndhops; + int slot, rethops, sndhops, broken_router; int progress = -1; - int broken_router; restart: memset(&rcvbuf, -1, sizeof(rcvbuf)); @@ -91,32 +80,35 @@ gettimeofday(&tv, NULL); res = recvmsg(fd, &msg, MSG_ERRQUEUE); if (res < 0) { - if (errno == EAGAIN) + if (errno == EAGAIN || errno == EBADF || errno == ENOTSOCK + || errno == EINVAL) return progress; goto restart; } - progress = mtu; + progress = *mtu; rethops = -1; sndhops = -1; e = NULL; rettv = NULL; slot = ntohs(addr.sin_port) - base_port; - if (slot>=0 && slot < 63 && his[slot].hops) { - sndhops = his[slot].hops; - rettv = &his[slot].sendtime; - his[slot].hops = 0; + if (slot>=0 && slot<(maxttl+1)*3 && his->hdr[slot=(slot & 63)].ttl) { + sndhops = his->hdr[slot].ttl; + rettv = &his->hdr[slot].tv; + his->hdr[slot].ttl = 0; } broken_router = 0; if (res == sizeof(rcvbuf)) { - if (rcvbuf.ttl == 0 || rcvbuf.tv.tv_sec == 0) { + if (rcvbuf.ttl==0 || rcvbuf.ttl>maxttl + || rcvbuf.tv.tv_sec==0) { broken_router = 1; } else { sndhops = rcvbuf.ttl; rettv = &rcvbuf.tv; } - } + } else + broken_router = -1; for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) { if (cmsg->cmsg_level == SOL_IP) { @@ -158,6 +150,11 @@ } } + if (rettv) { + int diff = (tv.tv_sec-rettv->tv_sec)*1000000+(tv.tv_usec-rettv->tv_usec); + printf("%3d.%03dms ", diff/1000, diff%1000); + } + if (rethops>=0) { if (rethops<=64) rethops = 65-rethops; @@ -166,16 +163,16 @@ else rethops = 256-rethops; if (sndhops>=0 && rethops != sndhops) - printf("asymm %2d ", rethops); + printf("asym %2d ", rethops); else if (sndhops<0 && rethops != ttl) - printf("asymm %2d ", rethops); + printf("asym %2d ", rethops); } if (rettv) { - int diff = (tv.tv_sec-rettv->tv_sec)*1000000+(tv.tv_usec-rettv->tv_usec); - printf("%3d.%03dms 
", diff/1000, diff%1000); - if (broken_router) + if (broken_router>0) printf("(This broken router returned corrupted payload) "); + else if(broken_router<0) + printf("(%i) ",res); } switch (e->ee_errno) { @@ -184,13 +181,12 @@ break; case EMSGSIZE: printf("pmtu %d\n", e->ee_info); - mtu = e->ee_info; - progress = mtu; + progress = *mtu = e->ee_info; break; case ECONNREFUSED: printf("reached\n"); - hops_to = sndhops<0 ? ttl : sndhops; - hops_from = rethops; + hops->to = sndhops<0 ? ttl : sndhops; + hops->from = rethops; return 0; case EPROTO: printf("!P\n"); @@ -219,69 +215,110 @@ goto restart; } -int probe_ttl(int fd, int ttl) +int probe_ttl(int fd, struct sockaddr_in *target, __u16 base_port, int ttl, int no_resolve, int *mtu, struct hhistory *his, struct hopslist *hops) { - int i; - char sndbuf[mtu]; + int i, all= *mtuttl = ttl; - target.sin_port = htons(base_port + hisptr); + hdr->ttl= ttl; + target->sin_port = htons(base_port + his->ptr); gettimeofday(&hdr->tv, NULL); - his[hisptr].hops = ttl; - his[hisptr].sendtime = hdr->tv; - if (sendto(fd, sndbuf, mtu-overhead, 0, (struct sockaddr*)&target, sizeof(target)) > 0) + his->hdr[his->ptr]= *hdr; + if (sendto(fd, sndbuf, *mtu-overhead, 0, + (struct sockaddr*)target, sizeof(*target)) >= 0) break; - res = recverr(fd, ttl); - his[hisptr].hops = 0; + + res = recverr(fd, base_port, ttl, no_resolve, mtu, his, hops); + his->hdr[his->ptr].ttl = 0; if (res==0) return 0; - if (res > 0) - goto restart; + if (res<0) /* no error info in queue */ + i+=25; } - hisptr = (hisptr + 1)&63; + his->ptr = (his->ptr + 1)&63; - if (i<10) { + if (i<256) { data_wait(fd); if (recv(fd, sndbuf, sizeof(sndbuf), MSG_DONTWAIT) > 0) { printf("%2d?: reply received 8)\n", ttl); return 0; } - return recverr(fd, ttl); + return recverr(fd, base_port, ttl, no_resolve, mtu, his, hops); } - printf("%2d: send failed\n", ttl); + { char abuf[128]; + inet_ntop(target->sin_family, &target->sin_addr, abuf, sizeof(abuf)); + printf("%2d: send failed dest 
%s/%u\n", ttl, abuf, + (unsigned )ntohs(target->sin_port)); + } return 0; } -static void usage(void) __attribute((noreturn)); - static void usage(void) { - fprintf(stderr, "Usage: tracepath [-n] [/]\n"); - exit(-1); + fprintf(stderr, "Use: tracepath [-Q tos] [-b src[/port]] [-m mtu] [-n] [-t ttli[/f]] dest[/port]\n"); + return; } -int -main(int argc, char **argv) +int div_string(char *argin, long *valout) +{ + int err=0; + char *p; + + if((p = strrchr(argin, '/'))) { + *p = 0; + *valout = strtol(&p[1], NULL,0); + } else { + err=1; + } + + return(err); +} + +int main(int argc, char **argv) { - struct hostent *he; int fd; + struct hostent *he; + struct sockaddr_in target; + __u16 base_port=44444; + struct hopslist hops={ -1, -1}; + char *tos_ini=NULL, *bind_ini=NULL, *mtu_ini=NULL, *ttl_ini=NULL; + int no_resolve = 0; int on; - int ttl; - char *p; - int ch; + int ttli=1, ttlf=31, ttl, tos=0; + int ch, res=0, err=0; + int mtu = 65535; + struct hhistory his={ .ptr=0, .hdr={ {.ttl=0}, } }; + + extern char *optarg; + extern int optind; /*, opterr, optopt; */ + + optarg=NULL; /* init getopt */ + optind=0; - while ((ch = getopt(argc, argv, "nh?")) != EOF) { + while ((ch = getopt(argc, argv, "Q:b:m:nt:s:h")) != EOF) { switch(ch) { + case 'Q': + tos_ini=optarg; + break; + case 'b': + case 's': + bind_ini=optarg; + break; + case 'm': + mtu_ini=optarg; + break; case 'n': - no_resolve = 1; + no_resolve ^= 1; + break; + case 't': + ttl_ini=optarg; break; default: ; } @@ -290,57 +327,104 @@ argc -= optind; argv += optind; - if (argc != 1) + if (argc != 1) { usage(); - + err=-1; goto end_error; + } fd = socket(AF_INET, SOCK_DGRAM, 0); if (fd < 0) { perror("socket"); - exit(1); + err=1; goto end_error; } target.sin_family = AF_INET; - p = strchr(argv[0], '/'); - if (p) { - *p = 0; - base_port = atoi(p+1); - } else - base_port = 44444; + { long li; + if(!div_string(argv[0], &li)) + base_port = li; + } he = gethostbyname(argv[0]); if (he == NULL) { herror("gethostbyname"); - exit(1); 
+ err=2; goto end_error; } memcpy(&target.sin_addr, he->h_addr, 4); + if(bind_ini) { + long li; + __u16 sport=0; + struct hostent *she; + struct sockaddr_in saddr; + + memset(&saddr, 0, sizeof(saddr)); + saddr.sin_family = AF_INET; + + if(!div_string(bind_ini, &li)) + sport = li; + saddr.sin_port = htons(sport); + + she = gethostbyname(bind_ini); + if (she == NULL) { + herror("gethostbyname"); + err=3; goto end_error; + } + memcpy(&saddr.sin_addr, she->h_addr, 4); + + if(bind(fd, (struct sockaddr*)&saddr, sizeof(saddr))<0) { + perror("bind"); + err=4; goto end_error; + } + } + + if(mtu_ini) { + mtu=(int )strtol(mtu_ini, NULL,0); + if(mtu 0) @@ -351,12 +435,14 @@ printf("%2d: no reply\n", ttl); } printf(" Too many hops: pmtu %d\n", mtu); + err= res<0 ? 16 : 17; done: printf(" Resume: pmtu %d ", mtu); - if (hops_to>=0) - printf("hops %d ", hops_to); - if (hops_from>=0) - printf("back %d ", hops_from); + if (hops.to>=0) + printf("hops %d ", hops.to); + if (hops.from>=0) + printf("back %d ", hops.from); printf("\n"); - exit(0); +end_error: + return(err); } From jgarzik@pobox.com Sun Dec 7 10:51:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 10:51:32 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB7IpITa016101 for ; Sun, 7 Dec 2003 10:51:18 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:40917 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1AT3zx-0002Jo-4p; Sun, 07 Dec 2003 18:51:17 +0000 Message-ID: <3FD3769A.4090200@pobox.com> Date: Sun, 07 Dec 2003 13:51:06 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Stephen Hemminger CC: netdev@oss.sgi.com Subject: Re: [PATCH] pc300 - get rid of MOD_INC/MOD_DEC References: <20031124155100.17b7a4ca.shemminger@osdl.org> In-Reply-To: 
<20031124155100.17b7a4ca.shemminger@osdl.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1924 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev applied From jgarzik@pobox.com Sun Dec 7 11:17:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 11:17:30 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB7JH6Ta016758 for ; Sun, 7 Dec 2003 11:17:07 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:40914 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1AT3xY-0002Ib-Uy; Sun, 07 Dec 2003 18:48:49 +0000 Message-ID: <3FD37605.5070806@pobox.com> Date: Sun, 07 Dec 2003 13:48:37 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Stephen Hemminger CC: netdev@oss.sgi.com Subject: Re: [PATCH] hp100 -- fixes for new probing. References: <20031124154459.6fe02a94.shemminger@osdl.org> In-Reply-To: <20031124154459.6fe02a94.shemminger@osdl.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1926 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Stephen Hemminger wrote: > Fixes to net-drivers-2.5-exp patches for hp100 > * EISA device id table needs a terminating string. > * if one driver built for all variations (ISA, EISA, PCI) > then try to have sane error handling on probe. 
> > diff -Nru a/drivers/net/hp100.c b/drivers/net/hp100.c > --- a/drivers/net/hp100.c Mon Nov 24 15:34:53 2003 > +++ b/drivers/net/hp100.c Mon Nov 24 15:34:53 2003 > @@ -201,6 +201,7 @@ > { "HWP1990" }, /* HP J2577 */ > { "CPX0301" }, /* ReadyLink ENET100-VG4 */ > { "CPX0401" }, /* FreedomLine 100/VG */ > + { "" } > }; > MODULE_DEVICE_TABLE(eisa, hp100_eisa_tbl); > #endif > @@ -3045,10 +3046,16 @@ > err = hp100_isa_init(); > > #ifdef CONFIG_EISA > - err |= eisa_driver_register(&hp100_eisa_driver); > + if (err && err != -ENODEV) > + return err; > + > + err = eisa_driver_register(&hp100_eisa_driver); > #endif > #ifdef CONFIG_PCI > - err |= pci_module_init(&hp100_pci_driver); > + if (err && err != -ENODEV) > + return err; > + > + err = pci_module_init(&hp100_pci_driver); > #endif > return err; > } Valid changes... but it looks like there should be some *_unregister_* calls in this last patch chunk, to clean up on error... Jeff From jgarzik@pobox.com Sun Dec 7 11:17:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 11:17:28 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB7JHETa016764 for ; Sun, 7 Dec 2003 11:17:15 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:40874 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1AT3QS-00022w-25; Sun, 07 Dec 2003 18:14:36 +0000 Message-ID: <3FD36DFF.4060503@pobox.com> Date: Sun, 07 Dec 2003 13:14:23 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Francois Romieu CC: netdev@oss.sgi.com, =?ISO-8859-1?Q?Fernando_Alencar_Mar=F3stica?= , Brad House Subject: Re: [PATCH 2.6] 2.6.0-test11 - more rtl8169 References: <1070212415.1607.17.camel@oxygenium> <20031201020453.A16405@electric-eye.fr.zoreil.com> <20031202010649.A27879@electric-eye.fr.zoreil.com> In-Reply-To: 
<20031202010649.A27879@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1925 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Francois Romieu wrote: > Get: > http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.0-test11-netdrv/r8169-blob.tar.bz2 > Same thing as above with: > r8169-mac-phy-version.patch > r8169-init_one.patch > r8169-timer.patch > r8169-hw_start.patch > r8169-missing-tx-stats.patch > r8169-intr_mask.patch > r8169-suspend.patch > r8169-dma-api-rx-buffers-ahum.patch All the above patches applied, in the above order. Thanks much, Francois! Is there anything left? Jeff P.S. Will release updated 2.5-exp patch soon. From jgarzik@pobox.com Sun Dec 7 12:36:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 12:36:37 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB7KaLTa020925 for ; Sun, 7 Dec 2003 12:36:22 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:40960 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1AT5dc-00039T-Cb; Sun, 07 Dec 2003 20:36:20 +0000 Message-ID: <3FD38F39.5070801@pobox.com> Date: Sun, 07 Dec 2003 15:36:09 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Netdev CC: Linux Kernel , Andrew Morton Subject: [BK PATCHES] 2.6.x experimental net driver queue Content-Type: multipart/mixed; boundary="------------070501080508050700070805" X-archive-position: 1927 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. 
--------------070501080508050700070805 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Another installment in the net-drivers-2.5-exp queue patchkit. Summary of new changes: * r8169 bug fixing (Francois Romieu, RealTek, others) * Make dev->priv calculation a compile-time constant (eliminates extra dereference) (Al Viro) * More netdev allocation work (Al Viro) * airo fixes (Javier) * 8139too minor fixes (OGAWA Hirofumi) * new nVidia nForce NIC driver "forcedeth" (Manfred Spraul) * misc net driver fixes/cleanups (Stephen Hemminger) Summary of patchkit: * e100 driver rewrite * new nVidia nForce NIC driver * r8169 bug fixing * 8139too NAPI support * tulip NAPI support * netconsole / netdump support * net_device allocation and reference counting work * e1000 minor updates * misc bug fixes Patch: ftp://ftp.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/2.6.0-test11-bk5-netdrvr-exp1.patch.bz2 Full changelog: ftp://ftp.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/2.6.0-test11-bk5-netdrvr-exp1.log BK repo: http://gkernel.bkbits.net/net-drivers-2.5-exp Changelog delta attached. --------------070501080508050700070805 Content-Type: text/plain; name="2.6.0-test11-bk5-netdrvr-exp1.txt" Content-Transfer-Encoding: 8bit Content-Disposition: inline; filename="2.6.0-test11-bk5-netdrvr-exp1.txt" ChangeSet@1.1526, 2003-12-07 14:01:09-05:00, jgarzik@redhat.com Merge redhat.com:/spare/repo/net-drivers-2.5 into redhat.com:/spare/repo/net-drivers-2.5-exp ChangeSet@1.1474.1.26, 2003-12-07 13:58:53-05:00, jgarzik@redhat.com Merge redhat.com:/spare/repo/linux-2.5 into redhat.com:/spare/repo/net-drivers-2.5 ChangeSet@1.1525, 2003-12-07 13:50:47-05:00, shemminger@osdl.org [PATCH] pc300 - get rid of MOD_INC/MOD_DEC Remove old style mod inc/dec from this WAN driver. ChangeSet@1.1524, 2003-12-07 13:49:20-05:00, shemminger@osdl.org [PATCH] get rid of MOD INC/DEC for farsync Get rid of leftover MOD INC/DEC in this driver. 
Ref counting now done by network core. Jeff, please apply to net-drivers-2.5-exp ChangeSet@1.1523, 2003-12-07 13:45:51-05:00, viro@parcelfarce.linux.theplanet.co.uk [atm clip] convert to using alloc_netdev(), const-offset priv ChangeSet@1.1522, 2003-12-07 13:42:37-05:00, viro@parcelfarce.linux.theplanet.co.uk [pcmcia] synclink_cs] convert net_device to dynamic allocation ChangeSet@1.1521, 2003-12-07 13:42:04-05:00, viro@parcelfarce.linux.theplanet.co.uk [char synclinkmp] convert net_device to dynamic allocation ChangeSet@1.1520, 2003-12-07 13:40:52-05:00, viro@parcelfarce.linux.theplanet.co.uk [hamradio dmascc] convert embedded net_device to dynamic allocation ChangeSet@1.1519, 2003-12-07 13:39:29-05:00, viro@parcelfarce.linux.theplanet.co.uk net_device and netdev private struct allocation improvements. 1) Ensure alignment of both net_device and private area. 2) Introduce netdev_priv(), an inline which allows the dynamic private area (dev->priv) to be calculated as a constant offset from the base struct net_device at compile time. ChangeSet@1.1518, 2003-12-07 13:21:55-05:00, achirica@telefonica.net [wireless airo] Delay MIC activation to prevent Oops ChangeSet@1.1517, 2003-12-07 13:21:47-05:00, achirica@telefonica.net [wireless airo] Fix PCI registration ChangeSet@1.1516, 2003-12-07 13:18:43-05:00, hirofumi@mail.parknet.co.jp [PATCH] 8139too tx queue handling fix Hi, netif_stop_queue(dev); [....] netif_start_queue(dev); netif_wake_queue(dev); and netif_stop_queue(dev); [....] netif_wake_queue(dev); is not same. After tx_timeout, this was needed for restarting tx work immediately. Are you interested in this patch? 
drivers/net/8139too.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) ChangeSet@1.1515, 2003-12-07 13:18:34-05:00, hirofumi@mail.parknet.co.jp [PATCH] 8139too warning fix (1/2) drivers/net/8139too.c | 2 ++ 1 files changed, 2 insertions(+) ChangeSet@1.1514, 2003-12-07 13:13:24-05:00, romieu@fr.zoreil.com [netdrvr r8169] fix RX Brown paper bag time: the Rx descriptors are contiguous and EORbit only marks the last descriptor in the array. OWNbit implicitly marks the end of the Rx descriptors segment which is owned by the nic. ChangeSet@1.1513, 2003-12-07 13:12:27-05:00, romieu@fr.zoreil.com [netdrvr r8169] Suspend/resume code (Fernando Alencar Maróstica). ChangeSet@1.1512, 2003-12-07 13:11:46-05:00, romieu@fr.zoreil.com [netdrvr r8169] Modification of the interrupt mask (RealTek). ChangeSet@1.1511, 2003-12-07 13:10:49-05:00, romieu@fr.zoreil.com [netdrvr r8169] Driver forgot to update the transmitted bytes counter. Originally done in rtl8169_start_xmit() by Realtek. ChangeSet@1.1510, 2003-12-07 13:10:04-05:00, romieu@fr.zoreil.com [netdrvr r8169] Merge of changes from Realtek: - register voodoo in rtl8169_hw_start(). ChangeSet@1.1509, 2003-12-07 13:08:55-05:00, romieu@fr.zoreil.com [netdrvr r8169] Merge of timer related changes from Realtek: - changed their timeout value from 100 to HZ to trigger rtl8169_phy_timer(); - s/TX_TIMEOUT/RTL8169_TX_TIMEOUT/ to have RTL8169_{TX/PHY}_TIMEOUT. ChangeSet@1.1508, 2003-12-07 13:07:25-05:00, romieu@fr.zoreil.com [netdrvr r8169] Merge of changes done by Realtek to rtl8169_init_one(): - phy capability settings allows lower or equal capability as suggested in Realtek's changes; - I/O voodoo; - no need to s/mdio_write/RTL8169_WRITE_GMII_REG/; - s/rtl8169_hw_PHY_config/rtl8169_hw_phy_config/; - rtl8169_hw_phy_config(): ad-hoc struct "phy_magic" to limit duplication of code (yep, the u16 -> int conversions should work as expected); - variable renames and whitespace changes ignored. 
ChangeSet@1.1507, 2003-12-07 13:05:55-05:00, romieu@fr.zoreil.com [netdrvr r8169] Add {mac/phy}_version. - change of identification logic in rtl8169_init_board(); - {chip/rtl_chip}_info are merged in rtl_chip_info; - misc style nits (lazy braces, SHOUTING MACROS from realtek converted to functions). ChangeSet@1.1506, 2003-12-07 12:58:13-05:00, manfred@colorfullife.com [netdrvr] add "forcedeth" driver for nVidia nForce NICs --------------070501080508050700070805-- From romieu@fr.zoreil.com Sun Dec 7 15:04:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 15:04:15 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB7N40Ta023626 for ; Sun, 7 Dec 2003 15:04:01 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id hB7N1YsW030321; Mon, 8 Dec 2003 00:01:34 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id hB7N1WoG030319; Mon, 8 Dec 2003 00:01:32 +0100 Date: Mon, 8 Dec 2003 00:01:32 +0100 From: Francois Romieu To: Jeff Garzik Cc: netdev@oss.sgi.com, =?unknown-8bit?Q?Fernando_Alencar_Mar=F3stica?= , Brad House Subject: [PATCH 2.6] 2.6.0-test11-bk5 - rtl8169 Message-ID: <20031208000132.A29834@electric-eye.fr.zoreil.com> References: <1070212415.1607.17.camel@oxygenium> <20031201020453.A16405@electric-eye.fr.zoreil.com> <20031202010649.A27879@electric-eye.fr.zoreil.com> <3FD36DFF.4060503@pobox.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="jRHKVT23PllUwdXP" Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <3FD36DFF.4060503@pobox.com>; from jgarzik@pobox.com on Sun, Dec 07, 2003 at 01:14:23PM -0500 X-Organisation: Land of Sunshine Inc. 
X-archive-position: 1928 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev --jRHKVT23PllUwdXP Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Jeff Garzik : [...] > Is there anything left? See attachment for r8169-endianness.patch r8169-getstats.patch (against 2.6.0-test11-bk5 + 2.6.0-test11-bk5-netdrvr-exp1). I diffed the whole thing against pure 2.6.0-test11 as well. It is available at: http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.0-test11-bk5 -- Ueimor --jRHKVT23PllUwdXP Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="r8169-endianness.patch" Endianness update (original idea from Alexandra N. Kossovsky): - descriptors status (bitfields enumerated as _DescStatusBit); - address of buffers stored in Rx/Tx descriptors. drivers/net/r8169.c | 31 ++++++++++++++++--------------- 1 files changed, 16 insertions(+), 15 deletions(-) diff -puN drivers/net/r8169.c~r8169-endianness drivers/net/r8169.c --- linux-2.6.0-test11-bk5/drivers/net/r8169.c~r8169-endianness 2003-12-07 23:26:32.000000000 +0100 +++ linux-2.6.0-test11-bk5-fr/drivers/net/r8169.c 2003-12-07 23:26:32.000000000 +0100 @@ -1116,13 +1116,14 @@ rtl8169_hw_start(struct net_device *dev) static inline void rtl8169_make_unusable_by_asic(struct RxDesc *desc) { desc->buf_addr = 0xdeadbeef; - desc->status &= ~(OWNbit | RsvdMask); + desc->status &= ~cpu_to_le32(OWNbit | RsvdMask); } static void rtl8169_free_rx_skb(struct pci_dev *pdev, struct sk_buff **sk_buff, struct RxDesc *desc) { - pci_unmap_single(pdev, desc->buf_addr, RX_BUF_SIZE, PCI_DMA_FROMDEVICE); + pci_unmap_single(pdev, le32_to_cpu(desc->buf_addr), RX_BUF_SIZE, + PCI_DMA_FROMDEVICE); dev_kfree_skb(*sk_buff); *sk_buff = NULL; rtl8169_make_unusable_by_asic(desc); @@ -1130,13 +1131,13 @@ static void rtl8169_free_rx_skb(struct p static inline void rtl8169_return_to_asic(struct 
RxDesc *desc) { - desc->status |= OWNbit + RX_BUF_SIZE; + desc->status |= cpu_to_le32(OWNbit + RX_BUF_SIZE); } static inline void rtl8169_give_to_asic(struct RxDesc *desc, dma_addr_t mapping) { - desc->buf_addr = mapping; - desc->status |= OWNbit + RX_BUF_SIZE; + desc->buf_addr = cpu_to_le32(mapping); + desc->status |= cpu_to_le32(OWNbit + RX_BUF_SIZE); } static int rtl8169_alloc_rx_skb(struct pci_dev *pdev, struct net_device *dev, @@ -1201,7 +1202,7 @@ static u32 rtl8169_rx_fill(struct rtl816 static inline void rtl8169_mark_as_last_descriptor(struct RxDesc *desc) { - desc->status |= EORbit; + desc->status |= cpu_to_le32(EORbit); } static int rtl8169_init_ring(struct net_device *dev) @@ -1233,8 +1234,8 @@ static void rtl8169_unmap_tx_skb(struct { u32 len = sk_buff[0]->len; - pci_unmap_single(pdev, desc->buf_addr, len < ETH_ZLEN ? ETH_ZLEN : len, - PCI_DMA_TODEVICE); + pci_unmap_single(pdev, le32_to_cpu(desc->buf_addr), + len < ETH_ZLEN ? ETH_ZLEN : len, PCI_DMA_TODEVICE); desc->buf_addr = 0x00; *sk_buff = NULL; } @@ -1300,17 +1301,17 @@ rtl8169_start_xmit(struct sk_buff *skb, spin_lock_irq(&tp->lock); - if ((tp->TxDescArray[entry].status & OWNbit) == 0) { + if (!(le32_to_cpu(tp->TxDescArray[entry].status) & OWNbit)) { dma_addr_t mapping; mapping = pci_map_single(tp->pci_dev, skb->data, len, PCI_DMA_TODEVICE); tp->Tx_skbuff[entry] = skb; - tp->TxDescArray[entry].buf_addr = mapping; + tp->TxDescArray[entry].buf_addr = cpu_to_le32(mapping); - tp->TxDescArray[entry].status = OWNbit | FSbit | LSbit | len | - (EORbit * !((entry + 1) % NUM_TX_DESC)); + tp->TxDescArray[entry].status = cpu_to_le32(OWNbit | FSbit | + LSbit | len | (EORbit * !((entry + 1) % NUM_TX_DESC))); RTL_W8(TxPoll, 0x40); //set polling bit @@ -1351,7 +1352,7 @@ rtl8169_tx_interrupt(struct net_device * tx_left = tp->cur_tx - dirty_tx; while (tx_left > 0) { - if ((tp->TxDescArray[entry].status & OWNbit) == 0) { + if (!(le32_to_cpu(tp->TxDescArray[entry].status) & OWNbit)) { int cur = dirty_tx % 
NUM_TX_DESC; struct sk_buff *skb = tp->Tx_skbuff[cur]; @@ -1409,8 +1410,8 @@ rtl8169_rx_interrupt(struct net_device * cur_rx = tp->cur_rx % RX_BUF_SIZE; - while ((tp->RxDescArray[cur_rx].status & OWNbit) == 0) { - u32 status = tp->RxDescArray[cur_rx].status; + while (!(le32_to_cpu(tp->RxDescArray[cur_rx].status) & OWNbit)) { + u32 status = le32_to_cpu(tp->RxDescArray[cur_rx].status); if (status & RxRES) { printk(KERN_INFO "%s: Rx ERROR!!!\n", dev->name); _ --jRHKVT23PllUwdXP Content-Type: text/plain; charset=unknown-8bit Content-Disposition: attachment; filename="r8169-getstats.patch" Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by fr.zoreil.com id hB7N1YsW030321

Stats fix (Fernando Alencar Maróstica).

 drivers/net/r8169.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+)

diff -puN drivers/net/r8169.c~r8169-getstats drivers/net/r8169.c
--- linux-2.6.0-test11-bk5/drivers/net/r8169.c~r8169-getstats	2003-12-07 23:26:37.000000000 +0100
+++ linux-2.6.0-test11-bk5-fr/drivers/net/r8169.c	2003-12-07 23:26:37.000000000 +0100
@@ -1610,11 +1610,26 @@ rtl8169_set_rx_mode(struct net_device *d
 	spin_unlock_irqrestore(&tp->lock, flags);
 }
 
+/**
+ *  rtl8169_get_stats - Get rtl8169 read/write statistics
+ *  @dev: The Ethernet Device to get statistics for
+ *
+ *  Get TX/RX statistics for rtl8169
+ */
 struct net_device_stats *
 rtl8169_get_stats(struct net_device *dev)
 {
 	struct rtl8169_private *tp = dev->priv;
+	void *ioaddr = tp->mmio_addr;
+	unsigned long flags;
 
+	if (netif_running(dev)) {
+		spin_lock_irqsave(&tp->lock, flags);
+		tp->stats.rx_missed_errors += RTL_R32(RxMissed);
+		RTL_W32(RxMissed, 0);
+		spin_unlock_irqrestore(&tp->lock, flags);
+	}
+	
 	return &tp->stats;
 }
 
_

--jRHKVT23PllUwdXP-- From davem@pizda.ninka.net Sun Dec 7 15:30:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 15:30:55 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) 
by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB7NUNTa024254 for ; Sun, 7 Dec 2003 15:30:23 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id PAA04091; Sun, 7 Dec 2003 15:29:31 -0800 Date: Sun, 7 Dec 2003 15:29:31 -0800 From: "David S. Miller" To: devik Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org, daniel.blueman@gmx.net Subject: Re: HTB 2.6.0-test10 oops related patch Message-Id: <20031207152931.33e8c3fd.davem@redhat.com> In-Reply-To: References: <20031206231047.01257548.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1929 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 7 Dec 2003 12:30:20 +0100 (CET) devik wrote: > I just spotted the second bug too. (For those interested it is described > in the patch). > Please revert the small patch I sent you yesterday before applying the > new one. It contains yesterday's one too (it is because it is for 2.4 too). > Please apply it against both 2.4.23 and 2.6.0-test10. It should apply > cleanly in 2.4.23 case and with some offsets to 2.6.0-test10. Great work, applied. Thanks. 
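Editorial note on the r8169-endianness.patch posted earlier in this digest: the patch wraps every descriptor field shared with the NIC in cpu_to_le32()/le32_to_cpu(), so the in-memory layout stays little-endian on any host. The following is only a userspace sketch of that idea, not driver code. The helpers to_le32()/from_le32(), struct desc, OWN_BIT and the two descriptor functions are hypothetical stand-ins for the kernel macros and the driver's RxDesc/OWNbit handling.

```c
#include <stdint.h>
#include <string.h>

#define OWN_BIT 0x80000000u	/* hypothetical ownership flag (bit 31) */

/* Hypothetical stand-in for cpu_to_le32(): return a value whose bytes in
 * memory are the little-endian encoding of v, whatever the host order is. */
static uint32_t to_le32(uint32_t v)
{
	unsigned char b[4];
	uint32_t out;

	b[0] = (unsigned char)(v & 0xff);
	b[1] = (unsigned char)((v >> 8) & 0xff);
	b[2] = (unsigned char)((v >> 16) & 0xff);
	b[3] = (unsigned char)((v >> 24) & 0xff);
	memcpy(&out, b, sizeof(out));
	return out;
}

/* Hypothetical stand-in for le32_to_cpu(): decode little-endian bytes. */
static uint32_t from_le32(uint32_t v)
{
	unsigned char b[4];

	memcpy(b, &v, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Simplified descriptor shared with the device; fields hold LE values. */
struct desc {
	uint32_t status;
	uint32_t buf_addr;
};

/* Hand a buffer to the device: convert on every store. */
static void give_to_device(struct desc *d, uint32_t addr, uint32_t len)
{
	d->buf_addr = to_le32(addr);
	d->status |= to_le32(OWN_BIT + len);
}

/* Test ownership: convert on every load before masking. */
static int owned_by_device(const struct desc *d)
{
	return (from_le32(d->status) & OWN_BIT) != 0;
}
```

On a little-endian host both helpers reduce to the identity, which is why the unconverted driver appeared to work on x86; the conversions only change behaviour on big-endian machines.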
From jgarzik@pobox.com Sun Dec 7 16:00:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 16:01:12 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB800uTa024925 for ; Sun, 7 Dec 2003 16:00:57 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:41102 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1AT8pa-0005ps-E3; Mon, 08 Dec 2003 00:00:54 +0000 Message-ID: <3FD3BF2B.50106@pobox.com> Date: Sun, 07 Dec 2003 19:00:43 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Francois Romieu CC: netdev@oss.sgi.com, =?ISO-8859-1?Q?Fernando_Alencar_Mar=F3stica?= , Brad House Subject: Re: [PATCH 2.6] 2.6.0-test11-bk5 - rtl8169 References: <1070212415.1607.17.camel@oxygenium> <20031201020453.A16405@electric-eye.fr.zoreil.com> <20031202010649.A27879@electric-eye.fr.zoreil.com> <3FD36DFF.4060503@pobox.com> <20031208000132.A29834@electric-eye.fr.zoreil.com> In-Reply-To: <20031208000132.A29834@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1930 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev thanks, applied both -endianness and -getstats patches. From davem@pizda.ninka.net Sun Dec 7 16:56:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Dec 2003 16:56:24 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB80uBTa029022 for ; Sun, 7 Dec 2003 16:56:11 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA04287; Sun, 7 Dec 2003 16:55:16 -0800 Date: Sun, 7 Dec 2003 16:55:16 -0800 From: "David S. 
Miller" To: Herbert Xu Cc: kuznet@ms2.inr.ac.ru, jmorris@redhat.com, netdev@oss.sgi.com Subject: Re: [IPSEC] Move hardware headers for decaped packets Message-Id: <20031207165516.1edfde10.davem@redhat.com> In-Reply-To: <20031207095527.GA2767@gondor.apana.org.au> References: <20030925121131.GA17968@gondor.apana.org.au> <200309251228.QAA11650@yakov.inr.ac.ru> <20030925124102.GA18188@gondor.apana.org.au> <20031207090205.GA2358@gondor.apana.org.au> <20031207014714.03e79cd2.davem@redhat.com> <20031207095527.GA2767@gondor.apana.org.au> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1931 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 7 Dec 2003 20:55:27 +1100 Herbert Xu wrote: > On Sun, Dec 07, 2003 at 01:47:14AM -0800, David S. Miller wrote: > > > > Ok, I'll review this again. Did you audit the tree to make sure you > > updated mac_len everywhere that 'skb->mac.*' is modified? > > To be honest, no. The reason is that my use of mac_len starts from > netif_receive_skb and ends just before the reentrance into netif_rx > in xfrm[46]_input. > > At the start of that path, mac_len is initialised from a value that > we know to be correct. I have also verified that within the path, > nobody expands/contracts the MAC header. Ok, we can make it a receive only thing at first, I guess. But I think this will definitely need to be deferred to 2.6.1 at least, along with that XFRM device unload fix you sent me the other week. Linus really only wants the obvious one-liners at this point for 2.6.0 Thanks. 
From chrisw@osdl.org Mon Dec 8 20:24:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Dec 2003 20:24:39 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB94OQTa009796 for ; Mon, 8 Dec 2003 20:24:26 -0800 Received: from build.pdx.osdl.net (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id hB94OEZ25178; Mon, 8 Dec 2003 20:24:14 -0800 Received: (from chrisw@localhost) by build.pdx.osdl.net (8.11.6/8.11.6) id hB94OEe31987; Mon, 8 Dec 2003 20:24:14 -0800 Date: Mon, 8 Dec 2003 20:24:14 -0800 From: Chris Wright To: netdev@oss.sgi.com Cc: davem@redhat.com, shemminger@osdl.org Subject: [PATCH 2/4] irda check error on memcpy_fromiovec Message-ID: <20031208202414.D30587@build.pdx.osdl.net> References: <20031208202302.C30587@build.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20031208202302.C30587@build.pdx.osdl.net>; from chrisw@osdl.org on Mon, Dec 08, 2003 at 08:23:02PM -0800 X-archive-position: 1933 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chrisw@osdl.org Precedence: bulk X-list: netdev Check the return value on memcpy_fromiovec(). 
===== net/irda/af_irda.c 1.47 vs edited ===== --- 1.47/net/irda/af_irda.c Tue Nov 4 16:29:43 2003 +++ edited/net/irda/af_irda.c Fri Dec 5 17:29:04 2003 @@ -1307,7 +1307,11 @@ skb_reserve(skb, self->max_header_size + 16); asmptr = skb->h.raw = skb_put(skb, len); - memcpy_fromiovec(asmptr, msg->msg_iov, len); + err = memcpy_fromiovec(asmptr, msg->msg_iov, len); + if (err) { + kfree_skb(skb); + return err; + } /* * Just send the message to TinyTP, and let it deal with possible @@ -1549,7 +1553,11 @@ IRDA_DEBUG(4, "%s(), appending user data\n", __FUNCTION__); asmptr = skb->h.raw = skb_put(skb, len); - memcpy_fromiovec(asmptr, msg->msg_iov, len); + err = memcpy_fromiovec(asmptr, msg->msg_iov, len); + if (err) { + kfree_skb(skb); + return err; + } /* * Just send the message to TinyTP, and let it deal with possible @@ -1612,7 +1620,11 @@ IRDA_DEBUG(4, "%s(), appending user data\n", __FUNCTION__); asmptr = skb->h.raw = skb_put(skb, len); - memcpy_fromiovec(asmptr, msg->msg_iov, len); + err = memcpy_fromiovec(asmptr, msg->msg_iov, len); + if (err) { + kfree_skb(skb); + return err; + } err = irlmp_connless_data_request(self->lsap, skb); if (err) { From chrisw@osdl.org Mon Dec 8 20:23:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Dec 2003 20:23:29 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB94NFTa009697 for ; Mon, 8 Dec 2003 20:23:16 -0800 Received: from build.pdx.osdl.net (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id hB94N3Z25095; Mon, 8 Dec 2003 20:23:03 -0800 Received: (from chrisw@localhost) by build.pdx.osdl.net (8.11.6/8.11.6) id hB94N2c31959; Mon, 8 Dec 2003 20:23:02 -0800 Date: Mon, 8 Dec 2003 20:23:02 -0800 From: Chris Wright To: netdev@oss.sgi.com Cc: davem@redhat.com, shemminger@osdl.org Subject: [PATCH 1/4] ax25 check error on memcpy_fromiovec Message-ID: <20031208202302.C30587@build.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i X-archive-position: 1932 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chrisw@osdl.org Precedence: bulk X-list: netdev Check the return value on memcpy_fromiovec(). ===== net/ax25/af_ax25.c 1.33 vs edited ===== --- 1.33/net/ax25/af_ax25.c Tue Oct 7 06:27:14 2003 +++ edited/net/ax25/af_ax25.c Fri Dec 5 16:43:44 2003 @@ -1534,7 +1534,12 @@ SOCK_DEBUG(sk, "AX.25: Appending user data\n"); /* User data follows immediately after the AX.25 data */ - memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len); + if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) { + err = -EFAULT; + kfree_skb(skb); + goto out; + } + skb->nh.raw = skb->data; /* Add the PID if one is not supplied by the user in the skb */ From chrisw@osdl.org Mon Dec 8 20:25:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Dec 2003 20:25:42 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB94PTTa010005 for ; Mon, 8 Dec 2003 20:25:29 -0800 Received: from build.pdx.osdl.net (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id hB94PHZ25326; Mon, 8 Dec 2003 20:25:17 -0800 Received: (from chrisw@localhost) by build.pdx.osdl.net (8.11.6/8.11.6) id hB94PHI32016; Mon, 8 Dec 2003 20:25:17 -0800 Date: Mon, 8 Dec 2003 20:25:17 -0800 From: Chris Wright To: netdev@oss.sgi.com Cc: davem@redhat.com, shemminger@osdl.org Subject: [PATCH 3/4] netrom check error on memcpy_fromiovec Message-ID: <20031208202517.E30587@build.pdx.osdl.net> References: <20031208202302.C30587@build.pdx.osdl.net> <20031208202414.D30587@build.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20031208202414.D30587@build.pdx.osdl.net>; from chrisw@osdl.org on Mon, Dec 08, 2003 at 08:24:14PM -0800 
X-archive-position: 1934 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chrisw@osdl.org Precedence: bulk X-list: netdev Check the return value on memcpy_fromiovec(). ===== net/netrom/af_netrom.c 1.40 vs edited ===== --- 1.40/net/netrom/af_netrom.c Sun Sep 21 18:17:31 2003 +++ edited/net/netrom/af_netrom.c Fri Dec 5 16:45:14 2003 @@ -1094,7 +1094,12 @@ SOCK_DEBUG(sk, "NET/ROM: Appending user data\n"); /* User data follows immediately after the NET/ROM transport header */ - memcpy_fromiovec(asmptr, msg->msg_iov, len); + if (memcpy_fromiovec(asmptr, msg->msg_iov, len)) { + kfree_skb(skb); + err = -EFAULT; + goto out; + } + SOCK_DEBUG(sk, "NET/ROM: Transmitting buffer\n"); if (sk->sk_state != TCP_ESTABLISHED) { From chrisw@osdl.org Mon Dec 8 20:27:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Dec 2003 20:27:54 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB94RfTa010641 for ; Mon, 8 Dec 2003 20:27:41 -0800 Received: from build.pdx.osdl.net (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id hB94RTZ25815; Mon, 8 Dec 2003 20:27:29 -0800 Received: (from chrisw@localhost) by build.pdx.osdl.net (8.11.6/8.11.6) id hB94RTn32051; Mon, 8 Dec 2003 20:27:29 -0800 Date: Mon, 8 Dec 2003 20:27:29 -0800 From: Chris Wright To: netdev@oss.sgi.com Cc: davem@redhat.com, shemminger@osdl.org Subject: [PATCH 4/4] rose check error on memcpy_fromiovec Message-ID: <20031208202729.F30587@build.pdx.osdl.net> References: <20031208202302.C30587@build.pdx.osdl.net> <20031208202414.D30587@build.pdx.osdl.net> <20031208202517.E30587@build.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20031208202517.E30587@build.pdx.osdl.net>; from chrisw@osdl.org on Mon, Dec 08, 2003 at 08:25:17PM -0800 X-archive-position: 1935 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chrisw@osdl.org Precedence: bulk X-list: netdev Check the return value on memcpy_fromiovec(). ===== net/rose/af_rose.c 1.33 vs edited ===== --- 1.33/net/rose/af_rose.c Tue Sep 9 11:22:56 2003 +++ edited/net/rose/af_rose.c Mon Dec 8 16:58:33 2003 @@ -1079,7 +1079,11 @@ asmptr = skb->h.raw = skb_put(skb, len); - memcpy_fromiovec(asmptr, msg->msg_iov, len); + err = memcpy_fromiovec(asmptr, msg->msg_iov, len); + if (err) { + kfree_skb(skb); + return err; + } /* * If the Q BIT Include socket option is in force, the first From davem@pizda.ninka.net Mon Dec 8 21:40:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Dec 2003 21:40:20 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB95e2Ta017474 for ; Mon, 8 Dec 2003 21:40:04 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id VAA20381; Mon, 8 Dec 2003 21:39:04 -0800 Date: Mon, 8 Dec 2003 21:39:03 -0800 From: "David S. Miller" To: Chris Wright Cc: netdev@oss.sgi.com, shemminger@osdl.org Subject: Re: [PATCH 1/4] ax25 check error on memcpy_fromiovec Message-Id: <20031208213903.0617f289.davem@redhat.com> In-Reply-To: <20031208202302.C30587@build.pdx.osdl.net> References: <20031208202302.C30587@build.pdx.osdl.net> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1936 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 8 Dec 2003 20:23:02 -0800 Chris Wright wrote: > Check the return value on memcpy_fromiovec(). 
I'm fine with all of these patches, but since these problems are not super-critical can you resend these for 2.6.1 perhaps? From chrisw@osdl.org Mon Dec 8 21:41:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Dec 2003 21:41:48 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB95fZTa017633 for ; Mon, 8 Dec 2003 21:41:35 -0800 Received: from build.pdx.osdl.net (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id hB95fTZ03870; Mon, 8 Dec 2003 21:41:29 -0800 Received: (from chrisw@localhost) by build.pdx.osdl.net (8.11.6/8.11.6) id hB95fTo01212; Mon, 8 Dec 2003 21:41:29 -0800 Date: Mon, 8 Dec 2003 21:41:29 -0800 From: Chris Wright To: "David S. Miller" Cc: Chris Wright , netdev@oss.sgi.com, shemminger@osdl.org Subject: Re: [PATCH 1/4] ax25 check error on memcpy_fromiovec Message-ID: <20031208214129.H30587@build.pdx.osdl.net> References: <20031208202302.C30587@build.pdx.osdl.net> <20031208213903.0617f289.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20031208213903.0617f289.davem@redhat.com>; from davem@redhat.com on Mon, Dec 08, 2003 at 09:39:03PM -0800 X-archive-position: 1937 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chrisw@osdl.org Precedence: bulk X-list: netdev * David S. Miller (davem@redhat.com) wrote: > I'm fine with all of these patches, but since these problems > are not super-critical can you resend these for 2.6.1 perhaps? Sure, no problem. 
thanks, -chris -- Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net From davem@pizda.ninka.net Mon Dec 8 21:57:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Dec 2003 21:58:10 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB95vvTa022508 for ; Mon, 8 Dec 2003 21:57:57 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id VAA20480; Mon, 8 Dec 2003 21:56:57 -0800 Date: Mon, 8 Dec 2003 21:56:57 -0800 From: "David S. Miller" To: Xose Vazquez Perez Cc: jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [PATCH] tg3, more IDs Message-Id: <20031208215657.550ba7e0.davem@redhat.com> In-Reply-To: <3FD219FA.5010908@wanadoo.es> References: <3FD20895.2020605@wanadoo.es> <3FD219FA.5010908@wanadoo.es> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1938 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sat, 06 Dec 2003 19:03:38 +0100 Xose Vazquez Perez wrote: > forget my last patch, here goes one more complete: Applied, but please next time remember to edit drivers/pci/pci.ids as well when adding new PCI device ids. 
From igor@veo.com Mon Dec 8 22:10:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Dec 2003 22:11:13 -0800 (PST) Received: from mars.xirlink.com (mars.veo.com [208.238.223.170]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB96AuTa013308 for ; Mon, 8 Dec 2003 22:10:56 -0800 Received: from jaguar.veo.com (jaguar [208.238.223.200]) by mars.xirlink.com (8.12.8/8.12.4) with SMTP id hB968JVf004081; Mon, 8 Dec 2003 22:08:20 -0800 (PST) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: kernel 2.4 TCP stack problem (uClinux, ARM7 platform) x-mimeole: Produced By Microsoft Exchange V6.0.6249.0 content-class: urn:content-classes:message Date: Mon, 8 Dec 2003 22:09:09 -0800 Message-ID: <3A10AC3F9A35D44BBF397F889FAF5FBF096191@server_2.veo.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: kernel 2.4 TCP stack problem (uClinux, ARM7 platform) Thread-Index: AcO+G2gzw0mcTSrxSdexb4llvzHeVQ== From: "Igor Serikov" To: Cc: Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id hB96AuTa013308 X-archive-position: 1939 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: igor@veo.com Precedence: bulk X-list: netdev

Hello,

See the short log below (sequence numbers are relative). The sending side is uClinux based on kernel 2.4 on an ARM7 platform; the receiving side is Mac OS X (changing the receiving side to Windows XP produces the same results). Other important factors are a Fast Ethernet network and a long-lived connection (the trouble usually happens within 1 hour). The data source is a network camera that sends a continuous MIME stream. At a certain moment the Linux TCP stack starts sending empty TCP packets instead of sending the data. The connection does not die.

Thanks,
Igor.

12:36:08.465357 camera > PC: . 294537538:294538986(1448) ack 1 win 5792 (DF)
12:36:08.466761 camera > PC: . 294538986:294540434(1448) ack 1 win 5792 (DF)
12:36:08.466913 PC > camera: . ack 294540434 win 33304 (DF)
12:36:08.468852 camera > PC: . 294540434:294541882(1448) ack 1 win 5792 (DF)
12:36:08.470251 camera > PC: . 294541882:294543330(1448) ack 1 win 5792 (DF)
12:36:08.470406 PC > camera: . ack 294543330 win 33304 (DF)
12:36:08.473466 camera > PC: . 294543330:294544778(1448) ack 1 win 5792 (DF)
12:36:08.474997 camera > PC: . 294544778:294546226(1448) ack 1 win 5792 (DF)
12:36:08.475164 PC > camera: . ack 294546226 win 33304 (DF)
12:36:08.476912 camera > PC: P 294546226:294546309(83) ack 1 win 5792 (DF)
12:36:08.495083 camera > PC: . 294546309:294547757(1448) ack 1 win 5792 (DF)
12:36:08.496490 camera > PC: . 294547757:294549205(1448) ack 1 win 5792 (DF)
12:36:08.496659 PC > camera: . ack 294549205 win 33304 (DF)
12:36:08.498551 camera > PC: . 294549205:294550653(1448) ack 1 win 5792 (DF)
12:36:08.499974 camera > PC: . 294550653:294552101(1448) ack 1 win 5792 (DF)
12:36:08.500128 PC > camera: . ack 294552101 win 33304 (DF)
12:36:08.502213 camera > PC: . 294552101:294553549(1448) ack 1 win 5792 (DF)
12:36:08.504127 camera > PC: . 294553549:294554997(1448) ack 1 win 5792 (DF)
12:36:08.504323 PC > camera: . ack 294554997 win 33304 (DF)
12:36:08.506119 camera > PC: . 294554997:294556445(1448) ack 1 win 5792 (DF)
12:36:08.507675 camera > PC: . 294556445:294557893(1448) ack 1 win 5792 (DF)
12:36:08.508772 camera > PC: P 294557893:294557920(27) ack 1 win 5792 (DF)
12:36:08.510448 PC > camera: . ack 294557920 win 33304 (DF)
12:36:08.525111 camera > PC: P 294557920:294558018(98) ack 1 win 5792 (DF)
12:36:08.529038 camera > PC: . 294558018:294559466(1448) ack 1 win 5792 (DF)
12:36:08.530311 camera > PC: . 294559466:294560914(1448) ack 1 win 5792 (DF)
12:36:08.530538 PC > camera: . ack 294560914 win 33304 (DF)
12:36:08.532702 camera > PC: . 294560914:294562362(1448) ack 1 win 5792 (DF)
12:36:08.534611 camera > PC: . 294562362:294563810(1448) ack 1 win 5792 (DF)
12:36:08.534775 PC > camera: . ack 294563810 win 33304 (DF)
12:36:08.536522 camera > PC: . 294563810:294565258(1448) ack 1 win 5792 (DF)
12:36:08.537781 camera > PC: . 294565258:294566706(1448) ack 1 win 5792 (DF)
12:36:08.537999 PC > camera: . ack 294566706 win 33304 (DF)
12:36:08.539715 camera > PC: . 294566706:294568154(1448) ack 1 win 5792 (DF)
12:36:08.541633 camera > PC: P 294568154:294569589(1435) ack 1 win 5792 (DF)
12:36:08.594664 PC > camera: . ack 294569589 win 33304 (DF)
12:36:08.842150 camera > PC: . ack 1 win 5792 (DF)
12:36:08.842302 PC > camera: . ack 294569589 win 33304 (DF)
12:36:09.092104 camera > PC: . ack 1 win 5792 (DF)
12:36:09.092187 PC > camera: . ack 294569589 win 33304 (DF)
12:36:09.342105 camera > PC: . ack 1 win 5792 (DF)

From rask@sygehus.dk Tue Dec 9 06:15:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 06:15:43 -0800 (PST) Received: from 0x50a14495.albnxx15.adsl-dhcp.tele.dk (0x50a14495.albnxx15.adsl-dhcp.tele.dk [80.161.68.149]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9EFKTa024002 for ; Tue, 9 Dec 2003 06:15:28 -0800 Received: by 0x50a14495.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id 3DAA527774; Tue, 9 Dec 2003 15:15:04 +0100 (CET) Date: Tue, 9 Dec 2003 15:15:02 +0100 From: Rask Ingemann Lambertsen To: "David S.
Miller" Cc: jt@hpl.hp.com, jt@bougret.hpl.hp.com, linux-kernel@vger.kernel.org, tsbogend@alpha.franken.de, jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [BUG 2.6.0-test11] pcnet32 oops Message-ID: <20031209151459.A1345@sygehus.dk> References: <20031205234510.GA2319@bougret.hpl.hp.com> <20031205165900.2920ea6a.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20031205165900.2920ea6a.davem@redhat.com>; from davem@redhat.com on Fri, Dec 05, 2003 at 04:59:00PM -0800 X-archive-position: 1941 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rask@sygehus.dk Precedence: bulk X-list: netdev Content-Length: 1259 Lines: 31

On Fri, Dec 05, 2003 at 04:59:00PM -0800, David S. Miller wrote:

> This is the classic case of doing disabling/enabling of software
> interrupts with hardware interrupts disabled, which is a bug.
>
> In this case pcnet32_set_multicast_list() is disabling hardware
> interrupts, and the packet freeing of pcnet32_purge_tx_ring()
> is what leads to the software interrupt disable/enable.

I think the root cause of this problem is that pcnet32_set_multicast_list() dumps the entire TX ring on the floor (as a side effect of calling pcnet32_restart()). I don't think dev->set_multicast_list() is supposed to do that.

> However, I'm inclined to believe that we should change dev_kfree_skb_any()
> to fix this class of problems, by making it check for hardware interrupts
> being disabled as well as being in an interrupt.

I've been wondering about this too with the recent netpoll patches. Many (including pcnet32) implement the poll controller simply as

	disable_irq (dev->irq);
	driver_interrupt_handler (dev->irq, dev, NULL);
	enable_irq (dev->irq);

If the interrupt handler calls dev_kfree_skb_any(), could you then run into this kind of problem? Or is it just if you call spin_lock_irq*() that you have a problem?

--
Regards,
Rask Ingemann Lambertsen

From rask@sygehus.dk Tue Dec 9 06:39:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 06:39:40 -0800 (PST) Received: from 0x50a14495.albnxx15.adsl-dhcp.tele.dk (0x50a14495.albnxx15.adsl-dhcp.tele.dk [80.161.68.149]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9EdOTa024788 for ; Tue, 9 Dec 2003 06:39:25 -0800 Received: by 0x50a14495.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id 86D6C1F5ED; Tue, 9 Dec 2003 15:39:22 +0100 (CET) Date: Tue, 9 Dec 2003 15:39:22 +0100 From: Rask Ingemann Lambertsen To: netdev@oss.sgi.com Cc: jgarzik@pobox.com Subject: [EXPERIMENTAL PATCH] 2.6 de2104x.c jumbo frames Message-ID: <20031209153922.C1345@sygehus.dk> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="uAKRQypu60I7Lcqm" Content-Disposition: inline User-Agent: Mutt/1.2.5.1i X-archive-position: 1942 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rask@sygehus.dk Precedence: bulk X-list: netdev Content-Length: 3456 Lines: 117

--uAKRQypu60I7Lcqm
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Attached is a patch which enables jumbo frames on de2104x based boards. It makes it possible to set an MTU of up to 2025 bytes. However, testing against an NE3200 board using 10base-2 shows assorted problems when going above 2018 bytes (for frames without VLAN tags). The problems show up as framing errors, bad CRCs and such (in both directions).

The patch is against 2.6.0-test10 with 2.6.0-test9-bk24-netdrvr-exp1 on top of it.
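The 2025-byte limit follows from the constants in the attached patch: PKT_BUF_SZ_MAX - VLAN_ETH_HLEN - 4 = 2047 - 18 - 4. As a minimal userspace sketch (not driver code), the MTU bound and the long-frame status hack can be modelled as below. The error-bit values are copied from the tulip.h hunk of the 2.4 tulip patch and are only assumed to match what de2104x.c defines; VLAN_ETH_HLEN is 18 as in linux/if_vlan.h; reading the trailing "- 4" as the frame check sequence is an assumption; the helper names sketch_max_mtu(), sketch_mtu_ok(), sketch_status_hacked() and sketch_rx_ok() are hypothetical.

```c
#include <stdint.h>

#define PKT_BUF_SZ_MAX  2047   /* maximum Rx buffer size, from the patch */
#define VLAN_ETH_HLEN   18     /* Ethernet header plus one VLAN tag */
#define SKETCH_FCS_LEN  4      /* assumption: the "- 4" in the patch is the FCS */

/* Rx descriptor status bits, values as in the 2.4 tulip.h hunk. */
#define RxError     (1u << 15)
#define RxErrFrame  (1u << 14)
#define RxErrRunt   (1u << 11)
#define RxErrLong   (1u << 7)
#define RxErrCRC    (1u << 1)
#define RxErrFIFO   (1u << 0)

/* Largest MTU the change_mtu handlers in the patches accept. */
static int sketch_max_mtu(void)
{
    return PKT_BUF_SZ_MAX - VLAN_ETH_HLEN - SKETCH_FCS_LEN;
}

/* Same range check as de_change_mtu()/tulip_change_mtu(). */
static int sketch_mtu_ok(int mtu)
{
    return mtu >= 0 && mtu <= sketch_max_mtu();
}

/*
 * If "frame too long" is the only specific error reported, clear it
 * together with the summary error bit; otherwise leave status alone.
 */
static uint32_t sketch_status_hacked(uint32_t status)
{
    const uint32_t errs = RxErrLong | RxErrCRC | RxErrFIFO |
                          RxErrRunt | RxErrFrame;

    if ((status & errs) == RxErrLong)
        return status & ~(RxError | RxErrLong);
    return status;
}

/* The acceptance test the patches apply to the adjusted status word. */
static int sketch_rx_ok(uint32_t status)
{
    return (sketch_status_hacked(status) & 0x38008300u) == 0x0300u;
}
```

In this model a whole-packet status of 0x8380 (summary error plus "too long") is accepted, while 0x8382 (the same frame with a CRC error) is still rejected.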
-- Regards, Rask Ingemann Lambertsen --uAKRQypu60I7Lcqm Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="de2104x-mtu.patch" --- linux-2.6.0-test8/drivers/net/tulip/de2104x.c-orig Tue Oct 21 12:34:44 2003 +++ linux-2.6.0-test8/drivers/net/tulip/de2104x.c Thu Nov 27 01:35:13 2003 @@ -19,7 +19,6 @@ like dl2k.c/sundance.c * Constants (module parms?) for Rx work limit * Complete reset on PciErr - * Jumbo frames / dev->change_mtu * Adjust Rx FIFO threshold and Max Rx DMA burst on Rx FIFO error * Adjust Tx FIFO threshold and Max Tx DMA burst on Tx FIFO error * Implement Tx software interrupt mitigation via @@ -36,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -67,11 +67,13 @@ MODULE_PARM (debug, "i"); MODULE_PARM_DESC (debug, "de2104x bitmapped message enable number"); +#define PKT_BUF_SZ_MAX 2047 /* Maximum Rx buffer size. */ + /* Set the copy breakpoint for the copy-only-tiny-buffer Rx structure. */ #if defined(__alpha__) || defined(__arm__) || defined(__hppa__) \ || defined(__sparc_) || defined(__ia64__) \ || defined(__sh__) || defined(__mips__) -static int rx_copybreak = 1518; +static int rx_copybreak = PKT_BUF_SZ_MAX; #else static int rx_copybreak = 100; #endif @@ -403,7 +410,7 @@ int rc; while (rx_work--) { - u32 status, len; + u32 status, len, status_hacked; dma_addr_t mapping; struct sk_buff *skb, *copy_skb; unsigned copying_skb, buflen; @@ -424,7 +431,13 @@ goto rx_next; } - if (unlikely((status & 0x38008300) != 0x0300)) { + /* Ugly Jumbo frame hack. Remove error flag on long frames. */ + if ((status & (RxErrLong | RxErrCRC | RxErrFIFO | RxErrRunt | RxErrFrame)) == RxErrLong) + status_hacked = status & ~(RxError | RxErrLong); + else + status_hacked = status; + + if (unlikely((status_hacked & 0x38008300) != 0x0300)) { de_rx_err_acct(de, rx_tail, status, len); goto rx_next; } @@ -1372,6 +1390,8 @@ printk(KERN_DEBUG "%s: enabling interface\n", dev->name); de->rx_buf_sz = (dev->mtu <= 1500 ? 
PKT_BUF_SZ : dev->mtu + 32); + if (de->rx_buf_sz > PKT_BUF_SZ_MAX) + de->rx_buf_sz = PKT_BUF_SZ_MAX; rc = de_alloc_rings(de); if (rc) { @@ -1464,6 +1482,18 @@ netif_wake_queue(dev); } +static int de_change_mtu (struct net_device *dev, int mtu) +{ + if (netif_running (dev)) + return (-EBUSY); + + if (mtu < 0 || mtu > PKT_BUF_SZ_MAX - VLAN_ETH_HLEN - 4) + return (-EINVAL); + + dev->mtu = mtu; + return (0); +} + static void __de_get_regs(struct de_private *de, u8 *buf) { int i; @@ -1964,6 +1994,7 @@ dev->ethtool_ops = &de_ethtool_ops; dev->tx_timeout = de_tx_timeout; dev->watchdog_timeo = TX_TIMEOUT; + dev->change_mtu = de_change_mtu; dev->irq = pdev->irq; --uAKRQypu60I7Lcqm-- From rask@sygehus.dk Tue Dec 9 07:06:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 07:06:52 -0800 (PST) Received: from 0x50a14495.albnxx15.adsl-dhcp.tele.dk (0x50a14495.albnxx15.adsl-dhcp.tele.dk [80.161.68.149]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9F6bTa025959 for ; Tue, 9 Dec 2003 07:06:38 -0800 Received: by 0x50a14495.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id D00171F5ED; Tue, 9 Dec 2003 16:06:35 +0100 (CET) Date: Tue, 9 Dec 2003 16:06:35 +0100 From: Rask Ingemann Lambertsen To: netdev@oss.sgi.com Cc: jgarzik@pobox.com Subject: [EXPERIMENTAL PATCH] 2.4 tulip jumbo frames Message-ID: <20031209160632.D1345@sygehus.dk> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="DBIVS5p969aUjpLe" Content-Disposition: inline User-Agent: Mutt/1.2.5.1i X-archive-position: 1943 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rask@sygehus.dk Precedence: bulk X-list: netdev Content-Length: 7564 Lines: 223 --DBIVS5p969aUjpLe Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Attached is a patch which enables jumbo frames on tulip based boards. It makes it possible to set an MTU of up to 2025 bytes. 
However, testing against an NE3200 board using 10base-2 shows assorted problems when going above 2018 bytes (for frames without VLAN tags). The problems show up as framing errors, bad CRCs and such (in both directions). The patch is against 2.4.22-bk45. -- Regards, Rask Ingemann Lambertsen --DBIVS5p969aUjpLe Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="tulip-mtu.patch" --- linux-2.4.22/drivers/net/tulip/tulip.h-orig Mon Dec 1 21:58:11 2003 +++ linux-2.4.22/drivers/net/tulip/tulip.h Mon Dec 1 22:09:14 2003 @@ -190,6 +190,12 @@ enum desc_status_bits { DescOwned = 0x80000000, RxDescFatalErr = 0x8000, RxWholePkt = 0x0300, + RxError = (1 << 15), + RxErrLong = (1 << 7), + RxErrCRC = (1 << 1), + RxErrFIFO = (1 << 0), + RxErrRunt = (1 << 11), + RxErrFrame = (1 << 14), }; @@ -347,6 +353,7 @@ struct tulip_private { struct ring_info tx_buffers[TX_RING_SIZE]; /* The addresses of receive-in-place skbuffs. */ struct ring_info rx_buffers[RX_RING_SIZE]; + uint rx_buf_sz; u16 setup_frame[96]; /* Pseudo-Tx frame to init address table. */ int chip_id; int revision; --- linux-2.4.22/drivers/net/tulip/interrupt.c-orig Mon Dec 1 21:52:10 2003 +++ linux-2.4.22/drivers/net/tulip/interrupt.c Mon Dec 1 22:03:50 2003 @@ -74,11 +74,11 @@ int tulip_refill_rx(struct net_device *d struct sk_buff *skb; dma_addr_t mapping; - skb = tp->rx_buffers[entry].skb = dev_alloc_skb(PKT_BUF_SZ); + skb = tp->rx_buffers[entry].skb = dev_alloc_skb(tp->rx_buf_sz); if (skb == NULL) break; - mapping = pci_map_single(tp->pdev, skb->tail, PKT_BUF_SZ, + mapping = pci_map_single(tp->pdev, skb->tail, tp->rx_buf_sz, PCI_DMA_FROMDEVICE); tp->rx_buffers[entry].mapping = mapping; @@ -122,13 +122,21 @@ static int tulip_rx(struct net_device *d /* If we own the next entry, it is a new packet. Send it up. */ while ( ! 
(tp->rx_ring[entry].status & cpu_to_le32(DescOwned))) { s32 status = le32_to_cpu(tp->rx_ring[entry].status); + s32 status_hacked; if (tulip_debug > 5) printk(KERN_DEBUG "%s: In tulip_rx(), entry %d %8.8x.\n", dev->name, entry, status); if (--rx_work_limit < 0) break; - if ((status & 0x38008300) != 0x0300) { + + /* Ugly Jumbo frame hack. Remove error flag on long frames. */ + if ((status & (RxErrLong | RxErrCRC | RxErrFIFO | RxErrRunt | RxErrFrame)) == RxErrLong) + status_hacked = status & ~(RxError | RxErrLong); + else + status_hacked = status; + + if (unlikely((status_hacked & 0x38008300) != 0x0300)) { if ((status & 0x38000300) != 0x0300) { /* Ingore earlier buffers. */ if ((status & 0xffff) != 0x7fff) { @@ -154,15 +162,6 @@ static int tulip_rx(struct net_device *d short pkt_len = ((status >> 16) & 0x7ff) - 4; struct sk_buff *skb; -#ifndef final_version - if (pkt_len > 1518) { - printk(KERN_WARNING "%s: Bogus packet size of %d (%#x).\n", - dev->name, pkt_len, pkt_len); - pkt_len = 1518; - tp->stats.rx_length_errors++; - } -#endif - #ifdef CONFIG_NET_HW_FLOWCONTROL drop = atomic_read(&netdev_dropping); if (drop) @@ -203,7 +202,7 @@ static int tulip_rx(struct net_device *d #endif pci_unmap_single(tp->pdev, tp->rx_buffers[entry].mapping, - PKT_BUF_SZ, PCI_DMA_FROMDEVICE); + tp->rx_buf_sz, PCI_DMA_FROMDEVICE); tp->rx_buffers[entry].skb = NULL; tp->rx_buffers[entry].mapping = 0; --- linux-2.4.22/drivers/net/tulip/tulip_core.c-orig Mon Dec 1 21:34:32 2003 +++ linux-2.4.22/drivers/net/tulip/tulip_core.c Mon Dec 1 22:10:53 2003 @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -60,11 +61,13 @@ const char * const medianame[32] = { "","","","", "","","","", "","","","Transceiver reset", }; +#define PKT_BUF_SZ_MAX 2047 /* Maximum Rx buffer size. */ + /* Set the copy breakpoint for the copy-only-tiny-buffer Rx structure. 
*/ #if defined(__alpha__) || defined(__arm__) || defined(__hppa__) \ || defined(__sparc_) || defined(__ia64__) \ || defined(__sh__) || defined(__mips__) || defined(__SH5__) -static int rx_copybreak = 1518; +static int rx_copybreak = PKT_BUF_SZ_MAX; #else static int rx_copybreak = 100; #endif @@ -518,9 +521,7 @@ void tulip_xon(struct net_device *dev) static int tulip_open(struct net_device *dev) { -#ifdef CONFIG_NET_HW_FLOWCONTROL struct tulip_private *tp = (struct tulip_private *)dev->priv; -#endif int retval; MOD_INC_USE_COUNT; @@ -529,6 +530,10 @@ tulip_open(struct net_device *dev) return retval; } + tp->rx_buf_sz = (dev->mtu <= 1500 ? PKT_BUF_SZ : dev->mtu + 32); + if (tp->rx_buf_sz > PKT_BUF_SZ_MAX) + tp->rx_buf_sz = PKT_BUF_SZ_MAX; + tulip_init_ring (dev); tulip_up (dev); @@ -673,13 +678,13 @@ static void tulip_init_ring(struct net_d for (i = 0; i < RX_RING_SIZE; i++) { tp->rx_ring[i].status = 0x00000000; - tp->rx_ring[i].length = cpu_to_le32(PKT_BUF_SZ); + tp->rx_ring[i].length = cpu_to_le32(tp->rx_buf_sz); tp->rx_ring[i].buffer2 = cpu_to_le32(tp->rx_ring_dma + sizeof(struct tulip_rx_desc) * (i + 1)); tp->rx_buffers[i].skb = NULL; tp->rx_buffers[i].mapping = 0; } /* Mark the last entry as wrapping the ring. */ - tp->rx_ring[i-1].length = cpu_to_le32(PKT_BUF_SZ | DESC_RING_WRAP); + tp->rx_ring[i-1].length = cpu_to_le32(tp->rx_buf_sz | DESC_RING_WRAP); tp->rx_ring[i-1].buffer2 = cpu_to_le32(tp->rx_ring_dma); for (i = 0; i < RX_RING_SIZE; i++) { @@ -688,12 +693,12 @@ static void tulip_init_ring(struct net_d /* Note the receive buffer must be longword aligned. dev_alloc_skb() provides 16 byte alignment. But do *not* use skb_reserve() to align the IP header! 
*/ - struct sk_buff *skb = dev_alloc_skb(PKT_BUF_SZ); + struct sk_buff *skb = dev_alloc_skb(tp->rx_buf_sz); tp->rx_buffers[i].skb = skb; if (skb == NULL) break; mapping = pci_map_single(tp->pdev, skb->tail, - PKT_BUF_SZ, PCI_DMA_FROMDEVICE); + tp->rx_buf_sz, PCI_DMA_FROMDEVICE); tp->rx_buffers[i].mapping = mapping; skb->dev = dev; /* Mark as being used by this device. */ tp->rx_ring[i].status = cpu_to_le32(DescOwned); /* Owned by Tulip chip */ @@ -876,7 +881,7 @@ static int tulip_close (struct net_devic tp->rx_ring[i].length = 0; tp->rx_ring[i].buffer1 = 0xBADF00D0; /* An invalid address. */ if (skb) { - pci_unmap_single(tp->pdev, mapping, PKT_BUF_SZ, + pci_unmap_single(tp->pdev, mapping, tp->rx_buf_sz, PCI_DMA_FROMDEVICE); dev_kfree_skb (skb); } @@ -1321,6 +1326,18 @@ out: } #endif +static int tulip_change_mtu (struct net_device *dev, int mtu) +{ + if (netif_running (dev)) + return (-EBUSY); + + if (mtu < 0 || mtu > PKT_BUF_SZ_MAX - VLAN_ETH_HLEN - 4) + return (-EINVAL); + + dev->mtu = mtu; + return (0); +} + static int __devinit tulip_init_one (struct pci_dev *pdev, const struct pci_device_id *ent) { @@ -1727,6 +1744,7 @@ static int __devinit tulip_init_one (str dev->get_stats = tulip_get_stats; dev->do_ioctl = private_ioctl; dev->set_multicast_list = set_rx_mode; + dev->change_mtu = tulip_change_mtu; if (register_netdev(dev)) goto err_out_free_ring; --DBIVS5p969aUjpLe-- From Robert.Olsson@data.slu.se Tue Dec 9 07:09:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 07:09:26 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9F9BTa026371 for ; Tue, 9 Dec 2003 07:09:12 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.9.3+/8.9.3) with ESMTP id QAA01545; Tue, 9 Dec 2003 16:09:00 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 90C9EEC23B; Tue, 9 Dec 2003 16:09:07 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: 
text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16341.58771.558850.163216@robur.slu.se> Date: Tue, 9 Dec 2003 16:09:07 +0100 To: davem@redhat.com Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, Robert.Olsson@data.slu.se Subject: [PATCH] option for large routing hash X-Mailer: VM 7.17 under Emacs 21.3.1 X-archive-position: 1944 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 1248 Lines: 44 Hello! I think patch should be useful as it helps performance a lot during high flow load. I have some numbers if you are interested. Cheers. --ro --- net/ipv4/Kconfig.orig 2003-12-09 14:00:14.000000000 +0100 +++ net/ipv4/Kconfig 2003-12-09 14:51:27.000000000 +0100 @@ -75,6 +75,19 @@ If unsure, say N. +config IP_ROUTE_LARGE_HASH + bool "IP: large routing hash" + ---help--- + Say yes to increase the number of dst hash buckets. Most likely you + are running a server or router with lots of flows. Saying yes + will increase used dst hash 8 times typically from 16Kbytes to 128 + Kbytes on a 256Mbytes system. + + dst hash is otherwise tunable via /proc/sys/net/ipv4/route/ + Monitoring can be done with rtstat utility this comes with iproute2. + + If unsure, say N. 
+ config IP_ROUTE_FWMARK bool "IP: use netfilter MARK value as routing key" depends on IP_MULTIPLE_TABLES && NETFILTER --- net/ipv4/route.c.orig 2003-12-09 13:59:57.000000000 +0100 +++ net/ipv4/route.c 2003-12-09 14:49:48.000000000 +0100 @@ -2747,6 +2747,9 @@ goal = num_physpages >> (26 - PAGE_SHIFT); +#ifdef CONFIG_IP_ROUTE_LARGE_HASH + goal <<= 3; +#endif for (order = 0; (1UL << order) < goal; order++) /* NOTHING */; From rask@sygehus.dk Tue Dec 9 07:20:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 07:20:22 -0800 (PST) Received: from 0x50a14495.albnxx15.adsl-dhcp.tele.dk (0x50a14495.albnxx15.adsl-dhcp.tele.dk [80.161.68.149]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9FK5Ta026869 for ; Tue, 9 Dec 2003 07:20:08 -0800 Received: by 0x50a14495.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id 5C86A277C8; Tue, 9 Dec 2003 16:20:04 +0100 (CET) Date: Tue, 9 Dec 2003 16:20:03 +0100 From: Rask Ingemann Lambertsen To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: Re: de2104x problems (and perhaps a clue as to why) Message-ID: <20031209162003.E1345@sygehus.dk> References: <20031126134140.A7327@sygehus.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20031126134140.A7327@sygehus.dk>; from rask@sygehus.dk on Wed, Nov 26, 2003 at 01:41:40PM +0100 X-archive-position: 1945 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rask@sygehus.dk Precedence: bulk X-list: netdev Content-Length: 2019 Lines: 50 On Wed, Nov 26, 2003 at 01:41:40PM +0100, Rask Ingemann Lambertsen wrote: [snip] > lspci dug up something interesting: > $ modprobe -k de2104x > $ ethtool -s eth0 port bnc > $ ifconfig eth0 up > $ ifconfig eth0 down > $ ifconfig eth0 up > $ lspci -vvv > [...] 
> 00:0e.0 Ethernet controller: Digital Equipment Corporation DECchip 21041 [Tulip Pass 3] (rev 11)
>         Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>         Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR-
>         Interrupt: pin A routed to IRQ 11
>         Region 0: I/O ports at 2080 [size=128]
>         Region 1: Memory at 42200000 (32-bit, non-prefetchable) [size=128]
>         Expansion ROM at [disabled] [size=256K]
> [...]
>
> Say what? It says "BusMaster-", which means busmaster DMA is disabled,
> right? Then of course it won't work. Is this because de_close() calls
> pci_disable_device() and de_open() does not reenable the device?

Indeed, pci_disable_device() disables busmaster DMA, so this needs to be reenabled by dev->open(). This patch seems to fix the problem by reenabling the device and busmaster DMA in de_init_hw():

--- linux-2.6.0-test8/drivers/net/tulip/de2104x.c~	Wed Nov 26 13:37:17 2003
+++ linux-2.6.0-test8/drivers/net/tulip/de2104x.c	Wed Nov 26 13:37:17 2003
@@ -1253,6 +1253,10 @@ static int de_init_hw (struct de_private
 	struct net_device *dev = de->dev;
 	int rc;
 
+	rc = pci_enable_device(de->pdev);
+	if (rc)
+		return (rc);
+	pci_set_master(de->pdev);
 	de_adapter_wake(de);
 
 	de->macmode = dr32(MacMode) & ~MacModeClear;

But this causes pci_enable_device() and pci_set_master() to be called twice when the interface is brought up for the first time after loading the module. The documentation does not say that doing so is OK. Perhaps we should simply remove the pci_disable_device() call from dev->close()? de2104x is a bit unusual in calling pci_disable_device() from dev->close().

Jeff?
--
Regards,
Rask Ingemann Lambertsen

From xose@wanadoo.es Tue Dec 9 09:23:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 09:23:42 -0800 (PST) Received: from smtp14.eresmas.com (smtp14.eresmas.com [62.81.235.114]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9HNPTa001383 for ; Tue, 9 Dec 2003 09:23:26 -0800 Received: from [192.168.108.52] (helo=mx02.eresmas.com) by smtp14.eresmas.com with esmtp (Exim 4.10) id 1ATlZr-0002W7-00; Tue, 09 Dec 2003 18:23:15 +0100 Received: from [80.102.90.86] (helo=wanadoo.es) by mx02.eresmas.com with esmtp (Exim 4.20) id 1ATlZs-0007i0-50; Tue, 09 Dec 2003 18:23:16 +0100 Message-ID: <3FD60502.5050300@wanadoo.es> Date: Tue, 09 Dec 2003 18:23:14 +0100 From: Xose Vazquez Perez User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003 X-Accept-Language: gl, es, en MIME-Version: 1.0 To: "David S. Miller" CC: jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [PATCH] tg3, more IDs References: <3FD20895.2020605@wanadoo.es> <3FD219FA.5010908@wanadoo.es> <20031208215657.550ba7e0.davem@redhat.com> X-Enigmail-Version: 0.63.3.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1946 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: xose@wanadoo.es Precedence: bulk X-list: netdev Content-Length: 628 Lines: 28

David S. Miller wrote:

> Applied, but please next time remember to edit drivers/pci/pci.ids
> as well when adding new PCI device ids.

I usually send pci.ids updates to pciids.sf.net. Today it contains a _kilopatch_ of new entries, more than kernel_pci.ids.

I spoke with Martin Mares weeks ago to try to get him to do a 'commit'. I don't know what has happened with him. Does anyone know anything?

There are more pciids developers:

Dave Jones
Arjan van de Ven
Jeff Garzik
Lenz Grimmer
Martin Mares
Mike A. Harris
Bill Nottingham
Vojtech Pavlik

but so far nobody seems to care about it.

Next time I will send it to you.

-thanks-

From greearb@candelatech.com Tue Dec 9 09:28:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 09:28:38 -0800 (PST) Received: from ns1.wanfear.com (ns1.wanfear.com [207.212.57.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9HSOTa002009 for ; Tue, 9 Dec 2003 09:28:24 -0800 Received: from candelatech.com (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) (authenticated bits=0) by ns1.wanfear.com (8.12.10/8.12.8) with ESMTP id hB9HSNEu018379; Tue, 9 Dec 2003 09:28:24 -0800 Message-ID: <3FD60637.2080009@candelatech.com> Date: Tue, 09 Dec 2003 09:28:23 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Rask Ingemann Lambertsen CC: netdev@oss.sgi.com, jgarzik@pobox.com Subject: Re: [EXPERIMENTAL PATCH] 2.4 tulip jumbo frames References: <20031209160632.D1345@sygehus.dk> In-Reply-To: <20031209160632.D1345@sygehus.dk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1947 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 943 Lines: 35

Rask Ingemann Lambertsen wrote:

> Attached is a patch which enables jumbo frames on tulip based boards. It
> makes it possible to set an MTU of up to 2025 bytes. However, testing
> against an NE3200 board using 10base-2 shows assorted problems when going
> above 2018 bytes (for frames without VLAN tags). The problems show up as
> framing errors, bad CRCs and such (in both directions).
>
> The patch is against 2.4.22-bk45.
> +static int tulip_change_mtu (struct net_device *dev, int mtu) > +{ > + if (netif_running (dev)) > + return (-EBUSY); > + > + if (mtu < 0 || mtu > PKT_BUF_SZ_MAX - VLAN_ETH_HLEN - 4) > + return (-EINVAL); I suppose you could let them ignore the VLAN_ETH_HLEN if you really wanted to (in case they are not running VLANs). But, this should work fine too. --Ben > + > + dev->mtu = mtu; > + return (0); > +} > + -- Ben Greear Candela Technologies Inc http://www.candelatech.com From davem@pizda.ninka.net Tue Dec 9 12:21:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 12:21:52 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9KLaTa008613 for ; Tue, 9 Dec 2003 12:21:36 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA02763; Tue, 9 Dec 2003 12:20:31 -0800 Date: Tue, 9 Dec 2003 12:20:31 -0800 From: "David S. Miller" To: Robert Olsson Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, Robert.Olsson@data.slu.se Subject: Re: [PATCH] option for large routing hash Message-Id: <20031209122031.048b406f.davem@redhat.com> In-Reply-To: <16341.58771.558850.163216@robur.slu.se> References: <16341.58771.558850.163216@robur.slu.se> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1948 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 1433 Lines: 33 On Tue, 9 Dec 2003 16:09:07 +0100 Robert Olsson wrote: > I think patch should be useful as it helps performance a lot during high > flow load. I have some numbers if you are interested. 
I'm very hesitant about this, and it is not because I don't believe that it brings better performance in your tests on your machines :)

Recently there was a thread on linux-kernel from the folks, such as Jes, working on super-duper-huge NUMA boxes, about how big the hash tables get sized on these machines. In some of their configurations it was trying to allocate 1GB TCP hash tables or something totally ridiculous like this.

The problem with all of our current algorithms for size selection is that they consider only one parameter when there are actually two. Currently they consider only relative memory consumption. They also need to consider the hard limits that exist for useful hash table sizes.

There is a point at which hash table size exceeds its usefulness, in that the gains you are getting from the O(1) lookup are offset by the fact that accesses to the hash table heads are constantly taking CPU cache misses.

You've obtained good results in your tests with a _specific_ hash table size for the routing cache, but the algorithm you are proposing for the kernel computes things relative to the amount of memory in the machine. It cannot be a function of only this parameter.

Do you see my point?
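The cache argument above can be made concrete with a little arithmetic. The following is only a rough sketch: the helper names are hypothetical, the 8-byte bucket size and the 512 KB cache size are assumptions, and "the table fits in cache" is a deliberate simplification of real cache behaviour.

```c
/* Rough sketch: compare the footprint of the hash table heads with a
 * CPU cache size. Names and example sizes are illustrative assumptions. */

/* Bytes occupied by the bucket heads alone. */
unsigned long sketch_table_bytes(unsigned long buckets,
                                 unsigned long bucket_size)
{
    return buckets * bucket_size;
}

/* Returns 1 if all bucket heads could sit in a cache of cache_bytes,
 * 0 otherwise. Once this returns 0, lookups start paying for cache
 * misses on the heads, which is the effect described above. */
int sketch_heads_fit_in_cache(unsigned long buckets,
                              unsigned long bucket_size,
                              unsigned long cache_bytes)
{
    return sketch_table_bytes(buckets, bucket_size) <= cache_bytes;
}
```

With these assumed numbers, 4096 buckets of 8 bytes (32 KB) fit into a 512 KB cache, while a 1 GB table clearly does not, no matter how much RAM the machine has.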
From rask@sygehus.dk Tue Dec 9 13:32:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 13:32:36 -0800 (PST) Received: from 0x50a41c47.albnxx15.adsl-dhcp.tele.dk (0x50a41c47.albnxx15.adsl-dhcp.tele.dk [80.164.28.71]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9LWKTa010320 for ; Tue, 9 Dec 2003 13:32:21 -0800 Received: by 0x50a41c47.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id 6B405192AA; Tue, 9 Dec 2003 22:32:18 +0100 (CET) Date: Tue, 9 Dec 2003 22:32:18 +0100 From: Rask Ingemann Lambertsen To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: Re: [EXPERIMENTAL PATCH] 2.4 tulip jumbo frames Message-ID: <20031209223214.A1855@sygehus.dk> References: <20031209160632.D1345@sygehus.dk> <3FD5FC36.5090405@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <3FD5FC36.5090405@pobox.com>; from jgarzik@pobox.com on Tue, Dec 09, 2003 at 11:45:42AM -0500 X-archive-position: 1949 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rask@sygehus.dk Precedence: bulk X-list: netdev Content-Length: 1719 Lines: 57 On Tue, Dec 09, 2003 at 11:45:42AM -0500, Jeff Garzik wrote: > Two questions and a comment... > > Would you split this into two patches? The first simply adds, and uses, > tp->rx_buf_sz. The second adds PKT_BUF_SZ_MAX and mtu-related changes. That sounds like a good idea. I'll do that. I tried to find a good place in the private structure to add the rx_buf_sz field. By good, I mean using the cache efficiently. Any comments about that? > Have you looked at Donald Becker's changes to tulip.c? He went through > most of his drivers and made the changes necessary to support larger > MTUs. IIRC his tulip.c changes (which should be easily translate-able > to 2.6.x tulip) were a bit more minimal than your patch, but still > served the purpose. 
I have not looked at this particular part of Becker's tulip.c. I'll have a look at it.

> For the comment: I am curious why a VLAN_xxx constant is included in
> the calculation of max MTU, in the ->change_mtu hook?

This is to ensure that the receive buffers are large enough to hold a frame of the requested MTU even when VLAN tags are in use.

> IMO ->change_mtu
> simply needs to bind the MTU to the min and max h/w limits. If
> VLAN_ETH_HLEN ever figures into the calculations, those calculations
> should occur elsewhere, not in ->change_mtu.

What do you propose? Do we need something like

int vlan_adjust_mtu (int mtu)
{
#ifdef CONFIG_VLAN_8021Q
	return (mtu - VLAN_HLEN);
#else
	return (mtu);
#endif
}

and

int foobar_change_mtu (struct net_device *dev, int mtu)
{
	mtu = vlan_adjust_mtu (mtu);
	/* check hardware limits. */
	...
	dev->mtu = mtu;
	return (0);
}

? Ben, this would also keep you happy, right?

--
Regards, Rask Ingemann Lambertsen

From Robert.Olsson@data.slu.se Tue Dec 9 14:28:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 14:28:47 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9MSWTa013029 for ; Tue, 9 Dec 2003 14:28:33 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.9.3+/8.9.3) with ESMTP id XAA11029; Tue, 9 Dec 2003 23:28:24 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id C60C2EC23B; Tue, 9 Dec 2003 23:28:31 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16342.19599.686693.823755@robur.slu.se> Date: Tue, 9 Dec 2003 23:28:31 +0100 To: "David S.
Miller" Cc: Robert Olsson , kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: [PATCH] option for large routing hash In-Reply-To: <20031209122031.048b406f.davem@redhat.com> References: <16341.58771.558850.163216@robur.slu.se> <20031209122031.048b406f.davem@redhat.com> X-Mailer: VM 7.17 under Emacs 21.3.1 X-archive-position: 1950 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 2643 Lines: 66

David S. Miller writes:

> There is a point at which hash table size exceeds it's usefulness in
> that the gains you are getting from the O(1) lookup are offset by the
> fact that the access to the hash table heads are constantly taking cpu
> cache misses.

Yes.

> You've obtained good results in your tests with a _specific_ hash
> table size for the routing cache, but the algorithm you are proposing
> for the kernel computes things relative to the amount of memory in the
> machine. It cannot be a function of only this parameter.
>
> Do you see my point?

I do. Let's look at an experiment to start with; we have also seen people trying to use this in real high-flow environments.

pktgen is sending 32 kflows with a flow length of 10 packets (64 bytes) at 2 * 300 kpps into a router. Packet streams: 2 * 1M packets. TX-OK gives the throughput. The cache settings are max_size=262144 gc_thresh=32768 gc_elasticity=8 for both setups.

IP: routing cache hash table of 4096 buckets, 32Kbytes

Iface  MTU Met   RX-OK  RX-ERR  RX-DRP  RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flags
eth0  1500   0 2671635 9583466 9583466 7328370      10      0      0      0 BRU
eth1  1500   0      12       0       0       0 2671640      0      0      0 BRU
eth2  1500   0 2623413 9556039 9556039 7376591       4      0      0      0 BRU
eth3  1500   0       1       0       0       0 2623412      0      0      0 BRU

rtstat sample (truncated)

  size  IN: hit     tot  mc no_rt bcast madst masrc
 35320    62700   88890   0     0     0     0     0

Because of the cache size and GC, this looks like a route DoS attack. We get very little use out of the hash, as tot (long path) > (cache) hit.
Lots of linear search in the hash. And tot+hit gives the pps throughput: 152 kpps. It's easy to characterize this as a DoS attack, but the flowlen is 10 packets.

----------------------------------------------------------------------------

IP: routing cache hash table of 32768 buckets, 256Kbytes

Iface  MTU Met   RX-OK  RX-ERR  RX-DRP  RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flags
eth0  1500   0 4382945 9293599 9293599 5617062      13      0      0      0 BRU
eth1  1500   0      16       0       0       0 4381291      0      0      0 BRU
eth2  1500   0 4290290 9292399 9292399 5709713       3      0      0      0 BRU
eth3  1500   0       1       0       0       0 4288727      0      0      0 BRU

rtstat sample (truncated)

  size  IN: hit     tot  mc no_rt bcast madst masrc
212976   212665   52703   0     0     0     0     0

We see the cache is now used, as hit > tot, and we get a performance jump from 152 to 265 kpps.

Just as you said, this was the experiment. I'll stop here for now.

Cheers.

--ro

From greearb@candelatech.com Tue Dec 9 14:38:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 14:38:19 -0800 (PST) Received: from ns1.wanfear.com (ns1.wanfear.com [207.212.57.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9Mc6Ta013497 for ; Tue, 9 Dec 2003 14:38:06 -0800 Received: from candelatech.com (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) (authenticated bits=0) by ns1.wanfear.com (8.12.10/8.12.8) with ESMTP id hB9Mc2Eu001633; Tue, 9 Dec 2003 14:38:05 -0800 Message-ID: <3FD64EC9.6010203@candelatech.com> Date: Tue, 09 Dec 2003 14:38:01 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Rask Ingemann Lambertsen CC: Jeff Garzik , netdev@oss.sgi.com Subject: Re: [EXPERIMENTAL PATCH] 2.4 tulip jumbo frames References: <20031209160632.D1345@sygehus.dk> <3FD5FC36.5090405@pobox.com> <20031209223214.A1855@sygehus.dk> In-Reply-To: <20031209223214.A1855@sygehus.dk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1951 X-ecartis-version:
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 843 Lines: 42 Rask Ingemann Lambertsen wrote: > What do you propose? Do we need something like > > int vlan_adjust_mtu (int mtu) > { > #ifdef CONFIG_VLANN_8021Q > return (mtu - VLAN_HLEN); > #else > return (mtu); > #endif > } > > and > > int foobar_change_mtu (struct net_device *dev, int mtu) > { > mtu = vlan_adjust_mtu (mtu); > /* check hardware limits. */ > ... > dev->mtu = mtu; > return (0); > } > > ? Ben, this would also keep you happy, right? I was thinking the check could be made run-time, but in reality, this is a very minor detail. It may be better to just hard-code it like you had it originally. I don't like the patch above, I'd rather see the #ifdef when checking for the maximum hardware limit, if anywhere. Thanks, Ben > -- Ben Greear Candela Technologies Inc http://www.candelatech.com From rask@sygehus.dk Tue Dec 9 15:40:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 15:40:23 -0800 (PST) Received: from user6.cybercity.dk (fxp0.user6.ip.cybercity.dk [212.242.41.54]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hB9Ne7Ta016255 for ; Tue, 9 Dec 2003 15:40:08 -0800 Received: from privat.webmail.cybercity.dk (localhost [127.0.0.1]) by user6.cybercity.dk (8.12.9/8.12.9) with ESMTP id hB9Ne2tf032247; Wed, 10 Dec 2003 00:40:02 +0100 (CET) (envelope-from rask@sygehus.dk) From: "Rask Ingemann Lambertsen" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: [EXPERIMENTAL PATCH] 2.4 tulip jumbo frames Date: Wed, 10 Dec 2003 00:40:02 +0100 Message-Id: <20031209224906.M53356@sygehus.dk> In-Reply-To: <3FD64EC9.6010203@candelatech.com> References: <20031209160632.D1345@sygehus.dk> <3FD5FC36.5090405@pobox.com> <20031209223214.A1855@sygehus.dk> <3FD64EC9.6010203@candelatech.com> X-Mailer: Cybercity Webmail 2.01 20030425 X-OriginatingIP: 80.164.28.71 (ccc94453@vip.cybercity.dk) 
MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 X-archive-position: 1952 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rask@sygehus.dk Precedence: bulk X-list: netdev Content-Length: 1602 Lines: 46 On Tue, 09 Dec 2003 14:38:01 -0800, Ben Greear wrote > Rask Ingemann Lambertsen wrote: > > > What do you propose? Do we need something like > > > > int vlan_adjust_mtu (int mtu) > > { > > #ifdef CONFIG_VLANN_8021Q > > return (mtu - VLAN_HLEN); > > #else > > return (mtu); > > #endif > > } > I was thinking the check could be made run-time, but in reality, > this is a very minor detail. You could use something like if (dev->priv_flags & IFF_802_1Q_VLAN) return (mtu - VLAN_HLEN); else return (mtu); but then you have a problem if the MTU is set to the maximum with VLAN disabled and someone decides to enable VLAN on the device afterwards. A possible solution would be to set NETIF_F_VLAN_CHALLENGED when VLAN_HLEN extra bytes are not available. That said, even checking CONFIG_VLAN_8021Q is probably flawed too, because ideally, even when building a kernel without VLAN support, you should be able to use the bridging support in a VLAN environment. IMHO. I mean, if this is not the case, please remind me why we need VLAN patches in the first place since setting an MTU of 1496 bytes works with every Ethernet board and driver. > I don't like the patch above, I'd rather see the > #ifdef when checking for the maximum hardware limit, if anywhere. I was thinking that there could be other reasons than VLAN for reducing the MTU and that vlan_adjust_mtu() could take these into account as well. PPPoE comes to my mind, but I have no clue how that is implemented in Linux (or any other system). 
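One way to picture the run-time variant discussed above is to bind the MTU to the hardware limit only, and record separately whether a VLAN tag would still fit. This is a userspace sketch, not driver code: struct sketch_dev and the sketch_* names are hypothetical, the buffer size of 2047 is an assumption derived from the 2025-byte maximum mentioned for the tulip patch (2025 + 18 + 4), and treating the trailing 4 bytes as the frame check sequence is also an assumption. A real driver would presumably set NETIF_F_VLAN_CHALLENGED in the device features instead of a private field.

```c
#include <errno.h>

#define SKETCH_ETH_HLEN        14   /* Ethernet header without VLAN tag */
#define SKETCH_VLAN_HLEN        4   /* extra bytes for one 802.1Q tag */
#define SKETCH_FCS_LEN          4   /* assumed: trailing CRC */
#define SKETCH_PKT_BUF_SZ_MAX 2047  /* assumed: 2025 + 18 + 4 */

/* Hypothetical stand-in for the parts of struct net_device used here. */
struct sketch_dev {
    int mtu;
    int vlan_challenged;   /* 1 if a tagged frame no longer fits */
};

int sketch_change_mtu(struct sketch_dev *dev, int mtu)
{
    /* Largest MTU the receive buffer can hold without a VLAN tag. */
    int max_untagged = SKETCH_PKT_BUF_SZ_MAX - SKETCH_ETH_HLEN - SKETCH_FCS_LEN;
    /* Largest MTU that still leaves room for one VLAN tag. */
    int max_tagged = max_untagged - SKETCH_VLAN_HLEN;

    /* Only the hardware limit can reject the request. */
    if (mtu < 0 || mtu > max_untagged)
        return -EINVAL;

    dev->mtu = mtu;
    /* Mark the device when the VLAN_HLEN extra bytes are not available. */
    dev->vlan_challenged = (mtu > max_tagged);
    return 0;
}
```

Under these assumptions an MTU of 1500 is accepted with room for a tag, 2029 is accepted but marks the device as VLAN-challenged, and 2030 is rejected.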
-- Regards, Rask Ingemann Lambertsen From jgarzik@pobox.com Tue Dec 9 17:15:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 17:15:53 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBA1FYTa024248 for ; Tue, 9 Dec 2003 17:15:35 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:42282 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1ATkzi-00034P-Bj; Tue, 09 Dec 2003 16:45:54 +0000 Message-ID: <3FD5FC36.5090405@pobox.com> Date: Tue, 09 Dec 2003 11:45:42 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Rask Ingemann Lambertsen CC: netdev@oss.sgi.com Subject: Re: [EXPERIMENTAL PATCH] 2.4 tulip jumbo frames References: <20031209160632.D1345@sygehus.dk> In-Reply-To: <20031209160632.D1345@sygehus.dk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1953 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 827 Lines: 24 Two questions and a comment... Would you split this into two patches? The first simply adds, and uses, tp->rx_buf_sz. The second adds PKT_BUF_SZ_MAX and mtu-related changes. Have you looked at Donald Becker's changes to tulip.c? He went through most of his drivers and made the changes necessary to support larger MTUs. IIRC his tulip.c changes (which should be easily translate-able to 2.6.x tulip) were a bit more minimal than your patch, but still served the purpose. For the comment: I am curious why a VLAN_xxx constant is included in the calculation of max MTU, in the ->change_mtu hook? IMO ->change_mtu simply needs to bind the MTU to the min and max h/w limits. 
If VLAN_ETH_HLEN ever figures into the calculations, those calculations should occur elsewhere, not in ->change_mtu. Thanks! Jeff From dr@cluenet.de Tue Dec 9 21:44:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 21:44:39 -0800 (PST) Received: from mail1.cluenet.de (mail1.cluenet.de [195.20.121.7]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBA5iMTa000750 for ; Tue, 9 Dec 2003 21:44:23 -0800 Received: by mail1.cluenet.de (Postfix, from userid 500) id 679AA1006; Wed, 10 Dec 2003 06:44:20 +0100 (CET) Date: Wed, 10 Dec 2003 06:44:20 +0100 From: Daniel Roesen To: netdev@oss.sgi.com Subject: GRE tunnel keepalive support? Message-ID: <20031210064420.A22481@homebase.cluenet.de> Mail-Followup-To: netdev@oss.sgi.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i X-archive-position: 1954 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dr@cluenet.de Precedence: bulk X-list: netdev Content-Length: 235 Lines: 12 Hi, is anybody aware of anyone implementing keepalives for GRE tunnels? I don't want to duplicate efforts... http://www.cisco.com/en/US/products/sw/iosswrel/ps1838/ products_feature_guide09186a0080134a36.html Best regards, Daniel From davem@pizda.ninka.net Tue Dec 9 22:34:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 22:34:56 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBA6YVTa002041 for ; Tue, 9 Dec 2003 22:34:32 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id WAA03751; Tue, 9 Dec 2003 22:33:21 -0800 Date: Tue, 9 Dec 2003 22:33:21 -0800 From: "David S. Miller" To: Daniel Roesen Cc: netdev@oss.sgi.com Subject: Re: GRE tunnel keepalive support? 
Message-Id: <20031209223321.29861dc8.davem@redhat.com> In-Reply-To: <20031210064420.A22481@homebase.cluenet.de> References: <20031210064420.A22481@homebase.cluenet.de> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1955 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 396 Lines: 11 On Wed, 10 Dec 2003 06:44:20 +0100 Daniel Roesen wrote: > is anybody aware of anyone implementing keepalives for GRE tunnels? > I don't want to duplicate efforts... > > http://www.cisco.com/en/US/products/sw/iosswrel/ps1838/ > products_feature_guide09186a0080134a36.html Not to my knowledge. It looks like a rather simple mechanism, and should be easy to code up. Feel free. From hibi665@oki.com Tue Dec 9 22:36:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 22:37:08 -0800 (PST) Received: from iscan1.intra.oki.co.jp (okigate.oki.co.jp [202.226.91.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBA6asTa002338 for ; Tue, 9 Dec 2003 22:36:55 -0800 Received: from aoi.okilab.oki.co.jp (localhost.localdomain [127.0.0.1]) by iscan1.intra.oki.co.jp (8.9.3/8.9.3) with SMTP id PAA28141 for ; Wed, 10 Dec 2003 15:36:53 +0900 Received: (qmail 18790 invoked from network); 10 Dec 2003 15:36:52 +0900 Received: from dhcp23233.okilab.oki.co.jp (HELO kiso) (172.24.23.233) by aoi.okilab.oki.co.jp with SMTP; 10 Dec 2003 15:36:52 +0900 Message-Id: <20031210153810.4ec4639e%hibi665@oki.com> MIME-Version: 1.0 Date: Wed, 10 Dec 2003 15:38:10 +0900 X-Mailer: Denshin 8 Go V32.1.4.3 X-My-Real-Login-Name: thibi; aoi From: Takashi Hibi To: netdev@oss.sgi.com Subject: Kernel 2.4.22 MLD problems X-archive-position: 1956 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: hibi665@oki.com Precedence: bulk X-list: netdev Content-Length: 1277 Lines: 32 Hi there, We are testing IPv6 multicast with FreeBSD(kame) router. The kernel version is 2.4.22. And we are using both MLDv1 and MLDv2. We apply some patches to fix problems of 2.4.22 - fix MLD v1 compatibility mode checks to allow for extension header http://oss.sgi.com/archives/netdev/2003-11/msg00449.html - MLDv2 MRC timer wrong units [PATCH] http://oss.sgi.com/archives/netdev/2003-11/msg00364.html But there are still some problems. [MLDv2] 1. After joining multicast address, MLDv2 listener report isn't issued immediately. 2. After leaving multicast address, MLDv2 listener report is issued, but multicast routing doesn't stop. The type of report is ChangeToIncludeMode. FreeBSD doesn't seem to stop multicast after receiving it. [MLDv1] 1. After joining multicast address, MLDv2 listener report(not MLDv1) is issued. After receiving MLDv2 query, MLDv1 listener report is issued. 2. After leaving multicast address, MLDv2 listener report is sometimes issued. It happens after long communication. And I have a question about recent patch. http://www-124.ibm.com/linux/patches/ipv6/2.4.22JUMBO.patch This patch doesn't include the following patch. http://oss.sgi.com/archives/netdev/2003-11/msg00364.html Isn't it necessary? Regards, Takashi Hibi From dr@cluenet.de Tue Dec 9 23:32:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Dec 2003 23:32:23 -0800 (PST) Received: from mail1.cluenet.de (mail1.cluenet.de [195.20.121.7]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBA7VxTa004033 for ; Tue, 9 Dec 2003 23:32:00 -0800 Received: by mail1.cluenet.de (Postfix, from userid 500) id 35CF01006; Wed, 10 Dec 2003 07:52:58 +0100 (CET) Date: Wed, 10 Dec 2003 07:52:58 +0100 From: Daniel Roesen To: netdev@oss.sgi.com Subject: Re: GRE tunnel keepalive support? 
Message-ID: <20031210075258.A23221@homebase.cluenet.de> Mail-Followup-To: netdev@oss.sgi.com References: <20031210064420.A22481@homebase.cluenet.de> <20031209223321.29861dc8.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20031209223321.29861dc8.davem@redhat.com>; from davem@redhat.com on Tue, Dec 09, 2003 at 10:33:21PM -0800 X-archive-position: 1957 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dr@cluenet.de Precedence: bulk X-list: netdev Content-Length: 688 Lines: 18 On Tue, Dec 09, 2003 at 10:33:21PM -0800, David S. Miller wrote: > > is anybody aware of anyone implementing keepalives for GRE tunnels? > > I don't want to duplicate efforts... > > > > http://www.cisco.com/en/US/products/sw/iosswrel/ps1838/ > > products_feature_guide09186a0080134a36.html > > Not to my knowledge. It looks like a rather simple mechanism, > and should be easy to code up. Feel free. OK, anything else than "easy" would be out of scope anyway, given significant time constraints and never did any kernel work at all (except some micropatches). Given that there is no documentation whatsoever, I will have to do some reverse engineering though. Best regards, Daniel From davem@pizda.ninka.net Wed Dec 10 00:17:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 00:17:22 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBA8H5Ta005506 for ; Wed, 10 Dec 2003 00:17:05 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id AAA03914; Wed, 10 Dec 2003 00:15:56 -0800 Date: Wed, 10 Dec 2003 00:15:56 -0800 From: "David S. 
Miller" To: Robert Olsson Cc: Robert.Olsson@data.slu.se, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: [PATCH] option for large routing hash Message-Id: <20031210001556.71253d1f.davem@redhat.com> In-Reply-To: <16342.19599.686693.823755@robur.slu.se> References: <16341.58771.558850.163216@robur.slu.se> <20031209122031.048b406f.davem@redhat.com> <16342.19599.686693.823755@robur.slu.se> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1958 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 1015 Lines: 37

On Tue, 9 Dec 2003 23:28:31 +0100 Robert Olsson wrote:

> IP: routing cache hash table of 4096 buckets, 32Kbytes
...
> And tot+hit gives the pps throughput 152 kpps.
...
> IP: routing cache hash table of 32768 buckets, 256Kbytes
...
> We see cache is now used as hit > tot and we get a performance jump from
> 152 to 265 kpps.
>
> Just as you said this was the experiment. I'll stop here for now.

Thanks for the data. I would eventually like an algorithm that uses a min/max range. Perhaps something like:

const unsigned long rthash_min = PAGE_SIZE;
const unsigned long rthash_max =
	PAGE_ALIGN(512 * 1024 * sizeof(struct rt_hash_bucket));

unsigned long rthash_choose_size(unsigned long num_physpages)
{
	unsigned long goal;

	goal = num_physpages >> (23 - PAGE_SHIFT);
	if (goal < rthash_min)
		goal = rthash_min;
	if (goal > rthash_max)
		goal = rthash_max;
	return goal;
}

It's a combination of your goal computation adjustment along with sanity limits, that's all.
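To see how such a min/max range behaves across machine sizes, here is a userspace sketch of the same clamping idea. It is not the kernel code: the SKETCH_* constants stand in for PAGE_SHIFT, PAGE_SIZE and sizeof(struct rt_hash_bucket), the 4 KB page and 8-byte bucket are assumptions (the latter taken from the "4096 buckets, 32Kbytes" figure quoted above), and it reads the shifted value as a page count and clamps in pages, which is one possible interpretation of the snippet's units.

```c
/* Userspace stand-ins for kernel definitions; all values are assumptions. */
#define SKETCH_PAGE_SHIFT   12UL                    /* 4 KB pages */
#define SKETCH_BUCKET_SIZE   8UL                    /* bytes per hash bucket */
#define SKETCH_MAX_BUCKETS  (512UL * 1024UL)        /* upper sanity limit */

/* Returns the chosen routing hash size in bytes for a machine with
 * num_physpages pages of RAM: proportional to memory, but clamped to
 * at least one page and at most SKETCH_MAX_BUCKETS buckets. */
unsigned long sketch_rthash_choose_bytes(unsigned long num_physpages)
{
    const unsigned long min_pages = 1UL;
    const unsigned long max_pages =
        (SKETCH_MAX_BUCKETS * SKETCH_BUCKET_SIZE) >> SKETCH_PAGE_SHIFT;

    /* Memory-proportional goal, interpreted here as a page count. */
    unsigned long goal_pages = num_physpages >> (23UL - SKETCH_PAGE_SHIFT);

    /* Sanity limits: clamp in pages so the final shift cannot overflow. */
    if (goal_pages < min_pages)
        goal_pages = min_pages;
    if (goal_pages > max_pages)
        goal_pages = max_pages;

    return goal_pages << SKETCH_PAGE_SHIFT;
}
```

Under these assumptions a 4 MB machine (1024 pages) gets the 4 KB minimum, a 1 GB machine (262144 pages) gets 512 KB, and a 64 GB machine (16777216 pages) is held at the 4 MB maximum instead of growing with memory.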
From dlstevens@us.ibm.com Wed Dec 10 00:22:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 00:22:59 -0800 (PST) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBA8MiTa009083 for ; Wed, 10 Dec 2003 00:22:45 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e32.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hBA8MbrE350804; Wed, 10 Dec 2003 03:22:37 -0500 Received: from d03nm121.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id hBA8MZF6159076; Wed, 10 Dec 2003 01:22:36 -0700 Importance: Normal Sensitivity: Subject: Re: Kernel 2.4.22 MLD problems To: Takashi Hibi Cc: netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.4a July 24, 2000 Message-ID: From: David Stevens Date: Wed, 10 Dec 2003 00:22:33 -0800 X-MIMETrack: Serialize by Router on D03NM121/03/M/IBM(Release 6.0.2CF2HF133 | November 14, 2003) at 12/10/2003 01:22:36 MIME-Version: 1.0 Content-type: multipart/alternative; Boundary="0__=07BBE76BDFBA98B18f9e8a93df938690918c07BBE76BDFBA98B1" Content-Disposition: inline X-archive-position: 1959 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dlstevens@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 7448 Lines: 159 --0__=07BBE76BDFBA98B18f9e8a93df938690918c07BBE76BDFBA98B1 Content-type: text/plain; charset=US-ASCII Takashi, >But there are still some problems. >[MLDv2] >1. After joining multicast address, MLDv2 listener report isn't issued immediately. This looks like a bug; it should send the first one immediately, and QRV-1 more at random intervals between 0 and MRC. It's adding the MRC random delay before sending the first one. Not critical, but I'll put a patch together in the next couple of days. >2. 
After leaving multicast address, MLDv2 listener report is issued, but multicast > routing doesn't stop. The type of report is ChangeToIncludeMode. > FreeBSD doesn't seem to stop multicast after receiving it. "CHANGE_TO_INCLUDE {}" is the MLDv2 "leave group"; if FreeBSD continues to forward them, I'm guessing that's a problem with FreeBSD, assuming the report is well-formed. Have you tried with a FreeBSD listener? I'd be interested in seeing packet traces if Linux and FreeBSD listeners behave differently with the FreeBSD multicast router for this case. >[MLDv1] >1. After joining multicast address, MLDv2 listener report(not MLDv1) is issued. > After receiving MLDv2 query, MLDv1 listener report is issued. MLDv2 is the default. If it has received an MLDv1 query "recently", then it'll be in "MLDv1 compatibility mode" and will answer with v1 reports, even for a v2 query. Leaving v1 compatibility mode is based on a timer, not on the types of subsequent queries. So, you probably don't want to mix testing unless you're specifically testing the v1 compatibility mode. You should either have a v1 querier on the net, or a v2 querier, and use only that version for the whole test. Otherwise, you can get a mix of report versions depending on when the last v1 query was received. I'm guessing you joined the group (v2 report scheduled) and then someone issued a v1 query (switch to v1 compatibility mode), followed by another (v1) report. If you don't think there was any v1 query at that time, please do send me a packet trace, and with the latest 2.6.0 test kernel, if possible. This worked in my testing. >2. After leaving multicast address, MLDv2 listener report is sometimes issued. > It happens after long communication. This is the same problem as you listed in "[MLDv2] 1)" -- there's a random delay between reports, but the code appears to have the delay before the first report. That's correct for a query, but not an interface state change. >And I have a question about recent patch. 
>http://www-124.ibm.com/linux/patches/ipv6/2.4.22JUMBO.patch This patch wasn't submitted to netdev, or the mainline-kernel. It's a version I created for specific testing and delivered to 2 people. You really want the latest mainline kernel, which has all of the patches, already integrated. >This patch doesn't include the following patch. >http://oss.sgi.com/archives/netdev/2003-11/msg00364.html >Isn't it necessary? Yes, you probably want that. Again, the latest mainline kernel tree is the most up-to-date. Thanks for reporting the problem, and let me know if the explanations for the other things don't fit what you're doing (with packet traces). +-DLS --0__=07BBE76BDFBA98B18f9e8a93df938690918c07BBE76BDFBA98B1 Content-type: text/html; charset=US-ASCII Content-Disposition: inline

Takashi,

>But there are still some problems.
>[MLDv2]
>1. After joining multicast address, MLDv2 listener report isn't issued immediately.


This looks like a bug; it should send the first one immediately,
and QRV-1 more at random intervals between 0 and MRC. It's adding
the MRC random delay before sending the first one. Not critical,
but I'll put a patch together in the next couple of days.

>2. After leaving multicast address, MLDv2 listener report is issued, but multicast

> routing doesn't stop. The type of report is ChangeToIncludeMode.
> FreeBSD doesn't seem to stop multicast after receiving it.


"CHANGE_TO_INCLUDE {}" is the MLDv2 "leave group"; if FreeBSD continues
to forward them, I'm guessing that's a problem with FreeBSD, assuming
the report is well-formed. Have you tried with a FreeBSD listener?
I'd be interested in seeing packet traces if Linux and FreeBSD listeners
behave differently with the FreeBSD multicast router for this case.

>[MLDv1]
>1. After joining multicast address, MLDv2 listener report(not MLDv1) is issued.

>  After receiving MLDv2 query, MLDv1 listener report is issued.

MLDv2 is the default. If it has received an MLDv1 query "recently", then
it'll be in "MLDv1 compatibility mode" and will answer with v1 reports, even
for a v2 query. Leaving v1 compatibility mode is based on a timer, not on
the types of subsequent queries. So, you probably don't want to mix testing
unless you're specifically testing the v1 compatibility mode. You should
either have a v1 querier on the net, or a v2 querier, and use only that
version for the whole test. Otherwise, you can get a mix of report versions
depending on when the last v1 query was received.
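
A minimal sketch of the timer-based compatibility mode described above, assuming a single deadline per interface. The struct, the function names and the `compat_timeout` parameter are hypothetical and exist only for this example; the real code keeps more state.

```c
#include <assert.h>

/* Hypothetical per-interface state; field names are illustrative only. */
struct mld_iface {
    unsigned long v1_seen_until; /* v1 compatibility mode lasts until this time; 0 = no v1 query seen */
};

/* Record a received query. Only a v1 query (re)arms the compatibility
 * timer; a later v2 query does not cancel it. */
static void mld_query_received(struct mld_iface *ifc, int query_version,
                               unsigned long now, unsigned long compat_timeout)
{
    assert(query_version == 1 || query_version == 2);
    if (query_version == 1)
        ifc->v1_seen_until = now + compat_timeout;
}

/* Version of the report to send at time `now`. */
static int mld_report_version(const struct mld_iface *ifc, unsigned long now)
{
    return now < ifc->v1_seen_until ? 1 : 2;
}
```

With this model, a v2 query that arrives while the timer is still running is answered with a v1 report, which is the mix of versions seen when both kinds of querier are on the same network.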

I'm guessing you joined the group (v2 report scheduled) and then someone
issued a v1 query (switch to v1 compatibility mode), followed by another
(v1) report.

If you don't think there was any v1 query at that time, please do send me
a packet trace, and with the latest 2.6.0 test kernel, if possible. This
worked in my testing.

>2. After leaving multicast address, MLDv2 listener report is sometimes issued.
>  It happens after long communication.

This is the same problem as you listed in "[MLDv2] 1)" -- there's a random
delay between reports, but the code appears to have the delay before the
first report. That's correct for a query, but not an interface state change.

>And I have a question about recent patch.
>http://www-124.ibm.com/linux/patches/ipv6/2.4.22JUMBO.patch

This patch wasn't submitted to netdev, or the mainline-kernel. It's a version
I created for specific testing and delivered to 2 people. You really want the
latest mainline kernel, which has all of the patches, already integrated.

>This patch doesn't include the following patch.
>http://oss.sgi.com/archives/netdev/2003-11/msg00364.html
>Isn't it necessary?

Yes, you probably want that. Again, the latest mainline kernel tree is the
most up-to-date.

Thanks for reporting the problem, and let me know if the explanations for
the other things don't fit what you're doing (with packet traces).

+-DLS

--0__=07BBE76BDFBA98B18f9e8a93df938690918c07BBE76BDFBA98B1-- From davem@pizda.ninka.net Wed Dec 10 00:32:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 00:32:46 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBA8W9Ta009599 for ; Wed, 10 Dec 2003 00:32:29 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id AAA03965; Wed, 10 Dec 2003 00:30:59 -0800 Date: Wed, 10 Dec 2003 00:30:59 -0800 From: "David S. Miller" To: Rask Ingemann Lambertsen Cc: jt@hpl.hp.com, jt@bougret.hpl.hp.com, linux-kernel@vger.kernel.org, tsbogend@alpha.franken.de, jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [BUG 2.6.0-test11] pcnet32 oops Message-Id: <20031210003059.24fdd370.davem@redhat.com> In-Reply-To: <20031209151459.A1345@sygehus.dk> References: <20031205234510.GA2319@bougret.hpl.hp.com> <20031205165900.2920ea6a.davem@redhat.com> <20031209151459.A1345@sygehus.dk> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1960 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 1471 Lines: 35 On Tue, 9 Dec 2003 15:15:02 +0100 Rask Ingemann Lambertsen wrote: > On Fri, Dec 05, 2003 at 04:59:00PM -0800, David S. Miller wrote: > > This is the classic case of doing disabling/enabling of software > > interrupts with hardware interrupts disabled, which is a bug. > > > > In this case pcnet32_set_multicast_list() is disabling hardware > > interrupts, and the packet freeing of pcnet32_purge_tx_ring() > > is what leads to the software interrupt disable/enable. 
> > I think the root cause of this problem is that pcnet32_set_multicast_list() > dumps the entire TX ring on the floor (as a side effect of calling > pcnet32_restart()). I don't think dev->set_multicast_list() is supposed to > do that. The task of dev->set_multicast_list() is to do whatever is necessary to update the multicast filter. If the pcnet32 hardware for some odd reason requires that you flush the TX list, it is appropriate. > I've been wondering about this too with the recent netpoll patches. Many > (including pcnet32) implement the poll controller simply as > > disable_irq (dev->irq); > driver_interrupt_handler (dev->irq, dev, NULL); > enable_irq (dev->irq); > > If the interrupt handler calls dev_kfree_skb_any(), could you then run into > this kind of problem? Or is it just if you call spin_lock_irq*() that you > have a problem? No, it would not be a problem because the thing that dev_kfree_skb_any() _does_ test right now ('in_irq()') would trigger. From davem@pizda.ninka.net Wed Dec 10 00:33:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 00:33:57 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBA8XhTa009709 for ; Wed, 10 Dec 2003 00:33:43 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id AAA03971; Wed, 10 Dec 2003 00:32:34 -0800 Date: Wed, 10 Dec 2003 00:32:34 -0800 From: "David S. 
Miller" To: Rask Ingemann Lambertsen Cc: jt@hpl.hp.com, jt@bougret.hpl.hp.com, linux-kernel@vger.kernel.org, tsbogend@alpha.franken.de, jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [BUG 2.6.0-test11] pcnet32 oops Message-Id: <20031210003234.2cb6f74e.davem@redhat.com> In-Reply-To: <20031209151459.A1345@sygehus.dk> References: <20031205234510.GA2319@bougret.hpl.hp.com> <20031205165900.2920ea6a.davem@redhat.com> <20031209151459.A1345@sygehus.dk> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1961 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 417 Lines: 11 Sorry, forgot to mention this in the previous email. The way to fix this properly in the pcnet32 driver itself would be to pass a stack local "struct sk_buff_head" list down into these deep routines while we have the locks held. Instead of freeing the TX skbs, we add them all to this list. At the top level, the code drops the spinlock and enables cpu irqs, then it frees up any SKBs that were put on that list. 
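
The deferred-free pattern described above can be sketched in plain userspace C. Everything here is a stand-in chosen for illustration: `struct buf` plays the role of an skb, `struct buf_list` the role of the stack-local `struct sk_buff_head`, and the `locked` flag models "spinlock held, IRQs disabled". It is not the pcnet32 code.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the kernel types; all names are illustrative. */
struct buf { struct buf *next; };
struct buf_list { struct buf *head; };

struct ring {
    int locked;            /* models "spinlock held, IRQs off" */
    struct buf *tx[4];     /* pending TX buffers */
    int freed;             /* buffers freed so far */
    int frees_under_lock;  /* frees done at the wrong time; should stay 0 */
};

/* Deep routine: runs with the lock held, so it only moves buffers onto
 * the caller's list instead of freeing them. */
static void purge_tx_ring(struct ring *r, struct buf_list *deferred)
{
    assert(r->locked);
    for (int i = 0; i < 4; i++) {
        if (r->tx[i]) {
            r->tx[i]->next = deferred->head;
            deferred->head = r->tx[i];
            r->tx[i] = NULL;
        }
    }
}

/* Top level: take the lock, collect, drop the lock, then free. */
static void set_multicast_list(struct ring *r)
{
    struct buf_list deferred = { NULL }; /* stack-local list */

    r->locked = 1;
    purge_tx_ring(r, &deferred);
    r->locked = 0;

    while (deferred.head) {
        struct buf *b = deferred.head;

        deferred.head = b->next;
        if (r->locked)
            r->frees_under_lock++;
        free(b);
        r->freed++;
    }
}
```

The point of the design is that the freeing code, which may touch software interrupt state, never runs inside the region where hardware interrupts are disabled.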
From Robert.Olsson@data.slu.se Wed Dec 10 06:47:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 06:48:00 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAElkTa031002 for ; Wed, 10 Dec 2003 06:47:47 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.9.3+/8.9.3) with ESMTP id PAA17634; Wed, 10 Dec 2003 15:47:38 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id EA187EC23B; Wed, 10 Dec 2003 15:47:44 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16343.12816.917811.588853@robur.slu.se> Date: Wed, 10 Dec 2003 15:47:44 +0100 To: "David S. Miller" Cc: Robert Olsson , kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: [PATCH] option for large routing hash X-Mailer: VM 7.17 under Emacs 21.3.1 X-archive-position: 1962 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 2056 Lines: 66

Hello!

More thoughts and observations before doing anything else; maybe we can wake up Alexey...

> IP: routing cache hash table of 4096 buckets, 32Kbytes
...
> And tot+hit gives the pps throughput 152 kpps.
...
> IP: routing cache hash table of 32768 buckets, 256Kbytes
...
> We see cache is now used as hit > tot and we get a performance jump from
> 152 to 265 kpps.
>

Some more experiment details. First, a full Internet routing table was used. The processor is a UP XEON 2.6 GHz.

With the large route hash we see dst cache overflow. That seems somewhat surprising at first sight, but as we increase to 265 kpps we get much closer to max_size (265k entries). So if RCU has problems getting the batch job (freeing the dst entries) done, we get dst cache overflow. The RCU/softirq stuff pops up again (now with the routing table loaded).
RCU will probably get on the agenda again since, from what I heard, the netfilter folks have plans to use it.

It's also worth noticing that focusing just on getting rid of "dst cache overflow" can give half the performance, as seen here. :-)

Since max_size, gc_elasticity and gc_thresh are the same in both runs, I think it's the goal of GC (rt_hash_mask) that causes the difference. With a smaller number the GC gets much more aggressive. I think this is indicated in rtstat:

  size   IN: hit     tot
 35320     62700   88890

Versus:

212976    212665   52703

Many more entries when we increased the hash table size. RCU can play some tricks here as well.

Conclusions?

IMO the dst cache "within its operating range" with well-behaved traffic is giving us very good performance and is something we don't have to pay attention to. We see the cache is in effect simply as hit > tot. But when hit <= tot we have at least two cases:

1) Traffic is not well behaved. DoS.
2) Traffic is well behaved but tuning is bad.

The actions are totally different for the two cases:

w. 1. Reduce size to avoid searching.
w. 2. Increase size so cache becomes active.

So it is crucial to distinguish between the cases. Can it be done from incoming traffic?

Cheers.
--ro From jmorris@redhat.com Wed Dec 10 08:34:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 08:34:17 -0800 (PST) Received: from thoron.boston.redhat.com (nat-pool-bos.redhat.com [66.187.230.200]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAGY0Ta006089 for ; Wed, 10 Dec 2003 08:34:01 -0800 Received: from thoron.boston.redhat.com (localhost.localdomain [127.0.0.1]) by thoron.boston.redhat.com (8.12.8/8.12.8) with ESMTP id hBAGXsta028002; Wed, 10 Dec 2003 11:33:54 -0500 Received: from localhost (jmorris@localhost) by thoron.boston.redhat.com (8.12.8/8.12.8/Submit) with ESMTP id hBAGXrZ8027998; Wed, 10 Dec 2003 11:33:53 -0500 X-Authentication-Warning: thoron.boston.redhat.com: jmorris owned process doing -bs Date: Wed, 10 Dec 2003 11:33:53 -0500 (EST) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: "David S. Miller" cc: kuznet@ms2.inr.ac.ru, , , Stephen Smalley Subject: [RFC] SO_PEERSEC - security credentials for Unix stream sockets Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1963 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev Content-Length: 7377 Lines: 210 Below is a patch against 2.6.0-test11 which implements a new socket option SO_PEERSEC (defined for i386 only at this stage). The purpose of this option is to allow an application to obtain the security credentials of a Unix stream socket peer. It is analogous to SO_PEERCRED (which provides authentication using standard Unix credentials of pid, uid and gid), and extends this concept to other security models. The getsockopt call is exposed via LSM, so that security modules can define their own handlers for SO_PEERSEC. 
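
From userspace, the option described above would presumably be queried like any other string-valued getsockopt. The sketch below is an assumption-laden illustration, not one of the demonstration utilities mentioned in this mail: `peer_security_label` is a hypothetical helper, the fallback value 31 is the i386 number proposed in the patch, and the label format depends entirely on the security module. Where no handler is registered the call is expected to fail, for example with ENOPROTOOPT as in the patch's default hook.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SO_PEERSEC
#define SO_PEERSEC 31 /* i386 value proposed in the patch; assumption elsewhere */
#endif

/*
 * Copy the peer's security label into buf as a NUL-terminated string.
 * Returns the label length on success, or a negative errno value
 * (for example -ENOPROTOOPT when no security module handles the option).
 */
static int peer_security_label(int fd, char *buf, size_t buflen)
{
    socklen_t len;

    assert(buf != NULL && buflen > 0);
    len = (socklen_t)(buflen - 1); /* keep room for the terminator */
    if (getsockopt(fd, SOL_SOCKET, SO_PEERSEC, buf, &len) < 0)
        return -errno;
    buf[len] = '\0';
    return (int)len;
}
```

A server would typically call this on the descriptor returned by accept(), in the same place where it would otherwise use SO_PEERCRED.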
An SELinux specific handler and demonstration userspace utilities are available at: http://people.redhat.com/jmorris/selinux/peersec/ This functionality is required to support SELinux DBUS and Security Enhanced X, and should be generally useful to security-aware applications which need to authenticate other applications. Three new LSM hooks have been implemented: - socket_getpeersec() is the getsockopt interface. - sk_alloc_security() and sk_free_security() facilitate the use of an sk_security field, which is used to store the security credentials of the Unix peer. We can't use an existing security field for this (e.g. inode), as we need the security credentials of the server's child socket. This follows the same general scheme used for managing existing Unix peer credentials. Comments? - James -- James Morris diff -urN -X dontdiff linux-2.6.0-test11.orig/include/asm-i386/socket.h linux-2.6.0-test11.w2/include/asm-i386/socket.h --- linux-2.6.0-test11.orig/include/asm-i386/socket.h 2003-09-27 20:50:09.000000000 -0400 +++ linux-2.6.0-test11.w2/include/asm-i386/socket.h 2003-12-10 09:55:39.371902424 -0500 @@ -45,6 +45,8 @@ #define SO_ACCEPTCONN 30 +#define SO_PEERSEC 31 + /* Nasty libc5 fixup - bletch */ #if defined(__KERNEL__) || !defined(__GLIBC__) || (__GLIBC__ < 2) /* Socket types. */ diff -urN -X dontdiff linux-2.6.0-test11.orig/include/linux/security.h linux-2.6.0-test11.w2/include/linux/security.h --- linux-2.6.0-test11.orig/include/linux/security.h 2003-10-15 08:53:19.000000000 -0400 +++ linux-2.6.0-test11.w2/include/linux/security.h 2003-12-10 09:55:39.375901816 -0500 @@ -757,6 +757,19 @@ * incoming sk_buff @skb has been associated with a particular socket, @sk. * @sk contains the sock (not socket) associated with the incoming sk_buff. * @skb contains the incoming network data. + * @socket_getpeersec: + * This hook allows the security module to provide peer socket security + * state to userspace via getsockopt SO_GETPEERSEC. + * @sock is the local socket. 
+ * @optval userspace memory where the security state is to be copied. + * @len is the maximum length to copy to userspace. + * Return 0 if all is well, otherwise, typical getsockopt return + * values. + * @sk_alloc_security: + * Allocate and attach a security structure to the sk->sk_security field, + * which is used to copy security attributes between local stream sockets. + * @sk_free_security: + * Deallocate security structure. * * Security hooks affecting all System V IPC operations. * @@ -1183,6 +1196,9 @@ int (*socket_setsockopt) (struct socket * sock, int level, int optname); int (*socket_shutdown) (struct socket * sock, int how); int (*socket_sock_rcv_skb) (struct sock * sk, struct sk_buff * skb); + int (*socket_getpeersec) (struct socket *sock, char __user *optval, unsigned len); + int (*sk_alloc_security) (struct sock *sk, int family, int priority); + void (*sk_free_security) (struct sock *sk); #endif /* CONFIG_SECURITY_NETWORK */ }; @@ -2564,6 +2580,22 @@ { return security_ops->socket_sock_rcv_skb (sk, skb); } + +static inline int security_socket_getpeersec(struct socket *sock, + char __user *optval, unsigned len) +{ + return security_ops->socket_getpeersec(sock, optval, len); +} + +static inline int security_sk_alloc_security(struct sock *sk, int family, int priority) +{ + return security_ops->sk_alloc_security(sk, family, priority); +} + +static inline void security_sk_free_security(struct sock *sk) +{ + return security_ops->sk_free_security(sk); +} #else /* CONFIG_SECURITY_NETWORK */ static inline int security_unix_stream_connect(struct socket * sock, struct socket * other, @@ -2664,6 +2696,21 @@ { return 0; } + +static inline int security_socket_getpeersec(struct socket *sock, + char __user *optval, unsigned optlen) +{ + return -ENOPROTOOPT; +} + +static inline int security_sk_alloc_security(struct sock *sk, int family, int priority) +{ + return 0; +} + +static inline void security_sk_free_security(struct sock *sk) +{ +} #endif /* 
CONFIG_SECURITY_NETWORK */ #endif /* ! __LINUX_SECURITY_H */ diff -urN -X dontdiff linux-2.6.0-test11.orig/include/net/sock.h linux-2.6.0-test11.w2/include/net/sock.h --- linux-2.6.0-test11.orig/include/net/sock.h 2003-12-01 15:27:07.000000000 -0500 +++ linux-2.6.0-test11.w2/include/net/sock.h 2003-12-10 09:55:39.376901664 -0500 @@ -246,6 +246,9 @@ struct socket *sk_socket; void *sk_user_data; struct module *sk_owner; +#ifdef CONFIG_SECURITY_NETWORK + void *sk_security; +#endif void (*sk_state_change)(struct sock *sk); void (*sk_data_ready)(struct sock *sk, int bytes); void (*sk_write_space)(struct sock *sk); diff -urN -X dontdiff linux-2.6.0-test11.orig/net/core/sock.c linux-2.6.0-test11.w2/net/core/sock.c --- linux-2.6.0-test11.orig/net/core/sock.c 2003-12-01 15:27:04.000000000 -0500 +++ linux-2.6.0-test11.w2/net/core/sock.c 2003-12-10 09:55:39.378901360 -0500 @@ -564,6 +564,9 @@ v.val = sk->sk_state == TCP_LISTEN; break; + case SO_PEERSEC: + return security_socket_getpeersec(sock, optval, len); + default: return(-ENOPROTOOPT); } @@ -606,6 +609,11 @@ sock_lock_init(sk); } sk->sk_slab = slab; + + if (security_sk_alloc_security(sk, family, priority)) { + kmem_cache_free(sk->sk_slab, sk); + sk = NULL; + } } return sk; } @@ -628,6 +636,7 @@ printk(KERN_DEBUG "%s: optmem leakage (%d bytes) detected.\n", __FUNCTION__, atomic_read(&sk->sk_omem_alloc)); + security_sk_free_security(sk); kmem_cache_free(sk->sk_slab, sk); module_put(owner); } diff -urN -X dontdiff linux-2.6.0-test11.orig/security/dummy.c linux-2.6.0-test11.w2/security/dummy.c --- linux-2.6.0-test11.orig/security/dummy.c 2003-10-15 08:53:19.000000000 -0400 +++ linux-2.6.0-test11.w2/security/dummy.c 2003-12-10 09:55:39.379901208 -0500 @@ -793,6 +793,21 @@ { return 0; } + +static int dummy_socket_getpeersec(struct socket *sock, + char __user *optval, unsigned optlen) +{ + return -ENOPROTOOPT; +} + +static inline int dummy_sk_alloc_security (struct sock *sk, int family, int priority) +{ + return 0; +} + +static 
inline void dummy_sk_free_security (struct sock *sk) +{ +} #endif /* CONFIG_SECURITY_NETWORK */ static int dummy_register_security (const char *name, struct security_operations *ops) @@ -969,6 +984,9 @@ set_to_dummy_if_null(ops, socket_getsockopt); set_to_dummy_if_null(ops, socket_shutdown); set_to_dummy_if_null(ops, socket_sock_rcv_skb); + set_to_dummy_if_null(ops, socket_getpeersec); + set_to_dummy_if_null(ops, sk_alloc_security); + set_to_dummy_if_null(ops, sk_free_security); #endif /* CONFIG_SECURITY_NETWORK */ } From shemminger@osdl.org Wed Dec 10 10:09:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 10:09:24 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAI98Ta012332 for ; Wed, 10 Dec 2003 10:09:08 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hBAI8qZ11168; Wed, 10 Dec 2003 10:08:52 -0800 Date: Wed, 10 Dec 2003 10:08:51 -0800 From: Stephen Hemminger To: "David S. Miller" , Ralf Baechle Cc: netdev@oss.sgi.com, linux-hams@vger.kernel.org Subject: [PATCH] use after free in AF_ROSE Message-Id: <20031210100851.4b4a6927.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1964 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1489 Lines: 44 Doing multiple protocol testing and get crashes with simple socket/close combo with AF_ROSE. 
The problem is that it dereferences the socket in rose_release after it has already been freed by rose_destroy_socket. This patch fixes that problem, and also uses sock_put to handle the case where rose_destroy_socket is called with sk_refcnt > 1, which might be possible if data comes in during close.

The other X.25-like protocols (AX.25, X.25, Netrom) had the same problem, but have been fixed already (in 2.6.0-test2).

# This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1533 -> 1.1534 # net/rose/af_rose.c 1.34 -> 1.35 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/12/10 shemminger@osdl.org 1.1534 # Rose protocol use after free bug. # -------------------------------------------- # diff -Nru a/net/rose/af_rose.c b/net/rose/af_rose.c --- a/net/rose/af_rose.c Wed Dec 10 09:47:02 2003 +++ b/net/rose/af_rose.c Wed Dec 10 09:47:02 2003 @@ -359,7 +359,7 @@ sk->sk_timer.data = (unsigned long)sk; add_timer(&sk->sk_timer); } else - sk_free(sk); + sock_put(sk); } /* @@ -634,7 +634,6 @@ } sock->sk = NULL; - sk->sk_socket = NULL; /* Not used, but we should do this.
**/ return 0; } From ruddk@us.ibm.com Wed Dec 10 10:22:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 10:22:39 -0800 (PST) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAIMITa012814 for ; Wed, 10 Dec 2003 10:22:25 -0800 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e31.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hBAILQb0385986; Wed, 10 Dec 2003 13:21:26 -0500 Received: from gateway1.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id hBAILPRp083092; Wed, 10 Dec 2003 11:21:26 -0700 Received: from crg8.beaverton.ibm.com (crg8.beaverton.ibm.com [9.47.57.33]) by gateway1.beaverton.ibm.com (8.11.6/8.11.6) with ESMTP id hBAILPc06889; Wed, 10 Dec 2003 10:21:25 -0800 Received: from w-kevinr.beaverton.ibm.com (w-kevinr.beaverton.ibm.com [9.47.13.228]) by crg8.beaverton.ibm.com (8.10.0.B10p2/8.8.5/token.aware-1.2) with ESMTP id hBAILNT22967; Wed, 10 Dec 2003 10:21:23 -0800 (PST) Date: Wed, 10 Dec 2003 10:23:15 -0800 (PST) From: "Kevin W. Rudd" X-X-Sender: kevinr@w-kevinr.svc.beaverton.ibm.com To: David Miller , Alexey Kuznetsov cc: netdev@oss.sgi.com Subject: PMTU issues due to TOS field manipulation (for DSCP) Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1965 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ruddk@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 3242 Lines: 80 Here is a recent posting to linux-kernel related to PMTU issues when routers use DSCP marking. > Subject Yet another UDP pmtud iss, it's different, really > Date Sun, 16 Nov 2003 22:25:08 -0800 > From "Johnson, Chester F" > > This is not the same as the pmtud issues discussed ad-nauseum from 1999 > through 2001. It really is different. Trust me, please read on. 
> > Well, it is similar, but with a twist. We are in the middle of deploying > DiffServ compliant QoS throughout our networks and stumbled across an > issue that occurs when we configure our routers to mark the DiffServ > Code Points (DSCP) for UDP traffic (AFS, NFS, other full frame size UDP > traffic). > > The problem is that when the marked traffic reaches an IPsec/Ethernet > segment, and the DF bit set to true, an ICMP message is returned to the > transmitting host to say basically "fix your MTU". Since we have changed > the ToS field with DSCP information, the ICMP message no longer matches > anything in the route cache hash. If the ToS field is not "0", it must > match src, dst, and ToS in the cache. Well, we changed one of them and > there can be no such match. > > The net result is that the transmitting host sends another 1500 byte > packet and the process repeats itself. Ultimately the data transfer > fails. When we stop DSCP marking, MTU negotiation works just fine, but > we have no QoS. > > This kind of match might be great if we use a Linux platform as a > router. It may indeed be useful for higher performance DiffServ routing. > This kind of match requirement for an end-host is problematic. In our > estimation it looks like a bug. > > Can anyone out there help sort this out? > > Chester Johnson > Network Transport Engineering > Intel Corporation > At least in the case of a "Destination unreachable/Fragmentation needed" ICMP message, there is an assumption that the TOS value returned will not have changed. The ip_rt_frag_needed() routine will fail to find a cached route with a matching TOS (since it has been changed by the DSCP marking) and so the MTU will not be properly updated. Given that the DS field definition supersedes the previous TOS definitions, this does indeed present a problem for route modifying ICMP messages in a network environment that is using DSCP marking. 
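
The lookup failure described above can be sketched with a toy cache key. This is a simplified illustration, not the code in ip_rt_frag_needed(): `struct rt_key`, `rt_key_match` and the `tos_mask` parameter are hypothetical, and the mask value 0x1c used in the usage note is an assumption about the TOS bits the routing code compares.

```c
#include <stdint.h>

/* Illustrative cache key; the real routing cache key has more fields. */
struct rt_key {
    uint32_t saddr;
    uint32_t daddr;
    uint8_t  tos;
};

/*
 * Compare the cached key with the key rebuilt from the IP header quoted
 * in the ICMP message, looking only at the TOS bits selected by tos_mask.
 * Returns nonzero on a match.
 */
static int rt_key_match(const struct rt_key *cached,
                        const struct rt_key *quoted, uint8_t tos_mask)
{
    return cached->saddr == quoted->saddr &&
           cached->daddr == quoted->daddr &&
           (cached->tos & tos_mask) == (quoted->tos & tos_mask);
}
```

If the host sent the packet with TOS 0x00 and a router re-marked it to DSCP AF11 (TOS byte 0x28), then with a mask of 0x1c the quoted header no longer matches the cached entry and the MTU update is lost. With a mask of 0 the same two keys match, because only the addresses are compared.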
This particular user would like the ability to turn off the caching of the TOS value within the routing tables. On the surface, it looks like simply manipulating the IPTOS_RT_MASK would accomplish what they are looking for with minimal code changes. This can currently be done with something equivalent in include/net/route.h:

	#ifndef CONFIG_IP_ROUTE_TOS
	#define IPTOS_RT_MASK 0
	#endif

and then rebuilding the kernel with the CONFIG_IP_ROUTE_TOS unset.

If the approach of zeroing out the TOS routing mask seems reasonable, a more dynamic approach would be desired (a sysctl variable that can be modified without having to build a custom kernel).

Long term, does it really make sense to continue trying to make routing decisions based on TOS when this field is obsolete?

Thoughts? Comments?

Thanks,
-Kevin

--
Kevin W. Rudd
Linux Change Team
IBM Global Services
1-800-426-7378, T/L 775-4161

From shemminger@osdl.org Wed Dec 10 10:23:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 10:23:22 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAIN9Ta012871 for ; Wed, 10 Dec 2003 10:23:09 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hBAIMuZ14577; Wed, 10 Dec 2003 10:22:56 -0800 Date: Wed, 10 Dec 2003 10:22:56 -0800 From: Stephen Hemminger To: "David S.
Miller" , Rusty Russell , Michal Ostrowski Cc: netdev@oss.sgi.com Subject: [PATCH] MODULE_ALIAS_PROTO for PPPoE Message-Id: <20031210102256.27d2ff51.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1966 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 938 Lines: 23 Rusty added module aliases for most protocol families but missed this one. # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1534 -> 1.1535 # drivers/net/pppoe.c 1.37 -> 1.38 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/12/10 shemminger@osdl.org 1.1535 # Add module alias for pppoe # -------------------------------------------- # diff -Nru a/drivers/net/pppoe.c b/drivers/net/pppoe.c --- a/drivers/net/pppoe.c Wed Dec 10 10:20:36 2003 +++ b/drivers/net/pppoe.c Wed Dec 10 10:20:36 2003 @@ -1151,3 +1151,4 @@ MODULE_AUTHOR("Michal Ostrowski "); MODULE_DESCRIPTION("PPP over Ethernet driver"); MODULE_LICENSE("GPL"); +MODULE_ALIAS_NETPROTO(PF_PPPOX); From ak@suse.de Wed Dec 10 11:34:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 11:35:14 -0800 (PST) Received: from Cantor.suse.de (ns.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAJYkTa016531 for ; Wed, 10 Dec 2003 11:34:47 -0800 Received: from Hermes.suse.de (Hermes.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id 4174D18CFD26; Wed, 10 Dec 2003 20:34:40 +0100 (CET) Date: Wed, 10 Dec 2003 20:34:33 +0100 From: Andi Kleen To: "Kevin W. Rudd" Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) Message-Id: <20031210203433.3747b7fd.ak@suse.de> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1967 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev Content-Length: 700 Lines: 19 On Wed, 10 Dec 2003 10:23:15 -0800 (PST) "Kevin W. Rudd" wrote: > > Thoughts? Comments? 
I don't think the network users will be very happy if you require changing all end hosts this way. How about the following hack? For DF=1 packets the ipid field is useless. When you rewrite the TOS to DSCP save the old TOS in the ipid field. When you see an ICMP fragment required message with the right DSCP on the router restore the old TOS from the ipid field. One drawback is that there are a few VJ header compression servers that require changing ipid. But you can still handle these by not overwriting all bits of ipid so that it still changes (keep a few changing bits) -Andi From shemminger@osdl.org Wed Dec 10 12:03:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 12:03:34 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAK3LTa017304 for ; Wed, 10 Dec 2003 12:03:21 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hBAK39Z03736; Wed, 10 Dec 2003 12:03:10 -0800 Date: Wed, 10 Dec 2003 12:03:09 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Are IPV6 module ref counts wrong on 2.6.0-test11? Message-Id: <20031210120309.09b8fd84.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1968 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 281 Lines: 8 If I load IPV6 as a module, it comes up with a ref count of 10 even though there are no applications using IPV6. 
Interesting /proc/net/sockstat6 shows 1 TCP connection, but there is no application with that open. This doesn't seem right, and it makes unloading IPV6 impossible. From ja@ssi.bg Wed Dec 10 12:22:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 12:22:35 -0800 (PST) Received: from l.himel.bg (IDENT:root@unamed.infotel.bg [212.39.68.18] (may be forged)) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAKMJTa020606 for ; Wed, 10 Dec 2003 12:22:21 -0800 Received: from linux.himel.bg (IDENT:ja@linux.himel.bg [127.0.0.1]) by l.himel.bg (8.11.6/8.9.3) with ESMTP id hBAKKu507361; Wed, 10 Dec 2003 22:20:56 +0200 Date: Wed, 10 Dec 2003 22:20:56 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@l To: Andi Kleen cc: "Kevin W. Rudd" , , , , Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) In-Reply-To: <20031210203433.3747b7fd.ak@suse.de> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1969 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Content-Length: 788 Lines: 23 Hello, On Wed, 10 Dec 2003, Andi Kleen wrote: > I don't think the network users will be very happy if you require changing all > end hosts this way. How about the following hack? For DF=1 packets > the ipid field is useless. When you rewrite the TOS to DSCP save > the old TOS in the ipid field. When you see an ICMP fragment required > message with the right DSCP on the router restore the old TOS from the ipid > field. It can be an option: pmtu_promisc or whatever (default 0). We should avoid using TOS as hash key and if the var is 1 we should ignore matching the tos keys in ip_rt_frag_needed and ip_rt_redirect. By this way, all cache entries between saddr and daddr will be updated, no matter the TOS or even OIF keys. Comments? 
Regards -- Julian Anastasov From ruddk@us.ibm.com Wed Dec 10 12:25:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 12:25:47 -0800 (PST) Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAKPYTa020990 for ; Wed, 10 Dec 2003 12:25:34 -0800 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e6.ny.us.ibm.com (8.12.10/8.12.2) with ESMTP id hBAKOsDA446388; Wed, 10 Dec 2003 15:24:54 -0500 Received: from gateway.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.6) with ESMTP id hBAKOWPL081000; Wed, 10 Dec 2003 15:24:36 -0500 Received: from crg8.beaverton.ibm.com (crg8.beaverton.ibm.com [9.47.57.33]) by gateway.beaverton.ibm.com (8.11.6/8.11.6) with ESMTP id hBAKOWO25363; Wed, 10 Dec 2003 12:24:32 -0800 Received: from w-kevinr.beaverton.ibm.com (w-kevinr.beaverton.ibm.com [9.47.13.228]) by crg8.beaverton.ibm.com (8.10.0.B10p2/8.8.5/token.aware-1.2) with ESMTP id hBAKOUT16088; Wed, 10 Dec 2003 12:24:31 -0800 (PST) Date: Wed, 10 Dec 2003 12:26:26 -0800 (PST) From: "Kevin W. Rudd" X-X-Sender: kevinr@w-kevinr.svc.beaverton.ibm.com To: Andi Kleen cc: davem@redhat.com, , , Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) In-Reply-To: <20031210203433.3747b7fd.ak@suse.de> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1970 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ruddk@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1532 Lines: 43 I don't think it's practical to make these types of changes at the router level. Discounting not having the ability to change the IOS behavior, it is quite probable that the router doing the marking is beyond your control (e.g. at the ISP level). 
The only network users that would have to make this change would be those taking advantage of a network using QoS marking (and a smaller MTU somewhere in the path). That's why the ideal short term solution would be to allow the manipulation of this behavior via a sysctl option/variable, but keep the default the same as it is now. -Kevin On Wed, 10 Dec 2003, Andi Kleen wrote: > Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) > > On Wed, 10 Dec 2003 10:23:15 -0800 (PST) > "Kevin W. Rudd" wrote: > > > > > Thoughts? Comments? > > I don't think the network users will be very happy if you require changing all > end hosts this way. How about the following hack? For DF=1 packets > the ipid field is useless. When you rewrite the TOS to DSCP save > the old TOS in the ipid field. When you see an ICMP fragment required > message with the right DSCP on the router restore the old TOS from the ipid > field. > > One drawback is that there are a few VJ header compression servers that > require changing ipid. But you can still handle these by not overwriting > all bits of ipid so that it still changes (keep a few changing bits) > > -Andi > -- Kevin W. 
Rudd Linux Change Team IBM Global Services 1-800-426-7378, T/L 775-4161 From niv@us.ibm.com Wed Dec 10 12:33:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 12:33:28 -0800 (PST) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAKXCTa021431 for ; Wed, 10 Dec 2003 12:33:15 -0800 Received: from westrelay05.boulder.ibm.com (westrelay05.boulder.ibm.com [9.17.193.33]) by e33.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hBAKWXJh099508; Wed, 10 Dec 2003 15:32:33 -0500 Received: from us.ibm.com (d03av03.boulder.ibm.com [9.17.193.83]) by westrelay05.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id hBAKWVeo043950; Wed, 10 Dec 2003 13:32:32 -0700 Message-ID: <3FD7826F.1050308@us.ibm.com> Date: Wed, 10 Dec 2003 12:30:39 -0800 From: Nivedita Singhvi User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Andi Kleen CC: "Kevin W. Rudd" , davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) References: <20031210203433.3747b7fd.ak@suse.de> In-Reply-To: <20031210203433.3747b7fd.ak@suse.de> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1971 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 502 Lines: 17 Andi Kleen wrote: > I don't think the network users will be very happy if you require changing all > end hosts this way. How about the following hack? For DF=1 packets > the ipid field is useless. When you rewrite the TOS to DSCP save > the old TOS in the ipid field. When you see an ICMP fragment required > message with the right DSCP on the router restore the old TOS from the ipid > field. 
Wouldnt this require changes at the both ends, router and host? How would we sync? thanks, Nivedita From niv@us.ibm.com Wed Dec 10 12:36:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 12:36:32 -0800 (PST) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAKaITa021809 for ; Wed, 10 Dec 2003 12:36:19 -0800 Received: from westrelay05.boulder.ibm.com (westrelay05.boulder.ibm.com [9.17.193.33]) by e35.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hBAKZfcp407880; Wed, 10 Dec 2003 15:35:41 -0500 Received: from us.ibm.com (d03av03.boulder.ibm.com [9.17.193.83]) by westrelay05.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id hBAKZdeo116324; Wed, 10 Dec 2003 13:35:40 -0700 Message-ID: <3FD7832B.3010709@us.ibm.com> Date: Wed, 10 Dec 2003 12:33:47 -0800 From: Nivedita Singhvi User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Julian Anastasov CC: Andi Kleen , "Kevin W. Rudd" , davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1972 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 562 Lines: 16 Julian Anastasov wrote: > It can be an option: pmtu_promisc or whatever (default 0). > We should avoid using TOS as hash key and if the var is 1 we should > ignore matching the tos keys in ip_rt_frag_needed and ip_rt_redirect. > By this way, all cache entries between saddr and daddr will be updated, no > matter the TOS or even OIF keys. Comments? 
This is what I was going to suggest too, but one of the drawbacks would be needing the user to recompile the distro kernel just for this..could we keep any fix to this dynamically tunable? thanks, Nivedita From davem@pizda.ninka.net Wed Dec 10 12:51:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 12:51:16 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAKp3Ta022689 for ; Wed, 10 Dec 2003 12:51:03 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA18386; Wed, 10 Dec 2003 12:49:49 -0800 Date: Wed, 10 Dec 2003 12:49:48 -0800 From: "David S. Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: Are IPV6 module ref counts wrong on 2.6.0-test11? Message-Id: <20031210124948.042a3a5f.davem@redhat.com> In-Reply-To: <20031210120309.09b8fd84.shemminger@osdl.org> References: <20031210120309.09b8fd84.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1973 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 589 Lines: 16 On Wed, 10 Dec 2003 12:03:09 -0800 Stephen Hemminger wrote: > If I load IPV6 as a module, it comes up with a ref count of 10 > even though there are no applications using IPV6. > > Interesting /proc/net/sockstat6 shows 1 TCP connection, but there is > no application with that open. > > This doesn't seem right, and it makes unloading IPV6 impossible. IPV6 is not safe to unload as a module, it is going to take a lot of difficult work to fix this. The 10 reference counts come from things like sockets created for sending ipv6 ICMP packets and stuff like that. 
From ak@suse.de Wed Dec 10 12:52:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 12:53:10 -0800 (PST) Received: from Cantor.suse.de (ns.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAKqsTa023019 for ; Wed, 10 Dec 2003 12:52:57 -0800 Received: from Hermes.suse.de (Hermes.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id A422718D369A; Wed, 10 Dec 2003 21:52:48 +0100 (CET) Date: Wed, 10 Dec 2003 21:52:44 +0100 From: Andi Kleen To: "Kevin W. Rudd" Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) Message-Id: <20031210215244.4713fd25.ak@suse.de> In-Reply-To: References: <20031210203433.3747b7fd.ak@suse.de> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1974 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev Content-Length: 301 Lines: 10 On Wed, 10 Dec 2003 12:26:26 -0800 (PST) "Kevin W. Rudd" wrote: > I don't think it's practical to make these types of changes at the > router level. Just changing all end hosts for this will be likely even less practical because there are much more of them than routers. 
-Andi From niv@us.ibm.com Wed Dec 10 13:13:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 13:14:06 -0800 (PST) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBALDjTa024883 for ; Wed, 10 Dec 2003 13:13:46 -0800 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.195.10]) by e31.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hBALDY19519582; Wed, 10 Dec 2003 16:13:34 -0500 Received: from us.ibm.com (d03av03.boulder.ibm.com [9.17.193.83]) by westrelay01.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id hBALDXKL072338; Wed, 10 Dec 2003 14:13:34 -0700 Message-ID: <3FD78C0D.4030609@us.ibm.com> Date: Wed, 10 Dec 2003 13:11:41 -0800 From: Nivedita Singhvi User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Andi Kleen CC: ruddk@us.ibm.com, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) References: <20031210203433.3747b7fd.ak@suse.de> <3FD7826F.1050308@us.ibm.com> <20031210215527.239fb33f.ak@suse.de> In-Reply-To: <20031210215527.239fb33f.ak@suse.de> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1975 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 876 Lines: 26 Andi Kleen wrote: > Only the router would need to change (and rewrite ICMP messages, which is a > bit nasty, but then compatibility is not always fun) True, but I get a sinking feeling when I hear we have to change router/icmp code. The word "only" seems inappropriate ;). > -Andi > > P.S.: I am not opposed to fixing linux for this, just I have my doubts > that fixing all end hosts is a practical solution for the problem. 
The reason it seems simpler to me, at least, and I don't speak for Kevin or the person at Intel who first reported the problem, is that I see only a few people (well, only Intel :)) running into the problem. So if there were a patch available or the fix were in some release, only the few end hosts who actually needed it would need to upgrade. Otherwise, rollout would be evolutionary. But also open to other approaches... thanks, Nivedita From ja@ssi.bg Wed Dec 10 13:17:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 13:18:04 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBALHVTa025708 for ; Wed, 10 Dec 2003 13:17:37 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hBALI5CL001387; Wed, 10 Dec 2003 23:18:05 +0200 Date: Wed, 10 Dec 2003 23:18:05 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: Nivedita Singhvi cc: Andi Kleen , "Kevin W. Rudd" , , , , Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) In-Reply-To: <3FD7832B.3010709@us.ibm.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1976 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Content-Length: 1593 Lines: 50 Hello, On Wed, 10 Dec 2003, Nivedita Singhvi wrote: > Julian Anastasov wrote: > > > It can be an option: pmtu_promisc or whatever (default 0). > > We should avoid using TOS as hash key and if the var is 1 we should > > ignore matching the tos keys in ip_rt_frag_needed and ip_rt_redirect. > > By this way, all cache entries between saddr and daddr will be updated, no > > matter the TOS or even OIF keys. Comments? 
>
> This is what I was going to suggest too, but one of the drawbacks
> would be needing the user to recompile the distro kernel just
> for this..could we keep any fix to this dynamically tunable?

Yes, I mean sysctl var in /proc. For now, there are two needs:

1. update PMTU info ignoring TOS and OIF

2. Reduce the number of hash entries by ignoring the TOS, may be
depending on CONFIG_IP_ROUTE_TOS, it would be preferred to be sysctl
var but the real benefits depend on the TOS usage.

I see three cases:

1. we do not route by TOS => we want to ignore TOS everywhere, if
possible, sysctl var variant for IPTOS_RT_MASK=0

2. we route by TOS (IPTOS_RT_MASK!=0) but we like the idea to
ignore TOS for PMTUD and ICMP redirects.

3. the current behavior, PMTUD does not work if TOS is changed,
the routing cache keeps many entries with different TOS

The question is how many proc entries are needed for these
cases. It seems, we can cover all cases with one proc entry: to
make IPTOS_RT_MASK a sysctl var. But there is still difference
between cases 1 and 2, may be the var will have more than 2 values?
> thanks, > Nivedita Regards -- Julian Anastasov From chester.f.johnson@intel.com Wed Dec 10 13:35:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 13:35:45 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBALZFTa027058 for ; Wed, 10 Dec 2003 13:35:16 -0800 Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37]) by caduceus.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-outer.mc,v 1.11 2003/12/03 18:29:28 root Exp $) with ESMTP id hBALa8ew018655 for ; Wed, 10 Dec 2003 21:36:08 GMT Received: from fmsmsxv041-1.fm.intel.com (fmsmsxvs041.fm.intel.com [132.233.42.126]) by petasus.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.6 2003/12/04 00:48:44 root Exp $) with SMTP id hBALZuWU014218 for ; Wed, 10 Dec 2003 21:36:01 GMT Received: from fmsmsx331.amr.corp.intel.com ([132.233.42.135]) by fmsmsxv041-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003121013350717918 ; Wed, 10 Dec 2003 13:35:07 -0800 Received: from fmsmsx410.amr.corp.intel.com ([132.233.42.51]) by fmsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 10 Dec 2003 13:35:07 -0800 X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: PMTU issues due to TOS field manipulation (for DSCP) Date: Wed, 10 Dec 2003 13:35:07 -0800 Message-ID: <7E713DB94F47914DB5AAB80DE8EEA87501539903@fmsmsx410.fm.intel.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: PMTU issues due to TOS field manipulation (for DSCP) Thread-Index: AcO/YFmxPP5VaHCOS1W7AzF0RfILpgAALtaw From: "Johnson, Chester F" To: "Andi Kleen" , "Nivedita Singhvi" Cc: , X-OriginalArrivalTime: 10 Dec 2003 21:35:07.0508 (UTC) FILETIME=[7B243B40:01C3BF65] X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id hBALZFTa027058 X-archive-position: 1977 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chester.f.johnson@intel.com Precedence: bulk X-list: netdev Content-Length: 1964 Lines: 52 From a corporate network perspective there are a couple of options that can temporarily work around the problem. As an example, support for jumbo frames on Ethernet IP-Sec interfaces. This would require some hardware replacement, but GigE generally supports this. On the other hand, I'm not sure I want to comprehend IP-Sec at GigE line rates. We are pretty early in deploying diff-serv compliant QoS at this scale. We are also one of few companies that mandate encrypted private WAN links (Only the paranoid survive). As we look forward, we would expect more frequent appearance of this behavior at other companies. All that said, an evolutionary roll-out is OK. Knowing that there is a target release that would resolve the issue gives us something to look forward to. It may also be a good idea to submit some clarification in the relevant RFCs. chet -----Original Message----- From: Andi Kleen [mailto:ak@suse.de] Sent: Wednesday, December 10, 2003 12:55 PM To: Nivedita Singhvi Cc: ruddk@us.ibm.com; netdev@oss.sgi.com; Johnson, Chester F Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) On Wed, 10 Dec 2003 12:30:39 -0800 Nivedita Singhvi wrote: > > I don't think the network users will be very happy if you require > > changing all end hosts this way. How about the following hack? For > > DF=1 packets the ipid field is useless. When you rewrite the TOS to > > DSCP save the old TOS in the ipid field. When you see an ICMP > > fragment required message with the right DSCP on the router restore > > the old TOS from the ipid field. > > Wouldnt this require changes at the both ends, router and host? 
How > would we sync? Only the router would need to change (and rewrite ICMP messages, which is a bit nasty, but then compatibility is not always fun) -Andi P.S.: I am not opposed to fixing linux for this, just I have my doubts that fixing all end hosts is a practical solution for the problem. From ak@suse.de Wed Dec 10 13:47:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 13:48:08 -0800 (PST) Received: from Cantor.suse.de (ns.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBALldTa028420 for ; Wed, 10 Dec 2003 13:47:39 -0800 Received: from Hermes.suse.de (Hermes.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id BE0B918D2D61; Wed, 10 Dec 2003 21:55:28 +0100 (CET) Date: Wed, 10 Dec 2003 21:55:27 +0100 From: Andi Kleen To: Nivedita Singhvi Cc: ruddk@us.ibm.com, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) Message-Id: <20031210215527.239fb33f.ak@suse.de> In-Reply-To: <3FD7826F.1050308@us.ibm.com> References: <20031210203433.3747b7fd.ak@suse.de> <3FD7826F.1050308@us.ibm.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1978 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev Content-Length: 837 Lines: 21 On Wed, 10 Dec 2003 12:30:39 -0800 Nivedita Singhvi wrote: > > I don't think the network users will be very happy if you require changing all > > end hosts this way. How about the following hack? For DF=1 packets > > the ipid field is useless. When you rewrite the TOS to DSCP save > > the old TOS in the ipid field. 
When you see an ICMP fragment required > > message with the right DSCP on the router restore the old TOS from the ipid > > field. > > Wouldnt this require changes at the both ends, router and > host? How would we sync? Only the router would need to change (and rewrite ICMP messages, which is a bit nasty, but then compatibility is not always fun) -Andi P.S.: I am not opposed to fixing linux for this, just I have my doubts that fixing all end hosts is a practical solution for the problem. From niv@us.ibm.com Wed Dec 10 14:39:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 14:39:26 -0800 (PST) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAMd5Ta031157 for ; Wed, 10 Dec 2003 14:39:12 -0800 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.195.10]) by e34.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hBAMcR6t295454; Wed, 10 Dec 2003 17:38:27 -0500 Received: from us.ibm.com (d03av03.boulder.ibm.com [9.17.193.83]) by westrelay01.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id hBAMcPKL063950; Wed, 10 Dec 2003 15:38:26 -0700 Message-ID: <3FD79FF1.8000505@us.ibm.com> Date: Wed, 10 Dec 2003 14:36:33 -0800 From: Nivedita Singhvi User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Julian Anastasov CC: Andi Kleen , "Kevin W. Rudd" , davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1979 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1052 Lines: 36 Julian Anastasov wrote: > I see three cases: > > 1. 
we do not route by TOS => we want to ignore TOS everywhere, if
> possible, sysctl var variant for IPTOS_RT_MASK=0
>
> 2. we route by TOS (IPTOS_RT_MASK!=0) but we like the idea to
> ignore TOS for PMTUD and ICMP redirects.
> 3. the current behavior, PMTUD does not work if TOS is changed,
> the routing cache keeps many entries with different TOS
> The question is how many proc entries are needed for these
> cases. It seems, we can cover all cases with one proc entry: to
> make IPTOS_RT_MASK a sysctl var. But there is still difference
> between cases 1 and 2, may be the var will have more than 2 values?

1. we route by TOS
   - yes (no PMTUD/ICMP conflict)    MASK = $mask
   - no (PMTUD/ICMP conflict)        MASK = 0
2. we do not route by TOS at all     MASK = 0

Unless the kernel is checking on the fly, you do not need
to distinguish between 1b and 2, correct? (i.e.Since we are just
expecting the user to set the sysctl variable).

1a is your case 3, current behaviour.

thanks,
Nivedita

From davem@pizda.ninka.net Wed Dec 10 14:43:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 14:43:18 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAMh5Ta031589 for ; Wed, 10 Dec 2003 14:43:05 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id OAA18684; Wed, 10 Dec 2003 14:41:50 -0800 Date: Wed, 10 Dec 2003 14:41:50 -0800 From: "David S.
Miller" To: Stephen Hemminger Cc: rusty@rustcorp.com.au, mostrows@styk.uwaterloo.ca.sgi.com, netdev@oss.sgi.com Subject: Re: [PATCH] MODULE_ALIAS_PROTO for PPPoE Message-Id: <20031210144150.01124264.davem@redhat.com> In-Reply-To: <20031210102256.27d2ff51.shemminger@osdl.org> References: <20031210102256.27d2ff51.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1980 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 192 Lines: 6 On Wed, 10 Dec 2003 10:22:56 -0800 Stephen Hemminger wrote: > Rusty added module aliases for most protocol families but missed this one. Patch applied, thanks Stephen. From davem@pizda.ninka.net Wed Dec 10 14:53:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 14:53:30 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAMrHTa032494 for ; Wed, 10 Dec 2003 14:53:17 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id OAA18711; Wed, 10 Dec 2003 14:51:49 -0800 Date: Wed, 10 Dec 2003 14:51:49 -0800 From: "David S. 
Miller" To: Nivedita Singhvi Cc: ja@ssi.bg, ak@suse.de, ruddk@us.ibm.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) Message-Id: <20031210145149.2bd89e9b.davem@redhat.com> In-Reply-To: <3FD79FF1.8000505@us.ibm.com> References: <3FD79FF1.8000505@us.ibm.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1981 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 904 Lines: 23

Here is my take on this, as far as Linux is concerned.

I agree with the three behaviors proposed by Julian.
However I have some slight trouble with the ignore-TOS-for-PMTU
idea, implementation wise.

Walking the routing hash table for each possible TOS value
is going to be computationally expensive, and is inviting
computational complexity DDoS attacks by bombing the machine
with PMTU ICMP messages.

That is the most obvious implementation, and I'm not saying
there are not others. I just have no alternatives in mind
right now :)

But once that issue is resolved I'm more than happy to put
a patch in which does this stuff. We even have been speaking
about this in other threads on netdev wrt. Julian's patches.

TOS is truly a value with only network local meaning and hops
are going to modify the value on us. I'm actually surprised
this is the first time the issue has been seriously hit.
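
The cost concern in the message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code: the names `rt_key`, `toy_mix`, `hash_with_tos`, `hash_without_tos` and `chains_to_visit`, the table size, and the mixing constants are all hypothetical stand-ins. It assumes a route cache hashed on (daddr, saddr, tos) with a random seed. If a PMTU update must reach every entry for one saddr/daddr pair regardless of TOS, it has to probe one hash chain per candidate TOS value when TOS is part of the hash key, but only one chain when TOS is left out of the key and compared only while walking the chain.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_HASH_BITS 8
#define TOY_HASH_SIZE (1u << TOY_HASH_BITS)

/* Hypothetical lookup key; a real route cache key carries more fields. */
struct rt_key {
    uint32_t daddr;
    uint32_t saddr;
    uint8_t  tos;
};

/* Toy mixing function, standing in for a seeded hash (assumption). */
static uint32_t toy_mix(uint32_t a, uint32_t b, uint32_t c, uint32_t rnd)
{
    uint32_t h = a ^ rnd;

    h = (h ^ b) * 0x9e3779b1u;
    h = (h ^ c) * 0x85ebca6bu;
    h ^= h >> 15;
    return h & (TOY_HASH_SIZE - 1);
}

/* Variant A: TOS is hashed, so each TOS value may land in another chain. */
static uint32_t hash_with_tos(const struct rt_key *k, uint32_t rnd)
{
    return toy_mix(k->daddr, k->saddr, k->tos, rnd);
}

/* Variant B: TOS is not hashed; all TOS variants share one chain. */
static uint32_t hash_without_tos(const struct rt_key *k, uint32_t rnd)
{
    return toy_mix(k->daddr, k->saddr, 0, rnd);
}

/* Chains a TOS-agnostic PMTU update has to walk for one address pair. */
static size_t chains_to_visit(int tos_in_hash, size_t n_tos_values)
{
    return tos_in_hash ? n_tos_values : 1;
}
```

In this model the per-ICMP work in variant A grows with the number of TOS values that have to be tried, which is the amplification an attacker could exploit, while variant B trades that for longer individual chains.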
From davem@pizda.ninka.net Wed Dec 10 14:54:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 14:54:55 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAMsgTa000301 for ; Wed, 10 Dec 2003 14:54:42 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id OAA18738; Wed, 10 Dec 2003 14:53:26 -0800 Date: Wed, 10 Dec 2003 14:53:26 -0800 From: "David S. Miller" To: Stephen Hemminger Cc: ralf@linux-mips.org, netdev@oss.sgi.com, linux-hams@vger.kernel.org Subject: Re: [PATCH] use after free in AF_ROSE Message-Id: <20031210145326.6d1d9b81.davem@redhat.com> In-Reply-To: <20031210100851.4b4a6927.shemminger@osdl.org> References: <20031210100851.4b4a6927.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1982 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 528 Lines: 13 On Wed, 10 Dec 2003 10:08:51 -0800 Stephen Hemminger wrote: > Doing multiple protocol testing and get crashes with simple > socket/close combo with AF_ROSE. The problem is that it > dereferences the socket in rose_release after it has already been > freed by rose_destroy_socket. > > This patch fixes that problem, and also uses sock_put to handle the > case where rose_destroy_socket is called with sk_refcnt > 1 which > might be possible if data comes in during close. Applied, thanks a lot Stephen. 
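
The class of bug described in the quoted text above (a release path touching a socket that its own teardown call has already freed) can be sketched in userspace C. This is not the applied patch and not the AF_ROSE code; `toy_sock`, `toy_sock_hold`, `toy_sock_put`, `toy_destroy` and `toy_release` are hypothetical names chosen for illustration. The sketch assumes the usual reference-counting idea: teardown drops a reference instead of freeing outright, and the release path holds its own reference for as long as it still uses the object.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct toy_sock {
    int refcnt;   /* number of outstanding references */
    int closing;  /* nonzero once teardown has started */
};

static int toy_frees; /* counts objects actually freed, for inspection */

static struct toy_sock *toy_sock_alloc(void)
{
    struct toy_sock *sk = calloc(1, sizeof(*sk));

    if (sk)
        sk->refcnt = 1; /* the creator's reference */
    return sk;
}

static void toy_sock_hold(struct toy_sock *sk)
{
    sk->refcnt++;
}

/* Drop one reference; free only when the last one goes away. */
static void toy_sock_put(struct toy_sock *sk)
{
    if (--sk->refcnt == 0) {
        toy_frees++;
        free(sk);
    }
}

/* Teardown drops the creator's reference rather than calling free(). */
static void toy_destroy(struct toy_sock *sk)
{
    sk->closing = 1;
    toy_sock_put(sk);
}

/* Release path that still needs sk after calling toy_destroy(). */
static void toy_release(struct toy_sock *sk)
{
    toy_sock_hold(sk);   /* keep sk alive across teardown */
    toy_destroy(sk);
    sk->closing = 2;     /* safe: our own reference is still held */
    toy_sock_put(sk);    /* may free sk; it must not be touched afterwards */
}
```

If some other holder still has a reference when `toy_release()` runs (for example, data arriving during close), the object simply outlives the release and is freed by that holder's final `toy_sock_put()`.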
From davem@pizda.ninka.net Wed Dec 10 15:00:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 15:00:21 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAN08Ta001248 for ; Wed, 10 Dec 2003 15:00:08 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id OAA18756; Wed, 10 Dec 2003 14:56:52 -0800 Date: Wed, 10 Dec 2003 14:56:52 -0800 From: "David S. Miller" To: James Morris Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, linux-security-module@wirex.com, sds@epoch.ncsc.mil Subject: Re: [RFC] SO_PEERSEC - security credentials for Unix stream sockets Message-Id: <20031210145652.66bda4c2.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1983 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 995 Lines: 23 On Wed, 10 Dec 2003 11:33:53 -0500 (EST) James Morris wrote: > Three new LSM hooks have been implemented: > > - socket_getpeersec() is the getsockopt interface. > > - sk_alloc_security() and sk_free_security() facilitate the use of an > sk_security field, which is used to store the security credentials of > the Unix peer. We can't use an existing security field for this (e.g. > inode), as we need the security credentials of the server's child > socket. This follows the same general scheme used for managing existing > Unix peer credentials. > > Comments? I'm fine with this conceptually, although the earliest I could put this into the tree is 2.6.1 although I have a hunch that I'll be asked to defer something like this to 2.6.2, but who knows. The one thing I don't like is the ifdef conditionalized member of the sock struct. 
We should move away from config variables changing structure layouts. Even a "void *sk_security;" would be better. From davem@pizda.ninka.net Wed Dec 10 15:06:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 15:07:08 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBAN6rTa001705 for ; Wed, 10 Dec 2003 15:06:53 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id PAA18779; Wed, 10 Dec 2003 15:05:37 -0800 Date: Wed, 10 Dec 2003 15:05:37 -0800 From: "David S. Miller" To: Robert Olsson Cc: Robert.Olsson@data.slu.se, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: [PATCH] option for large routing hash Message-Id: <20031210150537.5fd0edb0.davem@redhat.com> In-Reply-To: <16343.12816.917811.588853@robur.slu.se> References: <16343.12816.917811.588853@robur.slu.se> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1984 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 1793 Lines: 44 On Wed, 10 Dec 2003 15:47:44 +0100 Robert Olsson wrote: > The RCU/sofirq stuff pop-up again (now with routing table loaded). > RCU will probable get on the agenda again as from what I heard the > netfilter folks have plans to use it. I wish we had resolved that in the long thread we had about it a few months ago. We should really reestablish the point we had reached, and try to make some more forward progress. This problem is not going to go away, as you say. > But when hit <= tot we have at least two cases: > 1) Traffic is not well behaved. DoS. > 2) Traffic is well behaved but tuning is bad. #2 should be prevented by the fact that GC prefers to toss out entries in an LRU'ish manner. 
I think I see what you're saying though, and in my opinion there is way too much black magic in choosing gc_thresh etc. routing cache parameters. Other caches in the kernel grow to fill the needs of the system. The routing cache grows only as large as it has been configured to be allowed to grow, and afterwards it no longer responds to increasing system needs. The IPV4 routing front end has exactly this kind of logic. As you load more and more routing table entries into the kernel it grows the hashes in response. > So it crucial to distinguish between the cases. Can it be done from > incoming traffic? This is the real question, and the other is how to implement dynamic hash table growth in the routing cache. The locking is particularly problematic, but RCU may be able to help us with this. Monitoring traffic to make this decision is hard because to determine whether cache misses are due to not well formed traffic vs. too small cache sizing we'd need the entry we garbage collected to determine the latter. It's the chicken and egg problem. From ja@ssi.bg Wed Dec 10 15:14:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 15:15:02 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBANEeTa002946 for ; Wed, 10 Dec 2003 15:14:44 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hBANF6CL002015; Thu, 11 Dec 2003 01:15:22 +0200 Date: Thu, 11 Dec 2003 01:15:06 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. 
Miller" cc: Nivedita Singhvi , , , , , Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) In-Reply-To: <20031210145149.2bd89e9b.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1985 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Content-Length: 1223 Lines: 34 Hello, On Wed, 10 Dec 2003, David S. Miller wrote: > Here is my take on this, as far as Linux is concerned. > > I agree with the three behaviors proposed by Julian. > However I have some slight trouble with the ignore-TOS-for- > PMTU idea, implementation wise. > > Walking the routing hash table for each possible TOS value > is going to be computationally expensive, and is inviting > computational complexity DDoS attacks by bombing the machine > with PMTU ICMP messages. What about not using TOS as hash key, then we will see all entries for same SADDR->DADDR but with different TOS values in same table row. I hope it will not hurt the jenkins hash too much but it is evident that we put all these entries with different TOS and OIF on same table row. It seems, there are no many users of OIF!=0 but if TOS is used as routing key we can see up to 8 entries with different TOS for same SADDR,DADDR. Of course, it looks difficult to walk 8 rows just to check all TOS variants, the common case is to see only one TOS value used. That is why I propose to eliminate the TOS as hash key and to walk one row. At first look, the risk of DoS is same, thanks to the random value. 
Regards -- Julian Anastasov From davem@pizda.ninka.net Wed Dec 10 15:16:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 15:16:58 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBANGbTa003342 for ; Wed, 10 Dec 2003 15:16:38 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id PAA18823; Wed, 10 Dec 2003 15:15:09 -0800 Date: Wed, 10 Dec 2003 15:15:09 -0800 From: "David S. Miller" To: Julian Anastasov Cc: herbert@gondor.apana.org.au, netdev@oss.sgi.com Subject: Re: [ROUTE] PMTU only works on half the time Message-Id: <20031210151509.6b48e531.davem@redhat.com> In-Reply-To: References: <20031205124309.1490a2f4.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1986 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 2060 Lines: 54 On Sat, 6 Dec 2003 09:52:01 +0200 (EET) Julian Anastasov wrote: > You are right about the case with redirects but for PMTUD > I'm still not sure. Here is one example: host A (we) thinks > that host B is onlink (rt_dst == rt_gateway). But host B runs > DNAT, tunneling or cluster software and forwards the packet > through other hosts which generate frag_needed. Such example > is IPVS TUN mode, I now see that it is not working because IPVS > on host B does not handle correctly the ICMP error and it can > not reach host A - a new difficult thing to fix. Entities that purport to provide IP services onlink, yet really do not, need to preserve all behaviors such that they appear to be onlink as far as other hosts can see. This means no PMTU for onlink destinations.
> UDP+RTO_ONLINK+PMTUD is still valid if we want to support > the above example setup. But I'm not sure someone will use > such combination of parameters. Should I remove the RTO_ONLINK > case from PMTUD? That is what I think is best. > Assume we have direct (rt_dst==rt_gateway) route with > SCOPE_HOST and shared_media is ON (if shared media is OFF > ip_fib_check_default already avoids SCOPE_HOST routes). I do > not see whether the standards cover such case, our target sends redirect > message but we are sure we hit the target IP directly. Is it supposed we > to change rt_gateway if rt_gateway==rt_dst ? I now included > such check in ip_rt_redirect but may be have to remove it. > IOW, the question is whether the ICMP redirects modify only > routes via gateway when shared_media is ON? Interesting, I've never explored this area. I honestly don't know, and I will study this issue some more. > OK, here is a new version, may be before the final one. > Changes from previous version: > > - removed the RTO_ONLINK case from ip_rt_redirect OK. > - removed the useless 'saddr && daddr' check which was added in > previous changes Ok. > - added temporarily 'rth->rt_dst == rth->rt_gateway' in > ip_rt_redirect. Is it needed? I'll get back to you on this :) From davem@pizda.ninka.net Wed Dec 10 15:22:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 15:22:26 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBANMBTa003730 for ; Wed, 10 Dec 2003 15:22:13 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id PAA18855; Wed, 10 Dec 2003 15:20:39 -0800 Date: Wed, 10 Dec 2003 15:20:39 -0800 From: "David S. 
Miller" To: Julian Anastasov Cc: niv@us.ibm.com, ak@suse.de, ruddk@us.ibm.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) Message-Id: <20031210152039.055e887a.davem@redhat.com> In-Reply-To: References: <20031210145149.2bd89e9b.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1987 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 1056 Lines: 25 On Thu, 11 Dec 2003 01:15:06 +0200 (EET) Julian Anastasov wrote: > It seems, there are no many users of OIF!=0 but if TOS is > used as routing key we can see up to 8 entries with different > TOS for same SADDR,DADDR. Of course, it looks difficult to > walk 8 rows just to check all TOS variants, the common case > is to see only one TOS value used. That is why I propose > to eliminate the TOS as hash key and to walk one row. At first > look, the risk of DoS is same, thanks to the random value. This is not my understanding at all. Consider the case where we generate routing cache entries for all 8 TOS values. Currently we'll likely get an O(1) lookup for any one of those entries. Your proposal guarantees that all such entries will land on the same hash chain, since TOS is no longer an input to the hash. Therefore the lookup in my example case will be O(8). And instead of just eating the complexity at ICMP PMTU handling time, we eat the complexity at every routing cache lookup. I really don't think we can consider this.
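To make the chain-placement argument above concrete, here is a minimal sketch in C. It assumes a toy hash (toy_mix and toy_bucket, which simply XOR the TOS bits into a mixed saddr/daddr value); this is not the kernel's routing cache hash, and every name, address and table size in it is hypothetical. With TOS in the key, the 8 TOS variants of one saddr->daddr path spread over 8 chains; with TOS dropped from the key, all 8 land on a single chain.

```c
#include <assert.h>
#include <stdint.h>

/* Toy address mixer; an assumption for illustration, NOT the kernel hash. */
uint32_t toy_mix(uint32_t saddr, uint32_t daddr)
{
    uint32_t h = saddr * 0x9e3779b1u;

    h ^= daddr * 0x85ebca6bu;
    h ^= h >> 15;
    return h;
}

/* Bucket index for one routing cache key; pass tos = 0 to ignore TOS. */
uint32_t toy_bucket(uint32_t saddr, uint32_t daddr, uint8_t tos, uint32_t mask)
{
    return (toy_mix(saddr, daddr) ^ tos) & mask;
}

/*
 * Count how many distinct chains the 8 TOS variants of one
 * saddr->daddr path occupy, with or without TOS in the hash key.
 */
int distinct_buckets(uint32_t saddr, uint32_t daddr, uint32_t mask, int use_tos)
{
    uint32_t seen[8];
    int n = 0;

    /* mask must be (power of two) - 1 and cover the TOS bits used below */
    assert(((mask + 1) & mask) == 0);
    assert(mask >= 0x1f);

    for (int i = 0; i < 8; i++) {
        uint8_t tos = (uint8_t)(i << 2);   /* 8 values in bits 2..4 */
        uint32_t b = toy_bucket(saddr, daddr, use_tos ? tos : 0, mask);
        int dup = 0;

        for (int j = 0; j < n; j++)
            if (seen[j] == b)
                dup = 1;
        if (!dup)
            seen[n++] = b;
    }
    return n;
}
```

For a 1024-chain table (mask 1023), distinct_buckets(saddr, daddr, 1023, 1) returns 8 and distinct_buckets(saddr, daddr, 1023, 0) returns 1 for any address pair, because the toy hash XORs distinct TOS values into bits that the mask keeps. A real hash would not guarantee a perfect spread in the first case.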
From chester.f.johnson@intel.com Wed Dec 10 15:37:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 15:37:27 -0800 (PST) Received: from hermes.fm.intel.com (fmr01.intel.com [192.55.52.18]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBANb0Ta004330 for ; Wed, 10 Dec 2003 15:37:01 -0800 Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37]) by hermes.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-outer.mc,v 1.11 2003/12/03 18:29:28 root Exp $) with ESMTP id hBANXYOP001078 for ; Wed, 10 Dec 2003 23:33:34 GMT Received: from fmsmsxv041-1.fm.intel.com (fmsmsxvs041.fm.intel.com [132.233.42.126]) by petasus.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.6 2003/12/04 00:48:44 root Exp $) with SMTP id hBANbaWY014608 for ; Wed, 10 Dec 2003 23:37:43 GMT Received: from fmsmsx331.amr.corp.intel.com ([132.233.42.135]) by fmsmsxv041-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003121015365009381 ; Wed, 10 Dec 2003 15:36:50 -0800 Received: from fmsmsx410.amr.corp.intel.com ([132.233.42.51]) by fmsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 10 Dec 2003 15:36:50 -0800 X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: PMTU issues due to TOS field manipulation (for DSCP) Date: Wed, 10 Dec 2003 15:36:49 -0800 Message-ID: <7E713DB94F47914DB5AAB80DE8EEA87501539908@fmsmsx410.fm.intel.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: PMTU issues due to TOS field manipulation (for DSCP) Thread-Index: AcO/dHCKUdioo+l2SBSIcOpmH+vVMAAAC83Q From: "Johnson, Chester F" To: "David S. Miller" , "Julian Anastasov" Cc: , , , , X-OriginalArrivalTime: 10 Dec 2003 23:36:50.0555 (UTC) FILETIME=[7C18C4B0:01C3BF76] X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id hBANb0Ta004330 X-archive-position: 1988 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chester.f.johnson@intel.com Precedence: bulk X-list: netdev Content-Length: 1945 Lines: 50 I'm trying to think through one potential issue with ignoring the TOS field. Would/should a route redirect have to match the same src-dst-tos hash? 1. Assume the network uses DSCP to make routing decisions. 2. Assume the first hop routers are configured as passive and do not advertise routes to the supported subnets and presumes static routes configured on end hosts. 3. The Linux host has multiple network interfaces for route diversity. The preferred route may send the frame to the wrong router and then ignore the icmp route redirect. Or would it? chet -----Original Message----- From: David S. Miller [mailto:davem@redhat.com] Sent: Wednesday, December 10, 2003 3:21 PM To: Julian Anastasov Cc: niv@us.ibm.com; ak@suse.de; ruddk@us.ibm.com; kuznet@ms2.inr.ac.ru; netdev@oss.sgi.com; Johnson, Chester F Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) On Thu, 11 Dec 2003 01:15:06 +0200 (EET) Julian Anastasov wrote: > It seems, there are no many users of OIF!=0 but if TOS is used as > routing key we can see up to 8 entries with different TOS for same > SADDR,DADDR. Of course, it looks difficult to walk 8 rows just to > check all TOS variants, the common case is to see only one TOS value > used. That is why I propose to eliminate the TOS as hash key and to > walk one row. At first look, the risk of DoS is same, thanks to the > random value. This is not my understanding at all. Consider the case where we generate routing cache entries for all 8 TOS values. Currently we'll likely get a O(1) lookup for any one of those entries. 
Your proposal guarantees that all such entries will land on the same hash chain, since TOS is not an input for the hash any longer. Therefore the lookup in my example case will be O(8). And instead of just eating the complexity at ICMP PMTU handling time, we eat the complexity at every routing cache lookup. I really don't think we can consider this. From shemminger@osdl.org Wed Dec 10 15:58:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 15:59:03 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBANwoTa004898 for ; Wed, 10 Dec 2003 15:58:50 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hBANwaZ19696; Wed, 10 Dec 2003 15:58:37 -0800 Date: Wed, 10 Dec 2003 15:58:36 -0800 From: Stephen Hemminger To: Maxim Krasnyansky , Rusty Russell , "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] fix module netproto alias for bluetooth Message-Id: <20031210155836.1c12d56a.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1989 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1013 Lines: 20 In 2.6.0-test11 the net-pf-31 alias gets associated with the bnep module rather than the bluetooth module that actually handles the protocol. It mostly works because bnep depends on bluetooth, so bluetooth gets loaded anyway.
diff -Nru a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c --- a/net/bluetooth/af_bluetooth.c Wed Dec 10 15:55:50 2003 +++ b/net/bluetooth/af_bluetooth.c Wed Dec 10 15:55:50 2003 @@ -361,3 +361,4 @@ MODULE_AUTHOR("Maxim Krasnyansky "); MODULE_DESCRIPTION("Bluetooth Core ver " VERSION); MODULE_LICENSE("GPL"); +MODULE_ALIAS_NETPROTO(PF_BLUETOOTH); diff -Nru a/net/bluetooth/bnep/core.c b/net/bluetooth/bnep/core.c --- a/net/bluetooth/bnep/core.c Wed Dec 10 15:55:50 2003 +++ b/net/bluetooth/bnep/core.c Wed Dec 10 15:55:50 2003 @@ -707,4 +707,3 @@ MODULE_DESCRIPTION("Bluetooth BNEP ver " VERSION); MODULE_AUTHOR("David Libault , Maxim Krasnyanskiy "); MODULE_LICENSE("GPL"); -MODULE_ALIAS_NETPROTO(PF_BLUETOOTH); From shemminger@osdl.org Wed Dec 10 16:00:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 16:01:07 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBB00sTa005340 for ; Wed, 10 Dec 2003 16:00:54 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hBB00cZ20161; Wed, 10 Dec 2003 16:00:38 -0800 Date: Wed, 10 Dec 2003 16:00:38 -0800 From: Stephen Hemminger To: Maxim Krasnyansky , "David S. 
Miller" , Rusty Russell Cc: netdev@oss.sgi.com Subject: [PATCH] bluetooth protocol module aliases Message-Id: <20031210160038.53f44b29.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1990 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1930 Lines: 58 Simple extension of rusty's network protocol aliases to handle the bluetooth protocols as well. This can hold off till after 2.6.0 diff -Nru a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h --- a/include/net/bluetooth/bluetooth.h Wed Dec 10 15:55:23 2003 +++ b/include/net/bluetooth/bluetooth.h Wed Dec 10 15:55:23 2003 @@ -179,4 +179,7 @@ int bt_err(__u16 code); +#define MODULE_ALIAS_BTPROTO(proto) \ + MODULE_ALIAS("bt-proto-" __stringify(proto)) + #endif /* __BLUETOOTH_H */ diff -Nru a/net/bluetooth/bnep/sock.c b/net/bluetooth/bnep/sock.c --- a/net/bluetooth/bnep/sock.c Wed Dec 10 15:55:23 2003 +++ b/net/bluetooth/bnep/sock.c Wed Dec 10 15:55:23 2003 @@ -192,6 +192,7 @@ .owner = THIS_MODULE, .create = bnep_sock_create }; +MODULE_ALIAS_BTPROTO(BTPROTO_BNEP); int __init bnep_sock_init(void) { diff -Nru a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c --- a/net/bluetooth/hci_sock.c Wed Dec 10 15:55:23 2003 +++ b/net/bluetooth/hci_sock.c Wed Dec 10 15:55:23 2003 @@ -639,6 +639,7 @@ .owner = THIS_MODULE, .create = hci_sock_create, }; +MODULE_ALIAS_BTPROTO(BTPROTO_HCI); struct notifier_block hci_sock_nblock = { .notifier_call = hci_sock_dev_event diff -Nru a/net/bluetooth/l2cap.c b/net/bluetooth/l2cap.c --- 
a/net/bluetooth/l2cap.c Wed Dec 10 15:55:23 2003 +++ b/net/bluetooth/l2cap.c Wed Dec 10 15:55:23 2003 @@ -2141,6 +2141,7 @@ .create = l2cap_sock_create, .owner = THIS_MODULE, }; +MODULE_ALIAS_BTPROTO(BTPROTO_L2CAP); static struct hci_proto l2cap_hci_proto = { .name = "L2CAP", diff -Nru a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c --- a/net/bluetooth/rfcomm/sock.c Wed Dec 10 15:55:23 2003 +++ b/net/bluetooth/rfcomm/sock.c Wed Dec 10 15:55:23 2003 @@ -888,6 +888,7 @@ .owner = THIS_MODULE, .create = rfcomm_sock_create }; +MODULE_ALIAS_BTPROTO(BTPROTO_RFCOMM); int __init rfcomm_init_sockets(void) { From davem@pizda.ninka.net Wed Dec 10 16:00:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 16:01:11 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBB00wTa005351 for ; Wed, 10 Dec 2003 16:00:58 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id PAA18950; Wed, 10 Dec 2003 15:59:41 -0800 Date: Wed, 10 Dec 2003 15:59:41 -0800 From: "David S. 
Miller" To: Stephen Hemminger Cc: maxk@qualcomm.com, rusty@rustcorp.com.au, netdev@oss.sgi.com Subject: Re: [PATCH] fix module netproto alias for bluetooth Message-Id: <20031210155941.300b68ba.davem@redhat.com> In-Reply-To: <20031210155836.1c12d56a.shemminger@osdl.org> References: <20031210155836.1c12d56a.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1991 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 340 Lines: 10 On Wed, 10 Dec 2003 15:58:36 -0800 Stephen Hemminger wrote: > In 2.6.0-test11 the net-pf-31 alias gets associated with the bnep module > rather than the bluetooth module that actually handles the protocol. > It is mostly works because bnep depends on bluetooth so it gets loaded. Good catch, patch applied. Thanks. From davem@pizda.ninka.net Wed Dec 10 16:03:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 16:03:13 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBB030Ta006098 for ; Wed, 10 Dec 2003 16:03:00 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA18981; Wed, 10 Dec 2003 16:01:43 -0800 Date: Wed, 10 Dec 2003 16:01:43 -0800 From: "David S. 
Miller" To: Stephen Hemminger Cc: maxk@qualcomm.com, rusty@rustcorp.com.au, netdev@oss.sgi.com Subject: Re: [PATCH] bluetooth protocol module aliases Message-Id: <20031210160143.62213d8c.davem@redhat.com> In-Reply-To: <20031210160038.53f44b29.shemminger@osdl.org> References: <20031210160038.53f44b29.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1992 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 248 Lines: 7 On Wed, 10 Dec 2003 16:00:38 -0800 Stephen Hemminger wrote: > Simple extension of rusty's network protocol aliases to handle the bluetooth > protocols as well. This can hold off till after 2.6.0 Yeah, resend this for 2.6.1 From ja@ssi.bg Wed Dec 10 16:06:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 16:06:28 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBB05mTa006865 for ; Wed, 10 Dec 2003 16:05:54 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hBB069CL002255; Thu, 11 Dec 2003 02:06:29 +0200 Date: Thu, 11 Dec 2003 02:06:09 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: niv@us.ibm.com, , , , , Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) In-Reply-To: <20031210152039.055e887a.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1993 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Content-Length: 1872 Lines: 49 Hello, On Wed, 10 Dec 2003, David S. 
Miller wrote: > > is to see only one TOS value used. That is why I propose > > to eliminate the TOS as hash key and to walk one row. At first > > look, the risk of DoS is same, thanks to the random value. > > This is not my understanding at all. > > Consider the case where we generate routing cache entries for > all 8 TOS values. Currently we'll likely get a O(1) lookup > for any one of those entries. IMO, the main question is whether all chains are with ~equal length, this depends on how good is the hash function. And we hope it is good. Because these entries are always linked somewhere and if they do not slowdown one chain they surely slowdown other chains. The total walk time is always same if all entries are searched. This is more and more true if the average chain length is longer than the number of TOS values used. Of course, if the table is empty and the average chain length is 1-4 using TOS as hash key will be faster for 8 TOS values. But the common average chain length for full table is 16 and the different TOS values are usually 1 or 2 per path. We do not talk about intentional DoS. > Your proposal guarentees that all such entries will land to the > same hash chain, since TOS is not an input for the hash any longer. > Therefore the lookup in my example case will be O(8). Yes, it will look very slow on empty table and when we do not ignore the TOS values. > And instead of just eating the complexity at ICMP PMTU handling > time, we eat the complexity at every routing cache lookup. For solving the PMTUD problems, I still do not know how we can avoid walking 8 chains if the TOS is a hash key. This is a real DoS because 8 is too longer compared with the common case of walking 1 chain with 1-2 TOS values. BTW, there are 1-2 entries for OIF too. 
Regards -- Julian Anastasov From davem@pizda.ninka.net Wed Dec 10 16:11:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 16:11:26 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBB0BATa007284 for ; Wed, 10 Dec 2003 16:11:10 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA19008; Wed, 10 Dec 2003 16:09:46 -0800 Date: Wed, 10 Dec 2003 16:09:46 -0800 From: "David S. Miller" To: Julian Anastasov Cc: niv@us.ibm.com, ak@suse.de, ruddk@us.ibm.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) Message-Id: <20031210160946.4110c611.davem@redhat.com> In-Reply-To: References: <20031210152039.055e887a.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1994 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 575 Lines: 14 On Thu, 11 Dec 2003 02:06:09 +0200 (EET) Julian Anastasov wrote: > But the common average chain length for full table is 16 and the > different TOS values are usually 1 or 2 per path. We do not talk > about intentional DoS. Your system has not configured its rthash table size properly; see the recent discussions Robert has been having here. You should have rthash sized somewhere near the number of entries a full system will have. But regardless, let us say that your system has complexity O(16) lookups as you mention; your proposal changes this to O(16+8).
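The sizing advice above can be sketched as a small calculation. This is only an illustration: rthash_buckets_for is a hypothetical helper, not a kernel function, and the entry counts used with it are made-up examples. It picks the smallest power-of-two bucket count that keeps the average chain at or below a target length, on the assumption that the table is indexed with a power-of-two mask.

```c
#include <assert.h>

/*
 * Hypothetical helper: smallest power-of-two number of hash buckets
 * such that `entries` cached routes give an average chain length of
 * at most `target_len`.
 */
unsigned int rthash_buckets_for(unsigned int entries, unsigned int target_len)
{
    unsigned int need;
    unsigned int buckets = 1;

    assert(target_len != 0);
    assert(entries <= (1u << 30));   /* keep the shift below from overflowing */

    /* ceil(entries / target_len) without risking overflow */
    need = entries / target_len + (entries % target_len != 0);

    while (buckets < need)
        buckets <<= 1;
    return buckets;
}
```

With 16384 entries, a target average chain length of 1 asks for 16384 buckets, while tolerating chains of 16 asks for only 1024 buckets.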
From ja@ssi.bg Wed Dec 10 16:16:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 16:16:47 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBB0GQTa007866 for ; Wed, 10 Dec 2003 16:16:31 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hBB0HJCL002327; Thu, 11 Dec 2003 02:17:19 +0200 Date: Thu, 11 Dec 2003 02:17:19 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "Johnson, Chester F" cc: "David S. Miller" , , , , , Subject: RE: PMTU issues due to TOS field manipulation (for DSCP) In-Reply-To: <7E713DB94F47914DB5AAB80DE8EEA87501539908@fmsmsx410.fm.intel.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1995 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Content-Length: 1288 Lines: 36 Hello, On Wed, 10 Dec 2003, Johnson, Chester F wrote: > I'm trying to think through one potential issue with ignoring the TOS > field. Would/should a route redirect have to match the same src-dst-tos > hash? It will surely match as before, as long as you do not change the sysctl var to ignore the TOS. > 1. Assume the network uses DSCP to make routing decisions. > 2. Assume the first hop routers are configured as passive and do not > advertise routes to the supported subnets and presumes static routes > configured on end hosts. > 3. The Linux host has multiple network interfaces for route diversity. > > The preferred route may send the frame to the wrong router and then > ignore the icmp route redirect. Or would it? That is why I prefer 3 states, we will be able to route by TOS (it will not be ignored for routing and cache) but still to ignore TOS for PMTUD. 
I think, we need one proc entry that will alter 2 int variables: whether TOS is ignored for PMTU and another for the routing TOS mask. As we see, if we ignore TOS and OIF for PMTU, all possible paths (including different TOS and OIF values) between SADDR and DADDR will have same PMTU and we can not solve this problem, we have to update PMTU for all these entries. Regards -- Julian Anastasov From ja@ssi.bg Wed Dec 10 16:34:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Dec 2003 16:34:29 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBB0Y3Ta011729 for ; Wed, 10 Dec 2003 16:34:06 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hBB0YpCL002383; Thu, 11 Dec 2003 02:34:52 +0200 Date: Thu, 11 Dec 2003 02:34:51 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: niv@us.ibm.com, , , , , Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) In-Reply-To: <20031210160946.4110c611.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1996 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Content-Length: 1615 Lines: 55 Hello, On Wed, 10 Dec 2003, David S. Miller wrote: > On Thu, 11 Dec 2003 02:06:09 +0200 (EET) > Julian Anastasov wrote: > > > But the common average chain length for full table is 16 and the > > different TOS values are usually 1 or 2 per path. We do not talk > > about intentional DoS. > > Your system has not configured it's rthash table size properly, see > recent discussions Robert has been making here. You should have > rthash sized somewhere near the number of entries a full system > will have. 
> > But regardless, let us say that your system has complexity O(16) > lookups as you mention, your proposal changes this to O(16+8). It is ~16 :) ip_rt_max_size = (rt_hash_mask + 1) * 16; This is what happens on full table, of course. OK, some simple numbers for an ideal table: - full table with 1024 chains, 16384 (max_size) entries equally distributed in these 1024 chains, 16 per chain. - there are 2048 paths with same saddr->daddr, each has 8 TOS values 2 cases depending on whether TOS is a hash key (path=saddr->daddr): 1. TOS is a hash key: - in each chain we have 16 paths, 1 TOS value per path - all 8 TOS values for a path are in 8 different chains 2. TOS is not a hash key: 2 paths per chain (2 paths x 8 TOS values => 16 entries) if all saddr->daddr->tos streams have same packet rate I think the CPU time to lookup them will be same. This is because 8 (number of TOS values) < 16 (chain length). And I hope the users always can tune the proposed TOS settings if they see DoS and if they do not need TOS as a rt key. 
Regards -- Julian Anastasov From andi@averellmail.firstfloor.org Thu Dec 11 03:13:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Dec 2003 03:13:58 -0800 (PST) Received: from zero.aec.at (Naramy_Oof@zero.aec.at [193.170.194.10]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBBBDgTa015508 for ; Thu, 11 Dec 2003 03:13:43 -0800 Received: from fred.muc.de (Conmore_Apel_Brune@localhost.localdomain [127.0.0.1]) by zero.aec.at (8.11.6/8.11.2) with ESMTP id hBB8aXS02735; Thu, 11 Dec 2003 09:36:34 +0100 Received: by fred.muc.de (Postfix on SuSE Linux 7.3 (i386), from userid 500) id B6B745B295; Thu, 11 Dec 2003 09:36:21 +0100 (CET) Date: Thu, 11 Dec 2003 09:36:21 +0100 From: Andi Kleen To: netdev@oss.sgi.com Cc: jgarzik@pobox.com Subject: [patch] variable number of dummy devices Message-ID: <20031211083621.GA2455@averell> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 1997 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@muc.de Precedence: bulk X-list: netdev Content-Length: 1923 Lines: 77 Uses the new module_params that nobody else uses now. diff -u linux-2.6.0test11-amd64/drivers/net/dummy.c-DUMMY linux-2.6.0test11-amd64/drivers/net/dummy.c --- linux-2.6.0test11-amd64/drivers/net/dummy.c-DUMMY 2003-11-29 16:34:26.000000000 +0100 +++ linux-2.6.0test11-amd64/drivers/net/dummy.c 2003-12-11 09:34:17.000000000 +0100 @@ -33,6 +33,9 @@ #include #include #include +#include + +static int numdummies = 1; static int dummy_xmit(struct sk_buff *skb, struct net_device *dev); static struct net_device_stats *dummy_get_stats(struct net_device *dev); @@ -83,10 +86,14 @@ return dev->priv; } -static struct net_device *dev_dummy; +static struct net_device **dummies; -static int __init dummy_init_module(void) +/* Number of dummy devices to be set up by this module. 
*/ +module_param(numdummies, int, 0); + +static int __init dummy_init_one(int index) { + struct net_device *dev_dummy; int err; dev_dummy = alloc_netdev(sizeof(struct net_device_stats), @@ -98,15 +105,40 @@ if ((err = register_netdev(dev_dummy))) { kfree(dev_dummy); dev_dummy = NULL; + } else { + dummies[index] = dev_dummy; } + return err; } +static void __exit dummy_free_one(int index) +{ + unregister_netdev(dummies[index]); + free_netdev(dummies[index]); +} + +static int __init dummy_init_module(void) +{ + int i, err = 0; + dummies = kmalloc(numdummies * sizeof(void *), GFP_KERNEL); + if (!dummies) + return -ENOMEM; + for (i = 0; i < numdummies && !err; i++) + err = dummy_init_one(i); + if (err) { + while (--i >= 0) + dummy_free_one(i); + } + return err; +} + static void __exit dummy_cleanup_module(void) { - unregister_netdev(dev_dummy); - free_netdev(dev_dummy); - dev_dummy = NULL; + int i; + for (i = 0; i < numdummies; i++) + dummy_free_one(i); + kfree(dummies); } module_init(dummy_init_module); From dlstevens@us.ibm.com Thu Dec 11 15:20:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Dec 2003 15:20:20 -0800 (PST) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBBNK7Ta022056 for ; Thu, 11 Dec 2003 15:20:07 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e31.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id hBBNJx19524384; Thu, 11 Dec 2003 18:19:59 -0500 Received: from d03nm121.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id hBBNJw4g133224; Thu, 11 Dec 2003 16:19:58 -0700 Importance: Normal Sensitivity: Subject: Re: Kernel 2.4.22 MLD problems To: Takashi Hibi , netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.4a July 24, 2000 Message-ID: From: David Stevens Date: Thu, 11 Dec 2003 15:19:55 -0800 X-MIMETrack: Serialize by Router on D03NM121/03/M/IBM(Release 
6.0.2CF2HF133 | November 14, 2003) at 12/11/2003 16:19:58 MIME-Version: 1.0 Content-type: multipart/alternative; Boundary="0__=07BBE76ADFECEDFC8f9e8a93df938690918c07BBE76ADFECEDFC" Content-Disposition: inline X-archive-position: 1998 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dlstevens@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 3018 Lines: 70 --0__=07BBE76ADFECEDFC8f9e8a93df938690918c07BBE76ADFECEDFC Content-type: text/plain; charset=US-ASCII Takashi, >>But there are still some problems. >>[MLDv2] >>1. After joining multicast address, MLDv2 listener report isn't issued >>immediately. > >This looks like a bug; it should send the first one immediately, >and QRV-1 more at random intervals between 0 and MRC. It's adding >the MRC random delay before sending the first one. Not critical, >but I'll put a patch together in the next couple of days. I thought this was a bug, from a quick glance at the code, but I just looked some more and it already does the fix I had in mind. I also ran some tests, and I do see the advertisement immediately after an interface state change. That was with 2.6.0-test11bk8, but 2.4.22 has the same code for this piece. In MLDv1 compatibility mode, interface state changes are not sent, but a "leave group" will result in an MLDv1 leave group message. So, I can't explain (or reproduce) this behaviour right now. Please do send me a packet trace with the MLD packets and describing what your application is doing in more detail if you still have a problem with this. +-DLS --0__=07BBE76ADFECEDFC8f9e8a93df938690918c07BBE76ADFECEDFC Content-type: text/html; charset=US-ASCII Content-Disposition: inline

--0__=07BBE76ADFECEDFC8f9e8a93df938690918c07BBE76ADFECEDFC-- From shemminger@osdl.org Thu Dec 11 15:33:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Dec 2003 15:33:58 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBBNXjTa023697 for ; Thu, 11 Dec 2003 15:33:46 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hBBNXYZ25528; Thu, 11 Dec 2003 15:33:34 -0800 Date: Thu, 11 Dec 2003 15:33:34 -0800 From: Stephen Hemminger To: "David S. Miller" , YOSHIFUJI Hideaki Cc: netdev@oss.sgi.com Subject: [BUG] can't unload network device's if IPV6 is loaded Message-Id: <20031211153334.0c59214b.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1999 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 502 Lines: 11 In 2.6.0-test11, IPV6 is not correctly cleaning up the network device reference's (ie missing dev_put). So if I do: rmmod e100 it hangs forever and complains about that not all references have been cleaned up. This happens even if no IPV6 addresses have been set up. Just having ipv6 available to be loaded at boot up. The vendor startup scripts (SuSe 9) may be setting something. 
-- Stephen Hemminger mailto:shemminger@osdl.org Open Source Development Lab http://developer.osdl.org/shemminger From davem@pizda.ninka.net Thu Dec 11 16:20:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Dec 2003 16:20:37 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBC0KNTa002246 for ; Thu, 11 Dec 2003 16:20:23 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA21524; Thu, 11 Dec 2003 16:18:56 -0800 Date: Thu, 11 Dec 2003 16:18:56 -0800 From: "David S. Miller" To: Stephen Hemminger Cc: yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: [BUG] can't unload network device's if IPV6 is loaded Message-Id: <20031211161856.6960c479.davem@redhat.com> In-Reply-To: <20031211153334.0c59214b.shemminger@osdl.org> References: <20031211153334.0c59214b.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2000 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 3019 Lines: 70 On Thu, 11 Dec 2003 15:33:34 -0800 Stephen Hemminger wrote: > In 2.6.0-test11, IPV6 is not correctly cleaning up the network > device reference's (ie missing dev_put). So if I do: > rmmod e100 > it hangs forever and complains about that not all references have > been cleaned up. > > This happens even if no IPV6 addresses have been set up. Just having > ipv6 available to be loaded at boot up. The vendor startup scripts > (SuSe 9) may be setting something. Ugh, I thought we had killed all of this kind of crap. We had so many issues in this area wrt. the autoconf and mcast but those all got fixed up. Yoshfuji is very busy at this time, probably until the new year. 
So we need to try and figure this one out ourselves. Even without configuring anything, IPV6 does things to all active devices when it is loaded. 1) It assigns link-local IPV6 addresses to interfaces it understands. 2) It joins the standard default multicast groups necessary for a device to take part in ipv6 operations on a network. So when the device is brought down all this stuff has to be undone. This is where the problems are likely to be, and where they in fact were in previous cases where we fixed this bug. The setup of each device occurs in net/ipv6/addrconf.c in addrconf_init(). It loops over all devices and configures them for ipv6 operation. For ethernet devices it would call addrconf_dev_config(). This also happens when NETDEV_UP events are sent out. It allocates the ipv6 device private data, gives it a link-local address (which is usually formed using a fixed constant prefix with the device hardware address added to the end). This is also where the default multicast and link local routes are added via addrconf_add_mroute() and addrconf_add_lroute() respectively. If all of this succeeds, the duplicate address detection process is begun via addrconf_dad_start(). addrconf_ifdown() is supposed to undo and clean all of this crap up when the interfaces goes down or is unregistered. One problem we had previously was due to the fact that if it was an unregister causing addrconf_ifdown() to run, the first thing this function will do is NULL out dev->ip6_ptr, which causes things like {,__}in6_dev_get() to return NULL. This causes problems for cleaning up the multicast references, in mcast.c:ipv6_mc_destroy_dev(). We had to change that multicast code to pass the 'idev' around directly instead of obtaining it via __in6_dev_get(). There are even comments in ipv6_mc_destroy_dev() which were added to explain this problem. The other area we had a refcount problem was wrt. the addrconf per-device timers used for implementation of the privacy extension. 
I forget the exact nature of the bug, but Herbert Xu coded up the patch to fix that stuff, you should be able to see it in the BK revision history of addrconf.c I hope this can help you search down the problem further. These areas (addrconf and mcast) should be the only sources of device references in your case. From shemminger@osdl.org Thu Dec 11 17:08:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Dec 2003 17:08:15 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBC182Ta004218 for ; Thu, 11 Dec 2003 17:08:02 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hBC17LZ09912; Thu, 11 Dec 2003 17:07:21 -0800 Date: Thu, 11 Dec 2003 17:07:20 -0800 From: Stephen Hemminger To: "Feldman, Scott" , "David S. Miller" , Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] e100 driver crashes with ethtool Message-Id: <20031211170720.2a5bb4ca.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2001 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 3838 Lines: 157 The e100 driver in 2.6.0-test11 crashes when ever a ethtool command would initiate a board reset. The problem is bogus __devinit attributes on functions that can be called from the ethtool ioctl code. 
diff -Nru a/drivers/net/e100/e100_config.c b/drivers/net/e100/e100_config.c --- a/drivers/net/e100/e100_config.c Thu Dec 11 17:03:50 2003 +++ b/drivers/net/e100/e100_config.c Thu Dec 11 17:03:50 2003 @@ -60,7 +60,7 @@ * All other init functions will only set values that are * different from the 82557 default. */ -void __devinit +void e100_config_init_82557(struct e100_private *bdp) { /* initialize config block */ @@ -104,7 +104,7 @@ e100_config_mulcast_enbl(bdp, false); } -static void __devinit +static void e100_config_init_82558(struct e100_private *bdp) { /* MWI enable. This should be turned on only if the adapter is a 82558/9 @@ -136,7 +136,7 @@ e100_config_long_rx(bdp, true); } -static void __devinit +static void e100_config_init_82550(struct e100_private *bdp) { /* The D102 chip allows for 32 config bytes. This value is @@ -160,7 +160,7 @@ } /* Initialize the adapter's configure block */ -void __devinit +void e100_config_init(struct e100_private *bdp) { e100_config_init_82557(bdp); diff -Nru a/drivers/net/e100/e100_main.c b/drivers/net/e100/e100_main.c --- a/drivers/net/e100/e100_main.c Thu Dec 11 17:03:50 2003 +++ b/drivers/net/e100/e100_main.c Thu Dec 11 17:03:50 2003 @@ -1290,7 +1290,7 @@ * true: if S/W was successfully initialized * false: otherwise */ -static unsigned char __devinit +static unsigned char e100_sw_init(struct e100_private *bdp) { bdp->next_cu_cmd = START_WAIT; // init the next cu state @@ -1318,7 +1318,7 @@ return 1; } -static void __devinit +static void e100_tco_workaround(struct e100_private *bdp) { int i; @@ -2508,7 +2508,7 @@ * true: if successfully cleared stat counters * false: otherwise */ -static unsigned char __devinit +static unsigned char e100_clr_cntrs(struct e100_private *bdp) { volatile u32 *pcmd_complete; @@ -2873,7 +2873,7 @@ /***************************************************************************/ /* Read PWA (printed wired assembly) number */ -void __devinit +static void __devinit e100_rd_pwa_no(struct e100_private 
*bdp) { bdp->pwa_no = e100_eeprom_read(bdp, EEPROM_PWA_NO); @@ -2882,7 +2882,7 @@ } /* Read the permanent ethernet address from the eprom. */ -void __devinit +static void __devinit e100_rd_eaddr(struct e100_private *bdp) { int i; diff -Nru a/drivers/net/e100/e100_phy.c b/drivers/net/e100/e100_phy.c --- a/drivers/net/e100/e100_phy.c Thu Dec 11 17:03:50 2003 +++ b/drivers/net/e100/e100_phy.c Thu Dec 11 17:03:50 2003 @@ -132,7 +132,7 @@ } } -static unsigned char __devinit +static unsigned char e100_phy_valid(struct e100_private *bdp, unsigned int phy_address) { u16 ctrl_reg, stat_reg; @@ -150,7 +150,7 @@ return true; } -static void __devinit +static void e100_phy_address_detect(struct e100_private *bdp) { unsigned int addr; @@ -180,7 +180,7 @@ } } -static void __devinit +static void e100_phy_id_detect(struct e100_private *bdp) { u16 low_id_reg, high_id_reg; @@ -204,7 +204,7 @@ ((unsigned int) high_id_reg << 16)); } -static void __devinit +static void e100_phy_isolate(struct e100_private *bdp) { unsigned int phy_address; @@ -227,7 +227,7 @@ } } -static unsigned char __devinit +static unsigned char e100_phy_specific_setup(struct e100_private *bdp) { u16 misc_reg; @@ -380,7 +380,7 @@ * Returns: * NOTHING */ -static void __devinit +static void e100_fix_polarity(struct e100_private *bdp) { u16 status; @@ -916,7 +916,7 @@ schedule_timeout(HZ / 2); } -unsigned char __devinit +unsigned char e100_phy_init(struct e100_private *bdp) { e100_phy_reset(bdp); From wis@udev.org Thu Dec 11 17:55:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Dec 2003 17:55:41 -0800 (PST) Received: from nemo (200.149-200-80.adsl.skynet.be [80.200.149.200]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBC1tPTa005189 for ; Thu, 11 Dec 2003 17:55:26 -0800 Received: by nemo (Postfix, from userid 1000) id E6BD1C7BBD; Fri, 12 Dec 2003 03:17:17 +0100 (CET) Date: Fri, 12 Dec 2003 03:17:17 +0100 From: Antony Lesuisse To: netdev@oss.sgi.com Subject: How to bypass the local routing table, to 
simulate a hub Message-ID: <20031212021717.GA5695@udev.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.4i X-archive-position: 2002 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: netdev-list@udev.org Precedence: bulk X-list: netdev Content-Length: 3580 Lines: 113 Imagine, on a linux box, having 3 different network interfaces on the same switch. It is for simulating a network, using virtual tun/tap interface: ----------------+-----+------------------------ Hub simulator|Linux| Userspaces apps | | ?--|-----|--- tap0 192.168.1.1 ? | | /dev/net/tun ?--|-----|--- tap1 192.168.1.2 ? | | ?--|-----|--- tap2 192.168.1.3 | | ----------------+-----+----------------------- How can you tell the linux networking stack that a packet for 192.168.1.2 has to be sent via the network interface tap0 or tap2 (according to the sin_addr) to the WIRE and be received by tap1? I would like that, for packets that comes from process, the kernel doesn't recognize the dest ip as local so that the packet will be sent over the interface even if there is an other interface configurated with the destination ip on the same kernel. For example i will configure a HTTP server on tap2 binding 192.168.1.3:80 and i will launch an HTTP client binding IP 192.168.1.1 trying to make TCP connection to 192.168.1.3:80 doing a GET. Everything on the same computer. My hub simulator will sometimes simulate packet loss. ping -I tap0 192.168.1.2 should go trought my simulator and not use lo. 
ping 192.168.1.2 should go trought my simulator (ie using 192.168.1.1) as source Acoording to my understanting of the linux network stack the problem is that there is no way to skip the lookup in the local routing table: You cannot change the 0: of ip rules ip rule: 0: from all lookup local 32766: from all lookup main 32767: from all lookup default ip route show table local: local 192.168.1.1 dev eth0 proto kernel scope host src 192.168.1.1 local 192.168.1.2 dev eth0 proto kernel scope host src 192.168.1.2 local 192.168.1.3 dev eth0 proto kernel scope host src 192.168.1.3 ... Somebody on irc told me to do something like this: iptables -A PREROUTING -t mangle -p tcp -s MYIP1 -j MARK --set-mark 10 iptables -A PREROUTING -t mangle -p tcp -s MYIP2 -j MARK --set-mark 20 ip rule add fwmark 10 table first ip rule add from MYIP1 to 0.0.0.0/0 lookup second ip route add default via eth2 table second But it doesn't change the fact that linux still lookup the local routing table. TIA Antony Lesuisse BONUS: Here is my small (but fully functionnal) network hub simulator in python: #!/usr/bin/python2.2 # hub with a probabilty of 0.9 import fcntl,math,os,random,select,sys,time # TUN TAP kernel interface def net_alloc(name=""): s = len(name) if s>15: raise "name too long" fd = os.open("/dev/net/tun",os.O_RDWR) print fd if fd == -1: raise "could not open /dev/net/tun" ioctl_int = 0x400454ca ioctl_ifreq = name+(16-s)*'\x00'+'\x02'+15*'\x00' name = fcntl.ioctl(fd,ioctl_int,ioctl_ifreq) name = name[:name.find('\x00')] return (fd,name) # Contruction of a hub hub = [] hubname = {} for i in range(2): fd,name = net_alloc() hub.append(fd) hubname[fd]={'fd':fd, 'name':name} os.system("ifconfig -a") os.system("ifconfig tap0 192.168.66.1; ifconfig tap1 192.168.66.2") # hub simulation while 1: (fdin,a,b) = select.select(hub,[],[],1.0) print "event:",fdin for i in fdin: frame = os.read(i,2048) print "%s:frame:%s"%(hubname[i]['name'],repr(frame)) for j in hub: if j != i: if random.random() < 0.9: 
os.write(j,frame) else: print "lost:%s:to:%s"%(hubname[i]['name'],hubname[j]['name']) -- Antony Lesuisse GPG EA2CCD66: 4B7F 6061 3DF5 F07A ACFF F127 6487 54F7 EA2C CD66 From davem@pizda.ninka.net Thu Dec 11 23:12:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Dec 2003 23:12:35 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBC7CMTa016760 for ; Thu, 11 Dec 2003 23:12:22 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id XAA22284; Thu, 11 Dec 2003 23:10:49 -0800 Date: Thu, 11 Dec 2003 23:10:49 -0800 From: "David S. Miller" To: Andi Kleen Cc: netdev@oss.sgi.com, jgarzik@pobox.com Subject: Re: [patch] variable number of dummy devices Message-Id: <20031211231049.7390173a.davem@redhat.com> In-Reply-To: <20031211083621.GA2455@averell> References: <20031211083621.GA2455@averell> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2003 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 200 Lines: 7 On Thu, 11 Dec 2003 09:36:21 +0100 Andi Kleen wrote: > Uses the new module_params that nobody else uses now. Jeff, I think Andi's patch here is fine, you should try to get it into 2.6.1 From davem@pizda.ninka.net Fri Dec 12 00:33:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Dec 2003 00:33:48 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBC8XSTa021344 for ; Fri, 12 Dec 2003 00:33:29 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id AAA22434; Fri, 12 Dec 2003 00:31:43 -0800 Date: Fri, 12 Dec 2003 00:31:43 -0800 From: "David S. 
Miller" To: Julian Anastasov Cc: niv@us.ibm.com, ak@suse.de, ruddk@us.ibm.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, chester.f.johnson@intel.com Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) Message-Id: <20031212003143.062598e9.davem@redhat.com> In-Reply-To: References: <20031210160946.4110c611.davem@redhat.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2004 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 2216 Lines: 67 On Thu, 11 Dec 2003 02:34:51 +0200 (EET) Julian Anastasov wrote: > On Wed, 10 Dec 2003, David S. Miller wrote: > > > But regardless, let us say that your system has complexity O(16) > > lookups as you mention, your proposal changes this to O(16+8). > > It is ~16 :) > > ip_rt_max_size = (rt_hash_mask + 1) * 16; > > This is what happens on full table, of course. OK, > some simple numbers for an ideal table: But look at default gc_thresh setting, which is when we trim rt cache entries: ipv4_dst_ops.gc_thresh = (rt_hash_mask + 1); The ip_rt_max_size value is meant to be a sort of buffer to absorb the situation where many rt cache entries are unreclaimable. But this is a seperate issue, and we can discuss your further points regardless. > 2 cases depending on whether TOS is a hash key (path=saddr->daddr): > > 1. TOS is a hash key: > > - in each chain we have 16 paths, 1 TOS value per path > - all 8 TOS values for a path are in 8 different chains > > 2. TOS is not a hash key: > > 2 paths per chain (2 paths x 8 TOS values => 16 entries) > > if all saddr->daddr->tos streams have same packet rate I think > the CPU time to lookup them will be same. > This is because 8 (number of TOS values) < 16 (chain length). 
> > And I hope the users always can tune the proposed TOS > settings if they see DoS and if they do not need TOS as a rt key. Ok. I agree with your analysis. Let's propose something concrete. 1) PMTU processing applies PMTU change to all TOS'd instances of a route. This behavior change is sysctl controllable, and on by default. The implementation is to just lookup all 8 possible TOS values. 2) Whether TOS is a routing cache hash key is controlled by another sysctl. When CONFIG_IP_ROUTE_TOS is set this sysctl defaults to on, other- wise it defaults to off. I think #2 should be very safe because fib node fn_tos values are only ever set when that config variable is enabled, and fib rule r_tos values are only compared on lookup when it is enabled as well. However, there could be a few more ifdefs added to the fib rule code to cover all the assignment cases too but let's not worry about that right now. Comments? From herbert@gondor.apana.org.au Fri Dec 12 03:59:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Dec 2003 04:00:04 -0800 (PST) Received: from arnor.me.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBCBxkTa001390 for ; Fri, 12 Dec 2003 03:59:50 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.me.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1AUlxK-0001oA-00; Fri, 12 Dec 2003 22:59:38 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1AUlxI-00060F-00; Fri, 12 Dec 2003 22:59:36 +1100 From: Herbert Xu To: shemminger@osdl.org (Stephen Hemminger), netdev@oss.sgi.com Subject: Re: [BUG] can't unload network device's if IPV6 is loaded Organization: Core In-Reply-To: <20031211153334.0c59214b.shemminger@osdl.org> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.2-20031002 ("Berneray") (UNIX) (Linux/2.4.23-1-686-smp (i686)) Message-Id: Date: Fri, 12 Dec 2003 22:59:36 +1100 X-archive-position: 2005 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 927 Lines: 24 Stephen Hemminger wrote: > In 2.6.0-test11, IPV6 is not correctly cleaning up the network device reference's > (ie missing dev_put). So if I do: > rmmod e100 > it hangs forever and complains about that not all references have been cleaned up. > > This happens even if no IPV6 addresses have been set up. Just having ipv6 available > to be loaded at boot up. The vendor startup scripts (SuSe 9) may be setting something. I can't reproduce this here. I've done modprobe dummy ifconfig dummy0 up modprobe ipv6 ifconfig dummy0 down rmmod dummy It does complain that there are 3 remaining references to dummy0, but they go away after a few seconds and the module unloads. -- Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ ) Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From jmorris@redhat.com Fri Dec 12 06:25:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Dec 2003 06:26:11 -0800 (PST) Received: from thoron.boston.redhat.com (nat-pool-bos.redhat.com [66.187.230.200]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBCEPsTa028487 for ; Fri, 12 Dec 2003 06:25:55 -0800 Received: from thoron.boston.redhat.com (localhost.localdomain [127.0.0.1]) by thoron.boston.redhat.com (8.12.8/8.12.8) with ESMTP id hBCEPmta002848; Fri, 12 Dec 2003 09:25:48 -0500 Received: from localhost (jmorris@localhost) by thoron.boston.redhat.com (8.12.8/8.12.8/Submit) with ESMTP id hBCEPkjw002844; Fri, 12 Dec 2003 09:25:46 -0500 X-Authentication-Warning: thoron.boston.redhat.com: jmorris owned process doing -bs Date: Fri, 12 Dec 2003 09:25:45 -0500 (EST) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: "David S. 
Miller" cc: kuznet@ms2.inr.ac.ru, , , Subject: Re: [RFC] SO_PEERSEC - security credentials for Unix stream sockets In-Reply-To: <20031210145652.66bda4c2.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2006 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev Content-Length: 624 Lines: 22 On Wed, 10 Dec 2003, David S. Miller wrote: > I'm fine with this conceptually, although the earliest I could put > this into the tree is 2.6.1 although I have a hunch that I'll be > asked to defer something like this to 2.6.2, but who knows. No problem, it will hopefully get more testing and feedback in the meantime. I'll resubmit it sometime after 2.6.0. > The one thing I don't like is the ifdef conditionalized member of > the sock struct. We should move away from config variables changing > structure layouts. Even a "void *sk_security;" would be better. Ok. 
- James -- James Morris From jgarzik@pobox.com Fri Dec 12 07:28:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Dec 2003 07:28:33 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBCFSATa030075 for ; Fri, 12 Dec 2003 07:28:11 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:45390 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1AUpD2-0005om-J8; Fri, 12 Dec 2003 15:28:04 +0000 Message-ID: <3FD9DE75.5080200@pobox.com> Date: Fri, 12 Dec 2003 10:27:49 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Shmulik Hen CC: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com, Jay Vosburgh Subject: Re: [PATCH SET][bonding 2.4] cleanup References: <200312041713.52021.shmulik.hen@intel.com> In-Reply-To: <200312041713.52021.shmulik.hen@intel.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2007 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 158 Lines: 10 Either you or Jay or somebody needs to email patches, if you want me to apply patches... "Go get this patchset from my website" doesn't scale... 
Jeff From Robert.Olsson@data.slu.se Fri Dec 12 15:10:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Dec 2003 15:10:48 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBCNAPTa027027 for ; Fri, 12 Dec 2003 15:10:26 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.9.3+/8.9.3) with ESMTP id AAA23141; Sat, 13 Dec 2003 00:10:09 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 846E7EC23B; Sat, 13 Dec 2003 00:10:16 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16346.19160.434671.377642@robur.slu.se> Date: Sat, 13 Dec 2003 00:10:16 +0100 To: "David S. Miller" Cc: Robert Olsson , kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: [PATCH] option for large routing hash In-Reply-To: <20031210150537.5fd0edb0.davem@redhat.com> References: <16343.12816.917811.588853@robur.slu.se> <20031210150537.5fd0edb0.davem@redhat.com> X-Mailer: VM 7.17 under Emacs 21.3.1 X-archive-position: 2009 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 2250 Lines: 51 David S. Miller writes: > I wish we had resolved that in the long thread we had about it > a few months ago. We should really reestablish the point we had > reached, and try to make some more forward progress. > > This problem is not going to go away, as you say. No I will not. I revisited the patches and they seems to some help but here must be some other problem too. Seems we have to start over. Well let me add the patch with the debugging from Sarma to start with. At least there is no problem to trigger the overflows. > I think I see what you're saying though, and in my opinion there > is way too much black magic in choosing gc_thresh etc. routing cache > parameters. 
> > Other caches in the kernel grow to fill the needs of the system. The > routing cache grows only as large as it has been configured to be > allowed to grow, and afterwards it no longer responds to increasing > system needs. > > The IPV4 routing front end has exactly this kind of logic. As you > load more and more routing table entries into the kernel it grows > the hashes in response. > > > So it crucial to distinguish between the cases. Can it be done from > > incoming traffic? > > This is the real question, and the other is how to implement dynamic > hash table growth in the routing cache. The locking is particularly > problematic, but RCU may be able to help us with this. > > Monitoring traffic to make this decision is hard because to determine > whether cache misses are due to not well formed traffic vs. too small > cache sizing we'd need the entry we garbage collected to determine the > latter. It's the chicken and egg problem. Yes. For desktop systems and systems without too many flows there is no problem. For high-flow servers and routers we should expect administrators to do some manual setup according to their environment. As I said when we started this thread, IMO increasing just the hash size is the most convenient in this case, as most routing cache parameters are derived from it, so the black magic is untouched but still doing its magic. We have used this for some time on our routers. Of course things can be done better, but RCU is probably the most annoying for now. Cheers. 
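To make the point about derived parameters concrete, here is a minimal sketch in plain C. It assumes, as far as I recall from the route.c initialisation of this era, that gc_thresh is set to the number of hash buckets and the maximum cache size to 16 times that; check your own tree before relying on those factors. The struct and function names are hypothetical and exist only to show the arithmetic, they are not kernel symbols.

```c
#include <stddef.h>

/* Hypothetical model of the routing cache sizing knobs.
 * The field names are illustrative, not kernel symbols. */
struct rt_cache_params {
	unsigned long buckets;   /* hash buckets, always a power of two */
	unsigned long gc_thresh; /* entry count at which GC starts working */
	unsigned long max_size;  /* hard limit on cached entries */
};

/* Round n down to the nearest power of two (n >= 1). */
static unsigned long round_down_pow2(unsigned long n)
{
	unsigned long p = 1;

	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Derive the dependent parameters from a requested bucket count.
 * ASSUMPTION: gc_thresh = buckets and max_size = 16 * buckets. */
struct rt_cache_params rt_cache_derive(unsigned long requested_buckets)
{
	struct rt_cache_params p;

	p.buckets = round_down_pow2(requested_buckets ? requested_buckets : 1);
	p.gc_thresh = p.buckets;
	p.max_size = p.buckets * 16;
	return p;
}
```

Under these assumptions a request for 100000 buckets rounds down to 65536, which puts the GC threshold at 65536 entries and the hard limit at 1048576, so one knob moves the rest with it.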
--ro From ja@ssi.bg Fri Dec 12 15:37:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Dec 2003 15:38:11 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBCNbhTa028121 for ; Fri, 12 Dec 2003 15:37:45 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hBCNc0v4008028; Sat, 13 Dec 2003 01:38:01 +0200 Date: Sat, 13 Dec 2003 01:38:00 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: niv@us.ibm.com, , , , , Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) In-Reply-To: <20031212003143.062598e9.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2010 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Content-Length: 9415 Lines: 350 Hello, On Fri, 12 Dec 2003, David S. Miller wrote: > 1) PMTU processing applies PMTU change to all TOS'd instances of > a route. This behavior change is sysctl controllable, and > on by default. > > The implementation is to just lookup all 8 possible TOS values. > > 2) Whether TOS is a routing cache hash key is controlled by another > sysctl. > > When CONFIG_IP_ROUTE_TOS is set this sysctl defaults to on, other- > wise it defaults to off. > > I think #2 should be very safe because fib node fn_tos values are only > ever set when that config variable is enabled, and fib rule r_tos values > are only compared on lookup when it is enabled as well. However, there > could be a few more ifdefs added to the fib rule code to cover all the > assignment cases too but let's not worry about that right now. OK, here is a change that compiles. No fib changes yet (it is the next step). I'm not sure what var names are better. > Comments? What about ip_rt_redirect? 
May be we need the same TOS matching behavior if TOS is changed somewhere after routing? diff -Nru a/include/linux/sysctl.h b/include/linux/sysctl.h --- a/include/linux/sysctl.h Sat Dec 13 01:16:15 2003 +++ b/include/linux/sysctl.h Sat Dec 13 01:16:15 2003 @@ -329,6 +329,8 @@ NET_IPV4_ROUTE_MIN_PMTU=16, NET_IPV4_ROUTE_MIN_ADVMSS=17, NET_IPV4_ROUTE_SECRET_INTERVAL=18, + NET_IPV4_ROUTE_IGNORE_TOS=19, + NET_IPV4_ROUTE_PMTU_MODE=20, }; enum diff -Nru a/net/ipv4/route.c b/net/ipv4/route.c --- a/net/ipv4/route.c Sat Dec 13 01:16:15 2003 +++ b/net/ipv4/route.c Sat Dec 13 01:16:15 2003 @@ -124,6 +124,14 @@ int ip_rt_min_advmss = 256; int ip_rt_secret_interval = 10 * 60 * HZ; static unsigned long rt_deadline; +#ifdef CONFIG_IP_ROUTE_TOS +static u8 iptos_rt_mask = IPTOS_RT_MASK; +int ip_rt_ignore_tos; +#else +static u8 iptos_rt_mask; +int ip_rt_ignore_tos = 1; +#endif +int ip_rt_pmtu_mode; /* 1=match by iph->tos, 0=ignore TOS for PMTU */ #define RTprint(a...) printk(KERN_DEBUG a) @@ -967,13 +975,12 @@ void ip_rt_redirect(u32 old_gw, u32 daddr, u32 new_gw, u32 saddr, u8 tos, struct net_device *dev) { - int i, k; + int i; struct in_device *in_dev = in_dev_get(dev); struct rtable *rth, **rthp; u32 skeys[2] = { saddr, 0 }; - int ikeys[2] = { dev->ifindex, 0 }; - tos &= IPTOS_RT_MASK; + tos &= iptos_rt_mask; if (!in_dev) return; @@ -993,10 +1000,7 @@ } for (i = 0; i < 2; i++) { - for (k = 0; k < 2; k++) { - unsigned hash = rt_hash_code(daddr, - skeys[i] ^ (ikeys[k] << 5), - tos); + unsigned hash = rt_hash_code(daddr, skeys[i], tos); rthp=&rt_hash_table[hash].chain; @@ -1008,7 +1012,9 @@ if (rth->fl.fl4_dst != daddr || rth->fl.fl4_src != skeys[i] || rth->fl.fl4_tos != tos || - rth->fl.oif != ikeys[k] || + (rth->fl.oif && + rth->fl.oif != dev->ifindex) || + rth->rt_dst == rth->rt_gateway || rth->fl.iif != 0) { rthp = &rth->u.rt_next; continue; @@ -1075,7 +1081,6 @@ rcu_read_unlock(); do_next: ; - } } in_dev_put(in_dev); return; @@ -1105,8 +1110,7 @@ } else if ((rt->rt_flags & 
RTCF_REDIRECTED) || rt->u.dst.expires) { unsigned hash = rt_hash_code(rt->fl.fl4_dst, - rt->fl.fl4_src ^ - (rt->fl.oif << 5), + rt->fl.fl4_src, rt->fl.fl4_tos); #if RT_CACHE_DEBUG >= 1 printk(KERN_DEBUG "ip_rt_advice: redirect to " @@ -1237,21 +1241,33 @@ return 68; } +/* See IPTOS_RT_MASK */ +static u8 all_tos_values[8] = { 0x00, 0x04, 0x08, 0x0C, 0x10, 0x14, 0x18, 0x1C }; + unsigned short ip_rt_frag_needed(struct iphdr *iph, unsigned short new_mtu) { - int i; + int i, j, ntos; unsigned short old_mtu = ntohs(iph->tot_len); struct rtable *rth; u32 skeys[2] = { iph->saddr, 0, }; u32 daddr = iph->daddr; - u8 tos = iph->tos & IPTOS_RT_MASK; + u8 *tos_values, tos = iph->tos & iptos_rt_mask; unsigned short est_mtu = 0; if (ipv4_config.no_pmtu_disc) return 0; + if (ip_rt_pmtu_mode || !iptos_rt_mask) { + tos_values = &tos; + ntos = 1; + } else { + tos_values = all_tos_values; + ntos = ARRAY_SIZE(all_tos_values); + } + + for (j = 0; j < ntos; j++) for (i = 0; i < 2; i++) { - unsigned hash = rt_hash_code(daddr, skeys[i], tos); + unsigned hash = rt_hash_code(daddr, skeys[i], tos_values[j]); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; @@ -1259,9 +1275,9 @@ smp_read_barrier_depends(); if (rth->fl.fl4_dst == daddr && rth->fl.fl4_src == skeys[i] && + rth->fl.fl4_tos == tos_values[j] && rth->rt_dst == daddr && rth->rt_src == iph->saddr && - rth->fl.fl4_tos == tos && rth->fl.iif == 0 && !(dst_metric_locked(&rth->u.dst, RTAX_MTU))) { unsigned short mtu = new_mtu; @@ -1503,7 +1519,7 @@ RT_CACHE_STAT_INC(in_slow_mc); in_dev_put(in_dev); - hash = rt_hash_code(daddr, saddr ^ (dev->ifindex << 5), tos); + hash = rt_hash_code(daddr, saddr, tos); return rt_intern_hash(hash, rth, (struct rtable**) &skb->dst); e_nobufs: @@ -1554,7 +1570,7 @@ if (!in_dev) goto out; - hash = rt_hash_code(daddr, saddr ^ (fl.iif << 5), tos); + hash = rt_hash_code(daddr, saddr, tos); /* Check for the most weird martians, which can be not detected by fib_lookup. 
@@ -1846,8 +1862,8 @@ unsigned hash; int iif = dev->ifindex; - tos &= IPTOS_RT_MASK; - hash = rt_hash_code(daddr, saddr ^ (iif << 5), tos); + tos &= iptos_rt_mask; + hash = rt_hash_code(daddr, saddr, tos); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) { @@ -1912,11 +1928,11 @@ int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) { - u32 tos = oldflp->fl4_tos & (IPTOS_RT_MASK | RTO_ONLINK); + u32 tos = oldflp->fl4_tos & (iptos_rt_mask | RTO_ONLINK); struct flowi fl = { .nl_u = { .ip4_u = { .daddr = oldflp->fl4_dst, .saddr = oldflp->fl4_src, - .tos = tos & IPTOS_RT_MASK, + .tos = tos & iptos_rt_mask, .scope = ((tos & RTO_ONLINK) ? RT_SCOPE_LINK : RT_SCOPE_UNIVERSE), @@ -2190,7 +2206,7 @@ rth->rt_flags = flags; - hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src ^ (oldflp->oif << 5), tos); + hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src, tos); err = rt_intern_hash(hash, rth, rp); done: if (free_res) @@ -2213,8 +2229,9 @@ { unsigned hash; struct rtable *rth; + u8 tos = flp->fl4_tos & (iptos_rt_mask | RTO_ONLINK); - hash = rt_hash_code(flp->fl4_dst, flp->fl4_src ^ (flp->oif << 5), flp->fl4_tos); + hash = rt_hash_code(flp->fl4_dst, flp->fl4_src, tos); rcu_read_lock(); for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) { @@ -2226,8 +2243,7 @@ #ifdef CONFIG_IP_ROUTE_FWMARK rth->fl.fl4_fwmark == flp->fl4_fwmark && #endif - !((rth->fl.fl4_tos ^ flp->fl4_tos) & - (IPTOS_RT_MASK | RTO_ONLINK))) { + rth->fl.fl4_tos == tos) { rth->u.dst.lastuse = jiffies; dst_hold(&rth->u.dst); rth->u.dst.__use++; @@ -2479,6 +2495,26 @@ } #ifdef CONFIG_SYSCTL + +static int ipv4_sysctl_doint_strategy(ctl_table *table, int *name, + int nlen, void *oldval, + size_t *oldlenp, void *newval, + size_t newlen, void **context) +{ + int val; + + if (!newval || !newlen) + return 0; + if (newlen != sizeof(int)) + return -EINVAL; + if (get_user(val, (int *)newval)) + return -EFAULT; + if (val == *(int *) table->data) + return 0; 
+ *(int *) table->data = val; + return 1; +} + static int flush_delay; static int ipv4_sysctl_rtcache_flush(ctl_table *ctl, int write, @@ -2508,6 +2544,53 @@ return 0; } +#ifdef CONFIG_IP_ROUTE_TOS + +static void do_ignore_tos(void) +{ + iptos_rt_mask = ip_rt_ignore_tos? 0 : IPTOS_RT_MASK; + rt_cache_flush(0); +} + +#endif + +static int ip_rt_ignore_tos_handler(ctl_table *ctl, int write, + struct file *filp, void *buffer, + size_t *lenp) +{ + if (write) { +#ifdef CONFIG_IP_ROUTE_TOS + int old = ip_rt_ignore_tos; + int ret = proc_dointvec(ctl, write, filp, buffer, lenp); + + if (ret) + return ret; + if (old != ip_rt_ignore_tos) do_ignore_tos(); + return 0; +#else + return -EINVAL; +#endif + } + return proc_dointvec(ctl, write, filp, buffer, lenp); +} +static int ipv4_sysctl_ignore_tos_strategy(ctl_table *table, int *name, + int nlen, void *oldval, + size_t *oldlenp, void *newval, + size_t newlen, void **context) +{ +#ifdef CONFIG_IP_ROUTE_TOS + int ret = ipv4_sysctl_doint_strategy(table, name, nlen, oldval, oldlenp, + newval, newlen, context); + + if (1 != ret) + return ret; + do_ignore_tos(); + return 0; +#else + return (newval || newlen) ? 
-EINVAL : 0; +#endif +} + ctl_table ipv4_route_table[] = { { .ctl_name = NET_IPV4_ROUTE_FLUSH, @@ -2660,6 +2743,23 @@ .mode = 0644, .proc_handler = &proc_dointvec_jiffies, .strategy = &sysctl_jiffies, + }, + { + .ctl_name = NET_IPV4_ROUTE_IGNORE_TOS, + .procname = "ignore_tos", + .data = &ip_rt_ignore_tos, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &ip_rt_ignore_tos_handler, + .strategy = &ipv4_sysctl_ignore_tos_strategy, + }, + { + .ctl_name = NET_IPV4_ROUTE_PMTU_MODE, + .procname = "pmtu_mode", + .data = &ip_rt_pmtu_mode, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &proc_dointvec, }, { .ctl_name = 0 } }; Regards -- Julian Anastasov From ja@ssi.bg Fri Dec 12 16:10:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Dec 2003 16:10:43 -0800 (PST) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBD0AITa029302 for ; Fri, 12 Dec 2003 16:10:23 -0800 Received: from localhost (localhost [127.0.0.1]) by u.domain.uli (8.12.10/8.12.10) with ESMTP id hBD0Aqv4016188; Sat, 13 Dec 2003 02:10:53 +0200 Date: Sat, 13 Dec 2003 02:10:52 +0200 (EET) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: niv@us.ibm.com, , , , , Subject: Re: PMTU issues due to TOS field manipulation (for DSCP) In-Reply-To: <20031212003143.062598e9.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2011 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Content-Length: 1793 Lines: 63 Hello, On Fri, 12 Dec 2003, David S. Miller wrote: > I think #2 should be very safe because fib node fn_tos values are only > ever set when that config variable is enabled, and fib rule r_tos values > are only compared on lookup when it is enabled as well. 
However, there > could be a few more ifdefs added to the fib rule code to cover all the > assignment cases too but let's not worry about that right now. It seems the FIB changes are small, only for the lookups. I didn't change the other places; it is better to allow deletion by tos because the flag can be changed at run time. diff -Nru a/net/ipv4/fib_hash.c b/net/ipv4/fib_hash.c --- a/net/ipv4/fib_hash.c Sat Dec 13 02:04:59 2003 +++ b/net/ipv4/fib_hash.c Sat Dec 13 02:04:59 2003 @@ -48,6 +48,8 @@ printk(KERN_DEBUG a) */ +extern int ip_rt_ignore_tos; + static kmem_cache_t * fn_hash_kmem; /* @@ -309,7 +311,7 @@ continue; } #ifdef CONFIG_IP_ROUTE_TOS - if (f->fn_tos && f->fn_tos != flp->fl4_tos) + if (f->fn_tos && f->fn_tos != flp->fl4_tos && !ip_rt_ignore_tos) continue; #endif f->fn_state |= FN_S_ACCESSED; diff -Nru a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c --- a/net/ipv4/fib_rules.c Sat Dec 13 02:04:59 2003 +++ b/net/ipv4/fib_rules.c Sat Dec 13 02:04:59 2003 @@ -49,6 +49,8 @@ #define FRprintk(a...) +extern int ip_rt_ignore_tos; + struct fib_rule { struct fib_rule *r_next; @@ -323,7 +325,7 @@ if (((saddr^r->r_src) & r->r_srcmask) || ((daddr^r->r_dst) & r->r_dstmask) || #ifdef CONFIG_IP_ROUTE_TOS - (r->r_tos && r->r_tos != flp->fl4_tos) || + (r->r_tos && r->r_tos != flp->fl4_tos && !ip_rt_ignore_tos) || #endif #ifdef CONFIG_IP_ROUTE_FWMARK (r->r_fwmark && r->r_fwmark != flp->fl4_fwmark) || Regards -- Julian Anastasov From chrisw@osdl.org Fri Dec 12 16:16:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Dec 2003 16:16:54 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBD0GfTa029704 for ; Fri, 12 Dec 2003 16:16:42 -0800 Received: (from chrisw@localhost) by mail.osdl.org (8.11.6/8.11.6) id hBD0GHV15641; Fri, 12 Dec 2003 16:16:17 -0800 Date: Fri, 12 Dec 2003 16:16:17 -0800 From: Chris Wright To: James Morris Cc: "David S. 
Miller" , kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, linux-security-module@wirex.com, Stephen Smalley Subject: Re: [RFC] SO_PEERSEC - security credentials for Unix stream sockets Message-ID: <20031212161617.C24246@osdlab.pdx.osdl.net> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from jmorris@redhat.com on Wed, Dec 10, 2003 at 11:33:53AM -0500 X-archive-position: 2012 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chrisw@osdl.org Precedence: bulk X-list: netdev Content-Length: 1221 Lines: 31 * James Morris (jmorris@redhat.com) wrote: > Below is a patch against 2.6.0-test11 which implements a new socket option > SO_PEERSEC (defined for i386 only at this stage). Thanks for doing this James. In your example demonstration, you simply print the peersec string. Do you expect to use with simple comparison against something like data from procattr, or something else? IOW, does this introduce any new namespace issues for apps? > +static inline int security_sk_alloc_security(struct sock *sk, int family, int priority) > +static inline void security_sk_free_security(struct sock *sk) minor nit. these names are inconsistent with the existing analogous ones. how about simply, security_sk_alloc and security_sk_free? > +++ linux-2.6.0-test11.w2/net/core/sock.c 2003-12-10 09:55:39.378901360 -0500 > @@ -564,6 +564,9 @@ > v.val = sk->sk_state == TCP_LISTEN; > break; > > + case SO_PEERSEC: > + return security_socket_getpeersec(sock, optval, len); > + Would it be useful to ask the module to update len as is done in some other cases. Perhaps buffer is too small, can len be vector for that info? 
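The two choices raised here (update len only on success, or also use it to report the needed size when the buffer is too small) can be modelled in a few lines of plain C. This is only a sketch of the value-result convention, not kernel code: the function name, the report_needed flag and the use of memcpy in place of copy_to_user are all assumptions made for illustration, and -ENOSPC is simply the error the updated patch later in this thread returns.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical model of a getsockopt-style handler using a value-result
 * length. On entry *len is the size of optval; on success it becomes the
 * number of bytes actually copied.
 *
 * ASSUMPTION: report_needed is an illustrative switch, not a real API.
 * When it is non-zero, a too-small buffer also gets the required size
 * written back through len; when it is zero, len is left untouched on
 * error so callers only ever interpret the error number. */
int peersec_copy(char *optval, unsigned *len,
		 const char *ctx, unsigned ctx_len, int report_needed)
{
	if (ctx_len > *len) {
		if (report_needed)
			*len = ctx_len;
		return -ENOSPC;
	}

	memcpy(optval, ctx, ctx_len); /* stands in for copy_to_user() */
	*len = ctx_len;
	return 0;
}
```

A caller of the report_needed variant could retry with a buffer of the reported size; a caller of the other variant has to guess or grow the buffer step by step.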
thanks, -chris -- Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net From jmorris@redhat.com Fri Dec 12 19:44:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Dec 2003 19:45:01 -0800 (PST) Received: from thoron.boston.redhat.com (nat-pool-bos.redhat.com [66.187.230.200]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBD3igTa021098 for ; Fri, 12 Dec 2003 19:44:43 -0800 Received: from thoron.boston.redhat.com (localhost.localdomain [127.0.0.1]) by thoron.boston.redhat.com (8.12.8/8.12.8) with ESMTP id hBD3iWta004863; Fri, 12 Dec 2003 22:44:32 -0500 Received: from localhost (jmorris@localhost) by thoron.boston.redhat.com (8.12.8/8.12.8/Submit) with ESMTP id hBD3iORv004859; Fri, 12 Dec 2003 22:44:24 -0500 X-Authentication-Warning: thoron.boston.redhat.com: jmorris owned process doing -bs Date: Fri, 12 Dec 2003 22:44:24 -0500 (EST) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: Chris Wright cc: "David S. Miller" , , , , Stephen Smalley Subject: Re: [RFC] SO_PEERSEC - security credentials for Unix stream sockets In-Reply-To: <20031212161617.C24246@osdlab.pdx.osdl.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2013 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev Content-Length: 11701 Lines: 379 On Fri, 12 Dec 2003, Chris Wright wrote: > * James Morris (jmorris@redhat.com) wrote: > > Below is a patch against 2.6.0-test11 which implements a new socket option > > SO_PEERSEC (defined for i386 only at this stage). > > Thanks for doing this James. In your example demonstration, you simply > print the peersec string. Do you expect to use with simple comparison > against something like data from procattr, or something else? IOW, > does this introduce any new namespace issues for apps? 
Semantics are application dependent, but with SELinux, the returned security context would be passed to a userspace SELinux API function. I'm not sure how this would be a namespace issue -- do you mean a data format issue? > > > +static inline int security_sk_alloc_security(struct sock *sk, int family, int priority) > > +static inline void security_sk_free_security(struct sock *sk) > > minor nit. these names are inconsistent with the existing analogous ones. > how about simply, security_sk_alloc and security_sk_free? Done. > > > +++ linux-2.6.0-test11.w2/net/core/sock.c 2003-12-10 09:55:39.378901360 -0500 > > @@ -564,6 +564,9 @@ > > v.val = sk->sk_state == TCP_LISTEN; > > break; > > > > + case SO_PEERSEC: > > + return security_socket_getpeersec(sock, optval, len); > > + > > Would it be useful to ask the module to update len as is done in some > other cases. Yep, allowing the security module to update the returned length is now implemented. > Perhaps buffer is too small, can len be vector for that info? I would not advise updating len on error -- it's a bad idea in general to interpret any returned data from failed syscalls except the error number. An updated patch is below, which implements the above, as well as previous feedback from David. This one includes the changes to SELinux so it's easier to see example usage. - James -- James Morris diff -urN -X dontdiff linux-2.6.0-test11.orig/include/asm-i386/socket.h linux-2.6.0-test11.w1/include/asm-i386/socket.h --- linux-2.6.0-test11.orig/include/asm-i386/socket.h 2003-09-27 20:50:09.000000000 -0400 +++ linux-2.6.0-test11.w1/include/asm-i386/socket.h 2003-12-12 20:24:34.000000000 -0500 @@ -45,6 +45,8 @@ #define SO_ACCEPTCONN 30 +#define SO_PEERSEC 31 + /* Nasty libc5 fixup - bletch */ #if defined(__KERNEL__) || !defined(__GLIBC__) || (__GLIBC__ < 2) /* Socket types. 
*/ diff -urN -X dontdiff linux-2.6.0-test11.orig/include/linux/security.h linux-2.6.0-test11.w1/include/linux/security.h --- linux-2.6.0-test11.orig/include/linux/security.h 2003-10-15 08:53:19.000000000 -0400 +++ linux-2.6.0-test11.w1/include/linux/security.h 2003-12-12 21:43:53.000000000 -0500 @@ -757,6 +757,21 @@ * incoming sk_buff @skb has been associated with a particular socket, @sk. * @sk contains the sock (not socket) associated with the incoming sk_buff. * @skb contains the incoming network data. + * @socket_getpeersec: + * This hook allows the security module to provide peer socket security + * state to userspace via getsockopt SO_PEERSEC. + * @sock is the local socket. + * @optval is the userspace memory where the security state is to be copied. + * @len as input is the maximum length to copy to userspace provided + * by the caller. The module should change this to the actual + * length copied. + * Return 0 if all is well, otherwise, typical getsockopt return + * values. + * @sk_alloc_security: + * Allocate and attach a security structure to the sk->sk_security field, + * which is used to copy security attributes between local stream sockets. + * @sk_free_security: + * Deallocate security structure. * * Security hooks affecting all System V IPC operations. 
* @@ -1183,6 +1198,9 @@ int (*socket_setsockopt) (struct socket * sock, int level, int optname); int (*socket_shutdown) (struct socket * sock, int how); int (*socket_sock_rcv_skb) (struct sock * sk, struct sk_buff * skb); + int (*socket_getpeersec) (struct socket *sock, char __user *optval, unsigned *len); + int (*sk_alloc_security) (struct sock *sk, int family, int priority); + void (*sk_free_security) (struct sock *sk); #endif /* CONFIG_SECURITY_NETWORK */ }; @@ -2564,6 +2582,22 @@ { return security_ops->socket_sock_rcv_skb (sk, skb); } + +static inline int security_socket_getpeersec(struct socket *sock, + char __user *optval, unsigned *len) +{ + return security_ops->socket_getpeersec(sock, optval, len); +} + +static inline int security_sk_alloc(struct sock *sk, int family, int priority) +{ + return security_ops->sk_alloc_security(sk, family, priority); +} + +static inline void security_sk_free(struct sock *sk) +{ + return security_ops->sk_free_security(sk); +} #else /* CONFIG_SECURITY_NETWORK */ static inline int security_unix_stream_connect(struct socket * sock, struct socket * other, @@ -2664,6 +2698,21 @@ { return 0; } + +static inline int security_socket_getpeersec(struct socket *sock, + char __user *optval, unsigned *len) +{ + return -ENOPROTOOPT; +} + +static inline int security_sk_alloc(struct sock *sk, int family, int priority) +{ + return 0; +} + +static inline void security_sk_free(struct sock *sk) +{ +} #endif /* CONFIG_SECURITY_NETWORK */ #endif /* ! 
__LINUX_SECURITY_H */ diff -urN -X dontdiff linux-2.6.0-test11.orig/include/net/sock.h linux-2.6.0-test11.w1/include/net/sock.h --- linux-2.6.0-test11.orig/include/net/sock.h 2003-12-01 15:27:07.000000000 -0500 +++ linux-2.6.0-test11.w1/include/net/sock.h 2003-12-12 20:44:15.000000000 -0500 @@ -246,6 +246,7 @@ struct socket *sk_socket; void *sk_user_data; struct module *sk_owner; + void *sk_security; void (*sk_state_change)(struct sock *sk); void (*sk_data_ready)(struct sock *sk, int bytes); void (*sk_write_space)(struct sock *sk); diff -urN -X dontdiff linux-2.6.0-test11.orig/net/core/sock.c linux-2.6.0-test11.w1/net/core/sock.c --- linux-2.6.0-test11.orig/net/core/sock.c 2003-12-01 15:27:04.000000000 -0500 +++ linux-2.6.0-test11.w1/net/core/sock.c 2003-12-12 21:43:54.000000000 -0500 @@ -564,6 +564,13 @@ v.val = sk->sk_state == TCP_LISTEN; break; + case SO_PEERSEC: { + int err = security_socket_getpeersec(sock, optval, &len); + if (err) + return err; + goto lenout; + } + default: return(-ENOPROTOOPT); } @@ -606,6 +613,11 @@ sock_lock_init(sk); } sk->sk_slab = slab; + + if (security_sk_alloc(sk, family, priority)) { + kmem_cache_free(sk->sk_slab, sk); + sk = NULL; + } } return sk; } @@ -628,6 +640,7 @@ printk(KERN_DEBUG "%s: optmem leakage (%d bytes) detected.\n", __FUNCTION__, atomic_read(&sk->sk_omem_alloc)); + security_sk_free(sk); kmem_cache_free(sk->sk_slab, sk); module_put(owner); } diff -urN -X dontdiff linux-2.6.0-test11.orig/security/dummy.c linux-2.6.0-test11.w1/security/dummy.c --- linux-2.6.0-test11.orig/security/dummy.c 2003-10-15 08:53:19.000000000 -0400 +++ linux-2.6.0-test11.w1/security/dummy.c 2003-12-12 21:45:08.000000000 -0500 @@ -793,6 +793,21 @@ { return 0; } + +static int dummy_socket_getpeersec(struct socket *sock, + char __user *optval, unsigned *len) +{ + return -ENOPROTOOPT; +} + +static inline int dummy_sk_alloc_security (struct sock *sk, int family, int priority) +{ + return 0; +} + +static inline void dummy_sk_free_security (struct sock 
*sk) +{ +} #endif /* CONFIG_SECURITY_NETWORK */ static int dummy_register_security (const char *name, struct security_operations *ops) @@ -969,6 +984,9 @@ set_to_dummy_if_null(ops, socket_getsockopt); set_to_dummy_if_null(ops, socket_shutdown); set_to_dummy_if_null(ops, socket_sock_rcv_skb); + set_to_dummy_if_null(ops, socket_getpeersec); + set_to_dummy_if_null(ops, sk_alloc_security); + set_to_dummy_if_null(ops, sk_free_security); #endif /* CONFIG_SECURITY_NETWORK */ } diff -urN -X dontdiff linux-2.6.0-test11.orig/security/selinux/hooks.c linux-2.6.0-test11.w1/security/selinux/hooks.c --- linux-2.6.0-test11.orig/security/selinux/hooks.c 2003-10-15 08:53:19.000000000 -0400 +++ linux-2.6.0-test11.w1/security/selinux/hooks.c 2003-12-12 22:06:21.000000000 -0500 @@ -242,6 +242,39 @@ kfree(sbsec); } +#ifdef CONFIG_SECURITY_NETWORK +static int sk_alloc_security(struct sock *sk, int family, int priority) +{ + struct sk_security_struct *ssec; + + if (family != PF_UNIX) + return 0; + + ssec = kmalloc(sizeof(*ssec), priority); + if (!ssec) + return -ENOMEM; + + memset(ssec, 0, sizeof(*ssec)); + ssec->magic = SELINUX_MAGIC; + ssec->sk = sk; + ssec->peer_sid = SECINITSID_UNLABELED; + sk->sk_security = ssec; + + return 0; +} + +static void sk_free_security(struct sock *sk) +{ + struct sk_security_struct *ssec = sk->sk_security; + + if (sk->sk_family != PF_UNIX || ssec->magic != SELINUX_MAGIC) + return; + + sk->sk_security = NULL; + kfree(ssec); +} +#endif /* CONFIG_SECURITY_NETWORK */ + /* The security server must be initialized before any labeling or access decisions can be provided. 
*/ extern int ss_initialized; @@ -2576,6 +2609,7 @@ struct socket *other, struct sock *newsk) { + struct sk_security_struct *ssec; struct inode_security_struct *isec; struct inode_security_struct *other_isec; struct avc_audit_data ad; @@ -2594,6 +2628,14 @@ if (err) return err; + /* connecting socket */ + ssec = sock->sk->sk_security; + ssec->peer_sid = other_isec->sid; + + /* server child socket */ + ssec = newsk->sk_security; + ssec->peer_sid = isec->sid; + return 0; } @@ -2621,6 +2663,53 @@ return 0; } +static int selinux_socket_getpeersec(struct socket *sock, + char __user *optval, unsigned *len) +{ + int err = 0; + char *scontext; + u32 scontext_len; + struct sk_security_struct *ssec; + struct inode_security_struct *isec; + + isec = SOCK_INODE(sock)->i_security; + if (isec->sclass != SECCLASS_UNIX_STREAM_SOCKET) { + err = -ENOPROTOOPT; + goto out; + } + + ssec = sock->sk->sk_security; + + err = security_sid_to_context(ssec->peer_sid, &scontext, &scontext_len); + if (err) + goto out; + + if (scontext_len > *len) { + err = -ENOSPC; + goto out_free; + } + + if (copy_to_user(optval, scontext, scontext_len)) + err = -EFAULT; + + *len = scontext_len; + +out_free: + kfree(scontext); +out: + return err; +} + +static int selinux_sk_alloc_security(struct sock *sk, int family, int priority) +{ + return sk_alloc_security(sk, family, priority); +} + +static void selinux_sk_free_security(struct sock *sk) +{ + sk_free_security(sk); +} + #endif static int ipc_alloc_security(struct task_struct *task, @@ -3356,6 +3445,9 @@ .socket_getsockopt = selinux_socket_getsockopt, .socket_setsockopt = selinux_socket_setsockopt, .socket_shutdown = selinux_socket_shutdown, + .socket_getpeersec = selinux_socket_getpeersec, + .sk_alloc_security = selinux_sk_alloc_security, + .sk_free_security = selinux_sk_free_security, #endif }; diff -urN -X dontdiff linux-2.6.0-test11.orig/security/selinux/include/objsec.h linux-2.6.0-test11.w1/security/selinux/include/objsec.h --- 
linux-2.6.0-test11.orig/security/selinux/include/objsec.h 2003-09-27 20:50:18.000000000 -0400 +++ linux-2.6.0-test11.w1/security/selinux/include/objsec.h 2003-12-12 20:24:39.000000000 -0500 @@ -93,6 +93,12 @@ unsigned char set; }; +struct sk_security_struct { + unsigned long magic; /* magic number for this module */ + struct sock *sk; /* back pointer to sk object */ + u32 peer_sid; /* SID of peer */ +}; + extern int inode_security_set_sid(struct inode *inode, u32 sid); #endif /* _SELINUX_OBJSEC_H_ */ From rask@sygehus.dk Sat Dec 13 09:29:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 13 Dec 2003 09:29:46 -0800 (PST) Received: from 0x50a41c34.albnxx15.adsl-dhcp.tele.dk (0x50a41c34.albnxx15.adsl-dhcp.tele.dk [80.164.28.52]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBDHTVTa022328 for ; Sat, 13 Dec 2003 09:29:32 -0800 Received: by 0x50a41c34.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id 8074E1F25E; Sat, 13 Dec 2003 18:29:29 +0100 (CET) Date: Sat, 13 Dec 2003 18:29:28 +0100 From: Rask Ingemann Lambertsen To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: Re: [EXPERIMENTAL PATCH] 2.4 tulip jumbo frames Message-ID: <20031213182925.A1791@sygehus.dk> References: <20031209160632.D1345@sygehus.dk> <3FD5FC36.5090405@pobox.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="k1lZvvs/B4yU6o8G" Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <3FD5FC36.5090405@pobox.com>; from jgarzik@pobox.com on Tue, Dec 09, 2003 at 11:45:42AM -0500 X-archive-position: 2014 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rask@sygehus.dk Precedence: bulk X-list: netdev --k1lZvvs/B4yU6o8G Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Tue, Dec 09, 2003 at 11:45:42AM -0500, Jeff Garzik wrote: > Two questions and a comment... > > Would you split this into two patches? The first simply adds, and uses, > tp->rx_buf_sz. 
The second adds PKT_BUF_SZ_MAX and mtu-related changes. The first patch is attached to this message. I'm not happy with the second part yet because it is too hackish and I would prefer to have a look at the tulip documentation first. > Have you looked at Donald Becker's changes to tulip.c? He went through > most of his drivers and made the changes necessary to support larger > MTUs. IIRC his tulip.c changes (which should be easily translate-able > to 2.6.x tulip) were a bit more minimal than your patch, but still > served the purpose. I've just tested Becker's tulip.c v0.97 (7/22/2003) and it fails to receive packets larger than the standard 1514 bytes. "ping -s 1472" works while "ping -s 1473" doesn't (when pinging from another box). -- Regards, Rask Ingemann Lambertsen --k1lZvvs/B4yU6o8G Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="tulip-rxbufsz.patch" --- linux-2.4.22/drivers/net/tulip/tulip.h-orig Mon Dec 1 21:58:11 2003 +++ linux-2.4.22/drivers/net/tulip/tulip.h Mon Dec 1 22:09:14 2003 @@ -347,6 +353,7 @@ struct tulip_private { struct ring_info tx_buffers[TX_RING_SIZE]; /* The addresses of receive-in-place skbuffs. */ struct ring_info rx_buffers[RX_RING_SIZE]; + uint rx_buf_sz; u16 setup_frame[96]; /* Pseudo-Tx frame to init address table. 
*/ int chip_id; int revision; --- linux-2.4.22/drivers/net/tulip/interrupt.c-orig Mon Dec 1 21:52:10 2003 +++ linux-2.4.22/drivers/net/tulip/interrupt.c Mon Dec 1 22:03:50 2003 @@ -74,11 +74,11 @@ int tulip_refill_rx(struct net_device *d struct sk_buff *skb; dma_addr_t mapping; - skb = tp->rx_buffers[entry].skb = dev_alloc_skb(PKT_BUF_SZ); + skb = tp->rx_buffers[entry].skb = dev_alloc_skb(tp->rx_buf_sz); if (skb == NULL) break; - mapping = pci_map_single(tp->pdev, skb->tail, PKT_BUF_SZ, + mapping = pci_map_single(tp->pdev, skb->tail, tp->rx_buf_sz, PCI_DMA_FROMDEVICE); tp->rx_buffers[entry].mapping = mapping; @@ -203,7 +202,7 @@ static int tulip_rx(struct net_device *d #endif pci_unmap_single(tp->pdev, tp->rx_buffers[entry].mapping, - PKT_BUF_SZ, PCI_DMA_FROMDEVICE); + tp->rx_buf_sz, PCI_DMA_FROMDEVICE); tp->rx_buffers[entry].skb = NULL; tp->rx_buffers[entry].mapping = 0; --- linux-2.4.22/drivers/net/tulip/tulip_core.c-orig Mon Dec 1 21:34:32 2003 +++ linux-2.4.22/drivers/net/tulip/tulip_core.c Mon Dec 1 22:10:53 2003 @@ -518,9 +521,7 @@ void tulip_xon(struct net_device *dev) static int tulip_open(struct net_device *dev) { -#ifdef CONFIG_NET_HW_FLOWCONTROL struct tulip_private *tp = (struct tulip_private *)dev->priv; -#endif int retval; MOD_INC_USE_COUNT; @@ -529,6 +530,7 @@ tulip_open(struct net_device *dev) return retval; } + tp->rx_buf_sz = PKT_BUF_SZ; tulip_init_ring (dev); tulip_up (dev); @@ -673,13 +678,13 @@ static void tulip_init_ring(struct net_d for (i = 0; i < RX_RING_SIZE; i++) { tp->rx_ring[i].status = 0x00000000; - tp->rx_ring[i].length = cpu_to_le32(PKT_BUF_SZ); + tp->rx_ring[i].length = cpu_to_le32(tp->rx_buf_sz); tp->rx_ring[i].buffer2 = cpu_to_le32(tp->rx_ring_dma + sizeof(struct tulip_rx_desc) * (i + 1)); tp->rx_buffers[i].skb = NULL; tp->rx_buffers[i].mapping = 0; } /* Mark the last entry as wrapping the ring. 
*/ - tp->rx_ring[i-1].length = cpu_to_le32(PKT_BUF_SZ | DESC_RING_WRAP); + tp->rx_ring[i-1].length = cpu_to_le32(tp->rx_buf_sz | DESC_RING_WRAP); tp->rx_ring[i-1].buffer2 = cpu_to_le32(tp->rx_ring_dma); for (i = 0; i < RX_RING_SIZE; i++) { @@ -688,12 +693,12 @@ static void tulip_init_ring(struct net_d /* Note the receive buffer must be longword aligned. dev_alloc_skb() provides 16 byte alignment. But do *not* use skb_reserve() to align the IP header! */ - struct sk_buff *skb = dev_alloc_skb(PKT_BUF_SZ); + struct sk_buff *skb = dev_alloc_skb(tp->rx_buf_sz); tp->rx_buffers[i].skb = skb; if (skb == NULL) break; mapping = pci_map_single(tp->pdev, skb->tail, - PKT_BUF_SZ, PCI_DMA_FROMDEVICE); + tp->rx_buf_sz, PCI_DMA_FROMDEVICE); tp->rx_buffers[i].mapping = mapping; skb->dev = dev; /* Mark as being used by this device. */ tp->rx_ring[i].status = cpu_to_le32(DescOwned); /* Owned by Tulip chip */ @@ -876,7 +881,7 @@ static int tulip_close (struct net_devic tp->rx_ring[i].length = 0; tp->rx_ring[i].buffer1 = 0xBADF00D0; /* An invalid address. 
*/ if (skb) { - pci_unmap_single(tp->pdev, mapping, PKT_BUF_SZ, + pci_unmap_single(tp->pdev, mapping, tp->rx_buf_sz, PCI_DMA_FROMDEVICE); dev_kfree_skb (skb); } --k1lZvvs/B4yU6o8G-- From rask@sygehus.dk Sat Dec 13 11:00:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 13 Dec 2003 11:01:01 -0800 (PST) Received: from 0x50a41c34.albnxx15.adsl-dhcp.tele.dk (0x50a41c34.albnxx15.adsl-dhcp.tele.dk [80.164.28.52]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBDJ0lTa024395 for ; Sat, 13 Dec 2003 11:00:47 -0800 Received: by 0x50a41c34.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id B894D48A8; Sat, 13 Dec 2003 20:00:45 +0100 (CET) Date: Sat, 13 Dec 2003 20:00:45 +0100 From: Rask Ingemann Lambertsen To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: [PATCH] e100: Enable receiving bogus packets, and transmitting bad/custom CRC Message-ID: <20031213200044.B1791@sygehus.dk> References: <3FC2931B.3070903@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <3FC2931B.3070903@candelatech.com>; from greearb@candelatech.com on Mon, Nov 24, 2003 at 03:24:11PM -0800 X-archive-position: 2015 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rask@sygehus.dk Precedence: bulk X-list: netdev On Mon, Nov 24, 2003 at 03:24:11PM -0800, Ben Greear wrote: > Thanks to those who pointed me in the right direction, here is a patch > to the e100 (2.4.23-pre9) that allows it to capture all frames, bogons included. > It also coppies the FCS to the skb so ethereal et al can read it. > > It utilizes ethtool commands to get/set the rx-all feature, and uses > a new flag in the skbuff (and socket struct) structure to determine when to disable generating > the FCS on transmit. I have the entire patch that adds the management > bits and flags, but as usual, it's mixed in with various other things... 
Reading the tulip manual (see below) triggered a question: When transmitting a custom CRC, who is responsible for padding the frame to the minimum length? If frame padding is left to the driver, what should be used for padding? I (or rather, Google) found the tulip documentation at . The tulip chips are capable of all these tricks too. -- Regards, Rask Ingemann Lambertsen From greearb@candelatech.com Sat Dec 13 11:12:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 13 Dec 2003 11:13:03 -0800 (PST) Received: from ns1.wanfear.com (ns1.wanfear.com [207.212.57.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBDJCoTa025093 for ; Sat, 13 Dec 2003 11:12:50 -0800 Received: from candelatech.com (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) (authenticated bits=0) by ns1.wanfear.com (8.12.10/8.12.8) with ESMTP id hBDJCnEu006809; Sat, 13 Dec 2003 11:12:49 -0800 Message-ID: <3FDB64B1.5050700@candelatech.com> Date: Sat, 13 Dec 2003 11:12:49 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Rask Ingemann Lambertsen CC: netdev@oss.sgi.com, "David S. Miller" Subject: Re: [PATCH] e100: Enable receiving bogus packets, and transmitting bad/custom CRC References: <3FC2931B.3070903@candelatech.com> <20031213200044.B1791@sygehus.dk> In-Reply-To: <20031213200044.B1791@sygehus.dk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2016 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Rask Ingemann Lambertsen wrote: > On Mon, Nov 24, 2003 at 03:24:11PM -0800, Ben Greear wrote: > >>Thanks to those who pointed me in the right direction, here is a patch >>to the e100 (2.4.23-pre9) that allows it to capture all frames, bogons included. 
>>It also coppies the FCS to the skb so ethereal et al can read it. >> >>It utilizes ethtool commands to get/set the rx-all feature, and uses >>a new flag in the skbuff (and socket struct) structure to determine when to disable generating >>the FCS on transmit. I have the entire patch that adds the management >>bits and flags, but as usual, it's mixed in with various other things... > > > Reading the tulip manual (see below) triggered a question: When transmitting > a custom CRC, who is responsible for padding the frame to the minimum length? > If frame padding is left to the driver, what should be used for padding? Hrm, I have not tested this, as my sending app itself enforces the minimum size. I am guessing that padding is up to the driver/OS in this case, but I'd have to re-read the e100 docs to be sure... > I (or rather, Google) found the tulip documentation at > . The tulip > chips are capable of all these tricks too. Cool, I'll check out this doc soon. If you happen to write up a tulip driver patch for this feature, please let me know. I have lots of 4-port tulip nics to test on here... DaveM: Are you interested in getting these patches into 2.4.24-preX? (It would be helpful to get the infrastructure and ethtool portions in, even if the driver parts remain outside the tree for a bit longer.) Thanks, Ben > -- Ben Greear Candela Technologies Inc http://www.candelatech.com From mag@fbab.net Sat Dec 13 15:17:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 13 Dec 2003 15:17:23 -0800 (PST) Received: from mail2.fbab.net (qmailr@spectre.fbab.net [212.214.165.139]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBDNH7Ta032437 for ; Sat, 13 Dec 2003 15:17:08 -0800 Received: (qmail 19686 invoked by uid 136); 13 Dec 2003 23:17:04 -0000 Received: from mag@fbab.net by mail2.fbab.net by uid 133 with qmail-scanner-1.20rc1 (avp: 4.0.3.0. Clear:RC:1:. 
Processed in 0.053998 secs); 13 Dec 2003 23:17:04 -0000 Received: from unknown (HELO mail2.fbab.net) (212.214.165.228) by mail2.fbab.net with SMTP; 13 Dec 2003 23:17:04 -0000 Received: from 194.237.241.213 (SquirrelMail authenticated user magimap1) by mail2.fbab.net with HTTP; Sun, 14 Dec 2003 00:17:04 +0100 (CET) Message-ID: <32820.194.237.241.213.1071357424.squirrel@mail2.fbab.net> Date: Sun, 14 Dec 2003 00:17:04 +0100 (CET) Subject: forcedeth 0xa2 event From: "Magnus Naeslund(w)" To: netdev@oss.sgi.com Reply-To: mag@fbab.net User-Agent: SquirrelMail/1.4.2 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 Importance: Normal X-archive-position: 2017 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mag@fbab.net Precedence: bulk X-list: netdev I tried out the forcedeth driver, and so far it has worked very well attached to my ADSL router. I saw this message printed: "eth0: received irq with unknown events 0xa2. Please report" So I'm reporting it. Thanks for this effort; it means I can buy a new (non-nVidia) graphics card for my shulle box :) I'm not subscribed to this list, so please CC me. I can provide more system information if you need it.
Cheers Magnus Naeslund -- no .sig From david+hb@blue-labs.org Sat Dec 13 21:52:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 13 Dec 2003 21:52:42 -0800 (PST) Received: from ns.kalifornia.com (wsip-68-99-153-203.ri.ri.cox.net [68.99.153.203]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBE5qRTa016345 for ; Sat, 13 Dec 2003 21:52:28 -0800 Received: from blue-labs.org (pcp03879594pcs.kilng01.ct.comcast.net [68.63.175.90]) (authenticated bits=0) by ns.kalifornia.com (8.12.10/8.12.9) with ESMTP id hBE5qP0o025673 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sun, 14 Dec 2003 00:52:26 -0500 Message-ID: <3FDBFA98.1080401@blue-labs.org> Date: Sun, 14 Dec 2003 00:52:24 -0500 From: David Ford User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031205 X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: ALLMULTI and PROMISC Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2018 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: david+hb@blue-labs.org Precedence: bulk X-list: netdev The iproute2 documentation states that these are deprecated. Would somebody give me some pointers or a short explanation why they are deprecated and if there is a replacement method, would you mind mentioning it? 
Thank you, David From dax@gurulabs.com Sun Dec 14 21:53:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 14 Dec 2003 21:53:32 -0800 (PST) Received: from mail.gurulabs.com (you@[66.62.77.7]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBF5rHTa000519 for ; Sun, 14 Dec 2003 21:53:17 -0800 Received: from [10.200.1.137] (mentor-wlan-dkhome.gurulabs.com [10.200.1.137]) by mail.gurulabs.com (Postfix) with ESMTP id A8A0C77A5; Sun, 14 Dec 2003 22:53:15 -0700 (MST) Subject: Re: [BUG] can't unload network device's if IPV6 is loaded From: Dax Kelson To: Stephen Hemminger Cc: "David S. Miller" , YOSHIFUJI Hideaki , netdev@oss.sgi.com In-Reply-To: <20031211153334.0c59214b.shemminger@osdl.org> References: <20031211153334.0c59214b.shemminger@osdl.org> Content-Type: text/plain Message-Id: <1071467514.3219.11.camel@mentor.gurulabs.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.5 (1.4.5-7) Date: Sun, 14 Dec 2003 22:51:54 -0700 Content-Transfer-Encoding: 7bit X-archive-position: 2019 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dax@gurulabs.com Precedence: bulk X-list: netdev On Thu, 2003-12-11 at 16:33, Stephen Hemminger wrote: > In 2.6.0-test11, IPV6 is not correctly cleaning up the network device reference's > (ie missing dev_put). So if I do: > rmmod e100 > it hangs forever and complains about that not all references have been cleaned up. > > This happens even if no IPV6 addresses have been set up. Just having ipv6 available > to be loaded at boot up. The vendor startup scripts (SuSe 9) may be setting something. I noticed this on my laptop with my internal Dell TrueMobile 1150 wireless 802.11 card (orinoco_cs). On my Fedora box when I added IPV6=yes to /etc/sysconfig/network I couldn't get clean reboots or shutdowns. 
Dax Kelson Guru Labs From rusty@samba.org Sun Dec 14 22:56:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 14 Dec 2003 22:56:37 -0800 (PST) Received: from lists.samba.org (dp.samba.org [66.70.73.150]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBF6u9Ta001907 for ; Sun, 14 Dec 2003 22:56:10 -0800 Received: by lists.samba.org (Postfix, from userid 590) id E7F7C2C284; Mon, 15 Dec 2003 06:08:38 +0000 (GMT) From: Rusty Russell To: Stephen Hemminger Cc: "David S. Miller" , Rusty Russell , Michal Ostrowski Cc: netdev@oss.sgi.com Subject: Re: [PATCH] MODULE_ALIAS_PROTO for PPPoE In-reply-to: Your message of "Wed, 10 Dec 2003 10:22:56 -0800." <20031210102256.27d2ff51.shemminger@osdl.org> Date: Mon, 15 Dec 2003 11:43:48 +1100 Message-Id: <20031215060838.E7F7C2C284@lists.samba.org> X-archive-position: 2020 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rusty@rustcorp.com.au Precedence: bulk X-list: netdev In message <20031210102256.27d2ff51.shemminger@osdl.org> you write: > Rusty added module aliases for most protocol families but missed this one. Thanks. I've got a few aliases accumulated. I'll push this to Linus/Akpm and see if they bite. Thanks. Rusty > diff -Nru a/drivers/net/pppoe.c b/drivers/net/pppoe.c > --- a/drivers/net/pppoe.c Wed Dec 10 10:20:36 2003 > +++ b/drivers/net/pppoe.c Wed Dec 10 10:20:36 2003 > @@ -1151,3 +1151,4 @@ > MODULE_AUTHOR("Michal Ostrowski "); > MODULE_DESCRIPTION("PPP over Ethernet driver"); > MODULE_LICENSE("GPL"); > +MODULE_ALIAS_NETPROTO(PF_PPPOX); -- Anyone who quotes me in their sig is an idiot. -- Rusty Russell. 
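For illustration only, here is a minimal userspace sketch of the string that a line like MODULE_ALIAS_NETPROTO(PF_PPPOX) is expected to produce, so that a request for the protocol family can be matched to the module. It assumes the alias has the form "net-pf-<number>" and that PF_PPPOX is 24, as in the Linux socket headers; every DEMO_ name is a hypothetical stand-in, not the kernel's own macro.

```c
#include <assert.h>
#include <string.h>

/* Two-level expansion, so the macro argument is expanded to its
 * numeric value before it is turned into a string literal. */
#define DEMO_STRINGIFY_1(x) #x
#define DEMO_STRINGIFY(x)   DEMO_STRINGIFY_1(x)

/* Assumption: PF_PPPOX is 24, as in the Linux socket headers. */
#define DEMO_PF_PPPOX 24

/* Hypothetical stand-in for the alias string a protocol module exports. */
#define DEMO_ALIAS_NETPROTO(proto) "net-pf-" DEMO_STRINGIFY(proto)

/* Returns the alias string built for the PPPoX protocol family. */
const char *demo_pppox_alias(void)
{
    return DEMO_ALIAS_NETPROTO(DEMO_PF_PPPOX);
}
```

Without the two-level stringify, the alias would come out as the literal text of the macro name rather than its number, which is why the helper is split in two.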
From akpm@osdl.org Mon Dec 15 00:01:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 00:02:12 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBF81vTa003848 for ; Mon, 15 Dec 2003 00:01:58 -0800 Received: from mnm (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id hBF81pZ09651 for ; Mon, 15 Dec 2003 00:01:52 -0800 Date: Mon, 15 Dec 2003 00:01:57 -0800 From: Andrew Morton To: netdev@oss.sgi.com Subject: Fw: [Bugme-new] [Bug 1627] New: system crashes after 3 hours test. Message-Id: <20031215000157.411bb9cb.akpm@osdl.org> X-Mailer: Sylpheed version 0.9.4 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2021 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev It's a bit dated... Begin forwarded message: Date: Tue, 2 Dec 2003 08:36:03 -0800 From: bugme-daemon@osdl.org To: bugme-new@lists.osdl.org Subject: [Bugme-new] [Bug 1627] New: system crashes after 3 hours test. http://bugme.osdl.org/show_bug.cgi?id=1627 Summary: system crashes after 3 hours test. Kernel Version: 2.6.0-test9 Status: NEW Severity: high Owner: bugme-janitors@lists.osdl.org Submitter: dvnguyen@us.ibm.com CC: wmb@us.ibm.com Distribution: Hardware Environment: pSeries p650 Software Environment: 2.6.0-test9 Problem Description: Ran SPECweb99_SSL benchmark test for 3 hours and system crashed . 
Here are some information about xmon: 0:mon> t c0000007fc70fd00 c00000000035ddfc .tcp_do_twkill_work+0x19c/0x1b0 c0000007fc70fdd0 c00000000035e064 .twkill_work+0x11c/0x1b4 c0000007fc70fe80 c00000000006457c .worker_thread+0x280/0x3b8 c0000007fc70ff90 c000000000017d4c .kernel_thread+0x4c/0x68 0:mon> 0:mon> r R00 = 0000000000000001 R16 = 0000000000000000 R01 = c0000007fc70fd00 R17 = 0000000000000000 R02 = c000000000679000 R18 = 0000000000000000 R03 = c0000007fc2a5b80 R19 = 0000000000000000 R04 = c0000007fc2a4000 R20 = 0000000000c00000 R05 = 0000000000000000 R21 = 0000000000000000 R06 = c0000000005ec880 R22 = c000000000745ce8 R07 = c0000007f9000000 R23 = 0000000000000064 R08 = 00000000000d4c50 R24 = 0000000000000000 R09 = 0000000000000000 R25 = 0000000000000001 R10 = 0000000000000001 R26 = 0000000000000001 R11 = c0000007fc2a4010 R27 = c00000065069aef8 R12 = 0000000024000080 R28 = c00000062d56acf8 R13 = c0000000005aa000 R29 = c0000000004ea428 R14 = 0000000000000000 R30 = c0000000005927e8 R15 = 0000000000000000 R31 = c00000062d56ac80 pc = c00000000035dce0 msr = 9000000000009032 lr = c00000000035ddfc cr = 0000000084008080 ctr = 0000000000000000 xer = 0000000020000000 trap = 300 0:mon> S msr = 9000000000001032 sprg0= 0000000000000000 pvr = 0000000000380201 sprg1= 0000000000000000 dec = 000000003f96aab1 sprg2= 0000000000c00000 sp = c0000007fc70f560 sprg3= c0000000005aa000 toc = c000000000679000 dar = 0000000000000000 srr0 = c00000000000a888 srr1 = 9000000000001032 asr = 0000000000009001 sr00 = 0000000000000053 sr08 = 0000000000000053 sr01 = 0000000000000053 sr09 = 0000000000000053 sr02 = 0000000000000053 sr10 = 0000000000000053 sr03 = 0000000000000053 sr11 = 0000000000000053 sr04 = 0000000000000053 sr12 = 0000000000000053 sr05 = 0000000000000053 sr13 = 0000000000000053 sr06 = 0000000000000053 sr14 = 0000000000000053 sr07 = 0000000000000053 sr15 = 0000000000000053 Paca: Local Processor Control Area (LpPaca): Saved Srr0=0000000000000000 Saved Srr1=0000000000000000 Saved 
Gpr3=0000000000000000 Saved Gpr4=0000000000000000 Saved Gpr5=0000000000000000 Local Processor Register Save Area (LpRegSave): Saved Sprg0=0000000000000000 Saved Sprg1=0000000000000000 Saved Sprg2=0000000000000000 Saved Sprg3=0000000000000000 Saved Msr =0000000000000000 Saved Nia =0000000000000000 0:mon> e cpu 0: Vector: 300 (Data Access) at [c0000007fc70fa80] pc: c00000000035dce0 (.tcp_do_twkill_work+0x80/0x1b0) lr: c00000000035ddfc (.tcp_do_twkill_work+0x19c/0x1b0) sp: c0000007fc70fd00 msr: 9000000000009032 dar: 0 dsisr: 42000000 current = 0xc0000007fc7547b8 paca = 0xc0000000005aa000 pid = 10, comm = events/0 0:mon> s Oops: Kernel access of bad area, sig: 11 [#1] NIP: C00000000035DCE0 XER: 0000000020000000 LR: C00000000035DDFC REGS: c0000007fc70fa80 TRAP: 0300 Not tainted MSR: 9000000000009432 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11 DAR: 0000000000000000, DSISR: 0000000042000000 TASK = c0000007fc7547b8[10] 'events/0' CPU: 0 GPR00: 0000000000000001 C0000007FC70FD00 C000000000679000 C0000007FC2A5B80 GPR04: C0000007FC2A4000 0000000000000000 C0000000005EC880 C0000007F9000000 GPR08: 00000000000D4C50 0000000000000000 0000000000000001 C0000007FC2A4010 GPR12: 0000000024000080 C0000000005AA000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000C00000 0000000000000000 C000000000745CE8 0000000000000064 GPR24: 0000000000000000 0000000000000001 0000000000000001 C00000065069AEF8 GPR28: C00000062D56ACF8 C0000000004EA428 C0000000005927E8 C00000062D56AC80 NIP [c00000000035dce0] .tcp_do_twkill_work+0x80/0x1b0 Call Trace: [c00000000035e064] .twkill_work+0x11c/0x1b4 [c00000000006457c] .worker_thread+0x280/0x3b8 [c000000000017d4c] .kernel_thread+0x4c/0x68 <0>Kernel panic: Fatal exception in interrupt In interrupt handler - not syncing <0>Rebooting in 180 seconds.. 
=============================================
Quote here some debug info:

"I disassembled the kernel around where the crash occurs, and compared that to the source code. It's a little hard to follow due to the inlining, but I think I see where in the source the crash is occurring. tcp_do_twkill_work calls __tw_del_dead_node(tw), which calls __hlist_del(&tw->tw_death_node). I think the crash occurs in __hlist_del, at the line shown below.

static __inline__ void __hlist_del(struct hlist_node *n)
{
        struct hlist_node *next = n->next;
        struct hlist_node **pprev = n->pprev;
        *pprev = next;          <<<<<<---------- crash occurs here
        if (next)
                next->pprev = pprev;
}

The corresponding assembly code looks as follows:

c000000000376380:  eb 7c 00 00  ld     r27,0(r28)
c000000000376384:  e9 3c 00 08  ld     r9,8(r28)
c000000000376388:  3b bc ff 88  addi   r29,r28,-120
c00000000037638c:  2e 3b 00 00  cmpdi  cr4,r27,0
c000000000376390:  fb 69 00 00  std    r27,0(r9)    <<<---- crashes here
c000000000376394:  41 92 00 08  beq-   cr4,c00000000037639c
c000000000376398:  f9 3b 00 08  std    r9,8(r27)
"

"The xmon output shows that r9 == 0. Linking this back to the source code, this means that pprev == n->pprev == NULL in hlist_del."

I'll test the latest kernel (test11) and will post the results back here.

Steps to reproduce:
Need to run SPECweb99_SSL benchmark to reproduce problem.

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
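The failure mode described in the report (a store through a NULL pprev) can be reproduced in miniature outside the kernel. The following is a userspace sketch, not the kernel's list code and not a proposed fix: it re-declares a small hlist and adds a hypothetical demo_hlist_del_checked() helper that refuses to unlink a node that is already off the list. It assumes, as the report implies, that a node with pprev == NULL is not on any list.

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node {
    struct hlist_node *next;
    struct hlist_node **pprev;  /* address of the pointer that points at us */
};

struct hlist_head {
    struct hlist_node *first;
};

/* Assumption: a node whose pprev is NULL is not linked into any list. */
int demo_hlist_unhashed(const struct hlist_node *n)
{
    return n->pprev == NULL;
}

/* Link n at the front of the list headed by h. */
void demo_hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    struct hlist_node *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink n and mark it as off-list. Returns 1 if the node was unlinked,
 * 0 if it was already off-list. The early return covers exactly the case
 * in which "*pprev = next" would store through a NULL pointer. */
int demo_hlist_del_checked(struct hlist_node *n)
{
    struct hlist_node *next;
    struct hlist_node **pprev;

    if (demo_hlist_unhashed(n))
        return 0;

    next = n->next;
    pprev = n->pprev;
    *pprev = next;
    if (next)
        next->pprev = pprev;

    n->next = NULL;
    n->pprev = NULL;
    return 1;
}
```

A second delete of the same node returns 0 here instead of faulting. In the reported crash the unchecked variant ran on such a node, which points towards the node having been removed twice or never added, though the report itself does not establish which.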
From akpm@osdl.org Mon Dec 15 02:34:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 02:34:41 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFAYNTa016889 for ; Mon, 15 Dec 2003 02:34:24 -0800 Received: from mnm (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id hBFAYHZ31475; Mon, 15 Dec 2003 02:34:17 -0800 Date: Mon, 15 Dec 2003 02:34:23 -0800 From: Andrew Morton To: netdev@oss.sgi.com Cc: kieranm@gtemail.net Subject: Fw: [Bugme-new] [Bug 1675] New: TCP occasionally ignores a FIN, requiring a retransmit Message-Id: <20031215023423.3d1ce730.akpm@osdl.org> X-Mailer: Sylpheed version 0.9.4 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2022 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Begin forwarded message: Date: Fri, 12 Dec 2003 09:18:33 -0800 From: bugme-daemon@osdl.org To: bugme-new@lists.osdl.org Subject: [Bugme-new] [Bug 1675] New: TCP occasionally ignores a FIN, requiring a retransmit http://bugme.osdl.org/show_bug.cgi?id=1675 Summary: TCP occasionally ignores a FIN, requiring a retransmit Kernel Version: 2.4.20-18.9, 2.4.22 Status: NEW Severity: low Owner: niv@us.ibm.com Submitter: kieranm@gtemail.net Distribution: Redhat 9 Hardware Environment: Dual Intel Xeon Server, with Intel Corp. 82546EB Gigabit Ethernet Controller Software Environment: Linux, symptom provoked using small NetPIPE test Problem Description: If a FIN is delivered to the Linux TCP stack close (within around 10us) to the time it is sending a FIN|ACK, it does not acknowledge the received FIN. The other node is then required to retransmit the FIN, which is then correctly acknowledged. I can supply an ethereal trace to illustrate this. 
I suspect it is a race in the state change of the TCP connection, although I can't see an obvious candidate for it in the source - it seems to correctly implement the TCP state diagram. Because it seems to involve only FINs, and the stack resolves it with a retransmission, this problem would not normally be visible. As a result this is an annoyance rather than a serious problem, but having spent a week convincing myself that it is a bug in Linux, I'm quite interested to find out what's wrong.

I have tested it on 2.4.20 (with hyperthreading enabled) and 2.4.22 (with hyperthreading disabled) kernels. Both are SMP kernels, which could be the cause of the race.

Steps to reproduce:
I am provoking the symptom using a different TCP stack on the remote node, which is itself running on a proprietary high performance network, bridged onto the ethernet that the Linux node is on. Because the problem is so sensitive to the timing of the delivery of the FIN packet, it will be hard to reproduce on another setup. The traffic is generated by a simple NetPIPE test:

NPtcp -p 0 -i -l 64 -n 10 -u 64 -h

I'm happy to try any suggested fixes on this setup to see whether they resolve the problem. Also, let me know any questions you have to help track it down. The ethereal trace is the best way to see what the problem is.

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
From steve@navaho.co.uk Mon Dec 15 05:15:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 05:16:06 -0800 (PST) Received: from pinus.navaho (fairchild-196.adsl.newnet.co.uk [213.131.187.196]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFDFqTa026882 for ; Mon, 15 Dec 2003 05:15:53 -0800 Received: from 10.0.0.42 (steve@navaho.co.uk) by pinus.navaho with SMTP (Navaho Galium v3.6.13 [Linux/i686]) id: 2124813680; Mon, 15 Dec 2003 13:15:44 GMT X-Sender-Local: 10.0.0.42 Received: from localhost (steve@localhost) by sorbus2.navaho (8.11.6/8.11.2) with ESMTP id hBFDFjA09695 for ; Mon, 15 Dec 2003 13:15:45 GMT X-Authentication-Warning: sorbus2.navaho: steve owned process doing -bs Date: Mon, 15 Dec 2003 13:15:44 +0000 (GMT) From: Steve Hill X-X-Sender: steve@sorbus2.navaho To: netdev@oss.sgi.com Subject: Re: 2.6.0-test9 : bridge freezes Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="8323328-874190902-1071494144=:8670" X-Navaho-ID: 7ea6153d X-Domain-Forwarded-By: pinus.navaho X-Navaho-Spam-Rating: 0.000000 X-Spam-Override: Local user [steve@navaho.co.uk] X-Navaho-Spam: No X-archive-position: 2023 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: steve@navaho.co.uk Precedence: bulk X-list: netdev This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --8323328-874190902-1071494144=:8670 Content-Type: TEXT/PLAIN; charset=US-ASCII With both conntrack and bridging turned on in the 2.6.0test11 kernel, sending fragmented packets over the bridge reveals a memory leak (specifically, forwarding packets from any interface to a bridge). 
The memory that is leaking seems to be being allocated on line 299 on net/bridge/br_netfilter.c: if ((nf_bridge = nf_bridge_alloc(skb)) == NULL) return NF_DROP; Only the first fragment gets freed later on. The patch attached fixes the problem by freeing nf_bridge when the packets are defragmented, however I am sure this is not the right place to do this. Where would the skb's for the fragments usually get freed? Bart De Schuymer suggested that they should be freed in skbuff.c::skb_release_data(), but having looked at this it seems to do this already. skb_release_data() calls skb_drop_fraglist(), which does kfree_skb() on each fragment, and kfree_skb calls nf_bridge_put correctly so this isn't the problem. -- - Steve Hill Senior Software Developer Email: steve@navaho.co.uk Navaho Technologies Ltd. Tel: +44-870-7034015 ... Alcohol and calculus don't mix - Don't drink and derive! ... --8323328-874190902-1071494144=:8670 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="bridge-memleak.patch" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="bridge-memleak.patch" ZGlmZiAtdXJOIGxpbnV4LTIuNi4wLXRlc3QxMS52YW5pbGxhL25ldC9pcHY0 L2lwX2ZyYWdtZW50LmMgbGludXgtMi42LjAtdGVzdDExL25ldC9pcHY0L2lw X2ZyYWdtZW50LmMNCi0tLSBsaW51eC0yLjYuMC10ZXN0MTEudmFuaWxsYS9u ZXQvaXB2NC9pcF9mcmFnbWVudC5jCTIwMDMtMTItMTIgMTk6Mjc6MDcuMDAw MDAwMDAwICswMDAwDQorKysgbGludXgtMi42LjAtdGVzdDExL25ldC9pcHY0 L2lwX2ZyYWdtZW50LmMJMjAwMy0xMi0xNSAwODo0OTowMS4wMDAwMDAwMDAg KzAwMDANCkBAIC01OTIsNiArNTkyLDkgQEANCiAJYXRvbWljX3N1YihoZWFk LT50cnVlc2l6ZSwgJmlwX2ZyYWdfbWVtKTsNCiANCiAJZm9yIChmcD1oZWFk LT5uZXh0OyBmcDsgZnAgPSBmcC0+bmV4dCkgew0KKyNpZmRlZiBDT05GSUdf QlJJREdFX05FVEZJTFRFUg0KKwkJbmZfYnJpZGdlX3B1dChmcC0+bmZfYnJp ZGdlKTsNCisjZW5kaWYNCiAJCWhlYWQtPmRhdGFfbGVuICs9IGZwLT5sZW47 DQogCQloZWFkLT5sZW4gKz0gZnAtPmxlbjsNCiAJCWlmIChoZWFkLT5pcF9z dW1tZWQgIT0gZnAtPmlwX3N1bW1lZCkNCg== --8323328-874190902-1071494144=:8670-- From linux-netdev@gmane.org Mon Dec 15 09:40:28 2003 
Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 09:40:42 -0800 (PST) Received: from main.gmane.org (main.gmane.org [80.91.224.249]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFHeRTa017168 for ; Mon, 15 Dec 2003 09:40:28 -0800 Received: from root by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 1AVwhl-0004tv-00 for ; Mon, 15 Dec 2003 18:40:25 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: netdev@oss.sgi.com Received: from sea.gmane.org ([80.91.224.252]) by main.gmane.org with esmtp (Exim 3.35 #1 (Debian)) id 1AVwVk-0004gN-00 for ; Mon, 15 Dec 2003 18:28:00 +0100 Received: from news by sea.gmane.org with local (Exim 3.35 #1 (Debian)) id 1AVwVk-0005yk-00 for ; Mon, 15 Dec 2003 18:28:00 +0100 From: Leo Cambilargiu Subject: RTL8139 Date: Mon, 15 Dec 2003 15:33:57 -0200 Lines: 13 Message-ID: <3FDDF085.6060603@secline.com.br> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@sea.gmane.org User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en, pt, en-us, pt-br X-archive-position: 2024 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: leo@secline.com.br Precedence: bulk X-list: netdev Hello readers: I purchased a couple of Encore ENL832-tx NICs that Linux doesn't recognize. They are based on the RTL8139C (06117S1). The module 8139too (2.4.23) doesn't load. It behaves as though no recognized hardware exists. Any comments?
Leo From rask@sygehus.dk Mon Dec 15 11:23:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 11:23:31 -0800 (PST) Received: from 0x50a41c34.albnxx15.adsl-dhcp.tele.dk (0x50a41c34.albnxx15.adsl-dhcp.tele.dk [80.164.28.52]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFJNBTa024008 for ; Mon, 15 Dec 2003 11:23:16 -0800 Received: by 0x50a41c34.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id 382EB4997; Mon, 15 Dec 2003 20:23:07 +0100 (CET) Date: Mon, 15 Dec 2003 20:23:06 +0100 From: Rask Ingemann Lambertsen To: Leo Cambilargiu Cc: netdev@oss.sgi.com Subject: Re: RTL8139 Message-ID: <20031215202306.A1697@sygehus.dk> References: <3FDDF085.6060603@secline.com.br> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <3FDDF085.6060603@secline.com.br>; from leo@secline.com.br on Mon, Dec 15, 2003 at 03:33:57PM -0200 X-archive-position: 2025 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rask@sygehus.dk Precedence: bulk X-list: netdev On Mon, Dec 15, 2003 at 03:33:57PM -0200, Leo Cambilargiu wrote: > The module 8139too (2.4.23) doesn't load. It behaves as though no > recognized hardware exists. Normally the module will load fine even with no supported hardware. At least PCI drivers tend to work that way. > Any comments? Show us the errors you get when you try to load the module. In case the module does load fine but doesn't recognise your board, send us the output of "lspci -vvv" (as root). 
-- Regards, Rask Ingemann Lambertsen From jgarzik@pobox.com Mon Dec 15 11:51:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 11:52:03 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFJpmTa024639 for ; Mon, 15 Dec 2003 11:51:48 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:48355 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1AVykk-00036O-Ed; Mon, 15 Dec 2003 19:51:38 +0000 Message-ID: <3FDE10BE.2000208@pobox.com> Date: Mon, 15 Dec 2003 14:51:26 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Leo Cambilargiu CC: netdev@oss.sgi.com Subject: Re: RTL8139 References: <3FDDF085.6060603@secline.com.br> In-Reply-To: <3FDDF085.6060603@secline.com.br> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2026 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Leo Cambilargiu wrote: > Hello readers: > > I purchased a couple of encore ENL832-tx NIC's that Linux doesn't > recognize. > > they are based on RTL8139C (06117S1) > > The module 8139too (2.4.23) doesn't load. It behaves as though no > recognized hardware exists. > > Any comments? Probably you need to add PCI ids to 8139too.c... 
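To show why a missing PCI ID makes a driver behave as though no hardware exists, here is a minimal userspace sketch of an ID-table lookup. It is not the 8139too source: the struct, table, and function names are hypothetical, and the second table entry (0x1234:0x5678) is a made-up placeholder rather than the ID of any real board. The only value taken as fact is 0x10ec:0x8139, the stock Realtek RTL8139 vendor/device pair. The real IDs of a given card would come from "lspci -vvv", as requested earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a driver's PCI ID table entry. */
struct demo_pci_id {
    uint16_t vendor;
    uint16_t device;
};

static const struct demo_pci_id demo_8139_ids[] = {
    { 0x10ec, 0x8139 },   /* stock Realtek RTL8139 */
    { 0x1234, 0x5678 },   /* placeholder for a rebadged board, NOT a real ID */
    { 0, 0 }              /* terminator */
};

/* Returns 1 if the vendor/device pair is listed in the table, else 0.
 * A board whose pair is not listed is never offered to the driver, even
 * when the chip on it is one the driver could handle. */
int demo_pci_match(uint16_t vendor, uint16_t device)
{
    const struct demo_pci_id *id;

    for (id = demo_8139_ids; id->vendor != 0; id++) {
        if (id->vendor == vendor && id->device == device)
            return 1;
    }
    return 0;
}
```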
From greearb@candelatech.com Mon Dec 15 12:03:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 12:04:12 -0800 (PST) Received: from ns1.wanfear.com (ns1.wanfear.com [207.212.57.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFK3xTa025506 for ; Mon, 15 Dec 2003 12:03:59 -0800 Received: from candelatech.com (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) (authenticated bits=0) by ns1.wanfear.com (8.12.10/8.12.8) with ESMTP id hBFK3wEu024450 for ; Mon, 15 Dec 2003 12:03:58 -0800 Message-ID: <3FDE13AE.3050402@candelatech.com> Date: Mon, 15 Dec 2003 12:03:58 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "'netdev@oss.sgi.com'" Subject: How to count tx and rx bytes? Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2027 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Is there an agreed upon standard for exactly what ethernet drivers should be counting for rx-bytes and tx-bytes? For example, should the counters include the 4-byte FCS? Should they include the ethernet header? From what I can tell, various drivers count the rx and tx bytes differently, which cannot be correct.... 
Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From davem@pizda.ninka.net Mon Dec 15 14:19:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 14:19:48 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFMJYTa031778 for ; Mon, 15 Dec 2003 14:19:34 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id OAA23072; Mon, 15 Dec 2003 14:17:29 -0800 Date: Mon, 15 Dec 2003 14:17:29 -0800 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: How to count tx and rx bytes? Message-Id: <20031215141729.15387fc1.davem@redhat.com> In-Reply-To: <3FDE13AE.3050402@candelatech.com> References: <3FDE13AE.3050402@candelatech.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2028 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 15 Dec 2003 12:03:58 -0800 Ben Greear wrote: > Is there an agreed upon standard for exactly what ethernet drivers > should be counting for rx-bytes and tx-bytes? For example, should the > counters include the 4-byte FCS? Should they include the ethernet header? Good question. It should be that all drivers use what skb->len ends up with at rx/tx time. However, it is often faster to just let the hardware keep track of these statistics (tg3 is one example of a chip that can do this). And sometimes these mechanisms take the FCS or whatever into account and this as you note makes the numbers different. 
From greearb@candelatech.com Mon Dec 15 14:47:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 14:47:14 -0800 (PST) Received: from ns1.wanfear.com (ns1.wanfear.com [207.212.57.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFMl1Ta000375 for ; Mon, 15 Dec 2003 14:47:01 -0800 Received: from candelatech.com (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) (authenticated bits=0) by ns1.wanfear.com (8.12.10/8.12.8) with ESMTP id hBFMl0Eu029519; Mon, 15 Dec 2003 14:47:00 -0800 Message-ID: <3FDE39E3.4000408@candelatech.com> Date: Mon, 15 Dec 2003 14:46:59 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com Subject: Re: How to count tx and rx bytes? References: <3FDE13AE.3050402@candelatech.com> <20031215141729.15387fc1.davem@redhat.com> In-Reply-To: <20031215141729.15387fc1.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2029 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev David S. Miller wrote: > On Mon, 15 Dec 2003 12:03:58 -0800 > Ben Greear wrote: > > >>Is there an agreed upon standard for exactly what ethernet drivers >>should be counting for rx-bytes and tx-bytes? For example, should the >>counters include the 4-byte FCS? Should they include the ethernet header? > > > Good question. > > It should be that all drivers use what skb->len ends up with at > rx/tx time. A more noticeable problem is what I found in e100: It was adding skb->len after doing eth_type_trans, which yanks off the ethernet header... I moved it before the eth_type_trans and I am getting much better numbers. 
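The ordering problem described above can be reproduced in a small user-space sketch. Nothing below is e100 code: `struct toy_skb`, `toy_eth_type_trans()` and the two counting helpers are hypothetical stand-ins, and the only behaviour modelled is that `eth_type_trans()` pulls the 14-byte Ethernet header off the buffer, so `skb->len` is 14 bytes smaller afterwards.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN 14  /* destination MAC + source MAC + EtherType */

/* Toy stand-in for struct sk_buff: only the length matters here. */
struct toy_skb {
    size_t len;
};

/* Models the one side effect of eth_type_trans() that matters for
 * byte counting: the Ethernet header is pulled, so len shrinks. */
static void toy_eth_type_trans(struct toy_skb *skb)
{
    assert(skb->len >= ETH_HLEN);
    skb->len -= ETH_HLEN;
}

/* Counting after the header has been pulled under-counts by ETH_HLEN. */
static size_t rx_bytes_counted_late(size_t frame_len)
{
    struct toy_skb skb = { .len = frame_len };
    size_t rx_bytes = 0;

    toy_eth_type_trans(&skb);
    rx_bytes += skb.len;
    return rx_bytes;
}

/* Counting before the pull includes the Ethernet header. */
static size_t rx_bytes_counted_early(size_t frame_len)
{
    struct toy_skb skb = { .len = frame_len };
    size_t rx_bytes = 0;

    rx_bytes += skb.len;
    toy_eth_type_trans(&skb);
    return rx_bytes;
}
```

For a 60-byte frame the late variant reports 46 bytes and the early variant reports 60, which matches the kind of discrepancy reported here.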
It does appear that the e1000 is counting the FCS in transmitted bytes, however, and unless we tell the e100 to receive the FCS, it will not count those on receive. > However, it is often faster to just let the hardware keep track > of these statistics (tg3 is one example of a chip that can do > this). And sometimes these mechanisms take the FCS or whatever > into account and this as you note makes the numbers different. Maybe the stats-polling code in the driver could do the necessary subtraction to remove the FCS from the results (ie, subtract (packets-since-last-sample * 4) bytes. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From rddunlap@osdl.org Mon Dec 15 14:49:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 14:49:52 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFMndTa000750 for ; Mon, 15 Dec 2003 14:49:40 -0800 Received: from dragon.pdx.osdl.net (dragon.pdx.osdl.net [172.20.1.27]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hBFMnRZ01654; Mon, 15 Dec 2003 14:49:27 -0800 Date: Mon, 15 Dec 2003 14:40:49 -0800 From: "Randy.Dunlap" To: "David S. Miller" Cc: greearb@candelatech.com, netdev@oss.sgi.com Subject: Re: How to count tx and rx bytes? Message-Id: <20031215144049.7acea87e.rddunlap@osdl.org> In-Reply-To: <20031215141729.15387fc1.davem@redhat.com> References: <3FDE13AE.3050402@candelatech.com> <20031215141729.15387fc1.davem@redhat.com> Organization: OSDL X-Mailer: Sylpheed version 0.9.4 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: +5V?h'hZQPB9kW Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2030 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rddunlap@osdl.org Precedence: bulk X-list: netdev On Mon, 15 Dec 2003 14:17:29 -0800 "David S. 
Miller" wrote: | On Mon, 15 Dec 2003 12:03:58 -0800 | Ben Greear wrote: | | > Is there an agreed upon standard for exactly what ethernet drivers | > should be counting for rx-bytes and tx-bytes? For example, should the | > counters include the 4-byte FCS? Should they include the ethernet header? | | Good question. | | It should be that all drivers use what skb->len ends up with at | rx/tx time. | | However, it is often faster to just let the hardware keep track | of these statistics (tg3 is one example of a chip that can do | this). And sometimes these mechanisms take the FCS or whatever | into account and this as you note makes the numbers different. RFC 1573: RFC 2233: RFC 1213: all agree that on an "interface" the number of octets received (InOctets) is: The total number of octets received on the interface, including framing characters. so I read that as including MAC headers and FCS. Without reading these, I would have said include MAC headers but not FCS. Similar for OutOctets. -- ~Randy MOTD: Always include version info. From davem@pizda.ninka.net Mon Dec 15 14:56:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 14:56:40 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFMuRTa001298 for ; Mon, 15 Dec 2003 14:56:27 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id OAA23159; Mon, 15 Dec 2003 14:54:22 -0800 Date: Mon, 15 Dec 2003 14:54:22 -0800 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: How to count tx and rx bytes? 
Message-Id: <20031215145422.58a78551.davem@redhat.com> In-Reply-To: <3FDE39E3.4000408@candelatech.com> References: <3FDE13AE.3050402@candelatech.com> <20031215141729.15387fc1.davem@redhat.com> <3FDE39E3.4000408@candelatech.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2031 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 15 Dec 2003 14:46:59 -0800 Ben Greear wrote: > Maybe the stats-polling code in the driver could do the necessary subtraction > to remove the FCS from the results (ie, subtract (packets-since-last-sample * 4) bytes. The whole reason to use the chip internally computed stats is to avoid having to do "stats->foo++" at all, your suggestion basically eliminates this purpose. From greearb@candelatech.com Mon Dec 15 15:17:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 15:17:43 -0800 (PST) Received: from ns1.wanfear.com (ns1.wanfear.com [207.212.57.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFNHUTa001973 for ; Mon, 15 Dec 2003 15:17:30 -0800 Received: from candelatech.com (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) (authenticated bits=0) by ns1.wanfear.com (8.12.10/8.12.8) with ESMTP id hBFNHTEu030441; Mon, 15 Dec 2003 15:17:30 -0800 Message-ID: <3FDE4109.7090603@candelatech.com> Date: Mon, 15 Dec 2003 15:17:29 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com Subject: Re: How to count tx and rx bytes? 
References: <3FDE13AE.3050402@candelatech.com> <20031215141729.15387fc1.davem@redhat.com> <3FDE39E3.4000408@candelatech.com> <20031215145422.58a78551.davem@redhat.com> In-Reply-To: <20031215145422.58a78551.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2032 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev David S. Miller wrote: > On Mon, 15 Dec 2003 14:46:59 -0800 > Ben Greear wrote: > > >>Maybe the stats-polling code in the driver could do the necessary subtraction >>to remove the FCS from the results (ie, subtract (packets-since-last-sample * 4) bytes. > > > The whole reason to use the chip internally computed stats is to avoid > having to do "stats->foo++" at all, your suggestion basically eliminates > this purpose. Well, I was assuming that the stats polling is a fixed cost O(1), where-as the per-packet calculation is O(n). I am quite sure this assumption is true for e1000, but I have not looked at tg3. So, for e1000, the cost of subtracting out the FCS would be basically free. All that said, from Randy's email, it appears we should be including the FCS anyway... 
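The cost argument above (one correction per stats poll rather than one per packet) can be sketched as follows. This is only an illustration under assumptions: `struct toy_stats` and `toy_poll_stats()` are hypothetical names, and the hardware is assumed to expose free-running octet and packet counters that include the FCS. It is not taken from e1000 or tg3.

```c
#include <assert.h>
#include <stdint.h>

#define FCS_LEN 4  /* Ethernet frame check sequence, in bytes */

/* Hypothetical software view of the statistics. */
struct toy_stats {
    uint64_t rx_bytes;      /* FCS-free byte count reported upward */
    uint64_t last_octets;   /* hardware octet counter at last poll */
    uint64_t last_packets;  /* hardware packet counter at last poll */
};

/* Called once per stats poll, so the work is constant no matter how
 * many packets arrived since the previous poll. */
static void toy_poll_stats(struct toy_stats *st,
                           uint64_t hw_octets, uint64_t hw_packets)
{
    uint64_t d_octets = hw_octets - st->last_octets;
    uint64_t d_packets = hw_packets - st->last_packets;

    /* Every counted packet carried one FCS. */
    assert(d_octets >= d_packets * FCS_LEN);
    st->rx_bytes += d_octets - d_packets * FCS_LEN;

    st->last_octets = hw_octets;
    st->last_packets = hw_packets;
}
```

The per-packet path never touches `rx_bytes` in this scheme; the subtraction happens only when the counters are sampled.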
Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From becker@scyld.com Mon Dec 15 15:19:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 15:20:08 -0800 (PST) Received: from beohost.penguincomputing.com (gateway.penguincomputing.com [64.243.132.186]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFNJtTa002336 for ; Mon, 15 Dec 2003 15:19:56 -0800 Received: from localhost (becker@localhost) by beohost.penguincomputing.com (8.11.6/8.11.6) with ESMTP id hBFNJR404236; Mon, 15 Dec 2003 18:19:32 -0500 X-Authentication-Warning: beohost.penguincomputing.com: becker owned process doing -bs Date: Mon, 15 Dec 2003 18:19:27 -0500 (EST) From: Donald Becker X-X-Sender: becker@beohost To: "Randy.Dunlap" cc: "David S. Miller" , , Subject: Re: How to count tx and rx bytes? In-Reply-To: <20031215144049.7acea87e.rddunlap@osdl.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2033 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: becker@scyld.com Precedence: bulk X-list: netdev On Mon, 15 Dec 2003, Randy.Dunlap wrote: > On Mon, 15 Dec 2003 14:17:29 -0800 "David S. Miller" wrote: > > | On Mon, 15 Dec 2003 12:03:58 -0800 > | Ben Greear wrote: > | > | > Is there an agreed upon standard for exactly what ethernet drivers > | > should be counting for rx-bytes and tx-bytes? For example, should the > | > counters include the 4-byte FCS? Yes. > | > Should they include the ethernet header? Yes, but not the preamble. [[ Ethernet packets have an 8 byte preamble before the station addresses. It doesn't make sense to count this, as all modern NICs strip it off, and a few leading bits may be dropped at each hop ]] > | It should be that all drivers use what skb->len ends up with at > | rx/tx time. Not quite. The skb->len value omits the 4 byte FCS/CRC. 
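The counting rule stated so far (include the Ethernet header and the 4-byte FCS, exclude the 8-byte preamble) reduces to simple arithmetic. The sketch below assumes, as stated above, that `skb->len` at rx/tx time covers the header and payload but not the FCS. The helper names and `ETH_PREAMBLE_LEN` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN         14  /* MAC header */
#define ETH_FCS_LEN       4  /* frame check sequence */
#define ETH_PREAMBLE_LEN  8  /* preamble, never counted (name is illustrative) */

/* Octets to add to rx_bytes/tx_bytes for one frame, given a length
 * that already includes the Ethernet header but not the FCS. */
static size_t counted_octets(size_t skb_len)
{
    assert(skb_len >= ETH_HLEN);
    return skb_len + ETH_FCS_LEN;
}

/* What actually occupies the wire, shown only for contrast. */
static size_t wire_octets(size_t skb_len)
{
    return ETH_PREAMBLE_LEN + counted_octets(skb_len);
}
```

A minimum-size frame with a length of 60 would be counted as 64 octets, while 72 octets pass on the wire once the preamble is included.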
> | However, it is often faster to just let the hardware keep track > | of these statistics (tg3 is one example of a chip that can do > | this). And sometimes these mechanisms take the FCS or whatever > | into account and this as you note makes the numbers different. The driver should correct, although this is non-critical. Using hardware counters where available used to make sense, but today it's better to have the software driver maintain the statistics. The only thing the hardware counters are needed for is otherwise-unobservable errors, such as missed packets and CRC errors. > RFC 1573: > RFC 2233: > RFC 1213: > all agree that on an "interface" the number of octets received (InOctets) > is: > The total number of octets received on the interface, including > framing characters. This is imprecise. "Framing" might be misread to include the preamble, which should be omitted from the count. The original Linux errors counters were influenced by what the dp8390 chip reported. Later, the definitions in appendix B of the dc21040 manual served as a more general guideline (although the software counters were never a one-for-one copy of any specific hardware semantics). -- Donald Becker becker@scyld.com Scyld Computing Corporation http://www.scyld.com 914 Bay Ridge Road, Suite 220 Scyld Beowulf cluster systems Annapolis MD 21403 410-990-9993 From rddunlap@osdl.org Mon Dec 15 15:31:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 15:31:15 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBFNV2Ta002842 for ; Mon, 15 Dec 2003 15:31:02 -0800 Received: from dragon.pdx.osdl.net (dragon.pdx.osdl.net [172.20.1.27]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id hBFNUnZ10024; Mon, 15 Dec 2003 15:30:49 -0800 Date: Mon, 15 Dec 2003 15:22:11 -0800 From: "Randy.Dunlap" To: Donald Becker Cc: davem@redhat.com, greearb@candelatech.com, netdev@oss.sgi.com Subject: Re: How to count tx and rx bytes? 
Message-Id: <20031215152211.7003fe8e.rddunlap@osdl.org> In-Reply-To: References: <20031215144049.7acea87e.rddunlap@osdl.org> Organization: OSDL X-Mailer: Sylpheed version 0.9.4 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: +5V?h'hZQPB9kW Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2034 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rddunlap@osdl.org Precedence: bulk X-list: netdev On Mon, 15 Dec 2003 18:19:27 -0500 (EST) Donald Becker wrote: | On Mon, 15 Dec 2003, Randy.Dunlap wrote: | | > On Mon, 15 Dec 2003 14:17:29 -0800 "David S. Miller" wrote: | > | > | On Mon, 15 Dec 2003 12:03:58 -0800 | > | Ben Greear wrote: | > | | > | > Is there an agreed upon standard for exactly what ethernet drivers | > | > should be counting for rx-bytes and tx-bytes? For example, should the | > | > counters include the 4-byte FCS? | | Yes. | | > | > Should they include the ethernet header? | | Yes, but not the preamble. | [[ Ethernet packets have an 8 byte preamble before the station | addresses. It doesn't make sense to count this, as | all modern NICs strip it off, and | a few leading bits may be dropped at each hop | ]] | | > | It should be that all drivers use what skb->len ends up with at | > | rx/tx time. | | Not quite. The skb->len value omits the 4 byte FCS/CRC. | | > | However, it is often faster to just let the hardware keep track | > | of these statistics (tg3 is one example of a chip that can do | > | this). And sometimes these mechanisms take the FCS or whatever | > | into account and this as you note makes the numbers different. | | The driver should correct, although this is non-critical. | Using hardware counters where available used to make sense, but today | it's better to have the software driver maintain the statistics. 
The | only thing the hardware counters are needed for is | otherwise-unobservable errors, such as missed packets and CRC errors. | | > RFC 1573: | > RFC 2233: | > RFC 1213: | > all agree that on an "interface" the number of octets received (InOctets) | > is: | > The total number of octets received on the interface, including | > framing characters. | | This is imprecise. "Framing" might be misread to include the preamble, | which should be omitted from the count. Yes, sorry. | The original Linux errors counters were influenced by what the dp8390 | chip reported. Later, the definitions in appendix B of the dc21040 | manual served as a more general guideline (although the software | counters were never a one-for-one copy of any specific hardware | semantics). I agree with Don's comments, based on my experiences in former lives. I don't know about the DC21040, but I worked at/on both National and Intel LAN drivers. -- ~Randy MOTD: Always include version info. From davem@pizda.ninka.net Mon Dec 15 16:10:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 16:10:31 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBG0AITa003675 for ; Mon, 15 Dec 2003 16:10:19 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id QAA23324; Mon, 15 Dec 2003 16:08:10 -0800 Date: Mon, 15 Dec 2003 16:08:10 -0800 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: How to count tx and rx bytes? 
Message-Id: <20031215160810.5cea9216.davem@redhat.com> In-Reply-To: <3FDE4109.7090603@candelatech.com> References: <3FDE13AE.3050402@candelatech.com> <20031215141729.15387fc1.davem@redhat.com> <3FDE39E3.4000408@candelatech.com> <20031215145422.58a78551.davem@redhat.com> <3FDE4109.7090603@candelatech.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2035 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 15 Dec 2003 15:17:29 -0800 Ben Greear wrote: > Well, I was assuming that the stats polling is a fixed cost O(1), > where-as the per-packet calculation is O(n). I am quite sure this > assumption is true for e1000, but I have not looked at tg3. So, for > e1000, the cost of subtracting out the FCS would be basically free. I see, I misunderstood your idea. That would work, and yes all of the effort would be expended at netdev->get_stats() time. > All that said, from Randy's email, it appears we should be including > the FCS anyway... Yes, indeed. So the drivers that do not have hardware statistics doing this, and are using skb->len, need to add on the FCS length (which is 4 bytes right?) when accumulating stats->* values. Sounds like a nice janitor job :) From davem@pizda.ninka.net Mon Dec 15 17:19:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 17:20:06 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBG1JhTa008014 for ; Mon, 15 Dec 2003 17:19:43 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id RAA23517; Mon, 15 Dec 2003 17:17:32 -0800 Date: Mon, 15 Dec 2003 17:17:32 -0800 From: "David S. 
Miller" To: Steve Hill Cc: netdev@oss.sgi.com Subject: Re: 2.6.0-test9 : bridge freezes Message-Id: <20031215171732.4877acd1.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2036 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 15 Dec 2003 13:15:44 +0000 (GMT) Steve Hill wrote: > The memory that is leaking seems to be being allocated on line 299 on > net/bridge/br_netfilter.c: > > if ((nf_bridge = nf_bridge_alloc(skb)) == NULL) > return NF_DROP; > > Only the first fragment gets freed later on. I see. > The patch attached fixes the problem by freeing nf_bridge when the > packets are defragmented, however I am sure this is not the right place > to do this. Where would the skb's for the fragments usually get freed? > > Bart De Schuymer suggested that they should be freed in > skbuff.c::skb_release_data(), but having looked at this it seems to do > this already. skb_release_data() calls skb_drop_fraglist(), which does > kfree_skb() on each fragment, and kfree_skb calls nf_bridge_put correctly > so this isn't the problem. There must be something in particular that the IPV4 fragmentation code is doing that makes these fragment reference drops get forgotten. Hmmm... I just noticed that both bridge netfilter and IPV4 fragmentation make much use of the skb->cb[] control block, this may be the true source of the troubles. In fact, since bridge netfilter expects pointers to be there, I'm surprised this does not cause a crash. 
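The kind of clash suspected above can be shown with a toy model. Two layers each overlay their own private structure on the same control block, and the second write destroys the pointer the first layer stored. The layouts `struct toy_bridge_cb` and `struct toy_frag_cb` and the 48-byte size are assumptions for illustration only, not the real bridge netfilter or IPv4 fragmentation structures.

```c
#include <assert.h>
#include <string.h>

#define CB_SIZE 48  /* assumed size of the per-skb control block */

/* Toy stand-in for struct sk_buff: only the control block is modelled. */
struct toy_skb {
    unsigned char cb[CB_SIZE];
};

/* Hypothetical private layouts of two layers sharing cb[]. */
struct toy_bridge_cb {
    void *dev;              /* pointer the first layer expects to find later */
};

struct toy_frag_cb {
    unsigned int offset;    /* fragment offset */
    unsigned int flags;
};

/* Returns 1 if the first layer's data is intact after the second
 * layer has used the same control block, 0 if it was clobbered. */
static int toy_cb_survives(void)
{
    struct toy_skb skb;
    int dummy;
    struct toy_bridge_cb b = { .dev = &dummy };
    struct toy_frag_cb f = { .offset = 1480, .flags = 1 };

    assert(sizeof(b) <= CB_SIZE && sizeof(f) <= CB_SIZE);

    memset(skb.cb, 0, sizeof(skb.cb));
    memcpy(skb.cb, &b, sizeof(b));   /* first layer stores its pointer */
    memcpy(skb.cb, &f, sizeof(f));   /* second layer reuses the same bytes */

    /* Compare raw bytes rather than loading a possibly invalid pointer. */
    return memcmp(skb.cb, &b, sizeof(b)) == 0;
}
```

In this model the stored pointer does not survive. Code that later dereferenced it would be following garbage, which is why a crash would be the expected symptom.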
From chrisw@osdl.org Mon Dec 15 17:33:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 17:33:12 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBG1WxTa008902 for ; Mon, 15 Dec 2003 17:33:00 -0800 Received: (from chrisw@localhost) by mail.osdl.org (8.11.6/8.11.6) id hBG1Wfo30180; Mon, 15 Dec 2003 17:32:41 -0800 Date: Mon, 15 Dec 2003 17:32:41 -0800 From: Chris Wright To: James Morris Cc: "David S. Miller" , kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, linux-security-module@wirex.com, Stephen Smalley Subject: Re: [RFC] SO_PEERSEC - security credentials for Unix stream sockets Message-ID: <20031215173241.B14552@osdlab.pdx.osdl.net> References: <20031212161617.C24246@osdlab.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from jmorris@redhat.com on Fri, Dec 12, 2003 at 10:44:24PM -0500 X-archive-position: 2037 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chrisw@osdl.org Precedence: bulk X-list: netdev * James Morris (jmorris@redhat.com) wrote: > I'm not sure how this would be a namespace issue -- do you mean a data > format issue? I just mean, applications are coded for specific security module. > Yep, allowing the security module to update the returned length is now > implemented. > > > Perhaps buffer is too small, can len be vector for that info? > > I would not advise updating len on error -- it's a bad idea in general to > interpret any returned data from failed syscalls except the error number. Right, in some cases a NULL buffer or 0 buflen is a probe for size. 
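The two positions above (leave `len` alone on error, but treat a NULL buffer or zero length as a size probe) can coexist, as the following user-space sketch suggests. `toy_get_label()` and the label string are hypothetical. This is not the SO_PEERSEC interface, only one way such a contract could look.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Made-up label standing in for whatever a security module returns. */
static const char toy_label[] = "example_label";

/* buf == NULL or *len == 0: probe, report the required size, succeed.
 * *len too small:           fail with -ERANGE and leave *len untouched.
 * otherwise:                copy the label and report the bytes written. */
static int toy_get_label(char *buf, size_t *len)
{
    size_t needed = sizeof(toy_label);  /* includes the trailing NUL */

    assert(len != NULL);

    if (buf == NULL || *len == 0) {
        *len = needed;
        return 0;
    }
    if (*len < needed)
        return -ERANGE;

    memcpy(buf, toy_label, needed);
    *len = needed;
    return 0;
}
```

A caller would probe first, allocate the reported number of bytes, then call again. On the error path nothing but the return value needs to be interpreted.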
thanks, -chris -- Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net From yoshfuji@linux-ipv6.org Mon Dec 15 20:02:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 20:02:31 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBG42HTa010738 for ; Mon, 15 Dec 2003 20:02:18 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-6.6) with ESMTP id hBG3tPwH025435; Tue, 16 Dec 2003 12:55:25 +0900 Date: Tue, 16 Dec 2003 12:55:25 +0900 (JST) Message-Id: <20031216.125525.119981521.yoshfuji@linux-ipv6.org> To: dax@gurulabs.com, shemminger@osdl.org Cc: davem@redhat.com, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [BUG] can't unload network device's if IPV6 is loaded From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <1071467514.3219.11.camel@mentor.gurulabs.com> References: <20031211153334.0c59214b.shemminger@osdl.org> <1071467514.3219.11.camel@mentor.gurulabs.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2038 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <1071467514.3219.11.camel@mentor.gurulabs.com> (at Sun, 14 Dec 2003 22:51:54 -0700), Dax Kelson says: > I noticed this on my laptop with my internal Dell TrueMobile 1150 > wireless 802.11 card 
(orinoco_cs). > > On my Fedora box when I added IPV6=yes to /etc/sysconfig/network I > couldn't get clean reboots or shutdowns. please provide your kernel config. thanks. (I cannot reproduce the problem.) --yoshfuji From garzik@gtf.org Mon Dec 15 21:43:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 21:43:46 -0800 (PST) Received: from havoc.gtf.org (havoc.gtf.org [63.247.75.124]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBG5hWTa015098 for ; Mon, 15 Dec 2003 21:43:33 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id DFAB9663B; Tue, 16 Dec 2003 00:43:26 -0500 (EST) Date: Tue, 16 Dec 2003 00:43:26 -0500 From: Jeff Garzik To: Stephen Hemminger Cc: Alexander Viro , netdev@oss.sgi.com Subject: Re: [PATCH] skfddi - convert to new pci model. Message-ID: <20031216054326.GA8825@gtf.org> References: <20031204163928.0f34d5d1.shemminger@osdl.org> <20031205005915.GD31510@devserv.devel.redhat.com> <20031205151249.44a067df.shemminger@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031205151249.44a067df.shemminger@osdl.org> User-Agent: Mutt/1.3.28i X-archive-position: 2039 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Fri, Dec 05, 2003 at 03:12:49PM -0800, Stephen Hemminger wrote: > Second revision of the cleanup of skfddi driver. > * use new pci device bus initialization > * allocate network device with alloc_fddidev > and use dev->priv > * get rid of special module/non module distinctions. 
> > * fix error unwinds and return values on initialization > * call driver_init directly not via register_netdev > * reset internal queue count after purge > * get rid of h[iy]sterical comment that is no longer true > > diff -Nru a/drivers/net/skfp/skfddi.c b/drivers/net/skfp/skfddi.c > --- a/drivers/net/skfp/skfddi.c Fri Dec 5 14:49:12 2003 > +++ b/drivers/net/skfp/skfddi.c Fri Dec 5 14:49:12 2003 > @@ -39,12 +39,6 @@ > * are skfddi.c, h/types.h, h/osdef1st.h, h/targetos.h. > * The others belong to the SysKonnect FDDI Hardware Module and > * should better not be changed. > - * NOTE: > - * Compiling this driver produces some warnings, but I did not fix > - * this, because the Hardware Module source is used for different > - * drivers, and fixing it for Linux might bring problems on other > - * projects. To keep the source common for all those drivers (and > - * thus simplify fixes to it), please do not clean it up! > * > * Modification History: > * Date Name Description > @@ -58,6 +52,7 @@ > * 07-May-00 DM 64 bit fixes, new dma interface > * 31-Jul-03 DB Audit copy_*_user in skfp_ioctl > * Daniele Bellucci > + * 03-Dec-03 SH Convert to PCI device model > * > * Compilation options (-Dxxx): > * DRIVERDEBUG print lots of messages to log file > @@ -70,7 +65,7 @@ > > /* Version information string - should be updated prior to */ > /* each new release!!! 
*/ > -#define VERSION "2.06" > +#define VERSION "2.07" > > static const char *boot_msg = > "SysKonnect FDDI PCI Adapter driver v" VERSION " for\n" > @@ -80,15 +75,11 @@ > > #include > #include > -#include > -#include > #include > #include > #include > #include > #include > -#include > -#include // isdigit > #include > #include > #include > @@ -107,11 +98,6 @@ > > > // Define module-wide (static) routines > -static struct net_device *alloc_device(struct net_device *dev, u_long iobase); > -static struct net_device *insert_device(struct net_device *dev); > -static int fddi_dev_index(unsigned char *s); > -static void init_dev(struct net_device *dev, u_long iobase); > -static void link_modules(struct net_device *dev, struct net_device *tmp); > static int skfp_driver_init(struct net_device *dev); > static int skfp_open(struct net_device *dev); > static int skfp_close(struct net_device *dev); > @@ -188,15 +174,6 @@ > // Define module-wide (static) variables > > static int num_boards; /* total number of adapters configured */ > -static int num_fddi; > -static int autoprobed; > - > -#ifdef MODULE > -static struct net_device *unlink_modules(struct net_device *p); > -static int loading_module = 1; > -#else > -static int loading_module; > -#endif // MODULE > > #ifdef DRIVERDEBUG > #define PRINTK(s, args...) printk(s, ## args) > @@ -207,9 +184,9 @@ > #define PRIV(dev) (&(((struct s_smc *)dev->priv)->os)) > > /* > - * ============== > - * = skfp_probe = > - * ============== > + * ================= > + * = skfp_init_one = > + * ================= > * > * Overview: > * Probes for supported FDDI PCI controllers > @@ -218,30 +195,11 @@ > * Condition code > * > * Arguments: > - * dev - pointer to device information > + * pdev - pointer to PCI device information > * > * Functional Description: > - * This routine is called by the OS for each FDDI device name (fddi0, > - * fddi1,...,fddi6, fddi7) specified in drivers/net/Space.c. 
> - * If loaded as a module, it will detect and initialize all > - * adapters the first time it is called. > - * > - * Let's say that skfp_probe() is getting called to initialize fddi0. > - * Furthermore, let's say there are three supported controllers in the > - * system. Before skfp_probe() leaves, devices fddi0, fddi1, and fddi2 > - * will be initialized and a global flag will be set to indicate that > - * skfp_probe() has already been called. > - * > - * However...the OS doesn't know that we've already initialized > - * devices fddi1 and fddi2 so skfp_probe() gets called again and again > - * until it reaches the end of the device list for FDDI (presently, > - * fddi7). It's important that the driver "pretend" to probe for > - * devices fddi1 and fddi2 and return success. Devices fddi3 > - * through fddi7 will return failure since they weren't initialized. > - * > - * This algorithm seems to work for the time being. As other FDDI > - * drivers are written for Linux, a more generic approach (perhaps > - * similar to the Ethernet card approach) may need to be implemented. > + * This is now called by PCI driver registration process > + * for each board found. > * > * Return Codes: > * 0 - This device (fddi0, fddi1, etc) configured successfully > @@ -254,370 +212,176 @@ > * initialized and the board resources are read and stored in > * the device structure. > */ > -static int skfp_probe(struct net_device *dev) > +static __init int skfp_init_one(struct pci_dev *pdev, I take back my private email... patch doesn't apply. Can you please rediff, and also remove the above "__init" ? 
Jeff From jgarzik@pobox.com Mon Dec 15 22:32:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 22:32:55 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBG6WdTa016349 for ; Mon, 15 Dec 2003 22:32:40 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:48876 helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.22) id 1AW8l4-0002aa-9F; Tue, 16 Dec 2003 06:32:38 +0000 Message-ID: <3FDEA6FA.4010906@pobox.com> Date: Tue, 16 Dec 2003 01:32:26 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Netdev CC: Linux Kernel , Al Viro , akpm@osdl.org, davem@redhat.com Subject: [BK PATCHES] 2.6.x experimental net driver updates Content-Type: multipart/mixed; boundary="------------000800080303090907080207" X-archive-position: 2040 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------000800080303090907080207 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Another edition of net driver updates. Summary of new changes: * revert skfddi janitorial patches. no skfddi changes are included in this patchkit anymore. Stephen Hemminger has a much better skfddi patch which obsoleted the already-merged skfddi changes. 
* atmel wireless SKB leak fix * 3c574_cs deadlock fix * bonding destruction fix * more work from Al making net_device allocation dynamic and sane (including fast, const-offset private structure allocation for drivers) * sk98lin update from vendor * TG3 update from DaveM * new pc200syn WAN driver from WAN maintainer Krzysztof Halasa * r8169 fixes from Francois Romieu Summary of patchkit: * new e100 driver (rewritten from scratch) * new nVidia nForce NIC driver * new pc200syn WAN driver * tg3 bug fixes * r8169 major bug fixes * e1000 minor updates / fixes * sk98lin vendor updates / fixes * misc bug fixes * 8139too NAPI support * tulip NAPI support * netconsole / netdump support * net_device allocation and reference counting work Patch: http://www.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/2.6.0-test11-bk11-netdrvr-exp1.patch.bz2 Full changelog: http://www.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/2.6.0-test11-bk11-netdrvr-exp1.log BK repo: bk://gkernel.bkbits.net/net-drivers-2.5-exp Changelog delta attached. --------------000800080303090907080207 Content-Type: text/plain; name="2.6.0-test11-bk11-netdrvr-exp1.txt" Content-Transfer-Encoding: 8bit Content-Disposition: inline; filename="2.6.0-test11-bk11-netdrvr-exp1.txt" ChangeSet@1.1548, 2003-12-16 01:13:32-05:00, jgarzik@redhat.com Cset exclude: shemminger@osdl.org|ChangeSet|20031029195339|20158 Revert more skfddi stuff... better patch coming. ChangeSet@1.1545, 2003-12-16 00:57:23-05:00, scott.feldman@intel.com [PATCH] ICH6 IDs + ia64 memcpy fix + module_param * Add ICH6 device IDs. Devices functionally equivalent to supported ICH5 devices, but new IDs. * Fixed unaligned access to casted skb->data (Matt Willcox [willy@debian.org]). * MODULE_PARM -> module_param * Bug printk after register_netdev to identify nic details. * misc cleanups. ChangeSet@1.1474.1.30, 2003-12-16 00:56:37-05:00, dtor_core@ameritech.net [PATCH] Fwd: Re: Atmel - possible SKB leak?
Jeff, Atmel driver in 2.6.0-test11 is leaking SKBs if card gets disassociated from an AP when it's about to transfer packet. Simon (atmel maintainer) is OK with the patch. Given the fact that we are leaking memory I think it may be beneficial to push it to Linus (if you like the patch). Dmitry =================================================================== ChangeSet@1.1517, 2003-12-11 01:44:56-05:00, dtor_core@ameritech.net NET: atmel - do not leak SKBs when dropping packets atmel.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) =================================================================== ChangeSet@1.1474.1.29, 2003-12-16 00:52:31-05:00, akpm@osdl.org [PATCH] Re: Deadlock in 3c574_cs.c (fwd) Patch looks fine to me, thanks. I've queued up the below. From: Ville Nuorvala I've experienced random lockups which become almost certain under heavy loads, like when doing ping6 -f. The culprit seems to be the 3c574_cs driver, which locks lp->window_lock twice when calling update_stats() from el3_interrupt(). drivers/net/pcmcia/3c574_cs.c | 15 +++++++++------ 1 files changed, 9 insertions(+), 6 deletions(-) ChangeSet@1.1543, 2003-12-16 00:47:59-05:00, jgarzik@redhat.com Cset exclude: felipewd@terra.com.br|ChangeSet|20031014182245|09592 Revert skfddi request_region patch... better patch coming.
ChangeSet@1.1541, 2003-12-16 00:25:32-05:00, viro@parcelfarce.linux.theplanet.co.uk [netdrvr axnet_cs] use embedded private struct ChangeSet@1.1540, 2003-12-16 00:23:22-05:00, viro@parcelfarce.linux.theplanet.co.uk [NET] s/kfree/free_netdev/ in bridge driver ChangeSet@1.1474.1.27, 2003-12-16 00:16:23-05:00, viro@parcelfarce.linux.theplanet.co.uk [netdrvr bonding] use destructor to properly free net device (required because of driver's use of rtnl_lock/unlock) ChangeSet@1.1539, 2003-12-16 00:14:52-05:00, viro@parcelfarce.linux.theplanet.co.uk [netdrvr 8390] convert 8390 lib to use const-offset priv struct ChangeSet@1.1538, 2003-12-16 00:14:01-05:00, viro@parcelfarce.linux.theplanet.co.uk [netdrvr pcnet_cs] alloc_ei_netdev-associated cleanups Create __alloc_ei_netdev() helper, which takes a size argument for allocation of driver-private structures. Use __alloc_ei_netdev in pcnet_cs, for embedded priv struct. ChangeSet@1.1537, 2003-12-16 00:10:35-05:00, viro@parcelfarce.linux.theplanet.co.uk [netdrvr smc-mca] alloc_ei_netdev(), other fixes Move all initialization between alloc_ei_netdev() and register_netdev(), and fix bugs on error paths. 
Also s/kfree/free_netdev/ ChangeSet@1.1536, 2003-12-16 00:08:57-05:00, viro@parcelfarce.linux.theplanet.co.uk [netdrvr] convert most 8390 drivers to using alloc_ei_netdev() ChangeSet@1.1535, 2003-12-16 00:03:04-05:00, viro@parcelfarce.linux.theplanet.co.uk [wireless wl3501_cs] remove unused constructor ChangeSet@1.1534, 2003-12-16 00:02:37-05:00, viro@parcelfarce.linux.theplanet.co.uk [netdrvr] s/kfree/free_netdev/ where appropriate Affected drivers: ixgb, sk98lin, ibmtr, airport, orinoco, wl3501_cs ChangeSet@1.1533, 2003-12-15 23:55:18-05:00, mlindner@syskonnect.de [PATCH] sk98lin-2.6: pci.ids Update to Driver Version v6.21 Patch 4/4 (Update to version 6.21) * Add: Update of the Vendors list ChangeSet@1.1532, 2003-12-15 23:54:52-05:00, mlindner@syskonnect.de [PATCH] sk98lin-2.6: Kconfig Update to Driver Version v6.21 Patch 3/4 (Update to version 6.21) * Add: Update of the Vendors list ChangeSet@1.1531, 2003-12-15 23:54:31-05:00, mlindner@syskonnect.de [PATCH] sk98lin-2.6: Readme Update to Driver Version v6.21 Patch 2/4 (Update to version 6.21) * Fix: Readme changes ChangeSet@1.1530, 2003-12-15 23:54:21-05:00, mlindner@syskonnect.de [PATCH] sk98lin-2.6: Kernel Update to Driver Version v6.21 Patch 1/4 (Update to version 6.21) * Add: Common module update * Add: New function for PCI initialization (SkGeInitPCI) * Add: Yukon Plus changes (ChipID, PCI...) * Add: Code for DIAG tool * Fix: Problems while unloading the linux driver * Fix: PrefPort=B not allowed on single NICs * Fix: Fixed Linux System crash when using vlans * Fix: Remove useless register_netdev * Fix: Initialize Board before network configuration * Fix: Modifications regarding try_module_get() and capable() ChangeSet@1.1474.12.6, 2003-12-08 21:54:23-08:00, xose@wanadoo.es [TG3]: Add new device IDs. ChangeSet@1.1529, 2003-12-07 19:17:51-05:00, khc@pm.waw.pl [wan] add new pc200syn driver ChangeSet@1.1528, 2003-12-07 19:00:04-05:00, romieu@fr.zoreil.com [netdrvr r8169] Stats fix (Fernando Alencar Marótica).
ChangeSet@1.1527, 2003-12-07 18:59:29-05:00, romieu@fr.zoreil.com [netdrvr r8169] Endianness update (original idea from Alexandra N. Kossovsky): - descriptors status (bitfields enumerated as _DescStatusBit); - address of buffers stored in Rx/Tx descriptors. ChangeSet@1.1474.12.5, 2003-12-02 19:42:50-08:00, davem@nuts.ninka.net [TG3]: Update to latest non-5705 TSO firmware. ChangeSet@1.1474.12.4, 2003-12-02 02:35:21-08:00, davem@nuts.ninka.net [TG3]: Update version and release date. ChangeSet@1.1474.12.3, 2003-12-02 02:34:45-08:00, davem@nuts.ninka.net [TG3]: Clear on-chip stats/status block after resetting flow-through queues. On systems where the config cycles might take a long time, we can end up with the ASF firmware using the FTQs before we get to resetting them. ChangeSet@1.1474.12.2, 2003-12-02 02:34:13-08:00, davem@nuts.ninka.net [TG3]: Do not set RX_MODE_KEEP_VLAN_TAG when ASF is enabled. ChangeSet@1.1474.12.1, 2003-12-02 02:33:45-08:00, davem@nuts.ninka.net [TG3]: Do not drop existing GRC_MODE_HOST_STACKUP when writing to GRC_MODE. --------------000800080303090907080207-- From bdschuym@pandora.be Mon Dec 15 23:43:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 23:44:10 -0800 (PST) Received: from elpis.telenet-ops.be (elpis.telenet-ops.be [195.130.132.40]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBG7htTa017879 for ; Mon, 15 Dec 2003 23:43:56 -0800 Received: from localhost (poros.telenet-ops.be [195.130.132.44]) by elpis.telenet-ops.be (Postfix) with SMTP id 751CF3801B; Tue, 16 Dec 2003 08:43:54 +0100 (MET) Received: from 192.168.0.138 (D5762BBC.kabel.telenet.be [213.118.43.188]) by poros.telenet-ops.be (Postfix) with ESMTP id 4FA192E68F0; Tue, 16 Dec 2003 08:43:54 +0100 (MET) From: Bart De Schuymer To: "David S. 
Miller" , Steve Hill Subject: Re: 2.6.0-test9 : bridge freezes Date: Tue, 16 Dec 2003 08:43:58 +0100 User-Agent: KMail/1.5 Cc: netdev@oss.sgi.com References: <20031215171732.4877acd1.davem@redhat.com> In-Reply-To: <20031215171732.4877acd1.davem@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200312160843.58992.bdschuym@pandora.be> X-archive-position: 2041 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bdschuym@pandora.be Precedence: bulk X-list: netdev On Tuesday 16 December 2003 02:17, David S. Miller wrote: > There must be something in particular that the IPV4 fragmentation code > is doing that makes these fragment reference drops get forgotten. Hmmm... > > I just noticed that both bridge netfilter and IPV4 fragmentation make much > use of the skb->cb[] control block, this may be the true source of the > troubles. > > In fact, since bridge netfilter expects pointers to be there, I'm surprised > this does not cause a crash. It only expects a pointer in br_nf_forward_finish() for ARP traffic. I checked and the ARP code doesn't use the control buffer. For IP traffic, it uses the control buffer just before and just after the call to the IP PRE_ROUTING hook. OK, I just looked at the ip_fragment.c code and it uses the control buffer too. You are truly amazing. I'll use skbuff.c::nf_bridge_info instead. Steve, does this patch fix things? Of course, first remove your code from ip_fragment.c. I haven't tested this patch yet, this will have to wait until this evening. Dave, I'll cook up a slightly different patch for you later, I think nf_bridge->hh is now a bad name, I'll change it into nf_bridge->data. 
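The control-buffer clash discussed above can be illustrated with a small user-space sketch. This is only a model under stated assumptions, not kernel code and not part of the patch: struct fake_skb, bridge_cb_view, frag_cb_view, bridge_private and all helper names are made up for illustration, and the real sk_buff, bridge_skb_cb and fragmentation structures look different. It shows why a value one layer parks in the shared cb[] area does not survive another layer writing its own state there, while a value kept in storage owned by the bridge code does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state owned by the bridge code alone (models nf_bridge). */
struct bridge_private {
	uint32_t orig_daddr;
};

/* Simplified stand-in for struct sk_buff; every field here is an assumption. */
struct fake_skb {
	unsigned char cb[48];         /* scratch area shared by all layers */
	struct bridge_private priv;   /* storage no other layer writes to  */
};

/* What the bridge code keeps in cb[]: the original destination address. */
struct bridge_cb_view {
	uint32_t orig_daddr;
};

/* What a (made-up) fragmentation layer keeps in the very same bytes. */
struct frag_cb_view {
	uint32_t offset;
};

static void bridge_store_daddr_cb(struct fake_skb *skb, uint32_t daddr)
{
	struct bridge_cb_view v = { daddr };

	assert(sizeof(v) <= sizeof(skb->cb));
	memcpy(skb->cb, &v, sizeof(v));
}

static uint32_t bridge_load_daddr_cb(const struct fake_skb *skb)
{
	struct bridge_cb_view v;

	memcpy(&v, skb->cb, sizeof(v));
	return v.orig_daddr;
}

static void frag_touch_cb(struct fake_skb *skb, uint32_t offset)
{
	struct frag_cb_view v = { offset };

	assert(sizeof(v) <= sizeof(skb->cb));
	memcpy(skb->cb, &v, sizeof(v));   /* overwrites whatever was there */
}

/* Returns 1 if the address saved in cb[] was lost to the fragment layer. */
int cb_is_clobbered(void)
{
	struct fake_skb skb;
	const uint32_t daddr = 0xC0A80001u;   /* 192.168.0.1 */

	memset(&skb, 0, sizeof(skb));
	bridge_store_daddr_cb(&skb, daddr);
	frag_touch_cb(&skb, 1480);
	return bridge_load_daddr_cb(&skb) != daddr;
}

/* Returns 1 if the address saved in private storage was lost (it is not). */
int private_is_clobbered(void)
{
	struct fake_skb skb;
	const uint32_t daddr = 0xC0A80001u;

	memset(&skb, 0, sizeof(skb));
	skb.priv.orig_daddr = daddr;
	frag_touch_cb(&skb, 1480);
	return skb.priv.orig_daddr != daddr;
}
```

Calling cb_is_clobbered() from a test program reports the overwrite, and private_is_clobbered() reports none. In this toy model that is the reason for moving the saved address out of skb->cb and into nf_bridge.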
thanks, Bart --- linux-2.6.0-test11-bk10/net/bridge/br_netfilter.c.old 2003-12-16 08:33:35.000000000 +0100 +++ linux-2.6.0-test11-bk10/net/bridge/br_netfilter.c 2003-12-16 08:34:12.000000000 +0100 @@ -38,11 +38,9 @@ #define skb_origaddr(skb) (((struct bridge_skb_cb *) \ - (skb->cb))->daddr.ipv4) + (skb->nf_bridge->hh))->daddr.ipv4) #define store_orig_dstaddr(skb) (skb_origaddr(skb) = (skb)->nh.iph->daddr) #define dnat_took_place(skb) (skb_origaddr(skb) != (skb)->nh.iph->daddr) -#define clear_cb(skb) (memset(&skb_origaddr(skb), 0, \ - sizeof(struct bridge_skb_cb))) #define has_bridge_parent(device) ((device)->br_port != NULL) #define bridge_parent(device) ((device)->br_port->br->dev) @@ -203,7 +201,6 @@ bridged_dnat: */ nf_bridge->mask |= BRNF_BRIDGED_DNAT; skb->dev = nf_bridge->physindev; - clear_cb(skb); if (skb->protocol == __constant_htons(ETH_P_8021Q)) { skb_push(skb, VLAN_HLEN); @@ -224,7 +221,6 @@ bridged_dnat: dst_hold(skb->dst); } - clear_cb(skb); skb->dev = nf_bridge->physindev; if (skb->protocol == __constant_htons(ETH_P_8021Q)) { skb_push(skb, VLAN_HLEN); From davem@pizda.ninka.net Mon Dec 15 23:48:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Dec 2003 23:48:47 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBG7mZTa019639 for ; Mon, 15 Dec 2003 23:48:35 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id XAA24100; Mon, 15 Dec 2003 23:46:25 -0800 Date: Mon, 15 Dec 2003 23:46:24 -0800 From: "David S. 
Miller" To: Bart De Schuymer Cc: steve@navaho.co.uk, netdev@oss.sgi.com Subject: Re: 2.6.0-test9 : bridge freezes Message-Id: <20031215234624.7ce46dc5.davem@redhat.com> In-Reply-To: <200312160843.58992.bdschuym@pandora.be> References: <20031215171732.4877acd1.davem@redhat.com> <200312160843.58992.bdschuym@pandora.be> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2042 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 16 Dec 2003 08:43:58 +0100 Bart De Schuymer wrote: > You are truly amazing. Don't make me blush in public :) > Dave, I'll cook up a slightly different patch for you later, I think > nf_bridge->hh is now a bad name, I'll change it into nf_bridge->data. Great, I hope we've got this one nailed. From hibi665@oki.com Tue Dec 16 00:22:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 16 Dec 2003 00:23:17 -0800 (PST) Received: from iscan1.intra.oki.co.jp (okigate.oki.co.jp [202.226.91.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id hBG8MvTa027999 for ; Tue, 16 Dec 2003 00:22:58 -0800 Received: from aoi.okilab.oki.co.jp (localhost.localdomain [127.0.0.1]) by iscan1.intra.oki.co.jp (8.9.3/8.9.3) with SMTP id RAA30612 for ; Tue, 16 Dec 2003 17:22:51 +0900 Received: (qmail 11151 invoked from network); 16 Dec 2003 17:22:54 +0900 Received: from dhcp23233.okilab.oki.co.jp (HELO kiso) (172.24.23.233) by aoi.okilab.oki.co.jp with SMTP; 16 Dec 2003 17:22:54 +0900 Message-Id: <20031216172417.1ba1c59e%hibi665@oki.com> MIME-Version: 1.0 Date: Tue, 16 Dec 2003 17:24:18 +0900 X-Mailer: Denshin 8 Go V32.1.4.3 X-My-Real-Login-Name: thibi; aoi From: Takashi Hibi To: David Stevens , netdev@oss.sgi.com In-Reply-To: (Your message of "Thu, 11 Dec 2003 15:19:55 -0800") References: Subject: Re: Kernel 2.4.22 
MLD problems X-archive-position: 2043 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hibi665@oki.com Precedence: bulk X-list: netdev Steven, > Takashi, > >>>But there are still some problems. >>>[MLDv2] >>>1. After joining multicast address, MLDv2 listener report isn't issued >>>immediately. >> >>This looks like a bug; it should send the first one immediately, >>and QRV-1 more at random intervals between 0 and MRC. It's adding >>the MRC random delay before sending the first one. Not critical, >>but I'll put a patch together in the next couple of days. > > I thought this was a bug, from a quick glance at the code, but I > just looked some more and it already does the fix I had in mind. > I also ran some tests, and I do see the advertisement immediately > after an interface state change. That was with 2.6.0-test11bk8, > but 2.4.22 has the same code for this piece. > > In MLDv1 compatibility mode, interface state changes are not sent, but a > "leave group" will result in an MLDv1 leave group message. So, I can't > explain (or reproduce) this behaviour right now. > > Please do send me a packet trace with the MLD packets and describing > what your application is doing in more detail if you still have a problem > with > this. > > +-DLS I found a problem in the code of MLDv1 compatibility mode. The computation of Older Version Querier Present Timeout is wrong. In the draft of MLDv2 9.12 This value MUST be ([Robustness Variable] times (the [Query Interval] in th