Date: Thu, 2 Jun 2005 14:31:26 -0700
From: Stephen Hemminger
To: "Ronciak, John"
Cc: "Jon Mason", "David S. Miller", "Williams, Mitch A", "Venkatesan, Ganesh", "Brandeburg, Jesse"
Subject: Re: RFC: NAPI packet weighting patch
Message-ID: <20050602143126.7c302cfd@dxpl.pdx.osdl.net>
In-Reply-To: <468F3FDA28AA87429AD807992E22D07E0450BFD0@orsmsx408>
Organization: Open Source Development Lab
X-original-sender: shemminger@osdl.org
X-list: netdev

On Thu, 2 Jun 2005 14:19:55 -0700
"Ronciak, John" wrote:

> The DRR algorithm assumes a perfect world, where hardware resources are
> infinite, packets arrive continuously (or separated by very long
> delays), there are no bus latencies, and CPU speed is infinite.
>
> The real world is much messier: hardware starves for resources if it's
> not serviced quickly enough, packets arrive at inconvenient intervals
> (especially at 10 and 100 Mbps speeds), and buses and CPUs are slow.
>
> Thus, the driver should have the intelligence built into it to make an
> "intelligent" choice on what the weight should be for that
> driver/hardware. The calculation in the driver should take into account
> all the factors that the driver has access to. These include link
> speed, bus type and speed, processor speed and some amount of actual
> device FIFO size and latency smarts. The driver would use all of the
> factors to come up with a weight to prevent it from dropping frames and
> not to starve out other devices in the system or hinder performance. It
> seems to us that the driver is the one that know best and should try to
> come up with a reasonable value for weight based on its own knowledge of
> the hardware.

This is like saying each CPU vendor should write their own process
scheduler for Linux. Now with NUMA and HT, it is getting almost that bad
but we still try and keep it CPU neutral.

For networking the problem is worse, the "right" choice depends on
workload and relationship between components in the system. I can't see
how you could ever expect a driver specific solution.

> This has been showing up in our NAPI test data which Mitch is currently
> scrubbing for release. It shows that there is a need for either better
> default static weight numbers or for them to be calculated based on some
> system dynamic variables. We would like to see the latter tried but the
> only problem is that each driver would have to make its own
> calculations, and it may not have access to all of the system-wide data
> it would need to make a proper calculation.

And for other workloads, and other systems (think about the Altix with
long access latencies), your numbers will be wrong.
Perhaps we need to quit trying for a perfect solution and just get a
"good enough" one that works. Let's keep the intelligence out of the
driver. Most of the existing smart drivers end up looking like crap and
don't work that well.

> Even with a more intelligent driver, we still would like to see some
> mechanism for the weight to be changed at runtime, such as with
> Stephen's sysfs patch. This would allow a sysadmin (or user-space app)
> to tune the system based on statistical data that isn't available to the
> individual driver.
>

It will be yet another knob that all except the benchmark tweakers can
ignore (hopefully).
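
For readers who want to see what the weight knob actually trades off, here
is a toy simulation of a weighted round-robin poll pass. It is a sketch
under stated assumptions, not kernel code: the function name poll_round,
the per-device dict layout, and the single shared budget are all made up
for illustration. The only idea taken from the thread is that each device
may process at most "weight" packets per turn before the next device gets
a chance.

```python
from collections import deque


def poll_round(devices, budget):
    """Simulate one poll pass over several devices (illustrative only).

    devices: dict mapping a device name to {"weight": int, "queue": deque}
             (hypothetical layout, not a kernel structure)
    budget:  total number of packets this pass may process (assumed limit)

    Returns a dict mapping each device name to the packets it processed.
    """
    for name, dev in devices.items():
        if dev["weight"] <= 0:
            # A non-positive weight would never make progress.
            raise ValueError(f"weight for {name} must be positive")

    processed = {name: 0 for name in devices}
    # Only devices with pending packets take part in the rotation.
    pending = deque(name for name, dev in devices.items() if dev["queue"])

    while pending and budget > 0:
        name = pending.popleft()
        dev = devices[name]
        # Each turn is capped by the device weight, the remaining budget,
        # and the number of packets actually waiting.
        quota = min(dev["weight"], budget, len(dev["queue"]))
        for _ in range(quota):
            dev["queue"].popleft()
        processed[name] += quota
        budget -= quota
        if dev["queue"]:
            # Still has work: go to the back of the line.
            pending.append(name)

    return processed
```

With two devices that each have 200 packets queued, weights of 64 and 16,
and a budget of 160, the pass processes 128 packets for the first device
and 32 for the second. Changing a weight at runtime shifts that split
without touching the driver, which is the whole point of exposing it as a
tunable rather than computing it per driver.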