Eric Lemoine writes:
> I developed a kernel module that basically implements per-cpu kernel
> threads, each being bound to a particular cpu. I also modified the
> Myrinet NIC driver and firmware so that they implement per-cpu rx rings.
> The NIC makes sure that packets of the same connection are always
> deposited in the same ring. Here's how it does it. For each incoming
> pkt, the NIC computes the index of the ring into which the packet must
> be placed [*], passes this index to the driver, and dmas the packet into
> the appropriate ring. The driver uses the ring index to wake up the
> appropriate kernel thread. Each kernel-thread behaves in a NAPI manner.
So packet ordering should be preserved within a "connection" but not across
the interface as a whole.
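
To make the per-connection property concrete, here is a minimal sketch of how a
ring index could be derived from a connection's addresses and ports. The
struct flow_key, the hash, and the function names are all assumptions made for
illustration; the original message does not say how the firmware computes the
index.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key: the fields that identify one connection. */
struct flow_key {
    uint32_t saddr;
    uint32_t daddr;
    uint16_t sport;
    uint16_t dport;
};

/* Illustrative mixing function; not the hash the Myrinet firmware uses. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = k->saddr ^ k->daddr ^ (((uint32_t)k->sport << 16) | k->dport);

    h ^= h >> 16;
    h *= 0x45d9f3bU;
    h ^= h >> 16;
    return h;
}

/* Same key always yields the same ring, so one connection stays on one cpu. */
static unsigned int rx_ring_index(const struct flow_key *k, unsigned int nrings)
{
    assert(nrings > 0);
    return flow_hash(k) % nrings;
}
```

Because the index depends only on the key, packets of one connection are
handled in order by a single thread, while two different connections may land
in different rings and be processed in any relative order.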
And if you can repeat the trick with per-cpu rings for tx, you can eventually
eliminate cache bouncing when sending and freeing skbs.
We tried tagging each skb with its cpu owner in the tx ring at hard_xmit time
and having that same cpu do the kfree, but the added complexity offset the
win... The thinking was that per-cpu tx rings could help.
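
The owner-tagging idea can be sketched roughly as follows. This is a
single-threaded user-space illustration, not the code we tried: struct pkt,
NCPUS, and the function names are hypothetical, and a real driver would need
locking or some cross-cpu hand-off on the free lists, which is where the
complexity comes from.

```c
#include <assert.h>
#include <stddef.h>

#define NCPUS 4 /* assumed cpu count for this sketch */

/* Stand-in for an skb; only the fields the sketch needs. */
struct pkt {
    int owner_cpu;      /* cpu that queued the packet for transmit */
    struct pkt *next;
};

/* One deferred-free list per cpu; each list is drained only by its own cpu. */
static struct pkt *free_list[NCPUS];

/* At transmit time, record which cpu is sending the packet. */
static void tx_queue(struct pkt *p, int this_cpu)
{
    assert(this_cpu >= 0 && this_cpu < NCPUS);
    p->owner_cpu = this_cpu;
}

/* At tx completion, hand the packet back to its owner instead of freeing it here. */
static void tx_complete(struct pkt *p)
{
    int cpu = p->owner_cpu;

    p->next = free_list[cpu];
    free_list[cpu] = p;
}

/* Run on the owning cpu: release everything handed back; returns the count. */
static size_t drain_free_list(int this_cpu)
{
    size_t n = 0;

    while (free_list[this_cpu]) {
        struct pkt *p = free_list[this_cpu];

        free_list[this_cpu] = p->next;
        p->next = NULL; /* a driver would free the buffer here */
        n++;
    }
    return n;
}
```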