[Top] [All Lists]

Re: what pointers does pskb_may_pull() nuke?

To: Julian Anastasov <ja@xxxxxx>
Subject: Re: what pointers does pskb_may_pull() nuke?
From: Michael Richardson <mcr@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 05 Dec 2001 16:09:24 -0500
Cc: netdev@xxxxxxxxxxx, design@xxxxxxxxxxxxxxxxxx
In-reply-to: Your message of "Tue, 04 Dec 2001 12:18:24 +0200." <Pine.LNX.4.33.0112041144050.1594-100000@l>
Sender: owner-netdev@xxxxxxxxxxx

>>>>> "Julian" == Julian Anastasov <ja@xxxxxx> writes:
    Julian>     Hello,

  Thank you kindly for your continuing help.

    Julian> On Tue, 4 Dec 2001, Michael Richardson wrote:

    >> One thing is that we had mixed reports about the need to reassemble
    >> fragments. We realized that this is because netfilter will reassemble
    >> fragments for us.
    >> (My User-Mode-Linux tests had netfilter off)

    Julian>     Yes, this is true. But this reassembling will disappear soon

  "soon" means 2.5/2.6, I hope. Not 2.4 series.

    Julian> and will be moved down to each its user. I.e. you have to do it
    Julian> yourself. Using the 2 lines is simpler but slower if you receive 
    Julian> fragments. You can use these 2 lines of code before you design how
    Julian> to read the data from each skb in frag_list (may be checksum + 

  When various systems screw up PMTU, (outside of our control, we have
re-enabled generation of ICMP messages after turning it off a couple of
releases ago) we get lots of 1500 byte packets that we need to prepend 48
bytes to (IP + ESP) header. So, we wind up fragmenting the outgoing ESP.

  (We do not copy DF to the outer header at this time, since we can not cope
with the unauthenticated ICMPs that would result).

  So, we get ESP fragments to reassemble.

  Once nice thing about seeing the fragments is that we might decide to
implement a proposal for doing authenticated PMTU for tunnels by observing
the size of the largest fragment.

    >> Anyway, it is not clear to us what netfilter option is the minimum 
    >> to cause fragments to be reassembled prior to transport layer. Is just 
    >> netfilter enabled enough to do that?  I'm asking so that we can properly
    >> diagnose the situation.

    Julian>     It is not the Netfilter who calls ip_defrag, it was called
    Julian> from the local delivery code (of course, the conntracking can do it
    Julian> at prerouting). With the current kernels you are ok, as long as, the

  I think you are missing what I am saying.
  Yes, it was called in the local delivery code, but is no longer.
  Our experience is that some aspect of compiling netfilter into a kernel 
(not necessarily with conntrack on) causes ip_defrag() to get called before
local delivery.

    Julian> will disappear soon. So, be warned. This is the reason we to talk
    Julian> for using these 2 lines. TCP is just ready for these changes (no
    Julian> surprises here).

  It would have been better if the transport protocol registration code had
taken a flag (that TCP would set) that said "I can handle fragments". 

  What has happened is that you have changed the interface between network
and transport code unilaterally causing a subtle incompatibility. This causes
us to add code to compensate. This is BAD.

  I would be happy to write a patch that implemented the above.

    >> We will be adding:
    >> /* In Linux 2.4.4, we may have to reassemble fragments. They are
    >> not assembled automatically to save TCP from having to copy
    >> twice.

    Julian>     They are reassembled in sense of proper ordering but the
    Julian> data is spread on many skbs in ->frag_list. I.e. ip_defrag avoids
    Julian> data copy. If you need to checksum the data then you have to
    Julian> additionally linearize it.

  We need to decrypt it. We need it linear.
  Yes, defragment is the wrong term if you like. Linearize is better.

    >> we will do this after we have checked for COW on the SKB.

    Julian>     Instead of cow may be you have to look for another
    Julian> function (but it will be 2.4 specific). skb_cow copies data,
    Julian> although valid for 2.2, you must be sure that you need to copy
    Julian> the skb data (if you need write access) in 2.4. The other variant
    Julian> is to use function that expands the header if only that is the goal.

  We need to write (we decrypt in place).

  We do:

#ifdef NET_21
        /* if skb was cloned (most likely due to a packet sniffer such as
           tcpdump being momentarily attached to the interface), make
           a copy of our own to modify */
        if(skb_cloned(skb)) {
                /* include any mac header while copying.. */
                if(skb_headroom(skb) < hard_header_len) {
                        printk(KERN_WARNING "klips_error:ipsec_rcv: "
                               "tried to skb_push hhlen=%d, %d available.  This 
should never happen, please report.\n",
                        goto rcvleave;
                skb_push(skb, hard_header_len);
#ifdef SKB_COW_NEW
               (skb_cow(skb, skb_headroom(skb)) != 0)
#else /* SKB_COW_NEW */
               ((skb = skb_cow(skb, skb_headroom(skb))) == NULL)
#endif /* SKB_COW_NEW */
                        goto rcvleave;
                if(skb->len < hard_header_len) {
                        printk(KERN_WARNING "klips_error:ipsec_rcv: "
                               "tried to skb_pull hhlen=%d, %d available.  This 
should never happen, please report.\n",
                        goto rcvleave;
                skb_pull(skb, hard_header_len);
#endif /* NET_21 */

]       ON HUMILITY: to err is human. To moo, bovine.           |  firewalls  [
]   Michael Richardson, Sandelman Software Works, Ottawa, ON    |net architect[
] mcr@xxxxxxxxxxxxxxxxxxxxxx |device driver[
] panic("Just another NetBSD/notebook using, kernel hacking, security guy");  [

Version: 2.6.3ia
Charset: latin1
Comment: Finger me for keys


<Prev in Thread] Current Thread [Next in Thread>