Andi Kleen wrote:

> On Saturday 25 March 2006 23:32, Mark Butler wrote:
>
>> A true firewall should never need to do anything but drop packets and reset connections. Changes to the way packets are routed should be done at the routing layer, using the flow information from the transport layer.
>
> The real world doesn't work this way.

Agreed, there are other uses for "filtering" that are not firewalls in the normal sense of the word, but rather transparent proxies and other odd applications. So the way to do this would be to run through the NF output chain at dst entry assignment time, asking each entry to return negative if it drops everything in the flow, 0 if it is a no-op for the flow, and positive if it needs to be called for every packet in the flow. If any entry returns negative, then an appropriate error would be returned to the transport layer, so that it could immediately cancel the connection / path as appropriate.

If all entries return zero, then we know that the NF chain does not need to be traversed for packets in that flow. If some of them return positive, then one could either operate as usual, or (preferably) construct a list of just the ones that may be applicable and use those.
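The three-way return and the reduced list can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions: `flow_key`, `flow_check_fn`, `flow_verdict`, `classify_flow` and the example entries are hypothetical names, not existing netfilter interfaces, and the flow key stands in for what `struct flowi` would carry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the flow key (struct flowi in the kernel). */
struct flow_key {
    unsigned int   saddr, daddr;
    unsigned short sport, dport;
    unsigned char  proto;
};

/* Hypothetical per-entry check: <0 drops the whole flow, 0 is a no-op
 * for the flow, >0 must be called for every packet in the flow. */
typedef int (*flow_check_fn)(const struct flow_key *fl);

#define MAX_ENTRIES 16

struct flow_verdict {
    int           status;              /* <0 reject, 0 skip chain, >0 reduced chain */
    size_t        n_active;
    flow_check_fn active[MAX_ENTRIES]; /* entries still needed per packet */
};

/* Walk the chain once, at the time the dst entry is assigned. */
int classify_flow(const struct flow_key *fl, const flow_check_fn *chain,
                  size_t n, struct flow_verdict *out)
{
    assert(n <= MAX_ENTRIES);
    out->status = 0;
    out->n_active = 0;
    for (size_t i = 0; i < n; i++) {
        int r = chain[i](fl);
        if (r < 0) {            /* flow is dropped: report the error upward */
            out->status = r;
            out->n_active = 0;
            return r;
        }
        if (r > 0) {            /* keep only the entries that may apply */
            out->active[out->n_active++] = chain[i];
            out->status = 1;
        }
    }
    return out->status;
}

/* Example entries, purely illustrative. */
int entry_noop(const struct flow_key *fl)        { (void)fl; return 0; }
int entry_drop_telnet(const struct flow_key *fl) { return fl->dport == 23 ? -1 : 0; }
int entry_watch_udp(const struct flow_key *fl)   { return fl->proto == 17 ? 1 : 0; }
```

With these example entries, a TCP flow to port 80 classifies as 0 (chain skipped), a flow to port 23 is rejected up front, and a UDP flow ends up with a one-element list containing only `entry_watch_udp`.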

A positive return value would consist of a bitmask indicating the types of transformations the entry applies. Possibly flags for:

  Examines packet only (no side effects)
  Drops packets
  Generates additional packets

  Changes layer 2 hardware type
  Changes layer 2 interface
  Changes layer 2 address
  Changes layer 2 control information
  Adds    layer 2 encapsulation
  Removes layer 2 encapsulation

  Changes layer 3 protocol
  Changes layer 3 routing
  Changes layer 3 address
  Changes layer 3 control information
  Adds    layer 3 encapsulation
  Removes layer 3 encapsulation

  Changes layer 4 protocol
  Changes layer 4 routing
  Changes layer 4 addressing (ports)
  Changes layer 4 control information
  Adds    layer 4 encapsulation
  Removes layer 4 encapsulation

  Changes higher layer protocol
  Changes higher layer routing
  Changes higher layer addressing
  Changes higher layer control information
  Changes higher layer encoding
  Adds    higher layer encapsulation
  Removes higher layer encapsulation
  Other higher layer changes

The positive return bitmask could be OR-ed together and returned to the transport layer, which could then use cleared bits to know whether it was safe to make certain types of assumptions - e.g. that packets from an IPoIB flow were going to use an IB interface, or that packets from a loopback flow were actually going to use the loopback interface.
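A minimal sketch of the OR-ed bitmask and the cleared-bit test, in C. The flag names and bit positions are assumptions made up for this example, covering only a handful of the transformations listed above; nothing here is an existing kernel definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags for a few of the listed transformations. */
enum nf_flow_effect {
    NFE_EXAMINE_ONLY = 1 << 0,  /* examines packet only (no side effects) */
    NFE_DROPS        = 1 << 1,  /* drops packets */
    NFE_GENERATES    = 1 << 2,  /* generates additional packets */
    NFE_L2_INTERFACE = 1 << 3,  /* changes layer 2 interface */
    NFE_L3_ADDRESS   = 1 << 4,  /* changes layer 3 address */
    NFE_L4_PORTS     = 1 << 5   /* changes layer 4 addressing (ports) */
};

/* OR together the masks returned by every entry that applies to the flow. */
uint32_t combine_effects(const uint32_t *masks, size_t n)
{
    uint32_t all = 0;
    for (size_t i = 0; i < n; i++)
        all |= masks[i];
    return all;
}

/* A cleared bit means no entry claims that transformation, so the
 * transport layer may rely on the corresponding assumption. */
int flow_keeps_interface(uint32_t combined)
{
    return (combined & NFE_L2_INTERFACE) == 0;
}
```

For instance, a loopback flow whose combined mask has `NFE_L2_INTERFACE` cleared could be assumed to really leave through the loopback interface, whatever else the applicable entries do.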

Of course, if netfilter changes were made, the relevant dst entries would need to be marked obsolete and the process repeated. It would also be helpful if the entry flow check functions returned the amount of headroom that entry requires.
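One way to picture the invalidation and headroom bookkeeping is a generation counter, sketched below in C. This swaps in a plain counter for whatever mechanism would actually mark dst entries obsolete; `cached_flow`, `ruleset_gen` and the helper functions are hypothetical, and summing the per-entry headroom is an assumption (it presumes each entry may add its own encapsulation).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-flow cache, as might hang off a dst entry. */
struct cached_flow {
    unsigned long gen;       /* ruleset generation the verdict was computed at */
    int           valid;
    size_t        headroom;  /* bytes of headroom the applicable entries asked for */
};

unsigned long ruleset_gen = 1;   /* hypothetical; bumped on every rule change */

void ruleset_changed(void) { ruleset_gen++; }

/* A cached verdict is stale as soon as the ruleset has changed. */
int cache_is_current(const struct cached_flow *c)
{
    return c->valid && c->gen == ruleset_gen;
}

/* Re-run the flow check and record the total headroom requested. */
void cache_refresh(struct cached_flow *c, const size_t *entry_headroom, size_t n)
{
    size_t total = 0;
    assert(entry_headroom != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += entry_headroom[i];
    c->headroom = total;
    c->gen = ruleset_gen;
    c->valid = 1;
}
```

The transport layer would then check `cache_is_current()` before trusting a cached verdict, and repeat the flow check whenever it is stale.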

>> The flowi structure already contains all that information for routing purposes. No reason why it could not be used to do early netfilter reduction as well. Right?
>
> netfilter is unfortunately too powerful for that. It can do many complex
> dynamic decisions per packet that are impossible to cache or predict.

Dynamic decisions are fine as long as there is a way to know in advance what flows they apply to, so unaffected flows can use the fast path.

> In theory you could try to build such a fast path for some simple filtering that implements a subset of full netfilter, but nobody has attempted it so far.

I hate to say this, but one other application besides transport output path optimization would be for (horrors!) TOE / RNIC / iSCSI drivers to check a flow for netfilter applicability and revert to the standard kernel output processing if appropriate, as well as detecting when it is necessary to inject duplicate packets back into the kernel for read-only filters to examine, say when Ethereal is active.
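The offload decision described here could look something like the C sketch below. It assumes the combined effects mask from the flow check is available to the driver; `FX_EXAMINE_ONLY`, `enum tx_path` and `choose_tx_path` are hypothetical names, not part of any existing driver interface.

```c
#include <stdint.h>

#define FX_EXAMINE_ONLY 0x1u   /* hypothetical bit: entry only reads the packet */

enum tx_path {
    TX_OFFLOAD,         /* no entry cares about this flow */
    TX_OFFLOAD_MIRROR,  /* read-only entries only: offload, but hand copies to the kernel */
    TX_KERNEL           /* an entry may modify or drop: use the standard output path */
};

/* Pick the output path for a flow from its combined effects mask. */
enum tx_path choose_tx_path(uint32_t effects)
{
    if (effects == 0)
        return TX_OFFLOAD;
    if ((effects & ~FX_EXAMINE_ONLY) == 0)
        return TX_OFFLOAD_MIRROR;
    return TX_KERNEL;
}
```

Any bit other than the read-only one sends the flow back to the kernel path, which is the conservative choice when the driver cannot reproduce the transformation itself.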

- Mark




