a couple of thoughts
23 Feb 2007 When I woke up this morning, I was wondering about negations in firewall rules. Once a trie is built that contains all network addresses and ranges, it's easy to check whether a given IP address matches any part of that trie. But how to store negated network ranges in the trie ?For example, I want to drop all packets that don't go to 10.0.0.1. Since that IP address has netmask 255.255.255.255 or CIDR mask 32, it is stored at the bottom of the tree (well, actually at the leaves, which are at the top). When a packet comes in with some IP address, how will I know that it has to be dropped EXCEPT when the destination is 10.0.0.1 ?
By the time I was in the shower, I had figured out the lazy approach to this. Instead of storing 10.0.0.1, I have to store every network range that does not contain 10.0.0.1, which would be 32 network ranges with CIDR mask going from 1 to 32. That takes up 32 times more memory than storing a single IP!
Now that can't be the most efficient approach, but at least I can fall back on that one. The space complexity would still be O(n), but instead of being 2*n, it would be 32*n in the case of all negations.
Back to the papers. I've read quite a lot of them lately, but none of them seem to touch this subject. All papers are theoretical and very few of them touch any practical issues at all. In my quest for an answer, I fell over the FIS (fat inverted segment) tree, described in the Feldmann/Muthukrishnan paper, a couple times.
I'm ashamed to say I still don't understand how it works :(
I'm currently implementing what is called a segment trie: starting at the root, nodes are stored as children (meaning there is a pointer from root to child node). Each node contains rule data, each edge contains classification data.
The FIS tree/trie works the other way around. All leave nodes have pointers to their parents and so on, all the way up to the root.
The question that pops up right away is this: given an arbitrary IP address, how do I walk through the tree ? First, I would have to locate the subnet stored in a leaf node that contains this IP address, and then I can go up the chain to see which other (larger) subnets also contain the IP.
But, locating a leaf node is the whole point of a tree/trie! So how do I do that in a FIS ? Go over all leaf nodes one by one untill I find it ? Possibly using a binary search, which would take O(log U) time, where U is the size of the universe of all possible IP addresses (2^32 for IPv4).
But how would that be different from a tree search in the first place ?
I'll have to look into this some more...
I assume the NF-HiPAC system uses a FIS to do whatever it does, but I don't want to look at that yet.