7 min read

In this article by Rafik Naccache, author of the book Clojure Data Structures and Algorithms Cookbook, we will be implementing a reusable mini-firewall using transducers.

(For more resources related to this topic, see here.)

Clojure’s unique way of describing data transformations, reminiscent of its lisp and functional heritage, has set a new standard in the art of designing highly expressive algorithms.

Clojure makes you address your problems in terms of highly declarative multi-stage transformations, and more often than not, you’ll find your self alternating map, reduce, filter, and likely operations on the powerful seq abstraction to express the solution you came with as if you were explaining it in plain English to some non IT-savvy person. This declarative way of thinking yields much expressive power, and just looking at how SQL is ruling the database industry nowadays confirms that.

But there was room for improvement in what Clojure provided for defining compoundable data transformations. First of all, the transformations you were writing were not reusable. Indeed, you’d catch yourself writing the same map operations over and over again, for instance, had you to do that same operation on different types of collections.

Of course, you could encapsulate these in functions, but that will not avoid the second problem we wanted to highlight: the intermediate seq structure generated between the transformation stages. As a matter of fact, each and every step your threading macro passes through yields a new seq function, which is not the most efficient possible processing sequence.

Lastly, the transformations you specified were closely tied to the type of input or output they operated on, and that gave the Clojure programming language designers the hassle of redefining all of these operations for any new data abstraction they’d come with while evolving the language.

For all of these reasons, transducers were introduced starting from Clojure 1.7. As the official documentation put it, they are composable algorithmic transformations. They are independent from the context of their input and output sources and specify only the essence of the transformation in terms of an individual element. Transducers compose directly, without awareness of input or creation of intermediate aggregates.

In a nutshell, a transducer is a “transformation from one reducing function to another”, that is, a means to modify how to handle the way receiving the elements of some previous step is carried out. For instance, say we define the following transducer (note that transducers are, for the most part, the usual seq manipulation functions, but with an arity decremented by one):

(map inc)

We created a transformation that increments all the elements that some previous stage handed over to this transformation. Let’s see an example of that (note the use of transduce function to apply a transducer on a seq function):

(transduce (map inc) conj (range 0 3))

(range 0 3) hands a seq function of integers over to the (map inc) transducer, which on receiving them, gets them incremented and then, as per the transduce function specification, reduces them using the conj function.

Transducers are also compoundable. You can achieve this using the traditional comp operator, used for traditional function composition, but there’s one subtlety you must be aware of. Generally, comp is applied left to right, like so:

(comp f g ) => f(g(x))

In the case of transducers, however, comp yields a transformation that looks as if composition was applied from right to left. This is not a change in the way comp works, but just how transducers are built. Composing transducers yields a function that changes the inner reducing function, as in the following example:

(comp tr1 tr2)=> (fn[] ….. tr2(tr1))

So tr1 is applied on the input before tr2, although comp is still applied from left to right!

To get a deeper understanding of transducer internals and how exactly they happen to be composed from right to left, watch Rich Hickey’s talk about the subject at https://www.youtube.com/watch?v=6mTbuzafcII.

We are going to use transducers to build a mini-firewall, implementing a two-stage data-transformation, one map and one filter stages, and are going to see how this mini-firewall can be interchangeably used on a vector or on a core.async channel.

How to do it…

  1. First of all, be sure to use Clojure 1.7. Here is the Leiningen dependency to include:
    [org.clojure/clojure "1.7.0-RC1"]
  2. Besides, we are going to use some core.async channels in this recipe. Here is the namespace declaration:
    (ns recipe24.core
      (:require [clojure.java.io :refer :all]
                [clojure.core.async :refer [chan
                                            go
                                            >!
                                            <!
                                            >!!
                                            <!!]]))
  3. Now, let’s define our first transducer, applying a source NAT on the TCP frames that pass through our firewall:
    (defn source-nat
      [pub-ip
       tcp-frame]
      ;; Change source ip into the public IP Interface
      (assoc tcp-frame :source-ip pub-ip))
    
    (def public-interface "1.2.3.4")
    ;; This is our public interface IP
    
    (def tr-source-nat (map (partial source-nat
                                     public-interface)))
    ;; A transducer transforming tcp frames in such a way
    ;; That Source IP contains the public interface's.
    
  4. After that, let’s concern ourselves with discarding or accepting TCP frames whose source, destination IPs, and ports are invalidated according to some connection whitelist:
    (defn accepted?
      [accept-rules
       tcp-frame]
      (not
       (nil?
        (some #{tcp-frame} accept-rules))))
    ;; Verify if this TCP frame exists within
    ;; the accept-rules ACL
    
    (def sample-accept-rules [{:source-ip "4.5.3.8" :dest-ip "4.4.3.5" :dest-port 80}
                              {:source-ip "4.5.3.9" :dest-ip "4.4.3.4" :dest-port 80}])
    
    (def tr-filter-rules (filter (partial accepted?
                                          sample-accept-rules)))
    ;; A transducer dropping tcp frames not present
    ;; in our sample ACL
    
  5. Now we build our mini firewall, a transducer resulting from the composition of the previous two transducers. As we’ll verify conformity of the connections before proceeding to the source IP NAT on them, we make sure that tr-filter-rules comes first in our composition:
    (def firewall(comp
                  tr-filter-rules
                  tr-source-nat))
  6. Let’s try our mini-firewall on a vector of TCP frames:
    (def sample-frames [{:source-ip "1.1.1.1" :dest-ip "2.3.2.2" :dest-port 10}
     {:source-ip "2.2.2.2" :dest-ip "4.3.4.1" :dest-port 20}
     {:source-ip "4.5.3.8" :dest-ip "4.4.3.5" :dest-port 30}
     {:source-ip "4.5.3.9" :dest-ip "4.4.3.4" :dest-port 80}])
    
    (transduce firewall
               conj
               sample-frames)
    ;; => [{:source-ip "1.2.3.4", :dest-ip "4.4.3.4", :dest-port 80}]
  7. And now we’ll show how our mini-firewall is reusable. For this, we are going to create a random-TCP frame generator that’ll throw some traffic into a core.async channel. But this would not be just any channel; it would be one with a transducer, our firewall, attached to it. We’ll see what happens when we try to read from it.
  8. First of all, let’s write the random TCP frames generator:
    (def source-hosts [ "4.5.3.8" "4.5.3.9"  "1.1.1.1" "2.2.2.2" ])
    (def dest-hosts [ "4.4.3.5" "4.4.3.4"  "2.3.2.2" "4.3.4.1" ])
    (def ports [ 80])
    
    (defn get-random-elt
      [v]
        (get v (rand-int (dec (count  v)))))
    
    (defn random-frame []
      {:source-ip (get-random-elt source-hosts)
       :dest-ip (get-random-elt dest-hosts)
       :dest-port (get-random-elt ports)})
    
  9. Now it’s time to create a core.async channel, to which we’ll attach the firewall transducer. Note that when we attach a transducer to a core.async channel, the former must be buffered:
    (def from-net (chan 10
                                    firewall))
  10. We now need to throw, from time to time, a random TCP frame inside that channel:
    (defn gen-frames!
      []
      (go
        (while true
          (let [fr (random-frame)]
            (>! from-net fr)
            (Thread/sleep 1000)))))
    
  11. We also have to build some function to print the TCP frames that were allowed to pass by the firewall:
    (defn show-fw-output!
      []
      (while true
        (println "accepted & NAT'ted : "
                 (<!! from-net))))
    
  12. Let’s see what happens when we launch the last two functions:
    (gen-frames!)
    (show-fw-output!)
    accepted & NATted :  {:source-ip 1.2.3.4, :dest-ip 4.4.3.4, :dest-port 80}
    accepted & NATted :  {:source-ip 1.2.3.4, :dest-ip 4.4.3.5, :dest-port 80}
    accepted & NATted :  {:source-ip 1.2.3.4, :dest-ip 4.4.3.5, :dest-port 80}
    accepted & NATted :  {:source-ip 1.2.3.4, :dest-ip 4.4.3.5, :dest-port 80}
    ...
    
  13. As you can see, we could only read through the channel “NAT’ted” TCP frames that have been granted access according to sample-access-rules. Actually, the transducer attached to the core.async channel transforms the data as they are flowing in, and we were able to do that by reusing our transducer specification without having to re-implement it specifically for that particular input type, that is, the channel.

Summary

In this article, we used transducers to build a mini-firewall, by implementing a two-stage data-transformation; one map and one filter stages, and saw how this mini-firewall can be interchangeably used on a vector or on a core.async channel.

Resources for Article:


Further resources on this subject:


LEAVE A REPLY

Please enter your comment!
Please enter your name here