Model based software development with Clojure

02.04.2014 Permalink

In software development, models serve basically two different purposes: one is communication among humans, the other is capturing essential information about an aspect of the system in some formal way. Here I deal with the second purpose: using a formal model to automatically create executable parts of the system.

TL;DR if you're interested in creating Clojure data in a DSL-like fashion then go to metam on GitHub to see if it is of help to you.

Formal models are expressive power

Why would we want to use formal models in software development? Current mainstream programming languages tend to be cumbersome and verbose. Expressing solutions by specifying only the absolutely necessary information is in most cases impossible. Imperative languages for example force us to introduce order and state management, which in most cases are not essential for a specification, but mere implementation details. These details blur the essence. It becomes hard to reason about the solution and to change it. The promise of modeling is that we only take care for the essential complexity. The mapping of the model to all the nasty implementation details is done by processing the model in a uniform, automated way.

The main reason for formal models apart from our program code is therefore a weakness of a language to succinctly express specifications of solutions. Lisp always allowed to raise the level of abstraction within the language, which means we develop the language towards the problem. Lisp embraces language oriented programming and model based software development in very deep way. Still, the concepts and ideas in the model-driven orbit have a value, and it helps to be aware of those even when you have the magic of Clojure at your disposal. In the forthcoming sections I'll try to show how we can technically apply these concepts using Clojure as our main tool.

A model is structured data

Many people tend to think of models as being something graphical, made up of boxes, circles and lines. But this is only a graphical representation of a model, it's not the model. So, what in essence is a model?

Don't be fooled by all those 3-letter acronyms, heavy-weight modeling tools or recent language workbenches. A model is only structured data. Its structure is described by what is called a meta model, non-modelers would simply call it a schema. And a schema is a set of constraints a.k.a predicates.

Clojure is well equipped to deal with models

Structured data? Predicates? This sounds like Clojure (or any other functional language) is fit to process models with ease. So let's take a closer look. Well-known topics in the model-driven orbit are I'll now describe if and how you can use Clojure in each of these areas.

Text based editing and representation

If we want to create a model we can just write it down in EDN. Here, I use the state machine model that M. Fowler used as introductory example in his DSL book. In Clojure I could represent the model like so:

(defn unlock-door [state] state)
(defn lock-door [state] state)
(defn unlock-panel [state] state)
(defn lock-panel [state] state)


(def secret-compartment1
  {:commands [unlock-door lock-door unlock-panel lock-panel]
   :states  {"idle"
             {:commands [unlock-door lock-panel]
              :transitions [{:doorClosed "active"}]}
             "active"
             {:commands []
              :transitions {:lightOn "waiting-for-drawer,"
                            :drawerOpenend "waiting-for-light"}}
             "waiting-for-light"
             {:commands []
              :transitions {:lightOn "unlocked-panel"}}
             "waiting-for-drawer"
             {:commands []
              :transitions {:drawerOpened "unlocked-panel"}}
             "unlocked-panel"
             {:commands [unlock-panel lock-door]
              :transitions {:panelClosed "idle"}}}
   :reset-events [:doorOpened]
   :initial-state :idle})

The data and its textual representation are the same. This may not be what we want: our textual model shall look different to ease comprehension, or perhaps we aim at a graphical representation or even a graphical editor. Let's discuss this in more detail:

With Clojure we can easily create code that allows for a nicer textual syntax, we do not even have to use macros. metam is a library that supports a certain type of textual model representation. Due to lack of editor support this produces a very basic user experience, without refactoring support, code completion or special syntax highlighting. Nevertheless reading and understanding these models works well. Here's an example using metam:
(def secret-compartment2
  (state-machine "secret-compartment2"
                :commands [unlock-door lock-door unlock-panel lock-panel]
                :states [(state "idle"
                                [unlock-door lock-panel]
                                {:doorClosed "active"})
                         (state "active"
                                {:lightOn "waiting-for-drawer"
                                 :drawerOpenend "waiting-for-light"})
                         (state "waiting-for-light"
                                {:lightOn "unlocked-panel"})
                         (state "waiting-for-drawer"
                                {:drawerOpened "unlocked-panel"})
                         (state "unlocked-panel"
                                [unlock-panel lock-door]
                                {:panelClosed "idle"})]
                :reset-events [:doorOpened]))

Meta model definition and validation

With metam the visual difference is not big, but metam also provides validation of a model. The meta model is in essence only a set of constraints expressed as predicates. Applying these predicates to data is model validation. Clojure is not restrictive. We can produce arbitrary data and label it as an instance of the meta model, but only validation reveals if this is true. Meta model definition and validation are two sides of the same coin. For the purpose of validation, metam allows you to specify predicates that are automatically attached to model elements and offers a valid? function. We use the REPL to quickly detect errors as the model grows. The meta model for the state machine and a convenience function are shown below:
(defmetamodel statemachine
  (make-hierarchy)
  {::state* {:commands [(coll fn?)]
             :transitions [(coll keyword? string?)]}
   ::state-machine {:commands [(coll fn?)]
                    :states [(coll (type-of ::state*))]
                    :reset-events [(coll keyword?)]}}
  #'no-defaults)

(defn state
  ([name transitions]
     (state name [] transitions))
  ([name commands transitions]
     (state* name :commands commands :transitions transitions)))
If we now add a bit of macro magic we can include additional validation and transformation of the resulting model. In this case we want to be sure that only existing states are referenced, and that the model keeps the states as a map, not a vector:
(defmacro defstatemachine
  [sym & args]
  `(def ~sym (let [sm# (state-machine ~(str sym) ~@args)
                   drs# (dangling-references sm#)]
               (if (seq drs#)
                 (throw (IllegalArgumentException.
                         (str "The following states are referenced, but not defined " drs#))))
               (transform-states sm#))))

(defn dangling-references
  "Returns the set of references to states that are not defined."
  [sm]
  (let [defd-state-names (->> sm :states (map :name) set)
        refd-state-names (->> sm :states (mapcat :transitions) (map second) set)]
    (difference refd-state-names defd-state-names)))


(defn transform-states
  "Turns the vector of states into a map with the state name as key."
  [sm]
  (let [states-map (->> sm :states (map (juxt :name identity)) (into {}))]
    (-> sm
        (assoc :states states-map)
        (assoc :initial-state (-> sm :states first)))))
With these tools in place our textual representation is finally:
(defstatemachine secret-compartment3
  :commands [unlock-door lock-door unlock-panel lock-panel]
  :states [(state "idle"
                  [unlock-door lock-panel]
                  {:doorClosed "active"})
           (state "active"
                  {:lightOn "waiting-for-drawer"
                   :drawerOpenend "waiting-for-light"})
           (state "waiting-for-light"
                  {:lightOn "unlocked-panel"})
           (state "waiting-for-drawer"
                  {:drawerOpened "unlocked-panel"})
           (state "unlocked-panel"
                  [unlock-panel lock-door]
                  {:panelClosed "idle"})]
  :reset-events [:doorOpened])	
Of course, the model is still Clojure code because it is based on an internal DSL. If we wanted to escape from the syntax of the host language we'd have to go for an external DSL, which requires us to provide a parser. Although the parser is not hard, we should not do this frivolously. We do not only need additional tools, but we loose the power to combine our mini-language with features of the host language. What if we wanted to automatically create models from arbitrary data? Or if we wanted to use expressions inside our model? This is a dangerous path, and we have to ask ourselves if the syntactical improvements justify the extra effort or loss of flexibility. However, it is possible in Clojure, so let's take the state machine representation from the DSL book:
events
  doorClosed
  drawerOpened
  lightOn   
  reset doorOpened
  panelClosed
end

commands
  unlockPanel
  lockPanel
  lockDoor
  unlockDoor
end

state idle
  actions { unlockDoor lockPanel }
  doorClosed => active
end

state active
  drawerOpened => waitingForLight
  lightOn    => waitingForDraw
end

state waitingForLight
  lightOn => unlockedPanel
end

state waitingForDraw
  drawerOpened => unlockedPanel
end

state unlockedPanel
  actions { unlockPanel lockDoor }
  panelClosed => idle
end
... and create a corresponding parser with instaparse:
(def statemachine-parser
  (insta/parser "
MACHINE = <s>? EVENTS <s> COMMANDS <s> STATES <s>?
STATES = (STATE <s>)* STATE
EVENTS = <'events'> <s> (EVENT <s>)+ <'end'>
COMMANDS = <'commands'> <s> (COMMAND <s>)+ <'end'>
STATE = <'state'> <s> token (<s> ACTIONS)? <s> TRANSITIONS <'end'>
ACTIONS = <'actions'> <s>? <'{'> <s>? (COMMAND <s>)* <'}'>
TRANSITIONS = (TRANSITION <s>)*
TRANSITION = token <s> <'=>'> <s> token
EVENT = 'reset'? token
COMMAND = token
<token> = #'[a-zA-Z]+'
<s> = #'\\s+'
"))
Application of this parser yields the model in vector-based Hiccup-style, which means if we wanted to obtain the model in the same map-based representation as the Clojure based models we would need an additional transformation. Please note that the parser is only a very first step when dealing with external DSLs. If we really wanted an external DSL with state-of-the-art comfort then we should take a closer look at specialised language workbenches like Xtext or MPS. These tools provide comprehensive editor support, full integration in an IDE and give us powerful means for code generation or interpretation. It's a different level of language engineering, requiring dedicated tooling and knowledge.

Graphical representation

The strength of graphical representation is that humans can comprehend diagrams quicker and spot connections or patterns that no one would see in a textual representation. The state machine is a nice example:

To create a graphical representation like the one above we create a generator (or interpreter) that produces input for a renderer. Examples for a renderer are Graphviz, PlantUML, a GUI Toolkit like JavaFX or a web browser. Here's an example for a renderer of the state machine model that creates input for PlantUMLs SourceStringReader:
(defn statename
  [s]
  (let [n (if (string? s) s (:name s))]
    (.replaceAll n "-" "")))

(defn umlstr
  [sm]
  (str "@startuml\n"
       "[*] -> " (-> sm :initial-state :name) "\n"
       (apply str (for [from-state (-> sm :states vals),
                        [event to-state] (:transitions from-state)]
                    (str (statename from-state)
                         " --> "
                         (statename to-state)
                         (str  " : " (name event))
                         "\n")))
       (apply str (for [state (-> sm :states vals),
                        command (:commands state)]
                    (str (statename state)
                         ":"
                         (-> command class str (.split "\\$") (nth 1)) "\n")))
       "\n@enduml"))

(defn generate-png
  [sm pngfile]
  (with-open [os (io/output-stream pngfile)]
    (-> sm umlstr SourceStringReader. (.generateImage os))))
We could trigger instant re-creation by using a watch on a var and/or repeated polling. Here a watch is sufficient because my default picture viewer refreshes itself automatically when the file changes:
(add-watch #'secret-compartment3
           :diagram
           (fn [k r o n]
             (generate-png n (io/file "/home/riemensc/test.png"))))
Whenever we change the model the viewer will show an updated diagram, which gives us both: quick text editing and a nice picture reflecting our model changes within a blink of an eye.

Graphical editors are expensive

If a graphical editor is a must to satisfy the needs of the people who edit the model, then chances are that bare Clojure is the wrong tool for the job. At the moment, I am not aware of a customizable editor that produces Clojure data or a similar format to be processed by Clojure. Of course we could leave the Clojure ecosystem and use external tools or frameworks and finally feed the model for further processing into a Clojure function, but this can certainly be done with any programming language. Anyway we must consider carefully, if a graphical editor is what we want. Building a usable graphical editor takes quite some effort, and editing a graphical model using point-and-click is likely to be significantly slower than editing text, once you're accustomed to the grammar. Still, you can generate a graphical representation to support intuitive understanding, which gives us the best of both worlds.

Generation, Interpretation and Transformation

Artifact generation is employed at compile-time and the resulting text is usually written to external files, which must be considered in the downstream build process. Interpretation on the other hand takes place at runtime and requires the model at runtime. It usually causes side-effects to runtime state. Transformation can take place any time, it is often used to supply derived model elements and/or ease further processing. It yields another model, in other words: only data.

Clojure gives us multi-methods which are a perfect fit for systematically processing all model elements in a polymorphic fashion. To give you an example I switch to another type of model, taken from the instaweb sample. Suppose we wanted to build many HTML forms with Clojure. Hiccup itself gives us a Clojure data representation for HTML, but this is certainly too low level. For a form like the following:

... we'd like to write down an expression representing the model like this:
(defn address-panel
  []
  (panel "Address" :elements
         [(textfield "Name" :required true)
          (textfield "Street")
          (textfield "Zipcode")
          (textfield "City")
          (checkbox "Faraway" :label "Is it far away?")
          (button "OK") (button "Back")]))
      
To be able to do so, we use metam and create a meta model:
(declare defaults)

(defn binding?
  [x]
  (or (nil? x) (keyword? x) (vector? x)))

(defmetamodel forml
  (-> (make-hierarchy)
      (derive ::button ::widget)
      (derive ::checkbox ::widget)
      (derive ::label ::widget)
      (derive ::panel ::widget)
      (derive ::table ::widget)
      (derive ::textfield ::widget))
  {::button {:text [string?]}
   ::checkbox {:bind [binding?]
               :label [string?]}
   ::column {:bind [binding?]
             :text [string?]}
   ::label {:class [string?]
            :text [string?]
            :for [string?]}
   ::panel {:elements [(coll (type-of ::widget))]
            :title [string?]}
   ::table {:bind [binding?]
            :columns [(coll (type-of ::column))]
            :title [string?]}
   ::textfield {:bind [binding?]
                :label [string?]
                :required [boolean?]
                :value []}}
  #'defaults)

(defn keyword-from-name
  [spec]
  (-> spec :name lower-case keyword))


(defdefaults defaults forml
  {:default nil
   [::checkbox :bind]         (keyword-from-name spec)
   [::checkbox :label]        (:name spec)
   [::column :bind ]          (keyword-from-name spec)
   [::column :text]           (:name spec)
   [::panel :title]           (:name spec)
   [::table :bind]            (keyword-from-name spec)
   [::table :title]           (:name spec)
   [::textfield :bind]        (keyword-from-name spec)
   [::textfield :label]       (:name spec)
   [::textfield :required]    false
   [::widget :text]           (:name spec)})
And in order to produce HTML we create a transformation that generates Hiccup vectors. The transformation is carried out by a multi-method that handles different types of model elements differently.
(defmulti generate*
  (fn [parentid data el]
    (metatype el)))

(defmethod generate* :default
  [parentid data el]
  "")

(defmethod generate* ::f/button
  [parentid data el]
  (let [id (make-id parentid el)]
    [:input {:id id :type "submit" :value (:text el)}]))

(defmethod generate* ::f/checkbox
  [parentid data el]
  (let [id (make-id parentid el)]
    (with-label el id
      [:input {:id id :type "checkbox" :checked (value data el)}])))

(defmethod generate* ::f/label
  [parentid data el]
  (let [id (make-id parentid el)]
    [:label (attrs el id :for :class) (:text el)]))

(defmethod generate* ::f/panel
  [parentid data el]
  (let [id (make-id parentid el)]
    [:div
     {:id id :class "form"}
     [:p {:class "heading"} (:title el)]
     (->> el
          :elements
          (map (partial generate* id data)))]))

(defn column-header
  [col]
  (vector :th (:text col)))

(defn table-row
  [cols item]
  (vector :tr (map #(vector :td (value item %)) cols)))

(defmethod generate* ::f/table
  [parentid data el]
  (let [cols (:columns el)
        items (value data el)]
    [:div {:class "table"}
     [:p {:class "heading"} (:title el)]
     [:table
      [:tr (map column-header cols)]
      (map (partial table-row cols) items)]]))

(defmethod generate* ::f/textfield
  [parentid data el]
  (let [id (make-id parentid el)]
    (with-label el id
      [:input {:id id :type "text" :value (value data el)}])))

(defn generate
  ([el]
     (generate* "" {} el))
  ([data el]
     (generate* "" data el)))
      
Transformation, generation or interpretation of models boil down to the same approach: a multi-method dispatching over the type of model element. The only difference between the three is the effect that we are interested in: a different model representation, external text files or side-effects at runtime.

Beyond models

So far, I restricted myself to how data (a.k.a model) can be edited and used to derive an executable part of the system. But a Lisp can do more, often without us being aware that we just switched to the meta level.

In Lisp, code is data. Because models are data the code itself forms a model. This model is available at compile-time to macros, which are regular code that transforms code, just like functions transform data structures. In other words: we can extend the language within the language. Or we can apply changes to code to yield a different behaviour. This sounds great, but don't rush into it before thinking twice. After all, the well-known first rule of "Macro Club" is: Don't write macros!

Indeed macros are not necessary in most cases. Consequent application of DRY often results in Clojure code that only contains the essential information for specifying the solution. Thinking about modeling in those cases is superfluous, the code IS the model. A macro might only provide a tiny last bit to make it perfect. Compojure is a good example. A defroute expression reads like a configuration in a specialized DSL, but if you take a look at what the macros are doing then you would agree that the referenced library functions already form a nice API for defining routes.

Of course there are other areas where macros are not just a means to add nice-to-have syntactic sugar, but are essential to enable a certain style of programming. As a very simple example the threading macros -> and ->> come to my mind. They enable us to write long, yet comprehensible access pathes or sequences of transformations in the actual order of the execution. Compare these two expressions:

(vec (map symbol (map name (keys (:fields metadata)))))

(->> metadata :fields keys (map name) (map symbol) vec)

The first shows the action in reverted order, and it is deeply nested. Understanding (and changing) the second expression is considerably easier. Without the macros we would likely not create long transformation sequences, because they are hard to understand. Therefore I consider the macros to be an enabler for this code style.

On the other end of the spectrum I see core.async's go macro. If you haven't done yet, please read the blog post about the State Machines of core.async. The expressions between channel accesses are assigned to states of a state machine in order to execute these pieces asynchronously. This enables CSP-style programs which follow a very own paradigm to deal with concurrency. To create a similar effect with todays tools in the model-driven orbit requires considerable effort, and the result would certainly not compose seamlessly with our host language.

Conclusion

I hope that I gave enough evidence that models in Clojure are as simple as data, because they ARE data. For the development of enterprise systems I consider language internal textual models as the most important aspect, perhaps supplied with an additional transformation / generator to produce a graphical representation. For this, Clojure already provides everything we need.

If we need strong support for external DSLs or graphical editors, Clojure is not as well prepared as specialised tools or frameworks, which is no surprise. We must recognize that switching to a non-Lisp language (either textual or graphical) will take us far away from true simplicity, because our development process will rely on complex tooling. We sacrifice simplicity for convenience.

Clojure's macro system can do things with ease and embedded in Clojure for which we'll have to go a long way if we picked an external tooling. This is another strong reason to stay within Clojure whenever we must move to the meta level. Modeling will never play a special role in Lisp-based software development, because it's already a fundamental concept of the language.