software, programming, functional
home mail me! RSS (2.0) feed

Algebraic + abstract = true?

I had this idea a few weeks ago, of merging the syntactical features of algebraic data types with the implementation freedom of abstract data types. Coming to think about it, I had this idea in 1988, when I created an equational “mathematics” system on top of Prolog.

Most people having dealt with declarative languages have encountered pattern matching, and, specifically, pattern matching against constructors for algebraic data types. The canonical example – to which even Erlang developer have access, being in a land without user-defined types – is lists, which have the following two constructors:

The first one creates a list out of a head and another list and second one creates the empty list. The syntax differs a bit between languages, and most languages have the syntactic sugar [X1, X2, ..., Xn] for deeper terms, but you get the idea.

This list can then be used in pattern matching, such as (using Haskell syntax):

  1. length nil = 0
  2. length [_|tail] = length tail + 1

The beauty of algebraic data types is that you can get this equational style of programming even for your own types. The ubiquitous example is a binary ordered tree, where I create a polymorphic type, i.e., expecting a concrete type for its labels:

  1. data Tree a = Leaf a | Branch a (Tree a) (Tree a)

You can then use it to check whether a value is contained in the tree, as:

  1. contains x (Leaf y) = x==y
  2. contains x (Branch y left right)
  3.  | x==y = True
  4.  | x<y   = contains x left
  5.  | x>y   = contains x right

Note 1: Here you also get a glimpse into the guard system of Haskell. You are welcome!
Note 2: I currently abstract away the requirements of actually implementing < which is done via classes in Haskell’s generic type system. For the curious, one simply inserts an Ord a => before the type definition.

This is it. That is the type. Algebraic data types is the quintessence of WYSIWYG in the world of code.

But what happens if we are not satisfied with this explicit representation, but wants to abstract it away, and just provide operations to manipulate trees? Well, we have to define an abstract data type, which in Haskell is achieved by hiding the implementation behind a module and exporting just those abstract operations, like:

  1. module Tree(makeLeaf, makeBranch, getValue,
  2.  getLeft, getRight, isLeaf)
  3. <i>... some really cool implementation of trees</i>

You might ask yourself why on earth we would want a different implementation of trees than that algebraic one anyway. One answer is that fully balanced trees with low update rates are quite susceptible to array implementations. I.e., fully balanced trees happen to be isomorphic to an ordered array. There is an array type in the Prelude (the standard library for Haskell) but in order not to get carried away into the more esoteric parts of that language, we will use simple lists. So, here is an implementation of that abstract tree:

  1. data Tree a = Tree { startR::Int, endR::Int, impl::[a] }
  2. -- first we have two helper functions
  3. midIndex tree = (endR tree + startR tree) / 2
  4. makeList (Tree start end list) = (take (end - start) . drop start) list
  6. -- As you see, building is expensive, accessing cheap
  7. makeLeaf x = Tree 0 1 [x]
  8. makeBranch x left right = Tree 0 (length app) app where
  9. &nbsp;app = makeList left ++ [x] ++ makeList right
  10. isLeaf [x] = True
  11. isLeaf _   = False
  12. getValue tree = tree !! midIndex tree
  13. getLeft (Tree start end list) = Tree start (midIndex tree) list
  14. getRight (Tree start end list) = Tree (midIndex tree + 1) end list

We now have this abstract tree (via that Tree module and its list implementation), which we can use as:

  1. contains x tree
  2.  |isLeaf tree = value == x
  3.  |x == value = True
  4.  |x < value = contains x (getLeft tree)
  5.  |x > value = contains x (getRight tree)
  6.  where value = getValue tree

But we cannot use the nice equational appearance with pattern matching, as we did with the algebraic data type! This is what I propose to change. I want to have the pattern matching beauty of algebraic data types while being free to implement the type however I want.

What is needed is a mapping from the various parts of a pattern (or algebraic constructor) to a function, free to implement in whichever way, along with predicates associated with the constructors themselves.

A simple incarnation of that idea follows, in a Haskellish notation:

  1. dataclass AATree (tree a) =
  2.     Leaf (getValue@a) by isLeaf |
  3.     Branches (getValue@a) (getLeft@(tree a))
  4.                          (getRight@(tree a))                       
  5. instance AATree (Tree start end list) where
  6. <i>... definitions just as for the ADT above ...</i>

That way you can either use the abstract interface or the algebraic. This is beautiful, if I may say so myself :-)

Note: In order to make this abstract data type an object-oriented type, we would have to define the abstract interface in a class structure. That is at least as OOish you can get in Haskell.

After writing this, I find out that one the Big Boys in functional programming, Philip Wadler, has come up with a very similar merging idea almost 20 years ago, two years before I even touched upon this idea myself. He calls it views. This does not make the idea less viable, though.

I will present a more detailed exposition on how to implement it – as a pre-processor – to Haskell and perhaps Scala, later. And, bring some new life into the views of Wadler’s.

davber does IT » Algebraic + abstract = true! (at least in the key of F#) said,

December 10, 2007 @ 2:37 am

[...] I earlier wrote about some ideas I had to combine the encapsulation of abstract data types with the swift case analysis combined with decomposition of algebraic types. Why not have both worlds? [...]

RSS feed for comments on this post · TrackBack URI

Leave a Comment

You must be logged in to post a comment.