Catamorphism

The concept of a catamorphism is grounded in category theory, and has been applied to functional programming. It denotes the unique homomorphism for an initial algebra. The term comes from Greek κατα- (downwards, according to) + morphism, the latter from Greek μορφή (form, shape). The dual concept is that of anamorphism.

Catamorphisms in functional programming

In functional programming, a catamorphism is a generalization of the folds on lists known from functional programming to arbitrary abstract data types that can be described as initial algebras.

One of the first publications to introduce the notion of a catamorphism in the context of programming was the paper "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire", by Erik Meijer et al.[1], which was in the context of the Squiggol formalism.

The dual concept is of anamorphisms, a generalisation of "unfolds".

Example

The following example is in Haskell-like pseudocode, assuming a class Group of types supporting an "addition" operation.

data Tree a = Leaf a
            | Branch (Tree a) (Tree a)

type TreeAlgebra a r = (a -> r, r -> r -> r)

foldTree ∷ TreeAlgebra a r -> Tree a -> r
foldTree (f, g) (Leaf x)     = f x
foldTree (f, g) (Branch l r) = g (foldTree (f, g) l) (foldTree (f, g) r)

treeDepth ∷ TreeAlgebra a Integer
treeDepth = ( λx. 1, λl r. 1 + max l r )

sumTree ∷ (Group a) => TreeAlgebra a a
sumTree = ( λx. x, λl r. l+r )

Here foldTree (f, g) is a catamorphism for the Tree a datatype; treeDepth and sumTree are called algebras.

Catamorphisms in category theory

Category theory provides the necessary concepts to give a generic definition that accounts for all initial data types (using an identification of functions in functional programming with the morphisms of the category Set or some related concrete category). This was done by Grant Malcolm in his Ph.D. Thesis Algebraic Types and Program Transformation (Univ. Groningen, 1990)[2] and the article Data structures and program transformation[3].

Specialized to the above example, the formal definition of a catamorphism is as follows: For fixed type a, the pair (r, [f1,f2]) is an F-algebra, where F is the functor sending r to a + r × r. The pair (Tree a, [Leaf,Branch]) is also an F-algebra, and specifically the initial F-algebra. This means that there is a unique homomorphism to any other F-algebra. That unique homomorphism is called a catamorphism.

In the code example above, foldTree is the function that constructs these catamorphisms.

The general case

If (A, in) is the initial F-algebra for some endofunctor F (so in is a morphism from FA to A), and (X, f) is an F-algebra, there is a unique homomorphism from (A, in) to (X, f), which may be denoted cata f (leaving the "carrier" X implicit). (Here "cata" is a generic name for what was called "foldTree" for the specific case from the example.)

Properties

Let F be given and fixed. Using the definition of homomorphism, the uniqueness property can be expressed as: for all F-algebras (X, f), and all h from A to X, the following two statements are equivalent:

$h=\mathrm {cata} \ f$
$h\circ in=f\circ Fh$

Notation

Another notation for cata f found in the literature is $(\!|f|\!)$ . The open brackets used are known as banana brackets, after which catamorphisms are sometimes referred to as bananas.

External links

Erik Meijer, Maarten Fokkinga, and Ross Paterson. Functional Programming with Bananas, Lenses, Envelopes, and Barbed Wire [4], contains additional definitions and examples
Catamorphisms in Haskell