Representing complex data
We’ve seen:
base types:
Bool
,Int
,Integer
,Float
some ways to build up types: given types
T1, T2
- functions:
T1 -> T2
- tuples:
(T1, T2)
- lists:
[T1]
- functions:
Algebraic Data Types: a single, powerful technique for building up types to represent complex data
- lets you define your own data types
- subsumes tuples and lists!
Algebraic Data Types
Type Synonyms: naming existing types
Product Types: bundling things together
Sum Types: types with multiple variants
Recursive Types: types that contain themselves
Polymorphic Datatypes: datatypes with parameters
Type Synonyms
Synonyms are just names (“aliases”) for existing types
- think
typedef
inC
A type to represent circles
A tuple (x, y, r)
is a circle with center at (x, y)
and radius r
type CircleT = (Double, Double, Double)
A type to represent cuboids
A tuple (length, depth, height)
is a cuboid
type CuboidT = (Double, Double, Double)
Using Type Synonyms
We can now use synonyms by creating values of the given types
circ0 :: CircleT
= (5, 1, 100) -- circle at (5,1) with radius 100
circ0
cub0 :: CuboidT
= (10, 20, 30) -- cuboid with l=10, d=20, h=30 cub0
And we can write functions over synonyms too
area :: CircleT -> Double
= pi * r * r
area (x, y, r)
volume :: CuboidT -> Double
= l * d * h volume (l, d, h)
We should get this behavior
>>> area circ0
31415.926535897932
>>> volume cub0
6000
QUIZ
Suppose we have the definitions
type CircleT = (Double, Double, Double)
type CuboidT = (Double, Double, Double)
circ0 :: CircleT
= (5, 1, 100) -- circle at (5,1) with radius 100
circ0
cub0 :: CuboidT
= (10, 20, 30) -- cuboid with length=10, depth=20, height=30
cub0
area :: CircleT -> Double
= pi * r * r
area (x, y, r)
volume :: CuboidT -> Double
= l * d * h volume (l, d, h)
What is the result of
>>> volume circ0
A. 500
B. Type error
Answer: A
Beware!
Type Synonyms
Do not create new types
Just name existing types
And hence, synonyms
- Do not prevent confusing different values
Algebraic Data Types
Type Synonyms: naming existing types [done]
Product Types: bundling things together
Sum Types: types with multiple variants
Recursive Types: types that contain themselves
Polymorphic Datatypes: datatypes with parameters
Creating New Data Types
We can avoid mixing up values by creating new data
types
-- | A new type `Circle` with constructor `MkCircle`,
-- which takes three arguments of type `Double`
data Circle = MkCircle Double Double Double
-- | A new type `Cuboid` with constructor `MkCuboid`
-- which takes three arguments of type `Double`
data Cuboid = MkCuboid Double Double Double
We use constructors to build values of the new type:
circ1 :: Circle
= MkCircle 5 1 100 -- circle at (5,1) w/ radius 100
circ1
cub1 :: Cuboid
= MkCuboid 10 20 30 -- cuboid w/ len=10, dep=20, ht=30 cub1
QUIZ
Suppose we create a new type with a data
definition
-- | A new type `Circle` with constructor `MkCircle`
data Circle = MkCircle Double Double Double
What is the type of the MkCircle
constructor?
A. MkCircle :: Circle
B. MkCircle :: Double -> Circle
C. MkCircle :: Double -> Double -> Circle
D. MkCircle :: Double -> Double -> Double -> Circle
E. MkCircle :: (Double, Double, Double) -> Circle
Answer: D
Aside: Record syntax
Haskell’s record syntax allows you to name the constructor parameters:
Instead of
data Circle = MkCircle Double Double Double
you can write:
data Circle = MkCircle { center_x :: Double, center_y :: Double, radius :: Double }
then you can do:
= MkCircle { center_x = 5, center_y = 1, radius = 100 } circ1 -- same as: circ1 = MkCircle 5 1 100 = radius circ1 -- use field name as a function r
QUIZ
Suppose we have the definitions
type CuboidT = (Double, Double, Double)
data Cuboid = MkCuboid Double Double Double
volume :: CuboidT -> Double
= l * d * h volume (l, d, h)
What is the result of
>>> volume (MkCuboid 10 20 30)
A. 6000
B. Type error
Answer: B
Deconstructing Data
Constructors let us build values of new type … but how to use those values?
How can we implement a function
volume :: Cuboid -> Double
= ??? volume c
such that
>>> volume (MkCuboid 10 20 30)
6000
Deconstructing Data by Pattern Matching
Haskell lets us deconstruct data via pattern-matching
volume :: Cuboid -> Double
MkCuboid l d h) = l * d * h
volume (
area :: Circle -> Double
MkCircle x y r) = pi * r * r area (
same as for tuples, just using different constructors!
Algebraic Data Types
Type Synonyms: naming existing types [done]
Product Types: bundling things together [done]
Sum Types: types with multiple variants
Recursive Types: types that contain themselves
Polymorphic Datatypes: datatypes with parameters
Defining new types with data
prevents us from mixing up values:
area :: Circle -> Double
MkCircle x y r) = pi * r * r
area (
>>> area (MkCuboid 10 20 30) -- TYPE ERROR!
But … what if we need to mix up values?
A List of Shapes
Suppose I need to represent a list of shapes
- Some
Circle
s - Some
Cuboid
s
What is the problem with shapes
as defined below?
= [circ1, cub1] shapes
Where we have defined
circ1 :: Circle
= MkCircle 5 1 100 -- circle at (5,1) with radius 100
circ1
cub1 :: Cuboid
= MkCuboid 10 20 30 -- cuboid with length=10, depth=20, height=30 cub1
This does not type-check!
All list elements must have the same type!
Solution?
Sum types
Lets create a single type that can represent both kinds of shapes!
data Shape
= MkCircle Double Double Double -- Circle at x, y with radius r
| MkCuboid Double Double Double -- Cuboid with length, depth, height
A sum type is a data type with multiple constructors
- aka enum in Rust
- aka (tagged) union in C
QUIZ
With the definition:
data Shape
= MkCircle Double Double Double -- Circle at x, y with radius r
| MkCuboid Double Double Double -- Cuboid with length, depth, height
What is the type of MkCircle 5 1 100
?
A. Shape
B. Circle
C. MkCircle
D. (Double, Double, Double)
Answer: A
List of Shapes: Take 2
Now we can define
circ2 :: Shape
= MkCircle 5 1 100 -- circle at (5,1) with radius 100
circ2
cub2 :: Shape
= MkCuboid 10 20 30 -- cuboid with length=10, depth=20, height=30 cub2
and then define collections of Shape
s
shapes :: [Shape]
= [circ2, cub2] shapes
2D Shapes
Lets define a type for 2D shapes
data Shape2D
= MkRect Double Double -- rectangle with width and height
| MkCirc Double -- circle with radius
| MkPoly [Vertex] -- polygon with at least three vertices
type Vertex = (Double, Double)
Different constructors can have different types!
Tagged Boxes
A values of type Shape2D
is either two doubles tagged with Rectangle
>>> :type (Rectangle 4.5 1.2)
Rectangle 4.5 1.2) :: Shape (
or a single double tagged with Circle
>>> :type (Circle 3.2)
Circle 3.2) :: Shape (
or a list of pairs of d tagged with Polygon
> :type (Polygon [(1, 1), (2, 2), (3, 3)])
ghciPolygon [(1, 1), (2, 2), (3, 3)]) :: Shape (
How do we get the value out of the box?
EXERCISE
Write a function to compute the perimeter
of a Shape2D
data Shape2D = MkRect Double Double | MkCirc Double | MkPoly [Vertex]
type Vertex = (Double, Double)
perimeter :: Shape2D -> Double
= ??? perimeter s
HINT
You may want to use this helper that computes the distance between two points
distance :: Vertex -> Vertex -> Double
= sqrt ((x2 - x1) ** 2 + (y2 - y1) ** 2) distance (x1, y1) (x2, y2)
Algebraic Data Types
Type Synonyms: naming existing types [done]
Product Types: bundling things together [done]
Sum Types: types with multiple variants [done]
Recursive Types: types that contain themselves
Polymorphic Datatypes: datatypes with parameters
Recursive types
Have we seen an example already where
- a value of type
T
- contains another value of the same type
T
inside?
Yes, lists!
Lists aren’t built-in! They are an algebraic data type like any other:
data IntList
= INil -- empty list
-- ^ base constructor
| ICons Int IntList -- list with Int head and IntList tail
-- ^ inductive constructor (contains another IntList)
QUIZ
data IntList
= INil -- empty list
| ICons Int IntList -- list with Int head and IntList tail
What is the type of ICons
?
A. IntList
B. Int -> IntList
C. (Int, IntList)
D. Int -> IntList -> IntList
E. IntList -> IntList
Answer: D
EXERCISE
With the definition:
data IntList = INil | ICons Int IntList
Write down an IntList
representation of the list [1,2,3]
list_1_2_3 :: IntList
= ??? list_1_2_3
Recursion means boxes within boxes
Functions on recursive types
Recursive code mirrors recursive data
data IntList
= INil -- base constructor
| ICons Int IntList -- inductive constructor
Step 1: add a pattern per constructor
length :: IntList -> Int
length INil = ... -- base case
length (ICons x xs) = ... -- recursive case
Step 2: fill in base case:
length :: IntList -> Int
length INil = 0
length (ICons x xs) = ...
Step 3: fill in inductive case using a recursive call:
length :: IntList -> Int
length INil = 0
length (ICons x xs) = 1 + length xs
-- ^ recursive call on the "nested box"
EXERCISE: Append
Write an append
function that appends two IntList
s:
data IntList = INil | ICons Int IntList
append :: IntList -> IntList -> IntList
= ??? append
so that we can call:
-- [1,2] ++ [3,4] ==> [1,2,3,4]
>>> append (ICons 1 (ICons 2 INil)) (ICons 3 (ICons 4 INil))
ICons 1 (ICons 2 (ICons 3 (ICons 4 INil)))) (
append :: List -> List -> List
Nil ys = ys
append Cons x xs) ys = Cons x (append xs ys) append (
Lessons Learned
- Recursive code mirrors recursive data
- With multiple arguments of a recursive type, which one should I recurse on?
- The name of the game is to pick the right inductive strategy!
Trees
Lists are unary trees:
How do we represent binary trees?
QUIZ: Binary trees
How do you represent this binary tree as a recursive datatype?
(A) data ITree = ILeaf | INode Int ITree
(B) data ITree = ILeaf | INode ITree ITree
(C) data ITree = ILeaf | INode Int ITree ITree
(D) data ITree = ILeaf Int | INode ITree ITree
(E) data ITree = ILeaf Int | INode Int ITree ITree
Answer: C
-- | Binary tree of integers
data ITree = ILeaf | INode Int ITree ITree
= INode 1
t1234 INode 2 (INode 3 ILeaf ILeaf) ILeaf)
(INode 4 ILeaf ILeaf) (
Functions on trees
height :: Tree -> Int
Leaf = 0
height Node _ l r) = 1 + max (height l) (height r) height (
Example: Calculator
I want to implement an arithmetic calculator to evaluate expressions like:
4.0 + 2.9
3.78 – 5.92
(4.0 + 2.9) * (3.78 - 5.92)
What is a Haskell datatype to represent these expressions?
data Expr = ???
data Expr = Num Float
| Add Expr Expr
| Sub Expr Expr
| Mul Expr Expr
EXERCISE: Eval
With expressions defined as follows:
data Expr = Num Float
| Add Expr Expr
| Sub Expr Expr
| Mul Expr Expr
write a function to evaluate an expression
eval :: Expr -> Float
Num f) = f
eval (Add e1 e2) = eval e1 + eval e2
eval (Sub e1 e2) = eval e1 - eval e2
eval (Mul e1 e2) = eval e1 * eval e2 eval (
Algebraic Data Types
Type Synonyms: naming existing types [done]
Product Types: bundling things together [done]
Sum Types: types with multiple variants [done]
Recursive Types: types that contain themselves [done]
Polymorphic Datatypes: datatypes with parameters
Parameterized Types
Our IntList
datatype can only store Int
s :-(
What if we want to store Char
s or Double
s?
data CharList
= CNil
| CCons Char CharList
data DoubleList
= DNil
| DCons Char DoubleList
Don’t Repeat Yourself!
Don’t repeat definitions - Instead reuse the list structure across all types!
Find abstract data patterns by
- identifying the different parts and
- refactor those into parameters
A Refactored List
Here are the three types: What is common? What is different?
data IList = INil | ICons Int IList
data CList = CNil | CCons Char CList
data DList = DNil | DCons Double DList
Common: Nil
/Cons
structure
Different: type of each “head” element
Refactored using Type Parameter
-- | A list of elements of type `a`
data List a = Nil | Cons a (List a)
We can recover original types as instances of List
:
iList :: List Int -- list where 'a' = 'Int'
= Cons 1 (Cons 2 (Cons 3 Nil))
iList
cList :: List Char -- list where 'a' = 'Char'
= Cons 'a' (Cons 'b' (Cons 'c' Nil))
cList
dList :: List Double -- list where 'a' = 'Double'
= Cons 1.1 (Cons 2.2 (Cons 3.3 Nil)) dList
QUIZ
data List a = Nil | Cons a (List a)
What is the type of Cons
?
A. Int -> List -> List
B. Int -> List Int -> List Int
C. Char -> List Char -> List Char
D. a -> List -> List
E. a -> List a -> List a
Answer: E
Type Constructors
Note: List
is not a type!
List a
, List Int
, … are types
a
is a type parameter
Then what is List
?
A type-constructor that
- takes as input a type
a
- returns as output the type
List a
Polymorphic Data has Polymorphic Constructors
Look at the types of the constructors
>>> :type Nil
Nil :: List a
That is, Nil
is a value of any kind of list, and
>>> :type Cons
Cons :: a -> List a -> List a
Cons
takes an a
and a List a
and returns a List a
.
Polymorphic Functions
Let us refactor the append
function to work on List
s:
data List a = Nil | Cons a (List a)
append :: ??? -- What is the type of `append`?
= ??? append
Polymorphic Functions over Polymorphic Data
The append
function on List
s is polymorphic:
append :: List a -> List a -> List a
Nil ys = ys
append Cons x xs) ys = Cons x (append xs ys) append (
append
doesn’t care about the actual values in the list
- only manipulates the structure of the list
Hence append :: List a -> List a -> List a
- we can call
append
on lists of any kind - as long as both lists are of the same kind
>>> append (Cons 1 (Cons 2 Nil)) (Cons 3 Nil) -- a = Int
Cons 1 (Cons 2 (Cons 3 Nil))
>>> append (Cons 'a' (Cons 'b' Nil)) (Cons 'c' Nil) -- a = Char
Cons 'a' (Cons 'b' (Cons 'c' Nil))
QUIZ
With the type of append
defined as
append :: List a -> List a -> List a
what would be result of
Cons 'a' (Cons 'b' Nil)) (Cons True Nil) append (
A. Cons 'a' (Cons 'b' (Cons True Nil))
B. Cons 'a' (Cons 'b' Nil)
C. Type error
Answer: C
Built-in Lists?
This is exactly how Haskell’s “built-in” lists are defined:
data [a] = [] | (:) a [a]
data List a = Nil | Cons a (List a)
Nil
is called[]
Cons
is called:
Many list manipulating functions e.g. in Data.List
are polymorphic
- Can be reused across all kinds of lists.
(++) :: [a] -> [a] -> [a]
head :: [a] -> a
tail :: [a] -> [a]
That’s all folks!