Using newtype to make type class instances
·
Many times, we want to make our types instances
of certain type classes, but the type parameters just don't match up for what
we want to do. It's easy to make Maybe an instance
of Functor, because the Functor type class is
defined like this:
- class Functor f where
- fmap :: (a -> b) -> f a -> f b
·
So we just start out with:
- instance Functor Maybe where
·
And then implement fmap. All the type parameters
add up because the Maybe takes the
place of f:
- fmap :: (a -> b) -> Maybe a -> Maybe b
Tuple Example
·
What if we wanted to make the tuple an instance
of Functor in such a way
that when we fmap a function
over a tuple, it gets applied to the first component of the tuple?
·
That way, doing fmap (+3) (1,1) would result
in (4,1). It turns out that
writing the instance for that is kind of hard.
·
With Maybe,
we just say instance Functor Maybe
where because only type constructors that take
exactly one parameter can be made an instance of Functor.
·
But it seems like there's no way to do something
like that with (a,b) so that the
type parameter a ends up being
the one that changes when we use fmap.
·
To get around this, we can newtype our tuple in
such a way that the second type parameter represents the type of the first
component in the tuple:
- newtype Pair b a = Pair { getPair :: (a,b) }
·
And now, we can make it an instance of Functor so that the
function is mapped over the first component:
- instance Functor (Pair c) where
- fmap f (Pair (x,y)) = Pair (f x, y)
·
As you can see, we can pattern match on types
defined with newtype.
·
We pattern match to get the underlying tuple,
then we apply the function f to the first
component in the tuple and then we use the Pair value
constructor to convert the tuple back to our Pair b a.
·
If we imagine what the type fmap would be if
it only worked on our new pairs, it would be:
- fmap :: (a -> b) -> Pair c a -> Pair c b
·
Again, we said instance
Functor (Pair c) where so Pair c took the
place of the f in the type
class definition for Functor:
- class Functor f where
- fmap :: (a -> b) -> f a -> f b
·
So now, if we convert a tuple into a Pair b a, we can use fmap over it and
the function will be mapped over the first component:
- ghci> getPair $ fmap (*100) (Pair (2,3))
- (200,3)
- ghci> getPair $ fmap reverse (Pair ("london calling", 3))
- ("gnillac nodnol",3)
On newtype laziness
·
We mentioned that newtype is usually
faster than data.
·
The only thing that can be done with newtype is turning an
existing type into a new type, so internally, Haskell can represent the values
of types defined with newtype just like the
original ones, only it has to keep in mind that the their types are now
distinct.
·
This fact means that not only is newtype faster, it's also lazier.
·
Like we've said before, Haskell is lazy by
default, which means that only when we try to actually print the results of our
functions will any computation take place.
·
Furthemore, only those computations that are
necessary for our function to tell us the result will get carried out.
·
The undefined value in
Haskell represents an erronous computation.
·
If we try to evaluate it (that is, force Haskell
to actually compute it) by printing it to the terminal, Haskell will throw an
exception:
- ghci> undefined
- *** Exception: Prelude.undefined
·
However, if we make a list that has some undefined values in it
but request only the head of the list, which is not undefined, everything will go
smoothly because Haskell doesn't really need to evaluate any other elements in
a list if we only want to see what the first element is:
- ghci> head [3,4,5,undefined,2,undefined]
- 3
·
Now consider the following type:
- data CoolBool = CoolBool { getCoolBool :: Bool }
·
It's your run-of-the-mill algebraic data type
that was defined with the data keyword.
·
It has one value constructor, which has one
field whose type is Bool.
·
Let's make a function that pattern matches on a CoolBool and returns
the value "hello" regardless of
whether the Bool inside the CoolBool was True or False:
- helloMe :: CoolBool -> String
- helloMe (CoolBool _) = "hello"
·
Instead of applying this function to a normal CoolBool, let's throw it a
curveball and apply it to undefined
- ghci> helloMe undefined
- "*** Exception: Prelude.undefined
·
Types defined with the data keyword can
have multiple value constructors (even though CoolBool only has
one).
·
In order to see if the value given to our
function conforms to the (CoolBool _) pattern,
Haskell has to evaluate the value just enough to see which value constructor
was used when we made the value.
·
And when we try to evaluate an undefined value, even a
little, an exception is thrown.
·
Instead of using the data keyword for CoolBool, let's try using newtype:
- newtype CoolBool = CoolBool { getCoolBool :: Bool }
·
We don't have to change our helloMe function,
because the pattern matching syntax is the same if you use newtype or data to define
your type. Let's do the same thing here and apply helloMe to an undefined value:
- ghci> helloMe undefined
- "hello"
·
When we use newtype, Haskell can internally
represent the values of the new type in the same way as the original values.
·
It doesn't have to add another box around them,
it just has to be aware of the values being of different types.
·
And because Haskell knows that types made with
the newtype keyword can only have one constructor, it
doesn't have to evaluate the value passed to the function to make sure that it
conforms to the(CoolBool _) pattern because newtype types can only have one possible value constructor
and one field
·
Even though types defined with data and newtype behave
similarly from the programmer's point of view because they both have value
constructors and fields, they are actually two different mechanisms.
·
Whereas data can be used to make your own types from
scratch, newtype is for making
a completely new type out of an existing type.
·
Pattern matching on newtype values isn't
like taking something out of a box (like it is with data),
it's more about making a direct conversion from one type to another.
type vs. newtype vs. data
·
The type keyword is
for making type synonyms. What that means is that we just give another name to
an already existing type so that the type is easier to refer to. Say we did the
following:
1.
type IntList = [Int]
·
All this does is to allow us to refer to the [Int] type as IntList.
·
They can be used interchangeably.
·
We don't get an IntList value constructor or
anything like that. Because [Int] and IntList are only two
ways to refer to the same type, it doesn't matter which name we use in our type
annotations:
1.
ghci> ([1,2,3] :: IntList) ++ ([1,2,3] :: [Int])
2.
[1,2,3,1,2,3]
·
We use type synonyms when we want to make our
type signatures more descriptive by giving types names that tell us something
about their purpose in the context of the functions where they're being used.
·
For instance, when we used an association list
of type [(String,String)] to represent
a phone book, we gave it the type synonym of PhoneBook so that the
type signatures of our functions were easier to read.
·
The newtype keyword is
for taking existing types and wrapping them in new types, mostly so that it's
easier to make them instances of certain type classes.
·
When we use newtype to wrap an
existing type, the type that we get is separate from the original type. If we
make the following newtype:
- newtype CharList = CharList { getCharList :: [Char] }
·
We can't use ++ to put
together a CharList and a list of
type [Char].
·
We can't even use ++ to put
together two CharLists, because ++ works only on
lists and the CharList type isn't a
list, even though it could be said that it contains one.
·
We can, however, convert two CharLists to lists, ++ them and then
convert that back to a CharList.
·
When we use record syntax in our newtype declarations,
we get functions for converting between the new type and the original type:
namely the value constructor of our newtype and the
function for extracting the value in its field. The new type also isn't
automatically made an instance of the type classes that the original type
belongs to, so we have to derive or manually write them.
·
newtype declarations
= data declarations
that can only have one constructor and one field. If you catch yourself
writing such a data declaration,
consider using newtype.
·
The data keyword is
for making your own data types
·
They can have as many constructors and fields as
you wish and can be used to implement any algebraic data type by yourself.
·
Everything from lists and Maybe-like types to trees.
·
If you just want your type signatures to look
cleaner and be more descriptive, you probably want type synonyms.
·
If you want to take an existing type and wrap it
in a new type in order to make it an instance of a type class, chances are
you're looking for a newtype.
·
And if you want to make something completely new,
odds are good that you're looking for the data keyword.
Monoids
·
Type classes in Haskell are used to present an
interface for types that have some behavior in common.
·
We started out with simple type classes like Eq, which is for types whose
values can be equated, and Ord,
which is for things that can be put in an order and then moved on to more
interesting ones, like Functor and Applicative.
·
When we make a type, we think about which
behaviors it supports, i.e. what it can act like and then based on that we
decide which type classes to make it an instance of.
·
If it makes sense for values of our type to be
equated, we make it an instance of the Eq type class.
If we see that our type is some kind of functor, we make it an instance of Functor, and so on.
·
Now consider the following: * is a function
that takes two numbers and multiplies them.
·
If we multiply some number with a 1, the result is always equal
to that number.
·
It doesn't matter if we do 1 * x or x * 1, the result is always x.
·
Similarly, ++ is also a
function which takes two things and returns a third.
·
Only instead of multiplying numbers, it takes
two lists and concatenates them.
·
And much like *,
it also has a certain value which doesn't change the other one when used with ++.
·
That value is the empty list: [].
- ghci> 4 * 1
- 4
- ghci> 1 * 9
- 9
- ghci> [1,2,3] ++ []
- [1,2,3]
- ghci> [] ++ [0.5, 2.5]
- [0.5,2.5]
·
It seems that both * together with 1 and ++ along with [] share some
common properties:
·
The function takes two parameters.
·
The parameters and the returned value have the same type.
·
There exists such a value that doesn't change other values when used with the binary
function.
·
There's another thing that these two operations
have in common that may not be as obvious as our previous observations:
·
when we have three or more values and we want to
use the binary function to reduce them to a single result, the order in which
we apply the binary function to the values doesn't matter.
·
It doesn't matter if we do (3 * 4) * 5 or 3 * (4 * 5).
·
Either way, the result is 60. The same goes for ++:
- ghci> (3 * 2) * (8 * 5)
- 240
- ghci> 3 * (2 * (8 * 5))
- 240
- ghci> "la" ++ ("di" ++ "da")
- "ladida"
- ghci> ("la" ++ "di") ++ "da"
- "ladida"
·
We call this property associativity. * is associative,
and so is ++, but -, for example, is not.
·
By noticing and writing down these properties,
we have chanced upon monoids!
·
A monoid is when you have an associative binary function and a value which acts as an identity with
respect to that function.
·
When something acts as an identity with
respect to a function, it means that when called with that function and some
other value, the result is always equal to that other value.
·
1 is the identity with respect to *
·
[] is the
identity with respect to ++
·
There are a lot of other monoids to be found in
the world of Haskell, which is why the Monoid type class
exists.
- class Monoid m where
- mempty :: m
- mappend :: m -> m -> m
- mconcat :: [m] -> m
- mconcat = foldr mappend mempty
·
The Monoid type
class is defined in import Data.Monoid.
·
Only concrete types can be made instances of Monoid,
because the m in
the type class definition doesn't take any type parameters. This is different
from Functor and Applicative, which require their
instances to be type constructors which take one parameter.
·
mempty
represents the identity value for a particular monoid.
·
mappend
is the (unfortunately named) binary function
·
It takes two values of the same type and returns
a value of that type as well.
·
mconcat
takes a list of monoid values and reduces them to a single value by doing mappend between the
list's elements.
·
Default implementation takes mempty as a starting
value and folds the list from the right with mappend.
·
When making a type an instance of Monoid,
it suffices to just implement mempty and mappend. mconcat only necessary for performance considerations.
Monoid Laws
·
mempty `mappend` x = x
·
x `mappend` mempty = x
·
(x `mappend` y) `mappend` z = x `mappend` (y
`mappend` z)
·
The first two state that mempty has to act as
the identity with respect to mappend and the third
says that mappend has to be
associative i.e. that it the order in which we use mappend to reduce
several monoid values into one doesn't matter.
·
Haskell doesn't enforce these laws, so we as the
programmer have to be careful that our instances do indeed obey them.
Lists are monoids
·
Yes, lists are monoids! Like we've seen, the ++ function and
the empty list [] form a
monoid. The instance is very simple:
- instance Monoid [a] where
- mempty = []
- mappend = (++)
·
Lists are an instance of the Monoid type class
regardless of the type of the elements they hold.
·
Notice that we wrote instance Monoid [a] and not instance Monoid [], because Monoid requires a
concrete type for an instance.
- ghci> [1,2,3] `mappend` [4,5,6]
- [1,2,3,4,5,6]
- ghci> ("one" `mappend` "two") `mappend` "tree"
- "onetwotree"
- ghci> "one" `mappend` ("two" `mappend` "tree")
- "onetwotree"
- ghci> "one" `mappend` "two" `mappend` "tree"
- "onetwotree"
- ghci> "pang" `mappend` mempty
- "pang"
- ghci> mconcat [[1,2],[3,6],[9]]
- [1,2,3,6,9]
- ghci> mempty :: [a]
- []
·
Notice that in the last line, we had to write an
explicit type annotation, because if we just did mempty, GHCi wouldn't know
which instance to use, so we had to say we want the list instance.
·
We were able to use the general type of [a] (as opposed
to specifying [Int] or [String]) because the empty
list can act as if it contains any type.
·
Because mconcat has a default
implementation, we get it for free when we make something an instance of Monoid.
·
In the case of the list, mconcat turns out to
be just concat.
·
It takes a list of lists and flattens it,
because that's the equivalent of doing ++ between all
the adjecent lists in a list.
·
The monoid laws do indeed hold for the list
instance.
·
When we have several lists and we mappend (or ++) them together, it doesn't
matter which ones we do first, because they're just joined at the ends anyway.
·
Also, the empty list acts as the identity so all
is well.
·
Notice that monoids don't require that a `mappend` b be equal to b `mappend` a.
·
In the case of the list, they clearly aren't:
- ghci> "one" `mappend` "two"
- "onetwo"
- ghci> "two" `mappend` "one"
- "twoone"
·
The fact that for multiplication 3 * 5 and 5 * 3 are the same
is just a property of multiplication, but it doesn't hold for all (and indeed,
most) monoids.
Product and Sum
·
We already examined one way for numbers to be
considered monoids.
·
Just have the binary function be * and the
identity value 1.
·
It turns out that that's not the only way for
numbers to be monoids.
·
Another way is to have the binary function be + and the identity
value 0:
- ghci> 0 + 4
- 4
- ghci> 5 + 0
- 5
- ghci> (1 + 3) + 5
- 9
- ghci> 1 + (3 + 5)
- 9
·
The monoid laws hold, because if you add 0 to
any number, the result is that number.
·
And addition is also associative, so we get no
problems there.
·
So now that there are two equally valid ways for
numbers to be monoids, which way do choose?
·
Well, we don't have to. Remember, when there are
several ways for some type to be an instance of the same type class, we can
wrap that type in a newtype and then make the new type an instance of the
type class in a different way.
·
We can have our cake and eat it too.
·
The Data.Monoid module
exports two types for this, namely Product and Sum. Product is defined
like this:
- newtype Product a = Product { getProduct :: a }
- deriving (Eq, Ord, Read, Show, Bounded)
·
Simple, just a newtype wrapper with
one type parameter along with some derived instances. Its instance for Monoid goes a little
something like this:
- instance Num a => Monoid (Product a) where
- mempty = Product 1
- Product x `mappend` Product y = Product (x * y)
·
mempty is just 1 wrapped in a Product constructor.
·
mappend pattern
matches on the Product constructor,
multiplies the two numbers and then wraps the resulting number back.
·
As you can see, there's a Num a class
constraint.
·
So this means that Product a is an
instance of Monoid for all a's that are already an
instance of Num.
·
To use Product a as a monoid,
we have to do some newtype wrapping and
unwrapping:
- ghci> getProduct $ Product 3 `mappend` Product 9
- 27
- ghci> getProduct $ Product 3 `mappend` mempty
- 3
- ghci> getProduct $ Product 3 `mappend` Product 4 `mappend` Product 2
- 24
- ghci> getProduct . mconcat . map Product $ [3,4,2]
- 24
·
This is nice as a showcase of the Monoid type class,
but no one in their right mind would use this way of multiplying numbers
instead of just writing 3 * 9 and 3 * 1. But a bit later, we'll
see how these Monoid instances
that may seem trivial at this time can come in handy.
·
Sum is defined
like Product and the
instance is similar as well. We use it in the same way:
- ghci> getSum $ Sum 2 `mappend` Sum 9
- 11
- ghci> getSum $ mempty `mappend` Sum 3
- 3
- ghci> getSum . mconcat . map Sum $ [1,2,3]
- 6
Any and All
·
Another type which can act like a monoid in two
distinct but equally valid ways is Bool.
The first way is to have the or function ||act
as the binary function along with False as the
identity value. The way or works in logic is that if any of its two
parameters is True, it
returns True, otherwise it returns False. So if we use False as the
identity value, it will return False when or-ed
with False and True when or-ed
with True. The Any newtype constructor
is an instance of Monoid in this
fashion. It's defined like this:
- newtype Any = Any { getAny :: Bool }
- deriving (Eq, Ord, Read, Show, Bounded)
- instance Monoid Any where
- mempty = Any False
- Any x `mappend` Any y = Any (x || y)
·
The reason it's called Any is because x `mappend` y will be True if any one of those
two is True. Even if three or more Any wrapped Bools are mappended together, the
result will hold True if any of
them are True:
- ghci> getAny $ Any True `mappend` Any False
- True
- ghci> getAny $ mempty `mappend` Any True
- True
- ghci> getAny . mconcat . map Any $ [False, False, False, True]
- True
- ghci> getAny $ mempty `mappend` mempty
- False
·
The other way for Bool to be an
instance of Monoid is to kind of
do the opposite: have && be the binary
function and then make True the identity
value. Logical and will return True only if both
of its parameters are True. This is the newtype declaration,
nothing fancy:
- newtype All = All { getAll :: Bool }
- deriving (Eq, Ord, Read, Show, Bounded)
- instance Monoid All where
- mempty = All True
- All x `mappend` All y = All (x && y)
·
When we mappend values of the All type, the
result will be True only if all the values
used in the mappend operations
are True:
- ghci> getAll $ mempty `mappend` All True
- True
- ghci> getAll $ mempty `mappend` All False
- False
- ghci> getAll . mconcat . map All $ [True, True, True]
- True
- ghci> getAll . mconcat . map All $ [True, True, False]
- False
·
Just like with multiplication and addition, we
usually explicitly state the binary functions instead of wrapping them in newtypes
and then using mappend and mempty. mconcat seems useful
for Any and All, but usually it's easier
to use the or and andfunctions, which take
lists of Bools and return True if any of
them are True or if all of
them are True, respectively.
The Ordering monoid
·
Hey, remember the Ordering type? It's
used as the result when comparing things and it can have three values: LT, EQ and GT, which stand for less than, equal and greater
than respectively:
- ghci> 1 `compare` 2
- LT
- ghci> 2 `compare` 2
- EQ
- ghci> 3 `compare` 2
- GT
·
With lists, numbers and boolean values, finding
monoids was just a matter of looking at already existing commonly used
functions and seeing if they exhibit some sort of monoid behavior. With Ordering, we have to look a
bit harder to recognize a monoid, but it turns out that its Monoid instance is
just as intuitive as the ones we've met so far and also quite useful:
- instance Monoid Ordering where
- mempty = EQ
- LT `mappend` _ = LT
- EQ `mappend` y = y
- GT `mappend` _ = GT
·
The instance is set up like this: when we mappend two Ordering values, the
one on the left is kept, unless the value on the left is EQ, in which case the right
one is the result. The identity is EQ.
It resembles the way we alphabetically compare words. We compare the first two
letters and if they differ, we can already decide which word would go first in
a dictionary. However, if the first two letters are equal, then we move on to
comparing the next pair of letters and repeat the process.
·
It's important to note that in the Monoid instance for Ordering, x `mappend` y doesn't equal y `mappend` x. Because the
first parameter is kept unless it's EQ, LT `mappend` GT will result
in LT, whereas GT `mappend` LT will result
in GT:
·
Associative
but not commutative
- ghci> LT `mappend` GT
- LT
- ghci> GT `mappend` LT
- GT
- ghci> mempty `mappend` LT
- LT
- ghci> mempty `mappend` GT
- GT
·
OK, so how is this monoid useful? Let's say you
were writing a function that takes two strings, compares their lengths, and
returns an Ordering. But if the strings
are of the same length, then instead of returning EQ right away,
we want to compare them alphabetically. One way to write this would be like so:
- lengthCompare :: String -> String -> Ordering
- lengthCompare x y = let a = length x `compare` length y
- b = x `compare` y
- in if a == EQ then b else a
·
We name the result of comparing the lengths a and the
result of the alphabetical comparison b and then if
it turns out that the lengths were equal, we return their alphabetical
ordering.
·
But by employing our understanding of how Ordering is a monoid, we
can rewrite this function in a much simpler manner:
- import Data.Monoid
- lengthCompare :: String -> String -> Ordering
- lengthCompare x y = (length x `compare` length y) `mappend`
- (x `compare` y)
- ghci> lengthCompare "zen" "ants"
- LT
- ghci> lengthCompare "zen" "ant"
- GT
·
Remember, when we use mappend, its left parameter
is always kept unless it's EQ,
in which case the right one is kept. That's why we put the comparison that we
consider to be the first, more important criterion as the first parameter. If
we wanted to expand this function to also compare for the number of vowels and
set this to be the second most important criterion for comparison, we'd just
modify it like this:
- import Data.Monoid
- lengthCompare :: String -> String -> Ordering
- lengthCompare x y = (length x `compare` length y) `mappend`
- (vowels x `compare` vowels y) `mappend`
- (x `compare` y)
- where vowels = length . filter (`elem` "aeiou")
·
We made a helper function, which takes a string
and tells us how many vowels it has by first filtering it only for letters that
are in the string "aeiou" and then
applying length to that.
- ghci> lengthCompare "zen" "anna"
- LT
- ghci> lengthCompare "zen" "ana"
- LT
- ghci> lengthCompare "zen" "ann"
- GT
·
Ordering monoid allows
us to easily compare things by many different criteria and put those criteria
in an order themselves, ranging from the most important to the least.
Maybe the monoid
- instance Monoid a => Monoid (Maybe a) where
- mempty = Nothing
- Nothing `mappend` m = m
- m `mappend` Nothing = m
- Just m1 `mappend` Just m2 = Just (m1 `mappend` m2)
- ghci> Nothing `mappend` Just "andy"
- Just "andy"
- ghci> Just LT `mappend` Nothing
- Just LT
- ghci> Just (Sum 3) `mappend` Just (Sum 4)
- Just (Sum {getSum = 7})
·
But what if the type of the contents of the Maybe aren't an
instance of Monoid? One thing we can do
is to just discard the second value and keep the first one. For this, the First a type exists
and this is its definition:
- newtype First a = First { getFirst :: Maybe a }
- deriving (Eq, Ord, Read, Show)
·
We take a Maybe a and we wrap
it with a newtype.
The Monoid instance is
as follows:
- instance Monoid (First a) where
- mempty = First Nothing
- First (Just x) `mappend` _ = First (Just x)
- First Nothing `mappend` x = x
- ghci> getFirst $ First (Just 'a') `mappend` First (Just 'b')
- Just 'a'
- ghci> getFirst $ First Nothing `mappend` First (Just 'b')
- Just 'b'
- ghci> getFirst $ First (Just 'a') `mappend` First Nothing
- Just 'a'
·
First is useful
when we have a bunch of Maybe values and we
just want to know if any of them is a Just.
The mconcat function
comes in handy:
- ghci> getFirst . mconcat . map First $ [Nothing, Just 9, Just 10]
- Just 9
·
If we want a monoid on Maybe a such that the
second parameter is kept if both parameters of mappend are Just values,Data.Monoid provides a
the Last a type, which
works like First a, only the last non-Nothing value is kept
when mappending and using mconcat:
- ghci> getLast . mconcat . map Last $ [Nothing, Just 9, Just 10]
- Just 10
- ghci> getLast $ Last (Just "one") `mappend` Last (Just "two")
- Just "two"
Using monoids to fold data structures
·
One of the more interesting ways to put monoids
to work is to make them help us define folds over various data structures. So
far, we've only done folds over lists, but lists aren't the only data structure
that can be folded over. We can define folds over almost any data structure.
Trees especially lend themselves well to folding.
·
Because there are so many data structures that
work nicely with folds, the Foldable type class
was introduced. Much like Functor is for things
that can be mapped over, Foldable is for things
that can be folded up! It can be found in Data.Foldable
and because it exports functions whose names clash with the ones from the Prelude, it's best imported
qualified (and served with basil):
- import qualified Foldable as F
·
foldr, foldl, foldr1 and foldl1.
- ghci> :t foldr
- foldr :: (a -> b -> b) -> b -> [a] -> b
- ghci> :t F.foldr
F.foldr :: (F.Foldable t) => (a -> b -> b) -> b -> t a -> b
[] as Foldable
- ghci> foldr (*) 1 [1,2,3]
- 6
- ghci> F.foldr (*) 1 [1,2,3]
- 6
Maybe as Foldable
- ghci> F.foldl (+) 2 (Just 9)
- 11
- ghci> F.foldr (||) False (Just True)
- True
Tree
as Foldable
- data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)
·
We said that a tree is either an empty tree that
doesn't hold any values or it's a node that holds one value and also two other
trees. After defining it, we made it an instance of Functor and with that
we gained the ability to fmap functions
over it. Now, we're going to make it an instance of Foldable so that we
get the ability to fold it up. One way to make a type constructor an instance
of Foldable is to just
directly implement foldr for it. But
another, often much easier way, is to implement the foldMap function,
which is also a part of the Foldable type class.
The foldMap function has
the following type:
- foldMap :: (Monoid m, Foldable t) => (a -> m) -> t a -> m
·
Its first parameter is a function that takes a
value of the type that our foldable structure contains (denoted here with a) and returns a monoid
value. Its second parameter is a foldable structure that contains values of
type a. It maps that function over
the foldable structure, thus producing a foldable structure that contains
monoid values. Then, by doing mappend between those
monoid values, it joins them all into a single monoid value. This function may
sound kind of odd at the moment, but we'll see that it's very easy to
implement. What's also cool is that implementing this function is all it takes
for our type to be made an instance of Foldable.
So if we just implement foldMap for some
type, we get foldr and foldl on that type
for free!
·
This is how we make Tree an instance
of Foldable:
- instance F.Foldable Tree where
- foldMap f Empty = mempty
- foldMap f (Node x l r) = F.foldMap f l `mappend`
- f x `mappend`
- F.foldMap f r
·
We think like this: if we are provided with a
function that takes an element of our tree and returns a monoid value, how do
we reduce our whole tree down to one single monoid value? When we were doing fmap over our
tree, we applied the function that we were mapping to a node and then we
recursively mapped the function over the left sub-tree as well as the right
one. Here, we're tasked with not only mapping a function, but with also joining
up the results into a single monoid value by using mappend. First we consider
the case of the empty tree — a sad and lonely tree that has no values or
sub-trees. It doesn't hold any value that we can give to our monoid-making
function, so we just say that if our tree is empty, the monoid value it becomes
is mempty.
·
The case of a non-empty node is a bit more
interesting. It contains two sub-trees as well as a value. In this case, we
recursively foldMap the same function f over the left
and the right sub-trees. Remember, our foldMap results in a
single monoid value. We also apply our function f to the value
in the node. Now we have three monoid values (two from our sub-trees and one
from applying f to the value
in the node) and we just have to bang them together into a single value. For
this purpose we use mappend, and naturally the
left sub-tree comes first, then the node value and then the right sub-tree.
·
Notice that we didn't have to provide the
function that takes a value and returns a monoid value. We receive that
function as a parameter to foldMap and all we
have to decide is where to apply that function and how to join up the resulting
monoids from it.
·
Now that we have a Foldable instance for
our tree type, we get foldr and foldl for free!
Consider this tree:
- testTree = Node 5
- (Node 3
- (Node 1 Empty Empty)
- (Node 6 Empty Empty)
- )
- (Node 9
- (Node 8 Empty Empty)
- (Node 10 Empty Empty)
- )
·
It has 5 at its root
and then its left node it has 3 with 1 on the left
and 6 on the right.
The root's right node has a 9 and then an 8 to its left and a 10 on the far
right side. With a Foldable instance, we
can do all of the folds that we can do on lists:
- ghci> F.foldl (+) 0 testTree
- 42
- ghci> F.foldl (*) 1 testTree
- 64800
·
And also, foldMap isn't only
useful for making new instances of Foldable;
it comes in handy for reducing our structure to a single monoid value. For
instance, if we want to know if any number in our tree is equal to 3, we can do this:
- ghci> getAny $ F.foldMap (\x -> Any $ x == 3) testTree
- True
·
Here, \x -> Any
$ x == 3 is a function that takes a number and returns
a monoid value, namely a Bool wrapped in Any.foldMap applies this function to every element in our
tree and then reduces the resulting monoids into a single monoid withmappend. If we do this:
- ghci> getAny $ F.foldMap (\x -> Any $ x > 15) testTree
- False
·
All of the nodes in our tree would hold the
value Any False after having
the function in the lambda applied to them. But to end up True, mappend for Any has to have
at least one True value as a
parameter. That's why the final result is False,
which makes sense because no value in our tree is greater than 15.
·
We can also easily turn our tree into a list by
doing a foldMap with the \x -> [x] function. By
first projecting that function onto our tree, each element becomes a singleton
list. The mappend action that
takes place between all those singleton list results in a single list that
holds all of the elements that are in our tree:
- ghci> F.foldMap (\x -> [x]) testTree
- [1,3,6,5,8,9,10]
·
What's cool is that all of these trick aren't
limited to trees, they work on any instance of Foldable.
Notes:
newtype
relation
between newtype, data, and type declarations:
·
Unlike algebraic datatypes, the newtype
constructor N is unlifted,
so thatN _|_ is the same as _|_.
Monoid
Foldable
A Foldable type is also a container (although the class does not technically require Functor, interesting Foldables are all Functors). It is a container
with the added property that its items can be 'folded' to a summary value. In
other words, it is a type which supports "foldr". Once you support foldr, of
course, you can be turned into a list, by using toList
= foldr (:) []. This means that all Foldables have a representation as a
list, but the order of the items may or may not have any particular
significance. However, if a Foldable is
also a Functor,
parametricity and the Functor law guarantee that toList and fmap
commute. Further, in the case of Data.Sequence, there is a well defined order and it
is exposed as expected by
toList. A
particular kind of fold well-used by Haskell programmers is mapM_, which is a kind of fold
over (>>), and Foldable provides this along with the
related sequence_.
No comments:
Post a Comment