Matt's Haskell Learnings: LYAH Chapter 11c Monoids

Using newtype to make type class instances

· Many times, we want to make our types instances of certain type classes, but the type parameters just don't match up for what we want to do. It's easy to make Maybe an instance of Functor, because the Functor type class is defined like this:

class Functor f where
fmap :: (a -> b) -> f a -> f b

· So we just start out with:

instance Functor Maybe where

· And then implement fmap. All the type parameters add up because the Maybe takes the place of f:

fmap :: (a -> b) -> Maybe a -> Maybe b

Tuple Example

· What if we wanted to make the tuple an instance of Functor in such a way that when we fmap a function over a tuple, it gets applied to the first component of the tuple?

· That way, doing fmap (+3) (1,1) would result in (4,1). It turns out that writing the instance for that is kind of hard.

· With Maybe, we just say instance Functor Maybe where because only type constructors that take exactly one parameter can be made an instance of Functor.

· But it seems like there's no way to do something like that with (a,b) so that the type parameter a ends up being the one that changes when we use fmap.

· To get around this, we can newtype our tuple in such a way that the second type parameter represents the type of the first component in the tuple:

newtype Pair b a = Pair { getPair :: (a,b) }

· And now, we can make it an instance of Functor so that the function is mapped over the first component:

instance Functor (Pair c) where
fmap f (Pair (x,y)) = Pair (f x, y)

· As you can see, we can pattern match on types defined with newtype.

· We pattern match to get the underlying tuple, then we apply the function f to the first component in the tuple and then we use the Pair value constructor to convert the tuple back to our Pair b a.

· If we imagine what the type fmap would be if it only worked on our new pairs, it would be:

fmap :: (a -> b) -> Pair c a -> Pair c b

· Again, we said instance Functor (Pair c) where so Pair c took the place of the f in the type class definition for Functor:

class Functor f where
fmap :: (a -> b) -> f a -> f b

· So now, if we convert a tuple into a Pair b a, we can use fmap over it and the function will be mapped over the first component:

ghci> getPair $ fmap (*100) (Pair (2,3))
(200,3)
ghci> getPair $ fmap reverse (Pair ("london calling", 3))
("gnillac nodnol",3)

On newtype laziness

· We mentioned that newtype is usually faster than data.

· The only thing that can be done with newtype is turning an existing type into a new type, so internally, Haskell can represent the values of types defined with newtype just like the original ones, only it has to keep in mind that the their types are now distinct.

· This fact means that not only is newtype faster, it's also lazier.

· Like we've said before, Haskell is lazy by default, which means that only when we try to actually print the results of our functions will any computation take place.

· Furthemore, only those computations that are necessary for our function to tell us the result will get carried out.

· The undefined value in Haskell represents an erronous computation.

· If we try to evaluate it (that is, force Haskell to actually compute it) by printing it to the terminal, Haskell will throw an exception:

ghci> undefined
*** Exception: Prelude.undefined

· However, if we make a list that has some undefined values in it but request only the head of the list, which is not undefined, everything will go smoothly because Haskell doesn't really need to evaluate any other elements in a list if we only want to see what the first element is:

ghci> head [3,4,5,undefined,2,undefined]
3

· Now consider the following type:

data CoolBool = CoolBool { getCoolBool :: Bool }

· It's your run-of-the-mill algebraic data type that was defined with the data keyword.

· It has one value constructor, which has one field whose type is Bool.

· Let's make a function that pattern matches on a CoolBool and returns the value "hello" regardless of whether the Bool inside the CoolBool was True or False:

helloMe :: CoolBool -> String
helloMe (CoolBool _) = "hello"

· Instead of applying this function to a normal CoolBool, let's throw it a curveball and apply it to undefined

ghci> helloMe undefined
"*** Exception: Prelude.undefined

· Types defined with the data keyword can have multiple value constructors (even though CoolBool only has one).

· In order to see if the value given to our function conforms to the (CoolBool _) pattern, Haskell has to evaluate the value just enough to see which value constructor was used when we made the value.

· And when we try to evaluate an undefined value, even a little, an exception is thrown.

· Instead of using the data keyword for CoolBool, let's try using newtype:

newtype CoolBool = CoolBool { getCoolBool :: Bool }

· We don't have to change our helloMe function, because the pattern matching syntax is the same if you use newtype or data to define your type. Let's do the same thing here and apply helloMe to an undefined value:

ghci> helloMe undefined
"hello"

· When we use newtype, Haskell can internally represent the values of the new type in the same way as the original values.

· It doesn't have to add another box around them, it just has to be aware of the values being of different types.

· And because Haskell knows that types made with the newtype keyword can only have one constructor, it doesn't have to evaluate the value passed to the function to make sure that it conforms to the(CoolBool _) pattern because newtype types can only have one possible value constructor and one field

· Even though types defined with data and newtype behave similarly from the programmer's point of view because they both have value constructors and fields, they are actually two different mechanisms.

· Whereas data can be used to make your own types from scratch, newtype is for making a completely new type out of an existing type.

· Pattern matching on newtype values isn't like taking something out of a box (like it is with data), it's more about making a direct conversion from one type to another.

type vs. newtype vs. data

· The type keyword is for making type synonyms. What that means is that we just give another name to an already existing type so that the type is easier to refer to. Say we did the following:

1. type IntList = [Int]

· All this does is to allow us to refer to the [Int] type as IntList.

· They can be used interchangeably.

· We don't get an IntList value constructor or anything like that. Because [Int] and IntList are only two ways to refer to the same type, it doesn't matter which name we use in our type annotations:

1. ghci> ([1,2,3] :: IntList) ++ ([1,2,3] :: [Int])

2. [1,2,3,1,2,3]

· We use type synonyms when we want to make our type signatures more descriptive by giving types names that tell us something about their purpose in the context of the functions where they're being used.

· For instance, when we used an association list of type [(String,String)] to represent a phone book, we gave it the type synonym of PhoneBook so that the type signatures of our functions were easier to read.

· The newtype keyword is for taking existing types and wrapping them in new types, mostly so that it's easier to make them instances of certain type classes.

· When we use newtype to wrap an existing type, the type that we get is separate from the original type. If we make the following newtype:

newtype CharList = CharList { getCharList :: [Char] }

· We can't use ++ to put together a CharList and a list of type [Char].

· We can't even use ++ to put together two CharLists, because ++ works only on lists and the CharList type isn't a list, even though it could be said that it contains one.

· We can, however, convert two CharLists to lists, ++ them and then convert that back to a CharList.

· When we use record syntax in our newtype declarations, we get functions for converting between the new type and the original type: namely the value constructor of our newtype and the function for extracting the value in its field. The new type also isn't automatically made an instance of the type classes that the original type belongs to, so we have to derive or manually write them.

· newtype declarations = data declarations that can only have one constructor and one field. If you catch yourself writing such a data declaration, consider using newtype.

· The data keyword is for making your own data types

· They can have as many constructors and fields as you wish and can be used to implement any algebraic data type by yourself.

· Everything from lists and Maybe-like types to trees.

· If you just want your type signatures to look cleaner and be more descriptive, you probably want type synonyms.

· If you want to take an existing type and wrap it in a new type in order to make it an instance of a type class, chances are you're looking for a newtype.

· And if you want to make something completely new, odds are good that you're looking for the data keyword.

Monoids

· Type classes in Haskell are used to present an interface for types that have some behavior in common.

· We started out with simple type classes like Eq, which is for types whose values can be equated, and Ord, which is for things that can be put in an order and then moved on to more interesting ones, like Functor and Applicative.

· When we make a type, we think about which behaviors it supports, i.e. what it can act like and then based on that we decide which type classes to make it an instance of.

· If it makes sense for values of our type to be equated, we make it an instance of the Eq type class. If we see that our type is some kind of functor, we make it an instance of Functor, and so on.

· Now consider the following: * is a function that takes two numbers and multiplies them.

· If we multiply some number with a 1, the result is always equal to that number.

· It doesn't matter if we do 1 * x or x * 1, the result is always x.

· Similarly, ++ is also a function which takes two things and returns a third.

· Only instead of multiplying numbers, it takes two lists and concatenates them.

· And much like *, it also has a certain value which doesn't change the other one when used with ++.

· That value is the empty list: [].

ghci> 4 * 1
4
ghci> 1 * 9
9
ghci> [1,2,3] ++ []
[1,2,3]
ghci> [] ++ [0.5, 2.5]
[0.5,2.5]

· It seems that both * together with 1 and ++ along with [] share some common properties:

· The function takes two parameters.

· The parameters and the returned value have the same type.

· There exists such a value that doesn't change other values when used with the binary function.

· There's another thing that these two operations have in common that may not be as obvious as our previous observations:

· when we have three or more values and we want to use the binary function to reduce them to a single result, the order in which we apply the binary function to the values doesn't matter.

· It doesn't matter if we do (3 * 4) * 5 or 3 * (4 * 5).

· Either way, the result is 60. The same goes for ++:

ghci> (3 * 2) * (8 * 5)
240
ghci> 3 * (2 * (8 * 5))
240
ghci> "la" ++ ("di" ++ "da")
"ladida"
ghci> ("la" ++ "di") ++ "da"
"ladida"

· We call this property associativity. * is associative, and so is ++, but -, for example, is not.

· By noticing and writing down these properties, we have chanced upon monoids!

· A monoid is when you have an associative binary function and a value which acts as an identity with respect to that function.

· When something acts as an identity with respect to a function, it means that when called with that function and some other value, the result is always equal to that other value.

· 1 is the identity with respect to *

· [] is the identity with respect to ++

· There are a lot of other monoids to be found in the world of Haskell, which is why the Monoid type class exists.

class Monoid m where
mempty :: m
mappend :: m -> m -> m
mconcat :: [m] -> m
mconcat = foldr mappend mempty

· The Monoid type class is defined in import Data.Monoid.

· Only concrete types can be made instances of Monoid, because the m in the type class definition doesn't take any type parameters. This is different from Functor and Applicative, which require their instances to be type constructors which take one parameter.

· mempty represents the identity value for a particular monoid.

· mappend is the (unfortunately named) binary function

· It takes two values of the same type and returns a value of that type as well.

· mconcat takes a list of monoid values and reduces them to a single value by doing mappend between the list's elements.

· Default implementation takes mempty as a starting value and folds the list from the right with mappend.

· When making a type an instance of Monoid, it suffices to just implement mempty and mappend. mconcat only necessary for performance considerations.

Monoid Laws

· mempty `mappend` x = x

· x `mappend` mempty = x

· (x `mappend` y) `mappend` z = x `mappend` (y `mappend` z)

· The first two state that mempty has to act as the identity with respect to mappend and the third says that mappend has to be associative i.e. that it the order in which we use mappend to reduce several monoid values into one doesn't matter.

· Haskell doesn't enforce these laws, so we as the programmer have to be careful that our instances do indeed obey them.

Lists are monoids

· Yes, lists are monoids! Like we've seen, the ++ function and the empty list [] form a monoid. The instance is very simple:

instance Monoid [a] where
mempty = []
mappend = (++)

· Lists are an instance of the Monoid type class regardless of the type of the elements they hold.

· Notice that we wrote instance Monoid [a] and not instance Monoid [], because Monoid requires a concrete type for an instance.

ghci> [1,2,3] `mappend` [4,5,6]
[1,2,3,4,5,6]
ghci> ("one" `mappend` "two") `mappend` "tree"
"onetwotree"
ghci> "one" `mappend` ("two" `mappend` "tree")
"onetwotree"
ghci> "one" `mappend` "two" `mappend` "tree"
"onetwotree"
ghci> "pang" `mappend` mempty
"pang"
ghci> mconcat [[1,2],[3,6],[9]]
[1,2,3,6,9]
ghci> mempty :: [a]
[]

· Notice that in the last line, we had to write an explicit type annotation, because if we just did mempty, GHCi wouldn't know which instance to use, so we had to say we want the list instance.

· We were able to use the general type of [a] (as opposed to specifying [Int] or [String]) because the empty list can act as if it contains any type.

· Because mconcat has a default implementation, we get it for free when we make something an instance of Monoid.

· In the case of the list, mconcat turns out to be just concat.

· It takes a list of lists and flattens it, because that's the equivalent of doing ++ between all the adjecent lists in a list.

· The monoid laws do indeed hold for the list instance.

· When we have several lists and we mappend (or ++) them together, it doesn't matter which ones we do first, because they're just joined at the ends anyway.

· Also, the empty list acts as the identity so all is well.

· Notice that monoids don't require that a `mappend` b be equal to b `mappend` a.

· In the case of the list, they clearly aren't:

ghci> "one" `mappend` "two"
"onetwo"
ghci> "two" `mappend` "one"
"twoone"

· The fact that for multiplication 3 * 5 and 5 * 3 are the same is just a property of multiplication, but it doesn't hold for all (and indeed, most) monoids.

Product and Sum

· We already examined one way for numbers to be considered monoids.

· Just have the binary function be * and the identity value 1.

· It turns out that that's not the only way for numbers to be monoids.

· Another way is to have the binary function be + and the identity value 0:

ghci> 0 + 4
4
ghci> 5 + 0
5
ghci> (1 + 3) + 5
9
ghci> 1 + (3 + 5)
9

· The monoid laws hold, because if you add 0 to any number, the result is that number.

· And addition is also associative, so we get no problems there.

· So now that there are two equally valid ways for numbers to be monoids, which way do choose?

· Well, we don't have to. Remember, when there are several ways for some type to be an instance of the same type class, we can wrap that type in a newtype and then make the new type an instance of the type class in a different way.

· We can have our cake and eat it too.

· The Data.Monoid module exports two types for this, namely Product and Sum. Product is defined like this:

newtype Product a = Product { getProduct :: a }
deriving (Eq, Ord, Read, Show, Bounded)

· Simple, just a newtype wrapper with one type parameter along with some derived instances. Its instance for Monoid goes a little something like this:

instance Num a => Monoid (Product a) where
mempty = Product 1
Product x `mappend` Product y = Product (x * y)

· mempty is just 1 wrapped in a Product constructor.

· mappend pattern matches on the Product constructor, multiplies the two numbers and then wraps the resulting number back.

· As you can see, there's a Num a class constraint.

· So this means that Product a is an instance of Monoid for all a's that are already an instance of Num.

· To use Product a as a monoid, we have to do some newtype wrapping and unwrapping:

ghci> getProduct $ Product 3 `mappend` Product 9
27
ghci> getProduct $ Product 3 `mappend` mempty
3
ghci> getProduct $ Product 3 `mappend` Product 4 `mappend` Product 2
24
ghci> getProduct . mconcat . map Product $ [3,4,2]
24

· This is nice as a showcase of the Monoid type class, but no one in their right mind would use this way of multiplying numbers instead of just writing 3 * 9 and 3 * 1. But a bit later, we'll see how these Monoid instances that may seem trivial at this time can come in handy.

· Sum is defined like Product and the instance is similar as well. We use it in the same way:

ghci> getSum $ Sum 2 `mappend` Sum 9
11
ghci> getSum $ mempty `mappend` Sum 3
3
ghci> getSum . mconcat . map Sum $ [1,2,3]
6

Any and All

· Another type which can act like a monoid in two distinct but equally valid ways is Bool. The first way is to have the or function ||act as the binary function along with False as the identity value. The way or works in logic is that if any of its two parameters is True, it returns True, otherwise it returns False. So if we use False as the identity value, it will return False when or-ed with False and True when or-ed with True. The Any newtype constructor is an instance of Monoid in this fashion. It's defined like this:

newtype Any = Any { getAny :: Bool }
deriving (Eq, Ord, Read, Show, Bounded)

instance Monoid Any where
mempty = Any False
Any x `mappend` Any y = Any (x || y)

· The reason it's called Any is because x `mappend` y will be True if any one of those two is True. Even if three or more Any wrapped Bools are mappended together, the result will hold True if any of them are True:

ghci> getAny $ Any True `mappend` Any False
True
ghci> getAny $ mempty `mappend` Any True
True
ghci> getAny . mconcat . map Any $ [False, False, False, True]
True
ghci> getAny $ mempty `mappend` mempty
False

· The other way for Bool to be an instance of Monoid is to kind of do the opposite: have && be the binary function and then make True the identity value. Logical and will return True only if both of its parameters are True. This is the newtype declaration, nothing fancy:

newtype All = All { getAll :: Bool }
deriving (Eq, Ord, Read, Show, Bounded)

instance Monoid All where
mempty = All True
All x `mappend` All y = All (x && y)

· When we mappend values of the All type, the result will be True only if all the values used in the mappend operations are True:

ghci> getAll $ mempty `mappend` All True
True
ghci> getAll $ mempty `mappend` All False
False
ghci> getAll . mconcat . map All $ [True, True, True]
True
ghci> getAll . mconcat . map All $ [True, True, False]
False

· Just like with multiplication and addition, we usually explicitly state the binary functions instead of wrapping them in newtypes and then using mappend and mempty. mconcat seems useful for Any and All, but usually it's easier to use the or and andfunctions, which take lists of Bools and return True if any of them are True or if all of them are True, respectively.

The Ordering monoid

· Hey, remember the Ordering type? It's used as the result when comparing things and it can have three values: LT, EQ and GT, which stand for less than, equal and greater than respectively:

ghci> 1 `compare` 2
LT
ghci> 2 `compare` 2
EQ
ghci> 3 `compare` 2
GT

· With lists, numbers and boolean values, finding monoids was just a matter of looking at already existing commonly used functions and seeing if they exhibit some sort of monoid behavior. With Ordering, we have to look a bit harder to recognize a monoid, but it turns out that its Monoid instance is just as intuitive as the ones we've met so far and also quite useful:

instance Monoid Ordering where
mempty = EQ
LT `mappend` _ = LT
EQ `mappend` y = y
GT `mappend` _ = GT

· The instance is set up like this: when we mappend two Ordering values, the one on the left is kept, unless the value on the left is EQ, in which case the right one is the result. The identity is EQ. It resembles the way we alphabetically compare words. We compare the first two letters and if they differ, we can already decide which word would go first in a dictionary. However, if the first two letters are equal, then we move on to comparing the next pair of letters and repeat the process.

· It's important to note that in the Monoid instance for Ordering, x `mappend` y doesn't equal y `mappend` x. Because the first parameter is kept unless it's EQ, LT `mappend` GT will result in LT, whereas GT `mappend` LT will result in GT:

· Associative but not commutative

ghci> LT `mappend` GT
LT
ghci> GT `mappend` LT
GT
ghci> mempty `mappend` LT
LT
ghci> mempty `mappend` GT
GT

· OK, so how is this monoid useful? Let's say you were writing a function that takes two strings, compares their lengths, and returns an Ordering. But if the strings are of the same length, then instead of returning EQ right away, we want to compare them alphabetically. One way to write this would be like so:

lengthCompare :: String -> String -> Ordering
lengthCompare x y = let a = length x `compare` length y
b = x `compare` y
in if a == EQ then b else a

· We name the result of comparing the lengths a and the result of the alphabetical comparison b and then if it turns out that the lengths were equal, we return their alphabetical ordering.

· But by employing our understanding of how Ordering is a monoid, we can rewrite this function in a much simpler manner:

import Data.Monoid
lengthCompare :: String -> String -> Ordering
lengthCompare x y = (length x `compare` length y) `mappend`
(x `compare` y)

ghci> lengthCompare "zen" "ants"
LT
ghci> lengthCompare "zen" "ant"
GT

· Remember, when we use mappend, its left parameter is always kept unless it's EQ, in which case the right one is kept. That's why we put the comparison that we consider to be the first, more important criterion as the first parameter. If we wanted to expand this function to also compare for the number of vowels and set this to be the second most important criterion for comparison, we'd just modify it like this:

import Data.Monoid
lengthCompare :: String -> String -> Ordering
lengthCompare x y = (length x `compare` length y) `mappend`
(vowels x `compare` vowels y) `mappend`
(x `compare` y)
where vowels = length . filter (`elem` "aeiou")

· We made a helper function, which takes a string and tells us how many vowels it has by first filtering it only for letters that are in the string "aeiou" and then applying length to that.

ghci> lengthCompare "zen" "anna"
LT
ghci> lengthCompare "zen" "ana"
LT
ghci> lengthCompare "zen" "ann"
GT

· Ordering monoid allows us to easily compare things by many different criteria and put those criteria in an order themselves, ranging from the most important to the least.

Maybe the monoid

instance Monoid a => Monoid (Maybe a) where
mempty = Nothing
Nothing `mappend` m = m
m `mappend` Nothing = m
Just m1 `mappend` Just m2 = Just (m1 `mappend` m2)

ghci> Nothing `mappend` Just "andy"
Just "andy"
ghci> Just LT `mappend` Nothing
Just LT
ghci> Just (Sum 3) `mappend` Just (Sum 4)
Just (Sum {getSum = 7})

· But what if the type of the contents of the Maybe aren't an instance of Monoid? One thing we can do is to just discard the second value and keep the first one. For this, the First a type exists and this is its definition:

newtype First a = First { getFirst :: Maybe a }
deriving (Eq, Ord, Read, Show)

· We take a Maybe a and we wrap it with a newtype. The Monoid instance is as follows:

instance Monoid (First a) where
mempty = First Nothing
First (Just x) `mappend` _ = First (Just x)
First Nothing `mappend` x = x

ghci> getFirst $ First (Just 'a') `mappend` First (Just 'b')
Just 'a'
ghci> getFirst $ First Nothing `mappend` First (Just 'b')
Just 'b'
ghci> getFirst $ First (Just 'a') `mappend` First Nothing
Just 'a'

· First is useful when we have a bunch of Maybe values and we just want to know if any of them is a Just. The mconcat function comes in handy:

ghci> getFirst . mconcat . map First $ [Nothing, Just 9, Just 10]
Just 9

· If we want a monoid on Maybe a such that the second parameter is kept if both parameters of mappend are Just values,Data.Monoid provides a the Last a type, which works like First a, only the last non-Nothing value is kept when mappending and using mconcat:

ghci> getLast . mconcat . map Last $ [Nothing, Just 9, Just 10]
Just 10
ghci> getLast $ Last (Just "one") `mappend` Last (Just "two")
Just "two"

Using monoids to fold data structures

· One of the more interesting ways to put monoids to work is to make them help us define folds over various data structures. So far, we've only done folds over lists, but lists aren't the only data structure that can be folded over. We can define folds over almost any data structure. Trees especially lend themselves well to folding.

· Because there are so many data structures that work nicely with folds, the Foldable type class was introduced. Much like Functor is for things that can be mapped over, Foldable is for things that can be folded up! It can be found in Data.Foldable and because it exports functions whose names clash with the ones from the Prelude, it's best imported qualified (and served with basil):

import qualified Foldable as F

· foldr, foldl, foldr1 and foldl1.

ghci> :t foldr
foldr :: (a -> b -> b) -> b -> [a] -> b
ghci> :t F.foldr
F.foldr :: (F.Foldable t) => (a -> b -> b) -> b -> t a -> b

[] as Foldable

ghci> foldr (*) 1 [1,2,3]
6
ghci> F.foldr (*) 1 [1,2,3]
6

Maybe as Foldable

ghci> F.foldl (+) 2 (Just 9)
11
ghci> F.foldr (||) False (Just True)
True

Tree as Foldable

data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)

· We said that a tree is either an empty tree that doesn't hold any values or it's a node that holds one value and also two other trees. After defining it, we made it an instance of Functor and with that we gained the ability to fmap functions over it. Now, we're going to make it an instance of Foldable so that we get the ability to fold it up. One way to make a type constructor an instance of Foldable is to just directly implement foldr for it. But another, often much easier way, is to implement the foldMap function, which is also a part of the Foldable type class. The foldMap function has the following type:

foldMap :: (Monoid m, Foldable t) => (a -> m) -> t a -> m

· Its first parameter is a function that takes a value of the type that our foldable structure contains (denoted here with a) and returns a monoid value. Its second parameter is a foldable structure that contains values of type a. It maps that function over the foldable structure, thus producing a foldable structure that contains monoid values. Then, by doing mappend between those monoid values, it joins them all into a single monoid value. This function may sound kind of odd at the moment, but we'll see that it's very easy to implement. What's also cool is that implementing this function is all it takes for our type to be made an instance of Foldable. So if we just implement foldMap for some type, we get foldr and foldl on that type for free!

· This is how we make Tree an instance of Foldable:

instance F.Foldable Tree where
foldMap f Empty = mempty
foldMap f (Node x l r) = F.foldMap f l `mappend`
f x `mappend`
F.foldMap f r

· We think like this: if we are provided with a function that takes an element of our tree and returns a monoid value, how do we reduce our whole tree down to one single monoid value? When we were doing fmap over our tree, we applied the function that we were mapping to a node and then we recursively mapped the function over the left sub-tree as well as the right one. Here, we're tasked with not only mapping a function, but with also joining up the results into a single monoid value by using mappend. First we consider the case of the empty tree — a sad and lonely tree that has no values or sub-trees. It doesn't hold any value that we can give to our monoid-making function, so we just say that if our tree is empty, the monoid value it becomes is mempty.

· The case of a non-empty node is a bit more interesting. It contains two sub-trees as well as a value. In this case, we recursively foldMap the same function f over the left and the right sub-trees. Remember, our foldMap results in a single monoid value. We also apply our function f to the value in the node. Now we have three monoid values (two from our sub-trees and one from applying f to the value in the node) and we just have to bang them together into a single value. For this purpose we use mappend, and naturally the left sub-tree comes first, then the node value and then the right sub-tree.

· Notice that we didn't have to provide the function that takes a value and returns a monoid value. We receive that function as a parameter to foldMap and all we have to decide is where to apply that function and how to join up the resulting monoids from it.

· Now that we have a Foldable instance for our tree type, we get foldr and foldl for free! Consider this tree:

testTree = Node 5
(Node 3
(Node 1 Empty Empty)
(Node 6 Empty Empty)
)
(Node 9
(Node 8 Empty Empty)
(Node 10 Empty Empty)
)

· It has 5 at its root and then its left node it has 3 with 1 on the left and 6 on the right. The root's right node has a 9 and then an 8 to its left and a 10 on the far right side. With a Foldable instance, we can do all of the folds that we can do on lists:

ghci> F.foldl (+) 0 testTree
42
ghci> F.foldl (*) 1 testTree
64800

· And also, foldMap isn't only useful for making new instances of Foldable; it comes in handy for reducing our structure to a single monoid value. For instance, if we want to know if any number in our tree is equal to 3, we can do this:

ghci> getAny $ F.foldMap (\x -> Any $ x == 3) testTree
True

· Here, \x -> Any $ x == 3 is a function that takes a number and returns a monoid value, namely a Bool wrapped in Any.foldMap applies this function to every element in our tree and then reduces the resulting monoids into a single monoid withmappend. If we do this:

ghci> getAny $ F.foldMap (\x -> Any $ x > 15) testTree
False

· All of the nodes in our tree would hold the value Any False after having the function in the lambda applied to them. But to end up True, mappend for Any has to have at least one True value as a parameter. That's why the final result is False, which makes sense because no value in our tree is greater than 15.

· We can also easily turn our tree into a list by doing a foldMap with the \x -> [x] function. By first projecting that function onto our tree, each element becomes a singleton list. The mappend action that takes place between all those singleton list results in a single list that holds all of the elements that are in our tree:

ghci> F.foldMap (\x -> [x]) testTree
[1,3,6,5,8,9,10]

· What's cool is that all of these trick aren't limited to trees, they work on any instance of Foldable.

Notes:

newtype

· http://www.haskell.org/haskellwiki/Newtype

· http://www.haskell.org/tutorial/moretypes.html

relation between newtype, data, and type declarations:

· http://www.haskell.org/onlinereport/decls.html#datatype-renaming

· Unlike algebraic datatypes, the newtype constructor N is unlifted, so thatN _|_ is the same as _|_.

Monoid

http://www.haskell.org/haskellwiki/Monoid

Foldable

http://www.haskell.org/haskellwiki/Foldable_and_Traversable

A Foldable type is also a container (although the class does not technically require Functor, interesting Foldables are all Functors). It is a container with the added property that its items can be 'folded' to a summary value. In other words, it is a type which supports "foldr". Once you support foldr, of course, you can be turned into a list, by using toList = foldr (:) []. This means that all Foldables have a representation as a list, but the order of the items may or may not have any particular significance. However, if a Foldable is also a Functor, parametricity and the Functor law guarantee that toList and fmap commute. Further, in the case of Data.Sequence, there is a well defined order and it is exposed as expected by

toList. A particular kind of fold well-used by Haskell programmers is mapM_, which is a kind of fold over (>>), and Foldable provides this along with the related sequence_.

Matt's Haskell Learnings

Monday, March 5, 2012

LYAH Chapter 11c Monoids