-
Notifications
You must be signed in to change notification settings - Fork 6
/
DList.hs
269 lines (205 loc) · 6.94 KB
/
DList.hs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
{-
---
fulltitle: "In class exercise: Difference lists"
date:
---
-}
module DList where
{-
Motivation
----------
In this exercise, you will use first-class functions to implement an alternative
version of lists, called `DList`s, short for *difference lists*.
`DList`s support O(1) append operations on lists, making them very
useful for append-heavy uses, such as logging and traversing
tree-like data structures in linear time. (An implementation of this data
structure is available on [hackage](http://hackage.haskell.org/package/dlist)
but try to complete it on your own. No peeking!)
See the micro-benchmark section for experiments you can do once you
have completed the implementation.
Implementation
--------------
The key idea for difference lists is to represent them using a
*function* from lists to lists. In otherwords, we will tell Haskell
that the type `[a] -> [a]` can be called a `DList a` for any type
parameter `a`.
-}
type DList a = [a] -> [a]
{-
You can think of a difference list as a data structure where we have "factored
out" the end of the list.
For example, we might write a regular list like this:
-}
list :: [Int]
list = 1 : 2 : 3 : [] -- end is nil
{-
The analogous "difference list" replaces the nil at the end of the list with a
parameter.
-}
dlist :: DList Int
dlist = \x -> 1 : 2 : 3 : x -- end is "x"
{-
This parameterization gives us flexibility. We can always fill in the
parameter with `[]` to get a normal list. However, we can also fill in the
parameter with another list, effectively appending [1, 2, 3] to the beginning
of that other list.
Once we have constructed a `DList`, the *only* way to observe it is to
convert it to a list. This data structure does not support any other form of
pattern matching.
-}
toList :: DList a -> [a]
toList x = x []
{-
See if you can figure out how to define the following standard list operations
for this new type of `DList`s. Remember that `DList a` is just a synonym for `[a] -> [a]`.
-}
-- | Create an empty DList
-- >>> toList empty
-- []
empty :: DList a
empty = undefined
-- | Create a DList containing a single element
-- >>> toList (singleton "a")
-- ["a"]
singleton :: a -> DList a
singleton x = undefined
-- | Append two DLists together
-- >>> toList ((singleton "a") `append` (singleton "b"))
-- ["a","b"]
append :: DList a -> DList a -> DList a
append = undefined
-- | Construct a DList from a head element and tail
-- >>> toList (cons "a" (singleton "b"))
-- ["a","b"]
cons :: a -> DList a -> DList a
cons = undefined
{-
Now write a function to convert a regular list to a `DList` using the above
definitions and `foldr`.
-}
-- | convert a normal list to a DList
-- >>> toList (fromList [1,2,3])
-- [1,2,3]
fromList :: [a] -> DList a
fromList = undefined
{-
Micro-benchmarks
----------------
Remember that you can execute the definitions in this module by loading it
into ghci. In the terminal, you can use the command
stack ghci DList.hs
to automatically start ghci and load the module.
If you'd like to see the difference between using (++) with regular lists and
`append` using DLists, in GHCi you can type
*DList> :set +s
That will cause GHCi to give you timing and allocation information for each
evaluation that you do. Then, after you complete this file, you can test out
these logging micro-benchmarks.
This first example repeatedly appends a single character to its string
parameter with each recursive call.
-}
micro1 :: Char
micro1 = last (t 10000 "")
where
t :: Int -> [Char] -> [Char]
t 0 l = l
t n l = t (n -1) (l ++ "s")
{-
*DList> micro1
's'
(2.80 secs, 4,300,584,976 bytes)
This version does the same, except that this time it uses the `DList` operations.
-}
micro2 :: Char
micro2 = last (toList (t 10000 empty))
where
t :: Int -> DList Char -> DList Char
t 0 l = l
t n l = t (n -1) (l `append` singleton 's')
{-
*DList> micro2
's'
(0.02 secs, 10,359,248 bytes)
Notice how the second version is *much* faster and uses much less memory. Why
is this the case? The `++` operator for lists takes time proportional to its
first argument. So as the `l` argument of `t` grows in length, adding an `s`
to the end of it takes longer and longer. However, the `DList` `append`
operator doesn't have this behavior. It just remembers that we are going to
add an additional character at each step and then constructs the list all at
once with `toList`. Nifty!
We can also see the effect of using difference lists for *defining* a list
reverse function.
For example, consider this version of the list `reverse` function. This function
is easy to understand, but it is O(n^2), not O(n). Can you see why?
-}
naiveReverse :: [a] -> [a]
naiveReverse = rev
where
rev [] = []
rev (x : xs) = rev xs ++ [x]
{-
Let's use a list containing 10,001 integers to micro-benchmark this function.
-}
bigList :: [Int]
bigList = [0 .. 10000]
{-
Don't skip this next step! Let's look at the last element in this list. This
command will force GHCi to evaluate the expression above and allocate the list
into memory. (We don't want our first benchmark below to include time for
constructing this list---we only want to time the reverse operation.)
*DList> last bigList
10000
(0.01 secs, 882,216 bytes)
Let's try to reverse this list. How long does it take? How many bytes? Give it a try.
-}
micro3 :: Int
micro3 = last (naiveReverse bigList)
{-
*DList> micro3
We can dress up the reverse function a bit using `foldr`, `flip` and the
singleton section `(:[])`, but that doesn't really help. It's fundamentally
the same algorithm. (I also find this 'point-free' version much harder to
understand! Try to convince yourself that this definition really is doing the
same thing as `naiveReverse`!)
-}
ivoryTowerReverse :: [a] -> [a]
ivoryTowerReverse = foldr (flip (++) . (: [])) []
{-
But, the microbenchmark shows that this version is doing about the same amount
of work. Try it out.
-}
micro4 :: Int
micro4 = last (ivoryTowerReverse bigList)
{-
*DList> micro4
Now watch what happens when we use a `DList` instead. Compare this definition
with the `naiveReverse` one above. It's still easy to read. All we have done
is replace the standard list operations with the `DList` versions, through a fairly mechanical process.
-}
dlistReverse :: [a] -> [a]
dlistReverse = toList . rev
where
rev [] = empty
rev (x : xs) = rev xs `append` singleton x
micro5 :: Int
micro5 = last (dlistReverse bigList)
{-
*DList> micro5
We can also replace the list operations in ivoryTowerReverse with their DList
analogues, also a mechanical process.
-}
dlistIvoryTowerReverse :: [a] -> [a]
dlistIvoryTowerReverse = toList . foldr (flip append . singleton) empty
micro6 :: Int
micro6 = last (dlistIvoryTowerReverse bigList)
{-
*DList> micro6
(Of course, it is often better to use the standard library definition of
common operations. How does the built-in operation, which has been optimized
for GHC, compare?)
-}
micro7 :: Int
micro7 = last (reverse bigList)
{-
*DList> micro7
-}