Grenade
=======

[Build Status](https://travis-ci.org/HuwCampbell/grenade)
[![Hackage page (downloads and API reference)][hackage-png]][hackage]
[![Hackage-Deps][hackage-deps-png]][hackage-deps]


```
First shalt thou take out the Holy Pin, then shalt thou count to three, no more, no less.
Three shall be the number thou shalt count, and the number of the counting shall be three.
Four shalt thou not count, neither count thou two, excepting that thou then proceed to three.
Five is right out.
```

💥 Machine learning which might blow up in your face 💥

Grenade is a composable, dependently typed, practical, and fast recurrent neural network library
for concise and precise specifications of complex networks in Haskell.

As an example, a network which can achieve ~1.5% error on MNIST can be
specified and initialised with random weights in a few lines of code with
```haskell
type MNIST
  = Network
    '[ Convolution 1 10 5 5 1 1, Pooling 2 2 2 2, Relu
     , Convolution 10 16 5 5 1 1, Pooling 2 2 2 2, Reshape, Relu
     , FullyConnected 256 80, Logit, FullyConnected 80 10, Logit ]
    '[ 'D2 28 28
     , 'D3 24 24 10, 'D3 12 12 10, 'D3 12 12 10
     , 'D3 8 8 16, 'D3 4 4 16, 'D1 256, 'D1 256
     , 'D1 80, 'D1 80, 'D1 10, 'D1 10 ]

randomMnist :: MonadRandom m => m MNIST
randomMnist = randomNetwork
```
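
Once built, running the network forwards is a single function call. Here is a minimal sketch (not from the README) assuming Grenade's `runNet`, which evaluates a network on an input shape and returns the output shape:

```haskell
-- Hypothetical helper: evaluate the network on one 28x28 image,
-- yielding the ten category activations. Assumes Grenade's
--   runNet :: Network layers shapes -> S (Head shapes) -> S (Last shapes)
classify :: MNIST -> S ('D2 28 28) -> S ('D1 10)
classify = runNet
```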

And that's it. Because the types are so rich, there's no term-level code
required to construct this network, although it is of course possible and
easy to construct and deconstruct the networks and layers explicitly oneself.

If recurrent neural networks are more your style, you can try defining something
["unreasonably effective"](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
with
```haskell
type Shakespeare
  = RecurrentNetwork
    '[ R (LSTM 40 80), R (LSTM 80 40), F (FullyConnected 40 40), F Logit ]
    '[ 'D1 40, 'D1 80, 'D1 40, 'D1 40, 'D1 40 ]
```

Design
------

Networks in Grenade can be thought of as heterogeneous lists of layers, where
the type of a network records not only its layers, but also the shapes of
data that are passed between them.

The definition of a network is surprisingly simple:
```haskell
data Network :: [*] -> [Shape] -> * where
  NNil  :: SingI i
        => Network '[] '[i]

  (:~>) :: (SingI i, SingI h, Layer x i h)
        => !x
        -> !(Network xs (h ': hs))
        -> Network (x ': xs) (i ': h ': hs)
```

The `Layer x i o` constraint ensures that the layer `x` can sensibly perform a
transformation between the input and output shapes `i` and `o`.
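
The same idea can be sketched without any Grenade machinery. The following is a self-contained, hypothetical analogue (none of these names are Grenade's): a pipeline of functions whose type index records every intermediate type, so stages only compose when adjacent types agree.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}

-- A dependency-free analogue of Network: a pipeline of functions whose
-- type index lists every intermediate type. Illustrative only; these
-- names are not part of Grenade's API.
data Pipeline :: [*] -> * where
  PNil  :: Pipeline '[i]
  (:~>) :: (i -> h) -> Pipeline (h ': hs) -> Pipeline (i ': h ': hs)

infixr 5 :~>

-- The last type in the index list.
type family Last (xs :: [*]) :: * where
  Last '[x]      = x
  Last (x ': xs) = Last xs

-- Folding the pipeline over an input; the index guarantees each
-- stage's output feeds the next stage's input.
runPipeline :: Pipeline (i ': hs) -> i -> Last (i ': hs)
runPipeline PNil       x = x
runPipeline (f :~> fs) x = runPipeline fs (f x)

example :: Pipeline '[Int, Int, String, Int]
example = (+ 1) :~> show :~> length :~> PNil
```

Here `runPipeline example 41` increments to 42, renders it as the string `"42"`, and takes its length, giving 2; swapping two stages would be a type error, exactly as in `Network`.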

The lifted data kind `Shape` defines our 1, 2, and 3 dimension types, used to
declare what shape of data is passed between the layers.

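That kind is small; it looks roughly like the following (promoted with `DataKinds`, with dimensions as GHC type-level naturals). Check Grenade's source for the authoritative definition.

```haskell
-- Roughly Grenade's Shape kind: each constructor carries its
-- dimensions as type-level naturals, and DataKinds promotes D1, D2,
-- and D3 so they can index types like Network.
data Shape
  = D1 Nat
  | D2 Nat Nat
  | D3 Nat Nat Nat
```
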
In the MNIST example above, the input can be seen to be two-dimensional
(`D2`): an image of 28 by 28 pixels. When the first *Convolution* layer runs, it
outputs a three-dimensional (`D3`) 24x24x10 image. The last item in the list is
one-dimensional (`D1`) with 10 values, representing the categories of the MNIST
data.

Usage
-----

To perform back propagation, one can call the eponymous function
```haskell
backPropagate :: forall shapes layers.
                 Network layers shapes -> S (Head shapes) -> S (Last shapes) -> Gradients layers
```
which takes a network, appropriate input and target data, and returns the
back propagated gradients for the network. The shapes of the gradients are
appropriate for each layer, and may be trivial for layers like `Relu` which
have no learnable parameters.

The gradients, however, can always be applied, yielding a new (hopefully better)
network, with
```haskell
applyUpdate :: LearningParameters -> Network ls ss -> Gradients ls -> Network ls ss
```
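
Composing the two gives a single gradient descent step; a small sketch built only from the signatures quoted above:

```haskell
-- One training step: back propagate a single (input, target) pair,
-- then apply the resulting gradients with the given learning parameters.
trainStep :: LearningParameters
          -> Network layers shapes
          -> S (Head shapes)      -- input
          -> S (Last shapes)      -- target
          -> Network layers shapes
trainStep params net input target =
  applyUpdate params net (backPropagate net input target)
```

Grenade also ships a `train` convenience along these lines, so in practice one rarely writes this by hand.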

Layer behaviour in Grenade is captured by type classes, so creating one's own
layer is easy in downstream code. If the shapes of a network are not specified
correctly and a layer cannot sensibly perform the operation between two shapes,
it will result in a compile-time error.
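
Concretely, a layer is an instance of two classes, whose shape is roughly the following (simplified; treat the exact signatures as indicative rather than authoritative, and consult the source for the full versions):

```haskell
-- Simplified sketch of Grenade's layer classes. UpdateLayer describes
-- how a layer's learnable parameters change under a gradient; Layer
-- describes running it forwards and backwards between two shapes.
class UpdateLayer x where
  type Gradient x :: *
  runUpdate :: LearningParameters -> x -> Gradient x -> x

class UpdateLayer x => Layer x (i :: Shape) (o :: Shape) where
  -- Tape stores whatever the forward pass must remember for backprop.
  type Tape x i o :: *
  runForwards  :: x -> S i -> (Tape x i o, S o)
  runBackwards :: x -> Tape x i o -> S o -> (Gradient x, S i)
```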

Composition
-----------

Networks and layers in Grenade are easily composed at the type level. As a `Network`
is itself an instance of `Layer`, one can easily use a trained network as a small
component in a larger network. Furthermore, we provide two layers which are designed
to run layers in parallel and merge their output (either by concatenating them across
one dimension or by pointwise adding their activations). This allows one to
write any network which can be expressed as a
[series-parallel graph](https://en.wikipedia.org/wiki/Series-parallel_graph).

A residual network layer specification, for instance, could be written as
```haskell
type Residual net = Merge Trivial net
```
If the type `net` is an instance of `Layer`, then `Residual net` will be too. It will
run the network, while retaining its input by passing it through the `Trivial` layer,
and merge the original image with the output.

See the [MNIST](https://github.com/HuwCampbell/grenade/blob/master/examples/main/mnist.hs)
example, which has been overengineered to contain both residual style learning as well
as inception style convolutions.

Generative Adversarial Networks
-------------------------------

As Grenade is purely functional, one can compose its training functions in flexible
ways. The [GAN-MNIST](https://github.com/HuwCampbell/grenade/blob/master/examples/main/gan-mnist.hs)
example displays an interesting, type safe way of writing a generative adversarial
training function in 10 lines of code.

Layer Zoo
---------

Grenade layers are normal Haskell data types which are an instance of `Layer`, so
it's easy to build one's own in downstream code. We do however provide a decent set
of layers, including convolution, deconvolution, pooling, pad, crop, logit, relu,
elu, tanh, and fully connected.

Build Instructions
------------------
Grenade is most easily built with the [mafia](https://github.com/ambiata/mafia)
script that is located in the repository. You will also need the `lapack` and
`blas` libraries and development tools. Once you have all that, Grenade can be
built using:

```
./mafia build
```

and the tests run using:

```
./mafia test
```

Grenade builds with GHC 7.10, 8.0, 8.2, and 8.4.

Thanks
------
Writing a library like this has been on my mind for a while now, but a big shout
out must go to [Justin Le](https://github.com/mstksg), whose
[dependently typed fully connected network](https://blog.jle.im/entry/practical-dependent-types-in-haskell-1.html)
inspired me to get cracking, gave many ideas for the type level tools I
needed, and was a great starting point for writing this library.

Performance
-----------
Grenade is backed by hmatrix, BLAS, and LAPACK, with critical functions optimised
in C. Using the im2col trick popularised by Caffe, it should be sufficient for
many problems.

Being purely functional, it should also be easy to run batches in parallel, which
would be appropriate for larger networks; the current examples, however, are single
threaded.

Training 15 generations over Kaggle's 41000 sample MNIST training set on a single
core took around 12 minutes, achieving a 1.5% error rate on a 1000 sample holdout set.

Contributing
------------
Contributions are welcome.

 [hackage]: http://hackage.haskell.org/package/grenade
 [hackage-png]: http://img.shields.io/hackage/v/grenade.svg
 [hackage-deps]: http://packdeps.haskellers.com/reverse/grenade
 [hackage-deps-png]: https://img.shields.io/hackage-deps/v/grenade.svg