Grenade
=======

[![Build Status](https://api.travis-ci.org/HuwCampbell/grenade.svg?branch=master)](https://travis-ci.org/HuwCampbell/grenade)
[![Hackage page (downloads and API reference)][hackage-png]][hackage]
[![Hackage-Deps][hackage-deps-png]][hackage-deps]


```
First shalt thou take out the Holy Pin, then shalt thou count to three, no more, no less.
Three shall be the number thou shalt count, and the number of the counting shall be three.
Four shalt thou not count, neither count thou two, excepting that thou then proceed to three.
Five is right out.
```

💣 Machine learning which might blow up in your face 💣

Grenade is a composable, dependently typed, practical, and fast recurrent neural network library
for concise and precise specifications of complex networks in Haskell.

As an example, a network which can achieve ~1.5% error on MNIST can be
specified and initialised with random weights in a few lines of code with

```haskell
type MNIST
  = Network
    '[ Convolution 1 10 5 5 1 1, Pooling 2 2 2 2, Relu
     , Convolution 10 16 5 5 1 1, Pooling 2 2 2 2, Reshape, Relu
     , FullyConnected 256 80, Logit, FullyConnected 80 10, Logit]
    '[ 'D2 28 28
     , 'D3 24 24 10, 'D3 12 12 10, 'D3 12 12 10
     , 'D3 8 8 16, 'D3 4 4 16, 'D1 256, 'D1 256
     , 'D1 80, 'D1 80, 'D1 10, 'D1 10]

randomMnist :: MonadRandom m => m MNIST
randomMnist = randomNetwork
```

And that's it. Because the types are so rich, there's no specific term-level code
required to construct this network, although it is of course possible and
easy to construct and deconstruct the networks and layers explicitly oneself.
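To see why no term-level code is needed, it can help to look at a stripped-down analogue of this type-indexed style in plain GHC. The following is an illustration of the idea only, not Grenade's actual API; `Pipeline`, `Last`, and `run` are names invented here:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)

-- A toy pipeline whose type index records every intermediate type,
-- in the same spirit as Grenade's Network (illustration only).
data Pipeline :: [Type] -> Type where
  PNil  :: Pipeline '[a]
  (:~>) :: (a -> b) -> Pipeline (b ': cs) -> Pipeline (a ': b ': cs)
infixr 5 :~>

-- Last element of a non-empty type-level list.
type family Last (ts :: [Type]) :: Type where
  Last '[a]           = a
  Last (a ': b ': cs) = Last (b ': cs)

-- Run the pipeline from its first type to its last.
run :: Pipeline (t ': ts) -> t -> Last (t ': ts)
run PNil         x = x
run (f :~> rest) x = run rest (f x)

-- Every intermediate type (Int, then Double, then String) is visible
-- in the index; a mismatched stage is rejected by the type checker.
example :: Pipeline '[Int, Double, String]
example = fromIntegral :~> show :~> PNil

main :: IO ()
main = putStrLn (run example 3)
```

A stage with mismatched types simply fails to compile, which is the same kind of guarantee Grenade's `Network` type provides for layer shapes.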
If recurrent neural networks are more your style, you can try defining something
["unreasonably effective"](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
with

```haskell
type Shakespeare
  = RecurrentNetwork
    '[ R (LSTM 40 80), R (LSTM 80 40), F (FullyConnected 40 40), F Logit]
    '[ 'D1 40, 'D1 80, 'D1 40, 'D1 40, 'D1 40 ]
```

Design
------

Networks in Grenade can be thought of as heterogeneous lists of layers, where
a network's type includes not only its layers, but also the shapes of
data that are passed between them.

The definition of a network is surprisingly simple:

```haskell
data Network :: [*] -> [Shape] -> * where
    NNil  :: SingI i
          => Network '[] '[i]

    (:~>) :: (SingI i, SingI h, Layer x i h)
          => !x
          -> !(Network xs (h ': hs))
          -> Network (x ': xs) (i ': h ': hs)
```

The `Layer x i o` constraint ensures that the layer `x` can sensibly perform a
transformation between the input and output shapes `i` and `o`.

The lifted data kind `Shape` defines our one, two, and three dimensional types, used
to declare what shape of data is passed between the layers.

In the MNIST example above, the input layer can be seen to be two dimensional
(`D2`), an image with 28 by 28 pixels. When the first *Convolution* layer runs, it
outputs a three dimensional (`D3`) 24x24x10 image. The last item in the list is
one dimensional (`D1`) with 10 values, representing the categories of the MNIST
data.

Usage
-----

To perform back propagation, one can call the eponymous function

```haskell
backPropagate :: forall shapes layers.
                 Network layers shapes -> S (Head shapes) -> S (Last shapes) -> Gradients layers
```

which takes a network, appropriate input and target data, and returns the
back propagated gradients for the network.
The shapes of the gradients are
appropriate for each layer, and may be trivial for layers like `Relu`, which
have no learnable parameters.

The gradients can, however, always be applied, yielding a new (hopefully better)
layer, with

```haskell
applyUpdate :: LearningParameters -> Network ls ss -> Gradients ls -> Network ls ss
```

Layers in Grenade are represented as Haskell classes, so creating one's own is
easy in downstream code. If the shapes of a network are not specified correctly
and a layer cannot sensibly perform the operation between two shapes, then
it will result in a compile time error.

Composition
-----------

Networks and layers in Grenade are easily composed at the type level. As a `Network`
is itself an instance of `Layer`, one can easily use a trained network as a small
component in a larger network. Furthermore, we provide two layers which are designed
to run layers in parallel and merge their output, either by concatenating them across
one dimension or by pointwise adding their activations. This allows one to
write any network which can be expressed as a
[series parallel graph](https://en.wikipedia.org/wiki/Series-parallel_graph).

A residual network layer specification, for instance, could be written as

```haskell
type Residual net = Merge Trivial net
```

If the type `net` is an instance of `Layer`, then `Residual net` will be too. It will
run the network while retaining its input by passing it through the `Trivial` layer,
and merge the original image with the output.

See the [MNIST](https://github.com/HuwCampbell/grenade/blob/master/examples/main/mnist.hs)
example, which has been overengineered to contain both residual style learning as well
as inception style convolutions.

Generative Adversarial Networks
-------------------------------

As Grenade is purely functional, one can compose its training functions in flexible
ways.
The [GAN-MNIST](https://github.com/HuwCampbell/grenade/blob/master/examples/main/gan-mnist.hs)
example displays an interesting, type safe way of writing a generative adversarial
training function in 10 lines of code.

Layer Zoo
---------

Grenade layers are normal Haskell data types which are instances of `Layer`, so
it's easy to build one's own in downstream code. We do, however, provide a decent set
of layers, including convolution, deconvolution, pooling, pad, crop, logit, relu,
elu, tanh, and fully connected.

Build Instructions
------------------

Grenade is most easily built with the [mafia](https://github.com/ambiata/mafia)
script that is located in the repository. You will also need the `lapack` and
`blas` libraries and development tools. Once you have all that, Grenade can be
built using:

```
./mafia build
```

and the tests run using:

```
./mafia test
```

Grenade builds with GHC 7.10, 8.0, 8.2, and 8.4.

Thanks
------

Writing a library like this has been on my mind for a while now, but a big shout
out must go to [Justin Le](https://github.com/mstksg), whose
[dependently typed fully connected network](https://blog.jle.im/entry/practical-dependent-types-in-haskell-1.html)
inspired me to get cracking, gave me many ideas for the type level tools I
needed, and was a great starting point for writing this library.

Performance
-----------

Grenade is backed by hmatrix, BLAS, and LAPACK, with critical functions optimised
in C. Using the im2col trick popularised by Caffe, it should be sufficient for
many problems.

Being purely functional, it should also be easy to run batches in parallel, which
would be appropriate for larger networks; my current examples, however, are single
threaded.
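The batch-parallel idea can be sketched with plain `base` concurrency. Here `grad` is a hypothetical stand-in for `backPropagate` (the gradient of a one-parameter linear model under squared error); nothing in this sketch is Grenade's API:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Hypothetical stand-in for a per-sample gradient function: the
-- gradient of (w*x - y)^2 with respect to w, for one (x, y) sample.
grad :: Double -> (Double, Double) -> Double
grad w (x, y) = 2 * (w * x - y) * x

-- Force one gradient per sample on its own thread, then average.
-- Purity means the per-sample computations are trivially independent.
batchGradient :: Double -> [(Double, Double)] -> IO Double
batchGradient w samples = do
  vars <- forM samples $ \s -> do
    v <- newEmptyMVar
    _ <- forkIO (putMVar v $! grad w s)
    return v
  gs <- mapM takeMVar vars
  return (sum gs / fromIntegral (length gs))

main :: IO ()
main = do
  g <- batchGradient 1.0 [(1, 2), (2, 4), (3, 6)]
  print g  -- negative: the average gradient pulls w up towards 2
```

With Grenade one would compute `Gradients` per sample the same way and fold them into `applyUpdate`; in practice a strategy library such as `parMap` from `parallel` would be a more idiomatic choice than raw `forkIO`.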
Training 15 generations over Kaggle's 41000 sample MNIST training set on a single
core took around 12 minutes, achieving a 1.5% error rate on a 1000 sample holdout set.

Contributing
------------

Contributions are welcome.

 [hackage]: http://hackage.haskell.org/package/grenade
 [hackage-png]: http://img.shields.io/hackage/v/grenade.svg
 [hackage-deps]: http://packdeps.haskellers.com/reverse/grenade
 [hackage-deps-png]: https://img.shields.io/hackage-deps/v/grenade.svg