Machine and Deep Learning with OCaml Natively
NOTE: this chapter is not finished yet; many places still need fixing.

I will cover the neural network module in this chapter. My original purpose in introducing a neural network module into Owl was two-fold:

  • To test the expressiveness of Owl. A neural network is a useful yet complex tool for building modern analytical applications, so I chose it.
  • To validate my research argument on how to structure modern (distributed) analytical libraries. Namely, the high-level analytical functionality (ML, DNN, optimisation, regression, etc.) should be “glued” to the classic numerical functions via algorithmic differentiation, and the computation should be distributed via a specialised engine providing several well-defined distribution abstractions.

In the end, I needed less than 3k lines of code to implement a quite full-featured neural network module. Now let’s go through what the Neural module offers.

Module Structure

Owl.Neural provides two submodules, S and D, for single-precision and double-precision neural networks respectively. Each submodule contains the following modules, which allow you to work with the structure of the network and fine-tune the training (a combined configuration example follows the list).

  • Graph: create and manipulate the neural network structure.
  • Init: control the initialisation of the weights in the network.
  • Activation: provide a set of frequently used activation functions.
  • Params: maintain a set of training parameters.
  • Batch: the batch parameter of training.
  • Learning_Rate: the learning rate parameter of training.
  • Loss: the loss function parameter of training.
  • Gradient: the gradient method parameter of training.
  • Momentum: the momentum parameter of training.
  • Regularization: the regularization parameter of training.
  • Clipping: the gradient clipping parameter of training.
  • Checkpoint: the checkpoint parameter of training.
  • Parallel: provide parallel computation capability; needs to be composed with the Actor engine. (Experimental; a research project in progress.)
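To give a feel for how these parameter modules combine, here is a minimal sketch of a training configuration. Batch.Mini, Learning_Rate.Adagrad, and the trailing epochs argument also appear later in this chapter; Loss.Cross_entropy and Momentum.Standard are my assumptions about the constructor names, so check them against your Owl version.

let params = Params.config
  ~batch:(Batch.Mini 128)                       (* minibatches of 128 samples *)
  ~learning_rate:(Learning_Rate.Adagrad 0.005)  (* adaptive learning rate *)
  ~loss:Loss.Cross_entropy                      (* assumed constructor name *)
  ~momentum:(Momentum.Standard 0.9)             (* assumed constructor name *)
  10.                                           (* number of epochs *)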
Types of Neuron

I have implemented a set of commonly used neurons in Owl.Neural.Neuron. Each neuron is a standalone module, and adding a new type of neuron is much easier than in TensorFlow or other frameworks, thanks to Owl’s Algodiff module.

Algodiff is the most powerful part of Owl and offers great benefits to the modules built atop it. In the neural network case, we only need to describe the logic of the forward pass without worrying about backward propagation at all, because Algodiff figures it out automatically for us, thus reducing potential errors. This explains why a full-featured neural network module requires less than 3.5k lines of code. Actually, if you are really interested, you can have a look at Owl’s Feedforward Network, which uses only a couple of hundred lines of code to implement a complete feedforward network.
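To see why describing the forward pass alone is enough, here is a minimal sketch using the same Algodiff.S API as the rest of this chapter: we define a scalar function and let diff derive its gradient, with no hand-written backward pass.

open Owl
open Algodiff.S

(* f x = x^2 * sin x; we never write its derivative by hand *)
let f x = Maths.(x * x * sin x)

(* algorithmic differentiation gives us f' directly *)
let f' = diff f

(* evaluate the derivative at x = 1. *)
let _ = Printf.printf "f'(1.) = %g\n" (f' (F 1.) |> unpack_flt)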

In practice, you do not need to use the modules defined in Owl.Neural.Neuron directly. Instead, you should call the functions in the Graph module to create a new neuron and add it to the network. Currently, the Graph module contains the following neurons.

  • input
  • activation
  • linear
  • linear_nobias
  • embedding
  • recurrent
  • lstm
  • gru
  • conv1d
  • conv2d
  • conv3d
  • max_pool1d
  • max_pool2d
  • avg_pool1d
  • avg_pool2d
  • global_max_pool1d
  • global_max_pool2d
  • global_avg_pool1d
  • global_avg_pool2d
  • fully_connected
  • dropout
  • gaussian_noise
  • gaussian_dropout
  • alpha_dropout
  • normalisation
  • reshape
  • flatten
  • lambda
  • add
  • mul
  • dot
  • max
  • average
  • concatenate

These neurons should be sufficient for creating anything from a simple MLP to the most complicated networks such as Google’s Inception.

Training & Inference

Owl provides a very functional way to construct a neural network. You only need to provide the shape of the data in the first node (often the input neuron); then Owl will automatically infer the shape in the downstream nodes, which saves us a lot of effort and significantly reduces potential bugs.

Let’s use the single-precision neural network as an example. To work with single-precision networks, you need to open the following modules (use Neural.D and Algodiff.D for double precision):

open Owl
open Neural.S
open Neural.S.Graph
open Algodiff.S

The code below creates a small convolutional neural network of six layers. Usually, a network definition starts with the input neuron and ends with the get_network function, which finalises and returns the constructed network. We can also see that the input shape is kept as a parameter, so the shapes of the data and the parameters will be inferred later, whenever the input_shape is determined.

let make_network input_shape =
  input input_shape
  |> lambda (fun x -> Maths.(x / F 256.))
  |> conv2d [|5;5;1;32|] [|1;1|] ~act_typ:Activation.Relu
  |> max_pool2d [|2;2|] [|2;2|]
  |> dropout 0.1
  |> fully_connected 1024 ~act_typ:Activation.Relu
  |> linear 10 ~act_typ:Activation.Softmax
  |> get_network

Next, I will show you what the train function looks like. The first three lines in the train function load the MNIST dataset and print the network structure to the terminal. The remaining lines define params, which contains the training parameters such as the batch size, learning rate, and number of epochs to run. In the end, we call Graph.train_cnn to kick off the training process.

let train () =
  let x, _, y = Dataset.load_mnist_train_data_arr () in
  let network = make_network [|28;28;1|] in
  Graph.print network;
  let params = Params.config
    ~batch:(Batch.Mini 100) ~learning_rate:(Learning_Rate.Adagrad 0.005) 2.
  in
  Graph.train_cnn ~params network x y |> ignore

After the training is finished, you can call Graph.model_cnn to generate a functional model to perform inference. Moreover, the Graph module also provides functions such as save, load, print, to_string and so on to help you manipulate the neural network.

let model = Graph.model_cnn network;;
let prediction = model data;;
...
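For instance, a trained network can be saved to disk and restored later. This is only a sketch: the file name is arbitrary, and the exact signatures of save and load should be checked against your Owl version.

Graph.save network "mnist_cnn.model";;         (* serialise the network to a file *)
let network' = Graph.load "mnist_cnn.model";;  (* restore it later *)
Graph.print network';;                         (* confirm the structure survived *)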

You can have a look at Owl’s MNIST CNN example for more details and run the code yourself.

Examples

In the following, I will present several neural networks defined in Owl. All have been included in Owl’s examples and can be run separately. If you are interested in the computation graphs Owl generates for these networks, you can also have a look at this chapter on Algodiff.

Multilayer Perceptron (MLP) for MNIST

let make_network input_shape =
  input input_shape
  |> linear 300 ~act_typ:Activation.Tanh
  |> linear 10 ~act_typ:Activation.Softmax
  |> get_network
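Training this MLP follows the same pattern as the CNN above. The sketch below assumes that Dataset.load_mnist_train_data returns MNIST in flattened matrix form and that Graph.train is the non-CNN counterpart of Graph.train_cnn; both are assumptions to verify against your Owl version.

let train_mlp () =
  (* 784 = 28 x 28 pixels flattened into a vector *)
  let x, _, y = Dataset.load_mnist_train_data () in
  let network = make_network [|784|] in
  Graph.train network x y |> ignore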

Convolutional Neural Network for MNIST

let make_network input_shape =
  input input_shape
  |> lambda (fun x -> Maths.(x / F 256.))
  |> conv2d [|5;5;1;32|] [|1;1|] ~act_typ:Activation.Relu
  |> max_pool2d [|2;2|] [|2;2|]
  |> dropout 0.1
  |> fully_connected 1024 ~act_typ:Activation.Relu
  |> linear 10 ~act_typ:Activation.Softmax
  |> get_network
VGG-like Neural Network for CIFAR

let make_network input_shape =
  input input_shape
  |> normalisation ~decay:0.9
  |> conv2d [|3;3;3;32|] [|1;1|] ~act_typ:Activation.Relu
  |> conv2d [|3;3;32;32|] [|1;1|] ~act_typ:Activation.Relu ~padding:VALID
  |> max_pool2d [|2;2|] [|2;2|] ~padding:VALID
  |> dropout 0.1
  |> conv2d [|3;3;32;64|] [|1;1|] ~act_typ:Activation.Relu
  |> conv2d [|3;3;64;64|] [|1;1|] ~act_typ:Activation.Relu ~padding:VALID
  |> max_pool2d [|2;2|] [|2;2|] ~padding:VALID
  |> dropout 0.1
  |> fully_connected 512 ~act_typ:Activation.Relu
  |> linear 10 ~act_typ:Activation.Softmax
  |> get_network
LSTM Network for Text Generation

let make_network wndsz vocabsz =
  input [|wndsz|]
  |> embedding vocabsz 40
  |> lstm 128
  |> linear 512 ~act_typ:Activation.Relu
  |> linear vocabsz ~act_typ:Activation.Softmax
  |> get_network
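Here wndsz is presumably the length of the sliding input window and vocabsz the size of the character vocabulary. A hypothetical instantiation:

(* hypothetical sizes: a 100-character window over a 65-character vocabulary *)
let network = make_network 100 65
let _ = Graph.print network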

Google’s Inception for Image Classification

let conv2d_bn ?(padding=SAME) kernel stride nn =
  conv2d ~padding kernel stride nn
  |> normalisation ~training:false ~axis:3
  |> activation Activation.Relu

let mix_typ1 in_shape bp_size nn =
  let branch1x1 = conv2d_bn [|1;1;in_shape;64|] [|1;1|] nn in
  let branch5x5 = nn
    |> conv2d_bn [|1;1;in_shape;48|] [|1;1|]
    |> conv2d_bn [|5;5;48;64|] [|1;1|]
  in
  let branch3x3dbl = nn
    |> conv2d_bn [|1;1;in_shape;64|] [|1;1|]
    |> conv2d_bn [|3;3;64;96|] [|1;1|]
    |> conv2d_bn [|3;3;96;96|] [|1;1|]
  in
  let branch_pool = nn
    |> avg_pool2d [|3;3|] [|1;1|]
    |> conv2d_bn [|1;1;in_shape;bp_size|] [|1;1|]
  in
  concatenate 3 [|branch1x1; branch5x5; branch3x3dbl; branch_pool|]

let mix_typ3 nn =
  let branch3x3 = conv2d_bn [|3;3;288;384|] [|2;2|] ~padding:VALID nn in
  let branch3x3dbl = nn
    |> conv2d_bn [|1;1;288;64|] [|1;1|]
    |> conv2d_bn [|3;3;64;96|] [|1;1|]
    |> conv2d_bn [|3;3;96;96|] [|2;2|] ~padding:VALID
  in
  let branch_pool = max_pool2d [|3;3|] [|2;2|] ~padding:VALID nn in
  concatenate 3 [|branch3x3; branch3x3dbl; branch_pool|]

let mix_typ4 size nn =
  let branch1x1 = conv2d_bn [|1;1;768;192|] [|1;1|] nn in
  let branch7x7 = nn
    |> conv2d_bn [|1;1;768;size|] [|1;1|]
    |> conv2d_bn [|1;7;size;size|] [|1;1|]
    |> conv2d_bn [|7;1;size;192|] [|1;1|]
  in
  let branch7x7dbl = nn
    |> conv2d_bn [|1;1;768;size|] [|1;1|]
    |> conv2d_bn [|7;1;size;size|] [|1;1|]
    |> conv2d_bn [|1;7;size;size|] [|1;1|]
    |> conv2d_bn [|7;1;size;size|] [|1;1|]
    |> conv2d_bn [|1;7;size;192|] [|1;1|]
  in
  let branch_pool = nn
    |> avg_pool2d [|3;3|] [|1;1|] ~padding:SAME
    |> conv2d_bn [|1;1;768;192|] [|1;1|]
  in
  concatenate 3 [|branch1x1; branch7x7; branch7x7dbl; branch_pool|]

let mix_typ8 nn =
  let branch3x3 = nn
    |> conv2d_bn [|1;1;768;192|] [|1;1|]
    |> conv2d_bn [|3;3;192;320|] [|2;2|] ~padding:VALID
  in
  let branch7x7x3 = nn
    |> conv2d_bn [|1;1;768;192|] [|1;1|]
    |> conv2d_bn [|1;7;192;192|] [|1;1|]
    |> conv2d_bn [|7;1;192;192|] [|1;1|]
    |> conv2d_bn [|3;3;192;192|] [|2;2|] ~padding:VALID
  in
  let branch_pool = max_pool2d [|3;3|] [|2;2|] ~padding:VALID nn in
  concatenate 3 [|branch3x3; branch7x7x3; branch_pool|]

let mix_typ9 input nn =
  let branch1x1 = conv2d_bn [|1;1;input;320|] [|1;1|] nn in
  let branch3x3 = conv2d_bn [|1;1;input;384|] [|1;1|] nn in
  let branch3x3_1 = branch3x3 |> conv2d_bn [|1;3;384;384|] [|1;1|] in
  let branch3x3_2 = branch3x3 |> conv2d_bn [|3;1;384;384|] [|1;1|] in
  let branch3x3 = concatenate 3 [|branch3x3_1; branch3x3_2|] in
  let branch3x3dbl = nn |> conv2d_bn [|1;1;input;448|] [|1;1|] |> conv2d_bn [|3;3;448;384|] [|1;1|] in
  let branch3x3dbl_1 = branch3x3dbl |> conv2d_bn [|1;3;384;384|] [|1;1|] in
  let branch3x3dbl_2 = branch3x3dbl |> conv2d_bn [|3;1;384;384|] [|1;1|] in
  let branch3x3dbl = concatenate 3 [|branch3x3dbl_1; branch3x3dbl_2|] in
  let branch_pool = nn |> avg_pool2d [|3;3|] [|1;1|] |> conv2d_bn [|1;1;input;192|] [|1;1|] in
  concatenate 3 [|branch1x1; branch3x3; branch3x3dbl; branch_pool|]

let make_network img_size =
  input [|img_size;img_size;3|]
  |> conv2d_bn [|3;3;3;32|] [|2;2|] ~padding:VALID
  |> conv2d_bn [|3;3;32;32|] [|1;1|] ~padding:VALID
  |> conv2d_bn [|3;3;32;64|] [|1;1|]
  |> max_pool2d [|3;3|] [|2;2|] ~padding:VALID
  |> conv2d_bn [|1;1;64;80|] [|1;1|] ~padding:VALID
  |> conv2d_bn [|3;3;80;192|] [|1;1|] ~padding:VALID
  |> max_pool2d [|3;3|] [|2;2|] ~padding:VALID
  |> mix_typ1 192 32
  |> mix_typ1 256 64
  |> mix_typ1 288 64
  |> mix_typ3
  |> mix_typ4 128
  |> mix_typ4 160
  |> mix_typ4 160
  |> mix_typ4 192
  |> mix_typ8
  |> mix_typ9 1280
  |> mix_typ9 2048
  |> global_avg_pool2d
  |> linear 1000 ~act_typ:Activation.Softmax
  |> get_network

let _ = make_network 299 |> print

There is still plenty of room for optimisation, and some new neurons need to be added, e.g. upsampling and transposed convolution. Anyway, things will get better and better.