Columbia University

ECBM E4040 Neural Networks and Deep Learning

Yi-Pei Chan

Assignment 2: Multilayer Perceptron (MLP)

This is the second part of the assignment. You will learn how to build a basic fully connected neural network.

Load Data

Part 1: Basic layers

In this part, all the functions will be created from scratch using numpy for better understanding. (In the next task, you will be introduced to built-in layers from TensorFlow.)

Create basic layer functions

Complete functions affine_forward, affine_backward

NOTE: Please do not change the code in the cell below. It will run correctly if your code is right.
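As a reference, here is a minimal numpy sketch of what the affine (fully-connected) pair typically computes, assuming the common `out, cache = forward(...)` / `backward(dout, cache)` convention; the exact signatures in your skeleton code may differ.

```python
import numpy as np

def affine_forward_sketch(x, w, b):
    """Affine forward pass: out = x @ w + b (x flattened to 2-D first)."""
    out = x.reshape(x.shape[0], -1) @ w + b
    cache = (x, w, b)                      # keep inputs for the backward pass
    return out, cache

def affine_backward_sketch(dout, cache):
    """Gradients of the affine transform w.r.t. x, w, and b."""
    x, w, b = cache
    x_flat = x.reshape(x.shape[0], -1)
    dx = (dout @ w.T).reshape(x.shape)     # restore the original input shape
    dw = x_flat.T @ dout
    db = dout.sum(axis=0)                  # bias is broadcast, so sum over the batch
    return dx, dw, db
```

A quick shape check (e.g. a batch of 4 inputs of dimension 3 mapped to 5 outputs) is a good first sanity test before running a numerical gradient check.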

Complete functions relu_forward, relu_backward

NOTE: Please do not change the code in the cell below. It will run correctly if your code is right.

Complete function softmax_loss

NOTE: Please do not change the code in the cell below. It will run correctly if your code is right.
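A common way to implement softmax cross-entropy is shown below as a sketch (the signature is hypothetical; match whatever the skeleton specifies). Subtracting the row-wise max before exponentiating is the standard trick for numerical stability, and the gradient w.r.t. the scores has the well-known "probabilities minus one-hot" form.

```python
import numpy as np

def softmax_loss_sketch(scores, y):
    """Mean cross-entropy loss and gradient w.r.t. the raw class scores.

    scores: (N, C) array of unnormalized scores; y: (N,) integer labels.
    """
    N = scores.shape[0]
    shifted = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    loss = -log_probs[np.arange(N), y].mean()
    dscores = np.exp(log_probs)            # softmax probabilities
    dscores[np.arange(N), y] -= 1          # subtract 1 at the true class
    dscores /= N                           # mean over the batch
    return loss, dscores
```

With uniform scores, the loss should equal log(C), which makes a handy sanity check.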

Create a single layer

Now try to combine an affine function and a nonlinear activation function into a single fully-connected layer. Edit the code in ./utils/layer_utils.py

$$\mathbf{O} = activation(\mathbf{W} \times \mathbf{X} + \mathbf{b})$$

For this assignment, you need to create two types of layers as below. You can get started with the skeleton code in ./utils/layer_utils.py. The basic class structure has been provided, and you need to fill in the "TODO" part(s).

Complete function AffineLayer in ./utils/layer_utils.py

NOTE: Please do not change the code in the cell below. It will run correctly if your code is right.

Complete function DenseLayer

NOTE: Please do not change the code in the cell below. It will run correctly if your code is right.
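To show how the affine transform and the activation compose into one layer, here is a rough sketch of a dense layer. The class and method names are hypothetical; follow the actual skeleton in ./utils/layer_utils.py, which defines its own API.

```python
import numpy as np

class DenseLayerSketch:
    """Affine transform followed by ReLU (illustrative, not the required API)."""

    def __init__(self, input_dim, output_dim, weight_scale=1e-2):
        self.params = {
            "W": weight_scale * np.random.randn(input_dim, output_dim),
            "b": np.zeros(output_dim),
        }
        self.gradients = {}

    def feedforward(self, X):
        self.X = X
        self.A = X @ self.params["W"] + self.params["b"]   # affine pre-activation
        return np.maximum(0, self.A)                       # ReLU

    def backward(self, dout):
        dA = dout * (self.A > 0)                           # ReLU gate
        self.gradients["W"] = self.X.T @ dA
        self.gradients["b"] = dA.sum(axis=0)
        return dA @ self.params["W"].T                     # gradient w.r.t. the input
```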

Part 2: Two Layer Network

Complete the class TwoLayerNet in ./utils/classifiers/twolayernet.py. Through this experiment, you will create a two-layer neural network and learn about the backpropagation mechanism. The network structure is: input >> DenseLayer >> AffineLayer >> softmax loss >> output. Complete the "TODO" part(s).

Class TwoLayerNet:   
    Functions: 
        __init__: GIVEN
        loss: TODO - compute the cross-entropy loss and the gradients w.r.t. all weights and biases.
        step: TODO - perform a single SGD update of all weights and biases.
        predict: TODO - output predicted labels (used to compute classification accuracy) for the input data.

    Variables:
        layers

TODO: Complete class TwoLayerNet in ./utils/classifiers/twolayernet.py

NOTE: Please do not change the code in the cell below. It will run correctly if your code is right.
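To make the input >> DenseLayer >> AffineLayer >> softmax structure concrete, here is a compact end-to-end sketch of such a network with explicit parameters. It is not the required class layout (your TwoLayerNet should build on the layer classes above); all names here are illustrative.

```python
import numpy as np

class TwoLayerNetSketch:
    """input -> affine + ReLU -> affine -> softmax loss (illustrative only)."""

    def __init__(self, input_dim, hidden_dim, num_classes, weight_scale=1e-2):
        self.W1 = weight_scale * np.random.randn(input_dim, hidden_dim)
        self.b1 = np.zeros(hidden_dim)
        self.W2 = weight_scale * np.random.randn(hidden_dim, num_classes)
        self.b2 = np.zeros(num_classes)

    def loss(self, X, y, reg=0.0):
        # Forward: dense (affine + ReLU), then affine, then softmax loss.
        h = np.maximum(0, X @ self.W1 + self.b1)
        scores = h @ self.W2 + self.b2
        shifted = scores - scores.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        N = X.shape[0]
        loss = (-log_probs[np.arange(N), y].mean()
                + 0.5 * reg * ((self.W1 ** 2).sum() + (self.W2 ** 2).sum()))
        # Backward: propagate the softmax gradient through both layers.
        dscores = np.exp(log_probs)
        dscores[np.arange(N), y] -= 1
        dscores /= N
        self.dW2 = h.T @ dscores + reg * self.W2
        self.db2 = dscores.sum(axis=0)
        dh = (dscores @ self.W2.T) * (h > 0)      # ReLU gate
        self.dW1 = X.T @ dh + reg * self.W1
        self.db1 = dh.sum(axis=0)
        return loss

    def step(self, lr=1e-3):
        # Vanilla SGD: move each parameter against its gradient.
        for name in ("W1", "b1", "W2", "b2"):
            setattr(self, name, getattr(self, name) - lr * getattr(self, "d" + name))

    def predict(self, X):
        h = np.maximum(0, X @ self.W1 + self.b1)
        return (h @ self.W2 + self.b2).argmax(axis=1)
```

With small initial weights, the first loss should be close to log(num_classes), and repeated loss/step calls on a small batch should drive it down.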

Train a two-layer network

Import functions for training and testing

Start training

We have provided the train() function in ./utils/train_func.py

Plot training and validation accuracy history of each epoch

SOLUTION (enter a new cell below):

Visualize the weight variable in the first layer.

Visualization of the intermediate weights can help you get an intuitive understanding of how the network works, especially in Convolutional Neural Networks (CNNs).
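One common approach is to reshape each hidden unit's incoming weights back into an image and tile them into a grid. The helper below is a sketch under the assumption that inputs are 32x32x3 images (e.g. CIFAR-10) and that the first-layer weight matrix has shape (3072, hidden_dim); adjust the shapes to your data.

```python
import numpy as np

def weight_grid(W, img_shape=(32, 32, 3), ncols=8, pad=1):
    """Tile first-layer weight columns into one image grid for plt.imshow.

    W: (D, H) array; each column is one hidden unit's weights, assumed to
    reshape to img_shape. Each tile is normalized to [0, 1] independently.
    """
    D, H = W.shape
    nrows = int(np.ceil(H / ncols))
    h, w, c = img_shape
    grid = np.zeros((nrows * (h + pad) - pad, ncols * (w + pad) - pad, c))
    for i in range(H):
        img = W[:, i].reshape(img_shape)
        img = (img - img.min()) / (img.max() - img.min() + 1e-8)  # per-unit normalize
        r, col = divmod(i, ncols)
        grid[r * (h + pad):r * (h + pad) + h,
             col * (w + pad):col * (w + pad) + w] = img
    return grid
```

The result can then be displayed with `plt.imshow(weight_grid(W))` in a new cell.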

Get test accuracy greater than 50%

For this part, you need to train a better two-layer net. The requirement is to get test accuracy better than 50%. If your accuracy is lower, for each 1% lower than 50%, you will lose 5 points.

Here are some recommended methods for improving the performance. Feel free to try any other method as you see fit.

  1. Hyperparameter tuning: reg, hidden_dim, lr, learning_decay, num_epoch, batch_size, weight_scale.
  2. Adjust training strategy: Randomly select a batch of samples rather than selecting them in order.
  3. Try new optimization methods: Now we are using SGD, you can try SGD with momentum, adam, etc.
  4. Early-stopping.
  5. Good (better) initial values for weights in the model.
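Method 2 above (random minibatch selection) can be as simple as sampling indices without replacement each iteration; the helper name below is hypothetical.

```python
import numpy as np

def sample_batch(X, y, batch_size, rng=None):
    """Draw a random minibatch instead of slicing the data in order."""
    rng = rng or np.random.default_rng()
    idx = rng.choice(X.shape[0], size=batch_size, replace=False)
    return X[idx], y[idx]
```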

A comparison between SGD and SGD with momentum.
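The two update rules differ only in that momentum keeps a running velocity that accumulates past gradients, which smooths the updates and often speeds convergence. A minimal sketch of both (function names are illustrative):

```python
def sgd(w, dw, lr):
    """Vanilla SGD: step directly against the gradient."""
    return w - lr * dw

def sgd_momentum(w, dw, v, lr, mu=0.9):
    """SGD with momentum: the velocity v accumulates past gradients."""
    v = mu * v - lr * dw
    return w + v, v
```

On the 1-D quadratic f(w) = 0.5 * w**2 (whose gradient is simply w), both updates drive w toward 0; momentum overshoots and oscillates before settling, which is the behavior a comparison plot would show.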

TODO

SOLUTION (enter a new cell below):

Plot training and validation accuracy of your best model

SOLUTION (enter a new cell below):

Save your best model in a dictionary
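One way to do this is to gather the trained parameters into a plain dict and pickle it. The parameter names below are hypothetical; substitute whatever your TwoLayerNet actually exposes.

```python
import pickle
import numpy as np

# Hypothetical parameters standing in for your trained network's weights.
W1, b1 = np.random.randn(3072, 100), np.zeros(100)
W2, b2 = np.random.randn(100, 10), np.zeros(10)

best_model = {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
with open("best_model.pkl", "wb") as f:
    pickle.dump(best_model, f)

# Later, the dict can be restored to rebuild the network.
with open("best_model.pkl", "rb") as f:
    restored = pickle.load(f)
```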