Columbia University

ECBM E4040 Neural Networks and Deep Learning. Fall 2020.

Name : Yi-Pei Chan

ECBM E4040 - Task 3: Convolutional Neural Network (CNN)

In this task, you are going to first practice the forward/backward propagation of convolutional operations with NumPy. After that, we will introduce TensorFlow, which you will use to create a CNN model for an image classification task.

CNNs:

This is a good post describing CNNs:

https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/

Convolutional neural networks (CNNs) are highly effective for image processing tasks.

Remember that when one builds an MLP model, each connection has its own weight. When the input dimension or the first layer is large, a huge matrix is needed to store the weights. This quickly becomes a problem in image processing, since the dimension of a vectorized image easily exceeds 1000: even a low-resolution 32×32×3 CIFAR-10 image vectorizes to 3072 values, so a fully connected first layer with just 1024 hidden units would already need over 3 million weights.

In a CNN, the weights are shared: the same filter (also known as the 'weights' or 'kernel') moves over the input, and at each position an output value is calculated. The same small set of weights is applied repeatedly across the entire input, which saves a lot of memory.

Illustration of a CNN (image source: here)

Convolution: In the picture above, the input is a 7-by-7 image, and the filter is shown as a blue 3-by-3 grid. The filter overlaps with the top-left corner of the input, and we perform an element-wise multiplication followed by a summation, then put the sum into the output matrix. The filter then slides to the right by the stride (one or more pixels), covering a new input area so that a new sum can be derived.
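To make this concrete, here is a tiny NumPy snippet (the input and filter values are made up for illustration) that computes two neighbouring output values exactly as described: element-wise multiplication of the overlapping patch with the filter, followed by a summation.

    import numpy as np

    x = np.arange(49, dtype=float).reshape(7, 7)   # made-up 7x7 input image
    w = np.ones((3, 3))                            # made-up 3x3 filter

    # Output at position (0, 0): overlap the filter with the top-left 3x3 patch,
    # multiply element-wise, then sum.
    out_00 = np.sum(x[0:3, 0:3] * w)

    # Slide the filter one pixel to the right (stride 1) for the next output value.
    out_01 = np.sum(x[0:3, 1:4] * w)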

Training: One thing to remember is that each layer of a CNN has many filters, and the goal of training is to find the best filters for your task. Each filter tries to capture one specific feature. Typically, in the first convolutional layer, which looks directly at the input, the filters capture information about color and edges, which we know as local features; in higher layers, due to the effect of max pooling, the receptive fields of the filters become larger, so more global and complex features can be detected.

Architecture: For classification tasks, a CNN usually starts with convolution followed by max pooling. After that, the feature maps are flattened so that fully connected layers can be appended. Common activation functions include ReLU and ELU in the convolution layers, and softmax in the final fully connected layer (to calculate the classification scores).


Terminology

Part 1: Getting a sense of convolution

conv2d feedforward

Implement a NumPy naive 2-D convolution feedforward function. We ask you to simply do the element-wise multiplication and summation. Do not worry about the efficiency of your functions. Use as many loops as you like.

__TODO:__ Complete the function conv2d_forward in utils/layer_funcs.py. After that, run the following cell blocks in the Jupyter notebook, which will give the output of your convolution function. Detailed instructions have been given in the comments of layer_funcs.py. The instructors will look at the output to give credit for this task.
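As a reference point, here is a minimal sketch of such a naive forward pass. It assumes an input batch of shape (N, H, W, C), filters of shape (f, f, C, F), a bias of shape (F,), a single stride and zero padding; the actual signature and shape conventions you must follow are the ones described in the comments of layer_funcs.py.

    import numpy as np

    def naive_conv2d_forward(x, w, b, pad, stride):
        """Naive 2-D convolution forward pass (illustrative sketch only).

        Assumed shapes (the conventions in layer_funcs.py may differ):
          x: (N, H, W, C)   input batch
          w: (f, f, C, F)   F filters of size f x f
          b: (F,)           one bias per filter
        """
        N, H, W, C = x.shape
        f, _, _, F = w.shape
        x_pad = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode='constant')
        H_out = (H + 2 * pad - f) // stride + 1
        W_out = (W + 2 * pad - f) // stride + 1
        out = np.zeros((N, H_out, W_out, F))
        for n in range(N):              # each image in the batch
            for i in range(H_out):      # each vertical position
                for j in range(W_out):  # each horizontal position
                    patch = x_pad[n, i*stride:i*stride+f, j*stride:j*stride+f, :]
                    for k in range(F):  # each filter
                        out[n, i, j, k] = np.sum(patch * w[:, :, :, k]) + b[k]
        return out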

conv2d backpropagation (optional, bonus +10 points)

This function is optional, but a bonus 10 points will be given if you solve it correctly.

Implement a NumPy naive 2-D convolution backpropagation function. Again, don't worry about the efficiency.

__TODO:__ Complete the function conv2d_backward in utils/layer_funcs.py. After that, run the following cell blocks, which will give the output of your backpropagation. Detailed instructions have been given in the comments of layer_funcs.py. We will judge your output to give you credit.
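If you attempt the bonus, the backward pass follows the same loops as the forward pass: each output value came from one patch and one filter, so its upstream gradient is routed back to both. Continuing the sketch above with the same assumed shapes (dout has shape (N, H_out, W_out, F)):

    def naive_conv2d_backward(dout, x, w, b, pad, stride):
        """Naive 2-D convolution backward pass (illustrative sketch only)."""
        N, H, W, C = x.shape
        f, _, _, F = w.shape
        _, H_out, W_out, _ = dout.shape
        x_pad = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode='constant')
        dx_pad = np.zeros_like(x_pad)
        dw = np.zeros_like(w)
        db = np.zeros_like(b)
        for n in range(N):
            for i in range(H_out):
                for j in range(W_out):
                    patch = x_pad[n, i*stride:i*stride+f, j*stride:j*stride+f, :]
                    for k in range(F):
                        # out[n, i, j, k] = sum(patch * w[..., k]) + b[k]
                        dw[:, :, :, k] += patch * dout[n, i, j, k]
                        dx_pad[n, i*stride:i*stride+f, j*stride:j*stride+f, :] += \
                            w[:, :, :, k] * dout[n, i, j, k]
                        db[k] += dout[n, i, j, k]
        dx = dx_pad[:, pad:pad+H, pad:pad+W, :]   # strip the zero padding
        return dx, dw, db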

max pool feedforward

Implement a NumPy naive max pool feedforward function. We ask you to simply find the max in your pooling window. Again, you don't need to worry about the efficiency of your function. Use as many loops as you like.

__TODO:__ Finish the function max_pool_forward in utils/layer_funcs.py. After that, run the following cell blocks, which will give the output of your max pool function. Detailed instructions have been given in the comments of layer_funcs.py. We will judge your output to give you credit.
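A minimal sketch of such a naive max-pool forward pass, assuming an input of shape (N, H, W, C), a square pooling window and no padding (again, follow the conventions stated in layer_funcs.py):

    def naive_max_pool_forward(x, pool_size, stride):
        """Naive max pooling forward pass (illustrative sketch only)."""
        N, H, W, C = x.shape
        H_out = (H - pool_size) // stride + 1
        W_out = (W - pool_size) // stride + 1
        out = np.zeros((N, H_out, W_out, C))
        for n in range(N):
            for i in range(H_out):
                for j in range(W_out):
                    for c in range(C):
                        window = x[n, i*stride:i*stride+pool_size,
                                   j*stride:j*stride+pool_size, c]
                        out[n, i, j, c] = np.max(window)   # the max in the pooling window
        return out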

max pool backpropagation (optional, bonus +10 points)

This function is optional, but a bonus 10 points will be given if you solve it correctly.

Implement a NumPy naive max pooling backpropagation function. Again, don't worry about the efficiency.

__TODO:__ Finish the function max_pool_backward in utils/layer_funcs.py. After that, run the following cell blocks, which will give the output of your backpropagation. Detailed instructions have been given in the comments of layer_funcs.py. We will judge your output to give you credit.
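For the bonus, remember that max pooling only passes the maximum through, so the upstream gradient flows back only to the position that held the maximum in each window. A sketch under the same assumptions as the forward pass above:

    def naive_max_pool_backward(dout, x, pool_size, stride):
        """Naive max pooling backward pass (illustrative sketch only)."""
        N, H, W, C = x.shape
        _, H_out, W_out, _ = dout.shape
        dx = np.zeros_like(x)
        for n in range(N):
            for i in range(H_out):
                for j in range(W_out):
                    for c in range(C):
                        window = x[n, i*stride:i*stride+pool_size,
                                   j*stride:j*stride+pool_size, c]
                        mask = (window == np.max(window))   # 1 at the max, 0 elsewhere
                        dx[n, i*stride:i*stride+pool_size,
                           j*stride:j*stride+pool_size, c] += mask * dout[n, i, j, c]
        return dx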

Part 2: TensorFlow CNN

In this part we will construct the CNN in TensorFlow. We will implement a CNN similar to the LeNet structure.

TensorFlow offers many useful resources and functions that help developers build the net in a high-level fashion, such as the functions in the layer module. However, for this homework we will build the network ourselves for better understanding. By utilizing the functions in tf.nn that exist for neural network structuring and training, we can build our own layers and network modules rather quickly.

Also, we will introduce a visualization tool called TensorBoard. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data that passes through it.

Resources and References:

Quick guide for Tensorboard

TensorBoard is a powerful tool provided by TensorFlow. It allows developers to check their graph and the trends of parameters. This guide will give you a basic understanding of how to set up the TensorBoard graph in your code, how to start TensorBoard on your local machine/GCP instance, and how to access the interface.

For complete instructions, check the official guide on Tensorflow web site here.

How to start tensorboard

Local

To start your Tensorboard on your local machine, you need to specify a log directory for the service to fetch the graph. For example, in your command line, type:

$ tensorboard --logdir="~/log"

Then, Tensorboard will start running. By default, it will be running on port 6006:

TensorBoard 1.8.0 at http://localhost:6006 (Press CTRL+C to quit)

Once TensorBoard is running, you can visit http://localhost:6006 in your browser and you should see the main page of TensorBoard. If the page looks like the one below, TensorBoard is running correctly. The warning it reports is due to the lack of event files, but we can leave it for now.

Tensorboard_1

GCP

Setting up TensorBoard on GCP is the same as above. However, we are not able to open the TensorBoard UI directly in our browser. In order to visit the page through the local browser, we need to forward a port of our local machine to the port on GCP, similar to what we did previously for Jupyter Notebook.

In the command line on your local machine, type:

$ gcloud compute ssh --ssh-flag="-L 9999:localhost:9999 -L 9998:localhost:6006" "ecbm4040@YOUR_INSTANCE"

This binds ports on your local machine to ports on the GCP instance. In this case, your local port 9999 is bound to port 9999 on GCP, while local port 9998 is bound to port 6006 on GCP. You can use whatever ports you like, as long as they do not conflict with your local services.

After connecting to GCP using the command, you will be able to see the result page.

Export Tensorboard events into log directory

To write summary data for visualization in TensorBoard, we use the tf.summary module. It saves your network graph structure and all the variable summaries.

For example, the following code from LeNet_trainer.py sets up summary writers and logs the loss and accuracy for each epoch. These data will be displayed in TensorBoard later on.

# ... previous code ...
# ...
    # Set up summary writers to write the summaries to disk in separate log directories:
    def summary(self):
        self.current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
        self.train_log_dir = 'logs/gradient_tape/' + self.current_time + '/train'
        self.test_log_dir = 'logs/gradient_tape/' + self.current_time + '/test'
        self.train_summary_writer = tf.summary.create_file_writer(self.train_log_dir)
        self.test_summary_writer = tf.summary.create_file_writer(self.test_log_dir)

    # Use tf.summary.scalar() to log metrics (loss and accuracy) during training/testing
    # within the scope of the summary writers, so that the summaries are written to disk.
    def train_epoch(self, epoch):
        ...previous code...
        for images, labels in train_ds:
            self.train_step(images, labels)
        with self.train_summary_writer.as_default():
            tf.summary.scalar('loss', self.train_loss.result(), step=epoch)
            tf.summary.scalar('accuracy', self.train_accuracy.result(), step=epoch)

        for test_images, test_labels in test_ds:
            self.test_step(test_images, test_labels)

        with self.test_summary_writer.as_default():
            tf.summary.scalar('loss', self.test_loss.result(), step=epoch)
            tf.summary.scalar('accuracy', self.test_accuracy.result(), step=epoch)
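If you want to check the logging mechanism in isolation first, the following minimal standalone sketch (assuming TensorFlow 2.x; the logged values are placeholders, not real training metrics) writes a few scalar summaries that TensorBoard will pick up from logs/gradient_tape:

    import tensorflow as tf

    writer = tf.summary.create_file_writer('logs/gradient_tape/demo/train')
    for epoch in range(3):
        fake_loss = 1.0 / (epoch + 1)      # placeholder value, for illustration only
        with writer.as_default():
            tf.summary.scalar('loss', fake_loss, step=epoch)
    writer.flush()
    # Then run: tensorboard --logdir="logs/gradient_tape"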

Check the graph and summary in Tensorboard

After executing the program once, you should be able to see the metrics displayed in TensorBoard.

Tensorboard_2

Also, you can zoom in, zoom out, or click into a layer block to check all the variables and tensor operations in the graph, and inspect the trends and distributions of those variables in the Scalars, Distributions and Histograms tabs. You may explore TensorBoard by yourself and take advantage of it for debugging the network structure.

__TODO:__ You will build your own CNN model with a structure similar to LeNet, show the model graph in TensorBoard, and get a model with 90% or higher accuracy using the data we provide you.

Example code is included in utils/neuralnets/cnn/model_LeNet.py. This sample is meant as a guideline for how to build a neural net model in TensorFlow. Feel free to copy or utilize the code we give you.

__TODO:__

  1. Edit the file utils/neuralnets/cnn/my_model_LeNet.py. Create your own CNN based on the LeNet structure to achieve at least 90% test accuracy.
  2. Print out the training process and the best validation accuracy, and save the .meta model in the model/ folder.
  3. Attach a screenshot of your TensorBoard graph in the markdown cell below. Double-click the cell and replace the example image with your own image. Here is a Markdown Cheatsheet that may also help.

Hint:

  1. You can copy and edit the code from model_LeNet.py.
  2. A Sequential implementation (e.g., using tf.keras.Sequential) is also an option; a minimal sketch is shown below.
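For orientation only, here is a minimal sketch of a LeNet-style model built with tf.keras.Sequential. It assumes 32×32×3 inputs and 10 classes; the layer sizes are illustrative rather than the required architecture, and your graded model should live in my_model_LeNet.py.

    import tensorflow as tf

    def build_lenet_like(input_shape=(32, 32, 3), num_classes=10):
        # conv -> pool -> conv -> pool -> flatten -> fully connected, as in LeNet
        return tf.keras.Sequential([
            tf.keras.layers.Conv2D(6, 5, activation='relu', input_shape=input_shape),
            tf.keras.layers.MaxPooling2D(2),
            tf.keras.layers.Conv2D(16, 5, activation='relu'),
            tf.keras.layers.MaxPooling2D(2),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(120, activation='relu'),
            tf.keras.layers.Dense(84, activation='relu'),
            tf.keras.layers.Dense(num_classes, activation='softmax'),
        ])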

Note: Don't forget to set up a firewall rule for TensorBoard, just like we did for Jupyter Notebook. This time the port is 6006.

__TODO:__ Show accuracy and attach the tensorboard graph.

[Insert your screenshot here.]

results_from_tensorboard_cnn.png