Overview about MNIST data

Tuesday, June 6, 2017

1. Introduction
2. History and Overview about Artificial Neural Network

3. Single neural network

3.1 Perceptron

3.2 Adaptive Linear Neurons

3.3 Problems with Perceptron (AI Winter)

4. Multi-layer neural network

4.1 Overview about Multi-layer Neural Network
4.2 Forward Propagation
4.3 Cost function
4.4 Backpropagation
4.5 Implement simple Multi-layer Neural Network to solve the problem of Perceptron

4.6 Some optional techniques for Multi-layer Neural Network Optimization

4.7 Multi-layer Neural Network for binary/multi classification

5. Install and using Multi-layer Neural Network to classify MNIST data

5.1 Overview about MNIST data
5.2 Implement Multi-layer Neural Network
5.3 Debugging Neural Network with Gradient Descent Checking

6. Summary
7. References

Overview about MNIST data

The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits collected by Yann LeCun, Corinna Cortes and Christopher J.C. Burges. It consists of 70,000 handwritten digit images of 28x28 pixels, written by 500 different writers, and is divided into two parts: 60,000 images for training and 10,000 images for testing. It is available on Yann LeCun's page, or you can download the files directly from here:

train-images-idx3-ubyte.gz [training set images (9912422 bytes)]
train-labels-idx1-ubyte.gz [training set labels (28881 bytes)]
t10k-images-idx3-ubyte.gz [test set images (1648877 bytes)]
t10k-labels-idx1-ubyte.gz [test set labels (4542 bytes)]

So far, the lowest error rate reported for a Convolutional Neural Network on this data is 0.21%, which is really cool. In this final exercise, we will write some code to implement a "naive" Multi-layer Neural Network that learns from this data and makes predictions on it. Because the data is stored in a binary format, we cannot open it with a regular application, but we can read it with some simple Python code. Let's take a look at the kind of data we will work on:

First 25 images of MNIST data

How to load MNIST data?

Before implementing the Neural Network, let's first write some code to load the MNIST data into features train, features test, labels train and labels test.

### Import some needed libraries

Some notes that may help you avoid struggling with the code snippets below:

"I" is an unsigned integer of 4 bytes. We need "IIII" (16 bytes) to read the header of the image dataset and "II" (8 bytes) to read the header of the label dataset.
">" means big-endian. If you don't know what big-endian is, take a look at the Wikipedia page.
"data_description" is a tuple containing the description of the data: data_description = (magic number, number of images, rows, columns)
".read(16)" in the image-loading code reads the 16-byte header, so the pixel data starts at offset 0016 and runs to the end of the file. The label-loading code works the same way with .read(8).
"dtype=np.uint8" tells NumPy to interpret each byte as one unsigned 8-bit integer, and np.uint8 = 1 byte. Check some other NumPy data types on this page.

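To make the notes above concrete, here is a minimal sketch of how the header of an image file can be unpacked. It builds a tiny in-memory stand-in for an IDX image file (2 images of 3x4 pixels) so that it runs without the real MNIST files. The names raw, header and pixels are made up for this illustration, and np.frombuffer is my own choice for turning the bytes into an array, not necessarily what the original snippet used.

```python
import struct
import numpy as np

# Tiny fake IDX image file: a 16-byte header followed by one byte per pixel.
# 2051 is the magic number of IDX image files (label files use 2049).
header = struct.pack(">IIII", 2051, 2, 3, 4)  # magic number, images, rows, columns
pixels = bytes(range(2 * 3 * 4))              # 24 pixel values: 0, 1, ..., 23
raw = header + pixels

# ">IIII" = big-endian, four unsigned 4-byte integers = the first 16 bytes.
data_description = struct.unpack(">IIII", raw[:16])
magic, n_images, n_rows, n_cols = data_description

# Everything after the header is pixel data, one unsigned byte per pixel.
# Each image is flattened into a row of n_rows * n_cols features.
images = np.frombuffer(raw[16:], dtype=np.uint8).reshape(n_images, n_rows * n_cols)

print(data_description)  # (2051, 2, 3, 4)
print(images.shape)      # (2, 12)
```

For a label file the idea is the same, only with ">II" and the first 8 bytes.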
### Firstly, let's define a function that takes the folder containing the data and returns features_train, features_test, labels_train and labels_test

### Now pass in your own folder on your computer; remember that the folder must contain the 4 data files

Alright, we're done.
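As a rough sketch of the loading steps described above, the code below shows one way such a loader could look. The function names load_idx_images, load_idx_labels and load_mnist are hypothetical, and the sketch assumes the four files have already been unzipped into the folder under their names without the .gz extension. Treat it as an illustration under those assumptions rather than the exact code of this post.

```python
import os
import struct
import numpy as np


def load_idx_images(path):
    """Read an unzipped IDX image file into an array of shape (n_images, rows * cols)."""
    with open(path, "rb") as f:
        # Header: magic number, number of images, rows, columns (16 bytes, big-endian).
        magic, n_images, n_rows, n_cols = struct.unpack(">IIII", f.read(16))
        if magic != 2051:
            raise ValueError("not an IDX image file: %s" % path)
        # The rest of the file is pixel data, one unsigned byte per pixel.
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(n_images, n_rows * n_cols)


def load_idx_labels(path):
    """Read an unzipped IDX label file into an array of shape (n_labels,)."""
    with open(path, "rb") as f:
        # Header: magic number, number of labels (8 bytes, big-endian).
        magic, n_labels = struct.unpack(">II", f.read(8))
        if magic != 2049:
            raise ValueError("not an IDX label file: %s" % path)
        labels = np.frombuffer(f.read(), dtype=np.uint8)
    if labels.shape[0] != n_labels:
        raise ValueError("label count does not match header: %s" % path)
    return labels


def load_mnist(folder):
    """Return features_train, features_test, labels_train, labels_test from a folder."""
    # Assumed file names: the four MNIST files after unzipping.
    features_train = load_idx_images(os.path.join(folder, "train-images-idx3-ubyte"))
    labels_train = load_idx_labels(os.path.join(folder, "train-labels-idx1-ubyte"))
    features_test = load_idx_images(os.path.join(folder, "t10k-images-idx3-ubyte"))
    labels_test = load_idx_labels(os.path.join(folder, "t10k-labels-idx1-ubyte"))
    return features_train, features_test, labels_train, labels_test
```

With the real files in place, calling load_mnist on your folder should return arrays of shape (60000, 784), (10000, 784), (60000,) and (10000,), in that order.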
