In depth: a detailed walkthrough of a DNN handwritten digit recognition code demo


Previous chapter: In depth - neural network (6) Data augmentation and fine-tuning


In this section, we walk through a code demonstration of handwritten digit recognition with a neural network.


Download the project on github: MNIST Pro project


6. Code project demonstration


(1) Preface

We have covered a lot of deep neural network theory, and you probably know that it comes down to training and testing. But how do you actually build a project? I expect many readers are still unsure how to turn the theory into code. Next, I will use a simple handwritten digit project to give you a brief walkthrough.

(2) First, clarify the project requirements

The requirement is to recognize images of handwritten Arabic numerals from 0 to 9, such as the numbers on an invoice. Handwritten digit recognition goes back to 1989, and the convolutional neural network line of work that produced LeNet-5 was used by banks in the United States to read the handwritten numbers on checks. A convolutional network recognizes digits better than the plain deep neural network (DNN) used here; we will cover convolutional neural networks later. Training a network is, of course, inseparable from data, so we first download the data, which has been uploaded to Baidu cloud disk for you: link: extraction code: qfj6.

(3) Build the project

The project structure is as follows:

The model above reached an accuracy of 0.9614 after a casual training run of 10 epochs. I previously ran 100 epochs and got an accuracy above 0.98.


(4) Dependency environment and

Dependent environment:

pip install numpy==1.16
pip install easydict
conda install tensorflow-gpu==1.13.1 # TensorFlow 2.0 is not recommended here; it has many pitfalls

The installation of TensorFlow is explained in detail in my earlier post, Fragmented part -- Installation of tensorflow gpu version. If you have not installed it yet, see that post for the steps.

file

# mnist_pro
DNN handwritten digit prediction, 2020-02-06
- Project download address:
- Please go to Baidu cloud disk to download the training data required for the project:
- Links: extraction code: qfj6 

## Parameter setting
- Before training or prediction, the parameters need to be set
- Open the configuration file, where the parameters and paths are set

## Training model
- Run the training script; it is simple, just right-click and run it directly
- The training effect is as follows:
- acc_train: 0.90625
- y_perd: [7 2 1 0 4]
- y_true: [7 2 1 0 4]
- epoch: 10, acc_test: 0.9613999724388123
- epoch: 10, acc_test_2: 0.9606000185012817
- These are the results of a casual training run. For better results, train for more epochs
- You can also add early stopping yourself; that is not a problem

## Prediction
- Run the prediction script; it is simple, just right-click and run it directly
- After it runs, some prediction results are printed on the console
- The prediction results look like this:
- predicted value: [7 2 1 0 4]
- True value: [7 2 1 0 4]

## TensorBoard log
- The advantage of TensorBoard logs is that they update in real time, so you can watch the curves while training.
- In a cmd command window, enter the following command:
- tensorboard --logdir=G:\work_space\python_space\pro2018_space\wandao\mnist_pro\logs\mnist_log_train --host=localhost
- --logdir= is followed by the path of the log folder.
- --host= specifies the IP. If you omit it, you have to use the machine's own address instead of localhost.
- Open the TensorBoard log in Google Chrome: http://localhost:6006/
- The test log is opened the same way, so I won't go into details.
- Other metrics such as ROC curves or mAP are not computed here; we will do them in a later project.


The files and code follow; the explanations are in the code comments.

(5) .

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# ============================================
# @Time     : 2020/02/05 13:51
# @Author   : WanDaoYi
# @FileName :
# ============================================

from easydict import EasyDict as edict
import os

__C = edict()

cfg = __C

# Common options
__C.COMMON = edict()
# Absolute path of the directory containing this file, which makes it easy to run the project from the Windows command prompt
__C.COMMON.BASE_PATH = os.path.abspath(os.path.dirname(__file__))
# # Use the current working directory instead. Switch to this on Linux, or an error will be reported (it also works on Windows)
# __C.COMMON.BASE_PATH = os.getcwd()

__C.COMMON.DATA_PATH = os.path.join(__C.COMMON.BASE_PATH, "dataset")

# Number of output nodes of hidden layer

# Training configuration
__C.TRAIN = edict()

# Learning rate
# batch_size
# Iteration times

# Model save path; a relative path keeps the project portable
__C.TRAIN.MODEL_SAVE_PATH = "./checkpoint/model_"
# Dropout keep probability; 0.7 keeps 70% of the nodes

# Test configuration
__C.TEST = edict()

# Test model save path
__C.TEST.CKPT_MODEL_SAVE_PATH = "./checkpoint/model_acc=0.961400.ckpt-10"

# Log configuration
__C.LOG = edict()
# Log save path; train or test is appended to it, for example mnist_log_train
__C.LOG.LOG_SAVE_PATH = "./logs/mnist_log_"
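
The other files read a few settings whose values do not appear in the listing above: cfg.COMMON.HIDDEN_1 to HIDDEN_3, cfg.TRAIN.LEARNING_RATE, cfg.TRAIN.BATCH_SIZE, cfg.TRAIN.N_EPOCH and cfg.TRAIN.KEEP_PROB_DROPOUT. Here is a minimal sketch of how they could be set. The layer widths, learning rate and batch size are placeholder assumptions, not the author's values; N_EPOCH = 10 matches the 10 epochs in the training log, and KEEP_PROB_DROPOUT = 0.7 follows the dropout comment. The name cfg_sketch and the fallback class are hypothetical and exist only so the sketch runs without easydict installed.

```python
try:
    from easydict import EasyDict as edict
except ImportError:
    # Hypothetical stand-in with attribute access, used only if easydict is missing
    class edict(dict):
        __getattr__ = dict.__getitem__
        __setattr__ = dict.__setitem__

cfg_sketch = edict()

cfg_sketch.COMMON = edict()
# Number of output nodes of each hidden layer (assumed values)
cfg_sketch.COMMON.HIDDEN_1 = 256
cfg_sketch.COMMON.HIDDEN_2 = 128
cfg_sketch.COMMON.HIDDEN_3 = 64

cfg_sketch.TRAIN = edict()
# Learning rate (assumed value)
cfg_sketch.TRAIN.LEARNING_RATE = 0.001
# Batch size (assumed value)
cfg_sketch.TRAIN.BATCH_SIZE = 64
# Number of epochs; 10 matches the training log in the README
cfg_sketch.TRAIN.N_EPOCH = 10
# Dropout keep probability; 0.7 follows the comment in the config listing
cfg_sketch.TRAIN.KEEP_PROB_DROPOUT = 0.7

print(cfg_sketch.TRAIN.N_EPOCH, cfg_sketch.TRAIN.KEEP_PROB_DROPOUT)
```

In the real config file these assignments would go on __C, next to the existing paths.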


(6) .

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# ============================================
# @Time     : 2020/02/05 14:09
# @Author   : WanDaoYi
# @FileName :
# ============================================

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from config import cfg
import numpy as np

class Common(object):

    def __init__(self):
        # Data path
        self.data_file_path = cfg.COMMON.DATA_PATH

        self.hidden_01 = cfg.COMMON.HIDDEN_1
        self.hidden_02 = cfg.COMMON.HIDDEN_2
        self.hidden_03 = cfg.COMMON.HIDDEN_3


    # Read data
    def read_data(self):
        # Data download address:
        mnist_data = input_data.read_data_sets(self.data_file_path, one_hot=True)
        train_image = mnist_data.train.images
        train_label = mnist_data.train.labels
        _, n_feature = train_image.shape
        _, n_label = train_label.shape

        return mnist_data, n_feature, n_label

    # Batch normalization
    def layer_bn(self, input_data, is_training, name, momentum=0.999, eps=1e-3):
        """
        :param input_data: input data
        :param is_training: True when training, False otherwise
        :param name: name of the layer
        :param momentum: momentum for the moving averages
        :param eps: small constant added to the variance to avoid division by zero
        :return: the batch-normalized tensor
        """
        return tf.layers.batch_normalization(inputs=input_data, training=is_training,
                                             name=name, momentum=momentum,
                                             epsilon=eps)

    # dropout processing
    def deal_dropout(self, hidden_layer, keep_prob):
        with tf.name_scope("dropout"):
            tf.summary.scalar('dropout_keep_probability', keep_prob)
            dropped = tf.nn.dropout(hidden_layer, keep_prob)
            tf.summary.histogram('dropped', dropped)
            return dropped

    # Neural network layer
    def neural_layer(self, x, n_neuron, name, activation=None):
        # Group all of this layer's ops under one name scope; the scope is optional
        with tf.name_scope(name=name):
            n_input = int(x.get_shape()[1])
            stddev = 2 / np.sqrt(n_input)

            # w in this layer is a two-dimensional array; each neuron has its own set of w parameters
            # A truncated normal distribution yields smaller values than a regular normal distribution,
            # so there are no very large weights, which keeps training slow and steady
            # Using this standard deviation makes convergence faster
            # w must be initialized randomly rather than to 0; otherwise every output is 0 and the updates have little effect
            with tf.name_scope("weights"):
                init_w = tf.truncated_normal((n_input, n_neuron), stddev=stddev)
                w = tf.Variable(init_w, name="weight")

            with tf.name_scope("biases"):
                b = tf.Variable(tf.zeros([n_neuron]), name="bias")
            with tf.name_scope("wx_plus_b"):
                z = tf.matmul(x, w) + b
                tf.summary.histogram('pre_activations', z)

            if activation == "relu":
                activation_result = tf.nn.relu(z)
                tf.summary.histogram('activation_result', activation_result)
                return activation_result
            else:
                return z

    def dnn_layer(self, x, n_label, keep_prob):
        # Hidden layer
        with tf.name_scope("dnn"):
            # Pay attention to matrix matching here
            x_scale = self.layer_bn(x, is_training=True, name="x_bn")
            hidden_1 = self.neural_layer(x_scale, self.hidden_01, "hidden_01", activation="relu")
            dropped_hidden_1 = self.deal_dropout(hidden_1, keep_prob)

            hidden_scale_1 = self.layer_bn(dropped_hidden_1, is_training=True, name="hidden_bn_1")
            hidden_2 = self.neural_layer(hidden_scale_1, self.hidden_02, "hidden_02", activation="relu")
            dropped_hidden_2 = self.deal_dropout(hidden_2, keep_prob)

            hidden_scale_2 = self.layer_bn(dropped_hidden_2, is_training=True, name="hidden_bn_2")
            hidden_3 = self.neural_layer(hidden_scale_2, self.hidden_03, "hidden_03", activation="relu")
            dropped_hidden_3 = self.deal_dropout(hidden_3, keep_prob)

            hidden_scale_3 = self.layer_bn(dropped_hidden_3, is_training=True, name="hidden_bn_3")
            logits = self.neural_layer(hidden_scale_3, n_label, name="logits")

            return logits

    # Summary helper for a Variable: computes its mean, stddev, max and min
    # tf.summary.scalar records and summarizes the scalar values
    # tf.summary.histogram records the histogram of the variable directly
    def variable_summaries(self, param):
        with tf.name_scope('summaries'):
            mean = tf.reduce_mean(param)
            tf.summary.scalar('mean', mean)
            with tf.name_scope('stddev'):
                stddev = tf.sqrt(tf.reduce_mean(tf.square(param - mean)))
            tf.summary.scalar('stddev', stddev)
            tf.summary.scalar('max', tf.reduce_max(param))
            tf.summary.scalar('min', tf.reduce_min(param))
            tf.summary.histogram('histogram', param)
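
To make the arithmetic in neural_layer and deal_dropout concrete, here is a minimal plain-Python sketch of the same steps, with no TensorFlow involved. It assumes the behavior of tf.truncated_normal (values more than 2 standard deviations from the mean are re-drawn) and tf.nn.dropout (kept values are scaled by 1 / keep_prob). The function names are hypothetical and are not part of the project.

```python
import math
import random

def truncated_normal(stddev, rng):
    """Draw from N(0, stddev^2), re-drawing any value beyond 2 standard deviations."""
    while True:
        value = rng.gauss(0.0, stddev)
        if abs(value) <= 2.0 * stddev:
            return value

def init_layer(n_input, n_neuron, rng):
    """Weights of shape (n_input, n_neuron) with stddev = 2 / sqrt(n_input), biases at 0."""
    stddev = 2.0 / math.sqrt(n_input)
    weights = [[truncated_normal(stddev, rng) for _ in range(n_neuron)]
               for _ in range(n_input)]
    biases = [0.0] * n_neuron
    return weights, biases

def dense_relu(x_row, weights, biases):
    """Compute relu(x . w + b) for a single input row."""
    z = [sum(x * w_row[j] for x, w_row in zip(x_row, weights)) + biases[j]
         for j in range(len(biases))]
    return [max(0.0, v) for v in z]

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale the kept ones by 1 / keep_prob."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
weights, biases = init_layer(784, 4, rng)
hidden = dense_relu([1.0] * 784, weights, biases)
print(dropout(hidden, 0.7, rng))   # training: about 30% of the values are zeroed
print(dropout(hidden, 1.0, rng))   # testing: keep_prob = 1.0 leaves the values unchanged
```

For a 784-dimensional input the standard deviation is 2 / 28, about 0.071, so no initial weight exceeds about 0.143 in magnitude.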


(7) Training code

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# ============================================
# @Time     : 2020/02/05 13:52
# @Author   : WanDaoYi
# @FileName :
# ============================================

from datetime import datetime
import tensorflow as tf
from config import cfg
from core.common import Common
import numpy as np

class MnistTrain(object):

    def __init__(self):
        # Model save path
        self.model_save_path = cfg.TRAIN.MODEL_SAVE_PATH
        self.log_path = cfg.LOG.LOG_SAVE_PATH

        self.learning_rate = cfg.TRAIN.LEARNING_RATE
        self.batch_size = cfg.TRAIN.BATCH_SIZE
        self.n_epoch = cfg.TRAIN.N_EPOCH

        self.common = Common()
        # Read data and dimensions
        self.mnist_data, self.n_feature, self.n_label = self.common.read_data()

        # Build the graph: input placeholders
        with tf.name_scope(name="input_data"):
            self.x = tf.placeholder(dtype=tf.float32, shape=(None, self.n_feature), name="input_data")
            self.y = tf.placeholder(dtype=tf.float32, shape=(None, self.n_label), name="input_labels")

        with tf.name_scope(name="input_shape"):
            # Reshape the 784-dimensional vectors back into images and save them to a summary node
            # -1 stands for the number of incoming images, 28 and 28 are the image height and width, and 1 is the number of color channels
            image_shaped_input = tf.reshape(self.x, [-1, 28, 28, 1])
            tf.summary.image('input', image_shaped_input, self.n_label)

        self.keep_prob_dropout = cfg.TRAIN.KEEP_PROB_DROPOUT
        self.keep_prob = tf.placeholder(tf.float32)

        # Get the output of the last layer
        self.logits = self.common.dnn_layer(self.x, self.n_label, self.keep_prob_dropout)

        # self.config = tf.ConfigProto()
        # self.config.gpu_options.allow_growth = True
        # self.sess = tf.Session(config=self.config)
        self.sess = tf.InteractiveSession()
        # Save training model
        self.saver = tf.train.Saver()

    # Build the feed dict
    def feed_dict(self, train_flag=True):
        # Training samples
        if train_flag:
            # Get the next batch of samples
            x_data, y_data = self.mnist_data.train.next_batch(self.batch_size)
            keep_prob = self.keep_prob_dropout
        # Validation samples
        else:
            x_data, y_data = self.mnist_data.test.images, self.mnist_data.test.labels
            keep_prob = 1.0
        return {self.x: x_data, self.y: y_data, self.keep_prob: keep_prob}

    # train
    def do_train(self):

        # Calculate loss
        with tf.name_scope("train_loss"):
            # softmax_cross_entropy_with_logits expects one-hot encoded labels
            # sparse_softmax_cross_entropy_with_logits expects labels without one-hot encoding, i.e. the class indices 0-9
            cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=self.y, logits=self.logits)
            loss = tf.reduce_mean(cross_entropy, name="loss")
            tf.summary.scalar("cross_entropy", loss)

        # Build optimizer
        with tf.name_scope("optimizer"):
            optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate)
            training_op = optimizer.minimize(loss=loss)

        # Compare predictions with labels to get the accuracy
        # tf.nn.in_top_k only handles labels without one-hot encoding: it checks whether the largest logit matches the class in y and returns a vector of True/False values
        # correct = tf.nn.in_top_k(logits, y, 1)
        # With one-hot encoded labels, compare the argmax of the logits with the argmax of the labels
        with tf.name_scope("accuracy"):
            correct = tf.equal(tf.argmax(self.logits, 1), tf.argmax(self.y, 1))
            acc = tf.reduce_mean(tf.cast(correct, tf.float32))
            tf.summary.scalar("accuracy", acc)

        # We defined many tf.summary operations above, and running them one by one would be tedious,
        # so use tf.summary.merge_all() to collect all summary operations for later execution
        merged = tf.summary.merge_all()

        # Define two tf.summary.FileWriter file recorders and different subdirectories to store the training and test log data respectively
        # At the same time, the Session calculation graph sess.graph is added to the training process so that it can be displayed in the graphics window of TensorBoard
        train_writer = tf.summary.FileWriter(self.log_path + 'train', self.sess.graph)
        test_writer = tf.summary.FileWriter(self.log_path + 'test')

        # Initialize the variables
        init_variable = tf.global_variables_initializer()
        self.sess.run(init_variable)

        test_acc = None

        for epoch in range(self.n_epoch):
            # Total number of training samples
            batch_number = self.mnist_data.train.num_examples
            # Number of batches per epoch
            size_number = int(batch_number / self.batch_size)

            for number in range(size_number):
                summary, _ = self.sess.run([merged, training_op], feed_dict=self.feed_dict())
                # Global step across all epochs
                i = epoch * size_number + number + 1
                train_writer.add_summary(summary, i)

                if number == size_number - 1:
                    # Get next batch of samples
                    x_batch, y_batch = self.mnist_data.train.next_batch(self.batch_size)
                    acc_train = acc.eval(feed_dict={self.x: x_batch, self.y: y_batch})
                    print("acc_train: {}".format(acc_train))

            # test
            output = self.logits.eval(feed_dict={self.x: self.mnist_data.test.images})
            y_perd = np.argmax(output, axis=1)
            print("y_perd: {}".format(y_perd[: 5]))
            y_true = np.argmax(self.mnist_data.test.labels, axis=1)
            print("y_true: {}".format(y_true[: 5]))

            # Verification method 1
            acc_test = acc.eval(feed_dict={self.x: self.mnist_data.test.images,
                                           self.y: self.mnist_data.test.labels})

            print("epoch: {}, acc_test: {}".format(epoch + 1, acc_test))

            # Verification method 2; either method can be used
            test_summary, acc_test_2 = self.sess.run([merged, acc], feed_dict=self.feed_dict(False))
            print("epoch: {}, acc_test_2: {}".format(epoch + 1, acc_test_2))
            test_writer.add_summary(test_summary, epoch + 1)

            test_acc = acc_test

        save_path = self.model_save_path + "acc={:.6f}".format(test_acc) + ".ckpt"
        # Save the model
        self.saver.save(self.sess, save_path, global_step=self.n_epoch)


if __name__ == "__main__":
    # Code start time
    start_time = datetime.now()
    print("start time: {}".format(start_time))

    demo = MnistTrain()
    demo.do_train()

    # Code end time
    end_time = datetime.now()
    print("End time: {}, training time: {}".format(end_time, end_time - start_time))
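
The comments in do_train distinguish one-hot labels from class-index labels for both the loss and the accuracy. Here is a small plain-Python sketch of that arithmetic, assuming the standard softmax cross-entropy definition. The function names are hypothetical, and the logits are made-up numbers for illustration, not outputs of this model.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_one_hot(logits, one_hot_label):
    """Loss for a one-hot label, as softmax_cross_entropy_with_logits expects."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))

def cross_entropy_sparse(logits, class_index):
    """Loss for a class-index label, as the sparse variant expects."""
    return -math.log(softmax(logits)[class_index])

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(batch_logits, batch_one_hot):
    """Fraction of rows where the argmax of the logits equals the argmax of the label."""
    hits = [argmax(l) == argmax(y) for l, y in zip(batch_logits, batch_one_hot)]
    return sum(hits) / len(hits)

logits = [2.0, 1.0, 0.1]
print(cross_entropy_one_hot(logits, [1, 0, 0]))  # same value as the line below
print(cross_entropy_sparse(logits, 0))
print(accuracy([[2.0, 1.0, 0.1], [0.1, 0.2, 3.0]], [[1, 0, 0], [0, 1, 0]]))  # 0.5
```

Both loss functions give the same number for the same sample; they differ only in the label format they accept.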


(8) Test code

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# ============================================
# @Time     : 2020/02/05 13:52
# @Author   : WanDaoYi
# @FileName :
# ============================================

from datetime import datetime
import tensorflow as tf
import numpy as np
from config import cfg
from core.common import Common

class MnistTest(object):

    def __init__(self):

        self.common = Common()
        # Read data and dimensions
        self.mnist_data, self.n_feature, self.n_label = self.common.read_data()

        # ckpt model
        self.test_ckpt_model = cfg.TEST.CKPT_MODEL_SAVE_PATH
        print("test_ckpt_model: {}".format(self.test_ckpt_model))

        # tf.reset_default_graph()
        # Build the graph: input placeholders
        with tf.name_scope(name="input"):
            self.x = tf.placeholder(dtype=tf.float32, shape=(None, self.n_feature), name="input_data")
            self.y = tf.placeholder(dtype=tf.float32, shape=(None, self.n_label), name="input_labels")

        # Get the output of the last layer
        self.logits = self.common.dnn_layer(self.x, self.n_label, 1)

    # Test with ckpt model
    def do_ckpt_test(self):

        saver = tf.train.Saver()

        with tf.Session() as sess:
            # Restore the model
            saver.restore(sess, self.test_ckpt_model)

            # Predict
            output = self.logits.eval(feed_dict={self.x: self.mnist_data.test.images})

            # Convert the logits to predicted digits
            y_perd = np.argmax(output, axis=1)
            print("predicted value: {}".format(y_perd[: 5]))

            # True value
            y_true = np.argmax(self.mnist_data.test.labels, axis=1)
            print("True value: {}".format(y_true[: 5]))


if __name__ == "__main__":
    # Code start time
    start_time = datetime.now()
    print("start time: {}".format(start_time))

    demo = MnistTest()
    # Test with ckpt model
    demo.do_ckpt_test()

    # Code end time
    end_time = datetime.now()
    print("End time: {}, test time: {}".format(end_time, end_time - start_time))
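
The checkpoint path in cfg.TEST.CKPT_MODEL_SAVE_PATH has to match the name that training produced. As a rough sketch, this is how that name is composed: the training code formats the test accuracy into save_path, and tf.train.Saver.save appends "-" plus the global_step value. The helper name checkpoint_prefix is hypothetical; the inputs are taken from the config and the training log.

```python
def checkpoint_prefix(model_save_path, test_acc, global_step):
    """Rebuild the checkpoint prefix written at the end of training."""
    # Same formatting as in the training code
    save_path = model_save_path + "acc={:.6f}".format(test_acc) + ".ckpt"
    # Saver.save appends "-<global_step>" when global_step is given
    return "{}-{}".format(save_path, global_step)

print(checkpoint_prefix("./checkpoint/model_", 0.9613999724388123, 10))
# ./checkpoint/model_acc=0.961400.ckpt-10
```

If you retrain and the accuracy changes, the file name changes too, so update the test path in the config accordingly.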


(9) Log operation

After training has generated a log, open a cmd command window.

Enter the command: tensorboard --logdir=path of the log folder --host=localhost

You need to pass --host=localhost to TensorBoard in order to open the page at localhost. Otherwise, you can only open it by copying the machine address shown in the command window. TensorBoard's port number is 6006, which reads like "goog" (as in Google) turned upside down. Speaking of which, TensorBoard should be opened in Google Chrome; I tried Firefox and Sogou, and neither could open it.

To open the TensorBoard log from my own training run:

tensorboard --logdir=G:\work_space\python_space\pro2018_space\wandao\mnist_pro\logs\mnist_log_train --host=localhost

Then open http://localhost:6006 in Google Chrome.
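
If you would rather not type the absolute log path by hand, the command can be assembled from the project directory. This is a minimal sketch: the helper name tensorboard_command is hypothetical, and it assumes the logs live under logs/mnist_log_train or logs/mnist_log_test inside the project, as the log path in the config suggests.

```python
import os

def tensorboard_command(base_dir, run="train", host="localhost"):
    """Build the TensorBoard command line for the train or test log folder."""
    log_dir = os.path.join(base_dir, "logs", "mnist_log_" + run)
    return ["tensorboard", "--logdir=" + log_dir, "--host=" + host]

# Print the command for the current working directory
print(" ".join(tensorboard_command(os.getcwd())))
```

The returned list can also be passed to subprocess.run to launch TensorBoard from Python.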


In this project, I did not use early stopping or fine-tuning; both will be used in later projects. Later projects will also evaluate accuracy with metrics such as mAP. If you are interested in improving this project, you can try it yourself first.
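
For readers who want to try early stopping, here is a minimal sketch of the usual patience-based rule: stop once the monitored accuracy has not improved for a given number of epochs. The class name, the patience and min_delta parameters, and the accuracy values are all assumptions made for illustration; they are not part of this project or results from it. In practice the monitored value should come from a validation set rather than the test set.

```python
class EarlyStopping(object):
    """Stop training when accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_acc = None
        self.best_epoch = None
        self.bad_epochs = 0

    def update(self, epoch, acc):
        """Record one epoch and return True when training should stop."""
        if self.best_acc is None or acc > self.best_acc + self.min_delta:
            # New best result: remember it and reset the counter
            self.best_acc = acc
            self.best_epoch = epoch
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up accuracy history, one value per epoch
history = [0.90, 0.95, 0.96, 0.958, 0.959, 0.97]
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, acc in enumerate(history, start=1):
    if stopper.update(epoch, acc):
        stopped_at = epoch
        break

print("stopped at epoch {}, best epoch {}, best acc {}".format(
    stopped_at, stopper.best_epoch, stopper.best_acc))
```

In the training loop of this project, update would be called once per epoch after the accuracy is computed, and the checkpoint would be saved whenever a new best accuracy is recorded.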


If anything here falls short, I welcome your feedback.





Tags: network Google Python Windows

Posted on Thu, 06 Feb 2020 03:28:06 -0800 by kanth1