# This lecture:

- How to load and prepare data
- How to create a baseline neural network
- How to use scikit-learn and k-fold cross validation to evaluate a Keras model
- How data preparation can improve your model's performance
- How adjusting the network topology can improve the model's performance

## 1. Data set description

The sonar dataset distinguishes between rocks and metal objects (mines). The output variable is a string, "M" for mine or "R" for rock, which needs to be converted to the integers 1 and 0.

## 2. Import packages

    import numpy
    import pandas
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasClassifier
    from sklearn.model_selection import cross_val_score  # scores the cross validation results
    from sklearn.preprocessing import LabelEncoder
    from sklearn.model_selection import StratifiedKFold  # for cross validation
    # from sklearn.preprocessing import StandardScaler  # standardization
    # from sklearn.pipeline import Pipeline

## 3. Read data

    seed = 7
    numpy.random.seed(seed)  # always get the same random number sequence
    # Load the dataset into a data frame; header=None: the file has no column-name row
    dataframe = pandas.read_csv("sonar.csv", header=None)
    dataset = dataframe.values
    # Split the columns into 60 input variables (X) and 1 output variable (Y)
    X = dataset[:, 0:60].astype(float)  # 60 input attributes, converted to floating point
    Y = dataset[:, 60]  # 1 output variable of string type ("R"/"M"), converted to 0 and 1 below

We can use the LabelEncoder class from scikit-learn. Its fit() function learns the encoding required from the whole dataset, and its transform() function then applies that encoding to create a new output variable.

    # The class labels are not numeric, so convert them to integer labels
    encoder = LabelEncoder()
    encoder.fit(Y)
    encoded_Y = encoder.transform(Y)  # Y encoded as a new output variable with values 0 and 1
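To make the encoding step concrete, here is a minimal sketch on a few toy labels (the list `labels` is made up for illustration and is not the real sonar column). LabelEncoder stores the classes in sorted order, so with these two labels "M" becomes 0 and "R" becomes 1.

```python
from sklearn.preprocessing import LabelEncoder

# Toy labels for illustration only (not the real sonar output column)
labels = ["R", "M", "M", "R"]

encoder = LabelEncoder()
encoder.fit(labels)                  # learn the set of classes
encoded = encoder.transform(labels)  # map each string to its integer code

# classes_ is kept in sorted order; the index of a class is its integer code
mapping = {str(c): int(i) for i, c in enumerate(encoder.classes_)}
print(mapping)        # {'M': 0, 'R': 1}
print(list(encoded))  # [1, 0, 0, 1]
```

If you need the original strings back, `encoder.inverse_transform(encoded)` reverses the mapping.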

## 4. Create the neural network model with Keras

    # Baseline model
    def create_baseline():
        # Create the model
        model = Sequential()
        # Fully connected hidden layer with 60 neurons
        model.add(Dense(60, input_dim=60, kernel_initializer='normal', activation='relu'))
        model.add(Dense(30, kernel_initializer='normal', activation='sigmoid'))
        model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))
        # Compile the model: binary cross-entropy (logarithmic) loss,
        # Adam optimizer (a gradient descent variant)
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model

## 5. Evaluate the model

We evaluate the model with stratified k-fold cross validation.

10-fold cross validation: the data is divided into 10 parts. In each round, 9 parts are used for training and the remaining part is held out for testing. The parts take turns until each one has been used as the test set exactly once, which gives 10 rounds in total.
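The rotation described above can be checked on a small toy dataset. The sketch below uses 20 made-up samples with two balanced classes (an assumption for illustration) and counts how often each sample lands in a test fold.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy data (hypothetical): 20 samples, 2 features, 2 balanced classes
X_toy = np.arange(40, dtype=float).reshape(20, 2)
y_toy = np.array([0, 1] * 10)

kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)

test_counts = np.zeros(len(y_toy), dtype=int)  # times each sample is tested
fold_sizes = []
for train_idx, test_idx in kfold.split(X_toy, y_toy):
    test_counts[test_idx] += 1
    fold_sizes.append((len(train_idx), len(test_idx)))

print(fold_sizes[0])   # (18, 2): 9 parts for training, 1 part for testing
print(test_counts)     # every entry is 1: each sample is tested exactly once
```

Stratification additionally keeps the class ratio roughly the same in every fold, which matters for small datasets.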

    # Evaluate the baseline model
    # start_time = time.time()
    # epochs=100: 100 passes over the training data (Keras 2 name; older code used nb_epoch)
    estimator = KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)
    # n_splits=10: 10 folds; shuffle=True with random_state=seed makes the split reproducible
    kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    results = cross_val_score(estimator, X, encoded_Y, cv=kfold)
    # Print the mean and standard deviation of the accuracy over the 10 folds;
    # the mean of the 10 results is the final result of one round of cross validation
    print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
    # end_time = time.time()
    # print("time used:", end_time - start_time)

Choosing batch_size: if the training set is small, you can use batch gradient descent directly (the batch size equals the size of the training set). If the number of samples is large, a typical mini-batch size is between 64 and 512. Because of how computer memory is laid out and accessed, the code often runs faster when the mini-batch size is a power of 2. Choose a smaller batch size when memory is limited.
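One way to see what the batch size changes is to count the weight updates per epoch. The sketch below assumes a training fold of 187 samples (a hypothetical figure chosen for illustration) and that the last, smaller batch still counts as one update.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of gradient updates in one pass over the data."""
    return math.ceil(n_samples / batch_size)

n_train = 187  # assumed size of one training fold (hypothetical)

for batch_size in (5, 32, 64, n_train):
    print(batch_size, updates_per_epoch(n_train, batch_size))
# 5 -> 38 updates, 32 -> 6, 64 -> 3, 187 (full batch) -> 1
```

A small batch such as 5 gives many noisy updates per epoch, while the full batch gives a single update computed from all samples.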

Output: the accuracy is quite low. The exact result is not recorded here.

## 6. Rerun the baseline model with data preparation

Use scikit-learn's StandardScaler class to standardize the sonar dataset.

Standardization: rescale the data so that every attribute is on the same scale, with a mean of 0 and a variance of 1.
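As a small illustration of what standardization computes, the sketch below applies the formula (x - mean) / std by hand to a made-up 3x2 array and compares the result with StandardScaler. The array values are arbitrary.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: two attributes on very different scales
data = np.array([[1.0, 200.0],
                 [2.0, 400.0],
                 [3.0, 600.0]])

# Manual standardization, column by column
manual = (data - data.mean(axis=0)) / data.std(axis=0)

# The same operation with scikit-learn
scaled = StandardScaler().fit_transform(data)

print(np.allclose(manual, scaled))  # True
print(scaled.mean(axis=0))          # approximately 0 for each column
print(scaled.std(axis=0))           # approximately 1 for each column
```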

Fit the standardization on the training data of each cross validation pass rather than on the whole dataset, and then use the fitted standardization to prepare the "unseen" test fold. This makes standardization a step of model preparation inside cross validation, and it prevents the algorithm from gaining knowledge of the test set through data preparation during evaluation.

This process is implemented with scikit-learn's Pipeline class. A Pipeline is a wrapper that runs one or more steps in sequence within each pass of the cross validation process. Here, we define a pipeline with a StandardScaler followed by our neural network model.
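The sketch below illustrates why the Pipeline avoids leaking test data. It is not the Keras model: to keep it fast and self-contained it uses LogisticRegression as a stand-in classifier and synthetic data from make_classification (both are assumptions for illustration). It standardizes by hand inside each fold, fitting the scaler on the training part only, and checks that a Pipeline passed to cross_val_score gives the same scores.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption), not the sonar dataset
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=7)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)

# Manual version: the scaler only ever sees the training part of each fold
manual_scores = []
for train_idx, test_idx in kfold.split(X_demo, y_demo):
    scaler = StandardScaler().fit(X_demo[train_idx])
    clf = LogisticRegression(max_iter=1000)  # stand-in for the neural network
    clf.fit(scaler.transform(X_demo[train_idx]), y_demo[train_idx])
    manual_scores.append(clf.score(scaler.transform(X_demo[test_idx]), y_demo[test_idx]))

# Pipeline version: cross_val_score refits both steps inside every fold
pipeline = Pipeline([
    ("standardize", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
scores = cross_val_score(pipeline, X_demo, y_demo, cv=kfold)

print(np.allclose(manual_scores, scores))
```

The Pipeline version is shorter and harder to get wrong, which is why it is the usual choice.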

    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import Pipeline  # Pipeline is the wrapper

    numpy.random.seed(seed)
    estimators = []
    # Standardization step: rescale each attribute to mean 0 and standard deviation 1
    estimators.append(('standardize', StandardScaler()))
    # 'mlp' stands for multi-layer perceptron; this step is the model to train
    # (epochs is the Keras 2 name; older code used nb_epoch)
    estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)))
    # Combine the standardization step and the model into one pipeline
    pipeline = Pipeline(estimators)
    kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)  # 10-fold cross validation
    results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)  # run the cross validation
    print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
    # end_time = time.time()
    # print("time used:", end_time - start_time)

The results show that:

## Optimizing the model

Commenting out the hidden layer with 30 neurons improved the results significantly.

    def create_baseline():
        # Create the model
        model = Sequential()
        model.add(Dense(60, input_dim=60, kernel_initializer='normal', activation='relu'))
        # model.add(Dense(30, kernel_initializer='normal', activation='sigmoid'))
        model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))
        # Compile the model
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model

The results show that:

Note: normally we would evaluate a larger network at this point. Adding another, smaller hidden layer after the first one would usually be expected to improve performance. In this run, however, performance improved when that hidden layer was removed, which may be because cross validation partitions the dataset randomly.

The original topology, 60 inputs -> [60 -> 30] -> 1 output, has become 60 inputs -> [60] -> 1 output, and its performance has improved.
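To get a feel for how much smaller the second topology is, the sketch below counts the trainable parameters (weights plus biases) of a stack of fully connected layers. The helper `dense_param_count` is a hypothetical name written for this illustration, not a Keras function.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for consecutive fully connected layers.

    layer_sizes lists the input width followed by each layer's width.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

original = dense_param_count([60, 60, 30, 1])  # 60 inputs -> [60 -> 30] -> 1
reduced = dense_param_count([60, 60, 1])       # 60 inputs -> [60] -> 1

print(original, reduced)  # 5521 3721
```

With a dataset this small, a model with fewer parameters can be easier to fit, although the parameter count alone does not predict accuracy.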

Evaluate a smaller network:

Limit the representational space of the first hidden layer by reducing it from the 60 neurons of the baseline model to 30. This forces the neural network to extract the most important structure from the input data. The result improves slightly.

    def create_baseline():
        # Create the model
        model = Sequential()
        model.add(Dense(30, input_dim=60, kernel_initializer='normal', activation='relu'))
        model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))
        # Compile the model
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model

The results show that:

So the structure of a neural network affects its performance, but so far there is no reliable rule for choosing a structure that performs better. The only way is to experiment: change the code and parameters, and compare the results.
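The "try more" advice can be turned into a small loop over candidate topologies. The sketch below is only an outline of that idea: it uses scikit-learn's MLPClassifier as a stand-in for the Keras model and synthetic data from make_classification instead of the sonar file (both are assumptions made so the example runs quickly on its own). The names in `topologies` are hypothetical labels.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 200 samples, 20 features
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=7)

# Candidate hidden-layer layouts to compare
topologies = {"60-30": (60, 30), "60": (60,), "30": (30,)}
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=7)

results = {}
for name, hidden in topologies.items():
    pipeline = Pipeline([
        ("standardize", StandardScaler()),
        # MLPClassifier is a stand-in for the Keras network
        ("mlp", MLPClassifier(hidden_layer_sizes=hidden, max_iter=500, random_state=7)),
    ])
    scores = cross_val_score(pipeline, X_demo, y_demo, cv=kfold)
    results[name] = (scores.mean(), scores.std())

for name, (mean, std) in results.items():
    print("%s: %.2f%% (%.2f%%)" % (name, mean * 100, std * 100))
```

Using the same folds for every candidate keeps the comparison fair, and reporting the standard deviation shows whether a difference between topologies is larger than the fold-to-fold noise.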

Original Chinese article: "Using the artificial neural network framework Keras to solve a binary classification problem", translated by Jinkey.