AI Journey IVs - Neural Networks - Hands-On Practice

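In this lesson, Principal Wang will walk you through writing neural network programs in Python. The complete code for the tutorial is in Principal Wang's GitHub repository: https://github.com/physicso/AICourse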

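Google's TensorFlow Playground (playground.tensorflow.org) includes a neural network demo. The page's code runs in your browser, so you can explore as hard as you like how the network structure affects the training result without worrying about breaking anything.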

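Here I will build a neural network model with Scikit-Learn and another with TensorFlow, and use both to classify the MNIST data. As in the earlier programming lessons, Scikit-Learn gives us an almost foolproof interface, but it runs out of steam when the data gets large. With TensorFlow we need to know the details of the neural network very well and do everything by hand, but a model built this way can handle very, very large data.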

Next, I will show you how to write a simple neural network with Scikit-Learn that classifies the MNIST data in just a few lines of code.

First, we import some libraries and read in the training and test data:

# Python 2; on Python 3 use "import pickle" and pass encoding="latin1" to pickle.load
import cPickle as pickle
import gzip
from sklearn.neural_network import MLPClassifier

def read_data(data_file):
    # The file holds three (images, labels) pairs: train, validation, test.
    f = gzip.open(data_file, "rb")
    train, val, test = pickle.load(f)
    f.close()
    train_x = train[0]
    train_y = train[1]
    test_x = test[0]
    test_y = test[1]
    return train_x, train_y, test_x, test_y

data_file = "mnist.pkl.gz"
print('Reading training and testing data...')
X_train, y_train, X_test, y_test = read_data(data_file)
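To make the file layout concrete, here is a minimal sketch that writes a tiny stand-in file and reads it back with the same unpacking as read_data. It assumes the layout that the unpacking implies, three (images, labels) pairs. The helper write_toy_dataset and the toy values are made up for illustration, and plain lists stand in for the arrays in the real file.

```python
import gzip
import os
import pickle
import tempfile


def write_toy_dataset(path):
    """Write a tiny stand-in file with the layout read_data expects (hypothetical helper)."""
    # Each split is an (images, labels) pair; a flattened MNIST image has 784 pixels.
    train = ([[0.0] * 784, [1.0] * 784], [3, 7])
    val = ([[0.5] * 784], [1])
    test = ([[0.25] * 784], [9])
    with gzip.open(path, "wb") as f:
        pickle.dump((train, val, test), f)


def read_data(data_file):
    """Same unpacking as in the lesson; the validation split is read but not returned."""
    with gzip.open(data_file, "rb") as f:
        train, val, test = pickle.load(f)
    return train[0], train[1], test[0], test[1]


toy_path = os.path.join(tempfile.mkdtemp(), "toy.pkl.gz")
write_toy_dataset(toy_path)
X_train, y_train, X_test, y_test = read_data(toy_path)
print(len(X_train), len(X_train[0]), y_train, y_test)
```

Note that the validation split is loaded and then dropped, so it is still available if you want to tune parameters without touching the test set.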

After that, a few lines of code are enough to build and train the neural network model:

mlp = MLPClassifier(hidden_layer_sizes=(200, 200), activation='relu',
                    max_iter=100, alpha=1e-4, solver='sgd', verbose=10,
                    tol=1e-4, learning_rate_init=.1)
mlp.fit(X_train, y_train)
print("Training Set Accuracy: %f" % mlp.score(X_train, y_train))
print("Test Set Accuracy: %f" % mlp.score(X_test, y_test))

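In the code above, we tell the program that the neural network has two hidden layers of 200 neurons each, that the activation function is ReLU, and that the model is optimized with stochastic gradient descent.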

After the program runs, we see output like this:

Reading training and testing data...
Iteration 1, loss = 0.29860416
Iteration 2, loss = 0.10486606
...
Iteration 19, loss = 0.00053843
Training loss did not improve more than tol=0.000100 for two consecutive epochs. Stopping.
Training Set Accuracy: 1.000000
Test Set Accuracy: 0.982500
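The "Stopping" message describes a tolerance-based stopping rule. The following is a simplified sketch of the rule as the message words it, not Scikit-Learn's actual implementation. The function name stopping_epoch, the parameter name patience and the loss values are made up for illustration.

```python
def stopping_epoch(losses, tol=1e-4, patience=2):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    stalled = 0  # consecutive epochs without an improvement larger than tol
    for epoch, loss in enumerate(losses, start=1):
        if loss > best - tol:
            stalled += 1
        else:
            stalled = 0
        if loss < best:
            best = loss
        if stalled >= patience:
            return epoch
    return None


# Made-up loss curve: big gains first, then improvements smaller than tol.
toy_losses = [0.30, 0.10, 0.05, 0.04995, 0.04993, 0.0499]
print(stopping_epoch(toy_losses))
```

With this rule, max_iter=100 is only an upper bound; the run above ended at iteration 19 because the loss had flattened out.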

Principal Wang tested how a few different network structures affect the test accuracy:

Hidden layer sizes	Test accuracy
100	97.92%
100,100	98.02%
200	98.12%
200,200	98.25%

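This result is basically in line with intuition: as the number of neurons and the number of hidden layers grow, the model's accuracy keeps improving. If you are curious, feel free to run more experiments. One thing to keep in mind is that a fully connected neural network has a performance ceiling, and its test accuracy will not get close to 100%. To get above 99% accuracy, you need the convolutional neural network that we will cover in a later lesson.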
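One way to compare the four structures is to count their trainable parameters. The sketch below assumes 784 input pixels and 10 output classes, with one weight per connection and one bias per neuron. The function name count_parameters is made up for illustration.

```python
def count_parameters(n_inputs, hidden_layer_sizes, n_outputs):
    """Count weights and biases of a fully connected network."""
    sizes = [n_inputs] + list(hidden_layer_sizes) + [n_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out  # weight matrix between two layers
        total += fan_out           # one bias per neuron in the next layer
    return total


for hidden in [(100,), (100, 100), (200,), (200, 200)]:
    print(hidden, count_parameters(784, hidden, 10))
```

Most of the parameters sit in the first layer, because it connects all 784 pixels to every hidden neuron.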

To make our code easier to scale to multiple CPU cores, multiple GPUs and multiple machines, we will now write a TensorFlow program that does something similar.

First we import some libraries and read in the data:

import mnist_input_data
import tensorflow as tf

mnist = mnist_input_data.read_data_sets("MNIST_data/", one_hot=True)
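The one_hot=True argument asks for labels as one-hot vectors rather than digit values. As a small sketch of what that encoding looks like (the helper names one_hot and argmax are made up here, and the real loader returns arrays rather than lists):

```python
def one_hot(label, num_classes=10):
    """Encode a digit label as a vector with a single 1.0 at its index."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


def argmax(vector):
    """Recover the label: the index of the largest entry."""
    return max(range(len(vector)), key=lambda i: vector[i])


encoded = one_hot(3)
print(encoded, argmax(encoded))
```

This is why the network ends in 10 outputs, and why the accuracy computation later compares the argmax of the prediction with the argmax of the label.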

We define two functions to initialize the two kinds of variables, weights and biases:

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
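A truncated normal draws from a normal distribution and re-draws any value that lands more than two standard deviations from the mean. The pure-Python sketch below imitates that behavior for illustration; it is not TensorFlow's implementation, and the rng argument is an addition to make the example repeatable.

```python
import random


def truncated_normal(n, stddev=0.1, mean=0.0, rng=None):
    """Draw n values, rejecting any that fall outside mean +/- 2 * stddev."""
    rng = rng or random.Random(0)
    values = []
    while len(values) < n:
        candidate = rng.gauss(mean, stddev)
        if abs(candidate - mean) <= 2 * stddev:
            values.append(candidate)
    return values


samples = truncated_normal(1000)
print(min(samples), max(samples))
```

With stddev=0.1, every initial weight ends up in the range [-0.2, 0.2], so no neuron starts with an unusually large weight.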

Next comes the core of the program, where we define the network structure:

# Create the model
x = tf.placeholder(tf.float32, [None, 784])

W_fc1 = weight_variable([784, 200])
b_fc1 = bias_variable([200])
W_fc2 = weight_variable([200, 200])
b_fc2 = bias_variable([200])
W_out = weight_variable([200, 10])
b_out = bias_variable([10])

hidden_1 = tf.nn.relu(tf.matmul(x, W_fc1) + b_fc1)
hidden_2 = tf.nn.relu(tf.matmul(hidden_1, W_fc2) + b_fc2)
y = tf.nn.softmax(tf.matmul(hidden_2, W_out) + b_out)

As you can see, we define two hidden layers with 200 neurons each, and the data flow is very simple: x -> hidden_1 -> hidden_2 -> y.
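To see that data flow without TensorFlow, here is a pure-Python sketch of the same forward pass for a single example. The layer sizes are shrunk to 4 -> 3 -> 3 -> 2 so it runs instantly, and the helpers dense, relu, softmax and random_layer are made up for illustration.

```python
import math
import random


def dense(inputs, weights, biases):
    """inputs @ weights + biases for one example; weights is [n_in][n_out]."""
    return [
        sum(x_i * row[j] for x_i, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    shifted = [v - max(values) for v in values]  # shift for numerical stability
    exps = [math.exp(v) for v in shifted]
    return [e / sum(exps) for e in exps]


def random_layer(n_in, n_out, rng):
    """Small random weights and constant 0.1 biases, loosely as in the lesson."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.1] * n_out


rng = random.Random(0)
W_fc1, b_fc1 = random_layer(4, 3, rng)
W_fc2, b_fc2 = random_layer(3, 3, rng)
W_out, b_out = random_layer(3, 2, rng)

x = [0.0, 0.5, 1.0, 0.25]
hidden_1 = relu(dense(x, W_fc1, b_fc1))
hidden_2 = relu(dense(hidden_1, W_fc2, b_fc2))
y = softmax(dense(hidden_2, W_out, b_out))
print(y)
```

The softmax at the end turns the raw outputs into probabilities that sum to 1, one per class.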

The loss function and training code below are exactly the same as in the previous lesson:

# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 10])
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy)

# Train
training_iteration = 10000
batch_size = 100
display_step = 50

with tf.Session() as sess:
    # Later TensorFlow 1.x releases replace this call with tf.global_variables_initializer()
    sess.run(tf.initialize_all_variables())
    for iter in range(training_iteration):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        train_step.run({x: batch_xs, y_: batch_ys})
        if iter % display_step == 0:
            # Accuracy on the current mini-batch, not on the full training set
            print("Epoch: %04d accuracy = %.9f" % (
                iter + 1, sess.run(accuracy, feed_dict={x: batch_xs, y_: batch_ys})))
    print("Test Accuracy: {:.9f}".format(
        sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})))
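The cross_entropy and accuracy expressions can be checked by hand on a tiny batch. The sketch below mirrors them in pure Python on made-up probabilities. The eps clipping is an addition that is not in the lesson's code; it only guards against taking the log of zero.

```python
import math


def argmax(vector):
    return max(range(len(vector)), key=lambda i: vector[i])


def cross_entropy(probs, one_hot_labels, eps=1e-12):
    """Mean over the batch of -sum(label * log(prob)) per example."""
    total = 0.0
    for p_row, t_row in zip(probs, one_hot_labels):
        total += -sum(t * math.log(max(p, eps)) for p, t in zip(p_row, t_row))
    return total / len(probs)


def accuracy(probs, one_hot_labels):
    """Fraction of examples whose predicted class matches the label."""
    hits = sum(1 for p, t in zip(probs, one_hot_labels) if argmax(p) == argmax(t))
    return hits / float(len(probs))


# Made-up batch of three examples with three classes.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3]]
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(cross_entropy(probs, labels), accuracy(probs, labels))
```

Because the labels are one-hot, only the probability assigned to the correct class contributes to each example's loss.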

Running this code gives output like this:

Epoch: 0001 accuracy = 0.259999990
Epoch: 0051 accuracy = 0.829999983
Epoch: 0101 accuracy = 0.939999998
...
Epoch: 9901 accuracy = 1.000000000
Epoch: 9951 accuracy = 1.000000000
Test Accuracy: 0.975399971
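A note on reading this log: each "Epoch" line is really one training iteration (one mini-batch of 100 images), and the accuracy on it is measured on that mini-batch only. The arithmetic below estimates how many passes over the training set the run makes. The figure of 55,000 training images is an assumption about what mnist_input_data loads, so check mnist.train.num_examples in your own setup.

```python
training_iteration = 10000
batch_size = 100
train_examples = 55000  # assumption: size of mnist.train in this loader

examples_seen = training_iteration * batch_size
passes_over_data = examples_seen / float(train_examples)
print("examples seen: %d, passes over the training set: %.2f"
      % (examples_seen, passes_over_data))
```

Under that assumption the run covers the training set roughly 18 times, which is in the same range as the 19 iterations the Scikit-Learn model needed.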

Hmm, the accuracy we got is not great. The TensorFlow model gives you a whole pile of training parameters to tune, so the rest of the work is up to you~

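One more thing, a little secret just between us: add 10 more lines to the code above and it becomes the famous convolutional neural network. How exciting is that! :)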
