Getting Started With TensorFlow

2017-12-09

TensorFlow

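TensorFlow provides a large number of APIs. The lowest-level API, TensorFlow Core, gives you complete programming control. The higher-level APIs are easier to learn and use. A high-level API such as tf.estimator helps you manage data sets, estimators, training, and inference.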

Tensors

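The central unit of data in TensorFlow is the tensor, which can be thought of as an array with any number of dimensions. A tensor's rank is its number of dimensions.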

3 # a rank 0 tensor; a scalar with shape []
[1., 2., 3.] # a rank 1 tensor; a vector with shape [3]
[[1., 2., 3.], [4., 5., 6.]] # a rank 2 tensor; a matrix with shape [2, 3]
[[[1., 2., 3.]], [[7., 8., 9.]]] # a rank 3 tensor with shape [2, 1, 3]
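The relationship between rank and shape in the examples above can be checked with a small plain-Python sketch. The helpers shape_of and rank_of are hypothetical names introduced here for illustration (they are not TensorFlow functions), and the sketch assumes regular, non-ragged nested lists.

```python
def shape_of(value):
    """Return the shape of a nested list as a list of dimension sizes."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break  # an empty list has no deeper level to inspect
        value = value[0]  # assumes every sibling has the same length
    return shape


def rank_of(value):
    """The rank is the number of dimensions, i.e. the length of the shape."""
    return len(shape_of(value))


examples = [
    3,
    [1., 2., 3.],
    [[1., 2., 3.], [4., 5., 6.]],
    [[[1., 2., 3.]], [[7., 8., 9.]]],
]
for example in examples:
    print("rank", rank_of(example), "shape", shape_of(example))
```

A scalar has an empty shape, so its rank is 0; each extra level of nesting adds one dimension.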

TensorFlow Core tutorial

Importing TensorFlow

import tensorflow as tf

The Computational Graph

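You can think of a TensorFlow Core program as consisting of two parts:

Building the computational graph
Running the computational graph

A computational graph is a series of TensorFlow operations arranged as nodes. Each node takes zero or more tensors as input and produces one tensor as output. One type of node is a constant: it takes no input and outputs a fixed value. For example: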

node1 = tf.constant(3.0, dtype=tf.float32)
node2 = tf.constant(4.0) # also tf.float32 implicitly
print(node1, node2)

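As you can see, printing the nodes does not output the values 3.0 and 4.0. What is printed are nodes that would produce 3.0 and 4.0 when evaluated. To obtain the values a node holds, the computational graph must be run inside a Session. A Session encapsulates the control and state of the TensorFlow runtime.

The following code creates a Session and calls its run method:

sess = tf.Session()
print(sess.run([node1, node2]))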

We can build more complex computational graphs by combining Tensor nodes with operations (operations are also nodes).

node3 = tf.add(node1, node2)
print('node3:', node3)
print('sess.run(node3):', sess.run(node3))
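The split between building a graph and running it can be illustrated without TensorFlow. The following is a minimal plain-Python sketch of deferred evaluation; the classes Constant and Add are hypothetical stand-ins invented for this illustration, not how TensorFlow is implemented.

```python
class Constant:
    """A node with no inputs that always produces the same value."""

    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class Add:
    """A node that takes two input nodes and produces their sum."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self):
        return self.left.evaluate() + self.right.evaluate()


# Phase 1: build the graph. Nothing is computed yet.
node1 = Constant(3.0)
node2 = Constant(4.0)
node3 = Add(node1, node2)
print(node3)  # prints an object description, not a number

# Phase 2: run the graph to obtain a value.
result = node3.evaluate()
print(result)
```

As in the TensorFlow example, printing the node only describes it; the sum exists only after the graph is evaluated.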

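TensorFlow provides a visualization tool called TensorBoard that can display a picture of the computational graph.

A graph can be parameterized to accept external inputs, known as placeholders. A placeholder is a promise to supply a value later.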

a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
adder_node = a + b # + provides a shortcut for tf.add(a, b)

When the graph is run, the feed_dict argument of the run method supplies concrete values for the placeholders.

print(sess.run(adder_node, {a: 3, b: 4.5}))
print(sess.run(adder_node, {a: [1, 3], b: [2, 4]}))
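The idea of feeding values at run time can be sketched in plain Python. This is only an illustration under simplifying assumptions: Placeholder, Add, and run are hypothetical names, the sketch handles scalar inputs only (no element-wise list addition), and it does not reflect TensorFlow internals.

```python
class Placeholder:
    """A named input slot whose value is supplied only at run time."""

    def __init__(self, name):
        self.name = name


class Add:
    """A node that sums two inputs (nodes or plain numbers)."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


def run(node, feed):
    """Evaluate `node`, looking up each Placeholder in the `feed` dict."""
    if isinstance(node, Placeholder):
        if node not in feed:
            raise KeyError("no value fed for placeholder %r" % node.name)
        return feed[node]
    if isinstance(node, Add):
        return run(node.left, feed) + run(node.right, feed)
    return node  # a plain Python number acts as a constant


a = Placeholder("a")
b = Placeholder("b")
adder_node = Add(a, b)

# The same graph is reused with different fed values.
first = run(adder_node, {a: 3, b: 4.5})
second = run(adder_node, {a: 1, b: 2})
print(first, second)
```

The graph is built once and evaluated twice; only the dictionary of fed values changes between runs.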

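[Figure: the computational graph for adder_node as shown in TensorBoard]

We can make the graph more complex by adding another operation:

add_and_triple = adder_node * 3.
print(sess.run(add_and_triple, {a: 3, b: 4.5}))

[Figure: the computational graph for add_and_triple as shown in TensorBoard]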

Variables let us add trainable parameters to a graph.

W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
x = tf.placeholder(tf.float32)
linear_model = W*x + b

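TensorFlow evaluates lazily: calling tf.Variable above does not initialize the variables, so they must be initialized explicitly before the graph is run.

init = tf.global_variables_initializer()
sess.run(init)

Because x is a placeholder, we can evaluate linear_model for several values of x at once:

print(sess.run(linear_model, {x: [1, 2, 3, 4]}))

We now have a linear model. Before training it on the training data, we need a y placeholder to provide the target values, and a loss function.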

y = tf.placeholder(tf.float32)
squared_deltas = tf.square(linear_model - y)
loss = tf.reduce_sum(squared_deltas)
print(sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))

The loss is 23.66: with W = 0.3 and b = -0.3 the squared errors are 0, 1.69, 6.76, and 15.21.
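The loss value can be checked by hand. The following plain-Python sketch repeats the sum-of-squares computation with the initial parameter values W = 0.3 and b = -0.3 from the text; it uses ordinary floats rather than TensorFlow tensors.

```python
W, b = 0.3, -0.3
xs = [1, 2, 3, 4]
ys = [0, -1, -2, -3]

# Model predictions W*x + b for each input.
predictions = [W * x + b for x in xs]

# Squared difference between each prediction and its target.
squared_deltas = [(p - y) ** 2 for p, y in zip(predictions, ys)]

# The loss is the sum of the squared differences.
loss = sum(squared_deltas)
print(round(loss, 2))
```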

We can reduce the loss by manually reassigning the parameters:

fixW = tf.assign(W, [-1.])
fixb = tf.assign(b, [1.])
sess.run([fixW, fixb])
print(sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))

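The loss is now 0, because W = -1 and b = 1 fit the data exactly.

tf.train API

Next, we train the model with the tf.train API.

TensorFlow provides optimizers that adjust the variables step by step to minimize the loss function. The simplest is the gradient descent optimizer. TensorFlow can compute gradients automatically with tf.gradients; for simplicity, optimizers normally do this for you.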

optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

sess.run(init)
for i in range(1000):
    sess.run(train, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]})

print(sess.run([W, b]))

This completes the training process.
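To see what the optimizer is doing at each step, here is a hand-written gradient descent loop in plain Python on the same data, with the same learning rate (0.01) and step count (1000). It is a sketch of the general technique, with the gradients of the sum-of-squares loss derived by hand; it is not TensorFlow's implementation.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.0, -1.0, -2.0, -3.0]
W, b = 0.3, -0.3
learning_rate = 0.01

for step in range(1000):
    # Prediction error for every sample under the current parameters.
    errors = [W * x + b - y for x, y in zip(xs, ys)]
    # Gradients of loss = sum(error^2) with respect to W and b.
    grad_W = sum(2 * e * x for e, x in zip(errors, xs))
    grad_b = sum(2 * e for e in errors)
    # Move each parameter a small step against its gradient.
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

loss = sum((W * x + b - y) ** 2 for x, y in zip(xs, ys))
print("W: %.4f b: %.4f loss: %.2e" % (W, b, loss))
```

The parameters approach W = -1 and b = 1, the values that were assigned manually earlier to drive the loss to zero.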

Complete program

import tensorflow as tf

# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W*x + b
y = tf.placeholder(tf.float32)

# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

# training data
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset variables to their initial (incorrect) values
for i in range(1000):
  sess.run(train, {x: x_train, y: y_train})

# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

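This program can be visualized in TensorBoard.

tf.estimator

tf.estimator is a high-level TensorFlow library that handles:

running training loops
running evaluation loops
managing data sets

Basic usage

Notice how much simpler the linear regression program becomes with tf.estimator.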

# NumPy is often used to load, manipulate and preprocess data.
import numpy as np
import tensorflow as tf

# Declare list of features. We only have one numeric feature. There are many
# other types of columns that are more complicated and useful.
feature_columns = [tf.feature_column.numeric_column('x', shape=[1])]

# An estimator is the front end to invoke training (fitting) and evaluation
# (inference). There are many predefined types like linear regression,
# linear classification, and many neural network classifiers and regressors.
# The following code provides an estimator that does linear regression.
estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns)

# TensorFlow provides many helper methods to read and set up data sets.
# Here we use two data sets: one for training and one for evaluation
# We have to tell the function how many batches
# of data (num_epochs) we want and how big each batch should be.
x_train = np.array([1., 2., 3., 4.])
y_train = np.array([0., -1., -2., -3.])
x_eval = np.array([2., 5., 8., 1.])
y_eval = np.array([-1.01, -4.1, -7, 0.])
input_fn = tf.estimator.inputs.numpy_input_fn({'x': x_train}, y_train, batch_size=4, num_epochs=None, shuffle=True)
train_input_fn = tf.estimator.inputs.numpy_input_fn({"x": x_train}, y_train, batch_size=4, num_epochs=1000, shuffle=False)
eval_input_fn = tf.estimator.inputs.numpy_input_fn({"x": x_eval}, y_eval, batch_size=4, num_epochs=1000, shuffle=False)

# We can invoke 1000 training steps by invoking the train method and passing the
# training data set.
estimator.train(input_fn=input_fn, steps=1000)

# Here we evaluate how well our model did.
train_metrics = estimator.evaluate(input_fn=train_input_fn)
eval_metrics = estimator.evaluate(input_fn=eval_input_fn)
print("train metrics: %r"% train_metrics)
print("eval metrics: %r"% eval_metrics)

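iteration: one update step; each iteration updates the network's parameters once
batch_size: the number of samples used in one iteration
epoch: one epoch is one full pass over all samples in the training set

Stochastic gradient descent (SGD) is commonly used to train deep architectures. One advantage is that a parameter update does not require a pass over all samples, which is very effective when the data set is very large. In that setting an epoch can be defined to suit the problem: for example, if 10,000 iterations are defined as one epoch and each iteration uses a batch_size of 256, then one epoch processes 2,560,000 training samples.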
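The relationship between iterations, batch size, and epochs is simple arithmetic, sketched below in plain Python. The helper names iterations_per_epoch and samples_seen are hypothetical and exist only for this illustration.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    """Iterations needed for one full pass over the training set."""
    return math.ceil(num_samples / batch_size)


def samples_seen(iterations, batch_size):
    """Total samples processed after a given number of iterations."""
    return iterations * batch_size


# The 4-sample training set from this tutorial with batch_size=4:
# one iteration already covers a full epoch.
tutorial_iterations = iterations_per_epoch(4, 4)

# The example from the text: 10,000 iterations of 256 samples each.
large_epoch_samples = samples_seen(10000, 256)

print(tutorial_iterations, large_epoch_samples)
```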

A custom model

To define a custom model, use tf.estimator.Estimator.

tf.estimator.LinearRegressor is actually a sub-class of tf.estimator.Estimator.

import numpy as np
import tensorflow as tf

# Define the model function; we only have one real-valued feature, 'x'
def model_fn(features, labels, mode):
  # Build a linear model and predict values
  W = tf.get_variable("W", [1], dtype=tf.float64)
  b = tf.get_variable("b", [1], dtype=tf.float64)
  y = W*features['x'] + b
  # Loss sub-graph
  loss = tf.reduce_sum(tf.square(y - labels))
  # Training sub-graph
  global_step = tf.train.get_global_step()
  optimizer = tf.train.GradientDescentOptimizer(0.01)
  train = tf.group(optimizer.minimize(loss),
                   tf.assign_add(global_step, 1))
  # EstimatorSpec connects subgraphs we built to the
  # appropriate functionality.
  return tf.estimator.EstimatorSpec(
      mode=mode,
      predictions=y,
      loss=loss,
      train_op=train)

estimator = tf.estimator.Estimator(model_fn=model_fn)
# define our data sets
x_train = np.array([1., 2., 3., 4.])
y_train = np.array([0., -1., -2., -3.])
x_eval = np.array([2., 5., 8., 1.])
y_eval = np.array([-1.01, -4.1, -7., 0.])
input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=None, shuffle=True)
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=1000, shuffle=False)
eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_eval}, y_eval, batch_size=4, num_epochs=1000, shuffle=False)

# train
estimator.train(input_fn=input_fn, steps=1000)
# Here we evaluate how well our model did.
train_metrics = estimator.evaluate(input_fn=train_input_fn)
eval_metrics = estimator.evaluate(input_fn=eval_input_fn)
print("train metrics: %r"% train_metrics)
print("eval metrics: %r"% eval_metrics)

Notice that the body of model_fn() is very similar to the model written by hand with the lower-level API.