이 글은 Tensorflow Tutorials를 읽고 정리 및 번역한 글입니다.
Table of Contents
- Introduction
- Import Keras
- Loda MNIST Data
- Sequential Model
- Model Compilation, train
- Functional Model
- Visualizing
Introduction
Keras는 Prettytensor나 tf.layers API 처럼 텐서플로우에서 사용되는 고수준(high-level) API 중 하나입니다.
Imports
Keras API를 사용하기 위해 import를 해줍니다.
%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import math
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import InputLayer, Input
from tensorflow.python.keras.layers import Reshape, MaxPooling2D
from tensorflow.python.keras.layers import Conv2D, Dense, Flatten
Load Data
여기서는 MNIST Dataset을 사용합니다. 총 7만장의 Image로 구성되어 있고 train에 55000, validation에 5000, test에 10000장의 Image를 사용합니다.
from mnist import MNIST
data = MNIST(data_dir="data/MNIST/")
편리함을 위해 data
에서 몇몇 값들을 copy합니다.
# The number of pixels in each dimension of an image.
img_size = data.img_size
# The images are stored in one-dimensional arrays of this length.
img_size_flat = data.img_size_flat
# Tuple with height and width of images used to reshape arrays.
img_shape = data.img_shape
# Tuple with height, width and depth used to reshape arrays.
# This is used for reshaping in Keras.
img_shape_full = data.img_shape_full
# Number of classes, one class for each of 10 digits.
num_classes = data.num_classes
# Number of colour channels for the images: 1 channel for gray-scale.
num_channels = data.num_channels
plot_images Helper Function
plot_images
함수는 images, true class, predict class 세 값을 받아 이미지와 라벨을 화면에 보여주는 함수입니다.
def plot_images(images, cls_true, cls_pred=None):
assert len(images) == len(cls_true) == 9
# Create figure with 3x3 sub-plots.
fig, axes = plt.subplots(3, 3)
fig.subplots_adjust(hspace=0.3, wspace=0.3)
for i, ax in enumerate(axes.flat):
# Plot image.
ax.imshow(images[i].reshape(img_shape), cmap='binary')
# Show true and predicted classes.
if cls_pred is None:
xlabel = "True: {0}".format(cls_true[i])
else:
xlabel = "True: {0}, Pred: {1}".format(cls_true[i], cls_pred[i])
# Show the classes as the label on the x-axis.
ax.set_xlabel(xlabel)
# Remove ticks from the plot.
ax.set_xticks([])
ax.set_yticks([])
# Ensure the plot is shown correctly with multiple plots
# in a single Notebook cell.
plt.show()
Sequential Model
Keras API에서 Neural Network를 구성하는 방법은 두가지가 존재합니다. 그 중 Sequentail 모델은 차례대로 레이어들을 add해가는 간단한 모델입니다.
# Start construction of the Keras Sequential model.
model = Sequential()
# Add an input layer which is similar to a feed_dict in TensorFlow.
# Note that the input-shape must be a tuple containing the image-size.
model.add(InputLayer(input_shape=(img_size_flat,)))
# The input is a flattened array with 784 elements,
# but the convolutional layers expect images with shape (28, 28, 1)
model.add(Reshape(img_shape_full))
# First convolutional layer with ReLU-activation and max-pooling.
model.add(Conv2D(kernel_size=5, strides=1, filters=16, padding='same',
activation='relu', name='layer_conv1'))
model.add(MaxPooling2D(pool_size=2, strides=2))
# Second convolutional layer with ReLU-activation and max-pooling.
model.add(Conv2D(kernel_size=5, strides=1, filters=36, padding='same',
activation='relu', name='layer_conv2'))
model.add(MaxPooling2D(pool_size=2, strides=2))
# Flatten the 4-rank output of the convolutional layers
# to 2-rank that can be input to a fully-connected / dense layer.
model.add(Flatten())
# First fully-connected / dense layer with ReLU-activation.
model.add(Dense(128, activation='relu'))
# Last fully-connected / dense layer with softmax-activation
# for use in classification.
model.add(Dense(num_classes, activation='softmax'))
Model Compiliation
이제 Neural Network가 정의 되었습니다. 이후에 할 일은 loss-function 정의, performance metric 선택, optimzer 선택 인데 이런 모든 과정들을 Keras 에선 compiliation 이라고 합니다. 여기서는 string값들을 사용하여 optimizer 등을 정할 수 있고, 더 세심한 조정이 필요할 경우 각종 객체를 import하여 사용할 수 있습니다.
다음의 예는 Adam Optimizer를 사용할 때 learning rate를 조절하기 위해 따로 optimizer를 정의한 것입니다.
from tensorflow.python.keras.optimizers import Adam
optimizer = Adam(lr=1e-3)
Loss 함수는 categorical_crossentropy를 사용하였습니다.
model.compile(optimizer=optimizer,
loss='categorical_crossentropy',
metrics=['accuracy'])
Training
이제 모델이 정의되었으므로 model.fit
함수를 이용하여 학습시켜줍니다. 여기서는 128의 batch-size로 1번 epoch를 거쳐 학습시켰습니다.
model.fit(x=data.x_train,
y=data.y_train,
epochs=1, batch_size=128)
Evaluation
모델 학습이 완료되면, mode.evaulate
함수를 이용해 테스트 할 수 있습니다.
result = model.evaluate(x=data.x_test,
y=data.y_test)
Functional Model
Keras API는 Sequential Model보다 더 복잡한 모델을 생성할 수 있는 Functional Model을 지원합니다. 여기서 각 레이어 층들은 함수로 정의됩니다. 이를 통해 더 복잡한 모델을 만들 수 있습니다.
# Create an input layer which is similar to a feed_dict in TensorFlow.
# Note that the input-shape must be a tuple containing the image-size.
inputs = Input(shape=(img_size_flat,))
# Variable used for building the Neural Network.
net = inputs
# The input is an image as a flattened array with 784 elements.
# But the convolutional layers expect images with shape (28, 28, 1)
net = Reshape(img_shape_full)(net)
# First convolutional layer with ReLU-activation and max-pooling.
net = Conv2D(kernel_size=5, strides=1, filters=16, padding='same',
activation='relu', name='layer_conv1')(net)
net = MaxPooling2D(pool_size=2, strides=2)(net)
# Second convolutional layer with ReLU-activation and max-pooling.
net = Conv2D(kernel_size=5, strides=1, filters=36, padding='same',
activation='relu', name='layer_conv2')(net)
net = MaxPooling2D(pool_size=2, strides=2)(net)
# Flatten the output of the conv-layer from 4-dim to 2-dim.
net = Flatten()(net)
# First fully-connected / dense layer with ReLU-activation.
net = Dense(128, activation='relu')(net)
# Last fully-connected / dense layer with softmax-activation
# so it can be used for classification.
net = Dense(num_classes, activation='softmax')(net)
# Output of the Neural Network.
outputs = net
모델을 inputs과 outputs을 이용하여 정의하였습니다. 나머지 과정은 Sequential과 마찬가지로 진행됩니다. 가장 먼저 compiliation을 진행합니다.
from tensorflow.python.keras.models import Model
model2 = Model(inputs=inputs, outputs=outputs)
model2.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
Visualizing of Layer Weights & Outputs
helper function plotting conv weights
def plot_conv_weights(weights, input_channel=0):
# Get the lowest and highest values for the weights.
# This is used to correct the colour intensity across
# the images so they can be compared with each other.
w_min = np.min(weights)
w_max = np.max(weights)
# Number of filters used in the conv. layer.
num_filters = weights.shape[3]
# Number of grids to plot.
# Rounded-up, square-root of the number of filters.
num_grids = math.ceil(math.sqrt(num_filters))
# Create figure with a grid of sub-plots.
fig, axes = plt.subplots(num_grids, num_grids)
# Plot all the filter-weights.
for i, ax in enumerate(axes.flat):
# Only plot the valid filter-weights.
if i<num_filters:
# Get the weights for the i'th filter of the input channel.
# See new_conv_layer() for details on the format
# of this 4-dim tensor.
img = weights[:, :, input_channel, i]
# Plot image.
ax.imshow(img, vmin=w_min, vmax=w_max,
interpolation='nearest', cmap='seismic')
# Remove ticks from the plot.
ax.set_xticks([])
ax.set_yticks([])
# Ensure the plot is shown correctly with multiple plots
# in a single Notebook cell.
plt.show()
Get Layers
keras는 summary 함수를 이용해 모델의 레이어들을 압축해서 보여줍니다.
model3.summary()
Convolution Weights
모델의 layers
attribute는 레이어의 정보를 담고 있습니다. 또한 get_weights
함수를 이용해 weight의 값들을 얻을 수 있습니다. 여기선 first convolution 레이어의 값들을 plotting 합니다.
layer_conv1 = model3.layers[2]
layer_conv1
weights_conv1 = layer_conv1.get_weights()[0]
plot_conv_weights(weights=weights_conv1, input_channel=0)
License (MIT)
Copyright (c) 2016-2017 by Magnus Erik Hvass Pedersen
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.