In this project, you will use Generative Adversarial Nets to generate new face images.
The project will use the following data sets: MNIST and CelebA.
Because the CelebA data set is more complex and this is your first time using GANs, we want you to test your GANs model on the MNIST data set first, so that you can evaluate the performance of the model you build more quickly.
If you are using FloydHub, set data_dir to "/input" and use the FloydHub data ID "R5KrjnANiKVhLWAkpXhNBe".
data_dir = '/data'
!pip install matplotlib==2.0.2
# FloydHub - Use with data ID "R5KrjnANiKVhLWAkpXhNBe"
#data_dir = '/input'
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import helper
helper.download_extract('mnist', data_dir)
helper.download_extract('celeba', data_dir)
show_n_images = 25
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
%matplotlib inline
import os
from glob import glob
from matplotlib import pyplot
mnist_images = helper.get_batch(glob(os.path.join(data_dir, 'mnist/*.jpg'))[:show_n_images], 28, 28, 'L')
pyplot.imshow(helper.images_square_grid(mnist_images, 'L'), cmap='gray')
The CelebFaces Attributes Dataset (CelebA) contains more than 200,000 celebrity images with annotations. Since you will only be generating faces, you will not need the annotations. You can change show_n_images to explore this dataset.
show_n_images = 25
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
mnist_images = helper.get_batch(glob(os.path.join(data_dir, 'img_align_celeba/*.jpg'))[:show_n_images], 28, 28, 'RGB')
pyplot.imshow(helper.images_square_grid(mnist_images, 'RGB'))
Since the focus of the project is building the GANs model, we will preprocess the data for you.
After preprocessing, the values of the MNIST and CelebA images are in the range [-0.5, 0.5], and each image is 28x28 pixels. The CelebA images are cropped to remove the parts of the image that do not include the face and are then resized to 28x28.
The MNIST images are single-channel grayscale images, while the CelebA images are three-channel RGB color images.
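As a rough illustration of the value range described above, the sketch below maps 8-bit pixel values to [-0.5, 0.5]. The formula `pixel / 255 - 0.5` and the name `scale_pixels` are assumptions made for illustration; the actual preprocessing lives in helper.py, which is not shown here.

```python
def scale_pixels(pixels, max_value=255):
    """Map pixel values in [0, max_value] to the range [-0.5, 0.5].

    Assumed formula for illustration only; helper.py may differ.
    """
    return [p / max_value - 0.5 for p in pixels]


# A hypothetical row of 8-bit grayscale pixels: black, mid-gray, white.
row = [0, 128, 255]
scaled = scale_pixels(row)
print(scaled)
```

Black maps to the lower bound, white to the upper bound, and mid-gray lands very close to zero.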
You will build the main components of the GANs by implementing the following functions:
model_inputs
discriminator
generator
model_loss
model_opt
train
Check that you are using the correct TensorFlow version and have access to a GPU.
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
from distutils.version import LooseVersion
import warnings
import tensorflow as tf
# Check TensorFlow Version
assert LooseVersion(tf.__version__) >= LooseVersion('1.0'), 'Please use TensorFlow version 1.0 or newer. You are using {}'.format(tf.__version__)
print('TensorFlow Version: {}'.format(tf.__version__))
# Check for a GPU
if not tf.test.gpu_device_name():
    warnings.warn('No GPU found. Please use a GPU to train your neural network.')
else:
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
Implement the model_inputs function to create TF Placeholders for the neural network. Please create the following placeholders:
Real input images placeholder with rank 4, using image_width, image_height and image_channels.
Z input placeholder, using z_dim.
Learning rate placeholder.
Return the placeholders in the tuple (tensor of real input images, tensor of z data, learning rate).
import problem_unittests as tests
def model_inputs(image_width, image_height, image_channels, z_dim):
    """
    Create the model inputs
    :param image_width: The input image width
    :param image_height: The input image height
    :param image_channels: The number of image channels
    :param z_dim: The dimension of Z
    :return: Tuple of (tensor of real input images, tensor of z data, learning rate)
    """
    # TODO: Implement Function
    # None in the first dimension leaves the batch size unspecified
    real_input = tf.placeholder(tf.float32, shape=(None, image_height, image_width, image_channels))
    z = tf.placeholder(tf.float32, shape=(None, z_dim))
    lr = tf.placeholder(tf.float32, shape=()) # scalar learning rate
    return real_input, z, lr
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_model_inputs(model_inputs)
Implement the discriminator function to create a discriminator neural network that distinguishes real images from generated ones. This function should be able to reuse the variables in the neural network. Use tf.variable_scope with the scope name "discriminator" so that the variables can be reused.
This function should return a tuple of the form (tensor output of the discriminator, tensor logits of the discriminator).
def discriminator(images, reuse=False):
    """
    Create the discriminator network
    :param images: Tensor of input image(s)
    :param reuse: Boolean if the weights should be reused
    :return: Tuple of (tensor output of the discriminator, tensor logits of the discriminator)
    """
    # TODO: Implement Function
    # initializer = tf.contrib.layers.variance_scaling_initializer()
    initializer = tf.random_normal_initializer(stddev=0.02)
    with tf.variable_scope('discriminator', reuse=reuse): # I hate tensorflow, Udacity should use pytorch!
        # 28*28*3
        x = images
        # x = tf.nn.dropout(x, 0.9) # prevent mode collapse
        x = tf.layers.conv2d(x, 64, 5, strides=2, kernel_initializer=initializer, padding='same')
        x = tf.maximum(x, 0.1 * x) # leaky ReLU with slope 0.1
        # x = tf.layers.batch_normalization(x, training=True)
        # 14*14*64
        x = tf.layers.conv2d(x, 128, 5, strides=2, kernel_initializer=initializer, padding='same')
        x = tf.layers.batch_normalization(x, training=True)
        x = tf.maximum(x, 0.1 * x)
        # 7*7*128
        x = tf.layers.conv2d(x, 256, 5, strides=2, kernel_initializer=initializer, padding='same')
        x = tf.layers.batch_normalization(x, training=True)
        x = tf.maximum(x, 0.1 * x)
        # 3.5*3.5 (4*4*256)
        x = tf.reshape(x, (-1, 4*4*256)) # flatten before the dense layers so the batch dimension is preserved
        x = tf.layers.dense(x, 4*4*256)
        x = tf.layers.batch_normalization(x, training=True)
        x = tf.maximum(x, 0.1 * x)
        logits = tf.layers.dense(x, 1)
        output = tf.sigmoid(logits)
    return output, logits
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_discriminator(discriminator, tf)
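The shape comments in the discriminator (28, 14, 7, then "3.5", rounded up to 4) follow from how a strided convolution with 'same' padding sizes its output: the spatial size is the input size divided by the stride, rounded up. A minimal sketch of that arithmetic, where the helper name `conv_same_output_size` is made up for illustration:

```python
import math


def conv_same_output_size(size, stride):
    """Spatial output size of a convolution that uses 'same' padding."""
    return math.ceil(size / stride)


size = 28
sizes = [size]
for _ in range(3):  # three stride-2 convolutions, as in the discriminator
    size = conv_same_output_size(size, stride=2)
    sizes.append(size)

flattened = sizes[-1] * sizes[-1] * 256  # 256 filters in the last conv layer
print(sizes)      # [28, 14, 7, 4]
print(flattened)  # 4096
```

This is why the flattened feature vector has 4*4*256 entries per image.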
Implement the generator function to generate an image from z. This function should be able to reuse the variables in the neural network. Use tf.variable_scope with the scope name "generator" so that the variables can be reused.
This function should return the generated images with dimensions 28 x 28 x out_channel_dim.
def generator(z, out_channel_dim, is_train=True):
    """
    Create the generator network
    :param z: Input z
    :param out_channel_dim: The number of channels in the output image
    :param is_train: Boolean if generator is being used for training
    :return: The tensor output of the generator
    """
    # TODO: Implement Function
    # initializer = tf.contrib.layers.variance_scaling_initializer()
    initializer = tf.random_normal_initializer(stddev=0.02) # kaiming init sucks here
    with tf.variable_scope('generator', reuse=not is_train): # I hate tensorflow, Udacity should use pytorch!
        # project z and reshape it to 7*7*256
        x = tf.layers.dense(z, 7 * 7 * 256)
        x = tf.reshape(x, (-1, 7, 7, 256))
        # x = tf.layers.batch_normalization(x, training=is_train)
        x = tf.maximum(x, 0.1 * x) # leaky ReLU with slope 0.1
        # 7*7*256
        x = tf.layers.conv2d_transpose(x, 128, 5, strides=2, kernel_initializer=initializer, padding='same')
        x = tf.layers.batch_normalization(x, training=is_train)
        x = tf.maximum(x, 0.1 * x)
        # 14*14*128
        x = tf.layers.conv2d_transpose(x, 64, 5, strides=2, kernel_initializer=initializer, padding='same')
        x = tf.maximum(x, 0.1 * x)
        x = tf.layers.batch_normalization(x, training=is_train) # use batch statistics only while training
        # 28*28*64
        logits = tf.layers.conv2d_transpose(x, out_channel_dim, 5, strides=1, kernel_initializer=initializer, padding='same')
        output = tf.tanh(logits) # output values lie in [-1, 1]
    return output # no logits required
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_generator(generator, tf)
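The generator reverses the discriminator's downsampling. With 'same' padding, a transposed convolution multiplies the spatial size by its stride, which is how a 7 x 7 feature map grows to the required 28 x 28 output. A small sketch of that arithmetic, where `conv_transpose_same_output_size` is a hypothetical helper name:

```python
def conv_transpose_same_output_size(size, stride):
    """Spatial output size of a transposed convolution with 'same' padding."""
    return size * stride


size = 7
sizes = [size]
for stride in (2, 2, 1):  # strides of the three transposed convolutions
    size = conv_transpose_same_output_size(size, stride)
    sizes.append(size)

print(sizes)  # [7, 14, 28, 28]
```

The last layer uses stride 1, so it only changes the channel count to out_channel_dim and leaves the 28 x 28 size untouched.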
Implement the model_loss function to build the GANs for training and calculate the loss. This function should return a tuple of the form (discriminator loss, generator loss).
Use the functions you have implemented:
discriminator(images, reuse=False)
generator(z, out_channel_dim, is_train=True)
def model_loss(input_real, input_z, out_channel_dim):
    """
    Get the loss for the discriminator and generator
    :param input_real: Images from the real dataset
    :param input_z: Z input
    :param out_channel_dim: The number of channels in the output image
    :return: A tuple of (discriminator loss, generator loss)
    """
    # TODO: Implement Function
    # https://classroom.udacity.com/nanodegrees/nd101-cn-advanced/parts/34ed075b-3ca2-45f0-916c-00db3186f18f/modules/af4b44d7-35bd-4408-bf77-22e469eec31b/lessons/1411d674-356f-4a26-961e-bc04a059f36e/concepts/3bf52eeb-a50a-4734-bd26-d9603b1fcc84
    g_fake_output = generator(input_z, out_channel_dim, is_train=True)
    # the real-image call must come first: it creates the discriminator variables (reuse=False)
    d_real_output, d_real_logits = discriminator(input_real, reuse=False)
    # the fake-image call then reuses those variables
    d_fake_output, d_fake_logits = discriminator(g_fake_output, reuse=True)
    g_fake_label = tf.ones_like(d_fake_output) # push the generator to make the discriminator think its output is real
    d_fake_label = tf.zeros_like(d_fake_output)
    d_real_label = tf.ones_like(d_real_output) * 0.9 # one-sided label smoothing
    g_fake_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=d_fake_logits, labels=g_fake_label))
    d_fake_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=d_fake_logits, labels=d_fake_label))
    d_real_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=d_real_logits, labels=d_real_label))
    return d_fake_loss + d_real_loss, g_fake_loss
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_model_loss(model_loss)
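To make the three loss terms more concrete, here is a plain-Python sketch of sigmoid cross-entropy computed from a logit, using the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)). The function names and the example logit are illustrative assumptions; this is not TensorFlow's code, only the same formula applied to a single value.

```python
import math


def stable_sigmoid_cross_entropy(logit, label):
    """Cross-entropy between sigmoid(logit) and label, in a stable form."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def naive_sigmoid_cross_entropy(logit, label):
    """Textbook form, used here only to check the stable version."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -label * math.log(p) - (1.0 - label) * math.log(1.0 - p)


logit = 2.0  # hypothetical discriminator logit for one image
d_real = stable_sigmoid_cross_entropy(logit, 0.9)  # real image, smoothed label
d_fake = stable_sigmoid_cross_entropy(logit, 0.0)  # fake image, discriminator's view
g_fake = stable_sigmoid_cross_entropy(logit, 1.0)  # fake image, generator's view
print(d_real, d_fake, g_fake)
```

For the same positive logit on a fake image, the discriminator term is large while the generator term is small: the two networks are penalized in opposite directions by the same discriminator output.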
Implement the model_opt function to create the optimization operations for the GANs. Use tf.trainable_variables to get all the trainable variables. Filter the variables by the scope names discriminator and generator. This function should return a tuple of the form (discriminator training operation, generator training operation).
def model_opt(d_loss, g_loss, learning_rate, beta1):
    """
    Get optimization operations
    :param d_loss: Discriminator loss Tensor
    :param g_loss: Generator loss Tensor
    :param learning_rate: Learning Rate Placeholder
    :param beta1: The exponential decay rate for the 1st moment in the optimizer
    :return: A tuple of (discriminator training operation, generator training operation)
    """
    # TODO: Implement Function
    variables = tf.trainable_variables()
    # split the variables by the scope they were created in
    d_vars = [var for var in variables if var.name.startswith('discriminator')]
    g_vars = [var for var in variables if var.name.startswith('generator')]
    # run the batch normalization update ops before each training step
    with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)):
        d_train_opt = tf.train.AdamOptimizer(learning_rate, beta1).minimize(d_loss, var_list=d_vars)
        g_train_opt = tf.train.AdamOptimizer(learning_rate, beta1).minimize(g_loss, var_list=g_vars)
    # lr_generator = tf.assign(learning_rate, learning_rate*0.9)
    # minimize() returns an operation that applies one optimizer step when it is run in a session
    return d_train_opt, g_train_opt
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_model_opt(model_opt, tf)
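The beta1 parameter passed to the optimizer controls how quickly the running average of the gradient (the first moment) forgets old gradients. The sketch below applies one textbook Adam update to a single scalar parameter. It is an illustration only, not TensorFlow's implementation; `adam_step`, the default `beta2` and `epsilon`, and the example values are assumptions.

```python
import math


def adam_step(param, grad, m, v, t, learning_rate, beta1, beta2=0.999, epsilon=1e-8):
    """Apply one textbook Adam update to a scalar parameter."""
    m = beta1 * m + (1.0 - beta1) * grad         # first moment (running mean)
    v = beta2 * v + (1.0 - beta2) * grad * grad  # second moment (running mean of squares)
    m_hat = m / (1.0 - beta1 ** t)               # bias correction for step t
    v_hat = v / (1.0 - beta2 ** t)
    param = param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v


# One step from hypothetical starting values, with the notebook's lr and beta1.
param, m, v = adam_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1,
                        learning_rate=0.001, beta1=0.4)
print(param, m, v)
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's magnitude.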
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import numpy as np
def show_generator_output(sess, n_images, input_z, out_channel_dim, image_mode):
    """
    Show example output for the generator
    :param sess: TensorFlow session
    :param n_images: Number of Images to display
    :param input_z: Input Z Tensor
    :param out_channel_dim: The number of channels in the output image
    :param image_mode: The mode to use for images ("RGB" or "L")
    """
    cmap = None if image_mode == 'RGB' else 'gray'
    z_dim = input_z.get_shape().as_list()[-1]
    example_z = np.random.uniform(-1, 1, size=[n_images, z_dim])
    samples = sess.run(
        generator(input_z, out_channel_dim, False),
        feed_dict={input_z: example_z})
    images_grid = helper.images_square_grid(samples, image_mode)
    pyplot.imshow(images_grid, cmap=cmap)
    pyplot.show()
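The function above feeds the generator one random latent vector per image, each entry drawn uniformly from [-1, 1]. A dependency-free sketch of that sampling step, where `sample_z` and its `seed` argument are hypothetical names used only for illustration:

```python
import random


def sample_z(n_images, z_dim, seed=None):
    """Draw n_images latent vectors with entries uniform in [-1, 1]."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(z_dim)] for _ in range(n_images)]


example_z = sample_z(n_images=25, z_dim=64, seed=0)
print(len(example_z), len(example_z[0]))  # 25 64
```

Each row plays the role of one z input, so 25 rows yield 25 generated images.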
Implement the train function to build and train the GANs model. Remember to use the following functions you have implemented:
model_inputs(image_width, image_height, image_channels, z_dim)
model_loss(input_real, input_z, out_channel_dim)
model_opt(d_loss, g_loss, learning_rate, beta1)
Use the show_generator_output function to display the generator output during training.
Note: Running show_generator_output for every batch will drastically increase the training time and the size of the notebook. It is recommended to show the generator output once every 100 batches.
from tqdm import tqdm_notebook as tqdm
def train(epoch_count, batch_size, z_dim, learning_rate, beta1, get_batches, data_shape, data_image_mode):
    """
    Train the GAN
    :param epoch_count: Number of epochs
    :param batch_size: Batch Size
    :param z_dim: Z dimension
    :param learning_rate: Learning Rate
    :param beta1: The exponential decay rate for the 1st moment in the optimizer
    :param get_batches: Function to get batches
    :param data_shape: Shape of the data
    :param data_image_mode: The image mode to use for images ("RGB" or "L")
    """
    # TODO: Build Model
    # note that data_shape[0] is batch
    real_input, z, lr = model_inputs(image_width=data_shape[1], image_height=data_shape[2], image_channels=data_shape[3], z_dim=z_dim)
    d_loss, g_loss = model_loss(real_input, z, out_channel_dim=data_shape[3])
    d_train_opt, g_train_opt = model_opt(d_loss, g_loss, lr, beta1)
    with tf.Session() as sess: # what the heck is tensorflow session
        sess.run(tf.global_variables_initializer())
        loss_g_numpy = None
        loss_d_numpy = None
        for epoch_i in range(epoch_count):
            pbar = tqdm(get_batches(batch_size))
            for b, batch_images in enumerate(pbar):
                # The generator output passes through tanh, so it lies between -1 and 1,
                # but batch_images lies between -0.5 and 0.5.
                # Doubling the real images rescales them to [-1, 1], so that the real images
                # passed to the discriminator and the fake images from the generator
                # are in the same range.
                batch_images = batch_images * 2
                # TODO: Train Model
                # sample z from the same range that show_generator_output uses
                z_noise = np.random.uniform(-1, 1, size=(batch_size, z_dim))
                sess.run(d_train_opt, feed_dict={real_input: batch_images, z: z_noise, lr: learning_rate})
                sess.run(g_train_opt, feed_dict={real_input: batch_images, z: z_noise, lr: learning_rate})
                # I still cannot understand tf
                # It took me a long time to search for how to get the loss from packed tensorflow session
                if b % 100 == 0:
                    show_generator_output(sess, n_images=batch_size, input_z=z, out_channel_dim=data_shape[3], image_mode=data_image_mode)
                    loss_d_numpy = d_loss.eval({real_input: batch_images, z: z_noise, lr: learning_rate})
                    loss_g_numpy = g_loss.eval({z: z_noise, lr: learning_rate})
                    pbar.set_description("E{}B{}, G_loss={} D_loss={}".format(epoch_i, b, loss_g_numpy, loss_d_numpy))
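The doubling of batch_images in the training loop exists so that real and generated images share one value range. A minimal sketch of that reasoning on a few scalar values, where `rescale_to_tanh_range` and the sample numbers are made up for illustration:

```python
import math


def rescale_to_tanh_range(values):
    """Map values from [-0.5, 0.5] to [-1, 1] by doubling them."""
    return [value * 2 for value in values]


real_pixels = [-0.5, 0.0, 0.5]  # extremes and midpoint of the preprocessed range
rescaled = rescale_to_tanh_range(real_pixels)

# tanh bounds every generator output, however large the logit is.
fake_pixels = [math.tanh(logit) for logit in (-10.0, 0.0, 10.0)]

print(rescaled)     # [-1.0, 0.0, 1.0]
print(fake_pixels)
```

Without the rescaling, the discriminator could separate real from fake images simply by checking whether any value falls outside [-0.5, 0.5].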
Test your GANs model on MNIST. After 2 epochs, the GANs should be able to generate images that look like handwritten digits. Make sure the generator loss is lower than the discriminator loss, or close to zero.
batch_size = 32
z_dim = 64 # don't set too large or too small, otherwise mode collapse
learning_rate = 0.001
beta1 = 0.4
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
epochs = 1
mnist_dataset = helper.Dataset('mnist', glob(os.path.join(data_dir, 'mnist/*.jpg')))
with tf.Graph().as_default():
    train(epochs, batch_size, z_dim, learning_rate, beta1, mnist_dataset.get_batches,
          mnist_dataset.shape, mnist_dataset.image_mode)
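To get a feel for how often the generator output is shown, the sketch below counts the batches in one epoch and the batch indices that satisfy `b % 100 == 0`. The MNIST size of 60,000 images and the dropping of the last partial batch are assumptions about the data and helper.py; `batches_per_epoch` and `output_batches` are hypothetical names.

```python
def batches_per_epoch(n_images, batch_size):
    """Number of full batches in one epoch (a partial last batch is dropped)."""
    return n_images // batch_size


def output_batches(n_batches, every=100):
    """Batch indices at which the generator output would be shown."""
    return [b for b in range(n_batches) if b % every == 0]


n_batches = batches_per_epoch(n_images=60000, batch_size=32)  # assumed MNIST size
shown = output_batches(n_batches)
print(n_batches)   # 1875
print(len(shown))  # 19
```

Under these assumptions, one MNIST epoch with a batch size of 32 displays the generator output 19 times, which keeps the notebook at a manageable size.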
Run your GANs model on CelebA. It takes about 20 minutes to run one epoch on an average GPU. You can run the whole epoch or stop it once the GANs start generating realistic faces.
batch_size = 32
z_dim = 128
learning_rate = 0.001
beta1 = 0.4
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
epochs = 1
celeba_dataset = helper.Dataset('celeba', glob(os.path.join(data_dir, 'img_align_celeba/*.jpg')))
with tf.Graph().as_default():
    train(epochs, batch_size, z_dim, learning_rate, beta1, celeba_dataset.get_batches,
          celeba_dataset.shape, celeba_dataset.image_mode)
Before submitting this project, be sure to run all cells and then save the notebook.
Save the notebook file as "dlnd_face_generation.ipynb", and also save it as HTML via "File" -> "Download as". Please include the "helper.py" and "problem_unittests.py" files when submitting your project.