Neural-Style-Transfer

Example and Notebook


Understanding The Project



Feature maps

  • Visualize feature extraction in a CNN

  • Import VGG19 Model

from tensorflow.keras.applications import VGG19
from tensorflow.keras.models import Model

def load_vgg19()-> Model:
    vgg = VGG19(include_top=False, weights='imagenet')
    return vgg

vgg19 = load_vgg19()
vgg19.summary()

>>
=================================================================
Total params: 20,024,384
Trainable params: 20,024,384
Non-trainable params: 0
_________________________________________________________________
  • Create List of Convolution Layer names

def create_list_of_vgg_layer():

    layers_name   = ['block1_conv1',
                     'block2_conv1',
                     'block3_conv1',
                     'block4_conv1',
                     'block5_conv1']

    return (layers_name)

Create a Model that outputs a list of Feature maps

  • Iterate over the list of layer names
  • Get the output of each layer in VGG19
  • Append it to the list of outputs
  • Define the new Model with the list of Feature maps as output
def create_multi_output_model(layers_name : list)-> Model:

    vgg19 = load_vgg19()
    list_of_features_map = list()

    for name in layers_name:
        layer = vgg19.get_layer(name)
        output = layer.output
        list_of_features_map.append(output)

    model = Model([vgg19.input], list_of_features_map)
    model.trainable = False

    return (model)

Import and Preprocess image

  • Load the image with Keras
  • Preprocess the array with the VGG19-specific function
  • Recover the list of Feature maps
from numpy import expand_dims
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.applications.vgg19 import preprocess_input

img = load_img('waves.jpg')
img = img_to_array(img)
img = expand_dims(img, axis=0)
input_img = preprocess_input(img)
feature_maps = model.predict(input_img)

Plot one filter from each Feature map

  • Define the size of the Subplot
  • Iterate over the list of Feature maps
  • Plot one filter of each Feature map Tensor with imshow
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 5, figsize=(20, 15))

for i, fmap in enumerate(feature_maps):
    ax[i].imshow(fmap[0, :, :, 4], cmap='gray')

Cost Functions

Content Cost Function


Learn to Recreate Content

  • To recreate an image we rely on the Feature maps it produces.
  • We pass 2 images through the model :
    • One with random pixels $\large G$ (the Generated image)
    • One with the content $\large C$
  • We take the output of one convolution layer of the model and compare the pixel values across all Filters for the 2 images by :
    • calculating the squared difference between the 2 Tensors of Feature maps $G$ and $C$

$$\Large L_\text {content}(G, C)=\frac{1}{2} \sum(G - C)^{2}$$

Create Custom Model to Generate One Feature map

  • From the VGG19 Network we create a Model that outputs 1 Feature map for a given image
  • Load the VGG Network without the Fully Connected (FC) Layers and without the Output Layer
  • Define the output of the new Model as a Convolution Layer
  • Freeze the parameters (non-trainable)
content_layers = ['block2_conv2']

def load_vgg19()-> Model:
    vgg = VGG19(include_top=False, weights='imagenet')
    return vgg

def create_model(content_layers : list)-> Model:

    vgg19 = load_vgg19()
    name = content_layers[0]
    layer = vgg19.get_layer(name)
    output = layer.output

    model = Model([vgg19.input], output)
    model.trainable = False

    return (model)

model = create_model(content_layers)
model.summary()

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_5 (InputLayer)        [(None, None, None, 3)]   0         
                                                                 
 block1_conv1 (Conv2D)       (None, None, None, 64)    1792      
                                                                 
 block1_conv2 (Conv2D)       (None, None, None, 64)    36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, None, None, 64)    0         
                                                                 
 block2_conv1 (Conv2D)       (None, None, None, 128)   73856     
                                                                 
 block2_conv2 (Conv2D)       (None, None, None, 128)   147584    
                                                                 
=================================================================
Total params: 260,160
Trainable params: 0
Non-trainable params: 260,160
_________________________________________________________________

Get Feature maps Custom Model

  • Preprocessing image for Custom Model
  • Get Output of Custom Model -> Feature maps
def get_features_map(model : Model, img : Tensor)->list:

    process_img = preprocessing_img(img)
    features_map = model(process_img)

    return (features_map)
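The `preprocessing_img` helper used above is not shown in this README. A minimal NumPy-only sketch of what it plausibly does, mirroring Keras' VGG19 `preprocess_input` (mode `'caffe'`: RGB to BGR, then ImageNet mean centering) — the function name and exact behavior here are an assumption, not the author's code:

```python
import numpy as np

# Hypothetical stand-in for the preprocessing_img helper used above.
# Keras' VGG19 preprocess_input (mode='caffe') flips RGB to BGR and
# subtracts the ImageNet per-channel means.
VGG_MEANS_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def preprocessing_img(img):
    img = np.asarray(img, dtype=np.float32)
    if img.ndim == 3:                      # add a batch dimension if missing
        img = np.expand_dims(img, axis=0)
    img = img[..., ::-1]                   # RGB -> BGR
    return img - VGG_MEANS_BGR             # center on the ImageNet means
```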

Compute Error With Feature maps

  • So that the Model can compare 2 images, we give it Feature maps
  • We can calculate the pixel difference between the 2 Feature maps with the Mean Squared Error (MSE)
def compute_content_loss(content_generated : Tensor, content_target : Tensor):

    content_loss = tf.reduce_mean((content_generated - content_target)**2)
    return (content_loss)

Recreate Content with Feature maps

For each Iteration :

  • Extract Content Feature maps
  • Compute Error with Target Content
  • Update Generated Image
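The loop above can be sketched with `tf.GradientTape`. This is a minimal illustration, not the author's exact training code: it assumes `model` maps an image tensor to a single feature-map tensor (like the custom content model above), and the learning rate and iteration count are illustrative:

```python
import tensorflow as tf

def recreate_content(model, content_img, iterations=150, lr=2.0):
    # Target feature maps of the content image (fixed during training).
    content_target = model(content_img)
    # The generated image starts as random pixels; it is the trained variable.
    generated = tf.Variable(tf.random.uniform(content_img.shape, 0.0, 255.0))
    optimizer = tf.keras.optimizers.Adam(learning_rate=lr)

    for _ in range(iterations):
        with tf.GradientTape() as tape:
            content_generated = model(generated)
            loss = tf.reduce_mean((content_generated - content_target) ** 2)
        # Gradients flow to the image pixels, not to any model weights.
        grads = tape.gradient(loss, generated)
        optimizer.apply_gradients([(grads, generated)])
        # Keep pixel values in a valid range.
        generated.assign(tf.clip_by_value(generated, 0.0, 255.0))
    return generated
```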

Style Cost Function

Learn to Recreate Style

  • To recreate Style we need multiple Feature maps for one image
  • Compute the Correlation between the Filters of each Feature map to capture the patterns of the style (a List of Gram Matrices)
  • Initialise the Generated Image with Random Pixels
  • Set the Target Style as the List of Gram Matrices of the Style Image
  • Compute the Difference between the 2 lists of Gram Matrices (Style Image and Generated Image)
  • For each update of the Generated Image :
    • get the List of Gram Matrices of the Generated Image
    • Compare with the Style Target
    • Update the Generated Image

$$G = \text{Gram Matrix of Generated Image}$$

$$S = \text{Gram Matrix of Style Image}$$

$$L_{\text{Style}}(G, S)=\frac{1}{2} \sum(G - S)^{2}$$


Create Custom Model that outputs "list of Feature maps"

Create List of Convolution Layer Output

  • List all Convolution Layers whose outputs we want to return
def create_list_of_vgg_layer():

    style_layer_names   = ['block1_conv1',
                           'block2_conv1',
                           'block3_conv1',
                           'block4_conv1',
                           'block5_conv1']

    return (style_layer_names)

Load VGG19

  • Load the VGG Network without the Fully Connected Layers and without the Output Layer
def load_vgg19()-> Model:
    vgg = VGG19(include_top=False, weights='imagenet')
    return vgg

Create Custom Model

  • Iterate over the list of layer names
  • Get the output of each layer in VGG19
  • Append it to the list of outputs
  • Define the new Model with a list of Feature maps as output
def create_multi_output_model(layers_name : list)-> Model:

    vgg19 = load_vgg19()
    list_of_features_map = list()

    for name in layers_name:
        layer = vgg19.get_layer(name)
        output = layer.output
        list_of_features_map.append(output)

    model = Model([vgg19.input], list_of_features_map)
    model.trainable = False

    return (model)

layers_name = create_list_of_vgg_layer()
model = create_multi_output_model(layers_name)
model.summary()

>>>
=================================================================
...
Total params: 12,944,960
Trainable params: 0
Non-trainable params: 12,944,960
_________________________________________________________________

Extract Style


Gram Matrix

Matrix of Correlation between Vectors

  • Dot Products between $\large n$ Vectors

$$G(\{v_{1},\dots ,v_{n}\})={\begin{vmatrix}\langle v_{1},v_{1}\rangle &\langle v_{1},v_{2}\rangle &\dots &\langle v_{1},v_{n}\rangle \\\langle v_{2},v_{1}\rangle &\langle v_{2},v_{2}\rangle &\dots &\langle v_{2},v_{n}\rangle \\\vdots &\vdots &\ddots &\vdots \\\langle v_{n},v_{1}\rangle &\langle v_{n},v_{2}\rangle &\dots &\langle v_{n},v_{n}\rangle \end{vmatrix}}$$

  • Correlation between One matrix and its transpose

def gram_matrix(F):

    Gram = tf.matmul(F, F, transpose_a=True)
    return Gram

Filter Map to Matrix of Pixels

  • Flatten each Filter in the Feature map to a Vector of Pixels
  • Create a Matrix with the N Vectors of Pixels
def flatten_filters(Filters):
    
    batch = int(Filters.shape[0])
    vector_pixels = int(Filters.shape[1] * Filters.shape[2])
    nbr_filter = int(Filters.shape[3])
    
    matrix_pixels = tf.reshape(Filters, (batch, vector_pixels, nbr_filter))
    return (matrix_pixels)

$$\Large F = \text{Matrix of Vector Pixels}$$

Create Gram Style Matrix

  • Compute the Correlation between each Filter
  • Get the Transpose of $\large F$ $(\large F^{T})$
  • Make the Dot Product between each Vector in $\large F \text{ with } F^{T}$
  • Normalize Value with the number of pixel
def gram_matrix(Filters):

    F = flatten_filters(Filters)
    Gram = tf.matmul(F, F, transpose_a=True)
    Gram = normalize_matrix(Gram, Filters)
    return Gram

def normalize_matrix(G, Filters):

    height = tf.cast(Filters.shape[1], tf.float32)
    width = tf.cast(Filters.shape[2], tf.float32)
    number_pixels = height * width
    G = G / number_pixels
    return (G)

Get Entire Style Of Image

  • Get the List of Feature maps
  • Convert each Filter to a Vector of Pixels
  • Compute the Gram Matrix for Each Feature map
  • Save the List of Gram Matrices
  • This list of Gram Matrices becomes our target
def extract_style(features_map):

    Grams_styles = list()
    
    for style in features_map:
        Gram = gram_matrix(style)
        Grams_styles.append(Gram)
    return Grams_styles

Compute Error Between 2 Lists of Gram Matrix

  • Get the List of Feature maps for the Style Image and the Generated Image
  • Compute the Gram Style Matrix for Each Feature map of the 2 Images
  • Calculate the difference between the 2 lists of Gram Matrices
def compute_style_loss(style_generated : Tensor, 
                       style_target : Tensor):

    all_style_loss = list()

    for generated, target in zip(style_generated, style_target):

        style_layer_loss = tf.reduce_mean((generated - target)**2)
        all_style_loss.append(style_layer_loss)

    num_style_layers = len(all_style_loss)
    style_loss = tf.add_n(all_style_loss) / num_style_layers

    return (style_loss)

Recreate Style

For each iteration :

  • Get the List of Feature maps of the Generated Image
  • Compute the Gram Style Matrix for each Feature map
  • Compute the Loss with :
    • the Gram Style Matrix of the Generated Image
    • the Gram Style Matrix of the Style Image (the Target)
  • Update the Pixels of the Image
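The style loop can be sketched as a self-contained function. For completeness it inlines a compact `gram_matrix` equivalent to the one defined above; it assumes `model` maps an image to a list of feature-map tensors (like the multi-output model), and the optimizer settings are illustrative, not the author's:

```python
import tensorflow as tf

def gram_matrix(filters):
    # Flatten (batch, h, w, c) -> (batch, h*w, c), then F^T F / n_pixels.
    b, h, w, c = filters.shape
    F = tf.reshape(filters, (b, h * w, c))
    return tf.matmul(F, F, transpose_a=True) / tf.cast(h * w, tf.float32)

def recreate_style(model, style_img, iterations=100, lr=2.0):
    # The target Gram matrices of the style image are computed once.
    grams_target = [gram_matrix(f) for f in model(style_img)]
    generated = tf.Variable(tf.random.uniform(style_img.shape, 0.0, 255.0))
    optimizer = tf.keras.optimizers.Adam(learning_rate=lr)

    for _ in range(iterations):
        with tf.GradientTape() as tape:
            grams_generated = [gram_matrix(f) for f in model(generated)]
            # MSE per layer, averaged over the number of style layers.
            losses = [tf.reduce_mean((g - t) ** 2)
                      for g, t in zip(grams_generated, grams_target)]
            loss = tf.add_n(losses) / len(losses)
        grads = tape.gradient(loss, generated)
        optimizer.apply_gradients([(grads, generated)])
        generated.assign(tf.clip_by_value(generated, 0.0, 255.0))
    return generated
```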

Total Cost Function


  • Extract Content and Style from the Generated Image and the Target Images
  • Compute the Total Loss as the weighted sum of the Style Loss and the Content Loss
  • Weight each Loss to prioritize the style or the content

$$\LARGE L_{\text {Total}}=\theta \cdot L_{\text {Content}}+\beta \cdot L_{\text {Style}}$$
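The weighted sum above maps directly to a one-line function. The default weights here are a common illustrative choice (content and style losses live on very different scales), not values taken from this project:

```python
# theta weights the content loss, beta the style loss, matching
# L_total = theta * L_content + beta * L_style from the formula above.
def compute_total_loss(content_loss, style_loss, theta=1e4, beta=1e-2):
    # The default weights are illustrative; tune them to prioritize
    # content (larger theta) or style (larger beta).
    return theta * content_loss + beta * style_loss
```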

Recreate Content with Style

Compute the Total Loss of the generated image at each iteration

  • Extract the Style of the Generated Image
  • Extract the Content of the Generated Image
  • Compute the Total Loss with the Target Style $\Large S$ and the Target Content $\Large C$
  • Minimize the Error
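The whole procedure reduces to one update step repeated many times. A generic sketch, assuming `generated` is a `tf.Variable` holding the pixels and `compute_loss` is any callable that returns the total loss for the current image (for example, the weighted content + style loss built from the pieces above):

```python
import tensorflow as tf

def train_step(generated, compute_loss, optimizer):
    # generated: tf.Variable holding the image pixels being optimized.
    # compute_loss: callable returning the total loss for the current image.
    with tf.GradientTape() as tape:
        loss = compute_loss(generated)
    # Gradient descent on the pixels themselves, not on model weights.
    grads = tape.gradient(loss, generated)
    optimizer.apply_gradients([(grads, generated)])
    generated.assign(tf.clip_by_value(generated, 0.0, 255.0))
    return loss
```

Calling `train_step` in a loop drives the generated image toward the target content rendered in the target style.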