Introduction of Generative
Adversarial Network
李宏毅
Hung-yi Lee
Outline
Lecture 1: Brief Review of Deep Learning
Lecture 2: Introduction of Generative
Adversarial Network
Lecture 3: Conditional Generation
Lecture 4: Making Decision and Control
Generative Adversarial Network
(GAN)
Google 小姐
Ian Goodfellow
Yann LeCun’s comment
https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-unsupervised-learning
Yann LeCun’s comment
https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning
……
Is GAN popular?
Resources
• GAN Zoo
• https://github.com/hindupuravinash/the-gan-zoo
• Tricks: https://github.com/soumith/ganhacks
• Tutorial of Ian Goodfellow:
https://arxiv.org/abs/1701.00160
Creation
Drawing?
Writing
Poems?
http://www.rb139.com/index.php?s=/Lot/44547
Anime Face Generation
何之源's Zhihu post: https://zhuanlan.zhihu.com/p/24767059
DCGAN: https://github.com/carpedm20/DCGAN-tensorflow
Drawing?
Basic Idea of GAN
Generator
It is a neural network
(NN), or a function.
[Figure: the generator maps an input vector to an image (a matrix of pixels). Feeding it vectors such as (0.1, −3, …, 2.4, 0.9), (3, −3, …, 2.4, 0.9), (0.1, 2.1, …, 2.4, 0.9), and (0.1, −3, …, 2.4, 3.5) produces different anime faces.]
Powered by: http://mattya.github.io/chainer-DCGAN/
Each dimension of the input vector represents some characteristics,
e.g. longer hair, blue hair, open mouth.
Basic Idea of GAN
Discriminator: it is a neural network (NN), or a function.
Input: an image; output: a scalar.
Larger value means real, smaller value means fake.
[Figure: the discriminator outputs 1.0 for two realistic anime faces, 0.1 for a poorly drawn one, and "?" for another example.]
Basic Idea of GAN
Brown veins
Butterflies are
not brown
Butterflies do
not have veins
……..
Generator
Discriminator
Basic Idea of GAN
NN
Generator
v1
Discri-
minator
v1
Real images:
NN
Generator
v2
Discri-
minator
v2
NN
Generator
v3
Discri-
minator
v3
This is where the term
“adversarial” comes from.
You can explain the process
in different ways…….
Generator
v3
Generator
v2
Basic Idea of GAN
Generator
(student)
Discriminator
(teacher)
Generator
v1 Discriminator
v1
Discriminator
v2
No eyes
No mouth
So many questions ……
Q1: Why can't the generator learn by itself?
Q2: Why doesn't the discriminator generate objects itself?
Q3: How do the discriminator and generator interact?
Anime Face Generation
100 updates
Anime Face Generation
1000 updates
Anime Face Generation
2000 updates
Anime Face Generation
5000 updates
Anime Face Generation
10,000 updates
Anime Face Generation
20,000 updates
Anime Face Generation
50,000 updates
Thanks to student 陳柏文 for providing the experimental results.
[Figure: feeding the generator G a series of input vectors obtained by interpolating between two codes with coefficients 0.0, 0.1, …, 0.9; the generated faces change gradually along the interpolation.]
What is this useful for?
Generating more training data?
Augmented reality / virtual reality?
This will be clear in the next lecture.
Lecture I:
Basic Idea of Deep Learning
Machine Learning
≈ Looking for a Function
• Binary Classification (yes/no questions)
• Multi-class Classification (multiple-choice questions)
Function f Function f
Input Input
Yes/No Class 1, Class 2, … Class N
Machine Learning
≈ Looking for a Function
• Structured input/output
Speech Recognition: speech → “大家好,歡迎大家來修機器學習” (“Hello everyone, welcome to the machine learning course”)
Text-to-image: “girl with red hair and red eyes” → image
Summarization: “近半選民挺脫歐公投 ……” (“Nearly half of voters support the Brexit referendum ……”) → (title, summary)
Framework
Image Recognition: f(image) = “cat”
Model: a set of functions f1, f2, ⋯
f1(cat image) = “cat”   f1(dog image) = “dog”
f2(cat image) = “money”   f2(dog image) = “snake”
Training
Data
Goodness of
function f
Better!
“monkey” “cat” “dog”
function input:
function output:
Supervised Learning
Framework
A set of
function 21, ff
 f “cat”
Image Recognition:
Model
Training
Data
Goodness of
function f
“monkey” “cat” “dog”
Pick the “best” function f*
Testing: using f*, f*(image) = “cat”
Training Testing
Example Application
Input Output
16 x 16 = 256
1x
2x
256x
……
Ink → 1
No ink → 0
……
y1
y2
y10
Each dimension represents
the confidence of a digit.
is 1
is 2
is 0
……
0.1
0.7
0.2
The image
is “2”
Example Application
• Handwriting Digit Recognition
Function “2”
1x
2x
256x
……
……
y1
y2
y10
is 1
is 2
is 0
……
Input:
256-dim vector
output:
10-dim vector
Neural
Network
Training Data
• Preparing training data: images and their labels
The learning target is defined on
the training data.
“5” “0” “4” “1”
“3”“1”“2”“9”
Neural Network: a neuron
z = a1w1 + ⋯ + akwk + ⋯ + aKwK + b
a = σ(z)
a1, …, aK: inputs; w1, …, wK: weights; b: bias; σ: activation function
A neuron is a simple function.
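As a concrete illustration, here is a minimal NumPy sketch of this neuron and of a fully connected layer built from many such neurons (the sizes and the sigmoid activation are just examples, not the exact settings from the slides):

import numpy as np

def sigmoid(z):
    # activation function sigma(z)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(a, w, b):
    # z = a1*w1 + ... + aK*wK + b, then output = sigma(z)
    z = np.dot(a, w) + b
    return sigmoid(z)

def fully_connected_layer(a, W, b):
    # a: (K,) inputs, W: (K, M) weights of M neurons, b: (M,) biases
    return sigmoid(a @ W + b)

x = np.random.randn(256)                 # e.g. a 16 x 16 image flattened to 256 dims
W = np.random.randn(256, 10) * 0.01
b = np.zeros(10)
y = fully_connected_layer(x, W, b)       # 10-dim output, one score per digit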
Input Layer → Hidden Layers → Output Layer
Fully Connect Feedforward
Network
Input Output
1x
2x
Layer 1
……
Nx
……
Layer 2
……
Layer L
……
……
……
……
……
y1
y2
yM
neuron
is 1
is 2
is 0
……
Fully Connect Feedforward
Network
Input Output
1x
2x
Layer 1
……
Nx
……
Layer 2
……
Layer L
……
……
……
……
……
y1
y2
yM
Simple
Function 1
is 1
is 2
is 0
……
a1 Simple
Function 2
a2 Simple
Function L
y…
A very complex function
How can we know the weights and
biases of the neurons?
1x
2x
……
Nx
……
……
……
……
Ink → 1
No ink → 0
……
y1
y2
y10
y1 has the maximum value
Find a set of weights and bias such that
Input:
y2 has the maximum valueInput:
is 1
is 2
is 0
Gradient descent is the algorithm finding the weights
and biases of the neurons that can achieve the target.
Gradient descent has lots of problems ……
Learning/
Training:
Convolutional Layer……
1
1
2
2
Sparse Connectivity
3
4
5
Each neuron only connects to
part of the output of the
previous layer
3
4
Receptive
Field
Different neurons have
different, but overlapping,
receptive fields
Convolutional Layer……
1
1
2
2
Sparse Connectivity
3
4
5
Each neuron only connects to
part of the output of the
previous layer
3
4
Parameter Sharing
The neurons with different
receptive fields can use the
same set of parameters.
Fewer parameters than a
fully connected layer
Convolutional Layer……
1
1
2
2
3
4
5
3
4
Considering neuron 1 and 3 as
“filter 1” (kernel 1)
filter (kernel) size: size of the
receptive field of a neuron
Stride = 2
Considering neuron 2 and 4 as
“filter 2” (kernel 2)
Kernel size, number of filters, and
stride are all designed by
the developers.
Pooling Layer
Group the neurons corresponding
to the same filter with nearby
receptive fields
……
1
1
2
2
3
4
5
3
4
Convolutional
Layer
Pooling
Layer
1
2
Subsampling
Recurrent Neural Network
• Recurrent Structure: usually used when the input is a
sequence
Sentiment Analysis
Input: 我覺得太糟了 (“I think it is terrible”), one character per time step
Output classes: 超好雷 / 好雷 / 普雷 / 負雷 / 超負雷 (very positive / positive / neutral / negative / very negative review)
Examples:
看了這部電影覺得很高興 ……. (“Very happy after watching this movie”) → Positive (正雷)
這部電影太糟了 ……. (“This movie is terrible”) → Negative (負雷)
這部電影很棒 ……. (“This movie is great”) → Positive (正雷)
……
Function
Recurrent Neural Network
• Recurrent Structure: usually used when the input is a
sequence
fh0
h1
x1
f h2 f h3
g
No matter how long the
input sequence is, we
only need one function f
y
x2 x3
Func
f ht
xt
ht-1
LSTM GRU
Func
f ht
xt
ht-1
Auto-encoder
NN
Encoder
NN
Decoder
code
Compact
representation of
the input object
code
Can reconstruct
the original object
Learn together
28 X 28 = 784
Low
dimension
𝑐
Trainable
NN
Encoder
NN
Decoder
Reference: Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the
dimensionality of data with neural networks." Science 313.5786 (2006): 504-507
• NN encoder + NN decoder = a deep network
Deep Auto-encoder: Input Layer → Layer → … → bottleneck (Code) → … → Layer → Output Layer
The first half of the network is the Encoder, the second half is the Decoder.
The input x and the reconstruction x̂ should be as close as possible.
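A minimal sketch of such an encoder/decoder pair in Python with PyTorch; the layer sizes and the MSE reconstruction loss are illustrative assumptions, not the exact setup from the slides:

import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, code_dim=32):
        super().__init__()
        # encoder: input -> low-dimensional code c
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, code_dim))
        # decoder: code c -> reconstruction x_hat
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid())

    def forward(self, x):
        c = self.encoder(x)
        return self.decoder(c), c

model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)                    # a batch of flattened 28 x 28 images
x_hat, c = model(x)
loss = nn.functional.mse_loss(x_hat, x)    # "as close as possible"
opt.zero_grad(); loss.backward(); opt.step()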
Deep Auto-encoder - Example
Pixel -> tSNE
𝑐
NN
Encoder
PCA reduced to 32-dim
Unsupervised Learning
without annotation
Sequence-to-sequence
Auto-encoder
• Machine represents each audio segment also by a
vector
audio segment
vector
(word-level)
Learn from lots of audio
without supervision
[Chung, Wu, Lee, Lee, Interspeech 16]
Used in the following
spoken language
understanding applications
Sequence-to-sequence
Auto-encoder
audio segment
acoustic featuresx1 x2 x3 x4
RNN Encoder
vector
The vector we want
We use sequence-to-sequence auto-encoder here
The training is unsupervised.
Sequence-to-sequence
Auto-encoder
RNN Generator
x1 x2 x3 x4
y1 y2 y3 y4
x1 x2 x3
x4
RNN Encoder
audio segment
acoustic features
The RNN encoder and
generator are jointly trained.
Input acoustic features
What does machine learn?
• Typical word to vector:
• Audio word to vector (phonetic information)
V(Rome) − V(Italy) + V(Germany) ≈ V(Berlin)
V(king) − V(queen) + V(aunt) ≈ V(uncle)
V( ) - V( ) + V( ) = V( )
V( ) - V( ) + V( ) = V( )
Network
Many kinds of
network structures:
Matrix
How to find
the function?
Given the example inputs/outputs as
training data: {(x1,y1),(x2,y2), ……, (x1000,y1000)}
➢ Fully connected feedforward network
➢ Convolutional neural network (CNN)
➢ Recurrent neural network (RNN)
Vector
Vector Seq
𝑥 𝑦
Deep Learning in One Slide (Review)
(speech, video, sentence)
Different networks can take input/output in different domains.
Lecture II:
Introduction of GAN
To learn more theory:
https://www.youtube.com/watch?v=0CKeqXl5IY0&lc=z13zuxbglpvsgbgpo04cg1bxuoraejdpapo0k
https://www.youtube.com/watch?v=KSN4QYgAtao&lc=z13kz1nqvuqsipqfn23phthasre4evrdo
Outline
When can I use GAN?
Generation by GAN
Improving GAN
Evaluation
A little bit theory
Structured Learning
f: X → Y
Regression: output a scalar
Classification: output a “class”
Structured Learning/Prediction: output a
sequence, a matrix, a graph, a tree ……
Machine learning is to find a function f
Classification output is a one-hot vector: Class 1 = (1 0 0), Class 2 = (0 1 0), Class 3 = (0 0 1)
Structured Learning: the output is composed of components with dependency
Regression,
Classification
Structured
Learning
Output Sequence: f: X → Y
Machine Translation: X = sentence of language 1, Y = sentence of language 2
  e.g. “機器學習及其深層與結構化” → “Machine learning and having it deep and structured”
Speech Recognition: X = speech, Y = transcription
  e.g. speech → “感謝大家來上課” (“Thanks everyone for coming to class”)
Chat-bot: X = what a user says, Y = response of machine
  e.g. “How are you?” → “I’m fine.”
Output Matrix   f: X → Y
ref: https://arxiv.org/pdf/1605.05396.pdf
“this white and yellow flower
have thin white petals and a
round yellow stamen”
:X :Y
Text to Image
Image to Image
:X :Y
Colorization:
Ref: https://arxiv.org/pdf/1611.07004v1.pdf
Decision Making and Control
Action:
“right”
GO Playing is the same.
Action:
“fire”
Action:
“left”
Why Structured Learning
Interesting?
• One-shot/Zero-shot Learning:
• In classification, each class has some examples.
• In structured learning,
• If you consider each possible output as a
“class” ……
• Since the output space is huge, most “classes”
do not have any training data.
• Machine has to create new stuff during
testing.
• Need more intelligence
Why Structured Learning
Interesting?
• Machine has to learn to plan.
• Machine can generate objects component-by-
component, but it should have a big picture in its mind.
• Because the output components have dependency,
they should be considered globally.
Image
Generation
Sentence
Generation
這個婆娘不是人,九天玄女下凡塵
(“This woman is not a mortal — she is a goddess descended from heaven”: the first line sounds insulting until the second line completes the picture.)
Structured Learning Approach
Bottom
Up
Top
Down
Generative
Adversarial
Network (GAN)
Generator
Discriminator
Learn to generate
the object at the
component level
Evaluating the
whole object, and
find the best one
Outline
When can I use GAN?
Generation by GAN
Improving GAN
Evaluation
A little bit theory
Generation
NN
Generator
Image Generation
Sentence Generation
NN
Generator
We will control what to generate
later. → Conditional Generation
0.1
−0.1
⋮
0.7
−0.3
0.1
⋮
0.9
0.1
−0.1
⋮
0.2
−0.3
0.1
⋮
0.5
Good morning.
How are you?
0.3
−0.1
⋮
−0.7
0.3
−0.1
⋮
−0.7 Good afternoon.
In a specific range
So many questions ……
Q1: Why can't the generator learn by itself?
Q2: Why doesn't the discriminator generate objects itself?
Q3: How do the discriminator and generator interact?
Generator
Image:
code: (0.1, 0.9) (where do they come from?)
NN
Generator
0.1
−0.5
0.2
−0.1
0.3
0.2
NN
Generator
code
vectors
code
code0.1
0.9
image
As close as possible
c.f. NN
Classifier
𝑦1
𝑦2
⋮
As close as possible
1
0
⋮
Generator
Encoder in auto-encoder
provides the code ☺
𝑐
NN
Encoder
NN
Generator
code
vectors
code
code
Image:
code: (0.1, 0.9) (where do they come from?)
0.1
−0.5
0.2
−0.1
0.3
0.2
Remember Auto-encoder?
As close as possible
NN
Encoder
NN
Decoder
code
NN
Decoder
code
Randomly generate
a vector as code Image ?
= Generator
= Generator
Remember Auto-encoder?
NN
Decoder
code
2D
-1.5 1.5
−1.5
0
NN
Decoder
1.5
0
NN
Decoder
(real examples)
image
Remember Auto-encoder?
-1.5 1.5
(real examples)
Generator NN
Generator
code
vectors
code
codeNN
Generator
a
NN
Generator
b
NN
Generator
?a b0.5x + 0.5x
Image:
code: (0.1, 0.9) (where do they come from?)
0.1
−0.5
0.2
−0.1
0.3
0.2
Auto-encoder
Variational Auto-encoder
(VAE)
NN
Encoder
input NN
Decoder
output
The encoder outputs (m1, m2, m3) and (σ1, σ2, σ3).
Sample (e1, e2, e3) from a normal distribution, then
c_i = exp(σ_i) × e_i + m_i
and feed (c1, c2, c3) to the decoder.
Minimize the reconstruction error, and also minimize
Σ_{i=1..3} ( exp(σ_i) − (1 + σ_i) + (m_i)² )
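A minimal PyTorch sketch of this sampling step and loss; the 3-dimensional code matches the slide, while the input size, networks, and reconstruction loss are illustrative assumptions:

import torch
import torch.nn as nn

enc = nn.Linear(784, 6)     # outputs [m1..m3, sigma1..sigma3]
dec = nn.Linear(3, 784)

def vae_step(x):
    h = enc(x)
    m, sigma = h[:, :3], h[:, 3:]
    e = torch.randn_like(sigma)            # e_i sampled from N(0, 1)
    c = torch.exp(sigma) * e + m           # c_i = exp(sigma_i) * e_i + m_i
    x_hat = torch.sigmoid(dec(c))
    recon = nn.functional.mse_loss(x_hat, x, reduction="sum")
    # regularizer from the slide: sum_i exp(sigma_i) - (1 + sigma_i) + m_i^2
    reg = (torch.exp(sigma) - (1 + sigma) + m ** 2).sum()
    return recon + reg

x = torch.rand(64, 784)
loss = vae_step(x)
loss.backward()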
NN
Encoder
NN
Decoder
code
input output
= Generator
Why VAE?
?
encode
decode
code
Ref: http://www.wired.co.uk/article/google-artificial-intelligence-poetry
Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, Samy
Bengio, Generating Sentences from a Continuous Space, arXiv preprint, 2015
VAE - Writing Poetry
RNN
Encoder
sentence RNN
Decoder
sentence
code
Code Space
i went to the store to buy some groceries.
"come with me," she said.
i store to buy some groceries.
i were to buy any groceries.
"talk to me," she said.
"don’t worry about it," she said.
……
What do we miss?
G
as close as
possible
TargetGenerated Image
It will be fine if the generator can truly copy the target image.
What if the generator makes some mistakes …….
Some mistakes are serious, while some are fine.
What do we miss? Target
1 pixel error 1 pixel error
6 pixel errors 6 pixel errors
我覺得不行 (“I think this is not OK”) — for each 1-pixel-error case
我覺得其實 OK (“I think this is actually fine”) — for each 6-pixel-error case
What do we miss?
我覺得不行 (“not OK”)   我覺得其實 OK (“actually fine”)
The relation between the components are critical.
The last layer generates each components independently.
Need deep structure to catch the relation between
components.
Layer L-1
……
Layer L
……
……
……
……
Each neural in output layer
corresponds to a pixel.
ink
empty
“Hey neighbor, let's produce the same color.” — “Who cares about you?”
(Variational) Auto-encoder
Thanks to student 黃淞楓 for providing the results.
x1
x2
G
𝑧1
𝑧2
𝑥1
𝑥2
So many questions ……
Q1: Why can't the generator learn by itself?
Q2: Why doesn't the discriminator generate objects itself?
Q3: How do the discriminator and generator interact?
Discriminator
• Discriminator is a function D (a network, can be deep)
• Input x: an object x (e.g. an image)
• Output D(x): scalar which represents how “good” an
object x is
D: X → R
D 1.0 D 0.1
Can we use the discriminator to generate objects?
Yes.
Evaluation function, Potential
Function, Evaluation Function …
Discriminator
• It is easier to catch the relation between the
components by top-down evaluation.
我覺得不行 (“not OK”)   我覺得其實 OK (“actually fine”)
This CNN filter is
good enough.
Discriminator
• Suppose we already have a good discriminator
D(x) …
How to learn the discriminator?
Enumerate all possible x !!!
Is that feasible???
Discriminator - Training
• I have some real images
Discriminator training needs some negative examples.
D scalar 1
(real)
D scalar 1
(real)
D scalar 1
(real)
D scalar 1
(real)
Discriminator only learns to output “1” (real).
Discriminator - Training
D(x)
real examples
In practice, you cannot decrease D(x) for all x
other than the real examples.
Discrimi
nator D
object
x
scalar
D(x)
Discriminator - Training
• Negative examples are critical.
D scalar 1
(real)
D scalar 0
(fake)
D 0.9 Pretty real
D scalar 1
(real)
D scalar 0
(fake)
How to generate realistic
negative examples?
Discriminator - Training
• General Algorithm
• Given a set of positive examples, randomly
generate a set of negative examples.
• In each iteration
• Learn a discriminator D that can discriminate
positive and negative examples.
• Generate negative examples by discriminator D
D: real examples v.s. negative examples
x̃ = arg max_{x∈X} D(x)
Discriminator
- Training
[Figure: the shape of D(x) over successive training iterations.]
In the end ……
real
generated
Graphical Model
Bayesian Network
(Directed Graph)
Markov Random Field
(Undirected Graph)
Markov Logic
Network
Boltzmann
Machine
Restricted
Boltzmann Machine
Structured Learning ➢Structured Perceptron
➢Structured SVM
➢Gibbs sampling
➢Hidden information
➢Application: sequence
labelling, summarization
Conditional
Random Field
Segmental CRF
(Only list some of
the approaches)
Energy-based
Model:
http://www.cs.nyu.edu/~yann/research/ebm/
Generator v.s. Discriminator
• Generator
• Pros:
• Easy to generate even
with deep model
• Cons:
• Imitate the appearance
• Hard to learn the
correlation between
components
• Discriminator
• Pros:
• Considering the big
picture
• Cons:
• Generation is not
always feasible
• Especially when
your model is deep
• How to do negative
sampling?
So many questions ……
Q1: Why can't the generator learn by itself?
Q2: Why doesn't the discriminator generate objects itself?
Q3: How do the discriminator and generator interact?
Discriminator - Training
• General Algorithm
• Given a set of positive examples, randomly
generate a set of negative examples.
• In each iteration
• Learn a discriminator D that can discriminate
positive and negative examples.
• Generate negative examples by discriminator D
D: real examples v.s. negative examples, x̃ = arg max_{x∈X} D(x)
Generating Negative Examples
Replace the intractable arg max by a generator: x̃ = G(z), i.e. G learns to produce the x that the discriminator scores highly.
Discri-
minator
NN
Generator
image
code
0.13
hidden layer
update fix 0.9
Algorithm
• Initialize generator G and discriminator D
• In each training iteration:
  Learning D:
    Sample some real objects (label 1)
    Sample some codes and generate some fake objects with G (label 0)
    Update D (fix G): D learns to assign 1 to real objects and 0 to fake objects
  Learning G:
    Sample some codes, feed them to G, and pass G's outputs to D
    Update G (fix D) by gradient ascent so that D's output on the generated objects approaches 1
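A minimal PyTorch sketch of this training loop; the MLP architectures, 100-dim code, and hyper-parameters are illustrative assumptions, not the setup used for the anime faces:

import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1), nn.Sigmoid())
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real):                       # real: (batch, 784) images
    batch = real.size(0)
    # --- learning D (fix G): real -> 1, fake -> 0 ---
    z = torch.randn(batch, 100)
    fake = G(z).detach()
    d_loss = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(fake), torch.zeros(batch, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()
    # --- learning G (fix D): make D(G(z)) approach 1 ---
    z = torch.randn(batch, 100)
    g_loss = bce(D(G(z)), torch.ones(batch, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return d_loss.item(), g_loss.item()

real = torch.rand(64, 784) * 2 - 1          # stand-in for a batch of real images
print(train_step(real))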
Benefit of GAN
• From Discriminator’s point of view
• Using generator to generate negative samples
• From Generator’s point of view
• Still generate the object component-by-
component
• But it is learned from the discriminator with
global view.
 xDx
Xx
 maxarg~G x~ =
efficient
GAN
Thanks to student 段逸林 for providing the results.
https://arxiv.org/abs/1512.09300
VAE
GAN
Outline
When can I use GAN?
Generation by GAN
Improving GAN
Evaluation
A little bit theory
Binary Classifier
as Discriminator
Binary
Classifier
Real
A typical binary classifier uses a sigmoid function
at the output layer.
You cannot train your classifier to be too good……
They don’t
move.
Binary
Classifier
Fake
1 is the largest, 0 is the smallest
real
generated
Binary Classifier
as Discriminator
• Don’t let the discriminator perfectly separate real
and generated data
• Weaken your discriminator?
real
generated
strong
weak
Binary Classifier
as Discriminator
• Don’t let the discriminator perfectly separate real
and generated data
• Add noise to input or label?
real
generated
Least Square GAN (LSGAN)
• Replace sigmoid with linear (replace classification
with regression)
1 (Real) 0 (Fake)
Binary
Classifier
scalar
Binary
Classifier
scalar
They don’t
move.
0
1
real
generated
WGAN
• We want the scores of the real examples as large as
possible, generated examples as small as possible.
Any
problem?
Binary
Classifier
scalar
Binary
Classifier
scalar
as large as possible as small as possible
real
generated
Move towards Max +∞
−∞
WGAN
Move towards Max
Lipschitz function: ‖D(x1) − D(x2)‖ ≤ K‖x1 − x2‖
K = 1 for “1-Lipschitz”
Output
change
Input
change
Do not change fast
The discriminator should
be a 1-Lipschitz function.
It should be smooth.
How to realize?
WGAN
• Original WGAN → Weight Clipping
• Improved WGAN → Gradient Penalty
Force the parameters w between c and -c
After parameter update, if w > c, w = c; if w < -c, w = -c
Do not truly maximize (minimize) the real (generated) examples
It should be
smooth enough.
𝐷 𝑥𝐷 𝑥
Keep the gradient close to 1
Easy to move
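A minimal PyTorch sketch of the improved-WGAN discriminator loss with gradient penalty; the penalty weight of 10 and the interpolation between real and generated samples follow the usual WGAN-GP recipe and are assumptions here, not details taken from the slides:

import torch

def gradient_penalty(D, real, fake, lam=10.0):
    # interpolate between real and generated samples
    eps = torch.rand(real.size(0), 1)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    d_out = D(x_hat)
    grads = torch.autograd.grad(outputs=d_out.sum(), inputs=x_hat,
                                create_graph=True)[0]
    # keep the gradient norm close to 1 (1-Lipschitz)
    return lam * ((grads.norm(2, dim=1) - 1) ** 2).mean()

def d_loss_wgan_gp(D, real, fake):
    # scores of real examples as large as possible, generated as small as possible
    return -(D(real).mean() - D(fake).mean()) + gradient_penalty(D, real, fake)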
DCGAN
G: CNN, D: CNN
G: CNN (no normalization), D: CNN (no normalization)
LSGAN Original
WGAN
Improved
WGAN
G: CNN (tanh), D: CNN(tanh)
DCGAN LSGAN Original
WGAN
Improved
WGAN
G: MLP, D: CNN
G: CNN (bad structure), D: CNN
G: 101 layer, D: 101 layer
Sentence Generation
• good bye.
g o o d <space> ……
0
1
0
0
…………
d
g
o
<space>
0
0
1
0
0
0
1
0
1
0
0
0
1
0
0
1
Consider this matrix as an “image”
1-of-N
Encoding
I will talk about
RNN later.
Sentence Generation
• Real sentence
• Generated
1
0
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
0
1
0.9
0.1
0
0
0
0.1
0.9
0
0
0
0.1
0.1
0.7
0.1
0
0
0
0.1
0.8
0.1
0
0
0
0.1
0.9
Can never
be 1-of-N
A binary classifier
can immediately
find the difference.
WGAN is helpful
No overlap
I will talk about
RNN later.
https://arxiv.org/abs/1704.00028
Improved WGAN successfully generating sentences
W-GAN – Generating Tang Poems (唐詩鍊成)
Thanks to student 李仲翊 for providing the experimental results.
• 升雲白遲丹齋取,此酒新巷市入頭。黃道故海歸中後,不驚入得韻子門。
• 據口容章蕃翎翎,邦貸無遊隔將毬。外蕭曾臺遶出畧,此計推上呂天夢。
• 新來寳伎泉,手雪泓臺蓑。曾子花路魏,不謀散薦船。
• 功持牧度機邈爭,不躚官嬉牧涼散。不迎白旅今掩冬,盡蘸金祇可停。
• 玉十洪沄爭春風,溪子風佛挺橫鞋。盤盤稅焰先花齋,誰過飄鶴一丞幢。
• 海人依野庇,為阻例沉迴。座花不佐樹,弟闌十名儂。
• 入維當興日世瀕,不評皺。頭醉空其杯,駸園凋送頭。
• 鉢笙動春枝,寶叅潔長知。官爲宻爛去,絆粒薛一靜。
• 吾涼腕不楚,縱先待旅知。楚人縱酒待,一蔓飄聖猜。
• 折幕故癘應韻子,徑頭霜瓊老徑徑。尚錯春鏘熊悽梅,去吹依能九將香。
• 通可矯目鷃須浄,丹迤挈花一抵嫖。外子當目中前醒,迎日幽筆鈎弧前。
• 庭愛四樹人庭好,無衣服仍繡秋州。更怯風流欲鴂雲,帛陽舊據畆婷儻。
Output: 32 characters per poem (including punctuation)
Loss-sensitive GAN (LSGAN)
D(x)
𝑥
WGAN LSGAN
𝑥′′
𝑥′
D(x)
Δ 𝑥, 𝑥′
Δ 𝑥, 𝑥′
𝑥′′
𝑥′
𝑥
Energy-based GAN (EBGAN)
• Using an autoencoder as discriminator D
scalar
Discriminator
An image
is good.
It can be reconstructed
by autoencoder.
=
0 for best
images
Generator is
the same.
Autoencoder (EN → DE): reconstruction error 0.1, multiplied by −1 → discriminator output −0.1
EBGAN
Hard to reconstruct, easy to destroy.
Margin m: 0 is for the best; the scores of generated examples do not have to be pushed very negative.
An auto-encoder based discriminator only gives large values to a limited region.
Mode Collapse
Missing Mode ?
?
Mode collapse is
easy to detect.
Solution?
• Feature matching, Minibatch discrimination
• https://arxiv.org/abs/1606.03498
• Unroll GAN
• https://arxiv.org/abs/1611.02163
• InfoGAN
• Next lecture
• Ensemble GAN
• Next lecture
Outline
When can I use GAN?
Generation by GAN
Improving GAN
Evaluation
A little bit theory
Lucas Theis, Aäron van den Oord, Matthias Bethge,
“A note on the evaluation of generative models”,
arXiv preprint, 2015
Likelihood
x^i: real data (not observed during training); the generator defines a distribution P_G from which generated data are sampled.
Log likelihood: L = (1/N) Σ_i log P_G(x^i)
We cannot compute P_G(x^i). We can only sample from P_G.
Prior
Distribution
Likelihood
- Kernel Density Estimation
• Estimate the distribution of PG(x) from sampling
Generator
PG
https://stats.stackexchange.com/questions/244012/can-you-explain-parzen-window-kernel-density-estimation-in-laymans-terms/244023
Now we have an approximation of P_G, so we can compute P_G(x^i) for each real data point x^i.
Then we can compute the likelihood.
Each sample is the mean of a
Gaussian with the same covariance.
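A minimal Python sketch of this kernel density estimate using scipy, which places a Gaussian on each generator sample and picks the bandwidth automatically; shapes and data below are placeholders:

import numpy as np
from scipy.stats import gaussian_kde

def estimate_log_likelihood(gen_samples, real_data):
    # gen_samples: (dim, n_samples) drawn from the generator P_G
    # real_data:   (dim, n_real) held-out real examples x^i
    kde = gaussian_kde(gen_samples)          # each sample = mean of a Gaussian
    log_p = kde.logpdf(real_data)            # approximate log P_G(x^i)
    return log_p.mean()                      # L = (1/N) sum_i log P_G(x^i)

gen = np.random.randn(2, 5000)               # stand-in generator samples
real = np.random.randn(2, 100)                # stand-in real data
print(estimate_log_likelihood(gen, real))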
Likelihood v.s. Quality
• Low likelihood, high quality?
  Consider a model generating good images with small variance: the data for evaluation x^i may still get P_G(x^i) ≈ 0.
• High likelihood, low quality?
  G1: L = (1/N) Σ_i log P_{G1}(x^i)
  G2: with probability 0.99 outputs noise, with probability 0.01 outputs what G1 outputs.
  Then L = (1/N) Σ_i log( 0.01 · P_{G1}(x^i) ) = −log 100 + (1/N) Σ_i log P_{G1}(x^i),
  i.e. the log likelihood only drops by log 100 ≈ 4.6, even though most outputs are garbage.
Evaluate by Other Networks
Feed a generated image x into a well-trained CNN to obtain the class distribution P(y|x).
Lower entropy of P(y|x) means higher visual quality.
Also average the distributions over many samples: (1/N) Σ_n P(y^n|x^n).
Higher entropy of this average distribution means higher variety.
Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen,
“Improved Techniques for Training GANs”, arXiv preprint, 2016
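A minimal NumPy sketch of the two entropy measures above, assuming probs holds the CNN's class distributions P(y|x^n) for N generated images; the data here are placeholders, and combining the two numbers into a single Inception-Score-like figure is the usual follow-up:

import numpy as np

def entropy(p, eps=1e-12):
    return -np.sum(p * np.log(p + eps), axis=-1)

def quality_and_variety(probs):
    # probs: (N, num_classes), each row is P(y|x^n) from a well-trained CNN
    per_image_entropy = entropy(probs).mean()     # low -> high visual quality
    marginal = probs.mean(axis=0)                  # (1/N) sum_n P(y|x^n)
    marginal_entropy = entropy(marginal)           # high -> high variety
    return per_image_entropy, marginal_entropy

probs = np.random.dirichlet(np.ones(10), size=1000)   # stand-in CNN outputs
print(quality_and_variety(probs))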
Evaluate by Other Networks
- Inception Score
• Improved W-GAN
Outline
When can I use GAN?
Generation by GAN
Improving GAN
Evaluation
A little bit theory Can skip this part
Theory behind GAN
• The data we want to generate has a distribution
𝑃𝑑𝑎𝑡𝑎 𝑥
P_data(x): over the image space, realistic images have high probability and the rest have low probability.
Theory behind GAN
• A generator G is a network. The network defines a
probability distribution.
generator G: z (sampled from a normal distribution) → x = G(z)
The generator defines a distribution P_G(x), which should be as close as possible to P_data(x).
https://blog.openai.com/generative-models/
It is difficult to compute P_G(x).
We do not know what the distribution looks like.
generator G: z (sampled from a normal distribution) → x = G(z)
P_G(x) should be as close as possible to P_data(x).
https://blog.openai.com/generative-models/
Discri-
minator
When the discriminator is well trained, its
loss represents a specific divergence
measure between PG and Pdata.
Update discriminator is to minimize
the divergence measure
Binary classifier evaluates the
JS divergence
You can design the discriminator to evaluate other divergence.
Why GAN is hard to train?
http://www.guokr.com/post/773890/
Better
Why GAN is hard to train?
P_G0(x) and P_data(x) do not overlap: JS(P_G0 ‖ P_data) = log 2
P_G50(x) and P_data(x) still do not overlap: JS(P_G50 ‖ P_data) = log 2
…………
P_G100(x) matches P_data(x): JS(P_G100 ‖ P_data) = 0
As long as the two distributions do not overlap, the JS divergence stays at log 2, so P_G50 is not really “better” than P_G0 from the JS point of view.
Evaluating JS divergence
• JS divergence estimated by discriminator telling
little information
https://arxiv.org/abs/1701.07875
Weak Generator Strong Generator
Earth Mover’s Distance
• Considering one distribution P as a pile of earth,
and another distribution Q as the target
• The average distance the earth mover has to move
the earth.
𝑃 𝑄
d
W(P, Q) = d
P_G0 vs. P_data (distance d0): JS(P_G0, P_data) = log 2, but W(P_G0, P_data) = d0
P_G50 vs. P_data (distance d50): JS(P_G50, P_data) = log 2, but W(P_G50, P_data) = d50
…… ……
P_G100 = P_data: JS(P_G100, P_data) = 0 and W(P_G100, P_data) = 0
Unlike JS, the earth mover's distance keeps decreasing as P_G moves towards P_data.
Why Earth Mover’s Distance?
D_f(P_data ‖ P_G) vs. W(P_data, P_G)
https://arxiv.org/abs/1701.07875
(The f-divergence changes vertically where the distributions start to overlap, while the Wasserstein distance varies smoothly.)
When can I use GAN?
• I consider GAN as a new approach for structured learning.
Generation by GAN
• Generator: generate locally
• Discriminator: evaluate globally
• Generator + Discriminator → GAN
Improving GAN
Evaluation
A Little Bit Theory
Concluding Remarks
Question?
https://arxiv.org/pdf/1406.2661.pdf
Yann Lecun’s talk
Discriminator is flat
eventually?
Discriminator is flat in the end.
• Discriminator leads the generator
[Figure: D(x) gradually flattens as the generated distribution approaches the real one.]
Discriminator
Real
Generated
Discriminator is flat in the end.
Source: https://www.youtube.com/watch?v=ebMei6bYeWw (credit: Benjamin Striner)
Discriminator is flat in the end.
http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/
Lecture III:
Conditional Generation
and its application on Anime Face Generation
Conditional Generation
Generation
Conditional Generation
NN
Generator
“Girl with red hair
and red eyes”
“Girl with yellow
ribbon”
NN
Generator
0.1
−0.1
⋮
0.7
−0.3
0.1
⋮
0.9
0.3
−0.1
⋮
−0.7
In a specific range
Conditional Generation
• We don’t want to simply generate some random stuff.
• Generate objects based on conditions:
Given
condition:
Caption Generation
Chat-bot
Given
condition:
“Hello”
“A young girl
is dancing.”
“Hello. Nice
to see you.”
Conditional Generation
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
Modifying Input Code
[Figure: as in Lecture 2, different input vectors (e.g. changing one dimension of (0.1, −3, …, 2.4, 0.9)) produce faces with different characteristics such as longer hair, blue hair, or an open mouth.]
Each dimension of the input vector represents some characteristics.
➢The input code determines the generator output.
➢Understand the meaning of each dimension to
control the output.
Conditional Generation
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
InfoGAN
Regular
GAN
Modifying a specific dimension,
no clear meaning
What we expect Actually …
(The colors
represents the
characteristics.)
What is InfoGAN?
The generator's input z is split into (c, z’); Generator(c, z’) = x.
The discriminator gives x a scalar score, as usual.
A classifier (sharing parameters with the discriminator; only the last layer is different) predicts the code c that generated x.
Generator = “decoder”, classifier = “encoder”: together they form an “auto-encoder” over c.
What is InfoGAN?
Classifier
Generator=z
z’
c
x
c
Predict the code c
that generates x
“Auto-
encoder”
encoder
decoder
c must have clear
influence on x
The classifier can
recover c from x.
https://arxiv.org/abs/1606.03657
https://arxiv.org/abs/1606.03657
Conditional Generation
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
Connecting Code and Attribute
CelebA
GAN+Autoencoder
• We have a generator (input z, output x)
• However, given x, how can we find z?
• Learn an encoder (input x, output z)
Generator
(Decoder)
Encoder z xx
as close as possible
fixed
Discriminator
init
Autoencoder
different structures?
Generator
(Decoder)
Encoder z xx
as close as possible
Attribute Representation
z_long = (1/N1) Σ_{x ∈ long hair} En(x) − (1/N2) Σ_{x' ∉ long hair} En(x')
Given an image x (e.g. short hair), compute z' = En(x) + z_long, then generate Gen(z').
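A minimal PyTorch sketch of this attribute-vector arithmetic, assuming already-trained En and Gen networks and a boolean mask marking which training images have the attribute; all names below are placeholders:

import torch

def attribute_vector(En, images, has_attr):
    # images: (N, ...) training images, has_attr: (N,) boolean mask (e.g. "long hair")
    codes = En(images)
    z_attr = codes[has_attr].mean(dim=0) - codes[~has_attr].mean(dim=0)
    return z_attr

def edit(En, Gen, x, z_attr):
    # move the code of x in the attribute direction, then decode
    z_prime = En(x) + z_attr
    return Gen(z_prime)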
https://www.youtube.com/watch?v=kPEIJJsQr7U
Photo
Editing
Conditional Generation
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
Conditional Generation
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
Conditional GAN
• Generating images based on text description
NN
Encoder
NN
Generator
sentence code
code image
“a dog is sleeping”
?
Conditional GAN
• Generating images based on text description
NN
Encoder
NN
Generator
sentence code
code
“a bird is flying”
c1: a dog is running ො𝑥1:
ො𝑥2:c2: a bird is flying
“a dog is running”
Target of
NN output
Conditional GAN
• Text to image by traditional supervised learning
NN Image
Text: “train”
c1: a dog is running ො𝑥1:
ො𝑥2:c2: a bird is flying
A blurry image!
c1: a dog is running ො𝑥1:
as close as
possible
Conditional GAN
G
𝑧Prior distribution
x = G(c,z)
c: train
Target of
NN output
Text: “train”
A blurry image!
Approximate the
distribution below
It is a distribution.
Image
Conditional GAN
G: z (from the prior distribution) + condition c (“train”) → x = G(c, z)
D (type 1): input x only, output a scalar — x is realistic or not.
D (type 2): input (c, x), output a scalar — x is realistic or not AND c and x are matched or not.
Positive example: (train, matching real image)
Negative examples: (train, generated image), (cat, mismatched real image)
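A minimal PyTorch sketch of the type-2 conditional discriminator update using the positive/negative pairs above; the text-code dimension, image flattening, and concatenation scheme are illustrative assumptions:

import torch
import torch.nn as nn

D = nn.Sequential(nn.Linear(784 + 128, 256), nn.ReLU(),
                  nn.Linear(256, 1), nn.Sigmoid())
bce = nn.BCELoss()

def d_loss(c, real_x, fake_x, mismatched_x):
    # c: (B, 128) condition codes; images flattened to (B, 784)
    ones, zeros = torch.ones(c.size(0), 1), torch.zeros(c.size(0), 1)
    score = lambda x: D(torch.cat([x, c], dim=1))
    return (bce(score(real_x), ones) +          # (c, matching real image) -> 1
            bce(score(fake_x), zeros) +         # (c, generated image)     -> 0
            bce(score(mismatched_x), zeros))    # (c, mismatched real)     -> 0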
Text to Image
- Results
"red flower with
black center"
Text to Image
- Results
Image-to-image
https://arxiv.org/pdf/1611.07004
G
𝑧
x = G(c,z)
𝑐
as close as
possible
Image-to-image
• Traditional supervised approach
NN Image
It is blurry because it is the
average of several images.
Testing:
input close
Image-to-image
• Experimental results
Testing:
input close GAN
G
𝑧
Image D scalar
GAN + close
Image super resolution
• Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew
Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes
Totz, Zehan Wang, Wenzhe Shi, “Photo-Realistic Single Image Super-Resolution
Using a Generative Adversarial Network”, CVPR, 2016
Video Generation
Generator
Discrimi
nator
Last frame is real
or generated
Discriminator thinks it is real
target
Minimize
distance
https://github.com/dyelax/Adversarial_Video_Generation
Speech Enhancement
• Typical deep learning approach
Noisy Clean
G Output
Using CNN
Speech Enhancement
• Conditional GAN
G
D scalar
noisy output clean
noisy clean
output
noisy
training data
(fake pair or not)
Speech Enhancement
Noisy Speech
Enhanced Speech
Which Enhanced Speech is better?
Thanks to student 廖峴峰 for providing the experimental results
(co-advised with Dr. Yu Tsao of Academia Sinica).
More about Speech Processing
• Speech synthesis
• Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru
Hiramatsu, Kunio Kashino, “Generative Adversarial Network-based
Postfiltering for Statistical Parametric Speech Synthesis”, ICASSP 2017
• Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Training
algorithm to deceive anti-spoofing verification for DNN-based speech
synthesis, ”, ICASSP 2017
• Voice Conversion
• Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang,
Voice Conversion from Unaligned Corpora using Variational Autoencoding
Wasserstein Generative Adversarial Networks, Interspeech 2017
• Speech Enhancement
• Santiago Pascual, Antonio Bonafonte, Joan Serrà, SEGAN: Speech
Enhancement Generative Adversarial Network, Interspeech 2017
Conditional Generation
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
Cycle GAN, Disco GAN
Transform an object from one domain to another without paired data
photo van Gogh
Domain X Domain Y
paired data
Cycle GAN
𝐺 𝑋→𝑌
Domain X Domain Y
𝐷 𝑌
Domain Y
Domain X
scalar
Input image
belongs to
domain Y or not
Become similar
to domain Y
Not what we want
ignore input
https://arxiv.org/abs/1703.10593
https://junyanz.github.io/CycleGAN/
Cycle GAN
Domain X Domain Y
𝐺 𝑋→𝑌
𝐷 𝑌
Domain Y
scalar
Input image
belongs to
domain Y or not
𝐺Y→X
as close as possible
Lack of information
for reconstruction
Cycle GAN
Domain X → G_{X→Y} → Domain Y → G_{Y→X} → back to X: as close as possible to the input.
Domain Y → G_{Y→X} → Domain X → G_{X→Y} → back to Y: as close as possible to the input.
D_Y: scalar, belongs to domain Y or not; D_X: scalar, belongs to domain X or not.
c.f. Dual Learning
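A minimal PyTorch sketch of the Cycle GAN generator objective, assuming generators G_xy, G_yx and discriminators D_x, D_y already exist; the least-squares adversarial term and the cycle weight of 10 are assumptions, not details from the slides:

import torch
import torch.nn as nn

mse, l1 = nn.MSELoss(), nn.L1Loss()

def generator_loss(G_xy, G_yx, D_x, D_y, x, y, lam=10.0):
    fake_y, fake_x = G_xy(x), G_yx(y)
    # adversarial terms: fool D_Y and D_X
    adv = mse(D_y(fake_y), torch.ones_like(D_y(fake_y))) + \
          mse(D_x(fake_x), torch.ones_like(D_x(fake_x)))
    # cycle consistency: X -> Y -> X and Y -> X -> Y should reconstruct the input
    cycle = l1(G_yx(fake_y), x) + l1(G_xy(fake_x), y)
    return adv + lam * cycle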
The anime-fied world (動畫化的世界)
• Using the code:
https://github.com/Hi-king/kawaii_creator
• It is not cycle GAN,
Disco GAN
input output domain
Conditional Generation
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
https://www.youtube.com/watch?v=9c4z6YsBGQ0
Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman and Alexei A. Efros. "Generative
Visual Manipulation on the Natural Image Manifold", ECCV, 2016.
Andrew Brock, Theodore Lim, J.M. Ritchie, Nick
Weston, Neural Photo Editing with Introspective
Adversarial Networks, arXiv preprint, 2017
Basic Idea
space of
code z
Why move on the code space?
Fulfill the
constraint
Generator
(Decoder)
Encoder z xx
as close as possible
Back to z
• Method 1: z* = arg min_z L(G(z), x^T), solved by gradient descent.
  The difference L between G(z) and x^T can be
  ➢ Pixel-wise
  ➢ Measured by another network
• Method 2: learn an encoder that maps x^T back to z (as in an auto-encoder).
• Method 3: use the result of method 2 as the initialization of method 1.
Editing Photos
• z0 is the code of the input image
z* = arg min_z  U(G(z)) + λ1 ‖z − z0‖² − λ2 D(G(z))
U(G(z)): does the image fulfill the constraint of editing?
‖z − z0‖²: not too far away from the original image.
D(G(z)): use the discriminator to check that the image is realistic.
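A minimal PyTorch sketch of this optimization over the code z, assuming a trained G and D and a user-supplied edit_penalty implementing U; the step count and weights are placeholders:

import torch

def edit_photo(G, D, edit_penalty, z0, steps=200, lr=0.1, lam1=1.0, lam2=0.1):
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        img = G(z)
        loss = edit_penalty(img) \
             + lam1 * ((z - z0) ** 2).sum() \
             - lam2 * D(img).mean()          # discriminator keeps the image realistic
        opt.zero_grad(); loss.backward(); opt.step()
    return G(z).detach(), z.detach()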
Conditional Generation
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
Domain Independent Features
• Training and testing data are in different domains
Training
data:
Testing
data:
Generator
Generator
The same
distribution
feature
feature
Domain Independent Features
Domain Independent Features
Too easy for the feature
extractor ……
feature extractor
Domain classifier
Domain-adversarial training
Not only cheat the domain
classifier, but also satisfy the label
classifier at the same time
This is a big network, but different parts have different goals.
Maximize label
classification accuracy
Maximize domain
classification accuracy
Maximize label classification accuracy +
minimize domain classification accuracy
feature extractor
Domain classifier
Label predictor
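A minimal PyTorch sketch of the gradient reversal trick that realizes “maximize label classification accuracy + minimize domain classification accuracy”; the three sub-networks below are placeholder architectures:

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad        # reverse the gradient flowing back into the feature extractor

feat = nn.Sequential(nn.Linear(784, 128), nn.ReLU())
label_clf = nn.Linear(128, 10)     # label predictor
domain_clf = nn.Linear(128, 2)     # domain classifier (feature extractor tries to fool it)

def loss_fn(x, y_label, y_domain):
    f = feat(x)
    ce = nn.functional.cross_entropy
    return ce(label_clf(f), y_label) + ce(domain_clf(GradReverse.apply(f)), y_domain)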
Domain-adversarial training
Yaroslav Ganin, Victor Lempitsky, Unsupervised Domain Adaptation by Backpropagation,
ICML, 2015
Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand,
Domain-Adversarial Training of Neural Networks, JMLR, 2016
Domain classifier fails in the end
It should struggle ……
Domain-adversarial training
Yaroslav Ganin, Victor Lempitsky, Unsupervised Domain Adaptation by Backpropagation,
ICML, 2015
Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand,
Domain-Adversarial Training of Neural Networks, JMLR, 2016
Train:
Test:
Conditional Generation
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
VAE-GAN
Discrimi
nator
Generator
(Decoder)
Encoder z x scalarx
as close as possible
VAE
GAN
➢ Minimize
reconstruction error
➢ z close to normal
➢ Minimize
reconstruction error
➢ Cheat discriminator
➢ Discriminate real,
generated and
reconstructed images
Discriminator provides the similarity measure
Anders Boesen, Lindbo Larsen, Søren Kaae
Sønderby, Hugo Larochelle, Ole Winther, “Autoencoding
beyond pixels using a learned similarity metric”, ICML. 2016
BiGAN
Encoder Decoder Discriminator
Image x
code z
Image x
code z
Image x code z
from encoder
or decoder?
(real) (generated)
(from prior
distribution)
Jeff Donahue, Philipp Krähenbühl, Trevor Darrell,
“Adversarial Feature Learning”, ICLR, 2017
Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier
Mastropietro, Alex Lamb, Martin Arjovsky, Aaron
Courville, “Adversarially Learned Inference”, ICLR, 2017
BiGAN
• Initialize encoder En, decoder De, discriminator Dis
• In each iteration:
• Sample M images x¹, x², ⋯, x^M from the database
• Generate M codes z̃¹, z̃², ⋯, z̃^M from the encoder: z̃ⁱ = En(xⁱ)
• Sample M codes z¹, z², ⋯, z^M from the prior P(z)
• Generate M images x̃¹, x̃², ⋯, x̃^M from the decoder: x̃ⁱ = De(zⁱ)
• Update the discriminator to increase D(xⁱ, z̃ⁱ) and decrease D(x̃ⁱ, zⁱ)
• Update En and De to decrease D(xⁱ, z̃ⁱ) and increase D(x̃ⁱ, zⁱ)
Encoder Decoder Discriminator
Image x
code z
Image x
code z
Image x code z
from encoder
or decoder?
(real) (generated)
(from prior
distribution)
P(x, z): distribution from the encoder; Q(x, z): distribution from the decoder.
The discriminator evaluates the difference between P and Q; training makes P and Q the same.
Optimal encoder and decoder:
For all x’: En(x’) = z’ and De(z’) = x’
For all z’’: De(z’’) = x’’ and En(x’’) = z’’
BiGAN
En(x’) = z’ De(z’) = x’
En(x’’) = z’’De(z’’) = x’’
For all x’
For all z’’
How about?
Encoder
Decoder
x
z
෤𝑥
Encoder
Decoder
x
z
ǁ𝑧
En, De
En, De
Optimal encoder
and decoder:
Concluding Remarks
Modifying input code
• Making code has influence (InfoGAN)
• Connection code space with attribute
Controlling by input objects
• Paired data
• Unpaired data
• Unsupervised
Feature extraction
• Domain Independent Feature
• Improving Auto-encoder (VAE-GAN, BiGAN)
Anime Face
Generation
Anime Face Generation
• My Course at NTU: “Machine Learning and having
it Deep and Structured”
• This spring I opened this course the third time
• I want to update the course.
• Including the HW for sequence-to-sequence
learning, reinforcement learning, GAN
• What is the HW for GAN?
“girl with red hair
and red eyes”
machine
Anime Face Generation
• https://github.com/mattya/chainer-DCGAN
• https://zhuanlan.zhihu.com/p/24767059
• https://github.com/jayleicn/animeGAN
Anime Face Generation
- Girl Friend Factory
• http://qiita.com/Hiroshiba/items/d5749d8896613e6f0b48
Data Collection
http://konachan.net/post/show/239400/aikatsu-clouds-flowers-hikami_sumire-hiten_goane_r
Thanks to TAs 曾柏翔 and 樊恩宇 for collecting the data.
Released Training Data
• Data download link:
https://drive.google.com/open?id=0BwJmB7alR-AvMHEtczZZN0EtdzQ
• Anime Dataset:
• training data: 33.4k (image, tags) pair
• Training tags file format
• img_id <comma> tag1 <colon> #_post <tab> tag2
<colon> …
blue eyes
red hair
short hair
tags.csv
96 x 96
Data
• The students have to submit their codes.
• The code generates 5 64x64 images based on text
description.
• Input testing text only includes ‘color hair’ and ‘color
eyes’.
• Testing text file format
• testing_text_id <comma> testing_text
• Extra bonus if you can generate images beyond the
above description
sample
testing_text.txt
Evaluation
• Lots evaluation methods in the literature
• We will put all the generated images in the grading
platform
• Giving 2 scores for each image
• How the image fits the text?
• How the image looks real?
• Scores should be integer from 1 to 5
• 1 to 5 corresponding to (super bad, bad,
average, good, super good)
• It is double blind.
Implementation Details
• Updates between Generator and Discriminator
• 1 : 1 or 2 : 1
• ADAM with lr = 0.0002, momentum = 0.5
• gaussian noise dim = 100
• batch size = 64, epoch = 300
Paper: https://arxiv.org/pdf/1605.05396.pdf
Code: https://github.com/paarthneekhara/text-to-image
Results of TA
• input text: black hair blue eyes
• input text: green hair green eyes
• input text: blue hair green eyes
Advanced Techniques
• WGAN and Improved WGAN
• StackGAN
https://arxiv.org/abs/1612.03242
Sentence Representation
• Using RNN to encode the input text
• But you already know the testing input …...
One-hot encoding
Why not 15*15
one-hot encoding?
This technique was provided by 江東峻, 陳翰浩, and 鄭嘉文.
Data Augmentation
• Total training data: 33.4K
• Only 14.8K is useful
• Not sufficient for image generation?
• Data Augmentation
• Horizonal Flipping + Rotation
This technique was provided by 江東峻, 陳翰浩, and 鄭嘉文.
Sampled Noise v.s. Image Quality
Std=0.950
Std=0.950
Std=0.666
Std=0.666
Std=0.432
Std=0.432
This technique was provided by 柯達方.
Sampled Noise v.s. Image Quality
green hair
green eyes
noise with
small std
noise with
large std
green hair
green eyes
Generator Generator
Ensemble
• E.g. BEGAN on CelebA
Experimental results provided by student 陳柏文.
Ensemble
Generator
1
Generator
2
Generator
3
Generator
4
Generator
5
green hair
green eyes
noise with
small std
Bonus
• hat, twin tails, ribbon, horns, headdress, headphones,
glasses ……
• Whole images
• Deep Cleavage GAN (DCGAN)
• Successful …… but I will not show this ……
Thanks to 孫凡耕, 羅啟心, 許晉嘉, and 郭子生 for providing the results.
Lecture 4:
Decision Making
and Control
and its relation with GAN
Decision Making and Control
• Machine observe some inputs, takes an action, and
finally achieve the target.
• E.g. Go playing
observation action
Target: win the game
State: summarization of observation
Decision Making and Control
• Machine plays video games
• Widely studies:
• Gym: https://gym.openai.com/
• Universe: https://openai.com/blog/universe/
Machine learns to play video
games as human players
➢ Machine learns to take
proper action itself
➢ What machine observes are pixels
Decision Making and Control
• E.g. self-driving car
Object
Recognition
紅燈 (red light)
Decision
Making
observation
Stop!
Action:
a giant network?
Decision Making and Control
• E.g. dialogue system
Slot Filling → Decision Making → Natural Language Generation (a giant network?)
User: “我想訂11月5日到台北的機票” (“I want to book a flight to Taipei on November 5”)
Slot Filling: 抵達時間 (arrival time): 11月5日, 目的地 (destination): 台北
Decision Making: 問出發地 (ask for the departure city)
Natural Language Generation: “請問您要從哪裡出發呢?” (“Where would you like to depart from?”)
How to solve this problem?
• Network as a function, learn as typical supervised
tasks
Network
Target: 3-3
Network
Target:
Stop!
“我想訂11月5日到台北的機票” (“I want to book a flight to Taipei on November 5”)
Network
Target: “請問您要從哪裡出發呢?” (“Where would you like to depart from?”)
Behavior Cloning
https://www.youtube.com/watch?v=j2FSB3bseek
Machine do not know some
behavior must copy, but
some can be ignored.
Properties of
Decision Making and Control
• Agent’s actions affect the subsequent data it
receives
• Reward delay
• In space invader, only “fire” obtains reward
• Although the moving before “fire” is
important
• In Go playing, sacrificing immediate reward to
gain more long-term reward
Machine does not know the
influence of each action.
What do we miss?
Better Way ……
Network (19 x 19
positions)
Next move
A sequence
of decision
Classification
Two Learning Scenarios
• Scenario 1: Reinforcement Learning
• Machine interacts with the environment.
• Machine obtains the reward from the environment, so it
knows its performance is good or bad.
• Scenario 2: Learning by demonstration
• Also known as imitation learning, apprenticeship
learning
• An expert demonstrates how to solve the task, and
machine learns from the demonstration.
Outline
Reinforcement Learning
•Training an actor
•RNN generation with GAN
•Actor + Critic
Inverse Reinforcement Learning
RL
Policy-based Value-based
Learning an Actor Learning a CriticActor + Critic
Model-based Approach
Model-free
Approach
Alpha Go: policy-based + value-based
+ model-based
Outline
Reinforcement Learning
•Training an actor
•RNN generation with GAN
•Actor + Critic
Inverse Reinforcement Learning
Basic Idea
EnvActor
Reward
Function
Video
Game
Go
Kill a monster → get 20 points … (video game reward)
The rules of Go (圍棋規則)
You cannot control
Actor, Environment, Reward
𝜏 = 𝑠1, 𝑎1, 𝑠2, 𝑎2, ⋯ , 𝑠 𝑇, 𝑎 𝑇
Trajectory
Actor
𝑠1
𝑎1
Env
𝑠2
Env
𝑠1
𝑎1
Actor
𝑠2
𝑎2
Env
𝑠3
𝑎2
……
Total reward: R(τ) = Σ_{t=1}^{T} r_t
Reward
𝑟1
Reward
𝑟2
Start with
observation 𝑠1 Observation 𝑠2 Observation 𝑠3
Example: Playing Video Game
Action 𝑎1: “right”
Obtain reward
𝑟1 = 0
Action 𝑎2 : “fire”
(kill an alien)
Obtain reward
𝑟2 = 5
Usually there is some randomness in the environment
Start with
observation 𝑠1 Observation 𝑠2 Observation 𝑠3
Example: Playing Video Game
After many turns
Action 𝑎 𝑇
Obtain reward 𝑟𝑇
Game Over
(spaceship destroyed)
This is an episode.
We want the total
reward be maximized.
Total reward: R = Σ_{t=1}^{T} r_t
• Input of neural network: the observation of machine
represented as a vector or a matrix
• Output neural network : each action corresponds to a
neuron in output layer
……
NN as actor
pixels
fire
right
left
Score of an
action
0.7
0.2
0.1
Take the action
with the
highest score
Neural network as Actor
What is the benefit of using network instead of lookup table?
generalization
Actor, Environment, Reward
Actor
𝑠1
𝑎1
Env
𝑠2
Env
𝑠1
𝑎1
Actor
𝑠2
𝑎2
Env
𝑠3
𝑎2
……
R(τ) = Σ_{t=1}^{T} r_t
Reward
𝑟1
Reward
𝑟2
NetworkNetwork
They are not
networks.
If something turns out to be non-differentiable, just use policy gradient and train it anyway.
argmax
Ref: https://www.youtube.com/watch?v=W8XF3ME8G2I
Warning of Math
Policy Gradient
Step 1:
define a set
of function
Step 2:
goodness of
function
Step 3: pick
the best
function
Three Steps for Deep Learning
Deep Learning is so simple ……
Neural Network
as Actor
Step 1:
define a set
of function
Step 2:
goodness of
function
Step 3: pick
the best
function
Three Steps for Deep Learning
Deep Learning is so simple ……
Neural Network
as Actor
Goodness of Actor
• Given an actor 𝜋 𝜃 𝑠 with network parameter 𝜃
• Use the actor 𝜋 𝜃 𝑠 to play the video game
• Start with observation 𝑠1
• Machine decides to take 𝑎1
• Machine obtains reward 𝑟1
• Machine sees observation 𝑠2
• Machine decides to take 𝑎2
• Machine obtains reward 𝑟2
• Machine sees observation 𝑠3
• ……
• Machine decides to take 𝑎 𝑇
• Machine obtains reward 𝑟𝑇
Total reward: R_θ = Σ_{t=1}^{T} r_t
Even with the same actor, R_θ is different each time (randomness in the actor and the game).
We define R̄_θ as the expected value of R_θ; R̄_θ evaluates the goodness of an actor π_θ(s).
END
Goodness of Actor
• τ = {s1, a1, r1, s2, a2, r2, ⋯, sT, aT, rT}
P(τ|θ) = p(s1) p(a1|s1, θ) p(r1, s2|s1, a1) p(a2|s2, θ) p(r2, s3|s2, a2) ⋯
       = p(s1) ∏_{t=1}^{T} p(a_t|s_t, θ) p(r_t, s_{t+1}|s_t, a_t)
p(a_t|s_t, θ) is controlled by your actor π_θ; p(r_t, s_{t+1}|s_t, a_t) is not related to your actor.
Example: given s_t, the actor π_θ outputs left: 0.1, right: 0.2, fire: 0.7, so p(a_t = "fire"|s_t, θ) = 0.7.
We define R̄_θ as the expected value of R_θ.
Goodness of Actor
• An episode is considered as a trajectory 𝜏
• τ = {s1, a1, r1, s2, a2, r2, ⋯, sT, aT, rT},  R(τ) = Σ_{t=1}^{T} r_t
• If you use an actor to play the game, each τ has a probability of being sampled.
• The probability depends on the actor parameter θ: P(τ|θ)
R̄_θ = Σ_τ R(τ) P(τ|θ)   (sum over all possible trajectories)
    ≈ (1/N) Σ_{n=1}^{N} R(τⁿ)   (use π_θ to play the game N times, i.e. sample τ¹, τ², ⋯, τᴺ from P(τ|θ))
Step 1:
define a set
of function
Step 2:
goodness of
function
Step 3: pick
the best
function
Three Steps for Deep Learning
Deep Learning is so simple ……
Neural Network
as Actor
Gradient Ascent
• Problem statement: θ* = arg max_θ R̄_θ
• Gradient ascent:
  • Start with θ⁰
  • θ¹ ← θ⁰ + η∇R̄_{θ⁰}
  • θ² ← θ¹ + η∇R̄_{θ¹}
  • ……
θ = {w1, w2, ⋯, b1, ⋯},  ∇R̄_θ = [∂R̄_θ/∂w1, ∂R̄_θ/∂w2, ⋯, ∂R̄_θ/∂b1, ⋯]ᵀ
Policy Gradient:  R̄_θ = Σ_τ R(τ) P(τ|θ),  ∇R̄_θ = ?
∇R̄_θ = Σ_τ R(τ) ∇P(τ|θ) = Σ_τ R(τ) P(τ|θ) ∇P(τ|θ)/P(τ|θ)
      = Σ_τ R(τ) P(τ|θ) ∇log P(τ|θ)
      ≈ (1/N) Σ_{n=1}^{N} R(τⁿ) ∇log P(τⁿ|θ)   (use π_θ to play the game N times, obtaining τ¹, τ², ⋯, τᴺ)
(using d log f(x)/dx = (1/f(x)) df(x)/dx)
R(τ) does not have to be differentiable; it can even be a black box.
Policy Gradient
• τ = {s1, a1, r1, s2, a2, r2, ⋯, sT, aT, rT}
P(τ|θ) = p(s1) ∏_{t=1}^{T} p(a_t|s_t, θ) p(r_t, s_{t+1}|s_t, a_t)
log P(τ|θ) = log p(s1) + Σ_{t=1}^{T} [ log p(a_t|s_t, θ) + log p(r_t, s_{t+1}|s_t, a_t) ]
∇log P(τ|θ) = Σ_{t=1}^{T} ∇log p(a_t|s_t, θ)   (ignore the terms not related to θ)
Policy Gradient
∇R̄_θ ≈ (1/N) Σ_{n=1}^{N} R(τⁿ) ∇log P(τⁿ|θ)
     = (1/N) Σ_{n=1}^{N} R(τⁿ) Σ_{t=1}^{Tn} ∇log p(a_tⁿ|s_tⁿ, θ)
     = (1/N) Σ_{n=1}^{N} Σ_{t=1}^{Tn} R(τⁿ) ∇log p(a_tⁿ|s_tⁿ, θ)
θ_new ← θ_old + η∇R̄_{θ_old}
If the machine takes a_tⁿ when seeing s_tⁿ in τⁿ:
  R(τⁿ) is positive → tune θ to increase p(a_tⁿ|s_tⁿ)
  R(τⁿ) is negative → tune θ to decrease p(a_tⁿ|s_tⁿ)
It is very important to consider the cumulative reward R(τⁿ) of the whole trajectory τⁿ instead of the immediate reward r_tⁿ. What if we replaced R(τⁿ) with r_tⁿ ……
Add a Baseline
∇R̄_θ ≈ (1/N) Σ_{n=1}^{N} Σ_{t=1}^{Tn} ( R(τⁿ) − b ) ∇log p(a_tⁿ|s_tⁿ, θ),  θ_new ← θ_old + η∇R̄_{θ_old}
It is possible that R(τⁿ) is always positive.
In the ideal case this is fine, but because we sample, the probability of actions that happen not to be sampled will decrease even if they are good; subtracting a baseline b avoids this.
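A minimal PyTorch sketch of this update (REINFORCE with a baseline); the environment interaction is stubbed out with placeholder data, so treat it as an illustration of weighting log-probabilities by R(τⁿ) − b rather than a full agent:

import torch
import torch.nn as nn

actor = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 3))  # 3 actions
opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

def update(episodes, baseline):
    # episodes: list of (states, actions, total_reward R(tau^n)) tuples
    loss = 0.0
    for states, actions, R in episodes:
        logp = torch.log_softmax(actor(states), dim=-1)           # log p(a_t|s_t, theta)
        chosen = logp.gather(1, actions.unsqueeze(1)).squeeze(1)
        loss = loss - ((R - baseline) * chosen).sum()              # gradient ascent on R_bar
    loss = loss / len(episodes)
    opt.zero_grad(); loss.backward(); opt.step()

# stand-in data: one "episode" of 5 steps with total reward 1.0
states = torch.randn(5, 4)
actions = torch.randint(0, 3, (5,))
update([(states, actions, 1.0)], baseline=0.5)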
Policy Gradient
Given actor parameter θ, collect data:
τ¹: (s₁¹, a₁¹), (s₂¹, a₂¹), ⋯ with total reward R(τ¹)
τ²: (s₁², a₁²), (s₂², a₂²), ⋯ with total reward R(τ²)
⋯
Update the model: θ ← θ + η∇R̄_θ, with ∇R̄_θ = (1/N) Σ_{n=1}^{N} Σ_{t=1}^{Tn} R(τⁿ) ∇log p(a_tⁿ|s_tⁿ, θ)
Then go back to data collection with the updated actor (data collection ↔ model update loop).
Policy Gradient — considered as a classification problem
Given state s, suppose the sampled action is “left”: the target is the one-hot vector (left, right, fire) = (1, 0, 0).
Minimize the cross-entropy −Σ_{i=1}^{3} ŷ_i log y_i, which here equals maximizing log P("left"|s).
Update: θ ← θ + η∇log P("left"|s)
Policy Gradient
θ ← θ + η∇R̄_θ,  ∇R̄_θ = (1/N) Σ_{n=1}^{N} Σ_{t=1}^{Tn} R(τⁿ) ∇log p(a_tⁿ|s_tⁿ, θ)
Given actor parameter θ, collect trajectories τ¹, τ², ⋯ with their rewards R(τ¹), R(τ²), ⋯
Each visited pair (s_tⁿ, a_tⁿ) becomes a classification training example,
e.g. s₁¹ with a₁¹ = left → target (1, 0, 0); s₁² with a₁² = fire → target (0, 0, 1).
Policy Gradient
θ ← θ + η∇R̄_θ,  ∇R̄_θ = (1/N) Σ_{n=1}^{N} Σ_{t=1}^{Tn} R(τⁿ) ∇log p(a_tⁿ|s_tⁿ, θ)
The only difference from ordinary classification: each training example (s_tⁿ, a_tⁿ) is weighted by the total reward R(τⁿ) of its trajectory, e.g. if R(τ¹) = 2 the example (s₁¹, a₁¹ = left) effectively appears twice, while an example from a trajectory with R(τ²) = 1 appears once.
End of Warning
Outline
Reinforcement Learning
•Training an actor
•RNN generation with GAN
•Actor + Critic
Inverse Reinforcement Learning
RNN Generation
• Training
begin 春 眠 不
……
……
……
春 眠 不 覺
Unsupervised
Learning
without annotation
RNN Generation
• Testing
前 明 月
<BOS>
c1 c2 c3
……
……
……
P(c1) P(c2|c1) P(c3|c1,c2) P(c4|c1,c2,c3)
床
床 前 明
~
~
~
~
sample
Unsupervised
Learning
without annotation
RNN
Generation
• Images are composed of pixels
• Generating a pixel at each time by RNN
blue pink blue
<BOS> ……
……
……
red
Consider as a sentence
blue red yellow gray ……
Each pixel is a character.
red blue pink
~
~
~
~
c1 c2 c3
P(c1) P(c2|c1) P(c3|c1,c2) P(c4|c1,c2,c3)
Creating Pokémon (寶可夢鍊成)
• Small images of 792 Pokémon
• Can machine learn to create new Pokémons?
• Source of image:
http://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_by_base_stats_(Generation_VI)
Don't catch them! Create them!
Original image is 40 x 40
Making them into 20 x 20
Real
Pokémon
Cover 50%
Cover 75%
It is difficult to evaluate generation.
Never seen
by machine!
Creating Pokémon (寶可夢鍊成)
Drawing from scratch
PixelRNN
Real
World
Ref: Aaron van den Oord, Nal Kalchbrenner, Koray
Kavukcuoglu, Pixel Recurrent Neural Networks,
arXiv preprint, 2016
More than images ……
Audio: Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol
Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu,
WaveNet: A Generative Model for Raw Audio, arXiv preprint, 2016
Video: Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo
Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu, Video Pixel Networks ,
arXiv preprint, 2016
Sentence Generation
Generator
Discriminator
sentence x
sentence x Real or fake
Original GAN
code z sampled from
prior distribution
Sentence Generation
• Initialize generator G and discriminator D
• In each iteration:
• Sample real sentences 𝑥 from database
• Generate sentences ෤𝑥 by G
• Update D to increase 𝐷 𝑥 and decrease 𝐷 ෤𝑥
• Update Gen such that
Generator
Discrimi
nator
scalar
update
Sentence Generation
A A
A
B
A
B
A
A
B
B
B
<BOS>
Can we do
backpropagation?
Tuning generator a
little bit will not
change the output.
Discriminator
scalar
NO!
In the paper of
improved WGAN …
(ignoring
sampling process)
Sentence Generation
A A
A
B
A
B
A
A
B
B
B
<BOS>
Discriminator
scalar
observation
Actions set
Action taken
Reward Function
Sentence Generation - SeqGAN
• Using Reinforcement learning
• Consider the discriminator as reward function
• Consider the output of discriminator as total reward
• Update generator to increase discriminator = to get
maximum total reward
Ref: Lantao Yu, Weinan Zhang, Jun Wang, Yong Yu, “SeqGAN: Sequence
Generative Adversarial Nets with Policy Gradient”, AAAI, 2017
Generator
Discrimi
nator
scalar
update
Using Policy Gradient
Chat-bot with GAN
Discriminator
Input
sentence/history h response sentence x
Real or fake
human
dialogues
Chatbot
En De
Conditional GAN
response sentence x
Input
sentence/history h
Jiwei Li, Will Monroe, Tianlin
Shi, Sébastien Jean, Alan Ritter, Dan
Jurafsky, “Adversarial Learning for
Neural Dialogue Generation”, arXiv
preprint, 2017
Example Results
Thanks to student 段逸林 for providing the experimental results.
Outline
Reinforcement Learning
•Training an actor
•RNN generation with GAN
•Actor + Critic
Inverse Reinforcement Learning
Critic
• A critic does not determine the action.
• Given an actor π, it evaluates how good the
actor is
http://combiboilersleeds.com/picaso/critics/critics-4.html
An actor can be
found from a critic.
e.g. Q-learning
Not suitable for
continuous action a
Actor-Critic
𝜋 interacts with
the environment
Learning Critic
Update actor from
𝜋 → 𝜋’ based on
Critic
TD or MC𝜋 = 𝜋′
Actor = Generator ?
Critic = Discriminator ?
https://arxiv.org/abs/1610.01945
Visual Doom AI Competition @ CIG 2016
https://www.youtube.com/watch?v=94EPSjQH38Y
Playing On-line Game
劉廷緯、溫明浩
https://www.youtube.com/watch?v=8iRD1w73fDo&feature=youtu.be
Outline
Reinforcement Learning
•Training an actor
•RNN generation with GAN
•Actor + Critic
Inverse Reinforcement Learning
Imitation Learning
We have demonstration of the expert.
Actor
𝑠1
𝑎1
Env
𝑠2
Env
𝑠1
𝑎1
Actor
𝑠2
𝑎2
Env
𝑠3
𝑎2
……
reward function is not available
τ̂¹, τ̂², ⋯, τ̂ᴺ
Each τ̂ is a trajectory of the expert.
Self driving: record
human drivers
Robot: grab the
arm of robot
Motivation
• It is hard to define reward in some tasks.
• Hand-crafted rewards can lead to uncontrolled
behavior.
The Three Laws of Robotics:
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given by human beings, except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
“Therefore, to protect humanity as a whole, controlling human freedom is necessary.”
神邏輯 (“what amazing logic”)
Inverse Reinforcement Learning
Reward
Function
Environment
Optimal
Actor
Inverse Reinforcement
Learning
➢Using the reward function to find the optimal actor.
➢Modeling reward can be easier. Simple reward
function can lead to complex policy.
Reinforcement
Learning
Expert
demonstration
of the expert
Ƹ𝜏1, Ƹ𝜏2, ⋯ , Ƹ𝜏 𝑁
Inverse Reinforcement Learning
• Principle: The teacher is always the best.
• Basic idea:
• Initialize an actor
• In each iteration
• The actor interacts with the environments to obtain
some trajectories
• Define a reward function, which makes the
trajectories of the teacher better than the actor
• The actor learns to maximize the reward based on
the new reward function.
• Output the reward function and the actor learned from
the reward function
Framework of IRL
Expert π̂ provides demonstrations τ̂¹, τ̂², ⋯, τ̂ᴺ; the actor π produces trajectories τ¹, τ², ⋯, τᴺ.
Obtain a reward function R such that Σ_{n=1}^{N} R(τ̂ⁿ) > Σ_{n=1}^{N} R(τⁿ).
Find an actor based on reward function R (by reinforcement learning), then repeat.
Reward function = Discriminator; Actor = Generator.
GAN
IRL
G
D
High score for real,
low score for generated
Find a G whose output
obtains large score from D
Ƹ𝜏1, Ƹ𝜏2, ⋯ , Ƹ𝜏 𝑁
Expert
Actor
Reward
Function
Larger reward for Ƹ𝜏 𝑛,
Lower reward for 𝜏
Find a Actor obtains
large reward
Teaching Robot
• In the past …… https://www.youtube.com/watch?v=DEGbtjTOIB0
Robot
http://rll.berkeley.edu/gcl/
Chelsea Finn, Sergey Levine, Pieter Abbeel, ”
Guided Cost Learning: Deep Inverse Optimal
Control via Policy Optimization”, ICML, 2016
Parking a Car
https://pdfs.semanticscholar.org/27c3/6fd8e6aeadd50ec1e180399df70c46dede00.pdf
Path Planning
http://martin.zinkevich.org/publications
/maximummarginplanning.pdf
Third Person Imitation Learning
• Ref: Bradly C. Stadie, Pieter Abbeel, Ilya Sutskever, “Third-Person Imitation
Learning”, arXiv preprint, 2017
http://lasa.epfl.ch/research_new/ML/index.php https://kknews.cc/sports/q5kbb8.html
http://sc.chinaz.com/Files/pic/icons/1913/%E6%9C%BA%E5%99%A8%E4%BA%BA%E5%9B
%BE%E6%A0%87%E4%B8%8B%E8%BD%BD34.png
Third PersonFirst Person
Third Person Imitation Learning
Concluding Remarks
Lecture 1: Brief Review of Deep Learning
Lecture 2: Introduction of Generative
Adversarial Network
Lecture 3: Conditional Generation
Lecture 4: Making Decision and Control
To Learn More …
• Machine Learning
• Slides:
http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML16.html
• Video:
https://www.youtube.com/watch?v=fegAeph9UaA&list=PLJV_el3uVTsPy9oCRY30oBPNLCo89yu49
• Machine Learning and Having it Deep and Structured
• Slides:
http://speech.ee.ntu.edu.tw/~tlkagk/courses_MLDS17.html
• Video:
https://www.youtube.com/watch?v=IzHoNwlCGnE&list=PLJV_el3uVTsPMxPbjeX7PicgWbY7F8wW9
工商時間
「科技大擂台 與AI對話」
熱身賽初賽
「科技大擂台 與AI對話」
➢ 參加對象:大專院校研究所以下、
高中職學生 (一到四人一組參加)
➢有高額獎金
➢但跟這份投影片
沒什麼關係
➢ 首獎有十萬
➢ 讓大家在拿高額獎金前先暖個身
➢全民皆可參加
Warm-up preliminary round: sample task
• After listening to a segment of human conversation, the machine predicts the human's next utterance.
• Question download:
• Sample questions (development set): from 8/02, 50 sample questions and their answers can be downloaded from the Ministry of Science and Technology website.
• Test questions (testing set): released on 8/07.
Example (the preliminary round provides the transcript file directly):
A: 柏翰, thank you.
B: You're welcome.
A: Sorry for taking up your break time after class.
Options:
(1) B: I really like break time.
(2) B: We know each other well enough, no need for such politeness.
……
This question format is used only in the warm-up round.
Hands-on Tutorial (實作教學)
Time / Topic
13:50-14:00  Registration
14:00-14:50  Word Embedding (詞向量) hands-on session
15:00-15:50  Recurrent Neural Network (RNN) hands-on session
16:00-16:50  Sequence-to-sequence Model hands-on session
17:00-       Q&A
The project office will hold a pre-competition training session; Prof. Hung-yi Lee's team from National Taiwan University has been invited to give guidance.
Time: Wednesday, August 16, 2017, 2:00-5:00 pm
Location: Meeting room of the Innovation and Entrepreneurship Promotion Division, 1F, No. 106, Sec. 2, Heping E. Rd., Taipei 106 (Technology Building)
Open only to teams that have completed registration (registration deadline: August 15).
Introduction to Generative Adversarial Networks (GANs

  • 1. Introduction of Generative Adversarial Network 李宏毅 Hung-yi Lee
  • 2. Outline Lecture 1: Brief Review of Deep Learning Lecture 2: Introduction of Generative Adversarial Network Lecture 3: Conditional Generation Lecture 4: Making Decision and Control
  • 7. Resources • GAN Zoo • https://github.com/hindupuravinash/the-gan-zoo • Tricks: https://github.com/soumith/ganhacks • Tutorial of Ian Goodfellow: https://arxiv.org/abs/1701.00160
  • 9. Anime Face Generation 何之源的知乎: https://zhuanlan.zhihu.com/p/24767059 DCGAN: https://github.com/carpedm20/DCGAN-tensorflow Drawing?
  • 10. Basic Idea of GAN Generator It is a neural network (NN), or a function. Generator 0.1 −3 ⋮ 2.4 0.9 imagevector Generator 3 −3 ⋮ 2.4 0.9 Generator 0.1 2.1 ⋮ 2.4 0.9 Generator 0.1 −3 ⋮ 2.4 3.5 matrix Powered by: http://mattya.github.io/chainer-DCGAN/ Each dimension of input vector represents some characteristics. Longer hair blue hair Open mouth
  • 11. Discri- minator scalar image Basic Idea of GAN It is a neural network (NN), or a function. Larger value means real, smaller value means fake. Discri- minator Discri- minator Discri- minator Discri- minator 1.0 1.0 0.1 ?
  • 12. Basic Idea of GAN Brown veins Butterflies are not brown Butterflies do not have veins …….. Generator Discriminator
  • 13. Basic Idea of GAN NN Generator v1 Discri- minator v1 Real images: NN Generator v2 Discri- minator v2 NN Generator v3 Discri- minator v3 This is where the term “adversarial” comes from. You can explain the process in different ways…….
  • 14. Generator v3 Generator v2 Basic Idea of GAN Generator (student) Discriminator (teacher) Generator v1 Discriminator v1 Discriminator v2 No eyes No mouth
  • 15. So many questions …… Q1: Why generator cannot learn by itself? Q2: Why discriminator don’t generate object itself? Q3: How discriminator and generator interact?
  • 25. Lecture I: Basic Idea of Deep Learning
  • 26. Machine Learning ≈ Looking for a Function • Binary Classification (是 非題) • Multi-class Classification (選擇題) Function f Function f Input Input Yes/No Class 1, Class 2, … Class N
  • 27. Machine Learning ≈ Looking for a Function • Structured input/output “大家好,歡迎大 家來修機器學習” f Speech Recognition “girl with red hair and red eyes” f f Summarization “近半選民挺脫歐 公投 ……” (title, summary)
  • 28. Framework A set of function 21, ff  1f “cat”  1f “dog”  2f “money”  2f “snake” Model  f “cat” Image Recognition:
  • 29. Framework A set of function 21, ff  f “cat” Image Recognition: Model Training Data Goodness of function f Better! “monkey” “cat” “dog” function input: function output: Supervised Learning
  • 30. Framework A set of function 21, ff  f “cat” Image Recognition: Model Training Data Goodness of function f “monkey” “cat” “dog” * f Pick the “Best” Function Using  f “cat” Training Testing
  • 31. Example Application Input Output 16 x 16 = 256 1x 2x 256x …… Ink → 1 No ink → 0 …… y1 y2 y10 Each dimension represents the confidence of a digit. is 1 is 2 is 0 …… 0.1 0.7 0.2 The image is “2”
  • 32. Example Application • Handwriting Digit Recognition Function “2” 1x 2x 256x …… …… y1 y2 y10 is 1 is 2 is 0 …… Input: 256-dim vector output: 10-dim vector Neural Network
  • 33. Training Data • Preparing training data: images and their labels The learning target is defined on the training data. “5” “0” “4” “1” “3”“1”“2”“9”
  • 34. bwawawaz KKkk  11 Neural Network z 1w kw Kw … 1a ka Ka  b  z bias a weights Neuron… …… A simple function Activation function
  • 35. Output LayerHidden Layers Input Layer Fully Connect Feedforward Network Input Output 1x 2x Layer 1 …… Nx …… Layer 2 …… Layer L …… …… …… …… …… y1 y2 yM neuron is 1 is 2 is 0 ……
  • 36. Fully Connect Feedforward Network Input Output 1x 2x Layer 1 …… Nx …… Layer 2 …… Layer L …… …… …… …… …… y1 y2 yM Simple Function 1 is 1 is 2 is 0 …… a1 Simple Function 2 a2 Simple Function L y… A very complex function How can we know the weights and biases of the neurons?
  • 37. 1x 2x …… Nx …… …… …… …… Ink → 1 No ink → 0 …… y1 y2 y10 y1 has the maximum value Find a set of weights and bias such that Input: y2 has the maximum valueInput: is 1 is 2 is 0 Gradient descent is the algorithm finding the weights and biases of the neurons that can achieve the target. Gradient descent has lots of problems …… Learning/ Training:
  • 38. Convolutional Layer…… 1 1 2 2 Sparse Connectivity 3 4 5 Each neural only connects to part of the output of the previous layer 3 4 Receptive Field Different neurons have different, but overlapping, receptive fields
  • 39. Convolutional Layer…… 1 1 2 2 Sparse Connectivity 3 4 5 Each neural only connects to part of the output of the previous layer 3 4 Parameter Sharing The neurons with different receptive fields can use the same set of parameters. Less parameters then fully connected layer
  • 40. Convolutional Layer…… 1 1 2 2 3 4 5 3 4 Considering neuron 1 and 3 as “filter 1” (kernel 1) filter (kernel) size: size of the receptive field of a neuron Stride = 2 Considering neuron 2 and 4 as “filter 2” (kernel 2) Kernel size, no. of filter, stride are all designed by the developers.
  • 41. Pooling Layer Group the neurons corresponding to the same filter with nearby receptive fields …… 1 1 2 2 3 4 5 3 4 Convolutional Layer Pooling Layer 1 2 Subsampling
  • 42. Recurrent Neural Network • Recurrent Structure: usually used when the input is a sequence Sentiment Analysis 我 覺 太得 糟 了 超好雷 好雷 普雷 負雷 超負雷 看了這部電影覺 得很高興 ……. 這部電影太糟了 ……. 這部電影很 棒 ……. Positive (正雷) Negative (負雷) Positive (正雷) …… Function
  • 43. Recurrent Neural Network • Recurrent Structure: usually used when the input is a sequence fh0 h1 x1 f h2 f h3 g No matter how long the input sequence is, we only need one function f y x2 x3 Func f ht xt ht-1
  • 45. Auto-encoder NN Encoder NN Decoder code Compact representation of the input object code Can reconstruct the original object Learn together 28 X 28 = 784 Low dimension 𝑐 Trainable NN Encoder NN Decoder
  • 46. Reference: Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507 • NN encoder + NN decoder = a deep network Deep Auto-encoderInputLayer Layer Layer bottle OutputLayer Layer Layer Layer Layer … … Code As close as possible 𝑥 ො𝑥 Encoder Decoder
  • 47. Deep Auto-encoder - Example Pixel -> tSNE 𝑐 NN Encoder PCA 降到 32-dim Unsupervised Leaning without annotation
  • 48. Sequence-to-sequence Auto-encoder • Machine represents each audio segment also by a vector audio segment vector (word-level) Learn from lots of audio without supervision [Chung, Wu, Lee, Lee, Interspeech 16) Used in the following spoken language understanding applications
  • 49. Sequence-to-sequence Auto-encoder audio segment acoustic featuresx1 x2 x3 x4 RNN Encoder vector The vector we want We use sequence-to-sequence auto-encoder here The training is unsupervised.
  • 50. Sequence-to-sequence Auto-encoder RNN Generator x1 x2 x3 x4 y1 y2 y3 y4 x1 x2 x3 x4 RNN Encoder audio segment acoustic features The RNN encoder and generator are jointly trained. Input acoustic features
  • 51. What does machine learn? • Typical word to vector: • Audio word to vector (phonetic information) 𝑉 𝑅𝑜𝑚𝑒 − 𝑉 𝐼𝑡𝑎𝑙𝑦 + 𝑉 𝐺𝑒𝑟𝑚𝑎𝑛𝑦 ≈ 𝑉 𝐵𝑒𝑟𝑙𝑖𝑛 𝑉 𝑘𝑖𝑛𝑔 − 𝑉 𝑞𝑢𝑒𝑒𝑛 + 𝑉 𝑎𝑢𝑛𝑡 ≈ 𝑉 𝑢𝑛𝑐𝑙𝑒 V( ) - V( ) + V( ) = V( ) V( ) - V( ) + V( ) = V( )
  • 52. Network Many kinds of network structures: Matrix How to find the function? Given the example inputs/outputs as training data: {(x1,y1),(x2,y2), ……, (x1000,y1000)} ➢ Fully connected feedforward network ➢ Convolutional neural network (CNN) ➢ Recurrent neural network (RNN) Vector Vector Seq 𝑥 𝑦 Deep Learning in One Slide (Review) (speech, video, sentence) Different networks can take input/output in different domains.
  • 53. Lecture II: Introduction of GAN To learn more theory: https://www.youtube.com/watch?v=0CKeqXl5IY0&lc=z13zuxbgl pvsgbgpo04cg1bxuoraejdpapo0k https://www.youtube.com/watch?v=KSN4QYgAtao&lc=z13kz1nq vuqsipqfn23phthasre4evrdo
  • 54. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory
  • 55. Structured Learning YXf : Regression: output a scalar Classification: output a “class” Structured Learning/Prediction: output a sequence, a matrix, a graph, a tree …… (one-hot vector) 1 0 0 Machine learning is to find a function f Output is composed of components with dependency 0 1 0 0 0 1 Class 1 Class 2 Class 3
  • 57. Output Sequence (what a user says) YXf : “機器學習及其深層與 結構化” “Machine learning and having it deep and structured” :X :Y 感謝大家來上課” (response of machine) :X :Y Machine Translation Speech Recognition Chat-bot (speech) :X :Y (transcription) (sentence of language 1) (sentence of language 2) “How are you?” “I’m fine.”
  • 58. Output Matrix YXf : ref: https://arxiv.org/pdf/1605.05396.pdf “this white and yellow flower have thin white petals and a round yellow stamen” :X :Y Text to Image Image to Image :X :Y Colorization: Ref: https://arxiv.org/pdf/1611.07004v1.pdf
  • 59. Decision Making and Control Action: “right” GO Playing is the same. Action: “fire” Action: “left”
  • 60. Why Structured Learning Interesting? • One-shot/Zero-shot Learning: • In classification, each class has some examples. • In structured learning, • If you consider each possible output as a “class” …… • Since the output space is huge, most “classes” do not have any training data. • Machine has to create new stuff during testing. • Need more intelligence
  • 61. Why Structured Learning Interesting? • Machine has to learn to planning • Machine can generate objects component-by- component, but it should have a big picture in its mind. • Because the output components have dependency, they should be considered globally. Image Generation Sentence Generation 這個婆娘不是人 九天玄女下凡塵
  • 62. Structured Learning Approach Bottom Up Top Down Generative Adversarial Network (GAN) Generator Discriminator Learn to generate the object at the component level Evaluating the whole object, and find the best one
  • 63. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory
  • 64. Generation NN Generator Image Generation Sentence Generation NN Generator We will control what to generate latter. → Conditional Generation 0.1 −0.1 ⋮ 0.7 −0.3 0.1 ⋮ 0.9 0.1 −0.1 ⋮ 0.2 −0.3 0.1 ⋮ 0.5 Good morning. How are you? 0.3 −0.1 ⋮ −0.7 0.3 −0.1 ⋮ −0.7 Good afternoon. In a specific range
  • 65. So many questions …… Q1: Why generator cannot learn by itself? Q2: Why discriminator don’t generate object itself? Q3: How discriminator and generator interact?
  • 66. Generator Image: code: 0.1 0.9(where does them come from?) NN Generator 0.1 −0.5 0.2 −0.1 0.3 0.2 NN Generator code vectors code code0.1 0.9 image As close as possible c.f. NN Classifier 𝑦1 𝑦2 ⋮ As close as possible 1 0 ⋮
  • 67. Generator Encoder in auto-encoder provides the code ☺ 𝑐 NN Encoder NN Generator code vectors code code Image: code: (where does them come from?) 0.1 0.9 0.1 −0.5 0.2 −0.1 0.3 0.2
  • 68. Remember Auto-encoder? As close as possible NN Encoder NN Decoder code NN Decoder code Randomly generate a vector as code Image ? = Generator = Generator
  • 71. Generator NN Generator code vectors code codeNN Generator a NN Generator b NN Generator ?a b0.5x + 0.5x Image: code: (where does them come from?) 0.1 0.9 0.1 −0.5 0.2 −0.1 0.3 0.2
  • 72. Auto-encoder Variational Auto-encoder (VAE) NN Encoder input NN Decoder output m1 m2 m3 𝜎1 𝜎2 𝜎3 𝑒3 𝑒1 𝑒2 From a normal distribution 𝑐3 𝑐1 𝑐2 X + Minimize reconstruction error ෍ 𝑖=1 3 𝑒𝑥𝑝 𝜎𝑖 − 1 + 𝜎𝑖 + 𝑚𝑖 2 exp 𝑐𝑖 = 𝑒𝑥𝑝 𝜎𝑖 × 𝑒𝑖 + 𝑚𝑖 Minimize NN Encoder NN Decoder code input output = Generator
  • 74. Ref: http://www.wired.co.uk/article/google-artificial-intelligence-poetry Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, Samy Bengio, Generating Sentences from a Continuous Space, arXiv prepring, 2015 VAE - Writing Poetry RNN Encoder sentence RNN Decoder sentence code Code Space i went to the store to buy some groceries. "come with me," she said. i store to buy some groceries. i were to buy any groceries. "talk to me," she said. "don’t worry about it," she said. ……
  • 75. What do we miss? G as close as possible TargetGenerated Image It will be fine if the generator can truly copy the target image. What if the generator makes some mistakes ……. Some mistakes are serious, while some are fine.
  • 76. What do we miss? Target 1 pixel error 1 pixel error 6 pixel errors 6 pixel errors 我覺得不行 我覺得不行 我覺得 其實 OK 我覺得 其實 OK
  • 77. What do we miss? 我覺得不行 我覺得其實 OK The relation between the components are critical. The last layer generates each components independently. Need deep structure to catch the relation between components. Layer L-1 …… Layer L …… …… …… …… Each neural in output layer corresponds to a pixel. ink empty 旁邊的,我們產 生一樣的顏色 誰管你
  • 78. (Variational) Auto-encoder 感謝 黃淞楓 同學提供結果 x1 x2 G 𝑧1 𝑧2 𝑥1 𝑥2
  • 79. So many questions …… Q1: Why generator cannot learn by itself? Q2: Why discriminator don’t generate object itself? Q3: How discriminator and generator interact?
  • 80. Discriminator • Discriminator is a function D (network, can deep) • Input x: an object x (e.g. an image) • Output D(x): scalar which represents how “good” an object x is R:D X D 1.0 D 0.1 Can we use the discriminator to generate objects? Yes. Evaluation function, Potential Function, Evaluation Function …
  • 81. Discriminator • It is easier to catch the relation between the components by top-down evaluation. 我覺得不行 我覺得其實 OK This CNN filter is good enough.
  • 82. Discriminator • Suppose we already have a good discriminator D(x) … How to learn the discriminator? Enumerate all possible x !!! It is feasible ???
  • 83. Discriminator - Training • I have some real images Discriminator training needs some negative examples. D scalar 1 (real) D scalar 1 (real) D scalar 1 (real) D scalar 1 (real) Discriminator only learns to output “1” (real).
  • 84. Discriminator - Training D(x) real examples In practice, you cannot decrease all the x other than real examples. Discrimi nator D object x scalar D(x)
  • 85. Discriminator - Training • Negative examples are critical. D scalar 1 (real) D scalar 0 (fake) D 0.9 Pretty real D scalar 1 (real) D scalar 0 (fake) How to generate realistic negative examples?
  • 86. Discriminator - Training • General Algorithm • Given a set of positive examples, randomly generate a set of negative examples. • In each iteration • Learn a discriminator D that can discriminate positive and negative examples. • Generate negative examples by discriminator D D  xDx Xx  maxarg~ v.s.
  • 87. Discriminator - Training 𝐷 𝑥 𝐷 𝑥 𝐷 𝑥𝐷 𝑥 In the end …… real generated
  • 88. Graphical Model Bayesian Network (Directed Graph) Markov Random Field (Undirected Graph) Markov Logic Network Boltzmann Machine Restricted Boltzmann Machine Structured Learning ➢Structured Perceptron ➢Structured SVM ➢Gibbs sampling ➢Hidden information ➢Application: sequence labelling, summarization Conditional Random Field Segmental CRF (Only list some of the approaches) Energy-based Model: http://www.cs.nyu .edu/~yann/resear ch/ebm/
  • 89. Generator v.s. Discriminator • Generator • Pros: • Easy to generate even with deep model • Cons: • Imitate the appearance • Hard to learn the correlation between components • Discriminator • Pros: • Considering the big picture • Cons: • Generation is not always feasible • Especially when your model is deep • How to do negative sampling?
  • 90. So many questions …… Q1: Why generator cannot learn by itself? Q2: Why discriminator don’t generate object itself? Q3: How discriminator and generator interact?
  • 91. Discriminator - Training • General Algorithm • Given a set of positive examples, randomly generate a set of negative examples. • In each iteration • Learn a discriminator D that can discriminate positive and negative examples. • Generate negative examples by discriminator D D  xDx Xx  maxarg~ v.s. G x~ =
  • 92. Generating Negative Examples  xDx Xx  maxarg~G x~ = Discri- minator NN Generator image code 0.13 hidden layer update fix 0.9 Gradient Ascent
  • 93. • Initialize generator and discriminator • In each training iteration: DG Sample some real objects: Generate some fake objects: G Algorithm D Update G D image code code code code 0000 1111 code code code code image image image 1 update fix
  • 94. • Initialize generator and discriminator • In each training iteration: DG Learning D Sample some real objects: Generate some fake objects: G Algorithm D Update Learning G G D image code code code code 1111 code code code code image image image 1 update fix 0000
  • 95. Benefit of GAN • From Discriminator’s point of view • Using generator to generate negative samples • From Generator’s point of view • Still generate the object component-by- component • But it is learned from the discriminator with global view.  xDx Xx  maxarg~G x~ = efficient
  • 97. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory
  • 98. Binary Classifier as Discriminator Binary Classifier Real Typical binary classifier uses sigmoid function at the output layer You cannot train your classifier too good…… They don’t move. Binary Classifier Fake 1 is the largest, 0 is the smallest real generated
  • 99. Binary Classifier as Discriminator • Don’t let the discriminator perfectly separate real and generated data • Weaken your discriminator? real generated strong weak
  • 100. Binary Classifier as Discriminator • Don’t let the discriminator perfectly separate real and generated data • Add noise to input or label? real generated
  • 101. Least Square GAN (LSGAN) • Replace sigmoid with linear (replace classification with regression) 1 (Real) 0 (Fake) Binary Classifier scalar Binary Classifier scalar They don’t move. 0 1 real generated
  • 102. WGAN • We want the scores of the real examples as large as possible, generated examples as small as possible. Any problem? Binary Classifier scalar Binary Classifier scalar as large as possible as small as possible real generated Move towards Max +∞ −∞
  • 103. WGAN Move towards Max 𝐷 𝑥1 − 𝐷 𝑥2 ≤ 𝐾 𝑥1 − 𝑥2 Lipschitz Function K=1 for "1 − 𝐿𝑖𝑝𝑠𝑐ℎ𝑖𝑡𝑧" Output change Input change Do not change fast The discriminator should be a 1-Lipschitz function. It should be smooth. How to realize?
  • 104. WGAN • Original WGAN → Weight Clipping • Improved WGAN → Gradient Penalty Force the parameters w between c and -c After parameter update, if w > c, w = c; if w < -c, w = -c Do not truly maximize (minimize) the real (generated) examples It should be smooth enough. 𝐷 𝑥𝐷 𝑥 Keep the gradient close to 1 Easy to move
  • 105. DCGAN G: CNN, D: CNN G: CNN (no normalization), D: CNN (no normalization) LSGAN Original WGAN Improved WGAN G: CNN (tanh), D: CNN(tanh)
  • 106. DCGAN LSGAN Original WGAN Improved WGAN G: MLP, D: CNN G: CNN (bad structure), D: CNN G: 101 layer, D: 101 layer
  • 107. Sentence Generation • good bye. g o o d <space> …… 0 1 0 0 ………… d g o <space> 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 1 Consider this matrix as an “image” 1-of-N Encoding I will talk about RNN later.
  • 108. Sentence Generation • Real sentence • Generated 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0.9 0.1 0 0 0 0.1 0.9 0 0 0 0.1 0.1 0.7 0.1 0 0 0 0.1 0.8 0.1 0 0 0 0.1 0.9 Can never be 1-of-N A binary classifier can immediately find the difference. WGAN is helpful No overlap I will talk about RNN later.
  • 110. W-GAN – 唐詩鍊成 感謝 李仲翊 同學提供實 驗結果 • 升雲白遲丹齋取,此酒新巷市入頭。黃道故海歸中後,不驚入得韻子門。 • 據口容章蕃翎翎,邦貸無遊隔將毬。外蕭曾臺遶出畧,此計推上呂天夢。 • 新來寳伎泉,手雪泓臺蓑。曾子花路魏,不謀散薦船。 • 功持牧度機邈爭,不躚官嬉牧涼散。不迎白旅今掩冬,盡蘸金祇可停。 • 玉十洪沄爭春風,溪子風佛挺橫鞋。盤盤稅焰先花齋,誰過飄鶴一丞幢。 • 海人依野庇,為阻例沉迴。座花不佐樹,弟闌十名儂。 • 入維當興日世瀕,不評皺。頭醉空其杯,駸園凋送頭。 • 鉢笙動春枝,寶叅潔長知。官爲宻爛去,絆粒薛一靜。 • 吾涼腕不楚,縱先待旅知。楚人縱酒待,一蔓飄聖猜。 • 折幕故癘應韻子,徑頭霜瓊老徑徑。尚錯春鏘熊悽梅,去吹依能九將香。 • 通可矯目鷃須浄,丹迤挈花一抵嫖。外子當目中前醒,迎日幽筆鈎弧前。 • 庭愛四樹人庭好,無衣服仍繡秋州。更怯風流欲鴂雲,帛陽舊據畆婷儻。 輸出 32 個字 (包含標點)
  • 111. Loss-sensitive GAN (LSGAN) D(x) 𝑥 WGAN LSGAN 𝑥′′ 𝑥′ D(x) Δ 𝑥, 𝑥′ Δ 𝑥, 𝑥′ 𝑥′′ 𝑥′ 𝑥
  • 112. Energy-based GAN (EBGAN) • Using an autoencoder as discriminator D scalar Discriminator An image is good. It can be reconstructed by autoencoder. = 0 for best images Generator is the same. - 0.1 EN DE Autoencoder X -1 -0.1
  • 113. EBGAN realgen gen Hard to reconstruct, easy to destroy m 0 is for the best. Do not have to be very negative Auto-encoder based discriminator only give limited region large value.
  • 115. Missing Mode ? ? Mode collapse is easy to detect.
  • 116. Solution? • Feature matching, Minibatch discrimination • https://arxiv.org/abs/1606.03498 • Unroll GAN • https://arxiv.org/abs/1611.02163 • InfoGAN • Next lecture • Ensemble GAN • Next lecture
  • 117. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory Lucas Theis, Aäron van den Oord, Matthias Bethge, “A note on the evaluation of generative models”, arXiv preprint, 2015
  • 118. Likelihood : real data (not observed during training) : generated data 𝑥 𝑖 Log Likelihood: Generator PG 𝐿 = 1 𝑁 ෍ 𝑖 𝑙𝑜𝑔𝑃𝐺 𝑥 𝑖 We cannot compute 𝑃𝐺 𝑥 𝑖 . We can only sample from 𝑃𝐺. Prior Distribution
  • 119. Likelihood - Kernel Density Estimation • Estimate the distribution of PG(x) from sampling Generator PG https://stats.stackexchange.com/questions/244012/can- you-explain-parzen-window-kernel-density-estimation-in- laymans-terms/244023 Now we have an approximation of 𝑃𝐺, so we can compute 𝑃𝐺 𝑥 𝑖 for each real data 𝑥 𝑖 Then we can compute the likelihood. Each sample is the mean of a Gaussian with the same covariance.
  • 120. Likelihood v.s. Quality • Low likelihood, high quality? • High likelihood, low quality? Considering a model generating good images (small variance) Generator PG Generated data Data for evaluation 𝑥 𝑖 𝑃𝐺 𝑥 𝑖 = 0 G1 𝐿 = 1 𝑁 ෍ 𝑖 𝑙𝑜𝑔𝑃𝐺 𝑥 𝑖 G2 X 0.99 X 0.01 = −𝑙𝑜𝑔100 + 1 𝑁 ෍ 𝑖 𝑙𝑜𝑔𝑃𝐺 𝑥 𝑖 4.6100
  • 121. Evaluate by Other Networks Well-trained CNN 𝑥 𝑃 𝑦|𝑥 Lower entropy means higher visual quality CNN𝑥1 𝑃 𝑦1|𝑥1 Higher entropy means higher variety CNN𝑥2 𝑃 𝑦2|𝑥2 CNN𝑥3 𝑃 𝑦3 |𝑥3 … 1 𝑁 ෍ 𝑛 𝑃 𝑦 𝑛|𝑥 𝑛 Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen, “Improved Techniques for Training GANs”, arXiv prepring, 2016
  • 122. Evaluate by Other Networks - Inception Score • Improved W-GAN
  • 123. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory Can skip this part
  • 124. Theory behind GAN • The data we want to generate has a distribution 𝑃𝑑𝑎𝑡𝑎 𝑥 𝑃𝑑𝑎𝑡𝑎 𝑥 High Probability Low ProbabilityImage Space
  • 125. Theory behind GAN • A generator G is a network. The network defines a probability distribution. generator G𝑧 𝑥 = 𝐺 𝑧 Normal Distribution 𝑃𝐺 𝑥 𝑃𝑑𝑎𝑡𝑎 𝑥 As close as possible https://blog.openai.com/generative-models/ It is difficult to compute 𝑃𝐺 𝑥 We do not know what the distribution looks like.
  • 126. generator G𝑧 𝑥 = 𝐺 𝑧 Normal Distribution 𝑃𝐺 𝑥 𝑃𝑑𝑎𝑡𝑎 𝑥 As close as possible https://blog.openai.com/generative-models/ Discri- minator When the discriminator is well trained, its loss represents a specific divergence measure between PG and Pdata. Update discriminator is to minimize the divergence measure Binary classifier evaluates the JS divergence You can design the discriminator to evaluate other divergence.
  • 127. Why GAN is hard to train? http://www.guokr.com/post/773890/ Better
  • 128. Why GAN is hard to train? • $JS(P_{G_0} \| P_{data}) = \log 2$, $JS(P_{G_{50}} \| P_{data}) = \log 2$, …, $JS(P_{G_{100}} \| P_{data}) = 0$? • When $P_G$ and $P_{data}$ do not overlap, the JS divergence stays at $\log 2$, so $P_{G_{50}}$ is not really “better” than $P_{G_0}$ even though it has moved closer.
  • 129. Evaluating JS divergence • The JS divergence estimated by the discriminator tells us little: a weak generator and a strong generator can give almost the same estimate. https://arxiv.org/abs/1701.07875
  • 130. Earth Mover’s Distance • Considering one distribution P as a pile of earth and another distribution Q as the target, $W(P, Q)$ is the average distance the earth mover has to move the earth; if all the mass must be moved a distance d, then $W(P, Q) = d$.
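A tiny 1-D illustration of the earth mover's (Wasserstein) distance using SciPy; the two point sets are made-up examples:

```python
from scipy.stats import wasserstein_distance

p_samples = [0.0, 0.0, 0.0]   # all the mass of P sits at 0
q_samples = [5.0, 5.0, 5.0]   # all the mass of Q sits at 5
print(wasserstein_distance(p_samples, q_samples))  # 5.0: move the whole pile by d = 5
```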
  • 131. Why Earth Mover’s Distance? • $JS(P_{G_0}, P_{data}) = \log 2$, $JS(P_{G_{50}}, P_{data}) = \log 2$, …, $JS(P_{G_{100}}, P_{data}) = 0$ — the divergence $D_f(P_{data}\|P_G)$ cannot tell the intermediate generators apart. • $W(P_{G_0}, P_{data}) = d_0$, $W(P_{G_{50}}, P_{data}) = d_{50}$, …, $W(P_{G_{100}}, P_{data}) = 0$ — the distance $W(P_{data}, P_G)$ decreases as the generator improves.
  • 133. When can I use GAN? • I consider GAN as a new approach for structured learning. Generation by GAN • Generator: generate locally • Discriminator: evaluate globally • Generator + Discriminator → GAN Improving GAN Evaluation A Little Bit Theory Concluding Remarks
  • 135. Discriminator is flat in the end. • The discriminator leads the generator. [Figure: D(x) over the data space at successive training stages, with real and generated samples; the generated distribution is pushed toward the real one until D(x) becomes flat.]
  • 136. Discriminator is flat in the end. Source: https://www.youtube.com/watch?v=ebMei6bYeWw (credit: Benjamin Striner)
  • 137. Discriminator is flat in the end. http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/
  • 138. Lecture III: Conditional Generation and its application on Anime Face Generation
  • 139. Conditional Generation • Generation: an NN generator maps a random vector (drawn from a specific range) to an output. • Conditional generation: the generator additionally takes a condition such as “Girl with red hair and red eyes” or “Girl with yellow ribbon”.
  • 140. Conditional Generation • We don’t want to simply generate some random stuff. • Generate objects based on conditions: Given condition: Caption Generation Chat-bot Given condition: “Hello” “A young girl is dancing.” “Hello. Nice to see you.”
  • 141. Conditional Generation Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 142. Modifying Input Code Generator 0.1 −3 ⋮ 2.4 0.9 Generator 3 −3 ⋮ 2.4 0.9 Generator 0.1 2.1 ⋮ 2.4 0.9 Generator 0.1 −3 ⋮ 2.4 3.5 Each dimension of input vector represents some characteristics. Longer hair blue hair Open mouth ➢The input code determines the generator output. ➢Understand the meaning of each dimension to control the output.
  • 143. Conditional Generation Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 144. InfoGAN • Regular GAN: modifying a specific dimension of the code has no clear meaning. [Figure: what we expect vs. what actually happens; the colors represent the characteristics.]
  • 145. What is InfoGAN? • The generator takes z = (c, z′) and outputs x; a discriminator still produces a scalar, and a classifier predicts the code c that generated x. • Generator + classifier form an “auto-encoder” in code space (generator = decoder, classifier = encoder). • The discriminator and the classifier share parameters (only the last layer is different).
  • 146. What is InfoGAN? • Because the classifier must be able to recover c from x, c must have a clear influence on x (generator = decoder, classifier = encoder of the “auto-encoder”).
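A minimal sketch of the extra InfoGAN term, assuming PyTorch modules `generator` (input: noise z′ concatenated with a one-hot code c) and `classifier` (predicts c from an image); the module names, the discrete code, and the dimensions are all illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def infogan_code_loss(generator, classifier, batch_size, code_dim, noise_dim):
    z = torch.randn(batch_size, noise_dim)
    c = torch.randint(0, code_dim, (batch_size,))              # discrete latent code
    x = generator(torch.cat([z, F.one_hot(c, code_dim).float()], dim=1))
    logits = classifier(x)                                     # try to recover c from x
    # Minimizing this (added to the usual GAN losses) forces c to visibly influence x.
    return F.cross_entropy(logits, c)
```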
  • 149. Conditional Generation Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 150. Connecting Code and Attribute CelebA
  • 151. GAN+Autoencoder • We have a generator (input z, output x). • However, given x, how can we find z? • Learn an encoder (input x, output z): the generator (decoder) is fixed, and the encoder is trained so that the reconstruction is as close as possible to x (an autoencoder). The encoder can be initialized from the discriminator — or should they have different structures?
  • 153. Attribute Representation • $z_{long} = \frac{1}{N_1}\sum_{x \in long} En(x) - \frac{1}{N_2}\sum_{x' \notin long} En(x')$ (long hair vs. short hair). • For an input image x: $En(x) + z_{long} = z'$, and $Gen(z')$ gives the edited image.
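A sketch of this attribute-vector trick, assuming `En` and `Gen` are the trained encoder and generator (decoder), and the two tensors hold images with and without the attribute; all names are placeholders:

```python
import torch

def edit_attribute(En, Gen, x, with_attr_images, without_attr_images):
    # Average code of images with the attribute minus average code without it.
    z_attr = En(with_attr_images).mean(dim=0) - En(without_attr_images).mean(dim=0)
    z_prime = En(x) + z_attr      # move the code toward "long hair", "smiling", ...
    return Gen(z_prime)
```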
  • 155. Conditional Generation Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 156. Conditional Generation Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 157. Conditional GAN • Generating images based on text description NN Encoder NN Generator sentence code code image “a dog is sleeping” ?
  • 158. Conditional GAN • Generating images based on text description: an NN encoder turns the sentence (“a bird is flying”, “a dog is running”) into a code, and an NN generator turns the code into an image. Training pairs: (c1: “a dog is running”, x̂1), (c2: “a bird is flying”, x̂2).
  • 159. Conditional GAN • Text to image by traditional supervised learning: an NN maps the text (e.g. “train”) to an image, trained to be as close as possible to the target x̂. Because one text (e.g. c1: “a dog is running”) corresponds to many different images, the output is a blurry average image!
  • 160. Conditional GAN • G takes the condition c (“train”) and z sampled from a prior distribution, and outputs x = G(c, z). Instead of a single target image (which yields a blurry average), the generator approximates the whole distribution of images matching the text — it is a distribution.
  • 161. Conditional GAN • D (type 1) takes only x and outputs a scalar: is x realistic or not? • D (type 2) takes both c and x and outputs a scalar: is x realistic AND matched with c? Positive example: (train, image of a train). Negative examples: (train, generated image) and (cat, image of a train). G still maps the condition c and a prior sample z to x = G(c, z).
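A sketch of the type-2 conditional discriminator update, assuming PyTorch modules `G(c, z)` and `D(x, c)` where D ends with a sigmoid and returns one probability per example; the module interfaces and tensor names are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def cond_d_loss(D, G, real_x, real_c, wrong_c, z):
    ones = torch.ones(real_x.size(0), 1)
    zeros = torch.zeros(real_x.size(0), 1)
    pos = F.binary_cross_entropy(D(real_x, real_c), ones)               # matched real pair
    neg_fake = F.binary_cross_entropy(D(G(real_c, z), real_c), zeros)   # generated image
    neg_mismatch = F.binary_cross_entropy(D(real_x, wrong_c), zeros)    # real image, wrong text
    return pos + neg_fake + neg_mismatch
```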
  • 162. Text to Image - Results "red flower with black center"
  • 163. Text to Image - Results
  • 165. Image-to-image • Traditional supervised approach: an NN maps the input image to an output image, trained to be as close as possible to the target. At testing time the output is blurry because it is the average of several possible target images.
  • 166. Image-to-image • Experimental results at testing time: GAN (G takes the input image plus z, D outputs a scalar) gives sharper outputs; GAN plus a “close to target” constraint works best.
  • 167. Image super resolution • Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, Wenzhe Shi, “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”, CVPR, 2016
  • 168. Video Generation • The generator predicts the next frame; the discriminator checks whether the last frame is real or generated. The generator is trained both to minimize the distance to the target frame and to make the discriminator think the frame is real.
  • 170. Speech Enhancement • Typical deep learning approach Noisy Clean G Output Using CNN
  • 171. Speech Enhancement • Conditional GAN G D scalar noisy output clean noisy clean output noisy training data (fake pair or not)
  • 172. Speech Enhancement • Noisy speech → enhanced speech. Which enhanced speech is better? (Thanks to 廖峴峰 for the experimental results; co-advised with Dr. Yu Tsao of Academia Sinica)
  • 173. More about Speech Processing • Speech synthesis • Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru Hiramatsu, Kunio Kashino, “Generative Adversarial Network-based Postfiltering for Statistical Parametric Speech Synthesis”, ICASSP 2017 • Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Training algorithm to deceive anti-spoofing verification for DNN-based speech synthesis, ”, ICASSP 2017 • Voice Conversion • Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang, Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks, Interspeech 2017 • Speech Enhancement • Santiago Pascual, Antonio Bonafonte, Joan Serrà, SEGAN: Speech Enhancement Generative Adversarial Network, Interspeech 2017
  • 174. Conditional Generation Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 175. Cycle GAN, Disco GAN Transform an object from one domain to another without paired data photo van Gogh Domain X Domain Y paired data
  • 176. Cycle GAN • $G_{X\to Y}$ maps domain X to domain Y; $D_Y$ outputs a scalar: does the input image belong to domain Y or not? • With only this loss the output becomes similar to domain Y but may ignore the input — not what we want. https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/
  • 177. Cycle GAN • Add $G_{Y\to X}$: the image mapped into domain Y must be mapped back and be as close as possible to the original input; otherwise there is a lack of information for reconstruction.
  • 178. Cycle GAN • Both directions: $G_{X\to Y}$ followed by $G_{Y\to X}$ reconstructs the domain-X input as closely as possible, and $G_{Y\to X}$ followed by $G_{X\to Y}$ reconstructs the domain-Y input, while $D_X$ and $D_Y$ judge whether images belong to domain X / domain Y. (c.f. Dual Learning)
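A sketch of the cycle-consistency (reconstruction) terms above, assuming generators `G_XY`, `G_YX` and image batches `x`, `y` from the two domains; the weight `lam` is an illustrative assumption:

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G_XY, G_YX, x, y, lam=10.0):
    # x -> Y -> back to X should reproduce x; likewise for y.
    loss_x = F.l1_loss(G_YX(G_XY(x)), x)
    loss_y = F.l1_loss(G_XY(G_YX(y)), y)
    return lam * (loss_x + loss_y)    # added on top of the two adversarial losses
```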
  • 179. Turning photos into anime (動畫化的世界) • Using the code: https://github.com/Hi-king/kawaii_creator • It is not Cycle GAN or Disco GAN. [Figure: input photo, output, and target anime domain]
  • 180. Conditional Generation Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 181. https://www.youtube.com/watch?v=9c4z6YsBGQ0 Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman and Alexei A. Efros. "Generative Visual Manipulation on the Natural Image Manifold", ECCV, 2016.
  • 182. Andrew Brock, Theodore Lim, J.M. Ritchie, Nick Weston, Neural Photo Editing with Introspective Adversarial Networks, arXiv preprint, 2017
  • 183. Basic Idea • Move in the space of the code z rather than in pixel space. Why move in the code space? So that the result both stays realistic and fulfills the editing constraint.
  • 184. Back to z • Given a target image $x_T$, find $z^* = \arg\min_z L(G(z), x_T)$. • Method 1: gradient descent on z, where the difference L between G(z) and $x_T$ can be pixel-wise or measured by another network. • Method 2: learn an encoder (input x, output z) so that the decoder (generator) reconstructs x as closely as possible. • Method 3: use the result of method 2 as the initialization of method 1.
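A sketch of "method 1": searching the code space by gradient descent so that G(z) matches a target image $x_T$ with a pixel-wise loss. `G` is assumed to be a differentiable PyTorch generator; the step size and iteration count are made up:

```python
import torch

def find_z(G, x_T, z_dim, steps=500, lr=0.05):
    z = torch.randn(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(G(z), x_T)   # L(G(z), x_T), pixel-wise
        loss.backward()
        opt.step()
    return z.detach()
```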
  • 185. Editing Photos • $z_0$ is the code of the input image. $z^* = \arg\min_z U(G(z)) + \lambda_1 \|z - z_0\|^2 - \lambda_2 D(G(z))$, where U checks whether the image fulfills the editing constraint, the second term keeps the result not too far from the original image, and the discriminator D checks that the image is realistic.
  • 186. Conditional Generation Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 187. Domain Independent Features • Training and testing data are in different domains Training data: Testing data: Generator Generator The same distribution feature feature
  • 189. Domain Independent Features • If we only ask the feature extractor to fool a domain classifier, the task is too easy for the feature extractor — it can simply produce features that carry no information at all. [Figure: feature extractor followed by domain classifier]
  • 190. Domain-adversarial training • Not only cheat the domain classifier, but also satisfy the label predictor at the same time. This is a big network, but different parts have different goals: the label predictor maximizes label classification accuracy; the domain classifier maximizes domain classification accuracy; the feature extractor maximizes label classification accuracy while minimizing domain classification accuracy.
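A minimal sketch of the gradient-reversal trick commonly used to implement this: the feature extractor receives the negated gradient from the domain classifier, so it learns to confuse domains while the label predictor is trained normally. The module names are illustrative assumptions:

```python
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam=1.0):
        ctx.lam = lam
        return x.view_as(x)            # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None   # flip the gradient sign on the way back

def domain_branch(features, domain_classifier, lam=1.0):
    # Domain classifier trains normally; feature extractor gets the reversed gradient.
    return domain_classifier(GradReverse.apply(features, lam))
```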
  • 191. Domain-adversarial training Yaroslav Ganin, Victor Lempitsky, Unsupervised Domain Adaptation by Backpropagation, ICML, 2015 Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Domain-Adversarial Training of Neural Networks, JMLR, 2016 Domain classifier fails in the end It should struggle ……
  • 192. Domain-adversarial training Yaroslav Ganin, Victor Lempitsky, Unsupervised Domain Adaptation by Backpropagation, ICML, 2015 Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Domain-Adversarial Training of Neural Networks, JMLR, 2016 Train: Test:
  • 193. Conditional Generation Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 194. VAE-GAN Discrimi nator Generator (Decoder) Encoder z x scalarx as close as possible VAE GAN ➢ Minimize reconstruction error ➢ z close to normal ➢ Minimize reconstruction error ➢ Cheat discriminator ➢ Discriminate real, generated and reconstructed images Discriminator provides the similarity measure Anders Boesen, Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther, “Autoencoding beyond pixels using a learned similarity metric”, ICML. 2016
  • 195. BiGAN Encoder Decoder Discriminator Image x code z Image x code z Image x code z from encoder or decoder? (real) (generated) (from prior distribution) Jeff Donahue, Philipp Krähenbühl, Trevor Darrell, “Adversarial Feature Learning”, ICLR, 2017 Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, Aaron Courville, “Adversarially Learned Inference”, ICLR, 2017
  • 196. BiGAN • Initialize encoder En, decoder De, discriminator Dis • In each iteration: • Sample M images $x^1, x^2, \cdots, x^M$ from the database • Generate M codes $\tilde{z}^1, \tilde{z}^2, \cdots, \tilde{z}^M$ from the encoder: $\tilde{z}^i = En(x^i)$ • Sample M codes $z^1, z^2, \cdots, z^M$ from the prior P(z) • Generate M images $\tilde{x}^1, \tilde{x}^2, \cdots, \tilde{x}^M$ from the decoder: $\tilde{x}^i = De(z^i)$ • Update the discriminator to increase $D(x^i, \tilde{z}^i)$ and decrease $D(\tilde{x}^i, z^i)$ • Update En and De to decrease $D(x^i, \tilde{z}^i)$ and increase $D(\tilde{x}^i, z^i)$
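One iteration of this pseudo-code as a PyTorch sketch, assuming modules `En`, `De`, and a joint discriminator `Dis(x, z)` that outputs a probability per example; the optimizers and shapes are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def bigan_step(En, De, Dis, x_real, z_prior, opt_d, opt_g):
    z_tilde = En(x_real)          # codes from the encoder
    x_tilde = De(z_prior)         # images from the decoder
    ones = torch.ones(x_real.size(0), 1)
    zeros = torch.zeros(x_real.size(0), 1)

    # Discriminator: (x, En(x)) should score high, (De(z), z) should score low.
    d_loss = F.binary_cross_entropy(Dis(x_real, z_tilde.detach()), ones) + \
             F.binary_cross_entropy(Dis(x_tilde.detach(), z_prior), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Encoder/decoder: fool the discriminator (targets flipped).
    g_loss = F.binary_cross_entropy(Dis(x_real, En(x_real)), zeros) + \
             F.binary_cross_entropy(Dis(De(z_prior), z_prior), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```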
  • 197. Encoder, Decoder, Discriminator • The discriminator takes an (image x, code z) pair and decides whether it comes from the encoder (real image, encoded code) or from the decoder (code sampled from the prior, generated image). The discriminator evaluates the difference between the two joint distributions P(x, z) and Q(x, z); training makes P and Q the same. • Optimal encoder and decoder: En(x′) = z′ with De(z′) = x′ for all x′, and De(z″) = x″ with En(x″) = z″ for all z″.
  • 198. BiGAN • The optimal encoder and decoder satisfy En(x′) = z′ with De(z′) = x′ for all x′, and De(z″) = x″ with En(x″) = z″ for all z″. • How about simply training En and De together as an auto-encoder (x → z → x̃, and z → x → z̃)? The optimum is the same, but the two approaches behave differently when they are not optimal.
  • 199.
  • 200. Concluding Remarks Modifying the input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes Controlling by input objects • Paired data • Unpaired data • Unsupervised Feature extraction • Domain-Independent Features • Improving the Auto-encoder (VAE-GAN, BiGAN)
  • 202. Anime Face Generation • My course at NTU: “Machine Learning and having it Deep and Structured” • This spring I offered this course for the third time. • I want to update the course, including HWs for sequence-to-sequence learning, reinforcement learning, and GAN. • What is the HW for GAN? “girl with red hair and red eyes” → machine
  • 203. Anime Face Generation • https://github.com/mattya/chainer-DCGAN • https://zhuanlan.zhihu.com/p/24767059 • https://github.com/jayleicn/animeGAN
  • 204. Anime Face Generation - Girl Friend Factory • http://qiita.com/Hiroshiba/items/d5749d8896613e6f0b48
  • 206. Released Training Data • Data download link: https://drive.google.com/open?id=0BwJmB7alR-AvMHEtczZZN0EtdzQ • Anime Dataset: • training data: 33.4k (image, tags) pairs, images 96 x 96 • Training tags file format (tags.csv): img_id <comma> tag1 <colon> #_post <tab> tag2 <colon> … (e.g. blue eyes, red hair, short hair)
  • 207. Data • The students have to submit their code. • The code generates five 64x64 images based on a text description. • The input testing text only includes ‘color hair’ and ‘color eyes’. • Testing text file format: testing_text_id <comma> testing_text • Extra bonus if you can generate images beyond the above description (sample testing_text.txt)
  • 208. Evaluation • There are many evaluation methods in the literature. • We will put all the generated images on the grading platform, giving 2 scores for each image: how well the image fits the text, and how real the image looks. • Scores are integers from 1 to 5, corresponding to (super bad, bad, average, good, super good). • It is double blind.
  • 209. Implementation Details • Updates between Generator and Discriminator • 1 : 1 or 2 : 1 • ADAM with lr = 0.0002, momentum = 0.5 • gaussian noise dim = 100 • batch size = 64, epoch = 300 Paper: https://arxiv.org/pdf/1605.05396.pdf Code: https://github.com/paarthneekhara/text-to-image
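A sketch of this training setup in PyTorch (the hyperparameter values come from the slide; the generator and discriminator modules themselves are assumed):

```python
import torch
import torch.nn as nn

def make_optimizers(generator: nn.Module, discriminator: nn.Module):
    # ADAM with lr = 0.0002 and momentum (beta1) = 0.5, as listed above.
    opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
    return opt_g, opt_d

noise_dim, batch_size, epochs = 100, 64, 300   # Gaussian noise dim, batch size, epochs
# Update ratio between generator and discriminator: 1:1 or 2:1.
```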
  • 210. Results of TA • input text: black hair blue eyes • input text: green hair green eyes • input text: blue hair green eyes
  • 211. Advanced Techniques • WGAN and Improved WGAN • StackGAN https://arxiv.org/abs/1612.03242
  • 212. Sentence Representation • Using an RNN to encode the input text • But you already know the testing input …… so why not a 15*15 one-hot encoding? (Technique provided by 江東峻, 陳翰浩, 鄭嘉文)
  • 213. Data Augmentation • Total training data: 33.4K, but only 14.8K is useful — not sufficient for image generation? • Data augmentation: horizontal flipping + rotation (Technique provided by 江東峻, 陳翰浩, 鄭嘉文)
  • 214. Sampled Noise v.s. Image Quality [Figure: generated samples with noise std = 0.950, 0.666, and 0.432] (Technique provided by 柯達方)
  • 215. Sampled Noise v.s. Image Quality green hair green eyes noise with small std noise with large std green hair green eyes Generator Generator
  • 216. Ensemble • E.g. BEGAN on CelebA (Experimental results provided by 陳柏文)
  • 218. Bonus • hat, twin tails, ribbon, horns, headdress, headphones, glasses …… • Whole images • Deep Cleavage GAN (DCGAN) • Successful …… but I will not show this …… (Thanks to 孫凡耕, 羅啟心, 許晉嘉, 郭子生 for the results)
  • 219. Lecture 4: Decision Making and Control and its relation with GAN
  • 220. Decision Making and Control • Machine observe some inputs, takes an action, and finally achieve the target. • E.g. Go playing observation action Target: win the game State: summarization of observation
  • 221. Decision Making and Control • Machine plays video games • Widely studies: • Gym: https://gym.openai.com/ • Universe: https://openai.com/blog/universe/ Machine learns to play video games as human players ➢ Machine learns to take proper action itself ➢ What machine observes are pixels
  • 222. Decision Making and Control • E.g. self-driving car: Object Recognition sees a red light (observation) → Decision Making → Action: Stop! Could this be one giant network?
  • 223. Decision Making and Control • E.g. dialogue system: “I want to book a flight to Taipei on November 5.” → Slot Filling (arrival date: November 5; destination: Taipei) → Decision Making (ask for the departure city) → Natural Language Generation: “Where would you like to depart from?” Could this be one giant network?
  • 224. How to solve this problem? • Network as a function, learned as a typical supervised task: Network(board) → target: 3-3; Network(red light) → target: Stop!; Network(“I want to book a flight to Taipei on November 5”) → target: “Where would you like to depart from?”
  • 225. Behavior Cloning https://www.youtube.com/watch?v=j2FSB3bseek The machine does not know which behaviors must be copied and which can be ignored.
  • 226. Properties of Decision Making and Control • Agent’s actions affect the subsequent data it receives • Reward delay • In space invader, only “fire” obtains reward • Although the moving before “fire” is important • In Go playing, sacrificing immediate reward to gain more long-term reward Machine does not know the influence of each action. What do we miss?
  • 227. Better Way …… Network (19 x 19 positions) → next move. It is a sequence of decisions, not just a single classification.
  • 228. Two Learning Scenarios • Scenario 1: Reinforcement Learning • Machine interacts with the environment. • Machine obtains the reward from the environment, so it knows its performance is good or bad. • Scenario 2: Learning by demonstration • Also known as imitation learning, apprenticeship learning • An expert demonstrates how to solve the task, and machine learns from the demonstration.
  • 229. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
  • 230. RL • Policy-based: learning an actor. Value-based: learning a critic. Actor + Critic combines both. Model-based vs. model-free approaches. AlphaGo: policy-based + value-based + model-based.
  • 231. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
  • 233. Actor, Environment, Reward • Trajectory $\tau = \{s_1, a_1, s_2, a_2, \cdots, s_T, a_T\}$ • The actor sees $s_1$ and takes $a_1$; the environment returns reward $r_1$ and observation $s_2$; the actor takes $a_2$; and so on. • Total reward: $R(\tau) = \sum_{t=1}^{T} r_t$
  • 234. Start with observation 𝑠1 Observation 𝑠2 Observation 𝑠3 Example: Playing Video Game Action 𝑎1: “right” Obtain reward 𝑟1 = 0 Action 𝑎2 : “fire” (kill an alien) Obtain reward 𝑟2 = 5 Usually there is some randomness in the environment
  • 235. Example: Playing Video Game • Start with observation $s_1$, then $s_2$, $s_3$, … After many turns, the machine takes action $a_T$, obtains reward $r_T$, and the game is over (spaceship destroyed). This is an episode. We want the total reward $R = \sum_{t=1}^{T} r_t$ to be maximized.
  • 236. Neural network as Actor • Input of the neural network: the observation of the machine, represented as a vector or a matrix (e.g. pixels). • Output of the neural network: each action corresponds to a neuron in the output layer (e.g. left 0.1, right 0.2, fire 0.7 — the score of each action); take the action with the highest score. • What is the benefit of using a network instead of a lookup table? Generalization.
  • 237. Actor, Environment, Reward • The actor is a network; the environment and the reward are not networks. $R(\tau) = \sum_{t=1}^{T} r_t$ • If some part (e.g. the argmax or the environment) turns out to be non-differentiable, just use policy gradient and train it anyway. Ref: https://www.youtube.com/watch?v=W8XF3ME8G2I
  • 239. Step 1: define a set of function Step 2: goodness of function Step 3: pick the best function Three Steps for Deep Learning Deep Learning is so simple …… Neural Network as Actor
  • 240. Step 1: define a set of function Step 2: goodness of function Step 3: pick the best function Three Steps for Deep Learning Deep Learning is so simple …… Neural Network as Actor
  • 241. Goodness of Actor • Given an actor $\pi_\theta(s)$ with network parameter $\theta$, use $\pi_\theta(s)$ to play the video game: start with observation $s_1$, take $a_1$, obtain $r_1$, see $s_2$, take $a_2$, obtain $r_2$, see $s_3$, … take $a_T$, obtain $r_T$ (END). • Total reward: $R_\theta = \sum_{t=1}^{T} r_t$. Even with the same actor, $R_\theta$ is different each time because of randomness in the actor and the game. • We define $\bar{R}_\theta$ as the expected value of $R_\theta$; $\bar{R}_\theta$ evaluates the goodness of an actor $\pi_\theta(s)$.
  • 242. Goodness of Actor • $\tau = \{s_1, a_1, r_1, s_2, a_2, r_2, \cdots, s_T, a_T, r_T\}$ • $P(\tau|\theta) = p(s_1)\, p(a_1|s_1,\theta)\, p(r_1, s_2|s_1, a_1)\, p(a_2|s_2,\theta)\, p(r_2, s_3|s_2, a_2) \cdots = p(s_1) \prod_{t=1}^{T} p(a_t|s_t,\theta)\, p(r_t, s_{t+1}|s_t, a_t)$ • $p(a_t|s_t,\theta)$ is controlled by your actor $\pi_\theta$ (e.g. $p(a_t = \text{“fire”}|s_t,\theta) = 0.7$); the other terms are not related to your actor. • We define $\bar{R}_\theta$ as the expected value of $R_\theta$.
  • 243. Goodness of Actor • An episode is considered as a trajectory $\tau = \{s_1, a_1, r_1, s_2, a_2, r_2, \cdots, s_T, a_T, r_T\}$ with $R(\tau) = \sum_{t=1}^{T} r_t$. • If you use an actor to play the game, each $\tau$ has a probability of being sampled, which depends on the actor parameter $\theta$: $P(\tau|\theta)$. • $\bar{R}_\theta = \sum_\tau R(\tau) P(\tau|\theta) \approx \frac{1}{N}\sum_{n=1}^{N} R(\tau^n)$ (sum over all possible trajectories, approximated by sampling $\tau$ from $P(\tau|\theta)$ N times: use $\pi_\theta$ to play the game N times, obtaining $\tau^1, \tau^2, \cdots, \tau^N$).
  • 244. Step 1: define a set of function Step 2: goodness of function Step 3: pick the best function Three Steps for Deep Learning Deep Learning is so simple …… Neural Network as Actor
  • 245. Gradient Ascent • Problem statement: $\theta^* = \arg\max_\theta \bar{R}_\theta$ • Gradient ascent: start with $\theta^0$; $\theta^1 \leftarrow \theta^0 + \eta \nabla\bar{R}_{\theta^0}$; $\theta^2 \leftarrow \theta^1 + \eta \nabla\bar{R}_{\theta^1}$; …… • $\theta = \{w_1, w_2, \cdots, b_1, \cdots\}$, $\nabla\bar{R}_\theta = \left[\partial\bar{R}_\theta/\partial w_1,\ \partial\bar{R}_\theta/\partial w_2,\ \cdots,\ \partial\bar{R}_\theta/\partial b_1,\ \cdots\right]^\top$
  • 246. Policy Gradient • $\bar{R}_\theta = \sum_\tau R(\tau) P(\tau|\theta)$, so $\nabla\bar{R}_\theta = \sum_\tau R(\tau) \nabla P(\tau|\theta) = \sum_\tau R(\tau) P(\tau|\theta) \frac{\nabla P(\tau|\theta)}{P(\tau|\theta)} = \sum_\tau R(\tau) P(\tau|\theta) \nabla\log P(\tau|\theta) \approx \frac{1}{N}\sum_{n=1}^{N} R(\tau^n) \nabla\log P(\tau^n|\theta)$ (use $\pi_\theta$ to play the game N times, obtaining $\tau^1, \tau^2, \cdots, \tau^N$). • $R(\tau)$ does not have to be differentiable; it can even be a black box. • Using $\frac{d\log f(x)}{dx} = \frac{1}{f(x)}\frac{d f(x)}{dx}$.
  • 247. Policy Gradient • $\tau = \{s_1, a_1, r_1, \cdots, s_T, a_T, r_T\}$, $P(\tau|\theta) = p(s_1)\prod_{t=1}^{T} p(a_t|s_t,\theta)\, p(r_t, s_{t+1}|s_t, a_t)$ • $\log P(\tau|\theta) = \log p(s_1) + \sum_{t=1}^{T}\left[\log p(a_t|s_t,\theta) + \log p(r_t, s_{t+1}|s_t, a_t)\right]$ • Ignoring the terms not related to $\theta$: $\nabla\log P(\tau|\theta) = \sum_{t=1}^{T} \nabla\log p(a_t|s_t,\theta)$
  • 248. Policy Gradient • $\nabla\bar{R}_\theta \approx \frac{1}{N}\sum_{n=1}^{N} R(\tau^n)\nabla\log P(\tau^n|\theta) = \frac{1}{N}\sum_{n=1}^{N} R(\tau^n)\sum_{t=1}^{T_n}\nabla\log p(a_t^n|s_t^n,\theta) = \frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T_n} R(\tau^n)\nabla\log p(a_t^n|s_t^n,\theta)$, with update $\theta^{new} \leftarrow \theta^{old} + \eta\nabla\bar{R}_{\theta^{old}}$. • If in $\tau^n$ the machine takes $a_t^n$ when seeing $s_t^n$: when $R(\tau^n)$ is positive, tune $\theta$ to increase $p(a_t^n|s_t^n)$; when $R(\tau^n)$ is negative, tune $\theta$ to decrease $p(a_t^n|s_t^n)$. • It is very important to use the cumulative reward $R(\tau^n)$ of the whole trajectory $\tau^n$ instead of the immediate reward $r_t^n$. What if we replace $R(\tau^n)$ with $r_t^n$ ……
  • 249. Add a Baseline • $\nabla\bar{R}_\theta \approx \frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T_n} R(\tau^n)\nabla\log p(a_t^n|s_t^n,\theta)$, $\theta^{new} \leftarrow \theta^{old} + \eta\nabla\bar{R}_{\theta^{old}}$ • It is possible that $R(\tau^n)$ is always positive. In the ideal case that is fine, but because we sample, the probability of actions that are not sampled will decrease even if they are good. • Solution: subtract a baseline b: $\frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T_n}\left(R(\tau^n) - b\right)\nabla\log p(a_t^n|s_t^n,\theta)$
  • 250. Policy Gradient • Loop: given actor parameter $\theta$, data collection: play N episodes, recording pairs $(s_t^n, a_t^n)$ and rewards $R(\tau^n)$; model update: $\theta \leftarrow \theta + \eta\nabla\bar{R}_\theta$ with $\nabla\bar{R}_\theta = \frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T_n} R(\tau^n)\nabla\log p(a_t^n|s_t^n,\theta)$; then collect new data with the updated actor.
  • 251. Policy Gradient — Considered as a Classification Problem • Given state s, the network outputs probabilities $y_i$ for left/right/fire; with target “left” ($\hat{y} = (1, 0, 0)$), minimizing the cross-entropy $-\sum_{i=1}^{3}\hat{y}_i\log y_i$ is the same as maximizing $\log P(\text{“left”}|s)$, and the update is $\theta \leftarrow \theta + \eta\nabla\log P(\text{“left”}|s)$.
  • 252. Policy Gradient • $\theta \leftarrow \theta + \eta\nabla\bar{R}_\theta$, $\nabla\bar{R}_\theta = \frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T_n} R(\tau^n)\nabla\log p(a_t^n|s_t^n,\theta)$ • Given actor parameter $\theta$, each collected pair becomes a classification example: $(s_1^1, a_1^1 = \text{left})$ with target (1, 0, 0), $(s_1^2, a_1^2 = \text{fire})$ with target (0, 0, 1), and so on.
  • 253. Policy Gradient • Same update, but each training example $(s_t^n, a_t^n)$ is weighted by $R(\tau^n)$: an example from a trajectory with total reward 2 is counted twice, one from a trajectory with total reward 1 is counted once, etc.
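A sketch of this "weighted classification" view of policy gradient: each (state, action) pair is an ordinary cross-entropy example weighted by the total reward $R(\tau)$ of its trajectory (minus an optional baseline). `policy` is an assumed PyTorch network mapping states to action logits; all names are placeholders:

```python
import torch
import torch.nn.functional as F

def policy_gradient_loss(policy, states, actions, returns, baseline=0.0):
    logits = policy(states)                                         # (N, num_actions)
    log_probs = F.log_softmax(logits, dim=1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)   # log p(a_t | s_t)
    weights = returns - baseline                                    # R(tau^n) - b per step
    return -(weights * chosen).mean()    # minimizing this = gradient ascent on reward
```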
  • 255. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
  • 256. RNN Generation • Training: unsupervised learning without annotation — the characters of a poem (春 眠 不 覺 …) are predicted one by one, each conditioned on the previous characters, starting from the beginning of the sentence.
  • 257. RNN Generation • Testing: start from <BOS>, sample $c_1 \sim P(c_1)$ (e.g. 床), feed it back to get $P(c_2|c_1)$ and sample $c_2$ (前), then $c_3$ (明), $c_4$ (月), … — unsupervised learning without annotation.
  • 258. RNN Generation • Images are composed of pixels: generate one pixel at each time step with an RNN, treating the image as a sentence in which each pixel is a character (blue, red, pink, yellow, gray, …), sampling $c_t \sim P(c_t|c_1, \cdots, c_{t-1})$ starting from <BOS>.
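A sketch of this test-time sampling loop, assuming a PyTorch RNN `model(prev_token, hidden)` that returns (logits over the vocabulary, new hidden state); the model interface, vocabulary, and <BOS> index are placeholders:

```python
import torch

def sample_sequence(model, bos_index, length, hidden=None):
    token = torch.tensor([bos_index])
    output = []
    for _ in range(length):
        logits, hidden = model(token, hidden)
        probs = torch.softmax(logits.squeeze(0), dim=-1)
        token = torch.multinomial(probs, num_samples=1)   # sample, don't take argmax
        output.append(token.item())
    return output    # token ids: characters of a poem, or pixel "colors" of an image
```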
  • 259. Creating Pokémon (寶可夢鍊成) • Small images of 792 Pokémon • Can the machine learn to create new Pokémon? Don't catch them — create them! • Original images are 40 x 40, downsized to 20 x 20. • Source of images: http://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_by_base_stats_(Generation_VI)
  • 260. Real Pokémon Cover 50% Cover 75% It is difficult to evaluate generation. Never seen by machine!
  • 262. PixelRNN Real World Ref: Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu, Pixel Recurrent Neural Networks, arXiv preprint, 2016
  • 263. More than images …… Audio: Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu, WaveNet: A Generative Model for Raw Audio, arXiv preprint, 2016 Video: Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu, Video Pixel Networks , arXiv preprint, 2016
  • 264. Sentence Generation Generator Discriminator sentence x sentence x Real or fake Original GAN code z sampled from prior distribution
  • 265. Sentence Generation • Initialize generator G and discriminator D • In each iteration: • Sample real sentences x from the database • Generate sentences x̃ by G • Update D to increase D(x) and decrease D(x̃) • Update G such that its generated sentences obtain large scores from the discriminator.
  • 266. Sentence Generation • Can we do backpropagation through the generated sentence? NO! Tuning the generator a little bit will not change the sampled output tokens, so the discriminator score gives no gradient. In the paper of improved WGAN, the discriminator is fed the output distributions directly (ignoring the sampling process). [Figure: generator sampling tokens A/B step by step from <BOS>; discriminator outputs a scalar]
  • 268. Sentence Generation - SeqGAN • Using Reinforcement learning • Consider the discriminator as reward function • Consider the output of discriminator as total reward • Update generator to increase discriminator = to get maximum total reward Ref: Lantao Yu, Weinan Zhang, Jun Wang, Yong Yu, “SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient”, AAAI, 2017 Generator Discrimi nator scalar update Using Policy Gradient
  • 269. Chat-bot with GAN Discriminator Input sentence/history h response sentence x Real or fake human dialogues Chatbot En De Conditional GAN response sentence x Input sentence/history h Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter, Dan Jurafsky, “Adversarial Learning for Neural Dialogue Generation”, arXiv preprint, 2017
  • 270. Example Results (Thanks to 段逸林 for the experimental results)
  • 271. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
  • 272. Critic • A critic does not determine the action. • Given an actor π, it evaluates how good the actor is. • An actor can be found from a critic, e.g. Q-learning — but this is not suitable for continuous actions a. http://combiboilersleeds.com/picaso/critics/critics-4.html
  • 273. Actor-Critic • π interacts with the environment → learn a critic (by TD or MC) → update the actor from π to π′ based on the critic → set π = π′ and repeat. • Actor = Generator? Critic = Discriminator? https://arxiv.org/abs/1610.01945
  • 274. Visual Doom AI Competition @ CIG 2016 https://www.youtube.com/watch?v=94EPSjQH38Y
  • 276. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
  • 277. Imitation Learning • The actor interacts with the environment as usual, but the reward function is not available. • We have demonstrations of the expert: $\hat\tau_1, \hat\tau_2, \cdots, \hat\tau_N$, where each $\hat\tau$ is a trajectory of the expert. • Self driving: record human drivers. Robot: grab the arm of the robot.
  • 278. Motivation • It is hard to define the reward in some tasks. • Hand-crafted rewards can lead to uncontrolled behavior. The Three Laws of Robotics: 1. A robot may not injure a human being or, through inaction, allow a human being to come to harm. 2. A robot must obey orders given by human beings, except where such orders would conflict with the First Law. 3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law. “Therefore, to protect humanity as a whole, controlling human freedom is necessary.” — what impeccable logic……
  • 279. Inverse Reinforcement Learning Reward Function Environment Optimal Actor Inverse Reinforcement Learning ➢Using the reward function to find the optimal actor. ➢Modeling reward can be easier. Simple reward function can lead to complex policy. Reinforcement Learning Expert demonstration of the expert Ƹ𝜏1, Ƹ𝜏2, ⋯ , Ƹ𝜏 𝑁
  • 280. Inverse Reinforcement Learning • Principle: The teacher is always the best. • Basic idea: • Initialize an actor • In each iteration • The actor interacts with the environments to obtain some trajectories • Define a reward function, which makes the trajectories of the teacher better than the actor • The actor learns to maximize the reward based on the new reward function. • Output the reward function and the actor learned from the reward function
  • 281. Framework of IRL • Expert $\hat\pi$ provides demonstrations $\hat\tau_1, \hat\tau_2, \cdots, \hat\tau_N$; the actor π generates trajectories $\tau_1, \tau_2, \cdots, \tau_N$. • Obtain a reward function R such that $\sum_{n=1}^{N} R(\hat\tau_n) > \sum_{n=1}^{N} R(\tau_n)$, then find an actor that maximizes R by reinforcement learning, and iterate. • Reward function = Discriminator; Actor = Generator.
  • 282. GAN vs. IRL • GAN: D gives high scores to real examples and low scores to generated ones $\tau_1, \tau_2, \cdots, \tau_N$; find a G whose output obtains a large score from D. • IRL: the reward function gives larger reward to the expert trajectories $\hat\tau_1, \cdots, \hat\tau_N$ and lower reward to the actor's trajectories τ; find an actor that obtains large reward.
  • 283. Teaching Robot • In the past …… https://www.youtube.com/watch?v=DEGbtjTOIB0
  • 284. Robot http://rll.berkeley.edu/gcl/ Chelsea Finn, Sergey Levine, Pieter Abbeel, ” Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization”, ICML, 2016
  • 287. Third Person Imitation Learning • Ref: Bradly C. Stadie, Pieter Abbeel, Ilya Sutskever, “Third-Person Imitation Learning”, arXiv preprint, 2017 http://lasa.epfl.ch/research_new/ML/index.php https://kknews.cc/sports/q5kbb8.html http://sc.chinaz.com/Files/pic/icons/1913/%E6%9C%BA%E5%99%A8%E4%BA%BA%E5%9B %BE%E6%A0%87%E4%B8%8B%E8%BD%BD34.png Third PersonFirst Person
  • 289. Concluding Remarks Lecture 1: Brief Review of Deep Learning Lecture 2: Introduction of Generative Adversarial Network Lecture 3: Conditional Generation Lecture 4: Making Decision and Control
  • 290. To Learn More … • Machine Learning • Slides: http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML16.htm l • Video: https://www.youtube.com/watch?v=fegAeph9UaA&list= PLJV_el3uVTsPy9oCRY30oBPNLCo89yu49 • Machine Learning and Having it Deep and Structured • Slides: http://speech.ee.ntu.edu.tw/~tlkagk/courses_MLDS17.h tml • Video: https://www.youtube.com/watch?v=IzHoNwlCGnE&list= PLJV_el3uVTsPMxPbjeX7PicgWbY7F8wW9
  • 292. “Formosa Grand Challenge – Talk to AI” (科技大擂台 與AI對話) ➢ Eligibility: students up to graduate level at colleges and universities, and senior/vocational high school students (teams of one to four) ➢ Large cash prizes ➢ though not really related to these slides ➢ First prize of the warm-up round: NT$100,000 ➢ a chance to warm up before going for the big prizes ➢ the warm-up round is open to the general public
  • 293. Warm-up Preliminary Round – Questions • After listening to a segment of human dialogue, the machine predicts the next sentence. • Question download: • Sample questions (development set): 50 sample questions with answers, downloadable from the MOST website after 8/02 • Test questions (testing set): released 8/07 • Example — A: Po-Han, thank you. B: You're welcome. A: Sorry to take up your break time. (The preliminary round provides transcripts directly.) Options: (1) B: I really like break time. (2) B: We're close friends, no need to be so polite. …… This question format is used only in the warm-up round.
  • 294. Hands-on Tutorial Schedule • 13:50-14:00 Registration • 14:00-14:50 Word Embedding (hands-on) • 15:00-15:50 Recurrent Neural Network (RNN) (hands-on) • 16:00-16:50 Sequence-to-sequence Model (hands-on) • 17:00- Q&A • The project office will hold a pre-competition training session, with Prof. Hung-yi Lee's team from NTU invited to give instruction. Time: Wednesday, August 16, 2017, 2-5 pm. Location: Innovation and Entrepreneurship Promotion Office meeting room, 1F, No. 106, Sec. 2, Heping E. Rd., Taipei 106 (Technology Building). Open only to teams that have completed registration (registration deadline: August 15).