Keras / Sequential Model vs Functional Model

There are two ways to build a Keras model: the Sequential API and the functional API.

 

A Sequential model stacks layers one after another, and it is the structure most people picture when they think of a deep learning network. Because it is built strictly layer-by-layer, it cannot express architectures that share layers or that have multiple inputs or outputs.

 

A functional model is not limited to layers connected only to the layer immediately before and after; the structure can be defined far more freely, connecting any layer to any other layer. This makes it possible to build networks with complex topologies such as siamese networks and residual networks.
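
As a taste of that flexibility, here is a minimal sketch of a residual-style skip connection written with the functional API (the layer sizes are arbitrary, chosen only for illustration):

from keras.models import Model
from keras.layers import Input, Dense
from keras.layers.merge import add

inputs = Input(shape=(32,))
x = Dense(32, activation='relu')(inputs)
x = Dense(32)(x)
# skip connection: the block's input is added to its transformed output
outputs = add([inputs, x])
model = Model(inputs=inputs, outputs=outputs)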

 


Sequential Models

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(2, input_dim=1))
model.add(Dense(1))

Each layer is added one at a time to the Sequential object model. The network takes a 1-dimensional input, passes it through a hidden layer with 2 nodes, and produces a 1-dimensional output.

 

The Sequential model is adequate for building most deep learning models, but it has the following limitations (the sketch after this list shows what lifting them looks like):

1. It cannot take inputs from multiple sources.

2. It cannot produce multiple outputs to be consumed in different places.

3. It cannot reuse (share) layers.
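
With the functional API all three restrictions disappear. For example, a single layer instance can be applied to two different inputs, so both branches share the same weights. A minimal sketch (the sizes are arbitrary, chosen only for illustration):

from keras.models import Model
from keras.layers import Input, Dense
from keras.layers.merge import concatenate

shared = Dense(8, activation='relu')  # one layer instance, reused on both inputs
in_a = Input(shape=(4,))
in_b = Input(shape=(4,))
merged = concatenate([shared(in_a), shared(in_b)])
output = Dense(1, activation='sigmoid')(merged)
model = Model(inputs=[in_a, in_b], outputs=output)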

 


The following are examples of Sequential models.

 

Multilayer perceptron (MLP) for multi-class softmax classification

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD

# Generate dummy data
import numpy as np
x_train = np.random.random((1000, 20))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10)
x_test = np.random.random((100, 20))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)

model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, activation='relu', input_dim=20))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

model.fit(x_train, y_train,
          epochs=20,
          batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)

 

MLP for binary classification

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Generate dummy data
x_train = np.random.random((1000, 20))
y_train = np.random.randint(2, size=(1000, 1))
x_test = np.random.random((100, 20))
y_test = np.random.randint(2, size=(100, 1))

model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          epochs=20,
          batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)

 

VGG-like convnet

import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD

# Generate dummy data
x_train = np.random.random((100, 100, 100, 3))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
x_test = np.random.random((20, 100, 100, 3))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10)

model = Sequential()
# input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
# this applies 32 convolution filters of size 3x3 each.
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

model.fit(x_train, y_train, batch_size=32, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=32)

 

LSTM for sequence classification

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import LSTM

max_features = 1024

# Generate dummy data: integer token sequences and binary labels
x_train = np.random.randint(max_features, size=(1000, 20))
y_train = np.random.randint(2, size=(1000, 1))
x_test = np.random.randint(max_features, size=(100, 20))
y_test = np.random.randint(2, size=(100, 1))

model = Sequential()
model.add(Embedding(max_features, output_dim=256))
model.add(LSTM(128))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)

 

1D convolution for sequence classification

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Conv1D, GlobalAveragePooling1D, MaxPooling1D

seq_length = 64

# Generate dummy data: sequences of 100-dimensional feature vectors
x_train = np.random.random((1000, seq_length, 100))
y_train = np.random.randint(2, size=(1000, 1))
x_test = np.random.random((100, seq_length, 100))
y_test = np.random.randint(2, size=(100, 1))

model = Sequential()
model.add(Conv1D(64, 3, activation='relu', input_shape=(seq_length, 100)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)

 

Stacked LSTM for sequence classification

from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np

data_dim = 16
timesteps = 8
num_classes = 10

# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
               input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32))  # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# Generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, num_classes))

# Generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, num_classes))

model.fit(x_train, y_train,
          batch_size=64, epochs=5,
          validation_data=(x_val, y_val))

 

Stacked LSTM for sequence classification, rendered "stateful"

A stateful recurrent model is one whose internal states (memories), obtained after processing a batch of samples, are reused as the initial states for the samples of the next batch. This allows processing longer sequences while keeping computational complexity manageable.

from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np

data_dim = 16
timesteps = 8
num_classes = 10
batch_size = 32

# Expected input batch shape: (batch_size, timesteps, data_dim)
# Note that we have to provide the full batch_input_shape since the network is stateful.
# the sample of index i in batch k is the follow-up for the sample i in batch k-1.
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True,
               batch_input_shape=(batch_size, timesteps, data_dim)))
model.add(LSTM(32, return_sequences=True, stateful=True))
model.add(LSTM(32, stateful=True))
model.add(Dense(10, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# Generate dummy training data
x_train = np.random.random((batch_size * 10, timesteps, data_dim))
y_train = np.random.random((batch_size * 10, num_classes))

# Generate dummy validation data
x_val = np.random.random((batch_size * 3, timesteps, data_dim))
y_val = np.random.random((batch_size * 3, num_classes))

model.fit(x_train, y_train,
          batch_size=batch_size, epochs=5, shuffle=False,
          validation_data=(x_val, y_val))
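
One practical detail with stateful RNNs: Keras keeps the LSTM states across batches until they are cleared explicitly, so once the underlying sequences end (for example, before training on a new set of sequences) the states should be reset manually. A minimal sketch:

# Clear the cached LSTM states before feeding unrelated sequences
model.reset_states()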

 

 


Functional Models

from keras.models import Model
from keras.layers import Input
from keras.layers import Dense

visible = Input(shape=(2,))
hidden = Dense(2)(visible)
model = Model(inputs=visible, outputs=hidden)

With functional models, once all the layers have been created and connected, the final step is to define the full model. Just as with the Sequential approach, the model can then be summarized, fit, evaluated, and used for prediction. To create the model, you only need to pass the inputs and outputs to the Model class provided by Keras.
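
For instance, the small model defined above supports the same workflow as a Sequential model. A minimal sketch with dummy data (the loss and optimizer are arbitrary choices for illustration):

import numpy as np

model.compile(loss='mse', optimizer='sgd')
x = np.random.random((10, 2))
y = np.random.random((10, 2))  # the model maps 2-d inputs to 2-d outputs
model.fit(x, y, epochs=1)
model.summary()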

The Keras functional API provides a much more flexible way to define models.

 

In particular, it makes it possible not only to share layers, but also to define models with multiple inputs and outputs. Going further, it allows you to define ad hoc acyclic network graphs.

 

For reference, an ad-hoc network in the general networking sense is a network composed only of mobile hosts, with no fixed wired infrastructure. It is therefore well suited to situations where a wired network is difficult to set up, or where the network is only needed for a short time; hosts can move without restriction, and since no wired backbone or base stations are required, such networks can be formed quickly and at low cost.

Models are created by connecting layer instances together as input/output pairs, and the model itself is defined by passing the input and output layers as arguments to Model().

 

Multiple Input Model

When a model takes several inputs, they are passed to Model() as a list:

model = Model(inputs=[visible1, visible2], outputs=output)

The full example:
# Multiple Inputs
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.layers.merge import concatenate
# first input model
visible1 = Input(shape=(64,64,1))
conv11 = Conv2D(32, kernel_size=4, activation='relu')(visible1)
pool11 = MaxPooling2D(pool_size=(2, 2))(conv11)
conv12 = Conv2D(16, kernel_size=4, activation='relu')(pool11)
pool12 = MaxPooling2D(pool_size=(2, 2))(conv12)
flat1 = Flatten()(pool12)
# second input model
visible2 = Input(shape=(32,32,3))
conv21 = Conv2D(32, kernel_size=4, activation='relu')(visible2)
pool21 = MaxPooling2D(pool_size=(2, 2))(conv21)
conv22 = Conv2D(16, kernel_size=4, activation='relu')(pool21)
pool22 = MaxPooling2D(pool_size=(2, 2))(conv22)
flat2 = Flatten()(pool22)
# merge input models
merge = concatenate([flat1, flat2])
# interpretation model
hidden1 = Dense(10, activation='relu')(merge)
hidden2 = Dense(10, activation='relu')(hidden1)
output = Dense(1, activation='sigmoid')(hidden2)
model = Model(inputs=[visible1, visible2], outputs=output)
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='multiple_inputs.png')
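
Training such a model works just like before, except that the inputs are supplied as a list, in the same order as they were given to Model(). A minimal sketch with dummy data (the loss and optimizer are arbitrary choices for illustration):

import numpy as np

# Hypothetical dummy batches matching the two input shapes
X1 = np.random.random((8, 64, 64, 1))
X2 = np.random.random((8, 32, 32, 3))
y = np.random.randint(2, size=(8, 1))

model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit([X1, X2], y, epochs=1)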

 

Multiple Output Model

# Multiple Outputs
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers.recurrent import LSTM
from keras.layers.wrappers import TimeDistributed
# input layer
visible = Input(shape=(100,1))
# feature extraction
extract = LSTM(10, return_sequences=True)(visible)
# classification output
class11 = LSTM(10)(extract)
class12 = Dense(10, activation='relu')(class11)
output1 = Dense(1, activation='sigmoid')(class12)
# sequence output
output2 = TimeDistributed(Dense(1, activation='linear'))(extract)
# output
model = Model(inputs=visible, outputs=[output1, output2])
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='multiple_outputs.png')
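
Fitting a multi-output model mirrors this: one target array per output, passed as a list in the same order as the outputs given to Model(). A minimal sketch with dummy data (the losses and optimizer are arbitrary choices for illustration):

import numpy as np

# Hypothetical dummy data matching the input and the two outputs
X = np.random.random((8, 100, 1))
y1 = np.random.randint(2, size=(8, 1))  # classification target
y2 = np.random.random((8, 100, 1))      # per-timestep regression target

model.compile(loss=['binary_crossentropy', 'mse'], optimizer='adam')
model.fit(X, [y1, y2], epochs=1)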

 

 

 

 

