Performance Optimization 1: Optimization with Batch Normalization

Blog: Colab으로 하루에 하나씩 딥러닝 (One Deep Learning Topic a Day with Colab)

Elleik 2023. 1. 21. 01:12

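Batch Normalization

Stabilizes the distribution of each layer's inputs, which can speed up training
Helps mitigate vanishing and exploding gradients by reducing internal covariate shift
- Vanishing gradient: while the error is backpropagated, the gradient shrinks rapidly toward 0 and learning stalls
- Exploding gradient: the gradient grows rapidly during training
- Remedy: apply a standardization-like step to each mini-batch so that the values keep a mean of 0 and a standard deviation of 1 (this rescales a spread-out distribution; it does not by itself make it Gaussian)
Limitations of batch normalization
- With a small batch size, the batch statistics are noisy, so the normalized values can push training in a different direction from the original values
- For RNNs, normalization has to be applied separately at each layer of the network, which makes the model more complex and less efficient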
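
The standardization step described above can be sketched in NumPy. This is a minimal illustration of the training-time computation only, assuming the usual formulation with a learnable scale `gamma` and shift `beta`; the function name `batch_norm_train` and the random sample data are made up for this example and do not come from the book.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, epsilon=1e-3):
    """Normalize a mini-batch x of shape (batch, features), feature by feature."""
    batch_mean = x.mean(axis=0)                  # per-feature mean of this mini-batch
    batch_var = x.var(axis=0)                    # per-feature variance of this mini-batch
    x_hat = (x - batch_mean) / np.sqrt(batch_var + epsilon)
    return gamma * x_hat + beta                  # learnable scale and shift

# Hypothetical mini-batch: 40 samples, 4 features, far from mean 0 / std 1.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(40, 4))

out = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0))   # close to 0 for every feature
print(out.std(axis=0))    # close to 1 for every feature
```

With `gamma` fixed at 1 and `beta` at 0 the output is simply the standardized batch; during real training both are learned, so the layer can move away from mean 0 and standard deviation 1 if that helps the loss.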

Batch Normalization Hands-On

### Import libraries and load the dataset

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()

### Store the dataset in a DataFrame

df = pd.DataFrame(iris.data, columns=iris.feature_names)
df = df.astype(float)
df['label'] = iris.target
df['label'] = df.label.replace(dict(enumerate(iris.target_names)))

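### Apply one-hot encoding

# dtype=float keeps the indicator columns numeric (recent pandas versions default to bool)
label = pd.get_dummies(df['label'], prefix='label', dtype=float)
df = pd.concat([df, label], axis=1)
df.drop(['label'], axis=1, inplace=True)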
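
For readers who want to see what one-hot encoding does to the labels, here is a minimal NumPy sketch. It assumes integer class ids such as those in `iris.target`; the four sample labels below are made up for illustration.

```python
import numpy as np

# Hypothetical integer labels (0 = setosa, 1 = versicolor, 2 = virginica).
class_ids = np.array([0, 2, 1, 0])
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix with the labels encodes all of them at once.
one_hot = np.eye(num_classes, dtype=float)[class_ids]
print(one_hot)
```

Each row contains a single 1.0 in the column of its class, which is the target format that `categorical_crossentropy` expects.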

### Select the features and labels

X = df[['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']]
X = np.asarray(X)
y = df[['label_setosa', 'label_versicolor', 'label_virginica']]
y = np.asarray(y)

### Split the dataset

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.20
)  # split into training and test data at 8:2

### Build a model without batch normalization

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization

model1 = Sequential([
    Dense(64, input_shape=(4,), activation='relu'),
    Dense(128, activation='relu'),
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
    Dense(64, activation='relu'),
    Dense(3, activation='softmax')
])

model1.summary()

### Train the model

model1.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
) 

history1 = model1.fit(
    X_train,
    y_train,
    epochs=1000,
    validation_split=0.25,
    batch_size=40,
    verbose=2
)
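
The arithmetic behind these `fit()` settings is worth a quick check, because it shows how small the batches become. The sketch below uses only numbers that appear in this post (150 Iris samples, `test_size=0.20`, `validation_split=0.25`, `batch_size=40`); it relies on Keras holding out the last fraction of the training arrays for validation.

```python
import math

n_samples = 150            # size of the Iris dataset
test_size = 0.20
validation_split = 0.25
batch_size = 40

n_test = round(n_samples * test_size)                  # 20% held out for testing
n_train_full = n_samples - n_test                      # passed to fit() as X_train
n_fit = int(n_train_full * (1 - validation_split))     # samples actually trained on
n_val = n_train_full - n_fit                           # last 25% used for validation

steps_per_epoch = math.ceil(n_fit / batch_size)
last_batch = n_fit - (steps_per_epoch - 1) * batch_size

print(n_train_full, n_fit, n_val)       # 120 90 30
print(steps_per_epoch, last_batch)      # 3 10
```

So each epoch runs three weight updates, and the last batch holds only 10 samples. This is the small-batch situation mentioned earlier, where batch statistics become noisy.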

### Visualize the training results

# Note 1
%matplotlib inline
import matplotlib.pyplot as plt
fig, loss_ax = plt.subplots()
acc_ax = loss_ax.twinx()
loss_ax.plot(history1.history['loss'], 'y', label='train loss')  # Note 2
loss_ax.plot(history1.history['val_loss'],'r', label='val loss')
acc_ax.plot(history1.history['accuracy'], 'b', label='train acc')
acc_ax.plot(history1.history['val_accuracy'], 'g', label='val acc')

loss_ax.set_xlabel('epoch')
loss_ax.set_ylabel('loss')
acc_ax.set_ylabel('accuracy')

loss_ax.legend(loc='lower right')
acc_ax.legend(loc='upper right')
plt.show()

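Detailed notes

Note 1:
- Renders plots inline in the notebook, so the figure appears directly in the web browser
Note 2:
- fit() trains the model and returns a History object; its `history` attribute holds the per-epoch values below
  - loss: training loss
  - accuracy: training accuracy
  - val_loss: validation loss
  - val_accuracy: validation accuracy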

### Report loss and accuracy

loss_and_metrics = model1.evaluate(X_test, y_test)  # Note 1
print('Test loss and accuracy')
print(loss_and_metrics)

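Detailed notes

Note 1:
- Validation loss (val loss) would normally decrease as training proceeds, but in the plot above it keeps rising over the epochs
- Training accuracy (train accuracy) stays near 100% and training loss (train loss) stays near 0
- High accuracy on the training data but low accuracy on held-out data is the overfitting pattern → the next model applies batch normalization to address it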
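
One way to put a number on the pattern described above is to find the epoch where validation loss was lowest and compare the final training and validation losses. The sketch below is only illustrative: the helper `best_epoch` is hypothetical, and the `history` values are made up to stand in for `history1.history`; they are not results from this post.

```python
def best_epoch(val_loss):
    """Return (epoch_index, loss) for the epoch with the lowest validation loss."""
    idx = min(range(len(val_loss)), key=val_loss.__getitem__)
    return idx, val_loss[idx]

# Made-up per-epoch values shaped like history1.history (not real results).
history = {
    'loss':     [1.10, 0.60, 0.30, 0.10, 0.05, 0.02],
    'val_loss': [1.05, 0.70, 0.45, 0.50, 0.65, 0.90],
}

epoch, loss = best_epoch(history['val_loss'])
gap = history['val_loss'][-1] - history['loss'][-1]   # final generalization gap
print(f"lowest val loss {loss:.2f} at epoch {epoch}; final gap {gap:.2f}")
```

A validation loss that bottoms out early and a gap that keeps widening afterwards is the numeric counterpart of the diverging curves in the plot.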

### Model with batch normalization

from tensorflow.keras.initializers import RandomNormal, Constant
model2 = Sequential([
    Dense(64, input_shape=(4,), activation="relu"),
    BatchNormalization(),

    Dense(128, activation='relu'),
    BatchNormalization(),
    Dense(128, activation='relu'),
    BatchNormalization(),
    Dense(64, activation='relu'),
    BatchNormalization(),
    Dense(64, activation='relu'),
    BatchNormalization(
        momentum=0.95, 
        epsilon=0.005,
        beta_initializer=RandomNormal(mean=0.0, stddev=0.05), 
        gamma_initializer=Constant(value=0.9)
    ),  # Note 1
    Dense(3, activation='softmax')
])
model2.summary()

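Detailed notes

Note 1:
- momentum: momentum of the moving averages of the mini-batch mean and variance; these moving averages stand in for the statistics of the whole training set at inference time
- epsilon: a small float added to the variance so that the normalization never divides by zero
- beta_initializer: initial value of the beta (shift) weight
- gamma_initializer: initial value of the gamma (scale) weight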
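
How `momentum` and `epsilon` act can be sketched in NumPy. The update rule below follows the moving-average formula documented for the Keras `BatchNormalization` layer; the helper names, the synthetic batches, and the fixed `beta=0.0` are assumptions made for this illustration (in `model2`, beta starts from a random normal value).

```python
import numpy as np

def update_moving(moving, batch_stat, momentum):
    """Exponential moving average of a batch statistic."""
    return moving * momentum + batch_stat * (1.0 - momentum)

def batch_norm_inference(x, moving_mean, moving_var, gamma, beta, epsilon):
    """At inference time the moving statistics replace the batch statistics."""
    return gamma * (x - moving_mean) / np.sqrt(moving_var + epsilon) + beta

momentum, epsilon = 0.95, 0.005                       # values used in model2
moving_mean, moving_var = np.zeros(1), np.ones(1)     # Keras default initial values

rng = np.random.default_rng(0)
for _ in range(200):                                  # 200 synthetic mini-batches
    batch = rng.normal(loc=2.0, scale=0.5, size=(40, 1))
    moving_mean = update_moving(moving_mean, batch.mean(axis=0), momentum)
    moving_var = update_moving(moving_var, batch.var(axis=0), momentum)

print(moving_mean, moving_var)    # drift toward the data's mean 2.0 and variance 0.25

out = batch_norm_inference(np.array([[2.0]]), moving_mean, moving_var,
                           gamma=0.9, beta=0.0, epsilon=epsilon)
print(out)                        # an input near the mean maps to a value near beta
```

A momentum closer to 1 makes the moving statistics change more slowly, which smooths out noisy small batches but takes more steps to converge.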

### Train the model

model2.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

history2 = model2.fit(
    X_train,
    y_train,
    epochs=1000,
    validation_split=0.25,
    batch_size=40,
    verbose=2
)

### Visualize the training results

%matplotlib inline
import matplotlib.pyplot as plt

fig, loss_ax = plt.subplots()

acc_ax = loss_ax.twinx()

loss_ax.plot(history2.history['loss'], 'y', label='train loss')
loss_ax.plot(history2.history['val_loss'], 'r', label='val loss')

acc_ax.plot(history2.history['accuracy'], 'b', label='train acc')
acc_ax.plot(history2.history['val_accuracy'], 'g', label='val acc')

loss_ax.set_xlabel('epoch')
loss_ax.set_ylabel('loss')
acc_ax.set_ylabel('accuracy')

loss_ax.legend(loc='lower right')
acc_ax.legend(loc='upper right')
plt.show()

### Evaluate the model

loss_and_metrics = model2.evaluate(X_test, y_test)  # Note 1
print('Test loss and accuracy')
print(loss_and_metrics)

Detailed notes

Note 1:
- Validation accuracy (val acc) improves as training proceeds
- Validation loss (val loss) also ends lower than where it started

Reference: 서지영, 『딥러닝 텐서플로 교과서』, 길벗 (2022)
