Deep Learning (DL) Practice: ANN Deep Learning with TensorFlow's Keras

신강희 2024. 4. 30. 14:24

< Neural Networks and Deep Learning >

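# An ANN that predicts whether a customer renews a financial product

# Churn_Modelling.csv contains customer information together with a flag showing whether each customer renewed the financial product.

# Use this data to build a deep learning model that predicts renewal.

# The exercise is run in Google Colab.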

# Importing the libraries

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

import seaborn as sb

# To load the data, use one of the methods explained in the previous post: put the CSV file in your Google Drive and mount the drive in Colab.

from google.colab import drive

drive.mount('/content/drive')

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/ML2/data/Churn_Modelling.csv')

# Check for NaN values

df.isna().sum()

RowNumber          0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

# Split into X and y

# Check the columns

df.head(2)

# y is the Exited column; X is the columns from 'CreditScore' through 'EstimatedSalary'

y = df['Exited']

X = df.loc[:, 'CreditScore':'EstimatedSalary'].copy()  # copy so later column assignments change X itself, not a slice of df

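# The data contains string columns, so they must be encoded.

# Colab can also generate code with its built-in AI assistant!

# Press the Generate button and describe the code you want in the prompt box at the bottom; the code is then generated automatically.

# However, the result is not guaranteed to be correct, so you need to know the right commands yourself and be able to review the generated code.

# prompt: How many distinct values does the Geography column contain?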

X['Geography'].value_counts()

Geography
France     5014
Germany    2509
Spain      2477
Name: count, dtype: int64

X['Geography'].nunique()

X['Gender'].nunique()

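# Two columns need encoding. When one needs label encoding and the other one-hot encoding, do the label encoding first (the one-hot step below returns a NumPy array, after which 'Gender' can no longer be selected by name).

# prompt: Label-encode the Gender column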

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()

X['Gender'] = le.fit_transform(X['Gender'])

X.head(2)
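
# What label encoding does can be sketched in plain Python. This is only an illustration, not scikit-learn's implementation; the sample list and the helper name fit_label_mapping are hypothetical. Like LabelEncoder, it assigns integer codes in sorted order of the labels.

```python
def fit_label_mapping(values):
    """Map each distinct label to an integer, in sorted order of the labels."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}


genders = ["Female", "Male", "Female", "Male", "Male"]  # hypothetical sample values
mapping = fit_label_mapping(genders)
encoded = [mapping[g] for g in genders]

print(mapping)  # {'Female': 0, 'Male': 1}
print(encoded)  # [0, 1, 0, 1, 1]
```

# With only two categories, a single 0/1 column is enough, which is why Gender gets label encoding rather than one-hot encoding.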

# One-hot encoding

sorted(df['Geography'].unique())

['France', 'Germany', 'Spain']

df['Geography']

0        France
1         Spain
2        France
3        France
4         Spain
         ...   
9995     France
9996     France
9997     France
9998    Germany
9999     France
Name: Geography, Length: 10000, dtype: object

# prompt: One-hot encode the X['Geography'] column with ColumnTransformer

from sklearn.compose import ColumnTransformer

from sklearn.preprocessing import OneHotEncoder

ct = ColumnTransformer(
    [('encoder', OneHotEncoder(), [1])],  # index of the column to encode (Geography)
    remainder='passthrough'  # leave the other columns unchanged
)

X = ct.fit_transform(X)

array([[1.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 1.0000000e+00,
        1.0000000e+00, 1.0134888e+05],
       [0.0000000e+00, 0.0000000e+00, 1.0000000e+00, ..., 0.0000000e+00,
        1.0000000e+00, 1.1254258e+05],
       [1.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 1.0000000e+00,
        0.0000000e+00, 1.1393157e+05],
       ...,
       [1.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 0.0000000e+00,
        1.0000000e+00, 4.2085580e+04],
       [0.0000000e+00, 1.0000000e+00, 0.0000000e+00, ..., 1.0000000e+00,
        0.0000000e+00, 9.2888520e+04],
       [1.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 1.0000000e+00,
        0.0000000e+00, 3.8190780e+04]])

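# Avoid the dummy variable trap => drop the leftmost one-hot column to reduce the number of columns.

# In deep learning, every extra column adds noticeably to the computation,

# and the leftmost column of the one-hot result can be dropped

# without losing any of the information in the data.

# In this example, one-hot encoding produces three columns (France, Germany, Spain):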

# 1 0 0

# 0 1 0

# 0 0 1

# Even if the leftmost France column is dropped,

# 0 0 => France

# 1 0 => Germany

# 0 1 => Spain

# A column has to be dropped, so convert to pandas, which is easier to edit (NumPy is more machine-oriented).

# Use drop with axis=1 to remove column 0.

X = pd.DataFrame(X).drop(0, axis=1).values

array([[0.0000000e+00, 0.0000000e+00, 6.1900000e+02, ..., 1.0000000e+00,
        1.0000000e+00, 1.0134888e+05],
       [0.0000000e+00, 1.0000000e+00, 6.0800000e+02, ..., 0.0000000e+00,
        1.0000000e+00, 1.1254258e+05],
       [0.0000000e+00, 0.0000000e+00, 5.0200000e+02, ..., 1.0000000e+00,
        0.0000000e+00, 1.1393157e+05],
       ...,
       [0.0000000e+00, 0.0000000e+00, 7.0900000e+02, ..., 0.0000000e+00,
        1.0000000e+00, 4.2085580e+04],
       [1.0000000e+00, 0.0000000e+00, 7.7200000e+02, ..., 1.0000000e+00,
        0.0000000e+00, 9.2888520e+04],
       [0.0000000e+00, 0.0000000e+00, 7.9200000e+02, ..., 1.0000000e+00,
        0.0000000e+00, 3.8190780e+04]])
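
# The column drop above can be sketched in plain Python to show that two columns still identify three countries. This is a minimal illustration with a hypothetical helper name (one_hot_drop_first) and made-up rows. scikit-learn's OneHotEncoder also accepts drop='first', which should give the same layout without the manual drop.

```python
def one_hot_drop_first(values):
    """One-hot encode `values`, omitting the column of the first (sorted) category."""
    categories = sorted(set(values))  # ['France', 'Germany', 'Spain']
    kept = categories[1:]             # drop the leftmost column (France)
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


kept, rows = one_hot_drop_first(["France", "Spain", "Germany", "France"])

print(kept)  # ['Germany', 'Spain']
print(rows)  # [[0, 0], [0, 1], [1, 0], [0, 0]]
```

# A row of all zeros can only mean the dropped category, so nothing is lost.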

# Now apply feature scaling to the processed data so it is ready for training and prediction

# prompt: Feature-scale X with StandardScaler

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()

X = sc.fit_transform(X)

array([[-0.57873591, -0.57380915, -0.32622142, ...,  0.64609167,
         0.97024255,  0.02188649],
       [-0.57873591,  1.74273971, -0.44003595, ..., -1.54776799,
         0.97024255,  0.21653375],
       [-0.57873591, -0.57380915, -1.53679418, ...,  0.64609167,
        -1.03067011,  0.2406869 ],
       ...,
       [-0.57873591, -0.57380915,  0.60498839, ..., -1.54776799,
         0.97024255, -1.00864308],
       [ 1.72790383, -0.57380915,  1.25683526, ...,  0.64609167,
        -1.03067011, -0.12523071],
       [-0.57873591, -0.57380915,  1.46377078, ...,  0.64609167,
        -1.03067011, -1.07636976]])
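
# As a worked example of what standard scaling computes, each column is shifted by its mean and divided by its standard deviation: z = (x - mean) / std. The sketch below uses made-up credit scores and the population standard deviation, which is what StandardScaler uses; the helper name standardize is hypothetical.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Return z-scores of `column` using the population standard deviation."""
    mu = fmean(column)
    sigma = pstdev(column)
    return [(x - mu) / sigma for x in column]


scores = [600.0, 650.0, 700.0, 750.0, 800.0]  # hypothetical credit scores
scaled = standardize(scores)

print([round(z, 4) for z in scaled])  # [-1.4142, -0.7071, 0.0, 0.7071, 1.4142]
```

# After scaling, the column has mean 0 and standard deviation 1, so features with large raw values such as Balance no longer dominate the small ones.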

0       1
1       0
2       1
3       0
4       0
       ..
9995    0
9996    0
9997    1
9998    1
9999    0
Name: Exited, Length: 10000, dtype: int64

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Data preprocessing for training and prediction is now complete!

# Next, build the model, train it, and make predictions.

# Part 2 - Now let's make the ANN!

import tensorflow as tf

from tensorflow import keras

from keras.models import Sequential

from keras.layers import Dense

# Check the shape of the data once more before training

X.shape

(10000, 11)

X[0, ]

array([-0.57873591, -0.57380915, -0.32622142, -1.09598752,  0.29351742,
       -1.04175968, -1.22584767, -0.91158349,  0.64609167,  0.97024255,
        0.02188649])

# Create an empty model

model = Sequential()

# Add the first hidden layer (8 units); input_shape=(11,) declares the 11 input features

model.add( Dense(units =8, activation='relu', input_shape= (11,) ) )

/usr/local/lib/python3.10/dist-packages/keras/src/layers/core/dense.py:86: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)

# Hidden layer (for hidden-layer activations, ReLU or Leaky ReLU is recommended)

model.add( Dense(6,'relu'))

# Output layer

model.add( Dense(1,'sigmoid'))

model.summary()

Model: "sequential"
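
# The number of trainable parameters in this network can be checked by hand: a Dense layer has (inputs x units) weights plus one bias per unit. The sketch below applies that to the 11 -> 8 -> 6 -> 1 layout built above; the helper name dense_params is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


layer_sizes = [11, 8, 6, 1]  # input features, hidden 1, hidden 2, output
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [96, 54, 7]
print(sum(per_layer))  # 157
```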

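# Once the model is defined, it must be compiled.

# Compiling means setting:

# 1. the optimizer

# 2. the loss function (also called the error function)

# 3. the evaluation metric(s)

# For a two-class classification problem, set the loss to 'binary_crossentropy'.

# Regression models are evaluated with metrics such as MSE, while classification models are evaluated with a confusion matrix. These terms are worth knowing.

# Of the metrics recall, precision, and accuracy, accuracy is the most commonly used.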

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'] )
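
# As a worked example of the loss chosen above, binary cross-entropy averages -(y*log(p) + (1-y)*log(1-p)) over the samples. The sketch below is plain Python with made-up labels and probabilities; the clipping constant eps is an assumption added for numerical safety, and this is not Keras's own implementation.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy for labels in {0, 1} and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


y_true = [1, 0, 1, 0]          # hypothetical labels
y_prob = [0.9, 0.1, 0.6, 0.4]  # hypothetical predicted probabilities
loss = binary_crossentropy(y_true, y_prob)

print(round(loss, 4))  # 0.3081
```

# Confident correct predictions (0.9 for a true 1) contribute little to the loss, while hesitant ones (0.6) contribute more.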

# After compiling, train the model.

model.fit(X_train, y_train, batch_size=10, epochs=20)

Epoch 1/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.7925 - loss: 0.5189
Epoch 2/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 4s 5ms/step - accuracy: 0.8145 - loss: 0.4285
Epoch 3/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.8126 - loss: 0.4292
Epoch 4/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 4s 3ms/step - accuracy: 0.8322 - loss: 0.4015
Epoch 5/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.8432 - loss: 0.3831
Epoch 6/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.8404 - loss: 0.3815
Epoch 7/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8482 - loss: 0.3571
Epoch 8/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.8503 - loss: 0.3632
Epoch 9/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 2s 1ms/step - accuracy: 0.8526 - loss: 0.3515
Epoch 10/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8624 - loss: 0.3375
Epoch 11/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8500 - loss: 0.3588
Epoch 12/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8620 - loss: 0.3410
Epoch 13/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8558 - loss: 0.3422
Epoch 14/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8644 - loss: 0.3317
Epoch 15/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8589 - loss: 0.3455
Epoch 16/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8587 - loss: 0.3477
Epoch 17/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.8664 - loss: 0.3338
Epoch 18/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8603 - loss: 0.3430
Epoch 19/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 2s 1ms/step - accuracy: 0.8622 - loss: 0.3349
Epoch 20/20
800/800 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8635 - loss: 0.3352

<keras.src.callbacks.history.History at 0x78163b9ea0e0>

# After training, evaluate the model on X_test.

# evaluate returns both the loss and the accuracy.

model.evaluate(X_test, y_test)

63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8576 - loss: 0.3323

[0.3400079309940338, 0.859000027179718]

# Check with a confusion matrix!

from sklearn.metrics import confusion_matrix, accuracy_score

y_pred = model.predict(X_test)

63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step

y_pred

array([[0.25733587],
       [0.33500573],
       [0.13444375],
       ...,
       [0.14407456],
       [0.15712005],
       [0.14735286]], dtype=float32)

y_test

9394    0
898     1
2398    0
5906    0
2343    0
       ..
1037    0
2899    0
9549    0
2740    0
6690    0
Name: Exited, Length: 2000, dtype: int64

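# y_pred is 2-D, while y_test is 1-D.

# Calling confusion_matrix in this state raises an error.

# confusion_matrix needs class labels made up of 0s and 1s.

# So the predictions (y_pred) must be converted to 0s and 1s.

# The output layer above uses the 'sigmoid' function, so each prediction is a probability between 0 and 1. With 0.5 as the threshold, any value greater than 0.5 is treated as True (1).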

y_pred = (y_pred >0.5).astype(int)

cm = confusion_matrix(y_test, y_pred)

cm

array([[1525,   70],
       [ 212,  193]])

# Accuracy = (true negatives + true positives) / total

(1525+193) / cm.sum()

0.859

accuracy_score(y_test, y_pred)

0.859
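
# The other metrics mentioned earlier (precision and recall) can be read off the same confusion matrix. The sketch below copies the four counts printed above and assumes scikit-learn's layout, where rows are the true labels and columns are the predicted labels.

```python
# Confusion matrix printed above: rows = true label, columns = predicted label
cm_values = [[1525, 70],
             [212, 193]]

(tn, fp), (fn, tp) = cm_values
total = tn + fp + fn + tp

accuracy = (tn + tp) / total
precision = tp / (tp + fp)  # of the customers predicted as 1, how many really are 1
recall = tp / (tp + fn)     # of the customers that really are 1, how many were found

print(total)                # 2000
print(round(accuracy, 3))   # 0.859
print(round(precision, 3))  # 0.734
print(round(recall, 3))     # 0.477
```

# The accuracy looks high, but the recall shows that fewer than half of the class-1 customers are detected, which is worth keeping in mind on an imbalanced dataset like this one.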

# Instead of calling accuracy_score separately, model.evaluate reports the loss and the accuracy together.

model.evaluate(X_test, y_test)

63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8576 - loss: 0.3323

[0.3400079309940338, 0.859000027179718]

model.layers

# Weights connecting the input layer to the first hidden layer (the coefficients of the equations)

# Shape 11 x 8 (11 rows, 8 columns)

model.layers[0].get_weights()[0]

array([[ 0.15510868,  0.48057753,  0.12048828, -0.11949058,  0.21905184,
        -0.21320768, -0.36722365,  0.20958157],
       [ 0.03201314,  0.0164311 ,  0.25372922,  0.18349856,  0.4126134 ,
        -0.19020617, -0.46901786,  0.12217822],
       [-0.01604781,  0.0390755 ,  0.07829322,  0.15240642, -0.062962  ,
         0.33552325, -0.04676249,  0.14250895],
       [-0.01489559,  0.10695752, -0.06455339,  0.07174824, -0.37085876,
        -0.3773975 ,  0.2848377 ,  0.25782293],
       [ 0.24178272,  0.16687939,  0.31543192,  0.7522099 , -0.62823325,
        -0.33388972, -0.6350836 , -0.44254327],
       [-0.05174403,  0.02457642, -0.02072264,  0.09515063,  0.07687005,
        -0.18021777,  0.03759337, -0.05291931],
       [-0.30960017, -0.79405934,  0.0745323 , -0.46475443,  0.107397  ,
         0.13863975, -0.42826137, -0.26842293],
       [-1.452625  , -0.8741043 ,  0.9447404 , -0.28960004, -0.3195589 ,
        -0.35342467, -0.35981154,  0.04665589],
       [ 0.09442655, -0.1937058 ,  0.11555907,  0.24578622,  0.11271417,
         0.110532  , -0.30035266,  0.12412481],
       [-0.5818811 ,  0.00982098, -0.52750164,  0.4784998 , -0.52636874,
        -0.5272814 ,  0.21656933, -0.49670872],
       [ 0.05266256,  0.0443663 ,  0.07713059, -0.18322042,  0.04967958,
        -0.6340016 , -0.21295096,  0.18633987]], dtype=float32)

# Biases of the first hidden layer (the constant terms)

model.layers[0].get_weights()[1]

array([ 0.06635607, -0.02907321, -0.16964573,  0.33312395,  0.23244065,
        0.4578039 ,  0.52377367,  1.0142717 ], dtype=float32)

# Weights connecting the first hidden layer to the second hidden layer (shape 8 x 6)

model.layers[1].get_weights()[0]

array([[ 0.2001995 , -0.6377279 ,  0.6321338 , -0.63409925, -0.13164514,
        -0.231192  ],
       [-0.6715094 , -0.0855982 ,  0.57218874,  0.32056037, -0.33826792,
        -0.80573475],
       [ 0.78518236, -0.27362934,  0.76037705,  0.10874667, -0.3069774 ,
         0.38885775],
       [-0.37846598,  0.6472185 , -0.04943345,  0.5838219 , -0.44037664,
         0.21265864],
       [-0.46858358,  0.15887392, -0.4912556 , -0.22084698, -0.16053215,
        -0.6083974 ],
       [-0.5674717 , -0.45886916,  0.25360072, -0.09564873,  0.37213144,
        -0.02956283],
       [ 0.1892614 , -0.42350322, -0.08122479,  0.07863261,  0.6747974 ,
        -0.02157269],
       [ 0.14546366,  0.39898443,  0.12616812,  0.63126457,  0.6642552 ,
        -0.34389728]], dtype=float32)

# Weights and bias connecting the second hidden layer to the output layer

model.layers[2].get_weights()

[array([[ 1.3704257 ],
        [-1.0855359 ],
        [ 1.0281827 ],
        [-0.56768525],
        [-1.1445732 ],
        [ 0.6428316 ]], dtype=float32),
 array([-0.2651697], dtype=float32)]
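
# How these weight matrices and bias vectors turn an input row into a prediction can be sketched with a manual forward pass: each Dense layer computes activation(x . W + b). The tiny network below (2 inputs, 2 hidden units, 1 output) and all of its numbers are hypothetical, chosen only to keep the arithmetic readable; the weight layout (inputs x units) follows what get_weights()[0] returns.

```python
import math


def dense(inputs, weights, biases):
    """x . W + b, where weights has shape (n_inputs, n_units)."""
    return [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
        for j in range(len(biases))
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


x = [1.0, 2.0]                     # hypothetical scaled input row
W1 = [[0.5, -1.0], [0.25, 0.5]]    # hypothetical hidden-layer weights (2 x 2)
b1 = [0.0, 0.5]                    # hypothetical hidden-layer biases
W2 = [[1.0], [-2.0]]               # hypothetical output-layer weights (2 x 1)
b2 = [0.5]                         # hypothetical output-layer bias

hidden = relu(dense(x, W1, b1))
prob = sigmoid(dense(hidden, W2, b2)[0])
label = int(prob > 0.5)

print(hidden)          # [1.0, 0.5]
print(round(prob, 4))  # 0.6225
print(label)           # 1
```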

< Classify the following new data >

Geography: France
Credit Score: 600
Gender: Male
Age: 40
Tenure: 3
Balance: 60000
Number of Products: 2
Has Credit Card: Yes
Is Active Member: Yes
Estimated Salary: 50000

# (1) Predict by creating the new data as a 2-D list [[]]

new_data = [[600,'France','Male',40,3,60000,2,1,1,50000]]

new_data

[[600, 'France', 'Male', 40, 3, 60000, 2, 1, 1, 50000]]

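# The data contains strings, so apply the same steps as for the original file: label encoding, then one-hot encoding, then feature scaling.

# Working in pandas is more readable and more convenient, especially when strings and numbers are mixed in the data.

# So when new data arrives, put it in a list, convert it to a NumPy array, reshape it to 2-D, and then work on it in pandas! (Here the nested list is already 2-D, so it goes straight into a DataFrame.)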

df_new_data = pd.DataFrame(new_data)

df_new_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 10 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   0       1 non-null      int64 
 1   1       1 non-null      object
 2   2       1 non-null      object
 3   3       1 non-null      int64 
 4   4       1 non-null      int64 
 5   5       1 non-null      int64 
 6   6       1 non-null      int64 
 7   7       1 non-null      int64 
 8   8       1 non-null      int64 
 9   9       1 non-null      int64 
dtypes: int64(8), object(2)
memory usage: 208.0+ bytes

df_new_data

# Match the column names

# Copy only the column names that are needed

df.columns

Index(['RowNumber', 'CustomerId', 'Surname', 'CreditScore', 'Geography',
       'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard',
       'IsActiveMember', 'EstimatedSalary', 'Exited'],
      dtype='object')

df_new_data.columns = ['CreditScore','Geography',

'Gender','Age','Tenure','Balance','NumOfProducts','HasCrCard',

'IsActiveMember','EstimatedSalary']

# Now apply encoding and feature scaling

df_new_data

# Label encoding

df_new_data['Gender'] = le.transform(df_new_data['Gender'])

# One-hot encoding

df_new_data = ct.transform(df_new_data)

new_data = pd.DataFrame(df_new_data).drop(0, axis =1).values

# Feature scaling

new_data = sc.transform(new_data)

# Run the prediction

y_pred = model.predict(new_data)

(y_pred >0.5).astype(int)

array([[0]])

## (2) Predict by creating the new data as a dictionary

new_data2 = [{'CreditScore':600,'Geography':'France','Gender':'Male','Age':40,'Tenure':3,'Balance':60000,'NumOfProducts':2,'HasCrCard':1,'IsActiveMember':1,'EstimatedSalary':50000}]

df_new_data2 = pd.DataFrame(new_data2)

df_new_data2['Gender'] = le.transform(df_new_data2['Gender'])

df_new_data2 = ct.transform(df_new_data2)

new_data2 = pd.DataFrame(df_new_data2).drop(0, axis =1).values

new_data2 = sc.transform(new_data2)

new_data2

array([[-0.57873591, -0.57380915, -0.52281016,  0.91241915,  0.10281024,
        -0.69598177, -0.26422114,  0.80773656,  0.64609167,  0.97024255,
        -0.87101922]])

y_pred = model.predict(new_data2)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step

(y_pred >0.5).astype(int)

array([[0]])

< Terminology >

# epoch

One epoch means that the neural network has completed the forward pass and backward pass over the entire dataset. In other words, the whole dataset has been used for training once.

# batch_size

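Because of memory limits and slowdowns, in most cases the entire dataset cannot be fed in at once within a single epoch. The data is therefore split into pieces: the number of pieces is the number of iterations, and the amount of data supplied in each iteration is the batch size.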

Source: https://www.slideshare.net/w0ong/ss-82372826
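
The relationship between dataset size, batch size, and iterations can be checked against the progress bars in this post. The sketch below assumes the 80/20 split used earlier (8,000 training rows, 2,000 test rows) and Keras's default batch size of 32 when none is given; the helper name iterations_per_epoch is hypothetical.

```python
import math


def iterations_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once (the last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)


print(iterations_per_epoch(8000, 10))  # 800, matches "800/800" in the fit log
print(iterations_per_epoch(8000, 32))  # 250
print(iterations_per_epoch(2000, 32))  # 63, matches "63/63" in evaluate/predict
```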


< Finding the best hyperparameters with GridSearch >

!pip install scikeras

# Tuning the ANN

from scikeras.wrappers import KerasClassifier

from sklearn.model_selection import GridSearchCV

from keras.models import Sequential

from keras.layers import Dense

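# This kind of code is not written often in practice, but knowing the terms makes the tools easier to use, so let's work through it once to understand the principle and learn the keywords.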

def build_model(optimizer='adam'):

    # Same 11 -> 8 -> 6 -> 1 architecture as above; the optimizer is the tunable argument
    model = Sequential()

    model.add(Dense(8, 'relu', input_shape=(11,)))

    model.add(Dense(6, 'relu'))

    model.add(Dense(1, 'sigmoid'))

    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

    return model

model = KerasClassifier(build_fn = build_model)

my_param = {'batch_size': [10,20],'epochs': [10,20,30],'optimizer': ['adam','rmsprop']}

grid = GridSearchCV (estimator=model, param_grid= my_param, scoring='accuracy')

# Training takes a long time.

grid.fit(X_train, y_train)
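
# Why the search is slow can be seen by counting the training runs. The sketch below enumerates the same parameter grid with itertools and assumes GridSearchCV's default of 5-fold cross-validation, since no cv argument was passed above.

```python
from itertools import product

param_grid = {
    'batch_size': [10, 20],
    'epochs': [10, 20, 30],
    'optimizer': ['adam', 'rmsprop'],
}

# Every combination of the listed values is tried
combos = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]

n_folds = 5  # assumption: GridSearchCV default when cv is not specified
total_cv_fits = len(combos) * n_folds

print(len(combos))    # 12
print(total_cv_fits)  # 60
print(combos[0])      # {'batch_size': 10, 'epochs': 10, 'optimizer': 'adam'}
```

# On top of these cross-validation fits, the best combination is trained once more on all of X_train when refit is left at its default.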

grid.best_params_

{'batch_size': 10, 'epochs': 30, 'optimizer': 'adam'}

grid.best_score_

0.86

best_model = grid.best_estimator_

best_model.predict(X_test)

200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step

array([0, 0, 0, ..., 0, 0, 0])

best_model.predict_proba(X_test)

200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step

array([[0.7249168 , 0.27508318],
       [0.71621585, 0.28378415],
       [0.8745559 , 0.12544413],
       ...,
       [0.81129515, 0.18870485],
       [0.8167985 , 0.18320148],
       [0.8356137 , 0.16438629]], dtype=float32)

Continued in the next post


Practice data download (Git): https://github.com/sorktjrrb/
