
Stock Price Prediction with Python and SimpleRNN: A Detailed Walkthrough

Updated: 2022-05-20 09:01:46  Author: 别团等shy哥发育

This article walks through stock price prediction with Python and SimpleRNN. The example code is explained step by step and can serve as a reference for anyone learning the technique.

1. Data Source

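SH600519.csv contains the daily K-line (candlestick) data of Kweichow Moutai (SH600519), downloaded with the tushare module. This example uses only column C, the opening price (as shown in the figure):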

The opening prices of 60 consecutive days are used to predict the opening price on day 61.

2. Code Implementation

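The code follows the six-step method. First, import the required modules. Next, read the Kweichow Moutai daily K-line data into the variable maotai, take the opening prices of the first 2126 days as training data and the opening prices of the last 300 days as test data, then normalize the opening prices so that the values fed to the neural network lie between 0 and 1.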
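To make the split and the scaling step concrete, here is a minimal pure-Python sketch. The toy `prices` list and the helper `scale` are assumptions made for illustration; the article's script applies scikit-learn's MinMaxScaler to the CSV column instead.

```python
# Toy opening prices; the real script reads them from SH600519.csv.
prices = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0, 20.0, 19.0]
test_size = 3  # stand-in for the 300 test days used in the article

train = prices[:-test_size]
test = prices[-test_size:]

# "Fit" on the training data only: remember its minimum and maximum.
lo, hi = min(train), max(train)


def scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


train_scaled = scale(train, lo, hi)
test_scaled = scale(test, lo, hi)

print(train_scaled)
print(test_scaled)
```

Because the minimum and maximum come from the training data alone, scaled test values can fall outside [0, 1] whenever later prices leave the training range, as they do in this toy series.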

Then create empty lists to hold the training input features, the training labels, the test input features, and the test labels.

Next, build the samples. A for loop walks over the whole training data: every 60 consecutive days form one input feature in x_train, and the value on day 61 is the matching label in y_train, which yields 2066 training samples in total. The training samples are then shuffled, converted to array format, and reshaped to the dimensions an RNN expects.
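The windowing and shuffling described above can be sketched on a toy series. The short `series`, the window of 3, and the index-permutation shuffle are illustrative assumptions; the article's script uses a window of 60 and reseeds NumPy before shuffling each list.

```python
import random

series = list(range(10))  # stand-in for the scaled training prices
window = 3                # the article uses 60

x_train, y_train = [], []
for i in range(window, len(series)):
    x_train.append(series[i - window:i])  # the previous `window` values
    y_train.append(series[i])             # the value that follows them

# Shuffle features and labels with one shared permutation so pairs stay aligned.
order = list(range(len(x_train)))
random.Random(7).shuffle(order)
x_train = [x_train[j] for j in order]
y_train = [y_train[j] for j in order]

for x, y in zip(x_train, y_train):
    print(x, "->", y)
```

Shuffling an index list once and applying it to both lists makes the pairing explicit, whereas reseeding before each shuffle relies on both lists having the same length.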

In the same way, a for loop walks over the whole test data and yields 240 test samples. The test set does not need to be shuffled, but it must also be converted to array format and reshaped to the dimensions an RNN expects.
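The sample counts and the target shape can be checked with a few lines of plain Python. The helper `count_windows` and the tiny `x_test` list are hypothetical names used only for this sketch.

```python
def count_windows(n_days, window=60):
    """Number of (window -> next day) samples a series of n_days yields."""
    return max(n_days - window, 0)


print(count_windows(2126))  # training samples
print(count_windows(300))   # test samples

# An RNN expects [samples, time steps, features per step].
# With one feature per step, each scalar is wrapped in its own list.
x_test = [[0.1, 0.2, 0.3], [0.2, 0.3, 0.4]]  # 2 samples, 3 time steps
x_test_3d = [[[v] for v in sample] for sample in x_test]
print(len(x_test_3d), len(x_test_3d[0]), len(x_test_3d[0][0]))
```

The counts come out as 2066 and 240, matching the figures in the text, and the nested list has the [samples, time steps, 1] layout that np.reshape produces in the full script.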

Build the neural network with Sequential:

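The first recurrent layer has 80 memory units and passes $h_t$ to the next layer at every time step, followed by a Dropout of 0.2;

The second recurrent layer has 100 memory units and passes $h_t$ to the next layer only at the last time step, followed by a Dropout of 0.2;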
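The difference between passing $h_t$ at every time step and only at the last one can be seen in a small pure-Python forward pass. This is a sketch of the standard tanh recurrence $h_t = \tanh(x_t W_x + h_{t-1} W_h + b)$ with made-up weights, not the Keras implementation; the function name `simple_rnn` and all weight values are assumptions.

```python
import math


def simple_rnn(inputs, w_x, w_h, b, return_sequences=False):
    """Run a tanh RNN over a sequence of scalar inputs.

    inputs: list of floats, one per time step
    w_x:    input weights, one per unit
    w_h:    recurrent weights, w_h[i][j] connects unit i to unit j
    b:      biases, one per unit
    """
    units = len(b)
    h = [0.0] * units  # initial state
    states = []
    for x_t in inputs:
        h = [
            math.tanh(x_t * w_x[j]
                      + sum(h[i] * w_h[i][j] for i in range(units))
                      + b[j])
            for j in range(units)
        ]
        states.append(h)
    # Every state (like return_sequences=True) or only the final one.
    return states if return_sequences else states[-1]


# Hypothetical weights for a layer with 2 units.
w_x = [0.5, -0.3]
w_h = [[0.1, 0.2], [0.0, -0.1]]
b = [0.0, 0.1]
seq = [0.2, 0.4, 0.6]

all_states = simple_rnn(seq, w_x, w_h, b, return_sequences=True)
last_state = simple_rnn(seq, w_x, w_h, b)
print(len(all_states), len(all_states[0]))
print(last_state == all_states[-1])
```

With return_sequences=True the layer emits one state per time step, which a stacked recurrent layer needs as its input sequence. The last layer only needs the final state, which then feeds the Dense layer.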

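Because the output is a single number, the opening price on day 61, the fully connected Dense layer has 1 unit. Then compile configures training with the Adam optimizer and the mean squared error loss. In the stock prediction code only the loss needs to be watched, and only the loss is printed during training, so metrics is left unset. Next, set up checkpointing so that training can resume, run training with fit, and print the network structure and parameter counts with summary.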
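As a sanity check on what summary reports, the trainable parameter counts of this network can be worked out by hand. The sketch below assumes the usual layout of a simple recurrent layer (input kernel, recurrent kernel, and bias) and of a dense layer (kernel and bias); the helper names are hypothetical.

```python
def simple_rnn_params(input_dim, units):
    """Input kernel + recurrent kernel + bias of a simple recurrent layer."""
    return units * (input_dim + units + 1)


def dense_params(input_dim, units):
    """Kernel + bias of a fully connected layer."""
    return units * (input_dim + 1)


rnn1 = simple_rnn_params(input_dim=1, units=80)    # one price per time step
rnn2 = simple_rnn_params(input_dim=80, units=100)  # fed by the first layer
dense = dense_params(input_dim=100, units=1)
total = rnn1 + rnn2 + dense  # Dropout layers add no parameters

print(rnn1, rnn2, dense, total)
```

Under these assumptions the layers contribute 6560, 18100, and 101 parameters, 24761 in total.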

Visualize the loss and save the parameters.

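Run the prediction. Use predict on the test set, convert both the predicted and the real values from their normalized form back to actual prices, and finally plot the real values as a red line and the predicted values as a blue line.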
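Converting normalized values back to prices simply inverts the min-max formula. Below is a minimal sketch for a (0, 1) target range; the helpers `scale` and `inverse_scale` and the bounds `lo` and `hi` are illustrative assumptions, whereas the full script calls the scaler's inverse_transform.

```python
def scale(values, lo, hi):
    """Min-max scaling to a (0, 1) range."""
    return [(v - lo) / (hi - lo) for v in values]


def inverse_scale(scaled, lo, hi):
    """Undo min-max scaling: map 0 back to lo and 1 back to hi."""
    return [s * (hi - lo) + lo for s in scaled]


lo, hi = 100.0, 500.0        # hypothetical training minimum and maximum
scaled = [0.0, 0.25, 1.0]
restored = inverse_scale(scaled, lo, hi)
print(restored)
```

The inverse must use the same minimum and maximum that were learned from the training set, otherwise the restored prices are shifted or stretched.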

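To judge the quality of the model, three metrics are reported: mean squared error, root mean squared error, and mean absolute error. The smaller these errors, the closer the predictions are to the real values.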
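The three metrics are short enough to write out by hand. The sketch below uses plain Python and made-up prices; the `real` and `pred` lists are assumptions for illustration, and the full script relies on scikit-learn for MSE and MAE.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


real = [100.0, 102.0, 101.0, 105.0]  # hypothetical real prices
pred = [101.0, 101.0, 103.0, 105.0]  # hypothetical predictions

print('MSE: %.6f' % mse(real, pred))
print('RMSE: %.6f' % rmse(real, pred))
print('MAE: %.6f' % mae(real, pred))
```

RMSE and MAE are in the same unit as the price, which makes them easier to interpret than MSE. RMSE penalizes large individual errors more heavily than MAE.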

RNN stock prediction loss curve:

RNN stock prediction curve:

RNN stock prediction evaluation metrics:

Model summary:

3. Complete Code

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dropout, Dense, SimpleRNN
import matplotlib.pyplot as plt
import os
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import math
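# Read the stock data file
maotai = pd.read_csv('./SH600519.csv')
# Opening prices of the first 2426 - 300 = 2126 days form the training set. Column indices start at 0 and
# 2:3 selects the half-open range [2, 3), so this extracts column C, the opening price.
training_set = maotai.iloc[0:2426 - 300, 2:3].values
# Opening prices of the last 300 days form the test set
test_set = maotai.iloc[2426 - 300:, 2:3].values

# Normalization
sc = MinMaxScaler(feature_range=(0, 1))  # scale values into the range (0, 1)
training_set_scaled = sc.fit_transform(training_set)  # learn the training set's min and max, then scale the training set
test_set = sc.transform(test_set)  # scale the test set with the training set's min and max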

x_train = []
y_train = []

x_test = []
y_test = []

# Training set: the first 2426 - 300 = 2126 days in the CSV file
# Loop over the whole training set: each run of 60 consecutive opening prices is one input feature in x_train
# and the price on day 61 is its label, giving 2426 - 300 - 60 = 2066 samples.
for i in range(60, len(training_set_scaled)):
    x_train.append(training_set_scaled[i - 60:i, 0])
    y_train.append(training_set_scaled[i, 0])
# Shuffle the training set; reseeding before each shuffle keeps features and labels aligned
np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)
# Convert the training set from lists to arrays
x_train, y_train = np.array(x_train), np.array(y_train)

# Reshape x_train to the RNN input format: [number of samples, time steps, features per time step].
# The whole data set is fed in, so the number of samples is x_train.shape[0], i.e. 2066. Sixty opening prices
# predict the price on day 61, so there are 60 time steps. Each time step carries one day's opening price,
# so there is 1 feature per time step.
x_train = np.reshape(x_train, (x_train.shape[0], 60, 1))
# Test set: the last 300 days in the CSV file
# Loop over the whole test set: each run of 60 consecutive opening prices is one input feature in x_test
# and the price on day 61 is its label in y_test, giving 300 - 60 = 240 samples.
for i in range(60, len(test_set)):
    x_test.append(test_set[i - 60:i, 0])
    y_test.append(test_set[i, 0])
# Convert the test set to arrays and reshape to [number of samples, time steps, features per time step]
x_test, y_test = np.array(x_test), np.array(y_test)
x_test = np.reshape(x_test, (x_test.shape[0], 60, 1))

model = tf.keras.Sequential([
    SimpleRNN(80, return_sequences=True),  # first recurrent layer: 80 memory units, passes h_t to the next layer at every time step
    Dropout(0.2),  # Dropout of 0.2
    SimpleRNN(100),  # second recurrent layer: 100 memory units, passes only the last h_t on
    Dropout(0.2),  # Dropout of 0.2
    Dense(1)  # the output is a single number, the opening price on day 61, so Dense has 1 unit
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss='mean_squared_error')  # mean squared error loss
# Only the loss is monitored here, not accuracy, so metrics is omitted and each epoch prints only the loss

checkpoint_save_path = "./checkpoint/rnn_stock.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='val_loss')

history = model.fit(x_train, y_train, batch_size=64, epochs=50, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])

model.summary()

# Dump the name, shape, and values of every trainable variable
with open('./weights.txt', 'w') as file:
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')

loss = history.history['loss']
val_loss = history.history['val_loss']

plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

################## predict ######################
# Feed the test set to the model for prediction
predicted_stock_price = model.predict(x_test)
# Restore the predictions from the (0, 1) range to the original price range
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
# Restore the real values from the (0, 1) range to the original price range
real_stock_price = sc.inverse_transform(test_set[60:])
# Plot the real and predicted prices for comparison
plt.plot(real_stock_price, color='red', label='MaoTai Stock Price')
plt.plot(predicted_stock_price, color='blue', label='Predicted MaoTai Stock Price')
plt.title('MaoTai Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('MaoTai Stock Price')
plt.legend()
plt.show()

##########evaluate##############
# MSE, mean squared error: E[(predicted - real)^2], the mean of the squared differences
mse = mean_squared_error(real_stock_price, predicted_stock_price)
# RMSE, root mean squared error: sqrt(MSE)
rmse = math.sqrt(mse)
# MAE, mean absolute error: E[|predicted - real|], the mean of the absolute differences
mae = mean_absolute_error(real_stock_price, predicted_stock_price)
print('MSE: %.6f' % mse)
print('RMSE: %.6f' % rmse)
print('MAE: %.6f' % mae)

