Time Series Forecasting: A CNN-BiLSTM Model in Practice

admin 7196 2025-11-27 16:53:45

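A BiLSTM (bidirectional LSTM) is a deep learning model that runs two long short-term memory (LSTM) networks over the same input sequence, one reading it forward and one reading it backward, and combines their outputs. Its advantages show up in two ways:

Bidirectional information capture: because one LSTM processes the sequence in forward order and the other in reverse order, the representation of each time step draws on both the earlier and the later parts of the input. In forecasting, "later" means later positions inside the input window, not values beyond the forecast point, which the model never sees. This lets the model capture dependencies within the series more completely and improves its representation of the sequence.

Richer context: since information flows in both directions, the model can better relate the input at a given time step to the context before and after it. This helps in many sequence tasks, such as natural language processing and time series forecasting, where understanding context is central to the task.

Overall, compared with a unidirectional LSTM, a BiLSTM captures more of the information in a sequence, which can improve how well the model represents and predicts sequence data.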
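The bidirectional idea can be illustrated without any deep learning library. The sketch below is a toy, not an LSTM: a one-unit tanh recurrence stands in for the LSTM cell, and the weights `w_in` and `w_rec` are arbitrary values chosen for illustration. It shows only how a forward pass and a time-reversed pass are combined into two features per time step.

```python
import numpy as np

def simple_rnn_pass(x, w_in=0.5, w_rec=0.8):
    """Run a one-unit tanh recurrence over x and return the state at every step."""
    h = 0.0
    states = []
    for value in x:
        h = np.tanh(w_in * value + w_rec * h)
        states.append(h)
    return np.array(states)

def bidirectional_pass(x):
    """Concatenate a forward pass with a time-reversed backward pass."""
    forward = simple_rnn_pass(x)
    # Read the sequence in reverse, then flip the states back into time order
    backward = simple_rnn_pass(x[::-1])[::-1]
    return np.stack([forward, backward], axis=-1)

window = np.array([0.1, 0.4, 0.9, 0.3])
states = bidirectional_pass(window)
print(states.shape)  # (4, 2): one forward and one backward state per time step
```

At the first time step the forward state has seen only the first value, while the backward state has already seen the whole window; at the last step the roles are reversed.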

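1. Code Workflow Overview

1.1 Start

Read the data

Preprocess the data

Convert the data to time series format

Check data completeness

Split into training, validation, and test sets

Normalize the data

Build time windows

1.2 Model Building

1.2.1 BiLSTM Model

Bidirectional LSTM layer

Dense layers

Compile the model

1.2.2 CNN-BiLSTM Model

Bidirectional LSTM layer

Reshape layer

Convolutional layer

Pooling layer

Flatten layer

Dense layers

Compile the model

1.3 Model Training

Train the BiLSTM model and save the training history

Train the CNN-BiLSTM model and save the training history

1.4 Model Evaluation

Evaluate the BiLSTM model on the test set

Evaluate the CNN-BiLSTM model on the test set

1.5 Future Forecasting

Forecast future values with the BiLSTM model

Forecast future values with the CNN-BiLSTM model

1.6 Plotting the Results

Plot the training set, validation set, test set, and predictions as time series (BiLSTM)

Plot the training set, validation set, test set, and predictions as time series (CNN-BiLSTM)

1.7 End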

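2. Code Implementation

2.1 Reading the Data

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

plt.rcParams['font.sans-serif'] = 'SimHei'  # font that can display Chinese characters
plt.rcParams['axes.unicode_minus'] = False  # render minus signs correctly with that font

df = pd.read_excel('data.xlsx')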

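2.2 Data Preprocessing

2.2.1 Date Conversion and Missing-Date Check

# Build a date from the year and the day of the year (%j)
df['Date'] = pd.to_datetime(df['Year'].astype(str) + '-' + df['Day'].astype(str), format='%Y-%j')
df.set_index('Date', inplace=True)
df.drop(['Year', 'Day'], axis=1, inplace=True)

# Generate the expected daily date range
start_date = pd.Timestamp('1990-01-01')
end_date = pd.Timestamp('2023-03-01')
date_range = pd.date_range(start=start_date, end=end_date, freq='D')

# Find the dates in the expected range that are absent from the DataFrame index
missing_dates = date_range[~date_range.isin(df.index)]
print("Missing Dates:")
print(missing_dates)

The code combines the Year and Day (day of the year) columns into a date and sets it as the DataFrame index. It then generates the full daily date range and lists every date in that range that is missing from the index, so that gaps in the series are caught before modelling.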
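To make the two steps concrete, here is a minimal standard-library sketch of the same logic: parsing a year plus day-of-year pair with `%j`, then listing the expected dates that were not observed. The helper names and the sample records are hypothetical and are not part of the original dataset.

```python
from datetime import date, datetime, timedelta

def day_of_year_to_date(year, day):
    """Convert a year and a day-of-year number (1-366) into a date."""
    return datetime.strptime(f"{year}-{day:03d}", "%Y-%j").date()

def find_missing_dates(observed, start, end):
    """Return the dates between start and end (inclusive) that are not in observed."""
    observed = set(observed)
    n_days = (end - start).days + 1
    expected = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in expected if d not in observed]

# Hypothetical records in which day 3 of 2023 is missing
records = [(2023, 1), (2023, 2), (2023, 4), (2023, 5)]
observed = [day_of_year_to_date(y, d) for y, d in records]
missing = find_missing_dates(observed, date(2023, 1, 1), date(2023, 1, 5))
print(missing)  # [datetime.date(2023, 1, 3)]
```

Day 60 of 2023, for example, maps to 2023-03-01, the end of the date range used above.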

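2.2.2 Splitting the Data

# Split ratios
train_ratio = 0.7
val_ratio = 0.1
test_ratio = 0.2  # the remainder after the training and validation sets

# Split indices
train_split = int(train_ratio * len(df))
val_split = int((train_ratio + val_ratio) * len(df))

# Split chronologically, without shuffling
train_set = df.iloc[:train_split]
val_set = df.iloc[train_split:val_split]
test_set = df.iloc[val_split:]

plt.figure(figsize=(15, 10))
plt.subplot(3, 1, 1)
plt.plot(train_set, color='g', alpha=0.3)
plt.title('Training set: Temperature time series')
plt.subplot(3, 1, 2)
plt.plot(val_set, color='b', alpha=0.3)
plt.title('Validation set: Temperature time series')
plt.subplot(3, 1, 3)
plt.plot(test_set, color='r', alpha=0.3)
plt.title('Test set: Temperature time series')
plt.xticks(rotation=45)
plt.show()

The dataset is split chronologically into training, validation, and test sets in a 70/10/20 ratio, and each part is plotted as a time series. The training set is used to fit the model, the validation set to tune hyperparameters and monitor performance during training, and the test set to evaluate the model on unseen data.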

2.2.3 Normalizing the Data

from sklearn.preprocessing import MinMaxScaler

def normalize_dataframe(train_set, val_set, test_set):
    scaler = MinMaxScaler()
    scaler.fit(train_set)  # fit the scaler on the training set only
    train = pd.DataFrame(scaler.transform(train_set), columns=train_set.columns, index=train_set.index)
    val = pd.DataFrame(scaler.transform(val_set), columns=val_set.columns, index=val_set.index)
    test = pd.DataFrame(scaler.transform(test_set), columns=test_set.columns, index=test_set.index)
    return train, val, test

train, val, test = normalize_dataframe(train_set, val_set, test_set)

plt.figure(figsize=(15, 10))
plt.subplot(3, 1, 1)
plt.plot(train, color='g', alpha=0.3)
plt.title('Training set: normalized Temperature time series')
plt.subplot(3, 1, 2)
plt.plot(val, color='b', alpha=0.3)
plt.title('Validation set: normalized Temperature time series')
plt.subplot(3, 1, 3)
plt.plot(test, color='r', alpha=0.3)
plt.title('Test set: normalized Temperature time series')
plt.xticks(rotation=45)
plt.show()

The training, validation, and test sets are normalized and plotted again. The scaler is fit on the training set only, so no statistics from the validation or test data leak into training.
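Min-max scaling maps a value x to (x - min) / (max - min). The following NumPy sketch, with made-up numbers, shows what fitting on the training set alone implies: values outside the training range are mapped outside [0, 1], which is expected and is the price of not peeking at later data. The function names here are illustrative and are not scikit-learn APIs.

```python
import numpy as np

def fit_min_max(train_values):
    """Return the (min, max) statistics of the training data."""
    return float(np.min(train_values)), float(np.max(train_values))

def min_max_transform(values, stats):
    """Scale values with statistics computed on the training data."""
    lo, hi = stats
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def min_max_inverse(scaled, stats):
    """Map scaled values back to the original units."""
    lo, hi = stats
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

train_values = np.array([10.0, 20.0, 30.0])
test_values = np.array([5.0, 35.0])  # hypothetical values outside the training range

stats = fit_min_max(train_values)
print(min_max_transform(train_values, stats))  # training values land in [0, 1]
print(min_max_transform(test_values, stats))   # -0.25 and 1.25, outside [0, 1]
```

The inverse function is the same formula that the forecasting sections use to map predictions back to temperatures.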

2.2.4 Building Time Windows

def prepare_data(data, win_size):
    X = []
    y = []
    for i in range(len(data) - win_size):
        temp_x = data[i:i + win_size]  # win_size consecutive values as input
        temp_y = data[i + win_size]    # the value that follows as target
        X.append(temp_x)
        y.append(temp_y)
    X = np.asarray(X)
    y = np.asarray(y)
    X = np.expand_dims(X, axis=-1)  # shape (samples, win_size, 1)
    return X, y

win_size = 30

# Training set
X_train, y_train = prepare_data(train['Temperature'].values, win_size)

# Validation set
X_val, y_val = prepare_data(val['Temperature'].values, win_size)

# Test set
X_test, y_test = prepare_data(test['Temperature'].values, win_size)

print("Training set shapes:", X_train.shape, y_train.shape)
print("Validation set shapes:", X_val.shape, y_val.shape)
print("Test set shapes:", X_test.shape, y_test.shape)

The windows are built for single-feature, single-step prediction: each sample contains 30 consecutive values, and the target is the value that follows them.
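A tiny example makes the resulting shapes easier to see. The sketch below is a compact variant of the same sliding-window idea, applied to a six-element series with a window of 3. The function name `make_windows` is an assumption for this illustration.

```python
import numpy as np

def make_windows(series, win_size):
    """Split a 1-D series into (window, next value) pairs."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + win_size] for i in range(len(series) - win_size)])
    y = series[win_size:]
    return X[..., np.newaxis], y  # add a trailing feature axis of size 1

series = np.arange(6)  # 0, 1, 2, 3, 4, 5
X, y = make_windows(series, win_size=3)
print(X.shape, y.shape)    # (3, 3, 1) (3,)
print(X[0].ravel(), y[0])  # the first window 0, 1, 2 is paired with target 3
```

A series of length n yields n - win_size samples, which is why the first win_size points of each split have no prediction.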

2.3 Building the BiLSTM Model

2.3.1 Compiling and Training the BiLSTM Model

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Bidirectional, Dense

model_bilstm = Sequential()
# Bidirectional LSTM over the (win_size, 1) input window
model_bilstm.add(Bidirectional(LSTM(128, activation='relu'), input_shape=(X_train.shape[1], X_train.shape[2])))
model_bilstm.add(Dense(64, activation='relu'))
model_bilstm.add(Dense(32, activation='relu'))
model_bilstm.add(Dense(16, activation='relu'))
model_bilstm.add(Dense(1))  # single-step output

model_bilstm.compile(optimizer='adam', loss='mse')
history = model_bilstm.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_val, y_val))

# Training and validation loss curves
plt.figure()
plt.plot(history.history['loss'], c='b', label='loss')
plt.plot(history.history['val_loss'], c='g', label='val_loss')
plt.legend()
plt.show()

model_bilstm.summary()

2.3.2 Evaluating the BiLSTM Model

from sklearn import metrics
from sklearn.metrics import r2_score  # coefficient of determination

y_pred = model_bilstm.predict(X_test)
y_pred_flat = y_pred.ravel()  # flatten (n, 1) predictions to shape (n,)

# The metrics below are computed on the normalized scale

# Mean squared error (MSE)
mse = metrics.mean_squared_error(y_test, y_pred_flat)

# Root mean squared error (RMSE)
rmse = np.sqrt(mse)

# Mean absolute error (MAE)
mae = metrics.mean_absolute_error(y_test, y_pred_flat)

# Coefficient of determination (R^2)
r2 = r2_score(y_test, y_pred_flat)

print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("Coefficient of determination (R^2):", r2)
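For reference, the four metrics can be written out from their definitions. The sketch below uses made-up normalized values and a hypothetical training-set range (`data_min`, `data_max`) to show how an error on the normalized scale is converted to original units. RMSE and MAE scale by (max - min), MSE scales by its square, and R^2 is unaffected by the rescaling.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R^2 from their definitions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    errors = y_true - y_pred
    mse = np.mean(errors ** 2)
    ss_res = np.sum(errors ** 2)                    # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return {
        "mse": mse,
        "rmse": np.sqrt(mse),
        "mae": np.mean(np.abs(errors)),
        "r2": 1.0 - ss_res / ss_tot,
    }

y_true = np.array([0.2, 0.4, 0.6, 0.8])
y_pred = np.array([[0.1], [0.4], [0.7], [0.8]])  # column vector, like the output of predict()
scores = regression_metrics(y_true, y_pred)

# Hypothetical training-set range, used to express the error in original units
data_min, data_max = -10.0, 30.0
rmse_original_units = scores["rmse"] * (data_max - data_min)
print(scores, rmse_original_units)
```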

2.3.3 Forecasting Ahead with the BiLSTM Model and Visualization

# Seed the window with the last test window shifted by one step,
# using the model's prediction for the last test point as its newest value
last_output = model_bilstm.predict(X_test)[-1]
window = np.append(X_test[-1][1:], last_output)

# Number of time steps to forecast
steps = 10  # forecast 10 steps ahead

predicted = []
for i in range(steps):
    # Shape the current window as (1, win_size, 1) and predict the next value
    input_data = window.reshape(1, X_test.shape[1], X_test.shape[2])
    next_output = model_bilstm.predict(input_data)
    # Store the prediction
    predicted.append(next_output[0][0])
    # Slide the window: drop the oldest value and append the new prediction
    window = np.append(window[1:], next_output[0])

# Inverse normalization with the training-set minimum and maximum
df_max = train_set['Temperature'].max()
df_min = train_set['Temperature'].min()
series_1 = np.array(predicted) * (df_max - df_min) + df_min

# Dates of the test predictions (the first win_size test points have none) and of the forecast
test_dates = test_set.index[win_size:]
future_dates = pd.date_range(start=test_set.index[-1] + pd.Timedelta(days=1), periods=steps, freq='D')

plt.figure(figsize=(15, 4), dpi=300)

plt.subplot(3, 1, 1)
plt.plot(train_set, color='c', label='Training set')
plt.plot(val_set, color='r', label='Validation set')
plt.plot(test_set, color='b', label='Test set')
plt.plot(test_dates, y_pred * (df_max - df_min) + df_min, color='y', label='Test-set prediction')
plt.plot(future_dates, series_1, color='magenta', linestyle='-.', label='Future forecast')
plt.legend()

plt.subplot(3, 1, 2)
plt.plot(test_set, color='b', label='Test set')
plt.plot(test_dates, y_pred * (df_max - df_min) + df_min, color='y', label='Test-set prediction')
plt.plot(future_dates, series_1, color='magenta', linestyle='-.', label='Future forecast')
plt.legend()

plt.subplot(3, 1, 3)
plt.plot(test_set, color='b', label='Test set')
plt.plot(test_dates, y_pred * (df_max - df_min) + df_min, color='y', label='Test-set prediction')
plt.plot(future_dates, series_1, color='magenta', linestyle='-.', label='Future forecast')
# Limit the x-axis to the period from 2022 to the end of the forecast
plt.xlim(pd.Timestamp('2022-01-01'), future_dates[-1])
plt.legend()

plt.show()
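Recursive multi-step forecasting, as used in this section, feeds each prediction back into the input window so that the window keeps sliding forward. The sketch below isolates that loop from Keras. The predictor `mean_of_window` is a stand-in chosen for illustration and is not the trained model.

```python
import numpy as np

def recursive_forecast(predict_next, last_window, steps):
    """Forecast `steps` values, feeding each prediction back into the window."""
    window = np.asarray(last_window, dtype=float).copy()
    forecasts = []
    for _ in range(steps):
        next_value = float(predict_next(window))
        forecasts.append(next_value)
        # Slide the window: drop the oldest value and append the new prediction
        window = np.append(window[1:], next_value)
    return np.array(forecasts)

def mean_of_window(window):
    """Stand-in predictor (an assumption for this sketch): the mean of the window."""
    return window.mean()

forecasts = recursive_forecast(mean_of_window, last_window=[1.0, 2.0, 3.0], steps=3)
print(forecasts)
```

With a Keras model, `predict_next` could wrap a call such as `model_bilstm.predict(w.reshape(1, -1, 1))[0, 0]`. Because every step consumes earlier predictions, errors accumulate as the horizon grows, so long recursive forecasts should be read with caution.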

2.4 Building the CNN-BiLSTM Model

2.4.1 Compiling and Training the CNN-BiLSTM Model

from tensorflow.keras.layers import Conv1D, MaxPooling1D, Reshape, Flatten

model_cnn_bilstm = Sequential()
model_cnn_bilstm.add(Bidirectional(LSTM(128, activation='relu'), input_shape=(X_train.shape[1], X_train.shape[2])))
# The BiLSTM outputs 2 * 128 = 256 features; reshape them to 3-D (256, 1) for Conv1D
model_cnn_bilstm.add(Reshape((256, 1)))
# Note: the convolution slides over the 256 BiLSTM features, not over the time axis
model_cnn_bilstm.add(Conv1D(filters=64, kernel_size=7, activation='relu'))
model_cnn_bilstm.add(MaxPooling1D(pool_size=2))
model_cnn_bilstm.add(Flatten())  # flatten the pooled output into a 1-D vector
model_cnn_bilstm.add(Dense(32, activation='relu'))
model_cnn_bilstm.add(Dense(16, activation='relu'))
model_cnn_bilstm.add(Dense(1))

model_cnn_bilstm.compile(optimizer='adam', loss='mse')
history = model_cnn_bilstm.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_val, y_val))

# Training and validation loss curves
plt.figure()
plt.plot(history.history['loss'], c='b', label='loss')
plt.plot(history.history['val_loss'], c='g', label='val_loss')
plt.legend()
plt.show()

model_cnn_bilstm.summary()
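The layer output sizes reported by `summary()` can be checked by hand. The sketch below assumes the Keras defaults used in the model above: concatenated BiLSTM outputs, 'valid' padding, a convolution stride of 1, and a pooling stride equal to the pool size. The helper functions are written for this illustration and are not Keras APIs.

```python
def conv1d_output_length(length, kernel_size, stride=1):
    """Output length of a 1-D convolution with 'valid' padding."""
    return (length - kernel_size) // stride + 1

def pool1d_output_length(length, pool_size):
    """Output length of 1-D max pooling with 'valid' padding and stride equal to pool_size."""
    return (length - pool_size) // pool_size + 1

lstm_units = 128
filters = 64

bilstm_features = 2 * lstm_units                                    # forward + backward: 256
conv_length = conv1d_output_length(bilstm_features, kernel_size=7)  # 256 - 7 + 1 = 250
pool_length = pool1d_output_length(conv_length, pool_size=2)        # 250 / 2 = 125
flatten_size = pool_length * filters                                # 125 * 64 = 8000

print(bilstm_features, conv_length, pool_length, flatten_size)
```

The first dense layer therefore receives 8000 inputs, compared with 256 in the plain BiLSTM model.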

2.4.2 Evaluating the CNN-BiLSTM Model

from sklearn.metrics import r2_score

y_pred = model_cnn_bilstm.predict(X_test)
y_pred_flat = y_pred.ravel()  # flatten (n, 1) predictions to shape (n,)

# The metrics below are computed on the normalized scale
mse = metrics.mean_squared_error(y_test, y_pred_flat)
rmse = np.sqrt(mse)
mae = metrics.mean_absolute_error(y_test, y_pred_flat)
r2 = r2_score(y_test, y_pred_flat)

print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("Coefficient of determination (R^2):", r2)

2.4.3 Forecasting Ahead with the CNN-BiLSTM Model and Visualization

# Seed the window with the last test window shifted by one step,
# using the model's prediction for the last test point as its newest value
last_output = model_cnn_bilstm.predict(X_test)[-1]
window = np.append(X_test[-1][1:], last_output)

steps = 10  # forecast 10 steps ahead

predicted = []
for i in range(steps):
    input_data = window.reshape(1, X_test.shape[1], X_test.shape[2])
    next_output = model_cnn_bilstm.predict(input_data)
    predicted.append(next_output[0][0])
    # Slide the window: drop the oldest value and append the new prediction
    window = np.append(window[1:], next_output[0])

# Inverse normalization, reusing df_max and df_min from section 2.3.3
series_2 = np.array(predicted) * (df_max - df_min) + df_min

# Dates of the test predictions (the first win_size test points have none) and of the forecast
test_dates = test_set.index[win_size:]
future_dates = pd.date_range(start=test_set.index[-1] + pd.Timedelta(days=1), periods=steps, freq='D')

plt.figure(figsize=(15, 4), dpi=300)

plt.subplot(3, 1, 1)
plt.plot(train_set, color='c', label='Training set')
plt.plot(val_set, color='r', label='Validation set')
plt.plot(test_set, color='b', label='Test set')
plt.plot(test_dates, y_pred * (df_max - df_min) + df_min, color='y', label='Test-set prediction')
plt.plot(future_dates, series_2, color='magenta', linestyle='-.', label='Future forecast')
plt.legend()

plt.subplot(3, 1, 2)
plt.plot(test_set, color='b', label='Test set')
plt.plot(test_dates, y_pred * (df_max - df_min) + df_min, color='y', label='Test-set prediction')
plt.plot(future_dates, series_2, color='magenta', linestyle='-.', label='Future forecast')
plt.legend()

plt.subplot(3, 1, 3)
plt.plot(test_set, color='b', label='Test set')
plt.plot(test_dates, y_pred * (df_max - df_min) + df_min, color='y', label='Test-set prediction')
plt.plot(future_dates, series_2, color='magenta', linestyle='-.', label='Future forecast')
# Limit the x-axis to the period from 2022 to the end of the forecast
plt.xlim(pd.Timestamp('2022-01-01'), future_dates[-1])
plt.legend()

plt.show()

This article is reposted from the WeChat official account @Python机器学习AI
