Long Short-Term Memory (LSTM)

Storyboard

The passenger-number model takes historical segments of the series together with the value that follows each of them, learns the underlying pattern, and can then infer how any given sequence will continue.
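
The windowing idea can be sketched on a toy list before touching the real data. The helper make_windows and the window length of 3 below are illustrative only; the notebook itself uses windows of 10 values:

# A minimal sketch of the windowing idea: each input is a short segment
# of the series and the target is the value that immediately follows it.
series = [112, 118, 132, 129, 121, 135]

def make_windows(values, window=3):
    pairs = []
    for i in range(len(values) - window):
        pairs.append((values[i:i + window], values[i + window]))
    return pairs

for segment, next_value in make_windows(series):
    print(segment, '->', next_value)

# [112, 118, 132] -> 129
# [118, 132, 129] -> 121
# [132, 129, 121] -> 135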

Code and data

paxs_rnn.ipynb

international-airline-passengers.csv

>Model

ID:(1793, 0)



Import libraries

Description

>Top


Import the necessary libraries:

# numerical and data handling
import numpy
import pandas as pd
import matplotlib.pyplot as plt
import math
# Keras model and layers
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
# scaling and error metric
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

ID:(13885, 0)



Load historical data

Description

>Top


Load the monthly passenger totals from the international-airline-passengers.csv file:

# load data (skipfooter is only supported by the python parser engine)
data = pd.read_csv('international-airline-passengers.csv', skipfooter=5, engine='python')
data.head()


ID:(13886, 0)



Show the historical data

Description

>Top


Plot the passenger counts over time:

# show data
dataset = data.iloc[:,1].values
plt.plot(dataset)
plt.xlabel('time')
plt.ylabel('Number of Passengers')
plt.title('International airline passengers')
plt.show()


ID:(13887, 0)



Reshape and list data

Description

>Top


Reshape the series into a float32 column vector, the form that MinMaxScaler expects, and list it:

# reshape and list data
dataset = dataset.reshape(-1,1)
dataset = dataset.astype('float32')
print(dataset)

[[112.]
 [118.]
 [132.]
 [129.]
 ...
 [606.]
 [508.]
 [461.]]

ID:(13888, 0)



Rescale values

Description

>Top


To improve the quality of learning, the values are rescaled to the range between 0 and 1:

# scaling 
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
print(dataset)

[[0.01544401]
 [0.02702703]
 [0.05405405]
 ...
 [0.96911204]
 [0.7799227 ]
 [0.6891892 ]]
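
For reference, MinMaxScaler maps each value with x' = (x - min) / (max - min), so the smallest value becomes 0 and the largest becomes 1, and inverse_transform undoes the mapping. A minimal check on toy values (not the passenger data, reusing the imports above) illustrates this:

# Minimal illustration of MinMaxScaler on toy values: the minimum maps
# to 0, the maximum to 1, and inverse_transform recovers the original scale.
toy = numpy.array([[100.], [150.], [200.]], dtype='float32')
toy_scaler = MinMaxScaler(feature_range=(0, 1))
print(toy_scaler.fit_transform(toy))          # values become 0.0, 0.5, 1.0
print(toy_scaler.inverse_transform([[0.5]]))  # recovers 150.0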

ID:(13889, 0)



Generate the training and test data

Description

>Top


The initial array is split into one set of records for training and a second set for testing the generated model:

# generate train and test data 
train_size = int(len(dataset) * 0.50)
test_size = len(dataset) - train_size
train = dataset[0:train_size,:]
test = dataset[train_size:len(dataset),:]
print('train size: {}, test size: {} '.format(len(train), len(test)))
print(train)

train size: 71, test size: 71
[[0.01544401]
 [0.02702703]
 [0.05405405]
 ...
 [0.2992278 ]
 [0.24131274]
 [0.1911197 ]]
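
Note that the split is chronological, not random, which is what one wants for a time series. A quick sanity check (illustrative, not part of the original notebook) confirms the two slices are contiguous and cover the whole series:

# Sanity check: train and test are contiguous, chronological slices
# that together cover the full dataset.
assert len(train) + len(test) == len(dataset)
assert (train == dataset[:train_size]).all()
assert (test == dataset[train_size:]).all()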

ID:(13890, 0)



Create training segments

Description

>Top


The input sequences trainX and the values that follow them, trainY, are formed:

# creating a data structure with 10 timesteps and 1 output
time_stemp = 10
dataX = []
dataY = []
for i in range(len(train)-time_stemp-1):
    a = train[i:(i+time_stemp), 0]
    dataX.append(a)
    dataY.append(train[i + time_stemp, 0])
trainX = numpy.array(dataX)
trainY = numpy.array(dataY)
print('trainX:',trainX.shape[0],',',trainX.shape[1],' trainY:',trainY.shape[0])
print('trainX=',trainX)
print('trainY=',trainY)

trainX: 60 , 10  trainY: 60
trainX= [[0.01544401 0.02702703 0.05405405 0.04826255 0.03281853 0.05984557 0.08494207 0.08494207 0.06177607 0.02895753]
 ...
 [0.18725869 0.19305018 0.16216215 0.25289574 0.23745173 0.25096524 0.3088803 0.38223937 0.36486486 0.2992278 ]]
trainY= [0. 0.02702703 0.02123553 0.04247104 0.07142857 0.05984557
 ...
 0.25096524 0.3088803 0.38223937 0.36486486 0.2992278 0.24131274]
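
Since exactly the same loop is needed again for the test data in the next step, it could also be wrapped in a small helper; create_dataset below is a hypothetical refactor, not part of the original notebook, and produces the same trainX and trainY:

# Hypothetical helper building (segment, next value) pairs; equivalent
# to the loop above and reusable for the test data in the next step.
def create_dataset(series, time_stemp=10):
    dataX, dataY = [], []
    for i in range(len(series) - time_stemp - 1):
        dataX.append(series[i:(i + time_stemp), 0])
        dataY.append(series[i + time_stemp, 0])
    return numpy.array(dataX), numpy.array(dataY)

trainX, trainY = create_dataset(train, time_stemp)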

ID:(13891, 0)



Create test segments

Description

>Top


In the same way, the input sequences testX and the values that follow them, testY, are formed:

# creating a data structure with 10 timesteps and 1 output
dataX = []
dataY = []
for i in range(len(test)-time_stemp-1):
    a = test[i:(i+time_stemp), 0]
    dataX.append(a)
    dataY.append(test[i + time_stemp, 0])
testX = numpy.array(dataX)
testY = numpy.array(dataY)
print('testX:',testX.shape[0],',',testX.shape[1],' testY:',testY.shape[0])
print('testX=',testX)
print('testY=',testY)

testX: 60 , 10  testY: 60
testX= [[0.24131274 0.26640925 0.24903473 0.31467178 0.3185328 0.32046333 0.4073359 0.5019305 0.46911195 0.40154442]
 ...
 [0.4980695 0.58108103 0.6042471 0.554054 0.60810804 0.6891892 0.71042466 0.8320464 1. 0.96911204]]
testY= [0.32818535 0.25675675 0.3359073 0.34749034 0.33397684 0.41119692
 ...
 0.6891892 0.71042466 0.8320464 1. 0.96911204 0.7799227 ]

ID:(13892, 0)



Form the tensors

Description

>Top


To train the model, trainX and testX must be reshaped into the three-dimensional tensors (samples, timesteps, features) expected by the LSTM layer:

# reshape to the (samples, timesteps, features) form expected by the LSTM layer
trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
print('trainX=',trainX)
print('testX=',testX)

trainX= [[[0.01544401 0.02702703 0.05405405 0.04826255 0.03281853 0.05984557 0.08494207 0.08494207 0.06177607 0.02895753]]
 ...
 [[0.18725869 0.19305018 0.16216215 0.25289574 0.23745173 0.25096524 0.3088803 0.38223937 0.36486486 0.2992278 ]]]

testX= [[[0.24131274 0.26640925 0.24903473 0.31467178 0.3185328 0.32046333 0.4073359 0.5019305 0.46911195 0.40154442]]
 ...
 [[0.4980695 0.58108103 0.6042471 0.554054 0.60810804 0.6891892 0.71042466 0.8320464 1. 0.96911204]]]
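
The LSTM layer used below expects inputs of shape (samples, timesteps, features); with this reshape each window of 10 past values is presented as a single timestep with 10 features. The expected shapes follow from the 60 training and 60 test segments built above:

# Check the 3D shapes expected by the LSTM layer: (samples, timesteps, features)
print(trainX.shape)  # (60, 1, 10)
print(testX.shape)   # (60, 1, 10)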

ID:(13893, 0)



Define, build, and train the model

Description

>Top


Define, build, and train the model:

# model
paxs_model = Sequential()
paxs_model.add(LSTM(10, input_shape=(1, time_stemp))) # LSTM layer with 10 units
paxs_model.add(Dense(1))
paxs_model.compile(loss='mean_squared_error', optimizer='adam')
paxs_model.fit(trainX, trainY, epochs=50, batch_size=1)

Epoch 1/50
60/60 [==============================] - 9s 759us/step - loss: 0.0249
Epoch 2/50
60/60 [==============================] - 0s 710us/step - loss: 0.0041
Epoch 3/50
60/60 [==============================] - 0s 710us/step - loss: 0.0026
...
Epoch 48/50
60/60 [==============================] - 0s 727us/step - loss: 8.7224e-04
Epoch 49/50
60/60 [==============================] - 0s 733us/step - loss: 0.0010
Epoch 50/50
60/60 [==============================] - 0s 724us/step - loss: 0.0012
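
Because input_shape=(1, time_stemp) presents each window as a single timestep with 10 features, the LSTM here acts essentially as a learned nonlinear regressor over the window. Two quick checks that could be run at this point (output omitted) are the layer summary and the mean squared error on the still-scaled test segments:

# Layer names, output shapes and parameter counts of the trained model
paxs_model.summary()
# Mean squared error on the (still scaled) test segments
test_loss = paxs_model.evaluate(testX, testY, verbose=0)
print('test loss (MSE): {:.4f}'.format(test_loss))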

ID:(13894, 0)



Forecast passengers

Description

>Top


Predict the training data (trainPredict) and the test data (testPredict), invert the scaling, and compute the root mean squared error:

trainPredict = paxs_model.predict(trainX)
testPredict = paxs_model.predict(testX)
# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
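
The reported score is the root of the mean squared difference between actual and predicted values, sqrt(mean((y - ŷ)²)), expressed in passengers because both arrays were inverse-transformed first. A toy check of the formula with made-up numbers:

# Toy check of the RMSE formula (made-up values, not model output):
# differences -10, 5, -10 give mean squared error 75 and RMSE ≈ 8.66
y_true = numpy.array([100., 120., 140.])
y_pred = numpy.array([110., 115., 150.])
print(math.sqrt(mean_squared_error(y_true, y_pred)))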

ID:(13895, 0)



Compare Predicted Values With Actuals

Description

>Top


Finally, the predicted and actual values are displayed:

# shift train predictions for plotting
trainPredictPlot = numpy.empty_like(dataset)
trainPredictPlot[:, :] = numpy.nan
trainPredictPlot[time_stemp:len(trainPredict)+time_stemp, :] = trainPredict
# shifting test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(time_stemp*2)+1:len(dataset)-1, :] = testPredict
# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset), label='real')
plt.plot(trainPredictPlot, label='training')
plt.plot(testPredictPlot, label='forecast')
plt.xlabel('time')
plt.ylabel('Number of Passengers')
plt.title('International airline passengers')
plt.legend(loc='lower right')
plt.show()
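
To actually continue the series beyond the historical data, the last window of scaled values can be fed to the model and the result inverted back to passengers. This is a minimal one-step sketch, not part of the original notebook:

# One-step forecast beyond the data: take the last 10 scaled values,
# shape them as a single (1, 1, 10) tensor, predict and invert the scaling.
last_window = dataset[-time_stemp:, 0].reshape(1, 1, time_stemp)
next_scaled = paxs_model.predict(last_window)
next_passengers = scaler.inverse_transform(next_scaled)
print('next month forecast: %.0f passengers' % next_passengers[0, 0])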


ID:(13896, 0)