Predict Number of Active Cases by Covid19 Pandemic based on Medical Facilities (Volume of Testing, ICU beds, Ventilators, Isolation Units, etc) using Multivariate LSTM based MultiStep Forecasting Models
Introduction and Motivation
The intensity of the growth of the covid19 pandemic worldwide has propelled researchers to evaluate the best machine learning model that could the people affected in the distant future by considering the current statistics and predicting the near future terms in subsequent stages.
While different univariate models like ARIMA/SARIMA and traditional timeseries are capable of predicting Number of Active cases, daily recoveries, Number of deaths, they do not take into consideration the other timevarying factors like Medical Facilities (Volume of Testing, ICU beds, Hospital Admissions, Ventilators, Isolation Units, Quarantine Centres, etc).
As these factors become important we build a predictive model that can predict the Number of Active Cases, Deaths, and Recoveries based on the change in Medical Facilities as well as other changes in infrastructure.
Here in this blog, we try to model Multistep Time Series Prediction using Deep learning Models on the basis of Medical Information available for different states of India.
MultiStep Time Series Prediction
A typical multistep predictive model looks as the below figure, where each of the predicted outcomes from the previous state is treated as next state input to derive the outcome for the secondstate and so forth.
Deep Learningbased Multivariate Time Series Training and Prediction
The following figure illustrated the important steps involved in selecting the best deep learning model.
 Feeding Multivariate data from a single source or from aggregated sources available directly from the cloud or other 3rdparty providers into the ML modeling data ingestion system.
 Cleaning, preprocessing, and feature engineering of the data involving scaling and normalization.
 Conversion of the data to a supervised timeseries.
 Feeding the data to a deep learning training source that can train different timeseries models like LSTM, CNN, BILSTM, CNN+LSTM using different combinations of hidden layers, neurons, batchsize, and other hyperparameters.
 Forecasting based on near term or far distant term in future either using SingleStep or MultiStep Forecasting respectively
 Evaluation of some of the error metrics like (MAPE, MAE, ME, RMSE, MPE) by comparing it with the actual data, when it comes in
 Retraining the model and continuous improvements when the threshold of error exceeds.
Import Necessary Tensorflow Libraries
The code snippet gives an overview of the necessary libraries required for tensorflow.
from tensorflow.python.keras.layers import Dense, LSTM, RepeatVector,TimeDistributed,Flatten, Bidirectional from tensorflow.python.keras import Sequential from tensorflow.python.keras.layers.convolutional import Conv1D, Conv2D, MaxPooling1D,ConvLSTM2D
Data Loading and Selecting Features
As Delhi had high Covid19 cases, here we model different DL models for the “DELHI” State (National Capital of India). Further, we keep the scope of dates from 25th March to 6th June 2020. Data till 29th April has been used for Training, whereas from 30th April to 6th June has been used for testing/prediction. The test data has been used to predict for 7 days for 3 subsequent stages of prediction.
This code demonstrates the data is first split into a 70:30 ratio between training and testing (by finding the closest number to 7), where each set is then restructured to weekly samples of data.
def split_dataset(data):
# split into standard weeks
print(np.shape(data))
split_factor = int((np.shape(data)[0]*0.7))
print("Split Factor no is", split_factor)
m = 7
trn_close_no = closestNumber(split_factor, m)
te_close_no = closestNumber((np.shape(data)[0]split_factor), m)
train, test = data[0:trn_close_no], data[trn_close_no:(trn_close_no + te_close_no)]
print("Initials TrainTest Split ", np.shape(train), np.shape(test))
len_train = np.shape(train)[0]
len_test = np.shape(test)[0]
# restructure into windows of weekly data
train = array(split(train[0:len_train], len(train[0:len_train]) / 7))
test = array(split(test, len(test) / 7))
print("Final TrainTest Split ", np.shape(train), np.shape(test))
return train, test
Initials TrainTest Split  (49, 23) (21, 23)  Training and Test DataSet Final TrainTest Split  (7, 7, 23) (3, 7, 23)  Arrange Train and Test DataSet into 7 and 3 weekly samples respecytively.
The data set and the features have been scaled using MinMax Scaler.
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_dataset = scaler.fit_transform(dataset)
Convert TimeSeries to a Supervised DataSet
The tricky part in converting the timeseries to a supervised timeseries for multistep prediction lies in incorporating the number of past days (i.e. the historic data) that the weekly data has to consider.
The series derived by considering historic data is considered 7 times during training iterations and 3 times during testing iterations (as it got split as (7,7,23) and (7,3,23), where 22 is the number of input features with one predicted output). This series built using historic data helps the model to learn and predict any day of the week.
Note 1: This is the most important step of formulating a timeseries data to a multistep model
The below snippet code demonstrates what is described above.
# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=7):
# flatten data
data = train.reshape((train.shape[0] * train.shape[1], train.shape[2]))
X, y = list(), list()
in_start = 0
# step over the entire history one time step at a time
for _ in range(len(data)):
# define the end of the input sequence
in_end = in_start + n_input
out_end = in_end + n_out
# ensure we have enough data for this instance
if out_end <= len(data):
X.append(data[in_start:in_end, :])
y.append(data[in_end:out_end, 0])
# move along one time step
in_start += 1
return array(X), array(y)
Training Different Deep Learning Models using Tensorflow
In this section, we describe how we train different DL models using Tensorflow’s Keras APIs.
Convolution Neural Network (CNN Model)
The following figure recollects the structure of a Convolution Neural Network (CNN) with a code snippet showing how a 1D CNN with 16 filters, with a kernel size of 3 has been used to train the network over 7 steps, where each 7 step is of 7 days.
# train CNN model
def build_model_cnn(train, n_input):
# prepare data
train_x, train_y = to_supervised(train, n_input)
# define parameters
verbose, epochs, batch_size = 0, 200, 4
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
# define model model = Sequential()
model.add(Conv1D(filters=16, kernel_size=3, activation='relu', input_shape=(n_timesteps,n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(10, activation='relu'))
model.add(Dense(n_outputs))
model.compile(loss='mse', optimizer='adam')
# fit network
model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
return model
LSTM
The following code snippet demonstrates how we train an LSTM model, plot the training and validation loss, before making a prediction.
# train LSTM model
def build_model_lstm(train, n_input):
# prepare data
train_x, train_y = to_supervised(train, n_input)
print(np.shape(train_x))
print(np.shape(train_y))
# define parameters
verbose, epochs, batch_size = 0, 50, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
# reshape output into [samples, timesteps, features]
train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
# define model
model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
model.add(RepeatVector(n_outputs))
model.add(LSTM(200, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(100, activation='relu')))
model.add(TimeDistributed(Dense(1)))
model.compile(loss='mse', optimizer='adam')
# fit network
model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
return model
The below figure illustrates the Actual vs Predicted Outcome of the MultiStep LSTM model after the predicted outcome has been inversetransformed (to remove the effect of scaling).
BiDirectional LSTM
The following code snippet demonstrates how we train a BILSTM model, plot the training and validation loss, before making a prediction.
# train BiDirectionsl LSTM model
def build_model_bi_lstm(train, n_input):
# prepare data
train_x, train_y = to_supervised(train, n_input)
print(np.shape(train_x))
print(np.shape(train_y))
# define parameters
verbose, epochs, batch_size = 0, 50, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]# reshape output into [samples, timesteps, features]
train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
# define model
model = Sequential()
model.add(Bidirectional(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features))))
model.add(RepeatVector(n_outputs))
model.add(Bidirectional(LSTM(200, activation='relu', return_sequences=True)))
model.add(TimeDistributed(Dense(100, activation='relu')))
model.add(TimeDistributed(Dense(1)))
model.compile(loss='mse', optimizer='adam')
# fit network
model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
return model
The below figure illustrates the Actual vs Predicted Outcome of MultiStep BiLSTM model after the predicted outcome has been inversetransformed (to remove the effect of scaling).
Stacked LSTM + CNN
Here we have used Conv1d with TimeDistributed Layer, which is then fed to a single layer of LSTM, to predicted different sequences, as illustrated by the figure below. The CNN model is built first, then added to the LSTM model by wrapping the entire sequence of CNN layers in a TimeDistributed layer.
# train Stacked CNN + LSTM model
def build_model_cnn_lstm(train, n_input):
# prepare data
train_x, train_y = to_supervised(train, n_input)
# define parameters
verbose, epochs, batch_size = 0, 500, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1] # reshape output into [samples, timesteps, features]
train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps, n_features)))
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(RepeatVector(n_outputs))
model.add(LSTM(200, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(100, activation='relu')))
model.add(TimeDistributed(Dense(1)))
model.compile(loss='mse', optimizer='adam')
# fit network
model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
return model
The prediction and inverse scaling help to yield the actual predicted outcomes, as illustrated below.
MultiStep Forecasting and Evaluation
The below snippet states how the data is properly reshaped into (1, n_input, n) to forecast for the following week. For the multivariate timeseries (of 23 features) with test data of 23 samples (with predicted output from previous steps i.e. 21+2) for 3 weeks is reshaped from (7,7,23), (8,7,23) and (9,7,23) as (49,23), (56,23) and (63, 23)
# make a forecast
def forecast(model, history, n_input):
# flatten data
data = array(history)
data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
# retrieve last observations for input data
input_x = data[n_input:, :]
# reshape into [1, n_input, n]
input_x = input_x.reshape((1, input_x.shape[0], input_x.shape[1]))
# forecast the next week
yhat = model.predict(input_x, verbose=0)
# we only want the vector forecast
yhat = yhat[0]
return yhat
Note 2: If you wish to see the evaluation results and plots for each step as stated below, please check the notebook at Github (https://github.com/sharmi1206/covid19analysis Notebook ts_dlearn_mstep_forecats.ipynb)
Here at each step at the granularity of every week, we evaluate the model and compare it against the actual output.
# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual, predicted):
print("Actual Results", np.shape(actual))
print("Predicted Results", np.shape(predicted))
scores = list() # calculate an RMSE score for each day
for i in range(actual.shape[1]):
# calculate mse
mse = mean_squared_error(actual[:, i], predicted[:, i])
# calculate rmse
rmse = sqrt(mse)
# store
scores.append(rmse)
plt.figure(figsize=(14, 12))
plt.plot(actual[:, i], label='actual')
plt.plot(predicted[:, i], label='predicted')
plt.title(ModelType + ' based MultiStep Time Series Active Cases Prediction for step ' + str(i))
plt.legend()
plt.show()
# calculate overall RMSE
s = 0
for row in range(actual.shape[0]):
for col in range(actual.shape[1]):
s += (actual[row, col]  predicted[row, col]) ** 2
score = sqrt(s / (actual.shape[0] * actual.shape[1]))
return score, scores # evaluate a single model
def evaluate_model(train, test, n_input):
model = None
# fit model
if(ModelType == 'LSTM'):
print('lstm')
model = build_model_lstm(train, n_input)
elif(ModelType == 'BI_LSTM'):
print('bi_lstm')
model = build_model_bi_lstm(train, n_input)
elif(ModelType == 'CNN'):
print('cnn')
model = build_model_cnn(train, n_input)
elif(ModelType == 'LSTM_CNN'):
print('lstm_cnn')
model = build_model_cnn_lstm(train, n_input)
# history is a list of weekly data
history = [x for x in train]
# walkforward validation over each week
predictions = list()
for i in range(len(test)):
# predict the week
yhat_sequence = forecast(model, history, n_input)
# store the predictions
predictions.append(yhat_sequence)
# get real observation and add to history for predicting the next week
history.append(test[i, :])
# evaluate predictions days for each week
predictions = array(predictions)
score, scores = evaluate_forecasts(test[:, :, 0], predictions)
return score, scores, test[:, :, 0], predictions
Here we show a univariate and multivariate, multistep timeseries prediction.
MultiStep Conv2D + LSTM (Univariate & MultiVariate) based Prediction for State Delhi
A type of CNNLSTM is the ConvLSTM (primarily for twodimensional spatialtemporal data), where the convolutional reading of input is built directly into each LSTM unit.
Here for this particular univariate time series, we have the input vector as
[timesteps=14, rows=1, columns=7, features=2 (input and output)]
# train CONV LSTM2D model
def build_model_cnn_lstm_2d(train, n_steps, n_length, n_input):
# prepare data
train_x, train_y = to_supervised_2cnn_lstm(train, n_input)
# define parameters
verbose, epochs, batch_size = 0, 750, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
# reshape into subsequences [samples, time steps, rows, cols, channels]
train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))
# reshape output into [samples, timesteps, features]
train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
# define model
model = Sequential()
model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu', input_shape=(n_steps, 1, n_length, n_features)))
model.add(Flatten()) model.add(RepeatVector(n_outputs))
model.add(LSTM(200, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(100, activation='relu')))
model.add(TimeDistributed(Dense(1))) model.compile(loss='mse', optimizer='adam')
# fit network
model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
return model
# convert history into inputs and outputs
def to_supervised_2cnn_lstm(train, n_input, n_out=7):
# flatten data data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
X, y = list(), list()
in_start = 0
# step over the entire history one time step at a time
for _ in range(len(data)):
# define the end of the input sequence
in_end = in_start + n_input
out_end = in_end + n_out
# ensure we have enough data for this instance
if out_end <= len(data):
x_input = data[in_start:in_end, 0]
x_input = x_input.reshape((len(x_input), 1))
X.append(x_input)
y.append(data[in_end:out_end, 0])
# move along one time step
in_start += 1
return array(X), array(y)
# make a forecast def forecast_2cnn_lstm(model, history, n_steps, n_length, n_input):
# flatten data data = array(history)
data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
# retrieve last observations for input data
input_x = data[n_input:, 0]
# reshape into [samples, time steps, rows, cols, channels]
input_x = input_x.reshape((1, n_steps, 1, n_length, 1))
# forecast the next week
yhat = model.predict(input_x, verbose=0)
# we only want the vector forecast
yhat = yhat[0]
return yhat
# evaluate a single model
def evaluate_model_2cnn_lstm(train, test, n_steps, n_length, n_input):
# fit model
model = build_model_cnn_lstm_2d(train, n_steps, n_length, n_input)
# history is a list of weekly data
history = [x for x in train]
# walkforward validation over each week
predictions = list()
for i in range(len(test)):
# predict the week
yhat_sequence = forecast_2cnn_lstm(model, history, n_steps, n_length, n_input)
# store the predictions
predictions.append(yhat_sequence)
# get real observation and add to history for predicting the next week
history.append(test[i, :])
# evaluate predictions days for each week
predictions = array(predictions)
score, scores = evaluate_forecasts(test[:, :, 0], predictions)
return score, scores, test[:, :, 0], predictions
Reading Statewise data and Indexing time columns:
df_state_all = pd.read_csv('all_states/all.csv')
df_state_all = df_state_all.drop(columns=['Latitude', 'Longitude', 'index'])
stateName = unique_states[8]
dataset = df_state_all[df_state_all['Name of State / UT'] == unique_states[8]]
dataset = dataset.sort_values(by='Date', ascending=True)
dataset = dataset[(dataset['Date'] >= '20200325') & (dataset['Date'] <= '20200606')]
print(np.shape(dataset))
daterange = dataset['Date'].values
no_Dates = len(daterange)
dateStart = daterange[0] dateEnd = daterange[no_Dates  1]
print(dateStart) print(dateEnd)
dataset = dataset.drop(columns=['Unnamed: 0', 'Date', 'source1', 'state', 'Name of State / UT', 'tagpeopleinquarantine', 'tagtotaltested'])
print(np.shape(dataset)) n = np.shape(dataset)[0] scaler = MinMaxScaler(feature_range=(0, 1)) scaled_dataset = scaler.fit_transform(dataset)
# split into train and test
train, test = split_dataset(scaled_dataset)
# define the number of subsequences and the length of subsequences
n_steps, n_length = 2, 7
# define the total days to use as input
n_input = n_length * n_steps
score, scores, actual, predicted = evaluate_model_2cnn_lstm(train, test, n_steps, n_length, n_input)
# summarize scores
summarize_scores(ModelType, score, scores)
The model parameters can be summarized as :
The evaluate_model function appends the model forecasting score at each step and returns it at the end.
The below figure illustrates the Actual vs Predicted Outcome of MultiStep ConvLSTM2D model after the predicted outcome has been inversetransformed (to remove the effect of scaling).
For multivariate time series with 22 input features and one output prediction, we take into consideration the following changes: In function forecast_2cnn_lstm we replace the input data shaping to constitute the multivariate features
#In function forecast_2cnn_lstm input_x = data[n_input:, :].
#replacing 0 with : # reshape into [samples, time steps, rows, cols, channels]
input_x = input_x.reshape((1, n_steps, 1, n_length, data.shape[1]))
#replacing 1 with #data.shape[1] for multivariate
Further, in function to_supervised_2cnn_lstm, we replace x_input’s feature size from 0 to : and 1 with 23 features as follows:
x_input = data[in_start:in_end, :]
x_input = x_input.reshape((len(x_input), x_input.shape[1]))
Conv2D + BI_LSTM
We can further try out BiDirectional LSTM with a 2D Convolution Layer as depicted in the figure below. The model stacking and subsequent layers remain the same as tried in the previous step, with the exception of using a BILSTM in place of a single LSTM.
Comparison of Model Metrics on test data set
Deep Learning Method  RMSE 
LSTM  912.224 
BI LSTM  1317.841 
CNN  1021.518 
LSTM + CNN  891.076 
Conv2D + LSTM (UniVariate SingleStep)  1288.416 
Conv2D + LSTM (MultiVariate MultiStep)  863.163 
Conclusion
In this blog, I have discussed multistep timeseries prediction using deep learning mechanisms and compared/evaluated them based on RMSE. Here, we notice that for a forecasting timeperiod of 7 days stacked ConvLSTM2D works the best, followed by LSTM with CNN, CNN, and LSTM networks. More extensive model evaluation with different hidden layers and neurons with efficient hyperparameter tuning can further improve accuracy.
Though we see the model accuracy decreases for multistep models, this can be a useful tool for having long term forecasts where predicted outcomes in the previous week help in playing a dominant role on predicted outputs.
For complete source code check out https://github.com/sharmi1206/covid19analysis
Acknowledgments
Special thanks to machinelearningmastery.com. as some of the concepts have been taken from there.
References
 https://arxiv.org/pdf/1801.02143.pdf

https://github.com/sharmi1206/covid19analysis
 https://machinelearningmastery.com/multisteptimeseriesforecasting/
 https://machinelearningmastery.com/multisteptimeseriesforecastingwithmachinelearningmodelsforhouseholdelectricityconsumption/
 https://machinelearningmastery.com/howtodeveloplstmmodelsformultisteptimeseriesforecastingofhouseholdpowerconsumption/
 https://machinelearningmastery.com/converttimeseriessupervisedlearningproblempython/
 https://www.tensorflow.org/tutorials/structured_data/time_series
 https://www.aiproblog.com/index.php/2018/11/13/howtodeveloplstmmodelsfortimeseriesforecasting/