THOMPSON RIVERS UNIVERSITY

A Novel Data-Driven Traffic Prediction Model for Lions Gate Bridge Traffic Management

By Zijun Ma

A PROJECT SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in Data Science

KAMLOOPS, BRITISH COLUMBIA
August, 2024

SUPERVISOR
Dr. Erfanul Hoque

© Zijun Ma, 2024

ABSTRACT

This work focuses on predicting traffic flow on the Lions Gate Bridge adjacent to Stanley Park in Vancouver, Canada, employing a novel hybrid model. The bridge serves as a vital route for commuters from northern Vancouver to the city center and carries substantial daily traffic, estimated at more than 60,000 vehicles on workdays, leading to peak congestion during the morning and afternoon commutes. An accurate traffic prediction model is therefore urgently needed. The aim of this work is to address this issue, providing urban planners with insights for more effective traffic planning near the Lions Gate Bridge to alleviate congestion. To that end, we propose a novel data-driven hierarchical forecast combination model to enhance the accuracy of traffic flow predictions. Traffic-related data for this project are sourced from the Ministry of Transportation and Infrastructure of BC, while climate-related datasets are obtained from Environment Canada. The results demonstrate the superior performance of the proposed model compared to conventional forecasting models.

Key Words: Deep Learning, Forecast Combination, High Dimensional Data, Machine Learning, Regression Models, Traffic Volume Forecasting, Time Series Models.

ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my supervisor, Dr. Erfanul Hoque, for his invaluable guidance, support, and encouragement throughout the course of this project. His insights and expertise were instrumental in shaping the direction and outcome of this research. I would also like to extend my sincere thanks to my second reader, Dr.
Sean Hellingman, for his thoughtful feedback and valuable suggestions, which significantly contributed to the refinement and quality of this work. Additionally, I am grateful to Thompson Rivers University for providing the necessary resources and institutional support that made this work possible. The academic environment and facilities at TRU have greatly contributed to the successful completion of this project. Finally, I would like to acknowledge everyone who has supported me throughout this journey. Your encouragement and belief in my abilities have been a constant source of motivation.

Contents

1 Introduction
  1.1 Background
  1.2 Problem Statement
  1.3 Project Objectives
  1.4 Significance of the Study
  1.5 Organization of the Project
2 Literature Review
  2.1 Time Series Methods
  2.2 Machine Learning Methods
  2.3 Deep Learning Methods
  2.4 Forecast Combination Methods
3 Methodology
  3.1 Linear Models
    3.1.1 ARIMA
    3.1.2 Dynamic Regression Model
    3.1.3 Exponential Smoothing Model
    3.1.4 Support Vector Regression (SVR)
  3.2 Non-Linear Models
    3.2.1 K-Nearest Neighbor (KNN)
    3.2.2 Long Short-Term Memory (LSTM)
    3.2.3 Neural Network Autoregression (NNETAR)
  3.3 Proposed Model: Hierarchical Forecast Combination (HFC) Model
    3.3.1 Proposed Data-Driven HFC Models
    3.3.2 Strategies to Determine Weights
    3.3.3 Steps for HFC Model
4 Experiments
  4.1 Dataset Description
  4.2 Traffic Data Collection
    4.2.1 Hourly Traffic Volume Data (2022)
    4.2.2 Daily Traffic Volume Data (2017-2022)
  4.3 Climate Data Collection
    4.3.1 Hourly Climate Data (2022)
    4.3.2 Daily Climate Data (2017-2022)
  4.4 Traffic and Climate Dataset
  4.5 Parameter Estimation
    4.5.1 ARIMA
    4.5.2 Dynamic Regression Model
    4.5.3 Exponential Smoothing Model
    4.5.4 Support Vector Regression (SVR)
    4.5.5 K-Nearest Neighbor (KNN)
    4.5.6 Long Short-Term Memory (LSTM)
    4.5.7 Neural Network Autoregression (NNETAR)
  4.6 Evaluation Metric
    4.6.1 RMSE
    4.6.2 MAPE
  4.7 Performance Comparison
    4.7.1 Results of 2022 Hourly Both Directions Dataset
    4.7.2 Results of 2022 Hourly Positive Direction Dataset
    4.7.3 Results of 2022 Hourly Negative Direction Dataset
    4.7.4 Results of 2017-2022 Daily Dataset
5 Conclusions
A Climate Data Collection
  A.1 Hourly Climate Data (2022)
  A.2 Daily Climate Data (2017-2022)
B Linear and Non-linear Model Selection
  B.1 2022 Hourly Both Directions Dataset
    B.1.1 Weekend
    B.1.2 Holiday
  B.2 2022 Hourly Positive Direction Dataset
    B.2.1 Weekday
    B.2.2 Weekend
    B.2.3 Holiday
  B.3 2022 Hourly Negative Direction Dataset
    B.3.1 Weekday
    B.3.2 Weekend
    B.3.3 Holiday
  B.4 2017-2022 Daily Dataset
    B.4.1 Weekday

List of Figures

3.1 SVR
3.2 Basic RNN structure (Source: Zhu [2019])
3.3 LSTM
3.4 The structure of neural network
3.5 Basic steps of the proposed HFC model
4.1 Process of traffic data collection
4.2 Hourly traffic volume data plot for February 2022
4.3 Hourly traffic volume plot from 2022-01-10 to 2022-01-14 (Single Day)
4.4 Hourly traffic volume plot from 2022-01-10 to 2022-01-14
4.5 Comparison of average hourly traffic volume between holiday, weekday and weekend
4.6 Plot of 2017-2022 daily traffic volume data
4.7 Daily traffic volume plot from 2018-03-05 to 2018-04-01
4.8 Distribution of daily traffic volume data for each day of the week
4.9 Process of climate data collection
4.10 Process of 2022 hourly climate data collection
4.11 Process of 2017-2022 daily climate data collection
4.12 Forecast plot for proposed HFC model and four individual models (2022 Hourly Both Directions - weekday)
4.13 Forecast plots for proposed models and the combination models (2022 Hourly Both Directions - weekend)
4.14 Forecast plots for proposed models and the combination models (2022 Hourly Both Directions - holiday)
4.15 Forecast plot for proposed HFC model and four individual models (2022 Hourly Positive Direction - weekday)
4.16 Forecast plots for proposed models and the combination models (2022 Hourly Positive Direction - weekend)
4.17 Forecast plots for proposed models and the combination models (2022 Hourly Positive Direction - holiday)
4.18 Forecast plots for proposed models and individual models (2022 Hourly Negative Direction - weekday)
4.19 Forecast plots for proposed models and individual models (2022 Hourly Negative Direction - weekend)
4.20 Forecast plots for proposed models and individual models (2022 Hourly Negative Direction - holiday)
4.21 Forecast plot for proposed HFC model and four individual models (2017-2022 Daily - weekday)
A.1 Comparison plot of 2022 hourly temperature data for different stations
A.2 Comparison plot of 2022 hourly precipitation data for different stations
A.3 Comparison plot of 2022 hourly wind speed data for different stations
A.4 Plot of 2022 hourly temperature average data
A.5 Plot of 2022 hourly precipitation average data
A.6 Plot of 2022 hourly wind speed average data
A.7 Comparison plot of 2017-2022 daily temperature data
A.8 Comparison plot of 2017-2022 daily precipitation data
A.9 Plot of 2017-2022 daily temperature average data
A.10 Plot of 2017-2022 daily precipitation average data

List of Tables

4.1 Linear Model Results
4.2 Non-Linear Model Results
4.3 The comparison table of four individual forecast models with proposed HFC models (2022 hourly both directions - weekday)
4.4 The comparison table of four individual forecast models with proposed HFC model (2022 hourly both directions)
4.5 Weekend
4.6 Holiday
4.7 Comparison of four individual forecast models with proposed HFC model (2022 hourly positive direction)
4.8 Weekday
4.9 Weekend
4.10 Holiday
4.11 The comparison table of four individual forecast models with proposed HFC models (2022 hourly negative direction)
4.12 Weekday
4.13 Weekend
4.14 Holiday
4.15 The comparison table of four individual forecast models with proposed HFC models (2017-2022 daily - weekday)
A.1 Missing data for 2022 hourly temperature data (8760 rows)
A.2 Missing data for 2022 hourly precipitation data (8760 rows)
A.3 Missing data for 2022 hourly wind speed data (8760 rows)
A.4 Missing data for 2022 hourly climate data (8760 rows)
A.5 Missing data for 2017-2022 daily climate data (2191 rows)
B.1 Linear Model Results
B.2 Non-Linear Model Results
B.3 Linear Model Results
B.4 Non-Linear Model Results
B.5 Linear Model Results
B.6 Non-Linear Model Results
B.7 Linear Model Results
B.8 Non-Linear Model Results
B.9 Linear Model Results
B.10 Non-Linear Model Results
B.11 Linear Model Results
B.12 Non-Linear Model Results
B.13 Linear Model Results
B.14 Non-Linear Model Results
B.15 Linear Model Results
B.16 Non-Linear Model Results
B.17 Linear Model Results
B.18 Non-Linear Model Results
Chapter 1
Introduction

1.1 Background

Since the beginning of the twentieth century, the industrialization of automobiles has led to an increase in the demand for cars. For many families, owning a car is now a necessity. However, this has created a problem of traffic congestion in major cities due to the high number of automobiles on the road. Traffic congestion is not just a localized issue but a global challenge that affects cities worldwide. The increase in vehicles on the road leads to extended travel times, increased fuel consumption, and elevated levels of greenhouse gas emissions. According to the 2023 INRIX Global Traffic Scorecard (INRIX), cities such as New York City, London, Paris, and Mexico City experience some of the world's worst traffic congestion, with commuters spending an average of over 200 hours per year in traffic jams. This situation has significant economic implications, as traffic congestion is estimated to cost billions of dollars annually in lost productivity, higher fuel costs, and increased environmental degradation. Moreover, the health impact cannot be ignored; prolonged exposure to vehicular emissions contributes to respiratory and cardiovascular diseases, posing a serious public health concern. These issues are clearly evident in many urban centers around the world, where traffic congestion has become a daily reality. One example is the famed Times Square, situated on Sixth Avenue in New York City, which runs alongside a variety of stores, eateries, entertainment venues, and skyscrapers as it traverses the business and office districts of Manhattan. At the same time, Manhattan is a thriving economic hub, with droves of commuters flooding in every morning and evening to reach their workplaces. As a result, they need to traverse the city, often via Sixth Avenue, leading to heavy traffic congestion on the roads. Another example can be found in renowned scenic areas such as Stanley Park in Vancouver, Canada.
This extensive metropolitan park is situated on a peninsula that borders the Pacific Ocean, approximately 1.5 kilometers northwest of downtown Vancouver. It is among the largest urban parks in Canada and a popular tourist destination in Vancouver. The park includes vast areas of natural scenery, hiking trails, biking paths, diverse wildlife, and multiple tourist sites, such as the Lions Gate Bridge and the Vancouver Aquarium (City of Vancouver). During the summer season and holidays, particularly on sunny weekends, Stanley Park tends to draw a substantial number of tourists and local residents, and the surrounding roads during these periods may experience heavier traffic congestion.

1.2 Problem Statement

Among these surrounding roads is the Lions Gate Bridge, a crucial piece of infrastructure that links downtown Vancouver with West Vancouver and North Vancouver. The bridge serves as a key route for commuters traveling from northern Vancouver to downtown. Additionally, it provides North Shore residents with a direct path to Stanley Park, reducing the need for them to detour through downtown Vancouver. The Lions Gate Bridge handles over 60,000 vehicles every workday (Association of Consulting Engineering Companies British Columbia), resulting in severe traffic congestion during the morning and evening rush hours. Given its strategic importance in the region's transportation network, any traffic congestion on the Lions Gate Bridge can have significant repercussions for regional economic activity. It extends commute times and increases fuel consumption, which in turn exacerbates greenhouse gas emissions. The efficient functioning of this bridge is therefore critical not only for maintaining smooth traffic flow but also for supporting the broader economic and environmental health of the region.
Reducing traffic congestion is therefore of particular importance, and many cities have experimented with improving public transportation systems, encouraging carpooling, implementing traffic management measures, building bicycle lanes and pedestrian walkways, and promoting sustainable modes of transportation. However, despite these measures, the ever-growing demand for transportation continues to outpace the capacity of these solutions, rendering them less effective in fully alleviating traffic pressures. This limitation underscores the need for more advanced and adaptive strategies to manage traffic congestion effectively. In response, technology is playing an increasingly important role in traffic management. Intelligent Transportation Systems (ITS) have been developed to manage traffic flow more efficiently using data-driven approaches. These systems utilize information technology, communication technology, and sensor technology to optimize the transportation network (WSPglobal). An important component of ITS is traffic data analysis, which allows for the forecasting of traffic patterns. By accurately predicting traffic flow, cities can implement preemptive measures such as adjusting traffic signals or diverting traffic to reduce congestion (Laña et al. [2021]). For example, by forecasting traffic flow on the Lions Gate Bridge, we can analyze the traffic situation at a given time in the near future; if congestion is likely, traffic signals can be adjusted or personnel mobilized to manage traffic detours. Forecasting traffic flow is especially important on special days such as holidays and during major events. In this project, we aim to develop a forecast model specifically designed to predict traffic flow on the Lions Gate Bridge. This research is motivated by the pressing need to alleviate traffic congestion during peak periods, especially on holidays and weekends.
1.3 Project Objectives

The main objectives of this project are to:

(a) utilize traditional time series models, machine learning models, and deep learning models to predict the traffic flow on the Lions Gate Bridge, and conduct a comprehensive evaluation to determine which model performs best under various conditions;

(b) identify and analyze the significant factors that influence traffic flow on the Lions Gate Bridge;

(c) propose a novel data-driven forecast combination model to enhance the accuracy of traffic flow predictions;

(d) conduct a comparative analysis of the proposed model against traditional time series models, machine learning models, and deep learning models.

1.4 Significance of the Study

Recognizing the Lions Gate Bridge as a crucial link not only for the daily commute of thousands of residents but also for the economic and social activities of the region, it is evident that it serves as a primary route for weekday commuters and plays a significant role in the transportation network. Additionally, as the main access point to Stanley Park, it draws tourists from various locations during holidays. Thus, the Lions Gate Bridge serves not only as a vital transportation artery but also as an essential part of the tourist infrastructure. Given its strategic importance, we propose a data-driven forecast combination model aimed at forecasting traffic flow on the Lions Gate Bridge. This model leverages the strengths of both traditional and modern predictive techniques, offering a more accurate and adaptable forecasting tool than traditional models alone. This approach not only helps government authorities promptly implement traffic management measures to reduce congestion and ensure smooth urban traffic but also provides a reliable predictive tool for future transportation needs. However, the usefulness of such a predictive model goes beyond these immediate applications.
By providing accurate traffic flow predictions, the model supports long-term urban planning initiatives, enabling authorities to design more resilient infrastructure that can adapt to future demands. This contributes to sustainable urban growth, reducing environmental impact and improving the quality of life for residents. By comprehensively understanding traffic flow trends and patterns, authorities can better plan road construction, public transportation systems, and urban layouts to meet future transportation needs, thereby enhancing urban sustainability and development quality. Given the anticipated increase in both commuter and tourist traffic, especially with ongoing urban expansion, the significance of this study lies in its potential to offer actionable insights that can preemptively address traffic challenges before they escalate, ensuring the continued vitality of Vancouver's transportation network.

1.5 Organization of the Project

The remainder of this project is structured as follows: Chapter 2 reviews related work on time series prediction in the traffic domain. Chapter 3 explains the various linear and non-linear models we utilized and introduces our proposed data-driven traffic prediction model. Chapter 4 discusses the experimental results in detail. Finally, Chapter 5 concludes the study by summarizing the key points and contributions of this project.

Chapter 2
Literature Review

In the field of traffic flow prediction, methods can be broadly classified into three categories: time series methods, machine learning methods, and deep learning methods. Additionally, we review literature on forecast combination methods related to our proposed model.

2.1 Time Series Methods

Time series methods such as autoregressive integrated moving average (ARIMA), exponential smoothing, and regression are commonly used.
In the study by Kumar [2022], an ARIMA model with optimized hyperparameters was used to forecast traffic flow using the PeMS dataset from a CalTrans sensor station. The results showed good performance metrics for the whole-day, morning, and evening datasets, respectively, and suggested that integrating seasonal ARIMA (SARIMA) and Box-Cox transformations could further enhance prediction accuracy. The exponential smoothing method has also been employed to preprocess traffic flow data. Chan [2011a] proposed a neural network development approach based on an exponential smoothing method to enhance previously used neural networks (NN) for traffic flow forecasting. The exponential smoothing method was applied to preprocess traffic flow data before training the NN. The preprocessed traffic flow data, with reduced non-smoothness, discontinuity, and lumpiness compared to the original data, was more suitable for neural network training. This approach was evaluated by forecasting real-time traffic conditions on a section of freeway in Western Australia. Regarding training errors, the neural network models developed using this approach achieved more than a 20% improvement over those developed using the original traffic flow data, indicating a significant enhancement in fitting traffic flow data. Regarding testing errors, these models achieved more than an 8% improvement, indicating improved generalization capability in traffic flow forecasting. In another study, Chan [2011b] proposed a novel NN training method that combines a hybrid exponential smoothing method and the Levenberg-Marquardt (LM) algorithm to improve the generalization capabilities of NN training methods for short-term traffic flow forecasting. The method preprocesses traffic flow data by removing lumpiness, followed by applying a variant of the LM algorithm to train the NN model's weights.
The smoother and more continuous preprocessed data aids NN training. This method was evaluated by forecasting short-term traffic flow conditions on the Mitchell Freeway in Western Australia, with NN models developed using this method outperforming those developed using other algorithms specifically designed for short-term traffic flow forecasting or for enhancing NN generalization capabilities. Meanwhile, Roh et al. [2019] utilized multiple linear regression to predict daily traffic volume factors, focusing on the effects of severe weather conditions on travel demand in Alberta, Canada. Their findings revealed that weather conditions affect passenger car traffic volume significantly more than truck traffic, providing valuable insights into modeling traffic flows under severe winter weather conditions.

2.2 Machine Learning Methods

Machine learning methods encompass a variety of techniques such as support vector machines, random forests, artificial neural networks (ANN), and k-nearest neighbors (KNN). Lippi et al. [2013] introduced a new supervised learning algorithm based on support vector regression (SVR), incorporating a seasonal kernel in SVR to capture the seasonal characteristics of traffic data. Their study, conducted using experimental traffic flow data from the California Performance Measurement System (PeMS) for freeways, demonstrated that seasonality plays a pivotal role in achieving high accuracy. Moreover, they found that their proposed seasonal kernel method strikes a reasonable balance between prediction accuracy and computational complexity. Bratsas et al. [2019] conducted a study comparing the predictive effectiveness of machine learning models, including Random Forest, SVR, Multi-Layer Perceptron, and Multiple Linear Regression, on probe data collected from the road network in Thessaloniki, Greece.
Their experimental results indicated that the SVR model performed best under stable conditions, while the Multi-Layer Perceptron model excelled in the face of greater variability, demonstrating the closest approximation to zero error. Tempelmeier [2019] addressed the challenge of building reliable supervised models (SVR, KNN, ridge regression) for predicting the spatial and temporal impact of planned special events on road traffic, focusing on the effectiveness of various features derived from historical data. Their evaluation, based on real-world event data from several venues in the Hannover region, Germany, demonstrates that their models, using event-, mobility-, and infrastructure-related features, outperform both event-based and event-agnostic baselines in accurately predicting the impact of planned special events on urban traffic, providing valuable insights for traffic management during such events. Ata [2020] proposed a TCC-SVM system model to analyze traffic congestion in a smart city environment using an ML-enabled IoT-based road traffic congestion control system, which notifies users of congestion at specific points. Data collected from Data Mill North via the internet, consisting of weather and traffic flow at 10-minute intervals, was utilized. The performance of the proposed TCC-SVM model is shown to be superior to previous approaches by Tamimi and Zahoor (2010), Pushpi and Dilip Kumar (2018), and Ayesha et al. (2019). However, the research is limited by potential time delays and increased data complexity.

2.3 Deep Learning Methods

Deep learning methods have seen increasing popularity in recent years for traffic flow prediction. These methods include Long Short-Term Memory (LSTM) networks, gated recurrent units (GRU), NN, and various hybrid models. Fu et al.
[2016] explored the use of two deep learning methods, LSTM and GRU NN, for short-term traffic flow prediction using the California Caltrans Performance Management System (PeMS) dataset. Their comparative experiments demonstrated the superior performance of these deep learning methods over traditional ARIMA models. Additionally, Kang et al. [2017] investigated LSTM for short-term traffic flow prediction, introducing a wider range of input information including flow rate, occupancy, speed, and neighboring traffic flow. Experimental results show that acceptable traffic flow prediction performance can be obtained using only traffic flow as an input variable; however, prediction performance may improve when traffic flow is combined with occupancy or speed as input variables. Zhao et al. [2019] proposed a novel hybrid neural network called TreNet, integrating the advantages of convolutional neural networks (CNN) and LSTM for time series traffic flow prediction. Similarly, Du et al. [2017] presented a hybrid deep learning framework combining recurrent neural networks (RNNs) and convolutional neural networks (CNNs) for urban traffic flow prediction, demonstrating effectiveness in handling complex nonlinear traffic flow prediction problems.

2.4 Forecast Combination Methods

Forecast combination is a powerful method for achieving more accurate predictions by combining multiple individual forecasts. The underlying idea is that different forecasting models capture different aspects of the data, and by combining them, the overall prediction becomes more accurate and robust than any single model alone. This approach is particularly valuable as it mitigates the weaknesses of individual models while leveraging their strengths, leading to improved forecasting accuracy and reliability (Wang [2023]). Forecast combination has been successfully applied in various fields, including the stock market, tourist flow, and internet traffic.
By using combined forecasts, these applications can achieve more reliable and accurate predictions, which are crucial for decision-making in dynamic environments. In the area of tourist flow, Andrawis [2011] explored combining forecasts produced at different time aggregations to capture diverse dynamics and enhance the diversity of the forecasts obtained. The forecast combination methods included Simple Average (AVG), Variance-based (VAR), Inverse of the Mean Square Error (INV-MSE), Rank-based Weighting (RANK), Least Squares Estimation, and Hierarchical Forecast Combination (HIER), among others. Simulations were conducted on benchmark data from the M3 and NN3 time series competitions and on monthly inbound tourism demand for Egypt from 33 major source countries, provided by the Egyptian Ministry of Tourism. The simulation results indicated improvements in accuracy relative to the underlying forecasting models and demonstrated that this approach outperformed individual models, offering valuable insights for developing a forecasting model for inbound tourism demand using a short-term/long-term forecast combination approach. In the traffic flow domain, Ma [2020] proposed a combined model that uses an artificial neural network optimized by a genetic algorithm (GA) together with exponential smoothing (ES) to improve the accuracy of short-term traffic flow prediction. By leveraging the metaheuristic optimization capability of the GA, the connection weights and thresholds of the feedforward neural network, trained by a backpropagation algorithm, are optimized to prevent the network from falling into a local optimum, thereby establishing the Genetic Artificial Neural Network (GANN) prediction model. Subsequently, an ES prediction model is introduced. To fully exploit the advantages of both models, the combined model uses a weighted average, with the weights determined based on the mean square error of the predictions from the individual models.
The model was experimentally validated using road traffic flow data from Xuancheng, Anhui Province, with an observation interval of 5 minutes. Furthermore, the feedforward neural network model, GANN model, ES model, and combined model were compared and analyzed. The results demonstrated that the prediction accuracy of the optimized feedforward neural network was significantly higher than before optimization. The prediction accuracy of the combined model was also higher than that of the two individual models, confirming the feasibility and effectiveness of the combined model. Hou et al. [2019] proposed an adaptive hybrid model aimed at forecasting traffic flow uncertainty caused by dynamic changes in traffic structure. To partially overcome the limitations of traditional prediction methods, they combined the ARIMA method and the Nonlinear Wavelet Neural Network method to predict traffic flow. The evaluation of the proposed model showed that it was more efficient in predicting traffic flow than the two individual models, whether under stable or fluctuating conditions. However, the study only highlighted the hybrid model's superiority over individual models and did not consider other models that have demonstrated greater efficiency than ARIMA, such as machine learning or deep learning models. Additionally, they did not consider other environmental factors influencing traffic flow dynamics, such as weather conditions or events that attract crowds. Based on the literature review above, it is evident that some researchers have applied combination methods in the traffic flow domain, but limitations still exist. The two papers mentioned earlier combined only two methods without exploring the potential benefits of combining a wider range of models. Furthermore, they focused solely on processing traffic flow data without considering other factors that might influence traffic flow, such as weather conditions, holidays, and events.
Therefore, in our project, we address these limitations by proposing a novel data-driven forecast combination model. In conclusion, while traditional time series methods and machine learning methods have their advantages, deep learning methods and forecast combination models are increasingly favored for their superior performance in traffic flow prediction tasks. Chapter 3 Methodology The methodology consists of two parts. The first part introduces the models used for data analysis, categorized into linear and non-linear models. For linear models, we include ARIMA, dynamic regression, the exponential smoothing model, and SVR with a linear kernel. For non-linear models, we include KNN, SVR with a non-linear kernel, LSTM, and neural network autoregression (NNETAR). In the second part, we introduce a new method called the Hierarchical Forecast Combination (HFC) method. This method combines the strengths of different models to improve overall accuracy. 3.1 Linear Models 3.1.1 ARIMA ARIMA is a fundamental model in time series analysis, merging autoregression (AR) and moving average (MA) concepts while addressing non-stationarity via differencing (integration), making it suitable for forecasting future numeric values (Hyndman and Athanasopoulos [2021]). ARIMA consists of three parts: autoregressive (AR), integrated (I), and moving average (MA). The AR(p) model takes lagged observations as inputs, while the MA(q) model takes lagged errors as inputs. In a time series, if the current value at time t is yt, then we refer to yt−1, yt−2, ..., yt−k as lagged observations. When using ARIMA to handle time series data, we first need to ensure that the series is stationary, using differencing (integration) if necessary. If yt is a stationary time series, then for all s, the distribution of (yt, ..., yt+s) does not depend on t. This means the series does not exhibit trending behavior, has constant variance, and lacks long-term predictable patterns.
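As a brief illustration of the differencing ("integration") step just described, the following sketch (in Python for illustration only; the project's analysis is carried out in R) shows how first differencing turns a linearly trending, non-stationary series into a constant one:

```python
# Illustration only: first differencing, the "integration" step of ARIMA.
# Python is used here as a sketch; the project's analysis is done in R.

def difference(series, d=1):
    """Apply d-th order differencing, i.e. (1 - B)^d y_t."""
    for _ in range(d):
        series = [y - y_prev for y_prev, y in zip(series, series[1:])]
    return series

# A deterministically trending (non-stationary) series: y_t = 3t + 2
trend = [3 * t + 2 for t in range(10)]
print(difference(trend, d=1))  # prints [3, 3, 3, 3, 3, 3, 3, 3, 3]
```

After one difference the trend is removed and the mean is constant, which is exactly the stationarity requirement discussed above.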
The ARIMA model can be written as: (1 − ϕ1 B − · · · − ϕp B^p)(1 − B)^d yt = c + (1 + θ1 B + · · · + θq B^q)ϵt (3.1) where the AR (autoregressive) part, (1 − ϕ1 B − · · · − ϕp B^p), represents a linear relationship between the current value yt and its previous p values, where ϕ1, ..., ϕp are the parameters of the autoregressive part, and B is the backshift operator, meaning B^p yt = yt−p. The I (integrated) part, (1 − B)^d, involves differencing to make the time series stationary, with d being the number of differences taken; for example, the differencing operation (1 − B)yt = yt − yt−1 represents a first difference when d = 1. The MA (moving average) part, (1 + θ1 B + · · · + θq B^q), indicates a linear relationship between the current error term ϵt and the previous q error terms, where θ1, ..., θq are the parameters of the moving average part. The constant term c accounts for the mean level of the (differenced) series, and the error term ϵt is white noise, i.e., a random error term with a mean of zero and a variance of σ². The ARIMA model formula describes how the current value yt is explained and predicted by its own past values, its differenced values, past error terms, and a constant term. By appropriately selecting the parameters p, d, and q, the ARIMA model can effectively fit a wide range of time series data. 3.1.2 Dynamic Regression Model This type of regression analysis recognizes time-based correlations, making it effective for handling time-dependent data (Hyndman and Athanasopoulos [2021]). The full model is: yt = β0 + β1 x1,t + β2 x2,t + · · · + βk xk,t + ηt, ϕ(B)(1 − B)^d ηt = θ(B)εt, εt ∼ IID(0, σ²). (3.2) Here, yt represents the dependent variable (the target value) at time t, which is composed of the intercept term β0, several independent variables with their corresponding regression coefficients, β1 x1,t, β2 x2,t, . . . , βk xk,t, and the error term ηt of the dynamic regression model.
We assume the error term ηt is autocorrelated (not white noise) and follows an ARIMA model, while the ARIMA error term εt is white noise, assumed to be independently and identically distributed (IID) with a mean of 0 and a variance of σ². This model can address the autocorrelation seen in the standard regression model because the ARIMA error term in the dynamic regression model captures information that is not explained by the standard regression model. The process of fitting the model, checking the residual plots, forecasting, and evaluating the results is similar to that of the standard regression model. 3.1.3 Exponential Smoothing Model Exponential smoothing is a time series forecasting method for univariate data. This method uses weighted averages of past observations to forecast future values, where the weights decrease exponentially as observations get older. The general idea is that more recent observations are more relevant for forecasting than older observations (Hyndman and Athanasopoulos [2021]). There are three main types of exponential smoothing. Simple Exponential Smoothing (SES): SES is used for time series data without a trend or seasonal pattern (Hyndman and Athanasopoulos [2021]). The weighted average form of SES is: ŷt+1 = αyt + (1 − α)ŷt. (3.3) Here, ŷt+1 represents the forecast at time t+1, α is the smoothing parameter, constrained to the range 0 ≤ α ≤ 1, yt denotes the actual value at time t, and ŷt stands for the forecasted value at time t. A different way to represent SES is through the component form. In this approach, the only component considered is the level ℓt. The component form of SES can be expressed as follows: Forecast equation: ŷt+1 = ℓt, (3.4) Level equation: ℓt = αyt + (1 − α)ℓt−1. (3.5) Here, ℓt represents the level of the series at time t, yt is the actual value at time t, α is the smoothing parameter, and ŷt+1 is the forecast for the next period.
The parameters α and the initial level ℓ0 are estimated by minimizing the sum of squared errors (SSE) over the periods t, subject to the constraint 0 ≤ α ≤ 1. If we substitute ℓt with ŷt+1 and ℓt−1 with ŷt in the level equation, we obtain the weighted average form of SES (equation 3.3). Two other methods are Holt's Linear Trend Method and the Holt-Winters Method. Holt's linear trend method is used for time series data with a trend (Hyndman and Athanasopoulos [2021]). This method involves a forecast equation and two smoothing equations (one for the level and one for the trend). The Holt-Winters method includes the forecast equation and three smoothing equations (level, trend, and seasonal). This method has two variations depending on the type of seasonal component. The additive method is ideal when the seasonal fluctuations are roughly constant across the series. On the other hand, the multiplicative method is better suited when the seasonal fluctuations vary in relation to the level of the series (Hyndman and Athanasopoulos [2021]). The detailed component forms of these methods and the corresponding equations can be found in Chapters 8.2 and 8.3 of Hyndman and Athanasopoulos [2021]. Innovations state space models for exponential smoothing. Each of the methods mentioned above consists of two main components: the measurement equation (forecast equation) that describes the observed data, and the state equations that describe how the unobserved components or states (such as level, trend, and seasonal) change over time. Therefore, we can also refer to them as state space models. For each method, there exist two corresponding state space models: one with additive errors and one with multiplicative errors. The derivation of the innovations state space models for each exponential smoothing method is detailed in Chapter 8.5 of Hyndman and Athanasopoulos [2021].
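To make the SES recursion in equations (3.3)–(3.5) concrete, here is a minimal sketch (Python for illustration only; the project implements these models in R) using a hypothetical series and α = 0.5:

```python
# Minimal sketch of Simple Exponential Smoothing (equations 3.3-3.5).
# Python for illustration only; the project implements these models in R.

def ses_forecast(y, alpha, level0):
    """Run the SES level recursion l_t = alpha*y_t + (1-alpha)*l_{t-1}
    and return the one-step-ahead forecast y_hat_{t+1} = l_t."""
    level = level0
    for y_t in y:
        level = alpha * y_t + (1 - alpha) * level
    return level

# Hypothetical daily traffic volumes (thousands of vehicles)
y = [60, 62, 58, 61, 63]
print(ses_forecast(y, alpha=0.5, level0=y[0]))  # prints 61.625
```

With a larger α the forecast tracks recent observations more closely; with α = 1 it reduces to the naive forecast (the last observation).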
3.1.4 Support Vector Regression (SVR) SVR is an application of the Support Vector Machine (SVM) to regression tasks, sharing a similar fundamental approach with slight differences (Santos and Castillo [2019]). In SVM, we aim to find a hyperplane with the maximum margin between classes. In contrast, SVR defines a threshold (ϵ), where data points within this margin have zero residuals, and points outside contribute to the residuals (ζ). The goal is to minimize these residuals. Essentially, SVR seeks an optimal strip (of width 2ϵ) and performs regression on points outside this strip. Figure 3.1: SVR In certain cases, SVR utilizes kernel functions to manage non-linear relationships between features. These functions map the input data into a higher-dimensional space, allowing a linear hyperplane to effectively separate or approximate the data. Commonly used kernels are linear, polynomial, radial basis function (RBF), and sigmoid. In time series forecasting, the aim is to predict the future value of a time series, such as a stock price at a future date or the temperature at an upcoming time step. As a regression technique, SVR constructs a model that connects historical time series data (features) with their corresponding future values (target variable). 3.2 Non-Linear Models 3.2.1 K-Nearest Neighbor (KNN) The KNN algorithm predicts future values by finding the K most similar subsequences in historical data. It calculates the similarity between a recent segment (the query sequence) and all historical subsequences, then averages the future values of the most similar ones to make predictions (Francisco Martinez [2023]). Here are the steps of how it works: First, define the query sequence by selecting the recent data segment you want to predict from. Next, segment the historical data by dividing it into subsequences of the same length as the query sequence.
Then, calculate the similarity between the query sequence and each historical subsequence using a distance metric, such as the Euclidean distance. After that, select the K nearest neighbors by choosing the K historical subsequences with the smallest distances. Finally, predict future values by averaging the future values of these K nearest neighbors. For example, suppose we have traffic volume data for the past 10 months and want to predict the 11th month's traffic volume: • Define the Query Sequence: The traffic volume for the last 3 months, e.g., (August, September, October). • Segment Historical Data: Create subsequences of 3 months, e.g., (January, February, March), (February, March, April), and so on. • Calculate Similarity: Measure the similarity between (August, September, October) and all historical subsequences. • Select K Nearest Neighbors: Choose the 3 subsequences with the smallest distances, e.g., (May, June, July), (April, May, June), (January, February, March). • Predict Future Values: Average the traffic volumes of the months following these subsequences (i.e., August, July, April) to predict November's traffic volume. This method effectively utilizes patterns and trends in historical data to make accurate predictions about future traffic volume. 3.2.2 Long Short-Term Memory (LSTM) LSTM networks are a special type of Recurrent Neural Network (RNN) designed to effectively learn long-term dependencies. RNNs are a class of neural networks designed to recognize patterns in sequences of data, such as time series, natural language, or audio signals. Unlike traditional feedforward neural networks, RNNs have a circular connection structure, which allows the network to retain input information from previous time steps. This enables the network to consider the context from previous steps at each time step in the sequence (Starmer [2022b]). In traditional RNNs, issues like vanishing or exploding gradients make it difficult to learn from long sequences. Fig.
3.2 shows the basic structure of an RNN. Figure 3.2: Basic RNN structure (Source: Zhu [2019]) LSTM networks address this by introducing structures such as memory cells and gating mechanisms, allowing selective retention or forgetting of information and thereby capturing long-term dependencies (Starmer [2022a]). The main idea behind how LSTM works is that instead of using the same feedback loop connection for events that happened long ago and events that just happened yesterday to make a prediction about tomorrow, LSTM uses two separate paths to make predictions about tomorrow: one path for long-term memories and another for short-term memories. Figure 3.3: LSTM As shown in Fig. 3.3, the upper blue line is called the cell state (C), representing long-term memory. Note that there are no weights or biases that can modify it directly. This allows long-term memories to flow through a series of unrolled units without causing the gradient to explode or vanish. The lower red line is called the hidden state (H), which represents short-term memory. Short-term memory is directly connected to the weights that can modify it. By using separate paths for long-term and short-term memories, LSTM networks avoid the exploding/vanishing gradient problem, which means they can be unrolled more times to accommodate longer sequences of input data than a vanilla recurrent neural network. 3.2.3 Neural Network Autoregression (NNETAR) A neural network is a computational model that mimics the biological neural system. By connecting numerous artificial neurons, it processes and analyzes data with powerful nonlinear mapping capabilities. The basic structure of a neural network includes an input layer, hidden layers, and an output layer (Saleem [2023]). Figure 3.4: The structure of a neural network • The input layer receives external data, with each input node corresponding to a feature.
• Hidden layers, consisting of several neurons, perform weighted summation of the input data and apply nonlinear transformations through activation functions. The number of hidden layers can vary, and more layers increase the network's complexity and expressive power. • The output layer generates the final prediction. Neural networks learn through forward propagation (data passing from the input layer to the output layer) and backpropagation (updating weights layer by layer according to the gradient of the loss function). This adjustment of weights and biases minimizes the loss function, ensuring that the predictions are as close to the real values as possible. The NNETAR model combines the nonlinear modeling capabilities of neural networks with the time dependency of autoregressive (AR) models and is widely used in time series forecasting (Hyndman and Athanasopoulos [2021]). In an NNETAR model, the input layer receives historical data points of a time series. If p historical points are used, the input vector is (yt−1, yt−2, . . . , yt−p). The hidden layers have the same structure as in a simple neural network, and the output layer generates the forecast for time t. NNETAR models are highly capable of capturing complex nonlinear relationships and are flexible, but they require substantial amounts of data for training and have higher computational complexity. By appropriately selecting the model structure and parameters, NNETAR models can demonstrate significant potential in time series forecasting, especially when dealing with complex nonlinear relationships. 3.3 Proposed Model: Hierarchical Forecast Combination (HFC) Model Forecast combinations have seen significant growth in popularity within the forecasting community and have recently become a key aspect of mainstream forecasting research and practice.
This approach involves combining multiple forecasts generated for a target time series to enhance accuracy by integrating information from diverse sources, thus eliminating the need to pinpoint a single "best" forecast. The methods for combining forecasts have evolved from basic techniques that require no estimation to advanced strategies that incorporate time-varying weights, nonlinear combinations, component correlations, and cross-learning (Wang [2023]). From the article by Andrawis [2011], we have learned about various forecast combination methods, including Hierarchical Forecast Combination (HIER). The principle behind HIER is to select the two best-performing linear methods and the two best-performing non-linear methods from all previously described forecast combination methods and then take a simple average (AVG) of the forecasts from these four methods. The selection is based on the performance of these methods on the evaluation set. However, in practice, a simple average may not be the best approach for combining forecasts from these individual models because it assigns equal weight to each model. To overcome this limitation, in this work we develop a data-driven Hierarchical Forecast Combination (HFC) model. Our model not only retains the original idea of selecting the best linear and non-linear models through evaluation but also incorporates appropriate weights for combining these methods. 3.3.1 Proposed Data-Driven HFC Models Assume we have n individual models, some of which are linear and others non-linear. Our approach involves three stages of model combination. In the first stage, we combine the top n1 linear models to obtain a linear combined model. In the second stage, we combine the top n2 non-linear models to obtain a non-linear combined model. In the final stage, we combine the linear combined model with the non-linear combined model, where n1 + n2 = n, to obtain the final proposed data-driven HFC model.
First Stage: We can define the forecast combined model based on linear models as: ŷ^L_{t+h} = Σ_{i=1}^{n1} w_i^L m_i^L. Here, ŷ^L_{t+h} denotes the combined h-step-ahead forecast under the linear models. The corresponding weight for each model m_i^L is w_i^L, where i = 1, 2, . . . , n1. These weights w_i^L are optimized according to specific criteria, which will be detailed later. Second Stage: We can define the forecast combined model based on non-linear models as: ŷ^NL_{t+h} = Σ_{i=1}^{n2} w_i^NL m_i^NL. Similarly, ŷ^NL_{t+h} represents the h-step-ahead combined forecast for the non-linear models, with w_i^NL being the weight for model m_i^NL, where i = 1, 2, . . . , n2. The optimization of these weights w_i^NL is based on criteria that will be discussed later. Third Stage: The proposed data-driven HFC model can be derived as: ŷ^HFC_{t+h} = w_1^HFC Σ_{i=1}^{n1} w_i^L m_i^L + w_2^HFC Σ_{i=1}^{n2} w_i^NL m_i^NL. Here, ŷ^HFC_{t+h} denotes the h-step-ahead combined forecast under the HFC model, and w_1^HFC and w_2^HFC are the weights of the proposed model, which need to be optimized to obtain a better forecast combination. In our work, we set n1 = n2 = 2. This means we consider the two best linear models and the two best non-linear models to construct our proposed HFC model. However, it is important to note that one can choose more than two models to further refine the proposed HFC model. As a result, our model combines the top 2 linear models and the top 2 non-linear models, and finally combines the linear combined model with the non-linear combined model.
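To illustrate the three stages with n1 = n2 = 2, the following sketch (Python for illustration; all forecast values and model outputs are hypothetical, and the grid-search weighting follows the strategy described in Section 3.3.2) combines two linear and two non-linear forecast vectors:

```python
# Sketch of the three-stage HFC combination with n1 = n2 = 2. The weight
# at each stage is chosen on a 0.01 grid over (0, 1) by minimizing the
# sum of squared forecast errors against held-out actuals (the FESS
# criterion of Section 3.3.2). All forecast values below are hypothetical.

def best_weight(f1, f2, actual):
    """Grid-search w in {0.01, ..., 0.99} minimizing sum((y - yhat)^2)."""
    def sse(w):
        return sum((y - (w * a + (1 - w) * b)) ** 2
                   for y, a, b in zip(actual, f1, f2))
    return min((round(0.01 * k, 2) for k in range(1, 100)), key=sse)

def combine(f1, f2, w):
    """Weighted combination w*f1 + (1-w)*f2, element-wise."""
    return [w * a + (1 - w) * b for a, b in zip(f1, f2)]

# Hypothetical evaluation-set forecasts from the top two linear and the
# top two non-linear models, plus the observed values.
actual = [100, 110, 105, 120]
lin1, lin2 = [98, 108, 104, 118], [103, 113, 107, 124]
nl1, nl2 = [101, 109, 106, 119], [96, 106, 101, 115]

w_l = best_weight(lin1, lin2, actual)            # first stage
w_nl = best_weight(nl1, nl2, actual)             # second stage
lin_comb = combine(lin1, lin2, w_l)
nl_comb = combine(nl1, nl2, w_nl)
w_hfc = best_weight(lin_comb, nl_comb, actual)   # third stage
hfc_forecast = combine(lin_comb, nl_comb, w_hfc)
```

Because the grid contains w = 0.5, the data-driven weights can never do worse on the evaluation set than the simple average at any stage.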
Consequently, the updated formulations of the models become: First Stage: ŷ^L_{t+h} = Σ_{i=1}^{2} w_i^L m_i^L = w_1^L m_1^L + w_2^L m_2^L. Second Stage: ŷ^NL_{t+h} = Σ_{i=1}^{2} w_i^NL m_i^NL = w_1^NL m_1^NL + w_2^NL m_2^NL. Third Stage: The proposed HFC model can then be written as: ŷ^HFC_{t+h} = w_1^HFC Σ_{i=1}^{2} w_i^L m_i^L + w_2^HFC Σ_{i=1}^{2} w_i^NL m_i^NL. 3.3.2 Strategies to Determine Weights To find the optimal weights at each stage of model combination, we follow the procedure below. Candidate w1 values are taken on a grid over (0, 1) with increments of 0.01, where Σ_{i=1}^{2} w_i = 1 (so w2 = 1 − w1). For each w1 value, we calculate ŷ_{t+h}. Subsequently, we calculate the one-step-ahead Forecast Error Sum of Squares (FESS), using the formula Σ_{t=1}^{T} (y_{t+h} − ŷ_{t+h})², where t indexes the observations and h is the forecast horizon. The optimal w1 is then chosen as the one that minimizes the FESS. Based on this weight determination procedure, we derive two proposed methods. The first is the Data-Driven Hierarchical Forecast Combination Method (DDHFC), where we find the optimal weight at each stage. The second is the Partially Data-Driven Hierarchical Forecast Combination Method (DDPHFC), where the weight is optimized only at the last stage, while the weights for the first and second stages are fixed at w_i^L = w_i^NL = 0.5. The two proposed methods, DDHFC and DDPHFC, are compared with the simple average combination (AVG) model, where all weights are fixed at 0.5 at every stage. 3.3.3 Steps for HFC Model Figure 3.5: Basic steps of the proposed HFC model The procedure above shows the steps of how the HFC model works. Next, we explain how our method is implemented: 1. Select Different Models and Get Forecast Results: - Each linear model is individually trained on our datasets to generate forecast results. The Mean Absolute Percentage Error (MAPE) for each model is calculated, and the top n1 models with the lowest MAPE values are selected.
The same procedure is applied to the non-linear models to select the top n2 models. 2. First stage (Forecast combination for the top n1 linear models): - We combine the results of the top n1 linear models. - We use different strategies to determine the weights for the top n1 linear models. - We now have a linear combined model. 3. Second stage (Forecast combination for the top n2 non-linear models): - We combine the results of the top n2 non-linear models. - We use different strategies to determine the weights for the top n2 non-linear models. - We now have a non-linear combined model. 4. Third stage: - We combine the forecasts of the linear combined model and the non-linear combined model. - Again, we use different strategies to determine the final combination weights. By performing this staged weighted combination and using different strategies to determine the weights, we can effectively leverage the strengths of different models and improve the overall performance of the forecasting model. The HFC model is particularly useful for complex forecasting problems, especially when the performance of individual models is unstable or inconsistent. Chapter 4 Experiments 4.1 Dataset Description Two datasets are used in our project: the 2022 Hourly Traffic and Climate Dataset and the 2017-2022 Daily Traffic and Climate Dataset. The traffic data are collected from the Traffic Data Program on the B.C. Ministry of Transportation and Infrastructure website (Traffic Data Program), and the climate data are collected from Environment Canada (Service Canada [2023]). 4.2 Traffic Data Collection Figure 4.1: Process of traffic data collection We initially obtain the necessary traffic data in CSV format. These CSV files consist of the Monthly Hourly Volume Report (MVO3) and the Daily Volume Summary Report (DV01).
As per the Traffic Reports User Documentation from the BC Ministry of Transportation and Infrastructure (WSP Canada Group Limited [2019]), the MVO3 presents a comprehensive breakdown of hourly traffic for each day of the month, including data for both the negative direction (Neg DIR) and the positive direction (Pos DIR) on the Lions Gate Bridge. Conversely, the DV01 provides a summary of total daily volumes and daily volumes per lane, compiled over a one-day period. Traffic agencies commonly gather diverse traffic data types such as volume, speed, length, axle class, and weigh-in-motion (WIM). In British Columbia, traffic data collection involves the use of inductive loops, pneumatic hoses, and piezoelectric strips. Inductive loops detect the metal content of vehicles, enabling the measurement of lane volume with a single loop, while a pair of loops can assess length and speed. After obtaining these CSV files, we import the data into R and create two tsibble tables named full.hourly.2022 and full.daily.17_22. 4.2.1 Hourly Traffic Volume Data (2022) For the full.hourly.2022 tsibble table, special handling is required for the data on March 13, 2022, and November 6, 2022, due to daylight saving time. When creating the traffic dataset in R, adjustments are necessary while collecting data from the CSV files. Specifically, the traffic data for 2 a.m. on March 13, 2022, needs to be omitted, and the traffic data for 1 a.m. on November 6, 2022, needs to be duplicated. Subsequently, we examine the full.hourly.2022 table for missing data and outliers. The analysis reveals 217 missing data points in the table. These missing values are addressed using the function replace_missing_values(), which replaces them with the respective average values over all days of the same weekday throughout the year. Following this, outliers are checked using the built-in function tsoutliers() (Chen and Liu [1993]), and no outliers are detected.
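The weekday-average imputation described above can be sketched as follows (a Python illustration of the idea only; the project's implementation is an R function operating on a tsibble, and this sketch additionally assumes the averages are grouped by hour of day, which is a natural choice for hourly data):

```python
# Sketch of weekday-average imputation: a missing hourly volume is replaced
# by the mean of the observed volumes at the same weekday and hour across
# the rest of the year. Python illustration only; the project's version is
# an R function operating on a tsibble, and the per-hour grouping is an
# assumption made here for hourly data.

from statistics import mean

def impute_weekday_hour(records):
    """records: list of (weekday, hour, volume), with volume None if missing.

    Returns a new list in which each missing volume is replaced by the
    average of observed volumes sharing the same (weekday, hour) key.
    """
    groups = {}
    for wd, hr, vol in records:
        if vol is not None:
            groups.setdefault((wd, hr), []).append(vol)
    return [(wd, hr, vol if vol is not None else mean(groups[(wd, hr)]))
            for wd, hr, vol in records]

data = [("Mon", 8, 3200), ("Mon", 8, 3400), ("Mon", 8, None)]
print(impute_weekday_hour(data)[-1])
```

This style of imputation preserves the weekly and daily seasonality of the traffic series, which simpler fills (such as a global mean) would smear out.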
Moving forward, we examine the plot of the hourly traffic volume, focusing on February 2022. Fig. 4.2 shows that the traffic data exhibit some seasonality, with a daily seasonal pattern. Figure 4.2: Hourly traffic volume data plot for February 2022 For a deeper understanding of our data, we selected a random workweek from the 2022 Hourly Traffic dataset, from January 10th to January 14th, 2022. We conducted separate analyses for each day and for the aggregated five-day period. Fig. 4.3 shows the hourly traffic volume plots from 2022-01-10 to 2022-01-14; these five graphs reveal dual peaks on each workday, reasonably interpreted as corresponding to the morning and evening rush hours. Typically, the morning peak occurs between 7:00 a.m. and 9:00 a.m., while the evening peak spans from 4:00 p.m. to 6:00 p.m. Fig. 4.4 illustrates the hourly traffic volume across the week, showing a strong weekly pattern. Figure 4.3: Hourly traffic volume plot from 2022-01-10 to 2022-01-14 (Single Day) Figure 4.4: Hourly traffic volume plot from 2022-01-10 to 2022-01-14 Fig. 4.5 shows the comparison of average hourly traffic volume (across the year 2022) between holidays, weekdays, and weekends. From the graph, it can be observed that holidays and weekends do not exhibit morning or evening rush hours. Instead, there is a continuous increase in traffic flow from around 6:00 a.m. to approximately 12:00 p.m., presumably due to increased outdoor activities during this period. Peak traffic is observed around 12:00 p.m. Subsequently, traffic gradually decreases, suggesting a decline in outdoor recreational activities. Figure 4.5: Comparison of average hourly traffic volume between holiday, weekday and weekend 4.2.2 Daily Traffic Volume Data (2017-2022) The same processing steps are applied to the full.daily.17_22 tsibble table, with the exception that daylight saving time does not need to be considered here.
For handling missing data, the built-in function na_seadec() (Seasonally Decomposed Missing Value Imputation) is employed, which utilizes seasonal decomposition of the time series. This process results in the creation of the 2017-2022 Daily Traffic dataset. Fig. 4.6 illustrates the daily traffic volume from January 1, 2017, to December 31, 2022. It is evident from the graph that traffic peaks during the summer months compared to winter. Around March 2020, there is a sharp decline in traffic volume, likely attributable to the outbreak of the COVID-19 pandemic and the implementation of stay-at-home orders by the BC government, which resulted in a significant reduction in daily traffic flow (Zussman [2020]). Figure 4.6: Plot of 2017-2022 daily traffic volume data Further, we randomly selected a time period from the 2017-2022 Daily Traffic Dataset: from March 5, 2018, to April 1, 2018, spanning four weeks. Fig. 4.7 depicts the plot for this period. From the plot, it is evident that, except for the last week (March 26, 2018, to April 1, 2018), the daily traffic volume on weekdays consistently surpassed that of weekends. Figure 4.7: Daily traffic volume plot from 2018-03-05 to 2018-04-01 A similar result can be found in Fig. 4.8. This graph depicts the distribution of daily traffic volume data for each day of the week using box plots. It illustrates that the daily traffic volume on weekdays is higher than on weekends. Figure 4.8: Distribution of daily traffic volume data for each day of the week 4.3 Climate Data Collection Figure 4.9: Process of climate data collection Figure 4.9 illustrates the process of climate data collection. Since there are no climate stations located directly on the Lions Gate Bridge, we extended our examination to the surrounding area.
We selected a 16-kilometer radius around the bridge based on previous studies that have used this distance as a standard, indicating that weather conditions within this range show minimal variation and remain consistent (Roh et al. [2019]). Using Google Maps, we identified a total of 95 weather stations within this 16-kilometer radius, with data sourced from Environment Canada (Service Canada [2023]) and a climate data extraction tool (Canada, Environment and Climate Change). After eliminating stations lacking weather data for the year 2022, we retained 9 stations with the following climate station IDs: 1105658, 1105669, 1106200, 1106764, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Some of these stations provide solely hourly data, some solely daily data, while others offer both. Hourly and daily data reports can be acquired separately from the Historical Data website (Historical Data). 4.3.1 Hourly Climate Data (2022) Figure 4.10: Process of 2022 hourly climate data collection Fig. 4.10 shows the process of 2022 hourly climate data collection. The hourly data report includes 11 relevant parameters: Temperature (°C), Dew Point Temperature (°C), Relative Humidity (%), Total Hourly Precipitation, Wind Direction (tens of degrees), Wind Speed (km/h), Visibility (km), Station Pressure (kPa), Humidex, Wind Chill, and Occurrence of Weather and Obstructions to Vision. The climate IDs of the stations that have solely hourly data are 1106200, 1108446, 1108395, 1108380, and 1108824. Furthermore, based on Roh et al. [2019], our focus for the hourly data lies on Temperature (°C), Total Hourly Precipitation, and Wind Speed (km/h). We identify the stations corresponding to these three features separately to generate three tsibble tables. Details regarding the examination of temperature, precipitation, and wind speed for the hourly climate data can be found in Appendix A.1.
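As an illustration of how a per-hour average across the selected stations can be formed for a feature such as temperature (a Python sketch with hypothetical readings; the station IDs are taken from the hourly-station list above, and the project performs this aggregation in R):

```python
# Sketch of averaging one climate feature (e.g., temperature) across the
# selected stations for each hourly timestamp. The station IDs below come
# from the list of hourly stations; the readings are hypothetical, and
# missing readings are simply skipped. The project does this in R.

from statistics import mean

def hourly_station_average(readings):
    """readings: dict mapping timestamp -> {station_id: value or None}."""
    return {ts: mean(v for v in by_station.values() if v is not None)
            for ts, by_station in readings.items()}

readings = {
    "2022-02-01 08:00": {"1106200": 4.1, "1108446": 3.9, "1108824": None},
    "2022-02-01 09:00": {"1106200": 5.0, "1108446": 4.6, "1108824": 4.8},
}
avgs = hourly_station_average(readings)
print(round(avgs["2022-02-01 08:00"], 2))  # prints 4.0
```

Skipping missing station readings rather than dropping the whole hour keeps the climate features aligned with every hourly traffic observation.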
Finally, by integrating the 2022 Hourly Climate Dataset with the 2022 Hourly Traffic Dataset, we obtain the first dataset (the 2022 Hourly Traffic and Climate Dataset).

4.3.2 Daily Climate Data (2017-2022)

Fig. 4.11 shows the process of 2017-2022 daily climate data collection. The daily data report includes 11 relevant parameters: Maximum Temperature (°C), Minimum Temperature (°C), Mean Temperature (°C), Heating Degree-days, Cooling Degree-days, Total Rain (mm), Total Snow (cm), Total Precipitation (mm), Snow on the Ground (cm), Direction of Maximum Gust (tens of degrees), and Speed of Maximum Gust (km/h).

Figure 4.11: Process of 2017-2022 daily climate data collection

The climate IDs of the stations providing daily data are 1105658, 1105669, 1106200, 1106764, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Furthermore, following Roh et al. [2019], our interest for the daily data lies in Mean Temperature (°C) and Total Precipitation (mm). Based on these features, we generated two tsibble tables. Details regarding the examination of temperature and precipitation for the daily data can be found in Appendix A.2. Finally, integrating the 2017-2022 Daily Climate Dataset with the 2017-2022 Daily Traffic Dataset yields the second dataset (the 2017-2022 Daily Traffic and Climate Dataset).

4.4 Traffic and Climate Dataset

The first dataset (2022 Hourly Traffic and Climate Dataset) consists of 8760 rows with 9 features: "volume", "volume.P", "volume.N", "temp.average", "precip.average", "ws.average", "weekday", "is_weekend", and "day.type". The feature "volume" represents traffic volume. For this dataset, we also consider the direction of traffic: the positive direction (volume.P) indicates traffic moving from South to North or from West to East, whereas the negative direction (volume.N) denotes traffic moving from North to South or from East to West.
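Conceptually, building each merged table amounts to averaging the retained stations at every timestamp (skipping missing readings) and joining the climate features onto the traffic series by the shared hour or day. The project does this in R with tsibble; the sketch below is a stdlib-Python illustration with made-up records:

```python
def station_average(readings):
    """Average one timestamp's readings across stations, ignoring missing (None) values."""
    valid = [r for r in readings if r is not None]
    return sum(valid) / len(valid) if valid else None

def merge_on_hour(traffic, climate):
    """Inner-join two {timestamp: {feature: value}} tables on their shared timestamps."""
    return {ts: {**traffic[ts], **climate[ts]}
            for ts in sorted(traffic.keys() & climate.keys())}

traffic = {"2022-01-01 00:00": {"volume": 512}}
climate = {"2022-01-01 00:00": {"temp.average": station_average([3.0, None, 5.0])}}
print(merge_on_hour(traffic, climate))
# {'2022-01-01 00:00': {'volume': 512, 'temp.average': 4.0}}
```

An inner join is used so that hours missing from either table are dropped rather than padded with NA values.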
The second dataset (2017-2022 Daily Traffic and Climate Dataset) has 2191 rows with 6 features: "volume", "temp.average", "precip.average", "weekday", "is_weekend", and "day.type", where the traffic data consist of daily traffic volumes from 2017 to 2022. Across the two datasets, the features "temp.average", "precip.average", and "ws.average" represent the average temperature (°C), precipitation (mm), and wind speed (km/h) observed by the chosen climate stations, respectively. The feature "weekday" indicates the day of the week corresponding to the date; "is_weekend" uses "1" to denote weekends and "0" otherwise; and "day.type" takes the values "Weekday", "Weekend", and "Holiday", determined using the calendar functions from the R package RQuantLib (Eddelbuettel [2024]), covering all statutory and regional holidays in British Columbia from 2017 to 2022 (OfficeHolidays).

4.5 Parameter Estimation

In this section, we discuss how the training and test sets were divided, how the different linear and non-linear models are implemented in R, and how their parameters are selected during model training.

For the 2022 Hourly Traffic and Climate Dataset, to evaluate the performance of our model in predicting different day types, we further divided the dataset into three categories: weekday, weekend, and holiday, with the training and test sets adjusted for each type. The specific divisions for each day type are as follows:

• Weekday: The training set is from June 1, 2022, 00:00 to November 30, 2022, 23:00, and the test set is from December 1, 2022, 00:00 to December 2, 2022, 23:00 (48 hours).

• Weekend: We randomly selected a weekend in 2022. The training set is from January 1, 2022, 00:00 to June 24, 2022, 23:00, and the test set is from June 25, 2022, 00:00 to June 26, 2022, 23:00 (48 hours).

• Holiday: We randomly selected Canada Day (July 1) as the holiday.
Thus, the training set is from January 1, 2022, 00:00 to June 30, 2022, 23:00, and the test set is from July 1, 2022, 00:00 to July 1, 2022, 23:00 (24 hours).

For the 2017-2022 Daily Traffic and Climate Dataset, since the traffic volume is recorded daily, we only consider the weekday type. The training set is from January 1, 2017, to November 30, 2022, and the test set is from December 1, 2022, to December 14, 2022 (2 weeks).

Our methods are implemented in R. Most of the methods have corresponding R packages, but for methods like LSTM and HFC we did not rely entirely on existing packages. In particular, HFC is our own method, and the functions within it are self-programmed. Next, we introduce the packages used for each method and their required parameters.

4.5.1 ARIMA

The ARIMA() function in R is from the fable package. It performs automatic model selection, choosing the optimal combination of the p, d, and q parameters via a stepwise search (Hyndman and Athanasopoulos [2021]).

4.5.2 Dynamic Regression Model

Since we assume the error term of the dynamic regression model is autocorrelated (not white noise) and follows an ARIMA model, we use the same package for this method as for ARIMA (Hyndman and Athanasopoulos [2021]). As this is a regression method, the lagged traffic volume serves as the target variable, with temp.average, precip.average, ws.average, and day.type as independent variables.

4.5.3 Exponential Smoothing Model

This model utilizes the ETS() function in R (Hyndman and Athanasopoulos [2021]). The ETS() function fits the exponential smoothing (ES) model, which is widely used for time series forecasting, especially when trend and seasonality are present. The ETS model consists of three components: Error (E), Trend (T), and Seasonality (S).
As mentioned in our methodology, the error can be additive (A) or multiplicative (M), the trend can be none (N) or additive (A), and the seasonality can be none (N), additive (A), or multiplicative (M). Different combinations of these components construct various ETS models. For example:

• ETS(A, N, N) represents a model with additive error and no trend or seasonality, which is the simple exponential smoothing (SES) method;

• ETS(A, A, N) represents a model with additive error, additive trend, and no seasonality, which is Holt's linear trend method;

• ETS(A, A, A) represents a model with additive error, additive trend, and additive seasonality, which is Holt-Winters' additive method;

• ETS(M, A, M) represents a model with multiplicative error, additive trend, and multiplicative seasonality, which is Holt-Winters' multiplicative method.

When using this function, we did not specify the model components but allowed it to automatically select the most suitable ES model, including the best error, trend, and seasonality types, by comparing the AIC (Akaike Information Criterion) values of candidate models.

4.5.4 Support Vector Regression (SVR)

In R, the svm() function from the e1071 package is used to build an SVR model (Santos and Castillo [2019]). Depending on the kernel function, SVR can act as either a linear or a non-linear model.

SVR Non-linear Model: We set the kernel function to its default, the Radial Basis Function (RBF) kernel. A key feature of the RBF kernel is its non-linear mapping capability: data points are implicitly mapped to a high-dimensional feature space, where data that are not linearly separable in the original lower-dimensional space may become linearly separable. Therefore, SVR with the RBF kernel can effectively address many non-linear problems. We denote this model as SVR_NL.
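The kernel itself is simply a similarity function on input vectors: for two points x and z, the RBF kernel is k(x, z) = exp(−γ‖x − z‖²). A minimal sketch (the γ value here is an arbitrary illustration; e1071's svm() picks its own default from the data dimension):

```python
from math import exp

def rbf_kernel(x, z, gamma=0.1):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return exp(-gamma * sq_dist)

# Similarity is 1 for identical points and decays with squared distance,
# which is what lets SVR fit smooth non-linear response surfaces.
print(rbf_kernel((0.0, 0.0), (0.0, 0.0)))  # 1.0
```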
SVR Linear Model: Similarly, by setting the kernel function to linear, SVR attempts to find a linear function that fits the data in the original input space, thereby acting as a linear model. We denote this model as SVR_L.

SVR Model with Additional Features: When additional features such as "temp.average", "ws.average", and "precip.average" are included in the model, two additional models are denoted SVR.X_NL and SVR.X_L, where "X" represents the additional features.

4.5.5 K-Nearest Neighbor (KNN)

The knn_forecasting() function is provided by the tsfknn package in R for time series forecasting (Francisco Martinez [2023]). This function uses the KNN algorithm for prediction and is suitable for various time series analysis scenarios. In our example, the parameter h = 48 forecasts the values of the next 48 time points. The parameter lags = 1:48 specifies using data from the past 1 to 48 time points as features, which helps capture patterns over a longer time range. The parameter k = 3 means the 3 nearest neighbors are referenced during prediction, balancing the accuracy and stability of the forecast. The parameter msas = "MIMO" sets the multi-step prediction strategy to Multi-Input Multi-Output, allowing the model to predict all 48 future time points at once. With these parameters, we can construct an efficient predictive model that adapts to the complexities of time series data.

4.5.6 Long Short-Term Memory (LSTM)

Similar to SVR, if we only consider the lagged values, we have a model called LSTM_NL; if we add further variables such as "temp.average", "ws.average", and "precip.average", we have a model called LSTM.X_NL. In R, using LSTM for time series forecasting typically involves the keras, tensorflow, and reticulate packages. The keras package provides a high-level neural network API, simplifying the construction and training of deep learning models.
The tensorflow package is R's interface to TensorFlow, used for building and training deep learning models, while the reticulate package serves as a two-way interface between R and Python, ensuring compatibility with TensorFlow and Keras. Through these packages, one can efficiently implement and train LSTM models in R for accurate time series forecasting.

The lstm_build_model() function is used to build and train an LSTM-based time series forecasting model. Its parameters include:

• x: the feature matrix of the training data. It contains the input data for model training, with each row representing a time point and each column a lagged time step.

• y: the target matrix of the training data. It contains the target values corresponding to each input row of the feature matrix, i.e., the values the model needs to predict.

• units: the number of units in each LSTM layer (set to 50). This parameter determines the number of neurons per LSTM layer.

• batch: the batch size (set to 1). The batch size determines the amount of data used in each model update; for stateful LSTMs, small batch sizes (such as 1) are typically used to maintain sequence continuity and state.

• epochs: the number of training epochs (set to 20). This parameter determines how many times the model passes over the entire training dataset. More epochs allow the model to better learn the patterns in the data, but too many can lead to overfitting.

• rate: the dropout rate of the Dropout layer (set to 0.5). Dropout is a regularization technique that randomly drops neurons during training to prevent overfitting. A higher dropout rate guards more strongly against overfitting, but if set too high it may discard too much information and hurt model performance.

4.5.7 Neural Network Autoregression (NNETAR)

Similar to SVR, if we only consider the lagged values, we have a model called NNETAR_NL.
If we add further variables such as "temp.average", "ws.average", and "precip.average", we have a model called NNETAR.X_NL.

The NNETAR() function in R is from the fable package (Hyndman and Athanasopoulos [2021]). For seasonal data, the model is written NNAR(p, P, k)_m, where p is the number of lagged inputs, P is the number of seasonal lagged inputs, k is the number of neurons in the hidden layer, and m is the seasonal period. More generally, an NNAR(p, P, k)_m model has inputs (y_{t-1}, y_{t-2}, ..., y_{t-p}, y_{t-m}, y_{t-2m}, ..., y_{t-Pm}) and k neurons in the hidden layer. For example, an NNAR(3, 1, 2)_12 model has inputs y_{t-1}, y_{t-2}, y_{t-3}, and y_{t-12}, and two neurons in the hidden layer. When using this function, we let NNETAR() choose the values of p and P automatically. If k is not specified, it defaults to (p + P + 1)/2, rounded to the nearest integer.

4.6 Evaluation Metrics

In our project, we consider two evaluation metrics: root mean square error (RMSE) and mean absolute percentage error (MAPE).

4.6.1 RMSE

The RMSE is defined as

RMSE = \sqrt{ \frac{1}{T} \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 },

where \hat{y}_t denotes the predicted value, y_t the actual value, and T the total number of test samples. RMSE is sensitive to outliers because the errors are squared, amplifying their effect. The unit of RMSE is the same as that of the actual values, so it can be interpreted directly as the typical difference between predicted and actual values.

4.6.2 MAPE

The MAPE is defined as

MAPE = \frac{1}{T} \sum_{t=1}^{T} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100\%,

where \hat{y}_t denotes the predicted value, y_t the actual value, and T the total number of test samples. MAPE expresses the prediction error as a relative percentage (%), making it convenient for comparing data of different scales.
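Both metrics are straightforward to compute from a forecast and the held-out test window; a minimal Python sketch of the two definitions above:

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error, in the same units as the data."""
    return sqrt(sum((y - f) ** 2 for y, f in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(actual, predicted)) / len(actual)

# Toy example: 10% error on two points, exact on the third.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(round(mape(actual, predicted), 2))  # 6.67
```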
When the actual value y_t is close to zero, MAPE becomes unstable and can produce extremely large error values, making it unsuitable for datasets containing zero or near-zero values.

4.7 Performance Comparison

After selecting two linear models and two non-linear models based on their MAPE values, we used these models to construct the HFC model. We then compared the combined models with the four individual models using MAPE and RMSE. Each comparison table has three columns: the model name, its MAPE value, and its RMSE value.

As described in the Data Collection section, we utilized two distinct datasets: the 2022 Hourly Traffic and Climate Dataset and the 2017-2022 Daily Traffic and Climate Dataset. Before comparing the results of different models, we subdivide these datasets further. The first dataset is split into the 2022 Hourly Both Directions Dataset, the 2022 Hourly Positive Direction Dataset, and the 2022 Hourly Negative Direction Dataset to account for traffic direction. The second dataset does not involve traffic direction and only includes the 2017-2022 Daily Dataset.

For the hourly datasets, each comparison table covers three day types: weekday, weekend, and holiday; the Parameter Estimation section above gives the specific details, and each day type corresponds to a distinct training and test set. For the daily dataset, each table includes only one day type, weekdays.

In the result tables, for clarity, we assign specific names to the different types of models. The models are primarily categorized as linear or non-linear, and we add the corresponding subscripts to the model names: "_L" for linear models and "_NL" for non-linear models.
Thus, the names of the linear models with subscripts are:

• ARIMA_L
• DReg_L
• ETS_L
• SVR_L

The names of the non-linear models with subscripts are:

• SVR_NL
• KNN_NL
• LSTM_NL
• NNETAR_NL

Since some models incorporate not only traffic volume data but also additional feature-related data, we append "X" to the names of these models:

• SVR.X_L
• SVR.X_NL
• LSTM.X_NL
• NNETAR.X_NL

Next, we analyze the comparison results for the different datasets. For the 2022 Hourly Traffic and Climate Dataset, to evaluate the performance of our model in predicting different day types, we further divided the dataset into the three categories weekday, weekend, and holiday.

4.7.1 Results of 2022 Hourly Both Directions Dataset

Weekday

First, we analyze the results for the weekday day type and select the top two linear models and the top two non-linear models based on MAPE. The tables below present the results of the linear and non-linear models, ordered by MAPE.

Table 4.1: Linear Model Results

Model     MAPE
SVR_L     14.46
DReg_L    16.15
ARIMA_L   16.21
SVR.X_L   18.54
ETS_L     28.18

Table 4.2: Non-Linear Model Results

Model        MAPE
SVR.X_NL     10.51
SVR_NL       14.27
LSTM.X_NL    21.68
NNETAR.X_NL  28.45
NNETAR_NL    30.73
KNN_NL       42.89
LSTM_NL      44.11

Based on Table 4.1, we selected the top two models: SVR_L and DReg_L. Similarly, based on Table 4.2, we selected the two models SVR.X_NL and LSTM.X_NL. Subsequently, we used these four models to construct the HFC model.

Table 4.3: The comparison table of four individual forecast models with proposed HFC models (2022 hourly both directions - weekday)

Model      MAPE   RMSE
DReg_L     16.15  372.24
SVR_L      14.46  292.96
SVR.X_NL   10.51  218.33
LSTM.X_NL  21.68  344.05
AVG        11.7   263.3
DDHFC      9.89   217.5
DDPHFC     12.09  250.25

As shown in Table 4.3, the proposed HFC model (DDHFC) demonstrates superior accuracy compared to the individual models, particularly in terms of the MAPE metric.
Specifically, DDHFC has the lowest MAPE value of 9.89 and an RMSE of 217.5, indicating high forecasting accuracy. In contrast, the LSTM.X_NL model has the highest MAPE of 21.68, suggesting greater prediction errors. DDHFC not only achieves the best MAPE but also a relatively low RMSE, highlighting its effectiveness. This suggests that the HFC model, constructed from the selected two linear and two non-linear models, exhibits strong predictive capability across multiple evaluation metrics, validating its effectiveness in traffic forecasting.

Figure 4.12 displays the forecast results of the proposed HFC models and the four individual models for the 2022 bi-directional traffic data on weekdays. The black solid line represents the actual observed traffic data, while the other colored lines represent the forecasts of the different models.

Figure 4.12: Forecast plot for proposed HFC model and four individual models (2022 Hourly Both Directions - weekday)

The graph reveals that the HFC model (DDHFC) closely aligns with the actual data, particularly at peak traffic volumes. In comparison, although the predictions of the other models are also close to the actual data, their deviations are relatively larger, especially during periods of dramatic traffic change such as the morning and evening rush hours. Overall, the forecast results in Figure 4.12 demonstrate that the proposed HFC models exhibit strong stability and accuracy in complex traffic volume forecasting tasks, particularly excelling at predicting peak periods and outperforming the individual models.

Weekend and Holiday

Next, we analyze the results for the weekend and holiday day types. The procedure for selecting the top two linear models and the top two non-linear models is detailed in Appendix B.1.
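The AVG benchmark and the combination models compared in these tables are convex weightings of the component forecasts: non-negative weights summing to one, with equal weights reproducing AVG. A hedged sketch of that combination step (the weights and forecast values shown are illustrative, not the data-driven values the HFC actually estimates):

```python
def combine(forecasts, weights=None):
    """Convex combination of component forecast paths; equal weights give the AVG benchmark."""
    n = len(forecasts)
    if weights is None:
        weights = [1.0 / n] * n  # AVG benchmark
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-9
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(len(forecasts[0]))]

linear_forecast = [1000.0, 1200.0]     # illustrative component forecasts (vehicles/hour)
nonlinear_forecast = [1100.0, 1400.0]
print(combine([linear_forecast, nonlinear_forecast]))               # [1050.0, 1300.0]
print(combine([linear_forecast, nonlinear_forecast], [0.75, 0.25]))  # [1025.0, 1250.0]
```

A constrained convex combination like this can never forecast outside the range spanned by its components, which is one reason combinations tend to be more stable than any single model.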
Table 4.4: The comparison table of four individual forecast models with proposed HFC models (2022 hourly both directions)

Table 4.5: Weekend

Model     MAPE   RMSE
SVR.X_L   10.44  198.1
ARIMA_L   28.32  591.38
SVR_NL    8.56   203.09
KNN_NL    11.38  295.81
AVG       9.55   236.87
DDHFC     7.88   178.38
DDPHFC    6.99   214.7

Table 4.6: Holiday

Model        MAPE   RMSE
SVR.X_L      12.49  240.09
ARIMA_L      29.4   735.71
SVR_NL       8.65   200.69
NNETAR.X_NL  9.21   177.76
AVG          12.37  310.75
DDHFC        9.05   176.74
DDPHFC       7.68   183.38

Table 4.4 summarizes the comparison of the four individual forecast models with the proposed HFC models for the weekend and holiday day types, specifically for the 2022 hourly both-directions traffic data.

For the weekend, Table 4.5 reports the performance of the different models in terms of MAPE and RMSE. Among the models, DDPHFC achieves the lowest MAPE value of 6.99, indicating the highest prediction accuracy, while DDHFC also performs well with a MAPE of 7.88 and the lowest RMSE of 178.38, providing precise forecasts with minimal error. In contrast, the ARIMA_L model shows significantly higher errors, with a MAPE of 28.32 and an RMSE of 591.38, performing poorly compared to the other models.

For the holiday, Table 4.6 summarizes the performance of the models. The DDPHFC model stands out with the lowest MAPE of 7.68, reflecting its superior accuracy, and the DDHFC model also performs strongly, achieving the lowest RMSE of 176.74, indicating its robustness in minimizing forecast errors. On the other hand, the ARIMA_L model exhibits the highest errors, with a MAPE of 29.4 and an RMSE of 735.71, demonstrating comparatively weak predictive performance.
Overall, the results highlight that both the DDPHFC and DDHFC models demonstrate strong predictive capabilities under both weekend and holiday conditions, particularly in terms of MAPE and RMSE, significantly outperforming the traditional time series model ARIMA. This indicates that data-driven hybrid models may be better suited to the complex, time-varying data encountered in traffic flow forecasting.

Figures 4.13 and 4.14 illustrate the forecast performance of the various models on weekend and holiday traffic data (2022 hourly both directions), respectively. In both figures, the black solid lines represent the actual observed traffic data, while the colored lines show the forecasts of the different models.

Figure 4.13: Forecast plots for proposed models and the combination models (2022 Hourly Both Directions - weekend)

In Figure 4.13, which depicts weekend traffic, the forecast curves of the various models generally align well with the actual data, particularly at traffic peaks and valleys. The DDHFC and DDPHFC models (red and yellow lines, respectively) perform exceptionally well, closely matching the actual data, especially during the morning and evening peak periods, which suggests these models are highly accurate in capturing traffic patterns. Conversely, the ARIMA_L model (dark green line) shows larger prediction errors during certain periods, particularly peak times, indicating its limitations in handling complex traffic patterns.

Figure 4.14: Forecast plots for proposed models and the combination models (2022 Hourly Both Directions - holiday)

Similarly, in Figure 4.14, which presents the holiday traffic data, most models capture the general trend of traffic flow, though accuracy varies across models. Again, the DDHFC and DDPHFC models stand out by closely following the actual data, especially during peak traffic periods.
These models demonstrate a strong ability to predict the sharp increase in traffic volume during the morning and the subsequent gradual decline throughout the day. The ARIMA_L model, however, continues to show larger deviations from the actual data during peak hours, struggling to capture high-traffic periods accurately.

In summary, both figures underscore that the proposed DDHFC and DDPHFC models achieve higher forecasting accuracy in both weekend and holiday traffic predictions. They effectively capture traffic peaks and valleys, proving more reliable than the other models, particularly under complex and varying traffic conditions.

4.7.2 Results of 2022 Hourly Positive Direction Dataset

Next, we analyze the comparison tables. The procedure for selecting the top two linear models and the top two non-linear models is detailed in Appendix B.2.

Table 4.7: Comparison of Four Individual Forecast Models with Proposed HFC Model (2022 Hourly Positive Direction)

Table 4.8: Weekday

Model      MAPE   RMSE
SVR_L      15.12  201.52
ARIMA_L    17.3   219.84
NNETAR_NL  14.6   189.43
SVR.X_NL   15.77  166.93
AVG        11.78  171.69
DDHFC      14.42  166.57
DDPHFC     12.12  169.2

Table 4.9: Weekend

Model      MAPE   RMSE
SVR.X_L    11.07  152.21
ARIMA_L    24.82  294.1
SVR_NL     9.54   125.68
NNETAR_NL  11.93  166.58
AVG        10.01  150.13
DDHFC      8.96   125.31
DDPHFC     7.97   134.55

Table 4.10: Holiday

Model        MAPE   RMSE
SVR_L        11     133.1
ARIMA_L      20.43  323.87
NNETAR.X_NL  9.63   93.2
SVR_NL       10.63  133.24
AVG          9.61   141.89
DDHFC        7.5    81.28
DDPHFC       6.53   86.64

Tables 4.8, 4.9, and 4.10 present the performance of the various forecast models using MAPE and RMSE as evaluation metrics. Across these tables, the DDPHFC and DDHFC models consistently stand out for their accuracy and error minimization. In Table 4.8, the AVG model achieves the lowest MAPE of 11.78, while the DDHFC model records the lowest RMSE of 166.57, highlighting its effectiveness in minimizing prediction errors.
In comparison, the ARIMA_L model exhibits higher errors, with a MAPE of 17.3 and an RMSE of 219.84, making it less reliable. Similarly, in Table 4.9, the DDPHFC model achieves the lowest MAPE of 7.97, and the DDHFC model again shows the lowest RMSE of 125.31, while the ARIMA_L model performs poorly with the highest MAPE of 24.82 and an RMSE of 294.1. Finally, Table 4.10 reaffirms the superior performance of the DDPHFC and DDHFC models: the DDPHFC model achieves a MAPE of 6.53, while the DDHFC model records the lowest RMSE of 81.28. The ARIMA_L model, on the other hand, continues to show the highest errors, with a MAPE of 20.43 and an RMSE of 323.87.

Overall, the results across these tables indicate that the DDPHFC and DDHFC models are more reliable and accurate for traffic forecasting tasks, significantly outperforming traditional models such as ARIMA_L in both accuracy and consistency.

Figures 4.15, 4.16, and 4.17 illustrate the forecast performance of the various models on weekday, weekend, and holiday traffic data (2022 hourly positive direction), respectively. In each figure, the black solid lines represent the actual observed traffic data, while the colored lines show the forecasts of the different models.

Figure 4.15: Forecast plot for proposed HFC model and four individual models (2022 Hourly Positive Direction - weekday)

In Figure 4.15, which depicts weekday traffic, it is evident that most models follow the general trend of the actual traffic data, capturing the peaks and troughs in traffic volume. The AVG, DDHFC, and DDPHFC models (green, red, and yellow lines, respectively) are particularly notable for their close alignment with the actual data, especially during peak periods; they effectively predict the sharp increases and decreases in traffic, demonstrating their robustness in handling fluctuations. Similarly, in Figure 4.16, which presents the weekend traffic data, most models capture the general trend of traffic flow.
It is clear that the DDHFC and DDPHFC models (red and yellow lines) closely follow the actual data, particularly during peak traffic periods, showing strong predictive performance and effectively capturing the increases and decreases in traffic volume throughout the weekend.

Figure 4.16: Forecast plots for proposed models and the combination models (2022 Hourly Positive Direction - weekend)

The ARIMA_L model (dark green line), however, continues to show larger deviations from the actual data during peak hours, struggling to capture the weekend traffic patterns accurately.

Figure 4.17 focuses on the holiday traffic data. From the graph, it is evident that the DDHFC and DDPHFC models (red and yellow lines, respectively) closely align with the actual traffic data, particularly during peak traffic periods. These models effectively capture the rise in traffic volume during the morning hours and the gradual decline in the evening, demonstrating their reliability in forecasting holiday traffic patterns.

Figure 4.17: Forecast plots for proposed models and the combination models (2022 Hourly Positive Direction - holiday)

In contrast, the ARIMA_L model (dark green line) again shows significant deviations, especially during high-traffic periods, indicating its ongoing struggle to model holiday traffic flow accurately.

In summary, all three figures underscore that the proposed DDHFC and DDPHFC models achieve higher forecasting accuracy across the different traffic scenarios, including weekdays, weekends, and holidays. They effectively capture traffic peaks and valleys, proving more reliable than the other models, particularly under the complex and varying traffic conditions present in these timeframes.

4.7.3 Results of 2022 Hourly Negative Direction Dataset

Next, we analyze the comparison tables.
The procedure for selecting the top two linear models and the top two non-linear models is detailed in Appendix B.3.

Table 4.11: The comparison table of four individual forecast models with proposed HFC models (2022 hourly negative direction)

Table 4.12: Weekday

Model      MAPE   RMSE
SVR_L      16.86  158.73
ARIMA_L    17.4   198.8
SVR_NL     16.73  135.87
LSTM.X_NL  38.58  235.05
AVG        18.34  156.03
DDHFC      14.67  132.93
DDPHFC     18.34  156.03

Table 4.13: Weekend

Model        MAPE   RMSE
SVR.X_L      15.52  156.36
ARIMA_L      38.79  413.78
NNETAR.X_NL  11.77  144.8
KNN_NL       12.03  158.4
AVG          14.75  166.79
DDHFC        10.48  127.41
DDPHFC       10.95  141.17

Table 4.14: Holiday

Model        MAPE   RMSE
SVR_L        22.54  227.93
ETS_L        36.6   383.95
SVR_NL       10.66  95.4
NNETAR.X_NL  18.94  164.15
AVG          18.93  189.55
DDHFC        10.66  95.4
DDPHFC       11.8   123.16

Table 4.11 summarizes the comparison of the four individual forecast models with the proposed HFC models across the different day types (weekday, weekend, and holiday), specifically for the 2022 hourly negative direction traffic data.

For the weekday (Table 4.12), the DDHFC model performed best, with a MAPE of 14.67 and an RMSE of 132.93, demonstrating superior accuracy and the lowest error in weekday traffic prediction. In contrast, the LSTM.X_NL model underperformed, with a MAPE as high as 38.58 and an RMSE of 235.05. The AVG and DDPHFC models showed identical performance, both with a MAPE of 18.34 and an RMSE of 156.03, indicating that the DDPHFC model found the optimal weight to be 0.5 for each model in the last stage.

For the weekend (Table 4.13), the DDHFC model again took the lead, with a MAPE of 10.48 and an RMSE of 127.41, making it the most reliable model for weekend traffic prediction. Conversely, the ARIMA_L model had significantly higher errors, with a MAPE of 38.79 and an RMSE of 413.78, revealing its shortcomings in weekend traffic forecasting.

For the holiday (Table 4.14), both the DDHFC and SVR_NL models performed consistently and excellently, with a MAPE of 10.66 and an RMSE of 95.4.
This indicates that the DDHFC model assigned the full weight of 1 to the SVR_NL model and to the non-linear combined model in the second and last stages, respectively, showcasing its strong accuracy in holiday traffic prediction. In contrast, the ETS_L model performed worst, with a MAPE of 36.6 and an RMSE of 383.95, while the DDPHFC model showed good performance with a MAPE of 11.8 and an RMSE of 123.16.

Overall, the results across all day types consistently highlight the effectiveness of our proposed model (DDHFC) in providing accurate and reliable traffic forecasts, outperforming the other models in most scenarios.

The figures below display the forecast results of the proposed HFC models and the four individual models for the 2022 negative direction traffic data on weekdays, weekends, and holidays.

In the weekday forecast, the DDHFC model demonstrates the best performance, with a MAPE of 14.67 and an RMSE of 132.93, as shown in Table 4.12.

Figure 4.18: Forecast plots for proposed models and individual models (2022 Hourly Negative Direction - weekday)

Figure 4.18 illustrates that the DDHFC model (red line) closely aligns with the actual data (black line), particularly excelling in predicting peak and low traffic periods. In contrast, the LSTM.X_NL model (orange line) shows higher MAPE and RMSE values in the table, and in the figure its predictions deviate from the actual data during certain periods (notably underestimating traffic volume after 2022-12-02 15:00:00), indicating its weaker performance.

For the weekend forecast, the DDHFC model continues to stand out, with a MAPE of 10.48 and an RMSE of 127.41 in Table 4.13. Figure 4.19 shows that the DDHFC model's predictions closely match the actual data, especially during peak traffic periods.
In contrast, the ARIMA_L model (dark green line) performs the worst, displaying the highest MAPE and RMSE values in the table, and its prediction curve noticeably deviates from the actual data during peak periods, reflecting significant errors.

Figure 4.19: Forecast plots for proposed models and individual models (2022 Hourly Negative Direction-weekend)

In the holiday traffic forecast, both the DDHFC and SVR_NL models show the best performance, with a MAPE of 10.66 and an RMSE of 95.4 in Table 4.14. Figure 4.20 demonstrates that the prediction curves of these two models are very close to the actual data, particularly during peak traffic periods, effectively capturing traffic flow variations. In contrast, the ETS_L model (dark green line) performs the worst, with a MAPE of 36.6 and an RMSE of 383.95 in the table, and the figure shows that its prediction curve significantly diverges from the actual data at multiple times. Overall, the DDHFC model performs excellently across weekday, weekend, and holiday forecasts, providing accurate and consistent predictions. In comparison, traditional models such as ARIMA_L and ETS_L clearly underperform relative to the DDHFC model.

Figure 4.20: Forecast plots for proposed models and individual models (2022 Hourly Negative Direction-holiday)

4.7.4 Results of 2017-2022 Daily Dataset

For the 2017-2022 Daily Traffic and Climate Dataset, since the traffic volume is recorded daily, we only consider the weekday type. First, we analyze the results for the weekday day type; the procedure for selecting the top two linear models and the top two non-linear models is detailed in Appendix B.4. We then used these four models to construct the HFC model. The comparison results of the four individual forecast models, along with the proposed HFC models, are shown in Table 4.15.
Table 4.15: The comparison of four individual forecast models with the proposed HFC models (2017-2022 daily-weekday)

Model     MAPE  RMSE
SVR.X_L   2.92  2039.42
ARIMA_L   6.22  3479.72
SVR.X_NL  2.36  1623.65
LSTM_NL   4.21  2686.09
AVG       3.45  1986.36
DDHFC     2.3   1529.49
DDPHFC    2.85  1740.94

From Table 4.15, it is evident that there are significant differences in the performance of the various models in predicting daily weekday traffic from 2017 to 2022. The DDHFC model stands out with a MAPE of 2.3 and an RMSE of 1529.49, demonstrating its superiority in both accuracy and error control. The SVR.X_NL model is a close second, with a MAPE of 2.36 and an RMSE of 1623.65, indicating high precision as well. In contrast, the ARIMA_L model performs the worst, with a MAPE of 6.22 and a high RMSE of 3479.72, highlighting its large prediction errors and lower accuracy. Additionally, the LSTM_NL and SVR.X_L models exhibit relatively high RMSE values of 2686.09 and 2039.42, respectively, suggesting that their predictive performance for long-term daily traffic data is not as strong as that of the DDHFC and SVR.X_NL models. These results indicate that the DDHFC model provides the most stable and accurate predictions when dealing with long-term traffic data, outperforming the other individual models. As seen in Figure 4.21, the DDHFC model (red line) closely aligns with the actual data (black line) when predicting daily weekday traffic flow from 2017 to 2022, particularly excelling during peak and low traffic periods, demonstrating high predictive accuracy. This is further supported by Table 4.15, where the DDHFC model shows the best MAPE and RMSE values, 2.3 and 1529.49 respectively, validating the superior performance illustrated in the figure. In contrast, the ARIMA_L model (dark green line) performs the worst according to the table, with a MAPE of 6.22 and an RMSE as high as 3479.72.
The figure shows that the ARIMA_L model's prediction curve significantly deviates from the actual data in several periods, particularly during sharp declines and increases in traffic flow, highlighting its shortcomings in predictive capability.

Figure 4.21: Forecast plot for proposed HFC model and four individual models (2017-2022 Daily-weekday)

The SVR.X_NL model (blue line) and the DDPHFC model (yellow line) also perform well in the figure, closely matching the actual data, especially during periods of significant fluctuation. According to Table 4.15, these models have MAPEs of 2.36 and 2.85 and RMSEs of 1623.65 and 1740.94, respectively, indicating their strong performance in handling long-term traffic data. Overall, the combined analysis of Figure 4.21 and Table 4.15 shows that our proposed model (DDHFC) offers remarkable accuracy and stability in predicting long-term weekday traffic data, outperforming the other individual models. In contrast, the ARIMA_L model exhibits the poorest performance, with notably large prediction errors. It is noteworthy that the RMSE values are relatively high for all models, indicating sizeable absolute errors. One possible reason is the impact of the COVID-19 pandemic in 2020, which caused a sharp decline in traffic volume and disrupted existing patterns; since data from that period were still included when training the models, predictions of future traffic volumes may have been less accurate.

Chapter 5

Conclusions

This project focused on predicting traffic flow on the Lions Gate Bridge, a vital infrastructure link between downtown Vancouver and the northern regions. The study addressed the pressing issue of traffic congestion that affects thousands of commuters and tourists daily. We developed a novel hybrid traffic model that combines the strengths of both linear and non-linear approaches to provide more accurate and reliable traffic forecasts.
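The two-stage structure described in this project (combining the two selected linear forecasts, the two selected non-linear forecasts, and then the two composites) can be sketched as follows. This is an illustrative re-implementation in Python, although the project itself was carried out in R; a simple MAPE-minimizing grid search stands in for the FESS weight-estimation step, and all function names here are ours, not code from the thesis.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 100.0 * float(np.mean(np.abs((actual - forecast) / actual)))

def rmse(actual, forecast):
    """Root mean squared error."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def best_weight(actual, f1, f2, grid=np.linspace(0.0, 1.0, 101)):
    """Grid-search the weight w minimizing MAPE of w*f1 + (1-w)*f2
    (a stand-in for the FESS weight-estimation procedure)."""
    errors = [mape(actual, w * f1 + (1 - w) * f2) for w in grid]
    return float(grid[int(np.argmin(errors))])

def hierarchical_combination(actual, lin1, lin2, nl1, nl2):
    """Stage 1: combine the two linear and the two non-linear forecasts.
    Stage 2: combine the linear and non-linear composites."""
    lin1, lin2, nl1, nl2 = (np.asarray(f, float) for f in (lin1, lin2, nl1, nl2))
    w_lin = best_weight(actual, lin1, lin2)
    w_nl = best_weight(actual, nl1, nl2)
    f_lin = w_lin * lin1 + (1 - w_lin) * lin2   # linear composite
    f_nl = w_nl * nl1 + (1 - w_nl) * nl2        # non-linear composite
    w_top = best_weight(actual, f_lin, f_nl)
    return w_top * f_lin + (1 - w_top) * f_nl
```

When one component dominates, as in the holiday results where the weight collapsed to 1 on SVR_NL, the combined forecast reduces to that single model; when the two components are equally good, the search settles near 0.5, as with the weekday DDPHFC result.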
Our findings highlighted the effectiveness of the proposed hybrid model, demonstrating its superiority over traditional forecasting methods in predicting traffic flow on the Lions Gate Bridge. By integrating multiple predictive techniques, the model significantly enhances the accuracy and reliability of traffic predictions, providing urban planners with actionable insights for more effective traffic management and improved transportation network efficiency. This study contributes to the field of ITS by supporting sustainable urban development and reducing the environmental impact of traffic congestion. However, the current model is limited by its reliance on only the two best-performing linear and two best-performing non-linear models, which may constrain its adaptability and overall predictive power.

For future work, there are several avenues to explore. Firstly, the model can be extended by not limiting the selection to just two linear and two non-linear models. Incorporating a larger set of models could further enhance the robustness and accuracy of the predictions, allowing the hybrid model to better capture the complexities of traffic flow. Secondly, the current method of estimating the weights in the model is based on the FESS. While effective, other optimization techniques, such as genetic algorithms or machine learning approaches, could potentially yield better results. These methods could be explored to refine the weight-estimation process, leading to further improvements in model performance.

Overall, this project has laid a solid foundation for future research and development in traffic flow forecasting. The proposed hybrid model has shown promise, and with further enhancements, it could serve as a powerful tool in managing urban traffic congestion, not only on the Lions Gate Bridge but in other urban centers facing similar challenges.

Bibliography

Keras LSTM neural networks for univariate time-series in R.
URL https://rpubs.com/pawel-wieczynski/891765.

TensorFlow for R. URL https://tensorflow.rstudio.com/install/.

Andrawis, Atiya, and El-Shishiny. Combination of long term and short term forecasts, with application to tourism demand forecasting. International Journal of Forecasting, 27:870-886, 2011. doi: 10.1016/j.ijforecast.2010.05.019. URL https://www.sciencedirect.com/science/article/pii/S0169207010001147.

Association of Consulting Engineering Companies British Columbia. Lions Gate Bridge reversible lane control system rehabilitation. https://acecbcawards.com/2023-awards/2023-soft-engineering/lions-gate-bridge-reversible-lane-control-system-rehabilitation/#one.

Ata, Khan, Abbas, Khan, and Ahmad. Adaptive IoT empowered smart road traffic congestion control system using supervised machine learning algorithm. The Computer Journal, 64:1672-1679, 2020. doi: 10.1093/comjnl/bxz129. URL https://academic.oup.com/comjnl/article/64/11/1672/5838271.

Shekhar Babar. Time series missing value imputation. May 2023. URL https://medium.com/@shekhar.babar/time-series-missing-value-imputation-fa51a7b1ac49.

Charalampos Bratsas, Kleanthis Koupidis, Josep-Maria Salanova, Konstantinos Giannakopoulos, Aristeidis Kaloudis, and Georgia Aifadopoulou. A comparison of machine learning methods for the prediction of traffic speed in urban places. Sustainability, 12(1):142, 2019. doi: 10.3390/su12010142. URL https://www.mdpi.com/2071-1050/12/1/142.

Environment and Climate Change Canada. Climate data extraction tool. https://climate-change.canada.ca/climate-data/#/daily-climate-data.

Chan, Dillon, Singh, and Chang. Traffic flow forecasting neural networks based on exponential smoothing method. 2011 6th IEEE Conference on Industrial Electronics and Applications, 2011a. doi: 10.1109/ICIEA.2011.5975612. URL https://ieeexplore.ieee.org/abstract/document/5975612.

Chan, Dillon, Singh, and Chang.
Neural-network-based models for short-term traffic flow forecasting using a hybrid exponential smoothing and Levenberg-Marquardt algorithm. IEEE Transactions on Intelligent Transportation Systems, 13:644-654, 2011b. doi: 10.1109/TITS.2011.2174051. URL https://ieeexplore.ieee.org/abstract/document/6088012.

Chung Chen and Lon-Mu Liu. Joint estimation of model parameters and outlier effects in time series. Journal of the American Statistical Association, 88(421):284, 1993. doi: 10.2307/2290724. URL https://www.tandfonline.com/doi/abs/10.1080/01621459.1993.10594321.

City of Vancouver. Stanley Park. https://vancouver.ca/parks-recreation-culture/stanley-park.aspx.

Shengdong Du, Tianrui Li, Xun Gong, Yan Yang, and Shi Jinn Horng. Traffic flow forecasting based on hybrid deep learning framework. IEEE Conference Publication, IEEE Xplore, November 2017. URL https://ieeexplore.ieee.org/abstract/document/8258813.

Eddelbuettel, Nguyen, and Leitch. R interface to the 'QuantLib' library. https://cran.r-project.org/web/packages/RQuantLib/RQuantLib.pdf, July 2024.

Francisco Martinez, Maria P. Frias, Francisco Charte, and Antonio J. Rivera. Time series forecasting with KNN in R: the tsfknn package. December 2023. URL https://cran.r-project.org/web/packages/tsfknn/vignettes/tsfknn.html.

Rui Fu, Zuo Zhang, and Li Li. Using LSTM and GRU neural network methods for traffic flow prediction. IEEE Conference Publication, IEEE Xplore, November 2016. URL https://ieeexplore.ieee.org/document/7804912.

Historical Data. Climate historical data website. https://climate.weather.gc.ca/historical_data/search_historic_data_e.html?
hlyRange=1976-01-20%7C2024-02-18&dlyRange=1925-11-01%7C2024-02-18&mlyRange=1925-01-01%7C2007-02-01&urlExtension=_e.html&searchType=stnProx&optLimit=specDate&Month=1&Day=18&StartYear=1840&EndYear=2024&Year=2022&selRowPerPage=25&Line=0&txtRadius=25&optProxType=navLink&txtLatDecDeg=49.314783333333&txtLongDecDeg=-123.11527777778&timeframe=2.

Qinzhong Hou, Junqiang Leng, Guosheng Ma, Weiyi Liu, and Yuxing Cheng. An adaptive hybrid model for short-term urban traffic flow prediction. Physica A, 527:121065, 2019. doi: 10.1016/j.physa.2019.121065. URL https://www.sciencedirect.com/science/article/pii/S0378437119306508.

Rob J Hyndman and George Athanasopoulos. Forecasting: principles and practice, 3rd edition. 2021. URL https://otexts.com/fpp3/.

INRIX. INRIX 2023 global traffic scorecard. https://inrix.com/scorecard/.

Danqing Kang, Yisheng Lv, and Yuan-yuan Chen. Short-term traffic flow prediction with LSTM recurrent neural network. IEEE Conference Publication, IEEE Xplore, October 2017. URL https://ieeexplore.ieee.org/document/8317872.

Hariharan Kumar. Time series traffic flow prediction with hyper-parameter optimized ARIMA models for intelligent transportation system. Journal of Scientific & Industrial Research, 81:408-415, 2022. doi: 10.56042/jsir.v81i04.50791. URL https://www.semanticscholar.org/paper/Time-Series-Traffic-Flow-Prediction-with-Optimized-Kumar-Hariharan/29a25255f6aa9127a227b6655ce4199a75ead881.

Ibai Laña, Javier J. Sanchez-Medina, Eleni I. Vlahogianni, and Javier Del Ser. From data to actions in intelligent transportation systems: A prescription of functional requirements for model actionability. Sensors (Basel), 21(4):1121, 2021. doi: 10.3390/s21041121. URL https://doi.org/10.3390/s21041121.

Marco Lippi, Matteo Bertini, and Paolo Frasconi. Short-term traffic flow forecasting: An experimental comparison of time-series analysis and supervised learning.
URL https://ieeexplore.ieee.org/abstract/document/6482260, June 2013.

Ma, Tian, and Xu. Short-term traffic flow prediction based on genetic artificial neural network and exponential smoothing. Promet - Traffic & Transportation, 32:747-760, 2020. doi: 10.7307/ptt.v32i6.3360. URL https://hrcak.srce.hr/253142.

OfficeHolidays. Statutory holidays in Canada in 2022. https://www.officeholidays.com/countries/canada/2022.

R: Seasonally Decomposed Missing Value Imputation. https://search.r-project.org/CRAN/refmans/imputeTS/html/na_seadec.html.

Hyuk-Jae Roh, Furqan A. Bhat, Prasanta K. Sahu, Ata M. Khan, Orlando Rodriguez, Satish Sharma, and Babak Mehran. Appraisal of temporal transferability of cold region winter weather traffic models for major highway segments in Alberta, Canada. Geosciences, 9(3):137, 2019. doi: 10.3390/geosciences9030137. URL https://doi.org/10.3390/geosciences9030137.

Sadaf Saleem. Neural networks in 10 mins. Simply explained! May 2023. URL https://medium.com/@sadafsaleem5815/neural-networks-in-10mins-simply-explained-9ec2ad9ea815.

Laurent L. Santos and Francisco S. Castillo. Introduction to spatial network forecast with R. 2019. URL https://laurentlsantos.github.io/forecasting/support-vector-regression.html.

Service Canada. Weather, climate and hazards. https://www.canada.ca/en/services/environment/weather.html, May 2023.

Josh Starmer. Long short-term memory (LSTM), clearly explained. https://www.youtube.com/watch?v=YCzL96nL7j0, November 2022a.

Josh Starmer. Recurrent neural networks (RNNs), clearly explained!!! https://www.youtube.com/watch?v=AsNTP8Kwu80, July 2022b.

Tempelmeier, Dietze, and Demidova. Crosstown traffic - supervised prediction of impact of planned special events on urban traffic. GeoInformatica, 24:339-370, 2019. doi: 10.1007/s10707-019-00366-x. URL https://link.
springer.com/article/10.1007/s10707-019-00366-x.

Traffic Data Program. https://www.th.gov.bc.ca/trafficdata/.

Wang, Hyndman, Li, and Kang. Forecast combinations: An over 50-year review. International Journal of Forecasting, 39:1518-1547, 2023. doi: 10.1016/j.ijforecast.2022.11.005. URL https://www.sciencedirect.com/science/article/pii/S0169207022001480.

WSP Canada Group Limited. Traffic reports user documentation. https://www.th.gov.bc.ca/trafficData/documents/TrafficReportsUserDocumentation_2019May16.pdf, 2019.

WSP Global. Intelligent transportation systems. https://www.wsp.com/en-ca/services/intelligent-transportation-systems-its.

Shengjian Zhao, Shu Lin, and Jungang Xu. Time series traffic prediction via hybrid neural networks. IEEE Conference Publication, IEEE Xplore, October 2019. URL https://ieeexplore.ieee.org/abstract/document/8917383.

Juncheng Zhu. Electric vehicle charging load forecasting: A comparative study of deep learning approaches. https://www.mdpi.com/1996-1073/12/14/2692, July 2019.

Richard Zussman. B.C. declares state of emergency in response to coronavirus pandemic. March 2020.

Appendix A

Climate Data Collection

This section outlines the process of collecting traffic and climate data to create two datasets: the 2022 Hourly Traffic and Climate Dataset and the 2017-2022 Daily Traffic and Climate Dataset.

A.1 Hourly Climate Data (2022)

After examining the first tsibble table, namely temp.hourly, containing stations with climate IDs 1106200, 1108446, 1108395, 1108380, and 1108824, and visualizing its data in Fig. A.1, we note similar weather patterns among these stations. Further investigation reveals that missing data for these stations are not significant, as indicated in Table A.1.

Figure A.1: Comparison plot of 2022 hourly temperature data for different stations

Table A.1: Missing data for 2022 hourly temperature data (8760 rows)
Site No.  Climate ID  Climate Station Name      Station GPS (DD)    Distance from Lions Gate Bridge  Missing Temperature Data
1         1106200     POINT ATKINSON            49.3304, -123.2647  9.3143 km                        11
2         1108446     VANCOUVER HARBOUR CS      49.2954, -123.1219  16.402 km                        12
3         1108395     VANCOUVER INTL A          49.1947, -123.1839  13.8134 km                       4
4         1108380     VANCOUVER SEA ISLAND CCG  49.1825, -123.1872  15.1886 km                       12
5         1108824     WEST VANCOUVER AUT        49.3470, -123.1933  5.322 km                         16

The second tsibble table, precip.hourly, includes the stations meeting our criteria with climate IDs 1108446, 1108380, and 1108824. We plot this table in Fig. A.2, revealing the same pattern in the precipitation feature among these stations. Subsequently, missing data are checked, and the results displayed in Table A.2 indicate that missing data for these stations are not severe.

Figure A.2: Comparison plot of 2022 hourly precipitation data for different stations

Table A.2: Missing data for 2022 hourly precipitation data (8760 rows)

Site No.  Climate ID  Climate Station Name      Station GPS (DD)    Distance from Lions Gate Bridge  Missing Precipitation Data
1         1108446     VANCOUVER HARBOUR CS      49.2954, -123.1219  16.402 km                        12
2         1108380     VANCOUVER SEA ISLAND CCG  49.1825, -123.1872  15.1886 km                       12
3         1108824     WEST VANCOUVER AUT        49.3470, -123.1933  5.322 km                         16

Lastly, the third table, ws.hourly, encompasses stations with climate IDs 1106200, 1108395, 1108380, and 1108824. Despite variations in wind speed, particularly with station 1106200 exhibiting higher speeds than station 1108824, visual inspection in Fig. A.3 reveals no distinct pattern. Moreover, analysis of missing data in Table A.3 affirms that the quantity of missing data for these stations is inconsequential and does not significantly affect the dataset.

Figure A.3: Comparison plot of 2022 hourly wind speed data for different stations

Table A.3: Missing data for 2022 hourly wind speed data (8760 rows)
Site No.  Climate ID  Climate Station Name      Station GPS (DD)    Distance from Lions Gate Bridge  Missing Wind Speed Data
1         1106200     POINT ATKINSON            49.3304, -123.2647  9.3143 km                        11
2         1108395     VANCOUVER INTL A          49.1947, -123.1839  13.8134 km                       4
3         1108380     VANCOUVER SEA ISLAND CCG  49.1825, -123.1872  15.1886 km                       12
4         1108824     WEST VANCOUVER AUT        49.3470, -123.1933  5.322 km                         143

Next, we identify the stations that possess all three features, namely 1108380 and 1108824. Inspection of Table A.4 reveals that missing data for these two stations are minimal.

Table A.4: Missing data for 2022 hourly climate data (8760 rows)

Site No.  Climate ID  Climate Station Name      Distance from Lions Gate Bridge  Missing temp Data  Missing precip Data  Missing wind speed Data  Missing Weather Data (either temp, precip, or ws)  Missing Weather Data (all of temp, precip, and ws)
1         1108380     VANCOUVER SEA ISLAND CCG  15.1886 km                       12                 12                   12                       12                                                 12
2         1108824     WEST VANCOUVER AUT        5.322 km                         16                 16                   143                      143                                                16

Including all features from both stations as input for the models could result in data redundancy and homogeneity. Therefore, we opt to calculate the mean value of each feature across the two stations to represent the values of the different features. Where missing data exist in the combined dataset, we handle them by ignoring the NA values and calculating the average based only on the non-NA values in each row. Subsequently, we proceeded to plot each feature separately. From Fig. A.4, it is apparent that the hourly temperature in 2022 reached its peak around August and its lowest point around December.

Figure A.4: Plot of 2022 hourly temperature average data

From Fig. A.5, we observe relatively low precipitation from July to October 2022. However, due to having only one year of data, there is not enough evidence to make reasonable inferences.
It might be worthwhile to analyze the 2017-2022 Daily Precipitation Dataset in the future for further insights.

Figure A.5: Plot of 2022 hourly precipitation average data

Fig. A.6 presents the hourly wind speed data for 2022; however, no significant information can be gleaned from it.

Figure A.6: Plot of 2022 hourly wind speed average data

Finally, integrating the 2022 Hourly Climate Dataset with the 2022 Hourly Traffic Dataset yields the 2022 Hourly Traffic and Climate Dataset.

A.2 Daily Climate Data (2017-2022)

The first tsibble table is temp.daily, with 7 stations meeting its criteria: 1105658, 1106200, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Fig. A.7 indicates that these stations follow the same pattern.

Figure A.7: Comparison plot of 2017-2022 daily temperature data

The second tsibble table, named precip.daily, comprises stations with the IDs 1105658, 1105669, 1106200, 1106764, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Upon plotting the data, Fig. A.8 shows that these stations exhibit a similar pattern.

Figure A.8: Comparison plot of 2017-2022 daily precipitation data

Subsequently, we identify the stations that have both features: 1105658, 1106200, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Upon inspection of the missing-value Table A.5, it becomes evident that there is a substantial amount of missing data for stations 1105658 and 1106200, necessitating their removal.

Table A.5: Missing data for 2017-2022 daily climate data (2191 rows)
Site No.  Climate ID  Climate Station Name      Station GPS (DD)    Distance from Lions Gate Bridge  Missing temp Data (Days)  Missing precip Data (Days)  Missing Weather Data (Both temp and precip)  Missing Weather Data (Either temp or precip)
1         1105658     N VANC GROUSE MTN RESORT  49.3811, -123.0783  8.5159 km                        633                       601                         601                                          633
2         1106200     POINT ATKINSON            49.3304, -123.2647  9.3143 km                        72                        833                         72                                           833
3         1106PF7     RICHMOND NATURE PARK      49.1708, -123.0931  16.402 km                        201                       201                         201                                          201
4         1108446     VAN HARBOUR CS            49.2954, -123.1219  2.5117 km                        32                        155                         32                                           155
5         1108395     VAN INTL A                49.1947, -123.1839  13.8134 km                       20                        6                           3                                            23
6         1108380     VAN SEA ISLAND CCG        49.1825, -123.1872  15.1886 km                       39                        139                         39                                           139
7         1108824     WEST VANCOUVER AUT        49.3470, -123.1933  5.322 km                         37                        317                         37                                           317

Similar to the processing of the hourly climate data, we calculate the average of each feature across the remaining five stations to represent the values of the different features and reduce dataset homogeneity. After merging the data from the five stations (1106PF7, 1108446, 1108395, 1108380, and 1108824) and computing the average values, we discovered that the combined dataset still contained some missing data. To address missing values in the temperature feature, we employed the built-in function "na_seadec()" for missing-value imputation. However, when attempting to handle missing data in the precipitation feature using this method, we encountered negative values, which are inconsistent with the nature of precipitation, as it must be non-negative. Therefore, we explored an alternative method for imputing missing data, namely linear interpolation, available in the "zoo" library. This technique, rooted in numerical analysis, estimates unknown values by assuming a linear relationship within the range of the data points; to estimate a missing value, it examines the past and future data surrounding it (Babar [2023]). With this approach, no negative values emerged, hence we adopted it as the preferred method.
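The two preprocessing steps just described (row-wise averaging that ignores NA values, followed by linear interpolation of any remaining gaps) can be sketched as follows. This is an illustrative Python/pandas version of the idea; the project itself used R's imputeTS and zoo packages, and the station columns and readings below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly precipitation readings (mm) from two stations, gaps as NaN.
precip = pd.DataFrame({
    "stn_1108380": [0.0, 1.2, np.nan, 0.4, 0.0],
    "stn_1108824": [0.2, np.nan, np.nan, 0.6, 0.1],
})

# Step 1: row-wise mean across stations; skipna=True averages only the
# non-missing readings in each row, so a single missing station is ignored.
avg = precip.mean(axis=1, skipna=True)

# Step 2: rows where every station was missing are still NaN; estimate them
# by linear interpolation between the surrounding values, mirroring
# zoo-style interpolation used for the precipitation series.
avg = avg.interpolate(method="linear")

# Interpolating between non-negative values can never produce a negative
# estimate, unlike the seasonal-decomposition imputation that was rejected.
assert (avg >= 0).all()
```

In this toy series the hour where both stations are missing is filled with the midpoint of its neighbours, and the non-negativity of the result holds by construction.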
Upon completing the handling of missing data, we obtained two new tables: final.temp.daily.2017_2022 and final.precip.daily.2017_2022. Subsequently, we proceeded to plot them. From Fig. A.9, it is evident that the daily temperature data exhibit peak values around August each year, consistent with the results obtained from the 2022 hourly temperature dataset (Fig. A.4). Additionally, we can see a seasonal pattern at the yearly level.

Figure A.9: Plot of 2017-2022 Daily temperature average data

From Fig. A.10, it is evident that the 2017-2022 daily precipitation data show lower precipitation levels around June each year compared to other months. This observation suggests a potential seasonal pattern at the yearly level.

Figure A.10: Plot of 2017-2022 Daily precipitation average data

Finally, integrating the 2017-2022 Daily Climate Dataset with the 2017-2022 Daily Traffic Dataset yields the 2017-2022 Daily Traffic and Climate Dataset.

Appendix B

Linear and Non-linear Model Selection

B.1 2022 Hourly Both Directions Dataset

We will analyze the results for the day types corresponding to weekend and holiday, and select the top two linear models and the top two non-linear models based on the MAPE value. The tables below present the results of the linear and non-linear models, ordered by MAPE value.

B.1.1 Weekend

Table B.1: Linear Model Results

Model    MAPE
SVR.X_L  10.44
SVR_L    10.85
ARIMA_L  28.32
DReg_L   32.07
ETS_L    53.58

Table B.2: Non-Linear Model Results

Model        MAPE
SVR_NL       8.56
KNN_NL       11.38
NNETAR_NL    12.52
NNETAR.X_NL  12.55
SVR.X_NL     13.55
LSTM_NL      20.86
LSTM.X_NL    27.9

Based on Table B.1, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.2, we selected the top two models: SVR_NL and KNN_NL.

B.1.2 Holiday

Based on Table B.3, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.4, we selected the top two models: SVR_NL and NNETAR.X_NL.
Table B.3: Linear Model Results

Model    MAPE
SVR.X_L  12.49
SVR_L    12.81
ARIMA_L  29.4
DReg_L   32.19
ETS_L    55.85

Table B.4: Non-Linear Model Results

Model        MAPE
SVR_NL       8.65
NNETAR.X_NL  9.21
NNETAR_NL    11.77
KNN_NL       14.13
SVR.X_NL     16.86
LSTM.X_NL    18.77
LSTM_NL      25.32

B.2 2022 Hourly Positive Direction Dataset

We will analyze the results for the day types corresponding to weekday, weekend, and holiday, and select the top two linear models and the top two non-linear models based on the MAPE value. The tables below present the results of the linear and non-linear models, ordered by MAPE value.

B.2.1 Weekday

Table B.5: Linear Model Results

Model    MAPE
SVR_L    15.12
ARIMA_L  17.3
SVR.X_L  17.37
DReg_L   18.08
ETS_L    26.46

Table B.6: Non-Linear Model Results

Model        MAPE
NNETAR_NL    14.6
SVR.X_NL     15.77
NNETAR.X_NL  16.5
SVR_NL       16.97
LSTM.X_NL    33.88
LSTM_NL      40.27
KNN_NL       41.4

Based on Table B.5, we selected the top two models: SVR_L and ARIMA_L. Similarly, based on Table B.6, we selected the top two models: NNETAR_NL and SVR.X_NL.

B.2.2 Weekend

Table B.7: Linear Model Results

Model    MAPE
SVR.X_L  11.07
SVR_L    11.46
ARIMA_L  24.82
DReg_L   26.9
ETS_L    59.7

Table B.8: Non-Linear Model Results

Model        MAPE
SVR_NL       9.54
NNETAR_NL    11.93
KNN_NL       12.11
SVR.X_NL     14.26
NNETAR.X_NL  14.82
LSTM_NL      21.39
LSTM.X_NL    22.83

Based on Table B.7, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.8, we selected the top two models: SVR_NL and NNETAR_NL.

B.2.3 Holiday
The tables below present the results of the linear and non-linear models, ordered by MAPE value.

Table B.9: Linear Model Results

Model    MAPE
SVR_L    11
SVR.X_L  12.5
ARIMA_L  20.43
DReg_L   23.81
ETS_L    45.01

Table B.10: Non-Linear Model Results

Model        MAPE
NNETAR.X_NL  9.63
SVR_NL       10.63
NNETAR_NL    11.29
SVR.X_NL     11.66
KNN_NL       17.29
LSTM_NL      24.49
LSTM.X_NL    26.56

Based on Table B.9, we selected the top two models: SVR_L and ARIMA_L. Similarly, based on Table B.10, we selected the top two models: NNETAR.X_NL and SVR_NL.

B.3 2022 Hourly Negative Direction Dataset

We will analyze the results for the day types corresponding to weekday, weekend, and holiday, and select the top two linear models and the top two non-linear models based on the MAPE value. The tables below present the results of the linear and non-linear models, ordered by MAPE value.

B.3.1 Weekday

Table B.11: Linear Model Results

Model    MAPE
SVR_L    16.86
ARIMA_L  17.4
DReg_L   17.87
ETS_L    18.24
SVR.X_L  23.05

Table B.12: Non-Linear Model Results

Model        MAPE
SVR_NL       16.73
SVR.X_NL     17.51
LSTM.X_NL    38.58
KNN_NL       40.18
LSTM_NL      43.26
NNETAR.X_NL  69.71
NNETAR_NL    77.34

Based on Table B.11, we selected the top two models: SVR_L and ARIMA_L. Similarly, based on Table B.12, we selected the top two models: SVR_NL and LSTM.X_NL.

B.3.2 Weekend

Based on Table B.13, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.14, we selected the top two models: NNETAR.X_NL and KNN_NL.
Table B.13: Linear Model Results

Model    MAPE
SVR.X_L  15.52
SVR_L    15.83
ARIMA_L  38.79
DReg_L   45.65
ETS_L    47.79

Table B.14: Non-Linear Model Results

Model        MAPE
NNETAR.X_NL  11.77
KNN_NL       12.03
NNETAR_NL    12.58
SVR.X_NL     12.9
SVR_NL       15.95
LSTM.X_NL    27.6
LSTM_NL      36.71

B.3.3 Holiday

Table B.15: Linear Model Results

Model    MAPE
SVR_L    22.54
SVR.X_L  23.24
ETS_L    36.6
ARIMA_L  46.57
DReg_L   51.77

Table B.16: Non-Linear Model Results

Model        MAPE
SVR_NL       10.66
NNETAR.X_NL  18.94
KNN_NL       19.1
LSTM_NL      21.71
NNETAR_NL    22.37
LSTM.X_NL    27.19
SVR.X_NL     27.64

Based on Table B.15, we selected the top two models: SVR_L and ETS_L. Similarly, based on Table B.16, we selected the top two models: SVR_NL and NNETAR.X_NL.

B.4 2017-2022 Daily Dataset

We will analyze the results for the day type corresponding to weekday, and select the top two linear models and the top two non-linear models based on the MAPE value. The tables below present the results of the linear and non-linear models, ordered by MAPE value.

B.4.1 Weekday

Table B.17: Linear Model Results

Model    MAPE
SVR.X_L  2.92
SVR_L    4.85
ARIMA_L  6.22
DReg_L   8.42
ETS_L    11.85

Table B.18: Non-Linear Model Results

Model        MAPE
SVR.X_NL     2.36
LSTM_NL      4.21
SVR_NL       4.38
LSTM.X_NL    4.49
NNETAR_NL    7.85
NNETAR.X_NL  11.59
KNN_NL       15.79

Based on Table B.17, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.18, we selected the top two models: SVR.X_NL and LSTM_NL.
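Across Tables B.1-B.18, the "top two" selections appear to keep at most one variant of each base model: for example, in the weekend linear results SVR.X_L is chosen and the second-ranked SVR_L is skipped in favour of ARIMA_L. The following Python sketch illustrates such a rule; the project itself was implemented in R, and the helper names and the family-extraction logic are our reading of the tables, not code from the thesis.

```python
def family(name):
    """Base model family: strip the _L/_NL subscript and the '.X'
    marker used for the variant with exogenous regressors."""
    return name.split("_")[0].replace(".X", "")

def select_top_two(results):
    """Greedily keep the lowest-MAPE model of each base family
    until two models are chosen. `results` maps model name -> MAPE."""
    chosen = []
    for name in sorted(results, key=results.get):  # ascending MAPE
        if family(name) not in {family(c) for c in chosen}:
            chosen.append(name)
        if len(chosen) == 2:
            break
    return chosen

# Linear candidates for the weekend day type (values from Table B.1).
linear_weekend = {"SVR.X_L": 10.44, "SVR_L": 10.85, "ARIMA_L": 28.32,
                  "DReg_L": 32.07, "ETS_L": 53.58}
```

With these inputs the rule reproduces the Table B.1 selection, picking SVR.X_L first and then skipping SVR_L (same SVR family) in favour of ARIMA_L.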