THOMPSON RIVERS UNIVERSITY

A Novel Data-Driven Traffic Prediction Model for Lions Gate Bridge Traffic Management

By Zijun Ma

A PROJECT SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in Data Science

KAMLOOPS, BRITISH COLUMBIA
August, 2024

SUPERVISOR
Dr. Erfanul Hoque

© Zijun Ma, 2024

ABSTRACT

This work focuses on predicting traffic flow on the Lions Gate Bridge adjacent to Stanley Park in Vancouver, Canada, employing a novel hybrid model. The bridge serves as a vital route for commuters from northern Vancouver to the city center and carries substantial daily traffic, estimated at more than 60,000 vehicles on workdays, leading to peak congestion during the morning and afternoon commutes. An accurate traffic prediction model is therefore urgently needed. The aim of this work is to address this issue, providing urban planners with insights for more effective traffic planning near the Lions Gate Bridge to alleviate congestion. To that end, we propose a novel data-driven hierarchical forecast combination model to enhance the accuracy of traffic flow predictions. Traffic-related data for this project are sourced from the Ministry of Transportation and Infrastructure of BC, while climate-related datasets are obtained from Environment Canada. The results demonstrate the superior performance of the proposed model compared to conventional forecasting models.

Key Words: Deep Learning, Forecast Combination, High Dimensional Data, Machine Learning, Regression Models, Traffic Volume Forecasting, Time Series Models.

ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my supervisor, Dr. Erfanul Hoque, for his invaluable guidance, support, and encouragement throughout the course of this project. His insights and expertise were instrumental in shaping the direction and outcome of this research. I would also like to extend my sincere thanks to my second reader, Dr.
Sean Hellingman, for his thoughtful feedback and valuable suggestions, which significantly contributed to the refinement and quality of this work. Additionally, I am grateful to Thompson Rivers University for providing the necessary resources and institutional support that made this work possible. The academic environment and facilities at TRU have greatly contributed to the successful completion of this project. Finally, I would like to acknowledge everyone who has supported me throughout this journey. Your encouragement and belief in my abilities have been a constant source of motivation.

Contents

1 Introduction
  1.1 Background
  1.2 Problem Statement
  1.3 Project Objectives
  1.4 Significance of the Study
  1.5 Organization of the Project
2 Literature Review
  2.1 Time Series Methods
  2.2 Machine Learning Methods
  2.3 Deep Learning Methods
  2.4 Forecast Combination Methods
3 Methodology
  3.1 Linear Models
    3.1.1 ARIMA
    3.1.2 Dynamic Regression Model
    3.1.3 Exponential Smoothing Model
    3.1.4 Support Vector Regression (SVR)
  3.2 Non-Linear Models
    3.2.1 K-Nearest Neighbor (KNN)
    3.2.2 Long Short-Term Memory (LSTM)
    3.2.3 Neural Network Autoregression (NNETAR)
  3.3 Proposed Model: Hierarchical Forecast Combination (HFC) Model
    3.3.1 Proposed Data-Driven HFC Models
    3.3.2 Strategies to Determine Weights
    3.3.3 Steps for HFC Model
4 Experiments
  4.1 Dataset Description
  4.2 Traffic Data Collection
    4.2.1 Hourly Traffic Volume Data (2022)
    4.2.2 Daily Traffic Volume Data (2017-2022)
  4.3 Climate Data Collection
    4.3.1 Hourly Climate Data (2022)
    4.3.2 Daily Climate Data (2017-2022)
  4.4 Traffic and Climate Dataset
  4.5 Parameter Estimation
    4.5.1 ARIMA
    4.5.2 Dynamic Regression Model
    4.5.3 Exponential Smoothing Model
    4.5.4 Support Vector Regression (SVR)
    4.5.5 K-Nearest Neighbor (KNN)
    4.5.6 Long Short-Term Memory (LSTM)
    4.5.7 Neural Network Autoregression (NNETAR)
  4.6 Evaluation Metric
    4.6.1 RMSE
    4.6.2 MAPE
  4.7 Performance Comparison
    4.7.1 Results of 2022 Hourly Both Directions Dataset
    4.7.2 Results of 2022 Hourly Positive Direction Dataset
    4.7.3 Results of 2022 Hourly Negative Direction Dataset
    4.7.4 Results of 2017-2022 Daily Dataset
5 Conclusions
A Climate Data Collection
  A.1 Hourly Climate Data (2022)
  A.2 Daily Climate Data (2017-2022)
B Linear and Non-linear Model Selection
  B.1 2022 Hourly Both Directions Dataset
    B.1.1 Weekend
    B.1.2 Holiday
  B.2 2022 Hourly Positive Direction Dataset
    B.2.1 Weekday
    B.2.2 Weekend
    B.2.3 Holiday
  B.3 2022 Hourly Negative Direction Dataset
    B.3.1 Weekday
    B.3.2 Weekend
    B.3.3 Holiday
  B.4 2017-2022 Daily Dataset
    B.4.1 Weekday

List of Figures

3.1 SVR
3.2 Basic RNN structure (Source: Zhu [2019])
3.3 LSTM
3.4 The structure of neural network
3.5 Basic steps of the proposed HFC model
4.1 Process of traffic data collection
4.2 Hourly traffic volume data plot for February 2022
4.3 Hourly traffic volume plot from 2022-01-10 to 2022-01-14 (Single Day)
4.4 Hourly traffic volume plot from 2022-01-10 to 2022-01-14
4.5 Comparison of average hourly traffic volume between holiday, weekday and weekend
4.6 Plot of 2017-2022 daily traffic volume data
4.7 Daily traffic volume plot from 2018-03-05 to 2018-04-01
4.8 Distribution of daily traffic volume data for each day of the week
4.9 Process of climate data collection
4.10 Process of 2022 hourly climate data collection
4.11 Process of 2017-2022 daily climate data collection
4.12 Forecast plot for proposed HFC model and four individual models (2022 Hourly Both Directions - weekday)
4.13 Forecast plots for proposed models and the combination models (2022 Hourly Both Directions - weekend)
4.14 Forecast plots for proposed models and the combination models (2022 Hourly Both Directions - holiday)
4.15 Forecast plot for proposed HFC model and four individual models (2022 Hourly Positive Direction - weekday)
4.16 Forecast plots for proposed models and the combination models (2022 Hourly Positive Direction - weekend)
4.17 Forecast plots for proposed models and the combination models (2022 Hourly Positive Direction - holiday)
4.18 Forecast plots for proposed models and individual models (2022 Hourly Negative Direction - weekday)
4.19 Forecast plots for proposed models and individual models (2022 Hourly Negative Direction - weekend)
4.20 Forecast plots for proposed models and individual models (2022 Hourly Negative Direction - holiday)
4.21 Forecast plot for proposed HFC model and four individual models (2017-2022 Daily - weekday)
A.1 Comparison plot of 2022 hourly temperature data for different stations
A.2 Comparison plot of 2022 hourly precipitation data for different stations
A.3 Comparison plot of 2022 hourly wind speed data for different stations
A.4 Plot of 2022 hourly temperature average data
A.5 Plot of 2022 hourly precipitation average data
A.6 Plot of 2022 hourly wind speed average data
A.7 Comparison plot of 2017-2022 daily temperature data
A.8 Comparison plot of 2017-2022 daily precipitation data
A.9 Plot of 2017-2022 daily temperature average data
A.10 Plot of 2017-2022 daily precipitation average data

List of Tables

4.1 Linear Model Results
4.2 Non-Linear Model Results
4.3 The comparison table of four individual forecast models with proposed HFC models (2022 hourly both directions - weekday)
4.4 The comparison table of four individual forecast models with proposed HFC model (2022 hourly both directions)
4.5 Weekend
4.6 Holiday
4.7 Comparison of four individual forecast models with proposed HFC model (2022 hourly positive direction)
4.8 Weekday
4.9 Weekend
4.10 Holiday
4.11 The comparison table of four individual forecast models with proposed HFC models (2022 hourly negative direction)
4.12 Weekday
4.13 Weekend
4.14 Holiday
4.15 The comparison table of four individual forecast models with proposed HFC models (2017-2022 daily - weekday)
A.1 Missing data for 2022 hourly temperature data (8760 rows)
A.2 Missing data for 2022 hourly precipitation data (8760 rows)
A.3 Missing data for 2022 hourly wind speed data (8760 rows)
A.4 Missing data for 2022 hourly climate data (8760 rows)
A.5 Missing data for 2017-2022 daily climate data (2191 rows)
B.1 Linear Model Results
B.2 Non-Linear Model Results
B.3 Linear Model Results
B.4 Non-Linear Model Results
B.5 Linear Model Results
B.6 Non-Linear Model Results
B.7 Linear Model Results
B.8 Non-Linear Model Results
B.9 Linear Model Results
B.10 Non-Linear Model Results
B.11 Linear Model Results
B.12 Non-Linear Model Results
B.13 Linear Model Results
B.14 Non-Linear Model Results
B.15 Linear Model Results
B.16 Non-Linear Model Results
B.17 Linear Model Results
B.18 Non-Linear Model Results
Chapter 1
Introduction

1.1 Background

Since the beginning of the twentieth century, the industrialization of automobiles has led to an increase in the demand for cars. For many families, owning a car is now a necessity. However, this has created a problem of traffic congestion in major cities due to the high number of automobiles on the road. Traffic congestion is not just a localized issue but a global challenge that affects cities worldwide. The increase in vehicles on the road leads to extended travel times, increased fuel consumption, and elevated levels of greenhouse gas emissions. According to the 2023 INRIX Global Traffic Scorecard (INRIX), cities such as New York City, London, Paris, and Mexico City experience some of the world's worst traffic congestion, with commuters spending an average of over 200 hours per year in traffic jams. This situation has significant economic implications, as traffic congestion is estimated to cost billions of dollars annually in lost productivity, higher fuel costs, and increased environmental degradation. Moreover, the health impact cannot be ignored; prolonged exposure to vehicular emissions contributes to respiratory and cardiovascular diseases, posing a serious public health concern. These issues are clearly evident in many urban centers around the world, where traffic congestion has become a daily reality. One example is the famed Times Square, situated on Sixth Avenue in New York City, which runs alongside a variety of stores, eateries, entertainment venues, and skyscrapers as it traverses the business and office districts of Manhattan. At the same time, Manhattan is a thriving economic hub, with droves of commuters flooding in every morning and evening to reach their workplaces. As a result, they need to traverse the city, often via Sixth Avenue, leading to heavy traffic congestion on the roads. Another example can be found in renowned scenic areas such as Stanley Park in Vancouver, Canada.
This extensive metropolitan park is situated on a peninsula that borders the Pacific Ocean, approximately 1.5 kilometers northwest of downtown Vancouver. It is among the largest urban parks in Canada and a popular tourist destination in Vancouver. The park includes vast areas of natural scenery, hiking trails, biking paths, diverse wildlife, and multiple tourist sites, such as the Lions Gate Bridge and the Vancouver Aquarium (City of Vancouver). During the summer season and holidays, particularly on sunny weekends, Stanley Park tends to draw a substantial number of tourists and local residents, and the surrounding roads during these periods may experience heavier traffic congestion.

1.2 Problem Statement

Among these surrounding roads is the Lions Gate Bridge, a crucial piece of infrastructure that links downtown Vancouver with West Vancouver and North Vancouver. The bridge serves as a key route for commuters traveling from northern Vancouver to downtown. Additionally, it provides North Shore residents with a direct path to Stanley Park, reducing the need for them to detour through downtown Vancouver. The Lions Gate Bridge handles over 60,000 vehicles every workday (Association of Consulting Engineering Companies British Columbia), resulting in severe traffic congestion during the morning and evening rush hours. Given its strategic importance in the region's transportation network, any traffic congestion on the Lions Gate Bridge can have significant repercussions for regional economic activity. It extends commute times and increases fuel consumption, which in turn exacerbates greenhouse gas emissions. The efficient functioning of this bridge is therefore critical not only for maintaining smooth traffic flow but also for supporting the broader economic and environmental health of the region.
Reducing traffic congestion is therefore of particular importance, and many cities have experimented with improving public transportation systems, encouraging carpooling, implementing traffic management measures, building bicycle lanes and pedestrian walkways, and promoting sustainable modes of transportation. However, despite these measures, the ever-growing demand for transportation continues to outpace the capacity of these solutions, rendering them less effective in fully alleviating traffic pressures. This limitation underscores the need for more advanced and adaptive strategies to manage traffic congestion effectively. In response, technology is playing an increasingly important role in traffic management. Intelligent Transportation Systems (ITS) have been developed to manage traffic flow more efficiently using data-driven approaches. These systems utilize information technology, communication technology, and sensor technology to optimize the transportation network (WSPglobal). An important component of ITS is traffic data analysis, which allows for the forecasting of traffic patterns. By accurately predicting traffic flow, cities can implement preemptive measures such as adjusting traffic signals or diverting traffic to reduce congestion (Laña et al. [2021]). For example, by forecasting traffic flow on the Lions Gate Bridge, we can analyze the traffic situation at a given time in the near future; if congestion is likely, traffic signals can be adjusted or personnel mobilized to manage traffic detours. Forecasting traffic flow is especially important on special days such as holidays and during major events. In this project, we aim to develop a forecast model specifically designed to predict traffic flow on the Lions Gate Bridge. This research is motivated by the pressing need to alleviate traffic congestion during peak periods, especially on holidays and weekends.
1.3 Project Objectives

The main objectives of this project are to:

(a) utilize traditional time series models, machine learning models, and deep learning models to predict the traffic flow on the Lions Gate Bridge, and conduct a comprehensive evaluation to determine which model performs best under various conditions;

(b) identify and analyze the significant factors that influence traffic flow on the Lions Gate Bridge;

(c) propose a novel data-driven forecast combination model to enhance the accuracy of traffic flow predictions;

(d) conduct a comparative analysis of the proposed model against traditional time series models, machine learning models, and deep learning models.

1.4 Significance of the Study

Recognizing the Lions Gate Bridge as a crucial link not only for the daily commute of thousands of residents but also for the economic and social activities of the region, it is evident that it serves as a primary route for weekday commuters and plays a significant role in the transportation network. Additionally, as the main access point to Stanley Park, it draws tourists from various locations during holidays. Thus, the Lions Gate Bridge serves not only as a vital transportation artery but also as an essential part of the tourist infrastructure. Given its strategic importance, we propose a data-driven forecast combination model aimed at forecasting traffic flow on the Lions Gate Bridge. This model leverages the strengths of both traditional and modern predictive techniques, offering a more accurate and adaptable forecasting tool than traditional models alone. This approach not only helps government authorities promptly implement traffic management measures to reduce congestion and ensure smooth urban traffic but also provides a reliable predictive tool for future transportation needs. However, the usefulness of such a predictive model goes beyond these immediate applications.
By providing accurate traffic flow predictions, the model supports long-term urban planning initiatives, enabling authorities to design more resilient infrastructure that can adapt to future demands. This contributes to sustainable urban growth, reducing environmental impact and improving the quality of life for residents. By comprehensively understanding traffic flow trends and patterns, authorities can better plan road construction, public transportation systems, and urban layouts to meet future transportation needs, thereby enhancing urban sustainability and development quality. Given the anticipated increase in both commuter and tourist traffic, especially with ongoing urban expansion, the significance of this study lies in its potential to offer actionable insights that can preemptively address traffic challenges before they escalate, ensuring the continued vitality of Vancouver's transportation network.

1.5 Organization of the Project

The remainder of this project is structured as follows: Chapter 2 reviews related work on time series prediction in the traffic domain. Chapter 3 explains the various linear and non-linear models we utilized and introduces our proposed data-driven traffic prediction model. Chapter 4 discusses the experimental results in detail. Finally, Chapter 5 concludes the study by summarizing the key points and contributions of this project.

Chapter 2
Literature Review

In the field of traffic flow prediction, methods can be broadly classified into three categories: time series methods, machine learning methods, and deep learning methods. Additionally, we review literature on forecast combination methods related to our proposed model.

2.1 Time Series Methods

Time series methods such as autoregressive integrated moving average (ARIMA), exponential smoothing, and regression are commonly used.
In the study by Kumar [2022], an ARIMA model with optimized hyperparameters was used to forecast traffic flow using the PeMS dataset from a CalTrans sensor station. The results showed good performance metrics for the whole-day, morning, and evening datasets, respectively, and suggested that integrating seasonal ARIMA (SARIMA) and Box-Cox transformations could further enhance prediction accuracy. The exponential smoothing method has also been employed to preprocess traffic flow data. Chan [2011a] proposed a neural network development approach based on an exponential smoothing method to enhance previously used neural networks (NN) for traffic flow forecasting. The exponential smoothing method was applied to preprocess traffic flow data before training the NN. The preprocessed traffic flow data, with reduced non-smoothness, discontinuity, and lumpiness compared to the original data, was more suitable for neural network training. This approach was evaluated by forecasting real-time traffic conditions on a section of freeway in Western Australia. Regarding training errors, the neural network models developed using this approach achieved more than a 20% improvement over those developed using the original traffic flow data, indicating a significant enhancement in fitting traffic flow data. Regarding testing errors, these models achieved more than an 8% improvement, indicating improved generalization capability in traffic flow forecasting. In another study, Chan [2011b] proposed a novel NN training method that combines a hybrid exponential smoothing method and the Levenberg-Marquardt (LM) algorithm to improve the generalization capabilities of NN training methods for short-term traffic flow forecasting. The method preprocesses traffic flow data by removing lumpiness, followed by applying a variant of the LM algorithm to train the NN model's weights.
The smoother and more continuous preprocessed data aids NN training. This method was evaluated by forecasting short-term traffic flow conditions on the Mitchell Freeway in Western Australia, with NN models developed using this method outperforming those developed using other algorithms specifically designed for short-term traffic flow forecasting or for enhancing NN generalization capabilities. Meanwhile, Roh et al. [2019] utilized multiple linear regression to predict daily traffic volume factors, focusing on the effects of severe weather conditions on travel demand in Alberta, Canada. Their findings revealed that weather conditions affect passenger car traffic volume significantly more than truck traffic, providing valuable insights into modeling traffic flows under severe winter weather conditions.

2.2 Machine Learning Methods

Machine learning methods encompass a variety of techniques such as support vector machines, random forests, artificial neural networks (ANN), and k-nearest neighbors (KNN). Lippi et al. [2013] introduced a new supervised learning algorithm based on support vector regression (SVR), incorporating a seasonal kernel in SVR to capture the seasonal characteristics of traffic data. Their study, conducted using experimental traffic flow data from the California Performance Measurement System (PeMS) for freeways, demonstrated that seasonality plays a pivotal role in achieving high accuracy. Moreover, they found that their proposed seasonal kernel method strikes a reasonable balance between prediction accuracy and computational complexity. Bratsas et al. [2019] conducted a study comparing the predictive effectiveness of machine learning models, including Random Forest, SVR, Multi-Layer Perceptron, and Multiple Linear Regression, on probe data collected from the road network in Thessaloniki, Greece.
Their experimental results indicated that the SVR model performed best under stable conditions, while the Multi-Layer Perceptron model excelled in the face of greater variability, demonstrating the closest approximation to zero error. Tempelmeier [2019] addressed the challenge of building reliable supervised models (SVR, KNN, ridge regression) for predicting the spatial and temporal impact of planned special events on road traffic, focusing on the effectiveness of various features derived from historical data. Their evaluation, based on real-world event data from several venues in the Hannover region, Germany, demonstrates that their models, using event-, mobility-, and infrastructure-related features, outperform both event-based and event-agnostic baselines in accurately predicting the impact of planned special events on urban traffic, providing valuable insights for traffic management during such events. Ata [2020] proposed a TCC-SVM system model to analyze traffic congestion in a smart city environment using an ML-enabled IoT-based road traffic congestion control system, which notifies users of congestion at specific points. Data collected from Data Mill North via the internet, consisting of weather and traffic flow at 10-minute intervals, was utilized. The performance of the proposed TCC-SVM model is shown to be superior to previous approaches by Tamimi and Zahoor (2010), Pushpi and Dilip Kumar (2018), and Ayesha et al. (2019). However, the research is limited by potential time delays and increased data complexity.

2.3 Deep Learning Methods

Deep learning methods have seen increasing popularity in recent years for traffic flow prediction. These methods include Long Short-Term Memory (LSTM) networks, gated recurrent units (GRU), NN, and various hybrid models. Fu et al.
[2016] explored the use of two deep learning methods, LSTM and GRU NN, for short-term traffic flow prediction using the California Caltrans Performance Management System (PeMS) dataset. Their comparative experiments demonstrated the superior performance of these deep learning methods over traditional ARIMA models. Additionally, Kang et al. [2017] investigated LSTM for short-term traffic flow prediction, introducing a wider range of input information including flow rate, occupancy, speed, and neighboring traffic flow. Experimental results show that acceptable traffic flow prediction performance can be obtained using only traffic flow as an input variable; however, prediction performance may improve when traffic flow is combined with occupancy or speed as input variables. Zhao et al. [2019] proposed a novel hybrid neural network called TreNet, integrating the advantages of convolutional neural networks (CNN) and LSTM for time series traffic flow prediction. Similarly, Du et al. [2017] presented a hybrid deep learning framework combining recurrent neural networks (RNNs) and convolutional neural networks (CNNs) for urban traffic flow prediction, demonstrating effectiveness in handling complex nonlinear traffic flow prediction problems.

2.4 Forecast Combination Methods

Forecast combination is a powerful method for achieving more accurate predictions by combining multiple individual forecasts. The underlying idea is that different forecasting models capture different aspects of the data, and by combining them, the overall prediction becomes more accurate and robust than any single model alone. This approach is particularly valuable as it mitigates the weaknesses of individual models while leveraging their strengths, leading to improved forecasting accuracy and reliability (Wang [2023]). Forecast combination has been successfully applied in various fields, including the stock market, tourist flow, and internet traffic.
By using combined forecasts, these applications can achieve more reliable and accurate predictions, which are crucial for decision-making in dynamic environments. In the area of tourist flow, Andrawis [2011] explored combining forecasts produced at different time aggregations to capture diverse dynamics and enhance the diversity of the forecasts obtained. The forecast combination methods included Simple Average (AVG), Variance-based (VAR), Inverse of the Mean Square Error (INV-MSE), Rank-based Weighting (RANK), Least Squares Estimation, and Hierarchical Forecast Combination (HIER), among others. Simulations were conducted on benchmark data from the M3 and NN3 time series competitions and on monthly inbound tourism demand for Egypt from 33 major source countries, provided by the Egyptian Ministry of Tourism. The simulation results indicated improvements in accuracy relative to the underlying forecasting models and demonstrated that this approach outperformed individual models, offering valuable insights for developing a forecasting model for inbound tourism demand using a short-term/long-term forecast combination approach. In the traffic flow domain, Ma [2020] proposed a combined model that uses an artificial neural network optimized by a genetic algorithm (GA) together with exponential smoothing (ES) to improve the accuracy of short-term traffic flow prediction. By leveraging the metaheuristic optimization capability of the GA, the connection weights and thresholds of the feedforward neural network, trained by a backpropagation algorithm, are optimized to prevent the network from falling into a local optimum, thereby establishing the Genetic Artificial Neural Network (GANN) prediction model. Subsequently, an ES prediction model is introduced. To fully exploit the advantages of both models, the combined model uses a weighted average, with the weights determined based on the mean square error of the predictions from the individual models.
The model was experimentally validated using road traffic flow data from Xuancheng, Anhui Province, with an observation interval of 5 minutes. Furthermore, the feedforward neural network model, GANN model, ES model, and combined model were compared and analyzed. The results demonstrated that the prediction accuracy of the optimized feedforward neural network was significantly higher than before optimization. The prediction accuracy of the combined model was also higher than that of the two individual models, confirming the feasibility and effectiveness of the combined model. Hou et al. [2019] proposed an adaptive hybrid model aimed at forecasting traffic flow uncertainty caused by dynamic changes in traffic structure. To partially overcome the limitations of traditional prediction methods, they combined the ARIMA method and the Nonlinear Wavelet Neural Network method to predict traffic flow. The evaluation of the proposed model showed that it was more efficient in predicting traffic flow than the two individual models, whether under stable or fluctuating conditions. However, the study only highlighted the hybrid model's superiority over individual models and did not consider other models that have demonstrated greater efficiency than ARIMA, such as machine learning or deep learning models. Additionally, they did not consider other environmental factors influencing traffic flow dynamics, such as weather conditions or events that attract crowds. Based on the literature review above, it is evident that some researchers have applied combination methods in the traffic flow domain, but limitations still exist. The two papers mentioned earlier combined only two methods without exploring the potential benefits of combining a wider range of models. Furthermore, they focused solely on processing traffic flow data without considering other factors that might influence traffic flow, such as weather conditions, holidays, and events.
Therefore, in our project, we address these limitations by proposing a novel data-driven forecast combination model. In conclusion, while traditional time series methods and machine learning methods have their advantages, deep learning methods and forecast combination models are increasingly favored for their superior performance in traffic flow prediction tasks. Chapter 3 Methodology The methodology consists of two parts. The first part introduces the models used for data analysis, categorized into linear and non-linear models. For linear models, we include ARIMA, dynamic regression, the exponential smoothing model, and SVR with a linear kernel. For non-linear models, we include KNN, SVR with a non-linear kernel, LSTM, and neural network autoregression (NNETAR). In the second part, we introduce a new method called the Hierarchical Forecast Combination (HFC) method. This method combines the strengths of different models to improve overall accuracy. 3.1 Linear Models 3.1.1 ARIMA ARIMA is a fundamental model in time series analysis, merging autoregression (AR) and moving average (MA) concepts while addressing non-stationarity via differencing (integration), making it suitable for forecasting future numeric values (Hyndman and Athanasopoulos [2021]). ARIMA consists of three parts: autoregressive (AR), integrated (I), and moving average (MA). The AR(p) model takes lagged observations as inputs, while the MA(q) model takes lagged errors as inputs. In a time series, if the current value at time t is yt, then we refer to yt−1, yt−2, ..., yt−k as lagged observations. When using ARIMA to handle time series data, we first need to ensure that the series is stationary, using differencing (integration) if necessary. If yt is a stationary time series, then for all s, the distribution of (yt, ..., yt+s) does not depend on t. This means the series does not exhibit trending behavior, has constant variance, and lacks long-term predictable patterns.
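As a brief illustration of the differencing ("integration") step just described, the following sketch (in Python for illustration only; the project's analysis is carried out in R) shows how first differencing turns a linearly trending, non-stationary series into a constant one:

```python
# Illustration only: first differencing, the "integration" step of ARIMA.
# Python is used here as a sketch; the project's analysis is done in R.

def difference(series, d=1):
    """Apply d-th order differencing, i.e. (1 - B)^d y_t."""
    for _ in range(d):
        series = [y - y_prev for y_prev, y in zip(series, series[1:])]
    return series

# A deterministically trending (non-stationary) series: y_t = 3t + 2
trend = [3 * t + 2 for t in range(10)]
print(difference(trend, d=1))  # prints [3, 3, 3, 3, 3, 3, 3, 3, 3]
```

After one difference the trend is removed and the mean is constant, which is exactly the stationarity requirement discussed above.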
The ARIMA model can be written as: (1 − ϕ1 B − · · · − ϕp B^p)(1 − B)^d yt = c + (1 + θ1 B + · · · + θq B^q)ϵt (3.1) where the AR (autoregressive) part, (1 − ϕ1 B − · · · − ϕp B^p), represents a linear relationship between the current value yt and its previous p values, where ϕ1, ..., ϕp are the parameters of the autoregressive part, and B is the backshift operator, meaning B^p yt = yt−p. The I (integrated) part, (1 − B)^d, involves differencing to make the time series stationary, with d being the number of differences taken; for example, the differencing operation (1 − B)yt = yt − yt−1 represents a first difference when d = 1. The MA (moving average) part, (1 + θ1 B + · · · + θq B^q), indicates a linear relationship between the current error term ϵt and the previous q error terms, where θ1, ..., θq are the parameters of the moving average part. The constant term c accounts for the mean level of the (differenced) series, and the error term ϵt is white noise, i.e., a random error term with a mean of zero and a variance of σ². The ARIMA model formula describes how the current value yt is explained and predicted by its own past values, its differenced values, past error terms, and a constant term. By appropriately selecting the parameters p, d, and q, the ARIMA model can effectively fit a wide range of time series data. 3.1.2 Dynamic Regression Model This type of regression analysis recognizes time-based correlations, making it effective for handling time-dependent data (Hyndman and Athanasopoulos [2021]). The full model is: yt = β0 + β1 x1,t + β2 x2,t + · · · + βk xk,t + ηt, ϕ(B)(1 − B)^d ηt = θ(B)εt, εt ∼ IID(0, σ²). (3.2) Here, yt represents the dependent variable (the target value) at time t, which is composed of the intercept term β0, several independent variables with their corresponding regression coefficients, β1 x1,t, β2 x2,t, . . . , βk xk,t, and the error term ηt of the dynamic regression model.
We assume the error term ηt is autocorrelated (not white noise) and follows an ARIMA model, while the ARIMA error term εt is white noise, assumed to be independently and identically distributed (IID) with a mean of 0 and a variance of σ². This model can address the autocorrelation seen in the standard regression model because the ARIMA error term in the dynamic regression model captures information that is not explained by the standard regression model. The process of fitting the model, checking the residual plots, forecasting, and evaluating the results is similar to that of the standard regression model. 3.1.3 Exponential Smoothing Model Exponential smoothing is a time series forecasting method for univariate data. This method uses weighted averages of past observations to forecast future values, where the weights decrease exponentially as observations get older. The general idea is that more recent observations are more relevant for forecasting than older observations (Hyndman and Athanasopoulos [2021]). There are three main types of exponential smoothing. Simple Exponential Smoothing (SES): SES is used for time series data without a trend or seasonal pattern (Hyndman and Athanasopoulos [2021]). The weighted average form of SES is: ŷt+1 = αyt + (1 − α)ŷt. (3.3) Here, ŷt+1 represents the forecast at time t+1, α is the smoothing parameter, constrained to the range 0 ≤ α ≤ 1, yt denotes the actual value at time t, and ŷt stands for the forecasted value at time t. A different way to represent SES is through the component form. In this approach, the only component considered is the level ℓt. The component form of SES can be expressed as follows: Forecast equation: ŷt+1 = ℓt, (3.4) Level equation: ℓt = αyt + (1 − α)ℓt−1. (3.5) Here, ℓt represents the level of the series at time t, yt is the actual value at time t, α is the smoothing parameter, and ŷt+1 is the forecast for the next period.
The parameters α and the initial level ℓ0 are estimated by minimizing the sum of squared errors (SSE) over the periods t, subject to the constraint 0 ≤ α ≤ 1. If we substitute ℓt with ŷt+1 and ℓt−1 with ŷt in the level equation, we obtain the weighted average form of SES (equation 3.3). Two other methods are Holt's Linear Trend Method and the Holt-Winters Method. Holt's linear trend method is used for time series data with a trend (Hyndman and Athanasopoulos [2021]). This method involves a forecast equation and two smoothing equations (one for the level and one for the trend). The Holt-Winters method includes the forecast equation and three smoothing equations (level, trend, and seasonal). This method has two variations depending on the type of seasonal component. The additive method is ideal when the seasonal fluctuations are roughly constant across the series. On the other hand, the multiplicative method is better suited when the seasonal fluctuations vary in relation to the level of the series (Hyndman and Athanasopoulos [2021]). The detailed component forms of these methods and the corresponding equations can be found in Chapters 8.2 and 8.3 of Hyndman and Athanasopoulos [2021]. Innovations state space models for exponential smoothing. Each of the methods mentioned above consists of two main components: the measurement equation (forecast equation) that describes the observed data, and the state equations that describe how the unobserved components or states (such as level, trend, and seasonal) change over time. Therefore, we can also refer to them as state space models. For each method, there exist two corresponding state space models: one with additive errors and one with multiplicative errors. The derivation of the innovations state space models for each exponential smoothing method is detailed in Chapter 8.5 of Hyndman and Athanasopoulos [2021].
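To make the SES recursion in equations (3.3)–(3.5) concrete, here is a minimal sketch (Python for illustration only; the project implements these models in R) using a hypothetical series and α = 0.5:

```python
# Minimal sketch of Simple Exponential Smoothing (equations 3.3-3.5).
# Python for illustration only; the project implements these models in R.

def ses_forecast(y, alpha, level0):
    """Run the SES level recursion l_t = alpha*y_t + (1-alpha)*l_{t-1}
    and return the one-step-ahead forecast y_hat_{t+1} = l_t."""
    level = level0
    for y_t in y:
        level = alpha * y_t + (1 - alpha) * level
    return level

# Hypothetical daily traffic volumes (thousands of vehicles)
y = [60, 62, 58, 61, 63]
print(ses_forecast(y, alpha=0.5, level0=y[0]))  # prints 61.625
```

With a larger α the forecast tracks recent observations more closely; with α = 1 it reduces to the naive forecast (the last observation).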
3.1.4 Support Vector Regression (SVR) SVR is an application of the Support Vector Machine (SVM) to regression tasks, sharing a similar fundamental approach with slight differences (Santos and Castillo [2019]). In SVM, we aim to find a hyperplane with the maximum margin between classes. In contrast, SVR defines a threshold (ϵ), where data points within this margin have zero residuals, and points outside contribute to the residuals (ζ). The goal is to minimize these residuals. Essentially, SVR seeks an optimal strip (of width 2ϵ) and performs regression on points outside this strip. Figure 3.1: SVR In certain cases, SVR utilizes kernel functions to manage non-linear relationships between features. These functions map the input data into a higher-dimensional space, allowing a linear hyperplane to effectively separate or approximate the data. Commonly used kernels are linear, polynomial, radial basis function (RBF), and sigmoid. In time series forecasting, the aim is to predict the future value of a time series, such as a stock price at a future date or the temperature at an upcoming time step. As a regression technique, SVR constructs a model that connects historical time series data (features) with their corresponding future values (target variable). 3.2 Non-Linear Models 3.2.1 K-Nearest Neighbor (KNN) The KNN algorithm predicts future values by finding the K most similar subsequences in historical data. It calculates the similarity between a recent segment (the query sequence) and all historical subsequences, then averages the future values of the most similar ones to make predictions (Francisco Martinez [2023]). Here are the steps of how it works: First, define the query sequence by selecting the recent data segment you want to predict from. Next, segment the historical data by dividing it into subsequences of the same length as the query sequence.
Then, calculate the similarity between the query sequence and each historical subsequence using a distance metric, such as the Euclidean distance. After that, select the K nearest neighbors by choosing the K historical subsequences with the smallest distances. Finally, predict future values by averaging the future values of these K nearest neighbors. For example, suppose we have traffic volume data for the past 10 months and want to predict the 11th month's traffic volume: • Define the Query Sequence: The traffic volume for the last 3 months, e.g., (August, September, October). • Segment Historical Data: Create subsequences of 3 months, e.g., (January, February, March), (February, March, April), and so on. • Calculate Similarity: Measure the similarity between (August, September, October) and all historical subsequences. • Select K Nearest Neighbors: Choose the 3 subsequences with the smallest distances, e.g., (May, June, July), (April, May, June), (January, February, March). • Predict Future Values: Average the traffic volumes of the months following these subsequences (i.e., August, July, April) to predict November's traffic volume. This method effectively utilizes patterns and trends in historical data to make accurate predictions about future traffic volume. 3.2.2 Long Short-Term Memory (LSTM) LSTM networks are a special type of Recurrent Neural Network (RNN) designed to effectively learn long-term dependencies. RNNs are a class of neural networks designed to recognize patterns in sequences of data, such as time series, natural language, or audio signals. Unlike traditional feedforward neural networks, RNNs have a circular connection structure, which allows the network to retain input information from previous time steps. This enables the network to consider the context from previous steps at each time step in the sequence (Starmer [2022b]). In traditional RNNs, issues like vanishing or exploding gradients make it difficult to learn from long sequences. Fig.
3.2 shows the basic structure of an RNN. Figure 3.2: Basic RNN structure (Source: Zhu [2019]) LSTM networks address this by introducing structures such as memory cells and gating mechanisms, allowing selective retention or forgetting of information and thereby capturing long-term dependencies (Starmer [2022a]). The main idea behind how LSTM works is that instead of using the same feedback loop connection for events that happened long ago and events that just happened yesterday to make a prediction about tomorrow, LSTM uses two separate paths to make predictions about tomorrow: one path for long-term memories and another for short-term memories. Figure 3.3: LSTM As shown in Fig. 3.3, the upper blue line is called the cell state (C), representing long-term memory. Note that there are no weights or biases that can modify it directly. This allows long-term memories to flow through a series of unrolled units without causing the gradient to explode or vanish. The lower red line is called the hidden state (H), which represents short-term memory. Short-term memory is directly connected to the weights that can modify it. By using separate paths for long-term and short-term memories, LSTM networks avoid the exploding/vanishing gradient problem, which means they can be unrolled more times to accommodate longer sequences of input data than a vanilla recurrent neural network. 3.2.3 Neural Network Autoregression (NNETAR) A neural network is a computational model that mimics the biological neural system. By connecting numerous artificial neurons, it processes and analyzes data with powerful nonlinear mapping capabilities. The basic structure of a neural network includes an input layer, hidden layers, and an output layer (Saleem [2023]). Figure 3.4: The structure of a neural network • The input layer receives external data, with each input node corresponding to a feature.
• Hidden layers, consisting of several neurons, perform weighted summation of the input data and apply nonlinear transformations through activation functions. The number of hidden layers can vary, and more layers increase the network's complexity and expressive power. • The output layer generates the final prediction. Neural networks learn through forward propagation (data passing from the input layer to the output layer) and backpropagation (updating weights layer by layer according to the gradient of the loss function). This adjustment of weights and biases minimizes the loss function, ensuring that the predictions are as close to the real values as possible. The NNETAR model combines the nonlinear modeling capabilities of neural networks with the time dependency of autoregressive (AR) models and is widely used in time series forecasting (Hyndman and Athanasopoulos [2021]). In an NNETAR model, the input layer receives historical data points of a time series. If p historical points are used, the input vector is (yt−1, yt−2, . . . , yt−p). The hidden layers have the same structure as in a simple neural network, and the output layer generates the forecast for time t. NNETAR models are highly capable of capturing complex nonlinear relationships and are flexible, but they require substantial amounts of data for training and have higher computational complexity. By appropriately selecting the model structure and parameters, NNETAR models can demonstrate significant potential in time series forecasting, especially when dealing with complex nonlinear relationships. 3.3 Proposed Model: Hierarchical Forecast Combination (HFC) Model Forecast combinations have seen significant growth in popularity within the forecasting community and have recently become a key aspect of mainstream forecasting research and practice.
This approach involves combining multiple forecasts generated for a target time series to enhance accuracy by integrating information from diverse sources, thus eliminating the need to pinpoint a single "best" forecast. The methods for combining forecasts have evolved from basic techniques that require no estimation to advanced strategies that incorporate time-varying weights, nonlinear combinations, component correlations, and cross-learning (Wang [2023]). From the article by Andrawis [2011], we have learned about various forecast combination methods, including Hierarchical Forecast Combination (HIER). The principle behind HIER is to select the two best-performing linear methods and the two best-performing non-linear methods from all previously described forecast combination methods and then take a simple average (AVG) of the forecasts from these four methods. The selection is based on the performance of these methods on the evaluation set. However, in practice, a simple average may not be the best approach for combining forecasts from these individual models because it assigns equal weight to each model. To overcome this limitation, in this work we develop a data-driven Hierarchical Forecast Combination (HFC) model. Our model not only retains the original idea of selecting the best linear and non-linear models through evaluation but also incorporates appropriate weights for combining these methods. 3.3.1 Proposed Data-Driven HFC Models Assume we have n individual models, some of which are linear and others non-linear. Our approach involves three stages of model combination. In the first stage, we combine the top n1 linear models to obtain a linear combined model. In the second stage, we combine the top n2 non-linear models to obtain a non-linear combined model. In the final stage, we combine the linear combined model with the non-linear combined model, where n1 + n2 = n, to obtain the final proposed data-driven HFC model.
First Stage: We can define the forecast combined model based on linear models as: ŷ^L_{t+h} = Σ_{i=1}^{n1} w_i^L m_i^L. Here, ŷ^L_{t+h} denotes the combined h-step-ahead forecast under the linear models. The corresponding weight for each model m_i^L is w_i^L, where i = 1, 2, . . . , n1. These weights w_i^L are optimized according to specific criteria, which will be detailed later. Second Stage: We can define the forecast combined model based on non-linear models as: ŷ^NL_{t+h} = Σ_{i=1}^{n2} w_i^NL m_i^NL. Similarly, ŷ^NL_{t+h} represents the h-step-ahead combined forecast for the non-linear models, with w_i^NL being the weight for model m_i^NL, where i = 1, 2, . . . , n2. The optimization of these weights w_i^NL is based on criteria that will be discussed later. Third Stage: The proposed data-driven HFC model can be derived as: ŷ^HFC_{t+h} = w_1^HFC Σ_{i=1}^{n1} w_i^L m_i^L + w_2^HFC Σ_{i=1}^{n2} w_i^NL m_i^NL. Here, ŷ^HFC_{t+h} denotes the h-step-ahead combined forecast under the HFC model, and w_1^HFC and w_2^HFC are the weights of the proposed model, which need to be optimized to obtain a better forecast combination. In our work, we set n1 = n2 = 2. This means we consider the two best linear models and the two best non-linear models to construct our proposed HFC model. However, it is important to note that one can choose more than two models to further refine the proposed HFC model. As a result, our model combines the top 2 linear models and the top 2 non-linear models, and finally combines the linear combined model with the non-linear combined model.
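To illustrate the three stages with n1 = n2 = 2, the following sketch (Python for illustration; all forecast values and model outputs are hypothetical, and the grid-search weighting follows the strategy described in Section 3.3.2) combines two linear and two non-linear forecast vectors:

```python
# Sketch of the three-stage HFC combination with n1 = n2 = 2. The weight
# at each stage is chosen on a 0.01 grid over (0, 1) by minimizing the
# sum of squared forecast errors against held-out actuals (the FESS
# criterion of Section 3.3.2). All forecast values below are hypothetical.

def best_weight(f1, f2, actual):
    """Grid-search w in {0.01, ..., 0.99} minimizing sum((y - yhat)^2)."""
    def sse(w):
        return sum((y - (w * a + (1 - w) * b)) ** 2
                   for y, a, b in zip(actual, f1, f2))
    return min((round(0.01 * k, 2) for k in range(1, 100)), key=sse)

def combine(f1, f2, w):
    """Weighted combination w*f1 + (1-w)*f2, element-wise."""
    return [w * a + (1 - w) * b for a, b in zip(f1, f2)]

# Hypothetical evaluation-set forecasts from the top two linear and the
# top two non-linear models, plus the observed values.
actual = [100, 110, 105, 120]
lin1, lin2 = [98, 108, 104, 118], [103, 113, 107, 124]
nl1, nl2 = [101, 109, 106, 119], [96, 106, 101, 115]

w_l = best_weight(lin1, lin2, actual)            # first stage
w_nl = best_weight(nl1, nl2, actual)             # second stage
lin_comb = combine(lin1, lin2, w_l)
nl_comb = combine(nl1, nl2, w_nl)
w_hfc = best_weight(lin_comb, nl_comb, actual)   # third stage
hfc_forecast = combine(lin_comb, nl_comb, w_hfc)
```

Because the grid contains w = 0.5, the data-driven weights can never do worse on the evaluation set than the simple average at any stage.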
Consequently, the updated formulations of the models become: First Stage: ŷ^L_{t+h} = Σ_{i=1}^{2} w_i^L m_i^L = w_1^L m_1^L + w_2^L m_2^L. Second Stage: ŷ^NL_{t+h} = Σ_{i=1}^{2} w_i^NL m_i^NL = w_1^NL m_1^NL + w_2^NL m_2^NL. Third Stage: The proposed HFC model can then be written as: ŷ^HFC_{t+h} = w_1^HFC Σ_{i=1}^{2} w_i^L m_i^L + w_2^HFC Σ_{i=1}^{2} w_i^NL m_i^NL. 3.3.2 Strategies to Determine Weights To find the optimal weights at each stage of model combination, we follow the procedure below. Candidate w1 values are taken on a grid over (0, 1) with increments of 0.01, where Σ_{i=1}^{2} w_i = 1 (so w2 = 1 − w1). For each w1 value, we calculate ŷ_{t+h}. Subsequently, we calculate the one-step-ahead Forecast Error Sum of Squares (FESS), using the formula Σ_{t=1}^{T} (y_{t+h} − ŷ_{t+h})², where t indexes the observations and h is the forecast horizon. The optimal w1 is then chosen as the one that minimizes the FESS. Based on this weight determination procedure, we derive two proposed methods. The first is the Data-Driven Hierarchical Forecast Combination Method (DDHFC), where we find the optimal weight at each stage. The second is the Partially Data-Driven Hierarchical Forecast Combination Method (DDPHFC), where the weight is optimized only at the last stage, while the weights for the first and second stages are fixed at w_i^L = w_i^NL = 0.5. The two proposed methods, DDHFC and DDPHFC, are compared with the simple average combination (AVG) model, where all weights are fixed at 0.5 at every stage. 3.3.3 Steps for HFC Model Figure 3.5: Basic steps of the proposed HFC model The procedure above shows the steps of how the HFC model works. Next, we explain how our method is implemented: 1. Select Different Models and Get Forecast Results: - Each linear model is individually trained on our datasets to generate forecast results. The Mean Absolute Percentage Error (MAPE) for each model is calculated, and the top n1 models with the lowest MAPE values are selected.
The same procedure is applied to the non-linear models to select the top n2 models. 2. First stage (Forecast combination for the top n1 linear models): - We combine the results of the top n1 linear models. - We use different strategies to determine the weights for the top n1 linear models. - We now have a linear combined model. 3. Second stage (Forecast combination for the top n2 non-linear models): - We combine the results of the top n2 non-linear models. - We use different strategies to determine the weights for the top n2 non-linear models. - We now have a non-linear combined model. 4. Third stage: - We combine the forecasts of the linear combined model and the non-linear combined model. - Again, we use different strategies to determine the final combination weights. By performing this staged weighted combination and using different strategies to determine the weights, we can effectively leverage the strengths of different models and improve the overall performance of the forecasting model. The HFC model is particularly useful for complex forecasting problems, especially when the performance of individual models is unstable or inconsistent. Chapter 4 Experiments 4.1 Dataset Description Two datasets are used in our project: the 2022 Hourly Traffic and Climate Dataset and the 2017-2022 Daily Traffic and Climate Dataset. The traffic data are collected from the Traffic Data Program on the B.C. Ministry of Transportation and Infrastructure website (Traffic Data Program), and the climate data are collected from Environment Canada (Service Canada [2023]). 4.2 Traffic Data Collection Figure 4.1: Process of traffic data collection We initially obtain the necessary traffic data in CSV format. These CSV files consist of the Monthly Hourly Volume Report (MVO3) and the Daily Volume Summary Report (DV01).
As per the Traffic Reports User Documentation from the BC Ministry of Transportation and Infrastructure (WSP Canada Group Limited [2019]), the MVO3 presents a comprehensive breakdown of hourly traffic for each day of the month, including data for both the negative direction (Neg DIR) and the positive direction (Pos DIR) on the Lions Gate Bridge. Conversely, the DV01 provides a summary of total daily volumes and daily volumes per lane, compiled over a one-day period. Traffic agencies commonly gather diverse traffic data types such as volume, speed, length, axle class, and weigh-in-motion (WIM). In British Columbia, traffic data collection involves the use of inductive loops, pneumatic hoses, and piezoelectric strips. Inductive loops detect the metal content of vehicles, enabling the measurement of lane volume with a single loop, while a pair of loops can assess length and speed. After obtaining these CSV files, we import the data into R and create two tsibble tables named full.hourly.2022 and full.daily.17_22. 4.2.1 Hourly Traffic Volume Data (2022) For the full.hourly.2022 tsibble table, special handling is required for the data on March 13, 2022, and November 6, 2022, due to daylight saving time. When creating the traffic dataset in R, adjustments are necessary while collecting data from the CSV files. Specifically, the traffic data for 2 a.m. on March 13, 2022, needs to be omitted, and the traffic data for 1 a.m. on November 6, 2022, needs to be duplicated. Subsequently, we examine the full.hourly.2022 table for missing data and outliers. The analysis reveals 217 missing data points in the table. These missing values are addressed using the function replace_missing_values(), which replaces them with the respective average values over all days of the same weekday throughout the year. Following this, outliers are checked using the built-in function tsoutliers() (Chen and Liu [1993]), and no outliers are detected.
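The weekday-average imputation described above can be sketched as follows (a Python illustration of the idea only; the project's implementation is an R function operating on a tsibble, and this sketch additionally assumes the averages are grouped by hour of day, which is a natural choice for hourly data):

```python
# Sketch of weekday-average imputation: a missing hourly volume is replaced
# by the mean of the observed volumes at the same weekday and hour across
# the rest of the year. Python illustration only; the project's version is
# an R function operating on a tsibble, and the per-hour grouping is an
# assumption made here for hourly data.

from statistics import mean

def impute_weekday_hour(records):
    """records: list of (weekday, hour, volume), with volume None if missing.

    Returns a new list in which each missing volume is replaced by the
    average of observed volumes sharing the same (weekday, hour) key.
    """
    groups = {}
    for wd, hr, vol in records:
        if vol is not None:
            groups.setdefault((wd, hr), []).append(vol)
    return [(wd, hr, vol if vol is not None else mean(groups[(wd, hr)]))
            for wd, hr, vol in records]

data = [("Mon", 8, 3200), ("Mon", 8, 3400), ("Mon", 8, None)]
print(impute_weekday_hour(data)[-1])
```

This style of imputation preserves the weekly and daily seasonality of the traffic series, which simpler fills (such as a global mean) would smear out.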
Moving forward, we examine the plot of the hourly traffic volume, focusing on February 2022. Fig. 4.2 shows that the traffic data exhibit some seasonality, with a daily seasonal pattern. Figure 4.2: Hourly traffic volume data plot for February 2022 For a deeper understanding of our data, we selected a random workweek from the 2022 Hourly Traffic dataset, from January 10th to January 14th, 2022. We conducted separate analyses for each day and for the aggregated five-day period. Fig. 4.3 shows the hourly traffic volume plots from 2022-01-10 to 2022-01-14; these five graphs reveal dual peaks on each workday, reasonably interpreted as corresponding to the morning and evening rush hours. Typically, the morning peak occurs between 7:00 a.m. and 9:00 a.m., while the evening peak spans from 4:00 p.m. to 6:00 p.m. Fig. 4.4 illustrates the hourly traffic volume across the week, showing a strong weekly pattern. Figure 4.3: Hourly traffic volume plot from 2022-01-10 to 2022-01-14 (Single Day) Figure 4.4: Hourly traffic volume plot from 2022-01-10 to 2022-01-14 Fig. 4.5 shows the comparison of average hourly traffic volume (across the year 2022) between holidays, weekdays, and weekends. From the graph, it can be observed that holidays and weekends do not exhibit morning or evening rush hours. Instead, there is a continuous increase in traffic flow from around 6:00 a.m. to approximately 12:00 p.m., presumably due to increased outdoor activities during this period. Peak traffic is observed around 12:00 p.m. Subsequently, traffic gradually decreases, suggesting a decline in outdoor recreational activities. Figure 4.5: Comparison of average hourly traffic volume between holiday, weekday and weekend 4.2.2 Daily Traffic Volume Data (2017-2022) The same processing steps are applied to the full.daily.17_22 tsibble table, with the exception that daylight saving time does not need to be considered here.
For handling missing data, the built-in function na_seadec() (Seasonally Decomposed Missing Value Imputation) is employed, which utilizes seasonal decomposition of the time series. This process results in the creation of the 2017-2022 Daily Traffic dataset. Fig. 4.6 illustrates the daily traffic volume from January 1, 2017, to December 31, 2022. It is evident from the graph that traffic peaks during the summer months compared to winter. Around March 2020, there is a sharp decline in traffic volume, likely attributable to the outbreak of the COVID-19 pandemic and the implementation of stay-at-home orders by the BC government, which resulted in a significant reduction in daily traffic flow (Zussman [2020]). Figure 4.6: Plot of 2017-2022 daily traffic volume data Further, we randomly selected a time period from the 2017-2022 Daily Traffic Dataset: from March 5, 2018, to April 1, 2018, spanning four weeks. Fig. 4.7 depicts the plot for this period. From the plot, it is evident that, except for the last week (March 26, 2018, to April 1, 2018), the daily traffic volume on weekdays consistently surpassed that of weekends. Figure 4.7: Daily traffic volume plot from 2018-03-05 to 2018-04-01 A similar result can be found in Fig. 4.8. This graph depicts the distribution of daily traffic volume data for each day of the week using box plots. It illustrates that the daily traffic volume on weekdays is higher than on weekends. Figure 4.8: Distribution of daily traffic volume data for each day of the week 4.3 Climate Data Collection Figure 4.9: Process of climate data collection Figure 4.9 illustrates the process of climate data collection. Since there are no climate stations located directly on the Lions Gate Bridge, we extended our examination to the surrounding area.
We selected a 16-kilometer radius around the bridge based on previous studies that have used this distance as a standard, indicating that weather conditions within this range show minimal variation and remain consistent (Roh et al. [2019]). Using Google Maps, we identified a total of 95 weather stations within this 16-kilometer radius, with data sourced from Environment Canada (Service Canada [2023]) and a climate data extraction tool (Canada, Environment and Climate Change). After eliminating stations lacking weather data for the year 2022, we retained 9 stations with the following climate station IDs: 1105658, 1105669, 1106200, 1106764, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Some of these stations provide solely hourly data, some solely daily data, while others offer both. Hourly and daily data reports can be acquired separately from the Historical Data website (Historical Data). 4.3.1 Hourly Climate Data (2022) Figure 4.10: Process of 2022 hourly climate data collection Fig. 4.10 shows the process of 2022 hourly climate data collection. The hourly data report includes 11 relevant parameters: Temperature (°C), Dew Point Temperature (°C), Relative Humidity (%), Total Hourly Precipitation, Wind Direction (tens of degrees), Wind Speed (km/h), Visibility (km), Station Pressure (kPa), Humidex, Wind Chill, and Occurrence of Weather and Obstructions to Vision. The climate IDs of the stations that have solely hourly data are 1106200, 1108446, 1108395, 1108380, and 1108824. Furthermore, based on Roh et al. [2019], our focus for the hourly data lies on Temperature (°C), Total Hourly Precipitation, and Wind Speed (km/h). We identify the stations corresponding to these three features separately to generate three tsibble tables. Details regarding the examination of temperature, precipitation, and wind speed for the hourly climate data can be found in Appendix A.1.
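As an illustration of how a per-hour average across the selected stations can be formed for a feature such as temperature (a Python sketch with hypothetical readings; the station IDs are taken from the hourly-station list above, and the project performs this aggregation in R):

```python
# Sketch of averaging one climate feature (e.g., temperature) across the
# selected stations for each hourly timestamp. The station IDs below come
# from the list of hourly stations; the readings are hypothetical, and
# missing readings are simply skipped. The project does this in R.

from statistics import mean

def hourly_station_average(readings):
    """readings: dict mapping timestamp -> {station_id: value or None}."""
    return {ts: mean(v for v in by_station.values() if v is not None)
            for ts, by_station in readings.items()}

readings = {
    "2022-02-01 08:00": {"1106200": 4.1, "1108446": 3.9, "1108824": None},
    "2022-02-01 09:00": {"1106200": 5.0, "1108446": 4.6, "1108824": 4.8},
}
avgs = hourly_station_average(readings)
print(round(avgs["2022-02-01 08:00"], 2))  # prints 4.0
```

Skipping missing station readings rather than dropping the whole hour keeps the climate features aligned with every hourly traffic observation.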
Finally, by integrating the 2022 Hourly Climate Dataset with the 2022 Hourly Traffic Dataset, we obtain the first dataset (the 2022 Hourly Traffic and Climate Dataset).

4.3.2 Daily Climate Data (2017-2022)

Fig. 4.11 shows the process of 2017-2022 daily climate data collection. The daily data report includes 11 relevant parameters: Maximum Temperature (°C), Minimum Temperature (°C), Mean Temperature (°C), Heating Degree-days, Cooling Degree-days, Total Rain (mm), Total Snow (cm), Total Precipitation (mm), Snow on the Ground (cm), Direction of Maximum Gust (tens of degrees), and Speed of Maximum Gust (km/h).

Figure 4.11: Process of 2017-2022 daily climate data collection

The climate IDs of the stations providing daily data are 1105658, 1105669, 1106200, 1106764, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Furthermore, following Roh et al. [2019], our interest for the daily data lies in Mean Temperature (°C) and Total Precipitation (mm). Based on these features, we generated two tsibble tables. Details regarding the examination of temperature and precipitation for the daily data can be found in Appendix A.2. Finally, integrating the 2017-2022 Daily Climate Dataset with the 2017-2022 Daily Traffic Dataset yields the second dataset (the 2017-2022 Daily Traffic and Climate Dataset).

4.4 Traffic and Climate Dataset

The first dataset (2022 Hourly Traffic and Climate Dataset) consists of 8760 rows with 9 features: "volume", "volume.P", "volume.N", "temp.average", "precip.average", "ws.average", "weekday", "is_weekend", and "day.type". The feature "volume" represents traffic volume. For this dataset, we also consider the direction of traffic: the positive direction (volume.P) indicates traffic moving from South to North or from West to East, whereas the negative direction (volume.N) denotes traffic moving from North to South or from East to West.
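Conceptually, building each merged table amounts to averaging the retained stations at every timestamp (skipping missing readings) and joining the climate features onto the traffic series by the shared hour or day. The project does this in R with tsibble; the sketch below is a stdlib-Python illustration with made-up records:

```python
def station_average(readings):
    """Average one timestamp's readings across stations, ignoring missing (None) values."""
    valid = [r for r in readings if r is not None]
    return sum(valid) / len(valid) if valid else None

def merge_on_hour(traffic, climate):
    """Inner-join two {timestamp: {feature: value}} tables on their shared timestamps."""
    return {ts: {**traffic[ts], **climate[ts]}
            for ts in sorted(traffic.keys() & climate.keys())}

traffic = {"2022-01-01 00:00": {"volume": 512}}
climate = {"2022-01-01 00:00": {"temp.average": station_average([3.0, None, 5.0])}}
print(merge_on_hour(traffic, climate))
# {'2022-01-01 00:00': {'volume': 512, 'temp.average': 4.0}}
```

An inner join is used so that hours missing from either table are dropped rather than padded with NA values.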
The second dataset (2017-2022 Daily Traffic and Climate Dataset) has 2191 rows with 6 features: "volume", "temp.average", "precip.average", "weekday", "is_weekend", and "day.type", where the traffic data consist of daily traffic volumes from 2017 to 2022. Across the two datasets, the features "temp.average", "precip.average", and "ws.average" represent the average temperature (°C), precipitation (mm), and wind speed (km/h) observed by the chosen climate stations, respectively. The feature "weekday" indicates the day of the week corresponding to the date; "is_weekend" uses "1" to denote weekends and "0" otherwise; and "day.type" takes the values "Weekday", "Weekend", and "Holiday", determined using the calendar functions from the R package RQuantLib (Eddelbuettel [2024]), covering all statutory and regional holidays in British Columbia from 2017 to 2022 (OfficeHolidays).

4.5 Parameter Estimation

In this section, we discuss how the training and test sets were divided, how the different linear and non-linear models are implemented in R, and how their parameters are selected during model training.

For the 2022 Hourly Traffic and Climate Dataset, to evaluate the performance of our model in predicting different day types, we further divided the dataset into three categories: weekday, weekend, and holiday, with the training and test sets adjusted for each type. The specific divisions for each day type are as follows:

• Weekday: The training set is from June 1, 2022, 00:00 to November 30, 2022, 23:00, and the test set is from December 1, 2022, 00:00 to December 2, 2022, 23:00 (48 hours).

• Weekend: We randomly selected a weekend in 2022. The training set is from January 1, 2022, 00:00 to June 24, 2022, 23:00, and the test set is from June 25, 2022, 00:00 to June 26, 2022, 23:00 (48 hours).

• Holiday: We randomly selected Canada Day (July 1) as the holiday.
Thus, the training set is from January 1, 2022, 00:00 to June 30, 2022, 23:00, and the test set is from July 1, 2022, 00:00 to July 1, 2022, 23:00 (24 hours).

For the 2017-2022 Daily Traffic and Climate Dataset, since the traffic volume is recorded daily, we only consider the weekday type. The training set is from January 1, 2017, to November 30, 2022, and the test set is from December 1, 2022, to December 14, 2022 (2 weeks).

Our methods are implemented in R. Most of the methods have corresponding R packages, but for methods like LSTM and HFC we did not rely entirely on existing packages. In particular, HFC is our own method, and the functions within it are self-programmed. Next, we introduce the packages used for each method and their required parameters.

4.5.1 ARIMA

The ARIMA() function in R is from the fable package. It performs automatic model selection, choosing the optimal combination of the p, d, and q parameters via a stepwise search (Hyndman and Athanasopoulos [2021]).

4.5.2 Dynamic Regression Model

Since we assume the error term of the dynamic regression model is autocorrelated (not white noise) and follows an ARIMA model, we use the same package for this method as for ARIMA (Hyndman and Athanasopoulos [2021]). As this is a regression method, the lagged traffic volume serves as the target variable, with temp.average, precip.average, ws.average, and day.type as independent variables.

4.5.3 Exponential Smoothing Model

This model utilizes the ETS() function in R (Hyndman and Athanasopoulos [2021]). The ETS() function fits the exponential smoothing (ES) model, which is widely used for time series forecasting, especially when trend and seasonality are present. The ETS model consists of three components: Error (E), Trend (T), and Seasonality (S).
As mentioned in our methodology, the error can be additive (A) or multiplicative (M), the trend can be none (N) or additive (A), and the seasonality can be none (N), additive (A), or multiplicative (M). Different combinations of these components construct various ETS models. For example:

• ETS(A, N, N) represents a model with additive error and no trend or seasonality, which is the simple exponential smoothing (SES) method;

• ETS(A, A, N) represents a model with additive error, additive trend, and no seasonality, which is Holt's linear trend method;

• ETS(A, A, A) represents a model with additive error, additive trend, and additive seasonality, which is Holt-Winters' additive method;

• ETS(M, A, M) represents a model with multiplicative error, additive trend, and multiplicative seasonality, which is Holt-Winters' multiplicative method.

When using this function, we did not specify the model components but allowed it to automatically select the most suitable ES model, including the best error, trend, and seasonality types, by comparing the AIC (Akaike Information Criterion) values of candidate models.

4.5.4 Support Vector Regression (SVR)

In R, the svm() function from the e1071 package is used to build an SVR model (Santos and Castillo [2019]). Depending on the kernel function, SVR can act as either a linear or a non-linear model.

SVR Non-linear Model: We set the kernel function to its default, the Radial Basis Function (RBF) kernel. A key feature of the RBF kernel is its non-linear mapping capability: data points are implicitly mapped to a high-dimensional feature space, where data that are not linearly separable in the original lower-dimensional space may become linearly separable. Therefore, SVR with the RBF kernel can effectively address many non-linear problems. We denote this model as SVR_NL.
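The kernel itself is simply a similarity function on input vectors: for two points x and z, the RBF kernel is k(x, z) = exp(−γ‖x − z‖²). A minimal sketch (the γ value here is an arbitrary illustration; e1071's svm() picks its own default from the data dimension):

```python
from math import exp

def rbf_kernel(x, z, gamma=0.1):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return exp(-gamma * sq_dist)

# Similarity is 1 for identical points and decays with squared distance,
# which is what lets SVR fit smooth non-linear response surfaces.
print(rbf_kernel((0.0, 0.0), (0.0, 0.0)))  # 1.0
```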
SVR Linear Model: Similarly, by setting the kernel function to linear, SVR attempts to find a linear function that fits the data in the original input space, thereby acting as a linear model. We denote this model as SVR_L.

SVR Model with Additional Features: When additional features such as "temp.average", "ws.average", and "precip.average" are included in the model, two additional models are denoted SVR.X_NL and SVR.X_L, where "X" represents the additional features.

4.5.5 K-Nearest Neighbor (KNN)

The knn_forecasting() function is provided by the tsfknn package in R for time series forecasting (Francisco Martinez [2023]). This function uses the KNN algorithm for prediction and is suitable for various time series analysis scenarios. In our example, the parameter h = 48 forecasts the values of the next 48 time points. The parameter lags = 1:48 specifies using data from the past 1 to 48 time points as features, which helps capture patterns over a longer time range. The parameter k = 3 means the 3 nearest neighbors are referenced during prediction, balancing the accuracy and stability of the forecast. The parameter msas = "MIMO" sets the multi-step prediction strategy to Multi-Input Multi-Output, allowing the model to predict all 48 future time points at once. With these parameters, we can construct an efficient predictive model that adapts to the complexities of time series data.

4.5.6 Long Short-Term Memory (LSTM)

Similar to SVR, if we only consider the lagged values, we have a model called LSTM_NL; if we add further variables such as "temp.average", "ws.average", and "precip.average", we have a model called LSTM.X_NL. In R, using LSTM for time series forecasting typically involves the keras, tensorflow, and reticulate packages. The keras package provides a high-level neural network API, simplifying the construction and training of deep learning models.
The tensorflow package is R's interface to TensorFlow, used for building and training deep learning models, while the reticulate package serves as a two-way interface between R and Python, ensuring compatibility with TensorFlow and Keras. Through these packages, one can efficiently implement and train LSTM models in R for accurate time series forecasting.

The lstm_build_model() function is used to build and train an LSTM-based time series forecasting model. Its parameters include:

• x: the feature matrix of the training data. It contains the input data for model training, with each row representing a time point and each column a lagged time step.

• y: the target matrix of the training data. It contains the target values corresponding to each input row of the feature matrix, i.e., the values the model needs to predict.

• units: the number of units in each LSTM layer (set to 50). This parameter determines the number of neurons per LSTM layer.

• batch: the batch size (set to 1). The batch size determines the amount of data used in each model update; for stateful LSTMs, small batch sizes (such as 1) are typically used to maintain sequence continuity and state.

• epochs: the number of training epochs (set to 20). This parameter determines how many times the model passes over the entire training dataset. More epochs allow the model to better learn the patterns in the data, but too many can lead to overfitting.

• rate: the dropout rate of the Dropout layer (set to 0.5). Dropout is a regularization technique that randomly drops neurons during training to prevent overfitting. A higher dropout rate guards more strongly against overfitting, but if set too high it may discard too much information and hurt model performance.

4.5.7 Neural Network Autoregression (NNETAR)

Similar to SVR, if we only consider the lagged values, we have a model called NNETAR_NL.
If we add further variables such as "temp.average", "ws.average", and "precip.average", we have a model called NNETAR.X_NL.

The NNETAR() function in R is from the fable package (Hyndman and Athanasopoulos [2021]). For seasonal data, the model is written NNAR(p, P, k)_m, where p is the number of lagged inputs, P is the number of seasonal lagged inputs, k is the number of neurons in the hidden layer, and m is the seasonal period. More generally, an NNAR(p, P, k)_m model has inputs (y_{t-1}, y_{t-2}, ..., y_{t-p}, y_{t-m}, y_{t-2m}, ..., y_{t-Pm}) and k neurons in the hidden layer. For example, an NNAR(3, 1, 2)_12 model has inputs y_{t-1}, y_{t-2}, y_{t-3}, and y_{t-12}, and two neurons in the hidden layer. When using this function, we let NNETAR() choose the values of p and P automatically. If k is not specified, it defaults to (p + P + 1)/2, rounded to the nearest integer.

4.6 Evaluation Metrics

In our project, we consider two evaluation metrics: root mean square error (RMSE) and mean absolute percentage error (MAPE).

4.6.1 RMSE

The RMSE is defined as

RMSE = \sqrt{ \frac{1}{T} \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 },

where \hat{y}_t denotes the predicted value, y_t the actual value, and T the total number of test samples. RMSE is sensitive to outliers because the errors are squared, amplifying their effect. The unit of RMSE is the same as that of the actual values, so it can be interpreted directly as the typical difference between predicted and actual values.

4.6.2 MAPE

The MAPE is defined as

MAPE = \frac{1}{T} \sum_{t=1}^{T} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100\%,

where \hat{y}_t denotes the predicted value, y_t the actual value, and T the total number of test samples. MAPE expresses the prediction error as a relative percentage (%), making it convenient for comparing data of different scales.
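Both metrics are straightforward to compute from a forecast and the held-out test window; a minimal Python sketch of the two definitions above:

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error, in the same units as the data."""
    return sqrt(sum((y - f) ** 2 for y, f in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(actual, predicted)) / len(actual)

# Toy example: 10% error on two points, exact on the third.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(round(mape(actual, predicted), 2))  # 6.67
```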
When the actual value y_t is close to zero, MAPE becomes unstable and can produce extremely large error values, making it unsuitable for datasets containing zero or near-zero values.

4.7 Performance Comparison

After selecting two linear models and two non-linear models based on their MAPE values, we used these models to construct the HFC model. We then compared the combined models with the four individual models using MAPE and RMSE. Each comparison table has three columns: the model name, its MAPE value, and its RMSE value.

As described in the Data Collection section, we utilized two distinct datasets: the 2022 Hourly Traffic and Climate Dataset and the 2017-2022 Daily Traffic and Climate Dataset. Before comparing the results of different models, we subdivide these datasets further. The first dataset is split into the 2022 Hourly Both Directions Dataset, the 2022 Hourly Positive Direction Dataset, and the 2022 Hourly Negative Direction Dataset to account for traffic direction. The second dataset does not involve traffic direction and only includes the 2017-2022 Daily Dataset.

For the hourly datasets, each comparison table covers three day types: weekday, weekend, and holiday; the Parameter Estimation section above gives the specific details, and each day type corresponds to a distinct training and test set. For the daily dataset, each table includes only one day type, weekdays.

In the result tables, for clarity, we assign specific names to the different types of models. The models are primarily categorized as linear or non-linear, and we add the corresponding subscripts to the model names: "_L" for linear models and "_NL" for non-linear models.
Thus, the names of the linear models with subscripts are:

• ARIMA_L
• DReg_L
• ETS_L
• SVR_L

The names of the non-linear models with subscripts are:

• SVR_NL
• KNN_NL
• LSTM_NL
• NNETAR_NL

Since some models incorporate not only traffic volume data but also additional feature-related data, we append "X" to the names of these models:

• SVR.X_L
• SVR.X_NL
• LSTM.X_NL
• NNETAR.X_NL

Next, we analyze the comparison results for the different datasets. For the 2022 Hourly Traffic and Climate Dataset, to evaluate the performance of our model in predicting different day types, we further divided the dataset into the three categories weekday, weekend, and holiday.

4.7.1 Results of 2022 Hourly Both Directions Dataset

Weekday

First, we analyze the results for the weekday day type and select the top two linear models and the top two non-linear models based on MAPE. The tables below present the results of the linear and non-linear models, ordered by MAPE.

Table 4.1: Linear Model Results

Model     MAPE
SVR_L     14.46
DReg_L    16.15
ARIMA_L   16.21
SVR.X_L   18.54
ETS_L     28.18

Table 4.2: Non-Linear Model Results

Model        MAPE
SVR.X_NL     10.51
SVR_NL       14.27
LSTM.X_NL    21.68
NNETAR.X_NL  28.45
NNETAR_NL    30.73
KNN_NL       42.89
LSTM_NL      44.11

Based on Table 4.1, we selected the top two models: SVR_L and DReg_L. Similarly, based on Table 4.2, we selected the two models SVR.X_NL and LSTM.X_NL. Subsequently, we used these four models to construct the HFC model.

Table 4.3: The comparison table of four individual forecast models with proposed HFC models (2022 hourly both directions - weekday)

Model      MAPE   RMSE
DReg_L     16.15  372.24
SVR_L      14.46  292.96
SVR.X_NL   10.51  218.33
LSTM.X_NL  21.68  344.05
AVG        11.7   263.3
DDHFC      9.89   217.5
DDPHFC     12.09  250.25

As shown in Table 4.3, the proposed HFC model (DDHFC) demonstrates superior accuracy compared to the individual models, particularly in terms of the MAPE metric.
Specifically, DDHFC has the lowest MAPE value of 9.89 and an RMSE of 217.5, indicating high forecasting accuracy. In contrast, the LSTM.X_NL model has the highest MAPE of 21.68, suggesting greater prediction errors. DDHFC not only achieves the best MAPE but also a relatively low RMSE, highlighting its effectiveness. This suggests that the HFC model, constructed from the selected two linear and two non-linear models, exhibits strong predictive capability across multiple evaluation metrics, validating its effectiveness in traffic forecasting.

Figure 4.12 displays the forecast results of the proposed HFC models and the four individual models for the 2022 bi-directional traffic data on weekdays. The black solid line represents the actual observed traffic data, while the other colored lines represent the forecasts of the different models.

Figure 4.12: Forecast plot for proposed HFC model and four individual models (2022 Hourly Both Directions - weekday)

The graph reveals that the HFC model (DDHFC) closely aligns with the actual data, particularly at peak traffic volumes. In comparison, although the predictions of the other models are also close to the actual data, their deviations are relatively larger, especially during periods of dramatic traffic change such as the morning and evening rush hours. Overall, the forecast results in Figure 4.12 demonstrate that the proposed HFC models exhibit strong stability and accuracy in complex traffic volume forecasting tasks, particularly excelling at predicting peak periods and outperforming the individual models.

Weekend and Holiday

Next, we analyze the results for the weekend and holiday day types. The procedure for selecting the top two linear models and the top two non-linear models is detailed in Appendix B.1.
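The AVG benchmark and the combination models compared in these tables are convex weightings of the component forecasts: non-negative weights summing to one, with equal weights reproducing AVG. A hedged sketch of that combination step (the weights and forecast values shown are illustrative, not the data-driven values the HFC actually estimates):

```python
def combine(forecasts, weights=None):
    """Convex combination of component forecast paths; equal weights give the AVG benchmark."""
    n = len(forecasts)
    if weights is None:
        weights = [1.0 / n] * n  # AVG benchmark
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-9
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(len(forecasts[0]))]

linear_forecast = [1000.0, 1200.0]     # illustrative component forecasts (vehicles/hour)
nonlinear_forecast = [1100.0, 1400.0]
print(combine([linear_forecast, nonlinear_forecast]))               # [1050.0, 1300.0]
print(combine([linear_forecast, nonlinear_forecast], [0.75, 0.25]))  # [1025.0, 1250.0]
```

A constrained convex combination like this can never forecast outside the range spanned by its components, which is one reason combinations tend to be more stable than any single model.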
Table 4.4: The comparison table of four individual forecast models with proposed HFC models (2022 hourly both directions)

Table 4.5: Weekend

Model     MAPE   RMSE
SVR.X_L   10.44  198.1
ARIMA_L   28.32  591.38
SVR_NL    8.56   203.09
KNN_NL    11.38  295.81
AVG       9.55   236.87
DDHFC     7.88   178.38
DDPHFC    6.99   214.7

Table 4.6: Holiday

Model        MAPE   RMSE
SVR.X_L      12.49  240.09
ARIMA_L      29.4   735.71
SVR_NL       8.65   200.69
NNETAR.X_NL  9.21   177.76
AVG          12.37  310.75
DDHFC        9.05   176.74
DDPHFC       7.68   183.38

Table 4.4 summarizes the comparison of the four individual forecast models with the proposed HFC models for the weekend and holiday day types, specifically for the 2022 hourly both-directions traffic data.

For the weekend, Table 4.5 reports the performance of the different models in terms of MAPE and RMSE. Among the models, DDPHFC achieves the lowest MAPE value of 6.99, indicating the highest prediction accuracy, while DDHFC also performs well with a MAPE of 7.88 and the lowest RMSE of 178.38, providing precise forecasts with minimal error. In contrast, the ARIMA_L model shows significantly higher errors, with a MAPE of 28.32 and an RMSE of 591.38, performing poorly compared to the other models.

For the holiday, Table 4.6 summarizes the performance of the models. The DDPHFC model stands out with the lowest MAPE of 7.68, reflecting its superior accuracy, and the DDHFC model also performs strongly, achieving the lowest RMSE of 176.74, indicating its robustness in minimizing forecast errors. On the other hand, the ARIMA_L model exhibits the highest errors, with a MAPE of 29.4 and an RMSE of 735.71, demonstrating comparatively weak predictive performance.
Overall, the results highlight that both the DDPHFC and DDHFC models demonstrate strong predictive capabilities under both weekend and holiday conditions, particularly in terms of MAPE and RMSE, significantly outperforming the traditional time series model ARIMA. This indicates that data-driven hybrid models may be better suited to the complex, time-varying data encountered in traffic flow forecasting.

Figures 4.13 and 4.14 illustrate the forecast performance of the various models on weekend and holiday traffic data (2022 hourly both directions), respectively. In both figures, the black solid lines represent the actual observed traffic data, while the colored lines show the forecasts of the different models.

Figure 4.13: Forecast plots for proposed models and the combination models (2022 Hourly Both Directions - weekend)

In Figure 4.13, which depicts weekend traffic, the forecast curves of the various models generally align well with the actual data, particularly at traffic peaks and valleys. The DDHFC and DDPHFC models (red and yellow lines, respectively) perform exceptionally well, closely matching the actual data, especially during the morning and evening peak periods, which suggests these models are highly accurate in capturing traffic patterns. Conversely, the ARIMA_L model (dark green line) shows larger prediction errors during certain periods, particularly peak times, indicating its limitations in handling complex traffic patterns.

Figure 4.14: Forecast plots for proposed models and the combination models (2022 Hourly Both Directions - holiday)

Similarly, in Figure 4.14, which presents the holiday traffic data, most models capture the general trend of traffic flow, though accuracy varies across models. Again, the DDHFC and DDPHFC models stand out by closely following the actual data, especially during peak traffic periods.
These models demonstrate a strong ability to predict the sharp increase in traffic volume during the morning and the subsequent gradual decline throughout the day. The ARIMA_L model, however, continues to show larger deviations from the actual data during peak hours, struggling to capture high-traffic periods accurately.

In summary, both figures underscore that the proposed DDHFC and DDPHFC models achieve higher forecasting accuracy in both weekend and holiday traffic predictions. They effectively capture traffic peaks and valleys, proving more reliable than the other models, particularly under complex and varying traffic conditions.

4.7.2 Results of 2022 Hourly Positive Direction Dataset

Next, we analyze the comparison tables. The procedure for selecting the top two linear models and the top two non-linear models is detailed in Appendix B.2.

Table 4.7: Comparison of Four Individual Forecast Models with Proposed HFC Model (2022 Hourly Positive Direction)

Table 4.8: Weekday

Model      MAPE   RMSE
SVR_L      15.12  201.52
ARIMA_L    17.3   219.84
NNETAR_NL  14.6   189.43
SVR.X_NL   15.77  166.93
AVG        11.78  171.69
DDHFC      14.42  166.57
DDPHFC     12.12  169.2

Table 4.9: Weekend

Model      MAPE   RMSE
SVR.X_L    11.07  152.21
ARIMA_L    24.82  294.1
SVR_NL     9.54   125.68
NNETAR_NL  11.93  166.58
AVG        10.01  150.13
DDHFC      8.96   125.31
DDPHFC     7.97   134.55

Table 4.10: Holiday

Model        MAPE   RMSE
SVR_L        11     133.1
ARIMA_L      20.43  323.87
NNETAR.X_NL  9.63   93.2
SVR_NL       10.63  133.24
AVG          9.61   141.89
DDHFC        7.5    81.28
DDPHFC       6.53   86.64

Tables 4.8, 4.9, and 4.10 present the performance of the various forecast models using MAPE and RMSE as evaluation metrics. Across these tables, the DDPHFC and DDHFC models consistently stand out for their accuracy and error minimization. In Table 4.8, the AVG model achieves the lowest MAPE of 11.78, while the DDHFC model records the lowest RMSE of 166.57, highlighting its effectiveness in minimizing prediction errors.
In comparison, the ARIMA_L model exhibits higher errors, with a MAPE of 17.3 and an RMSE of 219.84, making it less reliable. Similarly, in Table 4.9, the DDPHFC model achieves the lowest MAPE of 7.97, and the DDHFC model again shows the lowest RMSE of 125.31, while the ARIMA_L model performs poorly with the highest MAPE of 24.82 and an RMSE of 294.1. Finally, Table 4.10 reaffirms the superior performance of the DDPHFC and DDHFC models: the DDPHFC model achieves a MAPE of 6.53, while the DDHFC model records the lowest RMSE of 81.28. The ARIMA_L model, on the other hand, continues to show the highest errors, with a MAPE of 20.43 and an RMSE of 323.87.

Overall, the results across these tables indicate that the DDPHFC and DDHFC models are more reliable and accurate for traffic forecasting tasks, significantly outperforming traditional models such as ARIMA_L in both accuracy and consistency.

Figures 4.15, 4.16, and 4.17 illustrate the forecast performance of the various models on weekday, weekend, and holiday traffic data (2022 hourly positive direction), respectively. In each figure, the black solid lines represent the actual observed traffic data, while the colored lines show the forecasts of the different models.

Figure 4.15: Forecast plot for proposed HFC model and four individual models (2022 Hourly Positive Direction - weekday)

In Figure 4.15, which depicts weekday traffic, it is evident that most models follow the general trend of the actual traffic data, capturing the peaks and troughs in traffic volume. The AVG, DDHFC, and DDPHFC models (green, red, and yellow lines, respectively) are particularly notable for their close alignment with the actual data, especially during peak periods; they effectively predict the sharp increases and decreases in traffic, demonstrating their robustness in handling fluctuations. Similarly, in Figure 4.16, which presents the weekend traffic data, most models capture the general trend of traffic flow.
It is clear that the DDHFC and DDPHFC models (red and yellow lines) closely follow the actual data, particularly during peak traffic periods, showing strong predictive performance and effectively capturing the increases and decreases in traffic volume throughout the weekend.

Figure 4.16: Forecast plots for proposed models and the combination models (2022 Hourly Positive Direction - weekend)

The ARIMA_L model (dark green line), however, continues to show larger deviations from the actual data during peak hours, struggling to capture the weekend traffic patterns accurately.

Figure 4.17 focuses on the holiday traffic data. From the graph, it is evident that the DDHFC and DDPHFC models (red and yellow lines, respectively) closely align with the actual traffic data, particularly during peak traffic periods. These models effectively capture the rise in traffic volume during the morning hours and the gradual decline in the evening, demonstrating their reliability in forecasting holiday traffic patterns.

Figure 4.17: Forecast plots for proposed models and the combination models (2022 Hourly Positive Direction - holiday)

In contrast, the ARIMA_L model (dark green line) again shows significant deviations, especially during high-traffic periods, indicating its ongoing struggle to model holiday traffic flow accurately.

In summary, all three figures underscore that the proposed DDHFC and DDPHFC models achieve higher forecasting accuracy across the different traffic scenarios, including weekdays, weekends, and holidays. They effectively capture traffic peaks and valleys, proving more reliable than the other models, particularly under the complex and varying traffic conditions present in these timeframes.

4.7.3 Results of 2022 Hourly Negative Direction Dataset

Next, we analyze the comparison tables.
The procedure for selecting the top two linear models and the top two non-linear models is detailed in Appendix B.3.

Table 4.11: The comparison table of four individual forecast models with proposed HFC models (2022 hourly negative direction)

Table 4.12: Weekday

Model      MAPE   RMSE
SVR_L      16.86  158.73
ARIMA_L    17.4   198.8
SVR_NL     16.73  135.87
LSTM.X_NL  38.58  235.05
AVG        18.34  156.03
DDHFC      14.67  132.93
DDPHFC     18.34  156.03

Table 4.13: Weekend

Model        MAPE   RMSE
SVR.X_L      15.52  156.36
ARIMA_L      38.79  413.78
NNETAR.X_NL  11.77  144.8
KNN_NL       12.03  158.4
AVG          14.75  166.79
DDHFC        10.48  127.41
DDPHFC       10.95  141.17

Table 4.14: Holiday

Model        MAPE   RMSE
SVR_L        22.54  227.93
ETS_L        36.6   383.95
SVR_NL       10.66  95.4
NNETAR.X_NL  18.94  164.15
AVG          18.93  189.55
DDHFC        10.66  95.4
DDPHFC       11.8   123.16

Table 4.11 summarizes the comparison of the four individual forecast models with the proposed HFC models across the different day types (weekday, weekend, and holiday), specifically for the 2022 hourly negative direction traffic data.

For the weekday (Table 4.12), the DDHFC model performed best, with a MAPE of 14.67 and an RMSE of 132.93, demonstrating superior accuracy and the lowest error in weekday traffic prediction. In contrast, the LSTM.X_NL model underperformed, with a MAPE as high as 38.58 and an RMSE of 235.05. The AVG and DDPHFC models showed identical performance, both with a MAPE of 18.34 and an RMSE of 156.03, indicating that the DDPHFC model found the optimal weight to be 0.5 for each model in the last stage.

For the weekend (Table 4.13), the DDHFC model again took the lead, with a MAPE of 10.48 and an RMSE of 127.41, making it the most reliable model for weekend traffic prediction. Conversely, the ARIMA_L model had significantly higher errors, with a MAPE of 38.79 and an RMSE of 413.78, revealing its shortcomings in weekend traffic forecasting.

For the holiday (Table 4.14), both the DDHFC and SVR_NL models performed consistently and excellently, with a MAPE of 10.66 and an RMSE of 95.4.
This indicates that the DDHFC model assigned the full weight of 1 to the SVR_NL model and to the non-linear combined model in the second and last stages, respectively, showcasing its strong accuracy in holiday traffic prediction. In contrast, the ETS_L model performed worst, with a MAPE of 36.6 and an RMSE of 383.95, while the DDPHFC model showed good performance with a MAPE of 11.8 and an RMSE of 123.16.

Overall, the results across all day types consistently highlight the effectiveness of our proposed model (DDHFC) in providing accurate and reliable traffic forecasts, outperforming the other models in most scenarios.

The figures below display the forecast results of the proposed HFC models and the four individual models for the 2022 negative direction traffic data on weekdays, weekends, and holidays.

In the weekday forecast, the DDHFC model demonstrates the best performance, with a MAPE of 14.67 and an RMSE of 132.93, as shown in Table 4.12.

Figure 4.18: Forecast plots for proposed models and individual models (2022 Hourly Negative Direction - weekday)

Figure 4.18 illustrates that the DDHFC model (red line) closely aligns with the actual data (black line), particularly excelling in predicting peak and low traffic periods. In contrast, the LSTM.X_NL model (orange line) shows higher MAPE and RMSE values in the table, and in the figure its predictions deviate from the actual data during certain periods (notably underestimating traffic volume after 2022-12-02 15:00:00), indicating its weaker performance.

For the weekend forecast, the DDHFC model continues to stand out, with a MAPE of 10.48 and an RMSE of 127.41 in Table 4.13. Figure 4.19 shows that the DDHFC model's predictions closely match the actual data, especially during peak traffic periods.
In contrast, the ARIMA_L model (dark green line) performs the worst, displaying the highest MAPE and RMSE values in the table, and its prediction curve noticeably deviates from the actual data during peak periods, reflecting significant errors.

Figure 4.19: Forecast plots for proposed models and individual models (2022 Hourly Negative Direction-weekend)

In the holiday traffic forecast, both the DDHFC and SVR_NL models show the best performance, with a MAPE of 10.66 and an RMSE of 95.4 in Table 4.14. Figure 4.20 demonstrates that the prediction curves of these two models are very close to the actual data, particularly during peak traffic periods, effectively capturing traffic flow variations. In contrast, the ETS_L model (dark green line) performs the worst, with a MAPE of 36.6 and an RMSE of 383.95 in the table, and the figure shows that its prediction curve significantly diverges from the actual data at multiple times. Overall, the DDHFC model performs excellently across weekday, weekend, and holiday forecasts, providing accurate and consistent predictions. In comparison, traditional models such as ARIMA_L and ETS_L clearly underperform relative to the DDHFC model.

Figure 4.20: Forecast plots for proposed models and individual models (2022 Hourly Negative Direction-holiday)

4.7.4 Results of 2017-2022 Daily Dataset

For the 2017-2022 Daily Traffic and Climate Dataset, since the traffic volume is recorded daily, we only consider the weekday type. First, we analyze the results for the weekday day type; the procedure for selecting the top two linear models and the top two non-linear models is detailed in Appendix B.4. We then used these four models to construct the HFC model. The comparison results of the four individual forecast models, along with the proposed HFC models, are shown in Table 4.15.
Table 4.15: The comparison of four individual forecast models with the proposed HFC models (2017-2022 daily-weekday)

Model     MAPE  RMSE
SVR.X_L   2.92  2039.42
ARIMA_L   6.22  3479.72
SVR.X_NL  2.36  1623.65
LSTM_NL   4.21  2686.09
AVG       3.45  1986.36
DDHFC     2.3   1529.49
DDPHFC    2.85  1740.94

From Table 4.15, it is evident that there are significant differences in the performance of the various models in predicting daily weekday traffic from 2017 to 2022. The DDHFC model stands out with a MAPE of 2.3 and an RMSE of 1529.49, demonstrating its superiority in both accuracy and error control. The SVR.X_NL model is a close second, with a MAPE of 2.36 and an RMSE of 1623.65, indicating high precision as well. In contrast, the ARIMA_L model performs the worst, with a MAPE of 6.22 and a high RMSE of 3479.72, highlighting its large prediction errors and lower accuracy. Additionally, the LSTM_NL and SVR.X_L models exhibit relatively high RMSE values of 2686.09 and 2039.42, respectively, suggesting that their predictive performance for long-term daily traffic data is not as strong as that of the DDHFC and SVR.X_NL models. These results indicate that the DDHFC model provides the most stable and accurate predictions when dealing with long-term traffic data, outperforming the other individual models. As seen in Figure 4.21, the DDHFC model (red line) closely aligns with the actual data (black line) when predicting daily weekday traffic flow from 2017 to 2022, particularly excelling during peak and low traffic periods, demonstrating high predictive accuracy. This is further supported by Table 4.15, where the DDHFC model shows the best MAPE and RMSE values, 2.3 and 1529.49 respectively, validating the superior performance illustrated in the figure. In contrast, the ARIMA_L model (dark green line) performs the worst according to the table, with a MAPE of 6.22 and an RMSE as high as 3479.72.
The figure shows that the ARIMA_L model's prediction curve significantly deviates from the actual data in several periods, particularly during sharp declines and increases in traffic flow, highlighting its shortcomings in predictive capability.

Figure 4.21: Forecast plot for proposed HFC model and four individual models (2017-2022 Daily-weekday)

The SVR.X_NL model (blue line) and the DDPHFC model (yellow line) also perform well in the figure, closely matching the actual data, especially during periods of significant fluctuation. According to Table 4.15, these models have MAPEs of 2.36 and 2.85 and RMSEs of 1623.65 and 1740.94, respectively, indicating their strong performance in handling long-term traffic data. Overall, the combined analysis of Figure 4.21 and Table 4.15 shows that our proposed model (DDHFC) offers remarkable accuracy and stability in predicting long-term weekday traffic data, outperforming the other individual models. In contrast, the ARIMA_L model exhibits the poorest performance, with notably large prediction errors. It is noteworthy that the RMSE values are relatively high for all models, indicating sizeable absolute errors. One possible reason is the impact of the COVID-19 pandemic in 2020, which caused a sharp decline in traffic volume and disrupted existing patterns; since data from that period were still included when training the models, predictions of future traffic volumes may have been less accurate.

Chapter 5

Conclusions

This project focused on predicting traffic flow on the Lions Gate Bridge, a vital infrastructure link between downtown Vancouver and the northern regions. The study addressed the pressing issue of traffic congestion that affects thousands of commuters and tourists daily. We developed a novel hybrid traffic model that combines the strengths of both linear and non-linear approaches to provide more accurate and reliable traffic forecasts.
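The two-stage structure described in this project (combining the two selected linear forecasts, the two selected non-linear forecasts, and then the two composites) can be sketched as follows. This is an illustrative re-implementation in Python, although the project itself was carried out in R; a simple MAPE-minimizing grid search stands in for the FESS weight-estimation step, and all function names here are ours, not code from the thesis.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 100.0 * float(np.mean(np.abs((actual - forecast) / actual)))

def rmse(actual, forecast):
    """Root mean squared error."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def best_weight(actual, f1, f2, grid=np.linspace(0.0, 1.0, 101)):
    """Grid-search the weight w minimizing MAPE of w*f1 + (1-w)*f2
    (a stand-in for the FESS weight-estimation procedure)."""
    errors = [mape(actual, w * f1 + (1 - w) * f2) for w in grid]
    return float(grid[int(np.argmin(errors))])

def hierarchical_combination(actual, lin1, lin2, nl1, nl2):
    """Stage 1: combine the two linear and the two non-linear forecasts.
    Stage 2: combine the linear and non-linear composites."""
    lin1, lin2, nl1, nl2 = (np.asarray(f, float) for f in (lin1, lin2, nl1, nl2))
    w_lin = best_weight(actual, lin1, lin2)
    w_nl = best_weight(actual, nl1, nl2)
    f_lin = w_lin * lin1 + (1 - w_lin) * lin2   # linear composite
    f_nl = w_nl * nl1 + (1 - w_nl) * nl2        # non-linear composite
    w_top = best_weight(actual, f_lin, f_nl)
    return w_top * f_lin + (1 - w_top) * f_nl
```

When one component dominates, as in the holiday results where the weight collapsed to 1 on SVR_NL, the combined forecast reduces to that single model; when the two components are equally good, the search settles near 0.5, as with the weekday DDPHFC result.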
Our findings highlighted the effectiveness of the proposed hybrid model, demonstrating its superiority over traditional forecasting methods in predicting traffic flow on the Lions Gate Bridge. By integrating multiple predictive techniques, the model significantly enhances the accuracy and reliability of traffic predictions, providing urban planners with actionable insights for more effective traffic management and improved transportation network efficiency. This study contributes to the field of ITS by supporting sustainable urban development and reducing the environmental impact of traffic congestion. However, the current model is limited by its reliance on only the two best-performing linear and two best-performing non-linear models, which may constrain its adaptability and overall predictive power.

For future work, there are several avenues to explore. Firstly, the model can be extended by not limiting the selection to just two linear and two non-linear models. Incorporating a larger set of models could further enhance the robustness and accuracy of the predictions, allowing the hybrid model to better capture the complexities of traffic flow. Secondly, the current method of estimating the weights in the model is based on the FESS. While effective, other optimization techniques, such as genetic algorithms or machine learning approaches, could potentially yield better results. These methods could be explored to refine the weight-estimation process, leading to further improvements in model performance.

Overall, this project has laid a solid foundation for future research and development in traffic flow forecasting. The proposed hybrid model has shown promise, and with further enhancements, it could serve as a powerful tool in managing urban traffic congestion, not only on the Lions Gate Bridge but in other urban centers facing similar challenges.

Bibliography

Keras LSTM neural networks for univariate time-series in R.
URL https://rpubs.com/pawel-wieczynski/891765.

TensorFlow for R. URL https://tensorflow.rstudio.com/install/.

Andrawis, Atiya, and El-Shishiny. Combination of long term and short term forecasts, with application to tourism demand forecasting. International Journal of Forecasting, 27:870-886, 2011. doi: 10.1016/j.ijforecast.2010.05.019. URL https://www.sciencedirect.com/science/article/pii/S0169207010001147.

Association of Consulting Engineering Companies British Columbia. Lions Gate Bridge reversible lane control system rehabilitation. https://acecbcawards.com/2023-awards/2023-soft-engineering/lions-gate-bridge-reversible-lane-control-system-rehabilitation/#one.

Ata, Khan, Abbas, Khan, and Ahmad. Adaptive IoT empowered smart road traffic congestion control system using supervised machine learning algorithm. The Computer Journal, 64:1672-1679, 2020. doi: 10.1093/comjnl/bxz129. URL https://academic.oup.com/comjnl/article/64/11/1672/5838271.

Shekhar Babar. Time series missing value imputation. May 2023. URL https://medium.com/@shekhar.babar/time-series-missing-value-imputation-fa51a7b1ac49.

Charalampos Bratsas, Kleanthis Koupidis, Josep-Maria Salanova, Konstantinos Giannakopoulos, Aristeidis Kaloudis, and Georgia Aifadopoulou. A comparison of machine learning methods for the prediction of traffic speed in urban places. Sustainability, 12(1):142, 2019. doi: 10.3390/su12010142. URL https://www.mdpi.com/2071-1050/12/1/142.

Environment and Climate Change Canada. Climate data extraction tool. https://climate-change.canada.ca/climate-data/#/daily-climate-data.

Chan, Dillon, Singh, and Chang. Traffic flow forecasting neural networks based on exponential smoothing method. 2011 6th IEEE Conference on Industrial Electronics and Applications, 2011a. doi: 10.1109/ICIEA.2011.5975612. URL https://ieeexplore.ieee.org/abstract/document/5975612.

Chan, Dillon, Singh, and Chang.
Neural-network-based models for short-term traffic flow forecasting using a hybrid exponential smoothing and Levenberg-Marquardt algorithm. IEEE Transactions on Intelligent Transportation Systems, 13:644-654, 2011b. doi: 10.1109/TITS.2011.2174051. URL https://ieeexplore.ieee.org/abstract/document/6088012.

Chung Chen and Lon-Mu Liu. Joint estimation of model parameters and outlier effects in time series. Journal of the American Statistical Association, 88(421):284, 1993. doi: 10.2307/2290724. URL https://www.tandfonline.com/doi/abs/10.1080/01621459.1993.10594321.

City of Vancouver. Stanley Park. https://vancouver.ca/parks-recreation-culture/stanley-park.aspx.

Shengdong Du, Tianrui Li, Xun Gong, Yan Yang, and Shi Jinn Horng. Traffic flow forecasting based on hybrid deep learning framework. IEEE Conference Publication, IEEE Xplore, November 2017. URL https://ieeexplore.ieee.org/abstract/document/8258813.

Eddelbuettel, Nguyen, and Leitch. R interface to the 'QuantLib' library. https://cran.r-project.org/web/packages/RQuantLib/RQuantLib.pdf, July 2024.

Francisco Martinez, Maria P. Frias, Francisco Charte, and Antonio J. Rivera. Time series forecasting with KNN in R: the tsfknn package. December 2023. URL https://cran.r-project.org/web/packages/tsfknn/vignettes/tsfknn.html.

Rui Fu, Zuo Zhang, and Li Li. Using LSTM and GRU neural network methods for traffic flow prediction. IEEE Conference Publication, IEEE Xplore, November 2016. URL https://ieeexplore.ieee.org/document/7804912.

Historical Data. Climate historical data website. https://climate.weather.gc.ca/historical_data/search_historic_data_e.html?
hlyRange=1976-01-20%7C2024-02-18&dlyRange=1925-11-01%7C2024-02-18&mlyRange=1925-01-01%7C2007-02-01&urlExtension=_e.html&searchType=stnProx&optLimit=specDate&Month=1&Day=18&StartYear=1840&EndYear=2024&Year=2022&selRowPerPage=25&Line=0&txtRadius=25&optProxType=navLink&txtLatDecDeg=49.314783333333&txtLongDecDeg=-123.11527777778&timeframe=2.

Qinzhong Hou, Junqiang Leng, Guosheng Ma, Weiyi Liu, and Yuxing Cheng. An adaptive hybrid model for short-term urban traffic flow prediction. Physica A, 527:121065, 2019. doi: 10.1016/j.physa.2019.121065. URL https://www.sciencedirect.com/science/article/pii/S0378437119306508.

Rob J Hyndman and George Athanasopoulos. Forecasting: principles and practice, 3rd edition. 2021. URL https://otexts.com/fpp3/.

INRIX. INRIX 2023 global traffic scorecard. https://inrix.com/scorecard/.

Danqing Kang, Yisheng Lv, and Yuan-yuan Chen. Short-term traffic flow prediction with LSTM recurrent neural network. IEEE Conference Publication, IEEE Xplore, October 2017. URL https://ieeexplore.ieee.org/document/8317872.

Hariharan Kumar. Time series traffic flow prediction with hyper-parameter optimized ARIMA models for intelligent transportation system. Journal of Scientific & Industrial Research, 81:408-415, 2022. doi: 10.56042/jsir.v81i04.50791. URL https://www.semanticscholar.org/paper/Time-Series-Traffic-Flow-Prediction-with-Optimized-Kumar-Hariharan/29a25255f6aa9127a227b6655ce4199a75ead881.

Ibai Laña, Javier J. Sanchez-Medina, Eleni I. Vlahogianni, and Javier Del Ser. From data to actions in intelligent transportation systems: A prescription of functional requirements for model actionability. Sensors (Basel), 21(4):1121, 2021. doi: 10.3390/s21041121. URL https://doi.org/10.3390/s21041121.

Marco Lippi, Matteo Bertini, and Paolo Frasconi. Short-term traffic flow forecasting: An experimental comparison of time-series analysis and supervised learning.
URL https://ieeexplore.ieee.org/abstract/document/6482260, June 2013.

Ma, Tian, and Xu. Short-term traffic flow prediction based on genetic artificial neural network and exponential smoothing. Promet - Traffic & Transportation, 32:747-760, 2020. doi: 10.7307/ptt.v32i6.3360. URL https://hrcak.srce.hr/253142.

OfficeHolidays. Statutory holidays in Canada in 2022. https://www.officeholidays.com/countries/canada/2022.

R: Seasonally Decomposed Missing Value Imputation. https://search.r-project.org/CRAN/refmans/imputeTS/html/na_seadec.html.

Hyuk-Jae Roh, Furqan A. Bhat, Prasanta K. Sahu, Ata M. Khan, Orlando Rodriguez, Satish Sharma, and Babak Mehran. Appraisal of temporal transferability of cold region winter weather traffic models for major highway segments in Alberta, Canada. Geosciences, 9(3):137, 2019. doi: 10.3390/geosciences9030137. URL https://doi.org/10.3390/geosciences9030137.

Sadaf Saleem. Neural networks in 10 mins. Simply explained! May 2023. URL https://medium.com/@sadafsaleem5815/neural-networks-in-10mins-simply-explained-9ec2ad9ea815.

Laurent L. Santos and Francisco S. Castillo. Introduction to spatial network forecast with R. 2019. URL https://laurentlsantos.github.io/forecasting/support-vector-regression.html.

Service Canada. Weather, climate and hazards. https://www.canada.ca/en/services/environment/weather.html, May 2023.

Josh Starmer. Long short-term memory (LSTM), clearly explained. https://www.youtube.com/watch?v=YCzL96nL7j0, November 2022a.

Josh Starmer. Recurrent neural networks (RNNs), clearly explained!!! https://www.youtube.com/watch?v=AsNTP8Kwu80, July 2022b.

Tempelmeier, Dietze, and Demidova. Crosstown traffic - supervised prediction of impact of planned special events on urban traffic. GeoInformatica, 24:339-370, 2019. doi: 10.1007/s10707-019-00366-x. URL https://link.
springer.com/article/10.1007/s10707-019-00366-x.

Traffic Data Program. https://www.th.gov.bc.ca/trafficdata/.

Wang, Hyndman, Li, and Kang. Forecast combinations: An over 50-year review. International Journal of Forecasting, 39:1518-1547, 2023. doi: 10.1016/j.ijforecast.2022.11.005. URL https://www.sciencedirect.com/science/article/pii/S0169207022001480.

WSP Canada Group Limited. Traffic reports user documentation. https://www.th.gov.bc.ca/trafficData/documents/TrafficReportsUserDocumentation_2019May16.pdf, 2019.

WSP Global. Intelligent transportation systems. https://www.wsp.com/en-ca/services/intelligent-transportation-systems-its.

Shengjian Zhao, Shu Lin, and Jungang Xu. Time series traffic prediction via hybrid neural networks. IEEE Conference Publication, IEEE Xplore, October 2019. URL https://ieeexplore.ieee.org/abstract/document/8917383.

Juncheng Zhu. Electric vehicle charging load forecasting: A comparative study of deep learning approaches. https://www.mdpi.com/1996-1073/12/14/2692, July 2019.

Richard Zussman. B.C. declares state of emergency in response to coronavirus pandemic. March 2020.

Appendix A

Climate Data Collection

This section outlines the process of collecting traffic and climate data to create two datasets: the 2022 Hourly Traffic and Climate Dataset and the 2017-2022 Daily Traffic and Climate Dataset.

A.1 Hourly Climate Data (2022)

After examining the first tsibble table, namely temp.hourly, containing stations with climate IDs 1106200, 1108446, 1108395, 1108380, and 1108824, and visualizing its data in Fig. A.1, we note similar weather patterns among these stations. Further investigation reveals that missing data for these stations are not significant, as indicated in Table A.1.

Figure A.1: Comparison plot of 2022 hourly temperature data for different stations

Table A.1: Missing data for 2022 hourly temperature data (8760 rows)
Site No.  Climate ID  Climate Station Name      Station GPS (DD)    Distance from Lions Gate Bridge  Missing Temperature Data
1         1106200     POINT ATKINSON            49.3304, -123.2647  9.3143 km                        11
2         1108446     VANCOUVER HARBOUR CS      49.2954, -123.1219  16.402 km                        12
3         1108395     VANCOUVER INTL A          49.1947, -123.1839  13.8134 km                       4
4         1108380     VANCOUVER SEA ISLAND CCG  49.1825, -123.1872  15.1886 km                       12
5         1108824     WEST VANCOUVER AUT        49.3470, -123.1933  5.322 km                         16

The second tsibble table, precip.hourly, includes the stations meeting our criteria with climate IDs 1108446, 1108380, and 1108824. We plot this table in Fig. A.2, revealing the same pattern in the precipitation feature among these stations. Subsequently, missing data are checked, and the results displayed in Table A.2 indicate that missing data for these stations are not severe.

Figure A.2: Comparison plot of 2022 hourly precipitation data for different stations

Table A.2: Missing data for 2022 hourly precipitation data (8760 rows)

Site No.  Climate ID  Climate Station Name      Station GPS (DD)    Distance from Lions Gate Bridge  Missing Precipitation Data
1         1108446     VANCOUVER HARBOUR CS      49.2954, -123.1219  16.402 km                        12
2         1108380     VANCOUVER SEA ISLAND CCG  49.1825, -123.1872  15.1886 km                       12
3         1108824     WEST VANCOUVER AUT        49.3470, -123.1933  5.322 km                         16

Lastly, the third table, ws.hourly, encompasses stations with climate IDs 1106200, 1108395, 1108380, and 1108824. Despite variations in wind speed, particularly with station 1106200 exhibiting higher speeds than station 1108824, visual inspection in Fig. A.3 reveals no distinct pattern. Moreover, analysis of missing data in Table A.3 affirms that the quantity of missing data for these stations is inconsequential and does not significantly affect the dataset.

Figure A.3: Comparison plot of 2022 hourly wind speed data for different stations

Table A.3: Missing data for 2022 hourly wind speed data (8760 rows)
Site No.  Climate ID  Climate Station Name      Station GPS (DD)    Distance from Lions Gate Bridge  Missing Wind Speed Data
1         1106200     POINT ATKINSON            49.3304, -123.2647  9.3143 km                        11
2         1108395     VANCOUVER INTL A          49.1947, -123.1839  13.8134 km                       4
3         1108380     VANCOUVER SEA ISLAND CCG  49.1825, -123.1872  15.1886 km                       12
4         1108824     WEST VANCOUVER AUT        49.3470, -123.1933  5.322 km                         143

Next, we identify the stations that possess all three features, namely 1108380 and 1108824. Inspection of Table A.4 reveals that missing data for these two stations are minimal.

Table A.4: Missing data for 2022 hourly climate data (8760 rows)

Site No.  Climate ID  Climate Station Name      Distance from Lions Gate Bridge  Missing temp Data  Missing precip Data  Missing wind speed Data  Missing Weather Data (either temp, precip, or ws)  Missing Weather Data (all of temp, precip, and ws)
1         1108380     VANCOUVER SEA ISLAND CCG  15.1886 km                       12                 12                   12                       12                                                 12
2         1108824     WEST VANCOUVER AUT        5.322 km                         16                 16                   143                      143                                                16

Including all features from both stations as input for the models could result in data redundancy and homogeneity. Therefore, we opt to calculate the mean value of each feature across the two stations to represent the values of the different features. Where missing data exist in the combined dataset, we handle them by ignoring the NA values and calculating the average based only on the non-NA values in each row. Subsequently, we proceeded to plot each feature separately. From Fig. A.4, it is apparent that the hourly temperature in 2022 reached its peak around August and its lowest point around December.

Figure A.4: Plot of 2022 hourly temperature average data

From Fig. A.5, we observe relatively low precipitation from July to October 2022. However, due to having only one year of data, there is not enough evidence to make reasonable inferences.
It might be worthwhile to analyze the 2017-2022 Daily Precipitation Dataset in the future for further insights.

Figure A.5: Plot of 2022 hourly precipitation average data

Fig. A.6 presents the hourly wind speed data for 2022; however, no significant information can be gleaned from it.

Figure A.6: Plot of 2022 hourly wind speed average data

Finally, integrating the 2022 Hourly Climate Dataset with the 2022 Hourly Traffic Dataset yields the 2022 Hourly Traffic and Climate Dataset.

A.2 Daily Climate Data (2017-2022)

The first tsibble table is temp.daily, with 7 stations meeting its criteria: 1105658, 1106200, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Fig. A.7 indicates that these stations follow the same pattern.

Figure A.7: Comparison plot of 2017-2022 daily temperature data

The second tsibble table, named precip.daily, comprises stations with the IDs 1105658, 1105669, 1106200, 1106764, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Upon plotting the data, Fig. A.8 shows that these stations exhibit a similar pattern.

Figure A.8: Comparison plot of 2017-2022 daily precipitation data

Subsequently, we identify the stations that have both features: 1105658, 1106200, 1106PF7, 1108446, 1108395, 1108380, and 1108824. Upon inspection of the missing-value Table A.5, it becomes evident that there is a substantial amount of missing data for stations 1105658 and 1106200, necessitating their removal.

Table A.5: Missing data for 2017-2022 daily climate data (2191 rows)
Site No.  Climate ID  Climate Station Name      Station GPS (DD)    Distance from Lions Gate Bridge  Missing temp Data (Days)  Missing precip Data (Days)  Missing Weather Data (Both temp and precip)  Missing Weather Data (Either temp or precip)
1         1105658     N VANC GROUSE MTN RESORT  49.3811, -123.0783  8.5159 km                        633                       601                         601                                          633
2         1106200     POINT ATKINSON            49.3304, -123.2647  9.3143 km                        72                        833                         72                                           833
3         1106PF7     RICHMOND NATURE PARK      49.1708, -123.0931  16.402 km                        201                       201                         201                                          201
4         1108446     VAN HARBOUR CS            49.2954, -123.1219  2.5117 km                        32                        155                         32                                           155
5         1108395     VAN INTL A                49.1947, -123.1839  13.8134 km                       20                        6                           3                                            23
6         1108380     VAN SEA ISLAND CCG        49.1825, -123.1872  15.1886 km                       39                        139                         39                                           139
7         1108824     WEST VANCOUVER AUT        49.3470, -123.1933  5.322 km                         37                        317                         37                                           317

Similar to the processing of the hourly climate data, we calculate the average of each feature across the remaining five stations to represent the values of the different features and reduce dataset homogeneity. After merging the data from the five stations (1106PF7, 1108446, 1108395, 1108380, and 1108824) and computing the average values, we discovered that the combined dataset still contained some missing data. To address missing values in the temperature feature, we employed the built-in function "na_seadec()" for missing-value imputation. However, when attempting to handle missing data in the precipitation feature using this method, we encountered negative values, which are inconsistent with the nature of precipitation, as it must be non-negative. Therefore, we explored an alternative method for imputing missing data, namely linear interpolation, available in the "zoo" library. This technique, rooted in numerical analysis, estimates unknown values by assuming a linear relationship within the range of the data points; to estimate a missing value, it examines the past and future data surrounding it (Babar [2023]). With this approach, no negative values emerged, hence we adopted it as the preferred method.
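The two preprocessing steps just described (row-wise averaging that ignores NA values, followed by linear interpolation of any remaining gaps) can be sketched as follows. This is an illustrative Python/pandas version of the idea; the project itself used R's imputeTS and zoo packages, and the station columns and readings below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly precipitation readings (mm) from two stations, gaps as NaN.
precip = pd.DataFrame({
    "stn_1108380": [0.0, 1.2, np.nan, 0.4, 0.0],
    "stn_1108824": [0.2, np.nan, np.nan, 0.6, 0.1],
})

# Step 1: row-wise mean across stations; skipna=True averages only the
# non-missing readings in each row, so a single missing station is ignored.
avg = precip.mean(axis=1, skipna=True)

# Step 2: rows where every station was missing are still NaN; estimate them
# by linear interpolation between the surrounding values, mirroring
# zoo-style interpolation used for the precipitation series.
avg = avg.interpolate(method="linear")

# Interpolating between non-negative values can never produce a negative
# estimate, unlike the seasonal-decomposition imputation that was rejected.
assert (avg >= 0).all()
```

In this toy series the hour where both stations are missing is filled with the midpoint of its neighbours, and the non-negativity of the result holds by construction.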
Upon completing the handling of missing data, we obtained two new tables: final.temp.daily.2017_2022 and final.precip.daily.2017_2022. Subsequently, we proceeded to plot them. From Fig. A.9, it is evident that the daily temperature data exhibit peak values around August each year, consistent with the results obtained from the 2022 hourly temperature dataset (Fig. A.4). Additionally, we can see a seasonal pattern at the yearly level.

Figure A.9: Plot of 2017-2022 Daily temperature average data

From Fig. A.10, it is evident that the 2017-2022 daily precipitation data show lower precipitation levels around June each year compared to other months. This observation suggests a potential seasonal pattern at the yearly level.

Figure A.10: Plot of 2017-2022 Daily precipitation average data

Finally, integrating the 2017-2022 Daily Climate Dataset with the 2017-2022 Daily Traffic Dataset yields the 2017-2022 Daily Traffic and Climate Dataset.

Appendix B

Linear and Non-linear Model Selection

B.1 2022 Hourly Both Directions Dataset

We will analyze the results for the day types corresponding to weekend and holiday, and select the top two linear models and the top two non-linear models based on the MAPE value. The tables below present the results of the linear and non-linear models, ordered by MAPE value.

B.1.1 Weekend

Table B.1: Linear Model Results

Model    MAPE
SVR.X_L  10.44
SVR_L    10.85
ARIMA_L  28.32
DReg_L   32.07
ETS_L    53.58

Table B.2: Non-Linear Model Results

Model        MAPE
SVR_NL       8.56
KNN_NL       11.38
NNETAR_NL    12.52
NNETAR.X_NL  12.55
SVR.X_NL     13.55
LSTM_NL      20.86
LSTM.X_NL    27.9

Based on Table B.1, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.2, we selected the top two models: SVR_NL and KNN_NL.

B.1.2 Holiday

Based on Table B.3, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.4, we selected the top two models: SVR_NL and NNETAR.X_NL.
Table B.3: Linear Model Results

Model    MAPE
SVR.X_L  12.49
SVR_L    12.81
ARIMA_L  29.4
DReg_L   32.19
ETS_L    55.85

Table B.4: Non-Linear Model Results

Model        MAPE
SVR_NL       8.65
NNETAR.X_NL  9.21
NNETAR_NL    11.77
KNN_NL       14.13
SVR.X_NL     16.86
LSTM.X_NL    18.77
LSTM_NL      25.32

B.2 2022 Hourly Positive Direction Dataset

We will analyze the results for the day types corresponding to weekday, weekend, and holiday, and select the top two linear models and the top two non-linear models based on the MAPE value. The tables below present the results of the linear and non-linear models, ordered by MAPE value.

B.2.1 Weekday

Table B.5: Linear Model Results

Model    MAPE
SVR_L    15.12
ARIMA_L  17.3
SVR.X_L  17.37
DReg_L   18.08
ETS_L    26.46

Table B.6: Non-Linear Model Results

Model        MAPE
NNETAR_NL    14.6
SVR.X_NL     15.77
NNETAR.X_NL  16.5
SVR_NL       16.97
LSTM.X_NL    33.88
LSTM_NL      40.27
KNN_NL       41.4

Based on Table B.5, we selected the top two models: SVR_L and ARIMA_L. Similarly, based on Table B.6, we selected the top two models: NNETAR_NL and SVR.X_NL.

B.2.2 Weekend

Table B.7: Linear Model Results

Model    MAPE
SVR.X_L  11.07
SVR_L    11.46
ARIMA_L  24.82
DReg_L   26.9
ETS_L    59.7

Table B.8: Non-Linear Model Results

Model        MAPE
SVR_NL       9.54
NNETAR_NL    11.93
KNN_NL       12.11
SVR.X_NL     14.26
NNETAR.X_NL  14.82
LSTM_NL      21.39
LSTM.X_NL    22.83

Based on Table B.7, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.8, we selected the top two models: SVR_NL and NNETAR_NL.

B.2.3 Holiday
The tables below present the results of the linear and non-linear models, ordered by MAPE value.

Table B.9: Linear Model Results

Model    MAPE
SVR_L    11
SVR.X_L  12.5
ARIMA_L  20.43
DReg_L   23.81
ETS_L    45.01

Table B.10: Non-Linear Model Results

Model        MAPE
NNETAR.X_NL  9.63
SVR_NL       10.63
NNETAR_NL    11.29
SVR.X_NL     11.66
KNN_NL       17.29
LSTM_NL      24.49
LSTM.X_NL    26.56

Based on Table B.9, we selected the top two models: SVR_L and ARIMA_L. Similarly, based on Table B.10, we selected the top two models: NNETAR.X_NL and SVR_NL.

B.3 2022 Hourly Negative Direction Dataset

We will analyze the results for the day types corresponding to weekday, weekend, and holiday, and select the top two linear models and the top two non-linear models based on the MAPE value. The tables below present the results of the linear and non-linear models, ordered by MAPE value.

B.3.1 Weekday

Table B.11: Linear Model Results

Model    MAPE
SVR_L    16.86
ARIMA_L  17.4
DReg_L   17.87
ETS_L    18.24
SVR.X_L  23.05

Table B.12: Non-Linear Model Results

Model        MAPE
SVR_NL       16.73
SVR.X_NL     17.51
LSTM.X_NL    38.58
KNN_NL       40.18
LSTM_NL      43.26
NNETAR.X_NL  69.71
NNETAR_NL    77.34

Based on Table B.11, we selected the top two models: SVR_L and ARIMA_L. Similarly, based on Table B.12, we selected the top two models: SVR_NL and LSTM.X_NL.

B.3.2 Weekend

Based on Table B.13, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.14, we selected the top two models: NNETAR.X_NL and KNN_NL.
Table B.13: Linear Model Results

Model    MAPE
SVR.X_L  15.52
SVR_L    15.83
ARIMA_L  38.79
DReg_L   45.65
ETS_L    47.79

Table B.14: Non-Linear Model Results

Model        MAPE
NNETAR.X_NL  11.77
KNN_NL       12.03
NNETAR_NL    12.58
SVR.X_NL     12.9
SVR_NL       15.95
LSTM.X_NL    27.6
LSTM_NL      36.71

B.3.3 Holiday

Table B.15: Linear Model Results

Model    MAPE
SVR_L    22.54
SVR.X_L  23.24
ETS_L    36.6
ARIMA_L  46.57
DReg_L   51.77

Table B.16: Non-Linear Model Results

Model        MAPE
SVR_NL       10.66
NNETAR.X_NL  18.94
KNN_NL       19.1
LSTM_NL      21.71
NNETAR_NL    22.37
LSTM.X_NL    27.19
SVR.X_NL     27.64

Based on Table B.15, we selected the top two models: SVR_L and ETS_L. Similarly, based on Table B.16, we selected the top two models: SVR_NL and NNETAR.X_NL.

B.4 2017-2022 Daily Dataset

We will analyze the results for the day type corresponding to weekday, and select the top two linear models and the top two non-linear models based on the MAPE value. The tables below present the results of the linear and non-linear models, ordered by MAPE value.

B.4.1 Weekday

Table B.17: Linear Model Results

Model    MAPE
SVR.X_L  2.92
SVR_L    4.85
ARIMA_L  6.22
DReg_L   8.42
ETS_L    11.85

Table B.18: Non-Linear Model Results

Model        MAPE
SVR.X_NL     2.36
LSTM_NL      4.21
SVR_NL       4.38
LSTM.X_NL    4.49
NNETAR_NL    7.85
NNETAR.X_NL  11.59
KNN_NL       15.79

Based on Table B.17, we selected the top two models: SVR.X_L and ARIMA_L. Similarly, based on Table B.18, we selected the top two models: SVR.X_NL and LSTM_NL.
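Across Tables B.1-B.18, the "top two" selections appear to keep at most one variant of each base model: for example, in the weekend linear results SVR.X_L is chosen and the second-ranked SVR_L is skipped in favour of ARIMA_L. The following Python sketch illustrates such a rule; the project itself was implemented in R, and the helper names and the family-extraction logic are our reading of the tables, not code from the thesis.

```python
def family(name):
    """Base model family: strip the _L/_NL subscript and the '.X'
    marker used for the variant with exogenous regressors."""
    return name.split("_")[0].replace(".X", "")

def select_top_two(results):
    """Greedily keep the lowest-MAPE model of each base family
    until two models are chosen. `results` maps model name -> MAPE."""
    chosen = []
    for name in sorted(results, key=results.get):  # ascending MAPE
        if family(name) not in {family(c) for c in chosen}:
            chosen.append(name)
        if len(chosen) == 2:
            break
    return chosen

# Linear candidates for the weekend day type (values from Table B.1).
linear_weekend = {"SVR.X_L": 10.44, "SVR_L": 10.85, "ARIMA_L": 28.32,
                  "DReg_L": 32.07, "ETS_L": 53.58}
```

With these inputs the rule reproduces the Table B.1 selection, picking SVR.X_L first and then skipping SVR_L (same SVR family) in favour of ARIMA_L.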