Clustering and Forecasting Urban Bus Passenger Demand with a Combination of Time Series Models

Abstract

This paper analyzes large data sets from public transport networks, focusing on the prediction of urban bus passenger demand. A series of steps is proposed to make passenger demand easier to understand. First, given the large number of stops in the bus network, the stops are divided into clusters, and different models are then fitted to a representative of each cluster. The aim is to compare and combine the predictions of traditional methods, such as exponential smoothing or ARIMA, with those of machine learning methods, such as support vector machines or artificial neural networks. Moreover, the support vector machine predictions are improved by incorporating explanatory variables with temporal structure and moving averages. Finally, through cointegration techniques, the results obtained for the representative of each cluster are extrapolated to the rest of the series within the same cluster. A case study in the city of Salamanca (Spain) illustrates the approach.

Description

This paper examines large datasets from public transportation networks, specifically addressing the prediction of urban bus passenger demand. The approach consists of a series of steps designed to improve the understanding of passenger demand. First, because the network contains a large number of bus stops, the stops are grouped into clusters, and a separate model is developed for a representative of each cluster. The objective is to compare and combine predictions from conventional methods, such as exponential smoothing or ARIMA, with those from machine learning techniques, such as support vector machines or artificial neural networks. Furthermore, the support vector machine predictions are refined by incorporating explanatory variables with temporal structure and moving averages. Finally, through cointegration techniques, the results obtained for the representative of each cluster are extrapolated to the remaining series within the same cluster. The methods are illustrated with a case study in the city of Salamanca, Spain.
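
The last two steps of the approach (forecast a cluster representative, then carry that forecast over to another series in the same cluster) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the function names `ses_forecast`, `fit_long_run_relation` and `extrapolate` are hypothetical, simple exponential smoothing stands in for the full set of models, and the cointegration step is reduced to the levels regression that forms the first stage of the Engle-Granger procedure, with no stationarity test on the residuals.

```python
import numpy as np

def ses_forecast(series, alpha=0.3, horizon=7):
    """Simple exponential smoothing: flat forecast at the last smoothed level."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return np.full(horizon, level)

def fit_long_run_relation(member, representative):
    """OLS of a cluster member on the representative, in levels.

    Returns (intercept, slope). This is only the first stage of an
    Engle-Granger analysis; residual stationarity is not tested here.
    """
    X = np.column_stack([np.ones_like(representative), representative])
    coef, *_ = np.linalg.lstsq(X, member, rcond=None)
    return coef

def extrapolate(rep_forecast, coef):
    """Map the representative's forecast onto the member series."""
    return coef[0] + coef[1] * rep_forecast

# Synthetic daily demand (assumption): the representative follows a random
# walk and the member series is tied to it by a linear long-run relation.
rng = np.random.default_rng(0)
representative = 200 + np.cumsum(rng.normal(0, 5, 120))
member = 20 + 0.5 * representative + rng.normal(0, 2, 120)

rep_forecast = ses_forecast(representative, alpha=0.3, horizon=7)
coef = fit_long_run_relation(member, representative)
member_forecast = extrapolate(rep_forecast, coef)
```

In this sketch only the representative is modelled directly. Each other stop in the cluster needs just two regression coefficients, which is the practical appeal of forecasting one series per cluster.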

Citation

Mariñas-Collado, I., Sipols, A. E., Santos-Martín, M. T., & Frutos-Bernal, E. (2022). Clustering and forecasting urban bus passenger demand with a combination of time series models. Mathematics, 10(15), 2670.
Except where otherwise noted, this item's license is described as Attribution 4.0 International