PREDICTING HOURLY BOARDING DEMAND OF BUS PASSENGERS USING IMBALANCED RECORDS FROM SMART-CARDS: A DEEP LEARNING APPROACH
Keywords:
Deep-GAN, deep generative adversarial nets (Deep-GAN)
Abstract
The tap-on smart-card data provides a valuable source for learning passengers’ boarding
behaviour and predicting future travel demand. However, when the smart-card
records (or instances) are examined by time of day and by boarding stop, the positive instances (i.e.
boarding at a specific bus stop at a specific time) are rare compared to negative instances (not
boarding at that bus stop at that time). Imbalanced data has been demonstrated to
significantly reduce the accuracy of machine-learning models deployed for predicting hourly
boarding numbers from a particular location. This paper addresses this data imbalance issue
in the smart-card data before applying it to predict bus boarding demand. We propose
deep generative adversarial nets (Deep-GAN) to generate dummy travelling instances, which are
added to form a synthetic training dataset with more balanced travelling and non-travelling instances.
The synthetic dataset is then used to train a deep neural network (DNN) for predicting the
travelling and non-travelling instances from a particular stop in a given time window. The
results show that addressing the data imbalance issue can significantly improve the predictive
model’s performance and better fit the actual ridership profile. Comparing the performance of
the Deep-GAN with other traditional resampling methods shows that the proposed method
can produce a synthetic training dataset with a higher similarity and diversity and, thus, a
stronger prediction power. The paper highlights the significance of, and provides practical
guidance on, improving data quality and model performance in travel behaviour
prediction and individual travel behaviour analysis.
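
To make the imbalance problem described in the abstract concrete, the following is a minimal sketch and not the paper's Deep-GAN. It uses plain random oversampling, one of the traditional resampling methods the abstract compares against, as a stand-in for the generator. The function names, the (stop, hour) record layout, and the 5-in-100 boarding rate are hypothetical and chosen only for illustration.

```python
import math
import random


def imbalance_ratio(labels):
    """Return the number of negative instances per positive instance (labels are 0/1)."""
    positives = sum(labels)
    if positives == 0:
        raise ValueError("no positive (boarding) instances in the data")
    return (len(labels) - positives) / positives


def oversample_positives(records, labels, target_ratio=1.0, seed=0):
    """Append copies of randomly chosen positive records until the
    negative/positive ratio is at most target_ratio.

    This is simple random oversampling, used here as a stand-in for a
    generative model such as Deep-GAN, which would synthesise new records
    instead of duplicating existing ones.
    """
    rng = random.Random(seed)
    positives = [r for r, y in zip(records, labels) if y == 1]
    if not positives:
        raise ValueError("no positive (boarding) instances to resample")
    n_negatives = len(labels) - len(positives)
    n_needed = max(0, math.ceil(n_negatives / target_ratio) - len(positives))
    extra = [rng.choice(positives) for _ in range(n_needed)]
    return records + extra, labels + [1] * n_needed


# Hypothetical toy data: one record per (stop, hour) slot for one passenger,
# labelled 1 if the passenger boarded in that slot and 0 otherwise.
records = [("stop_A", i % 24) for i in range(100)]
labels = [1 if i < 5 else 0 for i in range(100)]

balanced_records, balanced_labels = oversample_positives(records, labels)
print(imbalance_ratio(labels), imbalance_ratio(balanced_labels))
```

In this toy set the original data has 19 non-boarding instances for every boarding instance, and the resampled set is balanced. Duplicating records adds no new variety to the positive class, which is the limitation a generative approach is intended to address.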
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.











