From Traditional to Autonomous Vehicles: A Systematic Review of Data Availability
The increasing accessibility of mobility datasets has enabled research in green mobility, road safety, vehicular automation, and transportation planning and optimization. Many stakeholders have leveraged vehicular datasets to study conventional driving characteristics and self-driving tasks. Notably, many of these datasets have been made publicly available, fostering collaboration, scientific comparability, and replication. Because these datasets span several study domains and have distinctive characteristics, selecting an appropriate dataset for investigating a given driving aspect can be challenging. To the best of the authors’ knowledge, this is the first paper to perform a systematic review of a substantial number of vehicular datasets covering various automation levels. In total, 103 datasets have been reviewed, 35 of which focus on naturalistic driving and 68 on self-driving tasks. The review allows researchers to compare the datasets’ principal characteristics and study domains. Most naturalistic datasets have centered on road safety and driver behavior, although transportation planning and eco-driving have also been studied. Furthermore, datasets for autonomous driving have been analyzed according to their target self-driving tasks. A particular focus has been placed on data-driven risk assessment for the vehicular ecosystem. A lack of relevant publicly available datasets is observed, which hinders the creation of new risk assessment models for semi- and fully automated vehicles. Therefore, this paper conducts a gap analysis to identify both possible approaches using existing datasets and a set of relevant vehicular data fields that could be incorporated into future data collection campaigns to address this challenge.