A feature learning approach based on XGBoost for driving assessment and risk prediction
Published in | Accident analysis and prevention Vol. 129; pp. 170 - 179 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | England, Elsevier Ltd, 01.08.2019 |
Subjects | |
Summary: | • A method is designed to extract driving behaviour features and predict risk levels. • A large number of driving behaviour features are extracted from real vehicle trajectory data. • Key features are identified by feature importance ranking and recursive elimination. • XGBoost achieves satisfactory results in behaviour-based crash risk prediction.
This study designs a feature extraction and selection framework for assessing vehicle driving and predicting risk levels. The framework integrates learning-based feature selection, unsupervised risk rating, and imbalanced-data resampling. For each vehicle, about 1300 driving behaviour features are extracted from trajectory data, providing in-depth, multi-view measures of behaviour. To estimate each vehicle's risk potential, an unsupervised data labelling scheme is proposed: based on the extracted risk indicator features, vehicles are clustered into groups labelled with graded risk levels. The safe group is under-sampled to reduce the imbalance between risky and safe classes. The links between behaviour features and the corresponding risk levels are then modelled using XGBoost, and key features are identified through feature importance ranking and recursive elimination. Vehicle risk levels are predicted from the selected key features. In a case study on NGSIM trajectory data, four risk levels are clustered by Fuzzy C-means, 64 key behaviour features are identified, and an overall accuracy of 89% is achieved for behaviour-based risk prediction. The findings show that the approach identifies important features for driving assessment effectively and reliably, and predicts risk levels accurately. |
---|---|
ISSN: | 0001-4575, 1879-2057 |
DOI: | 10.1016/j.aap.2019.05.005 |
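
The unsupervised labelling and under-sampling steps described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fuzzy_c_means`, `graded_risk_labels`, `undersample_class`) are hypothetical, the fuzzifier `m = 2` and the synthetic two-column data are placeholders, and the rule that larger indicator values mean higher risk is an assumption made only to order the clusters. The XGBoost modelling and recursive feature elimination steps are not covered here.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain Fuzzy C-means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance from every sample to every center, floored to avoid 1/0.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def graded_risk_labels(X, n_levels=4, seed=0):
    """Cluster risk indicators and map clusters to ordered levels 0..n_levels-1."""
    centers, U = fuzzy_c_means(X, n_levels, seed=seed)
    hard = U.argmax(axis=1)  # hard assignment from fuzzy memberships
    # Assumption: larger indicator values mean higher risk, so clusters
    # are ranked by the mean of their center coordinates.
    order = np.argsort(centers.mean(axis=1))
    rank = np.empty(n_levels, dtype=int)
    rank[order] = np.arange(n_levels)
    return rank[hard]


def undersample_class(y, target_class, n_keep, seed=0):
    """Indices keeping every sample except a random subset of target_class."""
    rng = np.random.default_rng(seed)
    idx_target = np.flatnonzero(y == target_class)
    idx_other = np.flatnonzero(y != target_class)
    if len(idx_target) > n_keep:
        idx_target = rng.choice(idx_target, size=n_keep, replace=False)
    return np.sort(np.concatenate([idx_other, idx_target]))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Synthetic stand-in for risk indicator features: one large low-valued
    # group and three small higher-valued groups.
    X = np.vstack([rng.normal(loc, 0.3, size=(n, 2))
                   for loc, n in [(0.0, 300), (3.0, 30), (6.0, 30), (9.0, 30)]])
    y = graded_risk_labels(X, n_levels=4)
    keep = undersample_class(y, target_class=0, n_keep=60)
    print("class counts before:", np.bincount(y, minlength=4))
    print("class counts after: ", np.bincount(y[keep], minlength=4))
```

The indices returned by `undersample_class` would then select the rows of the behaviour feature matrix used to train a classifier. Fuzzy C-means can converge to different partitions depending on initialisation, so in practice the clustering would typically be repeated with several seeds.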