Advancing Pavement Distress Detection in Developing Countries: A Novel Deep Learning Approach with Locally-Collected Datasets

¹North Dakota State University, ²Kwame Nkrumah University of Science and Technology, ³SMART Lab

Abstract

Road infrastructure maintenance in developing countries faces unique challenges due to resource constraints and diverse environmental factors. This study addresses the critical need for efficient, accurate, and locally relevant pavement distress detection methods in these regions. We present a novel deep learning approach that combines YOLO (You Only Look Once) object detection models with a Convolutional Block Attention Module (CBAM) to detect and classify multiple pavement distress types simultaneously. The model demonstrates robust performance in detecting and classifying potholes, longitudinal cracks, alligator cracks, and raveling, with confidence scores ranging from 0.46 to 0.93. Some misclassifications occur in complex scenarios, and these cases offer insight into the challenges specific to pavement assessment in developing countries. We also developed a web-based application for real-time distress detection from images and videos. This research advances automated pavement distress detection and provides a tailored solution for developing countries, with the potential to improve road safety, optimize maintenance strategies, and contribute to sustainable transportation infrastructure development.


Inference Demo for our Model


Modified YOLOv5 architecture

Overall YOLOv5 architecture with the modified Cross-Stage Partial Network block (C3 block). The dotted red boxes mark the areas where the architecture was modified.

Modified C3 block. We replace the first convolutional layer in the C3 block with a Convolutional Block Attention Module (CBAM). The CBAM output passes through the Bottleneck layers and is then concatenated with the output of two additional convolutional layers.

CBAM architecture showing the channel attention and spatial attention modules.
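
The modified C3 block and the CBAM structure described in the captions above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the authors' implementation: the class names (`ChannelAttention`, `SpatialAttention`, `CBAM`, `C3CBAM`), the reduction ratio of 16, the 7x7 spatial kernel, and the exact wiring of the parallel and fusing convolutions are all assumptions chosen to match the common CBAM and YOLOv5 C3 layouts. The figure may place the convolutions differently.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Reweights channels using a shared MLP over avg- and max-pooled descriptors."""

    def __init__(self, channels: int, reduction: int = 16):  # reduction is an assumed default
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.ReLU(),
            nn.Conv2d(hidden, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return x * torch.sigmoid(avg + mx)


class SpatialAttention(nn.Module):
    """Reweights spatial positions using channel-wise avg and max maps."""

    def __init__(self, kernel_size: int = 7):  # kernel size is an assumed default
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx = torch.amax(x, dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class CBAM(nn.Module):
    """Channel attention followed by spatial attention; output shape equals input shape."""

    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x):
        return self.spatial(self.channel(x))


def conv_bn_act(c_in: int, c_out: int, k: int = 1) -> nn.Sequential:
    """Conv + BatchNorm + SiLU, the basic convolution unit used in YOLOv5."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.SiLU(),
    )


class Bottleneck(nn.Module):
    """Residual 1x1 -> 3x3 convolution pair."""

    def __init__(self, channels: int):
        super().__init__()
        self.cv1 = conv_bn_act(channels, channels, 1)
        self.cv2 = conv_bn_act(channels, channels, 3)

    def forward(self, x):
        return x + self.cv2(self.cv1(x))


class C3CBAM(nn.Module):
    """Hypothetical C3 variant: CBAM stands in for the first convolution."""

    def __init__(self, c_in: int, c_out: int, n: int = 1):
        super().__init__()
        hidden = c_out // 2
        self.attn = CBAM(c_in)                            # replaces the first conv
        self.m = nn.Sequential(*(Bottleneck(c_in) for _ in range(n)))
        self.cv2 = conv_bn_act(c_in, hidden, 1)           # parallel branch (assumed)
        self.cv3 = conv_bn_act(c_in + hidden, c_out, 1)   # fuses the concatenation (assumed)

    def forward(self, x):
        attended = self.m(self.attn(x))
        return self.cv3(torch.cat([attended, self.cv2(x)], dim=1))


x = torch.randn(2, 64, 32, 32)
print(C3CBAM(64, 128, n=2)(x).shape)  # torch.Size([2, 128, 32, 32])
```

Because this CBAM preserves the channel count, the Bottleneck layers in the sketch operate on the input width rather than on a reduced hidden width as in the stock C3 block. That is a simplification made for the example.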

Model Predictions with Attention maps

Images with their corresponding detections and attention maps, illustrating the areas the model focused on to detect the pavement distresses. The Grad-CAM visualization provides insight into the model's interpretability, allowing us to verify whether the model is accurately identifying the distress locations or being misled by irrelevant features.
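
To make the Grad-CAM step concrete, here is a minimal sketch of how such a heatmap can be computed with PyTorch hooks. It is illustrative only: the `grad_cam` helper, the toy network, and the choice of target layer and score are assumptions, and applying the idea to a YOLO detector additionally requires choosing which detection score to backpropagate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def grad_cam(model, target_layer, image, score_fn):
    """Return an [H, W] heatmap in [0, 1] for one image tensor of shape [1, C, H, W]."""
    feats = {}

    def save_activation(_module, _inputs, output):
        feats["act"] = output  # keep the feature map of the target layer

    handle = target_layer.register_forward_hook(save_activation)
    try:
        output = model(image)
    finally:
        handle.remove()

    score = score_fn(output)  # scalar to explain, e.g. one class logit
    grads = torch.autograd.grad(score, feats["act"])[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)  # per-channel importance
    cam = F.relu((weights * feats["act"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    cam = cam - cam.min()
    cam = cam / (cam.max() + 1e-8)  # normalize to [0, 1]
    return cam[0, 0].detach()


# Toy stand-in network (hypothetical); a real use would pass the trained detector.
toy = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4),
)
image = torch.rand(1, 3, 64, 64)
heatmap = grad_cam(toy, toy[3], image, score_fn=lambda logits: logits[0, 2])
print(heatmap.shape)  # torch.Size([64, 64])
```

Overlaying the returned heatmap on the input image gives the kind of visualization described in the caption: high values mark regions that most increased the chosen score.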


Model Evaluation Results

Illustration of the trade-off between recall and precision across different confidence thresholds. (a) Precision-confidence curve. (b) Recall-confidence curve.
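
As a rough illustration of how points on such curves arise, the sketch below computes precision and recall at a few confidence thresholds. The detections and ground-truth count are made-up values for the example, not results from this work, and the helper name `precision_recall_at` is hypothetical. Returning a precision of 1.0 when no detections survive the threshold is one common convention, assumed here.

```python
def precision_recall_at(detections, num_ground_truth, threshold):
    """detections: list of (confidence, is_true_positive) pairs for one class."""
    kept = [is_tp for conf, is_tp in detections if conf >= threshold]
    tp = sum(kept)
    fp = len(kept) - tp
    precision = tp / (tp + fp) if kept else 1.0  # assumed convention when nothing is kept
    recall = tp / num_ground_truth if num_ground_truth else 0.0
    return precision, recall


# Made-up detections: (confidence, matched a ground-truth box?)
detections = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.3, False)]
num_ground_truth = 4

for threshold in (0.1, 0.5, 0.75):
    p, r = precision_recall_at(detections, num_ground_truth, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
# threshold=0.10 precision=0.60 recall=0.75
# threshold=0.50 precision=0.75 recall=0.75
# threshold=0.75 precision=1.00 recall=0.50
```

Raising the threshold discards low-confidence detections, which tends to increase precision while reducing recall. This is the trade-off the two curves show.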