Efficient Deep Learning-Based Road Objects Detection Model to Improve Driving Safety for Autonomous Vehicles

May 29, 2024

Abstract

Object detection is a technology related to image processing and computer vision. It deals with detecting instances of certain classes of semantic objects in digital images and videos. It is one of the most important capabilities of autonomous vehicles, because driving safety depends on it. In traffic analysis and intelligent driving assistance, it is very important to accurately identify all relevant elements in real-time video. Autonomous driving and advanced driver assistance systems need insight into road conditions and surroundings, and car companies are adopting different technologies to achieve this. The current work uses a method based on CenterNet, an anchor-free deep learning detector. It uses a large naturalistic driving data set (BDD100K) that is available online at Kaggle. Feature extraction and segmentation are done with either HOG or LoG, and object edges are recognized by the Canny edge detection algorithm. The detection model is then enhanced with Faster R-CNN or Hybrid CNN networks. The proposed model aims to improve the accuracy of road object recognition.

Introduction and background

The world is moving toward autonomous driving. People increasingly prefer vehicles with self-driving features. Road accidents are increasing day by day, and with them the requirements for vehicle safety. The growing popularity of autonomous vehicles has also made the identification of objects in heavy traffic a popular research topic (Yu et al. 2020). The task is simple to state but difficult in practice: objects must be assigned to categories, and the places where they appear in the image must be located. In recent years, deep learning, whose powerful feature extraction capabilities are essential for object identification, has expanded and succeeded in this area of application. However, even powerful methods still struggle with real-time monitoring, changing weather, and complex lighting in traffic conditions (Hou et al. 2021). Previous studies indicate that many deep learning models can be used to identify objects. In this regard, new self-driving cars use a variety of sensors and deep learning techniques to recognize and classify the objects in the road environment, such as “vehicles, pedestrians, signs, traffic lights, and many more” (He et al. 2019). This will increase safety.

Object recognition is a technique for identifying and classifying the objects in an image so that the image can be fully understood. It is currently one of the major challenges of vision-based autonomous driving (Bibi et al. 2021). Object detection draws a bounding box around each identified object and associates a class name and a confidence score with each bounding box. In the field of deep learning, there are many opportunities to improve the degree of automation by improving awareness of the environment.
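The detector output described above (a box, a class name, and a confidence score) can be sketched as a small data structure. This is only an illustration: the `Detection` class, its field names, the pixel coordinate convention, and the 0.5 threshold are assumptions made for the example, not part of the proposed model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: a box in pixel coordinates plus label and score."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    class_name: str
    confidence: float  # assumed to lie in [0, 1]

    @property
    def area(self) -> float:
        """Box area in square pixels; small values indicate small objects."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def keep_confident(detections, threshold=0.5):
    """Keep only detections whose confidence reaches the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Hypothetical detections for one frame.
detections = [
    Detection(100, 200, 300, 350, "car", 0.92),
    Detection(410, 220, 425, 260, "pedestrian", 0.35),
]
for d in keep_confident(detections):
    print(d.class_name, d.confidence, d.area)
```

Filtering by confidence is a common post-processing step; the threshold trades missed objects against false alarms and would need tuning on real data.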

This study also explores the applicability of specific algorithms to object identification by assessing some complex traffic conditions. Some benchmarks compare common deep learning object recognition methods that researchers use to recognize road objects (Qiao et al. 2020). However, the effectiveness of existing methods for recognizing road objects is greatly reduced by small objects, poor lighting, and blurred images. Of these, the main problem is the detection of small objects. In light of the rapid advances in intelligent driving, object identification by deep learning is still worth investigating. Accurate and efficient detection systems are becoming increasingly important for implementing more sophisticated real-time detection scenarios (Dairi et al. 2018). The main difficulty is balancing accuracy and efficiency under real-time constraints. Current approaches have been successful, but more research is needed to resolve this issue. This project aims to enhance the efficiency of object detection systems in autonomous vehicles. That will directly improve driving safety, because vehicles that can identify and analyze the objects around them can avoid those objects and so avoid accidents. Better performance and greater safety will help autonomous vehicles gain popularity, which would also benefit stakeholders. Previous work on this topic has identified various objects through various methods. This study aims to improve such systems in terms of efficiency.

Aims and objectives

Aims

The aim of the study is to identify every object, whether large or small, such as two-wheelers, cars, and people, with high efficiency in dense traffic environments. Every vehicle and every other object must be classified into its vehicle class or object type.

Objectives

Design a model that recognizes small objects and learns well across different traffic scenarios.
Evaluate the efficiency of the proposed model using a large-scale naturalistic driving data set (BDD100K).
Use current neural network architectures that have good accuracy in object identification, such as Hybrid CNN and Faster R-CNN.
Train the network on several object classes and test its precision on real-time data.

Literature review

Introduction

This section presents and critically discusses previous studies on the topic. It gives a clear picture of the earlier research and the methods used in it. Several journal articles on the topic “deep learning-based road objects detection model” were studied and analyzed for this section. The advantages and disadvantages of the applied methods, and the limitations and strengths of the previous literature, are discussed.

Deep learning method

Deep learning is a subfield of machine learning. A deep learning model is a neural network with three or more layers. Deep learning algorithms are used successfully in automation, and they have become a standard way to develop speech recognition systems, image processing systems, and many other automated systems. They are also capable of handling huge amounts of data. In most cases, images are processed with a convolutional neural network (CNN). The stacked layers of a CNN analyze the pixels of the image and determine the patterns present in it.
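How a single convolutional layer turns pixels into a pattern response can be shown on a toy image. The sketch below is a minimal illustration in NumPy, not the network used in this study: the 5x5 image and the vertical-edge filter are hand-picked assumptions, whereas a real CNN learns its filter weights from data.

```python
import numpy as np


def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    Like most deep learning libraries, this computes cross-correlation,
    that is, the kernel is not flipped.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x: np.ndarray) -> np.ndarray:
    """Rectified linear unit: negative responses are set to zero."""
    return np.maximum(x, 0.0)


# Toy 5x5 image: dark left part, bright right part.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=np.float64)
# Hand-picked vertical-edge filter (an assumption for illustration).
kernel = np.array([[-1, 0, 1]] * 3, dtype=np.float64)

feature_map = relu(conv2d_valid(image, kernel))
print(feature_map)
```

The response is large where the window straddles the dark-to-bright edge and zero where the window covers a uniform region, which is the sense in which a layer "determines the pattern" in the pixels.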

In image classification, a deep learning model learns the patterns in the training images and can detect them again in new images. This approach is used here: after the model has been trained on the data set, it can recognize the pattern and identify the type of the object. According to Niranjan & VinayKarthik (2021), deep learning can help to predict various objects and their names even in crowded traffic situations.

Advantages and disadvantages of using a deep learning algorithm

Advantages of deep learning algorithm

Deep learning methods have several advantages. Deep learning algorithms can generate new features from a data set without extra human intervention. This lets them perform well even on complex tasks that would otherwise require extensive feature engineering. The algorithms are very useful when the data is unstructured; other machine learning algorithms are very limited in that case, and deep learning can come to the rescue. Because a deep learning model has multiple layers, it can learn very complex features. Deep learning also supports distributed and parallel algorithms. A deep learning model can take days or weeks to train, but distributed and parallel training makes this much faster. Deep learning can also provide efficient processing models. Prediction accuracy is often high as well, because the data set is analyzed through multiple layers.

Disadvantages of deep learning algorithm

Besides these advantages, deep learning has several disadvantages that should be kept in mind when implementing it. Deep learning algorithms need a very large amount of data in order to perform better than other techniques. If the model cannot be trained with enough data, training may fail. Training is sometimes very expensive because the models are complex. When the data set is well organized, there may be no need for deep learning, since its advantage lies mainly with data that is not well organized. There is also no established theory that tells a practitioner how to choose the right deep learning tool.

Building deep learning architectures can be very costly because it requires powerful and expensive machines and GPUs. Deep learning models can also be attacked relatively easily, even though deep learning is itself used to provide cyber security.

Neural networks

Kulkarni et al. (2018) present a prototype deep neural network for accurate traffic light detection and recognition using transfer learning. The approach uses TensorFlow to transfer and train a Faster R-CNN (region-based CNN) Inception V2 model. The model was trained on a data set of images of different Indian traffic lights in five categories. The prototype achieves its goal by identifying the correct type of traffic light. It uses current techniques and neural network architectures with much more accurate object recognition, such as Faster R-CNN. Some algorithms run faster than Faster R-CNN, but they cannot properly identify small and medium-sized objects. Faster R-CNN is applied to the data to assess the environment around the vehicle and the surrounding area. The network was trained with 19 object classes and identified objects with 86.42 percent accuracy in real-time video analysis. In addition, the proposed model is evaluated from its confusion matrix using the false positive rate (FPR) and the false negative rate (FNR). According to the study, the Faster R-CNN prototype has an FPR of 15.97% and an FNR of 12.2%. In contrast, Li et al. (2021) propose a new CenterNet-based method for anchor-free deep learning. Atrous Spatial Pyramid Pooling (ASPP) is used to extract features at various scales, improving recognition efficiency without increasing the computational cost or the number of parameters. The method also replaces the conventional downsampling step with a space-to-depth technique. The effectiveness of the method was evaluated using a large naturalistic driving data set (BDD100K).

The experimental results reported by Li et al. (2021) suggest that the method can effectively improve the detection performance for small objects under various traffic conditions.
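Two quantities mentioned in this section lend themselves to a short sketch: the false positive and false negative rates derived from a confusion matrix, and the space-to-depth rearrangement used in place of conventional downsampling. The code below is a minimal illustration under stated assumptions. The counts are invented for the example and are not taken from the cited studies, and the channel ordering in `space_to_depth` is one common convention that may differ from the layout used by Li et al. (2021).

```python
import numpy as np


def error_rates(tp: int, fp: int, tn: int, fn: int):
    """Return (FPR, FNR) from the four cells of a binary confusion matrix."""
    fpr = fp / (fp + tn)  # share of actual negatives flagged as positive
    fnr = fn / (fn + tp)  # share of actual positives that were missed
    return fpr, fnr


def space_to_depth(x: np.ndarray, block: int) -> np.ndarray:
    """Rearrange an (H, W, C) array into (H/block, W/block, C*block*block).

    Each block x block patch of pixels is moved into the channel axis, so
    the spatial size shrinks while no pixel values are discarded.
    """
    h, w, c = x.shape
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(h // block, w // block, block * block * c)


# Illustrative counts only; they are not taken from the cited studies.
fpr, fnr = error_rates(tp=80, fp=10, tn=90, fn=20)
print(f"FPR={fpr:.2%}, FNR={fnr:.2%}")

# A 4x4 single-channel map becomes a 2x2 map with 4 channels.
feature_map = np.arange(16).reshape(4, 4, 1)
packed = space_to_depth(feature_map, block=2)
print(packed.shape)
```

Unlike strided convolution or pooling, this rearrangement keeps every input value, which is one plausible reason it helps with small objects that occupy only a few pixels.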

Technology and resources

Deep learning algorithms will be used to analyze the data set. They are highly capable of studying and analyzing huge data sets that contain images. Deep learning models are based on neural networks with at least three layers, through which the images are analyzed. The models analyze images and evaluate patterns in a way loosely comparable to the human brain. The Python programming language will be used to apply the deep learning methods to the data set. Python is becoming one of the most popular programming languages in the world. It is a language of choice for well-known companies such as Facebook, Google, Quora, Amazon, and Netflix because of its simplicity, flexibility, and ease of use. It is often used in creative and interesting technologies such as machine learning, artificial intelligence, and robotics. Python is very popular among beginners at college, and experienced developers often learn it to broaden their portfolio of skills. As more companies and individuals use Python, they create additional tools that help developers perform complex tasks without running into coding problems. Python deep learning libraries are packages of code that have been bundled together to serve a common purpose. The libraries considered here are “TensorFlow, Keras, Pandas, NumPy, Scikit-learn, and PyTorch”.

TensorFlow is a Python library that Google developed for internal use in deep learning solutions. It was released as open source in late 2015 and has since grown into one of the largest open-source Python libraries. TensorFlow also includes tools and community resources that help users get good results. Google itself uses TensorFlow in machine learning-based products such as Gmail, YouTube, and Google Search.

Keras was developed as part of a research project called ONEIROS ("Open-ended Neuro-Electronic Intelligent Robot Operating System") and focuses on a simpler prototyping process, a user-friendly interface, and modularity. It is an interface for building deep learning models rather than a standalone tool, and it supports both convolutional and recurrent neural networks. Keras is preferred for deep learning primarily because of its easy-to-use interface and because it runs on both CPU and GPU. The library also provides a modular approach for iterating on deep learning solutions.

The Pandas library is one of the most comprehensive data analysis libraries in Python, and one that a deep learning programmer should have. Its name is derived from the term "panel data", and the module provides data analysis and statistics. It also adds easy-to-use data structures that facilitate data wrangling and preprocessing.

The NumPy library is one of the first libraries most users install for Python, whether they are doing deep learning or basic data analysis. It provides functions for handling numerical data. The library provides an N-dimensional array object that holds the user's data, together with transformation functions for that data.

Scikit-learn is the Python machine learning library of choice for many users because of its diverse use cases and powerful tools. Its main purpose is to provide efficient tools for data analysis. The library is built on other powerful libraries such as NumPy, SciPy, and Matplotlib, and works with Plotly, Pandas, and more.

PyTorch's n-dimensional arrays, called tensors, support multinomial statistical distributions, inner products, matrix-vector multiplication, and other complex operations. The framework also has a module called "nn", which contains neural network building blocks and loss functions; stochastic gradient descent and other optimizers for training are provided in the companion "optim" module.
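The N-dimensional array described for NumPy can be illustrated with a small image batch. This is a sketch only: the batch of two random 720x1280 RGB frames is a hypothetical stand-in for real images, and the scaling and axis order shown are common preprocessing conventions rather than steps prescribed by this study.

```python
import numpy as np

# Hypothetical batch: 2 RGB frames, 720 rows x 1280 columns, 8-bit pixels.
rng = np.random.default_rng(seed=0)
batch = rng.integers(0, 256, size=(2, 720, 1280, 3), dtype=np.uint8)

# Scale pixel values to [0, 1] as 32-bit floats.
scaled = batch.astype(np.float32) / 255.0

# Reorder axes from (batch, height, width, channel)
# to (batch, channel, height, width), the layout some frameworks expect.
channels_first = np.transpose(scaled, (0, 3, 1, 2))

# Mean intensity per colour channel across the whole batch.
channel_means = scaled.mean(axis=(0, 1, 2))

print(batch.shape, channels_first.shape, channel_means.shape)
```

The same array object serves as the exchange format between the other libraries listed above, which is why it is usually the first dependency installed.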

An image data set is needed to train the classification model. The Berkeley DeepDrive (BDD100K) data set of 100k high-resolution (1280×720) images will be used. This extensive data set contains many realistic driving scenarios and is widely used in AV research on environmental perception. The effectiveness of the proposed method will be evaluated on it. BDD100K is described as the largest open driving video data set, with 100,000 videos and 10 tasks for evaluating progress in image recognition systems for autonomous driving. Each video is of high quality and is 40 seconds long. With over 100 million frames, the data set represents over 1000 hours of driving experience. The videos come with GPS/IMU data that record the driving trajectories. The data set covers diverse geography, environments, and weather, which helps trained models to be less surprised by new situations.
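The size figures quoted for the data set can be checked with simple arithmetic. The video count and clip length come from the paragraph above; the frame rate of 30 frames per second is an assumption that the text does not state.

```python
NUM_VIDEOS = 100_000
SECONDS_PER_VIDEO = 40
FRAMES_PER_SECOND = 30  # assumption: not stated in the text above

total_seconds = NUM_VIDEOS * SECONDS_PER_VIDEO
total_hours = total_seconds / 3600
total_frames = total_seconds * FRAMES_PER_SECOND

print(f"{total_hours:.0f} hours, {total_frames:,} frames")
# prints "1111 hours, 120,000,000 frames"
```

Under that assumption, the result is consistent with both claims: more than 1000 hours of driving and more than 100 million frames.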

Dynamic outdoor scenes and complex movements of the ego vehicle make the perception tasks even more difficult. The tasks include image tagging, lane detection, drivable area segmentation, multi-object segmentation tracking, semantic segmentation, multi-object detection tracking, domain adaptation, road object detection, and imitation learning.

Methodology and work plan

Methodology

A deep learning-based road object detection model will be implemented in this study to enhance driving safety for autonomous vehicles. For this purpose, a large naturalistic driving data set has been collected, the Berkeley DeepDrive data set (BDD100K), which will be used to study the efficiency of the proposed model (Bayoudh et al. 2021). The Histogram of Oriented Gradients (HOG) or the Laplacian of Gaussian (LoG) method is used for feature extraction. Both methods are commonly used for object recognition and for extracting gradient-based features (Deng et al. 2021). HOG divides the image into small square cells, calculates a histogram of oriented gradients for each cell, normalizes the result block by block, and returns a descriptor for each cell. The LoG operation is a two-step process: it first smooths the image with a Gaussian filter to reduce sensitivity to noise and then applies the Laplacian operator (Cui et al. 2021). Its input is a grayscale image, and it produces another grayscale image as output. Once the features are extracted, the Canny edge detection algorithm is used to detect the edges of objects such as vehicles and people. The Canny technique finds edges by looking for local maxima of the image gradient. Finally, Faster R-CNN or Hybrid CNN is used to train and test the model (Gwak et al. 2019). This is expected to improve the recognition rate of the proposed model. Given the data, the Faster R-CNN or Hybrid CNN model analyzes the environment around the road and around the vehicle (Haris & Glowacz, 2021). The model is intended to identify small objects in different traffic conditions and to achieve better recognition performance.
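The two feature extraction options named above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the implementation planned for the study: the cell size of 8 pixels, the 9 orientation bins, and the 7x7 kernel are conventional choices that the text does not specify, the HOG sketch uses hard binning and omits the block normalization step, and the LoG kernel is given only up to a constant scale factor.

```python
import numpy as np


def gradients(gray: np.ndarray):
    """Return gradient magnitude and unsigned orientation in degrees."""
    gy, gx = np.gradient(gray.astype(np.float64))  # axis 0 is y, axis 1 is x
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    return magnitude, angle


def hog_cell_histograms(gray: np.ndarray, cell: int = 8, bins: int = 9):
    """Magnitude-weighted orientation histogram for each cell x cell region."""
    magnitude, angle = gradients(gray)
    rows, cols = gray.shape[0] // cell, gray.shape[1] // cell
    # Map each angle to a bin; the modulo guards against rounding at 180.
    bin_idx = (angle // (180.0 / bins)).astype(int) % bins
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell),
                      slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(bin_idx[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    return hist


def log_kernel(sigma: float = 1.0, size: int = 7) -> np.ndarray:
    """Laplacian-of-Gaussian kernel on a size x size grid, shifted to sum to 0."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x ** 2 + y ** 2
    kernel = (r2 - 2 * sigma ** 2) / sigma ** 4 * np.exp(-r2 / (2 * sigma ** 2))
    return kernel - kernel.mean()


# Toy grayscale image with one vertical edge: dark left, bright right.
image = np.zeros((8, 16))
image[:, 8:] = 1.0
hist = hog_cell_histograms(image)
print(hist.shape, hist[0, 0])
print(log_kernel().round(3))
```

For the toy image, all gradient energy falls into the first orientation bin, because the intensity changes only along the horizontal direction. Convolving an image with the LoG kernel would give the smoothed second-derivative response described above; libraries such as scikit-image and OpenCV provide complete, tested versions of these operations.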

Plan of work

The plan of work is an essential step in implementing the proposed model. The Gantt chart is given below.

Discussion

Self-driving or autonomous cars raise new questions for ethics. The ethical behavior of self-driving cars is predefined by methods that determine how the car acts where harm to persons is likely or unavoidable. These systems may initially be limited to road use outside cities, but they will eventually move into cities, where self-driving cars (AVs) are forecast to dominate road traffic in the 2040s. The new technology is expected to significantly reduce traffic accidents. There is no doubt that ethical concerns will arise as self-driving car accidents become a reality. Actions that can harm humans or even animals are considered ethical choices (Arya et al. 2021). Ad hoc decisions are made in milliseconds but can rely on lengthy research and discussion. Similar methods are likely to be used simultaneously in millions of vehicles, which amplifies the need to compensate for the effects of any inherent bias. Previous studies have shown that moral judgments and the underlying cognitive processes are highly context-sensitive and that a complete and nuanced framework has never been achieved. Therefore, the ethical model of self-driving cars should strive to match human decision-making in the same situation (Reda et al. 2021). Immersive virtual reality has been used to simulate road traffic conditions in order to assess ethical behavior and to train and analyze various model options. In that study, participants controlling a virtual vehicle had to choose which of two obstacles to sacrifice in order to save the other. Model comparisons show that a basic one-dimensional value-of-life model can explain human ethical behavior under certain circumstances. The choice patterns turned out to be inconsistent, however, so how traffic decisions should be made remains open to discussion. The study shows that virtual reality is well suited to assessing people's ethical behavior and provides consistent results across subjects while keeping the test environment under control.

The data of this project will be protected in accordance with the eight principles of data protection. All of these principles will be studied and implemented in the study.

References

Kulkarni, R., Dhavalikar, S., & Bangar, S. (2018). Traffic Light Detection and Recognition for Self Driving Cars Using Deep Learning. Proceedings - 2018 4th International Conference on Computing, Communication Control and Automation, ICCUBEA 2018. https://doi.org/10.1109/ICCUBEA.2018.8697819
Li, G., Yang, Y., Qu, X., Cao, D., & Li, K. (2021). A deep learning based image enhancement approach for autonomous driving at night. Knowledge-Based Systems, 213, 106617.
Bayoudh, K., Hamdaoui, F., & Mtibaa, A. (2021). Transfer learning based hybrid 2D-3D CNN for traffic sign recognition and semantic road detection applied in advanced driver assistance systems. Applied Intelligence, 51(1), 124-142.
Deng, Y., Zhang, T., Lou, G., Zheng, X., Jin, J., & Han, Q. L. (2021). Deep Learning-Based Autonomous Driving Systems: A Survey of Attacks and Defenses. IEEE Transactions on Industrial Informatics.
Cui, Y., Chen, R., Chu, W., Chen, L., Tian, D., Li, Y., & Cao, D. (2021). Deep learning for image and point cloud fusion in autonomous driving: A review. IEEE Transactions on Intelligent Transportation Systems.
Gwak, J., Jung, J., Oh, R., Park, M., Rakhimov, M. A. K., & Ahn, J. (2019). A review of intelligent self-driving vehicle software research. KSII Transactions on Internet and Information Systems (TIIS), 13(11), 5299-5320.
Haris, M., & Glowacz, A. (2021). Road Object Detection: A Comparative Study of Deep Learning-Based Algorithms. Electronics, 10(16), 1932.
Arya, D., Maeda, H., Ghosh, S. K., Toshniwal, D., Mraz, A., Kashiyama, T., & Sekimoto, Y. (2021). Deep learning-based road damage detection and classification for multiple countries. Automation in Construction, 132, 103935.
Yu, K., Lin, L., Alazab, M., Tan, L., & Gu, B. (2020). Deep learning-based traffic safety solution for a mixture of autonomous and manual vehicles in a 5G-enabled intelligent transportation system. IEEE Transactions on Intelligent Transportation Systems.
Hou, L., Chen, H., Zhang, G. K., & Wang, X. (2021). Deep Learning-Based Applications for Safety Management in the AEC Industry: A Review. Applied Sciences, 11(2), 821.
Parmar, Y., Natarajan, S., & Sobha, G. (2019). DeepRange: deep-learning-based object detection and ranging in autonomous driving. IET Intelligent Transport Systems, 13(8), 1256-1264.
He, J., Tang, Z., Fu, X., Leng, S., Wu, F., Huang, K., ... & Xiong, Z. (2019, October). Cooperative connected autonomous vehicles (CAV): research, applications and challenges. In 2019 IEEE 27th International Conference on Network Protocols (ICNP) (pp. 1-6). IEEE.
Bibi, R., Saeed, Y., Zeb, A., Ghazal, T. M., Rahman, T., Said, R. A., ... & Khan, M. A. (2021). Edge AI-based automated detection and classification of road anomalies in VANET using Deep Learning. Computational Intelligence and Neuroscience, 2021.
Qiao, J. J., Wu, X., He, J. Y., Li, W., & Peng, Q. (2020). SWNet: A Deep Learning Based Approach for Splashed Water Detection on Road. IEEE Transactions on Intelligent Transportation Systems.
Dairi, A., Harrou, F., Senouci, M., & Sun, Y. (2018). Unsupervised obstacle detection in driving environments using deep-learning-based stereovision. Robotics and Autonomous Systems, 100, 287-301.
Reda, A., Bouzid, A., & Vásárhelyi, J. (2021, May). Deep Learning-Based Automated Vehicle Steering. In 2021 22nd International Carpathian Control Conference (ICCC) (pp. 1-5). IEEE.
Niranjan, D. R., & VinayKarthik, B. C. (2021, October). Deep Learning based Object Detection Model for Autonomous Driving Research using CARLA Simulator. In 2021 2nd International Conference on Smart Electronics and Communication (ICOSEC) (pp. 1251-1258). IEEE.