Pak Hung Chan

and 4 more

Assisted and automated driving functions will rely on machine learning algorithms, given their ability to cope with real-world variations, e.g. vehicles of different shapes, positions, colours, etc. Supervised learning needs annotated datasets, and several datasets are available for developing automotive functions. However, these datasets are vast in volume, and labelling accuracy and quality can vary across different datasets and within dataset frames. Accurate and appropriate ground truth is especially important for automotive applications, as "incomplete" or "incorrect" learning can negatively impact vehicle safety when these neural networks are deployed. This work investigates the ground truth quality of widely adopted automotive datasets, including a detailed analysis of KITTI MoSeg. Based on the errors identified and classified in the annotations of different automotive datasets, this paper provides three collections of criteria for producing improved annotations. These criteria are enforceable and applicable to a wide variety of datasets. The three annotation sets are created to: (i) remove dubious cases; (ii) annotate to the best of the human visual system; (iii) remove clearly erroneous bounding boxes. KITTI MoSeg has been re-annotated three times according to the specified criteria, and three state-of-the-art deep neural network object detectors are used to evaluate them. The results clearly show that network performance is affected by ground truth variations, and that removing clear errors is beneficial for predicting real-world objects only for some networks. The relabelled datasets still present some cases with "arbitrary"/"controversial" annotations, and therefore this work concludes with some guidelines related to dataset annotation, metadata/sub-labels, and specific automotive use cases.
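
A minimal sketch of how two annotation sets for the same frame could be compared, assuming axis-aligned boxes given as (x1, y1, x2, y2) tuples. The greedy matching rule and the 0.5 IoU threshold are illustrative choices, not the criteria defined in the paper.

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def unmatched_boxes(original, relabelled, threshold=0.5):
    """Boxes in `original` with no counterpart in `relabelled`:
    candidates for dubious or erroneous annotations."""
    return [box for box in original
            if all(iou(box, other) < threshold for other in relabelled)]
```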

Daniel Gummadi

and 3 more

Assisted and automated driving (AAD) systems heavily rely on data collected from perception sensors, such as cameras. While prior research has explored the quality of camera data via traditional and well-established image quality assessment (IQA) metrics (e.g. PSNR, SSIM, BRISQUE) or considered how noisy/degraded data affects perception algorithms (e.g. deep neural network (DNN) based object detection), no prior work has approached the holistic relationship between IQA and DNN performance. This work proposes that traditional IQA metrics, designed to evaluate digital image quality according to human visual perception, can help to predict the level of sensor data degradation that perception algorithms can tolerate before performance deterioration occurs. Consequently, a correlation analysis was conducted between 17 selected IQA metrics (with and without reference) and DNN average precision. The evaluated data was increasingly compressed to generate degradation and artefacts. Notably, the experimental results show that several IQA metrics had a strong positive correlation (exceeding correlation scores of 0.7) with average precision, with IW-SSIM and DSS having very high correlation (> 0.9). Interestingly, the results show that re-training BRISQUE on compressed data yields an exceptionally high positive correlation (> 0.97), making it very suitable for predicting the performance of DNN object detectors. By effectively relating traditional image quality metrics to DNN performance, this research offers a series of significant tools to understand and predict perception degradation based on the quality of data, thus resulting in a significant impact on the development of automated driving systems.
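
A minimal sketch of the kind of correlation analysis described, assuming per-frame IQA scores and matching DNN average-precision values are available. It uses scikit-image for two of the full-reference metrics and SciPy for the correlation; the 17 metrics and the exact pipeline used in the paper are not reproduced here.

```python
import numpy as np
from scipy.stats import pearsonr
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def iqa_scores(reference_frames, compressed_frames):
    """PSNR and SSIM of each compressed frame against its reference."""
    psnr = [peak_signal_noise_ratio(r, c)
            for r, c in zip(reference_frames, compressed_frames)]
    ssim = [structural_similarity(r, c, channel_axis=-1)
            for r, c in zip(reference_frames, compressed_frames)]
    return np.array(psnr), np.array(ssim)

def correlate_with_ap(metric_values, average_precision):
    """Pearson correlation between an IQA metric and DNN average precision."""
    r, p_value = pearsonr(metric_values, average_precision)
    return r, p_value
```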

Pak Hung Chan

and 3 more

The sensor suite of a vehicle with assisted and automated driving functions is critical to the function of the vehicle, but it is also the first and most important limitation on the level of automation that the system can achieve. The advancement of 4D RADARs, which provide better resolution in both azimuth and elevation compared to traditional RADAR, can help achieve more robust situational awareness, whilst also providing more data for perception algorithms and sensor fusion. However, like all perception sensors, 4D RADAR is affected by numerous noise factors. To explore the sources of noise, this work identifies, classifies, and analyses automotive 4D RADAR noise factors. Overall, 22 noise factors have been considered, in combination with their effect on six 4D RADAR outputs. Finally, this work also presents and applies, for the first time, a dissimilarity metric to 4D RADAR data collected in the presence of rain of different intensities. The proposed metric is used to assess the effect of noise on the variability of the measured data; in addition, it can be used to compare any 4D RADAR data. The metric, combined with other point cloud evaluations, shows that as the rain rate intensifies, the size of the point cloud decreases, but the variation in the measurements increases. This work demonstrates the importance of evaluating, comparing, and quantifying noise for 4D RADAR, and can pave the way for more in-depth analysis, modelling, and testing of 4D RADAR for assisted and automated driving functions.
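
The abstract does not define the proposed dissimilarity metric, so as a purely illustrative stand-in the sketch below computes a symmetric Chamfer distance between two RADAR point clouds (N x 3 arrays of x, y, z), one generic way to quantify variability between measurements.

```python
import numpy as np

def chamfer_distance(cloud_a, cloud_b):
    """Mean nearest-neighbour distance, averaged over both directions."""
    # Pairwise Euclidean distances between all points in the two clouds.
    diff = cloud_a[:, None, :] - cloud_b[None, :, :]
    dists = np.linalg.norm(diff, axis=-1)
    return 0.5 * (dists.min(axis=1).mean() + dists.min(axis=0).mean())
```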

Boda Li

and 4 more

Assisted and automated driving functions are increasingly deployed to improve safety and efficiency and to enhance the driver experience. However, there are still key technical challenges that need to be overcome, such as the degradation of perception sensor data due to noise factors. The quality of the data generated by sensors can directly impact the planning and control of the vehicle, which can affect vehicle safety. This work builds on a recently proposed framework for analysing noise factors on automotive LiDAR sensors and applies it to camera sensors, focusing on the specific disturbed sensor outputs via a detailed analysis and classification of camera-specific noise sources (30 noise factors are identified and classified in this work). Moreover, the noise factor analysis has identified two omnipresent and independent noise factors (i.e. obstruction and windshield distortion). These noise factors have been modelled to generate noisy camera data; their impact on the perception step, based on deep neural networks, has been evaluated when the noise factors are applied independently and simultaneously. It is demonstrated that the performance degradation from the combination of noise factors is not simply the accumulated performance degradation from each single factor, which raises the importance of including the simultaneous analysis of multiple noise factors. Thus, the framework can support and enhance the use of simulation for the development and testing of automated vehicles through careful consideration of the noise factors affecting camera data.
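
A minimal sketch of applying two independent noise models to a camera frame, singly and combined, in the spirit of the analysis described. The random occlusion mask and Gaussian blur below are illustrative stand-ins for the paper's obstruction and windshield-distortion models, and "frame.png" is a hypothetical input.

```python
import numpy as np
import cv2

def apply_obstruction(frame, rng, coverage=0.1):
    """Black out a random rectangle covering roughly `coverage` of the frame."""
    h, w = frame.shape[:2]
    rh, rw = int(h * coverage ** 0.5), int(w * coverage ** 0.5)
    y, x = rng.integers(0, h - rh), rng.integers(0, w - rw)
    out = frame.copy()
    out[y:y + rh, x:x + rw] = 0
    return out

def apply_distortion(frame, sigma=2.0):
    """Approximate windshield distortion with a Gaussian blur."""
    return cv2.GaussianBlur(frame, (0, 0), sigma)

rng = np.random.default_rng(0)
frame = cv2.imread("frame.png")  # hypothetical input frame
noisy_single = apply_obstruction(frame, rng)          # one factor
noisy_combined = apply_distortion(noisy_single)       # both factors together
```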

Pak Hung Chan

and 3 more

Whilst Deep Neural Networks have been developing swiftly, most of the research has focused on RGB images. This type of image has traditionally been optimised for human vision. However, RGB data is a highly re-elaborated and interpolated version of the collected raw data: the sensor collects one value per pixel, whereas an RGB image for human viewing contains three values, for red, green, and blue. This processing through the ISP (Image Signal Processing) requires computational resources, time, and power, and increases the amount of output data by a factor of three. This work investigates Deep Neural Network based detection using (for training and evaluation) Bayer data, generated in different ways from a benchmarking automotive dataset (i.e. the KITTI dataset). A Deep Neural Network (DNN) is deployed in unmodified form, and also modified to accept only single-field images, such as Bayer frames. Eleven different re-trained versions of the DNN are produced and cross-evaluated across different data formats. The results demonstrate that the selected DNN has the same accuracy when evaluating RGB or Bayer data, without significant degradation in the perception (the variation of the Average Precision is <1%). Moreover, the colour filter array position and the colour correction matrix do not seem to contribute significantly to the DNN performance. This work demonstrates that Bayer data can be used for object detection in automotive without significant performance loss, and that the processing currently used in the ISP can be avoided, allowing for more efficient sensing-perception systems.
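
A minimal sketch of remosaicing an RGB image into a single-channel Bayer frame with an RGGB colour filter array, one possible way of generating Bayer data from an RGB dataset such as KITTI. The exact generation pipelines used in the paper are not reproduced here.

```python
import numpy as np

def rgb_to_bayer_rggb(rgb):
    """Keep one colour value per pixel, following an RGGB 2x2 pattern."""
    h, w, _ = rgb.shape
    bayer = np.empty((h, w), dtype=rgb.dtype)
    bayer[0::2, 0::2] = rgb[0::2, 0::2, 0]  # R at even rows, even cols
    bayer[0::2, 1::2] = rgb[0::2, 1::2, 1]  # G at even rows, odd cols
    bayer[1::2, 0::2] = rgb[1::2, 0::2, 1]  # G at odd rows, even cols
    bayer[1::2, 1::2] = rgb[1::2, 1::2, 2]  # B at odd rows, odd cols
    return bayer
```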

Gabriele Baris

and 4 more

Recent advances in sensing, electronic, processing, machine learning, and communication technologies are accelerating the development of assisted and automated functions for commercial vehicles. Environmental perception sensor data are processed to generate a correct and complete situational awareness. It is of utmost importance to assess the robustness of the sensor data pipeline, particularly in the case of data degradation in a noisy and variable environment. Sensor data reduction and compression techniques are key for higher levels of driving automation, as there is an expectation that traditional automotive wired vehicle networks will not be able to support the needed sensor data rates (i.e. more than 10 perception sensors, including cameras, LiDARs, and RADARs, generating tens of Gb/s of data). This work proposes, for the first time, to consider video compression for camera data transmission on vehicle wired networks in the presence of highly noisy data, e.g. a partially obstructed camera field of view. The effects are discussed in terms of the drop in machine learning vehicle detection accuracy, and also by visualising how detection performance varies spatially across the frames using the recently introduced Spatial Recall Index metric. The presented parametric occlusion noise model emulates real-world occlusion patterns, whereas compression is based on the well-established AVC/H.264 compression standard. The results demonstrate that the DNN performance is stable when increasing compression, despite adding small amounts of noise. However, higher levels of occlusion noise have a greater impact on DNN performance, and when combined with compression, there is a significant decrease in the DNN performance.
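
A minimal sketch of producing increasingly compressed AVC/H.264 versions of a camera sequence with ffmpeg (assumed to be on PATH). The CRF sweep is illustrative; the encoder settings used in the paper are not given in the abstract.

```python
import subprocess

def compress_h264(src, dst, crf):
    """Encode `src` to `dst` with libx264 at constant rate factor `crf`."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", src,
         "-c:v", "libx264", "-crf", str(crf), dst],
        check=True,
    )

for crf in (18, 28, 38, 48):  # increasing compression, decreasing quality
    compress_h264("sequence.mp4", f"sequence_crf{crf}.mp4", crf)
```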

Pak Hung Chan

and 4 more

Situational awareness based on the data collected by the vehicle perception sensors (i.e. LiDAR, RADAR, camera, and ultrasonic sensors) is key to achieving assisted and automated driving functions in a safe and reliable way. However, the data rates generated by the sensor suite are difficult to support over traditional wired data communication networks on the vehicle, hence there is an interest in techniques that reduce the amount of sensor data to be transmitted without losing key information or introducing unacceptable delays. These techniques must be analysed in combination with the consumer of the data, which will most likely be a machine learning algorithm based on deep neural networks (DNNs). In this paper we demonstrate that by compression tuning the DNNs (i.e. transfer learning by re-training with compressed data), the DNN average precision and recall can significantly improve when either uncompressed or compressed data are transmitted. This improvement is achieved independently of the compression standard used for compression-training (we used AVC and HEVC), and also when the training and transmitted data use the same compression standard or different compression standards. Furthermore, the performance of the DNNs is stable when transmitting data with an increasing lossy compression rate, up to a compression ratio of approximately 1200:1; above this value the performance starts to degrade. This work paves the way for the use of compressed sensor data in assisted and automated driving in combination with the optimisation of compression-tuned DNNs.
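
A minimal sketch of the compression-ratio arithmetic referred to above, assuming raw 8-bit RGB frames; the frame size, frame count, and bitstream size are illustrative values, not figures from the paper.

```python
def compression_ratio(width, height, channels, n_frames, compressed_bytes):
    """Raw sensor data size divided by the encoded bitstream size."""
    raw_bytes = width * height * channels * n_frames
    return raw_bytes / compressed_bytes

# e.g. 100 frames of 1242x375 RGB (KITTI-like resolution) encoded into
# ~116 kB gives a ratio of roughly 1200:1.
print(round(compression_ratio(1242, 375, 3, 100, 116_000)))
```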