Enhancing surveillance in parking lots: a thermal infrared approach to human action recognition

Sfarra, Stefano
2025-01-01

Abstract

To advance research on surveillance and Human Action Recognition (HAR) in challenging low-light environments, we construct a new dataset, Parking Lots Thermal HAR (PL-THAR). Captured with a long-wave infrared camera, it serves as a benchmark for all-day surveillance, particularly in outdoor parking lots with poor night-time lighting. To address the inherent blurriness of Thermal Infrared (TIR) images, we propose THAR-Net, a HAR network designed for thermal imaging. The network combines a multi-channel coordinated attention module, deformable convolutional networks, and cross-stage feature fusion modules to strengthen feature extraction while reducing computational overhead. In comparative experiments against YOLOv8, THAR-Net uses 41.6% to 50.5% fewer parameters and 26.2% to 33.1% fewer floating-point operations while maintaining a detection accuracy of approximately 96%. Ablation experiments further support the effectiveness of the network design. In addition, we present a Detail and Contrast Enhancement (DCE) algorithm designed to address blurred edges in TIR images, thereby improving human surveillance performance with thermal vision.
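
The abstract does not describe how the DCE algorithm works, so the following is only a generic illustration of the kind of problem it targets, not the authors' method. It is a minimal sketch, assuming a grayscale TIR frame stored as a 2D list of numbers, that sharpens soft edges with unsharp masking and then stretches the contrast to the range 0 to 255. The names `box_blur`, `enhance`, `detail_gain`, and `radius` are hypothetical and chosen only for this example.

```python
def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1)^2 window, replicating edge pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def enhance(img, detail_gain=1.5, radius=1):
    """Generic unsharp masking plus min-max contrast stretch.

    Assumption: this is a textbook technique used for illustration only;
    it is NOT the DCE algorithm from the paper.
    """
    base = box_blur(img, radius)
    # Detail layer = original minus blurred; add it back with a gain.
    sharp = [
        [p + detail_gain * (p - b) for p, b in zip(row, brow)]
        for row, brow in zip(img, base)
    ]
    lo = min(min(row) for row in sharp)
    hi = max(max(row) for row in sharp)
    if hi == lo:  # flat frame: nothing to stretch
        return [[0.0] * len(row) for row in sharp]
    scale = 255.0 / (hi - lo)
    return [[(v - lo) * scale for v in row] for row in sharp]


# A tiny frame with a soft vertical edge (hypothetical intensity values).
frame = [[100, 100, 110, 120, 120] for _ in range(3)]
result = enhance(frame)
```

Unsharp masking overshoots on both sides of an edge, which makes the boundary more visible but can also amplify sensor noise; a practical pipeline would likely need a noise-aware gain.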

Use this identifier to cite or link to this document: https://hdl.handle.net/11697/269980