Project: AFOSR-HAR-2021-2025

  1. OD-VIRAT: A Large-Scale Benchmark for Object Detection in Realistic Surveillance Environments

Paper link: Submitted

GitHub link: https://github.com/iscaas/AFOSR-HAR-2021-2025/tree/main/OD-VIRAT

Description: This paper introduces two visual object detection benchmarks, OD-VIRAT Large and OD-VIRAT Tiny, aimed at advancing visual understanding tasks in surveillance imagery. The video sequences in both benchmarks cover 10 different scenes of human surveillance recorded from significant height and distance. The proposed benchmarks offer rich annotations of bounding boxes and categories: OD-VIRAT Large contains 8.7 million annotated instances in 599,996 images, and OD-VIRAT Tiny contains 288,901 annotated instances in 19,860 images. This work also benchmarks state-of-the-art object detection architectures, including RTMDet, YOLOX, RetinaNet, DETR, and Deformable-DETR, on this object-detection-specific variant of the VIRAT dataset.
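
As a concrete illustration of working with the benchmark's annotations, the hedged sketch below counts annotated instances per category, assuming a COCO-style JSON layout; both the file name and the format itself are assumptions, so consult the repository for the actual structure.

```python
# Hypothetical sketch: summarizing OD-VIRAT-style annotations, assuming a
# COCO-style JSON layout (an assumption; check the repo for the real format).
import json
from collections import Counter

def summarize_annotations(path):
    """Count annotated instances per category in a COCO-style file."""
    with open(path) as f:
        data = json.load(f)
    id_to_name = {c["id"]: c["name"] for c in data["categories"]}
    counts = Counter(id_to_name[a["category_id"]] for a in data["annotations"])
    print(f'{len(data["images"])} images, {len(data["annotations"])} instances')
    for name, n in counts.most_common():
        print(f"  {name}: {n}")

# summarize_annotations("od_virat_tiny_annotations.json")  # hypothetical path
```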

  2. A 3DCNN-Based Knowledge Distillation Framework for Human Activity Recognition

Paper link: https://www.mdpi.com/2313-433X/9/4/82

GitHub link: https://github.com/iscaas/AFOSR-HAR-2021-2025/tree/main/3DCNN-Knowledge-Distillation

Description: In this paper, we propose a knowledge distillation framework that distills spatio-temporal knowledge from a large teacher model to a lightweight student model using an offline knowledge distillation technique. The framework takes as input the predictions of two models, a large pre-trained 3DCNN teacher and a lightweight 3DCNN student, and distills knowledge from the teacher to the student. During the distillation phase, the algorithm trains the student model by minimizing the difference between the teacher's and the student's predictions, resulting in a small yet robust model that maintains the same level of precision as the teacher model.
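
The hedged sketch below illustrates the offline distillation objective described above, assuming PyTorch and the standard temperature-scaled soft-target formulation; the temperature, weighting, and function names are illustrative rather than the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend the soft-target KL term (teacher -> student) with the usual
    hard-label cross-entropy; T and alpha are illustrative hyperparameters."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# During offline distillation the pre-trained teacher stays frozen:
#     with torch.no_grad():
#         teacher_logits = teacher(clip)
#     loss = distillation_loss(student(clip), teacher_logits, labels)
```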

  3. Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-directional GRU Framework

Paper link: https://www.mdpi.com/2313-433X/9/7/130

GitHub link: https://github.com/iscaas/AFOSR-HAR-2021-2025/tree/main/DA-2DCNN

Description: This paper presents a computationally efficient yet generic spatial-temporal cascaded framework that exploits deep discriminative spatial and temporal features for the human action recognition task. To learn robust representations of human actions from input video frames, we propose an efficient dual-attentional convolutional neural network (DA-CNN) architecture that leverages a unified channel-spatial attention mechanism to extract human-centric salient features from video frames. The dual channel-spatial attention layers, together with the convolutional layers, learn to be more selective toward the spatial receptive fields containing objects within the feature maps. The extracted discriminative salient features are then forwarded to a stacked bi-directional gated recurrent unit (Bi-GRU) for long-term temporal modeling and recognition of human actions using both forward and backward pass gradient learning.
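
The following sketch illustrates this pipeline, assuming PyTorch: a unified channel-spatial attention block refines per-frame CNN features, which a stacked bidirectional GRU then aggregates over time. All layer sizes are illustrative; this is not the released DA-CNN code.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """Unified channel-then-spatial attention over a 2D feature map."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite channels.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )
        # Spatial attention: one conv over the channel-pooled map.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=7, padding=3), nn.Sigmoid()
        )

    def forward(self, x):                      # x: (B, C, H, W)
        ca = self.channel_mlp(x).unsqueeze(-1).unsqueeze(-1)
        x = x * ca                             # channel-refined features
        sa = self.spatial_conv(x.mean(dim=1, keepdim=True))
        return x * sa                          # spatially refined features

class AttnCNNBiGRU(nn.Module):
    """Per-frame attentional CNN features fed to a stacked Bi-GRU."""
    def __init__(self, feat_dim=128, hidden=256, num_classes=51):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            ChannelSpatialAttention(feat_dim),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.bigru = nn.GRU(feat_dim, hidden, num_layers=2,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, clip):                   # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.backbone(clip.flatten(0, 1)).view(b, t, -1)
        out, _ = self.bigru(feats)
        return self.head(out[:, -1])           # classify from final state
```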

  4. Human Action Representation Learning Using an Attention-Driven Residual 3DCNN Network

Paper link: https://www.mdpi.com/1999-4893/16/8/369

GitHub link: https://github.com/iscaas/AFOSR-HAR-2021-2025/tree/main/DA-Residual%203DCNN

Description: In this paper, we present a computationally efficient yet robust approach that exploits saliency-aware spatial and temporal features for human action recognition in videos. We propose an efficient architecture called the dual-attentional residual 3D convolutional neural network (DA-R3DCNN). Our method utilizes a unified channel-spatial attention mechanism, allowing it to efficiently extract significant human-centric features from video frames. By combining dual channel-spatial attention layers with residual 3D convolution layers, the network becomes more discerning in capturing the spatial receptive fields containing objects within the feature maps.
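
A minimal sketch of one such block follows, assuming PyTorch; kernel sizes, widths, and the exact gate placement are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ResidualAttn3DBlock(nn.Module):
    """Residual 3D conv block with channel and spatial attention gates."""
    def __init__(self, channels):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels), nn.ReLU(),
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels),
        )
        # Channel gate over (T, H, W)-pooled statistics.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(channels, channels), nn.Sigmoid(),
        )
        # Spatio-temporal gate over the channel-pooled volume.
        self.spatial_gate = nn.Sequential(
            nn.Conv3d(1, 1, kernel_size=7, padding=3), nn.Sigmoid()
        )
        self.relu = nn.ReLU()

    def forward(self, x):                     # x: (B, C, T, H, W)
        y = self.convs(x)
        y = y * self.channel_gate(y).view(y.size(0), -1, 1, 1, 1)
        y = y * self.spatial_gate(y.mean(dim=1, keepdim=True))
        return self.relu(x + y)               # residual connection
```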

  5. ViT-ReT: Vision and Recurrent Transformer Neural Networks for Human Activity Recognition in Videos

Paper link: https://ieeexplore.ieee.org/abstract/document/10177697

GitHub link: https://github.com/iscaas/AFOSR-HAR-2021-2025/tree/main/ViR-ReT

Description: This paper proposes and designs two transformer neural networks for human activity recognition: a recurrent transformer (ReT), a specialized neural network for making predictions on sequences of data, and a vision transformer (ViT), a transformer optimized for extracting salient features from images, with the goal of improving the speed and scalability of activity recognition. We provide an extensive comparison of the proposed transformer networks with contemporary CNN- and RNN-based human activity recognition models in terms of speed and accuracy on four publicly available human action datasets.
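
The rough sketch below conveys the temporal half of this idea, assuming PyTorch: per-frame embeddings (standing in for ViT features) are aggregated by a transformer encoder with a class token. Dimensions are illustrative; this is not the released ViT-ReT implementation.

```python
import torch
import torch.nn as nn

class SimpleViTReT(nn.Module):
    """Transformer encoder over per-frame embeddings with a class token."""
    def __init__(self, frame_dim=768, num_classes=400, num_frames=16):
        super().__init__()
        # Learned temporal position embeddings and a class token.
        self.pos = nn.Parameter(torch.zeros(1, num_frames + 1, frame_dim))
        self.cls = nn.Parameter(torch.zeros(1, 1, frame_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=frame_dim, nhead=8, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(frame_dim, num_classes)

    def forward(self, frame_feats):           # (B, T, frame_dim) ViT features
        b = frame_feats.size(0)
        tokens = torch.cat([self.cls.expand(b, -1, -1), frame_feats], dim=1)
        tokens = self.temporal(tokens + self.pos[:, : tokens.size(1)])
        return self.head(tokens[:, 0])        # classify from the class token
```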

Project: NASA-AI-TrajectoryOpt_2020-2024

  1. Cascaded Deep Reinforcement Learning (CDRL) Based Multi-Revolution Low-Thrust Spacecraft Orbit Transfer

Paper link: https://ieeexplore.ieee.org/abstract/document/10207710

GitHub link: https://github.com/iscaas/NASA-AI-Trajectory-Optimization-2020-2024/tree/main/CDRL-SAC-2Body

Description: Transferring an all-electric spacecraft to geosynchronous equatorial orbit (GEO) with low-thrust propulsion is challenging because the transfer takes several months. To address this, we present a new method for planning these long orbit-raising maneuvers from geostationary transfer orbit (GTO) and super-GTO. The process is complex, involving many eclipses and revolutions. We propose a cascaded deep reinforcement learning (DRL) model that guides the spacecraft by choosing the best thrust direction at each point. A reward function based on the spacecraft's orbital elements helps the DRL agent achieve optimal flight times. Our results show that this approach leads to time-efficient orbit-raising, with better transfer times than current methods. This DRL-based trajectory planning enhances spacecraft autonomy through automated trajectory optimization.
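
As a hedged illustration of an orbital-element-based shaping reward in this spirit, consider the sketch below; the weights, the chosen elements, and the GEO target values are assumptions, not the paper's actual reward function.

```python
import numpy as np

GEO_TARGET = {"a": 42164.0, "e": 0.0, "i": 0.0}  # km, dimensionless, deg

def orbital_element_reward(elements, weights=(1e-4, 1.0, 0.1)):
    """Reward grows as semi-major axis, eccentricity, and inclination
    approach the GEO target; each term is a weighted absolute error."""
    wa, we, wi = weights
    err = (wa * abs(elements["a"] - GEO_TARGET["a"])
           + we * abs(elements["e"] - GEO_TARGET["e"])
           + wi * abs(elements["i"] - GEO_TARGET["i"]))
    return -err  # the DRL agent maximizes this, i.e., minimizes the error

# Example: a state partway through a GTO-to-GEO raising maneuver.
print(orbital_element_reward({"a": 30000.0, "e": 0.4, "i": 5.0}))
```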

  2. Automated Trajectory Planning: A Cascaded Deep Reinforcement Learning Approach for Low-Thrust Spacecraft Orbit-Raising

Paper link: Submitted to IEEE AES Magazine

GitHub link: https://github.com/iscaas/NASA-AI-Trajectory-Optimization-2020-2024/tree/main/CDRL-SAC-2body%2BCislunar

Description: Computing orbit-transfer trajectories for spacecraft with low-thrust propulsion is challenging due to the complex dynamics, long transfer times, and the need for expert-crafted solutions. To overcome these challenges, we propose a new cascaded deep reinforcement learning (CDRL) method to optimize trajectory planning. Our approach focuses on transferring spacecraft from launch injection orbits, such as geostationary transfer orbit (GTO) and super-GTO, to destinations such as geosynchronous equatorial orbit (GEO) and near-rectilinear halo orbit (NRHO). Using a gradient-aided reward function, our method surpasses current automated techniques by achieving more time-efficient orbit-raising. Our results confirm that this approach provides optimal or near-optimal solutions compared to other trajectory computation methods.
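
The control flow of a cascade can be sketched as two policies handing off at a phase boundary, as below; the environment and policy interfaces here are hypothetical, not the repository's API.

```python
# Conceptual sketch of a cascaded rollout: a phase-1 policy controls the
# spacecraft until a boundary condition holds, then a phase-2 policy takes
# over. The env's 3-tuple step() and the callables are hypothetical.
def cascaded_rollout(env, phase1_policy, phase2_policy, phase_boundary):
    """Run the phase-1 policy until the boundary condition, then phase 2."""
    state, done = env.reset(), False
    total_reward = 0.0
    while not done:
        policy = phase2_policy if phase_boundary(state) else phase1_policy
        action = policy(state)                 # thrust direction at this step
        state, reward, done = env.step(action)
        total_reward += reward
    return total_reward
```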

  3. Single-Agent Attention Actor-Critic (SA3C): A Novel Solution for Low-Thrust Spacecraft Trajectory Optimization

Paper link: To be submitted to IEEE Transactions on AES

GitHub link: https://github.com/iscaas/NASA-AI-Trajectory-Optimization-2020-2024/tree/main/SA3C-2body%2BCislunar

Description: Efficiently computing orbit-transfer trajectories for low-thrust spacecraft is challenging due to complex dynamics, extended transfer times, and reliance on expert solutions. To overcome these issues, we introduce a novel single-agent attention actor-critic deep reinforcement learning (SA3C-DRL) algorithm. This method incorporates an attention mechanism within a single actor-critic network to optimize low-thrust spacecraft trajectory planning. Our approach focuses on transfers from launch orbits, such as geostationary transfer orbit (GTO) and super-GTO, to targets like geosynchronous equatorial orbit (GEO) and cislunar destinations such as near-rectilinear halo orbit (NRHO) and patch points. Our method surpasses existing automated techniques by providing more time-efficient orbit-raising solutions, demonstrating its effectiveness in achieving optimal or near-optimal trajectory planning.
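
The hedged PyTorch sketch below shows one way to place self-attention inside a single actor-critic network in this spirit; the state tokenization, dimensions, and output heads are assumptions, not the SA3C architecture itself.

```python
import torch
import torch.nn as nn

class AttentionActorCritic(nn.Module):
    """Single actor-critic network with self-attention over state tokens."""
    def __init__(self, state_dim=7, action_dim=3, embed=64, tokens=8):
        super().__init__()
        # Project the flat state into a short sequence of tokens so that
        # self-attention can weigh parts of the state against each other.
        self.tokenize = nn.Linear(state_dim, tokens * embed)
        self.attn = nn.MultiheadAttention(embed, num_heads=4, batch_first=True)
        self.actor = nn.Sequential(nn.Linear(embed, 128), nn.ReLU(),
                                   nn.Linear(128, action_dim), nn.Tanh())
        self.critic = nn.Sequential(nn.Linear(embed, 128), nn.ReLU(),
                                    nn.Linear(128, 1))
        self.tokens, self.embed = tokens, embed

    def forward(self, state):                  # state: (B, state_dim)
        x = self.tokenize(state).view(-1, self.tokens, self.embed)
        x, _ = self.attn(x, x, x)              # self-attention over tokens
        pooled = x.mean(dim=1)
        return self.actor(pooled), self.critic(pooled)

# Example: thrust-direction output and state value for a batch of states.
policy = AttentionActorCritic()
action, value = policy(torch.randn(2, 7))
```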

Project: DLQuantizationFrameworks

  1. Deep Learning Performance Characterization on GPUs for Various Quantization Frameworks

Paper link: https://www.mdpi.com/2673-2688/4/4/47

GitHub link: https://github.com/iscaas/DLQuantizationFrameworks

Description: Deep learning is employed in many applications, such as computer vision, natural language processing, robotics, and recommender systems. Large and complex neural networks achieve high accuracy; however, they adversely affect many aspects of deep learning performance, such as training time, latency, throughput, energy consumption, and memory usage in both the training and inference stages. To address these challenges, various optimization techniques and frameworks have been developed to make deep learning models perform efficiently during training and inference. Although optimization techniques such as quantization have been studied thoroughly in the past, less work has been done on the performance of the frameworks that provide these quantization techniques. In this paper, we use different performance metrics to study various quantization frameworks, including TensorFlow automatic mixed precision and TensorRT. These metrics include training time and memory utilization in the training stage, along with latency and throughput on graphics processing units (GPUs) in the inference stage. We apply the automatic mixed precision (AMP) technique during training using the TensorFlow framework, while for inference we use the TensorRT framework for post-training quantization via the TensorFlow-TensorRT (TF-TRT) application programming interface (API). We profiled different deep learning models, datasets, image sizes, and batch sizes for both the training and inference stages; the results can help developers and researchers devise and deploy efficient deep learning models on GPUs.
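
The sketch below shows minimal usage of the two techniques named above, assuming TensorFlow 2.x: Keras automatic mixed precision for training and TF-TRT post-training conversion for inference. The model and paths are placeholders, and the exact converter arguments vary across TF versions.

```python
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# --- Training with automatic mixed precision (AMP) ---
tf.keras.mixed_precision.set_global_policy("mixed_float16")
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(32,)),
    # Keep the output layer in float32 for a numerically stable softmax.
    tf.keras.layers.Dense(10, activation="softmax", dtype="float32"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# ... model.fit(...) on your data, then export (TF 2.x SavedModel format) ...
model.save("saved_model")  # placeholder path

# --- Post-training quantization for inference with TF-TRT ---
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="saved_model",
    precision_mode=trt.TrtPrecisionMode.FP16,  # or INT8 with calibration data
)
converter.convert()
converter.save("saved_model_trt")  # placeholder output path
```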