Volume 20, Issue No. 6, 2022 (CVAS)
Advanced Scene Understanding for Autonomous Navigation Using Dense Spatiotemporal Vision Transformers in Urban Environments

This paper explores spatiotemporal vision transformers to enhance real-time scene understanding for autonomous systems navigating complex urban spaces, significantly improving obstacle detection, trajectory planning, and interaction prediction in traffic-intensive scenarios.

Yu Zhenlong, Huang Meilin, Chen Lixue, Wang Jianyu, Zhang Huicheng, Liu Xiaoyong

Paper ID: 32220601

Learning Visual Policies for Human-Robot Interaction Tasks in Indoor Dynamic Environments with Uncertainty Factors

This study presents a visual policy learning framework for robots interacting with humans indoors, adapting dynamically to moving elements, occlusions, and environmental uncertainties using deep reinforcement learning for safe and efficient task execution.

Brian Timothy Rogers, Aleksandra Marie Velez, Dominique Carl Fisher, Elijah Frank Montgomery, Lucia Catherine Bennett

Paper ID: 32220602

Cross-Domain Visual Feature Embedding for Multi-Sensor Fusion in Autonomous Vehicle Perception Pipelines

This paper proposes a cross-domain visual feature embedding technique for integrating data from cameras, LiDAR, and radar in autonomous driving, achieving robust perception under challenging weather and lighting conditions through learned multi-modal representations.

Raghavendra Harish Pai, Michelle Laura Greenwood, Koji Tanaka, Sandeep Menon, Sara Francesca Di Luca, Ahmed Mostafa Eldarwish

Paper ID: 32220603

Simultaneous Localization and Mapping in Subterranean Tunnels Using Depth-Enhanced Monocular Vision Systems

This research develops a monocular vision-based SLAM system enhanced with depth cues for subterranean environments, enabling accurate mapping and navigation where GPS signals are absent and lighting conditions are extremely limited.

George Benjamin Harlow, Nikolai Dimitri Morozov, Teresa Angela De Rosa, Emily Jean Matthews, Carl Anthony Blanchard

Paper ID: 32220604

Benchmarking Vision-Based Anomaly Detection Algorithms for Surveillance in Critical Infrastructure Facilities

This study benchmarks multiple vision-based anomaly detection algorithms using synthetic and real surveillance datasets from critical infrastructure, evaluating detection accuracy, speed, and generalization to unseen threat scenarios.

Oliver James Braxton, Pierre Laurent Marchand, Rafael Domingo Vera, Fiona Louise Robinson, Jennifer Grace Whitman, Chang Xiaoyue

Paper ID: 32220605

Unsupervised Domain Adaptation for Object Tracking Using Contrastive Feature Alignment and Self-Refinement

This work introduces a domain adaptation strategy for object tracking via contrastive feature alignment and self-refinement, improving model performance when transferred to new environments with unseen object appearances.

Zhou Hengming, Li Chunqiao, Wu Yaodan, Zheng Junwei, Xia Zihui, Deng Qiaoling

Paper ID: 32220606

A Hierarchical Graph Attention Network for Scene Graph Generation in Complex Aerial Imagery

This paper proposes a hierarchical graph attention model to extract scene graphs from aerial images, enabling enhanced semantic understanding for downstream tasks like surveillance, disaster response, and urban planning analysis.

Ashutosh Neel Kamble, Hiroshi Yamamoto, Martha Joan Falkner, Paolo Roberto Greco, Ahmed Bilal Saeed, Manuel Javier Ortega

Paper ID: 32220607

Multi-Object Pose Estimation from Single-View RGB Input Using Iterative Refinement Networks with Uncertainty Estimation

We present a multi-object pose estimation model using single-view RGB input, incorporating iterative refinement and uncertainty modeling to achieve high precision in cluttered and partially occluded environments.

Victor Alan Reid, Sofia Emilia Markova, Francesco Matteo Lombardi, Sun Linghua, Rajeev Varghese Nair, Boonchai Thanapong

Paper ID: 32220608

Neuro-Symbolic Fusion for Explainable Action Recognition in Video Streams Captured by Autonomous Agents

This research integrates symbolic reasoning with deep neural features to enable explainable action recognition in video streams from autonomous agents, enhancing transparency and trust in AI decisions.

Isabel Christine Wagner, Yuting Zhang, Kevin Elias Foster, Marina Ljubica Krstic, Ethan Douglas Romero, Anil Shyam Sundar

Paper ID: 32220609

Semantic Scene Completion Using Hierarchical Context Aggregation and Conditional Generative Modeling from Sparse Input

This paper introduces a semantic scene completion framework using sparse depth and RGB data, leveraging hierarchical aggregation and conditional generative modeling to reconstruct detailed 3D semantic layouts in real time.

Kazuo Shinji Nakamoto, William Darnell O’Neal, Frederic George Lambert, Dinesh Krishnan Reddy, Leo Alphonse Durand

Paper ID: 32220610