Articles
This paper explores spatiotemporal vision transformers to enhance real-time scene understanding for autonomous systems navigating complex urban spaces, significantly improving obstacle detection, trajectory planning, and interaction prediction in traffic-intensive scenarios.
Yu Zhenlong, Huang Meilin, Chen Lixue, Wang Jianyu, Zhang Huicheng, Liu Xiaoyong
Paper ID: 32220601
This study presents a visual policy learning framework for robots interacting with humans indoors, adapting dynamically to moving elements, occlusions, and environmental uncertainties using deep reinforcement learning for safe and efficient task execution.
Brian Timothy Rogers, Aleksandra Marie Velez, Dominique Carl Fisher, Elijah Frank Montgomery, Lucia Catherine Bennett
Paper ID: 32220602
This paper proposes a cross-domain visual feature embedding technique for integrating data from cameras, LiDAR, and radar in autonomous driving, achieving robust perception under challenging weather and lighting conditions through learned multi-modal representations.
Raghavendra Harish Pai, Michelle Laura Greenwood, Koji Tanaka, Sandeep Menon, Sara Francesca Di Luca, Ahmed Mostafa Eldarwish
Paper ID: 32220603
This research develops a monocular vision-based SLAM system enhanced with depth cues for subterranean environments, enabling accurate mapping and navigation where GPS signals are absent and lighting conditions are extremely limited.
George Benjamin Harlow, Nikolai Dimitri Morozov, Teresa Angela De Rosa, Emily Jean Matthews, Carl Anthony Blanchard
Paper ID: 32220604
This study benchmarks multiple vision-based anomaly detection algorithms using synthetic and real surveillance datasets from critical infrastructure, evaluating detection accuracy, speed, and generalization to unseen threat scenarios.
Oliver James Braxton, Pierre Laurent Marchand, Rafael Domingo Vera, Fiona Louise Robinson, Jennifer Grace Whitman, Chang Xiaoyue
Paper ID: 32220605
This work introduces a domain adaptation strategy for object tracking via contrastive feature alignment and self-refinement, improving model performance when transferred to new environments with unseen object appearances.
Zhou Hengming, Li Chunqiao, Wu Yaodan, Zheng Junwei, Xia Zihui, Deng Qiaoling
Paper ID: 32220606
This paper proposes a hierarchical graph attention model to extract scene graphs from aerial images, enabling enhanced semantic understanding for downstream tasks like surveillance, disaster response, and urban planning analysis.
Ashutosh Neel Kamble, Hiroshi Yamamoto, Martha Joan Falkner, Paolo Roberto Greco, Ahmed Bilal Saeed, Manuel Javier Ortega
Paper ID: 32220607
We present a multi-object pose estimation model using single-view RGB input, incorporating iterative refinement and uncertainty modeling to achieve high precision in cluttered and partially occluded environments.
Victor Alan Reid, Sofia Emilia Markova, Francesco Matteo Lombardi, Sun Linghua, Rajeev Varghese Nair, Boonchai Thanapong
Paper ID: 32220608
This research integrates symbolic reasoning with deep neural features to enable explainable action recognition in video streams from autonomous agents, enhancing transparency and trust in AI decisions.
Isabel Christine Wagner, Yuting Zhang, Kevin Elias Foster, Marina Ljubica Krstic, Ethan Douglas Romero, Anil Shyam Sundar
Paper ID: 32220609
This paper introduces a semantic scene completion framework using sparse depth and RGB data, leveraging hierarchical aggregation and conditional generative modeling to reconstruct detailed 3D semantic layouts in real time.
Kazuo Shinji Nakamoto, William Darnell O’Neal, Frederic George Lambert, Dinesh Krishnan Reddy, Leo Alphonse Durand
Paper ID: 32220610